Avatar of Adam Fields

Notes

Updates from me with curated content from across the web.

Grok-1

X.ai just open-sourced Grok-1, a 314-billion parameter mixture-of-experts model.

The repo has a torrent magnet link to download the checkpoint, which weighs in at over 300GB. You’ll probably need a handful of A100 80GBs, which are $6/hr apiece right now according to Brev.
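The "handful of A100s" figure falls out of some quick memory math. Here's a sketch (my own arithmetic; real runtime needs are higher once you add activations and KV cache):

```python
# Rough GPU count needed just to hold a checkpoint's weights.
import math

def gpus_needed(params_billions, bytes_per_param, gpu_gb=80):
    weight_gb = params_billions * bytes_per_param  # 1B params * 1 byte ≈ 1 GB
    return weight_gb, math.ceil(weight_gb / gpu_gb)

print(gpus_needed(314, 1))  # int8 weights → (314, 4)
print(gpus_needed(314, 2))  # bf16 weights → (628, 8)
```

So even at 8-bit you're at four 80GB cards before any runtime overhead.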

This is a base model, so it’s not fine-tuned for a specific task like chat or instruct, but those will show up on Hugging Face soon, I’m sure. Elon is my boy again 🤗

ai

Gemma

Google released Gemma, a new family of open-weight models related to Gemini. Hugging Face also published their own blog post about it.

Initially there are 2B and 7B sizes, with 8k context, available in base and instruct-tuned variants. The latter (instruct) are already available in Perplexity Labs to play around with.

Google also open-sourced a number of supporting repositories on GitHub.

Finally, there are official guides on how to use Gemma with KerasNLP including how to fine-tune using LoRA.
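For a rough sense of why LoRA makes fine-tuning these models tractable, here's a small sketch (my own arithmetic, not code from Google's guides) of how few parameters a low-rank adapter actually trains for a single d×k weight matrix:

```python
def lora_params(d, k, rank):
    # LoRA freezes the d x k base weight and trains two low-rank
    # factors instead: A (d x rank) and B (rank x k).
    base = d * k
    adapter = rank * (d + k)
    return base, adapter

# hypothetical 4096 x 4096 layer with rank-4 adapters
base, adapter = lora_params(4096, 4096, 4)
print(f"base: {base:,}  adapter: {adapter:,}  ratio: {adapter / base:.4%}")
```

At rank 4 you're training a fraction of a percent of the layer's weights, which is why it fits on a single consumer GPU.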

ai

verto.sh

verto is a web app from Luca Cavallin for discovering open-source issues to work on. You can filter by language or tag, as well as search.

The available issues come from a curated selection of repos, filtered by labels. The inclusion criteria are in the README.

My New Year’s resolution every year is to contribute to OSS more, so this could be exactly what I’ve been looking for. Beyond the utility, the design is amazing. It’s built with Tailwind on Next.js using octokit.
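verto itself uses octokit, but the same kind of filtered lookup is easy to sketch against GitHub's REST search API directly. The endpoint and search qualifiers below are real; the specific language/label combination is just an example:

```python
# Build a GitHub issue-search URL filtered by language and label,
# the same shape of query a tool like verto runs under the hood.
from urllib.parse import urlencode

def issue_search_url(language, label):
    q = f'is:issue is:open language:{language} label:"{label}"'
    return "https://api.github.com/search/issues?" + urlencode({"q": q})

url = issue_search_url("go", "good first issue")
print(url)
```

Fetching that URL (unauthenticated requests are rate-limited) returns JSON with an `items` list of matching issues.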

dev

Preordered Rabbit R1

I bought a Rabbit R1.

It’s a handheld device designed by Teenage Engineering. The founder, Jesse Lyu, sold his previous company to Baidu.

I thought it was just a pocket LangChain or a voice controller for Perplexity, but it’s actually using a proprietary Large Action Model.

I tried not to, but Perplexity offered 1 year of Pro for the first 100,000 orders, so I couldn’t refuse.

I won’t get it until July though…

hw

Pocketbase

Pocketbase is a single-file binary written in Go with:

  • an embedded SQLite database with realtime subscriptions
  • file handling with static serving
  • user administration and email
  • a REST API with admin dashboard
  • auth with basic, JWT, and OAuth2
  • JS and Dart client SDKs

You can also use it as a framework to build on top of in Go.
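As a sketch of how little client code the REST API needs, here's a stdlib-only Python request builder for Pocketbase's documented records endpoint. The `posts` collection is a made-up example; port 8090 is Pocketbase's default:

```python
# Build a request against Pocketbase's list-records endpoint:
# GET /api/collections/{collection}/records
from urllib.request import Request

def list_records_request(base_url, collection, page=1, per_page=30):
    url = (f"{base_url}/api/collections/{collection}/records"
           f"?page={page}&perPage={per_page}")
    return Request(url, headers={"Accept": "application/json"})

req = list_records_request("http://127.0.0.1:8090", "posts")
# urllib.request.urlopen(req) would return paginated JSON records
```

The official JS and Dart SDKs wrap these same routes, so dropping down to raw HTTP from any other language is straightforward.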

dev

Hugging Face Tasks

Tasks is a page on the 🤗 website. They published a related package a couple of months ago, so it’s fairly new.

From the README:

The Task pages are made to lower the barrier of entry to understand a task that can be solved with machine learning and use or train a model to accomplish it. It’s a collaborative documentation effort made to help out software developers, social scientists, or anyone with no background in machine learning that is interested in understanding how machine learning models can be used to solve a problem.

Check out the Text Classification page as an example. There is a video explanation, a README with examples of the task, and even the metrics used to evaluate the models for that task. There are also links to 🤗 resources like Models, Datasets, Spaces, Autotrain, and Endpoints.

ai

Mistral API

Got my beta invite for Mistral’s new API, la plateforme. Here’s a current pricing table with OpenAI and Perplexity for comparison:

| API        | Model        | Price / 1M tokens (in) | Price / 1M tokens (out) |
| ---------- | ------------ | ---------------------- | ----------------------- |
| Mistral    | Medium       | €2.50                  | €7.50                   |
| Mistral    | Small (8x7B) | €0.60                  | €1.80                   |
| Mistral    | Tiny (7B)    | €0.14                  | €0.42                   |
| OpenAI     | GPT-4        | $10.00                 | $30.00                  |
| OpenAI     | GPT-3.5      | $1.00                  | $2.00                   |
| Perplexity | 70B          | $0.70                  | $2.80                   |
| Perplexity | 34B          | $0.35                  | $1.40                   |
| Perplexity | 7B           | $0.07                  | $0.28                   |

The noteworthy inclusion here is Mistral’s new “medium” model, currently one of the top models on the arena.
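Using the table's prices, here's a quick sketch (my own arithmetic) of what a single request costs. Note Mistral bills in EUR and the others in USD, and no FX conversion is applied:

```python
# Per-request cost from per-1M-token prices.
PRICES = {  # (input, output) per 1M tokens, in each provider's currency
    "mistral-medium": (2.50, 7.50),
    "gpt-4": (10.00, 30.00),
}

def request_cost(model, tokens_in, tokens_out):
    p_in, p_out = PRICES[model]
    return (tokens_in * p_in + tokens_out * p_out) / 1_000_000

# 10k prompt tokens + 1k completion tokens
print(round(request_cost("mistral-medium", 10_000, 1_000), 4))  # 0.0325
print(round(request_cost("gpt-4", 10_000, 1_000), 4))           # 0.13
```

So Medium comes out roughly 4x cheaper than GPT-4 on that traffic shape, even before the EUR/USD spread.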

ai

Llamafile: Single-file portable LLMs

Incredible project from Justine Tunney and Mozilla. A llamafile is a compiled LLM, weights and all. It uses Georgi Gerganov’s llama.cpp compiled with Justine’s Cosmopolitan library. The resulting binaries are APEs (actually portable executables).

Because it’s based on llama.cpp, it also ships with the web UI from Tobi’s PR. Aside from that being one of my favorite PRs of all time (I worked at Tobi’s company), it’s a pretty interesting read on how to jam a web app into a C program.

The embedded server also provides an OpenAI-compatible REST API for local development.
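Since the API is OpenAI-compatible, the usual chat-completions request shape works against it. Here's a stdlib-only sketch; the model name and prompt are placeholders, and the server is assumed to be running on localhost:8080:

```python
# Build an OpenAI-style chat-completions request aimed at the
# local llamafile server.
import json
from urllib.request import Request

payload = {
    "model": "LLaVA",  # local server serves whatever weights it loaded
    "messages": [{"role": "user", "content": "Describe this image."}],
}
req = Request(
    "http://localhost:8080/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req) would return the completion JSON
```

This also means most OpenAI client libraries work locally if you just point their base URL at the llamafile server.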

Justine has 🤗 repos with compiled llamafiles. Note the resolve path segment in the URL. This resolves to an AWS CloudFront URL so large files download fast. Here’s how to run LLaVA:

# download to ~/.local/bin/llava and make executable
wget -O ~/.local/bin/llava https://huggingface.co/jartine/llava-v1.5-7B-GGUF/resolve/main/llava-v1.5-7b-q4.llamafile
chmod +x ~/.local/bin/llava

# localhost:8080
llava

On Windows/WSL2 it’s slightly different. You’re limited to 4GB models like the quantized LLaVA 7B unless you compile your own and use external weights.

Note that you probably need a working CUDA and cuDNN environment first.

# if you see "APE is running on WIN32 inside WSL", one of these should fix it
sudo sh -c 'echo -1 > /proc/sys/fs/binfmt_misc/WSLInterop'
sudo sh -c 'echo -1 > /proc/sys/fs/binfmt_misc/WSLInterop-late'

# set GPU layers to 35
llava -ngl 35

Justine published Bash one-liners you can try with the llamafiles.

Be sure to also check out the impressive Ollama project.

ai

CLI Guide

I found this one-page guide to writing command-line interfaces (CLIs) while working on my dotfiles. It’s from the creators of docker-compose, at the easy-to-remember clig.dev. From the foreword:

Inspired by traditional UNIX philosophy, driven by an interest in encouraging a more delightful and accessible CLI environment, and guided by our experiences as programmers, we decided it was time to revisit the best practices and design principles for building command-line programs.

It’s written like a nice-to-read book. The first half, Philosophy, would be appreciated by anyone who enjoys good technical writing. If you want to get into best practices for things like telemetry and signal handling, then the second half, Guidelines, is for you.
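As a taste of the Guidelines half, here's a minimal sketch applying a few of them with Python's argparse: `--help` comes for free, human-readable text is the default, and machine-readable output hides behind a flag. The `greet` command itself is made up for illustration:

```python
# Tiny CLI following a few clig.dev guidelines.
import argparse
import json

def build_parser():
    p = argparse.ArgumentParser(prog="greet", description="Greet someone.")
    p.add_argument("name", help="who to greet")
    p.add_argument("--json", action="store_true",
                   help="emit machine-readable JSON instead of text")
    return p

def main(argv=None):
    args = build_parser().parse_args(argv)
    if args.json:
        return json.dumps({"greeting": f"Hello, {args.name}!"})
    return f"Hello, {args.name}!"

if __name__ == "__main__":
    print(main())
```

Even a toy like this gets you `greet --help`, a stable JSON contract for scripts, and friendly default output for humans.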

dev

Perplexity.ai

Perplexity is a new AI startup. Their app combines online search with LLMs, similar to how ChatGPT uses Bing. A cool feature is being able to pick which LLM you want to use: OpenAI, Anthropic, and Perplexity’s in-house models are all available.

They also have a free playground at labs.pplx.ai where you can experiment with open-weight models like Llama and Mistral.

Their Pro plan costs the same as ChatGPT Plus ($20/mo), but it also includes $5/mo of API credits (millions of tokens). I got two months of Pro for free (see below).

ai