Conversation

Oops, haven't tweeted much recently; I'm mostly watching with interest the open source LLM ecosystem experiencing early signs of a cambrian explosion. Roughly speaking, the story as of now:

1. Pretraining LLM base models remains very expensive. Think: supercomputer + months.
2. But finetuning LLMs is turning out to be very cheap and effective due to recent PEFT (parameter-efficient fine-tuning) techniques that work surprisingly well, e.g. LoRA / LLaMA-Adapter, and other awesome work, e.g. low precision as in the bitsandbytes library. Think: a few GPUs + a day, even for very large models.
3. Therefore, the cambrian explosion, which requires wide reach and a lot of experimentation, is quite tractable due to (2), but only conditioned on (1).
4. The de facto OG release of (1) was Facebook's, sorry Meta's, LLaMA release - a very well executed, high quality series of models from 7B all the way to 65B, trained nice and long, correctly ignoring the "Chinchilla trap". But the LLaMA weights are research-only, locked down behind forms, and have also awkwardly leaked all over the place... it's a bit messy.
5. In the absence of an available and permissive (1), (2) cannot fully proceed. So there are a number of efforts on (1) under the banner "LLaMA but actually open", with e.g. current models from , ~matching the performance of the smallest (7B) LLaMA model, and , nearby.

For now, things are moving along (e.g. see the 10 chat-finetuned models released last ~week, and projects like llama.cpp and friends), but a bit awkwardly due to the LLaMA weights being open but not really, but still. And most interestingly, a lot of questions of intuition remain to be resolved, e.g. especially around how well finetuned models work in practice, even at smaller scales.
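To make point (2) concrete, here is a minimal numpy sketch of the LoRA idea (my own illustration with made-up shapes, not any specific library's API): instead of updating the full weight matrix, you freeze it and learn a low-rank additive update, which is why the trainable-parameter count, and hence the finetuning cost, drops so sharply.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 512, 512, 8  # illustrative sizes; rank r << d

W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                    # trainable up-projection, zero-init
                                            # so training starts from the base model

def forward(x, alpha=16.0):
    # effective weight is W + (alpha / r) * B @ A, never materialized explicitly
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# with B = 0 the LoRA branch contributes nothing: output matches the base model
assert np.allclose(forward(x), W @ x)

# trainable parameters vs full finetuning of this one matrix
full = d_out * d_in           # 262,144
lora = r * (d_out + d_in)     # 8,192, about 3% of full
print(f"trainable: {lora} of {full} ({100 * lora / full:.1f}%)")
```

Only `A` and `B` receive gradients; at these (toy) sizes that is ~3% of the parameters, and the savings compound across every adapted layer of a real model.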
I didn't agree with that post. I cheer for open source, but the conclusion over-reaches. I'm still trying to formalize a coherent mental picture of the dynamics; meanwhile, the nearest neighbor of my own hot take atm is along the lines of:
Quote Tweet
This leaked google memo is a great overview of what open source AI has achieved... but the conclusion is wrong. twitter.com/simonw/status/…
Super excited to push this even further:
- Next week: bitsandbytes 4-bit closed beta that allows you to finetune 30B/65B LLaMA models on a single 24/48 GB GPU (no degradation vs full fine-tuning in 16-bit)
- Two weeks: full release of code, paper, and a collection of 65B models
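Back-of-envelope arithmetic (my own, not from the tweet) shows why 4-bit makes those GPU sizes plausible for the weights alone; activations, adapter gradients, and quantization overhead add more on top.

```python
# Weight memory only: bytes = n_params * bits_per_param / 8.
# Uses decimal GB; real-world headroom requirements are larger.
def weight_gb(n_params_billion, bits):
    return n_params_billion * 1e9 * bits / 8 / 1e9

for n in (30, 65):
    print(f"{n}B weights @ 16-bit: {weight_gb(n, 16):.0f} GB, "
          f"@ 4-bit: {weight_gb(n, 4):.1f} GB")
# 30B: 60 GB at 16-bit vs 15 GB at 4-bit  -> fits a 24 GB card
# 65B: 130 GB at 16-bit vs 32.5 GB at 4-bit -> fits a 48 GB card
```

So the quantized base weights leave room on a single consumer/prosumer GPU for the (small) trainable adapter state, which is what makes single-GPU finetuning of 30B/65B models thinkable at all.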
It's a great question. I roughly think of finetuning as analogous to expertise in people:
- Describe a task in words ~= zero-shot prompting
- Give examples of solving the task ~= few-shot prompting
- Allow a person to practice the task ~= finetuning

With this analogy in mind, it's…
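The three regimes in the analogy can be sketched as string-building (the task and examples here are invented for illustration): in the first two, the "knowledge" lives in the prompt; in the third, the same pairs become training data that change the weights.

```python
task = "Classify the sentiment of a movie review as positive or negative."

# 1. Zero-shot: describe the task in words, no examples.
zero_shot = f"{task}\nReview: I loved it.\nSentiment:"

# 2. Few-shot: also include worked examples in the prompt.
examples = [("A total waste of time.", "negative"),
            ("Beautifully shot and acted.", "positive")]
shots = "\n".join(f"Review: {r}\nSentiment: {s}" for r, s in examples)
few_shot = f"{task}\n{shots}\nReview: I loved it.\nSentiment:"

# 3. Finetuning: the same (review, label) pairs are used as a training set
# to update the model's weights, and are no longer spent on prompt tokens.
train_set = examples
```

One practical consequence: few-shot examples consume context window on every call, while finetuned "practice" is amortized into the weights once.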