nostalgebraist
@nostalgebraist
nostalgebraist.tumblr.com · Joined December 2011

nostalgebraist’s Tweets

thank you!! the book should be complete sometime later this year. <10 chapters left -- just need to find the time to write them.
Quote Tweet
I think nostalgebraist has graduated from “guy I know who writes fiction” to “one of the most creative contemporary science fiction authors I know” archiveofourown.org/works/8341348/
My hot take is the smartest thing Google could do right now is improve search, not lard everything up with AI features. Search is still better, more useful, and more trustworthy for most queries, and it could stand to benefit immensely by making that case
wait, what?? why do Bard AND ChatGPT *both* write an anodyne story about a young woman in idyllic "Willow Creek" at sundown??? (details: it's not deterministic, often you get a different town name, different phrasing, etc. broad strokes are similar though. gpt-4 does it too.)
what's going on with the middle pane of fig. 5?
- why do the lines jump vertically? doesn't happen in the other fig. 5 panes.
- the accompanying text says "we see that our results do not vary strongly with RLHF at any model size" but this is not visually evident.
Quote Tweet
Language models (LMs) exhibit harmful biases that can get worse with size. Reinforcement learning from human feedback (RLHF) helps, but not always enough. We show that simple prompting approaches can help LMs trained with RLHF produce less harmful outputs. arxiv.org/abs/2302.07459
finally, use the closed-form optimum for the approach 3 functional form to calculate the optima with these A/B/a/b values. this is not what is done in approaches 1 or 2, and it does not get the same results. 4/4
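(for reference -- my sketch of approach 3's fitted form and its closed-form optimum under the compute constraint C ≈ 6ND, written from memory of the chinchilla paper's appendix, so double-check against the paper:)

```latex
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}, \qquad C \approx 6ND
N_{\mathrm{opt}}(C) = G\left(\frac{C}{6}\right)^{a}, \qquad
D_{\mathrm{opt}}(C) = G^{-1}\left(\frac{C}{6}\right)^{b}
G = \left(\frac{\alpha A}{\beta B}\right)^{\frac{1}{\alpha+\beta}}, \qquad
a = \frac{\beta}{\alpha+\beta}, \qquad b = \frac{\alpha}{\alpha+\beta}
```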
then, assume that the constants A, B in that parametric L(N, D) function are identical between the 3 approaches. (note that this cannot be right. those A, B came from a fit to the data. if you fix them, but change a and b, the function won't fit the data anymore!) 3/4
iiuc, the methodology in the screenshot goes like this: assume that the L(N, D) functions in approaches 1-2 follow the parametric form fitted in approach 3. (approaches 1-2 do not assume a functional form for L(N, D) and do not estimate it.) 2/
this is not correct. first, note that the paper provides explicit C/N/D example numbers for all 3 approaches in tables 3, A3. there, all 3 methods require more than 4.3e23 compute to go above 67B params. so immediately we know something can't be right. 1/
Quote Tweet
...anything from a 28B model (on 2.5T tokens) to a 260B model (on 270B tokens) between their 3 "Approaches". That's an unhelpful order of magnitude difference in how large of a model you should be training in order to be considered "compute optimal" 😐. [5/7]
Announcing our new mechanistic interpretability paper! We use causal interventions to reverse-engineer a 26-head circuit in GPT-2 small (inspired by earlier circuits work). The largest end-to-end explanation of a natural LM behavior, our circuit is localized + interpretable 🧵
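(if "causal interventions" sounds abstract -- here's a minimal sketch of activation patching with plain huggingface hooks. NOT the paper's code: it patches a whole attention layer's output where the paper works per-head, and the prompts and layer index are just illustrative:)

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

# prompts chosen so the corrupted one tokenizes to the same length
clean = tok("When Mary and John went to the store, John gave a drink to",
            return_tensors="pt").input_ids
corrupt = tok("When Mary and John went to the store, Anne gave a drink to",
              return_tensors="pt").input_ids

LAYER = 9  # illustrative choice, not from the paper
attn = model.transformer.h[LAYER].attn
cache = {}

def save_hook(module, inputs, output):
    cache["attn_out"] = output[0].detach()  # GPT2Attention returns a tuple

def patch_hook(module, inputs, output):
    # splice the clean activation into the corrupted forward pass
    return (cache["attn_out"],) + output[1:]

h = attn.register_forward_hook(save_hook)
with torch.no_grad():
    model(clean)  # run once to cache the clean activation
h.remove()

h = attn.register_forward_hook(patch_hook)
with torch.no_grad():
    patched = model(corrupt).logits
h.remove()

with torch.no_grad():
    baseline = model(corrupt).logits

# if this layer matters for the behavior, patching should move the logit
# for " John" at the final position back toward the clean answer
john = tok(" John").input_ids[0]
print(baseline[0, -1, john].item(), "->", patched[0, -1, john].item())
```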
This is a really neat idea! I gave it a try in my GPT application (a tumblr bot). Some notes on what I observed: nostalgebraist.tumblr.com/post/675469656
Quote Tweet
Neural language models are really good at explaining held-out data. So when we sample from them, why do they yield dull and degenerate text? Our paper analyzes this behavior using information theory, and corrects for it with a new sampling principle: arxiv.org/abs/2202.00666
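(the sampling principle, as i understand it -- a rough sketch, not the authors' code: keep the tokens whose surprisal is closest to the distribution's entropy, up to cumulative mass tau, then renormalize and sample:)

```python
import torch

def locally_typical_filter(logits: torch.Tensor, tau: float = 0.95) -> torch.Tensor:
    """logits: 1-D tensor of next-token logits for a single position."""
    log_probs = torch.log_softmax(logits, dim=-1)
    probs = log_probs.exp()
    entropy = -(probs * log_probs).sum()
    # "typicality": how far each token's surprisal is from the entropy
    dist = (-log_probs - entropy).abs()
    order = torch.argsort(dist)            # most typical tokens first
    cum = probs[order].cumsum(dim=0)
    k = int((cum < tau).sum().item()) + 1  # smallest typical set with mass >= tau
    out = torch.full_like(logits, float("-inf"))
    out[order[:k]] = logits[order[:k]]
    return out  # sample from softmax(out) as usual
```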
Quote Tweet
nostalgebraist on Tumblr has trained a diffusion model that can write the text input to a text encoder into the image! (I think he is also using a separate language model to generate the text.) stalgebraist-autoresponder.tumblr.com/post/671421163 stalgebraist-autoresponder.tumblr.com/post/671420177
great stuff!! this is (a first step towards) a specific kind of transformers interpretation work that i've always hoped would exist one day. i spend a non-trivial fraction of my spare time thinking about this problem, mostly without success. anthropic's actually doing it!
Quote Tweet
Our first interpretability paper explores a mathematical framework for trying to reverse engineer transformer language models: A Mathematical Framework for Transformer Circuits: transformer-circuits.pub/2021/framework
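(the decomposition that got me, reproduced from memory so check it against the paper -- a one-layer attention-only transformer expands into a direct path plus per-head terms, each head factoring into a QK circuit (where to attend) and an OV circuit (what the attended token does to the logits):)

```latex
T = \mathrm{Id} \otimes W_U W_E
  + \sum_{h \in \mathrm{heads}} A^{h} \otimes \left( W_U W_{OV}^{h} W_E \right),
\qquad
A^{h} = \mathrm{softmax}\!\left( t^{\top} W_E^{\top} W_{QK}^{h} W_E \, t \right)
```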
update: asked these questions on EleutherAI discord and got helpful answers!
- the bfloat16 checkpointing was an accident
- probably little to no quality impact: training loss curves don't change noticeably around load/save steps
Quote Tweet
Replying to @nostalgebraist @shlevy and 2 others
i haven't tried the 1.3B huggingface checkpoint, but i don't see an indication it's 16-bit. if so, was it cast to 32-bit from the original bfloat16 checkpoint?
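(for anyone else wondering how to check -- a sketch: inspect the raw state dict rather than a loaded model, since from_pretrained can silently upcast to fp32. the repo id and filename below are my guesses, the thread doesn't name them:)

```python
import torch
from huggingface_hub import hf_hub_download

# repo id / filename are illustrative guesses -- substitute the real checkpoint
path = hf_hub_download("EleutherAI/gpt-neo-1.3B", "pytorch_model.bin")
state = torch.load(path, map_location="cpu")
print({t.dtype for t in state.values() if torch.is_tensor(t)})
# {torch.float32} would suggest the published weights were cast to 32-bit
```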
Yeah. Copy editor needs extra time. Can’t blame them. There’s a lot of Latin. And Greek. And unusual alphabets. And code. And technical vocabulary. And meter. And extremely fussy pronouns. And rare punctuation. And it’s 2x as long as expected. And there’s this one chapter... 😅
The quest for "usability" and, to a lesser extent, "security" has wrapped most tech in so many layers of prescriptive, oversimplified interfaces that it has cut off the paths that used to bridge professional and casual tech competence, creating two distinct social classes.
Quote Tweet
Please quote this tweet with a thing that everyone in your field knows and nobody in your industry talks about because it would lead to general chaos.
unfortunately, this is much harder than the "logit lens." representing a hidden state change (dim=1024) in a basis of FF values (dim=4096 per FF layer used) is underdetermined. tried to regularize for sparsity, many choices for how to do that... got kinda lost, was frustrating.
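(what "regularize for sparsity" looked like in practice, roughly -- a toy sketch with random data standing in for the real activations; lasso's L1 penalty is just one of the many choices i mentioned:)

```python
import numpy as np
from sklearn.linear_model import Lasso

d_model, n_values = 1024, 4096  # dims from the tweet
rng = np.random.default_rng(0)

ff_values = rng.normal(size=(n_values, d_model))  # candidate FF value vectors
delta_h = rng.normal(size=d_model)                # hidden-state change to explain

# underdetermined: 4096 unknown coefficients, only 1024 equations.
# the L1 penalty picks out a sparse combination; alpha needs tuning.
fit = Lasso(alpha=0.1, fit_intercept=False, max_iter=10_000).fit(ff_values.T, delta_h)
print("nonzero coefficients:", np.count_nonzero(fit.coef_), "of", n_values)
```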