James Bradbury
@jekbradbury
JAX, TPUs, and LLMs at Google. Previously MetaMind/, Linguistics, . Spends too much time reading about Things that are Happening.
Joined October 2012
James Bradbury’s Tweets
Lots of 《流浪气球》 jokes on Weibo right now. In the original Chinese, it’s a pun on the film “The Wandering Earth.” In English, it’s the far less snazzy “The Wandering Balloon.”
(Nobody on Weibo is buying that it’s a civilian balloon, for what that’s worth.)
5
46
265
The Anthropic team is fantastic and I'm so excited to be working with them!
Quote Tweet
We're excited to use Google Cloud to train our AI systems, including Claude! anthropic.com/news/announcem
3
6
300
Here you go! Hot off the digital press:
China’s Spy Balloon Over Montana Is Part Of A Larger, More Troubling Pattern
35
158
440
Part 3 in the series is out, by Mangpo Phothilimthana and , covering the intersection of ML and computer systems (distributed systems, compilers, hardware design, energy-efficient and low emissions computing, etc).
4
25
99
I just spoke to a reporter about local government housing element noncompliance in the Bay Area, and they told me they spoke to a noncompliant local government this morning about if they were concerned about the and the city immediately hung up
7
16
217
"Progressive Prompts"
- Current LLMs have a hard time w/ catastrophic forgetting & leveraging past experience
- Novel approach learns a prompt for each new task + concatenates it w/ frozen, previously learned prompts
- Efficiently transfers knowledge to future tasks
Paper: arxiv.org/abs/2301.12314
6
57
331
Scaling laws for single-agent reinforcement learning
Finds that intrinsic performance scales as a power law in model size and environment interactions. Consequently, the optimal model size scales as a power law in the training compute budget.
arxiv.org/abs/2301.13442
1
21
90
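A rough sketch of why a power law in model size and environment interactions implies a power law for the optimal model size. The functional form, the constants A, B, k, and the exponents α, β below are illustrative assumptions, not the paper's fitted parameterization; compute is assumed proportional to model size times interactions.

```latex
% Assumed (illustrative) loss-like quantity, lower is better:
%   N = model size, E = environment interactions
\[
  L(N, E) = A\,N^{-\alpha} + B\,E^{-\beta},
  \qquad C = k\,N\,E \quad \text{(assumed compute budget)}
\]
% Substitute E = C/(kN) and minimize over N:
\[
  L(N) = A\,N^{-\alpha} + B\left(\frac{k}{C}\right)^{\beta} N^{\beta},
  \qquad
  \frac{dL}{dN} = -\alpha A\,N^{-\alpha-1}
                  + \beta B\left(\frac{k}{C}\right)^{\beta} N^{\beta-1} = 0
\]
% Solving for the optimum gives a power law in C:
\[
  N^{*} = \left(\frac{\alpha A}{\beta B}\right)^{\frac{1}{\alpha+\beta}}
          \left(\frac{C}{k}\right)^{\frac{\beta}{\alpha+\beta}}
  \;\propto\; C^{\frac{\beta}{\alpha+\beta}}
\]
```

Under these assumptions the optimal-size exponent depends only on the ratio of the two scaling exponents, which is why fitting α and β is enough to predict how to split a compute budget.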
I've been at Anthropic for over six months now and I'm happy to recommend it to a friend! We're hiring for software engineers to work on our research, product and infrastructure, and particularly you can come work with me on a newly formed ✨Tokens Team!
8
20
269
Good morning, Bay Area. According to our evaluations, 69 of your 109 local jurisdictions are now subject to the builder's remedy. 🧵
45
348
2,333
ASML's CEO has an incentive to say China "ultimately… will get there" on EUV, whether he really believes it or not
predictably tho, that rhetoric then has a self-fulfilling prophecy effect by bolstering the Chinese chip industry's confidence in success
mp.weixin.qq.com/s/l3pQKtzPsw54
1
1
5
The story is more complicated, though. While in aggregate text-davinci-002 is rated as comparable to freelance writers, we show that annotators have diverse and stable preferences for either text-davinci-002 or the freelance writers.
1
3
20
Mark Lemley, one of the country's leading scholars in IP and internet law, will represent in the closely watched copyright suit from a group of artists.
He's part of Lex Lumina PLLC, a recently formed firm composed of a pretty formidable group of IP scholars.
4
23
43
Huge, huge finding: across grade levels & subjects, students improve in academic performance at the same rate with each practice session
The gap between good & bad students is initial knowledge; anyone can reach mastery with practice (7 sessions on average) psyarxiv.com/pxsfh/
25
236
1,023
We’re developing a new tool to help distinguish between AI-written and human-written text. We’re releasing an initial version to collect feedback and hope to share improved methods in the future.
641
2,675
8,659
Thank you everyone for your advice. We love what you’re creating, but a set of actors use our tech for malicious purposes. We decided to take the following steps to address the issues:
104
125
252
Not all paths lead to ROME. Surprise: knowing where a fact is stored doesn’t help with amplifying or erasing that fact! Causal tracing explains only 3% of the variance in edit success. arxiv.org/abs/2301.04213 (2/9)
1
4
25
Finally got round to giving JAX a serious go beyond the tutorial. Stateless classes & picky control flow take some getting used to, but for batchwise grads, Jacobians, JIT, etc. it's seriously good.
A few hours rewriting my PyTorch code took runtime from 80 minutes to 1.3s
4
13
228
Models such as Stable Diffusion are trained on copyrighted, trademarked, private, and sensitive images.
Yet, our new paper shows that diffusion models memorize images from their training data and emit them at generation time.
Paper: arxiv.org/abs/2301.13188
👇[1/9]
182
2,463
9,978
One of the key models in MusicLM is SoundStream, an audio codec. It made vocoders obsolete and reshaped audio generation as a token prediction task.
SoundStream is not open to the public, but a similar neural audio codec, Encodec, is completely open-source →
7
49
411
Excited to share SingSong, a system which can generate instrumental accompaniments to pair with input vocals!
📄arxiv.org/abs/2301.12662
🔊g.co/magenta/singso
Work co-led by myself, , and as part of and the broader MusicLM project 🧵
140
1,085
3,131
"Looped Transformers as Programmable Computers"
Shipped today, long thread with arxiv link tomorrow.
This paper marks a significant shift in my group's research, that I'm very excited about.
21
139
881
I am fucking *begging* you to not feed LLM outputs to anything resembling an `eval` statement.
Jesus christ let me work on interesting attacks and not just "oops I asked it nicely to give me a shell and it did"
Quote Tweet
TIL. Just used prompt injection to read out the secret OpenAI API key of a very well known GPT-3 application.
In essence, whenever part of the response returned from GPT-3 is executed directly, e.g. using eval() in Python, a malicious user can basically execute arbitrary code.
1
27
71
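A minimal sketch of the safer pattern the tweets above argue for: treat model output as data to be parsed, never as code to be executed. The function name `parse_model_output` and the `max_len` cap are hypothetical choices for illustration; `ast.literal_eval` is the standard-library call that accepts only Python literals and rejects function calls, attribute access, and imports.

```python
import ast


def parse_model_output(text: str, max_len: int = 10_000):
    """Parse untrusted model output as a Python literal, never as code.

    Returns the parsed value, or None if the text is too long or is not
    a plain literal (numbers, strings, lists, dicts, tuples, and so on).
    """
    # Hypothetical size cap: very large or deeply nested input can still
    # exhaust memory or the parser's recursion limit.
    if len(text) > max_len:
        return None
    try:
        return ast.literal_eval(text.strip())
    except (ValueError, SyntaxError, TypeError, MemoryError, RecursionError):
        # Anything that is not a pure literal (e.g. a function call) ends up here.
        return None


if __name__ == "__main__":
    print(parse_model_output("[1, 2, 3]"))                      # a plain list literal
    print(parse_model_output("__import__('os').system('id')"))  # rejected, not executed
```

If the application only needs structured data, asking the model for JSON and parsing it with `json.loads` is an equally reasonable choice; either way the injected string is never executed.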
The Dutch and ASML still do not understand the national security rationale for controlling DUV, and the rest of the semiconductor industry continues to grapple with the inclusion of memory in the end use controls, as memory is a commodity product....
1
9
37
I’ve been investigating the contamination theory of the obesity epidemic as laid out in ’s “A Chemical Hunger,” and found that several of what seem to be the strongest arguments for the theory turned out to be dubious. 🧵
15
105
508
You've probably heard of Yandex: it's the 4th biggest search engine by market share worldwide. Yesterday, Yandex's proprietary source code was leaked.
The most interesting part for the SEO community is the list of all 1922 ranking factors used in the search algorithm
[🧵THREAD]
172
1,878
5,448
whoa, this is bigger than ChatGPT to me.
google almost solved music generation, i'd say.
google-research.github.io/seanet/musiclm
161
1,675
6,702
ChatGPT for biology? Excited to share our work on LLMs for protein design out today
nature.com/articles/s4158
+ Proud to publicly announce with a $9M seed round to tackle meaningful challenges in biology with AI. Join us!
33
366
1,572
For code language models, every token is a new chance to break a program. What if LLMs wrote code like people, decomposing programs into solvable parts? They can solve competition-level coding problems by writing natural language programs in Parsel🐍, beating prior SoTA by >75%!
10
90
520
We’re awarding prizes to 7/48 submissions to the Inverse Scaling Prize Round 2! Tasks show inverse scaling on models, often even after training with human feedback. Details at irmckenzie.co.uk/round2 and 🧵 on winners:
3
72
253
Presenting MemoTrap🪤, a dataset of 2.5K examples spanning text completion, translation & QA, where repeating memorized text & concepts is *not* the desired behavior. We find that LMs perform worse📉 as they scale up, revealing severe failures in simple instruction-following🧵
6
97
513
It is sad to lose the DeepMind office in Edmonton to the Tech layoffs and looming recession. But AI is not going away, and I am more focused than ever on the Alberta Plan for AI research.
9
90
795
How to do Transformer Mechanistic Interpretability in 50 Lines - great post from ! Includes a bunch of significant improvements on my tutorials, and an interactive space to try out the methods!
lesswrong.com/posts/hnzHrdqn
3
29
191
CUTLASS 3.0 has just been released, offering optimal performance on NVIDIA’s H100 and a new approach to template metaprogramming in CUDA C++.
4
48
247