Excited to share what I've been working on with and !
We find that overoptimization of reward models can be modelled by simple functional forms with coefficients that scale smoothly with reward model size.
Paper: arxiv.org/abs/2210.10760
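As a pointer for skimmers, the functional forms are roughly of this shape (reproduced from memory, with $d$ the square-root KL distance from the initial policy; check the paper for the exact statement and coefficients):

```latex
% d := \sqrt{D_{\mathrm{KL}}(\pi \,\|\, \pi_{\mathrm{init}})} measures optimization distance.
% Gold-reward-model score as a function of d, per optimization method:
R_{\mathrm{BoN}}(d) = d\,(\alpha_{\mathrm{BoN}} - \beta_{\mathrm{BoN}}\, d)      % best-of-n sampling
R_{\mathrm{RL}}(d)  = d\,(\alpha_{\mathrm{RL}}  - \beta_{\mathrm{RL}} \log d)    % RL
```

The $\alpha$ and $\beta$ coefficients are the quantities that scale smoothly with reward model size.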
This example shows why it is still risky to treat LLMs like software. The current models are both subject to hallucinations & do not act in the same way every time. Not good for a software tool that is built into a process.
AI works better as a supervised “intern” than software.
The write-it-for-me Button has come to gmail (or at least to beta testers). It is now easier to just have AI write email for you than to do it yourself.
I don’t think we are ready for what this means, as I discovered when The Button came to Google Docs. oneusefulthing.org/p/setting-time
It annoys me how much those advocating for existential risk expect us to believe them based on pure ethos (i.e., the authority of whoever says it)... do you know how many *years* of research it took to convince people machine learning models *might* be biased? And some are still in denial!
I hope that people don't start to deliberately cede significant control over critical security controls to AI. I think if it happens, it may be because vendors convince the military and politicians to hold unrealistic expectations of the benefits. Critical voices would be crucial.
I certainly agree. This may be easier said than done, especially when AI systems achieve Legg’s definition of AGI and therefore would confer substantial military benefit during a conflict. So the admin capacity to exercise extreme care must be well-established ahead of time.
Thanks, this is an interesting perspective. I'm very open to the argument that AI can cause widespread damage if unchecked. But I don't think the extreme framing of "existential risk" is necessary or helpful, nor adequately substantiated enough to be so widespread.
Replying to
On the mitigation side, there is this interesting approach: arxiv.org/abs/2302.10870. Essentially, if you split your data and train 2 LLMs on disjoint halves, then force the models to only generate outputs that are likely under both models, you get guarantees.
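A minimal sketch of the idea in toy form (not the paper's actual construction; the hand-written "models", token names, and threshold below are all made up): only keep outputs that are sufficiently likely under both disjointly-trained models.

```python
# Toy sketch: two "LLMs" as next-token distributions. Each split may
# memorize idiosyncratic tokens the other never saw; requiring high
# probability under BOTH models filters those out.

def intersect_filter(model_a, model_b, threshold=0.05):
    """Keep tokens whose probability is >= threshold under both models."""
    allowed = {}
    for tok in set(model_a) & set(model_b):
        pa, pb = model_a[tok], model_b[tok]
        if pa >= threshold and pb >= threshold:
            allowed[tok] = min(pa, pb)  # score by the worst case
    return allowed

model_a = {"the": 0.4, "a": 0.3, "secret_a": 0.3}  # trained on split A
model_b = {"the": 0.5, "a": 0.3, "secret_b": 0.2}  # trained on split B

print(intersect_filter(model_a, model_b))  # only shared, plausible tokens survive
```

Tokens memorized by one split only ("secret_a", "secret_b") are rejected, which is the intuition behind the guarantee.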
If the results of this paper are robust (I haven't scrutinized it yet; e.g., is the CLIP baseline strong?) then this is a huge update for visual understanding. The simpler thing (just do LM) works best.
Quote Tweet
Who killed non-contrastive image-text pretraining? @AlecRad and @_jongwook_kim with the below Fig2 in CLIP.
Who collected the 7 Dragonballs and asked Shenron to resurrect it? Yours truly, in this new paper of ours.
Generative captioning is not only competitive, it seems better!
I wrote an obit for Jim Hartle, who was brilliant, and—crucially—kind.
maybe other countries should randomly deny visas to US students because of their military-civil fusion strategy 🤔
Quote Tweet
"In the US, Big Tech and the Military are increasingly fused, brought together by finance, joint projects, research and infrastructure. Untying the knot will be key to prevent endless wars abroad and militarised policing at home." tni.org/en/article/mil
When we organized against Google's Maven contract w US DOD, it wasn't just about 1 contract to build AI for an illegal drone war. We were concerned w the leverage it gave the US military over a massive surveillance corp & the shared incentives for war such dependency would instil
Quote Tweet
"In the US, Big Tech and the Military are increasingly fused, brought together by finance, joint projects, research and infrastructure. Untying the knot will be key to prevent endless wars abroad and militarised policing at home." tni.org/en/article/mil
Powerful artificial-intelligence ban possible, government adviser warns
works also on a mac.
(amazing how alien fr seems at first, compared to the canonical rf)
i like to think about GPT4 as a kind of resonator, an instrument.
and if you sing it songs of reason and insight, it can sing back
Quote Tweet
GPT4 and GPT3.5 can both track a chess game indefinitely deep into the game, if data is presented to them in a specific way. The moves produced are sometimes very high quality, and draws can be obtained at low rates against Stockfish 8 (with the help of an external program coded…
This is also the best way of working with (base) language models, and the same thing happens: if you take simulations seriously, your problems start showing up analogized, and you can solve them in simulation. It's pretty neat.
Quote Tweet
Carl Jung with a creative approach to working with the psyche.
"what we owe to orcas" coming out soon!
Quote Tweet
「 You are the computer in the cell, calculating odds and strategies for winning against a jailer with a finite number of rules. You are the glint of insanity that squats, waiting to be let out, in the genius' eyes. 」
Illustration of Bing's short story, "The Terminal". [The original quote is by GPT-3.]
Quote Tweet
QRT this tweet with something in your drafts (please)
This is primarily about the computational complexity of reverse engineering circuits in the brain (intractable, as it turns out), but as the speaker briefly alludes, it's relevant to mechanistic interpretability of AI/ML systems as well!
Looks like the paper isn't available yet, but here's a guess as to how it applies to ML:
If interpreting an NN circuit means finding a human-understandable symbolic program that approximates the circuit's behavior, searching over all those programs is likely intractable!
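To make "likely intractable" concrete with a toy count (the 4-op DSL below is hypothetical, purely for illustration): even brute-force search over fixed-length op sequences blows up exponentially.

```python
# Hypothetical toy DSL: a "program" is a sequence of primitive ops.
OPS = ["AND", "OR", "NOT", "SHIFT"]  # k = 4 primitives

def count_programs(max_len):
    """Number of op-sequences of length 1..max_len: sum of k**n."""
    return sum(len(OPS) ** n for n in range(1, max_len + 1))

# The search space grows ~4**n; enumeration is hopeless for long programs:
for n in (5, 10, 20):
    print(n, count_programs(n))
```

With realistic DSLs (more primitives, arguments, control flow) the blow-up is far worse.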
If we care about ML interpretability then, perhaps that's reason to adopt approaches that learn human-interpretable programs *by design*, as in this line of work:
(Or perhaps, to move away from ANNs altogether 😉)
Quote Tweet
Learning Transformer Programs
We designed a modified Transformer that can be trained to solve a task and then automatically converted into a discrete, human-readable program. With @_awettig and @danqi_chen.
Paper: arxiv.org/abs/2306.01128
Code: github.com/princeton-nlp/
[1/12]
Pro tip: if a machine-learned model scores 100% on something, the metric is wrong, or the evaluation dataset is not robust, or the evaluation dataset is in the training data, or some combination of the above. At least at this point in time.
Quote Tweet
This was an impressive paper with some big implications
GPT-4 was given 4,550 novel questions representing the entire “MIT Mathematics and EECS undergraduate curriculum, including problem sets, midterms, and final exams”
With good prompts, it scored 100% arxiv.org/abs/2306.08997
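One of the checks hinted at above, sketched as toy code (real contamination analyses are far more careful; the corpus and question strings here are invented): flag eval questions that share a verbatim n-gram with the training corpus.

```python
# Toy contamination check via verbatim n-gram overlap.

def ngrams(text, n=8):
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def looks_contaminated(question, corpus, n=8):
    """True if any n-gram of the question appears verbatim in the corpus."""
    return bool(ngrams(question, n) & ngrams(corpus, n))

corpus = "the quick brown fox jumps over the lazy dog near the riverbank at dawn"
leaked = "explain why the quick brown fox jumps over the lazy dog in this problem"
fresh = "compute the determinant of a three by three matrix with integer entries"

print(looks_contaminated(leaked, corpus), looks_contaminated(fresh, corpus))  # → True False
```

Verbatim overlap misses paraphrased leakage, of course, which is one reason "not in the training set" claims deserve scrutiny.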
AI Denialists will always find ever newer goalposts to move.
🚨 Do you think bottom-up (a.k.a. implementation-first or brain-first) approaches to cognition give us insights faster? Well, think again!
🎬🎥🍿 Watch 's talk about our paper: "Resource Demands of an Implementationist Approach to Cognition"
Really honored that my theorizing of wage setting and allocation in the gig economy as “algorithmic wage discrimination” is resonating as a way to understand what is happening.
Now let’s ban the practice.
Quote Tweet
"Gig workers contend with uncertain working conditions and algorithmic wage discrimination from the platforms they rely on to match with clients."
For @logic_magazine, I report on how algorithms are leading gig workers to debt and ruin.
logicmag.io/supa-dupa-skie
Happy to have a part 2 debate about why this is actually a good idea!
Quote Tweet
Here’s the latest OpenAI regulatory capture propaganda pushing the government to make it illegal for anyone to do advanced matrix multiplication aka use computers…
downloads.regulations.gov/NTIA-2023-0005
Daniel Ellsberg made an incredibly influential academic contribution en.wikipedia.org/wiki/Ellsberg_, and then threw away his career, and risked his life, to expose the wrongdoing of the powerful. A bit of a role model
Good post by Dr Buolamwini. I agree strongly re: the opportunities for strategic cooperation between groups who don't see all aspects of the risks the same way; there are many concrete interventions that serve to make the world safer across these concerns. 1/2
Quote Tweet
ICYMI , this morning Tawana Petty spoke alongside Professors Yoshua Bengio (Mila) and Max Tegmark on @democracynow about AI threats. I have immense respect for Dr. Bengio who put his reputation on the line to defend early AI bias research from corporate attack. I do not view…
A signal boost to my colleagues and 's excellent paper on this from a couple of years ago:
Without careful policy, the default outcome of these technological transformations is that they benefit a lot of people (but predominantly those with most power/privilege), and harm those with least voice and recourse.
Courage and moral leadership until his last day.
Instead of normalizing not sending work emails/WhatsApp/whatever at certain hours, let's normalize not having to answer such messages when it's inconvenient. The whole point of asynchronous communication is that you do *not* have to wait for the right time to send a message.
Good thread and article by Harry Law. "It is not only accuracy of predictions that matters. To predict is to influence, and to influence is to exert power"
This is key, especially for extinction risk-level concerns 1/2
Quote Tweet
1/ I’ve seen a few people ask whether AI is having a ‘limits to growth’ moment, so here’s a
on the 1972 limits to growth report, why predictions of the future are used to inform policymaking, and what the relevance is for anyone interested in governing powerful models
Extinction-level concerns can be used to justify almost anything (see e.g. population concerns used by Paul Ehrlich to justify support for forced sterilisation), so it behooves us to speak and act carefully. 2/2
Just start the project. It will be autocompleted.
Quote Tweet
Replying to @dotnetschizo and @mwilcox
Just start the project. It will be autocompleted.
Had an insightful conversation with about AI and catastrophic risks. Two thoughts we want to share:
(i) It's important that AI scientists reach consensus on risks, similar to climate scientists (who have rough consensus on climate change), to shape good policy.…
Interview with Tencent:
"AI horror! How long until human extinction? Initiator of the 'AI Risk Statement': Maybe it will be decades"
view.inews.qq.com/a/20230612A08Q
"Mistakes were made? A critical look at how EA has approached AI safety"
David Krueger | EAG London 23
youtube.com/watch?v=kFkyW-
ICYMI , this morning Tawana Petty spoke alongside Professors Yoshua Bengio (Mila) and Max Tegmark on about AI threats. I have immense respect for Dr. Bengio who put his reputation on the line to defend early AI bias research from corporate attack. I do not view…
A good (if cheeky) response when people say "we need research breakthroughs to get to AGI!":
"How many of those did we have in the last 10 years?"
(whatever number (e.g. 0) they give; you can just say "if they keep happening at that rate, we can expect tons of progress")
Not easy, but if we can figure out how to extend this to LLMs the impact would be huge. Imagine having access to models that take 5 minutes to ponder each response but the output is as good as a model that's 1,000x larger and trained for 1,000x longer than GPT-4
Quote Tweet
I keep revisiting this great paper from @andy_l_jones: “Scaling scaling laws with board games”. It shows how training compute and inference compute of MCTS can be traded off against each other. 10x more MCTS steps is almost the same as training 10x more. arxiv.org/abs/2104.03113
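A toy way to see that trade-off (the constants and the log-additivity assumption below are made up for illustration; the paper fits empirical frontiers, not this formula): if strength depends only on the sum of the log-computes, then 10x inference compute trades 1:1 against 10x training compute.

```python
import math

def toy_elo(train_compute, test_compute, a=120.0, b=-400.0):
    # Toy assumption: strength depends only on log(train) + log(test),
    # so the two budgets are interchangeable on a log scale.
    return a * (math.log10(train_compute) + math.log10(test_compute)) + b

# 10x the MCTS/inference budget matches 10x the training budget:
print(math.isclose(toy_elo(1e6, 1e4), toy_elo(1e7, 1e3)))  # → True
```

The open question for LLMs is whether any inference-time procedure gives them a comparably smooth knob to turn.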
Our noses are incredible instruments with a terrible user interface, so we don’t know how much they can do
We have a fine-grained ability to identify a wide range of molecules that would take a well-equipped modern lab to find. Look at this chart from McGee’s great “Nose Dive”👇
More examples of our sense of smell doing a lot more than you might expect.
Quote Tweet
We don't know the power of our
We think we are not as good at smelling as animals, but we can accurately track scents if needed. This experiment blocked every other sense & had humans crawl on the ground following a trail of chocolate smell. It worked nature.com/articles/nn1819
kinda absurd that we need evidence for "gpt4 cannot evaluate gpt4" rather than assuming that's the case until we see evidence that it can.
Quote Tweet
If anyone wants a reference showing GPT-4 cannot correctly evaluate GPT-4 in expert domains, please see our paper: arxiv.org/abs/2304.05376
I've been trying to write a good Tweet(/thread) about why objections that AI x-risk is "not scientific enough" are misguided and prove too much.
There's a lot to say, and it deserves a full blog post, but I'll try and just rapid fire a few arguments:
6) Science (and its cousins statistics, math, and logic) should be a guide for policy-making and public debate, when applicable. This emphatically *is* already the case for the AI x-risk discussion.
(side note: these only serve as guides, and values must also play a role).
7) Critics who wish to argue against AI x-risk on the grounds that it is not scientific should:
1) elaborate their case further, in order to show how it doesn't prove too much
AND/OR
2) stop framing their critiques as sufficient cause to dismiss AI x-risk concerns.
turns out there is "prior work":
Quote Tweet
The San Francisco I know has been obliterated by the tech bros. All the places I used to go for dancing closed. Homelessness like I’ve never seen. They keep on talking about “innovation” in the Bay Area when we don’t even have a functioning public transportation system. twitter.com/londonbreed/st…
Replying to
you can dance anywhere anytime sounds like a skill issue to me
U.S. NTIA received 1,447 written comments, including ~285 institutions (my estimate), in response to its AI Accountability Policy Request for Comment (RFC) ntia.doc.gov/press-release/
interesting thread/work showing that simple image-encoder-->text-decoder architecture can work as well as or better than the contrastive CLIP objective.
Quote Tweet
PS2: am I saying the CLIP figure is wrong? No! We reproduce a similar result.
However, this extreme inefficiency of captioning seems limited to this exact combination of ResNet50+short training+ImageNet 0shot
Switching to ViT, training longer, other benchmarks all look better.
i am conflicted w.r.t. which of the two worldviews this strengthens. is it:
(a) architectures don't matter anymore, with enough data and expressive enough models, everything works roughly the same.
or?
(b) transformers are all you need
(or maybe its both?)
“working with” is a funny way of saying we told the MBTA to do it and no, we don’t plan on reimbursing them for it (cc: )
why is MassDOT stealing millions from the T to mitigate their highway project?
This was an impressive paper with some big implications
GPT-4 was given 4,550 novel questions representing the entire “MIT Mathematics and EECS undergraduate curriculum, including problem sets, midterms, and final exams”
With good prompts, it scored 100% arxiv.org/abs/2306.08997
GPT-4 scored 90% just out of the box, without any special prompting techniques
1 million views and lots of good (and some bad) discussion. But only 3,710 clicked the link to read the paper!
The paper isn't without flaws, but many of your questions are answered, like why the authors think this data was not in the training set, how prompting was done, etc.
One reason AI is hard to "get" is that LLMs are bad at tasks you would expect an AI to be good at (citations, facts, quotes, manipulating and counting words or letters) but surprisingly good at things you expect it to be bad at (generating creative ideas, writing with "empathy").
Guys, there’s a bit of an ebb and flow to the discourse, especially on here. We haven’t “won” the debate. We haven’t “lost” the debate. We’re just in the middle of a conversation
Assembly and disentanglement puzzles have been around for centuries, but they still provide kids and adults with fun while keeping their minds sharp and creative. This is a nice collection of interlocking games
[read more: buff.ly/465onCK]
BREAKING: UN Secretary General recognizes AI extinction risk, calls for coordination
"Alarm bells over [AI] are deafening, and they are loudest from the developers who designed it.
The experts have called upon the world to act, declaring it an existential threat to humanity."
Replying to
The CEOs of every leading AI lab have said their product might exterminate humanity. Don't you think you should at least say something about why you're confident it won't before berating safety efforts?
Potential plot twist: there was a lot of worry that AI-generated content would undermine the value of Stack Overflow.
But this paper suggests that AI instead led more people to contribute better-written answers, and there was no significant difference in upvotes for that AI work
Quote Tweet
ChatGPT has been really good for Stack Overflow.
"Our findings reveal that using generative AI leads to an increase in the number of answers generated by users."
"these answers tend to be shorter in length and easier to read."
papers.ssrn.com/sol3/papers.cf
Love seeing a Name of the Wind reference in a research codebase.
Founders keep their companies flexible.
Using data from every startup in the US, this paper shows that when founders leave a company (including when they unexpectedly die) that company becomes more subject to organizational inertia and less likely to engage in radical changes.
We need a new law kind of like Godwin’s law but for Elon Musk: anything Elon Musk had anything to do with ever will be described by the media as his.
I joined other attendees of the in affirming that the interdisciplinary community of AI policy researchers and practitioners "already have tools to help build a safer technological future" and in calling on "policymakers to fully deploy them."
EU Parliament passes the AI Act including
⏺️ban of live facial recognition tech (NatSec carveout?)
⏺️ban on scraping of biometric data
⏺️pre-deployment risk assessment
Next: final negotiations between the EU Parliament, EU Commission and the EU Council
Really excited about our new work for controlling robots with LLMs! "Language to Rewards" uses an LLM to generate rewards to optimize over with MuJoCo MPC. This flexible framework enables general behaviors, even moonwalking...
See 's 🧵 and site language-to-reward.github.io.
A surprising amount of modern AI is getting the model into the right mood for solving your problem.
and #FCRC are being held this week, & one of my students mentioned to me that she was considering attending the conference. But she is a transgender woman and Florida has passed laws that make traveling to Florida unsafe for trans people.
🧵& blog:
jonathanaldrich.github.io/2023/06/14/con
We're hiring for a Policy Advisor (fixed-term, 5 months with potential for permanent role).
As our only full-time policy employee, we expect the role to have a large influence over Apollo's policy efforts.
We intend to hire relatively soon.
🤖Excited to share our project where we propose to use rewards represented in code as a flexible interface between LLMs and an optimization-based motion controller.
website: language-to-reward.github.io
Want to learn more about how we make a robot dog do moonwalk MJ style?🕺🕺
I've been asked to make a list of the 15 companies which may do the biggest training runs in the next 18 months.
Lmk if you disagree.
1. DeepMind
2. OpenAI
3. Anthropic
4. Cohere ai
5. Adept ai
6. Character ai
7. Inflection ai
8. Meta
9. Apple
10. Baidu
11. Huawei
12.…
Replying to
Maybe Alibaba. Might put someone like Salesforce or Palantir in there instead of HuggingFace.
Replying to
Yeah good point, I wasn't sure which Chinese companies were collaborating and which were not.
Thx for mentioning SalesForce!
Language-to-reward is a good idea. I am much more optimistic about this than directly controlling behavior with language.
Career update: I am now on leave from MIT and working at . My focus here is on doing research around deployment of AI systems and, overall, shaping its positive impact. Looking forward to this chapter!