Richard Ngo
@RichardMCNgo
What would we need to understand in order to design an amazing future? Figuring that out
San Francisco, CA · thinkingcomplete.com · Joined October 2009

Richard Ngo’s Tweets

Quote Tweet
Replying to @gbrl_dick
Unfortunately the full argument is not just "we can't rule out danger" but also "we have specific reasons to expect danger via training AGIs" (which don't apply to software in general) as per my recent paper: twitter.com/RichardMCNgo/s
Deep learning has taught us that we don't need to understand a capability in order to train it into a model. So Chollet's skeptical argument should actually alarm us: if we don't understand the models we're training, then we can't guarantee they'll behave in desirable ways.
Quote Tweet
It's hard to take seriously someone who talks about how "AGI" is imminent yet cannot give you a precise definition of what they mean by "intelligence" or tell you how it works in principle
Excited to announce that our work, Progress Measures for Grokking via Mechanistic Interpretability, has been accepted as a spotlight at ICLR 23! (despite being rejected from Arxiv twice!) This was significantly refined from my prior work, thoughts in 🧵 arxiv.org/abs/2301.05217
Quote Tweet
I've spent the past few months exploring @OpenAI's grokking result through the lens of mechanistic interpretability. I fully reverse engineered the modular addition model, and looked at what it does when training. So what's up with grokking? A 🧵... (1/17) alignmentforum.org/posts/N6WM6hs7
the real issue w/ "understanding the math behind machine learning" is that, esp with the newer models, this will not really buy you anything. you can have a complete mechanistic understanding of everything in the process, it will not help you understand why/how these things work.
In my experience there's a particular mental sensation that arises when I'm in the middle of making a mistake. I'll give more details later - but before I anchor you, I'm curious if you relate to this, and if so how you'd describe it.
It also conflicts with the prevalence of imposter syndrome, status anxiety, and low self esteem. I'd guess the published findings focus on very specific tasks + stated beliefs, not broader attitudes + revealed beliefs; if so they're much less practically relevant than they seem.
I'm confused about how to reconcile this type of overconfidence with the fact that most people take way too few risks in so many domains (e.g. moving to new places, asking people out, switching jobs, trying new hobbies, etc).
Quote Tweet
"All mentally healthy people, especially men, show a systematic bias towards overconfidence in a wide range of domains. In particular, people tend to (1) overestimate their capabilities (2) overestimate their control over events, and (3) underestimate their vulnerability to risk"
What if another person's worldview *isn't* actually reasonable? Well, it still seems much more reasonable from *their* perspective than yours! So if you can't see why, you're missing something - and as a rule you're missing more of other people's perspectives than you think.
Other related terms are "cognitive decoupling" and 's "scout mindset". But again I want to emphasize the holistic aspect. Many people can decouple in isolated thought experiments; far fewer can decouple from foundational worldview assumptions.
Quote Tweet
1/ Hey guys, I’m excited to announce my first book is coming out April 13! It’s about something I call “scout mindset” - the motivation to see things as they are, not as you wish they were; to be intellectually honest & curious about what's actually true penguinrandomhouse.com/books/555240/t
This should be seen as a subset of "cognitive empathy" (aka "perspective-taking") focused on overall worldviews rather than everyday parts of people's perspectives (like personal conflicts). Seems worth distinguishing because it often anticorrelates with other types of empathy.
A term I've found useful lately is "intellectual empathy": the ability to see the ways that someone else's worldview is reasonable, even when it disagrees with your own. Emotional empathy is a vital skill, but intellectual empathy seems to be even rarer and more valuable.
Update: unsurprisingly in hindsight, routine cryotherapy wasn't meant to hurt this much. Anyway, I now have second degree burns on both hands and will see a plastic surgeon tomorrow. Good reminder: nobody's infallible. If something feels like it might be going wrong, speak up!
Quote Tweet
Just had a routine medical procedure that was nevertheless the most painful experience of my life, leaving me curled up and nauseous. I'm now feeling very visceral empathy and horror for the thousands (millions?) of people who underwent surgeries before anaesthesia.
There's also a lot of work to be done in getting people to internalise the horror of extreme suffering. It's insane that we can't even agree that hell is bad! Increasing our civilisational empathy is one of the most important projects possible.
Quote Tweet
It is terrifying that the species which is on a long-term trajectory towards the development of godlike technology contains so many people who are unbothered by the concept of hell.