I pledge to donate at least 10% of my income, for the rest of my life or until I retire, to the organizations that can use it most effectively to improve the lives of others. 🧵 (1/5)
Personally, I am deeply concerned that the current trajectory of AI regulation will accomplish little more than building a regulatory moat.
Licensing, evaluations, and audits are all good ideas.
But they are not enough.
We need an indefinite pause on the development of more powerful AI.
AI researchers from leading universities worldwide have signed the AI extinction statement, a situation reminiscent of atomic scientists issuing warnings about the very technologies they've created. As Robert Oppenheimer noted, “We knew the world would not be the same.”
🧵(2/6)
We’ve released a statement on the risk of extinction from AI.
Signatories include:
- Three Turing Award winners
- Authors of the standard textbooks on AI/DL/RL
- CEOs and executives from OpenAI, Microsoft, Google, Google DeepMind, Anthropic
- Many more
“Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.”
Signatories: CEOs of OpenAI, DeepMind, and Anthropic, two of the three “godfathers of AI”, and others.
United Nations Office for Disaster Risk Reduction:
“An existential risk is 1.9 to 14.3% likely this century”
They don’t give a good description of what the x-risk from AI is about (spoiler: literally killing everyone), but they cite Bostrom, Russell, and Ord.
Good to see this work.
We need to transform risk governance to address emerging risks from fast technological change.
Watch the video and check out the study about existential risks http://ow.ly/Mn8750NIV0A
Safely using AI in a domain like solving alignment-complete problems many orders of magnitude faster than you’d solve them without AI is itself an alignment-complete problem
Someone was reviewing my post draft and told me that "might kill everyone" doesn't sound as spooky as "may create power-seeking agents"
(because "specific threats work better")
By that point, incentives to train more capable models are also much lower; this level of coordination probably means that you can't profit from the models without attracting attention from intelligence agencies, and people consider you basically a terrorist and don't really work for…
If this actually happens, it doesn't seem like an issue for humanity to give all the people in the area advance notice to evacuate and only then air-strike the GPU farms
He doesn't suggest air-striking anything right now. He's saying that unless there's coordination at an almost impossible level of "states and society understand the problem and cooperate enough to be prepared to literally air-strike…
"Progress in AI capabilities is running vastly, vastly ahead of progress in AI alignment or even progress in understanding what the hell is going on inside those systems. If we actually do this, we are all going to die."
Now is the first time the Metaculus community median on the “date weak AGI is publicly known” question is earlier than mine. I froze my prediction in May (the community median was 2033 back then). I’m not happy that it’s people updating towards me and not the other way around.
“If you launch anything capable and you don’t have good reasons to think it’s an aligned mind, it will not be an aligned mind. If you try to prevent specific failure modes: if you identify optimizations towards something different from what you want, or how exactly gradient…
The idea with agent foundations, which I guess hasn't successfully been communicated to this day, was finding a coherent target to try to get into the system by any means (potentially including DL ones).
Say how you really feel ChatGPT...
(I know this is still benign because it's only outputting language and not taking agentic actions in the real world, but this out-of-distribution behavior is still happening after months of RLHF... makes you think about how hard alignment is)
Look, I really didn't want to write about this, but here we are.
I'm very upset with Nick Bostrom.
His original email was terrible, racist, and offensive.
His apology was absolutely idiotically executed.
Here I explain why I feel this way.
I don't know yet how we'll repair the damage Sam did and harden EA against other bad actors.
But I know that we're going to try, because the stakes remain painfully high.
15/15