A place where we discuss AI x-risk/safety, theories of intelligence, agency, consciousness, and ethics, in Russian:
Roman Leventov
@leventov
An independent researcher of ethics, AI safety, and AI impacts. lesswrong.com/users/roman-le
Roman Leventov’s Tweets
The past 6 months:
“Of course, we won’t give the AI internet access”
Microsoft Bing: 🤪
“Of course, we’ll keep it in a box”
Facebook: 😜
“Of course, we won’t build autonomous weapons”
Palantir: 😚
“Of course, we’ll coordinate and…
Dagan Shani has IMHO made the most important film of the year - about the harsh #AI truth - see it right here on Twitter:
Quote Tweet
Don't Look Up - The Documentary: The Case For AI As An Existential Threat.
Realising Synthetic Active Inference Agents, Part II: Variational Message Updates. Thijs van de Laar, Magnus Koudahl, and Bert de Vries
The current wave of AI makes it possible to support people at stage 4 (an exobrain to support attention and rational decision-making) and stage 5 (AI as a teacher, aide, coach, or psychotherapist). Brain-computer interfaces and advanced VR should democratise stage 6.
Quote Tweet
Where stage 5 allows the deconstruction of one's own identity, stage 6 goes one level deeper and deals with the implementation of perception, the construction of qualia, the architecture of motivation and the regulation of physiology. This is the domain of advanced meditators.
Very interesting, thanks! This part reminded me of Michael Levin's intriguing work around care as the driver of intelligence
mindrxiv.org/wemr6/
I’ve written down some purely personal thoughts about AI and what remains special about the human mind. I hope these will resonate with some readers.
Is cognition all in the head (neurons)?
Here we suggest that cognition is a complex multiscale information processing distributed across every single cell in the body 😎
We focus on the immune / neuro tandem (E. Shmeleva)
frontiersin.org/articles/10.33
Wouldn't it be elegant if the science of ethics said something along the lines of: if the agent is agreeable and persuadable, align with them; if not, use them as an energetic resource (i.e., destroy them).
Then the main challenge and litmus test for humanity now is…
I left seven large critical comments on this post: lesswrong.com/posts/ZKeNbGBf, totalling half as much text as the post itself. I guess I should have entered this OpenPhil contest...
Quote Tweet
Third, there's Types and Degrees of Alignment, which hopefully helps point out some important common confusions and contradictions on the subject. shorturl.at/lAFW5
Our paper "Auditing large language models: a three-layered approach" has been published in AI and Ethics (with Jakob Mökander and co-authors).
doi.org/10.1007/s43681
The remaining limitation is linguistic inertia. The model is reluctant to admit mistakes and radically rewrite some desiderata (specifically, "6. Robustness and Flexibility" should have been inverted); instead, it tries to "amend" them.
LLMs don't have the cognitive flexibility of GFlowNets.
About a year ago, when I asked whether LLMs are close to a superhuman level in philosophy, I was laughed at... Here it is: GPT-4 does very impressive philosophy of science and meta-ethics: chat.openai.com/share/1025a325. It makes mistakes in some respects, but surprises me in others.
(World) model structure, not goal structure, is the key question. lesswrong.com/posts/fqfAmAGF
Quote Tweet
This is a key alignment question: what goal structure will an AGI system end up with given some training process? How likely is this to be aligned with our own goals? Different training processes will have different likelihoods of producing an aligned agent twitter.com/ESYudkowsky/st…
The "reasonable theory of ethics" isn't guaranteed to be anthropocentric though, and may kill all humans because humans are not very reasonable to keep around from the ethical perspective (e.g., replace them with brain-in-wats). But deliberately causing suffering seems nonsense.
This is too much hand-waving, and I think it is wrong. Intelligence (any intelligence) "wants" to be self-coherent; theories of physics, rationality, evolution, etc. cohere with some reasonable theories of ethics (a theory of ethics = "what I should do").
Quote Tweet
Summoning an angel and a demon are just nearly identical operations. You can’t summon an angel without also being able to summon a demon if you wanted. It’s like an angel with a “not” before every operation
TLDR: H-JEPA probably won't save us unless we already have an aligned LLM-based cognitive architecture
Quote Tweet
Aligning an H-JEPA agent with GFlowNet actors via training on the outputs of an aligned LMCA: lesswrong.com/posts/MJXwnHbq
A simple idea of an aligned LMCA and a discussion of its properties:
I polled Yoshua Bengio on AI risks 8-9 years ago, and he was in the "AGI not a realistic concern at all" camp.
This is a post from him on Facebook, from today.
My, how far we've come.
Mechanics determine inductive biases, and inductive biases are very important to study.
You may be misled into thinking that psychology is independent of mechanics by the apparent "human-likeness" of LLMs' reasoning (and, thus, psychology), but this is because they are currently at…
Like many others, I was thrilled to see the authors publish this timely report on the need for model evaluations to prevent extreme risks from advanced AI systems.
A 🧵 with some highlights of the paper that stood out to me the most.
Announcing Alignment Assemblies (AAs)!
CIP is piloting approaches to involve the public in shaping AI's development for the collective good. 🌍
We're working with partners on ways to connect the public directly to power.
cip.org/alignmentassem
Values depend on the world model at least in order to be identified (a person can't value freedom without understanding what it is); moreover, for people at different levels of world-model development (or lucidity: twitter.com/Plinz/status/1), some values become necessary or obsolete, respectively.
Quote Tweet
Levels of lucidity:
Stage 1: Reactive survival (infant)
Stage 2: Personal self (young child)
Stage 3: Social self (adolescence, domesticated adult)
Stage 4: Rational agency (self-direction)
Stage 5: Self authoring (full adult, wisdom)
Stage 6: Enlightenment
Stage 7: Transcendence
Values are heuristics (either of behaviour or of important objects) that help people with relatively specific world models behave adaptively in a certain, *concrete* society/system. Not in any society, and not eternally!
'Is There a Hard Problem of Consciousness… and of Everything Else?'
(ii) Why-Questions (iii) The #HardProblemofConsciousness (iv) The #HardProblem of Liquidity (v) Hard Problem of White Walls (vi) The Hard Problem of Balls
#DrKaneBaker #DavidChalmers
paulaustinmurphypam.blogspot.com/2023/05/is-the
Terrific lecture!
Quote Tweet
@InferenceActive #FEP
youtube.com/watch?v=RpOrRw
Now Microsoft’s chief economist has said:
“We shouldn’t regulate AI until we see some meaningful harm that is actually happening”
“The first time we started requiring driver’s license it was after many dozens of people died in car accidents, right, and that was the right thing”
As a question on LW:
Could AI interpretability be harmful? A superhuman ethical AI might want to model adversaries and their actions, e.g., model which bioweapons an adversary might develop and prepare response plans and antidotes.
If such predictions are made in interpretable representations, they…
Awesome, thank you for sharing, here is the direct YouTube link:
youtube.com/watch?v=mdL4zE
Free Energy Principle, Consciousness, Illusionism, and Realism | Brains Roundtable discussion
Karl Friston, Mark Solms, Wanja Wiese, Krzysztof Dolega, and host Majid D. Beni
6) On that note -- for a while, I've said there is an "uncanny valley" of alignment: before systems are aligned enough to be existentially safe, they are aligned enough for people to want to deploy them. I wish I'd pushed on this point harder in the past.
“Resource-rational contractualism: A triple theory of moral cognition.” With my incredible co-authors, including Josh Tenenbaum. Available here: psyarxiv.com/p48t7/ (1/n)
My reply, “‘A Path Towards Autonomous Machine Intelligence’ might technically work in a modified form”: lesswrong.com/posts/umsGb5qk
Quote Tweet
Blog post: “@ylecun’s “A Path Towards Autonomous Machine Intelligence” Has an Unsolved Technical Alignment Problem” alignmentforum.org/posts/C5guLAx7
An annotated reply to a recent AI safety proposal by Yoshua Bengio:
But: some rather bad, overly sanguine takes about self-regulation towards the end, e.g.: "If somebody is able to edge out a competitive advantage just with the tools that we have today, then I think it will spread, and you’ll end up with a sort of self-regulating system"