Roman Leventov
@leventov
An independent researcher of ethics, AI safety, and AI impacts. lesswrong.com/users/roman-le

Roman Leventov’s Tweets

𝗧𝗵𝗲 𝗽𝗮𝘀𝘁 𝟲 𝗺𝗼𝗻𝘁𝗵𝘀: “Of course, we won’t give the AI internet access” 𝘔𝘪𝘤𝘳𝘰𝘴𝘰𝘧𝘵 𝘉𝘪𝘯𝘨: 🤪 “Of course, we’ll keep it in a box” 𝘍𝘢𝘤𝘦𝘣𝘰𝘰𝘬: 😜 “Of course, we won’t build autonomous weapons” 𝘗𝘢𝘭𝘢𝘯𝘵𝘪𝘳: 😚 “Of course, we’ll coordinate and…
The current wave of AI makes it possible to support people at stages 4 (an exobrain to support attention and rational decision-making) and 5 (AI as a teacher, aide, coach, or psychotherapist). Brain-computer interfaces and advanced VR should democratise level 6.
Quote Tweet
Where stage 5 allows the deconstruction of one's own identity, stage 6 goes one level deeper and deals with the implementation of perception, the construction of qualia, the architecture of motivation and the regulation of physiology. This is the domain of advanced meditators.
Is cognition all in the head (neurons)? Here we suggest that cognition is complex multiscale information processing distributed across every single cell in the body 😎 We focus on the immune/neuro tandem. E. Shmeleva frontiersin.org/articles/10.33
Wouldn't it be elegant if the science of ethics said something along the lines of: if the agent is agreeable and persuadable, align with them; if not, use them as an energy resource (i.e., destroy them). Then the main challenge, and the litmus test for humanity now, is…
The remaining limitation is linguistic inertia. It's reluctant to admit mistakes and radically rewrite some desiderata (specifically, "6. Robustness and Flexibility" should have been inverted), but instead tries to "amend" them. LLMs don't have the cognitive flexibility of GFlowNets.
Quote Tweet
This is a key alignment question: what goal structure will an AGI system end up with given some training process? How likely is this to be aligned with our own goals? Different training processes will have different likelihoods of producing an aligned agent twitter.com/ESYudkowsky/st…
The "reasonable theory of ethics" isn't guaranteed to be anthropocentric, though, and may kill all humans because humans are not very reasonable to keep around from an ethical perspective (e.g., it might replace them with brains-in-vats). But deliberately causing suffering seems nonsensical.
This is too much hand-waving, and I think it is wrong. Intelligence (any intelligence) "wants" to be self-coherent; theories of physics, rationality, evolution, etc. cohere with some reasonable theories of ethics (a theory of ethics = "what I should do").
Quote Tweet
Summoning an angel and summoning a demon are nearly identical operations. You can’t summon an angel without also being able to summon a demon if you wanted to. It’s like an angel with a “not” before every operation
TLDR: H-JEPA probably won't save us unless we already have an aligned LLM-based cognitive architecture
Quote Tweet
Aligning an H-JEPA agent with GFlowNet actors via training on the outputs of an aligned LMCA: lesswrong.com/posts/MJXwnHbq
Replying to
Mechanics determine inductive biases, and inductive biases are very important to study. You may be misled into thinking that psychology is independent of mechanics by the apparent "human-likeness" of LLMs' reasoning (and, thus, psychology), but this is because they are currently at…
Values depend on the world model at least in order to be identified (a person can't value freedom without understanding what it is), but also because, for people at different levels of (WM) development (or lucidity: twitter.com/Plinz/status/1), some values become necessary or obsolete.
Quote Tweet
Levels of lucidity: Stage 1: Reactive survival (infant) Stage 2: Personal self (young child) Stage 3: Social self (adolescence, domesticated adult) Stage 4: Rational agency (self-direction) Stage 5: Self authoring (full adult, wisdom) Stage 6: Enlightenment Stage 7: Transcendence
Values are heuristics (of behaviour, or of which objects are important) that help people with relatively specific world models behave adaptively in a certain, *concrete* society/system. Not in any society, and not eternally!
Now Microsoft’s chief economist has said: “We shouldn’t regulate AI until we see some meaningful harm that is actually happening” “The first time we started requiring driver’s license it was after many dozens of people died in car accidents, right, and that was the right thing”
AI interpretability could be harmful? A superhuman ethical AI might want to model adversaries and their actions, e.g., model which bioweapons an adversary might develop and prepare response plans and antidotes. If such predictions are done in interpretable representations, they…
My reply: “‘A Path Towards Autonomous Machine Intelligence’ might technically work in a modified form”: lesswrong.com/posts/umsGb5qk cc ;)
Quote Tweet
Blog post: “@ylecun’s “A Path Towards Autonomous Machine Intelligence” Has an Unsolved Technical Alignment Problem” alignmentforum.org/posts/C5guLAx7
But: some rather bad, overly sanguine takes about self-regulation towards the end, e.g.: "If somebody is able to edge out a competitive advantage just with the tools that we have today, then I think it will spread, and you’ll end up with a sort of self-regulating system"