1/ New research agenda: Supervising AIs improving AIs 🤖 How can we ensure future prosaic AIs remain safe and controllable while shaping their own development or that of successor AIs? I will go over the agenda below.
Why should humans do space exploration? If you’re for it, what about space exploration makes it worth the effort?
If you’re against it, why?
If you’re ambivalent about it, why?
Almost every AI risk debate on Twitter is one person talking about superintelligent AGI and the other talking about current LLMs. They aren’t starting from the same place. AI risk folks should be more specific about what they’re talking about; it might lead to more understanding.
In particular, it makes it very difficult for humans to *a-priori* value something like parenting, at least until the human actually experiences being a parent (since genetic reward circuitry can then activate for various sensory correlates of interacting with one’s children).
I ask this question because I think most people will say no, but if their main goal is to maximize their personal happiness and they are in a relationship, will they really act in the best interest of everyone involved?
Christopher Nolan explains why he didn’t use CGI for the nuclear explosion in ‘OPPENHEIMER’.
“I find CGI rarely is able to grab you. It tends to feel safe. Even if it's impressive and beautiful, it's difficult to make you feel danger.”
(Source: https://empireonline.com/movies/news/cillian-murphy-oppenheimer-essential-cinematic-experience-blow-peoples-lids-off-exclusive-image/…)
But I definitely find it a little bizarre how satisfying people find these videos. It could be that they assume someone who rejects others must think they’re better than everyone, so they resent that and want them to get a “reality check.”
After a couple of guys, you end up with one guy who wants revenge “for the boys.” It’s kind of weird. I don’t know whether it’s that they hate when people are picky or that they hate seeing rejection.
Why do people get so mad when watching that button dating show and one of the participants touches the button multiple times? There’s a lot of hate out there for people having a preference and making it clear. Like, are they just supposed to pretend they are attracted to someone?
Had a great conversation with Yoshua Bengio. Both of us agreed that a good step forward for AI risk is to articulate the concrete scenarios where AI can lead to significant harm. More to come, and looking forward to continuing the conversation!
I wonder if AI labs should be dedicating a lot more effort to using AI for cybersecurity as a way to prevent weights or insights from being stolen. Would be good for safety, and it seems like it could be a pretty big cash cow too?
There’s a strategy you will often see from political pundits where they say wild stuff to grab attention and get their own side all fired up. They don’t bother being factual, they just want to stir emotions and confirm biases.
But when someone calls them out, they switch gears.
Ever wanted to mindwipe an LLM?
Our method, LEAst-squares Concept Erasure (LEACE), provably erases all linearly-encoded information about a concept from neural net activations. It does so surgically, inflicting minimal damage to other concepts. 🧵
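To make the idea concrete: below is a toy sketch of linear concept erasure, not the actual LEACE estimator (which the paper derives as the least-squares-optimal affine eraser). The setup, function name, and data here are all illustrative assumptions. For a single binary concept, projecting activations off the direction carrying the feature–concept cross-covariance leaves the erased features linearly uncorrelated with the concept.

```python
import numpy as np

def erase_concept(X, z):
    """Toy linear concept erasure (illustrative, NOT the full LEACE method):
    project out the one direction carrying the cross-covariance between
    activations X of shape (n, d) and a binary concept label z of shape (n,)."""
    Xc = X - X.mean(axis=0)
    zc = z - z.mean()
    # Cross-covariance between features and the concept: a single d-vector.
    v = Xc.T @ zc
    v = v / np.linalg.norm(v)
    # Remove each activation's component along that direction.
    return X - np.outer(Xc @ v, v)

rng = np.random.default_rng(0)
z = rng.integers(0, 2, size=500).astype(float)
X = rng.normal(size=(500, 8))
X[:, 0] += 2.0 * z  # plant a linearly-decodable concept in feature 0

X_er = erase_concept(X, z)
# After erasure the sample cross-covariance with z is zero in every
# direction, so a linear probe can do no better than the base rate.
corr = (X_er - X_er.mean(axis=0)).T @ (z - z.mean())
print(np.abs(corr).max())  # numerically ~0 (machine precision)
```

Note the contrast with naive "delete one neuron" fixes: the planted signal lives in feature 0 here, but in general the concept direction is distributed across many units, and the projection handles that case too.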
Future AI systems should be able to predict ahead of time (and work with humanity) to figure out whether it would be good to train on some specific pieces of data. In other words, it could filter out data that doesn't align with the values we want it to have.
With AI, you eventually run into a problem of value drift. You need to figure out a process where values change over time in ways that still align with what humanity wants. So you don't necessarily want zero value "drift" at all; you instead want values to drift and update in the right direction.
I had never read Vonnegut's stuff and had only watched a video or two describing him and Slaughterhouse-Five. I'm about halfway through Breakfast of Champions right now.
I quite like watching documentaries about famous people. I often watch one before I dive into their work because I find it valuable for understanding why they wrote a particular book or what was their mental state at the time.
Watched a Kurt Vonnegut documentary yesterday; he lived quite an interesting life, though I was a little sad to see how he treated his first wife and how he dealt with the fame.
I felt sad about how he seemed to lack meaning during his later years. "We live too long," he would say.
There’s an additional effect that’s been added onto the Streisand effect after it became known. Essentially, you’ve got people sharing something that someone tried to hide because they want to look cool mentioning the Streisand effect.
So, it might be either they think progress will be slow enough that we can fix failures as they happen or they just don’t think a superintelligent AGI could ever become powerful enough to cause an x-risk.
Then again, some just don’t buy the premise of AGI at all? Which, again, seems a bit bizarre to me. Mostly because I expect AI labour to greatly accelerate AI research, and it’s not hard for me to imagine that we can create insanely capable systems in the coming decades.
And then the crux becomes just a timing issue for some? Like, “I don’t think we are 10 years away so let’s wait to work on AGI safety until we are actually 5-10 years away.”
I feel like when I hear some people try to argue that we shouldn’t worry about AGI, they just aren’t imagining superintelligent AGI at all?
I often read their statements and I’m just thinking, “Ok, if you don’t think it’s coming that soon, just add another 10 years then? Maybe…”
Late night thinking out loud:
So, is a big part of the debate between AI x-risk people and non-x-risk people just that the second group is only imagining AI systems that already exist rather than the superintelligences that x-risk people worry about?
Ok Dad Twitter...
Let the world decide who has dared to defy gravity
Highest Child Toss Contest
Bonus points if mom is in the background looking terrified!
The picture does not have to be current but it has to be you.
My oldest daughter.
This is sad news. I’ve definitely seen a noticeable increase in young people not wanting children, and of course birth rates are already way down. This will have massive impacts on demographics going forward.