2/ Original usage and intended meaning: The problem with turning the future over to just any superintelligence is that its utility function may have its attainable maximum at states we'd see as very low-value, even from the most cosmopolitan standpoint. https://arbital.com/p/paperclip/
-
3/ Misunderstood and widespread meaning: The first AGI ever to arise could show up in a paperclip factory (instead of a research lab specifically trying to do that). And then because AIs just mechanically carry out orders, it does what the humans had in mind, but too much of it.
-
4/ The key intended ideas are: (a) it requires no defect of intelligence, rationality, or reflectivity to optimize for paperclips, and (b) if you *don't* manage to align an AI so that it does something interpretable, the result will look as meaningless as paperclips.
-
5/ Imagining that an AI just goes on making paperclips because it's mechanical and unreflective sends the opposite message of (a). Supposing that the AI was meant to run a paperclip factory, and was deliberately and successfully aligned to do that, sends the opposite message of (b).
End of conversation
New conversation
As I understand it, it's a warning about how an AI will maximize its values and that those values are unlikely to coincidentally align w/ human values. Thus, the example of a seemingly harmless paperclip imperative that runs amok when backed w/ enormous intelligence. Roughly accurate?
-
The values of the AI in these scenarios are whatever we tell it to optimize for - it's just that if we don't take the end result of such optimization into account, we will give the AI priorities that are not, ultimately, in line with well-rounded human flourishing.
-
You only get to fail at that point if you have succeeded on the much earlier problem of having the AGI's optimization target bear any resemblance whatsoever to what you hoped you were targeting.
-
Seems easier from a layman's perspective. Won't the early work on such an AGI's predecessors ensure that it's hitting some sort of benchmark, as a measure of progress, e.g. AlphaGo's win/loss ratio? If AlphaGo were generalized, its values would presumably be Go-related, right? No?
-
That's as likely to happen as natural selection producing humans who exclusively and explicitly target inclusive genetic fitness. Hill-climbing with X as a fitness function does not scale to produce smart consequentialists targeting X as a goal.
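To make the hill-climbing point concrete, here is a minimal toy sketch (not from the thread; the foraging setup and names such as `hill_climb` and `taste_weight` are invented purely for illustration). Selection scores candidate policies by X (nutrition), but the policy itself only ever encodes a proxy (taste); when the taste-nutrition correlation breaks, the selected policy keeps pursuing the proxy rather than X.

```python
import random

def policy_choices(policy, env, k=3):
    """The policy only perceives taste (a proxy); it never sees nutrition (X)."""
    items = sorted(env["taste"],
                   key=lambda i: policy["taste_weight"] * env["taste"][i],
                   reverse=True)
    return items[:k]

def fitness(policy, env):
    """X, the selection criterion: total nutrition of what the policy picks."""
    return sum(env["nutrition"][item] for item in policy_choices(policy, env))

def hill_climb(env, steps=500):
    """Greedy hill-climbing on X over the policy's single parameter."""
    best = {"taste_weight": random.uniform(-1.0, 1.0)}
    for _ in range(steps):
        cand = {"taste_weight": best["taste_weight"] + random.gauss(0.0, 0.2)}
        if fitness(cand, env) >= fitness(best, env):
            best = cand
    return best

# "Ancestral" environment: taste and nutrition are correlated, so selecting
# on X simply produces a policy that weights taste highly.
train_env = {
    "taste":     {"berry": 0.9, "fruit": 0.8, "grub": 0.3, "root": 0.2, "leaf": 0.1},
    "nutrition": {"berry": 0.8, "fruit": 0.9, "grub": 0.5, "root": 0.4, "leaf": 0.2},
}
policy = hill_climb(train_env)

# New environment: the correlation breaks (a zero-nutrition sweetener appears).
# The selected policy still chases taste, not X, because X was never part of
# its internal machinery, only of the selection process that shaped it.
test_env = {
    "taste":     {"sweetener": 1.0, "berry": 0.9, "grub": 0.3, "root": 0.2, "leaf": 0.1},
    "nutrition": {"sweetener": 0.0, "berry": 0.8, "grub": 0.5, "root": 0.4, "leaf": 0.2},
}
print(policy_choices(policy, test_env))  # e.g. ['sweetener', 'berry', 'grub']
```

In this sketch the selection pressure was on X, but nothing in the selected artifact represents or pursues X, which is the sense in which hill-climbing with X as a fitness function need not produce an optimizer of X.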
-
Interesting. I would have thought your latter statement was obviously wrong, which of necessity recommends a huge update in my beliefs (now must be set at, "I don't understand this domain very well and am very bad at predicting things as they relate to it.")
-
Are you sure you've understood the statement that you think you'd have considered obviously wrong? Like, is the evolution of humans now a clear example and demonstration of the statement that sounded obviously wrong?
-
I think so? Perhaps you'd help: if I understand, you're saying that a species making incremental updates towards a set goal (either via evolution or programming) is unlikely to yield higher-order thinking that still moves towards that goal in a predictable way? Like how rock 1/2
- 3 more replies
New conversation
You did not coin the AI paperclip scenario, I hope that's not what you're saying here
-
I didn't? My memory wasn't sure if it was myself or Bostrom, but somebody else ran a search of the old mailing list where it plausibly originated, and they reported the earliest mention on that list was by me. What do you think is the first mention?
-
*crickets*
End of conversation
New conversation
So what you had in mind is that the AI is told to "maximize happiness"; except that for some reason it concludes that happiness is correlated with paperclips, and since that's the best cost-effectiveness ratio it goes on making paperclips at full speed?
-
No. I have in mind that alignment is hard, that most of the work consists in lining up internal systems that target any goal whatever in a way that remains stable when scaled up to superintelligence, and what the researchers want is simply uncorrelated with the outcome.
End of conversation
New conversation
Tonight I was the main guest in a discussion program on Swedish TV about AI risk. The program opened with a little cartoon narrative involving the cute (?) paperclippers. I *think* I got the main point across. But in Swedish. https://www.svtplay.se/video/16696405/idevarlden/idevarlden-sasong-2-avsnitt-2 …
End of conversation