I think "AIs make paperclips" has probably obtained more audience than all of my other conceptual originations combined. I guess it's not very surprising that it's an invention that had the potential for easy misunderstanding, and that it's the misunderstanding that spread.
Replying to @ESYudkowsky
As I understand it, it's a warning about how an AI will maximize its values and that those values are unlikely to coincidentally align w/ human values. Thus, example of seemingly harmless paperclip imperative that runs amok when backed w/ enormous intelligence. Roughly accurate?
Replying to @ADeebus @ESYudkowsky
The values of the AI in these scenarios are whatever we tell it to optimize for - it's just that if we don't take the end result of such optimization into account, we will give the AI priorities that are not, ultimately, in line with well-rounded human flourishing.
Replying to @amalgamary @ADeebus
You only get to fail at that point if you have succeeded on the much earlier problem of having the AGI's optimization target bear any resemblance whatsoever to what you hoped you were targeting.
Replying to @ESYudkowsky @amalgamary
Seems easier from a layman's perspective. Won't the early work on such an AGI's predecessors ensure that it's hitting some sort of benchmark, as a measure of progress, i.e. AlphaGo's win/loss ratio? If AlphaGo were generalized, its values would presumably be Go-related, right? No?
Replying to @ADeebus @amalgamary
That's as likely to happen as natural selection producing humans who exclusively and explicitly target inclusive genetic fitness. Hill-climbing with X as a fitness function does not scale to produce smart consequentialists targeting X as a goal.
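For concreteness, here is a minimal sketch of what "hill-climbing with X as a fitness function" means (a toy example, not from the thread; the controller task and names like fitness_X are illustrative assumptions). The point it shows is narrow: the outer loop scores candidates on X, but the selected artifact is just a parameter vector, and nothing inside it explicitly represents or "targets" X.

# Toy illustration (assumed example): hill-climbing a tiny policy's parameters
# against an outer fitness function X. The optimized artifact is two numbers;
# fitness_X lives only in the selection loop, much as inclusive genetic fitness
# is not written anywhere inside a human's explicit motivations.

import random

def fitness_X(params):
    """Outer selection criterion: reward policies that steer a 1-D state toward 0."""
    state, total = 5.0, 0.0
    for _ in range(20):
        action = params[0] * state + params[1]   # the policy: a 2-parameter controller
        state = state + action
        total -= abs(state)                      # closer to 0 is fitter
    return total

def hill_climb(steps=2000, noise=0.1, seed=0):
    rng = random.Random(seed)
    best = [0.0, 0.0]
    best_fit = fitness_X(best)
    for _ in range(steps):
        candidate = [p + rng.gauss(0, noise) for p in best]
        f = fitness_X(candidate)
        if f > best_fit:                         # greedy: keep mutations that score higher on X
            best, best_fit = candidate, f
    return best, best_fit

if __name__ == "__main__":
    params, score = hill_climb()
    print("selected parameters:", params, "fitness:", score)

Whether anything analogous holds for far more capable systems is exactly the disputed claim in the thread; the sketch only illustrates the distinction between an outer selection criterion and the contents of the thing selected.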
Replying to @ESYudkowsky @amalgamary
Interesting. I would have thought your latter statement was obviously wrong, which of necessity recommends a huge update in my beliefs (which must now be set at "I don't understand this domain very well and am very bad at predicting things as they relate to it").
Replying to @ADeebus @amalgamary
Are you sure you've understood the statement that you think you'd have considered obviously wrong? Like, is the evolution of humans now a clear example and demonstration of the statement that sounded obviously wrong?
Replying to @ESYudkowsky @amalgamary
I think so? Perhaps you'd help: if I understand, you're saying that a species making incremental updates towards a set goal (either via evolution or programming) is unlikely to yield higher-order thinking that still moves towards that goal in a predictable way? Like how rock 1/2
crystals may have formed larger and more complex structures, and yet once they got complex enough to (in one theory) evolve into living cells, their replication no longer tracked the previously inferred goal of "make bigger rock crystals"? 2/2
Replying to @ADeebus @amalgamary
I'm not sure that's a good example? They were unintelligent systems reproducing by other means. Humans, who do have explicit desires, have no explicit desire to replicate DNA. External selection on X doesn't produce internal explicit quoted goals of X.
Replying to @ESYudkowsky @amalgamary
Makes sense. I was trying to come up with something that was far enough away to not merely be aping the previous examples, but it ended up more as an analogy. But I do think I understand this thread, at least. And so I'm confident in my lack of confidence in this domain. Thanks!