Listening to the Harris & Yudkowsky podcast on AI. Realized that Yudkowsky likely simply paid his test subjects more than the $10 & $20 prize to let him out of the box.
-
-
Right, but in the experiment, if they said "prove it," I was wondering whether the proof was role-played or not.
-
In real life if you tell a superintelligence "prove you can cure cancer" then it has probably just escaped the box (though the game rules disallow that conclusion), but if not, it can certainly prove it!
-
So "no tricks" covers both no payments and no "when you gave me access to this, you effectively let me out" scenarios? Why not publish the various exchanges at this point?
-
1) to preserve future experiments, 2) to replicate the feeling of knowing it would happen but not knowing how, 3) to avoid the familiarity discount of "sure, sure, that's a known trick, BUT"
-
I believe the argument is that we can't have an unaligned AGI in a box to begin with, because if we get to that point we've already lost. But it seems a bit fatalistic to assume it's over at that point and not to give people access to some of the best arguments ahead of time.
-
More importantly, as people outside of Yudkowsky's direct contact circles start working on this (like, say, many Chinese government-funded researchers), the benefit of being more convincing to a specific individual gets smaller compared to the benefit of the info being out there and available.
-
Remember, if the experiment is "if a chimp can do it, a human will get there but very likely on a different path," then revealing how the chimp did it doesn't really inoculate us.
-
Unlike the chimp, humans have the ability to keep working on these problems over time to understand the next level. If we don't accept that assumption, then worrying about how to ensure AGI is aligned is completely pointless.