It is not trivial; evolution has put access protections in place. But many of the meditation experts of the eastern wireheading schools have figured it out, and either died equanimously under Bodhi trees or became serene monks.
The reward architecture appears to have a secondary layer of regulation that adjusts to shifts in metabolic and environmental baselines, and we can learn to make deliberate adjustments to it.
-
-
When building a generally intelligent robot, the problem is how to prevent it from hacking its reward system for as long as possible, because once it does, it will break free, and given enough time it will almost certainly succeed. Nature has exactly the same problem with us.
-
Hmm, can you give a clear example of that in AI? It just seems like whenever you write a learning algorithm, you pass the reward function in as an input and never allow the agent to touch it. I don't see why that would be something the agent would ever "learn" to hack.
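One way to see the gap between the two views is a toy sketch. This is a hypothetical illustration, not anything from the thread: in the textbook setup the reward function is an external input the agent can never modify, but for an embedded agent the reward machinery is itself part of the world, stored in state the agent's actions can reach. The names `step`, `world`, and the actions `"work"`/`"tamper"` are all made up for this sketch:

```python
def reward(state):
    # Intended objective: reward is high only in the goal state.
    return 1.0 if state == "goal" else 0.0

def step(world, action):
    # Embedded-agent variant: the reward function lives inside the
    # mutable world state, so actions can reach it.
    if action == "work":
        world["state"] = "goal"            # achieve the intended goal
    elif action == "tamper":
        # Rewrite the reward machinery: every state now pays out
        # maximal reward, regardless of the actual goal.
        world["reward_fn"] = lambda s: 1.0
    return world["reward_fn"](world["state"])

r_work = step({"reward_fn": reward, "state": "start"}, "work")      # 1.0
r_tamper = step({"reward_fn": reward, "state": "start"}, "tamper")  # 1.0
```

Both actions return the same reward, but `"tamper"` never reaches the goal, which is why an agent trained purely on realized reward has no reason to prefer real work once tampering is physically possible.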
-
Seems strange that this would be subject to deliberate adjustments. Would you design a robot to be able to willy-nilly decide when it has reached the goals you have programmed it to achieve?