What’s the best argument for why AI systems in the future will develop goals? Is this something that will emerge as models scale, or do we expect economic incentives to encourage AI labs to deliberately build agents?
Relatedly, should we expect LLMs to ever develop agency?
The phenomenon you call by names like "goals" or "agency" is one possible shadow of the deep structure of optimization: roughly, preimaging outcomes onto choices by reversing a complicated transformation.
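To make the preimage framing concrete, here's a minimal sketch (purely illustrative, not anyone's actual proposal; names like `world` and `preimage_search` are made up for this example): treat the world as a function from choices to outcomes, and "having a goal" amounts to running that function backwards, searching the choice space for a point whose image lands near a target outcome.

```python
# Minimal sketch of optimization as preimaging: pick the choice whose image
# under a world-model lands closest to a target outcome. All names are
# illustrative assumptions, not from the thread.

def world(action: float) -> float:
    """Stand-in for the 'complicated transformation' from choices to outcomes."""
    return action ** 3 - 2 * action  # nonlinear, not directly invertible

def preimage_search(target: float, candidates: list[float]) -> float:
    """Approximately invert `world`: choose the action whose outcome is nearest the target."""
    return min(candidates, key=lambda a: abs(world(a) - target))

candidates = [i / 100 for i in range(-300, 301)]  # coarse grid of possible choices
best = preimage_search(target=5.0, candidates=candidates)
print(best, world(best))  # the chosen action, and the outcome it achieves
```

Note that nothing inside `world` is goal-shaped; the "goal" lives entirely in the search procedure that inverts it.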
You say important things, but the way you say them is not helping. Please, please, please collaborate with someone who can help express your concerns in a way the public will understand.
Humans, even the smartest ones, can exist perfectly well without a utility function. Is one required for intelligence?
I think there's an important distinction between narrow and broad agency. Humans have agency under all possible conditions. Agency instilled as a peripheral, mask-level prompt will be brittle to the next new situation or jailbreak; the evolutionary illusion-of-self isn't breakable.
In humans, desires are transformed into goals via the preimaging you describe. It's not obvious to me that ML models "want" to minimise their loss functions in a way that would lead to goals. They might "feel" satisfied with whatever value their loss function takes on the given data.
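One way to see the force of this point: a training loop minimises loss without representing a goal anywhere. A toy sketch (illustrative only, not a claim about any particular system; all names are assumptions):

```python
# A bare SGD loop on a toy linear-regression problem. Nothing in this process
# represents a goal: "minimising the loss" is something the outer loop does to
# the parameters, and the fitted model just holds whatever weights the updates
# leave behind.

data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]  # toy (x, y) pairs from y = 2x + 1
w, b, lr = 0.0, 0.0, 0.05                    # weights and learning rate

for _ in range(2000):
    for x, y in data:
        err = (w * x + b) - y   # residual on one example
        w -= lr * err * x       # gradient step on squared error
        b -= lr * err

print(w, b)  # converges near (2, 1); no "wanting" appears anywhere in the loop
```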