belief in “incentive design” is quite touching
It’s not that incentives don’t work, it’s that there are more incentives in play in a complex system than you can dominate by design
behaviors reveal landscape of incentives
thinking designed incentives drive behaviors is a bit like thinking mountains will step aside for roads
At best, designed incentives can usefully perturb wild behaviors if you take the time to map their host landscape a bit
Like how road builders survey land in laying out roads
A particular problem with legible incentives is that they rapidly hit cognitive limits: accurate, rational computation of optimal response to incentives is for trading algos, not humans. Humans struggle to grok even simple shit like “pay off highest interest debt first”
I think I last acted optimally in relation to my incentives in 2003
jk I’ve never done it
Once Cass Sunstein tried to nudge me to eat a salad but I didn’t notice the clever cue and ate a sandwich instead
The Lebowski theorem applies more to humans than AIs.
Quote Tweet
The Lebowski theorem: No superintelligent AI is going to bother with a task that is harder than hacking its reward function
Most incentives try to get humans to invest reasonable effort for fair rewards. Most humans only pay attention if there are unfair rewards. So you have to offer unfair rewards or penalties (“death for littering”) to shape behavior strongly enough to claim you’ve “designed” it

