75. Internet of refugees
Replying to
Quote Tweet
An agent which learned to play Mario without rewards. Instead, it was incentivized to avoid "boredom" (that is, getting into states where it can predict what will happen next). Discovered warp levels, how to defeat bosses, etc. More details: blog.openai.com/reinforcement-
0:45
334.1K views
3
1
7

