An agent which learned to play Mario without rewards. Instead, it was incentivized to avoid "boredom" (that is, getting into states where it can predict what will happen next). Discovered warp levels, how to defeat bosses, etc. More details: https://blog.openai.com/reinforcement-learning-with-prediction-based-rewards/ …pic.twitter.com/6ObS35iZZS
-
-
Replying to @gdb
Holy s**t this is amazing! When you're so bored, you'd literally stand there and watch noisy TV.pic.twitter.com/caww7QE0ZP
4 replies 91 retweets 342 likes -
Replying to @ashwinning @gdb
This is me mindlessly scrolling Twitter instead of reading the book I sat down with.
3 replies 5 retweets 100 likes
:(
12:15 PM - 1 Nov 2018
0 replies
0 retweets
0 likes
Loading seems to be taking a while.
Twitter may be over capacity or experiencing a momentary hiccup. Try again or visit Twitter Status for more information.