Why doesn’t RL use more toy tasks to measure advances in specific aspects of a problem, like long-term planning, large action spaces, and imperfect information? Complex environments such as StarCraft are impressive, but they make it difficult to disentangle *why* an agent wins.
There's this one https://arxiv.org/abs/1806.01946: a 5×5 grid viewed as pixels, 3 objects, 3 possible shapes, 3 possible colours. Simple RL task, right? Wrong. It's instruction-conditional, and although there are only a few thousand instructions (a new one every episode), A3C struggles.
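For concreteness, here is a minimal sketch of what an instruction-conditional gridworld of that flavour could look like. The class, instruction template, and reward scheme are illustrative guesses, not the paper's actual environment:

```python
import random
import numpy as np

SHAPES = ["circle", "square", "triangle"]
COLOURS = ["red", "green", "blue"]

class InstructionGrid:
    """Toy instruction-conditional gridworld: a fresh instruction each
    episode names the object the agent must reach. Hypothetical sketch."""

    def __init__(self, size=5, n_objects=3, seed=None):
        self.size = size
        self.n_objects = n_objects
        self.rng = random.Random(seed)

    def reset(self):
        # Place the agent and n_objects in distinct cells.
        cells = self.rng.sample(
            [(r, c) for r in range(self.size) for c in range(self.size)],
            self.n_objects + 1)
        self.agent = cells[0]
        self.objects = [(pos, self.rng.choice(SHAPES), self.rng.choice(COLOURS))
                        for pos in cells[1:]]
        # A new instruction every episode, naming one target object.
        _, shape, colour = self.rng.choice(self.objects)
        self.instruction = f"go to the {colour} {shape}"
        return self._obs()

    def step(self, action):
        # Actions: 0=up, 1=down, 2=left, 3=right; episode ends on any object.
        dr, dc = [(-1, 0), (1, 0), (0, -1), (0, 1)][action]
        self.agent = (min(max(self.agent[0] + dr, 0), self.size - 1),
                      min(max(self.agent[1] + dc, 0), self.size - 1))
        for pos, shape, colour in self.objects:
            if pos == self.agent:
                # Reward 1 only if this is the object the instruction names.
                hit = self.instruction == f"go to the {colour} {shape}"
                return self._obs(), float(hit), True, {}
        return self._obs(), 0.0, False, {}

    def _obs(self):
        # "Pixel" view: one channel for the agent, one per shape and colour.
        grid = np.zeros((self.size, self.size, 1 + len(SHAPES) + len(COLOURS)),
                        dtype=np.float32)
        grid[self.agent][0] = 1.0
        for (r, c), shape, colour in self.objects:
            grid[r, c, 1 + SHAPES.index(shape)] = 1.0
            grid[r, c, 1 + len(SHAPES) + COLOURS.index(colour)] = 1.0
        return grid, self.instruction
```

Even at this size, the agent has to ground the instruction in the pixels rather than memorise a fixed goal, which is plausibly what trips up a plain A3C baseline.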
More recently there's https://arxiv.org/abs/1810.08272 by @DBahdanau and some MILA folk: a visually and conceptually simple language-conditional environment with difficult tasks.
It goes without saying there are probably many more papers that tackle similar problems but are either consigned to obscurity or rejected by superficial reviewers because they don't "look" hard enough.
New conversation
Our recent paper used BoxWorld, which can be an arbitrarily complex planning problem: https://openreview.net/forum?id=HkxaFoC9KQ
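To unpack "arbitrarily complex": BoxWorld's difficulty is controlled by the length of the key-lock chain the agent must follow and by the number of distractor branches. A rough sketch of generating such a graph, with a made-up data layout rather than the paper's code:

```python
import random

def make_boxworld_chain(solution_length, num_distractors, rng=random):
    """Sketch of BoxWorld's underlying key-lock graph. The agent holds one
    key at a time; each lock on the solution path yields the key for the
    next lock, so planning depth grows with `solution_length`."""
    colours = list(range(solution_length + num_distractors + 1))
    rng.shuffle(colours)
    boxes = []  # (lock_colour, key_colour_inside)
    for i in range(solution_length):
        boxes.append((colours[i], colours[i + 1]))
    # Distractor branches: openable with a solution-path key, but dead ends.
    for j in range(num_distractors):
        branch_from = rng.randrange(solution_length)
        boxes.append((colours[branch_from], colours[solution_length + 1 + j]))
    first_key = colours[0]                 # loose key the agent starts with
    gem = colours[solution_length]         # prize behind the final lock
    return first_key, boxes, gem

first_key, boxes, gem = make_boxworld_chain(solution_length=4, num_distractors=2)
```

Cranking up `solution_length` lengthens the chain of dependencies without changing the visual complexity at all, which is exactly the disentangling property the original tweet asks for.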
New conversation
My only slight issue with gridworlds is that they're often used in papers the "wrong way round": you first think up a model, then contrive a gridworld that your model can solve but, say, an LSTM can't. Papers could do with motivating *why* a specific gridworld is important.
New conversation
Maybe a subcollection could be created within OpenAI Gym?
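That would be lightweight to do, since Gym's `register`/`gym.make` machinery already supports custom environments. A sketch, written against the classic Gym API (where `reset` returns an observation and `step` a 4-tuple; newer Gym/Gymnasium versions changed this). The environment and its id are stand-ins, not an existing Gym task:

```python
import gym
import numpy as np
from gym import spaces
from gym.envs.registration import register

class CornerGridEnv(gym.Env):
    """Trivial stand-in toy task: start at (0, 0) on a 5x5 grid,
    reward 1 for reaching (4, 4)."""

    def __init__(self):
        self.observation_space = spaces.Box(0.0, 1.0, shape=(5, 5),
                                            dtype=np.float32)
        self.action_space = spaces.Discrete(4)  # up, down, left, right
        self.pos = (0, 0)

    def reset(self):
        self.pos = (0, 0)
        return self._obs()

    def step(self, action):
        dr, dc = [(-1, 0), (1, 0), (0, -1), (0, 1)][action]
        self.pos = (min(max(self.pos[0] + dr, 0), 4),
                    min(max(self.pos[1] + dc, 0), 4))
        done = self.pos == (4, 4)
        return self._obs(), float(done), done, {}

    def _obs(self):
        grid = np.zeros((5, 5), dtype=np.float32)
        grid[self.pos] = 1.0
        return grid

# A hypothetical "toy tasks" id scheme; entry_point can be the class itself.
register(id="ToyCornerGrid-v0", entry_point=CornerGridEnv,
         max_episode_steps=50)

env = gym.make("ToyCornerGrid-v0")
obs = env.reset()
```

A shared namespace like this would at least make the envs discoverable, which speaks to the obscurity problem raised earlier in the thread.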
