Letting an agent crash a car a million times in a simulated universe is fine! We don't really feel comfortable letting agents explore an environment when that environment is actual people, and simulating how real people behave is pretty difficult.
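The safe-in-simulation point can be sketched with a toy, entirely hypothetical simulator (not from any real RL library) where crashing the simulated car costs nothing:

```python
import random

# Hypothetical toy driving simulator: each step the policy picks a speed;
# going too fast on a "curve" segment crashes the simulated car. Crashes
# here cost nothing -- that is the whole point of training in simulation.
def run_episode(policy, rng):
    position, crashed = 0, False
    for _ in range(100):
        curve = rng.random() < 0.3           # 30% of road segments are curves
        speed = policy(curve)
        if curve and speed > 1:              # too fast on a curve -> crash
            crashed = True
            break
        position += speed
    return position, crashed

rng = random.Random(0)
reckless = lambda curve: 2                   # always floor it
careful = lambda curve: 1 if curve else 2    # slow down on curves

# The reckless policy can crash over and over with zero real-world harm.
crashes = sum(run_episode(reckless, rng)[1] for _ in range(1000))
print(crashes)
```

A reckless policy fails in nearly every simulated episode at no real-world cost, which is exactly what makes exploration in simulation attractive and exploration on real people unacceptable.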
I think @fchollet's point is better illustrated by enterprise software, or the software behind some app, where no lives are immediately at stake. It's not that RL doesn't work well enough to trust with lives; it's that, more often than not, it just doesn't work.
New conversation
just wait until someone figures out how to incorporate RL into Excel spreadsheets
Hope it's better than Excel's supervised learning interface!
End of conversation
New conversation
Maybe. Just because a problem can be seen as an RL problem doesn't necessarily mean that's the most productive way of seeing it. Many problems that could be framed as RL might be better seen as something else (planning, optimization, supervised learning, etc.).
I agree though that we do not in general have good ways of solving RL problems.
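As a sketch of that framing point: shortest-path navigation *could* be posed as RL (reward of -1 per step until the goal), but it is solved exactly and instantly as a planning problem. A minimal breadth-first search on a made-up 4x4 grid (all names and the grid itself are illustrative, not from the thread):

```python
from collections import deque

# Shortest path on a small grid, solved by planning (BFS) rather than RL.
# walls is a set of blocked cells; returns the number of steps, or None
# if the goal is unreachable.
def bfs_steps(start, goal, walls, size=4):
    frontier, seen = deque([(start, 0)]), {start}
    while frontier:
        (x, y), d = frontier.popleft()
        if (x, y) == goal:
            return d
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < size and 0 <= ny < size \
                    and (nx, ny) not in walls and (nx, ny) not in seen:
                seen.add((nx, ny))
                frontier.append(((nx, ny), d + 1))
    return None

print(bfs_steps((0, 0), (3, 3), walls={(1, 1), (2, 2)}))  # 6
```

An RL agent could eventually learn the same route from trial and error, but the planning view gives the exact answer with no training at all.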
New conversation
My perspective is that RL is suited to sequential decision problems, and quite honestly we haven't tackled enough significant classes of action/reward systems thus far.
starting with multi-armed bandits
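The multi-armed bandit is about the simplest action/reward system there is. A minimal epsilon-greedy sketch (the arm payout probabilities and step counts are made up for illustration):

```python
import random

# Epsilon-greedy for a multi-armed bandit: each arm pays 1 with a fixed
# hidden probability. The agent balances exploring random arms against
# exploiting the arm with the best running estimate.
def epsilon_greedy(arm_probs, steps=5000, eps=0.1, seed=0):
    rng = random.Random(seed)
    counts = [0] * len(arm_probs)
    values = [0.0] * len(arm_probs)   # running mean reward per arm
    for _ in range(steps):
        if rng.random() < eps:
            arm = rng.randrange(len(arm_probs))                       # explore
        else:
            arm = max(range(len(arm_probs)), key=values.__getitem__)  # exploit
        reward = 1.0 if rng.random() < arm_probs[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]           # incremental mean
    return counts, values

counts, values = epsilon_greedy([0.2, 0.5, 0.8])
print(max(range(3), key=counts.__getitem__))  # index of the most-pulled arm
```

With enough pulls, the estimates converge and the best arm dominates the pull counts; the hard part of RL starts when actions also change the state, which bandits deliberately ignore.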
End of conversation
New conversation
So RL just needs more time & people to be hands-on with it? Or do you feel it will stay boxed in for specific uses?
This Tweet is unavailable.
They are working on this limitation. Recently there was research on learning to drive (in the real world) with an RL algorithm.