To the left, you see a trained agent playing a level of a game. To the right, you see the same playthrough from an agent-centric perspective: cropped, translated, and rotated with the agent in the center. Which perspective is the best input for the agent? https://arxiv.org/abs/2001.09908 pic.twitter.com/7bCtBp8xUG
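For a concrete picture of what "cropped, translated, and rotated with the agent in the center" could mean, here is a minimal sketch. It assumes the level is a 2D tile grid and that the agent's heading comes in 90-degree steps; the function name `agent_centric_view` and its parameters are hypothetical and not taken from the linked paper.

```python
import numpy as np

def agent_centric_view(grid, agent_rc, heading, radius, pad_value=0):
    """Return a (2*radius+1)-sized square window centred on the agent,
    rotated so that the agent's facing direction points up.

    Assumption: heading counts clockwise quarter turns
    (0 = up, 1 = right, 2 = down, 3 = left).
    """
    r, c = agent_rc
    # Pad every side so the window never falls off the edge of the level.
    padded = np.pad(grid, radius, mode="constant", constant_values=pad_value)
    # After padding the agent sits at (r + radius, c + radius); crop around it.
    window = padded[r : r + 2 * radius + 1, c : c + 2 * radius + 1]
    # Rotate counter-clockwise by `heading` quarter turns: facing direction ends up on top.
    return np.rot90(window, k=heading)

# Usage: a 5x5 level, agent in the middle, facing right.
level = np.arange(25).reshape(5, 5)
view = agent_centric_view(level, agent_rc=(2, 2), heading=1, radius=1)
print(view)
```

The agent always lands in the middle cell of `view`, and the tile it is facing always lands directly above it, so a policy reading this window never has to work out where the agent is or which way it points.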
It is well known from "non-deep" agent learning research (say, evolutionary robotics work and game AI work from the early to mid 2000s) that the sensor representation is extremely important to the agent's ability to learn. http://julian.togelius.com/VanHoorn2009Hierarchical.pdf …pic.twitter.com/Ugj9IDsdDR
The promise/premise of deep learning is that we don't have to worry about these representations, because the network will figure out the input representation itself. But, really, will it? Will a network of a few layers really learn to rotate and translate to focus on the agent?
Also, can it? It is hard to imagine that a neural network of just a few layers could actually implement the transformations necessary to even understand where things are relative to the agent, so that the policy can be location-independent.
It is possible that what the standard paradigm promises, a neural network with a handful of layers learning to master, say, Atari games from a static third-person view, is actually impossible. That is, the network doesn't learn any general playing skills. It learns some kind of stimulus-response table.
In any case, even if this is possible in principle, it seems that the way we represent the input makes a lot of difference for the generality of skills that can be learned in practice.