A huge focus in RL (at least at this year's ICLR conference) is on advancements in sample efficiency
-
It took us millions of years to get good at hide and seek.
-
Exactly; an unfair comparison by Pedro. Not that the network actually knows what it is doing, but then, does a fly know that it is flying? Nope, it just flies.
-
Well, surely the first time a human played hide and seek, she had already learned how to use tools in different contexts.
-
And had watched their peers use them.
-
Humans learn to walk after "n" games of "try to walk and fall". On the other hand, some animals learn almost on their first try. Something magical must be going on... or maybe just millions of years of evolution encoded at birth? Food for thought...
-
Granted, locomotion is probably a simpler mechanism than the ability to generalize from few samples, but I suspect that mechanism in humans builds upon other mechanisms that evolved over millions of years. So it's still remarkable, but perhaps there is no shortcut to intelligence.
-
It is not about how long it took, but about what the agents actually learned. This behavioral approach relies on inferring what the agent knows by looking at how the agent acts. But this premise is invalid if the agent was trained on the very task used for evaluation.
-
By sugar-coating a path we can make ants write like Shakespeare. If anything, we are using RL to build zombie agents that are internally devoid of purpose, understanding and, of course, intelligence.
-
But we have been learning all our lives, Sir.