Excited to share our work: collaboration requires understanding! In Overcooked, self-play doesn't gel with humans: it expects them to play like itself. (1/4)
Demo: https://humancompatibleai.github.io/overcooked-demo/
Blog: https://bair.berkeley.edu/blog/2019/10/21/coordination/
Paper: https://arxiv.org/abs/1910.05789
Code: https://github.com/HumanCompatibleAI/overcooked_ai
pic.twitter.com/lqbzeTwoqr
In competitive games, the minimax theorem allows self-play to be agnostic to its opponent: if they are suboptimal, SP will crush them even harder. That doesn't work in collaborative games, where the partner's suboptimal move and SP's failure to anticipate it will hurt. (2/4)
pic.twitter.com/6I6KwLOp0Z
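The minimax point above can be illustrated with a toy two-action zero-sum game (the payoff matrix is made up for this sketch, not from the paper): the minimax strategy guarantees the game value against any opponent, and an opponent's suboptimality can only raise the minimax player's payoff.

```python
# Zero-sum illustration (hypothetical payoffs): payoff[r][c] is the row
# player's reward; the column player receives the negation.
payoff = [[1.0, 2.0],
          [0.0, 3.0]]

# Pure-strategy minimax for the row player: maximize the worst case over
# the opponent's columns.
security = [min(row) for row in payoff]          # worst case per row: [1.0, 0.0]
mm_row = max((0, 1), key=lambda r: security[r])  # row 0, guaranteed >= 1.0

best_col = min((0, 1), key=lambda c: payoff[mm_row][c])  # optimal opponent reply
bad_col = 1 - best_col                                    # a suboptimal opponent

print(payoff[mm_row][best_col])  # 1.0: the guaranteed value
print(payoff[mm_row][bad_col])   # 2.0: suboptimality only helps the minimax player
```

In a common-payoff game this agnosticism breaks down: there is no such guarantee, because a partner who deviates from the joint optimum drags both players' reward down.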
We need an agent that has the right "expectation" about its partner. Obvious solution: train a human model with behavior cloning, and then train an agent to play well with that model. This does way better than SP in simulation (i.e. evaluated against a "test" human model). (3/4)
pic.twitter.com/v1ykAkLpkE
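The recipe in this tweet (behavior-clone a human model, then train a best response to it) can be sketched in a toy one-shot collaborative game. The payoff matrix and the logged "human" actions below are invented for illustration; in a one-shot setting, behavior cloning collapses to the empirical action distribution.

```python
from collections import Counter

# Toy two-action common-payoff game (hypothetical payoffs, not from the
# paper): payoff[my_action][partner_action]; both players share the reward.
payoff = [[4.0, 0.0],
          [1.0, 2.0]]

# Logged "human" actions (hypothetical data): this partner usually picks 1.
human_actions = [1, 1, 0, 1, 1]

# Behavior cloning in a one-shot game reduces to the empirical action
# distribution of the human partner.
counts = Counter(human_actions)
bc_model = [counts[a] / len(human_actions) for a in (0, 1)]  # [0.2, 0.8]

def value(my_action, partner_dist):
    """Expected shared reward of my_action against a partner distribution."""
    return sum(p * payoff[my_action][a] for a, p in enumerate(partner_dist))

# Self-play settles on the joint optimum (0, 0) with reward 4, implicitly
# expecting its partner to coordinate on the same cell.
sp_action = 0

# The human-aware agent instead best-responds to the cloned human model.
ha_action = max((0, 1), key=lambda a: value(a, bc_model))

print(sp_action, value(sp_action, bc_model))  # self-play fares poorly vs. humans
print(ha_action, value(ha_action, bc_model))  # human-aware does better
```

The same structure drives the full method, with the empirical distribution replaced by a learned policy over Overcooked states and the best response trained with RL.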
Real humans adapt to the opaque protocols that SP learns, and play differently than the naive behavior-cloned model that our agent was trained against, so the effect is smaller. Nonetheless, the human-aware agent still does better, sometimes beating human performance! (4/4)
pic.twitter.com/FmR9Mn2Xwx