Rohin Shah

@rohinmshah

PhD student at the Center for Human-Compatible AI at UC Berkeley. I publish the Alignment Newsletter.

Joined October 2017

Media

  1. Oct 21, 2019

    Real humans adapt to the opaque protocols that SP learns, and play differently than the naive behavior cloned model that our agent was trained against, so the effect is smaller. Nonetheless, the human-aware agent still does better, sometimes beating human performance! (4/4)

  2. Oct 21, 2019

    We need an agent that has the right “expectation” about its partner. Obvious solution: train a human model with behavior cloning, and then train an agent to play well with that model. This does way better than SP in simulation (i.e. evaluated against a “test” human model). (3/4)

  3. Oct 21, 2019

    In competitive games, the minimax theorem allows self-play to be agnostic to its opponent: if they are suboptimal, SP will crush them even harder. That doesn’t work in collaborative games, where the partner’s suboptimal move and SP’s failure to anticipate it will hurt. (2/4)

  4. Oct 21, 2019

    Excited to share our work: collaboration requires understanding! In Overcooked, self-play doesn't gel with humans: it expects them to play like itself. (1/4) Demo: Blog: Paper: Code:

  5. Feb 12, 2019

    We developed Reward Learning by Simulating the Past (RLSP), and created a suite of gridworlds to showcase its properties. The top row shows what happens with a misspecified reward, while the bottom shows what happens when using RLSP to correct the reward. (4/4)

  6. Feb 12, 2019

    We've released our work on inferring human preferences from the state of the world! By thinking about what "must have happened" in the past, we can infer what should or shouldn't be done. Blog post: Paper: (1/4)

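The contrast drawn in the (2/4) and (3/4) tweets above can be illustrated with a toy sketch. This is not the Overcooked setup or the paper's code: the two 2x2 payoff matrices, the helper names `maximin_row` and `best_response`, and the "partner who always plays action 1" are all made-up assumptions chosen only to show the shape of the argument. In the zero-sum game, the maximin action keeps its guaranteed value no matter what the opponent does. In the common-payoff game, an agent that assumes its partner follows the self-play convention gets nothing when the partner uses a different one, while an agent that best-responds to a model of the partner recovers most of the reward.

```python
import numpy as np

# Hypothetical zero-sum game: entry [i, j] is the row player's payoff,
# and the column player receives the negative of it.
ZERO_SUM = np.array([[3, 1],
                     [4, 2]])

# Hypothetical common-payoff (collaborative) game: both players receive entry [i, j].
COMMON = np.array([[10, 0],
                   [0, 8]])


def maximin_row(payoff):
    """Return the row action with the best worst-case payoff, and that payoff."""
    worst_case = payoff.min(axis=1)
    action = int(worst_case.argmax())
    return action, int(worst_case[action])


def best_response(payoff, partner_action):
    """Return the row action that maximizes payoff against a fixed partner action."""
    return int(payoff[:, partner_action].argmax())


# Competitive case: the maximin action never does worse than its guarantee,
# and does better when the opponent deviates from its own best column.
row_action, guaranteed = maximin_row(ZERO_SUM)
zero_sum_payoffs = [int(ZERO_SUM[row_action, j]) for j in range(ZERO_SUM.shape[1])]

# Collaborative case: "self-play" here just picks the jointly optimal cell
# and assumes the partner will play its half of that convention.
sp_row, sp_col = np.unravel_index(COMMON.argmax(), COMMON.shape)

# Assumed partner model: a human who always plays action 1 instead.
partner_action = 1
sp_payoff = int(COMMON[sp_row, partner_action])

# A partner-aware agent best-responds to the modeled partner.
aware_action = best_response(COMMON, partner_action)
aware_payoff = int(COMMON[aware_action, partner_action])

print("zero-sum guarantee:", guaranteed, "payoffs vs each column:", zero_sum_payoffs)
print("self-play payoff with mismatched partner:", sp_payoff)
print("partner-aware payoff:", aware_payoff)
```

The zero-sum matrix was picked to have a pure-strategy saddle point so that no linear program is needed; a general game would require mixed strategies.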
