Tom Everitt

@tom4everitt

Research Scientist working on AI safety at

Joined August 2017

Tweets


  1. Retweeted
    24 Jan

    Nice example of how SCM can serve as a laboratory to test various interpretations of familiar and colloquially used terms, in this case "incentives".

  2. 23 Jan

    Thanks to structural causal models, we now have a more precise understanding of incentives in causal influence diagrams. Blog post: arXiv:

  3. Retweeted
    15 Jan

    We have 2 papers published in today! 🎉 One describes AlphaFold, which uses deep neural networks to predict protein structures with high accuracy. AlphaFold made the most accurate predictions at the 2018 scientific community assessment CASP13. 1/4

  4. 27 Dec 2019

    Nice game-theoretic analysis of debate as an AI safety method

  5. 13 Dec 2019

    Model-based RL can be really helpful for safety. Great work!

  6. Retweeted
    13 Dec 2019

    How do you train an RL agent in the presence of unknown, unsafe states without visiting them even once? New algorithm by our intern synthesizes trajectories with a generative model and asks a human to label them for safety.

  7. Retweeted
    14 Aug 2019

    With time, we hope these design principles can help inform best practices for how to build capable RL agents without reward tampering incentives. This builds upon our previous work on understanding agent incentives with causal influence diagrams:

  8. Retweeted
    14 Aug 2019

    In our latest AI safety blog post, we explore principled solutions to the reward tampering problem, in which a reinforcement learning agent actively changes its reward function to maximise reward. Blog post: Paper:

  9. 14 Aug 2019

    How can we make sure that agents work to achieve their intended task instead of tampering with their reward mechanism? Reward tampering problems correspond to undesired causal paths in causal influence diagrams, and solutions correspond to ways of removing those paths.

  10. Retweeted
    14 Aug 2019

    another step towards developing a set of best practices for designing safe RL agents - in this case, by avoiding incentives for agents to tamper with their own reward function. great work, and team 🚀 🤖 ✅

  11. Retweeted
    27 Feb 2019

    Many AI safety problems are about agent incentives. In our latest blog post, we suggest a systematic study of agent incentives based on influence diagrams and graphical criteria for different types of incentives. Blog post: Paper:

  12. Retweeted
    20 Nov 2018

    The agent alignment problem may be one of the biggest obstacles for using ML to improve people’s lives. Today I’m very excited to share a research direction for how we’ll aim to solve alignment at . Blog post: Paper:

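The tweets above frame reward tampering as an undesired directed path in a causal influence diagram, and solutions as ways of removing that path. A minimal sketch of that framing, with hypothetical node names of my own choosing (not the papers' code or notation):

```python
# Sketch: reward tampering as a directed action->reward path that passes
# through the agent's own reward function, in a toy causal influence diagram.
# Edges point from cause to effect; node names are illustrative only.
CID = {
    "action": ["state", "reward_function"],  # agent changes the world AND can rewrite its reward fn
    "state": ["reward"],                     # intended path: reward reflects the state
    "reward_function": ["reward"],           # the (possibly tampered) function produces reward
    "reward": [],
}

def directed_paths(graph, src, dst, prefix=()):
    """Yield all simple directed paths from src to dst."""
    prefix = prefix + (src,)
    if src == dst:
        yield list(prefix)
        return
    for nxt in graph.get(src, []):
        if nxt not in prefix:  # keep paths simple (no revisits)
            yield from directed_paths(graph, nxt, dst, prefix)

def tampering_paths(graph):
    """Action->reward paths through the reward function: the undesired ones."""
    return [p for p in directed_paths(graph, "action", "reward")
            if "reward_function" in p]

print(tampering_paths(CID))
# → [['action', 'reward_function', 'reward']]

# A "solution" in this picture removes the offending path, e.g. by making
# the reward function causally unmodifiable by the agent.
CID["action"] = ["state"]
print(tampering_paths(CID))
# → []
```

The intended action → state → reward path survives in both cases; only the tampering path is cut, which is the graphical version of the design principle the thread describes.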
