Will Dabney

@wwdabney

Research scientist at DeepMind. On the critical path to AGI. Also, a persistent optimist.

London, England
Joined May 2015

Tweets


  1. Retweeted

    By teaching machines to understand our true desires, one scientist hopes to avoid the potentially disastrous consequences of having them do what we command.

  2. Jan 29

    Our paper 'A distributional code for value in dopamine-based reinforcement learning' on the cover of ! Read it here: Shout out to the amazing artists/designers at who make this possible, while we get to focus on the research.

  3. Retweeted
    Jan 28

    I haven't found a single person who has used jax and said they don't like it. I've been actively priming people to criticize it, but no one does. Instead they tell me how good it feels getting off of Tensorflow. Looking forward to jaxxing myself soon.

  4. Retweeted
    Jan 16

    Hey everyone, I'm so excited to share my recent interview on Music & AI plus "A Geometric Perspective on Reinforcement Learning" with for the podcast. Check it out! via

  6. Jan 16

    Thanks everyone! You can also read the paper for free here:

  7. Jan 15

    And it all started (for me) almost exactly three years ago working with and Remi on distributional RL:

  8. Jan 15

    It has been an incredible collaboration with my co-authors, especially working with and Matt Botvinick. Also incredibly grateful to Naoshige Uchida and Clara Starkweather from Harvard, as well as Remi Munos and Demis Hassabis for their work and constant endurance! 2/

  9. Jan 15

    When neuroscience and AI researchers get to chatting, cool stuff happens! My first, and I hope not last, trip into neuroscience has been published in Nature. 1/

  10. Jan 8

    Almost all of these (IMO) apply equally well to research. I most disagree with the “short 1:1, long group meetings” one, but do other research people think most of these apply to them?

  11. 21 Dec 2019

    Work is also with Andre Barreto, , Tom Van de Wiele, and !

  12. 21 Dec 2019

    Happy to have worked with on research combining behavioural mutual information and successor features, which has been accepted for oral presentation at ICLR. Favorite part: clean answer to where to get the “features” for successor features.

  13. 8 Dec 2019

    So fun how non-stationary our perception is. It’s not hard to get the direction cued onto any physical change. Opening/closing hand, blinking, you can even pretend to spin it this way and that with your thumb and it will switch.

  14. Retweeted
    5 Dec 2019

    Really excited for next week and to present our spotlight on credit assignment :) tl;dr We can rewrite value functions in terms of a hindsight quantity that explicitly captures credit assignment and get a whole new family of RL algs! 🥳

  15. 4 Dec 2019

    Let’s just hope Fox News doesn’t run this, or he might just declare war on all our allies.

  16. 22 Nov 2019

    Wow, Phil Thomas and co-authors' invention of Seldonian ML hits on so many levels. I love the clean framing of the problem, the sci-fi callback, and that it actually makes progress in an area where most articles are more philosophical than algorithmic!

  17. Retweeted
    21 Nov 2019

    Shout out to my friend and collaborator Phil Thomas, along with and others, for their new paper in Science on a general framework for defining and avoiding undesirable behavior in ML algorithms:

  18. 8 Nov 2019

    I found this inspiring. Seeing the leaders in our field leveraging their position to push back on government, not for some new funding but to get deserving researchers to a conference. Hoping Canada is proud enough of their AI leadership to fix these visa issues.

  19. 15 Oct 2019

    This is an impressive manipulation task! However, the agent isn’t “learning to solve the Rubik’s cube”. It is using a hard coded algorithm to solve the cube and has been trained to implement the limited set of macro actions needed for that algorithm. Still very cool!

  20. 14 Oct 2019

    London Underground, an expensive torturous actual monopoly with 50% downtime. Come rain, or snow, or gentle breeze you can be sure of a train failure to ruin your day.

