Tweets
Pinned Tweet
I publish the Alignment Newsletter, a weekly publication with recent content relevant to AI alignment. Check it out! http://rohinshah.com/alignment-newsletter/
Wondering what the field of long-term AI safety does, but don't want to read hundreds of posts? Check out my review of work done in 2018-19! Please do leave comments and suggestions: https://docs.google.com/document/d/1Fng1J_QPb7GEeLBMmWWfZOguw7yUTZot0egrCbKpVwk/edit# The summary is also available as Alignment Newsletter #84: https://mailchi.mp/1af38085edc5/an-84-reviewing-ai-alignment-work-in-2018-19
[Alignment Newsletter #83]: Sample-efficient deep learning with ReMixMatch - https://mailchi.mp/ff565f097630/an-83-sample-efficient-deep-learning-with-remixmatch
[Alignment Newsletter #82]: How OpenAI Five distributed their training computation - https://mailchi.mp/7ba40faa7eed/an-82-how-openai-five-distributed-their-training-computation
[Alignment Newsletter #81]: Universality as a potential solution to conceptual difficulties in intent alignment - https://mailchi.mp/6078fe4f9928/an-81-universality-as-a-potential-solution-to-conceptual-difficulties-in-intent-alignment
Alignment Newsletter #80: Why AI risk might be solved without additional intervention from longtermists - https://mailchi.mp/b3dc916ac7e2/an-80-why-ai-risk-might-be-solved-without-additional-intervention-from-longtermists
Alignment Newsletter #79: Recursive reward modeling as an alignment technique integrated with deep RL - https://mailchi.mp/8d9e3703fbde/an-79-recursive-reward-modeling-as-an-alignment-technique-integrated-with-deep-rl
Alignment Newsletter #78: Formalizing power and instrumental convergence, and the end-of-year AI safety charity comparison - https://mailchi.mp/eef1d6c95d7c/an-78formalizing-power-and-instrumental-convergence-and-the-end-of-year-ai-safety-charity-comparison
[Alignment Newsletter #77]: Double descent: a unification of statistical theory and modern ML practice - https://mailchi.mp/d2f2d15b7114/an-77-double-descent-a-unification-of-statistical-theory-and-modern-ml-practice
Rohin Shah Retweeted
Shout out to @rohinmshah for his impressive AI alignment newsletter. If you want to keep up to speed with what is going on in the field of AI alignment, there's nothing better: https://rohinshah.com/alignment-newsletter/ His team has summarised 1,200 papers to date!
Rohin Shah Retweeted
Want to ensure AI is beneficial for society? Come talk to like-minded people at the Human-Aligned AI Social at #NeurIPS2019, Thursday 7-10 pm, room West 205-207. https://nips.cc/Conferences/2019/Schedule?showEvent=15974 @claudia_shi57 @victorveitch pic.twitter.com/0KgrHGZSiu
[Alignment Newsletter #76]: How dataset size affects robustness, and benchmarking safe exploration by measuring constraint violations - https://mailchi.mp/1106d0ce6766/an-76how-dataset-size-affects-robustness-and-benchmarking-safe-exploration-by-measuring-constraint-violations
[Alignment Newsletter #75]: Solving Atari and Go with learned game models, and thoughts from a MIRI employee - https://mailchi.mp/3e34fa1f299a/an-75-solving-atari-and-go-with-learned-game-models-and-thoughts-from-a-miri-employee
[Alignment Newsletter #74]: Separating beneficial AI into competence, alignment, and coping with impacts - https://mailchi.mp/49c956f84771/an-74separating-beneficial-ai-into-competence-alignment-and-coping-with-impacts
[Alignment Newsletter #73]: Detecting catastrophic failures by learning how agents tend to break - https://mailchi.mp/ef55eb52b0fd/an-73-detecting-catastrophic-failures-by-learning-how-agents-tend-to-break
[Alignment Newsletter #72]: Alignment, robustness, methodology, and system building as research priorities for AI safety - https://mailchi.mp/cac125522aa3/an-72-alignment-robustness-methodology-and-system-building-as-research-priorities-for-ai-safety
[Alignment Newsletter #71]: Avoiding reward tampering through current-RF optimization - https://mailchi.mp/938a7eed18c3/an-71avoiding-reward-tampering-through-current-rf-optimization
[Alignment Newsletter #70]: Agents that help humans who are still learning about their own preferences - https://mailchi.mp/732eaa192df0/an-70-agents-that-help-humans-who-are-still-learning-about-their-own-preferences
Real humans adapt to the opaque protocols that SP learns, and play differently than the naive behavior-cloned model that our agent was trained against, so the effect is smaller. Nonetheless, the human-aware agent still does better, sometimes beating human performance! (4/4) pic.twitter.com/FmR9Mn2Xwx
We need an agent that has the right “expectation” about its partner. Obvious solution: train a human model with behavior cloning, and then train an agent to play well with that model. This does way better than SP in simulation (i.e. evaluated against a “test” human model). (3/4) pic.twitter.com/v1ykAkLpkE
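A minimal sketch of that two-stage recipe follows. This is an illustration, not the paper's code: the environment interface (env.reset/env.step returning both players' observations and a shared reward), the network sizes, and the REINFORCE stand-in for the deep RL step are all assumptions.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    STATE_DIM, N_ACTIONS = 32, 6  # assumed observation/action sizes

    def make_policy():
        return nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, N_ACTIONS),
        )

    # Stage 1: behavior cloning -- supervised learning on human
    # (state, action) demonstrations.
    def train_human_model(states, actions, epochs=20):
        model = make_policy()
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
        for _ in range(epochs):
            opt.zero_grad()
            F.cross_entropy(model(states), actions).backward()
            opt.step()
        return model

    # Stage 2: train the agent to play well WITH the frozen cloned
    # partner (simple policy gradient as a stand-in for deep RL).
    def train_agent(env, human_model, episodes=1000):
        agent = make_policy()
        opt = torch.optim.Adam(agent.parameters(), lr=1e-3)
        for _ in range(episodes):
            log_probs, total_reward = [], 0.0
            (s_agent, s_human), done = env.reset(), False
            while not done:
                dist = torch.distributions.Categorical(logits=agent(s_agent))
                a_agent = dist.sample()
                with torch.no_grad():  # partner is fixed; only the agent learns
                    a_human = human_model(s_human).argmax()
                (s_agent, s_human), reward, done = env.step(a_agent.item(), a_human.item())
                log_probs.append(dist.log_prob(a_agent))
                total_reward += reward  # shared reward in a collaborative game
            loss = -total_reward * torch.stack(log_probs).sum()
            opt.zero_grad()
            loss.backward()
            opt.step()
        return agent

Evaluation then pairs the trained agent with a held-out “test” human model (or real humans, as in the 4/4 tweet) rather than the one it trained against.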
In competitive games, the minimax theorem allows self-play to be agnostic to its opponent: if they are suboptimal, SP will crush them even harder. That doesn’t work in collaborative games, where the partner’s suboptimal move and SP’s failure to anticipate it will hurt. (2/4) pic.twitter.com/6I6KwLOp0Z
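For reference, the guarantee the tweet appeals to, in standard two-player zero-sum notation (my gloss, not from the thread): self-play converges to a maximin policy \pi^*, whose payoff has a floor against every opponent \sigma, so opponent suboptimality can only raise it:

    u(\pi^*, \sigma) \;\ge\; \min_{\sigma'} u(\pi^*, \sigma') \;=\; \max_{\pi} \min_{\sigma'} u(\pi, \sigma')

In a common-payoff (collaborative) game both players maximize the same u and no such floor exists: a partner deviation \sigma \neq \pi_2^* can make u(\pi_1^*, \sigma) far below u(\pi_1^*, \pi_2^*), which is exactly the failure mode the thread describes.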