This is the main idea of the new AlphaGo paper: pic.twitter.com/a8QmDALslL
Ours is the era of inadequate AI alignment theory. Any other facts about this era are relatively unimportant, but sometimes I tweet about them anyway.
I thought that was how AlphaGo already worked. What on Earth were they doing before?
Various differences: 2 nets before, 1 now; MCTS more deeply integrated into net eval vs. just a parameter that blends their contributions, etc.
I'd imagine some non-obvious problem had to be solved to make the obvious simpler system work with 1 net, and that was the real key idea.
I do remember wondering why the value & policy net system was required and assuming that it factored the problem in a helpful/necessary way.
But I thought the result of that 2-net factoring was morally equivalent to this, modulo some actual quantities replaced with predicted ones.
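For readers following the thread, the contrast mentioned in the replies ("just a parameter that blends their contributions" versus a single net trained against the search) can be sketched roughly as below. This is a minimal illustration based on the published descriptions of the two systems, not code from the paper or from anyone in this thread; the function names, argument names, and the default `lam=0.5` are assumptions chosen for illustration, and the regularization term of the loss is omitted.

```python
import math


def blended_leaf_value(value_net_estimate, rollout_outcome, lam=0.5):
    """Earlier-style leaf evaluation (hypothetical helper).

    Mixes a value-network estimate with the result of a fast rollout,
    using a single mixing parameter lam in [0, 1].
    """
    return (1.0 - lam) * value_net_estimate + lam * rollout_outcome


def single_net_loss(policy_probs, value, search_probs, outcome):
    """Single-network training loss (hypothetical helper, no regularizer).

    The one network outputs both move probabilities and a value.
    It is trained so that:
      - value matches the eventual game outcome (squared error), and
      - policy_probs match the MCTS visit distribution (cross-entropy).
    """
    value_loss = (outcome - value) ** 2
    policy_loss = -sum(
        pi * math.log(p)
        for pi, p in zip(search_probs, policy_probs)
        if pi > 0  # skip zero-weight moves to avoid log(0) terms that do not matter
    )
    return value_loss + policy_loss


# Usage: a leaf scored by blending vs. a training step for the single net.
leaf = blended_leaf_value(value_net_estimate=0.2, rollout_outcome=1.0)
loss = single_net_loss(
    policy_probs=[0.5, 0.5],   # network's move probabilities
    value=0.0,                 # network's value estimate
    search_probs=[1.0, 0.0],   # normalized MCTS visit counts
    outcome=1.0,               # final game result from this player's view
)
```

In the first function the search and the networks only meet through the blend; in the second, the search output itself becomes the training target for the policy head, which is one way to read "more deeply integrated".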