Some comments on AlphaGo Zero: https://www.reddit.com/r/reinforcementlearning/comments/778vbk/mastering_the_game_of_go_without_human_knowledge/ …
-
-
RL with nonlinear function approximators isn't guaranteed to get anywhere, ever; it can simply loop in cycles of learning/forgetting, or diverge.
-
(And that is in fact what usually happens when you try to use self-play, & what Silver says happened to most of their AG pure self-players.)
New conversation -
-
-
You really ARE emulating Dreyfus these days, aren't you? Care to put your money where your mouth is and brute-force Go for a bit to show us?
-
You know, Dreyfus died a few months back. 'Always two there are... a master and an apprentice.'
New conversation -
-
-
wouldn't explain why they managed to improve performance while decreasing computing power by an order of magnitude
-
Trivial problems admit easy optimizations :)
New conversation -
-
-
what is a non-trivial problem?
-
starcraft is nontrivial imo