The state space of most games is so big that vanilla Q-learning is kind of like brute-force search. There are other approaches that converge faster: https://fluxml.ai/2019/03/05/dp-vs-rl.html Maybe the popularity of RL stems from not wanting to learn calculus properly xD
Very effective introduction, a small masterpiece really. Really well done, on par with Distill (http://distill.pub). Is there a list of the differentiable design patterns mentioned in the Coda, per chance?
Galaxy brain hot take: all software (i.e. the binary itself) is just a parameter to a Universal Turing Machine
Some hyperparameters are more comprehensible and predictable than others.
“It works on my machine”
Also, sometimes the same seed does not yield the same random number stream across library versions, language versions, or hardware changes.
Any result not reporting a confidence interval over random inits is bogus.
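For what that could look like in practice, here is a minimal sketch, not anyone's actual evaluation code. `run_experiment` is a hypothetical stand-in for one training run, the seed count is arbitrary, and the interval uses a normal approximation (with only a handful of seeds, a Student's t interval would be wider and more appropriate).

```python
import random
import statistics


def run_experiment(seed: int) -> float:
    """Hypothetical stand-in for one training run; returns a final score."""
    rng = random.Random(seed)  # local generator, so runs do not share state
    return 0.8 + rng.gauss(0.0, 0.05)


def mean_with_ci(scores, confidence=0.95):
    """Return (mean, low, high) using a normal-approximation interval."""
    n = len(scores)
    mean = statistics.fmean(scores)
    sem = statistics.stdev(scores) / n ** 0.5  # standard error of the mean
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, mean - z * sem, mean + z * sem


# One run per seed, then report the spread instead of a single number.
scores = [run_experiment(seed) for seed in range(10)]
mean, low, high = mean_with_ci(scores)
print(f"mean={mean:.3f}, 95% CI=[{low:.3f}, {high:.3f}] over {len(scores)} seeds")
```

Reporting the interval together with the number of seeds lets a reader judge whether a gap between two methods is larger than the run-to-run noise.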
There's a super book on that by Sir Martin Rees, Just Six Numbers. It's not the numbers (constants) alone; it's their relationships and the very small error bands that have to be met in order for systems with certain properties to emerge and persist. https://www.physics.purdue.edu/webapps/index.php/course_document/index/phys217H/1542/293/10748