The really annoying thing with gradient-based algos is lrate/momentum/decay tuning. Adagrad and Adadelta don't actually solve this problem.
1:37 PM - 3 Feb 2015