To be learnable, a problem has to be stable. Therefore, in a stacked ML architecture, the lower levels should evolve on a slower timescale.
Replying to @stephenroller
@stephenroller Personal experience. Try it: take a deep MLP and test 1) constant learning rates, 2) higher LR for higher levels, 3) the reverse.
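A minimal sketch of the experiment described in the tweet above, not from the thread itself: a small numpy MLP on a toy regression task (numpy chosen because the author mentions using it below), trained three times with per-layer learning rates that are constant, increasing with depth, and decreasing with depth. All function names, hyperparameters, and the toy data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    # one (W, b) pair per layer; scaled Gaussian init
    return [(rng.normal(0.0, np.sqrt(2.0 / n_in), (n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    # returns all activations; tanh hidden layers, linear output
    acts = [x]
    for i, (W, b) in enumerate(params):
        z = acts[-1] @ W + b
        acts.append(np.tanh(z) if i < len(params) - 1 else z)
    return acts

def train(params, X, y, lrs, epochs=500):
    # lrs: one learning rate per layer, lowest layer first
    for _ in range(epochs):
        acts = forward(params, X)
        delta = (acts[-1] - y) / len(X)          # grad of 0.5 * MSE at the output
        for i in reversed(range(len(params))):
            W, b = params[i]
            gW, gb = acts[i].T @ delta, delta.sum(axis=0)
            if i > 0:                            # backprop through the tanh below
                delta = (delta @ W.T) * (1.0 - acts[i] ** 2)
            params[i] = (W - lrs[i] * gW, b - lrs[i] * gb)
    return params

# toy regression problem
X = rng.normal(size=(256, 8))
y = np.sin(X.sum(axis=1, keepdims=True))

sizes = [8, 32, 32, 32, 1]   # four weight layers
base = 0.05
schemes = {
    "1) constant LR":                 [base] * 4,
    "2) higher LR for higher levels": [base * s for s in (0.25, 0.5, 1.0, 2.0)],
    "3) the reverse":                 [base * s for s in (2.0, 1.0, 0.5, 0.25)],
}
for name, lrs in schemes.items():
    p = train(init_mlp(sizes), X, y, lrs)
    mse = np.mean((forward(p, X)[-1] - y) ** 2)
    print(f"{name}: final MSE = {mse:.4f}")
```

On a toy setup like this the ranking of the three schemes isn't guaranteed to match the thread's claim; the sketch only shows how one might run the comparison.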
Replying to @stephenroller
@stephenroller Both. I use Torch7 for CNNs, Theano for low-dim problems, and currently numpy+pypy to write my own stuff (large sparse space).
Replying to @stephenroller
@stephenroller shoot me an email, let's talk about it. Twitter isn't quite the right medium for this. fchollet@sphere-engineering.com
9:38 PM - 5 Dec 2014