On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima https://arxiv.org/abs/1609.04836 - Keras code: https://github.com/keskarnitish/large-batch-training …
Replying to @fchollet
Interesting paper. Would be nice to test this on a ResNet, since their CIFAR-10 results are quite mediocre for both large and small batches.
Replying to @hardmaru
Agreed, the models are not state of the art, but I don't think this invalidates the findings. Still an interesting result.
10:45 AM - 20 Sep 2016