Snapshot ensembles - a new take on an old idea... I'd like to see a comparison with Polyak averaging. http://openreview.net/forum?id=BJYwwY9ll
I was doing it a few years back when I started out in DL, but I stopped after I learned about Polyak averaging. Works much better.
how do you do it? taking the average of all steps, only the last k steps, ...?
at inference time, use a slow exponential average of the weights taken after every previous batch. Easy to accumulate in TF/Theano
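A minimal sketch of that recipe, in plain Python rather than TF/Theano so it runs on its own. The names `ema_update` and `train_with_weight_averaging`, the toy quadratic loss, and the values of `decay` and `lr` are all assumptions for illustration, not anything from the thread. Strictly speaking, classical Polyak averaging is a uniform average of the iterates; the exponential version described above is the commonly used variant.

```python
import random


def ema_update(avg_weights, weights, decay=0.999):
    """Blend the current weights into the running exponential average."""
    return [decay * a + (1.0 - decay) * w for a, w in zip(avg_weights, weights)]


def train_with_weight_averaging(steps=2000, lr=0.1, decay=0.99, seed=0):
    """Noisy SGD on a toy quadratic, tracking an averaged copy of the weights.

    Hypothetical setup: the loss is 0.5 * ||w - target||^2 and each gradient
    is corrupted with Gaussian noise to mimic minibatch noise.
    """
    rng = random.Random(seed)
    target = [1.0, -2.0]
    weights = [0.0, 0.0]
    avg_weights = list(weights)  # averaged copy, used only at inference time

    for _ in range(steps):
        # Noisy gradient of the toy loss.
        grads = [(w - t) + rng.gauss(0.0, 1.0) for w, t in zip(weights, target)]
        # Ordinary SGD step on the raw weights.
        weights = [w - lr * g for w, g in zip(weights, grads)]
        # Update the average after every batch; training never reads it back.
        avg_weights = ema_update(avg_weights, weights, decay)

    return weights, avg_weights


if __name__ == "__main__":
    raw, averaged = train_with_weight_averaging()
    print("raw weights:     ", raw)
    print("averaged weights:", averaged)
```

The raw weights keep jittering around the optimum because of gradient noise, while the averaged copy smooths that jitter out. A decay closer to 1 gives a slower, smoother average but takes longer to forget the initial weights.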