I am -loving- this combined Rectified Adam (RAdam) + LookAhead optimizer. It's freakishly stable and powerful, especially for my purposes. Only weird thing to keep in mind: I've had to multiply my learning rates by over 100x to take maximal advantage. https://medium.com/@lessw/new-deep-learning-optimizer-ranger-synergistic-combination-of-radam-lookahead-for-the-best-of-2dc83f79a48d
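For context, here is a minimal sketch of what the RAdam + LookAhead ("Ranger") combination does, assuming a recent PyTorch (torch.optim.RAdam ships with PyTorch >= 1.10). The Lookahead wrapper below, and the k=5, alpha=0.5, and lr=1e-1 values, are illustrative stand-ins, not the thread author's actual code:

```python
import torch

class Lookahead:
    """Minimal Lookahead wrapper (Zhang et al., 2019): the inner 'fast'
    optimizer takes k steps, then the 'slow' weights move toward the fast
    weights by a factor alpha, and the fast weights are reset to them."""

    def __init__(self, optimizer, k=5, alpha=0.5):
        self.optimizer = optimizer
        self.k = k
        self.alpha = alpha
        self._steps = 0
        # Snapshot of the slow weights, one clone per parameter.
        self._slow = [
            [p.detach().clone() for p in group["params"]]
            for group in optimizer.param_groups
        ]

    def zero_grad(self):
        self.optimizer.zero_grad()

    @torch.no_grad()
    def step(self):
        self.optimizer.step()  # one fast step (here: RAdam)
        self._steps += 1
        if self._steps % self.k == 0:
            for group, slow_group in zip(self.optimizer.param_groups, self._slow):
                for fast, slow in zip(group["params"], slow_group):
                    slow += self.alpha * (fast.detach() - slow)  # move slow weights
                    fast.copy_(slow)                             # reset fast weights

# Hypothetical usage; note the much larger LR than a typical Adam default.
model = torch.nn.Linear(10, 1)
base = torch.optim.RAdam(model.parameters(), lr=1e-1)  # ~100x a common 1e-3
opt = Lookahead(base, k=5, alpha=0.5)
```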
Replying to @citnaj
I tried it on a non-image problem and it performed worse. I didn't play with the learning rate though; maybe I'll try increasing it and test again.
Replying to @taniajacob
Yeah, you may be able to increase the learning rate quite a bit. In fact I had to, because otherwise the loss was strangely stagnant. Not sure exactly why.
Replying to @citnaj @taniajacob
It seems worth playing with LR schedules other than OneCycle to unleash the full potential of these new optimizers. https://forums.fast.ai/t/imagenette-woof-leaderboards-guidelines-for-proving-new-high-scores/52714/19
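One alternative to OneCycle that this suggests is holding the LR flat and then annealing it, e.g. with a cosine decay. A sketch using PyTorch's built-in schedulers (SequentialLR is available in PyTorch >= 1.11); the 70% flat fraction, step counts, and LR are illustrative assumptions, not values from the thread:

```python
import torch
from torch.optim.lr_scheduler import ConstantLR, CosineAnnealingLR, SequentialLR

model = torch.nn.Linear(10, 1)
opt = torch.optim.RAdam(model.parameters(), lr=1e-1)  # illustrative LR

total_steps = 1000
flat_steps = int(0.7 * total_steps)  # hold the LR flat for the first 70%

# Flat phase (factor=1.0 leaves the LR unchanged), then cosine decay to ~0.
sched = SequentialLR(
    opt,
    schedulers=[
        ConstantLR(opt, factor=1.0, total_iters=flat_steps),
        CosineAnnealingLR(opt, T_max=total_steps - flat_steps),
    ],
    milestones=[flat_steps],
)

x, y = torch.randn(32, 10), torch.randn(32, 1)  # dummy batch
for _ in range(total_steps):
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(model(x), y)
    loss.backward()
    opt.step()
    sched.step()  # one scheduler step per optimizer step
```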
Oh man, just saw your code for this. Thank you! https://github.com/mgrankin/over9000