Stating the obvious: a lot of current deep learning tricks are overfit to the validation sets of well-known benchmarks, including CIFAR10. It's nice to see this quantified. This has been a problem with ImageNet since at least 2015. https://arxiv.org/abs/1806.00451
If you're doing a Kaggle competition and you're evaluating your models and ideas against a fixed validation split of the training data (plus the public leaderboard), you will consistently underperform on the private leaderboard. The same is true in research at large.
Here is a very simple recommendation to help you overcome this: use a higher-entropy validation process, such as k-fold validation, or even better, iterated k-fold validation with shuffling. Only check your results on the official validation set at the very end.
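A minimal sketch of what this could look like, assuming a scikit-learn-compatible estimator and numpy arrays; the data and model here are placeholders, and RepeatedKFold stands in for iterated k-fold validation with shuffling:

```python
# Iterated (repeated) k-fold validation with shuffling, using scikit-learn.
# X, y and the LogisticRegression model below are placeholders for illustration.
import numpy as np
from sklearn.model_selection import RepeatedKFold
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.RandomState(0)
X = rng.randn(500, 20)           # placeholder features
y = (X[:, 0] > 0).astype(int)    # placeholder labels

# 5 folds, repeated 3 times; each repetition reshuffles the data differently.
cv = RepeatedKFold(n_splits=5, n_repeats=3, random_state=0)

scores = []
for train_idx, val_idx in cv.split(X):
    model = LogisticRegression(max_iter=1000)
    model.fit(X[train_idx], y[train_idx])
    scores.append(accuracy_score(y[val_idx], model.predict(X[val_idx])))

# Use the mean and spread across all folds/repetitions to compare ideas;
# only consult the official validation set once, at the very end.
print(f"CV accuracy: {np.mean(scores):.3f} +/- {np.std(scores):.3f}")
```

Comparing ideas on the mean score across many resampled folds makes it harder to overfit to any single split, at the cost of training several models per idea.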
Yes, it's more expensive, but that cost itself is a regularization factor: it will force you to try fewer ideas instead of throwing spaghetti at the wall and seeing what sticks.
Sounds awfully like complexity science to me. Maybe deep learning and AI would be better considered a social science instead?