If you're in a high-dimensional encoding space, this curve is, of course, a high-dimensional curve. But that's because it needs to deal with the encoding space, not because the problem is intrinsically high-dimensional (as mentioned earlier).
-
This is why *data augmentation techniques* like exposing a model to variations in image brightness or rotation angle are an extremely effective way to improve test-time performance. Data augmentation is all about densifying your latent-space coverage (by leveraging visual priors).
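For illustration, here's a minimal Keras sketch of that kind of augmentation (not from the original thread): it assumes a recent TensorFlow/Keras where the RandomRotation and RandomBrightness preprocessing layers are available, and the specific factors are arbitrary choices.

```python
import tensorflow as tf

# Augmentation pipeline: applied only during training, identity at inference.
augment = tf.keras.Sequential([
    tf.keras.layers.RandomRotation(0.1),    # rotate by up to +/- 10% of a full turn
    tf.keras.layers.RandomBrightness(0.2),  # shift brightness by up to +/- 20% of the value range
])

# A toy classifier with augmentation baked into the model (hypothetical setup).
model = tf.keras.Sequential([
    augment,
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
```

Each augmented image is a nearby point in input space that the model must map to the same label, which is exactly the "densify your coverage with visual priors" idea.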
-
In conclusion: the only things you'll find in a DL model are what you put into it: the priors encoded in its architecture and the data it was trained on. DL models are not magic. They're big curves that fit their training samples, with some constraints on their structure.
-
Thanks @fchollet for this amazing thread. The point on augmentation makes perfect sense, but I wanted to ask: how do noisy inputs also lead to better generalization (see for example SpecAugment, https://arxiv.org/abs/1904.08779)?
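For context on the SpecAugment reference: it can be read as the same densification idea applied to spectrograms, by masking random blocks of frequency channels and time steps (the paper's time-warping step is omitted here). A minimal NumPy sketch, with arbitrary mask sizes assumed rather than the paper's settings:

```python
import numpy as np

def spec_augment(spec, max_freq_mask=8, max_time_mask=16, rng=None):
    """Zero out one random band of frequency channels and one random span of
    time frames in a (freq, time) spectrogram, SpecAugment-style.
    Mask sizes are illustrative assumptions, not the paper's values."""
    rng = rng or np.random.default_rng()
    spec = spec.copy()
    n_freq, n_time = spec.shape

    # Frequency mask: f consecutive channels starting at f0.
    f = rng.integers(0, min(max_freq_mask, n_freq) + 1)
    f0 = rng.integers(0, n_freq - f + 1)
    spec[f0:f0 + f, :] = 0.0

    # Time mask: t consecutive frames starting at t0.
    t = rng.integers(0, min(max_time_mask, n_time) + 1)
    t0 = rng.integers(0, n_time - t + 1)
    spec[:, t0:t0 + t] = 0.0
    return spec
```

Each masked spectrogram is another nearby training point with the same label, so the noise plays the same coverage-densifying role as brightness or rotation jitter does for images.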
-
How about adding information about the data-generating process? Could you build a better classifier with less data if you somehow took into account that (for example) those digits are written with a small number of strokes, without lifting the pen, rather than as a bunch of pixels?