Once you have such a curve, you can walk along it to make sense of *samples you've never seen before* (interpolated from samples you have seen). This is how a GAN can generate faces that weren't in the training data, or how an MNIST classifier can recognize new digits.
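To make "walking on the curve" concrete, here's a minimal sketch of linear interpolation in a latent space. The `generator` here is a hypothetical trained decoder (e.g., a GAN generator), not something from the thread:

```python
import numpy as np

def interpolate_latents(z_start, z_end, n_steps=10):
    """Linearly interpolate between two latent vectors."""
    alphas = np.linspace(0.0, 1.0, n_steps)
    return np.stack([(1 - a) * z_start + a * z_end for a in alphas])

# Hypothetical usage, assuming a trained generator mapping latents to images:
# z_start, z_end = np.random.normal(size=(2, 128))
# frames = generator.predict(interpolate_latents(z_start, z_end))
# Each intermediate frame is a plausible sample the model never saw in training.
```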
-
If you're in a high-dimensional encoding space, this curve is, of course, a high-dimensional curve. But that's because it needs to deal with the encoding space, not because the problem is intrinsically high-dimensional (as mentioned earlier).
-
Now, how do you learn such a curve? That's where deep learning comes in.
-
But by this point this thread is LONG and the Keras team sync starts in 30s, so I refer you to DLwP, chapter 5, for how DL models and gradient descent are an awesome way to achieve generalization via interpolation on the latent manifold: https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff
-
I'm back, just wanted to add one important note to conclude the thread: deep learning models are basically big curves, fitted via gradient descent, that approximate the latent manifold of a dataset. The *quality of this approximation* determines how well the model will generalize.
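As an illustration of "a big curve fitted via gradient descent" (my toy example, not from the thread), here is a minimal Keras sketch that bends a small MLP, itself a parametric curve, toward noisy samples of a smooth underlying function:

```python
import numpy as np
from tensorflow import keras

# Toy 1-D "manifold": noisy samples from a smooth underlying curve.
x = np.linspace(-1, 1, 256).reshape(-1, 1)
y = np.sin(3 * x) + 0.1 * np.random.normal(size=x.shape)

# Gradient descent bends the model's curve toward the data,
# i.e. toward the underlying manifold.
model = keras.Sequential([
    keras.layers.Dense(32, activation="relu", input_shape=(1,)),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(x, y, epochs=200, verbose=0)

# Inputs between training samples now map to sensible outputs:
# the model generalizes by interpolating along its fitted curve.
```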
-
The ideal model would literally just encode the latent space -- it would be able to perfectly generalize to *any* new sample. An imperfect model will deviate from the latent space in places, leading to possible errors.
-
Being able to fit a curve that approximates the latent space relies critically on two factors:
1. The structure of the latent space itself! (a property of the data, not of your model)
2. The availability of a "sufficiently dense" sampling of the latent manifold, i.e. enough data
-
You *cannot* generalize in this way to a problem where the manifold hypothesis does not apply (i.e. a truly discrete problem, like finding prime numbers).
-
In this case, there is no latent manifold to fit, which means that your curve (i.e. deep learning model) will simply memorize the data -- interpolated points on the curve will be meaningless. Your model will be a very inefficient hash table that embeds your discrete space.
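To see why, here's a hedged sketch (my example, not from the thread) of trying to learn primality from binary digits. With no continuous structure to interpolate along, you'd expect the fitted model to memorize the training integers rather than learn primality:

```python
import numpy as np
from tensorflow import keras

def is_prime(n):
    if n < 2:
        return 0
    return int(all(n % d for d in range(2, int(n ** 0.5) + 1)))

# Encode integers as 16-bit binary vectors; label by primality.
nums = np.arange(2, 2 ** 14)
X = ((nums[:, None] >> np.arange(16)) & 1).astype("float32")
y = np.array([is_prime(n) for n in nums], dtype="float32")

# Random train/test split over the integers.
idx = np.random.permutation(len(nums))
split = int(0.8 * len(nums))
train, test = idx[:split], idx[split:]

model = keras.Sequential([
    keras.layers.Dense(128, activation="relu", input_shape=(16,)),
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X[train], y[train], epochs=20, verbose=0)

# Expectation: the model can pick up shallow cues (e.g. the last bit,
# which flags even numbers), but there is no manifold to interpolate
# along, so on held-out integers it behaves like a lossy lookup table.
model.evaluate(X[test], y[test])
```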
-
The second point -- training data density -- is equally important. You will naturally only be able to train on a very sparse sampling *of the encoding space*, but you need to *densely cover the latent space*.