Usual reminder: when I've been saying for the past 5+ years that deep learning is interpolative, I don't mean it does linear interpolation in the original encoding space (which would be useless). It does interpolation on a low-dimensional manifold embedded in the encoding space.
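To make the distinction concrete, here is a minimal toy sketch (not from the thread): a 1-D "manifold" (a circle) embedded in 2-D encoding space, with a hypothetical `decode` function playing the role of the learned mapping from latent coordinates to the encoding space. Linear interpolation of the latents stays on the manifold; linear interpolation of the raw encodings does not.

```python
import numpy as np

# Toy illustration: a 1-D manifold (the unit circle) embedded in 2-D space.
# decode() stands in for a learned mapping from latent coordinate to encoding.
def decode(z):
    return np.array([np.cos(z), np.sin(z)])

z_a, z_b = 0.0, np.pi / 2            # latent codes of two "training" points
x_a, x_b = decode(z_a), decode(z_b)  # their positions in encoding space

# Interpolating in the original encoding space falls off the manifold:
x_mid_encoding = 0.5 * (x_a + x_b)        # norm ~0.707, inside the circle

# Interpolating on the latent manifold stays on it:
x_mid_latent = decode(0.5 * (z_a + z_b))  # norm 1.0, on the circle

print(np.linalg.norm(x_mid_encoding))
print(np.linalg.norm(x_mid_latent))
```

The gap between the two norms is the whole point: the useless kind of interpolation averages encodings; the useful kind moves along the low-dimensional structure the model has learned.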
Replying to @fchollet
Ethan Caballero Retweeted Joscha Bach
What's your explanation for the compositional generalization that DALL·E does? E.g. there don't seem to be any baby lemurs in pajamas with calculators in the training data, yet DALL·E generates them perfectly: https://twitter.com/Plinz/status/1347280122202624001
Replying to @ethancaballero
That is exactly what manifold interpolation is. E.g. producing a spork by interpolating between a spoon and a fork on a manifold of kitchenware. Interpolation obviously does generalize to items not present in the training set -- DL would be useless otherwise.
Replying to @fchollet @ethancaballero
The above is basically a fancy version of the "smile vector" or "sunglasses vector" people were demonstrating in 2016. It's interpolation on a visual manifold that was densely sampled during training. It's actually a fantastic demo of what manifold interpolation means!
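The "smile vector" trick mentioned above is simple latent arithmetic: take the mean latent code of smiling faces, subtract the mean of neutral ones, and add the difference to a new face's latent before decoding. A hedged sketch with toy stand-in latents (all names and shapes here are illustrative, not from any specific model):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for latent codes an encoder would produce.
# "Smiling" latents are shifted along dimension 0 to simulate the attribute.
z_smiling = rng.normal(size=(100, 64)) + np.eye(64)[0]
z_neutral = rng.normal(size=(100, 64))

# The "smile vector": difference of class means in latent space.
v_smile = z_smiling.mean(axis=0) - z_neutral.mean(axis=0)

# Push a new face's latent along the smile direction (decoder omitted;
# in a real model you would decode z_edited back into an image).
z_face = rng.normal(size=64)
z_edited = z_face + 1.0 * v_smile
```

The recovered vector concentrates on the dimension that actually separates the two groups, which is why adding it moves a sample along that attribute and little else.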
The easy thing you could do to demonstrate manifoldness here is to sample a grid of generated pictures that densely covers the spectrum "character type" × "object held". It would look awesome and I'd love to see it.
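The proposed grid amounts to bilinear interpolation between four corner latents, one per (character type, object held) combination. A minimal sketch with hypothetical random corner latents; in practice the corners would come from the model's encoder, and each grid cell would be decoded into an image:

```python
import numpy as np

# Hypothetical corner latents for the four extreme (character, object) combos.
rng = np.random.default_rng(0)
z00, z01, z10, z11 = rng.normal(size=(4, 64))

n = 5  # a 5x5 grid of samples
grid = np.empty((n, n, 64))
for i, s in enumerate(np.linspace(0, 1, n)):      # "character type" axis
    for j, t in enumerate(np.linspace(0, 1, n)):  # "object held" axis
        # Bilinear interpolation between the four corners in latent space.
        grid[i, j] = ((1 - s) * (1 - t) * z00 + (1 - s) * t * z01
                      + s * (1 - t) * z10 + s * t * z11)

# Each grid[i, j] would then be decoded into one picture of the grid.
```

If the manifold claim holds, every decoded cell should look like a coherent sample, with the two attributes varying smoothly across the grid.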