One interesting thing about the ARC competition is that it highlights how people who use deep learning often have little idea of what deep learning actually does, or when they should and shouldn't be using it
-
DL is applicable when you're doing *pattern recognition*: when you have data that lies on a smooth manifold, along which samples can be interpolated. And you're going to need a dense sampling of your manifold as training data in order to fit a parametric approximation of it
-
Generalization in deep learning is interpolation along a latent manifold (or rather a learned approximation of it). It has little to do with your model itself and everything to do with the natural organization of your data
-
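A minimal numpy sketch of the interpolation picture (the 1-D "manifold", the sample sizes, and the use of plain linear interpolation are illustrative assumptions, not anything stated in the thread): generalization to held-out points works only because they sit between training samples on the same smooth curve, and it degrades as the sampling gets sparser.

```python
import numpy as np

rng = np.random.default_rng(0)

def manifold(x):
    # A smooth 1-D "manifold": any continuous function works for the illustration.
    return np.sin(3 * x) + 0.3 * x

x_test = rng.uniform(0, 5, 1000)          # held-out points on the same manifold
y_test = manifold(x_test)

for n_train in (10, 1000):                # sparse vs. dense sampling of the manifold
    x_train = np.sort(rng.uniform(0, 5, n_train))
    y_train = manifold(x_train)
    # "Generalization" here is literally interpolation between training samples.
    y_pred = np.interp(x_test, x_train, y_train)
    print(n_train, "training samples -> mean abs error:", np.abs(y_pred - y_test).mean())
```
-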
Differentiability & minibatch SGD are the strengths of DL: besides making the learning practically tractable, the smoothness & continuity of the function & the incrementality of its fitting work great for learning to approximate a latent manifold. But its strengths are also its limits
-
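A toy sketch of that incremental fitting, under assumed settings (a one-hidden-layer tanh network, plain numpy, an arbitrary smooth target): every operation is differentiable, so each minibatch nudges the parameters a little further toward the manifold.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, (2000, 1))
y = np.sin(x)                              # smooth target: ideal territory for DL

# One hidden layer with tanh: a small, continuous, differentiable function.
W1 = rng.normal(0, 0.5, (1, 32)); b1 = np.zeros(32)
W2 = rng.normal(0, 0.5, (32, 1)); b2 = np.zeros(1)
lr = 0.05

for step in range(2000):
    idx = rng.integers(0, len(x), 64)      # minibatch
    xb, yb = x[idx], y[idx]
    h = np.tanh(xb @ W1 + b1)              # forward pass
    pred = h @ W2 + b2
    err = pred - yb                        # d(MSE)/d(pred), up to a constant
    # Backward pass: gradients flow because every operation is differentiable.
    gW2 = h.T @ err / len(xb); gb2 = err.mean(0)
    dh = (err @ W2.T) * (1 - h ** 2)
    gW1 = xb.T @ dh / len(xb); gb1 = dh.mean(0)
    # Incremental update: each minibatch nudges the parameters slightly.
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

h = np.tanh(x @ W1 + b1)
print("final MSE:", ((h @ W2 + b2 - y) ** 2).mean())
```
-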
The whole setup breaks down when you are no longer doing pattern recognition -- when you no longer have a latent manifold (any kind of discrete problem) or no longer have a dense sampling of it. Or when your manifold changes over time.
-
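One way to see the breakdown, sketched under made-up settings: take the crudest "interpolate from nearby samples" predictor (nearest neighbor) on 10-bit inputs. With a sparse training sample it does fine on a target that varies slowly across the input space, and badly on a purely discrete target (parity), where flipping a single bit flips the label and neighbors carry no useful signal.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)

# Every 10-bit input; train on a sparse random subset, hold out the rest.
X = np.array(list(product([0, 1], repeat=10)), dtype=float)
perm = rng.permutation(len(X))
train, test = perm[:200], perm[200:]

def nn_accuracy(y):
    # Predict each held-out point from its nearest training neighbor:
    # the crudest possible "interpolate from nearby samples" strategy.
    d = ((X[test, None, :] - X[None, train, :]) ** 2).sum(-1)
    pred = y[train][d.argmin(1)]
    return (pred == y[test]).mean()

smooth = (X.sum(1) > 5).astype(int)   # label varies slowly across the input space
parity = (X.sum(1) % 2).astype(int)   # label flips with every single bit: no smooth manifold

print("smooth target:", nn_accuracy(smooth))   # high: neighbors are informative
print("parity target:", nn_accuracy(parity))   # poor: the nearest neighbor usually differs by one bit, which flips the label
```
-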
Replying to @fchollet
The games of go and chess are highly discrete. One slightly wrong move and the game is lost. Still, AlphaZero does very well. Even move proposals from a ResNet (without search) beat very good amateur go players.
-
Replying to @ChrSzegedy
These games involve a mixture of pattern recognition (what a player would call 'intuition') and explicit reasoning. It's not all or nothing. The better you are at pattern recognition, the less you need to rely on reasoning, and vice versa.
-
Replying to @fchollet @ChrSzegedy
AlphaZero illustrates this trade-off well: the better the convnet, the less the need to rely on MCTS. It also really shows how explicit reasoning (e.g. MCTS) enables much greater experience-efficiency in achieving high skill (i.e. greater intelligence): (cont)
-
Replying to @fchollet @ChrSzegedy
playing purely based on pattern recognition (intuition) requires an insane amount of training data (a dense sampling of the manifold). That's not how humans play: for best efficiency, we rely on an interconnected *mix* of intuition and explicit reasoning & planning.
-
Replying to @fchollet
The amount of training data is irrelevant. The fact that AlphaZero captures the discrete landscape of chess and go better than (most) humans demonstrates that DL does not require a continuous domain to work well.
-
No, it's a highly structured space, which can therefore be embedded in a continuous manifold if you can sample enough games (which is a ridiculously large number of games). This is true for virtually any task, as I was saying earlier.
-
Replying to @fchollet
Being structured does not imply anything like that. Also nothing can be intelligent in an environment without structure.
-
Replying to @ChrSzegedy
No, it does. I could give you any number of examples. You can always fit a continuous function to a set of discrete points (split between training & validation), but the more structure they have, the more likely the continuous function is to intercept data points that are not in your training set.
-
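A rough numpy sketch of that last claim (the "structured" vs "unstructured" label rules below are invented purely for illustration): fit a continuous function, here a polynomial, to a training split of discrete points, then count how many held-out points the curve passes through.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.arange(40, dtype=float)
perm = rng.permutation(len(x))
train, val = perm[:20], perm[20:]

def hit_rate(y, degree=3):
    # Fit a continuous function (a polynomial) to the training points only,
    # then count how many *held-out* discrete points it passes through.
    coeffs = np.polyfit(x[train], y[train], degree)
    pred = np.polyval(coeffs, x[val])
    return (np.abs(pred - y[val]) < 0.5).mean()

structured = x ** 2                                           # labels generated by an underlying rule
unstructured = rng.integers(0, 1600, len(x)).astype(float)    # no rule at all

print("structured points hit:", hit_rate(structured))     # close to 1.0
print("unstructured points hit:", hit_rate(unstructured)) # close to 0.0
```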