DL is applicable when you're doing *pattern recognition*: when you have data that lies on a smooth manifold, along which samples can be interpolated. And you're going to need a dense sampling of your manifold as training data in order to fit a parametric approximation of it.
-
Generalization in deep learning is interpolation along a latent manifold (or rather a learned approximation of it). It has little to do with your model itself and everything to do with the natural organization of your data.
-
Differentiability & minibatch SGD are the strengths of DL: besides making the learning practically tractable, the smoothness & continuity of the function & the incrementality of its fitting work great for learning to approximate the latent manifold. But its strengths are also its limits.
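For concreteness, here is a minimal sketch (not from the thread; the architecture and hyperparameters are illustrative) of the ingredients named above: a differentiable model, fitted incrementally by minibatch SGD to a dense sampling of a smooth curve, then queried between its training points.

```python
# Minimal sketch (not from the thread): fit a tiny differentiable model by
# minibatch SGD to a densely sampled smooth curve. Every quantity is
# continuous, so each small gradient step nudges the whole function
# toward the data.
import numpy as np

rng = np.random.default_rng(0)

# Dense sampling of a smooth target (the "latent manifold" here is 1-D).
X = np.linspace(-3.0, 3.0, 2000).reshape(-1, 1)
Y = np.sin(X)

# One-hidden-layer MLP: pred = tanh(X W1 + b1) W2 + b2
H = 64
W1 = rng.normal(0, 0.5, (1, H)); b1 = np.zeros(H)
W2 = rng.normal(0, 0.5, (H, 1)); b2 = 0.0

lr, batch, steps = 0.05, 32, 20000
for step in range(steps):
    idx = rng.integers(0, len(X), batch)           # a random minibatch
    x, y = X[idx], Y[idx]
    z = x @ W1 + b1                                # forward pass
    a = np.tanh(z)
    pred = a @ W2 + b2
    d_pred = 2.0 * (pred - y) / batch              # d(MSE)/d(pred)
    dW2 = a.T @ d_pred; db2 = d_pred.sum()         # backprop through the net
    dz = (d_pred @ W2.T) * (1.0 - a ** 2)
    dW1 = x.T @ dz; db1 = dz.sum(axis=0)
    W1 -= lr * dW1; b1 -= lr * db1                 # one small, incremental step
    W2 -= lr * dW2; b2 -= lr * db2

# Query points that were never in the training set: the smooth fitted
# function interpolates between nearby training samples.
x_test = np.array([[0.1234], [1.987], [-2.5555]])
print(np.tanh(x_test @ W1 + b1) @ W2 + b2)         # close to np.sin(x_test)
```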
-
The whole setup breaks down when you are no longer doing pattern recognition -- when you no longer have a latent manifold (any kind of discrete problem) or no longer have a dense sampling of it. Or when your manifold changes over time.
-
Replying to @fchollet
The games of Go and chess are highly discrete. One slightly wrong move and the game is lost. Still, AlphaZero does very well. Even move proposals from a ResNet (without search) beat very good amateur Go players.
-
Replying to @ChrSzegedy
These games involve a mixture of pattern recognition (what a player would call 'intuition') and explicit reasoning. It's not all or nothing. The better you are at pattern recognition, the less you need to rely on reasoning, and inversely.
-
Replying to @fchollet @ChrSzegedy
AlphaZero illustrates this trade-off well: the better the convnet, the less the need to rely on MCTS. It also really shows how explicit reasoning (e.g. MCTS) enables much greater experience-efficiency in achieving high skill (i.e. greater intelligence): (cont)
-
Replying to @fchollet @ChrSzegedy
playing purely based on pattern recognition (intuition) requires an insane amount of training data (a dense sampling of the manifold). That's not how humans play: for best efficiency, we rely on an interconnected *mix* of intuition and explicit reasoning & planning.
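As a concrete illustration of that trade-off, here is a small sketch of the PUCT selection rule used in AlphaZero-style MCTS (not from the thread; the numbers and move names below are made up): the convnet's prior plays the role of intuition, the search statistics Q and visit counts play the role of reasoning, and the sharper the prior, the fewer simulations are needed before the choice stabilizes.

```python
import math

def puct_score(q, prior, parent_visits, child_visits, c_puct=1.5):
    # AlphaZero-style PUCT: the search value Q ("reasoning") plus an
    # exploration bonus proportional to the network's prior ("intuition"),
    # which decays as the move accumulates visits.
    return q + c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)

def select_move(stats, c_puct=1.5):
    # stats: move -> (q, prior, visits). A strong prior dominates when there
    # has been little search; with more simulations, the Q estimates take over.
    parent_visits = sum(visits for _, _, visits in stats.values()) or 1
    def score(move):
        q, prior, visits = stats[move]
        return puct_score(q, prior, parent_visits, visits, c_puct)
    return max(stats, key=score)

# Toy illustration with made-up numbers: with almost no search done, the move
# the network prefers wins; once visit counts and Q estimates accumulate,
# search evidence can override the prior.
early = {"e4": (0.00, 0.6, 1),   "d4": (0.00, 0.3, 1),   "a3": (0.00, 0.1, 1)}
late  = {"e4": (0.40, 0.6, 300), "d4": (0.55, 0.3, 250), "a3": (0.05, 0.1, 20)}
print(select_move(early))  # little search: follows the prior -> "e4"
print(select_move(late))   # plenty of search: Q evidence wins -> "d4"
```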
-
Replying to @fchollet @ChrSzegedy
It's also not the case that some games/tasks are inherently discrete or inherently interpolative. All problems can be solved with pattern recognition if the problem is stable and you have infinite data. Inversely, image classification is pattern recognition, but in a single-shot setting it becomes a reasoning problem.
-
Replying to @fchollet @ChrSzegedy
If I give you a natural language description of the rules of chess and a couple of example games, then chess is a reasoning problem. If I give you 100,000,000,000 example games, it's a pattern recognition problem (or at least it can be treated as one).
-
It's no different from the fact that you can approximate an arbitrary discrete function with a continuous one. This applies to ARC as well. But doing this curve fitting requires a very precise description of what you are trying to approximate (a dense sampling).
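A minimal sketch of that point (not from the thread; the model, the helper name fit_staircase, and the sampling sizes are arbitrary): fitting a continuous model to a discrete-valued staircase function works when the sampling is dense, and degrades when it is sparse, because nothing constrains the curve between the few observed points.

```python
# Minimal sketch (not from the thread): approximate a discrete-valued
# function (an integer staircase) with a continuous model, with a dense
# vs. a sparse sampling of its domain.
import numpy as np
from tensorflow import keras

def fit_staircase(n_samples):
    # n_samples points of the staircase f(x) = floor(x) on [0, 10].
    x = np.linspace(0.0, 10.0, n_samples).reshape(-1, 1)
    y = np.floor(x)                      # discrete-valued target
    model = keras.Sequential([
        keras.Input(shape=(1,)),
        keras.layers.Dense(128, activation="relu"),
        keras.layers.Dense(128, activation="relu"),
        keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")
    model.fit(x, y, epochs=400, batch_size=32, verbose=0)
    # Evaluate on a fine grid the model never saw.
    x_test = np.linspace(0.0, 10.0, 5000).reshape(-1, 1)
    return np.abs(model.predict(x_test, verbose=0) - np.floor(x_test)).mean()

print(fit_staircase(2000))  # dense sampling: the fit tracks the steps closely
print(fit_staircase(15))    # sparse sampling: the steps are largely missed,
                            # and the error is noticeably larger
```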
-
Replying to @fchollet @ChrSzegedy
The very *point* of cognition is to be able to operate given an insufficient description of the operation space. That's what intelligence is. Intelligence is not curve fitting.
-
Replying to @fchollet
You were not talking about *cognition* but about the constraints on the type of functions DNNs can capture. I don't think those constraints are backed by facts.