Differentiability & minibatch SGD are the strengths of DL: besides making learning practically tractable, the smoothness & continuity of the function & the incrementality of its fitting work great for learning to approximate a latent manifold. But its strengths are also its limits.
The whole setup breaks down when you are no longer doing pattern recognition -- when you no longer have a latent manifold (any kind of discrete problem) or no longer have a dense sampling of it. Or when your manifold changes over time.
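To make the first point concrete, here is a minimal sketch (NumPy only; the toy sine curve, network size, and hyperparameters are illustrative assumptions, not anything from the thread) of minibatch SGD incrementally fitting a smooth, continuous function to a dense sampling of a 1-D latent curve:

```python
# Minimal sketch: minibatch SGD fitting a smooth function to a dense
# sampling of a 1-D "manifold" (a sine curve). Toy setup, assumptions only.
import numpy as np

rng = np.random.default_rng(0)

# Dense sampling of a smooth latent curve
X = rng.uniform(-np.pi, np.pi, size=(1024, 1))
y = np.sin(X)

# Tiny one-hidden-layer MLP with tanh (smooth, differentiable everywhere)
W1 = rng.normal(0, 0.5, (1, 32)); b1 = np.zeros(32)
W2 = rng.normal(0, 0.5, (32, 1)); b2 = np.zeros(1)
lr, batch = 0.05, 32

for step in range(2000):
    idx = rng.integers(0, len(X), batch)     # random minibatch
    x, t = X[idx], y[idx]
    h = np.tanh(x @ W1 + b1)                 # forward pass
    pred = h @ W2 + b2
    err = pred - t                           # gradient of MSE/2 w.r.t. pred
    # Backprop: gradients flow because every op is differentiable
    gW2 = h.T @ err / batch; gb2 = err.mean(0)
    dh = (err @ W2.T) * (1 - h**2)
    gW1 = x.T @ dh / batch;  gb1 = dh.mean(0)
    # Incremental update: each step nudges the function slightly
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

# The fitted function smoothly interpolates between training samples
x_test = np.linspace(-np.pi, np.pi, 9).reshape(-1, 1)
print(np.c_[np.sin(x_test), np.tanh(x_test @ W1 + b1) @ W2 + b2])
```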
Replying to @fchollet
The games of Go and chess are highly discrete. One slightly wrong move and the game is lost. Still, AlphaZero does very well. Even move proposals from a ResNet (without search) beat very good amateur Go players.
Replying to @ChrSzegedy
These games involve a mixture of pattern recognition (what a player would call 'intuition') and explicit reasoning. It's not all or nothing. The better you are at pattern recognition, the less you need to rely on reasoning, and vice versa.
Replying to @fchollet @ChrSzegedy
AlphaZero illustrates this trade-off well: the better the convnet, the less the need to rely on MCTS. It also really shows how explicit reasoning (e.g. MCTS) enables much greater experience-efficiency in achieving high skill (i.e. greater intelligence): (cont)
Replying to @fchollet @ChrSzegedy
playing purely based on pattern recognition (intuition) requires an insane amount of training data (a dense sampling of the manifold). That's not how humans play: for best efficiency, we rely on an interconnected *mix* of intuition and explicit reasoning & planning.
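(To ground this trade-off: AlphaZero's search selects moves by combining value estimates from simulations with the network's policy prior via the PUCT rule. Below is a minimal sketch of that selection step; the node representation and toy numbers are assumptions for illustration, but the formula follows the AlphaZero paper. A stronger, more confident prior concentrates the search on good moves, so fewer simulations are needed.)

```python
# Minimal sketch of AlphaZero-style PUCT action selection, assuming a node
# stores visit counts N, mean action values Q, and network policy priors P.
import math

def select_action(N, Q, P, c_puct=1.5):
    """Pick the action maximizing Q(s,a) + U(s,a), where
    U(s,a) = c_puct * P(s,a) * sqrt(sum_b N(s,b)) / (1 + N(s,a))."""
    total = sum(N.values())
    def score(a):
        u = c_puct * P[a] * math.sqrt(total) / (1 + N[a])
        return Q[a] + u
    return max(P, key=score)

# Toy example: equal value estimates so far, but the policy net's
# "intuition" favors 'a', steering the next simulation there.
N = {"a": 1, "b": 1}
Q = {"a": 0.0, "b": 0.0}
P = {"a": 0.8, "b": 0.2}   # prior from the policy net
print(select_action(N, Q, P))  # -> 'a'
```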
Replying to @fchollet
The amount of training data is irrelevant. The fact that AlphaZero captures the discrete landscape of chess and Go better than (most) humans demonstrates that DL does not require a continuous domain to work well.
Replying to @ChrSzegedy
No, it's a highly structured space, which can therefore be embedded in a continuous manifold if you can sample enough games (which is a really ridiculous amount of games). This is true for virtually any task, as I was saying earlier.
Replying to @fchollet
Being structured does not imply anything like that. Also, nothing can be intelligent in an environment without structure.
Replying to @ChrSzegedy
No, it does. I could give you any number of examples. You can always fit a continuous function to a set of discrete points (split between training & validation), but the more structure they have, the more likely the continuous function is to correctly fit data points not in your training set
i.e. to generalize. Of course, the term "structure" is completely informal, so not accurate here, but that's not a convo for Twitter.
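(A quick sketch of that fitting argument; the polynomial model and toy data are illustrative assumptions: fit the same continuous model to structured vs. unstructured discrete points, with a train/validation split, and compare how well each generalizes.)

```python
# Fit one continuous function (a cubic polynomial) to two sets of discrete
# points: structured labels (a quadratic rule) vs. random labels. The
# structured case generalizes to held-out points; the random one does not.
import numpy as np

rng = np.random.default_rng(0)
x = np.arange(20, dtype=float)            # discrete points
structured = 0.5 * x**2 - 3 * x           # labels with structure
random_lbl = rng.normal(0, 10, size=20)   # labels with no structure

def val_error(y, degree=3):
    idx = rng.permutation(20)
    tr, va = idx[:14], idx[14:]           # train / validation split
    coefs = np.polyfit(x[tr], y[tr], degree)   # continuous fit
    pred = np.polyval(coefs, x[va])
    return np.mean((pred - y[va])**2)

print("structured   val MSE:", val_error(structured))  # ~0: generalizes
print("unstructured val MSE:", val_error(random_lbl))  # large: memorizes only
```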