Not all functions we want to learn can be built efficiently from differentiable parts. What's less obvious to me right now: could all the important metalearning functions turn out to be differentiable? (not that they have to!)
Replying to @Plinz
What's an example of a function that cannot be built efficiently this way?
Replying to @Plinz
Is there an intuitive way to categorize this class of functions? I guess anything that has a lot of abstract structure and interdependencies.
Isn't the main problem the presence of many large discontinuities (almost by definition of differentiability)? Looping computations are frequently fairly smooth (each iteration brings you a bit closer to the result). Branching (if-statements), OTOH, causes very sharp discontinuities.
Replying to @mere_mortise @DKedmey
It is very easy to train a neural network on a function that performs an if statement, and hard to train it on a loop.
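A minimal sketch of the first half of that claim, with a toy 1-D setup of my own choosing (the target, architecture, and hyperparameters are illustrative, not from the thread): a small MLP fits a step-function ("if statement") target quickly under plain gradient descent.

```python
# Hedged sketch (toy setup, not from the thread): fit the branchy target
# f(x) = 1 if x > 0 else 0 with a small MLP. Training typically converges
# quickly, since the network only needs to place one sharp transition.
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.linspace(-1.0, 1.0, 256).unsqueeze(1)   # inputs in [-1, 1]
y = (x > 0).float()                               # if-statement target

model = nn.Sequential(nn.Linear(1, 16), nn.Tanh(), nn.Linear(16, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

for step in range(2000):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    opt.step()

print(f"final MSE on the if-statement target: {loss.item():.4f}")
```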
The limiting factor for learning loopy computations is not smoothness but (unrolled) network depth (shattering and vanishing gradients). Branchy computations have an unwieldy error surface because getting one decision wrong completely throws off your prediction.
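A hedged sketch of the depth point (again a toy example of my own, with an arbitrary loop body and weight): unrolling a loop turns it into a very deep computation graph, and the gradient reaching the earliest input shrinks rapidly as the number of iterations grows.

```python
# Hedged illustration (toy example, not from the thread): unrolling a loop
# body T times gives a depth-T graph. With a squashing nonlinearity, the
# gradient flowing back to the earliest input shrinks roughly geometrically
# with depth, i.e. the vanishing-gradient obstacle to learning loops.
import torch

def unrolled_loop(x, w, steps):
    h = x
    for _ in range(steps):        # each iteration adds one layer of depth
        h = torch.tanh(w * h)
    return h

w = torch.tensor(0.9)
x = torch.tensor(1.0, requires_grad=True)

for steps in (5, 20, 80):
    out = unrolled_loop(x, w, steps)
    grad, = torch.autograd.grad(out, x)
    print(f"unrolled depth {steps:3d}: |d out / d x| = {grad.abs().item():.2e}")
```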
Exactly. We probably don't want to learn looping algorithms via deep gradient descent. Perhaps we will find that there are very few things that we actually want to learn via deep gradient descent.