Given that the brain appears to be quite noisy and somewhat linear, we can almost certainly approximate the task loss L linearly around the starting parameters W_0: L(W_0 + ΔW) ≈ L(W_0) + ΔW · ∇L(W_0). 2/5
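A minimal numerical sketch of this local linearization, under loud assumptions: the toy `loss`, its `grad`, the 50-dimensional weight vector and the step sizes are all hypothetical stand-ins chosen for illustration, not anything from the thread. It only shows that the first-order approximation gets better as ΔW shrinks.

```python
import numpy as np

def loss(w):
    # Hypothetical toy task loss: smooth and non-quadratic in the weights.
    return float(np.sum(np.log(np.cosh(w - 1.0))))

def grad(w):
    # Analytic gradient of the toy loss above.
    return np.tanh(w - 1.0)

def linearized_loss(w0, dw):
    # First-order expansion: L(W_0 + dW) ~ L(W_0) + dW . grad L(W_0)
    return loss(w0) + float(dw @ grad(w0))

rng = np.random.default_rng(0)
w0 = rng.normal(size=50)            # stand-in for the starting parameters W_0
direction = rng.normal(size=50)
direction /= np.linalg.norm(direction)  # unit-length direction for dW

def approx_error(step):
    # Gap between the true loss and its linearization for a step of this size.
    dw = step * direction
    return abs(loss(w0 + dw) - linearized_loss(w0, dw))

for step in (1e-1, 1e-2, 1e-3):
    print(step, approx_error(step))
```

For this toy loss the gap shrinks roughly quadratically with the step size, which is the sense in which the statement is local.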
This immediately implies that the brain can only get better if the changes induced by a learning episode are correlated with the negative loss gradient -∇L(W_0). Animals almost always improve behaviorally. 3/5
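The step from the linearization to this condition can be written out as a short derivation. The angle θ between ΔW and ∇L(W_0) is notation introduced here for illustration; it does not appear in the thread.

```latex
\begin{aligned}
\Delta L &= L(W_0 + \Delta W) - L(W_0)
  \approx \Delta W \cdot \nabla L(W_0)
  = \lVert \Delta W \rVert \, \lVert \nabla L(W_0) \rVert \cos\theta, \\
\Delta L &< 0 \iff \cos\theta < 0 .
\end{aligned}
```

So, to first order, a learning episode lowers the loss exactly when its weight change has a component along -∇L(W_0).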
I also assume this approximation to be good. For a given improvement, the gradient descent solution of ΔW=-γ∇L is the one that would change the brain least. The size of the overall change to the brain is a central factor in across-task interference. Minimal interference! 4/5
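A rough numerical check of the minimal-change claim, assuming the linear approximation holds and measuring the size of a change by its Euclidean norm. The random vector `g`, the value of `target` and the helper name are hypothetical. Any step with the same first-order improvement equals the gradient-descent step plus a component orthogonal to the gradient, so it can only be longer.

```python
import numpy as np

rng = np.random.default_rng(1)
g = rng.normal(size=100)   # hypothetical stand-in for the gradient at W_0
target = -0.01             # desired first-order change in loss, dW . g

# Gradient-descent step dW = -gamma * g, with gamma chosen to hit the target.
gamma = -target / (g @ g)
dw_gd = -gamma * g

def random_step_with_same_improvement():
    # Add a random component orthogonal to g; dW . g is left unchanged.
    noise = rng.normal(size=g.shape)
    noise -= (noise @ g) / (g @ g) * g
    return dw_gd + 0.01 * noise

others = [random_step_with_same_improvement() for _ in range(1000)]

gd_norm = np.linalg.norm(dw_gd)
min_other_norm = min(np.linalg.norm(d) for d in others)
print(gd_norm, min_other_norm)
```

None of the sampled alternatives is shorter than the gradient-descent step, which matches the Cauchy-Schwarz argument behind the tweet.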
This logic is why I have trouble seeing gradient descent as "just another theory that may be right or wrong" - it has a strong normative justification and, under certain rather harmless assumptions, must be at least correlated with plasticity in the brain. 5/5
Related to this discussion: https://twitter.com/KordingLab/status/1200403592341053440?s=20 …
@GaryMarcus @tyrell_turing
And also, see eq 1 in this beautiful paper by @Timothy0Leary: https://www.pnas.org/content/116/21/10537.short … - it is obviously not really a new point I am making, just local linearization.
-
i think it might in special cases, but the brain does so much, it likely does not in lots of other cases. i guess the question is how important is the GD approximation, versus other stuff going on?
-
Indeed. But here I am just making the generic point that the GD approximation is, in all likelihood, part of it. And I obviously agree that all kinds of mechanisms can approximate it well enough. Indeed, many algorithms approximate GD while being faster.
-
Not all dynamical phenomena have a Lyapunov function. Aren't you assuming what you are trying to prove?
-
My statement is strictly local.
