(I realize many answers will restrict the architecture or objective function greatly, to allow specialized non-backprop ways to get such gradients, which wouldn’t work for general function approximation in an arbitrary net topology. What can you really do with those? What’s the best alternative?)
Replying to @AdamMarblestone @pfau and
I guess it breaks into 2 cases: 1) you don’t need the gradient, or 2) you do, but you have a way to get it that is structurally very different from backprop. For instance, this paper gets the gradient w/ either backprop or EM (perhaps an example of case #2): https://arxiv.org/abs/1202.3732
Replying to @AdamMarblestone @xaqlab and
Also I think "the brain minimizes an objective function" is a vacuous statement. All dynamical systems can be reframed as solving a variational problem.
Replying to @pfau @AdamMarblestone and
E.g. a flow field with curl is not the gradient of anything.
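A small numerical sketch of the curl point, not part of the original thread. It assumes two hypothetical 2-D fields chosen for illustration: a pure rotation `(-y, x)` and the gradient of `f(x, y) = -(x**2 + y**2) / 2`. A gradient field must have zero curl and zero circulation around any closed loop, so a field that fails either check cannot be the gradient of any scalar objective.

```python
import math

def rotational_field(x, y):
    """Hypothetical example: pure rotation F(x, y) = (-y, x), whose curl is 2 everywhere."""
    return (-y, x)

def gradient_field(x, y):
    """Hypothetical example: gradient of f(x, y) = -(x**2 + y**2) / 2, i.e. F = (-x, -y)."""
    return (-x, -y)

def curl_2d(field, x, y, h=1e-5):
    """Central-difference estimate of dFy/dx - dFx/dy at the point (x, y)."""
    dFy_dx = (field(x + h, y)[1] - field(x - h, y)[1]) / (2 * h)
    dFx_dy = (field(x, y + h)[0] - field(x, y - h)[0]) / (2 * h)
    return dFy_dx - dFx_dy

def circulation(field, radius=1.0, steps=2000):
    """Line integral of the field around a circle centered at the origin (midpoint rule)."""
    total = 0.0
    for k in range(steps):
        t0 = 2 * math.pi * k / steps
        t1 = 2 * math.pi * (k + 1) / steps
        tm = 0.5 * (t0 + t1)
        fx, fy = field(radius * math.cos(tm), radius * math.sin(tm))
        dx = radius * (math.cos(t1) - math.cos(t0))
        dy = radius * (math.sin(t1) - math.sin(t0))
        total += fx * dx + fy * dy
    return total

if __name__ == "__main__":
    for name, field in [("rotation", rotational_field), ("gradient", gradient_field)]:
        print(name, "curl:", curl_2d(field, 0.3, -0.7), "circulation:", circulation(field))
```

For the rotation, the circulation around the unit circle approaches 2π, which means following the flow around the loop would keep "improving" forever; no single-valued objective allows that.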
Right. The claim is not just that the brain’s dynamics satisfy some arbitrary Euler-Lagrange equations, but that its learning dynamics actually correspond to ongoing optimization during the lifetime of the organism, i.e., the brain is itself carrying out an optimization algorithm.
Replying to @AdamMarblestone @xaqlab and
Ok. What about belief propagation? The fixed points are minima of the Bethe free energy, but it doesn't look much like gradient descent.
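To make the contrast with gradient descent concrete, here is a minimal sum-product sketch, not part of the original thread. It assumes a hypothetical toy model: a three-node chain of binary variables with made-up potentials (`unary`, `pair`). The updates are local message exchanges between neighbors, with no explicit gradient step anywhere. On a tree such as this chain the resulting beliefs match the exact marginals, which the brute-force function checks; the sketch does not compute the Bethe free energy itself.

```python
import itertools

# Hypothetical toy model: a chain x0 - x1 - x2 of binary variables.
unary = [[0.7, 0.3], [0.5, 0.5], [0.2, 0.8]]  # unary[i][state], made-up values
pair = [[2.0, 1.0], [1.0, 2.0]]               # symmetric coupling used on both edges
edges = [(0, 1), (1, 2)]
neighbors = {0: [1], 1: [0, 2], 2: [1]}

def normalize(v):
    s = sum(v)
    return [a / s for a in v]

def run_bp(iterations=10):
    """Sum-product belief propagation; returns one normalized belief per node."""
    # messages[(i, j)][x_j] is the message from node i to node j.
    messages = {}
    for i, j in edges:
        messages[(i, j)] = [1.0, 1.0]
        messages[(j, i)] = [1.0, 1.0]
    for _ in range(iterations):
        new = {}
        for (i, j) in messages:
            out = []
            for xj in (0, 1):
                total = 0.0
                for xi in (0, 1):
                    # Product of messages into i from every neighbor except j.
                    incoming = 1.0
                    for k in neighbors[i]:
                        if k != j:
                            incoming *= messages[(k, i)][xi]
                    total += unary[i][xi] * pair[xi][xj] * incoming
                out.append(total)
            new[(i, j)] = normalize(out)
        messages = new
    beliefs = []
    for i in range(3):
        b = list(unary[i])
        for k in neighbors[i]:
            b = [b[x] * messages[(k, i)][x] for x in (0, 1)]
        beliefs.append(normalize(b))
    return beliefs

def brute_force_marginals():
    """Exact marginals by enumerating all 8 joint states."""
    marg = [[0.0, 0.0] for _ in range(3)]
    for xs in itertools.product((0, 1), repeat=3):
        w = 1.0
        for i in range(3):
            w *= unary[i][xs[i]]
        for i, j in edges:
            w *= pair[xs[i]][xs[j]]
        for i in range(3):
            marg[i][xs[i]] += w
    return [normalize(m) for m in marg]

if __name__ == "__main__":
    print(run_bp())
    print(brute_force_marginals())
```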
I think that’s a good example. So we have BP (backprop) and BP (belief propagation). Vicarious seems to like the latter. And people know to search for this in large-scale cortical experiments, e.g., many of the IARPA MICrONS proposals centered on this in 2015 or so.
Replying to @AdamMarblestone @xaqlab and
Sure, I mean, I picked that as an example because I know it's Xaq's favorite theory of how cortex works (and Pearl himself was inspired by Rumelhart's speculation about neuroscience).
So there are 2 interesting, non-arbitrary, not-(entirely-)sociologically-driven theories... maybe science is functioning OK and we mostly now need much, much better data... :-)
Replying to @AdamMarblestone @xaqlab and
And we need to be able to *interpret* that data. It may be that the optimization is implemented in such an alien way that we can hardly recognize it.
Admittedly possible.