Yes, totally. But that's precisely what we have to change (and what a few groups are working on, including mine): our credit assignment models need to start making physiological predictions that can be falsified.
Replying to @tyrell_turing @pfau
Wait, credit assignment = backprop?
Replying to @xaqlab @tyrell_turing
The brain does credit assignment by whatever was cool at NeurIPS 5 years ago.
Let's add some meat to this, then. What are good options for credit assignment that *don't* require efficient access to an estimate of the first-order gradient of an objective function w.r.t. a given synaptic weight deep in a network? Honest question.
Replying to @AdamMarblestone @pfau
(I realize many answers will restrict the architecture or objective function greatly, to allow specialized non-backprop ways to get such gradients, which wouldn't work for general function approximation in an arbitrary net topology. What can you really do with those? What's the best alternative?)
Replying to @AdamMarblestone @pfau
I guess it breaks into 2 cases: 1) you don't need the gradient, or 2) you do, but you have a way to get it that is structurally very different from backprop. For instance, this paper gets the gradient with either backprop or EM (perhaps an example of case #2): https://arxiv.org/abs/1202.3732
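(For concreteness, here is a minimal sketch of case #2: estimating the gradient by random weight perturbation rather than a backward pass. The toy linear network, loss, and noise scale are illustrative assumptions, not the linked paper's setup.)

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear "network" y = W @ x with a squared-error loss.
# (Illustrative placeholders, not the linked paper's setup.)
W = rng.normal(size=(2, 3))
x = rng.normal(size=3)
target = np.array([1.0, -1.0])

def loss(W):
    return 0.5 * np.sum((W @ x - target) ** 2)

# Weight perturbation: nudge all weights with i.i.d. Gaussian noise and
# correlate the resulting change in the scalar loss with the noise.
# E[(dL / sigma) * eps] ~= grad L, with no backward pass anywhere.
sigma = 1e-3
n_samples = 20000
grad_est = np.zeros_like(W)
for _ in range(n_samples):
    eps = rng.normal(size=W.shape)
    dL = loss(W + sigma * eps) - loss(W)
    grad_est += (dL / sigma) * eps
grad_est /= n_samples

# Analytic (backprop) gradient for comparison: dL/dW = (W x - target) x^T
grad_true = np.outer(W @ x - target, x)
print(np.max(np.abs(grad_est - grad_true)))  # small; shrinks as n_samples grows
```

The estimate agrees with the backprop gradient, but each weight learns from one global scalar; no error signals are routed backwards through the network.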
Replying to @AdamMarblestone @xaqlab
Also I think "the brain minimizes an objective function" is a vacuous statement. All dynamical systems can be reframed as solving a variational problem.
Replying to @pfau @AdamMarblestone
E.g. a flow field with curl is not the gradient of anything.
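(A minimal sketch of this point, using the standard rotational field F(x, y) = (-y, x): a field is a gradient only if its Jacobian is symmetric, and curl is exactly the failure of that symmetry.)

```python
import numpy as np

# The canonical rotational field F(x, y) = (-y, x).
def F(p):
    x, y = p
    return np.array([-y, x])

# If F were the gradient of a potential, its Jacobian would be symmetric
# (mixed partial derivatives commute). Estimate it by central differences.
def jacobian(F, p, h=1e-6):
    J = np.zeros((2, 2))
    for j in range(2):
        dp = np.zeros(2)
        dp[j] = h
        J[:, j] = (F(p + dp) - F(p - dp)) / (2 * h)
    return J

J = jacobian(F, np.array([0.3, 0.7]))
print(J)                    # [[0, -1], [1, 0]]: antisymmetric, not symmetric
print(J[1, 0] - J[0, 1])    # 2-D curl = 2.0, nonzero => F is not a gradient
```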
Replying to @xaqlab @AdamMarblestone
That's a narrow definition of minimizing an objective. Hamiltonian dynamics follow a vector field with curl, and that can be reframed as finding a path of least action.
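(A minimal sketch of that reframing for the harmonic oscillator, H = (p^2 + q^2)/2; the duration T = 1 is chosen so that the stationary point of the action is in fact a minimum.)

```python
import numpy as np

# Harmonic oscillator, H = (p^2 + q^2) / 2. Hamilton's equations give the
# rotational field (dq/dt, dp/dt) = (p, -q): pure curl, nothing descended.
# Yet q(t) = cos(t) makes the action S = integral of (qdot^2/2 - q^2/2) dt
# stationary (a minimum here, since the duration T = 1 < pi).
T, n = 1.0, 1000
t = np.linspace(0.0, T, n + 1)
dt = t[1] - t[0]

def action(q):
    qdot = np.diff(q) / dt
    qmid = 0.5 * (q[1:] + q[:-1])
    return np.sum(0.5 * qdot**2 - 0.5 * qmid**2) * dt

q_true = np.cos(t)            # the actual Hamiltonian trajectory
S_true = action(q_true)

# Perturb the path with the endpoints held fixed: the action only increases.
for amp in (0.1, 0.01):
    q_pert = q_true + amp * np.sin(np.pi * t / T)   # vanishes at both ends
    print(amp, action(q_pert) - S_true)             # positive in both cases
```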
But it doesn't count as credit assignment if it's not normative, i.e. if there isn't a relationship to behaviour getting better in some way. That is not a given; if it were, we wouldn't need backprop in ML.
There is also minimizing in the sense of “being a minimum (stationary point) of” versus minimizing in the sense of “causing to go down over time”.
Replying to @AdamMarblestone @tyrell_turing
Good point. When belief propagation converges, it does so to a local minimum of the Bethe free energy (BFE), but it doesn't follow a gradient on the dynamical variables. And it doesn't always converge either, in which case it cannot be framed as minimizing the BFE.
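(A minimal sketch of loopy belief propagation on a 3-node binary cycle; the potentials below are arbitrary illustrative choices. The updates are fixed-point iterations on messages, not gradient steps, and convergence is not guaranteed on loopy graphs in general.)

```python
import numpy as np

# Sum-product belief propagation on a 3-node binary cycle (pairwise MRF).
# Pairwise potential exp(J * s_i * s_j) with s in {-1, +1}; unary potentials
# chosen arbitrarily so the fixed point is nontrivial.
J = 0.5
psi = np.exp(J * np.outer([-1.0, 1.0], [-1.0, 1.0]))   # pairwise potential
phi = np.array([[0.5, 0.5], [0.3, 0.7], [0.6, 0.4]])    # unary potentials

edges = [(0, 1), (1, 2), (2, 0)]
directed = edges + [(j, i) for i, j in edges]
msgs = {e: np.full(2, 0.5) for e in directed}            # m_{i->j}(x_j)

for sweep in range(500):
    new = {}
    for (i, j) in directed:
        # phi_i(x_i) times incoming messages to i from neighbors other than j,
        # summed through the pairwise potential, then normalized.
        incoming = phi[i].copy()
        for (k, l) in directed:
            if l == i and k != j:
                incoming *= msgs[(k, l)]
        m = psi.T @ incoming
        new[(i, j)] = m / m.sum()
    diff = max(np.abs(new[e] - msgs[e]).max() for e in directed)
    msgs = new
    if diff < 1e-12:
        print(f"converged after {sweep + 1} sweeps")
        break

# At a converged fixed point, the beliefs sit at a stationary point of the
# Bethe free energy (Yedidia et al.), though no gradient was ever followed.
belief0 = phi[0] * msgs[(1, 0)] * msgs[(2, 0)]
print(belief0 / belief0.sum())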
Replying to @xaqlab @AdamMarblestone
The point, though, is about getting more optimal behavior, and you can get this if the dynamics converge on the typical set of some unknown distribution. Which, I suppose, offers one way to get learning without optimization (it is also strictly better).
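(A hedged sketch of that idea: a random-walk Metropolis chain, with an illustrative Gaussian target and step size, whose dynamics settle on the typical set without computing a gradient or descending any explicit objective.)

```python
import numpy as np

rng = np.random.default_rng(0)

# Random-walk Metropolis targeting a 10-D standard Gaussian. The chain
# never computes a gradient and never descends a loss, yet its samples
# concentrate on the typical set: the shell at radius ~ sqrt(d),
# NOT the density's maximum at the origin.
d, n_steps, step = 10, 50000, 0.5

def logp(x):
    return -0.5 * np.sum(x**2)

x = np.zeros(d)
radii = []
for _ in range(n_steps):
    prop = x + step * rng.normal(size=d)
    if np.log(rng.random()) < logp(prop) - logp(x):  # Metropolis accept rule
        x = prop
    radii.append(np.linalg.norm(x))

# Past burn-in, the radius hovers near sqrt(10) ~= 3.16, not near 0.
print(np.mean(radii[n_steps // 2:]))
```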