It did (Werbos, Bryson & Ho, etc.), but he didn't know about it.
Replying to @pmddomingos and @rao2z
Yann LeCun retweeted Yann LeCun
See this response: https://twitter.com/ylecun/status/1449006954886701069?t=eLh9lU6RBAiQff9bMd1QhA&s=19
Yann LeCun added:
Yann LeCun @ylecun Replying to @rao2z
Ok:
- the chain rule existed.
- Lagrangian mechanics existed.
- the adjoint state method in optimal control existed.
But no one had shown the chain rule could be used to train multilayer nets. And *clearly* *no one* had actually made it work until about 1985.
So you don’t think something exists until it’s been shown to work. It would be more accurate to give everyone credit for their part instead of just saying backprop was invented in 1985 (and even then by multiple people independently, at the same time).
With this reasoning we should credit Newton for inventing backprop. I mean, Newton (/Leibniz) invented differentiation, so we should credit them for inventing the chain rule, which is apparently backprop too?
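The "chain rule is backprop" quip can be made concrete. Below is a minimal sketch (my own illustration, not any historical implementation, with all names invented for the example) of the chain rule applied layer by layer to a tiny two-layer net y = w2·tanh(w1·x) under squared error, checked against finite differences:

```python
# Sketch: backprop as the chain rule applied layer by layer to
# y = w2 * tanh(w1 * x) with squared-error loss (y - t)^2.
import math

def forward(x, w1, w2):
    h = math.tanh(w1 * x)   # hidden activation
    y = w2 * h              # output
    return h, y

def backward(x, target, w1, w2):
    h, y = forward(x, w1, w2)
    dL_dy = 2.0 * (y - target)        # d/dy of (y - t)^2
    dL_dw2 = dL_dy * h                # chain rule: dL/dw2 = dL/dy * dy/dw2
    dL_dh = dL_dy * w2                # propagate the error to the hidden unit
    dL_dw1 = dL_dh * (1 - h * h) * x  # tanh' = 1 - tanh^2, then dh/dw1 = tanh'(w1*x) * x
    return dL_dw1, dL_dw2

# Sanity check: the analytic gradient matches central finite differences.
x, t, w1, w2 = 0.5, 1.0, 0.3, -0.7
g1, g2 = backward(x, t, w1, w2)
eps = 1e-6
num = ((forward(x, w1 + eps, w2)[1] - t) ** 2
       - (forward(x, w1 - eps, w2)[1] - t) ** 2) / (2 * eps)
print(abs(g1 - num) < 1e-6)  # True
```

The "invention", on this view, is not any one line above but the realization that stacking these local derivative products suffices to assign credit to interior units.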
We should credit them with taking an important step in the invention of backprop, certainly.
Stop this Euro-centrism, Pedro! We all know that Brahmagupta invented and documented Zero, and everything after that was just an infinitesimal detail..
Yann can correct me, but I think I've heard him joke that backprop should really be credited to Leibniz because he invented the chain rule. So pin that one on him :)
Replying to @pmddomingos, @rao2z and
The bigger point here, which perhaps we can all agree on, is that the most important (& hardest, at the time) part of backprop is realizing that it's a valid way to train models with multiple optima even though it has no guarantee of finding the global one (cf. Minsky & Papert).
Well.. not sure what "valid" means. Basically we found that the learned models do okay, irrespective of which optima, if any, they settled in. Ergo, the perceived usefulness of BP is predicated on the availability of the data/compute infrastructure that allowed us to see this.
Replying to @rao2z, @pmddomingos and
We did at least get out of the "but who will tell us how much of the output error the interior units are responsible for?" hand-wringing that seemed popular with psychologists until the early '80s..
That's my point. We needed computers and data to get beyond the "it must have a proof" frame of mind (and some still haven't).