To make backprop more efficient, you would need to make it modular and hierarchical, but you would need to that without compromising the very thing that has made it so successful. Not an easy challenge.
-
-
Show this threadThanks. Twitter will use this to make your timeline better. UndoUndo
-
-
-
But many numerical optimisation methods never come close to global extreme. In fact this lack of convergence (implicit regularisation) may be needed to ensure valid results. Perhaps the global optimum would be overfit.
Thanks. Twitter will use this to make your timeline better. UndoUndo
-
-
Show additional replies, including those that may contain offensive content
Loading seems to be taking a while.
Twitter may be over capacity or experiencing a momentary hiccup. Try again or visit Twitter Status for more information.