About 10,000 deep learning papers boil down to "hard-coding priors about a specific task into a NN architecture works better than having no prior" -- but they're typically passed off as "architecture XYZ offers superior performance on [overly generic task category]"
Basically, a lot of papers can be rephrased as "we achieved better performance on this specific task by going to great lengths to inject more information about the task into our training setup"
An extreme case of this: working with a synthetically generated dataset where samples follow a "template" (e.g. bAbI), and manually hard-coding that template into your NN architecture
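[Editor's sketch, not part of the thread: a minimal hypothetical illustration of what "hard-coding the template" can look like. The PointerAnswerer model and toy setup below are my own. For a bAbI-style task where the answer is always a word from the story, the template is baked directly into the output layer by scoring story positions instead of the full vocabulary.]

```python
import torch
import torch.nn as nn

class PointerAnswerer(nn.Module):
    """Toy model that hard-codes the bAbI-style template
    "the answer is always a token from the story"."""
    def __init__(self, vocab_size: int, dim: int = 64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.story_enc = nn.GRU(dim, dim, batch_first=True)
        self.query_enc = nn.GRU(dim, dim, batch_first=True)

    def forward(self, story_ids, query_ids):
        # story_ids: (batch, story_len); query_ids: (batch, query_len)
        story_h, _ = self.story_enc(self.embed(story_ids))  # (B, S, D)
        _, q_h = self.query_enc(self.embed(query_ids))      # (1, B, D)
        # The template prior lives here: scores range over story
        # positions only, never over the full vocabulary.
        scores = torch.bmm(story_h, q_h.squeeze(0).unsqueeze(-1))
        return scores.squeeze(-1)  # (B, S); argmax = answer position

model = PointerAnswerer(vocab_size=100)
story = torch.randint(0, 100, (2, 30))
query = torch.randint(0, 100, (2, 5))
print(model(story, query).shape)  # torch.Size([2, 30])
```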
Fitting parametric models via gradient descent, unsurprisingly, works best when what you are fitting is already a template of the solution. Of course, convnets are an instance of this (but in a good way, since their assumptions generalize to all visual data).
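[Editor's sketch, not part of the thread: a self-contained way to see the convnet prior concretely. Weight sharing makes convolution equivariant to translation; circular padding is used here so the check is exact, whereas with zero padding it only holds away from the image borders.]

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
conv = nn.Conv2d(1, 8, kernel_size=3, padding=1,
                 padding_mode="circular", bias=False)

img = torch.randn(1, 1, 28, 28)
shifted = torch.roll(img, shifts=2, dims=-1)  # shift input 2 px right

# Shifting the input then convolving == convolving then shifting the output.
same = torch.allclose(conv(shifted),
                      torch.roll(conv(img), shifts=2, dims=-1),
                      atol=1e-6)
print(same)  # True: the architecture encodes "patterns can appear anywhere"
```

A dense layer on the flattened image has no such constraint: it must learn every shifted copy of a pattern separately, which is exactly the "lack of prior" the thread contrasts against.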
End of conversation
New conversation
I totally disagree. Probably because I usually use machine learning to understand a concept rather than to try to solve AGI.
On the other hand, maybe it just depends on your definitions of "specific task" and "generalisable"
End of conversation
New conversation
"Which is the only thing that matters" - matters in what way? For finding better learning algorithms - sure, but in applications which typically seek to answer a very narrow question domain knowledge and the know-how to translate it into priors of the model is king.
I wish there were a go-to paper that could give me enough insight into how to fix, or at least improve, my own models' disappointing generalisation.