About 10,000 deep learning papers have been written showing that "hard-coding priors about a specific task into a NN architecture works better than a lack of prior" -- but they're typically passed off as "architecture XYZ offers superior performance for [overly generic task category]". A concrete sketch of what such a hard-coded prior looks like follows below.
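As a minimal illustration of the point (framework choice is an assumption; the thread names none), a convolution hard-codes locality and translation equivariance, while a dense layer of the same input size encodes no such prior and has to learn it from data at a much higher parameter cost:

```python
import torch
import torch.nn as nn

# A convolution hard-codes two priors: locality (small receptive field)
# and translation equivariance (weights shared across positions).
conv = nn.Conv2d(1, 8, kernel_size=3, padding=1, bias=False)

x = torch.randn(1, 1, 16, 16)
shifted = torch.roll(x, shifts=2, dims=-1)  # translate the input 2 px right

# Equivariance: conv(shift(x)) == shift(conv(x)), up to border effects,
# so we compare the interior region only.
out_then_shift = torch.roll(conv(x), shifts=2, dims=-1)
shift_then_out = conv(shifted)
print(torch.allclose(out_then_shift[..., 4:-4], shift_then_out[..., 4:-4], atol=1e-6))

# A dense layer mapping the same input to the same output size has no
# built-in prior -- and pays for it in parameters.
dense = nn.Linear(16 * 16, 8 * 16 * 16, bias=False)
print(sum(p.numel() for p in conv.parameters()))   # 72 weights
print(sum(p.numel() for p in dense.parameters()))  # 524,288 weights
```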
-
So methodologically, papers claiming architectural generality need to demonstrate that generality across a range of tasks and datasets. This is what the ML community used to do before the arrival of these large and expensive single-dataset benchmarks.
-
Basically yes: researchers should focus on measuring generalization rather than performance on a single task. The point of an architectural innovation is that it can be reused across tasks and datasets -- so you should understand what priors it contains and what its specialization is. A sketch of that evaluation protocol follows below.
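As a minimal sketch of that protocol (the scikit-learn toy datasets here are a stand-in assumption for "a range of tasks"; a real study would use genuinely diverse benchmarks), fix one architecture and measure it everywhere instead of tuning it to a single dataset:

```python
from sklearn.datasets import load_iris, load_wine, load_digits
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

# One fixed architecture, evaluated across several tasks. Consistent
# cross-dataset performance is evidence of generality; a win on only
# one dataset suggests the architecture encodes a task-specific prior.
for name, loader in [("iris", load_iris), ("wine", load_wine), ("digits", load_digits)]:
    X, y = loader(return_X_y=True)
    model = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```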