You can always "buy" performance by training on more data or better data, or by injecting task information into the architecture or the preprocessing. However, this isn't informative about the generalization power of the techniques used (which is the only thing that matters).
Basically, a lot of papers can be rephrased as "we achieved better performance on this specific task by going to great lengths to inject more information about the task into our training setup".
An extreme case of this: working with a synthetically generated dataset whose samples follow a "template" (e.g. bAbI), and manually hard-coding that template into your NN architecture.
Show this thread -
Fitting parametric models via gradient descent, unsurprisingly, works best when what you are fitting is already a template of the solution. Of course, convnets are an instance of this (but in a good way, since their assumptions generalize to all visual data).
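A minimal sketch of that point, using a made-up toy problem (the setup and numbers here are mine, not from the thread): the data is generated from y = a*x + b, the "architecture" is that exact template, and gradient descent only has to fill in two numbers.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)
y = 3.0 * x + 0.5 + rng.normal(scale=0.05, size=100)  # true a=3.0, b=0.5

# The model *is* the solution template; only a and b are left to learn.
a, b = 0.0, 0.0
lr = 0.1
for _ in range(500):
    err = (a * x + b) - y          # residuals under the template
    a -= lr * np.mean(err * x)     # gradient of 0.5 * mean(err**2) w.r.t. a
    b -= lr * np.mean(err)         # gradient of 0.5 * mean(err**2) w.r.t. b

print(a, b)  # recovers roughly (3.0, 0.5): the template did most of the work
```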
End of conversation
New conversation
For most relevant applications, in particular medicine, we will never have enough data to "let the data sort itself out". Beyond that, why ignore the decades or even centuries of domain knowledge we already have? Finding the best way to incorporate it into #MachineLearning is key.
Agree, especially if the goal is to mimic the brain.
New conversation
I'm not doing this myself, but I could see situations where data is limited and adding prior knowledge could help a model generalize.
The point is that by picking a specific architecture you are implicitly selecting a prior. There is no way to avoid doing this, since the space of all possible models is vastly larger than what any dataset you could possibly collect can pin down.
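One way to see this (a hedged toy illustration of mine, not the replier's own example): fit the same handful of points with two different model classes. Both fit the training data about equally well; the class you picked, not the data, decides how each behaves away from the samples.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 8)
y = x + rng.normal(scale=0.1, size=8)   # underlying trend is linear

linear = np.polyfit(x, y, deg=1)        # prior: "the world is linear"
wiggly = np.polyfit(x, y, deg=6)        # prior: "any degree-6 curve will do"

x_test = 1.5                            # just outside the training range
print(np.polyval(linear, x_test))       # extrapolates sensibly (~1.5)
print(np.polyval(wiggly, x_test))       # typically swings far off the trend:
                                        # the implicit prior, not the data,
                                        # dictates behaviour out here
```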
End of conversation
New conversation
Can you give an example of the kind of thing you are describing? What is the "prior" knowledge? I think there is a huge gap between the language of architectural structures and typical languages for prior knowledge.
I mentioned the example of bAbI. If you already know how to go from the input data to the answer, you can express the process as a parametric formula and have your network learn the parameters. Most of the work was done by you, not by your network.
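A sketch of what "most of the work was done by you" can look like (a hypothetical bAbI-style toy of my own, not an actual bAbI model): suppose the answer to "where is the agent?" is always the last location mentioned. If that retrieval rule is hard-coded into the architecture, the network only has to learn a trivial readout.

```python
import numpy as np

rng = np.random.default_rng(0)
n_locations = 5

def make_story(length=6):
    """A story is a sequence of location ids; the answer is the last one."""
    facts = rng.integers(0, n_locations, size=length)
    return facts, facts[-1]

# Hard-coded template: always retrieve the final fact. The only learnable
# part is a linear readout W from one-hot location to answer logits.
W = rng.normal(scale=0.1, size=(n_locations, n_locations))

def forward(facts):
    x = np.eye(n_locations)[facts[-1]]   # hand-written retrieval rule
    return W @ x                         # learned readout

# Plain gradient descent on softmax cross-entropy.
for _ in range(200):
    facts, answer = make_story()
    logits = forward(facts)
    p = np.exp(logits - logits.max())
    p /= p.sum()
    x = np.eye(n_locations)[facts[-1]]
    W -= 0.5 * np.outer(p - np.eye(n_locations)[answer], x)

# W ends up close to a scaled identity: the "learning" is trivial because
# the hand-coded retrieval step already solved the task.
```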