I am reaching a little bit, but why not? @fchollet I am reading your paper https://arxiv.org/abs/1911.01547 (for the 3rd time, I think): why does generalization necessarily have a cost that makes it antagonistic to compression in the Occam's Razor sense? @RobertTLange?
-
The more general point is that generalization requires you to store seemingly useless information at training time (info that doesn't help your training objective) that will become useful in the future (when generalization actually happens). That's the opposite of compression.
-
Now, obviously, generalization requires abstraction, which requires erasing irrelevant details, so your high-generalization system will be doing *some* amount of compression. But it will be storing lots of seemingly useless info as well.
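To make that concrete, here is a minimal toy sketch in Python (the scenario, data, and names are hypothetical illustrations, not from the paper or the thread): the training objective only ever asks about color, so a system compressed for that objective discards size, while a generalizing system keeps the seemingly useless attribute.

```python
# Hypothetical toy illustration: both systems abstract away truly
# irrelevant detail (noise), but only one keeps information that is
# useless for the training objective yet useful for future questions.

# Raw observations: lots of detail, some of it pure noise.
observations = [
    {"item": "car",  "color": "red",  "size": "large", "pixel_noise": 0.93},
    {"item": "ball", "color": "blue", "size": "small", "pixel_noise": 0.11},
]

# Maximally compressed w.r.t. the training objective (color questions):
# store colors only.
compressed = {o["item"]: o["color"] for o in observations}

# The generalizing system stores an extra, currently useless attribute.
general = {o["item"]: {"color": o["color"], "size": o["size"]}
           for o in observations}

# Training-time questions ask about color: both systems are perfect.
assert compressed["car"] == general["car"]["color"] == "red"

# A future question about size: only the less compressed system can answer.
print(general["ball"]["size"])   # "small"
print(compressed.get("ball"))    # "blue" -- the size was never stored
```

From the training objective's point of view, the `size` field is pure overhead, which is exactly why generalization and compression pull in opposite directions here.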
New conversation
-
Ok yeah, that makes total sense. So the point about it having to be optimizable for English is that you can actually get some learning out of the interaction with the task?
-
Consider a simple example: I give you a set of questions about a movie, and you watch the movie and try to answer the questions. During the movie you write notes. The shortest possible notes that enable you to answer the questions are just the answers to the questions.
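A toy rendering of that thought experiment (the questions and answers are hypothetical, not from the thread), showing that notes compressed down to the known answers become useless the moment the questions change:

```python
# Hypothetical illustration: when the questions are known in advance,
# the shortest sufficient notes are literally just the answers.

known_questions = ["Who is the protagonist?", "Where does the film end?"]
movie_answers = {
    "Who is the protagonist?": "Ripley",
    "Where does the film end?": "on the shuttle",
}

# Maximal compression w.r.t. the known questions: store only the answers.
notes = {q: movie_answers[q] for q in known_questions}

def answer(question):
    # Anything outside the anticipated question set is unrecoverable.
    return notes.get(question, "not in my notes")

print(answer("Who is the protagonist?"))    # "Ripley"
print(answer("What was the ship called?"))  # "not in my notes"
```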