I am reaching a little bit, but why not? @fchollet I am reading your paper https://arxiv.org/abs/1911.01547 (for the 3rd time, I think). Why does generalization necessarily have a cost that makes it antagonistic to compression in the Occam's Razor sense? @RobertTLange?
-
Because by definition compression discards all information that isn't relevant to the training goal, and by definition the training goal isn't what you want to generalize to (otherwise there is no generalization happening).
-
The most compressed model that does X is only capable of doing X. If it could do more, you could compress it further.
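A minimal sketch of that last point, using a toy "model" made of lookup rules. The tasks, names, and pruning routine below are illustrative assumptions, not anything from the paper or the thread: starting from a model that happens to handle two tasks, repeatedly dropping anything not needed for the training task X yields a strictly smaller model that still does X, and precisely because of that pruning it can no longer do anything else.

```python
# Hypothetical sketch of "the most compressed model that does X is only
# capable of doing X": if a rule is not needed for X, removing it gives a
# smaller model with the same performance on X -- so the minimal model for X
# keeps nothing that would help on any other task.

TASK_X = {"2+2": "4", "3+3": "6"}            # the "training goal"
TASK_Y = {"cat": "animal", "rose": "plant"}  # what we would also like to handle

rules = {**TASK_X, **TASK_Y}                 # a model that happens to do both

def solves(model, task):
    return all(model.get(q) == a for q, a in task.items())

def compress_for(model, task):
    """Drop every rule whose removal does not break `task` (size = rule count)."""
    pruned = dict(model)
    for key in list(pruned):
        trial = {k: v for k, v in pruned.items() if k != key}
        if solves(trial, task):
            pruned = trial                   # smaller model, same performance on `task`
    return pruned

minimal = compress_for(rules, TASK_X)
print(len(rules), len(minimal))              # 4 -> 2: everything not needed for X is discarded
print(solves(minimal, TASK_X))               # True: still does X
print(solves(minimal, TASK_Y))               # False: the cost of being maximally compressed for X
```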
-
First of all, how cool is it that I got a response from you? This is so awesome, thanks a lot. With respect to the argument: OK, I see your point about the dict to compress English, but wouldn't there be some compressed version of language that comprises both?
-
There might be, but it will be by definition less optimal when applied to just English. You won't find it by optimizing for English.
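One way to make the dictionary exchange concrete, as a hedged sketch rather than anything stated in the thread: stand in for "a dict to compress English" with a Huffman code built from English-only character statistics, and for the shared version with a code built from a pooled English+French corpus, then compare how many bits each needs for the same English text. The corpora below are tiny made-up samples; the point is only the direction of the gap, which is guaranteed because the Huffman code built on the English text itself is optimal among prefix codes for that text.

```python
# Hypothetical illustration: a code tuned to English alone vs. one that also
# has to cover French, measured on English text.
import heapq
from collections import Counter

def huffman_code_lengths(text):
    """Return {symbol: codeword length in bits} for a Huffman code built on `text`."""
    freq = Counter(text)
    if len(freq) == 1:                      # degenerate single-symbol case
        return {next(iter(freq)): 1}
    # Heap items: (weight, tie_breaker, {symbol: current depth in the tree})
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]

def encoded_bits(text, code_lengths):
    """Total bits needed to encode `text` with the given per-symbol code lengths."""
    return sum(code_lengths[ch] for ch in text)

english = "the cat sat on the mat and the dog chased the cat around the garden"
french  = "le chat dort sur le tapis et le chien court autour du jardin"

english_code = huffman_code_lengths(english)            # optimized for English only
shared_code  = huffman_code_lengths(english + " " + french)  # covers both languages

print(encoded_bits(english, english_code))  # optimal for this English sample
print(encoded_bits(english, shared_code))   # >= the line above: generality costs bits on English alone
```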