The phenomena I'm explaining are empirical. I like to think about theoretical explanations. The statement "and therefore..." is a non-sequitur, unless I'm missing something.
I think you must be missing something, or I'm not explaining clearly. If you have an un-regularized, very over-parameterized model, trained over a number of epochs, to a global minimum, it won't generalize. That's why we use dropout, data augmentation, weight decay, etc. Right?
No. That's what the community thought. But now it's recognized (though apparently the news hasn't spread) that SGD, on its own, regularizes. Even if you use dropout, batch norm, etc., you may be finding global minima that generalize. Depends which global minima you find!
I'm well aware of that theory. Obviously using mini-batches adds randomness that does some regularizing. But not enough in practice for nearly all real-world uses. You can use tiny batches to get a stronger effect, but that has its own problems.
Replying to @jeremyphoward @roydanroy and
In practice our best models are nearly always extremely over-parameterized, and need a *lot* of regularization. More than mini-batch variation alone can provide.
I agree that we need regularization in over-parametrized models. But mathematically, this has little to do with the achieved training risk _on its own_. Generalization error can depend on training risk, but also on the learned classifier, the algorithm, the data, the data distribution, etc.
Well then we're in agreement
Well then!
Replying to @roydanroy @jeremyphoward and
I would be curious to see what the original statement was. Can someone share a screenshot? (I can't see tweets from Francois since he blocked the last time I disagreed with him.)
Replying to @fhuszar @roydanroy and
"he blocked the last time I disagreed with him" What??
There are people who violently & impolitely disagree with me on Twitter every day, but I have never blocked anyone for that. I block a total of 4 people (not counting bots), including pranv and ferenc. If I block you, usually it means you were being a huge asshole, repeatedly.