The only setup where a global minimum has any chance to generalize is for underparameterized models, such as logistic regression or a one-layer network with few units. Otherwise, the points in parameter space that you are looking for during training are pretty far from the minima.
An important research direction would be to use the information bottleneck principle to come up with models that have exactly the right amount of memorization capacity for a given task, as well as optimization methods to reach the global optimum.
End of conversation
New conversation
This Tweet is unavailable.
Arguably a hashtable would work better for your use case.
End of conversation
Since we can't find the global min via SGD, how do you know that if we could, it would overfit? I'm presuming this is an empirical observation and not based on generalization bounds for deep nets.
You could probably compare SGD with full-batch gradient descent, both with no regularisation. If the results are the same or similar, that would suggest the model is overfitting.
End of conversation
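One way to run the comparison suggested in that reply, as a minimal sketch: the synthetic regression task, MLP architecture, learning rate, and epoch counts below are placeholder assumptions, not anything from the thread.

```python
# Sketch: compare minibatch SGD against full-batch gradient descent, both
# without any explicit regularization, and look at the train/test gap.
# Assumptions (not from the thread): a small synthetic task and a PyTorch MLP.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic data: 200 train / 200 test points, 20 input features.
X_train, X_test = torch.randn(200, 20), torch.randn(200, 20)
w_true = torch.randn(20, 1) / 20 ** 0.5
y_train = X_train @ w_true + 0.1 * torch.randn(200, 1)
y_test = X_test @ w_true + 0.1 * torch.randn(200, 1)

def make_model():
    # Deliberately overparameterized for 200 training points.
    return nn.Sequential(nn.Linear(20, 512), nn.ReLU(), nn.Linear(512, 1))

def train(model, batch_size, epochs=1000, lr=1e-3):
    # weight_decay=0 -> no explicit regularization in either run.
    opt = torch.optim.SGD(model.parameters(), lr=lr, weight_decay=0.0)
    loss_fn = nn.MSELoss()
    n = X_train.shape[0]
    for _ in range(epochs):
        perm = torch.randperm(n)
        for i in range(0, n, batch_size):
            idx = perm[i:i + batch_size]
            opt.zero_grad()
            loss_fn(model(X_train[idx]), y_train[idx]).backward()
            opt.step()
    with torch.no_grad():
        return (loss_fn(model(X_train), y_train).item(),
                loss_fn(model(X_test), y_test).item())

# Minibatch SGD vs. full-batch gradient descent (batch_size = n).
sgd_train, sgd_test = train(make_model(), batch_size=16)
gd_train, gd_test = train(make_model(), batch_size=200)
print(f"SGD        train={sgd_train:.4f}  test={sgd_test:.4f}")
print(f"full-batch train={gd_train:.4f}  test={gd_test:.4f}")
```

If the train/test gaps come out similar for the two runs, that is the "same or similar result" the reply refers to.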
New conversation
Would this be the case even when the loss function is regularized? Isn't the point of regularization to ensure that the minimum loss corresponds to weights that generalize?
Interesting question: we know "regularization" reduces variance, but how does it affect the landscape of local vs. global minima? I think what you're suggesting is that regularization will shift the global minimum to somewhere more generalizable?
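One concrete case where regularization provably moves the global minimum is linear least squares with an L2 penalty, where both minimizers have closed forms. This is a hedged sketch for that simplest setting only; the data and the penalty strength below are made up for illustration.

```python
# Sketch (illustrative only): the unregularized and L2-regularized global
# minimizers of a linear least-squares problem differ.
#   unregularized: w* = (X^T X)^{-1} X^T y        minimizes ||Xw - y||^2
#   ridge:         w* = (X^T X + lam*I)^{-1} X^T y  minimizes ||Xw - y||^2 + lam*||w||^2
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
w_true = np.array([2.0, -1.0, 0.5, 0.0, 3.0])
y = X @ w_true + 0.1 * rng.normal(size=50)

lam = 5.0
w_ols = np.linalg.solve(X.T @ X, X.T @ y)
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(5), X.T @ y)

print("unregularized minimizer:", np.round(w_ols, 3))
print("ridge minimizer:        ", np.round(w_ridge, 3))  # shrunk toward zero
```

Whether the analogous shift in a deep net's loss landscape lands the global minimum somewhere more generalizable is exactly the open question raised above.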
New conversation
In my own experiments (where I train deep nets to learn Boolean functions), this is most definitely not the case. Admittedly, the input dimensionality is very small (7) and there is no noise, so I'm not sure yet how relevant this is for larger problems.
Oh, and my nets are very overparameterized and unregularized.
End of conversation
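For reference, here is a minimal sketch of that kind of experiment under placeholder assumptions: a simple majority-vote Boolean target on 7 inputs and an overparameterized, unregularized MLP. This is not the author's actual setup, only an illustration of the general recipe.

```python
# Sketch (assumptions, not the author's setup): fit an overparameterized,
# unregularized MLP to a 7-input Boolean function and check how well it
# generalizes to the held-out half of the truth table.
import itertools
import torch
import torch.nn as nn

torch.manual_seed(0)

# All 2^7 = 128 binary inputs; placeholder target = majority of the 7 bits.
X = torch.tensor(list(itertools.product([0.0, 1.0], repeat=7)))
y = (X.sum(dim=1, keepdim=True) >= 4).float()

# Train on half of the truth table, test on the other half.
perm = torch.randperm(128)
train_idx, test_idx = perm[:64], perm[64:]

# Heavily overparameterized for 64 training examples; no weight decay.
model = nn.Sequential(nn.Linear(7, 1024), nn.ReLU(),
                      nn.Linear(1024, 1024), nn.ReLU(),
                      nn.Linear(1024, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=0.0)
loss_fn = nn.BCEWithLogitsLoss()

for _ in range(2000):
    opt.zero_grad()
    loss_fn(model(X[train_idx]), y[train_idx]).backward()
    opt.step()

with torch.no_grad():
    acc = lambda idx: ((model(X[idx]) > 0).float() == y[idx]).float().mean().item()
    print(f"train acc={acc(train_idx):.2f}  test acc={acc(test_idx):.2f}")
```

A high test accuracy here, despite zero explicit regularization and far more parameters than training points, is the kind of result the author describes.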