Lookahead selects a search direction based on k steps of the inner optimizer. We demonstrate that this reduces variance, which improves convergence and makes Lookahead more robust to hyperparameter choices. That robustness is especially useful on novel datasets that lack well-calibrated baselines.
The algorithm has minimal computational overhead and stores one additional copy of the parameters. It can be incorporated into existing pipelines with a couple of lines of code. Our implementation is available at: https://github.com/michaelrzhang/lookahead …
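A rough sketch of the idea, not the code from the linked repository: the inner optimizer here is plain SGD on a list of floats, and the names `lookahead_sgd`, `grad_fn`, `alpha`, `lr`, and `outer_steps` are made up for illustration. The thread itself only mentions k and the extra parameter copy; the interpolation step with `alpha` is assumed to follow the Lookahead paper's "slow weights" update.

```python
def lookahead_sgd(grad_fn, theta0, k=5, alpha=0.5, lr=0.1, outer_steps=50):
    """Illustrative Lookahead loop around plain SGD (hypothetical helper)."""
    slow = list(theta0)  # the one additional copy of the parameters
    fast = list(theta0)  # weights updated by the inner optimizer
    for _ in range(outer_steps):
        # Run k steps of the inner optimizer (here: vanilla SGD).
        for _ in range(k):
            grad = grad_fn(fast)
            fast = [w - lr * g for w, g in zip(fast, grad)]
        # Move the slow weights part of the way toward where the
        # k inner steps ended up (assumed update: slow += alpha * (fast - slow)).
        slow = [s + alpha * (f - s) for s, f in zip(slow, fast)]
        # Restart the inner optimizer from the slow weights.
        fast = list(slow)
    return slow


# Toy usage: minimize sum((w - 3)^2), whose gradient is 2 * (w - 3).
weights = lookahead_sgd(lambda w: [2.0 * (wi - 3.0) for wi in w], [0.0, 0.0])
```

The only extra state beyond the inner optimizer is `slow`, and the only extra work is one interpolation every k steps, which is where the small overhead comes from. Setting `alpha=1.0` reduces this sketch to the inner optimizer alone.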