Tweets
-
This has surprisingly little effect on prediction accuracy, but does improve the match between networks considered by theory and used in practice. With Roman Novak,
@sschoenholz, and @hoonkp. Implementation in https://github.com/google/neural-tangents
-
This makes NTK training dynamics dissimilar from those of standard finite width networks. (Infinite width Bayesian networks, NNGPs, don't suffer from this problem.) In https://arxiv.org/pdf/2001.07301.pdf we derive infinite width kernels for the *standard* parameterization, resolving this.
-
Research on the Neural Tangent Kernel (NTK) almost exclusively uses a non-standard neural network parameterization, where activations are divided by sqrt(width), and weights are initialized to have variance 1 rather than variance 1/width.
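The gap between the two parameterizations can be made concrete with a toy sketch in plain NumPy (not the Neural Tangents API; the layer width here is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
width = 4096
x = rng.normal(size=(width,))  # activations entering a dense layer

# Standard parameterization: weights initialized with variance 1/width.
W_std = rng.normal(scale=np.sqrt(1.0 / width), size=(width, width))
y_std = W_std @ x

# NTK parameterization: weights have variance 1, and the layer output is
# divided by sqrt(width) instead.
W_ntk = rng.normal(scale=1.0, size=(width, width))
y_ntk = (W_ntk @ x) / np.sqrt(width)

# At initialization the two produce outputs with identical statistics; they
# differ in how gradient descent rescales each layer as width grows.
print(np.std(y_std), np.std(y_ntk))  # both ~ std(x)
```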
-
Jascha Retweeted
I'll be presenting this work with
@samgreydanus and @jaschasd tomorrow morning (Friday Dec 13) at 11:30am at the NeurIPS Deep Inverse Problems workshop (http://deep-inverse.org) https://twitter.com/shoyer/status/1177770543434485760
-
Jascha Retweeted
-
Jascha Retweeted
Visit our poster today (12/12 Thu) at
#NeurIPS2019 10:45am! #175 "Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent" - work with @Locchiu, @sschoenholz, @yasamanbb, Roman Novak, @jaschasd, Jeffrey Pennington https://github.com/google/neural-tangents/raw/master/presentation/neurips_linearization_poster.pdf
-
-
Infinite width networks (NNGPs and NTKs) are the most promising lead for theoretical understanding in deep learning. But running experiments with them currently resembles the dark age of ML research before ubiquitous automatic differentiation. Neural Tangents fixes that. https://twitter.com/sschoenholz/status/1202988151569973248
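The flavor of these kernel computations can be seen in the closed-form NNGP kernel for a single infinite-width ReLU layer, the order-1 arc-cosine kernel. This is an illustrative sketch in plain NumPy, not the Neural Tangents API:

```python
import numpy as np

def nngp_relu_kernel(x1, x2):
    """NNGP covariance between two inputs for an infinite-width,
    one-hidden-layer ReLU network with unit weight variance and no biases."""
    n1 = np.sqrt(x1 @ x1)
    n2 = np.sqrt(x2 @ x2)
    cos_theta = np.clip((x1 @ x2) / (n1 * n2), -1.0, 1.0)
    theta = np.arccos(cos_theta)
    # Arc-cosine kernel of order 1 (Cho & Saul, 2009).
    return n1 * n2 / (2 * np.pi) * (np.sin(theta) + (np.pi - theta) * np.cos(theta))

x = np.array([1.0, 0.0])
print(nngp_relu_kernel(x, x))  # identical inputs: ||x||^2 / 2 = 0.5
```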
-
Jascha Retweeted
1/ I can't teach you how to dougie, but I can teach you how to compute the Gaussian process corresponding to an infinite-width neural network of ANY architecture, feedforward or recurrent, e.g. resnet, GRU, transformers, etc ... RT plz
http://arxiv.org/abs/1910.12478
-
Even better -- the code runs in your browser in Colab! https://twitter.com/shoyer/status/1187170353061216256
-
Neural reparameterization improves structural optimization! By parameterizing physical design in terms of the (constrained) output of a neural network, we propose stronger and more elegant bridges, skyscrapers, and cantilevers. http://arxiv.org/pdf/1909.04240.pdf With @shoyer and @samgreydanus
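A minimal caricature of the idea, on a hypothetical toy objective rather than the paper's physics pipeline: make the "design" the output of a tiny linear network and run gradient descent on the network's weights instead of on the design directly.

```python
import numpy as np

# Toy objective: pull a "design" vector toward a target (a stand-in for a
# structural-compliance loss).
target = np.linspace(0.0, 1.0, 8)

def loss_grad(design):
    return design - target  # gradient of 0.5 * ||design - target||^2

# Direct parameterization: gradient descent on the design itself.
design = np.zeros(8)
for _ in range(100):
    design -= 0.1 * loss_grad(design)

# Neural reparameterization: design = W @ z for a fixed latent z; we descend
# on W. By the chain rule dL/dW = g z^T, so the effective design update is
# preconditioned by ||z||^2 (here chosen to equal 1, so dynamics match).
z = np.ones(4) / 2.0
W = np.zeros((8, 4))
for _ in range(100):
    g = loss_grad(W @ z)
    W -= 0.1 * np.outer(g, z)

print(np.allclose(design, target, atol=1e-3),
      np.allclose(W @ z, target, atol=1e-3))  # True True
```

With richer (nonlinear, convolutional) networks the induced preconditioning is nontrivial, which is where the reparameterization earns its keep.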
-
Jascha Retweeted
-
Takeaways:
1) A prescription for adjusting SGD hyperparameters with width
2) Generalization strictly improves with width
3) Test accuracy is predicted surprisingly well by a single scalar (eq 4)
4) There is a critical width, beyond which optimal hyperparameters are unachievable
-
A careful empirical study of the effect of network width on generalization and fixed learning rate SGD, for MLPs, convnets, resnets, and batch norm. With superstar resident Daniel Park, and
@quocleix + Sam Smith. https://arxiv.org/abs/1905.03776
-
Jascha Retweeted
Exciting work in evolution and meta-learning!
@Luke_Metz blowing us away with neuron-local meta-learned update rules and their surprising generalization ability (across datasets, data modalities, and architectures!!) @jaschasd https://arxiv.org/abs/1804.00222
-
Jascha Retweeted
-
Jascha Retweeted
1/ Does batchnorm make the optimization landscape smoother? http://arxiv.org/abs/1805.11604 says yes, but our new
@iclr2019 paper http://arxiv.org/abs/1902.08129 shows BN causes gradient explosion in randomly initialized deep BN nets. Contradiction? We clarify below.
-
@laurent_dinh is the most fun to work with. He always has extremely novel ideas ... and makes the most mesmerizing animations. https://twitter.com/laurent_dinh/status/1108807270555619328
-
Including a massive, well-curated dataset mapping hyperparameter configuration to model performance. This may be a useful resource in your own research. https://twitter.com/GoogleAI/status/1108051837691662336
-
Wonderful collaboration with powerhouses
@TheGregYang, Jeffrey Pennington, Vinay Rao, and @sschoenholz. Thanks to Sergey Ioffe, @aleks_madry, and @TsendeeMTS for comments and discussion along the way.