Tweets
This review on normalizing flows is excellent. It's full of clear writing, precise claims, and useful connections. https://twitter.com/gpapamak/status/1202935540175310854
We think one reason no one did reverse-time adjoints for SDEs before is that it hadn't been shown that one can simulate another SDE in reverse to numerically reconstruct the original trajectory. We think we're the first to demonstrate this, theoretically and empirically.
Some related work: Tzen & @mraginsky gave continuous-time forward-mode derivatives (https://arxiv.org/abs/1905.09883), but this is slow for large models. We're now working on infinitely-deep Bayesian neural nets like ones proposed in https://arxiv.org/abs/1905.11065 pic.twitter.com/e1jSzmKezF
We’ll release a set of adaptive SDE solvers in PyTorch along with the code for the stochastic adjoint sensitivities soon. Caveat: It’s kind of slow. So far we scaled it to fitting 50D mocap data with 11000 params, but we just started. pic.twitter.com/a2MWuI0NCd
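The release mentioned here became the torchsde package; below is a minimal sketch assuming that interface, with illustrative model sizes and data (none of it from the paper):

```python
import torch
import torchsde

# Minimal sketch assuming the torchsde interface; sizes and data are illustrative.
class NeuralSDE(torch.nn.Module):
    noise_type = "diagonal"   # independent noise per state dimension
    sde_type = "ito"

    def __init__(self, dim=4, hidden=64):
        super().__init__()
        self.mu = torch.nn.Sequential(
            torch.nn.Linear(dim, hidden), torch.nn.Tanh(),
            torch.nn.Linear(hidden, dim))
        self.sigma = torch.nn.Sequential(
            torch.nn.Linear(dim, hidden), torch.nn.Tanh(),
            torch.nn.Linear(hidden, dim))

    def f(self, t, y):  # drift network
        return self.mu(y)

    def g(self, t, y):  # diagonal diffusion network
        return self.sigma(y)

sde = NeuralSDE()
y0 = torch.zeros(16, 4)            # batch of 16 initial states
ts = torch.linspace(0.0, 1.0, 20)  # observation times
# sdeint_adjoint backpropagates through the solve with the stochastic
# adjoint, so memory cost does not grow with the number of solver steps.
ys = torchsde.sdeint_adjoint(sde, y0, ts, dt=0.01)  # shape (20, 16, 4)
```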
We can train latent SDEs as an infinite-dimensional variational autoencoder, with the latent variable being the trajectory of the SDE. The observations can have any differentiable likelihood, and come at arbitrary times. pic.twitter.com/PHeCTNvtsi
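Concretely, the objective has roughly this form (a paraphrase, with notation assumed rather than taken verbatim from the paper): the approximate posterior is a second SDE with a learned drift, and Girsanov's theorem turns the KL between the two path measures into an integral along the trajectory.

```latex
% Sketch of the latent-SDE ELBO: prior drift h_\theta, posterior drift h_\phi,
% shared diffusion \sigma, observations x_{t_i} at arbitrary times t_i.
\mathcal{L} = \mathbb{E}_{q}\Big[\, \textstyle\sum_i \log p(x_{t_i} \mid z_{t_i})
    \;-\; \int_{t_0}^{t_1} \tfrac{1}{2} \lVert u(z_t, t) \rVert^2 \, dt \,\Big],
\qquad \sigma(z_t, t)\, u(z_t, t) = h_\phi(z_t, t) - h_\theta(z_t, t).
```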
One hard part was storing noise for re-use on the reverse pass. To allow adaptive solvers, we have to store an entire function drawn from Brownian motion. We show how to reconstruct a fixed function draw at any point with O(1) memory cost, based on Brownian Trees. pic.twitter.com/HA5J8a9hRG
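A minimal sketch of the trick (illustrative structure and names, not the paper's implementation): any value W(t) of a fixed sample path can be re-derived on demand by bisecting the interval with Brownian-bridge samples whose randomness comes from deterministically split seeds, so only the root seed is ever stored.

```python
import numpy as np

def brownian_tree(t, T=1.0, seed=0, depth=30):
    """Re-derive W(t) from a fixed Brownian sample path with O(1) memory."""
    rng = np.random.default_rng(seed)
    t0, w0 = 0.0, 0.0
    t1, w1 = T, rng.normal(scale=np.sqrt(T))   # endpoint W(T) ~ N(0, T)
    s = int(rng.integers(1 << 31))
    for _ in range(depth):
        tm = 0.5 * (t0 + t1)
        rng = np.random.default_rng(s)
        # Brownian bridge: W(tm) | W(t0), W(t1) ~ N((w0 + w1)/2, (t1 - t0)/4)
        wm = 0.5 * (w0 + w1) + rng.normal(scale=np.sqrt(0.25 * (t1 - t0)))
        s_left, s_right = (int(x) for x in rng.integers(1 << 31, size=2))
        if t < tm:                              # recurse into the half holding t
            t1, w1, s = tm, wm, s_left
        else:
            t0, w0, s = tm, wm, s_right
    return w0 + (w1 - w0) * (t - t0) / (t1 - t0)  # interpolate within the leaf
```

Because the seeds split deterministically, repeated queries at any set of times always land on the same underlying path, which is exactly what the reverse pass needs.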
For neural ODEs, continuous-time backprop had already been worked out. For SDEs, surprisingly, there was no analogous reverse-mode method. The algorithm ended up being a simple extension of the ODE method with fixed noise, a sort of continuous-time reparameterization trick. pic.twitter.com/hutnHNevOU
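For reference, the ODE adjoint this extends (Chen et al., 2018): the gradient comes from integrating an adjoint state backward in time along with z.

```latex
\frac{dz}{dt} = f(z, t, \theta), \qquad
a(t) := \frac{\partial L}{\partial z(t)}, \qquad
\frac{da}{dt} = -\,a(t)^{\top}\frac{\partial f}{\partial z}, \qquad
\frac{dL}{d\theta} = -\int_{t_1}^{t_0} a(t)^{\top}\frac{\partial f}{\partial \theta}\,dt.
```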
Training Neural SDEs: We worked out how to do scalable reverse-mode autodiff for stochastic differential equations. This lets us fit SDEs defined by neural nets with black-box adaptive higher-order solvers. https://arxiv.org/pdf/2001.01328.pdf With @lxuechen, @rtqichen, and @wongtkleonard. pic.twitter.com/qlUwMxezjO
David Duvenaud Retweeted
Want a sneak peek at what's next with Neural ODEs? I will be talking about Higher Order Autodiff in JAX. Work with @jacobjinkelly, @SingularMattrix, and @DavidDuvenaud! Program Transformations: https://program-transformations.github.io/ Sat. 14, Room West 114. Poster: 12:00-14:00. Talk: 17:00-17:20. pic.twitter.com/WEvaNij0Hy
We just released our code for training classifiers as energy-based models: https://wgrathwohl.github.io/JEM/ has scripts for training, generation, out-of-distribution detection, and running adversarial attacks. Also has a pretrained model for CIFAR10.
Finally, the class-conditional samples look pretty good, although not high-res (yet!). With @wgrathwohl, @kcjacksonwang, @jh_jacobsen, @Mo_Norouzi, and @kswersk. pic.twitter.com/LaIJEcuvgC
And also helps with out-of-distribution detection, enough to get rid of the strange SVHN results seen recently by @eric_nalisnick and others: pic.twitter.com/cgYTny2fYZ
It also improves calibration, especially in low-data settings: pic.twitter.com/ppglO6DDjD
Training to match the unlabeled data density regularizes the model in every way we measured. For instance, it improves adversarial robustness, and even optimizing images to maximize their class probability gives images that still look (somewhat) like that class: pic.twitter.com/kp2gHwtsPQ
Classifiers are secretly energy-based models! Every softmax giving p(c|x) has an unused degree of freedom, which we use to compute the input density p(x). This makes classifiers into generative models without changing the architecture. https://arxiv.org/abs/1912.03263 pic.twitter.com/IzMPxiNxFQ
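In code, the reinterpretation is a one-liner; here's a minimal sketch (the linear "classifier" and random inputs are stand-ins for any architecture and data):

```python
import torch

classifier = torch.nn.Linear(784, 10)   # stand-in for any architecture
x = torch.randn(32, 784)                # stand-in inputs

logits = classifier(x)
p_c_given_x = logits.softmax(dim=-1)      # discriminative view: p(c|x)
log_px = torch.logsumexp(logits, dim=-1)  # generative view: log p(x) up to a constant
energy = -log_px                          # E(x); the paper trains it with SGLD sampling
```

Both views share the same logits, which is why the architecture doesn't change.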
Biggest surprise: Approximating the training Hessian with the identity works pretty well. Caveats: Still need to choose optimization hypers, and have to worry about overfitting hyperparams. But that's better than relying on gradient-free optimization being unable to overfit.
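A minimal sketch of that identity-Hessian shortcut (toy objectives; the paper's full method replaces the identity with a truncated Neumann-series approximation of the inverse training Hessian):

```python
import torch

w = torch.randn(10, requires_grad=True)    # model parameters
lam = torch.randn(10, requires_grad=True)  # hyperparameters

train_loss = ((w - lam) ** 2).sum()        # toy training objective
val_loss = (w ** 2).sum()                  # toy validation objective

v = torch.autograd.grad(val_loss, w)[0]                       # dL_val/dw, treated as constant
g = torch.autograd.grad(train_loss, w, create_graph=True)[0]  # dL_train/dw, kept differentiable
# Implicit-function-theorem hypergradient with the Hessian approximated by I:
#   dL_val/dlam ~= -(dL_val/dw)^T * d^2 L_train / (dw dlam)
hypergrad = torch.autograd.grad(-(v * g).sum(), lam)[0]
```

The dataset-distillation result in the next tweet uses the same machinery, with the pixels of the synthetic training examples playing the role of lam.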
Gradient-based tuning of enough hyperparams gives nearly zero validation loss on large image classification and language models. We also distilled datasets into representative training examples by treating pixels as hyperparams. Here are a few distilled classes from CIFAR100: pic.twitter.com/IRYjJruZdA
We approximated the implicit function theorem to tune millions of hyperparameters. Now we can train data augmentation networks from scratch using gradients from the validation loss. https://arxiv.org/pdf/1911.02590.pdf With @JonLorraine and @PaulVicol. pic.twitter.com/BUVS4JSWPP
With @lrjconan, @liyuajia, @YSongStanford, Shenlong Wang, Charlie Nash, @williamleif, Raquel Urtasun, and Rich Zemel. A remaining open problem is achieving better order invariance when computing likelihoods. pic.twitter.com/U8zlvh1Dm1
I heard you like graphs, so we put a graph neural net in your graph generative model, so you can be invariant to order while you add edges to your graph. Scales to 5000 nodes. Paper: https://arxiv.org/abs/1910.00760 Code: https://github.com/lrjconan/GRAN pic.twitter.com/FFyuYqjy3q
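A heavily simplified sketch of the block-wise idea (block size 1, one message-passing step, plain Bernoulli edges instead of GRAN's mixture model; every name here is illustrative, not the released code):

```python
import torch
import torch.nn as nn

class EdgeScorer(nn.Module):
    """Score candidate edges from a newly added node to all placed nodes."""
    def __init__(self, dim=32):
        super().__init__()
        self.embed = nn.Linear(1, dim)
        self.msg = nn.Linear(dim, dim)
        self.score = nn.Linear(2 * dim, 1)

    def forward(self, adj, new):
        deg = adj.sum(dim=1, keepdim=True)      # crude degree features, (n, 1)
        h = torch.tanh(self.embed(deg))         # node embeddings, (n, dim)
        h = torch.tanh(h + adj @ self.msg(h))   # one message-passing step on the partial graph
        pair = torch.cat([h[new].expand_as(h), h], dim=-1)
        return self.score(pair).squeeze(-1)     # edge logits new -> all, (n,)

n = 6
adj = torch.zeros(n, n)
model = EdgeScorer()
with torch.no_grad():
    for new in range(1, n):                    # grow the graph one node at a time
        logits = model(adj, new)[:new]         # only edges to already-placed nodes
        edges = torch.bernoulli(torch.sigmoid(logits))
        adj[new, :new] = edges                 # write both directions to
        adj[:new, new] = edges                 # keep the graph undirected
```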