David Duvenaud

@DavidDuvenaud

Machine learning prof. Working on generative models, inference, & latent structure.

Joined July 2013

Tweets

  1. Jan 17

    This review on normalizing flows is excellent. It's full of clear writing, precise claims, and useful connections.

  2. Jan 11

    We think one reason no one did reverse-time adjoints for SDEs before was that it wasn’t shown that one can simulate another SDE in reverse to numerically reconstruct the original trajectory. We think we’re the first to demonstrate this theoretically and empirically.

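    As a rough sanity check of that claim, here is a toy NumPy illustration (my own, not the paper's method): an Euler-Maruyama path of a simple SDE with additive noise is stepped backwards from its final state, reusing the same noise increments, and the reconstruction error shrinks as the step size does. The drift, diffusion, and step counts below are arbitrary choices.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    f = lambda y: -y        # drift of a toy Ornstein-Uhlenbeck-style SDE
    sigma = 0.5             # constant (additive) diffusion

    def reconstruction_error(n_steps, T=1.0, y0=1.0):
        h = T / n_steps
        dW = rng.normal(0.0, np.sqrt(h), size=n_steps)

        # Forward Euler-Maruyama pass (trajectory stored only for comparison).
        y = np.empty(n_steps + 1)
        y[0] = y0
        for n in range(n_steps):
            y[n + 1] = y[n] + f(y[n]) * h + sigma * dW[n]

        # Reverse pass: step the dynamics backwards from the final state,
        # reusing the same noise increments in reverse order.
        y_hat, worst = y[-1], 0.0
        for n in reversed(range(n_steps)):
            y_hat = y_hat - f(y_hat) * h - sigma * dW[n]
            worst = max(worst, abs(y_hat - y[n]))
        return worst

    for n_steps in (100, 1000, 10000):
        print(n_steps, reconstruction_error(n_steps))   # error shrinks as h -> 0
    ```
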
  3. Jan 9

    Some related work: Tzen & gave continuous-time forward-mode derivatives: , but this is slow for large models. We’re now working on infinitely-deep Bayesian neural nets like ones proposed in

  4. Jan 9

    We’ll release a set of adaptive SDE solvers in PyTorch along with the code for the stochastic adjoint sensitivities soon. Caveat: It’s kind of slow. So far we scaled it to fitting 50D mocap data with 11000 params, but we just started.

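    For concreteness, a minimal sketch of defining and fitting such a model with what I believe became the torchsde package; the module, layer sizes, and solver arguments below are illustrative assumptions, and the sketch uses a fixed-step Euler solver rather than the adaptive solvers mentioned above.

    ```python
    import torch
    import torchsde

    class NeuralSDE(torch.nn.Module):
        noise_type = "diagonal"   # torchsde metadata: diagonal noise,
        sde_type = "ito"          # Ito interpretation

        def __init__(self, dim, hidden=64):
            super().__init__()
            self.drift_net = torch.nn.Sequential(
                torch.nn.Linear(dim, hidden), torch.nn.Tanh(), torch.nn.Linear(hidden, dim))
            self.diff_net = torch.nn.Sequential(
                torch.nn.Linear(dim, hidden), torch.nn.Tanh(), torch.nn.Linear(hidden, dim))

        def f(self, t, y):        # drift, shape (batch, dim)
            return self.drift_net(y)

        def g(self, t, y):        # diagonal diffusion, same shape as y
            return self.diff_net(y)

    sde = NeuralSDE(dim=3)
    y0 = torch.randn(16, 3)                  # batch of initial states
    ts = torch.linspace(0.0, 1.0, 20)        # observation times
    ys = torchsde.sdeint(sde, y0, ts, method="euler", dt=1e-2)   # (len(ts), batch, dim)
    # torchsde.sdeint_adjoint is the constant-memory stochastic-adjoint variant.
    loss = ys[-1].pow(2).mean()
    loss.backward()                          # gradients reach all drift/diffusion parameters
    ```
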
  5. Jan 9

    We can train latent SDEs as an infinite-dimensional variational autoencoder, with the latent variable being the trajectory of the SDE. The observations can have any differentiable likelihood, and come at arbitrary times.

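    A sketch of the objective I believe this refers to, in my own notation: the approximate posterior is an SDE with drift h_phi, the prior an SDE with drift h_theta, both sharing the diffusion sigma; by Girsanov's theorem the KL between the two path measures becomes an integral along the latent trajectory, giving the ELBO

    $$
    \mathcal{L}(\theta,\phi)=\mathbb{E}_{q_\phi}\!\left[\sum_{i}\log p_\theta\!\left(x_{t_i}\mid z_{t_i}\right)-\int_{0}^{T}\tfrac{1}{2}\,\big\lVert u(z_t,t)\big\rVert^{2}\,dt\right],
    \qquad
    u(z,t)=\sigma(z,t)^{-1}\big(h_\phi(z,t)-h_\theta(z,t)\big),
    $$

    so both the likelihood terms at the observation times and the penalty integral can be accumulated during a single SDE solve.
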
  6. Jan 9

    One hard part was storing noise for re-use on the reverse pass. To allow adaptive solvers, we have to store an entire function drawn from Brownian motion. We show how to reconstruct a fixed function draw at any point with O(1) memory cost, based on Brownian Trees.

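    A minimal NumPy sketch of the idea as I understand it (the function name and the splittable-seed scheme are mine): one seed fixes the whole Brownian path, and any W(t) is recomputed on demand by recursively bisecting the interval with Brownian-bridge draws, so nothing has to be cached between queries.

    ```python
    import numpy as np

    def brownian_tree_query(t, seed=0, t0=0.0, t1=1.0, tol=1e-4):
        """Evaluate a fixed Brownian-motion sample path W at time t in [t0, t1].

        The path is determined entirely by `seed`, so repeated queries (in any
        order) see the same path. Memory is O(1): intermediate values are
        recomputed by splitting the interval rather than stored."""
        rng = np.random.default_rng(seed)
        w0, w1 = 0.0, rng.normal(0.0, np.sqrt(t1 - t0))    # W(t0) = 0 and W(t1)
        node = 1                                           # index of the current tree node
        while (t1 - t0) > tol:
            tm = 0.5 * (t0 + t1)
            # Brownian bridge: the midpoint, conditioned on the endpoints, is Gaussian.
            mid_rng = np.random.default_rng([seed, node])  # deterministic per-node stream
            wm = 0.5 * (w0 + w1) + 0.5 * np.sqrt(t1 - t0) * mid_rng.standard_normal()
            if t < tm:
                t1, w1, node = tm, wm, 2 * node            # descend into the left child
            else:
                t0, w0, node = tm, wm, 2 * node + 1        # descend into the right child
        # Linearly interpolate inside the final (tolerance-sized) interval.
        return w0 + (t - t0) / (t1 - t0) * (w1 - w0)

    # The same times always map to the same values, regardless of query order.
    print(brownian_tree_query(0.3), brownian_tree_query(0.7), brownian_tree_query(0.3))
    ```
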
  7. Jan 9

    For neural ODEs, continuous-time backprop had already been worked out. For SDEs, surprisingly, there was no analogous reverse-mode method. The algorithm ended up being a simple extension of the ODE method with fixed noise, a sort of continuous-time reparameterization trick.

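    For reference, the continuous-time backprop in question is the adjoint method for neural ODEs: with dynamics dz/dt = f_theta(z, t) and a loss L on the end state, the adjoint a(t) = dL/dz(t) and the parameter gradient are obtained by solving, backwards in time alongside z(t),

    $$
    \frac{da(t)}{dt}=-\,a(t)^{\top}\frac{\partial f_\theta(z(t),t)}{\partial z},
    \qquad
    \frac{dL}{d\theta}=-\int_{t_1}^{t_0} a(t)^{\top}\frac{\partial f_\theta(z(t),t)}{\partial\theta}\,dt.
    $$

    As the tweet says, the SDE version then amounts to running an analogous backward computation with the sampled noise held fixed.
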
  8. Jan 9

    Training Neural SDEs: We worked out how to do scalable reverse-mode autodiff for stochastic differential equations. This lets us fit SDEs defined by neural nets with black-box adaptive higher-order solvers. With , and .

  9. Retweeted
    Dec 13, 2019

    Want a sneak peek at what's next with Neural ODEs? I will be talking about Higher Order Autodiff in JAX. Work with , , and ! Program Transformations: Sat. 14, Room West 114 Poster: 12-14:00 Talk: 17:00-17:20

  10. Dec 11, 2019

    We just released our code for training classifiers as energy-based models: has scripts for training, generation, out-of-distribution detection, and running adversarial attacks. Also has a pretrained model for CIFAR10.

  11. Dec 9, 2019

    Finally, the class-conditional samples look pretty good, although not high-res (yet!). With , , , , and .

  12. Dec 9, 2019

    And also helps with out-of-distribution detection enough to get rid of the strange SVHN results seen recently by and others:

  13. Dec 9, 2019

    It also improves calibration, especially in low-data settings:

  14. Dec 9, 2019

    Training to match the unlabeled data density regularizes the model in every way we measured. For instance, it improves adversarial robustness, and even optimizing images to maximize their class probability gives images that still look (somewhat) like that class:

  15. Dec 9, 2019

    Classifiers are secretly energy-based models! Every softmax giving p(c|x) has an unused degree of freedom, which we use to compute the input density p(x). This makes classifiers into generative models without changing the architecture.

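    Concretely, in my notation: a classifier with logits f_theta(x) in R^K defines p_theta(c|x) = softmax(f_theta(x))_c, which is unchanged if any per-input constant is added to all K logits; that free constant is the unused degree of freedom. Reusing the same logits as unnormalized joint log-densities gives

    $$
    p_\theta(x,c)=\frac{\exp f_\theta(x)[c]}{Z(\theta)},
    \qquad
    p_\theta(x)=\frac{\sum_{c}\exp f_\theta(x)[c]}{Z(\theta)},
    \qquad
    E_\theta(x)=-\log\sum_{c}\exp f_\theta(x)[c],
    $$

    so the marginal density of the inputs is an energy-based model whose energy is the negative LogSumExp of the logits, while p_theta(c|x) remains exactly the usual softmax.
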
  16. Nov 11, 2019

    Biggest surprise: Approximating the training Hessian with the identity works pretty well. Caveats: Still need to choose optimization hypers, and have to worry about overfitting hyperparams. But that's better than relying on gradient-free optimization being unable to overfit.

  17. Nov 11, 2019

    Gradient-based tuning of enough hyperparams gives nearly zero validation loss on large image classification and language models. We also distilled datasets into representative training examples by treating pixels as hyperparams. Here are a few distilled classes from CIFAR100:

  18. Nov 11, 2019

    We approximated the implicit function theorem to tune millions of hyperparameters. Now we can train data augmentation networks from scratch using gradients from the validation loss. With and

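    In my notation: with training loss L_T(w, lambda), validation loss L_V, and best response w*(lambda) = argmin_w L_T(w, lambda), the implicit function theorem gives the hypergradient

    $$
    \frac{d\mathcal{L}_V}{d\lambda}
    =\frac{\partial\mathcal{L}_V}{\partial\lambda}
    -\frac{\partial\mathcal{L}_V}{\partial w}
    \left[\frac{\partial^{2}\mathcal{L}_T}{\partial w\,\partial w^{\top}}\right]^{-1}
    \frac{\partial^{2}\mathcal{L}_T}{\partial w\,\partial\lambda^{\top}}
    \Bigg|_{w=w^{*}(\lambda)}.
    $$

    The "training Hessian ≈ identity" approximation mentioned two tweets above drops the inverse-Hessian factor, leaving a mixed second derivative that costs one vector-Jacobian product. A self-contained PyTorch sketch under that approximation (toy losses of my own; L_V has no direct dependence on the hyperparameter):

    ```python
    import torch

    # Toy setup: w is a model weight, lam a hyperparameter entering the training loss.
    w = torch.randn(5, requires_grad=True)
    lam = torch.randn(5, requires_grad=True)
    x = torch.randn(20, 5)

    L_train = ((x @ w - x @ lam) ** 2).mean()   # training loss, depends on w and lam
    L_val = ((x @ w - 1.0) ** 2).mean()         # validation loss, depends on w only

    v = torch.autograd.grad(L_val, w)                            # dL_V/dw
    g_w = torch.autograd.grad(L_train, w, create_graph=True)     # dL_T/dw, graph kept
    mixed = torch.autograd.grad(g_w, lam, grad_outputs=v)        # v^T d^2L_T/(dw dlam)
    hypergrad = -mixed[0]     # hypergradient with the inverse Hessian replaced by I
    print(hypergrad)
    ```
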
  19. Oct 3, 2019

    With , , , Shenlong Wang, Charlie Nash, , Raquel Urtasun, and Rich Zemel. A remaining open problem is achieving better order invariance when computing likelihoods.

  20. Oct 3, 2019

    I heard you like graphs, so we put a graph neural net in your graph generative model, so you can be invariant to order while you add edges to your graph. Scales to 5000 nodes. Paper: Code:

