We've spent almost a year learning and reading everything we can find from the last 50 years of work in this area. One of our messages is that there is no uniformly best gradient estimator. Fig 3 was insightful to us.
pic.twitter.com/dUi1edcQc0
There are three simple probabilistic tricks used to derive the three estimators in the paper. The first exploits the definition of the derivative of a logarithm, grad log p = grad p / p, to derive the score-function estimator. A minimal sketch follows below.
pic.twitter.com/jm43GaJC9L
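A minimal NumPy sketch of the score-function estimator, assuming for illustration a Gaussian measure N(theta, sigma^2) and the cost f(x) = x^2 (choices of mine, not taken from the paper's figures). Since grad_theta log N(x; theta, sigma^2) = (x - theta) / sigma^2, the estimator is the Monte Carlo average of f(x) times that score:

import numpy as np

def score_function_grad(theta, sigma=1.0, n_samples=100_000, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    x = rng.normal(theta, sigma, n_samples)   # x ~ N(theta, sigma^2)
    score = (x - theta) / sigma**2            # grad_theta log N(x; theta, sigma^2)
    return np.mean(x**2 * score)              # f(x) = x^2, averaged over samples

# E[x^2] = theta^2 + sigma^2, so the true gradient is 2 * theta.
print(score_function_grad(theta=1.5))  # approx 3.0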
Many continuous distributions have a sampling property that lets them be simulated by transforming samples from a simpler base distribution through a known sampling path. Differentiating through this path gives the pathwise estimator (often referred to as the reparameterisation trick). Sketch below.
pic.twitter.com/yfiIS32ygh
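The same toy problem with the pathwise estimator, a sketch under the same illustrative assumptions (Gaussian measure, f(x) = x^2). Writing x = theta + sigma * eps with eps ~ N(0, 1) and differentiating through the path gives the average of f'(x) * dx/dtheta:

import numpy as np

def pathwise_grad(theta, sigma=1.0, n_samples=100_000, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    eps = rng.standard_normal(n_samples)  # samples from the simpler base distribution
    x = theta + sigma * eps               # the known sampling path
    return np.mean(2.0 * x)               # f'(x) = 2x and dx/dtheta = 1

print(pathwise_grad(theta=1.5))  # approx 3.0, typically lower variance on this toy problem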
The derivative of a density can be expressed as the difference of two densities using its weak derivative, a measure-theoretic property. This allows us to derive the measure-valued gradient estimator. Sketch below.
pic.twitter.com/wTNlv3rVc8
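And the measure-valued gradient on the same toy problem. For the mean of a Gaussian, the weak derivative of the density decomposes into a positive and a negative part, which work out to Weibull(shape 2, scale sqrt(2)) perturbations of theta on either side, weighted by the constant 1 / (sigma * sqrt(2 * pi)); the Gaussian measure and f(x) = x^2 remain my illustrative choices:

import numpy as np

def measure_valued_grad(theta, sigma=1.0, n_samples=100_000, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    w = np.sqrt(2.0) * rng.weibull(2.0, n_samples)  # Weibull(shape 2, scale sqrt(2))
    c = 1.0 / (sigma * np.sqrt(2.0 * np.pi))        # constant from the weak-derivative split
    f = lambda x: x**2
    return c * np.mean(f(theta + sigma * w) - f(theta - sigma * w))

print(measure_valued_grad(theta=1.5))  # approx 3.0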
And these gradient estimators allow us to see extensions and connections to many areas, including probabilistic programming, optimal transport, Fourier analysis, quasi-Monte Carlo, Stein's identity, Malliavin calculus, control variates, queueing theory, ...
pic.twitter.com/yIFSamBDXN
We think there is something in here both for the person just starting off in this area and for the seasoned expert. Help us improve the paper by letting us know if we missed any important connections, could explain anything better, or made any errors.
Personally, I believe this might be the best paper I have been part of writing so far. We tried to be patient in our writing, and to use the principle that our duty is to uplift our readers. We hope we took a step in that direction. #writinggoals 
After a short delay, the code in a notebook to reproduce the graphs in section 3 of our paper (https://arxiv.org/abs/1906.10652) is online. More to come soon. See the thread above. https://github.com/deepmind/mc_gradients
Thanks @shakir_za, really great! BTW, by now we (Jingkai Mao @j_foerst @alshedivat @greg_far & @shimon8282) have an ICML paper proposing a variance reduction technique for any-order gradient estimation in stochastic computation graphs: http://proceedings.mlr.press/v97/mao19a/mao19a.pdf
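To give a feel for what variance reduction buys on the toy problem above, here is a generic sketch of a baseline control variate for the score-function estimator (not the any-order technique of the linked paper, just the standard baseline trick): since E[grad log p(x; theta)] = 0, subtracting a constant b from f(x) leaves the estimator unbiased while shrinking its variance.

import numpy as np

def score_function_grad_baseline(theta, sigma=1.0, n_samples=100_000, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    # Estimate the baseline b on an independent batch so the estimator stays unbiased.
    b = np.mean(rng.normal(theta, sigma, n_samples) ** 2)
    x = rng.normal(theta, sigma, n_samples)
    score = (x - theta) / sigma**2            # grad_theta log N(x; theta, sigma^2)
    return np.mean((x**2 - b) * score)        # unbiased because E[score] = 0

print(score_function_grad_baseline(theta=1.5))  # approx 3.0, with much lower variance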
Thanks! Will add.