A look at the evolution of generative AI since 2014: bit.ly/3CCmG2C
Yuxin Wu’s Tweets
Tips to avoid a >40x waste of RAM in PyTorch dataloaders, an issue that many have encountered.
Quote Tweet
Ever faced a "memory leak" in your @PyTorch data loader? (I did!) @ppwwyyxx has an amazing blog post demystifying RAM usage in multi-process data loaders. Check out his post: ppwwyyxx.com/blog/2022/Demy
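The gist of the RAM issue: dataloader workers inherit the dataset via fork, and merely reading Python objects updates their refcounts, which dirties copy-on-write pages — so a large list of Python objects gets duplicated in every worker. One common mitigation is to serialize all samples into a single contiguous numpy byte buffer plus an offset index. A minimal sketch of that idea (the class and names here are mine, not the blog's exact code):

```python
import pickle
import numpy as np

class SerializedList:
    """Hold a list of samples as one contiguous numpy byte buffer.

    Forked dataloader workers then share the buffer copy-on-write
    without per-object refcount updates dirtying memory pages.
    (Sketch of the approach, not the exact blog/detectron2 code.)
    """

    def __init__(self, samples):
        blobs = [pickle.dumps(s, protocol=pickle.HIGHEST_PROTOCOL)
                 for s in samples]
        sizes = np.array([len(b) for b in blobs], dtype=np.int64)
        self._offsets = np.cumsum(sizes)   # end offset of each blob
        self._buffer = np.frombuffer(b"".join(blobs), dtype=np.uint8)

    def __len__(self):
        return len(self._offsets)

    def __getitem__(self, i):
        start = 0 if i == 0 else int(self._offsets[i - 1])
        end = int(self._offsets[i])
        return pickle.loads(self._buffer[start:end].tobytes())

data = SerializedList([{"id": i, "label": i % 2} for i in range(1000)])
print(len(data), data[3])  # 1000 {'id': 3, 'label': 1}
```

Two numpy arrays are shared as a handful of pages regardless of how many workers fork, whereas a million small dicts would be "copied" once per worker.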
Tips for everyone: whenever you do ANYTHING slightly fancy at test time, always try recomputing the population statistics with a correct (non-moving) average. This tweet is one example, and Sec 3.3 of arxiv.org/abs/2105.07576 lists a few more.
github.com/facebookresear is a good impl
Quote Tweet
The solution: After interpolating between BatchNorm networks, we need to reset the normalization statistics (running_mean and running_var). Once this is done, the barrier is immediately even smaller than that of the LayerNorm networks used in the paper.
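The tip above boils down to: after any test-time surgery on the model, replace the exponential moving average accumulated during training with true population statistics computed over the whole dataset. A framework-independent numpy sketch of the difference (function name and data are mine):

```python
import numpy as np

def recompute_population_stats(batches):
    """True per-channel population statistics over ALL batches,
    instead of a moving average biased toward recent batches.
    `batches` is an iterable of (N, C) arrays. Sketch only."""
    total, sum_x, sum_x2 = 0, 0.0, 0.0
    for x in batches:
        total += x.shape[0]
        sum_x = sum_x + x.sum(axis=0)
        sum_x2 = sum_x2 + (x ** 2).sum(axis=0)
    mean = sum_x / total
    var = sum_x2 / total - mean ** 2   # E[x^2] - E[x]^2
    return mean, var

rng = np.random.default_rng(0)
batches = [rng.normal(loc=2.0, scale=3.0, size=(64, 8)) for _ in range(50)]
mean, var = recompute_population_stats(batches)
# mean ≈ 2.0 per channel, var ≈ 9.0 per channel
```

In PyTorch this roughly corresponds to calling `reset_running_stats()` on the BatchNorm layers and running forward passes in train mode with `momentum=None`, which switches the running estimates to a cumulative (non-moving) average.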
Recognized for "selfless contribution to other members of the computer vision community". WELL DONE to Ross Girshick, Ilija Radosavovic, Alexander Kirillov, Georgia Gkioxari, Francisco Massa, Wan-Yen Lo, Piotr Dollár and Kaiming He
Quote Tweet
Congratulations to the Detectron team Ross Girshick, Yuxin Wu, Ilija Radosavovic, Alexander Kirillov, Georgia Gkioxari, Francisco Massa, Wan-Yen Lo, Piotr Dollár, & Kaiming He for winning the PAMI Everingham Prize at #ICCV2021. Learn about the platform: ai.facebook.com/tools/detectro
This paper gives me anxiety. BatchNorm is the most deviously, subtly complex layer in deep learning. Many issues (silently) root-cause to it. Yet it is ubiquitous because it works well (it simultaneously helps optimization and regularization) and can be fused to affines at inference time.
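On "fused to affines at inference time": with frozen statistics, BatchNorm is just a per-channel affine map, y = gamma * (x - mean) / sqrt(var + eps) + beta, so it folds into the weights of the preceding linear/conv layer. A numpy sketch for a linear layer (function and variable names are mine):

```python
import numpy as np

def fuse_linear_bn(W, b, gamma, beta, mean, var, eps=1e-5):
    """Fold an inference-time BatchNorm into the preceding linear layer.
    BN is affine once its statistics are frozen, so BN(Wx + b)
    collapses into a single (W_fused, b_fused). Sketch only."""
    scale = gamma / np.sqrt(var + eps)   # per-output-channel scale
    W_fused = W * scale[:, None]         # scale each output row of W
    b_fused = (b - mean) * scale + beta
    return W_fused, b_fused

rng = np.random.default_rng(1)
W, b = rng.normal(size=(4, 3)), rng.normal(size=4)
gamma, beta = rng.normal(size=4), rng.normal(size=4)
mean, var = rng.normal(size=4), rng.uniform(0.5, 2.0, size=4)

x = rng.normal(size=(5, 3))
y_ref = gamma * ((x @ W.T + b) - mean) / np.sqrt(var + 1e-5) + beta
W_f, b_f = fuse_linear_bn(W, b, gamma, beta, mean, var)
print(np.allclose(x @ W_f.T + b_f, y_ref))  # True
```

The same folding applies to conv layers (scale each output-channel filter), which is why BN adds zero inference cost — and also why its training-time complexity is so easy to forget.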
Out of boredom I made a little game that can waste 5 to 10 minutes of your time. If the response is good, I'll add a few new Worlds and make a full version.
#detectron2 in action.
Quote Tweet
Testing out #detectron2 by @facebookai; was dazzled by the precision level and data richness of the model. Every frame is beautifully captured by @erikconover feat. @saradietschy and a $3500 #Manhattan apt.
@cwillycs @PyTorch #NYC #4k @ComputerVisions #imagesegmentation
The paper uses Horovod, which means the TensorFlow part of it runs in a single process and does nothing about scaling. It's Horovod that scales well.
Quote Tweet
TensorFlow scales pretty well. Work done on the Summit supercomputer at Oak Ridge. 27,600 V100 GPUs, near-linear scaling. twitter.com/NandoDF/status…
Thanks! Wondering how long it would take for the community to break it... (fingers crossed)
Quote Tweet
FAIR has released code for the robust ImageNet model by Cihang Xie et al: github.com/facebookresear
Experiment results reproduced in 3 different frameworks: github.com/ppwwyyxx/Group
Quote Tweet
This figure from group norm (arxiv.org/abs/1803.08494) is super useful for anyone trying to keep track of how all these things relate:
With the help of Group Normalization, MaskRCNN can now be trained WITHOUT ImageNet pretrained models and still outperform BatchNorm-based models. github.com/facebookresear
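The reason GroupNorm enables training from scratch is that it normalizes each sample independently — over the channels within a group and the spatial dimensions — so its statistics never depend on batch size, unlike BatchNorm. A numpy sketch of the computation (without the learnable gamma/beta; names are mine):

```python
import numpy as np

def group_norm(x, num_groups, eps=1e-5):
    """Group Normalization over an (N, C, H, W) array: each sample's
    channels are split into groups and normalized within the group,
    so statistics never involve the batch dimension. Sketch only."""
    n, c, h, w = x.shape
    g = x.reshape(n, num_groups, c // num_groups, h, w)
    mean = g.mean(axis=(2, 3, 4), keepdims=True)
    var = g.var(axis=(2, 3, 4), keepdims=True)
    return ((g - mean) / np.sqrt(var + eps)).reshape(n, c, h, w)

x = np.random.default_rng(2).normal(size=(2, 8, 4, 4))
y_full = group_norm(x, num_groups=4)
y_single = group_norm(x[:1], num_groups=4)  # batch of one
print(np.allclose(y_full[:1], y_single))    # True: batch-size independent
```

With `num_groups=1` this reduces to LayerNorm over (C, H, W); with `num_groups=C` it reduces to InstanceNorm — the spectrum the group-norm figure above lays out.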
I just published “Unawareness of Deep Learning Mistakes”
Everything I know about design of ConvNets (resnets, bigger=better, batchnorms etc) is useless in RL. Superbasic 4-layer ConvNets work best.
Comparison of different GAN variants. I couldn't fit the full version in our paper.
Verifying myself: I am ppwwyyxx on Keybase.io. nxS9zCIY8f6rICwBnkRPLLn7wQ4o6kfFqmtz / keybase.io/ppwwyyxx/sigs/
Surprised to see someone talking about a paper I worked on..
The implementation is a monstrosity, but works: Type-safe runtime-indexed tuples for C++14
Frances Foong Yao's Erdős number is clearly 1... yet academic.research.microsoft.com/SilverlightIns somehow puts an extra hop between her and Graham..
Before writing a blog post, we discuss the topic with each other. I ask him beforehand whether he knows about something; if he doesn't, I figure most people don't either, and decide to write it. He also gives me problems beforehand; if I solve them all, he figures everyone can solve them, and decides not to write it up. // So sad, but at least I came in handy.