
ICYMI: PyTorch 1.10 was released last Thursday. Here are some highlights of the release. Stay tuned for tweet threads over the next couple of weeks delving deeper into these cool new features! 1/9
CUDA Graphs are now in beta, and allow you to capture (and replay!) static CUDA workloads without needing to relaunch kernels, leading to massive overhead reductions! Our integration allows for seamless interop between CUDA graphs and the rest of your model. 2/9
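A minimal sketch of the capture-and-replay workflow, using the `torch.cuda.CUDAGraph` and `torch.cuda.graph` APIs from the release. The workload and sizes here are illustrative, and the snippet skips gracefully on machines without a CUDA device:

```python
import torch

if torch.cuda.is_available():
    static_in = torch.randn(64, 64, device="cuda")
    static_out = torch.empty_like(static_in)

    # Warm up on a side stream so capture doesn't see first-run allocations.
    s = torch.cuda.Stream()
    s.wait_stream(torch.cuda.current_stream())
    with torch.cuda.stream(s):
        static_out.copy_(torch.mm(static_in, static_in).relu())
    torch.cuda.current_stream().wait_stream(s)

    # Capture the static workload into a graph.
    g = torch.cuda.CUDAGraph()
    with torch.cuda.graph(g):
        static_out.copy_(torch.mm(static_in, static_in).relu())

    # Replay: copy fresh data into the static input, then relaunch the
    # whole captured workload with a single call instead of per-kernel launches.
    static_in.copy_(torch.randn(64, 64, device="cuda"))
    g.replay()
```

Because capture fixes tensor addresses, inputs must be copied into the same static buffers before each replay.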
FX, an easy-to-use platform for writing Python-to-Python transforms of PyTorch programs, is now stable. FX makes it easy to programmatically do things like fusing convolution w/ batch norm. Stay tuned for some FX examples of cool things that users have built! 3/9
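A tiny Python-to-Python transform in that spirit (the conv + batch-norm fusion mentioned above needs more machinery; this assumed toy example just rewrites `torch.add` into `torch.mul` to show the trace/transform/recompile loop):

```python
import torch
import torch.fx

class AddModule(torch.nn.Module):
    def forward(self, x, y):
        return torch.add(x, y)

# Symbolically trace the module into an FX Graph...
traced = torch.fx.symbolic_trace(AddModule())

# ...then transform it: rewrite every call to torch.add into torch.mul.
for node in traced.graph.nodes:
    if node.op == "call_function" and node.target is torch.add:
        node.target = torch.mul
traced.graph.lint()   # sanity-check the mutated graph
traced.recompile()    # regenerate the module's Python code

x, y = torch.tensor(3.0), torch.tensor(4.0)
assert traced(x, y).item() == 12.0  # the rewritten module now multiplies
```

`traced` is still an `nn.Module`, so the transformed result drops back into a model like any other submodule.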
nn.Module parametrization (moving from Beta to Stable) allows you to implement reparametrizations in a user-extensible manner. For example, you can apply spectral normalization or enforce that the parameter is orthogonal! See twitter.com/rasbt/status/1 for an example. 4/9
Quote Tweet
I always think that PyTorch is already so feature-rich and polished, what could they change and/or add? A neat addition is the "parametrize" module. Below, a quick example creating a custom layer. However, the really cool use cases are applying it to larger modules, of course! twitter.com/PyTorch/status…
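A minimal sketch of a user-defined parametrization via `torch.nn.utils.parametrize` (the symmetric-weight constraint here is an assumed example, mirroring the style of the official tutorial rather than the quoted tweet's exact code):

```python
import torch
import torch.nn.utils.parametrize as parametrize

class Symmetric(torch.nn.Module):
    # A parametrization maps the raw stored parameter to the tensor the
    # layer actually uses; this one forces the weight to be symmetric.
    def forward(self, W):
        return W.triu() + W.triu(1).transpose(-1, -2)

layer = torch.nn.Linear(4, 4)
parametrize.register_parametrization(layer, "weight", Symmetric())

# layer.weight is now recomputed from the raw parameter on every access,
# so the constraint holds throughout training with no extra bookkeeping.
assert torch.allclose(layer.weight, layer.weight.T)
```

Built-in parametrizations such as `torch.nn.utils.parametrizations.spectral_norm` and `orthogonal` plug into the same mechanism.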
We have also turned conjugation on complex tensors from an O(N) copy into a constant-time operation (just like transpose)! This allows us to fuse conjugation with other PyTorch operators, like matmuls, for up to a 50% speedup and 30% memory savings! 5/9
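A quick sketch of what "constant time" means here: `conj()` now returns a lazy view that shares storage with the input and just sets a conjugate bit, which downstream kernels read:

```python
import torch

z = torch.randn(4, dtype=torch.complex64)
zc = z.conj()

# No O(N) copy: the result is a view over the same storage,
# flagged as conjugated.
assert zc.is_conj()
assert zc.data_ptr() == z.data_ptr()

# Values still behave as a mathematical conjugate.
assert torch.allclose(zc.imag, -z.imag)
```

Operators like matmul can consume the flagged tensor directly (e.g. `a.conj() @ a`) without ever materializing the conjugated copy.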
A new LLVM-based JIT compiler is now available for CPUs that can fuse together sequences of PyTorch ops to improve performance. While we’ve had this ability for some time on GPUs, this release brings this capability to CPUs. In certain cases this can bring massive speedups! 6/9
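The kind of code that benefits is a scripted chain of pointwise ops, which the fuser can collapse into a single kernel after profiling warm-up runs. Whether the LLVM backend actually kicks in depends on how your PyTorch build was compiled, so treat this as a fusion candidate, not a guarantee:

```python
import torch

# A chain of pointwise ops: a classic fusion candidate for the
# TorchScript fuser. The scripted function is correct either way.
@torch.jit.script
def fused_candidate(x):
    return (x * 1.5 + 2.0).tanh() * x.sigmoid()

x = torch.randn(1024)
for _ in range(3):  # warm-up runs let the profiling executor specialize & fuse
    y = fused_candidate(x)

assert torch.allclose(y, (x * 1.5 + 2.0).tanh() * x.sigmoid())
```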
NNAPI (which was moved from prototype to beta) allows PyTorch on Android to leverage specialized hardware, such as GPUs and other accelerator chips, to speed up neural networks. Since the prototype release, we’ve been hard at work adding more op coverage and other goodies. 7/9
One pain point for using torch.jit.script is that it often requires type annotations for compilation to succeed. Now, we’ve enabled profile-directed typing for torch.jit.script by leveraging existing tools like MonkeyType, making scripting much easier. 8/9
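For context, here is the baseline, fully annotated form that torch.jit.script has historically needed; without hints like `List[torch.Tensor]` and `int`, inference could fail to compile the function. Per the release, profile-directed typing with MonkeyType can infer such annotations from sample executions, making the explicit hints optional (this sketch shows only the annotated baseline):

```python
import torch
from typing import List

# Explicit annotations like these are the historical requirement that
# profile-directed typing is designed to fill in automatically.
@torch.jit.script
def total_elems(ts: List[torch.Tensor]) -> int:
    n = 0
    for t in ts:
        n += t.numel()
    return n

assert total_elems([torch.ones(2, 2), torch.ones(3)]) == 7
```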