Pinned Tweet
1/ Why do wide, random neural networks form Gaussian processes, *regardless of architecture*? Let me give an overview in case you are too lazy to check out the paper https://arxiv.org/abs/1910.12478 or the code https://github.com/thegregyang/GP4A …. The proof has two parts… pic.twitter.com/cKtfpRGMQd
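The claim in the pinned tweet is easy to probe empirically. Here is a minimal, hypothetical sketch (not the paper's code): sample the scalar output of a wide, randomly initialized ReLU MLP over many weight draws and check that the output distribution looks Gaussian via its skewness and excess kurtosis.

```python
# Hypothetical check of "wide random nets are GPs": at a fixed input, the
# output over random weight draws should be approximately Gaussian.
import numpy as np

def random_mlp_output(x, width, rng):
    """One forward pass of a 2-hidden-layer ReLU MLP at random init,
    with 1/sqrt(fan_in) scaling so activations stay O(1)."""
    d = x.shape[0]
    W1 = rng.standard_normal((width, d)) / np.sqrt(d)
    W2 = rng.standard_normal((width, width)) / np.sqrt(width)
    w3 = rng.standard_normal(width) / np.sqrt(width)
    h = np.maximum(W1 @ x, 0.0)
    h = np.maximum(W2 @ h, 0.0)
    return w3 @ h

rng = np.random.default_rng(0)
x = rng.standard_normal(10)
samples = np.array([random_mlp_output(x, width=512, rng=rng)
                    for _ in range(2000)])

# A Gaussian has skewness 0 and excess kurtosis 0; at large width the
# sampled outputs should be close to both.
z = (samples - samples.mean()) / samples.std()
skew = np.mean(z**3)
excess_kurt = np.mean(z**4) - 3
print("skew ~", skew, " excess kurtosis ~", excess_kurt)
```

At width 512 both statistics should be near zero up to sampling noise; shrinking the width makes the non-Gaussian corrections visible.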
Greg Yang Retweeted
With projects ranging in focus from healthcare to gaming to server workload management, 2019-20 Microsoft AI residents are pushing forward real-world applications in artificial intelligence. Learn more about the program and apply for the class of 2020-21: https://aka.ms/AA75lzx
Greg Yang Retweeted
Dr. @JianfengGao0217 leads the Deep Learning Group at Microsoft Research. Hear how recent advances in training data and infrastructure have allowed Dr. Gao’s team to work on applications for deep learning in language, vision, and video on the #MSRPodcast: https://aka.ms/AA75q4r
Greg Yang Retweeted
This is a great question that I've gotten periodically. Previously it would have taken too long to put something together, but using Neural Tangents (http://github.com/google/neural-tangents …) it's really easy and fast! Here is the reproduction in a colab: https://colab.sandbox.google.com/github/google/neural-tangents/blob/master/notebooks/phase_diagram.ipynb …https://twitter.com/DigantaMisra1/status/1220571913904242688 …
Greg Yang Retweeted
With TensorWatch, researchers seek to help other researchers/engineers get much-needed info about the state of their ML systems. The open-source debugging and visualization tool works in Jupyter Notebook to perform key analysis tasks for models and data: https://aka.ms/AA5f5cf
Greg Yang Retweeted
Research on the Neural Tangent Kernel (NTK) almost exclusively uses a non-standard neural network parameterization, where activations are divided by sqrt(width), and weights are initialized to have variance 1 rather than variance 1/width.
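The two parameterizations contrasted in this tweet can be illustrated in a few lines (a hypothetical sketch, not the thread's code): at initialization they produce identical pre-activation distributions, and differ only in where the 1/sqrt(fan_in) factor lives, which changes the gradients with respect to the weights.

```python
# Standard vs. NTK parameterization of a linear layer, compared at init.
import numpy as np

rng = np.random.default_rng(0)
fan_in, width, n_draws = 100, 200, 500
x = rng.standard_normal(fan_in)

std_acts, ntk_acts = [], []
for _ in range(n_draws):
    # Standard parameterization: W ~ N(0, 1/fan_in); layer computes W @ x.
    W_std = rng.standard_normal((width, fan_in)) / np.sqrt(fan_in)
    # NTK parameterization: W ~ N(0, 1); layer computes (W @ x)/sqrt(fan_in).
    W_ntk = rng.standard_normal((width, fan_in))
    std_acts.append(W_std @ x)
    ntk_acts.append(W_ntk @ x / np.sqrt(fan_in))

v_std = np.var(np.concatenate(std_acts))
v_ntk = np.var(np.concatenate(ntk_acts))
# Same pre-activation variance at init; but the gradient w.r.t. W differs
# by a factor of sqrt(fan_in), so finite-learning-rate training dynamics
# differ between the two parameterizations.
print(v_std, v_ntk)
```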
Greg Yang Retweeted
Super disgusted by people working on privacy-related technologies who knowingly misrepresent "their" technology to the employer or policymakers hoping for a faster or wider adoption. Such people make matters strictly worse for everyone (except possibly the company they work for).
Joint work with these awesome @UCBerkeley algebraicists and geometers Justin Chen, Christopher Eur, and Mengyuan Zhang!
Learnability (VC dim) is a *topological property*, as I proved in https://arxiv.org/pdf/1701.02302.pdf … for parity, conjunctions, poly threshold fctns. Now this extends to downward-closed classes, conjunction of parities, and k-CNFs, as well! Just how far does this go? https://arxiv.org/abs/1909.02159 pic.twitter.com/zbN0C3YB71
Greg Yang Retweeted
Text-based games provide a platform to train RL agents that generate goal-driven language. The Jericho framework by @mhauskn, @rajammanabrolu, @Cote_Marc & @ericxyuan provides benchmarks for scaling RL to combinatorially sized language action spaces: https://aka.ms/AA706nm #AAAI2020
Did u know? The NTK & GP kernels of an MLP on the sphere have the form K(x, y) = F(x.y/dim) for scalar F. Each (eigenvalue*multiplicity) of K goes to F^{(k)}(0)/k! as dim -> infty, over unif dist on the sphere (or the bool cube, also extends to Gaussian) https://arxiv.org/abs/1907.10599 pic.twitter.com/Pidjz0RAms
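A hypothetical numeric check of this claim (not the paper's code): take F(t) = exp(t), so F^(k)(0)/k! = 1/k!, diagonalize the kernel operator on a small boolean cube {-1,+1}^d, and compare each eigenvalue times its multiplicity against 1/k!. Degree-k parity functions give an eigenvalue of multiplicity C(d, k), so the sorted spectrum comes in blocks of sizes 1, d, C(d,2), ….

```python
# Check eigenvalue*multiplicity -> F^(k)(0)/k! for K(x,y) = exp(x.y/d)
# on the boolean cube under the uniform distribution, at modest d.
import itertools
import math
import numpy as np

d = 10
cube = np.array(list(itertools.product([-1.0, 1.0], repeat=d)))  # 2^d points
G = cube @ cube.T / d                 # inner products x.y / d
K = np.exp(G) / len(cube)             # kernel integral operator, uniform dist
eigs = np.sort(np.linalg.eigvalsh(K))[::-1]

start = 0
for k in range(3):
    mult = math.comb(d, k)
    lam = eigs[start]                 # representative of the degree-k block
    print(f"k={k}: eigenvalue*multiplicity = {lam * mult:.3f}, "
          f"F^(k)(0)/k! = {1 / math.factorial(k):.3f}")
    start += mult
```

At d = 10 the products already sit within about 10% of 1/k! for small k; the agreement tightens as d grows, matching the dim -> infty statement.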
Greg Yang Retweeted
Microsoft AirSim, now on Unity, provides an open-source system for training autonomous systems. Faster and safer than training these systems in the real world, explore how AirSim utilizes multiple learning methods to create a realistic environment: http://aka.ms/AA36fgq
Greg Yang Retweeted
A few weeks ago, @MSFTResearch published some interesting research on how smiles can serve as feedback for machine learning. I don't think it got enough attention. https://www.the-vital-edge.com/smiles-and-artificial-intelligence/ … #MachineLearning #ReinforcementLearning @akapoor_av8r @danmcduff
Neural networks tend to Gaussian processes (GPs) as their widths tend to infinity --- now you can play with these GP kernels in @GoogleColab! Try out RNN-GP, GRU-GP, Transformer-GP, or Batchnorm-GP today! Repo: https://github.com/thegregyang/GP4A … Colab Entry Point: https://colab.research.google.com/github/thegregyang/GP4A/blob/master/colab/RNN.ipynb …
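For a minimal flavor of what such infinite-width GP kernels look like (this sketch is not from the GP4A repo), here is the classical closed-form kernel recursion for a deep erf-activation MLP, due to Williams; the architectures in the repo generalize this layerwise recursion idea.

```python
# Infinite-width GP kernel of a deep erf-activation MLP, computed by the
# closed-form layerwise recursion (Williams, 1998). Hypothetical sketch.
import numpy as np

def erf_mlp_gp_kernel(X, depth=3, w_var=1.0, b_var=0.1):
    """Kernel matrix of the GP limit of a depth-`depth` erf MLP on rows of X."""
    d = X.shape[1]
    K = w_var * (X @ X.T) / d + b_var          # input-layer covariance
    for _ in range(depth):
        diag = np.diag(K)
        denom = np.sqrt((1 + 2 * diag)[:, None] * (1 + 2 * diag)[None, :])
        # E[erf(u) erf(v)] for jointly Gaussian (u, v) with covariance K:
        K = w_var * (2 / np.pi) * np.arcsin(2 * K / denom) + b_var
    return K

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 8))
K = erf_mlp_gp_kernel(X)
print(np.round(K, 3))   # symmetric positive semi-definite 5x5 matrix
```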
It was a pleasure to visit the Physics department of Taiwan National University! What a beautiful place! https://twitter.com/yjkao/status/1210445658668191744 …
15/ The tensor program framework is very powerful and has many other consequences, such as the universality of Neural Tangent Kernels. If you would like to know more, check out https://arxiv.org/abs/1902.04760 pic.twitter.com/gNCL5vZEpx
14/ Finally, we verify that our theory becomes more and more accurate as the width of the network increases. Here we measure the relative Frobenius norm distance between the empirical kernel and the infinite-width theoretical kernel, and we see that the deviation tends to 0 like 1/sqrt(width). pic.twitter.com/NbOIlejwCq
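A hypothetical version of this measurement (not the thread's exact experiment) uses a one-hidden-layer ReLU network, whose infinite-width kernel has a closed form (the order-1 arc-cosine kernel), and tracks the relative Frobenius error of the empirical kernel as width grows.

```python
# Relative Frobenius error between the empirical ReLU feature kernel and
# its analytic infinite-width limit, across several widths.
import numpy as np

def arccos_kernel(X):
    """E_w[relu(w.x) relu(w.y)] for w ~ N(0, I): arc-cosine kernel, order 1."""
    norms = np.linalg.norm(X, axis=1)
    cos = np.clip((X @ X.T) / np.outer(norms, norms), -1.0, 1.0)
    theta = np.arccos(cos)
    return np.outer(norms, norms) / (2 * np.pi) * (np.sin(theta)
                                                   + (np.pi - theta) * cos)

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 5))
K = arccos_kernel(X)

errs = {}
for width in (64, 256, 1024, 4096):
    W = rng.standard_normal((width, X.shape[1]))
    feats = np.maximum(X @ W.T, 0.0)        # ReLU features, shape (8, width)
    K_hat = feats @ feats.T / width         # empirical kernel
    errs[width] = np.linalg.norm(K_hat - K) / np.linalg.norm(K)
print(errs)  # error should roughly halve each time width quadruples
```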
13/ These are examples of kernels corresponding to the infinite-width limit of other architectures: GRU, transformer, batchnorm. pic.twitter.com/QAQCfWzUOz
12/ Again, see https://twitter.com/TheGregYang/status/1202608248534077440?s=20 … for an outline of the proof.
11B/ While at face value this may seem limited, one can in fact express almost all modern architectures in this framework: resnet, transformer, LSTM, etc. Here are some simple examples of tensor programs expressing NN computation. pic.twitter.com/ObYofpPgIw