Tweets (@ngutten)
Attention mechanisms for nnets may not always achieve state of the art results, but they seem to always generate cool things to look at...
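The "cool things to look at" are typically the attention weight maps themselves. As a minimal sketch of what produces them, here is scaled dot-product attention in pure Python (the function names and toy inputs are mine, not from the tweet):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention(queries, keys, values):
    """Scaled dot-product attention: each output is a weighted
    average of the values, weighted by softmax(q.k / sqrt(d))."""
    d = len(keys[0])
    outputs, weights_all = [], []
    for q in queries:
        scores = [dot(q, k) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        weights_all.append(w)
        outputs.append([sum(wi * v[j] for wi, v in zip(w, values))
                        for j in range(len(values[0]))])
    return outputs, weights_all

# One query attending over three key/value pairs; the weight
# vector w is the thing people visualize as an attention map.
q = [[1.0, 0.0]]
k = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
v = [[1.0], [2.0], [3.0]]
out, w = attention(q, k, v)
```

The query aligns best with the first key, so the first weight dominates and the output is pulled toward the first value.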
Conclusion, obviously: it needs more niches.
Ran CMA-ES for a day, looked at the trajectory of the parameters, and in the end it was basically a line. Sort of disappointing...
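One way to quantify "basically a line" is to run an evolution strategy, record the parameter trajectory, and measure how far the points stray from the chord between start and end. This is a toy (1+1)-ES on a sphere objective, not the author's CMA-ES run; all names and settings are illustrative:

```python
import random

def sphere(x):
    return sum(xi * xi for xi in x)

def es_trajectory(dim=2, steps=300, sigma=0.3, seed=0):
    """Toy (1+1) evolution strategy (a stand-in for CMA-ES):
    propose a Gaussian perturbation, keep it if it improves.
    Returns the full trajectory of the search point."""
    rng = random.Random(seed)
    x = [5.0] * dim
    traj = [list(x)]
    for _ in range(steps):
        cand = [xi + rng.gauss(0.0, sigma) for xi in x]
        if sphere(cand) <= sphere(x):
            x = cand
        traj.append(list(x))
    return traj

def max_deviation_from_line(traj):
    """Max perpendicular distance of 2-D trajectory points from
    the straight line joining the first and last points."""
    (x0, y0), (x1, y1) = traj[0], traj[-1]
    dx, dy = x1 - x0, y1 - y0
    norm = (dx * dx + dy * dy) ** 0.5
    return max(abs(dy * (px - x0) - dx * (py - y0)) / norm
               for px, py in traj)

traj = es_trajectory()
dev = max_deviation_from_line(traj)
```

A small `dev` relative to the trajectory length is the "disappointing" straight-line behavior the tweet describes.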
At what level of cognition is music evocative?
I guess part of the surprise is that this is like the decoder half of a variational autoencoder. Generally one might expect a VAE to flag outlier data in the latents with anomalously large values, but this goes the other way, with out-of-distribution inputs having small norms.
Curious phenomenon when using per-datapoint embeddings (where the training set is 'train embeddings and network' and the validation set is 'train embeddings only'). The standard deviation of the validation embeddings seems to be significantly smaller.
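The shrinkage itself is easy to measure once you have the two embedding matrices. A minimal sketch of that comparison (toy hand-made vectors here, not embeddings from a trained model):

```python
def embedding_std(embs):
    """Mean per-dimension standard deviation of a set of
    embeddings (a list of equal-length vectors)."""
    n, d = len(embs), len(embs[0])
    stds = []
    for j in range(d):
        col = [e[j] for e in embs]
        mu = sum(col) / n
        var = sum((x - mu) ** 2 for x in col) / n
        stds.append(var ** 0.5)
    return sum(stds) / d

# Toy illustration: a 'validation' set with the same mean but
# smaller spread, mimicking the reported shrinkage.
train_embs = [[1.0, -1.0], [-1.0, 1.0], [2.0, -2.0], [-2.0, 2.0]]
val_embs = [[0.5, -0.5], [-0.5, 0.5]]
```

Comparing `embedding_std(val_embs)` against `embedding_std(train_embs)` is the kind of diagnostic that would surface the effect described in the tweet.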
Bonus question - does the lack of invariances in a latent code limit the scalability of using it as a representation? For example, the difference in expressiveness of StyleGAN and BigGAN. 5/
How could we design it better? For example, does something like K-Nearest-Neighbors have this issue? Does MAML escape it because it has access to the weight space? 4/
I'm wondering if these two limits are compatible. For few shot stuff I've tried in the past, there's not much improvement beyond a few hundred data points. Is this because the task representation basically becomes dense and you can't escape central limit theorem? 3/
But there's another view of within-lifetime learning that is intuitively more additive. That is, we add 'new' skills, facts, etc. This would be something like the infinite-dimensional version of the above - each new piece of evidence doesn't intersect with previous data 2/
On few-shot/meta-learning... One method basically consists of learning a fixed-size latent code to represent a particular task out of the meta-learned distribution. This is roughly 'inferring the goal', and should converge like repeated measurements of a value (central limit theorem) 1/
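The "repeated measurements" intuition from the thread can be made concrete: averaging n noisy observations of a fixed value gives an error that shrinks like 1/sqrt(n), which is why gains flatten out after a few hundred data points. A hedged toy simulation (true value, noise scale, and trial counts are arbitrary choices of mine):

```python
import random

def estimation_error(n, trials=2000, seed=0):
    """Average absolute error of the sample mean of n noisy
    'measurements' of a true value (here 0, with unit noise).
    By the central limit theorem this scales like 1/sqrt(n)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        est = sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n
        total += abs(est)
    return total / trials

e10 = estimation_error(10)      # ~0.25
e1000 = estimation_error(1000)  # ~0.025, roughly 10x smaller
```

Going from 10 to 1000 samples (100x the data) only buys about a 10x reduction in error, which matches the diminishing returns described in 3/.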
Nicholas Guttenberg Retweeted
New work "what graph neural networks cannot learn: depth vs width" (ICLR20) studies the expressive power of GNN: It provides sufficient conditions for universality and shows that many classical problems are impossible when depth x width < c n^p. Blogpost: https://andreasloukas.blog/2019/12/27/what-gnn-can-and-cannot-learn/ …pic.twitter.com/VpaQh4fI9Q
Deploying ML models in Unity is still... PyTorch -> Tensorflow 1.7.1 / ML-Agents 0.5 was what worked. TensorSharp 1.15 had a missing DLL issue, Tensorflow .NET seems to have mysteriously dropped 'restoring saved models' from its docs (and API?). Better suggestions?
I really want to understand this in the middle case, where the boundaries between what is the network and what is the world are muddied.
We talk a lot about NNs and the problem of inferring causality in the world, but another take is to think about causality within the network and its relation to learning. Learning is efficient when it's clear how to intervene on parameters to achieve a desired computation.
Nicholas Guttenberg Retweeted
BackPACK: Packing more into backprop "we introduce BackPACK, an efficient framework built on top of PyTorch, that extends the backpropagation algorithm to extract additional information from first- and second-order derivatives" https://arxiv.org/abs/1912.10985 https://f-dangel.github.io/backpack/ pic.twitter.com/sX9Rp6We1w
Nicholas Guttenberg Retweeted
We're hiring! We are looking for AI Research Scientists to work on our Badger Architecture and a Unity Programmer to work on our AI Game. We are open to remote working. Find out more: https://www.goodai.com/careers/ pic.twitter.com/FBCEAprHbS
Nicholas Guttenberg Retweeted
We are publishing new information about our Badger architecture - “how can homogenous experts inside an agent coordinate together to learn to solve new tasks?” Get in touch if you are interested! https://blog.marekrosa.org/2019/12/badger.html …pic.twitter.com/n4F4Reuf4j
Interestingly, it finds that the optimal place to put the threshold (on the Moore neighborhood of 8 adjacent cells) is between 3 and 5. That is, rules of the form 'if x>=3, 4, or 5' generate more diverse behaviors than rules using a lot of 'if x>=2' or 'if x>=7'.
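For context, the threshold primitive being tuned looks something like the following. This is not the author's tree-based rule system, just a minimal toy CA where a cell's next state is a single threshold test on its Moore-neighborhood sum (grid, shape, and wraparound are my assumptions):

```python
def moore_sum(grid, i, j):
    """Sum of the 8 Moore neighbors of cell (i, j), with wraparound."""
    n, m = len(grid), len(grid[0])
    return sum(grid[(i + di) % n][(j + dj) % m]
               for di in (-1, 0, 1) for dj in (-1, 0, 1)
               if (di, dj) != (0, 0))

def step(grid, threshold):
    """One update of a toy totalistic CA: a cell turns on exactly
    when its live-neighbor count meets the threshold (the cell's
    own state is ignored in this simplified version)."""
    return [[1 if moore_sum(grid, i, j) >= threshold else 0
             for j in range(len(grid[0]))]
            for i in range(len(grid))]

plus_shape = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
nxt = step(plus_shape, 3)
```

Mid-range thresholds (3-5) keep patterns like this alive without exploding; `x>=2` rules tend to fill the grid and `x>=7` rules tend to kill everything, which is one intuition for the diversity result in the tweet.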
The trees can vary in depth and in degree of balance, and the thresholds for the decision branches can be distributed in various ways. So I'm using CMA-ES (an evolutionary strategy) to find the tree hyperparameters that give the most variety in the resulting CAs.