New preprint from the lab: "Individual differences among deep neural network models."
https://www.biorxiv.org/content/10.1101/2020.01.08.898288v1
Work with @KriegeskorteLab, @HannesMehrer, and Courtney Spoerer. #tweeprint below. 1/7
-
-
Interesting! Given the range of performances of the networks across seeds (up to 1% accuracy difference), do you think that each random seed has an optimal learning rate? If so, do you think that if the learning rate were tuned appropriately for each seed, there would be fewer differences?
-
They are all trained using Adam. Do you mean using different hyperparameters, or SGD with different schedules?
New conversation -
-
-
Yes!!!! This is something I've been thinking for a while now. We should be comparing to multiple trained networks, and using stats on that to make claims about comparisons to brains.
-
Absolutely. "Networks as participants" is the project slogan. pic.twitter.com/ZuST8QxE3O
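The "networks as participants" idea above can be sketched in a few lines: treat each trained instance like a subject, compute a representational dissimilarity matrix (RDM) per instance, and report statistics over between-instance RDM agreement. This is a minimal illustration with synthetic activations, not the paper's actual pipeline; all array shapes and the noise level are made up for the example.

```python
import numpy as np

def rdm(acts):
    """RDM as 1 - Pearson correlation between the activation
    patterns (rows) of each stimulus pair."""
    return 1.0 - np.corrcoef(acts)

def upper(m):
    """Off-diagonal upper triangle: the usual vector for comparing RDMs."""
    return m[np.triu_indices_from(m, k=1)]

rng = np.random.default_rng(0)
n_stimuli, n_units, n_instances = 20, 100, 5

# Synthetic stand-ins for one layer's activations across several trained
# instances: shared stimulus structure plus instance-specific variation.
shared = rng.standard_normal((n_stimuli, n_units))
instances = [shared + 0.5 * rng.standard_normal((n_stimuli, n_units))
             for _ in range(n_instances)]

rdms = [rdm(a) for a in instances]
corrs = [np.corrcoef(upper(rdms[i]), upper(rdms[j]))[0, 1]
         for i in range(n_instances) for j in range(i + 1, n_instances)]
print(f"between-instance RDM correlation: "
      f"{np.mean(corrs):.2f} +/- {np.std(corrs):.2f}")
```

With per-instance RDMs in hand, the same group-level statistics used for human participants (means, noise ceilings, significance tests) apply directly to the population of network instances.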
New conversation -
-
-
Cool work! Have you considered comparison methods other than RSA? It's been shown that dot-product-based methods can show similarities in differently initialized networks (Kornblith et al., 2019).
-
Our goal here was to use the same methods as people in cognitive/computational neuroscience. Other metrics exist in the ML community (many CCA-related, cited in the paper). I am guessing that even Kornblith's measure does not render network instances identical.
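For readers unfamiliar with the metric mentioned in this exchange: linear CKA (Kornblith et al., 2019) compares two representation matrices via centered dot products and is invariant to orthogonal transforms and isotropic scaling. A minimal sketch, with made-up random data standing in for layer activations:

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between two (stimuli x units) representation matrices.
    Invariant to rotation of the unit axes and to isotropic scaling."""
    X = X - X.mean(axis=0)  # center each unit
    Y = Y - Y.mean(axis=0)
    num = np.linalg.norm(Y.T @ X, "fro") ** 2
    den = np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro")
    return num / den

rng = np.random.default_rng(1)
X = rng.standard_normal((50, 30))
Q, _ = np.linalg.qr(rng.standard_normal((30, 30)))  # random rotation

print(linear_cka(X, X @ Q))  # exactly 1: rotated copy of the same representation
print(linear_cka(X, rng.standard_normal((50, 30))))  # well below 1: unrelated
```

So two instances whose units are merely rotated versions of each other score 1 under CKA, while RSA-style RDM correlation would also rate them identical; genuinely different representations score below 1 under both.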
New conversation -
-
-
Have you looked at any other architectures? I would expect something like VGG to be much more sensitive to these kinds of variations than something like ResNet or DenseNet.
-
Good point. We needed to train a lot (!) of network instances to run these analyses, which is why we have only looked at two architectures (one similar to VGG-S, and ALL-CNN-C). Would be interesting to explore this across many architecture types.
New conversation -
-
-
Very cool paper! Have you considered this from a Bayesian perspective? I didn't see any discussion of this in the paper. Analysis of posterior estimates would be effectively equivalent to analyzing multiple DNNs, no? See https://arxiv.org/pdf/1912.02757.pdf for a recent discussion.
-
Thanks for the pointer, super interesting!
End of conversation