This was much harder than we expected. Adding recurrence to CNNs gives you temporal dynamics, but standard RNN structures (like LSTMs) didn't actually help these models do hard visual tasks (like ImageNet object recognition) -- better just to make the CNNs deeper...
...meaning that the recurrence didn't actually explain what computation the dynamics are there for. So we had to design and then "evolve" RNN structures that actually boosted task performance. This also made the networks more parameter-efficient than their feedforward versions.
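For intuition, here is a minimal sketch (an illustration of the general idea, not the paper's searched cell) of what "adding recurrence to a CNN layer" means: a convolutional state update whose weights are reused at every timestep, so unrolling in time adds dynamics without adding parameters.

```python
# Minimal sketch, assuming PyTorch. SimpleConvRNNCell, the channel sizes,
# and the 5 unroll steps are all illustrative assumptions, not the paper's cell.
import torch
import torch.nn as nn

class SimpleConvRNNCell(nn.Module):
    """A 'vanilla' convolutional RNN cell: h_t = relu(conv(x) + conv(h_{t-1}))."""
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        self.inp = nn.Conv2d(in_ch, hid_ch, k, padding=k // 2)
        self.rec = nn.Conv2d(hid_ch, hid_ch, k, padding=k // 2)

    def forward(self, x, h):
        return torch.relu(self.inp(x) + self.rec(h))

# The same weights are applied at every timestep, so "depth in time" costs no
# extra parameters -- the intuition behind the parameter efficiency noted above.
cell = SimpleConvRNNCell(in_ch=64, hid_ch=64)
x = torch.randn(1, 64, 56, 56)   # fixed input features for one image
h = torch.zeros(1, 64, 56, 56)   # initial hidden state
states = []
for t in range(5):               # unroll for 5 steps (assumed)
    h = cell(x, h)
    states.append(h)             # one response per timestep, not a single output
```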
NB: Feedforward networks trained on ImageNet already explain time-averaged visual responses very well! But since they only produce a single output for a given image, they do nothing to explain the observed response dynamics. pic.twitter.com/rO7bwLIElk
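To make that contrast concrete, here is an illustrative sketch on synthetic data (not the paper's actual analysis pipeline): a feedforward model yields one feature vector per image, so it can only be fit to time-averaged responses, while a recurrent model yields one feature vector per timestep that can be fit to each neural time bin separately.

```python
# Illustrative sketch with fake data; all shapes and names are assumptions.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n_images, n_feats, n_neurons, n_bins = 100, 50, 20, 5

neural = rng.standard_normal((n_images, n_bins, n_neurons))   # fake recordings
ff_feats = rng.standard_normal((n_images, n_feats))           # one vector per image
rnn_feats = rng.standard_normal((n_images, n_bins, n_feats))  # one vector per bin

# Feedforward model: only the time-AVERAGED response is explainable.
ff_map = Ridge().fit(ff_feats, neural.mean(axis=1))

# Recurrent model: fit a mapping per time bin, capturing the dynamics.
per_bin_maps = [Ridge().fit(rnn_feats[:, t], neural[:, t]) for t in range(n_bins)]
```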
In summary, our work suggests that recurrence and temporal dynamics might help solve "hard" visual tasks like ImageNet. What makes a task "hard"? We're not exactly sure, but it may involve object occlusion, clutter, etc., as suggested by
@KohitijKar, @KriegeskorteLab, and others.
[Tweet unavailable.]
It's computed from a feedforward model very similar but not identical to ResNet-18 (the "baseline" model from Figure 4). We'd expect similar results from ResNet-18, treating each "block" as one cortical area. Yes, we could use a ConvRNN, but...
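For illustration, here is how the block-to-area reading might look on a stock torchvision ResNet-18 (an assumption; the model above is "similar but not identical"), using forward hooks to pull one response per "area". The V1/V2/V4/IT assignment is likewise an illustrative assumption.

```python
# Sketch assuming torchvision's standard resnet18, not the thread's exact model.
import torch
from torchvision.models import resnet18

model = resnet18(weights=None).eval()
areas = {"V1": model.layer1, "V2": model.layer2,   # assumed block-to-area map
         "V4": model.layer3, "IT": model.layer4}

features = {}
for name, block in areas.items():
    # Default arg binds `name` per iteration; the hook stores each block's output.
    block.register_forward_hook(
        lambda m, inp, out, name=name: features.__setitem__(name, out))

with torch.no_grad():
    model(torch.randn(1, 3, 224, 224))   # one image -> one response per "area"
print({k: v.shape for k, v in features.items()})
```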
[9 more replies]
Nice! If "vanilla RNNs" is a technical term, do you have a flavor to name your task-optimized RNNs after too?
Choco-redcurrant
[1 more reply]