Before deep learning, a common technique for classifying images was based on "bags" of local image features. Advantage: potentially easy to interpret, as one can directly trace the contribution of each image feature. Downside: lower classification performance. 2/8
We suggest a very simple "bag" variant of a ResNet-50 (called BagNet) that classifies images with the following strategy: split the image into small patches (9x9, 17x17 or 33x33 pixels), apply a DNN to each patch to get class evidence (logits), and sum the evidence over all patches. 3/8 pic.twitter.com/jNBJyLa2Cy
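The patch-then-sum strategy can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the BagNet implementation: `patch_logits` is a hypothetical linear scorer standing in for the per-patch DNN, the image is a toy 32x32 grayscale grid, and the stride of 8 is chosen arbitrarily.

```python
import random

def extract_patches(image, q, stride):
    """Return all q x q patches (flattened) of a 2D grayscale image."""
    h, w = len(image), len(image[0])
    patches = []
    for top in range(0, h - q + 1, stride):
        for left in range(0, w - q + 1, stride):
            patches.append(
                [image[top + r][left + c] for r in range(q) for c in range(q)]
            )
    return patches

def patch_logits(patch, weights):
    """Hypothetical stand-in for the per-patch DNN: one linear score per class."""
    return [sum(w * x for w, x in zip(class_w, patch)) for class_w in weights]

def bag_logits(image, weights, q, stride):
    """Sum the per-patch class evidence (logits) over all patches."""
    total = [0.0] * len(weights)
    for patch in extract_patches(image, q, stride):
        for k, logit in enumerate(patch_logits(patch, weights)):
            total[k] += logit
    return total

# Toy data: random image and random weights (assumptions for illustration only).
rng = random.Random(0)
Q, NUM_CLASSES = 9, 3
image = [[rng.random() for _ in range(32)] for _ in range(32)]
weights = [[rng.uniform(-1, 1) for _ in range(Q * Q)] for _ in range(NUM_CLASSES)]

logits = bag_logits(image, weights, q=Q, stride=8)
predicted = max(range(NUM_CLASSES), key=lambda k: logits[k])
```

Because the image-level logits are a plain sum, each patch's contribution to the final decision can be read off directly, which is the interpretability argument made in this thread.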
We reach AlexNet-level performance for 17x17 patches and 87% top-5 accuracy for 33x33 patches. Takeaway: ImageNet can be solved to high accuracy using only very small local features. Pretrained models @ https://github.com/wielandbrendel/bag-of-local-features-models … 4/8 pic.twitter.com/7QoHq7mIiZ
BagNets are interpretable in that we can see which features are important. Below are important features for a few classes that lead to correct (top row) and wrong (bottom row) decisions: a tench is recognised by fingers holding the fish like a trophy, which also explains errors. 5/8 pic.twitter.com/LicS5YQTtC
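One way to picture "seeing which features are important" is a grid of per-patch evidence for a single class. The sketch below is an assumption-laden toy, not the authors' visualisation code: `mean_brightness` is a hypothetical stand-in for a trained per-patch class logit, and the image is a 16x16 dark grid with one bright blob.

```python
def evidence_heatmap(image, score_patch, q, stride):
    """Grid of per-patch class evidence; each entry is one patch's score."""
    h, w = len(image), len(image[0])
    heatmap = []
    for top in range(0, h - q + 1, stride):
        row = []
        for left in range(0, w - q + 1, stride):
            patch = [image[top + r][left + c] for r in range(q) for c in range(q)]
            row.append(score_patch(patch))
        heatmap.append(row)
    return heatmap

def mean_brightness(patch):
    # Hypothetical stand-in for a trained per-patch class logit.
    return sum(patch) / len(patch)

# Toy 16x16 image: dark background with a bright 4x4 blob in the lower right.
image = [[0.0] * 16 for _ in range(16)]
for r in range(12, 16):
    for c in range(12, 16):
        image[r][c] = 1.0

heatmap = evidence_heatmap(image, mean_brightness, q=4, stride=4)

# The image-level evidence is the sum of the heatmap entries.
total_evidence = sum(sum(row) for row in heatmap)

# The most influential patch is simply the largest heatmap entry.
best = max(
    ((r, c) for r in range(len(heatmap)) for c in range(len(heatmap[0]))),
    key=lambda rc: heatmap[rc[0]][rc[1]],
)
```

In this toy case all evidence comes from the single patch covering the blob, so the heatmap points straight at the image region that drove the score.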
We test VGG, ResNets & DenseNets on signatures of bag-of-feature classification: (a) image scrambling, (b) interaction between image parts, (c) error distribution and (d) feature sensitivity. VGG and BagNets are strikingly similar, deeper nets show somewhat larger deviations. 6/8
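To make the image scrambling signature (a) concrete: a model that only sums per-patch evidence cannot tell where a patch came from, so shuffling patches leaves its score unchanged. The sketch below assumes non-overlapping tiles aligned to the scrambling grid and a hypothetical linear per-tile scorer; both are simplifications chosen for illustration and not the test protocol used in the paper.

```python
import math
import random

def split_into_tiles(image, q):
    """Cut a 2D image into non-overlapping q x q tiles (row-major order)."""
    h, w = len(image), len(image[0])
    return [
        [row[left:left + q] for row in image[top:top + q]]
        for top in range(0, h, q)
        for left in range(0, w, q)
    ]

def assemble(tiles, tiles_per_row):
    """Rebuild an image from tiles laid out row-major in the given order."""
    q = len(tiles[0])
    image = []
    for start in range(0, len(tiles), tiles_per_row):
        band = tiles[start:start + tiles_per_row]
        for r in range(q):
            image.append([px for tile in band for px in tile[r]])
    return image

def bag_score(image, weights, q):
    """Sum of a hypothetical linear per-tile score over all tiles."""
    total = 0.0
    for tile in split_into_tiles(image, q):
        flat = [px for row in tile for px in row]
        total += sum(w * x for w, x in zip(weights, flat))
    return total

# Toy data (assumptions for illustration only).
rng = random.Random(0)
Q = 4
image = [[rng.random() for _ in range(16)] for _ in range(16)]
weights = [rng.uniform(-1, 1) for _ in range(Q * Q)]

# Scramble the image by shuffling its tiles.
tiles = split_into_tiles(image, Q)
rng.shuffle(tiles)
scrambled = assemble(tiles, tiles_per_row=16 // Q)

original_score = bag_score(image, weights, Q)
scrambled_score = bag_score(scrambled, weights, Q)

# Equal up to floating-point rounding, since only the summation order changed.
same = math.isclose(original_score, scrambled_score, rel_tol=1e-9, abs_tol=1e-9)
```

A network that relies on long-range spatial structure would not show this invariance, which is what makes scrambling a useful probe.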
Deeper nets likely use the same bag-of-feature strategy at their core but add more long-range information to increase classification performance. More work is needed to shed light on this. 7/8
Overall the results are not surprising: if weak local statistics are sufficient to solve ImageNet, why would nets infer complex global features like object shape? We hence need better tasks that push nets to learn the physical nature of our world. 8/8
How effective is this strategy for object localization / coarse segmentation? It might be nice to understand what % logits per pixel area leads to good performance
@threader_app compile \o/
Hey, the thread is ready and compiled. You can read the whole version here: https://threader.app/thread/1093175703690989569 …