[1/n] Can explainability improve model accuracy? Our latest work shows the answer is yes!
arxiv.org/pdf/2206.01161
github.com/hila-chefer/Ro
We noticed that ViTs suffer from a salient issue: their output is often based on supportive signals (e.g., the background) rather than the actual object.
[2/n] This phenomenon results in poor generalization under domain shift: a bus is classified as a snowplow because of the snow, a lemon as a golf ball because of the grass, a forklift as a garbage truck because of the garbage.
In this work, we show that one can *directly optimize the explanations*, i.e., apply a loss to the explainability signal to ensure that the classification is based on the *right reasons*: the foreground, not the background.
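To make the idea concrete, here is a minimal PyTorch sketch of such an objective. It assumes you already have a per-image relevance map from an explainability method and a rough foreground mask; the function name, the normalization, and the loss weights are all illustrative assumptions, not the paper's exact formulation.

```python
import torch

def explanation_loss(relevance, fg_mask, lambda_fg=0.3, lambda_bg=2.0):
    """Penalize relevance mass on the background and reward mass on the
    foreground. `relevance` and `fg_mask` are (B, H, W) tensors; the mask
    is 1 on the object and 0 on the background. Names and weights are
    illustrative, not the paper's exact formulation."""
    # Normalize each map so the relevance over an image sums to 1.
    rel = relevance / (relevance.sum(dim=(1, 2), keepdim=True) + 1e-8)
    bg_mass = (rel * (1.0 - fg_mask)).sum(dim=(1, 2)).mean()
    fg_mass = (rel * fg_mask).sum(dim=(1, 2)).mean()
    return lambda_bg * bg_mass - lambda_fg * fg_mass
```

During fine-tuning this term would be added to the usual cross-entropy loss, so the model is pushed to keep its classification accuracy while shifting its relevance onto the object.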
We use only 3 labeled examples per class for 500 classes, and achieve a large increase in robustness (ImageNet-A, ImageNet-R, ObjectNet).
Check out our Colab notebook, where you can experiment with our fine-tuned models and compare them to the original ones:
colab.research.google.com/github/hila-ch
Very cool work.
I wonder if, on ImageNet, using context from the background (e.g., the sky) improves accuracy. Would we want to always focus only on the object of interest, or should we also leave some room for looking at the surroundings?
[1/n] Thanks! That's a great question. This thread is, of course, a TL;DR of the work, but I'll try to answer in multiple parts.
Part 1: the method does not entirely eliminate use of the background (which is, as you said, useful); see the example below.