Nice study.
TL;DR: if you want a resolution independent, pre-trained, image recognition model that can be fine-tuned with a small and different dataset, use ConvNext.
Quote Tweet
We now finally have the thing I've always wanted: an analysis of the best vision models for finetuning.
In joint work with @capetorch we tried nearly 100 models on two very different datasets across various hyperparams.
Here's the top 15 on the IIT-Pet dataset: 1/
Show this thread
8
75
440


