@fchollet I agree. Attention models can in turn be considered possibly differentiable data augmentation, or active learning, devices.
Learning data augmentation schemes is essentially learned perceptive attention. But quasi-optimal.
-
-
-
@fchollet I.e. doing more with less data either by making more (hopefully plausible) data or using the data you have more intelligently.
End of conversation
New conversation -
Loading seems to be taking a while.
Twitter may be over capacity or experiencing a momentary hiccup. Try again or visit Twitter Status for more information.