I use the MJSynth dataset and paste the samples together. They are just images of single words. Thus if I want multiple words...
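A rough sketch of the pasting step, in case it helps. It assumes each word crop is already loaded as a 2-D grayscale NumPy array; `paste_words`, `gap`, and `background` are names made up for this illustration, not part of MJSynth or any library.

```python
import numpy as np


def paste_words(word_crops, gap=16, background=255):
    """Paste single-word crops left to right onto one canvas.

    word_crops: list of 2-D uint8 arrays (height x width), one per word.
    Returns the canvas and one (x0, y0, x1, y1) box per word.
    """
    if not word_crops:
        raise ValueError("need at least one word crop")

    # The canvas is as tall as the tallest crop and wide enough for all
    # crops plus a fixed gap between neighbours (an assumed layout).
    height = max(c.shape[0] for c in word_crops)
    width = sum(c.shape[1] for c in word_crops) + gap * (len(word_crops) - 1)
    canvas = np.full((height, width), background, dtype=np.uint8)

    boxes = []
    x = 0
    for crop in word_crops:
        h, w = crop.shape
        y = (height - h) // 2  # centre shorter crops vertically
        canvas[y:y + h, x:x + w] = crop
        boxes.append((x, y, x + w, y + h))
        x += w + gap
    return canvas, boxes
```

The returned boxes give a per-word region for each pasted sample, which is one way to get multiple labelled words per image.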
Yes, but Flickr30k Entities is an alternative.
The @VisualGenome dataset includes, amongst many things, object regions. Might be too fine-grained. https://visualgenome.org/static/paper/Visual_Genome.pdf …
I think some of the Flickr30k or Flickr8k samples do have multiple labels; I will double-check.