I'm not sure what the conclusion is on DeViSE versus older approaches (or the extreme-classification folks).
-
-
Thanks.
-
-
-
BTW, I think you could unleash the full potential of embedding regression vs. classification on the Flickr dataset (100k+ labels).
-
If you want to have a look at Flickr, also look up http://arxiv.org/abs/1511.02251 by @armandjoulin and colleagues.
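For context, here is a minimal sketch of the two heads under discussion, assuming 2048-d image features, 300-d label embeddings, and 100k labels (all names and shapes are illustrative, not the actual setup from the thread):

```python
import tensorflow as tf

# Illustrative assumptions: 2048-d image features, 300-d label
# embeddings, 100k labels.
n_labels, emb_dim, feat_dim = 100_000, 300, 2048
features = tf.keras.Input(shape=(feat_dim,))

# Head A: plain classification -- a 100k-way softmax whose output
# layer alone holds feat_dim * n_labels weights.
logits = tf.keras.layers.Dense(n_labels)(features)
classifier = tf.keras.Model(features, logits)
# loss: tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

# Head B: embedding regression (the DeViSE-style idea) -- project the
# image into the label-embedding space and pull it toward its label's
# fixed word vector; this head needs only feat_dim * emb_dim weights.
projection = tf.keras.layers.Dense(emb_dim)(features)
regressor = tf.keras.Model(features, projection)
# loss: e.g. tf.keras.losses.CosineSimilarity() against the label vector
```

The parameter count is the practical difference at this scale: the softmax head grows linearly with the number of labels, while the regression head's size is fixed by the embedding dimension.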
End of conversation
New conversation
-
-
Congrats! By the way, would 1 K80 be detected as 1 or 2 GPUs? And what did you use for parallelism? Keras with tf.device()?
-
1 K80 would be 1 GPU. It's data parallelism (asynchronous SGD). It's pure TensorFlow (although I have the same setup using Keras now).
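A rough sketch of what explicit tf.device() data parallelism can look like in TensorFlow. The setup described above used asynchronous SGD across workers; this toy version instead averages gradients synchronously on one machine, and every shape and name here is an assumption:

```python
import tensorflow as tf

devices = ["/GPU:0", "/GPU:1"]  # assumed: two visible GPUs

# Toy model; variables are created once and shared by both replicas.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(64,)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10),
])
optimizer = tf.keras.optimizers.SGD(0.01)
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

@tf.function
def train_step(x_shards, y_shards):
    # Each replica gets its own shard of the batch.
    replica_grads = []
    for device, x, y in zip(devices, x_shards, y_shards):
        with tf.device(device):  # forward/backward on this GPU
            with tf.GradientTape() as tape:
                loss = loss_fn(y, model(x, training=True))
            replica_grads.append(
                tape.gradient(loss, model.trainable_variables))
    # Average the per-replica gradients and apply a single update.
    grads = [tf.add_n(g) / len(devices) for g in zip(*replica_grads)]
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```

In the asynchronous variant mentioned in the reply, each worker would instead apply its gradients to shared parameters as soon as they are computed, without waiting for the other replicas.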
End of conversation
New conversation
-
-
how do you decide which architecture to invest time in?
-
just throw spaghetti at the wall, see what sticks
End of conversation
New conversation
-
-
that's awesome that it's both faster *and* more accurate!
Thanks.
-
-
-
is test time slower? (i.e. because you have to do a nearest-neighbor lookup) have you thought about doing a discrete version à la learned hash representations?
-
yes, inference is incrementally slower (essentially it's like having one more FC layer).
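To make that cost concrete, here is a sketch (with made-up shapes) of why prediction adds roughly one FC layer's worth of work: the label embeddings form a fixed matrix, and nearest-neighbor scoring is a single matrix-vector product against it. A discrete/hashed representation, as suggested above, would replace this brute-force scan:

```python
import numpy as np

# Hypothetical shapes: 300-d embeddings, 100k labels. The label
# embedding matrix is precomputed once, so prediction costs one extra
# matrix-vector product -- roughly one more FC layer.
emb_dim, n_labels = 300, 100_000
rng = np.random.default_rng(0)
label_emb = rng.standard_normal((n_labels, emb_dim)).astype(np.float32)
label_emb /= np.linalg.norm(label_emb, axis=1, keepdims=True)

def predict(image_emb: np.ndarray, k: int = 5) -> np.ndarray:
    """Return the indices of the k nearest label embeddings (cosine)."""
    q = image_emb / np.linalg.norm(image_emb)
    scores = label_emb @ q          # the "one more FC layer" cost
    return np.argsort(-scores)[:k]  # brute-force top-k

top5 = predict(rng.standard_normal(emb_dim).astype(np.float32))
```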
End of conversation
New conversation