- make it possible - make it work - make it efficient - make it dependable and invisible - move on to the next layer and never think about it again
-
Constantly retraining very similar models from scratch, taking days at a time (sometimes weeks or months), seems particularly wasteful. At a lower level, most models seem to use way more ops/params than they need, and backprop + SGD is itself fundamentally inefficient.
-
Nods. And that, I think, is consistent with my view that we haven’t necessarily figured out the optimal architectures... we grok their general shape, but they are not necessarily efficient,
-
That makes sense.