I don't recall whether there has been any rigorous study of conv-act-bn versus conv-bn-act (I always do conv-bn-act since it's the most intuitive). Does anyone have a good reference?
-
Here are 10 trials for each permutation of batch norm layers in a skip connection of a convnet model, ordered by loss at epoch 30. Conclusion: use BN everywhere, but if just once, use it after the activation. Am curious to see others' results. pic.twitter.com/HpgG0S1cKl
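For anyone unsure what the two orderings actually change, here is a minimal pure-Python sketch. It is not the model from the experiment above: the function names are made up for illustration, ReLU is assumed as the activation, the "batch" is a single list of numbers, and BN is reduced to plain standardization without the learned scale and shift. With BN last, the next layer sees zero-mean input; with the activation last, it sees non-negative input whose mean is shifted above zero.

```python
import math

def batch_norm(xs, eps=1e-5):
    # Standardize one feature over the batch (no learned gamma/beta in this sketch).
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return [(x - mean) / math.sqrt(var + eps) for x in xs]

def relu(xs):
    # Assumed activation; the thread does not say which one was used.
    return [max(0.0, x) for x in xs]

def bn_then_act(xs):
    # conv-bn-act ordering: normalize the conv output, then apply the activation.
    return relu(batch_norm(xs))

def act_then_bn(xs):
    # conv-act-bn ordering: apply the activation, then normalize.
    return batch_norm(relu(xs))

def mean(xs):
    return sum(xs) / len(xs)

# Hypothetical "conv outputs" for one channel across a batch of four samples.
conv_out = [-2.0, -1.0, 0.5, 3.0]

print("bn -> act:", bn_then_act(conv_out), "mean:", mean(bn_then_act(conv_out)))
print("act -> bn:", act_then_bn(conv_out), "mean:", mean(act_then_bn(conv_out)))
```

This only shows how the statistics passed to the next layer differ between the two orderings; it says nothing on its own about which one trains better.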