Has anyone explored avg vs max pooling for 1-D convnets in language? (Or, more generally, is there any semi-rigorous discussion of conv architectures for language?)
(a) is this empirical or intuition? (b) shouldn't the conv layer learn to deal with this weighting?
(a) Both. (b) Yes, but expecting the network to route around architectural defects only works so far. Avg pooling dilutes valuable info.
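For concreteness, a minimal sketch (PyTorch; the batch size, embedding width, sequence length, and filter width below are made-up, not from the thread) of the two pooling choices being compared: max-over-time keeps each filter's single strongest n-gram match, while mean-over-time averages that spike with every other position.

```python
# Sketch: max- vs mean-over-time pooling on 1-D conv feature maps over token embeddings.
import torch
import torch.nn as nn

batch, emb_dim, seq_len, n_filters = 2, 300, 50, 100
x = torch.randn(batch, emb_dim, seq_len)              # (batch, channels, time); stand-in for embeddings

conv = nn.Conv1d(emb_dim, n_filters, kernel_size=3)   # each filter detects a 3-gram pattern
feats = torch.relu(conv(x))                           # (batch, n_filters, seq_len - 2)

max_pooled = feats.max(dim=2).values                  # keeps the strongest n-gram match per filter
mean_pooled = feats.mean(dim=2)                       # averages one strong spike with many near-zero positions
```

With one strong match and mostly near-zero activations elsewhere, the mean shrinks roughly in proportion to the sequence length, which is the dilution the reply is pointing at.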
End of conversation
New conversation
Some sentence-classification observations: max-over-time pooling spikes on the few n-grams that matter.
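A hedged sketch of how one might check that observation: take the argmax over time for each filter and map it back to the input n-gram (the example sentence, shapes, and filter width here are illustrative only).

```python
# Sketch: recover which n-gram each filter's max-over-time activation came from.
import torch
import torch.nn as nn

tokens = "the movie was surprisingly good".split()
emb_dim, n_filters, k = 300, 100, 3

emb = torch.randn(1, emb_dim, len(tokens))            # stand-in for real token embeddings
conv = nn.Conv1d(emb_dim, n_filters, kernel_size=k)
feats = torch.relu(conv(emb))                         # (1, n_filters, len(tokens) - k + 1)

vals, pos = feats.max(dim=2)                          # per-filter max and the time index it occurred at
for f in range(5):                                    # inspect a few filters
    start = pos[0, f].item()
    print(f"filter {f}: '{' '.join(tokens[start:start + k])}' (activation {vals[0, f].item():.2f})")
```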
Do you have a sense of how many n-grams matter over the entire corpus? If your conv produces 100-dim vectors, can you capture > 100 n-grams?
New conversation
I agree. That said, mean pooling allows representations to adjust for this.