You may be interested in some of our previous work that augments CNNs with self-attention. The biggest effect for us was its interesting generalization abilities. https://openreview.net/forum?id=HkxaFoC9KQ
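A rough sketch of what augmenting a conv block with self-attention can look like; this is an illustrative assumption, not the architecture of the linked paper: a conv branch and a multi-head self-attention branch over the same input are concatenated along the channel axis.

```python
import tensorflow as tf

def attention_augmented_block(x, conv_filters=64, heads=4, key_dim=16):
    """x: [batch, height, width, channels] with static height/width.
    Illustrative sketch only; layer sizes are arbitrary assumptions."""
    conv_out = tf.keras.layers.Conv2D(conv_filters, 3, padding="same")(x)

    # Flatten the spatial grid so every position can attend to every other.
    b = tf.shape(x)[0]
    h, w, c = x.shape[1], x.shape[2], x.shape[3]
    seq = tf.reshape(x, [b, h * w, c])
    attn = tf.keras.layers.MultiHeadAttention(num_heads=heads, key_dim=key_dim)(seq, seq)
    attn_out = tf.reshape(attn, [b, h, w, c])

    # Concatenate conv and attention features along the channel axis.
    return tf.concat([conv_out, attn_out], axis=-1)

x = tf.random.normal([2, 16, 16, 32])
y = attention_augmented_block(x)   # [2, 16, 16, 64 + 32]
```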
-
-
-
Nice work, and thanks for the link. We will cite it in the next revision.
End of conversation
New conversation -
-
-
Soon attention will truly be all you need.
-
If convolutions are an inductive bias towards locality, but nonlocal attention can already reach similar accuracy, perhaps attention already is all you need and one just needs more data to replace the prior. (Incidentally, is the caption for Figure 11 correct? It seems to reverse column and row.)
- 1 more reply
New conversation -
-
-
How about replacing self-attention's fully connected FFN with a convolution, like the tensor2tensor librispeech transformer v2 hyperparams?
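For concreteness, a minimal sketch of that swap; the kernel size, dimensions, and layer names are illustrative assumptions, not the actual tensor2tensor librispeech transformer v2 hyperparameters.

```python
import tensorflow as tf

def ffn_dense(x, d_model=512, d_ff=2048):
    """Standard Transformer FFN: two position-wise dense layers."""
    h = tf.keras.layers.Dense(d_ff, activation="relu")(x)
    return tf.keras.layers.Dense(d_model)(h)

def ffn_conv(x, d_model=512, d_ff=2048, kernel_size=3):
    """Conv variant: the first projection also mixes neighboring positions."""
    h = tf.keras.layers.Conv1D(d_ff, kernel_size, padding="same",
                               activation="relu")(x)
    return tf.keras.layers.Conv1D(d_model, 1)(h)

x = tf.random.normal([2, 100, 512])   # [batch, seq_len, d_model]
y = ffn_conv(x)                       # drop-in replacement for ffn_dense(x)
```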
-
-
-
Do you actually use the long sequence of tf ops in section A2 for relative positional encoding, or do you have a special in-house optimized op replacing all of that?
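For reference, the kind of pad/reshape/slice chain such a plain-tf-ops implementation tends to involve, sketched here following the "skewing" trick from Music Transformer; this is not the paper's section A2 sequence, and all names and shapes are assumptions.

```python
import tensorflow as tf

def relative_logits_1d(q, rel_emb):
    """q: [batch, heads, length, depth]; rel_emb: [length, depth], indexed by
    relative distance -(length - 1) ... 0. Illustrative sketch only."""
    rel = tf.einsum("bhld,md->bhlm", q, rel_emb)        # [b, h, l, l]
    b, h, l = tf.shape(rel)[0], tf.shape(rel)[1], tf.shape(rel)[2]
    # Pad one column, reshape, and drop a row so that entry (i, j) ends up
    # holding the logit for relative distance j - i; entries with j > i are
    # garbage and assumed to be removed by a causal mask later.
    rel = tf.pad(rel, [[0, 0], [0, 0], [0, 0], [1, 0]])
    rel = tf.reshape(rel, [b, h, l + 1, l])
    return rel[:, :, 1:, :]

q = tf.random.normal([2, 4, 8, 16])
rel_emb = tf.random.normal([8, 16])
logits = relative_logits_1d(q, rel_emb)   # [2, 4, 8, 8]
```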
-
-
-
Very cool work. But I am unsurprised, and perhaps also relieved, that complicated positional encodings are unnecessary and simply specifying coordinates as a feature map suffices. pic.twitter.com/LJm7p50BmV
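A minimal sketch of that idea, supplying normalized coordinates as extra feature-map channels (in the spirit of CoordConv) rather than a learned positional encoding; shapes and the [-1, 1] normalization are illustrative assumptions.

```python
import tensorflow as tf

def add_coord_channels(feature_map):
    """Concatenate normalized (y, x) coordinate channels to an NHWC tensor."""
    batch = tf.shape(feature_map)[0]
    height, width = feature_map.shape[1], feature_map.shape[2]
    ys = tf.linspace(-1.0, 1.0, height)            # [H]
    xs = tf.linspace(-1.0, 1.0, width)             # [W]
    yy, xx = tf.meshgrid(ys, xs, indexing="ij")    # each [H, W]
    coords = tf.stack([yy, xx], axis=-1)           # [H, W, 2]
    coords = tf.tile(coords[None], [batch, 1, 1, 1])
    return tf.concat([feature_map, coords], axis=-1)

x = tf.random.normal([2, 16, 16, 32])
y = add_coord_channels(x)   # [2, 16, 16, 34]
```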
-