Still vastly underused in image-to-image to this day, and I don't understand why: U-Nets and self-attention. I keep seeing papers come out with obvious problems that could be solved using these two simple things!
Can you expand on it being harder? It seems like a pretty easy thing to drop into a network...
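[Editor's note: as a concrete illustration of "dropping it into a network", here is a minimal sketch of a SAGAN-style self-attention block over a 2D feature map, of the kind one could insert between U-Net stages. The class name, channel-reduction factor, and placement are illustrative assumptions, not details from the thread.]

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfAttention2d(nn.Module):
    """Minimal SAGAN-style self-attention over the spatial positions of a feature map.
    Hypothetical sketch: could be inserted after any conv stage of a U-Net."""

    def __init__(self, channels):
        super().__init__()
        # 1x1 convs project to queries/keys (reduced channels) and values.
        self.query = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        # Zero-initialized gate: the block starts as an identity mapping.
        self.gamma = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)  # (b, h*w, c//8)
        k = self.key(x).flatten(2)                    # (b, c//8, h*w)
        attn = F.softmax(q @ k, dim=-1)               # (b, h*w, h*w) attention map
        v = self.value(x).flatten(2)                  # (b, c, h*w)
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        return x + self.gamma * out                   # residual connection
```

Because gamma is zero-initialized, the block is a no-op at initialization and can be added to an existing model without disturbing its behavior, which is part of what makes it "easy to drop in".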
-
This is exactly the point: in terms of code this is easy to do, but when you do it, it's hard to improve your current system. I agree that some research is needed there. It could be that sequences of words are statistically easier to exploit than the outputs of strided convolutions.
-
Thanks!
End of conversation