Still vastly underused in image-to-image work to this day, and I don't understand why: UNets and self-attention. I keep seeing papers come out with obvious problems that could be solved using these two simple things!
Replying to @citnaj @jeremyphoward
Maybe because image-to-image attention is much harder than in NLP. A journal of negative results comes to mind, but trust me, many researchers have tried - in our case it was for virtual try-on. Yet Conditional Batch Norm works well to modulate an image with respect to a text.
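Aside: a minimal sketch of the kind of text-conditioned batch norm being described, in the FiLM style (the exact layer from their virtual try-on work isn't given here, so this is an illustrative PyTorch version):

```python
import torch
import torch.nn as nn

class ConditionalBatchNorm2d(nn.Module):
    """Batch norm whose affine parameters (scale, shift) are predicted
    from a conditioning vector, e.g. a text embedding."""
    def __init__(self, num_features, cond_dim):
        super().__init__()
        # Normalize without learned affine params; the condition supplies them.
        self.bn = nn.BatchNorm2d(num_features, affine=False)
        self.gamma = nn.Linear(cond_dim, num_features)
        self.beta = nn.Linear(cond_dim, num_features)

    def forward(self, x, cond):
        # x: (B, C, H, W) feature map; cond: (B, cond_dim) text embedding
        out = self.bn(x)
        g = self.gamma(cond).unsqueeze(-1).unsqueeze(-1)  # (B, C, 1, 1)
        b = self.beta(cond).unsqueeze(-1).unsqueeze(-1)   # (B, C, 1, 1)
        # Scale around 1 so the layer starts close to plain batch norm.
        return (1 + g) * out + b
```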
Replying to @JeremieMary @jeremyphoward
Can you expand on why it's harder? It seems like a pretty easy thing to drop into a network...
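For reference, "dropping in" self-attention usually means something like the SAGAN-style layer below, inserted between the conv blocks of a UNet (an illustrative sketch, not the code from any of the works discussed):

```python
import torch
import torch.nn as nn

class SelfAttention2d(nn.Module):
    """Self-attention over the spatial positions of a feature map,
    usable as a drop-in layer between conv blocks."""
    def __init__(self, channels):
        super().__init__()
        # Reduced channels for queries/keys keep the attention map cheap;
        # assumes channels >= 8.
        self.query = nn.Conv2d(channels, channels // 8, 1)
        self.key = nn.Conv2d(channels, channels // 8, 1)
        self.value = nn.Conv2d(channels, channels, 1)
        self.gamma = nn.Parameter(torch.zeros(1))  # gated residual weight

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).view(b, -1, h * w)  # (B, C/8, N)
        k = self.key(x).view(b, -1, h * w)    # (B, C/8, N)
        v = self.value(x).view(b, -1, h * w)  # (B, C, N)
        # attn[b, i, j]: how much position i attends to position j
        attn = torch.softmax(q.transpose(1, 2) @ k, dim=-1)  # (B, N, N)
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        # gamma starts at 0, so the layer is the identity at init and the
        # network learns how much attention to use.
        return x + self.gamma * out
```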
Replying to @citnaj @jeremyphoward
That's exactly the point: in terms of code this is easy to do, but once you do it, it is hard to actually improve over your current system. I agree that some research is needed there. It could be that sequences of words are statistically easier to exploit than the outputs of strided convolutions.