Thanks @jeremyphoward for the shout out!
Replying to @sebd_d @jeremyphoward
I've tried to add symmetry to SelfAttention; a very minor improvement, if any. https://github.com/mgrankin/SimpleSelfAttention/blob/master/Imagenette%20Simple%20Symmetric%20Self%20Attention.ipynb
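A minimal sketch of the symmetry idea, assuming it means symmetrizing the 1x1-conv weight so channel i attends to channel j exactly as j attends to i (the notebook's exact code may differ):

```python
import numpy as np

# Hypothetical reconstruction, not the notebook's actual code:
# view the 1x1 conv weight as a c x c matrix and keep only its symmetric part.
rng = np.random.default_rng(0)
c = 8                              # number of channels (illustrative)
W = rng.standard_normal((c, c))    # 1x1-conv weight, viewed as c x c
W_sym = (W + W.T) / 2              # symmetric part of W
assert np.allclose(W_sym, W_sym.T) # attention is now channel-symmetric
```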
Replying to @mgrankin @jeremyphoward
I updated the repo, added your symmetry hack, thank you!
Replying to @sebd_d @jeremyphoward
Have you seen this paper? I believe Simple Self-Attention would work even better. https://arxiv.org/pdf/1906.05909.pdf
Replying to @mgrankin @jeremyphoward
@mgrankin, you might be interested in the modification I made to Simple Self-Attention (currently in the xresnet file in my repo). One small change in the order of operations makes it much faster and probably less sensitive to image size.
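The order-of-operations change can be illustrated with plain associativity (my reconstruction of the idea, not the exact code from the repo): with x of shape (c, n) where n = H*W, the two groupings below give the same result, but one builds an n x n intermediate that grows with image size while the other builds a fixed c x c intermediate:

```python
import numpy as np

c, n = 8, 1024                # channels, pixels (n grows with image size)
rng = np.random.default_rng(0)
x = rng.standard_normal((c, n))
y = rng.standard_normal((c, n))  # stand-in for conv(x)

slow = x @ (x.T @ y)          # n x n intermediate: cost quadratic in pixels
fast = (x @ x.T) @ y          # c x c intermediate: cost linear in pixels
assert np.allclose(slow, fast)
```

Since c is fixed by the architecture while n scales with resolution, the second grouping is what makes the layer cheap at larger image sizes.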
Replying to @sebd_d @jeremyphoward
I've injected a Simple Self-Attention layer into a pretrained ResNet; it could be more practical to use it that way. It works well, although it's hard to tell whether it's really better than a plain ResNet. https://github.com/mgrankin/SimpleSelfAttention/blob/master/pretrained.py
Replying to @mgrankin @jeremyphoward
Have you been tracking gamma? You'll want to train until that levels off a bit. Self-attention tends to take a while to train. I'm also curious how this compares to PooledSelfAttention (in FastAI library). I've been using that lately and it's next level for me.
Replying to @mgrankin @jeremyphoward
In the self-attention forward: o = self.gamma * o + x, i.e. the weighting of attention.
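A tiny illustration of why gamma matters (this is the standard SAGAN-style learned residual weight, matching the line quoted above): with gamma at its initial value of 0, the attention output o contributes nothing and the layer is a pure identity, which is why a gamma stuck at 0 means attention never trained.

```python
import numpy as np

gamma = 0.0                       # initial value of the learned scalar
x = np.array([1.0, 2.0, 3.0])     # residual input to the layer
o = np.array([0.5, 0.5, 0.5])     # attention output (illustrative values)

out = gamma * o + x               # the weighting of attention
assert np.allclose(out, x)        # gamma == 0 -> layer is an identity
```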
Replying to @citnaj @jeremyphoward
I've caught a huge bug in my code, thanks to your suggestion. Now, it really works.
Oh! Are you getting different results now?
Replying to @citnaj @jeremyphoward
I don't see any difference in the results. But at least it's training now; gamma was zero before.
That 94.4% you got on epoch 5 is interesting. I'm gonna give this a try. Thanks for sharing!