@cmuratori "Why does mask-move make it 3x slower?" - because of the "non-temporal hint". That's Intel-speak for "bypass the cache".
Replying to @rygorous
@rygorous @cmuratori Sorry I'm not following: what instruction are you talking about here? _mm_movemask_ps ??
Replying to @PierreTerdiman
@PierreTerdiman @rygorous We're talking about maskmovdqu / _mm_maskmoveu_si128 vs. just doing the and/andnot/or with dest and using movups.
@PierreTerdiman @rygorous The confusion is understandable, since they made one called "mask move" and the other called "move mask" :P
7:26 AM - 13 May 2015
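
For anyone following along, here is a rough sketch of the two approaches being compared in the thread. It assumes SSE2 and a mask whose bytes are each 0x00 or 0xFF; the function names `store_masked_maskmov` and `store_masked_blend` are made up for illustration. Note that the intrinsic for maskmovdqu is spelled `_mm_maskmoveu_si128` in `<emmintrin.h>`, and that the integer store `_mm_storeu_si128` (movdqu) is used here where the thread mentions movups.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>
#include <string.h>

/* Approach 1: maskmovdqu. Writes only the bytes whose mask byte has its
 * high bit set. The store carries a non-temporal hint, i.e. it may
 * bypass the cache. */
static void store_masked_maskmov(uint8_t *dst, __m128i src, __m128i mask)
{
    _mm_maskmoveu_si128(src, mask, (char *)dst);
}

/* Approach 2: load dest, blend with and/andnot/or, store all 16 bytes
 * with an ordinary unaligned store. Assumes every mask byte is 0x00 or
 * 0xFF, and that all 16 bytes at dst are readable and writable. */
static void store_masked_blend(uint8_t *dst, __m128i src, __m128i mask)
{
    __m128i old    = _mm_loadu_si128((const __m128i *)dst);
    __m128i keep   = _mm_andnot_si128(mask, old);  /* (~mask) & old */
    __m128i take   = _mm_and_si128(mask, src);     /*   mask  & src */
    __m128i merged = _mm_or_si128(take, keep);
    _mm_storeu_si128((__m128i *)dst, merged);
}
```

The two are not exact drop-in replacements: the blend version reads and rewrites the unselected bytes too, which matters if another thread touches them or if part of the 16-byte range is unmapped.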