@cmuratori "Why does mask-move make it 3x slower?" - because of the "non-temporal hint". That's Intel-speak for "bypass the cache".
@PierreTerdiman @rygorous We're talking about maskmovdqu / _mm_maskmoveu_si128 vs. just doing the and/andnot/or with dest and using movups.
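For anyone following along, here is a rough sketch of the two approaches being compared. The function names are made up for illustration. The second variant uses the integer load/store intrinsics (`_mm_loadu_si128` / `_mm_storeu_si128`), which normally compile to movdqu rather than movups but behave the same for this purpose. It assumes every mask byte is either 0x00 or 0xFF. maskmovdqu only looks at the high bit of each mask byte, so the two variants can differ for other mask values.

```c
#include <emmintrin.h> /* SSE2 intrinsics */
#include <stdint.h>

/* Variant A: maskmovdqu. Stores only the bytes of src whose mask byte has
   its high bit set. The store carries a non-temporal hint. */
static void store_masked_maskmov(uint8_t *dst, __m128i src, __m128i mask)
{
    _mm_maskmoveu_si128(src, mask, (char *)dst);
}

/* Variant B: read-modify-write blend, then a plain unaligned store.
   Assumes each mask byte is 0x00 or 0xFF. Unlike variant A, this reads
   and rewrites all 16 bytes at dst, including the unselected ones. */
static void store_masked_blend(uint8_t *dst, __m128i src, __m128i mask)
{
    __m128i old     = _mm_loadu_si128((const __m128i *)dst);
    __m128i keep    = _mm_andnot_si128(mask, old); /* (~mask) & old */
    __m128i take    = _mm_and_si128(mask, src);    /*   mask  & src */
    __m128i blended = _mm_or_si128(keep, take);
    _mm_storeu_si128((__m128i *)dst, blended);
}
```

If you mix variant A with ordinary loads and stores to the same memory from other threads, an `_mm_sfence()` after the non-temporal store is the usual way to order it. Note also that variant B is only a drop-in replacement when touching the unselected bytes is safe (mapped memory, no concurrent writers).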
-
-
@PierreTerdiman @rygorous The confusion is understandable, since they made one called "mask move" and the other called "move mask" :P