Majority got it wrong. They actually optimize the less-optimizable version more. https://twitter.com/RichFelker/status/1028357639347036160
Not sure. But in the first case it doesn't matter because it's UB. There's no reason to actually sub if K is a multiple of 32.
Whereas in the second version, omitting the sub and mask depends on the x86 behavior. (Actually the sub can be optimized out unconditionally if K%32==0; maybe that's what gcc and clang are doing.)
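For anyone reading without the linked poll open: the two versions aren't quoted in this thread, so here is a rough sketch assuming they are the usual 32-bit rotate idioms. The names `rotl_ub` and `rotl_masked` are made up for illustration and the actual code in the poll may differ. The first is undefined whenever K%32==0 (it shifts by the full width), so a compiler may assume that never happens. The second is defined for every K, and dropping its sub and mask relies on x86 shift instructions using only the low 5 bits of the count for 32-bit operands.

```c
#include <stdint.h>

/* Hypothetical "first case": only defined for k in 1..31.
 * k == 0 shifts right by 32 and k == 32 shifts left by 32,
 * both of which are undefined for a 32-bit operand. */
static uint32_t rotl_ub(uint32_t x, unsigned k)
{
    return (x << k) | (x >> (32 - k));
}

/* Hypothetical "second version": defined for every k.
 * Both shift counts are reduced to 0..31; (32 - k) & 31 equals (-k) & 31
 * because unsigned arithmetic wraps modulo a power of two. */
static uint32_t rotl_masked(uint32_t x, unsigned k)
{
    return (x << (k & 31)) | (x >> ((32 - k) & 31));
}
```

When K%32==0 both shift counts in the masked version come out as 0, so it just returns x, which fits the point that the sub isn't needed in that case.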