@pcwalton why not two lea?
@qcar_ Yeah, two lea would probably be faster.
@pcwalton in practice, the MUL is probably faster because it's fewer uops (unless you're bound entirely by p1 or latency).
@pcwalton zeroth-order approximation when writing assembly: pick the approach with fewest ops. it'll be faster most of the time
@pcwalton Looks like it; 1 cycle for mov, dispatch both shl in one (on Haswell), 1 cycle to add. mul has 4-cycle latency.
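For readers following along, here is a minimal C sketch of the mov / two shl / add shape that reply counts cycles for. The reply does not state the multiplier or the shift amounts; the values below (shifts of 3 and 1, giving a multiply by 10) are an assumption, and the name `times10_shift` is made up for illustration.

```c
#include <stdint.h>

/* Assumed shape: mov, shl, shl, add. The shift counts (3 and 1) are an
   assumption chosen so the result is 10*x; uint32_t wraps like a 32-bit
   register. Hypothetical helper name. */
static uint32_t times10_shift(uint32_t x)
{
    uint32_t y = x;   /* mov: copy the input */
    x <<= 3;          /* shl: 8*x */
    y <<= 1;          /* shl: 2*x, independent of the shift above */
    return x + y;     /* add: 8*x + 2*x = 10*x */
}
```

The two shifts do not depend on each other, which is why they can issue in the same cycle; the dependency chain is copy, shift, add.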
@pcwalton Probably, but... only one way to find out! Can also do: mov ebx, eax; add ebx, ebx; lea eax, [ebx + eax*8]
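Spelled out, that sequence computes eax*10: ebx holds 2*eax after the add, and the lea adds 8*eax on top. A minimal C sketch of the same arithmetic, assuming unsigned 32-bit wraparound to mirror the registers; the name `times10_lea` is hypothetical.

```c
#include <stdint.h>

/* Mirrors: mov ebx, eax; add ebx, ebx; lea eax, [ebx + eax*8]
   Hypothetical helper name; uint32_t wraps like a 32-bit register. */
static uint32_t times10_lea(uint32_t eax)
{
    uint32_t ebx = eax;      /* mov ebx, eax */
    ebx += ebx;              /* add ebx, ebx           -> 2*eax */
    return ebx + eax * 8u;   /* lea eax, [ebx + eax*8] -> 2*eax + 8*eax */
}
```

A compiler is free to turn this back into a single multiply or a different lea pair, so this only checks the arithmetic, not which form is faster.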
@pcwalton on modern hardware, probably not. On a 486? Probably.
@pcwalton Yes, but it must be in a loop, like 10+e8 times, and it is cryptic.
@pcwalton it should be pretty close either way. will take at least 3 clocks. IIRC mul usually takes 4. probably depends on the HW though...