Beginning to grok how to benchmark code running on an out-of-order CPU. For most code there is a critical dependency path of instructions that block each other. Any instruction not on that path may have zero impact on measured performance. However, if too many instructions not on
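A rough sketch of the critical-path idea in C, not a proper benchmark. Both functions are made up for illustration. `sum_chain` funnels every add through one accumulator, so each add has to wait for the previous one. `sum_split` does nearly the same number of adds but spreads them over four independent chains that an out-of-order core can overlap. This assumes the compiler keeps the floating-point adds in source order, which GCC and Clang do unless you pass something like `-ffast-math`.

```c
#include <stddef.h>

/* One accumulator: every add depends on the result of the previous add,
   so the adds form a single serial dependency chain. */
double sum_chain(const double *x, size_t n) {
    double acc = 0.0;
    for (size_t i = 0; i < n; i++)
        acc += x[i];
    return acc;
}

/* Four accumulators: four shorter, independent chains.
   Nearly the same amount of work, but less waiting on earlier results. */
double sum_split(const double *x, size_t n) {
    double a0 = 0.0, a1 = 0.0, a2 = 0.0, a3 = 0.0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        a0 += x[i];
        a1 += x[i + 1];
        a2 += x[i + 2];
        a3 += x[i + 3];
    }
    for (; i < n; i++) /* leftover elements when n is not a multiple of 4 */
        a0 += x[i];
    return (a0 + a1) + (a2 + a3);
}
```

If you time these, use a big array and repeat the run a few times. With arbitrary doubles the two results can differ in the last bits because the adds happen in a different order.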
-
Interesting, but seems to have a lot of shortcomings. It doesn't really do branches? Seems like it just assumes that every instruction is going to run once.
-
This and IACA are only for analyzing loops. So generally you put markers after the { and after the } of a loop, and then it assumes that the thing will run forever and shows you that pipeline. They usually assume you ain't doing conditionals, I think.
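A sketch of that marker placement, assuming the tool reads comment-style region markers from the generated assembly the way llvm-mca does (IACA ships its own `IACA_START` / `IACA_END` macros in `iacaMarks.h` instead). The `dot` function is a made-up example, and the `#` comment character assumes x86-64 with GCC or Clang.

```c
#include <stddef.h>
#include <stdint.h>

/* The markers are inline-asm comments: they emit no instructions,
   they only label a region in the assembly text for the analyzer. */
uint64_t dot(const uint32_t *a, const uint32_t *b, size_t n) {
    uint64_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        /* begin marker right after the loop's opening brace */
        __asm__ volatile("# LLVM-MCA-BEGIN dot_loop" ::: "memory");
        acc += (uint64_t)a[i] * b[i];
    }
    /* end marker right after the loop's closing brace */
    __asm__ volatile("# LLVM-MCA-END" ::: "memory");
    return acc;
}
```

Something like `clang -O2 -S dot.c -o - | llvm-mca` should then report on just the marked region. The markers can get in the way of the optimizer, so check that the loop in the assembly still looks like the one you meant to measure.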
-
Anyway, after you told me about the xxh collision tester I figured that such a program could run on a normal PC, if only ... And two weeks later I'm messing with fast inverse 5th root and libdivide in order to get the runtime down.