2 passes over an array can, in some cases, be faster than a single pass w/ a conditional statement that causes branch mispredictions. Hmmm WAT.
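A minimal sketch of the kind of comparison the tweet describes, written in C. The tweet does not say which workload was involved, so the task here (summing the elements at or above a threshold) and the names `sum_at_least_branchy`, `sum_at_least_two_pass`, and `keep` are assumptions made purely for illustration. The first function makes one pass with a data-dependent `if`; the second makes two passes and replaces the `if` with arithmetic on a 0/1 flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Single pass: whether the `if` body runs depends on the data, so on
 * unsorted input the branch predictor may guess wrong often. */
int64_t sum_at_least_branchy(const int32_t *data, size_t n, int32_t threshold) {
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (data[i] >= threshold) {
            sum += data[i];
        }
    }
    return sum;
}

/* Two passes: pass 1 stores a 0/1 flag per element, pass 2 multiplies
 * each element by its flag. Neither loop body contains a data-dependent
 * branch. `keep` is caller-provided scratch space of at least n bytes. */
int64_t sum_at_least_two_pass(const int32_t *data, uint8_t *keep,
                              size_t n, int32_t threshold) {
    assert(keep != NULL || n == 0);

    /* Pass 1: the comparison yields 0 or 1, stored without branching. */
    for (size_t i = 0; i < n; i++) {
        keep[i] = (uint8_t)(data[i] >= threshold);
    }

    /* Pass 2: rejected elements contribute data[i] * 0 to the sum. */
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += (int64_t)data[i] * keep[i];
    }
    return sum;
}
```

Both functions return the same result; which one is faster is not guaranteed. An optimizing compiler may turn the `if` into a conditional move or vectorize either loop, and on sorted or otherwise predictable input the branchy version can win, so the only reliable answer comes from timing both on the target machine, ideally alongside a hardware branch-miss counter.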
@bascule @kellabyte there's more'n you might think. "perf" on Linux rules with right h/w, if you can decipher output. http://www.halobates.de/modern-pmus-yokohama.pdf …
@RichardBarrell @kellabyte I just wish we had Azul-like tools for profiling apps on x86
@bascule @kellabyte e.g. counting cycles or L1/L2/L3 misses is easy, but I couldn't figure getting my i7 to count branch mispredicts for me.
@RichardBarrell @bascule Too bad. Would be awesome to play with some code and measure that vs performance.
@kellabyte @bascule I think it's just out-of-tree driver needs loading? Will look again later, possibly easy. Bare "perf" still wins hard.
@bascule @kellabyte See the PAPI library.