Update: I made it really fast. 2.7 GB/s on my i5! All credit goes to @ciphergoth; I just translated his AVX2 and SSE2 implementations of NH to Go asm. (NH clocks in at a ludicrous 12 GB/s!!)
-
I fixed the test vectors so they go out to 4k now. On another note, you generally can't compare cpb on very different architectures: we achieve a much better cpb on a modern x64 processor than we do on Cortex-A7.
-
Cool, I'll add those vectors tonight. Also removed the cpb comparison. Do you happen to know the GB/s of your impl on x64?