I'm not being snarky, I really do love this little file
@johnregehr The aliasing? I plan to fix it with an __attribute__((__may_alias__)) type conditional on __GNUC__, fallback to naive if !GNUC.
@johnregehr Actually, for 32-bit words, a naive 16x unrolled byte-based strlen blows away everything else in performance.
@johnregehr It consists of 100% correctly-predicted branches until end-of-string, and achieves ~1 cycle/byte even on my slow Atoms.
@johnregehr The bad news: compilers won't generate this unrolled code even if you ask them to. You have to write it by hand...
@RichFelker gcc 5.3 gives me a nice 8-way unroll with -funroll-all-loops -O3
@johnregehr What's your idea of a "nice unroll"? I get cmpb;leaq;je per iteration. Instead it should just do cmpb;je.
@johnregehr This change makes a 2x performance difference, from "slowest reasonable strlen" to "fastest strlen".
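For illustration, the hand-unrolled byte loop described above might look like the following C sketch. The name `strlen_unrolled16` is hypothetical, and this is not code from the thread or from musl. It only shows the shape being argued for: sixteen byte tests at fixed offsets per pass, with one pointer update per sixteen bytes rather than one per byte. Whether a given compiler emits exactly cmpb;je for each test, and whether it reaches the speeds quoted above, is not verified here.

```c
#include <stddef.h>

/* Hypothetical name; an illustrative sketch, not musl's implementation. */
size_t strlen_unrolled16(const char *s)
{
    const char *p = s;

    for (;;) {
        /* Each test reads one byte at a constant offset from p and
           returns at the first zero, so nothing after the terminator
           is ever read. */
        if (!p[0])  return (size_t)(p - s);
        if (!p[1])  return (size_t)(p - s) + 1;
        if (!p[2])  return (size_t)(p - s) + 2;
        if (!p[3])  return (size_t)(p - s) + 3;
        if (!p[4])  return (size_t)(p - s) + 4;
        if (!p[5])  return (size_t)(p - s) + 5;
        if (!p[6])  return (size_t)(p - s) + 6;
        if (!p[7])  return (size_t)(p - s) + 7;
        if (!p[8])  return (size_t)(p - s) + 8;
        if (!p[9])  return (size_t)(p - s) + 9;
        if (!p[10]) return (size_t)(p - s) + 10;
        if (!p[11]) return (size_t)(p - s) + 11;
        if (!p[12]) return (size_t)(p - s) + 12;
        if (!p[13]) return (size_t)(p - s) + 13;
        if (!p[14]) return (size_t)(p - s) + 14;
        if (!p[15]) return (size_t)(p - s) + 15;
        p += 16; /* single pointer update per 16 bytes */
    }
}
```

Because the tests run in order and stop at the first zero byte, this variant has no read past the end of the string, unlike the word-at-a-time approach.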
New conversation
@johnregehr don't forget musl's beauties such as this 64-bit atomic primitive which totally won't tear: http://git.musl-libc.org/cgit/musl/tree/arch/arm/atomic.h#n247 …
@jfbastien I don't even understand
@johnregehr @jfbastien Do you have more examples of questionable musl stuff? OpenWrt (many SOHO routers & IoT things) just switched to it...
@BRIAN_____ @johnregehr not at the moment, I'll send patches to @RichFelker once we start doing WebAssembly's libc using musl.
End of conversation
New conversation
@johnregehr just the read past the end or something else I don't see?
@whitequark yep, reads past the end-- probably not a big deal
@johnregehr can you even implement fast strlen() without UB?
@whitequark I can't see any way to do it w/o padding all strings
@johnregehr @whitequark either write asm, or write the naive loop and depend on a Smart Compiler.
@johnregehr @whitequark (significantly less compiler smarts required for the naive loop than this clever implementation, BTW).
@stephentyrone @johnregehr there's something incestuous in compiler recognizing this idiom and replacing it with optimized SIMD
@stephentyrone @johnregehr hell, let it also recognize bubble sort and replace with qsort. why not.
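To make the "reads past the end" point concrete, here is a minimal C sketch of a word-at-a-time strlen. It combines the two ideas mentioned in the thread: a may_alias word type when __GNUC__ is defined, and the naive byte loop otherwise. The name `strlen_wordwise` is hypothetical and this is not claimed to be musl's source. It assumes 8-bit bytes and that an aligned word load touching a few bytes after the terminator is harmless on the target, which the C standard does not promise.

```c
#include <stddef.h>
#include <stdint.h>
#include <limits.h>

/* Hypothetical name; a sketch of the technique, not musl's actual code. */
size_t strlen_wordwise(const char *s)
{
    const char *a = s;
#ifdef __GNUC__
    /* may_alias permits reading char data through this word type. */
    typedef size_t __attribute__((__may_alias__)) word;
    const size_t ones  = (size_t)-1 / UCHAR_MAX;     /* 0x0101...01 */
    const size_t highs = ones * (UCHAR_MAX / 2 + 1); /* 0x8080...80 */
    const word *w;

    /* Step byte by byte until s is word-aligned. */
    for (; (uintptr_t)s % sizeof(word); s++)
        if (!*s) return (size_t)(s - a);

    /* Step word by word until a word contains a zero byte. That last
       load can cover bytes after the terminator: this is the read
       past the end discussed above. */
    for (w = (const void *)s; !((*w - ones) & ~*w & highs); w++)
        ;
    s = (const void *)w;
#endif
    /* Naive byte loop: locates the terminator inside the last word,
       or does all the work when __GNUC__ is not defined. */
    for (; *s; s++)
        ;
    return (size_t)(s - a);
}
```

The over-read stays inside one aligned word, so it does not cross a page boundary, but it is still outside the string object unless every string is padded to a word multiple.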