Anyone know why gcc and clang are unable to vectorize a simple memcmp?
I don't think either can vectorize loops with early exits. LLVM has more advanced vectorization available, but the standard loop and basic block vectorizers aren't sophisticated at all. They work well for simple loops over entire arrays without extra exits / conditions.
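To make the distinction concrete, here's a rough sketch of the two loop shapes. The function names are made up for illustration, and whether a given gcc or clang version actually vectorizes either one is something to check rather than assume (e.g. with gcc's `-fopt-info-vec` or clang's `-Rpass=loop-vectorize`).

```c
#include <stddef.h>

/* Early-exit form, the natural way to write memcmp: the number of
 * iterations depends on the data, because the loop returns at the
 * first differing byte. */
int cmp_early_exit(const unsigned char *a, const unsigned char *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (a[i] != b[i])
            return a[i] - b[i];
    return 0;
}

/* Single-exit form: always walks all n bytes and only folds the
 * differences into an accumulator. This is the "simple loop over
 * entire arrays" shape, but it only answers equal / not equal,
 * not the ordering that memcmp has to return. */
int differs_no_exit(const unsigned char *a, const unsigned char *b, size_t n)
{
    unsigned char acc = 0;
    for (size_t i = 0; i < n; i++)
        acc |= a[i] ^ b[i];
    return acc != 0;
}
```

The second form isn't a drop-in memcmp; it's only there to show what the loop has to look like when there's no extra exit.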
Ugh, of course, but they should be able to... Failure to do that is pretty bad.
I think for musl's memcmp I'm going to need something like our memcpy (git.musl-libc.org/cgit/musl/tree) with rotation through misaligned words. Might also need to start allowing archs to declare that they can do misaligned accesses.
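For a sense of what a word-at-a-time memcmp could look like, here's a minimal sketch. It is not musl's code and it does not do the rotation-through-misaligned-words trick; it swaps in fixed-size `memcpy` loads, which are well-defined for any alignment, and leaves it to the compiler to turn them into plain loads on archs where misaligned access is fine. The name `sketch_memcmp` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical word-at-a-time comparison; same return convention
 * as memcmp (negative, zero, or positive). */
int sketch_memcmp(const void *va, const void *vb, size_t n)
{
    const unsigned char *a = va, *b = vb;

    /* Compare one machine word at a time while whole words remain.
     * memcpy into a local makes the load valid for any alignment. */
    while (n >= sizeof(uintptr_t)) {
        uintptr_t wa, wb;
        memcpy(&wa, a, sizeof wa);
        memcpy(&wb, b, sizeof wb);
        if (wa != wb)
            break; /* the first differing byte is inside this word */
        a += sizeof wa;
        b += sizeof wb;
        n -= sizeof wa;
    }

    /* Byte loop: handles the tail, and also finds the first differing
     * byte after a word mismatch, so the result doesn't depend on
     * endianness. */
    for (; n; a++, b++, n--)
        if (*a != *b)
            return *a - *b;
    return 0;
}
```

Falling back to the byte loop on a mismatch keeps the sketch endian-neutral, at the cost of re-reading up to one word's worth of bytes.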

