what is particularly funny is that clang without the fix is slower than gcc, but clang with the fix is actually 10% faster than gcc
There are missed optimizations from Clang implementing bool as i8 instead of i1. Range metadata (llvm.org/docs/LangRef.h) is supposed to help with this but doesn't work very well and gets stripped out by transformations. LLVM itself is better at this. Not sure why Clang uses i8.
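To make the i8 vs. i1 point concrete, here is a minimal C sketch. The function name `count_set` is made up for illustration. As far as I know, Clang stores `bool` in memory as a full byte and relies on extra information attached to the load (range metadata in the versions discussed here) to say the value is only 0 or 1. The exact IR varies by Clang version, so treat this as something to inspect with `clang -O1 -S -emit-llvm`, not as a statement of what you will see.

```c
#include <stdbool.h>

/* Hypothetical example: sums an array of bools.
   Each flags[i] is read from memory as a byte. If the optimizer loses
   the "this byte is 0 or 1" fact, it has to treat the other 7 bits
   as possibly set. */
int count_set(const bool *flags, int n) {
    int total = 0;
    for (int i = 0; i < n; i++) {
        total += flags[i]; /* bool converts to int 0 or 1 */
    }
    return total;
}
```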
Clang similarly doesn't make much use of LLVM aggregates. In general, this is a sign that there's something wrong with the LLVM feature and it isn't a good idea to use it elsewhere if you expect correct and efficient code generation. C ABI support in LLVM is a total hack too.
Clang manually deals with implementing the C calling convention for structs instead of LLVM's C calling convention implementing it for aggregates. These workarounds in Clang cause many missed optimizations in LLVM, and LLVM also isn't optimized to deal well with first-class code.
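A small C sketch of the kind of code this affects. The names `struct point` and `sum_point` are hypothetical. My understanding is that on x86-64 System V, Clang typically lowers a by-value struct like this into a coerced scalar argument (for example a single 64-bit integer) instead of passing an LLVM aggregate, but that is target- and version-dependent, so check the emitted IR yourself with `clang -S -emit-llvm`.

```c
/* Hypothetical example: a small struct passed by value.
   How this argument appears in LLVM IR is decided by Clang's own
   ABI lowering code, not by LLVM's handling of aggregate types. */
struct point {
    int x;
    int y;
};

int sum_point(struct point p) {
    return p.x + p.y;
}
```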
yeah, I've hit that particular issue pretty badly on my projects (then again, who doesn't?)
I was looking into making i1 work better at one point and had a dozen tiny fixes like lists.llvm.org/pipermail/llvm. I ended up running into some problems that were too hard for me to solve on my own and gave up on the whole project. I never ended up submitting the other fixes I made.
Since Clang uses i8 for bool, optimization passes often haven't been implemented to handle i1.
So, you choose between deciding to miss optimizations because LLVM thinks the other 7 bits might be used vs. missing optimizations because no one optimized for doing what makes sense.
I think that's the right approach. I expect that you're missing a lot of optimizations though. First class aggregates are similar. The way Clang does things makes it inherently harder to optimize and it does miss optimizations because of it but not as much as doing things sanely.
The main issue with range metadata (llvm.org/docs/LangRef.h) is that like other metadata, stripping it away at any point is completely valid so that's what happens.
Still, I think the best code generation comes from using i8 and setting the range as a boolean due to Clang...
Since LLVM doesn't have first-class tagged unions and the concept doesn't exist at a language level in C, LLVM is particularly bad at handling those and it's bad enough at handling booleans...
Also, since restrict is hardly used in C, noalias is still incredibly broken in LLVM.
Higher-level languages could make heavy usage of noalias metadata for anything they know is immutable or only used through a single reference. It's defined based on memory dependencies, not pointer equality. Hardly anyone uses restrict so it has historically been broken in LLVM.
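For reference, this is roughly what the rarely used C feature looks like. The function `scale_into` is a made-up example. With `restrict`, the programmer promises that the memory reached through each pointer is not accessed through the others during the call, and Clang can pass that on to LLVM as `noalias` on the parameters. Without it, the compiler may have to assume a store to `dst[i]` could change `*scale` and reload it on every iteration.

```c
#include <stddef.h>

/* Hypothetical example: dst, src and scale are promised not to
   overlap, so *scale can in principle be loaded once, outside the loop. */
void scale_into(float *restrict dst,
                const float *restrict src,
                const float *restrict scale,
                size_t n) {
    for (size_t i = 0; i < n; i++) {
        dst[i] = src[i] * *scale;
    }
}
```

Calling it with overlapping buffers is undefined behavior, which is the memory-dependency promise the optimizer is supposed to exploit.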
And then, since that's largely unavailable in C, optimizations have not been written to take advantage of it. I can fully understand why a language would reinvent the wheel when they have to make their own IR for optimization anyway and then LLVM botches low-level optimization.