just made CXXRTL 20% faster overall (2× faster commits) on clang-11 by replacing a bunch of `flag |= x.commit()` with `if (x.commit()) flag = true;` (gcc-10 is unaffected)
compilers 🤦‍♀️
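For anyone who wants to see the two shapes side by side, here is a minimal sketch. `Wire`, `commit_all_or`, and `commit_all_branch` are hypothetical stand-ins made up for illustration, not the actual CXXRTL code; the only thing taken from the tweet is the `flag |= x.commit()` vs. `if (x.commit()) flag = true;` pattern. Both functions return the same result, so any speed difference comes down to how the compiler lowers them.

```cpp
#include <cstdint>

// Hypothetical stand-in for a double-buffered value: commit() copies
// `next` into `curr` and reports whether anything changed.
struct Wire {
    uint32_t curr = 0;
    uint32_t next = 0;

    bool commit() {
        if (curr == next) return false;
        curr = next;
        return true;
    }
};

// Variant 1: accumulate the "changed" flag with a bitwise OR.
bool commit_all_or(Wire *wires, int n) {
    bool changed = false;
    for (int i = 0; i < n; i++)
        changed |= wires[i].commit();
    return changed;
}

// Variant 2: set the flag from a branch instead.
bool commit_all_branch(Wire *wires, int n) {
    bool changed = false;
    for (int i = 0; i < n; i++)
        if (wires[i].commit()) changed = true;
    return changed;
}
```

Whether the second variant is actually faster will depend on the compiler and version, as the gcc-10 result above suggests.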
There are missed optimizations from Clang implementing bool as i8 instead of i1. Range metadata (https://llvm.org/docs/LangRef.html#range-metadata…) is supposed to help with this but doesn't work very well and gets stripped out by transformations. LLVM itself is better at this. Not sure why Clang uses i8.
Clang similarly doesn't make much use of LLVM aggregates. In general, this is a sign that there's something wrong with the LLVM feature and it isn't a good idea to use it elsewhere if you expect correct and efficient code generation. C ABI support in LLVM is a total hack too.
Clang manually deals with implementing the C calling convention for structs instead of LLVM's C calling convention implementing it for aggregates. These workarounds in Clang cause many missed optimizations in LLVM, and LLVM also isn't optimized to deal well with first-class code.
I was looking into making i1 work better at one point and had a dozen tiny fixes like http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20131021/192718.html…. I ended up running into some problems that were too hard for me to solve on my own and gave up on the whole project. I never ended up submitting the other fixes I made.
Since Clang uses i8 for bool, optimization passes often haven't been implemented to handle i1.
So you have to choose between missing optimizations because LLVM thinks the other 7 bits might be used and missing optimizations because no one optimized for doing what makes sense.
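A small way to poke at this yourself, as a sketch: `count_set` is a hypothetical function made up for illustration. The assumption here is that Clang loads each `bool` from memory as an i8 and relies on range metadata to tell LLVM the value is 0 or 1. Emitting the IR with something like `clang++ -O1 -S -emit-llvm` should let you check what your Clang version actually does with the loads.

```cpp
#include <cstddef>

// Hypothetical helper: counts how many flags are set.
// Each flags[i] is a bool read from memory, which is where the
// i8-vs-i1 representation question comes up.
std::size_t count_set(const bool *flags, std::size_t n) {
    std::size_t total = 0;
    for (std::size_t i = 0; i < n; i++)
        total += flags[i];  // bool promotes to 0 or 1
    return total;
}
```

If the range information is lost along the way, the optimizer can no longer assume the loaded byte is only 0 or 1, which is the kind of missed optimization being described.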
I think that's the right approach. I expect that you're missing a lot of optimizations though. First-class aggregates are similar. The way Clang does things makes it inherently harder to optimize, and it does miss optimizations because of it, but not as many as it would by doing things sanely.
The main issue with range metadata (https://llvm.org/docs/LangRef.html#range-metadata…) is that like other metadata, stripping it away at any point is completely valid so that's what happens.
Still, I think the best code generation comes from using i8 and setting the range as a boolean due to Clang...
Since LLVM doesn't have first-class tagged unions and the concept doesn't exist at a language level in C, LLVM is particularly bad at handling those and it's bad enough at handling booleans...
Also, since restrict is hardly used in C, noalias is still incredibly broken in LLVM.
Higher-level languages could make heavy use of noalias metadata for anything they know is immutable or only used through a single reference. It's defined based on memory dependencies, not pointer equality. Hardly anyone uses restrict, so it has historically been broken in LLVM.
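For reference, this is roughly what the C-side source of noalias looks like, as a sketch. `add_scaled` is a hypothetical function made up for illustration, and `__restrict` is the compiler-extension spelling that GCC and Clang accept in C++ (standard C++ has no `restrict` keyword). As far as I know, Clang turns restrict-qualified pointer parameters into `noalias` arguments in the IR.

```cpp
#include <cstddef>

// Hypothetical example: the __restrict qualifiers promise the compiler
// that dst and src do not overlap, so a store to dst[i] cannot change
// any src[j] and the loop is free to be vectorized without alias checks.
void add_scaled(float *__restrict dst, const float *__restrict src,
                float k, std::size_t n) {
    for (std::size_t i = 0; i < n; i++)
        dst[i] += k * src[i];
}
```

Calling this with overlapping `dst` and `src` is undefined behavior, which is exactly the guarantee a higher-level language could hand to LLVM for free when it already knows a reference is unique or immutable.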