It should just be removed. People should use restrict if they want it. The definition of restrict in the standard is screwed up, but it's defined in a reasonable way by Clang at least, even though there were massive implementation bugs which are now largely fixed due to Rust.
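
For readers who haven't used it, here is a minimal sketch of opting in to no-alias assumptions with restrict instead of relying on type-based rules. The function name scale_add and its parameters are hypothetical, made up for illustration.

```c
#include <stddef.h>

/* Hypothetical helper, not from the thread. The restrict qualifiers are a
   promise by the caller that dst and src do not overlap during the call,
   so the compiler may keep loaded src values in registers or vectorize the
   loop without re-reading memory after each store to dst. */
void scale_add(float *restrict dst, const float *restrict src, float k, size_t n) {
    for (size_t i = 0; i < n; i++) {
        dst[i] += k * src[i];
    }
}
```

Calling it with overlapping buffers breaks the promise and is undefined behavior, which is the point: the programmer states the assumption explicitly per function.
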
TBAA is compiler people doing an optimization that was never intended to be supported and was clearly a standards violation and was then gradually brought into the standard in order to permit what compilers were already doing. It shouldn't have ever been a thing. Super broken.
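
As a rough illustration of the kind of real-world code at issue: reading a float's bits through a cast pointer is what TBAA lets the compiler miscompile, while copying the bytes is well defined. The name float_bits is hypothetical, and the sketch assumes a 32-bit float.

```c
#include <stdint.h>
#include <string.h>

/* Assumption for this sketch: float is 32 bits wide. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes 32-bit float");

/* The pattern common in older code is
       uint32_t u = *(uint32_t *)&f;
   which reads a float object through a uint32_t lvalue. With type-based
   alias analysis the compiler may assume such an access never touches a
   float, so that line is undefined behavior. Copying the bytes is defined,
   and compilers typically lower it to a single register move. */
uint32_t float_bits(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```

GCC and Clang both accept -fno-strict-aliasing to turn the type-based assumption off for code that relies on the cast form.
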
It's completely reasonable to want to do those optimizations but real world C code is in violation of the rules and it was not historically permitted or intended in the initial standards. They decided to bend and outright break the rules and then "fixed" standards to allow it.
They should also really define signed overflow as either wrapping or trapping with trapping allowed to be either eager or any degree of lazy that still prevents the results from being used for anything with a side effect. Honestly though, I don't really care. C just needs to go.
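
A minimal sketch of the two behaviors, written by hand rather than chosen by the compiler. The function names are hypothetical; __builtin_add_overflow is a GCC/Clang builtin, so this assumes one of those compilers. The trapping version here is the eager variant only.

```c
#include <stdint.h>
#include <stdlib.h>

/* Wrapping: do the addition in unsigned arithmetic, where wraparound is
   defined, then convert back. The conversion to int32_t is
   implementation-defined; GCC and Clang define it as modular. */
int32_t add_wrapping(int32_t a, int32_t b) {
    return (int32_t)((uint32_t)a + (uint32_t)b);
}

/* Eager trapping: abort as soon as the overflow is detected, before the
   result can be used for anything. */
int32_t add_trapping(int32_t a, int32_t b) {
    int32_t result;
    if (__builtin_add_overflow(a, b, &result)) {
        abort();
    }
    return result;
}
```

GCC and Clang also offer -fwrapv and -ftrapv to request these semantics for ordinary signed arithmetic across a whole translation unit.
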
Also, hurting the performance of C code is a good thing. I'm all for disabling optimizations that were never actually standards compliant for C. LLVM is also wrongly applying the modern C++ rules for fudging the halting problem to C and legacy C++ when it's not actually allowed.
It really shouldn't be a thing that compilers add incorrect optimizations to make C faster when what they're doing is unsafe in practice and non-standards-compliant. Personally, I think the ever increasing cost of all the weak mitigations gradually adding up will help replace C.
For example, -fstack-protector-strong which is used anywhere that's at all security conscious as a global default typically has somewhere in the range of 2% to 5% performance cost. Add in ShadowCallStack and CFI, and you could have paid less performance with full bounds checking.
Cost of -fstack-protector-all is a higher range than 2% to 5%. Lots of papers to look at with estimates of performance cost for SSP in various non-IO-bound micro / macro benchmarks. I'm talking about by itself without other mitigations like ShadowCallStack or SafeStack enabled.
It's not quite that much when using type-based CFI and ShadowCallStack/SafeStack because cost of indirect calls and returns has already increased. Far less useful in that scenario though. SSP has a terrible cost vs. benefit trade-off. Barely mitigates any modern vulnerabilities.
SSP gets used alongside ShadowCallStack + CFI since it protects data beyond the return address and function pointers but it's such an extremely narrow use case and such a ridiculously weak mitigation especially in userspace where it's a single global stack canary set on exec.
I agree there are stronger & cheaper mitigations, but I’m a little surprised that a few extra pushes, pops, & branches add up to 2-5%. Some people on Twitter claim correctly predicted branches are free (which is dubious). There are lots of CPUs without ARM PAC or Intel CET around
The cost is primarily reading and writing the canary on the stack. It also reads the expected value from global or thread local storage so there are 2 reads and a write, and then a branch based on the 2 reads that's set up so prediction is trivial but still wastes CPU resources.
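
The sequence described above can be modeled by hand. This is a sketch only, not what a compiler actually emits: the struct stands in for a stack frame, guard_value stands in for the global or thread-local canary, and every name here is hypothetical. A real epilogue calls a failure handler that aborts instead of returning a flag.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the per-process canary (hypothetical fixed value; real
   toolchains randomize it at process start). Not const, so each use is a
   genuine memory read. */
static uintptr_t guard_value = 0x5ca1ab1eu;

/* Model of a frame: the canary slot sits just above the local buffer. */
struct frame_model {
    char buf[16];
    uintptr_t canary;
};

/* Returns 1 if the canary survived the copy, 0 if it was clobbered. */
int copy_with_canary(const char *src, size_t len) {
    struct frame_model frame;

    /* Prologue: read the guard (read 1) and store it in the frame (write). */
    frame.canary = guard_value;

    /* Function body: an unchecked copy that may run past buf. The length is
       clamped to the model frame so the sketch itself stays well defined. */
    if (len > sizeof frame) {
        len = sizeof frame;
    }
    memcpy((char *)&frame, src, len);

    /* Epilogue: reload the slot (read 2), re-read the guard, and branch. */
    return frame.canary == guard_value;
}
```

Every protected function pays for that store, the two loads, and the compare-and-branch on every call, whether or not an overflow ever happens.
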