
Memory safety violations are certainly UB, and their effects are certainly heavily impacted by optimizations. Compilers would need to trap on memory corruption / type confusion bugs in order to get rid of undefined behavior while still being able to heavily optimize without changing runtime behavior.
It's true. It's how the specification defines these things and how it gets implemented. You're talking about the specification and compilers implementing that specification, so why not stick to the agreed-upon definitions of terms and argue your case without playing word games?
No, you're misunderstanding / misinterpreting what I was saying. I was talking about a lot more than the GEP inbounds marker for pointer arithmetic. LLVM and GCC fundamentally treat pointers as more than addresses and objects as more than data laid out at an address in memory.
Sure, and you lose a whole lot of the optimizations that make C efficient. Pointers and aliasing are huge blockers for optimization even with all the extensive leveraging of the aliasing / indexing guarantees which go a long way to making optimizations possible.
The most important optimizations still work in the C that I describe. I’ve tried variations. I think the only one that really hurts is strict aliasing. Without that you lose too much load elimination. But that one doesn’t produce full UB - it just creates extra load motion.
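To make the load-elimination point concrete, here is a minimal sketch in C. The function name `store_both` is hypothetical and not from the thread. With type-based alias analysis enabled, a compiler may assume an `int *` and a `float *` do not refer to the same object, so it can return the constant instead of reloading `*i`. With `-fno-strict-aliasing` it generally has to reload.

```c
/* Hypothetical example: store_both is an illustrative name. */
int store_both(int *i, float *f) {
    *i = 1;
    *f = 2.0f;   /* under strict aliasing, assumed not to modify *i */
    return *i;   /* may be folded to 1; otherwise *i is reloaded */
}
```

Calling this with two distinct objects is well defined either way. The two modes only differ in what the compiler is allowed to assume about the second store.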
Type-based alias analysis (-fstrict-aliasing) is only a small portion of the overall alias analysis. The baseline alias analysis, and the reasoning about memory that happens even without AA, is extremely important for basic optimizations / code movement, and there's no switch for disabling it.
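A small sketch of the kind of baseline reasoning that does not depend on types at all, assuming nothing beyond standard C. The names `call_then_read` and `store_then_load` are hypothetical. In the first function the local's address never escapes, so the compiler can keep it in a register across an unknown call. In the second, both pointers have the same type, so the compiler must allow for them pointing at the same object.

```c
/* Hypothetical examples: both function names are illustrative. */

/* 'local' never has its address taken, so no callee can legally
 * modify it; the compiler may fold the result to 42. */
int call_then_read(void (*callback)(void)) {
    int local = 41;
    callback();
    return local + 1;
}

/* 'a' and 'b' may point to the same int, so *a must be reloaded
 * after the store through 'b' unless non-aliasing can be proven. */
int store_then_load(int *a, int *b) {
    *a = 1;
    *b = 2;
    return *a;
}

/* Trivial callback used to exercise call_then_read. */
void noop(void) {}
```

If pointers were treated as plain addresses that could reach any memory, the first fold would no longer be justified, since `callback` could in principle write to the stack slot.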
You're saying that you want all of that entirely disabled, with pointers treated as addresses. That's asking for a whole lot more than -fno-strict-aliasing and not marking pointer arithmetic inbounds (and the latter definitely has a huge impact on important loop optimizations, etc).
Here's something that's undefined: accessing uninitialized data. int a; if (cond1) { a = 5; } else { if (cond2) { a = 10; } } use(a); do_other_stuff(); use(a); Let's say that !cond1 && !cond2, so a is never assigned. Should both calls to use(a) be guaranteed to read the same value from uninitialized data?
I.e., to guarantee that, the compiler actually needs to assign a register to it, read a value, and potentially spill that to the stack so it can read it again later. LLVM won't currently do that. In rare cases this can have safety implications. A safe language would either force another else branch or zero-initialize.
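As a sketch of the zero-initializing alternative mentioned above, written in C with a hypothetical function name `pick_value`: giving `a` an explicit initial value means every path assigns it, so any later read sees one well-defined value.

```c
/* Hypothetical example: pick_value is an illustrative name. */
int pick_value(int cond1, int cond2) {
    int a = 0;            /* explicit zero instead of leaving it unset */
    if (cond1) {
        a = 5;
    } else {
        if (cond2) {
            a = 10;
        }
    }
    return a;             /* well defined on every path */
}
```

With neither condition set, the function returns 0, and repeated reads of `a` would all observe that same value.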