The question of whether memory unsafety implies UB is sort of at the heart of the disconnect between the C spec and C practitioners. As a practitioner (and compiler guy) I view memory unsafety as a separate thing - after all a “bad” store still stores to a well defined place.
There is nothing well defined about what an out-of-bounds access or use-after-free will access. The compiler, linker and even runtime environment are assuming that is never going to happen and there's nothing defined about what the consequences are going to be from the C code.
Memory unsafety is certainly UB and certainly heavily impacted by optimizations. Compilers would need to trap on memory corruption / type confusion bugs in order to get rid of the undefined behavior while still being able to heavily optimize without changing runtime behavior.
Tons of programs have latent memory corruption bugs and are depending on the specific way that the compiler, malloc implementation, etc. chose to lay things out in memory, what happens to be zeroed based on what they chose to do, etc. It's certainly UB with similar impacts.
No, it doesn't, because the compiler is optimizing based on the assumption that UB doesn't happen. If you write past the end of the array or to a freed object, you don't know what exactly is going to happen at runtime. C pointers are not treated as addresses by the compiler.
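A minimal sketch of the kind of optimization being described here, assuming a typical optimizing C compiler. The names `table` and `exists_in_table` are made up for illustration. The loop bound is off by one, so a miss would read `table[4]`. Since that read is UB, the compiler is allowed to assume it never happens, and some compilers have been reported to fold the whole function to `return 1`.

```c
#include <stddef.h>

/* Hypothetical 4-element lookup table. */
static int table[4] = {1, 2, 3, 4};

/* Bug: the loop condition should be i < 4. */
int exists_in_table(int v)
{
    for (size_t i = 0; i <= 4; i++) {
        if (table[i] == v)
            return 1;   /* every defined execution ends here */
    }
    return 0;           /* reachable only after reading table[4], which is UB */
}
```

Calling `exists_in_table(3)` is well defined and returns 1. Calling it with a value that is not in the table has no defined result, regardless of what happens to sit after `table` in memory.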
C pointers can trivially be treated as addresses by the compiler, and LLVM will happily do that for you if you use int math rather than gep. The compiler is not optimizing based on the assumption that writing past the end of an array can’t happen unless you explicitly tell it to.
There is no way to disable that from being undefined either. There is no opt-out and no switch to pass to make it well defined. There's a concept of pointer provenance used in GCC/LLVM where pointers must be based on the object that they are being used to access. That's how it works.
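A rough sketch of what provenance means in practice, under the assumption that the two local arrays may or may not end up adjacent in memory. The function name `write_if_adjacent` is hypothetical. The sketch only performs the access that is allowed and leaves the disallowed one as a comment.

```c
#include <stdint.h>

int write_if_adjacent(int value)
{
    int a[2] = {1, 2};
    int b[2] = {3, 4};

    int *past_a  = a + 2;   /* one past the end of a: legal to form, not to dereference */
    int *start_b = b;       /* derived from b */

    /* The two pointers may hold the same address if b happens to follow a. */
    if ((uintptr_t)past_a == (uintptr_t)start_b) {
        /* *past_a = value;    UB even here: past_a is based on a, not b */
        *start_b = value;   /* fine: start_b is based on b */
    }
    return b[0];
}
```

The result is either 3 or `value`, depending on layout. The point is that equal addresses do not make the two pointers interchangeable: an optimizer that tracks provenance may assume a store through `past_a` cannot modify `b`.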
Passing -fno-strict-aliasing to disable outputting type-based alias metadata does not mean there is no alias analysis. It means there is no type-based alias analysis. There's still object-based alias analysis, and other optimizations that don't just treat pointers as addresses.
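To illustrate object-based alias analysis, here is a small sketch with two arrays of the same type, so type-based analysis has nothing to contribute and `-fno-strict-aliasing` should make no difference. The names `a`, `b` and `store_then_load` are illustrative only.

```c
#include <stddef.h>

static int a[4];
static int b[4];

int store_then_load(size_t i, int v)
{
    a[0] = 1;
    b[i] = v;       /* an in-bounds index can only touch b */
    return a[0];    /* so the compiler may fold this to the constant 1 */
}
```

For any `i` from 0 to 3 this returns 1. If `i` is out of bounds and the store happens to land on `a[0]`, a compiler that reasons per object may still return 1, because it assumes a store based on `b` cannot alias `a`.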
You can make it defined by turning it into int math, so it’s not a GEP anymore. Then there is no notion of objects or types that could have bounds. Whether or not that access hits valid memory depends on what the programmer did and is defined from the compiler’s standpoint.
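A hedged sketch of the two forms being contrasted, with made-up function names. With Clang, the first is typically lowered to a `getelementptr`, while the second typically becomes `ptrtoint`, an integer add, and `inttoptr`. Whether the integer version is really "defined" once it strays outside the object is exactly what this thread disagrees about, so the sketch stays in bounds and only shows the difference in form.

```c
#include <stddef.h>
#include <stdint.h>

/* Pointer arithmetic: the index is applied to a typed pointer. */
void store_ptr(int *base, size_t i, int v)
{
    base[i] = v;
}

/* Integer arithmetic: the address is computed as a plain number.
   The integer-to-pointer conversion is implementation-defined in C. */
void store_int(int *base, size_t i, int v)
{
    uintptr_t addr = (uintptr_t)base + i * sizeof(int);
    *(int *)addr = v;
}
```

For in-bounds indices both functions store to the same element on mainstream implementations.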