Sure, and you lose a whole lot of the optimizations that make C efficient. Pointers and aliasing are huge blockers for optimization, even with compilers extensively leveraging the aliasing / indexing guarantees, which go a long way toward making optimizations possible.
The most important optimizations still work in the C that I describe. I’ve tried variations. I think the only one that really hurts is strict aliasing. Without that you lose too much load elimination. But that one doesn’t produce full UB - it just creates extra load motion.
Type-based alias analysis (-fstrict-aliasing) is only a small portion of the overall alias analysis. The basic baseline alias analysis and reasoning about memory even without AA is extremely important for basic optimizations / code movement, and there's no switch for disabling it.
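A minimal sketch of the kind of load elimination being argued about here. The function name `store_then_reload` is hypothetical. Under type-based alias analysis a compiler is allowed to assume the `float` store cannot modify the `int` object, so it may return the constant 1 without reloading `*i`; with -fno-strict-aliasing it generally has to reload.

```c
/* Hypothetical illustration: with -fstrict-aliasing the compiler may
 * assume that an int object and a float object never overlap, so the
 * final read of *i can be folded to the constant 1. With
 * -fno-strict-aliasing the store through f might alias *i, so the
 * value generally has to be reloaded from memory. */
int store_then_reload(int *i, float *f)
{
    *i = 1;
    *f = 2.0f;   /* may or may not alias *i, depending on the rules in force */
    return *i;   /* candidate for load elimination */
}
```

Called with two distinct objects, as in `int x; float y; store_then_reload(&x, &y);`, the result is 1 under either setting; only the generated code differs.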
You're saying that you want all of that entirely disabled, with pointers treated as addresses. That's asking for a whole lot more than -fno-strict-aliasing and not marking pointer arithmetic inbounds (and the latter definitely has a huge impact on important loop optimizations, etc).
I'm certainly not saying that it can't be done or that it isn't useful but that you're understating how much needs to be changed and the impact of it. Memory corruption also doesn't become predictable, just *less* impacted by optimization. It's still always going to be impacted.
Here's something that's undefined: accessing uninitialized data.
int a;
if (cond1) {
    a = 5;
} else if (cond2) {
    a = 10;
}
use(a);
do_other_stuff();
use(a);
Let's say that !cond1 && !cond2, so a is never assigned. Should both calls to use(a) be guaranteed to read the same value from uninitialized data?
Of course they should see the same value. Holy cow you’re throwing the easiest examples at me. It’s like you’re making my point for me: that this is easy to fix but that some people need compiler educations.
Okay, and that also means that using MADV_FREE in malloc and elsewhere is not possible either, which is a massive performance cost. Uninitialized memory can and does change value at runtime beyond just compiler optimizations avoiding saving uninitialized data via spill / restore.
That's likely what glibc is going to be doing for their stack cache, since MADV_DONTNEED is a significant performance cost for their implementation, and it doesn't become a non-issue if restricted to malloc since it still means that uninitialized memory can change between reads.
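A rough sketch of why MADV_FREE makes the contents of "free" memory unstable, assuming Linux 4.5 or later and a libc that exposes `MADV_FREE`. The helper name `read_after_madv_free` is hypothetical. After the `madvise` call the kernel may reclaim the page at any time, so a later read can see either the old byte or zero.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: writes a byte to a fresh anonymous page, marks
 * the page MADV_FREE, then reads the byte back into *seen.
 * Returns 0 on success, -1 if MADV_FREE is unavailable or a call fails.
 * The observed byte is either the old value (page not yet reclaimed)
 * or 0 (page reclaimed and replaced by a zero page). */
int read_after_madv_free(unsigned char *seen)
{
#if defined(MADV_FREE) && defined(MAP_ANONYMOUS)
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    unsigned char *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    p[0] = 0x5A;                                   /* old contents */
    if (madvise(p, (size_t)page, MADV_FREE) != 0) {
        munmap(p, (size_t)page);
        return -1;                                 /* e.g. older kernel */
    }

    *seen = p[0];                                  /* 0x5A or 0 */
    munmap(p, (size_t)page);
    return 0;
#else
    (void)seen;
    return -1;
#endif
}
```

Two reads of the same location with no intervening write can therefore disagree, which is the behavior a "reads of uninitialized memory are stable" rule would forbid an allocator from exposing.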
Reading uninit data being undefined instead of locking it to an unspecified value permits massive optimizations like MADV_FREE and more efficient register allocation/spilling. Similarly, other memory safety issues being undefined permits optimization / freedom of implementation.
Many programs have bugs where they read data that has just been freed, but handle it being an arbitrary value. The issue is often benign with common allocators. However, with other implementations the access will fault and they crash. It's good that implementations aren't required to make it work.
Also, signed overflow being undefined rather than defined as wrapping means that more secure implementations where it traps are permitted. Passing -fsanitize=signed-integer-overflow -fsanitize-trap=signed-integer-overflow is standards compliant and used for hardening in AOSP.
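For code that has to stay correct whether overflow wraps or traps, one option is to check explicitly. This is a minimal sketch, not how AOSP does it: `checked_add` is a hypothetical name, and `__builtin_add_overflow` is a GCC/Clang builtin rather than ISO C.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: adds a and b without ever evaluating an
 * overflowing signed expression. Returns true and stores the sum in
 * *out on success; returns false if the mathematical result does not
 * fit in int. Relies on the GCC/Clang builtin __builtin_add_overflow. */
bool checked_add(int a, int b, int *out)
{
    return !__builtin_add_overflow(a, b, out);
}
```

For example, `checked_add(INT_MAX, 1, &r)` returns false instead of overflowing, so it behaves the same with or without the trapping sanitizer flags.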
This Tweet was deleted by the Tweet author.
MADV_FREE has nothing to do with rewriting code, or compiler optimizations, and yes it is certainly a massive optimization. Code that's reading uninitialized memory is undefined and dangerous. It doesn't make that substantially worse. It does make it act a bit weirder, sure.

