(Honestly, playing with the 8088 so much gives me some comfort that even on flat-address-space architectures, the whole address space can't be used for a single object in high-level languages :P)
twitter.com/DanielMicay/st I guess my follow-up is: what do you mean by it's unrealistic to avoid "x - y" when it overflows?
The sentence immediately after describes how gcc/llvm do it... it prevents your program from continuing :)!
Quote Tweet
Replying to @DanielMicay @brouhaha and @iximeow
Even with the C standard semantics, it's unrealistic to avoid x-y when it would overflow. GCC and LLVM don't give you the opportunity to try to use it correctly. It just isn't supported. It's one of many rules they don't really bother to document. It's how they intend it to work.
So you're basically saying "even though signed overflow is undefined, it's so pervasive in ptr contexts that gcc/llvm won't optimize it out/do cute tricks"?
No, what I'm saying is that even if objects larger than PTRDIFF_MAX were supported by LLVM and GCC, pointer difference overflows would still be undefined. Since it's so common to take differences on arbitrary slices, etc. it would still be undefined to allow making those objects.
LLVM and GCC do actively break code with signed integer overflows if you aren't passing -fwrapv or -fno-strict-overflow. They're particularly aggressive with breaking it for pointers. They barely have any integer range analysis, etc. so they don't break much with integers.
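A well-known instance of that breakage, as a hedged sketch (function names are mine): because signed overflow is undefined without -fwrapv, the compiler may assume `x + 1` never wraps and fold the naive overflow pre-check to a constant, silently deleting the very check the programmer wrote.

```c
#include <limits.h>

/* Classic pre-check that GCC and Clang are free to delete at -O2:
 * since signed overflow is undefined, the compiler may assume
 * x + 1 > x always holds, so this folds to 0 (unless -fwrapv). */
int overflow_check_broken(int x) {
    return x + 1 < x;            /* UB when x == INT_MAX */
}

/* Defined-behavior rewrite: compare against the limit instead,
 * so no overflowing computation is ever performed. */
int overflow_check_ok(int x) {
    return x == INT_MAX;
}
```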
It's not really that they avoid breaking code due to being cautious but rather they're terrible at doing analysis of integer ranges or optimizing based on it. It's a big part of why they both suck at removing bounds checks.
Most languages don't say signed integer overflow is undefined like C and LLVM/GCC won't hold back generic optimizations just to avoid breaking C code. They'll eventually add proper integer range analysis. C programmers can either use -fwrapv or have their undefined code break.
Passing -fwrapv hurts optimization a lot with Clang though. It removes inbounds from pointers too, and that doesn't only lose the non-wrapping guarantee but also the guarantee of it being within the bounds of the object up to one byte past the end.
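A sketch of the kind of check the `inbounds` marker lets Clang simplify (the function and its names are my illustration, not from the thread): with default flags the pointer arithmetic below is emitted as `getelementptr inbounds` in LLVM IR, so the compiler may assume `p + i` neither wraps nor leaves the object (up to one past the end) and can fold the `q >= p` half away; under -fwrapv that marker is dropped and both comparisons have to stay.

```c
#include <stddef.h>

/* Assumes the caller keeps i <= len, so forming p + i is defined. */
int in_range(const char *p, size_t i, size_t len) {
    const char *q = p + i;
    /* With inbounds, q >= p is provably always true and the whole
     * test can reduce to i < len; with -fwrapv it cannot. */
    return q >= p && q < p + len;
}
```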
I seem to vaguely recall a hypothetical C implementation where a buffer could be placed such that "one past the end" wrapped around the address space, and this was in fact legal.
But pointer arithmetic would have to be implemented specially to accommodate this edge case.
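The reason that edge case needs special handling: the standard guarantees you may form and compare a one-past-the-end pointer (the idiomatic loop bound below), so an implementation placing a buffer at the very top of the address space can't let that pointer naively wrap to 0. A small sketch, with names of my own choosing:

```c
/* Iterate using the one-past-the-end pointer as the loop bound.
 * `end` may be formed and compared, but never dereferenced. */
int sum_range(const int *p, const int *end) {
    int sum = 0;
    for (; p != end; ++p)
        sum += *p;
    return sum;
}

int demo_sum(void) {
    int a[4] = {1, 2, 3, 4};
    return sum_range(a, a + 4);   /* a + 4 is one past the end */
}
```

If `a + 4` wrapped to address 0, it would compare below `a` and equal to a null pointer, breaking both the loop condition and the standard's ordering guarantees; hence the specially implemented pointer arithmetic.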
LLVM/GCC just assume that no object can be located at 0 or the maximum possible address. They have a bunch of these unstated requirements for the runtime / libc. They also hard-wired behavior based on both C function names and compiler runtime function names within C spec rules.
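One visible consequence of the "no object at address 0" assumption, as a minimal sketch (the function is mine): the address of any real variable is assumed unequal to null, so comparisons like the one below are typically folded to a constant at -O1 and above.

```c
/* GCC and Clang assume no object lives at address 0, so the address
 * of a local variable compares unequal to a null pointer; the whole
 * function is typically folded to `return 1`. */
int stack_addr_is_nonnull(void) {
    int x = 0;
    int *p = &x;
    return p != (int *)0;
}
```

The same unstated contract is why the runtime/libc must never hand out an object at address 0 or one ending at the maximum address, or these folds become wrong.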