Conversation

Overflow in a pointer difference `x - y` is undefined behavior, but it's incredibly common to take differences on arbitrary slices, etc. in both C and Rust. You can do it in a tricky way with uintptr_t casts, but that has an impact on optimization / code generation.
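A minimal Rust sketch of the two approaches, for illustration only. The helper names `diff_same_object` and `diff_via_integers` are hypothetical and not from the thread. The first uses `offset_from`, which is only valid when both pointers belong to the same allocation; the second does the subtraction on integer addresses, where wrapping is well-defined but the compiler loses the relationship between the pointers.

```rust
/// Element distance between two positions in the same slice.
/// `offset_from` is an unsafe fn: both pointers must point into (or one past
/// the end of) the same allocation. This helper ensures that by deriving
/// both pointers from `s` after a bounds check.
fn diff_same_object(s: &[u8], i: usize, j: usize) -> isize {
    assert!(i <= s.len() && j <= s.len());
    let base = s.as_ptr();
    // SAFETY: i and j are at most s.len(), so both pointers are in bounds
    // or one past the end of the same allocation.
    unsafe { base.add(j).offset_from(base.add(i)) }
}

/// Address distance computed on integers (the uintptr_t-style workaround).
/// Wrapping subtraction is well-defined for any two addresses, but the
/// result is just a number with no pointer semantics attached.
fn diff_via_integers(x: *const u8, y: *const u8) -> usize {
    (y as usize).wrapping_sub(x as usize)
}
```

The integer version never invokes undefined behavior, which is the appeal, but it gives the optimizer less to work with than the same-object version.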
No, what I'm saying is that even if objects larger than PTRDIFF_MAX were supported by LLVM and GCC, pointer difference overflows would still be undefined. Since it's so common to take differences on arbitrary slices, etc., allowing those objects to be created would still lead to undefined behavior.
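As a small illustration of how this limit shows up on the Rust side: `Vec` is documented to never allocate more than `isize::MAX` bytes, and `isize` plays the role of `ptrdiff_t` on typical targets. The sketch below uses a hypothetical helper, `reservation_rejected`, to check that a request just over that limit is refused rather than producing an object whose pointer differences could overflow.

```rust
/// Returns true if a Vec<u8> refuses a request for `bytes` bytes of capacity.
/// `try_reserve_exact` reports failure as an Err instead of aborting.
fn reservation_rejected(bytes: usize) -> bool {
    let mut v: Vec<u8> = Vec::new();
    v.try_reserve_exact(bytes).is_err()
}
```

Calling `reservation_rejected(isize::MAX as usize + 1)` is expected to return `true`, while a small request such as 16 bytes should succeed.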
LLVM and GCC do actively break code with signed integer overflows if you aren't passing -fwrapv or -fno-strict-overflow. They're particularly aggressive with breaking it for pointers. They barely have any integer range analysis, etc. so they don't break much with integers.
Most languages don't say signed integer overflow is undefined like C does, and LLVM/GCC won't hold back generic optimizations just to avoid breaking C code. They'll eventually add proper integer range analysis. C programmers can either use -fwrapv or have their undefined code break.
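Rust is one example of a language where signed overflow is not undefined. A plain `+` panics when overflow checks are enabled (the default in debug builds) and wraps otherwise, and the intent can be stated explicitly with methods like `wrapping_add` and `checked_add`. A minimal sketch, with hypothetical helper names:

```rust
/// Two's-complement wraparound, always well-defined.
fn wrapping_increment(x: i32) -> i32 {
    x.wrapping_add(1)
}

/// Returns None instead of overflowing.
fn checked_increment(x: i32) -> Option<i32> {
    x.checked_add(1)
}
```

Since neither helper can invoke undefined behavior, the compiler has no license to assume the overflow never happens, which is the trade-off the thread is describing.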
Passing -fwrapv hurts optimization a lot with Clang though. It removes inbounds from pointers too, and that doesn't only lose the non-wrapping guarantee but also the guarantee of it being within the bounds of the object up to one byte past the end.
I seem to vaguely recall a hypothetical C implementation where a buffer could be placed such that "one past the end" wrapped around, and this was in fact legal. But ptr arithmetic would have to be impl'd specially to accommodate this edge case.
The inbounds marker is just a guarantee that the pointer arithmetic will result in a pointer within the bounds of the object. They define one byte past the end as a special case that's allowed. The part that goes beyond C spec are their runtime / libc assumptions.
Try twitter.com/DanielMicay/st with and without -fwrapv or -fno-strict-overflow with Clang. In theory, the inbounds marker could be split up into 2 separate markers to provide the no-overflow guarantee as a separate guarantee from being within the bounds of the object.
Quote Tweet
Replying to @DanielMicay and @iximeow
For example, in C: char *foo(char *x) { return x + 10; } Compile this with `clang foo.c -S -emit-llvm -o - -O2`. The function `foo` is a guarantee that `x` is not NULL and is at least 10 bytes large. The result is at most 1 byte past the end of `x`. It's a promise.
The only way you really get non-inbounds GEP from Clang is when you do stuff like casting to/from integers and it happens to compile that code back to GEP. Casting to/from integers is what gets incredibly sketchy and is arguably broken due to pointer provenance rules they use.
That touches on my follow up: Is there something preventing creating a new marker that says "one past the object is valid", while simultaneously saying "this pointer may wrap"? This could make -fwrapv hurt less? But you'd have to actually ensure the pointer doesn't wrap!
-fwrapv could be split up for integers and pointers. That's essentially what Rust does, since signed integer overflow is well-defined but most unsafe code uses ptr::add (which is an unsafe fn). Rust does expose ptr::wrapping_add for non-inbounds GEP (which is a safe fn).
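A minimal sketch of that split on the pointer side, with hypothetical helper names `element_ptr` and `far_pointer`. `add` carries the in-bounds promise and so needs `unsafe`, while `wrapping_add` makes no such promise and can be called from safe code; only dereferencing its result requires the pointer to be valid.

```rust
/// `add` is an unsafe fn: the caller promises the result stays within the
/// same allocation, or one past its end. The bounds check upholds that.
fn element_ptr(s: &[u8], i: usize) -> *const u8 {
    assert!(i <= s.len());
    // SAFETY: i <= s.len() was checked just above.
    unsafe { s.as_ptr().add(i) }
}

/// `wrapping_add` is a safe fn: it may produce any address, in bounds or
/// not, and wraps around the address space instead of being undefined.
fn far_pointer(p: *const u8, n: usize) -> *const u8 {
    p.wrapping_add(n)
}
```

Computing `far_pointer(buf.as_ptr(), 1_000_000)` on a 4-byte buffer is fine as long as the result is never dereferenced, whereas passing the same offset to `add` would be undefined behavior.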