The current C standard doesn't really standardize it. LLVM, GCC and likely other compilers come up with those rules themselves. They see it as the only reasonable approach, because C would be too hard to optimize otherwise. They'd still do it for other languages regardless.
The C standard regularly turns things into undefined behavior retroactively. The committee sees its job as largely standardizing real-world implementations. If compiler authors want something badly enough, they'll get it, because they'll do it and the standard will change. That's likely what will happen here.
The standard currently implies that optimization based on pointer provenance is not really a thing. It omits talking about it and says nothing about it being undefined. However, compilers do it, and the standard will likely be brought in line with what compilers choose to do.
Safe Rust code is actually meant to be properly sound and completely well defined / specified.
However, unsafe Rust code pretty much just follows the same kind of rules as C based on the LLVM choices. Despite not having a formal spec it's better specified in certain ways though.
There's less UB, even in unsafe Rust (though the documentation on UB admits the current list is nonexhaustive).
I vaguely remember Ralf Jung implying there’s more UB and it’s easier to exploit, since exploiting UB is guaranteed not to break safe Rust.
Perhaps that list should be updated on the Rust website then :P. Last I counted, the Nomicon listed about 10 different kinds of UB, and the C spec has what, 240?
Maybe?
Compare doc.rust-lang.org/nomicon/what-u, ralfj.de/blog/2020/12/1 and the papers on clang. What I study is C++, not Rust, but the Rust list doesn't look too convincing, assuming Ralf is right (which seems the most likely thing).
Rust doesn't do TBAA and doesn't really need it due to the &T and &mut T restrictions, so that's a lot of complexity that's avoided. It does have the pointer provenance rules, but LLVM doesn't really document what they do, and my generic wording was lost when they updated it.
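To make the &mut T point concrete, here is a minimal sketch. The function name `write_both` is made up for illustration, and whether a given rustc/LLVM version actually uses the no-alias fact to optimize is an assumption, not something this snippet shows. What it does show is that the borrow checker rules out the aliasing call, which is the guarantee C only gets from TBAA (for differently typed pointers) or `restrict`.

```rust
// Hypothetical helper, for illustration only.
//
// Two `&mut i32` parameters can never refer to the same `i32` in safe Rust,
// so the write through `b` cannot change `*a`. No type-based reasoning is
// needed, even though both references have the same pointee type.
fn write_both(a: &mut i32, b: &mut i32) -> i32 {
    *a = 1;
    *b = 2;
    *a // always 1: `b` is guaranteed not to alias `a`
}

fn main() {
    let mut x = 0;
    let mut y = 0;
    let r = write_both(&mut x, &mut y);
    println!("{r} {x} {y}");

    // The aliasing call does not compile; the borrow checker rejects a
    // second mutable borrow of `x`:
    // write_both(&mut x, &mut x);
}
```

With distinct variables the call returns 1 and leaves `x == 1`, `y == 2`. The equivalent C function taking two `int *` would have to reload `*a` unless both parameters were declared `restrict`.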
(generic wording? Do you have a link to the original, correct docs as you wrote them?)
The list there is almost the same as my original list but someone declared it complete/exhaustive when it originally didn't claim that, and the part about aliasing was ruined. Whoever replaced that seemed to miss a lot of the point. Not my problem to resolve though.
LLVM considers NoAlias to be about lack of memory dependencies between types. That means &T for types without interior mutability (which has to be declared) is NoAlias in the LLVM sense.
Also, that list is clearly missing the rules about immutability, among various other things.
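A rough sketch of the interior mutability distinction, using made-up function names (`read_around_plain`, `read_around_cell`). It only demonstrates the language-level rule; whether rustc actually attaches a no-alias attribute in each case is an assumption and isn't shown here.

```rust
use std::cell::Cell;

// Hypothetical helpers, for illustration only.

// `&i32` has no interior mutability: while this shared borrow is live,
// nothing `f` does in safe code can change `*x`, so both reads agree.
fn read_around_plain(x: &i32, f: impl FnOnce()) -> (i32, i32) {
    let before = *x;
    f();
    (before, *x)
}

// `Cell<i32>` declares interior mutability: `f` may hold another shared
// reference to the same cell and write through it, so the second read
// has to be performed again after the call.
fn read_around_cell(x: &Cell<i32>, f: impl FnOnce()) -> (i32, i32) {
    let before = x.get();
    f();
    (before, x.get())
}

fn main() {
    let plain = 5;
    println!("{:?}", read_around_plain(&plain, || {}));

    let cell = Cell::new(5);
    // The closure captures `&cell` and mutates it through the shared borrow.
    println!("{:?}", read_around_cell(&cell, || cell.set(6)));
}
```

The plain version yields `(5, 5)` and the `Cell` version yields `(5, 6)`, which is why only the first kind of `&T` can be treated as free of memory dependencies across the call.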
Rust's rules for UB in unsafe code will definitely be a lot simpler than C and it doesn't have UB in safe code.
I strongly dislike the style that the documentation was remade into, partly because it deliberately eschews accuracy in favor of being amusing and comforting.