The fact that LLVM has bugs doesn't really change the answer. Even a compiler that was designed to avoid UB could have miscompile bugs. Also, the context of the conversation allows for making changes to LLVM - so this seems like the kind of thing that could be fixed.
I guess I don't understand the context. It seems to be about C, and I don't see how you can resolve that problem for C without coming up with a model to enforce a form of memory safety. What is the scope of UB that should be avoided? You mean, for a language like Rust or Swift?
I was just responding about the infinite recursion thing, particularly because I don't believe that's actually undefined in C. The standard doesn't talk about a stack, let alone say that overflowing it is undefined or that infinite recursion can just be removed. I think that's wrong.
I don't think features like opt-in to undefined behavior on signed or unsigned overflow (nsw, nuw) are an issue in LLVM since frontends can avoid emitting it (like Rust and Swift). It's only a major issue when there isn't a decent alternative or when UB is poorly defined/unclear.
For example, various shifts, casts and floating point operations are undefined, so if you want to use them safely you need to wrap everything in branches which the compiler is unlikely to optimize out. I think the solution to that is just providing more specific variants of them.
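To make the "wrap everything in branches" point concrete, here is a minimal C sketch of what a frontend or programmer ends up writing today. The function names `shl_or_zero` and `f32_to_i32_sat` are made up for illustration, and the chosen results (0 for oversized shifts, saturation for out-of-range floats, 0 for NaN) are just one possible set of semantics, not something the thread specifies.

```c
#include <stdint.h>

/* Hypothetical helper: left shift that is defined for every shift amount.
   A bare `x << n` is undefined in C when n >= the width of x, so the
   guard branch is needed. Amounts >= 32 yield 0 here (an assumption). */
uint32_t shl_or_zero(uint32_t x, uint32_t n) {
    return n < 32 ? x << n : 0u;
}

/* Hypothetical helper: float -> int32 conversion that never hits the
   undefined out-of-range case. Saturates at the limits; NaN maps to 0
   (both choices are assumptions for this sketch). */
int32_t f32_to_i32_sat(float f) {
    if (f != f) return 0;                       /* NaN compares unequal to itself */
    if (f >= 2147483648.0f) return INT32_MAX;   /* 2^31 and above */
    if (f <= -2147483648.0f) return INT32_MIN;  /* -2^31 and below */
    return (int32_t)f;                          /* in range: truncates toward zero */
}
```

Whether the compiler folds those guards into a single native instruction depends on the target and the optimizer, which is the cost being described.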
If you want overflow to be handled a certain way, you would use the instruction for those semantics and it would just be lowered inefficiently on architectures without native support. If you want efficiency you could have the frontend switch it based on the architecture targeted.
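As a rough illustration of "one operation per semantics", the sketch below writes three signed-add variants by hand in C. The names `add_wrap`, `add_sat`, and `add_checked` are hypothetical; they stand in for the kind of distinct instructions being proposed and are not existing LLVM or C library APIs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wrapping add: computed in unsigned arithmetic, where wraparound is
   defined. Converting the result back to int32_t is implementation-defined
   before C23; GCC and Clang wrap modulo 2^32 (assumed here). */
int32_t add_wrap(int32_t a, int32_t b) {
    return (int32_t)((uint32_t)a + (uint32_t)b);
}

/* Saturating add: clamps to INT32_MAX / INT32_MIN instead of overflowing.
   The checks are done before the add, so the add itself never overflows. */
int32_t add_sat(int32_t a, int32_t b) {
    if (b > 0 && a > INT32_MAX - b) return INT32_MAX;
    if (b < 0 && a < INT32_MIN - b) return INT32_MIN;
    return a + b;
}

/* Checked add: returns false on overflow and leaves *out untouched. */
bool add_checked(int32_t a, int32_t b, int32_t *out) {
    if ((b > 0 && a > INT32_MAX - b) || (b < 0 && a < INT32_MIN - b)) {
        return false;
    }
    *out = a + b;
    return true;
}
```

A frontend would pick whichever variant matches the source language's rules, and a target without native support would get the slower branchy lowering.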
LLVM is overly focused on C and takes too many shortcuts that are allowed for C but don't work for other languages. When someone talks about these issues for C, I don't really get the point. These issues are what goes horribly wrong for C in the real world even in 1% of cases.
But you can make C not have the UB issues while retaining the ability to fuck with memory directly. Such a thing can be defined and it’s what lots of systems code expects.
No, memory unsafety is by far the biggest issue. The compiler certainly heavily optimizes with the assumption that memory safety is not violated. It's a far more extreme situation than optimizations based on the assumption that signed int overflow doesn't happen, etc.
Biggest issue for what? Crashes? Security? Or this conversation? Because this conversation isn’t about fixing the fact that C programs may crash or have other bugs. It’s about the compiler and spec exacerbating the problem with bugs that arise from an overly symbolic spec.
The compiler and specification certainly exacerbate the problem of spatial and temporal memory safety issues by not guaranteeing that they trap, which they could do and it could be quite realistically implemented. What makes that outside the scope of a compiler issue? It is one.
I don’t think that guaranteeing that they trap is what folks want. Ideally C would be a structured assembler with optimizations. It’s almost that. Silly reasoning about the symbolic meaning of memory breaks that.

