It's undefined behavior and defining it as zero is no more of a language dialect than guaranteeing a non-zero pattern or guaranteeing that it traps. Traps are something that can be relied upon and processed. It's no more of a language extension than any of the UBSan sanitizers.
The language leaves it up to compilers to choose how to implement implementation-defined and undefined behavior. LLVM historically didn't provide a way to get safe implementations of most undefined behaviors but -fsanitize=undefined -fsanitize-trap=undefined is exactly that.
How is that not a dialect? This code will work on compilers that implement zero-init and will likely fail randomly or have vulnerabilities on compilers that don't.
Perhaps we don't have a common understanding of what a dialect is.
They say that zero init is a dialect but initialization with any other byte pattern chosen by the developer is somehow not a dialect. That doesn't make any sense. Regardless of how a dialect is defined, either both of those are language dialects or neither of them is a dialect.
Similarly, if zero-init is a dialect, so is trapping or zero-or-trap. These are all ways of defining an undefined behavior which the language leaves up to compilers to handle. The -ftrivial-auto-var-init=zero switch does NOT make it correct to use uninitialized data. Still a bug.
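A minimal sketch of that last point, assuming a compiler that supports -ftrivial-auto-var-init=zero. The function name sum_values is made up for illustration. The explicit initializer is what makes the code correct everywhere; the switch only changes what a buggy read would observe.

```c
#include <stddef.h>

/* Hypothetical example function, not taken from the thread. */
int sum_values(const int *values, size_t count) {
    /* Explicit initialization: correct on every conforming compiler.
     * Writing `int total;` here would be a bug with or without
     * -ftrivial-auto-var-init=zero; the flag would only make the
     * buggy read observe 0 instead of leftover stack contents. */
    int total = 0;
    for (size_t i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}
```

With the initializer removed, the read of total is still an uninitialized read as far as the language is concerned, so warnings and sanitizers that look for it should still report it.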
as use of compiler zeroing increases, using not-explicitly-initialized data will stop being a bug, I hope. first in practice, then in the standards.
I don't want that, and it's a separate topic from choosing how the compiler generates code. It's not what this switch provides, since it only changes code generation and doesn't disable the warnings / sanitizers that treat this as an error. Why even have a special value?
Could have just supported pattern initialization without being concerned with which byte pattern is chosen. As is, this is an entirely developer hostile approach where they support pattern initialization to a chosen pattern as long as it isn't the one most people want to use.
If you believe that zeroing at the code generation level, with the variable still semantically uninitialized, is a dialect, then so is setting it to 0xFF. If there wasn't hostility towards developers and towards security, the switch would not have a bias against any particular value.
I see what you are all getting at.
Those other variants are all dialects too, in some sense. The difference is that they aren't likely to be used as a dialect in practice, because the behavior they add or specify mostly isn't useful for writing working programs.
They didn't have to add any special treatment of zero. They could just delete the code special casing zero and avoid supporting 'zero' as a flag entirely since it can just be specified as another byte pattern instead. They're either all dialects or none of them are dialects.
The other byte patterns are presumably mostly not useful to rely on though. A programmer doesn't say "boy, I hope this counter starts out as 0xBADF00D or whatever", but zero is special: other languages zero-init, C++ zero-inits in almost every case other than locals, etc.
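To make the "which value would anyone rely on" point concrete, here is a rough sketch that emulates a byte-fill policy by hand with memset. It is not what any compiler literally emits; counter_after_fill is a hypothetical helper and 0xAA is just an example fill byte.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for compiler auto-initialization: fill the
 * storage of a 32-bit counter with one repeated byte, then read it.
 * Every byte is written before the read, so this is well defined. */
uint32_t counter_after_fill(unsigned char fill_byte) {
    uint32_t counter;
    memset(&counter, fill_byte, sizeof counter);
    return counter;
}
```

Under this emulation counter_after_fill(0x00) yields 0, a value a loop counter could plausibly start from, while counter_after_fill(0xAA) yields 0xAAAAAAAA, which nobody would write code to depend on.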
Patterns with at least one zero byte are a valid NUL-terminated string. That's why it makes sense to guarantee a leading zero for heap and stack canaries. It doesn't mean that the language is now a dialect where non-terminated C string heap overflows are allowed.
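A small sketch of the leading-zero canary idea, under the assumption of an 8-byte canary stored as raw bytes. The names make_canary, bytes_read_as_c_string and CANARY_SIZE are invented here and don't correspond to any particular allocator or compiler runtime.

```c
#include <stddef.h>
#include <string.h>

enum { CANARY_SIZE = 8 };

/* Hypothetical helper: copy random bytes into a canary and force the
 * first byte in memory to zero. */
void make_canary(unsigned char canary[CANARY_SIZE],
                 const unsigned char random_bytes[CANARY_SIZE]) {
    memcpy(canary, random_bytes, CANARY_SIZE);
    canary[0] = 0;
}

/* Number of canary bytes a C string operation would read before
 * stopping at a NUL byte (CANARY_SIZE if there is none). */
size_t bytes_read_as_c_string(const unsigned char canary[CANARY_SIZE]) {
    const unsigned char *nul = memchr(canary, 0, CANARY_SIZE);
    return nul ? (size_t)(nul - canary) : (size_t)CANARY_SIZE;
}
```

With the leading zero in place, a string read that runs off the end of a neighbouring buffer stops at the first canary byte instead of leaking the random part. The overflow that got it there is still a bug.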


