GrapheneOS used to implement a local-init sanitizer for Clang to zero uninitialized variables in C and C++. In the development branch, this feature is once again globally enabled for Vanadium and the GrapheneOS kernel/userspace via Clang's new -ftrivial-auto-var-init=zero switch.
Zero initialization is hidden behind -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang. We consider it part of the stable API regardless and depend on the feature. If it's removed, it will break compatibility with GrapheneOS and we'll just fork Clang again.
The existence of this switch is due to hostility by LLVM developers towards making existing C and C++ code safer. The desire for zeroing was misrepresented as being solely about performance. They're in denial about the unsafety of real world C and C++ code including within LLVM.
In the real world, C and C++ code makes pervasive use of uninitialized data and often depends on it being zeroed in practice. Initialization with a non-zero value is a dangerous backwards incompatible change. It regularly turns inert latent bugs into exploitable vulnerabilities.
Replying to
That sounds like a significant exaggeration. I agree zeroing yields increased safety in event of bugs, but no C code "depends on" it intentionally. No real world implementation except the clang option gives zeros; normally you get whatever was there last.
Replying to
When you're deploying it for an entire OS, enabling zeroing on init doesn't uncover any bugs in practice while enabling filling with a non-zero value uncovers a lot of them. For whatever reason lots of code has developed unintentional dependencies on uninit data often being zero.
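A minimal sketch of why zero fill hides this kind of bug while a non-zero fill exposes it. The struct, field names, and helper functions are hypothetical and not taken from any real project. Reading truly uninitialized memory is undefined behavior, so the sketch uses memset to stand in for "whatever was in the stack slot before", which keeps every read well defined.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical options struct; the names are illustrative only. */
struct opts {
    int timeout_ms;
    int skip_auth;   /* the buggy initializer below never sets this */
};

/* Buggy initializer: sets timeout_ms but forgets skip_auth. */
static void opts_init_buggy(struct opts *o) {
    o->timeout_ms = 1000;
}

/*
 * Simulate a stack slot whose previous contents were `fill` bytes,
 * then run the buggy initializer on it. The memset is an assumption
 * standing in for stale stack data (or for the compiler's chosen
 * auto-init pattern); it makes the read of skip_auth well defined.
 */
static bool auth_skipped_with_fill(unsigned char fill) {
    struct opts o;
    memset(&o, fill, sizeof o);
    opts_init_buggy(&o);
    return o.skip_auth != 0;
}
```

With a fill of 0x00 the forgotten field reads as 0 and the code appears to work. With a fill such as 0xAA the same field reads as non-zero and the latent bug changes behavior.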
Replying to
But in practice it's not zero, except with that clang option or maybe the first time the stack reaches that depth. BTW I found some old uClinux minimal shell with such a bug. It blew up with dynamic linking because ldso had already used the stack.
Replying to
Overall, uninitialized data usually isn't zero, but it being zero is common enough that there are many cases where it's actually fairly reliably zero so that software somehow develops dependencies on it being zero. The dependency is often that there's one zero byte, not all zero.
Even after the initial usage of the stack, a fair bit of it is only used for padding or unused space in buffers so it remains zeroed. There's also a lot of code zeroing structures / arrays. It's by far the most pervasive byte. Somehow, software manages to depend on it. *shrug*
For malloc, I think it must be rare to use the final bytes of certain size classes, so code using C strings manages to depend on having zeroed padding at the end. I feel like a lot of code is written by just fixing all the obvious problems that come up until it appears to work...
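A rough sketch of the C string case, under stated assumptions: the 16-byte size class, the `sim_alloc` stand-in allocator, and the buggy `dup_buggy` helper are all hypothetical. The stand-in fills the whole slot with a chosen byte so the demo never reads indeterminate memory, and the length scan is bounded to the slot so it never reads outside the allocation.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SIZE_CLASS 16u  /* assumed size-class granularity, illustrative only */

/* Round a request up to the next multiple of the assumed size class. */
static size_t round_up(size_t n) {
    return (n + SIZE_CLASS - 1) / SIZE_CLASS * SIZE_CLASS;
}

/* Stand-in allocator: rounds up to a slot and fills the slot with `fill`. */
static char *sim_alloc(size_t n, unsigned char fill) {
    size_t slot = round_up(n);
    char *p = malloc(slot);
    if (p) memset(p, fill, slot);
    return p;
}

/* Buggy duplicate: allocates strlen(s) bytes and never writes a terminator. */
static char *dup_buggy(const char *s, unsigned char fill) {
    size_t n = strlen(s);
    char *p = sim_alloc(n, fill);
    if (p) memcpy(p, s, n);
    return p;
}

/*
 * Length a string consumer would observe, scanning at most one slot.
 * Returns the slot size when no terminator is found inside the slot,
 * and 0 if the allocation fails.
 */
static size_t observed_len(const char *s, unsigned char fill) {
    size_t slot = round_up(strlen(s));
    char *p = dup_buggy(s, fill);
    if (!p) return 0;
    const char *nul = memchr(p, '\0', slot);
    size_t len = nul ? (size_t)(nul - p) : slot;
    free(p);
    return len;
}
```

With zeroed slack, `observed_len("hello", 0x00)` finds a terminator right after the copied bytes, so the missing terminator goes unnoticed. With a non-zero fill there is no terminator anywhere in the slot, and a real unbounded `strlen` would run past the allocation.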
The problems we run into these days are with apps and drivers. Camera and Wi-Fi drivers are both really horrifying and it's not exclusive to a specific brand of them. Atheros and Broadcom both have some really horrifying, awful code and they love uninit data and use-after-free.
If there is little trust in that code (i.e. use-after-free and use-of-uninitialized-variable bugs are likely), why trust the remaining parts of the very same code, most notably the logic?
Who said anything about trusting the code? Taking networking as an example, it doesn't matter if network drivers or the TCP/IP stack screw up their own security. What matters is that they don't screw up the security of the rest of the system, such as by giving an attacker RCE.