My preferred solution to that problem is deprecating fork and treating nontrivial use of fork as a nasty detail the rest of the program has to tiptoe around, but languages that chose to expose fork rejected that. 😣
So I think the case where it's really hard to assign blame/responsibility for a fix is a program in an affected language (Ruby?) that uses fork and indirectly pulls in a module with FFI to a C library using threads the author wasn't even aware of.
It's a very common issue but the race condition often has a very low chance of happening in practice, so it mostly goes unnoticed. Users will notice that applications sometimes fail to start or perform an operation but it's hard for them to figure out what could be causing that.
Yep. With malloc it's noticed quickly, and glibc makes malloc work, but only malloc. Other stuff just goes unnoticed because the races are hard to hit.
Even with malloc, applications often don't trigger the race condition regularly. In our experience some cases only happen 1/20 times, etc.
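To make the hazard concrete, here is a minimal sketch of the general fork-with-threads problem described above. It is not taken from any of the projects mentioned; the function name `child_sees_lock_held` is made up for illustration. A helper thread deliberately holds a lock across the fork, so the outcome is deterministic here, whereas in real programs it depends on timing, which is why the race is rarely hit.

```python
import os
import threading
import warnings


def child_sees_lock_held() -> bool:
    """Fork while a helper thread holds a lock.

    Returns True if the child found the lock permanently held.
    (Hypothetical helper, for illustration only.)
    """
    lock = threading.Lock()
    acquired = threading.Event()
    release = threading.Event()

    def helper() -> None:
        with lock:
            acquired.set()
            release.wait()  # keep holding the lock while the parent forks

    thread = threading.Thread(target=helper)
    thread.start()
    acquired.wait()  # make sure the lock is held before forking

    with warnings.catch_warnings():
        # Python 3.12+ warns about fork() in a multi-threaded process.
        warnings.simplefilter("ignore", DeprecationWarning)
        pid = os.fork()

    if pid == 0:
        # Only the forking thread exists in the child. The helper thread
        # was not copied, so nothing here will ever release the lock.
        got_it = lock.acquire(timeout=0.5)
        os._exit(0 if got_it else 1)

    _, status = os.waitpid(pid, 0)
    release.set()
    thread.join()
    return os.WEXITSTATUS(status) == 1


if __name__ == "__main__":
    print("child found the lock stuck:", child_sees_lock_held())
```

In a real application the lock would live inside some library (an allocator, a logger, a C library reached through FFI), and the child would simply hang or fail on the rare occasion that the fork landed while another thread held it.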
Android uses a fork-based spawning model by default and they had to make everything used by the app runtime work, but not beyond that.
The runtime for managed code (Java, Kotlin) is explicitly designed to support this and take full advantage of it via optimizations. Some hooks like android.googlesource.com/platform/art/+ solely exist to support this spawning model. Zygote uses this to pause making new threads, as one example.
We list out some of the disadvantages in grapheneos.org/usage#exec-spa. It's a nasty optimization. iOS has similar optimizations with similar downsides but the implementation is different. Their linker is a system service and shares a mapping with the same ASLR base / relocs applied.
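One well-known consequence of spawning by fork without exec is that every child inherits the parent's address-space layout. Whether that is among the disadvantages listed on the linked page is not stated here, so treat this as a general illustration only. The sketch below assumes CPython, where `id()` happens to be the object's memory address; `child_address_of` and `SHARED_TABLE` are hypothetical names.

```python
import os


def child_address_of(obj: object) -> int:
    """Fork a child and return the address it sees for `obj`.

    Relies on the CPython implementation detail that id() is the
    object's memory address. (Hypothetical helper, for illustration.)
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(read_fd)
        os.write(write_fd, str(id(obj)).encode())
        os._exit(0)

    os.close(write_fd)
    with os.fdopen(read_fd) as pipe:
        data = pipe.read()  # reads until the child exits and closes its end
    os.waitpid(pid, 0)
    return int(data)


# Stand-in for state initialized once in a zygote-like parent.
SHARED_TABLE = bytearray(4096)

if __name__ == "__main__":
    first = child_address_of(SHARED_TABLE)
    second = child_address_of(SHARED_TABLE)
    # Both children report the parent's address: the layout is inherited,
    # not re-randomized per process.
    print(first == second == id(SHARED_TABLE))
```

A process started with exec gets a fresh address space instead, which is the trade-off against the startup and memory savings.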
There are applications on desktop Linux using a zygote-based spawning model though. I'm sure you're aware of some of those, but you may not realize that Chromium does it on desktop Linux:
chromium.googlesource.com/chromium/src.g
What we dislike about the Zygote spawning model existing is that it makes it easy for them to implement optimizations simply by loading libraries and doing initialization in the Zygote. Since that code only runs once per architecture during boot, it's not as heavily optimized.
They explicitly do a bunch of preloading work to run a bunch of initialization needed by nearly every app in the Zygote:
android.googlesource.com/platform/frame
It saves a lot of time, especially on lower-end devices. It also saves a lot of memory by reusing loaded resources, relocations, etc.
android.googlesource.com/platform/libco is the rest of the preloading. You can see they do things like preloading the ICU locale and timezone, so it only has to run once per architecture and any dirty pages end up being shared anyway. It has a substantial impact overall.
On the positive side, since they now officially support wrappers as a development tool in production releases, they keep exec-based spawning working upstream... but their implementation is ridiculously slow, and we lack the time to deal with more than the lowest-hanging fruit.

