There's more to come soon, including high-level design for the new malloc. Might post after I get a bite to eat; if not, tomorrow.
The quick version: the design I've mostly worked out now is in some sense slab-like, but with the flexibility to avoid being overly eager to grow, and without tying slab-structure assumptions into the layer that will be the fast path.
That is to say, the outer layer that works with an "active group" of slots per size class could just as well be hooked up to the "bump allocator with free" in the non-candidates post, with no change to logic.
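
A rough sketch of that layering in C, to make the separation concrete. Everything here is hypothetical and for illustration only: the names (`struct backend`, `outer_alloc`, `outer_free`, `bump_get_group`), the 32-slot groups, and the size classes are assumptions, not the actual design. The outer layer only sees a `get_group` callback, so a slab-like backend could be swapped for the bump one without touching the fast-path logic. The bump backend here is deliberately minimal and leaves out its "free" side.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch; none of these names come from a real allocator. */
enum { SLOTS_PER_GROUP = 32, NUM_CLASSES = 4 };

/* The only thing the outer layer knows about where groups come from. */
struct backend {
    void *(*get_group)(size_t bytes);
};

/* One active group of equal-sized slots per size class. */
struct active_group {
    unsigned char *base;  /* start of the group's storage, NULL if none */
    uint32_t free_mask;   /* bit i set means slot i is free */
};

struct outer {
    const struct backend *be;
    struct active_group active[NUM_CLASSES];
};

/* Assumed size classes: 16, 32, 64, 128 bytes. */
static size_t class_size(int sc) { return (size_t)16 << sc; }

static void *outer_alloc(struct outer *o, int sc)
{
    struct active_group *g = &o->active[sc];
    if (!g->free_mask) {
        /* Slow path: active group is exhausted, ask the backend for another.
         * This sketch simply forgets the old group; a real design would
         * keep tracking it. */
        void *p = o->be->get_group(class_size(sc) * SLOTS_PER_GROUP);
        if (!p) return NULL;
        g->base = p;
        g->free_mask = UINT32_MAX;
    }
    /* Fast path: take the lowest free slot. */
    unsigned i = 0;
    while (!((g->free_mask >> i) & 1)) i++;
    g->free_mask &= ~((uint32_t)1 << i);
    return g->base + i * class_size(sc);
}

/* Returns 1 if p belonged to the active group and was freed, else 0. */
static int outer_free(struct outer *o, int sc, void *p)
{
    struct active_group *g = &o->active[sc];
    size_t sz = class_size(sc);
    unsigned char *q = p;
    if (!g->base || q < g->base || q >= g->base + sz * SLOTS_PER_GROUP)
        return 0;
    size_t idx = (size_t)(q - g->base) / sz;
    assert(!((g->free_mask >> idx) & 1)); /* catches a double free */
    g->free_mask |= (uint32_t)1 << idx;
    return 1;
}

/* Minimal bump backend over a static arena (no free, to keep this short). */
static _Alignas(16) unsigned char arena[1 << 16];
static size_t arena_used;

static void *bump_get_group(size_t bytes)
{
    if (bytes > sizeof arena - arena_used) return NULL;
    void *p = arena + arena_used;
    arena_used += bytes;
    return p;
}

static const struct backend bump_backend = { bump_get_group };
```

Usage would be along the lines of `struct outer o = { .be = &bump_backend };` followed by `outer_alloc(&o, 0)`. Note that `outer_alloc` and `outer_free` never look at how the backend lays out memory, which is the point being made above.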
No, no per-cpu/thread caching. That compromises optimal use of free memory (non-fragmentation) and security properties by inherently lacking a globally consistent view of what's free.
I think it's also problematic that the batching caused by thread caches increases latency at the 95th percentile and above. Thread caches can also use a ridiculous amount of memory as the number of threads scales up, especially if they aren't tiny caches only for the smallest sizes.
For hardened_malloc, it's specifically only for 64-bit systems with a fair bit of resources and likely without tons and tons of tiny processes. The focus for performance / memory usage is on the worst cases involving fragmentation, latency, huge numbers of threads, and so on.