NanoMalloc on Darwin does that!
github.com/GrapheneOS/har uses completely out-of-line metadata and determines the size of slab allocations solely from their address. It isn't a performance-oriented allocator, though it has design decisions and tricks that could be reused in a performance-oriented design.
Well, you don't have to look nearly as far. zalloc (XNU's allocator) has had, for half a release already:
- 100% out-of-line metadata
- VA sequestering (isolation)
- dynamically scaling per-CPU caching (driven by contention)
- zero on free (checked on alloc)
(And since iOS 15 type isolation too)
The type of elements is derived from the pointer address too, very much like GrapheneOS's allocator.
Nano is funnier though because it has a pre-partitioned way to allocate its slabs …
… and each of its 64M regions has pre-tuned sub-regions, so the size class is literally derived from a range of bits of your pointer (bits 12-20 or something like that). The actual mapping is scrambled, but it's a single piece of constant out-of-line metadata.
hardened_malloc reserves the whole slab allocation region with a randomized sub-region for each size class in advance, which sounds somewhat similar. There's also a metadata region reserved in advance for all the mutable state including for slab allocations and large allocations.
So instead of quarantines, the caching is used as a cheap, poor man's FIFO queue. The more you use a zone, the more elements rotate, and the more likely you are to catch UaFs or double frees. That's it. I just don't think quarantines are worth it in an adversarial context.
For large allocations, the quarantines are cheap since it's just wasted virtual memory. The slab allocation quarantines are quite expensive both in terms of memory and performance. They're one of the optional features rather than a core part of it. Could be made leaner though.
Until recently it also wasn't possible to do real per-CPU caching in userspace on Linux. There's now a restartable sequences API so it could be implemented. Since we probably wouldn't use it ourselves in GrapheneOS, we haven't bothered to look into implementing that though.