It will trigger the -fsanitize=object-size (included in -fsanitize=undefined) and _FORTIFY_SOURCE checks.
Using malloc_object_size is also not efficient. In modern malloc implementations, it usually requires reading out-of-line metadata since they avoid per-allocation metadata.
Extended malloc/realloc APIs providing the size as an output parameter are more efficient, but they're still not ideal and run into the same alloc_size issue. The best way to do it is really just having the collection types be aware of the malloc size classes to choose good sizes.
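A minimal sketch of what "aware of the malloc size classes" could look like for a growable array. The size class table, `round_up_to_size_class` and `grow_capacity` are all hypothetical names and values for illustration, not any particular allocator's table or API; a real collection would mirror the classes of the allocator it actually links against.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative size classes only (assumption): not taken from any
 * specific malloc implementation. */
static const size_t size_classes[] = {
    16, 32, 48, 64, 80, 96, 112, 128,
    160, 192, 224, 256, 320, 384, 448, 512,
};
#define N_SIZE_CLASSES (sizeof(size_classes) / sizeof(size_classes[0]))

/* Round a byte count up to the smallest class that fits it.
 * Requests larger than the biggest class are returned unchanged. */
static size_t round_up_to_size_class(size_t bytes) {
    for (size_t i = 0; i < N_SIZE_CLASSES; i++) {
        if (bytes <= size_classes[i])
            return size_classes[i];
    }
    return bytes;
}

/* Pick a new capacity (in elements): double the old capacity, then
 * claim whatever slack the size class would otherwise waste.
 * Overflow checking is omitted to keep the sketch short. */
static size_t grow_capacity(size_t old_cap, size_t elem_size) {
    assert(elem_size > 0);
    size_t want = old_cap ? old_cap * 2 : 1;
    size_t bytes = round_up_to_size_class(want * elem_size);
    return bytes / elem_size;
}
```

With 4-byte elements and an old capacity of 5, doubling asks for 40 bytes, which this table rounds up to 48, so the collection can use 12 elements instead of 10 without ever asking the allocator for the usable size.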
Makes sense. I deal in small allocations so much I forgot about large ones 😂 thanks for the info!
(And yeah, I’ve been looking into tuning collection growth strategies specifically to avoid the overhead of asking for the size. It’s particularly steep for us for silly reasons)
github.com/GrapheneOS/har uses completely out-of-line metadata and knows the size of slab allocations solely based on their address. It isn't a performance-oriented allocator though. Has design decisions and tricks which could be used in a performance-oriented design though.
Well, you don’t have to look nearly as far. Zalloc (XNU’s allocator) has (for half a release already):
- 100% out of line metadata
- va sequestering (isolation)
- dynamically scaling cpu caching (driven by contention)
- zero on free (checked on alloc)
(And since iOS 15 type isolation too)
The type of elements is derived from the pointer address too very much like graphene’s allocator.
Nano is funnier though because it has a pre-partitioned way to allocate its slabs …
… and all 64M regions of it will have pre-tuned regions, and the size class is literally derived from a range of bits of your pointer (bits 12-20 or something like that). The actual mapping is scrambled but is a single piece of constant out-of-line metadata.
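A rough sketch of "size class from a range of pointer bits", assuming a 4-bit field starting at bit 12 so the table stays short (the thread's guess is a wider field). `SLOT_SHIFT`, `SLOT_BITS`, `slot_size` and `size_from_addr` are hypothetical names, and the table contents are made up; this is not Nano's actual layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the slot index lives in bits 12..15 of the address. */
#define SLOT_SHIFT 12u
#define SLOT_BITS  4u
#define SLOT_MASK  ((1u << SLOT_BITS) - 1u)

/* Constant, out-of-line, deliberately non-monotonic ("scrambled")
 * mapping from slot index to allocation size. Values are invented. */
static const uint16_t slot_size[] = {
    64, 16, 128, 32, 256, 48, 96, 80,
    16, 112, 32, 64, 48, 128, 80, 256,
};
static_assert(sizeof(slot_size) / sizeof(slot_size[0]) == (1u << SLOT_BITS),
              "one table entry per slot");

/* The size is a pure function of the address: no header next to the
 * allocation and no mutable metadata is read. */
static size_t size_from_addr(uintptr_t addr) {
    size_t slot = (size_t)((addr >> SLOT_SHIFT) & SLOT_MASK);
    return slot_size[slot];
}
```

Every address within the same 4 KiB slot maps to the same size, so a free or size query needs only a shift, a mask and one load from read-only data.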
hardened_malloc reserves the whole slab allocation region with a randomized sub-region for each size class in advance, which sounds somewhat similar. There's also a metadata region reserved in advance for all the mutable state including for slab allocations and large allocations.
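As a sketch of how a reserved region with one sub-region per size class lets the size be recovered from the address alone: the constants, the `slab_region` struct and both functions below are hypothetical and much simplified (four classes, equal-sized sub-regions, no guard gaps, 64-bit address space assumed), not hardened_malloc's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N_CLASSES 4u
/* Assumption: every size class gets an equal 1 GiB slice of the
 * reserved range. */
#define CLASS_REGION_SIZE (UINT64_C(1) << 30)

static const size_t class_sizes[N_CLASSES] = {16, 32, 48, 64};

/* Base address of the whole reserved slab range, chosen once at init. */
struct slab_region {
    uint64_t start;
};

/* Non-slab (large) allocations fall outside the reserved range. */
static int in_slab_region(const struct slab_region *r, uint64_t addr) {
    return addr >= r->start &&
           addr - r->start < (uint64_t)N_CLASSES * CLASS_REGION_SIZE;
}

/* Which slice the address falls in determines its size class. A random
 * offset of the usable area inside each slice would not change this. */
static size_t slab_size_from_addr(const struct slab_region *r, uint64_t addr) {
    assert(in_slab_region(r, addr));
    size_t class_index = (size_t)((addr - r->start) / CLASS_REGION_SIZE);
    return class_sizes[class_index];
}
```

Because the range is reserved up front, the only per-process state the lookup touches is the base address; everything else is arithmetic on the pointer.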
It uses jemalloc-style arenas but doesn't use thread caching since it would interfere with the hardening. It could do allocation queues for batches of allocations.
Slab allocation quarantines are very heavy on memory so caching more doesn't really make sense right now.
It's hard to come up with any significant improvements right now without having memory tagging. Memory tagging will make things a lot better. At some point, it might make sense to start completely from scratch with everything fully designed around memory tagging though.
I don’t think caching interferes with hardening. There’s very little you can do in practice against a UaF or double free in an allocator. An attacker can always “spray more” to exhaust quarantines. It’s a lost battle and cost used that way is lost to more meaningful defenses.



