Consider a loop that calls malloc(16) and immediately frees the result, repeatedly. Normally the allocations will always be in cache, but in an allocator that never reuses memory, every new allocation will land on a fresh page and miss.
Indeed this example will be slow, but who calls malloc/free in a hot loop?
There's also a much more granular safety story - you could allow some of your trusted dependencies to be compiled in ReleaseFast mode but most of your application in ReleaseSafe.
We've had a long time to determine whether it matters in practice that the memory malloc returns is in cache, and the answer is yes, it does.
The state-of-the-art in hardened malloc is github.com/GrapheneOS/har and it does quarantine for large objects only (I think, could be wrong).
It has optional quarantines for both small and large allocations as separate features. Large means an mmap region with guards.
In both cases, it has a random quarantine based on swapping with a random element in an array, plus a ring buffer used as a queue for a deterministic delay.
For the large allocation quarantine, it replaces the allocation with fresh PROT_NONE pages so it's essentially a virtual memory quarantine.
Either way, it always deterministically detects any invalid free of any pointer that's not a valid non-free, non-quarantined allocation.
By default, the configuration is very security centric with all security features enabled other than the somewhat superfluous MPK-based metadata protection.
Quarantines are largely separate from the core code other than slab metadata having an extra bitmap for quarantined slots.
Have you done measurements of the results of quarantining all allocations forever? I'm skeptical but interested in seeing results.
It wouldn't work for small allocations as currently implemented. It doesn't currently pay the cost of aggressively purging memory held onto by the slab allocation quarantine. That's something we're working on improving. For large allocations... you'll run out of address space.
My strategy for this is to use an atomic, monotonically increasing address hint for mmap, and to unmap pages on free. The guarantees are weaker, but it will wrap around and start using the beginning of the address space once the kernel decides the hint is out of range.
There are generally other users of mmap though and the Linux kernel does O(log n) best fit allocation for mmap these days. Tracking the mappings is very cheap since it's literally just a ring buffer with pointers so it's a single pointer for every freed > 128k allocation.
This is the large allocation (region) quarantine implementation:
github.com/GrapheneOS/har
It swaps with a random slot in an array for the random portion and then swaps with the next slot in the ring buffer. Both can be configured to be as large as desired.
The slab allocation quarantine is the same concept. In that case, it's a per-size-class-allocator random array + ring buffer. It has a bit more complexity since it needs an extra bitmap in the slab metadata to track quarantined slots. The main issue with that is wasted memory.