By default, the configuration is very security-centric, with all security features enabled other than the somewhat superfluous MPK-based metadata protection.
Quarantines are largely separate from the core code other than slab metadata having an extra bitmap for quarantined slots.
Have you done measurements of the results of quarantining all allocations forever? I'm skeptical, but interested in seeing results.
It wouldn't work for small allocations as currently implemented. It doesn't currently pay the cost of aggressively purging memory held onto by the slab allocation quarantine. That's something we're working on improving. For large allocations... you'll run out of address space.
My strategy for this is to use an atomic, monotonically increasing address hint to mmap, and to unmap pages on free. The guarantees are weaker, but it will wrap around and start using the beginning of the address space once the kernel decides the hint is out of range.
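A minimal sketch of that strategy, assuming page-aligned sizes; the names and starting hint are hypothetical, not taken from any real allocator:

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Process-wide hint, bumped atomically so each mapping gets a fresh range. */
static _Atomic uintptr_t mmap_hint = (uintptr_t)1 << 44; /* hypothetical start */

static void *alloc_pages(size_t size) { /* size assumed page-aligned */
    uintptr_t hint = atomic_fetch_add(&mmap_hint, size);
    /* Without MAP_FIXED the address is only a hint: once the kernel deems
       it out of range, it falls back to its own placement, which is the
       wraparound behavior described above. */
    void *p = mmap((void *)hint, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```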
There are generally other users of mmap, though, and the Linux kernel does O(log n) best-fit allocation for mmap these days. Tracking the mappings is very cheap since it's literally just a ring buffer of pointers: a single pointer for every freed >128k allocation.
This is the large allocation (region) quarantine implementation:
github.com/GrapheneOS/har
A freed region swaps with a random slot in an array (the random portion), and the evicted entry then swaps with the next slot in the ring buffer. Both can be configured to be as large as desired.
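A hedged sketch of that two-stage scheme; the sizes, names, and rand()-based randomness here are placeholders (the linked implementation uses proper randomness and configurable sizes):

```c
#include <stddef.h>
#include <stdlib.h>

#define RANDOM_SLOTS 256  /* hypothetical */
#define RING_SLOTS   1024 /* hypothetical */

static void *random_array[RANDOM_SLOTS];
static void *ring[RING_SLOTS];
static size_t ring_index;

/* Returns the region that finally leaves quarantine and may be unmapped;
   NULL until the quarantine has filled up. */
static void *quarantine_swap(void *freed) {
    /* Random portion: the freed region displaces a random slot. */
    size_t i = (size_t)rand() % RANDOM_SLOTS; /* stand-in for a CSPRNG */
    void *evicted = random_array[i];
    random_array[i] = freed;

    /* Ring buffer portion: the displaced entry swaps with the next slot,
       giving a guaranteed minimum delay before address reuse. */
    void *out = ring[ring_index];
    ring[ring_index] = evicted;
    ring_index = (ring_index + 1) % RING_SLOTS;
    return out;
}
```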
The slab allocation quarantine is the same concept. In that case, it's a per-size-class-allocator random array + ring buffer. It has a bit more complexity since it needs an extra bitmap in the slab metadata to track quarantined slots. The main issue with that is wasted memory.
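As a rough illustration, the extra bitmap might look like this; the struct layout and names are assumptions, not the actual metadata:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct slab_metadata {
    uint64_t bitmap[4];            /* which slots are currently allocated */
    uint64_t quarantine_bitmap[4]; /* which free slots are still quarantined
                                      and must not be handed out yet */
};

static void set_quarantined(struct slab_metadata *m, size_t slot) {
    m->quarantine_bitmap[slot / 64] |= UINT64_C(1) << (slot % 64);
}

static bool is_quarantined(const struct slab_metadata *m, size_t slot) {
    return m->quarantine_bitmap[slot / 64] >> (slot % 64) & 1;
}
```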
For slab allocations, the cost of enabling all optional security features adds up. Zero-on-free, the write-after-free check (which verifies memory is still zero at allocation time), canaries, etc. aren't that bad individually, but together they're significant. The slab allocation quarantine is pretty expensive itself.
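A sketch of how zero-on-free and the write-after-free check fit together, under the simplifying assumption that the whole slot is zeroed and checked:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

static void on_free(void *p, size_t size) {
    memset(p, 0, size); /* zero on free */
}

static void check_on_alloc(const void *p, size_t size) {
    /* Write-after-free check: the slot was zeroed when freed, so any
       non-zero byte means something wrote to it while it was free. */
    const unsigned char *b = p;
    for (size_t i = 0; i < size; i++) {
        if (b[i] != 0)
            abort(); /* use-after-free / heap corruption detected */
    }
}
```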
hardened_malloc doesn't do thread caching for security reasons since it's incompatible with deterministic invalid free detection and other features.
If you compare performance with the expensive optional security features disabled against glibc malloc without tcache, they're comparable.
The core design (fully out-of-line metadata, deterministic detection of invalid frees, isolated regions for each size class, never reusing address space between size class regions or for metadata, etc.) has performance comparable to the core glibc malloc design, though.
The way hardened_malloc does slab allocation is the heaviest usage of address space. It makes an entirely dedicated PROT_NONE region for every arena. Inside that region, it has entirely dedicated regions for each size class. It also makes a dedicated region for all metadata ever.
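A hedged sketch of that reservation pattern: reserve everything PROT_NONE up front, commit slabs on demand, and purge them back when empty. The madvise/mprotect pair here is an illustration, not necessarily how the real code does it:

```c
#include <stddef.h>
#include <sys/mman.h>

/* Reserve address space without committing memory. PROT_NONE pages aren't
   charged against commit, which is why this works without overcommit. */
static void *reserve_region(size_t size) {
    void *p = mmap(NULL, size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Commit a slab inside the reservation when it's first needed. */
static int commit_slab(void *slab, size_t size) {
    return mprotect(slab, size, PROT_READ | PROT_WRITE);
}

/* Return an empty slab: purge its pages and drop access again. */
static int purge_slab(void *slab, size_t size) {
    if (madvise(slab, size, MADV_DONTNEED) != 0)
        return -1;
    return mprotect(slab, size, PROT_NONE);
}
```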
At free time, it distinguishes between small and large allocations based on whether the pointer is in the slab allocation region. It then determines the arena and size class, and finds the slab metadata, from the address alone. This is why it uses libdivide.
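Roughly, the free-path lookup works like this; constants and names are hypothetical, and plain division stands in for libdivide:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CLASS_REGION_SIZE ((uintptr_t)1 << 40) /* hypothetical per-class size */

extern uintptr_t slab_region_start, slab_region_end; /* the arena reservation */

/* Free-time classification is a simple range check. */
static bool is_slab_allocation(uintptr_t p) {
    return p >= slab_region_start && p < slab_region_end;
}

/* Each size class owns a fixed sub-region, so the class follows from the
   offset alone. */
static size_t size_class_of(uintptr_t p) {
    return (p - slab_region_start) / CLASS_REGION_SIZE;
}

/* Slot index within a slab is offset / slot size. Classes like 48 or 96
   aren't powers of two, so this can't be a shift; this division is what
   libdivide accelerates. */
static size_t slot_index(uintptr_t p, uintptr_t slab_start, size_t slot_size) {
    return (p - slab_start) / slot_size;
}
```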
Large allocations are just in a hash table. The space it needs for that is reserved alongside all the metadata it will ever need for slabs. It works really well and is perfectly compatible with non-overcommit systems since it's all PROT_NONE and purged when it's not in use.
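As a sketch, that could be a fixed-capacity open-addressing table living in the reserved metadata space; the hash function and capacity here are assumptions:

```c
#include <stddef.h>
#include <stdint.h>

#define TABLE_SLOTS ((size_t)1 << 20) /* hypothetical; reserved up front */

struct region_entry {
    void *addr;  /* NULL means empty */
    size_t size;
};

static struct region_entry table[TABLE_SLOTS];

static size_t hash_addr(const void *p) {
    return ((uintptr_t)p >> 12) & (TABLE_SLOTS - 1); /* page-aligned keys */
}

/* Linear probing; assumes the table is sized so it never fills. */
static struct region_entry *region_lookup(const void *p) {
    for (size_t i = hash_addr(p); table[i].addr != NULL;
         i = (i + 1) & (TABLE_SLOTS - 1)) {
        if (table[i].addr == p)
            return &table[i];
    }
    return NULL;
}
```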

