That's something I need to look into for HardenedBSD again: random-sized MAP_GUARD stack guard pages for per-thread stacks.
You can see how it's implemented:
github.com/AndroidHardeni
It's only tricky for aligned allocations (not applicable to stacks) and isn't that complex.
Here's where it chooses the random guard size:
github.com/AndroidHardeni
It's stored alongside size:
github.com/AndroidHardeni
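As a rough illustration of the idea (not the linked code), here is a minimal C sketch of a mapping with a randomly sized guard on each side, with the guard size recorded next to the usable size. The names `region_info`, `pick_guard_size`, `alloc_with_guards` and `free_with_guards` are hypothetical, and the random value is passed in by the caller, since the real source of randomness isn't shown in this thread.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE // exposes MAP_ANONYMOUS even under strict -std modes
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

// Hypothetical metadata record: the guard size is kept alongside the usable
// size so the full mapping can be located again on free or realloc.
struct region_info {
    void *ptr;         // start of the usable (inner) pages
    size_t size;       // usable size, assumed to be a multiple of the page size
    size_t guard_size; // size of each guard, a multiple of the page size
};

// Turn a random value into a guard size of 1..max_guard_pages pages.
// The modulo is slightly biased; a real allocator would want a uniform draw
// from a CSPRNG instead.
static size_t pick_guard_size(uint64_t random, size_t max_guard_pages, size_t page_size) {
    assert(max_guard_pages > 0);
    return (size_t)(random % max_guard_pages + 1) * page_size;
}

// Reserve guard + size + guard as PROT_NONE, then make only the inner part
// readable and writable. Returns 0 on success, -1 on failure.
static int alloc_with_guards(struct region_info *out, size_t size, size_t guard_size) {
    if (guard_size > (SIZE_MAX - size) / 2) {
        return -1; // size + 2 * guard_size would overflow
    }
    size_t total = size + 2 * guard_size;
    char *base = mmap(NULL, total, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) {
        return -1;
    }
    if (mprotect(base + guard_size, size, PROT_READ | PROT_WRITE) != 0) {
        munmap(base, total);
        return -1;
    }
    out->ptr = base + guard_size;
    out->size = size;
    out->guard_size = guard_size;
    return 0;
}

// Unmap the inner pages together with both guards.
static void free_with_guards(const struct region_info *r) {
    munmap((char *)r->ptr - r->guard_size, r->size + 2 * r->guard_size);
}
```

Since both guards stay PROT_NONE, a linear overflow or underflow off either end of the inner pages faults rather than reaching a neighbouring mapping, and the random guard size makes the distance to that neighbour harder to predict.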
I meant to link to pages.c rather than memory.c:
github.com/AndroidHardeni
The whole purpose of pages.c / pages.h is to provide a wrapper around memory.c / memory.h implementing support for guard pages in a mostly transparent way. It's not entirely transparent due to realloc.
That's because realloc has to be aware of how the guard pages work in order to implement the efficient approaches to shrinking and growing large allocations, and to moving pages to a new location where the mapping is already reserved via MREMAP_MAYMOVE|MREMAP_FIXED. Those are just optimized fast paths though.
So this realloc section is essentially part of the implementation of guard pages too and is by far the most complex part of it:
github.com/AndroidHardeni
It's quite tricky to implement some of this with proper handling for out-of-memory errors. It can barely use MREMAP_MAYMOVE.
It has to actually allocate a new mapping in the usual way to provide it with guard pages, and then move the non-guard pages to the new inner portion via MREMAP_MAYMOVE|MREMAP_FIXED. It essentially uses it as an optimized memcpy and falls back to memcpy if that system call fails.
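A hedged sketch of that move path in C, for Linux only. `map_guarded` and `move_guarded` are hypothetical names, all sizes are assumed to be page multiples, and the memcpy fallback assumes the destination range is still mapped after a failed mremap, which this sketch does not verify.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE // mremap and the MREMAP_* flags are Linux/glibc extensions
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

// Hypothetical helper: map guard + size + guard and return the inner pointer.
static void *map_guarded(size_t size, size_t guard) {
    char *base = mmap(NULL, size + 2 * guard, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) {
        return NULL;
    }
    if (mprotect(base + guard, size, PROT_READ | PROT_WRITE) != 0) {
        munmap(base, size + 2 * guard);
        return NULL;
    }
    return base + guard;
}

// Move a guarded allocation into a freshly mapped, larger guarded region.
// Returns the new inner pointer, or NULL with the old allocation left intact.
static void *move_guarded(void *old, size_t old_size, size_t old_guard,
                          size_t new_size, size_t new_guard) {
    assert(new_size >= old_size);

    // Allocate the destination the usual way so it gets its own guards.
    char *new_inner = map_guarded(new_size, new_guard);
    if (new_inner == NULL) {
        return NULL; // out of memory: nothing has been changed yet
    }

    // Fast path: move the old inner pages on top of the new inner range.
    // The kernel remaps the pages instead of copying their contents.
    void *moved = mremap(old, old_size, new_size,
                         MREMAP_MAYMOVE | MREMAP_FIXED, new_inner);
    if (moved == MAP_FAILED) {
        // Slow path: plain copy, then drop the old mapping with both guards.
        memcpy(new_inner, old, old_size);
        munmap((char *)old - old_guard, old_size + 2 * old_guard);
    } else {
        // The old inner pages have moved away; only the old guards remain.
        munmap((char *)old - old_guard, old_guard);
        munmap((char *)old + old_size, old_guard);
    }
    return new_inner;
}
```

The only failure that is reported to the caller is the initial mapping failing, and at that point the old allocation is still untouched, which keeps the out-of-memory handling simple.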
On a system without MREMAP_MAYMOVE, that portion would just be disabled so it falls through to the standard handling at the bottom:
github.com/AndroidHardeni
There's another use of mremap without MREMAP_MAYMOVE for in-place growth. MAP_FIXED_NOREPLACE would also work for that.
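One way in-place growth could look in C, as a sketch under assumptions rather than the linked code: grow the trailing guard mapping with mremap and no MREMAP_MAYMOVE, then turn the first part of it into usable pages so the guard ends up shifted outward. `grow_in_place` is a hypothetical name and all sizes are assumed to be page multiples.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE // mremap is a Linux/glibc extension
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

// Try to grow a guarded allocation in place by `extra` bytes.
// Layout before: [guard][inner: size][guard]
// Layout after:  [guard][inner: size + extra][guard]
// Returns 0 on success, -1 if in-place growth isn't possible.
static int grow_in_place(void *inner, size_t size, size_t guard, size_t extra) {
    assert(extra > 0);
    char *guard_start = (char *)inner + size;

    // With flags == 0 the mapping may not move, so this only succeeds when
    // the address range right after the trailing guard is unused.
    if (mremap(guard_start, guard, guard + extra, 0) == MAP_FAILED) {
        return -1;
    }

    // The PROT_NONE mapping now spans guard + extra bytes. Hand the first
    // `extra` bytes to the allocation; the remaining `guard` bytes stay
    // inaccessible and act as the new trailing guard.
    if (mprotect(guard_start, extra, PROT_READ | PROT_WRITE) != 0) {
        munmap(guard_start + guard, extra); // undo the growth
        return -1;
    }
    return 0;
}
```

A caller would treat -1 as "fall through to the next strategy" rather than as an error, since failing here is the common case.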
Those are both just optimizations. There's no harm in disabling both. It's rarely possible to do in-place growth anyway, at least on Linux where the mmap heap starts at a high address and grows downwards. MREMAP_MAYMOVE helps a huge amount, but only for massive reallocs.
Also worth noting that one of the few forms of misuse this allocator implementation isn't hardened against is a buggy program with realloc races on the same allocation between different threads. I've thought about how to address that but it's quite difficult and would be costly.
It's hardened against it in the sense of having randomization, etc., but it won't directly catch that form of misuse the way it catches many other errors. I can't think of a reasonable way to do it beyond an arena-global realloc lock, which would be expensive and would make realloc less scalable.
The way this allocator will scale to many cores is via the combination of per-size-class locks with arenas implemented by dividing up the slab allocation region. That's similar to jemalloc, but without needing the jemalloc alignment trick, which substantially reduces malloc heap entropy.
So it'll have substantially finer-grained locking than jemalloc since the slab allocation locks are per-size-class-per-arena and totally independent from each other. It can't use thread caches since they completely compromise the security that comes from global knowledge of what is free.
It could use per-thread allocation queues to do multiple allocations each time it grabs the per-size-class-per-arena lock, but having queues for free, or an actual cache used for both allocation and free like jemalloc, is unacceptable and against the whole idea behind this work.