Hardening in Android Q:
security.googleblog.com/2019/05/queue-
Scudo won't be used on GrapheneOS for 64-bit, but it will be nice to have for 32-bit, where the far more hardened github.com/GrapheneOS/har implementation isn't available due to its complete dependence on a very large address space.
Contributions are welcome if you feel like it. We definitely strive for a balance of perf/security rather than full security. A standalone (non-sanitizer_common-dependent) version is being committed @ github.com/llvm/llvm-proj
github.com/GrapheneOS/har is certainly balancing performance, memory usage and security too. It takes a much different approach inherited from allocators like OpenBSD malloc rather than allocators like dlmalloc and is specifically designed around the large address space on 64-bit.
There's some documentation on the approach in the README. It was written to be a replacement for the previous port of OpenBSD malloc and shares a lot of the basic design concepts with it. It does lean more towards security than performance and cares a lot about low fragmentation.
I would be using a slab allocator approach with out-of-line slab metadata even for a fully performance-oriented allocator, like the Linux kernel. Using bitmaps instead of free lists is the main difference from a purely performance-oriented approach, though jemalloc also uses bitmaps to pack data.
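To make the bitmap idea concrete, here is a minimal sketch of slot tracking with out-of-line slab metadata. It is an illustration only, not code from any of the allocators mentioned: the names `slab_metadata`, `slab_alloc_slot`, `slab_free_slot` and the 64-slot limit are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: one 64-bit word tracks every slot in a slab. */
#define SLOTS_PER_SLAB 64

/* Out-of-line metadata: stored apart from the memory handed to callers,
 * so a heap overflow in a slot cannot directly overwrite it. */
struct slab_metadata {
    uint64_t bitmap; /* bit i set => slot i is in use */
};

/* Claims the lowest free slot and returns its index, or -1 if the slab is full. */
static int slab_alloc_slot(struct slab_metadata *meta) {
    for (int i = 0; i < SLOTS_PER_SLAB; i++) {
        uint64_t mask = UINT64_C(1) << i;
        if (!(meta->bitmap & mask)) {
            meta->bitmap |= mask;
            return i;
        }
    }
    return -1;
}

/* Releases a slot. Returns 0 on success, -1 if the slot was not in use. */
static int slab_free_slot(struct slab_metadata *meta, int slot) {
    assert(slot >= 0 && slot < SLOTS_PER_SLAB);
    uint64_t mask = UINT64_C(1) << slot;
    if (!(meta->bitmap & mask)) {
        return -1; /* freeing a slot that is already free */
    }
    meta->bitmap &= ~mask;
    return 0;
}
```

A free list would instead store a "next free" pointer inside each freed slot, which is faster to pop from but puts allocator state in memory that the application can corrupt.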
Mapping from addresses to the slab metadata can be done in various ways, like the alignment trick in jemalloc, PartitionAlloc and OpenBSD malloc. That's not needed here due to reserving isolated regions for slab allocations and metadata (slabs, large allocations, and all other state).
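The two lookup strategies can be contrasted with a short sketch. Everything here is hypothetical and simplified: the region and slab sizes, the struct, and the function names are assumptions for illustration, not the layout of any real allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes, chosen only to keep the arithmetic readable. */
#define CLASS_REGION_SIZE (UINT64_C(1) << 30) /* address space reserved per size class */
#define SLAB_SIZE         (UINT64_C(1) << 12) /* bytes per slab */

struct slab_location {
    size_t size_class; /* which size class region the address falls in */
    size_t slab_index; /* which slab inside that region */
};

/* Alignment trick: chunks are aligned to a power of two, so masking off the
 * low bits of any interior pointer yields the address of the chunk header. */
static uintptr_t chunk_header_of(uintptr_t addr, uintptr_t chunk_align) {
    assert(chunk_align != 0 && (chunk_align & (chunk_align - 1)) == 0);
    return addr & ~(chunk_align - 1);
}

/* Isolated regions: every size class owns a fixed range inside one large
 * reservation, so the class and slab index follow from the offset alone.
 * The indices can then be used to look up metadata stored elsewhere. */
static struct slab_location locate(uintptr_t region_base, uintptr_t addr) {
    assert(addr >= region_base);
    uint64_t offset = (uint64_t)(addr - region_base);
    struct slab_location loc;
    loc.size_class = (size_t)(offset / CLASS_REGION_SIZE);
    loc.slab_index = (size_t)((offset % CLASS_REGION_SIZE) / SLAB_SIZE);
    return loc;
}
```

In the second form no header needs to live next to the allocations at all, which is what allows metadata to be kept in its own region.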
That could still be a perfectly suitable approach in a purely performance-oriented allocator. I'd just throw away all the security features layered on top, replace bitmaps with free lists and add small array-based thread caches like jemalloc to amortize the cost of locking.
Scalability primarily comes from having locking divided up per-arena and per-size-class within the arenas, so with 4 arenas, there are 4 separate sets of size class regions, each with their own entirely independent locking (there is no global or arena-level locking for slabs).
Being solely focused on 64-bit support is liberating and is what makes it very different from OpenBSD malloc. It frees it from needing to use the alignment tricks or global data structures and enables never mixing or reusing address space between size classes or metadata.
Guard slabs also only hurt performance by making the address space sparser. The allocator simply skips them and they never get unprotected. For the default guard slab interval of 1, slabs with a single slot are guaranteed guards on both sides with no direct / major performance cost.
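One way to picture the guard interval is as plain index arithmetic. The layout below is an assumed one for illustration (a guard slab first, then one guard after every `guard_interval` usable slabs); the function names are hypothetical and the real allocator's layout may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Position, counted in slabs from the start of the region, of the n-th
 * usable slab. Assumed layout: position 0 is a guard, and one more guard
 * follows every `guard_interval` usable slabs. */
static size_t slab_position(size_t n, size_t guard_interval) {
    assert(guard_interval > 0);
    return 1 + n + n / guard_interval;
}

/* Guards sit at every multiple of (guard_interval + 1). They stay
 * inaccessible forever, so the allocator never has to touch them. */
static int is_guard_position(size_t pos, size_t guard_interval) {
    assert(guard_interval > 0);
    return pos % (guard_interval + 1) == 0;
}
```

With an interval of 1, usable slabs land on positions 1, 3, 5 and so on, with guards at 0, 2, 4, so each usable slab has a guard on both sides. The only cost is that the region spans twice as much address space for the same number of usable slabs.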
In a purely performance-oriented allocator, the slab allocator approach is faster, has lower fragmentation and uses far less memory for metadata. Metadata overhead is lower than 1% for 16 byte allocations with a slab allocator using free lists and a bit over 1% with bitmaps.
It's a bit higher here, since it has a bitmap for the quarantine as well in order to provide proper reliable double-free detection without needing an approach like a hash table tracking every element in the quarantine. Canaries also waste space, but memory tagging will replace them.
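A rough sketch of how a second bitmap can give reliable double-free detection follows. It assumes a quarantined slot stays marked as allocated until it is released; the struct, enum and function names are hypothetical and the real implementation may track state differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-slab state for a slab of up to 64 slots. */
struct slab_state {
    uint64_t allocated;   /* bit i set => slot i is handed out or quarantined */
    uint64_t quarantined; /* bit i set => slot i was freed and is held back from reuse */
};

enum free_result { FREE_OK, FREE_DOUBLE_FREE, FREE_INVALID };

/* Frees a slot into the quarantine. A second free of the same slot is
 * caught by one bit test, with no hash table of quarantined pointers. */
static enum free_result slab_free(struct slab_state *s, unsigned slot) {
    assert(slot < 64);
    uint64_t mask = UINT64_C(1) << slot;
    if (s->quarantined & mask) {
        return FREE_DOUBLE_FREE; /* already freed, still in quarantine */
    }
    if (!(s->allocated & mask)) {
        return FREE_INVALID; /* never allocated, or already fully released */
    }
    s->quarantined |= mask;
    return FREE_OK;
}

/* Called when a slot leaves the quarantine: only now may it be reused. */
static void slab_release(struct slab_state *s, unsigned slot) {
    assert(slot < 64);
    uint64_t mask = UINT64_C(1) << slot;
    assert(s->quarantined & mask);
    s->quarantined &= ~mask;
    s->allocated &= ~mask;
}
```

The extra cost is one more bit per slot, which is where the slightly higher metadata overhead comes from.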
Since it doesn't support 32-bit, I do need to choose something else for legacy 32-bit components. That may mean enabling Scudo globally for 32-bit, since it's already going to be around. I prefer the slab allocator approach though, and it just offers better security properties.