MTE is quite powerful despite the 4-bit tag limitation. It can be used to provide deterministic guarantees and has explicit support for reserving tags. An OS reserving a single tag for internal use can protect all kinds of metadata, create 'guard pages' at 16-byte granularity, etc.
ShadowCallStack, inline malloc metadata, freed allocations, etc. can be protected with a single reserved tag. Making sure adjacent allocations have different tags or having protected metadata between them wipes out small / linear overflows. Can do a lot more than random tags.
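To make the deterministic scheme concrete, here is a toy model in Python. It is only a sketch of the idea, not how any real allocator or the MTE hardware is implemented: `TaggedSlab`, `TagFault`, and the choice of tag 0 as the reserved tag are all hypothetical names and assumptions for illustration. Free slots carry the reserved tag, and each allocation gets the lowest tag that differs from both neighbours.

```python
RESERVED_TAG = 0   # assumption: one tag kept for freed memory / metadata, never put in a pointer
NUM_TAGS = 16      # 4-bit tags


class TagFault(Exception):
    """Raised when a pointer's tag does not match the memory tag."""


class TaggedSlab:
    """Toy slab of equal-sized slots, one memory tag per slot."""

    def __init__(self, slot_count):
        # Every slot starts out free, so it carries the reserved tag.
        self.tags = [RESERVED_TAG] * slot_count

    def alloc(self, slot):
        # Deterministic choice: skip the reserved tag and both neighbours' tags.
        excluded = {RESERVED_TAG}
        for neighbour in (slot - 1, slot + 1):
            if 0 <= neighbour < len(self.tags):
                excluded.add(self.tags[neighbour])
        tag = min(t for t in range(NUM_TAGS) if t not in excluded)
        self.tags[slot] = tag
        return (tag, slot)  # a "pointer": tag plus slot address

    def free(self, slot):
        # Freed memory goes back to the reserved tag, so stale pointers stop matching.
        self.tags[slot] = RESERVED_TAG

    def access(self, pointer, offset=0):
        # offset is measured in slots; a mismatch models a tag check fault.
        tag, slot = pointer
        target = slot + offset
        if not 0 <= target < len(self.tags) or self.tags[target] != tag:
            raise TagFault(f"tag {tag} does not match slot {target}")
        return target
```

In this model a linear overflow from one slot into the next always faults, whether the neighbour is live (different tag by construction) or free (reserved tag), and so does an access through a pointer to a slot that has been freed. A real allocator could still pick randomly among the non-excluded tags to add probabilistic protection on top.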
I think it's far more compelling than PAC even if you disregard the ability to use random tags and use entirely deterministic ones. PAC is yet another attempt at targeting exploit techniques. Only protects specific pointers rather than memory in general. IMO, it's underwhelming.
PAC is at odds with using the address space for exploit mitigations. It's directly opposed to approaches like splitting up the address space and avoiding reuse which is a *deterministic* UAF mitigation. It isn't just taking away bits from ASLR but also more interesting things.
Those 'more interesting things' also include memory tagging since eventually it could be possible to use 24-bit or larger tags via unused upper bits. PAC is using up a bunch of those precious bits for an inherently very weak and hard to widely deploy probabilistic mitigation.
Going from the current 16-byte granularity to 64-byte granularity would at least make it possible to use 16-bit tags instead of 4-bit ones without needing more storage. They can certainly go a lot further than that if they're willing to offer the option to waste more space on the tag metadata.
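The storage claim is simple arithmetic: tag overhead is tag bits divided by granule bits. A quick sketch, where `tag_overhead` is just a hypothetical helper name for illustration:

```python
from fractions import Fraction


def tag_overhead(tag_bits, granule_bytes):
    """Fraction of memory spent on tag storage: tag bits per granule bit."""
    return Fraction(tag_bits, granule_bytes * 8)


current = tag_overhead(4, 16)    # 4-bit tag per 16-byte granule
wider = tag_overhead(16, 64)     # 16-bit tag per 64-byte granule

print(current, wider)            # both are 1/32, i.e. 3.125% of memory
print(current == wider)          # True
```

Quadrupling the granule size frees up exactly enough room to quadruple the tag width at the same 1/32 overhead.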
If you go to 64-byte granularity, the amount of wasted memory for the majority of consumer workloads (browsers, etc.) becomes your bottleneck. SPARC worked fine with that because it aligned with Oracle DB needs, but consumer OSes are in a different space.
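A rough way to see where the waste comes from, assuming each allocation has to be padded up to a whole number of tag granules so that it does not share a tag with its neighbour. The request sizes below are made up for illustration, not measurements from any real workload, and `padded_size` / `wasted_bytes` are hypothetical helper names:

```python
def padded_size(size, granule):
    """Round an allocation size up to a whole number of tag granules."""
    return -(-size // granule) * granule


def wasted_bytes(sizes, granule):
    """Total padding added when every allocation is rounded up to the granule."""
    return sum(padded_size(s, granule) - s for s in sizes)


# Hypothetical mix of small allocations.
requests = [8, 24, 40, 100]

for granule in (16, 64):
    waste = wasted_bytes(requests, granule)
    print(f"{granule}-byte granule: {waste} bytes of padding for {sum(requests)} bytes requested")
```

With small allocations dominating, the padding at 64 bytes can approach or exceed the memory actually requested, which is the bottleneck described above.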
For the ecosystem we work on, nearly all of that is using concurrent compacting garbage collection. C++ and now Rust are used for particularly high performance libraries or application code and it wouldn't make much sense to use them if that code was heavily impacted by malloc.
Android was using jemalloc for 5.x and beyond, which provided very high performance with low metadata overhead, but those times have ended. It now defaults to Scudo, which is a semi-hardened allocator. It's easily 4x slower with dramatically more metadata overhead for small allocations.
Chromium uses GC for DOM, etc. and their own PartitionAlloc setup / other allocators for most other things. It does use the system allocator on Android for generic usage and it impacts a few things including 2D canvas performance but there's no measurable impact for most things.