Holy shit mallocng is doing something right. I'm used to mutt growing to over 120 MB and sticking there forever after opening large folders...
Welp, with mallocng LD_PRELOADed, it's dropping down to 2-4 MB after switching back to a small folder.
Have you tried using it with a web browser and notoriously resource-hungry pages? I remember the people behind the Mesh allocator had good luck with Firefox.
Yeah, I've been using it with Firefox for a long time, with usage & performance "feeling" better, but I don't have a good way to measure.
Running in headless mode to render one page to output and then exit would probably be a good benchmark?
Yes, it probably would for performance. I'm not sure how to compare memory usage, though. The interesting thing for a large app is not initial usage but non-leaking behavior: no growth over time, and a return to previous usage after closing stuff.
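One rough way to check that return-to-previous-usage property, as a minimal sketch assuming Linux and its /proc/self/status interface (nothing here is specific to mallocng): sample VmRSS around an allocate-everything-then-free-everything cycle and see whether usage actually drops back.

```cpp
// Sketch: does resident memory return toward baseline after freeing?
// Assumes Linux; VmRSS in /proc/self/status is the current resident set.
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <vector>

static long rss_kib() {
    FILE *f = std::fopen("/proc/self/status", "r");
    if (!f) return -1;
    char line[256];
    long kib = -1;
    while (std::fgets(line, sizeof line, f)) {
        if (std::sscanf(line, "VmRSS: %ld kB", &kib) == 1) break;
    }
    std::fclose(f);
    return kib;
}

int main() {
    std::printf("baseline: %ld KiB\n", rss_kib());

    // Simulate "opening a large folder": lots of small allocations.
    std::vector<char *> blocks;
    for (int i = 0; i < 200000; i++) {
        char *p = static_cast<char *>(std::malloc(64));
        if (!p) break;
        std::memset(p, 0, 64);            // touch the memory so it counts
        blocks.push_back(p);
    }
    std::printf("peak:     %ld KiB\n", rss_kib());

    // Simulate "switching back to a small folder": free everything.
    for (char *p : blocks) std::free(p);
    std::vector<char *>().swap(blocks);   // drop the vector's own buffer too
    std::printf("after:    %ld KiB\n", rss_kib());
    // With a fragmentation-prone allocator, "after" tends to stay near "peak";
    // an allocator that returns memory should fall back toward "baseline".
}
```

Run it once with the system malloc and once with the allocator under test LD_PRELOADed; the interesting number is how far "after" drops back toward "baseline".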
Reducing the working set of pages by avoiding fragmentation and metadata overhead improves performance too. It not only frees up memory for caching but also reduces TLB pressure. For small allocations, lower metadata overhead and better packing improve cache utilization.
Fragmentation over time degrades the performance of the application and the overall system. Small allocations fit poorly into the traditional dlmalloc-style approach, and its approximate best-fit also causes a lot more fragmentation over time than precise best-fit.
Precise best-fit with ties broken by address works well for large allocations and can be implemented in O(log n) with an ordered map keyed by (size, address) and another keyed by address. Linux used to have a horrible allocator design for mmap, but it evolved into something like this.
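A minimal sketch of that scheme (my illustration, not mallocng's or the kernel's actual code): one ordered set keyed by (size, address) answers the best-fit query in O(log n) with ties going to the lowest address, and a second map keyed by address handles coalescing with neighbors on free.

```cpp
// Precise best-fit over free ranges, ties broken by lowest address.
#include <cstdint>
#include <map>
#include <optional>
#include <set>

struct FreeStore {
    std::set<std::pair<size_t, uintptr_t>> by_size;  // (size, addr) for best-fit
    std::map<uintptr_t, size_t> by_addr;             // addr -> size for coalescing

    void insert(uintptr_t addr, size_t size) {
        by_size.insert({size, addr});
        by_addr[addr] = size;
    }

    // Smallest block >= want; equal sizes are ordered by address, so the
    // lowest-address candidate wins automatically.
    std::optional<uintptr_t> alloc(size_t want) {
        auto it = by_size.lower_bound({want, 0});
        if (it == by_size.end()) return std::nullopt;
        auto [size, addr] = *it;
        by_size.erase(it);
        by_addr.erase(addr);
        if (size > want)                        // return the tail as a new free block
            insert(addr + want, size - want);
        return addr;
    }

    void free(uintptr_t addr, size_t size) {
        // Coalesce with the next free block if adjacent.
        auto next = by_addr.lower_bound(addr);
        if (next != by_addr.end() && addr + size == next->first) {
            size += next->second;
            by_size.erase({next->second, next->first});
            next = by_addr.erase(next);
        }
        // Coalesce with the previous free block if adjacent.
        if (next != by_addr.begin()) {
            auto prev = std::prev(next);
            if (prev->first + prev->second == addr) {
                addr = prev->first;
                size += prev->second;
                by_size.erase({prev->second, prev->first});
                by_addr.erase(prev);
            }
        }
        insert(addr, size);
    }
};
```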
O(1) approximations of best-fit with buckets cause a lot more fragmentation, and directly using best-fit just doesn't work well with granular sizes and tons of small allocations. It's typical to have layers of caching on top, which makes it even worse. It leaks like a sieve in practice.
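For contrast, a toy sketch of the bucketed approach (my illustration, not any particular allocator): requests round up to power-of-two size classes and are served from the first nonempty bucket, so the fit is only bucket-precise rather than the best available block.

```cpp
// O(1)-ish "approximate best-fit" via segregated power-of-two buckets.
#include <cstddef>
#include <vector>

struct BucketedFreeList {
    static constexpr int kBuckets = 32;
    std::vector<void *> buckets[kBuckets];   // bucket i holds blocks of size ~2^i

    static int bucket_for(size_t size) {
        int b = 0;
        while ((size_t(1) << b) < size) b++; // round up to the next power of two
        return b;
    }

    void push(void *p, size_t size) { buckets[bucket_for(size)].push_back(p); }

    void *pop(size_t size) {
        // Bounded scan over a fixed number of buckets; take whatever is there.
        for (int b = bucket_for(size); b < kBuckets; b++) {
            if (!buckets[b].empty()) {
                void *p = buckets[b].back();
                buckets[b].pop_back();
                // A real allocator would split the remainder, but the chosen
                // block is still only bucket-precise, not a true best fit.
                return p;
            }
        }
        return nullptr;
    }
};
```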
The worst case is ridiculously bad, and applications create horrible scenarios in practice, especially long-lived ones. Slab-style allocators are so much more predictable, with much lower waste and fragmentation. They're much easier to reason about and tune based on desired properties.
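A minimal slab sketch (my illustration; mallocng's real design differs in the details): one page is carved into equal slots for a single size class and tracked with a bitmap, so metadata is a few words per page and per-slot waste is bounded by the size-class rounding.

```cpp
// Fixed-size-class slab: one page of equal slots tracked by a bitmap.
#include <cstdint>
#include <cstdlib>

struct Slab {
    static constexpr size_t kPage = 4096;
    size_t slot_size;
    size_t nslots;
    uint64_t bitmap;          // 1 bit per slot, 1 = in use (<= 64 slots here)
    unsigned char *base;

    explicit Slab(size_t slot)
        : slot_size(slot), nslots(kPage / slot), bitmap(0),
          base(static_cast<unsigned char *>(std::aligned_alloc(kPage, kPage))) {
        if (nslots > 64) nslots = 64;    // keep the single-word bitmap simple
    }
    ~Slab() { std::free(base); }

    void *alloc() {
        for (size_t i = 0; i < nslots; i++) {
            if (!(bitmap & (uint64_t{1} << i))) {
                bitmap |= uint64_t{1} << i;
                return base + i * slot_size;
            }
        }
        return nullptr;                   // slab full: a real allocator grabs another slab
    }

    void free(void *p) {
        size_t i = (static_cast<unsigned char *>(p) - base) / slot_size;
        bitmap &= ~(uint64_t{1} << i);    // O(1); waste is bounded by slot rounding
    }
};
```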
Yep. AFAIK mine is the first of this kind to optimize for saving memory rather than performance, though, especially at low total usage.