The default settings for autovacuum_*_scale_factor and autovacuum_*_threshold aren't great for large tables. I like to set them to 0 and 1 million respectively for tables over ~20 million rows. #PostgreSQL
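For reference, a sketch of what that per-table override looks like (the table name is hypothetical; the storage parameters are the standard PostgreSQL ones):

```sql
-- Hypothetical large table. With scale_factor = 0, autovacuum fires after a
-- fixed number of dead (or changed, for analyze) rows -- the threshold --
-- rather than after a percentage of the table has churned.
ALTER TABLE big_events SET (
    autovacuum_vacuum_scale_factor  = 0,
    autovacuum_vacuum_threshold     = 1000000,
    autovacuum_analyze_scale_factor = 0,
    autovacuum_analyze_threshold    = 1000000
);
```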
I don't think that's a good idea. The cost of vacuuming continues to increase on larger tables due to index vacuums, and the gain of vacuuming 1 million dead rows on a 10-billion-row table isn't meaningful.
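To make the trade-off concrete, here's a sketch comparing when autovacuum would fire under the default formula (threshold + scale_factor * reltuples, with the stock defaults of 50 and 0.2) versus the fixed 1-million-row threshold:

```sql
-- Compare trigger points: default formula vs. a fixed 1M-row threshold.
-- Assumes the stock defaults: autovacuum_vacuum_threshold = 50,
-- autovacuum_vacuum_scale_factor = 0.2.
SELECT relname,
       reltuples::bigint              AS approx_rows,
       (50 + 0.2 * reltuples)::bigint AS default_trigger_dead_rows,
       1000000                        AS tuned_trigger_dead_rows
FROM pg_class
WHERE relkind = 'r' AND reltuples > 20000000
ORDER BY reltuples DESC;
-- On a 10-billion-row table the default trigger is ~2 billion dead rows, so
-- the fixed threshold fires ~2000x sooner -- and the index-vacuuming cost is
-- paid on every one of those runs.
```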
I think they're ok enough defaults given the current autovacuum logic, but it heavily depends on the workload.
Instead of different defaults, we need more types of autovacuum triggers.
E.g. a more aggressive heap vacuum trigger to set VM bits, separate from the triggers for index vacuums.
But for that we should make heap vacuums more efficient, to prevent regressions.
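For context, the visibility-map coverage that such a heap-only trigger would target can already be inspected; a sketch using the pg_class counters (relallvisible is maintained by VACUUM and is approximate):

```sql
-- Fraction of each table's heap pages currently marked all-visible in the VM.
-- Pages that are not all-visible force index-only scans back to the heap,
-- which is what a more aggressive heap-vacuum trigger would help with.
SELECT relname,
       relpages,
       relallvisible,
       round(100.0 * relallvisible / NULLIF(relpages, 0), 1) AS pct_all_visible
FROM pg_class
WHERE relkind = 'r' AND relpages > 0
ORDER BY relpages DESC
LIMIT 20;
```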
Some recent work by Sawada-san and others goes some steps toward that.
IMV the right approach is to decouple *everything* in VACUUM and let it figure out what matters by noticing per-index/per-table problems as they happen. Bloat can be harmful or benign (and often both at once). Top-down scheduling seems too complex due to non-linear behaviors. Bottom-up works.
Generally agreed.
Having some pressure to do such work close together also has benefits though: a considerably higher likelihood of cache hits and of re-modifying pages that are already dirty before they're written out, and a lower likelihood of FPIs (full-page images in the WAL).
Absolutely agreed on that! I was just going to clarify: VACUUM should be much more bottom-up, but still somewhat top-down. It should be bottom-up about extreme bloat/stressors and top-down about doing a clean sweep to keep things humming/stable over time.
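That bottom-up "notice problems as they happen" signal is roughly what the cumulative statistics views already expose; a sketch of spotting per-table stressors via pg_stat_user_tables (the 10% ratio is an arbitrary illustration, not a recommendation):

```sql
-- Tables whose dead-tuple count has run ahead of recent autovacuum activity.
-- The 10% ratio and the ordering are illustrative thresholds only.
SELECT relname,
       n_live_tup,
       n_dead_tup,
       last_autovacuum
FROM pg_stat_user_tables
WHERE n_dead_tup > 0.10 * NULLIF(n_live_tup, 0)
ORDER BY n_dead_tup DESC
LIMIT 20;
```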