I often hear complaints about the vacuum process in #PostgreSQL, mostly about how it impacts performance. There seems to be a lack of understanding about how vacuum works and the benefits it has.
Have you encountered such situations? How do you think this should be addressed?
Most VACUUM problems are performance *stability* problems IME. It is a little like LSM compaction in RocksDB. So I agree; general understanding is important. At the same time, VACUUM *can* be made much more predictable, purely through non-invasive enhancements.
Yes, vacuum isn't a feature, it supports a feature. Most engines with MVCC have something like it (InnoDB purge, LSM compaction). But in my days with InnoDB I rarely worried about purge. Hopefully vacuum reaches that point. No dbms is perfect.
Exactly! To me, the important difference between Postgres and InnoDB isn't VACUUM. More like: Postgres versions logical rows, InnoDB logical pages. Many workloads perform most GC work on-the-fly, not during VACUUM. Which *amplifies* the perception that the problem is just VACUUM.
Incidentally, this is why I don't see MySQL/InnoDB Bug #74919 getting addressed anytime soon (despite MyRocks getting this right). I think that approximately the same technique that MyRocks used will also work in Postgres.
Had to ask Domas for that bug number. Wasn't able to find it with a quick search despite always remembering the open feature request.
To be fair, there have been problems with InnoDB purge, they just weren't common for me:
* Purge might do IO, especially when it lags. Making purge multi-threaded helped.
* Long-open snapshots block purge (and vacuum). Both also suffer from bugs.mysql.com/bug.php?id=749 (RocksDB does not)
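
To make the "long-open snapshots block purge (and vacuum)" point concrete, here is a toy sketch in Python. It is not how Postgres or InnoDB actually implement garbage collection. The names `RowVersion` and `removable_versions` are hypothetical, and the model assumes transactions commit instantly and in transaction-id order, so a snapshot can be represented by a single id and sees every transaction with a smaller id.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RowVersion:
    key: str
    created_by: int            # id of the transaction that wrote this version
    deleted_by: Optional[int]  # id of the transaction that superseded it; None if still live


def removable_versions(
    versions: List[RowVersion],
    open_snapshots: List[int],
    next_txid: int,
) -> List[RowVersion]:
    """Return the dead versions that no open snapshot can still see.

    Toy rule: a version is removable only if the transaction that
    superseded it is older than the oldest open snapshot.
    """
    # With no open snapshots, every already-superseded version is fair game.
    horizon = min(open_snapshots, default=next_txid)
    return [
        v for v in versions
        if v.deleted_by is not None and v.deleted_by < horizon
    ]


# Three versions of one logical row: written at txid 1, updated at 5 and at 9.
history = [
    RowVersion("a", created_by=1, deleted_by=5),
    RowVersion("a", created_by=5, deleted_by=9),
    RowVersion("a", created_by=9, deleted_by=None),
]

blocked = removable_versions(history, open_snapshots=[3], next_txid=10)
partial = removable_versions(history, open_snapshots=[7], next_txid=10)
unblocked = removable_versions(history, open_snapshots=[], next_txid=10)
```

In this model a single snapshot taken at txid 3 pins both dead versions, no matter how often or how aggressively the cleanup runs. Once it closes, they all become removable at once, which is one way a lagging purge or vacuum ends up with a burst of work.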


