Yes, vacuum isn't a feature, it supports a feature. Most engines with MVCC have something like it (InnoDB purge, LSM compaction). But in my days with InnoDB I rarely worried about purge. Hopefully vacuum reaches that point. No dbms is perfect.
Conversation
Exactly! To me, the important difference between Postgres and InnoDB isn't VACUUM. More like: Postgres versions logical rows, InnoDB logical pages. Many workloads perform most GC work on-the-fly, not during VACUUM. Which *amplifies* the perception that the problem is just VACUUM.
2
2
Incidentally, this is why I don't see MySQL/InnoDB Bug #74919 getting addressed anytime soon (despite MyRocks getting this right). I think that approximately the same technique that MyRocks used will also work in Postgres.
1
1
Had to ask Domas for that bug number. Wasn't able to find it with a quick search despite always remembering the open feature request.
1
Issues in index-only scans in InnoDB secondary indexes also demonstrate my point, I think. Whole pages (not rows) are what a given InnoDB MVCC snapshot sees. It's an indivisible thing (mostly), which is sometimes for the better, other times (e.g., with bug #74919) for the worse.
1
I am not sure upstream InnoDB has a counter to show when secondary index page requires fetching base row to determine visibility. So it isn't a well known problem.
Does PG have a counter for the equivalent case (when visibility map bit isn't set)?
1
PG doesn't really have a counter (only shows it in EXPLAIN ANALYZE), but it should. Each set of index-only scan problems is very similar - but *also* very different. My guess is that the worst case for InnoDB IoS is much worse than Postgres IoS, while InnoDB is better on average.
1
Back in the day when I was running the insert benchmark I really needed such counters to explain IO/query. Queries used covering secondary indexes, but the insert rate made it hard for purge/vacuum to keep up.
1
Maybe secondary index scans needed to materialize index pages the hard way all too often. Their MVCC snapshots never become old (in "human scale" wall clock time), but *were* old relative to the high rate of change in pages. Maybe it was just this competition - not purge per se?
1
Unfortunately, I never explained what was going on and probably might not return to it. My focus is RocksDB now.
1
In case that wasn't enough wild speculation for one day: seems as if trends in hardware increasingly disfavor "logical database first" designs. Coordination and latency are the bottleneck now -- not I/O. See also:
Would be great if that were motivation for getting an LSM storage engine into Postgres
1
That does seem like a good reason to write a new table AM. It's just not obvious what that means for WAL, checkpoints, etc. Postgres is traditionally highly extensible, and so in theory new tables AM should be additive. This is very much above my pay grade.
1
1
Show replies

