It's now clear that Stonebraker's prediction that "one size fits all" DBMSs would fall out of favor (see cacm.acm.org/blogs/blog-cac) was dead wrong. It's hard to say why, exactly - even now. But I have to wonder why ~2009 seemed like more of an "inflection point" than any other time.
"One size fits all" was always kind of wrong - he said so himself back then. What changed? You might think that the thesis hung on something about main memory size scaling, but that doesn't seem relevant to many of the specific kinds of workloads he mentions.
In short, it seems like he was mostly wrong because: 1) He failed to consider performance _relative to the total cost of ownership_, 2) Having a memory hierarchy is really helpful when you think about costs in a holistic fashion, and 3) Minimizing complexity really matters.
4) motivated reasoning, because more specialized database companies are a lot easier to start
5) He under-estimated application-side scale-out techniques, which are necessary for pure machine-size reasons anyway
I don't think it's so clear. [3) complexity] was always my driver. If one PostgreSQL database does the job, I'd take it over five special-purpose DB technologies and three data streaming stacks... fewer things to break and page me on the weekend, please!
It's even worse than that, though. Stonebraker suggested that everybody would use something like VoltDB for OLTP, with an ETL process feeding the same data into a column store so you could run analytics on it later. That idea in particular was always preposterous.
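For context, the pattern being described would look roughly like this - a minimal sketch using sqlite3 and DuckDB as stand-ins for the OLTP row store and the column store (the `orders` table, its columns, and the high-water-mark scheme are all made up for illustration, not anything from Stonebraker's proposal):

```python
import sqlite3   # stand-in for the OLTP row store (VoltDB in his telling)
import duckdb    # stand-in for the analytical column store

def sync_orders(oltp_path="oltp.db", warehouse_path="warehouse.duckdb"):
    """Copy rows that arrived since the last sync from the OLTP store into the column store."""
    oltp = sqlite3.connect(oltp_path)
    warehouse = duckdb.connect(warehouse_path)
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS orders (id INTEGER, amount DOUBLE, ts VARCHAR)"
    )
    # Incremental load keyed on a monotonically increasing id (the "high-water mark").
    last_id = warehouse.execute("SELECT COALESCE(MAX(id), 0) FROM orders").fetchone()[0]
    new_rows = oltp.execute(
        "SELECT id, amount, ts FROM orders WHERE id > ?", (last_id,)
    ).fetchall()
    if new_rows:
        warehouse.executemany("INSERT INTO orders VALUES (?, ?, ?)", new_rows)
    oltp.close()
    warehouse.close()
```

Running a job like this on a schedule is the whole "two systems plus ETL" architecture in miniature: every analytical query now depends on a second store, a copy process, and whatever lag sits between them.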
Not so preposterous if you look at modern cloud-born companies' data pipelines, which all seem to include Postgres, Redis, Snowflake, & Kafka. That said, I'd still say #3 (operational complexity) is the mind-killer for most; training/expertise/resilience is still too costly w/o scale
Yah. You get two tiers: a handful of really, really big operations (e.g. Facebook, Walmart) use All The Data Tools, and then you have everyone else, who just wants one "good enough" option for data storage.
Stonebraker made the error of looking at DB cost of ownership strictly in terms of *hardware* cost. He forgot about people-time cost and cognitive load.
I think that it's both. I agree that under-appreciating the cost and difficulty of employing experts was his largest mistake, practically speaking. I find his under-appreciation of the value of a separate transparent capacity tier more interesting as an engineer, though.
Yah. He wasn't *totally* wrong, though; if you look around, you'll find that the overall usage of special-purpose databases has increased tremendously since 2005.
I agree with that, too. I just don't think it had much to do with any fundamental sea change. Stonebraker talks about arrays being more natural than tables for scientific applications. I'm sure that's true, but it's also true that the same people still use FORTRAN.