Jepsen

@jepsen_io

Distributed systems safety analysis

Joined March 2018

Tweets

  1. Jan 30

    New Jepsen analysis! We talk about etcd's kv operations, watches, and locks. KV ops look strict serializable, and watches deliver all changes in order (with a minor undocumented edge case around revision zero). As usual, distributed locks aren't real:

  2. 18 Sep 2019

    I'm committed to giving everyone the most accurate, rigorous reporting on database correctness that I can, and I encourage vendors to do the same. Be open, honest, and nuanced in your writing. That honesty is good for users, and it builds trust in your team.

  3. 18 Sep 2019

    But still, some vendors do misrepresent the results of our work together, and this bugs me. I expected vendors to call each other out for this sort of thing, because they're more than willing to write take-downs over interpretation of benchmarks, but so far that hasn't happened.

  4. 18 Sep 2019

    Most vendors are telling the truth here: by the time we conclude our collaboration, the safety issues we found have usually been addressed, and the test suite often passes. Many vendors also follow up "passes" with a description of the issues we found, which I think is honest.

  5. 18 Sep 2019

    In more general terms, almost every database we tested with Jepsen fails, sometimes in dozens of ways, before its test suite passes. That's how we know Jepsen is *working*! The vendor headline that comes out of that process is usually "X passes Jepsen".

  6. 18 Sep 2019

    Crashes, unavailability, and performance problems aren't usually reported by Jepsen as "failing" results, because we're primarily concerned with checking safety, rather than liveness problems. It's hard to say how slow is too slow. We file and discuss these issues qualitatively.

  7. 18 Sep 2019

    There are other issues that we found in our work, like slowly spawning an ever-increasing number of backend worker processes which eventually consume all resources and kill the machine. This one's still open too.

  8. 18 Sep 2019

    Because these problems involve schema changes (e.g. creating tables), they may not impact users frequently. YugaByte doesn't think they're relevant to the core transactional mechanism in YugaByte DB, which is why they're not discussing them when they say "Jepsen tests passed".

  9. 18 Sep 2019

    Again, YugaByte DB's Jepsen tests did not pass. They do not currently pass. Correctness issues we identified in our collaboration, mainly due to non-transactional schema changes, are still unaddressed. YugaByte and I have talked about this.

  10. 5 Sep 2019

    An open question in my mind: can non-transactional schema changes (e.g. adding a column) result in *data-level* serializability violations? What would those anomalies look like? I'm honestly not sure, but it's something we can explore going forward!

  11. 5 Sep 2019

    So... when YugaByte says they "pass Jepsen", they're only talking about the parts of the test suite which look at changes to data records in the absence of schema changes. We think that's most important for users, and it's the vast majority of our tests.

  12. 5 Sep 2019

    The impact of this issue (like many of the problems we found in schema modification) is limited to a short time around table creation. Schema changes in general aren't transactional, so this might occur during other changes, like adding/removing columns--we haven't looked yet.

  13. 5 Sep 2019

    YugaByte DB doesn't pass Jepsen presently; some of the safety issues we identified in testing are still extant. For instance, YugaByte DB has a race condition which allows `DEFAULT NOW()` columns to be initialized to `NULL`, rather than a timestamp.

  14. 5 Sep 2019

    New report! We worked with to evaluate YugaByte DB 1.3.1's beta support for serializable SQL transactions. We found 2 safety bugs including anti-dependency cycles (now fixed), and availability issues like a slow leak in backend processes.

  15. 28 Aug 2019

    Jepsen 0.1.15 is out! It's a minor bugfix and ergonomics release.

  16. 14 Aug 2019
  17. 31 Jul 2019

    On Wednesday, August 14, I'll be giving a livestreamed talk with , discussing various approaches for transactional isolation in globally distributed systems, including Spanner, , FaunaDB, @YugaByteDB, , Percolator, and TiDB.

  18. 12 Jun 2019

    In the latest analysis, we worked with to test TiDB 2.1.7 through 3.0.0-rc.2. Due to 2 transaction retry mechanisms, TiDB exhibited constant snapshot isolation anomalies under normal conditions. This and other bugs are fixed in 3.0.0-rc.2:

  19. 20 May 2019

    Jepsen 0.1.14 is now available, featuring a bunch of bugfixes, usability improvements, and new tests.

  20. Retweeted
    15 May 2019

    During testing we discovered failovers after a network partition weren't happening quickly enough. Here's how we made them faster.

