Why do I always get the weirdest bugs? By my calculation, this one has a 0.03% chance of hitting. https://bugzilla.redhat.com/show_bug.cgi?id=1632590 …
-
Replying to @yukkuriyankey
It's not a Ceph bug; I've actually never had a bad Ceph bug so far. It's just a libvirt bug/bad default that happens to be triggered more easily by using Ceph.
-
Replying to @yukkuriyankey
Honestly, that post-mortem is mostly a story of a bunch of admin errors plus one or two Ceph bugs. No data was lost. I'm not going to criticize the admins since they're unpaid volunteers and we all learn from mistakes, but I'm having a much better time running my Ceph clusters.
-
Replying to @marcan42
The worst Ceph story I have to share (impact-wise) was just last week, when I did a full reinstall, intended to be hitless, on a production cluster (Ceph+OS upgrade and full OSD reformat into BlueStore) while a VM server was running an old, buggy version of librbd.
-
Replying to @marcan42
I consider it my screwup, because I forgot that I had upgraded one VM cluster recently but the other one was still running year-old software. A quick VM restart fixed it; total downtime was ~15 minutes. Then I hitlessly updated the VM servers using live migration.
-
Replying to @marcan42
Other services using the cluster weren't impacted at all, which is pretty impressive when you consider I was doing a full rolling disk format and OS reinstall on every OSD in the Ceph cluster.
But yes, we have backups. We cross-backup between two Ceph clusters (and don't update them at the same time).