When you have a terabyte of RAM, all it takes is one faulty bit to wreck your week. It's amazing anything works with so many possible points of failure.
Not all of it is ECC, unfortunately. After this past week/month, that's going to change real soon.
I feel like I oughta get in on that, since I seem to get one crash a day lately, albeit on much tinier datasets 'cause I don't want my electric bill to take off like a rocket.
This was extremely anxiety-inducing because I was getting weird corruption errors in my ZFS array: compressed files that would work, then not work, then work again -- really crazy shit. I mean pure random chaos.
This is why I only like coding for stats/analysis, etc. Operations shit is just stressful.

