How to lose data:
1. A problem process eats disk space
2. Your email alert threshold is at 10% free
3. Your paging (wake me up) threshold is at 5% free
4. The ext4 reserved blocks are the default 5%

Woke up to FS at 5.01% free, a pile of <10% alert emails, and lost data.
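For anyone wanting to avoid the same trap, here is a rough sketch of the check I should have had. It assumes the monitoring was comparing raw free space (which still counts the root-reserved blocks) against the thresholds, so an unprivileged app hits "disk full" right around the moment the 5% page would fire. The function names and the `warn_pct` / `page_pct` parameters are made up for illustration; only `os.statvfs` and its fields are real.

```python
import os


def free_percentages(path):
    """Return (raw_free_pct, usable_free_pct) for the filesystem at path.

    raw_free_pct counts every free block, including blocks reserved
    for root. usable_free_pct counts only blocks an unprivileged
    process can actually allocate.
    """
    st = os.statvfs(path)
    if st.f_blocks == 0:
        return 0.0, 0.0
    raw_free = 100.0 * st.f_bfree / st.f_blocks
    usable_free = 100.0 * st.f_bavail / st.f_blocks
    return raw_free, usable_free


def classify(usable_free_pct, warn_pct=10.0, page_pct=5.0):
    """Map usable free space to an alert level (hypothetical levels)."""
    if usable_free_pct <= page_pct:
        return "page"
    if usable_free_pct <= warn_pct:
        return "email"
    return "ok"


if __name__ == "__main__":
    raw, usable = free_percentages("/")
    print(f"raw free: {raw:.2f}%  usable free: {usable:.2f}%")
    print("alert level:", classify(usable))
```

With 5% reserved, "5.01% free" by the raw number is about 0.01% usable, which this version would have paged on long before the app ran out of room.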
This *was* a separate mount, so it indeed only killed the app. But the app was storing data that only comes in once, so several hours of data are now lost.
-
-
The sad thing is there was 1.5TB of wasted space on that filesystem. It used to be the only storage for the app, but then I switched to automatically moving data elsewhere... I left the old files around when I did that months ago, intending to delete them, and never did :/
-
The cronjob didn't actually fail, it was stuck doing backlog cleanup of months of *metadata* for hours, because the service it speaks to is crap and takes forever. I found out about the broken metadata cleanup yesterday and fixed it... But that took longer than expected.