"Error budgets" are the god-damn worst idea I've heard of in recent years. SLAs should be realistic goals about what we can achieve with our current techniques and tools, not permission to fail a certain amount.
9 times out of 10 when I hear about error budgets, it's a team that is focusing on its own convenience because it hasn't invested in convenient resilience, and isn't putting customers first. SLAs can be crude at this ... is a 4-hour outage OK on your busiest day? Still 99.95% YoY!
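To make the 99.95% figure concrete, here is a rough sketch of the arithmetic. It assumes a 365-day year and a simple time-based availability measure; the function names are made up for illustration and are not from any SLA tooling.

```python
# Assumption: availability is measured as uptime / total time over one
# non-leap year. Real SLAs often use other windows or request-based measures.
HOURS_PER_YEAR = 365 * 24  # 8760


def availability_pct(downtime_hours, period_hours=HOURS_PER_YEAR):
    """Availability (in percent) after the given downtime in the period."""
    return 100.0 * (1 - downtime_hours / period_hours)


def allowed_downtime_hours(target_pct, period_hours=HOURS_PER_YEAR):
    """Downtime (in hours) that a target availability leaves as 'budget'."""
    return period_hours * (1 - target_pct / 100.0)


if __name__ == "__main__":
    # A single 4-hour outage over a year.
    print(f"{availability_pct(4):.3f}%")
    # Yearly downtime budget implied by a 99.95% target.
    print(f"{allowed_downtime_hours(99.95):.2f} hours")
```

A single 4-hour outage still rounds to 99.95% for the year, and a 99.95% target leaves roughly 4.4 hours of yearly budget. The number says nothing about when those hours fall, which is the crudeness being complained about here.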
Counterpoints: http://reddit.com , a top-20 website worldwide in terms of traffic, regularly goes offline for maintenance. http://bhphotovideo.com is one of the biggest e-commerce websites; they intentionally turn off the site for a full day every week for religious reasons.
Apple basically turn off the online store on announcement days too and they're not doing so badly. But I can't, I just can't. To me it's like hitting the patient.
I'm in the middle of a very strong counter-argument right now with @Flickr. ;)
I do appreciate the delicious Panda-themed lemonade!
It's time when you're ... down.
For specific purposes, error budgets (and spending them) are a useful tool. E.g. taking a service offline in one AZ allows you to validate that no one has hardcoded a dependency on it (despite it being globally available). Sounds like some teams are abusing the concept.
Interesting - this might have changed by now, but not striving to be more available than the SLA was a strong part of Google’s culture (ref. the SRE book). I always found that interesting, but also understood the reasons.
Generally I agree. But my bank taking multi-hour planned outages of some features on the weekend doesn’t seem to be hurting its business. So I think it depends on the context. Certainly at Internet scale it doesn’t make any sense, because of the cost of outages.