"Error budgets" are the god-damn worst idea I've heard of in recent years. SLAs should be realistic goals about what we can achieve with our current techniques and tools, not permission to fail a certain amount.
Replying to @colmmacc
Does anyone really use an “error-budget” though - outside Google? I don’t know of anyone, tbh.
Replying to @copyconstruct
SRE culture has tentacles, and seems to bring error budgets, seemingly endless Kubernetes tinkering, and stack over-complication to some places. Plenty of good too, mostly good on balance, but some of it gets to me.
Replying to @copyconstruct @colmmacc
The "platonic ideal" of error budgets seems to me like a useful goal to align all stakeholders (ugh) in the lifecycle (design, dev, operation) of a "service". It seems to me like the cost of autonomy is an increased risk of externalizing failures (1/N)
E.g., "Whups, sorry a bug in a new feature in our microservice took down the whole platform". Error budgets simply provide a strong contract on how to gate feature velocity when these things happen. Probably more rarely, they can help enforce architectural desires (2/N)
Replying to @jhscott @copyconstruct
SLAs already do this. If you're in danger of missing an SLA ... more work on stability and resilience is evidently needed. Typically that means fewer features ... but naturally. I think having to gate features is unhealthy and a sign of organizational dysfunction.
Replying to @colmmacc @copyconstruct
I think that "strong contracts empower distributed ownership". Thus, an explicit error budget -- with e.g. common visibility of its current value -- empowers teams to make the right decisions and have clear expectations about what happens when you miss your SLA.
Replying to @jhscott @copyconstruct
Measuring SLAs and how you're doing, down to each customer's experience, and having transparency about that ... that's all good. But that's just using an SLA IMO. Error budgets are an extra concept layered on top, and my point is I don't think it's a good one.
Even the name implies that there's a budget for errors. It's a bad framing. It leads to behaviors like being less cautious when less of the budget has been spent. Or worse ... that you can plan to spend it proactively to help cut some corners.
Replying to @colmmacc @copyconstruct
ah but I think that's one of the best parts! Like there is always a balance between velocity and reliability, right? So setting an *appropriate* SLA and error budget makes it clear to e.g. product managers that you are moving slowly b/c you need five nines.
vs moving faster if you can get away with three nines. If your business use case can't take the corner cutting, tighten your SLA!
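
To put rough numbers on the five-nines vs three-nines point, here is a minimal sketch of the arithmetic. The function name `error_budget_minutes` and the 30-day window are assumptions made for illustration; nobody in the thread specifies a measurement window or a formula.

```python
def error_budget_minutes(availability: float, window_days: int = 30) -> float:
    """Minutes of allowed unavailability in the window for an availability target.

    Hypothetical helper: the 30-day default window is an assumption.
    """
    total_minutes = window_days * 24 * 60
    return (1.0 - availability) * total_minutes


# Compare the two targets mentioned in the thread.
for label, target in [("three nines", 0.999), ("five nines", 0.99999)]:
    minutes = error_budget_minutes(target)
    print(f"{label}: about {minutes:.2f} minutes per 30 days")
```

Under these assumptions, three nines leaves roughly 43 minutes of unavailability per 30 days, while five nines leaves under half a minute. That gap is the velocity-versus-reliability trade-off being argued about here.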