Even though Borg tasks are designed to be resilient, this could get pretty disruptive. Rate-limiting maintenance tasks independently isn't efficient if you have dozens of them, and it's not always feasible to perform all types of maintenance at the same time.
-
-
Even with rate-limiting of machine disruptions, it's still possible for the same task to get disrupted over and over again, like a can being kicked down the road. Hence, the Safe Removal Service (aka SRSly -- pronounced "seriously") was developed by SRE. (SRE builds lots of automation.)
-
SRSly kept track of how often tasks of the same Borg Job were disrupted (aka evicted). Maintenance automation queried SRSly about all tasks scheduled on a machine before taking it out of service. This enabled Borg to provide an SLO on task disruption.
-
Borgmaster, however, did not know about SRSly. Instead, all critical/production workloads were changed to run with the same priority so that they wouldn't preempt each other. Doing that for every Borg Job in the company was extremely painful -- more on priority/preemption later
-
For Omega, we developed a model that could be applied both to task preemption (to run a higher-priority task) and to eviction for maintenance -- disruption counters. The model had a time dimension that ended up not being effective due to constant changes, so we dropped it in K8s.
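The core idea can be sketched as a toy illustration (this is not Omega's or Kubernetes' actual code): each job tracks disruptions against an allowed budget, and both preemption and maintenance eviction draw from the same counter.

```python
class DisruptionBudget:
    """Toy model: a single budget consumed by all voluntary disruptions."""

    def __init__(self, max_disruptions: int):
        self.max_disruptions = max_disruptions
        self.used = 0

    def try_disrupt(self) -> bool:
        # Both causes of voluntary disruption (preemption for a
        # higher-priority task, eviction for machine maintenance)
        # decrement one shared counter, so a task can't be endlessly
        # "kicked down the road" by alternating causes.
        if self.used < self.max_disruptions:
            self.used += 1
            return True
        return False


budget = DisruptionBudget(max_disruptions=2)
print(budget.try_disrupt())  # True
print(budget.try_disrupt())  # True
print(budget.try_disrupt())  # False: budget exhausted, disruption refused
```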
-
I think I first mentioned this in Kubernetes in my big scheduling braindump comment: https://github.com/kubernetes/kubernetes/issues/4301#issuecomment-74355529. It came up again when I proposed maxUnavailable to moderate concurrent disruptions caused by updates during the design of Deployment: https://github.com/kubernetes/kubernetes/pull/12236#discussion_r36501373
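maxUnavailable survives in today's Deployment API as part of the rolling-update strategy. A minimal sketch (the name, replica count, and image are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web            # illustrative name
spec:
  replicas: 10
  selector:
    matchLabels:
      app: web
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 2   # at most 2 of the 10 pods down during an update
      maxSurge: 1         # at most 1 extra pod created above replicas
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.25   # illustrative image
```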
-
That discussion was forked into https://github.com/kubernetes/kubernetes/issues/12611. Around that time, Matt Liggett (https://github.com/kubernetes/kubernetes/pulls?q=is%3Apr+author%3Amml+is%3Aclosed) joined the GKE team from Borg SRE (woo hoo!). One of the first things Matt worked on was improving node drains: https://github.com/kubernetes/kubernetes/issues/6080
-
Together with @davidopp and @erictune4, we folded disruption budgets into the rescheduling design proposal: https://github.com/kubernetes/community/blob/master/contributors/design-proposals/scheduling/rescheduling.md#disruption-budget. (Rescheduling deserves its own thread -- I'll do that one next.) Implementation began in https://github.com/kubernetes/kubernetes/pull/24697 and https://github.com/kubernetes/kubernetes/pull/25551
-
PodDisruptionBudget is now documented: https://kubernetes.io/docs/concepts/workloads/pods/disruptions/ and https://kubernetes.io/docs/tasks/run-application/configure-pdb/. Try it out and give us feedback on how well it works for you. We're looking to advance it from beta to GA: https://github.com/kubernetes/enhancements/issues/85
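A minimal PodDisruptionBudget, as described in those docs (the label and threshold are illustrative):

```yaml
apiVersion: policy/v1beta1  # PDB was still beta at the time of this thread
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  minAvailable: 8        # refuse voluntary evictions that would leave
                         # fewer than 8 matching pods available
  selector:
    matchLabels:
      app: web
```

Node drains honor this budget: an eviction that would violate it is refused until enough other pods become ready again.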
-
PDBs are deliberately fairly limited right now. Readiness probes (a future topic) are also relatively limited. Since the addition of readiness gates (https://github.com/kubernetes/enhancements/blob/master/keps/sig-network/0007-pod-ready%2B%2B.md), we've discussed perhaps extending custom "metrics" (from HPA) to more parts of the system.