Borgmaster had two models: built-in logic used synchronous, edge-triggered state machines, while external components were asynchronous and level-based. More on level vs. edge triggering: https://hackernoon.com/level-triggering-and-reconciliation-in-kubernetes-1f17fe30333d
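For the thread, a minimal Go sketch of the distinction (invented names, not Borgmaster code): an edge-triggered handler reacts to individual transitions and silently under-corrects if an event is lost, while a level-based controller reconciles observed state against desired state, so missed events are harmless.

```go
package main

import "fmt"

// Edge-triggered: react to a single state transition. If an event is
// lost (crash, dropped notification), nothing ever compensates for it.
func onReplicaDied(observed *int) {
	*observed++ // replace exactly the one replica we were told about
}

// Level-triggered: compare observed state to desired state and correct
// the difference, regardless of which events were or weren't delivered.
func reconcile(desired, observed int) int {
	return desired - observed // replicas to add (negative: remove)
}

func main() {
	desired, observed := 5, 5
	observed -= 2 // two replicas die, but only ONE event is delivered

	onReplicaDied(&observed) // edge: fixes only the event we saw
	fmt.Println("edge:", observed, "of", desired)

	observed += reconcile(desired, observed) // level: converges anyway
	fmt.Println("level:", observed, "of", desired)
}
```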
One of the first things I did when joining the Borgmaster team back in 2009 was to parallelize the handling of read requests. Something like 99% of requests were reads, primarily from external controllers and monitoring systems polling Borgmaster.
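One common way to parallelize a read-mostly server, as a hedged Go sketch (Borgmaster was C++; this is not its actual code): reads take a shared lock so they can run concurrently, while the occasional write still gets exclusive access.

```go
package main

import "sync"

// Illustrative read-mostly store: sync.RWMutex lets many readers
// proceed in parallel, and writers serialize with everyone.
type Store struct {
	mu   sync.RWMutex
	data map[string]string
}

func NewStore() *Store { return &Store{data: make(map[string]string)} }

func (s *Store) Get(key string) (string, bool) {
	s.mu.RLock() // shared lock: readers don't block each other
	defer s.mu.RUnlock()
	v, ok := s.data[key]
	return v, ok
}

func (s *Store) Put(key, value string) {
	s.mu.Lock() // exclusive lock: writers block readers and writers
	defer s.mu.Unlock()
	s.data[key] = value
}

func main() {
	s := NewStore()
	s.Put("job/websearch", "running")

	var wg sync.WaitGroup
	for i := 0; i < 8; i++ { // concurrent pollers, like the ~99% read load
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.Get("job/websearch")
		}()
	}
	wg.Wait()
}
```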
Only BNS (analogous to K8s Endpoints) was written to Chubby, which enabled replicated caching and update notification. That allowed it to scale to a much larger number of readers (~every container in Borg) and reduced latency, which with polling could be tens of seconds.
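The pattern, sketched in Go with invented names (Chubby's real file-based API is not shown): readers hold local replicas and the source pushes updates to them, so reads never hit the master, and update latency is bounded by notification delivery rather than by a polling interval.

```go
package main

// Each reader holds a local replica; the source notifies all replicas
// on change. Read load fans out to the caches instead of the master.

type cache struct{ value string }

func (c *cache) Read() string { return c.value } // served locally, no round trip

type source struct{ subs []*cache }

func (s *source) Subscribe(c *cache) { s.subs = append(s.subs, c) }

func (s *source) Publish(v string) {
	for _, c := range s.subs { // push, rather than wait to be polled
		c.value = v
	}
}

func main() {
	src := &source{}
	readers := make([]*cache, 1000) // ~every container in the cell
	for i := range readers {
		readers[i] = &cache{}
		src.Subscribe(readers[i])
	}
	src.Publish("backend endpoints, v2")
	_ = readers[0].Read()
}
```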
Watch-like notification APIs (aka sync and tail) were common for storage systems such as Chubby, Colossus, and Bigtable. In 2013, a generalized Watch API was designed so that each system wouldn't need to reinvent the wheel. A variant, "Observe", added per-entity sequencing.
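The internal Watch/Observe APIs aren't public, so the following is only a guess at the shape of such an interface; every name here is invented, and the Sequence field gestures at the per-entity ordering the "Observe" variant added.

```go
package watch

// Event is a single change notification for one entity.
type Event struct {
	Key      string // entity identifier
	Value    []byte // new contents; nil if deleted
	Deleted  bool
	Sequence uint64 // per-entity sequence number ("Observe"-style)
}

// Watcher streams changes from a given starting point, so a client
// can resume a stream without missing intervening updates.
type Watcher interface {
	// Watch delivers events for keys under prefix, starting at
	// fromSeq. Closing stop ends the stream.
	Watch(prefix string, fromSeq uint64, stop <-chan struct{}) (<-chan Event, error)
}
```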
We built Kubernetes upon etcd due to its similarities to Chubby and to the Omega store. When we exposed etcd's watch (https://coreos.com/etcd/docs/latest/learning/api.html) through the K8s API, we let more etcd details bleed through than originally intended. We need to clean up some of those details soon.
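What that looks like from the client side: a K8s watch is an HTTP GET that streams newline-delimited JSON events, and the resourceVersion parameter, backed by etcd's revision, is one of those bled-through details. The sketch below assumes `kubectl proxy` is serving the API on localhost:8001.

```go
package main

import (
	"bufio"
	"fmt"
	"net/http"
)

// Watch pods through the Kubernetes API. Clients pass resourceVersion
// back to the server to resume a watch where they left off.
func main() {
	const url = "http://localhost:8001/api/v1/pods?watch=true&resourceVersion=0"
	resp, err := http.Get(url)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// The response is a stream of newline-delimited JSON events:
	//   {"type":"ADDED","object":{...}}
	//   {"type":"MODIFIED","object":{...}}
	scanner := bufio.NewScanner(resp.Body)
	scanner.Buffer(make([]byte, 0, 1024*1024), 1024*1024) // objects can be large
	for scanner.Scan() {
		fmt.Println(scanner.Text())
	}
}
```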
The Kubernetes model is described here: https://github.com/kubernetes/community/blob/master/contributors/design-proposals/architecture/resource-management.md#declarative-control
Some other systems use message buses for notifications. Why didn't we? Controllers need to start from the initial state; we also don't want them to fall behind or operate on state that's too stale; and they need to be able to handle "missed" events -- the level-based rationale.
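That rationale is what the list-then-watch pattern implements; client-go calls it ListAndWatch in its Reflector. The interfaces below are hypothetical stand-ins, but the loop structure is the real idea: list to establish the level, watch to follow edges, and relist whenever the watch breaks or falls too far behind.

```go
package reflector

// Event mirrors the watch event shape: a change type plus the object.
type Event struct {
	Type   string // "ADDED", "MODIFIED", "DELETED", or "ERROR"
	Object interface{}
}

// Source is a hypothetical stand-in for the API client machinery.
type Source interface {
	List() (objects []interface{}, resourceVersion string, err error)
	Watch(fromVersion string) (<-chan Event, error)
}

// Run never misses state: any dropped or expired events are repaired
// by the next full list.
func Run(src Source, handle func(Event)) {
	for {
		// 1. Establish the level: full state plus the version it's at.
		objects, rv, err := src.List()
		if err != nil {
			continue // retry (real code would back off)
		}
		for _, obj := range objects {
			handle(Event{Type: "ADDED", Object: obj})
		}

		// 2. Follow edges from that version.
		events, err := src.Watch(rv)
		if err != nil {
			continue // relist
		}
		for ev := range events {
			if ev.Type == "ERROR" { // e.g. the version is too old to resume
				break // fall back to step 1 and relist
			}
			handle(ev)
		}
	}
}
```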
Replying to @bgrant0607
What about the issue that message-bus-based systems would typically hit scaling issues, with the message bus being the bottleneck? Were you all aware of the problems associated with that approach, and did that factor into your decision process as well?
Replying to @bradtopol
All distribution methods have scaling challenges. A lot depends on the required semantics of the data and of update delivery: atomicity, serialization, consistency, freshness, etc. For instance, if you modify 7 entities of different types, will clients observe the updates in the same order?
Replying to @bgrant0607
All of the above is true. I was referring more to the difficulty of scaling messaging systems like RabbitMQ. But thanks for the memories. As someone lucky enough to build tools for early cluster systems like PVM, it's amazing to see all the innovations combined into K8s.
Ah, yes, I used PVM when at LLNL, and was involved in early discussions around MPI, shortly after the publication of "Attack of the Killer Micros".
Replying to @bgrant0607
Good times!!! I remember at Supercomputing ’93, Kendall Square Research had a very cheesy juggling acrobat show at their booth called the “Alpha Brothers”. The end was near...
pic.twitter.com/xI4QzLZWOM
Replying to @bradtopol
I had that mug! I still have my CM-5 t-shirt from SC93.