We put the servers behind a load balancer, and each server gets a portion of the incoming requests. LOTS of services are built like this. Super common pattern. You're probably yawning! pic.twitter.com/6KPDf5OXQg
Well, if we look at the affected customer's neighbors, they're still fine! As long as their client is fault tolerant, which can be as simple as using retries, they can still get service. One neighbor gets service from node 2, for example. pic.twitter.com/xJRpdK6Fgn
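A minimal sketch of what "fault tolerant, as simple as using retries" could look like on the client side. The `send` callback and the `make_sender` helper are hypothetical stand-ins for a real transport; only the idea of trying the next node in the shard comes from the thread.

```python
from typing import Callable, Sequence, Set


def call_with_retries(shard: Sequence[int], send: Callable[[int], str]) -> str:
    """Try each node in the customer's shard in turn; return the first success."""
    last_error = None
    for node in shard:
        try:
            return send(node)
        except ConnectionError as err:
            last_error = err  # this node is unhealthy, move on to the next one
    raise RuntimeError("every node in the shard failed") from last_error


def make_sender(down: Set[int]) -> Callable[[int], str]:
    """Hypothetical transport: nodes listed in `down` refuse connections."""
    def send(node: int) -> str:
        if node in down:
            raise ConnectionError(f"node {node} is down")
        return f"served by node {node}"
    return send


# A customer on shard (1, 2) whose neighbor has taken out node 1.
print(call_with_retries([1, 2], make_sender(down={1})))  # served by node 2
```

The customer only loses service when every node in its own shard is down, which is why full overlap is the case that matters.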
O.k. let's PAUSE for a second and appreciate that. Same number of nodes. Same number of nodes for each customer. Same number of customers. Just by using MATH, we've reduced the blast radius to 1 customer! That's INSANE.
The blast radius ends up getting really small. It's roughly proportional to the factorial of the shard size (small) divided by the factorial of the number of nodes (which is big) ... so it can get really really small. pic.twitter.com/LmaffLA3tR
Let's look at an example. So for 8 nodes and a shard size of 2, as in these slides, the blast radius ends up being just 3.6%. What that means is that if one customer triggers an issue, only 3.6% of other customers will be impacted. Much better than the 25% we saw earlier. pic.twitter.com/BuDe9pCMEa
But that's still way too high for us. For AWS HyperPlane, the system that powers VPC NAT Gateway, Network Load Balancer, PrivateLink, etc., we design for a hundred nodes and a shard size of five. Let's look at those numbers ...
O.k. now things get really really small. About 0.0000013% of other customers would share fate in this case. It's so small that because we have fewer than a million customers per cell anyway, there can be zero full overlap. pic.twitter.com/xmZHjKMIy0
Again, think about it: we can build a huge multi-tenant system with lots of customers on it, and still guarantee that there is *no* full overlap between those customers. Just using math. This still blows my mind.
If you want to try some numbers out for yourself, here's a Python script that calculates the blast radius: https://gist.github.com/colmmacc/4a39a6416d2a58b6c70bc73027bea4dc … . Try it for Route 53's numbers. There are 2048 Route 53 virtual name servers, and each hosted zone is assigned to 4. So n = 2048, and m = 4.
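For readers who can't open the gist, here is a small independent sketch of the full-overlap calculation. It is not the linked script; it assumes every shard is a uniformly random m-node subset of n nodes, and the function name `blast_radius` is made up for illustration.

```python
from math import comb


def blast_radius(n: int, m: int) -> float:
    """Fraction of other customers expected to share the exact same shard,
    assuming shards are uniformly random m-node subsets of n nodes."""
    return 1 / comb(n, m)


# The three configurations mentioned in the thread.
for n, m in [(8, 2), (100, 5), (2048, 4)]:
    fraction = blast_radius(n, m)
    print(f"n={n:5d} m={m}: {fraction * 100:.10f}% of other customers")
```

With n = 8 and m = 2 there are 28 possible shards, which gives the 3.6% figure; n = 100 and m = 5 gives 75,287,520 possible shards, matching the roughly 0.0000013% quoted for HyperPlane.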
If you want to make your own Shuffle Shard patterns, and make guarantees about non-overlap, we open sourced our approach years ago. It's at: https://github.com/awslabs/route53-infima …
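To make the non-overlap guarantee concrete, here is a toy sketch of shard assignment. It does not use or mirror the route53-infima API; `assign_shards` and its parameters are hypothetical, and the approach (random draws with rejection of already-used shards) is just one simple way to guarantee that no two customers share an identical shard.

```python
import random
from math import comb
from typing import Dict, FrozenSet, Sequence, Set


def assign_shards(customers: Sequence[str], num_nodes: int, shard_size: int,
                  seed: int = 0) -> Dict[str, FrozenSet[int]]:
    """Give each customer a random shard, rejecting any shard already in use."""
    if len(customers) > comb(num_nodes, shard_size):
        raise ValueError("more customers than distinct shards")
    rng = random.Random(seed)  # seeded so the assignment is reproducible
    used: Set[FrozenSet[int]] = set()
    assignments: Dict[str, FrozenSet[int]] = {}
    for customer in customers:
        while True:
            shard = frozenset(rng.sample(range(num_nodes), shard_size))
            if shard not in used:  # never hand out an identical shard twice
                break
        used.add(shard)
        assignments[customer] = shard
    return assignments


shards = assign_shards([f"customer-{i}" for i in range(10)], num_nodes=8, shard_size=2)
for name, shard in shards.items():
    print(name, sorted(shard))
```

Two customers may still share one node, but never both, so a retrying client always has somewhere else to go.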
Shuffle Sharding is amazing! It's just an application of combinatorics, but it decreases blast radii by huge factorial factors. So what does it take to use it in practice?
Well, the client has to be fault-tolerant. That's easy, nearly all are. The technique works for servers, queues, and even things like storage. So that's easy too. The big gotcha is that you need a routing mechanism. pic.twitter.com/1berEp8FO9
You either give each customer resource a DNS name, like we do for S3, CloudFront, and Route 53, and handle it at the DNS layer, or you need a content-aware router that can do Shuffle Sharding. Of course at our scale this makes sense, but it doesn't for everyone.
O.k., bonus MATH content!! I want to convince you that that rough approximation from earlier is correct, because with more insight we can make smarter decisions. pic.twitter.com/rsgJrYQu63
Shuffle Sharding is just like a lottery. Think about your nodes like the numbers in a lottery, and each customer gets a ticket with |shardsize| count of numbers. You want to measure the probability that two or more tickets match.
First, we have to define some shorthand. N is the number of nodes. S is the shard size. O is the potential overlap between two tickets/customers. https://en.wikipedia.org/wiki/Lottery_mathematics … has good background on how we then come to this equation ... pic.twitter.com/RMPYMIBtx4
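The equation itself lives in the attached image. As a reconstruction (not a transcription of the slide), the standard lottery-mathematics result for matching exactly O of S numbers drawn from N, written in the shorthand above, is:

```latex
P(O) = \frac{\binom{S}{O}\,\binom{N-S}{S-O}}{\binom{N}{S}}
```

The numerator counts the tickets that share exactly O nodes with a given ticket, and the denominator counts all possible tickets.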
Now, let's take the special case of "full overlap". That's the case we care the most about; the problem request eats all of the nodes it can reach. How many other customers are impacted? Since O=S in this case, we end up with ... pic.twitter.com/hIb2VR7RPq
The bracket notation is short for "choose" and since x choose x is 1, and x choose 0 is 1, we can replace everything above the line with 1. pic.twitter.com/X6evtSZMsv
Now let's expand the choose operator into its factorials ... pic.twitter.com/f0L2ZBGZYw
When you were about 10 you probably learned the reciprocal of a reciprocal is just to turn it upside down. That's still true. That gives us our final form ... pic.twitter.com/KfL5vELMxy
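Since the intermediate steps are only in the images, here is a hedged reconstruction of the chain described in words, starting from the standard lottery formula with O = S. The symbols N, S, and O are the thread's own shorthand; the layout is an assumption.

```latex
\begin{align*}
P(O = S) &= \frac{\binom{S}{S}\,\binom{N-S}{0}}{\binom{N}{S}}
          = \frac{1}{\binom{N}{S}} \\[4pt]
         &= \frac{1}{\dfrac{N!}{S!\,(N-S)!}}
          = \frac{S!\,(N-S)!}{N!}
\end{align*}
```

As a sanity check, N = 8 and S = 2 gives 2! · 6! / 8! = 1/28, or about 3.6%, matching the earlier example.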
In that form it's pretty easy to see that the smaller S is relative to N, the more dramatically effective Shuffle Sharding gets! Convinced? I hope so!
That was my whole talk and it took me longer to tweet it than it did to give it on stage! Go figure. Feel free to AMA. And thanks for reading and making it to the end. Now use Shuffle Sharding anywhere that you can!