Shuffle Sharding is amazing! It's just an application of combinatorics, but it decreases blast radiuses by huge factorial factors. So what does it take to use it in practice?
Well, the client has to be fault-tolerant. That's easy, nearly all are. The technique works for servers, queues, and even things like storage. So that's easy too. The big gotcha is that you need a routing mechanism. pic.twitter.com/1berEp8FO9
You either give each customer resource a DNS name, like we do for S3, CloudFront, Route53, and handle it at the DNS layer, or you need a content-aware router that can do ShuffleSharding. Of course at our scale, this makes sense, but not for everyone.
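To make the routing idea concrete, here is a minimal Python sketch of how a router might map a customer to a shuffle shard. It is only an illustration, not how S3, CloudFront, or Route53 do it: the function name `shuffle_shard`, the node names, and the choice of seeding a random generator from a SHA-256 hash of the customer ID are all assumptions.

```python
import hashlib
import random


def shuffle_shard(customer_id: str, nodes: list[str], shard_size: int) -> list[str]:
    """Pick a stable pseudo-random subset of nodes for one customer (hypothetical helper)."""
    # Seed a private RNG from a hash of the customer ID so the same customer
    # always maps to the same shard for a given node list.
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest, "big"))
    # sample() draws shard_size distinct nodes, like the numbers on a lottery ticket.
    return sorted(rng.sample(nodes, shard_size))


# Hypothetical fleet of 8 nodes with a shard size of 2.
nodes = [f"node-{i}" for i in range(8)]
shard_a = shuffle_shard("customer-a", nodes, 2)
shard_b = shuffle_shard("customer-b", nodes, 2)
print(shard_a, shard_b)
```

A real router would also need a plan for nodes joining and leaving the fleet, which this sketch ignores: changing the node list can reshuffle every customer's shard.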
O.K., bonus MATH content!! I want to convince you that the rough approximation from earlier is correct, because with more insight we can make smarter decisions. pic.twitter.com/rsgJrYQu63
Shuffle Sharding is just like a lottery. Think about your nodes like the numbers in a lottery, and each customer gets a ticket with |shardsize| count of numbers. You want to measure the probability that two or more tickets match.
First, we have to define some shorthand. N is the number of nodes. S is the shard size. O is the potential overlap between two tickets/customers. https://en.wikipedia.org/wiki/Lottery_mathematics has good background on how we then come to this equation ... pic.twitter.com/RMPYMIBtx4
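The pictured equation is not reproduced in the text, so this is a reconstruction. It assumes the picture shows the standard lottery-mathematics formula for two tickets sharing exactly O numbers, written with the N, S, and O defined above:

```latex
\[
P(O) = \frac{\binom{S}{O}\,\binom{N-S}{S-O}}{\binom{N}{S}}
\]
```

The first term counts the ways to pick the O shared nodes from one customer's shard. The second counts the ways to pick the remaining S - O nodes from outside that shard. The denominator counts every possible shard.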
Now, let's take the special case of "full overlap". That's the case we care the most about; the problem request eats all of the nodes it can reach. How many other customers are impacted? Since O=S in this case, we end up with ... pic.twitter.com/hIb2VR7RPq
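As a reconstruction of the pictured step (the image itself is not in the text), substituting O = S into the standard lottery formula for an overlap of exactly O numbers gives:

```latex
\[
P(O = S) = \frac{\binom{S}{S}\,\binom{N-S}{S-S}}{\binom{N}{S}}
         = \frac{\binom{S}{S}\,\binom{N-S}{0}}{\binom{N}{S}}
\]
```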
The bracket notation is short for "choose" and since x choose x is 1, and x choose 0 is 1, we can replace everything above the line with 1. pic.twitter.com/X6evtSZMsv
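Written out, and again reconstructed rather than copied from the picture, replacing both "choose" terms above the line with 1 leaves:

```latex
\[
P(O = S) = \frac{1 \cdot 1}{\binom{N}{S}} = \frac{1}{\binom{N}{S}}
\]
```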
Now let's expand the choose operator into its factorials ... pic.twitter.com/f0L2ZBGZYw
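Using the usual definition of N choose S as N! / (S! (N-S)!), the expanded full-overlap probability should read as follows (a reconstruction of the pictured step):

```latex
\[
P(O = S) = \frac{1}{\binom{N}{S}} = \frac{1}{\dfrac{N!}{S!\,(N-S)!}}
\]
```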
When you were about 10 you probably learned that taking the reciprocal of a fraction just means turning it upside down. That's still true. That gives us our final form ... pic.twitter.com/KfL5vELMxy
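Flipping the fraction under the 1 gives what is presumably the pictured final form: the probability that another customer's shard fully overlaps with yours, in terms of the number of nodes N and the shard size S.

```latex
\[
P(O = S) = \frac{S!\,(N-S)!}{N!}
\]
```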
In that form it's pretty easy to see that the smaller S is relative to N, the more dramatically effective Shuffle Sharding gets! Convinced? I hope so!
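For a quick numeric feel, here is a small Python sketch that evaluates the full-overlap probability S!(N-S)!/N!, which is the same as 1 / (N choose S). The function name and the example fleet sizes are illustrative assumptions, not numbers from the talk.

```python
from math import comb, factorial


def full_overlap_probability(n: int, s: int) -> float:
    """Chance that another customer's shard is the exact same s nodes out of n."""
    # 1 / C(n, s) is the compact form of s! * (n - s)! / n!
    return 1 / comb(n, s)


def full_overlap_probability_factorials(n: int, s: int) -> float:
    """Same quantity, computed straight from the factorial form."""
    return factorial(s) * factorial(n - s) / factorial(n)


# Hypothetical (N, S) pairs, purely for illustration.
for n, s in [(8, 2), (100, 5), (2048, 4)]:
    print(f"N={n:5d} S={s}: 1 in {comb(n, s):,} customers shares your whole shard")
```

With 8 nodes and a shard size of 2 there are only 28 possible shards, so 1 customer in 28 shares a fully overlapping shard. The count of possible shards grows very quickly as N grows relative to S.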
That was my whole talk and it took me longer to tweet it than it did to give it on stage! Go figure. Feel free to AMA. And thanks for reading and making it to the end. Now use Shuffle Sharding anywhere that you can!