Have you ever needed to generate a random number in code? whether it's for rolling a dice, or shuffling a set, this tweet thread is here for you! There's no reason that it should be easy or obvious, very experienced programmers repeat common mistakes. I did, before I learned ...
-
Show this thread
-
O.k. let's start with the most common problem, and the most common mistakes: how do we pick a random number between 0 and N inclusive, let's say N = 5, so like a dice that starts at zero because we're nerds.
1 reply 1 retweet 53 likesShow this thread -
A common solution is to r = rand() % (N + 1). Easy, right? Wrong! This solution is biased. To see how, imagine that RAND_MAX is "15". 0 % 6 == 0, 6 % 6 == 0, and 12 % 6 == 0 , so there are three rand() values each that return 0. Same works for 1, 2, 3 ...
6 replies 6 retweets 63 likesShow this thread -
But there are only two rand() values that return 4: 4 % 6 and 10 % 6. Same for 5. So 0, 1, 2, 3 are all 50% more likely than 4 or 5. That's bias! Of course with a big RAND_MAX, this bias diminishes, but it's still there. So don't do it this way.
1 reply 2 retweets 42 likesShow this thread -
O.k. so what if we use a float to avoid this? It's tempting! We can build a float between 0.0 and 1.0 by doing f = (float) rand() / (float) RAND_MAX. So we build a little macro or function for that, and now we can just do r = randFloat() * 5, right?
2 replies 1 retweet 20 likesShow this thread -
No! This doesn't work other, because FLOATING POINT IS LIES. The distribution between 0.0 and 1.0 is not evenly spread and there's all sorts of rounding fuzziness that will make r non-uniform. No good.
5 replies 8 retweets 89 likesShow this thread -
The solution here is pretty interesting and totally non-intuitive. What you have to do is to compute the highest multiple of n that is smaller than RAND_MAX. Then you call rand() and if you get a number that's higher, you discard it and go again. E.g. https://github.com/awslabs/s2n/blob/643976c1c03c7f4a3003d7e066f18536c410c2b4/utils/s2n_random.c#L166 …
2 replies 22 retweets 148 likesShow this thread -
int random(int max) { while(1) { int r = rand(): if (r < (RAND_MAX - (RAND_MAX % max))) { return r % max; } } }
5 replies 9 retweets 131 likesShow this thread -
Replying to @colmmacc
Do you know of some original source I should cite for this when mentioning it in a paper, or is it "obvious"?
1 reply 0 retweets 0 likes
Academic papers usually call it rejection sampling, or acceptance-rejection sampling, and I think it's so old as to not have a clear progenitor.
Loading seems to be taking a while.
Twitter may be over capacity or experiencing a momentary hiccup. Try again or visit Twitter Status for more information.