It's a neat one, but I had to ditch using it recently. Turns out for my use case, the RNG was the bottleneck. It was faster to call the RNG a single time, but make two passes over the data.
-
-
-
I strongly suspect that will be the case for me as well, but I wanted to try both to be sure.
- Još 1 odgovor
Novi razgovor -
-
-
I was just teaching about reservoir sampling in my API design class today. It's such a clever algorithm.
Hvala. Twitter će to iskoristiti za poboljšanje vaše vremenske crte. PoništiPoništi
-
-
-
There's also a buffered variant:https://gist.github.com/pervognsen/e0896d13b8b8a52da36520b5faeb8302 …
-
The only reason you'd use it, I think, is that it uses less randomness by a constant factor. Reservoir sampling uses lg(1) + lg(2) + ... lg(n) = lg(n!) ~ n lg(n) bits when the info theory optimum is only lg(n). With a k-buffer you get lg(k) + lg((n/k)!) ~ lg(k) + (n/k) lg(n/k).
Kraj razgovora
Novi razgovor -
-
-
But there’s an algorithm in the Wikipedia page (L) which does a lot less work than Z, should have just read Wikipedia...
- Još 2 druga odgovora
Novi razgovor -
-
The case of sampling 1 item, where replacement probability sequence is 1, 1/2, 1/3, 1/4, etc., is useful to know, as it has other applications For example, in old Photoshop w/o additive blend, you could average 4 images by stacking them in layers w/opacity = 1, 1/2, 1/3, 1/4.
Hvala. Twitter će to iskoristiti za poboljšanje vaše vremenske crte. PoništiPoništi
-
-
-
Ha, I was given this as an interview problem. "You are processing a stream of unknown length. You can store one item from the stream at a time. How can you select a single item from the stream, at random, with equal probability?"
-
Haha me too!
Kraj razgovora
Novi razgovor -
Čini se da učitavanje traje već neko vrijeme.
Twitter je možda preopterećen ili ima kratkotrajnih poteškoća u radu. Pokušajte ponovno ili potražite dodatne informacije u odjeljku Status Twittera.