HyperLogLog and MinHash. On estimating cardinality of unions and intersections efficiently, by Andrew Pascoe @adroll.
http://tech.adroll.com/blog/data/2013/07/10/hll-minhash.html …
@robanhk I’m suspicious that they would have gotten lower error using the same total amount of memory for one or the other, vs both.
@avibryant Hmm, I guess they don't show a comparison. You *can* estimate intersection from HLL. Can you get cardinality from MinHash?
@robanhk I haven’t seen any formal proofs of the error but in practice, yes. I think my MinHash in algebird has a method for that.
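One common way to get a cardinality out of a MinHash-style signature is the k-minimum-values (KMV) estimate: keep the k smallest hash values, and if the k-th smallest is v (scaled to [0, 1)), estimate the distinct count as (k - 1) / v. The sketch below is only an illustration under that assumption; it is not taken from algebird, and the names `hash01`, `bottom_k`, and `estimate_cardinality` are hypothetical, as is the choice of SHA-1 as the hash.

```python
import hashlib


def hash01(item: str) -> float:
    """Map an item to a pseudo-uniform value in [0, 1).

    SHA-1 is an illustrative choice, not something the thread specifies.
    """
    digest = hashlib.sha1(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def bottom_k(items, k):
    """Return the k smallest distinct hash values, sorted ascending."""
    return sorted({hash01(x) for x in items})[:k]


def estimate_cardinality(signature, k):
    """KMV estimate: (k - 1) divided by the k-th smallest hash value."""
    if len(signature) < k:
        # Fewer than k distinct hashes were seen, so the count is exact.
        return float(len(signature))
    return (k - 1) / signature[k - 1]


if __name__ == "__main__":
    k = 1024
    users = [f"user-{i}" for i in range(100_000)]
    print(estimate_cardinality(bottom_k(users, k), k))
```

The relative error of this estimator shrinks roughly like 1/sqrt(k), so the memory-versus-accuracy trade-off raised earlier in the thread still applies.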