I have a maybe interesting cardinality estimation problem. Twitter-friends, who should I talk to?
@avibryant @todd534 we currently use a similar approach. The hope is that we can get some advantage from having sets & complements...
@avibryant @todd534 that was my hope, but no such luck. Closest thing I've found used HLL + minhash. Convert the expression to CNF then...
@avibryant @todd534 compute unions for each sub-expression, then intersect them all. It's not clear how the error compounds as more sets are intersected.
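To make the CNF idea above concrete, here is a minimal Python sketch, not the posters' implementation. It assumes each clause is a list of sets that are OR-ed together and that clauses are AND-ed. The names `K`, `minhash`, `union_signature`, and `estimate_cnf` are hypothetical, and the exact union size stands in for the HLL estimate mentioned in the thread.

```python
import hashlib

K = 256  # signature length (hypothetical choice); larger K lowers variance


def _hash(seed, item):
    # Deterministic 64-bit hash of an item under a given seed.
    data = f"{seed}:{item}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash(items):
    # MinHash signature: the minimum hash value under each of K seeded hashes.
    items = list(items)
    return [min(_hash(seed, x) for x in items) for seed in range(K)]


def union_signature(signatures):
    # The signature of a union is the element-wise minimum of the signatures.
    return [min(column) for column in zip(*signatures)]


def estimate_cnf(clauses):
    # clauses: list of clauses; each clause is a list of sets (OR-ed),
    # and the clauses themselves are AND-ed.
    clause_signatures = []
    everything = set()
    for clause in clauses:
        clause_signatures.append(union_signature([minhash(s) for s in clause]))
        everything |= set().union(*clause)
    # The fraction of positions where every clause signature agrees
    # estimates |intersection of clauses| / |union of clauses|.
    agree = sum(1 for column in zip(*clause_signatures) if len(set(column)) == 1)
    jaccard = agree / K
    # Assumption: exact count used here as a stand-in for an HLL union estimate.
    union_cardinality = len(everything)
    return jaccard * union_cardinality


a1, a2 = set(range(0, 200)), set(range(200, 400))
b = set(range(200, 600))
# (a1 OR a2) AND b; the true answer is 200.
estimate = estimate_cnf([[a1, a2], [b]])
```

The standard error of the Jaccard estimate scales roughly as sqrt(J(1 - J) / K), so when the intersection is small relative to the union, the relative error of the final count grows, which is one way the compounding concern shows up.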
@avibryant @todd534 @posco was looking at http://tech.adroll.com/blog/data/2013/07/10/hll-minhash.html … It uses minhashes to estimate set similarity. Usual problem with large cardinality differences.
@avibryant @cbeckpdx that's actually not that far from what we have now. We were hoping we could constrain the resources a bit more.
@avibryant @todd534 ditto, but we're interested to see if we can get a deterministic error bound. Could be a 90% solution.