Replying to @avibryant
@todd534 @cbeckpdx each shard can independently estimate the result of the algebraic expression in its subspace, then combine the results
Replying to @avibryant
@todd534 @cbeckpdx depending on the sizes of the sub-subsets you can choose how many shards you need to get a decent estimate.
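One possible reading of the sharding idea, as a minimal sketch: hash-partition the element space, evaluate the set expression inside each shard, and add up the per-shard counts. Evaluating only some of the shards and scaling up gives an estimate. The names `shard_of` and `estimate_sharded` are hypothetical, and each shard here counts exactly with plain Python sets, where a real system would presumably hold a sketch per shard.

```python
import hashlib

def shard_of(item, n_shards):
    # Assign an element to a shard by hashing it, so every set agrees
    # on which shard (subspace) a given element lives in.
    digest = hashlib.sha1(str(item).encode()).digest()
    return int.from_bytes(digest[:8], "big") % n_shards

def estimate_sharded(sets, expr, n_shards, n_sampled):
    # Evaluate `expr` inside the first `n_sampled` shards only, then scale
    # the combined count up to the full space.
    total = 0
    for shard in range(n_sampled):
        local = {name: {x for x in items if shard_of(x, n_shards) == shard}
                 for name, items in sets.items()}
        total += len(expr(local))  # shards are disjoint, so counts add
    return total * n_shards / n_sampled

sets = {
    "A": set(range(0, 1000)),
    "B": set(range(500, 1500)),
    "C": set(range(800, 1200)),
}
expr = lambda s: (s["A"] & s["B"]) - s["C"]   # true size is 300

exact = estimate_sharded(sets, expr, 16, 16)    # all shards: exactly 300.0
sampled = estimate_sharded(sets, expr, 16, 4)   # a quarter of the shards
```

Union, intersection, and difference all commute with restricting to a shard, which is why the per-shard results can simply be summed. Fewer sampled shards means less work but more variance, and that variance grows as the result set gets small.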
Replying to @avibryant
@avibryant @todd534 we currently use a similar approach. The hope is that we can get some advantage with having sets & complements...
Replying to @avibryant
@avibryant @todd534 that was my hope, but no such luck. Closest thing I've found used HLL + minhash. Convert the expr to CNF then...
Replying to @cbeckpdx
@avibryant @todd534 compute unions for each sub expr then intersect all. It's not clear how error compounds as more sets are intersected.
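A rough sketch of the CNF recipe described above: union the signatures inside each clause, then estimate the intersection of all clauses as (fraction of minhash slots where every clause agrees) × (estimated cardinality of the overall union). This is an assumption-laden illustration, not the linked post's implementation. In particular, the cardinality estimate here comes from the minhash values themselves as a stand-in for HLL, all function names are hypothetical, and input sets are assumed non-empty.

```python
import hashlib

K = 256  # number of hash functions per signature (assumed, tune as needed)

def _h(item, seed):
    # Seeded hash mapped to a float in [0, 1).
    d = hashlib.blake2b(f"{seed}:{item}".encode(), digest_size=8).digest()
    return int.from_bytes(d, "big") / 2**64

def signature(items):
    # Minhash signature: the minimum hash value under each of K hash functions.
    items = list(items)
    return [min(_h(x, seed) for x in items) for seed in range(K)]

def union(*sigs):
    # The signature of a union is the slot-wise minimum.
    return [min(vals) for vals in zip(*sigs)]

def cardinality(sig):
    # Stand-in for HLL: the minimum of n uniform values is roughly Exp(n),
    # so (K - 1) / sum(minima) estimates n.
    return (K - 1) / sum(sig)

def intersection_size(*sigs):
    # Slots where all signatures share the same minimum estimate
    # |intersection| / |union|; multiply by the union's cardinality.
    agree = sum(1 for vals in zip(*sigs) if len(set(vals)) == 1)
    return agree / K * cardinality(union(*sigs))

def estimate_cnf(clauses, sigs):
    # clauses: list of lists of set names, read as an AND of ORs.
    clause_sigs = [union(*(sigs[name] for name in clause)) for clause in clauses]
    return intersection_size(*clause_sigs)

sigs = {
    "A": signature(range(0, 600)),
    "B": signature(range(400, 1000)),
    "C": signature(range(300, 700)),
    "D": signature(range(900, 1200)),
}
# (A or B) and (C or D); the true size is 500
est = estimate_cnf([["A", "B"], ["C", "D"]], sigs)
```

The agreement fraction is where the error concentrates: each extra clause can only shrink it, and when the intersection is tiny relative to the union only a handful of the K slots agree, so the relative error grows.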
Replying to @avibryant
@avibryant @todd534 @posco was looking at http://tech.adroll.com/blog/data/2013/07/10/hll-minhash.html … uses minhashes to estimate set similarity. Usual prob w/ large card. diff
Replying to @avibryant
@avibryant @todd534 @posco I'll benchmark next week. I could see where it might help, but as you said not sure a larger HLL wouldn't do more