@avibryant @cbeckpdx want to do set-size estm. for arb. algebraic expns on a fixed set of sets; plan to sketch A and -A for all A in advance
Replying to @todd534
@avibryant @cbeckpdx sets vary wildly in size, and are all subsets of the same, say, 10M items.
Replying to @avibryant
@todd534 @cbeckpdx each shard can independently estimate the result of the algebraic expression in their subspace, then combine results
Replying to @avibryant
@todd534 @cbeckpdx depending on the sizes of the sub-subsets you can choose how many shards you need to get a decent estimate.
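One possible reading of the sharding idea above, as a minimal Python sketch. It assumes items are assigned to shards by a hash of the item, so the shards partition the universe and per-shard result sizes simply add up. The names `shard_of`, `split_into_shards`, and `estimate_size` are hypothetical, and each shard evaluates the expression exactly with Python sets as a stand-in for whatever per-shard estimator is actually used.

```python
import hashlib

def shard_of(item, num_shards):
    # Assumption: a stable hash of the item decides which shard it belongs to.
    digest = hashlib.sha256(str(item).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

def split_into_shards(named_sets, num_shards):
    # Every shard holds the slice of each named set that hashes into it.
    shards = [{name: set() for name in named_sets} for _ in range(num_shards)]
    for name, items in named_sets.items():
        for item in items:
            shards[shard_of(item, num_shards)][name].add(item)
    return shards

def estimate_size(expr, shards, use_shards=None):
    # Evaluate the expression inside each chosen shard, add up the sizes,
    # and scale by the fraction of shards that were actually looked at.
    chosen = shards if use_shards is None else shards[:use_shards]
    partial = sum(len(expr(shard)) for shard in chosen)
    return partial * len(shards) / len(chosen)

named_sets = {
    "A": set(range(0, 6000)),
    "B": set(range(4000, 10000)),
    "C": set(range(5000, 5500)),
}
expr = lambda s: (s["A"] & s["B"]) - s["C"]  # (A AND B) minus C

shards = split_into_shards(named_sets, 16)
exact = estimate_size(expr, shards)                 # all shards: no sampling error
approx = estimate_size(expr, shards, use_shards=4)  # a quarter of the shards, scaled up
```

Using every shard reproduces the exact answer, while using fewer shards trades accuracy for work, which is the knob the tweet describes: small result sets need more shards before the scaled-up count is trustworthy.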
Replying to @avibryant
@avibryant @todd534 we currently use a similar approach. The hope is that we can get some advantage with having sets & complements...
Replying to @avibryant
@avibryant @todd534 that was my hope, but no such luck. Closest thing I've found used HLL + minhash. Convert the expr to CNF then...
Replying to @cbeckpdx
@avibryant @todd534 compute unions for each sub expr then intersect all. It's not clear how error compounds as more sets are intersected.
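The CNF recipe in the two tweets above can be sketched roughly as follows. This is an illustration under stated assumptions, not the method from whatever source @cbeckpdx found: the MinHash signatures use a simple linear hash family chosen here for brevity, the function names are hypothetical, input sets are assumed non-empty, and the cardinality of the overall union is counted exactly where the described approach would use an HLL estimate.

```python
import hashlib
import random

PRIME = (1 << 61) - 1  # modulus for the illustrative hash family

def base_hash(item):
    digest = hashlib.sha256(str(item).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % PRIME

def make_hashers(k, seed=0):
    # k seeded (a, b) pairs, each defining h(x) = (a * x + b) mod PRIME.
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(PRIME)) for _ in range(k)]

def signature(items, hashers):
    # MinHash signature: the minimum hash value under each hash function.
    bases = [base_hash(x) for x in items]
    return [min((a * h + b) % PRIME for h in bases) for a, b in hashers]

def union_signature(sigs):
    # The signature of a union is the position-wise minimum.
    return [min(column) for column in zip(*sigs)]

def estimate_cnf(clauses, sigs, union_cardinality):
    # Step 1: one union signature per clause (each clause is an OR of sets).
    clause_sigs = [union_signature([sigs[n] for n in clause]) for clause in clauses]
    # Step 2: the fraction of positions where every clause agrees estimates
    # |intersection of clauses| / |union of clauses|.
    matches = sum(1 for column in zip(*clause_sigs) if len(set(column)) == 1)
    jaccard = matches / len(clause_sigs[0])
    # Step 3: scale by the cardinality of the overall union.
    return jaccard * union_cardinality

sets = {
    "A": set(range(0, 1500)),
    "B": set(range(1000, 2500)),
    "C": set(range(500, 2000)),
    "D": set(range(1750, 3000)),
}
hashers = make_hashers(256)
sigs = {name: signature(items, hashers) for name, items in sets.items()}

cnf = [["A", "B"], ["C", "D"]]  # (A OR B) AND (C OR D)
union_cardinality = len(set().union(*sets.values()))  # exact stand-in for HLL
estimate = estimate_cnf(cnf, sigs, union_cardinality)
true_size = len((sets["A"] | sets["B"]) & (sets["C"] | sets["D"]))  # 2000
```

The number of agreeing positions shrinks as the intersection gets small relative to the union, so the relative error grows with each extra clause, which is one way to see the compounding concern raised in the tweet.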
@cbeckpdx @todd534 @posco this is what I was thinking of: https://github.com/twitter/algebird/blob/develop/algebird-core/src/main/scala/com/twitter/algebird/HyperLogLog.scala#L601 …
Replying to @avibryant
@avibryant @cbeckpdx thanks for thoughts! appreciate your time.