Thinking about how to publish domain blacklists without immediately revealing the domains. Compressed Bloom filters or Golomb-coded sets using hash functions whose complexity increases exponentially + a final memory-hard hash?
A simpler approach would be to publish a set of truncated hashes of exponentially increasing work factor for each domain. Easier to do and update, but obviously less memory efficient than probabilistic structures.
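One possible reading of the truncated-hash idea, as a rough sketch rather than the scheme the author had in mind: for each domain, publish a few short hash prefixes, where level k costs twice as much to compute as level k-1. A client checks the cheap level first and only pays for the expensive ones when the cheaper prefixes match. The salt, iteration counts, truncation length, and all function names below are assumptions made for illustration, and PBKDF2 stands in for whatever hash would actually be chosen.

```python
import hashlib

# All constants are hypothetical values chosen for illustration.
SALT = b"example-blocklist-v1"   # public salt shipped with the list
BASE_ITERATIONS = 1_000          # cost of level 0
LEVELS = 4                       # level k costs BASE_ITERATIONS * 2**k
TRUNCATE_BYTES = 2               # bytes published per level


def canonicalize(domain: str) -> bytes:
    """Very rough canonical form: lowercase, no trailing dot, ASCII only."""
    return domain.strip().lower().rstrip(".").encode("ascii")


def level_hash(domain: str, level: int) -> bytes:
    """Truncated PBKDF2 digest whose cost doubles with each level."""
    iterations = BASE_ITERATIONS * 2 ** level
    digest = hashlib.pbkdf2_hmac(
        "sha256", canonicalize(domain), SALT + bytes([level]), iterations
    )
    return digest[:TRUNCATE_BYTES]


def publish(domains):
    """Return the set of per-domain tuples of truncated hashes."""
    return {tuple(level_hash(d, k) for k in range(LEVELS)) for d in domains}


def is_listed(domain: str, published) -> bool:
    """Check levels from cheapest to most expensive, stopping at the first miss."""
    candidates = published
    for k in range(LEVELS):
        h = level_hash(domain, k)
        candidates = {entry for entry in candidates if entry[k] == h}
        if not candidates:
            return False
    return True


published = publish(["bad.example", "evil.example"])
print(is_listed("Bad.Example.", published))
print(is_listed("good.example", published))
```

Most lookups for unlisted domains stop after the cheapest level, while someone trying to enumerate the list has to pay the full cost for every guess that survives the early levels.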
Replying to @jedisct1
Maybe the tricky part with Bloom filters is managing the false-positive ratio. Canonicalization of the domains can also be a source of issues. Not sure what the best data structure is...
Replying to @adulau
The false-positive ratio can be made negligible if the bit vector is large enough. It will remain more compact than the second approach no matter what.
3:58 AM - 4 Apr 2018
from Saint-Mandé, France
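To put numbers on the false-positive discussion above, here is a small sketch using the standard Bloom filter approximations: for n items and a target false-positive rate p, the bit vector needs about m = -n ln p / (ln 2)^2 bits with k = (m/n) ln 2 hash functions. The function names and the sample figures (100,000 domains, one false positive in a million) are assumptions for illustration, not values from the thread.

```python
import math


def bloom_parameters(n_items: int, fp_rate: float):
    """Bit-vector size m and hash count k for a target false-positive rate."""
    m_bits = math.ceil(-n_items * math.log(fp_rate) / (math.log(2) ** 2))
    k_hashes = max(1, round(m_bits / n_items * math.log(2)))
    return m_bits, k_hashes


def expected_fp_rate(n_items: int, m_bits: int, k_hashes: int) -> float:
    """Standard approximation: (1 - e^(-k*n/m))^k."""
    return (1 - math.exp(-k_hashes * n_items / m_bits)) ** k_hashes


# Hypothetical list size and target rate.
n, p = 100_000, 1e-6
m, k = bloom_parameters(n, p)
print(f"{m} bits ({m / 8 / 1024:.0f} KiB), {k} hash functions")
print(f"about {m / n:.1f} bits per domain")
print(f"expected false-positive rate: {expected_fp_rate(n, m, k):.2e}")
```

Under these assumptions the filter costs roughly 29 bits per domain. The comparison with a truncated-hash list depends on how many bytes are published per domain there.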