
Intersection of cryptography, software engineering, statistics, and performance: If I want to take a random sample from a set of data, and I wish to simply do "hash(dataitem) & bitmask < value", what properties does my hash function need for the sample to be statistically sound? Clearly, a ...
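For concreteness, a minimal Python sketch of the predicate in the question. The name `in_sample`, the use of SHA-256 as the hash, and the `bits` and `threshold` defaults are assumptions for illustration only, not something the post specifies. With a uniform hash, the expected sampling rate is threshold / 2**bits.

```python
import hashlib


def in_sample(item: bytes, bits: int = 16, threshold: int = 6554) -> bool:
    """Keep `item` iff the low `bits` bits of its hash are below `threshold`.

    Hypothetical helper: SHA-256 stands in for whatever hash is chosen.
    """
    digest = hashlib.sha256(item).digest()
    h = int.from_bytes(digest[:8], "big")  # first 64 bits of the digest
    bitmask = (1 << bits) - 1
    # Parenthesized on purpose: in C-family languages `<` binds tighter
    # than `&`, so `h & bitmask < threshold` would parse differently there.
    return (h & bitmask) < threshold


if __name__ == "__main__":
    items = [b"item-%d" % i for i in range(50_000)]
    kept = sum(in_sample(x) for x in items)
    print(f"kept {kept} of {len(items)} (target rate {6554 / 2**16:.3f})")
```

The decision depends only on the item's bytes, so the same item is always in or always out of the sample, which is usually the point of hashing rather than drawing random numbers.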
(Plus, if in doubt, and assuming you're asking about non-cryptographic hashes for efficiency reasons: can't you test it on a representative corpus of data? SHA-256/512 would probably be fine. Chances are you have even better-optimized DEFLATE, so maybe use that to decrease computation.)
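One way to run the corpus test suggested above, sketched in Python. Everything named here is an assumption for illustration: the helper `chi_square_uniformity`, the synthetic `corpus` (a stand-in for real representative data), and the choice of `zlib.crc32` as an example of a cheap non-cryptographic candidate. The idea is to bucket the masked hash values and compute a chi-square statistic against a uniform expectation; for k buckets and a well-behaved hash it should land near k - 1.

```python
import hashlib
import zlib
from collections import Counter
from typing import Callable, Iterable


def sha256_low64(item: bytes) -> int:
    """First 64 bits of SHA-256, as an integer."""
    return int.from_bytes(hashlib.sha256(item).digest()[:8], "big")


def chi_square_uniformity(
    items: Iterable[bytes], hash_fn: Callable[[bytes], int], bits: int = 8
) -> float:
    """Chi-square statistic of masked hash values against a uniform spread."""
    buckets = 1 << bits
    bitmask = buckets - 1
    counts = Counter(hash_fn(x) & bitmask for x in items)
    n = sum(counts.values())
    expected = n / buckets
    return sum((counts.get(b, 0) - expected) ** 2 / expected for b in range(buckets))


# Synthetic stand-in; swap in a sample of the real data.
corpus = [b"record-%d" % i for i in range(100_000)]

if __name__ == "__main__":
    for name, fn in [("sha256", sha256_low64), ("crc32", zlib.crc32), ("len", len)]:
        print(name, round(chi_square_uniformity(corpus, fn), 1))
```

`len` is included only as a deliberately bad "hash" so the statistic has something to flag. A statistic far above k - 1 points to skew; one far below it can indicate structure too regular to pass as random, so both tails are worth a look on real data.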