developers.google.com/safe-browsing/ is good documentation on how the Safe Browsing API used by browsers (Chromium, Firefox, Safari, etc.) works. It searches a database based on a truncated hash (4 bytes) of a canonicalized URL. It leaks some information, but doesn't send URLs directly.
Safe Browsing isn't currently supported by GrapheneOS and won't be enabled by default with the standard approach. It leaks too much. I haven't done anything to intentionally break it and wouldn't mind it as an optional feature, but the default mobile approach uses Play Services.
It fetches the database based on a 4-byte truncated hash of canonicalized URLs (canonicalization strips out a bunch of data before hashing). So, in cases where it doesn't already have an up-to-date list of full hashes for that truncated hash, it leaks the fact that you're visiting one of the many URLs sharing that prefix.
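As a rough sketch of the truncated-hash idea: the Safe Browsing docs describe taking a SHA-256 hash of a canonicalized URL expression and keeping a short prefix. The `simple_canonicalize` helper below is a hypothetical, heavily simplified stand-in for the real canonicalization rules (which also handle percent-escaping, IP addresses, and multiple host/path combinations), so treat this as an illustration only, not the actual client logic.

```python
import hashlib
from urllib.parse import urlsplit


def simple_canonicalize(url: str) -> str:
    """Hypothetical, simplified canonicalization (an assumption, not the real rules).

    Drops the scheme, userinfo, port, and fragment, and lowercases the host.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    path = parts.path or "/"
    expression = host + path
    if parts.query:
        expression += "?" + parts.query
    return expression


def hash_prefix(expression: str, length: int = 4) -> bytes:
    """Return the first `length` bytes of the SHA-256 digest of the expression."""
    return hashlib.sha256(expression.encode("utf-8")).digest()[:length]


# Two URLs that differ only in stripped-out parts map to the same 4-byte prefix.
a = hash_prefix(simple_canonicalize("HTTP://User@Example.COM:80/a/b?x=1#frag"))
b = hash_prefix(simple_canonicalize("https://example.com/a/b?x=1"))
print(a.hex(), b.hex())
```

Only the 4-byte prefix would go over the wire in this model, so anyone observing it learns which bucket of URLs you are in, not the URL itself.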
32 bits does seem like a lot. I wonder how large the total database of URLs is. If that's n, this scheme provides k-anonymity with k ≈ n/(2**32), assuming hashes spread evenly across prefixes. The web is big, but I wonder if it's big enough for that to be enough anonymization.
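To get a feel for the n/(2**32) estimate, here is a small back-of-the-envelope calculation. The values of n are made-up placeholders, not measurements of the web or of the Safe Browsing database, and the estimate assumes URLs hash uniformly across all 2**32 prefixes.

```python
PREFIX_BITS = 32  # 4-byte truncated hash


def expected_k(n_urls: int, prefix_bits: int = PREFIX_BITS) -> float:
    """Average number of URLs sharing one prefix, assuming uniform hashing."""
    return n_urls / 2 ** prefix_bits


# Hypothetical values of n (illustrative only, not real counts).
for n in (10**9, 10**12, 10**14):
    print(f"n = {n:.0e}: k is roughly {expected_k(n):,.1f}")
```

Below about 4.3 billion URLs (2**32), the average bucket holds fewer than one URL, so the prefix would often identify a single URL; k only becomes meaningfully large when n is many multiples of that.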
OTOH, there's a tradeoff here, because the safe browsing list prevents you from visiting malicious sites. You're leaking a little bit of data to one party in order to decrease your odds of potentially-extreme leakage to other almost-certainly malicious parties.


