A thing I just realized when talking to a colleague who isn't an engineer: you know how there is an O(N) way and an O(N^2) way to answer the question "Which lines in this CSV file have an email address which is not unique in the file?" ? Well:
-
-
Nontechnical guys' next question: What's a hash table, Patrick?
-
Happy to help, nontechnical guys! It's a fundamental data structure which is used *pervasively* in professional programming. The bottom line is that it lets you insert things very fast and retrieve them (or check presence) very fast, regardless of how many things there are in the table.
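For the engineers following along, here is a minimal sketch of the O(N) way to answer the CSV question with a hash table. It assumes things the thread never specifies: the file has a header row with a column named `email` (a hypothetical name), and each record sits on exactly one line. `Counter` is Python's dict-backed counting table, so each insert and lookup is constant time on average.

```python
import csv
import io
from collections import Counter

def lines_with_duplicate_emails(csv_text, column="email"):
    """Return 1-based line numbers of rows whose email appears more than once.

    Assumes a header on line 1 and one record per line (no multi-line fields).
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Pass 1: count every email in a hash table.
    counts = Counter(row[column] for row in rows)
    # Pass 2: keep rows whose email was seen more than once.
    # Data rows start on line 2 because line 1 is the header.
    return [line_no for line_no, row in enumerate(rows, start=2)
            if counts[row[column]] > 1]

sample = "email\na@x.com\nb@x.com\na@x.com\n"
duplicates = lines_with_duplicate_emails(sample)
```

The O(N^2) way would compare every row against every other row. Here two passes over the file are enough.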
End of conversation
New conversation -
-
-
Is this generally true? The lower the order of the computational complexity, the easier it is to write/reason about/extend?
-
Nope, but happens to be true for this case.
End of conversation
New conversation -
-
-
For a single check, I would prefer the N log N sort to the N-in-theory-but-not-on-real-hardware hash table approach. The hash table is much better if you want to do this online, of course.
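A rough sketch of what the sort-based variant could look like, assuming the emails are already loaded into a list. The function name is made up for illustration. After sorting, equal emails sit next to each other, so a single pass over neighbours finds every duplicate.

```python
def duplicate_indices_by_sorting(emails):
    """Return the original indices of emails that occur more than once."""
    # Sort the indices by email so equal values become adjacent: O(N log N).
    order = sorted(range(len(emails)), key=lambda i: emails[i])
    duplicates = []
    for pos, i in enumerate(order):
        same_as_prev = pos > 0 and emails[order[pos - 1]] == emails[i]
        same_as_next = pos + 1 < len(order) and emails[order[pos + 1]] == emails[i]
        if same_as_prev or same_as_next:
            duplicates.append(i)
    # Report in original file order.
    return sorted(duplicates)
```

This needs the whole input up front, which is why the hash table suits the online case better: it can answer "seen before?" as each row arrives.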
-
-
-
"Excellent question!" that is a beautiful response. Positive feedback for curiosity!
-
-
-
A nice side effect is that this also works for different notions of equality, by choice of hash function: ignore whitespace, ignore casing, support palindromes, etc.
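One way to sketch this in Python is to normalize each value into a key before it goes into the table, which has the same effect as swapping the hash and equality functions. Both function names below are hypothetical, and the normalization (strip all whitespace, fold case) is just one possible choice.

```python
from collections import defaultdict

def normalize_email(text):
    """Example key: drop all whitespace and ignore case."""
    return "".join(text.split()).casefold()

def groups_of_equal_items(items, key):
    """Group item indices by key(item); return only groups with 2+ members."""
    groups = defaultdict(list)
    for index, item in enumerate(items):
        groups[key(item)].append(index)
    return [indices for indices in groups.values() if len(indices) > 1]

emails = ["A@x.com", " a@X.com ", "b@x.com"]
matches = groups_of_equal_items(emails, normalize_email)
```

Passing a different `key`, for example one that maps a string and its reverse to the same value, changes what counts as "the same" without touching the grouping code.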
-
-
-
Curious to know the O(?) of filing an entry in your brain's neural net, and using the ocular interface API that returns whether you recognize the next entry or not. Assuming no read or retrieval errors for estimates. Is a synapse firing O(1)?
-
Sure, but our brains clearly have a high-constant logarithmic access time. It's the only way we could remember as much as we do. They also must have learned index structures to help key off various axes (there was an amazing paper last year about them). So surely it's log-based.
End of conversation
New conversation -
-
-
First idea in my head was bloom filter. I wonder how that would be easily explainable in non-technical terms.
-
You're tooling up for a > billion emails, I see.
End of conversation
New conversation -