A thing I just realized when talking to a colleague who isn't an engineer: you know how there is an O(N) way and an O(N^2) way to answer the question "Which lines in this CSV file have an email address that is not unique in the file?" Well:
"How do you tell that a list with 100,000 words in it doesn't include a duplicate without checking each word against every other word, Patrick?" Excellent question! Answer: you stuff each word into a hash table as you're checking it, and then you only have to go through the list once.
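A minimal sketch of that single pass, in Python. The function name `has_duplicate` is made up for illustration, and Python's built-in `set` stands in for the hash table; each membership check and insert is O(1) on average, so the whole loop is O(N).

```python
def has_duplicate(words):
    """Return True if any word appears more than once in `words`."""
    seen = set()  # hash-based set: average O(1) lookup and insert
    for word in words:
        if word in seen:
            # We have met this word earlier in the same pass.
            return True
        seen.add(word)
    return False


print(has_duplicate(["apple", "pear", "apple"]))
print(has_duplicate(["apple", "pear", "plum"]))
```

The first call finds a repeat and the second does not. The naive version compares every word against every other word, which is roughly N^2/2 comparisons for N words.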
-
There are many O(N) solutions, and some are harder to reason about or extend (e.g. https://stackoverflow.com/a/14946745). It's unclear which solution you like. I'd typically make the emails keys in a hash that stores counts, though that is not the most efficient option.
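One way the counting approach might look in Python, as a sketch only. It assumes the CSV has a header row with a column named `email` (a guess, not something stated in the thread), compares addresses as exact strings, and uses the hypothetical function name `rows_with_duplicate_emails`. It makes two passes over the rows, which is still O(N).

```python
import csv
import io
from collections import Counter


def rows_with_duplicate_emails(csv_text, email_column="email"):
    """Return the rows whose email appears more than once in the file."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Pass 1: count how often each email occurs (a hash keyed by email).
    counts = Counter(row[email_column] for row in rows)
    # Pass 2: keep only the rows whose email is not unique.
    return [row for row in rows if counts[row[email_column]] > 1]


sample = (
    "name,email\n"
    "Ann,a@example.com\n"
    "Bob,b@example.com\n"
    "Cat,a@example.com\n"
)
for row in rows_with_duplicate_emails(sample):
    print(row["name"], row["email"])
```

On the sample data this prints the Ann and Cat rows, since they share an address. Keeping counts rather than a plain seen-set costs a little extra memory, but it lets the second pass report every affected line, including the first occurrence.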