Try using '--buffer-size=6G' or larger to use more RAM. Also remember that 'mmuniq' is potentially lossy: the odd row may be lost due to a hash collision. Please do report if you encounter such a case and if it's a problem. Maybe we could bump the hash to 128 bits at only 2x RAM cost.
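To get a rough feel for how often a row might be lost, here is a minimal sketch of the standard birthday-bound estimate. It assumes the current hash is 64 bits wide (inferred from the "bump to 128 bits at 2x RAM" remark, not confirmed) and that hashes are uniformly distributed; `collision_probability` is a hypothetical helper, not part of 'mmuniq'.

```python
import math


def collision_probability(n_rows: int, hash_bits: int) -> float:
    """Approximate chance that at least two of n_rows distinct rows
    share a hash value, using the birthday bound:
    P ~= 1 - exp(-n * (n - 1) / 2^(bits + 1)).
    Assumes uniformly distributed hashes (an idealization)."""
    exponent = -n_rows * (n_rows - 1) / 2.0 ** (hash_bits + 1)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)


if __name__ == "__main__":
    # Row counts are illustrative, not measurements from any real run.
    for bits in (64, 128):
        for n in (10**8, 10**9, 10**10):
            p = collision_probability(n, bits)
            print(f"{bits:>3}-bit hash, {n:.0e} rows: P(collision) ~ {p:.3g}")
```

Under these assumptions, a collision somewhere in the input becomes plausible at around a billion distinct rows with a 64-bit hash, and is negligible with 128 bits.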
This is awesome, check out this: https://amplitude.engineering/dedupe-events-at-scale-f9e416e46ca9