40 hours into this Redshift COPY query. Maybe I should have split it up into smaller chunks.
@SnowPlowData on the order of 50,000 files, 400 million rows, took 41 hours on 2 8xlarge nodes /cc @rahulpathak
@avibryant @rahulpathak COPY sucks with 1000+ files. Reckon a S3DistCp step to reduce to 100-500ish files would give you a massive speedup
@avibryant @rahulpathak We've been loading 200m rows into a smallish cluster in about 1 hour; COMPROWS 20 million added 6 hours to that