@avibryant COPY is v sensitive to number of files. How many?
40 hours into this Redshift COPY query. Maybe I should have split it up into smaller chunks.
@SnowPlowData on the order of 50,000 files, 400 million rows, took 41 hours on 2 8xlarge nodes /cc @rahulpathak
@avibryant @rahulpathak COPY sucks with 1000+ files. Reckon an S3DistCp step to reduce to 100-500ish files would give you a massive speedup
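A rough sketch of how that consolidation step might be sized, not the setup anyone in this thread actually ran. It works out a per-file target size from a desired output file count and assembles an S3DistCp argument list using the `--src`, `--dest`, `--groupBy` and `--targetSize` options. The bucket paths, the grouping pattern, the 150 GiB total and the `s3-dist-cp` command name are all hypothetical placeholders; the thread never states the data volume, and how S3DistCp is invoked depends on your EMR version.

```python
import math

MIB = 1024 * 1024


def target_size_mib(total_bytes: int, target_files: int) -> int:
    """Per-file size in MiB that yields roughly `target_files` output files."""
    if total_bytes <= 0 or target_files <= 0:
        raise ValueError("total_bytes and target_files must be positive")
    return max(1, math.ceil(total_bytes / target_files / MIB))


def s3distcp_args(src: str, dest: str, group_by: str, size_mib: int) -> list:
    """Build an S3DistCp argument list (command name is an assumption)."""
    return [
        "s3-dist-cp",
        f"--src={src}",
        f"--dest={dest}",
        f"--groupBy={group_by}",      # regex; files are grouped by its capture group
        f"--targetSize={size_mib}",   # approximate output file size, in MiB
    ]


if __name__ == "__main__":
    # Hypothetical numbers: 150 GiB of input, aiming for about 300 files.
    size = target_size_mib(150 * 1024**3, 300)
    args = s3distcp_args(
        "s3://example-bucket/raw/",           # hypothetical source prefix
        "s3://example-bucket/consolidated/",  # hypothetical destination prefix
        r".*/(events)-.*\.tsv",               # hypothetical grouping pattern
        size,
    )
    print(" ".join(args))
```

The output file count is only approximate, since S3DistCp treats the target size as a guideline. After consolidating, COPY would be pointed at the destination prefix instead of the original one.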
@avibryant let me know if I can answer any questions.
@avibryant Would be nice to hear about your experience after you get the data loaded in.
@sorenmacbeth Soren Macbeth: A biochemical mechanism for turning free range eggs into links to Deal With It Gifs @avibryant
@avibryant it's always the etl that kills ya