If you are looking for Reddit data for all of 2020 and early 2021, look no further. I thought I had to reingest the previous months because of a server failure a while back, but I actually had a backup server still ingesting the data. So I was cleaning up today and found over
one billion Reddit comments for all of 2020 and early 2021. The script was doing its job happily until the 4TB SSD drive ran out of space -- but the data is there! I just need to compress 1.2+ billion comments and submissions and the data will be up within a week.
Yes -- as soon as the dumps are up, I'm going to do a full scan on production, fill any gaps, and get the new ingest code in place so there aren't any more large delays in prod (beta has been keeping up to date with the new code).