Update on the Pushshift Reddit API ingest: an issue today caused approximately 15 hours of data to be skipped. The ingest code was written specifically to avoid any data gaps, so something else may have happened deeper in the system (Elasticsearch, etc.).
That said, tomorrow I will be spending most of the day working to finally move the new ingest code into production as well as filling any gaps in the data from today. I sincerely appreciate the reports I received today alerting me to this issue.
The goal by tomorrow evening will be to move the new and improved ingest logic into production and include additional monitoring logic to avoid future occurrences of this bug. I greatly appreciate the feedback and your patience as I work to fix these issues tomorrow.
Thank you again!
