Next steps in addressing an excellent paper (arxiv.org/abs/1803.05046) to fix issues with the data dumps -- I will be reviewing the largest gaps for comment data and ingesting that data over the next few weeks (knock out the low-hanging fruit) -- I would ...
... anticipate that it will take 4-6 weeks to fill in the largest gaps for missing comments. I will then rescan all missing ids in the sequential areas (ids over 27 billion for comments) and ingest the missing data there. Probably 1-2 months before it's complete.
