Update on the new Reddit ingest pipeline. The new ingest code is currently being tested for any esoteric bugs but so far things are looking really good. Here are the targets for the new pipeline: 1) Max delay for indexing of new Reddit data will be targeted at 60 seconds.
That means, ideally, that new comments and submissions will be available for search within 60 seconds of their creation on Reddit. 2) A median time of less than 15 seconds for newly indexed information. The goal is to have new objects available for search in around 15 seconds over 99% of the time.
3) Reduce latency of all searches to below 250ms. I will continue testing for any edge-case bugs, but my feeling is that this will be ready for production in 2-3 weeks. Thank you!
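For illustration only, here is a minimal sketch of how the ingest-delay targets above could be checked against a batch of measurements. The function name `meets_targets`, the constants, and the idea of collecting per-object delays in a list are assumptions made for this example, not part of the actual pipeline.

```python
from statistics import median

# Targets taken from the post; the constant names are hypothetical.
MAX_DELAY_S = 60.0      # 1) max delay before new data is indexed
MEDIAN_DELAY_S = 15.0   # 2) median delay should stay below this


def meets_targets(delays_s):
    """Summarize a batch of ingest delays against the stated targets.

    delays_s: seconds between an object's creation on Reddit and the
    moment it became available for search (assumed to be measured elsewhere).
    """
    if not delays_s:
        raise ValueError("need at least one measurement")
    return {
        "max_ok": max(delays_s) <= MAX_DELAY_S,
        "median_ok": median(delays_s) < MEDIAN_DELAY_S,
        # Share of objects searchable within the 15-second goal.
        "share_within_15s": sum(d <= MEDIAN_DELAY_S for d in delays_s) / len(delays_s),
    }
```

Calling `meets_targets([5, 8, 12, 14, 40])` would report both targets as met, with 80% of objects searchable within 15 seconds.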
Nah, it's a tricky thing to solve. Usually, with systems I've seen, it's a queue of things to check later and when to check them, based on the state they're in. State 0, check in X minutes. State 1, check in 2 hours. And so on, until you hit the max state.
But this means your queue is huge and when asking for things to check, you'll get a big list of things in all sorts of states. But the benefit is you'll get a "curve" of progression. And usually scores don't change much a day later.
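A rough sketch of the state-based recheck queue described in these replies, assuming a simple in-memory priority queue. The class name `RecheckQueue`, the 10-minute and 24-hour intervals, and the use of `heapq` are all assumptions for illustration; the thread only gives "X minutes" for state 0 and 2 hours for state 1.

```python
import heapq

# Hypothetical delay per state, in seconds. Only the 2-hour value for
# state 1 comes from the thread; the others are placeholders.
RECHECK_DELAYS_S = [10 * 60, 2 * 60 * 60, 24 * 60 * 60]


class RecheckQueue:
    """Schedule items for re-checking, with a longer wait in each state."""

    def __init__(self, delays=RECHECK_DELAYS_S):
        self._delays = list(delays)
        self._heap = []  # entries are (due_time, item_id, state)

    def add(self, item_id, now):
        """Enqueue a new item in state 0."""
        heapq.heappush(self._heap, (now + self._delays[0], item_id, 0))

    def pop_due(self, now):
        """Return (item_id, state) for every item due at `now`.

        Each returned item is rescheduled in the next state, or dropped
        once it has been checked in the max state.
        """
        due = []
        while self._heap and self._heap[0][0] <= now:
            _, item_id, state = heapq.heappop(self._heap)
            due.append((item_id, state))
            next_state = state + 1
            if next_state < len(self._delays):
                heapq.heappush(
                    self._heap,
                    (now + self._delays[next_state], item_id, next_state),
                )
        return due

    def __len__(self):
        return len(self._heap)
```

As the replies point out, one call to `pop_due` can return items in a mix of states, and the queue holds every item that has not yet passed the max state, so it grows with the ingest rate.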