Update on the new Reddit ingest pipeline. The new ingest code is currently being tested for any esoteric bugs but so far things are looking really good.
Here are the targets for the new pipeline:
1) Max delay for indexing of new Reddit data will be targeted at 60 seconds.
That means ideally that new comments and submissions will be available for search within 60 seconds of their creation on Reddit.
2) A median time of less than 15 seconds for newly indexed information. The goal is to have new objects available for search in around 15 seconds over 99% of the time.
3) Reduce latency of all searches to below 250ms.
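To make the three targets concrete, here is a minimal sketch of how measured numbers could be checked against them. The function name, the constant names, and the sample inputs are all hypothetical; nothing here is taken from the actual pipeline. Because the thread leaves open exactly how the 99% figure relates to the 15-second goal, the sketch only reports the share of objects indexed within 15 seconds rather than judging it.

```python
from statistics import median

# Targets as stated in the post (constant names are illustrative).
MAX_INDEX_DELAY_S = 60.0       # target 1: max delay before new data is searchable
MEDIAN_INDEX_DELAY_S = 15.0    # target 2: median indexing delay
MAX_SEARCH_LATENCY_MS = 250.0  # target 3: latency ceiling for searches


def check_targets(index_delays_s, search_latencies_ms):
    """Compare measured delays and latencies against the three stated targets."""
    within_15 = sum(d <= MEDIAN_INDEX_DELAY_S for d in index_delays_s)
    return {
        "max_delay_ok": max(index_delays_s) <= MAX_INDEX_DELAY_S,
        "median_delay_ok": median(index_delays_s) < MEDIAN_INDEX_DELAY_S,
        # Reported only; how this relates to the 99% figure is left open.
        "share_within_15s": within_15 / len(index_delays_s),
        "search_latency_ok": max(search_latencies_ms) < MAX_SEARCH_LATENCY_MS,
    }


# Made-up sample measurements, purely for illustration.
report = check_targets([5, 8, 12, 14, 40], [90, 120, 240])
```

With the sample values above, all three checks pass while only 80% of objects land within 15 seconds, which shows why the median and the 99% goal are worth tracking separately.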
I will continue testing for any edge-case bugs but my feeling is that this will be ready for production in 2-3 weeks.
Thank you!
Replying to
Nice work! Any changes to scores, or are they still only updated once, 24 hours after posting?
Replying to
Good question. I haven't thought that far ahead honestly but if you have some ideas, let me know!
Replying to
Nah, it's a tricky thing to solve.
Usually with systems I've seen, it's a queue of things to check later, and when to check them is based on the state they're in.
State 0, check in X minutes.
State 1, check in 2 hours.
etc until you hit the max state.
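The state-based queue described above could be sketched roughly like this. It is only one way to do it: the class and function names are hypothetical, and the delay values are placeholders (the thread gives "X minutes" and "2 hours" without fixing X, so the 10-minute first step and the 24-hour last step are assumptions chosen for illustration).

```python
import heapq
from dataclasses import dataclass, field

# Hypothetical schedule: seconds to wait before the next check, indexed by state.
# Placeholder values; the thread does not specify the first delay.
RECHECK_DELAYS = [10 * 60, 2 * 60 * 60, 24 * 60 * 60]
MAX_STATE = len(RECHECK_DELAYS)  # objects reaching this state are not rechecked


@dataclass(order=True)
class Recheck:
    due_at: float                           # only field used for heap ordering
    object_id: str = field(compare=False)
    state: int = field(compare=False)


class RecheckQueue:
    """Min-heap of pending rechecks, ordered by when each one is due."""

    def __init__(self):
        self._heap = []

    def __len__(self):
        return len(self._heap)

    def add(self, object_id, now, state=0):
        """Schedule a check for this object, unless it has hit the max state."""
        if state >= MAX_STATE:
            return
        due_at = now + RECHECK_DELAYS[state]
        heapq.heappush(self._heap, Recheck(due_at, object_id, state))

    def process(self, now, refresh):
        """Refresh every object that is due, then move it to the next state."""
        while self._heap and self._heap[0].due_at <= now:
            item = heapq.heappop(self._heap)
            refresh(item.object_id)  # e.g. re-fetch the current score
            self.add(item.object_id, now, item.state + 1)


# Usage: times are plain seconds so the flow is easy to follow.
queue = RecheckQueue()
refreshed = []
queue.add("t3_abc", now=0)
queue.process(now=600, refresh=refreshed.append)   # state 0 is due
queue.process(now=7800, refresh=refreshed.append)  # state 1 is due 2 hours later
```

A heap keeps the next due item at the front, so each pass only touches objects that actually need a check, however many are waiting.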
Replying to
Yeah, exactly. You obviously have previous experience with these types of logic problems in software development.
In fact, it would be nice to talk with you by voice in the next week or two!