-- Quick question that I didn't find in your documentation. When documents are requested directly by their _id, is that operation an O(1) operation?
Conversation
not sure the O notation makes much sense in a distributed system. you'll have a mix of network hops (coordinating node, primary / replica shard), memory and potentially disk access
the question might make more sense for lucene. blog.mikemccandless.com/2014/05/choosi is a great starting point
1
That's a good point but I was curious from the standpoint of having one node and searching on that node. It's probably a better question to direct at Lucene. I was going to check with the Lucene team anyway on this: discuss.elastic.co/t/sorting-by-t
The sort index feature is awesome, but
1
it seems odd that it is so much slower if the sort is in the opposite direction.
Thanks for your help!
1
Generally there are so many optimizations in Lucene that an O notation might depend on a lot of factors (like the structure of the ID). Index sorting is another special case but I'm surprised by your outcome. I'd have expected it to be similar (without knowing the code)
1
1
I agree with you on the Lucene sort issue. It might be worth me going through Lucene's tickets and checking if there is anything already flagged but it's probably at least worth bringing up -- even if they have an explanation why (it would be nice to know why it is). Thanks!!
1
1
I didn't check the code, but index sorting might be specific to the elasticsearch implementation. I think lucene exposed the indexwriter, so specific optimizations are possible now. maybe give the discuss thread a few days before digging further.
2
1
Yeah, that is a great read. I'm going to dig deeper into the Lucene source code and see if I can find anything related to the sort direction. Since it is asked to be specified, I'm guessing there is a reason for it. With indexes like Postgres, sort direction plays a more minor ..
role in regards to performance issues. Generally if you have a sorted list, searching it should be close to O(log N) no matter which direction it is sorted (that's my understanding). There may be other considerations for the Lucene index, though.
1
yes, that's the behavior of a B* tree and it's pretty easy to reason about but lucene's data structures are different
1


