Oh, I had missed that the Galil rule gives Boyer-Moore linear worst-case performance. This might mean that BM is always a better choice than KMP.
Yeah, that's fine. My main point here was to question whether 900 MB/s should be your target. :-) Your search there (in cache) is running at ~4.9 GB/s. It's interesting, though, that your out-of-cache search isn't saturating your disk bandwidth. Or is that what you'd expect?
-
I'd expect it to not saturate IO, yeah. It's single-threaded, alternating between searching and IO. Add in system call and decryption overheads, and it doesn't strike me as out of the ordinary.
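A rough way to see why alternating between IO and searching falls short of either rate on its own: if the two phases never overlap, the time per byte is the sum of both, so the rates combine harmonically. The sketch below assumes that simple model. The function name and the read rates are hypothetical, not measurements from this thread; only the ~4.9 GB/s in-cache search rate is quoted from above, and the "read" rate is assumed to already fold in system call and decryption overhead.

```python
def serial_throughput(read_gbps: float, search_gbps: float) -> float:
    """Effective GB/s when each chunk is read first, then searched, with no overlap."""
    # Time to process 1 GB is (1 / read rate) + (1 / search rate),
    # so the overall rate is the reciprocal of that sum.
    return 1.0 / (1.0 / read_gbps + 1.0 / search_gbps)


SEARCH_GBPS = 4.9  # in-cache search rate quoted in the thread

# Hypothetical effective read rates (disk + syscalls + decryption), for illustration only.
for read_gbps in (1.0, 2.0, 3.0):
    overall = serial_throughput(read_gbps, SEARCH_GBPS)
    print(f"read {read_gbps:.1f} GB/s -> overall {overall:.2f} GB/s")
```

Under this model the overall rate is always below the slower of the two phases, which is consistent with a single-threaded search not saturating the disk. Overlapping reads with searching (for example, a second thread or readahead) would be what moves it closer to the raw read rate.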
-
Interesting. I wouldn't necessarily think that, although perhaps decryption overhead is the variable I'm missing. What does `pv < OpenSubtitles2018.raw.en > /dev/null` say after dropping caches? If ripgrep can get 4.9 GB/s in cache, then I'd expect it to be as fast as `pv`.