Oh, I had missed that the Galil rule gives Boyer-Moore linear worst-case performance. This might mean that BM is always a better choice than KMP.
Anyway, it helps to standardize on the same input. Otherwise it's hard to diagnose. :-)
-
I guess I had a typo. Tried a simpler string, and now rg and fgrep both find matches but disagree on the count (different handling of binary?). I'll try your linked file when I get home.
-
If you're using `-a`, then the counts should be identical; otherwise, you may have found a bug. If you aren't using `-a`, then indeed, there can be differences if your input contains a NUL byte. Try running with `--no-mmap` to see if that changes things.