@pcwalton Russ Cox makes a pretty good case for the Thompson method here: http://bit.ly/1g2dX7s . I'm planning on writing a version in Rust.
Replying to @GraphenePunk
@graphenepunk For speed what you really need is a form that's easy to JIT. That's generally recursive backtracking.
Replying to @pcwalton
@pcwalton @GraphenePunk seems like generating code for an NFA would be just as easy, if not more so. Just a bunch of labels and jumps.
Replying to @ssylvan
@ssylvan @graphenepunk That'd be an interesting experiment. I don't know of any benchmarks against a good JIT on non-pathological regexes.
Replying to @pcwalton
@pcwalton @GraphenePunk "Pathological" regexes aren't as uncommon as you think. When it blows up, it goes nuclear. Why risk that?
Replying to @ssylvan
@ssylvan @graphenepunk In my experience they're uncommon, and why sacrifice performance of the common case to handle uncommon cases?
Replying to @pcwalton
@pcwalton @ssylvan @GraphenePunk You can't always trust the sources of your regexps.
Replying to @won3d
@won3d @ssylvan @graphenepunk Why not? Usually regexes are hardcoded, e.g. URL routing, or metacharacters are not allowed.
Replying to @pcwalton
@pcwalton @ssylvan @GraphenePunk Sometimes they are the input to a web form, like they were in Code Search.
@won3d @ssylvan @graphenepunk OK, in that case something like re2 would be appropriate, I agree.
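
To make the "labels and jumps" idea from the thread concrete, here is a minimal sketch of a Thompson-style NFA simulation in Python. The instruction names (`char`, `split`, `jmp`, `match`), the helper names, and the hand-compiled program `PROG` for the regex `(a+)+b` are all assumptions made for illustration; this is not RE2's implementation, nor the Rust version mentioned above. The simulation advances every live state in lockstep, so its running time grows with the length of the input times the size of the program, even on a pattern that a recursive backtracking matcher can take exponential time to reject.

```python
# Illustrative instruction set for a tiny regex VM (names are hypothetical):
#   ("char", c)      consume one input character equal to c
#   ("split", x, y)  continue at both program counters x and y
#   ("jmp", x)       continue at program counter x
#   ("match",)       accept

# Hand-compiled program for (a+)+b; each index acts as a "label".
PROG = [
    ("char", "a"),    # 0: match one 'a'
    ("split", 0, 2),  # 1: inner a+ : loop back to 0 or fall through
    ("split", 0, 3),  # 2: outer (...)+ : loop back to 0 or fall through
    ("char", "b"),    # 3: match 'b'
    ("match",),       # 4: accept
]


def add_thread(prog, threads, pc):
    """Add pc to the state set, following jmp/split without consuming input."""
    if pc in threads:
        return  # each state is tracked at most once per input position
    threads.add(pc)
    op = prog[pc]
    if op[0] == "jmp":
        add_thread(prog, threads, op[1])
    elif op[0] == "split":
        add_thread(prog, threads, op[1])
        add_thread(prog, threads, op[2])


def full_match(prog, text):
    """Return True if prog accepts the whole of text."""
    current = set()
    add_thread(prog, current, 0)
    for ch in text:
        nxt = set()
        for pc in current:
            op = prog[pc]
            if op[0] == "char" and op[1] == ch:
                add_thread(prog, nxt, pc + 1)
        current = nxt
        if not current:
            return False  # no live states left, so no match is possible
    return any(prog[pc][0] == "match" for pc in current)
```

Calling `full_match(PROG, "aaab")` accepts, while `full_match(PROG, "a" * 1000)` rejects after a single pass over the input, because the state set never holds more than one entry per instruction.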