Am I right thinking that the pat => expr go beyond LR, but is still unambiguous?
Conversation
That depends heavily on the specifics of the grammar, so there's no guaranteed answer, but my gut instinct is that there's nothing obviously special about that syntax and so I would expect to be able to encode it as an LR grammar. Warning: I might be wrong.
1
1
The reason I think it may not be is that patterns and exprs can be nested ((a,b),b), so at the (a,b) we don't know whether we need to call the semantic action for the pattern version of (a,b) or the expression version of it, we only know that at =>?
1
One workaround may be to merge the productions for pat and expr, and then in a post pass walk over the AST and see if everything that's supposed to be an expr is an expr and everything that's supposed to be a pat is a pat, so as to reject things like (a,b+c) => ...
1
I tentatively believe is wrong about this being LR and you are correct; I have a blog post about exactly this issue, rntz.net/post/2018-07-1. IIUC Haskell uses the merge-pattern-and-expression approach you suggest.
3
2
Knuth actually discusses precisely this problem as the very last paragraph on his original paper on LR parsing. To my knowledge, his proposed generalization has never been explored, but I could easily be wrong about this. (I simply haven’t ever seen it discussed.)
1
3
Something I would find interesting would be a generalization of LR(1) that can handle these sorts of ambiguities without full backtracking by delaying the evaluation of semantic actions until the ambiguity is resolved, constructing a concrete “token tree” to be interpreted later.
1
2
That is, record the parsing work done so far so you never have to rewind your position in the stream and examine the same token more than once. Instead, construct a generic CST that can be “interpreted” against the appropriate production rules once the ambiguity is resolved.
1
2
How does GLR compare to this generalisation of LR? At least from what I heard in a talk on tree-sitter it sounds somewhat similar? I'm no expert on parsing though, so could be mistaken!
2
1
It’s true that you can get a similar result to what I’m describing using GLR, but GLR parsers provide no guarantee that the ambiguity will eventually be resolved—they may produce multiple results. I had in mind something that still statically guarantees an unambiguous parse.
1
1
Ahh, cool! Thanks for clarifying! Yeah, having nice static analysis for eventual ambiguities would be really nice - it's part of why I prefer LR generators, despite the error issues. (LALRPOP tends to reasonably good on the error front at least)




