What's your favorite approach to lexical analysis (i.e. turning a stream of bytes/characters into a stream of tokens)?
My approach is usually ... just to do it by hand. But depending on the language I also like to throw in the smallest amount of parsing by matching parentheses/brackets/braces at the same time. It's not very elegant but it makes the output immediately useful.
By "just to do it by hand", I mean it's usually a big loop around a switch statement, driven by a getc-equivalent, manually keeping track of line/column numbers ... it's great. 😿
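A minimal sketch of that style, assuming a NUL-terminated source string: one loop, one switch, a getc-equivalent that updates line/column by hand, plus the "smallest amount of parsing" by tracking paren depth during lexing (simplified here to parentheses only; the token-kind names are illustrative, not from the thread).

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds -- names are illustrative. */
enum kind { TK_EOF, TK_IDENT, TK_NUM, TK_LPAREN, TK_RPAREN, TK_ERROR };

struct token { enum kind kind; int line, col; };

struct lexer {
    const char *src;   /* NUL-terminated input */
    size_t pos;
    int line, col;     /* tracked manually, as described */
    int depth;         /* open-paren depth: the "smallest amount of parsing" */
};

/* getc-equivalent: consume one char, updating line/column by hand. */
static int lex_getc(struct lexer *lx) {
    int c = (unsigned char)lx->src[lx->pos];
    if (c == '\0') return -1;
    lx->pos++;
    if (c == '\n') { lx->line++; lx->col = 1; }
    else lx->col++;
    return c;
}

static int lex_peek(const struct lexer *lx) {
    int c = (unsigned char)lx->src[lx->pos];
    return c == '\0' ? -1 : c;
}

static struct token next_token(struct lexer *lx) {
    for (;;) {
        struct token t = { TK_ERROR, lx->line, lx->col };
        int c = lex_getc(lx);
        switch (c) {
        case -1:
            /* an unmatched '(' becomes a lex-time error at EOF */
            t.kind = (lx->depth == 0) ? TK_EOF : TK_ERROR;
            return t;
        case ' ': case '\t': case '\n':
            continue;                         /* skip whitespace */
        case '(':
            lx->depth++; t.kind = TK_LPAREN; return t;
        case ')':
            if (lx->depth == 0) return t;     /* stray ')' -> TK_ERROR */
            lx->depth--; t.kind = TK_RPAREN; return t;
        default:
            if (isdigit(c)) {
                while (isdigit(lex_peek(lx))) lex_getc(lx);
                t.kind = TK_NUM; return t;
            }
            if (isalpha(c) || c == '_') {
                while (isalnum(lex_peek(lx)) || lex_peek(lx) == '_')
                    lex_getc(lx);
                t.kind = TK_IDENT; return t;
            }
            return t;                         /* TK_ERROR */
        }
    }
}
```

Matching delimiters here means a stray `)` or a dangling `(` is caught before parsing even starts, which is what makes the token stream "immediately useful".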
Is that better? Need to think about it. Could also just store the column and keep track of newlines some other way.
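One common way to realize "keep track of newlines some other way" (a sketch of the idea, not necessarily what was meant): tokens store only a byte offset, and a sorted table of newline offsets recovers line and column on demand via binary search.

```c
#include <stddef.h>

/* Sorted byte offsets of each '\n' in the source. */
struct line_table { const size_t *newlines; size_t count; };

/* Recover 1-based line/column from a byte offset by counting the
   newlines strictly before it (binary search), instead of carrying
   line/col in every token. */
static void offset_to_linecol(const struct line_table *lt, size_t off,
                              int *line, int *col) {
    size_t lo = 0, hi = lt->count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (lt->newlines[mid] < off) lo = mid + 1;
        else hi = mid;
    }
    size_t line_start = (lo == 0) ? 0 : lt->newlines[lo - 1] + 1;
    *line = (int)lo + 1;
    *col  = (int)(off - line_start) + 1;
}
```

The trade: tokens shrink to a single offset, and line/column become an O(log n) lookup paid only when a diagnostic is actually printed.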
In general I like to store tokens in an array and use the token index (or token handle, as it were) instead of storing file locations directly. So how you represent file position data in the token array(s) doesn't really affect AST size.
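A sketch of that layout under stated assumptions (the struct and field names here are hypothetical): token data lives in parallel arrays indexed by a 32-bit handle, and AST nodes store only the handle, so the width of the position representation never touches node size.

```c
#include <stdint.h>

/* Hypothetical struct-of-arrays token buffer: kinds and source offsets
   live in parallel arrays, indexed by a 32-bit token handle. */
typedef uint32_t token_idx;

struct token_buf {
    const uint8_t  *kind;     /* kind[i]  : token kind */
    const uint32_t *offset;   /* offset[i]: byte offset into the source */
    uint32_t count;
};

/* AST nodes store a token handle, not a file location, so however wide
   the position data gets, the node stays the same size. */
struct ast_node {
    uint8_t   kind;   /* illustrative node kind */
    token_idx tok;    /* 4 bytes, regardless of position representation */
};

/* Positions are looked up through the handle when needed. */
static uint32_t node_offset(const struct token_buf *tb,
                            const struct ast_node *n) {
    return tb->offset[n->tok];
}
```

Since positions are reached through the handle, you could later widen `offset` to a line/column pair, or swap in a newline table, without rewriting a single AST node.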