Got distracted with the Scheme->Lua compiler. Been debugging it to try to get it to the point where it can run the Ribbit VM.
Conversation
Starting to understand why people are tempted to bootstrap compilers. Doing this in Lua works up to a point, but I'm starting to really miss pattern matching, and the only way out I see is to use the Scheme-in-Lua to bootstrap a Scheme to write a pattern matcher.
1
This is getting out of hand...
Also I don't think I'm ever going to write a direct output compiler again. ( Best way to learn is to make told-you-so mistakes. )
You really want to be able to clean up the output in an additional pass, even if it is just for debugging
1
1
Once you leave AST/Sexpr domain it's game over.
1
But what I did find out (probably re-invent) is that it makes sense to generate the output as a token stream even if it generating a nested expression, at least for compilers written in semi-imperative style.
1
Probably artifact of trying to do this in one pass. Seems to be mostly about the pretty-printer. Splitting up in more passes, the intermediate passes do want to be pure.
1
Made a draft of a simple recursive matcher. Thing is, with s-expressions available and Lua's interned strings for memoization, there's a lot of room for building small s-expression based DSLs inside lua.
1
{"(add ,a ,b)", function(m) return m.a + m.b end}
1
Added quotation support to the reader.
1
But I also discovered that the pattern can be represented as a function (HOAS), and that you can basically turn any parameterized data constructor into a matcher.
This blew my mind.
1
1
Replying to
This has to be common knowledge. Just where do I start reading to find these things out?
1
Anyways I am stoked because it looks like being stuck with Lua is not going to be such a problem. Especially taking into account that we're using LuaJIT, and that implementing "string DSLs" is probably still going to be pretty fast.
1
In this light, interned strings, tables (finite functions), and table object identity as primitives are not so bad.
1
Split it up further into frontend (macro exp, beta red, let insertion, var rename), simplification pass (block flattening), and ir + lua pretty printer. Not complete but looking a lot better.
1
So... this is turning into a thing. Wrote a bit more at the luarocks page luarocks.org/modules/zwizwa
1
Getting quite obsessed. My wife is worried. Haven't been this obsessed in ages. Focused on the Scheme->Lua compiler V2 for a bit, the multipass one. I got it wrapped to a point where it's possible to write a Lua module in Scheme:
1
The pattern matcher that has enabled most of the refactoring of the Lua code is now exposed in Scheme as well, so I can start writing compiler passes in Scheme.
1
Plan is to refactor the SMC Scheme to C state machine compiler to a Scheme dialect, so I can host it in Racket as well as Lure. This would decouple the frivolous distraction (Lure) from the thing I originally set out to do (SMC).
1
Almost done with 'parameterize'. That + pattern matching + basic (non-tail recursive) Scheme should be enough to implement the rest of the compilers / interpreters.
1
One thing I'm not happy with is the lack of space-safe tail recursion. There is a trampoline to implement letrec-like forms but semantics is not 100% correct. Seems that compiling these to loops requires a bit more analysis than I can currently do, so not implementing it.
1
Does create some incentive to not write explicit recursion and use combinators. Fold and friends are implemented as loops.
1
Added interpreter for the IR that can do proper tail calls and uses continuations internally. This is something I did before long ago, directly for core Scheme. Now in ANF it's a lot simpler as there is just one kind of continuation (chained remainders of blocks = typical stack).
1
This multi-pass approach is really useful. Hard to see exactly how it all fits together without actually trying to factor it. It is a puzzle but the end point feels natural somehow.
1
Also found this, for when I grow up: docs.racket-lang.org/nanopass/index
1
Wrote a new state machine compiler on top of the block IR as a pass that extends the IR with labels and goto. This one is much more general than the first attempt.
1
Main idea: Inline every closure application, keeping track of instantiated closures indexed by the current continuation. If the same function is encountered more than once for the same continuation, it's a loop, and application can be replaced with a goto to a previous instance.
1
This works for nested loops, without the need for special loop forms (it's all lambdas). All loop forms can then be Scheme macros.
1
Took a good while to figure out _where_ to put the function bodies to not mess up lexical scope for downward closures until I realized that they can just be inlined if there is a proper stop criterion for the tail recursion.
1
Non-tail recursion will still blow up this way, but this is explicitly not part of the problem.
1
On to testing...
1
Still wrong.. moving code around is tricky. Where to hide the bodies.
1
There's a path now that feeds the SM compiler output IR back to Scheme for interpretation, so I can at least test execution and scope errors.
1
I think I solved the scope errors. Time for some more testing: start with Scheme program that contains trace calls. Interpret one directly, compile the other to state machine form then back to Scheme, and compare traces.
1
Found a lot more corner cases with testing. Also some very surprising things that unexpectedly work wrt. moving lambdas around, which makes me think I should really go all in on "partial evaluation as a UI".
1
This is dynamically typed, but staged. A type error is something that doesn't doesn't survive compile time evaluation and or doesn't specialize to state machine form. I'm thinking that might be good enough as guard rail.
1
Or.. add an actual type system. Playing with this is reshaping the way I think about managing my future "C life". I cannot escape C nor do I want to, but I can write a whole lot more in a Schemish language + ad hoc compile time checks than I originally thought was possible.
1
Doodled a bit to combine new SM compiler with the new interpreter for partial evaluation, but seem to have lost my stride after 2 weeks of putting out fires at work...
1
Two more weeks of exhaustion.. But picked it up again. Lure Scheme: added preliminary Erlang, JavaScript and GDB script backends, and started working on dataflow analysis (liveness) for the state machine compiler to implement state variable allocation across suspend points.
1
Still existential questions about what this thing is supposed to be. For now it's self-propelling. Something something Scheme, microcontrollers and test systems.
1
Show replies
