Thursday, November 17, 2011

State machine redux redux

My random link strategy is working - it brought me to my state machine post of last year, whereupon I realized I'd kind of forgotten about state machines entirely. And yet a clear state machine presentation is such an improvement over code!

Incidentally, the Wikipedia page for finite-state machines is pretty nice.

Here's my tentative DSL for state machines, then:
  • The overall tag is "statemachine" and has a name. This name will resolve as a function outside the state machine.
  • The state machine tag is an iffy executor; that is, if it's the last tag in a program, it will be in control.
  • Within the state machine node, the children are named. Children named "prepare", "output", or "input" that occur before "start" are special.
  • The "prepare" child is code executed on each input to prepare it. The local variable $input contains the prepared input (the return from the "prepare" code) and $raw contains the raw input should it be required (this is @_ in the "prepare" code).
  • The "output" child is code executed on each out-of-band output in the state machine (see below). The default output is the same as anywhere in Decl; it's to pass output to the parent, where eventually it just gets printed to stdout if you don't redirect it.
  • The "input" child is code executed to obtain the next input token, if the state machine is in control. If there is no input, then the state machine can't be in control; it must be called from other code for each input token.
  • The "start" child is the first state - every child after "start" is another state, so you can still call a state "prepare" or "input" if you need to.
  • Within a state, we still have special parse rules, but in general, execution goes down the list of the state's children.
  • A string followed by "->" consumes an input token if it matches, and changes the state.
  • A line introduced by "->" just changes the state.
  • Either of those may have code attached; if so, this code executes before the state transition. But with or without code, both of those act like a "do".
  • A line that doesn't consist of string and -> or just -> is parsed as normal Decl code and does whatever it's supposed to.
  • The code morpher will be updated to understand "->" at the start of a line as a state transition if you're inside a state machine. (That's probably a trickle-up thing as well, actually.)
It should be possible to compile a state machine to Perl or C or something; run it in interpreted mode for testing, then spin out C for performance later. Something like that. I'm only vaguely starting to apprehend this aspect of Decl - that eventually it should be entirely master of its own fate and understand how to generate high-performance code from its own programs when necessary.

I still think this would be a pretty powerful feature; I nearly always run aground when coding things based on state machines because it's just so hard to keep track of the darn things, but this lets me pseudocode my way right through it.

Here's a possible rendering of a recognizer for "nice", each letter being an input token:
statemachine nice
start
n -> n_found
-> error
n_found
i -> i_found
-> error
i_found
c -> c_found
-> error
c_found
e -> success
-> error
success (accept)
>> Yay!
   -> start
 error (fail)
>> bad!
   -> start

That seems to do what I want; it doesn't show any code, but it does show the basic pseudocode I want to use.

No comments:

Post a Comment