Tuesday, March 30, 2010

Milestone: line parser works!

Just to mark the event.

Sunday, March 28, 2010

Debugging parsers

This parser subproject is the hardest programming I've done in twenty years. Well, wait - I took Compilers from Kent Dybvig in ... must have been 1995. So fifteen years, anyway, but I don't remember Compilers. That is, I remember Kent, and I remember the class, but I don't really remember the coding. This stuff is stored in a different place in my brain, I think, very intuitive and nonverbal - part of what makes it so exhilarating.

Anyway, debugging recursive-descent parsers is hard, because they're declarative in nature. This is actually going to end up being a key insight (no, not hexapodia) - debugging an imperative program is far, far easier than debugging a declarative one, because you can step through what the program is doing. In a declarative program, not only do you not know when or where things are being done, it's exceptionally difficult to interpose a checkpoint.

I wrote a small debug atom for parsers, though - when it's invoked, it always succeeds, consuming nothing, just like \&nothing - but it prints a message.
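For illustration, here is a minimal sketch of what such a debug atom might look like. It is not the actual Class::Declarative code: the parser convention (a code ref that takes an array ref of tokens and returns a value plus the remaining tokens, or an empty list on failure) is an assumption modeled loosely on Higher-Order Perl, and the names debug, literal, sequence and @trace are made up for the example.

```perl
use strict;
use warnings;

# Assumed convention: a parser is a code ref taking an array ref of the
# remaining tokens. It returns (value, remaining_tokens) on success and
# an empty list on failure.

# Always succeeds and consumes nothing.
sub nothing {
    my ($input) = @_;
    return (undef, $input);
}

# Hypothetical debug atom: behaves like nothing(), but reports the next
# token. Messages are also kept in @trace so they can be inspected.
our @trace;
sub debug {
    my ($message) = @_;
    return sub {
        my ($input) = @_;
        my $next = @$input ? $input->[0] : '(end of input)';
        push @trace, "$message: next token is $next";
        print STDERR "$trace[-1]\n";
        return (undef, $input);
    };
}

# Match one literal token.
sub literal {
    my ($wanted) = @_;
    return sub {
        my ($input) = @_;
        return unless @$input && $input->[0] eq $wanted;
        my @rest = @{$input}[1 .. $#{$input}];
        return ($wanted, \@rest);
    };
}

# Run parsers in sequence; fail as soon as one of them fails.
sub sequence {
    my @parsers = @_;
    return sub {
        my ($input) = @_;
        my @values;
        for my $p (@parsers) {
            my ($value, $rest) = $p->($input) or return;
            push @values, $value;
            $input = $rest;
        }
        return (\@values, $input);
    };
}

# The debug atoms can be dropped anywhere in a rule without changing it.
my $parser = sequence(
    debug('before SELECT'),
    literal('SELECT'),
    debug('after SELECT'),
    literal('*'),
);
my ($values, $rest) = $parser->([qw(SELECT * FROM t)]);
```

Because the atom consumes nothing, it can be inserted between any two elements of a rule, which is about as close to a breakpoint as a declarative grammar gets.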

Using it felt like ... Prolog.

Wednesday, March 24, 2010

Down the parsing rabbit hole

So I wanted to do parsers, you know, because I'd like to be able to parse SQL statements and stuff? So I got sucked into Chapter 8 of Higher-Order Perl, of course, and once I really started getting into it and realizing how much better life would be if I did some stuff differently in the basic parser, well, two weeks had passed.

Still not done.

I do have a lot of parser tools done, though.

But my basic approach to parsing nodal structure was naive. First, the idea that a tag will always mean the same thing throughout the application is naive - a page in a PDF will have to mean something different from a page in a Web site, and yet I still want to use the tag "page" for both meanings.

But secondly, I realized that I couldn't rely on runtime objects to determine parsing structures, and that rankled.

So I'm going to do things differently now, and in a much more flexible manner. I'm removing the dependency on Parse::RecDescent, and I'm making Class::Declarative::Node a primary class instead of using XML::xmlapi (sniff).

Each top-level tag will be parsed minimally into a line and a body, and the first word in the line will determine its semantics, as now. But those semantics will already be able to determine the parsing of all lines indented under that tag! In other words, if it wants to use a different parser, it can. If templates should be expressed, they will be. I'm halfway leaning towards everything always being a template, actually.
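A rough sketch of that two-stage idea, under stated assumptions: the hash keys (tag, line, body, content), the %body_parser dispatch table and the two sample body parsers are all invented for illustration and are not the Class::Declarative::Node interface.

```perl
use strict;
use warnings;

# Stage 1: split the text into top-level nodes. Each node keeps its
# first line, the first word of that line as its tag, and the lines
# indented beneath it as a body that is NOT parsed yet.
sub split_top_level {
    my ($text) = @_;
    my @nodes;
    for my $line (split /\n/, $text) {
        if ($line =~ /^\S/) {
            # An unindented line starts a new top-level node.
            my ($tag) = $line =~ /^(\S+)/;
            push @nodes, { tag => $tag, line => $line, body => [] };
        }
        elsif (@nodes) {
            # Indented or blank lines belong to the current node.
            push @{ $nodes[-1]{body} }, $line;
        }
    }
    return @nodes;
}

# Helper: drop blank lines and strip leading whitespace.
sub trimmed {
    my @lines = grep { /\S/ } @_;
    s/^\s+// for @lines;
    return @lines;
}

# Stage 2 (hypothetical): the tag decides how its own body is parsed.
my %body_parser = (
    pdf  => sub { [ trimmed(@_) ] },         # one child entry per line
    text => sub { join ' ', trimmed(@_) },   # body is running prose
);

sub parse_document {
    my ($text) = @_;
    my @parsed;
    for my $node (split_top_level($text)) {
        # Unknown tags keep their raw body lines.
        my $parser = $body_parser{ $node->{tag} } || sub { [@_] };
        push @parsed, {
            tag     => $node->{tag},
            line    => $node->{line},
            content => $parser->(@{ $node->{body} }),
        };
    }
    return @parsed;
}

my $source = <<'END';
pdf "mynewpdf.pdf"
    author "Michael Roberts"
    mediabox "105mm x 148mm"
text heading
    Using
    PDF::Declarative
END

my @doc = parse_document($source);
```

The point of the split is that stage 1 never needs to know what a body means; swapping in a completely different syntax for one tag only means registering a different body parser.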

This scheme, though, allows me to vary the semantics of inner tags, so if I want to use a radically different syntax to express parser rules, for instance, I can, without a lot of twisting or tweaking.

More on this when it firms up. But there will be recursive-descent parser support built right into Class::Declarative from the get-go. If the strength of Lisp is that it has no syntax, let the strength of Declarative be that it has all syntaxes.

Sunday, March 7, 2010

Code generation

Ran across a fascinating book online: "Code Generation in Action", by Jack Herrington. Turns out that code generation is something of a concept nowadays, in several different domains. Maybe I'm not so crazy.

The code generation in the Class::Declarative framework is all in Class::Declarative::Semantics::Code, which seems reasonable. It could doubtlessly be extended in some way - I'd particularly be interested in a way of defining new code generators in a plugin manner. Not sure how best to organize that.
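One possible shape for that, purely as a sketch: a registry that maps a generator name to a code ref, which plugins fill in when they load. Nothing here reflects how Class::Declarative::Semantics::Code is actually organized; register_generator, generate and the node layout are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical registry: generator name => code ref that turns a node
# (here just a hash ref) into a string of generated code.
my %generator;

sub register_generator {
    my ($name, $code) = @_;
    die "generator '$name' is already registered\n" if exists $generator{$name};
    $generator{$name} = $code;
    return;
}

sub generate {
    my ($name, $node) = @_;
    my $code = $generator{$name} or die "no generator registered for '$name'\n";
    return $code->($node);
}

# A plugin would make a call like this when it is loaded.
register_generator(perl_sub => sub {
    my ($node) = @_;
    my $body = join '', map "    $_\n", @{ $node->{lines} };
    return "sub $node->{name} {\n$body}\n";
});

my $generated = generate(perl_sub => { name => 'answer', lines => ['return 42;'] });
print $generated;
```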

Thursday, March 4, 2010

Mapping

Consider this. A macro expression is a one-way street; given a macro instance, the macro engine expresses it as more ramified code, possibly affected by the environment, or parameters.

A more powerful macro engine could define a mapping between two structures, whereby a change in either could be reflected as a change in the other. For example, a mapping between the graphical boxes on a diagram and the underlying database could be set up that could modify the screen when the database was changed, and write (the significant) changes to the database when the screen image was manipulated.

This would be the equivalent of a mapping in cognitive science, the syntactic/semantic mapping we use to talk about how language expresses concepts. Or the same as the analogies Douglas Hofstadter uses to talk about ... everything.

A mapping in a declarative tree would look superficially a lot like an XSLT program. The only difference is that a mapping would remember where its results were, and could act as the trigger described above.

Another example: we've already got magic variables that look to a GUI field, for example, for reading and writing. An explicit mapping structure could define that in the language instead of just in the code; we could define this sort of mapping in our programs. This could be used to set up Excel-like functions in a spreadsheet. Or not in a spreadsheet if you don't like grids - just set up a dataflow program. (TODO: think about that a little more.)
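To make the two-way idea concrete, here is a small sketch under stated assumptions. The Cell class, map_cells and the millimetre/pixel conversion are invented for the example; they stand in for "a database field" and "a box on the screen" and are not part of Class::Declarative.

```perl
use strict;
use warnings;

# A cell holds a value and notifies its watchers when the value changes.
package Cell;

sub new {
    my ($class, $value) = @_;
    return bless { value => $value, watchers => [] }, $class;
}

sub get { $_[0]{value} }

sub watch {
    my ($self, $callback) = @_;
    push @{ $self->{watchers} }, $callback;
    return;
}

sub set {
    my ($self, $value) = @_;
    my $old = $self->{value};
    # An unchanged value notifies nobody; this is what stops the two
    # sides of a mapping from updating each other forever.
    return if defined $old && defined $value && $old eq $value;
    $self->{value} = $value;
    $_->($value) for @{ $self->{watchers} };
    return;
}

package main;

# Hypothetical two-way mapping: $to converts left to right, $from
# converts right to left. A change on either side updates the other.
sub map_cells {
    my ($left, $right, $to, $from) = @_;
    $left->watch(sub  { $right->set($to->($_[0])) });
    $right->watch(sub { $left->set($from->($_[0])) });
    $right->set($to->($left->get));   # bring the two sides into sync
    return;
}

my $db_width_mm  = Cell->new(25);   # stands in for a database field
my $box_width_px = Cell->new(0);    # stands in for a box on screen

map_cells($db_width_mm, $box_width_px,
    sub { $_[0] * 4 },    # mm to pixels (made-up scale)
    sub { $_[0] / 4 },    # pixels to mm
);

$box_width_px->set(200);            # the user drags the box
my $after_drag = $db_width_mm->get;

$db_width_mm->set(10);              # the database row is updated
my $after_update = $box_width_px->get;
```

The sketch only works cleanly when the two conversions are inverses of each other; a real mapping would also need some way to say which changes are "significant" enough to write back.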

So the to-do list for macros is:
  • !tag defines a one-way map to a virtual tag. It expresses in place during build, and stops.
  • !!tag defines an active one-way map; it expresses in place, but rebuilds whenever one of its parameters is changed (or potentially does; we could probably specify some active and some one-time parameters somehow).
  • An explicit macro or template tag could also match specific parts of the tree, specify assertions about its structure that must be met, and coordinate expression in multiple points.
  • A mapping would define that in two directions in some as-yet-unspecified manner.
The basic ones are going to be necessary to make PDF::Declarative make any sense for real-world use, because they allow us to specify a PDF structure that is parameterizable, say, with variable text. Then we can use PDF::Declarative to build a template that can be used again and again (e.g. as an invoice template). Do that, and my test case will be complete: I want to use PDF::Declarative to generate my translation invoices.
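As a sketch of the simplest case only (the one-way !tag that expresses once and stops), a macro can be thought of as a list of template lines with named holes. The {name} placeholder syntax, the %macro table and the express function are assumptions made up for this example, not the real macro engine.

```perl
use strict;
use warnings;

# Hypothetical macro table: macro name => template lines containing
# {name} placeholders.
my %macro = (
    invoice_line => [
        'text (align=left, font=times, fontsize=9pt)',
        '    {description}',
        'text (align=right, font=times, fontsize=9pt)',
        '    {amount}',
    ],
);

# One-way expression: fill in the parameters and return ordinary lines.
# Nothing remembers where the result came from, so nothing is rebuilt
# if a parameter changes later.
sub express {
    my ($name, %param) = @_;
    my $template = $macro{$name} or die "unknown macro '$name'\n";
    my $lookup = sub {
        my ($key) = @_;
        die "macro '$name' needs parameter '$key'\n" unless exists $param{$key};
        return $param{$key};
    };
    my @lines = @$template;   # copy, so the template is never modified
    s/\{(\w+)\}/$lookup->($1)/ge for @lines;
    return @lines;
}

my @expanded = express('invoice_line',
    description => 'Translation, 1200 words',
    amount      => 'EUR 120.00',
);
print "$_\n" for @expanded;
```

The active !!tag variant would need to keep the template and parameters around after expression, which is exactly the bookkeeping this sketch leaves out.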

PDF::Declarative

The PDF::Declarative class being a wrapper around PDF::API2 (at least currently), I went looking for good examples, and found a fantastic tutorial by Rick Measham. It took some work and some extra functionality added to Class::Declarative (which was of course the point), but I can now generate a PDF equivalent to his tutorial example using PDF::Declarative. This is the code, somewhat abridged, because it contains the text for the PDF:

use Class::Declarative qw(PDF::Declarative);

pdf (displaytitle, encoding=latin1) "mynewpdf.pdf"
   author "Michael Roberts"
   title "PDF::Declarative Example 1"
   subject "Building PDFs with explicitly placed elements"
   keywords "Declarative PDF generation"
   mediabox "105mm x 148mm"
   #bleedbox "5mm, 5mm, 100mm, 143 mm"
   cropbox "7.5mm 7.5mm 97.5 mm 140.5mm"
   #artbox "10mm, 10mm, 95mm, 138mm"

   page
      graphic blue_box
         fill (darkblue)
         rect "5mm, 125mm, 95mm, 18mm"
      graphic red_line
         stroke (red)
         move "5mm, 125mm"
         line "100mm, 125mm"
      text heading (flow=no, x=95mm, y=131mm, align=right, color=white, font=helvetica, bold, fontsize=18pt)
         Using PDF::Declarative
      graphic background
         stroke (lightgrey)
         circle "20mm, 45mm, 45mm"
         circle "18mm, 48mm, 43mm"
         circle "19mm, 40mm, 46mm"
      box left_column (border) "10mm, 121mm, 41mm, 111mm"
         text (lead=7pt, parspace=0, align=justify, color=black, font=times, fontsize=6)
            Perci ent ulluptat vel eum zzriure feuguero core consenis adignim...

         text (align=center, font=helvetica, bold, fontsize=6pt, color=blue)
            Enim eugiamc ommodolor sendre feum zzrit at. Ut prat. Ut lum quisi.

         text (align=right, font=times, color=black, fontsize=6pt)
            It augait ate magniametum irit, venim doloreet augiamet...

      graphic
         image "54mm, 66mm, 41mm, 55mm"
            jpeg "Portrait.jpg"

      box right_column (border, dash=2 2 1 2, color=blue) "54mm, 64mm, 41mm, 54mm"
         text (lead=7pt, parspace=0pt, align=justify, indent=5pt, fontsize=6pt, bullet=B7)
            Orpero do odipit ercilis ad er augait ing ex elit autatio....

Well, one correction: I haven't implemented bullet points yet.

Again: the above is a complete Perl program, and it generates a valid PDF file with justified text in columnar boxes. With a few extensions to the existing code, I think it's going to be just about time to release it into the wild, my first semantic module to qualify.

Current calendar time invested: 11 days. I think about a month would be necessary to do PDFs well - probably far more to do them right, but "good enough" in a month is pretty fast work. Fast enough that I won't have lost interest before finishing something useful.