Sunday, January 12, 2014


So OK, what about boilerplate?  (Again...)

First, there are a number of modules on CPAN that produce boilerplate of various kinds, the first naturally being Module::Starter, which I use on a weekly basis. Its structure is surprisingly straightforward, but that's just another way of saying that it expresses in Perl things that (in my opinion) could better be expressed in a dedicated boilerplate DSL.

There are others, such as Module::Starter::PBP by Damian Conway, based on "Perl Best Practices". Drupal::Module::Starter. A Padre plugin (I did not know this until just now). There's Test::STDmaker, which takes Perl test output and puts it into a boilerplated document to conform to military purchasing standards. HTML::HTML5::Builder, which "erects scaffolding" for Web apps. Even WWW::Mechanize::Boilerplate, which I really would like to look at more closely.

In other words, CPAN contains a lot of knowledge about boilerplate, both Perl-specific and otherwise. (Another reason for a survey, yeah?) But what occurs to me is that I don't just want to look at boilerplate generation. I also want to explore boilerplate degeneration, as it were - extraction of higher-level information from a given text based on recognition and abstraction of boilerplate. This is just phrase-based parsing writ large, using more complex lexical entities, but I think it would be - well. It's a lot of what I expect an exegesis to be, to be honest; an abstract "understanding" of syntactic forms.

So there is actually a boilerplate extractor on CPAN, Text::Identify::Boilerplate, which, given a set of files, will do a line-by-line diff and extract the boilerplate. That's pretty slick!
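To make the idea concrete, here is a rough sketch of line-by-line boilerplate extraction. It is in Python rather than Perl, and it is not the algorithm Text::Identify::Boilerplate actually uses; the function names are made up for illustration. It simply assumes that "boilerplate" means the lines every input has in common, in order.

```python
from difflib import SequenceMatcher


def common_lines(a, b):
    """Return the lines shared by a and b, in order, using difflib's matching blocks."""
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    shared = []
    for block in matcher.get_matching_blocks():
        # The final block is a zero-length sentinel, which adds nothing.
        shared.extend(a[block.a:block.a + block.size])
    return shared


def extract_boilerplate(texts):
    """Hypothetical helper: keep only the lines that survive a diff against every text."""
    if not texts:
        return []
    line_lists = [t.splitlines() for t in texts]
    boiler = line_lists[0]
    for other in line_lists[1:]:
        boiler = common_lines(boiler, other)
    return boiler
```

Whatever is left over in each file after removing those shared lines is the "information" part, which is the interesting half for unboiling.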

But first, I propose Text::Boiler, which will take some kind of declarative boilerplate description and generate the files from it. Then Text::Unboiler, which will undo boilerplate (perhaps with overrides for changes to the boilerplate itself) and return to you the original record used to create the final files.

Ah. Right. Boilerplate + information = syntax. Boilerplate contains named fields, probably also lists and so forth, and the record contains those fields (which can also have default values if the record omits them). But the record can also override anything in the boilerplate. If the boilerplate also has named sections to make that easier, then unboiling should be pretty flexible indeed!
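A minimal sketch of that round trip, again in Python and under loud assumptions: the `{name}` field syntax, the `boil` and `unboil` names, and the `defaults` argument are all invented here, and nothing like Text::Boiler exists yet. Lists, named sections, and record overrides of the boilerplate itself are left out.

```python
import re

# Assumed field syntax: {name}, where name is a valid identifier.
FIELD = re.compile(r"\{([A-Za-z_]\w*)\}")


def boil(template, record, defaults=None):
    """Fill named fields from the record, falling back to defaults.

    Raises KeyError if a field has neither a record value nor a default.
    """
    values = {**(defaults or {}), **record}
    return FIELD.sub(lambda m: str(values[m.group(1)]), template)


def unboil(template, text):
    """Recover the field values from text, or return None if it does not fit."""
    pattern = ""
    pos = 0
    seen = set()
    for m in FIELD.finditer(template):
        # Literal boilerplate between fields must match exactly.
        pattern += re.escape(template[pos:m.start()])
        name = m.group(1)
        if name in seen:
            pattern += f"(?P={name})"      # repeated field must repeat its value
        else:
            pattern += f"(?P<{name}>.+?)"  # first occurrence captures the value
            seen.add(name)
        pos = m.end()
    pattern += re.escape(template[pos:])
    match = re.fullmatch(pattern, text, flags=re.DOTALL)
    return match.groupdict() if match else None
```

For example, boiling `"package {module};\n# by {author}\n1;\n"` with the record `{"module": "Text::Boiler"}` and a default author, then unboiling the result, gives back both fields. Note that this sketch cannot tell a defaulted value from an explicit one; a real unboiler would probably drop fields that equal their defaults.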

I think this is going to be a pretty profitable way of looking at things, especially in terms of exegesis, which is kind of a "manual unboiling".
