Saturday, June 25, 2011
So my cousin Erin has written a book, and it's about content. I had to read the introduction (and may actually (gasp) purchase the book itself, yes, I know, it's a frightening concept). Then I read the comments on the introduction, and Erin's response to one of them, where she talks about intelligent content, which I take to mean some sort of adaptive system that generates or modifies content based on user input.
Well, Gentle Reader, if indeed you exist and you've read anything at all I've written, you know that topic is guaranteed to make me sit up and drool. So I have a research goal, I guess: find these people of whom she speaks, and make them give up their secrets.
I want to take a short moment here and say just how profoundly strange it is for someone whose birth I remember to have exceeded my own accomplishments in the field I (tangentially) ended up in. Although in Erin's case, it's OK. She was always just about my favorite cousin.
Friday, June 24, 2011
You know what's damned hard about software? It's that best practices change over time. There has to be a way to track best-practice answers to specific questions, and come up with a warning of some sort when your old design assumptions for a given thing turn bad.
Case in point: Instapaper had a server confiscated by the FBI by mistake (probably) and posted about it in public. The community notified Marco that SHA-1 hashes of passwords are no longer considered secure; bcrypt or scrypt is the best practice today.
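Just as a sketch of what the updated solution looks like (using Python's stdlib scrypt rather than bcrypt, since that's what `hashlib` ships with; the cost parameters here are illustrative, not a security recommendation):

```python
# Sketch: salted, deliberately slow password hashing, the current best
# practice that replaced plain SHA-1 digests. Parameters are illustrative.
import hashlib, hmac, os

def hash_password(password):
    """Return (salt, digest). scrypt is memory-hard, so cracking a leaked
    table of digests is expensive in a way SHA-1 never was."""
    salt = os.urandom(16)
    digest = hashlib.scrypt(password, salt=salt, n=2**14, r=8, p=1)
    return salt, digest

def check_password(password, salt, digest):
    candidate = hashlib.scrypt(password, salt=salt, n=2**14, r=8, p=1)
    # Constant-time comparison, so timing doesn't leak digest prefixes.
    return hmac.compare_digest(candidate, digest)

salt, digest = hash_password(b"hunter2")
assert check_password(b"hunter2", salt, digest)
assert not check_password(b"wrong", salt, digest)
```

The point isn't this particular function; it's that "how do we hash passwords?" is exactly the kind of question whose best-practice answer rotted out from under Instapaper.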
So ... I'm having trouble really envisioning how exactly this would work, but the design of a given software system embodies dozens of answers to specific questions of this nature, where an algorithm or a library is selected to meet a need. As time goes on, it should be possible to know when one of those answers has become an incipient risk, and ideally the programming system should just reprogram the application to use the updated solution.
How do you get there from here? I dunno.
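One conceivable first step, purely as a sketch: record each design decision as data rather than burying it in code, then diff the decisions against a feed of current accepted answers. Everything here (the question names, the feed format) is invented for illustration.

```python
# Sketch: design decisions as data, audited against a best-practice feed.
# All question names and the feed structure are hypothetical.
decisions = {
    "password-hashing": "sha1",
    "tls-version": "1.2",
}

best_practice = {
    "password-hashing": {"bcrypt", "scrypt"},
    "tls-version": {"1.2", "1.3"},
}

def audit(decisions, best_practice):
    """Warn for every decision that has drifted out of the accepted set.
    Questions the feed doesn't know about pass silently."""
    return [q for q, choice in decisions.items()
            if choice not in best_practice.get(q, {choice})]

print(audit(decisions, best_practice))  # → ['password-hashing']
```

The reprogramming half is the hard part, of course; but even a bare warning list like this would have flagged the SHA-1 choice years before a subpoena did.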
Sunday, June 19, 2011
So here I am, doing some sysadmin stuff for Techspex, and thinking about how really, a task is the semantic unit of action. That is to say, when I think of things I have to do to get something done (e.g. install Wordpress - this requires upgrade of MySQL, and that requires a dump to be done, etc.) each of those verbs denotes a task.
The definition of those tasks at the human level might include snippets of shell code to execute the commands required, it might refer to documentation pages, and so on. All those things involve the semantic environment that a human requires to make sense of the actions being done and to be sure that they're reasonably correct.
That's really the essence of a semantic approach. How can I get from a high-level description of a set of tasks to be performed to the specific code required to perform them? That's what programming is, of course. That's where I need to go.
Another consideration: there are certain short lists and items of data that describe a given sysadmin environment - host names, IP addresses, directories, what have you. If these are assigned string variable names, you haven't gained anything; you still have to remember those naming strings. Instead, you need some kind of semantic note-taking structure that can store information of that nature in such a way that it can be retrieved in a purpose-oriented manner.
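A minimal sketch of what I mean by purpose-oriented retrieval: notes indexed by (context, kind of thing) rather than by a variable name you'd have to remember. The context and key names below are invented for illustration.

```python
# Sketch: a semantic note store. You ask for "the hostname in the Techspex
# sysadmin context" instead of remembering an arbitrary variable name.
from collections import defaultdict

notes = defaultdict(dict)

def note(context, kind, value):
    notes[context][kind] = value

def recall(context, kind):
    # Fall back to the supercontext ("sysadmin work") when the subcontext
    # ("sysadmin work for Techspex") has no answer of its own.
    for ctx in (context, context.rsplit(" for ", 1)[0]):
        if kind in notes[ctx]:
            return notes[ctx][kind]
    return None

note("sysadmin work", "checklist", ["hostname", "IP address", "home dir"])
note("sysadmin work for Techspex", "hostname", "techspex-web1")

print(recall("sysadmin work for Techspex", "hostname"))   # → techspex-web1
print(recall("sysadmin work for Techspex", "checklist"))  # inherited from supercontext
```

The supercontext fallback is exactly the "checklist of things to discover about a new environment" idea: generic sysadmin knowledge lives one level up from any particular engagement.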
And that ties back into Code Bubbles, really: the point of that IDE is to arrange a working set of information being used to address a given ... task. See? See how this all makes sense?
Going back even further into my past, I need to resurrect my notion of the semantic database or Lexicon. That's where items of this nature would be grouped. A given context might be "sysadmin work for Techspex". That would be a subcontext of "sysadmin work", and that supercontext would provide useful things to know about any sysadmin environment, such as the hostname, etc. (This could be a checklist of things to discover about a new environment, say.)
But the point there is that the information in that context would be indexed with things like what a hostname is, how it can be determined, how to choose one for a new machine, I don't know - all the things that represent what a system administrator knows. A semantic domain indeed - far more semantically oriented than the Decl domains I keep proposing right and left. Eventually those Decl domains will grow into this concept, but that's still a ways off.
But system administration is a domain where it may make sense to explore it. If only I had more of it to do. (Except that's a great way to lose sleep and hair.)
Saturday, June 18, 2011
Friday, June 17, 2011
You know what the unit of workflow is? It's the task. And you know what the natural grouping of tasks is? The checklist.
Build those two things into the language, and I think that's the only really basic support you need for workflow. I suspect (I haven't taken the time to think this through) that all other workflow structures can be derived from those two. For example, a sequence (as opposed to the parallel nature of a pure checklist) can be expressed as a checklist in which the completion of each task's predecessor is a prerequisite for its start.
So what's a task? It's a macro action consisting of:
- A set of actions to carry out (this really is a sequence):
  - Plain code
  - Subtasks in a subchecklist
- A set of prerequisites, or pre-existing conditions:
  - Completion of other tasks
  - Resource requirements
  - Assertions about input data
- A set of post-facto assertions, or expectations:
  - Expected outcomes of the task
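A minimal sketch of the checklist-and-task model described above, in Python; the class names and structure are my invention, not an existing library:

```python
# Sketch: tasks with prerequisites and post-facto assertions, grouped in a
# checklist whose execution order among ready tasks is undetermined.
class Task:
    def __init__(self, name, action, prereqs=(), assertions=()):
        self.name, self.action = name, action
        self.prereqs = list(prereqs)        # names of tasks that must finish first
        self.assertions = list(assertions)  # post-facto checks; these define "complete"
        self.done = False

class Checklist:
    def __init__(self, tasks):
        self.tasks = {t.name: t for t in tasks}
        self.exceptions = []

    def runnable(self):
        return [t for t in self.tasks.values()
                if not t.done and all(self.tasks[p].done for p in t.prereqs)]

    def run(self):
        while (ready := self.runnable()):
            for t in ready:
                t.action()
                if all(check() for check in t.assertions):
                    t.done = True
                else:
                    # A failed assertion is an exception: with no handler,
                    # the checklist hangs here until a human deals with it.
                    self.exceptions.append(t.name)
                    return False
        return all(t.done for t in self.tasks.values())

# A sequence expressed as prerequisites, per the MySQL-upgrade example:
log = []
cl = Checklist([
    Task("dump", lambda: log.append("dump"), assertions=[lambda: "dump" in log]),
    Task("upgrade", lambda: log.append("upgrade"), prereqs=["dump"]),
])
print(cl.run(), log)  # → True ['dump', 'upgrade']
```

Persistence would mean serializing the `done` flags and the exception list between process runs, which is the part that turns this from a parallel loop into actual workflow.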
A checklist can persist beyond the technical process running the workflow, and that's really the essential component that makes workflow workflow - but even without the persistence, the checklist is a useful design component. The order in which tasks execute in a checklist is undetermined; the checklist is only complete when all its tasks are complete. The post-facto assertions are used to determine completeness - always.
If an assertion fails, this is an exception. There may be exception handlers, etc. - but if not, the entire checklist hangs (persistently) until the exceptions are dealt with at the human level.
An example is the 1694 LUZ project I've been spending time with lately. Here, the issue is the translation of a few thousand documents of various formats in a complex directory structure. After translation, each file must be cleaned, and there are a multitude of ways in which this cleaning step can fail. As things stand, I have no good exception mechanism; the result is a laborious process of making sure I haven't lost my place when fixing individual files.
A persistent checklist would already be able to handle that situation, and as I say, the non-persistent checklist (a sort of "parallel loop") would handle similar things inside a single technical process.
Task dependency within a checklist is an additional organizational layer on top of this, and really has little to do with the underlying checklist-and-task structure. Similarly, other types of control flow can be modeled with items that change dependencies, by introducing dependencies on local variable values, and so on. Conditionals can be modeled with a post-facto assertion that bypasses the execution of a branch entirely (i.e. the branch counts as complete before it starts). Loops can be modeled by adding tasks to a checklist dynamically while it's still running. For performance, the checklist should really be a queue (minus the presumption of order): tasks are simply removed once complete.
Add logging to a checklist and you've got a good history mechanism. Again, persistence makes a true log of this.
A checklist should include the concept of multiple actor roles (=task queues); the system is one, but even the system should have a list of outstanding tasks in a given checklist. It's a simple extension to add that list of outstanding tasks in an index over a given class of active (persistent) checklists.
I'm pretty sure that basically covers the entire set of workflow functionality. The wftk had some other mechanisms that are good (notification, delegation, etc.) but they're essentially extraneous to the core workflow engine. That core - checklists and tasks - needs to be inherent in the Decl core semantics. It's just too useful not to include it.
Thursday, June 16, 2011
Wednesday, June 15, 2011
My point: refactoring is the kind of reasoning about software that a semantic programming system should (somehow) support. This point is still somewhat vague in my mind, as is doubtlessly obvious.
Oh, oh, oh, this is an article that speaks to my heart! [HNN] Takeaways:
- Not every object should be in a relational database
- SQL (RDBMSes in general) answers questions; thus SQL doesn't necessarily map to an object definition. This is so damned important; I need to rework some of my database stuff on this insight alone.
- The practice of deriving all SQL from an object model is pernicious: "They'll get you up and running quickly, but you'll be running in the wrong direction."
- Grouping SQL into one place is a good idea, but in the sense that you are defining an API consisting of answers to questions you can ask your database. I can't say how clarifying that is!
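To make that last point concrete, here's a sketch of the "API of answers" idea: each query gets a function named after the question it answers, and nothing outside this layer writes raw SQL. The schema and function names are invented for illustration.

```python
# Sketch: grouping SQL as an API of questions, not an object-per-table
# mapping. Schema and data are invented for illustration.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, total REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1, "erin", 40.0), (2, "marco", 15.0), (3, "erin", 25.0)])

def total_spent_by(customer):
    """One question, one answer; no Order objects ever materialize."""
    row = conn.execute(
        "SELECT COALESCE(SUM(total), 0) FROM orders WHERE customer = ?",
        (customer,)).fetchone()
    return row[0]

def best_customers(minimum):
    return [r[0] for r in conn.execute(
        "SELECT customer FROM orders GROUP BY customer "
        "HAVING SUM(total) >= ? ORDER BY SUM(total) DESC", (minimum,))]

print(total_spent_by("erin"))  # → 65.0
print(best_customers(50.0))    # → ['erin']
```

Note how neither question corresponds to "an order" as an object; the answers are aggregates that no ORM mapping would have produced naturally.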
Interestingly, Googling ORM turned up not only object-relational mapping, which the article is about, but also object role mapping, an interesting approach indeed that I want to think about in more detail.
The AXR project is an interesting rethinking of Web presentation languages, with content in XML and style in "HSS", a hierarchical stylesheet language based on CSS that ends up basically doing the same things as less-css. Interesting reading, though still pretty young. Hmm. This might just be a takeoff on less-css anyway, as it even reuses the phrase "done right". Still - hierarchical style organization seems to be a Good Idea.
Read it again. (Specifically involves a Ruby library, Ruote.)
OK, OK, I think I've already highlighted this as a target domain, but ... there have been a lot of new textbooks [here and more or less here] and other information [here on decision trees, whole blog is interesting] posted recently and frankly it would be nice to work through one or more of them and Do Things Right.
So: target domain, machine learning.
This is a Wiki with minimal code snippets for basic tasks in various languages. I like this. I don't like its being a Wiki all that much, though - wouldn't a real database be interesting? That deserves some thought.
(Update 2013-04-18: it's still maintained and weeded, but seems otherwise moribund; nothing but minor changes in the last month.)
Monday, June 13, 2011
Sunday, June 12, 2011
Here's a nice rundown of some Pythonic idioms. The interesting thing about idioms between languages is that they syntactically encode the same (or similar, or sometimes congruent) mental/semantic structures about what the code is supposed to do.
That deserves thought.
Saturday, June 11, 2011
Same thing applies to mail. Unison really doesn't do well with mbox-formatted mail, for the obvious reason: Unison works with files. I need a way to categorize mail that synchs between my different machines. And along the way, I need a way to search mail that presents an SQL API. And a way better means of accessing mail from Perl.
On top of that, a client - eventually. But at least I should be able to define some mboxes and work from there. Synching between mbox sets should be easy.
Mail needs to be categorizable with keywords (not just single folders) and honestly, the keywords should be structured as well, so I don't always need to see every job number in the world when categorizing things.
Archival into longer-term storage would be as keyword-specific mboxes. But short-term indexing needs to happen in, say, SQLite.
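A sketch of what that short-term SQLite index might look like: messages stay in mboxes, but a side table maps message IDs to structured keywords, so one message can carry several categories at once. The schema here is invented for illustration.

```python
# Sketch: a keyword index over mail in SQLite. Messages keep living in
# mboxes; this is only the searchable side index. Schema is hypothetical.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE messages (msgid TEXT PRIMARY KEY, mbox TEXT, subject TEXT);
    CREATE TABLE keywords (msgid TEXT, keyword TEXT);  -- many-to-many
    CREATE INDEX kw ON keywords (keyword);
""")
db.execute("INSERT INTO messages VALUES ('<a@x>', 'inbox', 'LUZ cleanup')")
db.executemany("INSERT INTO keywords VALUES (?, ?)",
               [('<a@x>', 'job/1694'), ('<a@x>', 'translation')])

# "Structured" keywords are just paths, so a prefix match finds all job
# mail without scrolling through every job number in the world.
rows = db.execute("""
    SELECT m.subject FROM messages m JOIN keywords k USING (msgid)
    WHERE k.keyword LIKE 'job/%'
""").fetchall()
print(rows)  # → [('LUZ cleanup',)]
```

Archival then becomes a query: select the msgids under a keyword prefix, write them to a keyword-specific mbox, and drop the index rows.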
The client doesn't need to be very impressive; really, a very simple set of functionality should just expose Perl modules dealing with mail. From my research, Mail::Box/Mailtools is kind of the usual solution, but is probably overweight. Reviews mention MIME::Lite and Net::SMTP, but obviously I need to sit down and think about it a bit.
This would entail a Mail::Declarative module. Sorely needed.
I know photo archives have been done to death, but I need one, and I might as well do it in Decl, right? I've got directory and file support now, so defining an archive with special properties should be a cinch.
Here's my problem: I take a lot of pictures. Well, that's not the problem - the real problem is that I do it while traveling, so the pictures end up on my laptop or my desktop. And some of them my wife wants on her laptop for the screen saver, etc. Then there's the fact I'd like to share some out to Flickr, print them at Meijers by way of Snapfish, and so on.
As you know, I love Unison for file synchronization, but it's not quite fine-grained enough - I can't keep a subset of an archive somewhere, because Unison doesn't really have the concept of different machines, just "local" and "remote" for each pair of machines at a time. (Which raises the question of file synchronization among specific sets of machines - but I'm not going there yet.) (Yet.)
So what I really want is a synchronized index, and a central storage location by means of which files can be retrieved in bulk based on queries.
That sounds pretty refined. I think it might require an API. In other words, it's perfect for a semantic programming experiment. So: target application.
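As a first sketch of that synchronized index: each machine keeps only metadata rows, and the query decides which files it pulls from central storage in bulk. Paths, tags, and the schema are all invented for illustration.

```python
# Sketch: a queryable photo index. The index syncs everywhere; the file
# bodies live in central storage and are fetched by query. Data is made up.
import sqlite3

idx = sqlite3.connect(":memory:")
idx.execute("""CREATE TABLE photos
               (path TEXT, taken TEXT, tag TEXT, machine TEXT)""")
idx.executemany("INSERT INTO photos VALUES (?, ?, ?, ?)", [
    ("2011/06/beach.jpg", "2011-06-05", "screensaver", "laptop"),
    ("2011/06/lathe.jpg", "2011-06-07", "techspex",    "desktop"),
    ("2010/12/snow.jpg",  "2010-12-20", "screensaver", "laptop"),
])

def subset(tag):
    """The query that decides which files a given machine retrieves in bulk
    - e.g. my wife's laptop asks for tag 'screensaver'."""
    return [p for (p,) in idx.execute(
        "SELECT path FROM photos WHERE tag = ? ORDER BY taken", (tag,))]

print(subset("screensaver"))  # → ['2010/12/snow.jpg', '2011/06/beach.jpg']
```

This is exactly the fine grain Unison lacks: the subset on each machine is defined by a query over the shared index, not by which directories happen to exist on both ends of a pair.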
Wednesday, June 8, 2011
Tuesday, June 7, 2011
Monday, June 6, 2011
Sunday, June 5, 2011
Saturday, June 4, 2011
A helpful recursive descent parser definition for vanilla CSS. (Not Less-CSS.) So I could either just use the CSS module or roll my own. Honestly: rolling my own is attractive, but ... so is finishing things. I wouldn't think of writing my own HTML parser, after all. (I'm crazy, but not that crazy.)
Of course, I'm also not proposing switching back and forth between text and Decl in HTML definitions, as I am with CSS. So ....
I'm now looking more closely at CSS (after saying I could get away with ignoring it). Building on the basic insights of LessCSS and OOCSS, I'd like to go through a bunch of CSS examples and ... do whatever comes naturally.
For example, here is a list of CSS examples. Or I suppose I could have looked at StackOverflow to start with. Another list of tutorials. Really, there's a lot written about Web design. Go figure.
Another good reference.
Thursday, June 2, 2011
PHPOpen.net is a directory for open-source PHP Web apps. I've selected PHP Agenda as a candidate for understanding how PHP works. (Because it's small.) There's no way I can even pretend to be able to hit a June 15 launch for Depatenting, but by God I'm making progress.
Until I do, I'm not ready to move.
Wednesday, June 1, 2011
JQuery WormHole. An intuitive way to drag things between containers. This is the kind of thing that could so easily be a building block in an advanced UI description language.
Speaking of which, I'm making loads of progress on HTML::Declarative in the context of WWW::Publisher. Nearly to the point where I can see where that advanced UI description language might come from.
A fantastic, fantastic tchrist comment on StackOverflow about UTF-8 and Unicode handling in Perl. Honestly, this should be read repeatedly.
Before I get too much further with Decl, I really need to sit down and think hard about text encodings. I mean, I treat text as a separate datatype already, essentially - it should be fully Unicode-aware. Which is not easy.
Some excellent UI design tips that highlight the differences in thought between the UI designer and the programmer. tl;dr: think of UI in terms of the tasks your user is performing, not in terms of the underlying data structures you've built to support those tasks.