Wednesday, July 28, 2010

Flash remoting

In Perl. Interesting platform element.

Wx and Flash

If I'm going to write a browser in Perl, I'll want to run Flash. Unfortunately, Windows is the only supported platform because it runs ActiveX, but at least it would work: Wx::ActiveX::Flash.

Grist for the mill

So of course you know what I'd like to do is to analyze streams of jobs appearing on the various freelancer sites, and be able to post with a finished product within minutes of the job being posted. I know, I know, but this is the sort of pipe dream that drives genius of my level. So when I run across sources for said job streams, I feel the need to bookmark them. And now, blog them, especially when I need to close my browser and dump links.

Twitter! Specifically, "I need software that".

Scriptlance "Data mining" tag. You can search by content as well. Whether an RSS feed is available I don't yet know, but a topic-specific set of aggregators would be the thing to start with, eh?

Freelancer.com "scraping" search term.

Anyway, it would be cool at least to start on the aggregator and NLP end of this, just to see how far I'd get before running into the really-not-worth-it wall.

Tuesday, July 27, 2010

Target domain: Web interaction

Maybe I shouldn't just be thinking in terms of Web reading, but rather (in the 2.0 spirit) Web interaction. So, on that note, two things:

1. Hookbox is a system (in JS) that treats Websites as channels, and intermediates. Definitely needs analysis.

2. A valuable addition to WWW::Declarative would be the "web-agent". The web-agent is what uses the browser and target sites in order to carry out ... stuff. And that should generalize to an agent in general, and subclass for WWW-specific domain knowledge and actions.

The agent object could provide things like flow charts, action-reaction tables, timing and scheduling, and that sort of thing. You could build a chatbot around a generic agent, for instance. I think it's a sufficiently general concept that it's justified.
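To make the action-reaction table idea a little more concrete, here is a minimal sketch of a generic agent with a WWW-specific subclass. The package names `Agent` and `Agent::WWW`, the `react` method, and the event names are all hypothetical; none of this comes from WWW::Declarative.

```perl
use strict;
use warnings;

# Hypothetical generic agent: an action-reaction table maps event names
# to code refs.
package Agent;

sub new {
    my ($class, %reactions) = @_;
    return bless { reactions => {%reactions}, log => [] }, $class;
}

# Log the event, then run its reaction if the table has one.
# Unknown events are logged and otherwise ignored.
sub react {
    my ($self, $event, @args) = @_;
    push @{ $self->{log} }, $event;
    my $reaction = $self->{reactions}{$event};
    return unless $reaction;
    return $reaction->($self, @args);
}

# Hypothetical WWW-specific subclass: domain knowledge arrives as default
# reactions, which the caller can still override.
package Agent::WWW;
our @ISA = ('Agent');

sub new {
    my ($class, %reactions) = @_;
    my %defaults = (
        login_required => sub { return 'submit login form' },
        page_loaded    => sub { return "parse $_[1]" },
    );
    return Agent::new($class, %defaults, %reactions);
}

package main;

my $agent = Agent::WWW->new(
    page_loaded => sub { return "extract links from $_[1]" },
);
```

A chatbot would presumably be another subclass with a different set of default reactions; timing and scheduling are left out entirely here.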

Update 7/30/2011 - dammit, the link there is toast now. Probably need Wayback or something to figure it out now. Why do people do that? Thirty seconds on Google gives me an interesting overview that's probably better for my purposes. My knee-jerk interpretation: the guy was hired and Hookbox disappeared into proprietaritude. It looks like it sparked a flurry of interest in late 2010, though.

Target domain: Web reading

OK, so Web slurping and robots (data acquisition from the Web) were always going to be one of my target domains, and I've got a need right now, so I guess WWW::Declarative is herewith on the table.

WWW::Declarative plus Wx::Declarative should make it possible to build a browser that can be automated in Perl. I think that would be a really handy little tool.

At least initially, WWW::Declarative will wrap LWP (book, intro) and HTTP::Cookies. At some point WWW::Robot might be an interesting thing to look at. Or WWW::Mechanize. (Or, of course, both.) It's always hard to judge, but from the volume of writing, I'd say Mechanize seems to be the leader in the field.

HTML::TreeBuilder will also be part of WWW::Declarative. I suspect that will just build a nodal structure (that seems to make the most sense) that we can then map to whatever. I'd feel more confident if all that had already been implemented; perhaps this is the domain where I'll implement it, yes?

Monday, July 26, 2010

Clay: generic compiled language

For efficiency without the loss of higher-level semantics.

Sunday, July 25, 2010

Levels of abstraction as maps

So back in May I had a post on levels of abstraction in code, followed this month by some thoughts on finding mappings - these are actually the same thought.

In my invoice example, I imagine a very high-level abstract specification of a specific invoice, then a medium-level specification of the invoice as a generic document, then perhaps a lower-level specification of that document as a PDF. Followed, I guess, by the actual PDF. The point is that each of those specifications represents the same thing.

When I talk about levels of abstraction in a program, this is the same thing again - a high-level summary (or specification) of the parts of the program, and a lower-level specification of the actual program.

I just need to figure out how to express this understandably, and I'm good to go.

Perl: notes on threads

Threads in Perl have decent support; Wx in Perl just uses the Perl variety without using much of the wxWidgets machinery, as far as I can tell. (Caveat: I don't know squat about threads except for theory; I took the right classes in grad school, but this is another thing I've never felt the need to deal with in real life. Yet.)

First, threads in Wx. Note that this references the modules threads and threads::shared. Finally, perlthrtut is a recommended read.

My thinking is leaning towards providing threads only in the context of event handlers. Some event handlers could be marked as explicitly threaded, and those would spawn a new thread when fired. Wx methods would then be used for synchronization under Wx; in the general case, we'd use $thread->join.
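A rough sketch of the general (non-Wx) case, assuming a handler table with a `threaded` flag. The names `%handlers`, `fire`, and `result_of` are made up for illustration, and the code quietly falls back to a plain call on a perl built without ithreads.

```perl
use strict;
use warnings;
use Config;
use Scalar::Util qw(blessed);

# Hypothetical handler table: each event has a callback and a flag saying
# whether it should run in its own thread.
my %handlers = (
    click  => { threaded => 0, code => sub { return "clicked $_[0]" } },
    reload => { threaded => 1, code => sub { return "reloaded $_[0]" } },
);

# Only load threads if this perl was built with ithreads.
my $can_thread = $Config{useithreads}
    ? eval { require threads; threads->import; 1 }
    : 0;

# Fire an event. A threaded handler returns a thread object (when threads
# are available); any other handler returns its result directly.
sub fire {
    my ($event, @args) = @_;
    my $h = $handlers{$event} or die "no handler for '$event'";
    if ($h->{threaded} && $can_thread) {
        return threads->create({ context => 'scalar' }, $h->{code}, @args);
    }
    return $h->{code}->(@args);
}

# Synchronize: join a thread object, pass anything else through.
sub result_of {
    my ($r) = @_;
    return blessed($r) && $r->isa('threads') ? $r->join : $r;
}
```

The caller never needs to know which handlers are threaded: `result_of(fire('reload', 'page'))` works either way.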

Target domain: Ruby on Rails

I've tacitly been thinking about Web apps in terms of PHP and perhaps jQuery, but really, to be honest, I'd be a fool not to look at different platforms, now wouldn't I?

So here's a very interesting roadmap for learning Ruby on Rails. Here's what I like about this: it takes the semantic subdomains needed to plan (or understand) a Web app, and lays them out. A similar thing could be done for other approaches and platforms - and that result would be the semantic map for the domain.

That is what semantic programming is about.

Target domain: Project management

Here's another domain I've spent a lot of time thinking about in the past. And here are some open-source links to various Gantt chart options and open-source projects (thanks to a recent post on HNN, as always).

dotProject.net - what it says on the tin.
faces - seems to be a Python-based DSL for project planning or something.
OpenProj - what it says on the tin.

A page by Edward Tufte about alternative graphical languages for project presentation.

Two ways to make Gantt charts in LaTeX.

Saturday, July 24, 2010

Perl: mucking with the symbol table

Suppose you want to be able to build a function on the fly (as a closure) then call it like any other subroutine. Your answer is to modify the symbol table, to wit:
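$sub1 = sub {
    print "Hi!\n";
};

$clo2 = sub {
    print "OK?\n";
    # Install $sub1 as sub1() for the dynamic scope of this closure only.
    local *sub1 = $sub1;
    sub1();
};

$clo2->();

# Outside $clo2 the typeglob has been restored, so this call dies.
sub1();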
Here, the call to $clo2 will print "OK?", then "Hi!", then the second attempt to call sub1() will fail because it's not defined (we localized the typeglob within $clo2).
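The same trick also works under `use strict` if the glob name is built from a string. The sketch below wraps it in a helper; `with_temp_sub` and `greet` are hypothetical names for illustration, not anything from Semantics::Code.

```perl
use strict;
use warnings;

# Hypothetical helper: make $code callable as main::$name while $body runs,
# then restore whatever the symbol table held before.
sub with_temp_sub {
    my ($name, $code, $body) = @_;
    no strict 'refs';                # the glob name is built from a string
    no warnings 'redefine';
    local *{"main::$name"} = $code;  # localize the glob, then fill its CODE slot
    return $body->();
}

my $greet = sub { return "Hi, $_[0]!" };

# Inside the body, greet() is resolved through the symbol table at run time.
my $inside = with_temp_sub('greet', $greet, sub { return greet('there') });

# Once with_temp_sub has returned, the name is undefined again.
my $still_defined = defined &main::greet ? 1 : 0;
```

Keeping the `local` inside one helper means the temporary name can't leak past the call that needed it.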

This is how I've implemented subroutines in Semantics::Code.

wxdtut - a Wx::Declarative tutorial (first attempt and thoughts)

So here's the idea: I want to write a self-contained HTML-based tutorial that will both publish to the Web and run the demo programs in a standalone format. As an initial stab, I put together forty lines of Declarative (posted here on the Wiki).

It's a nice first try; it does actually find a sloppily-named script in the demo directory, and runs it. There are a few holes it shows in my design so far; no "sub" for common code is the worst. So I'm going to try to solve that first.

Ultimately, I'll want a tree of chapter/section, a built-in editor for trying demos, output capture, and so on. But this is a good start.

Wednesday, July 21, 2010

More thoughts on invoices

In addition to a "brittle" definition of a specific data structure representing an invoice, a semantic programming system should have some general, dare I say semantic knowledge concerning invoices, and business processes in general, documents, and so on. In other words, there should be a semantic web, one view of which might be a specific data structure definition, and the system could interact with e.g. Perl code using that specific data structure definition - but it could also modify it based on what it knows about invoices.

That's pretty darned hand-wavy, but again: it's where I want to go.

AI

I always go crazy for any AI I see. I'm happy to note there's a CPAN module AI::Genetic (searched for it after I saw an HNN post on genetic algorithms in Python). Anyway, this is a good target domain. Duh.

Tuesday, July 20, 2010

3D modeling

New domain of interest. ICE is an interactive composition environment that looks interesting. A blog.

Saturday, July 17, 2010

Coffeescript

Interesting: compiles to Javascript. Slideshow.

Thursday, July 8, 2010

SYMADE

An example of semantic-oriented programming. Interesting.

Finding mappings

Imagine, then, we have an invoice. We moreover have a task that somehow says, "Make a document from this invoice." The way to do that is to find a unit that is indexed as a map from an invoice onto a document. That map satisfies the need to make a document, so we set up the map, let it do its thing, and we now have a document.

Note that the task is thus a high-level description of the essence of the program. Again, this is akin to 4GLs in that we state what to do, and the semantics tell us how to do it. This is truly declarative programming.
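As a toy illustration of "find a unit that is indexed as a map from an invoice onto a document", here is a sketch of a registry keyed by source and target type. All the names (`register_map`, `find_map`, `make`, the hash fields) are assumptions invented for the example.

```perl
use strict;
use warnings;

# Hypothetical registry of maps, indexed by source type and target type.
my %maps;

sub register_map {
    my ($from, $to, $code) = @_;
    $maps{$from}{$to} = $code;
}

# Find a map from a $from unit onto a $to unit; undef if none is indexed.
sub find_map {
    my ($from, $to) = @_;
    return $maps{$from} ? $maps{$from}{$to} : undef;
}

# The task only names the target; the registry supplies the how.
sub make {
    my ($to, $unit) = @_;
    my $map = find_map($unit->{type}, $to)
        or die "no map from $unit->{type} to $to";
    return $map->($unit);
}

register_map(invoice => 'document', sub {
    my ($invoice) = @_;
    return {
        type  => 'document',
        title => "Invoice #$invoice->{number}",
        lines => [ map { "$_->{description}: $_->{price}" } @{ $invoice->{items} } ],
    };
});

my $invoice = {
    type   => 'invoice',
    number => 3,
    items  => [ { description => 'Translation DE-EN', price => '438.00' } ],
};

# "Make a document from this invoice."
my $document = make(document => $invoice);
```

The last line is the whole "program" in the sense used above: it says what is wanted, and the lookup decides how.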

Mapping in real life

OK. So today's epiphany is as follows: Given a unit "invoice" which I have already specified (i.e. filled in), I can map that onto a "document" unit that might look like this:
document
    text title "Invoice #3"
    text customer
        Customer name
        Street 9
        D-whatever Germany
    text identifying
        Job # whatever
    table
        header
            text "Description"
            text "Units"
            text "Unit price"
            text "Price"
        row
            text "Translation DE-EN"
            text "4802 wds"
            text "See PO"
            text "438.00 €"
    text (align=right) "Total: 438.00 €"
    text bottom_boilerplate
This maps back and forth to the invoice, which is the abstract view of the same thing. And this can be specified even further with layout information, either in the abstract document or a different, more specific template, and then that structure can be mapped onto a Word document or PDF. (Or both.)

The point being that this map is then itself a live object that can be stored and represented, and that is semantic programming. This part is a lot like XSLT, because XSLT is all about mapping and transforming tree structures. But it's unlike XSLT because (1) it doesn't presume that the mapping is a one-way, one-time transformation, and (2) the maps and semantic units are organized in a lexical database somehow. That lexical database is itself the program. In some way.
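To show what "not one-way" could mean in practice, here is a minimal sketch of a map stored as a plain data structure with a forward and a backward direction. The field names and the cut-down invoice are assumptions for the example; a real map would have to carry much more.

```perl
use strict;
use warnings;

# Hypothetical two-way map: a hash holding both directions, so the map
# itself is an object that can be stored and passed around.
my $invoice_document_map = {
    from    => 'invoice',
    to      => 'document',
    forward => sub {
        my ($inv) = @_;
        return {
            title => "Invoice #$inv->{number}",
            rows  => [ map { [ @{$_}{qw(description units price)} ] } @{ $inv->{items} } ],
            total => "Total: $inv->{total}",
        };
    },
    backward => sub {
        my ($doc) = @_;
        my ($number) = $doc->{title} =~ /^Invoice #(\d+)$/;
        my ($total)  = $doc->{total} =~ /^Total: (.+)$/;
        return {
            number => $number,
            total  => $total,
            items  => [
                map { +{ description => $_->[0], units => $_->[1], price => $_->[2] } }
                    @{ $doc->{rows} }
            ],
        };
    },
};

my $invoice = {
    number => 3,
    total  => '438.00 EUR',
    items  => [
        { description => 'Translation DE-EN', units => '4802 wds', price => '438.00 EUR' },
    ],
};

# Map onto a document, then back again.
my $document   = $invoice_document_map->{forward}->($invoice);
my $round_trip = $invoice_document_map->{backward}->($document);
```

Writing the two directions by hand like this is exactly the tedium a semantic system would be expected to remove; the sketch only shows that the map can live as data.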

I hope this cleared all that up.

Wednesday, July 7, 2010

Diggy/DGE Javascript/DHTML-based game platform

Another cool domain - Javascript games. (Also examine as prototype for Fireworks.)

Tuesday, July 6, 2010

Yahoo! design pattern library

Not the first time I've run across this, of course, but... it has semantics written all over it, so into the link heap it goes.

Monday, July 5, 2010

Another stab at "invoice"

I now dislike my earlier text-based notion of macros. The macro system should be native, and more importantly, needs to be at a semantic level. That is, we are describing to Class::Declarative what sort of node it should be building.

With this in mind, here's my current notion:
unit invoice
    has customer => customer
    has data items (description, price, unit, subtotal)
    assert usually count(items) > 0
    calc total = sum(price) from items
    has currency => currency default USD
    has comments

I need to find a more specifically macro-oriented structural definition and express it like this.
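For comparison, here is one guess at the plain Perl such a declaration might expand to. The function name `new_invoice` is hypothetical, and reading "assert usually" as a warning rather than a fatal error is my assumption, not something Class::Declarative defines.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical expansion of the "unit invoice" declaration.
sub new_invoice {
    my (%args) = @_;
    my $invoice = {
        customer => $args{customer},
        items    => $args{items} || [],
        currency => $args{currency} || 'USD',   # has currency ... default USD
        comments => $args{comments},
    };

    # assert usually count(items) > 0
    # "usually" is read here as: warn, but carry on.
    warn "invoice has no items\n" unless @{ $invoice->{items} };

    # calc total = sum(price) from items (the leading 0 covers an empty list)
    $invoice->{total} = sum(0, map { $_->{price} } @{ $invoice->{items} });

    return $invoice;
}

my $invoice = new_invoice(
    customer => 'Customer name',
    items    => [
        { description => 'Translation DE-EN', price => 438, unit => 'job', subtotal => 438 },
    ],
);
```

Seven declarative lines against roughly twenty of Perl is the argument for doing this at the semantic level.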

Now Lisp in PHP!

Of all things. I should just go ahead and finish my Perl one.

Another thought on files

I'm proposing a new approach to the formalization of knowledge about programming constructs. Instead of building a new module, I'm proposing the definition of heuristics for the use of the old ones. OK, sometimes your application justifies the creation of a new module (rather often, granted), but sometimes, given a script order, you either don't have the luxury of installing a module, or you just don't want to involve that kind of overhead.

In such a case, you want to be able to write a script that uses existing Perl infrastructure (say) but still manages to deal with sticky cases like UTF-8 BOMs because the actual specific files in use for the case include them.

So the trajectory I went through: write the naive code, discover BOMs, work out a way to deal with them, find one file in the list that didn't have a BOM and write the appropriate conditional code to handle both cases - that trajectory is something that is amenable to automation.

Well. It's not really very amenable to automation right now. That's the point of semantic programming. That's what I want to automate. It didn't require a whole lot of human insight, just some techniques I've learned over the years and some basic logic. I don't think it's AI-complete; it's just one more corner to break off the brick, and it's the corner I have my eye on.

Saturday, July 3, 2010

Couponing automation

Probably stupid, but Jeffrey believed that with couponing, he could live on $1 a day for food and have plenty to eat (via MeFi). He blogged it for a month, spending less than $30 and eating rather well - while donating food to a local food bank. Then he kept doing it.

This intrigues me. A lot of the coupon game involves planning and "deal detection" that might be amenable to automation. So ... maybe it's a good target?

If so: SavingAdvice.com coupon database. Jeffrey's FAQ. An explanation of blinkies. There are lots of fora, of course, where heuristics could be gleaned.

Worth thinking about.

MetaOptimize Q&A site

A Stack Overflow clone for statistical methods, machine learning, etc.

A great domain for semantic programming.

Javascript local data store

Another interesting component in Javascript. Why, yes, I am in link dump mode, thank you.

Understanding systems

A blog post musing on understanding systems and how it makes us better programmers. This, again, is kind of where I want to be going with a semantic software system.

Intrusion detection

This blog post on intrusion detection probably has no place here, except that if semantic programming can't also include semantically motivated machine learning paradigms like intrusion detection, then I'm probably not thinking about the right thing.

Well. I find it interesting, so it gets bookmarked here.

UTF-8 files and Perl

I ran into some problems trying to load some files with UTF-8 text (German) using Perl. Thing is, these files had a three-byte byte order mark (BOM) of ef bb bf [here is another useful link] and Perl freaks out. You have to check the first line for those three bytes; if present, you toss them and keep a flag. Then for the rest of the file, you have to set the UTF-8 flag on each line read.

The code:
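use Encode;

# $d and $file come from the surrounding script.
open F, '<', "$d/$file" or die "Can't open $d/$file: $!";
$utf8_file = 0;
$firstline = scalar <F>;
if (defined $firstline) {
    # A UTF-8 BOM is the byte sequence ef bb bf at the very start of the file.
    if ($firstline =~ s/^\xef\xbb\xbf//) {
        $utf8_file = 1;
        # _utf8_on is marked internal in the Encode docs: it flags the
        # bytes as UTF-8 without validating them.
        Encode::_utf8_on($firstline);
    }
    # [ consume $firstline ]
}
while (<F>) {
    Encode::_utf8_on($_) if $utf8_file;
    # [ consume $_ ]
}
close F;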
This is tedious.
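A less tedious variant, offered as an alternative sketch rather than what the script above does: let an `:encoding(UTF-8)` layer do the decoding, after which the BOM is the single character U+FEFF and can be stripped with one substitution. The function name `read_utf8_lines` is made up, and slurping the whole file assumes it is small.

```perl
use strict;
use warnings;

# Read a file as UTF-8 and return its lines (chomped), dropping a leading
# BOM if there is one. Files without a BOM go through the same path.
sub read_utf8_lines {
    my ($path) = @_;
    open my $fh, '<:encoding(UTF-8)', $path
        or die "Can't open $path: $!";
    my @lines = <$fh>;
    close $fh;

    # Once decoded, the bytes ef bb bf are the one character U+FEFF.
    $lines[0] =~ s/^\x{FEFF}// if @lines;

    chomp @lines;
    return @lines;
}
```

Because the substitution is a no-op when there is no BOM, the "one file in the list didn't have one" case needs no conditional code. Unlike the flag-setting version, this decodes every file as UTF-8 whether or not a BOM is present, which may or may not be what a given batch of files wants.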

I see two approaches to dealing with this. The first is to create Yet Another Module (this includes adding code to Class::Declarative), then always use that module when coding. This is kind of the default, and ultimately it is unsatisfying.

The other approach is some kind of pattern / macro / template system that would include this knowledge and would somehow generate the appropriate code as needed. That's where semantic programming needs to be headed.

Boy, that's vague.