Monday, May 30, 2011
Oh, man, my Achilles heel is error handling. So it was with great interest that I read this article about how (not) to do it. It kind of ends up being a long-form advert for Erlang, but that's OK; immutable variables are a good way of dealing with exceptions.
What I'd really like is to think harder about addressing error/exception handling at the semantic level. Somewhere in there, errors are workflow. They might be really simple workflow, but at least potentially they're a matter of making a bookmark to a record that tripped us up, and doing something about it later.
The notion of saving state is also interesting, as is the concept of transactions as a method of keeping changes in bundles.
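To make the bookmark-a-record idea concrete for myself, here's a minimal sketch in Python. Everything here is invented for illustration: process() stands in for whatever the real work would be, and the "bookmark" is just a list of deferred records.

```python
def process(record):
    # Hypothetical worker: fails on records missing an 'id' field.
    if "id" not in record:
        raise ValueError("record has no id")
    return record["id"]

def run_batch(records):
    done, later = [], []
    for record in records:
        try:
            done.append(process(record))
        except ValueError as err:
            # The error becomes workflow: bookmark the record and the
            # reason, and keep going instead of aborting the batch.
            later.append({"record": record, "reason": str(err)})
    return done, later

done, later = run_batch([{"id": 1}, {"name": "no id"}, {"id": 3}])
# done holds the successes; later holds the record that tripped us up,
# ready for a second pass (or a human) to deal with.
```

The point isn't the try/except; it's that the deferred queue is first-class data the rest of the program can reason about, instead of an exception vanishing up the stack.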
At some point, I'd really like to get serious about a uniform way of talking about errors.
Huh. Looks like I was already thinking about this in 2009. Well - to be fair, it was already bothering me in 1989.
Here's an interesting paper on the efficacy of a number of different crowdsourcing strategies (using Mechanical Turk). Which leads to the concept of "programming for crowds". If I can program a company, then I can surely program a crowdsourcing application, right?
Think about it.
Saturday, May 28, 2011
Here's a fun post: the author wrote an AI to play Tetris (OK, I have this same urge, of course), and that AI had a set of parameters he guessed at. The logical next step was to run a genetic algorithm to try out different values for those parameters.
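The tuning loop itself is pleasingly small. Here's a toy sketch: the "AI" is just a dummy scoring function with a known optimum standing in for a real Tetris evaluator, and the GA parameters (population size, mutation rate) are pulled out of thin air.

```python
import random

def fitness(params):
    # Pretend the best weights are (0.5, -0.2, 0.8); closer is better.
    target = (0.5, -0.2, 0.8)
    return -sum((p - t) ** 2 for p, t in zip(params, target))

def evolve(pop_size=30, generations=40, rate=0.1):
    random.seed(0)
    pop = [tuple(random.uniform(-1, 1) for _ in range(3))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]       # keep the top half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            # Average two parents, then jitter each gene a little.
            child = tuple((x + y) / 2 + random.gauss(0, rate)
                          for x, y in zip(a, b))
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

best = evolve()
```

Swap fitness() for "play a thousand Tetris games with these weights and return the average score" and you have the article's setup.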
Here's a fascinating idea. Pastebin has a public section. Concerned about abuse, one Internet citizen scraped it to see what was there - and there were lots of ... things of dubious ethics there. Let's say.
Why not scrape it daily and analyze what's there? This seems kind of interesting to me, and I'm not even entirely sure why.
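The analysis half doesn't have to be fancy to be interesting. A toy sketch of what a daily pass might count - the pastes are hardcoded here (a real version would fetch the public archive), and these three patterns are just crude indicators I made up:

```python
import re
from collections import Counter

PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ip_address": re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"),
    "password_dump": re.compile(r"password", re.IGNORECASE),
}

def classify(pastes):
    # Count how many pastes trip each indicator.
    counts = Counter()
    for text in pastes:
        for label, pattern in PATTERNS.items():
            if pattern.search(text):
                counts[label] += 1
    return counts

sample = [
    "user@example.com : hunter2",
    "server at 192.168.0.1 is down",
    "list of passwords below",
]
counts = classify(sample)
```

Run that every day and graph the counters, and you'd have a crude barometer of what the public section is being used for.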
So there's this site Rosetta Code that shows snippets of how to do Task X in lots of different languages. I find that pretty fascinating from the semantic-programming standpoint; in a certain sense, each set of solutions encodes the same thing, the same meaning.
The link goes to "open a window"; Perl has five ways of doing it, depending on the GUI framework you're using, and I'd like to explore that parallelism in Decl.
Wednesday, May 18, 2011
COLM is a language for parsing and tree transformation (like TXL before it). Bears watching.
(I'm pretty sure my concept of mapping is equivalent to a tree transformation with notes kept to permit reversibility.)
Tuesday, May 17, 2011
Here is a really nice blow-by-blow set of instructions for setting up a good node.js server on EC2. Recall that I also wanted to do more or less the same thing for VirtualBox setups. Clearly, what's needed is a semantic description of operating environments, maybe a Deployment::Decl or something. Anyway, bear in mind. Node.js needs more investigation anyway.
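What I mean by a semantic description of an operating environment is something like this sketch: a declarative structure that different renderers could turn into setup steps for EC2, VirtualBox, or anything else. Every field name here is invented; a real Deployment::Decl would presumably be Decl, not Python.

```python
ENVIRONMENT = {
    "platform": "ubuntu",
    "packages": ["git", "build-essential"],
    "services": [{"name": "node-app", "port": 8000}],
}

def render_setup(env):
    # Render one possible realization: plain shell commands. A
    # different renderer could target a VirtualBox provisioner instead,
    # from the same description.
    steps = ["apt-get install -y " + " ".join(env["packages"])]
    for svc in env["services"]:
        steps.append("start %s --port %d" % (svc["name"], svc["port"]))
    return steps

steps = render_setup(ENVIRONMENT)
```

The description says what the environment is; the renderer says how to get there on a particular substrate. That split is the whole point.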
Monday, May 16, 2011
Sunday, May 15, 2011
So look here: telescopictext.com. This is a simple story: "I made tea." Click on highlighted words for more detail. A lot more detail. And yet the overall narrative remains that the author made tea. [More detail at telescopictext.org, including a toolset!]
This is more or less what I'm saying about semantic programming. Looking at the high-level specification for a particular action, we see "Make tea". When we consider that action in more detail, we resolve further specifications that were previously invisible. Like varying f-stops that the eye cannot perceive, the mind elides these vast gulfs of detail in order to make sense of the world.
This process is nearly imperceptible to us. As programmers, we're familiar with it - it's the reason for Hofstadter's Law, if nothing else - but programming languages don't take it into consideration (except insofar as some are higher-level, and of course LISP can be made to do some of this, with its fancy macro system to hide cruft at will).
Maybe "telescopic code"?
Down to the last three open in my browser, but categorization fatigue has overwhelmed me.
- Stanford has Protovis, a visualization library for statistical things.
- ArsTechnica reports on a Georgia Tech project Kermit, to permit ease of management for home networks (throttling bandwidth, etc.) My router is hopelessly inadequate for these tasks, but on the plus side it was cheap. I'm thinking of loading Unix on a router and doing things right. Real Soon Now.
Posted by Michael at 12:37 PM
So here's a thoughtful post about why relational databases aren't always the right thing to do. TL;DR - Oracle's standard order management model has 126 tables, and surely that's sometimes overkill.
This is exactly what I mean with the notion of semantic programming. Semantic structures have sliding levels of detail, and the actual number of tables or location of things in RDBMS or NoSQL or whatever should have no effect whatsoever on the actual semantics of the problem domain.
So: food for thought. It's a good article.
OK, OK, this is another clogged space. Still - I'd like to look closely at the semantics. Unfuddle.com is yet another target app. They make money, so ... I could, too, presumably. Why not? The key is to be able to iterate faster than other humans, and making your specific software a fungible expression of a fixed semantic domain is definitely the way to go.
- Some LISP idioms for various tasks. It's the meta-level thinking I'm interested in here.
- How to write your own native Node.js extension: some boilerplate stuff.
- HTML5 canvas cheat sheet. Not really boilerplate, but I've been cleaning up my browser all day and I'm suffering from categorization fatigue.
- A neat-o classification of HTTP APIs. This deserves more serious thought.
Some interesting things coming up in natural language this month.
- Text-processing.com has some online NLTK demos and offers NLP as an API! Cool stuff!
- Article content extraction (once the exclusive domain of Readability and its ports) is now getting a little more coverage: the Goose library (Java, sadly).
- A paper on semantic analysis. I need more sleep even to read the title, apparently.
- Tag extraction from text (!!!): the Apresta tagger library.
Back when tower defense games were all the rage, I spent a little time developing some tools to play tower defense for me. It was more challenging than it sounds - but only because TD games are all in Flash, and Flash has no machine-accessible output or state beyond its actual screen output. Screen capture is slow. So actually responding to screen output is essentially impossible. (Not to mention the shocking dearth of easy-to-use OCR libraries, which seems still not to have been resolved - and it's 2011!)
Anyway, the description of a level is presumably in a nice little string. That string could be evolved with a GA - making new levels that people could play in a Web2.0 fashion, providing grist for the evolution! But that would require a lot of people.
So why not evolve playing strategies as well? This would consist of a list of pullback coordinates and delays. The ease with which a given population could evolve a playing strategy would allow the calculation of a "playability metric" - and that in turn would be an evolutionary metric for the new levels.
So by harnessing two levels of evolution, you could (maybe) generate an arbitrary number of entertaining Angry Birds levels.
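Here's how I imagine the two-level loop, reduced to a toy: a "level" is just a target number, a "strategy" is a guess at it, and strategy fitness is closeness. The playability metric is how many generations a small GA needs to crack the level, and levels whose metric lands in a "fun" band score best. All the scaffolding here is invented - a real version would have actual levels and actual playing strategies.

```python
import random

random.seed(1)

def generations_to_solve(level, pop_size=20, limit=50):
    # Inner evolution: how fast can a population of strategies
    # (here, plain numbers) solve this level?
    pop = [random.uniform(0, 100) for _ in range(pop_size)]
    for gen in range(limit):
        pop.sort(key=lambda s: abs(s - level))
        if abs(pop[0] - level) < 0.5:
            return gen
        parents = pop[: pop_size // 2]
        pop = parents + [random.choice(parents) + random.gauss(0, 2)
                         for _ in range(pop_size - len(parents))]
    return limit

def level_fitness(level):
    # Outer evolution: "fun" levels are neither trivial nor hopeless,
    # so aim for a middling number of generations.
    g = generations_to_solve(level)
    return -abs(g - 10)

levels = [random.uniform(0, 100) for _ in range(10)]
best_level = max(levels, key=level_fitness)
```

The outer loop is only doing selection here; breeding new levels from the winners would close the circle.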
After many, many years, my sister and I have convinced my mother that all those decades of old photographs need to be scanned and curated. My sister being a CPA, she has a nice document-feeding scanner. So I ran a few test runs, and it turns out that at 600 dpi in full color, any dust at all inside the scanner leaves vertical stripes on the scanned image that are ... not quite monochromatic, but more like a transparent overlay of a monochromatic stripe.
Naturally, I figure there must be software out there to help me remove those stripes. My best strategy so far is to scan each picture twice, once upside down, so that the stripes will be in different places - then do something to recognize the stripes and eliminate them using the corresponding places on the other image.
So far this is a Hard Problem. Here is a link dump of some of the things I've run across while researching it:
- The CImg library - and here I thought ImageMagick was all there was!
- The hdrprep script - a Perl script to manipulate imagesets prior to stitching them together to average out their light levels (HDR = High Dynamic Range, a very neat technique)
- ALE, a tool that works magic on images of such refined scope that I can't even truly understand the explanatory blurb - except that I know what registration is, and it's Good Stuff. It only runs on Linux, as a command-line utility, so it's going to take some actual effort to use: I'm lazy, and my Linux box is downstairs and thus requires ssh to hit.
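The scan-it-twice strategy itself is simple enough to sketch on tiny grayscale "images" stored as lists of rows. The stripe detection here (flag columns whose mean deviates sharply from the image mean) is a crude stand-in I made up for whatever the real detection would need to be; the second scan is assumed already rotated back upright, so its stripes land in different columns.

```python
def find_stripe_columns(img, threshold=30):
    # A column is suspect when its mean differs sharply from the
    # mean of the whole image.
    h, w = len(img), len(img[0])
    overall = sum(sum(row) for row in img) / (h * w)
    cols = []
    for x in range(w):
        col_mean = sum(img[y][x] for y in range(h)) / h
        if abs(col_mean - overall) > threshold:
            cols.append(x)
    return cols

def destripe(img_a, img_b):
    # Patch each striped column in scan A with pixels from scan B.
    fixed = [row[:] for row in img_a]
    for x in find_stripe_columns(img_a):
        for y in range(len(fixed)):
            fixed[y][x] = img_b[y][x]
    return fixed

clean = [[100] * 6 for _ in range(4)]
scan_a = [row[:] for row in clean]
scan_b = [row[:] for row in clean]
for y in range(4):
    scan_a[y][2] = 255   # stripe in column 2 of the first scan
    scan_b[y][4] = 255   # stripe elsewhere in the second scan

result = destripe(scan_a, scan_b)
```

The Hard Problem part is everything this sketch assumes away: the two scans won't be pixel-registered, and real stripes are translucent rather than solid - which is exactly why ALE's registration machinery looked interesting.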
Friday, May 13, 2011
Here's a little article musing about HTML escaping: the tl;dr is that you need to know your application before really being able to do a perfect job. This being not so far from the problem with i18n and l10n of resources, it reminded me of an idea I had: to produce a generic library for the higher-level manipulation and preparation of program output.
It's still in a sketchy state, but - natural language being much easier to generate than to parse - it might be a good place to start on more flexible output design.
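The "know your application" point boils down to escaping being context-dependent. A minimal sketch of what the library's core might look like - the context names and this tiny dispatch table are invented for illustration, built on Python's stock escapers:

```python
import html
from urllib.parse import quote

# The same raw string needs different treatment depending on where
# it lands: HTML body text, an HTML attribute value, a URL parameter.
ESCAPERS = {
    "html_text": lambda s: html.escape(s, quote=False),
    "html_attr": html.escape,   # also escapes quote characters
    "url_param": quote,
}

def prepare(value, context):
    return ESCAPERS[context](value)

prepare('a "quoted" <tag>', "html_text")
prepare('a "quoted" <tag>', "html_attr")
prepare("hello world/you", "url_param")
```

A higher-level version would let templates declare the context once, so the caller never picks an escaper by hand - which is where the i18n/l10n parallel comes in.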
Smashing Magazine has a truly wonderful article musing about best practices for signups and logins. Read it.