Monday, May 30, 2011

Target app: www.timeoffhq.com

Look at it! Quick, easy, simple! It should be dead easy to do that!

Useful string functions for PHP

It caught my eye.

HTML overview

Just because I'm finally doing HTML::Declarative, here is a useful link to an HTML overview that I'm going to use to build a database of valid attributes and tags.

Dmainai - an AI tank game

Fun!

Unix tools

Man, this is a domain ripe for semantic clarification if ever there were one. It'd be good for me just to write it, actually, whether it ever got used or not.

Everyjs.com

Ha. A list of currently useful Javascript libraries. "The right tool for the job."

Data Mining: the map

Here's a fascinating map of data mining techniques.

MessagePack

An alternative to JSON that packs things much smaller. Good idea!
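To get a feel for where the savings come from, here's a minimal sketch that hand-encodes a tiny subset of the MessagePack format (nil, booleans, integers 0 to 127, short strings, small maps) and compares the size against compact JSON. The `pack` function is my own illustration, not the API of any MessagePack library, and the sample document is just an example.

```python
import json


def pack(value):
    """Encode a tiny subset of MessagePack (illustrative, not a real library)."""
    if value is None:
        return b"\xc0"                      # nil
    if value is True:
        return b"\xc3"                      # true
    if value is False:
        return b"\xc2"                      # false
    if isinstance(value, int) and 0 <= value <= 0x7F:
        return bytes([value])               # positive fixint: the byte is the value
    if isinstance(value, str):
        raw = value.encode("utf-8")
        if len(raw) > 31:
            raise ValueError("sketch only handles fixstr (at most 31 bytes)")
        return bytes([0xA0 | len(raw)]) + raw   # fixstr: length lives in the tag byte
    if isinstance(value, dict):
        if len(value) > 15:
            raise ValueError("sketch only handles fixmap (at most 15 entries)")
        out = bytes([0x80 | len(value)])        # fixmap: entry count lives in the tag byte
        for key, item in value.items():
            out += pack(key) + pack(item)
        return out
    raise TypeError(f"unsupported type in this sketch: {type(value).__name__}")


doc = {"compact": True, "schema": 0}
as_json = json.dumps(doc, separators=(",", ":")).encode("utf-8")
as_msgpack = pack(doc)
print(len(as_json), len(as_msgpack))
```

The trick is that small values and short lengths get folded into the type tag, so there are no quotes, colons, or braces to pay for.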

Error handling

Oh, man, my Achilles heel is error handling. So it was with great interest that I read this article about how (not) to do it. It kind of ends up being a long-format advert for Erlang, but that's OK; immutable variables are a good way of dealing with exceptions.

What I'd really like to think harder about is to address error/exception handling at the semantic level. Somewhere in there, errors are workflow. They might be really simple workflow, but at least potentially they're a matter of making a bookmark to a record that tripped us up, and doing something about it later.
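A minimal sketch of what "errors are workflow" might look like, assuming a simple batch job: instead of letting an exception abort the run, each failure becomes a bookmark pointing at the record that tripped us up, to be dealt with later. The names `Bookmark`, `Batch`, and `process` are hypothetical, made up for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Bookmark:
    """A note pointing back at a record we could not handle."""
    record_id: int
    reason: str


@dataclass
class Batch:
    done: list = field(default_factory=list)       # (record_id, result) pairs
    deferred: list = field(default_factory=list)   # Bookmark objects for later


def process(records, handler):
    """Run handler over every record; failures become bookmarks, not crashes."""
    batch = Batch()
    for record_id, payload in records:
        try:
            batch.done.append((record_id, handler(payload)))
        except Exception as exc:
            batch.deferred.append(
                Bookmark(record_id, f"{type(exc).__name__}: {exc}")
            )
    return batch


records = [(1, "10"), (2, "oops"), (3, "30")]
batch = process(records, int)
print(batch.done)
print(batch.deferred)
```

The deferred list is the really simple workflow: a queue somebody (or something) works through afterwards.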

The notion of saving state is also interesting, as is the concept of transactions as a method of keeping changes in bundles.

At some point, I'd really like to get serious about a uniform way of talking about errors.

Huh. Looks like I was already thinking about this in 2009. Well - to be fair, it was already bothering me in 1989.

UI design

An excellent list of ten considerations for UI design. Again: we oughta have a semantic domain for this. Patterns. Boilerplate. Macros. The works.

CSS again

Thirty-two neat things you can do with CSS that traditionally required JS.

Project management with Task Warrior

Task Warrior is a command-line tool for task management. It's really quite extensive. A project management domain could interface with it, say.

Domain: crowdsourcing strategies

Here's an interesting paper on the efficacy of a number of different crowdsourcing strategies (using Mechanical Turk). Which leads to the concept of "programming for crowds". If I can program a company, then I can surely program a crowdsourcing application, right?

Think about it.

LLVM

A chapter on LLVM from The Architecture of Open Source Applications. LLVM is the compiler backend behind Clang, llvm-gcc, and lots of other things.

Boilerplate

It's what's for dinner! A boilerplate setup for mobile services.

Saturday, May 28, 2011

Tetris AI

Here's a fun post: the author wrote an AI to play Tetris (OK, I have this same urge, of course), and that AI had a set of parameters he guessed at. The logical next step was to run a genetic algorithm to try out different values for those parameters.
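Roughly what that next step looks like, as a sketch and not the post author's actual code. The big assumption is the `score` function: in the real thing it would play some games of Tetris with the given weights and report lines cleared; here it's a hypothetical stand-in that just rewards weights close to a made-up target, so the example runs on its own.

```python
import random


def score(weights):
    # Stand-in fitness (assumption): the real version would play Tetris with
    # these heuristic weights. The target values here are invented.
    target = (-0.5, 0.8, -0.35, -0.2)
    return -sum((w - t) ** 2 for w, t in zip(weights, target))


def evolve(fitness, n_params=4, pop_size=30, generations=40, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(n_params)] for _ in range(pop_size)]
    history = []                              # best fitness seen in each generation
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        parents = pop[: pop_size // 3]        # elitism: the best third survives as-is
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]   # uniform crossover
            i = rng.randrange(n_params)
            child[i] += rng.gauss(0, 0.1)     # mutate one parameter slightly
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness), history


best, history = evolve(score)
print(best, history[0], history[-1])
```

Because the best third is carried over unchanged, the best score can never get worse from one generation to the next.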

That's cool!

Pastebin harvesting

Here's a fascinating idea. Pastebin has a public section. Concerned about abuse, one Internet citizen scraped it to see what was there - there were lots of .... things there of dubious ethics. Let's say.

Why not scrape it daily and analyze what's there? This seems kind of interesting to me, and I'm not even entirely sure why.

NLP Christmas in May

I found out that not only is the journal Computational Linguistics now free online - but a lot of journals in computational linguistics are free online (archive here). All I need to do now is build a semantic bot to analyze them...

Rosetta Code

So there's this site Rosetta Code that shows snippets of how to do Task X in lots of different languages. I find that pretty fascinating from the semantic-programming standpoint; in a certain sense, each set of solutions encodes the same thing, the same meaning.

The link goes to "open a window"; Perl has five ways of doing it, depending on the GUI framework you're using, and I'd like to explore that parallelism in Decl.

Wednesday, May 18, 2011

COLM: COmputer Language Manipulator

COLM is a language for parsing and tree transformation (like TXL before it). Bears watching.

(I'm pretty sure my concept of mapping is equivalent to a tree transformation with notes kept to permit reversibility.)

Tuesday, May 17, 2011

API Marketplace

O API Marketplace, where art thou? Very thought-provoking article about APIs and clearinghouses for them, along with a maybe somewhat dubious business model he wishes somebody else had.

PHP best practice for "require"

A good benchmarking article on different ways to include PHP files.

The concept of "compiling to PHP" really starts to make more sense given these different alternatives. Compiling to anything, really, is what I want to do.

DIY node.js server on EC2

Here is a really nice blow-by-blow set of instructions for setting up a good node.js server on EC2. Recall that I also wanted to do more or less the same thing for VirtualBox setups. Clearly, what's needed is a semantic description of operating environments, maybe a Deployment::Decl or something. Anyway, bear in mind. Node.js needs more investigation anyway.

Monday, May 16, 2011

MISC: a map-based, vaguely LISPish language

This is cool. MISC runs on Javascript and is a map-based, lazy-evaluated language with a lot of the features of LISP. Worth the read!

Sunday, May 15, 2011

Telescopic text

So look here: telescopictext.com. This is a simple story: "I made tea." Click on highlighted words for more detail. A lot more detail. And yet the overall narrative remains that the author made tea. [More detail at telescopictext.org, including a toolset!]

This is more or less what I'm saying about semantic programming. Looking at the high-level specification for a particular action, we see "Make tea". When we consider that action in more detail, we resolve further specifications that were previously invisible. Like varying f-stops that the eye cannot perceive, the mind elides these vast gulfs of detail in order to make sense of the world.
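As a toy illustration of that resolving of detail (my own sketch, not how telescopictext.com works internally): a table of expansions, and a function that unfolds a statement one level at a time. The expansions are invented for the example.

```python
# Hypothetical expansions: each phrase maps to a more detailed rendering.
EXPANSIONS = {
    "made tea": "boiled water and steeped a teabag",
    "boiled water": "filled the kettle and switched it on",
    "steeped a teabag": "left a teabag in the cup for three minutes",
}


def expand(text, depth):
    """Unfold up to `depth` phrases, one expansion per step."""
    for _ in range(depth):
        for short, detailed in EXPANSIONS.items():
            if short in text:
                text = text.replace(short, detailed, 1)
                break
        else:
            break   # nothing left to expand
    return text


for depth in range(4):
    print(depth, expand("I made tea.", depth))
```

At depth 0 the specification is still just "I made tea."; everything below it is detail that was always there but elided.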

This process is nearly imperceptible to us. As programmers, we're familiar with it - it's the reason for Hofstadter's Law, if nothing else - but programming languages don't take it into consideration (except insofar as some are higher-level, and of course LISP can be made to do some of this, with its fancy macro system to hide cruft at will).

Maybe "telescopic code"?

Miscellaneous links

Down to the last three open in my browser, but categorization fatigue has overwhelmed me.
  • CloudMade has a neat Javascript API for online maps.
  • Stanford has Protovis, a visualization library for statistical things.
  • ArsTechnica reports on a Georgia Tech project, Kermit, to permit ease of management for home networks (throttling bandwidth, etc.). My router is hopelessly inadequate for these tasks, but on the plus side it was cheap. I'm thinking of loading Unix on a router and doing things right. Real Soon Now.

Data modeling

So here's a thoughtful post about why relational databases aren't always the right thing to do. TL;DR - Oracle's standard order management model has 126 tables, and surely that's sometimes overkill.

This is exactly what I mean with the notion of semantic programming. Semantic structures have sliding levels of detail, and the actual number of tables or location of things in RDBMS or NoSQL or whatever should have no effect whatsoever on the actual semantics of the problem domain.

So: food for thought. It's a good article.

A new approach to comments

Interesting. Mozilla and PBS are both posting interesting things about rethinking the comments system we all love and hate. (It's the same project, just from two perspectives.)

Target domain: project management

OK, OK, this is another clogged space. Still - I'd like to look closely at the semantics. Unfuddle.com is yet another target app. They make money, so ... I could, too, presumably. Why not? The key is to be able to iterate faster than other humans, and making your specific software a fungible expression of a fixed semantic domain is definitely the way to go.

Learning Modern 3D Graphics Programming

An excellent C++/OpenGL-based tutorial, as yet unfinished.

Patterns/boilerplate/idioms linkdump

Design linkdump

You can always tell when my browser gets so full I can't see the icons in the tab headers.
  • 39 CSS3 box shadow tricks
  • Umm... I thought I had more than that open for design. Apparently not.

Target application: wireframe / chart / whatever

This space is beyond full, of course. Here are a couple more:

Natural language linkdump

Some interesting things coming up in natural language this month.
  • Text-processing.com has some online NLTK demos and offers NLP as an API! Cool stuff!
  • Article content extraction (once the exclusive domain of Readability and its ports) is now getting a little more coverage: the Goose library (Java, sadly).
  • A paper on semantic analysis. I need more sleep even to read the title, apparently.
  • Tag extraction from text. !!! The Apresta tagger library.

Algorithmic trading: field studies

Ha. Still a topic dear to my heart, although I think I'm too late to the table.

Webapp theme boilerplate

I'm really getting into the notion of boilerplate. Here's an article presenting a couple of Webapp themes.

Angry Birds on Chrome + evolution = WIN!!

So Angry Birds is now available in a Javascript port on Chrome. I have yet to really study it beyond, you know, playing it for a couple of days to see what all the fuss is about (and yeah, it's a pretty addictive game!). So I had this cool idea, as one does.

Back when tower defense games were all the rage, I spent a little time developing some tools to play tower defense for me. It was more challenging than it sounds - but only because TD games are all in Flash, and Flash has no machine-accessible output or state beyond its actual screen output. Screen capture is slow. So actually responding to screen output is essentially impossible. (Not to mention the shocking dearth of easy-to-use OCR libraries, which seems still not to have been resolved - and it's 2011!)

Anyway, the description of a level is presumably in a nice little string. That string could be evolved with a GA - making new levels that people could play in a Web2.0 fashion, providing grist for the evolution! But that would require a lot of people.

So why not evolve playing strategies as well? This would consist of a list of pullback coordinates and delays. The ease with which a given population could evolve a playing strategy would allow the calculation of a "playability metric" - and that in turn would be an evolutionary metric for the new levels.

So by harnessing two levels of evolution, you could (maybe) generate an arbitrary number of entertaining Angry Birds levels.
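Here's a sketch of the inner half of that idea, the playability metric, under heavy assumptions. The `simulate` function is a toy stand-in for the game: a level is a list of target positions between 0 and 1, a strategy is one number per shot (no delays), and a target counts as hit if any shot lands close to it. None of this reflects how Angry Birds actually represents levels.

```python
import random


def simulate(level, strategy, tolerance=0.05):
    """Toy stand-in for the game: count targets with a shot landing nearby."""
    return sum(
        any(abs(shot - target) <= tolerance for shot in strategy)
        for target in level
    )


def playability(level, rng, pop_size=20, max_generations=50):
    """Share of the generation budget left when a strategy first clears the level."""
    pop = [[rng.random() for _ in level] for _ in range(pop_size)]
    for generation in range(max_generations):
        pop.sort(key=lambda s: simulate(level, s), reverse=True)
        if simulate(level, pop[0]) == len(level):
            return 1.0 - generation / max_generations
        survivors = pop[: pop_size // 2]
        # Each survivor also contributes a mutated copy, clamped to [0, 1].
        mutants = [
            [min(1.0, max(0.0, x + rng.gauss(0, 0.1))) for x in s]
            for s in survivors
        ]
        pop = survivors + mutants
    return 0.0   # never solved within the budget


rng = random.Random(1)
levels = [[rng.random() for _ in range(n)] for n in (1, 2, 3, 4)]
scores = {tuple(level): playability(level, rng) for level in levels}
ranked = sorted(levels, key=lambda lv: scores[tuple(lv)], reverse=True)
print(ranked[0], scores[tuple(ranked[0])])
```

The outer level of evolution isn't shown: it would breed new levels from the ranked list, with this score (or its distance from some "not too easy, not too hard" sweet spot) as the fitness.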

Cool, huh?

Image processing

After many, many years, my sister and I have convinced my mother that all those decades of old photographs need to be scanned and curated. My sister being a CPA, she has a nice document-feeding scanner. So I ran a few test runs, and it turns out that at 600 dpi full-color resolution, any dust at all inside the scanner leaves vertical stripes on the scanned image that are ... not really monochromatic, but sort of a transparency of a monochromatic stripe.

Naturally, I figure there must be software out there to help me remove those stripes. My best strategy so far is to scan each picture twice, once upside down, so that the stripes will be in different places - then do something to recognize the stripes and eliminate them using the corresponding places on the other image.
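The two-scan strategy can be sketched like this, with some big simplifying assumptions: grayscale images as plain lists of rows, stripes exactly one pixel wide, and the two scans already perfectly registered once the second is rotated back (which is the genuinely hard part in practice). All function names are my own.

```python
def rotate_180(img):
    """Turn the upside-down scan the right way up."""
    return [row[::-1] for row in img[::-1]]


def column_gap(a, b, x):
    """Mean absolute difference between the two scans in column x."""
    return sum(abs(ra[x] - rb[x]) for ra, rb in zip(a, b)) / len(a)


def column_roughness(img, x):
    """How much column x sticks out from its left and right neighbours."""
    width = len(img[0])
    left, right = max(x - 1, 0), min(x + 1, width - 1)
    return sum(abs(row[x] - (row[left] + row[right]) / 2) for row in img) / len(img)


def destripe(scan, flipped_scan, threshold=10):
    """Return a copy of scan with striped columns patched from the other scan."""
    a = [row[:] for row in scan]
    b = rotate_180(flipped_scan)
    for x in range(len(a[0])):
        # The scans disagree here, and ours is the one that looks like a stripe.
        if column_gap(a, b, x) > threshold and column_roughness(a, x) > column_roughness(b, x):
            for row_a, row_b in zip(a, b):
                row_a[x] = row_b[x]
    return a


# Synthetic test: a smooth gradient "photo", with dust at scanner column 2.
clean = [[10 * x + y for x in range(6)] for y in range(4)]
scan = [row[:] for row in clean]
flipped_scan = rotate_180(clean)          # same photo fed in upside down
for row in scan:
    row[2] += 80
for row in flipped_scan:
    row[2] += 80                          # the dust stays put in the scanner
repaired = destripe(scan, flipped_scan)
print(repaired == clean)
```

After rotating the second scan back, its stripe lands in a different column of the photo, which is what lets each scan patch the other.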

So far this is a Hard Problem. Here is a link dump of some of the things I've run across while researching it:
  • The CImg library - and here I thought ImageMagick was all there was!
  • The hdrprep script - a Perl script to manipulate imagesets prior to stitching them together to average out their light levels (HDR = High Dynamic Range, a very neat technique)
  • ALE, which is a tool that works magic on images of such refined scope that I can't even truly understand the explanatory blurb - except that I know what registration is, and it's Good Stuff. This only runs on Linux, as a command-line utility, so it's going to take some actual effort to use it because I'm lazy and my Linux box is downstairs and thus requires ssh to hit.

Friday, May 13, 2011

Semantic representations

Here's a little article musing about HTML escaping: the tl;dr is that you need to know your application before really being able to do a perfect job. This being not so far from the problem with i18n and l10n of resources, it reminded me of an idea I had, which is to produce a generic library for the higher-level manipulation and preparation of program output.
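A quick illustration of the "know your application" point, using only the Python standard library: the same raw string needs a different escaping depending on where in the output it lands. The sample string is made up.

```python
import html
import json
from urllib.parse import quote

raw = 'Tom & "Jerry" <3'

# Text between HTML tags: only &, < and > need escaping.
as_html_text = html.escape(raw, quote=False)

# Inside a quoted HTML attribute: quotes must be escaped as well.
as_html_attr = html.escape(raw, quote=True)

# As a single URL path or query component: percent-encode everything unsafe.
as_url_part = quote(raw, safe="")

# As a JavaScript/JSON string literal. Note this alone is not enough inside an
# inline <script> block, where a "</script>" in the data would end the script.
as_js_string = json.dumps(raw)

for label, value in [
    ("html text", as_html_text),
    ("html attr", as_html_attr),
    ("url part", as_url_part),
    ("js string", as_js_string),
]:
    print(f"{label:10} {value}")
```

Four destinations, four different answers, and none of them is right for the others.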

It's kind of in a sketchy state, but - the generation of natural language being much easier than its parsing, it might be a good place to start with more flexible output design.

MediaWiki parsing

... is apparently a hard nut to crack. I find this a little surprising. Here's an interesting article about a rigorous parsing approach (Dropbox backup) and the HN thread.

Here's an example parse.

Some excellent design patterns for signups and logins

Smashing Magazine has a truly wonderful article musing about best practices for signups and logins. Read it.