Wednesday, October 27, 2010

The only thing that could improve this would be an automated form that lets you insert different people's names into it

Via HNN, like so many other posts:

I really enjoyed that essay by Paul Graham. Paul Graham is an excellent writer and a very nice fellow. But when he said that thing that made me look bad, I just had to draw the line. For years, I’ve been doing something and telling people I’m doing it and then all of a sudden Paul Graham comes along and tells me it’s a bad idea. I think it’s time to question his assumptions.

Yeah. So why is putting up a form so hard? Answer: it really isn't, except you always have to look things up again. That's the kind of background logistics that a smooth system would relieve you (read: me) of.
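The name-insertion part, at least, is tiny. Here is a minimal sketch using Python's standard string.Template; the `ESSAY_RESPONSE` and `fill_in` names are made up for illustration, and the template text is just the opening of the quote above with the name swapped for a placeholder. The web form around it is the part you still have to look up.

```python
from string import Template

# Opening of the quoted text, with the name replaced by a placeholder.
ESSAY_RESPONSE = Template(
    "I really enjoyed that essay by $name. $name is an excellent writer "
    "and a very nice fellow. But when he said that thing that made me "
    "look bad, I just had to draw the line."
)


def fill_in(name):
    """Return the template text with the given person's name inserted."""
    return ESSAY_RESPONSE.substitute(name=name)


if __name__ == "__main__":
    print(fill_in("Paul Graham"))
```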

Freelancing with Ruby on Rails

An interesting post on tips and techniques.

Link dump: journalism and data sets (mostly)

From lecture #6 in the P2PU course, we have followthemoney.org (that sounds very familiar), the NICAR data library, the Sunlight Foundation, and DocumentCloud's open-source foundations.

Tuesday, October 26, 2010

Startup::Declarative

Ha. Just a thought: a declarative language describing the structure of a startup company. It would hook into workflow, maybe a site description language ... ??? 3. Profit.

String matching algorithms

Here. For once, a link from Metafilter, not HNN.

Saturday, October 9, 2010

Further thoughts on Color::Declarative

The basic object in Color::Declarative is, of course, the color:
color green
or
color (SVG) blue
There is also a dictionary object to create custom dictionaries:
color-dictionary my-palette
If we're not using a custom dictionary, then we use Color::Library for named dictionaries, and the default there is the SVG dictionary as a kind of general catchall.

We can also define a whole new custom color from scratch like this:
color my-green #00ff02
But we won't normally do this. Instead, we'll usually define a color by its function:
color button-color (SVG) green
Then we can use it elsewhere. And we can do that for a whole set of colors like this:
color-dictionary my-palette
    button (SVG) green
    titlebar (SVG) blue
Then we can define a button something like this:
button fahrenheit (x=130, y=50, color=button) "Fahrenheit"
We could also imagine a palette defined something like:
color-dictionary
    primary (SVG) { $retrieved_value }
    secondary (SVG) primary.complement
I don't like that poorly considered syntax, but the point is that we should be able to build sophisticated palettes based on functional relationships.
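To make "functional relationships" concrete, here is a rough sketch in Python rather than in the declarative syntax. It assumes that "complement" means rotating the hue by 180 degrees, which is only one possible definition, and the `SVG` table, `complement`, and `build_palette` names are hypothetical stand-ins, not anything from Color::Library.

```python
import colorsys

# Hypothetical stand-in for a named dictionary; just two SVG color names.
SVG = {"green": "#008000", "blue": "#0000ff"}


def complement(hex_color):
    """Return the color with its hue rotated by 180 degrees (HLS space)."""
    r, g, b = (int(hex_color[i:i + 2], 16) / 255 for i in (1, 3, 5))
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    rotated = colorsys.hls_to_rgb((h + 0.5) % 1.0, l, s)
    # Clamp to guard against tiny floating-point overshoot.
    channels = [min(255, max(0, round(c * 255))) for c in rotated]
    return "#{:02x}{:02x}{:02x}".format(*channels)


def build_palette(primary_name):
    """Derive the secondary color from the primary instead of naming it."""
    primary = SVG[primary_name]
    return {"primary": primary, "secondary": complement(primary)}


if __name__ == "__main__":
    print(build_palette("blue"))
```

The point of the sketch is only that `secondary` is never named directly: change the primary and the rest of the palette follows.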

Wednesday, October 6, 2010

Text analysis - identification of sources in news articles

So I'm taking this online class about journalism, and one of the exercises is to identify the sources in a news article. By hand, of course, this is easy. Wouldn't it be nice to automate it (even partially)?

Of course, nothing is easy when natural language is concerned. I see two parts to this, clearly. First is taking a page and extracting the news item. Frankly, I don't see any better way to do this than simply to have a bunch of definitions for different news services that could identify the CSS classes used by each of them to mark their payload text. And this is exactly the kind of task that a pattern-matching language would be dandy for.
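As a sketch of that first part, the per-service definitions could be as simple as a lookup table from site to CSS class, plus a small parser that keeps only the text inside the matching element. This uses Python's standard html.parser; the hostnames and class names in `PAYLOAD_CLASSES` are invented placeholders, not the real markup of any news service.

```python
from html.parser import HTMLParser

# Hypothetical definitions: hostname -> CSS class marking the article body.
PAYLOAD_CLASSES = {
    "news.example.com": "story-body",
    "daily.example.org": "article-text",
}

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class PayloadExtractor(HTMLParser):
    """Collect text found inside the first-level element with a given class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.depth = 0  # greater than 0 while inside the payload element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif self.css_class in (dict(attrs).get("class") or "").split():
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def extract_payload(host, html):
    """Return the whitespace-normalized article text for a known host."""
    parser = PayloadExtractor(PAYLOAD_CLASSES[host])
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.chunks).split())
```

It assumes reasonably well-formed HTML; a page with unclosed tags would throw the depth counter off, which is exactly the sort of thing a real set of definitions would have to cope with.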

Which leaves us with the text, and its analysis. Which is hard. I can think of a couple of ways to get some sources out of a given text: "'...' said x" is one obvious pattern. Without language-savvy tools, it would be a series of hacks, but maybe worth the effort. (With language-savvy tools, a lot of this stuff starts to look more amenable to solution, though, doesn't it?)
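Here is roughly what the first of those hacks might look like in Python: two regular expressions, one for "'...' said x" and one for "'...' x said". The assumption that a speaker is a run of capitalized words is mine, and it is crude (it misses initials like "J. Smith" and will misfire on text with unbalanced quotes); `find_sources` is a made-up name.

```python
import re

# A quotation in straight or curly double quotes.
QUOTE = r'["“]([^"”]+)["”]'
# Crude assumption: a speaker is a run of capitalized words.
NAME = r"[A-Z][\w'-]*(?:\s+[A-Z][\w'-]*)*"

SAID_THEN_NAME = re.compile(QUOTE + r",?\s+said\s+(" + NAME + r")")
NAME_THEN_SAID = re.compile(QUOTE + r",?\s+(" + NAME + r")\s+said\b")


def find_sources(text):
    """Return (speaker, quote) pairs in order of appearance."""
    hits = []
    for pattern in (SAID_THEN_NAME, NAME_THEN_SAID):
        for match in pattern.finditer(text):
            quote = match.group(1).rstrip(",")
            hits.append((match.start(), match.group(2), quote))
    return [(name, quote) for _, name, quote in sorted(hits)]


if __name__ == "__main__":
    sample = ('The vote was delayed. "We need more time," said Jane Doe. '
              '"It is a mess," Councilman Bob Roe said on Tuesday.')
    for speaker, quote in find_sources(sample):
        print(speaker, "->", quote)
```

Indirect attribution ("according to x", "x told reporters that ...") would each need their own pattern, which is where the series of hacks comes in.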

Mapping

A quick thought on mapping. A "document" object can map onto a PDF via a PDF builder, and a PDF object can map back onto a document object by means of some sort of extractor that recognizes tables and other structures with less-than-perfect accuracy. Point being that we've got different tools in the two directions, and that one direction (PDF back to document) might be lossy.

Forgive me if that was obvious. I told you these were more like notes for myself than a regular blog.