Ooh. The "Readability" JavaScript tool munges a page to pull out its "content" (that poorly defined part of the HTML that the humans actually read) and put it into a separate area for actual, well, reading, minus all the ads and links and sidebars and so on.
That algorithm has been ported to Perl as HTML::ExtractMain. So that's going into WWW::Declarative.
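To make the "content vs. everything else" idea concrete, here's a toy sketch in Python. To be clear, this is not the real Readability algorithm and not HTML::ExtractMain's API. It's a simplified heuristic I'm assuming for illustration: score each block element by how much plain text it holds, penalize link text, and keep the winner. The names `ContentScorer`, `extract_main`, and `BLOCK_TAGS` are made up for this example.

```python
from html.parser import HTMLParser

# Tags treated as candidate content containers (an assumption for this sketch).
BLOCK_TAGS = {"div", "article", "section", "td"}
# Tags whose text never counts toward any score.
SKIP_TAGS = {"script", "style"}


class ContentScorer(HTMLParser):
    """Collects text per block element, tracking linked vs. plain characters."""

    def __init__(self):
        super().__init__()
        self.stack = []       # currently open candidate blocks, innermost last
        self.finished = []    # candidate blocks that have been closed
        self.link_depth = 0   # > 0 while inside an <a> element
        self.skip_depth = 0   # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.stack.append({"tag": tag, "text": [], "plain": 0, "linked": 0})
        elif tag == "a":
            self.link_depth += 1
        elif tag in SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            # Only close a block when it matches the innermost open one;
            # stray end tags are ignored to keep the sketch simple.
            if self.stack and self.stack[-1]["tag"] == tag:
                self.finished.append(self.stack.pop())
        elif tag == "a" and self.link_depth:
            self.link_depth -= 1
        elif tag in SKIP_TAGS and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse runs of whitespace
        if not text or self.skip_depth or not self.stack:
            return
        block = self.stack[-1]  # text is credited to the innermost block only
        block["text"].append(text)
        if self.link_depth:
            block["linked"] += len(text)
        else:
            block["plain"] += len(text)


def extract_main(html):
    """Return the text of the block with the best plain-minus-linked score."""
    parser = ContentScorer()
    parser.feed(html)
    parser.close()
    candidates = parser.finished + parser.stack  # include unclosed blocks
    if not candidates:
        return ""
    best = max(candidates, key=lambda b: b["plain"] - b["linked"])
    return " ".join(best["text"])
```

Feed `extract_main` a page with a navigation div full of links and a post div full of sentences, and the post div should win, since link-heavy blocks score at or below zero. The real thing is a lot smarter about class names, nesting, and paragraphs, so treat this as the general shape of the idea only.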
Ran across this list of static blog builders provided by a static-page hosting company. One in particular, blogc, stood out as it's a command-line ANSI C tool. That's kinda neat.
As usual, this kind of thing just screams "semantic domain" to me and begs analysis from that standpoint. Soon, compadres. Soon....