Wednesday, October 31, 2012

Using Dropbox as a database

Cool idea: use JSON files on Dropbox for database storage.  Built into Opa (a server-and-client JavaScript language that I've noted before).
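To make the idea concrete, here's a minimal sketch of a JSON-files-as-database store in Python. This is not Opa's API; the class name and methods are invented for illustration, and the folder is assumed to be any directory that a sync client such as Dropbox keeps mirrored.

```python
import json
from pathlib import Path


class JsonFolderStore:
    """Tiny key/value store: one JSON file per record in a folder.

    Hypothetical helper for illustration; the folder is assumed to be
    a locally synced directory.
    """

    def __init__(self, folder):
        self.folder = Path(folder)
        self.folder.mkdir(parents=True, exist_ok=True)

    def _path(self, key):
        return self.folder / f"{key}.json"

    def put(self, key, record):
        # Write under a temporary name first, then rename, so a sync
        # client is less likely to pick up a half-written file.
        tmp = self._path(key).with_suffix(".tmp")
        tmp.write_text(json.dumps(record, indent=2), encoding="utf-8")
        tmp.replace(self._path(key))

    def get(self, key, default=None):
        path = self._path(key)
        if not path.exists():
            return default
        return json.loads(path.read_text(encoding="utf-8"))

    def keys(self):
        return sorted(p.stem for p in self.folder.glob("*.json"))
```

One file per record keeps sync conflicts local to a single record, which seems like the main thing you'd want from storage that other machines can write to.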

Circular

Circular is a Backbone.js application built with PHP and Bootstrap, using MongoDB for storage.  It lets you queue up tweets in advance and have them sent to Twitter on a schedule. And it's open source.  Good boilerplate extraction example.

Sunday, October 28, 2012

Ruby

I should learn Ruby.  Starting here.

Haskell

A pretty hilarious take on the state of Haskell package management.  The highlight: Haskell people don't know when they're shooting themselves in the foot because they can't remember what it's like to have a foot without a bullet wound.

Software approaches to memory training

Language learning is a subject near to my heart, of course, so here's an interesting little piece in Salon about language software.  There's plenty of room in the market.

Also, HVPT word pair training.

Saturday, October 27, 2012

GEOS

An open-source geometry library.

Condensing fact from the vapor of nuance: security

So yeah, it's possible to instrument your own virtual memory on a VM running on a shared processor in the cloud in order to detect the differences in instructions run by an SSL key decoder and glean enough (noisy) information from that to reconstruct the private keys used to do the decoding.

That just blows my mind.

Friday, October 26, 2012

Here's a to-do manager built in PHP on a NoSQL backend

This is something small enough to work as a good first step in understanding PHP architecture: a to-do list manager that stores JSON in a NoSQL database.  Cool!

AsciiDoc

Learned about a new markdown variant today, AsciiDoc [cheat sheet].  It's in Python, but that doesn't necessarily mean it's bad.  (ha)  I'm considering using it for my notes application.

Also, there's talk about standardizing Markdown.  Interesting.

Thursday, October 25, 2012

How to use SSH for fun and profit

I've never been good at SSH.  Here's a great blog post about using it.

Also: ssh-copy-id.

Druid: open-source real-time analytics store

Druid has just been open sourced.  It appears to be a competitor for Hadoop.

Prose

Prose is an online editor for your Github documents.

Zoomstra

A training workbook service.  Pretty cool, actually.  This is both a target application and a monetization strategy.

Substance Document API

This is kind of interesting.  A company named "Substance" has published an API for documents, to be used for Web applications involving group editing of documents - with annotations and a change history.

That's pretty neat.

Evolving regexps

This is neat.

History of the United States in 141 maps

DDD-based page that really rocks.

Bot links

I'm collecting too much stuff in the tabs, so here's a little link dump, some of which is redundant within this blog:

  • Scrappy (blog post by the author; it's been rebuilt on Moose)
  • WWW::Wikevent::Bot - a useful example of a bot
  • Javascript in Perl!  Seems a little static (2010) but it's better than anything I've built yet.
  • Example code for Mechanize - good for reimplementation, you see.
  • TreeBuilder.  Everything coming into Bot::Page will be parsed.
  • RDF::Scutter.  I still don't know what this does, except maybe it's gathering semantics?
  • The Spidering Hacks book (2003) from O'Reilly.  Mine it for examples.


Web scraping

And yet I just can't get past thinking about Web scraping as something fun and profitable.

If it could just be simplified, a lot.  So I'm thinking again about declarative means of describing the "shape" of a site in terms of where the useful data is - and I'm coming up empty.  Again.

The only way to get my mind around it is to build some Web scrapers.  Elance is not going to be an interesting place to find challenging scraper specifications, so I'm going to have to look at the ones on ScraperWiki and go from there.

Oh, ho! The Mechanize Cookbook is replete with interesting examples.  I shall start there.

Update: Those seem boring and old.  Instead, I've subscribed to the ScraperWiki mailing list, which involves requests to the masses.  Here's a cool one already: find all the churches in Germany, with lots of links to start with.  So yeah.

Wednesday, October 24, 2012

The state of job boards

The last time I looked at Elance was 2008, and there were reasonable amounts of jobs by people who seemed to know what they were doing.  Now I'm looking there and it's all people who have a fantastic idea that they want implemented for five bucks.  (Or so it seems.)

And none of it seems well-specified.

Maybe that particular strategy is dead.

Tuesday, October 23, 2012

Mathematics/computer algebra

So SFTP under Perl requires Net::SSH::Perl, which in turn relies on Math::Pari, which interfaces to PARI [wiki], a computer algebra system that implements a lot of number theory algorithms.  PARI is a library that is normally accessed through GP, a scripting language specifically written for it (the two ship together as PARI/GP).

Here's the thing.  Math::Pari doesn't install on Windows (nothing I tried tonight installed on Windows) and, reading into it, the implementer of Math::Pari seems like a real ... ahem, doesn't seem to play well with others.  Math::Pari will only install if you have built PARI on your own machine; it requires the PARI build directory to build the Perl module.  Period.  Unfortunately, it doesn't react at all well to the current version of PARI.  Like, "Perl dies" levels of poor reactions.

I'd like to wrap PARI a little better.  Maybe Inline::GP or something, I don't know, but there is most definitely room for improvement.

But that aside, I ran across Sage again.  Sage is essentially an open-source mathematics Swiss Army chainsaw.  What it is, is Mathematica re-implemented on open-source, and it includes PARI, SymPy, and a boatload of other open-source tools of that nature.

There's a whole comparison list here.

There, my friend, is a domain crying out.

Monday, October 22, 2012

The Prime Pages

Model this.  It's a database of information about prime numbers.  Way Web 2.0 before that was even a concept (founded 1990-freaking-4, before even I got into online databases...)

Sunday, October 21, 2012

uiji.js

... is JavaScript in reverse.  I don't know what it means, but it all looks neat.

Saturday, October 20, 2012

Carbon: compiles to C

This is pretty neat: a really thin OO C extension (libco2) that has a higher-level C-like language defined on top of it (Carbon).  Carbon supports direct object orientation, exceptions, and some other stuff, and compiles to C.

It compiles to C.  Like Decl could compile to Perl.  (Or to C, I guess.)

Friday, October 19, 2012

Visual programming doesn't work because it fails to scale

Good post on visual programming, the point being that it can't deal with any complexity at all without becoming incomprehensible.

Although you kind of think that maybe it could still work if it's just organized well enough.  There should probably be a limit of 7±2 items on any given view.

Online JSON editor

This is kinda neat: http://jsoneditoronline.org/

LispyScript

A macro-enabled JavaScript variant.  Now that's what I call cool!

Free texts on machine learning

Someday I'll have time to get back to ML.

Free email detector API

Interesting little service: given an email address, detect whether it's a one-time use address.  Available via API.

More jQuery plugins than you can shake a stick at

If you use jQuery, then something here is surely going to be useful.

Thursday, October 18, 2012

PostgreSQL and Redis

PostgreSQL can talk to anything.  Like Redis.

Learning things

OK, so flashcard learning is a discipline that has some serious history.  Here's a really refined flashcard algorithm that pumps words into your head.
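I haven't reproduced the linked algorithm here. As a stand-in, this is a sketch of the classic Leitner box scheme, the simplest spaced-repetition scheduler I know of; the interval values and all names are illustrative assumptions.

```python
from dataclasses import dataclass

# Review interval in days for each Leitner box (illustrative values).
INTERVALS = [1, 2, 4, 8, 16]


@dataclass
class Card:
    front: str
    back: str
    box: int = 0      # index into INTERVALS
    due_day: int = 0  # day number on which the card is next shown


def review(card, correct, today):
    """Move the card between boxes and schedule its next review."""
    if correct:
        card.box = min(card.box + 1, len(INTERVALS) - 1)
    else:
        card.box = 0  # a miss sends the card back to the first box
    card.due_day = today + INTERVALS[card.box]
    return card


def due_cards(cards, today):
    """Cards whose review day has arrived."""
    return [c for c in cards if c.due_day <= today]
```

Words you keep getting right drift out to longer intervals, and words you miss come back tomorrow. More refined algorithms mostly tune how those intervals grow.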

On that note, another language learning program (a game, for Bulgarian).

Signal processing primer

What it says on the tin.

Wednesday, October 17, 2012

Rap Genius

This is slick, very slick: Rap Genius (link goes to transcript of second Romney/Obama debate) allows you to post lyrics to a song (or any other fixed text) and highlight individual words or phrases with popup notes.  (Its original name was Rap Exegesis, which I prefer.)  Neat idea!

They want to expand it to law, a great application.

Okapi workflow manager

Perusing my own blog over at Xlat-perl, I was reminded of Okapi's workflow manager.  It strikes me that this is where I want to go with the "low-level business process language" thing of files and directories and FTP servers I envision.  Tagged as "workflow" for lack of a better idea, although it's not really workflow without managing tasks for humans.

Bountify

Neat little programming-snippet bounty site.  Looks like some fun stuff there!  If it catches on, it'll be interesting to see how the data trends.  Tagged as a "job source" - but it isn't really that.  The bounties are more like prizes than pay.

PHP comment-style annotations not good

Nice post on PHP annotations in comments and why they're probably a mistake - but indicating that metadata in code is not a mistake.  And even offering an alternative that doesn't suck!

Personally I think treating PHP as the compilation target for a macro language is probably the better approach, but maybe that's a claim requiring extraordinary proof that I can't yet offer.

RJMetrics

Some sort of data science toolset?  Check it later.  Nice blog post!

Data structures

An open textbook in data structures.

Dbpatterns.com

Dbpatterns.com is a neat gallery/forking site for database patterns - it seems a little thin (you don't seem to be able to download patterns in a machine-readable form, for example), but the site itself is on Github and the idea is groovy.

Take that and make it machine-readable in a schema description format that can be composed and used to output SQL and man, I love it.
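Roughly what I have in mind, as a sketch: patterns are plain data (table name to column name to SQL type), composition is a merge, and SQL is just one output. The format and both example patterns are made up for this illustration; nothing here comes from Dbpatterns itself.

```python
def create_table_sql(name, columns):
    """Render one CREATE TABLE statement from a {column: type} mapping."""
    body = ",\n".join(f"    {col} {sqltype}" for col, sqltype in columns.items())
    return f"CREATE TABLE {name} (\n{body}\n);"


def compose(*patterns):
    """Merge patterns; later patterns add tables or extra columns."""
    merged = {}
    for pattern in patterns:
        for table, columns in pattern.items():
            merged.setdefault(table, {}).update(columns)
    return merged


def to_sql(schema):
    """Render a whole schema as a sequence of CREATE TABLE statements."""
    return "\n\n".join(create_table_sql(t, cols) for t, cols in schema.items())


# Two hypothetical patterns, invented for this example.
USERS = {"users": {"id": "INTEGER PRIMARY KEY", "email": "TEXT NOT NULL"}}
TIMESTAMPS = {"users": {"created_at": "TEXT", "updated_at": "TEXT"}}
```

Calling `to_sql(compose(USERS, TIMESTAMPS))` yields a single `users` table carrying the columns of both patterns.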

Deployment to a VM

So testing deployment should be pretty easy if I have a VM to test against.  (I'm looking at VMs again due to OpenLogos, natch).  The idea is basically to fire up a blank VM of the OS in question, then run scripts against it to build ... whatever.  The "whatever" then being a deliverable VM, but built using a deliverable, repeatable script.  And that script could then itself be an expression of the higher-level deployment description language I envision (and that already exists in other deployment tools).

Ah.  VirtualBox has a command-line interface.
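A sketch of what the scripted build could look like, driving `VBoxManage` from Python. The VM name, snapshot name, provisioning script, and output file are all placeholders, and the plan is only printed unless `dry_run=False`.

```python
import subprocess


def deployment_plan(vm, snapshot, script, output):
    """Return the commands for one scripted build, in order."""
    return [
        # Roll the VM back to a snapshot of the blank OS.
        ["VBoxManage", "snapshot", vm, "restore", snapshot],
        # Boot without a GUI window.
        ["VBoxManage", "startvm", vm, "--type", "headless"],
        # Hypothetical host-side script that waits for the guest to
        # boot and then provisions it (over SSH, say).
        ["sh", script],
        # Hard power-off; a real script would shut the guest down cleanly.
        ["VBoxManage", "controlvm", vm, "poweroff"],
        # Package the result as the deliverable VM.
        ["VBoxManage", "export", vm, "--output", output],
    ]


def run(plan, dry_run=True):
    """Print the plan, or execute it and stop at the first failure."""
    for cmd in plan:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

The plan being a plain list of commands is the point: it's exactly the thing a higher-level deployment description could compile down to.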

Downloadable Oracle development VMs

Wow - this would have been nice back when I was developing an Oracle adapter for my workflow toolkit: Oracle now has downloadable development VMs available for their enterprise software, explicitly marked as no support and not suitable for production use but otherwise free of charge for development.

Very cool!

Data analysis languages

A very nice comparison of MATLAB/Octave, R, Julia, and a little Python at the end.

Monday, October 15, 2012

Syte: Personal site boilerplate (kinda)

Syte is a Github project that allows you to customize a few things and have a personal Django-based, mega-socially-interactive Website up on Heroku at breakneck speed.  You just fork it, run it, and customize it with its own tools.

Neat!

Sunday, October 14, 2012

Pattern: Python web-mining system

Drool.  Of course, nothing restricts Decl to Perl...

Visualization in Perl (and not in Perl)

There was a post to the Quantified Onion group this week about visualization in Perl.  Here are some useful links that arose from that discussion.

Another landing page tester

Here's a code-free landing page builder that gives you the ability to let customers select a pricing model preference during their interest registration.

Mingw package manager yypkg

The open-source takeover of Windows continues.

Exotic data structures and concatenative languages

This is cool!  This is a page about some exotic data structures (useful in specific algorithmic situations), which is already pretty neat.  But it's on a Wiki - a very fast Wiki - about "concatenative" languages, by which they mean Forth and Forth-like languages, and that Wiki itself is written in Factor, a concatenative language.

Update: Reddit responds with actually exotic data structures at MIT.  Ha.

Friday, October 12, 2012

The problem with taking classes

It certainly feels as though the classes I started (and still didn't manage to finish) drained the creative impulse from my other work - as though they took the spare clock cycles in my head and used them all up, leaving nothing left over but classes and paying work.

Citrus Perl

I think this isn't actually new, but Citrus Perl is a thing: a distribution of Perl specifically designed to support Wx - and to repackage itself as your standalone application.  This is a gamechanger.

If I were to finish (or quasi-finish) Wx::Declarative, then repackage a Citrus-like distribution as Decl, I could really have something pretty powerful, right out of the box.

That deserves thought. (Note: Mark Dootson's blog is here.)

XML fever

A tongue-in-cheek guided tour of a few misconceptions people have about XML and how they're solved by mechanisms in the XML ecosystem.  Very interesting stuff!

SmallVCM

A small renderer for 3D scenes using a variety of different algorithms.

Sphinx documentation tool

So Sphinx is a tool used to write nice documentation for software.  It's based on DocUtils, which I'd already heard of, but apparently does a lot more; I need to read more about it but I do have to admit it looks very nice indeed.

I came to it through a blog post by Brandon Rhodes with a useful technique: if you're generating documentation from text files that won't be read by the end user, then break lines at major phrases and periods.  This permits you to revise your text while allowing the VCS to see where the changes really are (that is, you don't need to rearrange your text to reflect margins).
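A rough sketch of the sentence-level half of that technique (my own, not from Rhodes's post): re-flow a paragraph so that each sentence sits on its own line. Breaking at major phrases is a judgment call and is left to the writer.

```python
import re


def semantic_breaks(text):
    """Put each sentence on its own line, splitting after . ! or ?"""
    flat = " ".join(text.split())  # undo existing hard wraps
    sentences = re.split(r"(?<=[.!?])\s+", flat)
    return "\n".join(sentences)
```

It's naive about abbreviations ("e.g. this" gets split after the period), so treat it as a first pass rather than a formatter.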

Declarative dataflow coding in the wild!

Wow!  The folks at Prismatic, who use Clojure to build newsfeed aggregators to spec, have developed a declarative dataflow graphing structure to describe their workflows - it's fab!
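To pin down what "declarative dataflow graph" means to me, here's a small Python sketch of the general shape. It is not Prismatic's library or API: a graph is a dict of named functions, and each function's parameter names say which inputs or other nodes it depends on.

```python
import inspect


def run_graph(graph, **inputs):
    """Evaluate every node, resolving dependencies by parameter name."""
    values = dict(inputs)

    def resolve(name, stack=()):
        if name in values:
            return values[name]
        if name not in graph:
            raise KeyError(f"no input or node named {name!r}")
        if name in stack:
            raise ValueError(f"cycle through {name!r}")
        fn = graph[name]
        params = inspect.signature(fn).parameters
        args = {p: resolve(p, stack + (name,)) for p in params}
        values[name] = fn(**args)
        return values[name]

    for node in graph:
        resolve(node)
    return values


# An illustrative graph: the order of the entries doesn't matter.
stats = {
    "variance": lambda xs, mean, n: sum((x - mean) ** 2 for x in xs) / n,
    "mean": lambda xs, n: sum(xs) / n,
    "n": lambda xs: len(xs),
}
```

`run_graph(stats, xs=[1, 2, 3, 6])` computes `n`, `mean`, and `variance` in dependency order. Because the graph is plain data, the same description could also be analyzed or drawn without being run.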

This is definitely going into Decl! I can't tell you how this resonates with what I am trying to approach.

Game of Life: modern variants

While I wasn't paying attention, interesting things have been going on in the Game of Life.  First: a video of a floating-point version of Life that looks a lot more organic, fascinatingly. [code] [paper]  Tim Hutton made the video in Ready, a cellular automata exploration platform.  Neat!

Then, as HNN posts often do, this induced somebody else to post this freaky-deaky Life emulated in Life video.  This one uses Golly in its low-level Life grid - a Life platform that has scripting to build really refined structures.  People get really focused on this!

For a library of some of the well-known patterns, here's a pretty cool library site.

It's beautiful.

Thursday, October 11, 2012

One-man programming shops

Here's an interesting blog post bemoaning the fact that one guy can't really expect to have a service that is as refined as something a whole team has hammered out.  His example is notifications - people expect a notification workflow that is as perfectly refined as Facebook's, and it's too jarring if something doesn't work quite perfectly, whether that's because it took too long to notify, or there was a notification flood when twenty people opened your link, or ... whatever.

I'm not sure I buy this.  I mean, sure, it's true that a good team can work out something to perfection.  But a good team can also come out with iOS 6 maps - having more people on something is no guarantee of success.

And Facebook has a billion users - you can survive fine on five hundred.  They're offering what they offer - but you're offering something new, or something niche, and your users know you're not Facebook.  They really won't mind the occasional rough edge.  Those that do are callow wastes of your time - who leaves a good service because the logo was three pixels too wide?  Nobody.  Seriously.

But, all that said, this is why Best Practices and architectural patterns exist.  For something as universal as notifications, there should be some reference standard workflow, and ways to talk about it.  Every now and again you see a really nice survey article about best practices in, say, password reminders or order workflow - this is the kind of thing I'd like to see more of.

Skype IM worm

I got a Skype IM from a friend the other day that turned out to be a worm malware vector.  The cool thing was that the IM was in Hungarian (the friend is also Hungarian).  Impressive social engineering indeed!  Here's a CNET article on it.

Wednesday, October 10, 2012

Word link dump

So I got the Word template engine working in a rather simple form.  It can handle exactly one set of tabular information, and text in the columns is handled the way I want to handle it (each repeated field can appear in a column with other text in the first row, and subsequent repetitions of those fields will appear alone in cells inserted below the original appearance of the field).  It's a far cry from a really mature template engine, but it does what I need it to do right now, which is pretty slick.

During all that, I collected some Word automation links - mostly forum posts by people asking for help.  But I don't want to lose them. There's a great deal of Word expertise out there and I want to understand it better.

Anyway, full documentation of the Word API is ... well, there's a whole lot of functionality in Word.  I almost think the documentation/configuration of an API structure should come in modules or something, like "core text manipulation", then add reviewing and left-to-right text if you need them.  I don't know.  It's complicated - but worth doing.

Sentiment analysis

Cute and brief blog post on presidential sentiment analysis.

Openera

Automatically save, organize, and back up files and email attachments.

Divshot

Ought to clone this. Divshot is a UI webapp based on Bootstrap, with some kind of WYSIWYG interface.

Monday, October 8, 2012

Down the Word rabbit hole

I knew there was a reason I hadn't done this Word wrapper thing.  Word is freaking huge.  Did you know that Word's "formula" mechanism includes an entire dataflow computation engine?

The sheer vastness of Word's scope is humbling.  Actually setting it forth in a coherent form?  That's a daunting task indeed.

Sunday, October 7, 2012

PLEAC cross-language sample code site

Interesting effort - another Rosetta-Stone kind of programming language site taking all the example code from the Perl Cookbook (vintage 2001) and asking people to submit the equivalent in their language of choice.

This is another source of something along the lines of "transaction patterns", that is, higher-level "things to do" that are implemented in lower-level code.  A semantically oriented declarative language needs to be able to talk at that level.  Maybe instead of "transaction patterns" they should be called "action patterns" or something.  They're at a far lower level than architectural patterns.

Saturday, October 6, 2012

jQuery intermediate tips

jQuery is a pretty advanced environment.

Light Table

And then there's Light Table (of Kickstarter fame).  He also references CodeBubbles, which piqued my interest last year (but which never actually went anywhere).

Factorization diagrams in Haskell

Here's an excellent post about factorization diagrams (arranging dots to represent the prime factorization of a number), implemented in Haskell.  It seems like this would be a great example for a mapping application.
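The computational core is small enough to sketch in Python (the post itself is Haskell, and this is not a port of it): factor the number, then nest one level of grouping per prime factor. The nesting order used here is an arbitrary choice of mine.

```python
def prime_factors(n):
    """Return the prime factors of n in ascending order, with repeats."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)  # whatever is left is itself prime
    return factors


def layout(n):
    """Nested (copies, inner_shape) description of the diagram for n."""
    shape = "dot"
    for p in prime_factors(n):
        shape = (p, shape)  # p copies of the shape built so far
    return shape


def count_dots(shape):
    """Total number of dots in a nested layout."""
    if shape == "dot":
        return 1
    copies, inner = shape
    return copies * count_dots(inner)
```

For 12 this gives three groups, each holding two pairs of dots. A renderer would then place each level's copies around a circle.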

Brackets code editor

Slick!  Written in HTML/CSS/JavaScript with a thin native wrapper, and has some pretty interesting features, like active file sets independent of the directory structure and inline editors for related code sections between files.  Very neat.

Friday, October 5, 2012

Combining functional and imperative programming

Here's a nice post about a study!  Turns out functional programming isn't actually easier to use - in fact, this study showed that Scala's rigid type system made debugging harder.  The theory, of course, is that rigid typing should make debugging less necessary because many logical errors won't even compile.

The Scala programs were also smaller, but not by a whole lot.

The title, though, is misleading.  If I understand correctly, this isn't combining approaches so much as comparing them.  Decl should combine them.  Is there an Inline::Haskell?  (Answer: no. But there are some sorta-kinda things that look real Haskelly.)

Word and Perl

So the current Word plan is, as I say:
  • An OLE wrapper module
  • The Word (and Excel) module written on that
  • Higher-level Word tools written on that (templates, etc.)
  • A library of useful Word code - ongoing research project for "Word patterns"
To that end, a small link dump:
More on all this later: this weekend looks like time to start getting serious about the template module, and that means all this.

Speaking of type systems: XML

Here's a fascinating paper on XML, how it works, why it kind of sucks, and lots and lots of information.  One of those things you need to read a few times.

Greg Wilson on software engineering

I actually ran across this in August (yes, I have a six-week backlog on bookmarked things to post about - mostly I put them back into the stream in the same week I encountered them, but this one is really good and I want to think about it a little more than just posting a "this exists" post.)

Greg Wilson has been bopping around the software world for a long time now, and is concerned about the science of computer programming.  As in: there isn't one.  He has a fantastic slideshow here, which you can scan through in about three minutes, and it all leads up to his book "The Architecture of Open-Source Applications".

I've just grazed the surface with it, but before I spout off some thoughts, let me inject a couple more links:  http://www.neverworkintheory.org/ is a blog about software development research that is relevant in practice (says so on the header), http://software-carpentry.org/ is an organization teaching researchers how to code better, and Wilson's own blog at http://third-bit.com/ - those are the links at the end of the slideshow.

OK. So.  Architecture of open-source applications, yeah.  This is essentially a list of high-level descriptions of the shape of the code for 49 different serious open-source applications and a little rumination on choices made.  This is higher-level than Decl core is looking, but clearly a separate semantics of application architecture would be really nice to have.  To that end, I need to read this book, cover to cover.

What I'd like to do is then look at some of these applications at the code level, and build a "semantic framework" describing the code.  I'm not even sure what that means yet.  But I want to evolve towards code understanding code (for certain values of "understanding").

Dark Patterns

A fun Wiki documenting user interface patterns whose purpose is to scam the user.  Slow as molasses, but good work.

Z3: an efficient theorem prover

Z3 is a theorem prover being developed at Microsoft Research that has been published under a quasi-open source license this week.  Interesting.

Coursera course on neural networks

I'm tempted.  But so far my attempts to take online classes have ended in catastrophe.  It's hard to allocate that kind of time!

Type systems

It rankles, because I'm thoroughly a liberal-language kind of guy, but ... type systems [wiki] have their place, and a programming semantics that makes any claim to being comprehensive certainly can't avoid them.  What brought them back to my attention is a blog post by one of the creators of Rust.  The problem with the blog post is, I can't understand it.  I have thought so very little about type semantics over the years that it's as though the author were speaking a different language.

So: task.  Think about type semantics and how they might profitably be incorporated into Decl.  Clearly types will have to be a part of the declaration of a value, and equally clearly they'll have to be part of a functional part of the system (functional in the sense of functional language, a modality I also want to support due to its strongly declarative nature).

One more feature.  It really is going to be a kitchen-sink language.  I should probably just break down and go find a reasonably detailed syllabus of computer science and just put everything in there.  (I'm not actually kidding.  It's all "semantics of programming".)

On that note, see also Microsoft's new open-source typed JavaScript language that compiles to JavaScript: TypeScript.  This might honestly be a good place for me to start.

Essentially, I see type systems as providing assertions about the usage of certain constructs in the program.  At compile time (note: Decl has no compilation step, so "at checking time" - checking being a specific action in that case), the compiler checks everything you've told it and makes sure it's all consistent.  That's essentially all type checking is.  You can get very, very ramified with the semantics of your typing system, and it looks like that might be what the Rust blog post is about (the syntax is writing checks the semantics can't yet cash) - but the notion of offline examination of some subset of the logic of a program is a valuable one, and one that Decl could very easily embrace.

Especially since I'd like to see a more interactive dialog going on with the editor at some point - type checking should probably be going on all the time, as that's likely to be one more indicator of what you're thinking as you're coding.  So the editor could effectively be coming up with a list of questions you could think of, some of which would include, "Is that what you really meant?  Because you said 'x' was an integer up here."
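A toy version of that "checking" step, in plain Python rather than Decl: declarations are assertions about names, and the checker walks the assignments offline and turns every inconsistency into a question. The data shapes and function name are invented for the sketch.

```python
def check(declarations, assignments):
    """Return questions about assignments that contradict declarations.

    declarations: {name: type}, e.g. {"x": int}
    assignments:  list of (name, value) pairs, in program order
    """
    questions = []
    for name, value in assignments:
        declared = declarations.get(name)
        if declared is None:
            questions.append(f"'{name}' was never declared. Is that intended?")
        elif not isinstance(value, declared):
            questions.append(
                f"Is that what you really meant? You said '{name}' was "
                f"{declared.__name__}, but this is {type(value).__name__}."
            )
    return questions
```

Nothing is executed and nothing is rejected; the checker only reports where what you said and what you wrote disagree, which is about the level of interaction I'd want from the editor.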

Wednesday, October 3, 2012

ScalaNLP

And speaking of NLP, here's a whole Scala NLP project.

Latent Semantic Analysis

Another neat Python NLP tutorial.

C as an intermediate language

Fantastic, fantastic article about writing a simple Forth compiler that compiles to C, and how to get it to play nicely with the debugger.  Great read!

Marpa tutorial

Ooh, Kegler wrote another Marpa example tutorial of a DSL.

Think what this will do when wrapped in Decl!  Maybe I should write the parser tutorial chapter early on, then double back to write more sensible things before it.

Cucumber, and BDD: Behavior-Driven Development

So here's something I ran across (again) today, and really started thinking about what it means: behavior-driven development, BDD.  It's a philosophical refinement of test-driven development, in which you define test scenarios in, essentially, English using some keywords (Feature:, Scenario:, etc.).  This keyworded set of requirements is then used to define tests, and then you do the TDD thing to make all the tests work.

It's pretty cool.  It's been an outgrowth of the Ruby Agile community, and its latest incarnation there is Cucumber.  And yes, there is a Test::BDD::Cucumber on CPAN.

So here we have a very declarative approach to specifying the behavior of a program and using that to drive the development process.  I don't care for the somewhat clunky regexp-driven way you get from the plain text description to the executable code, but the overall shape of the development process is very promising.
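For reference, the regexp-driven mapping boils down to something like this Python sketch. It is a toy runner of my own, not Cucumber's or Test::BDD::Cucumber's API; the scenario and step functions are invented.

```python
import re

STEPS = []  # (compiled pattern, function) pairs


def step(pattern):
    """Register a function as the implementation of a plain-text step."""
    def register(fn):
        STEPS.append((re.compile(pattern), fn))
        return fn
    return register


@step(r"I have (\d+) tweets queued")
def have_tweets(ctx, count):
    ctx["queue"] = int(count)


@step(r"I send (\d+) of them")
def send(ctx, count):
    ctx["queue"] -= int(count)


@step(r"(\d+) tweets? should remain")
def remain(ctx, count):
    assert ctx["queue"] == int(count)


def run_scenario(text):
    """Run each Given/When/Then line against the registered steps."""
    ctx = {}
    for line in text.strip().splitlines():
        line = line.strip()
        if not line or line.startswith(("Feature:", "Scenario:")):
            continue
        body = re.sub(r"^(Given|When|Then|And)\s+", "", line)
        for pattern, fn in STEPS:
            match = pattern.fullmatch(body)
            if match:
                fn(ctx, *match.groups())
                break
        else:
            raise LookupError(f"no step matches: {line}")
    return ctx


SCENARIO = """
Feature: Scheduled tweets
  Scenario: Sending part of the queue
    Given I have 5 tweets queued
    When I send 2 of them
    Then 3 tweets should remain
"""
```

The clunky part is visible right away: the English and the regexps have to be kept in step by hand, and a reworded sentence silently stops matching.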

Here's the Wiki page for BDD.  It points out that some of the notions in BDD come from domain-driven design.  And all of that needs to be condensed a little into some principles for Decl, perhaps.

SQP

I factored something I ended up calling SQP out of the invoicing program.  It's a shell-based rapid SQL/Perl prototyping tool.  So far it's just got a quick way to find a database, use it for SQL, check the SQL for a couple of foot-shooting methods (forgetting 'where' clauses on delete and update), and add a command for each Perl script it finds in the directory.
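SQP itself is Perl, but the foot-shooting check is simple enough to sketch in Python. This is a reconstruction of the idea, not the actual SQP code, and the function name is made up.

```python
import re


def missing_where(sql):
    """True if the statement is a DELETE or UPDATE with no WHERE clause."""
    statement = sql.strip().rstrip(";")
    if not re.match(r"(delete|update)\b", statement, flags=re.IGNORECASE):
        return False
    return re.search(r"\bwhere\b", statement, flags=re.IGNORECASE) is None
```

It's a keyword scan, not a parser: a WHERE that appears only inside a string literal or a subquery will fool it. For an interactive shell that asks "are you sure?" before running the statement, that's probably good enough.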

It needs to be able to show/define/modify tables in some way, which it doesn't yet, and some other functionality is also still missing.  But it's a promising start.

Oh, and it still doesn't have any testing.  I mean, the whole thing is like fifty lines of code, but I'm sure there are plenty of ways I can fail when I change it next.

Other things to factor out:
  • A Word template handler that's not the barely functional Decl monstrosity I'm using now, based on:
  • A Word module that's a thin wrapper around the OLE object description, written in a declarative style using:
  • A declarative OLE module wrapper definition language module.
  • Some kind of ORM for the invoicing part, anyway.  Not sure how to manage it, exactly.  But I'd really like some kind of higher-level language for talking about SQL databases at the level of semantics (grouping tables into meaningful modules or something).  I'm sure somebody else is already doing something like this, so I'll also see if I can't research that.