Thursday, December 13, 2012

The semantics of code

For the past few days I've been turning over some thoughts about the semantics of code.  I don't mean the semantics of the problem the code solves; I mean the semantics of the code itself, which of course map onto the semantics of that problem.

Here's the thing.  Each section of code is made of meaningful parts arranged in a hierarchy.  At the bottom are low-level things like "variables" and "loops".  By recognizing these (which, yes, are pretty close to the syntactic objects that denote them) and grouping them by purpose, a human reader can intuit the intent of the original programmer.  At a low level, that intent is something like "get data out of this file" or "sort this list".  At a higher level, we work with APIs (which themselves have an internal semantic structure) to form semantic units that are closer to human actions, like "put this record in the database" or "show this box on the screen".
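To make the hierarchy concrete, here's a minimal sketch in Python using the standard `ast` module.  The sample function `total` and the helper `outline` are made up for illustration, and the choice of which node types count as "meaningful parts" is just an assumption.

```python
import ast

# Hypothetical sample: "get data out of this file" at the lowest level.
SOURCE = '''
def total(path):
    result = 0
    with open(path) as handle:
        for line in handle:
            result += int(line)
    return result
'''

# Node types treated as "meaningful parts" (an arbitrary choice for this sketch).
PARTS = (ast.FunctionDef, ast.With, ast.For, ast.While,
         ast.Assign, ast.AugAssign, ast.Return)

def outline(node, depth=0):
    """Return (depth, kind) pairs for the structural parts of a syntax tree."""
    rows = []
    if isinstance(node, PARTS):
        rows.append((depth, type(node).__name__))
        depth += 1  # anything nested inside this part sits one level deeper
    for child in ast.iter_child_nodes(node):
        rows.extend(outline(child, depth))
    return rows

for depth, kind in outline(ast.parse(SOURCE)):
    print("  " * depth + kind)
```

The printed outline is still syntax, not intent, but it is the raw material a reader groups by purpose: the `With` and `For` together are what someone would summarize as "read the file".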

Once the intent of the programmer is understood (whether correctly or not), we can ask questions about that intent.  Does the code actually meet it?  If not, we have a coding error that should probably be fixed.  This is a higher-level version of what static analysis does: a static analyzer pattern-matches the code against common errors and warns the programmer that certain sections look fishy.
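As a rough illustration of that kind of pattern matching, here's a small sketch that flags one well-known fishy pattern in Python: a mutable literal used as a default argument.  The function name `find_mutable_defaults` and the sample code are invented for this example; real analyzers check many more patterns than this.

```python
import ast

def find_mutable_defaults(source):
    """Return (function name, line number) for each mutable default argument."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Positional defaults plus keyword-only defaults (None means "no default").
            defaults = node.args.defaults + [
                d for d in node.args.kw_defaults if d is not None
            ]
            for default in defaults:
                # A list, dict, or set literal is shared across calls: looks fishy.
                if isinstance(default, (ast.List, ast.Dict, ast.Set)):
                    findings.append((node.name, default.lineno))
    return findings

# Hypothetical code under inspection.
SAMPLE = '''
def append_item(item, bucket=[]):
    bucket.append(item)
    return bucket

def safe_append(item, bucket=None):
    return (bucket or []) + [item]
'''

print(find_mutable_defaults(SAMPLE))
```

Note that the checker knows nothing about what `append_item` is for.  It only recognizes a shape that usually contradicts what the programmer meant, which is exactly the gap a semantic checker would try to close.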

Now.  At the highest level of the code, our semantic structures should look a whole lot like those expressed in the requirements and specifications documents.  These are human-readable documents that (hopefully) express the purpose of the code, whether the programmer has already implemented it or is supposed to at some point in the future.

Here, too, errors can occur.  A semantic code checker that understood something of English might be able to check whether the two sets of semantic structures look similar.  (And of course, eventually, one will do just that.)

Moreover, tests could be derived from the semantic structures at the specifications/requirements level, and those tests could check empirically whether the code matches.
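Here's a minimal sketch of what such derived tests might look like, with the derivation step done by hand.  Everything in it is hypothetical: the requirement text, the function `sort_records`, and the `check` helper are assumptions made up for illustration, and each requirement sentence has been translated manually into a predicate rather than extracted from English automatically.

```python
import random
from collections import Counter

# Hypothetical code under test, meant to satisfy the requirement
# "sort_records returns the same records in ascending order."
def sort_records(records):
    return sorted(records)

# Each requirement sentence paired with a hand-written predicate over
# (input, output).  A semantic checker would have to derive these itself.
REQUIREMENTS = {
    "output is in ascending order":
        lambda given, result: all(a <= b for a, b in zip(result, result[1:])),
    "output holds the same records as the input":
        lambda given, result: Counter(given) == Counter(result),
}

def check(func, requirements, trials=200, seed=0):
    """Run func on random inputs and collect (requirement, input) failures."""
    rng = random.Random(seed)  # fixed seed so failures are reproducible
    failures = []
    for _ in range(trials):
        given = [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
        result = func(list(given))  # pass a copy so func cannot alter 'given'
        for text, holds in requirements.items():
            if not holds(given, result):
                failures.append((text, given))
    return failures

print(check(sort_records, REQUIREMENTS))
```

An implementation that silently drops duplicates, say `lambda r: sorted(set(r))`, would still pass the ordering requirement but fail the second one, so the failure report points back at the specific sentence of the specification that the code does not match.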

So that's the code-semantics aspect of my recent thinking.
