You know, TinyScheme is essentially in the public domain, and would really be easy to embed into Perl. In fact, you could just use Perl's own scalars as a basic Scheme type and get most of the R5RS spec for free that way...
I think I want to think harder about that. There are a lot of things I want to do in Lisp-y environments without giving up my CPAN, and TinyScheme is really, really lightweight. It's quite conceivable that you could also use the Perl module structure (and CPAN) to organize your Scheme libraries. Hmm...
But even more fascinatingly, a Scheme embedded in XS could also be used to get some introspective (and on-the-fly!) access to other XS-defined items. Say, OLE.XS, which still has me stymied in my effort to get IEMech up and running in the modern age.
So - TinyScheme is probably the most lightweight choice, but there's also Chibi Scheme, which looks like a more full-featured alternative, plus of course there's already Inline::Guile for serious scheming.
(I'm going to go with TinyScheme, though, and make something self-contained, a la SQLite, that already has everything you need to start with Scheme.)
Oh, here's a link on what hygienic macros are. Makes sense.
Anyway, the motivation for all that is that Sussman physics-in-Scheme course. That thing is incredibly densely written! Fortunately I'm married to a theoretical physicist who, so far, appears impressed at my staying power and is more than happy to talk about anything I find unclear. At length.
Tuesday, December 31, 2013
Speech synthesis
So I just found out about something called Vocaloid, which is a closed-source Yamaha product for text-to-singing. There's nothing remotely like it in the open source world, but I suspect you could cobble it together (mostly) from parts already in existence by feeding the melody and timing into an existing text-to-speech engine.
Possible engines might be:
- MARY (another project from the DFKI)
- eSpeak - this is more or less the Linux default
- flite, which is Festival lite.
- And Festvox, the CMU voice-building project for the full Edinburgh Festival system.
I can only imagine that Vocaloid is a unit synthesizer with a large database; the output is pretty natural-sounding, in contrast to the state-of-the-art of truly synthetic speech. It would be a lot of fun to play with this stuff, especially in the context of music.
Friday, December 27, 2013
Datalog
More LogicBlox/Datalog stuff:
- A paper about a tutorial
- The query language LogiQL
- Slide presentation
I'm lagging in my investigations.
StackEdit
Here's a complete markup editor for formatting posts for StackOverflow and other markup-based stuff. It's bewilderingly good.
Text/NLP stuff
Like clockwork, I collect links to interesting NLP stuff.
- GATE: a General Architecture for Text Processing
- textteaser extracts summaries with machine learning
- Dezi is a Lucy/Lucene-alike in Perl
Sound generators
A couple of sounds for ... stuff.
- A whole site with a bunch of tunable sound generators. These are fascinating. "Noise machines."
- Dial-up modem sound effects, for the sake of nostalgia.
Negative captcha
A Ruby framework for building forms that are more bot-proof. This is nice from the standpoint of technique.
D3.js
I always love anything about d3.js, so October saw three links:
- Another basic tutorial
- Raw is a webapp drawing graphics using d3.js
- And here's a huge list of examples. (1897 of them as of today.)
Thursday, December 26, 2013
Ontology
I've launched into a project at last, a terminological database toolset that's been on the drawing board for a very long time indeed (with what I hope will prove to be an accompanying business model to harness it all), and one thing that I ran across in my initial data scheme for termbases is the "context" field. Logically, that context is an ontological specification - a kind of "where am I?" in the larger scheme of the vocabulary of the full language - and it's used to draw distinctions about the terminology used in specific applications.
Well, so I delved into the available literature about ontology tools. Of which there are many. And I hadn't really looked in many years; they've proliferated, especially in the context of the semantic web and bioinformatics, so here's a partial linkdump of some of the information that looks most promising.
- A decent overview.
- KIF = Knowledge Interchange Format [here] [in SUMO]. This is a declarative language with LISP syntax used to express first-order logic predicates about concepts.
- SUMO = Suggested Upper Merged Ontology. Sort of the basic list of concepts that underlie everything else.
- Tips on ontology development. And pitfalls.
- A few basic tutorials about the semantic web. It's based on a graph database model (for semantic networks).
- RDF is used to encode chunks of graph data in the semantic web (it can also be embedded in HTML, of course).
- Ontological data about RDF documents is encoded in RDFS and OWL.
- OntoSelect is apparently a cataloging service for ontologies found/discovered on the Semantic Web - here's a mention, but the service itself seems to be down.
- Biology is another area where ontologies are used extensively; here, for example, is the Experimental Factor Ontology. Note that it is downloadable in OWL format. Experimental ontologies are generally available as free, open-source data, while anything with any hint of commercial usefulness is blisteringly expensive (pharmacovigilance, for example - the adverse effects ontology used for drug side effect reporting).
- The Gene Expression Atlas is also ontology-based. This is a real-world application of something that used to be considered hard AI, and I find that pretty fascinating in and of itself.
- Aaand a protein ontology that I've linked partly because proteins are inherently cool and partly because the legend is so pretty.
- Bioinformatics ontologies aren't always published in OWL; OBO is a competing standard. The Obofoundry catalogs a bunch of ontologies.
- Ontobee.org is an ontology viewer for ontologies published on the Web. Here's the display for an adverse event ontology.
There are reams and reams of information about ontologies these days. Those are the more interesting things I ran across while determining that I don't need to go into that kind of depth to do what I need to do.
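The graph model behind RDF mentioned in the list above is easy to play with in miniature. Here's a rough sketch of a triple store in plain Python; the terms, the `is_a` predicate, and the `match` and `ancestors` helpers are all made up for illustration and aren't real RDF or OWL vocabulary.

```python
# Toy triple store: each fact is a (subject, predicate, object) tuple.
# All terms here are illustrative placeholders, not a real ontology.
triples = {
    ("protein", "is_a", "molecule"),
    ("enzyme", "is_a", "protein"),
    ("enzyme", "has_role", "catalyst"),
}


def match(store, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def ancestors(store, term):
    """Walk is_a edges upward from a term and collect everything reached."""
    found = []
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for _, _, parent in match(store, s=current, p="is_a"):
            if parent not in found:
                found.append(parent)
                frontier.append(parent)
    return found


print(match(triples, s="enzyme"))
print(ancestors(triples, "enzyme"))
```

A termbase "context" field could in principle be a pointer into a structure like this, which is about as much depth as I seem to need.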
Sunday, December 22, 2013
GPG audit
GPG is, of course, an important piece of software in the security world. It's kinda crufty and old. It probably needs an audit. Tptacek on HNN says that, more than an audit, it needs some decent code documentation; hence my idea of an exegesis, a deliteralization of sorts.
Anyway, multilevel code understanding and presentation.
Turbulenz
Free game engine. Makes my laptop sound like a jet engine warming up for takeoff - but then essentially anything does that, so it's not really an antifeature.
StupidFilter
An attempt (abortive, apparently) to use Bayesian techniques to detect stupidity. Obviously, this can't detect stupidity at the semantic level, but it might be able to pick up on syntactic markers of stupidity. It's an interesting exercise.
cdecl.org
A translator between C type declarations and English descriptions of them. Pretty fascinating!
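To get a feel for how the declaration-to-English direction can work, here's a minimal sketch in Python. It is not how cdecl.org is implemented; it's modeled on the recursive-descent `dcl` example from K&R, and it assumes a single-word base type, pointers, arrays, and functions with empty parameter lists. The `tokenize` and `explain` names are my own.

```python
import re

# One token: an identifier, "()", "[size]", or a single "(", ")" or "*".
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|\(\)|\[\d*\]|[()*])")


def tokenize(text):
    """Split a declaration into tokens, rejecting anything unrecognized."""
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise ValueError(f"unexpected input at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def explain(decl):
    """Describe a simple C declaration in English (sketch, limited grammar)."""
    tokens = tokenize(decl.rstrip(";"))
    base = tokens.pop(0)  # assumption: the base type is a single word
    out, name = [], []

    def dcl():
        # dcl: optional '*'s followed by a direct declarator
        stars = 0
        while tokens and tokens[0] == "*":
            tokens.pop(0)
            stars += 1
        dirdcl()
        out.extend(["pointer to"] * stars)

    def dirdcl():
        # direct-dcl: name | (dcl), followed by any number of () or [n]
        tok = tokens.pop(0)
        if tok == "(":
            dcl()
            if tokens.pop(0) != ")":
                raise ValueError("missing )")
        elif re.match(r"[A-Za-z_]", tok):
            name.append(tok)
        else:
            raise ValueError(f"expected a name or (dcl), got {tok!r}")
        while tokens and (tokens[0] == "()" or tokens[0].startswith("[")):
            tok = tokens.pop(0)
            if tok == "()":
                out.append("function returning")
            else:
                size = tok[1:-1]
                out.append(f"array {size} of" if size else "array of")

    dcl()
    return f"declare {name[0]} as {' '.join(out + [base])}"


print(explain("char **argv"))        # declare argv as pointer to pointer to char
print(explain("int (*daytab)[13]"))  # declare daytab as pointer to array 13 of int
```

The trick is that pointer stars bind more loosely than the `()` and `[]` suffixes, so the stars are counted first but only emitted after the direct declarator has been described.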
The Architecture of Open-Source Programs
Online book. I think I linked it already - well, it deserves multiple linking.
Gingko: tree-shaped organization of text
An interesting perspective on document organization: hierarchical, two-dimensional organization of text into successive layers of detail. I kinda like it.
Gingko.
Brilliant vs. insane code
Here's an odd little ditty musing about a line of Python that Stavros Korokithakis (perhaps HNN's StavrosK?) ran across:
    def GetContourPoints(self, array):
        """Parses an array of xyz points and returns a array of point dictionaries."""
        return zip(*[iter(array)]*3)
Hmm. Like it says on the label, it takes an iterable of points and returns an iterable of triples in order. But as Stavros notes, it's not at all obvious how it does that. You have to reason your way through it.
It's clockwork, and quite clever - and not the way people think (well, except insofar as people build clockwork and this Python in order to do things like this of course). In terms of code understanding this code is not self-documenting in any way. To determine programmer intent, we have to simulate what it does and see why it does that.
It's kind of like a syntactic artifact of a semantic reasoning process, one that we can recover (hopefully!) with careful reasoning. But the original reasoning is gone.
Interesting.
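For the record, here's my attempt at recovering that reasoning, as a sketch. The `group_in_threes` function is my own spelled-out equivalent, not anything from the original code. The key is that `[iter(array)]*3` is a list holding the same iterator three times, so `zip` pulls three consecutive items from one shared stream for every tuple it builds.

```python
def group_in_threes(flat):
    """Spelled-out equivalent of list(zip(*[iter(flat)]*3))."""
    it = iter(flat)  # one shared iterator over the flat sequence
    triples = []
    while True:
        chunk = []
        for _ in range(3):
            try:
                # Each slot of the tuple advances the same iterator.
                chunk.append(next(it))
            except StopIteration:
                # Like zip, stop at exhaustion and drop a partial chunk.
                return triples
        triples.append(tuple(chunk))


flat = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
idiom = list(zip(*[iter(flat)] * 3))
assert idiom == group_in_threes(flat)
print(idiom)  # [(0.0, 1.0, 2.0), (3.0, 4.0, 5.0)]
```

Note that the trailing 6.0 silently disappears in both versions, which is exactly the kind of thing the one-liner doesn't tell you.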
Random user generator
Randomuser.me gives you a random user profile - "Lorem ipsum for people". (See how this ties back to the contact management thing?)
So here's a semantic pole for ya: people. They just keep coming up all over the place. And they often share a lot of things with one another. So why don't we have a range of tools like this one, for people, for companies, etc.?
Sort of a semantic toolbox kinda thing.
Contact management
A recurring problem for everybody who deals with people. Which is ... everybody. [musings] [hnn]
Seems to me that part of the problem is that not every application of contact management requires a full-on heavy-artillery solution. So - as with many, many other domains - there is a kind of sliding scale of complexity that could be modeled using a set of mapped semantic domains.
I really think this concept is going to pay off once I have it clear in my head.
September
By the way, that was the first post from stuff I bookmarked in September. I seem to have a 3-month lag that is more or less constant, at this point.
Funny vs. LOL
UI patterns
So I've come back quasi-full circle, finding myself thinking of Wx-enabled smart widgets based on the wftk (seriously, it's like it's 2002 again), and last night after the laptop was off, I scribbled down the following note:
Basic mail UI against repo. Define mail-like queues as a thing.
The first sentence is a UI pattern (basic mail reader, with groups, then messages, then a document view for the mail itself), and the second is the data pattern it presents to the user (the concept of a message queue, possibly with threads, certainly with some kind of topic grouping).
This kind of pattern-based "architectural" programming could and should be carried down to the lowest possible levels of programming. That's a semantic mode of conceptualizing software.
Anyway, so I looked up "UI patterns", thinking that would get me that Yahoo effort (YUI, already noted elsewhere on this blog) - but instead it turned up UI-patterns.com, a short-lived effort by Danish web developer Anders Toxboe, an attempt to develop a UI pattern gallery/database/article focus site that seems to have gone on for 2010 and 2011 and stopped. Plenty of spam in the comments sections, but otherwise a ghost town. Too bad, because it's pretty much what I'd like to start from in this attempt to come up with a language of UI pattern design.
Some good grist for the mill, anyway.
Filed under UI design, patterns, and architectural patterns, because I suspect the UI design has to be based on some conceptualization of the data that would be reflected in the system architecture.