Sunday, February 5, 2012

Speech recognition

Mulling over possible ways to use speech recognition in my translation business, I discovered that Windows 7 actually has a speech recognition engine built in. It does a pretty darned good job, too. Unfortunately, it appears that my translation skills are more or less built on serialized output through my fingers: I can formulate the translation only a phrase at a time, and I stumble when I try to say those phrases out loud. I hope that's just a matter of training, because if I learn stenography only to find out I can't actually translate as fast as I type, I'm going to be sorely disappointed.

Anyway, speech recognition is still interesting in terms of output bandwidth from my brain, so here are a couple of links:
  • Microsoft documentation library for SAPI. I know this type of link tends to rot pretty quickly thanks to Microsoft's ongoing efforts to erase their own history, but it'll be good for a year or two anyway.
  • A 2007 overview article on Microsoft speech recognition.
  • CMU Sphinx is a popular open-source speech recognition engine that deserves examination. The segment being translated says a lot about which words are likely to show up in the translation, and the same vocabulary hints that ought to be available to typing ought to really help the quality of speech recognition as well, so eventually I can envision some pretty darned good input techniques.
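
To make the vocabulary-hint idea a little more concrete, here is a minimal sketch in Python. It assumes the recognizer can hand back several candidate transcriptions, each with a score (higher is better), and that I can derive a list of expected words from the segment being translated, say from a glossary. The function name, the score format, and the bonus value are all made up for illustration; none of this is the actual SAPI or Sphinx API.

```python
def pick_with_hints(candidates, hint_words, bonus=0.5):
    """Return the candidate text that scores best after boosting hinted words.

    candidates: list of (text, score) pairs, higher score = more likely
                (hypothetical format, not tied to any real engine).
    hint_words: words we expect to see, derived from the source segment.
    bonus:      arbitrary score added per hinted word found (an assumption).
    """
    hints = {w.lower() for w in hint_words}

    def boosted(item):
        text, score = item
        # Count how many words of this candidate are among the hints.
        hits = sum(1 for w in text.lower().split() if w in hints)
        return score + bonus * hits

    return max(candidates, key=boosted)[0]


# Two made-up recognizer outputs for the same utterance.
candidates = [
    ("tighten the bolt with a talk wrench", -3.9),
    ("tighten the bolt with a torque wrench", -4.2),
]

# Hints a glossary might suggest for a segment about tools.
hints = ["torque", "wrench"]

best = pick_with_hints(candidates, hints)
```

Without hints, the first candidate wins on raw score; with the two hint words, the second one overtakes it. A real engine would more likely take the hints as a custom vocabulary or grammar up front instead of rescoring afterwards, but the effect I'm after is the same.
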
Anyway, yet another field that needs investigation.
