The recent announcement of Siri made me curious about the voice recognition that’s already out there.
I started off slow with the Google Chrome Oweb Voice Input extension, which recognizes words pretty well but only works in some search fields and only in Chrome. Still, clicking the little microphone icon and talking is quicker than typing.
Portability, however, matters a great deal if whatever you are talking to is going to be a real personal assistant. I downloaded an app called Voice Actions, which is kind of like Siri’s cousin on Android (it tells me it doesn’t have ‘brothers and sisters, but more accurately clones’). You can name it whatever you want, but the voice never changes. It recognizes words fairly well and can usually understand what you are trying to get at.
The coolest aspect is that you can try to carry on a conversation with it. It will play 20 Questions with you and has likes and dislikes, which go a long way toward making it a convincingly conscious entity, at least until it botches what you said and throws out something completely random and illogical. Opening apps, sending texts, and most other commands don’t seem to be much of a challenge for it.
It greets me pleasantly by name when I address it, but does not seem to remember much of what I have told it about myself.
While not portable, the best system I have used is the Windows Speech Recognition feature that comes with Vista and Windows 7, though it is probably ignored by most. You don’t have to use it for long to feel like you are in 2001: A Space Odyssey (which probably excites me more than most).
It will do pretty much anything you want it to: opening, closing, minimizing, and maximizing windows. Navigating among windows can be a little drawn out, since it works either by dividing the screen into numbered sections or by assigning each button a number.
“Show numbers.” Assigns each choice on the screen a number.
“17.” Will highlight selection number 17.
“OK.” Confirms your choice.
Can you say four words faster than moving the mouse and clicking? It’s close. Surfing the web is tough, since you have to use the number system for each tab rather than commands like “Close tab” or “Open About Us in new window”, which seem like simple additions for the next phase.
Other options, though, like “Scroll down”, allow you to sit back in your chair and read a long passage without having to use the mouse. It’s a small thing, but it is more comfortable.
Composing longer passages is possible, but will take some practice. It really forces you to think through what you are going to say before you start, whereas typing gives you some time to think ahead as you move through the sentence. The editing features are all there as well, but since virtually everyone who tries this will be used to composing with a keyboard, it will be much slower at first. I’ve come up empty finding a video of someone who has mastered it, so it’s difficult to compare keyboard and voice composition at equal levels of competency.
Though it seems to be the best of the three aforementioned systems at accepting vocal input and recognizing words, Windows Speech Recognition does not talk back to you. Not that it has to; it isn’t really there to be a personal assistant. But I (and I think most people) feel more comfortable talking to a voice than to a cold machine. Combining the two, a PC that can minimize a window one moment and schedule a meeting the next, with a vocal confirmation that it has done the latter, will be a cool tool to have and hopefully one that will show up as soon as Windows 8.
One thing is certainly clear: we will be talking to our devices a lot more. Vocal commands can address so many more options in a short period of time that it doesn’t make sense not to use them. Navigation will also get an eventual boost from eye tracking, motion sensors, and ultimately thought.
Talking to the Voice Actions app—not giving it commands, but trying to hold a conversation—shows that computers can already be fairly convincing. They are only getting smarter and within two decades will match us. Before they get to a level near human, they will be “freaky” by today’s standards. This will take us into debates over AI consciousness (a subject that I find endlessly fascinating) and slavery.
In the meantime, you can blame that habit of talking to yourself on your assistant.