Tuesday, March 24, 2009

Speech Recognition: Why is everybody heading north?







This has been a hot topic for a while, but I don't know why I still haven't seen really good speech recognition software become widespread.

My Mac has built-in speech recognition, and it is good. But most of the time it still doesn't recognize my own voice.

Let's think outside the box. Forget that we have current speech systems, forget the current algorithms, forget everything and start fresh. Let's reinvent this together.

When I talk to you in English, you understand me. Sometimes you don't, if my accent isn't clear enough, but eventually you get it. I want to talk to my machine as if I were talking to a human. I don't want the machine to feel; I just want it to understand, whether I tell it "Quit everything", "Close everything", or "Quit all applications", and whether I say it or my mother does. If the machine understands English, it should get it.
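To make the "Quit everything" example concrete, here is a minimal sketch of mapping different wordings to one command. It only covers text that has already been transcribed, not the audio itself, and the synonym tables and the `to_intent` function are names I made up for illustration, not part of any real speech system.

```python
import re

# Hypothetical synonym tables: these word lists are illustrative assumptions,
# not taken from any real speech recognition product.
ACTION_SYNONYMS = {"quit": "quit", "close": "quit", "exit": "quit"}
TARGET_SYNONYMS = {
    "everything": "all_apps",
    "all applications": "all_apps",
    "all apps": "all_apps",
}


def to_intent(utterance):
    """Map an already-transcribed phrase to an (action, target) pair, or None."""
    # Lowercase and drop anything that is not a letter or a space.
    text = re.sub(r"[^a-z ]", "", utterance.lower()).strip()
    words = text.split()
    if not words:
        return None
    # First word is treated as the action, the rest as the target.
    action = ACTION_SYNONYMS.get(words[0])
    target = TARGET_SYNONYMS.get(" ".join(words[1:]))
    if action is None or target is None:
        return None
    return (action, target)


for phrase in ["Quit everything", "Close everything", "quit all applications"]:
    print(phrase, "->", to_intent(phrase))
```

All three phrases end up as the same pair, which is the behavior I am asking for. A lookup table like this obviously does not scale to all of English; it just shows what "understanding" would have to produce.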

If I learn a new English word from Roy, say "petrified", I will recognize "petrified" from anyone. I won't say, "Sorry, I only know Roy's version of 'petrified'."

Perhaps we should start with the capturing process (the computer's ear). How do I know if the computer is capturing my voice correctly?


If I say "Open the control panel", the machine should open it. That's it. No room for mistakes. I think the current systems were driven by the technology and got lost in bit-by-bit analysis and neural networks. I know the feeling when you get lost in the code and your left brain takes the lead. You can't be creative in that mode.


So why is everybody heading north? Is it because there is only one way to do speech recognition?

It's all the same:
Huffman started compression, and WinZip, WinRAR, and PowerArchiver followed the same approach.

Someone started voice recognition and wrote an algorithm, and now everybody treats it as a Bible and heads north.



EDIT: Siri, Google Voice, and others are new technologies that have hit this field pretty hard. But not hard enough.