January 31, 2006
speech conference
I'm at SpeechTEK West this week. People often ask me, what is the best dictation software? It depends, I answer. If you just want to play around with it, and you're running Windows XP, you already have a speech recognition engine installed on your computer. Simply go to the Control Panel and turn it on. According to most reviews, though, if you want to do serious speech recognition, you should look at Dragon NaturallySpeaking 8.
By now you may have guessed that I'm doing this post using just speech recognition. I am using Nuance's Dragon NaturallySpeaking 8. To be fair, I spent about 10 minutes training it, and I'm also watching Groundhog Day on the TV. I'm really amazed it's working at all. I haven't opened the manual, and I spent very little time training the speech engine.
I really think that with a little practice, I could use this instead of a keyboard. I think this has great potential and I'm going to keep working with it. Good night!
Terry
January 31, 2006 in Speech Recognition
January 25, 2006
Assistant 1.0
It still feels funny to say I have an assistant. Most entrepreneurs probably wait too long to find someone to help them. I waited almost 10 years, which was about 8 years too long.
Suzanne joined Gold Systems as my first assistant and trained me to be more effective, and she took on jobs that I didn't even know needed to be done. We became a better company, and my life was easier. Then she got married and moved to Alaska, and I was fortunate (thanks, Judy!) to have a chance to hire Angela. Angela is doing a fantastic job and, like Suzanne, she has to adapt whenever I adopt new technology. Angela has been thinking about how my new Treo 700w is changing her job, and she wishes for a new product to go along with it. Here is Angela's first "guest blog."
Terry
-----------------------------------------
If you normally read Terry’s blog, then you know that he recently retired both his Nokia 3650 cell phone and Axim X30 Pocket PC and upgraded to the new Palm 700w. My life as his assistant has not been the same since!
This morning I received three calls from Terry first thing. Each time he told me something new, but then claimed there was something else that he forgot. Within a few seconds, the phone was ringing again with the forgotten thought. Secretly, I knew he was driving in his car playing with his new phone and saying “Dial Angela Watson Office” using the voice recognition feature because it’s just cool! It’s funny how gadget-driven this man I work for is, but I’ve got to admit, that thing really is cool. Terry has been teasing me with the idea of getting another one for my own use. (Mind you, I would opt for a more stylish carrying case than the leather belt clip, maybe something pink.)
As much fun as having a 700w of my very own would be, it lacks the most valuable tool that all portable devices lack when it comes to aiding the professional executive assistant: the capability to pull up your executive’s schedule!
50% of my day is spent combing over Terry’s calendar, contacts and tasks and making sure I’ve allowed enough time for him to eat and sleep. Assistants struggle with the problem of not having their executive’s calendar while working remotely, and we either carry a tickler file with a print-out of their schedule, or we bribe the company IT wizard to come to our home to set up the VPN.
Luckily, I do have the ability to work in full force at home. (Jerry likes home-cooked meals.) But until Microsoft comes out with Windows Mobile – Assistant Version 1.0, the best I can do for a mobile device is still the good ol’ laptop and separate cell phone.
Respectfully Submitted by:
Angela Watson
Senior Executive Assistant to Terry Gold
Gold Systems
-------------------
Terry here - Jerry, the IT wizard that Angela bribed in the story above, has just discovered that Angela can in fact get to my schedule or anyone else's that she has access to. We run Exchange Server here with OWA (Outlook Web Access) enabled. That means that we can use any web browser, including the web browser on the 700w or my old Axim, to access the Exchange Server. Since Angela has permission to access my calendar, she can use OWA to check my schedule. It's not as nice as a client on her Pocket PC, but it is a step in the right direction.
This is the last post (for a while) about my new phone. I expect I'll be reviewing some new speech recognition products in the next few weeks as I continue to try to surround myself with the technology.
January 25, 2006 in Speech Recognition
January 21, 2006
Voice Command Cheat Sheet
I've lived with my Treo 700w phone for almost two weeks now, and despite having to do a hard reset yesterday, I'm loving it. I've quit carrying my Pocket PC, and even gave it away a few days ago, so you know I'm serious about this new device.
Right now I'm in an airport using my phone as a high-speed modem. Despite what Verizon says, it can be done; you just need a little piece of software called pdaNet. I'm connected with a USB cable, but I expect the software will evolve to do the trick over Bluetooth.
Last week I mentioned that a lot of speech recognition applications suffer from a lack of documentation and "cheat sheets". Piyush Dogra from Microsoft forwarded this cheat sheet to me the very next day. Eric Badger, one of the developers of the product, created it and has given me permission to share it. Thank you, Eric! This is one of the coolest pieces of software I've seen in a long time. As you point out, "Knowing what you can say makes all the difference when using Voice Command." As of today I have 829 contacts in Outlook, and Voice Command never misses when I say a person's name.
I also find myself saying "What's my next appointment" because it is just easier than opening the schedule and scrolling around the screen. Speech recognition really shines when you have deep trees of information that you need to directly access. It's a long story, but even though my phone came with Voice Command, I ended up buying a copy at the local computer store. The retail product does come with very good documentation that should get you going. If you have Voice Command or a Windows Mobile phone, you've got to give it a chance. Learn a few commands and you will wonder how you got by without it.
Here's Eric's Cheat Sheet - Enjoy!
Terry
Voice Command Cheat Sheet for Treo 700w
Knowing what you can say makes all the difference when using Voice Command.
===== CALLING A CONTACT =====
Commands:
Call <contact>
Call <contact> at home
Call <contact> at work
Call <contact> on mobile
Call <contact> on cell
Call <contact> on cellular
Call <contact> at home two
Call <contact> at work two
Call <contact> at car
Call <contact> on radio
Call <contact> on pager
Call <contact> at assistant
To confirm that you want to make the call after Voice Command responds:
You can say "Yes" or "Correct" to call.
You can say "No" or "Incorrect" to try again.
If Voice Command asks you which location, you can:
Repeat one of the locations that Voice Command offers to call.
Say "No" to try again.
Related commands:
You can say "Call back" to call back the last call that you received.
You can say "Redial" to call back the last call that you made.
Examples:
Call Karen Archer on cell
Call Frank Miller
Call City Light and Power
Call Barbara Sparrow Home
Notes:
Voice Command indexes by the Contact's first and last name if it exists. If you have a nickname entered, you can use that too. Voice Command will only let you call by company name if there is no first or last name.
You must prefix contact calling with the "call" keyword. If you use "dial", it won't work!
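As a rough illustration of the grammar above, the calling commands can be modeled as the keyword "call", a contact name, and an optional location phrase. This is only a sketch and not how Voice Command is actually implemented. The names `LOCATIONS` and `parse_call` are hypothetical, and the sketch assumes the recognizer has already turned the speech into text.

```python
# Hypothetical sketch of the "Call <contact> [location]" grammar listed above.
# Illustration only; this is not Voice Command's actual implementation.

# Location phrases copied from the command list in the cheat sheet.
LOCATIONS = [
    "at home", "at work", "on mobile", "on cell", "on cellular",
    "at home two", "at work two", "at car", "on radio", "on pager",
    "at assistant",
]


def parse_call(utterance):
    """Return (contact, location) for a "call" command, or None otherwise.

    location is None when the speaker did not name one.
    """
    words = utterance.strip().lower().split()
    if not words or words[0] != "call":
        return None  # contact calling must start with "call", not "dial"
    rest = " ".join(words[1:])
    if not rest:
        return None  # "call" with no contact name
    for location in LOCATIONS:
        # A location phrase only counts when it ends the utterance.
        if rest.endswith(" " + location):
            contact = rest[: -len(location) - 1]
            return contact, location
    return rest, None
```

With this sketch, "Call Karen Archer on cell" splits into the contact "karen archer" and the location "on cell", while "Dial Karen Archer" is rejected because it does not start with "call".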
===== DIALING A NUMBER ======
Commands:
Dial <7-digit number>
Dial <10-digit number>
Dial <1+10-digits>
Dial <N-1-1>
Examples:
Dial 555-0200
Dial 800-555-1212
Dial 1-800-555-1212
Dial 411
You must prefix digit dialing with the "dial" keyword. If you use "call", it won't work!
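The four digit patterns above can be sketched as a small classifier. This is an assumption-laden illustration, not Voice Command's code: the names `DIAL_PATTERNS` and `parse_dial` are made up, and the sketch assumes the spoken digits have already been converted to text.

```python
import re

# Hypothetical sketch: classify a "Dial ..." command by its digit pattern.
# The pattern names mirror the command list above; nothing here is taken
# from Voice Command itself.
DIAL_PATTERNS = [
    ("7-digit", re.compile(r"\d{7}")),
    ("10-digit", re.compile(r"\d{10}")),
    ("1+10-digits", re.compile(r"1\d{10}")),
    ("N-1-1", re.compile(r"\d11")),
]


def parse_dial(utterance):
    """Return (pattern_name, digits) for a "dial" command, or None otherwise."""
    words = utterance.strip().split(None, 1)
    if len(words) != 2 or words[0].lower() != "dial":
        return None  # digit dialing must start with "dial", not "call"
    digits = re.sub(r"[\s\-]", "", words[1])  # drop spaces and hyphens
    for name, pattern in DIAL_PATTERNS:
        if pattern.fullmatch(digits):
            return name, digits
    return None  # not one of the four supported shapes
```

Here "Dial 1-800-555-1212" is classified as the 1+10-digits form, and "Call 411" is rejected because of the wrong keyword.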
===== CHECKING CALENDAR =====
Commands:
What are my appointments today?
What are my appointments tomorrow?
What's my next appointment?
===== START MENU =====
Commands:
Start <program>
Examples:
Start Solitaire
Start Messaging
Start Internet Explorer
Start Pictures and Video
Notes:
Voice Command will index any file that is in or inside of \windows\program files
You have to say the file name exactly as it is written. It may be helpful to rename shortcuts.
Also, you can put links to web pages here and go straight to a saved web page this way.
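The exact-name rule in the notes above can be pictured with a short sketch: walk a folder, index every file by its name, and look up whatever follows "Start". This is a guess at the general idea only. The function names `build_program_index` and `resolve_start` are hypothetical, and the folder to index is passed in as a parameter rather than assumed.

```python
import os

# Hypothetical sketch of indexing a folder of shortcuts by file name.
# Illustration only; this is not how Voice Command is known to work.


def build_program_index(root):
    """Map each file's name (extension removed, lowercased) to its full path."""
    index = {}
    for folder, _subfolders, files in os.walk(root):
        for filename in files:
            spoken_name = os.path.splitext(filename)[0].lower()
            index[spoken_name] = os.path.join(folder, filename)
    return index


def resolve_start(utterance, index):
    """Return the path for a "Start <program>" command, or None if no match."""
    words = utterance.strip().lower().split(None, 1)
    if len(words) != 2 or words[0] != "start":
        return None
    # The name must match the file name exactly, so a misspelled or
    # differently named shortcut simply is not found.
    return index.get(words[1])
```

Because the lookup is an exact match, renaming a shortcut changes what you have to say, which is why renaming shortcuts to something easy to speak can help.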
===== MEDIA =====
Commands:
Play music
Play media
Play artist
Play album
Play genre
Play <artist name>
Play <album name>
Play <genre name>
Play everything
Play
Pause
Stop
Next
Previous (track)
Shuffle on
Shuffle off
What song is this?
What track is this?
Examples:
Play The Beatles
Play The White Album
Play Rock
Play Everything
Notes:
You cannot play individual tracks using voice
Voice Command will index the media based on the metadata. You can use a metadata editor to groom the fields.
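The two notes above suggest a simple mental model: only the artist, album, and genre fields are indexed, so a track title is never something you can ask for. The sketch below illustrates that model under stated assumptions. The track records are plain dictionaries, the names `build_media_index` and `resolve_play` are hypothetical, and none of this is taken from Voice Command itself.

```python
from collections import defaultdict

# Hypothetical sketch of a metadata-driven media index.
# Illustration only; not Voice Command's actual implementation.


def build_media_index(tracks):
    """Index track titles by artist, album, and genre (but not by title).

    tracks is a list of dicts with "title", "artist", "album", "genre" keys.
    """
    index = defaultdict(list)
    for track in tracks:
        for field in ("artist", "album", "genre"):
            value = track.get(field, "").strip().lower()
            if value:
                index[value].append(track["title"])
    return index


def resolve_play(utterance, tracks):
    """Return the list of titles a "Play <name>" command would queue."""
    words = utterance.strip().lower().split(None, 1)
    if len(words) != 2 or words[0] != "play":
        return []
    target = words[1]
    if target == "everything":
        return [track["title"] for track in tracks]
    # Titles are not keys in the index, so asking for a single track by
    # name finds nothing, matching the first note above.
    return build_media_index(tracks).get(target, [])
```

In this model, cleaning up a misspelled artist or genre field changes the index key, which is one way to read the advice about grooming metadata.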
January 21, 2006 in Speech Recognition
January 12, 2006
Speech Recognition and the trough of disillusionment
In 1995 Gartner came up with what they call The Hype Cycle to explain how new technologies get hyped, fall out of favor with the press, and then ultimately (sometimes) go on to be mainstream. One phase is the Trough of Disillusionment, and I believe that Speech Recognition may be in the trough now. All great technologies must go through it. Even as the technology continues to improve and some amazing things are happening, it seems to me that some people are getting tired of hearing how great it is going to be and they just want it to understand everything they say with little tolerance for errors.
There are two issues that have little to do with the science of speech recognition. The first is Human Factors. (I capitalize it because I believe it is so important.) No one would disagree with me that Human Factors is important, but we still see applications being built that seem to go out of their way to make life difficult for the user. That's another soap box for another time - I'll just say that it is very hard to make something very simple, but it is worth the effort.
The other issue is documentation, or at least expectation setting. If you encounter speech recognition on the telephone, there is almost never documentation in hand for what the system can understand. Since we're years away from a system that can understand everything a person might say (hey, people can't even do it!), you have to guess at what you might be able to say, or you have to wait for the system to prompt you.
Lately I've been trying to surround myself with speech recognition, just to live with it and understand what works and what doesn't work. I have "Wise Crackin' Shrek and Donkey" and all sorts of gadgets that do speech recognition. My latest is the new Palm Treo 700w, which is a Palm Treo phone that runs Windows Mobile. Sort of like an Intel-based Apple: they both came out this month, causing many people to wonder if in fact Hell has frozen over. My very first Palm was made by U.S. Robotics and until switching to the Pocket PC a few years ago I always liked the Palms, so I was happy to have the best of both worlds when the Treo came out last week.
I quickly loaded a cool little application called Microsoft Voice Command. (I think it comes with it - not sure.)
It's been around for a while and runs on Windows Mobile and Pocket PC Phone Edition. You push a button on the phone and then speak to it. You can say "Call Terry Gold at work", or anyone else that is in your contacts. No training required. I tried 20 different names and it got every one right except for "Dan DeGolier". I have over 900 contacts, so there were a lot to differentiate. Now, I just looked in my contacts and I had Dan in as "Daniel B. DeGolier". When I changed it to "Dan DeGolier" and let it automatically sync, it immediately got it right. (Sorry Dan, the text-to-speech still makes a mess of your last name.)
It isn't just for speed dialing though. You can say things like "What are my appointments", "What calls have I missed", and even "Start Program", where Program is any program you have loaded on your phone. I'm going to see if I can do most of the command and control using just speech recognition. This is what Bill Joy once called "prototyping the future." You figure out some way to live with the technology of the future, and that lets you think even farther ahead.
But back to the second challenge of great speech recognition. The one thing it couldn't recognize was me saying "Display Terry Gold". According to the website, it is supposed to bring up my contact. In fact, it wouldn't work with any other name either. Determined to make it work, I kept at it. No matter how carefully I spoke it, up would pop Media Player and Bill Monroe would start to sing "Long Black Veil". Bill Monroe is the Father of Bluegrass music and I'm an amateur Bluegrass mandolin player, so I took it as a great compliment that Voice Command was getting us confused. After all, we did grow up only 37 miles and 50 years apart.
Since Voice Command had worked so well up to this point, I didn't give up. After figuring out that I could say "Help" and then "Contacts", I realized that the software was actually looking for me to say "Show Terry Gold", not "Display Terry Gold."
My guess is that the developers realized late in the product life cycle that "Display" and "Play" were too similar, especially for guys like me who pronounce "Display" as two words - "Dis" "Play". "Terry Gold" sounds enough like "Bill Monroe" that I can see that mistake. The documentation on the web didn't get changed, and now some people are having a bad experience through no fault of the technology. The product is so great though, that hopefully this won't turn anyone off. I'll see if I can reach them to point out the error.
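To put a rough number on that guess, here is a small sketch that compares command words by spelling. This is a crude stand-in: a real recognizer compares sounds, not letters, and nothing here reflects how Voice Command scores anything. The helper name `spelling_similarity` is made up; `difflib.SequenceMatcher` is from the Python standard library.

```python
from difflib import SequenceMatcher

# Hypothetical sketch: spelling overlap as a crude proxy for how easily
# two command words might be confused. Real recognizers work on sounds,
# so treat the numbers as illustrative only.


def spelling_similarity(word_a, word_b):
    """Return a 0.0 to 1.0 score for how much two words overlap in spelling."""
    return SequenceMatcher(None, word_a.lower(), word_b.lower()).ratio()


if __name__ == "__main__":
    for candidate in ("Display", "Show"):
        score = spelling_similarity(candidate, "Play")
        print(f"{candidate!r} vs 'Play': {score:.2f}")
```

By this measure "Display" and "Play" score about 0.73, because "play" sits whole inside "display", while "Show" and "Play" score 0.0. That fits the idea that "Show" is the safer keyword to put next to a "Play" command.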
Simply having a cheat sheet is a great help with speech recognition devices. That's how I found this mistake - I was making my own little cheat sheet. It is easy enough to just ask for help, but I wanted something on paper that I could have on my desk until I figured out the common commands that I would be using.
I have another application that can recognize hundreds of commands, and it does a great job, but the documentation listed all of the commands in alphabetical order. Again I made my own cheat sheet of the ten commands that I cared about, and now I can't imagine not using the product. I'll bet most people tried it, didn't know exactly what to say, and gave up on it. I'll write about that one another day.
When I first learned the vi editor, someone gave me a dog-eared card of the most common commands. It made all the difference in the world and I was soon raving about how superior vi was to any other editor in the world, especially Emacs. All because of that card. Until speech recognition advances to the point where we really can just say anything, let's see more cheat sheets, more obvious commands and help prompts that don't make the user feel like an idiot.
January 12, 2006 in Speech Recognition
January 03, 2006
Back to work!
This year a lot of people had inflatable snowmen, Santas, Grinches and whatnot in their yards. Am I the only one who finds it a little strange to see so many that look like this during the day? I hope everyone feels better than this guy on the first official workday of 2006.
January 3, 2006 in Current Affairs