Speech recognition technology will hear you now
By Ian Lamont
(IDG) -- Customer service at AirTran Airways has come a long way in the past few months. Prior to this year, if you called the airline -- which operates about 300 flights daily, mostly on the East Coast -- to find out about flight availability or to check on flight delays, you would have waited an average of 7 minutes before your call was picked up by a call center representative. It would have taken an additional two and a half minutes for the AirTran staffer to handle the call.
If you make a similar call to AirTran today, you will be put through in about 2 seconds. And your question will be answered in just over a minute, on average.
The reason for the dramatic reduction in customer handling times -- and associated costs, such as 800 toll charges -- is speech recognition.
After years of hype and false starts, automated speech recognition (ASR) technology is ready for prime time. The trend has become apparent in the customer service arena, especially in the airline industry, which has been one of the most enthusiastic adopters.
Forget listening to Muzak for 5 minutes while waiting on hold, or making choices by punching strings of numbers on the telephone keypad. Several carriers -- including AirTran and United Airlines -- have shifted the burden of relaying flight information stored in databases from customer service reps to automated systems that respond to customers' voices.
"As an organization, we understood the significant revenue and efficiency gains available to those which offer speech-enabled services to enhance consumer lifestyles and dramatically surpass the current limitations of touch-tone technology," says Rocky Wiggins, AirTran CIO.
Besides reducing wait times and cutting 800 toll costs, he says the new ASR system has allowed the airline to transfer its 650 call center employees to sales initiatives, customer retention duties and "countering competitive initiatives." It has also boosted employee morale, he adds.
Other ASR applications are entering the mainstream, too. GMAC Mortgage and others have adopted voice-controlled switchboards, which let callers access telephone extensions and voice mail by speaking a name or department. AOL, Yahoo and others are promoting information-by-phone services, which let users access stock quotes, news headlines and even e-mail. (AOL Time Warner is the parent company of CNN.com.)
"There are numerous applications for speech recognition that will fuel this market over the next year," predicts Elizabeth Herrell, research director at Giga Information Group. Besides automated switchboards and voice content providers, she expects "voice-activated car clients" that can give driving directions based on voice commands.
Herrell notes that the accuracy of ASR technology has improved to the point where users can employ natural speech, rather than stilted menus. "[This] will result in increased market acceptance of speech technology in both business-to-business and business-to-consumer applications," she says.
Larry Whitehead, CTO of voice content portal Audiopoint, says that a proper mix of functionality and ease-of-use is crucial. "We believe the user interface is absolutely the most important challenge," he says. "If the first experience is poor, you have lost the caller forever. But, for the power user, if the service is slow and cumbersome, you will lose that caller as well."
SpeechWorks, one of the major players in the voice-recognition market, realizes the importance of its text-to-speech product sounding "natural." One of its products, Speechify, is used by AOL and Yahoo to read users' e-mail messages over the phone. "Everyone has always wanted that app," says Steve Chambers, a SpeechWorks vice president. "But the quality of the text to speech never really supported such rigorous use. It wasn't quite as natural sounding as people wanted."
Accuracy is also vital to building trust, Chambers says. SpeechWorks' flagship voice recognition product, SpeechWorks 6.5, has achieved between 97 percent and 99 percent accuracy in major customer deployments, and it supports a total of 18 languages. While regional accents are not a problem, he says unusual languages or dialects are: "If you started talking to me with a thick Spanish accent, it could probably understand you. If you started talking in 'Spanglish,' it wouldn't," he says.
Chambers estimates that servicing people through automation can result in dramatic cost savings: Industry averages are typically 30 cents to 45 cents per minute for automated systems, compared with $3.50 per minute for agent-assisted calls. "If we can get our needs met with an automated system and not a live agent, it's far cheaper," he explains, citing reduced staffing costs and shorter toll calls as the main savings points.
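To see what those per-minute figures could mean for a single call, here is a rough Python sketch. The rates are the industry averages Chambers cites; the call lengths are an assumption borrowed from AirTran's own handling times earlier in the story (two and a half minutes with an agent, roughly one minute automated), and the function names are made up for illustration. Hold time and toll charges are left out, so treat the result as a back-of-the-envelope comparison rather than an actual AirTran cost model.

```python
# Per-minute rates quoted in the article, in US cents.
AUTOMATED_CENTS_PER_MIN = (30, 45)  # low and high industry averages
AGENT_CENTS_PER_MIN = 350           # agent-assisted calls


def call_cost_cents(rate_cents_per_min, minutes):
    """Cost of one call in cents: per-minute rate times call length."""
    return rate_cents_per_min * minutes


def savings_ratio(agent_cents, automated_cents):
    """How many times more the agent-assisted call costs."""
    return agent_cents / automated_cents


if __name__ == "__main__":
    # Assumed call lengths (not part of Chambers' estimate).
    agent_minutes = 2.5
    automated_minutes = 1.0

    agent = call_cost_cents(AGENT_CENTS_PER_MIN, agent_minutes)
    for rate in AUTOMATED_CENTS_PER_MIN:
        automated = call_cost_cents(rate, automated_minutes)
        ratio = savings_ratio(agent, automated)
        print(f"automated at {rate} cents/min: ${automated / 100:.2f} per call; "
              f"agent: ${agent / 100:.2f} per call; ratio {ratio:.1f}x")
```

Under those assumed call lengths, an agent-assisted call costs $8.75, against 30 cents to 45 cents for an automated one.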
Still, the initial cost of deploying ASR technology is daunting for many organizations. Chambers says SpeechWorks 6.5 typically costs between $500 and $1,500 per port for installation, while Speechify costs roughly $650 per port.
Giga's Herrell says cost may discourage all but the largest companies from purchasing speech recognition technology. However, she points to hosted voice services as an option for midsized firms.
Hooking up the back-end
William Meisel, president of TMA Associates, a speech industry consulting firm, identifies another obstacle to voice recognition deployments: "The biggest barriers [include] the perception that creating such applications takes talents that corporate IT or telecom departments don't have," Meisel says.
It's true that implementing ASR is not simply a matter of installing an off-the-shelf application. For AirTran, it took eight months of evaluation, coordination and "tuning" before the first call could be handled by the custom speech application in January.
"One of the most difficult deployment tasks, directly correlated to the project's overall success or failure, was the integration of this new technology within an existing infrastructure composed of legacy systems, disparate databases, nonstandard platforms and an enterprise-wide technology upgrade and replacement initiative still in process," AirTran's Wiggins says.
The project began in May 2000, when AirTran contracted SpeechWorks and CommerceQuest, a Tampa, Fla., firm specializing in the integration and deployment of IBM MQSeries and XML technologies. SpeechWorks developers and speech scientists worked with staff from AirTran's reservations, customer service and marketing departments to create a series of "call-flow dialog modules," which, when paired with voice queries, could lead customers through the process of getting information about flight availability and schedules.
Meanwhile, AirTran's telephony specialists and systems administrators worked with CommerceQuest to identify the integration points within the airline's databases. The application call flow is controlled by a SpeechWorks InterVoice/Brite NSP-5000 Voice Response Unit (VRU). After arriving at the VRU, the caller's request is turned into an XML document and sent to the MQSeries server on the outbound message queue. Via MQSeries APIs, AirTran's Bornemann FliteTracDB is queried for flight information. The response takes the reverse path and sits on the inbound message queue at the VRU until the telephone line number that initially sent out the request removes it. If a reply is not received within 5 seconds, the VRU will time out, and the caller will be transferred to a customer service agent.
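For readers who want a feel for that request-and-reply pattern, here is a minimal Python sketch. It is not AirTran's code and does not use the MQSeries APIs: ordinary in-process queues stand in for the message queues, a small dictionary stands in for the flight database, and the XML element names, function names and flight data are all invented for illustration. As a simplification, each telephone line gets its own inbound queue instead of picking its reply off a shared one. Only the 5-second timeout and the fallback to a live agent come from the description above.

```python
import queue
import threading
import xml.etree.ElementTree as ET

REPLY_TIMEOUT_SECONDS = 5  # timeout stated in the article

# Hypothetical data standing in for the airline's flight database.
FLIGHTS = {"123": "on time"}


def build_request(line_number, flight_number):
    """Encode a caller's request as XML (element names are made up)."""
    root = ET.Element("flightQuery", line=str(line_number))
    ET.SubElement(root, "flight").text = flight_number
    return ET.tostring(root, encoding="unicode")


def backend_worker(outbound, inbound):
    """Stand-in for the messaging server plus the database lookup."""
    while True:
        message = outbound.get()
        if message is None:  # sentinel value: shut the worker down
            break
        request = ET.fromstring(message)
        flight = request.findtext("flight")
        if flight not in FLIGHTS:
            continue  # simulate a reply that never arrives
        reply = ET.Element("flightStatus", line=request.get("line"))
        reply.text = FLIGHTS[flight]
        # Route the reply to the line that sent the request.
        inbound[int(request.get("line"))].put(ET.tostring(reply, encoding="unicode"))


def handle_call(line_number, flight_number, outbound, inbound,
                timeout=REPLY_TIMEOUT_SECONDS):
    """Send a request, wait for the reply, and fall back to an agent on timeout."""
    outbound.put(build_request(line_number, flight_number))
    try:
        reply = inbound[line_number].get(timeout=timeout)
    except queue.Empty:
        return "transfer to agent"
    return ET.fromstring(reply).text


if __name__ == "__main__":
    outbound = queue.Queue()
    inbound = {1: queue.Queue(), 2: queue.Queue()}
    worker = threading.Thread(target=backend_worker,
                              args=(outbound, inbound), daemon=True)
    worker.start()
    print(handle_call(1, "123", outbound, inbound))
    # Short timeout here only so the demonstration finishes quickly.
    print(handle_call(2, "999", outbound, inbound, timeout=0.2))
    outbound.put(None)
```

The point of the design is that the caller is never left waiting on a stalled back end: if no reply arrives in time, the call goes to a person.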
However, there are still some bugs to be ironed out. AirTran's automated flight information system responds quickly, but caller interaction is not yet as seamless as speaking with a real person. Pauses and repeated queries occasionally disrupt the flow of customer requests for flight information, and SpeechWorks and CommerceQuest are still fine-tuning the system.
Nevertheless, it is clear that ASR technology has passed out of the realm of science fiction and into the real world. It's already a practical technology that is making strong inroads in corporate America, and stands to change the way many companies do business.