Hey Siri, Tell Me About Advanced Speech Recognition

Within a very short time, consumers have turned speech into a preferred way of interacting with their devices. Today, voice-enabled applications such as Siri, Alexa, and Google Assistant are increasingly woven into the fabric of everyday life. In the last year alone, the U.S. market for smart speakers has grown 128 percent. This growth is expected to continue: advanced speech recognition, text-to-speech, and speaker verification constitute a market that will be valued at $18.3 billion by 2023, according to Markets and Markets’ Speech and Voice Recognition Forecast Report.

This dynamic market presents tantalizing opportunities for communications service providers (CSPs). And they’re well-positioned: CSPs already have millions of subscribers, both consumer and business, using their networks every day for voice and video services. Devices with speech interfaces are now nearly ubiquitous, with “nearly” being the operative word. As “nearly” is eclipsed, CSPs can bridge the gap for speech enablement anywhere, anytime by offering speech recognition services in the context of a live voice or video call. Doing so creates the opportunity to offer value-added applications that generate new revenue.

As the adoption of speech recognition capabilities continues to grow at a rapid pace, however, several key factors will prove critical to success. These include implementation costs, quality of experience, responsiveness, and streamlined user interfaces.

Market Overview for Speech Recognition

While speech recognition technology has actually been around for decades, recent advances have driven dramatic evolution (and the corresponding adoption) within the last five years. Machine learning has finely tuned recognition accuracy, and cost-effective deployment of mobile and other small form factor devices has driven adoption. As a result, speech is now the de facto—and, in some cases, the only—user interface for a number of devices. At scale, however, these solutions can be expensive, increasing the need for a new solution—or, more aptly, new solutions.

Figure 1. History of Speech Recognition Technologies

According to Strategy Analytics, the hot markets today for advanced speech recognition include:

In-home smart home assistants, such as Amazon Echo or Google Home Control
On-the-road smartphone assistants, such as Siri on Apple devices and
Vocabulary- and context-specific industry verticals, such as heads-up displays for surgery or in-vehicle speech interfaces for hands-free navigation

Another growing market and major opportunity for service providers is “in-call” speech recognition. These in-call capabilities can support person-to-person interactions, person-to-machine interactions, person-to-bot interactions and more. They are able to serve the requests of callers who are already having a conversation or are on a conference call. During the call, callers can easily invoke the application to dial someone to join a phone call or to record the conference call. To add these capabilities to their networks, however, service providers need to overcome inherent challenges.
