Over the past year, Microsoft has been advancing its new vision of “mobile first/cloud first.” Voice is emerging as a key player and may ultimately turn into a key profit generator for the company. Microsoft’s growing portfolio and power in voice applications are going to have a significant impact upon emerging SmartVoice companies while telephone companies continue to travel down a “dumb pipe” road.
Three key examples of Microsoft’s growing voice technology portfolio are the Cortana intelligent assistant, Skype Translator and support of WebRTC in Internet Explorer. Each part blends well with the mobile first/cloud first (M1C1) vision, and the underlying technologies can be used as building blocks for more sophisticated value-added services.
Microsoft Cortana, an intelligent personal assistant first introduced for Windows Phone and Windows wearable tech, is now expected to proliferate to Windows 10 and the Xbox One. People will be able to access it through any Windows platform, including tablets, notebooks and desktop PCs. Cortana is currently region-specific, adapting to the global markets it is (virtually) located in, be it the U.S., the U.K., or China, with global access available by 2015.
Cortana is currently steered to the personal assistant role, but imagine the behind-the-scenes work going on to meld it into a more business-centric approach. Imagine Cortana being teamed with a Lync/Skype for Business cloud service to provide a natural language IVR experience ahead of anything currently on the market. After that, add a business customization module for Cortana to schedule conference calls, meetings, calendar updates, and links to customer relationship management (CRM) database services. You may even be able to verbally flip Cortana between personal mode and business mode, with the personal assistant putting on its “business hat” when you are dealing with work issues.
Skype Translator conducts real-time transcription and translation from one language to another. Businesses can use real-time transcription for conference calls, even without translation. Again, imagine a Lync/Skype for Business cloud offering low-cost or even “free” real-time transcription and call recording for 30 days. Mid-sized and larger businesses will, in the future, be able to access a Microsoft-based voice analytics service; the company already has demonstrated cloud analytics services for Internet of Things (IoT) applications, so voice is simply shifting data types and conducting a bit more processing.
Call recording and transcription also lead to the world of Hypervoice applications, with voice recordings easily searchable via keyword index. Microsoft has to be aware of the Hypervoice consortium and the early applications built around voice-as-searchable, so it’s not unreasonable to think the company might offer its own spin on Hypervoice in the future.
WebRTC may be the common “glue” to bind and fit pieces together outside of the Windows ecosystem. Microsoft is incorporating WebRTC into Internet Explorer and other services. Providing WebRTC support also means Android and iOS devices can access Microsoft-hosted voice-based applications, with the end-game being more mobile devices—regardless of operating system—feeding into the Microsoft cloud.
At some point, I expect Microsoft to open up access to its voice processing services via API. The WebRTC world has established a market for “API as a service,” and there are hordes of developers who would love cost-effective access to natural language processing and call transcription as a start. Voice analytics and Hypervoice services are likely to remain Microsoft-owned (closed) for the short term.
The only piece missing from Microsoft’s voice portfolio is voice biometrics. Is the company working on its own service, planning to buy someone not named “Nuance,” or simply going to work out some strategic licensing? Voice identification is a moving target. I have no doubt Microsoft has done some work in the arena, but I’m not sure the company is prepared to commit the resources to make voice biometrics a business line. On the other hand, it has a lot of cloud service cycles to fill up.