At first glance, few technologies feel as unsexy as voice. From a user’s perspective, little has changed since the days of Alexander Graham Bell. Most see voice as a mature technology that simply connects people in real time across a distance. But voice is experiencing a wave of innovation that will fundamentally alter this definition.
During Mobile World Congress, Jae-woan Byun, the CTO of SK Telecom, condemned current voice offerings as “boring for users” but promised a “second tsunami” that could change everything.
The first tsunami was about messaging. It swept away SMS volumes and revenues and resulted in the kind of valuation that Facebook placed on WhatsApp. Thanks to the elimination of the historical limitations that telephony placed on voice, we are already sensing the shockwaves of the next tectonic shift.
Voice will be:
Available every “wear.” Voice is fast becoming a primary interface for wearable technology. Voice will soon become ambient, with audio sensors embedded into our environment: cars, living spaces and workspaces, and fashion accessories. Conversations will follow us from home to car to office, jumping automatically from device to device.
Private and secure. Encryption of voice will become the default, not the exception. Layered security models will include voice biometrics as a standard component. And for our most private conversations and transactions, speech will continuously authenticate us – not simply at the outset of a conversation.
Smartphone-native. Today, the dialer application on a smartphone replicates 1970s touch-tone telephony. The ability to tap, swipe, wave, drag, point, rotate, shake and talk means that powerful new features will be simple and easy to use, in the same way that the iPod made mobile music easy.
Imagine rotating your phone to landscape orientation to turn a 1:1 call into a conference call. Apps will allow easy customization of the voice experience. Your CRM app will handle calls from clients; another will intercept calls when you are roaming and it’s 3 AM; and another will manage calls from the “burner” number you put in an ad to sell your car. Powerful new services will be so easy and intuitive that we won’t even notice a learning curve.
Application-embedded features. Beyond caller ID, inbound voice calls carry little context today. Increasingly, voice calls originate within apps and web pages and are thus full of useful metadata. Moving forward, voice calls will come complete with context, such as where the user is stuck in a business process, allowing organizations to build and continuously refine a fit-for-purpose voice experience.
Beyond the “call.” Sadly, today’s idea of a call still replicates the patterns and limitations of 1876 telephony. We either schedule calls with fixed timing, length and attendees or blindly interrupt people. Future voice communication will mirror the more fluid activity streams on Facebook, Yammer or Google Hangouts. We will invite others into a call as needed, allowing them to jump in and out of conversations seamlessly. Outside calls or cold calls will come with a “conversation request,” where the caller pitches the receiver on why they should answer and invest their time.
Augmented memory & total recall. Voice is about to become recordable by default, and in many contexts and corporations, it already has been for decades. We are moving beyond simple record keeping to active knowledge management via voice. Just as we search our email for past conversations and threads, we will be able to search our voice conversations. Essentially, we will be able to jump to the 15 seconds that mattered in that last call and have perfect recall of all our conversations.
Your intelligent voice assistant. Basic AI technology has offered voice command control for over a decade, and Siri and Google Hotwording have taken that experience to a new level. As intelligent assistants continue to improve and adapt, we can see a future where they join us during the call. They will interpret questions and offer answers, content and ideas in both spoken and visual form. They will also help us perform various administrative tasks, like scheduling a meeting, querying past correspondence or adding a task to our to-do list.
Accessible to all. The next generation of voice services will not only have high-definition audio, but also acoustic profiles customized to us individually and to our environment. We don’t all speak the same languages or dialects, so automated real-time subtitles and translation will become commonplace. One in five people have significant hearing loss, and end-to-end digital cloud-centric hearing aids will remove the “analog gap” for hearing-impaired users.
Voice intersects with a long list of hot topics: the internet of things, search, location services, wearables, security, connected car, big data, quantified self and beyond. As analyst Benedict Evans of Andreessen Horowitz recently tweeted: “It’s kind of ironic that voice is one of the next big things in mobile.”
I would say Evans is partially correct. It’s not just mobile. Voice promises to be the next big thing in communications, period.