The ability to interact with computers in our natural language is a dream as old as the computer itself. In fact, we rely mainly on a keyboard and mouse because, for a long time, they were the only way technology allowed us to enter information. This explains why the launch of Siri in 2011, which integrated a voice assistant into a consumer device for the first time, generated renewed interest in voice interfaces. Since then, other tech companies have followed, resulting in the release of Amazon Echo (the first standalone device for voice assistants), Google Home, Microsoft’s Cortana, and many others.
These voice services are gaining traction mainly because of progress in the fields of AI and NLU (Natural Language Understanding), made possible by the vast datasets that became available with the advent of cloud computing in the last decade. However, services relying on voice recognition are not completely new in computer science. The first consumer dictation software and telephone IVR systems date back to the early 90s, and basic word recognition attempts are even older. These interfaces have been present in our lives for a few decades, although they have only now found their main role as AI-powered voice assistants.
Today, it is estimated that 46% of Americans use voice assistants (either on their smartphone or on a standalone device). Since the launch of the Amazon Echo in 2014, more than 15,000 third-party Alexa Skills (the name given to Alexa’s voice apps) have been developed. From ordering a pizza at Domino’s to calling an Uber, top brands are already using voice-enabled apps to offer new services to their users.
Voice apps have also proved valuable for a variety of industries and purposes: not only by adding voice capabilities to existing services, but also by creating completely new experiences and narratives. The television network HBO recently offered a successful example of this with “The Maze”, a voice game for Alexa based on its popular series Westworld.
“Voice is the next frontier of interactive storytelling, with a powerful ability as a marketing tool to deepen engagement with our fanbases.” Sabrina Caluori, Senior Vice President, HBO Digital and Social Marketing.
To fully appreciate the potential of voice interfaces, you need to understand their advantages over classic graphical interfaces. These advantages are manifold and stem from the nature of voice and sound itself.
Understanding the properties of voice
The main difference from Graphical User Interfaces is the omnidirectional property of audio. Sound can’t be ignored as easily as a visual stimulus, which you can shut out by closing your eyes or looking away. A device such as the Amazon Echo can be placed in a room and accessed from anywhere in that environment, regardless of the user’s position and direction. This advantage is even greater in contexts where users can’t use their hands, for example when driving or cooking, and it can also help people with visual disabilities.
Increasing efficiency in user interactions
Voice interaction offers a directive-input modality: users can ask the device directly for what they need, in their own terms, without all the unnecessary information that would clutter a visual interface and make the experience more frustrating. This allows us to simply say “Alexa, add a 2:30 p.m. meeting to my calendar tomorrow”, instead of opening our device, navigating to the calendar app, finding the appropriate day and time, clicking to create an event, naming it, and finally clicking “Create”.
Brands can finally talk 1:1 with their customers
Last but not least, voice gives brands the opportunity to convey emotions and personality. As humans, we have developed a fine ability to recognize attributes in a voice such as age, gender, personality, the speaker’s intentions and emotional state, regional background, and more. Spoken messages carry much more information about emotion and intent, conveyed through tone of voice. Text conversation, in comparison, often needs additional indicators such as punctuation and elaborate wording to convey emotions, and even then it can leave room for ambiguity.
Therefore, setting the user’s expectations correctly is fundamental. Designing the perfect voice experience goes hand in hand with the technology itself (and its constant evolution). A voice assistant with a realistic voice will seem more like a real person and, in turn, our brain will both set higher expectations and judge it more severely if, for example, one of our instructions gets misinterpreted. It is also worth noting that products that closely resemble real humans, but without succeeding completely, often fall into the so-called “uncanny valley”, feeling strange or possibly creepy to the user.
For a company, a voice app can be like having an employee talking directly to its customers. It’s therefore fundamental to get the voice and tone right in order to convey the brand’s values in every interaction. Inconsistencies between the voice and other brand touch points may cause confusion and make the brand feel off. Creating Design Systems today is not only about how a brand should look, but also about how it should talk.
What can Voice Interfaces do for your brand?
Digital assistants are becoming more present in our digital life and are contributing to a shift in the way we interact with technology. There isn’t any value in creating a voice assistant with no purpose just because other people are doing it. However, if done right, voice services can offer customers new and innovative touch points. To create real impact for their customers, organizations need to invest time and resources in gaining a deep understanding of their users’ needs, developing a set of ideas that could make a difference in people’s brand experience and, finally, prototyping and testing them. This way, they will be able to understand how Voice Interface solutions could contribute to their brand.
Originally published at www.edenspiekermann.com.