What Problems Can Voice Help Solve?

I have been fascinated by smart speakers and conversational interfaces ever since I spoke to my first Echo.  It reminds me of the first time I browsed the web or used a native iPhone app.  It feels like a completely new medium, with its own set of strengths, weaknesses, and possibilities.  And like any new technology, it is attracting a lot of attention, new ideas, and bold predictions.  To help frame my own thinking and avoid the trap of ‘technology looking for a problem to solve’, I have created a set of use case parameters.  These are based on how people actually use voice, rather than on technical product specs.


So how are people using voice?

It is staggering how quickly voice is being adopted: close to 10% of North American households now own a smart speaker.  Drivers for growth include a low price point, ease of access and usage, strong word-of-mouth, and familiarity with voice search on smartphones; 50% of all searches are projected to be voice by 2020 (comScore, 2017).  The reasons people want a smart speaker include listening to music, asking questions without typing, listening to news and information, and making it easier to do things (Edison Research, 2017).  Most importantly, current owners are largely satisfied: 50% say they use it more now than they did during their first month of ownership, and 63% plan to purchase another (AnswerLab, 2017).  So smart speakers are being adopted, owners are pleased, and usage is expected to increase.  What types of use cases make the most sense for voice?


1. When it is difficult to use your hands

Let’s start with the most obvious.  There are certain situations where it is not possible or practical to use your hands to access a smartphone, such as when you are cooking, driving, or multi-tasking.  Voice is a natural channel to help serve people in these environments and situations. 

Focus on contexts where smartphones are not accessible: 64% of smart speaker owners are interested in having the technology in their car (comScore, 2017).


2. When it is easier than using a mobile app

There are certain types of tasks that are typically completed through a mobile app which can be made more convenient through voice, such as converting measurements or playing a specific song.  Voice commands can simplify some complex interactions and eliminate the multiple screens that a mobile app may require for the same function.

Focus on specific moments of need: “brands need to find their own raw chicken on the hands moments where they make a task 10X easier via Alexa” (via Econsultancy).


3. When the need frequently and regularly arises

There are certain repetitive needs that people regularly encounter which often are part (or become part) of their regular routines, such as checking local news and traffic for commuting.  Popular voice applications today include Flash Briefings and real-time content (such as traffic, weather, and news) that people look for to start every day.

Focus on recurring daily or weekly needs: 72% of people who own a voice-activated speaker say that their devices are often used as part of their daily routine (Google, 2017).


4. When the need is clear and easy to express

There are many situations where a query or task is very clear and simple to communicate through natural language, such as setting a reminder or adding something to a shopping list.  Voice users are frustrated when devices do not understand their queries, so topics that are nuanced, or that people describe using varied terminology, do not work as well as conventional queries.

Focus on needs that people can easily communicate: 70% of requests to the Google Assistant are expressed in natural language, not typical keywords used for web search (Google, 2017).


5. When the required input is simple

There are certain types of questions and tasks that do not require much input from people, such as when you are setting a timer or asking for the local weather forecast.  Voice interactions do not work well when a user is required to provide more than one data input, such as completing multiple fields in a form. 

Focus on requiring the minimal amount of input necessary to complete a task: 59% of people who do not own a smart speaker feel that the devices are intrusive and seek too much personal information (Capgemini, 2017).


6. When the desired output is simple

There are certain types of questions or tasks where people are looking for a single answer or action rather than options or details, such as when you are looking for a movie time at a specific theatre.  Voice users typically do not expect to spend much time during a single interaction and are not taking down notes from a voice-delivered response.

Focus on needs where a ‘single best answer’ is acceptable: 45% of voice-speaker owners report that they do not make purchases through the device because they cannot see product details (comScore, 2017).


7. When it is socially acceptable and additive

There are certain situations where an activity or answer might add to a conversation or group setting, such as accessing a trivia game or finding an answer to a question that may resolve a friendly debate.  People are hesitant to share potentially embarrassing questions or information in a social setting and may not trust smart speakers with keeping information secure.

Focus on use cases that do not require personally sensitive information: 89% of voice-speaker users agree that they are comfortable talking to a voice assistant when they are alone vs. 47% who are comfortable in a social setting (Capgemini, 2017).


8. When the need involves a specific location

There are certain queries or tasks that are specific to different rooms in a person’s house, such as scheduling a morning alarm in the bedroom and controlling the thermostat in the living room.  Voice applications may be designed around the specific needs associated with different rooms, as evidenced by the popularity of recipes being used via smart speakers placed in kitchens.

Focus on use cases that may arise in popular speaker locations: 21% of owners of smart speakers have the devices in their kitchen, and 19% have the devices in their master bedroom (Edison Research, 2017).


9. When it involves quick status updates

There are certain instances where a person may be interested in accessing a quick and current update, such as when a package will be delivered or when a person is expected to arrive home.  Voice is a more natural channel for receiving an update than for browsing for a product or completing a new transaction that involves product photos and credit card payment.

Focus on servicing over transactions: 49% of customers would like to interact with smart speakers to check delivery status vs. 35% to make a purchase (Capgemini, 2017).


10. When a digital service already exists

There are a number of digital services that people already use frequently through multiple access points, including browsers and mobile apps, and that present new use cases for voice, such as streaming audio, wearable fitness devices, and ride-sharing.  Voice applications have gained traction where people have a relationship with an existing digital service, particularly smart home devices and audio content subscriptions.

Focus on combining voice with screens and other access points: 65% of owners of smart speakers listen to more music, 28% listen to more news, and 20% listen to more podcasts (Edison Research, 2017).