Designing Voice UIs

Ottomatias Peura

Sep 09, 2020

3 min read

Designing voice-first applications requires new approaches to UX and UI design. In this post, we'll go through some best practices for designing voice-driven user interfaces.

In the early days of computing, human-computer interaction meant a white (or green!) box blinking on a black screen. We have come a long way since then. The first revolution came with the mouse and graphical user interfaces (GUIs), the next with touchscreens and mobile phones. Now we are entering the era of voice user interfaces (VUIs). What does this mean from a UI designer's point of view?

In this blog post, I'll share some dos and don'ts to consider when designing a voice user interface. You can find more tips for voice design in our guide.

1 Don't try to imitate human conversation

Conversational AI is a hot buzzword, but most people don't want to converse with their computers. Most if not all of us have better friends, even though we spend more time with our computers and mobile phones than with any of them.

Conversational AI most often refers to chatbots. Google defines chatbots as "a computer program designed to simulate conversation with human users, especially over the Internet". These certainly count as voice user interfaces, but most of the time you don't need the bi-directional "talking" that chatbots offer.

2 Update visual UI as the user speaks

Most voice user interfaces are based on a simple question-answer pattern. The user asks something, the system waits, and then something happens. Often this something is the voice assistant answering back in speech.

This is a problem, because voice is an interrupting channel. If you've ever had a conversation, you know that both parties should not speak at the same time. The result is noise in which neither understands the other.

However, we humans can easily speak and digest visual information simultaneously. Voice UI design should take advantage of this.

While the user is speaking, the user interface should continuously update to reflect what has been said so far. For example, if the user says something like "Show me red t-shirts from Hugo Boss in size large", the UI can be updated three times: first showing red t-shirts, then red t-shirts from Hugo Boss, and finally large red t-shirts from Hugo Boss.

This encourages the user to continue with the voice experience and enables the user to correct themselves naturally.
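The incremental update described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the catalog, the `KEYWORDS` table and all function names are hypothetical, and naive substring matching stands in for a real streaming speech recognizer and language understanding component. The sketch replays an utterance word by word and reports a refresh every time the set of recognized filters changes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    name: str
    category: str
    brand: str
    color: str
    size: str


# Hypothetical catalog used only for this illustration.
CATALOG = [
    Product("Classic tee", "t-shirt", "Hugo Boss", "red", "large"),
    Product("Classic tee", "t-shirt", "Hugo Boss", "red", "small"),
    Product("Logo tee", "t-shirt", "Acme", "red", "large"),
    Product("Chino", "trousers", "Hugo Boss", "blue", "large"),
]

# Hypothetical keyword table: spoken phrase -> (product field, value).
KEYWORDS = {
    "t-shirts": ("category", "t-shirt"),
    "red": ("color", "red"),
    "hugo boss": ("brand", "Hugo Boss"),
    "large": ("size", "large"),
}


def extract_filters(partial_transcript: str) -> dict[str, str]:
    """Return the filters recognizable in the transcript so far (naive substring match)."""
    text = partial_transcript.lower()
    return {
        field: value
        for phrase, (field, value) in KEYWORDS.items()
        if phrase in text
    }


def apply_filters(filters: dict[str, str]) -> list[Product]:
    """Return the products that satisfy every active filter."""
    return [
        product
        for product in CATALOG
        if all(getattr(product, field) == value for field, value in filters.items())
    ]


def simulate_updates(utterance: str):
    """Replay the utterance word by word and yield a UI refresh whenever the filters change."""
    words = utterance.split()
    last_filters: dict[str, str] = {}
    for end in range(1, len(words) + 1):
        partial = " ".join(words[:end])
        filters = extract_filters(partial)
        if filters != last_filters:
            last_filters = filters
            yield partial, apply_filters(filters)


if __name__ == "__main__":
    utterance = "Show me red t-shirts from Hugo Boss in size large"
    for partial, products in simulate_updates(utterance):
        print(f"{partial!r}: {len(products)} product(s) shown")
```

Because the view is recomputed from the whole partial transcript each time, a correction by the user simply produces a new filter set on the next refresh. A real implementation would replace the word-by-word replay with partial results from a speech recognizer.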

3 Show transcript

The transcript is the most important feedback the user needs when using a voice UI. If the transcript is not shown, the user can't be sure whether they are being listened to at all. Most importantly, without the transcript the user has no way to understand why they were misunderstood, if that's the case.

Always show the transcript within the user's field of vision. When the user activates the microphone, the transcript should be clearly visible in a natural place, either close to the microphone button or at the top of the screen.

4 Give visual cues on what the user can say

One big issue with voice assistants is that the user can't know what they can and can't achieve with the device without trying.

If you ask the assistant to set an alert, it works great. It also knows when Michael Jackson died. But does a Google smart speaker know how many steps you took yesterday according to Google Fit? It seems not. The user can't know this beforehand, because the smart speaker doesn't give any visual tips on what is possible.

In a real-world application, one simple pattern that works well is a form. Let's say you want to book a flight and you are presented with a form that has input fields for from, to, date, class and some other information.

Now it's very clear to the user what they should say and in what context they are commanding the system.
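To make the form idea concrete, here is a minimal Python sketch of a flight booking form that is filled from speech. Everything in it is an assumption made for illustration: the field names follow the example above, while the regular expressions, `fill_form` and `missing_fields` are hypothetical stand-ins for a real language understanding component. The point is that the visible form doubles as the list of things the user can say, and the empty fields tell them what is still missing.

```python
import re
from typing import Optional

# Form fields taken from the flight booking example; visible to the user at all times.
FIELDS = ("from", "to", "date", "class")

# Hypothetical patterns standing in for a real language understanding component.
PATTERNS = {
    "from": re.compile(r"\bfrom (\w+)", re.IGNORECASE),
    "to": re.compile(r"\bto (\w+)", re.IGNORECASE),
    "date": re.compile(r"\bon (\w+)", re.IGNORECASE),
    "class": re.compile(r"\bin (\w+) class", re.IGNORECASE),
}

Form = dict[str, Optional[str]]


def empty_form() -> Form:
    """Return a form with every field unset."""
    return {field: None for field in FIELDS}


def fill_form(form: Form, utterance: str) -> Form:
    """Return a copy of the form with every field mentioned in the utterance filled in."""
    updated = dict(form)
    for field, pattern in PATTERNS.items():
        match = pattern.search(utterance)
        if match:
            updated[field] = match.group(1)
    return updated


def missing_fields(form: Form) -> list[str]:
    """List the fields the UI should still highlight as open."""
    return [field for field in FIELDS if not form.get(field)]


if __name__ == "__main__":
    form = fill_form(empty_form(), "I want to fly from Helsinki to London")
    print(form, "still missing:", missing_fields(form))
    form = fill_form(form, "on Friday in business class")
    print(form, "still missing:", missing_fields(form))
```

The user can fill the fields in any order and over several utterances, and a later utterance overwrites an earlier value, which gives a natural way to correct a mistake.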
