Voice User Interfaces (Voice UIs) are interfaces that use voice for both user input and output. They are typically built to enable a more efficient user experience. However, voice-only UIs frequently cause confusion and frustration for users.
At Speechly, we believe that many of the problems with today's Voice UIs can be mitigated or eliminated entirely by adopting a multi-modal design philosophy: leveraging all the modalities available in the user's context (voice, visual, touch) to make the interaction as easy and smooth as possible. One of the most fascinating platforms for multi-modal Voice UIs is the web, but if you look for design patterns for adding voice features to web applications, you will quickly notice a lack of quality resources.
To make designing and developing Voice UIs on the web easier, we are excited to release some of our research on this topic as a set of ready-made UI components. These components give users visual cues that the Voice UI is working as expected.
4 UI Components for Voice UIs
- Push-To-Talk Button is a holdable switch for controlling the Voice User Interface.
- Big Transcript is an overlay-style component that displays the real-time speech-to-text transcript and feedback to the user.
- Transcript Drawer is an alternative for Big Transcript that slides down from the top of the viewport. It displays usage tips along with the real-time speech-to-text transcript and feedback.
- Intro Popup is an overlay-style popup that is automatically displayed when the user first interacts with Push-To-Talk Button. It displays a customizable introduction text that briefly explains the voice features for which microphone permissions are needed. Intro Popup also reappears automatically to help the user recover from common problems.
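To make the behavior concrete, here is a minimal sketch in plain TypeScript of the interaction model these components visualize: a hold-to-talk button whose first press triggers the intro popup, and whose later presses open the microphone directly. This is not the actual library code; every name here is a hypothetical illustration.

```typescript
// Hypothetical model of the interaction states the components visualize.
type VoiceUIState = "idle" | "intro" | "listening";

interface VoiceUI {
  state: VoiceUIState;
  transcript: string;
  hasSeenIntro: boolean;
}

// First press shows the intro popup (the microphone-permission explainer);
// subsequent presses go straight to listening with a cleared transcript.
function press(ui: VoiceUI): VoiceUI {
  if (!ui.hasSeenIntro) {
    return { ...ui, state: "intro", hasSeenIntro: true };
  }
  return { ...ui, state: "listening", transcript: "" };
}

// Releasing the holdable button stops listening.
function release(ui: VoiceUI): VoiceUI {
  return ui.state === "listening" ? { ...ui, state: "idle" } : ui;
}

const initial: VoiceUI = { state: "idle", transcript: "", hasSeenIntro: false };
```

The point of modeling it this way is that every state has an obvious visual counterpart: the popup, the depressed button, the live transcript overlay.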
Our Multi-Modal Design Philosophy helps you design better voice-enabled user interfaces
We believe most of the problems facing Voice UIs can be overcome with a multi-modal design philosophy. Below is the design philosophy we follow at Speechly. Use this Design Philosophy Guide as a complementary resource alongside the UI components above when designing or developing a Voice UI.
Chapter 1: Setting the right context
- Resist the temptation to build an assistant
- Design the interactions around command & control, not conversation
- Give visual guidance on what the user can say
- Use voice ONLY for the tasks it is good for
Chapter 2: Receiving commands from the user
- Onboard the user
- When a pressable button is available, a wake word is not needed
- Prefer a push-to-talk button mechanism
- Signal clearly when the microphone button is pushed down
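The hold-to-talk mechanism above can be sketched as plain event wiring. The `MicControls` interface and its callbacks are assumptions for illustration, not the library's API; in a browser you would bind `onPress` and `onRelease` to `pointerdown` and `pointerup`/`pointercancel` on the button element.

```typescript
// Sketch of a hold-to-talk wiring: the microphone is open only while the
// button is held, and the held state is signaled visually the whole time.
interface MicControls {
  startListening: () => void;
  stopListening: () => void;
  setVisualState: (state: "idle" | "held") => void;
}

function attachPushToTalk(controls: MicControls) {
  let held = false;
  return {
    // e.g. bound to pointerdown (or keydown on the space bar)
    onPress() {
      if (held) return;
      held = true;
      controls.setVisualState("held"); // signal clearly that the mic is open
      controls.startListening();
    },
    // e.g. bound to pointerup, pointercancel, and pointerleave
    onRelease() {
      if (!held) return;
      held = false;
      controls.setVisualState("idle");
      controls.stopListening();
    },
  };
}
```

Binding release to `pointercancel` and `pointerleave` as well as `pointerup` matters on touch devices: if the finger slides off the button, the microphone must still close.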
Chapter 3: Giving feedback to the user
- Show the text transcript in real time
Chapter 4: Recovering from mistakes
- Enable corrections both verbally and by using touch
- Offer an alternative way to complete the task using touch
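Real-time transcripts and touch corrections can be modeled as two small pure functions. These are hypothetical helpers for illustration, not part of the component library: one appends the still-changing interim segment to the finalized segments for live display, and the other lets a touch edit replace a finalized segment in place.

```typescript
// Live display: finalized segments plus the current interim (still-changing)
// segment, so the user sees text appear as they speak.
function renderTranscript(finalSegments: string[], interim: string): string {
  return [...finalSegments, interim].filter(Boolean).join(" ");
}

// Touch correction: replace one finalized segment without re-dictating
// the whole utterance.
function correctSegment(
  finalSegments: string[],
  index: number,
  text: string
): string[] {
  return finalSegments.map((s, i) => (i === index ? text : s));
}
```

Keeping the transcript segmented (rather than as one string) is what makes per-segment touch corrections cheap to offer.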
Free Voice UI Components for Download
You can find more information about these UI components in our documentation. If you would like the Speechly UI component design files, they are available for download in both Figma and Sketch formats.
If you have any questions on how to best take advantage of our Voice UI components, please feel free to reach out to the team at firstname.lastname@example.org.