Ready-made UI components make development of Voice UIs faster.
The term Voice User Interface (Voice UI) usually refers to a UI that uses voice for both user input and output. Voice UIs are typically built to make the user experience more efficient. However, voice-only UIs frequently cause confusion and frustration for users.
At Speechly, we believe that many of the problems that exist with Voice UIs today can be mitigated or completely eliminated by adopting a multi-modal design philosophy. This means leveraging all the available modalities (voice, visual, touch) of the user's context to make the interaction as easy and smooth as possible. One of the most fascinating platforms for multi-modal Voice UIs is the web, but if you look for design patterns for adding voice features to web applications, you will quickly notice a lack of quality resources.
To make designing and developing Voice UIs on the web easier, we are excited to release some of our research on this topic as a set of ready-made UI components. These components give users visual cues that the Voice UI is working as expected.
4 UI Components for Voice UIs
Push-To-Talk Button is a holdable switch for controlling the Voice User Interface.
Big Transcript is an overlay-style component that displays the real-time speech-to-text transcript and feedback to the user.
Transcript Drawer is an alternative for Big Transcript that slides down from the top of the viewport. It displays usage tips along with the real-time speech-to-text transcript and feedback.
Intro Popup is an overlay-style popup that is automatically displayed when the user first interacts with Push-To-Talk Button. It displays a customizable introduction text that briefly explains the voice features that microphone permissions are needed for. Intro Popup also appears automatically to help the user recover from common problems.
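To make the "holdable switch" idea concrete, here is a minimal TypeScript sketch of the press-and-hold logic behind a push-to-talk button. It is an illustration only, not Speechly's implementation: `PushToTalkController`, `PushToTalkCallbacks`, `onStart`, and `onStop` are hypothetical names chosen for this example, and the callbacks stand in for whatever opens the microphone and shows the transcript in a real application.

```typescript
// Hypothetical sketch: none of these names come from Speechly's libraries.
type TalkState = "idle" | "listening";

interface PushToTalkCallbacks {
  onStart: () => void; // e.g. start listening and show the transcript
  onStop: () => void; // e.g. stop listening and hide the transcript
}

class PushToTalkController {
  private state: TalkState = "idle";
  private readonly callbacks: PushToTalkCallbacks;

  constructor(callbacks: PushToTalkCallbacks) {
    this.callbacks = callbacks;
  }

  // Current state, so the UI can signal clearly when the button is held down.
  get current(): TalkState {
    return this.state;
  }

  // Called when the button is pressed down; repeated presses are ignored.
  press(): void {
    if (this.state === "listening") return;
    this.state = "listening";
    this.callbacks.onStart();
  }

  // Called when the button is released; a release while idle is ignored.
  release(): void {
    if (this.state === "idle") return;
    this.state = "idle";
    this.callbacks.onStop();
  }
}

// Usage: record the callbacks fired during one press-and-hold interaction.
const events: string[] = [];
const controller = new PushToTalkController({
  onStart: () => events.push("start"),
  onStop: () => events.push("stop"),
});

controller.press();
controller.press(); // ignored: already listening
controller.release();
```

In a browser, one plausible wiring is to call `press()` from the button's `pointerdown` handler and `release()` from its `pointerup` and `pointercancel` handlers, so the switch behaves the same with mouse and touch input.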
Our Multi-Modal Design Philosophy helps design better voice-enabled user interfaces
We believe most of the problems that Voice UIs face can be overcome with a multi-modal design philosophy. Below is the multi-modal design philosophy we follow at Speechly. This Design Philosophy Guide should be used as a complementary resource alongside the UI components above when designing or developing a Voice UI.
Chapter 1: Setting the right context
Resist the temptation to build an assistant
Design the interactions around command & control, not conversation
Give visual guidance on what the user can say
Use voice ONLY for the tasks it is good for
Chapter 2: Receiving commands from the user
Onboard the user
When a pressable button is available, a wake word is not needed
Prefer a push-to-talk button mechanism
Signal clearly when the microphone button is pushed down
Enable corrections both verbally and by using touch
Offer an alternative way to complete the task using touch
Free Voice UI Components for Download
You can find more information about these UI components inside our documentation. If you would like to access the Speechly UI component design files, they are now available in Figma and Sketch for download.
If you have any questions on how to best take advantage of our Voice UI components, please feel free to reach out to the team at design@speechly.com.
About Speechly
Speechly is a YC-backed company building tools for speech recognition and natural language understanding. Speechly offers flexible deployment options (cloud, on-premise, and on-device), highly accurate custom models for any domain, and privacy and scalability for hundreds of thousands of hours of audio.