UI Components for Voice UIs on the Web

Ari Nykänen

Oct 06, 2021

Ready-made UI components make development of Voice UIs faster.

The term Voice User Interface (Voice UI) usually refers to a UI that uses voice for both user input and output. Voice UIs are typically built to enable a more efficient user experience. However, we frequently run into problems with voice-only UIs that result in confusion and frustration for users.

At Speechly, we believe that many of the problems with today's Voice UIs can be mitigated or completely eliminated by adopting a multi-modal design philosophy. This means leveraging all the modalities available in the user's context (voice, visual, touch) to make the interaction as easy and smooth as possible. One of the most fascinating platforms for multi-modal Voice UIs is the web, but if you look for design patterns for adding voice features to web applications, you will quickly notice a lack of quality resources.

To make designing and developing Voice UIs on the web easier, we are excited to release some of our research on this topic as a set of ready-made UI components. These components give users visual cues that the Voice UI is working as expected.

4 UI Components for Voice UIs

  • Push-To-Talk Button is a holdable switch for controlling the Voice User Interface.
  • Big Transcript is an overlay-style component that displays the real-time speech-to-text transcript and feedback to the user.
  • Transcript Drawer is an alternative to Big Transcript that slides down from the top of the viewport. It displays usage tips along with the real-time speech-to-text transcript and feedback.
  • Intro Popup is an overlay-style popup that is automatically displayed when the user first interacts with Push-To-Talk Button. It displays a customizable introduction text that briefly explains which voice features the microphone permissions are needed for. Intro Popup also appears automatically to help users recover from common problems.

Our Multi-Modal Design Philosophy helps design better voice-enabled user interfaces

We believe most of the problems that Voice UIs face can be overcome with a multi-modal design philosophy. Below is the multi-modal design philosophy we follow at Speechly. This Design Philosophy Guide should be used as a complementary resource alongside the UI components above when designing or developing a Voice UI.

Chapter 1: Setting the right context

  • Resist the temptation to build an assistant
  • Design the interactions around command & control, not conversation
  • Give visual guidance on what the user can say
  • Use voice ONLY for the tasks it is good for

Chapter 2: Receiving commands from the user

  • Onboard the user
  • When a pressable button is available, a wake word is not needed
  • Prefer a push-to-talk button mechanism
  • Signal clearly when the microphone button is pushed down
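
As a rough illustration of the push-to-talk guidelines above, the sketch below models the button as a small state machine and maps each state to a visible label, so the user can always tell whether the microphone is live. All type and function names here are hypothetical and chosen for illustration; they are not part of the Speechly API, and a real component would also need to handle microphone permissions and errors.

```typescript
// Hypothetical names for illustration only; this is not the Speechly API.
type PttState = "idle" | "listening" | "processing";
type PttEvent = "press" | "release" | "result";

// Allowed transitions. Any event not listed for a state is ignored.
const transitions: Record<PttState, Partial<Record<PttEvent, PttState>>> = {
  idle: { press: "listening" },
  listening: { release: "processing" },
  // Pressing again while processing starts a new utterance.
  processing: { result: "idle", press: "listening" },
};

// Return the next state, or the current state if the event does not apply.
function nextState(state: PttState, event: PttEvent): PttState {
  return transitions[state][event] ?? state;
}

// Visual cue shown on or near the button for each state.
const labels: Record<PttState, string> = {
  idle: "Hold to talk",
  listening: "Listening, release to send",
  processing: "Processing",
};

function buttonLabel(state: PttState): string {
  return labels[state];
}

// Usage: the label changes as soon as the button is pushed down.
console.log(buttonLabel(nextState("idle", "press"))); // prints "Listening, release to send"
```

Keeping the transitions in a plain table makes it easy to wire the same logic to pointer, touch, and keyboard events without duplicating it.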

Chapter 3: Giving feedback to the user

Chapter 4: Recovering from mistakes

  • Show the text transcript in real time
  • Enable corrections both verbally and by using touch
  • Offer an alternative way to complete the task using touch
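
The first two guidelines in this chapter can be sketched as follows: each recognizer update replaces any earlier guess for the same word position, tentative words are rendered differently from final ones, and a correction made by touch goes through the same update path. The `Word` shape and the function names are assumptions made for this example, not the Speechly API.

```typescript
// Hypothetical data shape for illustration only; not the Speechly API.
interface Word {
  index: number; // position of the word in the utterance
  text: string;
  isFinal: boolean; // false while the recognizer may still revise the word
}

// Merge one update into the transcript, replacing any earlier
// guess at the same position and keeping the words in order.
function upsertWord(words: Word[], update: Word): Word[] {
  const rest = words.filter((w) => w.index !== update.index);
  return [...rest, update].sort((a, b) => a.index - b.index);
}

// Render the transcript as plain text. Tentative words are wrapped
// in brackets here; a real UI would more likely dim or italicize them.
function renderTranscript(words: Word[]): string {
  return words.map((w) => (w.isFinal ? w.text : `[${w.text}]`)).join(" ");
}

// Usage: a tentative word arrives, then the user corrects it by touch.
let words: Word[] = [];
words = upsertWord(words, { index: 0, text: "show", isFinal: true });
words = upsertWord(words, { index: 1, text: "blew", isFinal: false });
console.log(renderTranscript(words)); // prints "show [blew]"
words = upsertWord(words, { index: 1, text: "blue", isFinal: true });
console.log(renderTranscript(words)); // prints "show blue"
```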

Free Voice UI Components for Download

You can find more information about these UI components in our documentation. If you would like to access the Speechly UI component design files, they are now available for download in Figma and Sketch formats.

If you have any questions on how to best take advantage of our Voice UI components, please feel free to reach out to the team at

About Speechly

Speechly is a YC-backed company building tools for speech recognition and natural language understanding. Speechly offers flexible deployment options (cloud, on-premise, and on-device), highly accurate custom models for any domain, and privacy and scalability for hundreds of thousands of hours of audio.
