Examples of Natural Voice User Interfaces

Ottomatias Peura

Feb 08, 2021

Reactive voice user interfaces enable intuitive and efficient experiences that improve key metrics

In this post, we'll go through some use cases and examples for natural voice user interfaces in various domains and user tasks.

Few apps or websites employ a voice user interface yet, largely because of a lack of developer tools for building them. These examples use Speechly Spoken Language Understanding technology for a natural voice UI, enhancing the touch screen user experience with voice functionalities.

Speechly offers a unique tool for building real-time multimodal voice user interfaces. Our technology can be applied to any industry or domain to enhance current touch user interfaces with voice functionalities.

Speechly offers a Spoken Language Understanding API that returns user intents and entities in real time for user voice input. This approach enables end users to see the result of their voice commands visually as they speak instead of the traditional smart speaker paradigm that is based on turn-based queries.

Real-time visual feedback is the key to efficient and intuitive user interfaces, because it allows users to multitask. Instead of saying something and waiting for the answer, end users can speak in a stream-of-consciousness fashion, correcting themselves if needed. In addition, the visual feedback encourages users to go on with the voice experience.
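To make the idea concrete, here is a minimal sketch of how a UI could fold incremental intent and entity updates into the state it renders. The SegmentUpdate and ViewState shapes and the applyUpdate function are hypothetical names invented for this illustration; they are not Speechly's actual client API. The point is only that a later value for the same entity replaces the earlier one, so a spoken self-correction becomes visible immediately.

```typescript
// Hypothetical shapes for illustration only; NOT Speechly's actual client types.
interface SpokenEntity {
  type: string;  // e.g. "color"
  value: string; // e.g. "blue"
}

interface SegmentUpdate {
  intent: string;
  entities: SpokenEntity[];
  isFinal: boolean; // true once the utterance is complete
}

interface ViewState {
  intent: string | null;
  slots: Record<string, string>;
  committed: boolean;
}

const emptyView: ViewState = { intent: null, slots: {}, committed: false };

// Fold one incremental update into the state the UI renders.
// A later value for the same entity type overwrites the earlier one,
// which is how a spoken self-correction shows up on screen.
function applyUpdate(state: ViewState, update: SegmentUpdate): ViewState {
  const slots: Record<string, string> = { ...state.slots };
  for (const entity of update.entities) {
    slots[entity.type] = entity.value;
  }
  return { intent: update.intent, slots, committed: update.isFinal };
}

// Simulated stream for an utterance like "show me red, no, blue sneakers".
const simulatedUpdates: SegmentUpdate[] = [
  { intent: "filter", entities: [{ type: "color", value: "red" }], isFinal: false },
  { intent: "filter", entities: [{ type: "color", value: "blue" }], isFinal: false },
  {
    intent: "filter",
    entities: [
      { type: "color", value: "blue" },
      { type: "product", value: "sneakers" },
    ],
    isFinal: true,
  },
];

const finalView = simulatedUpdates.reduce(applyUpdate, emptyView);
```

In a real application each call to applyUpdate would be followed by a re-render, so the user sees "red" appear and then change to "blue" while still speaking.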

We have collected some examples of user tasks that can be solved more effectively with our voice technology. In short, a voice user interface works great if:

  • Your users know what they want to achieve
  • Data quality is important
  • User tasks are repetitive

Let's see our examples.

Form filling by voice

Voice is a great solution for information-heavy, repetitive tasks such as form filling. Filling forms on a mobile device can be cumbersome because typing is difficult and usability issues are common across different screen sizes and mobile browsers.

In our demo, we enhanced an existing HTML form with voice functionalities. The form can be manipulated by using touch and voice simultaneously. The end user can use natural language to fill the form and gets instantaneous feedback on the form.

By seeing the form, the user knows exactly what kind of questions they need to answer, and they can fill the form in any order and by using any interaction modality.
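As a rough sketch of this pattern, the snippet below maps recognized entities onto form fields while leaving fields the user filled by touch untouched. The entity type names, the ENTITY_TO_FIELD table, and fillFromVoice are assumptions made up for this example, and the form is modeled as a plain object rather than real DOM inputs so the snippet stays self-contained.

```typescript
// Form values keyed by field name; a stand-in for real <input> elements.
type FormValues = Record<string, string>;

// Hypothetical entity shape, not taken from any real SDK.
interface FormEntity {
  type: string;
  value: string;
}

// Assumed mapping from entity types to form field names.
const ENTITY_TO_FIELD = new Map<string, string>([
  ["person_name", "name"],
  ["phone_number", "phone"],
  ["email_address", "email"],
]);

// Return a new form state with every recognized entity written to its field.
// Entities can arrive in any order; unknown entity types are ignored, and
// fields not mentioned by voice keep whatever the user typed or tapped.
function fillFromVoice(form: FormValues, entities: FormEntity[]): FormValues {
  const next: FormValues = { ...form };
  for (const entity of entities) {
    const field = ENTITY_TO_FIELD.get(entity.type);
    if (field !== undefined && field in next) {
      next[field] = entity.value;
    }
  }
  return next;
}

// The email was typed by touch; phone and name are then given by voice,
// in a different order than the form lists them.
const touchedForm: FormValues = { name: "", phone: "", email: "ada@example.com" };
const filledForm = fillFromVoice(touchedForm, [
  { type: "phone_number", value: "555 0100" },
  { type: "person_name", value: "Ada Lovelace" },
  { type: "favorite_color", value: "green" }, // no matching field, ignored
]);
```

Returning a new object instead of mutating the form keeps the function easy to call on every incremental update, which suits a UI that re-renders as the user speaks.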

eCommerce search filtering

Search is one of the most important parts of an eCommerce customer experience. Up to 30% of eCommerce visitors will use search for navigating, and a user who doesn't find what they are looking for is a lost customer.

A major share of Google searches is already done by voice, but very few eCommerce sites offer a similar experience.

Speechly makes it simple to add voice functionalities to eCommerce stores. Again, the user can use natural language to search for products and, unlike traditional categories, voice categories naturally support synonyms. No matter if the user asks for pants or trousers, they find what they are looking for.
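One simple way to picture synonym support is a lookup that normalizes whatever word the user said to a canonical category before filtering. The sketch below is an illustration under that assumption; the synonym table, the Product shape, and the function names are invented here and do not describe how Speechly implements it.

```typescript
interface Product {
  name: string;
  category: string; // canonical category used by the store
}

// Assumed synonym table: spoken word -> canonical store category.
const CATEGORY_SYNONYMS = new Map<string, string>([
  ["pants", "trousers"],
  ["trousers", "trousers"],
  ["slacks", "trousers"],
  ["sneakers", "sneakers"],
  ["trainers", "sneakers"],
]);

// Normalize a spoken category word; undefined means "not recognized".
function canonicalCategory(spoken: string): string | undefined {
  return CATEGORY_SYNONYMS.get(spoken.trim().toLowerCase());
}

// Filter the catalog by a spoken category. An unrecognized word leaves the
// list unfiltered so the user is never shown an empty page by mistake.
function filterBySpokenCategory(products: Product[], spoken: string): Product[] {
  const category = canonicalCategory(spoken);
  if (category === undefined) {
    return products;
  }
  return products.filter((product) => product.category === category);
}

const catalog: Product[] = [
  { name: "Slim chinos", category: "trousers" },
  { name: "Canvas low-tops", category: "sneakers" },
  { name: "Wool overcoat", category: "coats" },
];

const forPants = filterBySpokenCategory(catalog, "Pants");
const forTrousers = filterBySpokenCategory(catalog, "trousers");
```

Both forPants and forTrousers contain only the chinos, which is the behavior described above: the user gets the same results regardless of which word they use.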

It's also important that the user interface updates in real time. This enables users to correct themselves in case of an error and encourages them to go on with the voice experience.

Grocery shopping

Grocery shopping is a special kind of shopping experience, because the user wants to add a lot of familiar products from a large inventory to their shopping cart as easily as possible.

The traditional user experience requires a lot of repeated searches and selections, but voice enables the user to just say out loud the products they want and see them added to their cart. If they need to change a certain product, for example by swapping one brand of milk for another, they can do it easily by just clicking the product.
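The cart logic behind such an experience can be sketched in a few lines. In the example below, one utterance yields several items that are added in a single step, and a product can afterwards be swapped for another brand, as a click on the product might do. The CartLine shape, the product names, and both functions are hypothetical and only illustrate the flow.

```typescript
interface CartLine {
  product: string;
  quantity: number;
}

// Add every item recognized in one utterance to the cart.
// Items already in the cart get their quantity increased.
function addSpokenItems(cart: CartLine[], items: CartLine[]): CartLine[] {
  const next = cart.map((line) => ({ ...line }));
  for (const item of items) {
    const existing = next.find((line) => line.product === item.product);
    if (existing !== undefined) {
      existing.quantity += item.quantity;
    } else {
      next.push({ ...item });
    }
  }
  return next;
}

// Swap one product for another while keeping the quantity,
// e.g. when the user picks a different brand of milk.
function swapProduct(cart: CartLine[], from: string, to: string): CartLine[] {
  return cart.map((line) =>
    line.product === from ? { ...line, product: to } : line
  );
}

// "Two cartons of milk, six bananas and a loaf of bread"
let groceryCart = addSpokenItems([], [
  { product: "Brand A milk", quantity: 2 },
  { product: "Bananas", quantity: 6 },
  { product: "Bread", quantity: 1 },
]);

// "...and one more milk"
groceryCart = addSpokenItems(groceryCart, [{ product: "Brand A milk", quantity: 1 }]);

// The user changes the milk to another brand.
groceryCart = swapProduct(groceryCart, "Brand A milk", "Brand B milk");
```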

Professional applications

Professional applications are a great use case for voice functionalities, because the language used in these settings is accurate and commonly shared by everyone.

Speechly can be used to create an efficient user experience for professional applications in many industries and domains. In this example, airline maintenance workers can easily report anomalies and defects in an airplane cabin.

You can also read about our offering for warehouse professionals and logistics.

Voice in VR/AR

Virtual reality environments offer a very immersive experience that can, for instance, showcase real estate locations easily and accurately, even during a pandemic.

However, the first-time user experience in these environments suffers from clunky hand controllers that are unintuitive and hard to use. Learning these controllers takes time away from the actual experience.

Voice, on the other hand, is a very intuitive and natural way to interact in a virtual reality environment. Speechly created a virtual reality environment with our partner ZOAN that improves the first-time user experience significantly.

Information-heavy data input

Speechly can be used to improve form filling when efficiency and data quality are important.

The following demo showcases a CRM user task in which a sales professional can input sales data by using voice. This leads to better data quality and improved data collection.

CRM is a great example of how voice can improve data input. The quality of the data is very important and data input is done in a repetitive way. Similar examples include health apps, such as meal tracking and fitness tracking, and other professional applications.

Web applications with voice UIs

Unlike most other solutions, Speechly is supported by all modern browsers and can be used to create awesome voice experiences for the web.

In our demo application, we created a simple photo editing application that is operated solely by voice. It supports natural language and the user can see the effects being applied to the image in near real time.

Speech recognition accuracy

Speechly is not optimized for pure speech recognition. Our models are configured for a certain use case and we use this configuration to bias the speech recognition model. This helps improve the accuracy.

However, our speech recognition accuracy is still on par with general purpose speech recognition software such as Google Cloud Speech.

In this demo video, Speechly and the Google Webspeech API are transcribing Steve Jobs's keynote from the first iPhone launch event.

You can try out our general accuracy here. Do note that the ASR accuracy improves significantly when the models are configured for your use case.


Voice GUIs can be used to improve user experience in a wide variety of applications and domains. If you want to hear how your application's user experience can be improved with modern voice technologies, submit your email and we'll contact you as soon as possible.

If you are still not convinced, here's what our customers think of working with us. You can also read more about the advantages of voice user interfaces.

About Speechly

Speechly is a YC-backed company building tools for speech recognition and natural language understanding. Speechly offers flexible deployment options (cloud, on-premise, and on-device), highly accurate custom models for any domain, and privacy and scalability for hundreds of thousands of hours of audio.

