Examples of Real-time Multimodal Voice User Interfaces

Speechly Spoken Language Understanding demos showcasing real-time visual feedback and multimodality

Ottomatias Peura

Head of Growth

In this post

In this post, we'll go through some use cases for voice user interfaces across various domains and user tasks. Not many apps or websites employ a voice user interface yet, largely because developer tools for building them have been lacking. The examples below all use Speechly's Spoken Language Understanding technology to build a truly multimodal voice UI that enhances the touch screen user interface with voice functionalities.

  • Form filling with voice
  • Voice in eCommerce and search filtering with voice
  • Adding items from a big inventory, such as grocery eCommerce
  • Professional applications
  • Information heavy data input
  • Voice in VR/AR and gaming
  • Web applications with voice user interfaces
  • Speechly's speech recognition accuracy

Speechly offers a unique tool for building real-time multimodal voice user interfaces. Our technology can be applied to any industry or domain to enhance current touch user interfaces with voice functionalities.

Speechly offers a Spoken Language Understanding API that returns user intents and entities in real time for user voice input. This approach enables end users to see the result of their voice commands visually as they speak instead of the traditional smart speaker paradigm that is based on turn-based queries.

Real-time visual feedback is the key to efficient and intuitive user interfaces, because it allows users to multitask. Instead of saying something and waiting for the answer, end users can speak in a stream-of-consciousness fashion, correcting themselves if needed. The visual feedback also encourages users to keep going with the voice experience.
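To make the real-time idea concrete, here is a minimal sketch of how a UI could merge a stream of intent and entity updates into what it shows on screen. The event shape (`kind`, `intent`, `type`, `value`, `is_final`) and the `Segment` class are assumptions made for illustration, not Speechly's actual API.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    type: str       # e.g. "color"
    value: str      # e.g. "blue"
    is_final: bool  # False while the recognizer may still revise the value


@dataclass
class Segment:
    """State of one utterance while it is still being spoken (hypothetical)."""
    intent: str = ""
    entities: dict = field(default_factory=dict)

    def apply(self, event: dict) -> None:
        """Merge one streamed event into the current state."""
        if event["kind"] == "intent":
            self.intent = event["intent"]
        elif event["kind"] == "entity":
            # A later event for the same entity type replaces the earlier one.
            # This is how a spoken self-correction ends up on screen.
            self.entities[event["type"]] = Entity(
                event["type"], event["value"], event["is_final"]
            )

    def render(self) -> str:
        """Return the text the UI would show right now; '?' marks tentative values."""
        parts = [
            f"{e.type}={e.value}{'' if e.is_final else '?'}"
            for e in self.entities.values()
        ]
        return f"{self.intent or '...'}: " + ", ".join(parts)


# Simulated stream for "show me blue... no, red ones".
events = [
    {"kind": "intent", "intent": "filter"},
    {"kind": "entity", "type": "color", "value": "blue", "is_final": False},
    {"kind": "entity", "type": "color", "value": "red", "is_final": True},
]

segment = Segment()
frames = []
for event in events:
    segment.apply(event)
    frames.append(segment.render())  # the UI would redraw after every event
```

Because the screen is redrawn after every event rather than once at the end of the utterance, the user sees the tentative value appear and then watches it change when they correct themselves.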

We have collected some examples of user tasks that can be solved more effectively with our technology.

Form filling by voice

Voice is a great solution for information-heavy, repetitive tasks such as form filling. Filling forms on a mobile device can be cumbersome because typing is difficult and usability issues are common across different screen sizes and mobile browsers.

In our demo, we enhanced an existing HTML form with voice functionalities. The form can be manipulated by using touch and voice simultaneously. The end user can fill the form using natural language and gets instantaneous feedback on the form itself.

By seeing the form, the user knows exactly what kind of questions they need to answer, and they can fill the form in any order and by using any interaction modality.
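One way to picture "any order, any modality" is a single form state that both input paths write to. The sketch below assumes hypothetical entity types (`person_name`, `email_address`, `phone_number`) and field names; it is an illustration of the idea, not the code behind the demo.

```python
# Hypothetical mapping from recognized entity types to form field names.
ENTITY_TO_FIELD = {
    "person_name": "name",
    "email_address": "email",
    "phone_number": "phone",
}


def new_form() -> dict:
    """Create an empty form; every field starts unfilled."""
    return {field_name: None for field_name in ENTITY_TO_FIELD.values()}


def set_field(form: dict, field_name: str, value: str) -> None:
    """Single write path shared by touch input and voice input."""
    if field_name not in form:
        raise KeyError(f"unknown form field: {field_name}")
    form[field_name] = value


def apply_voice_entity(form: dict, entity_type: str, value: str) -> bool:
    """Route one recognized entity to its field; ignore unknown entity types."""
    field_name = ENTITY_TO_FIELD.get(entity_type)
    if field_name is None:
        return False
    set_field(form, field_name, value)
    return True


def missing_fields(form: dict) -> list:
    """Fields the user still needs to fill, in form order."""
    return [name for name, value in form.items() if value is None]


form = new_form()
apply_voice_entity(form, "phone_number", "555 0100")  # spoken first, out of order
set_field(form, "name", "Ada")                        # typed on the touch screen
remaining = missing_fields(form)
```

Since both modalities go through `set_field`, the visible form is always the single source of truth, whichever way a value arrived.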

eCommerce search filtering

Search is one of the most important parts of an eCommerce customer experience. Up to 30% of eCommerce visitors use search for navigating, and a user who doesn't find what they are looking for is a lost customer.

A major share of Google searches is already done by voice, but very few eCommerce sites offer a similar experience.

Speechly makes it simple to add voice functionalities to eCommerce stores. Again, the user can use natural language to search for products, and unlike traditional categories, voice categories naturally support synonyms. Whether the user asks for pants or trousers, they find what they are looking for.
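The synonym behavior can be sketched as a normalization step between the recognized word and the store's own category names. The synonym table and the product list below are made up for illustration; in practice the mapping would live in the voice configuration rather than in application code.

```python
# Hypothetical synonym table: spoken word -> category name used by the store.
SYNONYMS = {
    "pants": "trousers",
    "trousers": "trousers",
    "sneakers": "trainers",
    "trainers": "trainers",
}

# Made-up catalog for the example.
PRODUCTS = [
    {"name": "Slim chinos", "category": "trousers", "color": "beige"},
    {"name": "Cargo pants", "category": "trousers", "color": "green"},
    {"name": "Court classics", "category": "trainers", "color": "white"},
]


def canonical_category(spoken: str):
    """Map a spoken category word to the store's category, or None if unknown."""
    return SYNONYMS.get(spoken.strip().lower())


def filter_products(products: list, spoken_category: str, color: str = None) -> list:
    """Return product names matching the spoken category and optional color."""
    category = canonical_category(spoken_category)
    return [
        p["name"]
        for p in products
        if p["category"] == category and (color is None or p["color"] == color)
    ]


by_pants = filter_products(PRODUCTS, "pants")
by_trousers = filter_products(PRODUCTS, "Trousers")
green_only = filter_products(PRODUCTS, "pants", color="green")
```

Both spoken words resolve to the same category, so the two queries return the same products.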

Grocery shopping

Grocery shopping is a special kind of shopping experience, because the user wants to add a lot of familiar products from a large inventory to their shopping cart as easily as possible.

The traditional user experience requires a lot of repeated searches and selections, but voice enables the user to just say the products they want out loud and see them added to their cart. If they need to change a certain product, for example by swapping a milk for another brand, they can do it easily by just clicking the product.

Professional applications

Professional applications are a great use case for voice functionalities, because the language used in these settings is precise and shared by everyone in the field.

Speechly can be used to create an efficient user experience for professional applications in many industries and domains. In this example, airline maintenance workers can easily report anomalies and defects in an airplane cabin.

You can also read about our offering for warehouse professionals and logistics.

Voice in VR/AR

Virtual reality environments can offer a very immersive experience that can showcase, for instance, real estate locations easily and accurately, even during a pandemic.

However, the first-time user experience in these environments suffers from clunky hand controllers that are unintuitive and hard to use. Learning these controllers takes time away from the actual experience.

Voice, on the other hand, is a very intuitive and natural way to interact in a virtual reality environment. Speechly created a virtual reality environment with our partner ZOAN that improves the first-time user experience significantly.

Information heavy data input

Speechly can be used to improve form filling when efficiency and data quality are important.

The following demo showcases a CRM task in which a sales professional can input sales data by using voice. This leads to better data quality and improved data collection.

CRM is a great example of how voice can improve data input. The quality of the data is very important and data input is done in a repetitive way. Similar examples include health apps such as meal tracking and fitness tracking and other professional applications.

Web applications with voice UIs

Unlike most other solutions, Speechly is supported by all modern browsers and can be used to create awesome voice experiences for the web.

In our demo application, we created a simple photo editing application that is operated solely by voice. It supports natural language, and the user can see the effects being applied to the image in near real time.

Speech recognition accuracy

Speechly is not optimized for pure speech recognition. Our models are configured for a certain use case and we use this configuration to bias the speech recognition model. This helps improve the accuracy.

However, our speech recognition accuracy is still on par with general purpose speech recognition software such as Google Cloud Speech.

In this demo video, Speechly and the Google Web Speech API transcribe Steve Jobs's keynote from the original iPhone launch event.

You can try out our general accuracy here

Conclusions

Voice user interfaces can be used to improve the user experience in a wide variety of applications and domains. If you want to hear how your application's user experience can be improved with modern voice technologies, submit your email and we'll contact you as soon as possible.

If you are still not convinced, here's what our customers think of working with us.
