When companies integrate voice technology, most default to generalized Voice Assistant platforms like Amazon Alexa or Google Assistant. Big Tech has earned this mindshare through the innovation it has pushed forward in voice technology. However, a lot has changed since Siri was announced in 2011 and Alexa in 2014.
One major change is the range of alternatives businesses now have for integrating voice technology outside of the Big Tech providers. The purpose of this post is to examine the true value of domain ownership when integrating voice technology. To do this, we first need to understand some of the risks of building voice technology alongside Big Tech, as well as two strong alternatives for integrating it.
Building a voice experience on a legacy Voice Assistant platform like Amazon Alexa or Google Assistant carries risks that deserve attention. The obvious one is data. Handing over relevant customer data to the largest companies in the world, which have nearly infinite resources, should be reason enough for caution.
Companies should also consider how much control they keep over the experience with their end customer. Generalized Assistant platforms offer voice technology to developers, so long as those developers are willing to operate within the platforms' chosen domains, such as smart speakers and smart home automation. Building out these new platforms might be a top priority for a company like Amazon, but that says nothing about whether they are the best place to build a valuable voice experience for your customer. Product teams should ultimately decide where voice technology fits within existing digital experiences to solve an end user problem.
A company's brand deserves attention as well. Generalized Voice Assistant platforms place a gatekeeper in front of every brand in the form of a wake word. It is challenging to build brand awareness when users have to say another company's name before reaching your brand's experience. There is also a long-term risk in developing a new user behavior for your product and having that behavior associated with a Big Tech company.
Any company looking to integrate voice technology, from startups to Fortune 500 brands, should weigh the risks mentioned above. The underlying voice technology that large internet companies have brought to market and popularized is truly innovative, but there are other providers that companies should be aware of and consider.
A few pieces of core technology power the voice experiences people are familiar with today, such as Speech-To-Text (STT) for quickly speaking text messages and Natural Language Understanding (NLU), which tries to understand the intent behind what a user is saying. The two most relevant alternatives to building a voice experience on a generalized Voice Assistant platform are independent or “Owned Assistants” and Voice Manipulated Graphical User Interfaces or “Voice GUI”. Both alternatives draw on a handful of core voice technologies, including the STT and NLU mentioned above.
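To make the split between STT and NLU more concrete, here is a minimal sketch of the NLU step only. It assumes an STT engine has already turned speech into a transcript, and it uses a toy keyword matcher where a real product would use a trained model. The intent names, the keyword table, and the `parse_intent` function are hypothetical and not taken from any particular voice SDK.

```python
from dataclasses import dataclass, field

# Hypothetical keyword table: intent name -> words that trigger it.
INTENT_KEYWORDS = {
    "search_products": ("show", "find", "search"),
    "add_to_cart": ("add", "buy"),
}

# Hypothetical entity vocabulary for a single "color" entity.
KNOWN_COLORS = ("red", "blue", "black")


@dataclass
class ParsedUtterance:
    intent: str
    entities: dict = field(default_factory=dict)


def parse_intent(transcript: str) -> ParsedUtterance:
    """Map an STT transcript to an intent and entities (toy keyword NLU)."""
    # Lowercase and strip trailing punctuation so "Sneakers." matches "sneakers".
    words = [w.strip(".,!?") for w in transcript.lower().split()]

    # The first intent whose trigger words appear in the transcript wins.
    intent = "unknown"
    for name, triggers in INTENT_KEYWORDS.items():
        if any(w in triggers for w in words):
            intent = name
            break

    # Collect any known color mentioned in the transcript.
    entities = {}
    for w in words:
        if w in KNOWN_COLORS:
            entities["color"] = w

    return ParsedUtterance(intent, entities)
```

In this sketch, a transcript such as "Show me blue sneakers" would be parsed into the `search_products` intent with a `color` entity of `blue`, which an application could then act on.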
Owned Assistants are similar to generalized assistants in terms of the end user experience. They provide a conversational experience, but the assistant lives within a company's existing mobile application or website. This can be a good alternative for brands that want a conversational experience without the risks mentioned above. However, approaching voice technology in a “conversational” manner brings natural complexities.
The other alternative to a generalized assistant is a Voice GUI. With a Voice GUI, voice input can be used as an additional option to manipulate the graphical elements of a GUI just like clicking, tapping, swiping and typing. Looking at voice technology from this perspective, you can think about where voice as an input can be a tool in solving end user problems without ignoring the other benefits of a rich GUI.
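One way to picture a Voice GUI is as a second input path into the same screen state that taps and clicks already update. The sketch below illustrates that idea under stated assumptions: the `FilterState` screen, the action names, and the intent and entity names are all hypothetical, and the voice intent is assumed to come from an upstream STT and NLU step.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FilterState:
    """State behind a hypothetical product-filter screen."""
    color: Optional[str] = None
    max_price: Optional[int] = None


def apply_action(state: FilterState, action: str, value=None) -> FilterState:
    """Single place where the GUI state changes, whatever the input method."""
    if action == "set_color":
        state.color = value
    elif action == "set_max_price":
        state.max_price = value
    elif action == "clear_filters":
        state.color = None
        state.max_price = None
    else:
        raise ValueError(f"unknown action: {action}")
    return state


def on_tap(state: FilterState, action: str, value=None) -> FilterState:
    """A tap or click changes one control at a time."""
    return apply_action(state, action, value)


def on_voice(state: FilterState, intent: str, entities: dict) -> FilterState:
    """One utterance may carry several entities, so it can set several controls."""
    if intent == "filter":
        if "color" in entities:
            apply_action(state, "set_color", entities["color"])
        if "max_price" in entities:
            apply_action(state, "set_max_price", entities["max_price"])
    elif intent == "clear":
        apply_action(state, "clear_filters")
    # Unrecognized intents leave the screen unchanged.
    return state
```

Because both handlers go through `apply_action`, the screen stays consistent whether a user taps a color swatch or says something like "blue ones under fifty", and the rest of the GUI does not need to know which input was used.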
The goal of this post is not to debate Owned Assistants vs Voice GUIs, but rather to focus on the value of domain ownership when integrating voice technology.
Using a wake word to launch a voice assistant was popularized by Apple’s Siri. One could argue that simple access to voice control via a wake word has been one of the key drivers of voice technology adoption across the globe. However, just because something has always been done a certain way does not mean it has to be done that way forever. Integrating voice into your existing domains allows you to control your brand messaging. Some scenarios may call for a branded wake word, while others may benefit from voice input alongside a tap of a screen or touch of a button. Companies building out new user behavior with voice technology should think carefully before training their end users to say a wake word that carries a Big Tech brand.
If you treat existing digital domains like a website or mobile application as the starting point for integrating voice technology, you gain better control over the end user experience. We are still in the early innings when it comes to voice as an input for technology. Sure, we are familiar with Interactive Voice Response (IVR) call center experiences of “Say 1 for X”, but the underlying technology powering voice experiences today has come a long way. Pair this innovation in voice technology with a rich GUI and the two complement each other well.
Innovation in far-field voice technology, specifically with the announcement of Alexa in 2014, pulled general attention away from establishing best practices for voice as an input in websites and mobile applications and towards figuring out smart speakers. There is a real opportunity for companies to reset their outlook on voice technology and realize the power of voice within existing digital domains. These domains give companies the opportunity to establish best practices for solving end user problems with a Voice GUI or Owned Assistant, while also letting them fully control how their end users' experience with voice technology evolves.
Building quality end user experiences with voice technology requires a lot of data. Handing over valuable customer data to take advantage of an emerging platform can be a tough decision to make. Integrating voice technology into existing domains allows companies to maintain control of their valuable customer data, while still being able to take advantage of the innovation in voice technology.
This applies to an initial integration of voice technology, but it is also important when looking to the future. It is hard to guess the trajectory of any emerging technology or user behavior. However, you have a better chance of iterating and using the technology to build a valuable experience if you have full access to the usage data. Businesses' interest in creating valuable end user experiences and Big Tech’s interest in solving Generalized AI are at odds when it comes to this valuable user data. The only way to get truly raw access to this end user data is by integrating voice technology within an existing digital domain.
Speechly is paving the way in giving product teams the ability to unlock end user value with Voice GUIs.