Sep 19, 2023
1 min read
Before we jump into voice technology specifically, let’s start with technology more broadly. While a lot of digital innovation is driven by some combination of technology and user need, it’s very easy to fall into the trap of innovation for innovation’s sake. See: Smalt, a smart salt shaker; Juicero, a $400 Wi-Fi-connected juicer; and, of course, the CueCat, a barcode scanner that required a lot of personal information in order to serve you advertisements. While these all sound like something you might have laughed at in a SkyMall or Sharper Image catalog, they collectively raised over $300 million, and none of them exist today.
In the highly popular Design Thinking methodology, before you can solve a problem you need to define it. The first step of that process is to Empathize. This crucial starting point requires immersion into the human experience, interviewing and observing subjects. The goal is to challenge your own assumptions and see people holistically in their environments to better understand the issues they experience. It’s only after this immersion that you move on to the next step, which is to Define the Problem.
The best products all start with a clear, singular focus on a well-defined problem. From Duo Security’s focus on making companies more secure by making security easier for employees, to Lemonade’s focus on simple, accessible, and fast insurance coverage, there’s power in remaining focused on solving problems. Duo’s founders knew that convoluted and fragmented security solutions that users couldn’t (or wouldn’t) use effectively put companies at risk. Lemonade realized that confusing insurance policies with convoluted sign-up experiences made getting insurance something to dread, and led to analysis paralysis when comparing multiple policies. Both of these companies have used technology and good user experiences to grow their businesses. And while they exist in vastly different categories, they have one big thing in common: they both dove deep into the needs of their users when designing their products.
Most companies do just a handful of things; whether you’re a restaurant focused on making and serving food, or a doctor’s office focused on patient care, the customer comes with a specific end goal in mind. When you talk to and observe your customers, pain points in the overall customer journey can emerge, and they often don’t look like what you’d imagined.
An important thing to remember when you’re looking to define the customer problem is that it doesn’t need to be earth-shattering. It could just be a seemingly small pain point that they encounter frequently, or something that happens infrequently but creates a big, negative impact.
As an example, in the quick service restaurant industry, the carryout experience is full of tension. A customer arrives at the location, unsure of whether or not their food is ready. They enter the store and wait to be acknowledged. Their food sits under a heat lamp while they try to get someone’s attention. By the time they get the food, it doesn’t taste fresh. They reach out to management complaining that the food wasn’t fresh. Management sees food freshness as the problem and works to fix it, when in reality the problem was that the customer didn’t know when to arrive to get their food, and they didn’t know how to announce themselves in order to quickly receive their order upon arrival. Those problems can both be solved by some scalable combination of operations and technology.
Whether the pain point is tied to the limitations of a touch-only interface, like multiple steps to search or filter, or one that arises from the use of voice itself, like being the target of harassment in an online voice chat, voice technology offers new ways to provide clear and immediate value to the user.
Allowing people to tap to activate a microphone and speak their request rather than type it manually can save time and cut down on typos in the search experience. The added benefit of voice search with a screen is that you have the ability to display prompts that can help the user by setting expectations and scoping the context of your UI.
Allowing the user’s voice to command and control aspects of the visual interface takes the experience to the next level. Being able to speak naturally and ask the UI to return complex queries like, “Show me women’s shoes in a size nine, in black…actually white. And sort them to show the best deal first,” without tapping to select and deselect multiple categories and sorting orders, addresses a common digital experience pain point. You can try this experience out in our e-commerce demo to see just how much faster it really is.
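To make the “black…actually white” correction concrete, here is a minimal sketch of how spoken filter values might be folded into UI state. It assumes some upstream speech and language understanding step has already turned the utterance into `(entity_type, value)` pairs in spoken order; the `Product` fields, the entity names, and the `best_deal` sort key are all hypothetical and are not taken from any particular API.

```python
from dataclasses import dataclass


@dataclass
class Product:
    # Hypothetical catalog fields, chosen only for this illustration.
    name: str
    category: str
    size: int
    color: str
    price: float
    discount: float  # fraction off the list price, e.g. 0.30


def fold_entities(entities):
    """Collapse entities in spoken order; a later value replaces an earlier one."""
    filters = {}
    for entity_type, value in entities:
        filters[entity_type] = value
    return filters


def apply_filters(products, filters):
    """Keep products matching every filter, then apply the requested sort."""
    matches = [
        p for p in products
        if all(getattr(p, key) == value
               for key, value in filters.items() if key != "sort")
    ]
    if filters.get("sort") == "best_deal":
        # Assumption: "best deal" means the largest discount comes first.
        matches.sort(key=lambda p: p.discount, reverse=True)
    return matches


catalog = [
    Product("Court Sneaker", "women's shoes", 9, "white", 60.0, 0.10),
    Product("Trail Runner", "women's shoes", 9, "white", 80.0, 0.30),
    Product("City Loafer", "women's shoes", 9, "black", 70.0, 0.50),
    Product("Canvas Slip-On", "men's shoes", 9, "white", 40.0, 0.40),
]

# Assumed output of the upstream understanding step for the example utterance.
spoken = [
    ("category", "women's shoes"),
    ("size", 9),
    ("color", "black"),
    ("color", "white"),  # "actually white" overrides "black"
    ("sort", "best_deal"),
]

results = apply_filters(catalog, fold_entities(spoken))
print([p.name for p in results])
```

Because the fold keeps only the last value per entity type, the user can change their mind mid-sentence without the UI ever needing a separate "deselect" step.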
While many call center software platforms let agents save common information to autocomplete forms, many still involve lengthier forms that require a healthy amount of manual input. Whether it’s a request for a detailed medical history or an open-ended query from a homeowner looking for help from a contractor, agents are often acting as stenographers, transcribing the caller’s information while trying to provide them with good service. Leveraging voice technology to run in the background and manage the transcription and information input automatically saves time and lets the agent focus on making a human-to-human connection with the caller. It’s using AI to improve the customer experience.
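As a rough illustration of background form completion, the sketch below pulls a few fields out of a call transcript. It assumes the transcript has already been produced by a speech recognition step and arrives with capitalization and digits intact; the field names and regular expressions are hypothetical, and a production system would more likely rely on a trained language understanding model than on hand-written patterns.

```python
import re

# Hypothetical form fields and the patterns used to spot them in a transcript.
FIELD_PATTERNS = {
    "name": re.compile(r"my name is ([A-Z][a-z]+(?: [A-Z][a-z]+)*)"),
    "phone": re.compile(r"(\d{3}[-.\s]\d{3}[-.\s]\d{4})"),
    "zip_code": re.compile(r"zip code (\d{5})"),
}


def fill_form(transcript, form=None):
    """Return a copy of the form with any newly detected fields filled in.

    Values already present (for example, typed by the agent) are never
    overwritten.
    """
    filled = dict(form or {})
    for field_name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(transcript)
        if match and field_name not in filled:
            filled[field_name] = match.group(1)
    return filled


transcript = (
    "Hi, my name is Dana Lee, my phone number is 555-867-5309 "
    "and I need a quote for a roof repair at zip code 48104."
)
print(fill_form(transcript))
```

Leaving agent-entered values untouched keeps the agent in control: the automation fills gaps rather than second-guessing what a person has already confirmed.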
If you’re unsure if using voice for transcription and form completion to support agent assistance makes sense for your company, we encourage you to examine how your callers are interacting with your agents, how long it's taking them to complete manual data input, the quality of the information being input, and whether there are any complaints about the experience.
Imagine for a moment that the word “list” doesn’t immediately bring to mind a notepad and pen. In the digital context, a list can be anything from a collection of to-dos to a detailed food order in a cart. One of the most common “analog” voice list building experiences is placing an order at a drive-thru. When efficiency and ease of use are top of mind, using voice to add items to a list (or cart) is a natural way to improve a digital experience.
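The drive-thru pattern can be sketched as a running cart that each utterance adds to or removes from. This example assumes an upstream understanding step has already reduced each utterance to an intent (`"add"` or `"remove"`) and a list of `(item, quantity)` pairs; those names and that shape are assumptions made for illustration.

```python
from collections import Counter


def update_cart(cart, intent, items):
    """Apply one utterance's worth of changes to the cart, in place."""
    for name, quantity in items:
        if intent == "add":
            cart[name] += quantity
        elif intent == "remove":
            cart[name] -= quantity
            if cart[name] <= 0:
                # Drop items whose count reaches zero so the list stays clean.
                del cart[name]
    return cart


cart = Counter()
# "Two cheeseburgers and a fries, please."
update_cart(cart, "add", [("cheeseburger", 2), ("fries", 1)])
# "Actually, take off one of the cheeseburgers."
update_cart(cart, "remove", [("cheeseburger", 1)])
print(dict(cart))
```

Each utterance is a small, incremental edit, which mirrors how people order out loud: they build the list piece by piece and revise as they go.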
Whether in the metaverse or online multiplayer games, voice chat is a popular method for communication and collaboration. It has also, unfortunately, been a popular method for harassing strangers online. People targeted by the harassment find themselves with negative associations tied to the online game or community, and some may abandon them altogether to avoid further harm. For the companies that rely on community members and players, this represents an existential threat to their business. Implementing voice technology behind the scenes that can leverage AI to help support moderation efforts is not only scalable; it also offers real-time recognition and understanding that has a direct impact on real people.
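One way such moderation support could work is sketched below: escalate a speaker to human moderators only after several high-scoring utterances within a short window. It assumes an upstream model assigns each transcribed utterance a score between 0 and 1; the `StrikeTracker` class, the threshold, the strike count, and the window length are all hypothetical values to be tuned, not recommendations.

```python
from collections import defaultdict, deque


class StrikeTracker:
    """Track recent high-scoring utterances per speaker in a sliding window."""

    def __init__(self, threshold=0.8, max_strikes=3, window_seconds=60.0):
        self.threshold = threshold
        self.max_strikes = max_strikes
        self.window_seconds = window_seconds
        self._strikes = defaultdict(deque)  # speaker_id -> strike timestamps

    def observe(self, speaker_id, timestamp, score):
        """Record one scored utterance; return True if escalation is warranted."""
        strikes = self._strikes[speaker_id]
        if score >= self.threshold:
            strikes.append(timestamp)
        # Forget strikes that have aged out of the window.
        while strikes and timestamp - strikes[0] > self.window_seconds:
            strikes.popleft()
        return len(strikes) >= self.max_strikes


tracker = StrikeTracker()
events = [("p1", 0, 0.90), ("p1", 10, 0.20), ("p1", 20, 0.95), ("p1", 30, 0.85)]
decisions = [tracker.observe(*event) for event in events]
print(decisions)
```

Requiring repeated strikes before escalating is a design choice that trades a little latency for fewer false alarms, and it leaves the final decision with a human moderator.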
Once the customer problems and pain points have been defined, the ideation and prototyping can begin. It’s important to take the idea of a paper prototype and adjust it for the voice experience. In practice, that can look like faking an experience and testing it with users to determine if it’s worth building.
Another option is to build a quick prototype using a simple API like Speechly’s API, which allows for quick builds and deployment into existing tech stacks. That means less time dealing with incompatibility and more time focused on testing and iteration across multiple AI-powered voice technology solutions.
Whatever direction you go in, remember to center the user in everything that you do. They are ultimately the ones who decide whether what you’ve built offers them any value.
Speechly is a YC-backed company building tools for speech recognition and natural language understanding. Speechly offers flexible deployment options (cloud, on-premise, and on-device), super accurate custom models for any domain, and privacy and scalability for hundreds of thousands of hours of audio.