Voice Summit Reflections, December 2021

Mandi Galluch

Dec 20, 2021

Top voice industry themes in 2021, from voice user experience to voice technology.

For the first time since 2019, the Speechly team was able to attend an in-person Voice Summit, held on December 7th and 8th at the Renaissance Arlington Capital View Hotel in Arlington, Virginia. Over 400 people attended in person, with even more tuning in virtually to talk about voice technology and the broader voice industry.

Speechly CEO Otto Söderlund flew in from Helsinki to do a live demo and present a keynote that included a surprise announcement.

From the simple things, like being able to enjoy our morning coffee together, to spending quality time connecting with people, it was a great venue for deep discussion about some of the themes we’re seeing in the voice industry.

Top themes from Voice21

There’s been a clear shift from voice as a novelty to voice as a utility.

This is something that JP from Vixen Labs talked about in his presentation: the idea that voice was for “play things.” It kept popping up in conversations about how the industry has changed recently. Discussions about voice now center on use cases, user experience, and business objectives. Users are looking for more from voice, and companies are looking for new ways to offer value, often in the form of increased efficiency or ease of use. That means voice is not “just” a vehicle for a back-and-forth conversation with an assistant or bot. It’s a means to an end.

And to that end, it’s less about new channels, and more about improving the experience on existing ones.

Developing and driving usage of new, voice-only channels can be difficult and expensive, especially if you step outside of the Big Tech assistant ecosystem. Driving usage of new voice features on existing channels requires less effort, and often leads to higher usage and reduced churn on those channels. Why? At the end of the day, people are results-oriented: when they realize how much faster they can get things done with their voice, everything else becomes frustratingly slow by comparison. When the experience is better, people come back to it.

All of that means that the market is maturing, and with that maturation we shift into optimization.

Developers are looking for foundational technologies that will help them optimize and streamline their tech stacks.

Legacy voice experiences have been either tightly tied to Big Tech device-centric platforms or built as a cobbled-together mashup of Natural Language Understanding, Speech Recognition, and CSS. These legacy tools have provided a strong foundation from a hands-on learning perspective, but they have also required long ramp-up times and capital investment, and they come with trade-offs in performance or user experience. As voice continues to prove itself as a viable feature and user expectations continue to grow, those trade-offs become less acceptable.

We left the event energized by the conversations and excited about where the voice industry is heading. See you in 2022!
