Sep 19, 2023
We’ve been hard at work at Speechly for the past few months, and as a result we’ve opened up our new Speechly Dashboard in private beta. And the best thing is, you can join the beta, too!
Speechly Dashboard is our web tool for building Spoken Language Understanding (SLU) models, which in turn power voice user interfaces for any app or service. The model is configured by providing it with sample utterances that are annotated using our own syntax.
After the model is configured, it can be tested in the Speechly Playground and integrated into applications with our client libraries. You can even share the Playground with your friends or colleagues for feedback.
Speechly Playground is a web application that provides the user with a microphone. When a user gives the browser permission to use it and starts speaking, the Playground returns the user’s intent and the entities extracted from the speech, based on the sample utterances the model was configured with.
If it sounds complicated, it’s not. Let’s take an example: we want to build a simple home automation app that can be used to turn lights on and off in different rooms. These actions are the user intents that our model is interested in. The user has two kinds of intents: to turn the lights on and to turn them off. Let’s call them turn_on and turn_off.
The rooms where the lights can be activated are modifiers for these intents. We call these modifiers entities. In some other similar tools they can also be called slots.
Now we have to think of different ways the user can control lights with our app. It’s easy to think of at least a few. Here’s a short list of these examples, or sample utterances:
- Turn off the lights in kitchen
- Switch the lights on in bedroom
- Turn the bedroom lights on
- Make the living room dark
- Turn the kitchen lights on
- Switch on the bedroom lamp
The more sample utterances we give to our model, the better it works, so you can (and you should!) come up with more. In a real-life application, you’d preferably collect these from your users, too.
Because Speechly uses deep neural networks for end-to-end speech-to-intent recognition, it quickly learns to generalize and to detect intents and entities correctly even in cases it has not been explicitly trained for. This means that the developer does not need to build an exhaustive list of user “commands”, only examples that train a model that can adapt to natural human speech. For users, this means that they can communicate using their own words and expressions rather than having to learn and repeat preset commands.
Now we have to annotate the sample utterances. We have the two intents, turn_on and turn_off, so we just tell each utterance which intent it has. With our syntax, it’s done like so:
*turn_off Turn off the lights in kitchen
*turn_on Switch the lights on in bedroom
*turn_on Turn the bedroom lights on
*turn_off Make the living room dark
*turn_on Turn the kitchen lights on
*turn_on Switch on the bedroom lamp
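To make the shape of these intent annotations concrete, here is a minimal Python sketch that reads the six annotated lines and groups the utterances by intent. It is only an illustration of the `*intent utterance` convention shown above, not Speechly’s own tooling; the names `SAMPLES` and `group_by_intent` are made up for this example.

```python
from collections import defaultdict

# The six sample utterances from this post, one annotation per line.
SAMPLES = """\
*turn_off Turn off the lights in kitchen
*turn_on Switch the lights on in bedroom
*turn_on Turn the bedroom lights on
*turn_off Make the living room dark
*turn_on Turn the kitchen lights on
*turn_on Switch on the bedroom lamp
"""


def group_by_intent(samples: str) -> dict:
    """Map each intent name to the list of utterances annotated with it."""
    grouped = defaultdict(list)
    for line in samples.splitlines():
        line = line.strip()
        if not line:
            continue  # Skip blank lines.
        if not line.startswith("*"):
            raise ValueError(f"missing '*intent' prefix: {line!r}")
        # Everything up to the first space is the intent, the rest is the text.
        intent, _, utterance = line[1:].partition(" ")
        grouped[intent].append(utterance)
    return dict(grouped)


if __name__ == "__main__":
    for intent, utterances in group_by_intent(SAMPLES).items():
        print(intent, len(utterances))
```

A quick count like this is a handy way to notice when one intent has far fewer examples than another.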
But now our model would return the same intent no matter which room is mentioned, so it would be hard to tell the rooms apart.
Let’s use entities for this and annotate the utterances again.
*turn_off Turn off the lights in [kitchen](location)
*turn_on Switch the lights on in [bedroom](location)
*turn_on Turn the [bedroom](location) lights on
*turn_off Make the [living room](location) dark
*turn_on Turn the [kitchen](location) lights on
*turn_on Switch on the [bedroom](location) lamp
Now our model would know, for example, that in the first utterance the user intent is to turn off the lights and the room where the user wants to turn them off is the kitchen. We could make this even smarter by using our advanced SLU rules, but that’s a more advanced topic that you can learn more about in our documentation.
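For readers who like to see the structure spelled out, the following Python sketch pulls the intent, the plain text and the entities out of one annotated utterance. It only mirrors the `*intent ... [value](type)` notation used in this post and is not how the Speechly model itself processes the annotations; `Annotated`, `ENTITY_PATTERN` and `parse_annotation` are hypothetical names.

```python
import re
from dataclasses import dataclass, field

# Matches one entity annotation of the form [value](type).
ENTITY_PATTERN = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")


@dataclass
class Annotated:
    intent: str
    text: str
    entities: list = field(default_factory=list)  # (type, value) pairs


def parse_annotation(line: str) -> Annotated:
    """Split '*intent words [value](type) ...' into intent, text and entities."""
    line = line.strip()
    if not line.startswith("*"):
        raise ValueError("expected the line to start with '*intent'")
    intent, _, rest = line[1:].partition(" ")
    entities = [(m.group(2), m.group(1)) for m in ENTITY_PATTERN.finditer(rest)]
    # Replace each annotation with its plain value to recover the spoken text.
    text = ENTITY_PATTERN.sub(r"\1", rest)
    return Annotated(intent, text, entities)


if __name__ == "__main__":
    print(parse_annotation("*turn_off Turn off the lights in [kitchen](location)"))
```

For the first utterance this yields the intent `turn_off`, the text "Turn off the lights in kitchen" and a single `location` entity with the value `kitchen`.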
Now that we have configured our model, it can be tested in the Speechly Playground. The Playground returns the user intents and entities in real time along with the text transcript of what was said.
When the model works as expected, it can be integrated into a website, iPhone or Android app or any other service by using our client libraries.
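As a rough idea of what the application side could look like, here is a small Python sketch that takes a recognized intent and its entities and updates the state of the lights. It deliberately does not use the Speechly client libraries: the function `apply_command`, the `(type, value)` entity pairs and the dictionary of lights are all assumptions made for this illustration, and the real client API should be taken from the documentation.

```python
from typing import Dict, List, Tuple


def apply_command(lights: Dict[str, bool], intent: str,
                  entities: List[Tuple[str, str]]) -> Dict[str, bool]:
    """Return a new light state after applying one recognized command."""
    if intent not in ("turn_on", "turn_off"):
        return dict(lights)  # Unknown intent: leave everything unchanged.
    rooms = [value.lower() for kind, value in entities if kind == "location"]
    # Assumption: with no location entity, the command applies to all rooms.
    targets = rooms or list(lights)
    updated = dict(lights)
    for room in targets:
        if room in updated:  # Ignore rooms the app does not know about.
            updated[room] = (intent == "turn_on")
    return updated


if __name__ == "__main__":
    state = {"kitchen": True, "bedroom": False, "living room": True}
    print(apply_command(state, "turn_off", [("location", "kitchen")]))
```

Returning a new dictionary instead of changing the old one keeps the handler easy to test; a real app would go on to call whatever controls the actual lamps.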
If you are interested in getting access to our private beta, sign up for our waiting list with the form below. You can also email us at email@example.com and tell us more about what you are trying to achieve, and we’ll help you move forward.
Speechly is a YC-backed company building tools for speech recognition and natural language understanding. Speechly offers flexible deployment options (cloud, on-premise, and on-device), highly accurate custom models for any domain, privacy, and scalability for hundreds of thousands of hours of audio.