Speechly is a Voice Interface API that lets you add responsive voice functionality to websites, mobile applications, or other systems. It is available as a SaaS platform that implements a hybrid speech-to-text / natural-language-understanding engine. To make integration as easy as possible, we also provide a number of client libraries for different platforms (Web, React, Android, iOS).
We provide a generous free tier that is sufficient for small to medium-sized projects. You can start right away, no credit card required. See pricing for more information.
We currently support English and Finnish. Spanish support is on the roadmap. For enterprise customers, we can add new languages with a few months' lead time.
The Speechly Voice Interface API is a generic tool that can be used to implement all kinds of Voice UIs across different verticals. It is particularly suited for multimodal applications on the Web and mobile, where rapid visual feedback is required. Please see the Solutions section for more information about use cases. Also check out our demos.
Speechly is not a Voice Assistant in itself, but it could be the underlying technology of one. Speechly is a Voice Interface API that you can configure for a variety of use cases. It could be used together with a text-to-speech engine and a dialogue manager, but these features are currently not part of our default offering.
You can use an external text-to-speech engine as part of your application. Our API itself, however, provides only speech-to-text and natural language understanding capabilities.
For general discussion or questions, please use our GitHub discussions.
We have our own speech-to-text system, developed in-house. It provides state-of-the-art accuracy when using text-only adaptation (see also our Interspeech 2021 paper). For more difficult cases, we can also use acoustic adaptation for even better transcription accuracy.
Our speech-to-text engine has been trained with thousands of hours of speech, and this training data is augmented with various types of background noise and other artefacts. For enterprise customers, we can also provide models that are customised for a particular noise profile.
Our speech-to-text engine has been trained with thousands of hours of speech. This includes speakers with different accents and ethnic backgrounds.
An Alexa model can probably be converted to a Speechly configuration. Note, however, that our API does not do dialogue management. If your Alexa model uses only single-turn commands, it can in most cases be converted easily.
Yes. You can use our API only for speech recognition, and then forward the returned transcript to Dialogflow or any other similar service. However, we do not have extensive tooling for this. You could start by taking a look at our API definitions.
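As a rough illustration of the forwarding step, the sketch below joins the words of a finished utterance into a plain-text transcript and hands it to a callback. Everything named here is an assumption made for the example: `SegmentLike`, `WordLike`, `transcriptOf`, and `forwardFinalTranscript` are hypothetical, and the `forward` callback stands in for whatever call your application makes to Dialogflow or a similar service. Check our API definitions for the actual message shapes.

```typescript
// Hypothetical minimal shapes, assumed for illustration only; the real API
// messages and client library types may differ.
interface WordLike {
  value: string;
}

interface SegmentLike {
  isFinal: boolean;
  // Entries may be missing while recognition is still in progress.
  words: Array<WordLike | undefined>;
}

// Joins the recognised words into a single plain-text transcript.
function transcriptOf(segment: SegmentLike): string {
  return segment.words
    .filter((w): w is WordLike => w !== undefined)
    .map((w) => w.value)
    .join(" ")
    .trim();
}

// Calls `forward` only when the segment is final and its transcript is
// non-empty. Returns true if the transcript was forwarded.
function forwardFinalTranscript(
  segment: SegmentLike,
  forward: (text: string) => void,
): boolean {
  if (!segment.isFinal) return false;
  const text = transcriptOf(segment);
  if (text.length === 0) return false;
  forward(text);
  return true;
}
```

Waiting for the final segment avoids sending partial, still-changing transcripts to the downstream service. Whether that trade-off suits your application depends on how quickly it needs to react.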
We have our own state-of-the-art NLU engine, developed in-house, that is tightly integrated with our speech-to-text system.
Our web client is compatible with all modern web browsers. Please refer to our documentation for a complete list of supported browsers.
Yes. Our browser-client can be used to build WebXR applications that run directly in, for example, the Oculus browser. Our technology has also been successfully integrated with an Unreal Engine-based VR experience.
Our browser-client is a generic NPM module that should integrate with any web UI framework. For React developers we provide the react-client, which offers a somewhat more React-like interface. Note: browser-client is not compatible with Node.js, so if your UI framework relies on server-side rendering (as SvelteKit does, for example), please refer to the documentation of your UI framework on how to use libraries that cannot be run outside a browser.
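One common, framework-agnostic pattern for server-side rendering is to load a browser-only module lazily, and only when the code is actually running in a browser. The sketch below illustrates the idea. The helper names `isBrowser` and `loadBrowserOnly` are hypothetical, and the package name in the usage comment is a placeholder. Your framework may offer its own mechanism, which should be preferred.

```typescript
// Minimal view of the global object, used so the check can also be tested
// with a fake global.
type GlobalLike = { window?: unknown; document?: unknown };

const runtimeGlobal = globalThis as unknown as GlobalLike;

// True only in a browser-like environment where both window and document exist.
function isBrowser(g: GlobalLike = runtimeGlobal): boolean {
  return typeof g.window !== "undefined" && typeof g.document !== "undefined";
}

// Runs `loader` only in the browser; resolves to null on the server, so the
// browser-only module is never evaluated during server-side rendering.
async function loadBrowserOnly<T>(
  loader: () => Promise<T>,
  g: GlobalLike = runtimeGlobal,
): Promise<T | null> {
  if (!isBrowser(g)) return null;
  return loader();
}

// Usage (placeholder package name):
//   const mod = await loadBrowserOnly(() => import("some-browser-only-package"));
//   if (mod !== null) { /* set up the voice client here */ }
```

Because the dynamic `import()` sits inside the loader callback, nothing is imported at module load time, which is usually where server-side rendering fails.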
Our free tier is only available as a SaaS cloud API. However, we can also deploy our API to a private cloud or, in most cases, run it on your local machine. Our speech-to-text engine runs even on a Raspberry Pi, although support for such resource-constrained devices can be more limited and depends on the hardware in question and the features required. For more information, please contact sales.
Yes. Our browser and mobile clients can be integrated in a way that does not interfere with screen readers.