Speechly Browser Client v2.0 Released

Mathias Lindholm

May 02, 2022

The Speechly Browser Client v2.0 is now available on NPM. In this post we’ll cover the major changes as well as how to upgrade to the new version.

What's new in Speechly Browser Client v2.0

The main new feature in the Speechly Browser Client v2.0 is the ability to choose the client's audio source through the Media Capture and Streams API. This is a significant change to the client, and it is also a breaking one.

Previously, the only way to provide the client with audio was through the device’s default microphone, and it was difficult to control which microphone was used when several were available. When the Speechly Browser Client was initialized, it silently chose the first audio capture device it found.

Moreover, if your audio was already available in another form, for example as a live MediaStream or an audio file, and you had no need for a microphone, you had to resort to elaborate workarounds.

The Speechly Browser Client v2.0 fixes these issues.

Using the default microphone is still straightforward. It exists in a separate BrowserMicrophone class that you can initialize and attach to the client, and everything works as before. However, a nice bonus of separating the default microphone implementation from the client is that you ask for the microphone permission only when the microphone is initialized, instead of when the client is created!

Furthermore, you can now attach any existing MediaStream object as the audio source. This makes it easy to integrate Speechly into, for example, WebRTC applications that expose incoming and outgoing audio as MediaStreams.
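
As an illustration of choosing the audio source yourself, here is a minimal sketch that picks a specific microphone and attaches its stream to the client. The helper names (pickAudioInput, attachPreferredMicrophone) and the small interfaces are hypothetical and not part of the Speechly API; the only client call assumed is attach, as used in the upgrade example in this post. The media devices object is passed in as a parameter so the sketch can run outside a browser; in a browser you would likely pass navigator.mediaDevices.

```typescript
// Hypothetical minimal shapes, so the sketch type-checks without the DOM
// library or the Speechly package. They mirror only the members used here.
interface AudioDevice {
  deviceId: string
  kind: string
  label: string
}

interface MediaDevicesLike<Stream> {
  enumerateDevices(): Promise<AudioDevice[]>
  getUserMedia(constraints: { audio: { deviceId: { exact: string } } }): Promise<Stream>
}

// Assumption: the client exposes attach(stream), as in the upgrade example.
interface StreamClient<Stream> {
  attach(stream: Stream): Promise<void>
}

// Return the first audio input whose label contains the hint (case-insensitive),
// falling back to the first audio input, or undefined if there is none.
function pickAudioInput(devices: AudioDevice[], labelHint: string): AudioDevice | undefined {
  const inputs = devices.filter((d) => d.kind === 'audioinput')
  const hint = labelHint.toLowerCase()
  return inputs.find((d) => d.label.toLowerCase().includes(hint)) ?? inputs[0]
}

// Open a stream from the chosen device and hand it to the client.
async function attachPreferredMicrophone<Stream>(
  client: StreamClient<Stream>,
  mediaDevices: MediaDevicesLike<Stream>,
  labelHint: string
): Promise<AudioDevice> {
  const device = pickAudioInput(await mediaDevices.enumerateDevices(), labelHint)
  if (device === undefined) {
    throw new Error('No audio input device found')
  }
  const stream = await mediaDevices.getUserMedia({
    audio: { deviceId: { exact: device.deviceId } },
  })
  await client.attach(stream)
  return device
}
```

Note that browsers generally leave device labels empty until the user has granted microphone permission, in which case this sketch falls back to the first audio input it finds.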

Finally, to make things easier when dealing with audio files, we've added an uploadAudioData function which decodes the given audio data and uploads it to the API. This currently works with popular file types such as WAV, MP3, M4A and others.

Check out Speechly Browser Client v2.0 on NPM.

How to upgrade to Speechly Browser Client v2.0

Install the package

# Using Yarn
yarn add @speechly/browser-client

# Using NPM
npm install --save @speechly/browser-client

Updates to microphone

Speechly Browser Client v2.0 extracts the microphone to a separate class and as a result the initialization looks a bit different. Also note that startContext and stopContext have been renamed to start and stop.

// Before
import { Client, Segment } from '@speechly/browser-client'

const client = new Client({appId: 'your-app-id'})
await client.initialize()

client.onSegmentChange((segment: Segment) => {
  console.log('Received new segment from the API:',
    segment.intent,
    segment.entities,
    segment.words,
    segment.isFinal
  )
})

await client.startContext()
setTimeout(async function() {
  await client.stopContext()
}, 3000)

// After
import { BrowserClient, BrowserMicrophone, Segment } from '@speechly/browser-client'

const client = new BrowserClient({ appId: 'your-app-id' })
const microphone = new BrowserMicrophone()
await microphone.initialize() // must be called from a user-triggered event!
await client.attach(microphone.mediaStream)

client.onSegmentChange((segment: Segment) => {
  console.log('Received new segment from the API:',
    segment.intent,
    segment.entities,
    segment.words,
    segment.isFinal
  )
})

await client.start()
setTimeout(async function () {
  await client.stop()
}, 3000)

Usage with audio files

You can now use the new uploadAudioData function to send the contents of an audio file to the client as an ArrayBuffer, without using the microphone.

const client = new BrowserClient({ appId: 'your-app-id' })
const response = await fetch('url-to-audio-file')
const buffer = await response.arrayBuffer()
await client.uploadAudioData(buffer)
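
The same pattern can be wrapped in a small helper that works for any source exposing arrayBuffer(). This is a sketch only: uploadAudioFrom and the two interfaces are hypothetical names introduced for illustration, and the return type of uploadAudioData is treated as unknown because this post does not specify it.

```typescript
// Assumption: the client exposes uploadAudioData(ArrayBuffer), as shown above.
// Its resolved value is not described in this post, so it is typed as unknown.
interface AudioUploader {
  uploadAudioData(audioData: ArrayBuffer): Promise<unknown>
}

// Anything that can yield its bytes as an ArrayBuffer.
interface BinarySource {
  arrayBuffer(): Promise<ArrayBuffer>
}

// Read the source into memory, reject empty input early, then upload it.
// Resolves with the number of bytes handed to the client.
async function uploadAudioFrom(client: AudioUploader, source: BinarySource): Promise<number> {
  const data = await source.arrayBuffer()
  if (data.byteLength === 0) {
    throw new Error('Audio source is empty')
  }
  await client.uploadAudioData(data)
  return data.byteLength
}
```

Both a fetch Response and a File picked through a file input provide an arrayBuffer() method, so either could be passed as the source.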

For more details, check out our GitHub repository. Happy developing!
