Editing SLU Examples

Editing your SLU model happens by adding SLU examples to the Speechly Example configuration. The SLU model defines how users can interact with the voice user interface.

Introduction

Learn more

This part of the documentation expands on what we’ve previously written about models, intents, and entities. The best way to start learning Speechly is by completing the Quick Start. You might want to learn about the SLU basics too.

The Speechly SLU applications are built by specifying a set of example utterances, for which we use our Speechly Annotation Language (SAL). The example utterances should, as accurately as possible, reflect what your users might say to your application. Your examples are then fed as training data to a fairly complex machine learning system, which takes care of building all the bits and pieces required for a computer to understand human speech.

Luckily, to build an SLU app with Speechly, you don’t have to understand what’s going on under the hood. It is useful, however, to know a bit about what the example utterances are used for. You can think of them as a way to explain to the computer what a particular sentence means, and which words or phrases in the utterance are important.

We’ll use a simple home automation application as a running example when walking through the different SAL features. Let’s start with something simple, and then progressively move on to more complicated stuff.

Before continuing, you should be familiar with the basic concepts of SLU applications: utterances, intents, and entities. If you aren’t yet, please read up on the SLU basics first. It will make following this tutorial a lot easier! If you already know the SLU basics, let’s move on!

Intents

The Speechly Annotation Language defines a set of example utterances that can be used when talking to your SLU application. These examples essentially capture the language your users use, and what your application should understand. For example, suppose we want to describe how to ask someone to turn on the lights. The simplest SAL example for this would look like this:

*turn_lights_on turn on the lights

The first element in the example above, *turn_lights_on, signifies that the text that follows has the intent of turning the lights on (turn_lights_on). Intent names in the SAL appear as special tokens indicated by the asterisk (*) as the word-initial character.

Since natural language is incredibly diverse, people use a variety of expressions for the same intent. Just think how many ways of asking someone to turn the lights on you can come up with! Giving your model a few variations of the same intent in the SAL helps with the accuracy of the system. So, let’s add a couple of example utterances for an intent:

*turn_lights_on turn on the lights
*turn_lights_on put the lights on please
*turn_lights_on switch the lights on

Note that all of the examples are prefixed with the intent name they correspond to.

You may want to include more than one intent in your model to enable more functions for your users. In our running example, a reasonable additional intent for the home automation application would be turning the lights off. So, let’s add some example utterances for that purpose next:

*turn_lights_on turn on the lights
*turn_lights_on put the lights on please
*turn_lights_on switch the lights on
*turn_lights_off turn off the lights
*turn_lights_off put the lights off please
*turn_lights_off switch the lights off

Here we made the *turn_lights_off utterances simply by copy-pasting the examples for the *turn_lights_on intent and replacing the word “on” with the word “off”. Please note that there is no need for the examples of different intents to look almost identical, as they do here. In fact, it’s better if utterances of different intents are dissimilar, but sometimes that’s just not feasible.

So far, we haven’t identified any entities in our examples. To see why entities are useful, consider how you would apply what we’ve learned to build an application where your users can also give commands to other devices at home, such as the air conditioner or the music player.

In principle, we could do everything with intents; we could define the same on and off intents for the AC as we did for the lights:

*turn_ac_on turn on the ac
*turn_ac_on put the ac on please
*turn_ac_on switch the ac on
*turn_ac_off turn off the ac
*turn_ac_off put the ac off please
*turn_ac_off switch the ac off

And for the music player, you could add two more intents: *turn_music_on and *turn_music_off. You’d then end up with three or more near-identical sets of intents for controlling the different electric devices in your home.

As you can probably see, this could get very tedious, especially if we had several of these devices in different rooms. We could have a music player in the bedroom as well as in the living room, and for sure we’d have lights in all rooms. You’d soon have intent names like *turn_on_bedroom_lights and *turn_off_living_room_music. Handling all of these as separate intents is complicated to write and maintain.

Entities

When we look at the home automation example, we can identify two kinds of things in the utterances: actions, and objects (i.e. the things that receive the action). In this case, the action turns something on or off, and the objects that receive the action are the various devices, possibly located in different rooms.

This structure applies to many other SLU applications as well. The intents are useful for capturing the action the user eventually wants to achieve. Entities, on the other hand, are modifiers for intents. They help specify what the target or some attribute of the action is.

Now, let’s rewrite our SAL examples. We’ll use two intents: *turn_on and *turn_off. This time, however, we list the devices and their locations as entities. Let’s start again with a single example:

*turn_on turn on the [bedroom](room) [lights](device)

Again, the utterance starts with the intent name, which is identified by the *. This time the example also contains some new syntax. In [bedroom](room), the latter part in parentheses signifies the entity, and “bedroom” in square brackets its value. Likewise, [lights](device) means that we have defined an entity called “device” whose value is “lights”. So, in the SAL syntax, the entity value is enclosed in square brackets, followed by the entity name in parentheses.
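To make the syntax concrete, here is a minimal Python sketch that reads one annotated example and pulls out the intent, the entities, and the plain spoken text. The function name parse_example and the regular expression are our own illustration, assuming only the syntax described above; they are not part of any Speechly tool.

```python
import re

# Hypothetical helper, not part of any Speechly SDK: it only mirrors the
# syntax described above (*intent, then [value](entity) annotations).
ENTITY_PATTERN = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")

def parse_example(line):
    """Split one SAL example into its intent, entities, and plain text."""
    intent_token, _, rest = line.strip().partition(" ")
    if not intent_token.startswith("*"):
        raise ValueError("a SAL example must start with *intent_name")
    # Each entry is (entity_name, value), in order of appearance.
    entities = [(m.group(2), m.group(1)) for m in ENTITY_PATTERN.finditer(rest)]
    # Replace each annotation with its bare value to recover the spoken text.
    text = ENTITY_PATTERN.sub(r"\1", rest)
    return intent_token[1:], entities, text

intent, entities, text = parse_example(
    "*turn_on turn on the [bedroom](room) [lights](device)"
)
print(intent)    # turn_on
print(entities)  # [('room', 'bedroom'), ('device', 'lights')]
print(text)      # turn on the bedroom lights
```

The sketch handles only the plain [value](entity) form; it does not cover the more compact syntax features of the SAL.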

There can be several different values that a given entity can take. Let’s add a couple of examples:

*turn_on turn on the [bedroom](room) [lights](device)
*turn_on please turn on the [living room](room) [ac](device)
*turn_on switch the [kitchen](room) [music player](device) on

Now we have three different values for the room entity (bedroom, living room, kitchen) and three for the device entity (lights, ac, music player). We also included some variation in the example utterances, just like before. Let’s then add examples for the turn-off intent:

*turn_off turn off the [bedroom](room) [lights](device)
*turn_off please turn off the [living room](room) [ac](device)
*turn_off switch the [kitchen](room) [music player](device) off

This minimal set of SAL examples is already a start for building a simple SLU application.

To make the system more robust, you will need to add more examples. The phrases above do not contain anything about lights in the living room or the kitchen. And it might not even be necessary. If you’re in luck, the machine learning system figures out even from a very small set of examples that sometimes you might want to talk about the bedroom air conditioner, or the music player in the living room. To ensure that this will happen rather than leave it to luck, however, it’s always best to give as many examples as you can.

You can imagine, though, that writing these examples might get a bit tedious, especially if there are a lot of different possible combinations of entities. Luckily, you don’t have to do all that, as we will see next!

Advanced syntax features

Inline lists

Templates are a feature of the Speechly Annotation Language that allows you to use a much more compact syntax to express a large set of example sentences. Let’s consider the following three examples:

*turn_on turn on the [bedroom](room) [lights](device)
*turn_on turn on the [bedroom](room) [ac](device)
*turn_on turn on the [bedroom](room) [music player](device)

An equivalent way of expressing the above examples in the SAL is:

*turn_on turn on the [bedroom](room) [lights | ac | music player](device)

Rather than writing out three different example sentences, we can use something called an inline list to indicate that in this part of the sentence one of the items from the list [lights | ac | music player] should appear. An inline list is defined by a list of words separated from one another by the “pipe” symbol |, and enclosed in square brackets.

Note that the use of inline lists is not restricted to entity values. We can also write:

*turn_on [turn | switch] on the [bedroom](room) [lights | ac | music player](device)

In this example here, we have used the inline list to provide a selection of alternatives also for the first word in the utterance, which could be either “turn” or “switch”, as both make sense in this context.

As you can probably imagine, nothing prevents us from adding yet another inline list that gives us alternatives for different rooms:

*turn_on [turn | switch] on the [bedroom | living room | kitchen](room) [lights | ac | music player](device)

The single SAL line above expresses a set of 18 different examples, which are formed by taking into account all the possible combinations from the inline lists and using those in the corresponding locations of the sentence. Behind the scenes, our machine learning system considers all the 18 example variations when building your SLU application.
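The count of 18 comes from multiplying the list sizes: 2 × 3 × 3. The following Python sketch reproduces that expansion with itertools.product. It is only an illustration of the combinatorics, assuming the inline list syntax shown above; expand_inline_lists is a name we made up, and this is not how Speechly implements it.

```python
import itertools
import re

# Matches either an inline list followed by an optional (entity) name,
# or a run of plain text. Illustrative only; real SAL has more syntax.
TOKEN = re.compile(r"\[([^\]]+)\](\([^)]+\))?|([^\[]+)")

def expand_inline_lists(template):
    """Return every example that a template with inline lists stands for."""
    intent, _, body = template.partition(" ")
    slots = []
    for match in TOKEN.finditer(body):
        items, entity, plain = match.groups()
        if plain is not None:
            slots.append([plain])
        else:
            options = [item.strip() for item in items.split("|")]
            if entity:
                # Keep the annotation so each result is still a valid example.
                options = [f"[{option}]{entity}" for option in options]
            slots.append(options)
    return [intent + " " + "".join(combo) for combo in itertools.product(*slots)]

examples = expand_inline_lists(
    "*turn_on [turn | switch] on the [bedroom | living room | kitchen](room) "
    "[lights | ac | music player](device)"
)
print(len(examples))  # 18
print(examples[0])    # *turn_on turn on the [bedroom](room) [lights](device)
```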

That’s pretty neat! But even writing those lists seems like a lot of work. For example, just by re-introducing the *turn_off intent, we should have both of the following lines:

*turn_on [turn | switch] on the [bedroom | living room | kitchen](room) [lights | ac | music player](device)
*turn_off [turn | switch] off the [bedroom | living room | kitchen](room) [lights | ac | music player](device)

We now have copies of the same inline lists linked to these two intents. Should we want these commands to apply to yet more devices and rooms, the additions need to be done to all of the respective inline lists. That’s a lot of error-prone work!

Variables

Again, the Speechly Annotation Language has a feature that helps us avoid this. For example, we can use variables that represent lists, and when writing the examples, we can refer to these variables rather than write the lists in full like above. This can be done in the following way:

start_phrase = [turn
                switch]
rooms = [bedroom
         living room
         kitchen]
devices = [lights
           ac
           music player]
*turn_on $start_phrase on the $rooms(room) $devices(device)
*turn_off $start_phrase off the $rooms(room) $devices(device)

The SAL code above defines exactly the same utterances as the previous example with the inline lists (18 per intent, 36 in total), but it is easier to read and to maintain when new rooms or devices are added.
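One way to convince yourself that the two forms are equivalent is to substitute each variable with its list and compare the result with the inline list version. The Python sketch below does that with string.Template, which happens to use the same $name convention. The variables dict and the inline_variables function are hypothetical names for this illustration only.

```python
from string import Template

# Variable definitions written as Python data. The names mirror the SAL
# example above; the dict itself is only an illustration.
variables = {
    "start_phrase": ["turn", "switch"],
    "rooms": ["bedroom", "living room", "kitchen"],
    "devices": ["lights", "ac", "music player"],
}

def inline_variables(template, variables):
    """Rewrite $name references as the equivalent inline lists."""
    as_lists = {
        name: "[" + " | ".join(items) + "]" for name, items in variables.items()
    }
    # string.Template replaces $name and leaves the (entity) part untouched.
    return Template(template).substitute(as_lists)

for action in ("on", "off"):
    line = f"*turn_{action} $start_phrase {action} the $rooms(room) $devices(device)"
    print(inline_variables(line, variables))
```

The two printed lines are the same as the two inline list lines from the previous section.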

To define a list, write the name of the variable you want to use for it, followed by the = sign and the list enclosed in square brackets, with each item on a line of its own. In the example utterances, you simply refer to the variable by prefixing its name with the $ sign.

Define lists before using them

A variable must be defined before it is referenced for the first time. A good practice is to put all the variable definitions at the beginning of your SAL input, and add the example utterances only after these.
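If you keep your SAL input in a file, a small script can catch ordering mistakes before you upload it. The Python sketch below scans the input line by line and reports every $name that is used before its definition. It is a rough check under our own assumptions (definitions start a line with name =), and undefined_references is a hypothetical helper, not a Speechly tool.

```python
import re

def undefined_references(sal_source):
    """Return (line_number, name) for each $name used before its definition."""
    defined = set()
    problems = []
    for number, line in enumerate(sal_source.splitlines(), start=1):
        definition = re.match(r"\s*(\w+)\s*=", line)
        # References on the right-hand side must already be defined too.
        for name in re.findall(r"\$(\w+)", line):
            if name not in defined:
                problems.append((number, name))
        if definition:
            defined.add(definition.group(1))
    return problems

source = """\
devices = [lights | ac | music player]
*turn_on turn on the $rooms(room) $devices(device)
rooms = [bedroom | living room | kitchen]
"""
print(undefined_references(source))  # [(2, 'rooms')]
```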

The SAL allows you to define whatever is needed within a variable (except the intent). The variable definition might contain other variables, lists, entities, optional inputs, or plain text:

start_phrase = [turn
                switch
                activate]
rooms = [bedroom
         living room
         kitchen]
devices = [lights
           ac
           music player]

turn_on_device = $start_phrase on the $devices(device)
turn_off_device = $start_phrase off the $devices(device)
increase_temperature = [raise | increase] the temperature

*turn_on $turn_on_device in the $rooms(room)
*turn_off $turn_off_device in the $rooms(room)
*increase_temp $increase_temperature in the $rooms(room)

Optional input

As we saw earlier, it’s often useful to specify different variations of the examples to make it easier for your app to understand different ways of expressing the same intent. Also, the intents can have “simple” and “complex” variations. For instance, utterances with the turn on intent could be defined as follows:

*turn_on $start_phrase on the $rooms(room) $devices(device)
*turn_on $start_phrase on the $devices(device)

In the second example, we have omitted $rooms(room) to exemplify the case where the user does not specify in which room they want to turn the device on. The location could be obvious from the context: if the user happens to be in the living room, the simple utterance “Turn on the lights” would most likely refer to the living room lights. Again, to save us from specifying all such variations separately, the SAL allows you to annotate some parts of the example utterance as optional. This is done by putting the optional part inside curly brackets {}:

*turn_on $start_phrase on the {$rooms(room)} $devices(device)

The line above captures the same set of examples as the two preceding utterances. As with the inline lists, the optional parts can be placed anywhere in the utterance. They can be useful, for instance, when specifying the so-called carrier phrases, as shown below:

*turn_on {can you} {please} $start_phrase on the $devices(device)

Now, this line captures all of the following examples:

*turn_on can you please $start_phrase on the $devices(device)
*turn_on can you $start_phrase on the $devices(device)
*turn_on please $start_phrase on the $devices(device)
*turn_on $start_phrase on the $devices(device)
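Each optional part is either kept or dropped, so n optional parts yield 2^n variants. Here is a minimal Python sketch of that expansion for non-nested optional parts. The function expand_optionals is our own illustration and assumes only the curly bracket syntax described above.

```python
import itertools
import re

def expand_optionals(template):
    """Expand non-nested {optional} parts into every keep/drop combination."""
    # re.split with a capture group alternates fixed text and optional text.
    parts = re.split(r"\{([^{}]*)\}", template)
    slots = []
    for index, part in enumerate(parts):
        if index % 2 == 1:
            slots.append([part, ""])   # optional: keep it or drop it
        else:
            slots.append([part])       # fixed text: always kept
    variants = ("".join(combo) for combo in itertools.product(*slots))
    # Collapse the double spaces left behind by dropped parts.
    return [" ".join(variant.split()) for variant in variants]

for example in expand_optionals(
    "*turn_on {can you} {please} $start_phrase on the $devices(device)"
):
    print(example)
```

Running it prints the four variants in the same order as the list above.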

Putting all this together, we get the following SAL input, which is already a step closer toward a simple home automation application!

start_phrase = [turn
                switch]
rooms = [bedroom
         living room
         kitchen]
devices = [lights
           ac
           music player]

*turn_on {can you} {please} $start_phrase on the $rooms(room) $devices(device)
*turn_off {can you} {please} $start_phrase off the $rooms(room) $devices(device)

Copy-paste and try!

You can copy-paste the examples above and give them a try on the Speechly Dashboard.

Number ranges

Often SLU applications need to understand numbers. For instance, you may have a home automation application with intents that allow the user to adjust the room temperature. Given what we explained above, you could do this in the following way:

number = [one
          two
          three
          four
          five
          six
          seven
          eight
          nine
          ten]
rooms = [bedroom
         living room
         kitchen]
*increase_temp [raise | increase] the {$rooms(room)} temperature by $number(degrees) {degrees}
*decrease_temp [lower | decrease] the {$rooms(room)} temperature by $number(degrees) {degrees}

This works fine, but we can do something smarter. The SAL has a special notation for expressing numeric ranges. An equivalent but much shorter way to express the above is:

number = [1..10]
rooms = [bedroom
         living room
         kitchen]
*increase_temp [raise | increase] the {$rooms(room)} temperature by $number(degrees) {degrees}
*decrease_temp [lower | decrease] the {$rooms(room)} temperature by $number(degrees) {degrees}

A number range is given by specifying its first and last value with two dots in between, so “1..10” expands to a list of numbers starting from one and ending at ten.
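The range notation can be read as shorthand for an inclusive list of integers. The short Python sketch below shows that reading. The function expand_range is a hypothetical helper, and how Speechly turns the digits into spoken words is outside what this sketch covers.

```python
import re

def expand_range(expression):
    """Expand a 'first..last' range into the list of integers it covers."""
    match = re.fullmatch(r"\[?\s*(\d+)\.\.(\d+)\s*\]?", expression.strip())
    if match is None:
        raise ValueError(f"not a number range: {expression!r}")
    first, last = int(match.group(1)), int(match.group(2))
    # Both ends are included, as in the 1..10 example above.
    return list(range(first, last + 1))

print(expand_range("[1..10]"))  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```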

Multi-intent utterances

Speechly is a fully streaming voice API for building complex voice user interfaces that users can operate with natural language. While what we’ve learned so far is enough to build well-working applications, there’s one major part we haven’t yet touched upon.

What if a user wants to turn on the TV and raise the room temperature, both at the same time? With the examples we have just defined, the user isn’t really able to express a combination like this in the most natural way, that is, by saying something like, “Turn on the TV and raise the temperature by 2 degrees.”

A simple solution would be to add {and} as an optional part to all of your example utterances to enable such intent compounds:

*turn_on {and} {can you} {please} $start_phrase on the {$rooms(room)} $devices(device)
*increase_temp {and} {can you} {please} raise the temperature by $number(degrees) {degrees}

Now, one could say, “Can you please turn on the TV and raise the temperature by 4 degrees” or “Please raise the temperature by 2 degrees and turn on the TV,” in either order.

Luckily, Speechly also supports a more direct alternative: multi-intent utterances, where a single example contains more than one intent. Here’s an example:

*turn_on {can you} {please} $start_phrase on the {$rooms(room)} $devices(device) and *increase_temp raise the temperature by $number(degrees) {degrees}
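To see how a multi-intent example breaks down, the Python sketch below splits a line at each *intent token and pairs every intent name with the text that follows it. It assumes only that intents are marked with a leading asterisk, as described earlier; split_intents is a hypothetical name, and in this sketch the connecting word “and” simply stays with the segment in which it appears.

```python
import re

def split_intents(utterance):
    """Split a multi-intent example into (intent, text) segments."""
    parts = re.split(r"\*(\w+)", utterance)
    # parts[0] is any text before the first intent; after that the list
    # alternates between an intent name and the text belonging to it.
    return [(name, text.strip()) for name, text in zip(parts[1::2], parts[2::2])]

segments = split_intents(
    "*turn_on turn on the [lights](device) and "
    "*increase_temp raise the temperature by [two](degrees) degrees"
)
for intent, text in segments:
    print(intent, "->", text)
```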

List weights

What if some of the phrases in the input are more likely to occur than others? For example, a phrase like “Turn on the music player” is more likely to be used than “Activate the player” or “Switch on the player”. It is good practice to adjust the model so that it “knows” the probability distribution over the items. In the SAL, you can define weights for the list items:

start_phrase = [3: turn
                1: switch
                1: activate]

In this case, the proportion of these words in the generated examples will be 3:1:1. If you don’t explicitly set the weights, they all default to 1.

You can also use floating-point numbers to make the weights look more like probabilities:

start_phrase = [0.6: turn
                0.2: switch
                0.2: activate]

This definition is equivalent to the previous one.
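The two definitions are equivalent because only the ratio between the weights matters: dividing 3:1:1 by its sum of 5 gives 0.6, 0.2, and 0.2. The Python sketch below shows the normalization and then draws a weighted sample. It illustrates the arithmetic only; the names are ours, and it makes no claim about how Speechly generates its training examples.

```python
import random

def normalize(weights):
    """Scale weights so they sum to 1, turning ratios into probabilities."""
    total = sum(weights.values())
    return {item: weight / total for item, weight in weights.items()}

ratio_weights = {"turn": 3, "switch": 1, "activate": 1}
float_weights = {"turn": 0.6, "switch": 0.2, "activate": 0.2}

print(normalize(ratio_weights))  # {'turn': 0.6, 'switch': 0.2, 'activate': 0.2}

# Sampling with these weights yields roughly a 3:1:1 mix of start phrases.
rng = random.Random(0)  # fixed seed so the run is repeatable
sample = rng.choices(
    list(ratio_weights), weights=list(ratio_weights.values()), k=1000
)
print({word: sample.count(word) for word in ratio_weights})
```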

If you define weights for only some of the items on the list, the items without a weight default to 1:

start_phrase = [3: turn
                switch
                activate]

Inline lists can also contain weights:

*increase_temp $increase_temperature by [0.4: one | 0.4: two | 0.2: five] degrees

Optional input weight

The weight of an optional part must be given as a probability, that is, a number in the interval [0, 1]. Let’s assume that we have the following template:

*turn_on $start_phrase on the $devices(device) {please}

It may happen that the word “please” occurs in only 10% of all the cases where the user wants to activate a device. We could express this as:

*turn_on $start_phrase on the $devices(device) {0.1: please}

Among the examples generated from the template above, 10% will contain the word “please” and 90% will not.
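The effect of an optional weight can be simulated by including the optional part with the given probability each time an example is generated. The Python sketch below does this for a weight of 0.1. The function sample_optional and its parameters are hypothetical, and the exact share will vary a little around 10% from run to run and seed to seed.

```python
import random

def sample_optional(base, optional, probability, count, seed=0):
    """Generate `count` examples, appending `optional` with the given probability."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("an optional weight must lie in [0, 1]")
    rng = random.Random(seed)  # seeded for repeatable output
    return [
        f"{base} {optional}" if rng.random() < probability else base
        for _ in range(count)
    ]

examples = sample_optional(
    "*turn_on turn on the [lights](device)", "please", 0.1, 10_000
)
share = sum(example.endswith("please") for example in examples) / len(examples)
print(f"{share:.1%} of the generated examples contain 'please'")
```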

Nested optional inputs

Another good thing about optional inputs is that you can place one inside another, making them nested. Let’s take the example we created before:

number = [1..10]
rooms = [bedroom
         living room
         kitchen]
*increase_temp [raise | increase] the {$rooms(room)} temperature by $number(degrees) {degrees}
*decrease_temp [lower | decrease] the {$rooms(room)} temperature by $number(degrees) {degrees}

It may happen that the user says, “Raise the bedroom temperature” without specifying by how much it should be raised. This option can be expressed by making by $number(degrees) degrees an optional input:

*increase_temp [raise | increase] the {$rooms(room)} temperature {by $number(degrees) degrees}

Since we had already marked {degrees} as optional, we can nest it inside the new optional part like this:

*increase_temp [raise | increase] the {$rooms(room)} temperature {by $number(degrees) {degrees}}
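With nesting, the inner optional part can only appear when the outer one is kept, so the nested part above yields three variants rather than four. The recursive Python sketch below shows this. The function expand_nested is our own illustration of the idea, assuming balanced curly brackets, and is not Speechly’s implementation.

```python
def expand_nested(template):
    """Expand {optional} parts, including nested ones, into all variants."""
    start = template.find("{")
    if start == -1:
        # No optional parts left: normalize whitespace and stop.
        return [" ".join(template.split())]
    # Find the bracket that closes the first opening bracket.
    depth = 0
    for end in range(start, len(template)):
        if template[end] == "{":
            depth += 1
        elif template[end] == "}":
            depth -= 1
            if depth == 0:
                break
    else:
        raise ValueError("unbalanced curly brackets")
    before = template[:start]
    inner = template[start + 1:end]
    after = template[end + 1:]
    variants = []
    # Either keep the optional part (and expand what it contains) or drop it.
    for choice in (inner, ""):
        for variant in expand_nested(before + choice + after):
            if variant not in variants:
                variants.append(variant)
    return variants

for example in expand_nested(
    "*increase_temp raise the temperature {by $number(degrees) {degrees}}"
):
    print(example)
```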

Last updated by karoliina-louhema on June 26, 2020 at 07:47 +0300
