Standard Variables and Data Types
Standard Variables make it easy to recognize common expressions, and Data Types parse the recognized entities into a normalized format.
Speechly SLU applications are built by specifying a set of example utterances written in the Speechly Annotation Language (SAL). The example utterances should reflect, as accurately as possible, what your users might say to your application. Your examples are then fed as training data to a fairly complex machine learning system, which takes care of building all the bits and pieces required for a computer to understand human speech.
Standard Variables are building blocks that make supporting certain common but somewhat complex expressions in the Speechly SLU applications easier. While you could construct these expressions yourself with the SAL, using our predefined standard types lets you focus more on the unique aspects of your application. The Standard Variables look like and are used like normal variables, but you don’t have to define them in your configuration, because we’ve already done it for you. The Standard Variables can be identified by their names, which start with SPEECHLY. You can see a standard variable being used in the example utterance below, which now permits various ways of expressing dates to be recognized, without having to define them individually.
*book book a flight for $SPEECHLY.DATE(departure)
Data Types determine what is done to an entity after it has been recognized. While the default Data Type String leaves the entities as they are recognized, the other Data Types, such as Date, provide normalizations for the entities that make their further use easier. The Data Types are defined in the Speechly Dashboard when listing entities.
While the Standard Variables and Data Types can be used separately, the two features work best when combined. With the entity departure defined as Date in the example above, an utterance like “Book a flight for August ninth two thousand twenty” would be recognized, and the SLU API would return the departure entity normalized to the ISO-8601 date 2020-08-09.
Now you neither have to define what the dates look like nor determine how they map into a structured format.
$SPEECHLY.DATE and Date recognize expressions that define a date, for example, tomorrow, next Friday, or January fifth twenty twenty, and parse them into ISO-8601 date strings (e.g., 2020-05-01).
$SPEECHLY.FOUR_DIGIT_NUMBER and Number recognize numbers that consist of four digits, for example, five six four nine, and normalize them into digits (e.g., 5649).
In many applications, you need to use dates and concepts such as tomorrow or today. While, theoretically, you could provide the model with thousands of examples of how the users may refer to certain times, there’s a simpler way.
If you use the Standard Variable $SPEECHLY.DATE, the model automatically understands dates and relative expressions that can be mapped to a specific date or month.
This allows end users to refer to dates in any sensible way, such as July the fifth, twenty-twenty or fifth of July.
When you also define the entity type on the Speechly Dashboard as Date, the resulting entity is parsed into an ISO-8601 date string (e.g., 2020-07-05).
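Because the Date Data Type yields an ISO-8601 date string, client code can turn the entity value into a native date object without any custom parsing. The following is a minimal Python sketch, assuming the entity value has already been read from the API response into a plain string. The function and variable names are hypothetical and not part of the Speechly API.

```python
from datetime import date


def parse_date_entity(value: str) -> date:
    """Convert an ISO-8601 date string (YYYY-MM-DD) into a date object."""
    return date.fromisoformat(value)


# Hypothetical entity value, as normalized by the Date Data Type.
departure = parse_date_entity("2020-07-05")
print(departure.year, departure.month, departure.day)
```

With a String entity, the application would instead receive the words as spoken and would have to interpret them itself.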
Of course, it’s not always sensible to have all dates available in all applications. If your application supports a limited range of date expressions, you might want to add a set of examples in your configuration instead:
weekdays = [monday|tuesday|wednesday|thursday|friday]
*scheduling next week only $weekdays(available) is okay
*scheduling next week only $weekdays(available) [and|or] $weekdays(available) are okay
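To get a feel for how many phrasings the two templates above cover, the sketch below enumerates them in Python. It assumes that each bracketed list stands for a choice among its alternatives and that every combination is a valid example. It leaves out the intent and entity annotations, and it only illustrates the configuration; it does not describe how Speechly processes it.

```python
from itertools import product

# Alternatives taken from the configuration above.
weekdays = ["monday", "tuesday", "wednesday", "thursday", "friday"]
connectives = ["and", "or"]


def single_day_utterances() -> list[str]:
    """Phrasings covered by the first template (one weekday)."""
    return [f"next week only {day} is okay" for day in weekdays]


def two_day_utterances() -> list[str]:
    """Phrasings covered by the second template (two weekdays)."""
    return [
        f"next week only {first} {conn} {second} are okay"
        for first, conn, second in product(weekdays, connectives, weekdays)
    ]


print(len(single_day_utterances()), len(two_day_utterances()))
```

Even this small configuration covers dozens of phrasings, while still excluding weekends and any date outside next week.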
SLU applications often need to understand numbers. Speechly supports the Standard Variable $SPEECHLY.FOUR_DIGIT_NUMBER for recognizing numbers that consist of four digits. The Speechly Annotation Language also has a number range syntax (covered in the SAL documentation) for defining custom number ranges easily, for example:
amount = [1..20]
Both of these can be used with the Data Type Number, which parses the recognized expression as a string consisting of digits. For example, zero zero three five would be parsed as 0035 and nineteen as 19.
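The normalization described above can be pictured with a small Python sketch. This is an illustration of the mapping only, not Speechly's implementation. It assumes that a single word is a number between zero and twenty and that a multi-word expression is a sequence of spoken digits. The name normalize_number is hypothetical.

```python
# Number words from zero to twenty; each word's list index is its value.
_WORDS = (
    "zero one two three four five six seven eight nine ten eleven twelve "
    "thirteen fourteen fifteen sixteen seventeen eighteen nineteen twenty"
).split()
NUMBER_WORDS = {word: value for value, word in enumerate(_WORDS)}


def normalize_number(expression: str) -> str:
    """Map a spoken number expression to a string of digits."""
    words = expression.lower().split()
    if not words:
        raise ValueError("empty expression")
    if len(words) == 1:
        # A single word such as "nineteen" maps straight to its value.
        if words[0] not in NUMBER_WORDS:
            raise ValueError(f"unknown number word: {words[0]}")
        return str(NUMBER_WORDS[words[0]])
    digits = []
    for word in words:
        value = NUMBER_WORDS.get(word)
        # In a multi-word expression, every word must be a single digit.
        if value is None or value > 9:
            raise ValueError(f"not a spoken digit: {word}")
        digits.append(str(value))
    # Joining as strings preserves leading zeros.
    return "".join(digits)


print(normalize_number("zero zero three five"))
print(normalize_number("nineteen"))
```

Returning a string rather than an integer is what keeps the leading zeros in a value such as 0035.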
Last updated by karoliina-louhema on July 1, 2020 at 08:50 +0300