2 Perceived Performance
This chapter covers techniques that do not necessarily make the application run faster, but that make it feel faster to the user. Chapter 1, Actual Performance, by contrast, covers changes that affect how fast the application actually runs.
Note: This document is currently under revision. At present, this chapter is primarily a list of topics to think about, with a few details at the end. A later version of this document will provide more details.
Areas of Concern
Reduce perceived latency
- Mask latency, either with a "thinking" sound or with a special prompt such as an informational tip.
- If you know the wait will be long, warn the user beforehand. Be careful, however: if you announce a long wait and none follows, you leave a bad impression.
- Latencies longer than 5 seconds cannot be masked effectively. Don't try. Instead, fix the underlying performance or redesign the VUI to avoid the latency.
Give the user control during latency
- Be responsive during latency. Make sure the user can correct the error that led to the wait, or cancel a wait that seems too long.
- If possible, confirm what the user said before going into a waiting period. For example, this interaction:

  User: "Go to Horoscopes"
  System: "Horoscopes. Hold on a sec."
  <3 second pause>
  System: "Today, the stars align."

  may be perceived as faster than this one, even though its pause is longer:

  User: "Go to Horoscopes"
  <2 second pause>
  System: "Horoscopes. Today, the stars align."
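The confirm-then-wait pattern above, combined with a "thinking" sound, might look like the following. This is a minimal sketch that assumes a VoiceXML 2.0 interpreter; the document URL and the audio file name are placeholders, not part of any real application.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="horoscopes">
    <block>
      <!-- Confirm what was heard before the slow fetch begins. -->
      <prompt>Horoscopes. Hold on a sec.</prompt>
      <!-- fetchaudio plays while the next document loads.
           Both the URL and thinking.wav are hypothetical. -->
      <goto next="http://example.com/horoscope.vxml"
            fetchaudio="thinking.wav"/>
    </block>
  </form>
</vxml>
```

The queued confirmation is played before the fetch audio starts, so the user hears an immediate response and then the masking sound.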
Design VUI for speedy flow
- Design the flow so that the most common path comes first and is the easiest to follow. Initially present only the most relevant options; if the user wants the road less travelled, let them ask for it.
- Remove unnecessary flow options; don't overdesign.
- On the other hand, provide shortcuts for expert users. If possible, remove or shorten prompts for experts.
- Use mixed-initiative dialogs to allow the user to supply multiple pieces of information at once. Drop back to step-by-step interactions only if necessary.
- Avoid unnecessary confirmations. For example, rely on "go back" instead of asking for confirmation.
- Use explicit confirmations only if a misrecognition would be disastrous or is very common in this state. Even when a misrecognition would be disastrous, you can sometimes avoid an explicit confirmation by letting the user back out after the fact, for example by undoing a database commit if the user says "go back" or "cancel".
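A mixed-initiative form that drops back to step-by-step prompting could be sketched as follows. It assumes a VoiceXML 2.0 interpreter; the form, its field names, and the three grammar files are hypothetical and exist only for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="weather">
    <!-- Form-level grammar (hypothetical file) that can fill both
         city and day from one utterance such as "Boston tomorrow". -->
    <grammar src="citydate.grxml" type="application/srgs+xml"/>

    <!-- Open question first: the caller may supply both values at once. -->
    <initial name="start">
      <prompt>Which city and day would you like the weather for?</prompt>
      <!-- After a second failure, skip the open question and fall
           back to one question per field. -->
      <nomatch count="2">
        <prompt>Let's take this one step at a time.</prompt>
        <assign name="start" expr="true"/>
      </nomatch>
    </initial>

    <!-- Each field is prompted for only if it is still unfilled. -->
    <field name="city">
      <prompt>Which city?</prompt>
      <grammar src="city.grxml" type="application/srgs+xml"/>
    </field>
    <field name="day">
      <prompt>Which day?</prompt>
      <grammar src="day.grxml" type="application/srgs+xml"/>
    </field>

    <filled mode="all">
      <prompt>Getting the weather for <value expr="city"/> on <value expr="day"/>.</prompt>
    </filled>
  </form>
</vxml>
```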
Control recognition latency
- Turn on bargein. See Barge-In.
- Use hotword when appropriate, but be careful: if hotword commands are commonly misrecognized, the result may seem worse than not allowing hotword at all. Also, try to make all the commands in a hotword grammar about the same length, because the interpreter waits for speech long enough for a user to have said the longest phrase in the grammar.
- Use weighting in your grammars so that the recognizer can prune search paths more quickly. Set the weights with great care.
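As a rough illustration of the last two points, the sketch below enables hotword barge-in for one field and weights the items of a small inline grammar. It assumes the interpreter supports the VoiceXML 2.0 bargeintype property and SRGS item weights; the platform described here may configure hotword differently. The menu and the weight values are invented, not tuned.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form id="mainmenu">
    <field name="choice">
      <!-- Assumption: hotword barge-in is selected through the
           VoiceXML 2.0 bargeintype property. -->
      <property name="bargeintype" value="hotword"/>
      <prompt>Say news, mail, or help.</prompt>
      <!-- Commands of similar length. Weights are illustrative
           guesses that favor the most common request. -->
      <grammar mode="voice" version="1.0" root="cmd">
        <rule id="cmd">
          <one-of>
            <item weight="3.0">news</item>
            <item weight="2.0">mail</item>
            <item weight="1.0">help</item>
          </one-of>
        </rule>
      </grammar>
    </field>
  </form>
</vxml>
```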
More Details
Barge-In
The bargein property controls whether a user is allowed to "barge in" or interrupt a prompt. If its value is true, the interpreter listens for speech while playing a prompt; if the user speaks, the prompt stops immediately. If the bargein property is false, the interpreter plays an entire prompt without allowing interruption.
The default value for this property is true. A <prompt> tag may have its own bargein attribute to override the property.
If the value of bargein is false, then any DTMF input buffered in a transition state is deleted from the buffer instead of being queued for the next recognition state.
Performance Tip:
- For fastest execution, keep bargein set to true. In general, your application should always allow experienced users to speed up a dialog by interrupting a prompt when they already know what to say next. Barging in should be prevented only for error messages, advertisements, and other prompts that the user must hear in their entirety. This is easily done by using the bargein attribute of individual <prompt> tags.
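The property and its per-prompt override can be sketched as follows, assuming a VoiceXML 2.0 interpreter. The form, field name, and prompt wording are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Document-wide setting. true is already the default;
       it is stated here only for clarity. -->
  <property name="bargein" value="true"/>

  <form id="transfer">
    <field name="amount" type="currency">
      <!-- Interruptible: experienced users can answer right away. -->
      <prompt>How much would you like to transfer?</prompt>
      <nomatch>
        <!-- Override for one prompt the user must hear in full. -->
        <prompt bargein="false">
          Sorry, I did not understand. Please say an amount,
          such as twenty dollars.
        </prompt>
      </nomatch>
    </field>
  </form>
</vxml>
```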
Timing in Speech Recognition
When the VoiceXML interpreter listens to speech input, it constantly compares the incoming audio stream to all active grammars, looking for a match. At some point after the user stops talking, the speech-recognition engine decides whether the input is valid. You can adjust the timing of this decision with several properties:
- The timeout property specifies how long the interpreter waits for the user to say something. If nothing is said within this time, the interpreter generates a no-input event that can be handled by your VoiceXML application. The default value for this property is 5 seconds.
- The completetimeout property specifies how long the interpreter waits after the user stops talking when the speech input has been recognized as a valid input. After this time has elapsed, the interpreter accepts the recognized input. The default value for this property is 0.5 seconds.
- The incompletetimeout property specifies how long the interpreter waits after the user stops talking when the speech input has not been recognized as a valid input, but could be the first part of one. After this time has elapsed, the interpreter throws a no-match event that can be handled by your VoiceXML application. The default value for this property is 1.2 seconds.
Note that incompletetimeout is longer than completetimeout. This gives the user an extra moment to finish an incomplete command.
Performance Tip:
- For fastest execution, keep timeouts short, especially in dialogs for experienced or advanced users. Note, however, that if the timeouts are too short, the interpreter will generate no-match events when users pause during speech. Ironically, this may result in users speaking more slowly, producing more no-match events.
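Shortened speech timeouts for an expert dialog might be set like this. The sketch assumes a VoiceXML 2.0 interpreter; the values are illustrative rather than recommended, and commands.grxml is a hypothetical grammar file.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="expertmenu">
    <field name="command">
      <!-- Shorter than the defaults quoted above (5 s, 0.5 s, 1.2 s).
           incompletetimeout stays longer than completetimeout. -->
      <property name="timeout" value="3s"/>
      <property name="completetimeout" value="0.3s"/>
      <property name="incompletetimeout" value="0.8s"/>

      <prompt>Command?</prompt>
      <grammar src="commands.grxml" type="application/srgs+xml"/>

      <!-- Handle the events that short timeouts make more likely. -->
      <noinput>
        <prompt>I did not hear anything.</prompt>
        <reprompt/>
      </noinput>
      <nomatch>
        <prompt>I did not catch that.</prompt>
        <reprompt/>
      </nomatch>
    </field>
  </form>
</vxml>
```

Scoping the properties to a single field keeps the shorter timeouts away from dialogs aimed at novice users.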
Timing in DTMF Recognition
The interpreter has several properties that affect how it responds to DTMF input:
- The termchar property specifies a character that can indicate the end of a DTMF entry. The default value for this property is the pound (#) character.
- The timeout property specifies how long the interpreter waits for the user to press a key. The default value for this property is 5 seconds.
- The interdigittimeout and termtimeout properties together specify how long the interpreter waits after a keypress before deciding that the user is finished. The default value for interdigittimeout is 3 seconds; the default value for termtimeout is 0 seconds.
- The bevocal.dtmf.flushbuffer property controls whether buffered DTMF input is kept available for the next recognition state.
The DTMF recognition algorithm works as follows:
| Under these conditions... | The interpreter does this... |
| --- | --- |
| The user does not press any keys and timeout seconds elapse | Generates a no-input event |
| The user presses the termchar character | Immediately ends recognition and either returns the valid input or throws a no-match event |
| The keys the user has pressed so far are not a valid input and interdigittimeout seconds elapse | Throws a no-match event |
| The keys the user has pressed so far are a valid input, the grammar does not allow the input to be extended to make a longer valid input (see below), and there is no active termchar character | Immediately returns the valid input |
| The keys the user has pressed so far are a valid input, the grammar does not allow the input to be extended to make a longer valid input, and there is an active termchar character | Waits termtimeout seconds before returning the valid input |
| The keys the user has pressed so far are a valid input and the grammar does allow the input to be extended to make a longer valid input | Waits interdigittimeout seconds before returning the valid input |
To construct a DTMF grammar for which the interpreter can unambiguously recognize that a key is the last of a series, you can use fixed-length fields. You specify a fixed-length field with <field type="digits?len=nnn"> where nnn is the number of digits in a valid input. For example, <field type="digits?len=5"> indicates that a valid input consists of 5 digits. DTMF grammars that you construct can also specify the expected number of digits.
Performance Tips:
- Use DTMF grammars that specify entries with fixed lengths, so that the last digit of an entry can be quickly recognized.
- Use a smaller interdigittimeout or termtimeout when appropriate. If users are entering a short access code, or something that they may have memorized, a shorter timeout (1 second or less) will be convenient. However, if users are reading a long series of digits, such as when entering a credit card number, they may need enough time to enter a few digits, look at the number, enter more digits and so on.
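Both tips can be combined in one form, sketched below under the assumption of a VoiceXML 2.0 style interpreter. The field names, prompts, and timeout values are hypothetical; the digits?len=nnn syntax is the one given in this chapter and may differ on other platforms.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="payment">
    <!-- Short, memorized code: fixed length plus a short
         interdigit timeout. -->
    <field name="pin" type="digits?len=5">
      <property name="termchar" value="#"/>
      <property name="interdigittimeout" value="1s"/>
      <property name="termtimeout" value="0s"/>
      <prompt>Enter your five digit access code.</prompt>
    </field>

    <!-- Long number read from a card: still fixed length, but with
         more time between digits. -->
    <field name="card" type="digits?len=16">
      <property name="interdigittimeout" value="4s"/>
      <prompt>Enter your sixteen digit card number.</prompt>
    </field>
  </form>
</vxml>
```

Because both fields have a fixed length, the final digit cannot be extended into a longer valid input, so the interpreter does not need to wait out interdigittimeout after it.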