4   Fetching and Caching Resources

You can think of the VoiceXML interpreter as a telephone-based web browser. As with HTML documents, VoiceXML documents have Web URIs and can be located on any Web server. In addition to VoiceXML documents, a VoiceXML application can use several different types of files, including recorded audio data, speech and DTMF grammars, and XML data (this last is a BeVocal extension).

Some data may be obtained from streaming sources, servers responding to form requests, CGI script output, and so on. We follow the World Wide Web Consortium's convention of using the term resources to refer collectively to files, streams, and other data sources. All of these resources can be accessed with standard Web URIs and can be located on any Web server.

One significant difference between an HTML application intended for a standard Web browser and a VoiceXML application intended for your telephone is that a standard Web browser runs locally on your machine, whereas the VoiceXML interpreter does not run on your telephone but runs remotely, for example at the VoiceXML hosting site. On every call to a VoiceXML application, all of the resources needed by that application may need to be retrieved (or fetched) from a location other than where the VoiceXML interpreter runs. Those resources may then be stored locally at the hosting site (that is, cached) by the VoiceXML interpreter for later use by the same or a different application on another call.

This chapter discusses concepts related to retrieval and caching of your application's resources:

 •  How Fetching and Caching Work
 •  What You Can Control from VoiceXML
 •  Prefetching Resources
 •  Handling Fetching Delays
 •  Controlling the Use of Cached Resources
 •  Submitting Complex JavaScript Objects

This chapter describes the mechanics of this process. For information on how to take advantage of these features when designing your application for best performance, see the VoiceXML Performance Guide.

How Fetching and Caching Work

The BeVocal VoiceXML interpreter follows the VoiceXML 2.0 and Hypertext Transfer Protocol (HTTP) 1.1 standards for fetching and caching resources. The HTTP standard in particular provides a lot of flexibility in how fetching and caching can be implemented. This discussion does not describe all of the details, not even all of the details that apply to VoiceXML. It merely gives an overview of the basic model and the most common ways to use this model in VoiceXML.

Note: Fetches made using secure HTTPS are not cached in the proxy server.

Requests and Responses

Basically, any application that follows the HTTP 1.1 standard for fetching and caching sends a request to a server for a particular resource and then gets a response back from that server. The request consists of a request type (typically a GET request for VoiceXML requests), a URI that specifies what resource is wanted, and various headers that specify details about what will be acceptable in the returned resource. The response consists of a response code containing information about the type of response, various headers with details about what can be done with the resource, and possibly a body containing the actual resource.

As anyone who has waited for the download of a graphic-intensive Web page knows, fetching resources over the Web can be a time-consuming process. Consequently, it's a good idea to avoid, as much as possible, going out over the Web for new copies of resources that have not changed since you last fetched them. This is where caching comes in; the fundamental idea of caching is to avoid sending a body in the response unless absolutely necessary.

Because of the importance of caching to reduce the time involved in fetching resources, many of the headers (both request and response headers) contain information about how old the resource is, how old the resource can be and still be fresh or unexpired, and what restrictions, if any, there are on caching the resource.

For standard HTML applications, you can directly set request and response headers. For VoiceXML applications, on the other hand, you still directly set response headers, but you use VoiceXML attributes and properties to set request headers. The VoiceXML interpreter translates the VoiceXML attributes and properties into the appropriate HTTP request headers.
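As a rough illustration of that translation, the following Python sketch turns two caching settings into a Cache-Control request header. The function name and the exact mapping are assumptions made for illustration; the interpreter's actual translation logic is not described in this chapter.

```python
from typing import Dict, Optional


def build_request_headers(maxage: Optional[int] = None,
                          maxstale: Optional[int] = None) -> Dict[str, str]:
    """Hypothetical helper: map maxage/maxstale settings (in seconds)
    to the Cache-Control request header they would produce."""
    directives = []
    if maxage is not None:
        directives.append(f"max-age={maxage}")
    if maxstale is not None:
        directives.append(f"max-stale={maxstale}")

    headers: Dict[str, str] = {}
    if directives:
        # Several directives share one comma-separated header value.
        headers["Cache-Control"] = ", ".join(directives)
    return headers


print(build_request_headers(maxage=120, maxstale=30))
```

If neither setting is given, the sketch sends no Cache-Control request header at all.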

Using Multiple Caches

Your standard Web browser typically sets aside some amount of disk space on your machine for caching Web pages. This means that as you explore back and forth amongst the multiple pages of a Web site, the browser does not have to download every page every time you visit it. This corresponds to a single phone call to a VoiceXML application. In theory, the VoiceXML interpreter could cache resources for a single phone call; however, single phone calls don't tend to be long enough for this to be worth the overhead. What is most assuredly worth the overhead is...

If you open multiple Web browsers on your machine at the same time, you can visit the same or different Web pages in each Web browser. These multiple browser instances can share the same cache for Web pages. That is, pages that are cached by one instance are available for use by the other instances. Correspondingly, at a VoiceXML hosting site, a single VoiceXML Media Gateway may service multiple phone calls at the same time. All of those phone calls share the same VoiceXML interpreter cache. So, resources downloaded for an application on one phone call may be available to the next phone call to that or to a different application on the same Media Gateway.

Many of us use the Web through our company's connection to it. Many companies are set up with proxy servers that sit between the company's machines and the outside world of the World Wide Web. A proxy server may store Web pages and then have those Web pages available to any machine that goes through that server out to the Web. Thus, even if you haven't visited a site from your machine, if someone else in your company has visited that site, its pages may be stored in the proxy server cache and be available to you from that cache instead of going out over the Web. While not as fast as if it were stored on your machine, the proxy cache is frequently still faster than downloading the page from the Web.

A BeVocal hosting site typically contains multiple VoiceXML Media Gateways, to facilitate handling more calls simultaneously. Just as your company uses a proxy cache for Web pages, a hosting site uses a proxy cache for VoiceXML resources. So, if anybody has used the Horoscopes application, its pages may be available in the persistent site-wide proxy cache. When a new call comes in for that application, that call may be able to use the pages from this cache, rather than going out and downloading the information again.

The following diagram shows how requests flow from your application through the various levels of caching and finally out to other servers on the Web.

When a user interacts with your application, she typically calls a phone number that is associated with a particular BeVocal hosting site (such as the one that hosts BeVocal Café applications). That site starts a local copy of the VoiceXML interpreter for your application and runs your application on one of its Media Gateways, retrieving resources as needed from wherever they live. As noted earlier, the resources for your application may live on multiple servers throughout the Web. Even if your resources are hosted at the hosting site, however, they must be fetched by the interpreter when the user calls.

Note: There is a naming ambiguity you should be aware of. Every phone call to a BeVocal hosting site gets its own local copy of the VoiceXML interpreter; it does not share that interpreter with other phone calls. However, the "VoiceXML Interpreter Cache" is a cache for an entire VoiceXML Media Gateway, not for an individual phone call into that Media Gateway. An individual instance of a VoiceXML interpreter (that is, a single phone call) does not have a completely separate cache.

At any one time, a lot of different users may be using a lot of different applications all at the same BeVocal hosting site. Each call will use a different set of resources; sometimes these resources overlap and sometimes they do not. For example, if 10 users simultaneously call the same Horoscopes application, they all need access to common documents and audio files for that application. If at the same time another 5 people call a Sports application, that second group will need a similar set of resources for the Sports application, but these resources probably won't overlap much with the resources needed by the group calling the Horoscopes application.

At its simplest, if all resources are fresh forever, the sequence would be as follows:

1. The first time a call accesses a particular resource, the interpreter generates the request, looks in the local VoiceXML interpreter cache, then in the site-wide proxy cache, and finally goes out and retrieves the resource from the appropriate server on the Web.
2. When a successful response comes back from that server, the resource is first stored in the site-wide proxy cache, then passed to the local interpreter cache where it is also stored, before finally being used by the running application.
3. Later in the same call, if the same resource is requested again, the interpreter gets it directly from the local interpreter cache, without having to download it again from the proxy cache, let alone from the original server.
4. When that call ends, the copy remains both in the local VoiceXML interpreter cache for the VoiceXML Media Gateway and in the site-wide proxy cache.
5. If another call comes in a few minutes later to the same Media Gateway and asks for the same resource, that call can use the copy in the local cache. If another call comes in to a different Media Gateway and asks for the same resource, that call does not have the resource in its local cache, but can use the resource copy in the proxy cache.
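The five steps above can be sketched in Python as two cache layers in front of an origin server. This is a simplified model under the same "fresh forever" assumption; the class names and the `origin` function are invented for illustration and do not correspond to any BeVocal API.

```python
from typing import Callable, Dict, List


class ProxyCache:
    """Site-wide cache shared by every Media Gateway (illustrative)."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._store: Dict[str, bytes] = {}
        self._fetch_from_origin = fetch_from_origin

    def get(self, uri: str) -> bytes:
        if uri not in self._store:
            # Miss: go out to the origin server and keep a copy.
            self._store[uri] = self._fetch_from_origin(uri)
        return self._store[uri]


class GatewayCache:
    """Interpreter cache shared by all calls on one Media Gateway."""

    def __init__(self, proxy: ProxyCache) -> None:
        self._store: Dict[str, bytes] = {}
        self._proxy = proxy

    def get(self, uri: str) -> bytes:
        if uri not in self._store:
            # Miss: ask the proxy cache, then keep a local copy.
            self._store[uri] = self._proxy.get(uri)
        return self._store[uri]


origin_hits: List[str] = []


def origin(uri: str) -> bytes:
    """Stand-in for a real Web server; records each request it serves."""
    origin_hits.append(uri)
    return b"audio-bytes"


proxy = ProxyCache(origin)
gateway_a = GatewayCache(proxy)
gateway_b = GatewayCache(proxy)

gateway_a.get("http://example.com/menu.wav")  # steps 1-2: fetched from origin
gateway_a.get("http://example.com/menu.wav")  # steps 3-5: local cache hit
gateway_b.get("http://example.com/menu.wav")  # step 5: proxy cache hit
print(len(origin_hits))
```

In this model the origin server is contacted only once, no matter how many gateways later ask for the same URI.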

Sounds relatively simple, doesn't it? Things get complicated for a variety of reasons:

 •  Sometimes the server providing the resource wants to control how long or even whether the information is cached. For example, for a resource that has security implications, the server may instruct the BeVocal platform to not cache the resource.
 •  Most resources are time sensitive, meaning that they are valid for some time period. For example, if the resource is the current stock price for some company, that resource may only be valid at the time requested. Or, if the resource is an audio file containing a daily horoscope, it may be valid for a single 24-hour period. On the other hand, if the resource is an audio file corresponding to the application's main menu, it may be theoretically valid for weeks or even months.
 •  Conversely, sometimes the VoiceXML application may want to control when to use cached information. For example, you may not have control over the server that stores your audio files and so cannot say when those audio files are no longer useful. In that case, you'd want the VoiceXML application to provide this information.

Typically, you do not need to be concerned with the differences between the VoiceXML interpreter's local cache and the proxy cache for the entire site. The facilities available for controlling caching within a VoiceXML application do not distinguish which cache you are controlling; they talk about whether a resource can be cached at all and for how long it can be cached. Those settings apply to all relevant caches within the BeVocal platform and to any relevant caches out on the Web, for example, at the application's Web server.

Also note that this discussion only addresses the caches that are present on the VoiceXML side. The remote server may have its own caches and proxies for storing resources before sending them to your application.

For simplicity, most of the rest of this document just talks about "the cache", without distinguishing between the local interpreter cache and the proxy cache.

Fundamentals of Controlling Caching

As we've said, you control caching with the HTTP response and request headers; that is, either on the responding server or from the requesting application. You can specify control information in both of these places. Typically, however, unless you're a caching expert, for a single type of resource (VoiceXML document, grammar file, audio file, SSML file, or XML file), you should decide whether you want to control that type of resource from the server or from the application. Things can get complicated if you try to control a single resource from both ends at once.

The primary concepts for controlling caching are the freshness of a resource (whether or not it has expired) and the time interval during which you can use a resource, based on its freshness or on when it was originally fetched.

Application Control

If you control a resource completely from your VoiceXML application, the normal HTTP fetch sequence for a GET request is basically as follows:

1. The first time the resource is requested, the request goes to the origin server; that is, it goes to the server on which the resource resides.
2. The origin server returns a response.
3. The date and time of receipt (the fetch date) are recorded by the requestor, the resource is stored in the cache, and the resource is returned to the requesting application.
  The response always includes the Date header, indicating when the response was generated. It might include a Last-Modified header, indicating when the resource was last changed on the server. The response may include an Etag header, which is a way of uniquely identifying the actual content of the resource.
4. Right now, we're assuming that the server has not specified any expiration information, so the resource expires immediately.
5. By default, the very next time the resource is requested, the interpreter must make a new request for the resource.
6. The new request can include one or both of the If-Modified-Since header, set to the fetch time of the cached resource, and the If-None-Match header, set to the cached Etag. If both are sent, the If-None-Match header takes precedence; if neither is present, the request must be for a completely new copy of the resource.
  Etag and If-None-Match are both HTTP 1.1 features. HTTP 1.1 servers give them precedence over the If-Modified-Since header. However, HTTP 1.0 servers use If-Modified-Since.
7. The server uses these headers to determine whether a new copy of the resource needs to be sent in the body of the response or whether it should simply indicate that the requesting application can use the copy in its cache.
8. If the response includes a new copy of the resource, the new copy and its headers replace the old copy in the cache.
9. Even if the response does not include a new copy of the resource, the cached header information can change. For example, the server can, and often does, send a new expiration date for the resource. Even if the server does not update the expiration date, the VoiceXML cache updates the default expiration time to the new fetch time.
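Step 7 is the heart of this sequence. The Python sketch below shows one plausible way a server could choose between sending a new copy (status 200) and telling the requestor to reuse its cached copy (status 304). The function name is hypothetical, and real servers apply more detailed comparison rules than this.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Dict, Optional


def choose_status(request_headers: Dict[str, str],
                  current_etag: Optional[str],
                  last_modified: datetime) -> int:
    """Return 304 if the requestor's cached copy is still good, else 200."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None and current_etag is not None:
        # If-None-Match takes precedence over If-Modified-Since.
        return 304 if if_none_match == current_etag else 200

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since)
        return 200 if last_modified > since else 304

    # Neither header present: the request is for a completely new copy.
    return 200


changed_at = datetime(2002, 12, 12, 15, 0, 0, tzinfo=timezone.utc)
print(choose_status({"If-None-Match": '"v1"'}, '"v1"', changed_at))
print(choose_status({}, '"v1"', changed_at))
```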

That sequence of events also applies to a POST request, but only if the response includes an Expires or Cache-Control header.

When controlling caching from your application, you change step 5 above, by specifying how stale the resource can be. That is, you can use the Cache-Control: max-stale request header to indicate a number of seconds after the expiration of a resource during which your application can use the expired resource. (Remember that you use VoiceXML attributes and properties to set this header information; see Maximum Stale Time.)

Here it helps to remember the difference between the local VoiceXML interpreter cache and the site-wide proxy cache. If a particular call needs a resource and that resource is already in its local cache, fetched within max-stale seconds, the interpreter doesn't need to send a request anywhere. However, if the resource is either not in the local cache at all or the local copy is too old, a request is sent to the proxy cache, to see if it contains a copy fetched within max-stale seconds. If the proxy cache does not contain an appropriately dated resource, the proxy cache sends a request to the server for a new copy.
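The lookup order just described can be sketched as follows. The sketch assumes the application-control case, where the server sends no expiration information, so a copy's staleness equals its age. The function names and the string labels are illustrative only.

```python
from datetime import datetime
from typing import Optional


def usable_without_refetch(fetch_time: datetime, now: datetime,
                           max_stale_seconds: int) -> bool:
    """With no server expiration info the copy expires at fetch time,
    so it is usable only while its age is within max-stale."""
    staleness = (now - fetch_time).total_seconds()
    return staleness <= max_stale_seconds


def where_served_from(local_fetch_time: Optional[datetime],
                      proxy_fetch_time: Optional[datetime],
                      now: datetime, max_stale_seconds: int) -> str:
    """Check the local interpreter cache, then the proxy cache, then
    fall back to the origin server. None means 'not in that cache'."""
    candidates = (("local cache", local_fetch_time),
                  ("proxy cache", proxy_fetch_time))
    for name, fetched in candidates:
        if fetched is not None and usable_without_refetch(
                fetched, now, max_stale_seconds):
            return name
    return "origin server"


now = datetime(2002, 12, 12, 15, 40, 0)
local_copy = datetime(2002, 12, 12, 15, 30, 0)   # fetched 600 seconds ago
proxy_copy = datetime(2002, 12, 12, 15, 39, 0)   # fetched 60 seconds ago
print(where_served_from(local_copy, proxy_copy, now, max_stale_seconds=120))
```

Here the local copy is too old for a max-stale of 120 seconds, but the proxy copy is recent enough, so no request reaches the origin server.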

Server Control

If you control a resource primarily from the server, the sequence is basically as follows:

1. The first time the resource is requested, the request goes to the origin server (as a GET request).
2. The origin server returns a response.
3. Unless the response includes a Cache-Control: no-cache or Cache-Control: no-store header, the response headers are cached for later use by the requestor. The fetch date is recorded, the resource is stored, and the resource is returned to the requesting application.
  Note: In HTTP 1.1, the relevant header is Cache-Control, not Pragma. HTTP 1.0 used Pragma, but this directive is no longer understood by many servers.
  The response still includes the Date header and might include Last-Modified or Etag headers.
  With server control, the response typically includes either an Expires header or a Cache-Control: max-age header. If it contains both, the max-age header takes precedence over the Expires header. Expires indicates an exact date and time at which the resource expires. Cache-Control: max-age indicates a number of seconds after the Date at which the resource expires.
  For example, these 2 sets of headers are equivalent:
 
Date: 12 December 2002 15:34:00 GMT
Expires: 12 December 2002 15:36:00 GMT
  or
 
Date: 12 December 2002 15:34:00 GMT
Cache-Control: max-age=120
4. The next time the resource is requested, if the resource was cached, the interpreter uses the appropriate combination of Expires, max-age, and Date headers to determine whether or not the resource has expired.
5. If the resource has not expired, the interpreter returns the cached copy. In our example, if the next request is at 12 December 2002 15:34:30 GMT, the cached copy is returned.
6. If the resource has expired, the interpreter makes a new request for the resource. In our example, if the next request is at 12 December 2002 15:36:30 GMT, a new request is sent to the origin server.
7. If available, the new request includes both the If-Modified-Since header, set to the Last-Modified header of the original response, and the If-None-Match header, set to the Etag of the original response. If both are sent, the If-None-Match header takes precedence; if neither is present, the request must be for a completely new copy of the resource.
  Etag and If-None-Match are both HTTP 1.1 features. HTTP 1.1 servers give them precedence over the If-Modified-Since header. However, HTTP 1.0 servers use If-Modified-Since.
8. The server uses these headers to determine whether a new copy of the resource needs to be sent in the body of the response or whether it should simply indicate that the requesting application can use the copy in its cache.
9. If the response includes a new copy of the resource, the new copy and its headers replace the old copy in the cache.
10. Even if the response does not include a new copy of the resource, the cached header information can change. For example, the server can, and often does, send a new expiration date for the resource. Even if the server does not update the expiration date, the VoiceXML cache updates the default expiration time to the new fetch time.
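The expiration arithmetic in steps 3 through 6 can be checked with a short Python sketch that uses the example timestamps above. The function names are hypothetical, and the sketch works on already-parsed dates rather than raw header strings.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def expiration_time(date: datetime,
                    expires: Optional[datetime] = None,
                    max_age: Optional[int] = None) -> datetime:
    """Compute when a response expires from its Date, Expires, and
    Cache-Control: max-age values."""
    if max_age is not None:
        # max-age takes precedence over Expires when both are present.
        return date + timedelta(seconds=max_age)
    if expires is not None:
        return expires
    # No expiration information: the resource expires immediately.
    return date


def is_fresh(now: datetime, date: datetime,
             expires: Optional[datetime] = None,
             max_age: Optional[int] = None) -> bool:
    return now < expiration_time(date, expires, max_age)


date = datetime(2002, 12, 12, 15, 34, 0, tzinfo=timezone.utc)
expires = datetime(2002, 12, 12, 15, 36, 0, tzinfo=timezone.utc)

# The two header sets from the example yield the same expiration time.
print(expiration_time(date, expires=expires) == expiration_time(date, max_age=120))

# A request at 15:34:30 can use the cache; one at 15:36:30 cannot.
print(is_fresh(date + timedelta(seconds=30), date, max_age=120))
print(is_fresh(date + timedelta(seconds=150), date, max_age=120))
```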

You may have no control over the expiration times sent by your server. As an example, you may even know that some resources are modified more frequently than the server indicates with its expiration times. In this situation, you have some options on how to change the caching behavior.

 •  You might choose not to request a new resource based on when the resource in your cache expires, but rather on how long ago you fetched the resource in your cache. You use the Cache-Control: max-age request header for this purpose. (Remember that you use VoiceXML attributes and properties to set this header information; see Maximum Age.)
  Note: In a response header, max-age indicates when the resource will expire, regardless of when it was fetched. In a request header, it indicates the opposite; max-age indicates how long ago the resource could have been fetched, regardless of when it will expire.
  You can specify both max-age and max-stale headers in the same request. If you do so, the max-age header takes precedence over the max-stale header.
 •  For VoiceXML document and grammar document resources, you have another alternative. You can use the http-equiv attribute of the <meta> tag to act as if you had set response headers for these resource types. When the VoiceXML interpreter parses the VoiceXML document or grammar, if it encounters a <meta> tag that specifies caching information, the interpreter must then go and modify the cache to include this information.
  Note: This method is not recommended because it can only affect the local VoiceXML cache. The overhead of having the intermediate site-wide proxy caches interpret every file would be prohibitive. The proxy caches do not interpret the contents of files; they only look at the headers. Consequently, proxy caches do not understand or implement the caching behavior specified in <meta> tags.
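To make the interaction between the request-side max-age and max-stale settings concrete, here is a minimal Python sketch of the decision a cache might make. It reflects one reading of the precedence rule described above (a copy older than max-age is always refetched); the function and parameter names are assumptions, not platform APIs.

```python
from typing import Optional


def cached_copy_acceptable(age: int, freshness_lifetime: int,
                           max_age: Optional[int] = None,
                           max_stale: Optional[int] = None) -> bool:
    """Decide whether a cached copy can satisfy a request.

    age                 seconds since the copy was fetched
    freshness_lifetime  seconds the server said the copy stays fresh
    max_age, max_stale  values from the Cache-Control request header
    """
    if max_age is not None and age > max_age:
        # Request max-age wins: refetch even if the server says "fresh"
        # and even if max-stale would otherwise allow the copy.
        return False
    if age <= freshness_lifetime:
        return True
    if max_stale is not None:
        # Expired, but perhaps not by more than the application tolerates.
        return age - freshness_lifetime <= max_stale
    return False


# Server says fresh for an hour, but the application insists on 60 seconds.
print(cached_copy_acceptable(age=90, freshness_lifetime=3600, max_age=60))
# Expired 30 seconds ago; the application tolerates 60 seconds of staleness.
print(cached_copy_acceptable(age=90, freshness_lifetime=60, max_stale=60))
```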

Summary of HTTP Headers

Remember that for your VoiceXML application, you do not create HTTP request headers manually. Rather you use the appropriate attributes or properties that are described later in this chapter. The VoiceXML interpreter translates these attributes and properties in a fairly obvious way to the HTTP request headers described here.

The HTTP request headers most commonly relevant for caching with the BeVocal platform and VoiceXML interpreter are:

Request Header Description
User-Agent: BeVocal/ivers VoiceXML/vvers BVPlatform/pvers

Every request contains a User-Agent header of this form. ivers is the version of the BeVocal VoiceXML interpreter, vvers is the version of VoiceXML used in the requesting document, and pvers is the BeVocal platform version. For example:

User-Agent: BeVocal/2.4 VoiceXML/2.0 BVPlatform/1.8.0.4

If you need to detect the User-Agent for a request, you should probably ignore the last 2 digits in the platform version, as these are likely to change occasionally.

If-Modified-Since: date

If the modification date on the requested resource is after this time, send a new copy.

If-None-Match: tag

tag is a unique identifier for a resource. If the resource that would be provided for the request has the same identifier as tag, don't send a new copy.

Cache-Control: max-age=N

A number of seconds after which the resource must be fetched again from the origin server, regardless of whether or not it has expired.

Note that max-age in a response header is quite different from max-age in a request header. In a response header, it lets the server specify a number of seconds during which the resource is still fresh. In a request header, it lets the application specify a number of seconds after which a resource must be refetched, even if the server says the resource is still fresh.

Cache-Control: max-stale=N

A number of seconds after the expiration time during which the application is still willing to use an expired resource.
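If your server needs to detect the User-Agent header described above, a sketch along the following lines may help. It assumes that "the last 2 digits in the platform version" means the last two dot-separated components; the function name and the returned dictionary keys are invented for illustration.

```python
import re
from typing import Dict

_USER_AGENT = re.compile(
    r"BeVocal/(?P<interpreter>\S+)\s+"
    r"VoiceXML/(?P<voicexml>\S+)\s+"
    r"BVPlatform/(?P<platform>\S+)"
)


def parse_user_agent(value: str) -> Dict[str, str]:
    """Split a BeVocal User-Agent value into its version fields."""
    match = _USER_AGENT.search(value)
    if match is None:
        raise ValueError(f"not a BeVocal User-Agent: {value!r}")
    info = match.groupdict()

    # Drop the last two components of the platform version, since they
    # are the part most likely to change between requests over time.
    parts = info["platform"].split(".")
    info["platform_stable"] = ".".join(parts[:-2] or parts)
    return info


info = parse_user_agent("BeVocal/2.4 VoiceXML/2.0 BVPlatform/1.8.0.4")
print(info["interpreter"], info["voicexml"], info["platform_stable"])
```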

If you have control over the response sent by the server, you can directly set response headers (or even configure the server itself). The HTTP response headers most commonly relevant for caching with the BeVocal platform and VoiceXML interpreter are:

Response Header Description
Etag: tag

A unique identifier for this response. The details of how the server creates the identifier are server-specific; what is important is that the identifier indicates a particular version of the resource so the server can determine if it has been modified.

Date: date

The date and time the response was generated. This header is required for all HTTP 1.1 responses.

Expires: date

The date and time on which this response expires. This is an HTTP 1.0 header, but is still widely used with HTTP 1.1. If Expires is set to 0, the interpreter does not cache the resource.

Cache-Control: max-age=N

The resource expires N seconds after generation. That is, to determine the expiration date, add N seconds to the date specified with the Date header.

Note that max-age in a response header is quite different from max-age in a request header. In a response header, it lets the server specify a number of seconds during which the resource is still fresh. In a request header, it lets the application specify a number of seconds after which a resource must be refetched, even if the server says the resource is still fresh.

Cache-Control: s-maxage=N

For a shared cache (but not for a private cache), the maximum age specified by this directive overrides the maximum age specified by either the max-age directive or the Expires header.

Cache-Control: no-cache

Do not store this resource in the cache.

Cache-Control: no-store

For VoiceXML applications, has the same effect as the no-cache directive.

Cache-Control: must-revalidate

The cache must not use the resource after its expiration time, even if the request says that stale information is acceptable.

Cache-Control: private

All or part of the response is intended for a single user. The response must not be cached by a shared cache; it can be cached by a private (non-shared) cache.

Cache-Control: public

The response is cacheable by any cache, even if it would normally be non-cacheable.

Cache-Control: proxy-revalidate

For VoiceXML applications, this is equivalent to the must-revalidate directive.
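The following Python sketch shows how a cache might read the Cache-Control response directives listed above. It follows this chapter's description (no-cache and no-store both mean "do not store") and is deliberately simplified: it splits on commas, so it does not handle quoted directive arguments that themselves contain commas. All function names are hypothetical.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Parse a Cache-Control header value into {directive: argument}."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, argument = part.partition("=")
        directives[name.strip().lower()] = argument.strip().strip('"') or None
    return directives


def may_store(directives: Dict[str, Optional[str]], shared_cache: bool) -> bool:
    """Apply the storing rules from the table above."""
    if "no-cache" in directives or "no-store" in directives:
        return False
    if shared_cache and "private" in directives:
        return False
    return True


def freshness_seconds(directives: Dict[str, Optional[str]],
                      shared_cache: bool) -> Optional[int]:
    """s-maxage overrides max-age, but only for a shared cache."""
    s_maxage = directives.get("s-maxage")
    if shared_cache and s_maxage is not None:
        return int(s_maxage)
    max_age = directives.get("max-age")
    return int(max_age) if max_age is not None else None


parsed = parse_cache_control("private, max-age=120, s-maxage=30")
print(may_store(parsed, shared_cache=True), may_store(parsed, shared_cache=False))
print(freshness_seconds(parsed, shared_cache=False))
```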

What You Can Control from VoiceXML

From your VoiceXML application, you can control various things about fetching and caching of resources. The various attributes and properties that provide this control are collectively referred to as the application's fetch policies. The fetch policies govern the following aspects of fetching:

 •  Prefetching Resources--The VoiceXML interpreter can try to start fetching resources before they are actually needed (prefetch them), in an attempt to have them already available when actually required.
 •  Handling Fetching Delays--No matter what else you do, there inevitably will be noticeable delays between when a resource is requested and when it is available.
 •  Controlling the Use of Cached Resources--These are the policies that control request and response headers. See How Fetching and Caching Work for how request headers affect caching.

Some fetch policies are set by a single property for all types of resources. Other fetch policies can be set separately for different types of resource. For these policies, there is usually a corresponding set of properties, one for each of these resource types:

 •  VoiceXML documents
 •  Recorded audio data
 •  Grammar files
 •  JavaScript source files
 •  SSML files (Extension)
 •  XML data files (Extension)

In addition, for all of the fetch policies, the appropriate VoiceXML tags support a corresponding attribute.

For example, to optimize fetch operations, you can use the audiofetchhint, documentfetchhint, grammarfetchhint, scriptfetchhint, and ssmlfetchhint properties. In addition, the <audio>, <choice>, <data>, <dtmf>, <goto>, <grammar>, <link>, <script>, and <subdialog> tags all support the fetchhint attribute.

All policies have default settings. An application can change any default setting with a <property> element that sets a property corresponding to the policy to be changed. Any tag that requests a fetch operation includes attributes that can be set to override the current policy settings during that one fetch operation:

 •  A property set in the <vxml> element of a single-document application or the application root document of a multidocument application sets the policy for fetching resources from that document and the application, overriding the default setting.
 •  A property set in the <vxml> element of a non-root document of a multidocument application sets the policy for fetching resources from that document, overriding the setting for the application.
 •  A property set in a <form> or <menu> element sets the policy for fetching resources from that dialog, overriding the setting for the containing document.
 •  A property set in a form item sets the policy for fetching resources from that form item, overriding the setting for the containing form.
 •  An attribute set in an element that fetches a resource sets the policy for that fetch, overriding any other setting of the policy.
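The scoping rules in this list amount to a lookup from the innermost scope outward. The Python sketch below models that lookup; the two default values come from the fetch hint defaults given later in this chapter, while the function name and its keyword parameters are assumptions made for the example.

```python
from collections import ChainMap
from typing import Mapping, Optional

# Defaults for two of the fetch hint properties, as listed in this chapter.
PLATFORM_DEFAULTS = {"audiofetchhint": "prefetch", "documentfetchhint": "safe"}


def effective_policy(name: str,
                     attribute: Optional[str] = None,
                     *,
                     form_item: Optional[Mapping[str, str]] = None,
                     dialog: Optional[Mapping[str, str]] = None,
                     document: Optional[Mapping[str, str]] = None,
                     application: Optional[Mapping[str, str]] = None) -> str:
    """Resolve one fetch policy for one fetch operation."""
    if attribute is not None:
        # An attribute on the fetching element overrides every property.
        return attribute
    # ChainMap searches left to right, so the innermost scope goes first.
    scopes = ChainMap(dict(form_item or {}), dict(dialog or {}),
                      dict(document or {}), dict(application or {}),
                      PLATFORM_DEFAULTS)
    return scopes[name]


print(effective_policy("audiofetchhint"))
print(effective_policy("audiofetchhint",
                       application={"audiofetchhint": "safe"},
                       dialog={"audiofetchhint": "prefetch"}))
print(effective_policy("audiofetchhint", "safe",
                       dialog={"audiofetchhint": "prefetch"}))
```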

There are a few subtleties you need to be clear about:

 •  When you set fetch properties within the document foo.vxml, you are setting properties for resources that foo.vxml fetches; you are not setting properties that affect fetching foo.vxml itself.
 •  Fetch properties are always set by some VoiceXML document. For any given phone call to an application, the initial document of the application is not fetched from another VoiceXML document. Consequently, there is no way to set fetching policies for the initial document.
 •  The application root (for example, root.vxml) for document foo.vxml is set in the <vxml> tag of foo.vxml. The application root document is not directly fetched by another document and it inherits the fetch policies of the document of which it is the root. For example, if the document bar.vxml fetches foo.vxml, and bar.vxml sets maxage=120 for foo.vxml, then the VoiceXML interpreter also uses maxage=120 for root.vxml.

The following sections describe the policies and their default settings, and also list the properties that can be used to set each policy. For a detailed description of the various properties, see Chapter 12, Properties.

Prefetching Resources

The interpreter can attempt to optimize dialog interpretation by prefetching files that might be needed. The interpreter prefetches resources used by a document by starting to fetch them as soon as a document is loaded, rather than waiting until execution of the VoiceXML tags that reference those resources. Prefetching can improve an application's performance by allowing it to fetch resources during "free time" while the user is speaking or listening to a dialog.

The interpreter prefetches resources in the order in which they appear in the document. Consequently, those near the top of the document are retrieved first, unless there are delays at the server, heavy Internet traffic, and so on. The interpreter prefetches several resources at once; a delay retrieving one resource does not affect others.

Note: Prefetching resources can generate many simultaneous requests on your server.

Prefetch Cache Details

The VoiceXML interpreter prefetches resources separately for each phone call and for each document executed within that phone call.

While the interpreter executes a single VoiceXML document on a single phone call, it has a queue of the resources that it can prefetch for that document. During execution of that document on that call, the interpreter prefetches as many resources as it can and puts them in the prefetch cache.

During the execution of a document, the VoiceXML interpreter always checks the prefetch cache for a resource before initiating a new fetch operation. If the resource is in the prefetch cache, the interpreter uses it, even if the resource expires between when it is prefetched and when it is needed.

When execution leaves the document (either by the call ending or by transitioning to another document in the same call), the interpreter flushes the prefetch queue and cache and starts over for the next document.

Note that if there are multiple simultaneous calls to the same application, they may be executing the same document at the same time. However, the resources for each phone call will be in separate prefetch caches. This means that in some cases, the interpreter will fetch a new copy of a resource for one phone call even though another phone call is using an unexpired copy.

To illustrate all this, assume that an application has documents d1.vxml and d2.vxml, both of which refer to the same audio file, foo.wav. If there are 2 simultaneous calls to this application, then at the same time foo.wav might be stored in the prefetch cache of d1.vxml for the first call and the separate prefetch cache of d1.vxml for the second call. Or, it might be in the prefetch cache of d1.vxml for the first call and the prefetch cache for d2.vxml for the second call. foo.wav cannot, however, be in the prefetch caches for d1.vxml for the first call and for d2.vxml for the first call, because 2 different documents on the same phone call cannot be executing at the same time and so cannot have active prefetch caches at the same time.
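
The scoping rules above can be pictured with a small model. This is only an illustrative sketch, not the interpreter's implementation: the PrefetchCaches class, its method names, and the call identifiers are all hypothetical. It keeps one prefetch cache per call, tied to the document that call is currently executing.

```javascript
// Hypothetical model: each call has at most one active prefetch cache,
// belonging to the document that the call is currently executing.
class PrefetchCaches {
  constructor() {
    this.active = new Map(); // callId -> { doc, cache: Map(uri -> body) }
  }

  // Entering a document discards whatever the call had prefetched before.
  enterDocument(callId, doc) {
    this.active.set(callId, { doc, cache: new Map() });
  }

  prefetch(callId, uri, body) {
    this.active.get(callId).cache.set(uri, body);
  }

  // A prefetched copy is returned as is; no expiry check is made here.
  lookup(callId, uri) {
    const entry = this.active.get(callId);
    return entry && entry.cache.has(uri) ? entry.cache.get(uri) : undefined;
  }

  endCall(callId) {
    this.active.delete(callId);
  }
}

const caches = new PrefetchCaches();
caches.enterDocument("call-1", "d1.vxml");
caches.enterDocument("call-2", "d2.vxml");
caches.prefetch("call-1", "foo.wav", "audio bytes (copy 1)");
caches.prefetch("call-2", "foo.wav", "audio bytes (copy 2)");

// Moving call-1 on to d2.vxml flushes its d1.vxml prefetch cache.
caches.enterDocument("call-1", "d2.vxml");
console.log(caches.lookup("call-1", "foo.wav")); // undefined
console.log(caches.lookup("call-2", "foo.wav")); // audio bytes (copy 2)
```

In this model, foo.wav is held once per call, and a call never holds prefetched copies for two documents at once.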

Fetch Hints

Prefetching is controlled by instructions called hints, which are specified by the fetchhint attribute and by the typefetchhint properties, where type is a placeholder for the type of resource to be fetched. For example, the audiofetchhint property controls prefetching of audio files.

The fetch hint policy can be set to one of the following values:

 •  prefetch--Fetch the resource when the page is loaded.
 •  safe--Fetch the resource only when it is needed.

Any tag that can fetch a resource has a fetchhint attribute that specifies how to fetch the resource. If this attribute is not set, the interpreter uses the current value of the appropriate typefetchhint property, where type is a placeholder for the type of resource to be fetched, as shown in the table below.

Resource type               Tags that support the fetchhint attribute  Property           Default value (for property)
Recorded audio data         <audio>                                    audiofetchhint     prefetch
VoiceXML documents          <choice>, <goto>, <link>, <subdialog>      documentfetchhint  safe
SSML document               <audio>                                    ssmlfetchhint      safe
Grammar files               <grammar>, <dtmf> (VoiceXML 1.0 only)      grammarfetchhint   prefetch
JavaScript source files     <script>                                   scriptfetchhint    prefetch
SSML files (Extension)      <audio>                                    ssmlfetchhint      prefetch
XML data files (Extension)  <data>                                     datafetchhint      safe
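
The fallback from attribute to property to default can be sketched as a lookup. This is a minimal illustration rather than interpreter code: the function name effectiveFetchHint and the shape of its arguments are assumptions, and only a subset of the resource types from the table is included.

```javascript
// Property defaults copied from the table above (subset only; SSML omitted).
const defaultFetchHints = {
  audio: "prefetch",
  document: "safe",
  grammar: "prefetch",
  script: "prefetch",
  data: "safe",
};

// Hypothetical helper: the tag's fetchhint attribute wins; otherwise the
// current value of the typefetchhint property; otherwise the default.
function effectiveFetchHint(resourceType, attributeValue, properties = {}) {
  if (attributeValue !== undefined) return attributeValue;
  const propertyName = resourceType + "fetchhint";
  if (properties[propertyName] !== undefined) return properties[propertyName];
  return defaultFetchHints[resourceType];
}

console.log(effectiveFetchHint("audio"));         // prefetch
console.log(effectiveFetchHint("audio", "safe")); // safe
console.log(effectiveFetchHint("document", undefined, { documentfetchhint: "prefetch" })); // prefetch
```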

Restrictions on Prefetching

Prefetching is disabled when the URI or other attributes of a tag are computed at runtime. In these cases, even if the fetching hints specify prefetch, the interpreter cannot fetch the resource until the tag is executed and the exact values of the attributes are determined.

For example, programmers sometimes simplify their job by writing VoiceXML such as:

 <audio expr="audioURI('hello')"/>

where audioURI() is a JavaScript function that adds a prefix such as http://mycompany.com/audio/ and an ending such as .wav to the parameter, resulting in a complete URI of http://mycompany.com/audio/hello.wav. This technique saves some typing and simplifies program maintenance. However, the interpreter cannot prefetch the audio file in this case, because the exact URI is not known until the tag is executed.
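
For illustration, such a helper might look like the following sketch. The constant names are hypothetical; the prefix and ending are the ones used in the description above.

```javascript
// Hypothetical definition of the audioURI() helper described above.
var AUDIO_PREFIX = "http://mycompany.com/audio/";
var AUDIO_SUFFIX = ".wav";

// Builds the complete URI of an audio file from its short name.
function audioURI(name) {
  return AUDIO_PREFIX + name + AUDIO_SUFFIX;
}

console.log(audioURI("hello")); // http://mycompany.com/audio/hello.wav
```

Because the URI exists only after the function has been evaluated, the interpreter has nothing to prefetch when the document is loaded.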

Handling Fetching Delays

Regardless of how well you set the various caching and prefetching policies, inevitably fetching resources from remote servers will sometimes generate delays. Various fetch policies control how the interpreter handles these delays.

Timeouts

By default, the interpreter waits up to one minute for a resource or document to be fetched. The application can control this behavior with the fetchtimeout attribute of a tag that fetches a resource. That attribute specifies how long the interpreter waits for a resource to arrive. If the resource does not arrive within the specified time, the interpreter throws an error.badfetch event. The value is a number representing the time in milliseconds.

This attribute is available for all tags that fetch resources, specifically:

 •  <audio>
 •  <choice>
 •  <data> (Extension)
 •  <goto>
 •  <grammar>
 •  <link>
 •  <script>
 •  <subdialog>
 •  <submit>
 •  <dtmf> (VoiceXML 1.0 only)
 •  <send> (VoiceXML 1.0 only; Extension)

When this attribute is not specified, the interpreter uses the current value of the fetchtimeout property, whose default value is 60 seconds.

Background Audio

By default, the user does not hear any audio output while the interpreter is fetching a resource of any kind. The application can change this behavior with the fetchaudio policy, which specifies the URI of a "background audio" file to be played while the interpreter fetches a VoiceXML document or XML data file. Background audio can be helpful if the fetch operation may cause a noticeable delay in processing, such as when an on-line purchase is being verified and processed by a transaction server. The audio file can contain music, a "please wait" message, and so on.

Background audio is never played while the interpreter fetches grammar, audio, or script files. It is only played when fetching VoiceXML documents or XML data files. The fetchaudio policy is controlled with the fetchaudio property and the fetchaudio attribute of the following tags:

 •  All tags that fetch VoiceXML documents: <choice>, <goto>, <link>, <subdialog>, and <submit>.
 •  Extension. The <data> tag, which fetches XML data files; this attribute is relevant only if the bevocal.fetchaudio.allfetches property is true.
 •  Extension; VoiceXML 1.0 only. The <send> tag, which submits values to a Web server.

If the fetchaudio attribute is not specified, the interpreter uses the current value of the fetchaudio property. This property does not have a default value; that is, by default no background audio is played.

When a background audio file is specified for a fetch operation, the fetching of the background audio file itself is governed by the audiofetchhint, audiomaxage, audiomaxstale, and fetchtimeout properties that are in effect at the time of the fetch. (In VoiceXML 1.0 applications, the caching property is used in place of audiomaxage and audiomaxstale.)

Note: The interpreter plays the background audio file only once during a given fetch operation; it does not loop (repeat).

Two properties govern the playing of the background audio clip:

 •  The interpreter does not start to play the audio file unless the time to fetch the resource exceeds a limit set by the fetchaudiodelay property. This can prevent the user from hearing very short audio clips when there are very slight delays in fetching resources. The default value of this property is 0.
 •  The value of the fetchaudiominimum property is the minimum time interval to play the fetchaudio source, once started, even if the fetch operation completes during play. The default value of this property is 0; with this default, the interpreter interrupts the audio playback as soon as the resource is fetched, and resumes normal processing. If set to a larger value, it prevents the user from hearing a short clip of background audio which is immediately cut off.
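
The interaction of these two properties can be sketched as a small timing calculation. This is a simplified model under stated assumptions, not the interpreter's logic: the function name backgroundAudioWindow is hypothetical, all times are in the same unit and measured from the start of the fetch, and the length of the audio clip (which plays only once) is ignored.

```javascript
// Hypothetical model: returns when background audio starts and stops,
// or null if it is never played for this fetch.
function backgroundAudioWindow(fetchDuration, fetchaudiodelay = 0, fetchaudiominimum = 0) {
  // Audio starts only if the fetch takes longer than the delay.
  if (fetchDuration <= fetchaudiodelay) return null;
  const start = fetchaudiodelay;
  // Once started, playback lasts at least fetchaudiominimum; otherwise it
  // stops as soon as the fetch completes.
  const end = Math.max(fetchDuration, start + fetchaudiominimum);
  return { start, end };
}

console.log(backgroundAudioWindow(1, 2, 5));  // null (fetch finished before the delay)
console.log(backgroundAudioWindow(3, 2, 5));  // { start: 2, end: 7 }
console.log(backgroundAudioWindow(10, 2, 5)); // { start: 2, end: 10 }
```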

Queued Prompts when Fetching

By default in VoiceXML 2.0, queued prompts are not played in the background during the execution of a fetch.

However, for those tags which fetch data and for which background audio can be played during the fetch (<choice>, <goto>, <link>, <subdialog>, <submit>, and <data>), if background audio will be played (as described in Background Audio), then queued prompts are played during the fetch and before the background audio is played.

If background audio will not be played, but the bevocal.fetchaudio.flushqueue property is set to true, then queued prompts will still be played during the fetch.

Controlling the Use of Cached Resources

After a resource expires, it remains in the cache although it is "stale". If the same file is needed in the future, the interpreter takes one of the following actions:

 •  Uses the stale cached file as is.
 •  Revalidates the stale cached file with a Get-If-Modified request to the resource's server. If the server replies that the resource has not been modified, the interpreter uses the cached copy.
 •  Refetches the resource unconditionally.

If the interpreter needs a resource and the cache contains a copy of that resource, there are 3 primary policies governing whether the interpreter uses the cached copy:

 •  The Maximum Age for the cached file
 •  The Maximum Stale Time for the cached file
 •  (VoiceXML 1.0 only) The Caching policy for the file

These policies all affect the headers sent when a resource is requested. In addition, you can set caching information that would normally be in the response header for a VoiceXML document or a grammar document. (See Mimicking Response Headers.)

See Prefetch Cache Details for how the prefetch cache interacts with these policies.

Maximum Age

An application can specify that it will use a cached resource only if its time in the cache does not exceed a maximum age:

 •  If the cached copy is older than the maximum, it will be refetched with a Get-If-Modified header.
 •  If the cached copy is within the maximum age and has not expired, it will be used.
 •  If the cached copy is within the maximum age but has expired, the relevant maximum-stale-time policy determines whether the interpreter uses the expired cached file. See Maximum Stale Time.

Any tag that can fetch a resource has a maxage attribute that specifies the maximum age in seconds of a cached resource. If this attribute is not set, the interpreter uses the current value of the appropriate typemaxage property, where type is a placeholder for the type of resource to be fetched. For example, the audiomaxage property specifies the maximum age for audio files.

In VoiceXML 1.0 applications, when no value is set for maxage, the caching attribute controls whether an unexpired cached file is used.

Resource type               Tags that support the maxage attribute           Property
Recorded audio data         <audio>                                          audiomaxage
VoiceXML documents          <choice>, <goto>, <link>, <subdialog>, <submit>  documentmaxage
SSML documents              <audio>                                          ssmlmaxage
Grammar files               <grammar>                                        grammarmaxage
JavaScript source files     <script>                                         scriptmaxage
XML data files (Extension)  <data>                                           datamaxage
SSML data (Extension)       <audio>                                          ssmlmaxage

No default is set for these properties, which means that any unexpired cached file will be used.

If you set a maximum-age property to a non-zero value, you ensure that:

 •  The interpreter uses an unexpired resource whose age is less than or equal to the maximum age--without doing a Get-If-Modified request to verify that the cached file is up to date.
 •  The interpreter fetches a fresh copy of a resource whose age is more than the maximum age--even if the cached file has not yet expired.

For example, suppose you fetch a VoiceXML document file that expires in 60 seconds, and after 40 seconds you need the same file. If documentmaxage is set to 30, the application will refetch the document file; if documentmaxage is set to 60, it will use the cached file.

You can set a maximum-age property to 0 to ensure that a fresh copy is fetched if the resource has been modified since it was last fetched.

Maximum Stale Time

An application can specify that an expired file that is "not too stale" can still be used. The maximum stale time for a file is the time by which its expiration time can be exceeded. Within this allowable stale time, an expired cached file will be used without being refetched. If an expired cached file is needed after its maximum stale time has been exceeded, the file will be refetched.

Any tag that can fetch a resource has a maxstale attribute that specifies the maximum time in seconds during which a stale (expired) cached resource may be used. If this attribute is not set, the interpreter uses the current value of the appropriate typemaxstale property, where type is a placeholder for the type of resource to be fetched. For example, the audiomaxstale property specifies the maximum stale time for audio files. The following table specifies the appropriate typemaxstale property for each resource type.

Resource type               Tags that support the maxstale attribute         Property          Default value (for property)
Recorded audio data         <audio>                                          audiomaxstale     300s
VoiceXML documents          <choice>, <goto>, <link>, <subdialog>, <submit>  documentmaxstale  0s
SSML documents              <audio>                                          ssmlmaxstale      0s
Grammar files               <grammar>                                        grammarmaxstale   0s
JavaScript source files     <script>                                         scriptmaxstale    0s
XML data files (Extension)  <data>                                           datamaxstale      0s
SSML data (Extension)       <audio>                                          ssmlmaxstale      300s

The maximum stale time is relevant either when the expired file is within the maximum age or when no maximum age is set for the file. If the number of seconds since the cached file expired is less than or equal to the maximum stale time, the cached file is used. If the file has been expired for longer than the maximum stale time, the interpreter does a Get-If-Modified request to update the cached file, if necessary.
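
Taken together, the maximum-age and maximum-stale-time rules amount to a short decision procedure. The sketch below is one way to model it, not the interpreter's code: the function name cacheDecision, its parameter names, and the two result labels are assumptions, and all times are in seconds.

```javascript
// Hypothetical model of the decision described above.
//   age      - how long the copy has been in the cache
//   lifetime - how long after being fetched the copy expires
//   maxage   - maximum age, or undefined when none is set
//   maxstale - maximum stale time
function cacheDecision({ age, lifetime, maxage, maxstale = 0 }) {
  // Older than the maximum age: go back to the server, expired or not.
  if (maxage !== undefined && age > maxage) return "ask-server";
  // Within the maximum age (or none set) and not expired: use the copy.
  if (age <= lifetime) return "use-cached";
  // Expired: usable only while within the maximum stale time.
  const staleFor = age - lifetime;
  return staleFor <= maxstale ? "use-cached" : "ask-server";
}

// The document example from Maximum Age: expires in 60s, needed after 40s.
console.log(cacheDecision({ age: 40, lifetime: 60, maxage: 30 })); // ask-server
console.log(cacheDecision({ age: 40, lifetime: 60, maxage: 60 })); // use-cached
// An expired copy that has been stale for 40s, with a 300s stale allowance.
console.log(cacheDecision({ age: 100, lifetime: 60, maxstale: 300 })); // use-cached
```

Here "ask-server" stands for whatever request the interpreter makes to the resource's server; the model does not distinguish a Get-If-Modified request from an unconditional refetch.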

Caching

VoiceXML 1.0 only. In a VoiceXML 1.0 application, when the relevant maximum-age property is not set, the caching policy determines whether the interpreter uses an unexpired cached copy of a file:

 •  If the caching policy is fast (the default), the cached copy is used if it has not expired.
 •  If the caching policy is safe, the interpreter sends a Get-If-Modified request to the server, even if the resource is still in the cache. This ensures that the most recent copy of the resource is always used; however, it does introduce some extra delays, because the interpreter must contact the resource's server. The safe setting is intended mainly for use during development and debugging, when documents and other files may be updated frequently; it is equivalent to maxage="0".

In VoiceXML 1.0, any tag that can fetch a resource has a caching attribute that specifies the caching policy for the resource:

 •  <audio>
 •  <choice>
 •  <data> (Extension)
 •  <dtmf>
 •  <goto>
 •  <grammar>
 •  <link>
 •  <script>
 •  <subdialog>
 •  <submit>

If this attribute is not set, the interpreter uses the current value of the caching property. Note that the default value for the property is fast. That is, fast is the normal condition for any tag that does not explicitly specify caching="safe".

If the cached file has expired, the relevant maximum-stale-time policy determines whether the interpreter uses the expired cached file.

Note: This attribute is used only when all the following conditions are met:

 •  The version attribute of the <vxml> tag is 1.0.
 •  The maxage attribute does not have a value.
 •  The cache contains an unexpired copy of the resource.

Mimicking Response Headers

You may not have direct control over the response headers sent by your server. If you do not, then for VoiceXML documents and XML grammar files, you can use the <meta> tag with its http-equiv attribute to mimic the use of HTTP response headers. For example:

 <meta http-equiv="Cache-control" content="max-age=10"/> 
 <meta http-equiv="Expires" content="02 Feb 2002 23:59:59 GMT"/>

When the VoiceXML interpreter parses a VoiceXML document or a grammar file in the XML format, it interprets these instances of <meta> tag as though the HTTP response had sent these response headers. The interpreter goes back to its cache and changes the associated information.

Note: This method is not recommended because it can only affect the local VoiceXML cache. The overhead of having the intermediate site-wide proxy caches interpret every file would be prohibitive. The proxy caches do not interpret the contents of files; they only look at the headers. Consequently, proxy caches do not understand or implement the caching behavior specified in <meta> tags.

Submitting Complex JavaScript Objects

The following tags submit variables to a server:

 •  <subdialog>
 •  <submit>
 •  <data> (Extension)
 •  <send> (Extension; VoiceXML 1.0 only)

The tag's namelist attribute specifies the variables to be submitted. If one of the specified variables is set to a complex JavaScript object, all component values in the object are submitted as separate variables. For example, the following <submit> tag submits the object foo:

 <script>
   var foo = new Object;
   foo[0] = 1;
   foo[1] = 7;
   foo[2] = "hello";
 </script>
 <submit src="bar.jsp" namelist="foo"/>

The interpreter submits three individual variables to the server with a URI of the form:

 bar.jsp?foo[0]=1&foo[1]=7&foo[2]=hello

The URI includes the necessary encoding escapes around the bracket characters [ and ].

An arbitrarily complex object can be submitted in this way. The individual values at each level of the structure are submitted individually. For example, the following <submit> tag submits the object top:

 <script>
   var subObj = new Object;
   subObj.A = 2;
   subObj.B = 4;
   var top = new Object;
   top.size = 2;
   top.name = "Test";
   top.part = subObj;
 </script>
 <submit src="bar.jsp" namelist="top"/>

The interpreter submits four individual variables to the server with a URI of the form:

 bar.jsp?top.size=2&top.name=Test&top.part.A=2&top.part.B=4
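
The flattening shown in these two examples can be approximated in a few lines of JavaScript. This is a sketch of the naming scheme only, not the interpreter's code: the function names flatten and toQueryString are hypothetical, and the rule that numeric keys use brackets while other keys use dots is inferred from the two examples.

```javascript
// Hypothetical sketch: flatten a value into [name, value] pairs.
function flatten(name, value, pairs = []) {
  if (value !== null && typeof value === "object") {
    for (const key of Object.keys(value)) {
      // Numeric keys use brackets (foo[0]); other keys use dots (top.size).
      const child = /^\d+$/.test(key) ? `${name}[${key}]` : `${name}.${key}`;
      flatten(child, value[key], pairs);
    }
  } else {
    pairs.push([name, String(value)]);
  }
  return pairs;
}

// Joins the pairs into a query string, escaping characters such as [ and ].
function toQueryString(name, value) {
  return flatten(name, value)
    .map(([k, v]) => `${encodeURIComponent(k)}=${encodeURIComponent(v)}`)
    .join("&");
}

const topObj = { size: 2, name: "Test", part: { A: 2, B: 4 } };
console.log(toQueryString("top", topObj));
// top.size=2&top.name=Test&top.part.A=2&top.part.B=4

console.log(toQueryString("foo", { 0: 1, 1: 7, 2: "hello" }));
// foo%5B0%5D=1&foo%5B1%5D=7&foo%5B2%5D=hello
```
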

A server-side package that receives individual component variables in this form can put them back together into the appropriate objects.
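
As an illustration of that reassembly, the following sketch rebuilds nested objects from name/value pairs that have already been URL-decoded. It is only an example under stated assumptions: the function name unflatten is hypothetical, every value is kept as a string, and indexed names such as foo[0] become ordinary object keys rather than array elements.

```javascript
// Hypothetical sketch: rebuild objects from decoded [name, value] pairs.
function unflatten(pairs) {
  const root = {};
  for (const [name, value] of pairs) {
    // "top.part.A" -> ["top", "part", "A"]; "foo[0]" -> ["foo", "0"]
    const path = name.replace(/\[(\d+)\]/g, ".$1").split(".");
    let node = root;
    for (const key of path.slice(0, -1)) {
      if (typeof node[key] !== "object" || node[key] === null) node[key] = {};
      node = node[key];
    }
    node[path[path.length - 1]] = value;
  }
  return root;
}

const rebuilt = unflatten([
  ["top.size", "2"],
  ["top.name", "Test"],
  ["top.part.A", "2"],
  ["top.part.B", "4"],
]);
console.log(rebuilt.top.name);   // Test
console.log(rebuilt.top.part.B); // 4
```

Real server code would also need to convert values back to numbers where appropriate and to reject unexpected key names.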


Part No. 520-0001-02 | © 1999-2007, BeVocal, Inc. All rights reserved | 1.877.33.VOCAL