The BeVocal VoiceXML Dynamic SSML facility enables the loading of Speech Synthesis Markup Language (SSML) documents from URIs. This allows VoiceXML developers to generate SSML documents based on parameter values and thus generate complex prompts based on a variety of input criteria.
| | Introduction |
| | Using Dynamic SSML |
| | Examples and Notes |
| | Errors |
| | SSML Document |
| | Extensions to the SSML spec |
Note: The Dynamic SSML facility is an experimental extension to VoiceXML; its implementation and behavior are subject to change. The current implementation of BeVocal VoiceXML contains the feature before it has been standardized so that developers may provide feedback. If this capability becomes a standard part of a future version of VoiceXML, the BeVocal VoiceXML implementation will change as necessary to match the VoiceXML standard.
For the latest W3C working draft of the Speech Synthesis Markup Language spec, see http://www.w3c.org/TR/speech-synthesis/.
This document describes the BeVocal VoiceXML interpreter's support for dynamic SSML. The <audio> tag has been extended with attributes that reference the URIs of SSML documents. An SSML document can be generated dynamically from the parameters of the URI used to reference it.
| | To provide an extension to the <audio> element so that the developers can have a flexible mechanism for dynamically generating simple as well as complex, layered prompts. |
| | To provide support for interpreting and executing an SSML document (compliant with the SSML spec) from within a VoiceXML context. |
In order to have dynamic SSML generated and executed from within a VoiceXML document, the <audio> tag has been extended with two new attributes.
| | bevocal:ssml - the URI of the resource that generates the dynamic SSML. |
| | bevocal:ssmlexpr - a JavaScript expression that evaluates to the URI that would otherwise be given in bevocal:ssml. |
Only one of bevocal:ssml or bevocal:ssmlexpr should be used.
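As a rough sketch of how bevocal:ssmlexpr might be used, the following form builds the SSML URI from a variable at run time. The weather.jsp URL, its city parameter, and the city variable are hypothetical and serve only as illustration; the value is assumed to be safe to place in a query string without escaping.

```xml
<?xml version="1.0"?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
 "http://cafe.bevocal.com/libraries/dtd/vxml-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <!-- Hypothetical variable; in practice it might be filled by a field. -->
    <var name="city" expr="'boston'"/>
    <block>
      <prompt>
        <!-- The expression is evaluated when the prompt is queued and
             yields the URI from which the SSML document is fetched.
             The weather.jsp resource is an assumption for illustration. -->
        <audio bevocal:ssmlexpr="'http://www.foo.com/ssml/weather.jsp?city=' + city"/>
      </prompt>
    </block>
  </form>
</vxml>
```

Because only one of the two attributes should be used, the element above carries bevocal:ssmlexpr and no bevocal:ssml.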
The resource at the URI should be an SSML document that complies with the SSML spec. The interpreter downloads the resource, parses it as an SSML document, and places the contents of the document's <speak> element inline, replacing the <audio> tag that references it. If the resource cannot be fetched, or if the downloaded SSML document fails to parse, an error.badfetch is thrown and the alternate text of the <audio> element is ignored; it is as if the <audio> tag had been replaced by the SSML. This differs from normal <audio> behavior, where a fetch failure causes the alternate text, if specified, to be played.
Note: if an <audio> element inside an SSML document contains either the bevocal:ssml or the bevocal:ssmlexpr attribute, an error.badfetch is thrown when the SSML document is parsed.
The SSML document is cached according to the caching properties in effect for the <audio> element that triggered the SSML download.
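For instance, because the fetched SSML document follows the caching properties of the referencing <audio> element, a document that changes on every request could be kept from being served stale by setting the standard VoiceXML 2.0 caching attributes on that element. This is only a sketch; it assumes the attributes govern the SSML fetch as described above, and the values are illustrative.

```xml
<prompt>
  <!-- maxage="0" and maxstale="0" ask that a cached copy not be reused
       without revalidation; fetchhint="safe" defers the fetch until the
       prompt is actually needed. -->
  <audio bevocal:ssml="http://www.foo.com/ssml/foo.ssml"
         fetchhint="safe" maxage="0" maxstale="0"/>
</prompt>
```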
The following VoiceXML source shows how to use the bevocal:ssml attribute with <audio>:
<?xml version="1.0"?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
"http://cafe.bevocal.com/libraries/dtd/vxml-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
<form>
<block>
<prompt>
<audio bevocal:ssml="http://www.foo.com/ssml/foo.ssml" />
</prompt>
</block>
</form>
</vxml>
Suppose that the referenced SSML document, foo.ssml, contains the following:
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE speak PUBLIC "-//W3C//DTD SYNTHESIS 1.0//EN"
"http://www.w3.org/TR/speech-synthesis/synthesis.dtd">
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" >
<voice name="mark">
How are You ?
Please say one of
<audio
src="http://cafe.bevocal.com/libraries/audio/female1/en_us/calculator/add.wav"
>add</audio>
or
<audio src="missing.wav">multiply</audio>
or
<audio
src="http://cafe.bevocal.com/libraries/audio/female1/en_us/calculator/divide.wav"
>divide</audio>
</voice>
</speak>
When the above VoiceXML code is executed, foo.ssml is downloaded and parsed. The resulting SSML elements are then interpreted and added to the prompt queue in place of the referencing <audio> tag, as if the original VoiceXML source had been written as follows:
...
<prompt>
<voice name="mark">
How are You ?
Please say one of
<audio
src="http://cafe.bevocal.com/libraries/audio/female1/en_us/calculator/add.wav"
>add</audio>
or
<audio src="missing.wav">multiply</audio>
or
<audio
src="http://cafe.bevocal.com/libraries/audio/female1/en_us/calculator/divide.wav"
>divide</audio>
</voice>
</prompt>
...
If an error occurs while downloading from the URI specified by the bevocal:ssml attribute of <audio>, an error.badfetch is thrown:
<?xml version="1.0"?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
"http://cafe.bevocal.com/libraries/dtd/vxml-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
<catch event="error.badfetch">
SSML download failed
</catch>
<form>
<block>
<prompt>
<audio bevocal:ssml="http://www.foo.com/ssml/foo.ssml" />
</prompt>
</block>
</form>
</vxml>
The error.badfetch event can be caught with a <catch> handler, as in the document above, so that appropriate action can be taken.
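Since the alternate text of the <audio> element is ignored when a Dynamic SSML fetch fails, one possible "appropriate action" is to play a fallback prompt from the handler itself. The sketch below assumes a form-scoped handler and an illustrative fallback wording; note that such a handler would also catch error.badfetch events raised by other fetches in the same form.

```xml
<?xml version="1.0"?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN"
 "http://cafe.bevocal.com/libraries/dtd/vxml-bevocal.dtd">
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="menu">
    <!-- Form-scoped handler: speaks a static fallback when the SSML
         document cannot be fetched or parsed. -->
    <catch event="error.badfetch">
      <prompt>Please say add, multiply, or divide.</prompt>
    </catch>
    <block>
      <prompt>
        <!-- The alternate text below is not played on a Dynamic SSML
             failure, which is why the handler supplies the fallback. -->
        <audio bevocal:ssml="http://www.foo.com/ssml/foo.ssml">
          ignored alternate text
        </audio>
      </prompt>
    </block>
  </form>
</vxml>
```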
The <audio> elements in an SSML document cannot contain the bevocal:ssml or bevocal:ssmlexpr attributes. The following SSML document would result in an error.badfetch at document parse time:
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE speak PUBLIC "-//W3C//DTD SYNTHESIS 1.0//EN"
"http://www.w3.org/TR/speech-synthesis/synthesis.dtd">
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" >
<audio bevocal:ssml="error.ssml"/>
</speak>
The document referenced by the bevocal:ssml or bevocal:ssmlexpr attribute of the <audio> tag should be an SSML document that complies with the SSML spec. Its root element should be <speak>. A sample SSML document looks like this:
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE speak PUBLIC "-//W3C//DTD SYNTHESIS 1.0//EN"
"http://cafe.bevocal.com/libraries/dtd/ssml-bevocal.dtd">
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" >
<voice name="mark">
How are you ?
<prosody rate="slow">
I am fine
</prosody>
</voice>
</speak>
Developers can refer to the SSML DTD via the PUBLIC id
"-//W3C//DTD SYNTHESIS 1.0//EN"
as shown in the above SSML document.
The location of the SSML document serves as the base URI for resolving any relative URI references inside the SSML document. In addition, the xml:base attribute on the <speak> element can modify the base URI, as in VoiceXML 2.0.
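As an illustration of base URI handling, the sketch below sets xml:base on <speak> so that the relative src values resolve against the audio library directory rather than against the location of the SSML document. The directory and file names are taken from the earlier example; using them together with xml:base in this way is an assumption made for illustration.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE speak PUBLIC "-//W3C//DTD SYNTHESIS 1.0//EN"
 "http://www.w3.org/TR/speech-synthesis/synthesis.dtd">
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis"
       xml:base="http://cafe.bevocal.com/libraries/audio/female1/en_us/calculator/">
  Please say one of
  <!-- Resolves to .../en_us/calculator/add.wav under the base above. -->
  <audio src="add.wav">add</audio>
  or
  <!-- Resolves to .../en_us/calculator/divide.wav under the base above. -->
  <audio src="divide.wav">divide</audio>
</speak>
```

Without the xml:base attribute, the same relative references would be resolved against the URI from which the SSML document itself was fetched.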
BeVocal VoiceXML adds extended attributes to two SSML elements, in addition to what is specified in the SSML spec. The two elements and their new attributes are:
The <audio> elements inside the downloaded SSML document can specify all the VoiceXML 2.0 caching attributes, such as fetchhint, maxage, and maxstale, to control how the interpreter caches the audio resource. Note that specifying VoiceXML 1.0 caching attributes, such as caching, results in a parse error, because only VoiceXML 2.0 style caching attributes are supported.
The <say-as> element accepts the BeVocal VoiceXML extended attribute bevocal:mode, which specifies whether <say-as> renders its output in a TTS voice (the default) or in a recorded voice.
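A downloaded SSML document that uses the VoiceXML 2.0 caching attributes on <audio> might look like the sketch below. The document header is copied from the sample SSML document shown earlier; the attribute values are illustrative assumptions, not recommended settings.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE speak PUBLIC "-//W3C//DTD SYNTHESIS 1.0//EN"
 "http://cafe.bevocal.com/libraries/dtd/ssml-bevocal.dtd">
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis">
  Please say
  <!-- fetchhint="prefetch" allows the audio to be fetched early;
       maxage is given in seconds, so a cached copy up to one hour
       old is acceptable, and maxstale="0" rules out expired copies. -->
  <audio
    src="http://cafe.bevocal.com/libraries/audio/female1/en_us/calculator/add.wav"
    fetchhint="prefetch" maxage="3600" maxstale="0">add</audio>
</speak>
```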
Part No. 520-0001-02 | © 1999-2007, BeVocal, Inc. All rights reserved