VoiceGenie FAQ
Table of Contents
- General
- What is VoiceXML?
- Why is VoiceXML important?
- How about internationalization?
- Where can I run my VoiceXML application?
- Can I use free web hosting for my VoiceXML pages?
- Programming
- How do I write VoiceXML programs?
- What Programming Languages can I use?
- Do you have some sample VoiceXML programs that I can look at?
- How might I get ahold of the digits pressed for a field input such as in the example below?
- How do you reference a variable that is sent into a VoiceXML file as a post or get?
- How do I use '#' in a DTMF grammar?
- Application
- Do you have some suggestions on how to build a good VUI?
- Where can I take a course on VUI design?
- Why doesn't barge-in work?
- What's barge-in?
- How can I prevent false barge-ins caused by noisy/windy background conditions?
- How about books or papers?
- I'm behind a PBX, what does this mean?
- My auto-login doesn't work.
- How should I talk to VoiceGenie?
- How about hands-free or wireless phones?
- Platform
- What kind of ECMAScript expressions can I use?
- Are there any extensions to VoiceXML?
- Do you support access to virtual hosts?
- Can I specify anything other than a (static) file for a URI?
- Why isn't my grammar file working?
- How do I turn barge-in on and off?
- How do I control thresholds for recognition?
- How can I find out what is happening with my application?
- What changes do I have to make to my webserver?
- I've changed my page, but I can't see the changes!
- Which pre-recorded audio formats do you support?
- VoiceGenie can't find my page. Why not?
- My timeout isn't working. Why not?
General
- What is VoiceXML?
VoiceXML is a relatively new open markup language designed to
support telephone and hands-free speech applications, usually, but
not exclusively, in the context of the Internet. The (W3C)
VoiceXML 2.0 specification is based on many years of research and
development by AT&T, IBM, Lucent Technologies and Motorola, as
well as support from the VoiceXML Forum members, one of which is
VoiceGenie.
- Why is VoiceXML important?
In a single sentence, standardization brought about by the advent
of VoiceXML promises to simplify the creation, delivery and
maintenance of Web-based voice services, making all of these
extremely cost-effective.
- How about internationalization?
VoiceXML is defined by an XML DTD, and has no inherent
internationalization constraints. However, the programming language
used to deliver pages may have some issues, and appropriate selection
of TTS and ASR resources is necessary.
- Where can I run my VoiceXML application?
Any VoiceXML compliant server can execute your application. VoiceGenie
also maintains a developer node that you can use to get off the ground
and try things out. You provide the URL that gives the starting point
of your application, and we'll provide you with a telephone number and
extension. That's it! You can immediately call and try out your
application.
- Can I use free web hosting for my VoiceXML pages?
Many free web hosts don't allow the posting of .vxml
files, or add advertising information that causes errors when the
pages are loaded. As of this writing, Tripod and Angelfire host web
sites for free, and are able to host .vxml files properly. Although
they do add content to your pages, it is added at the bottom, after
</vxml>, so VoiceGenie never needs to process it. You might also be
interested in the forum posting that discusses this topic.
Programming
- How do I write VoiceXML programs?
VoiceXML is simply a markup language. It is delivered to the
VoiceGenie server using standard HTTP requests, so any web server
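can deliver VoiceXML pages. Data collected by a VoiceXML page is sent
back to the web server with ordinary HTTP GET or POST requests, where
a Common Gateway Interface (CGI) program or any similar server-side
mechanism can process it. This means that anything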
you can use to create web content can usually be used to create
VoiceXML content as well.
Examples include:
- Static VoiceXML Pages (just text)
- Perl (www.perl.org)
- PHP (www.php.net)
- ColdFusion (www.allaire.com)
- ASP (www.microsoft.com)
- JSP (java.sun.com)
- And pretty much anything else you might be comfortable with.
We can get you started with all of these.
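As a minimal sketch of the first item in the list above, a static page can be as small as the following. The file name and prompt wording are made up for illustration; the document structure follows the VoiceXML 2.0 specification.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- hello.vxml: a hypothetical static page; file name and prompt text are placeholders -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <block>
      <!-- Played to the caller as soon as the form is entered -->
      <prompt>Hello from a static VoiceXML page.</prompt>
    </block>
  </form>
</vxml>
```

A page generated by Perl, PHP, JSP and so on would normally emit the same kind of markup, with the dynamic parts filled in by the server-side code.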
- What Programming Languages can I use?
Whatever you like. If it can be used in a CGI and Web Server framework,
then it will work. We use Java, Perl, PHP, and others.
- Do you have some sample VoiceXML applications that I can look at?
We have some, and we're building lots more. Have a look at these
samples. There are lots of examples in the tutorials as well.
- How might I get ahold of the digits pressed for a field input such as in the example below?
<field name="accesscode" type="digits?length=7">
  <prompt> Please give your access code. </prompt>
  <filled>
    <goto nextitem="confirm"/>
  </filled>
</field>
<block name="confirm">
  <prompt>
    I think that was <value expr="???"/>.
  </prompt>
</block>
In this case, it is most common to use:
<value expr="accesscode"/>
to play back the digits. The value will be formatted like "1234567",
for both DTMF and ASR input.
Updated February 24, 2005, 18:00EST.
- How do you reference a variable that is sent into a VoiceXML file as a post or get?
When data is POSTed, it is usually used in the receiving CGI either
to declare a local VoiceXML variable, or control the CGI itself.
For example, in Perl, you would have something like:
print "<var name=\"firstname\" expr=\"'".$q->param("firstname")."'\"/>\n";
In ColdFusion, it might be something like:
<CFOUTPUT>
<var name="firstname" expr="'#Form.firstname#'"/>
</CFOUTPUT>
This new variable would be used in the dynamic VoiceXML form,
perhaps something like:
<block>
Your first name is <value expr="firstname"/>
</block>
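To make the result concrete, here is a sketch of what the CGI output might look like if the request carried firstname=Alice. The value "Alice" is an invented example, and the surrounding document skeleton is assumed rather than taken from a real application.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Emitted by the CGI: the submitted value becomes an ECMAScript string literal -->
  <var name="firstname" expr="'Alice'"/>
  <form>
    <block>
      Your first name is <value expr="firstname"/>
    </block>
  </form>
</vxml>
```

Because the value is placed inside single quotes in an ECMAScript expression, a submitted value that itself contains an apostrophe would break the literal, so the CGI should escape such characters before printing them.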
- How do I use '#' in a DTMF grammar?
To use '#' in a DTMF grammar, you need to set the TERMCHAR (which is '#'
by default) to another character (or nothing, as below).
Otherwise, the '#' will incorrectly terminate DTMF collection, instead of
being collected itself.
<property name="TERMCHAR" value=""/>
....
<dtmf> 1 | 2 | # </dtmf>
Also, the '#' character must now be used without escaping. Using a backslash
will cause an error.
Added May 5, 2001, 15:30EST. Last modified Aug 5, 2003, 10:30EST.
Application
- Do you have some suggestions on how to build a good VUI?
We have a list of guidelines and references
that are very useful.
- Where can I take a course on VUI design?
Most Speech Recognition Vendors provide courses on VUI design, and
on the design of grammars for their engines. VoiceGenie can arrange
for courses for you, or connect you with your preferred speech vendor.
- Why doesn't barge-in work?
Barge-in is sometimes unreliable in noisy environments, so it is
turned off in some applications.
- What's barge-in?
We thought you'd never ask. It is the ability to speak over a prompt
that is being played, thus interrupting the prompt and moving on
in the dialog.
- How can I prevent false barge-ins caused by noisy/windy background conditions?
Try setting the sensitivity level to a lower value. For example:
<property name="sensitivity" value="0.2"/>
The default is 0.5, and the lower you go, the less sensitive the speech detector is to noise.
Some customers with SpeechWorks OSR 1.1.x have found that 0.2 is an optimal setting.
Use with caution, however, since sensitivity levels below 0.5 will also make the speech detector less sensitive to valid utterances.
- How about books or papers?
We have a list of guidelines and references
that are very useful.
- I'm behind a PBX, what does this mean?
If you're behind a PBX, you shouldn't rely on delivery of your phone
number to the VoiceGenie server. In fact, everyone in your company may
appear to be calling from the same number. So it's best to use your
home or cell phone number for auto-login purposes.
- My auto-login doesn't work.
Auto-login for most applications will use the telephone number
that you've called from to speed up your login. However, your
telephone number isn't always delivered to the VoiceGenie server by
the telephone network. So if the system doesn't get your number,
you won't be able to use auto-login, and you'll be asked to explicitly
identify yourself (for example, by saying/entering an account number).
- How should I talk to VoiceGenie?
You should speak normally and clearly when you talk to VoiceGenie.
Don't alter your speech, for example by speaking unusually slowly.
The speech recognition technologies used by VoiceGenie are very
powerful, and work best when you speak normally.
One element that will affect how well VoiceGenie performs is the
amount of background noise. So if you're having trouble while you're
driving down the highway with the radio turned way up and all the
windows open, you might want to adjust your environment.
- How about hands-free or wireless phones?
Current ASR technology performs quite well with hands-free and
wireless telephones. However, the amount of background noise will
impact the performance of the application, so it is best to try it
in the expected usage environment. The tuning of the system during
trial and early deployment is also quite important.
Platform
- What kind of ECMAScript expressions can I use?
The current platform supports a complete ECMAScript engine, so any
standard ECMAScript expressions can be used.
- Are there any extensions to VoiceXML?
Yes, VoiceGenie does provide some extensions, which you may find useful.
Please see the tag summary, where the extensions are highlighted in green.
Updated February 24, 2005, 18:00EST.
- Do you support access to virtual hosts?
Yes. The VoiceGenie server sends the 'Host:' header in the request to
external servers. This is a commonly supported extension to HTTP/1.0,
and is a recommended update to HTTP/1.0 clients and servers in
RFC 2616.
- Can I specify anything other than a (static) file for a URI?
Yes, you can specify parameters for URIs used in tags such as <audio> and
<goto>/<submit>, allowing dynamic generation of the responses.
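A short sketch of what this can look like follows. The host example.com, the script names and the query parameters are all hypothetical. Note that a literal '&' between parameters must be written as &amp;amp; inside an XML attribute.

```xml
<form>
  <block>
    <!-- Audio produced by a script; the element content is spoken if the fetch fails -->
    <audio src="http://example.com/cgi-bin/greeting.pl?lang=en&amp;voice=2">
      Welcome.
    </audio>
    <!-- Next page generated on the fly from a query parameter -->
    <goto next="http://example.com/cgi-bin/menu.pl?account=12345"/>
  </block>
</form>
```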
- Why isn't my grammar file working?
The rules for using a grammar are as follows:
- Make sure you have indicated the ASR engine you are using (or make sure you
  intend to use the platform default). Use the following property tag, setting
  the value to one of the ASR engines that are installed/running on your platform:
  <property name="ASRENGINE" value="NUANCE"/>
- Write a grammar in a format that is supported with the specified ASR engine.
  See the grammar tutorial for a list of supported formats, and details
  on grammar formats.
- Make sure you specify the grammar 'type' attribute, if necessary.
  When adding a grammar reference to your VoiceXML page, see the list of
  supported formats (in the grammar tutorial) for default types. If the grammar
  is a type other than the default for the specified ASRENGINE, you must
  reference the grammar as follows:
  <grammar ... type="MIME-type"/>
Updated February 24, 2005, 19:00EST.
- How do I turn barge-in on and off?
Use the following property tag:
<property name="bargein" value="false"/>
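The property sets a default for its scope. In standard VoiceXML 2.0 an individual prompt can override that default with its own bargein attribute, as in this sketch. The field name and prompt text are invented, and it assumes the platform follows the standard attribute behaviour.

```xml
<form>
  <!-- Form-level default: callers cannot interrupt prompts -->
  <property name="bargein" value="false"/>
  <field name="again" type="boolean">
    <!-- This prompt overrides the property and can be interrupted -->
    <prompt bargein="true">Would you like to hear the menu again?</prompt>
  </field>
</form>
```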
- How do I control thresholds for recognition?
Use the following property tag:
<property name="confidencelevel" value="0.45"/>
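In VoiceXML 2.0 the value ranges from 0.0 to 1.0, and a result whose confidence falls below the threshold is rejected with a nomatch event. The sketch below scopes the property to a single field and handles the rejection. The field name and wording are assumptions made for illustration.

```xml
<field name="confirm" type="boolean">
  <!-- Applies to this field only; 0.45 is the value used in the example above -->
  <property name="confidencelevel" value="0.45"/>
  <prompt>Is that correct?</prompt>
  <nomatch>
    <!-- Reached when the recognition result is rejected -->
    Sorry, I didn't catch that.
    <reprompt/>
  </nomatch>
</field>
```

Raising the value makes rejections more frequent; lowering it accepts more results, including some wrong ones, so it is worth tuning against real calls.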
- How can I find out what is happening with my application?
You should add the following <meta> tags to your top-level VoiceXML page:
<meta name="maintainer" content="my.address@mydomain.com"/>
<meta name="application" content="My Application Name"/>
as well as the <property>:
<property name="loglevel" value="4"/>
Updated February 24, 2005, 18:15EST.
- What changes do I have to make to my webserver?
The MIME definition requirements for content delivery with the current version
of VoiceGenie are documented in the caching tutorial.
Updated February 24, 2005, 18:15EST.
- I've changed my page, but I can't see the changes!
If you are using static pages, then the VoiceGenie server uses caching
algorithms to reduce latency. To force a check of the last
modification of a static page that is cached (i.e. that is not returned by the
webserver with immediate expiration), add the line
<property name="documentmaxage" value="0"/>
to your VoiceXML application. This must either be set in the previous page (that
fetches the changed page), or in the platform defaults file. If you set it in the
previous page, you could alternatively set the 'maxage' attribute to 0 for the
fetching tag (e.g. <goto>, <submit>).
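The per-tag alternative might look like the following sketch, where the target URI is a placeholder. Setting maxage on the tag affects only that one fetch, whereas the documentmaxage property affects every document fetch in its scope.

```xml
<form>
  <block>
    <!-- Hypothetical target page; maxage="0" asks for the cached copy to be revalidated -->
    <goto next="http://example.com/menu.vxml" maxage="0"/>
  </block>
</form>
```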
Updated February 24, 2005, 19:00EST.
- Which pre-recorded audio formats do you support?
The audio formats supported for playback with <audio> are documented
in the audio tutorial.
When using <record>, the format of the generated audio file will depend
on what was specified. The supported formats are documented in the
<record> tag reference.
Updated February 24, 2005, 18:30EST.
- VoiceGenie can't find my page. Why not?
This can be caused by a number of things:
- The page is not located on the webserver at the specified URI, or the URI is
  not accessible by the VoiceGenie platform, or the URI references dynamic code
  (e.g. JSP, Perl) which runs into errors. Try fetching the URI from a visual
  browser (e.g. Internet Explorer) to see exactly what the URI returns.
- The page itself is available at its URI, but another resource (such as audio,
  script or grammar) that is referenced in the page is unavailable at its URI.
- A <submit> with method="POST" is sent to a static VoiceXML page. Many web
  servers will treat this as an error and return status code 405 (Method Not
  Allowed).
- Other Web fetch errors. Check the e-mail log for your page. Any HTTP status
  codes will be reported there if you have set your LOGLEVEL property to 3.
- Check the spec for other causes of error.badfetch.
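The static-page POST case in the list above could arise from markup like this sketch. The URI and field name are hypothetical, and whether a 405 is actually returned depends on how the web server is configured.

```xml
<form>
  <field name="pin" type="digits">
    <prompt>Please enter your PIN.</prompt>
  </field>
  <block>
    <!-- Likely to fail: the target is a static file, and many servers reject POST for those -->
    <submit next="http://example.com/static/menu.vxml" namelist="pin" method="post"/>
  </block>
</form>
```

If the target really is a static page and no data needs to reach the server, a <goto> to the same URI, or method="get", avoids the problem.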
Updated February 24, 2004, 19:00EST.
- My timeout isn't working. Why not?
For prompting and recording, there are some minimum times you should be aware of.
- For <prompt>, you shouldn't use a timeout shorter than 50ms. Otherwise it
  will be silently set to 50ms.
- For <record>, the following durations apply:
| Duration | Min | Max | Default |
| beginsilence | 0ms | value of 'maxtime' attribute, which is 600s by default | timeout of last prompt before the recording, which is 10s by default |
| finalsilence | 100ms | value of 'maxtime' attribute, which is 600s by default | 4s |
| mintime | 250ms | value of 'maxtime' attribute, which is 600s by default | 250ms |
*See <record> tag reference for more details.
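As a rough illustration of how these durations fit together, the sketch below sets only attributes defined in standard VoiceXML 2.0 (maxtime and finalsilence on <record>, timeout on <prompt>). The form content is invented, and beginsilence and mintime are left out because their exact syntax is given in the <record> tag reference rather than here.

```xml
<form>
  <!-- Stop after 60s at most, or after 4s of trailing silence -->
  <record name="message" beep="true" maxtime="60s" finalsilence="4s">
    <!-- The timeout of the last prompt before recording governs the initial silence period -->
    <prompt timeout="10s">Please leave your message after the beep.</prompt>
    <filled>
      <prompt>Your message was <audio expr="message"/></prompt>
    </filled>
  </record>
</form>
```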
Updated February 24, 2005, 18:15EST.