CSE134A LECTURE NOTES

May 23, 2001
 
 

ANNOUNCEMENTS

We'll return the midterm before the end of class today.

Teamwork, leadership, equal participation, PHP.
 
 

VOICEXML

XML is a human-readable notation for writing and exchanging structured information of all sorts.  VoiceXML is a language similar to HTML, but for telephone-based interaction.  Syntactically, VoiceXML happens to be an application of XML.

This example is adapted from http://www.webreference.com/perl/tutorial/20/tutorial20.html.

<?xml version="1.0"?>
  <vxml version="1.0">
    <form id="login">
      <field name="pin">
        <grammar>
          <![CDATA[Four_digits]]>
        </grammar>
        <prompt>Please enter your 4 digit pin code.</prompt>
        <filled>
          <submit next="http://www.web.com/pin.php"/>
        </filled>
        <noinput>No PIN entered.<reprompt/></noinput>
        <nomatch count="1">Invalid pin code.<reprompt/></nomatch>
        <nomatch count="2">Too many attempts.<exit/></nomatch>
      </field>
    </form>
  </vxml>
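
Because VoiceXML is syntactically XML, any general-purpose XML parser can read a script like the one above.  The following is a minimal sketch in Python, not part of the TellMe tools; the function name summarize and the dictionary layout are assumptions made for illustration.  It pulls out the field name, the prompt, the submit URL, and the message for each nomatch count.

```python
import xml.etree.ElementTree as ET

# The login form from the example, reproduced as a string.
VXML = """<?xml version="1.0"?>
<vxml version="1.0">
  <form id="login">
    <field name="pin">
      <grammar>
        <![CDATA[Four_digits]]>
      </grammar>
      <prompt>Please enter your 4 digit pin code.</prompt>
      <filled>
        <submit next="http://www.web.com/pin.php"/>
      </filled>
      <noinput>No PIN entered.<reprompt/></noinput>
      <nomatch count="1">Invalid pin code.<reprompt/></nomatch>
      <nomatch count="2">Too many attempts.<exit/></nomatch>
    </field>
  </form>
</vxml>
"""


def summarize(vxml_text):
    """Parse a one-field VoiceXML form and return its main parts (hypothetical helper)."""
    root = ET.fromstring(vxml_text)      # raises ParseError if not well-formed XML
    field = root.find("./form/field")
    return {
        "form": root.find("form").get("id"),
        "field": field.get("name"),
        # The CDATA section is delivered as ordinary text.
        "grammar": field.findtext("grammar").strip(),
        "prompt": field.findtext("prompt"),
        "submit": field.find("filled/submit").get("next"),
        # Map each nomatch count to the text spoken for that attempt.
        "nomatch": {int(n.get("count")): n.text for n in field.findall("nomatch")},
    }


if __name__ == "__main__":
    for key, value in summarize(VXML).items():
        print(key, "=", value)
```

A parser like this only checks that the script is well-formed XML.  It does not check that the elements are used in the way a VoiceXML interpreter expects.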
 
 

THE TELLME SERVICE

A caller dials an 800 number and reaches the TellMe service.  TellMe fetches a VoiceXML script from your web server and executes it.  The VoiceXML script can be stored in a text file with a .vxml extension, or it can be generated as the output of a PHP script.
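
Generating VoiceXML from a script amounts to printing an XML document with some values filled in.  Here is a minimal sketch of the idea, written in Python rather than PHP; the function name pin_form and its parameters are assumptions made for illustration.  Note that text and attribute values are escaped, so that a character such as & cannot produce a malformed script.

```python
from xml.sax.saxutils import escape, quoteattr


def pin_form(submit_url, prompt):
    """Return a VoiceXML 1.0 document that asks for a PIN (hypothetical helper)."""
    return (
        '<?xml version="1.0"?>\n'
        '<vxml version="1.0">\n'
        '  <form id="login">\n'
        '    <field name="pin">\n'
        '      <grammar><![CDATA[Four_digits]]></grammar>\n'
        # escape() protects &, < and > inside element text.
        f'      <prompt>{escape(prompt)}</prompt>\n'
        '      <filled>\n'
        # quoteattr() escapes the value and adds the surrounding quotes.
        f'        <submit next={quoteattr(submit_url)}/>\n'
        '      </filled>\n'
        '    </field>\n'
        '  </form>\n'
        '</vxml>\n'
    )


if __name__ == "__main__":
    print(pin_form("http://www.web.com/pin.php",
                   "Please enter your 4 digit pin code."))
```

A server-side script would send this string as the body of its HTTP response, in place of a static .vxml file.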

TellMe servers are hosted in an Exodus facility, which provides Internet connectivity.  The servers are connected by a dedicated OC-48 network (about 2500 Mbit/s) to AT&T telephone switches in three different locations.

The TellMe servers interpret VoiceXML scripts, convert text to speech, recognize speech, and compile grammars for voice recognition.  All have load balancing, fault tolerance, and adaptive caching.

TellMe makes it easy to invoke server-side scripts.  JavaScript is available inside VoiceXML, but complex code inside VoiceXML is discouraged.  The reasons are the same as those behind the failure of client-side Java and C++ inside HTML.
 
 

VOICE RECOGNITION

A grammar specifies the alternatives for what the user might say.  Before a grammar can be used, it must be compiled.  Reusing grammars that have already been compiled is critical for speed.
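
The compile-once, reuse-many-times idea can be sketched as follows.  This is only an illustration under strong assumptions: a real speech grammar is compiled for matching against audio, whereas here a regular expression over already-recognized text stands in for the compiled grammar.  The names compile_grammar and recognize are hypothetical.

```python
import re
from functools import lru_cache


@lru_cache(maxsize=128)
def compile_grammar(alternatives):
    """Compile a tuple of allowed phrases into a matcher.

    Stand-in for an expensive grammar compiler.  The cache returns the
    same compiled object whenever the same alternatives are seen again.
    """
    pattern = "|".join(re.escape(a) for a in alternatives)
    return re.compile(rf"^(?:{pattern})$", re.IGNORECASE)


def recognize(utterance, alternatives):
    """Return True if the utterance is one of the allowed alternatives."""
    # Lists are not hashable, so convert to a tuple to use it as a cache key.
    matcher = compile_grammar(tuple(alternatives))
    return matcher.match(utterance.strip()) is not None


if __name__ == "__main__":
    print(recognize("Yes", ["yes", "no"]))
    print(recognize("maybe", ["yes", "no"]))
    print(compile_grammar.cache_info())
```

The second call to recognize reuses the matcher compiled during the first call, which is the saving that matters when compilation is slow.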
 
 

USER INTERFACE

Getting the UI right is difficult.  Bad visual interfaces are still usable, but bad voice interfaces are not.  There are few established principles, but here are some guidelines.

(0) Provide 80% of the functionality with 20% of the complexity in the interface.

(1) Make the service sensitive to the context of the user.  For example, let the user say "today" when a date is needed.

(2) Keep prompts short, but specific.  For example, "say or type your four-digit PIN" rather than "enter your personal identification number."

(3) Get confirmation from the user, but unobtrusively.  Make backing up easy.
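
Guideline (1) can be made concrete with a small sketch.  Assuming the recognizer has already turned the caller's speech into text, the server-side script can resolve words such as "today" relative to the current date.  The function name resolve_date and the set of accepted words are assumptions made for illustration.

```python
from datetime import date, timedelta

# Spoken words that depend on the caller's context, as offsets in days.
RELATIVE_DAYS = {"yesterday": -1, "today": 0, "tomorrow": 1}


def resolve_date(spoken, today):
    """Map a recognized phrase to a calendar date, or None if not understood."""
    word = spoken.strip().lower()
    if word in RELATIVE_DAYS:
        return today + timedelta(days=RELATIVE_DAYS[word])
    try:
        # Fall back to an explicit date written as YYYY-MM-DD.
        return date.fromisoformat(word)
    except ValueError:
        return None  # the caller should be reprompted


if __name__ == "__main__":
    print(resolve_date("today", date.today()))
    print(resolve_date("tomorrow", date.today()))
```

Returning None rather than guessing leaves room for guideline (3): the service can read back what it understood, or ask again when it understood nothing.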

A fundamental issue: user initiative versus system initiative versus mixed initiative.
 
 

REFERENCES AND TUTORIALS

This shows how to write a simple form in VoiceXML using TellMe Studio:
    http://www.webreference.com/perl/tutorial/20/tutorial20.html

This shows how to build an interactive application that links to a backend Perl script:
    http://www.webreference.com/perl/tutorial/21/tutorial21.html

This PDF document explains the role of VoiceXML and the architecture of the TellMe service:
    http://www.tellme.com/business/downloads/VoiceXML_facts_and_fiction.pdf

These documents discuss two small but interesting VoiceXML applications:
    http://studio.tellme.com/articles/TRAIN.html
    http://studio.tellme.com/articles/OnCalls.html
 



Copyright (c) by Charles Elkan, 2001.