INTRODUCTION TO VOICEXML FOR DISTRIBUTED WEB-BASED APPLICATIONS


Interdepartmental Postgraduate Program: Economics & Management of Telecommunication Networks (New network services and technologies)

Π. Κ. Κίκιρας (1), Α. Σταυρόπουλος (2)
(1) Postgraduate Program "Management & Economics of Telecommunication Networks", e-mail: kikirasp@ieee.org
(2) Postgraduate Program "Management & Economics of Telecommunication Networks", e-mail: astaur@odt.uoa.gr

ABSTRACT

The promise of the information revolution has been achieved through the extensive use of Internet and telecom resources (wireless or wired). One can access information anytime, anywhere and with any device. Voice applications are a significant part of the pervasive computing vision. Because so many people around the world have access to advanced telephone devices, companies can use voice applications to reach a huge base of customers who, because of time or location, do not have access to computer systems. This paper deals with VoiceXML, a widely adopted W3C specification for developing voice applications.

Introduction

Mobile access to the Internet seems to be on everyone's mind these days. The holy grail of anytime, anywhere access to the Internet has created a lot of interest and experimentation in thin-client, mobility-supporting technologies such as VoiceXML and WML (Wireless Markup Language). Although adoption of small-screen browser access technologies such as WML is lower than anticipated, the continuing quest for mobile Internet access solutions is generating even more interest in voice-enabled solutions, and more specifically in VoiceXML and voice portals. VoiceXML is an attempt to bring the advantages of Web-based development and content delivery to interactive voice response applications. By bringing the world's estimated two billion fixed-line and mobile phones within reach of application developers, the advances in VoiceXML mark a significant milestone in the convergence of telecom technologies and the Web.

1 VOICEXML OVERVIEW.

The history of voice markup languages is short. Their development started in 1995, when AT&T Bell Laboratories began a project called PhoneWeb. The aim was to develop a markup language, similar to HTML, for defining voice markup for voice-based devices such as telephones and voice browsers. When AT&T Bell Labs split into two companies, both continued their work on a voice markup language and each came up with its own version of the Phone Markup Language (PML). Later, in 1998, Motorola introduced a new voice markup language called VoxML. This language is based on XML and enables the developer to define voice markup for voice-based applications. Soon IBM also joined the race with the launch of SpeechML, the IBM version of a voice markup language. Other companies, such as Vocalis, HP and Sun Microsystems, developed their own versions.

The VoiceXML Forum is an organization founded by Motorola, IBM, AT&T, and Lucent to promote voice-based development. The Forum introduced a new language called VoiceXML, based on the legacy of the languages already promoted by these four companies. In August 1999, the VoiceXML Forum released the first specification, VoiceXML version 0.9. In March 2000, version 1.0 was released. Then, in October 2001, the first working draft of VoiceXML version 2.0 was published. Starting in late 2001, the Forum's VoiceXML initiative was adopted by the W3C as an integral part of its own Speech Interface Framework. This led to the release of VoiceXML 2.0 as a W3C Recommendation on March 16, 2004, promoting the specification as a Web standard for industry and the Web community.

1.1 VoiceXML and HTML

Though VoiceXML has adopted many concepts and designs from HTML, it differs in several ways. HTML was designed for visual Web pages and lacks the control over the user-application interaction that a speech-based interface needs. The main difference between VoiceXML and HTML lies in the sequential structure of VoiceXML documents. An HTML document is a single unit that resides on a web server, is identified by a unique uniform resource identifier (URI), and is downloaded as a whole to the client's browser whenever it is accessed. In contrast, a VoiceXML document contains a number of dialogue units (menus or forms), delimited by markup tags and presented sequentially. This difference arises because the visual medium can display a number of items in parallel, while the voice medium is inherently sequential.

1.2 Architecture - Key Concepts of VoiceXML

The architecture of VoiceXML is very similar to that of standard web applications.
When a user needs some documents from a server, the user sends a request to the server through software called a browser. Upon receiving the request, the server processes the required documents and sends the result back to the browser as its response. The browser then presents this response to the user. In VoiceXML applications, just as in web applications, documents are hosted on the web server. In addition to a web server, a VoiceXML architecture includes another server, the voice server, which handles all interactions between the user and the web server. The voice server plays the role of the browser in a voice application: it interprets all spoken input received from the user and plays audio prompts as responses. In the case of voice applications, the end user does not need a high-end computer or sophisticated browsing software. The user can access the voice application through a simple telephone connected to a wired or wireless telephone network. Figure 1 shows the architecture of a voice application.
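To make the request/response exchange above more concrete, the following is a minimal sketch of the kind of document a web server might return to the voice server. It assumes VoiceXML 2.0 syntax; the form id and the prompt wording are illustrative and do not come from the application discussed in this paper.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative only: a document the voice server could fetch over HTTP,
     just as a visual browser fetches an HTML page. -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form id="greeting">
    <block>
      <!-- The voice server renders this text as audio for the caller. -->
      <prompt>Hello. You have reached a voice application.</prompt>
      <!-- End the session explicitly once the prompt has been queued. -->
      <exit/>
    </block>
  </form>
</vxml>
```

The web server needs no voice-specific capability to deliver such a file; all speech processing would happen on the voice server side.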

Figure 1. High-level architecture of a VoiceXML-enabled network

The standard implementation architecture of a voice application includes the following components:

Web server: The server that hosts the VoiceXML application. It is important to note that VoiceXML can be delivered from any common Web server.
VoiceXML gateway (interpreter): The VoiceXML gateway consists of hardware and software that bridge the PSTN and the Internet. It comprises a VoiceXML browser and resources for automatic speech recognition (ASR), text-to-speech (TTS), and DTMF input. These resources may be hardware and/or software.
PSTN/Wireless telephone network: The Public Switched Telephone Network or a wireless telephone network is the telephone service most people have. It carries the speech and DTMF interactions, such as the prompts played by the VoiceXML gateway and the responses the caller speaks.
Client device: The device the caller uses to access a VoiceXML application.

2 VOICEXML KEY CONCEPTS

A VoiceXML document (or a set of documents called an application) forms a conversational finite state machine. The user is always in one conversational state, or dialog, at a time. Each dialog determines the next dialog to transition to. Transitions are specified using URIs, which define the next document and dialog to use. If a URI does not refer to a document, the current document is assumed. If it does not refer to a dialog, the first dialog in the document is assumed. Execution terminates when a dialog does not specify a successor, or when it has an element that explicitly exits the conversation.

The key concepts of VoiceXML are:

Session: A session begins when the user starts to interact with a VoiceXML interpreter context, continues as documents are loaded and processed, and ends when requested by the user, a document, or the interpreter context.
Document: A VoiceXML document is primarily composed of top-level elements called dialogs. There are two types of dialogs: forms and menus. A document may also have <meta> elements, <var> and <script> elements, <property> elements, <catch> elements, and <link> elements.
Dialog: A top-level element of a document (a form or a menu). While interacting with a VoiceXML application, the user is always in exactly one dialog state.
Subdialog: A subdialog is like a function call, in that it provides a mechanism for invoking a new interaction and then returning to the original form. Local data, grammars, and state information are saved and are available upon returning to the calling document. Subdialogs can be used, for example, to create a confirmation sequence that may require a database query; to create a set of components that may be shared among documents in a single application; or to create a reusable library of dialogs shared among many applications.
Menu: A menu presents the user with a choice of options and then transitions to another dialog based on that choice.
Grammar: Each dialog has one or more speech and/or DTMF grammars associated with it. A grammar specifies the permissible vocabulary from which the user can select in order to interact with the VoiceXML application.
Form: Forms are the key component of VoiceXML documents. A form contains:
  - A set of form items, elements that are visited in the main loop of the form interpretation algorithm. Form items are subdivided into field items, which define the form's field item variables, and control items, which help control the gathering of the form's fields.
  - Declarations of non-field item variables.
  - Event handlers.
  - Filled actions, blocks of procedural logic that execute when certain combinations of field items are filled in.
  Form attributes are:
  - id: The name of the form.
  - scope: The default scope of the form's grammars. If it is dialog, the form grammars are active only in the form. If the scope is document, the form grammars are active during any dialog in the same document.
    If the scope is document and the document is an application root document, the form grammars are active during any dialog in any document of this application. A form grammar that has dialog scope is active only in its form.
Application: An application is a collection of VoiceXML documents. All the documents in an application share the application root document.

2.1 A Sample VoiceXML application

Because VoiceXML is an XML application, it follows the basic rules of XML. In this section we present a sample informational application for ODT students, which exhibits the main features and capabilities of VoiceXML by implementing the following scenario: an ODT student accesses the department secretariat's voice application, which provides information about class schedules and general announcements. The application consists of two vxml files (the exact code can be found in Appendix A). The architecture of the application is shown in Figure 2. The environment used for developing and testing the application is the MOTOROLA Wireless IDE with Mobile ADK 2.0 (http://www.motorola.com/msp/products/developer/index.html).
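Since the sample application spans two documents, it may help to see how documents can share an application root document, as defined in Section 2. The following is only a sketch in VoiceXML 2.0 syntax and is not the code of Appendix A: the file name app-root.vxml, the variable name, the spoken phrase, and the relative location of main.vxml are all assumptions made for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical application root document, assumed to be saved as app-root.vxml -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">

  <!-- A variable declared in the root document is visible to every document
       of the application as application.department. -->
  <var name="department" expr="'ODT'"/>

  <!-- A link declared in the root document is active in every dialog of every
       document of the application. Saying "main menu" jumps to the menu with
       id "choice" in main.vxml (assumed to sit next to this file). -->
  <link next="main.vxml#choice">
    <grammar mode="voice" version="1.0" root="cmd">
      <rule id="cmd" scope="public">main menu</rule>
    </grammar>
  </link>

</vxml>
```

Under these assumptions, a document would join the application by naming the root in its top-level element, for example `<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" application="app-root.vxml">`.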

[Figure 2. Application Architecture: Main.vxml (root) and Schedule.vxml (root)]

3 CONCLUSIONS

VoiceXML is an approach to building enhanced, more user-friendly man-machine interfaces. It bridges the gap between web and voice applications by enabling web content delivery to the latter. It also creates business opportunities for web content providers by giving them access to the market of the world's estimated two billion fixed-line and mobile phone owners. Further development of VoiceXML will focus on features that support natural dialogues between machines and humans.


Appendix A

1. Main.vxml Source Code

<?xml version="1.0"?>
<vxml version="1.0">

  <!-- user hears welcome the first time -->
  <form id="intro">
    <block>
      <prompt>Welcome to ODT's Secretariat Voice Application</prompt>
      <goto next="#choice"/>
    </block>
  </form>

  <menu id="choice" dtmf="true">
    <prompt>Say schedule, announcements, or quit.</prompt>
    <!-- next attribute takes you to the appropriate document
         or anchor within the current document -->
    <choice next="http://147.102.15.69/schedule.vxml">schedule</choice>
    <choice next="#announcements">announcements</choice>
    <choice next="#quit_app">quit</choice>
  </menu>

  <!-- give the news -->
  <form id="announcements">
    <block>
      <prompt>Next Monday's lesson will be postponed</prompt>
      <goto next="#choice"/>
    </block>
  </form>

  <!-- quit the application -->
  <form id="quit_app">
    <block>
      <prompt>Goodbye!</prompt>
    </block>
  </form>

</vxml>

2. Schedule.vxml Source Code

<?xml version="1.0"?>
<vxml version="1.0">
  <form id="chooseday">
    <field name="userselection">
      <!-- accept a spoken day name, its number, or the matching DTMF key -->
      <grammar>
        <![CDATA[
          [
            [one monday dtmf-1] {<option "monday">}
            [two tuesday dtmf-2] {<option "tuesday">}
          ]
        ]]>
      </grammar>

      <!-- prompt the user what to do -->
      <prompt>
        Welcome to the weekly schedule of ODT's classes.
        You can say Monday or Tuesday, or you can push one for Monday
        and two for Tuesday. Please make your selection.
      </prompt>

      <!-- if no match with active grammar list then user prompted again -->
      <nomatch>
        What did you say?
        <reprompt/>
      </nomatch>

      <!-- executed if no input is provided by the user -->
      <noinput>
        Please input something!
        <reprompt/>
      </noinput>

      <!-- the conditions test the field variable declared above -->
      <filled>
        <if cond="userselection == 'monday'">
          <prompt>Monday's schedule is as follows</prompt>
          <goto next="http://147.102.15.69/main.vxml"/>
        </if>
        <if cond="userselection == 'tuesday'">
          <prompt>Tuesday's schedule is as follows</prompt>
          <goto next="http://147.102.15.69/main.vxml"/>
        </if>
      </filled>
    </field>
  </form>
</vxml>
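The listing above targets VoiceXML 1.0 with a platform-specific inline grammar. As a hedged sketch only, and not something the authors tested, the same day-selection field might be expressed in VoiceXML 2.0 with <option> elements, which let the interpreter derive both the speech and the DTMF grammar. The field name, prompts, and return URL are taken from the listing; the use of <option> and <elseif> is an assumption about how the code could be ported.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch of a VoiceXML 2.0 port of the day-selection form (assumed, untested) -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form id="chooseday">
    <field name="userselection">
      <prompt>
        Welcome to the weekly schedule of ODT's classes.
        Say Monday or Tuesday, or press one for Monday and two for Tuesday.
      </prompt>

      <!-- Each option contributes a spoken phrase, a DTMF key, and the value
           stored in the field variable when it is chosen. -->
      <option dtmf="1" value="monday">Monday</option>
      <option dtmf="2" value="tuesday">Tuesday</option>

      <nomatch>What did you say? <reprompt/></nomatch>
      <noinput>Please input something! <reprompt/></noinput>

      <filled>
        <if cond="userselection == 'monday'">
          <prompt>Monday's schedule is as follows.</prompt>
        <elseif cond="userselection == 'tuesday'"/>
          <prompt>Tuesday's schedule is as follows.</prompt>
        </if>
        <!-- Return to the main document in either case. -->
        <goto next="http://147.102.15.69/main.vxml"/>
      </filled>
    </field>
  </form>
</vxml>
```

Keeping the option values in lowercase means the conditions in <filled> compare against exactly the strings the field can hold, which avoids the kind of case or spelling mismatch that would silently skip both branches.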