EMMA: Extensible MultiModal Annotation


1 EMMA: Extensible MultiModal Annotation
W3C Workshop on Rich Multimodal Application Development
Michael Johnston, AT&T Labs Research
2013 AT&T Intellectual Property. All rights reserved. AT&T and the AT&T logo are trademarks of AT&T Knowledge Ventures.

2 Multimodal Interactive Systems
Support user input and/or system output over multiple modes such as speech, pen, and gesture
- Graphical interfaces augmented with speech input
- Multimodal integration: "directions from here <touch> to there <touch>"
Enable more natural and effective interaction
- Human-human communication is multimodal
- Certain kinds of input/output are best suited to particular modes
- Multimodal error recovery

3 Authoring Multimodal Systems
Many research prototypes have shown the utility of multimodality for interactive systems
Authoring remains a complex and specialized task
- The graphical interface must work in concert with a variety of input and output processing components
- Ad hoc or proprietary protocols limit plug and play and complicate rapid prototyping

4 EMMA: Extensible MultiModal Annotation
The EMMA standard provides a common XML language for representing the interpretation of inputs to spoken and multimodal systems
- XML markup for capturing and annotating the various processing stages of user inputs
- Container elements
- Annotation elements and attributes
W3C Recommendation, February 2009
- Implementations: AT&T, Microsoft, Nuance, Loquendo, OpenStream
- AT&T Developer Program Speech APIs
Current: EMMA 1.1 Working Draft

5 Multimodal Architecture
Modality components and the interaction manager are core components of the W3C MMI Architecture
EMMA is used for input passed between modules
[Diagram: modality input components (interpretation, integration) send EMMA to the interaction manager, which draws on input history, data, context, session state, the dialog manager, and the application backend, and drives modality output components (output generation, presentation manager) through a modality component API]

6 EMMA Example: "flights from Boston to Denver"

<emma:emma version="1.0"
    xmlns:emma="http://www.w3.org/2003/04/emma">
  <emma:one-of id="r1"
      emma:start=" " emma:end=" "
      emma:source="smm:platform=iphone h11"
      emma:signal="smm:file=audio.amr" emma:signal-size="4902"
      emma:process="smm:type=asr&amp;version=asr_eng2.4"
      emma:media-type="audio/amr; rate=8000"
      emma:lang="en-us"
      emma:grammar-ref="gram1" emma:model-ref="model1">
    <emma:interpretation id="int1" emma:confidence="0.75"
        emma:tokens="flights from boston to denver">
      <flt><orig>boston</orig><dest>denver</dest></flt>
    </emma:interpretation>
    <emma:interpretation id="int2" emma:confidence="0.68"
        emma:tokens="flights from austin to denver">
      <flt><orig>austin</orig><dest>denver</dest></flt>
    </emma:interpretation>
  </emma:one-of>
  <emma:info><session>e50dae19-79b5-44ba-892d</session></emma:info>
  <emma:grammar id="gram1" ref="smm:grammar=flights"/>
  <emma:model id="model1" ref="smm:file=flights.xsd"/>
</emma:emma>

7 EMMA Example: "flights from Boston to Denver"
<emma:emma>: root element of all EMMA documents; carries the namespace, schema, and version

8 Container element tree
<emma:one-of>
<emma:group>
<emma:sequence>
Terminating in <emma:interpretation>

9 <emma:interpretation>
Contains the application-specific markup (the semantic representation is not standardized)
Carries annotations specific to that interpretation: emma:confidence, emma:tokens

10 Annotation Scope
Annotations on <emma:one-of> are assumed to apply to the contained interpretations
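This scoping rule can be implemented in a consumer by merging the <emma:one-of> annotations into each contained interpretation, letting local annotations win. A minimal sketch with Python's standard ElementTree, assuming the EMMA 1.0 namespace http://www.w3.org/2003/04/emma; the tiny inline document is illustrative, not from the slides:

```python
import xml.etree.ElementTree as ET

EMMA_NS = "http://www.w3.org/2003/04/emma"  # EMMA 1.0 namespace

# Hypothetical fragment: annotations on <emma:one-of> scope over
# the two contained interpretations.
doc = """
<emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">
  <emma:one-of id="r1" emma:medium="acoustic" emma:mode="voice">
    <emma:interpretation id="int1" emma:confidence="0.75"/>
    <emma:interpretation id="int2" emma:confidence="0.68"/>
  </emma:one-of>
</emma:emma>
"""

def effective_annotations(one_of):
    """Propagate emma:* annotations from <emma:one-of> down to each
    contained interpretation; annotations set locally take precedence."""
    inherited = {k: v for k, v in one_of.attrib.items()
                 if k.startswith("{%s}" % EMMA_NS)}
    result = {}
    for interp in one_of.findall("{%s}interpretation" % EMMA_NS):
        merged = dict(inherited)
        merged.update(interp.attrib)  # local annotations win
        result[interp.get("id")] = merged
    return result

root = ET.fromstring(doc)
anns = effective_annotations(root.find("{%s}one-of" % EMMA_NS))
```

With this, both int1 and int2 carry emma:medium="acoustic" and emma:mode="voice" alongside their own confidence values.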

11 Annotations: Classification of input
emma:medium: acoustic, visual, tactile
emma:mode: voice, pen, gui
emma:function: dialog, verification, recording, transcription
emma:verbal: true/false

12 Annotations: Timestamps
emma:start, emma:end
Absolute times, in milliseconds
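Because emma:start and emma:end are absolute millisecond timestamps, a consumer can derive the input's duration by subtraction. A small sketch (the timestamp values are made up; the slides leave them blank):

```python
import xml.etree.ElementTree as ET

EMMA = "{http://www.w3.org/2003/04/emma}"  # EMMA 1.0 namespace

# Hypothetical interpretation with absolute millisecond timestamps.
doc = ('<emma:emma version="1.0" '
       'xmlns:emma="http://www.w3.org/2003/04/emma">'
       '<emma:interpretation id="int1" '
       'emma:start="1241035886246" emma:end="1241035889306"/>'
       '</emma:emma>')

root = ET.fromstring(doc)
interp = root.find(EMMA + "interpretation")
# Duration of the input signal in milliseconds.
duration_ms = int(interp.get(EMMA + "end")) - int(interp.get(EMMA + "start"))
```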

13 Annotations
emma:source: URI describing the input device
emma:signal: references the signal, e.g. an audio file
emma:signal-size: size in bytes
emma:process: URI describing the process that generated these interpretations

14 Annotations
emma:media-type: MIME type of the signal; used to indicate the codec (amr) and sampling rate (8000)
emma:lang: language of the user input (cf. xml:lang)

15 Annotations
emma:grammar-ref: references an <emma:grammar> element under <emma:emma>, a URI reference to the name or location of the grammar

16 Annotations
emma:model-ref: references an <emma:model> element under <emma:emma>, a URI reference to the name or location of the data model for the application semantic markup

17 Annotations
emma:confidence: confidence score assigned to the interpretation
emma:tokens: the token sequence

18 Annotations
<emma:info>: extensibility point for application- and vendor-specific annotations

19 Representing multiple stages of processing of input
<emma:derived-from>: provides a pointer to the EMMA interpretation this interpretation results from

<emma:emma>
  <emma:interpretation id="int2"
      emma:tokens="comedy movies directed by woody allen and starring diane keaton"
      emma:confidence="0.7"
      emma:process="smm:type=fusion&amp;version=mmfst1.0">
    <query>
      <genre>comedy</genre>
      <dir>woody_allen</dir>
      <cast>diane_keaton</cast>
    </query>
    <emma:derived-from resource="#int1"/>
  </emma:interpretation>
  <emma:derivation>
    <emma:interpretation id="int1"
        emma:start=" " emma:end=" "
        emma:confidence="0.8" emma:lang="en-us"
        emma:process="smm:type=asr&amp;version=asr_eng2.4"
        emma:media-type="audio/amr; rate=8000">
      <emma:literal>comedy movies directed by woody allen and starring diane keaton</emma:literal>
    </emma:interpretation>
  </emma:derivation>
</emma:emma>

20 <emma:derivation>
Container element for holding previous processing stages inline

21 <emma:one-of>

<emma:emma>
  <emma:one-of id="one-of1"
      emma:lang="en-us" emma:start=" " emma:end=" "
      emma:media-type="audio/amr; rate=8000"
      emma:process="smm:type=asr&amp;version=watson6">
    <emma:interpretation id="nbest1" emma:confidence="1.00"
        emma:tokens="jon smith">
      <fn>jon</fn><ln>smith</ln>
    </emma:interpretation>
    <emma:interpretation id="nbest2" emma:confidence="0.99"
        emma:tokens="john smith">
      <fn>john</fn><ln>smith</ln>
    </emma:interpretation>
    <emma:interpretation id="nbest3" emma:confidence="0.99"
        emma:tokens="joann smith">
      <fn>joann</fn><ln>smith</ln>
    </emma:interpretation>
    <emma:interpretation id="nbest4" emma:confidence="0.98"
        emma:tokens="joan smith">
      <fn>joan</fn><ln>smith</ln>
    </emma:interpretation>
  </emma:one-of>
</emma:emma>
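A consumer of such an n-best list typically selects the interpretation with the highest emma:confidence. A minimal sketch with ElementTree, assuming the EMMA 1.0 namespace (the inline document is a trimmed-down version of the n-best example):

```python
import xml.etree.ElementTree as ET

EMMA = "{http://www.w3.org/2003/04/emma}"  # EMMA 1.0 namespace

doc = """
<emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">
  <emma:one-of id="one-of1">
    <emma:interpretation id="nbest1" emma:confidence="1.00"
        emma:tokens="jon smith"/>
    <emma:interpretation id="nbest2" emma:confidence="0.99"
        emma:tokens="john smith"/>
  </emma:one-of>
</emma:emma>
"""

root = ET.fromstring(doc)
# Pick the interpretation with the highest confidence score.
best = max(root.find(EMMA + "one-of").findall(EMMA + "interpretation"),
           key=lambda i: float(i.get(EMMA + "confidence", "0")))
```

Here `best` is the nbest1 interpretation, whose tokens are "jon smith".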

22 Multimodal Integration Example
Multimodal dynamic map
- Cohen et al. 1998 (QuickSet); Gustafson et al. 2000 (AdApt); Johnston et al. 2002 (MATCH); Gruenstein et al. 2009
Local search application
- Map locations / search for restaurants / initiate calls
- Results in list or map view
Multimodal commands: "chinese restaurants near here" <touch>
Integration, e.g. finite-state integration (Bangalore and Johnston 2009, Computational Linguistics), using <emma:group>, <emma:lattice>, <emma:derivation>

23 Speech input
Speech result in EMMA:

<emma:emma>
  <emma:interpretation id="speech1"
      emma:confidence="0.9" emma:verbal="true"
      emma:start=" " emma:end=" "
      emma:lang="en-us"
      emma:process="smm:type=asr&amp;version=asr_eng2.4"
      emma:media-type="audio/amr; rate=8000">
    <emma:literal>french restaurants near here</emma:literal>
  </emma:interpretation>
</emma:emma>

24 Touch input
Client collects touch events on the map and represents the gesture using <emma:lattice>:

<emma:interpretation id="touch1"
    emma:confidence="0.8"
    emma:medium="tactile" emma:mode="touch"
    emma:start=" " emma:end=" ">
  <emma:lattice initial="0" final="4">
    <emma:arc from="0" to="1">g</emma:arc>
    <emma:arc from="1" to="2">point</emma:arc>
    <emma:arc from="2" to="3">coords</emma:arc>
    <emma:arc from="3" to="4">SEM([ , ])</emma:arc>
  </emma:lattice>
</emma:interpretation>
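A fusion server can recover the gesture symbol sequence by walking the lattice from its initial to its final state. A sketch assuming a linear lattice (one outgoing arc per state, as in this example) and the EMMA 1.0 namespace:

```python
import xml.etree.ElementTree as ET

EMMA = "{http://www.w3.org/2003/04/emma}"  # EMMA 1.0 namespace

doc = """
<emma:interpretation xmlns:emma="http://www.w3.org/2003/04/emma"
    emma:medium="tactile" emma:mode="touch">
  <emma:lattice initial="0" final="4">
    <emma:arc from="0" to="1">g</emma:arc>
    <emma:arc from="1" to="2">point</emma:arc>
    <emma:arc from="2" to="3">coords</emma:arc>
    <emma:arc from="3" to="4">SEM([ , ])</emma:arc>
  </emma:lattice>
</emma:interpretation>
"""

def walk_lattice(lattice):
    """Follow arcs from the initial to the final state, collecting labels.
    Assumes a linear lattice: exactly one outgoing arc per state."""
    arcs = {a.get("from"): a for a in lattice.findall(EMMA + "arc")}
    state, labels = lattice.get("initial"), []
    while state != lattice.get("final"):
        arc = arcs[state]
        labels.append(arc.text)
        state = arc.get("to")
    return labels

root = ET.fromstring(doc)
labels = walk_lattice(root.find(EMMA + "lattice"))
```

A general lattice would need a shortest/best-path search instead of this single pass, but the single pass is enough for deterministic gesture streams like this one.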

25 Package of multimodal input
Based on temporal constraints, the client groups the speech and gesture inputs using the <emma:group> container element and posts the multimodal input to a multimodal fusion server; <emma:group-info> indicates the nature of the grouping.

<emma:emma>
  <emma:group
      emma:medium="acoustic,tactile" emma:mode="voice,touch"
      emma:function="dialog">
    <emma:interpretation id="speech1"
        emma:confidence="0.9" emma:verbal="true"
        emma:start=" " emma:end=" "
        emma:medium="acoustic" emma:mode="voice"
        emma:lang="en-us"
        emma:process="smm:type=asr&amp;version=asr_eng2.4"
        emma:media-type="audio/amr; rate=8000">
      <emma:literal>french restaurants near here</emma:literal>
    </emma:interpretation>
    <emma:interpretation id="touch1"
        emma:confidence="0.8"
        emma:medium="tactile" emma:mode="touch"
        emma:start=" " emma:end=" ">
      <emma:lattice initial="0" final="4">
        <emma:arc from="0" to="1">g</emma:arc>
        <emma:arc from="1" to="2">point</emma:arc>
        <emma:arc from="2" to="3">coords</emma:arc>
        <emma:arc from="3" to="4">SEM([ , ])</emma:arc>
      </emma:lattice>
    </emma:interpretation>
    <emma:group-info>temporal</emma:group-info>
  </emma:group>
</emma:emma>

26 Multimodal EMMA Result
Multimodal interpretation results from finite-state multimodal understanding
- emma:medium and emma:mode have multiple values
- The timestamp is the union of the speech and gesture timestamps
- One <emma:derived-from> element for each mode
- The combining inputs can be contained inline in <emma:derivation>

<emma:emma>
  <emma:interpretation
      emma:medium="acoustic,tactile" emma:mode="voice,touch"
      emma:function="dialog"
      emma:process="smm:type=fusion&amp;version=watson6"
      emma:start=" " emma:end=" ">
    <query>
      <cuisine>french</cuisine>
      <location>[ , ]</location>
    </query>
    <emma:derived-from resource="#speech1"/>
    <emma:derived-from resource="#touch1"/>
  </emma:interpretation>
  <emma:derivation>
    <emma:interpretation id="speech1"
        emma:confidence="0.9"
        emma:start=" " emma:end=" "
        emma:verbal="true" emma:lang="en-us"
        emma:process="smm:type=asr&amp;version=asr_eng2.4"
        emma:media-type="audio/amr; rate=8000">
      <emma:literal>french restaurants near here</emma:literal>
    </emma:interpretation>
    <emma:interpretation id="touch1"
        emma:confidence="0.8"
        emma:medium="tactile" emma:mode="touch"
        emma:start=" " emma:end=" ">
      <emma:lattice initial="0" final="4">
        <emma:arc from="0" to="1">g</emma:arc>
        <emma:arc from="1" to="2">point</emma:arc>
        <emma:arc from="2" to="3">coords</emma:arc>
        <emma:arc from="3" to="4">SEM([ , ])</emma:arc>
      </emma:lattice>
    </emma:interpretation>
  </emma:derivation>
</emma:emma>
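A dialog manager receiving such a result can trace each fused interpretation back to its per-mode inputs by resolving the <emma:derived-from> pointers against the interpretations held inline in <emma:derivation>. A sketch assuming the EMMA 1.0 namespace (the stripped-down inline document keeps only ids and pointers):

```python
import xml.etree.ElementTree as ET

EMMA = "{http://www.w3.org/2003/04/emma}"  # EMMA 1.0 namespace

doc = """
<emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">
  <emma:interpretation id="mm1">
    <emma:derived-from resource="#speech1"/>
    <emma:derived-from resource="#touch1"/>
  </emma:interpretation>
  <emma:derivation>
    <emma:interpretation id="speech1"/>
    <emma:interpretation id="touch1"/>
  </emma:derivation>
</emma:emma>
"""

root = ET.fromstring(doc)
# Index every interpretation in the document by its id, including
# the earlier-stage ones nested inside <emma:derivation>.
by_id = {e.get("id"): e for e in root.iter(EMMA + "interpretation")}

def inputs_of(interp):
    """Resolve <emma:derived-from> fragment references (#id) to the
    interpretations they point at."""
    return [by_id[d.get("resource").lstrip("#")]
            for d in interp.findall(EMMA + "derived-from")]

sources = inputs_of(by_id["mm1"])
```

With one <emma:derived-from> per mode, `sources` yields the speech and touch interpretations that were fused.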

27 Conclusion
The W3C EMMA standard (1.0, with 1.1 in draft) provides an XML representation language for containing and annotating inputs to multimodal systems
Some key features:
- Standard representations for common metadata/annotations on inputs to interactive systems
- Representation of uncertainty: <emma:one-of>, emma:confidence, <emma:lattice>
- <emma:group> for packages of multimodal input for fusion
- <emma:derivation> and <emma:derived-from> for representation of multiple processing stages
Tomorrow: EMMA 1.1 (emma:annotation, emma:ref, ...)
Future EMMA use cases (emma:presentation)

28 EMMA Resources
Specifications
- The EMMA specification
- Use Cases for EMMA
Other information
- Building Multimodal Applications with EMMA, Michael Johnston, Proceedings of ICMI-MLMI 2009, Boston, MA, 2009
- Improving Dialogs with EMMA, Deborah Dahl, SpeechTEK 2009
- Extensible Multimodal Annotation for Intelligent Virtual Agents, Deborah Dahl, 10th International Conference on Intelligent Virtual Agents, September 20-22, 2010
- Introducing EMMA, Speech Technology Magazine, March

29 Acknowledgements
Debbie Dahl, Paolo Baggia, Dan Burnett, Ingmar Kliche, Kazuyuki Ashimura, Michael Bodell, Dave Raggett, Roberto Pieraccini, Max Froumentin, Massimo Romanelli, Gerry McCobb, Andrew Wahbe, Patricio Bergallo, Jerry Carter, Wu Chou, Yuan Shao, Jin Liu, Katriina Halonen, T. V. Raman, Stephen Potter


More information

Interacting with the Ambience: Multimodal Interaction and Ambient Intelligence

Interacting with the Ambience: Multimodal Interaction and Ambient Intelligence Interacting with the Ambience: Multimodal Interaction and Ambient Intelligence W3C Workshop on Multimodal Interaction 19/20 July 2004, Sophia Antipolis, France Kai Richter (ZGDV) & Michael Hellenschmidt

More information

Form. Settings, page 2 Element Data, page 7 Exit States, page 8 Audio Groups, page 9 Folder and Class Information, page 9 Events, page 10

Form. Settings, page 2 Element Data, page 7 Exit States, page 8 Audio Groups, page 9 Folder and Class Information, page 9 Events, page 10 The voice element is used to capture any input from the caller, based on application designer-specified grammars. The valid caller inputs can be specified either directly in the voice element settings

More information

Human Robot Interaction

Human Robot Interaction Human Robot Interaction Emanuele Bastianelli, Daniele Nardi bastianelli@dis.uniroma1.it Department of Computer, Control, and Management Engineering Sapienza University of Rome, Italy Introduction Robots

More information

Disconnecting the application from the interaction model

Disconnecting the application from the interaction model Disconnecting the application from the interaction model Ing-Marie Jonsson, Neil Scott, Judy Jackson Project Archimedes, CSLI Stanford University {ingmarie,ngscott,jackson}@csli.stanford.edu Abstract:

More information

Position Statement for Multi-Modal Access

Position Statement for Multi-Modal Access Information and Communication Mobile Position Statement for Multi-Modal Access 26.11.2001 Authors: Nathalie Amann, SRIT (E-Mail: Nathalie.Amann@SRIT.siemens.fr) Laurent Hue, SRIT (E-Mail: Laurent.Hue@SRIT.siemens.fr)

More information

A Scripting Language for Multimodal Presentation on Mobile Phones

A Scripting Language for Multimodal Presentation on Mobile Phones A Scripting Language for Multimodal Presentation on Mobile Phones Santi Saeyor 1, Suman Mukherjee 2, Koki Uchiyama 2, Ishizuka Mitsuru 1 1 Dept. of Information and Communication Engineering, University

More information

Special Lecture (406) Spoken Language Dialog Systems Introduction to VoiceXML

Special Lecture (406) Spoken Language Dialog Systems Introduction to VoiceXML Special Lecture (406) Spoken Language Dialog Systems Introduction to VoiceXML Rolf Schwitter schwitt@ics.mq.edu.au Macquarie University 2004 1 Today s Program Developing speech interfaces Brief history

More information

A NOVEL MECHANISM FOR MEDIA RESOURCE CONTROL IN SIP MOBILE NETWORKS

A NOVEL MECHANISM FOR MEDIA RESOURCE CONTROL IN SIP MOBILE NETWORKS A NOVEL MECHANISM FOR MEDIA RESOURCE CONTROL IN SIP MOBILE NETWORKS Noël CRESPI, Youssef CHADLI, Institut National des Telecommunications 9, rue Charles Fourier 91011 EVRY Cedex FRANCE Authors: N.Crespi,

More information

Smart Documents Timely Information Access For All T. V. Raman Senior Computer Scientist Advanced Technology Group Adobe Systems

Smart Documents Timely Information Access For All T. V. Raman Senior Computer Scientist Advanced Technology Group Adobe Systems Smart Documents Timely Information Access For All T. V. Raman Senior Computer Scientist Advanced Technology Group Adobe Systems 1 Outline The Information REvolution. Information is not just for viewing!

More information

WFSTDM Builder Network-based Spoken Dialogue System Builder for Easy Prototyping

WFSTDM Builder Network-based Spoken Dialogue System Builder for Easy Prototyping WFSTDM Builder Network-based Spoken Dialogue System Builder for Easy Prototyping Etsuo Mizukami and Chiori Hori Abstract This paper introduces a network-based spoken dialog system development tool kit:

More information

Chapter 9 Conceptual and Practical Framework for the Integration of Multimodal Interaction in 3D Worlds

Chapter 9 Conceptual and Practical Framework for the Integration of Multimodal Interaction in 3D Worlds 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 Chapter 9 Conceptual and Practical Framework for the Integration of

More information

MPEG-21 SESSION MOBILITY FOR HETEROGENEOUS DEVICES

MPEG-21 SESSION MOBILITY FOR HETEROGENEOUS DEVICES MPEG-21 SESSION MOBILITY FOR HETEROGENEOUS DEVICES Frederik De Keukelaere Davy De Schrijver Saar De Zutter Rik Van de Walle Multimedia Lab Department of Electronics and Information Systems Ghent University

More information

Speech Tuner. and Chief Scientist at EIG

Speech Tuner. and Chief Scientist at EIG Speech Tuner LumenVox's Speech Tuner is a complete maintenance tool for end-users, valueadded resellers, and platform providers. It s designed to perform tuning and transcription, as well as parameter,

More information

SOAP Specification. 3 major parts. SOAP envelope specification. Data encoding rules. RPC conventions

SOAP Specification. 3 major parts. SOAP envelope specification. Data encoding rules. RPC conventions SOAP, UDDI and WSDL SOAP SOAP Specification 3 major parts SOAP envelope specification Defines rules for encapsulating data Method name to invoke Method parameters Return values How to encode error messages

More information

INTRODUCTION TO VOICEXML FOR DISTRIBUTED WEB-BASED APPLICATIONS

INTRODUCTION TO VOICEXML FOR DISTRIBUTED WEB-BASED APPLICATIONS ιατµηµατικό Μεταπτυχιακό Πρόγραµµα Σπουδών : Οικονοµική & ιοίκηση Τηλεπικοινωνιακών ικτύων (Νέες υπηρεσίες και τεχνολογίες δικτύων) INTRODUCTION TO VOICEXML FOR DISTRIBUTED WEB-BASED APPLICATIONS Π.Κ Κίκιραs

More information

Streaming-Archival InkML Conversion

Streaming-Archival InkML Conversion Streaming-Archival InkML Conversion Birendra Keshari and Stephen M. Watt Dept. of Computer Science University of Western Ontario London, Ontario, Canada N6A 5B7 {bkeshari,watt}@csd.uwo.ca Abstract Ink

More information

Interaction Design and Implementation for Multimodal Mobile Semantic Web Interfaces

Interaction Design and Implementation for Multimodal Mobile Semantic Web Interfaces HCI International, Beijing, China, 27th July 2007 Interaction Design and Implementation for Multimodal Mobile Semantic Web Interfaces Daniel Sonntag German Research Center for Artificial Intelligence 66123

More information

Cache Operation. Version 31-Jul Wireless Application Protocol WAP-175-CacheOp a

Cache Operation. Version 31-Jul Wireless Application Protocol WAP-175-CacheOp a Cache Operation Version 31-Jul-2001 Wireless Application Protocol WAP-175-CacheOp-20010731-a A list of errata and updates to this document is available from the WAP Forum Web site, http://www.wapforum.org/,

More information

Menu Support for 2_Option_Menu Through 10_Option_Menu

Menu Support for 2_Option_Menu Through 10_Option_Menu Menu Support for 2_Option_Menu Through 10_Option_Menu These voice elements define menus that support from 2 to 10 options. The Menu voice elements are similar to the Form voice element, however the number

More information

Automatic Enhancement of Correspondence Detection in an Object Tracking System

Automatic Enhancement of Correspondence Detection in an Object Tracking System Automatic Enhancement of Correspondence Detection in an Object Tracking System Denis Schulze 1, Sven Wachsmuth 1 and Katharina J. Rohlfing 2 1- University of Bielefeld - Applied Informatics Universitätsstr.

More information

Multi-modal Web IBM Position

Multi-modal Web IBM Position Human Language Technologies Multi-modal Web IBM Position W3C / WAP Workshop Mobile Speech Solutions & Conversational AdTech Stéphane H. Maes smaes@us.ibm.com TV Raman 1 Definitions by example: evolution

More information

MRCP. Google SR Plugin. Usage Guide. Powered by Universal Speech Solutions LLC

MRCP. Google SR Plugin. Usage Guide. Powered by Universal Speech Solutions LLC Powered by Universal Speech Solutions LLC MRCP Google SR Plugin Usage Guide Revision: 6 Created: May 17, 2017 Last updated: January 22, 2018 Author: Arsen Chaloyan Universal Speech Solutions LLC Overview

More information

A Convedia White Paper. Controlling Media Servers with SIP

A Convedia White Paper. Controlling Media Servers with SIP Version 1.2 June, 2004 Contents: Introduction page 3 Media Server Overview page 3 Dimensions of Interaction page 5 Types of Interaction page 6 SIP Standards for Media Server Control page 7 Introduction

More information

Holly5 VoiceXML Developer Guide Holly Voice Platform 5.1. Document number: hvp-vxml-0009 Version: 1-0 Issue date: December

Holly5 VoiceXML Developer Guide Holly Voice Platform 5.1. Document number: hvp-vxml-0009 Version: 1-0 Issue date: December Holly5 VoiceXML Developer Guide Holly Voice Platform 5.1 Document number: hvp-vxml-0009 Version: 1-0 Issue date: December 22 2009 Copyright Copyright 2013 West Corporation. These documents are confidential

More information

[MS-TTML]: Internet Explorer Timed Text Markup Language (TTML) 1.0 Standards Support Documentation

[MS-TTML]: Internet Explorer Timed Text Markup Language (TTML) 1.0 Standards Support Documentation [MS-TTML]: Internet Explorer Timed Text Markup Language (TTML) 1.0 Standards Support Documentation Intellectual Property Rights Notice for Open Specifications Documentation Technical Documentation. Microsoft

More information

Application Notes for Yandex Speechkit Speech Recognition 1.6 with Avaya Aura Experience Portal Issue 1.0

Application Notes for Yandex Speechkit Speech Recognition 1.6 with Avaya Aura Experience Portal Issue 1.0 Avaya Solution & Interoperability Test Lab Application Notes for Yandex Speechkit Speech Recognition 1.6 with Avaya Aura Experience Portal 7.0.1 - Issue 1.0 Abstract These application notes describe the

More information

ETSI TS V1.1.1 ( ) Technical Specification

ETSI TS V1.1.1 ( ) Technical Specification TS 102 632 V1.1.1 (2008-11) Technical Specification Digital Audio Broadcasting (DAB); Voice Applications European Broadcasting Union Union Européenne de Radio-Télévision EBU UER 2 TS 102 632 V1.1.1 (2008-11)

More information

A Technical Overview: Voiyager Dynamic Application Discovery

A Technical Overview: Voiyager Dynamic Application Discovery A Technical Overview: Voiyager Dynamic Application Discovery A brief look at the Voiyager architecture and how it provides the most comprehensive VoiceXML application testing and validation method available.

More information

SmartKom: Towards Multimodal Dialogues with Anthropomorphic Interface Agents

SmartKom: Towards Multimodal Dialogues with Anthropomorphic Interface Agents SmartKom: Towards Multimodal Dialogues with Anthropomorphic Interface Agents Wolfgang Wahlster Norbert Reithinger Anselm Blocher DFKI GmbH, D-66123 Saarbrücken, Germany {wahlster,reithinger,blocher}@dfki.de

More information

Key differentiating technologies for mobile search

Key differentiating technologies for mobile search Key differentiating technologies for mobile search Orange Labs Michel PLU, ORANGE Labs - Research & Development Exploring the Future of Mobile Search Workshop, GHENT Some key differentiating technologies

More information

3GPP TS V ( )

3GPP TS V ( ) TS 23.333 V12.5.0 (2015-03) Technical Specification 3rd Generation Partnership Project; Technical Specification Group Core Network and Terminals; Multimedia Resource Function Controller (MRFC) - Multimedia

More information

SALT. Speech Application Language Tags 0.9 Specification (Draft)

SALT. Speech Application Language Tags 0.9 Specification (Draft) SALT Speech Application Language Tags 0.9 Specification (Draft) Document SALT.0.9.pdf 19 Feb 2002 Cisco Systems Inc., Comverse Inc., Intel Corporation, Microsoft Corporation, Philips Electronics N.V.,

More information

PRACTICAL SPEECH USER INTERFACE DESIGN

PRACTICAL SPEECH USER INTERFACE DESIGN ; ; : : : : ; : ; PRACTICAL SPEECH USER INTERFACE DESIGN й fail James R. Lewis. CRC Press Taylor &. Francis Group Boca Raton London New York CRC Press is an imprint of the Taylor & Francis Group, an informa

More information

A MULTILINGUAL DIALOGUE SYSTEM FOR ACCESSING THE WEB

A MULTILINGUAL DIALOGUE SYSTEM FOR ACCESSING THE WEB A MULTILINGUAL DIALOGUE SYSTEM FOR ACCESSING THE WEB Marta Gatius, Meritxell González and Elisabet Comelles Technical University of Catalonia, Software Department, Campus Nord UPC, Jordi Girona, 1-3 08034

More information

VClarity Voice Platform

VClarity Voice Platform VClarity Voice Platform VClarity L.L.C. Voice Platform Snap-in Functional Overview White Paper Technical Pre-release Version 2.0 for VClarity Voice Platform Updated February 12, 2007 Table of Contents

More information

[MS-XMLSS]: Microsoft XML Schema (Part 1: Structures) Standards Support Document

[MS-XMLSS]: Microsoft XML Schema (Part 1: Structures) Standards Support Document [MS-XMLSS]: Microsoft XML Schema (Part 1: Structures) Standards Support Document Intellectual Property Rights Notice for Open Specifications Documentation Technical Documentation. Microsoft publishes Open

More information

Say-it: Design of a Multimodal Game Interface for Children Based on CMU Sphinx 4 Framework

Say-it: Design of a Multimodal Game Interface for Children Based on CMU Sphinx 4 Framework Grand Valley State University ScholarWorks@GVSU Technical Library School of Computing and Information Systems 2014 Say-it: Design of a Multimodal Game Interface for Children Based on CMU Sphinx 4 Framework

More information

Network Working Group Request for Comments: 4424 February 2006 Updates: 4348 Category: Standards Track

Network Working Group Request for Comments: 4424 February 2006 Updates: 4348 Category: Standards Track Network Working Group S. Ahmadi Request for Comments: 4424 February 2006 Updates: 4348 Category: Standards Track Real-Time Transport Protocol (RTP) Payload Format for the Variable-Rate Multimode Wideband

More information

Linked Data in Translation-Kits

Linked Data in Translation-Kits Linked Data in Translation-Kits FEISGILTT Dublin June 2014 Yves Savourel ENLASO Corporation Slides and code are available at: https://copy.com/ngbco13l9yls This presentation was made possible by The main

More information

XML Information Set. Working Draft of May 17, 1999

XML Information Set. Working Draft of May 17, 1999 XML Information Set Working Draft of May 17, 1999 This version: http://www.w3.org/tr/1999/wd-xml-infoset-19990517 Latest version: http://www.w3.org/tr/xml-infoset Editors: John Cowan David Megginson Copyright

More information

ARI Technical Working Group Report Meetings held Nov 2014 Report 30 January 2015

ARI Technical Working Group Report Meetings held Nov 2014 Report 30 January 2015 Introduction ARI Technical Working Group Report Meetings held 12-13 Nov 2014 Report 30 January 2015 The Accessible Rendered Item (ARI) format is a new concept for describing computer-administered assessment

More information

The Atom Project. Tim Bray, Sun Microsystems Paul Hoffman, IMC

The Atom Project. Tim Bray, Sun Microsystems Paul Hoffman, IMC The Atom Project Tim Bray, Sun Microsystems Paul Hoffman, IMC Recent Numbers On June 23, 2004 (according to Technorati.com): There were 2.8 million feeds tracked 14,000 new blogs were created 270,000 new

More information

Confidence Measures: how much we can trust our speech recognizers

Confidence Measures: how much we can trust our speech recognizers Confidence Measures: how much we can trust our speech recognizers Prof. Hui Jiang Department of Computer Science York University, Toronto, Ontario, Canada Email: hj@cs.yorku.ca Outline Speech recognition

More information

Task Completion Platform: A self-serve multi-domain goal oriented dialogue platform

Task Completion Platform: A self-serve multi-domain goal oriented dialogue platform Task Completion Platform: A self-serve multi-domain goal oriented dialogue platform P. A. Crook, A. Marin, V. Agarwal, K. Aggarwal, T. Anastasakos, R. Bikkula, D. Boies, A. Celikyilmaz, S. Chandramohan,

More information

The Dublin Core Metadata Element Set

The Dublin Core Metadata Element Set ISSN: 1041-5635 The Dublin Core Metadata Element Set Abstract: Defines fifteen metadata elements for resource description in a crossdisciplinary information environment. A proposed American National Standard

More information

Speech Applications. How do they work?

Speech Applications. How do they work? Speech Applications How do they work? What is a VUI? What the user interacts with when using a speech application VUI Elements Prompts or System Messages Prerecorded or Synthesized Grammars Define the

More information

Web & Automotive. Paris, April Dave Raggett

Web & Automotive. Paris, April Dave Raggett Web & Automotive Paris, April 2012 Dave Raggett 1 Aims To discuss potential for Web Apps in cars Identify what kinds of Web standards are needed Discuss plans for W3C Web & Automotive Workshop

More information

Network Working Group Internet-Draft October 27, 2007 Intended status: Experimental Expires: April 29, 2008

Network Working Group Internet-Draft October 27, 2007 Intended status: Experimental Expires: April 29, 2008 Network Working Group J. Snell Internet-Draft October 27, 2007 Intended status: Experimental Expires: April 29, 2008 Status of this Memo Atom Publishing Protocol Feature Discovery draft-snell-atompub-feature-12.txt

More information

An exchange format for multimodal annotations

An exchange format for multimodal annotations An exchange format for multimodal annotations Thomas Schmidt, Susan Duncan, Oliver Ehmer, Jeffrey Hoyt, Michael Kipp, Dan Loehr, Magnus Magnusson, Travis Rose, Han Sloetjes Background International Society

More information

2004 NASCIO Recognition Awards. Nomination Form

2004 NASCIO Recognition Awards. Nomination Form 2004 NASCIO Recognition Awards Nomination Form Title of Nomination: Access Delaware Project Project/System Manager: Mark J. Headd Job Title: Assistant to the Chief Information Officer Agency: Office of

More information

SCXML. Michael Bodell.

SCXML. Michael Bodell. SCXML Michael Bodell bodell@tellme.com Prologue (VXML 2.0/2.1) VoiceXML 2.0/2.1 is a standard out of the Voice Browser Working Group of the W3C VXML is to networked phone browsers as HTML is to internet

More information

Emacspeak Direct Speech Access. T. V. Raman Senior Computer Scientist Adobe Systems. c1996 Adobe Systems Incorporated.All rights reserved.

Emacspeak Direct Speech Access. T. V. Raman Senior Computer Scientist Adobe Systems. c1996 Adobe Systems Incorporated.All rights reserved. Emacspeak Direct Speech Access T. V. Raman Senior Computer Scientist Adobe Systems 1 Outline Overview of speech applications. Emacspeak Architecture. Emacspeak The user experience. 2 Screen Access User

More information

Realisation of SOA using Web Services. Adomas Svirskas Vilnius University December 2005

Realisation of SOA using Web Services. Adomas Svirskas Vilnius University December 2005 Realisation of SOA using Web Services Adomas Svirskas Vilnius University December 2005 Agenda SOA Realisation Web Services Web Services Core Technologies SOA and Web Services [1] SOA is a way of organising

More information

Lesson 11. Media Retrieval. Information Retrieval. Image Retrieval. Video Retrieval. Audio Retrieval

Lesson 11. Media Retrieval. Information Retrieval. Image Retrieval. Video Retrieval. Audio Retrieval Lesson 11 Media Retrieval Information Retrieval Image Retrieval Video Retrieval Audio Retrieval Information Retrieval Retrieval = Query + Search Informational Retrieval: Get required information from database/web

More information

Voice Extensible Markup Language (VoiceXML)

Voice Extensible Markup Language (VoiceXML) Voice Extensible Markup Language (VoiceXML) Version 2.0 W3C Working Draft 24 April 2002 This Version: http://www.w3.org/tr/2002/wd-voicexml20-20020424/ Latest Version: http://www.w3.org/tr/voicexml20 Previous

More information

Genesys App Automation Platform Deployment Guide. Hardware and Software Specifications

Genesys App Automation Platform Deployment Guide. Hardware and Software Specifications Genesys App Automation Platform Deployment Guide Hardware and Software Specifications 6/28/2018 Contents 1 Hardware and Software Specifications 1.1 Hardware 1.2 Software 1.3 IVR technologies and platforms

More information

Contents. G52IWS: The Semantic Web. The Semantic Web. Semantic web elements. Semantic Web technologies. Semantic Web Services

Contents. G52IWS: The Semantic Web. The Semantic Web. Semantic web elements. Semantic Web technologies. Semantic Web Services Contents G52IWS: The Semantic Web Chris Greenhalgh 2007-11-10 Introduction to the Semantic Web Semantic Web technologies Overview RDF OWL Semantic Web Services Concluding comments 1 See Developing Semantic

More information

On the Way to the Semantic Web

On the Way to the Semantic Web On the Way to the Semantic Web Presented on 1 Fórum W3C Brasil, by Klaus Birkenbihl, Coordinator World Offices, W3C based on a slide set mostly created by Ivan Herman, Semantic Web Activity Lead, W3C Sept.

More information

Institutional Repository - Research Portal Dépôt Institutionnel - Portail de la Recherche

Institutional Repository - Research Portal Dépôt Institutionnel - Portail de la Recherche Institutional Repository - Research Portal Dépôt Institutionnel - Portail de la Recherche researchportal.unamur.be RESEARCH OUTPUTS / RÉSULTATS DE RECHERCHE Prototyping Multimodal Interfaces with the SMUIML

More information

Integrate Speech Technology for Hands-free Operation

Integrate Speech Technology for Hands-free Operation Integrate Speech Technology for Hands-free Operation Copyright 2011 Chant Inc. All rights reserved. Chant, SpeechKit, Getting the World Talking with Technology, talking man, and headset are trademarks

More information

Data Synchronization in Mobile Computing Systems Lesson 12 Synchronized Multimedia Markup Language (SMIL)

Data Synchronization in Mobile Computing Systems Lesson 12 Synchronized Multimedia Markup Language (SMIL) Data Synchronization in Mobile Computing Systems Lesson 12 Synchronized Multimedia Markup Language (SMIL) Oxford University Press 2007. All rights reserved. 1 Language required to specify the multimodal

More information

Abstract. Avaya Solution & Interoperability Test Lab

Abstract. Avaya Solution & Interoperability Test Lab Avaya Solution & Interoperability Test Lab Application Notes for Jabra Link 33 EHS Adapter, Jabra Engage 65 and Jabra Engage 75 Convertible USB/DECT headsets with Avaya 96x1 Series IP Deskphone (H.323

More information

ISO/IEC INTERNATIONAL STANDARD. Information technology Multimedia content description interface Part 5: Multimedia description schemes

ISO/IEC INTERNATIONAL STANDARD. Information technology Multimedia content description interface Part 5: Multimedia description schemes INTERNATIONAL STANDARD ISO/IEC 15938-5 First edition 2003-05-15 Information technology Multimedia content description interface Part 5: Multimedia description schemes Technologies de l'information Interface

More information

Temporal Aspects of CARE-based Multimodal Fusion: From a Fusion Mechanism to Composition Components and WoZ Components

Temporal Aspects of CARE-based Multimodal Fusion: From a Fusion Mechanism to Composition Components and WoZ Components Temporal Aspects of CARE-based Multimodal Fusion: From a Fusion Mechanism to Composition Components and WoZ Components Marcos Serrano and Laurence Nigay University of Grenoble, CNRS, LIG B.P. 53, 38041,

More information

Application Notes for Configuring Nuance Speech Attendant with Avaya Aura Session Manager R6.3 and Avaya Communication Server 1000 R7.6 Issue 1.

Application Notes for Configuring Nuance Speech Attendant with Avaya Aura Session Manager R6.3 and Avaya Communication Server 1000 R7.6 Issue 1. Avaya Solution & Interoperability Test Lab Application Notes for Configuring Nuance Speech Attendant with Avaya Aura Session Manager R6.3 and Avaya Communication Server 1000 R7.6 Issue 1.0 Abstract These

More information