SmartWeb Handheld Multimodal Interaction with Ontological Knowledge Bases and Semantic Web Services

IJCAI Workshop AI4HC, Hyderabad, 6/1/2007
SmartWeb Handheld Multimodal Interaction with Ontological Knowledge Bases and Semantic Web Services
Daniel Sonntag, Ralf Engel, Gerd Herzog, Alexander Pfalzgraf, Norbert Pfleger, Massimo Romanelli, Norbert Reithinger
German Research Center for Artificial Intelligence, 66123 Saarbrücken, Germany
daniel.sonntag@dfki.de

Agenda
- SmartWeb & Multimodal QA
- Requirements
- Architecture Approach
- Ontology Representation and Web Services
- Semantic Parsing and Discourse Processing
- Conclusions

Project Goals
SmartWeb goal: Intuitive multimodal access to a rich selection of Web-based information services.
HCI/Dialogue system goals:
- Demonstrate the strength of Semantic Web technology for information gathering dialogue systems.
- Show how knowledge retrieval from ontologies and Web Services can be combined with advanced dialogical interaction, e.g., system clarifications.
- Ontology-based integration of verbal and non-verbal system input (fusion) and output (reaction/presentation).

SmartWeb Requirements
- Multimodal dialogue with question answering functionality.
- Speech is the dominant input modality for interaction.
- Multimodal recognition for speech or gestures.
- Modality interpretation and fusion, intention processing.
- Modality fission, result rendering for text, images, videos, graphics, and synthesis of speech.
- Reuse already existing components.
- Control the message flow in the system.
Figure labels: 3G smartphone, dashboard display, motorbike cockpit.

The SmartWeb Consortium
- Funded by the German Government and Industry
- Funding: 13.7 M, Budget: 24 M
- Scientific Director: Wolfgang Wahlster
- Project Duration: 2004-2008
- More than 60 Researchers and Engineers
Partner labels on the slide: European Media Lab; IMS Institut für Maschinelle Sprachverarbeitung, Universität Stuttgart; Ludwig-Maximilians-Universität München; Berkeley, USA.

Application Scenarios
- Personal guide at the FIFA World Cup 2006
- SmartWeb: Getting Answers on the Go (keynote Wahlster, ECAI 2006): http://www2.dfki.de/~wahlster/ecai2006/
- German Telekom Mobility and Navigation Scenario
- http://smartweb.dfki.de/smartweb_flashdemo_eng_v09.exe

Interaction Guidelines
Multimodal guidelines and assets:
- Multimodality: More modalities allow for more natural communication.
- Encapsulation: Encapsulate the user interface proper, separating it from the rest of the application.
- Standards: Re-use your own and others' resources.
- Representation: A common ontological knowledge base eases data flow, avoids transformations, and provides a basis for processing natural language dialogue phenomena.

Multimodal Input and Output
Handheld scenario, multimodal interaction (figure labels):
- Speech / spoken dialogue over Bluetooth headset or PDA microphone
- Gestures on a graphical user interface based on pen input
- Face camera / facial expression
- Bio signals
- Haptic feedback / physical action by handlebars on BMW bike

Interaction Example
U (Query): Show me the mascot of the football WCS.
S (Clarification): Which year? 2006, 2002, 1998, 1994, 1990
U (Feedback): 2006
S (Multimodal): GOLEO
U (Query): I need some texts about football rules.
S (Intermediate Result): Paragraph: Yellow card; Paragraph: Red card; Paragraph: Penalty shot
U (Feedback): What does red and yellow card mean?
S (Final Result): Paragraph: Yellow, yellow-red, and red cards are shown

Handheld Interaction Example
Screenshot callouts:
- Inducting/Deducing
- Enumeration question
- Ellipsis resolution / query completion
- Integration of verbal and non-verbal output
- Web Service interface & system clarifications

Handheld Architecture
- Implicit confirmation and language (in)dependent visualisation
- Graphical Design
- Technical Design

Ontologies
An ontology is
- "an explicit specification of a conceptualization" [Gruber 93]
- "a shared understanding of a domain of interest" [Uschold/Gruninger 96]
Ontologies
- make domain assumptions explicit
- separate domain knowledge from operational knowledge
- allow domain and operational knowledge to be re-used separately
- serve as a community reference for applications
- give a shared understanding of what particular information means

Ontology Representation
Ontological infrastructure: SWIntO (SmartWeb Integrated Ontology)
- LingInfo
- SmartSUMO (adapted DOLCE, adapted SUMO)
- Sport Event Ontology
- Navigation Ontology
- Discourse Ontology
- Webcam Ontology
- SmartMedia Ontology

Discourse Ontology for Semantic Web Applications
Diagram labels:
- Multimodal Dialogue Management
- EMMA format: swemma:result, swemma:status
- Multimodal question/answer types
- Dialogue Modelling
- Lexical rules for syntactic/semantic mapping
- Dialogue Acts
- Dialogue Memory
- System reactions: usage and clarification dialogs
- Presentation elements/patterns for multimodal fusion/fission
- Dialogue Model: Phases/States (FSM/PLAN)
- HCI Interaction Patterns

Ontology Representation and Multimedia
- Framework for gesture and speech fusion.
- Multimedia decomposition in space, time and frequency.
- Link to the Upper Model Ontology to close the Semantic Gap.

Ontology Representation and Web Services
- Detect underspecified user queries that lack required input parameters (e.g., GPS missing).
- Plan-based Service Composer (GOAL)
Service labels on the slide: Route Planner, POI Search; Movies, Events, Maps, Weather; U-Context, e.g. p/t, Weather; Books, Movie Posters; WM-Guide; Train Connections, Hotels.
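The check for underspecified queries can be illustrated with a minimal Python sketch. It is not the SmartWeb service composer: the service names, parameter names, and the `REQUIRED_PARAMS` table are hypothetical and only show the idea of comparing a query against the inputs a service needs and triggering a clarification when one is missing.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical service descriptions; names and parameters are illustrative only.
REQUIRED_PARAMS = {
    "route_planner": ("start_gps", "destination"),
    "weather": ("location", "date"),
}

@dataclass
class Query:
    service: str
    params: dict = field(default_factory=dict)

def missing_params(query: Query) -> List[str]:
    """Return the required parameters that the query does not yet supply."""
    required = REQUIRED_PARAMS[query.service]
    return [name for name in required if query.params.get(name) is None]

def next_action(query: Query) -> str:
    """Ask for the first missing parameter, or hand the query to the service."""
    missing = missing_params(query)
    if missing:
        return f"clarify:{missing[0]}"
    return f"invoke:{query.service}"

q = Query("route_planner", {"destination": "Berlin"})
print(next_action(q))  # clarify:start_gps
q.params["start_gps"] = (49.26, 7.04)  # made-up coordinates
print(next_action(q))  # invoke:route_planner
```

In this sketch the missing GPS position leads to a clarification step instead of a failed service call, which mirrors the "GPS missing" case named on the slide.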

Web Service Composition Results
- Text-based event details, additional image material, and the location map are semantically represented.
- User perceptual feedback: feedback on natural language understanding; presentation of multimodal results combining text, image and speech synthesis.

Language Understanding and Text Generation
- Lexico-semantic mapping on word level (SPIN).
- Fast and robust speech processing (ASR errors and disfluencies).
- Order-independent matching.
- NIPSGEN uses SPIN + TAG grammar.
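Why order-independent, word-level matching is robust against disfluencies can be shown with a small Python sketch. It does not use SPIN's rule language; the `LEXICON` table and slot names are assumptions made for illustration. Each known word maps to a semantic slot, unknown words are skipped, and word order does not affect the resulting frame.

```python
# Hypothetical word-level lexicon: word -> (slot, value). Not SPIN's actual rules.
LEXICON = {
    "weather": ("topic", "weather"),
    "tomorrow": ("date", "tomorrow"),
    "berlin": ("location", "Berlin"),
}

def map_words(utterance: str) -> dict:
    """Map each known word to its slot; ignore fillers and unknown words."""
    frame = {}
    for word in utterance.lower().replace("?", "").split():
        if word in LEXICON:
            slot, value = LEXICON[word]
            frame[slot] = value
    return frame

clean = map_words("weather in Berlin tomorrow?")
disfluent = map_words("uh tomorrow Berlin the weather")
print(clean == disfluent)  # True
```

Both inputs yield the same frame, so a reordered or hesitant utterance is interpreted like the well-formed one.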

Multimodal Discourse Processing (FADE)
Production rule system and discourse modeller. Example utterances:
- How do I get to Berlin from here?
- Who won the FIFA World Cup in 1990? And in 2002?
- How often did this team [pointing gesture] win the World Cup?
- What's the weather going to be like tomorrow? And the day after tomorrow?
- What's the name of the third player in the top row?
- Where do you want to start? - Berlin
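An elliptical follow-up such as "And in 2002?" can be completed from the preceding query. The Python sketch below is a simplification and not FADE's production-rule mechanism: the slot names are hypothetical, and resolution is reduced to overriding slots of the previous query frame with those found in the fragment.

```python
def resolve_ellipsis(previous: dict, fragment: dict) -> dict:
    """Complete an elliptical follow-up by overriding slots of the previous query."""
    completed = dict(previous)   # copy, so the discourse history stays unchanged
    completed.update(fragment)   # slots mentioned in the fragment win
    return completed

# "Who won the FIFA World Cup in 1990?"
first = {"predicate": "winner", "event": "FIFA World Cup", "year": 1990}
# "And in 2002?" only carries a new year.
follow_up = {"year": 2002}

print(resolve_ellipsis(first, follow_up))
# {'predicate': 'winner', 'event': 'FIFA World Cup', 'year': 2002}
```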

Multimodal Fusion
Referencing MPEG-7 sub-annotations: [player click] + "How many goals did this player score?"
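The pairing of a click with a deictic phrase can be sketched in Python as follows. This is an illustration under stated assumptions, not the SmartWeb fusion component: the `Gesture` and `Utterance` types, the three-second time window, and the region identifiers are hypothetical, and a real system would resolve the click against MPEG-7 annotations of the image.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Gesture:
    target_id: str    # hypothetical identifier of the clicked image region
    timestamp: float  # seconds

@dataclass
class Utterance:
    text: str
    timestamp: float  # seconds

def fuse(utterance: Utterance, gestures: List[Gesture],
         window: float = 3.0) -> Optional[str]:
    """Bind a deictic 'this' to the gesture closest in time within the window."""
    if "this" not in utterance.text.lower().split():
        return None
    candidates = [g for g in gestures
                  if abs(g.timestamp - utterance.timestamp) <= window]
    if not candidates:
        return None
    closest = min(candidates,
                  key=lambda g: abs(g.timestamp - utterance.timestamp))
    return closest.target_id

clicks = [Gesture("player_3", 2.0), Gesture("player_7", 10.0)]
question = Utterance("How many goals did this player score", 11.2)
print(fuse(question, clicks))  # player_7
```

The earlier click on `player_3` falls outside the window, so the referring expression is bound to the later click.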

Multimodal Dialogue Control
- Initial FSA policy for demonstrator
- Outgoing arcs provide classification task
- IS features from ontological instances
- Learn Reaction and Presentation Behaviour
- Clarification strategies
- Filtering strategies
- Multimodal presentation strategies with temporal constraints
- User Modelling Recommendations
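A finite-state dialogue policy of the kind mentioned above can be sketched in a few lines of Python. The states, events, and transition table are hypothetical and do not reproduce the demonstrator's actual policy; the sketch only shows how outgoing arcs select the next system reaction.

```python
# Hypothetical transition table: (state, event) -> next state.
TRANSITIONS = {
    ("await_query", "query_complete"): "present_result",
    ("await_query", "query_underspecified"): "clarify",
    ("clarify", "user_feedback"): "present_result",
    ("present_result", "new_query"): "await_query",
}

def step(state: str, event: str) -> str:
    """Follow the matching outgoing arc; stay in place if none matches."""
    return TRANSITIONS.get((state, event), state)

def run(events, state="await_query"):
    """Return the sequence of states visited for a list of events."""
    trace = [state]
    for event in events:
        state = step(state, event)
        trace.append(state)
    return trace

print(run(["query_underspecified", "user_feedback", "new_query"]))
# ['await_query', 'clarify', 'present_result', 'await_query']
```

Replacing the fixed table lookup in `step` with a learned classifier over information-state features would correspond to the "outgoing arcs provide classification task" idea, though how SmartWeb does this is not specified here.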

Experience on Component Integration
- Using ontologies in information-gathering dialog systems, for knowledge retrieval from ontologies and Web Services in combination with advanced dialogical interaction, is an iterative ontology engineering process.
- Ontological representations offer a framework for gesture and speech fusion when users interact with Semantic Web results such as MPEG-7-annotated images and maps.
- Generate structured input spaces for more context-relevant reaction planning, so that system-user interactions remain natural to a large degree.

Conclusions
- SmartWeb was successfully demonstrated in the context of the Football World Cup 2006 in Germany.
- Flexible control flow to be combined with dialog system strategies for
  - error recoveries
  - clarifications with the user
  - multimodal interactions
- Future integration plans:
  - Dialogue management adaptations via machine learning
  - Collaborative filtering (of redundant results)
  - Incremental presentation of results