HCI International, Beijing, China, 27th July 2007

Interaction Design and Implementation for Multimodal Mobile Semantic Web Interfaces

Daniel Sonntag
German Research Center for Artificial Intelligence
66123 Saarbruecken, Germany
daniel.sonntag@dfki.de

Agenda
- SmartWeb and Multimodal HCI Requirements
- Architecture Approach
- Interaction Design and Implementation
- Conclusions

SmartWeb and Multimodal HCI Requirements
Goal: Intuitive multimodal access to a rich selection of Web-based information services.
HCI and dialogue system goals:
- Provide concise and correct multimedia answers in a multimodal way.
- Show how knowledge retrieval from ontologies and Web Services can be combined with advanced dialogical interaction, e.g., system clarifications.
- Provide ontology-based integration of verbal and non-verbal system input (fusion) and output (reaction/presentation).

The SmartWeb Consortium
- Funded by the German Government and Industry
- Funding: 13.7 M, Budget: 24 M
- Scientific Director: Wolfgang Wahlster
- Project Duration: 2004-2008
- More than 60 Researchers and Engineers
- Partners named on the slide: European Media Lab; Ludwig-Maximilians-Universität München; Berkeley, USA; IMS Institut für Maschinelle Sprachverarbeitung, Universität Stuttgart

SmartWeb Requirements
- Multimodal dialogue with question answering functionality.
- Speech is the dominant input modality for interaction.
- Multimodal recognition for speech or gestures.
- Modality interpretation and fusion, intention processing.
- Modality fission, result rendering for text, images, videos, graphics, and synthesis of speech.
- Reuse already existing components.
- Control the message flow in the system.
[Figure labels: 3G smartphone; Motorbike cockpit; Dashboard display]

Interactive Multimodal Semantic Web Access?
Challenges:
(1) Lack of computational power
(2) Small screen
Opportunities:
(3) Exploit contextual information
(4) New interaction possibilities
[Diagram labels: Q&A System; Dialogue System; Semantic Mediator; Web Service Access; Agent Based Web Access; Knowledge Server]

Application Scenarios
- Personal guide at the FIFA World Cup 2006: answer football-related and navigation-related questions.
- German Telekom Mobility and Navigation Scenario
http://smartweb.dfki.de/smartweb_flashdemo_eng_v09.exe

Natural Dialogue Based Mobile Interaction Example
- Inducting/deducing enumeration question
- Ellipsis resolution / query completion
- Integration of verbal and non-verbal output
- System clarifications in Web Service interface

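The "ellipsis resolution / query completion" item can be pictured as slot filling: an elliptical follow-up supplies only the slots that changed, and the rest is inherited from the previous query. The sketch below is a minimal illustration under that assumption, not SmartWeb's actual mechanism. The slot names, the example values, and the function complete_query are hypothetical.

```python
from typing import Dict, Optional


def complete_query(previous: Dict[str, str],
                   follow_up: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Fill the slots missing from an elliptical follow-up with the previous query's slots."""
    completed = dict(previous)          # start from the full previous query
    for slot, value in follow_up.items():
        if value is not None:           # only explicitly mentioned slots override
            completed[slot] = value
    return completed


# Hypothetical dialogue: a full question, then the ellipsis "And in 2002?"
previous_query = {"relation": "winner", "event": "WorldCup", "year": "1990"}
elliptical_follow_up = {"year": "2002"}
resolved = complete_query(previous_query, elliptical_follow_up)
```

Here resolved keeps the relation and event of the first question and replaces only the year. A real dialogue system would also have to decide which slot an elliptical fragment belongs to, which this sketch takes as given.
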
Design Principles for Mobile Multimodal Dialogue
1. Follow general UI principles.
2. Display the recognised user input.
3. Provide interface simplicity by progressive disclosure.
4. Provide status information to the user.

[Figure labels: Graphical interaction areas; Initial screen layout; Interactions with paper prototypes; Implemented system]
Perceptual feedback allows the user to understand the current processing state of the system.
Use lively metaphors for system states and perceptual states:
(1) Listening/Idle State
(2) Recording (green and red icon)
(3) Understanding by presenting a semantic query interpretation
(4) Query Processing (status bar)
(5) Presentation Planning
(6) Presenting

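The six states above can be read as a simple state machine that drives the perceptual feedback. The following is a minimal sketch under assumptions: the strictly linear order with a wrap back to idle, the FEEDBACK table, and the cues for states 1, 5, and 6 are illustrative and not taken from the slides. Only the cues for Recording, Understanding, and Query Processing come from the slide text.

```python
from enum import Enum
from typing import List


class SystemState(Enum):
    """The six system/perceptual states named on the slide."""
    LISTENING_IDLE = 1
    RECORDING = 2
    UNDERSTANDING = 3
    QUERY_PROCESSING = 4
    PRESENTATION_PLANNING = 5
    PRESENTING = 6


# Feedback cue shown per state. Entries marked "assumed" are placeholders.
FEEDBACK = {
    SystemState.LISTENING_IDLE: "idle icon (assumed)",
    SystemState.RECORDING: "green and red icon",
    SystemState.UNDERSTANDING: "semantic query interpretation",
    SystemState.QUERY_PROCESSING: "status bar",
    SystemState.PRESENTATION_PLANNING: "planning indicator (assumed)",
    SystemState.PRESENTING: "answer presentation (assumed)",
}


def next_state(state: SystemState) -> SystemState:
    """Advance to the following state; PRESENTING wraps to LISTENING_IDLE (assumed order)."""
    members = list(SystemState)
    return members[(members.index(state) + 1) % len(members)]


def run_turn() -> List[str]:
    """Walk through one full dialogue turn and collect the feedback shown at each step."""
    state = SystemState.LISTENING_IDLE
    trace = []
    for _ in range(len(SystemState)):
        trace.append(f"{state.name}: {FEEDBACK[state]}")
        state = next_state(state)
    return trace
```

Keeping the cue table separate from the transition logic means the user always sees some status for whatever state the system is in, which is the point of principle 4.
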
Core User Interface

Multimodal Interaction Guidelines
- Multimodality: More modalities allow for more natural communication.
- Encapsulation: Encapsulate the user interface proper from the rest of the application.
- Standards: Re-use own and others' resources.
- Representation: A common ontological knowledge base eases data flow, avoids transformations, and provides a basis for processing natural language dialogue phenomena.
- Principles:
  » No presentation without representation
  » No interaction without representation

Representational Guidelines
- Consistent and homogeneous presentations from multiple answer streams are the result of ontological representations.
- We base all decision processes and presentation items on ontology structures.

Ontologies
An ontology is
- an explicit specification of a conceptualization [Gruber 93].
- a shared understanding of a domain of interest [Uschold/Gruninger 96].
Why ontologies:
- Make domain assumptions explicit.
- Separate domain knowledge from operational knowledge.
- Re-use domain and operational knowledge separately.
- A community reference for applications.
- Shared understanding of what particular information means.

Ontology Representation and Multimedia
- Framework for gesture and speech fusion
- Multimedia decomposition in space, time and frequency
- Link to the Upper Model Ontology to close the Semantic Gap

Language Understanding and Text Generation
- Enabled by ontology representation
- Lexico-Semantic Mapping on word level (SPIN)
- Concise information presentation

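A lexico-semantic mapping on word level can be pictured as a lookup from surface words to ontology concepts. The sketch below is a toy illustration only: it does not reproduce SPIN's rule formalism, and the LEXICON entries and concept names are hypothetical. The input is the example question from the handheld GUI slide.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical word-to-concept lexicon; the concept names are invented for illustration.
LEXICON: Dict[str, str] = {
    "who": "Person",
    "red": "RedColour",
    "card": "Card",
    "final": "FinalMatch",
}


def map_words(utterance: str) -> List[Tuple[str, str]]:
    """Return (word, concept) pairs for every token that has a lexicon entry."""
    tokens = re.findall(r"[a-z]+", utterance.lower())  # lower-case alphabetic tokens
    return [(token, LEXICON[token]) for token in tokens if token in LEXICON]


mapped = map_words("Who got a red card in the final?")
```

Words without a lexicon entry ("got", "a", "in", "the") are dropped here. A real understanding component would also combine the word-level concepts into a structured query, which this sketch leaves out.
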
Feedback, correction possibilities, and focus attention
[Figure labels: ASR & Correction; Semantic Paraphrase; Concept Icons]

Language understanding feedback
[Figure labels: Semantic Paraphrase; Semantic Ontology Query]

Handheld GUI Example
- (U) asks simple factoid and inspection questions or commands ("Who got a red card in the final?").
- (S) answers questions by natural language dialogue and multimedia presentations.
- (U) gets feedback.
- (U) searches, explores, and inspects information.
- (U) controls the system.

Technical and Graphical Design
Implicit confirmation and language-independent visualisation
[Figure labels: Technical Design; Graphical Design]

Ontology-based result information for link generation

NaviLink Structures

Portrait and Landscape Orientation

Informal Evaluation and Design Suggestions
1. Audio-repetition of user queries is useless; give only implicit and non-intrusive feedback.
2. Implement text editors with bigger fonts.
3. Editing of Semantic Paraphrase remains a challenge.
4. Building integrated ontological representations for mobile HCIs is very time-consuming (30% additional effort).
5. Ontology engineering has positive side-effects:
   - No presentation without representation
   - Profound data structure design
   - Semantic based input data integration framework
   - Enabler of robust input fusion, NLU, and TG tasks
   -> enables design of natural language QA systems on mobile devices.

Conclusions
- We presented interaction design and implementation for mobile Semantic Web interfaces.
- We (indirectly) motivated Semantic Web data structures for multimodal interaction design and implementation.
Spinning the Semantic Web
Further improvements:
- Incremental presentation of results
- Editing functions via concurrent pen and voice (multimodal query correction)
- More fine-grained co-ordination of multimodal information
- Graph-like visualisation / Semantic Navigation

Thank You!