
IJESR / May 2015 / Vol-5 / Issue-5 — Kodela Divya et al. / International Journal of Engineering & Science Research

EVENT VERIFICATION THROUGH VOICE PROCESS USING ANDROID

Kodela Divya*1, J. Pratibha2
1 M.Tech Scholar, Vignan University, Vadlamudi, Guntur (A.P), India.
2 Asst. Prof., Vignan University, Vadlamudi, Guntur (A.P), India.
*Corresponding Author

ABSTRACT

In this paper, we present the problem of mixed sound event verification in a wireless sensor network for home automation systems. In home automation systems, the sound recognized by the system becomes the basis for performing certain tasks. However, if a target source is mixed with another sound due to simultaneous occurrence, the system generates poor recognition results, subsequently leading to inappropriate responses. Home automation needs to be simple, easy to use and cost effective to be widely acceptable. Wireless home automation today needs to use the latest technology advances in order to be user friendly and powerful, and a lot of work has already been done in this area. In this project, current technology components will be used and home automation will be implemented using communication technologies such as the Internet and speech recognition. The project will research and evaluate different user interface possibilities for home automation on mobile devices. The automation centers on using relatively cheap wireless communication modules. The intended home automation system will control the lights and electrical appliances in a home using voice commands.

Keywords: WSN, VAD, VHD.

INTRODUCTION

In this project, we present the problem of mixed sound event verification in a wireless sensor network for home automation systems. In home automation systems, the sound recognized by the system becomes the basis for performing certain tasks. However, if a target source is mixed with another sound due to simultaneous occurrence, the system generates poor recognition results, subsequently leading to inappropriate responses. To handle such problems, this study proposes a framework, consisting of sound separation and sound verification techniques based on a wireless sensor network (WSN), to realize sound-triggered automation. In the sound separation phase, we present a convolutive blind source separation system with source number estimation using time-frequency clustering. An accurate mixing matrix can be estimated by the proposed phase compensation technique and used for reconstructing the separated sound sources. In the verification phase, Mel frequency cepstral coefficients (MFCCs) and Fisher scores derived from the wavelet packet decomposition of signals are used as features for support vector machines (SVMs). Finally, a sound of interest can be selected for triggering automated services according to the verification result. The experimental results demonstrate the robustness and feasibility of the proposed system for mixed sound verification in WSN-based home environments.

EXISTING SYSTEM

In the existing system, if a target source is mixed with another sound due to simultaneous occurrence, the system generates poor recognition results, subsequently leading to inappropriate responses. We consider the capturing and processing of sounds of interest that are mixed with other sounds. For example, the sound of a doorbell ringing or door knocking captured by sensor nodes is usually mixed with sounds of human speech occurring simultaneously in the same environment.

Limitations
- The existing system generates poor recognition results.
- It does not improve the sound verification performance significantly.
- It is less efficient.
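Before turning to the proposed system, the verification phase outlined in the Introduction (MFCC features classified by a support vector machine) can be made concrete with a short sketch. This is only a minimal illustration, assuming the librosa and scikit-learn Python libraries and a pre-labelled set of audio clips; the paper does not prescribe these tools, and the Fisher-score/wavelet features are omitted for brevity.

```python
# Minimal sketch: decide whether an audio clip contains a sound of interest
# (e.g. a doorbell) using MFCC features and an SVM. Library choices
# (librosa, scikit-learn) are illustrative assumptions, not from the paper.
import numpy as np
import librosa
from sklearn.svm import SVC

def mfcc_features(path, sr=16000, n_mfcc=13):
    """Load a clip and summarise it as the mean and std of its MFCCs."""
    y, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)  # (n_mfcc, frames)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

def train_verifier(clips, labels):
    """clips: list of wav paths; labels: 1 = sound of interest, 0 = other."""
    X = np.vstack([mfcc_features(p) for p in clips])
    clf = SVC(kernel="rbf", probability=True)
    clf.fit(X, labels)
    return clf

def verify(clf, path, threshold=0.5):
    """Return True if the clip is accepted as the target sound event."""
    prob = clf.predict_proba(mfcc_features(path).reshape(1, -1))[0, 1]
    return prob >= threshold
```

In the full system, the separated source rather than the raw mixture would be passed to such a verifier, and a positive result would be what triggers the corresponding automation service.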

PROPOSED SYSTEM

Embodied Conversational Agents (ECAs) are animated virtual characters that emulate human behavior and communication. We consider the development of wireless-enabled smart systems that can achieve seamless monitoring and control of localized devices or device networks with a smart phone, through secure two-way communication between the smart phone and the managed devices. ECA-based mobile applications rely on an external server that performs the processor-intensive tasks, such as speech recognition, language understanding and text-to-speech. The proposed platform is based on free and open source libraries. We developed a prototype installed on a tablet for controlling a home automation system.

Advantages
- The proposed system comprises six components: Voice Activity Detector, Automatic Speech Recognition, Conversational Engine, Control Interface, Text-To-Speech and Virtual Head Animation.
- It provides mobile-based device monitoring and control, which can be applied in both fixed and moving LAN scenarios, such as vehicle electronics and power and energy systems.

ARCHITECTURE

[Architecture diagram: Microphone → Voice Activity Detector → Automatic Speech Recognition → Conversational Engine → Control Interface / Text-To-Speech → Virtual Head Animation]
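To make this architecture concrete, the sketch below wires the six components into a single processing loop. All class and method names here are hypothetical placeholders for the modules described in the following sections; only the data flow (audio → text → action/response → speech and animation) is taken from the paper.

```python
# Hypothetical skeleton of the ECA pipeline shown in the ARCHITECTURE diagram.
# Each class stands in for one module; real implementations would wrap
# SphinxBase (VAD), PocketSphinx (ASR), eSpeak (TTS), and so on.
class VoiceActivityDetector:
    def filter(self, raw_audio):
        """Return only the frames that contain the user's voice."""
        return raw_audio  # placeholder

class SpeechRecognizer:
    def transcribe(self, voiced_audio):
        return "turn on the living room light"  # placeholder hypothesis

class ConversationalEngine:
    def respond(self, text, history):
        history.append(text)
        # Produce a domain action plus a spoken reply and a mood label.
        return {"action": text, "reply": "Okay, done.", "mood": "happy"}

class ControlInterface:
    def execute(self, action):
        print(f"[home automation] executing: {action}")

class TextToSpeech:
    def synthesize(self, reply):
        return reply, [("OU", 90), ("k", 60)]  # (text, phoneme durations in ms)

class VirtualHeadAnimation:
    def render(self, mood, phoneme_durations):
        print(f"[avatar] mood={mood}, visemes for {len(phoneme_durations)} phonemes")

def handle_utterance(raw_audio, history, vad, asr, ce, ci, tts, vha):
    text = asr.transcribe(vad.filter(raw_audio))
    result = ce.respond(text, history)
    ci.execute(result["action"])
    _, durations = tts.synthesize(result["reply"])
    vha.render(result["mood"], durations)
```

On Android, each of these stages would either run on the device or be delegated to an external server, as noted in the Proposed System section.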

LITERATURE SURVEY

1) Humor and Embodied Conversational Agents
This report surveys the role of humor in human-to-human interaction and the possible role of humor in human-computer interaction. The aim is to see whether it is useful for embodied conversational agents to integrate humor capabilities into their internal model of intelligence, emotions and interaction (verbal and nonverbal) capabilities. A current state of the art of research in embodied conversational agents, affective computing and verbal and nonverbal interaction is presented.

2) An Intelligent TV Interface Based on Statistical Dialogue Management
In this paper, we propose an intelligent TV interface using a voice-enabled dialogue system. The paper contributes in two directions: a new type of dialogue management model and its use in practical systems to be commercialized. We devise a practical dialogue management model based on statistical learning methods. To analyze discourse context, we utilize statistical learning techniques for anaphora resolution and discourse history management. Contrary to rule-based systems, we develop an incremental learning method to construct dialogue strategies from the training corpus.

3) Grounded Language Modelling for Automatic Speech Recognition of Sports Video (Michael Fleischman)
This paper describes how grounded language models are learned from large corpora of unlabeled video and applied to the task of automatic speech recognition of sports video. Results show that grounded language models improve perplexity and word error rate over text-based language models and, further, support video information retrieval better than human-generated speech transcriptions.

4) How Was Your Day? An Affective Companion ECA Prototype (Marc Cavazza)
This paper presents a dialogue system in the form of an ECA that acts as a sociable and emotionally intelligent companion for the user. The system dialogue is not task-driven but is social conversation in which the user talks about his/her day at the office. During conversations the system monitors the emotional state of the user and uses that information to inform its dialogue turns. The system is able to respond to spoken interruptions by the user.

MODULES
- Voice Activity Detector
- Automatic Speech Recognition
- Conversational Engine
- Control Interface
- Text-To-Speech
- Virtual Head Animation

Voice Activity Detector
The Voice Activity Detector's (VAD) role is to discriminate the user's voice frames from those containing noise. This module reads the digitized audio samples acquired from a microphone and sends the filtered raw audio to the ASR. The actual implementation of the VAD module is based on the SphinxBase library, which was modified so it can work with the OpenSL ES native audio libraries present on Android.

[Figure: Microphone → VAD → (filtered raw audio) → ASR]

Automatic Speech Recognition
The Automatic Speech Recognition (ASR) module performs speech-to-text conversion. It takes as input the utterances with the user's speech that come from the VAD and sends the resulting text to the CE. In the proposed platform, the ASR module is based on the PocketSphinx speech recognition library.

[Figure: VAD → (speech) → ASR → (text) → CE]
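As an illustration of what the VAD hands to the ASR, the following is a minimal energy-based voice activity detector. It is a simplified stand-in for the modified SphinxBase implementation mentioned above, assuming 16 kHz mono input normalized to [-1, 1]; the frame length and threshold values are arbitrary.

```python
# Simplified energy-based VAD: keep only frames whose short-term energy
# exceeds a noise threshold and hand the surviving audio to the ASR.
# This is an illustrative stand-in for the SphinxBase-based VAD.
import numpy as np

FRAME_MS = 20          # frame length in milliseconds
SAMPLE_RATE = 16000    # assumed microphone sample rate

def voiced_frames(samples, threshold=0.02):
    """samples: float mono signal in [-1, 1]. Yields frames judged as voice."""
    frame_len = SAMPLE_RATE * FRAME_MS // 1000
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = np.sqrt(np.mean(frame ** 2))  # RMS energy of the frame
        if energy > threshold:
            yield frame

def filtered_raw_audio(samples):
    """Concatenate the voiced frames, i.e. the 'filtered raw audio'
    that the VAD module forwards to the ASR."""
    frames = list(voiced_frames(samples))
    return np.concatenate(frames) if frames else np.empty(0, dtype=float)
```

A production VAD would typically add hangover smoothing and an adaptive noise estimate rather than a fixed threshold.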

Conversational Engine
The Conversational Engine (CE) extracts the meaning of the utterance, manages the dialog flow and produces the actions appropriate for the target domain. It generates a response based on the input, the current state of the conversation and the dialog history. Support was also added for an object-oriented database, which can decrease dynamic memory usage at the expense of an increase in response time.

[Figure: ASR → (text) → CE → CI]

Control Interface
The Control Interface (CI) translates the commands said by the user into a format that can be understood by the target applications or services running on the same device or accessible remotely. This module is domain-specific and has to be reimplemented or adapted for every new target application (a minimal command-mapping sketch is given after the Text-To-Speech sketch below).

[Figure: User commands → CI → Target application/service]

Text-To-Speech
The Text-To-Speech (TTS) subsystem carries out the generation of the synthetic output voice from the text that comes as a response from the CE. It also sends the VHA module a list of phonemes with their durations, so that the animation and the artificial speech match up. The TTS module implementation is based on the eSpeak library.

[Figure: CE → (text) → TTS → (synthetic voice, phoneme durations) → VHA]
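A minimal sketch of the TTS step, assuming the eSpeak command-line tool is installed and invoked from Python via subprocess (the platform uses the eSpeak library directly; the exact binding is not specified in the paper). The -x flag prints eSpeak's phoneme mnemonics, which a real implementation would pair with durations for the VHA.

```python
# Illustrative wrapper around the espeak command-line tool.
# A real integration would call the eSpeak library and use its phoneme
# events to obtain per-phoneme durations for lip synchronisation.
import subprocess

def speak(text, voice="en", words_per_minute=150):
    """Synthesize `text` aloud with espeak (blocking call)."""
    subprocess.run(
        ["espeak", "-v", voice, "-s", str(words_per_minute), text],
        check=True,
    )

def phoneme_mnemonics(text, voice="en"):
    """Return espeak's phoneme mnemonics for `text` without producing audio
    (-q suppresses the audio, -x writes the phoneme string to stdout)."""
    result = subprocess.run(
        ["espeak", "-q", "-x", "-v", voice, text],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.split()

if __name__ == "__main__":
    reply = "The living room light is now on"
    print(phoneme_mnemonics(reply))
    speak(reply)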

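The Control Interface described earlier can, in its simplest domain-specific form, be a lookup from recognized phrases to device actions. The device names, the command table and the dispatch function below are all hypothetical; they only illustrate how CE output might be turned into home automation commands such as switching lights and appliances.

```python
# Hypothetical Control Interface: map phrases recognized by the ASR/CE to
# concrete home automation actions. Device identifiers and the send_command
# transport are placeholders for whatever the target system exposes.
COMMAND_TABLE = {
    "turn on the living room light":  ("living_room_light", "ON"),
    "turn off the living room light": ("living_room_light", "OFF"),
    "turn on the fan":                ("bedroom_fan", "ON"),
    "turn off the fan":               ("bedroom_fan", "OFF"),
}

def send_command(device, state):
    """Placeholder transport: print instead of talking to real hardware."""
    print(f"-> {device} set to {state}")

def control_interface(utterance):
    """Translate a recognized utterance into a device command, if any."""
    key = utterance.strip().lower()
    if key in COMMAND_TABLE:
        device, state = COMMAND_TABLE[key]
        send_command(device, state)
        return True
    return False  # unknown command; the CE could ask for clarification

control_interface("Turn on the living room light")
```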
Virtual Head Animation
This module receives as inputs both the mood information from the CE and the list of phoneme durations from the TTS module. By processing these inputs, it generates the visemes (the visual representation of the phonemes) and the facial expression that will be rendered along with the synthetic voice.

[Figure: CE → (mood information) → VHA ← (phoneme durations) ← TTS]

CONCLUSION

The main goal of this work was to describe a platform aimed at developing ECA-based interfaces on hand-held devices equipped with Android. We therefore proposed a possible architecture and gave implementation details for such a platform. The whole platform is based on free and open source libraries, and a first prototype was developed for controlling a home automation system. Future work consists of carrying out experiments with real users to measure the quality, usability and performance of the platform.

REFERENCES

[1] Louwerse MM, Graesser AC, McNamara DS, Lu S. Embodied conversational agents as conversational partners. Applied Cognitive Psychology 2009; 23(9).
[2] De Carolis B, Mazzotta I, Novielli N, Pizzutilo S. Social robots and ECAs for accessing smart environments services. Proceedings of the International Conference on Advanced Visual Interfaces (AVI '10). New York, NY, USA: ACM, 2010.
[3] Cavazza M, Santos de la Cámara R, Turunen M, Relaño Gil J, Hakulinen J, Crook N, Field D. How was your day? An affective companion ECA prototype. Proceedings of the SIGDIAL 2010 Conference. Tokyo, Japan: Association for Computational Linguistics, September 2010.
[4] Oh HJ, Lee CH, Jang MG, Lee KY. An intelligent TV interface based on statistical dialogue management. IEEE Transactions on Consumer Electronics 2007; 53(4).
[5] Santos-Perez M, Gonzalez-Parada E, Cano-García JM. Efficient use of voice activity detector and automatic speech recognition in embedded platforms for natural language interaction. Highlights in Practical Applications of Agents and Multiagent Systems, Advances in Intelligent and Soft Computing. Springer Berlin Heidelberg, 2011; 89.
[6] Santos-Perez M, Gonzalez-Parada E, Cano-García JM. Topic dependent language model switching for embedded automatic speech recognition. Ambient Intelligence - Software and Applications, Advances in Intelligent and Soft Computing. Springer Berlin Heidelberg, 2012; 153.
[7] Santos-Perez M, Gonzalez-Parada E, Cano-García JM. Embedded conversational engine for natural language interaction in Spanish. Proceedings of the Paralinguistic Information and its Integration in Spoken Dialogue Systems Workshop. Springer New York, 2011.
