Developing the Maltese Speech Synthesis Engine

Crimsonwing Research Team

The Maltese Text to Speech Synthesiser
- Crimsonwing (Malta) p.l.c. was awarded the tender to develop the Maltese Text to Speech Synthesiser by the Foundation for Information Technology Accessibility (FITA)
- Project co-financed by the EU's European Regional Development Fund (ERDF, 85%) and national funds (15%)
- Operational Programme I, Cohesion Policy: Investing in Competitiveness for a Better Quality of Life

The Maltese Text to Speech Synthesiser: Features
- 3 different voices: male, female, child
- High quality: studio recorded (44 kHz, 16-bit sound quality)
- Neutral discourse
- Windows SAPI (Speech Application Programming Interface) compliant
- Interoperability with any application that is SAPI compliant (e.g. Window-Eyes)
- Freely available for download
- Available in 2012

Text to Speech (TTS) Synthesis
1st generation (1960s to mid-1980s):
- Formant synthesis
- Articulatory synthesis (based on vocal tract models)
- Robotic sounding
2nd generation (mid-1980s to mid-1990s):
- Concatenative synthesis
- Single instance of each recorded unit
- Heavy DSP (digital signal processing)
- Can suffer from audible glitches at concatenation points
- First work in Maltese TTS falls here (P. Micallef, PhD 1998)
3rd generation (mid-1990s onwards):
- Concatenative synthesis with unit selection
- Multiple instances of each recorded unit
- Choosing the best chain of candidate units
- Less DSP

Evolution of the MSE
Prototype 1:
- Second-generation engine based on the diphones created by Prof. Paul Micallef
- SAPI compliant
Prototype 2:
- Third-generation engine
- One voice (male)
- Limited diphones and lexicon
- No intonation model
Prototype 3:
- Three voices
- Intonation model implemented
- Diphones (100K) and lexicon (30K)

Concatenative Speech Synthesis
What type of units to use for TTS?
- Diphones chosen for the Maltese TTS engine
- Compromise between number of units and co-articulation effects
- Easier to do concatenation at the stationary parts of speech signals
[Figure: the diphone ɛə + b within the phone sequence /d/ /ɛə/ /b/ /ɪ/]
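The compromise between number of units and co-articulation coverage can be illustrated by counting how many distinct units each choice could require in the worst case. The sketch below is only an upper-bound calculation: the phone count of 40 is a placeholder chosen for illustration, not a figure from the slides, and the single extra silence symbol is likewise an assumption.

```python
def unit_inventory_bounds(num_phones: int) -> dict:
    """Upper bounds on distinct unit types for a given phone inventory.

    One silence symbol is added so that units at utterance edges are counted.
    Real languages use far fewer combinations than these bounds allow.
    """
    symbols = num_phones + 1  # phones plus one silence symbol (assumption)
    return {
        "phones": symbols,
        "diphones": symbols ** 2,   # ordered pairs of symbols
        "triphones": symbols ** 3,  # ordered triples of symbols
    }


# 40 is a hypothetical phone count used only to show the growth rate.
bounds = unit_inventory_bounds(40)
print(bounds)
```

Under these assumptions diphones sit between phones (few units, but joins fall in the middle of transitions) and triphones (more context, but a far larger inventory to record).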

Simple Example
- The word jiena is converted to the phonetic form /jɪənɐ/ via lexicon or rules
- It consists of the 4 phones /j/, /ɪə/, /n/, and /ɐ/
- These are grouped into 5 phone pairs (diphones): [_j], [jɪə], [ɪən], [nɐ], and [ɐ_], where _ marks silence
- We need to find the best sequence of diphones, taking into consideration pitch and energy

Concatenative Speech Synthesis
Example utterance: Dan mhux xogħol ħafif, imma jrid isir.
Phonemic transcription: /dɐn mʊʃ ʃɔəl hɐfɪf ɪmmɐ jrɪt ɪsɪr/
- Given some utterance to be synthesised:
  - A phonemic transcription is generated
  - The required prosodic model is generated
- Database with recorded speech, segmented into audio segments (units)
- The given utterance is divided into segments (units) and the best matching units from the database are selected
- The units are concatenated together
- Some DSP is applied to smooth the joins between the units
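The two steps above, splitting a phone sequence into diphones and then choosing the best chain of candidate units, can be sketched as follows. This is a minimal illustration and not the MSE implementation: the phone labels are ASCII stand-ins for the phonetic symbols, the `Unit` fields, the cost functions and the toy database are all assumptions, and the search is a standard Viterbi-style dynamic program that requires every diphone to have at least one candidate.

```python
from dataclasses import dataclass

SIL = "_"  # silence marker at utterance boundaries (notation chosen here)


def to_diphones(phones):
    """Pad the phone sequence with silence and pair each symbol with its successor."""
    padded = [SIL, *phones, SIL]
    return list(zip(padded, padded[1:]))


@dataclass(frozen=True)
class Unit:
    """One recorded instance of a diphone (hypothetical fields)."""
    diphone: tuple
    pitch_hz: float
    energy_db: float


def join_cost(a, b):
    """Penalty for pitch and energy mismatch between two adjacent units."""
    return abs(a.pitch_hz - b.pitch_hz) + abs(a.energy_db - b.energy_db)


def select_units(diphones, database, target_pitch_hz):
    """Pick one candidate per diphone, minimising target cost plus join cost."""
    layers = [database[d] for d in diphones]
    # best[i][k] = (cost of the cheapest chain ending in candidate k, backpointer)
    best = [[(abs(u.pitch_hz - target_pitch_hz), None) for u in layers[0]]]
    for prev_layer, layer in zip(layers, layers[1:]):
        row = []
        for unit in layer:
            cost, back = min(
                (best[-1][j][0] + join_cost(prev, unit), j)
                for j, prev in enumerate(prev_layer)
            )
            row.append((cost + abs(unit.pitch_hz - target_pitch_hz), back))
        best.append(row)
    # Trace the cheapest chain back from the last layer.
    k = min(range(len(best[-1])), key=lambda j: best[-1][j][0])
    chain = []
    for i in range(len(layers) - 1, -1, -1):
        chain.append(layers[i][k])
        k = best[i][k][1]
    return chain[::-1]


phones = ["j", "ie", "n", "a"]  # ASCII stand-ins for the phones of "jiena"
diphones = to_diphones(phones)
# Toy database: two recorded candidates per diphone, values invented.
database = {d: [Unit(d, 120.0, 60.0), Unit(d, 180.0, 70.0)] for d in diphones}
chain = select_units(diphones, database, target_pitch_hz=120.0)
print([(u.diphone, u.pitch_hz) for u in chain])
```

A real engine would derive the target pitch and energy per unit from the prosodic model rather than use a single target value, and would weight the two costs; the structure of the search stays the same.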

Diphone Database
Example text from the recorded speech corpus: Dan mhux xogħol ħafif imma jrid isir. Odin irid debħa mdemma għal kull wieħed mill-ġellieda tiegħu biex iħallih jidħol ġewwa Valħalla. Qalb it-taqlib tal-ħajja tal-bniedem, il-ħolqien sabiħ jindokra lill-
[Figure: recorded speech corpus, diphone database, TTS]
- Quality of synthesised speech is highly dependent on the corpus of recorded speech used to create the diphone database
- Large database required for sufficiently natural-sounding speech (spanning several to tens of hours)
- Large number of diphones needed for unit selection TTS
[Figure: diphone /b/ + /ɐ/]

Diphone Database Creation: Diphone Coverage
- How many of the potential diphones occur in Maltese?
- Which are the most frequent diphones?
- Need statistics on diphone frequency and variation
- Research paper: "Preparation of a Free-Running Text Corpus for Maltese Concatenative Speech Synthesis", presented at the 3rd International Conference on Maltese Linguistics, 8 April 2011
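The coverage questions above come down to counting diphones over a phonemically transcribed corpus. The sketch below shows one way to collect such statistics; the two-utterance corpus, the ASCII phone labels and the definition of "potential diphones" (every ordered pair of symbols except silence followed by silence) are assumptions made for illustration.

```python
from collections import Counter

SIL = "_"  # silence marker at utterance boundaries (notation chosen here)


def diphone_counts(utterances):
    """Count every diphone in a corpus of phone sequences."""
    counts = Counter()
    for phones in utterances:
        if not phones:
            continue  # an empty utterance would only add silence-silence
        padded = [SIL, *phones, SIL]
        counts.update(zip(padded, padded[1:]))
    return counts


def coverage(counts, phone_set):
    """Fraction of potential diphones that occur at least once in the corpus."""
    symbols = set(phone_set) | {SIL}
    potential = len(symbols) ** 2 - 1  # all ordered pairs except silence-silence
    return len(counts) / potential


# Toy corpus with ASCII stand-ins for phones; not real Maltese data.
corpus = [["d", "a", "n"], ["d", "a"]]
counts = diphone_counts(corpus)
print(counts.most_common(3))
print(coverage(counts, {"d", "a", "n"}))
```

Run over a large free-running text corpus, the same counts show which diphones are frequent enough to have many recorded instances and which are rare enough to need dedicated recording prompts.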

Diphone Database Creation: Diphone Cutting
- Manual process
- Performance of automatic diphone segmentation methods is currently limited
- Semi-automatic methods still require manual intervention
- Labour and time intensive

Lexicon
- Phonemic transcription database
- Tool constructed to manage the database
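A lexicon of phonemic transcriptions is typically consulted first, with letter-to-sound rules as a fallback for words it does not contain. The sketch below illustrates that lookup order only: the lexicon entry, the rule table and the ASCII phone labels are placeholders, not actual Maltese transcription rules or the contents of the MSE lexicon, and the fallback is a simple greedy longest-match over graphemes.

```python
# Hypothetical lexicon: word -> phone labels (ASCII stand-ins, not real IPA).
LEXICON = {"jiena": ["j", "ie", "n", "a"]}

# Toy grapheme-to-phone rules; placeholders only, not actual Maltese rules.
TOY_RULES = {"ie": "ie", "a": "a", "d": "d", "j": "j", "n": "n"}


def transcribe(word):
    """Return phone labels for a word: lexicon first, then rule fallback."""
    word = word.lower()
    if word in LEXICON:
        return list(LEXICON[word])
    phones, i = [], 0
    longest = max(len(grapheme) for grapheme in TOY_RULES)
    while i < len(word):
        # Try the longest grapheme first so "ie" wins over single letters.
        for size in range(longest, 0, -1):
            chunk = word[i:i + size]
            if chunk in TOY_RULES:
                phones.append(TOY_RULES[chunk])
                i += len(chunk)
                break
        else:
            raise ValueError(f"no rule for {word[i]!r} in {word!r}")
    return phones


print(transcribe("jiena"))  # found in the lexicon
print(transcribe("dien"))   # not in the lexicon, handled by the toy rules
```

Raising an error for an unknown grapheme, rather than skipping it, makes gaps in the rule table visible while the lexicon is still being built.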

Applications
- Spelli client application packed with MSE
- MSE as a web service
- Online demo on fitamalta.eu
- ispeakmaltese (iPad / iPhone / iPod / Android / Windows Mobile 7)


More information

Care360 Mobile Frequently Asked Questions

Care360 Mobile Frequently Asked Questions Care360 Mobile Frequently Asked Questions Table of Contents Care360 for Mobile Devices... 3 What mobile devices can run Care360?... 3 How do I upgrade one of the supported devices to ios 9.x?... 3 How

More information

Introduction to Speech Synthesis

Introduction to Speech Synthesis IBM TJ Watson Research Center Human Language Technologies Introduction to Speech Synthesis Raul Fernandez fernanra@us.ibm.com IBM Research, Yorktown Heights Outline Ø Introduction and Motivation General

More information

Gender-dependent acoustic models fusion developed for automatic subtitling of Parliament meetings broadcasted by the Czech TV

Gender-dependent acoustic models fusion developed for automatic subtitling of Parliament meetings broadcasted by the Czech TV Gender-dependent acoustic models fusion developed for automatic subtitling of Parliament meetings broadcasted by the Czech TV Jan Vaněk and Josef V. Psutka Department of Cybernetics, West Bohemia University,

More information

Feel the touch. touchscreen interfaces for visually impaired users. 21 November 2016, Pisa. 617AA Tecnologie assistive per la didattica

Feel the touch. touchscreen interfaces for visually impaired users. 21 November 2016, Pisa. 617AA Tecnologie assistive per la didattica Feel the touch touchscreen interfaces for visually impaired users 617AA Tecnologie assistive per la didattica 21 November 2016, Pisa Text Entry on Touchscreen Mobiles Difficult for blind users due to

More information

MARATHI TEXT-TO-SPEECH SYNTHESISYSTEM FOR ANDROID PHONES

MARATHI TEXT-TO-SPEECH SYNTHESISYSTEM FOR ANDROID PHONES International Journal of Advances in Applied Science and Engineering (IJAEAS) ISSN (P): 2348-1811; ISSN (E): 2348-182X Vol. 3, Issue 2, May 2016, 34-38 IIST MARATHI TEXT-TO-SPEECH SYNTHESISYSTEM FOR ANDROID

More information

Acoustic to Articulatory Mapping using Memory Based Regression and Trajectory Smoothing

Acoustic to Articulatory Mapping using Memory Based Regression and Trajectory Smoothing Acoustic to Articulatory Mapping using Memory Based Regression and Trajectory Smoothing Samer Al Moubayed Center for Speech Technology, Department of Speech, Music, and Hearing, KTH, Sweden. sameram@kth.se

More information

Why is Office 365 the right choice?

Why is Office 365 the right choice? Why is Office 365 the right choice? People today want to be productive wherever they go. They want to work faster and smarter across their favorite devices, while staying current and connected. Simply

More information

Microsoft Windows Vista Simplified By Paul McFedries READ ONLINE

Microsoft Windows Vista Simplified By Paul McFedries READ ONLINE Microsoft Windows Vista Simplified By Paul McFedries READ ONLINE May 19, 2008 Microsoft YaHei Regular and Bold Version 5.00 for Windows XP to improve rendering of Simplified Chinese text in Windows Presentation

More information

If you re using a Mac, follow these commands to prepare your computer to run these demos (and any other analysis you conduct with the Audio BNC

If you re using a Mac, follow these commands to prepare your computer to run these demos (and any other analysis you conduct with the Audio BNC If you re using a Mac, follow these commands to prepare your computer to run these demos (and any other analysis you conduct with the Audio BNC sample). All examples use your Workshop directory (e.g. /Users/peggy/workshop)

More information

The office for the anywhere worker!!! Your LCB SOFTPHONE: A powerful new take on the all-in-one for a more immersive experience.

The office for the anywhere worker!!! Your LCB SOFTPHONE: A powerful new take on the all-in-one for a more immersive experience. The office for the anywhere worker!!! Your LCB SOFTPHONE: A powerful new take on the all-in-one for a more immersive experience. LCB SOFTPHONE FOR SALESFORCE Combine real-time communications and tracking

More information

Topics in Linguistic Theory: Laboratory Phonology Spring 2007

Topics in Linguistic Theory: Laboratory Phonology Spring 2007 MIT OpenCourseWare http://ocw.mit.edu 24.910 Topics in Linguistic Theory: Laboratory Phonology Spring 2007 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

More information

A MULTI-RATE SPEECH AND CHANNEL CODEC: A GSM AMR HALF-RATE CANDIDATE

A MULTI-RATE SPEECH AND CHANNEL CODEC: A GSM AMR HALF-RATE CANDIDATE A MULTI-RATE SPEECH AND CHANNEL CODEC: A GSM AMR HALF-RATE CANDIDATE S.Villette, M.Stefanovic, A.Kondoz Centre for Communication Systems Research University of Surrey, Guildford GU2 5XH, Surrey, United

More information

Confidence Measures: how much we can trust our speech recognizers

Confidence Measures: how much we can trust our speech recognizers Confidence Measures: how much we can trust our speech recognizers Prof. Hui Jiang Department of Computer Science York University, Toronto, Ontario, Canada Email: hj@cs.yorku.ca Outline Speech recognition

More information

ESD.051 / Engineering Innovation & Design

ESD.051 / Engineering Innovation & Design ESD.051 / 6.902 Engineering Innovation & Design 1 Principles of Design (1-10) Class 1 2 3 4 5 6 7 8 9 10 Day of Week/ Date W Sept 5 M Sept 10 W Sept 12 M Sept 17 W Sept 19 M Sept 24 W Sept 26 M Oct 1 W

More information

HSBC Talking ATMs. Instructions and Guidance Handbook

HSBC Talking ATMs. Instructions and Guidance Handbook HSBC Talking ATMs Instructions and Guidance Handbook This document provides detailed instructions and guidance on the use of our Talking ATMs. What is a Talking ATM? A Talking ATM is self-service machine

More information

Exam Name: Microsoft Software Testing with Visual Studio 2012

Exam Name: Microsoft Software Testing with Visual Studio 2012 Vendor: Microsoft Exam Code: 70-497 Exam Name: Microsoft Software Testing with Visual Studio 2012 Version: DEMO QUESTION 1 Drag and Drop Question You are using Microsoft Test Manager (MTM) to manage customer

More information

Model TS-04 -W. Wireless Speech Keyboard System 2.0. Users Guide

Model TS-04 -W. Wireless Speech Keyboard System 2.0. Users Guide Model TS-04 -W Wireless Speech Keyboard System 2.0 Users Guide Overview TextSpeak TS-04 text-to-speech speaker and wireless keyboard system instantly converts typed text to a natural sounding voice. The

More information

MOTIV. ios and USB Microphones and Recording Solutions BECAUSE THE WORLD IS YOUR STUDIO.

MOTIV. ios and USB Microphones and Recording Solutions BECAUSE THE WORLD IS YOUR STUDIO. TM MOTIV ios and USB Microphones and Recording Solutions BECAUSE THE WORLD IS YOUR STUDIO. MOTIV for Recording Musicians shure.com/motiv/recording-musician MOTIV for Podcasters shure.com/motiv/podcaster

More information