Speech Recognition. Can you hear me?



Speech Recognition
- Brief overview
- How does it work?
- Types of applications
- Demonstrations
- Talk about the future

What is Speech Recognition?
- Basically just another user interface
- Conversion of spoken words into text and actions
- Very rapidly developing technology
- Recent technological advances have led to many new functional & friendly uses

Why is it Important?
- More natural interaction: you don't have to be trained
- Convenient: simply say what you need
- You can easily respond to prompts & questions (no need to touch devices)
- You don't need reading glasses to use your smartphone
- You can speak much faster than you can type (100+ wpm vs. "How fast can you type?")

Business Uses for SR
- Accessibility: Helps people with physical impairments who can't type
- Education: Helps students quickly transfer ideas onto paper
- Social Services: Helps case workers create documents, email and field reports
- Insurance: Speeds claims input & streamlines report creation in the field
- Financial Services: Minimizes compliance risk by speeding the documentation process & boosting advisor productivity
- Legal: Speeds document turnaround, reduces transcription costs, streamlines repetitive workflows
- Medical: Allows doctors to easily transcribe notes
- Public Safety: Gives officers an easier way to complete administrative work

Personal Uses for SR
- Dictation: Transcribe documents, emails, text messages, social network postings
- Web searches: Retrieve data from the Web
- Translations: Convert one spoken language into another language
- Functions: Set reminders, make phone calls, find directions, etc.
- Control: Launch PC and mobile applications

Development History
- 1950s and 1960s: Baby Talk. Pattern recognition analysis; only digits and 16 words
- 1970s: SR Takes Off. Template-based analysis; 1,000 words
- 1980s: SR Turns Toward Prediction. Statistical analysis (Hidden Markov Models); 5-10k words
- 1990s: It Comes to the Masses. Syntax & semantic analysis; Dragon NaturallySpeaking: 100 wpm
- 2000s: SR Plateaus. Multimodal dialog analysis; Google Search: 230 billion words
- The Future: Accurate, unambiguous speech

History of SR Accuracy
[Chart: recognition accuracy (0% to 90%) by year. Years shown: 1940, 1993, 1995, 1999, 2001. Data labels shown: 1, 10, 48, 81, 81.]

Speech Recognition How does it work?

Speech Recognition Steps
- Step 1: Convert analog signal to digital information
- Step 2: Divide into plosive consonant sounds such as "c", "p", etc.
- Step 3: Match to phonemes in the appropriate language (~50 for American English)
- Step 4: Compare phonemes in context with other phonemes (example: "h eh l ow" becomes "hello")
  Note: Grammatical rules or rules of speech are not used
- Step 5: Statistical algorithms determine the likely outcome (looking at words, sentences, phrases, & the preceding phrase)
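
Steps 3 to 5 can be illustrated with a toy sketch. Everything in it is an assumption made up for illustration: the tiny lexicon, the word-pair counts, and the function name `pick_word` are hypothetical, and real recognizers use statistical models (such as the Hidden Markov Models mentioned earlier) over far larger vocabularies.

```python
# Toy illustration only: the lexicon and the counts are invented for this sketch.

# Steps 3-4: phoneme sequences mapped to the words they could spell out.
LEXICON = {
    ("h", "eh", "l", "ow"): ["hello"],
    ("w", "eh", "r"): ["where", "wear"],   # same sounds, different words
}

# Step 5: hypothetical counts of how often a word follows the preceding word.
# "<s>" stands for "start of sentence".
PAIR_COUNTS = {
    ("<s>", "where"): 30,
    ("<s>", "wear"): 1,
    ("to", "wear"): 40,
    ("to", "where"): 2,
}


def pick_word(phonemes, previous_word="<s>"):
    """Return the most likely word for a phoneme sequence, given the previous word."""
    candidates = LEXICON.get(tuple(phonemes), [])
    if not candidates:
        return None  # sounds not in the lexicon
    # Choose the candidate seen most often after the previous word.
    return max(candidates, key=lambda w: PAIR_COUNTS.get((previous_word, w), 0))


print(pick_word(["h", "eh", "l", "ow"]))              # hello
print(pick_word(["w", "eh", "r"]))                    # where (sentence start)
print(pick_word(["w", "eh", "r"], previous_word="to"))  # wear ("... to wear")
```

The same sounds resolve to different words depending on what came before, which is the role the statistical step plays.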

Voice Input is the Key
- A good microphone is very important
- Desktop/laptop microphones & microphones built into PC devices (such as webcams) are not ideal
- Headset microphones are best for PCs
- Smartphone microphones are very good
- Position of the microphone relative to your mouth is important
- Ambient sounds need to be kept to a minimum

Weaknesses & Flaws
- No system is 100% perfect
- Program needs to clearly "hear" (high signal-to-noise ratio, SNR, desired)
- Different speech patterns & accents may cause problems
- Overlapping speech from multiple users impacts results
- Intensive use of computer power is needed
- Homonyms (words or phrases that sound the same) cause difficulty
  Example: "Recognize speech" vs. "Wreck a nice beach"
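
For reference, SNR is usually quoted in decibels as 10 x log10(signal power / noise power). The sketch below is a minimal illustration, assuming you already have separate lists of speech samples and background-noise samples; the function names and sample values are hypothetical.

```python
import math


def mean_power(samples):
    """Average power of a list of audio samples (mean of the squares)."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(signal, noise):
    """Signal-to-noise ratio in decibels: 10 * log10(P_signal / P_noise)."""
    return 10 * math.log10(mean_power(signal) / mean_power(noise))


# Made-up samples: speech ten times the amplitude of the background noise.
speech = [1.0, -1.0, 1.0, -1.0]
background = [0.1, -0.1, 0.1, -0.1]
print(round(snr_db(speech, background), 1))  # 20.0
```

Ten times the amplitude means one hundred times the power, which is 20 dB. A closer or better microphone raises this number; ambient noise lowers it.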

Speech Recognition Applications

Important Things for Good Voice Recognition
- Have a good microphone
- Position the microphone close to your mouth
- Speak naturally (not too loud or too soft)
- Speak very clearly and distinctly
- For dictation, punctuation seems to help accuracy
- Some apps "learn" and accuracy improves with training

Speech Recognition Apps
- Numerous (100+) Speech Recognition apps are now available
- The number of SR apps & uses is rapidly expanding
- The majority of personal SR apps are for smartphones
- Some are good and some are bad
- Most popular PC apps: Windows Voice, Chrome, Dragon NaturallySpeaking, Google Now (soon)

Top Smartphone Apps
- Dictation
  - Apple iOS: Dragon Dictation (free)
  - Android: ListNote Speech (free)
- Translation
  - Apple iOS: iTranslate (free)
  - Android: Talking Translator (free)
- Personal Assistant
  - Apple iOS: Siri (free)
  - Android: Google Now (free)
Source: appcrawlr.com

PC Windows "Voice"
- Embedded in Vista, Windows 7 & Windows 8
- May also be installed on Windows XP (free)
- To activate: Control Panel -> Ease of Access -> Speech Recognition
- Learns (adapts) to your voice; good tutorial setup
- Can launch programs, dictate & correct text, etc.
- My experience: Not a very good SR application

Nuance's "Dragon"
- PC version: "NaturallySpeaking" ($60-$125)
  - Controls Windows and converts speech to text
  - Learns (adapts) to your voice and written text
  - Claims 98% accuracy
  - Probably the most advanced personal SR product
- Smartphone: Several apps, "Dictation", "Search", "Go!", "Mobile Assistant" (all free)

Dragon "NaturallySpeaking"
Customization to improve accuracy:
- English accent (speech model)
- Computer type (single core vs. multi-core)
- Microphone type (also, is Bluetooth used?)
- Extensive training "Readings"
- Problem word training
- Recognition mode desired ("control"?)

Smartphone Voice Recognition
- Most smartphones have built-in voice recognition (keyboard substitute)
- Many smartphone apps have, or can use, voice recognition
- Personal assistant apps, such as Siri, provide speech understanding and control

Google's Speech Recognition
- A recent pioneer (with Nuance) in Speech Recognition
- Several applications for both PCs & smartphones
  - Search: via Chrome or Web search
  - Voice: voicemail, YouTube transcription
  - Translate: Speech-to-speech translation
  - Actions: via "Voice Actions" and "Now" (mobile only)
- Fast and accurate Speech Recognition

Personal Assistant Apps
- Natural language interface: precise wording not needed
- Interprets what you want to do
- Can take action based on that interpretation
- Current mobile apps typically require some mainframe processing
- Approaching "Artificial Intelligence", where the device perceives its environment and takes action on its own

iOS "Siri"
- Revolutionized the Personal Assistant concept
- Included in the latest Apple iOS devices
- Uses natural language to perform functions
- Voice input only (no keyboard input)
- Commands are first evaluated on the device; if they cannot be handled locally, they are processed by a server in the cloud
- As accurate as, but not as fast as, Google speech recognition

Things "Siri" Can Do
- Place a call
- Send a text message
- Set an alarm
- Get directions
- Check the weather
- Play a tune
- Dictate an e-mail
- Location-based queries
- Launch a Web site
- Ask a question
- Do the math
- Set reminders
- Schedule appointments
- Make reservations

Some Alternatives to "Siri"
- Google Now (free on Android, soon on iOS)
- Vlingo (free on iOS & Android)
- Dragon Go, Mobile Assistant (free on iOS and Android)
- Voice Answer ($3.99 on iOS and Android)
- Voice Control (built-in on earlier iOS devices)
- Speaktoit (free on Android)
- Skyvi (free on Android)
- Indigo (free on Android, Windows Phone 8, Web browser)

Speech Recognition Demonstration

Demo Applications
- Speech to text (Dragon Dictation)
- Language translation (iTranslate)
- Internet search (Google Search)
- Personal Assistant (Siri)

Dictation Commands

Command              Action           Command           Action
new line             new line         apostrophe        '
new paragraph        new paragraph    hyphen            -
tab                  insert tab       percent sign      %
comma                ,                ampersand         &
period               .                asterisk          *
question mark        ?                dollar sign       $
exclamation mark     !                cent sign         ¢
open quote           "                pound sign        #
close quote          "                degree sign       °
open parenthesis     (                forward slash     /
close parenthesis    )                back slash        \
open bracket         [                vertical bar      |
close bracket        ]                i e               i.e.
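
As a rough illustration of how spoken commands like these become punctuation, here is a minimal sketch. It is not how Dragon or any other product works internally: the function name `apply_commands` is hypothetical, only a handful of the commands are included, and the simple find-and-replace would also rewrite ordinary uses of a word such as "period", which real dictation software has to tell apart.

```python
import re

# A small subset of the spoken commands and what each one inserts.
COMMANDS = {
    "new paragraph": "\n\n",
    "new line": "\n",
    "question mark": "?",
    "exclamation mark": "!",
    "percent sign": "%",
    "comma": ",",
    "period": ".",
}


def apply_commands(spoken):
    """Replace spoken commands in recognized text with their symbols."""
    text = spoken
    # Handle longer phrases first so a short command never splits a long one.
    for phrase in sorted(COMMANDS, key=len, reverse=True):
        symbol = COMMANDS[phrase]
        # Also swallow the space before the command so the symbol
        # attaches to the preceding word.
        pattern = r"\s*\b" + re.escape(phrase) + r"\b"
        text = re.sub(pattern, lambda match, s=symbol: s, text)
    # Drop spaces left over at the start of a new line.
    return re.sub(r"\n +", "\n", text)


print(apply_commands("where were you question mark new line I was here period"))
# where were you?
# I was here.
```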

Dictation...
Example: For the rest of the briefing, he pasted a smile on his face, nodded occasionally, and made all the appropriate noises. The truth was, he wasn't listening. He was already forming a new strategy, one that would benefit only him. He berated himself for not having thought along that line before.
- Windows Voice accuracy: 87-92% (best with headset mic)
- Dragon, Siri, Google Voice accuracy: 92-98% (1 to 4 word errors)
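
The 92-98% figure lines up with 1 to 4 word errors in the 51-word example passage (47/51 is about 92%, 50/51 is about 98%). The sketch below shows one common way to score this, counting the word substitutions, insertions and deletions needed to turn the recognized text into the reference. The function names and the misrecognized word are assumptions for illustration; the slides do not say how the errors were counted.

```python
REFERENCE = (
    "For the rest of the briefing, he pasted a smile on his face, "
    "nodded occasionally, and made all the appropriate noises. "
    "The truth was, he wasn't listening. "
    "He was already forming a new strategy, one that would benefit only him. "
    "He berated himself for not having thought along that line before."
)


def word_errors(reference, hypothesis):
    """Minimum number of word substitutions, insertions and deletions."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # Classic edit-distance table, kept one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing from hypothesis
                            curr[j - 1] + 1,      # extra word in hypothesis
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(reference, hypothesis):
    """Fraction of reference words recognized correctly."""
    return 1 - word_errors(reference, hypothesis) / len(reference.split())


# Hypothetical recognizer output with a single misheard word.
heard = REFERENCE.replace("pasted", "pasta")
print(len(REFERENCE.split()))                        # 51
print(round(word_accuracy(REFERENCE, heard) * 100))  # 98
```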

Dictation...
- Example (homonym): "Where were you when I was looking for clothes to wear?"
- Example (proper names): "Do you want to eat at Kacha Thai Bistro tonight?"

Language Translation
- iTranslate (free) & Jibbigo (free/$5)
- iTranslate requires an Internet data connection
- Jibbigo ($5) does not need a data connection
- Demo:
  - "Where is the nearest bathroom?"
  - "How much does it cost?"
  - "That is too much!"
- Travel trick: Google Translate remembers "star" favorites. Enter standard guidebook phrases when you have Internet access, then use Google Translate when you don't have a data connection.

Google Search & Siri Demo
- "How far is the moon?"
- "Where is the nearest steak restaurant?"
- "What will the weather be like tomorrow?"
- "Give me the directions to Deer Ridge Golf Course."
- "Send an e-mail to Phil Goff."
- "What appointments do I have this week?"
- "Google search flight status of Southwest 107?"
- "What is the meaning of life?"

Some Additional Siri Instructions
- What day of the week is November 30, 1980?
- Remind me to pick up milk when I leave here.
- Remind me to get bread the next time I am here.
- What is the current outside temperature?
- Will it rain this morning?
- What time is it in Hong Kong?
- How high did AAPL get today?
- What did the market do today?
- How did the Giants do today?

Speech Recognition Where are we heading?

Auto Applications
- Hands-free control of audio, navigation and climate systems
- Natural-language requests
- Announce incoming calls, read inbound text & e-mail messages and allow you to reply
- Look up directions, suggest restaurants, make reservations, search the Web, shop for you, etc.
- Siri is an example of what can be done
- If not implemented properly, may impact perception of car "quality" (example: MyFord Touch)

"Eyes Free" Siri Support 12 automobile manufacturers have stated they will be incorporating Siri into their vehicles Audi, BMW, Cadillac, Chrysler, Ferrari, GM, Honda, Jaguar, Land Rover, Mercedes, Toyota, Viper... Not clear how it will be implemented and how Android & Blackberry devices will be handled Dedicated steering wheel button(s) may be slow to be implemented

Television Applications
- Speak conversationally to perform functions, get answers or find points of interest
- Voice commands would significantly simplify functions compared to a standard remote
- Examples:
  - Find a specific movie or program
  - Record a program ("Record all episodes of The Good Wife")
  - Learn more about a movie, actor or advertised product
  - Find shows ("List all action movies right now")
- Some new smart TVs have simple (and slow) voice controls

Home Automation
- Wi-Fi connected house with microphones in each room
- Ability to control by simply speaking
- Examples:
  - Set alarm clock
  - Set thermostat (Nest)
  - Clean house with robotic vacuum
  - Turn lights on/off or set program
  - Adjust sprinkler controls
  - Ask for weather forecast
  - Report stock prices

Other Possible Innovative Applications
- Dual translator headset or phone
- Google "Glass"
- Apple "iWatch" (??)
- Customer satisfaction detection by "tone of voice"
- Lie detection by "stress analysis" (Russian ATM example)
- Only limited by imagination...

The End a2cat.sirinc2.org

Supplemental Information

Some Siri Tips
- To use Google Maps instead of Apple Maps, say "Give me directions to xxxx via transit"
- To get Google answers instead of Apple answers, start the statement with the word "Google", e.g., "Google flight status of Southwest 105"
- To use Siri through your auto Bluetooth, select the speaker icon on the bottom right of the Siri screen
- Private IMDb search, e.g., "What movies star both Meryl Streep and Tommy Lee Jones?"
- Get movie reviews, e.g., "What was the movie review for Burn After Reading?"
- Examples of things you can do with Siri: http://m.tuaw.com/2012/09/14/what-can-you-say-to-siri-in-ios-6/

Fun things to ask Siri What are you? How are you? Where are you? What do you look like? Why am I here? Tell me a story. Will you marry me? Sing a song. Where are you from? How old are you? How old am I? Tell me a joke. Knock Knock What is the meaning of life? I love you. Do you love me? Are you funny? What is your mother's name? I am drunk. I have to go to the bathroom. Merry Christmas! What is your favorite color? What is my name? I am tired. Testing. Testing, testing. What are you doing? Who is your favorite person?