Knowledge-Based Word Lattice Rescoring in a Dynamic Context. Todd Shore, Friedrich Faubel, Hartmut Helmke, Dietrich Klakow
|
|
- Ethelbert Snow
- 5 years ago
- Views:
Transcription
1 Knowledge-Based Word Lattice Rescoring in a Dynamic Context Todd Shore, Friedrich Faubel, Hartmut Helmke, Dietrich Klakow
2 Section I Motivation
3 Motivation Problem: difficult to incorporate higher-level knowledge sources into automatic speech recognition (ASR)
4 Motivation Problem: difficult to incorporate higher-level knowledge sources into automatic speech recognition (ASR) However: there are domains in which the situational context of utterances is available, e.g. air traffic control (ATC) or command and control tasks
5 Motivation Problem: difficult to incorporate higher-level knowledge sources into automatic speech recognition (ASR) However: there are domains in which the situational context of utterances is available, e.g. air traffic control (ATC) or command and control tasks Approach taken in this work: incorporate contextual knowledge into ASR by rescoring word lattice output of an ATC task
6 Section II The Air Traffic Control Task
7 The Air Traffic Control Task Air traffic controllers at their workstations
8 The Air Traffic Control Task Air traffic controllers at their workstations Primary objective: maintain aircraft separation safely guide approaching aircraft to their runway threshold integrate departing and passing aircraft
9 The Air Traffic Control Task Air traffic controllers at their workstations Have access to: radar screens revealing aircraft positions and speeds flight plans weather reports, indicators for speed of wind, etc.
10 The Air Traffic Control Task Air traffic controllers at their workstations Implementation: issue verbal commands to aircraft pilots use of a standardized subset of English which is formally specified by the International Civil Aviation Organization (ICAO)
11 The Air Traffic Control Task ICAO Phraseology for ATC commands comprises single aircraft callsign (identifying the aircraft) goal actions to execute goal values to be achieved
12 The Air Traffic Control Task ICAO Phraseology for ATC commands comprises single aircraft callsign (identifying the aircraft) goal actions to execute goal values to be achieved
13 The Air Traffic Control Task ICAO Phraseology for ATC commands comprises single aircraft callsign (identifying the aircraft) goal actions to execute goal values to be achieved Example: Delta four three niner turn right heading two two zero
14 The Air Traffic Control Task ICAO Phraseology for ATC commands comprises single aircraft callsign (identifying the aircraft) goal actions to execute goal values to be achieved Example: Delta four three niner turn right heading two two zero Relevant Information: DL 439 TURN, DIR=right, HDG=220
15 The Air Traffic Control Task Use of a recognition grammar is sufficient for recognizing ICAO phraseology simplifies extraction of semantic content
16 The Air Traffic Control Task Use of a recognition grammar is sufficient for recognizing ICAO phraseology simplifies extraction of semantic content Example: Delta four three niner turn right heading two two zero
17 The Air Traffic Control Task Use of a recognition grammar is sufficient for recognizing ICAO phraseology simplifies extraction of semantic content Example: Delta four three niner turn right heading two two zero XML-tags for easy extraction of semantic frames: <callsign> delta four three niner </callsign> <cmd_turn> turn </cmd_turn> <direction> right </direction> heading <heading> two two zero </heading>
18 The Air Traffic Control Task Use of a recognition grammar is sufficient for recognizing ICAO phraseology simplifies extraction of semantic content Example: Delta four three niner turn right heading two two zero XML-tags for easy extraction of semantic frames: <callsign> delta four three niner </callsign> <cmd_turn> turn </cmd_turn> <direction> right </direction> heading <heading> two two zero </heading>
19 The Air Traffic Control Task Use of a recognition grammar is sufficient for recognizing ICAO phraseology simplifies extraction of semantic content Example: Delta four three niner turn right heading two two zero XML-tags for easy extraction of semantic frames: <callsign> delta four three niner </callsign> <cmd_turn> turn </cmd_turn> <direction> right </direction> heading <heading> two two zero </heading>
20 The Air Traffic Control Task Use of a recognition grammar is sufficient for recognizing ICAO phraseology simplifies extraction of semantic content Example: Delta four three niner turn right heading two two zero XML-tags for easy extraction of semantic frames: <callsign> delta four three niner </callsign> <cmd_turn> turn </cmd_turn> <direction> right </direction> heading <heading> two two zero </heading>
21 The Air Traffic Control Task Use of a recognition grammar is sufficient for recognizing ICAO phraseology simplifies extraction of semantic content Example: Delta four three niner turn right heading two two zero XML-tags for easy extraction of semantic frames: <callsign> delta four three niner </callsign> <cmd_turn> turn </cmd_turn> <direction> right </direction> heading <heading> two two zero </heading>
22 The Air Traffic Control Task Main idea of this work: rescore recognized utterances through use of context knowledge about the situation in the airspace
23 The Air Traffic Control Task Main idea of this work: rescore recognized utterances through use of context knowledge about the situation in the airspace But where does this information come from?
24 The Air Traffic Control Task Modernized ATC workspace as envisioned by the German Aerospace Center (DLR)
25 The Air Traffic Control Task including the arrival manager 4D-CARMA, which assists controllers in managing aircraft arrivals
26 The Air Traffic Control Task including the arrival manager 4D-CARMA, which assists controllers in managing aircraft arrivals This system allows us to extract: the callsigns of aircraft in the airspace the aircraft positions relative to the radar Their speeds, altitudes, climb/descend rates, reduce rates, etc.
27 Section III Knowledge-Based Lattice Rescoring
28 Knowledge-Based Lattice Rescoring WFST Decoder: can be used to generate phone-to-word transducer lattice during Viterbi search
29 Knowledge-Based Lattice Rescoring WFST Decoder: can be used to generate phone-to-word transducer lattice during Viterbi search eight EY -b(100) q 167 /callsign ε callsign ε q21 q22 speedbird S -b(141) q 23 two T -b(175) q 307 /callsign ε three T -b(67) q 233 /callsign ε
30 Knowledge-Based Lattice Rescoring WFST Decoder: can be used to generate phone-to-word transducer lattice during Viterbi search This allows us to directly extract semantic frames by simply adding a pointer to a semantic template structure, which is filled on the fly
31 Knowledge-Based Lattice Rescoring WFST Decoder: can be used to generate phone-to-word transducer lattice during Viterbi search This allows us to directly extract semantic frames by simply adding a pointer to a semantic template structure, which is filled on the fly callsign q q21 22 speedbird q 23 eight q 167 /callsign
32 Knowledge-Based Lattice Rescoring WFST Decoder: can be used to generate phone-to-word transducer lattice during Viterbi search This allows us to directly extract semantic frames by simply adding a pointer to a semantic template structure, which is filled on the fly callsign q q21 22 speedbird q 23 eight q 167 /callsign semantic template callsign:
33 Knowledge-Based Lattice Rescoring WFST Decoder: can be used to generate phone-to-word transducer lattice during Viterbi search This allows us to directly extract semantic frames by simply adding a pointer to a semantic template structure, which is filled on the fly callsign q q21 22 speedbird q 23 eight q 167 /callsign semantic template callsign: BA
34 Knowledge-Based Lattice Rescoring WFST Decoder: can be used to generate phone-to-word transducer lattice during Viterbi search This allows us to directly extract semantic frames by simply adding a pointer to a semantic template structure, which is filled on the fly callsign q q21 22 speedbird q 23 eight q 167 /callsign semantic template callsign: BA 8
35 Knowledge-Based Lattice Rescoring WFST Decoder: can be used to generate phone-to-word transducer lattice during Viterbi search This allows us to directly extract semantic frames by simply adding a pointer to a semantic template structure, which is filled on the fly callsign q q21 22 speedbird q 23 eight q 167 /callsign semantic template callsign: BA 8
36 Knowledge-Based Lattice Rescoring WFST Decoder: can be used to generate phone-to-word transducer lattice during Viterbi search Principal Idea: penalize invalid callsigns and unlikely command values based on discourse representation system semantic template callsign: BA 8 discourse representation
37 Knowledge-Based Lattice Rescoring WFST Decoder: can be used to generate phone-to-word transducer lattice during Viterbi search Principal Idea: penalize invalid callsigns and unlikely command values based on discourse representation system semantic template callsign: BA 8 check if exists discourse representation
38 Knowledge-Based Lattice Rescoring WFST Decoder: can be used to generate phone-to-word transducer lattice during Viterbi search Principal Idea: penalize invalid callsigns and unlikely command values based on discourse representation system semantic template callsign: BA 8 check if exists if no apply penalty
39 Knowledge-Based Lattice Rescoring WFST Decoder: can be used to generate phone-to-word transducer lattice during Viterbi search Principal Idea: penalize invalid callsigns and unlikely command values based on discourse representation system semantic template callsign: BA 8 q 167 : /callsign edge information - start / end time - acoustic / LM score q 273
40 Knowledge-Based Lattice Rescoring WFST Decoder: can be used to generate phone-to-word transducer lattice during Viterbi search Principal Idea: penalize invalid callsigns and unlikely command values based on discourse representation system semantic template callsign: BA 8 q 167 : /callsign edge information - start / end time - acoustic / LM score - knowledge score q 273
41 Knowledge-Based Lattice Rescoring WFST Decoder: can be used to generate phone-to-word transducer lattice during Viterbi search Principal Idea: penalize invalid callsigns and unlikely command values based on discourse representation system semantic template callsign: BA 8 discourse representation
42 Knowledge-Based Lattice Rescoring WFST Decoder: can be used to generate phone-to-word transducer lattice during Viterbi search Principal Idea: penalize invalid callsigns and unlikely command values based on discourse representation system semantic template callsign: BA 8 cmd: REDUCE val: SPD=220 discourse representation
43 Knowledge-Based Lattice Rescoring WFST Decoder: can be used to generate phone-to-word transducer lattice during Viterbi search Principal Idea: penalize invalid callsigns and unlikely command values based on discourse representation system semantic template callsign: BA 8 cmd: REDUCE val: SPD=220 check speed discourse representation
44 Knowledge-Based Lattice Rescoring WFST Decoder: can be used to generate phone-to-word transducer lattice during Viterbi search Principal Idea: penalize invalid callsigns and unlikely command values based on discourse representation system semantic template callsign: BA 8 cmd: REDUCE val: SPD=220 check speed apply penalty
45 Knowledge-Based Lattice Rescoring WFST Decoder: can be used to generate phone-to-word transducer lattice during Viterbi search Principal Idea: penalize invalid callsigns and unlikely command values based on discourse representation system semantic template callsign: BA 8 cmd: REDUCE val: SPD=220 kn q 729 : /speed ( q, q ) c i j K k 1 k cost k q 785 (callsign,cmd,val)
46 Knowledge-Based Lattice Rescoring WFST Decoder: can be used to generate phone-to-word transducer lattice during Viterbi search Principal Idea: penalize invalid callsigns and unlikely command values based on discourse representation system semantic template callsign: BA 8 cmd: REDUCE val: SPD=220 kn q 729 : /speed ( q, q ) c i j K k 1 k cost k q 785 (callsign,cmd,val) penalty score
47 Knowledge-Based Lattice Rescoring WFST Decoder: can be used to generate phone-to-word transducer lattice during Viterbi search Principal Idea: penalize invalid callsigns and unlikely command values based on discourse representation system semantic template callsign: BA 8 cmd: REDUCE val: SPD=220 kn q 729 : /speed ( q, q ) c i j K k 1 k cost k q 785 (callsign,cmd,val) constraint penalty function c k ( ) [0,1]
48 Knowledge-Based Lattice Rescoring WFST Decoder: can be used to generate phone-to-word transducer lattice during Viterbi search Principal Idea: penalize invalid callsigns and unlikely command values based on discourse representation system semantic template callsign: BA 8 cmd: REDUCE val: SPD=220 q 729 : /speed total cost: q 785 ( ) () () () ac ac lm lm kn kn
49 Knowledge-Based Lattice Rescoring WFST Decoder: can be used to generate phone-to-word transducer lattice during Viterbi search Rescoring: in analogy to rescoring with different language model scales and word insertion penalties
50 Knowledge-Based Lattice Rescoring WFST Decoder: can be used to generate phone-to-word transducer lattice during Viterbi search Rescoring: in analogy to rescoring with different language model scales and word insertion penalties 1 Propagate scores along the nodes
51 Knowledge-Based Lattice Rescoring WFST Decoder: can be used to generate phone-to-word transducer lattice during Viterbi search Rescoring: in analogy to rescoring with different language model scales and word insertion penalties callsign q q21 22 speedbird q 23 eight q 167 /callsign token - acscore - lmscore - knscore
52 Knowledge-Based Lattice Rescoring WFST Decoder: can be used to generate phone-to-word transducer lattice during Viterbi search Rescoring: in analogy to rescoring with different language model scales and word insertion penalties callsign q q21 22 speedbird q 23 eight q 167 /callsign token - acscore - lmscore - knscore acscore ac lmscore lm knscore kn ac lm kn () () [? lp] ( )
53 Knowledge-Based Lattice Rescoring WFST Decoder: can be used to generate phone-to-word transducer lattice during Viterbi search Rescoring: in analogy to rescoring with different language model scales and word insertion penalties callsign q q21 22 speedbird q 23 eight q 167 /callsign token - acscore - lmscore - knscore acscore ac lmscore lm knscore kn ac lm kn () () [? lp] ( )
54 Knowledge-Based Lattice Rescoring WFST Decoder: can be used to generate phone-to-word transducer lattice during Viterbi search Rescoring: in analogy to rescoring with different language model scales and word insertion penalties 1 Propagate scores along the nodes
55 Knowledge-Based Lattice Rescoring WFST Decoder: can be used to generate phone-to-word transducer lattice during Viterbi search Rescoring: in analogy to rescoring with different language model scales and word insertion penalties 1 Propagate scores along the nodes 2 Trace back from best final state
56 Section IV Rescoring Experiments
57 Rescoring Experiments Corpus: recorded using the 4D-CARMA software from DLR includes aircraft state vectors (5 second intervals) total of 1,107 ATC commands 9.5 words per sentence approx. 100 minutes of speech
58 Rescoring Experiments GUI used by the participants
59 Rescoring Experiments Recognition Grammar: branching factor of ,220 unique callsigns sentence perplexity of 3.9E+09
60 Rescoring Experiments Recognition Grammar: branching factor of ,220 unique callsigns sentence perplexity of 3.9E+09 Used Constraints: callsign speed altitude
61 Rescoring Experiments Recognition Grammar: branching factor of ,220 unique callsigns sentence perplexity of 3.9E+09 Used Constraints: callsign speed altitude Rescoring WER SER MRR None (baseline) Callsign Callsign, Spd, Alt Oracle
62 Section V Conclusions
63 Conclusions We have shown how dynamic context knowledge can profitably be used for rescoring ASR hypotheses
64 Conclusions We have shown how dynamic context knowledge can profitably be used for rescoring ASR hypotheses This is of particular interest in scenarios where explicit context information is available, such as ATC: radar-derived aircraft state vectors video games virtual reality
65 Thank you very much for your attention!
66 The Air Traffic Control Task XML-tags for easy extraction of semantic frames: <callsign> delta four three niner </callsign> <cmd_turn> turn </cmd_turn> <direction> right </direction> heading <heading> two two zero </heading>
67 The Air Traffic Control Task XML-tags for easy extraction of semantic frames: <callsign> delta four three niner </callsign> <cmd_turn> turn </cmd_turn> <direction> right </direction> heading <heading> two two zero </heading> Can easily be extended to class-based N-grams [callsign] [cmd_turn] [direction] heading [heading].
68 The Air Traffic Control Task XML-tags for easy extraction of semantic frames: <callsign> delta four three niner </callsign> <cmd_turn> turn </cmd_turn> <direction> right </direction> heading <heading> two two zero </heading> Can easily be extended to class-based N-grams [callsign] [cmd_turn] [direction] heading [heading].
69 The Air Traffic Control Task XML-tags for easy extraction of semantic frames: <callsign> delta four three niner </callsign> <cmd_turn> turn </cmd_turn> <direction> right </direction> heading <heading> two two zero </heading> Can easily be extended to class-based N-grams [callsign] [cmd_turn] [direction] heading [heading].
70 The Air Traffic Control Task XML-tags for easy extraction of semantic frames: <callsign> delta four three niner </callsign> <cmd_turn> turn </cmd_turn> <direction> right </direction> heading <heading> two two zero </heading> Can easily be extended to class-based N-grams [callsign] [cmd_turn] [direction] heading [heading].
71 The Air Traffic Control Task XML-tags for easy extraction of semantic frames: <callsign> delta four three niner </callsign> <cmd_turn> turn </cmd_turn> <direction> right </direction> heading <heading> two two zero </heading> Can easily be extended to class-based N-grams [callsign] [cmd_turn] [direction] heading [heading].
Automatic Speech Recognition (ASR)
Automatic Speech Recognition (ASR) February 2018 Reza Yazdani Aminabadi Universitat Politecnica de Catalunya (UPC) State-of-the-art State-of-the-art ASR system: DNN+HMM Speech (words) Sound Signal Graph
More informationMemory-Efficient Heterogeneous Speech Recognition Hybrid in GPU-Equipped Mobile Devices
Memory-Efficient Heterogeneous Speech Recognition Hybrid in GPU-Equipped Mobile Devices Alexei V. Ivanov, CTO, Verbumware Inc. GPU Technology Conference, San Jose, March 17, 2015 Autonomous Speech Recognition
More informationSemantic Word Embedding Neural Network Language Models for Automatic Speech Recognition
Semantic Word Embedding Neural Network Language Models for Automatic Speech Recognition Kartik Audhkhasi, Abhinav Sethy Bhuvana Ramabhadran Watson Multimodal Group IBM T. J. Watson Research Center Motivation
More informationSpeech Tuner. and Chief Scientist at EIG
Speech Tuner LumenVox's Speech Tuner is a complete maintenance tool for end-users, valueadded resellers, and platform providers. It s designed to perform tuning and transcription, as well as parameter,
More informationContents. Resumen. List of Acronyms. List of Mathematical Symbols. List of Figures. List of Tables. I Introduction 1
Contents Agraïments Resum Resumen Abstract List of Acronyms List of Mathematical Symbols List of Figures List of Tables VII IX XI XIII XVIII XIX XXII XXIV I Introduction 1 1 Introduction 3 1.1 Motivation...
More informationA Weighted Finite State Transducer Implementation of the Alignment Template Model for Statistical Machine Translation.
A Weighted Finite State Transducer Implementation of the Alignment Template Model for Statistical Machine Translation May 29, 2003 Shankar Kumar and Bill Byrne Center for Language and Speech Processing
More informationE.ATIS ENHANCED AUTOMATED TERMINAL INFORMATION SYSTEM
E.ATIS ENHANCED AUTOMATED TERMINAL INFORMATION SYSTEM ATIS/VOLMET SYSTEM 1 1. Introduction The E.ATIS is a state of the art automatic system that converts selected meteorological products and air traffic
More informationarxiv: v1 [cs.cl] 30 Jan 2018
ACCELERATING RECURRENT NEURAL NETWORK LANGUAGE MODEL BASED ONLINE SPEECH RECOGNITION SYSTEM Kyungmin Lee, Chiyoun Park, Namhoon Kim, and Jaewon Lee DMC R&D Center, Samsung Electronics, Seoul, Korea {k.m.lee,
More informationOverview. Search and Decoding. HMM Speech Recognition. The Search Problem in ASR (1) Today s lecture. Steve Renals
Overview Search and Decoding Steve Renals Automatic Speech Recognition ASR Lecture 10 January - March 2012 Today s lecture Search in (large vocabulary) speech recognition Viterbi decoding Approximate search
More informationHierarchical Phrase-Based Translation with WFSTs. Weighted Finite State Transducers
Hierarchical Phrase-Based Translation with Weighted Finite State Transducers Gonzalo Iglesias 1 Adrià de Gispert 2 Eduardo R. Banga 1 William Byrne 2 1 Department of Signal Processing and Communications
More informationSpeech Technology Using in Wechat
Speech Technology Using in Wechat FENG RAO Powered by WeChat Outline Introduce Algorithm of Speech Recognition Acoustic Model Language Model Decoder Speech Technology Open Platform Framework of Speech
More informationReducing Controller Workload with Automatic Speech Recognition
Reducing Controller Workload with Automatic Speech Recognition Hartmut Helmke, Oliver Ohneiser, Thorsten Mühlhausen, Matthias Wies German Aerospace Center (DLR) Institute of Flight Guidance Lilienthalplatz
More informationLattice Rescoring for Speech Recognition Using Large Scale Distributed Language Models
Lattice Rescoring for Speech Recognition Using Large Scale Distributed Language Models ABSTRACT Euisok Chung Hyung-Bae Jeon Jeon-Gue Park and Yun-Keun Lee Speech Processing Research Team, ETRI, 138 Gajeongno,
More informationTHE CUED NIST 2009 ARABIC-ENGLISH SMT SYSTEM
THE CUED NIST 2009 ARABIC-ENGLISH SMT SYSTEM Adrià de Gispert, Gonzalo Iglesias, Graeme Blackwood, Jamie Brunning, Bill Byrne NIST Open MT 2009 Evaluation Workshop Ottawa, The CUED SMT system Lattice-based
More informationSparse Non-negative Matrix Language Modeling
Sparse Non-negative Matrix Language Modeling Joris Pelemans Noam Shazeer Ciprian Chelba joris@pelemans.be noam@google.com ciprianchelba@google.com 1 Outline Motivation Sparse Non-negative Matrix Language
More informationCMU Sphinx: the recognizer library
CMU Sphinx: the recognizer library Authors: Massimo Basile Mario Fabrizi Supervisor: Prof. Paola Velardi 01/02/2013 Contents 1 Introduction 2 2 Sphinx download and installation 4 2.1 Download..........................................
More informationATCoach Air Traffic Control Simulator
A ATCoach TCoach is an advanced, multi-purpose, real-time system from UFA. In a single package, ATCoach provides functionality for both stand-alone and embedded simulation environments that support a wide
More informationDiscriminative Training of Decoding Graphs for Large Vocabulary Continuous Speech Recognition
Discriminative Training of Decoding Graphs for Large Vocabulary Continuous Speech Recognition by Hong-Kwang Jeff Kuo, Brian Kingsbury (IBM Research) and Geoffry Zweig (Microsoft Research) ICASSP 2007 Presented
More information1 SOFTWARE INSTALLATION
Welcome to the IVAO Belgium division! This guide is intended to make sure you know your way around the network and get the software ready and connected to accomplish what you are looking for: fly with
More informationBASIC IVAP FUNCTIONS FOR FS
1. Introduction BASIC IVAP FUNCTIONS FOR FS In this document, you will be instructed how to use IvAp (IVAOs pilots tool) properly. This is just a short overview of the basics required for the PP exam.
More informationHands-Free Internet using Speech Recognition
Introduction Trevor Donnell December 7, 2001 6.191 Preliminary Thesis Proposal Hands-Free Internet using Speech Recognition The hands-free Internet will be a system whereby a user has the ability to access
More informationWeighted Finite State Transducers in Automatic Speech Recognition
Weighted Finite State Transducers in Automatic Speech Recognition ZRE lecture 10.04.2013 Mirko Hannemann Slides provided with permission, Daniel Povey some slides from T. Schultz, M. Mohri and M. Riley
More informationSpeech Applications. How do they work?
Speech Applications How do they work? What is a VUI? What the user interacts with when using a speech application VUI Elements Prompts or System Messages Prerecorded or Synthesized Grammars Define the
More informationIntroduction to HTK Toolkit
Introduction to HTK Toolkit Berlin Chen 2003 Reference: - The HTK Book, Version 3.2 Outline An Overview of HTK HTK Processing Stages Data Preparation Tools Training Tools Testing Tools Analysis Tools Homework:
More informationHeadset plugs earphones (left), microphone (right)
Aerodrome control tower simulator [pd] Deployment - connect the power supply plug - connect the USB mouse and possibly a USB keyboard - connect the headset (pseudopilot) or the loudspeakers (student) -
More information5/4/2016 ht.ttlms.com
Order Template Screen 1 Free Form Lesson Overview 2 Free Form Performance Based CNS Requirements 3 Free Form Performance Based CNS Requirements 4 Single Answer Knowledge Check 5 Free Form Related ICAO
More informationLearning The Lexicon!
Learning The Lexicon! A Pronunciation Mixture Model! Ian McGraw! (imcgraw@mit.edu)! Ibrahim Badr Jim Glass! Computer Science and Artificial Intelligence Lab! Massachusetts Institute of Technology! Cambridge,
More informationTwo-Year Academic Course Schedule - Fall 2017 through Summer Program Director: Eric Himler Campus Phone :
AVT 101 Private Pilot Ground School 4.00 GR M GR M GR M GR M AVT 102 Private Pilot Flight 4.00 O ARR O ARR O ARR O ARR AVT 103 Intro to ATC 3.00 GR M GR M GR M GR M AVT 105 Aviation Meteorology 4.00 GR
More informationGoogle Search by Voice
Language Modeling for Automatic Speech Recognition Meets the Web: Google Search by Voice Ciprian Chelba, Johan Schalkwyk, Boulos Harb, Carolina Parada, Cyril Allauzen, Leif Johnson, Michael Riley, Peng
More informationLexicographic Semirings for Exact Automata Encoding of Sequence Models
Lexicographic Semirings for Exact Automata Encoding of Sequence Models Brian Roark, Richard Sproat, and Izhak Shafran {roark,rws,zak}@cslu.ogi.edu Abstract In this paper we introduce a novel use of the
More informationATCOSIM - Air Traffic Control Simulation Speech Corpus Validation Report
ATCOSIM - Air Traffic Control Simulation Speech Corpus Validation Report Stefan Petrik October 31, 2007 Abstract The ATCOSIM speech corpus provided by Eurocontrol Experimental Centre has been validated
More informationAn Ultra Low-Power Hardware Accelerator for Automatic Speech Recognition
An Ultra Low-Power Hardware Accelerator for Automatic Speech Recognition Reza Yazdani, Albert Segura, Jose-Maria Arnau, Antonio Gonzalez Computer Architecture Department, Universitat Politecnica de Catalunya
More informationWeighted Finite State Transducers in Automatic Speech Recognition
Weighted Finite State Transducers in Automatic Speech Recognition ZRE lecture 15.04.2015 Mirko Hannemann Slides provided with permission, Daniel Povey some slides from T. Schultz, M. Mohri, M. Riley and
More informationTEAMSPEAK 2 BASICS. 1. Introduction. 2. Use in IVAO. 3. Equipment need
TEAMSPEAK 2 BASICS 1. Introduction TeamSpeak 2 is the unique bi directional audio communication software allowed in IVAO Network. TeamSpeak 2 simulates an aircraft or ATC radio transceiver. TeamSpeak 3
More informationConstrained Discriminative Training of N-gram Language Models
Constrained Discriminative Training of N-gram Language Models Ariya Rastrow #1, Abhinav Sethy 2, Bhuvana Ramabhadran 3 # Human Language Technology Center of Excellence, and Center for Language and Speech
More informationCUED-RNNLM An Open-Source Toolkit for Efficient Training and Evaluation of Recurrent Neural Network Language Models
CUED-RNNLM An Open-Source Toolkit for Efficient Training and Evaluation of Recurrent Neural Network Language Models Xie Chen, Xunying Liu, Yanmin Qian, Mark Gales and Phil Woodland April 1, 2016 Overview
More informationRLAT Rapid Language Adaptation Toolkit
RLAT Rapid Language Adaptation Toolkit Tim Schlippe May 15, 2012 RLAT Rapid Language Adaptation Toolkit - 2 RLAT Rapid Language Adaptation Toolkit RLAT Rapid Language Adaptation Toolkit - 3 Outline Introduction
More informationCRCT Capabilities Detailed Functional Description
MTR 00W0000302 MITRE TECHNICAL REPORT CRCT Capabilities Detailed Functional Description March 2001 Dr. Laurel S. Rhodes Lowell R. Rhodes Emily K. Beaton 2001 The MITRE Corporation Approved for public release;
More informationACT Automated Clearance Tool: Improving the Diplomatic Clearance Process for AMC
ACT Automated Clearance Tool: Improving the Diplomatic Clearance Process for AMC Alice M. Mulvehill Brett Benyo David Rager - BBN Technologies Edward DePalma - Air Force Research Laboratory June 2004 ACT
More informationLecture 9. LVCSR Decoding (cont d) and Robustness. Michael Picheny, Bhuvana Ramabhadran, Stanley F. Chen
Lecture 9 LVCSR Decoding (cont d) and Robustness Michael Picheny, huvana Ramabhadran, Stanley F. Chen IM T.J. Watson Research Center Yorktown Heights, New York, USA {picheny,bhuvana,stanchen}@us.ibm.com
More informationAndi Buzo, Horia Cucu, Mihai Safta and Corneliu Burileanu. Speech & Dialogue (SpeeD) Research Laboratory University Politehnica of Bucharest (UPB)
Andi Buzo, Horia Cucu, Mihai Safta and Corneliu Burileanu Speech & Dialogue (SpeeD) Research Laboratory University Politehnica of Bucharest (UPB) The MediaEval 2012 SWS task A multilingual, query by example,
More informationA study of large vocabulary speech recognition decoding using finite-state graphs 1
A study of large vocabulary speech recognition decoding using finite-state graphs 1 Zhijian OU, Ji XIAO Department of Electronic Engineering, Tsinghua University, Beijing Corresponding email: ozj@tsinghua.edu.cn
More informationArtificial Intelligence for Real Airplanes
Artificial Intelligence for Real Airplanes FAA-EUROCONTROL Technical Interchange Meeting / Agency Research Team Machine Learning and Artificial Intelligence Chip Meserole Sherry Yang Airspace Operational
More information! o o o o o o o o o (Per the February 9, 2015 rule change to 14 CFR 91.225 (b)(1)(ii)) (A letter of authorization from the aircraft manufacturer is needed for installation in LSA aircraft) ± ! ADF
More informationMLSALT11: Large Vocabulary Speech Recognition
MLSALT11: Large Vocabulary Speech Recognition Riashat Islam Department of Engineering University of Cambridge Trumpington Street, Cambridge, CB2 1PZ, England ri258@cam.ac.uk I. INTRODUCTION The objective
More informationLarge Scale Distributed Acoustic Modeling With Back-off N-grams
Large Scale Distributed Acoustic Modeling With Back-off N-grams Ciprian Chelba* and Peng Xu and Fernando Pereira and Thomas Richardson Abstract The paper revives an older approach to acoustic modeling
More informationJARUS RECOMMENDATIONS ON THE USE OF CONTROLLER PILOT DATA LINK COMMUNICATIONS (CPDLC) IN THE RPAS COMMUNICATIONS CONTEXT
Joint Authorities for Rulemaking of Unmanned Systems JARUS RECOMMENDATIONS ON THE USE OF CONTROLLER PILOT DATA LINK COMMUNICATIONS (CPDLC) IN THE RPAS COMMUNICATIONS CONTEXT DOCUMENT IDENTIFIER : JAR_DEL_WG5_D.04
More informationA Path Planning Algorithm to Enable Well-Clear Low Altitude UAS Operation Beyond Visual Line of Sight
A Path Planning Algorithm to Enable Well-Clear Low Altitude UAS Operation Beyond Visual Line of Sight Swee Balachandran National Institute of Aerospace, Hampton, VA Anthony Narkawicz, César Muñoz, María
More informationTALK project work at UCAM
TALK project work at UCAM Matt Stuttle August 2005 Web: Email: http://mi.eng.cam.ac.uk/ mns25 mns25@eng.cam.ac.uk Dialogs on Dialogs Overview TALK project Goals Baseline system the Hidden Information State
More informationImagine How-To Videos
Grounded Sequence to Sequence Transduction (Multi-Modal Speech Recognition) Florian Metze July 17, 2018 Imagine How-To Videos Lots of potential for multi-modal processing & fusion For Speech-to-Text and
More informationLarge-Scale Flight Phase identification from ADS-B Data Using Machine Learning Methods
Large-Scale Flight Phase identification from ADS-B Data Using Methods Junzi Sun 06.2016 PhD student, ATM Control and Simulation, Aerospace Engineering Large-Scale Flight Phase identification from ADS-B
More informationRelative Significance of Trajectory Prediction Errors on an Automated Separation Assurance Algorithm
Relative Significance of Trajectory Prediction Errors on an Automated Separation Assurance Algorithm Todd Lauderdale Andrew Cone Aisha Bowe NASA Ames Research Center Separation Assurance Automation Should
More informationEuropean Sky ATM Research (SESAR) [5][6] in Europe both consider the implementation of SWIM as a fundamental element for future ATM systems.
(FIXM) and the weather information exchange model 1. INTRODUCTION With the rapid increase in local and global air traffic, the system-wide operational information exchange and life-cycle management technologies
More informationC. The system is equally reliable for classifying any one of the eight logo types 78% of the time.
Volume: 63 Questions Question No: 1 A system with a set of classifiers is trained to recognize eight different company logos from images. It is 78% accurate. Without further information, which statement
More informationvstars Version 1.0 Facility Engineer's Guide
vstars Version 1.0 Facility Engineer's Guide Table of Contents Introduction...3 The Facilities Window...3 Facility Configuration...4 Airspace Tab...4 Data Blocks Tab...5 Positions & Aliases Tab...6 Airports
More informationPing-pong decoding Combining forward and backward search
Combining forward and backward search Research Internship 09/ - /0/0 Mirko Hannemann Microsoft Research, Speech Technology (Redmond) Supervisor: Daniel Povey /0/0 Mirko Hannemann / Beam Search Search Errors
More informationBasic SYSCO Principles and Standard OLDI Communication
MID AIDC/OLDI Seminar Basic SYSCO Principles and Standard OLDI Communication ICAO EUR/NAT Office Celso Figueiredo Regional Officer ANS - ATM Cairo, Egypt 3 to 5 March 2014 By the end of the Presentation
More informationSpeech Recognition Lecture 12: Lattice Algorithms. Cyril Allauzen Google, NYU Courant Institute Slide Credit: Mehryar Mohri
Speech Recognition Lecture 12: Lattice Algorithms. Cyril Allauzen Google, NYU Courant Institute allauzen@cs.nyu.edu Slide Credit: Mehryar Mohri This Lecture Speech recognition evaluation N-best strings
More informationEvaluation of CDA as A Standard Terminal Airspace Operation
Partnership for AiR Transportation Noise and Emission Reduction An FAA/NASA/TC/DOD/EPA-sponsored Center of Excellence Evaluation of CDA as A Standard Terminal Airspace Operation School of Aeronautics &
More informationFP SIMPLE4ALL deliverable D6.5. Deliverable D6.5. Initial Public Release of Open Source Tools
Deliverable D6.5 Initial Public Release of Open Source Tools The research leading to these results has received funding from the European Community s Seventh Framework Programme (FP7/2007-2013) under grant
More informationTaming Text. How to Find, Organize, and Manipulate It MANNING GRANT S. INGERSOLL THOMAS S. MORTON ANDREW L. KARRIS. Shelter Island
Taming Text How to Find, Organize, and Manipulate It GRANT S. INGERSOLL THOMAS S. MORTON ANDREW L. KARRIS 11 MANNING Shelter Island contents foreword xiii preface xiv acknowledgments xvii about this book
More informationDynamic Time Warping
Centre for Vision Speech & Signal Processing University of Surrey, Guildford GU2 7XH. Dynamic Time Warping Dr Philip Jackson Acoustic features Distance measures Pattern matching Distortion penalties DTW
More informationBritish Airways : Boeing [Assembly Instructions]
British Airways : Boeing-00 [Assembly Instructions] Front Over look Side Underside This British Airways Boeing -00 papercraft is modelled on the British Airways plane of the same name. The red and blue
More informationGarrecht Avionik GmbH VT-02 Transponder User manual Dokument E. Secondary Surveillance Radar Transponder Mode-S.
VT-02 Secondary Surveillance Radar Transponder Mode-S User manual Add this manual to the flight instruction manual of your aircraft 2007-2008 Garrecht Avionik GmbH, 55411 Bingen / Germany Revision: 1.1
More informationSlide Title. Click to edit Master text styles. Second level. Third level. Organisation Logo. Fourth level Fifth level
Slide Title Slide Title Neil Roduner RPAS Operational Risks air traffic controller perspective Slide ScopeTitle Click Safety to and edit what Master it means text styles from the ATC perspective Facilitation
More information: EUROCONTROL Specification. for Surveillance Data Exchange ASTERIX Part 29 Category 032 Miniplan Reports to an SDPS
EUROCONTROL Specification for Surveillance Data Exchange ASTERIX Part 29 Category 032 Miniplan Reports to an SDPS DOCUMENT IDENTIFIER : Edition Number : 1.0 Edition Date : 04/08/2017 Status : Released
More informationAFTN Terminal. Architecture Overview
AFTN Terminal Architecture Overview Flight ATM Systems Ltd. Document Number AFTNTERM-ARCH Rev A0.01 Filename: GEN_AFTN_Terminal Architecture.doc Paper size: A4 Template: Flight ATM.dot persons, without
More informationLessons learned: WakeScene Results of quantitative (Monte Carlo) simulation studies
Lessons learned: WakeScene Results of quantitative (Monte Carlo) simulation studies Jan Kladetzke DLR, Oberpfaffenhofen, Institut für Robotik und Mechatronik Jan Kladetzke 01/06/2010 Contents History of
More informationLarge-scale discriminative language model reranking for voice-search
Large-scale discriminative language model reranking for voice-search Preethi Jyothi The Ohio State University Columbus, OH jyothi@cse.ohio-state.edu Leif Johnson UT Austin Austin, TX leif@cs.utexas.edu
More informationTALP: Xgram-based Spoken Language Translation System Adrià de Gispert José B. Mariño
TALP: Xgram-based Spoken Language Translation System Adrià de Gispert José B. Mariño Outline Overview Outline Translation generation Training IWSLT'04 Chinese-English supplied task results Conclusion and
More informationApplications of Keyword-Constraining in Speaker Recognition. Howard Lei. July 2, Introduction 3
Applications of Keyword-Constraining in Speaker Recognition Howard Lei hlei@icsi.berkeley.edu July 2, 2007 Contents 1 Introduction 3 2 The keyword HMM system 4 2.1 Background keyword HMM training............................
More informationHow SPICE Language Modeling Works
How SPICE Language Modeling Works Abstract Enhancement of the Language Model is a first step towards enhancing the performance of an Automatic Speech Recognition system. This report describes an integrated
More informationA tool for Cross-Language Pair Annotations: CLPA
A tool for Cross-Language Pair Annotations: CLPA August 28, 2006 This document describes our tool called Cross-Language Pair Annotator (CLPA) that is capable to automatically annotate cognates and false
More informationA Scalable Speech Recognizer with Deep-Neural-Network Acoustic Models
A Scalable Speech Recognizer with Deep-Neural-Network Acoustic Models and Voice-Activated Power Gating Michael Price*, James Glass, Anantha Chandrakasan MIT, Cambridge, MA * now at Analog Devices, Cambridge,
More informationMaximum Likelihood Beamforming for Robust Automatic Speech Recognition
Maximum Likelihood Beamforming for Robust Automatic Speech Recognition Barbara Rauch barbara@lsv.uni-saarland.de IGK Colloquium, Saarbrücken, 16 February 2006 Agenda Background: Standard ASR Robust ASR
More informationImproving the Understanding between Control Tower Operator & Pilot Using Semantic Techniques A New Approach
2017 IEEE 13th International Symposium on Autonomous Decentralized Systems Improving the Understanding between Control Tower Operator & Pilot Using Semantic Techniques A New Approach Dewan Abdullah Dewan
More informationScalable Trigram Backoff Language Models
Scalable Trigram Backoff Language Models Kristie Seymore Ronald Rosenfeld May 1996 CMU-CS-96-139 School of Computer Science Carnegie Mellon University Pittsburgh, PA 15213 This material is based upon work
More informationChapter 3. Speech segmentation. 3.1 Preprocessing
, as done in this dissertation, refers to the process of determining the boundaries between phonemes in the speech signal. No higher-level lexical information is used to accomplish this. This chapter presents
More informationEUROCAE ED-122 / RTCA DO-306 Oceanic SPR Standard
Oceanic SPR Standard ICAO NAT PBCS Workshop, Feb 2013 Presented by Jerome CONDIS RTCA SC-214 / EUROCAE WG-78 Co-chair 1 History SC189 / WG53 1 2 3 4 ED100 / DO 258 : Interoperability Requirements for ATS
More informationU.S. ADS-B Ground Infrastructure Program. January 28, 2010
U.S. ADS-B Ground Infrastructure Program January 28, 2010 ITT At a Glance A global, diversified manufacturing and engineering company headquartered in White Plains, NY A Fortune 300 company 40,000+ employees
More informationRNAV 1 Approval Process
RNAV 1 Approval Process JAA Temporary Guidance Material TGL 10 Published November 2000 P-RNAV meets all PBN requirements for RNAV 1 Operations using DME/DME or GNSS EASA transposing JAA guidance into AMC
More informationPLUS Software Solution
PLUS Software Solution Processing Language Upgrades Safety Safety Data - CFH Data Science in Aviation 2017 Workshop 29th September 2017 1. Background and objectives 2. PLUS Software Solution 3. They trust
More informationLightweight Simulation of Air Traffic Control Using Simple Temporal Networks
Lightweight Simulation of Air Traffic Control Using Simple Temporal Networks Russell Knight Jet Propulsion Laboratory, California Institute of Technology 4800 Oak Grove Drive MS 126-347 Pasadena, CA 91109
More informationEUROCONTROL Specification for SWIM Information Definition
EUROCONTROL EUROCONTROL Specification for SWIM Information Definition Edition: 1.0 Edition date: 01/12/2017 Reference nr: EUROCONTROL-SPEC-169 EUROPEAN ORGANISATION FOR THE SAFETY OF AIR NAVIGATION EUROCONTROL
More informationWFST: Weighted Finite State Transducer. September 12, 2017 Prof. Marie Meteer
+ WFST: Weighted Finite State Transducer September 12, 217 Prof. Marie Meteer + FSAs: A recurring structure in speech 2 Phonetic HMM Viterbi trellis Language model Pronunciation modeling + The language
More informationDIGITAL TOWER SERVICES
COMPANY RESTRICTED NOT EXPORT CONTROLLED NOT CLASSIFIED DIGITAL TOWER SERVICES David A Shomar, Vice President Civil Security, MENA Region dshomar@saabsensis.com EVOLUTION COMPANY RESTRICTED NOT EXPORT
More informationTCT Detailed Design Document
Detailed Design Document TCT Detailed Design Document Issue: Draft 1.1 Author: Mike Vere Date: 3 January 2008 Company: Graffica Ltd Page 1 of 16 Table Of Contents 1 Introduction...3 1.1 OVERVIEW...3 1.2
More informationExam in TDT 4242 Software Requirements and Testing. Tuesday June 8, :00 am 1:00 pm
Side 1 av 11 NTNU Norwegian University of Science and Technology ENGLISH Faculty of Physics, Informatics and Mathematics Department of Computer and Information Sciences Sensurfrist: 2010-06-29 Exam in
More informationA Delay Model for Satellite Constellation Networks with Inter-Satellite Links
A Delay Model for Satellite Constellation Networks with Inter-Satellite Links Romain Hermenier, Christian Kissling, Anton Donner Deutsches Zentrum für Luft- und Raumfahrt (DLR) Institute of Communications
More informationLOW-RANK MATRIX FACTORIZATION FOR DEEP NEURAL NETWORK TRAINING WITH HIGH-DIMENSIONAL OUTPUT TARGETS
LOW-RANK MATRIX FACTORIZATION FOR DEEP NEURAL NETWORK TRAINING WITH HIGH-DIMENSIONAL OUTPUT TARGETS Tara N. Sainath, Brian Kingsbury, Vikas Sindhwani, Ebru Arisoy, Bhuvana Ramabhadran IBM T. J. Watson
More informationFaster Simulations of the National Airspace System
Faster Simulations of the National Airspace System PK Menon Monish Tandale Sandy Wiraatmadja Optimal Synthesis Inc. Joseph Rios NASA Ames Research Center NVIDIA GPU Technology Conference 2010, San Jose,
More informationHuman Computer Interaction Using Speech Recognition Technology
International Bulletin of Mathematical Research Volume 2, Issue 1, March 2015 Pages 231-235, ISSN: 2394-7802 Human Computer Interaction Using Recognition Technology Madhu Joshi 1 and Saurabh Ranjan Srivastava
More informationStory Workbench Quickstart Guide Version 1.2.0
1 Basic Concepts Story Workbench Quickstart Guide Version 1.2.0 Mark A. Finlayson (markaf@mit.edu) Annotation An indivisible piece of data attached to a text is called an annotation. Annotations, also
More informationDetection and Extraction of Events from s
Detection and Extraction of Events from Emails Shashank Senapaty Department of Computer Science Stanford University, Stanford CA senapaty@cs.stanford.edu December 12, 2008 Abstract I build a system to
More informationConfidence Measures: how much we can trust our speech recognizers
Confidence Measures: how much we can trust our speech recognizers Prof. Hui Jiang Department of Computer Science York University, Toronto, Ontario, Canada Email: hj@cs.yorku.ca Outline Speech recognition
More informationA Comparison of Sequence-Trained Deep Neural Networks and Recurrent Neural Networks Optical Modeling For Handwriting Recognition
A Comparison of Sequence-Trained Deep Neural Networks and Recurrent Neural Networks Optical Modeling For Handwriting Recognition Théodore Bluche, Hermann Ney, Christopher Kermorvant SLSP 14, Grenoble October
More informationCS 532c Probabilistic Graphical Models N-Best Hypotheses. December
CS 532c Probabilistic Graphical Models N-Best Hypotheses Zvonimir Rakamaric Chris Dabrowski December 18 2004 Contents 1 Introduction 3 2 Background Info 3 3 Brute Force Algorithm 4 3.1 Description.........................................
More informationImproving airports throughput
Improving airports throughput RECAT EU and Approach Time-Based Vincent TREVE Introduction Analysing summer period 2012, 6 airports were congested in the sense of operating at 80% or more of their capacity
More informationBetter Evaluation for Grammatical Error Correction
Better Evaluation for Grammatical Error Correction Daniel Dahlmeier 1 and Hwee Tou Ng 1,2 1 NUS Graduate School for Integrative Sciences and Engineering 2 Department of Computer Science, National University
More informationAdvisory Circular. Subject: INTERNET COMMUNICATIONS OF Date: 11/1/02 AC No.: AVIATION WEATHER AND NOTAMS Initiated by: ARS-100
U.S. Department of Transportation Federal Aviation Administration Advisory Circular Subject: INTERNET COMMUNICATIONS OF Date: 11/1/02 AC No.: 00-62 AVIATION WEATHER AND NOTAMS Initiated by: ARS-100 1.
More information