Sequence Prediction with Neural Segmental Models. Hao Tang

1 Sequence Prediction with Neural Segmental Models Hao Tang

2 About Me
Pronunciation modeling [TKL 2012]
Segmental models [TGL 2014] [TWGL 2015] [TWGL 2016] [TWGL 2016]
American Sign Language fingerspelling recognition [KWTL 2015]
Under-resourced speech recognition [LJTMSKHK 2016] [HJMMLDELMTLCHKSL 2016]
State-tying with CCA [WTL 2016]
Dialog state tracking [TWMH 2014]
Finite-state transducers, discriminative training, linear models, structured prediction, neural networks

3 Segments Netflix announces House of Cards return with dark inauguration day promo.

4 Frames and Segments
segment: variable-length unit
frame: fixed-length unit
Netflix announces House of Cards return with dark inauguration day promo.

5 Frame-Based Models
[Figure: frame labels y_1 ... y_7 = B O B I I O O over input frames x_1 ... x_7 for "Netflix announces House of Cards return with"]

6 Frame Labels
BIO tags in named-entity recognition: B O B I I O O over "Netflix announces House of Cards return with"
sub-phonetic states in phonetic recognition: ay-1 ay-2 ay-3 for the phone ay

7 Frame Labels
BIO tags in named-entity recognition: B O B I I O O over "Netflix announces House of Cards return with"
sub-phonetic states in phonetic recognition: ay-1 ay-2 ay-3 for the phone ay
With frame labels we are not able to express segment-level features such as duration, formants, or 1[the segment has balanced parentheses].
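As an illustration of the kind of segment-level features a frame-based model cannot express, here is a minimal sketch (not from the talk; the feature names and inputs are made up) of a feature function that needs to see a whole variable-length segment at once.

```python
def segment_features(frames, text):
    """Features of a variable-length segment: duration and an indicator for
    balanced parentheses. Both require looking at the whole segment, so they
    cannot be computed from any single fixed-length frame."""
    return {
        "duration": len(frames),  # number of frames spanned by the segment
        "balanced_parens": 1.0 if text.count("(") == text.count(")") else 0.0,
    }

# Example usage with hypothetical values.
print(segment_features(frames=[0.1, 0.3, 0.2], text="House of Cards (2013)"))
```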

8 Reduction to Graph Search Problem
named-entity recognition, speech recognition, parsing, translation → sequence prediction → graph search
Inference:
1. Take input x. Build search graph G.
2. Find the maximum-scoring path in G.
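To make step 2 concrete, here is a minimal sketch (not from the talk) of finding the maximum-scoring path in a topologically ordered search graph by dynamic programming; the edge representation and the toy weight function are assumptions for illustration only.

```python
def max_scoring_path(num_vertices, edges, weight):
    """edges: list of (u, v, label) with u < v, so the graph is a DAG.
    weight(edge) returns the score of an edge.
    Returns (best score, best path) from vertex 0 to the last vertex."""
    NEG_INF = float("-inf")
    best = [NEG_INF] * num_vertices
    back = [None] * num_vertices
    best[0] = 0.0
    for u, v, label in sorted(edges):        # sorted by source vertex = topological order
        if best[u] == NEG_INF:
            continue
        score = best[u] + weight((u, v, label))
        if score > best[v]:
            best[v] = score
            back[v] = (u, v, label)
    # Follow back-pointers from the final vertex to recover the best path.
    path, v = [], num_vertices - 1
    while back[v] is not None:
        path.append(back[v])
        v = back[v][0]
    return best[-1], list(reversed(path))

# Tiny example: two frames, labels from {B, I, O}, a made-up weight function.
edges = [(t, t + 1, y) for t in range(2) for y in "BIO"]
score, path = max_scoring_path(3, edges, weight=lambda e: 1.0 if e[2] == "B" else 0.0)
```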

9 Frame-Based Models
[Figure: the frame-based search graph, where each of the frames x_1 ... x_4 (labels y_1 ... y_4) can take label B, I, or O, for "Netflix announces House of"]

10-16 Segmental Models
[Figures: candidate segmentations of "Netflix announces House of Cards return with" into labeled variable-length segments (e.g. I I O O O, I I O O, I O I O, I O), illustrating the space of segmentations a segmental model searches over]

17 Segmental Models
[Figure: a segmentation of "Netflix announces House of Cards return with" into labeled variable-length segments]
The difference between frame-based models and segmental models is the search graph: in a segmental model, features can be extracted from variable-length units.

18 Problem Definition
Running example: Netflix announces House of Cards return with
Search space: G = (V, E)
Weight: w_θ(x, e), where x is the input and e is an edge
Inference: finding the maximum-scoring path
Learning: finding θ that minimizes a loss function
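A minimal sketch (an illustration, not the talk's implementation) of how the segmental search space G = (V, E) can be enumerated: vertices are frame boundaries and every edge spans at most D frames with one label, which is where the O(TLD) edge count on slide 20 comes from.

```python
def segmental_graph(num_frames, labels, max_duration):
    """Vertices are frame boundaries 0..T; an edge (s, t, y) is a candidate
    segment covering frames s..t-1 with label y and duration t - s <= max_duration."""
    edges = []
    for s in range(num_frames):
        for d in range(1, max_duration + 1):
            t = s + d
            if t > num_frames:
                break
            for y in labels:
                edges.append((s, t, y))
    return edges

# |E| is O(T * L * D): e.g. 5 frames, 3 labels, max duration 2 gives at most 5*3*2 edges.
edges = segmental_graph(num_frames=5, labels=["B", "I", "O"], max_duration=2)
print(len(edges))  # 27 here, since segments near the end cannot reach full duration
```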

19 Past Research on Segmental Models
Network-based digit recognition [Bush and Kopec, 1985]
SUMMIT [Zue et al., 1989] [Glass, 2003]
Stochastic segmental models [Ostendorf and Roukos, 1989]
Hidden semi-Markov models [Sarawagi and Cohen, 2004]
Segmental conditional random fields (SCRF) [Zweig and Nguyen, 2009] [Zweig et al., 2011] [Zweig, 2012]
Boundary-factored segmental CRF [He and Fosler-Lussier, 2012]
Deep segmental neural networks [Abdel-Hamid et al., 2013]
Discriminative segmental cascades [TWGL 2015]
Segmental recurrent neural networks [Lu et al., 2016]

20 Problem: Efficiency
Runtime for inference: O(|E| · c), where c is the time to compute the weight of an edge.
Suppose |x| = T and the label set is of size L.
Frame-based models: |E| = O(TL)
Segmental models: |E| = O(TLD), where D is the maximum duration
Task / L / D:
named-entity recognition: 30 / 4
action recognition: — / —
phoneme recognition: — / —
word recognition: — / —
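As a rough worked example (the utterance length is an illustrative assumption, not a number from the slide): for an utterance of T = 300 frames with the label set size L = 48 and maximum duration D = 30 used later in the talk, a frame-based graph has on the order of 300 × 48 ≈ 14,000 edges, while the segmental graph has on the order of 300 × 48 × 30 ≈ 430,000 edges, and this factor-of-D blowup is paid on every utterance during both inference and learning.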

21 Past Research on Efficiency of Segmental Models
Bottom-up approach [Zue and Glass, 1988]
Other ASR systems [Chang and Glass, 1997] [Zweig et al., 2010]
Augmentation [Glass et al., 1996] [Chang and Glass, 1997]
Separate pruners [Okanohara et al., 2006]
Different graph topologies [Andrew, 2006] [Vinh et al., 2011] [He and Fosler-Lussier, 2012]

22 Contribution
Desideratum: No HMMs!
Discriminative segmental cascades [TWGL 2015] [TWGL 2016]
- Improved performance with segmental neural networks and higher-order features while maintaining efficiency
- Structured composition for computing higher-order features efficiently
- Speedup in inference and learning without accuracy loss
End-to-end training for segmental models [TWGL 2016]
- Two-stage training can serve as a good initialization for end-to-end training.
- Hinge loss converges the fastest and log loss achieves the best accuracy.
- Marginal log loss achieves strong results without relying on manual alignments.

23 Discriminative Segmental Cascades
[Figure: the first-pass search space H_1 = Y_1; the cascade trades off search-space size against feature complexity (segmental features vs. higher-order features)]

24 Discriminative Segmental Cascades
[Figure: the first-pass search space H_1 = Y_1 is pruned into a smaller lattice H_2]

25-27 Discriminative Segmental Cascades
[Figure: the pruned lattice H_2 is σ-composed with a bigram language model L_2, giving the second-pass search space H_2 ∘_σ L_2 = Y_2, on which higher-order features can be used]

28 Max-Marginal Pruning [Sixtus and Ortmanns, 1999, Weiss et al., 2012]
The max-marginal of an edge e ∈ E is defined as γ(e) = max_{y : e ∈ y} w(x, y), the score of the best path that uses e.
For α ∈ (0, 1), the threshold is defined as t = α max_{e ∈ E} γ(e) + (1 − α) (1/|E|) Σ_{e ∈ E} γ(e).
For e ∈ E, prune e if γ(e) < t.
At least one path is retained, and all paths with scores higher than t are retained.
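A minimal sketch (an illustration, not the talk's implementation) of max-marginal pruning on a topologically ordered lattice: forward and backward best-path scores give γ(e) for every edge in two passes, and edges below the interpolated threshold are dropped.

```python
def max_marginal_prune(num_vertices, edges, weight, alpha):
    """edges: list of (u, v, label) with u < v; weight(e) is the edge score.
    Assumes every edge lies on some complete path from vertex 0 to the last vertex.
    Returns the edges whose max-marginal gamma(e) is at least the threshold t."""
    NEG_INF = float("-inf")
    fwd = [NEG_INF] * num_vertices   # best score from the start vertex to v
    bwd = [NEG_INF] * num_vertices   # best score from v to the final vertex
    fwd[0] = 0.0
    bwd[num_vertices - 1] = 0.0
    for u, v, label in sorted(edges):                 # forward pass, topological order
        if fwd[u] > NEG_INF:
            fwd[v] = max(fwd[v], fwd[u] + weight((u, v, label)))
    for u, v, label in sorted(edges, reverse=True):   # backward pass, reverse order
        if bwd[v] > NEG_INF:
            bwd[u] = max(bwd[u], weight((u, v, label)) + bwd[v])
    # gamma(e): score of the best full path that uses edge e
    gamma = {e: fwd[e[0]] + weight(e) + bwd[e[1]] for e in edges}
    t = alpha * max(gamma.values()) + (1 - alpha) * sum(gamma.values()) / len(edges)
    return [e for e in edges if gamma[e] >= t]
```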

29 Structured Composition (σ-composition)
The structured composition of A and B is defined as an FST G where
V_G = V_A × V_B
E_G = { (e_1, e_2) ∈ E_A × E_B : o_A(e_1) = i_B(e_2) }
so the weight function has access to a pair of labels.
[Figure: a small lattice over labels a, b, c σ-composed with a bigram LM]
After σ-composition, the search space becomes L^(n−1) times larger when using an n-gram language model with a vocabulary of size L.
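A minimal sketch (an illustration under simplifying assumptions: the LM vertex is just the previous label, and edges carry a single label rather than full input/output pairs) of the pairing in the definition above; the composed edge keeps both vertices and the label pair so that a bigram feature can be scored on a single edge.

```python
def sigma_compose(lattice_edges, lm_edges):
    """lattice_edges: (u, v, label); lm_edges: (prev_state, next_state, label).
    Edges are matched when the lattice label equals the LM edge label
    (o_A(e1) = i_B(e2)); a composed vertex is a (lattice vertex, LM state) pair,
    and a composed edge exposes the (previous label, current label) pair."""
    composed = []
    for (u, v, label) in lattice_edges:
        for (prev_state, next_state, lm_label) in lm_edges:
            if label == lm_label:
                composed.append(((u, prev_state), (v, next_state), (prev_state, label)))
    return composed

# Tiny example: one lattice edge labeled "a", bigram LM edges from states "<s>" and "b".
lattice = [(0, 1, "a")]
bigram_lm = [("<s>", "a", "a"), ("b", "a", "a")]
print(sigma_compose(lattice, bigram_lm))
# -> [((0, '<s>'), (1, 'a'), ('<s>', 'a')), ((0, 'b'), (1, 'a'), ('b', 'a'))]
```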

30 Experimental Setup
[Figure: example phone sequence "iy v eh n ih f ..."]
Task: phonetic recognition
Dataset: TIMIT
Size: 6 hours
Ground truth: manual alignments
Loss function: hinge loss
Maximum duration: 30
Label set size: 48
Average input length: —

31 Beam Pruning vs Max-Marginal Pruning
[Figure: oracle error (%) vs. density (edges per gold edge) and vs. real-time factor, for beam pruning and max-marginal pruning]
Beam pruning is faster. Max-marginal pruning produces more compact lattices.

32-35 Beam Search vs Exact Search
[Figure: dev PER (%) and hit rate (%) as a function of beam width]
When the model is well-trained, beam search can be as good as exact search. Dual decomposition is not an option, since we only allow a single pass over the edges.

36-39 Learning with Beam Search vs Learning with Cascades
[Figure: dev PER (%) vs. epoch for unigram and bigram models, comparing exact search, beam widths 10/20/30, and cascades]
Learning with beam search is fine for the unigram case but fails in the bigram case. Learning with cascades is both effective and efficient.

40 Phonetic Recognition on TIMIT
[Table: dev and test PER (%) for HMM-DNN, the 1st-pass segmental model, and additions of a bigram LM, 2nd-order boundary features, 1st-order segment NN, 1st-order bi-phone NN, and bottleneck features]

41 American Sign Language Fingerspelling Recognition [Kim et al., 2016]
We consider signer-dependent, signer-independent, and signer-adapted recognition. All of the recognizers use deep neural network (DNN) classifiers of letters or handshape features.
[Figure: images and ground-truth segmentations of the fingerspelled word TULIP (<s> T U L I P </s>) produced by two signers; <s> and </s> denote non-signing intervals before/after signing, and asterisks mark manually annotated peak frames for each letter]
Letter error rate (LER): Tandem HMM 14.6%, rescoring SCRF 11.5%, cascade 1st pass 8.8%, cascade 2nd pass 7.6%

42 Improving Efficiency
[Figure: the first-pass search space H_1 = Y_1 is pruned, and the resulting lattice is used as the second-pass search space H_2 = Y_2]

43 Improving Efficiency
[Figure: dev PER (%) vs. real-time factor for the baseline and proposed 1st and 2nd passes, and training hours for the baseline 1st pass, proposed 1st pass, and proposed 2nd pass]

44 Contribution
Desideratum: No HMMs!
Discriminative segmental cascades [TWGL 2015] [TWGL 2016]
- Improved performance with segmental neural networks and higher-order features while maintaining efficiency
- Structured composition for computing higher-order features efficiently
- Speedup in inference and learning without accuracy loss
End-to-end training for segmental models [TWGL 2016]
- Two-stage training can serve as a good initialization for end-to-end training.
- Hinge loss converges the fastest and log loss achieves the best accuracy.
- Marginal log loss achieves strong results without relying on manual alignments.

45-48 Two-Stage vs End-to-End Training
[Figure: the input x is passed through a frame-level network f_Λ, producing per-frame log-probability vectors that feed the segmental model]

49 Two-Stage vs End-to-End Training
[Figure: the same pipeline with the per-frame log probabilities replaced by question marks]

50 Two-Stage vs End-to-End Training
Two-stage training
1. Find Λ by minimizing cross entropy at each frame.
2. Fix Λ. Find θ by minimizing the hinge loss
   ℓ_hinge(θ, Λ; x, y, z) = max_{(y', z') ∈ P} [ cost((y', z'), (y, z)) − θ·φ_Λ(x, y, z) + θ·φ_Λ(x, y', z') ]
End-to-end training from scratch
1. Randomly initialize Λ.
2. Find θ and Λ jointly by minimizing the hinge loss.
End-to-end fine-tuning
1. Two-stage training
2. End-to-end training
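To illustrate step 2 of two-stage training, here is a minimal sketch (with made-up feature and cost functions) of evaluating the structured hinge loss by cost-augmented decoding over an explicitly enumerated candidate set P; a real implementation would search the segmental graph rather than a Python list.

```python
def hinge_loss(theta, phi, cost, x, gold, candidates):
    """Structured hinge loss: score the cost-augmented best candidate against the gold.
    theta: weight vector (list of floats); phi(x, cand): feature vector of the same length;
    cost(cand, gold): task cost, e.g. labeling error."""
    def score(cand):
        return sum(t * f for t, f in zip(theta, phi(x, cand)))
    augmented = max(candidates, key=lambda c: cost(c, gold) + score(c))
    loss = cost(augmented, gold) - score(gold) + score(augmented)
    return max(0.0, loss)

# Toy usage with hypothetical features: candidates are label sequences.
phi = lambda x, cand: [cand.count("I"), cand.count("O")]
cost = lambda c, g: sum(a != b for a, b in zip(c, g))
print(hinge_loss([0.5, -0.2], phi, cost, x=None, gold="IIO",
                 candidates=["IIO", "IOO", "OOO"]))
```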

51 Two-Stage vs End-to-End Training for Hinge Loss
[Figure: test PER (%) for 2-stage training, e2e training from scratch, and fine-tuning]
End-to-end training can get stuck at a poor local optimum. Two-stage training provides a better starting point.

52 Two-Stage vs End-to-End Training for Hinge Loss
[Figure: test PER (%) and training loss for 2-stage training, e2e training from scratch, and fine-tuning]
End-to-end training can get stuck at a poor local optimum. Two-stage training provides a better starting point.

53 Log Loss
ℓ_log(θ, Λ; x, y, z) = −log p(y, z | x)
p(y, z | x) = (1/Z) exp(θ·φ_Λ(x, y, z))
Z = Σ_{(y', z') ∈ P} exp(θ·φ_Λ(x, y', z'))
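A minimal sketch (an illustration, not the talk's implementation) of the log loss over an explicitly enumerated candidate set P, using a numerically stable log-sum-exp for the partition function Z; in practice Z is computed with a forward pass over the search graph rather than by enumeration.

```python
import math

def log_loss(theta, phi, x, gold, candidates):
    """-log p(gold | x), with p proportional to exp(theta . phi)."""
    def score(cand):
        return sum(t * f for t, f in zip(theta, phi(x, cand)))
    scores = [score(c) for c in candidates]
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))  # stable log-sum-exp
    return log_z - score(gold)

# Toy usage with the same hypothetical features as before.
phi = lambda x, cand: [cand.count("I"), cand.count("O")]
print(log_loss([0.5, -0.2], phi, x=None, gold="IIO",
               candidates=["IIO", "IOO", "OOO"]))
```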

54 Two-Stage vs End-to-End Training for Log Loss
[Figure: test PER (%) and training loss for 2-stage training, e2e training from scratch, and fine-tuning]
End-to-end training for log loss seems easier to optimize. Two-stage training provides a better starting point.

55 Frame-wise Cross Entropy
[Figure: train and dev cross entropy over training, marking the best dev points with and without dropout and after fine-tuning]
End-to-end fine-tuning sticks to the log-probability representation and improves it.

56 Other Loss Functions
Marginal log loss
ℓ_log(θ, Λ; x, y) = −log p(y | x) = −log Σ_{z ∈ Z} p(y, z | x)
Latent hinge loss
ℓ_latent-hinge(θ, Λ; x, y) = max_{(y', z') ∈ P} [ cost((y', z'), (y, ẑ)) − max_{z ∈ Z} θ·φ_Λ(x, y, z) + θ·φ_Λ(x, y', z') ]
where ẑ = argmax_{z ∈ Z} θ·φ_Λ(x, y, z)
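A minimal sketch (by enumeration, with made-up features) of the marginal log loss: the score of a label sequence y marginalizes over all segmentations z consistent with it, so no manual alignment z is needed; again, a real implementation would use forward passes over the search graph instead of lists.

```python
import math

def logsumexp(scores):
    m = max(scores)
    return m + math.log(sum(math.exp(s - m) for s in scores))

def marginal_log_loss(theta, phi, x, gold_y, candidates):
    """candidates: list of (y, z) pairs; the loss marginalizes over every
    segmentation z whose label sequence equals gold_y."""
    def score(y, z):
        return sum(t * f for t, f in zip(theta, phi(x, y, z)))
    all_scores = [score(y, z) for y, z in candidates]
    gold_scores = [score(y, z) for y, z in candidates if y == gold_y]
    return logsumexp(all_scores) - logsumexp(gold_scores)

# Toy usage with hypothetical segmentations z given as boundary tuples.
phi = lambda x, y, z: [len(z), y.count("I")]
cands = [("IO", (2, 5)), ("IO", (3, 5)), ("OO", (2, 5))]
print(marginal_log_loss([0.1, 0.3], phi, None, "IO", cands))
```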

57-58 End-to-End Training without Manual Alignments
[Figure: test PER (%) for latent hinge loss, marginal log loss, and MLL with alignments, under 2-stage training, fine-tuning, and e2e training from scratch]
End-to-end training for marginal log loss seems easier. Two-stage training provides a better starting point.

59 Loss Functions
[Figure: training hours for hinge loss, log loss, latent hinge loss, marginal log loss, and an LSTM baseline]
[Table: for each of hinge loss, log loss, latent hinge loss, and marginal log loss, whether alignments are required, whether the loss is convex in θ, whether it is smooth, and whether updates are sparse]

60 Where are we?
[Table of TIMIT phone error rates (%), distinguishing speaker-independent and speaker-adapted systems]
HMM-DNN
HMM-CNN [Tóth, 2015] 16.5
Segment-based models [Glass, 2003] 24.4
SCRF [Zweig, 2012] 33.1
SCRF with shallow NN [He and Fosler-Lussier, 2012] 26.5
SCRF with DNN [He, 2015] 19.1
Deep segmental NN [Abdel-Hamid et al., 2013] 21.9
cascade 1st pass [TWGL 2015] 21.7
cascade 2nd pass [TWGL 2015] 19.9
End-to-end + two-stage training [TWGL 2016] 19.7
Segmental RNN [Lu et al., 2016]

61 Contribution
Desideratum: No HMMs!
Discriminative segmental cascades [TWGL 2015] [TWGL 2016]
- Improved performance with segmental neural networks and higher-order features while maintaining efficiency
- Structured composition for computing higher-order features efficiently
- Speedup in inference and learning without accuracy loss
End-to-end training for segmental models [TWGL 2016]
- Two-stage training can serve as a good initialization for end-to-end training.
- Hinge loss converges the fastest and log loss achieves the best accuracy.
- Marginal log loss achieves strong results without relying on manual alignments.

62 Ongoing and Future Work
Unsupervised learning
- lexical unit discovery
- contrastive estimation [Smith and Eisner, 2005]
- autoencoders [Ammar et al., 2014, Tran et al., 2016]
- generative adversarial networks [Goodfellow et al., 2016]
Structure + networks
- Deep structured models [Chen et al., 2015]
- Attention [Chorowski et al., 2015]
- Structured attention networks [Kim et al., 2016]
Large-scale structured prediction
- whole-word speech recognizers, TIDIGITS (4.45% SER)
- Beam search + early update rule [Collins and Roark, 2004]
First-order methods for inference
- Dijkstra's algorithm is steepest descent in the dual [Murota and Shioura, 2010]
- Structured Prediction Energy Networks [Belanger and McCallum, 2015]

63 Acknowledgement
Weiran Wang, Taehwan Kim, Kevin Gimpel, Karen Livescu
This research was supported by a Google faculty research award and NSF grant IIS. The GPUs used for this research were donated by NVIDIA.

64 th ae ng k s
