Speech Recognition with Quaternion Neural Networks


1 Speech Recognition with Quaternion Neural Networks
Titouan Parcollet, Mohamed Morchid, Georges Linarès
LIA, University of Avignon, France; ORKIS, France

2-5 Summary
I. Problem definition
II. Quaternion numbers
III. Quaternion neural networks
IV. Experiments and discussions

6-10 Problem definition
Established fact no. 1: the bigger the model, the better the results (given a good training procedure and enough data). Is the model really efficient?
Established fact no. 2: input features are often multidimensional. Is the usual flat real-valued representation a good fit?

11 Problem definition
Can we define a more natural representation of multidimensional input features than the real-valued one, one that helps neural networks be more efficient?

12-20 Quaternion numbers
Q = r1 + xi + yj + zk
Real part: r1. Imaginary part: xi + yj + zk.
Quaternions solve the multidimensionality problem!
Acoustic quaternion for speech processing: an MFCC or Mel-filter-bank energy e(f,t) and its first and second order derivatives form a purely imaginary acoustic quaternion:
Q(f,t) = 0 + e(f,t)i + (∂e(f,t)/∂t)j + (∂²e(f,t)/∂t²)k

21 Quaternion numbers
Pixel quaternion for image processing: a purely imaginary pixel quaternion
Q(p) = 0 + Red(p)i + Green(p)j + Blue(p)k

22-23 Quaternion numbers: the Hamilton product
The Hamilton product ⊗ ties the components of the two quaternions to each other.

24-30 Quaternion numbers: the Hamilton product in neural networks
Real-valued layer: connecting 4 inputs (r, i, j, k) to 4 outputs (r, i, j, k) requires 4 x 4 = 16 weights.
Quaternion-valued layer: the same connection is a single Hamilton product,
W ⊗ X = (w_r x_r - w_x x_x - w_y x_y - w_z x_z)
      + (w_r x_x + w_x x_r + w_y x_z - w_z x_y)i
      + (w_r x_y - w_x x_z + w_y x_r + w_z x_x)j
      + (w_r x_z + w_x x_y - w_y x_x + w_z x_r)k
1 quaternion weight = 4 parameters.

31 Quaternion numbers: the Hamilton product in neural networks
Quaternions can learn internal relations within input features!
Quaternions reduce the number of neural parameters!
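To make this concrete, here is a minimal sketch of a quaternion-valued linear layer in PyTorch. The deck links a PyTorch/Keras QNN implementation; the class and argument names below are illustrative, not the authors' released code.

```python
# Minimal sketch of a quaternion-valued linear layer (illustrative names).
import torch
import torch.nn as nn

class QuaternionLinear(nn.Module):
    """Maps n_in quaternion inputs to n_out quaternion outputs with one
    Hamilton product per connection (4 real parameters per quaternion weight).

    Inputs are real tensors of shape (batch, 4 * n_in): the r, x, y, z
    components of the n_in input quaternions, concatenated in that order.
    """
    def __init__(self, n_in, n_out):
        super().__init__()
        # Placeholder init; see the quaternion-aware initialization sketch below.
        self.w_r = nn.Parameter(torch.randn(n_in, n_out) * 0.01)
        self.w_x = nn.Parameter(torch.randn(n_in, n_out) * 0.01)
        self.w_y = nn.Parameter(torch.randn(n_in, n_out) * 0.01)
        self.w_z = nn.Parameter(torch.randn(n_in, n_out) * 0.01)

    def forward(self, q):
        r, x, y, z = torch.chunk(q, 4, dim=-1)
        # Components of the Hamilton product W (x) X given above.
        out_r = r @ self.w_r - x @ self.w_x - y @ self.w_y - z @ self.w_z
        out_x = x @ self.w_r + r @ self.w_x + z @ self.w_y - y @ self.w_z
        out_y = y @ self.w_r - z @ self.w_x + r @ self.w_y + x @ self.w_z
        out_z = z @ self.w_r + y @ self.w_x - x @ self.w_y + r @ self.w_z
        return torch.cat([out_r, out_x, out_y, out_z], dim=-1)
```

A real-valued layer connecting the same 4·n_in inputs to 4·n_out outputs would need 16·n_in·n_out weights; this layer has 4·n_in·n_out, which is the 4x parameter reduction claimed on the slide.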

32-35 Quaternion Neural Networks (QNN)
QNN = NN with all parameters being quaternions
QNN = NN with the Hamilton product ⊗ replacing the standard dot product
QNN backpropagation and update rules differ from those of real-valued NNs [1]
[1] P. Arena, L. Fortuna, L. Occhipinti, and M. G. Xibilia, «Neural networks for quaternion-valued function approximation», in Circuits and Systems, ISCAS '94, 1994 IEEE International Symposium on, vol. 6. IEEE, 1994.

36-37 Quaternion Neural Networks (QNN)
Activation function: the «split» approach [1]
Q = r1 + xi + yj + zk
f(Q) = f(r) + f(x)i + f(y)j + f(z)k
The function f can be any real-valued activation function: Sigmoid, TanH, ReLU, ELU...
[1] P. Arena, L. Fortuna, L. Occhipinti, and M. G. Xibilia, «Neural networks for quaternion-valued function approximation», in Circuits and Systems, ISCAS '94, 1994 IEEE International Symposium on, vol. 6. IEEE, 1994.
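On the concatenated (r, x, y, z) layout used by the layer sketch above, a split activation is a one-liner; a minimal sketch:

```python
import torch

def split_activation(q, f=torch.relu):
    """«Split» activation: apply a real activation f to each component,
    f(Q) = f(r) + f(x)i + f(y)j + f(z)k."""
    r, x, y, z = torch.chunk(q, 4, dim=-1)
    return torch.cat([f(r), f(x), f(y), f(z)], dim=-1)
```

For an elementwise f this is numerically identical to applying f to the whole tensor; the split form only spells out the component-wise definition.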

38-40 Quaternion Neural Networks (QNN)
Neural parameters initialization
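The initialization slides are images in this transcription. As a hedged sketch in the spirit of the quaternion-adapted Glorot initialization described in the QRNN paper: draw a magnitude, a phase, and a random purely imaginary unit axis, then assemble the four weight components in polar form. The uniform sampling of the magnitude phi is an assumption of this sketch.

```python
import numpy as np

def quaternion_glorot_init(n_in, n_out, rng=None):
    """Polar-form quaternion initialization sketch:
    w = phi * (cos(theta) + n_hat * sin(theta)), with a Glorot-style
    criterion sigma = 1 / sqrt(2 * (n_in + n_out)).

    ASSUMPTION: phi is drawn uniformly in [-sigma, sigma]; the original
    papers may sample the magnitude differently.
    """
    rng = rng or np.random.default_rng()
    shape = (n_in, n_out)
    sigma = 1.0 / np.sqrt(2 * (n_in + n_out))
    phi = rng.uniform(-sigma, sigma, shape)        # magnitude
    theta = rng.uniform(-np.pi, np.pi, shape)      # phase
    # Random purely imaginary unit quaternion n_hat = nx*i + ny*j + nz*k.
    v = rng.uniform(-1.0, 1.0, (3,) + shape)
    v /= np.linalg.norm(v, axis=0)
    w_r = phi * np.cos(theta)
    w_x = phi * v[0] * np.sin(theta)
    w_y = phi * v[1] * np.sin(theta)
    w_z = phi * v[2] * np.sin(theta)
    return w_r, w_x, w_y, w_z
```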

41-45 Experiments and discussions: Neural Networks reminder
Convolutional neural networks (CNN)
Recurrent neural networks (RNN)
Long Short-Term Memory recurrent neural networks (LSTM)

46-48 Experiments and discussions: Speech Recognition tasks
Where are we using neural networks? An Automatic Speech Recognition (ASR) system, overly simplified (pipeline diagram shown on the slides).

49-53 Experiments and discussions: Acoustic Modelling, Speech Recognition tasks
End-to-End:
- TIMIT: 462 speakers training set, 50 speakers validation set, 192 sentences as a core test set; SA records are removed from the training
- Q-Convolutional Neural Network + CTC [2]
Traditional HMM:
- TIMIT: same split as above
- Wall Street Journal (WSJ): 14h and 81h training sets, test-dev93 used as a validation set, test-eval92 used as a test set
- Q-Recurrent Neural Networks (QRNN)
- Q-Long Short-Term Memory NN (QLSTM)
[2] A. Graves, S. Fernández, F. Gomez, and J. Schmidhuber, «Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks», in Proceedings of the 23rd international conference on Machine learning. ACM, 2006.

54-55 Experiments and discussions: Acoustic features
Real-valued inputs: 40 Mel-filter-banks + Δ + ΔΔ + ΔΔΔ = 160 real-valued inputs
Acoustic quaternions: 40 MFCCs + Δ + ΔΔ = 40 (purely imaginary) quaternion-valued inputs
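A minimal sketch of how such acoustic quaternions might be assembled, assuming mfcc is a (frames, 40) array; the function names and the component layout (components concatenated along the feature axis) are assumptions of this sketch, not the authors' pipeline:

```python
import numpy as np

def deltas(feat, width=2):
    """Standard regression-based delta features over time (axis 0)."""
    pad = np.pad(feat, ((width, width), (0, 0)), mode='edge')
    denom = 2 * sum(w * w for w in range(1, width + 1))
    T = len(feat)
    return sum(w * (pad[width + w:T + width + w] -
                    pad[width - w:T + width - w])
               for w in range(1, width + 1)) / denom

def acoustic_quaternions(mfcc):
    """Regroup (frames, 40) MFCCs and their first and second derivatives
    into purely imaginary acoustic quaternions Q = 0 + e i + de j + dde k,
    i.e. 40 quaternion-valued inputs (4 * 40 real values) per frame."""
    d1 = deltas(mfcc)
    d2 = deltas(d1)
    zeros = np.zeros_like(mfcc)   # real part of a purely imaginary quaternion
    return np.concatenate([zeros, mfcc, d1, d2], axis=1)
```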

56-59 Experiments and discussions: End-to-End results on TIMIT - QCNN + CTC
(Results table shown on the slides; error expressed in Phoneme Error Rate, PER %. FM = Feature Maps.)
Model architectures will be discussed during the questions! ;)
4x fewer parameters!

60 Experiments and discussions: no more End-to-End results

61-64 Experiments and discussions: HMM on TIMIT with PyTorch-Kaldi - QRNN
(Results table shown on the slides; error expressed in Phoneme Error Rate, PER %.)
2.5x fewer parameters!

65-68 Experiments and discussions: HMM on TIMIT with PyTorch-Kaldi - QLSTM
(Results table shown on the slides; error expressed in Phoneme Error Rate, PER %.)
3.2x fewer parameters!

69-72 Experiments and discussions: HMM on WSJ with PyTorch-Kaldi - QLSTM
(Results table shown on the slides; error expressed in Word Error Rate, WER %.)

73-77 Conclusion
Can we define a more natural representation of multidimensional input features than the real-valued one, one that helps neural networks be more efficient?
- A better and more natural representation of multidimensional features
- The Hamilton product within neural networks allows QNNs to learn both internal and contextual dependencies well
- A reduction of the number of free parameters
YES WE CAN

78 Resources
Related to this presentation:
- «Quaternion Recurrent Neural Networks», ICLR 2019, Titouan Parcollet, Mirco Ravanelli, Mohamed Morchid, Chiheb Trabelsi, Georges Linarès, Renato De Mori, Yoshua Bengio
- «Speech Recognition with Quaternion Neural Networks», NIPS (NeurIPS) 2018 IRASL workshop, Titouan Parcollet, Mirco Ravanelli, Mohamed Morchid, Georges Linarès, Renato De Mori
- «Quaternion Convolutional Neural Networks for End-to-End Speech Recognition», Interspeech 2018, oral session on «End-to-End ASR», Titouan Parcollet, Ying Zhang, Mohamed Morchid, Chiheb Trabelsi, Georges Linarès, Renato De Mori, Yoshua Bengio
- «Bidirectional Quaternion Long Short-Term Memory Recurrent Neural Networks for Speech Recognition», submitted to ICASSP 2019, Titouan Parcollet, Mohamed Morchid, Georges Linarès, Renato De Mori
- «The PyTorch-Kaldi Speech Recognition Toolkit», submitted to ICASSP 2019, Mirco Ravanelli, Titouan Parcollet, Yoshua Bengio
QNN with PyTorch and Keras:
PyTorch-Kaldi:

79 Thank you! Questions?
*Eve follows a rotation described by a unit quaternion around Wall-e
(Related resources: see slide 78.)

80 Quaternion numbers Hamilton product in neural networks

81 Quaternion convolution

82 Computations
W ⊗ X = (w_r x_r - w_x x_x - w_y x_y - w_z x_z)
      + (w_r x_x + w_x x_r + w_y x_z - w_z x_y)i
      + (w_r x_y - w_x x_z + w_y x_r + w_z x_x)j
      + (w_r x_z + w_x x_y - w_y x_x + w_z x_r)k
28 operations that should be computed in parallel with CUDA and GPUs

83 Quaternion equations
Matrix representation:
Q = [ r  -x  -y  -z ]
    [ x   r  -z   y ]
    [ y   z   r  -x ]
    [ z  -y   x   r ]
Conjugate: Q* = r1 - xi - yj - zk
Normalization: Q' = Q / sqrt(r² + x² + y² + z²)
Polar form: Q = |Q| e^(nθ) = |Q| (cos θ + n sin θ), with n = (xi + yj + zk) / (|Q| sin θ)
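A quick NumPy check (illustrative, not from the deck) that the 4x4 real matrix representation above reproduces the Hamilton product:

```python
import numpy as np

def hamilton(q1, q2):
    """Hamilton product of two quaternions given as (r, x, y, z) arrays."""
    r1, x1, y1, z1 = q1
    r2, x2, y2, z2 = q2
    return np.array([
        r1*r2 - x1*x2 - y1*y2 - z1*z2,
        r1*x2 + x1*r2 + y1*z2 - z1*y2,
        r1*y2 - x1*z2 + y1*r2 + z1*x2,
        r1*z2 + x1*y2 - y1*x2 + z1*r2,
    ])

def as_matrix(q):
    """4x4 real matrix representation of a quaternion (see above)."""
    r, x, y, z = q
    return np.array([[r, -x, -y, -z],
                     [x,  r, -z,  y],
                     [y,  z,  r, -x],
                     [z, -y,  x,  r]])

q1 = np.array([1.0, 2.0, 3.0, 4.0])
q2 = np.array([0.5, -1.0, 0.25, 2.0])
assert np.allclose(as_matrix(q1) @ q2, hamilton(q1, q2))
```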

84 Connectionist Temporal Classification [2]
Hannun, «Sequence Modeling with CTC», Distill, 2017.
[2] A. Graves, S. Fernández, F. Gomez, and J. Schmidhuber, «Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks», in Proceedings of the 23rd international conference on Machine learning. ACM, 2006.

85 Learning internal relations with QCNN
