Slide credit from Hung-Yi Lee & Richard Socher

1 Slide credit from Hung-Yi Lee & Richard Socher 1

2 Review Word Vector 2

3 Word2Vec Variants Skip-gram: predicting surrounding words given the target word (Mikolov+, 2013) CBOW (continuous bag-of-words): predicting the target word given the surrounding words (Mikolov+, 2013) LM (Language modeling): predicting the next words given the preceding contexts (Mikolov+, 2013) Mikolov et al., Efficient estimation of word representations in vector space, in ICLR Workshop, 2013. Mikolov et al., Linguistic regularities in continuous space word representations, in NAACL HLT, 2013.

4 Word2Vec LM Goal: predicting the next words given the preceding contexts 4

5 Outline Language Modeling N-gram Language Model Feed-Forward Neural Language Model Recurrent Neural Network Language Model (RNNLM) Recurrent Neural Network Definition Training via Backpropagation through Time (BPTT) Training Issue Applications Sequential Input Sequential Output Aligned Sequential Pairs (Tagging) Unaligned Sequential Pairs (Seq2Seq/Encoder-Decoder) 5

6 Outline Language Modeling N-gram Language Model Feed-Forward Neural Language Model Recurrent Neural Network Language Model (RNNLM) Recurrent Neural Network Definition Training via Backpropagation through Time (BPTT) Training Issue Applications Sequential Input Sequential Output Aligned Sequential Pairs (Tagging) Unaligned Sequential Pairs (Seq2Seq/Encoder-Decoder) 6

7 Language Modeling Goal: estimate the probability of a word sequence Example task: determine whether a sequence is grammatical or makes more sense: "recognize speech" or "wreck a nice beach" If P(recognize speech) > P(wreck a nice beach), output = "recognize speech" 7

8 Outline Language Modeling N-gram Language Model Feed-Forward Neural Language Model Recurrent Neural Network Language Model (RNNLM) Recurrent Neural Network Definition Training via Backpropagation through Time (BPTT) Training Issue Applications Sequential Input Sequential Output Aligned Sequential Pairs (Tagging) Unaligned Sequential Pairs (Seq2Seq/Encoder-Decoder) 8

9 N-Gram Language Modeling Goal: estimate the probability of a word sequence N-gram language model: the probability is conditioned on a window of (n-1) previous words Estimate the probability from the training data: P(beach | nice) = C(nice beach) / C(nice), where C(nice beach) is the count of "nice beach" in the training data and C(nice) is the count of "nice" Issue: some sequences may not appear in the training data 9
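
A minimal sketch (not from the slides) of the count-based estimate above; the toy corpus and function name are invented for illustration.

```python
from collections import Counter

corpus = ["the dog ran", "the cat jumped"]          # toy training data
tokens = [s.split() for s in corpus]

unigram = Counter(w for sent in tokens for w in sent)
bigram = Counter((a, b) for sent in tokens for a, b in zip(sent, sent[1:]))

def p_mle(word, prev):
    """P(word | prev) = C(prev word) / C(prev), the maximum-likelihood estimate."""
    return bigram[(prev, word)] / unigram[prev] if unigram[prev] else 0.0

print(p_mle("ran", "dog"))     # 1.0 -- "dog ran" seen once, "dog" seen once
print(p_mle("jumped", "dog"))  # 0.0 -- never seen: the issue on the next slide
```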

10 N-Gram Language Modeling Training data: "The dog ran", "The cat jumped" P(jumped | dog) = 0, P(ran | cat) = 0 → give some small probability (smoothing) The probability is not accurate. The phenomenon happens because we cannot collect all the possible text in the world as training data. 10
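
One common remedy is add-one (Laplace) smoothing; a small sketch under the assumption of a fixed vocabulary — the slides do not prescribe a particular smoothing method, so the choice here is illustrative.

```python
from collections import Counter

unigram = Counter({"the": 2, "dog": 1, "cat": 1, "ran": 1, "jumped": 1})
bigram = Counter({("the", "dog"): 1, ("dog", "ran"): 1,
                  ("the", "cat"): 1, ("cat", "jumped"): 1})

def p_add_one(word, prev, vocab_size):
    """Add-one smoothing: every unseen bigram gets a small non-zero probability."""
    return (bigram[(prev, word)] + 1) / (unigram[prev] + vocab_size)

print(p_add_one("jumped", "dog", vocab_size=len(unigram)))  # 1/6, no longer zero
```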

11 Outline Language Modeling N-gram Language Model Feed-Forward Neural Language Model Recurrent Neural Network Language Model (RNNLM) Recurrent Neural Network Definition Training via Backpropagation through Time (BPTT) Training Issue Applications Sequential Input Sequential Output Aligned Sequential Pairs (Tagging) Unaligned Sequential Pairs (Seq2Seq/Encoder-Decoder) 11

12 Neural Language Modeling Idea: estimate the probability not from counts, but from the NN prediction P("wreck a nice beach") = P(wreck | START) P(a | wreck) P(nice | a) P(beach | nice) (figure: four copies of the neural network, each taking the vector of START / wreck / a / nice as input and outputting P(next word is "wreck"), P(next word is "a"), P(next word is "nice"), P(next word is "beach")) 12
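
To make the product concrete, a hypothetical sketch that scores a sentence by multiplying per-step predictions from any next-word model; `next_word_prob` is a stand-in for the neural network, not code from the slides.

```python
import math

def sentence_log_prob(words, next_word_prob):
    """log P(w_1 ... w_T) = sum_t log P(w_t | history), history starts at <START>."""
    history, total = ["<START>"], 0.0
    for w in words:
        total += math.log(next_word_prob(w, history))  # model predicts P(next word | context)
        history.append(w)
    return total

# Toy stand-in model: uniform over a 10-word vocabulary.
uniform = lambda word, history: 0.1
print(sentence_log_prob("wreck a nice beach".split(), uniform))  # 4 * log(0.1)
```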

13 Neural Language Modeling (figure: input word vectors / context vector → hidden layer → output: probability distribution of the next word) Issue: fixed context window for conditioning Bengio et al., A Neural Probabilistic Language Model, in JMLR, 2003.
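
A rough numpy sketch of the fixed-window feed-forward language model in the spirit of Bengio et al. (2003): concatenate the embeddings of the previous n-1 words, pass them through one hidden layer, and output a softmax over the vocabulary. Dimensions, initialization, and names are arbitrary assumptions.

```python
import numpy as np

V, d, h, context = 1000, 64, 128, 3          # vocab size, embedding dim, hidden dim, window
rng = np.random.default_rng(0)
E = rng.normal(scale=0.1, size=(V, d))       # word embeddings
W1 = rng.normal(scale=0.1, size=(context * d, h))
W2 = rng.normal(scale=0.1, size=(h, V))

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def next_word_dist(prev_word_ids):
    """P(next word | fixed window of previous words)."""
    x = E[prev_word_ids].reshape(-1)          # concatenate the context embeddings
    hidden = np.tanh(x @ W1)
    return softmax(hidden @ W2)

p = next_word_dist([5, 42, 7])               # the window is fixed: exactly `context` words
print(p.shape, p.sum())                       # (1000,) 1.0
```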

14 Neural Language Modeling The input-layer (or hidden-layer) representations of related words are close (figure: dog, cat, rabbit nearby in the (h1, h2) space) If P(jump | dog) is large, P(jump | cat) increases accordingly (even if "cat jump" never appears in the data) Smoothing is automatically done 14

15 Outline Language Modeling N-gram Language Model Feed-Forward Neural Language Model Recurrent Neural Network Language Model (RNNLM) Recurrent Neural Network Definition Training via Backpropagation through Time (BPTT) Training Issue Applications Sequential Input Sequential Output Aligned Sequential Pairs (Tagging) Unaligned Sequential Pairs (Seq2Seq/Encoder-Decoder) 15

16 Recurrent Neural Network Idea: condition the neural network on all previous words and tie the weights at each time step Assumption: temporal information matters 16

17 RNN Language Modeling (figure: at each step the input is the vector of the current word — START, wreck, a, nice — plus the context carried in the hidden layer; the output is the probability distribution of the next word: P(next word is "wreck"), P(next word is "a"), P(next word is "nice"), P(next word is "beach")) Idea: pass the information from the previous hidden layer to leverage all contexts 17

18 Outline Language Modeling N-gram Language Model Feed-Forward Neural Language Model Recurrent Neural Network Language Model (RNNLM) Recurrent Neural Network Definition Training via Backpropagation through Time (BPTT) Training Issue Applications Sequential Input Sequential Output Aligned Sequential Pairs (Tagging) Unaligned Sequential Pairs (Seq2Seq/Encoder-Decoder) 18

19 RNNLM Formulation At each time step, the input is the vector of the current word and the output is the probability distribution of the next word 19
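
The slide's equations did not survive the transcription. A common way to write the RNNLM, using the x_t / s_t / o_t notation of the later BPTT slides (a reconstruction; the exact parameterization on the original slide may differ):

```latex
\begin{aligned}
s_t &= \sigma\left(W s_{t-1} + U x_t\right) && \text{hidden state, } \sigma \in \{\tanh, \mathrm{ReLU}\}\\
o_t &= \operatorname{softmax}\left(V s_t\right) && \text{distribution over the next word}\\
P(w_{t+1} = j \mid w_1, \dots, w_t) &= o_{t,j} && \text{training minimizes } -\textstyle\sum_t \log o_{t,\,w_{t+1}}
\end{aligned}
```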

20 Outline Language Modeling N-gram Language Model Feed-Forward Neural Language Model Recurrent Neural Network Language Model (RNNLM) Recurrent Neural Network Definition Training via Backpropagation through Time (BPTT) Training Issue Applications Sequential Input Sequential Output Aligned Sequential Pairs (Tagging) Unaligned Sequential Pairs (Seq2Seq/Encoder-Decoder) 20

21 Recurrent Neural Network Definition s_t = σ(W s_{t-1} + U x_t), o_t = softmax(V s_t); σ: tanh, ReLU 21
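
A minimal numpy sketch of one recurrent step matching the definition above; weight shapes and names are illustrative assumptions.

```python
import numpy as np

d_in, d_hid = 50, 100
rng = np.random.default_rng(0)
U = rng.normal(scale=0.1, size=(d_hid, d_in))    # input -> hidden
W = rng.normal(scale=0.1, size=(d_hid, d_hid))   # hidden -> hidden (tied across time)

def rnn_step(s_prev, x_t):
    """s_t = tanh(W s_{t-1} + U x_t); ReLU would work the same way."""
    return np.tanh(W @ s_prev + U @ x_t)

s = np.zeros(d_hid)                               # init state
for x_t in rng.normal(size=(8, d_in)):            # a sequence of 8 input vectors
    s = rnn_step(s, x_t)
print(s.shape)                                     # (100,)
```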

22 Model Training All model parameters can be updated by backpropagating the per-step errors between the predicted outputs and the targets y_{t-1}, y_t, y_{t+1} (the actual next words) 22

23 Outline Language Modeling N-gram Language Model Feed-Forward Neural Language Model Recurrent Neural Network Language Model (RNNLM) Recurrent Neural Network Definition Training via Backpropagation through Time (BPTT) Training Issue Applications Sequential Input Sequential Output Aligned Sequential Pairs (Tagging) Unaligned Sequential Pairs (Seq2Seq/Encoder-Decoder) 23

24 Backpropagation (figure: the forward pass carries activations a_j from layer l-1 through the weights w_ij^l into layer l; the backward pass propagates the error signal δ_i^l in the opposite direction) 24

25 Backpropagation (figure: the error signals δ^l at layer l are obtained from the output-layer derivatives ∂C/∂y_1, ..., ∂C/∂y_n by propagating backward through the transposed weight matrices and the layer activations z) 25

26 Backpropagation through Time (BPTT) Unfold the recurrent network through time Input: init, x_1, x_2, ..., x_t; Output: o_t; Target: y_t (figure: the unfolded network init → s_1 → ... → s_{t-2} → s_{t-1} → s_t → o_t, with the cost C computed from o_t and y_t) 26

27 Backpropagation through Time (BPTT) Unfold Input: init, x_1, x_2, ..., x_t; Output: o_t; Target: y_t (figure: the gradient of the cost C flows from the output back into the last hidden layer s_t) 27

28 Backpropagation through Time (BPTT) Unfold Input: init, x_1, x_2, ..., x_t; Output: o_t; Target: y_t (figure: the gradient keeps propagating backward through s_{t-1}, s_{t-2}, ..., s_1 down to the initial state) 28

29 Backpropagation through Time (BPTT) Unfold Input: init, x_1, x_2, ..., x_t; Output: o_t; Target: y_t (figure: every unfolded copy points to the same memory and the same parameters) Weights are tied together 29

30 Backpropagation through Time (BPTT) Unfold Input: init, x_1, x_2, ..., x_t; Output: o_t; Target: y_t (figure: the unfolded copies at every time step share one set of weight parameters) Weights are tied together 30

31 BPTT Forward Pass: compute s_1, s_2, s_3, s_4 (and the outputs o_1, ..., o_4) Backward Pass: compute the gradients for C^(4), C^(3), C^(2), C^(1) (figure: unfolded network with inputs x_1..x_4, hidden states s_1..s_4, outputs o_1..o_4, targets y_1..y_4, and per-step costs C^(1)..C^(4)) 31
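
A numpy sketch of the forward/backward passes described here, for a vanilla RNN LM with tanh hidden units and a softmax output. It is a didactic reconstruction under assumed shapes and names, not code from the lecture.

```python
import numpy as np

def bptt(xs, ys, U, W, V):
    """Forward pass over the whole sequence, then backprop through time.

    xs: list of input vectors, ys: list of target word ids (the next words).
    Returns the total cross-entropy loss and gradients for U, W, V.
    """
    T, d_hid = len(xs), W.shape[0]
    s = [np.zeros(d_hid)]                     # s[0] is the initial state
    os, loss = [], 0.0
    for t in range(T):                        # forward: compute s_1..s_T and o_1..o_T
        s.append(np.tanh(U @ xs[t] + W @ s[t]))
        z = V @ s[t + 1]
        o = np.exp(z - z.max()); o /= o.sum()
        os.append(o)
        loss -= np.log(o[ys[t]])              # cross-entropy against the target next word

    dU, dW, dV = np.zeros_like(U), np.zeros_like(W), np.zeros_like(V)
    ds_next = np.zeros(d_hid)
    for t in reversed(range(T)):              # backward: from C^(T) down to C^(1)
        dz_out = os[t].copy(); dz_out[ys[t]] -= 1.0   # d loss / d output logits
        dV += np.outer(dz_out, s[t + 1])
        ds = V.T @ dz_out + ds_next                   # gradient flowing into s_t
        dz = ds * (1.0 - s[t + 1] ** 2)               # through tanh
        dU += np.outer(dz, xs[t])
        dW += np.outer(dz, s[t])                      # same (tied) W at every step
        ds_next = W.T @ dz
    return loss, dU, dW, dV

rng = np.random.default_rng(0)
d_in, d_hid, vocab = 8, 16, 20
U = rng.normal(scale=0.1, size=(d_hid, d_in))
W = rng.normal(scale=0.1, size=(d_hid, d_hid))
V = rng.normal(scale=0.1, size=(vocab, d_hid))
xs = [rng.normal(size=d_in) for _ in range(4)]
ys = [3, 7, 1, 12]
loss, dU, dW, dV = bptt(xs, ys, U, W, V)
print(loss, dW.shape)
```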

32 Outline Language Modeling N-gram Language Model Feed-Forward Neural Language Model Recurrent Neural Network Language Model (RNNLM) Recurrent Neural Network Definition Training via Backpropagation through Time (BPTT) Training Issue Applications Sequential Input Sequential Output Aligned Sequential Pairs (Tagging) Unaligned Sequential Pairs (Seq2Seq/Encoder-Decoder) 32

33 RNN Training Issue The gradient is a product of Jacobian matrices, each associated with a step in the forward computation Multiplying by the same matrix at each time step during backprop, the gradient becomes very small or very large quickly: vanishing or exploding gradient Bengio et al., Learning long-term dependencies with gradient descent is difficult, IEEE Trans. on Neural Networks, 1994. [link] Pascanu et al., On the difficulty of training recurrent neural networks, in ICML, 2013. [link] 33
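
Spelled out, the product-of-Jacobians statement (notation assumed here, in the spirit of Pascanu et al., 2013):

```latex
\frac{\partial C_t}{\partial s_k}
  = \frac{\partial C_t}{\partial s_t}
    \prod_{i=k+1}^{t} \frac{\partial s_i}{\partial s_{i-1}},
\qquad
\frac{\partial s_i}{\partial s_{i-1}}
  = \operatorname{diag}\!\left(\sigma'(z_i)\right) W,
\quad z_i = W s_{i-1} + U x_i .
```

Because the same W enters every factor, the norm of this product tends to shrink (vanish) or grow (explode) exponentially in t - k.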

34 Rough Error Surface (figure: the cost surface over two weights w_1 and w_2) The error surface is either very flat or very steep Bengio et al., Learning long-term dependencies with gradient descent is difficult, IEEE Trans. on Neural Networks, 1994. [link] Pascanu et al., On the difficulty of training recurrent neural networks, in ICML, 2013. [link] 34

35 Vanishing/Exploding Gradient Example (figure: example gradient magnitudes after 2 steps, 5 steps, 20 steps, and 50 steps)
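
A tiny numeric illustration (invented, not from the slides) of why repeatedly multiplying by the same matrix makes gradients vanish or explode:

```python
import numpy as np

rng = np.random.default_rng(0)
for scale in (0.5, 1.5):                       # largest eigenvalue below vs. above 1
    W = scale * np.eye(4)                      # simplest possible recurrent Jacobian
    g = np.ones(4)                             # some gradient arriving at the last step
    for steps in (2, 5, 20, 50):
        print(scale, steps, np.linalg.norm(np.linalg.matrix_power(W, steps) @ g))
# scale 0.5: the norm shrinks toward 0 (vanishing); scale 1.5: it blows up (exploding)
```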

36 Outline Language Modeling N-gram Language Model Feed-Forward Neural Language Model Recurrent Neural Network Language Model (RNNLM) Recurrent Neural Network Definition Training via Backpropagation through Time (BPTT) Training Issue Applications Sequential Input Sequential Output Aligned Sequential Pairs (Tagging) Unaligned Sequential Pairs (Seq2Seq/Encoder-Decoder) 36

37 How to Frame the Learning Problem? The learning algorithm f maps the input domain X to the output domain Y: f : X → Y Input domain: word, word sequence, audio signal, click logs Output domain: single label, sequence tags, tree structure, probability distribution Network design should leverage input and output domain properties 37

38 Outline Language Modeling N-gram Language Model Feed-Forward Neural Language Model Recurrent Neural Network Language Model (RNNLM) Recurrent Neural Network Definition Training via Backpropagation through Time (BPTT) Training Issue Applications Sequential Input Sequential Output Aligned Sequential Pairs (Tagging) Unaligned Sequential Pairs (Seq2Seq/Encoder-Decoder) 38

39 Input Domain Sequence Modeling Idea: aggregate the meaning from all words into a vector Method: Basic combination: average, sum Neural combination: Recursive neural network (RvNN), Recurrent neural network (RNN), Convolutional neural network (CNN) How to compute the N-dim vector for 這 (this) 規格 (specification) 有 (have) 誠意 (sincerity)? 39
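
A sketch of the "basic combination" option: average the word vectors into one N-dim sentence vector. The toy embedding table is invented for illustration.

```python
import numpy as np

N = 4
embeddings = {                                # toy word vectors, N-dim each
    "這": np.array([0.1, 0.0, 0.2, 0.3]),
    "規格": np.array([0.5, 0.1, 0.0, 0.2]),
    "有": np.array([0.0, 0.3, 0.1, 0.0]),
    "誠意": np.array([0.2, 0.4, 0.3, 0.1]),
}

sentence = ["這", "規格", "有", "誠意"]
sentence_vector = np.mean([embeddings[w] for w in sentence], axis=0)  # average combination
print(sentence_vector)                         # one N-dim vector for the whole sentence
```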

40 Sentiment Analysis Encode the sequential input into a vector using an RNN (figure: the words of 這規格有誠意 "this spec has sincerity" enter as inputs x_1, x_2, ...; the final hidden state h_4 is fed to a classifier producing outputs y_1, ..., y_M) The RNN considers temporal information to learn sentence vectors as the input of classification tasks 40
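
A hedged numpy sketch of the idea on this slide: run an RNN over the word vectors and feed the final hidden state to a softmax classifier. All shapes, weights, and the two-class setup are illustrative assumptions.

```python
import numpy as np

d_in, d_hid, n_classes = 16, 32, 2             # e.g. positive / negative sentiment
rng = np.random.default_rng(0)
U = rng.normal(scale=0.1, size=(d_hid, d_in))
W = rng.normal(scale=0.1, size=(d_hid, d_hid))
C = rng.normal(scale=0.1, size=(n_classes, d_hid))

def encode(word_vectors):
    """Return the last hidden state as the sentence representation."""
    h = np.zeros(d_hid)
    for x in word_vectors:
        h = np.tanh(U @ x + W @ h)
    return h

words = rng.normal(size=(4, d_in))             # vectors for 這 / 規格 / 有 / 誠意
logits = C @ encode(words)
probs = np.exp(logits - logits.max()); probs /= probs.sum()
print(probs)                                    # class distribution for the sentence
```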

41 Outline Language Modeling N-gram Language Model Feed-Forward Neural Language Model Recurrent Neural Network Language Model (RNNLM) Recurrent Neural Network Definition Training via Backpropagation through Time (BPTT) Training Issue Applications Sequential Input Sequential Output Aligned Sequential Pairs (Tagging) Unaligned Sequential Pairs (Seq2Seq/Encoder-Decoder) 41

42 Output Domain Sequence Prediction POS Tagging: 推薦我台大後門的餐廳 ("recommend me a restaurant by NTU's back gate") → 推薦/VV 我/PN 台大/NR 後門/NN 的/DEG 餐廳/NN; Speech Recognition: (audio) → 大家好 ("hello everyone"); Machine Translation: How are you doing today? → 你好嗎? The output can be viewed as a sequence of classifications 42

43 Outline Language Modeling N-gram Language Model Feed-Forward Neural Language Model Recurrent Neural Network Language Model (RNNLM) Recurrent Neural Network Definition Training via Backpropagation through Time (BPTT) Training Issue Applications Sequential Input Sequential Output Aligned Sequential Pairs (Tagging) Unaligned Sequential Pairs (Seq2Seq/Encoder-Decoder) 43

44 POS Tagging Tag a word at each time step Input: word sequence Output: corresponding POS tag sequence (example: 四樓好專業 "the fourth floor is very professional", tagged with N, VA, AD) 44
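
For aligned tagging, the same recurrence is used but a tag distribution is emitted at every time step instead of only at the end; a rough sketch with invented shapes and untrained weights.

```python
import numpy as np

d_in, d_hid, n_tags = 16, 32, 5                # e.g. N, VA, AD, ...
rng = np.random.default_rng(0)
U = rng.normal(scale=0.1, size=(d_hid, d_in))
W = rng.normal(scale=0.1, size=(d_hid, d_hid))
V = rng.normal(scale=0.1, size=(n_tags, d_hid))

def tag_sequence(word_vectors):
    """One output per input word: aligned sequence labeling."""
    h, tags = np.zeros(d_hid), []
    for x in word_vectors:
        h = np.tanh(U @ x + W @ h)
        tags.append(int(np.argmax(V @ h)))     # predicted tag id at this time step
    return tags

print(tag_sequence(rng.normal(size=(3, d_in))))  # one tag id per word, e.g. for 四樓 / 好 / 專業
```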

45 Natural Language Understanding (NLU) Tag a word at each time step Input: word sequence Output: IOB-format slot tags and an intent tag Example: <START> just sent to bob about fishing this weekend <END> → slot tags O O O O O O B-contact_name B-subject I-subject I-subject; intent: send_ (contact_name = bob, subject = fishing this weekend) Temporal orders for input and output are the same 45

46 Outline Language Modeling N-gram Language Model Feed-Forward Neural Language Model Recurrent Neural Network Language Model (RNNLM) Recurrent Neural Network Definition Training via Backpropagation through Time (BPTT) Training Issue Applications Sequential Input Sequential Output Aligned Sequential Pairs (Tagging) Unaligned Sequential Pairs (Seq2Seq/Encoder-Decoder) 46

47 Machine Translation Cascade two RNNs, one for encoding and one for decoding Input: word sequences in the source language Output: word sequences in the target language (figure: encoder RNN → decoder RNN, example output 超棒的醬汁 "awesome sauce") 47
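
A simplified numpy sketch of cascading two RNNs: the encoder's final state initializes the decoder, which emits target words greedily until an end symbol or a length limit. The greedy decoding, parameter names, and shapes are all assumptions for illustration.

```python
import numpy as np

d_emb, d_hid, vocab_tgt, max_len, EOS = 16, 32, 50, 10, 0
rng = np.random.default_rng(0)
U_e = rng.normal(scale=0.1, size=(d_hid, d_emb))        # encoder input weights
W_e = rng.normal(scale=0.1, size=(d_hid, d_hid))        # encoder recurrent weights
E_t = rng.normal(scale=0.1, size=(vocab_tgt, d_emb))    # target-side embeddings
U_d = rng.normal(scale=0.1, size=(d_hid, d_emb))        # decoder input weights
W_d = rng.normal(scale=0.1, size=(d_hid, d_hid))        # decoder recurrent weights
V_d = rng.normal(scale=0.1, size=(vocab_tgt, d_hid))    # decoder output weights

def encode(source_vectors):
    h = np.zeros(d_hid)
    for x in source_vectors:                   # encoder RNN reads the source sentence
        h = np.tanh(U_e @ x + W_e @ h)
    return h

def decode(h):
    out, prev = [], EOS                        # start from an end/start symbol
    for _ in range(max_len):                   # decoder RNN generates the target greedily
        h = np.tanh(U_d @ E_t[prev] + W_d @ h)
        prev = int(np.argmax(V_d @ h))
        if prev == EOS:
            break
        out.append(prev)
    return out

print(decode(encode(rng.normal(size=(5, d_emb)))))  # target word ids, e.g. for 超棒的醬汁
```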

48 Chit-Chat Dialogue Modeling Cascade two RNNs, one for encoding and one for decoding Input: word sequences in the question Output: word sequences in the response Temporal ordering for input and output may be different 48

49 Concluding Remarks Language Modeling RNNLM Recurrent Neural Networks Definition Backpropagation through Time (BPTT) Vanishing/Exploding Gradient Applications Sequential Input: Sequence-Level Embedding Sequential Output: Tagging / Seq2Seq (Encoder-Decoder) 49
