Recurrent Neural Nets II
1 Recurrent Neural Nets II Steven Spielberg Pon Kumar, Tingke (Kevin) Shen Machine Learning Reading Group, Fall November, 2016
2 Outline 1 Introduction 2 Problem Formulations with RNNs 3 LSTM for Optimization 4 Seq2Seq Learning
3 Introduction Feed-Forward Neural Networks (NN)
4 Introduction Rolling NN over time
7 Introduction Computation Flow in RNN
19 Introduction RNN Representation
22 Problem Formulations with RNNs Time Series Prediction Given $x_{t-3}, x_{t-2}, x_{t-1}, x_t$, find $x_{t+1}$.
23 Problem Formulations with RNNs Time Series Prediction - Learning Error: $e = \sum_{i=1}^{n} \left(\hat{x}^{i}_{t+1} - x^{i}_{t+1}\right)^2$. Update the weights $\theta$ with the error $e$ using Backpropagation Through Time. E.g., weather forecasting, stock prediction.
24 Problem Formulations with RNNs Implementation
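The squared-error objective on this slide can be illustrated with a minimal NumPy sketch. It assumes a vanilla tanh RNN with a scalar input and a linear read-out; the weight names `W_xh`, `W_hh`, `W_hy`, the hidden size, and the sine-wave data are illustrative choices, not taken from the slides. Only the forward pass and the error are shown; training would backpropagate this error through time.

```python
import numpy as np

def rnn_predict_next(xs, W_xh, W_hh, W_hy, h0):
    """Run a vanilla RNN over the window xs and predict the next value."""
    h = h0
    for x in xs:
        # Hidden state update: mix the current scalar input with the previous state.
        h = np.tanh(W_xh * x + W_hh @ h)
    # Linear read-out from the final hidden state.
    return float(W_hy @ h)

def squared_error(preds, targets):
    """Sum over examples of (prediction - target)^2."""
    preds = np.asarray(preds, dtype=float)
    targets = np.asarray(targets, dtype=float)
    return float(np.sum((preds - targets) ** 2))

# Toy data (assumption): windows of 4 points from a sine wave, target is the next point.
rng = np.random.default_rng(0)
H = 8  # hypothetical hidden size
W_xh = rng.normal(size=H)
W_hh = 0.1 * rng.normal(size=(H, H))
W_hy = rng.normal(size=H)
series = np.sin(0.3 * np.arange(50))
windows = [series[t - 3:t + 1] for t in range(3, 49)]
targets = [series[t + 1] for t in range(3, 49)]
preds = [rnn_predict_next(w, W_xh, W_hh, W_hy, np.zeros(H)) for w in windows]
print("error with untrained weights:", squared_error(preds, targets))
```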
25 Problem Formulations with RNNs Sentence Classification - RNN Error: $e = -\sum_{i=1}^{n} y_i \log(\hat{y}_i)$. Update the weights $\theta$ with the error $e$ using Backpropagation Through Time.
26 Problem Formulations with RNNs Sentence Classification - Bidirectional RNN Error: $e = -\sum_{i=1}^{n} y_i \log(\hat{y}_i)$. Update the weights $\theta$ with the error $e$ using Backpropagation Through Time.
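As a small illustration of the classification error above, the following sketch computes the cross-entropy between a one-hot label and a softmax output. The function names and the `eps` guard against `log(0)` are assumptions made for this example; the slides give only the formula.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    z = np.asarray(z, dtype=float)
    z = z - z.max()
    p = np.exp(z)
    return p / p.sum()

def cross_entropy(y_onehot, y_hat, eps=1e-12):
    """e = -sum_i y_i * log(y_hat_i); eps avoids log(0)."""
    y_onehot = np.asarray(y_onehot, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    return float(-np.sum(y_onehot * np.log(y_hat + eps)))

# Hypothetical final-step scores for a 3-class sentence classifier.
scores = [2.0, 0.5, -1.0]
label = [1, 0, 0]
print(cross_entropy(label, softmax(scores)))
```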
27 Problem Formulations with RNNs Character Level RNN
29 Problem Formulations with RNNs Sampled Examples from Character level RNN
30 Problem Formulations with RNNs Dynamic Systems
31 Problem Formulations with RNNs Dynamic Systems Example
32 LSTM for Optimization Optimization with LSTM Gradient descent: $\theta_{t+1} = \theta_t + \alpha\, g(\nabla f(\theta))$, where $g(\nabla f(\theta))$ is a handcrafted update rule. Learning the gradient descent update rule: $\theta_{t+1} = \theta_t + g(\nabla f(\theta), \phi)$, where $\phi$ denotes the parameters of the LSTM.
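The two update rules can be contrasted in a short sketch. The handcrafted rule below is plain gradient descent, with g returning the negative gradient so that the update is added, matching the sign convention on the slide. The "learned" rule is only a hypothetical stand-in: a fixed coordinatewise function with two parameters `phi`, not an LSTM, and nothing here trains `phi`. The quadratic objective and all names are assumptions for illustration.

```python
import numpy as np

def grad_f(theta, W, y):
    """Gradient of the quadratic f(theta) = ||W theta - y||^2."""
    return 2.0 * W.T @ (W @ theta - y)

def handcrafted_update(grad, alpha=0.1):
    """alpha * g(grad) with the handcrafted choice g(grad) = -grad."""
    return alpha * (-grad)

def parametric_update(grad, phi):
    """Hypothetical stand-in for g(grad, phi); a real learned optimizer would be an LSTM."""
    scale, sharpness = phi
    return -scale * np.tanh(sharpness * grad)

def optimize(theta, W, y, update, steps=100):
    """Apply theta <- theta + update(grad f(theta)) for a fixed number of steps."""
    for _ in range(steps):
        theta = theta + update(grad_f(theta, W, y))
    return theta

W = np.eye(2)
y = np.array([1.0, 2.0])
theta_gd = optimize(np.zeros(2), W, y, handcrafted_update)
theta_param = optimize(np.zeros(2), W, y, lambda g: parametric_update(g, (0.05, 1.0)))
print(theta_gd, theta_param)
```

In the learned setting, `phi` would itself be fitted by gradient descent on the optimizee's loss along the optimization trajectory, which is where the talk's title comes from.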
33 LSTM for Optimization Learning to Learn Gradient Descent by Gradient Descent
40 LSTM for Optimization Learning to Learn Gradient Descent by Gradient Descent
[Figure 4 plots; legend: LSTM, LSTM+GAC, NTM-BFGS, ADAM, RMSprop, Rprop, Adadelta, Adagrad, SGD]
Figure 4: Comparisons between learned and hand-crafted optimizers performance. Learned optimizers are shown with solid lines and hand-crafted optimizers are shown with dashed lines. Units for the y axis in the MNIST plots are logits. Left: Performance of different optimizers on randomly sampled 10-dimensional quadratic functions. Center: the LSTM optimizer outperforms standard methods training the base network on MNIST. Right: Learning curves for steps by an optimizer trained to optimize for 100 steps (continuation of center plot).
41 LSTM for Optimization Learning to Learn Gradient Descent by Gradient Descent
Figure 5: Comparisons between learned and hand-crafted optimizers performance. Units for the y axis are logits. Left: Generalization to the different number of hidden units (40 instead of 20). Center: Generalization to the different number of hidden layers (2 instead of 1). This optimization problem is very hard, because the hidden layers are very narrow. Right: Training curves for an MLP with 20 hidden units using ReLU activations. The LSTM optimizer was trained on an MLP with sigmoid activations.
Figure 6: Examples of images styled using the LSTM optimizer. Each triple consists of the content image (left), style (right) and image generated by the LSTM optimizer (center). Left: The result of applying the training style at the training resolution to a test image. Right: The result of applying a new style to a test image at double the resolution on which the optimizer was trained.
... a learning rate (e.g. decay coefficients for ADAM) we use the default values from the optim package in Torch7. Initial values of all optimizee parameters were sampled from an IID Gaussian distribution.
42 Seq2Seq Learning Sequence to Sequence Learning We covered how to use an LSTM with fixed-length inputs and fixed-length outputs. Sequence to Sequence learning uses two RNNs to solve the general sequence-to-sequence problem, in which the input and output sequences may have different lengths. E.g., machine translation, chatbots, image captioning.
43 Seq2Seq Learning Machine Translation
48 Seq2Seq Learning Machine Translation Error: $e = -\sum_{i=1}^{n} \sum_{t=1}^{T} y^{i}_{t} \log(\hat{y}^{i}_{t})$. Update the weights $\theta$ with the error $e$ using Backpropagation Through Time.
49 Seq2Seq Learning GPU Implementation [Diagram: model spread across several GPUs (labels GPU3 to GPU6 visible), processing tokens A B C D] 80k softmax by 1000 dims: this is very big, so the softmax is split across 4 GPUs. LSTM cells: 2000 dims per timestep, x 4 = 8k dims per sentence. 160k vocab in input language.
56 Seq2Seq Learning Projection of the Encoder LSTM
[Paper excerpt] ... a sizeable margin, despite its inability to handle out-of-vocabulary words. The LSTM is within 0. BLEU points of the previous state of the art by rescoring the 1000-best list of the baseline system. 3.7 Performance on long sentences: We were surprised to discover that the LSTM did well on long sentences, which is shown quantitatively in figure 3. Table 3 presents several examples of long sentences and their translations. 3.8 Model Analysis
[Figure: two 2-dimensional PCA plots of phrases. One plot: "Mary admires John", "Mary is in love with John", "Mary respects John", "John admires Mary", "John is in love with Mary", "John respects Mary". Other plot: "I was given a card by her in the garden", "In the garden, she gave me a card", "She gave me a card in the garden", "In the garden, I gave her a card", "She was given a card by me in the garden", "I gave her a card in the garden".]
Figure 2: The figure shows a 2-dimensional PCA projection of the LSTM hidden states that are obtained after processing the phrases in the figures. The phrases are clustered by meaning, which in these examples i
57 Seq2Seq Learning Seq2Seq Application: Video to Text Input: [video]. Output: "A monkey is pulling a dog's tail and is chased by the dog." Supervised problem: given video and sentence pairs.
58 Seq2Seq Learning Why Describe Videos? Robotics applications (human-to-robot interaction); describing videos for the blind; video indexing; just because.
59 Seq2Seq Learning Alternative Models Holistic video representation: train classifiers to suggest subject, object, actions; combine objects/actions with a language model using a graphical model and real-world knowledge; pick the most probable (subject, action, object) triplet; insert the triplet into a sentence template. Fix the video sequence length; encode the video with a vanilla NN; output the sentence using an LSTM. Holistic video representation:
60 Seq2Seq Learning Recall: why Sequence to Sequence? The model sees the entire input sequence before starting to output. The output sequence length is not fixed to be equal to the input sequence length.
61 Seq2Seq Learning Model
62 Seq2Seq Learning Predictions We want to pick the most probable sequence of words. Option 1: generate sentences greedily, picking the word with the highest softmax probability at each time step. Option 2: use beam search, e.g. try the top 3 most probable words at each time step and keep only the top 3 most probable sequences so far.
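The difference between the two decoding strategies can be sketched in plain Python. The decoder is abstracted as a hypothetical function `step_probs(prefix)` that returns a dict of next-token probabilities; the toy distribution below is invented so that the greedy choice at the first step leads to a less probable full sentence than the one beam search finds.

```python
import math

def greedy_decode(step_probs, start, eos, max_len=10):
    """Pick the single most probable next token at every step."""
    seq = [start]
    while len(seq) < max_len and seq[-1] != eos:
        probs = step_probs(seq)
        seq.append(max(probs, key=probs.get))
    return seq

def beam_search(step_probs, start, eos, beam_width=3, max_len=10):
    """Keep the beam_width most probable partial sequences at every step."""
    beams = [(0.0, [start])]  # (log-probability, sequence)
    for _ in range(max_len - 1):
        candidates = []
        for logp, seq in beams:
            if seq[-1] == eos:
                candidates.append((logp, seq))  # finished hypotheses carry over
                continue
            probs = step_probs(seq)
            top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:beam_width]
            for tok, p in top:
                if p > 0:
                    candidates.append((logp + math.log(p), seq + [tok]))
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:beam_width]
        if all(seq[-1] == eos for _, seq in beams):
            break
    return beams[0][1]

def toy_step_probs(prefix):
    """Hypothetical decoder: next-token distribution depends on the last token only."""
    table = {
        "<s>": {"a": 0.6, "b": 0.4},
        "a": {"c": 0.5, "d": 0.5},
        "b": {"c": 0.9, "d": 0.1},
        "c": {"</s>": 1.0},
        "d": {"</s>": 1.0},
    }
    return table[prefix[-1]]

greedy = greedy_decode(toy_step_probs, "<s>", "</s>")
beam = beam_search(toy_step_probs, "<s>", "</s>", beam_width=2)
print(greedy, beam)
```

Here greedy decoding commits to "a" (probability 0.6) and ends with a sentence of probability 0.3, while a beam of width 2 keeps "b" alive and recovers the sentence with probability 0.36.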
63 Seq2Seq Learning Examples Demo: Paper, code, examples: vsub/s2vt.html
64 Seq2Seq Learning Seq2Seq Models and Attention Problem: basic seq2seq RNN models cannot handle very long sequences. Paper: "Neural Machine Translation by Jointly Learning to Align and Translate". Authors: Dzmitry Bahdanau, KyungHyun Cho, Yoshua Bengio.
65 Seq2Seq Learning Basic Seq2Seq Model $p(y_i \mid \{y_1, \dots, y_{i-1}\}, x) = g(y_{i-1}, s_i, c)$
66 Seq2Seq Learning Attention Seq2Seq Model $p(y_i \mid \{y_1, \dots, y_{i-1}\}, x) = g(y_{i-1}, s_i, c_i)$, with $c_i = \sum_{j=1}^{T} \alpha_{ij} h_j$
67 Seq2Seq Learning Encoder: Bidirectional RNN Two independent RNNs, one for each direction. The overall hidden state is the concatenation of the two independent hidden states. The hidden state at each time step contains information from the entire input. $h_j$ is influenced most by the inputs around $x_j$.
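A bidirectional encoder of this kind can be sketched with two plain tanh RNNs in NumPy. The cell type, weight names, and sizes are assumptions for illustration; the slides do not specify them. The backward RNN reads the sequence in reverse, and its states are flipped back so that row j of the result describes position j from both directions.

```python
import numpy as np

def run_rnn(xs, W_xh, W_hh):
    """Run a vanilla tanh RNN over xs (shape T x D) and return the list of hidden states."""
    h = np.zeros(W_hh.shape[0])
    states = []
    for x in xs:
        h = np.tanh(W_xh @ x + W_hh @ h)
        states.append(h)
    return states

def bidirectional_encode(xs, fwd_params, bwd_params):
    """Concatenate forward and backward hidden states at each time step (shape T x 2H)."""
    fwd = run_rnn(xs, *fwd_params)
    # Process the reversed sequence, then flip the states back into input order.
    bwd = run_rnn(xs[::-1], *bwd_params)[::-1]
    return np.stack([np.concatenate([f, b]) for f, b in zip(fwd, bwd)])

rng = np.random.default_rng(0)
T, D, H = 5, 3, 4  # hypothetical sequence length, input size, hidden size
xs = rng.normal(size=(T, D))
fwd_params = (rng.normal(size=(H, D)), 0.1 * rng.normal(size=(H, H)))
bwd_params = (rng.normal(size=(H, D)), 0.1 * rng.normal(size=(H, H)))
hiddens = bidirectional_encode(xs, fwd_params, bwd_params)
print(hiddens.shape)
```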
68 Seq2Seq Learning Decoder: Attention $c_i = \sum_{j=1}^{T} \alpha_{ij} h_j$, $\alpha_{ij} = \dfrac{\exp(e_{ij})}{\sum_{k=1}^{T} \exp(e_{ik})}$, $e_{ij} = a(s_{i-1}, h_j)$
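The three attention equations can be traced in a few lines of NumPy. The slide leaves the alignment model a unspecified; this sketch assumes the additive form $a(s_{i-1}, h_j) = v_a^\top \tanh(W_a s_{i-1} + U_a h_j)$, and the parameter names and sizes are illustrative.

```python
import numpy as np

def attention_context(s_prev, H, W_a, U_a, v_a):
    """Return the context vector c_i and attention weights alpha_i for one decoder step.

    s_prev: previous decoder state s_{i-1}, shape (S,)
    H:      encoder hidden states h_1..h_T stacked as rows, shape (T, Hd)
    """
    # e_ij = a(s_{i-1}, h_j); additive alignment model (an assumption of this sketch).
    e = np.tanh(s_prev @ W_a.T + H @ U_a.T) @ v_a
    # alpha_ij = softmax over the T input positions (shifted for numerical stability).
    e = e - e.max()
    alpha = np.exp(e) / np.exp(e).sum()
    # c_i = sum_j alpha_ij * h_j
    c = alpha @ H
    return c, alpha

rng = np.random.default_rng(0)
T, Hd, S, A = 5, 6, 4, 3  # hypothetical sizes
H = rng.normal(size=(T, Hd))
s_prev = rng.normal(size=S)
W_a = rng.normal(size=(A, S))
U_a = rng.normal(size=(A, Hd))
v_a = rng.normal(size=A)
c, alpha = attention_context(s_prev, H, W_a, U_a, v_a)
print(alpha.round(3), c.shape)
```

Each row of the attention visualization matrix corresponds to one such `alpha` vector, computed for one output word.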
69 Seq2Seq Learning Improvements
70 Seq2Seq Learning Attention Visualization The picture shows the matrix of $\alpha_{ij}$, where $\alpha_{ij}$ is the relevance of input word $j$ to output word $i$. White is 1, black is 0.
71 Seq2Seq Learning End Thank you!
More information10-701/15-781, Fall 2006, Final
-7/-78, Fall 6, Final Dec, :pm-8:pm There are 9 questions in this exam ( pages including this cover sheet). If you need more room to work out your answer to a question, use the back of the page and clearly
More informationDeep Learning and Its Applications
Convolutional Neural Network and Its Application in Image Recognition Oct 28, 2016 Outline 1 A Motivating Example 2 The Convolutional Neural Network (CNN) Model 3 Training the CNN Model 4 Issues and Recent
More informationMachine Learning. Deep Learning. Eric Xing (and Pengtao Xie) , Fall Lecture 8, October 6, Eric CMU,
Machine Learning 10-701, Fall 2015 Deep Learning Eric Xing (and Pengtao Xie) Lecture 8, October 6, 2015 Eric Xing @ CMU, 2015 1 A perennial challenge in computer vision: feature engineering SIFT Spin image
More informationBayesian model ensembling using meta-trained recurrent neural networks
Bayesian model ensembling using meta-trained recurrent neural networks Luca Ambrogioni l.ambrogioni@donders.ru.nl Umut Güçlü u.guclu@donders.ru.nl Yağmur Güçlütürk y.gucluturk@donders.ru.nl Julia Berezutskaya
More informationAn Empirical Evaluation of Deep Architectures on Problems with Many Factors of Variation
An Empirical Evaluation of Deep Architectures on Problems with Many Factors of Variation Hugo Larochelle, Dumitru Erhan, Aaron Courville, James Bergstra, and Yoshua Bengio Université de Montréal 13/06/2007
More informationNeural Network Optimization and Tuning / Spring 2018 / Recitation 3
Neural Network Optimization and Tuning 11-785 / Spring 2018 / Recitation 3 1 Logistics You will work through a Jupyter notebook that contains sample and starter code with explanations and comments throughout.
More informationBackpropagation + Deep Learning
10-601 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University Backpropagation + Deep Learning Matt Gormley Lecture 13 Mar 1, 2018 1 Reminders
More informationCS 224N: Assignment #1
Due date: assignment) 1/25 11:59 PM PST (You are allowed to use three (3) late days maximum for this These questions require thought, but do not require long answers. Please be as concise as possible.
More informationLecture 21 : A Hybrid: Deep Learning and Graphical Models
10-708: Probabilistic Graphical Models, Spring 2018 Lecture 21 : A Hybrid: Deep Learning and Graphical Models Lecturer: Kayhan Batmanghelich Scribes: Paul Liang, Anirudha Rayasam 1 Introduction and Motivation
More informationGrounded Compositional Semantics for Finding and Describing Images with Sentences
Grounded Compositional Semantics for Finding and Describing Images with Sentences R. Socher, A. Karpathy, V. Le,D. Manning, A Y. Ng - 2013 Ali Gharaee 1 Alireza Keshavarzi 2 1 Department of Computational
More informationGate-Variants of Gated Recurrent Unit (GRU) Neural Networks
Gate-Variants of Gated Recurrent Unit (GRU) Neural Networks Rahul Dey and Fathi M. Salem Circuits, Systems, and Neural Networks (CSANN) LAB Department of Electrical and Computer Engineering Michigan State
More informationDeep Learning on Graphs
Deep Learning on Graphs with Graph Convolutional Networks Hidden layer Hidden layer Input Output ReLU ReLU, 6 April 2017 joint work with Max Welling (University of Amsterdam) The success story of deep
More informationImage Captioning and Generation From Text
Image Captioning and Generation From Text Presented by: Tony Zhang, Jonathan Kenny, and Jeremy Bernstein Mentor: Stephan Zheng CS159 Advanced Topics in Machine Learning: Structured Prediction California
More informationCPSC 340: Machine Learning and Data Mining. Principal Component Analysis Fall 2017
CPSC 340: Machine Learning and Data Mining Principal Component Analysis Fall 2017 Assignment 3: 2 late days to hand in tonight. Admin Assignment 4: Due Friday of next week. Last Time: MAP Estimation MAP
More informationCUED-RNNLM An Open-Source Toolkit for Efficient Training and Evaluation of Recurrent Neural Network Language Models
CUED-RNNLM An Open-Source Toolkit for Efficient Training and Evaluation of Recurrent Neural Network Language Models Xie Chen, Xunying Liu, Yanmin Qian, Mark Gales and Phil Woodland April 1, 2016 Overview
More informationOpening the Black Box Data Driven Visualizaion of Neural N
Opening the Black Box Data Driven Visualizaion of Neural Networks September 20, 2006 Aritificial Neural Networks Limitations of ANNs Use of Visualization (ANNs) mimic the processes found in biological
More informationRecurrent Neural Networks. Nand Kishore, Audrey Huang, Rohan Batra
Recurrent Neural Networks Nand Kishore, Audrey Huang, Rohan Batra Roadmap Issues Motivation 1 Application 1: Sequence Level Training 2 Basic Structure 3 4 Variations 5 Application 3: Image Classification
More informationCIS 520, Machine Learning, Fall 2015: Assignment 7 Due: Mon, Nov 16, :59pm, PDF to Canvas [100 points]
CIS 520, Machine Learning, Fall 2015: Assignment 7 Due: Mon, Nov 16, 2015. 11:59pm, PDF to Canvas [100 points] Instructions. Please write up your responses to the following problems clearly and concisely.
More information