Recurrent Neural Network (RNN) Industrial AI Lab.

(Deterministic) Time Series Data: a closed-form description via a linear difference equation (LDE) and its initial conditions; high-order LDEs follow the same idea.
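
For instance, a generic second-order LDE with initial conditions (an illustrative example, not taken from the slide):

```latex
y_t = a_1\, y_{t-1} + a_2\, y_{t-2}, \qquad y_0,\ y_1 \ \text{given}
```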

(Stochastic) Time Series Data: stationary vs. non-stationary, where the mean and variance change over time.

Dealing with Non-Stationarity: linear trends.

Dealing with Non-Stationarity: non-linear trends.

Dealing with Non-Stationarity: seasonal trends.

Dealing with Non-Stationarity: model assumptions.

Markov Chain: under the Markovian property (assumption), the joint distribution can be factored into a series of conditional distributions, which makes computation of the joint distribution tractable.
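
In standard notation, the first-order Markov factorization reads:

```latex
p(x_1, x_2, \dots, x_T) = p(x_1)\prod_{t=2}^{T} p(x_t \mid x_{t-1})
```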

Hidden Markov Model (HMM): the true state (or hidden variable) follows a Markov chain, and each observation is emitted from a state. The question is state estimation; an HMM can do this, but with many difficulties.

Kalman Filter: a linear dynamical system of motion. But how do we obtain the matrices A, B, C?
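
For reference, the standard linear dynamical system form behind the Kalman filter, with Gaussian noise terms $w_t$ and $v_t$ (conventional notation, not spelled out on the slide):

```latex
x_{t+1} = A\,x_t + B\,u_t + w_t, \qquad y_t = C\,x_t + v_t
```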

Recurrent Neural Network (RNN): RNNs are a family of neural networks for processing sequential data. A feedforward network applied to sequential data needs separate parameters for each value of the time index, so it cannot share statistical strength across different time indices.

Recurrence Structure of RNN: order matters. It is possible to use the same transition function f with the same parameters at every time step.

Hidden State: a summary of the past sequence of inputs up to time t. It keeps some aspects of the past sequence with more precision than others, and the network learns the function f.
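
In the usual notation, consistent with the slide's shared transition function f:

```latex
h_t = f(h_{t-1},\, x_t;\, \theta)
```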

Deep Recurrent Networks: three blocks of parameters and their associated transformations: from the input to the hidden state (green to yellow), from the previous hidden state to the next hidden state (yellow to red), and from the hidden state to the output (red to blue). Each of these blocks can itself be made deep, in addition to the depth that unfolds through time.

Long-Term Dependencies (RNN with LSTM): gradients propagated over many stages tend to either vanish or explode. The difficulty with long-term dependencies arises from the exponentially smaller weights given to long-term interactions.

Long Short-Term Memory (LSTM): allows the network to accumulate information over a long duration; once that information has been used, it might be useful for the network to forget the old state.

Long Short-Term Memory (LSTM) Summary: connect LSTM cells in a recurrent manner, and train the parameters inside the LSTM cells.
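
For reference, the standard LSTM cell updates (not spelled out on the slides), where $\sigma$ is the logistic sigmoid and $\odot$ is elementwise multiplication:

```latex
\begin{aligned}
f_t &= \sigma(W_f [h_{t-1}, x_t] + b_f) && \text{forget gate}\\
i_t &= \sigma(W_i [h_{t-1}, x_t] + b_i) && \text{input gate}\\
\tilde{c}_t &= \tanh(W_c [h_{t-1}, x_t] + b_c) && \text{candidate state}\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{cell state}\\
o_t &= \sigma(W_o [h_{t-1}, x_t] + b_o) && \text{output gate}\\
h_t &= o_t \odot \tanh(c_t) && \text{hidden state}
\end{aligned}
```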

Time Series Data and RNN

RNN for Classification

RNN for Prediction

Prediction Example

From Image to Time Series Data

Prediction of MNIST

RNN with TensorFlow: an example of predicting the next piece of an image, framed as a regression problem. Import the libraries and load the MNIST data, downloading it from the TensorFlow tutorial example, as sketched below.
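
A minimal sketch of this setup, assuming TensorFlow 1.x (the slides use tf.train.AdamOptimizer, a 1.x API); the dimension names n_step, n_input, and n_output are illustrative choices consistent with the 14-rows-in, 1-row-out task described later:

```python
import numpy as np
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# Download MNIST data via the TensorFlow tutorial helper, as on the slide
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

# Illustrative dimensions: feed 14 rows of a 28 x 28 image,
# predict the next 1 x 28 row (a regression problem)
n_step = 14     # input rows = time steps
n_input = 28    # pixels per row
n_output = 28   # pixels in the predicted next row
```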

RNN Structure

LSTM Cells, Weights and Biases: there is no need to define the weights and biases of the LSTM cells themselves. For the fully connected output layer, define the parameters based on the predefined layer size and initialize them with a normal distribution with μ = 0 and σ = 0.01, as in the sketch below.
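
A sketch under the same assumptions; the LSTM cells create their own weights internally, so only the fully connected readout needs explicit variables (variable names here are illustrative):

```python
n_lstm1, n_lstm2 = 128, 256   # LSTM sizes given later on the slides

# Fully connected output layer, initialized from N(0, 0.01^2)
weights = {
    'out': tf.Variable(tf.random_normal([n_lstm2, n_output], stddev=0.01))
}
biases = {
    'out': tf.Variable(tf.random_normal([n_output], stddev=0.01))
}

# Placeholders for a batch of input sequences and target rows
x = tf.placeholder(tf.float32, [None, n_step, n_input])
y = tf.placeholder(tf.float32, [None, n_output])
```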

Build a Model: first, define the LSTM cells; second, compute the hidden state (h) and the LSTM cell state (c) from the predefined LSTM cells and the input.
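
A sketch of those two steps with stacked BasicLSTMCells; the variable names are illustrative, not taken from the slides:

```python
# First, define the LSTM cells (two stacked layers)
cells = tf.nn.rnn_cell.MultiRNNCell([
    tf.nn.rnn_cell.BasicLSTMCell(n_lstm1),
    tf.nn.rnn_cell.BasicLSTMCell(n_lstm2),
])

# Second, compute hidden states h (and internal cell states c)
# by unrolling the cells over the input sequence
outputs, states = tf.nn.dynamic_rnn(cells, x, dtype=tf.float32)

# Regress the next row from the hidden state at the last time step
last_h = outputs[:, -1, :]
pred = tf.matmul(last_h, weights['out']) + biases['out']
```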

Loss, Initializer and Optimizer: for regression, use a squared loss; the initializer initializes all the empty variables; for the optimizer, AdamOptimizer is among the most popular choices.
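
A sketch of the three pieces; the learning rate is an assumed value:

```python
# Regression: squared (mean squared error) loss
loss = tf.reduce_mean(tf.square(pred - y))

# Optimizer: Adam (learning rate assumed)
optm = tf.train.AdamOptimizer(learning_rate=0.001).minimize(loss)

# Initializer: initialize all the (so far empty) variables
init = tf.global_variables_initializer()
```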

Summary of Optimization Process

Iteration Configuration: define the parameters for training the RNN. n_iter is the number of training steps, and n_prt reports the loss every n_prt iterations.
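
For instance (values illustrative, not from the slides):

```python
n_iter = 2500   # number of training steps
n_prt = 250     # report the loss every n_prt iterations
n_batch = 50    # assumed mini-batch size
```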

Optimization: do not run this on a CPU; it will take quite a while. A sketch of the training loop follows.
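
A sketch of the loop, assuming each training batch uses the first 14 rows of an image as input and row 15 as the target:

```python
sess = tf.Session()
sess.run(init)

for i in range(n_iter):
    batch_x, _ = mnist.train.next_batch(n_batch)   # labels unused
    imgs = batch_x.reshape(-1, 28, 28)
    train_x = imgs[:, :n_step, :]   # rows 0..13 as the input sequence
    train_y = imgs[:, n_step, :]    # row 14 as the regression target
    sess.run(optm, feed_dict={x: train_x, y: train_y})
    if i % n_prt == 0:
        c = sess.run(loss, feed_dict={x: train_x, y: train_y})
        print("iter {:4d} : loss {:.4f}".format(i, c))
```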

Test or Evaluation: do not run this on a CPU either; it will take quite a while. Predicting an MNIST image: an MNIST image is 28 x 28, and the model predicts one 1 x 28 row at a time. First, the upper 14 x 28 half is fed into the model; the model then predicts the lower 14 x 28 half recursively, row by row, as sketched below.
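
A sketch of that recursive roll-out (variable names illustrative):

```python
# Start from the known upper half (14 x 28) of a test image
test_img = mnist.test.images[0].reshape(28, 28)
gen = test_img[:n_step, :].copy()

# Repeatedly feed the most recent 14 rows and append the predicted row
for _ in range(28 - n_step):
    feed = gen[-n_step:, :].reshape(1, n_step, n_input)
    next_row = sess.run(pred, feed_dict={x: feed})   # shape (1, 28)
    gen = np.vstack([gen, next_row])

# 'gen' is now 28 x 28: the true upper half plus the predicted lower half
```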

Load a Pre-trained Model: we trained the model on a GPU for you. You can load the pre-trained model to see the RNN MNIST results. LSTM sizes: n_lstm1 = 128, n_lstm2 = 256.
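
A sketch of restoring a saved model with tf.train.Saver; the checkpoint path below is hypothetical, since the actual filename is not given on the slide:

```python
saver = tf.train.Saver()
saver.restore(sess, "./checkpoints/rnn_mnist.ckpt")  # assumed path
```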

Test with the Pre-trained Model
