DEEP LEARNING REVIEW. Yann LeCun, Yoshua Bengio & Geoffrey Hinton, Nature 2015. Presented by Divya Chitimalla

What is deep learning? Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change the internal parameters that are used to compute the representation in each layer from the representation in the previous layer. A key aspect of deep learning is that these layers of features are not designed by human engineers: they are learned from data using a general-purpose learning procedure. This has produced breakthroughs in processing images, video, speech and audio.

Conventional ML vs. deep learning. Conventional machine-learning (ML) techniques are limited in their ability to process natural data in their raw form. Building such systems requires careful engineering and considerable domain expertise to design a feature extractor that transforms the raw data into a suitable internal representation or feature vector. Representation learning is a set of methods that allows a machine to be fed with raw data and to automatically discover the representations needed for detection or classification. Deep-learning methods are representation-learning methods with multiple levels of representation, obtained by composing simple but non-linear modules that each transform the representation at one level (starting with the raw input) into a representation at a higher, slightly more abstract level. With the composition of enough such transformations, very complex functions can be learned. For classification tasks, higher layers of representation amplify aspects of the input that are important for discrimination and suppress irrelevant variations.
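A minimal sketch of this idea of composing simple non-linear modules, assuming NumPy and made-up layer sizes (784 → 256 → 64 → 10; none of these numbers come from the paper):

```python
import numpy as np

def relu(z):
    # Simple element-wise non-linearity
    return np.maximum(z, 0.0)

# Hypothetical sizes: a raw input of 784 values -> 256 -> 64 -> 10 output scores
rng = np.random.default_rng(0)
W1, b1 = rng.normal(scale=0.01, size=(256, 784)), np.zeros(256)
W2, b2 = rng.normal(scale=0.01, size=(64, 256)), np.zeros(64)
W3, b3 = rng.normal(scale=0.01, size=(10, 64)), np.zeros(10)

x = rng.normal(size=784)        # raw input (e.g. pixel intensities)
h1 = relu(W1 @ x + b1)          # first level of representation
h2 = relu(W2 @ h1 + b2)         # higher, slightly more abstract level
scores = W3 @ h2 + b3           # output scores for 10 classes
```

Each line is one simple non-linear module; stacking them is what lets the composed function become very complex.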

Supervised learning. This is the most common form of machine learning. The system is trained on labelled examples: during training it computes an objective function that measures the error (or distance) between the output scores and the desired pattern of scores, and then modifies its internal adjustable parameters to reduce this error. In a typical deep-learning system there are millions of these adjustable weights, and hundreds of millions of labelled examples with which to train the machine. To properly adjust the weight vector, the learning algorithm computes a gradient vector that, for each weight, indicates by what amount the error would increase or decrease if the weight were increased by a tiny amount. The weight vector is then adjusted in the opposite direction to the gradient vector. Stochastic gradient descent (SGD) consists of showing the input vectors for a few examples, computing the outputs and the errors, computing the average gradient for those examples, and adjusting the weights accordingly; this is repeated for many small sets of examples until the objective stops decreasing.
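A toy illustration of one SGD step on a hypothetical linear least-squares model (NumPy, invented shapes and learning rate; not the setup used in the paper):

```python
import numpy as np

def sgd_step(W, x_batch, y_batch, lr=0.01):
    """One stochastic gradient descent step on a toy linear least-squares model."""
    scores = x_batch @ W                      # outputs for a few examples
    errors = scores - y_batch                 # error between outputs and targets
    grad = x_batch.T @ errors / len(x_batch)  # average gradient over the mini-batch
    return W - lr * grad                      # adjust weights opposite to the gradient

rng = np.random.default_rng(0)
W = rng.normal(size=(5, 1))
x_batch = rng.normal(size=(8, 5))             # a small set of 8 examples
y_batch = rng.normal(size=(8, 1))             # their desired outputs
W = sgd_step(W, x_batch, y_batch)             # repeated over many mini-batches in practice
```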

Neural network with backpropagation. Backpropagation computes the gradient of an objective function with respect to the weights of a multilayer stack of modules by applying the chain rule for derivatives. The key insight is that the derivative (or gradient) of the objective with respect to the input of a module can be computed by working backwards from the gradient with respect to the output of that module (or the input of the subsequent module). The backpropagation equation can be applied repeatedly to propagate gradients through all modules, starting from the output. To go from one layer to the next, a set of units compute a weighted sum of their inputs from the previous layer and pass the result through a non-linear function. At present, the most popular non-linear function is the rectified linear unit (ReLU), f(z) = max(z, 0); smoother non-linearities such as tanh(z) or the logistic sigmoid 1/(1 + exp(−z)) were commonly used in the past.
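A minimal sketch of a forward and backward pass through a two-layer stack, assuming NumPy, a squared-error objective and invented sizes; it shows the chain rule being applied backwards from the output, as described above:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

# Forward pass through two modules, caching intermediate values
rng = np.random.default_rng(0)
x = rng.normal(size=4)
W1, W2 = rng.normal(size=(3, 4)), rng.normal(size=(2, 3))
z1 = W1 @ x                               # weighted sum in the hidden layer
h1 = relu(z1)                             # non-linearity
y = W2 @ h1                               # output scores
target = np.array([1.0, 0.0])
loss = 0.5 * np.sum((y - target) ** 2)    # objective function

# Backward pass: work backwards from the gradient w.r.t. the output
dL_dy  = y - target                       # gradient at the output module
dL_dW2 = np.outer(dL_dy, h1)              # gradient for the output weights
dL_dh1 = W2.T @ dL_dy                     # gradient w.r.t. the input of the output module
dL_dz1 = dL_dh1 * (z1 > 0)                # through the ReLU (derivative is 0 or 1)
dL_dW1 = np.outer(dL_dz1, x)              # gradient for the hidden-layer weights
```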


Convolutional neural networks. ConvNets are designed to process data that come in the form of multiple arrays, for example a color image composed of three 2D arrays containing pixel intensities. The architecture of a typical ConvNet is structured as a series of stages. The first few stages are composed of two types of layers: convolutional layers and pooling layers. Units in a convolutional layer are organized in feature maps, within which each unit is connected to local patches in the feature maps of the previous layer through a set of weights called a filter bank. The result of this local weighted sum is then passed through a non-linearity such as a ReLU. All units in a feature map share the same filter bank; different feature maps in a layer use different filter banks. This architecture reflects two properties of natural signals: first, in array data such as images, local groups of values are often highly correlated, forming distinctive local motifs that are easily detected; second, the local statistics of images and other signals are invariant to location. ConvNets are typically used in image recognition, speech processing and similar tasks.
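A rough sketch, in NumPy with invented sizes, of one convolutional stage: a single shared filter slid over local patches, a ReLU non-linearity, and a max-pooling layer. This is a simplified stand-in for a real ConvNet implementation, which would use many filters and channels:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def conv2d(image, kernel):
    """Valid 2D convolution of one feature map with one filter;
    the same weights are applied at every position (weight sharing)."""
    H, W = image.shape
    kH, kW = kernel.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+kH, j:j+kW] * kernel)
    return out

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling over local patches."""
    H, W = feature_map.shape
    H, W = H - H % size, W - W % size
    fm = feature_map[:H, :W]
    return fm.reshape(H // size, size, W // size, size).max(axis=(1, 3))

rng = np.random.default_rng(0)
image = rng.normal(size=(8, 8))              # one channel of a toy image
kernel = rng.normal(size=(3, 3))             # one filter from the filter bank
feature_map = relu(conv2d(image, kernel))    # convolutional layer + non-linearity
pooled = max_pool(feature_map)               # pooling layer
```

Because the kernel is reused at every position, the same local motif is detected wherever it appears in the image, which is exactly the location invariance the text describes.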


Recurrent neural networks. RNNs process an input sequence one element at a time, maintaining in their hidden units a state vector that implicitly contains information about the history of all the past elements of the sequence. When we consider the outputs of the hidden units at different discrete time steps as if they were the outputs of different neurons in a deep multilayer network, it becomes clear how backpropagation can be applied to train them. RNNs are very powerful dynamical systems, but training them has proved to be problematic because the backpropagated gradients either grow or shrink at each time step, so over many time steps they typically explode or vanish. Once unfolded in time, RNNs can be seen as very deep feedforward networks in which all the layers share the same weights. Although their main purpose is to learn long-term dependencies, theoretical and empirical evidence shows that it is difficult to learn to store information for very long. To correct for that, one idea is to augment the network with an explicit memory. The first proposal of this kind is the long short-term memory (LSTM) network, which uses special hidden units whose natural behaviour is to remember inputs for a long time.
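A toy sketch of the recurrence, in NumPy with invented dimensions: the same weight matrices are reused at every time step, and the state vector h accumulates information about the past elements of the sequence:

```python
import numpy as np

rng = np.random.default_rng(0)
Wxh = rng.normal(scale=0.1, size=(16, 8))    # input -> hidden
Whh = rng.normal(scale=0.1, size=(16, 16))   # hidden -> hidden (shared across time)
Why = rng.normal(scale=0.1, size=(4, 16))    # hidden -> output

h = np.zeros(16)                             # state vector
sequence = rng.normal(size=(10, 8))          # 10 time steps of 8-dimensional input
outputs = []
for x_t in sequence:                         # process one element at a time
    h = np.tanh(Wxh @ x_t + Whh @ h)         # new state depends on input and previous state
    outputs.append(Why @ h)                  # output at this time step
```

Unrolling this loop over the 10 time steps gives a 10-layer feedforward network with tied weights, which is why repeated multiplication by the same weights makes the backpropagated gradients explode or vanish.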

Future of deep learning. Deep learning is now used everywhere. Unsupervised learning will become far more important in the longer term: human and animal learning is largely unsupervised, since we discover the structure of the world by observing it, not by being told the name of every object.