Lecture 37: ConvNets (Cont'd) and Training

Lecture 37: ConvNets (Cont'd) and Training CS 4670/5670 Sean Bell [http://bbabenko.tumblr.com/post/83319141207/convolutional-learnings-things-i-learned-by]

(Unrelated) Dog vs Food [Karen Zack, @teenybiscuit]

(Recap) Backprop. From Geoff Hinton's seminar at Stanford yesterday.

(Recap) Backprop. Parameters: $\theta = (\theta_1, \theta_2, \ldots)$, all of the weights and biases in the network, stacked together. Gradient: $\frac{\partial L}{\partial \theta} = \left( \frac{\partial L}{\partial \theta_1}, \frac{\partial L}{\partial \theta_2}, \ldots \right)$. Intuition: how fast would the error change if I changed myself by a little bit?

(Recap) Backprop. Forward propagation: compute the activations and loss, $x \rightarrow \text{Function}(\theta^{(1)}) \rightarrow h^{(1)} \rightarrow \ldots \rightarrow \text{Function}(\theta^{(n)}) \rightarrow s \rightarrow L$. Backward propagation: compute the gradient (the "error signal") by passing $\partial L / \partial s$ back through the same chain of functions, yielding $\partial L / \partial \theta^{(1)}, \ldots, \partial L / \partial \theta^{(n)}$ and $\partial L / \partial x$.
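
To make the recap concrete, here is a minimal numpy sketch of one forward/backward pass through a tiny two-layer network with a squared-error loss. The network shape and all variable names are illustrative, not taken from the lecture.

```python
import numpy as np

np.random.seed(0)
x = np.random.randn(4)           # input
W1 = np.random.randn(5, 4)       # theta^(1)
W2 = np.random.randn(1, 5)       # theta^(2)
y = np.array([1.0])              # target

# Forward propagation: compute the activations and loss
h1 = np.maximum(0, W1 @ x)       # Function 1: linear + ReLU
s = W2 @ h1                      # Function 2: linear score
L = 0.5 * np.sum((s - y) ** 2)   # squared-error loss

# Backward propagation: compute the gradient ("error signal")
dL_ds = s - y                    # dL/ds
dL_dW2 = np.outer(dL_ds, h1)     # dL/dtheta^(2)
dL_dh1 = W2.T @ dL_ds            # error flows back through Function 2
dL_dh1[h1 <= 0] = 0              # ReLU gate: no gradient where input was negative
dL_dW1 = np.outer(dL_dh1, x)     # dL/dtheta^(1)
dL_dx = W1.T @ dL_dh1            # dL/dx
```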

(Recap) A ConvNet is a sequence of convolutional layers, interspersed with activation functions (and possibly other layer types)

Web demo 1: Convolution http://cs231n.github.io/convolutional-networks/ [Karpathy 2016]

Web demo 2: ConvNet in a Browser http://cs.stanford.edu/people/karpathy/convnetjs/demo/mnist.html [Karpathy 2014]

Convolution: Stride. During convolution, the weights slide along the input to generate each output. Recall that at each position, we are doing a 3D sum over the window under the filter: $h_r = \sum_{i,j,k} x^r_{ijk} W_{ijk} + b$, where $(i, j, k)$ indexes (channel, row, column).
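
As a sketch, this 3D sum at a single output position can be written directly in numpy (assuming x_r is the channels x rows x cols input window under the filter at position r):

```python
import numpy as np

def conv_at_position(x_r, W, b):
    # h_r = sum_{ijk} x_r[i,j,k] * W[i,j,k] + b, one scalar output
    return np.sum(x_r * W) + b
```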

Convolution: Stride. But we can also convolve with a stride, e.g. stride = 2. Notice that with certain strides, we may not be able to cover all of the input, and the output is half the size of the input.

Convolution: Padding. We can also pad the input with zeros. Here, pad = 1, stride = 2.
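
Putting stride and zero padding together, here is a loop-based numpy sketch of convolution with a single filter over one input volume (no batching). The function name and array layout are assumptions for illustration, not the lecture's code.

```python
import numpy as np

def conv2d(x, f, b, stride=2, pad=1):
    # x: (C, H, W) input volume; f: (C, k, k) filter; b: scalar bias
    C, H, W = x.shape
    k = f.shape[1]
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))  # zero padding
    H_out = (H + 2 * pad - k) // stride + 1
    W_out = (W + 2 * pad - k) // stride + 1
    out = np.zeros((H_out, W_out))
    for i in range(H_out):
        for j in range(W_out):
            win = xp[:, i*stride:i*stride+k, j*stride:j*stride+k]
            out[i, j] = np.sum(win * f) + b  # the 3D sum at this position
    return out
```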

Convolution: stride $s$. How big is the output? For input width $w_{\text{in}}$, kernel size $k$, padding $p$, and stride $s$, the output has size: $w_{\text{out}} = \frac{w_{\text{in}} + 2p - k}{s} + 1$

Convolution: stride $s$. Example: $k = 3$, $s = 1$, $p = 1$: $w_{\text{out}} = \frac{w_{\text{in}} + 2 \cdot 1 - 3}{1} + 1 = w_{\text{in}}$, so the output is the same size as the input. VGGNet [Simonyan 2014] uses filters of this shape.
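
The output-size formula is easy to sanity-check in code. A small helper, with the VGG setting from this slide and AlexNet's conv1 (11x11 filters, stride 4, no padding, on the 227x227 input shown later) as test cases:

```python
def out_size(w_in, k, s, p):
    # w_out = (w_in + 2p - k) / s + 1, floored when it doesn't divide evenly
    return (w_in + 2 * p - k) // s + 1

assert out_size(224, k=3, s=1, p=1) == 224   # VGG-style: size preserved
assert out_size(227, k=11, s=4, p=0) == 55   # AlexNet conv1
```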

Pooling. For most ConvNets, convolution is often followed by pooling:
- Creates a smaller representation while retaining the most important information
- The max operation is the most common
- Why might avg be a poor choice?
Figure: Andrej Karpathy

Max Pooling. What's the backprop rule for max pooling?
- In the forward pass, store the index that took the max
- In the backward pass, the output gradient is routed back to the input at that stored index; all other inputs receive zero gradient
Figure: Andrej Karpathy
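
A numpy sketch of this rule for 2x2 max pooling with stride 2, loop-based for clarity; the function names are illustrative.

```python
import numpy as np

def maxpool_forward(x):
    H, W = x.shape
    out = np.zeros((H // 2, W // 2))
    argmax = {}
    for i in range(H // 2):
        for j in range(W // 2):
            win = x[2*i:2*i+2, 2*j:2*j+2]
            r, c = np.unravel_index(np.argmax(win), win.shape)
            out[i, j] = win[r, c]
            argmax[(i, j)] = (2*i + r, 2*j + c)  # store the index that took the max
    return out, argmax

def maxpool_backward(dout, argmax, x_shape):
    dx = np.zeros(x_shape)
    for (i, j), (r, c) in argmax.items():
        dx[r, c] += dout[i, j]  # gradient flows only to the max input
    return dx
```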

Example ConvNet. 10 3x3 conv filters with stride 1, pad 1; 2x2 pool filters with stride 2. Figure: Andrej Karpathy

Example: AlexNet [Krizhevsky 2012]. [Architecture diagram: a 227x227x3 input passes through a stack of convolution, max pooling ("max"), and local response normalization ("norm") layers, then fully connected ("full") layers, ending in 1000 class scores.] Figure: [Karnowski 2015] (with corrections)

Example: AlexNet [Krizhevsky 2012]. [Zoomed-in view of the same architecture diagram.] Figure: [Karnowski 2015]

Questions?

How do you actually train these things? Why so many layers? [Szegedy et al, 2014]

How do you actually train these things? Roughly speaking:
- Gather labeled data
- Find a ConvNet architecture
- Minimize the loss

Training a convolutional neural network:
- Split and preprocess your data
- Choose your network architecture
- Initialize the weights
- Find a learning rate and regularization strength
- Minimize the loss and monitor progress
- Fiddle with knobs

Mini-batch Gradient Descent. Loop:
1. Sample a batch of training data (~100 images)
2. Forward pass: compute loss (averaged over the batch)
3. Backward pass: compute gradient
4. Update all parameters
Note: this is usually called stochastic gradient descent, even though SGD strictly refers to a batch size of 1.
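
A numpy sketch of this loop; `forward_backward` is a hypothetical stand-in for the network's loss-and-gradient computation, not a function from the lecture.

```python
import numpy as np

def train(params, data, labels, forward_backward, lr=1e-2, batch=100, steps=1000):
    n = data.shape[0]
    for step in range(steps):
        idx = np.random.choice(n, batch, replace=False)  # 1. sample a batch
        loss, grads = forward_backward(params, data[idx], labels[idx])  # 2-3. forward/backward
        for p, g in zip(params, grads):                  # 4. update all parameters
            p -= lr * g
    return params
```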

Regularization. Regularization reduces overfitting: $L = L_{\text{data}} + L_{\text{reg}}$, with $L_{\text{reg}} = \lambda \frac{1}{2} \|W\|_2^2$ [Andrej Karpathy http://cs.stanford.edu/people/karpathy/convnetjs/demo/classify2d.html]
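
In code, this regularizer and its gradient contribution are simple; a sketch, assuming `weights` is a list of weight arrays (biases are typically not regularized):

```python
import numpy as np

def l2_reg(weights, lam):
    # L_reg = lambda * 1/2 * ||W||_2^2, summed over all weight arrays;
    # its gradient with respect to each W is simply lambda * W
    reg_loss = lam * 0.5 * sum(np.sum(W * W) for W in weights)
    reg_grads = [lam * W for W in weights]
    return reg_loss, reg_grads
```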

Overfitting. Overfitting: modeling noise in the training set instead of the true underlying relationship. Underfitting: insufficiently modeling the relationship in the training set. General rule: models that are bigger or have more capacity are more likely to overfit. [Image: https://en.wikipedia.org/wiki/file:overfitted_data.png]

(0) Dataset split. Split your data into train, validation, and test: [Diagram: the dataset divided into train, validation, and test portions]

(0) Dataset split.
- Train: gradient descent and fine-tuning of parameters
- Validation: determining hyper-parameters (learning rate, regularization strength, etc.) and picking an architecture
- Test: estimating real-world performance (e.g. accuracy = fraction correctly classified)

(0) Dataset split. Be careful with false discovery: once we have used the test set, we should not use it again (but nobody follows this rule, since it's expensive to collect datasets). Instead, try to avoid looking at the test score until the end.

(0) Dataset split. Cross-validation: cycle which portion of the data is used as validation. [Diagram: the validation fold rotates through the training data across splits, while the test set stays fixed.] Average scores across validation splits.
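
A minimal sketch of this scheme, assuming a hypothetical `train_and_score` helper that trains on one split and returns a validation score:

```python
import numpy as np

def cross_validate(X, y, train_and_score, k=4):
    # Each fold takes a turn as the validation set; the test set stays held out.
    folds = np.array_split(np.random.permutation(len(X)), k)
    scores = []
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        scores.append(train_and_score(X[train_idx], y[train_idx],
                                      X[val_idx], y[val_idx]))
    return np.mean(scores)  # average scores across validation splits
```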

(1) Data preprocessing Preprocess the data so that learning is better conditioned: Figure: Andrej Karpathy

(1) Data preprocessing. In practice, you may also see PCA and whitening of the data. Slide: Andrej Karpathy

(1) Data preprocessing For ConvNets, typically only the mean is subtracted. A per-channel mean also works (one value per R,G,B). Figure: Alex Krizhevsky
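
A sketch of per-channel mean subtraction, assuming images stored as an (N, H, W, 3) array; the data here is a placeholder:

```python
import numpy as np

train_images = np.random.rand(1000, 224, 224, 3)   # placeholder training set
channel_mean = train_images.mean(axis=(0, 1, 2))   # one value per R, G, B channel
preprocessed = train_images - channel_mean         # broadcasts across all pixels
```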

(1) Data preprocessing. Augment the data: extract random crops from the input, with slightly jittered offsets. Without this, typical ConvNets (e.g. [Krizhevsky 2012]) overfit the data.
- E.g. 224x224 patches extracted from 256x256 images
- Randomly reflect horizontally
- Perform the augmentation live during training
Figure: Alex Krizhevsky
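
A sketch of such live augmentation for a single image (numpy, channel-last layout; the function name is illustrative):

```python
import numpy as np

def augment(img, crop=224):
    H, W, _ = img.shape                  # e.g. a 256x256x3 image
    r = np.random.randint(H - crop + 1)  # slightly jittered crop offsets
    c = np.random.randint(W - crop + 1)
    patch = img[r:r+crop, c:c+crop]      # random 224x224 crop
    if np.random.rand() < 0.5:           # random horizontal reflection
        patch = patch[:, ::-1]
    return patch
```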