Seminars in Artificial Intelligence and Robotics


Seminars in Artificial Intelligence and Robotics. Computer Vision for Intelligent Robotics: Basics and hints on CNNs. Alberto Pretto

What is a neural network? We start from the first type of artificial neuron, the perceptron. A perceptron takes several binary inputs x1, x2, ..., computes a weighted sum of the inputs, and produces a single binary output using a fixed threshold: output = 1 if Σ_j w_j x_j > threshold, and 0 otherwise. We can use the perceptron to take decisions: by varying the weights and the threshold, we get different models of decision-making.
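A minimal NumPy sketch of this decision rule (the function and variable names are mine, not from the slides):

```python
import numpy as np

def perceptron(x, w, threshold):
    """Fires (outputs 1) iff the weighted sum of the inputs exceeds the threshold."""
    return 1 if np.dot(w, x) > threshold else 0
```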

Multi-layer perceptrons More complex networks of perceptrons can deal with more complex decision problems: the first column (i.e., the first layer) of perceptrons makes simple, low-level decisions by directly weighing the inputs. The perceptrons in the second layer make decisions by weighing the results of the first layer: the second layer can make a decision at a more complex and more abstract level. A fully connected layer (as in this case) is a layer in which every neuron has connections to all the outputs of the previous layer. Note: it is trivial to show that perceptrons can be used to synthesize logical functions (AND, OR, etc.).
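For example, reusing the perceptron sketch above, the AND function can be synthesized with one illustrative choice of weights and threshold (these values are mine):

```python
# AND gate: only when both inputs are 1 does the weighted sum (2.0) exceed 1.5.
w, threshold = np.array([1.0, 1.0]), 1.5
for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, perceptron(np.array(x), w, threshold))  # -> 0, 0, 0, 1
```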

From perceptrons to artificial neurons 1) Write the weighted sum as a dot product. 2) Replace the threshold with the bias b = -threshold. 3) Smooth the output using the sigmoid function σ(z) = 1/(1 + e^(-z)), so the output becomes σ(w·x + b). This is called a sigmoid neuron: small changes in the weights and bias cause only a small change in the output. That's the crucial fact which will allow a network of sigmoid neurons to learn.
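A minimal sketch of a sigmoid neuron in NumPy, following the three steps above:

```python
import numpy as np

def sigmoid(z):
    """Smooth, differentiable squashing of z into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_neuron(x, w, b):
    # Unlike the perceptron's hard threshold, small changes in w or b
    # now cause only small changes in the output.
    return sigmoid(np.dot(w, x) + b)
```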

Softmax From outputs to a probability distribution: softmax(z)_j = e^(z_j) / Σ_k e^(z_k).
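A direct sketch of this mapping in NumPy (subtracting the max is a standard numerical-stability trick, not from the slides):

```python
import numpy as np

def softmax(z):
    """Turn a vector of raw network outputs into a probability distribution."""
    e = np.exp(z - np.max(z))  # shift by the max for numerical stability
    return e / e.sum()
```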

Toward deep learning A deeper network (i.e., one with hidden layers) can break down a very complicated question (e.g., does this image show a face or not?) into simpler questions. Networks with two or more hidden layers are called deep neural networks. Deep learning methods aim at learning feature hierarchies, with features from higher levels of the hierarchy formed by the composition of lower-level features. [Glorot and Bengio]

Learning the network parameters Given a labeled dataset x = {x1, x2, ...} of inputs with associated outputs y(x) = {y(x1), y(x2), ...}, find the weights w and biases b that minimize the cost function C(w,b) = 1/(2n) Σ_x ||y(x) - a||², where a is the output of the network given the current parameters w and b. Easy answer: gradient descent! Correct, but very difficult to implement in practice, due to: a very large parameter set; a very slow convergence rate; a huge amount of data; weight saturation.
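A toy illustration of the gradient descent update rule w ← w - η ∂C/∂w on a one-parameter cost (the cost function and learning rate here are illustrative):

```python
# Toy gradient descent on C(w) = (w - 3)^2, whose gradient is dC/dw = 2*(w - 3).
w, eta = 0.0, 0.1
for _ in range(100):
    w -= eta * 2 * (w - 3.0)
print(w)  # approaches the minimizer w = 3
```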

Solving the learning problem (no details) Backpropagation + stochastic gradient descent: the dataset is divided into batches! Massive parallelization.
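A minimal sketch of one epoch of mini-batch stochastic gradient descent; grad_fn is a placeholder for the batch-gradient computation (e.g., backpropagation), and the default hyperparameters are assumptions:

```python
import numpy as np

def sgd_epoch(X, y, params, grad_fn, eta=0.1, batch_size=32):
    """One epoch of SGD over shuffled mini-batches.
    grad_fn(params, X_batch, y_batch) is assumed to return the gradient
    of the cost on that batch (e.g., computed by backpropagation)."""
    idx = np.random.permutation(len(X))
    for start in range(0, len(X), batch_size):
        b = idx[start:start + batch_size]
        params -= eta * grad_fn(params, X[b], y[b])
    return params
```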

Backpropagation insights (1/2) Goal: minimize C. (This cost function can be written as an average C = 1/n Σ_x C_x over the cost functions C_x of individual training examples.) We need to compute all the partial derivatives ∂C/∂w and ∂C/∂b. The goal of backpropagation is to compute these derivatives efficiently.

Backpropagation insights (2/2) Let us define the error of neuron j in layer l as δ_j^l = ∂C/∂z_j^l, where z_j^l is the neuron's weighted input. Backpropagation equations (in the notation of [2]): δ^L = ∇_a C ⊙ σ'(z^L) for the output layer L; δ^l = ((w^(l+1))^T δ^(l+1)) ⊙ σ'(z^l) to propagate the error backwards; ∂C/∂b_j^l = δ_j^l; ∂C/∂w_jk^l = a_k^(l-1) δ_j^l.

The Backpropagation Algorithm

The full algorithm
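As a concrete illustration, here is a minimal NumPy sketch of the full pass (forward, then backward), modeled on the network code in Nielsen's book [2]; quadratic cost and sigmoid activations are assumed, and the helper names are mine:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def backprop(x, y, weights, biases):
    """One forward + backward pass for a fully connected sigmoid network
    with quadratic cost. weights[l], biases[l] map layer l to layer l+1."""
    # Forward pass: store every weighted input z and activation a.
    a, activations, zs = x, [x], []
    for w, b in zip(weights, biases):
        z = w @ a + b
        zs.append(z)
        a = sigmoid(z)
        activations.append(a)
    # Output error: delta^L = (a^L - y) * sigma'(z^L).
    delta = (activations[-1] - y) * sigmoid_prime(zs[-1])
    grads_w, grads_b = [None] * len(weights), [None] * len(biases)
    grads_w[-1], grads_b[-1] = np.outer(delta, activations[-2]), delta
    # Propagate the error backwards through the hidden layers.
    for l in range(2, len(weights) + 1):
        delta = (weights[-l + 1].T @ delta) * sigmoid_prime(zs[-l])
        grads_w[-l] = np.outer(delta, activations[-l - 1])
        grads_b[-l] = delta
    return grads_w, grads_b  # gradients for a single training example
```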

From Neural Networks to CNNs (1/2) Applying NNs to images to perform classification, detection, etc. with the classical fully connected layers does not scale: for an RGB 200x200 image, each node needs 200 x 200 x 3 = 120,000 parameters!! Picture from M.A. Ranzato

From Neural Networks to CNNs (2/2) Convolutional neural networks use three basic ideas: local receptive fields, shared weights, and pooling. Picture from M.A. Ranzato

Convolutional layers Local receptive fields and shared weights: all the neurons in the first hidden layer detect exactly the same feature (edges, textures, etc.), just at different locations in the input image!! Convolutional networks are thus well adapted to the translation invariance of images. Another idea is to apply, in each layer, several filters with different weights, each producing its own feature map. [Figure: an input image convolved with Filters 1-3 yields feature maps 1-3.]
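A minimal sketch of one such convolution (a single filter, no padding or stride; strictly speaking this is cross-correlation, as is conventional in CNNs):

```python
import numpy as np

def conv2d(image, kernel):
    """Slide one filter (shared weights) over the image; each output
    neuron sees only a local receptive field of the input."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(H - kh + 1):
        for j in range(W - kw + 1):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Several filters, several feature maps (filters here would be 2D arrays):
# feature_maps = [conv2d(img, k) for k in (filter1, filter2, filter3)]
```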

Pooling layers Convolutional neural networks also contain pooling layers, usually placed immediately after convolutional layers. Pooling layers simplify (downsample) the information in the output of the convolutional layer. In max-pooling, a pooling unit simply outputs the maximum activation in its input region.
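A sketch of non-overlapping max-pooling on a single feature map, assumed to be a 2D NumPy array (the 2x2 window is illustrative):

```python
def max_pool(fmap, size=2):
    """Keep only the maximum activation in each size x size region."""
    H, W = fmap.shape
    fmap = fmap[:H - H % size, :W - W % size]  # trim to a multiple of size
    return fmap.reshape(H // size, size, W // size, size).max(axis=(1, 3))
```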

ReLU activation function It has been shown that non-saturating activation functions such as the rectified linear unit (ReLU) outperform the classical saturating activations (e.g., the sigmoid). ReLU(x) = max(0,x)

Data Preprocessing Preprocessing helps to simplify the classification problem. First preprocessing issue: data normalization. The aim is to remove redundant information from the data. A common solution is to subtract the mean (computed on the training set only) and normalize with respect to the covariance.
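A minimal normalization sketch, assuming X_train and X_test are NumPy arrays of flattened images; scaling by the per-feature standard deviation here stands in for the full covariance-based whitening mentioned above:

```python
def normalize(X_train, X_test, eps=1e-8):
    """Zero-center and scale using statistics from the training set only,
    then reuse those same statistics unchanged on the test data."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0) + eps  # eps guards against division by zero
    return (X_train - mean) / std, (X_test - mean) / std
```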

Avoid overfitting (1/3) Several ways to prevent overfitting: regularization, dropout, and data augmentation. Regularization methods are used for model selection, in particular to prevent overfitting by penalizing models with extreme parameter values. Common solutions are: L2 regularization, L1 regularization, max-norm constraints.
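For example, L2 regularization adds a penalty on large weights to the cost; a sketch, assuming weights is a list of NumPy weight matrices and lam (the regularization strength) is a hyperparameter to tune:

```python
def l2_regularized_cost(base_cost, weights, lam, n):
    """C = C0 + (lam / 2n) * sum of squared weights (a.k.a. weight decay)."""
    return base_cost + (lam / (2 * n)) * sum((w ** 2).sum() for w in weights)
```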

Avoid overfitting (2/3) Dropout is implemented by keeping each neuron active only with some probability, setting it to zero otherwise. This also affects backpropagation: only the activated neurons are trained.
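A sketch of the common "inverted dropout" variant, where activations are rescaled at training time so the test-time network needs no change; the keep probability is illustrative:

```python
import numpy as np

def dropout(a, p_keep=0.5, training=True):
    """Inverted dropout: zero each neuron with probability 1 - p_keep during
    training and rescale the survivors, so test time needs no change."""
    if not training:
        return a
    mask = (np.random.rand(*a.shape) < p_keep) / p_keep
    return a * mask  # gradients flow only through the kept units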

Avoid overfitting (3/3) In data augmentation, fake data is simulated by encoding image transformations that shouldn't change object identity: horizontal flips, random crops, color jittering, and random mixes/combinations of translation, rotation, stretching, etc.
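An illustrative sketch of two such transforms (random horizontal flip, then a random crop); the crop size is an assumption and the image is assumed to be a NumPy array at least crop pixels on each side:

```python
import numpy as np

def augment(img, crop=24):
    """Label-preserving transforms: random horizontal flip + random crop."""
    if np.random.rand() < 0.5:
        img = img[:, ::-1]  # flip horizontally
    H, W = img.shape[:2]
    top, left = np.random.randint(H - crop + 1), np.random.randint(W - crop + 1)
    return img[top:top + crop, left:left + crop]
```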

A CNN for MNIST MNIST: a subset of the NIST* database of handwritten digits. - 2 convolutional + max-pooling layers - 2 fully connected layers [Jonatan Ward et al., Efficient mapping of the training of Convolutional Neural Networks to a CUDA-based cluster] *National Institute of Standards and Technology

A typical (small) CNN Running 20000 iteration steps, we can reach an accuracy of 99.2% on the MNIST dataset.
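For reference, a hedged PyTorch sketch of such a 2-convolutional + 2-fully-connected architecture; the filter counts, kernel sizes, and hidden size are assumptions, not taken from the slides:

```python
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 32, kernel_size=5, padding=2), nn.ReLU(),
    nn.MaxPool2d(2),                    # 28x28 -> 14x14
    nn.Conv2d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
    nn.MaxPool2d(2),                    # 14x14 -> 7x7
    nn.Flatten(),
    nn.Linear(64 * 7 * 7, 1024), nn.ReLU(),
    nn.Linear(1024, 10),                # one logit per digit class
)
```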

(Brief & Incomplete) History of CNNs 1988-94: LeNet. Convolution, pooling, non-linearity. Y. LeCun, L. Bottou, Y. Bengio and P. Haffner, Gradient-based learning applied to document recognition, Proceedings of the IEEE, vol. 86, no. 11, Nov. 1998.

(Brief & Incomplete) History of CNNs 1994-2010: indeed, not many contributions on CNNs; among others, convolutional auto-encoders (CAE) for invariant feature extraction. Marc'Aurelio Ranzato, Fu Jie Huang, Y-Lan Boureau and Yann LeCun, Unsupervised Learning of Invariant Feature Hierarchies with Applications to Object Recognition, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2007. P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, and P.-A. Manzagol, Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion, J. Mach. Learn. Res., vol. 11, pp. 3371-3408, 2010.

(Brief & Incomplete) History of CNNs 2012: AlexNet. Rectified linear units (ReLU) as non-linearities; the dropout technique; implementation on multiple powerful GPUs. Krizhevsky, A., Sutskever, I. and Hinton, G. E., ImageNet Classification with Deep Convolutional Neural Networks, NIPS 2012: Neural Information Processing Systems.

(Brief & Incomplete) History of CNNs 2014: VGG network. Small 3x3 filters in each convolutional layer; long sequences of convolutions. K. Simonyan, A. Zisserman, Very Deep Convolutional Networks for Large-Scale Image Recognition, arXiv technical report, 2014.

(Brief & Incomplete) History of CNNs 2015: ResNet. Learns residuals through bypass connections, which makes it possible to train networks with up to thousands of layers. Kaiming He, Xiangyu Zhang, Shaoqing Ren and Jian Sun, Deep Residual Learning for Image Recognition, CVPR 2016.

(Brief & Incomplete) History of CNNs 2016: YOLO. End-to-end training for object detection; grid-based object confidence with bounding-box regression. Joseph Redmon, Santosh Divvala, Ross Girshick and Ali Farhadi, You Only Look Once: Unified, Real-Time Object Detection, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

Other References [1] Ian Goodfellow, Yoshua Bengio, Aaron Courville, Deep Learning, MIT Press, 2016. [2] Michael Nielsen, Neural Networks and Deep Learning. Free online version: http://neuralnetworksanddeeplearning.com/ [3] Nikhil Buduma, Fundamentals of Deep Learning, O'Reilly, 2017.
