1 Machine Learning
The Breadth of ML: Neural Networks & Deep Learning
Marc Toussaint, University of Stuttgart
Duy Nguyen-Tuong, Bosch Center for Artificial Intelligence
Summer 2017
2 Neural Networks
Consider a regression problem with input $x \in \mathbb{R}^d$ and output $y \in \mathbb{R}$.
- Linear function ($\beta \in \mathbb{R}^d$): $f(x) = \beta^\top x$
- 1-layer neural network ($W_0 \in \mathbb{R}^{h_1 \times d}$): $f(x) = \beta^\top \sigma(W_0 x)$
- 2-layer neural network: $f(x) = \beta^\top \sigma(W_1 \sigma(W_0 x))$
Neural networks are a special function model $y = f(x, w)$, i.e. a special way to parameterize non-linear functions.
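As a concrete illustration, here is a minimal sketch of the 2-layer function model above in NumPy, assuming a logistic sigmoid for $\sigma$; all sizes and values are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative sizes: d = 3 inputs, two hidden layers of width 5.
rng = np.random.default_rng(0)
W0 = rng.normal(size=(5, 3))   # W_0 in R^{h1 x d}
W1 = rng.normal(size=(5, 5))   # W_1 in R^{h2 x h1}
beta = rng.normal(size=5)      # output weights

def f(x):
    # 2-layer NN: f(x) = beta^T sigma(W_1 sigma(W_0 x))
    return beta @ sigmoid(W1 @ sigmoid(W0 @ x))

print(f(np.array([1.0, 2.0, 3.0])))
```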
3 Neural Networks: Training
How do we determine the weights $W_{l,ij}$ of layer $l$ for node $j$, given samples $\{x_i, y_i\}$?
Idea:
- Initialize the weights $W_l$ for each layer $l$ and each node $j$.
- First, propagate $x_i$ through the network, bottom-up (forward propagation).
- Then compute the error between prediction and ground truth $y_i$, given an error function $\ell$.
- Subsequently, propagate the error backwards through the network and recursively compute the error gradients for each $W_{l,ij}$ (back-propagation).
- Update the weights $W_l$ using the computed error gradients for each sample $\{x_i, y_i\}$.
Notation: Consider $L$ hidden layers, each $h_l$-dimensional.
- Let $z_l = W_{l-1} x_{l-1}$ be the inputs to all neurons in layer $l$.
- Let $x_l = \sigma(z_l)$ be the activation of all neurons in layer $l$.
- Redundantly, we denote by $x_0 \equiv x$ the activation of the input layer, and by $\phi(x) \equiv x_L$ the activation of the last hidden layer.
4 Neural Networks: Basic Equations
Forward propagation: An $L$-layer NN recursively computes, for $l = 1, \dots, L$:
  $z_l = W_{l-1} x_{l-1}, \quad x_l = \sigma(z_l)$
and then computes the output $f \equiv z_{L+1} = W_L x_L$.
Backpropagation: Given some loss $\ell(f)$, let $\delta_{L+1} = \frac{\partial \ell}{\partial f}$. We can recursively compute the loss gradient w.r.t. the inputs of layer $l$, for $l = L, \dots, 1$:
  $\delta_l = \frac{d\ell}{dz_l} = \frac{d\ell}{dz_{l+1}} \frac{\partial z_{l+1}}{\partial x_l} \frac{\partial x_l}{\partial z_l} = [\delta_{l+1} W_l] \circ [x_l \circ (1 - x_l)]$
where $\circ$ is an element-wise product. The gradient w.r.t. the weights is:
  $\frac{d\ell}{dW_{l,ij}} = \frac{d\ell}{dz_{l+1,i}} \frac{\partial z_{l+1,i}}{\partial W_{l,ij}} = \delta_{l+1,i} \, x_{l,j}, \quad \text{or} \quad \frac{d\ell}{dW_l} = \delta_{l+1}^\top x_l^\top$
Weight update: many different weight updates are possible, given the gradients $\frac{d\ell}{dW_l}$; for example, the delta rule:
  $W_l^{\text{new}} = W_l^{\text{old}} + \Delta W_l = W_l^{\text{old}} - \eta \frac{d\ell}{dW_l}$
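A compact sketch of these equations in code, assuming a sigmoid activation (so the $x_l \circ (1 - x_l)$ factor above applies), a scalar output, and a squared error loss; names, sizes, and the learning rate are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(Ws, beta, x):
    # z_l = W_{l-1} x_{l-1}, x_l = sigma(z_l); output f = beta^T x_L
    xs = [x]
    for W in Ws:
        xs.append(sigmoid(W @ xs[-1]))
    return xs, beta @ xs[-1]

def backprop_step(Ws, beta, x, y, eta=0.1):
    # One delta-rule update for the squared error l(f) = (f - y)^2.
    xs, f = forward(Ws, beta, x)
    delta = 2.0 * (f - y)                              # delta_{L+1} = dl/df
    grad_beta = delta * xs[-1]
    delta = delta * beta                               # back through output weights
    for l in range(len(Ws) - 1, -1, -1):
        delta = delta * xs[l + 1] * (1.0 - xs[l + 1])  # dl/dz_{l+1} via sigma'
        grad_W = np.outer(delta, xs[l])                # dl/dW_l = delta_{l+1} x_l^T
        delta = Ws[l].T @ delta                        # dl/dx_l for the next step
        Ws[l] -= eta * grad_W                          # delta rule
    beta -= eta * grad_beta
    return f

rng = np.random.default_rng(0)
Ws = [rng.normal(size=(5, 3)), rng.normal(size=(5, 5))]
beta = rng.normal(size=5)
for _ in range(100):
    backprop_step(Ws, beta, np.array([1.0, 2.0, 3.0]), y=0.5)
print(forward(Ws, beta, np.array([1.0, 2.0, 3.0]))[1])  # approaches 0.5
```

This performs a single delta-rule step per sample; in practice gradients are summed or averaged over (mini-)batches.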
5 Neural Networks: Regression
In the standard regression case, $y \in \mathbb{R}$, we typically assume a squared error loss $\ell(f) = \sum_i (f(x_i, w) - y_i)^2$. We have
  $\delta_{L+1} = \sum_i 2 (f(x_i, w) - y_i)$
Regularization: add an $L_2$ or $L_1$ regularizer. First compute all gradients as before, then add $\lambda W_{l,ij}$ (for $L_2$) or $\lambda \, \text{sign}(W_{l,ij})$ (for $L_1$) to the gradient. Historically, this is called weight decay, as the additional gradient leads to a step decaying the weights.
The optimal output weights are as for standard regression:
  $W_L = (X^\top X + \lambda I)^{-1} X^\top y$
where $X$ is the data matrix of activations $x_L \equiv \phi(x)$.
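A sketch of this analytic solve for the output weights, assuming Phi stacks the last-hidden-layer activations $\phi(x_i)$ row-wise; the names and test data are made up:

```python
import numpy as np

def solve_output_weights(Phi, y, lam=1e-2):
    # W_L = (X^T X + lambda I)^{-1} X^T y, X = matrix of activations phi(x_i)
    d = Phi.shape[1]
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(d), Phi.T @ y)

rng = np.random.default_rng(0)
Phi = rng.normal(size=(100, 5))                   # 100 samples, h_L = 5
w_true = np.array([1.0, 0.0, -2.0, 0.5, 0.0])
y = Phi @ w_true + 0.1 * rng.normal(size=100)
print(solve_output_weights(Phi, y))               # close to w_true
```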
6 Neural Networks: Classification
Consider the multi-class case $y \in \{1, \dots, M\}$. Then we have $M$ output neurons to represent the discriminative function
  $f(x, y, w) = (W_L x_L)_y, \quad W_L \in \mathbb{R}^{M \times h_L}$
- Choosing a neg-log-likelihood objective → logistic regression
- Choosing a hinge loss objective → NN + SVM
For a given $x$, let $y^*$ be the correct class. The one-vs-all hinge loss is:
  $\sum_{y \neq y^*} \max\{0, 1 - (f_{y^*} - f_y)\}$
- For an output neuron $y \neq y^*$ this implies a gradient $\delta_y = [f_{y^*} < f_y + 1]$
- For the output neuron $y^*$ this implies a gradient $\delta_{y^*} = -\sum_{y \neq y^*} [f_{y^*} < f_y + 1]$
Only data points inside the margin induce an error (and gradient). This is also called the perceptron algorithm.
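The hinge-loss gradients at the output layer can be computed directly from the scores; a small sketch, where the score vector is an arbitrary example:

```python
import numpy as np

def hinge_deltas(f, y_star):
    # Output-layer gradients of the one-vs-all hinge loss.
    # f: vector of scores f_y; y_star: index of the correct class.
    inside = f[y_star] < f + 1.0        # margin violations [f_{y*} < f_y + 1]
    inside[y_star] = False              # the correct class is not its own rival
    delta = inside.astype(float)        # delta_y = 1 for violating classes
    delta[y_star] = -inside.sum()       # delta_{y*} = -(number of violations)
    return delta

print(hinge_deltas(np.array([0.2, 1.0, 0.5]), y_star=1))  # [ 1. -2.  1.]
```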
7 Neural Networks: Dimensionality Reduction
Dimensionality reduction can be performed with autoencoders. An autoencoder typically is an NN with a narrow hidden layer that is trained to reproduce the input:
  $\min \sum_i \| y(x_i) - x_i \|^2$
The hidden layer ("bottleneck") needs to find a good representation/compression of the input. This is similar to the PCA objective, but nonlinear.
Stacking autoencoders yields deep autoencoders.
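A minimal sketch of the autoencoder objective with a single bottleneck layer, assuming a sigmoid encoder and a linear decoder; shapes and weights are illustrative, and training them would use back-propagation as before:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def autoencoder_loss(We, Wd, X):
    # Reconstruction error sum_i ||y(x_i) - x_i||^2 for a one-bottleneck
    # autoencoder: encode h = sigma(We x), decode y = Wd h.
    H = sigmoid(X @ We.T)        # bottleneck codes, one row per sample
    Y = H @ Wd.T                 # reconstructions
    return np.sum((Y - X) ** 2)

# Illustrative shapes: 10-dim inputs compressed to a 3-dim bottleneck.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
We, Wd = rng.normal(size=(3, 10)), rng.normal(size=(10, 3))
print(autoencoder_loss(We, Wd, X))
```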
8 Remarks
- NNs are usually trained based on the gradient $\nabla_{W_l} f(x)$. (The output weights can be optimized analytically, as for linear regression.)
- NNs are a very powerful function class. (By tweaking/training the weights one can approximate any non-linear function.)
BUT:
- Are there any guarantees on generalization?
- What happens with the gradients when the NN is very deep?
- How can NNs be used to learn intelligent (autonomous) behavior (e.g. autonomous learning, reinforcement learning, robotics, etc.)?
- Is there any insight into what the neurons will actually represent (e.g. discovering/developing abstractions, hierarchies, etc.)?
Deep Learning is a revival of Neural Networks and was mainly driven by the latter point, i.e. learning useful representations.
9 Deep Learning: Basic Concept
Idea: learn hierarchical features from data, from simple features to complex features.
Deep learning can also be performed within other frameworks, e.g. deep Gaussian processes.
So what has changed compared to classical NNs?
- Algorithmic advances, e.g. dropout, ReLUs, pre-training
- More general models, e.g. deep GPs, deep kernel machines, ...
- More computational power (e.g. GPUs)
- Large data sets
Deep learning is useful for very high-dimensional problems with many labeled or unlabeled samples (e.g. vision and speech tasks).
10 Typical Process to Train a Deep Network
- Pre-process the data, e.g. ZCA, distortions
- Choose the network type, e.g. convolutional network
- Choose the activation function, e.g. ReLU
- Choose the regularization, e.g. dropout
- Train the network, e.g. stochastic gradient descent with ADADELTA
- Combine multiple models, e.g. an ensemble of networks
- Optimize high-level parameters, e.g. with Bayesian optimization
Many heuristics are involved when training deep networks.
11 Example: 2-D Convolutional Network
Open parameters:
- Nr. of layers
- Nr. of feature maps per convolution
- Filter size for each convolution
- Subsampling size
- Nr. of hidden units
12 Pre-Processing Steps
1. Removing means from images:
   - subtract the mean from the images
   - standardize the data
2. Distortions of images:
   - add distorted images to the training data
   - randomly translate & rotate images
3. Zero Component Analysis (ZCA):
   - perform the transformation $\tilde{x} = P^\top \Lambda^{-1} P x$, where $P$ contains the eigenvectors and $\sigma_i$ the eigenvalues of the data covariance, and $\Lambda = \text{diag}(\sqrt{\sigma_1 + \epsilon}, \sqrt{\sigma_2 + \epsilon}, \dots, \sqrt{\sigma_n + \epsilon})$
   - in practice, $\epsilon$ has the effect of strengthening the edges
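A sketch of ZCA whitening under the formula above, assuming X holds one flattened image per row; eigenvectors and eigenvalues come from the data covariance:

```python
import numpy as np

def zca_whiten(X, eps=1e-2):
    # x_tilde = P^T Lambda^{-1} P x, with P the eigenvectors and sigma_i
    # the eigenvalues of the data covariance (X: one sample per row).
    Xc = X - X.mean(axis=0)                  # remove the mean first
    cov = Xc.T @ Xc / Xc.shape[0]
    sigma, P = np.linalg.eigh(cov)           # eigenvalues, eigenvectors (columns)
    inv_lambda = 1.0 / np.sqrt(sigma + eps)  # Lambda^{-1} with eps smoothing
    return ((Xc @ P) * inv_lambda) @ P.T

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
print(np.round(np.cov(zca_whiten(X), rowvar=False), 2))  # ~ identity
```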
13 Activation Function: Rectified Linear Units
New activation function: rectified linear units (ReLUs)
  ReLU: $f(z) = \max(0, z)$
- non-saturating
- sparse activation
- helps against vanishing gradients
Relation to logistic activations: a sum of logistic units with shifted biases approximates the softplus, which in turn approximates the ReLU:
  $\sum_{n=1}^{\infty} \text{logistic}(z - n + 0.5) \approx \log(1 + e^z) \approx \max(0, z)$
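A quick numeric check of this relation; the grid of z values and the truncation of the sum at 50 terms are arbitrary:

```python
import numpy as np

z = np.linspace(-3.0, 5.0, 9)
softplus = np.log(1.0 + np.exp(z))
relu = np.maximum(0.0, z)
# Sum of logistic units with biases shifted by n - 0.5, truncated at 50 terms.
stacked = sum(1.0 / (1.0 + np.exp(-(z - n + 0.5))) for n in range(1, 51))
print(np.abs(stacked - softplus).max())  # small: the sum tracks the softplus
print(np.abs(softplus - relu).max())     # softplus smoothly approximates ReLU
```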
14 Deep Networks and Overfitting
Overfitting: good training performance, bad testing performance. Deep models are very sensitive to overfitting, due to their complex model structures.
How to avoid overfitting:
- Weight decay: penalize $\|W\|_1$ or $\|W\|_2$
- Early stopping: recognize overfitting on a validation data set
- Pre-training: initialize the parameters meaningfully
- Dropout
15 Dropout
Training (back-propagation):
- randomly deactivate each unit with probability $p$
- compute the error for the resulting network architecture
- perform a gradient descent step
Prediction (forward propagation):
- multiply the output of each unit by $1 - p$
- this preserves the expected value of the output for a single layer
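A sketch of both phases, following the convention above (drop with probability p during training, scale by 1 - p at prediction time):

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_train(h, p=0.5):
    # Training: randomly deactivate each unit with probability p.
    keep = rng.random(h.shape) >= p
    return h * keep

def dropout_predict(h, p=0.5):
    # Prediction: scale by (1 - p), preserving the expected output.
    return h * (1.0 - p)

h = np.ones(6)
print(dropout_train(h))    # e.g. [1. 0. 0. 1. 1. 0.] (random mask)
print(dropout_predict(h))  # [0.5 0.5 0.5 0.5 0.5 0.5]
```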
17 ADADELTA: Stochastic Gradient Descent
- Computes update steps on batches of samples
- ADADELTA uses only first-order gradients
- Simple to implement and apply
- Applicable to large data sets and large numbers of parameters
ADADELTA update rule:
  $x_{t+1} = x_t + \Delta x_t, \quad \Delta x_t = -\eta_t \, g_t = -\alpha \, \frac{\sqrt{\sum_{i=1}^{T} \rho^i (1-\rho) \, \Delta x_{t-i}^2}}{\sqrt{\sum_{i=0}^{T} \rho^i (1-\rho) \, g_{t-i}^2}} \, g_t$
Remarks:
- Adaptive learning rate $\eta_t$; the parameters $\alpha$ and $\rho$ must be chosen.
- The learning rate is estimated from the previous gradients $g_t$ and steps $\Delta x_t$.
- The algorithm has been shown to work well in practice.
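A sketch of the update in its usual exponential-moving-average form, which the sums above expand to; the values of alpha, rho, eps, and the toy objective are illustrative:

```python
import numpy as np

def adadelta_step(x, grad, state, rho=0.95, alpha=1.0, eps=1e-6):
    # Accumulate E[g^2] and E[dx^2] as exponential moving averages and
    # scale the gradient by the ratio of their RMS values.
    Eg2, Edx2 = state
    Eg2 = rho * Eg2 + (1 - rho) * grad ** 2
    dx = -alpha * np.sqrt(Edx2 + eps) / np.sqrt(Eg2 + eps) * grad
    Edx2 = rho * Edx2 + (1 - rho) * dx ** 2
    return x + dx, (Eg2, Edx2)

# Toy objective f(x) = x^2, gradient 2x; x moves toward the minimum at 0.
x, state = 5.0, (0.0, 0.0)
for _ in range(500):
    x, state = adadelta_step(x, 2 * x, state)
print(x)
```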
18 Bayesian Optimization
- Optimize selected network parameters, e.g. the decay rate $\rho$
- The objective function is unknown (i.e. parameters → prediction errors)
- Bayesian optimization: optimize while approximating the objective function, inferring the objective function from data, i.e. [parameters, errors]
The optimization loop:
1. Initialize the parameters
2. Train the network with the parameters
3. Compute the prediction error on validation data
4. Learn the objective function: parameters → validation error
5. Choose new parameters according to a criterion, and repeat from step 2
24 Bayesian Optimization with Gaussian Prior
- Learn the objective function with Gaussian process (GP) regression
- GP prediction for a test point $x_t$: $\mathcal{N}(\mu(x_t), \nu(x_t))$
- The selection criterion is computed based on $\mu(x_t)$ and $\nu(x_t)$
Expected Improvement criterion for a given point $x$:
  $a_{EI}(x) = \sqrt{\nu(x)} \left[ \gamma(x) \, \Phi_{\text{norm}}(\gamma(x)) + \phi_{\text{norm}}(\gamma(x)) \right], \quad \gamma(x) = \frac{y_{\text{best}} - \mu(x)}{\sqrt{\nu(x)}}$
where $\Phi_{\text{norm}}(\cdot)$ is the normal cumulative distribution function, $\phi_{\text{norm}}(\cdot)$ the normal probability density function, and $y_{\text{best}}$ the currently best measurement/observation.
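A toy sketch of one iteration of this loop, using scikit-learn's GP regressor and the EI criterion above; all data values are made up:

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def expected_improvement(mu, var, y_best):
    # a_EI = sqrt(v) * [gamma * Phi(gamma) + phi(gamma)], for minimization.
    s = np.sqrt(var) + 1e-12          # jitter avoids division by zero
    gamma = (y_best - mu) / s
    return s * (gamma * norm.cdf(gamma) + norm.pdf(gamma))

# Observed (parameter, validation error) pairs, e.g. candidate decay rates rho.
X = np.array([[0.90], [0.95], [0.99]])
y = np.array([0.30, 0.22, 0.27])
gp = GaussianProcessRegressor().fit(X, y)

# Score a grid of candidates and pick the maximizer of EI as the next rho.
candidates = np.linspace(0.90, 0.99, 50).reshape(-1, 1)
mu, sd = gp.predict(candidates, return_std=True)
ei = expected_improvement(mu, sd ** 2, y.min())
print("next parameter to evaluate:", candidates[np.argmax(ei)])
```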
26 Ensembles: Boosting Prediction Performance
A standard ML approach to improve test performance: combine the output of different models. We can use different random weight initializations, and training with/without the validation set.
How to combine the predictions?
- Each network gives us a prediction, e.g. $p_1 = (0.4, 0.3, 0.3)$, $p_2 = (0.35, 0.35, 0.3)$, $p_3 = (0.1, 0.9, 0.0)$
- We can take the arithmetic or geometric mean, e.g. $p_{\text{avg}} \approx (0.28, 0.52, 0.2)$
- The class prediction is the index with the highest score, e.g. class 2
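The same combination in code, using the example probabilities from above:

```python
import numpy as np

# Class probabilities from three networks (the slide's example values).
preds = np.array([[0.40, 0.30, 0.30],
                  [0.35, 0.35, 0.30],
                  [0.10, 0.90, 0.00]])
p_avg = preds.mean(axis=0)           # arithmetic mean of the predictions
print(p_avg)                         # ~[0.283 0.517 0.2]
print("class:", p_avg.argmax() + 1)  # highest score -> class 2 (1-indexed)
```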
27 Results on Traffic Sign Recognition
CCR (%) | Team            | Deep learning used
...     | IDSIA           | yes
...     | human average   |
...     | BOSCH deep nets | yes
...     | Sermanet        | yes
...     | CAOR            | no
...     | INI-RTCV        | no
...     | INI-RTCV        | no
...     | INI-RTCV        | no
Correct classification rate (CCR) on the final stage of the German Traffic Sign Recognition Benchmark, with training and test images from 43 different German road sign classes.
28 Remarks
- There are various approaches for optimizing and training deep nets, e.g. Bayesian optimization, pre-processing, dropout, ...
- Choose the appropriate techniques based on the application and on experience and knowledge in Machine Learning.
- Try out different training approaches and gain experience.
- Keep up with the developments in the Deep Learning community.
- Further research problems: Bayesian deep learning, unsupervised learning, generative deep models, deep reinforcement learning, adversarial problems, etc.
[Figures: error rates/scores of traditional vs. deep-learning approaches over the years (only the 10 best results plotted), and the number of deep-learning publications per year on Google Scholar (as of October 14).]
29 Deep Learning: Further Reading
- Weston, Ratle & Collobert: Deep Learning via Semi-Supervised Embedding. ICML 2008.
- Hinton & Salakhutdinov: Reducing the Dimensionality of Data with Neural Networks. Science 313, pp. 504-507, 2006.
- Bengio & LeCun: Scaling Learning Algorithms Towards AI. In Bottou et al. (Eds.), Large-Scale Kernel Machines, MIT Press, 2007.
- Hadsell, Chopra & LeCun: Dimensionality Reduction by Learning an Invariant Mapping. CVPR 2006.
- Glorot & Bengio: Understanding the Difficulty of Training Deep Feedforward Neural Networks. AISTATS 2010.
...and newer papers citing those.