Global Optimality in Neural Network Training
1 Global Optimality in Neural Network Training. Benjamin D. Haeffele and René Vidal, Johns Hopkins University, Center for Imaging Science, Baltimore, USA.
2 Questions in Deep Learning Three themes: architecture design, optimization, and generalization.
3 Questions in Deep Learning Are there principled ways to design networks? How many layers? Size of layers? Choice of layer types? How does architecture impact expressiveness? [1] [1] Cohen, et al., On the expressive power of deep learning: A tensor analysis. COLT. (2016)
4 Questions in Deep Learning How to train neural networks? The problem is non-convex. What does the loss surface look like? [1] Are there any guarantees for network training? [2] How can we guarantee optimality? When will local descent succeed? [1] Choromanska, et al., "The loss surfaces of multilayer networks." Artificial Intelligence and Statistics. (2015) [2] Janzamin, et al., "Beating the perils of non-convexity: Guaranteed training of neural networks using tensor methods." arXiv (2015).
8 Questions in Deep Learning Performance guarantees? How do networks generalize? How should networks be regularized? How to prevent overfitting?
9 Interrelated Problems Architecture, optimization, and generalization/regularization interact: optimization can impact generalization [1]; architecture has a strong effect on the generalization of networks [2]; and some architectures could be easier to optimize than others. [1] Neyshabur, et al., In Search of the Real Inductive Bias: On the Role of Implicit Regularization in Deep Learning. ICLR workshop. (2015). [2] Zhang, et al., Understanding deep learning requires rethinking generalization. ICLR. (2017).
10 Today's Talk: The Questions Are there properties of the network architecture that allow efficient optimization? Positive homogeneity; parallel subnetwork structure. Are there properties of the regularization that allow efficient optimization? Positive homogeneity; adapting the network architecture to the data [1]. [1] Bengio, et al., Convex neural networks. NIPS. (2005)
14 Today's Talk: The Results A local minimum such that one subnetwork is all zero is a global minimum. Once the size of the network becomes large enough, local descent can reach a global minimum from any initialization.
17 Outline Architecture 1. Network properties that allow efficient optimization Positive Homogeneity Parallel Subnetwork Structure Generalization/ Regularization Optimization 2. Network size from regularization 3. Theoretical guarantees Sufficient conditions for global optimality Local descent can reach global minimizers
18 Key Property 1: Positive Homogeneity Start with a network mapping Φ(X, W) from inputs X and network weights W to the network outputs. Scale the weights by a non-negative constant α: the network output scales by that constant raised to some power, Φ(X, αW) = α^p Φ(X, W), where p is the degree of positive homogeneity.
24 Most Modern Networks Are Positively Homogeneous Example: Rectified Linear Units (ReLUs). For α ≥ 0, max(αz, 0) = α max(z, 0): scaling by a non-negative constant doesn't change the rectification pattern, so the ReLU is positively homogeneous of degree 1.
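The ReLU identity above extends to entire networks. The following is a minimal numeric sketch (the toy architecture, dimensions, and variable names are illustrative, not from the talk): a network with two weight layers and a ReLU in between is positively homogeneous of degree 2 in its weights.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def net(x, W1, W2):
    # Two weight layers with a ReLU in between: degree of homogeneity p = 2.
    return W2 @ relu(W1 @ x)

rng = np.random.default_rng(0)
x = rng.normal(size=4)
W1 = rng.normal(size=(8, 4))
W2 = rng.normal(size=(3, 8))

alpha = 2.5
# Scaling ALL the weights by alpha scales the output by alpha**p with p = 2.
assert np.allclose(net(x, alpha * W1, alpha * W2), alpha**2 * net(x, W1, W2))
```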
30 Most Modern Networks Are Positively Homogeneous Simple network: Input → Conv + ReLU → Conv + ReLU → Max Pool → Linear → Out. Typically each weight layer increases the degree of homogeneity by 1.
42 Most Modern Networks Are Positively Homogeneous Some common positively homogeneous layers: fully connected + ReLU, convolution + ReLU, max pooling, linear layers, mean pooling, max out, and many other possibilities. Sigmoids, however, are not positively homogeneous.
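Why sigmoids fail the property can be checked directly against the scaling identity (a small illustrative check, not from the talk):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-1.0, 0.5, 2.0])
alpha = 3.0

# ReLU satisfies max(alpha*z, 0) = alpha * max(z, 0) for alpha >= 0 ...
assert np.allclose(np.maximum(alpha * z, 0.0), alpha * np.maximum(z, 0.0))
# ... but the sigmoid does not scale this way: it is not positively homogeneous.
assert not np.allclose(sigmoid(alpha * z), alpha * sigmoid(z))
```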
44 Outline Architecture 1. Network properties that allow efficient optimization Positive Homogeneity Parallel Subnetwork Structure Generalization/ Regularization Optimization 2. Network regularization 3. Theoretical guarantees Sufficient conditions for global optimality Local descent can reach global minimizers
45 Key Property 2: Parallel Subnetworks Subnetworks with identical architecture connected in parallel. Simple example: a single-hidden-layer network, where each subnetwork is one ReLU hidden unit.
49 Key Property 2: Parallel Subnetworks Any positively homogeneous subnetwork can be used; for example, a subnetwork with multiple ReLU layers.
50 Key Property 2: Parallel Subnetworks Example: parallel AlexNets [1], where each subnetwork is a copy of AlexNet mapping the shared input to the output. [1] Krizhevsky, Sutskever, and Hinton. "Imagenet classification with deep convolutional neural networks." NIPS, 2012.
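For the single-hidden-layer case, the parallel structure is easy to verify numerically: the network output is exactly the sum of the outputs of its one-hidden-unit subnetworks. A sketch (dimensions and variable names are illustrative):

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

rng = np.random.default_rng(0)
n, d, c, r = 5, 4, 3, 6   # samples, input dim, output dim, number of subnetworks
X = rng.normal(size=(n, d))
U = rng.normal(size=(d, r))  # hidden-layer weights, one column per subnetwork
V = rng.normal(size=(r, c))  # output weights, one row per subnetwork

# Full network output ...
full = relu(X @ U) @ V
# ... equals the sum over parallel subnetworks, each a single ReLU hidden unit.
parallel = sum(np.outer(relu(X @ U[:, i]), V[i]) for i in range(r))
assert np.allclose(full, parallel)
```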
51 Outline Architecture 1. Network properties that allow efficient optimization Positive Homogeneity Parallel Subnetwork Structure Generalization/ Regularization Optimization 2. Network regularization 3. Theoretical guarantees Sufficient conditions for global optimality Local descent can reach global minimizers
52 Basic Regularization: Weight Decay Weight decay penalizes the sum of squared norms of the network weights, which is positively homogeneous of degree 2 regardless of the network's depth. When the degrees of positive homogeneity of the network and the regularizer don't match, bad things happen. Proposition: There will always exist non-optimal local minima.
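The mismatch is concrete: a network with three weight layers is positively homogeneous of degree 3 in its weights, while weight decay is always degree 2. A small check (the toy architecture is illustrative, not from the talk):

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def net(x, Ws):
    # K = 3 weight layers => the network is positively homogeneous of degree 3.
    a = x
    for W in Ws[:-1]:
        a = relu(W @ a)
    return Ws[-1] @ a

def weight_decay(Ws):
    # Sum of squared norms: positively homogeneous of degree 2, at any depth.
    return sum(np.sum(W**2) for W in Ws)

rng = np.random.default_rng(0)
x = rng.normal(size=4)
Ws = [rng.normal(size=(6, 4)), rng.normal(size=(5, 6)), rng.normal(size=(3, 5))]

alpha = 2.0
assert np.allclose(net(x, [alpha * W for W in Ws]), alpha**3 * net(x, Ws))
assert np.isclose(weight_decay([alpha * W for W in Ws]),
                  alpha**2 * weight_decay(Ws))
# The degrees (3 vs. 2) don't match, which is the source of the problem.
```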
59 Adapting the size of the network via regularization Start with a positively homogeneous network with parallel structure. Take the weights of one subnetwork and define a regularization function on them that is non-negative and positively homogeneous with the same degree as the network mapping. Example: the product of the norms of the subnetwork's weights.
66 Adapting the size of the network via regularization Sum over all the subnetworks, and allow the number of subnetworks to vary: adding a subnetwork is penalized by an additional term in the sum, which acts to constrain the number of subnetworks.
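As a sketch for the single-hidden-layer case, using the product-of-norms regularizer on each subnetwork (function and variable names are illustrative):

```python
import numpy as np

def theta(u, v):
    # Per-subnetwork regularizer: product of norms. This is degree 2,
    # matching the degree-2 homogeneity of a single-hidden-layer network.
    return np.linalg.norm(u) * np.linalg.norm(v)

def Theta(U, V):
    # Sum over all subnetworks; the number of subnetworks r may vary.
    r = U.shape[1]
    return sum(theta(U[:, i], V[i]) for i in range(r))

rng = np.random.default_rng(0)
U = rng.normal(size=(4, 6))
V = rng.normal(size=(6, 3))

small = Theta(U, V)
# Adding a nonzero subnetwork in parallel adds one more term to the sum,
# so the penalty strictly grows -- this is what constrains the network size.
U_big = np.hstack([U, rng.normal(size=(4, 1))])
V_big = np.vstack([V, rng.normal(size=(1, 3))])
assert Theta(U_big, V_big) > small
```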
71 Outline Architecture 1. Network properties that allow efficient optimization Positive Homogeneity Parallel Subnetwork Structure Generalization/ Regularization Optimization 2. Network regularization 3. Theoretical guarantees Sufficient conditions for global optimality Local descent can reach global minimizers
72 Our problem The non-convex problem we're interested in: minimize, over the network weights and the number of subnetworks, the loss between the labels and the network outputs plus the regularizer. Loss function: assumed convex and once differentiable in the network outputs. Examples: cross-entropy, least-squares.
76 Why do all this? The regularizer induces a convex function on the network outputs. The resulting convex problem provides an achievable lower bound for the non-convex network training problem, and the induced convex function can be used as an analysis tool to study the non-convex problem.
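A known special case makes the induced convex function concrete: for a linear single-hidden-layer network (i.e., matrix factorization) with the product-of-norms regularizer, the induced function is the nuclear norm. A numeric sketch (variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 4))

# Balanced factorization A = U @ V.T built from the SVD: the product-of-norms
# penalty over its subnetworks equals the sum of singular values (nuclear norm).
Us, s, Vst = np.linalg.svd(A, full_matrices=False)
U = Us * np.sqrt(s)
V = Vst.T * np.sqrt(s)
prod_of_norms = sum(np.linalg.norm(U[:, i]) * np.linalg.norm(V[:, i])
                    for i in range(len(s)))
assert np.isclose(prod_of_norms, s.sum())

# Any other factorization of the same A costs at least as much, so the
# nuclear norm is an achievable lower bound over all factorizations.
R = rng.normal(size=(4, 4))
U2, V2 = U @ R, V @ np.linalg.inv(R).T   # still satisfies U2 @ V2.T == A
pon2 = sum(np.linalg.norm(U2[:, i]) * np.linalg.norm(V2[:, i]) for i in range(4))
assert pon2 >= prod_of_norms - 1e-9
```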
84 Sufficient Conditions for Global Optimality Theorem: A local minimum such that one subnetwork is all zero is a global minimum. Intuition: such a local minimum satisfies the optimality conditions for the convex problem.
87 Global Minima from Local Descent Theorem: If the size of the network is large enough (has enough subnetworks), then a global minimum can always be reached by local descent from any initialization. Meta-algorithm: If not at a local minimum, perform local descent. At a local minimum, test whether the first theorem is satisfied. If not, add a subnetwork in parallel and continue. The maximum number of subnetworks needed is guaranteed to be bounded by the dimensions of the network output.
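The meta-algorithm's control flow can be sketched as follows. This is a structural sketch only: `local_descent` is a placeholder for whatever local-descent method one uses, and the single-hidden-layer parameterization and helper names are illustrative, not the authors' implementation.

```python
import numpy as np

def local_descent(U, V):
    # Placeholder: a real version would descend the regularized training
    # objective until reaching a local minimum.
    return U, V

def has_all_zero_subnetwork(U, V, tol=1e-8):
    # Certificate from the first theorem: some subnetwork is entirely zero.
    return any(np.linalg.norm(U[:, i]) + np.linalg.norm(V[i]) < tol
               for i in range(U.shape[1]))

def add_subnetwork(U, V):
    # Grow the network by one parallel subnetwork, initialized at zero.
    return (np.hstack([U, np.zeros((U.shape[0], 1))]),
            np.vstack([V, np.zeros((1, V.shape[1]))]))

def meta_algorithm(U, V, max_subnetworks):
    U, V = local_descent(U, V)
    # If the certificate fails at a local minimum, add a subnetwork and retry;
    # the number of subnetworks needed is bounded (here, by max_subnetworks).
    while not has_all_zero_subnetwork(U, V) and U.shape[1] < max_subnetworks:
        U, V = add_subnetwork(U, V)
        U, V = local_descent(U, V)
    return U, V

rng = np.random.default_rng(0)
U0, V0 = rng.normal(size=(4, 3)), rng.normal(size=(3, 2))
U1, V1 = meta_algorithm(U0, V0, max_subnetworks=10)
assert has_all_zero_subnetwork(U1, V1)  # loop exits with the certificate met
```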
90 Conclusions Network size matters: optimize network weights AND network size. Current: size = number of parallel subnetworks. Future: size = number of layers, neurons per layer, etc. Regularization design matters: match the degrees of positive homogeneity between the network and the regularization; regularization can control the size of the network. Not done yet: several practical and theoretical limitations remain.
91 Thank You Vision Lab, Johns Hopkins University; Center for Imaging Science, Johns Hopkins University. Work supported by NSF grants.
Deep Indian Delicacy: Classification of Indian Food Images using Convolutional Neural Networks Shamay Jahan 1, Shashi Rekha 2. H, Shah Ayub Quadri 3 1 M. Tech., semester 4, DoS in CSE, Visvesvaraya Technological
More informationScalable and Modularized RTL Compilation of Convolutional Neural Networks onto FPGA
Scalable and Modularized RTL Compilation of Convolutional Neural Networks onto FPGA Yufei Ma, Naveen Suda, Yu Cao, Jae-sun Seo, Sarma Vrudhula School of Electrical, Computer and Energy Engineering School
More informationMachine learning for vision. It s the features, stupid! cathedral. high-rise. Winter Roland Memisevic. Lecture 2, January 26, 2016
Winter 2016 Lecture 2, Januar 26, 2016 f2? cathedral high-rise f1 A common computer vision pipeline before 2012 1. 2. 3. 4. Find interest points. Crop patches around them. Represent each patch with a sparse
More informationCOMP9444 Neural Networks and Deep Learning 5. Geometry of Hidden Units
COMP9 8s Geometry of Hidden Units COMP9 Neural Networks and Deep Learning 5. Geometry of Hidden Units Outline Geometry of Hidden Unit Activations Limitations of -layer networks Alternative transfer functions
More informationNVIDIA FOR DEEP LEARNING. Bill Veenhuis
NVIDIA FOR DEEP LEARNING Bill Veenhuis bveenhuis@nvidia.com Nvidia is the world s leading ai platform ONE ARCHITECTURE CUDA 2 GPU: Perfect Companion for Accelerating Apps & A.I. CPU GPU 3 Intro to AI AGENDA
More informationEnsemble methods in machine learning. Example. Neural networks. Neural networks
Ensemble methods in machine learning Bootstrap aggregating (bagging) train an ensemble of models based on randomly resampled versions of the training set, then take a majority vote Example What if you
More informationNeural Networks with Input Specified Thresholds
Neural Networks with Input Specified Thresholds Fei Liu Stanford University liufei@stanford.edu Junyang Qian Stanford University junyangq@stanford.edu Abstract In this project report, we propose a method
More informationCS 523: Multimedia Systems
CS 523: Multimedia Systems Angus Forbes creativecoding.evl.uic.edu/courses/cs523 Today - Convolutional Neural Networks - Work on Project 1 http://playground.tensorflow.org/ Convolutional Neural Networks
More informationMinimizing Computation in Convolutional Neural Networks
Minimizing Computation in Convolutional Neural Networks Jason Cong and Bingjun Xiao Computer Science Department, University of California, Los Angeles, CA 90095, USA {cong,xiao}@cs.ucla.edu Abstract. Convolutional
More informationINTRODUCTION TO DEEP LEARNING
INTRODUCTION TO DEEP LEARNING CONTENTS Introduction to deep learning Contents 1. Examples 2. Machine learning 3. Neural networks 4. Deep learning 5. Convolutional neural networks 6. Conclusion 7. Additional
More informationLearning Deep Features for Visual Recognition
7x7 conv, 64, /2, pool/2 1x1 conv, 64 3x3 conv, 64 1x1 conv, 64 3x3 conv, 64 1x1 conv, 64 3x3 conv, 64 1x1 conv, 128, /2 3x3 conv, 128 1x1 conv, 512 1x1 conv, 128 3x3 conv, 128 1x1 conv, 512 1x1 conv,
More informationHidden Units. Sargur N. Srihari
Hidden Units Sargur N. srihari@cedar.buffalo.edu 1 Topics in Deep Feedforward Networks Overview 1. Example: Learning XOR 2. Gradient-Based Learning 3. Hidden Units 4. Architecture Design 5. Backpropagation
More informationKnow your data - many types of networks
Architectures Know your data - many types of networks Fixed length representation Variable length representation Online video sequences, or samples of different sizes Images Specific architectures for
More informationarxiv: v3 [cs.cv] 29 May 2017
RandomOut: Using a convolutional gradient norm to rescue convolutional filters Joseph Paul Cohen 1 Henry Z. Lo 2 Wei Ding 2 arxiv:1602.05931v3 [cs.cv] 29 May 2017 Abstract Filters in convolutional neural
More informationReal-time convolutional networks for sonar image classification in low-power embedded systems
Real-time convolutional networks for sonar image classification in low-power embedded systems Matias Valdenegro-Toro Ocean Systems Laboratory - School of Engineering & Physical Sciences Heriot-Watt University,
More informationMachine Learning 13. week
Machine Learning 13. week Deep Learning Convolutional Neural Network Recurrent Neural Network 1 Why Deep Learning is so Popular? 1. Increase in the amount of data Thanks to the Internet, huge amount of
More informationLecture 7: Neural network acoustic models in speech recognition
CS 224S / LINGUIST 285 Spoken Language Processing Andrew Maas Stanford University Spring 2017 Lecture 7: Neural network acoustic models in speech recognition Outline Hybrid acoustic modeling overview Basic
More informationConvolutionalNN's... ConvNet's... deep learnig
Deep Learning ConvolutionalNN's... ConvNet's... deep learnig Markus Thaler, TG208 tham@zhaw.ch www.zhaw.ch/~tham Martin Weisenhorn, TB427 weie@zhaw.ch 20.08.2018 1 Neural Networks Classification: up to
More informationIndex. Umberto Michelucci 2018 U. Michelucci, Applied Deep Learning,
A Acquisition function, 298, 301 Adam optimizer, 175 178 Anaconda navigator conda command, 3 Create button, 5 download and install, 1 installing packages, 8 Jupyter Notebook, 11 13 left navigation pane,
More informationComparing Dropout Nets to Sum-Product Networks for Predicting Molecular Activity
000 001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050
More information2. Neural network basics
2. Neural network basics Next commonalities among different neural networks are discussed in order to get started and show which structural parts or concepts appear in almost all networks. It is presented
More informationElastic Neural Networks for Classification
Elastic Neural Networks for Classification Yi Zhou 1, Yue Bai 1, Shuvra S. Bhattacharyya 1, 2 and Heikki Huttunen 1 1 Tampere University of Technology, Finland, 2 University of Maryland, USA arxiv:1810.00589v3
More informationTiny ImageNet Visual Recognition Challenge
Tiny ImageNet Visual Recognition Challenge Ya Le Department of Statistics Stanford University yle@stanford.edu Xuan Yang Department of Electrical Engineering Stanford University xuany@stanford.edu Abstract
More informationSpatial Localization and Detection. Lecture 8-1
Lecture 8: Spatial Localization and Detection Lecture 8-1 Administrative - Project Proposals were due on Saturday Homework 2 due Friday 2/5 Homework 1 grades out this week Midterm will be in-class on Wednesday
More informationNotes on Multilayer, Feedforward Neural Networks
Notes on Multilayer, Feedforward Neural Networks CS425/528: Machine Learning Fall 2012 Prepared by: Lynne E. Parker [Material in these notes was gleaned from various sources, including E. Alpaydin s book
More informationDeep Learning With Noise
Deep Learning With Noise Yixin Luo Computer Science Department Carnegie Mellon University yixinluo@cs.cmu.edu Fan Yang Department of Mathematical Sciences Carnegie Mellon University fanyang1@andrew.cmu.edu
More informationDynamic Routing Between Capsules
Report Explainable Machine Learning Dynamic Routing Between Capsules Author: Michael Dorkenwald Supervisor: Dr. Ullrich Köthe 28. Juni 2018 Inhaltsverzeichnis 1 Introduction 2 2 Motivation 2 3 CapusleNet
More informationPart Localization by Exploiting Deep Convolutional Networks
Part Localization by Exploiting Deep Convolutional Networks Marcel Simon, Erik Rodner, and Joachim Denzler Computer Vision Group, Friedrich Schiller University of Jena, Germany www.inf-cv.uni-jena.de Abstract.
More informationMind the regularized GAP, for human action classification and semi-supervised localization based on visual saliency.
Mind the regularized GAP, for human action classification and semi-supervised localization based on visual saliency. Marc Moreaux123, Natalia Lyubova1, Isabelle Ferrane 2 and Frederic Lerasle3 1 Softbank
More informationDeepIndex for Accurate and Efficient Image Retrieval
DeepIndex for Accurate and Efficient Image Retrieval Yu Liu, Yanming Guo, Song Wu, Michael S. Lew Media Lab, Leiden Institute of Advance Computer Science Outline Motivation Proposed Approach Results Conclusions
More informationROB 537: Learning-Based Control
ROB 537: Learning-Based Control Week 6, Lecture 1 Deep Learning (based on lectures by Fuxin Li, CS 519: Deep Learning) Announcements: HW 3 Due TODAY Midterm Exam on 11/6 Reading: Survey paper on Deep Learning
More information