DECISION TREES & RANDOM FORESTS X CONVOLUTIONAL NEURAL NETWORKS
- Lynette Osborne
- 5 years ago
Transcription
1 DECISION TREES & RANDOM FORESTS X CONVOLUTIONAL NEURAL NETWORKS
- Deep Neural Decision Forests, Microsoft Research Cambridge UK, ICCV 2015
- Decision Forests, Convolutional Networks and the Models in-between, Microsoft Research Technical Report, arXiv, 3 Mar
- Meir Dalal, Or Gorodissky
2 OVERVIEW OF THE PRESENTATION
- Motivation
- Decision trees
- Random forests
- Decision trees vs CNN
- Combining decision tree & CNN
3 MOTIVATION
- Combining the CNN's feature learning with the random forest's classification capacities
4 DECISION TREE - WHAT IS IT
- Supervised learning algorithm used for classification
- An inductive learning task: use particular facts to reach more general conclusions
- A predictive model based on a branching series of tests
- These smaller tests are less complex than a one-stage classifier (divide & conquer)
- A different way to look at it: each node either predicts the answer or passes the problem to a different node
5 DECISION TREES - TYPICAL (NAIVE) PROBLEM
- Training examples (table columns: Example, Attributes, Target)
6 DECISION TREES - TYPICAL (NAIVE) PROBLEM CONT.
7 DECISION TREES - TYPICAL (NAIVE) PROBLEM CONT.
8 DECISION TREES - HOW TO CONSTRUCT
- When to stop:
  - All the instances have the same target class
  - There are no more instances
  - There are no more attributes
  - The pre-defined maximum depth is reached
- How to split? Decision trees are usually constructed top-down, choosing the split at each node by a criterion such as:
  - Gini impurity
  - Information gain
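The two split criteria named on this slide can be sketched in a few lines. This is a minimal illustration using only the standard library; the function names and the toy labels are assumptions made for the example, not part of the presentation.

```python
import math
from collections import Counter

def gini_impurity(labels):
    """1 - sum_c p_c^2, where p_c is the fraction of labels in class c."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy_bits(labels):
    """Shannon entropy of the label distribution, in bits."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, left, right):
    """Parent entropy minus the size-weighted entropy of the two children."""
    n = len(parent)
    children = (len(left) / n) * entropy_bits(left) + (len(right) / n) * entropy_bits(right)
    return entropy_bits(parent) - children

# Toy node with two classes, split two different ways (hypothetical data).
parent = ["yes", "yes", "no", "no"]
pure_split_gain = information_gain(parent, ["yes", "yes"], ["no", "no"])
useless_split_gain = information_gain(parent, ["yes", "no"], ["yes", "no"])
```

A split that separates the classes perfectly recovers all of the parent's entropy, while a split that leaves both children as mixed as the parent gains nothing, which is why a top-down builder prefers the former.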
9 DECISION TREES - TERMINOLOGY
- Root node
- Decision node
- Splitting
- Prediction node
10 DECISION TREES - STOCHASTIC ROUTING
- Input space $\mathcal{X}$, output space $\mathcal{Y}$
- Decision nodes: $n \in \mathcal{N}$, each with a routing function $d_n(\cdot;\Theta)$
- Prediction nodes: $\ell \in \mathcal{L}$, each holding a distribution $\pi_\ell$ over $\mathcal{Y}$
- $\Theta$: decision node parameterization
- Routing function: until now $d_n$ was binary and the routing deterministic
- Leaf predictions are denoted $\pi_\ell \in \pi$
- Stochastic routing function: $d_n(\cdot;\Theta): \mathcal{X} \to [0,1]$
- The routing decision is the output of a Bernoulli random variable with mean $d_n(x;\Theta)$
- Each leaf node contains a probability for each class
11 DECISION TREE - ENSEMBLE METHODS
- If a decision tree is fully grown, it may lose some generalization capability (overfitting)
- How to solve it? Ensemble methods
- These involve a group of predictive models to achieve better accuracy and model stability
12 RANDOM FOREST
- When you can't think of any algorithm, use random forest!
- Algorithm (bootstrap aggregation):
  1. Grow K different decision trees:
     1. Pick a random subset of the training examples (with replacement)
     2. Pick d << D random attributes to split the data
     3. Grow each tree to the largest extent possible, with no pruning
  2. Given a new data point x:
     1. Classify x using each of the trees T_1 ... T_K
     2. Predict by aggregating the predictions of the trees (i.e., majority vote for classification, average for regression)
- Forest decision: averaging all the trees' predictions
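The bagging procedure above can be sketched with the standard library alone. To keep the example short, each "tree" here is a one-level stump rather than a fully grown tree, which is a simplification and not what the slide prescribes; `fit_stump`, `fit_forest`, `predict` and the toy data are hypothetical names introduced for illustration.

```python
import random
from collections import Counter

def fit_stump(X, y, feature_ids):
    """Return the (feature, threshold, left_label, right_label) with fewest training errors."""
    best = None
    for f in feature_ids:
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0] if right else l_lab
            errors = sum(lab != l_lab for lab in left) + sum(lab != r_lab for lab in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    return best[1:]

def fit_forest(X, y, n_trees=25, n_features=1, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Step 1.1: bootstrap sample, drawn with replacement.
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        # Step 1.2: random subset of d attributes out of D.
        feats = rng.sample(range(len(X[0])), n_features)
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest

def predict(forest, x):
    # Step 2: classify with every tree, then take the majority vote.
    votes = [l_lab if x[f] <= t else r_lab for f, t, l_lab, r_lab in forest]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical toy data: both attributes are small for class 0 and large for class 1.
X = [[0.1, 1.0], [0.2, 1.5], [0.3, 0.5], [0.9, 4.5], [0.8, 3.5], [1.0, 5.5]]
y = [0, 0, 0, 1, 1, 1]
forest = fit_forest(X, y)
```

Individual stumps trained on an unlucky bootstrap sample can be wrong, but the vote over the ensemble smooths that out, which is the point of aggregation.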
13 DECISION TREES X CONV NEURAL NETS
DT:
- Levels
- Divide & conquer
- Only $\log_2 N$ parameters used at test time
- No feature learning (at most)
- Training is done layer-wise
- High efficiency
CNN:
- Layers
- High dimensionality
- Uses all the parameters at test time!
- Feature learning integrated with classification
- Training end-to-end with (S)GD
- State-of-the-art accuracy
How to efficiently combine DT/RF with CNN?
14 DECISION TREE BY CNN FEATURES ARCHITECTURE
- [Figure: CNN, RF, Softmax]
15 DECISION TREE BY CNN FEATURES ARCHITECTURE
- [Figure: CNN, RF]
16 DECISION TREE BY CNN FEATURES ARCHITECTURE
- [Figure: CNN, RF]
- Forest decision: averaging all the trees' predictions
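17 DECISION TREE BY CNN FEATURES ARCHITECTURE
- Decision nodes: $d_n(x;\Theta) = \sigma(f_n(x;\Theta))$, where $\sigma(x) = (1 + e^{-x})^{-1}$ is the sigmoid function and $f_n(\cdot;\Theta): \mathcal{X} \to \mathbb{R}$
- Prediction nodes: the probability prediction for sample $x$ is $\mathbb{P}_T[y \mid x, \Theta, \pi] = \sum_{\ell \in \mathcal{L}} \pi_{\ell y}\, \mu_\ell(x \mid \Theta)$, where
  - $\pi_{\ell y}$: probability that a sample reaching leaf $\ell$ takes class $y$
  - $\mu_\ell(x \mid \Theta)$: probability that sample $x$ reaches leaf $\ell$, with $\sum_{\ell \in \mathcal{L}} \mu_\ell(x \mid \Theta) = 1$
- Forest of decision trees: deliver a prediction for a sample $x$ by averaging the output of each tree: $\mathbb{P}_F[y \mid x] = \frac{1}{K} \sum_{h=1}^{K} \mathbb{P}_{T_h}[y \mid x]$, where $K$ is the number of decision trees in the forest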
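The formulas on this slide can be traced on a small example. The sketch below assumes a depth-2 tree (three decision nodes, four leaves), a linear $f_n$, and the convention that $d_n$ is the probability of routing left; all of these, along with the random parameters and function names, are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def leaf_reach_probs(x, W, b):
    """mu_l(x | Theta) for a depth-2 tree: node 0 is the root, 1 and 2 its children."""
    d = sigmoid(W @ x + b)            # d_n(x; Theta): probability of going left at node n
    return np.array([
        d[0] * d[1],                  # left, then left
        d[0] * (1 - d[1]),            # left, then right
        (1 - d[0]) * d[2],            # right, then left
        (1 - d[0]) * (1 - d[2]),      # right, then right
    ])

def tree_predict(x, W, b, pi):
    """P_T[y | x] = sum_l pi[l, y] * mu_l(x); pi has one row per leaf."""
    return leaf_reach_probs(x, W, b) @ pi

def forest_predict(x, trees):
    """P_F[y | x]: average of the per-tree predictions."""
    return np.mean([tree_predict(x, *t) for t in trees], axis=0)

rng = np.random.default_rng(0)

def random_tree(n_features=5, n_classes=3):
    W = rng.normal(size=(3, n_features))            # one linear f_n per decision node
    b = rng.normal(size=3)
    pi = rng.dirichlet(np.ones(n_classes), size=4)  # each leaf: a distribution over classes
    return W, b, pi

trees = [random_tree() for _ in range(4)]
x = rng.normal(size=5)
mu = leaf_reach_probs(x, *trees[0][:2])
p_forest = forest_predict(x, trees)
```

Because every sample reaches every leaf with some probability, the leaf reach probabilities sum to one and the forest output is itself a valid class distribution.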
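18 TWO-STEP OPTIMIZATION STRATEGY
- Objective function: the risk $R(\Theta, \pi; \mathcal{T})$ on the training set $\mathcal{T}$
- (1) Learning decision nodes. Our goal: $\min_\Theta R(\Theta, \pi; \mathcal{T})$
  - $\eta > 0$: learning rate
  - $\mathcal{B} \subseteq \mathcal{T}$: random subset (mini-batch)
- (2) Learning prediction nodes. Our goal: $\min_\pi R(\Theta, \pi; \mathcal{T})$
  - $Z_\ell^{(t)}$: normalization factor
  - $\pi_{\ell y}^{(0)}$: arbitrary, $> 0$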
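Step (2) can be sketched as follows, assuming the risk is the average negative log-likelihood and that the leaf update is a multiplicative fixed-point iteration of the kind described in the ICCV 2015 paper: each $\pi_{\ell y}$ is reweighted by how much leaf $\ell$ contributed to correctly labelled samples, then renormalized by $Z_\ell$. The routing probabilities `mu` are held fixed here, and the data is random, so this is only an illustration of the update's shape.

```python
import numpy as np

def risk(pi, mu, y):
    """Average negative log-likelihood of the true labels under the tree."""
    p_true = (mu @ pi)[np.arange(len(y)), y]
    return -np.mean(np.log(p_true))

def update_leaf_predictions(pi, mu, y, n_iters=20):
    """
    pi: (L, C) leaf class distributions, strictly positive, rows sum to 1
    mu: (N, L) leaf reach probabilities mu_l(x_i | Theta), fixed during this step
    y:  (N,)  integer class labels
    """
    onehot = np.eye(pi.shape[1])[y]                      # (N, C)
    for _ in range(n_iters):
        p_true = (mu @ pi)[np.arange(len(y)), y]         # P_T[y_i | x_i]
        # new_pi[l, c] is proportional to
        #   sum over samples with y_i == c of pi[l, c] * mu[i, l] / P_T[y_i | x_i]
        new_pi = pi * ((mu / p_true[:, None]).T @ onehot)
        pi = new_pi / new_pi.sum(axis=1, keepdims=True)  # divide by Z_l
    return pi

rng = np.random.default_rng(0)
n_samples, n_leaves, n_classes = 200, 4, 3
mu = rng.dirichlet(np.ones(n_leaves), size=n_samples)    # hypothetical routing output
y = rng.integers(0, n_classes, size=n_samples)
pi0 = np.full((n_leaves, n_classes), 1.0 / n_classes)    # arbitrary positive start
pi_new = update_leaf_predictions(pi0, mu, y)
```

In a full training loop this step would alternate with step (1), where $\Theta$ is updated by mini-batch gradient descent with learning rate $\eta$ while $\pi$ is held fixed.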
19 LEARNING TREE BY BACK PROPAGATION
- (1) $\Theta$: randomly select a tree in the forest for each mini-batch
- (2) $\pi$: update the prediction nodes in each tree independently, since each tree has its own set of leaf predictions
20 LEARNING AND ENTROPY
- How can we quantify the network's learning process?
- Decision nodes: measure the decision uncertainty for a given sample x
- As the certainty of routing a sample increases, the sample is routed with reasonably high probability to only a small subset of the available decision nodes
- [Figure: histograms of $d_n$ responses on the validation set after 100, 500 and 1K epochs; x-axis: $d_n$ output values, y-axis: histogram counts]
21 LEARNING AND ENTROPY
- How can we quantify the network's learning process?
- Leaf entropy: measure the leaf posterior distribution
- Highly peaked distributions for the leaf predictors lead to low entropy
- [Figure: average leaf entropy during training; x-axis: number of training epochs, y-axis: average leaf entropy in bits]
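The leaf entropy measure is straightforward to compute. The sketch below assumes the leaf predictors are stored as one row per leaf with one probability per class; the two example matrices are made up to contrast an untrained (uniform) tree with a confident (peaked) one.

```python
import numpy as np

def average_leaf_entropy_bits(pi, eps=1e-12):
    """Mean over leaves of the Shannon entropy of each leaf's class distribution, in bits."""
    pi = np.asarray(pi, dtype=float)
    # Clipping avoids log2(0); zero-probability entries still contribute 0 to the sum.
    per_leaf = -np.sum(pi * np.log2(np.clip(pi, eps, 1.0)), axis=1)
    return per_leaf.mean()

uniform = np.full((4, 4), 0.25)                      # 4 leaves, 4 classes, no preference
peaked = np.array([[0.97, 0.01, 0.01, 0.01]] * 4)    # each leaf strongly favors one class

h_uniform = average_leaf_entropy_bits(uniform)
h_peaked = average_leaf_entropy_bits(peaked)
```

With four classes the uniform leaves sit at the maximum of 2 bits, and the value drops toward 0 as the leaves become more peaked, which is the trend the training plot on this slide tracks.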
22 RESULTS 1
- Algorithms:
  - ADF: state-of-the-art stand-alone, off-the-shelf forest ensemble
  - sNDF: 1 fully connected layer, no hidden layers
23 RESULTS 2
- Architecture:
  - GoogLeNet*: GoogLeNet implementation from the Distributed (Deep) Machine Learning Common (DMLC) library
  - dNDF.NET: replaces each softmax layer in GoogLeNet* with a random forest consisting of 10 trees
24 CONCLUSIONS
- Novel algorithm for learning a random forest: sNDF (shallow neural decision forest)
- A model that unifies representation learning and classification using a random forest: dNDF.NET (deep neural decision forest)
- Training dNDFs: two-step stochastic gradient descent over the prediction function and the routing function
- No dramatic improvement in accuracy compared to regular GoogLeNet
25 RECAP
- Before:
  - Decision trees and random forests are efficient classifiers
  - CNNs are state of the art as feature extractors and classifiers
- In Deep Neural Decision Forests, ICCV 2015 (Peter Kontschieder):
  - All softmax layers of a GoogLeNet variation are replaced with random forests
  - A two-step SGD is defined for finding both the decision and the prediction functions
  - Trained end-to-end, it achieved (slightly) better results
- Now: in Decision Forests, Convolutional Networks and the Models in-between, Microsoft Research Technical Report, arXiv, 3 Mar (Yani Ioannou):
  - Generalize DT and CNN as Conditional Networks using routers
  - Reduce the compute cost of state-of-the-art architectures while maintaining accuracy
26 SAVE THE PLANET / YOUR PHONE (MOTIVATION)
- A single VGG16 forward pass uses ~30 GFLOPs
- A top-ranking efficient supercomputer (HPC) delivers ~10 GFLOPS / Watt
- 100,000,000 US searches for an image on their cloud: ~300 MW
- After one hour: energy equivalent to ~45 tons of coal
- Nomophobia (from Wikipedia, the free encyclopedia): a proposed name for the phobia of being out of mobile phone contact. It is, however, arguable that the word "phobia" is misused and that in the majority of cases it is another form of anxiety disorder. Although nomophobia does not appear in the current Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5), it has been proposed as a "specific phobia", based on definitions given in the DSM-IV.
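The numbers on this slide can be reproduced with a few lines of arithmetic. Two inputs are assumptions not stated in the transcript: that the 100,000,000 searches arrive every second (which is what makes the 300 MW figure come out), and an energy content of roughly 24 MJ per kg of coal.

```python
flops_per_pass = 30e9          # one VGG16 forward pass (from the slide)
flops_per_joule = 10e9         # 10 GFLOPS per watt is 10 GFLOP per joule (from the slide)
searches_per_second = 100e6    # ASSUMPTION: 100,000,000 searches every second
coal_joules_per_kg = 24e6      # ASSUMPTION: about 24 MJ per kg of coal

joules_per_pass = flops_per_pass / flops_per_joule            # energy of a single search
power_watts = joules_per_pass * searches_per_second           # sustained power draw
energy_one_hour = power_watts * 3600                          # joules consumed in one hour
coal_tonnes = energy_one_hour / coal_joules_per_kg / 1000.0   # equivalent mass of coal

print(f"{joules_per_pass:.0f} J per search")
print(f"{power_watts / 1e6:.0f} MW sustained")
print(f"{coal_tonnes:.0f} tonnes of coal per hour")
```

Under these assumptions each search costs about 3 J, which is small on its own; the 300 MW and 45 tonne figures come entirely from the request volume.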
27 MOTIVATION
- Neural networks are becoming deeper and more complex, carrying a quickly growing computational cost
- We would like to make neural networks more efficient by introducing ideas from decision trees
- Decide on the fly how accurate or how efficient you want your prediction to be (trade-off)
- [Figure: top-1 accuracy on ImageNet vs. number of operations (GFLOPs); marker size is the number of parameters]
28 DECISION TREES X DEEP NEURAL NETS - TAKING A CLOSER LOOK
- DT: decision nodes; CNN: ReLU
- DT: random forest; CNN: ensembles
- DT: prediction nodes; CNN: softmax
- DT: deactivating branches; CNN: dropout
- DT: more efficient; CNN: more accurate
- Actually they are similar. But how do we combine them? Generalize both as Conditional Networks
29 POC - FROM NET TO TREE
- Take 2 consecutive layers from a trained CNN (VGG)
- Calculate the cross-correlation matrix of the 2 layers, as for a fully connected neural network
- Rearrange it as a block matrix (grouping the higher cross-correlation values)
- Decorrelate by zeroing the off-diagonal block elements
- Replot the net with the branched structure
30 FAST NOTATION
31 INTRODUCING THE ROUTER NODE
- [Figure: split node and data router, with router outputs r(1), r(2)]
- Implemented here as a perceptron, though other choices are possible
- Outputs real-valued weights that affect data routing
32 INTRODUCING THE ROUTER NODE
- Explicit routing: data is sent conditionally to a single route or to multiple routes
33 INTRODUCING THE ROUTER NODE
- Implicit routing: data is sent unconditionally but selectively to all child nodes
34 INTRODUCING THE ROUTER NODE
- [Equation: partial derivative]
- Hard routing: binary weights on branches (on/off)
- Soft routing: real weights on branches
35 INTRODUCING THE ROUTER NODE
- Quiz: where are DTs in the grid of hard / soft versus explicit / implicit routing?
36 INTRODUCING THE ROUTER NODE
- Answer: DTs sit in the hard, explicit cell
- The generalization is called a Conditional Network
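The difference between soft routing and hard, explicit routing can be sketched with a perceptron router. The softmax normalization of the router outputs, the two toy branches and the function names below are assumptions for illustration; the slides only state that the router is a perceptron producing real-valued weights.

```python
import numpy as np

def router_weights(x, W, b):
    """Perceptron router: one real-valued weight per branch (softmax-normalized here)."""
    z = W @ x + b
    e = np.exp(z - z.max())        # subtract the max for numerical stability
    return e / e.sum()

def soft_route(x, branches, W, b):
    """Soft routing: every branch is evaluated and the outputs are blended by the weights."""
    r = router_weights(x, W, b)
    return sum(r_j * branch(x) for r_j, branch in zip(r, branches))

def hard_explicit_route(x, branches, W, b):
    """Hard, explicit routing: only the top-scoring branch is evaluated, as in a decision tree."""
    j = int(np.argmax(router_weights(x, W, b)))
    return branches[j](x), j

# Two hypothetical branches standing in for sub-networks.
branches = [lambda v: v + 1.0, lambda v: v * 10.0]
W = np.eye(2)
b = np.zeros(2)
x = np.array([2.0, 0.0])

r = router_weights(x, W, b)
soft_out = soft_route(x, branches, W, b)
hard_out, chosen = hard_explicit_route(x, branches, W, b)
```

Hard explicit routing pays for one branch only, which is where the compute saving comes from, whereas soft routing keeps everything differentiable at the price of evaluating all branches.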
37 EXPERIMENT CONDITIONAL GOOGLE-NET
- Ensemble / random forest architecture
- Based on two GoogLeNets: a regular one and one with 10x oversampling
- This time we learn an explicit router based on a simple CNN
- The router is trained jointly to predict the accuracy of each route for each image
38 EXPERIMENT CONDITIONAL GOOGLE-NET
- Purple dots: accuracies of the original networks
- Dashed line: accuracy when choosing each network at random
- Green line: amortized cost-to-accuracy curve on the validation set
- Green point: operating point where we achieve almost the accuracy of the 10x oversampled CNN with less than half the computational cost
- We can decide at test time what accuracy we require
39 EFFICIENCY BENEFITS OF IMPLICIT ROUTING
- Top: a standard CNN (one route). Bottom: a two-routed implicit architecture
- The larger boxes denote feature maps, the smaller ones the filters
- Due to branching, the depth of the second set of kernels (in yellow) changes between the two architectures, yielding a lower computational cost
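The saving from shallower kernels can be checked with a simple count. The sketch assumes that the routes split both the input and the output channels evenly, so each filter only spans the channels of its own route; the layer sizes are hypothetical and biases are ignored.

```python
def conv_macs(h, w, c_in, c_out, k, routes=1):
    """Multiply-accumulates of a k x k convolution whose filters are split into `routes` groups."""
    assert c_in % routes == 0 and c_out % routes == 0
    per_route = h * w * (c_in // routes) * (c_out // routes) * k * k
    return routes * per_route

def conv_params(c_in, c_out, k, routes=1):
    """Number of filter weights for the same layer."""
    assert c_in % routes == 0 and c_out % routes == 0
    return routes * (c_in // routes) * (c_out // routes) * k * k

# Hypothetical layer: 56 x 56 output map, 64 input channels, 128 filters of size 3 x 3.
single_route = conv_macs(56, 56, 64, 128, 3, routes=1)
two_routes = conv_macs(56, 56, 64, 128, 3, routes=2)
```

With two routes each kernel is half as deep, so both the operation count and the parameter count of the layer are halved while the number of output feature maps stays the same.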
40 EXPERIMENT CONDITIONAL VGG11
- Split the features into 2
- Based on VGG11 with an additional global max pooling layer after the last convolutional layer
- Implemented as a DAG
41 EXPERIMENT CONDITIONAL VGG11
- Matches the original VGG11 top-5 error with less than half the compute (45%) and almost one-fifth (21%) of the parameters
- Training from scratch took twice the epochs, but the overall time remained the same due to the decrease in computation
42 TL;DR
- Decision trees are efficient and CNNs are accurate
- Conditional networks are the generalization of both
- Trade-off: we try to find the sweet spot combining the two
- By using implicit routing, we could achieve a 50% reduction in computational and memory cost
- By using explicit routing, we could achieve a 50% reduction in computational cost at the same accuracy
- Decide on the fly how accurate versus how costly we want to be
- If you aren't more accurate, maybe you're more efficient
More informationCS 6501: Deep Learning for Computer Graphics. Training Neural Networks II. Connelly Barnes
CS 6501: Deep Learning for Computer Graphics Training Neural Networks II Connelly Barnes Overview Preprocessing Initialization Vanishing/exploding gradients problem Batch normalization Dropout Additional
More informationCOMP 551 Applied Machine Learning Lecture 16: Deep Learning
COMP 551 Applied Machine Learning Lecture 16: Deep Learning Instructor: Ryan Lowe (ryan.lowe@cs.mcgill.ca) Slides mostly by: Class web page: www.cs.mcgill.ca/~hvanho2/comp551 Unless otherwise noted, all
More informationA Systematic Overview of Data Mining Algorithms
A Systematic Overview of Data Mining Algorithms 1 Data Mining Algorithm A well-defined procedure that takes data as input and produces output as models or patterns well-defined: precisely encoded as a
More informationDeep Learning for Embedded Security Evaluation
Deep Learning for Embedded Security Evaluation Emmanuel Prouff 1 1 Laboratoire de Sécurité des Composants, ANSSI, France April 2018, CISCO April 2018, CISCO E. Prouff 1/22 Contents 1. Context and Motivation
More informationLecture 6-Decision Tree & MDL
6-Decision Tree & MDL-1 Machine Learning Lecture 6-Decision Tree & MDL Lecturer: Haim Permuter Scribes: Asaf Lavi and Ben Marinberg This lecture discusses decision trees and the minimum description length
More informationComputer Vision Lecture 16
Computer Vision Lecture 16 Deep Learning for Object Categorization 14.01.2016 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de leibe@vision.rwth-aachen.de Announcements Seminar registration period
More informationIs Bigger CNN Better? Samer Hijazi on behalf of IPG CTO Group Embedded Neural Networks Summit (enns2016) San Jose Feb. 9th
Is Bigger CNN Better? Samer Hijazi on behalf of IPG CTO Group Embedded Neural Networks Summit (enns2016) San Jose Feb. 9th Today s Story Why does CNN matter to the embedded world? How to enable CNN in
More informationMIT 801. Machine Learning I. [Presented by Anna Bosman] 16 February 2018
MIT 801 [Presented by Anna Bosman] 16 February 2018 Machine Learning What is machine learning? Artificial Intelligence? Yes as we know it. What is intelligence? The ability to acquire and apply knowledge
More informationCPSC 340: Machine Learning and Data Mining. Deep Learning Fall 2018
CPSC 340: Machine Learning and Data Mining Deep Learning Fall 2018 Last Time: Multi-Dimensional Scaling Multi-dimensional scaling (MDS): Non-parametric visualization: directly optimize the z i locations.
More informationDecision trees. Decision trees are useful to a large degree because of their simplicity and interpretability
Decision trees A decision tree is a method for classification/regression that aims to ask a few relatively simple questions about an input and then predicts the associated output Decision trees are useful
More informationStudy of Residual Networks for Image Recognition
Study of Residual Networks for Image Recognition Mohammad Sadegh Ebrahimi Stanford University sadegh@stanford.edu Hossein Karkeh Abadi Stanford University hosseink@stanford.edu Abstract Deep neural networks
More informationAn introduction to random forests
An introduction to random forests Eric Debreuve / Team Morpheme Institutions: University Nice Sophia Antipolis / CNRS / Inria Labs: I3S / Inria CRI SA-M / ibv Outline Machine learning Decision tree Random
More informationFast or furious? - User analysis of SF Express Inc
CS 229 PROJECT, DEC. 2017 1 Fast or furious? - User analysis of SF Express Inc Gege Wen@gegewen, Yiyuan Zhang@yiyuan12, Kezhen Zhao@zkz I. MOTIVATION The motivation of this project is to predict the likelihood
More informationClassifying Depositional Environments in Satellite Images
Classifying Depositional Environments in Satellite Images Alex Miltenberger and Rayan Kanfar Department of Geophysics School of Earth, Energy, and Environmental Sciences Stanford University 1 Introduction
More informationUnivariate and Multivariate Decision Trees
Univariate and Multivariate Decision Trees Olcay Taner Yıldız and Ethem Alpaydın Department of Computer Engineering Boğaziçi University İstanbul 80815 Turkey Abstract. Univariate decision trees at each
More informationRandom Forest Classification and Attribute Selection Program rfc3d
Random Forest Classification and Attribute Selection Program rfc3d Overview Random Forest (RF) is a supervised classification algorithm using multiple decision trees. Program rfc3d uses training data generated
More informationMondrian Forests: Efficient Online Random Forests
Mondrian Forests: Efficient Online Random Forests Balaji Lakshminarayanan (Gatsby Unit, UCL) Daniel M. Roy (Cambridge Toronto) Yee Whye Teh (Oxford) September 4, 2014 1 Outline Background and Motivation
More informationDeep neural networks II
Deep neural networks II May 31 st, 2018 Yong Jae Lee UC Davis Many slides from Rob Fergus, Svetlana Lazebnik, Jia-Bin Huang, Derek Hoiem, Adriana Kovashka, Why (convolutional) neural networks? State of
More informationDecision Trees Dr. G. Bharadwaja Kumar VIT Chennai
Decision Trees Decision Tree Decision Trees (DTs) are a nonparametric supervised learning method used for classification and regression. The goal is to create a model that predicts the value of a target
More informationDeep Learning Basic Lecture - Complex Systems & Artificial Intelligence 2017/18 (VO) Asan Agibetov, PhD.
Deep Learning 861.061 Basic Lecture - Complex Systems & Artificial Intelligence 2017/18 (VO) Asan Agibetov, PhD asan.agibetov@meduniwien.ac.at Medical University of Vienna Center for Medical Statistics,
More information11. Neural Network Regularization
11. Neural Network Regularization CS 519 Deep Learning, Winter 2016 Fuxin Li With materials from Andrej Karpathy, Zsolt Kira Preventing overfitting Approach 1: Get more data! Always best if possible! If
More informationNeural Networks (Overview) Prof. Richard Zanibbi
Neural Networks (Overview) Prof. Richard Zanibbi Inspired by Biology Introduction But as used in pattern recognition research, have little relation with real neural systems (studied in neurology and neuroscience)
More informationMachine Learning Methods for Ship Detection in Satellite Images
Machine Learning Methods for Ship Detection in Satellite Images Yifan Li yil150@ucsd.edu Huadong Zhang huz095@ucsd.edu Qianfeng Guo qig020@ucsd.edu Xiaoshi Li xil758@ucsd.edu Abstract In this project,
More information