Deep Residual Learning
|
|
- Josephine Summers
- 6 years ago
- Views:
Transcription
1 Deep Residual Learning ILSVRC & COCO 2015 competitions Kaiming He with Xiangyu Zhang, Shaoqing Ren, Jifeng Dai, & Jian Sun Microsoft Research Asia (MSRA)
2 ILSVRC & COCO 2015 Competitions 1st places in all five main tracks ImageNet Classification: Ultra-deep (quote Yann) 152-layer nets ImageNet Detection: 16% better than 2nd ImageNet Localization: 27% better than 2nd COCO Detection: 11% better than 2nd COCO Segmentation: 12% better than 2nd *improvements are relative numbers
3 Revolution of Depth 152 layers layers 19 layers layers 8 layers shallow ILSVRC'15 ResNet ILSVRC'14 GoogleNet ILSVRC'14 VGG ILSVRC'13 ILSVRC'12 AlexNet ImageNet Classification top-5 error (%) ILSVRC'11 ILSVRC'10
4 Revolution of Depth 101 layers Engines of visual recognition shallow 8 layers 16 layers HOG, DPM AlexNet (RCNN) VGG (RCNN) ResNet (Faster RCNN)* PASCAL VOC 2007 Object Detection map (%) *w/ other improvements & more data
5 Revolution of Depth AlexNet, 8 layers (ILSVRC 2012) 11x11 conv, 96, /4, pool/2 5x5 conv, 256, pool/2 3x3 conv, 384 3x3 conv, 384, pool/2 fc, 4096 fc, 4096 fc, 1000
6 Revolution of Depth soft max2 Soft maxact ivat ion FC AlexNet, 8 layers (ILSVRC 2012) 11x11 conv, 96, /4, pool/2 5x5 conv, 256, pool/2 3x3 conv, 384 VGG, 19 layers (ILSVRC 2014), pool/2 GoogleNet, 22 layers (ILSVRC 2014) AveragePool 7x7+ 1(V) Dept hconcat Conv Conv Conv Conv 1x1+ 1(S) 3x3+ 1(S) 5x5+ 1(S) 1x1+ 1(S) Conv Conv MaxPool 1x1+ 1(S) 1x1+ 1(S) 3x3+ 1(S) Dept hconcat Conv Conv Conv Conv 1x1+ 1(S) 3x3+ 1(S) 5x5+ 1(S) 1x1+ 1(S) soft max1 3x3 conv, 384, pool/2 Conv Conv MaxPool 1x1+ 1(S) 1x1+ 1(S) 3x3+ 1(S) MaxPool 3x3+ 2(S) Soft maxact ivat ion FC, pool/2 Dept hconcat Conv Conv Conv Conv 1x1+ 1(S) 3x3+ 1(S) 5x5+ 1(S) 1x1+ 1(S) FC Conv 1x1+ 1(S) fc, 4096 Conv Conv MaxPool 1x1+ 1(S) 1x1+ 1(S) 3x3+ 1(S) Dept hconcat AveragePool 5x5+ 3(V) fc, 4096 Conv Conv Conv Conv 1x1+ 1(S) 3x3+ 1(S) 5x5+ 1(S) 1x1+ 1(S) Conv Conv MaxPool 1x1+ 1(S) 1x1+ 1(S) 3x3+ 1(S) fc, 1000, pool/2 Dept hconcat Conv Conv Conv Conv 1x1+ 1(S) 3x3+ 1(S) 5x5+ 1(S) 1x1+ 1(S) soft max0 Soft maxact ivat ion Conv Conv MaxPool 1x1+ 1(S) 1x1+ 1(S) 3x3+ 1(S) Dept hconcat FC FC Conv Conv Conv Conv 1x1+ 1(S) 3x3+ 1(S) 5x5+ 1(S) 1x1+ 1(S) Conv Conv MaxPool 1x1+ 1(S) 1x1+ 1(S) 3x3+ 1(S) Conv 1x1+ 1(S) AveragePool 5x5+ 3(V), pool/2, pool/2 fc, 4096 fc, 4096 Dept hconcat Conv Conv Conv Conv 1x1+ 1(S) 3x3+ 1(S) 5x5+ 1(S) 1x1+ 1(S) Conv Conv MaxPool 1x1+ 1(S) 1x1+ 1(S) 3x3+ 1(S) MaxPool 3x3+ 2(S) Dept hconcat Conv Conv Conv Conv 1x1+ 1(S) 3x3+ 1(S) 5x5+ 1(S) 1x1+ 1(S) Conv Conv MaxPool 1x1+ 1(S) 1x1+ 1(S) 3x3+ 1(S) Dept hconcat Conv Conv Conv Conv 1x1+ 1(S) 3x3+ 1(S) 5x5+ 1(S) 1x1+ 1(S) Conv Conv MaxPool 1x1+ 1(S) 1x1+ 1(S) 3x3+ 1(S) MaxPool 3x3+ 2(S) LocalRespNorm Conv 3x3+ 1(S) Conv 1x1+ 1(V) LocalRespNorm MaxPool 3x3+ 2(S) fc, 1000 input Conv 7x7+ 2(S)
7 11x11 conv, 96, /4, pool/2 5x5 conv, 256, pool/2 3x3 conv, 384 3x3 conv, 384, pool/2 fc, 4096 fc, 4096 fc, 1000, pool/2, pool/2, pool/2, pool/2, pool/2 fc, 4096 fc, 4096 fc, x7 conv, 64, /2, po ol /2 1x1 conv, 64 1x1 conv, 64 1x1 conv, 64 1x1 conv, 128, /2 1x1 conv, 128 1x1 conv, 128 1x1 conv, 128 1x1 conv, 128 1x1 conv, 128 1x1 conv, 128 1x1 conv, 128, /2, /2 1x1 conv, x1 conv, x1 conv, 2048 ave pool, fc 1000 Revolution of Depth AlexNet, 8 layers (ILSVRC 2012) VGG, 19 layers (ILSVRC 2014) ResNet, 152 layers (ILSVRC 2015)
8 7x7 conv, 64, /2, po ol /2 1x1 conv, 64 Revolution of Depth 1x1 conv, 64 ResNet, 152 layers 1x1 conv, 64 1x1 conv, 128, /2 1x1 conv, 128 1x1 conv, 128 1x1 conv, 128 1x1 conv, 128 1x1 conv, 128 1x1 conv, 128 (there was an animation here) 1x1 conv, 128, /2
9 1x1 conv, 128 Revolution of Depth, /2 ResNet, 152 layers (there was an animation here) Kaiming He, Xiangyu Zhang, 1x1 conv, Shaoqing 256 Ren, & Jian Sun. Deep Residual Learning for Image Recognition. arxiv 2015.
10 Revolution of Depth ResNet, 152 layers (there was an animation here) Kaiming He, Xiangyu Zhang, 3x3 conv, Shaoqing 256 Ren, & Jian Sun. Deep Residual Learning for Image Recognition. arxiv 2015.
11 Revolution of Depth ResNet, 152 layers, /2 1x1 conv, x1 conv, 2048 (there was an animation here) 1x1 conv, 2048 ave pool, fc 1000
12 Is learning better networks as simple as stacking more layers?
13 Simply stacking layers? 20 train error (%) 56-layer CIFAR test error (%) 56-layer layer layer iter. (1e4) iter. (1e4) Plain nets: stacking 3x3 conv layers 56-layer net has higher training error and test error than 20-layer net
14 Simply stacking layers? error (%) CIFAR layer 44-layer 32-layer 20-layer error (%) ImageNet layer 5 plain-20 plain-32 plain plain iter. (1e4) solid: test/val dashed: train 30 plain-18 plain iter. (1e4) 18-layer Overly deep plain nets have higher training error A general phenomenon, observed in many datasets
15 a shallower model (18 layers) 7x7 conv, 64, /2 7x7 conv, 64, /2 a deeper counterpart (34 layers), /2, /2 A deeper model should not have higher training error, /2 extra layers, /2 A solution by construction: original layers: copied from a learned shallower model extra layers: set as identity at least the same training error, /2 fc 1000, /2 fc 1000 Optimization difficulties: solvers cannot find the solution when going deeper
16 Deep Residual Learning Plaint net xx HH xx is any desired mapping, hope the 2 weight layers fit HH(xx) any two stacked layers weight layer weight layer HH(xx) relu relu
17 Deep Residual Learning Residual net xx HH xx is any desired mapping, hope the 2 weight layers fit HH(xx) weight layer hope the 2 weight layers fit FF(xx) FF(xx) relu weight layer identity xx let HH xx = FF xx + xx HH xx = FF xx + xx relu
18 Deep Residual Learning FF xx is a residual mapping w.r.t. identity xx FF(xx) weight layer relu weight layer identity xx If identity were optimal, easy to set weights as 0 If optimal mapping is closer to identity, easier to find small fluctuations HH xx = FF xx + xx relu
19 Related Works Residual Representations VLAD & Fisher Vector [Jegou et al 2010], [Perronnin et al 2007] Encoding residual vectors; powerful shallower representations. Product Quantization (IVF-ADC) [Jegou et al 2011] Quantizing residual vectors; efficient nearest-neighbor search. MultiGrid & Hierarchical Precondition [Briggs, et al 2000], [Szeliski 1990, 2006] Solving residual sub-problems; efficient PDE solvers.
20 7x7 conv, 64, /2 7x7 conv, 64, /2 pool, /2 pool, /2 Network Design plain net ResNet, /2, /2 Keep it simple Our basic design (VGG-style) all 3x3 conv (almost) spatial size /2 => # filters x2 Simple design; just deep!, /2, /2 Other remarks: no max pooling (almost) no hidden fc no dropout, /2, /2 avg pool avg pool fc 1000 fc 1000
21 Training All plain/residual nets are trained from scratch All plain/residual nets use Batch Normalization Standard hyper-parameters & augmentation
22 CIFAR-10 experiments error (%) CIFAR-10 plain nets 5 plain-20 plain-32 plain plain iter. (1e4) layer 44-layer 32-layer 20-layer solid: test dashed: train error (%) CIFAR-10 ResNets iter. (1e4) ResNet-20 ResNet-32 ResNet-44 ResNet-56 ResNet layer 32-layer 44-layer 56-layer 110-layer Deep ResNets can be trained without difficulties Deeper ResNets have lower training error, and also lower test error
23 ImageNet experiments ImageNet plain nets ImageNet ResNets error (%) layer error (%) layer 30 solid: test plain-18 dashed: train plain iter. (1e4) 18-layer 30 ResNet-18 ResNet iter. (1e4) layer Deep ResNets can be trained without difficulties Deeper ResNets have lower training error, and also lower test error
24 ImageNet experiments A practical design of going deeper 64-d 256-d 3x3, 64 relu 3x3, 64 1x1, 64 relu 3x3, 64 relu 1x1, 256 relu all-3x3 similar complexity relu bottleneck (for ResNet-50/101/152)
25 ImageNet experiments this model has lower time complexity than VGG-16/19 Deeper ResNets have lower error ResNet-152 ResNet-101 ResNet-50 ResNet crop testing, top-5 val error (%)
26 ImageNet experiments 152 layers layers 19 layers layers 8 layers shallow ILSVRC'15 ResNet ILSVRC'14 GoogleNet ILSVRC'14 VGG ILSVRC'13 ILSVRC'12 AlexNet ImageNet Classification top-5 error (%) ILSVRC'11 ILSVRC'10
27 Just classification? A treasure from ImageNet is on learning features.
28 Features matter. (quote [Girshick et al. 2014], the R-CNN paper) task 2nd-place winner MSRA margin (relative) ImageNet Localization (top-5 error) % absolute 8.5% better! ImageNet Detection % COCO Detection % COCO Segmentation % Our results are all based on ResNet-101 Our features are well transferrable
29 Object Detection (brief) Simply Faster R-CNN + ResNet proposals classifier RoI pooling Faster R-CNN baseline map@.5 map@.5:.95 VGG ResNet Region Proposal Net feature map COCO detection results (ResNet has 28% relative gain) CNN image Shaoqing Ren, Kaiming He, Ross Girshick, & Jian Sun. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. NIPS 2015.
30 Object Detection (brief) RPN learns proposals by extremely deep nets We use only 300 proposals (no SS/EB/MCG!) Add what is just missing in Faster R-CNN Iterative localization Context modeling Multi-scale testing All are based on CNN features; all are end-to-end (train and/or inference) All benefit more from deeper features cumulative gains! Shaoqing Ren, Kaiming He, Ross Girshick, & Jian Sun. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. NIPS 2015.
31 Our results on COCO too many objects, let s check carefully! *the original image is from the COCO dataset Shaoqing Ren, Kaiming He, Ross Girshick, & Jian Sun. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. NIPS 2015.
32 *the original image is from the COCO dataset Shaoqing Ren, Kaiming He, Ross Girshick, & Jian Sun. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. NIPS 2015.
33 *the original image is from the COCO dataset Shaoqing Ren, Kaiming He, Ross Girshick, & Jian Sun. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. NIPS 2015.
34 Instance Segmentation (brief) CONVs box instances (RoIs) Solely CNN-based ( features matter ) Differentiable RoI warping layer (w.r.t box coord.) Multi-task cascades, exact end-to-end training CONVs FCs mask instances RoI warping, pooling categorized instances for each RoI FCs conv feature map masking person person horse person for each RoI Jifeng Dai, Kaiming He, & Jian Sun. Instance-aware Semantic Segmentation via Multi-task Network Cascades. arxiv 2015.
35 input *the original image is from the COCO dataset Jifeng Dai, Kaiming He, & Jian Sun. Instance-aware Semantic Segmentation via Multi-task Network Cascades. arxiv 2015.
36 Conclusions MSRA team Deeper is still better Features matter! Kaiming He Xiangyu Zhang Shaoqing Ren Faster R-CNN is just amazing Jifeng Dai Jian Sun Shaoqing Ren, Kaiming He, Ross Girshick, & Jian Sun. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. NIPS Jifeng Dai, Kaiming He, & Jian Sun. Instance-aware Semantic Segmentation via Multi-task Network Cascades. arxiv 2015.
Learning Deep Representations for Visual Recognition
Learning Deep Representations for Visual Recognition CVPR 2018 Tutorial Kaiming He Facebook AI Research (FAIR) Deep Learning is Representation Learning Representation Learning: worth a conference name
More informationLearning Deep Features for Visual Recognition
7x7 conv, 64, /2, pool/2 1x1 conv, 64 3x3 conv, 64 1x1 conv, 64 3x3 conv, 64 1x1 conv, 64 3x3 conv, 64 1x1 conv, 128, /2 3x3 conv, 128 1x1 conv, 512 1x1 conv, 128 3x3 conv, 128 1x1 conv, 512 1x1 conv,
More informationInception and Residual Networks. Hantao Zhang. Deep Learning with Python.
Inception and Residual Networks Hantao Zhang Deep Learning with Python https://en.wikipedia.org/wiki/residual_neural_network Deep Neural Network Progress from Large Scale Visual Recognition Challenge (ILSVRC)
More informationConvolutional Neural Networks: Applications and a short timeline. 7th Deep Learning Meetup Kornel Kis Vienna,
Convolutional Neural Networks: Applications and a short timeline 7th Deep Learning Meetup Kornel Kis Vienna, 1.12.2016. Introduction Currently a master student Master thesis at BME SmartLab Started deep
More informationSpatial Localization and Detection. Lecture 8-1
Lecture 8: Spatial Localization and Detection Lecture 8-1 Administrative - Project Proposals were due on Saturday Homework 2 due Friday 2/5 Homework 1 grades out this week Midterm will be in-class on Wednesday
More informationInstance-aware Semantic Segmentation via Multi-task Network Cascades
Instance-aware Semantic Segmentation via Multi-task Network Cascades Jifeng Dai, Kaiming He, Jian Sun Microsoft research 2016 Yotam Gil Amit Nativ Agenda Introduction Highlights Implementation Further
More informationMask R-CNN. Kaiming He, Georgia, Gkioxari, Piotr Dollar, Ross Girshick Presenters: Xiaokang Wang, Mengyao Shi Feb. 13, 2018
Mask R-CNN Kaiming He, Georgia, Gkioxari, Piotr Dollar, Ross Girshick Presenters: Xiaokang Wang, Mengyao Shi Feb. 13, 2018 1 Common computer vision tasks Image Classification: one label is generated for
More informationYiqi Yan. May 10, 2017
Yiqi Yan May 10, 2017 P a r t I F u n d a m e n t a l B a c k g r o u n d s Convolution Single Filter Multiple Filters 3 Convolution: case study, 2 filters 4 Convolution: receptive field receptive field
More informationDeep Learning in Visual Recognition. Thanks Da Zhang for the slides
Deep Learning in Visual Recognition Thanks Da Zhang for the slides Deep Learning is Everywhere 2 Roadmap Introduction Convolutional Neural Network Application Image Classification Object Detection Object
More informationFuzzy Set Theory in Computer Vision: Example 3
Fuzzy Set Theory in Computer Vision: Example 3 Derek T. Anderson and James M. Keller FUZZ-IEEE, July 2017 Overview Purpose of these slides are to make you aware of a few of the different CNN architectures
More informationFaster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks Shaoqing Ren Kaiming He Ross Girshick Jian Sun Present by: Yixin Yang Mingdong Wang 1 Object Detection 2 1 Applications Basic
More informationClassifying a specific image region using convolutional nets with an ROI mask as input
Classifying a specific image region using convolutional nets with an ROI mask as input 1 Sagi Eppel Abstract Convolutional neural nets (CNN) are the leading computer vision method for classifying images.
More informationarxiv: v1 [cs.cv] 10 Dec 2015
Deep Residual Learning for Image Recognition Kaiming He Xiangyu Zhang Shaoqing Ren Jian Sun Microsoft Research {kahe, v-xiangz, v-shren, jiansun}@microsoft.com arxiv:1512.03385v1 [cs.cv] 10 Dec 2015 Abstract
More informationResidual Networks And Attention Models. cs273b Recitation 11/11/2016. Anna Shcherbina
Residual Networks And Attention Models cs273b Recitation 11/11/2016 Anna Shcherbina Introduction to ResNets Introduced in 2015 by Microsoft Research Deep Residual Learning for Image Recognition (He, Zhang,
More informationObject detection with CNNs
Object detection with CNNs 80% PASCAL VOC mean0average0precision0(map) 70% 60% 50% 40% 30% 20% 10% Before CNNs After CNNs 0% 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 year Region proposals
More informationCascade Region Regression for Robust Object Detection
Large Scale Visual Recognition Challenge 2015 (ILSVRC2015) Cascade Region Regression for Robust Object Detection Jiankang Deng, Shaoli Huang, Jing Yang, Hui Shuai, Zhengbo Yu, Zongguang Lu, Qiang Ma, Yali
More informationKnow your data - many types of networks
Architectures Know your data - many types of networks Fixed length representation Variable length representation Online video sequences, or samples of different sizes Images Specific architectures for
More informationDeep Learning with Tensorflow AlexNet
Machine Learning and Computer Vision Group Deep Learning with Tensorflow http://cvml.ist.ac.at/courses/dlwt_w17/ AlexNet Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton, "Imagenet classification
More informationObject Detection Based on Deep Learning
Object Detection Based on Deep Learning Yurii Pashchenko AI Ukraine 2016, Kharkiv, 2016 Image classification (mostly what you ve seen) http://tutorial.caffe.berkeleyvision.org/caffe-cvpr15-detection.pdf
More informationFaster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun Presented by Tushar Bansal Objective 1. Get bounding box for all objects
More informationComputer Vision Lecture 16
Announcements Computer Vision Lecture 16 Deep Learning Applications 11.01.2017 Seminar registration period starts on Friday We will offer a lab course in the summer semester Deep Robot Learning Topic:
More informationDeep learning for object detection. Slides from Svetlana Lazebnik and many others
Deep learning for object detection Slides from Svetlana Lazebnik and many others Recent developments in object detection 80% PASCAL VOC mean0average0precision0(map) 70% 60% 50% 40% 30% 20% 10% Before deep
More informationComputer Vision Lecture 16
Computer Vision Lecture 16 Deep Learning Applications 11.01.2017 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de leibe@vision.rwth-aachen.de Announcements Seminar registration period starts
More informationConvolutional Neural Networks. Computer Vision Jia-Bin Huang, Virginia Tech
Convolutional Neural Networks Computer Vision Jia-Bin Huang, Virginia Tech Today s class Overview Convolutional Neural Network (CNN) Training CNN Understanding and Visualizing CNN Image Categorization:
More informationObject Detection. CS698N Final Project Presentation AKSHAT AGARWAL SIDDHARTH TANWAR
Object Detection CS698N Final Project Presentation AKSHAT AGARWAL SIDDHARTH TANWAR Problem Description Arguably the most important part of perception Long term goals for object recognition: Generalization
More informationAdvanced Video Analysis & Imaging
Advanced Video Analysis & Imaging (5LSH0), Module 09B Machine Learning with Convolutional Neural Networks (CNNs) - Workout Farhad G. Zanjani, Clint Sebastian, Egor Bondarev, Peter H.N. de With ( p.h.n.de.with@tue.nl
More informationFei-Fei Li & Justin Johnson & Serena Yeung
Lecture 9-1 Administrative A2 due Wed May 2 Midterm: In-class Tue May 8. Covers material through Lecture 10 (Thu May 3). Sample midterm released on piazza. Midterm review session: Fri May 4 discussion
More informationTodo before next class
Todo before next class Each project group should submit a short project report (4 pages presentation slides) including 1. Problem definition 2. Related work 3. Preliminary results 4. Future plan Submission:
More informationCNNS FROM THE BASICS TO RECENT ADVANCES. Dmytro Mishkin Center for Machine Perception Czech Technical University in Prague
CNNS FROM THE BASICS TO RECENT ADVANCES Dmytro Mishkin Center for Machine Perception Czech Technical University in Prague ducha.aiki@gmail.com OUTLINE Short review of the CNN design Architecture progress
More informationContent-Based Image Recovery
Content-Based Image Recovery Hong-Yu Zhou and Jianxin Wu National Key Laboratory for Novel Software Technology Nanjing University, China zhouhy@lamda.nju.edu.cn wujx2001@nju.edu.cn Abstract. We propose
More informationPOINT CLOUD DEEP LEARNING
POINT CLOUD DEEP LEARNING Innfarn Yoo, 3/29/28 / 57 Introduction AGENDA Previous Work Method Result Conclusion 2 / 57 INTRODUCTION 3 / 57 2D OBJECT CLASSIFICATION Deep Learning for 2D Object Classification
More informationExtend the shallow part of Single Shot MultiBox Detector via Convolutional Neural Network
Extend the shallow part of Single Shot MultiBox Detector via Convolutional Neural Network Liwen Zheng, Canmiao Fu, Yong Zhao * School of Electronic and Computer Engineering, Shenzhen Graduate School of
More informationDeep Learning Explained Module 4: Convolution Neural Networks (CNN or Conv Nets)
Deep Learning Explained Module 4: Convolution Neural Networks (CNN or Conv Nets) Sayan D. Pathak, Ph.D., Principal ML Scientist, Microsoft Roland Fernandez, Senior Researcher, Microsoft Module Outline
More informationStudy of Residual Networks for Image Recognition
Study of Residual Networks for Image Recognition Mohammad Sadegh Ebrahimi Stanford University sadegh@stanford.edu Hossein Karkeh Abadi Stanford University hosseink@stanford.edu Abstract Deep neural networks
More informationFinal Report: Smart Trash Net: Waste Localization and Classification
Final Report: Smart Trash Net: Waste Localization and Classification Oluwasanya Awe oawe@stanford.edu Robel Mengistu robel@stanford.edu December 15, 2017 Vikram Sreedhar vsreed@stanford.edu Abstract Given
More informationAll You Want To Know About CNNs. Yukun Zhu
All You Want To Know About CNNs Yukun Zhu Deep Learning Deep Learning Image from http://imgur.com/ Deep Learning Image from http://imgur.com/ Deep Learning Image from http://imgur.com/ Deep Learning Image
More informationComputer Vision Lecture 16
Computer Vision Lecture 16 Deep Learning for Object Categorization 14.01.2016 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de leibe@vision.rwth-aachen.de Announcements Seminar registration period
More informationCENG 783. Special topics in. Deep Learning. AlchemyAPI. Week 11. Sinan Kalkan
CENG 783 Special topics in Deep Learning AlchemyAPI Week 11 Sinan Kalkan TRAINING A CNN Fig: http://www.robots.ox.ac.uk/~vgg/practicals/cnn/ Feed-forward pass Note that this is written in terms of the
More informationREGION AVERAGE POOLING FOR CONTEXT-AWARE OBJECT DETECTION
REGION AVERAGE POOLING FOR CONTEXT-AWARE OBJECT DETECTION Kingsley Kuan 1, Gaurav Manek 1, Jie Lin 1, Yuan Fang 1, Vijay Chandrasekhar 1,2 Institute for Infocomm Research, A*STAR, Singapore 1 Nanyang Technological
More informationSupplementary material for Analyzing Filters Toward Efficient ConvNet
Supplementary material for Analyzing Filters Toward Efficient Net Takumi Kobayashi National Institute of Advanced Industrial Science and Technology, Japan takumi.kobayashi@aist.go.jp A. Orthonormal Steerable
More informationarxiv: v1 [cs.cv] 14 Dec 2015
Instance-aware Semantic Segmentation via Multi-task Network Cascades Jifeng Dai Kaiming He Jian Sun Microsoft Research {jifdai,kahe,jiansun}@microsoft.com arxiv:1512.04412v1 [cs.cv] 14 Dec 2015 Abstract
More informationEncoder-Decoder Networks for Semantic Segmentation. Sachin Mehta
Encoder-Decoder Networks for Semantic Segmentation Sachin Mehta Outline > Overview of Semantic Segmentation > Encoder-Decoder Networks > Results What is Semantic Segmentation? Input: RGB Image Output:
More informationCOMP9444 Neural Networks and Deep Learning 7. Image Processing. COMP9444 c Alan Blair, 2017
COMP9444 Neural Networks and Deep Learning 7. Image Processing COMP9444 17s2 Image Processing 1 Outline Image Datasets and Tasks Convolution in Detail AlexNet Weight Initialization Batch Normalization
More informationR-FCN: Object Detection with Really - Friggin Convolutional Networks
R-FCN: Object Detection with Really - Friggin Convolutional Networks Jifeng Dai Microsoft Research Li Yi Tsinghua Univ. Kaiming He FAIR Jian Sun Microsoft Research NIPS, 2016 Or Region-based Fully Convolutional
More informationMCMOT: Multi-Class Multi-Object Tracking using Changing Point Detection
MCMOT: Multi-Class Multi-Object Tracking using Changing Point Detection ILSVRC 2016 Object Detection from Video Byungjae Lee¹, Songguo Jin¹, Enkhbayar Erdenee¹, Mi Young Nam², Young Gui Jung², Phill Kyu
More informationSeminars in Artifiial Intelligenie and Robotiis
Seminars in Artifiial Intelligenie and Robotiis Computer Vision for Intelligent Robotiis Basiis and hints on CNNs Alberto Pretto What is a neural network? We start from the frst type of artifcal neuron,
More informationDeep Learning for Computer Vision II
IIIT Hyderabad Deep Learning for Computer Vision II C. V. Jawahar Paradigm Shift Feature Extraction (SIFT, HoG, ) Part Models / Encoding Classifier Sparrow Feature Learning Classifier Sparrow L 1 L 2 L
More informationIntro to Deep Learning. Slides Credit: Andrej Karapathy, Derek Hoiem, Marc Aurelio, Yann LeCunn
Intro to Deep Learning Slides Credit: Andrej Karapathy, Derek Hoiem, Marc Aurelio, Yann LeCunn Why this class? Deep Features Have been able to harness the big data in the most efficient and effective
More informationConvolutional Neural Networks
NPFL114, Lecture 4 Convolutional Neural Networks Milan Straka March 25, 2019 Charles University in Prague Faculty of Mathematics and Physics Institute of Formal and Applied Linguistics unless otherwise
More informationMask R-CNN. presented by Jiageng Zhang, Jingyao Zhan, Yunhan Ma
Mask R-CNN presented by Jiageng Zhang, Jingyao Zhan, Yunhan Ma Mask R-CNN Background Related Work Architecture Experiment Mask R-CNN Background Related Work Architecture Experiment Background From left
More informationRich feature hierarchies for accurate object detection and semantic segmentation
Rich feature hierarchies for accurate object detection and semantic segmentation Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik Presented by Pandian Raju and Jialin Wu Last class SGD for Document
More informationStructured Prediction using Convolutional Neural Networks
Overview Structured Prediction using Convolutional Neural Networks Bohyung Han bhhan@postech.ac.kr Computer Vision Lab. Convolutional Neural Networks (CNNs) Structured predictions for low level computer
More informationMask R-CNN. By Kaiming He, Georgia Gkioxari, Piotr Dollar and Ross Girshick Presented By Aditya Sanghi
Mask R-CNN By Kaiming He, Georgia Gkioxari, Piotr Dollar and Ross Girshick Presented By Aditya Sanghi Types of Computer Vision Tasks http://cs231n.stanford.edu/ Semantic vs Instance Segmentation Image
More informationR-FCN: OBJECT DETECTION VIA REGION-BASED FULLY CONVOLUTIONAL NETWORKS
R-FCN: OBJECT DETECTION VIA REGION-BASED FULLY CONVOLUTIONAL NETWORKS JIFENG DAI YI LI KAIMING HE JIAN SUN MICROSOFT RESEARCH TSINGHUA UNIVERSITY MICROSOFT RESEARCH MICROSOFT RESEARCH SPEED/ACCURACY TRADE-OFFS
More informationOBJECT DETECTION HYUNG IL KOO
OBJECT DETECTION HYUNG IL KOO INTRODUCTION Computer Vision Tasks Classification + Localization Classification: C-classes Input: image Output: class label Evaluation metric: accuracy Localization Input:
More informationChannel Locality Block: A Variant of Squeeze-and-Excitation
Channel Locality Block: A Variant of Squeeze-and-Excitation 1 st Huayu Li Northern Arizona University Flagstaff, United State Northern Arizona University hl459@nau.edu arxiv:1901.01493v1 [cs.lg] 6 Jan
More informationLecture 5: Object Detection
Object Detection CSED703R: Deep Learning for Visual Recognition (2017F) Lecture 5: Object Detection Bohyung Han Computer Vision Lab. bhhan@postech.ac.kr 2 Traditional Object Detection Algorithms Region-based
More informationInception Network Overview. David White CS793
Inception Network Overview David White CS793 So, Leonardo DiCaprio dreams about dreaming... https://m.media-amazon.com/images/m/mv5bmjaxmzy3njcxnf5bml5banbnxkftztcwnti5otm0mw@@._v1_sy1000_cr0,0,675,1 000_AL_.jpg
More informationFeature-Fused SSD: Fast Detection for Small Objects
Feature-Fused SSD: Fast Detection for Small Objects Guimei Cao, Xuemei Xie, Wenzhe Yang, Quan Liao, Guangming Shi, Jinjian Wu School of Electronic Engineering, Xidian University, China xmxie@mail.xidian.edu.cn
More informationVisual features detection based on deep neural network in autonomous driving tasks
430 Fomin I., Gromoshinskii D., Stepanov D. Visual features detection based on deep neural network in autonomous driving tasks Ivan Fomin, Dmitrii Gromoshinskii, Dmitry Stepanov Computer vision lab Russian
More informationFlow-Based Video Recognition
Flow-Based Video Recognition Jifeng Dai Visual Computing Group, Microsoft Research Asia Joint work with Xizhou Zhu*, Yuwen Xiong*, Yujie Wang*, Lu Yuan and Yichen Wei (* interns) Talk pipeline Introduction
More informationReturn of the Devil in the Details: Delving Deep into Convolutional Nets
Return of the Devil in the Details: Delving Deep into Convolutional Nets Ken Chatfield - Karen Simonyan - Andrea Vedaldi - Andrew Zisserman University of Oxford The Devil is still in the Details 2011 2014
More informationarxiv: v2 [cs.cv] 26 Jan 2018
DIRACNETS: TRAINING VERY DEEP NEURAL NET- WORKS WITHOUT SKIP-CONNECTIONS Sergey Zagoruyko, Nikos Komodakis Université Paris-Est, École des Ponts ParisTech Paris, France {sergey.zagoruyko,nikos.komodakis}@enpc.fr
More informationLecture 7: Semantic Segmentation
Semantic Segmentation CSED703R: Deep Learning for Visual Recognition (207F) Segmenting images based on its semantic notion Lecture 7: Semantic Segmentation Bohyung Han Computer Vision Lab. bhhanpostech.ac.kr
More informationarxiv: v1 [cs.cv] 20 Dec 2016
End-to-End Pedestrian Collision Warning System based on a Convolutional Neural Network with Semantic Segmentation arxiv:1612.06558v1 [cs.cv] 20 Dec 2016 Heechul Jung heechul@dgist.ac.kr Min-Kook Choi mkchoi@dgist.ac.kr
More informationGenerative Modeling with Convolutional Neural Networks. Denis Dus Data Scientist at InData Labs
Generative Modeling with Convolutional Neural Networks Denis Dus Data Scientist at InData Labs What we will discuss 1. 2. 3. 4. Discriminative vs Generative modeling Convolutional Neural Networks How to
More informationDynamic Routing Between Capsules
Report Explainable Machine Learning Dynamic Routing Between Capsules Author: Michael Dorkenwald Supervisor: Dr. Ullrich Köthe 28. Juni 2018 Inhaltsverzeichnis 1 Introduction 2 2 Motivation 2 3 CapusleNet
More informationConvolutional Neural Networks
Lecturer: Barnabas Poczos Introduction to Machine Learning (Lecture Notes) Convolutional Neural Networks Disclaimer: These notes have not been subjected to the usual scrutiny reserved for formal publications.
More informationRich feature hierarchies for accurate object detection and semant
Rich feature hierarchies for accurate object detection and semantic segmentation Speaker: Yucong Shen 4/5/2018 Develop of Object Detection 1 DPM (Deformable parts models) 2 R-CNN 3 Fast R-CNN 4 Faster
More informationBinary Convolutional Neural Network on RRAM
Binary Convolutional Neural Network on RRAM Tianqi Tang, Lixue Xia, Boxun Li, Yu Wang, Huazhong Yang Dept. of E.E, Tsinghua National Laboratory for Information Science and Technology (TNList) Tsinghua
More informationWE are witnessing a rapid, revolutionary change in our
1904 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 37, NO. 9, SEPTEMBER 2015 Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition Kaiming He, Xiangyu Zhang,
More informationDeep Learning for Object detection & localization
Deep Learning for Object detection & localization RCNN, Fast RCNN, Faster RCNN, YOLO, GAP, CAM, MSROI Aaditya Prakash Sep 25, 2018 Image classification Image classification Whole of image is classified
More informationHENet: A Highly Efficient Convolutional Neural. Networks Optimized for Accuracy, Speed and Storage
HENet: A Highly Efficient Convolutional Neural Networks Optimized for Accuracy, Speed and Storage Qiuyu Zhu Shanghai University zhuqiuyu@staff.shu.edu.cn Ruixin Zhang Shanghai University chriszhang96@shu.edu.cn
More informationObject detection using Region Proposals (RCNN) Ernest Cheung COMP Presentation
Object detection using Region Proposals (RCNN) Ernest Cheung COMP790-125 Presentation 1 2 Problem to solve Object detection Input: Image Output: Bounding box of the object 3 Object detection using CNN
More informationApplication of Convolutional Neural Network for Image Classification on Pascal VOC Challenge 2012 dataset
Application of Convolutional Neural Network for Image Classification on Pascal VOC Challenge 2012 dataset Suyash Shetty Manipal Institute of Technology suyash.shashikant@learner.manipal.edu Abstract In
More informationConvolutional Neural Networks + Neural Style Transfer. Justin Johnson 2/1/2017
Convolutional Neural Networks + Neural Style Transfer Justin Johnson 2/1/2017 Outline Convolutional Neural Networks Convolution Pooling Feature Visualization Neural Style Transfer Feature Inversion Texture
More informationYOLO9000: Better, Faster, Stronger
YOLO9000: Better, Faster, Stronger Date: January 24, 2018 Prepared by Haris Khan (University of Toronto) Haris Khan CSC2548: Machine Learning in Computer Vision 1 Overview 1. Motivation for one-shot object
More informationarxiv: v1 [cs.cv] 4 Dec 2014
Convolutional Neural Networks at Constrained Time Cost Kaiming He Jian Sun Microsoft Research {kahe,jiansun}@microsoft.com arxiv:1412.1710v1 [cs.cv] 4 Dec 2014 Abstract Though recent advanced convolutional
More informationAutomatic detection of books based on Faster R-CNN
Automatic detection of books based on Faster R-CNN Beibei Zhu, Xiaoyu Wu, Lei Yang, Yinghua Shen School of Information Engineering, Communication University of China Beijing, China e-mail: zhubeibei@cuc.edu.cn,
More informationOverall Description. Goal: to improve spatial invariance to the input data. Translation, Rotation, Scale, Clutter, Elastic
Philippe Giguère Overall Description Goal: to improve spatial invariance to the input data Translation, Rotation, Scale, Clutter, Elastic How: add a learnable module which explicitly manipulate spatially
More informationDeep learning for dense per-pixel prediction. Chunhua Shen The University of Adelaide, Australia
Deep learning for dense per-pixel prediction Chunhua Shen The University of Adelaide, Australia Image understanding Classification error Convolution Neural Networks 0.3 0.2 0.1 Image Classification [Krizhevsky
More informationCIS680: Vision & Learning Assignment 2.b: RPN, Faster R-CNN and Mask R-CNN Due: Nov. 21, 2018 at 11:59 pm
CIS680: Vision & Learning Assignment 2.b: RPN, Faster R-CNN and Mask R-CNN Due: Nov. 21, 2018 at 11:59 pm Instructions This is an individual assignment. Individual means each student must hand in their
More informationModern Convolutional Object Detectors
Modern Convolutional Object Detectors Faster R-CNN, R-FCN, SSD 29 September 2017 Presented by: Kevin Liang Papers Presented Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
More informationOptimizing Object Detection:
Lecture 10: Optimizing Object Detection: A Case Study of R-CNN, Fast R-CNN, and Faster R-CNN Visual Computing Systems Today s task: object detection Image classification: what is the object in this image?
More informationRich feature hierarchies for accurate object detection and semantic segmentation
Rich feature hierarchies for accurate object detection and semantic segmentation BY; ROSS GIRSHICK, JEFF DONAHUE, TREVOR DARRELL AND JITENDRA MALIK PRESENTER; MUHAMMAD OSAMA Object detection vs. classification
More informationImplementing Deep Learning for Video Analytics on Tegra X1.
Implementing Deep Learning for Video Analytics on Tegra X1 research@hertasecurity.com Index Who we are, what we do Video analytics pipeline Video decoding Facial detection and preprocessing DNN: learning
More informationCS6501: Deep Learning for Visual Recognition. Object Detection I: RCNN, Fast-RCNN, Faster-RCNN
CS6501: Deep Learning for Visual Recognition Object Detection I: RCNN, Fast-RCNN, Faster-RCNN Today s Class Object Detection The RCNN Object Detector (2014) The Fast RCNN Object Detector (2015) The Faster
More informationEfficient Segmentation-Aided Text Detection For Intelligent Robots
Efficient Segmentation-Aided Text Detection For Intelligent Robots Junting Zhang, Yuewei Na, Siyang Li, C.-C. Jay Kuo University of Southern California Outline Problem Definition and Motivation Related
More informationMocha.jl. Deep Learning in Julia. Chiyuan Zhang CSAIL, MIT
Mocha.jl Deep Learning in Julia Chiyuan Zhang (@pluskid) CSAIL, MIT Deep Learning Learning with multi-layer (3~30) neural networks, on a huge training set. State-of-the-art on many AI tasks Computer Vision:
More informationComo funciona o Deep Learning
Como funciona o Deep Learning Moacir Ponti (com ajuda de Gabriel Paranhos da Costa) ICMC, Universidade de São Paulo Contact: www.icmc.usp.br/~moacir moacir@icmc.usp.br Uberlandia-MG/Brazil October, 2017
More informationPaper Motivation. Fixed geometric structures of CNN models. CNNs are inherently limited to model geometric transformations
Paper Motivation Fixed geometric structures of CNN models CNNs are inherently limited to model geometric transformations Higher-level features combine lower-level features at fixed positions as a weighted
More informationDeeply Cascaded Networks
Deeply Cascaded Networks Eunbyung Park Department of Computer Science University of North Carolina at Chapel Hill eunbyung@cs.unc.edu 1 Introduction After the seminal work of Viola-Jones[15] fast object
More informationYOLO 9000 TAEWAN KIM
YOLO 9000 TAEWAN KIM DNN-based Object Detection R-CNN MultiBox SPP-Net DeepID-Net NoC Fast R-CNN DeepBox MR-CNN 2013.11 Faster R-CNN YOLO AttentionNet DenseBox SSD Inside-OutsideNet(ION) G-CNN NIPS 15
More informationCS 2750: Machine Learning. Neural Networks. Prof. Adriana Kovashka University of Pittsburgh April 13, 2016
CS 2750: Machine Learning Neural Networks Prof. Adriana Kovashka University of Pittsburgh April 13, 2016 Plan for today Neural network definition and examples Training neural networks (backprop) Convolutional
More informationMachine Learning. Deep Learning. Eric Xing (and Pengtao Xie) , Fall Lecture 8, October 6, Eric CMU,
Machine Learning 10-701, Fall 2015 Deep Learning Eric Xing (and Pengtao Xie) Lecture 8, October 6, 2015 Eric Xing @ CMU, 2015 1 A perennial challenge in computer vision: feature engineering SIFT Spin image
More informationObject Detection. TA : Young-geun Kim. Biostatistics Lab., Seoul National University. March-June, 2018
Object Detection TA : Young-geun Kim Biostatistics Lab., Seoul National University March-June, 2018 Seoul National University Deep Learning March-June, 2018 1 / 57 Index 1 Introduction 2 R-CNN 3 YOLO 4
More informationTutorial on Keras CAP ADVANCED COMPUTER VISION SPRING 2018 KISHAN S ATHREY
Tutorial on Keras CAP 6412 - ADVANCED COMPUTER VISION SPRING 2018 KISHAN S ATHREY Deep learning packages TensorFlow Google PyTorch Facebook AI research Keras Francois Chollet (now at Google) Chainer Company
More informationObject Detection. Part1. Presenter: Dae-Yong
Object Part1 Presenter: Dae-Yong Contents 1. What is an Object? 2. Traditional Object Detector 3. Deep Learning-based Object Detector What is an Object? Subset of Object Recognition What is an Object?
More informationTiny ImageNet Challenge Submission
Tiny ImageNet Challenge Submission Lucas Hansen Stanford University lucash@stanford.edu Abstract Implemented a deep convolutional neural network on the GPU using Caffe and Amazon Web Services (AWS). Current
More information3 Object Detection. BVM 2018 Tutorial: Advanced Deep Learning Methods. Paul F. Jaeger, Division of Medical Image Computing
3 Object Detection BVM 2018 Tutorial: Advanced Deep Learning Methods Paul F. Jaeger, of Medical Image Computing What is object detection? classification segmentation obj. detection (1 label per pixel)
More information