Why Is Recognition Hard? Object Recognizer

Size: px
Start display at page:

Download "Why Is Recognition Hard? Object Recognizer"

Transcription

1 Why Is Recognition Hard? Object Recognizer panda

2 Why is Recognition Hard? Object Recognizer panda Pose

3 Why is Recognition Hard? Object Recognizer panda Occlusion

4 Why is Recognition Hard? Object Recognizer panda Multiple objects

5 Why is Recognition Hard? Object Recognizer panda Inter-class similarity

6 Ideal Features Ideal Feature Extractor - window, top-left - clock, top-middle - shelf, left - drawing,middle - statue, bottom left - - hat, bottom right Q.: What objects are in the image? Where is the clock? What is on the top of the table?

7 Ideal Features are Non-Linear I 1 Ideal Feature Extractor - club, angle = 90 - man, frontal pose...? Ideal Feature Extractor - club, angle = man, frontal pose... I 2 Ideal Feature Extractor - club, angle = man, side pose...

8 Ideal Features are Non-Linear I 1 Ideal Feature Extractor - club, angle = 90 - man, frontal pose... INPUT IS NOT THE AVERAGE! Ideal Feature Extractor - club, angle = man, frontal pose... I 2 Ideal Feature Extractor - club, angle = man, side pose...

9 Ideal Features are Non-Linear I 1 Ideal Feature Extractor - club, angle = 90 - man, frontal pose... Ideal Feature Extractor - club, angle = man, frontal pose... I 2 Ideal Feature Extractor - club, angle = man, side pose...

10 The Manifold of Natural Images 11

11 Ideal Feature Extraction Pixel n Ideal Feature Extractor Pose Pixel 2 Pixel 1 Expression

12 Learning Non-Linear Features f(x; θ) features Q.: Which class of non-linear functions shall we consider?

13 Learning Non-Linear Features Given a dictionary of simple non-linear functions: g 1,, g n Proposal #1: linear combination f x σ j g j Proposal #2: composition f x g 1 (g 2 g n x )

14 Learning Non-Linear Features Given a dictionary of simple non-linear functions: g 1,, g n Proposal #1: linear combination f x σ j g j Kernel learning Boosting Proposal #2: composition f x g 1 (g 2 g n x ) Deep learning Scattering networks (wavelet cascade) S.C. Zhou & D.Mumford grammar

15 Linear Combination + prediction of class templete matchers... BAD: it may require an exponential no. of templates!!! Input image

16 Composition Prediction of class high-level parts mid-level parts low-level parts Reuse of intermediate parts distributed representations GOOD: (exponentially) more efficient Input image

17 Convolution Neural Network

18 Convolution Definition from Wolfram MathWorld A convolution is an integral that expresses the amount of overlap of one function g as it is shifted over another function f. It therefore "blends" one function with another. f ( x, y) Input image h(x, y) Kernel (filter) g(x, y) Output image g = f h = y,where ቐ g x, f x + u, y + v h u, v dudv g x, y = σ u,v f x + u, y + v h u, v 18

19 Connectivity & Weight Sharing All different weights All different weights Shared weights Convolution layer has much smaller number of parameters by local connection and weight sharing 19

20 Fully Connected Layer Example: 1000x1000 image 1M hidden units parameters!!! - Spatial correlation is local - Better to put resources elsewhere!

21 Locally Connected Layer Example: 1000x1000 image 1M hidden units Filter size: 10x10 100M parameters Filter/Kernel/Receptive field: input patch which the hidden unit is connected to.

22 Locally Connected Layer STATIONARITY? Statistics are similar at different locations (translation invariance) Example: 1000x1000 image 1M hidden units Filter size: 10x10 100M parameters

23 Convolution Layer Share the same parameters across different locations: Convolutions with learned kernels

24 Convolution Layer Learn multiple filters. E.g.: 1000x1000 image 100 Filters Filter size: 10x10 10K parameters

25 Convolution Layer hidden unit / filter response

26 Convolution Layer output feature map 3D kernel (filter) Input feature maps

27 Convolution Layer output feature map many 3D kernels (filters) Input feature maps NOTE: the no. of output feature maps is usually larger than the no. of input feature maps

28 Convolution Layer Convolutional Layer Input feature maps Output feature map NOTE: the no. of output feature maps is usually larger than the no. of input feature maps

29 Key Ideas: Convolutional Nets A standard neural net applied to images: Scales quadratically with the size of the input. Does not leverage stationarity. Solution: Connect each hidden unit to a small patch of the input. Share the weight across hidden units. This is called: Convolutional network.

30 Convolution Layer Detect the same feature at different positions in the input image. features Filter (kernel) Input 30

31 Tanh Non-linearity or Activation Function Sigmoid: e x Rectified Linear Unit (ReLU): max(0, x) Simplifies back propagation Makes learning faster Make feature sparse Preferred option 31

32 Pooling (average, L2, max) Special Layers layer i N(x, y) layer i+1 (x, y) H i+1,x,y = max (j,k) N x,y h i,j,k Local Contrast Normalization (over space/features) layer i N(x, y) layer i+1 (x, y) H i+1,x,y = h i,x,y m i,x,y σ i,x,y

33 Pooling Let us assume filter is an eye detector. Q.: how can we make the detection robust to the exact location of the eye?

34 Pooling By pooling (e.g., taking max) filter responses at different locations we gain robustness to the exac t spatial location of features. H i+1,x,y = max (j,k) N x,y h i,j,k

35 Pooling Layer NOTE: 1) the no. of output feature maps is the same as the no. of input feature maps 2) spatial resolution is reduced patch collapsed into one value use of stride > 1 Input feature maps Output feature maps

36 Pooling Layer NOTE: 1) the no. of output feature maps is the same as the no. of input feature maps 2) spatial resolution is reduced patch collapsed into one value use of stride > 1 Pooling Layer Input feature maps Output feature maps

37 Pooling Role of Pooling Invariance to small transformations. Reduce the effect of noises and shift or distortion. Max Sum 37

38 Local Contrast Normalization H i+1,x,y = h i,x,y m i,x,y σ i,x,y

39 Local Contrast Normalization H i+1,x,y = h i,x,y m i,x,y σ i,x,y We want the same response.

40 Local Contrast Normalization H i+1,x,y = h i,x,y m i,x,y σ i,x,y Performed also across features and in the higher layers. Effects: improves invariance improves optimization increases sparsity

41 Convolutional Nets: Typical Architecture One stage (zoom) Convol. LCN Pooling Convol. LCN Pooling Convolutional layer increases no. feature maps. Pooling layer decreases spatial resolution.

42 Convolutional Nets: Typical Architecture One stage (zoom) Convol. LCN Pooling Convol. LCN Pooling Example with only two filters

43 Convolutional Nets: Typical Architecture One stage (zoom) Convol. LCN Pooling Convol. LCN Pooling A hidden unit in the first hidden layer is influenced by a small neighborhood (equal to size of filter).

44 Convolutional Nets: Typical Architecture One stage (zoom) Convol. LCN Pooling Convol. LCN Pooling A hidden unit after the pooling layer is influenced by a larger neighborhood (it depends on filter sizes and strides).

45 Convolutional Nets: Typical Architecture One stage (zoom) Convol. LCN Pooling Input Image Whole system Fully Conn. Layers Class Labels 1 st stage 2 nd stage 3 rd stage After a few stages, residual spatial resolution is very small. We have learned a descriptor for the whole image.

46 Convolutional Nets: Typical Architecture One stage (zoom) Convol. LCN Pooling Conceptually similar to: SIFT K-Means Pyramid Pooling SVM Lazebnik et al....spatial Pyramid Matching... CVPR 2006 SIFT Fisher Vector Pooling SVM Sanchez et al. Image classification with F.V.: Theory and practice IJCV 2012

47 Convolution Neural Network??? A special kind of multi-layer neural networks. Implicitly extract relevant features. A feed-forward network that can extract topological properties from an image. Like almost every other neural networks, CNN are trained with a version of the back-propagation algorithm. 47

48 Deep Learning Framework

49 Deep Learning Framework Main contributor Language (API) 1 st : 2 nd : C++ 1 st : 2 nd : 3 rd : C++ Language (Backend) C++ C++ C++ 49

50 Deep Learning Framework Pre-trained CNN Models Official 7/tutorials/image_recognition/index.html Inception v3 Others VGG 16 AlexNet All Caffe models Model Zoo Others (by facebook) Inception v1 Inception v3 ResNet NIN VGG_M / VGG_S.. PlaceCNN GoolgLeNet Caffenet Alexnet VGG16 / VGG19 ResNet 50

51 Available Codes Deep Learning Framework Computer Vision Applications DPPnet ImageQA SimpleBaseline ImageQA VQA_LSTM_CNN ImageQA Neural Talk2 Caption Generation Fast R-CNN Object Detection FCN Semantic Segmentation DeconvNet Semantic Segmentation DecoupledNet Semantic Segmentation Ask Neurons ImageQA Generative Models VAE Variational Auto-Encoder DCGAN Generative Adversarial Net VAM Visual Analogy-Making Semi-supervised Ladder Network VAE Variational Auto-Encoder DCGAN Generative Adversarial Net LAPGAN Generative Adversarial Net Action Conditional Video Prediction Useful modules STN Spatial Transformer Networks Deep Reinforcement Learning DQN On by toy example Neural Turning Machine / Memory Network MemNN End-to-End Memory Network NTM Neural Turing Machine N2N Net2Net Simple Algorithm Learning DQN Atari game MemNN End-to-End Memory Network NTM Neural Turing Machine Learning to Execute RNN / Language lstm-char-cnn Character Aware LM Language Modeling Sequence-to-sequence learning lstm-char-cnn Character Aware LM char-rnn character language model lstm Long-short term memory LRCN Long-term Recurrent Convolutional Network 51

52 Pros and Cons Deep Learning Framework Pros python Cons distributed training abstraction (don t have to know everything) Abstraction (hard to control every single detail) Many basic tensor operations Weight sharing is simple Many open sources lua (less libraries than python) Adding new module is easy Lua (mostly) C++ / Cuda (sometimes) Very flexible (we only use tensors and nn functi ons) More deep learning libraries Weight sharing is complex State-of-the-art Detection Implementation State-of-the-art Semantic Segmentation Implement ation matlab / python Poor rnn support Less flexible (control flow is not supported) Less basic tensor operations Less effort for CNN training Adding New Layer C++ / Cuda (always) 52

53 Object Detection - Region-based CNN -

54 Region Proposals Find blobby image regions that are likely to contain objects. Class-agnostic object detector. Look for blob-like regions.

55 Region Proposals Selective Search Bottom-up segmentation, merging regions at multiple scales Convert regions to boxes Uijlings et al, Selective Search for Object Recognition, IJCV 2013

56 Region-based CNN (R-CNN) Input image Extract region proposals (~2k/images) Compute CNN features on regions Classify and refine regions

57 Region-based CNN (R-CNN) R-CNN at test time: Step 1 Input image Extract region proposals (~2k/images) Compute CNN features Classify regions Selective Search

58 Region-based CNN (R-CNN) R-CNN at test time: Step 2 Input image Extract region proposals (~2k/images) Compute CNN features Classify regions

59 Region-based CNN (R-CNN) R-CNN at test time: Step 2 Input image Extract region proposals (~2k/images) Compute CNN features Classify regions

60 Region-based CNN (R-CNN) R-CNN at test time: Step 2 Input image Extract region proposals (~2k/images) Compute CNN features Classify regions

61 Region-based CNN (R-CNN) R-CNN at test time: Step 2 Input image Extract region proposals (~2k/images) Compute CNN features Classify regions

62 Region-based CNN (R-CNN) R-CNN at test time: Step 2 Input image Extract region proposals (~2k/images) Compute CNN features Classify regions

63 Region-based CNN (R-CNN) R-CNN at test time: Step 3 Input image Extract region proposals (~2k/images) Compute CNN features Classify regions

64 Region-based CNN (R-CNN) R-CNN at test time: Step 4, Object proposal refinement Linear regression on CNN fea tu res Original p roposal Bo unding- box regression Predicted object bounding b ox

65 Bounding-Box Regression

66 R-CNN Training: Step 1 Supervised pre-training Train AlexNet for the 1000-way ILSVRC image classification task.

67 R-CNN Training: Step 2 Fine-tune the CNN for detection Transfer the representation learned for ILSVRC classification to PASCAL (or ImageNet detection). Instead of 1000 ImageNet classes, want to 20 object classes + background. Throw away final fully connected layer, reinitialize from scratch. Keep training CNN model using positive/negative regions from detection images.

68 R-CNN Training: Step 3 Extract Features Extract region proposals for all images. For each region: warp to CNN input size, run forward through CNN, save pool5 features to disk. Have a big hard drive: features are ~200 GB for PASCAL dataset.

69 R-CNN Training: Step 4 Train one binary SVM per class to classify region features

70 R-CNN Training: Step 4 Train one binary SVM per class to classify region features

71 R-CNN Training: Step 5 Bounding-box regression: For each class, train a linear regression model to map from cached features to offsets to GT boxes to make up for slightly wrong proposals.

72 R-CNN Results on PASCAL VOC 2007 VOC 2010 DPM v5 (Girshick et al. 2011) 33.7% 29.6% UVA sel. search (Uijlings et al. 2013) 35.1% Regionlets (Wang et al. 2013) 41.7% 39.7% SegDPM (Fidler et al. 2013) 40.4% R-CNN 54.2% 50.2% R-CNN + bbox regression 58.5% 53.7% m etric: m ean average p recision (higher is bet t er)

73 ImageNet Detection New Challenge introduced in 2013 PASCAL-like image complexity 200 object categories

74 R-CNN on ImageNet Detection

75 Object Detection - Fast R-CNN -

76 R-CNN Problems Slow at test time Need to run full forward pass of CNN for each region proposal SVMs and regressors are post-hoc CNN features not updated in response to SVMs and regressors Complex multi-stage training pipeline Very time consuming processes (total 84 hours). 2k proposals x (Computing Conv. time + FC time) (18 hours). For SVM and bounding box regressor training (3 hours). Feature extraction: time consuming (63 hours), disk I/O is costly.

77 Fast R-CNN Testing R-CNN Problem #1: Slow at test time due to independent forward passes of the CNN. Solution: Share computation of convolutional layers between proposals for an image.

78 Fast R-CNN Training R-CNN Problem #2: Post-hoc training CNN not updated in response to final classifiers and regressors. R-CNN Problem #3: Complex training pipeline. Solution: Just train the whole system end-to-end all at once.

79 Fast R-CNN Feature maps are shared to each proposals. Applied to entire image. Train the detector in a single stage, end-to-end. No caching features to disk. No post hoc training steps. 1 computing Conv. Time + 2K x FC time: 8.75 hours

80 Fast R-CNN Testing Regions of Interest (RoIs) from a proposal method ConvNet conv5 feature map of image Forward whole image through ConvNet Input image

81 Fast R-CNN Testing Regions of Interest (RoIs) from a proposal method ConvNet RoI Pooling layer conv5 feature map of image Forward whole image through ConvNet Input image

82 Fast R-CNN Testing Softmax classifier Linear + softmax FCs Fully-connected layers Regions of Interest (RoIs) from a proposal method ConvNet RoI Pooling layer conv5 feature map of image Forward whole image through ConvNet Input image

83 Testing Fast R-CNN Linear + Softmax classifier softmax Linear Bounding-box regressors FCs Fully-connected layers Regions of Interest (RoIs) from a proposal method ConvNet RoI Pooling layer conv5 feature map of image Forward whole image through ConvNet Input image

84 Training Linear + softmax Fast R-CNN Log loss + smooth L1 loss Linear Multi task loss FCs Trainable ConvNet

85 Results Fast R-CNN R-CNN [1] SPP-net [2] Train time (h) Speedup 8.8x 1x 3.4x Test time / image 0.32s 47.0s 2.3s Test speedup 146x 1x 20x map 66.9% 66.0% 63.1% Timings exclude object proposal time, which is equal for all methods. All methods use VGG-16.

86 Fast R-CNN

87 Fast R-CNN Results R-CNN Fast R-CNN Training time (h) Speedup 1x 8.8x Test time / image 47 seconds 0.32 seconds - Speedup 1x 146x map (VOC 2007) All methods use VGG-16.

88 Object Detection - Faster R-CNN -

89 Fast R-CNN Problems Test-time speeds don t include region proposals. R-CNN Model Fast R-CNN Time SS time *Conv. Time *Fc Time SS time + 1*Conv. Time *Fc Time Including time for selective search. R-CNN Fast R-CNN Test time / image 47 seconds 0.32 seconds - Speedup 1x 146x Test time / image with Selective search 50 seconds 2 seconds - Speedup 1x 25x

90 Fast R-CNN Problems Test-time speeds don t include region proposals. R-CNN Fast R-CNN Test time / image 47 seconds 0.32 seconds - Speedup 1x 146x Test time / image with Selective search 50 seconds 2 seconds - Speedup 1x 25x Solution Just make the CNN do region proposals too!!!

91 Faster R-CNN

92 Faster R-CNN Insert Region Proposal Network (RPN) after the last convolutional layer. RPN trained to produce region proposals directly; No need for external proposals. After RPN, use RoI pooling and an upstream classifier and bounding box regressor just like Fast R-CNN.

93 Region Proposal Networks (RPN) Goal Share computation with a Fast R-CNN object detection network. RPN input Image of any size. Output Rectangular object proposals with objectness score.

94 Region Proposal Networks (RPN) Slide a small window on the feature map. Build a small network for Classifying object or not-object. Regressing bounding box locations. Position of the sliding window provides localization information with reference (anchor) to the image. Bounding box regression provides finer localization information with reference (anchor) to this sliding widow.

95 Region Proposal Networks (RPN) Use k anchor boxes (ex., k = 9; 3 scales, 3 aspect ratios.) at each location. Anchors are translation invariant Use the same ones at every location. Regression gives offsets from anchor boxes. Classification gives the probability that each (regressed) anchor shows an object.

96 RPN: Loss function L p i, t i = 1 N cls i L cls p i, p i + λ 1 N reg i p i L reg t i, t i

97 RPN: Loss function - classification 2 class Softmax loss Discriminative training p i = 1 if IoU > 0.7 p i = 0 if IoU < 0.3 Otherwise, do not contribute to loss. L cls p i, p i L reg t i, t i L p i, t i = 1 N cls i L cls p i, p i + λ 1 N reg i p i L reg t i, t i

98 RPN: Loss function - regression L reg t, t = in which i x,y,w,h smooth L1 t i, t i, L cls p i, p i L reg t i, t i smooth L1 x = ቊ 0.5x2 if x < 1 x 0.5 otherwise t x = x x a /w a, t y = y y a /h a, t w = log w/w a, t h = log h/h a, t x = x x a /w a, t y = y y a /h a, t w = log w /w a, t h = log h /h a L p i, t i = 1 N cls i L cls p i, p i + λ 1 N reg i p i L reg t i, t i Only positive sample contribute to reg. loss.

99 Bounding-Box Regression

100 How to train faster R-CNN Balance sampling (pos. vs neg. = 1:1) An anchor is labeled as positive if The anchor/anchors with the highest IoU overlap with a GT box. The anchor has an IoU overlap with a GT box higher than 0.7 Negative labels are assigned to anchors with IoU lower than 0.3 for all GT boxes. Alternating optimization (in the first paper) 4 step training. Joint training (since publication) One network, four losses.

101 How to train faster R-CNN Alternating optimization Step 1: Train RPN initialized with ImageNet pre-trained model. [no parameter sharing] Step 2: Train fast R-CNN using proposal generated in Step 1. [no parameter sharing] Step 3: Use convolutional layers trained in Step2 to initialize model, fix shared convolutional layers, update RPN. Step 4: fix shared convolutional layers, fine-tune fully connected layers of fast R-CNN.

102 How to train faster R-CNN Joint training one network, four losses RPN classification (anchor good / bad) RPN regression (anchor proposal) Fast R-CNN classification (over classes) Fast R-CNN regression (proposal box)

103 Object Detection: Datasets Model PASCAL VOC ImageNet Detection (ILSVRC) MS COCO # of classes # of images (train+val) ~20k ~470k ~120k Avg. object/image

104 VOC 2007, ZF Conv. Net Results

105 Results VOC 2007, using VGG-16, Fast R-CNN vs. Faster R-CNN

106 Results VOC 2012, using VGG-16, Fast R-CNN vs. Faster R-CNN

107 Results VOC 2007, using VGG-16, different settings of achnors

108 Fast R-CNN vs. Faster R-CNN Results

109 Results Model R-CNN Fast R-CNN Faster R-CNN Time SS time *Conv. Time *Fc Time SS time + 1*Conv. Time *Fc Time 1*Conv. Time + 300*Fc Time Test time / image (with proposals) R-CNN Fast R-CNN Faster R-CNN 50 seconds 2 seconds 0.2 seconds - Speedup 1x 25x 250x map (VOC 2007)

110 Summary Object Detection Find a variable number of objects by classifying image regions. Before CNNs, Dense multi-scale sliding window (HoG, DPM). Avoid dense sliding windows with region proposals. R-CNN Selective search + CNN Classification / bounding box regression. Fast R-CNN Swap order of convolutions and region extraction. Faster R-CNN Compute region proposals within the network.

Spatial Localization and Detection. Lecture 8-1

Spatial Localization and Detection. Lecture 8-1 Lecture 8: Spatial Localization and Detection Lecture 8-1 Administrative - Project Proposals were due on Saturday Homework 2 due Friday 2/5 Homework 1 grades out this week Midterm will be in-class on Wednesday

More information

Deep learning for object detection. Slides from Svetlana Lazebnik and many others

Deep learning for object detection. Slides from Svetlana Lazebnik and many others Deep learning for object detection Slides from Svetlana Lazebnik and many others Recent developments in object detection 80% PASCAL VOC mean0average0precision0(map) 70% 60% 50% 40% 30% 20% 10% Before deep

More information

OBJECT DETECTION HYUNG IL KOO

OBJECT DETECTION HYUNG IL KOO OBJECT DETECTION HYUNG IL KOO INTRODUCTION Computer Vision Tasks Classification + Localization Classification: C-classes Input: image Output: class label Evaluation metric: accuracy Localization Input:

More information

Object detection with CNNs

Object detection with CNNs Object detection with CNNs 80% PASCAL VOC mean0average0precision0(map) 70% 60% 50% 40% 30% 20% 10% Before CNNs After CNNs 0% 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 year Region proposals

More information

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun Presented by Tushar Bansal Objective 1. Get bounding box for all objects

More information

Yiqi Yan. May 10, 2017

Yiqi Yan. May 10, 2017 Yiqi Yan May 10, 2017 P a r t I F u n d a m e n t a l B a c k g r o u n d s Convolution Single Filter Multiple Filters 3 Convolution: case study, 2 filters 4 Convolution: receptive field receptive field

More information

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks Shaoqing Ren Kaiming He Ross Girshick Jian Sun Present by: Yixin Yang Mingdong Wang 1 Object Detection 2 1 Applications Basic

More information

Object Detection Based on Deep Learning

Object Detection Based on Deep Learning Object Detection Based on Deep Learning Yurii Pashchenko AI Ukraine 2016, Kharkiv, 2016 Image classification (mostly what you ve seen) http://tutorial.caffe.berkeleyvision.org/caffe-cvpr15-detection.pdf

More information

Optimizing Object Detection:

Optimizing Object Detection: Lecture 10: Optimizing Object Detection: A Case Study of R-CNN, Fast R-CNN, and Faster R-CNN Visual Computing Systems Today s task: object detection Image classification: what is the object in this image?

More information

Deep Learning for Object detection & localization

Deep Learning for Object detection & localization Deep Learning for Object detection & localization RCNN, Fast RCNN, Faster RCNN, YOLO, GAP, CAM, MSROI Aaditya Prakash Sep 25, 2018 Image classification Image classification Whole of image is classified

More information

Convolutional Neural Networks. Computer Vision Jia-Bin Huang, Virginia Tech

Convolutional Neural Networks. Computer Vision Jia-Bin Huang, Virginia Tech Convolutional Neural Networks Computer Vision Jia-Bin Huang, Virginia Tech Today s class Overview Convolutional Neural Network (CNN) Training CNN Understanding and Visualizing CNN Image Categorization:

More information

CS6501: Deep Learning for Visual Recognition. Object Detection I: RCNN, Fast-RCNN, Faster-RCNN

CS6501: Deep Learning for Visual Recognition. Object Detection I: RCNN, Fast-RCNN, Faster-RCNN CS6501: Deep Learning for Visual Recognition Object Detection I: RCNN, Fast-RCNN, Faster-RCNN Today s Class Object Detection The RCNN Object Detector (2014) The Fast RCNN Object Detector (2015) The Faster

More information

Rich feature hierarchies for accurate object detection and semantic segmentation

Rich feature hierarchies for accurate object detection and semantic segmentation Rich feature hierarchies for accurate object detection and semantic segmentation BY; ROSS GIRSHICK, JEFF DONAHUE, TREVOR DARRELL AND JITENDRA MALIK PRESENTER; MUHAMMAD OSAMA Object detection vs. classification

More information

ECE 6504: Deep Learning for Perception

ECE 6504: Deep Learning for Perception ECE 6504: Deep Learning for Perception Topics: (Finish) Backprop Convolutional Neural Nets Dhruv Batra Virginia Tech Administrativia Presentation Assignments https://docs.google.com/spreadsheets/d/ 1m76E4mC0wfRjc4HRBWFdAlXKPIzlEwfw1-u7rBw9TJ8/

More information

Introduction to Deep Learning for Facial Understanding Part III: Regional CNNs

Introduction to Deep Learning for Facial Understanding Part III: Regional CNNs Introduction to Deep Learning for Facial Understanding Part III: Regional CNNs Raymond Ptucha, Rochester Institute of Technology, USA Tutorial-9 May 19, 218 www.nvidia.com/dli R. Ptucha 18 1 Fair Use Agreement

More information

Computer Vision Lecture 16

Computer Vision Lecture 16 Computer Vision Lecture 16 Deep Learning for Object Categorization 14.01.2016 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de leibe@vision.rwth-aachen.de Announcements Seminar registration period

More information

Mask R-CNN. presented by Jiageng Zhang, Jingyao Zhan, Yunhan Ma

Mask R-CNN. presented by Jiageng Zhang, Jingyao Zhan, Yunhan Ma Mask R-CNN presented by Jiageng Zhang, Jingyao Zhan, Yunhan Ma Mask R-CNN Background Related Work Architecture Experiment Mask R-CNN Background Related Work Architecture Experiment Background From left

More information

Yield Estimation using faster R-CNN

Yield Estimation using faster R-CNN Yield Estimation using faster R-CNN 1 Vidhya Sagar, 2 Sailesh J.Jain and 2 Arjun P. 1 Assistant Professor, 2 UG Scholar, Department of Computer Engineering and Science SRM Institute of Science and Technology,Chennai,

More information

Machine Learning. Deep Learning. Eric Xing (and Pengtao Xie) , Fall Lecture 8, October 6, Eric CMU,

Machine Learning. Deep Learning. Eric Xing (and Pengtao Xie) , Fall Lecture 8, October 6, Eric CMU, Machine Learning 10-701, Fall 2015 Deep Learning Eric Xing (and Pengtao Xie) Lecture 8, October 6, 2015 Eric Xing @ CMU, 2015 1 A perennial challenge in computer vision: feature engineering SIFT Spin image

More information

Regionlet Object Detector with Hand-crafted and CNN Feature

Regionlet Object Detector with Hand-crafted and CNN Feature Regionlet Object Detector with Hand-crafted and CNN Feature Xiaoyu Wang Research Xiaoyu Wang Research Ming Yang Horizon Robotics Shenghuo Zhu Alibaba Group Yuanqing Lin Baidu Overview of this section Regionlet

More information

Machine Learning 13. week

Machine Learning 13. week Machine Learning 13. week Deep Learning Convolutional Neural Network Recurrent Neural Network 1 Why Deep Learning is so Popular? 1. Increase in the amount of data Thanks to the Internet, huge amount of

More information

Computer Vision Lecture 16

Computer Vision Lecture 16 Computer Vision Lecture 16 Deep Learning Applications 11.01.2017 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de leibe@vision.rwth-aachen.de Announcements Seminar registration period starts

More information

Computer Vision Lecture 16

Computer Vision Lecture 16 Announcements Computer Vision Lecture 16 Deep Learning Applications 11.01.2017 Seminar registration period starts on Friday We will offer a lab course in the summer semester Deep Robot Learning Topic:

More information

Know your data - many types of networks

Know your data - many types of networks Architectures Know your data - many types of networks Fixed length representation Variable length representation Online video sequences, or samples of different sizes Images Specific architectures for

More information

Deep Learning in Visual Recognition. Thanks Da Zhang for the slides

Deep Learning in Visual Recognition. Thanks Da Zhang for the slides Deep Learning in Visual Recognition Thanks Da Zhang for the slides Deep Learning is Everywhere 2 Roadmap Introduction Convolutional Neural Network Application Image Classification Object Detection Object

More information

DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution and Fully Connected CRFs

DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution and Fully Connected CRFs DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution and Fully Connected CRFs Zhipeng Yan, Moyuan Huang, Hao Jiang 5/1/2017 1 Outline Background semantic segmentation Objective,

More information

Semantic Segmentation

Semantic Segmentation Semantic Segmentation UCLA:https://goo.gl/images/I0VTi2 OUTLINE Semantic Segmentation Why? Paper to talk about: Fully Convolutional Networks for Semantic Segmentation. J. Long, E. Shelhamer, and T. Darrell,

More information

Object detection using Region Proposals (RCNN) Ernest Cheung COMP Presentation

Object detection using Region Proposals (RCNN) Ernest Cheung COMP Presentation Object detection using Region Proposals (RCNN) Ernest Cheung COMP790-125 Presentation 1 2 Problem to solve Object detection Input: Image Output: Bounding box of the object 3 Object detection using CNN

More information

3 Object Detection. BVM 2018 Tutorial: Advanced Deep Learning Methods. Paul F. Jaeger, Division of Medical Image Computing

3 Object Detection. BVM 2018 Tutorial: Advanced Deep Learning Methods. Paul F. Jaeger, Division of Medical Image Computing 3 Object Detection BVM 2018 Tutorial: Advanced Deep Learning Methods Paul F. Jaeger, of Medical Image Computing What is object detection? classification segmentation obj. detection (1 label per pixel)

More information

Rich feature hierarchies for accurate object detection and semantic segmentation

Rich feature hierarchies for accurate object detection and semantic segmentation Rich feature hierarchies for accurate object detection and semantic segmentation Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik Presented by Pandian Raju and Jialin Wu Last class SGD for Document

More information

CMU Lecture 18: Deep learning and Vision: Convolutional neural networks. Teacher: Gianni A. Di Caro

CMU Lecture 18: Deep learning and Vision: Convolutional neural networks. Teacher: Gianni A. Di Caro CMU 15-781 Lecture 18: Deep learning and Vision: Convolutional neural networks Teacher: Gianni A. Di Caro DEEP, SHALLOW, CONNECTED, SPARSE? Fully connected multi-layer feed-forward perceptrons: More powerful

More information

Object Detection on Self-Driving Cars in China. Lingyun Li

Object Detection on Self-Driving Cars in China. Lingyun Li Object Detection on Self-Driving Cars in China Lingyun Li Introduction Motivation: Perception is the key of self-driving cars Data set: 10000 images with annotation 2000 images without annotation (not

More information

Deep Learning for Computer Vision II

Deep Learning for Computer Vision II IIIT Hyderabad Deep Learning for Computer Vision II C. V. Jawahar Paradigm Shift Feature Extraction (SIFT, HoG, ) Part Models / Encoding Classifier Sparrow Feature Learning Classifier Sparrow L 1 L 2 L

More information

Deep Neural Networks:

Deep Neural Networks: Deep Neural Networks: Part II Convolutional Neural Network (CNN) Yuan-Kai Wang, 2016 Web site of this course: http://pattern-recognition.weebly.com source: CNN for ImageClassification, by S. Lazebnik,

More information

Fully Convolutional Networks for Semantic Segmentation

Fully Convolutional Networks for Semantic Segmentation Fully Convolutional Networks for Semantic Segmentation Jonathan Long* Evan Shelhamer* Trevor Darrell UC Berkeley Chaim Ginzburg for Deep Learning seminar 1 Semantic Segmentation Define a pixel-wise labeling

More information

Advanced Video Analysis & Imaging

Advanced Video Analysis & Imaging Advanced Video Analysis & Imaging (5LSH0), Module 09B Machine Learning with Convolutional Neural Networks (CNNs) - Workout Farhad G. Zanjani, Clint Sebastian, Egor Bondarev, Peter H.N. de With ( p.h.n.de.with@tue.nl

More information

COMP 551 Applied Machine Learning Lecture 16: Deep Learning

COMP 551 Applied Machine Learning Lecture 16: Deep Learning COMP 551 Applied Machine Learning Lecture 16: Deep Learning Instructor: Ryan Lowe (ryan.lowe@cs.mcgill.ca) Slides mostly by: Class web page: www.cs.mcgill.ca/~hvanho2/comp551 Unless otherwise noted, all

More information

Classification of objects from Video Data (Group 30)

Classification of objects from Video Data (Group 30) Classification of objects from Video Data (Group 30) Sheallika Singh 12665 Vibhuti Mahajan 12792 Aahitagni Mukherjee 12001 M Arvind 12385 1 Motivation Video surveillance has been employed for a long time

More information

Deep Learning with Tensorflow AlexNet

Deep Learning with Tensorflow   AlexNet Machine Learning and Computer Vision Group Deep Learning with Tensorflow http://cvml.ist.ac.at/courses/dlwt_w17/ AlexNet Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton, "Imagenet classification

More information

CIS680: Vision & Learning Assignment 2.b: RPN, Faster R-CNN and Mask R-CNN Due: Nov. 21, 2018 at 11:59 pm

CIS680: Vision & Learning Assignment 2.b: RPN, Faster R-CNN and Mask R-CNN Due: Nov. 21, 2018 at 11:59 pm CIS680: Vision & Learning Assignment 2.b: RPN, Faster R-CNN and Mask R-CNN Due: Nov. 21, 2018 at 11:59 pm Instructions This is an individual assignment. Individual means each student must hand in their

More information

Deep Learning for Vision: Tricks of the Trade

Deep Learning for Vision: Tricks of the Trade Deep Learning for Vision: Tricks of the Trade Marc'Aurelio Ranzato Facebook, AI Group www.cs.toronto.edu/~ranzato BAVM Friday, 4 October 2013 Ideal Features Ideal Feature Extractor - window, right - chair,

More information

Automatic detection of books based on Faster R-CNN

Automatic detection of books based on Faster R-CNN Automatic detection of books based on Faster R-CNN Beibei Zhu, Xiaoyu Wu, Lei Yang, Yinghua Shen School of Information Engineering, Communication University of China Beijing, China e-mail: zhubeibei@cuc.edu.cn,

More information

Object Detection. CS698N Final Project Presentation AKSHAT AGARWAL SIDDHARTH TANWAR

Object Detection. CS698N Final Project Presentation AKSHAT AGARWAL SIDDHARTH TANWAR Object Detection CS698N Final Project Presentation AKSHAT AGARWAL SIDDHARTH TANWAR Problem Description Arguably the most important part of perception Long term goals for object recognition: Generalization

More information

CS 2750: Machine Learning. Neural Networks. Prof. Adriana Kovashka University of Pittsburgh April 13, 2016

CS 2750: Machine Learning. Neural Networks. Prof. Adriana Kovashka University of Pittsburgh April 13, 2016 CS 2750: Machine Learning Neural Networks Prof. Adriana Kovashka University of Pittsburgh April 13, 2016 Plan for today Neural network definition and examples Training neural networks (backprop) Convolutional

More information

Project 3 Q&A. Jonathan Krause

Project 3 Q&A. Jonathan Krause Project 3 Q&A Jonathan Krause 1 Outline R-CNN Review Error metrics Code Overview Project 3 Report Project 3 Presentations 2 Outline R-CNN Review Error metrics Code Overview Project 3 Report Project 3 Presentations

More information

Self Driving. DNN * * Reinforcement * Unsupervised *

Self Driving. DNN * * Reinforcement * Unsupervised * CNN 응용 Methods Traditional Deep-Learning based Non-machine Learning Machine-Learning based method Supervised SVM MLP CNN RNN (LSTM) Localizati on GPS, SLAM Self Driving Perception Pedestrian detection

More information

Structured Prediction using Convolutional Neural Networks

Structured Prediction using Convolutional Neural Networks Overview Structured Prediction using Convolutional Neural Networks Bohyung Han bhhan@postech.ac.kr Computer Vision Lab. Convolutional Neural Networks (CNNs) Structured predictions for low level computer

More information

CS 1674: Intro to Computer Vision. Object Recognition. Prof. Adriana Kovashka University of Pittsburgh April 3, 5, 2018

CS 1674: Intro to Computer Vision. Object Recognition. Prof. Adriana Kovashka University of Pittsburgh April 3, 5, 2018 CS 1674: Intro to Computer Vision Object Recognition Prof. Adriana Kovashka University of Pittsburgh April 3, 5, 2018 Different Flavors of Object Recognition Semantic Segmentation Classification + Localization

More information

Optimizing Object Detection:

Optimizing Object Detection: Lecture 10: Optimizing Object Detection: A Case Study of R-CNN, Fast R-CNN, and Faster R-CNN and Single Shot Detection Visual Computing Systems Today s task: object detection Image classification: what

More information

Deep Residual Learning

Deep Residual Learning Deep Residual Learning MSRA @ ILSVRC & COCO 2015 competitions Kaiming He with Xiangyu Zhang, Shaoqing Ren, Jifeng Dai, & Jian Sun Microsoft Research Asia (MSRA) MSRA @ ILSVRC & COCO 2015 Competitions 1st

More information

Object Detection. TA : Young-geun Kim. Biostatistics Lab., Seoul National University. March-June, 2018

Object Detection. TA : Young-geun Kim. Biostatistics Lab., Seoul National University. March-June, 2018 Object Detection TA : Young-geun Kim Biostatistics Lab., Seoul National University March-June, 2018 Seoul National University Deep Learning March-June, 2018 1 / 57 Index 1 Introduction 2 R-CNN 3 YOLO 4

More information

Deep Learning. Visualizing and Understanding Convolutional Networks. Christopher Funk. Pennsylvania State University.

Deep Learning. Visualizing and Understanding Convolutional Networks. Christopher Funk. Pennsylvania State University. Visualizing and Understanding Convolutional Networks Christopher Pennsylvania State University February 23, 2015 Some Slide Information taken from Pierre Sermanet (Google) presentation on and Computer

More information

Rich feature hierarchies for accurate object detection and semant

Rich feature hierarchies for accurate object detection and semant Rich feature hierarchies for accurate object detection and semantic segmentation Speaker: Yucong Shen 4/5/2018 Develop of Object Detection 1 DPM (Deformable parts models) 2 R-CNN 3 Fast R-CNN 4 Faster

More information

Encoder-Decoder Networks for Semantic Segmentation. Sachin Mehta

Encoder-Decoder Networks for Semantic Segmentation. Sachin Mehta Encoder-Decoder Networks for Semantic Segmentation Sachin Mehta Outline > Overview of Semantic Segmentation > Encoder-Decoder Networks > Results What is Semantic Segmentation? Input: RGB Image Output:

More information

Modern Convolutional Object Detectors

Modern Convolutional Object Detectors Modern Convolutional Object Detectors Faster R-CNN, R-FCN, SSD 29 September 2017 Presented by: Kevin Liang Papers Presented Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

More information

Convolutional Neural Networks

Convolutional Neural Networks NPFL114, Lecture 4 Convolutional Neural Networks Milan Straka March 25, 2019 Charles University in Prague Faculty of Mathematics and Physics Institute of Formal and Applied Linguistics unless otherwise

More information

AttentionNet for Accurate Localization and Detection of Objects. (To appear in ICCV 2015)

AttentionNet for Accurate Localization and Detection of Objects. (To appear in ICCV 2015) AttentionNet for Accurate Localization and Detection of Objects. (To appear in ICCV 2015) Donggeun Yoo, Sunggyun Park, Joon-Young Lee, Anthony Paek, In So Kweon. State-of-the-art frameworks for object

More information

CPSC340. State-of-the-art Neural Networks. Nando de Freitas November, 2012 University of British Columbia

CPSC340. State-of-the-art Neural Networks. Nando de Freitas November, 2012 University of British Columbia CPSC340 State-of-the-art Neural Networks Nando de Freitas November, 2012 University of British Columbia Outline of the lecture This lecture provides an overview of two state-of-the-art neural networks:

More information

Object Detection Lecture Introduction to deep learning (CNN) Idar Dyrdal

Object Detection Lecture Introduction to deep learning (CNN) Idar Dyrdal Object Detection Lecture 10.3 - Introduction to deep learning (CNN) Idar Dyrdal Deep Learning Labels Computational models composed of multiple processing layers (non-linear transformations) Used to learn

More information

Object Recognition II

Object Recognition II Object Recognition II Linda Shapiro EE/CSE 576 with CNN slides from Ross Girshick 1 Outline Object detection the task, evaluation, datasets Convolutional Neural Networks (CNNs) overview and history Region-based

More information

Lecture 5: Object Detection

Lecture 5: Object Detection Object Detection CSED703R: Deep Learning for Visual Recognition (2017F) Lecture 5: Object Detection Bohyung Han Computer Vision Lab. bhhan@postech.ac.kr 2 Traditional Object Detection Algorithms Region-based

More information

Instance-aware Semantic Segmentation via Multi-task Network Cascades

Instance-aware Semantic Segmentation via Multi-task Network Cascades Instance-aware Semantic Segmentation via Multi-task Network Cascades Jifeng Dai, Kaiming He, Jian Sun Microsoft research 2016 Yotam Gil Amit Nativ Agenda Introduction Highlights Implementation Further

More information

An Exploration of Computer Vision Techniques for Bird Species Classification

An Exploration of Computer Vision Techniques for Bird Species Classification An Exploration of Computer Vision Techniques for Bird Species Classification Anne L. Alter, Karen M. Wang December 15, 2017 Abstract Bird classification, a fine-grained categorization task, is a complex

More information

Deep Convolutional Neural Networks. Nov. 20th, 2015 Bruce Draper

Deep Convolutional Neural Networks. Nov. 20th, 2015 Bruce Draper Deep Convolutional Neural Networks Nov. 20th, 2015 Bruce Draper Background: Fully-connected single layer neural networks Feed-forward classification Trained through back-propagation Example Computer Vision

More information

R-FCN: OBJECT DETECTION VIA REGION-BASED FULLY CONVOLUTIONAL NETWORKS

R-FCN: OBJECT DETECTION VIA REGION-BASED FULLY CONVOLUTIONAL NETWORKS R-FCN: OBJECT DETECTION VIA REGION-BASED FULLY CONVOLUTIONAL NETWORKS JIFENG DAI YI LI KAIMING HE JIAN SUN MICROSOFT RESEARCH TSINGHUA UNIVERSITY MICROSOFT RESEARCH MICROSOFT RESEARCH SPEED/ACCURACY TRADE-OFFS

More information

Convolutional Neural Networks: Applications and a short timeline. 7th Deep Learning Meetup Kornel Kis Vienna,

Convolutional Neural Networks: Applications and a short timeline. 7th Deep Learning Meetup Kornel Kis Vienna, Convolutional Neural Networks: Applications and a short timeline 7th Deep Learning Meetup Kornel Kis Vienna, 1.12.2016. Introduction Currently a master student Master thesis at BME SmartLab Started deep

More information

Deconvolutions in Convolutional Neural Networks

Deconvolutions in Convolutional Neural Networks Overview Deconvolutions in Convolutional Neural Networks Bohyung Han bhhan@postech.ac.kr Computer Vision Lab. Convolutional Neural Networks (CNNs) Deconvolutions in CNNs Applications Network visualization

More information

Mask R-CNN. By Kaiming He, Georgia Gkioxari, Piotr Dollar and Ross Girshick Presented By Aditya Sanghi

Mask R-CNN. By Kaiming He, Georgia Gkioxari, Piotr Dollar and Ross Girshick Presented By Aditya Sanghi Mask R-CNN By Kaiming He, Georgia Gkioxari, Piotr Dollar and Ross Girshick Presented By Aditya Sanghi Types of Computer Vision Tasks http://cs231n.stanford.edu/ Semantic vs Instance Segmentation Image

More information

Object Recognition and Detection

Object Recognition and Detection CS 2770: Computer Vision Object Recognition and Detection Prof. Adriana Kovashka University of Pittsburgh March 16, 21, 23, 2017 Plan for the next few lectures Recognizing the category in the image as

More information

Direct Multi-Scale Dual-Stream Network for Pedestrian Detection Sang-Il Jung and Ki-Sang Hong Image Information Processing Lab.

Direct Multi-Scale Dual-Stream Network for Pedestrian Detection Sang-Il Jung and Ki-Sang Hong Image Information Processing Lab. [ICIP 2017] Direct Multi-Scale Dual-Stream Network for Pedestrian Detection Sang-Il Jung and Ki-Sang Hong Image Information Processing Lab., POSTECH Pedestrian Detection Goal To draw bounding boxes that

More information

MCMOT: Multi-Class Multi-Object Tracking using Changing Point Detection

MCMOT: Multi-Class Multi-Object Tracking using Changing Point Detection MCMOT: Multi-Class Multi-Object Tracking using Changing Point Detection ILSVRC 2016 Object Detection from Video Byungjae Lee¹, Songguo Jin¹, Enkhbayar Erdenee¹, Mi Young Nam², Young Gui Jung², Phill Kyu

More information

Scene Text Recognition for Augmented Reality. Sagar G V Adviser: Prof. Bharadwaj Amrutur Indian Institute Of Science

Scene Text Recognition for Augmented Reality. Sagar G V Adviser: Prof. Bharadwaj Amrutur Indian Institute Of Science Scene Text Recognition for Augmented Reality Sagar G V Adviser: Prof. Bharadwaj Amrutur Indian Institute Of Science Outline Research area and motivation Finding text in natural scenes Prior art Improving

More information

CS 1674: Intro to Computer Vision. Neural Networks. Prof. Adriana Kovashka University of Pittsburgh November 16, 2016

CS 1674: Intro to Computer Vision. Neural Networks. Prof. Adriana Kovashka University of Pittsburgh November 16, 2016 CS 1674: Intro to Computer Vision Neural Networks Prof. Adriana Kovashka University of Pittsburgh November 16, 2016 Announcements Please watch the videos I sent you, if you haven t yet (that s your reading)

More information

All You Want To Know About CNNs. Yukun Zhu

All You Want To Know About CNNs. Yukun Zhu All You Want To Know About CNNs Yukun Zhu Deep Learning Deep Learning Image from http://imgur.com/ Deep Learning Image from http://imgur.com/ Deep Learning Image from http://imgur.com/ Deep Learning Image

More information

Convolutional-Recursive Deep Learning for 3D Object Classification

Convolutional-Recursive Deep Learning for 3D Object Classification Convolutional-Recursive Deep Learning for 3D Object Classification Richard Socher, Brody Huval, Bharath Bhat, Christopher D. Manning, Andrew Y. Ng NIPS 2012 Iro Armeni, Manik Dhar Motivation Hand-designed

More information

INF 5860 Machine learning for image classification. Lecture 11: Visualization Anne Solberg April 4, 2018

INF 5860 Machine learning for image classification. Lecture 11: Visualization Anne Solberg April 4, 2018 INF 5860 Machine learning for image classification Lecture 11: Visualization Anne Solberg April 4, 2018 Reading material The lecture is based on papers: Deep Dream: https://research.googleblog.com/2015/06/inceptionism-goingdeeper-into-neural.html

More information

Deep Learning. Deep Learning provided breakthrough results in speech recognition and image classification. Why?

Deep Learning. Deep Learning provided breakthrough results in speech recognition and image classification. Why? Data Mining Deep Learning Deep Learning provided breakthrough results in speech recognition and image classification. Why? Because Speech recognition and image classification are two basic examples of

More information

Deep Learning For Video Classification. Presented by Natalie Carlebach & Gil Sharon

Deep Learning For Video Classification. Presented by Natalie Carlebach & Gil Sharon Deep Learning For Video Classification Presented by Natalie Carlebach & Gil Sharon Overview Of Presentation Motivation Challenges of video classification Common datasets 4 different methods presented in

More information

Using Machine Learning for Classification of Cancer Cells

Using Machine Learning for Classification of Cancer Cells Using Machine Learning for Classification of Cancer Cells Camille Biscarrat University of California, Berkeley I Introduction Cell screening is a commonly used technique in the development of new drugs.

More information

SSD: Single Shot MultiBox Detector. Author: Wei Liu et al. Presenter: Siyu Jiang

SSD: Single Shot MultiBox Detector. Author: Wei Liu et al. Presenter: Siyu Jiang SSD: Single Shot MultiBox Detector Author: Wei Liu et al. Presenter: Siyu Jiang Outline 1. Motivations 2. Contributions 3. Methodology 4. Experiments 5. Conclusions 6. Extensions Motivation Motivation

More information

Autoencoder. Representation learning (related to dictionary learning) Both the input and the output are x

Autoencoder. Representation learning (related to dictionary learning) Both the input and the output are x Deep Learning 4 Autoencoder, Attention (spatial transformer), Multi-modal learning, Neural Turing Machine, Memory Networks, Generative Adversarial Net Jian Li IIIS, Tsinghua Autoencoder Autoencoder Unsupervised

More information

Large-Scale Visual Recognition With Deep Learning

Large-Scale Visual Recognition With Deep Learning Large-Scale Visual Recognition With Deep Learning Marc'Aurelio ranzato@google.com www.cs.toronto.edu/~ranzato Sunday 23 June 2013 Why Is Recognition Hard? Object Recognizer panda 2 Why Is Recognition Hard?

More information

Mask R-CNN. Kaiming He, Georgia, Gkioxari, Piotr Dollar, Ross Girshick Presenters: Xiaokang Wang, Mengyao Shi Feb. 13, 2018

Mask R-CNN. Kaiming He, Georgia, Gkioxari, Piotr Dollar, Ross Girshick Presenters: Xiaokang Wang, Mengyao Shi Feb. 13, 2018 Mask R-CNN Kaiming He, Georgia, Gkioxari, Piotr Dollar, Ross Girshick Presenters: Xiaokang Wang, Mengyao Shi Feb. 13, 2018 1 Common computer vision tasks Image Classification: one label is generated for

More information

Generative Modeling with Convolutional Neural Networks. Denis Dus Data Scientist at InData Labs

Generative Modeling with Convolutional Neural Networks. Denis Dus Data Scientist at InData Labs Generative Modeling with Convolutional Neural Networks Denis Dus Data Scientist at InData Labs What we will discuss 1. 2. 3. 4. Discriminative vs Generative modeling Convolutional Neural Networks How to

More information

Fuzzy Set Theory in Computer Vision: Example 3

Fuzzy Set Theory in Computer Vision: Example 3 Fuzzy Set Theory in Computer Vision: Example 3 Derek T. Anderson and James M. Keller FUZZ-IEEE, July 2017 Overview Purpose of these slides are to make you aware of a few of the different CNN architectures

More information

Martian lava field, NASA, Wikipedia

Martian lava field, NASA, Wikipedia Martian lava field, NASA, Wikipedia Old Man of the Mountain, Franconia, New Hampshire Pareidolia http://smrt.ccel.ca/203/2/6/pareidolia/ Reddit for more : ) https://www.reddit.com/r/pareidolia/top/ Pareidolia

More information

Binary Convolutional Neural Network on RRAM

Binary Convolutional Neural Network on RRAM Binary Convolutional Neural Network on RRAM Tianqi Tang, Lixue Xia, Boxun Li, Yu Wang, Huazhong Yang Dept. of E.E, Tsinghua National Laboratory for Information Science and Technology (TNList) Tsinghua

More information

Object Detection. Part1. Presenter: Dae-Yong

Object Detection. Part1. Presenter: Dae-Yong Object Part1 Presenter: Dae-Yong Contents 1. What is an Object? 2. Traditional Object Detector 3. Deep Learning-based Object Detector What is an Object? Subset of Object Recognition What is an Object?

More information

Cascade Region Regression for Robust Object Detection

Cascade Region Regression for Robust Object Detection Large Scale Visual Recognition Challenge 2015 (ILSVRC2015) Cascade Region Regression for Robust Object Detection Jiankang Deng, Shaoli Huang, Jing Yang, Hui Shuai, Zhengbo Yu, Zongguang Lu, Qiang Ma, Yali

More information

Deep Learning. Vladimir Golkov Technical University of Munich Computer Vision Group

Deep Learning. Vladimir Golkov Technical University of Munich Computer Vision Group Deep Learning Vladimir Golkov Technical University of Munich Computer Vision Group 1D Input, 1D Output target input 2 2D Input, 1D Output: Data Distribution Complexity Imagine many dimensions (data occupies

More information

Kaggle Data Science Bowl 2017 Technical Report

Kaggle Data Science Bowl 2017 Technical Report Kaggle Data Science Bowl 2017 Technical Report qfpxfd Team May 11, 2017 1 Team Members Table 1: Team members Name E-Mail University Jia Ding dingjia@pku.edu.cn Peking University, Beijing, China Aoxue Li

More information

Flow-Based Video Recognition

Flow-Based Video Recognition Flow-Based Video Recognition Jifeng Dai Visual Computing Group, Microsoft Research Asia Joint work with Xizhou Zhu*, Yuwen Xiong*, Yujie Wang*, Lu Yuan and Yichen Wei (* interns) Talk pipeline Introduction

More information

DEEP LEARNING REVIEW. Yann LeCun, Yoshua Bengio & Geoffrey Hinton Nature Presented by Divya Chitimalla

DEEP LEARNING REVIEW. Yann LeCun, Yoshua Bengio & Geoffrey Hinton Nature Presented by Divya Chitimalla DEEP LEARNING REVIEW Yann LeCun, Yoshua Bengio & Geoffrey Hinton Nature 2015 -Presented by Divya Chitimalla What is deep learning Deep learning allows computational models that are composed of multiple

More information

Inception and Residual Networks. Hantao Zhang. Deep Learning with Python.

Inception and Residual Networks. Hantao Zhang. Deep Learning with Python. Inception and Residual Networks Hantao Zhang Deep Learning with Python https://en.wikipedia.org/wiki/residual_neural_network Deep Neural Network Progress from Large Scale Visual Recognition Challenge (ILSVRC)

More information

WITH the recent advance [1] of deep convolutional neural

WITH the recent advance [1] of deep convolutional neural 1 Action-Driven Object Detection with Top-Down Visual Attentions Donggeun Yoo, Student Member, IEEE, Sunggyun Park, Student Member, IEEE, Kyunghyun Paeng, Student Member, IEEE, Joon-Young Lee, Member,

More information

YOLO 9000 TAEWAN KIM

YOLO 9000 TAEWAN KIM YOLO 9000 TAEWAN KIM DNN-based Object Detection R-CNN MultiBox SPP-Net DeepID-Net NoC Fast R-CNN DeepBox MR-CNN 2013.11 Faster R-CNN YOLO AttentionNet DenseBox SSD Inside-OutsideNet(ION) G-CNN NIPS 15

More information

Deep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks

Deep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks Deep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks Si Chen The George Washington University sichen@gwmail.gwu.edu Meera Hahn Emory University mhahn7@emory.edu Mentor: Afshin

More information

LSTM and its variants for visual recognition. Xiaodan Liang Sun Yat-sen University

LSTM and its variants for visual recognition. Xiaodan Liang Sun Yat-sen University LSTM and its variants for visual recognition Xiaodan Liang xdliang328@gmail.com Sun Yat-sen University Outline Context Modelling with CNN LSTM and its Variants LSTM Architecture Variants Application in

More information

WE are witnessing a rapid, revolutionary change in our

WE are witnessing a rapid, revolutionary change in our 1904 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 37, NO. 9, SEPTEMBER 2015 Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition Kaiming He, Xiangyu Zhang,

More information

Machine Learning. MGS Lecture 3: Deep Learning

Machine Learning. MGS Lecture 3: Deep Learning Dr Michel F. Valstar http://cs.nott.ac.uk/~mfv/ Machine Learning MGS Lecture 3: Deep Learning Dr Michel F. Valstar http://cs.nott.ac.uk/~mfv/ WHAT IS DEEP LEARNING? Shallow network: Only one hidden layer

More information