Rich feature hierarchies for accurate object detection and semantic segmentation

1 Rich feature hierarchies for accurate object detection and semantic segmentation Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik Presented by Pandian Raju and Jialin Wu

2 Last class: SGD for document recognition (LeCun et al.); ImageNet classification (Krizhevsky et al.). What problems did they solve?

3 Image Classification ImageNet sample image Image credit:

4 Image Classification ImageNet sample image Image credit:

5 Image Classification ImageNet sample image Image credit:

6 Image Classification MS COCO sample image Image credit:

7 Object Detection MS COCO sample image Image credit:

8 Semantic segmentation Image credit:

9 Different visual recognition tasks Image classification Object detection Semantic segmentation

10 History

11 PASCAL VOC Detection history Image credit: Ross Girshick

12 DPM - Deformable Parts Model Image credit:

13 Feature learning with CNNs: Fukushima 1980, Neocognitron; LeCun et al., SGD for document recognition; Krizhevsky et al., ImageNet classification (AlexNet)

14 Brute force

15 Brute force Forget it!

16 Regions: Gu et al., Recognition using regions

17 R-CNN high-level flow: (1) category-independent region proposals; (2) extract a feature vector from each region using a CNN; (3) classify each region using a linear SVM per class
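The three stages above can be sketched as one loop. This is only an illustration: `rcnn_detect`, `propose_regions`, `extract_features` and `class_scorers` are hypothetical names, the toy stand-ins are not real models, and the step that warps each region to the CNN's fixed input size is omitted.

```python
def rcnn_detect(image, propose_regions, extract_features, class_scorers):
    """Run the propose -> extract -> classify flow on one image.

    All three components are passed in as callables (hypothetical interface).
    """
    detections = []
    for box in propose_regions(image):            # stage 1: region proposals
        feature = extract_features(image, box)    # stage 2: one feature vector per region
        for label, scorer in class_scorers.items():
            score = scorer(feature)               # stage 3: one linear scorer per class
            if score > 0:
                detections.append((label, box, score))
    return detections


# Toy stand-ins: a 2x4 "image", two fixed proposals, row sums as "features".
toy_image = [[0, 0, 9, 9], [0, 0, 9, 9]]
demo = rcnn_detect(
    toy_image,
    propose_regions=lambda img: [(0, 0, 2, 2), (2, 0, 4, 2)],
    extract_features=lambda img, box: [
        sum(row[box[0]:box[2]]) for row in img[box[1]:box[3]]
    ],
    class_scorers={"bright": lambda f: sum(f) - 10},
)
print(demo)
```

Only the second proposal covers the bright pixels, so it is the only detection that survives the score threshold.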

18 Region proposal: about 2000 region proposals per image from the selective search algorithm (Uijlings et al. 2012). Selective search combines exhaustive search and segmentation

19 Feature vector using CNN: 5 convolutional layers, 2 fully connected layers. Output: 4096-dimensional feature vector

20 Linear SVMs: a sample region goes through the CNN, then each per-class SVM decides. SVM for Cat: No. SVM for Dog: Yes. SVM for Lion: No
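A minimal sketch of the per-class decision for one region, assuming each SVM is a weight vector plus a bias and that a positive score means "Yes". The 4-dimensional feature and all weights are made-up illustrative numbers (the real feature vector has 4096 dimensions), and `svm_scores` is a hypothetical helper name.

```python
def svm_scores(feature, svms):
    """Score one region's feature vector against every per-class linear SVM."""
    return {
        name: sum(w * x for w, x in zip(weights, feature)) + bias
        for name, (weights, bias) in svms.items()
    }


# Illustrative numbers only, not learned weights.
feature = [1.0, 0.0, 2.0, 0.5]
svms = {
    "cat": ([-1.0, 0.0, 0.0, 0.0], 0.0),
    "dog": ([0.5, 0.0, 1.0, 0.0], -1.0),
    "lion": ([0.0, 1.0, 0.0, -1.0], 0.0),
}
scores = svm_scores(feature, svms)
decisions = {name: score > 0 for name, score in scores.items()}
print(decisions)
```

With these numbers only the dog SVM gives a positive score, mirroring the Cat: No, Dog: Yes, Lion: No example on the slide.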

21 Challenges: localization (addressed by region proposals and bounding box regression); limited training set

22 Challenges: localization (addressed by region proposals and bounding box regression); limited training set (addressed by supervised pre-training with fine-tuning)

23 Training: supervised pre-training of the CNN on ILSVRC (2012); fine-tuning (SGD) on regions from PASCAL VOC; one SVM per class (Cat, Dog, Car)

24 Testing Intersection-over-Union (IoU) Image credit:
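For reference, a small sketch of how IoU can be computed, assuming boxes are given as `(x1, y1, x2, y2)` corner coordinates. That box convention and the function name `iou` are assumptions for illustration.

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap rectangle; width and height clamp to zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # overlap 1, union 7
```

Identical boxes give 1.0 and disjoint boxes give 0.0, so the value can be thresholded to decide whether a detection matches a ground-truth box.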

25 Testing: discard redundant regions by applying non-maximum suppression to the SVM scores
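A sketch of greedy non-maximum suppression for one class, assuming `(x1, y1, x2, y2)` boxes. The threshold value 0.3 is a placeholder chosen for illustration, and `nms` and `_iou` are hypothetical helper names.

```python
def _iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.3):
    """Return indices of boxes kept by greedy NMS, highest score first.

    A box is rejected if it overlaps an already kept, higher-scoring box
    by more than iou_threshold (an assumed value, not a tuned one).
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(_iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep
```

For example, with boxes `[(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]` and scores `[0.9, 0.8, 0.7]`, the second box overlaps the first too much and is dropped, while the third is kept.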

26 Object Detection Results Slide credit: Ross Girshick

27 Semantic segmentation results: R-CNN is easily extended to the semantic segmentation task. O2P, then the leader in the task, uses CPMC (Constrained Parametric Min-Cuts). Image credit: Ross Girshick

28 PROS

29 Intuitive: combines regions with a CNN

30 Performant: scales easily with the number of classes

31 Run time analysis: the CNN (supervised pre-training on ImageNet, then fine-tuning with SGD on regions from PASCAL VOC) computes the feature vectors once for all SVM classes. Feature vector: low-dimensional (4K)
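Because the features are shared, the only class-specific work is a matrix product between the region features and the SVM weights. The sketch below shows that product in plain Python on tiny made-up matrices, plus the multiply-add count for 2000 regions, 4096-dimensional features and 20 classes. The class count of 20 and the helper names are assumptions for illustration.

```python
def score_all(features, weights):
    """features: R x D rows; weights: D x C rows. Returns the R x C score matrix."""
    n_classes = len(weights[0])
    return [
        [sum(f[d] * weights[d][c] for d in range(len(f))) for c in range(n_classes)]
        for f in features
    ]


def multiply_adds(n_regions, feat_dim, n_classes):
    """Multiply-add count of the class-specific matrix product."""
    return n_regions * feat_dim * n_classes


# Tiny illustrative example: 3 regions, 2-D features, 2 classes.
features = [[1.0, 2.0], [0.0, 1.0], [3.0, 0.0]]
weights = [[1.0, 0.0], [0.5, -1.0]]
all_scores = score_all(features, weights)
print(all_scores)
print(multiply_adds(2000, 4096, 20))
```

The cost grows linearly with the number of classes, while the CNN forward passes stay the same.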

32 Run time analysis Slide credit: Ross Girshick

33 Impact: one of the commonly used methods for semantic segmentation

34 Ablation studies Analyzing performance impact of different layers

35 Ablation studies: effect of different layers and fine-tuning on mAP (mean average precision). Without fine-tuning, fc7 generalizes worse than fc6. Representational power: conv layers > fully connected layers. Fine-tuning increases mAP by 8 percentage points.

36 Visualization of network Showing which layers learn which features

37 Visualization of network: what features does each layer learn? Image credit: Ross Girshick

38 Evaluation: compared against other baselines and methods

39 Failure modes: analyzed common failure modes and suggested a solution (bounding box regression, BB)

40 Detection error analysis Image credit: Ross Girshick

41 Bounding box regression: most errors are mislocalizations. BB regression is a linear regression model that predicts a new detection window given the pool5 features.
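A sketch of the box transform such a regressor typically works with, assuming boxes are given as `(center_x, center_y, width, height)`, offsets are scaled by the proposal size, and width and height change in log space. The function names are hypothetical, and the linear model that predicts the deltas from pool5 features is not shown.

```python
import math


def regression_targets(proposal, ground_truth):
    """Targets the regressor would learn to predict for one proposal."""
    px, py, pw, ph = proposal
    gx, gy, gw, gh = ground_truth
    return (
        (gx - px) / pw,       # scale-invariant shift of the center in x
        (gy - py) / ph,       # scale-invariant shift of the center in y
        math.log(gw / pw),    # log-space change of width
        math.log(gh / ph),    # log-space change of height
    )


def apply_deltas(proposal, deltas):
    """Turn predicted deltas back into a refined detection window."""
    px, py, pw, ph = proposal
    dx, dy, dw, dh = deltas
    return (pw * dx + px, ph * dy + py, pw * math.exp(dw), ph * math.exp(dh))


proposal = (50.0, 50.0, 20.0, 40.0)
ground_truth = (55.0, 60.0, 30.0, 40.0)
targets = regression_targets(proposal, ground_truth)
refined = apply_deltas(proposal, targets)
```

Applying the exact targets to the proposal recovers the ground-truth box, which is a quick sanity check on the two transforms.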

42 Source code: properly documented source code on GitHub

43 Source code Image credit:

44 CONS

45 Computationally costly: every proposal has to go through the whole network

47 Needs two steps for detection: can't unify the proposal step and the classification step

48 Uses SVMs: no end-to-end training

49 Violates spatial translation invariance: the FC layers are the culprit

51 No global information

52 Person or Not

53 Idea is simple: no more than image classification
