Object Recognition and Detection
1 CS 2770: Computer Vision Object Recognition and Detection Prof. Adriana Kovashka University of Pittsburgh March 16, 21, 23, 2017
2 Plan for the next few lectures Recognizing the category in the image as a whole Detecting the region in the image that corresponds to a category Using window templates Face detection Pedestrian detection Using parts Implicit Shape Models Deformable Part Models Using Convolutional Neural Networks R-CNN, Fast R-CNN YOLO (You Only Look Once)
3 Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories CVPR 2006 Svetlana Lazebnik Beckman Institute, University of Illinois at Urbana-Champaign Cordelia Schmid INRIA Rhône-Alpes, France Jean Ponce Ecole Normale Supérieure, France
4 Scene category dataset Fei-Fei & Perona (2005), Oliva & Torralba (2001) Slide credit: L. Lazebnik
5 Bag-of-words representation 1. Extract local features 2. Learn visual vocabulary using clustering 3. Quantize local features using visual vocabulary 4. Represent images by frequencies of visual words Slide credit: L. Lazebnik
6 Image categorization with bag of words Training 1. Compute bag-of-words representation for training images 2. Train classifier on labeled examples using histogram values as features 3. Labels are the scene types (e.g. mountain vs field) Testing 1. Extract keypoints/descriptors for test images 2. Quantize into visual words using the clusters computed at training time 3. Compute visual word histogram for test images 4. Compute labels on test images using classifier obtained at training time 5. Measure accuracy of test predictions by comparing them to groundtruth test labels (obtained from humans) Adapted from D. Hoiem
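The quantization and histogram steps above can be sketched in a few lines. This is a minimal illustration, assuming the visual vocabulary (cluster centers) has already been learned; the function name `bow_histogram` and the toy data are hypothetical and not from the lecture.

```python
import numpy as np

def bow_histogram(descriptors, vocabulary):
    """Quantize local descriptors to their nearest visual word and
    return the normalized word-frequency histogram."""
    # Squared Euclidean distance from every descriptor to every visual word.
    d2 = ((descriptors[:, None, :] - vocabulary[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)  # index of the closest word per descriptor
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / max(hist.sum(), 1.0)

# Toy example: 3 visual words, 4 two-dimensional descriptors.
vocabulary = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
descriptors = np.array([[0.5, 0.2], [9.0, 9.5], [0.1, 0.3], [1.0, 9.0]])
hist = bow_histogram(descriptors, vocabulary)
```

The resulting histogram is the feature vector that would be passed to the classifier at both training and test time.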
7 Feature extraction (on which BOW is based) Weak features: edge points at 2 scales and 8 orientations (vocabulary size 16) Strong features: SIFT descriptors of 16x16 patches sampled on a regular grid, quantized to form visual vocabulary (size 200, 400) Slide credit: L. Lazebnik
8 What about spatial layout? All of these images have the same color histogram Slide credit: D. Hoiem
9 Spatial pyramid Compute histogram in each spatial bin Slide credit: D. Hoiem
10 Spatial pyramid [Lazebnik et al. CVPR 2006] Slide credit: D. Hoiem
11 Pyramid matching Indyk & Thaper (2003), Grauman & Darrell (2005) Matching using pyramid and histogram intersection for some particular visual word Original images: x_i, x_j Feature histograms: Level 3, Level 2, Level 1, Level 0 Total weight (value of pyramid match kernel): K(x_i, x_j) Adapted from L. Lazebnik
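A rough sketch of spatial pyramid matching with histogram intersection follows. It assumes each image is given as a 2-D grid of visual word indices (`word_map`, a hypothetical input format), and it uses the level weights of the spatial pyramid kernel as I understand them from Lazebnik et al.: 1/2^L for level 0 and 1/2^(L-l+1) for level l. Treat the weighting as an assumption, not as the exact form used on the slide.

```python
import numpy as np

def level_histograms(word_map, vocab_size, level):
    """Concatenated visual-word histograms over a 2^level x 2^level grid."""
    cells = 2 ** level
    ys = np.linspace(0, word_map.shape[0], cells + 1).astype(int)
    xs = np.linspace(0, word_map.shape[1], cells + 1).astype(int)
    hists = []
    for i in range(cells):
        for j in range(cells):
            block = word_map[ys[i]:ys[i + 1], xs[j]:xs[j + 1]]
            hists.append(np.bincount(block.ravel(), minlength=vocab_size))
    return np.concatenate(hists)

def spatial_pyramid_match(map_a, map_b, vocab_size, levels):
    """Weighted sum of per-level histogram intersections."""
    total = 0.0
    for level in range(levels + 1):
        ha = level_histograms(map_a, vocab_size, level)
        hb = level_histograms(map_b, vocab_size, level)
        intersection = np.minimum(ha, hb).sum()
        # Assumed weighting: coarse levels count less than fine levels.
        weight = 1.0 / 2 ** levels if level == 0 else 1.0 / 2 ** (levels - level + 1)
        total += weight * intersection
    return total

a = np.array([[0, 0, 1, 1],
              [0, 0, 1, 1],
              [2, 2, 3, 3],
              [2, 2, 3, 3]])
same = spatial_pyramid_match(a, a, vocab_size=4, levels=1)
shuffled = spatial_pyramid_match(a, a.T, vocab_size=4, levels=1)
```

Both image pairs have identical bag-of-words histograms, but the transposed layout matches less well at the finer level, which is the point of adding spatial bins.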
12 Scene category dataset Fei-Fei & Perona (2005), Oliva & Torralba (2001) Multi-class classification results (100 training images per class) Fei-Fei & Perona: 65.2% Slide credit: L. Lazebnik
13 Scene category confusions Difficult indoor images kitchen living room bedroom Slide credit: L. Lazebnik
14 Caltech101 dataset Fei-Fei et al. (2004) Multi-class classification results (30 training images per class) Slide credit: L. Lazebnik
15 Plan for the next few lectures Recognizing the category in the image as a whole Detecting the region in the image that corresponds to a category Using window templates Face detection Pedestrian detection Using parts Implicit Shape Models Deformable Part Models Using Convolutional Neural Networks R-CNN, Fast R-CNN YOLO (You Only Look Once)
16 Category detection: basic framework Build/train object model Choose a representation Learn or fit parameters of model / classifier Generate candidates in new image Score the candidates Kristen Grauman
17 Category detection: representation choice Window-based Part-based Kristen Grauman
18 Window-template-based models Building an object model Consider edges, contours, oriented intensity gradients Summarize local distribution of gradients with histogram Locally orderless: offers invariance to small shifts and rotations Adapted from Kristen Grauman
19 Window-template-based models Building an object model Given the representation, train a binary classifier Car/non-car Classifier: "Yes, a car." / "No, not a car." Kristen Grauman
20 Window-template-based models Generating and scoring candidates Car/non-car Classifier Kristen Grauman
21 Window-template-based object detection: recap Training: 1. Obtain training data 2. Define features 3. Define classifier Given new image: 1. Slide window 2. Score by classifier Training examples Car/non-car Classifier Feature extraction Kristen Grauman
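The test-time half of the recap (slide a window, score it with the classifier) can be sketched as below. The scoring function is a stand-in: here it is simply the mean intensity of the patch, whereas a real detector would extract features and apply the trained car/non-car classifier. All names are hypothetical, and a single scale is assumed.

```python
import numpy as np

def sliding_window_detect(image, window, stride, score_fn, threshold):
    """Score every window position and keep those above the threshold."""
    win_h, win_w = window
    detections = []
    for y in range(0, image.shape[0] - win_h + 1, stride):
        for x in range(0, image.shape[1] - win_w + 1, stride):
            score = score_fn(image[y:y + win_h, x:x + win_w])
            if score > threshold:
                detections.append((x, y, score))
    return detections

# Toy image with a bright 4x4 "object" in the bottom-right corner.
image = np.zeros((8, 8))
image[4:8, 4:8] = 1.0
dets = sliding_window_detect(image, (4, 4), 2, lambda patch: patch.mean(), 0.9)
```

To handle objects of different sizes, the same loop would be repeated over a pyramid of rescaled images.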
22 Special case: Faces Detection Recognition Sally Lana Lazebnik
23 Challenges of face detection Sliding window detector must evaluate tens of thousands of location/scale combinations Faces are rare: 0–10 per image A megapixel image has ~10^6 pixels and a comparable number of candidate face locations For computational efficiency, we should try to spend as little time as possible on the non-face windows To avoid having a false positive in every image, our false positive rate has to be less than 10^-6 Lana Lazebnik
24 Viola-Jones face detector
25 Boosting intuition Weak Classifier 1 Paul Viola
26 Boosting illustration Weights Increased Paul Viola
27 Boosting illustration Weak Classifier 2 Paul Viola
28 Boosting illustration Weights Increased Paul Viola
29 Boosting illustration Weak Classifier 3 Paul Viola
30 Boosting illustration Final classifier is a combination of weak classifiers Paul Viola
31 Boosting: training Initially, weight each training example equally In each boosting round: Find the weak learner that achieves the lowest weighted training error Raise weights of training examples misclassified by current weak learner Compute final classifier as linear combination of all weak learners (weight of each learner is directly proportional to its accuracy) Exact formulas for re-weighting and combining weak learners depend on the particular boosting scheme (e.g., AdaBoost) Lana Lazebnik, Kristen Grauman
32 Main idea: Viola-Jones face detector Represent local texture with efficiently computable rectangular features within window of interest Select discriminative features to be weak classifiers Use boosted combination of them as final classifier Form a cascade of such classifiers, rejecting clear negatives quickly Kristen Grauman
33 Viola-Jones detector: features Rectangular filters Feature output is difference between adjacent regions Value = ∑ (pixels in white area) − ∑ (pixels in black area) Efficiently computable with integral image: any sum can be computed in constant time Integral image: value at (x,y) is sum of pixels above and to the left of (x,y) Adapted from Kristen Grauman and Lana Lazebnik
34 Fast computation with integral images The integral image computes a value at each pixel (x,y) that is the sum of the pixel values above and to the left of (x,y), inclusive This can quickly be computed in one pass through the image (x,y) Lana Lazebnik
35 Computing sum within a rectangle Let A, B, C, D be the values of the integral image at the corners of a rectangle (D at the top-left, A at the bottom-right, B and C at the other two corners) Then the sum of original image values within the rectangle can be computed as: sum = A − B − C + D Only 3 additions are required for any size of rectangle! Lana Lazebnik
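A small sketch of the integral image and the constant-time rectangle sum, sum = A − B − C + D. The function names and the inclusive-coordinate convention are choices made for this example; B is taken as the top-right corner and C as the bottom-left.

```python
import numpy as np

def integral_image(img):
    """ii[y, x] = sum of img over all rows <= y and columns <= x (inclusive)."""
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, top, left, bottom, right):
    """Sum of img[top:bottom+1, left:right+1] using four lookups."""
    A = ii[bottom, right]
    B = ii[top - 1, right] if top > 0 else 0
    C = ii[bottom, left - 1] if left > 0 else 0
    D = ii[top - 1, left - 1] if top > 0 and left > 0 else 0
    return A - B - C + D

img = np.arange(16).reshape(4, 4)
ii = integral_image(img)
inner = rect_sum(ii, 1, 1, 2, 2)   # pixels 5, 6, 9, 10
whole = rect_sum(ii, 0, 0, 3, 3)
```

A two-rectangle Viola-Jones feature is then just the difference of two such `rect_sum` calls, so its cost does not depend on the rectangle size.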
36 Lana Lazebnik Example Source Result
37 Viola-Jones detector: features Which subset of these features should we use to determine if a window has a face? Considering all possible filter parameters: position, scale, and type: 180,000+ possible features associated with each 24 x 24 window Use AdaBoost both to select the informative features and to form the classifier Kristen Grauman
38 Viola-Jones detector: AdaBoost Want to select the single rectangle feature and threshold that best separates positive (faces) and negative (nonfaces) training examples, in terms of weighted error. Resulting weak classifier: Outputs of a possible rectangle feature on faces and non-faces. For next round, reweight the examples according to errors, choose another filter/threshold combo. Kristen Grauman
39 Start with uniform weights on training examples. For M rounds: Evaluate weighted error for each weak learner, pick best learner; y_m(x_n) is the prediction, t_n is ground truth for x_n. Re-weight the examples: incorrectly classified get more weight, correctly classified get less weight. Normalize the weights so they sum to 1. Final classifier is combination of weak ones, weighted according to error they had. Figure from C. Bishop, notes from K. Grauman
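The rounds described above can be sketched with decision stumps as weak learners. This follows the common AdaBoost formulation in which a learner with weighted error ε gets weight α = ln((1 − ε)/ε) and misclassified examples are multiplied by exp(α) before normalization. The exact formulas on the original slide did not survive extraction, so this formulation is an assumption. The stump search is brute force and meant only for illustration.

```python
import numpy as np

def train_adaboost(X, t, rounds):
    """X: (N, D) features, t: labels in {-1, +1}. Returns a list of stumps."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)                      # uniform initial weights
    model = []
    for _ in range(rounds):
        best = None
        for j in range(d):                       # try every feature,
            for thr in np.unique(X[:, j]):       # threshold,
                for sign in (1, -1):             # and polarity
                    pred = np.where(X[:, j] >= thr, sign, -sign)
                    err = w[pred != t].sum()     # weighted error
                    if best is None or err < best[0]:
                        best = (err, j, thr, sign, pred)
        err, j, thr, sign, pred = best
        err = np.clip(err, 1e-10, 1 - 1e-10)     # avoid log(0) on perfect stumps
        alpha = np.log((1 - err) / err)
        w = w * np.exp(alpha * (pred != t))      # raise weights of mistakes
        w = w / w.sum()                          # normalize to sum to 1
        model.append((alpha, j, thr, sign))
    return model

def predict_adaboost(model, X):
    total = np.zeros(len(X))
    for alpha, j, thr, sign in model:
        total += alpha * np.where(X[:, j] >= thr, sign, -sign)
    return np.where(total >= 0, 1, -1)

X = np.array([[1.0], [2.0], [3.0], [4.0]])
t = np.array([-1, -1, 1, 1])
model = train_adaboost(X, t, rounds=2)
pred = predict_adaboost(model, X)
```

In the Viola-Jones setting, each stump would threshold the output of one rectangle feature, so picking the best stump in a round doubles as feature selection.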
40 Boosting for face detection First two features selected by boosting: This feature combination can yield 100% detection rate and 50% false positive rate Lana Lazebnik
41 Boosting: pros and cons Advantages of boosting Integrates classification with feature selection Complexity of training is linear in the number of training examples Flexibility in the choice of weak learners, boosting scheme Testing is fast Easy to implement Disadvantages Needs many training examples Often found not to work as well as an alternative discriminative classifier, support vector machine (SVM) Lana Lazebnik
42 Are we done? Even if the filters are fast to compute, each new image has a lot of possible windows to search. How to make the detection more efficient? Kristen Grauman
43 Cascading classifiers for detection Form a cascade with low false negative rates early on Apply less accurate but faster classifiers first to immediately discard windows that clearly appear to be negative Kristen Grauman
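The early-rejection idea can be sketched as a loop over stages, each with its own scoring function and threshold. The stage functions below are toy placeholders, not boosted classifiers, and the names are hypothetical.

```python
def cascade_classify(window, stages):
    """stages: list of (score_fn, threshold), ordered from cheapest to most
    expensive. Returns (accepted, number_of_stages_evaluated)."""
    for index, (score_fn, threshold) in enumerate(stages):
        if score_fn(window) < threshold:
            return False, index + 1      # rejected early, later stages skipped
    return True, len(stages)

# Toy stages operating on a list of numbers standing in for a window.
stages = [
    (lambda w: sum(w), 1.0),   # fast, permissive first stage
    (lambda w: max(w), 0.9),   # stricter second stage
]
clear_negative = cascade_classify([0.0, 0.0, 0.0], stages)
hard_negative = cascade_classify([0.5, 0.6, 0.2], stages)
positive = cascade_classify([0.95, 0.5, 0.0], stages)
```

Because most windows are clear negatives, the average cost per window stays close to the cost of the first stage, provided the early stages have low false negative rates.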
44 Viola-Jones detector: summary Train cascade of classifiers with AdaBoost Faces New image Selected features, thresholds, and weights Non-faces Train with 5K positives, 350M negatives Real-time detector using 38 layer cascade (0.067s) 6061 features in all layers Adapted from Kristen Grauman
45 Viola-Jones detector: summary A seminal approach to real-time object detection Training is slow, but detection is very fast Key ideas Integral images for fast feature evaluation Boosting for feature selection Attentional cascade of classifiers for fast rejection of non-face windows P. Viola and M. Jones. Rapid object detection using a boosted cascade of simple features. CVPR P. Viola and M. Jones. Robust real-time face detection. IJCV 57(2), Matlab demo: Adapted from Kristen Grauman
46 Viola-Jones Face Detector: Results Kristen Grauman
47 Viola-Jones Face Detector: Results Kristen Grauman
48 Viola-Jones Face Detector: Results Kristen Grauman
49 Detecting profile faces? Can we use the same detector? Kristen Grauman
50 Viola-Jones Face Detector: Results Paul Viola, ICCV tutorial Kristen Grauman
51 Dalal-Triggs pedestrian detector 1. Extract fixed-sized (64x128 pixel) window at each position and scale 2. Compute HOG (histogram of oriented gradients) features within each window 3. Score the window with a linear SVM classifier 4. Perform non-maxima suppression to remove overlapping detections with lower scores Navneet Dalal and Bill Triggs, Histograms of Oriented Gradients for Human Detection, CVPR05
52 Histograms of oriented gradients (HOG) Divide image into 8x8 regions Orientation: 9 bins (for unsigned angles) Histograms in 8x8 pixel cells Votes weighted by magnitude Adapted from Pete Barnum Navneet Dalal and Bill Triggs, Histograms of Oriented Gradients for Human Detection, CVPR05
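A simplified sketch of the per-cell histograms: 9 unsigned orientation bins, 8x8 pixel cells, votes weighted by gradient magnitude. It assumes hard binning and omits the bin interpolation and block normalization that the full Dalal-Triggs descriptor uses. The function name is hypothetical.

```python
import numpy as np

def hog_cell_histograms(image, cell=8, bins=9):
    """Return an array of shape (rows, cols, bins) of orientation histograms."""
    gy, gx = np.gradient(image.astype(float))          # d/dy, d/dx
    magnitude = np.hypot(gx, gy)
    angle = np.rad2deg(np.arctan2(gy, gx)) % 180.0     # unsigned, in [0, 180)
    bin_index = np.minimum((angle / (180.0 / bins)).astype(int), bins - 1)
    rows, cols = image.shape[0] // cell, image.shape[1] // cell
    hist = np.zeros((rows, cols, bins))
    for r in range(rows):
        for c in range(cols):
            sl = (slice(r * cell, (r + 1) * cell), slice(c * cell, (c + 1) * cell))
            hist[r, c] = np.bincount(bin_index[sl].ravel(),
                                     weights=magnitude[sl].ravel(),
                                     minlength=bins)
    return hist

# Horizontal intensity ramp: every gradient points along +x with magnitude 1.
image = np.tile(np.arange(16.0), (16, 1))
hist = hog_cell_histograms(image)
```

For this ramp, all of the gradient energy in each cell lands in the first orientation bin.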
53 Histograms of oriented gradients (HOG) 10x10 cells 20x20 cells N. Dalal and B. Triggs, Histograms of Oriented Gradients for Human Detection, CVPR 2005 Image credit: N. Snavely
54 Histograms of oriented gradients (HOG) N. Dalal and B. Triggs, Histograms of Oriented Gradients for Human Detection, CVPR 2005 Image credit: N. Snavely
55 Histograms of oriented gradients (HOG) N. Dalal and B. Triggs, Histograms of Oriented Gradients for Human Detection, CVPR 2005
56 Histograms of oriented gradients (HOG) N. Dalal and B. Triggs, Histograms of Oriented Gradients for Human Detection, CVPR 2005
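57 Train SVM for pedestrian detection using HoG Visualization of the positive SVM weights (pos w) and negative SVM weights (neg w); a positive classifier score indicates a pedestrian Adapted from Pete Barnum Navneet Dalal and Bill Triggs, Histograms of Oriented Gradients for Human Detection, CVPR05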
58 Remove overlapping detections Non-max suppression Score = 0.8 Score = 0.8 Score = 0.1 Adapted from Derek Hoiem
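Greedy non-max suppression can be sketched as follows: visit detections from highest to lowest score and keep one only if it does not overlap a kept detection too much. Boxes are assumed to be `(x1, y1, x2, y2)` tuples, and the overlap measure (intersection over union) and the 0.5 threshold are choices made for this example.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes that survive, best score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.8, 0.7, 0.1]
kept = non_max_suppression(boxes, scores)
```

The second box overlaps the first heavily and has a lower score, so it is suppressed. The distant low-scoring box survives and would be removed only by a later score threshold.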
59 Plan for the next few lectures Recognizing the category in the image as a whole Detecting the region in the image that corresponds to a category Using window templates Face detection Pedestrian detection Using parts Implicit Shape Models Deformable Part Models Using Convolutional Neural Networks R-CNN, Fast R-CNN YOLO (You Only Look Once)
60 Sliding window detector
61 Are window templates enough? Single rigid window template usually not enough to represent a category Many objects (e.g. humans) are articulated, or have parts that can vary in configuration Many object categories look very different from different viewpoints, or from instance to instance Slide by N. Snavely
62 Deformable objects Images from Caltech-256 Slide Credit: Duan Tran
63 Deformable objects Images from D. Ramanan's dataset Slide Credit: Duan Tran
64 Parts-based Models Define object by collection of parts modeled by 1. Appearance 2. Spatial configuration Slide credit: Rob Fergus
65 How to model spatial relations? One extreme: fixed template Derek Hoiem
66 Fixed part-based template Object model = sum of scores of features at fixed positions Example 1: sum = -0.5; is -0.5 > 7.5? No: non-object Example 2: sum = 10.5; is 10.5 > 7.5? Yes: object Derek Hoiem
67 How to model spatial relations? Another extreme: bag of words = Derek Hoiem
68 How to model spatial relations? Star-shaped model X = X X Derek Hoiem
69 How to model spatial relations? Star-shaped model Part Part Part Root Part Part Derek Hoiem
70 Parts-based Models Articulated parts model Object is configuration of parts Each part is detectable and can move around Adapted from Derek Hoiem, images from Felzenszwalb
71 Implicit shape models Visual vocabulary is used to index votes for object position [a visual word = part ] training image annotated with object localization info visual codeword with displacement vectors B. Leibe, A. Leonardis, and B. Schiele, Combined Object Categorization and Segmentation with an Implicit Shape Model, ECCV Workshop on Statistical Learning in Computer Vision 2004 Lana Lazebnik
72 Implicit shape models: Training 1. Build vocabulary of patches around extracted interest points using clustering Lana Lazebnik
73 Implicit shape models: Training 1. Build vocabulary of patches around extracted interest points using clustering 2. Map the patch around each interest point to closest word Lana Lazebnik
74 Implicit shape models: Training 1. Build vocabulary of patches around extracted interest points using clustering 2. Map the patch around each interest point to closest word 3. For each word, store all positions it was found, relative to object center Lana Lazebnik
75 Recall: Generalized Hough transform Template representation: for each type of landmark point, store all possible displacement vectors towards the center Template Model Svetlana Lazebnik
76 Implicit shape models: Testing 1. Given new test image, extract patches, match to vocabulary words 2. Cast votes for possible positions of object center 3. Search for maxima in voting space Lana Lazebnik
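The training and testing steps can be sketched as Hough-style voting. Features are assumed to be already matched to vocabulary words and given as `(word_id, (x, y))` pairs. The voting space here is a coarse grid of bins, whereas the actual Implicit Shape Model searches a continuous voting space, so this is only a simplified stand-in with hypothetical names.

```python
from collections import Counter

def train_offsets(training_features, object_center):
    """For each visual word, store displacement vectors to the object center."""
    cx, cy = object_center
    offsets = {}
    for word, (x, y) in training_features:
        offsets.setdefault(word, []).append((cx - x, cy - y))
    return offsets

def vote_for_center(test_features, offsets, bin_size=10):
    """Each matched word casts one vote per stored displacement."""
    votes = Counter()
    for word, (x, y) in test_features:
        for dx, dy in offsets.get(word, []):
            votes[((x + dx) // bin_size, (y + dy) // bin_size)] += 1
    return votes

train = [(0, (10, 10)), (1, (30, 10)), (2, (20, 30))]
offsets = train_offsets(train, object_center=(20, 20))

# Same object shifted by (100, 50), plus one spurious match of word 1.
test = [(0, (110, 60)), (1, (130, 60)), (2, (120, 80)), (1, (5, 5))]
votes = vote_for_center(test, offsets)
best_bin, best_count = votes.most_common(1)[0]
```

The three consistent parts agree on a single center bin, while the spurious match votes alone elsewhere, which is why the method tolerates clutter.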
77 Detection Results Qualitative Performance Recognizes different kinds of objects Robust to clutter, occlusion, noise, low contrast K. Grauman, B. Leibe
78 Discriminative part-based models Root filter Part filters Deformation weights P. Felzenszwalb, R. Girshick, D. McAllester, D. Ramanan, Object Detection with Discriminatively Trained Part Based Models, PAMI 32(9), 2010 Lana Lazebnik
79 Discriminative part-based models Multiple components P. Felzenszwalb, R. Girshick, D. McAllester, D. Ramanan, Object Detection with Discriminatively Trained Part Based Models, PAMI 32(9), 2010 Lana Lazebnik
80 Scoring an object hypothesis The score of a hypothesis is the sum of appearance scores minus the sum of deformation costs Appearance scores: appearance weights applied to part features Deformation costs: deformation weights (i.e. how much we'll penalize the part p_i for moving from its expected location) applied to displacements (part location minus anchor location, i.e. how much the part p_i moved from its expected anchor location in the x, y directions) Felzenszwalb et al.
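The formula on the original slide did not survive extraction. One way to write the score described in words above is given below, with w_i for the appearance weights, φ(x, p_i) for the features of part i placed at p_i, d_i for the deformation weights, and (dx_i, dy_i) for the displacement of part i from its anchor. The quadratic displacement terms follow my recollection of the Felzenszwalb et al. paper, and the bias term is omitted, so treat the exact form as an assumption.

```latex
\[
\operatorname{score}(p_0, \dots, p_n)
  = \sum_{i=0}^{n} w_i \cdot \phi(x, p_i)
  \;-\; \sum_{i=1}^{n} d_i \cdot \left( dx_i,\; dy_i,\; dx_i^2,\; dy_i^2 \right),
\qquad
(dx_i, dy_i) = (x_i, y_i) - (\text{anchor}_i)
\]
```

Here p_0 is the root filter placement, which has an appearance term but no deformation cost.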
81 Felzenszwalb et al. Detection
82 Training Training data: images with labeled bounding boxes Parts are not annotated Need to learn the weights and deformation parameters Adapted from Lana Lazebnik
83 Training Our classifier has the form $f(x) = \max_z \, w \cdot \Phi(x, z)$, where w are model parameters and z are latent hypotheses Latent SVM training: Initialize w and iterate: Fix w and find the best z for each training example Fix z and solve for w (standard SVM training) Lana Lazebnik
84 Car model Component 1 Component 2 Lana Lazebnik
85 Car detections Lana Lazebnik
86 Person model Lana Lazebnik
87 Person detections Lana Lazebnik
88 Cat model Lana Lazebnik
89 Cat detections Lana Lazebnik
90 Speeding up detection: Restrict set of windows we pass through SVM to those w/ high objectness Alexe et al., CVPR 2010
91 Alexe et al., CVPR 2010 Objectness cue #1: Where people look
92 Objectness cue #2: color contrast at boundary Alexe et al., CVPR 2010
93 Objectness cue #3: no segments straddling the object box Alexe et al., CVPR 2010
94 Boxes found to have high objectness Cyan = ground truth bounding boxes, yellow = correct and red = incorrect predictions for objectness Only run the sheep / horse / chair etc. classifier on the yellow/red boxes. Alexe et al., CVPR 2010
95 How do detectors fail? Most errors that detectors make are reasonable Localization error and confusion with similar objects Misdetection of occluded or small objects Detectors have different sensitivity to different factors E.g. less sensitive to truncation than to size differences Failure analysis code and annotations available online Adapted from Hoiem et al., ECCV 2012
96 Analysis of object characteristics Additional annotations for seven categories: occlusion level, parts visible, sides visible Hoiem et al., ECCV 2012
97 Top false positives: Airplane (DPM) AP = Background 27% Localization 29% Other Objects 11% Similar Objects 33% Bird, Boat, Car Hoiem et al., ECCV 2012
98 Object characteristics: Aeroplane Occlusion: poor robustness to occlusion, but little impact on overall performance Easier (None) Hoiem et al., ECCV 2012 Harder (Heavy)
99 Object characteristics: Aeroplane Size: strong preference for average to above average sized airplanes Large Medium X-Large Small X-Small Easier Hoiem et al., ECCV 2012 Harder
100 Object characteristics: Aeroplane Aspect Ratio: 2-3x better at detecting wide (side) views than tall views X-Wide Wide Medium X-Tall Tall Easier (Wide) Hoiem et al., ECCV 2012 Harder (Tall)
101 Object characteristics: Aeroplane Sides/Parts: best performance = direct side view with all parts visible Easier (Side) Hoiem et al., ECCV 2012 Harder (Non-Side)
102 Summary Window-template-based approaches Assume object appears in roughly the same configuration in different images Look for alignment with a global template Part-based methods Allow parts to move somewhat from their usual locations Look for good fits in appearance, for both the global template and the individual part templates Speed up by only scoring boxes that look like any object Models prefer that objects appear in certain views
103 Plan for the next few lectures Recognizing the category in the image as a whole Detecting the region in the image that corresponds to a category Using window templates Face detection Pedestrian detection Using parts Implicit Shape Models Deformable Part Models Using Convolutional Neural Networks R-CNN, Fast R-CNN YOLO (You Only Look Once)
104 Complexity and the plateau Bar chart of mAP (%) on the PASCAL VOC challenge dataset, top competition results, VOC 07 through VOC 12: DPM; 23% DPM, HOG+BOW; 28% DPM, MKL; 37% DPM++; 41% DPM++, MKL, Selective Search; 41% Selective Search, DPM++, MKL Plateau & increasing complexity [Source: esults/index.html] Girshick et al., Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, CVPR 2014
105 R-CNN: Regions with CNN features Bar chart of mAP (%) on the PASCAL VOC challenge dataset, VOC 07 through VOC 12: top competition results, followed by post-competition results: R-CNN 58.5%, R-CNN 53.7%, R-CNN 53.3% Girshick et al., Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, CVPR 2014
106 R-CNN: Regions with CNN features 1. Input image 2. Extract region proposals (~2k / image) 3. Compute CNN features 4. Classify regions (linear SVM): aeroplane? no. ... person? yes. ... tvmonitor? no. Girshick et al., Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, CVPR 2014
107 R-CNN at test time: Step 1 Input image; extract region proposals (~2k / image) Proposal-method agnostic, many choices: - Selective Search [van de Sande, Uijlings et al.] (used in this work) - Objectness [Alexe et al.] - Category independent object proposals [Endres & Hoiem] - CPMC [Carreira & Sminchisescu] Active area, at this CVPR: - BING [Ming et al.] fast - MCG [Arbelaez et al.] high-quality segmentation Girshick et al., Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, CVPR 2014
108 R-CNN at test time: Step 2 Input image; extract region proposals (~2k / image); compute CNN features Girshick et al., Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, CVPR 2014
109 R-CNN at test time: Step 2 Compute CNN features: dilate proposal Girshick et al., Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, CVPR 2014
110 R-CNN at test time: Step 2 Compute CNN features: a. Crop Girshick et al., Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, CVPR 2014
111 R-CNN at test time: Step 2 Compute CNN features: a. Crop b. Scale (anisotropic) to 227 x 227 Girshick et al., Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, CVPR 2014
112 R-CNN at test time: Step 2 Compute CNN features: a. Crop b. Scale (anisotropic) c. Forward propagate Output: fc7 features Girshick et al., Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, CVPR 2014
113 R-CNN at test time: Step 3 Classify regions: each proposal's 4096-dimensional fc7 feature vector is scored by linear classifiers (SVM or softmax): person? horse? Girshick et al., Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, CVPR 2014
114 Step 4: Object proposal refinement Linear regression on CNN features Bounding-box regression: original proposal → predicted object bounding box Girshick et al., Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, CVPR 2014
115 R-CNN results on PASCAL VOC (metric: mean average precision, higher is better; values given as VOC 2007 / VOC 2010) Reference systems: DPM v5 (Girshick et al. 2011): 33.7% / 29.6%; UVA sel. search (Uijlings et al. 2013): n/a / 35.1%; Regionlets (Wang et al. 2013): 41.7% / 39.7%; SegDPM (Fidler et al. 2013): n/a / 40.4% R-CNN: 54.2% / 50.2%; R-CNN + bbox regression: 58.5% / 53.7% Girshick et al., Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, CVPR 2014
117 R-CNN on ImageNet detection ILSVRC2013 detection test set, mean average precision (mAP) in %: *R-CNN BB 31.4%; *OverFeat (2) 24.3%; UvA-Euvision 22.6%; *NEC-MU 20.9%; *OverFeat (1) 19.4%; Toronto A 11.5%; SYSU_Vision 10.5%; GPU_UCLA 9.8%; Delta 6.1%; UIUC-IFP 1.0% Legend: post-competition result, competition result Girshick et al., Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, CVPR 2014
118 Training R-CNN Bounding-box labeled detection data is scarce Key insight: Use supervised pre-training on a data-rich auxiliary task and transfer to detection Girshick et al., Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, CVPR 2014
119 R-CNN training: Step 1 Supervised pre-training Train a SuperVision CNN* for the 1000-way ILSVRC image classification task Auxiliary task: ILSVRC 2012 classification (1.2 million images) *Network from Krizhevsky, Sutskever & Hinton, NIPS 2012; also called AlexNet Girshick et al., Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, CVPR 2014
120 R-CNN training: Step 2 Fine-tune the CNN for detection Transfer the representation learned for ILSVRC classification to PASCAL (or ImageNet detection) Target task: PASCAL VOC detection (~25k object labels) Girshick et al., Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, CVPR 2014
121 R-CNN training: Step 3 Train detection SVMs (with the softmax classifier from fine-tuning, mAP decreases from 54% to 51%) PASCAL VOC object proposals (~2k windows / image); CNN features and training labels are used to train a per-class SVM Girshick et al., Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, CVPR 2014
122 Slow R-CNN Input image → Regions of Interest (RoI) from a proposal method (~2k) → warped image regions → forward each region through ConvNet → classify regions with SVMs → apply bounding-box regressors (post hoc component) Girshick et al. CVPR14
123 What's wrong with slow R-CNN? Ad hoc training objectives: Fine-tune network with softmax classifier (log loss) Train post-hoc linear SVMs (hinge loss) Train post-hoc bounding-box regressions (least squares) Training is slow (84h), takes a lot of disk space Inference (detection) is slow: 47s / image with VGG16 [Simonyan & Zisserman, ICLR15], ~2000 ConvNet forward passes per image Girshick, Fast R-CNN, ICCV 2015
124 Fast R-CNN Fast test time One network, trained in one stage Higher mean average precision Girshick, Fast R-CNN, ICCV 2015
125 Fast R-CNN (test time) Regions of Interest (RoIs) from a proposal method conv5 feature map of image Forward whole image through ConvNet ConvNet Input image Girshick, Fast R-CNN, ICCV 2015
126 Fast R-CNN (test time) RoI Pooling layer Regions of Interest (RoIs) from a proposal method conv5 feature map of image Forward whole image through ConvNet ConvNet Input image Girshick, Fast R-CNN, ICCV 2015
127 Fast R-CNN (test time) Softmax classifier Linear + softmax FCs Fully-connected layers RoI Pooling layer Regions of Interest (RoIs) from a proposal method conv5 feature map of image Forward whole image through ConvNet ConvNet Input image Girshick, Fast R-CNN, ICCV 2015
128 Fast R-CNN (test time) Softmax classifier Linear + softmax Linear Bounding-box regressors FCs Fully-connected layers RoI Pooling layer Regions of Interest (RoIs) from a proposal method conv5 feature map of image Forward whole image through ConvNet ConvNet Input image Girshick, Fast R-CNN, ICCV 2015
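The RoI pooling layer turns a variable-sized region of the shared feature map into a fixed-size grid by max pooling within sub-windows. A single-channel sketch is given below. The RoI is assumed to be expressed in feature-map cells as `(top, left, bottom, right)` with exclusive end coordinates, and the 2x2 output size is chosen only to keep the example small.

```python
import numpy as np

def roi_max_pool(feature_map, roi, output_size=(2, 2)):
    """Max-pool the RoI of a (H, W) feature map to a fixed output size."""
    top, left, bottom, right = roi
    region = feature_map[top:bottom, left:right]
    out_h, out_w = output_size
    ys = np.linspace(0, region.shape[0], out_h + 1).astype(int)
    xs = np.linspace(0, region.shape[1], out_w + 1).astype(int)
    pooled = np.empty(output_size)
    for i in range(out_h):
        for j in range(out_w):
            # Guard so that every sub-window contains at least one cell.
            y_end = max(ys[i + 1], ys[i] + 1)
            x_end = max(xs[j + 1], xs[j] + 1)
            pooled[i, j] = region[ys[i]:y_end, xs[j]:x_end].max()
    return pooled

feature_map = np.arange(36).reshape(6, 6)
pooled = roi_max_pool(feature_map, roi=(1, 1, 5, 5))
```

Because the convolutional features are computed once per image and only this cheap pooling is repeated per region, the ~2000 ConvNet forward passes per image are avoided.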
129 Fast R-CNN (training) Linear + softmax Linear FCs ConvNet Girshick, Fast R-CNN, ICCV 2015
130 Fast R-CNN (training) Log loss + smooth L1 loss Multi-task loss Linear + softmax Linear FCs ConvNet Girshick, Fast R-CNN, ICCV 2015
131 Fast R-CNN (training) Linear + softmax Log loss + smooth L1 loss Linear Multi-task loss FCs Trainable ConvNet Girshick, Fast R-CNN, ICCV 2015
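The multi-task loss combines a log loss on the class prediction with a smooth L1 loss on the box regression targets. The sketch below assumes class index 0 is background (no box loss is applied there) and a balancing weight `lam`; the names and toy numbers are illustrative only.

```python
import numpy as np

def smooth_l1(x):
    """0.5 * x^2 for |x| < 1, otherwise |x| - 0.5 (elementwise)."""
    x = np.abs(x)
    return np.where(x < 1.0, 0.5 * x ** 2, x - 0.5)

def multitask_loss(class_probs, true_class, box_pred, box_target, lam=1.0):
    """Log loss on the true class plus smooth L1 on the box offsets."""
    cls_loss = -np.log(class_probs[true_class])
    if true_class > 0:   # assumed convention: class 0 is background
        diff = np.asarray(box_pred, dtype=float) - np.asarray(box_target, dtype=float)
        loc_loss = smooth_l1(diff).sum()
    else:
        loc_loss = 0.0
    return cls_loss + lam * loc_loss

probs = np.array([0.5, 0.5])
background = multitask_loss(probs, 0, [0, 0, 0, 0], [1, 1, 1, 1])
foreground = multitask_loss(probs, 1, [0.5, 0, 0, 0], [0, 0, 0, 2])
```

Smooth L1 grows linearly for large errors, which makes the regression less sensitive to outliers than a least-squares loss.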
132 Main results (values given as Fast R-CNN / R-CNN [1] / SPP-net [2]) Train time speedup: 8.8x / 1x / 3.4x Test time / image: 0.32s / 47.0s / 2.3s Test speedup: 146x / 1x / 20x mAP: 66.9% / 66.0% / 63.1% Timings exclude object proposal time, which is equal for all methods. All methods use VGG16 from Simonyan and Zisserman. [1] Girshick et al. CVPR14 [2] He et al. ECCV14 Girshick, Fast R-CNN, ICCV 2015
133 Accurate object detection is slow! Pascal 2007 mAP and speed: DPM v: 14 s/img; R-CNN: 20 s/img ⅓ Mile, 1760 feet Redmon et al., You Only Look Once: Unified, Real-Time Object Detection, CVPR 2016
134 Accurate object detection is slow! Pascal 2007 mAP and speed: DPM v: 14 s/img; R-CNN: 20 s/img; Fast R-CNN: 2 s/img; Faster R-CNN: 140 ms/img; YOLO: 22 ms/img 2 feet Redmon et al., You Only Look Once: Unified, Real-Time Object Detection, CVPR 2016
135 Split the image into a grid Redmon et al., You Only Look Once: Unified, Real-Time Object Detection, CVPR 2016
136 Each cell predicts boxes and confidences: P(Object) Redmon et al., You Only Look Once: Unified, Real-Time Object Detection, CVPR 2016
137 Each cell also predicts a class probability P(Class | Object): Bicycle, Car, Dog, Dining Table Redmon et al., You Only Look Once: Unified, Real-Time Object Detection, CVPR 2016
138 Combine the box and class predictions Redmon et al., You Only Look Once: Unified, Real-Time Object Detection, CVPR 2016
139 Finally do NMS and threshold detections Redmon et al., You Only Look Once: Unified, Real-Time Object Detection, CVPR 2016
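Combining the per-box confidences with the per-cell class probabilities amounts to a product, P(Class | Object) × P(Object), followed by thresholding. The sketch below assumes an S x S grid with B boxes per cell and C classes. The array layout and names are hypothetical, and the NMS step is left out.

```python
import numpy as np

def class_specific_scores(box_confidence, class_probs):
    """box_confidence: (S, S, B), class_probs: (S, S, C) -> (S, S, B, C)."""
    return box_confidence[..., :, None] * class_probs[..., None, :]

def threshold_detections(scores, threshold):
    """Return (row, col, box, class) index tuples whose score passes."""
    return [tuple(int(v) for v in idx) for idx in np.argwhere(scores > threshold)]

# One grid cell, two boxes, three classes.
box_confidence = np.array([[[0.9, 0.2]]])
class_probs = np.array([[[0.1, 0.7, 0.2]]])
scores = class_specific_scores(box_confidence, class_probs)
kept = threshold_detections(scores, 0.5)
```

Only the confident box combined with the most probable class survives the threshold. In the full pipeline, the surviving boxes would then go through non-max suppression per class.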
140 YOLO works across many natural images Redmon et al., You Only Look Once: Unified, Real-Time Object Detection, CVPR 2016
141 It also generalizes well to new domains Redmon et al., You Only Look Once: Unified, Real-Time Object Detection, CVPR 2016
More informationObject Category Detection: Sliding Windows
04/10/12 Object Category Detection: Sliding Windows Computer Vision CS 543 / ECE 549 University of Illinois Derek Hoiem Today s class: Object Category Detection Overview of object category detection Statistical
More informationObject Recognition II
Object Recognition II Linda Shapiro EE/CSE 576 with CNN slides from Ross Girshick 1 Outline Object detection the task, evaluation, datasets Convolutional Neural Networks (CNNs) overview and history Region-based
More informationPreviously. Part-based and local feature models for generic object recognition. Bag-of-words model 4/20/2011
Previously Part-based and local feature models for generic object recognition Wed, April 20 UT-Austin Discriminative classifiers Boosting Nearest neighbors Support vector machines Useful for object recognition
More informationObject Detection Design challenges
Object Detection Design challenges How to efficiently search for likely objects Even simple models require searching hundreds of thousands of positions and scales Feature design and scoring How should
More informationPreviously. Window-based models for generic object detection 4/11/2011
Previously for generic object detection Monday, April 11 UT-Austin Instance recognition Local features: detection and description Local feature matching, scalable indexing Spatial verification Intro to
More informationCategory vs. instance recognition
Category vs. instance recognition Category: Find all the people Find all the buildings Often within a single image Often sliding window Instance: Is this face James? Find this specific famous building
More informationPart-based and local feature models for generic object recognition
Part-based and local feature models for generic object recognition May 28 th, 2015 Yong Jae Lee UC Davis Announcements PS2 grades up on SmartSite PS2 stats: Mean: 80.15 Standard Dev: 22.77 Vote on piazza
More informationBeyond Bags of features Spatial information & Shape models
Beyond Bags of features Spatial information & Shape models Jana Kosecka Many slides adapted from S. Lazebnik, FeiFei Li, Rob Fergus, and Antonio Torralba Detection, recognition (so far )! Bags of features
More informationObject detection as supervised classification
Object detection as supervised classification Tues Nov 10 Kristen Grauman UT Austin Today Supervised classification Window-based generic object detection basic pipeline boosting classifiers face detection
More informationBeyond bags of features: Adding spatial information. Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba
Beyond bags of features: Adding spatial information Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba Adding spatial information Forming vocabularies from pairs of nearby features doublets
More informationRich feature hierarchies for accurate object detection and semantic segmentation
Rich feature hierarchies for accurate object detection and semantic segmentation BY; ROSS GIRSHICK, JEFF DONAHUE, TREVOR DARRELL AND JITENDRA MALIK PRESENTER; MUHAMMAD OSAMA Object detection vs. classification
More informationDiscriminative classifiers for image recognition
Discriminative classifiers for image recognition May 26 th, 2015 Yong Jae Lee UC Davis Outline Last time: window-based generic object detection basic pipeline face detection with boosting as case study
More informationObject Category Detection: Sliding Windows
03/18/10 Object Category Detection: Sliding Windows Computer Vision CS 543 / ECE 549 University of Illinois Derek Hoiem Goal: Detect all instances of objects Influential Works in Detection Sung-Poggio
More informationDeep Learning for Object detection & localization
Deep Learning for Object detection & localization RCNN, Fast RCNN, Faster RCNN, YOLO, GAP, CAM, MSROI Aaditya Prakash Sep 25, 2018 Image classification Image classification Whole of image is classified
More informationRegionlet Object Detector with Hand-crafted and CNN Feature
Regionlet Object Detector with Hand-crafted and CNN Feature Xiaoyu Wang Research Xiaoyu Wang Research Ming Yang Horizon Robotics Shenghuo Zhu Alibaba Group Yuanqing Lin Baidu Overview of this section Regionlet
More informationBias-Variance Trade-off + Other Models and Problems
CS 1699: Intro to Computer Vision Bias-Variance Trade-off + Other Models and Problems Prof. Adriana Kovashka University of Pittsburgh November 3, 2015 Outline Support Vector Machines (review + other uses)
More informationModern Object Detection. Most slides from Ali Farhadi
Modern Object Detection Most slides from Ali Farhadi Comparison of Classifiers assuming x in {0 1} Learning Objective Training Inference Naïve Bayes maximize j i logp + logp ( x y ; θ ) ( y ; θ ) i ij
More informationObject Detection. Computer Vision Yuliang Zou, Virginia Tech. Many slides from D. Hoiem, J. Hays, J. Johnson, R. Girshick
Object Detection Computer Vision Yuliang Zou, Virginia Tech Many slides from D. Hoiem, J. Hays, J. Johnson, R. Girshick Administrative stuffs HW 4 due 11:59pm on Wed, November 8 HW 3 grades are out Average:
More informationClassifier Case Study: Viola-Jones Face Detector
Classifier Case Study: Viola-Jones Face Detector P. Viola and M. Jones. Rapid object detection using a boosted cascade of simple features. CVPR 2001. P. Viola and M. Jones. Robust real-time face detection.
More informationObject detection using Region Proposals (RCNN) Ernest Cheung COMP Presentation
Object detection using Region Proposals (RCNN) Ernest Cheung COMP790-125 Presentation 1 2 Problem to solve Object detection Input: Image Output: Bounding box of the object 3 Object detection using CNN
More informationObject Detection. Sanja Fidler CSC420: Intro to Image Understanding 1/ 1
Object Detection Sanja Fidler CSC420: Intro to Image Understanding 1/ 1 Object Detection The goal of object detection is to localize objects in an image and tell their class Localization: place a tight
More informationFine-tuning Pre-trained Large Scaled ImageNet model on smaller dataset for Detection task
Fine-tuning Pre-trained Large Scaled ImageNet model on smaller dataset for Detection task Kyunghee Kim Stanford University 353 Serra Mall Stanford, CA 94305 kyunghee.kim@stanford.edu Abstract We use a
More informationBeyond Bags of Features
: for Recognizing Natural Scene Categories Matching and Modeling Seminar Instructed by Prof. Haim J. Wolfson School of Computer Science Tel Aviv University December 9 th, 2015
More informationConvolutional Neural Networks. Computer Vision Jia-Bin Huang, Virginia Tech
Convolutional Neural Networks Computer Vision Jia-Bin Huang, Virginia Tech Today s class Overview Convolutional Neural Network (CNN) Training CNN Understanding and Visualizing CNN Image Categorization:
More informationObject Recognition. Computer Vision. Slides from Lana Lazebnik, Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce
Object Recognition Computer Vision Slides from Lana Lazebnik, Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce How many visual object categories are there? Biederman 1987 ANIMALS PLANTS OBJECTS
More informationFaster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks Shaoqing Ren Kaiming He Ross Girshick Jian Sun Present by: Yixin Yang Mingdong Wang 1 Object Detection 2 1 Applications Basic
More informationEnsemble Methods, Decision Trees
CS 1675: Intro to Machine Learning Ensemble Methods, Decision Trees Prof. Adriana Kovashka University of Pittsburgh November 13, 2018 Plan for This Lecture Ensemble methods: introduction Boosting Algorithm
More informationComputer Vision Lecture 16
Computer Vision Lecture 16 Deep Learning for Object Categorization 14.01.2016 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de leibe@vision.rwth-aachen.de Announcements Seminar registration period
More informationFaster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun Presented by Tushar Bansal Objective 1. Get bounding box for all objects
More informationOBJECT DETECTION HYUNG IL KOO
OBJECT DETECTION HYUNG IL KOO INTRODUCTION Computer Vision Tasks Classification + Localization Classification: C-classes Input: image Output: class label Evaluation metric: accuracy Localization Input:
More informationObject recognition. Methods for classification and image representation
Object recognition Methods for classification and image representation Credits Slides by Pete Barnum Slides by FeiFei Li Paul Viola, Michael Jones, Robust Realtime Object Detection, IJCV 04 Navneet Dalal
More informationFace detection and recognition. Detection Recognition Sally
Face detection and recognition Detection Recognition Sally Face detection & recognition Viola & Jones detector Available in open CV Face recognition Eigenfaces for face recognition Metric learning identification
More informationBag-of-features. Cordelia Schmid
Bag-of-features for category classification Cordelia Schmid Visual search Particular objects and scenes, large databases Category recognition Image classification: assigning a class label to the image
More informationObject detection. Announcements. Last time: Mid-level cues 2/23/2016. Wed Feb 24 Kristen Grauman UT Austin
Object detection Wed Feb 24 Kristen Grauman UT Austin Announcements Reminder: Assignment 2 is due Mar 9 and Mar 10 Be ready to run your code again on a new test set on Mar 10 Vision talk next Tuesday 11
More informationOptimizing Object Detection:
Lecture 10: Optimizing Object Detection: A Case Study of R-CNN, Fast R-CNN, and Faster R-CNN Visual Computing Systems Today s task: object detection Image classification: what is the object in this image?
More informationProject 3 Q&A. Jonathan Krause
Project 3 Q&A Jonathan Krause 1 Outline R-CNN Review Error metrics Code Overview Project 3 Report Project 3 Presentations 2 Outline R-CNN Review Error metrics Code Overview Project 3 Report Project 3 Presentations
More informationObject Detection by 3D Aspectlets and Occlusion Reasoning
Object Detection by 3D Aspectlets and Occlusion Reasoning Yu Xiang University of Michigan Silvio Savarese Stanford University In the 4th International IEEE Workshop on 3D Representation and Recognition
More informationObject Detection. TA : Young-geun Kim. Biostatistics Lab., Seoul National University. March-June, 2018
Object Detection TA : Young-geun Kim Biostatistics Lab., Seoul National University March-June, 2018 Seoul National University Deep Learning March-June, 2018 1 / 57 Index 1 Introduction 2 R-CNN 3 YOLO 4
More informationCS6501: Deep Learning for Visual Recognition. Object Detection I: RCNN, Fast-RCNN, Faster-RCNN
CS6501: Deep Learning for Visual Recognition Object Detection I: RCNN, Fast-RCNN, Faster-RCNN Today s Class Object Detection The RCNN Object Detector (2014) The Fast RCNN Object Detector (2015) The Faster
More informationFeature descriptors. Alain Pagani Prof. Didier Stricker. Computer Vision: Object and People Tracking
Feature descriptors Alain Pagani Prof. Didier Stricker Computer Vision: Object and People Tracking 1 Overview Previous lectures: Feature extraction Today: Gradiant/edge Points (Kanade-Tomasi + Harris)
More informationDiagnosing Error in Object Detectors
Diagnosing Error in Object Detectors Derek Hoiem Yodsawalai Chodpathumwan Qieyun Dai (presented by Yuduo Wu) Most of the slides are from Derek Hoiem's ECCV 2012 presentation Object detecion is a collecion
More informationFind that! Visual Object Detection Primer
Find that! Visual Object Detection Primer SkTech/MIT Innovation Workshop August 16, 2012 Dr. Tomasz Malisiewicz tomasz@csail.mit.edu Find that! Your Goals...imagine one such system that drives information
More informationLecture 5: Object Detection
Object Detection CSED703R: Deep Learning for Visual Recognition (2017F) Lecture 5: Object Detection Bohyung Han Computer Vision Lab. bhhan@postech.ac.kr 2 Traditional Object Detection Algorithms Region-based
More informationObject recognition (part 1)
Recognition Object recognition (part 1) CSE P 576 Larry Zitnick (larryz@microsoft.com) The Margaret Thatcher Illusion, by Peter Thompson Readings Szeliski Chapter 14 Recognition What do we mean by object
More informationYiqi Yan. May 10, 2017
Yiqi Yan May 10, 2017 P a r t I F u n d a m e n t a l B a c k g r o u n d s Convolution Single Filter Multiple Filters 3 Convolution: case study, 2 filters 4 Convolution: receptive field receptive field
More informationImage Analysis. Window-based face detection: The Viola-Jones algorithm. iphoto decides that this is a face. It can be trained to recognize pets!
Image Analysis 2 Face detection and recognition Window-based face detection: The Viola-Jones algorithm Christophoros Nikou cnikou@cs.uoi.gr Images taken from: D. Forsyth and J. Ponce. Computer Vision:
More informationCS6670: Computer Vision
CS6670: Computer Vision Noah Snavely Lecture 16: Bag-of-words models Object Bag of words Announcements Project 3: Eigenfaces due Wednesday, November 11 at 11:59pm solo project Final project presentations:
More informationClassification of objects from Video Data (Group 30)
Classification of objects from Video Data (Group 30) Sheallika Singh 12665 Vibhuti Mahajan 12792 Aahitagni Mukherjee 12001 M Arvind 12385 1 Motivation Video surveillance has been employed for a long time
More informationSupervised learning. y = f(x) function
Supervised learning y = f(x) output prediction function Image feature Training: given a training set of labeled examples {(x 1,y 1 ),, (x N,y N )}, estimate the prediction function f by minimizing the
More informationCategory-level Localization
Category-level Localization Andrew Zisserman Visual Geometry Group University of Oxford http://www.robots.ox.ac.uk/~vgg Includes slides from: Ondra Chum, Alyosha Efros, Mark Everingham, Pedro Felzenszwalb,
More informationPatch Descriptors. CSE 455 Linda Shapiro
Patch Descriptors CSE 455 Linda Shapiro How can we find corresponding points? How can we find correspondences? How do we describe an image patch? How do we describe an image patch? Patches with similar
More informationAnalysis: TextonBoost and Semantic Texton Forests. Daniel Munoz Februrary 9, 2009
Analysis: TextonBoost and Semantic Texton Forests Daniel Munoz 16-721 Februrary 9, 2009 Papers [shotton-eccv-06] J. Shotton, J. Winn, C. Rother, A. Criminisi, TextonBoost: Joint Appearance, Shape and Context
More information2D Image Processing Feature Descriptors
2D Image Processing Feature Descriptors Prof. Didier Stricker Kaiserlautern University http://ags.cs.uni-kl.de/ DFKI Deutsches Forschungszentrum für Künstliche Intelligenz http://av.dfki.de 1 Overview
More informationUnified, real-time object detection
Unified, real-time object detection Final Project Report, Group 02, 8 Nov 2016 Akshat Agarwal (13068), Siddharth Tanwar (13699) CS698N: Recent Advances in Computer Vision, Jul Nov 2016 Instructor: Gaurav
More informationAn Object Detection Algorithm based on Deformable Part Models with Bing Features Chunwei Li1, a and Youjun Bu1, b
5th International Conference on Advanced Materials and Computer Science (ICAMCS 2016) An Object Detection Algorithm based on Deformable Part Models with Bing Features Chunwei Li1, a and Youjun Bu1, b 1
More informationObject Detection with Discriminatively Trained Part Based Models
Object Detection with Discriminatively Trained Part Based Models Pedro F. Felzenszwelb, Ross B. Girshick, David McAllester and Deva Ramanan Presented by Fabricio Santolin da Silva Kaustav Basu Some slides
More informationSegmentation as Selective Search for Object Recognition in ILSVRC2011
Segmentation as Selective Search for Object Recognition in ILSVRC2011 Koen van de Sande Jasper Uijlings Arnold Smeulders Theo Gevers Nicu Sebe Cees Snoek University of Amsterdam, University of Trento ILSVRC2011
More informationRich feature hierarchies for accurate object detection and semantic segmentation
Rich feature hierarchies for accurate object detection and semantic segmentation Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik Presented by Pandian Raju and Jialin Wu Last class SGD for Document
More informationFace detection and recognition. Many slides adapted from K. Grauman and D. Lowe
Face detection and recognition Many slides adapted from K. Grauman and D. Lowe Face detection and recognition Detection Recognition Sally History Early face recognition systems: based on features and distances
More informationDevelopment in Object Detection. Junyuan Lin May 4th
Development in Object Detection Junyuan Lin May 4th Line of Research [1] N. Dalal and B. Triggs. Histograms of oriented gradients for human detection, CVPR 2005. HOG Feature template [2] P. Felzenszwalb,
More informationFitting: The Hough transform
Fitting: The Hough transform Voting schemes Let each feature vote for all the models that are compatible with it Hopefully the noise features will not vote consistently for any single model Missing data
More informationSelective Search for Object Recognition
Selective Search for Object Recognition Uijlings et al. Schuyler Smith Overview Introduction Object Recognition Selective Search Similarity Metrics Results Object Recognition Kitten Goal: Problem: Where
More informationObject Detection with Partial Occlusion Based on a Deformable Parts-Based Model
Object Detection with Partial Occlusion Based on a Deformable Parts-Based Model Johnson Hsieh (johnsonhsieh@gmail.com), Alexander Chia (alexchia@stanford.edu) Abstract -- Object occlusion presents a major
More informationLearning Representations for Visual Object Class Recognition
Learning Representations for Visual Object Class Recognition Marcin Marszałek Cordelia Schmid Hedi Harzallah Joost van de Weijer LEAR, INRIA Grenoble, Rhône-Alpes, France October 15th, 2007 Bag-of-Features
More informationYOLO: You Only Look Once Unified Real-Time Object Detection. Presenter: Liyang Zhong Quan Zou
YOLO: You Only Look Once Unified Real-Time Object Detection Presenter: Liyang Zhong Quan Zou Outline 1. Review: R-CNN 2. YOLO: -- Detection Procedure -- Network Design -- Training Part -- Experiments Rich
More informationModel Fitting: The Hough transform II
Model Fitting: The Hough transform II Guido Gerig, CS6640 Image Processing, Utah Theory: See handwritten notes GG: HT-notes-GG-II.pdf Credits: S. Narasimhan, CMU, Spring 2006 15-385,-685, Link Svetlana
More informationSkin and Face Detection
Skin and Face Detection Linda Shapiro EE/CSE 576 1 What s Coming 1. Review of Bakic flesh detector 2. Fleck and Forsyth flesh detector 3. Details of Rowley face detector 4. Review of the basic AdaBoost
More informationIntroduction to Deep Learning for Facial Understanding Part III: Regional CNNs
Introduction to Deep Learning for Facial Understanding Part III: Regional CNNs Raymond Ptucha, Rochester Institute of Technology, USA Tutorial-9 May 19, 218 www.nvidia.com/dli R. Ptucha 18 1 Fair Use Agreement
More informationObject Detection on Self-Driving Cars in China. Lingyun Li
Object Detection on Self-Driving Cars in China Lingyun Li Introduction Motivation: Perception is the key of self-driving cars Data set: 10000 images with annotation 2000 images without annotation (not
More informationFitting: The Hough transform
Fitting: The Hough transform Voting schemes Let each feature vote for all the models that are compatible with it Hopefully the noise features will not vote consistently for any single model Missing data
More informationCS 2750: Machine Learning. Neural Networks. Prof. Adriana Kovashka University of Pittsburgh April 13, 2016
CS 2750: Machine Learning Neural Networks Prof. Adriana Kovashka University of Pittsburgh April 13, 2016 Plan for today Neural network definition and examples Training neural networks (backprop) Convolutional
More informationObject Detection with YOLO on Artwork Dataset
Object Detection with YOLO on Artwork Dataset Yihui He Computer Science Department, Xi an Jiaotong University heyihui@stu.xjtu.edu.cn Abstract Person: 0.64 Horse: 0.28 I design a small object detection
More informationCS 1674: Intro to Computer Vision. Neural Networks. Prof. Adriana Kovashka University of Pittsburgh November 16, 2016
CS 1674: Intro to Computer Vision Neural Networks Prof. Adriana Kovashka University of Pittsburgh November 16, 2016 Announcements Please watch the videos I sent you, if you haven t yet (that s your reading)
More informationObject Detection. CS698N Final Project Presentation AKSHAT AGARWAL SIDDHARTH TANWAR
Object Detection CS698N Final Project Presentation AKSHAT AGARWAL SIDDHARTH TANWAR Problem Description Arguably the most important part of perception Long term goals for object recognition: Generalization
More informationHuman detection using histogram of oriented gradients. Srikumar Ramalingam School of Computing University of Utah
Human detection using histogram of oriented gradients Srikumar Ramalingam School of Computing University of Utah Reference Navneet Dalal and Bill Triggs, Histograms of Oriented Gradients for Human Detection,
More informationPart-based models. Lecture 10
Part-based models Lecture 10 Overview Representation Location Appearance Generative interpretation Learning Distance transforms Other approaches using parts Felzenszwalb, Girshick, McAllester, Ramanan
More informationComputer Vision Lecture 16
Announcements Computer Vision Lecture 16 Deep Learning Applications 11.01.2017 Seminar registration period starts on Friday We will offer a lab course in the summer semester Deep Robot Learning Topic:
More informationComputer Vision Lecture 16
Computer Vision Lecture 16 Deep Learning Applications 11.01.2017 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de leibe@vision.rwth-aachen.de Announcements Seminar registration period starts
More informationTRANSPARENT OBJECT DETECTION USING REGIONS WITH CONVOLUTIONAL NEURAL NETWORK
TRANSPARENT OBJECT DETECTION USING REGIONS WITH CONVOLUTIONAL NEURAL NETWORK 1 Po-Jen Lai ( 賴柏任 ), 2 Chiou-Shann Fuh ( 傅楸善 ) 1 Dept. of Electrical Engineering, National Taiwan University, Taiwan 2 Dept.
More informationHigh Level Computer Vision
High Level Computer Vision Part-Based Models for Object Class Recognition Part 2 Bernt Schiele - schiele@mpi-inf.mpg.de Mario Fritz - mfritz@mpi-inf.mpg.de http://www.d2.mpi-inf.mpg.de/cv Please Note No
More informationDeep Neural Networks:
Deep Neural Networks: Part II Convolutional Neural Network (CNN) Yuan-Kai Wang, 2016 Web site of this course: http://pattern-recognition.weebly.com source: CNN for ImageClassification, by S. Lazebnik,
More informationFitting: The Hough transform
Fitting: The Hough transform Voting schemes Let each feature vote for all the models that are compatible with it Hopefully the noise features will not vote consistently for any single model Missing data
More informationREGION AVERAGE POOLING FOR CONTEXT-AWARE OBJECT DETECTION
REGION AVERAGE POOLING FOR CONTEXT-AWARE OBJECT DETECTION Kingsley Kuan 1, Gaurav Manek 1, Jie Lin 1, Yuan Fang 1, Vijay Chandrasekhar 1,2 Institute for Infocomm Research, A*STAR, Singapore 1 Nanyang Technological
More informationG-CNN: an Iterative Grid Based Object Detector
G-CNN: an Iterative Grid Based Object Detector Mahyar Najibi 1, Mohammad Rastegari 1,2, Larry S. Davis 1 1 University of Maryland, College Park 2 Allen Institute for Artificial Intelligence najibi@cs.umd.edu
More informationTemplates and Background Subtraction. Prof. D. Stricker Doz. G. Bleser
Templates and Background Subtraction Prof. D. Stricker Doz. G. Bleser 1 Surveillance Video: Example of multiple people tracking http://www.youtube.com/watch?v=inqv34bchem&feature=player_embedded As for
More information