Object Detection. Sanja Fidler CSC420: Intro to Image Understanding 1/ 1

Similar documents
Fitting: The Hough transform

Fitting: The Hough transform

Fitting: The Hough transform

Model Fitting: The Hough transform II

Deformable Part Models

Previously. Part-based and local feature models for generic object recognition. Bag-of-words model 4/20/2011

Part based models for recognition. Kristen Grauman

Finding 2D Shapes and the Hough Transform

Object Category Detection. Slides mostly from Derek Hoiem

Patch-based Object Recognition. Basic Idea

Beyond bags of features: Adding spatial information. Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba

Part-based and local feature models for generic object recognition

Image Features: Local Descriptors. Sanja Fidler CSC420: Intro to Image Understanding 1/ 58

Visuelle Perzeption für Mensch- Maschine Schnittstellen

Lecture 8: Fitting. Tuesday, Sept 25

Recognition. Topics that we will try to cover:

Announcements, schedule. Lecture 8: Fitting. Weighted graph representation. Outline. Segmentation by Graph Cuts. Images as graphs

Category vs. instance recognition

Fitting: Voting and the Hough Transform April 23 rd, Yong Jae Lee UC Davis

Prof. Kristen Grauman

Beyond Bags of features Spatial information & Shape models

Local features: detection and description. Local invariant features

Patch Descriptors. CSE 455 Linda Shapiro

EECS 442 Computer vision. Object Recognition

Lecture 9: Hough Transform and Thresholding base Segmentation

RANSAC and some HOUGH transform

Computer Vision 6 Segmentation by Fitting

Computer Vision Lecture 6

Feature Matching and Robust Fitting

Object detection. Announcements. Last time: Mid-level cues 2/23/2016. Wed Feb 24 Kristen Grauman UT Austin

Backprojection Revisited: Scalable Multi-view Object Detection and Similarity Metrics for Detections

Example: Line fitting. Difficulty of line fitting. Fitting lines. Fitting lines. Fitting lines. Voting 9/22/2009

Patch Descriptors. EE/CSE 576 Linda Shapiro

Feature Matching + Indexing and Retrieval

Object Category Detection: Sliding Windows

Fitting. Lecture 8. Cristian Sminchisescu. Slide credits: K. Grauman, S. Seitz, S. Lazebnik, D. Forsyth, J. Ponce

Computer Vision Lecture 6

Local features: detection and description May 12 th, 2015

Learning an Alphabet of Shape and Appearance for Multi-Class Object Detection

Local features and image matching. Prof. Xin Yang HUST

Detection III: Analyzing and Debugging Detection Methods

Hough Transform and RANSAC

HOUGH TRANSFORM CS 6350 C V

Feature descriptors. Alain Pagani Prof. Didier Stricker. Computer Vision: Object and People Tracking

Midterm Wed. Local features: detection and description. Today. Last time. Local features: main components. Goal: interest operator repeatability

Local invariant features

Efficient Kernels for Identifying Unbounded-Order Spatial Features

Instance-level recognition part 2

Recap Image Classification with Bags of Local Features

Keypoint-based Recognition and Object Search

Incremental Learning of Object Detectors Using a Visual Shape Alphabet

Motion illusion, rotating snakes

Local Image Features

Automatic Image Alignment (feature-based)

Lecture 4: Finding lines: from detection to model fitting

Instance-level recognition II.

Object Recognition and Detection

2D Image Processing Feature Descriptors

EECS 442 Computer vision. Fitting methods

arxiv: v1 [cs.cv] 7 Aug 2017

Bag-of-features. Cordelia Schmid

OBJECT CATEGORIZATION

Category-level localization

CEE598 - Visual Sensing for Civil Infrastructure Eng. & Mgmt.

Part-based models. Lecture 10

Methods for Representing and Recognizing 3D objects

Local Image Features

CS6670: Computer Vision

CEE598 - Visual Sensing for Civil Infrastructure Eng. & Mgmt.

Combined Object Categorization and Segmentation with an Implicit Shape Model

Visual Object Recognition

Local Image Features

Object Detection Design challenges

High-Level Computer Vision

Window based detectors

10/03/11. Model Fitting. Computer Vision CS 143, Brown. James Hays. Slides from Silvio Savarese, Svetlana Lazebnik, and Derek Hoiem

Fitting: Voting and the Hough Transform

Computer Vision for HCI. Topics of This Lecture

Feature descriptors and matching

Detecting and Segmenting Humans in Crowded Scenes

Compu&ng Correspondences in Geometric Datasets. 4.2 Symmetry & Symmetriza/on

CS 558: Computer Vision 4 th Set of Notes

CEE598 - Visual Sensing for Civil Infrastructure Eng. & Mgmt.

Types of Edges. Why Edge Detection? Types of Edges. Edge Detection. Gradient. Edge Detection

Semantic Segmentation without Annotating Segments

Feature Detectors and Descriptors: Corners, Lines, etc.

Perception IV: Place Recognition, Line Extraction

Person Detection in Images using HoG + Gentleboost. Rahul Rajan June 1st July 15th CMU Q Robotics Lab

Implicit Shape Models For 3D Shape Classification With a Continuous Voting Space

Issues with Curve Detection Grouping (e.g., the Canny hysteresis thresholding procedure) Model tting They can be performed sequentially or simultaneou

Local Features and Bag of Words Models

A Discriminative Voting Scheme for Object Detection using Hough Forests

Category-level Localization

Computer Vision: Summary and Discussion

Analysis: TextonBoost and Semantic Texton Forests. Daniel Munoz Februrary 9, 2009

Object Recognition II

Visual Object Recognition

Stereo Epipolar Geometry for General Cameras. Sanja Fidler CSC420: Intro to Image Understanding 1 / 33

Harder case. Image matching. Even harder case. Harder still? by Diva Sian. by swashford

Oriented Filters for Object Recognition: an empirical study

Transcription:

Object Detection Sanja Fidler CSC420: Intro to Image Understanding 1/ 1

Object Detection The goal of object detection is to localize objects in an image and tell their class Localization: place a tight bounding box around object Most approaches find only objects of one or a few specific classes, e.g. car or cow Sanja Fidler CSC420: Intro to Image Understanding 2/ 1

Type of Approaches Di erent approaches tackle detection di erently. They can roughly be categorized into three main types: Find interest points, followed by Hough voting Sanja Fidler CSC420: Intro to Image Understanding 3/ 1

Interest Point Based Approaches Compute interest points (e.g., Harris corner detector is a popular choice) Vote for where the object could be given the content around interest points Sanja Fidler CSC420: Intro to Image Understanding 4/ 1

Interest Point Based Approaches Compute interest points (e.g., Harris corner detector is a popular choice) Vote for where the object could be given the content around interest points Sanja Fidler CSC420: Intro to Image Understanding 4/ 1

Interest Point Based Approaches Compute interest points (e.g., Harris corner detector is a popular choice) Vote for where the object could be given the content around interest points Sanja Fidler CSC420: Intro to Image Understanding 4/ 1

Interest Point Based Approaches Compute interest points (e.g., Harris corner detector is a popular choice) Vote for where the object could be given the content around interest points Sanja Fidler CSC420: Intro to Image Understanding 4/ 1

Interest Point Based Approaches Compute interest points (e.g., Harris corner detector is a popular choice) Vote for where the object could be given the content around interest points Sanja Fidler CSC420: Intro to Image Understanding 4/ 1

Type of Approaches Di erent approaches tackle detection di erently. They can roughly be categorized into three main types: Find interest points, followed by Hough voting Sliding windows: slide a box around image and classify each image crop inside a box (contains object or not?) Sanja Fidler CSC420: Intro to Image Understanding 5/ 1

Sliding Window Approaches Slide window and ask a classifier: Is sheep in window or not? 0.1 confidence [Slide: R. Urtasun] Sanja Fidler CSC420: Intro to Image Understanding 6/ 1

Sliding Window Approaches Slide window and ask a classifier: Is sheep in window or not? -0.2 [Slide: R. Urtasun] Sanja Fidler CSC420: Intro to Image Understanding 6/ 1

Sliding Window Approaches Slide window and ask a classifier: Is sheep in window or not? -0.1 [Slide: R. Urtasun] Sanja Fidler CSC420: Intro to Image Understanding 6/ 1

Sliding Window Approaches Slide window and ask a classifier: Is sheep in window or not? 0.1 [Slide: R. Urtasun] Sanja Fidler CSC420: Intro to Image Understanding 6/ 1

Sliding Window Approaches Slide window and ask a classifier: Is sheep in window or not?... 1.5... [Slide: R. Urtasun] Sanja Fidler CSC420: Intro to Image Understanding 6/ 1

Sliding Window Approaches Slide window and ask a classifier: Is sheep in window or not? 0.5 [Slide: R. Urtasun] Sanja Fidler CSC420: Intro to Image Understanding 6/ 1

Sliding Window Approaches Slide window and ask a classifier: Is sheep in window or not? 0.4 [Slide: R. Urtasun] Sanja Fidler CSC420: Intro to Image Understanding 6/ 1

Sliding Window Approaches Slide window and ask a classifier: Is sheep in window or not? 0.3 [Slide: R. Urtasun] Sanja Fidler CSC420: Intro to Image Understanding 6/ 1

Sliding Window Approaches Slide window and ask a classifier: Is sheep in window or not? 0.1 confidence- 0.2-0.1 0.1... 1.5... 0.5 0.4 0.3 [Slide: R. Urtasun] Sanja Fidler CSC420: Intro to Image Understanding 6/ 1

Type of Approaches Di erent approaches tackle detection di erently. They can roughly be categorized into three main types: Find interest points, followed by Hough voting Sliding windows: slide a box around image and classify each image crop inside a box (contains object or not?) Generate region (object) proposals, and classify each region Sanja Fidler CSC420: Intro to Image Understanding 7/ 1

Region Proposal Based Approaches Group pixels into object-like regions Sanja Fidler CSC420: Intro to Image Understanding 8/ 1

Region Proposal Based Approaches Group pixels into object-like regions Sanja Fidler CSC420: Intro to Image Understanding 8/ 1

Region Proposal Based Approaches Group pixels into object-like regions Sanja Fidler CSC420: Intro to Image Understanding 8/ 1

Region Proposal Based Approaches Generate many di erent regions Sanja Fidler CSC420: Intro to Image Understanding 8/ 1

Region Proposal Based Approaches Generate many di erent regions Sanja Fidler CSC420: Intro to Image Understanding 8/ 1

Region Proposal Based Approaches Generate many di erent regions Sanja Fidler CSC420: Intro to Image Understanding 8/ 1

Region Proposal Based Approaches The hope is that at least a few will cover real objects Sanja Fidler CSC420: Intro to Image Understanding 8/ 1

Region Proposal Based Approaches The hope is that at least a few will cover real objects Sanja Fidler CSC420: Intro to Image Understanding 8/ 1

Region Proposal Based Approaches Select a region Sanja Fidler CSC420: Intro to Image Understanding 8/ 1

Region Proposal Based Approaches Crop out an image patch around it, throw to classifier (e.g., Neural Net) Sanja Fidler CSC420: Intro to Image Understanding 8/ 1

Region Proposal Based Approaches Do this for every region Sanja Fidler CSC420: Intro to Image Understanding 8/ 1

Region Proposal Based Approaches Do this for every region Sanja Fidler CSC420: Intro to Image Understanding 8/ 1

Region Proposal Based Approaches Do this for every region Sanja Fidler CSC420: Intro to Image Understanding 8/ 1

Type of Approaches Di erent approaches tackle detection di erently. They can roughly be categorized into three main types: Find interest points, followed by Hough voting Let s first look at one example method for this Sliding windows: slide a box around image and classify each image crop inside a box (contains object or not?) Generate region (object) proposals, and classify each region Sanja Fidler CSC420: Intro to Image Understanding 9/ 1

Object Detection via Hough Voting: Implicit Shape Model B. Leibe, A. Leonardis, B. Schiele Robust Object Detection with Interleaved Categorization and Segmentation IJCV, 2008 Paper: http://www.vision.rwth-aachen.de/publications/pdf/leibe-interleaved-ijcv07final.pdf Sanja Fidler CSC420: Intro to Image Understanding 10 / 1

[Source: K. Grauman] Sanja Fidler CSC420: Intro to Image Understanding 11 / 1 Start with Simple: Line Detection How can I find lines in this image?

Hough Transform Idea: Voting (Hough Transform) Voting is a general technique where we let the features vote for all models that are compatible with it. Cycle through features, cast votes for model parameters. Look for model parameters that receive a lot of votes. [Source: K. Grauman] Sanja Fidler CSC420: Intro to Image Understanding 12 / 1

Hough Transform: Line Detection Hough space: parameter space Connection between image (x, y) and Hough (m, b) spaces [Source: S. Seitz] A line in the image corresponds to a point in Hough space What does a point (x 0, y 0 ) in the image space map to in Hough space? Sanja Fidler CSC420: Intro to Image Understanding 13 / 1

Hough Transform: Line Detection Hough space: parameter space Connection between image (x, y) and Hough (m, b) spaces A line in the image corresponds to a point in Hough space A point in image space votes for all the lines that go through this point. This votes are a line in the Hough space. [Source: S. Seitz] Sanja Fidler CSC420: Intro to Image Understanding 14 / 1

Hough Transform: Line Detection Hough space: parameter space Two points: Each point corresponds to a line in the Hough space A point where these two lines meet defines a line in the image! [Source: S. Seitz] Sanja Fidler CSC420: Intro to Image Understanding 15 / 1

Hough Transform: Line Detection Hough space: parameter space Vote with each image point Find peaks in Hough space. Each peak is a line in the image. [Source: S. Seitz] Sanja Fidler CSC420: Intro to Image Understanding 16 / 1

Hough Transform: Line Detection Issues with usual (m, b) parameterspace:undefinedforverticallines A better representation is a polar representation of lines [Source: S. Seitz] Sanja Fidler CSC420: Intro to Image Understanding 17 / 1

Example Hough Transform With the parameterization x cos + y sin = d Points in picture represent sinusoids in parameter space Points in parameter space represent lines in picture Example 0.6x +0.4y =2.4, Sinusoids intersect at d =2.4, =0.9273 [Source: M. Kazhdan, slide credit: R. Urtasun] Sanja Fidler CSC420: Intro to Image Understanding 18 / 1

Hough Transform: Line Detection Hough Voting algorithm [Source: S. Seitz] Sanja Fidler CSC420: Intro to Image Understanding 19 / 1

Hough Transform: Circle Detection What about circles? How can I fit circles around these coins? Sanja Fidler CSC420: Intro to Image Understanding 20 / 1

Hough Transform: Circle Detection Assume we are looking for a circle of known radius r Circle: (x a) 2 +(y b) 2 = r 2 Hough space (a, b): A point (x 0, y 0 )mapsto (a x 0 ) 2 +(b y 0 ) 2 = r 2! a circle around (x 0, y 0 )withradiusr Each image point votes for a circle in Hough space [Source: H. Rhody] Sanja Fidler CSC420: Intro to Image Understanding 21 / 1

Hough Transform: Circle Detection What if we don t know r? Hough space:? [Source: K. Grauman] Sanja Fidler CSC420: Intro to Image Understanding 22 / 1

Hough Transform: Circle Detection What if we don t know r? Hough space: conics [Source: K. Grauman] Sanja Fidler CSC420: Intro to Image Understanding 23 / 1

Hough Transform: Circle Detection Find the coins [Source: K. Grauman] Sanja Fidler CSC420: Intro to Image Understanding 24 / 1

Hough Transform: Circle Detection Iris detection [Source: K. Grauman] Sanja Fidler CSC420: Intro to Image Understanding 25 / 1

Dana H. Ballard, Generalizing the Hough Transform to Detect Arbitrary Shapes, 1980 Sanja Fidler CSC420: Intro to Image Understanding 26 / 1 Generalized Hough Voting Hough Voting for general shapes

Implicit Shape Model Implicit Shape Model adopts the idea of voting Basic idea: Find interest points in an image Match patch around each interest point to a training patch Vote for object center given that training instance Sanja Fidler CSC420: Intro to Image Understanding 27 / 1

Implicit Shape Model: Basic Idea Vote for object center Sanja Fidler CSC420: Intro to Image Understanding 28 / 1

Implicit Shape Model: Basic Idea Vote for object center Sanja Fidler CSC420: Intro to Image Understanding 28 / 1

Implicit Shape Model: Basic Idea Vote for object center Sanja Fidler CSC420: Intro to Image Understanding 28 / 1

Implicit Shape Model: Basic Idea Vote for object center Sanja Fidler CSC420: Intro to Image Understanding 28 / 1

Implicit Shape Model: Basic Idea Vote for object center Sanja Fidler CSC420: Intro to Image Understanding 28 / 1

Implicit Shape Model: Basic Idea Find the patches that produced the peak Sanja Fidler CSC420: Intro to Image Understanding 28 / 1

Implicit Shape Model: Basic Idea Place a box around these patches! objects! Sanja Fidler CSC420: Intro to Image Understanding 28 / 1

Implicit Shape Model: Basic Idea Really easy. Only one problem... Would be slow... How do we make it fast? Sanja Fidler CSC420: Intro to Image Understanding 28 / 1

Implicit Shape Model: Basic Idea Visual vocabulary (we saw this for retrieval) Compare each patch to a small set of visual words (clusters) Sanja Fidler CSC420: Intro to Image Understanding 28 / 1

Implicit Shape Model: Basic Idea Training: Getting the vocabulary Sanja Fidler CSC420: Intro to Image Understanding 28 / 1

Implicit Shape Model: Basic Idea Find interest points in each training image Sanja Fidler CSC420: Intro to Image Understanding 28 / 1

Implicit Shape Model: Basic Idea Collect patches around each interest point Sanja Fidler CSC420: Intro to Image Understanding 28 / 1

Implicit Shape Model: Basic Idea Collect patches across all training examples Sanja Fidler CSC420: Intro to Image Understanding 28 / 1

Implicit Shape Model: Basic Idea Cluster the patches to get a small set of representative patches Sanja Fidler CSC420: Intro to Image Understanding 28 / 1

Implicit Shape Model: Training Represent each training patch with the closest visual word. Record the displacement vectors for each word across all training examples. Training image Visual codeword with displacement vectors [Leibe et al. IJCV 2008] Sanja Fidler CSC420: Intro to Image Understanding 29 / 1

Implicit Shape Model: Test At test times detect interest points Assign each patch around interest point to closes visual word Vote with all displacement vectors for that word [Source: B. Leibe] Sanja Fidler CSC420: Intro to Image Understanding 30 / 1

Recognition Pipeline [Source: B. Leibe] Sanja Fidler CSC420: Intro to Image Understanding 31 / 1

Recognition Summary Apply interest points and extract features around selected locations. Match those to the codebook. Collect consistent configurations using Generalized Hough Transform. Each entry votes for a set of possible positions and scales in continuous space. Extract maxima in the continuous space using Mean Shift. Refinement can be done by sampling more local features. [Source: R. Urtasun] Sanja Fidler CSC420: Intro to Image Understanding 32 / 1

Example Original image [Source: B. Leibe, credit: R. Urtasun] Sanja Fidler CSC420: Intro to Image Understanding 33 / 1

Example Interest points [Source: B. Leibe, credit: R. Urtasun] Sanja Fidler CSC420: Intro to Image Understanding 33 / 1

Example Matched patches [Source: B. Leibe, credit: R. Urtasun] Sanja Fidler CSC420: Intro to Image Understanding 33 / 1

Example [Source: B. Leibe, credit: R. Urtasun] Voting space Sanja Fidler CSC420: Intro to Image Understanding 33 / 1

Example 1 st hypothesis [Source: B. Leibe, credit: R. Urtasun] Sanja Fidler CSC420: Intro to Image Understanding 33 / 1

Example 2 nd hypothesis [Source: B. Leibe, credit: R. Urtasun] Sanja Fidler CSC420: Intro to Image Understanding 33 / 1

Example 3 rd hypothesis [Source: B. Leibe, credit: R. Urtasun] Sanja Fidler CSC420: Intro to Image Understanding 33 / 1

Scale Invariant Voting Scale-invariant feature selection Scale-invariant interest points Rescale extracted patches Match to constant-size codebook Generate scale votes Scale as 3rd dimension in voting space x vote = x img x occ (s img /s occ ) y vote = y img y occ (s img /s occ ) s vote = s img /s occ Search for maxima in 3D voting space [Source: B. Leibe, credit: R. Urtasun] Sanja Fidler CSC420: Intro to Image Understanding 34 / 1

Scale Invariant Voting s Search window y x [Slide credit: R. Urtasun] Sanja Fidler CSC420: Intro to Image Understanding 35 / 1

Scale Voting: E cient Computation Continuous Generalized Hough Transform Binned accumulator array similar to standard Gen. Hough Transf. Quickly identify candidate maxima locations Refine locations by Mean-Shift search only around those points Avoid quantization e ects by keeping exact vote locations. s s s s y y y y x Scale votes x Binned accum. array x Candidate maxima x Refinement (Mean-Shift) [Source: B. Leibe, credit: R. Urtasun] Sanja Fidler CSC420: Intro to Image Understanding 36 / 1

Extension: Rotation-Invariant Detection Polar instead of Cartesian voting scheme Recognize objects under image-plane rotations Possibility to share parts between articulations But also increases false positive detections [Source: B. Leibe, credit: R. Urtasun] Sanja Fidler CSC420: Intro to Image Understanding 37 / 1

Sometimes it s Necessary Figure from [Mikolajczyk et al., CVPR 06] B. Leibe 20 [Source: B. Leibe, credit: R. Urtasun] Sanja Fidler CSC420: Intro to Image Understanding 38 / 1

Recognition and Segmentation Augment each visual word with meta-deta: for example, segmentation mask Sanja Fidler CSC420: Intro to Image Understanding 39 / 1

Recognition and Segmentation Local Features Matched Codebook Entries Probabilistic Voting Segmentation Backproject Metainformation y s x 3D Voting Space (continuous) Pixel Contributions Backprojected Hypotheses Backprojection of Maxima Sanja Fidler CSC420: Intro to Image Understanding 40 / 1

Results [Source: B. Leibe] Sanja Fidler CSC420: Intro to Image Understanding 41 / 1

Results [Source: B. Leibe] Sanja Fidler CSC420: Intro to Image Understanding 42 / 1

Results Dining room chairs Office chairs [Source: B. Leibe] Sanja Fidler CSC420: Intro to Image Understanding 43 / 1

Results [Source: B. Leibe] Sanja Fidler CSC420: Intro to Image Understanding 44 / 1

Inferring Other Information: Part Labels Training Test Output [Source: B. Leibe] Sanja Fidler CSC420: Intro to Image Understanding 45 / 1

Inferring Other Information: Part Labels [Source: B. Leibe] Sanja Fidler CSC420: Intro to Image Understanding 46 / 1

Inferring Other Information: Depth Depth from a single image [Source: B. Leibe] Sanja Fidler CSC420: Intro to Image Understanding 47 / 1

Conclusion Exploits a lot of parts (as many as interest points) Very simple Voting scheme: Generalized Hough Transform Works well, but not as well as Deformable Part-based Models with latent SVM training (next time) Extensions: train the weights discriminatively. Code, datasets & several pre-trained detectors available at http://www.vision.ee.ethz.ch/bleibe/code [Source: B. Leibe, credit: R. Urtasun] Sanja Fidler CSC420: Intro to Image Understanding 48 / 1