Local Features: Detection, Description & Matching


Local Features: Detection, Description & Matching Lecture 08 Computer Vision

Material Citations Dr George Stockman Professor Emeritus, Michigan State University Dr David Lowe Professor, University of British Columbia Dr Richard Szeliski Head of Interactive Visual Media Group at Microsoft Research Dr Steve Seitz Professor, University of Washington Dr Kristen Grauman Associate Professor, University of Texas at Austin Dr Allan Jepson Professor, University of Toronto

Suggested Readings Chapter 4, Richard Szeliski, Computer Vision: Algorithms and Applications, Springer, 2011. D. Lowe, Distinctive image features from scale-invariant keypoints, International Journal of Computer Vision, 2004.

Recall: Corners as distinctive interest points Since M is symmetric, we have E(u, v) ≈ [u v] M [u; v], with M = X [λ1 0; 0 λ2] X^T. The eigenvalues λ1, λ2 of M reveal the amount of intensity change in the two principal orthogonal gradient directions in the window.

Recall: Corners as distinctive interest points Edge: λ1 >> λ2 or λ2 >> λ1. Corner: λ1 and λ2 are both large, λ1 ~ λ2. Flat region: λ1 and λ2 are both small. One way to score the cornerness: R = λ1 λ2 − k (λ1 + λ2)²
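The cornerness score can be computed without extracting the eigenvalues, since λ1 λ2 = det(M) and λ1 + λ2 = trace(M). A minimal sketch (assuming precomputed gradient images Ix, Iy for one window; k = 0.05 is a typical empirical choice):

```python
import numpy as np

def harris_response(Ix, Iy, k=0.05):
    """Harris cornerness R = det(M) - k * trace(M)^2 for one window.

    Ix, Iy: image gradients inside the window (2-D arrays).
    det(M) = lambda1 * lambda2 and trace(M) = lambda1 + lambda2,
    so the eigenvalues never need to be computed explicitly.
    """
    # Entries of the second-moment matrix M, summed over the window
    Ixx = np.sum(Ix * Ix)
    Iyy = np.sum(Iy * Iy)
    Ixy = np.sum(Ix * Iy)
    det = Ixx * Iyy - Ixy * Ixy
    trace = Ixx + Iyy
    return det - k * trace * trace
```

A patch with strong gradients in two directions gives a large positive R; a patch with gradients in only one direction (an edge) gives R ≤ 0.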

Properties of the Harris corner detector Rotation invariant? Yes: the eigenvalues of M = X [λ1 0; 0 λ2] X^T are unchanged when the patch rotates. Scale invariant?

Properties of the Harris corner detector Rotation invariant? Yes. Scale invariant? No: at a fine scale, every point along a rounded corner is classified as an edge; only when viewed at a coarser scale does it register as a corner.

Image matching by Diva Sian by swashford

Harder case by Diva Sian by scgbt

Harder still? NASA Mars Rover images

Look for tiny colored squares.. NASA Mars Rover images with SIFT feature matches Figure by Noah Snavely

Image Matching At an interesting point, let's define a coordinate system (x, y axes). Use the coordinate system to pull out a patch at that point.

Image Matching

Local features: main components Detection: Identify the interest points. Description: Extract a feature vector descriptor surrounding each interest point, x^(1) = [x_1^(1), ..., x_d^(1)], x^(2) = [x_1^(2), ..., x_d^(2)]. Matching: Determine correspondence between descriptors in two views.

SIFT D. Lowe, Distinctive image features from scale-invariant keypoints, International Journal of Computer Vision, 2004 SIFT Scale Invariant Feature Transform - Detector and descriptor - SIFT is computationally expensive and was patented

SIFT - Feature detector A SIFT keypoint is a circular image region with an orientation It is described by a geometric frame of four parameters: - Keypoint center coordinates x and y - Its scale (the radius of the region) - and its orientation (an angle expressed in radians)

Steps for Extracting Key Points Scale space peak selection - Potential locations for finding features Key point localization - Accurately locating the feature key points Orientation Assignment - Assigning orientation to the key points Key point descriptor - Describing the key point as a high dimensional vector

Invariant local features The algorithm for finding points and representing their patches should produce similar results even when conditions vary. The buzzword is invariance: geometric invariance: translation, rotation, scale; photometric invariance: brightness, exposure, etc.

What makes a good feature? Say we have 2 images of this scene we'd like to align by matching local features. What would be good local features (ones easy to match)? Uniqueness

Types of invariance Illumination

Types of invariance Illumination Scale

Types of invariance Illumination Scale Rotation

Types of invariance Illumination Scale Rotation Affine

Types of invariance Illumination Scale Rotation Affine Perspective

Scale invariant interest points Independently select interest points in each image, such that the detections are repeatable across different scales

Automatic scale selection Intuition: Find the scale that gives local maxima of some function f in both position and scale. (Figure: f plotted against region size for Image 1 and Image 2; the maxima occur at region sizes s1 and s2.)

Automatic scale selection Intuition: Find scale that gives local maxima of some function f in both position and scale. f is a local maximum in both position and scale Common definition of f: Laplacian: or difference between two Gaussian filtered images with different sigmas

Recall: Edge detection Convolve the signal f with the derivative of a Gaussian, d/dx g. Edge = maximum of the response f * (d/dx g).

Recall: Edge detection Convolve f with the second derivative of the Gaussian (Laplacian), d²/dx² g. Edge = zero crossing of the response f * (d²/dx² g).

From edges to blobs Edge: ripple. Blob: superposition of two ripples. Spatial selection: the magnitude of the Laplacian response will achieve a maximum at the center of the blob, provided the scale of the Laplacian is matched to the scale of the blob.

Blob detection in 2D Laplacian of Gaussian: a circularly symmetric operator for blob detection in 2D: ∇²g = ∂²g/∂x² + ∂²g/∂y²

Blob detection in 2D: scale selection Laplacian-of-Gaussian = blob detector: ∇²g = ∂²g/∂x² + ∂²g/∂y², applied at multiple filter scales. (Figure: responses for three example images.)

Blob detection in 2D We define the characteristic scale as the scale that produces the peak of the Laplacian response. (Figure: response vs. scale, peaking at the characteristic scale.)

Scale space blob detector- Example Original image at ¾ the size Original image

Scale space blob detector- Example Original image down sampled to ¾ the size but later stretched to match with original size

Scale space blob detector- Example

Scale invariant interest points Interest points are local maxima in both position and scale of the Laplacian-of-Gaussian response L_xx(σ) + L_yy(σ), evaluated over a stack of scales σ1, ..., σ5. (Figure: squared filter response maps across scales.) Output: a list of (x, y, σ).

Scale-space blob detector: Example

Scale invariant interest points Pyramids: Smooth the image, subsample the smoothed image, and repeat until the image is small. Each cycle of this process results in a smaller image with increased smoothing but decreased spatial sampling density (that is, decreased image resolution). Scale Space (DOG method)

Pyramids

Scale invariant interest points Scale Space (DOG method): Build a pyramid, but fill the gaps with blurred images. Take features from the differences of these images. If a feature is repeatably present between Difference of Gaussians, it is scale invariant and we should keep it.
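The smooth-and-subsample loop can be sketched as follows (a crude 3x3 box blur stands in for Gaussian smoothing here; a real pipeline would use a Gaussian filter, and the minimum size of 8 pixels is an illustration choice):

```python
import numpy as np

def box_blur(img):
    """3x3 average as a stand-in for Gaussian smoothing (assumption:
    real scale-space construction would use a Gaussian filter)."""
    padded = np.pad(img, 1, mode='edge')
    out = np.zeros_like(img, dtype=float)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out += padded[1 + dy:1 + dy + img.shape[0],
                          1 + dx:1 + dx + img.shape[1]]
    return out / 9.0

def pyramid(img, min_size=8):
    """Smooth, subsample by 2, and repeat until the image is small."""
    levels = [img.astype(float)]
    while min(levels[-1].shape) // 2 >= min_size:
        smoothed = box_blur(levels[-1])
        levels.append(smoothed[::2, ::2])  # drop every other row/column
    return levels
```

Each level is half the resolution of the previous one, matching the "smaller image with increased smoothing" cycle described above.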

Building a Scale Space Scale-space representation L(x,y;t) at scales t = 0, 1, 4, 16, 64, 256 respectively; t = 0 corresponds to the original image

Building a Scale Space All scales must be examined to identify scale invariant features. An efficient approach is to compute the Laplacian pyramid (Difference of Gaussians).

Approximation of LoG by Difference of Gaussians Heat equation: ∂G/∂σ = σ∇²G, the rate of change of the Gaussian with σ. Hence G(x, y, kσ) − G(x, y, σ) ≈ (k − 1)σ²∇²G. Typical values: σ = 1.6, k = √2.

Scale invariant interest points - Technical detail We can approximate the Laplacian with a difference of Gaussians, which is more efficient to implement. Laplacian: ∇²L = G_xx(x, y, σ) + G_yy(x, y, σ). Difference of Gaussians: DoG = G(x, y, kσ) − G(x, y, σ).
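The approximation can be checked numerically. A small sketch with sampled kernels (the kernel size of 21 and the values σ = 1.6, k = √2 are illustration choices):

```python
import numpy as np

def gaussian2d(sigma, size):
    """Sampled 2-D Gaussian kernel, normalized to sum to 1."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return g / g.sum()

def laplacian_of_gaussian(sigma, size):
    """Analytic Laplacian of a 2-D Gaussian, sampled on a grid."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    r2 = xx**2 + yy**2
    return ((r2 - 2 * sigma**2) / sigma**4
            * np.exp(-r2 / (2 * sigma**2)) / (2 * np.pi * sigma**2))

# DoG should approximate (k - 1) * sigma^2 * LoG
sigma, k, size = 1.6, np.sqrt(2), 21
dog = gaussian2d(k * sigma, size) - gaussian2d(sigma, size)
log = (k - 1) * sigma**2 * laplacian_of_gaussian(sigma, size)
```

The two kernels have nearly identical shape, which is why the DoG pyramid can stand in for the (more expensive) Laplacian.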

Building a Scale Space Within the first octave, the image is repeatedly convolved with Gaussians at scales σ, kσ, k²σ, k³σ, k⁴σ; adjacent Gaussian images are subtracted to form the Difference-of-Gaussian images. The next octave repeats this at half the resolution. Peak detection then looks for local maxima/minima. (Figure: Gaussian and Difference-of-Gaussian stacks per octave.)

Building a Scale Space http://www.vlfeat.org/api/sift.html

Scale Space Peak Detection Compare a pixel (X) with 26 pixels in current and adjacent scales (Green Circles) Select a pixel (X) if larger/smaller than all 26 pixels Large number of extrema, computationally expensive - Detect the most stable subset with a coarse sampling of scales 9+9+8 = 26 http://www.vlfeat.org/api/sift.html
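The 26-neighbour comparison can be sketched as follows (dog is assumed to be a stack of DoG images indexed [scale, row, col]; s, y, x must be interior indices):

```python
import numpy as np

def is_scale_space_extremum(dog, s, y, x):
    """Check whether dog[s, y, x] is larger or smaller than all 26
    neighbours: 8 in its own scale plus 9 in each adjacent scale.
    """
    cube = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]  # 3x3x3 neighbourhood
    center = dog[s, y, x]
    others = np.delete(cube.ravel(), 13)  # drop the centre (index 13 of 27)
    return bool(np.all(center > others) or np.all(center < others))
```

A strict comparison is used, so a pixel tied with a neighbour is not selected.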

Key Point Localization Candidates are chosen from extrema detection. (Figure: original image and the detected extrema locations.)

Initial Outlier Rejection Reject low contrast candidates and poorly localized candidates along an edge. Take the Taylor series expansion of the DoG function D around a candidate; the minimum or maximum is located at the offset x̂ = −(∂²D/∂x²)⁻¹ (∂D/∂x). The value of D at the minimum/maximum must be large: |D(x̂)| > th.

Initial Outlier Rejection Reduced from 832 key points to 729 key points, th=0.03

Further Outlier Rejection DoG has a strong response along edges. Treat the DoG response as a surface - Compute its principal curvatures (PC) - Along the edge the PC is very low; across the edge it is high

Further Outlier Rejection Analogous to the Harris corner detector: compute the 2x2 Hessian H of D and remove outliers by evaluating the ratio Tr(H)²/Det(H).

Further Outlier Rejection With r = λ1/λ2 the ratio of the eigenvalues of H, the quantity Tr(H)²/Det(H) = (r+1)²/r is minimum when r = 1 and increases with r. Eliminate key points if Tr(H)²/Det(H) > (r+1)²/r for the chosen r.
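The edge test can be sketched directly from the second derivatives of D at a candidate (r = 10 is the value suggested in Lowe's 2004 paper):

```python
import numpy as np

def passes_edge_test(dxx, dyy, dxy, r=10.0):
    """Lowe's edge rejection: keep a keypoint only if
    Tr(H)^2 / Det(H) < (r + 1)^2 / r, with H the 2x2 Hessian of D.
    """
    tr = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:          # curvatures have opposite signs: reject
        return False
    return tr * tr / det < (r + 1) ** 2 / r
```

An isotropic blob (similar curvatures in both directions) passes, while an edge-like response (one curvature much larger than the other) is rejected.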

Further Outlier Rejection Reduced from 729 key points to 536 key points

Local features: main components Detection: Identify the interest points. Description: Extract a feature vector descriptor surrounding each interest point, x^(1) = [x_1^(1), ..., x_d^(1)], x^(2) = [x_1^(2), ..., x_d^(2)]. Matching: Determine correspondence between descriptors in two views.

How to achieve invariance Need both of the following: 1. Make sure your detector is invariant. Harris is invariant to translation and rotation. Scale is trickier; a common approach is to detect features at many scales using a Gaussian pyramid. More sophisticated methods find the best scale to represent each feature (e.g., SIFT). 2. Design an invariant feature descriptor. A descriptor captures the information in a region around the detected feature point. The simplest descriptor: a square window of pixels. A better approach is the SIFT descriptor.

SIFT - Feature descriptors The SIFT descriptor is a 3-D spatial histogram of the image gradients characterizing the appearance of a keypoint.

SIFT - Feature descriptors Use histograms to bin pixels within sub-patches according to their orientation (bins spanning 0 to 2π).

Making descriptor rotation invariant To achieve rotation invariance, compute central derivatives, gradient magnitude and direction of L (smooth image) at the scale of key point (x,y)

Orientation assignment Create a weighted direction histogram in a neighborhood of a key point (36 bins, 10 degrees per orientation bin: 36 × 10 = 360). The weights are - the gradient magnitudes - a spatial Gaussian filter with σ = 1.5 × scale of the key point

Orientation assignment Select the highest peak as the direction of the key point. The use of gradient orientation histograms gives a robust representation.
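The 36-bin weighted histogram and peak selection can be sketched as follows (the spatial Gaussian weighting is omitted for brevity; in a full implementation it would be folded into the weights):

```python
import numpy as np

def dominant_orientation(mag, theta, n_bins=36):
    """Return the dominant gradient direction of a patch, in degrees.

    mag:   gradient magnitudes (used as histogram weights)
    theta: gradient directions in degrees, in [0, 360)
    """
    bin_width = 360 // n_bins                      # 10 degrees per bin
    bins = (theta // bin_width).astype(int) % n_bins
    hist = np.bincount(bins.ravel(), weights=mag.ravel(), minlength=n_bins)
    peak = int(np.argmax(hist))
    return peak * bin_width + bin_width / 2.0      # centre of the peak bin
```

Lowe additionally creates extra keypoints for secondary peaks within 80% of the maximum; that refinement is left out of this sketch.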

Orientation assignment Find the dominant orientation of the image patch. This is given by x₊, the eigenvector of H corresponding to λ₊, the larger eigenvalue. Rotate the patch according to this angle.

Orientation assignment Rotate the patch according to its dominant gradient orientation. This puts the patches into a canonical orientation. Gradient orientation is more stable than raw intensity values. CSE 576: Computer Vision

SIFT - Feature descriptors Extraordinarily robust matching technique Can handle changes in viewpoint - Up to about 60 degrees of out-of-plane rotation Can handle significant changes in illumination - Sometimes even day vs. night Fast and efficient - can run in real time http://people.csail.mit.edu/albert/ladypack/wiki/index.php?title=known_implementations_of_sift

SIFT - Feature descriptors Take a 16x16 square window around the detected feature. Compute the edge orientation (angle of the gradient) for each pixel. Throw out weak edges (threshold on gradient magnitude). Create a histogram of the surviving edge orientations (angle histogram spanning 0 to 2π). (Figure: image gradients and the resulting keypoint descriptor.)

SIFT - Feature descriptors Divide the 16x16 window into a 4x4 grid of cells (a 2x2 case is shown in the figure). Compute an orientation histogram for each cell. 16 cells × 8 orientations = a 128 dimensional descriptor. (Figure: image gradients and the resulting keypoint descriptor.)
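A simplified version of the 4x4-cell, 8-bin construction (omitting the Gaussian weighting, trilinear interpolation, and clamping of the full SIFT descriptor):

```python
import numpy as np

def sift_like_descriptor(mag, theta):
    """Build a 128-D descriptor from a 16x16 patch: a 4x4 grid of cells,
    each with an 8-bin orientation histogram weighted by magnitude.

    mag, theta: 16x16 arrays of gradient magnitude and direction (degrees).
    """
    assert mag.shape == theta.shape == (16, 16)
    desc = []
    for cy in range(4):
        for cx in range(4):
            m = mag[4 * cy:4 * cy + 4, 4 * cx:4 * cx + 4]
            t = theta[4 * cy:4 * cy + 4, 4 * cx:4 * cx + 4]
            bins = (t // 45).astype(int) % 8   # 8 bins of 45 degrees
            desc.append(np.bincount(bins.ravel(),
                                    weights=m.ravel(), minlength=8))
    desc = np.concatenate(desc)                # 16 cells * 8 bins = 128
    norm = np.linalg.norm(desc)
    # Normalize to unit length for (linear) illumination invariance
    return desc / norm if norm > 0 else desc
```

The unit-length normalization is what makes the descriptor robust to overall contrast changes.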

Local features: main components Detection: Identify the interest points. Description: Extract a feature vector descriptor surrounding each interest point, x^(1) = [x_1^(1), ..., x_d^(1)], x^(2) = [x_1^(2), ..., x_d^(2)]. Matching: Determine correspondence between descriptors in two views.

Matching local features Given a feature in I 1, how to find the best match in I 2? Define distance function that compares two descriptors Image 1 Image 2

Matching local features To generate candidate matches, find patches that have the most similar appearance (e.g., lowest SSD). Simplest approach: compare them all and take the closest (or closest k, or those within a thresholded distance).

Matching local features How to define the difference between two features f1, f2? A simple approach is SSD(f1, f2), the Sum of Squared Differences between the entries of the two descriptors. But SSD can give good scores to very ambiguous (bad) matches!

Matching local features How to define the difference between two features f1, f2? Better approach: ratio = SSD(f1, f2) / SSD(f1, f2′), where f2 is the best SSD match to f1 in I2 and f2′ is the 2nd best SSD match to f1 in I2. This gives small values for good, unambiguous matches and values near 1 for ambiguous ones.

Ambiguous matches To add robustness to matching, consider the ratio: Euclidean distance to the best match / Euclidean distance to the second best match - If low, the first match looks good - If high, the match could be ambiguous

Matching SIFT Descriptors Nearest neighbor (Euclidean distance). Threshold on the ratio of nearest to 2nd nearest descriptor distances.
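A brute-force sketch of nearest-neighbour matching with the ratio test (the 0.8 threshold is the value Lowe suggests in the 2004 paper):

```python
import numpy as np

def ratio_test_matches(desc1, desc2, ratio=0.8):
    """For each descriptor in desc1, find its two nearest neighbours in
    desc2 (Euclidean distance) and keep the match only when
    best / second_best < ratio.  Returns a list of (i, j) index pairs.
    """
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)
        order = np.argsort(dists)
        best, second = dists[order[0]], dists[order[1]]
        if second > 0 and best / second < ratio:
            matches.append((i, int(order[0])))
    return matches
```

A k-d tree or approximate nearest-neighbour index would replace the brute-force distance computation in a real system.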

Evaluating the results How can we measure the performance of a feature matcher? (Figure: candidate matches at feature distances 50, 75, and 200.)

True/false positives The distance threshold affects performance. True positives = # of detected matches that are correct. Suppose we want to maximize these - how to choose the threshold? False positives = # of detected matches that are incorrect. Suppose we want to minimize these - how to choose the threshold? (Figure: true matches at feature distances 50 and 75, a false match at 200.)

Evaluating the results How can we measure the performance of a feature matcher? true positive rate = # true positives / # matching features (positives); false positive rate = # false positives / # unmatched features (negatives). (Figure: an example ROC point with true positive rate 0.7 at false positive rate 0.1.)

Evaluating the results How can we measure the performance of a feature matcher? ROC curve ( Receiver Operating Characteristic ) - Generated by counting # correct/incorrect matches for different thresholds - true positive rate = # true positives / # matching features (positives); false positive rate = # false positives / # unmatched features (negatives) - Want to maximize the area under the curve (AUC) - Useful for comparing different feature matching methods - For more info: http://en.wikipedia.org/wiki/Receiver_operating_characteristic
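Computing one ROC point at a given distance threshold can be sketched as follows (candidates and their ground-truth labels are assumed given):

```python
import numpy as np

def roc_point(distances, is_true_match, threshold):
    """True/false positive rates at one distance threshold.

    distances:     feature-space distances of candidate matches
    is_true_match: ground-truth boolean per candidate
    """
    distances = np.asarray(distances, dtype=float)
    is_true_match = np.asarray(is_true_match, dtype=bool)
    accepted = distances < threshold
    # Guard against empty positive/negative sets
    tpr = np.sum(accepted & is_true_match) / max(np.sum(is_true_match), 1)
    fpr = np.sum(accepted & ~is_true_match) / max(np.sum(~is_true_match), 1)
    return tpr, fpr
```

Sweeping the threshold and collecting (fpr, tpr) pairs traces out the ROC curve whose area (AUC) the slide says we want to maximize.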

More on feature detection/description

Object recognition results David Lowe