SIFT (Scale Invariant Feature Transform) descriptor

Local descriptors

SIFT (Scale Invariant Feature Transform) descriptor

SIFT keypoints at location (x, y) and scale σ have been obtained according to a procedure that guarantees illumination and scale invariance. By assigning a consistent orientation, the SIFT keypoint descriptor can also be made orientation invariant. The SIFT descriptor is therefore obtained from the following steps:
1. Determine location and scale by maximizing the DoG in scale and in space (i.e. find the blurred image of the closest scale).
2. Sample the points around the keypoint and use the keypoint's local orientation as the dominant gradient direction; use this scale and orientation to make all further computations invariant to scale and rotation (i.e. rotate the gradients and coordinates by the dominant orientation).
3. Separate the region around the keypoint into subregions and compute an 8-bin gradient orientation histogram for each subregion, weighting the samples with a Gaussian of σ = 1.5 times the keypoint scale.
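As an illustration, a minimal sketch of this pipeline using the OpenCV Python bindings (this assumes OpenCV >= 4.4, where cv2.SIFT_create is available; the image file name is just a placeholder):

import cv2

img = cv2.imread("image.png", cv2.IMREAD_GRAYSCALE)   # placeholder file name
sift = cv2.SIFT_create()

# detectAndCompute runs the full pipeline described above:
# DoG extrema -> orientation assignment -> 4x4x8 = 128-d descriptors
keypoints, descriptors = sift.detectAndCompute(img, None)

for kp in keypoints[:3]:
    print(kp.pt, kp.size, kp.angle)   # location (x, y), scale, dominant orientation
print(descriptors.shape)              # (number of keypoints, 128)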

SIFT aggregation window

In order to derive a descriptor for the keypoint region we could simply sample intensities around the keypoint, but these are sensitive to lighting changes and to slight errors in x, y, θ. To make the keypoint estimate more reliable, it is usually preferable to use a larger aggregation window (i.e. Gaussian kernel size) than the detection window. The SIFT descriptor is hence obtained from thresholded image gradients sampled over a 16x16 array of locations in the neighbourhood of the detected keypoint, using the level of the Gaussian pyramid at which the keypoint was detected. The samples of each 4x4 region are accumulated into a gradient orientation histogram with 8 sampled orientations, giving in total a 4x4x8 = 128-dimensional feature vector.

[Figure: 16x16 keypoint neighbourhood divided into 4x4 subregions, each summarized by 8 sampled orientations]
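A simplified NumPy sketch of how the 16x16 neighbourhood is turned into the 128-dimensional vector (hard binning only, without the Gaussian weighting and interpolation described next; function and variable names are illustrative):

import numpy as np

def sift_like_descriptor(mag, ori):
    # mag, ori: 16x16 arrays of gradient magnitude and orientation (radians)
    desc = np.zeros((4, 4, 8))
    bins = ((ori % (2 * np.pi)) / (2 * np.pi) * 8).astype(int) % 8
    for y in range(16):
        for x in range(16):
            # each sample votes into the 8-bin histogram of its 4x4 cell
            desc[y // 4, x // 4, bins[y, x]] += mag[y, x]
    return desc.ravel()          # 4x4x8 = 128-d vector

rng = np.random.default_rng(0)
d = sift_like_descriptor(rng.random((16, 16)), rng.uniform(0, 2 * np.pi, (16, 16)))
print(d.shape)                   # (128,)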

SIFT gradient orientation histogram

The gradient magnitudes are down-weighted by a Gaussian function (red circle in the figure) in order to reduce the influence of gradients far from the center, as these are more affected by small misregistrations (the Gaussian has σ equal to 1.5 times the keypoint scale). The 8-bin gradient orientation histogram of each 4x4 quadrant is formed by softly adding the weighted gradient magnitudes.

[Figure: orientation histogram and dominant direction]

Soft distribution of values to adjacent histogram bins is performed by trilinear interpolation, to reduce the effects of location and dominant orientation misestimation.
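A small sketch of the soft assignment along the orientation dimension only (full SIFT also interpolates over the two spatial dimensions, i.e. trilinear interpolation); names and parameters are illustrative:

import numpy as np

def soft_orientation_vote(hist, angle, weight, n_bins=8):
    pos = (angle % (2 * np.pi)) / (2 * np.pi) * n_bins   # continuous bin position
    b0 = int(np.floor(pos)) % n_bins
    b1 = (b0 + 1) % n_bins
    frac = pos - np.floor(pos)
    hist[b0] += weight * (1.0 - frac)                     # nearer bin gets the larger share
    hist[b1] += weight * frac
    return hist

h = soft_orientation_vote(np.zeros(8), angle=0.4, weight=1.0)
print(h)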

Image from: Jonas Hurrelmann

SIFT illumination invariance

The keypoint descriptor is normalized to unit length to make it invariant to intensity change, i.e. to reduce the effects of contrast or gain. To make the descriptor robust to other photometric variations, gradient magnitude values are clipped to 0.2 and the resulting vector is once again renormalized to unit length.
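A NumPy sketch of this photometric normalization (the 0.2 clipping value is the one quoted above):

import numpy as np

def normalize_sift(desc, clip=0.2, eps=1e-12):
    desc = desc / (np.linalg.norm(desc) + eps)   # invariance to contrast/gain
    desc = np.minimum(desc, clip)                # limit the influence of large gradients
    return desc / (np.linalg.norm(desc) + eps)   # renormalize to unit length

v = normalize_sift(np.random.rand(128))
print(np.linalg.norm(v), v.max())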

SIFT has been empirically found to show very good performance: it is invariant to image rotation, scale and intensity change, and to moderate affine transformations. Mikolajczyk & Schmid's extensive survey [2005] reports 80% repeatability at 10% image noise, 45° viewing angle, and with 1k-100k keypoints in the database.

Color SIFT descriptors

Local descriptors like SIFT are usually based only on luminance and shape, so they use grey-scale values and ignore color, mainly because it is very difficult to select a color model that is sufficiently robust and general. Nevertheless, color is very important to describe and distinguish objects or scenes. Different types of descriptors can be combined to improve the representation; the most common combination is between a local shape descriptor (e.g. SIFT) and a color descriptor (e.g. a color histogram in a smart color space such as Luv or HSV). An example of a (sparse) color-SIFT descriptor is van de Weijer and Schmid (ECCV 2006), where the combined descriptor is obtained by fusing standard SIFT with a Hue descriptor. Courtesy J. van de Weijer.

Dense gray and color SIFT

SIFT descriptors can also be taken at fixed locations, defined by a regular grid over the image; at each grid point a 128-d SIFT descriptor is computed. In this case the descriptor accounts for the distribution of the gradient orientations but does not have the scale invariance obtained from the DoG detector; multiple descriptors per point are therefore computed to allow for scale variation.

[Figure: dense 128-d gray SIFT vs. (128 x 3)-d color SIFT]
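One possible way to obtain dense SIFT with OpenCV is to hand the descriptor a regular grid of keypoints at several fixed sizes; a sketch (assumes cv2.SIFT_create is available; the grid step, the sizes and the file name are arbitrary placeholder choices):

import cv2

img = cv2.imread("image.png", cv2.IMREAD_GRAYSCALE)   # placeholder file name
sift = cv2.SIFT_create()

step = 8
keypoints = [cv2.KeyPoint(float(x), float(y), float(size))
             for y in range(step, img.shape[0] - step, step)
             for x in range(step, img.shape[1] - step, step)
             for size in (8, 16, 32)]                  # several scales per grid point

keypoints, descriptors = sift.compute(img, keypoints)  # no detection, only description
print(descriptors.shape)                               # (number of grid keypoints, 128)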

SIFT descriptors in numbers

1 patch (1 SIFT descriptor) = 128 floats = 128 * 4 bytes (32 bit) = 512 bytes.
1 well-textured 320x240 image ~ 600 keypoints = 600 * 512 bytes = 307200 bytes = 300 KB.
This presentation in memory = 51 slides = 300 KB * 51 = 15300 KB ~ 15 MB...

SIFT: available implementations

A. David Lowe's: the first code of the SIFT algorithm, by its creator (binary only). Lowe's original algorithm is quite slow (~6 s for an image of size 1280x768): it is computationally expensive and copyrighted.
B. OpenCV 2.2 implementation - a wrapper around Vedaldi's code.
C. Implementation by Rob Hess: the best open-source implementation (see next slide).

References:
A. http://www.cs.ubc.ca/~lowe/keypoints/
B. http://opencv.willowgarage.com/documentation/cpp/features2d_feature_detection_and_description.html#sift
C. http://blogs.oregonstate.edu/hess/code/sift/

Rob Hess' SIFT implementation

Features:
- The best open-source SIFT implementation in terms of speed, efficiency and similarity to Lowe's binary.
- The SIFT library is written in C, with versions available for both Linux and Windows.
- Easy to integrate with an OpenCV project.
- Contains a suite of algorithms: SIFT keypoint detection/description, kd-tree matching, robust plane-to-plane transformation.

References:
1. https://web.engr.oregonstate.edu/~hess/publications/siftlib-acmmm10.pdf
2. http://blogs.oregonstate.edu/hess/code/sift/

Compile and install: Unix platforms are preferred. On a Debian-based system simply:
1. sudo apt-get install build-essential libgtk2.0-dev libcv-dev libcvaux-dev
2. cd <path-to-sift>/ && make all
3. ./bin/siftfeat -h

Output:
Usage: siftfeat [options] <img_file>
Options:
  -h               Display this message and exit
  -o <out_file>    Output keypoints to text file
  -m <out_img>     Output keypoint image file (format determined by extension)
  -i <intervals>   Set number of sampled intervals per octave in scale space pyramid (default 3)
  -s <sigma>       Set sigma for initial gaussian smoothing at each octave (default 1.6000)
  -c <thresh>      Set threshold on keypoint contrast D(x) based on [0,1] pixel values (default 0.1400)
  -r <thresh>      Set threshold on keypoint ratio of principal curvatures (default 10)
  -n <width>       Set width of descriptor histogram array (default 4)
  -b <bins>        Set number of bins per histogram in descriptor array (default 8)
  -d               Toggle image doubling (default on)
  -x               Turn off keypoint display

SIFT parameters to tune (in sift.h or as arguments of ./bin/siftfeat):
- SIFT_CONTR_THR [0.04]: the default threshold on keypoint contrast D(x). Higher values yield fewer but stronger keypoints, and vice versa.
- SIFT_DESCR_HIST_BINS [8]: the default number of bins per histogram in the descriptor array. Trade-off between distinctiveness and efficiency.
- SIFT_IMG_DBL [0/1]: tells whether the image must be doubled in size before keypoint localization. Increases the number of detectable keypoints at the cost of performance.

NN matching parameters to tune (in ./src/match.c):
- KDTREE_BBF_MAX_NN_CHKS [200]: the maximum number of keypoint NN candidates to check during BBF search.
- NN_SQ_DIST_RATIO_THR [0.49]: the threshold on the squared ratio of distances between the 1st NN and the 2nd NN. Used to reject noisy matches in high dimensions.
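NN_SQ_DIST_RATIO_THR implements Lowe's second-nearest-neighbour ratio test; a sketch of the same test using OpenCV's brute-force matcher instead of the library's BBF kd-tree search (0.49 on squared distances corresponds to a ratio of 0.7 on distances; the file names are placeholders):

import cv2

sift = cv2.SIFT_create()
img1 = cv2.imread("img1.png", cv2.IMREAD_GRAYSCALE)   # placeholder file names
img2 = cv2.imread("img2.png", cv2.IMREAD_GRAYSCALE)
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

matcher = cv2.BFMatcher(cv2.NORM_L2)
good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
        if m.distance ** 2 < 0.49 * n.distance ** 2]   # reject ambiguous matches
print(len(good), "matches kept")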

SIFT alternatives
- GLOH (Gradient Location and Orientation Histogram): larger initial descriptor + PCA, TPAMI 2005.
- SURF (Speeded Up Robust Features): faster than SIFT and sometimes more robust, ECCV 2006 (343 Google citations).
- GIST: Rapid Biologically-Inspired Scene Classification Using Features Shared with Visual Attention, TPAMI 2007.
- LESH: Head Pose Estimation in Face Recognition Across Pose Scenarios, VISAPP 2008.
- PCA-SIFT: A More Distinctive Representation for Local Image Descriptors, CVPR 2004.
- Spin Image: Sparse Texture Representation Using Affine-Invariant Neighborhoods, CVPR 2003.

GLOH (Gradient Location and Orientation Histogram) descriptor

GLOH is a method for local shape description very similar to SIFT, introduced by Mikolajczyk & Schmid in 2005. Differently from SIFT, it employs a log-polar location grid:
- 3 bins in the radial direction
- 8 bins in the angular direction
- 16 bins for gradient orientation quantization
The GLOH descriptor is therefore a higher-dimensional vector with a total of 17 (i.e. 2x8 + 1) * 16 = 272 bins. PCA dimensionality reduction is then applied in the vector representation space.

[Figure: GLOH example; 2x8 + 1 = 17 location bins x 16 orientation bins]
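A sketch of GLOH's log-polar spatial binning; the ring radii used here (6, 11, 15 on the normalized patch) are the ones reported in the 2005 paper but should be taken as an assumption of this illustration:

import numpy as np

def gloh_location_bin(dx, dy, radii=(6.0, 11.0, 15.0)):
    # (dx, dy): offset of a sample from the keypoint; samples beyond the outermost
    # radius would be discarded in the real descriptor.
    r = np.hypot(dx, dy)
    if r < radii[0]:
        return 0                                        # central bin
    ring = 1 if r < radii[1] else 2
    sector = int((np.arctan2(dy, dx) % (2 * np.pi)) / (2 * np.pi) * 8) % 8
    return 1 + (ring - 1) * 8 + sector                  # bins 1..16

print(gloh_location_bin(2, 1), gloh_location_bin(8, -3), gloh_location_bin(0, 14))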

SURF (Speeded Up Robust Features) descriptor

SURF is a performant scale- and rotation-invariant interest point detector and descriptor. It approximates or even outperforms SIFT and other local descriptors with respect to repeatability, distinctiveness and robustness, yet can be computed and compared much faster. This is achieved by:
- relying on integral images for image convolutions
- building on the strengths of the leading existing detectors and descriptors (using a Hessian-matrix-based measure for the detector and a distribution-based descriptor)
- simplifying these methods to the essential

The approach of SURF is similar to that of SIFT, but:
- Integral images in conjunction with Haar wavelets are used to increase robustness and decrease computation time.
- Instead of iteratively reducing the image size (as in the SIFT approach), the use of integral images allows up-scaling the filter at constant cost.

[Figure: SIFT approach (image pyramid) vs. SURF approach (filter pyramid)]

The SURF descriptor computation can be divided into two tasks: orientation assignment and extraction of the descriptor components.

SURF integral images

Much of the performance increase in SURF can be attributed to the use of integral images:
- The integral image is computed rapidly from an input image I.
- It is then used to speed up the calculation of the sum over any upright rectangular area.
Given an input image I and a point (x, y), the integral image ii is calculated as the sum of the pixel intensities between the point and the origin. Using this representation, the sum over any rectangular area costs only four references to the integral image (three additions/subtractions).

The integral image ii(x, y) at location (x, y) is an intermediate representation of the image i(x, y) that contains the sum of all pixels above and to the left of (x, y):

ii(x, y) = \sum_{x' \le x, \; y' \le y} i(x', y')

It can be computed in one pass over the original image (integration along the rows, then along the columns):

s(x, y) = s(x, y - 1) + i(x, y)
ii(x, y) = ii(x - 1, y) + s(x, y)

where s(x, y) is the cumulative row sum, with s(x, -1) = 0 and ii(-1, y) = 0.

[Figure: integral image ii(x, y) on an image grid with origin (0, 0), illustrating the recurrences s(x, y) = s(x, y-1) + i(x, y) and ii(x, y) = ii(x-1, y) + s(x, y)]

Using the integral image representation one can compute the value of any rectangular sum in constant time. For example, the sum inside a rectangle D whose corners are the points 1, 2, 3, 4 is computed as ii(4) + ii(1) - ii(2) - ii(3).
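A NumPy sketch of the integral image and of the four-reference rectangle sum described above:

import numpy as np

def integral_image(img):
    # ii(x, y) = sum of img over all pixels above and to the left (inclusive)
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, r0, c0, r1, c1):
    # Sum of img[r0:r1+1, c0:c1+1] from four lookups: ii(4) + ii(1) - ii(2) - ii(3)
    total = ii[r1, c1]
    if r0 > 0:
        total -= ii[r0 - 1, c1]
    if c0 > 0:
        total -= ii[r1, c0 - 1]
    if r0 > 0 and c0 > 0:
        total += ii[r0 - 1, c0 - 1]
    return total

img = np.arange(16.0).reshape(4, 4)
ii = integral_image(img)
print(rect_sum(ii, 1, 1, 3, 2), img[1:4, 1:3].sum())   # both print the same value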

SURF fast Hessian detection

The SURF detector approximates the second-order Gaussian derivatives of the image at point x, also referred to as the Laplacian of Gaussian (LoG), by using box filter representations of the Gaussian kernels (differently from SIFT, which uses the Difference of Gaussians, DoG). In the SURF approach, interest points are detected at the locations where the determinant of the Hessian matrix is maximal. SURF permits a very efficient implementation, making good use of integral images to perform fast convolutions with box filters of varying size (at near constant time). The Hessian matrix H is calculated as a function of both space and scale:

H(x, σ) = [ L_xx(x, σ)  L_xy(x, σ) ]
          [ L_xy(x, σ)  L_yy(x, σ) ]

where L_xx(x, σ), L_yy(x, σ), L_xy(x, σ) are the second-order Gaussian derivatives of the image at point x (LoGs).

SURF LoG approximation

The second-order Gaussian derivatives (LoGs) are approximated with box filters.

[Figure: 9x9 box filters approximating the L_yy(x, σ) and L_xy(x, σ) second-order Gaussian derivatives of the image at point x; a Gaussian with σ = 1.2 is considered, representing the lowest scale]

Haar wavelets are simple filters which can be used to find gradients in the x and y directions. For each template, the corresponding feature value is the sum of the pixel intensities lying under the black part minus the sum of the pixel intensities lying under the white part.

[Figure: x-response and y-response Haar wavelet templates, with +1 and -1 regions]

When used with integral images, each wavelet response requires only 6 operations to compute.
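A sketch of an x-direction Haar response computed directly on a patch; with integral images each of the two half-box sums would cost only a few lookups, which is where the 6-operations figure comes from (patch contents are arbitrary):

import numpy as np

def haar_x(patch):
    # +1 on the right half, -1 on the left half of the box
    size = patch.shape[1]
    half = size // 2
    return patch[:, half:].sum() - patch[:, :half].sum()

patch = np.tile(np.linspace(0.0, 1.0, 8), (8, 1))   # brightness increasing to the right
print(haar_x(patch))                                 # positive x response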

SURF orientation assignment

In order to achieve invariance to image rotation, each detected interest point is assigned a reproducible orientation. To determine the orientation, Haar wavelet responses of size 4σ are calculated for a set of pixels within a radius of 6σ of the detected point (x, y, σ). The Haar wavelet responses are represented as vectors; the dominant orientation is estimated by calculating the sum of all responses within a sliding orientation window of 60 degrees, and taking the longest resulting vector.

[Figure: circular neighbourhood of radius 6σ around the keypoint; sliding orientation window, whose longest summed vector gives the dominant orientation]

SURF extraction of descriptor components

To extract the descriptor components, a square window of size 20σ is taken around the interest point, oriented along the dominant orientation. The window is divided into 4x4 regular subregions. Haar wavelets of size 2σ are calculated at 25 regularly distributed sample points in each subregion, and the x and y wavelet responses (dx, dy), computed relative to the dominant orientation, are collected for each subregion according to:

v = ( Σ dx, Σ dy, Σ |dx|, Σ |dy| )

Each subregion therefore contributes 4 values to the SURF descriptor vector, leading to an overall vector of length 4x4x4 = 64.

[Figure: the green square bounds one of the 16 subregions; blue circles represent the sample points at which the wavelet responses are computed]
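A sketch of how a single subregion contributes its 4 values to the 64-dimensional descriptor, with random numbers standing in for the 25 Haar wavelet responses:

import numpy as np

def surf_subregion_features(dx, dy):
    # dx, dy: Haar wavelet responses at the 5x5 = 25 sample points of one subregion
    return np.array([dx.sum(), dy.sum(), np.abs(dx).sum(), np.abs(dy).sum()])

rng = np.random.default_rng(0)
descriptor = np.concatenate([
    surf_subregion_features(rng.normal(size=(5, 5)), rng.normal(size=(5, 5)))
    for _ in range(16)                    # 4x4 subregions
])
print(descriptor.shape)                   # (64,)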

SURF computational cost

[Table: computational cost of the SURF detector and descriptor compared against the most-used detectors and the SIFT descriptor - not reproduced in this transcription]

SURF implementations available

A. Herbert Bay's code (the creator): library and source code.
B. OpenCV implementation (see matcher_simple.cpp in samples/cpp in OpenCV 2.2).
C. OpenSURF1 at Google Code.

References:
A. http://www.vision.ee.ethz.ch/~surf/
B. http://opencv.willowgarage.com/documentation/cpp/features2d_feature_detection_and_description.html#surf
C. http://code.google.com/p/opensurf1/

SURF alternative versions

U-SURF (upright version): for some applications rotation invariance is not necessary. U-SURF does not implement the orientation assignment step; it is faster while maintaining robustness to rotations of about +/- 15 degrees.

SURF-128: implements a descriptor vector of length 128. The sums of dx and |dx| are computed separately for dy < 0 and dy >= 0, and likewise the sums of dy and |dy| are split according to the sign of dx. It is more precise and not much slower to compute, although slower to match.
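A sketch of the SURF-128 split for one subregion (8 values instead of 4, hence 16 x 8 = 128), again with random numbers standing in for the Haar responses:

import numpy as np

def surf128_subregion_features(dx, dy):
    neg_y, pos_y = dy < 0, dy >= 0     # split dx, |dx| on the sign of dy
    neg_x, pos_x = dx < 0, dx >= 0     # split dy, |dy| on the sign of dx
    return np.array([
        dx[neg_y].sum(), np.abs(dx[neg_y]).sum(), dx[pos_y].sum(), np.abs(dx[pos_y]).sum(),
        dy[neg_x].sum(), np.abs(dy[neg_x]).sum(), dy[pos_x].sum(), np.abs(dy[pos_x]).sum(),
    ])

rng = np.random.default_rng(0)
print(surf128_subregion_features(rng.normal(size=25), rng.normal(size=25)).shape)  # (8,)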

Comparative table of the invariance properties of the main descriptors
[Table not recoverable from the transcription; credits: Van Gool 06, Fei-Fei Li]

Affine Invariant Descriptors

Fit an ellipse to the auto-correlation matrix (using eigenvalue analysis) and then use the principal axes and ratios of this fit as the affine coordinate frame. The square root of the moment matrix can be used to transform local patches into a frame which is similar up to a rotation.

Find the affine-normalized frame: from the second-moment matrices of two corresponding patches,

\Sigma_1 = p p^T = A_1 A_1^T, \qquad \Sigma_2 = q q^T = A_2 A_2^T

the normalized patches A_1^{-1} p and A_2^{-1} q are related by a pure rotation.

Then compute a rotation-invariant descriptor in this normalized frame, e.g. an intensity-domain spin image (a 2-D histogram of brightness) of the affine-normalized patch.

Affine invariant color moments are another affine-invariant descriptor:

m_{pq}^{abc} = \iint_{region} x^p \, y^q \, R(x, y)^a \, G(x, y)^b \, B(x, y)^c \, dx \, dy

Different combinations of these moments are fully affine invariant; they are also invariant to affine transformations of the intensity, I -> a I + b.

F. Mindru et al., Recognizing Color Patterns Irrespective of Viewpoint and Illumination, CVPR 1999.
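A discrete sketch of these generalized color moments, with the integral over the region replaced by a sum over pixels (channel values are random stand-ins and x, y are taken as pixel coordinates):

import numpy as np

def color_moment(R, G, B, p, q, a, b, c):
    h, w = R.shape
    y, x = np.mgrid[0:h, 0:w].astype(float)
    return np.sum((x ** p) * (y ** q) * (R ** a) * (G ** b) * (B ** c))

rng = np.random.default_rng(0)
R, G, B = rng.random((3, 16, 16))
print(color_moment(R, G, B, p=1, q=0, a=1, b=0, c=1))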