SIFT: SCALE INVARIANT FEATURE TRANSFORM
SURF: SPEEDED UP ROBUST FEATURES
Bashar Alsadik, EOS Dept., TOPMAP M13: 3D Geoinformation from Images, 2014


SIFT
SIFT: Scale Invariant Feature Transform; transforms image data into scale-invariant coordinates relative to local features.
Invariances:
- Scaling
- Rotation
- Illumination
- Deformation
Distinctive image features from scale-invariant keypoints. David G. Lowe, International Journal of Computer Vision, 60, 2 (2004), pp. 91-110.

ADVANTAGES OF INVARIANT LOCAL FEATURES
- Invariant to translation, rotation, and scale changes
- Robust or covariant to out-of-plane (affine) transformations
- Robust to lighting variations, noise, blur, and quantization
- Locality: features are local, therefore robust to occlusion and clutter
- Efficiency: close to real-time performance
Requirements:
- Region extraction needs to be repeatable (it reliably finds the same interest points under different viewing conditions) and accurate
- We need a sufficient number of regions to cover the object

MAJOR APPLICATIONS
- Wide-baseline stereo
- Motion tracking
- Panoramas and mosaicking
- 3D reconstruction
- Recognition
Challenge: wide imaging base + different illumination + different cameras

MATCHING - GENERAL APPROACH
- Detect keypoints: detectors are operators which search for 2D locations that are geometrically stable under different transformations.
- Extract vector feature descriptors surrounding the keypoints: descriptors summarize the pixel information around each detected keypoint in a d-dimensional vector:

$\mathbf{x}^{(1)} = [x_1^{(1)}, \ldots, x_d^{(1)}], \quad \mathbf{x}^{(2)} = [x_1^{(2)}, \ldots, x_d^{(2)}]$

- Match the descriptors.
Local features: detection and description - Prof. Kristen Grauman, UT-Austin

SIFT MATCHING - METHODOLOGY
1. Find a set of distinctive keypoints
2. Define a region around each keypoint
3. Extract and normalize the region content
4. Compute a local descriptor from the normalized region
5. Match local descriptors

FINDING KEYPOINTS (CORNERS)
Idea: find keypoints, but scale invariant.
Approach:
- Run a linear filter (difference of Gaussians) at different resolutions (scales) of an image pyramid
- Build the difference-of-Gaussian (DoG) pyramid
- Locate the extrema of the DoG pyramid (xi, yi, σi)
- Refine the keypoint localization
- Discard weak points

DIFFERENCE OF GAUSSIAN PYRAMID
[Figure: at each octave, the input image A is blurred to give B, and the DoG is the difference A - B (DOG1, DOG2, DOG3); the blurred image is then downsampled to start the next octave.]
Object Recognition from Local Scale-Invariant Features, David G. Lowe. Presented by Ashley L. Kapron
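As a concrete illustration, here is a minimal NumPy/SciPy sketch of building such a DoG pyramid. The parameter names and defaults (num_octaves, scales_per_octave, sigma0) are illustrative choices, not Lowe's exact implementation:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_pyramid(image, num_octaves=4, scales_per_octave=3, sigma0=1.6):
    k = 2.0 ** (1.0 / scales_per_octave)          # scale multiplier per level
    pyramid = []
    octave_base = image.astype(np.float64)
    for _ in range(num_octaves):
        # Blur the octave base at successive scales sigma0 * k^i.
        blurred = [gaussian_filter(octave_base, sigma0 * k ** i)
                   for i in range(scales_per_octave + 3)]
        # DoG = difference of adjacent Gaussian levels.
        dogs = [b2 - b1 for b1, b2 in zip(blurred, blurred[1:])]
        pyramid.append(dogs)
        # Downsample by 2 (the level at index scales_per_octave has twice
        # the base blur, so the next octave restarts at a doubled scale).
        octave_base = blurred[scales_per_octave][::2, ::2]
    return pyramid
```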

DIFFERENCE OF GAUSSIAN PYRAMID
Detect maxima and minima of the difference-of-Gaussian images in scale space by comparing each pixel in the pyramid to its 26 neighbors: 8 neighbors at the same level of the pyramid, plus 9 in the level above and 9 in the level below. This is followed by refinement and filtering steps.
Maxima and minima are found in a (3*3*3) neighborhood.
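A sketch of this 26-neighbor test, assuming dogs is the list of DoG images of one octave and (s, y, x) indexes an interior sample:

```python
import numpy as np

def is_extremum(dogs, s, y, x):
    """True if dogs[s][y, x] is the max or min of its 3x3x3 neighborhood."""
    value = dogs[s][y, x]
    cube = np.stack([dogs[s - 1][y - 1:y + 2, x - 1:x + 2],
                     dogs[s    ][y - 1:y + 2, x - 1:x + 2],
                     dogs[s + 1][y - 1:y + 2, x - 1:x + 2]])
    return value == cube.max() or value == cube.min()
```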

KEYPOINTS LOCALIZATION
The next step is to refine the location of the DoG extrema D(x, y, σ). The location refinement of a detected extremum is done through a quadratic least-squares fit as follows. The 2nd-order Taylor expansion of D at (x, y, σ) is:

$D(\mathbf{x}) = D + \frac{\partial D}{\partial \mathbf{x}}^T \mathbf{x} + \frac{1}{2} \mathbf{x}^T \frac{\partial^2 D}{\partial \mathbf{x}^2} \mathbf{x}$

Taking the derivative with respect to $\mathbf{x}$:

$\frac{\partial D(\mathbf{x})}{\partial \mathbf{x}} = \frac{\partial D}{\partial \mathbf{x}} + \frac{\partial^2 D}{\partial \mathbf{x}^2} \mathbf{x}$

KEYPOINTS LOCALIZATION
Solve $\frac{\partial D(\mathbf{x})}{\partial \mathbf{x}} = 0$ to find the extremum:

$\hat{\mathbf{x}} = -\left(\frac{\partial^2 D}{\partial \mathbf{x}^2}\right)^{-1} \frac{\partial D}{\partial \mathbf{x}}$

where

$\mathbf{x} = \begin{bmatrix} x \\ y \\ \sigma \end{bmatrix}, \quad
\frac{\partial D}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial D}{\partial x} \\ \frac{\partial D}{\partial y} \\ \frac{\partial D}{\partial \sigma} \end{bmatrix}, \quad
\frac{\partial^2 D}{\partial \mathbf{x}^2} = \begin{bmatrix}
\frac{\partial^2 D}{\partial x^2} & \frac{\partial^2 D}{\partial x \partial y} & \frac{\partial^2 D}{\partial x \partial \sigma} \\
\frac{\partial^2 D}{\partial x \partial y} & \frac{\partial^2 D}{\partial y^2} & \frac{\partial^2 D}{\partial y \partial \sigma} \\
\frac{\partial^2 D}{\partial x \partial \sigma} & \frac{\partial^2 D}{\partial y \partial \sigma} & \frac{\partial^2 D}{\partial \sigma^2}
\end{bmatrix}$

The extremum $(x, y, \sigma)$ is moved to $(x + \hat{x},\; y + \hat{y},\; \sigma + \hat{\sigma})$.
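The same solve can be written with finite differences. The following sketch assumes cube is the 3x3x3 DoG neighborhood around the candidate point, ordered (scale, y, x):

```python
import numpy as np

def refine_offset(cube):
    """Sub-pixel offset (dsigma, dy, dx) of the extremum in a 3x3x3 cube."""
    c = cube[1, 1, 1]
    # First derivatives by central differences.
    dD = 0.5 * np.array([cube[2, 1, 1] - cube[0, 1, 1],   # dD/dsigma
                         cube[1, 2, 1] - cube[1, 0, 1],   # dD/dy
                         cube[1, 1, 2] - cube[1, 1, 0]])  # dD/dx
    # Second derivatives (Hessian entries).
    dss = cube[2, 1, 1] - 2 * c + cube[0, 1, 1]
    dyy = cube[1, 2, 1] - 2 * c + cube[1, 0, 1]
    dxx = cube[1, 1, 2] - 2 * c + cube[1, 1, 0]
    dsy = 0.25 * (cube[2, 2, 1] - cube[2, 0, 1] - cube[0, 2, 1] + cube[0, 0, 1])
    dsx = 0.25 * (cube[2, 1, 2] - cube[2, 1, 0] - cube[0, 1, 2] + cube[0, 1, 0])
    dyx = 0.25 * (cube[1, 2, 2] - cube[1, 2, 0] - cube[1, 0, 2] + cube[1, 0, 0])
    H = np.array([[dss, dsy, dsx],
                  [dsy, dyy, dyx],
                  [dsx, dyx, dxx]])
    return -np.linalg.solve(H, dD)    # x_hat = -H^-1 * dD
```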

DISCARDING LOW-CONTRAST KEYPOINTS
After the scale-space extrema are detected, the SIFT algorithm discards low-contrast keypoints. The elimination of weak points is simply a test against a threshold: a keypoint is kept only if $|D(x_i, y_i, \sigma_i)| >$ threshold (0.03 in Lowe's paper).
http://en.wikipedia.org/wiki/Scale-invariant_feature_transform

ELIMINATING EDGE RESPONSES
Edge-point removal is similar to the Harris operator: evaluate the eigenvalues of the Hessian matrix and test against a threshold. The SIFT algorithm filters out keypoints located on edges.

$H = \begin{bmatrix} D_{xx} & D_{xy} \\ D_{xy} & D_{yy} \end{bmatrix}, \quad \det(H) = \lambda_1 \lambda_2, \quad \operatorname{tr}(H) = \lambda_1 + \lambda_2$

A keypoint is kept only if

$\frac{\operatorname{tr}(H)^2}{\det(H)} < \frac{(r + 1)^2}{r}, \quad r = 10$

http://en.wikipedia.org/wiki/Scale-invariant_feature_transform
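A sketch of this edge test using finite differences on the DoG image D at integer location (y, x); the helper name is hypothetical, with Lowe's r = 10 as the default:

```python
def passes_edge_test(D, y, x, r=10.0):
    """True if the point survives the trace^2/det < (r+1)^2/r edge test."""
    dxx = D[y, x + 1] - 2 * D[y, x] + D[y, x - 1]
    dyy = D[y + 1, x] - 2 * D[y, x] + D[y - 1, x]
    dxy = 0.25 * (D[y + 1, x + 1] - D[y + 1, x - 1]
                  - D[y - 1, x + 1] + D[y - 1, x - 1])
    det = dxx * dyy - dxy ** 2
    trace = dxx + dyy
    # Negative det means the curvatures have different signs: reject.
    return det > 0 and trace ** 2 / det < (r + 1) ** 2 / r
```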

WHAT IS NEXT AFTER DETECTION?
After the detection of points, how do we describe them for matching?
Rotation-invariant descriptors: descriptors characterize the local neighborhood of a point in a vector. This locality gives robustness to illumination changes.
- Find the local orientation: the dominant direction of the gradient for the image patch
- Rotate the patch according to this angle
This puts the patches into a canonical orientation.

ORIENTATION
To get rotation-invariant descriptors, an orientation is assigned to each keypoint. The computations are performed in a scale-invariant manner by using the keypoint's scale to select the Gaussian-smoothed image L:

$m(x, y) = \sqrt{(L(x+1, y) - L(x-1, y))^2 + (L(x, y+1) - L(x, y-1))^2}$

$\theta(x, y) = \tan^{-1}\left(\frac{L(x, y+1) - L(x, y-1)}{L(x+1, y) - L(x-1, y)}\right)$

Select the canonical orientation:
- Compute an orientation histogram of 36 bins for each point (36 directions, each covering a 10-degree range of angles)
- Select the dominant orientation
- Normalize: rotate to a fixed orientation

ORIENTATION
The summarized procedure is illustrated as follows:
1. Compute the smoothed image
2. Compute the gradient magnitude and orientation in a neighborhood of (xi, yi)
3. Compute the histogram of orientations
4. The highest peak in the histogram is assigned as the orientation
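A simplified sketch of this histogram procedure, assuming L is the Gaussian-smoothed image and the window lies inside the image; Lowe's Gaussian weighting and peak interpolation are omitted:

```python
import numpy as np

def dominant_orientation(L, y, x, radius=8):
    """Dominant gradient direction (degrees) from a 36-bin histogram."""
    hist = np.zeros(36)
    for j in range(y - radius, y + radius + 1):
        for i in range(x - radius, x + radius + 1):
            dx = L[j, i + 1] - L[j, i - 1]
            dy = L[j + 1, i] - L[j - 1, i]
            magnitude = np.hypot(dx, dy)
            theta = np.degrees(np.arctan2(dy, dx)) % 360.0
            hist[int(theta // 10) % 36] += magnitude   # 10-degree bins
    return 10.0 * np.argmax(hist)
```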

SIFT DESCRIPTORS
The methodology is implemented as follows:
1. Compute the image gradient over a 16x16 array of locations centered at (xi, yi) in the Gaussian-smoothed image at the scale of the keypoint.
2. Create an array of orientation histograms by computing the gradient orientation relative to the keypoint's orientation.
3. Compute an orientation histogram of 8 orientations (each covering 45°) in each 4x4 pixel block, which results in a 128-element descriptor vector.
4. Weight the contribution of each pixel to the orientation histogram according to its closeness to the keypoint center.
5. Normalize to unit length to reduce the effect of illumination change.

SIFT DESCRIPTORS
1. Compute the image gradient over a 16x16 array of locations centered at (xi, yi) in the Gaussian-smoothed image at the scale of the keypoint.
2. Create an array of orientation histograms by computing the gradient orientation relative to the keypoint's orientation.
[Figure: image gradients]

SIFT DESCRIPTORS
Compute an orientation histogram of 8 orientations (each covering 45°) in each 4x4 pixel block, which results in a 128-element descriptor vector.
[Figure: image gradients binned into a 4x4 histogram array of keypoint descriptors, 8 bins each]
The result: a 128-dimensional feature vector.
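A compact sketch of this binning into a 128-element vector; the Gaussian weighting and trilinear interpolation from Lowe's paper are omitted, so this illustrates the structure rather than reimplementing the method faithfully. It assumes the 16x16 patch lies inside the smoothed image L:

```python
import numpy as np

def sift_descriptor(L, y, x, keypoint_angle_deg):
    """128-element descriptor: 4x4 grid of 8-bin orientation histograms."""
    desc = np.zeros((4, 4, 8))
    for j in range(16):
        for i in range(16):
            py, px = y - 8 + j, x - 8 + i
            dx = L[py, px + 1] - L[py, px - 1]
            dy = L[py + 1, px] - L[py - 1, px]
            mag = np.hypot(dx, dy)
            # Orientation relative to the keypoint orientation, 45-degree bins.
            theta = (np.degrees(np.arctan2(dy, dx)) - keypoint_angle_deg) % 360.0
            desc[j // 4, i // 4, int(theta // 45) % 8] += mag
    v = desc.ravel()                               # 4*4*8 = 128 elements
    return v / max(np.linalg.norm(v), 1e-12)       # normalize to unit length
```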

OUTPUT OF SIFT DETECTOR & DESCRIPTOR
One image yields:
- n 128-dimensional descriptors: each one is a histogram of the gradient orientations within a patch [n x 128 matrix]
- n scale parameters specifying the size of each patch [n x 1 vector]
- n orientation parameters specifying the angle of each patch [n x 1 vector]
- n 2D points giving the positions of the patches [n x 2 matrix]
For the above image (475x713 pixels):
- Keypoints = 1187
- Descriptors = 1187 x 128
- Scale = 1187 x 1
- Orientation = 1187 x 1
- Location = 1187 x 2

MATCHING LOCAL FEATURES
To generate candidate matches between image 1 and image 2, find patches that have the most similar appearance (e.g., lowest SSD). Simplest approach: compare them all and take the closest (within a threshold distance).

MATCHING DESCRIPTORS
Hypotheses are generated by approximate nearest-neighbor matching of each feature to vectors in the database. For example, a keypoint at row = 323, col = 291, scale = 4.9, orientation = -15 in one image is matched against a keypoint at row = 751, col = 167, scale = 4.1, orientation = 63 in the other.
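A minimal brute-force version of this matching step, assuming desc1 and desc2 are n1 x 128 and n2 x 128 descriptor arrays; the distance threshold is an illustrative value:

```python
import numpy as np

def match_descriptors(desc1, desc2, max_dist=0.6):
    """Pair each descriptor in desc1 with its nearest neighbor in desc2."""
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)   # Euclidean distances
        j = int(np.argmin(dists))                   # closest descriptor
        if dists[j] < max_dist:                     # accept within threshold
            matches.append((i, j))
    return matches
```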

MATCHING OF DESCRIPTORS
We now compare the differential invariant features; in the example, the candidate distances are 0.5, 0.2 and 0.3.
Top-points in Image Matching, Bram Platel et al., Department of Biomedical Engineering, Technical University of Eindhoven

MATCHING OF DESCRIPTORS
The vectors with the smallest distance are paired; here, the smallest distance is 0.2.
Top-points in Image Matching, Bram Platel et al., Department of Biomedical Engineering, Technical University of Eindhoven

EXAMPLE
[Figure: matched keypoints between the 1st and 2nd images]

AFTER RANSAC
[Figure: matches remaining after RANSAC, with one outlier flagged]
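A sketch of this outlier-rejection step using OpenCV's RANSAC-based homography estimation; pts1 and pts2 are assumed to be the matched Nx2 point arrays from the previous step:

```python
import numpy as np
import cv2

def ransac_filter(pts1, pts2, reproj_thresh=5.0):
    """Fit a homography with RANSAC; return it with a boolean inlier mask."""
    src = np.float32(pts1).reshape(-1, 1, 2)
    dst = np.float32(pts2).reshape(-1, 1, 2)
    H, mask = cv2.findHomography(src, dst, cv2.RANSAC, reproj_thresh)
    return H, mask.ravel().astype(bool)
```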

SURF: SPEEDED UP ROBUST FEATURES
The goal is to develop both a detector and a descriptor which are faster than SIFT, without sacrificing performance. How? By reducing the descriptor's dimension and complexity:
- Fast interest point detection
- Distinctive interest point description
- Speeded-up descriptor matching
Invariant to common image transformations:
- Image rotation
- Scale changes
- Illumination changes
- Small changes in viewpoint
Bay, H., A. Ess, et al. (2008). "Speeded-Up Robust Features (SURF)." Computer Vision and Image Understanding 110(3): 346-359.

METHODOLOGY - INTEGRAL IMAGE
Much of the performance increase in SURF can be attributed to the use of an intermediate image known as the integral image. The integral image is computed once and is used to speed up the calculation of the sum over any upright rectangular area. Given an input image I and a point (x, y), the integral image $I_{\Sigma}$ is the sum of the values between the point and the origin. Formally:

$I_{\Sigma}(x, y) = \sum_{i=0}^{x} \sum_{j=0}^{y} I(i, j)$

INTEGRAL IMAGE
If we want to compute the sum of intensities over a rectangular area bounded by vertices A, B, C and D, the sum of pixel intensities is calculated from the integral image by:

$\Sigma = A - B - C + D$

This offers a fast calculation even for large areas.
[Figure: area computation using integral images]
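A short NumPy sketch of the integral image and the constant-time box sum it enables; the corner handling follows the A, B, C, D convention above (A top-left, B top-right, C bottom-left, D bottom-right):

```python
import numpy as np

def integral_image(img):
    """Cumulative sum along both axes gives the integral image."""
    return img.astype(np.float64).cumsum(axis=0).cumsum(axis=1)

def box_sum(ii, top, left, bottom, right):
    """Sum of img[top:bottom+1, left:right+1], in O(1) from integral image ii."""
    total = ii[bottom, right]                      # D
    if top > 0:
        total -= ii[top - 1, right]                # - B
    if left > 0:
        total -= ii[bottom, left - 1]              # - C
    if top > 0 and left > 0:
        total += ii[top - 1, left - 1]             # + A
    return total
```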

DETECTION 1 - HESSIAN
The SURF detector (for finding maxima or minima) is based on the determinant of the Hessian matrix; a positive determinant indicates a local extremum. The Hessian matrix requires the second-order partial derivatives of a function; for images, these can be computed by convolution with an appropriate kernel. The Hessian matrix applied at every image point is:

$H = \begin{bmatrix} L_{xx} & L_{xy} \\ L_{xy} & L_{yy} \end{bmatrix}$

where $L_{xx}$, $L_{yy}$ and $L_{xy}$ are the second-order Gaussian derivatives (Laplacian-of-Gaussian components) of the image.

LAPLACIAN OF GAUSSIANS - APPROXIMATION
In SIFT the Laplacian of Gaussian is approximated by the DoG. In a similar manner, Bay et al. (2008) proposed an approximation of the Laplacian of Gaussians using box-filter representations of the respective kernels. Evaluated with the integral image, these box filters increase performance.

BOX FILTERS
[Figure: the 2nd-derivative kernels L_xx, L_yy and L_xy, and their weighted box-filter approximations]
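With the box-filter responses Dxx, Dyy and Dxy in hand, SURF scores each point with an approximated Hessian determinant. The 0.9 weight below is the balancing factor reported by Bay et al. (2008) for the box-filter approximation:

```python
def hessian_response(Dxx, Dyy, Dxy):
    """Approximated det(H) from box-filter responses (Bay et al., 2008)."""
    return Dxx * Dyy - (0.9 * Dxy) ** 2
```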

SCALE ANALYSIS WITH CONSTANT IMAGE SIZE
In SIFT, each layer of the image pyramid relies on the previous one, and images need to be resized, which is not computationally efficient.
[Figure: SIFT subsampling pyramid]

FILTER IMAGE PYRAMIDS
Bay et al. (2008) proposed that, since the processing time of the box-filter kernels used in SURF is independent of their size, the scale space can be created by applying kernels of increasing size to the original image. This allows multiple layers of the scale-space pyramid to be processed simultaneously and removes the need to subsample the image, hence providing a performance increase.
[Figure: SURF scale-space pyramid]

SCALE SPACE
The scale space is divided into a number of octaves, where an octave refers to a series of response maps covering a doubling of scale. In SURF the lowest level of the scale space is obtained from the output of the 9*9 filters, which correspond to a real-valued Gaussian with a scale of 1.2. An increment of six pixels or more in filter size is needed to preserve the filter structure.
[Figure: filters Dyy (top) and Dxy (bottom) for successive scale levels (9*9, 15*15 and 21*21)]

LOCALIZATION
The construction of the scale space starts with the 9*9 filter, which calculates the blob response of the image at the smallest scale; then filters of size 15*15, 21*21 and 27*27 are applied. The task of localizing the scale- and rotation-invariant interest points in the image can then be divided into three steps:
1. Filter with a threshold to keep strong points.
2. Apply non-maximum suppression to find a set of candidate points, just like in the SIFT method.
3. Interpolate the nearby data to find the location in both space and scale to sub-pixel accuracy. This is done by fitting a 3D quadratic.

EXAMPLE: 1st octave - filter sizes 9, 15, 21, 27

EXAMPLE: 2nd octave - filter sizes 15, 27, 39, 51

EXAMPLE: 3rd octave - filter sizes 27, 51, 75, 99

EXAMPLE: 4th octave - filter sizes 51, 99, 147, 195

EXAMPLE

DESCRIPTION
The SURF descriptor of each detected interest point is computed using the integral image in conjunction with filters known as Haar wavelets, in order to increase robustness and decrease computation time. Haar wavelets are simple filters which can be used to find gradients in the x and y directions.
[Figure: Haar wavelets; the left filter computes the response in the x-direction and the right one in the y-direction]

DESCRIPTION - ORIENTATION TASK
In order to achieve rotation invariance, each detected interest point is assigned a reproducible orientation. To determine the orientation, Haar wavelet responses of size 4s are calculated for a set of pixels within a radius of 6s around the detected interest point, where s refers to the scale at which the point was detected. To find the dominant orientation:
- Represent the Haar wavelet responses as vectors
- Sum all responses within a sliding orientation window covering an angle of 60°
- The two summed responses yield a new vector

ORIENTATION TASK
The longest such vector defines the dominant orientation; the others are discarded.
[Figure: dominant orientation]
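A sketch of this 60° sliding-window search, assuming angles, dx and dy are arrays holding the angle and the (dx, dy) response components of the Haar samples around the interest point:

```python
import numpy as np

def dominant_direction(angles, dx, dy, window=np.pi / 3):
    """Angle of the longest summed response vector over a sliding 60° window."""
    best_len, best_angle = 0.0, 0.0
    for a in np.arange(0, 2 * np.pi, 0.1):          # slide the window
        diff = (angles - a) % (2 * np.pi)
        inside = diff < window                      # samples inside the window
        sx, sy = dx[inside].sum(), dy[inside].sum() # summed response vector
        length = np.hypot(sx, sy)
        if length > best_len:
            best_len, best_angle = length, np.arctan2(sy, sx)
    return best_angle
```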

DESCRIPTOR COMPONENTS
The descriptor computation starts by constructing a square window around the interest point, made of 4x4 square sub-regions with 5x5 regularly spaced sample points inside each. This square window is oriented along the dominant direction. Then the Haar wavelet responses dx and dy are calculated for the 25 points of each sub-region.

DESCRIPTOR COMPONENTS
- Weight the responses with a Gaussian kernel centered at the interest point
- Sum the responses over each sub-region for dx and dy
- Extract the sums of the absolute values of the responses

This yields, per sub-region,

$v_{\text{subregion}} = \left[\sum d_x,\; \sum d_y,\; \sum |d_x|,\; \sum |d_y|\right]$

and a feature vector of length $4 \cdot 4 \cdot 4 = 64$.

- Normalize the vector to unit length to be invariant to contrast
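A sketch of assembling the 64-element vector from Haar responses sampled on a 20x20 grid (4x4 sub-regions of 5x5 samples each); the Gaussian weighting step is omitted for brevity:

```python
import numpy as np

def surf_descriptor(dx, dy):
    """64-element SURF descriptor from 20x20 Haar response grids dx, dy."""
    v = []
    for j in range(4):
        for i in range(4):
            sub_dx = dx[5 * j:5 * j + 5, 5 * i:5 * i + 5]
            sub_dy = dy[5 * j:5 * j + 5, 5 * i:5 * i + 5]
            # [sum dx, sum dy, sum |dx|, sum |dy|] per sub-region.
            v += [sub_dx.sum(), sub_dy.sum(),
                  np.abs(sub_dx).sum(), np.abs(sub_dy).sum()]
    v = np.array(v)                              # 4*4*4 = 64 elements
    return v / max(np.linalg.norm(v), 1e-12)     # unit length for contrast invariance
```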

DESCRIPTOR COMPONENTS

COMPARE SURF TO SIFT
SURF: Fast-Hessian detector + SURF descriptor
SIFT: DoG detector + SIFT descriptor
SURF is good at:
- handling severe blurring
- handling image rotation
SURF is poor at:
- handling viewpoint change
- handling illumination change
SURF computes descriptors about three times faster than SIFT, but it is not as invariant as SIFT to illumination and viewpoint changes (Bay et al., 2008).

Thank you for your attention Bashar Alsadik