Image Features: Detection, Description, Matching, and Their Applications
Image Representation: Global Versus Local Features
Features / keypoints / interest points are interesting locations in the image. Features provide a new, compact representation of an image and can describe objects. Types of image features: edges, corners, blobs, and small image patches.
Global Features: The image is represented by one multidimensional feature vector describing the information in the whole image. Global features measure various aspects of the image such as color, texture, or shape. They are compact, fast and easy to compute, and generally require little memory. However, they are not invariant to significant transformations and are sensitive to clutter and occlusion.
Local Features: An image may have hundreds of local features. They measure local structures that are more distinctive and stable than other structures, which gives them superior performance and robustness against occlusion and background clutter.
Feature types: corner, region, edge
Feature Detection: identify the interest points. Feature detection = how to find interesting points (features) in the image (e.g., find a corner, find a region, and so on).
Feature Detectors:
Single-scale: Harris, FAST, SUSAN
Multi-scale: Harris-Laplace, Hessian-Laplace, LoG, DoG
Affine-invariant: Harris-Affine, Hessian-Affine
Feature Description: extract a feature vector from the region surrounding each interest point. Feature description = how to represent the interesting points we found so that they can be compared with other interesting points (features) in the image.
Feature Descriptors:
Gradient-based (floating point): SIFT, SURF, GLOH
Binary-based (binary string): BRIEF, ORB, BRISK, FREAK
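The gradient-based descriptors above can be sketched in simplified form: divide the patch around a keypoint into cells, build a gradient-orientation histogram per cell, and concatenate. This is a stripped-down SIFT-style sketch only; it omits Gaussian weighting, trilinear interpolation, and orientation normalization that the real algorithm uses.

```python
import numpy as np

def sift_like_descriptor(patch, cells=4, bins=8):
    """Simplified SIFT-style descriptor: per-cell gradient-orientation
    histograms, weighted by gradient magnitude, concatenated and
    L2-normalized. For a 16x16 patch with 4x4 cells and 8 bins this
    yields the familiar 128-dimensional vector."""
    patch = patch.astype(float)
    gy, gx = np.gradient(patch)
    mag = np.hypot(gx, gy)                            # gradient magnitude
    ang = np.mod(np.arctan2(gy, gx), 2 * np.pi)       # orientation in [0, 2*pi)
    n = patch.shape[0] // cells                       # cell side length
    desc = []
    for cy in range(cells):
        for cx in range(cells):
            sl = (slice(cy * n, (cy + 1) * n), slice(cx * n, (cx + 1) * n))
            hist, _ = np.histogram(ang[sl], bins=bins, range=(0, 2 * np.pi),
                                   weights=mag[sl])
            desc.append(hist)
    desc = np.concatenate(desc)
    norm = np.linalg.norm(desc)
    return desc / norm if norm > 0 else desc

np.random.seed(0)
patch = np.random.rand(16, 16)
d = sift_like_descriptor(patch)   # 4*4 cells * 8 bins = 128 values
```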
Feature Matching: match/compare the descriptors extracted from the query image to those in the database. Feature matching = computing the distance between two or more descriptors, i.e., finding their similarity.
Feature Matching: Given two such patches, how can we determine their similarity? We can measure pixel-to-pixel similarity by computing their Euclidean distance, but that measure is very sensitive to noise, rotation, translation, and illumination changes. In most applications we would like to be robust to such changes.
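A common way to make Euclidean-distance matching more robust is nearest-neighbor search with a ratio test: accept a match only when the best distance is clearly smaller than the second-best. The sketch below uses a brute-force distance matrix and an illustrative ratio threshold of 0.8; it assumes the database holds at least two descriptors.

```python
import numpy as np

def match_descriptors(d1, d2, ratio=0.8):
    """Nearest-neighbor matching with a ratio test.
    d1: (n, k) query descriptors; d2: (m, k) database descriptors, m >= 2.
    Returns (i, j) index pairs whose best match is unambiguous."""
    # pairwise Euclidean distances, shape (n, m)
    dists = np.linalg.norm(d1[:, None, :] - d2[None, :, :], axis=2)
    matches = []
    for i, row in enumerate(dists):
        order = np.argsort(row)
        best, second = order[0], order[1]
        if row[best] < ratio * row[second]:   # reject ambiguous matches
            matches.append((i, int(best)))
    return matches

d1 = np.array([[0.0, 0.0], [1.0, 0.0]])
d2 = np.array([[0.0, 0.1], [5.0, 5.0], [1.05, 0.0]])
m = match_descriptors(d1, d2)
```

Here both query descriptors find a close, unambiguous neighbor, so m is [(0, 0), (1, 2)].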
Applications:
1. Object recognition
2. Camera calibration
3. Image retrieval
4. Image registration
Binary Descriptors: Binary descriptors are composed of three parts:
A sampling pattern: where to sample points in the region around the feature/keypoint.
Orientation compensation: some mechanism to measure the orientation of the keypoint and rotate it to compensate for rotation changes.
Sampling pairs: which pairs of points to compare when building the final descriptor.
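The third part, sampling pairs, can be sketched in BRIEF-like form: each bit of the descriptor is the result of one intensity comparison between a pair of sampled pixels. This is an illustrative sketch only; it uses a random sampling pattern and omits the smoothing and orientation compensation a real descriptor would apply.

```python
import numpy as np

def brief_like(patch, pairs):
    """BRIEF-style binary descriptor: for each sampled pair (p, q) of pixel
    coordinates, emit 1 if patch[p] < patch[q], else 0."""
    return np.array([1 if patch[p] < patch[q] else 0 for p, q in pairs],
                    dtype=np.uint8)

rng = np.random.default_rng(0)
patch = rng.random((32, 32))
# random sampling pattern: 256 coordinate pairs inside the patch
pairs = [(tuple(rng.integers(0, 32, 2)), tuple(rng.integers(0, 32, 2)))
         for _ in range(256)]
d = brief_like(patch, pairs)      # a 256-bit descriptor
```

The sampling pattern is fixed once and reused for every keypoint, so the same pairs applied to the same patch always reproduce the same bit string.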
Why binary descriptors? Pixel comparisons are much faster than gradient computations. Matching binary representations is done using the Hamming distance, which is much faster to compute than floating-point distance metrics. Whereas a gradient-based descriptor commonly requires 64 or 128 floating-point values (256 or 512 bytes) to store, a single binary descriptor requires only 512 bits (64 bytes), i.e., 4 to 8 times less memory.
Challenges:
Change of scale
Change of orientation (rotation)
Change of viewpoint (affine, projective transformations)
Change of illumination
Noise, clutter, and occlusion
Repetitive patterns (windows, street marks, etc.)
Thus, we need robust similarity measures, detectors, descriptors, and matchers.
Ideas: Studying the effect of different sampling patterns on descriptor performance. Investigating the effect of different similarity metrics on matching performance. Adapting binary descriptors to limited computational resources. Enhancing the robustness of existing descriptors against geometric and photometric transformations. Proposing our own descriptor and using it in applications.
Thank you
A schematic representation of the SIFT descriptor for a 16×16 pixel patch and a 4×4 descriptor array
A schematic representation of the GLOH algorithm using log-polar bins