Designing Applications that See Lecture 7: Object Recognition

Similar documents
Features Points. Andrea Torsello DAIS Università Ca Foscari via Torino 155, Mestre (VE)

Outline 7/2/201011/6/

CS 4495 Computer Vision A. Bobick. CS 4495 Computer Vision. Features 2 SIFT descriptor. Aaron Bobick School of Interactive Computing

Scale Invariant Feature Transform

Scale Invariant Feature Transform

SIFT: SCALE INVARIANT FEATURE TRANSFORM SURF: SPEEDED UP ROBUST FEATURES BASHAR ALSADIK EOS DEPT. TOPMAP M13 3D GEOINFORMATION FROM IMAGES 2014

SUMMARY: DISTINCTIVE IMAGE FEATURES FROM SCALE- INVARIANT KEYPOINTS

Local Features: Detection, Description & Matching

EECS150 - Digital Design Lecture 14 FIFO 2 and SIFT. Recap and Outline

Object Recognition with Invariant Features

Local invariant features

CAP 5415 Computer Vision Fall 2012

CS664 Lecture #21: SIFT, object recognition, dynamic programming

The SIFT (Scale Invariant Feature

Key properties of local features

Local Features Tutorial: Nov. 8, 04

BSB663 Image Processing Pinar Duygulu. Slides are adapted from Selim Aksoy

Local Feature Detectors

Introduction. Introduction. Related Research. SIFT method. SIFT method. Distinctive Image Features from Scale-Invariant. Scale.

Feature descriptors. Alain Pagani Prof. Didier Stricker. Computer Vision: Object and People Tracking

Local features: detection and description. Local invariant features

CS231A Section 6: Problem Set 3

Motion Estimation and Optical Flow Tracking

Local Image Features

2D Image Processing Feature Descriptors

Computer Vision for HCI. Topics of This Lecture

Advanced Video Content Analysis and Video Compression (5LSH0), Module 4

SCALE INVARIANT FEATURE TRANSFORM (SIFT)

Object Detection Design challenges

Harder case. Image matching. Even harder case. Harder still? by Diva Sian. by swashford

Augmented Reality VU. Computer Vision 3D Registration (2) Prof. Vincent Lepetit

Building a Panorama. Matching features. Matching with Features. How do we build a panorama? Computational Photography, 6.882

Patch-based Object Recognition. Basic Idea

Computer Vision with MATLAB MATLAB Expo 2012 Steve Kuznicki

Window based detectors

Vehicle Detection Method using Haar-like Feature on Real Time System

Human Detection. A state-of-the-art survey. Mohammad Dorgham. University of Hamburg

HISTOGRAMS OF ORIENTATIO N GRADIENTS

Discovering Visual Hierarchy through Unsupervised Learning Haider Razvi

Lecture 10 Detectors and descriptors

Image Features: Local Descriptors. Sanja Fidler CSC420: Intro to Image Understanding 1/ 58

Local Image Features

Chapter 3 Image Registration. Chapter 3 Image Registration

Feature Based Registration - Image Alignment

Midterm Wed. Local features: detection and description. Today. Last time. Local features: main components. Goal: interest operator repeatability

School of Computing University of Utah

Harder case. Image matching. Even harder case. Harder still? by Diva Sian. by swashford

Recognition of a Predefined Landmark Using Optical Flow Sensor/Camera

Local features: detection and description May 12 th, 2015

Edge and corner detection

CEE598 - Visual Sensing for Civil Infrastructure Eng. & Mgmt.

Feature descriptors and matching

Image matching. Announcements. Harder case. Even harder case. Project 1 Out today Help session at the end of class. by Diva Sian.

Feature Matching and Robust Fitting

TA Section 7 Problem Set 3. SIFT (Lowe 2004) Shape Context (Belongie et al. 2002) Voxel Coloring (Seitz and Dyer 1999)

Local Patch Descriptors

CS4670: Computer Vision


Image Processing. Image Features

Object and Class Recognition I:

Face Detection on OpenCV using Raspberry Pi

Oriented Filters for Object Recognition: an empirical study

Object Recognition. Lecture 11, April 21 st, Lexing Xie. EE4830 Digital Image Processing

CS5670: Computer Vision

Implementing the Scale Invariant Feature Transform(SIFT) Method

EE368 Project Report CD Cover Recognition Using Modified SIFT Algorithm

Object Category Detection: Sliding Windows

Local features and image matching. Prof. Xin Yang HUST

Skin and Face Detection

Ulas Bagci

IMAGE-GUIDED TOURS: FAST-APPROXIMATED SIFT WITH U-SURF FEATURES

Object recognition (part 1)

Scale Invariant Feature Transform by David Lowe

Learning to Detect Faces. A Large-Scale Application of Machine Learning

Feature Detection. Raul Queiroz Feitosa. 3/30/2017 Feature Detection 1

Automatic Image Alignment (feature-based)

Computer Vision. Recap: Smoothing with a Gaussian. Recap: Effect of σ on derivatives. Computer Science Tripos Part II. Dr Christopher Town

Parallel Tracking. Henry Spang Ethan Peters

Motion illusion, rotating snakes

SIFT: Scale Invariant Feature Transform

Eppur si muove ( And yet it moves )

Comparison of Feature Detection and Matching Approaches: SIFT and SURF

Object Detection by Point Feature Matching using Matlab

CS 378: Autonomous Intelligent Robotics. Instructor: Jivko Sinapov

Object Recognition Algorithms for Computer Vision System: A Survey

Image features. Image Features

Image Features: Detection, Description, and Matching and their Applications

On Road Vehicle Detection using Shadows

Face detection and recognition. Many slides adapted from K. Grauman and D. Lowe

Corner Detection. GV12/3072 Image Processing.

Category vs. instance recognition

Face detection and recognition. Detection Recognition Sally

Local Image preprocessing (cont d)

Machine Learning for Signal Processing Detecting faces (& other objects) in images

Eye-blink Detection Using Gradient Orientations

Wikipedia - Mysid

Implementation and Comparison of Feature Detection Methods in Image Mosaicing

A Novel Algorithm for Color Image matching using Wavelet-SIFT

Prof. Feng Liu. Spring /26/2017

Computer vision: models, learning and inference. Chapter 13 Image preprocessing and feature extraction

Transcription:

stanford hci group / cs377s Designing Applications that See Lecture 7: Object Recognition Dan Maynes-Aminzade 29 January 2008 Designing Applications that See http://cs377s.stanford.edu

Reminders Pick up graded Assignment #1 Assignment #2 released today Fill out the interim course evaluation CS247L workshop on Flash tomorrow night, 6-8PM in Wallenberg 332 Next class: OpenCV tutorial Bring your webcams We will be using Windows with Visual Studio (you can try to follow along on your own machine, but you will need to figure out things like library and include paths on your system) 2

Today s Goals Understand the challenges of object recognition Learn about some algorithms that attempt to recognize particular objects (and object classes) from their basic constituent features 3

Outline Object recognition: general problem description Boosted tree classifiers SIFT algorithm 4

Fast, Accurate, General Recognition This guy is wearing a haircut called a mullet (courtesy of G. Bradski) 5

Quick, Spot the Mullets! (courtesy of G. Bradski) 6

What Makes a Car a Car? CARS NOT CARS 7

Object Detection Goal: find an object of a pre-defined class in a static image or video frame. Approach Extract certain image features, such as edges, color regions, textures, contours, etc. Use some heuristics to find configurations and/or combinations of those features specific to the object of interest 8

Many Approaches to Recognition Geometric Non-Geo rela ations Constellation/Perona Patches/Ulman Histograms/Schiele HMAX/Poggio Eigen Objects/Turk Shape models MRF/Freeman, Murphy Local features Global (courtesy of G. Bradski) 9

A Common 2-Step Strategy 10

Possible Things to Look For Symmetry Color Shadow Corners Edges Texture Taillights 11

Symmetry Observed from rear view, a car generally has vertical symmetry Problems: Symmetry estimation is sensitive to noise Prone to false detections, such as symmetrical background objects Doesn t work for partly occluded vehicles 12

Color Road is a fairly constant color Non-road regions within a road area are potential vehicles Problems: Color of an object depends on illumination, reflectance properties of the object, viewing geometry, and camera properties Color of an object can be very different during different times of the day, under different weather conditions, and under different poses 13

Shadow Area underneath a vehicle is distinctly darker than any other areas on an asphalt paved road. Problem: Doesn t work in rain, under bad illumination (under a bridge for example) Intensity of the shadow depends on illumination of the image: how to choose appropriate threshold values? 14

Corners Vehicles in general have a rectangular shape Can use four templates one for each corner, to detect all corners in image, and then use a search method to find valid configurations (matching corners) 15

Edges Rear views of cars contain many horizontal and vertical structures (rear-window, bumper, etc.) One possibility: horizontal edge detector on the image (such as Sobel operator), then sum response in each column to locate horizontal position (should be at the peaks) Problem: Lots of parameters: threshold values for the edge detectors, the threshold values for picking the most important vertical and horizontal edges, and the threshold values for choosing the best maxima in profile image 16

Texture Presence of a car causes local intensity changes General similarities among all vehicles means that the intensity changes may follow a certain pattern Problem: in most environments the background contains lots of texture as well 17

Taillights Fairly salient feature of all vehicles Problem: A little different on every car Not that bright during the daytime; probably would work only at night. 18

Other Factors to Consider Perspective: This is not a likely position / size Shape: Trace along the outer contour 19

General Problem For complex objects, such as vehicles, it is hard to find features and heuristics that will handle the huge variety of instances of the object class: May be rotated in any direction Lots of different kinds of cars in different colors May be a truck May have a missing headlight, bumper stickers, etc. May be half in light, half in shadow 20

Statistical Model Training Training Set Positive Samples Negative Samples Different features are extracted from the training samples and distinctive features that can be used to classify the object are selected. This information is compressed into the statistical model parameters. Each time the trained classifier does not detect an object (misses the object) or mistakenly detects the absent object (gives a false alarm), model is adjusted. 21

Training in OpenCV Uses simple features and a cascade of boosted tree classifiers as a statistical model. Paul Viola and Michael J. Jones. Rapid Object Detection using a Boosted Cascade of Simple Features. IEEE CVPR, 2001. Rainer Lienhart and Jochen Maydt. An Extended Set of Haar-like Features for Rapid Object Detection. IEEE ICIP 2002, Vol. 1, pp. 900-903, Sep. 2002. 22

Approach Summary Classifier is trained on images of fixed size (Viola uses 24x24) Detection is done by sliding a search window of that size through the image and checking whether an image region at a certain location looks like a car or not. Image (or classifier) can be scaled to detect objects of different sizes. A very large set of very simple weak classifiers that use a single feature to classify the image region as car or non-car. Each feature is described by the template (shape of the feature), its coordinate relative to the search window origin and the size (scale factor) of the feature. 23

Types of Features Feature s value is a weighted sum of two components: Pixel sum over the black rectangle Sum over the whole feature area 24

Weak Classifier Computed feature value is used as input to a very simple decision tree classifier with 2 terminal nodes Bar detector works well for nose nose, aface detecting stump. 1 means car and -1 means non-car It doesn t work well for cars. 25

Boosted Classifier Complex and robust classifier is built out of multiple weak classifiers using a procedure called boosting. The boosted classifier is built iteratively as a weighted sum of weak classifiers: F = sign(c 1 f 1 + c 2 f 2 + + c n f n ) On each iteration, a new weak classifier f i is trained and added to the sum. The smaller the error f i gives on the training set, the larger is the coefficient c i that is assigned to it. 26

Cascade of Boosted Classifiers Sequence of boosted classifiers with constantly increasing complexity Chained into a cascade with the simpler classifiers going first. 27

Classifier Cascade Demo 28

Identifying a Particular Object Correlation-based template matching? Computationally infeasible when object rotation, scale, illumination and 3D pose vary Even more infeasible with partial occlusion 29

Local Image Features We need to learn a set of local object features that are unaffected by Nearby clutter Partial occlusion andinvariantto to Illumination 3D projective transforms Common object variations...but, at the same time, sufficiently distinctive to identify specific objects among many alternatives! 30

What Features to Use? Line segments, edges and region grouping? Detection not good enough for reliable recognition Peaks detection in local image variations? Example: Harris corner detector Problem: image examined at only a single scale Different key locations as the image scale changes 31

Invariant Local Features Goal: Transform image content into local feature coordinates that are invariant to translation, rotation, and scaling 32

SIFT Method Scale Invariant Feature Transform (SIFT) Staged filtering approach Identifies stable points (called image keys ) Computation time less than 2 seconds Local features: Invariant to image translation, scaling, rotation Partially invariant to illumination changes and 3D projection (up to 20 of rotation) Minimally affected by noise Similar properties with neurons in inferior temporal cortex used for object recognition in primate vision 33

SIFT: Algorithm Outline Detect extrema across a scale-space Uses a difference-of-gaussians function Localize keypoints Find sub-pixel llocation and scale to fit a model Assign orientations to keypoints 1 or more for each keypoint Build keypoint descriptor Created from local image gradients 34

Scale Space Build a pyramid of images Images are difference-of-gaussian functions Resampling between each level 35

Finding Keypoints Basic idea: find strong corners in the image, but in a scale invariant fashion Run a linear filter (difference of Gaussians) at different resolutions on image pyramid - = Difference of Gaussians 36

Finding Keypoints Keypoints are the extreme values in the difference of Gaussians function across the scale space 37

Key Localization Find maxima and minima, and extract image gradient and orientation at each level to build an orientation histogram 1 st level 2 nd level 38

Select Canonical Orientation Create histogram of local gradient directions computed at selected scale Assign canonical orientation at peak of smoothed histogram 0 2π 39

Orientation Histogram Stored relative to the canonical orientation for the keypoint, to achieve rotation invariance 40

SIFT Keypoints Feature vector describing the local image region sampled relative to its scale-space coordinate frame Represents blurred image gradient locations in multiple orientations planes and at multiple scales 41

SIFT: Experimental Results 42

SIFT: Finding Matches Goal: identify candidate object matches The best candidate match is the nearest neighbor (i.e., minimum Euclidean distance between descriptor vectors) The exact solution for high dimensional vectors has high complexity, so SIFT uses approximate Best-Bin-First Bin (BBF) search method (Beis and Lowe) Goal: Final verification Low-residual least-squares fit Solution of a linear system: x = [ATA]-1ATb When at least 3 keys agree with low residual, there is strong evidence for the presence of the object Since there are dozens of keys in the image, this still works with partial occlusion 43

SIFT Example (courtesy of David Lowe) 44

Partial Occlusion Example (courtesy of David Lowe) 45

SIFT Demo 46

Context Focus? (courtesy of Kevin Murphy) 47

Importance of Context (courtesy of Kevin Murphy) 48

Importance of Context We know there is a keyboard present in this scene even if we cannot see it clearly. We know there is no keyboard present in this scene even if there is one indeed. (courtesy of Kevin Murphy) 49

Summary SIFT Very robust recognition of a specific object, even with messy backgrounds and varying illumination conditions Very quick and easy to train A little too slow to run at interactive frame rates (if the background is complex) Classifier Cascades Can recognize a general class of objects (rather than a specific object) Less robust (more false positives / missed detections) Slow to train, requires many examples Runs well at interactive frame rates 50