Recognition. Computer Vision I, CSE 152, Lecture 14


Announcements
Computer Vision I, CSE 152, Lecture 14
Homework 3 is due May 18, 11:59 PM.
Reading:
- Chapter 15: Learning to Classify
- Chapter 16: Classifying Images
- Chapter 17: Detecting Objects in Images

Recognition
Given a database of objects and an image, determine which, if any, of the objects are present in the image.
How many visual object categories are there? (Biederman 1987)

Specific recognition tasks
- Scene categorization: outdoor/indoor, city/forest/factory, etc.
- Image annotation/tagging: street, people, building, mountain
- Object detection: find pedestrians
- Image parsing: sky, mountain, building, tree, banner, street lamp, people, market

Object Recognition: The Problem
Given a database D of known objects and an image I:
1. Determine which (if any) objects in D appear in I
2. Determine the pose (rotation and translation) of each object
Recognition (what is it), segmentation (where is it in 2D), pose estimation (where is it in 3D): WHAT AND WHERE!!!

Challenges
- Within-class variability: different objects within the class have different shapes or different material characteristics; objects may be deformable, articulated, or compositional
- Pose variability: 2-D image transformations (translation, rotation, scale); 3-D pose variability (perspective vs. orthographic projection)
- Lighting: direction (multiple sources & types), color, shadows
- Occlusion: objects may be only partially visible
- Clutter in the background -> false positives

Object Categories (Classes)
Example taxonomy: OBJECTS -> ANIMALS, PLANTS, INANIMATE (NATURAL, MAN-MADE); ANIMALS -> VERTEBRATE -> MAMMALS (TAPIR, BOAR), BIRDS (GROUSE); MAN-MADE -> CAMERA
- Categories near the top of the tree (e.g., vehicles): lots of within-class variability
- Fine-grained categories (e.g., species of birds): moderate within-class variation
- Instance recognition (e.g., person identification): within-class variation is mostly shape articulation, bending, etc.

Pattern Classification
Supervised vs. unsupervised: do we have labels?
- Supervised: nearest neighbor; Bayesian plug-in classifiers (distribution-based); projection methods; neural networks; support vector machines; kernel methods
- Unsupervised: clustering
- Reinforcement learning

A Rough Spectrum
Appearance-Based -> Shape Contexts -> Geometric Invariants -> Image Abstractions / Volumetric Primitives -> Appearance-Based Local Features + Spatial Relations -> 3-D Model-Based -> Function (increasing generality)

Appearance-Based Vision for Instances: A Pattern Classification Viewpoint
1. Bayesian classification
2. Appearance manifolds
3. Feature space
4. Dimensionality reduction

Bayesian Classification
Example: sorting incoming fish on a conveyor according to species using optical sensing. Adopt the lightness and add the width of the fish as features:
fish x^T = [x_1, x_2] = [lightness, width]

Basic ideas in classifiers: Loss
Some errors may be more expensive than others. E.g., for a fatal disease that is easily cured by a cheap medicine with no side effects, false positives in diagnosis are better than false negatives.
We discuss two-class classification: L(1->2) is the loss caused by calling a 1 a 2.
Total risk of using classifier s:
R(s) = Pr(1->2 | using s) L(1->2) + Pr(2->1 | using s) L(2->1)

Generally, we should classify as 1 if the expected loss of classifying as 1 is lower than that of classifying as 2:
choose 1 if P(1|x) L(1->2) > P(2|x) L(2->1), and 2 otherwise.
Some loss may be inevitable: the minimum risk (the shaded area in the slide figure) is called the Bayes risk.
Crucial notion: the decision boundary, the set of points where the expected loss is the same for either choice.
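The two-class minimum-risk rule above can be sketched in a few lines. This is a minimal illustration, assuming the posteriors P(1|x) and P(2|x) are already available; the posterior and loss values in the usage lines are invented, not from the lecture.

```python
# Minimal sketch of a two-class minimum-expected-loss decision rule.
# Posteriors are assumed given; the numeric values below are illustrative.

def decide(p1, p2, loss_1_as_2=1.0, loss_2_as_1=1.0):
    """Return the class (1 or 2) that minimizes expected loss.

    p1, p2      : posteriors P(1|x) and P(2|x)
    loss_1_as_2 : loss L(1->2) for calling a true 1 a 2
    loss_2_as_1 : loss L(2->1) for calling a true 2 a 1
    """
    # Expected loss of saying "1" is L(2->1) P(2|x); of saying "2" is L(1->2) P(1|x)
    risk_say_1 = loss_2_as_1 * p2
    risk_say_2 = loss_1_as_2 * p1
    return 1 if risk_say_1 < risk_say_2 else 2

# With symmetric losses the rule simply picks the larger posterior:
print(decide(0.6, 0.4))                    # -> 1
print(decide(0.4, 0.6))                    # -> 2
# A costly error of calling a 1 a 2 (e.g., missing a fatal disease)
# flips the decision even when P(1|x) < P(2|x):
print(decide(0.3, 0.7, loss_1_as_2=10.0))  # -> 1
```

The last call shows the asymmetric-loss point from the slide: the decision boundary moves toward the less costly error.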

Finding a decision boundary is not the same as modeling a conditional density.
The classifier boils down to: choose the class k that minimizes
delta(x, mu_k)^2 / 2 - log pi_k,   where   delta(x, mu_k)^2 = (x - mu_k)^T Sigma^{-1} (x - mu_k)
is the squared Mahalanobis distance. Because the covariance Sigma is common to all classes, this simplifies to the sign of a linear expression (i.e., a Voronoi diagram in 2D for Sigma = I).

Plug-in classifiers
Assume that the class-conditional distributions P(x | omega_i) have some parametric form, then estimate the parameters from the data.
Common: assume a normal distribution with shared covariance and different means, and use the usual estimates; or a normal distribution with different covariances.
Issue: parameter estimates that are good may not give optimal classifiers.

Example: Finding skin
Skin has a very small range of (intensity-independent) colors and little texture.
- Compute an intensity-independent color measure, check if the color is in this range, and check if there is little texture (median filter).
- See this as a classifier: we can set up the tests by hand, or learn them.
- Get class-conditional densities (histograms) and priors from data (counting).
The classifier's performance is summarized by a Receiver Operating Curve.
Figure from "Statistical color models with application to skin detection," M.J. Jones and J. Rehg, Proc. Computer Vision and Pattern Recognition, 1999. Copyright 1999, IEEE.
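The Mahalanobis-distance rule above can be sketched concretely for the fish-sorting setup. This is a toy illustration, not the lecture's code: the 2-D class means, priors, and shared covariance below are invented, and the 2x2 matrix inverse is written out by hand to keep the example dependency-free.

```python
# Hedged sketch of the plug-in Gaussian classifier with a shared covariance:
# choose the class k minimizing delta(x, mu_k)^2 / 2 - log(pi_k).
# All numeric values are made-up 2-D illustrations (lightness, width).

import math

def mahalanobis_sq(x, mean, cov_inv):
    """Squared Mahalanobis distance (x - mu)^T Sigma^{-1} (x - mu), 2-D case."""
    d0, d1 = x[0] - mean[0], x[1] - mean[1]
    return (d0 * (cov_inv[0][0] * d0 + cov_inv[0][1] * d1)
            + d1 * (cov_inv[1][0] * d0 + cov_inv[1][1] * d1))

def classify(x, means, priors, cov):
    """Pick the class minimizing 0.5 * delta^2(x, mu_k) - log(pi_k)."""
    # Explicit 2x2 inverse of the shared covariance
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    cov_inv = [[cov[1][1] / det, -cov[0][1] / det],
               [-cov[1][0] / det, cov[0][0] / det]]
    scores = {k: 0.5 * mahalanobis_sq(x, means[k], cov_inv) - math.log(priors[k])
              for k in means}
    return min(scores, key=scores.get)

means = {"salmon": (3.0, 5.0), "sea_bass": (7.0, 8.0)}  # (lightness, width)
priors = {"salmon": 0.5, "sea_bass": 0.5}
cov = [[1.0, 0.0], [0.0, 1.0]]   # identity -> reduces to Euclidean distance
print(classify((3.5, 5.5), means, priors, cov))          # -> salmon
```

With the identity covariance and equal priors, the decision boundary is the perpendicular bisector between the two means, the Voronoi picture mentioned above.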

Variability: camera position, illumination, internal parameters, within-class variations.

Appearance manifold approach (Nayar et al. 96)
For every object:
1. Sample the set of viewing conditions
2. Crop & scale images to a standard size
3. Use the raw images as feature vectors
Then:
- Apply principal component analysis (PCA) over all the images
- Keep the dominant principal components
- The set of views for one object is represented as a manifold in the projected space
- Recognition: what is the nearest manifold for a given test image?

Limitations of these approaches
- The object must be segmented from the background (how would one do this in non-trivial situations?)
- Occlusion?
- The variability (dimension) in images is large (is sampling feasible?)
- How can one generalize to classes of objects?

Appearance-Based Vision: Lessons
Strengths:
- Posing the recognition metric in the image space rather than a derived representation is more powerful than expected.
- Modeling objects from many images is not unreasonable given hardware developments.
- The data (images) may provide a better representation than abstractions for many tasks.
Weaknesses:
- Segmentation or object detection is still an issue.
- To train the method, objects have to be observed under a wide range of conditions (e.g., pose, lighting, shape deformation).
- Limited power to extrapolate or generalize (abstract) to novel conditions.

Bag-of-features models: Object -> Bag of "words"
Slides from Svetlana Lazebnik, who borrowed from others.
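The PCA step of the appearance-manifold recipe can be sketched in miniature. Everything here is illustrative, not from the lecture: the "images" are tiny 4-pixel vectors, and power iteration stands in for a full eigendecomposition to find just the dominant principal component.

```python
# Sketch of the PCA projection step in an appearance-manifold pipeline.
# Tiny 4-pixel "images" and power iteration are simplifying assumptions.

def top_principal_component(images, iters=100):
    """Dominant eigenvector of the data covariance, via power iteration."""
    n, d = len(images), len(images[0])
    mean = [sum(img[j] for img in images) / n for j in range(d)]
    centered = [[img[j] - mean[j] for j in range(d)] for img in images]
    v = [1.0] * d
    for _ in range(iters):
        # w proportional to (X^T X) v, without forming the covariance matrix
        proj = [sum(row[j] * v[j] for j in range(d)) for row in centered]
        w = [sum(proj[i] * centered[i][j] for i in range(n)) / n for j in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return mean, v

def project(img, mean, v):
    """1-D coordinate of an image along the dominant component."""
    return sum((img[j] - mean[j]) * v[j] for j in range(len(img)))

# Views of one "object" that vary mostly along one direction of pixel space:
views = [[0, 0, 1, 1], [1, 1, 2, 2], [2, 2, 3, 3], [3, 3, 4, 4]]
mean, v = top_principal_component(views)
coords = [project(img, mean, v) for img in views]
print(coords)  # monotonically increasing 1-D coordinates along the view "manifold"
```

In the real approach, each object's sampled views trace out a curve (manifold) in the low-dimensional projected space, and a test image is assigned to the object whose manifold is nearest.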

Bag-of-features models
Origin 1: Texture recognition
- Texture is characterized by the repetition of basic elements, or textons.
- For stochastic textures, it is the identity of the textons, not their spatial arrangement, that matters.
- A texture is represented as a histogram over a universal texton dictionary.
(Julesz, 1981; Cula & Dana, 2001; Leung & Malik, 2001; Mori, Belongie & Malik, 2001; Schmid, 2001; Varma & Zisserman, 2002, 2003; Lazebnik, Schmid & Ponce, 2003)

Origin 2: Bag-of-words models
- Orderless document representation: frequencies of words from a dictionary (Salton & McGill, 1983).
- Example: "Which US President?" quiz over tag clouds of presidential speeches (Franklin D. Roosevelt, John F. Kennedy, George W. Bush), from the US Presidential Speeches Tag Cloud, http://chir.ag/phernalia/preztags/

Bag-of-features steps
1. Extract features
2. Learn a "visual vocabulary"
3. Quantize features using the visual vocabulary
4. Represent images by frequencies (histograms) of visual words
5. Classify, using the histograms as input to a classifier

Feature extraction: regular grid or interest regions.

Sliding-window approaches
Sketch of a pattern recognition architecture:
Image (window) -> Feature Extraction -> Feature Vector -> Classification -> Object Identity
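Steps 2-4 above can be sketched end to end. This is a toy illustration under stated assumptions: the 2-D points stand in for real descriptors such as 128-D SIFT vectors, the vocabulary has only two words, and k-means is initialized naively from the first k points.

```python
# Minimal bag-of-features sketch: learn a visual vocabulary with k-means,
# quantize descriptors to the nearest word, and build a word histogram.
# 2-D toy descriptors stand in for real (e.g., SIFT) features.

def dist2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20):
    """Plain k-means; initializing from the first k points is a simplification."""
    centers = [list(p) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[min(range(k), key=lambda c: dist2(p, centers[c]))].append(p)
        for j, cl in enumerate(clusters):
            if cl:  # recompute each center as its cluster mean
                centers[j] = [sum(p[d] for p in cl) / len(cl) for d in range(len(cl[0]))]
    return centers

def quantize(desc, centers):
    """Index of the nearest visual word for one descriptor."""
    return min(range(len(centers)), key=lambda c: dist2(desc, centers[c]))

def bow_histogram(descriptors, centers):
    """Histogram of visual-word frequencies for one image."""
    hist = [0] * len(centers)
    for d in descriptors:
        hist[quantize(d, centers)] += 1
    return hist

training = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
vocab = kmeans(training, k=2)               # two visual words
image_descs = [(0, 0), (1, 1), (10, 10)]
print(bow_histogram(image_descs, vocab))    # -> [2, 1]
```

Step 5 then feeds such histograms to any classifier from earlier in the lecture (nearest neighbor, a Bayesian plug-in classifier, an SVM, etc.).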

Example: Face Detection
- Scan a window over the image, searching over position & scale.
- Classify each window as either face or non-face.
Window -> Classifier -> Face / Non-face
Feature space: what are the features? What is the classifier?

The Space of Images
We will treat a d-pixel image as a point in a d-dimensional space, x in R^d. Each pixel value is a coordinate of x: x = (x_1, x_2, ..., x_d).

More features
- Filtered image; filter with multiple filters (a bank of filters)
- Histogram of colors
- Histogram of Gradients (HOG)
- Haar wavelets
- Scale Invariant Feature Transform (SIFT)
- Speeded Up Robust Features (SURF)

Nearest Neighbor Classifier
{R_j} is the set of training images. Classify a test image I by the label of its nearest training image:
ID = argmin_j dist(R_j, I)
Variation of this: k nearest neighbors.
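The nearest-neighbor rule ID = argmin_j dist(R_j, I) is short enough to write out directly. A toy sketch, assuming images are already flattened to vectors; the 3-pixel "templates" and their labels are invented for illustration.

```python
# Toy nearest-neighbor ("template matching") classifier over image vectors.
# The 3-pixel templates and labels are made up for illustration.

def nearest_neighbor(test_img, templates):
    """Return the label of the training image closest in squared Euclidean distance."""
    best_label, best_d = None, float("inf")
    for label, img in templates:
        d = sum((a - b) ** 2 for a, b in zip(test_img, img))
        if d < best_d:
            best_label, best_d = label, d
    return best_label

templates = [("face", (200, 180, 160)), ("non-face", (20, 30, 25))]
print(nearest_neighbor((190, 170, 150), templates))  # -> face
print(nearest_neighbor((10, 20, 30), templates))     # -> non-face
```

The k-nearest-neighbors variation replaces the single minimum with a majority vote over the k closest training images.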

Comments on Nearest Neighbor
- Sometimes called template matching.
- Variations on the distance function (e.g., L1, robust distances).
- Multiple templates per class; perhaps many training images per class.
- Expensive to compute the distances, especially when each image is big (d-dimensional).
- May not generalize well to unseen examples of a class.
- Error rate is no worse than twice that of the optimal classifier (given enough training samples).
- Some solutions: Bayesian classification, dimensionality reduction.

Next Lecture: recognition, detection, and classification
Reading:
- Chapter 15: Learning to Classify
- Chapter 16: Classifying Images
- Chapter 17: Detecting Objects in Images