Pattern recognition (3)


Things we have discussed until now:
- Statistical pattern recognition
- Building simple classifiers
- Supervised classification: minimum distance classifier, Bayesian classifier
- Building discriminant functions
- Unsupervised classification: K-means algorithm

Discriminants. A function used to test class membership is called a discriminant. We construct a single discriminant g_i(x) for each class ω_i, and assign x to class ω_i if g_i(x) > g_j(x) for all other classes ω_j.
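As a concrete illustration (not from the slides), here is a minimal sketch of this decision rule in Python, using the negative Euclidean distance to each class mean as a hypothetical discriminant:

```python
import numpy as np

def make_min_distance_discriminants(class_means):
    """Build one discriminant per class: g_i(x) = -||x - m_i||.

    class_means: dict mapping class label -> mean feature vector.
    (Hypothetical minimum-distance discriminants, for illustration only.)
    """
    return {
        label: (lambda x, m=np.asarray(mean): -np.linalg.norm(x - m))
        for label, mean in class_means.items()
    }

def classify(x, discriminants):
    """Assign x to the class whose discriminant g_i(x) is largest."""
    return max(discriminants, key=lambda label: discriminants[label](x))

# Toy usage with two 2-D classes
g = make_min_distance_discriminants({"A": [0.0, 0.0], "B": [3.0, 3.0]})
print(classify(np.array([0.5, 0.2]), g))  # -> "A"
```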

Last lecture we stopped at: discriminant functions for the Bayes classifier in the 1D case. Today: extension to multiple classes, and Naïve Bayes, which considers the features to be statistically independent. This lecture is based on a tutorial on object classification techniques presented at ICCV 2005. Optional reading:
- Csurka et al., Visual Categorization with Bags of Keypoints, ECCV 2004.
- Wikipedia article on the Bag-of-Words concept for computer vision applications: http://en.wikipedia.org/wiki/bag_of_words_model_in_computer_vision

Naïve Bayes derivation. How do we represent objects through features that are statistically independent? The bag-of-words model.

What is the Bag-of-Words model? Initially used in Natural Language Processing (NLP), it is a method for document representation which ignores the word order. Example: "a good book" and "a book good" are the same object. The model is dictionary based: each document looks like a bag containing some words of the dictionary.

Analogy to document analysis (slide from Tutorial on Learning and Recognizing Object Categories, ICCV 2005; author: Li Fei-Fei). The slide shows two example documents: a passage on visual perception (how the retinal image is analyzed by the visual cortex, following the discoveries of Hubel and Wiesel) and a news excerpt on China's forecast trade surplus. Each document is reduced to its bag of salient words, e.g. {sensory, brain, visual, perception, retinal, cerebral cortex, eye, cell, optical nerve, image, Hubel, Wiesel} for the first and {China, trade, surplus, commerce, exports, imports, US, yuan, bank, domestic, foreign, increase, value} for the second; the word order is discarded.

How to create a bag of words in NLP. Two simple documents:
1. John likes to watch movies. Mary likes too.
2. John also likes to watch football games.
Based on these two text documents, a dictionary is constructed: dictionary = {1:"John", 2:"likes", 3:"to", 4:"watch", 5:"movies", 6:"also", 7:"football", 8:"games", 9:"Mary", 10:"too"}, which has 10 distinct words. Using the indexes of the dictionary, each document is represented by a 10-entry vector:
[1, 2, 1, 1, 1, 0, 0, 0, 1, 1]
[1, 1, 1, 1, 0, 1, 1, 1, 0, 0]
where each entry of the vector is the count of the corresponding dictionary word in that document (a histogram-based representation). Source: http://en.wikipedia.org/wiki/bag_of_words_model_in_computer_vision
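A minimal sketch of this construction in Python (illustrative only; the tokenization simply lowercases and strips punctuation, which the slide does not specify, and the dictionary is ordered alphabetically, so the vectors are a permutation of the ones above):

```python
from collections import Counter

docs = [
    "John likes to watch movies. Mary likes too.",
    "John also likes to watch football games.",
]

def tokenize(text):
    """Very simple tokenizer: lowercase and strip punctuation (an assumption)."""
    return [w.strip(".,").lower() for w in text.split()]

# Build the dictionary: every distinct word gets an index
vocabulary = sorted({w for doc in docs for w in tokenize(doc)})

def bag_of_words(text):
    """Represent a document as a vector of word counts over the dictionary."""
    counts = Counter(tokenize(text))
    return [counts[w] for w in vocabulary]

for doc in docs:
    print(bag_of_words(doc))
```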

How to create a Bag-of-Words representation in Computer Vision. An image can be treated as a document, and features extracted from the image are considered words; however, a "word" in an image does not have the same straightforward meaning as in text documents.

Slide from Tutorial on Learning and Recognizing Object Categories, ICCV 2005. Author: Li Fei-Fei.

Steps in generating a bag of words for images:
1. Feature detection
2. Feature representation
3. Generation of the codebook (equivalent to the dictionary in text documents)

1. Feature detection: extracts several local patches (or regions), which are considered candidates for the basic elements ("words").

Feature selection: regular grid. Good for natural scene categorization. Disadvantage: the sampling positions carry very little information about the image content itself.

Feature selection (cont'd): detection of interest points (Harris corner detector, SIFT). This extracts salient patches, which are considered more important than other patches and which also attract human attention. Example: a visual word (an airplane wheel) occurring in different images, with the elliptical region superimposed; the bottom row shows the normalized regions for the top row of images. The normalized regions appear quite similar, which is why they represent the same word. (Sivic et al., Discovering Object Categories in Image Collections, MIT Technical Report, 2005.)

Feature representation. We represent patches as numerical vectors, after normalizing them. A popular representation is SIFT (Scale-Invariant Feature Transform, Lowe 1999), which converts each patch to a 128-dimensional vector.
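A minimal sketch of interest-point detection and SIFT description using OpenCV (an assumption, since the slides do not prescribe a library; it requires opencv-python 4.4+ where SIFT is included, and a hypothetical image file image.jpg):

```python
import cv2

# Load a grayscale image (hypothetical path)
img = cv2.imread("image.jpg", cv2.IMREAD_GRAYSCALE)

# Detect interest points and compute SIFT descriptors:
# each keypoint gets a 128-dimensional descriptor vector.
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)

print(len(keypoints))      # number of detected patches
print(descriptors.shape)   # (number of patches, 128)
```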

Codebook generation. This is the final step: it converts the vector-represented patches to codewords (the analogy of words in text documents). The codebook is the equivalent of the dictionary in text documents, and a codeword is a representative of several similar patches. Codebook generation can be performed by K-means clustering over all the vectors; the codewords are then the centers of the learned clusters.
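A minimal sketch of codebook generation with K-means, assuming scikit-learn is available; the descriptors are random stand-ins so the snippet runs on its own, and the codebook size of 200 is an arbitrary choice for illustration:

```python
import numpy as np
from sklearn.cluster import KMeans

# all_descriptors: (total number of patches over all training images, 128).
# Random stand-in data is used here just so the sketch runs end to end.
rng = np.random.default_rng(0)
all_descriptors = rng.random((5000, 128)).astype(np.float32)

# Cluster the patch descriptors; the cluster centers are the codewords.
codebook_size = 200
kmeans = KMeans(n_clusters=codebook_size, n_init=10, random_state=0)
kmeans.fit(all_descriptors)

codewords = kmeans.cluster_centers_   # (codebook_size, 128)
print(codewords.shape)
```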

2. Codewords dictionary formation. Slide credit: Josef Sivic.

Figure: all patches detected for an image, and the patches from two selected clusters occurring in that image (yellow and magenta ellipses). Csurka et al., Visual Categorization with Bags of Keypoints, ECCV 2004.

2. Codewords dictionary formation. Fei-Fei et al. 2005.

3. Image representation: a histogram over the codewords, giving the frequency of each codeword in the image (axes: frequency vs. codewords).
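A minimal sketch of this step, assuming a codebook has already been learned with K-means as above (random stand-in descriptors are used so the snippet runs on its own):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
codebook_size = 200

# Codebook learned beforehand (here on random stand-in descriptors).
kmeans = KMeans(n_clusters=codebook_size, n_init=10, random_state=0)
kmeans.fit(rng.random((5000, 128)).astype(np.float32))

def image_representation(image_descriptors):
    """Bag-of-words histogram: count how many patch descriptors of
    this image are assigned to each codeword (cluster center)."""
    labels = kmeans.predict(image_descriptors)
    return np.bincount(labels, minlength=codebook_size)

# One hypothetical image with 300 patches:
hist = image_representation(rng.random((300, 128)).astype(np.float32))
print(hist.shape, hist.sum())   # (200,) 300
```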

Representation: 1. feature detection & representation; 2. codewords dictionary; 3. image representation.

Learning and recognition: from the codewords dictionary, build category models (and/or) classifiers, which produce the category decision.

Case study: the Naïve Bayes classifier. Csurka et al., Visual Categorization with Bags of Keypoints, ECCV 2004. Area: content-based information retrieval. Question asked: given an image, what category of objects does it contain? Notations: w_n is one patch in an image, encoded as an indicator vector over the codewords, w_n = [0, 0, ..., 1, ..., 0, 0]^T; w is the collection of all N patches in an image, w = [w_1, w_2, ..., w_N]; c is the category of the image.

Case study: the Naïve Bayes model (graphical model: the class c generates the N patches w). The decision rule is

c^* = \arg\max_c p(c \mid w), \quad p(c \mid w) \propto p(c)\, p(w \mid c) = p(c) \prod_{n=1}^{N} p(w_n \mid c)

where c^* is the object class decision, p(c) is the prior probability of the object classes, and p(w \mid c) is the image likelihood given the class. (Csurka et al. 2004)
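A minimal sketch of this classifier over bag-of-words histograms (an illustrative reconstruction, not the authors' code; it assumes training histograms and integer labels are available, and uses Laplace smoothing, which is one common choice):

```python
import numpy as np

def train_naive_bayes(histograms, labels, num_classes, alpha=1.0):
    """Estimate log p(c) and log p(codeword | c) from training histograms.

    histograms: (num_images, codebook_size) codeword counts per image.
    labels: (num_images,) integer class labels.
    alpha: Laplace smoothing constant (an assumption, not from the slides).
    """
    histograms = np.asarray(histograms, dtype=float)
    labels = np.asarray(labels)
    num_images, codebook_size = histograms.shape
    log_prior = np.zeros(num_classes)
    log_likelihood = np.zeros((num_classes, codebook_size))
    for c in range(num_classes):
        class_hists = histograms[labels == c]
        log_prior[c] = np.log(len(class_hists) / num_images)
        word_counts = class_hists.sum(axis=0) + alpha
        log_likelihood[c] = np.log(word_counts / word_counts.sum())
    return log_prior, log_likelihood

def classify(histogram, log_prior, log_likelihood):
    """c* = argmax_c [ log p(c) + sum_n log p(w_n | c) ]."""
    scores = log_prior + log_likelihood @ np.asarray(histogram, dtype=float)
    return int(np.argmax(scores))
```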

Results (Csurka et al. 2004).

More results: examples of images correctly classified even when most of the keypoints lie on the background rather than on the objects.

More results: examples of images correctly classified in the presence of partial views and varying viewpoints.

Summary. We have studied one possible extension of the Bayes classifier to multiple dimensions. This extension, the Naïve Bayes classifier, treats all features of an object as independent random variables. We can build object and image representations (for example, Bag of Words) that respect this independence assumption. Next: the general Bayes classifier.

Multivariate normal Bayesian classification. For multiple classes in a p-dimensional feature space, each class ω_i has its own mean vector m_i and covariance matrix C_i. The class-conditional probabilities are

p(x \mid \omega_i) = \frac{1}{(2\pi)^{p/2} |C_i|^{1/2}} \exp\!\left( -\tfrac{1}{2} (x - m_i)^T C_i^{-1} (x - m_i) \right).

Mahalanobis distance. The expression

d_i(x) = \sqrt{ (x - m_i)^T C_i^{-1} (x - m_i) }

can be considered as a distance between the feature vector x and class ω_i, where C_i is the covariance matrix computed for class i. It can be proven that the Mahalanobis distance satisfies all properties of a distance function.
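A minimal sketch of computing this distance with NumPy (illustrative; the mean and covariance here are estimated from hypothetical training samples of one class):

```python
import numpy as np

def mahalanobis_distance(x, mean, cov):
    """d(x) = sqrt((x - m)^T C^{-1} (x - m)).

    Uses a linear solve instead of an explicit matrix inverse
    for better numerical behaviour.
    """
    diff = x - mean
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))

# Hypothetical training samples of one class (rows are feature vectors)
rng = np.random.default_rng(0)
samples = rng.normal(loc=[2.0, -1.0], scale=[1.0, 0.5], size=(500, 2))

m = samples.mean(axis=0)
C = np.cov(samples, rowvar=False)

print(mahalanobis_distance(np.array([2.5, -0.8]), m, C))
```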

Equivalence between classifiers. Pattern recognition using multivariate normal class models with equal priors (and equal class covariance matrices, so that the |C_i| term is common to all classes) is simply a minimum Mahalanobis distance classifier.
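A short derivation supporting this statement (my own filling-in of the step, starting from the class-conditional density above):

```latex
% Discriminant from the log of the class-conditional density times the prior,
% g_i(x) = \ln \big[ p(x \mid \omega_i)\, P(\omega_i) \big]:
g_i(x) = -\tfrac{1}{2} (x - m_i)^T C_i^{-1} (x - m_i)
         - \tfrac{1}{2} \ln |C_i|
         - \tfrac{p}{2} \ln (2\pi)
         + \ln P(\omega_i)

% With equal priors P(\omega_i) and equal covariance matrices C_i = C,
% the last three terms are identical for every class, so maximizing g_i(x)
% is equivalent to minimizing the squared Mahalanobis distance
% (x - m_i)^T C^{-1} (x - m_i).
```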