Categorizing objects: global and part-based models of appearance. UT Austin. The generic categorization problem.

Challenges: robustness. Realistic scenes are crowded, cluttered, and have overlapping objects. Generic category recognition: basic framework. Build/train the object model: choose a representation; learn or fit the parameters of the model / classifier. Generate candidates in a new image. Score the candidates.

Generic category recognition: representation choice. Window-based vs. part-based. Window-based models: building an object model. Simple holistic descriptions of image content: grayscale / color histogram, vector of pixel intensities.

Window-based models: building an object model. Pixel-based representations are sensitive to small shifts. Color or grayscale-based appearance descriptions can be sensitive to illumination and intra-class appearance variation. Instead, consider edges, contours, and (oriented) intensity gradients.

Window-based models: building an object model. Consider edges, contours, and (oriented) intensity gradients. Summarize the local distribution of gradients with a histogram. Locally orderless: offers invariance to small shifts and rotations. Contrast normalization: try to correct for variable illumination. Given the representation, train a binary classifier (e.g., car/non-car): the classifier answers "Yes, a car" or "No, not a car."

Discriminative classifier construction. Nearest neighbor (10^6 examples; Shakhnarovich, Viola, Darrell 2003; Berg, Berg, Malik 2005). Neural networks (LeCun, Bottou, Bengio, Haffner 1998; Rowley, Baluja, Kanade 1998). Support Vector Machines (Guyon, Vapnik; Heisele, Serre, Poggio 2001). Boosting (Viola, Jones 2001; Torralba et al. 2004; Opelt et al. 2006). Conditional Random Fields (McCallum, Freitag, Pereira 2000; Kumar, Hebert 2003). (Slide adapted from Antonio Torralba.) Generic category recognition: basic framework. Build/train the object model: choose a representation; learn or fit the parameters of the model / classifier. Generate candidates in a new image. Score the candidates.

Window-based models: generating and scoring candidates with the car/non-car classifier. Window-based object detection, recap. Training: 1. Obtain training data. 2. Define features. 3. Define the classifier. Given a new image: 1. Slide a window. 2. Score each window with the classifier (training examples, feature extraction, car/non-car classifier).
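
To make the recap concrete, here is a minimal sliding-window sketch in Python/NumPy. The feature extractor and trained classifier passed in (extract_features, classifier) are hypothetical placeholders standing in for whatever representation and binary classifier the pipeline uses.

```python
import numpy as np

def sliding_window_detect(image, classifier, extract_features,
                          window=(64, 128), step=8, threshold=0.0):
    """Slide a fixed-size window over the image and score each one.

    `extract_features` maps an image patch to a feature vector, and
    `classifier.decision_function` returns a real-valued score; both
    stand in for a model trained offline on labeled windows.
    """
    win_w, win_h = window
    H, W = image.shape[:2]
    detections = []
    for y in range(0, H - win_h + 1, step):
        for x in range(0, W - win_w + 1, step):
            patch = image[y:y + win_h, x:x + win_w]
            feat = extract_features(patch).reshape(1, -1)
            score = classifier.decision_function(feat)[0]
            if score > threshold:
                detections.append((x, y, win_w, win_h, score))
    return detections
```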

What classifier? Factors in choosing: generative or discriminative model? Data resources: how much training data? How is the labeled data prepared? Training time allowance. Test time requirements: real-time? Fit with the representation. Issues: what classifier? What features or representations? How to make it affordable? What categories are amenable?

Issues: what categories are amenable? Similar to specific object matching, we expect spatial layout to be fairly rigidly preserved. Unlike specific object matching, by training classifiers we attempt to capture intra-class variation or determine the required discriminative features. What categories are amenable to window-based representations?

Window-based models: three case studies. Boosting + face detection (Viola & Jones); nearest neighbor + scene Gist classification (e.g., Hays & Efros); SVM + person detection (e.g., Dalal & Triggs). Main idea of the Viola-Jones face detector: represent local texture with efficiently computable rectangular features within a window of interest; select discriminative features to be weak classifiers; use a boosted combination of them as the final classifier; form a cascade of such classifiers, rejecting clear negatives quickly.

Boosting intuition (slide credit: Paul Viola). Boosting illustration: weak classifier 1; weights of misclassified examples increased; weak classifier 2; weights increased again; weak classifier 3; the final classifier is a combination of the weak classifiers.

Boosting: training. Initially, weight each training example equally. In each boosting round: find the weak learner that achieves the lowest weighted training error; raise the weights of the training examples misclassified by the current weak learner. Compute the final classifier as a linear combination of all weak learners (the weight of each learner is directly proportional to its accuracy). The exact formulas for re-weighting and combining weak learners depend on the particular boosting scheme (e.g., AdaBoost). (Slide credit: Lana Lazebnik.) Boosting: pros and cons. Advantages: integrates classification with feature selection; complexity of training is linear in the number of training examples; flexibility in the choice of weak learners and boosting scheme; testing is fast; easy to implement. Disadvantages: needs many training examples; often found not to work as well as an alternative discriminative classifier, the support vector machine (SVM), especially for many-class problems. (Slide credit: Lana Lazebnik.)
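
A minimal sketch of that re-weighting loop (discrete AdaBoost) using depth-1 decision trees as the weak learners; scikit-learn's AdaBoostClassifier implements the full algorithm, so this is only to spell out the mechanics on labels in {-1, +1}.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_train(X, y, n_rounds=50):
    """Discrete AdaBoost with decision stumps; X: (n, d), y: labels in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)                      # start with equal example weights
    stumps, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)         # weak learner fit under current weights
        pred = stump.predict(X)
        err = np.clip(np.sum(w * (pred != y)), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)    # learner weight grows with its accuracy
        w *= np.exp(-alpha * y * pred)           # raise weights of misclassified examples
        w /= w.sum()
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, np.array(alphas)

def adaboost_predict(stumps, alphas, X):
    """Final classifier: sign of the weighted combination of weak learners."""
    scores = sum(a * s.predict(X) for s, a in zip(stumps, alphas))
    return np.sign(scores)
```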

Viola-Jones detector: features. Rectangular filters: the feature output is the difference between sums over adjacent regions, efficiently computable with the integral image (any rectangular sum can be computed in constant time). The integral image value at (x, y) is the sum of all pixels above and to the left of (x, y). Computing the integral image. (Slide credit: Lana Lazebnik.)

Computing the integral image. Cumulative row sum: s(x, y) = s(x-1, y) + i(x, y). Integral image: ii(x, y) = ii(x, y-1) + s(x, y). (Slide credit: Lana Lazebnik.) Computing the sum within a rectangle: let A, B, C, D be the values of the integral image at the corners of the rectangle. Then the sum of the original image values within the rectangle can be computed as sum = A - B - C + D. Only 3 additions are required for any size of rectangle! (Slide credit: Lana Lazebnik.)
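
A short NumPy sketch of the two recurrences and the four-corner rectangle sum; here a rectangle is given by inclusive row/column ranges, and this is only an illustrative implementation, not the OpenCV one.

```python
import numpy as np

def integral_image(img):
    """ii(x, y) = sum of all pixels above and to the left of (x, y), inclusive.
    Equivalent to s(x,y) = s(x-1,y) + i(x,y), ii(x,y) = ii(x,y-1) + s(x,y)."""
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, r0, c0, r1, c1):
    """Sum of img[r0:r1+1, c0:c1+1] from four integral-image values."""
    total = ii[r1, c1]
    if r0 > 0:
        total -= ii[r0 - 1, c1]
    if c0 > 0:
        total -= ii[r1, c0 - 1]
    if r0 > 0 and c0 > 0:
        total += ii[r0 - 1, c0 - 1]
    return total

img = np.arange(16, dtype=float).reshape(4, 4)
ii = integral_image(img)
assert rect_sum(ii, 1, 1, 2, 2) == img[1:3, 1:3].sum()
```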

Viola-Jones detector: features. Rectangular filters: the feature output is the difference between adjacent regions, efficiently computable with the integral image (any sum in constant time). Avoid scaling images: scale the features directly for the same cost. Considering all possible filter parameters (position, scale, and type), there are 180,000+ possible features associated with each 24 x 24 window. Which subset of these features should we use to determine if a window has a face? Use AdaBoost both to select the informative features and to form the classifier.

Viola-Jones detector: AdaBoost. We want to select the single rectangle feature and threshold that best separates positive (face) and negative (non-face) training examples, in terms of weighted error. The resulting weak classifier thresholds the output of one rectangle feature on faces vs. non-faces. For the next round, reweight the examples according to their errors and choose another filter/threshold combination. Viola-Jones face detector results: the first two features selected.

Even if the filters are fast to compute, each new image has a lot of possible windows to search. How can we make detection more efficient? Cascading classifiers for detection: form a cascade with low false negative rates early on; apply less accurate but faster classifiers first to immediately discard windows that clearly appear to be negative.

Viola-Jones detector: summary. Train a cascade of classifiers with AdaBoost (faces vs. non-faces for each new image), yielding the selected features, thresholds, and weights. Trained with 5K positives and 350M negatives; a real-time detector using a 38-layer cascade with 6061 features across all layers. [Implementation available in OpenCV: http://www.intel.com/technology/computing/opencv/] A seminal approach to real-time object detection: training is slow, but detection is very fast. Key ideas: integral images for fast feature evaluation; boosting for feature selection; an attentional cascade of classifiers for fast rejection of non-face windows. P. Viola and M. Jones. Rapid object detection using a boosted cascade of simple features. CVPR 2001. P. Viola and M. Jones. Robust real-time face detection. IJCV 57(2), 2004.
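
Since the slide points to the OpenCV implementation, here is a minimal usage sketch with OpenCV's pretrained frontal-face Haar cascade; the image filename and the detectMultiScale parameters shown are placeholder choices, not values from the slides.

```python
import cv2

# Load the pretrained frontal-face Haar cascade shipped with OpenCV.
cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
face_cascade = cv2.CascadeClassifier(cascade_path)

img = cv2.imread("group_photo.jpg")            # placeholder test image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# detectMultiScale slides and rescales the detection window internally.
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1,
                                      minNeighbors=5, minSize=(24, 24))
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("faces_detected.jpg", img)
```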

Viola-Jones face detector: example results. Detecting profile faces? Can we use the same detector?

Viola-Jones face detector: results (Paul Viola, ICCV tutorial). Example using the Viola-Jones detector: frontal faces are detected and then tracked, and character names are inferred by aligning the script and subtitles. Everingham, M., Sivic, J. and Zisserman, A. "Hello! My name is... Buffy" - Automatic naming of characters in TV video, BMVC 2006. http://www.robots.ox.ac.uk/~vgg/research/nface/index.html

Consumer application: Apple iPhoto. http://www.apple.com/ilife/iphoto/ (Slide credit: Lana Lazebnik.) Things iPhoto thinks are faces. iPhoto can even be trained to recognize pets! http://www.maclife.com/article/news/iphotos_faces_recognizes_cats (Slide credit: Lana Lazebnik.)

Window-based models: three case studies. Boosting + face detection (Viola & Jones); nearest neighbor + scene Gist classification (e.g., Hays & Efros); SVM + person detection (e.g., Dalal & Triggs). Nearest neighbor classification: assign the label of the nearest training data point to each test data point (black = negative, red = positive; from Duda et al.). A novel test example is closest to a positive example from the training set, so we classify it as positive. Voronoi partitioning of feature space for 2-category 2D data.

K-nearest neighbors classification: for a new point, find the k closest points from the training data; the labels of those k points vote to classify it (black = negative, red = positive, k = 5). If the query lands here, the 5 nearest neighbors consist of 3 negatives and 2 positives, so we classify it as negative. (Source: D. Lowe.) A nearest neighbor recognition example.
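
A small sketch of the voting rule with scikit-learn, using k = 5 as on the slide; the toy 2-D points are invented purely for illustration.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Toy 2-D training data: label 0 = negative, 1 = positive.
X_train = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
                    [1.0, 1.0], [0.9, 1.1], [1.2, 0.8]])
y_train = np.array([0, 0, 0, 1, 1, 1])

knn = KNeighborsClassifier(n_neighbors=5)    # the 5 nearest labels vote
knn.fit(X_train, y_train)

query = np.array([[0.4, 0.4]])
print(knn.predict(query))                    # majority label among the 5 NN
print(knn.kneighbors(query))                 # distances and indices of the 5 NN
```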

Where in the World? [Hays and Efros. im2gps: Estimating Geographic Information from a Single Image. CVPR 2008.]

Where in the World? 6+ million geotagged photos by 109,788 photographers, annotated by Flickr users. Which scene properties are relevant?

Spatial envelope theory of scene representation (Oliva & Torralba, 2001): a scene is a single surface that can be represented by global (statistical) descriptors. (Slide credit: Aude Oliva.) Global texture: capturing the Gist of the scene. Capture global image properties while keeping some spatial information (Oliva & Torralba IJCV 2001, Torralba et al. CVPR 2003): the Gist descriptor.

Which scene properties are relevant? Gist scene descriptor; color histograms (L*A*B*, 4x14x14 histograms); texton histograms (512-entry, filter-bank based); line features (histograms of straight-line statistics). Scene matches. [Hays and Efros. im2gps: Estimating Geographic Information from a Single Image. CVPR 2008.]

Scene matches; quantitative evaluation test set. [Hays and Efros. im2gps: Estimating Geographic Information from a Single Image. CVPR 2008.]

The importance of data. [Hays and Efros. im2gps: Estimating Geographic Information from a Single Image. CVPR 2008.] Nearest neighbors: pros and cons. Pros: simple to implement; flexible to feature / distance choices; naturally handles multi-class cases; can do well in practice with enough representative data. Cons: large search problem to find the nearest neighbors; storage of the data; must know we have a meaningful distance function.

Window-based models: three case studies. Boosting + face detection (Viola & Jones); nearest neighbor + scene Gist classification (e.g., Hays & Efros); SVM + person detection (e.g., Dalal & Triggs). Linear classifiers.

Linear classifiers: find a linear function to separate the positive and negative examples. x_i positive: x_i · w + b >= 0; x_i negative: x_i · w + b < 0. Which line is best? Support Vector Machines (SVMs): a discriminative classifier based on the optimal separating line (for the 2D case). Maximize the margin between the positive and negative training examples.

Support vector machines: we want the line that maximizes the margin. x_i positive (y_i = 1): x_i · w + b >= 1; x_i negative (y_i = -1): x_i · w + b <= -1. For support vectors, x_i · w + b = +/-1. The distance between a point x_i and the line is |x_i · w + b| / ||w||; for support vectors this distance is 1 / ||w||, so the margin is M = 2 / ||w||. (C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998.) Finding the maximum margin line: 1. maximize the margin 2 / ||w||; 2. correctly classify all training data points (x_i positive: x_i · w + b >= 1; x_i negative: x_i · w + b <= -1). This is a quadratic optimization problem: minimize (1/2) w^T w subject to y_i (w · x_i + b) >= 1.

Finding the maximum margin line. Solution: w = sum_i alpha_i y_i x_i (each alpha_i is a learned weight; the x_i with nonzero alpha_i are the support vectors). b = y_i - w · x_i (for any support vector). Classification function: f(x) = sign(w · x + b) = sign(sum_i alpha_i y_i x_i · x + b). If f(x) < 0, classify as negative; if f(x) > 0, classify as positive. (C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998.)
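
To connect the formulas to code: a minimal scikit-learn example showing that sign(w · x + b) from a trained linear SVM matches predict, that only the support vectors carry the solution, and that the margin is 2 / ||w||. The toy points are invented for illustration.

```python
import numpy as np
from sklearn.svm import SVC

# Toy linearly separable data with labels in {-1, +1}.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [3., 3.], [3., 4.], [4., 3.]])
y = np.array([-1, -1, -1, 1, 1, 1])

svm = SVC(kernel="linear", C=1e3)      # large C approximates a hard margin
svm.fit(X, y)

w = svm.coef_[0]                       # w = sum_i alpha_i y_i x_i
b = svm.intercept_[0]
print("support vectors:", svm.support_vectors_)
print("margin 2/||w|| =", 2.0 / np.linalg.norm(w))

x_new = np.array([2.0, 2.0])
print(np.sign(w @ x_new + b))          # same answer as svm.predict([x_new])
```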

Person detection with HOGs & linear SVMs (Dalal & Triggs, CVPR 2005). Map each grid cell in the input window to a histogram counting the gradients per orientation. Train a linear SVM using a training set of pedestrian vs. non-pedestrian windows. Code available: http://pascal.inrialpes.fr/soft/olt/

HOG descriptor (Dalal & Triggs, CVPR 2005). Code available: http://pascal.inrialpes.fr/soft/olt/ Person detection with HOGs & linear SVMs: Histograms of Oriented Gradients for Human Detection, Navneet Dalal, Bill Triggs, International Conference on Computer Vision & Pattern Recognition, June 2005. http://lear.inrialpes.fr/pubs/2005/dt05/
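
A sketch of the training side using scikit-image's HOG implementation and a linear SVM. The random 64x128 patches below are stand-ins for real cropped pedestrian / non-pedestrian windows; the HOG parameters are the usual Dalal-Triggs choices, not values taken from the slides.

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

def hog_features(patch):
    """Histogram of oriented gradients for one 64x128 grayscale window."""
    return hog(patch, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm="L2-Hys")

# Placeholder data: in practice these would be cropped training windows.
rng = np.random.default_rng(0)
pos_patches = [rng.random((128, 64)) for _ in range(10)]   # "pedestrian" windows
neg_patches = [rng.random((128, 64)) for _ in range(10)]   # "non-pedestrian" windows

X = np.array([hog_features(p) for p in pos_patches + neg_patches])
y = np.array([1] * len(pos_patches) + [0] * len(neg_patches))

detector = LinearSVC(C=0.01)
detector.fit(X, y)

# At test time, score each sliding window's HOG vector:
# score = detector.decision_function(hog_features(window).reshape(1, -1))
```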

Questions: what if the data are not linearly separable? What if we have more than just two categories? Non-linear SVMs. Datasets that are linearly separable with some noise work out great. But what are we going to do if the dataset is just too hard? How about mapping the data to a higher-dimensional space, e.g., x -> (x, x^2)?

Non-linear SVMs: feature spaces. General idea: the original input space can be mapped to some higher-dimensional feature space where the training set is separable: Phi: x -> phi(x). (Slide from Andrew Moore's tutorial: http://www.autonlab.org/tutorials/svm.html) The kernel trick: the linear classifier relies on the dot product between vectors, K(x_i, x_j) = x_i^T x_j. If every data point is mapped into a high-dimensional space via some transformation Phi: x -> phi(x), the dot product becomes K(x_i, x_j) = phi(x_i)^T phi(x_j). A kernel function is a similarity function that corresponds to an inner product in some expanded feature space. (Slide from Andrew Moore's tutorial: http://www.autonlab.org/tutorials/svm.html)

Example with 2-dimensional vectors x = [x_1, x_2]: let K(x_i, x_j) = (1 + x_i^T x_j)^2. We need to show that K(x_i, x_j) = phi(x_i)^T phi(x_j): K(x_i, x_j) = (1 + x_i^T x_j)^2 = 1 + x_i1^2 x_j1^2 + 2 x_i1 x_j1 x_i2 x_j2 + x_i2^2 x_j2^2 + 2 x_i1 x_j1 + 2 x_i2 x_j2 = [1, x_i1^2, sqrt(2) x_i1 x_i2, x_i2^2, sqrt(2) x_i1, sqrt(2) x_i2]^T [1, x_j1^2, sqrt(2) x_j1 x_j2, x_j2^2, sqrt(2) x_j1, sqrt(2) x_j2] = phi(x_i)^T phi(x_j), where phi(x) = [1, x_1^2, sqrt(2) x_1 x_2, x_2^2, sqrt(2) x_1, sqrt(2) x_2]. Nonlinear SVMs: with the kernel trick, instead of explicitly computing the lifting transformation phi(x), we define a kernel function K such that K(x_i, x_j) = phi(x_i) · phi(x_j). This gives a nonlinear decision boundary in the original feature space: sum_i alpha_i y_i K(x_i, x) + b.
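
The identity above can be checked numerically; a quick NumPy sketch compares (1 + x · y)^2 with the explicit 6-dimensional feature map phi.

```python
import numpy as np

def phi(x):
    """Explicit lifting for the degree-2 polynomial kernel K(x, y) = (1 + x.y)^2, 2-D input."""
    x1, x2 = x
    return np.array([1.0, x1**2, np.sqrt(2) * x1 * x2, x2**2,
                     np.sqrt(2) * x1, np.sqrt(2) * x2])

rng = np.random.default_rng(0)
x, y = rng.normal(size=2), rng.normal(size=2)

k_direct = (1.0 + x @ y) ** 2        # kernel evaluated in the original 2-D space
k_lifted = phi(x) @ phi(y)           # inner product in the 6-D feature space
assert np.isclose(k_direct, k_lifted)
```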

Examples of kernel functions. Linear: K(x_i, x_j) = x_i^T x_j. Gaussian RBF: K(x_i, x_j) = exp(-||x_i - x_j||^2 / (2 sigma^2)). Histogram intersection: K(x_i, x_j) = sum_k min(x_i(k), x_j(k)). SVMs for recognition: 1. Define your representation for each example. 2. Select a kernel function. 3. Compute pairwise kernel values between the labeled examples. 4. Use this kernel matrix to solve for the SVM support vectors & weights. 5. To classify a new example: compute kernel values between the new input and the support vectors, apply the weights, and check the sign of the output.
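
The three kernels on the slide are one-liners in NumPy. The sketch below also shows the precomputed-kernel route in scikit-learn, which mirrors the recipe of building the pairwise kernel matrix and then solving for support vectors and weights; the toy "histogram" features and labels are arbitrary.

```python
import numpy as np
from sklearn.svm import SVC

def linear_kernel(xi, xj):
    return xi @ xj

def gaussian_rbf_kernel(xi, xj, sigma=1.0):
    return np.exp(-np.sum((xi - xj) ** 2) / (2 * sigma ** 2))

def histogram_intersection_kernel(xi, xj):
    return np.sum(np.minimum(xi, xj))

def kernel_matrix(A, B, k):
    """Pairwise kernel values between the rows of A and the rows of B."""
    return np.array([[k(a, b) for b in B] for a in A])

# Toy nonnegative "histogram" features, just to exercise the recipe.
rng = np.random.default_rng(0)
X_train = rng.random((20, 8))
y_train = np.array([0] * 10 + [1] * 10)
X_test = rng.random((5, 8))

K_train = kernel_matrix(X_train, X_train, histogram_intersection_kernel)
svm = SVC(kernel="precomputed").fit(K_train, y_train)

K_test = kernel_matrix(X_test, X_train, histogram_intersection_kernel)
print(svm.predict(K_test))   # kernel values vs. training (incl. support) examples
```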

Questions: what if the data are not linearly separable? What if we have more than just two categories? Multi-class SVMs: achieve a multi-class classifier by combining a number of binary classifiers. One vs. all: training, learn an SVM for each class vs. the rest; testing, apply each SVM to the test example and assign it the class of the SVM that returns the highest decision value. One vs. one: training, learn an SVM for each pair of classes; testing, each learned SVM votes for a class to assign to the test example.
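
A compact sketch of the one-vs-all scheme; scikit-learn's multi-class classifiers already handle this internally, so the code below only spells out the decision rule, with toy 3-class data invented to exercise it.

```python
import numpy as np
from sklearn.svm import LinearSVC

def train_one_vs_all(X, y):
    """Learn one binary SVM per class (that class vs. the rest)."""
    classes = np.unique(y)
    return classes, [LinearSVC().fit(X, (y == c).astype(int)) for c in classes]

def predict_one_vs_all(classes, svms, X):
    """Assign each example the class whose SVM returns the highest decision value."""
    scores = np.column_stack([svm.decision_function(X) for svm in svms])
    return classes[np.argmax(scores, axis=1)]

# Toy 3-class data.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, size=(20, 2)) for c in (0, 3, 6)])
y = np.repeat([0, 1, 2], 20)

classes, svms = train_one_vs_all(X, y)
print(predict_one_vs_all(classes, svms, X[:5]))
```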

SVMs: pros and cons. Pros: the kernel-based framework is very powerful and flexible; often a sparse set of support vectors, so compact at test time; works very well in practice, even with very small training sample sizes. Cons: no direct multi-class SVM, must combine two-class SVMs; can be tricky to select the best kernel function for a problem; computation and memory: during training, must compute the matrix of kernel values for every pair of examples, and learning can take a very long time for large-scale problems. (Adapted from Lana Lazebnik.) Scoring a sliding window detector: if the prediction and ground truth are bounding boxes, when do we have a correct detection?

Scoring a sliding window detector: compute the overlap a_o = area(B_p intersect B_gt) / area(B_p union B_gt); the detection is correct when a_o > 0.5. That is, we'll say a detection is correct (a true positive) if the intersection of the bounding boxes, divided by their union, is > 50%. Scoring an object detector: if the detector can produce a confidence score on the detections, then we can plot the rate of true vs. false positives as a threshold on the confidence is varied. TPR = fraction of positive examples that are correctly labeled. FPR = fraction of negative examples that are misclassified as positive.
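
A small function implementing the >50% intersection-over-union test, with boxes given as (x, y, width, height); the pair checked at the end is a made-up example.

```python
def iou(box_p, box_gt):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    xp, yp, wp, hp = box_p
    xg, yg, wg, hg = box_gt
    ix = max(0.0, min(xp + wp, xg + wg) - max(xp, xg))   # overlap width
    iy = max(0.0, min(yp + hp, yg + hg) - max(yp, yg))   # overlap height
    inter = ix * iy
    union = wp * hp + wg * hg - inter
    return inter / union if union > 0 else 0.0

def is_correct_detection(box_p, box_gt, thresh=0.5):
    """True positive if area(Bp ∩ Bgt) / area(Bp ∪ Bgt) exceeds the threshold."""
    return iou(box_p, box_gt) > thresh

print(iou((0, 0, 10, 10), (5, 0, 10, 10)))   # 0.333..., so not a correct detection
```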

Window-based detection: strengths. Sliding window detection and global appearance descriptors: a simple detection protocol to implement; good feature choices are critical; past successes for certain classes. Window-based detection: limitations. High computational complexity, for example: 250,000 locations x 30 orientations x 4 scales = 30,000,000 evaluations! If binary detectors are trained independently, the cost increases linearly with the number of classes. With so many windows, the false positive rate had better be low.

Limitations (continued): not all objects are box shaped. Non-rigid, deformable objects are not captured well with representations assuming a fixed 2D structure, or we must assume a fixed viewpoint. Objects with less-regular textures are not captured well with holistic appearance-based descriptions.

Limitations (continued): if windows are considered in isolation, context is lost (the sliding window is the detector's view; figure credit: Derek Hoiem). In practice, the approach often entails a large, cropped training set (expensive). Requiring a good match to a global appearance description can lead to sensitivity to partial occlusions. (Image credit: Adam, Rivlin, & Shimshoni.)

Summary. Basic pipeline for window-based detection: model/representation/classifier choice; sliding window and classifier scoring. Discriminative classifiers for window-based representations: boosting (Viola-Jones face detector example); nearest neighbors (scene recognition example); support vector machines (HOG person detection example). Pros and cons of window-based detection.