Lecture 16: Object recognition: Part-based generative models

1 Lecture 16: Object recognition: Part-based generative models Professor Stanford Vision Lab 1

2 What will we learn today?
- Introduction
- Constellation model
- Weakly supervised training
- One-shot learning (Problem Set 4 (Q1))

3 Challenges: intra-class variation 3

4 Usual challenges: variability due to viewpoint, illumination, and occlusions 4

5 Basic issues
- Representation: 2D Bag of Words (BoW) models; part-based models; multi-view models
- Learning: generative & discriminative BoW models; generative models; probabilistic Hough voting
- Recognition: classification with BoW; classification with part-based models

7 Basic issues
- Representation: 2D Bag of Words (BoW) models; part-based models; multi-view models (Lecture #19)
- Learning: generative & discriminative BoW models; generative models; probabilistic Hough voting
- Recognition: classification with BoW; classification with part-based models

8 Problem with bag-of-words
- All arrangements of the same features have equal probability under bag-of-words methods
- Location information is important
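The point of this slide can be checked with a few lines of code. The sketch below is only an illustration: the visual-word names, the coordinates, and the helper `bow_histogram` are all hypothetical and do not come from the lecture. It shows that a bag-of-words histogram is unchanged when the same features are moved to arbitrary locations.

```python
from collections import Counter

def bow_histogram(detections):
    """Count visual words, discarding the (x, y) location of each detection."""
    return Counter(word for word, _xy in detections)

# Hypothetical detections, each given as (visual word, (x, y)).
face = [("eye", (30, 40)), ("eye", (70, 40)), ("nose", (50, 60)), ("mouth", (50, 85))]
scrambled = [("eye", (50, 85)), ("eye", (10, 10)), ("nose", (90, 20)), ("mouth", (30, 5))]

# The two layouts differ, yet their histograms are identical.
same_histogram = bow_histogram(face) == bow_histogram(scrambled)
```

Any classifier that sees only the histogram cannot tell the two layouts apart, which is the motivation for modelling part locations explicitly.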

9 Model: Parts and Structure 9

10 Parts and Structure Literature: Fischler & Elschlager 1973; Yuille 91; Brunelli & Poggio 93; Lades, v.d. Malsburg et al. 93; Cootes, Lanitis, Taylor et al. 95; Amit & Geman 95, 99; Perona et al. 95, 96, 98, 00, 03; Huttenlocher et al. 00; Agarwal & Roth 02; etc. 10

11 The Constellation Model
- T. Leung: Representation. Shape statistics (F&G 95); affine invariant shape (CVPR 98)
- M. Burl: Detection (CVPR 96, ECCV 98)
- M. Weber, M. Welling: Unsupervised learning (ECCV 00); multiple views (F&G 00); discovering categories (CVPR 00)
- R. Fergus: Joint shape & appearance learning; generic feature detectors (CVPR 03); polluted datasets (ECCV 04)
- L. Fei-Fei: One-shot learning; incremental learning (ICCV 03, CVPR 04)

12 Deformations A B C D 12

13 Presence / Absence of Features occlusion 13

14 Background clutter 14

15 Generative probabilistic model
- Foreground model: Gaussian shape pdf; prob. of detection
- Clutter model: uniform shape pdf; Poisson pdf on # detections
- Assumptions: (a) clutter independent of foreground detections; (b) clutter detections independent of each other
- Example:
  1. Object part positions
  2. Part absence
  3a. Number of false detections: $p_{\text{Poisson}}(N_1 \mid \lambda_1)$, $p_{\text{Poisson}}(N_2 \mid \lambda_2)$, $p_{\text{Poisson}}(N_3 \mid \lambda_3)$
  3b. Positions of the $N_1$, $N_2$, $N_3$ false detections
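As a rough illustration of the clutter model, the sketch below scores the hypothesis that every detection in an image is clutter. It relies on the two independence assumptions from the slide, so the log-probabilities simply add up. The function names, the `image_area` parameter, and the example numbers are assumptions made for this sketch, not part of the lecture.

```python
import math

def log_poisson(n, lam):
    """Log of the Poisson pmf P(N = n) for a Poisson with mean lam."""
    return -lam + n * math.log(lam) - math.lgamma(n + 1)

def log_clutter(counts, lams, image_area):
    """Log-probability that all detections are clutter.

    counts[k] is the number of detections of part type k and lams[k] is the
    Poisson mean for that part type. Each clutter detection is placed
    uniformly over the image, contributing a density of 1 / image_area.
    """
    total = 0.0
    for n, lam in zip(counts, lams):
        total += log_poisson(n, lam) - n * math.log(image_area)
    return total

# Hypothetical example: three part types with 2, 0 and 1 detections.
example = log_clutter([2, 0, 1], [1.5, 0.5, 1.0], image_area=100.0 * 100.0)
```

The foreground hypothesis would add a Gaussian shape term and a detection probability for each part on top of this clutter term.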

16 Learning models "manually"
- Obtain set of training images
- Choose parts
- Label parts by hand, train detectors
- Learn model from labeled parts

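17 Recognition
1. Run part detectors exhaustively over image, giving $N_k$ detections for each part type $k$ [example of a hypothesis vector $h$ with one entry per part; values not recoverable]
2. Try different combinations of detections in model
   - Allow detections to be missing (occlusion)
3. Pick hypothesis which maximizes: $\dfrac{p(\text{Data} \mid \text{Object}, \text{Hyp})}{p(\text{Data} \mid \text{Clutter}, \text{Hyp})}$
4. If ratio is above threshold, then instance detected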
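The recognition steps above amount to an exhaustive search over hypotheses. The following is a minimal sketch of that search, assuming a caller-supplied log-likelihood-ratio function; `enumerate_hypotheses`, `detect`, and `toy_log_ratio` are hypothetical names, and the toy scorer is a stand-in for the real ratio of object and clutter likelihoods.

```python
from itertools import product

def enumerate_hypotheses(num_detections_per_part):
    """Yield one tuple per hypothesis.

    Entry k is the index of the detection assigned to part k, or None if
    part k is treated as missing (occluded or not detected).
    """
    choices = [[None] + list(range(n)) for n in num_detections_per_part]
    return product(*choices)

def detect(num_detections_per_part, log_likelihood_ratio, threshold):
    """Return the best hypothesis and whether its score clears the threshold."""
    best = max(enumerate_hypotheses(num_detections_per_part), key=log_likelihood_ratio)
    return best, log_likelihood_ratio(best) > threshold

def toy_log_ratio(hypothesis):
    """Stand-in scorer: penalise missing parts, prefer low detection indices."""
    return sum(-2.0 if d is None else 1.0 - d for d in hypothesis)

hypotheses = list(enumerate_hypotheses([2, 1, 3]))
best, detected = detect([2, 1, 3], toy_log_ratio, threshold=0.0)
```

With $N_k$ detections for part $k$ there are $\prod_k (N_k + 1)$ hypotheses, which is why the later slides call for efficient search techniques.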

18 So far...
- Representation: joint model of part locations; ability to deal with background clutter and occlusions
- Learning: manual construction of part detectors; estimate parameters of shape density
- Recognition: run part detectors over image; try combinations of features in model; use efficient search techniques to make it fast

19 Unsupervised Learning Weber & Welling et al. 19

20 (Semi) Unsupervised learning
- Know if image contains object or not
- But no segmentation of object or manual selection of features

21 Unsupervised detector training - 1
- Highly textured neighborhoods are selected automatically
- This produces a set of patterns for each image

22 Unsupervised detector training - 2 Pattern Space (100+ dimensions) 22

23 Unsupervised detector training images ~100 detectors 23

24 Learning
- Take training images. Pick set of detectors. Apply detectors.
- Task: estimation of model parameters
- Chicken-and-egg type problem, since we initially know neither:
  - Model parameters
  - Assignment of regions to foreground / background
- Let the assignments be a hidden variable and use the EM algorithm to learn them and the model parameters

25 ML using EM
1. Current estimate of pdf
2. Assign probabilities to constellations in each image (Image 1, Image 2, ..., Image i): large P for constellations that fit the current pdf, small P otherwise
3. Use probabilities as weights to re-estimate parameters. Example for µ: (large P) · x + (small P) · x + ... = new estimate of µ
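A toy version of this loop makes the weighting concrete. The sketch below is a deliberate simplification, not the lecture's model: positions are one-dimensional, each image is assumed to contain exactly one foreground candidate drawn from a Gaussian with unknown mean `mu` and fixed variance, and clutter is uniform, so its density cancels in the E-step. All names and numbers are hypothetical.

```python
import math

def e_step(candidates, mu, var):
    """Posterior probability that each candidate in one image is the foreground part."""
    dens = [math.exp(-0.5 * (x - mu) ** 2 / var) for x in candidates]
    z = sum(dens)
    return [d / z for d in dens]

def m_step(images, weights):
    """Re-estimate mu as the probability-weighted mean of all candidates."""
    num = sum(w * x for cand, ws in zip(images, weights) for x, w in zip(cand, ws))
    den = sum(w for ws in weights for w in ws)
    return num / den

def em_mean(images, mu, var=1.0, iters=20):
    """Alternate E and M steps for a fixed number of iterations."""
    for _ in range(iters):
        weights = [e_step(cand, mu, var) for cand in images]
        mu = m_step(images, weights)
    return mu

# Hypothetical candidate positions in three images; one per image lies near 5.
images = [[5.1, 0.3, 9.0], [4.8, 8.2], [5.3, 1.0, 7.5]]
mu_hat = em_mean(images, mu=4.0)
```

Candidates close to the current estimate receive large weights and dominate the update, while distant candidates contribute almost nothing, which mirrors the "large P" and "small P" terms on the slide.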

26 Detector selection
- Try out different combinations of detectors (greedy search) from the pool of detectors (~100)
- For each choice (Choice 1, Choice 2, ...), run parameter estimation to obtain a model (Model 1, Model 2, ...)
- Predict / measure model performance (validation set or directly from model)
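One simple way to realise a greedy search is forward selection: add, one at a time, the detector that most improves a performance score. This is an assumption about the search strategy, since the slide does not spell out the exact procedure, and the scoring function here is a toy stand-in for "estimate parameters, then measure performance". The names `greedy_select`, `quality`, and `toy_score` are hypothetical.

```python
def greedy_select(candidates, num_parts, score):
    """Forward selection: repeatedly add the detector that maximises the score."""
    chosen = []
    for _ in range(num_parts):
        remaining = [c for c in candidates if c not in chosen]
        best = max(remaining, key=lambda c: score(chosen + [c]))
        chosen.append(best)
    return chosen

# Hypothetical per-detector quality; a real score would train and validate a model.
quality = {"d1": 0.6, "d2": 0.2, "d3": 0.9}

def toy_score(subset):
    """Stand-in for validation performance of a model built from this subset."""
    return sum(quality[d] for d in subset)

selected = greedy_select(["d1", "d2", "d3"], 2, toy_score)
```

Because each call to the real scoring function involves a full run of parameter estimation, the number of score evaluations is what makes learning slow.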

27 Frontal views of faces
- 200 images (100 training, 100 testing)
- 30 people, different for training and testing

28 Learned face model. Test error: 6% (4 parts). [Figure panels: pre-selected parts; parts in model; model foreground pdf; sample detection] 28

29 Face images 29

30 Background images 30

31 Car from rear. Test error: 13% (5 parts). [Figure panels: pre-selected parts; parts in model; model foreground pdf; sample detection] 31

32 Detections of Cars 32

33 Background Images 33

34 3D Object recognition Multiple mixture components 34

35 3D orientation tuning [Plot: % correct vs. angle in degrees, from frontal to profile] 35

36 So far (2)...
- Representation: multiple mixture components for different viewpoints
- Learning: now semi-unsupervised; automatic construction and selection of part detectors; estimation of parameters using EM
- Recognition: as before
- Issues:
  - Learning is slow (many combinations of detectors)
  - Appearance learnt first, then shape

37 Issues
- Speed of learning: slow (many combinations of detectors)
- Appearance learnt first, then shape: difficult to learn a part that has stable location but variable appearance
- Each detector is used as a cross-correlation filter, giving a hard definition of the part's appearance
- Would like a fully probabilistic representation of the object

38 Object categorization Fergus et al. CVPR 03, IJCV 06 38

39 Detection & representation of regions
- Find regions within image: use salient region operator (Kadir & Brady 01)
- Location: (x,y) coords. of region centre
- Scale: radius of region (pixels)
- Appearance: normalize 11x11 patch, then project onto PCA basis to obtain coefficients $c_1, c_2, \ldots, c_{15}$; this gives a representation of appearance in a low-dimensional vector space
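The appearance step can be sketched as follows. The slide does not say how the patch is normalized, so zero mean and unit norm is an assumption here, and the basis used in the example is a placeholder (15 unit vectors) rather than a learned PCA basis. The function names are hypothetical.

```python
import math

def normalize_patch(patch):
    """Flatten a square patch and rescale it to zero mean and unit norm (assumed scheme)."""
    flat = [float(v) for row in patch for v in row]
    mean = sum(flat) / len(flat)
    centered = [v - mean for v in flat]
    norm = math.sqrt(sum(v * v for v in centered))
    return [v / norm for v in centered] if norm > 0 else centered

def project(vec, basis):
    """Project a flattened patch onto each basis vector, one coefficient per row."""
    return [sum(b * v for b, v in zip(row, vec)) for row in basis]

# Hypothetical 11x11 patch with a simple intensity ramp.
patch = [[r * 11 + c for c in range(11)] for r in range(11)]
vec = normalize_patch(patch)

# Placeholder basis: 15 unit vectors of length 121 standing in for PCA components.
basis = [[1.0 if j == i else 0.0 for j in range(121)] for i in range(15)]
coeffs = project(vec, basis)
```

In practice the rows of `basis` would be the leading principal components estimated from many training patches, so that 121 pixel values are summarized by 15 coefficients.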

40 Kadir & Brady saliency region detector Motorbikes example 40

41 Generative probabilistic model (2), based on Burl, Weber et al. [ECCV 98, 00]
- Foreground model: Gaussian shape pdf; Gaussian part appearance pdf; Gaussian relative scale pdf over log(scale); prob. of detection
- Clutter model: uniform shape pdf; Gaussian background appearance pdf; uniform relative scale pdf over log(scale); Poisson pdf on # detections

42 Motorbikes Samples from appearance model 42

43 Recognized Motorbikes 43

44 Background images evaluated with motorbike model 44

45 Frontal faces 45

46 Airplanes 46

47 Spotted cats 47

48 Summary of results (% equal error rate)
Columns: Dataset | Fixed scale experiment | Scale invariant experiment
Rows: Motorbikes; Faces; Airplanes; Cars (Rear); Spotted cats [values not recovered]
Note: within each series, the same settings were used for all datasets
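The results are reported as equal error rates, that is, the error at the threshold where the false positive rate equals the false negative rate. The sketch below shows one straightforward way to estimate it from classifier scores; the function name and the example scores are hypothetical and unrelated to the lecture's experiments.

```python
def equal_error_rate(fg_scores, bg_scores):
    """Estimate the error rate at the threshold where FNR and FPR are closest."""
    best_gap, best_rate = None, None
    for t in sorted(set(fg_scores) | set(bg_scores)):
        # Foreground images scoring below t are missed detections.
        fnr = sum(s < t for s in fg_scores) / len(fg_scores)
        # Background images scoring at or above t are false alarms.
        fpr = sum(s >= t for s in bg_scores) / len(bg_scores)
        gap = abs(fnr - fpr)
        if best_gap is None or gap < best_gap:
            best_gap, best_rate = gap, (fnr + fpr) / 2
    return best_rate

# Hypothetical likelihood-ratio scores for object and background test images.
eer = equal_error_rate([0.9, 0.8, 0.7, 0.3], [0.6, 0.4, 0.2, 0.1])
```

Here one of four object images and one of four background images are misclassified at the balancing threshold, so the estimate is 25%.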

49 Comparison to other methods (% equal error rate)
Columns: Dataset | Ours | Others
- Motorbikes: Weber et al. [ECCV 00]
- Faces: Weber
- Airplanes: Weber
- Cars (Side): Agarwal & Roth [ECCV 02]
[values not recovered]

50 Why this design?
- Generic features seem to do well in finding consistent parts of the object
- Some categories perform badly: different feature types needed
- Why PCA representation? Tried ICA, FLD, oriented filter responses, etc., but PCA worked best
- Fully probabilistic representation lets us use tools from the machine learning community

51 S. Savarese,

52 52 P. Buegel, 1562

53 One-Shot learning Fei-Fei et al. ICCV 03, PAMI 06 53

54 Algorithm | Training examples | Categories
Burl et al.; Weber et al.; Fergus et al. | 200 ~ 400 | Faces, Motorbikes, Spotted cats, Airplanes, Cars
Viola et al. | ~10,000 | Faces
Schneiderman et al. | ~2,000 | Faces, Cars
Rowley et al. | ~500 | Faces

55 Number of training examples [Plot: generalisation performance of a 6-part motorbike model; classification error (%) on train and test sets vs. log2(training images); the regime of earlier work is marked "Previously"] 55

56 How do we do better than what statisticians have told us?
- Intuition 1: use prior information
- Intuition 2: make best use of training information

57 Prior knowledge: means Appearance Shape 57

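58 Bayesian framework
- Compare $P(\text{object} \mid \text{test}, \text{train})$ vs. $P(\text{clutter} \mid \text{test}, \text{train})$
- Bayes rule: $P(\text{object} \mid \text{test}, \text{train}) \propto p(\text{test} \mid \text{object}, \text{train})\, p(\text{object})$
- Expansion by parametrization: $p(\text{test} \mid \text{object}, \text{train}) = \int p(\text{test} \mid \theta, \text{object})\, p(\theta \mid \text{object}, \text{train})\, d\theta$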

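59 Bayesian framework
- Compare $P(\text{object} \mid \text{test}, \text{train})$ vs. $P(\text{clutter} \mid \text{test}, \text{train})$
- Bayes rule: $P(\text{object} \mid \text{test}, \text{train}) \propto p(\text{test} \mid \text{object}, \text{train})\, p(\text{object})$
- Expansion by parametrization: $p(\text{test} \mid \text{object}, \text{train}) = \int p(\text{test} \mid \theta, \text{object})\, p(\theta \mid \text{object}, \text{train})\, d\theta$
- Previous work: $p(\theta \mid \text{object}, \text{train}) \approx \delta(\theta - \theta^{\text{ML}})$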

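60 Bayesian framework
- Compare $P(\text{object} \mid \text{test}, \text{train})$ vs. $P(\text{clutter} \mid \text{test}, \text{train})$
- Bayes rule: $P(\text{object} \mid \text{test}, \text{train}) \propto p(\text{test} \mid \text{object}, \text{train})\, p(\text{object})$
- Expansion by parametrization: $p(\text{test} \mid \text{object}, \text{train}) = \int p(\text{test} \mid \theta, \text{object})\, p(\theta \mid \text{object}, \text{train})\, d\theta$
- One-shot learning: $p(\theta \mid \text{object}, \text{train}) \propto p(\text{train} \mid \theta, \text{object})\, p(\theta)$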
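To see what integrating over θ buys compared with plugging in a single ML estimate, consider a one-dimensional toy case. This is an illustrative simplification, not the lecture's model: the test datum is Gaussian with unknown mean θ and known noise variance, and the posterior over θ is taken to be Gaussian. All function names and numbers are hypothetical.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def predictive_by_integration(x_test, post_mean, post_var, noise_var,
                              steps=4001, width=10.0):
    """Trapezoid-rule estimate of the integral of p(x_test | theta) p(theta | train)."""
    sd = math.sqrt(post_var)
    lo, hi = post_mean - width * sd, post_mean + width * sd
    h = (hi - lo) / (steps - 1)
    total = 0.0
    for i in range(steps):
        theta = lo + i * h
        weight = 0.5 if i in (0, steps - 1) else 1.0
        total += weight * normal_pdf(x_test, theta, noise_var) * normal_pdf(theta, post_mean, post_var)
    return total * h

def predictive_plug_in(x_test, theta_ml, noise_var):
    """Delta-function approximation: evaluate the likelihood at one estimate of theta."""
    return normal_pdf(x_test, theta_ml, noise_var)

integrated = predictive_by_integration(2.0, post_mean=0.0, post_var=1.0, noise_var=1.0)
plug_in = predictive_plug_in(2.0, theta_ml=0.0, noise_var=1.0)
```

For this Gaussian case the integral has the closed form of a Gaussian with variance `noise_var + post_var`, so the integrated prediction is broader than the plug-in one. With very few training images the posterior is wide, and that extra width is exactly what the plug-in approximation ignores.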

61 Model (θ) space
- Model structure: each object model θ has a Gaussian shape pdf and a Gaussian part appearance pdf
- [Diagram: models $\theta_1, \theta_2, \ldots, \theta_n$ as points in model space]

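62 Model (θ) space
- Model structure: each object model θ has a Gaussian shape pdf and a Gaussian part appearance pdf
- [Diagram: models $\theta_1, \theta_2, \ldots, \theta_n$ as points in model space]
- Model distribution: $p(\theta)$, chosen as the conjugate distribution of $p(\text{train} \mid \theta, \text{object})$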
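Choosing a conjugate prior means the posterior stays in the same family and can be updated in closed form. The sketch below shows the simplest textbook instance, a Normal prior on the mean of a Gaussian with known variance. It is only an illustration under that assumption: the lecture's model also has unknown covariances, which would need a richer conjugate family. The function name and numbers are hypothetical.

```python
def normal_posterior(prior_mean, prior_var, data, noise_var):
    """Closed-form posterior over a Gaussian mean under a Normal prior.

    Precisions (inverse variances) add, and the posterior mean is a
    precision-weighted combination of the prior mean and the data.
    """
    n = len(data)
    post_var = 1.0 / (1.0 / prior_var + n / noise_var)
    post_mean = post_var * (prior_mean / prior_var + sum(data) / noise_var)
    return post_mean, post_var

# One training example pulls the estimate halfway from the prior mean to the datum.
one_shot = normal_posterior(prior_mean=0.0, prior_var=1.0, data=[2.0], noise_var=1.0)

# With more data the posterior concentrates near the sample mean.
many = normal_posterior(prior_mean=0.0, prior_var=1.0, data=[2.0] * 99, noise_var=1.0)
```

With a single example the prior still carries half of the weight, which is how prior knowledge from other categories can compensate for scarce training data.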

63 Learning model distribution
- $p(\theta \mid \text{object}, \text{train}) \propto p(\text{train} \mid \theta, \text{object})\, p(\theta)$
- Use prior information
- Bayesian learning
- Marginalize over θ
- Variational EM (Attias, Hinton, Minka, etc.)

64 Variational EM [Diagram: random initialization, then alternating E-step and M-step producing new θ's; prior knowledge of p(θ) feeds into the loop; output is a new estimate of p(θ | train)] 64

65 Experiments
- Training: 1-6 randomly drawn images
- Testing: 50 fg / 50 bg images, object present/absent
- Datasets: faces, airplanes, spotted cats, motorbikes

66 Faces Motorbikes Airplanes Spotted cats 66

67 Experiments: obtaining priors airplanes model (θ) space spotted cats faces motorbikes 67

68 Experiments: obtaining priors airplanes model (θ) space faces spotted cats motorbikes 68

69 Number of training examples 69

70 Number of training examples 70

71 Number of training examples 71

72 Number of training examples 72

73 Algorithm | Training examples | Categories | Results (error)
Burl et al.; Weber et al.; Fergus et al. | 200 ~ 400 | Faces, Motorbikes, Spotted cats, Airplanes, Cars | [not recovered] %
Viola et al. | ~10,000 | Faces | 7-21%
Schneiderman et al. | ~2,000 | Faces, Cars | [not recovered] %
Rowley et al. | ~500 | Faces | [not recovered] %
Bayesian One-Shot | 1 ~ 5 | Faces, Motorbikes, Spotted cats, Airplanes | 8-15%

74 What have we learned today?
- Introduction
- Constellation model
- Weakly supervised training
- One-shot learning (Problem Set 4 (Q1))
