Learning to Localize Objects with Structured Output Regression

1 Learning to Localize Objects with Structured Output Regression. Matthew Blaschko and Christoph Lampert. ECCV 2008 Best Student Paper Award. Presentation by Jaeyong Sung and Yiting Xie.

3 Object Localization: an important task for image understanding. Standard approach: binary training + sliding window.

6 Object Localization: main disadvantages of sliding window. (1) It is inefficient to scan over the entire image: a 320 x 240 image contains over one billion rectangular sub-images. (2) It is not clear how to optimally train a discriminant function; this is the main contribution of the paper, which uses structured learning.
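The "one billion" figure can be checked with a short count. The sketch below assumes that a sub-image is any axis-aligned rectangle with integer pixel boundaries; the function name `count_subwindows` is illustrative only.

```python
def count_subwindows(width: int, height: int) -> int:
    """Count axis-aligned rectangles in a width x height pixel grid.

    A rectangle is fixed by choosing a left/right column pair and a
    top/bottom row pair (a pair may coincide, giving a 1-pixel extent).
    """
    column_pairs = width * (width + 1) // 2
    row_pairs = height * (height + 1) // 2
    return column_pairs * row_pairs


print(count_subwindows(320, 240))  # 1485331200
```

Under this counting convention the total is roughly 1.5 billion, which is why an exhaustive scan over all boxes is impractical.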

7 Parameterization of Bounding Box: $\mathcal{Y} = \{(\omega, t, l, b, r) \mid \omega \in \{+1, -1\},\ (t, l, b, r) \in \mathbb{R}^4\}$. If $\omega = -1$, the coordinate vector is ignored.

9 Structured Regression: a structured regression rather than classification. Outputs are not independent of each other: right coordinate > left coordinate, bottom coordinate > top coordinate (box edges t, b, l, r). Overlapping boxes should have similar objective values.

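11 Object Localization as Structured Learning. Given: input images $x_1, \dots, x_n \in \mathcal{X}$ and associated annotations $y_1, \dots, y_n \in \mathcal{Y}$, where $\mathcal{Y} = \{(\omega, t, l, b, r) \mid \omega \in \{+1, -1\},\ (t, l, b, r) \in \mathbb{R}^4\}$. Goal: learn a mapping $g: \mathcal{X} \to \mathcal{Y}$ with $g(x) = \arg\max_{y \in \mathcal{Y}} f(x, y)$, where $f: \mathcal{X} \times \mathcal{Y} \to \mathbb{R}$ and $f(x, y) = \langle w, \varphi(x, y) \rangle$.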
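To make the mapping g concrete, here is a minimal sketch. It assumes that the joint feature map is a bag-of-visual-words histogram restricted to the box (the linear bag-of-words setting named in the experiment setup), that the image is given as feature points with visual-word indices, and that the "no object" case is left out. All function and argument names are hypothetical, and the exhaustive search is only workable for toy-sized images.

```python
import numpy as np


def phi(words, points, box, vocab_size):
    """Histogram of the visual words whose feature points lie inside box.

    words:  int array, visual-word index of each feature point
    points: int array of shape (n, 2), (row, col) of each feature point
    box:    (t, l, b, r) with inclusive integer coordinates
    """
    t, l, b, r = box
    rows, cols = points[:, 0], points[:, 1]
    inside = (rows >= t) & (rows <= b) & (cols >= l) & (cols <= r)
    return np.bincount(words[inside], minlength=vocab_size).astype(float)


def f(w, words, points, box):
    """Discriminant function f(x, y) = <w, phi(x, y)>."""
    return float(w @ phi(words, points, box, len(w)))


def g(w, words, points, height, width):
    """Prediction g(x) = argmax_y f(x, y), by exhaustive search (toy sizes only)."""
    boxes = [(t, l, b, r)
             for t in range(height) for b in range(t, height)
             for l in range(width) for r in range(l, width)]
    return max(boxes, key=lambda box: f(w, words, points, box))
```

With a weight vector that rewards one visual word and penalizes another, `g` returns a box that covers the rewarded points and avoids the penalized ones. Ties between equally good boxes are broken by enumeration order.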

12 Object Localization as Structured Learning. To train the discriminant function f: [figure: margin constraints for one training example, each pairing a loss term Δ(·) with the slack variable ξ_1; the example boxes shown as arguments are not recoverable]
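The constraints on this slide survive only as fragments. For orientation, the margin-rescaled structured SVM objective, as formulated in the Blaschko and Lampert paper, can be written as follows; the regularization constant $C$ does not appear on the slides and is included here as the paper's trade-off parameter.

```latex
\begin{aligned}
\min_{w,\,\xi}\quad & \tfrac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} \xi_i \\
\text{s.t.}\quad & \xi_i \ge 0 \quad \forall i, \\
& \langle w, \varphi(x_i, y_i) \rangle - \langle w, \varphi(x_i, y) \rangle
  \;\ge\; \Delta(y_i, y) - \xi_i
  \quad \forall i,\ \forall y \in \mathcal{Y} \setminus \{y_i\}.
\end{aligned}
```

Each constraint requires the true annotation $y_i$ to score higher than a competing box $y$ by a margin that grows with the loss $\Delta(y_i, y)$, with $\xi_i$ absorbing violations.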

15 Loss Function: a measure of overlap between two annotations. [Figures: example box pairs, one with Δ ≈ 1 and one with Δ ≈ 0. When one annotation contains an object and the other is "no object", Δ = 1; when both are "no object", Δ = 0.]
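A minimal sketch of such an overlap loss, following the definition in the paper: one minus the area of intersection over the area of union when both annotations contain an object, and a 0/1 loss on the labels otherwise. Annotations are assumed to be tuples `(omega, t, l, b, r)` with continuous coordinates; the function names and the handling of zero-area boxes are assumptions made for this example.

```python
def area(t, l, b, r):
    """Area of a box with top t, left l, bottom b, right r (0 if inverted)."""
    return max(0.0, b - t) * max(0.0, r - l)


def overlap_loss(y_true, y_pred):
    """Delta(y_true, y_pred) for annotations of the form (omega, t, l, b, r)."""
    w1, t1, l1, b1, r1 = y_true
    w2, t2, l2, b2, r2 = y_pred
    if w1 == 1 and w2 == 1:
        inter = area(max(t1, t2), max(l1, l2), min(b1, b2), min(r1, r2))
        union = area(t1, l1, b1, r1) + area(t2, l2, b2, r2) - inter
        if union == 0:
            return 1.0  # assumption: degenerate boxes count as no overlap
        return 1.0 - inter / union
    # At least one "no object" label: 0 if the labels agree, 1 otherwise.
    return 1.0 - 0.5 * (w1 * w2 + 1)
```

For example, two 2 x 2 boxes shifted by one unit share an area of 2 out of a union of 6, giving a loss of 2/3.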

19 Joint Kernel Map. x: Bag of Words; Spatial Pyramids; Histogram of Oriented Gradients. y: [figure not recovered]

20 Joint Kernel Map for Localization [figure; slide from Blaschko and Lampert]

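22 Maximization Step. Training stage: $\max_{y \in \mathcal{Y}} \Delta(y_i, y) + \langle w, \varphi(x_i, y) \rangle$. Testing stage: $\arg\max_{y \in \mathcal{Y}} \langle w, \varphi(x_i, y) \rangle$. Exhaustive search is computationally infeasible, so a branch-and-bound optimization algorithm is used.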

23 Branch-and-bound: bounding box splitting

24 Branch-and-bound: branch step. [Tree diagram: Set of All Possible Bounding Boxes → Subset1, Subset2; Subset1 → Subset11, Subset12.] Each branch corresponds to a set of bounding boxes. Branching can be done by splitting image coordinates (left/right; top/bottom). Branch-and-bound is efficient because only the upper bound of a branch (a set of boxes) needs to be computed.

25 Branch-and-bound: splitting examples [figures: splitting right coordinates; splitting left coordinates]

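29 Branch-and-bound: quality function. A quality function computes the upper bound for a set of boxes $\mathcal{R}$: $\hat{f}(\mathcal{R}) = f^{+}(R_{\max}) + f^{-}(R_{\min})$, where $f^{+}$ sums all positive features over $R_{\max}$, the maximum bounding box in the set, and $f^{-}$ sums all negative features over $R_{\min}$, the minimum bounding box in the set.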

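30 Branch-and-bound: bound step. 1. At each branching step, continue with the branch (set of boxes) that has the highest upper bound; the other branches stay in a priority queue in case their bound becomes the highest later. 2. Split the current branch into sub-branches. Repeat from step 1 until the branch with the highest upper bound contains only one box.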
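The branch, quality-function, and bound steps above can be put together in a short best-first search. This is a sketch under stated assumptions, not the authors' implementation: it assumes a linear bag-of-words score that has already been turned into a per-pixel weight map `weights` (a hypothetical input), so that the score of a box is the sum of the weights inside it. Box sums use plain slicing for brevity; integral images are a common way to make them constant-time.

```python
import heapq

import numpy as np


def box_sum(weights, t, l, b, r):
    """Sum of weights in rows t..b and columns l..r (inclusive); 0 if empty."""
    if t > b or l > r:
        return 0.0
    return float(weights[t:b + 1, l:r + 1].sum())


def upper_bound(pos, neg, box_set):
    """Positive weights over the largest box plus negative weights over the smallest."""
    (t0, t1), (l0, l1), (b0, b1), (r0, r1) = box_set
    return box_sum(pos, t0, l0, b1, r1) + box_sum(neg, t1, l1, b0, r0)


def branch_and_bound(weights):
    """Return ((t, l, b, r), score) of the highest-scoring box."""
    height, width = weights.shape
    pos, neg = np.maximum(weights, 0), np.minimum(weights, 0)
    # A box set is one inclusive interval per coordinate, ordered (t, l, b, r).
    start = ((0, height - 1), (0, width - 1), (0, height - 1), (0, width - 1))
    heap = [(-upper_bound(pos, neg, start), start)]  # max-heap via negation
    while heap:
        neg_bound, box_set = heapq.heappop(heap)
        extents = [hi - lo for lo, hi in box_set]
        k = int(np.argmax(extents))
        if extents[k] == 0:
            # Single box left: its bound equals its score, so it is optimal.
            return tuple(lo for lo, _ in box_set), -neg_bound
        lo, hi = box_set[k]
        mid = (lo + hi) // 2
        for part in ((lo, mid), (mid + 1, hi)):  # split the widest interval
            child = box_set[:k] + (part,) + box_set[k + 1:]
            (t0, _), (l0, _), (_, b1), (_, r1) = child
            if t0 <= b1 and l0 <= r1:  # skip sets with no valid box
                heapq.heappush(heap, (-upper_bound(pos, neg, child), child))
```

The search is exact because the bound never underestimates the score of any box in a set and is tight for a single box, so the first single box popped from the queue cannot be beaten by anything still queued.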

31 Experiment: Dataset. TU Darmstadt cows: 111 training images, 557 test images. PASCAL VOC 2006: 5,304 images of 10 classes, evenly split into a train/validation part and a test part.

32 Experiment: Setup. Local SURF descriptors from feature points; 10,000 descriptors from training images; 3,000-entry visual codebook. The SVMstruct package was used. Benchmark against the standard sliding window approach: binary training; linear image kernel over bag-of-visual-word histograms.

33 Results: TU Darmstadt Cows. Performance at equal error rate (EER):

Method                        Performance at EER
Implicit Shape Model (ISM)    96.1%
Local Kernels (LK)            95.3%
LK + ISM                      97.1%
Binary training               97.3%
Structured training           98.2%

[Figure: binary vs. structured training, with panels "Bottom right corner fixed" and "Box dimension fixed"; annotation "Tighter contour".]

34 Results: PASCAL VOC 2006 Precision-recall curves and example detections Precision=TP/(TP+FP) Recall=TP/(TP+FN)
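The two formulas above give one point per decision threshold; sweeping the threshold over the detection scores traces the curve. A minimal sketch, assuming each detection has already been marked as a true or false positive by some overlap criterion that is not specified here; the function name `pr_curve` and its arguments are illustrative.

```python
def pr_curve(scores, is_true_positive, num_ground_truth):
    """Return a list of (precision, recall) points, one per detection.

    Detections are processed in order of decreasing score, so point k
    corresponds to accepting the k highest-scoring detections.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp = fp = 0
    curve = []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        fn = num_ground_truth - tp  # ground-truth objects not yet detected
        curve.append((tp / (tp + fp), tp / (tp + fn)))
    return curve
```

Precision can go up or down as more detections are accepted, while recall never decreases.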

35 Results: PASCAL VOC 2006 Average Precision Scores on the 10 categories of PASCAL VOC 2006

36 Discussion and Conclusion. Structured training often exceeds state-of-the-art performance. It has access to all possible bounding boxes. It is better able to handle the partial detection problem.

37 Demo!

Learning to Localize Objects with Structured Output Regression Matthew B. Blaschko and Christoph H. Lampert Max Planck Institute for Biological Cybernetics 72076 Tübingen, Germany {blaschko,chl}@tuebingen.mpg.de