Harmony Potentials: Fusing Global and Local Scale for Semantic Image Segmentation

1 Harmony Potentials: Fusing Global and Local Scale for Semantic Image Segmentation. J. M. Gonfaus, X. Boix, F. S. Khan, J. van de Weijer, A. Bagdanov, M. Pedersoli, J. Serrat, X. Roca, J. Gonzàlez

2 Motivation (I) Why combine global and local scale?

4 Motivation (I) Classification is often impossible based on local appearance only. [Figure: image classifier scores, on a 0 to 1 axis, for Aeroplane, Bus, Sofa, Plant and Chair] Context is a powerful and distinctive cue.

5 Motivation (II) How can we improve local classifiers? Two local classifiers answer different questions. (1) "Is this object X or some other object?": inaccurate segmentation, good class discrimination. (2) "Is this the foreground or the background of X?": good figure segmentation, bad class discrimination. Why not combine them? [Figure: per-class scores, on a 0 to 1 axis, for Aeroplane, Horse, Cow, Cat and Dog under each classifier]

7 Motivation (II) How can we improve local classifiers? More information sources: mid-level information through object detectors.

8 Outline. Overview of our method. How to fuse local and global scale: Harmony Potentials* (CVC_Harmony submission, 35.4% on test). Improving local classifiers (CVC_Harmony+Det submission, 40.1% on test). Results. Conclusions. *J. M. Gonfaus, X. Boix, J. van de Weijer, A. D. Bagdanov, J. Serrat, J. Gonzàlez, "Harmony Potentials for Joint Classification and Segmentation", in CVPR 2010.

9 Overview of our method

10 Overview of our method. Unsupervised segmentation: around 500 superpixels per image.

11 Overview of our method. Unsupervised segmentation. Superpixel nodes. Unary potential (CVC_Harmony): BoW inside AND neighborhood. Smoothness potential: pairwise Potts potential. BoW: SIFT, RGB histogram, SSIM; multiscale square patches of size 12, 24, 36, 48; step size 50% of the patch; quantized to 1000, 400, 300 words; learned with an SVM on 8000 samples + retraining.

12 Overview of our method. Unsupervised segmentation. Superpixel nodes. Unary potential (CVC_Harmony+det): BoW inside AND neighborhood, detection scores, location prior. Smoothness potential: pairwise Potts potential. BoW: SIFT, RGB histogram, SSIM; multiscale square patches of size 12, 24, 36, 48; step size 50% of the patch; quantized to 1000, 400, 300 words; learned with an SVM on 8000 samples + retraining.
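The dense multiscale sampling listed on this slide (square patches of size 12, 24, 36 and 48, with a step of 50% of the patch) can be sketched as below. This is a minimal illustration and not the authors' code: the name `patch_grid`, the pixel units, and the choice to skip patches that would cross the image border are assumptions.

```python
from typing import Iterator, Tuple

PATCH_SIZES = (12, 24, 36, 48)  # patch sides from the slide (pixel units assumed)


def patch_grid(width: int, height: int,
               sizes: Tuple[int, ...] = PATCH_SIZES) -> Iterator[Tuple[int, int, int]]:
    """Yield (x, y, size) for square patches on a dense grid.

    The step is 50% of the patch side, as stated on the slide. Patches that
    would cross the image border are skipped (an assumption of this sketch).
    """
    for size in sizes:
        step = size // 2
        for y in range(0, height - size + 1, step):
            for x in range(0, width - size + 1, step):
                yield x, y, size


# Toy usage on a hypothetical 96 x 48 image.
patches = list(patch_grid(96, 48))
```

Each patch would then be described (SIFT, RGB histogram, SSIM) and quantized into visual words; that part depends on the descriptor library and is left out here.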

13 Overview of our method. Unsupervised segmentation. Superpixel nodes. Global node. Unary potential: global classifier (CVC_flat submission, mAP 61% on the classification task). Consistency potential: from the global node to each superpixel (harmony potential).

14 Model: unary potential, smoothness potential, consistency potential
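Read as a CRF energy, the three potentials named on this slide add up to a single function to be minimized over the labeling. One way to write it is sketched below. The symbols are assumptions of this sketch, not notation taken from the slides: $x_i$ is the label of superpixel $i$, $x_g$ the label of the global node, $V$ the set of superpixels, and $E_L$ the set of neighboring superpixel pairs.

```latex
E(x) = \underbrace{\sum_{i \in V} \phi(x_i) + \phi_G(x_g)}_{\text{unary}}
     + \underbrace{\sum_{(i,j) \in E_L} \psi_L(x_i, x_j)}_{\text{smoothness}}
     + \underbrace{\sum_{i \in V} \psi_G(x_i, x_g)}_{\text{consistency}}
```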

15 Model: consistency potential

16 Consistency potential [Figure: comparison of ground truth, unary potentials, Potts-based potentials, robust P^N potentials and harmony potentials]

21 Consistency potential: considering all possible label combinations is infeasible.

22 Consistency potential: ranked subsampling of the label combinations. A few of the best combinations are enough to saturate the performance. Prior: from the training data we extract the co-occurrence statistics of labels. Likelihood: the image classification scores each combination.
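The ranked subsampling on this slide can be sketched as scoring every candidate label combination by prior times likelihood and keeping only the best few. This is a hedged illustration rather than the authors' procedure: the function names, the cap on combination size, the independence assumption inside the likelihood, and the toy numbers are all assumptions.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple

Combo = FrozenSet[str]


def likelihood(combo: Combo, scores: Dict[str, float]) -> float:
    """Score a combination from per-class image classification scores in [0, 1].

    Assumes independent classes: a class in the combination contributes its
    score, a class outside it contributes one minus its score.
    """
    result = 1.0
    for label, p in scores.items():
        result *= p if label in combo else 1.0 - p
    return result


def rank_combinations(scores: Dict[str, float], prior: Dict[Combo, float],
                      max_size: int = 2, keep: int = 3) -> List[Tuple[float, Combo]]:
    """Return the `keep` best combinations ranked by prior * likelihood."""
    candidates = []
    for size in range(1, max_size + 1):
        for labels in combinations(sorted(scores), size):
            combo = frozenset(labels)
            # Combinations never seen in training get a zero prior here.
            candidates.append((prior.get(combo, 0.0) * likelihood(combo, scores), combo))
    candidates.sort(key=lambda item: item[0], reverse=True)
    return candidates[:keep]


# Toy classification scores and co-occurrence frequencies (made up).
scores = {"aeroplane": 0.9, "bus": 0.2, "chair": 0.1}
prior = {
    frozenset({"aeroplane"}): 0.5,
    frozenset({"bus"}): 0.3,
    frozenset({"chair"}): 0.1,
    frozenset({"aeroplane", "bus"}): 0.1,
}
best = rank_combinations(scores, prior)
```

Enumerating every subset is only workable for toy label sets like this one: 20 object classes alone already give 2^20 = 1,048,576 subsets, which is why only a ranked subsample is kept.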

23 Model: unary potential

24 Unary potential. Local classifiers are weak classifiers: too ambiguous because little information is used. Combining multiple classifiers makes our local unary potential stronger. Features: foreground/background; class versus others; object detections; spatial location prior.

25 F_fg-bg: Foreground-Background. It is easy to identify whether the superpixel belongs to the object class or to its common background.

28 F_class: Class vs. other classes. Learning how different an object is from its common background becomes difficult for certain class combinations. [Figure labels: Foreground, Background]

30 F_position: Location prior. Objects tend to appear in class-specific, particular locations (and not at the borders).
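The slide does not say how the location prior is estimated. One plausible sketch, under stated assumptions, counts how often a class occupies each cell of a coarse grid laid over size-normalized training masks. The function name, the grid resolution, and the toy mask format are all hypothetical.

```python
from typing import List

Mask = List[List[str]]  # per-pixel class labels of one training image (toy format)


def location_prior(masks: List[Mask], label: str, grid: int = 4) -> List[List[float]]:
    """Estimate how often `label` occupies each cell of a grid x grid layout.

    Pixel coordinates are normalized by image size, so images of different
    sizes vote into the same cells. Returns a relative frequency per cell.
    """
    hits = [[0] * grid for _ in range(grid)]
    total = [[0] * grid for _ in range(grid)]
    for mask in masks:
        height, width = len(mask), len(mask[0])
        for y, row in enumerate(mask):
            for x, pixel_label in enumerate(row):
                gy, gx = y * grid // height, x * grid // width
                total[gy][gx] += 1
                if pixel_label == label:
                    hits[gy][gx] += 1
    return [[h / t if t else 0.0 for h, t in zip(hit_row, total_row)]
            for hit_row, total_row in zip(hits, total)]


# Toy training set: one 4 x 4 mask with a "cow" in the center.
masks = [[
    ["bg", "bg", "bg", "bg"],
    ["bg", "cow", "cow", "bg"],
    ["bg", "cow", "cow", "bg"],
    ["bg", "bg", "bg", "bg"],
]]
cow_prior = location_prior(masks, "cow")
```

In this toy case the prior is high in the central cells and zero along the border, which mirrors the slide's remark that objects tend not to appear at the borders.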

31 F_det: Object detector* scores. Mid-level information is added by considering object detections [Felzenszwalb et al. 2010]. The feature is the average, over the superpixel area, of the maximum detection score at each pixel. Scores lie in [-1, ∞). Class specific. A no-detection score is learned. This keeps the CRF and the model simple. *Felzenszwalb, Girshick, McAllester, Ramanan, "Object Detection with Discriminatively Trained Part-Based Models", PAMI 2010.
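The detector feature described on this slide (take the maximum detection score at each pixel, then average over the superpixel area) can be sketched as follows. It is an illustration under assumptions: the box format, the function names, and the fixed `no_detection` value of -1.0 are hypothetical, whereas the slides indicate that the no-detection score is learned.

```python
from typing import List, Optional, Sequence, Tuple

Box = Tuple[int, int, int, int, float]  # (x0, y0, x1, y1, score), x1 and y1 exclusive


def pixel_score_map(width: int, height: int, boxes: Sequence[Box],
                    no_detection: float) -> List[List[float]]:
    """Per-pixel maximum detection score for one class.

    Pixels covered by no box receive `no_detection`.
    """
    covered: List[List[Optional[float]]] = [[None] * width for _ in range(height)]
    for x0, y0, x1, y1, score in boxes:
        for y in range(max(y0, 0), min(y1, height)):
            for x in range(max(x0, 0), min(x1, width)):
                current = covered[y][x]
                covered[y][x] = score if current is None else max(current, score)
    return [[no_detection if value is None else value for value in row]
            for row in covered]


def superpixel_feature(score_map: List[List[float]],
                       pixels: Sequence[Tuple[int, int]]) -> float:
    """Average the per-pixel scores over the (x, y) pixels of one superpixel."""
    return sum(score_map[y][x] for x, y in pixels) / len(pixels)


# Toy 4 x 4 image with two overlapping detections of the same class.
boxes = [(0, 0, 2, 2, 0.5), (1, 1, 3, 3, 1.5)]
score_map = pixel_score_map(4, 4, boxes, no_detection=-1.0)
feature = superpixel_feature(score_map, [(0, 0), (1, 1), (2, 2), (3, 3)])
```

Because the feature is a plain average of detector outputs, no extra model sits between the detector and the CRF, which is consistent with the slide's remark about keeping the model simple.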

32 F_det: Object detector* scores

33 Results on validation set 2010 (mean average precision). Fg_Bk: 33.2 (2009 submission); Class: 23.4; Loc: 20; Det: 26; Fg_Bk + Loc: 34.5; Fg_Bk + Class: 36.6; All: 40.1.

34 Combination of features. Naïve Bayes approach, with a specific sigmoid per class and per classifier: $\phi(x_i) = \prod_{f \in F} \frac{1}{1 + \exp(a_f x_i^f + b_f)}$. Total number of parameters to be learned: 185, covering the feature sigmoids (two parameters, $a_f$ and $b_f$, per class and per classifier, for 20 classes), the no-detection score, the CRF weights and the background probability. All parameters are jointly optimized by stochastic steepest ascent.
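The combination on this slide, one sigmoid per feature multiplied together in naïve Bayes fashion, can be sketched for a single class and a single superpixel as below. The feature names, parameter values, and function names are placeholders chosen for illustration; in the slides the parameters are learned jointly.

```python
import math
from typing import Dict, Tuple

Params = Dict[str, Tuple[float, float]]  # feature name -> (a_f, b_f) for one class


def sigmoid(x: float, a: float, b: float) -> float:
    """Calibrating sigmoid in the slide's form: 1 / (1 + exp(a * x + b))."""
    return 1.0 / (1.0 + math.exp(a * x + b))


def unary(features: Dict[str, float], params: Params) -> float:
    """Product over features of the calibrated scores (naive Bayes combination)."""
    result = 1.0
    for name, value in features.items():
        a, b = params[name]
        result *= sigmoid(value, a, b)
    return result


# Placeholder parameters: a negative a_f makes the sigmoid increase with the score.
params = {"fgbg": (-1.0, 0.0), "class": (-1.0, 0.0), "det": (-1.0, 0.0), "loc": (-1.0, 0.0)}
neutral = unary({"fgbg": 0.0, "class": 0.0, "det": 0.0, "loc": 0.0}, params)
confident = unary({"fgbg": 2.0, "class": 2.0, "det": 2.0, "loc": 2.0}, params)
```

With all raw scores at zero, each sigmoid returns 0.5 and the product of the four is 0.0625. Raising every score raises the product, since each factor grows when a_f is negative.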

35 Results on validation set 2010 (mean average precision; chart axis from 15 to 40). Fg_Bk: 33.2 (2009 submission); Class: 23.4; Loc: 20; Det: 26; Fg_Bk + Loc: 34.5; Fg_Bk + Class: 36.6; All: 39.2. Submissions: CVC_Harmony (2010 submission): 35.4 on test; CVC_Harmony_Det (2010 submission): 40.1 on test.

36 Illustrative examples [Figure: example rows in which the class, Fg/bg, det and loc score maps are multiplied to give the final unary]

37 Illustrative examples [Figure: example rows in which the Fg/bg, class, det and loc score maps are multiplied to give the final unary]

38 Final results

39 Conclusions. The harmony potential is an effective way to fuse global and local scales for semantic image segmentation. We have focused on improving the local classifiers. Baseline: 29%; + combining fg/bg and multiclass classifiers (+2%); + object detection (+3%); + location prior (+1%); + per-class parameter optimization (+5%). More details: http://iselab.cvc.uab.es/pvoc2010

40 Thanks for your attention! Gràcies per la vostra atenció! Ευχαριστω για την προσοχη σας

41 Full Practical Example

42 F_fgbg: Foreground-Background

43 F_class: Class against other classes

44 Close-up comparison: foreground-background learning vs. class-against-others learning

45 F_fgbg * F_class

46 F_det: Detector scores

47 F_fgbg * F_class * F_det

48 F_location: Location prior

49 F_fgbg * F_class * F_det * F_loc

50 Result
