Learning Spatial Context: Using Stuff to Find Things

1 Learning Spatial Context: Using Stuff to Find Things Wei-Cheng Su

2 Motivation: Leverage contextual information to enhance detection. Some context objects are non-rigid and are more naturally classified by texture or color, e.g., sky, trees, road. Find the relationships between the context stuff and the object.

3 Outline: Training and inferring; Preprocessing; Experimental results; Things-and-stuff relationships; Performance; Effect of parameters; Conclusion

4 Training: Segmentation yields region features & centroids; Detection yields candidate boxes & scores; Annotation yields ground truths. Learning takes these as input and produces the things-and-stuff relationships and the model parameters. *Red boxes indicate high scores; blue boxes indicate low scores.

5 Inferring: Segmentation yields region features & centroids; Detection yields candidate boxes & prior scores. Inferring takes these as input and produces posterior scores for all candidates.

7 Preprocessing: Segmentation: Superpixel, run on a Pentium-D 2.4 GHz with 4 GB RAM; runs out of memory on a 792x636 image; takes ~6.4 minutes for a 480x321 image. Detection: HOG for detecting humans, cars, bicycles, and motorbikes; patch-based boosted detector for detecting cars in satellite images.

8 Segmentation: This level of segmentation result is used. [figure]

9 HoG-Cars [figure]

10 HoG-People [figure]

11 HoG-Motorbikes [figure]

12 HoG-Bicycles [figure]

13-17 Satellite [figures]; threshold labels as extracted: slide 14 Th=0, slide 15 Th=0 0.95, slides 16 and 17 Th= (value missing)

19 Running TAS: Run TAS inference on all detected candidates. False positives from the base detector are filtered out. Objects missed by the base detector cannot be detected by TAS. Data sets: VOC2005, Google Earth satellite images.
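
The filtering behaviour on this slide can be sketched roughly as follows. This is an illustration only, not the TAS implementation: `rescore` is a hypothetical stand-in for TAS inference (it simply adds an invented context term to the detector's prior score and applies a logistic), and the `Candidate` type, its fields, and the threshold are all assumptions. The structural point is that the output is always a subset of the base detector's candidates, so a missed object cannot be recovered.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    box: tuple          # (x, y, w, h) from the base detector
    prior_score: float  # base detector score, treated as log-odds (assumption)
    context: float      # hypothetical context evidence, also log-odds (assumption)


def rescore(c):
    """Hypothetical stand-in for TAS inference: posterior from prior plus context."""
    return 1.0 / (1.0 + math.exp(-(c.prior_score + c.context)))


def filter_candidates(candidates, threshold=0.5):
    """Keep candidates whose posterior reaches the threshold; never adds new boxes."""
    return [c for c in candidates if rescore(c) >= threshold]


candidates = [
    Candidate((10, 20, 40, 30), prior_score=1.0, context=1.5),   # supported by context
    Candidate((200, 5, 40, 30), prior_score=1.0, context=-3.0),  # contradicted by context
]
kept = filter_candidates(candidates)
```

Both toy candidates have the same detector score; only the assumed context term separates them, which is the role the slide assigns to TAS.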

20-27 Base Detector vs TAS [figures]. Left: base detector result. Right: TAS result.

28-33 Base Detector and TAS results on alternating slides [figures]: slides 28, 30, 32 Base Detector; slides 29, 31, 33 TAS.

35 Things-and-Stuff Relationships: Feature description: 44 features, including color, texture, and shape. The relationships are learnt during training. The relationships change the score of a candidate. There are 25 relationship candidates.
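
To make "relationship candidate" concrete, here is one possible encoding of the spatial relation between a candidate box and a region centroid. The slide does not say how the 25 candidates are defined; the split used here (1 "inside" relation plus 8 directions x 3 distance bands) is purely an assumption that happens to give 25, and `relationship_index` is a hypothetical helper.

```python
import math

N_DIRECTIONS = 8  # assumed number of direction sectors
N_BANDS = 3       # assumed number of distance bands


def relationship_index(box, centroid):
    """Map (candidate box, region centroid) to one of 25 hypothetical relationship ids.

    box is (x, y, w, h) with (x, y) the top-left corner. Id 0 means the centroid
    lies inside the box; ids 1..24 encode a direction sector and a distance band.
    """
    x, y, w, h = box
    cx, cy = centroid
    if x <= cx <= x + w and y <= cy <= y + h:
        return 0
    dx = cx - (x + w / 2.0)
    dy = cy - (y + h / 2.0)
    # Direction sector: sector 0 is centred on +x; sectors advance with the atan2 angle.
    angle = math.atan2(dy, dx) % (2 * math.pi)
    width = 2 * math.pi / N_DIRECTIONS
    sector = int((angle + width / 2) // width) % N_DIRECTIONS
    # Distance band: measured in box diagonals, capped at the outermost band.
    diag = math.hypot(w, h)
    band = min(int(math.hypot(dx, dy) // diag), N_BANDS - 1)
    return 1 + sector * N_BANDS + band
```

During training, each such id would be a relationship whose usefulness is learnt; this sketch covers only the geometric bookkeeping.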

36-55 Relationships [figures]

56 Relationships: Some regions inside the bounding box have relationships with the candidate.

57 Relationships: Viewpoint: different viewpoints generate different relationships. Region features might be misleading.

58 Relationships: Diversity of the backgrounds. The region features inside the bounding box might be a complementary cue to the features used by the base detector.

60 Performance Analysis: Training samples: 15. Test samples: 15. Image size: 792x636. Test machine: Core(TM)2, 8 GB RAM. Implemented in Matlab. Detection and segmentation are not included. Required computing power: Learning: seconds of CPU time (value missing); Inferring: seconds of CPU time (value missing).

61 Base Detector vs TAS: Cars, People [plots]. Red: base detector. Blue: TAS.

62 Base Detector vs TAS: Motorbikes, Bicycles [plots]. Red: base detector. Blue: TAS.

63 Base Detector vs TAS: Satellite [plot]

65 Number of Region Clusters [plots]: Red: 10. Blue: 3, 5, 20, 30.
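
The parameter varied on this slide is the number of clusters that region features are grouped into. As a rough illustration of what that number controls, the sketch below groups toy feature vectors with plain k-means. This is a swapped-in stand-in, not the clustering TAS performs inside its model; the 2-D toy features (in place of the 44 region features) and the first-k initialisation are assumptions.

```python
def kmeans(points, k, iters=20):
    """Plain k-means on a list of equal-length tuples; returns (centers, labels)."""
    centers = list(points[:k])  # simple assumed initialisation: the first k points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: mean of the assigned points; an empty cluster keeps its center.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return centers, labels


# Two well-separated toy groups standing in for two kinds of "stuff".
features = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centers, labels = kmeans(features, k=2)
```

Changing `k` changes how finely the regions are partitioned, which is the quantity the slide varies between 3 and 30.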

66 Number of Gibbs Iterations [plots]: Red: 10. Blue: 20, 100.
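
The slide title indicates that Gibbs sampling is used and that its iteration count is a tunable parameter. The sketch below is a toy, not the TAS model: one binary detection variable T, one region-cluster label S, and a single observed relationship, with invented probabilities (`P_T`, `P_S`, `W` are all assumptions). It shows the general mechanism: alternately resample each hidden variable given the other, then average the samples.

```python
import random

P_T = 0.3         # assumed prior P(T=1), e.g. from the base detector
P_S = [0.5, 0.5]  # assumed prior over two region clusters
# W[t][s] = assumed P(relationship observed | T=t, S=s)
W = [[0.2, 0.6],
     [0.8, 0.3]]


def exact_posterior():
    """P(T=1 | relationship observed), by enumerating all (T, S) states."""
    on = sum(P_T * P_S[s] * W[1][s] for s in range(len(P_S)))
    off = sum((1 - P_T) * P_S[s] * W[0][s] for s in range(len(P_S)))
    return on / (on + off)


def gibbs_posterior(n_iters, seed=0):
    """Estimate the same quantity by alternately resampling T and S."""
    rng = random.Random(seed)
    s = rng.randrange(len(P_S))
    hits = 0
    for _ in range(n_iters):
        # Resample T given the current S.
        on = P_T * W[1][s]
        off = (1 - P_T) * W[0][s]
        t = 1 if rng.random() < on / (on + off) else 0
        # Resample S given the new T.
        weights = [P_S[k] * W[t][k] for k in range(len(P_S))]
        s = rng.choices(range(len(P_S)), weights=weights)[0]
        hits += t
    return hits / n_iters
```

Comparing `gibbs_posterior(10)`, `gibbs_posterior(20)`, and `gibbs_posterior(100)` against `exact_posterior()` gives a feel for the trade-off the slide examines: more iterations cost more time and give a less noisy estimate.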

68 Conclusion: TAS can be easily integrated with detectors. The performance depends on the detector. The stuff can come from the context as well as from the object itself. It is especially suitable for datasets with consistent backgrounds and viewpoints, e.g., aerial images. 3D information could be used to improve the performance.

69 Reference: Geremy Heitz and Daphne Koller. Learning Spatial Context: Using Stuff to Find Things. European Conference on Computer Vision (ECCV), 2008. Resources (URLs not preserved): TAS; Superpixel; HOG implementation; PASCAL VOC.

Learning Spatial Context: Using Stuff to Find Things. Geremy Heitz, Daphne Koller. Department of Computer Science, Stanford University, Stanford, CA 94305. Abstract. The sliding window