Reconstructive Sparse Code Transfer for Contour Detection and Semantic Labeling

Size: px

Start display at page:

Download "Reconstructive Sparse Code Transfer for Contour Detection and Semantic Labeling"

Bryan Dean
6 years ago
Views:

1 Reconstructive Sparse Code Transfer for Contour Detection and Semantic Labeling Michael Maire 1,2 Stella X. Yu 3 Pietro Perona 2 1 TTI Chicago 2 California Institute of Technology 3 University of California at Berkeley / ICSI

2 Motivation: Deep Representations

3 Image Classification Motivation: Deep Representations [Krizhevsky, Sutskever, and Hinton, NIPS 2012]

4 Image Classification Motivation: Deep Representations [Krizhevsky, Sutskever, and Hinton, NIPS 2012] [Bo, Ren, and Fox, CVPR 2013]

5 Image Classification Motivation: Deep Representations [Krizhevsky, Sutskever, and Hinton, NIPS 2012] [Bo, Ren, and Fox, CVPR 2013] Object Detection [Girshick, Donahue, Darrell, and Malik, CVPR 2014]

6 Deep Representations for Semantic Labeling classify image detect objects

7 Deep Representations for Semantic Labeling classify image label every pixel detect objects

8 Contour Detection: Special Case { 0 if in region interior 1 if on region boundary

9 Contour Detection: Special Case

10 Contour Detection: Special Case contour detection serves as foundation for: segmentation, object proposals

11 Semantic Labeling Strategy predict patch labels from a spatially localized multilayer slice of a deep representation

12 Semantic Labeling Strategy predict patch labels from a spatially localized multilayer slice of a deep representation generalization of sparse reconstruction

13 Multipath Sparse Coding [Bo, Ren, and Fox, CVPR 2013]

14 [Bo, Ren, and Fox, CVPR 2013]

15 [Bo, Ren, and Fox, CVPR 2013]

16 [Bo, Ren, and Fox, CVPR 2013]

18 patch descriptor (sparse)

19 use patch descriptors in a sparse reconstructive setting patch descriptor (sparse)

20 Sparse Coding & Reconstruction Image

21 Sparse Coding & Reconstruction Image Patch Dictionary Batch OMP

22 Sparse Coding & Reconstruction Image Patch Dictionary Batch OMP = z i,0 + z i, z i,l z i 0 K x i d 0 d 1 d L

23 Sparse Coding & Reconstruction Image Patch Dictionary Sparse Codes Batch OMP coefficients pixels = z i,0 + z i, z i,l z i 0 K x i d 0 d 1 d L

24 Sparse Coding & Reconstruction Image Patch Dictionary Sparse Codes Batch OMP coefficients pixels = z i,0 + z i, z i,l z i 0 K x i d 0 d 1 d L

25 Sparse Coding & Reconstruction Image Patch Dictionary Sparse Codes Batch OMP coefficients pixels

26 Sparse Coding & Reconstruction Image Patch Dictionary Sparse Codes Batch OMP coefficients pixels Sparse Codes coefficients pixels

27 Sparse Coding & Reconstruction Image Patch Dictionary Sparse Codes Batch OMP coefficients pixels coefficients Sparse Codes pixels Patch Dictionary Reconstruction

28 Sparse Coding & Reconstruction Image Patch Dictionary Sparse Codes Batch OMP coefficients pixels coefficients Sparse Codes pixels Patch Dictionary Reconstruction Image

29 Reconstructive Sparse Code Transfer Sparse Codes coefficients pixels

30 Reconstructive Sparse Code Transfer Rectified Sparse Codes coefficients pixels [ ] z i max(zi T, 0), max( zi T T, 0), 1

31 Reconstructive Sparse Code Transfer Rectified Sparse Codes Transfer Dictionary coefficients pixels [ ] z i max(zi T, 0), max( zi T T, 0), 1 Reconstruction

32 Reconstructive Sparse Code Transfer Rectified Sparse Codes Transfer Dictionary coefficients pixels [ ] z i max(zi T, 0), max( zi T T, 0), 1 Reconstruction Contour Detection

33 Sparse Coding: Generative Training

34 Sparse Coding: Generative Training sample patches X = [x 0, x 1,...] from training images

35 Sparse Coding: Generative Training sample patches X = [x 0, x 1,...] from training images learn: sparse representations Z = [z0, z 1,...] dictionary D = [d0, d 1,..., d L 1 ]

36 Sparse Coding: Generative Training sample patches X = [x 0, x 1,...] from training images learn: sparse representations Z = [z0, z 1,...] dictionary D = [d0, d 1,..., d L 1 ] MI-KSVD finds an approximate solution to: [ L 1 L 1 X DZ 2 F + λ argmin D, Z i=0 j=0,j i s.t. i, d i 2 = 1 and n, z n 0 K d T i d j ]

37 Sparse Coding: Generative Training sample patches X = [x 0, x 1,...] from training images learn: sparse representations Z = [z0, z 1,...] dictionary D = [d0, d 1,..., d L 1 ] MI-KSVD finds an approximate solution to: [ L 1 L 1 X DZ 2 F + λ argmin D, Z i=0 j=0,j i s.t. i, d i 2 = 1 and n, z n 0 K encode patch x R m m c as z R L : d T i d j ] argmin x Dz 2 z s.t. z 0 K

38 Transfer Dictionary: Discriminative Training

39 Transfer Dictionary: Discriminative Training D maps sparse vector z R L to patch P R m m c

40 Transfer Dictionary: Discriminative Training D maps sparse vector z R L to patch P R m m c replace D with function F ( )

41 Transfer Dictionary: Discriminative Training D maps sparse vector z R L to patch P R m m c replace D with function F ( ) F (z) predicts groundtruth labeling G R m m h

42 Transfer Dictionary: Discriminative Training D maps sparse vector z R L to patch P R m m c replace D with function F ( ) F (z) predicts groundtruth labeling G R m m h patch-level discriminative training:

43 Transfer Dictionary: Discriminative Training D maps sparse vector z R L to patch P R m m c replace D with function F ( ) F (z) predicts groundtruth labeling G R m m h patch-level discriminative training: sample codes {z0, z 1,...} and corresponding groundtruth patches {g 0, g 1,...}

44 Transfer Dictionary: Discriminative Training D maps sparse vector z R L to patch P R m m c replace D with function F ( ) F (z) predicts groundtruth labeling G R m m h patch-level discriminative training: sample codes {z0, z 1,...} and corresponding groundtruth patches {g 0, g 1,...} split F ( ) into independently trained predictors [f 0, f 1,..., f (m 2 h 1)], one for each output element

45 Transfer Dictionary: Discriminative Training D maps sparse vector z R L to patch P R m m c replace D with function F ( ) F (z) predicts groundtruth labeling G R m m h patch-level discriminative training: sample codes {z0, z 1,...} and corresponding groundtruth patches {g 0, g 1,...} split F ( ) into independently trained predictors [f 0, f 1,..., f (m 2 h 1)], one for each output element we want: f j (z i ) g i [j] i, j

46 Transfer Dictionary: Discriminative Training D maps sparse vector z R L to patch P R m m c replace D with function F ( ) F (z) predicts groundtruth labeling G R m m h patch-level discriminative training: sample codes {z0, z 1,...} and corresponding groundtruth patches {g 0, g 1,...} split F ( ) into independently trained predictors [f 0, f 1,..., f (m 2 h 1)], one for each output element we want: f j (z i ) g i [j] i, j train each fj ( ) using logistic regression

47 Transfer Dictionary: Discriminative Training D maps sparse vector z R L to patch P R m m c replace D with function F ( ) F (z) predicts groundtruth labeling G R m m h patch-level discriminative training: sample codes {z0, z 1,...} and corresponding groundtruth patches {g 0, g 1,...} split F ( ) into independently trained predictors [f 0, f 1,..., f (m 2 h 1)], one for each output element we want: f j (z i ) g i [j] i, j train each fj ( ) using logistic regression replace z with concatenated multipath codes

49 Multiple Scales

50 Multiple Scales Layer 1 Dictionaries 5x5, 64 atoms 11x11, 64 atoms 5x5, 512 atoms 11x11, 512 atoms 21x21, 512 atoms 31x31, 512 atoms

51 Multiple Scales Layer 1 Dictionaries 5x5, 64 atoms 11x11, 64 atoms 5x5, 512 atoms 11x11, 512 atoms 21x21, 512 atoms 31x31, 512 atoms pooling Layer 2 Dictionaries 5x5, 512 atoms 5x5, 512 atoms

52 Multiple Scales Layer 1 Dictionaries 5x5, 64 atoms 11x11, 64 atoms 5x5, 512 atoms 11x11, 512 atoms 21x21, 512 atoms 31x31, 512 atoms pooling = Layer 2 Dictionaries 5x5, 512 atoms 5x5, 512 atoms rectify, upsample, concatenate sparse activation maps

53 Multiple Scales Layer 1 Dictionaries 5x5, 64 atoms 11x11, 64 atoms 5x5, 512 atoms 11x11, 512 atoms 21x21, 512 atoms 31x31, 512 atoms pooling = Layer 2 Dictionaries 5x5, 512 atoms 5x5, 512 atoms rectify, upsample, concatenate sparse activation maps dimensions Sparse Representation

upsample, concatenate sparse activation maps =

54 Multiple Scales Layer 1 Dictionaries 5x5, 64 atoms 11x11, 64 atoms 5x5, 512 atoms 11x11, 512 atoms 21x21, 512 atoms 31x31, 512 atoms pooling = Layer 2 Dictionaries 5x5, 512 atoms 5x5, 512 atoms rectify, upsample, concatenate sparse activation maps = dimensions Contour Reconstruction Sparse Representation

55 Contour Detection Groundtruth [Martin, Fowlkes, Tal, and Malik, ICCV 2001]

56 Contour Detection Results

57 Contour Detection Results

58 Contour Detection Performance

59 Contour Detection Performance

60 Contour Detection Performance Performance Metric Hand-Designed Spectral ODS F OIS F AP Features? Filters? Globalization? Human Structured Edges yes no no local SCG (color) no yes no Sparse Code Transfer Layers no no no Sparse Code Transfer Layer no no no local SCG (gray) no yes no multiscale Pb yes yes no Canny Edge Detector yes yes no global SCG (color) yes yes yes global Pb + UCM yes yes yes + UCM global Pb yes yes yes Sparse Code Transfer: Performance competitive with top approaches Both representation and classifier are learned Free from reliance on hand-designed features or filters

61 Texture and Network Depth Layer 1 Layers 1+2

62 Texture and Network Depth Layer 1 Layers 1+2

63 Semantic Labeling hair skin background Labeled Faces in the Wild Dataset [Kae, Sohn, Lee, Learned-Miller, CVPR 2013]

64 Summary

65 Summary Semantic labeling using spatially localized slices of deep representations

66 Summary Semantic labeling using spatially localized slices of deep representations Sparse reconstruction generalized to multipath networks

67 Summary Semantic labeling using spatially localized slices of deep representations Sparse reconstruction generalized to multipath networks Transfer learning perspective: Generatively trained patch representation Task-specific discriminately trained transfer classifier

68 Summary Semantic labeling using spatially localized slices of deep representations Sparse reconstruction generalized to multipath networks Transfer learning perspective: Generatively trained patch representation Task-specific discriminately trained transfer classifier Pure learning for high performance contour detection

69 Summary Semantic labeling using spatially localized slices of deep representations Sparse reconstruction generalized to multipath networks Transfer learning perspective: Generatively trained patch representation Task-specific discriminately trained transfer classifier Pure learning for high performance contour detection Contours obtained as a byproduct of deep representations

70 Summary Semantic labeling using spatially localized slices of deep representations Sparse reconstruction generalized to multipath networks Transfer learning perspective: Generatively trained patch representation Task-specific discriminately trained transfer classifier Pure learning for high performance contour detection Contours obtained as a byproduct of deep representations Thank You!

arxiv: v1 [cs.cv] 16 Oct 2014

arxiv: v1 [cs.cv] 16 Oct 2014 Reconstructive Sparse Code Transfer for Contour Detection and Semantic Labeling Michael Maire 1,2, Stella X. Yu 3, and Pietro Perona 2 arxiv:1410.4521v1 [cs.cv] 16 Oct 2014 1 TTI Chicago 2 California Institute