Speaker: Ming-Ming Cheng Nankai University 15-Sep-17 Towards Weakly Supervised Image Understanding

1 Towards Weakly Supervised Image Understanding (WSIU) Speaker: Ming-Ming Cheng Nankai University 1/50

2 Understanding Visual Information Image by kirkh.deviantart.com 2/50

3 Dataset Annotation 3/50

4 Dataset Annotation. PASCAL 11: 10? workers, bounding boxes. ImageNet: workers, images labeled with one word. My mother: segmented objects. Job offer: I am looking for more parents. CVML 2012, Antonio Torralba 4/50

5 Dataset Annotation 5/50

6 Towards WSIU. Graphics/vision applications: editing, synthesis, web images. Light weighted semantic parsing: semantic segmentation, interaction. Low level vision: attention, boundary, segmentation. 6/50

7 Visual attention: motivation 7/50

8 Visual attention: fixation prediction, salient object detection, objectness proposals. Deeply supervised salient object detection with short connections, IEEE CVPR 2017. Salient Object Detection: A Discriminative Regional Feature Integration Approach, IJCV 2017. Salient Object Detection: A Benchmark, IEEE TIP 2015. Global Contrast based Salient Region Detection, IEEE TPAMI 2015. BING: Binarized Normed Gradients for Objectness Estimation at 300fps, IEEE CVPR 2014.

9 Global Contrast based Salient Region Detection, IEEE TPAMI, 2015, MM Cheng et al. (2nd most cited paper in CVPR 2011) 10/50

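10 Core idea: region contrast (RC). The image is segmented into regions, and each region's saliency is its contrast to all other regions, computed by sparse histogram comparison: $S(r_k) = \sum_{r_k \neq r_i} \exp\left(-\frac{D_s(r_k, r_i)}{\sigma_s^2}\right) \omega(r_i)\, D_r(r_k, r_i)$. Here the exponential term is the spatial weighting controlled by $\sigma_s^2$, $\omega(r_i)$ is the region size, and $D_r(r_k, r_i)$ is the region contrast. 11/50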
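
The region contrast score on this slide can be sketched in NumPy. This is a minimal illustration, not the authors' implementation. It assumes the segmentation is already done and that each region comes with a normalized color histogram over a shared quantized palette. The function name, the argument layout, the centroid coordinates normalized to [0, 1], and the default value of sigma_s^2 are all assumptions made for this example.

```python
import numpy as np

def region_contrast(hists, centroids, sizes, bin_colors, sigma_s2=0.4):
    """Sketch of S(r_k) = sum_{i != k} exp(-D_s(r_k, r_i) / sigma_s^2) * w(r_i) * D_r(r_k, r_i).

    hists      : (R, B) normalized color histogram per region (hypothetical layout)
    centroids  : (R, 2) region centroids, coordinates assumed normalized to [0, 1]
    sizes      : (R,)   region sizes in pixels, used as the weight w(r_i)
    bin_colors : (B, 3) color of each histogram bin
    sigma_s2   : spatial weighting strength (assumed default)
    """
    hists = np.asarray(hists, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    sizes = np.asarray(sizes, dtype=float)
    bin_colors = np.asarray(bin_colors, dtype=float)

    # Color distance between every pair of histogram bins, shape (B, B).
    bin_dist = np.linalg.norm(bin_colors[:, None, :] - bin_colors[None, :, :], axis=-1)

    # D_r: histogram-weighted color distance between every pair of regions, shape (R, R).
    d_r = hists @ bin_dist @ hists.T

    # D_s: spatial distance between region centroids, shape (R, R).
    d_s = np.linalg.norm(centroids[:, None, :] - centroids[None, :, :], axis=-1)

    # Spatial weighting; the diagonal is zeroed so a region is not compared with itself.
    weight = np.exp(-d_s / sigma_s2)
    np.fill_diagonal(weight, 0.0)

    return (weight * sizes[None, :] * d_r).sum(axis=1)


# Two dark regions and one bright region: the bright one differs from both others.
scores = region_contrast(
    hists=[[1, 0], [1, 0], [0, 1]],
    centroids=[[0.2, 0.2], [0.8, 0.2], [0.5, 0.5]],
    sizes=[100, 100, 50],
    bin_colors=[[0, 0, 0], [1, 1, 1]],
)
print(scores)
```

In this toy input the third region receives the highest score, because it contrasts with two large regions while each dark region contrasts only with one small one.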

11 Experimental results Dataset: MSRA1000 [Achanta09] Precision vs. recall 12/50

12 Supervised feature integration. Salient Object Detection: A Discriminative Regional Feature Integration Approach, IJCV 2017. Salient Object Detection: A Benchmark, IEEE TIP 2015.

13 Going with deep models. Deeply supervised salient object detection with short connections, IEEE CVPR 2017.

14 Bridging between multi-levels 15/50

15 Messages from numbers 16/50

16 Methodology: observation. Objects are stand-alone things with well-defined closed boundaries and centers. Little variation is present in such an abstracted view. References: Finding pictures of objects in large collections of images, Springer Berlin Heidelberg, 1996, Forsyth et al. Using stuff to find things, ECCV 2008, Heitz et al. Measuring the objectness of image windows, IEEE TPAMI 2012, Alexe et al. 17/50
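
One way to make the "abstracted view" concrete is a small, fixed-size map of gradient magnitudes, which is what the "normed gradients" in the BING title refer to. The sketch below is only an illustration under stated assumptions: the 8x8 target size and the clamp at 255 follow my reading of the BING paper rather than this slide, the nearest-neighbour resize is a stand-in for a proper image resize, and `normed_gradient_feature` is a hypothetical name.

```python
import numpy as np

def normed_gradient_feature(window, size=8):
    """Return a size*size normed-gradient descriptor for a grayscale window.

    window : 2D array of pixel intensities (any height and width)
    size   : side length of the abstracted view (8 is an assumption)
    """
    window = np.asarray(window, dtype=float)
    h, w = window.shape

    # Nearest-neighbour downsampling to size x size (stand-in for a real resize).
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    small = window[np.ix_(rows, cols)]

    # Gradient magnitude as |gx| + |gy|, clamped to 255.
    gy, gx = np.gradient(small)
    ng = np.minimum(np.abs(gx) + np.abs(gy), 255.0)
    return ng.ravel()


# A window with a vertical step edge produces non-zero responses along the edge.
edge = np.zeros((16, 16))
edge[:, 8:] = 200.0
print(normed_gradient_feature(edge).reshape(8, 8))
```

Because every window is reduced to the same tiny grid, windows of very different sizes and aspect ratios become directly comparable, which is the point of the observation on this slide.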

17 Experimental results: proposal quality on PASCAL VOC

18 Experimental results: computational time. A laptop with an Intel i7-3940XM CPU. Training: 20 seconds on the PASCAL 2007 training set! Testing: 300fps on VOC 2007 images. Time (seconds) is compared for the methods [1], OBN [2], CSVM [3], SEL [4], and our BING. [1] Category-Independent Object Proposals With Diverse Ranking, PAMI 2014, Endres et al. [2] Measuring the objectness of image windows, PAMI 2012, Alexe et al. [3] Proposal Generation for Object Detection using Cascaded Ranking SVMs, CVPR 2011, Zhang et al. [4] Selective Search for Object Recognition, IJCV 2013, Uijlings et al. 19/50

19 Towards WSIU. Graphics/vision applications: editing, synthesis, web images. Light weighted semantic parsing: semantic segmentation, interaction. Low level vision: attention, boundary, segmentation. 20/50

20 Boundary. Richer Convolutional Features for Edge Detection, IEEE CVPR 2017.

22 Samples: image, ground truth, results 23/50

23 45 years of boundary detection Source: Arbelaez 24/50

24 State of the art 25/50

25 Towards WSIU. Graphics/vision applications: editing, synthesis, web images. Light weighted semantic parsing: semantic segmentation, interaction. Low level vision: attention, boundary, segmentation. 26/50

26 SaliencyCut. Iterative refinement: iteratively run GrabCut to refine the segmentation. Adaptive fitting: adaptively fit to the newly segmented salient region. Automatic initialization is provided by salient object detection. Global Contrast based Salient Region Detection, IEEE TPAMI 2015.
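
The initialize, refit, refine loop described on this slide can be sketched as follows. This is a simplified illustration, not the SaliencyCut implementation. The refinement step here is a nearest-mean-color assignment that stands in for GrabCut, which uses Gaussian mixture models and a graph cut. The function name, the initial threshold of 0.7, and the iteration cap are assumptions made for this example.

```python
import numpy as np

def saliency_cut_sketch(image, saliency, init_thresh=0.7, max_iters=4):
    """Iteratively refine a foreground mask seeded by a saliency map.

    image    : (H, W, 3) float color image
    saliency : (H, W) saliency values in [0, 1]
    Returns a boolean (H, W) foreground mask.
    """
    image = np.asarray(image, dtype=float)
    saliency = np.asarray(saliency, dtype=float)

    # Automatic initialization: threshold the saliency map (threshold is assumed).
    fg = saliency >= init_thresh

    for _ in range(max_iters):
        if fg.all() or not fg.any():
            break  # one of the two models would be empty

        # Adaptive fitting: refit the color models to the current segmentation.
        fg_mean = image[fg].mean(axis=0)
        bg_mean = image[~fg].mean(axis=0)

        # Refinement (stand-in for a GrabCut pass): assign each pixel to the closer model.
        d_fg = np.linalg.norm(image - fg_mean, axis=-1)
        d_bg = np.linalg.norm(image - bg_mean, axis=-1)
        new_fg = d_fg < d_bg

        if np.array_equal(new_fg, fg):
            break  # converged
        fg = new_fg

    return fg


# Toy example: a bright 4x4 object, with saliency marking only its 2x2 core.
img = np.zeros((6, 6, 3))
img[1:5, 1:5] = 1.0
sal = np.full((6, 6), 0.1)
sal[2:4, 2:4] = 0.9
print(saliency_cut_sketch(img, sal).astype(int))
```

In the toy example the mask grows from the salient core to the whole object after one refit, which mirrors how the saliency map only needs to provide a rough seed.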

27 Salient shape. Is salient object detection for simple images useful? SalientShape: Group Saliency in Image Collections, TVC 2014.

28 Segmentation. HFS: Hierarchical Feature Selection for Efficient Image Segmentation, ECCV

29 Segmentation. HFS: Hierarchical Feature Selection for Efficient Image Segmentation, ECCV

30 Towards WSIU. Graphics/vision applications: editing, synthesis, web images. Light weighted semantic parsing: semantic segmentation, interaction. Low level vision: attention, boundary, segmentation. 31/50

31 STC: 10% improvement over state of the art! STC: A Simple to Complex Framework for Weakly-supervised Semantic Segmentation, IEEE TPAMI 2017.

32 Interaction. Object Region Mining with Adversarial Erasing: A Simple Classification to Semantic Segmentation Approach, IEEE CVPR 2017 (oral).

33 Results 34/50

34 Towards WSIU. Graphics/vision applications: editing, synthesis, web images. Light weighted semantic parsing: semantic segmentation, interaction. Low level vision: attention, boundary, segmentation. 35/50

35 Motivation Pixels/patches Objects Scene Layering 36/50

36 Motivation. Rough depth ordering is possible for a single image with repeated elements. RepFinder: Finding Approximately Repeated Scene Elements for Image Editing, ACM TOG 37/50

37 Image rearrangement Source image 38/50

40 Towards WSIU. Graphics/vision applications: editing, synthesis, web images. Light weighted semantic parsing: semantic segmentation, interaction. Low level vision: attention, boundary, segmentation. 41/50

41 ImageSpirit: Verbal Guided Image Parsing, ACM TOG

42 Motivations 43/50

43 Verbal guided image parsing. Example command: "Make the wood cabinet in bottom-middle lower". Nouns map to objects, adjectives to attributes, and verbs/adverbs to commands. Multi-label CRF. 44/50

44 Towards WSIU. Graphics/vision applications: editing, synthesis, web images. Light weighted semantic parsing: semantic segmentation, interaction. Low level vision: attention, boundary, segmentation. 45/50

45 Sketch2Photo. Sketch2Photo: internet image montage, ACM TOG 2009.

46 Towards WSIU. Graphics/vision applications: editing, synthesis, web images. Light weighted semantic parsing: semantic segmentation, interaction. Low level vision: attention, boundary, segmentation. 47/50

47 Dealing with web images. UCSD, Zhuowen Tu, CVPR 12; NUS, Ping Tan, ACM TOG 11; MIT, Rubinstein, CVPR 13; Columbia, Shihfu Chang, CVPR 12; NUS, Shuicheng Yan, PAMI 17; Beijing Institute of Technology, Hua Huang, ACM TOG 11. 48/50

48 free 49/50

49 Thanks! 50/50

50 Outline of the survey. Low level vision: Attention (CVPR 11, CVPR 14, TPAMI 15, IEEE TIP 15, IJCV 17, CVPR 17); Boundary (CVPR 17); Segmentation (TVC 14, CGF 15, ECCV 15, TPAMI 15). Light weighted semantic parsing: Semantic segmentation (TPAMI 17, CVPR (oral) 17); Interaction (TOG 15, TOG 10, TOG 12). Graphics/vision applications: Editing (TOG 14); Synthesis (TOG 09); Web images (CVPR 12, TOG 11, CVPR 12, CVPR 13, TOG 12). 51/50
