Constrained Convolutional Neural Networks for Weakly Supervised Segmentation. Deepak Pathak, Philipp Krähenbühl and Trevor Darrell


1 Constrained Convolutional Neural Networks for Weakly Supervised Segmentation. Deepak Pathak, Philipp Krähenbühl and Trevor Darrell

2 Multi-class Image Segmentation: assign a class label to each pixel in the image. [Figure labels: background, table, chair]

3 Multi-class Image Segmentation: pixel-level classification. Train a classifier $Q_i(l)$ for each pixel $i$ and label $l$, using a convolutional neural network (CNN) trained end-to-end. Reference: Fully Convolutional Networks for Semantic Segmentation [Long et al. CVPR 2015]. [Figure labels: background, chair, table]
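As a minimal sketch of the per-pixel classifier $Q_i(l)$, assuming the CNN emits one raw score map per label: a softmax across the label axis turns the scores into a distribution at each pixel. The function name `pixelwise_softmax` and the random scores are illustrative assumptions, not part of the talk.

```python
import numpy as np

def pixelwise_softmax(scores):
    """scores: (n_labels, H, W) raw score maps -> Q with Q[:, y, x] summing to 1."""
    shifted = scores - scores.max(axis=0, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=0, keepdims=True)

# Hypothetical scores for 3 labels (background, chair, table) on a 4x5 image.
scores = np.random.default_rng(0).normal(size=(3, 4, 5))
Q = pixelwise_softmax(scores)
labels = Q.argmax(axis=0)  # predicted label per pixel
```

Taking the arg-max over the label axis gives the predicted segmentation mask.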

4 How does prior work train a CNN? Backpropagation with stochastic gradient descent (SGD) on a large labeled dataset.

5 Limitation: Training Supervision. Prior work needs full supervision, which is time-consuming to obtain (79 s per label per image [Russakovsky et al. arXiv 2015]) and expensive, making it a bottleneck for learning models at large scale.

6 Weak Training Supervision. Weak supervision uses class labels or tags, which are cheap to obtain (1 s per label per image [Russakovsky et al. arXiv 2015]) and scale to a large number of categories. [Figure labels: person, horse, background]

7 Training a CNN using weak supervision: prior work. Multiple instance learning: if a tag is present, at least one pixel takes that label; if a tag is absent, no pixel takes that label. It has shown promise for weak detection. [Figure labels: person, car, background]
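The slide does not give the loss itself, so the following is only a sketch of one common multiple-instance formulation, assumed here for illustration: for each present tag the most confident pixel is pushed toward that label, and for each absent tag the most confident pixel is pushed away from it. The name `mil_loss` and the toy probabilities are hypothetical.

```python
import numpy as np

def mil_loss(Q, present, eps=1e-12):
    """Q: (n_pixels, n_labels) per-pixel label probabilities.
    present: boolean array (n_labels,), True where the image carries the tag."""
    top = Q.max(axis=0)  # most confident pixel for each label
    loss_present = -np.log(top[present] + eps).sum()        # at least one pixel takes the label
    loss_absent = -np.log(1.0 - top[~present] + eps).sum()  # no pixel takes the label
    return float(loss_present + loss_absent)

# Two pixels, three labels; the third label is absent from the image tags.
Q = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1]])
present = np.array([True, True, False])
loss = mil_loss(Q, present)
```

Only one pixel per label contributes to this loss, which is the weak training signal that the next slide points out.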

8 Multiple instance learning: issues. The signal is very weak (one pixel per class per image), so training converges to bad local minima. It requires good initialization and heuristics to get out of local optima. [Figure labels: person, car, background]

9 Weakly Supervised Training: is there a better description of the desired solution? [Figure labels: person, car, background]

10 Idea: weakly supervised training with constraints. [Figure labels: Space of Segmentation Masks, Space of Good Segmentation Label Masks, Constraint Hyperplanes]

11 Description Constraints. Suppression constraint: suppress labels that do not appear in the image. [Figure labels: person, car, background; Horse, Cat, Dog, Others]

12 Description Constraints. Foreground constraint: label at least some pixels for each object present. [Figure labels: person, car, background; Person, Car]

13 Description Constraints. Background constraint: the number of background pixels in an image should be bounded, say between 10% and 75% of the pixels. [Figure labels: person, car, background; Background]
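The three constraints above can be sketched as checks on expected label counts (the sum of each label's probability over all pixels). This is a minimal illustration: the 10% to 75% background range comes from the slide, while the 5% foreground minimum, the function name `check_constraints`, and the label indexing (0 is background) are assumptions.

```python
import numpy as np

def check_constraints(P, present, fg_min_frac=0.05, bg_range=(0.10, 0.75), tol=1e-6):
    """P: (n_pixels, n_labels), rows sum to 1; label 0 is background.
    present: set of foreground label indices tagged in the image."""
    n, n_labels = P.shape
    counts = P.sum(axis=0)  # expected number of pixels per label
    absent = [l for l in range(1, n_labels) if l not in present]
    return {
        # Suppression: labels not in the image get (almost) no pixels.
        "suppression": all(counts[l] <= tol for l in absent),
        # Foreground: every tagged label covers at least a minimum area (assumed 5%).
        "foreground": all(counts[l] >= fg_min_frac * n - tol for l in present),
        # Background: bounded between 10% and 75% of the pixels.
        "background": bool(bg_range[0] * n - tol <= counts[0] <= bg_range[1] * n + tol),
    }

# 10 pixels: 5 background, 5 labelled 1, none labelled 2.
P = np.zeros((10, 3))
P[:5, 0] = 1.0
P[5:, 1] = 1.0
ok = check_constraints(P, present={1})
```

Each check is linear in `P`, which is what lets the constraints be written as hyperplanes over the space of segmentation masks.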

14 Constrained Convolutional Neural Network (CCNN): convolutional neural network + constraints. [Figure labels: CNN, Space of Segmentation Masks, Space of Good Segmentation Label Masks, Constraint Hyperplanes]

15 How to constrain CNN output? Placing constraints directly on the CNN output distribution $Q_I$ (network parameters $\theta$) is expensive and non-convex.

16 CCNN: output as latent distribution. Introduce a latent variable $P_I$ for the distribution of the network output, apply the constraints on the latent distribution, and minimize the distance between $P_I$ and $Q_I$ (the output of the CNN with parameters $\theta$).

17 CCNN Optimization: KL-divergence minimization between the latent distribution $P_I$ and the network output distribution $Q_I$ (CNN with parameters $\theta$): $\min_{\theta, P} D(P_I \| Q_I)$ subject to $A_I \vec{P}_I \ge \vec{b}$, $\vec{1}^\top \vec{P}_I = 1$.

18 CCNN Optimization: solves the same optimization problem, $\min_{\theta, P} D(P_I \| Q_I)$ subject to $A_I \vec{P}_I \ge \vec{b}$, $\vec{1}^\top \vec{P}_I = 1$. The problem is convex in $P$, and for $Q$ it is the standard convnet loss (log-likelihood / cross entropy), which is convex for a log-linear model (logistic regression).
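To spell out why the objective reduces to a standard convnet loss for $Q$, a short derivation in the slide's notation may help. Writing the divergence as a sum over labelings $X$ is an assumption about how $D$ is expanded.

```latex
D(P_I \,\|\, Q_I)
  = \sum_X P_I(X) \log P_I(X) \;-\; \sum_X P_I(X) \log Q_I(X)

% With P_I held fixed, the first term does not depend on \theta, so
\arg\min_{\theta} D(P_I \,\|\, Q_I)
  = \arg\min_{\theta} \; -\sum_X P_I(X) \log Q_I(X)
```

The remaining term is the cross entropy between the fixed latent distribution $P_I$, acting as a soft target, and the network output $Q_I$.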

19 CCNN Optimization: Alternating Minimization. Optimization uses block coordinate descent: (1) solve for $P$ while the convnet parameters $\theta$ are fixed, $\min_{P} D(P_I \| Q_I)$ subject to $A_I \vec{P}_I \ge \vec{b}$, $\vec{1}^\top \vec{P}_I = 1$; (2) take a gradient step in $\theta$ while $P$ is fixed. Each step is guaranteed to decrease the overall objective. [Figure labels: $P^{(0)}$, $Q^{(0)}$, SGD, $P^{(1)}$, $Q^{(1)}$, Constrained Region]
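The alternation can be sketched on a toy problem. Everything below is an illustrative assumption rather than the authors' implementation: a per-pixel logit table `theta` stands in for the CNN, the constraints are count constraints on three labels, the suppression constraint is relaxed to 1% of the pixels so the toy dual stays bounded, and the $P$ step is solved by projected gradient ascent on the Lagrange multipliers, which is one possible solver.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def kl(P, Q, eps=1e-12):
    return float(np.sum(P * (np.log(P + eps) - np.log(Q + eps))))

def project(Q, C, b, steps=2000):
    """Approximately solve min_P KL(P || Q) s.t. C @ P.sum(axis=0) >= b.
    The minimizer has the form P_i(l) proportional to Q_i(l) * exp((C^T lam)_l), lam >= 0."""
    n = Q.shape[0]
    lam = np.zeros(len(b))
    log_q = np.log(Q)
    for _ in range(steps):
        P = softmax(log_q + C.T @ lam)
        # Dual gradient is the constraint violation; project lam back onto lam >= 0.
        lam = np.maximum(0.0, lam + (b - C @ P.sum(axis=0)) / n)
    return softmax(log_q + C.T @ lam)

n = 100  # pixels; labels: 0 background, 1 person (tagged), 2 car (not tagged)
C = np.array([[0.0, 1.0, 0.0],    # person count >= 5% of pixels (assumed minimum)
              [1.0, 0.0, 0.0],    # background count >= 10%
              [-1.0, 0.0, 0.0],   # background count <= 75%
              [0.0, 0.0, -1.0]])  # car count <= 1% (relaxed suppression, assumed)
b = np.array([0.05, 0.10, -0.75, -0.01]) * n

rng = np.random.default_rng(0)
theta = rng.normal(size=(n, 3))
theta[:, 0] += 3.0  # start biased toward background, violating the constraints

history = []
for _ in range(20):
    Q = softmax(theta)
    P = project(Q, C, b)      # step 1: solve for P with theta fixed
    history.append(kl(P, Q))
    theta -= 0.5 * (Q - P)    # step 2: gradient step on the cross entropy, P fixed
```

In this toy run `history` records the objective after each $P$ step; it should shrink as the output distribution is pulled toward the constrained region.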

20 Summary: Constrained CNN. [Figure labels: Input Image, Predicted labeling, CNN, Network Output, Weak Labels (Person, Car), Latent Distribution, Constrained Region]

21 Evaluation: VOC 2012 dataset, trained using 10,582 tagged images. Training time: 8 hrs. Constraint satisfaction: 30 ms per image on CPU. Forward-backward pass: 400 ms per image on GPU. Evaluated on the Intersection over Union score.
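For reference, a minimal sketch of the Intersection over Union score on a single label map. The benchmark accumulates statistics over the whole dataset, so this per-image version, the function name `mean_iou`, and the tiny arrays are simplifying assumptions.

```python
import numpy as np

def mean_iou(pred, gt, n_labels):
    """Mean over labels of |pred == l AND gt == l| / |pred == l OR gt == l|."""
    ious = []
    for l in range(n_labels):
        inter = np.sum((pred == l) & (gt == l))
        union = np.sum((pred == l) | (gt == l))
        if union > 0:  # skip labels absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))

pred = np.array([[0, 0],
                 [1, 1]])
gt = np.array([[0, 1],
               [1, 1]])
score = mean_iou(pred, gt, n_labels=2)
```

Here label 0 has IoU 1/2 and label 1 has IoU 2/3, so `score` is their average.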

22 Results: State of the Art. State-of-the-art weakly supervised semantic segmentation.

23 Additional 1-bit Supervision. 1-bit additional information: whether the object size is small (<10%) or large (>10%). Size constraints: boost large objects, limit small objects. [Figure labels: person, car, background; Car: large, Person: small]
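One way to read the size constraints is as bounds on each class's expected pixel count. The sketch below assumes that reading: the 10% threshold is from the slide, while the function name `size_bounds` and the default lower bound of zero for small objects are hypothetical.

```python
def size_bounds(n_pixels, is_large, threshold=0.10, min_frac=0.0):
    """Return (lower, upper) bounds on the expected pixel count of one object class."""
    if is_large:
        # Boost large objects: they must cover at least 10% of the image.
        return threshold * n_pixels, float(n_pixels)
    # Limit small objects: they may cover at most 10% of the image.
    return min_frac * n_pixels, threshold * n_pixels

car_lo, car_hi = size_bounds(1000, is_large=True)
person_lo, person_hi = size_bounds(1000, is_large=False)
```

Bounds of this kind are linear in the latent distribution, so they can sit alongside the suppression, foreground, and background constraints.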

24 Results: Comparison with Fully Supervised. 10% improvement using 1-bit additional supervision at training time.

25 [Figure labels: train, cat, person, bicycle, sofa, dog, sheep, horse]

26 [Figure labels: person, table, person, plant, bottle, bicycle]

27 Questions? Paper (and code) is available: Constrained Convolutional Neural Networks for Weakly Supervised Segmentation, ICCV
