
Image Segmentation. Philipp Krähenbühl, Stanford University. April 24, 2013.

Image Segmentation. Goal: identify groups of pixels that go together.

Success Story

Gestalt Theory. Gestalt: whole or group. The whole is greater than the sum of its parts: relationships between parts can yield new properties/features. Psychologists identified a series of factors that predispose a set of elements to be grouped (by the human visual system). Max Wertheimer (1880-1943): "I stand at the window and see a house, trees, sky. Theoretically I might say there were 327 brightnesses and nuances of color. Do I have 327? No. I have sky, house, and trees." Untersuchungen zur Lehre von der Gestalt, Psychologische Forschung, Vol. 4, pp. 301-350, 1923. http://psy.ed.asu.edu/~classics/wertheimer/forms/forms.htm

Gestalt Theory. These factors make intuitive sense, but are very difficult to translate into algorithms.

Segmentation as clustering. Pixels are points in a high-dimensional space: color (3-D), or color + location (5-D). Cluster pixels into segments. (Figure: pixels of an example image plotted in color space.)

K-Means clustering
1. Randomly initialize K cluster centers c_1, ..., c_K.
2. Given the cluster centers, determine the points in each cluster: for each point p, find the closest c_i and put p into cluster i.
3. Given the points in each cluster, solve for the c_i: set c_i to be the mean of the points in cluster i.
4. If any c_i has changed, repeat from Step 2.

K-Means clustering
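A minimal numpy sketch of the algorithm above (Lloyd's algorithm) — an illustration, not the lecture's reference code. For segmentation, points would be the image pixels reshaped to an (N, d) array of color or color+location features:

```python
import numpy as np

def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm: alternate assignment and mean updates."""
    rng = np.random.default_rng(seed)
    # 1. Randomly initialize K cluster centers from the data points.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # 2. Assign each point to its closest center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # 3. Re-estimate each center as the mean of its assigned points.
        new_centers = np.array([points[labels == i].mean(axis=0)
                                if np.any(labels == i) else centers[i]
                                for i in range(k)])
        # 4. Stop when the centers no longer change.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels
```

For image segmentation, reshape an H x W x 3 image to (H*W, 3), cluster, then reshape the labels back to H x W.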

Expectation Maximization (EM). Goal: find blob parameters θ that maximize the likelihood function P(data | θ) = ∏_x P(x | θ).
Approach:
1. E-step: given the current guess of the blobs, compute the ownership of each point.
2. M-step: given the ownership probabilities, update the blobs to maximize the likelihood function.
3. Repeat until convergence.
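To make the E/M alternation concrete, a compact numpy sketch of EM for a 1-D mixture of Gaussian "blobs" — an illustration with variable names of my choosing; the lecture's blobs would be multivariate:

```python
import numpy as np

def em_gmm(x, k, n_iter=50, seed=0):
    """EM for a 1-D mixture of Gaussians; x is a length-N array."""
    rng = np.random.default_rng(seed)
    mu = rng.choice(x, size=k)            # blob means
    var = np.full(k, x.var())             # blob variances
    pi = np.full(k, 1.0 / k)              # mixing weights
    for _ in range(n_iter):
        # E-step: ownership r[n, j] = P(blob j | x_n) under the current blobs.
        log_p = (np.log(pi) - 0.5 * np.log(2 * np.pi * var)
                 - 0.5 * (x[:, None] - mu) ** 2 / var)
        r = np.exp(log_p - log_p.max(axis=1, keepdims=True))
        r /= r.sum(axis=1, keepdims=True)
        # M-step: update blob parameters to maximize the expected likelihood.
        nk = r.sum(axis=0)
        pi = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return pi, mu, var
```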

Expectation Maximization (EM)

Mean-Shift Algorithm. Iterative mode search:
1. Initialize a random seed and window W.
2. Calculate the center of gravity (the "mean") of W: Σ_{x∈W} x h(x), normalized by Σ_{x∈W} h(x), where h is the kernel weight.
3. Shift the search window to the mean.
4. Repeat from Step 2 until convergence.
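A minimal sketch of this mode search, assuming points is an (N, d) array, the seed is the index of a data point, and h(x) is a flat (uniform) kernel of radius h — the kernel choice is an assumption, not specified above:

```python
import numpy as np

def mean_shift_mode(points, seed, h=1.0, tol=1e-5, max_iter=100):
    """Shift a window toward the local mean until it settles on a mode."""
    x = points[seed].astype(float)              # start the window at a data point
    for _ in range(max_iter):
        in_window = np.linalg.norm(points - x, axis=1) < h
        if not in_window.any():                 # degenerate: empty window
            break
        mean = points[in_window].mean(axis=0)   # center of gravity of W
        if np.linalg.norm(mean - x) < tol:      # window stopped moving: a mode
            break
        x = mean                                # shift the window to the mean
    return x
```

For segmentation (next slide), this search is run from every pixel's feature vector, and endpoints that land near the same mode are merged.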

Mean-Shift Segmentation. Iterative mode search:
- Find features (color, gradients, texture, etc.).
- Initialize windows at individual pixel locations.
- Perform mean shift for each window until convergence.
- Merge windows that end up near the same peak or mode.

Expectation Maximization (EM) Philipp Krähenbühl (Stanford University) Segmentation April 24, 2013 13 / 63

Back to Image Segmentation. Goal: identify groups of pixels that go together. Up to now, we have focused on ways to group pixels into image segments based on their appearance: segmentation as clustering. We also want to enforce region constraints: spatial consistency and smooth borders.

What will we learn today?
- Graph-theoretic segmentation: Normalized Cuts; using texture features.
- Segmentation as energy minimization: Markov Random Fields (MRF) / Conditional Random Fields (CRF); graph cuts for image segmentation; applications.

Images as Graphs. A fully-connected graph: a node (vertex) for every pixel; a link between every pair of pixels (p, q); an affinity weight w_pq for each link (edge). w_pq measures similarity: it is inversely proportional to the difference in color and position. Slide Credit: Steve Seitz

Segmentation by Graph Cuts. Break the graph into segments (cliques) by deleting links that cross between segments. It is easiest to break links that have low similarity (low affinity weight): similar pixels should be in the same segment, and dissimilar pixels should be in different segments. Slide Credit: Steve Seitz

Measuring Affinity
Distance: exp(−(1/(2σ²)) ‖x − y‖²)
Intensity: exp(−(1/(2σ²)) ‖I(x) − I(y)‖²)
Color: exp(−(1/(2σ²)) dist(c(x), c(y))²), for a suitable color distance
Texture: exp(−(1/(2σ²)) ‖f(x) − f(y)‖²), where f is the filter-bank output
Source: Forsyth & Ponce
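All four affinities above instantiate one template, exp(−d²/(2σ²)), differing only in the distance d. A small sketch (the function names are mine):

```python
import numpy as np

def affinity(d2, sigma):
    """Generic affinity: exp(-d^2 / (2 sigma^2)) for a squared distance d2."""
    return np.exp(-d2 / (2.0 * sigma ** 2))

# Instantiations for two pixels with positions p, q, intensities I_p, I_q,
# and filter-bank (texture) responses f_p, f_q:
def distance_affinity(p, q, sigma):
    return affinity(np.sum((p - q) ** 2), sigma)

def intensity_affinity(I_p, I_q, sigma):
    return affinity((I_p - I_q) ** 2, sigma)

def texture_affinity(f_p, f_q, sigma):
    return affinity(np.sum((f_p - f_q) ** 2), sigma)
```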

Scale Affects Affinity. Small σ: group only nearby points. Large σ: group far-away points. (Figure: affinity as a function of squared distance for small, medium, and large σ.) Slide Credit: Svetlana Lazebnik

Graph Cut: Using Eigenvalues
Given the affinity matrix W, extract a single good cluster v_n, where v_n(i) is the probability of point i belonging to the cluster. Its elements should have high affinity with each other.
Constraint: v_nᵀ v_n = 1 (prevents v_n from growing without bound).
Objective: maximize v_nᵀ W v_n subject to the constraint, i.e. maximize v_nᵀ W v_n + λ(1 − v_nᵀ v_n).
This reduces to the eigenvalue problem W v_n = λ v_n.

Graph Cut: Using Eigenvalues. (Figure: an example point set and the eigenvectors corresponding to the 4 largest eigenvalues of its affinity matrix.)

Clustering by Graph Eigenvectors
1. Construct an affinity matrix.
2. Compute the eigenvalues and eigenvectors of the affinity matrix.
3. Until there are sufficient clusters:
   - Take the eigenvector corresponding to the largest unprocessed eigenvalue.
   - Zero all components corresponding to elements that have already been clustered.
   - Threshold the remaining components to determine which elements belong to this cluster; choose a threshold by clustering the components, or use a threshold fixed in advance.
   - If all elements have been accounted for, there are sufficient clusters: end.
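A direct transcription of this loop for a precomputed symmetric affinity matrix W, using the simpler of the two thresholding options above (a fixed threshold):

```python
import numpy as np

def eigenvector_clusters(W, threshold=0.1):
    """Peel off clusters from the leading eigenvectors of an affinity matrix."""
    vals, vecs = np.linalg.eigh(W)            # symmetric W: real eigenpairs
    order = np.argsort(vals)[::-1]            # largest eigenvalue first
    unassigned = np.ones(len(W), dtype=bool)
    labels = -np.ones(len(W), dtype=int)
    for cluster_id, idx in enumerate(order):
        if not unassigned.any():              # all elements accounted for: end
            break
        v = np.abs(vecs[:, idx])              # eigenvector of the next eigenvalue
        v[~unassigned] = 0.0                  # zero already-clustered components
        members = v > threshold               # threshold the remaining components
        labels[members] = cluster_id
        unassigned &= ~members
    return labels
```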

Graph Cut: Using Eigenvalues. Effect of the scale: (figure: the leading eigenvector for small, medium, and large σ.)

Graph Cut. A cut is a set of edges whose removal makes the graph disconnected. The cost of a cut is the sum of the weights of the cut edges: cut(A, B) = Σ_{p∈A, q∈B} w_pq. A graph cut gives us a segmentation. What is a good graph cut, and how do we find one? Slide Credit: Steve Seitz

Graph Cut. Here, the cut is nicely defined by the block-diagonal structure of the affinity matrix. How can this be generalized? Image Source: Forsyth & Ponce

Minimum Cut. We can do segmentation by finding the minimum cut in a graph: a minimum cut is a cut whose cutset has the smallest number of elements (unweighted case) or the smallest sum of affinities. Efficient algorithms exist for finding it (max-flow). Drawback: the weight of a cut is proportional to the number of edges in the cut, so the minimum cut tends to cut off very small, isolated components. (Figure: an ideal cut vs. cuts with less weight than the ideal cut.) Slide Credit: Khurram Hassan-Shafique

Normalized Cut (NCut). A minimum cut penalizes large segments. This can be fixed by normalizing by the size of the segments. The normalized cut cost is
Ncut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V),
where assoc(A, V) = Σ_{p∈A, q∈V} w_pq is the sum of the weights of all edges in V that touch A. The exact solution is NP-hard, but an approximation can be computed by solving a generalized eigenvalue problem. J. Shi and J. Malik, Normalized cuts and image segmentation, PAMI 2000.

Interpretation as a Dynamical System. Treat the links as springs and shake the system: elasticity is proportional to cost, and the vibration modes correspond to segments. We can compute these by solving a generalized eigenvector problem. Slide Credit: Steve Seitz

NCuts as a Generalized Eigenvalue Problem
Definitions: W is the affinity matrix, W(i, j) = w_ij; D is the diagonal matrix with D(i, i) = Σ_j W(i, j); x is a vector in {1, −1}^N with x(i) = 1 iff i ∈ A.
Rewriting the Normalized Cut in matrix form:
4 Ncut(A, B) = (1 + x)ᵀ(D − W)(1 + x) / (k 1ᵀD1) + (1 − x)ᵀ(D − W)(1 − x) / ((1 − k) 1ᵀD1),
where k = Σ_{x_i > 0} D(i, i) / Σ_i D(i, i).
Slide Credit: Jitendra Malik

Some more math... (the full algebraic derivation; see Shi & Malik). Slide Credit: Jitendra Malik

NCuts as a Generalized Eigenvalue Problem
After simplification, we get
Ncut(A, B) = yᵀ(D − W)y / (yᵀDy), with y_i ∈ {1, −b} and yᵀD1 = 0.
This is a Rayleigh quotient. The solution is given by the generalized eigenvalue problem (D − W)y = λDy.
Subtleties: the optimal solution is the second smallest eigenvector; it gives a continuous result, which must be converted into discrete values of y. The problem is hard in its discrete form, so we use the continuous approximation.
Slide Credit: Jitendra Malik

NCuts Example. (Figure: the smallest eigenvectors and the resulting NCuts segments.) Image Source: Shi & Malik

NCuts Example: Discretization. Problem: the eigenvectors take on continuous values. How do we choose the splitting point to binarize the image? (Figure: image, eigenvector, NCut scores.) Possible procedures:
a) Pick a constant value (0, or 0.5).
b) Pick the median value as the splitting point.
c) Look for the splitting point that has the minimum NCut value: choose n possible splitting points, compute the NCut value for each, and pick the minimum.

NCuts: Overall Procedure
1. Construct a weighted graph G = (V, E) from an image.
2. Connect each pair of pixels, and assign the graph edge weights: W(i, j) = probability that i and j belong to the same region.
3. Solve (D − W)y = λDy for the smallest few eigenvectors. This yields a continuous solution.
4. Threshold the eigenvectors to get a discrete cut. This is where the approximation is made (we are not solving the NP-hard problem).
5. Recursively subdivide if the NCut value is below a pre-specified threshold.
NCuts Matlab code is available at http://www.cis.upenn.edu/~jshi/software/
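A compact sketch of steps 1-4 for a small grayscale image, using SciPy's generalized symmetric eigensolver. The affinity (an intensity Gaussian times a distance Gaussian with a cutoff radius) follows the Shi-Malik recipe; the recursive subdivision of step 5 is omitted, and parameter values are illustrative. Note the dense W makes this O(N²) in memory, so it only suits small images:

```python
import numpy as np
from scipy.linalg import eigh

def ncut_bipartition(img, sigma_i=0.1, sigma_x=4.0, r=5):
    """Two-way NCut on a small grayscale image (H, W), values in [0, 1]."""
    H, Wd = img.shape
    yx = np.stack(np.meshgrid(np.arange(H), np.arange(Wd), indexing="ij"),
                  axis=-1).reshape(-1, 2).astype(float)
    z = img.reshape(-1)
    # Step 2: affinities between every pair of pixels, radius-limited.
    d2_x = ((yx[:, None, :] - yx[None, :, :]) ** 2).sum(-1)
    d2_i = (z[:, None] - z[None, :]) ** 2
    W = np.exp(-d2_i / (2 * sigma_i**2)) * np.exp(-d2_x / (2 * sigma_x**2))
    W[d2_x > r**2] = 0.0
    D = np.diag(W.sum(axis=1))
    # Step 3: solve (D - W) y = lambda D y; take the second smallest eigenvector.
    vals, vecs = eigh(D - W, D)
    y = vecs[:, 1]
    # Step 4: threshold the continuous solution (here: at its median).
    return (y > np.median(y)).reshape(H, Wd)
```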

NCuts Results. Color image segmentation with NCuts. Image Source: Shi & Malik

Using Texture Features for Segmentation. The texture descriptor is a vector of filter bank outputs. J. Malik, S. Belongie, T. Leung and J. Shi, Contour and Texture Analysis for Image Segmentation, IJCV 43(1), 7-27, 2001.

Using Texture Features for Segmentation. The texture descriptor is a vector of filter bank outputs; textons are found by clustering (a bag-of-words representation). Slide Credit: Svetlana Lazebnik

Using Texture Features for Segmentation. The texture descriptor is a vector of filter bank outputs; textons are found by clustering. Affinities are given by the similarities of texton histograms over windows given by the local scale of the texture. Slide Credit: Svetlana Lazebnik

Results with Color and Texture

Summary: Normalized Cuts
Pros:
- Generic framework, flexible to the choice of function that computes weights ("affinities") between nodes.
- Does not require any model of the data distribution.
Cons:
- Time and memory complexity can be high: dense, highly connected graphs require many affinity computations, and an eigenvalue problem must be solved for each cut.
- Preference for balanced partitions: if a region is uniform, NCuts will find the modes of vibration of the image dimensions.
Slide Credit: Kristen Grauman

What will we learn today?
- Graph-theoretic segmentation: Normalized Cuts; using texture features.
- Segmentation as energy minimization: Markov Random Fields (MRF) / Conditional Random Fields (CRF); graph cuts for image segmentation; applications.

Markov Random Fields. MRFs allow rich probabilistic models for images, but built in a local, modular way: learn/model local effects, and get global effects out. (Figure: observed evidence, hidden true states, and neighborhood relations.) Slide Credit: William Freeman

MRF Nodes as Pixels. (Figure: original image, degraded image, and the reconstruction from an MRF modeling pixel neighborhood statistics.) Slide Credit: Bastian Leibe

MRF Nodes as Patches. Image patches y_i are linked to scene patches x_i by compatibility functions Φ(x_i, y_i); neighboring scene patches are linked by Ψ(x_i, x_j). Slide Credit: William Freeman

Network Joint Probability
P(x, y) = ∏_i Φ(x_i, y_i) ∏_{i,j} Ψ(x_i, x_j),
where x is the scene, y is the image, Φ is the image-scene compatibility function (local observations), and Ψ is the scene-scene compatibility function (neighboring scene nodes). Slide Credit: William Freeman

Energy Formulation. Joint probability: P(x, y) = (1/Z) ∏_i Φ(x_i, y_i) ∏_{ij} Ψ(x_i, x_j). Taking the negative log turns this into an energy minimization problem: E(x, y) = Σ_i ϕ(x_i, y_i) + Σ_{ij} ψ(x_i, x_j). This is similar to free-energy problems in statistical mechanics (spin glass theory); we therefore draw the analogy and call E an energy function. ϕ and ψ are called potentials.

Energy Formulation
Energy function: E(x, y) = Σ_i ϕ(x_i, y_i) [unary term] + Σ_{ij} ψ(x_i, x_j) [pairwise term]
Unary potential ϕ: encodes local information about the given pixel/patch. How likely is a pixel/patch to belong to a certain class (e.g. foreground/background)?
Pairwise potential ψ: encodes neighborhood information. How different is a pixel/patch's label from that of its neighbor? (e.g. based on intensity/color/texture difference, edges)
Slide Credit: Bastian Leibe

Segmentation using MRFs/CRFs
Boykov and Jolly (2001): E(x, y) = Σ_i ϕ(x_i, y_i) + Σ_{ij} ψ(x_i, x_j)
Variables: x_i: binary variable (foreground/background); y_i: annotation (foreground/background/empty).
Unary term: ϕ(x_i, y_i) = K [x_i ≠ y_i] — pay a penalty for disregarding the annotation.
Pairwise term: ψ(x_i, x_j) = [x_i ≠ x_j] w_ij — encourage smooth annotations; w_ij is the affinity between pixels i and j.
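To fix the notation, a literal numpy rendering of this energy on a 4-connected grid — evaluation only, not minimization; encoding the "empty" annotation as −1 is my convention:

```python
import numpy as np

def boykov_jolly_energy(x, y, w_right, w_down, K=10.0):
    """E(x, y) = sum_i K*[x_i != y_i, y_i labeled] + sum_ij [x_i != x_j] * w_ij.

    x: (H, W) binary labeling; y: (H, W) annotation with -1 = empty;
    w_right/w_down: affinities to each pixel's right/lower 4-neighbor,
    shaped (H, W-1) and (H-1, W).
    """
    unary = K * np.sum((y >= 0) & (x != y))           # penalty for ignoring seeds
    pair = np.sum((x[:, :-1] != x[:, 1:]) * w_right)  # horizontal discontinuities
    pair += np.sum((x[:-1, :] != x[1:, :]) * w_down)  # vertical discontinuities
    return unary + pair
```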

Efficient solutions
Grid-structured random fields: efficient solution using maxflow/mincut; optimal solution for binary labeling. Boykov & Kolmogorov, An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Vision, PAMI 26(9): 1124-1137 (2004).
Fully connected models: efficient solution using convolution-based mean-field inference. Krähenbühl and Koltun, Efficient Inference in Fully-Connected CRFs with Gaussian Edge Potentials, NIPS 2011.

GrabCut: Interactive Foreground Extraction. Slide Credit: Carsten Rother

What GrabCut Does. Magic Wand (Adobe, 2002) selects regions; Intelligent Scissors (Mortensen and Barrett, 1995) traces boundaries; GrabCut uses both regions and boundary. (Figure: user input and result for each method.)

GrabCut
Energy function: E(x, k, θ; I) = Σ_i ϕ(x_i, k_i, θ; z_i) + Σ_{ij} ψ(x_i, x_j; z_i, z_j)
Variables: x_i ∈ {0, 1}: foreground/background label; k_i ∈ {1, ..., K}: Gaussian mixture component; θ: model parameters (GMM parameters); I = {z_1, ..., z_N}: the RGB image.
Unary term ϕ(x_i, k_i, θ; z_i): a Gaussian mixture model (negative log of a GMM component).
Pairwise term: ψ(x_i, x_j; z_i, z_j) = [x_i ≠ x_j] exp(−β ‖z_i − z_j‖²)

GrabCut

GrabCut - Unary term
Gaussian mixture model: P(z_i | x_i, θ) = Σ_k π(x_i, k) p(z_i | k, θ) — hard to optimize (log of a sum).
Tractable solution: assign each variable x_i a single mixture component k_i, so P(z_i | x_i, k_i, θ) = π(x_i, k_i) p(z_i | k_i, θ), and optimize over k_i.
Unary term:
ϕ(x_i, k_i, θ; z_i) = −log π(x_i, k_i) − log p(z_i | k_i, θ)
= −log π(x_i, k_i) + ½ log |Σ(k_i)| + ½ (z_i − µ(k_i))ᵀ Σ(k_i)⁻¹ (z_i − µ(k_i))

GrabCut - Unary term
ϕ(x_i, k_i, θ; z_i) = −log π(x_i, k_i) + ½ log |Σ(k_i)| + ½ (z_i − µ(k_i))ᵀ Σ(k_i)⁻¹ (z_i − µ(k_i))
Model parameters: θ = { π(x_i, k_i) (mixture weights), µ(k_i), Σ(k_i) (means and covariances) }
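The same unary term, evaluated with numpy; pi_k, mu, and cov stand in for π(x_i, k_i), µ(k_i), and Σ(k_i) (the names are mine):

```python
import numpy as np

def grabcut_unary(z, pi_k, mu, cov):
    """phi = -log pi + 0.5 log|Sigma| + 0.5 (z - mu)^T Sigma^{-1} (z - mu)."""
    diff = z - mu                                # z, mu: length-3 RGB vectors
    maha = diff @ np.linalg.solve(cov, diff)     # squared Mahalanobis distance
    return (-np.log(pi_k)
            + 0.5 * np.log(np.linalg.det(cov))
            + 0.5 * maha)
```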

GrabCut - Iterative optimization
1. Initialize the mixture models (foreground and background GMMs) from the user's initial annotation.
2. Assign GMM components: k_i = argmin_k ϕ(x_i, k, θ; z_i)
3. Learn the GMM parameters: θ = argmin_θ Σ_i ϕ(x_i, k_i, θ; z_i)
4. Estimate the segmentation using mincut: x = argmin_x E(x, k, θ; I)
5. Repeat from 2 until convergence.
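OpenCV ships an implementation of this loop; a typical usage sketch, with a placeholder image path and rectangle:

```python
import cv2
import numpy as np

img = cv2.imread("photo.jpg")                 # BGR image
mask = np.zeros(img.shape[:2], np.uint8)      # per-pixel GC_* labels
bgd_model = np.zeros((1, 65), np.float64)     # internal GMM state
fgd_model = np.zeros((1, 65), np.float64)
rect = (50, 50, 400, 300)                     # user bounding box: x, y, w, h

# Run 5 iterations of the assign/learn/mincut loop, seeded by the rectangle.
cv2.grabCut(img, mask, rect, bgd_model, fgd_model, 5, cv2.GC_INIT_WITH_RECT)

# Pixels marked (probably-)foreground form the extracted object.
fg = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 1, 0).astype(np.uint8)
result = img * fg[:, :, None]
```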

GrabCut - Iterative optimization. (Figure: result and energy after each of iterations 1-4.)

GrabCut - Further editing. (Figure: automatic segmentation results, refined by further user edits.)

GrabCut - More results. GrabCut completes the segmentation automatically.

GrabCut - Live demo. Included in MS Office 2010.

Improving Efficiency of Segmentation
Problem: images contain many pixels. Even with efficient graph cuts, an MRF formulation has too many nodes for interactive results.
Efficiency trick: superpixels. Group together similar-looking pixels for efficiency of further processing: a cheap, local oversegmentation. It is important to ensure that superpixels do not cross object boundaries. Several different approaches are possible.
Superpixel code is available at http://www.cs.sfu.ca/~mori/research/superpixels/
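One widely used superpixel method (not the one at the link above) is SLIC, available in scikit-image; a minimal usage sketch with illustrative parameters:

```python
from skimage import io
from skimage.segmentation import slic

img = io.imread("photo.jpg")
# Oversegment into ~300 roughly uniform, boundary-respecting regions.
labels = slic(img, n_segments=300, compactness=10.0)
# labels is an (H, W) array of superpixel ids; downstream MRF nodes
# can now be superpixels instead of individual pixels.
```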

Summary: Graph Cuts Segmentation
Pros:
- A powerful technique, based on a probabilistic model (MRF).
- Applicable to a wide range of problems.
- Very efficient algorithms are available for vision problems.
- Becoming a de-facto standard for many segmentation tasks.
Cons/Issues:
- Graph cuts can only solve a limited class of models: submodular energy functions, which capture only part of the expressiveness of MRFs.
- Only approximate algorithms are available for the multi-label case.