Image Segmentation Srikumar Ramalingam School of Computing University of Utah Slides borrowed from Ross Whitaker
Segmentation
Semantic Segmentation
Indoor layout estimation
What is Segmentation? Partitioning images/volumes into meaningful pieces Partitioning problem Labels Isolating a specific region of interest ( find the star or bluish thing ) Delineation problem Binary
Why? Detection/recognition Where is the vehicle? What type of vehicle is it? Quantifying object properties How big is the tumor? Is it expanding or shrinking? How much radiation do I need to treat this prostate? Statistical analyses of sets of biological volumes
What is The Best Way to Segment Images? Depends Kind of data: type of noise, signal, etc. What you are looking for: shape, size, variability Application specifics: how accurate, how many State of the art Specific data and shapes Train a template or model (variability) Deform to fit specific data General data and shapes So many methods So few good ones in practice: hand contouring
User Interaction Quick and easy general-purpose segmentation tool Time consuming 3D: slice-by-slice with cursor defining boundary User variation (esp. slice to slice) Tools available. E.g. Harvard SPL Slicer
Interactive segmentation One of the state-of-the-art image segmentation approaches: the user marks strokes on both the foreground and background regions. The technique used is graph cuts, where a max-flow/min-cut algorithm separates the set of pixels into two categories. How do we handle cases where we want more than two regions?
General Purpose Segmentation Strategies Region-based methods (connected) Regions are locally homogeneous (in some property) Regions satisfy some property (to within a tolerance) E.g. Flood fill Edge-based methods Regions are bounded by features Features represent sharp contrast in some property (locally maximal contrast) E.g. Canny Edges
Pixel Classification Simplest: Thresholding Pixels above threshold in class A, below class B Connected components on class label Extension of thresholding -> pattern recognition Image intensities not enough Define set of features at each pixel
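The thresholding and connected-components steps on this slide can be sketched in plain Python. This is only an illustrative sketch: the function names `threshold` and `connected_components`, the list-of-lists image layout, and the choice of 4-connectivity are assumptions, not something the slides specify.

```python
from collections import deque

def threshold(image, t):
    """Class A (1) for pixels above t, class B (0) otherwise."""
    return [[1 if v > t else 0 for v in row] for row in image]

def connected_components(mask):
    """Label 4-connected groups of 1-pixels; returns (labels, count)."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for r in range(h):
        for c in range(w):
            if mask[r][c] == 1 and labels[r][c] == 0:
                count += 1  # start a new component at this seed pixel
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] == 1 and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count

image = [[9, 9, 0, 0],
         [9, 0, 0, 8],
         [0, 0, 8, 8]]
labels, count = connected_components(threshold(image, 5))
```

Here the two bright blobs share a class label after thresholding, and the connected-components pass is what separates them into distinct regions.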
Options for Pixel Features Intensity Derivatives (at different scales) Also differential invariants (e.g. grad mag) Neighborhood statistics Mean, variance Neighborhood histogram Texture (e.g. band-pass filters) Multivariate data (vector-valued range) Color Spectral MRI
Spectral MRI Classification Features based on spin-lattice and spin-spin relaxation states Classification Tasdizen et al.
Pattern Recognition Relatively old idea (mid 20th century) Classify an instance based on multiple measurements (features) Statistical decision theory (min. prob. of error) For each set of measurements say which class and (maybe) prob.
Concept Feature Vector Set of measurements Position in feature space Domain Range (Feature Space)
Classification Typical approach: construct a function which tells you the extent to which x is in class i Two types of problems Supervised - classified examples are given Unsupervised - only raw data is given
Pattern Recognition What is the form of f()? Could be anything, but Linear Difference of Gaussians *homogeneous coordinates
Finding f() From Examples For each class use prototype Classify instance based on nearest prototype x Neural nets (e.g. perceptron) Learn set of parameters (e.g. Ws) through many examples Statistical Construct probability density functions from examples
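The nearest-prototype option can be illustrated with a short sketch. It assumes the prototype of each class is the mean of its training feature vectors and that distance is Euclidean; both are common choices but are assumptions here, as are the names `class_prototypes` and `classify`.

```python
def class_prototypes(samples):
    """samples: list of (feature_vector, label). Returns {label: mean vector}."""
    sums, counts = {}, {}
    for x, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(x)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], x)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}

def classify(x, prototypes):
    """Return the label whose prototype is closest to x (squared Euclidean)."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(prototypes, key=lambda label: sq_dist(x, prototypes[label]))

training = [((1, 1), "A"), ((2, 1), "A"), ((8, 9), "B"), ((9, 9), "B")]
prototypes = class_prototypes(training)
label = classify((2, 2), prototypes)
```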
Unsupervised Find natural structure in data E.g. clusters K-means alg. Start with k centers (random) Find set of points closest to each center Move center to mean of points Repeat until centers don't move
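The four k-means steps listed on this slide map directly onto a small loop. The sketch below assumes points are tuples of numbers, picks the initial centers by sampling k data points with a seeded generator, and keeps an old center if its cluster becomes empty; those details and the names `kmeans`, `sq_dist`, and `mean` are illustrative assumptions.

```python
import random

def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def mean(cluster):
    return tuple(sum(col) / len(cluster) for col in zip(*cluster))

def kmeans(points, k, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start with k (random) centers
    while True:
        # find the set of points closest to each center
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            clusters[nearest].append(p)
        # move each center to the mean of its points
        new_centers = [mean(cl) if cl else centers[c]
                       for c, cl in enumerate(clusters)]
        # repeat until the centers don't move
        if new_centers == centers:
            return centers, clusters
        centers = new_centers

centers, clusters = kmeans([(1,), (2,), (3,), (10,), (11,), (12,)], k=2)
```

For image segmentation the points would typically be per-pixel feature vectors (intensity, color, texture responses) rather than the 1D toy values used here.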
Hierarchical Grouping Methods Splitting, merging of regions Construct metric on region configurations M(i) Statistics of region (average intensity, etc). Are two regions similar enough to be merged? Should they be split? Small regions E.g. pixels Larger regions Higher cost Each node <-> region
Merging and Splitting
Simple Merging Algorithm Each pixel -> one region For each region, check merge with each neighbor Cost of merge $C(i,j) = M(i \cup j) - [M(i) + M(j)] + m_c$, where $m_c$ can be the cost for merging, and M refers to the cost associated with each region. Sort by cost (e.g. heap) and merge sequentially. Stop at number of regions or no more merges
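A minimal sketch of this merging loop on a 1D signal follows. Two things are assumed for illustration: the region cost M is taken to be the sum of squared deviations from the region mean, and the cheapest merge is found by a linear scan each round instead of the heap the slide suggests (a heap would matter for real image sizes). The names `sse` and `merge_regions` are hypothetical.

```python
def sse(values):
    """Assumed region cost M: sum of squared deviations from the mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def merge_regions(signal, target_regions, merge_cost=0.0):
    regions = [[v] for v in signal]  # each pixel -> one region
    while len(regions) > max(target_regions, 1):
        # C(i, j) = M(i U j) - [M(i) + M(j)] + mc for each adjacent pair
        costs = [sse(regions[i] + regions[i + 1])
                 - (sse(regions[i]) + sse(regions[i + 1])) + merge_cost
                 for i in range(len(regions) - 1)]
        i = min(range(len(costs)), key=costs.__getitem__)
        regions[i:i + 2] = [regions[i] + regions[i + 1]]  # merge cheapest pair
    return regions

regions = merge_regions([1, 1, 1, 5, 5, 9, 9, 9], target_regions=3)
```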
Normalized Cut (Shi and Malik '00) Treat image as graph Vertices -> pixels Edges -> neighbors Must define a neighborhood stencil (the neighbors to which a pixel is connected) Neighborhood connections Weighted Edges Image Graph
Minimum Cut Edge Weights Edge weights Pixel distance, edges (e.g. Gaussian fall off) Say how many regions you want Cut graph so that flow between regions is minimized (min cut) Min cut
Minimum Cut G is the graph modeling the image, with the set of pixels as vertices V and the set of edges E modeling the similarity between pairs of vertices. $G = (V, E)$, $A \cup B = V$, $A \cap B = \emptyset$; A and B are the two disjoint sets. $\mathrm{cut}(A, B) = \sum_{u \in A, v \in B} w(u, v)$ We can recursively use min-cut to partition the image into several regions. [Figure: image with neighborhood connections]
Limitations of min-cut Min-cut is biased toward cutting off small, isolated sets of pixels, because the cut cost grows with the number of edges crossing the boundary.
Normalized cuts $G = (V, E)$, $A \cup B = V$, $A \cap B = \emptyset$; A and B are the two disjoint sets. $\mathrm{NCUT}(A, B) = \frac{\mathrm{cut}(A, B)}{\mathrm{assoc}(A, V)} + \frac{\mathrm{cut}(A, B)}{\mathrm{assoc}(B, V)}$, where $\mathrm{assoc}(A, V) = \sum_{u \in A, t \in V} w(u, t)$. The idea is to minimize the disassociation between the groups and maximize the association within the groups.
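To make the difference between the plain cut and the normalized cut concrete, the sketch below evaluates both on a small hand-made graph. The graph (a strongly connected chain 0-1-2, a second group 3-4 joined by a bridge, and a weakly attached node 5) and the helper names are assumptions chosen for illustration only.

```python
def build_weights(n, edges):
    """Symmetric weight matrix from (u, v, w) triples."""
    W = [[0.0] * n for _ in range(n)]
    for u, v, w in edges:
        W[u][v] = W[v][u] = w
    return W

def cut(W, A, B):
    return sum(W[u][v] for u in A for v in B)

def assoc(W, A):
    return sum(W[u][t] for u in A for t in range(len(W)))

def ncut(W, A, B):
    c = cut(W, A, B)
    return c / assoc(W, A) + c / assoc(W, B)

W = build_weights(6, [(0, 1, 2.0), (1, 2, 2.0), (2, 3, 0.5),
                      (3, 4, 2.0), (4, 5, 0.4)])
balanced = ([0, 1, 2], [3, 4, 5])   # split at the bridge
isolated = ([0, 1, 2, 3, 4], [5])   # chop off the weakly attached node

# The plain cut prefers isolating node 5; the normalized cut does not.
print(cut(W, *isolated) < cut(W, *balanced))    # True
print(ncut(W, *balanced) < ncut(W, *isolated))  # True
```

The isolated split has a cheaper cut (0.4 versus 0.5), but its normalized cost is dominated by the term cut / assoc({5}, V) = 1, so the balanced split wins under NCUT.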
Normalized cuts D is an $N \times N$ diagonal matrix with d on its diagonal (off-diagonal elements are 0 in a diagonal matrix): $N = |V|$, $d(i, i) = \sum_j w(i, j)$. W is an $N \times N$ symmetric matrix with $W_{ij} = w(i, j)$. Relaxing the partition indicator y to real values, minimizing the normalized cut reduces to the following generalized eigenvalue problem: $(D - W) y = \lambda D y$
Normalized cuts The eigenvector corresponding to the second smallest eigenvalue gives the partition of the graph. The eigenvectors corresponding to the third smallest eigenvalue and so forth also contain useful partition information.
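As a rough illustration of how the second-smallest eigenvector yields a partition, the sketch below solves the generalized problem (D - W) y = λ D y on a tiny graph without any linear algebra library. It rewrites the problem in its symmetric form, D^{-1/2}(D - W)D^{-1/2} z = λ z with y = D^{-1/2} z, and runs power iteration on I + D^{-1/2} W D^{-1/2} while projecting out the trivial eigenvector D^{1/2} 1. Power iteration with deflation is a swapped-in technique chosen for brevity, not what a practical implementation would use (a sparse eigensolver is the usual choice); the function name, the fixed iteration count, and the test graph are assumptions.

```python
import math

def ncut_eigenvector(W, iters=500):
    n = len(W)
    d = [sum(row) for row in W]                 # diagonal of D
    s = [1.0 / math.sqrt(di) for di in d]       # D^{-1/2}
    # M = 2I - D^{-1/2}(D - W)D^{-1/2} = I + D^{-1/2} W D^{-1/2}
    M = [[(1.0 if i == j else 0.0) + s[i] * W[i][j] * s[j]
          for j in range(n)] for i in range(n)]
    z0 = [math.sqrt(di) for di in d]            # trivial eigenvector (lambda = 0)
    norm0 = math.sqrt(sum(v * v for v in z0))
    z0 = [v / norm0 for v in z0]
    z = [float(i + 1) for i in range(n)]        # arbitrary start vector
    for _ in range(iters):
        proj = sum(a * b for a, b in zip(z, z0))
        z = [a - proj * b for a, b in zip(z, z0)]   # deflate trivial solution
        z = [sum(M[i][j] * z[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in z))
        z = [v / norm for v in z]
    return [s[i] * z[i] for i in range(n)]      # y = D^{-1/2} z

# Two triangles {0,1,2} and {3,4,5} joined by one weak edge (2, 3).
edges = [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
         (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0), (2, 3, 0.1)]
W = [[0.0] * 6 for _ in range(6)]
for u, v, w in edges:
    W[u][v] = W[v][u] = w

y = ncut_eigenvector(W)
labels = [1 if v > 0 else 0 for v in y]  # threshold the eigenvector at zero
```

Thresholding at zero is the simplest way to turn the real-valued eigenvector into two groups; other splitting points can be tried and scored with the NCUT value.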
Eigenvectors [Figure panels: original image; eigenvector for the 2nd smallest eigenvalue; eigenvector for the 9th smallest eigenvalue]
Normalized Cuts NxN matrix (N number of pixels) Eigenvalues/eigenvectors describe the cuts Computationally expensive, but Matrix is sparse because of neighborhood structure I.e. most connections are zero Run recursively to get more regions
Watershed segmentation An image can be considered as a topographic surface. [Source : http://cmm.ensmp.fr/~beucher/wtshed.html ]
Watershed segmentation We flood this surface from its minima. We prevent the merging of the waters coming from different sources. We partition the image into two different sets: the catchment basins and the watershed lines.
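A toy 1D illustration of catchment basins is sketched below. Note the swap: instead of flooding from the minima, it uses the simpler "rainfall" view in which every sample slides downhill to a local minimum and is labeled by that minimum. On a signal without plateaus this gives the same basins, but it assigns every sample to a basin and does not produce explicit watershed-line pixels. The function name and the example signal are assumptions.

```python
def watershed_1d(f):
    """Label each sample with the index of the local minimum it drains to."""
    n = len(f)

    def descend(i):
        while True:
            best = i
            for j in (i - 1, i + 1):            # look at both neighbors
                if 0 <= j < n and f[j] < f[best]:
                    best = j                     # steepest strictly lower neighbor
            if best == i:
                return i                         # reached a local minimum
            i = best

    return [descend(i) for i in range(n)]

# Boundary function with two minima (indices 1 and 5) separated by a peak.
basins = watershed_1d([3, 1, 2, 5, 4, 0, 6])
```

In this example the basin boundary falls between indices 3 and 4, next to the peak of the boundary function.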
Watershed Segmentation [Figure labels: boundary function (e.g. grad. mag.), catchment basin, watershed regions] Generalizes to any dimension or boundary measure. Each vertical line marks the partition between different regions.
Watershed Segmentation Boundary Function Threshold Watershed Regions Hierarchy
Watershed height Watershed Saliency Height of water before flooding neighbor Used as cost of merge to build hierarchy
Watershed Segmentation Properties General Non-local regions can leak Boundary based Poor in low-contrast data Sensitive to noise Low level (pixel based) Lack of shape model Preprocessing Necessary for reliable boundary measure
Watershed Segmentation Level 1
Watershed Segmentation Level 2
Foreground / Background Estimation using Graph Cuts [Graph cuts slides adapted from Lubor Ladicky] Rother et al. SIGGRAPH04
Foreground / Background Estimation Energy = data term + smoothness term (neighborhood terms). Data term: estimated using FG / BG colour models. Smoothness term: intensity-dependent smoothness, where the delta function is 1 when the condition is satisfied and 0 otherwise, and the parameters are manually set or learned from data.
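The equations on this slide did not survive extraction. One common form that is consistent with the surviving labels (data term from colour models, intensity-dependent smoothness with a delta function and tunable parameters) is sketched below. The symbols $\psi_i$, $\psi_{ij}$, $\mathcal{N}$, $I_i$, and $\theta_0, \theta_1, \theta_2$ are assumptions introduced here, not notation recovered from the slide.

```latex
E(\mathbf{x}) = \sum_{i \in V} \psi_i(x_i) + \sum_{(i,j) \in \mathcal{N}} \psi_{ij}(x_i, x_j)

% data term: negative log-likelihood under the FG / BG colour models
\psi_i(x_i) = -\log p(I_i \mid x_i), \qquad x_i \in \{\mathrm{FG}, \mathrm{BG}\}

% smoothness term: penalty only where neighboring labels differ,
% reduced across strong intensity edges
\psi_{ij}(x_i, x_j) = \left( \theta_0 + \theta_1 \exp\!\left(-\theta_2 \, \lVert I_i - I_j \rVert^2 \right) \right) \delta(x_i \neq x_j)
```

Under this form, neighboring pixels with similar intensity pay a large penalty for taking different labels, while pixels across a strong intensity edge pay a small one.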
Foreground / Background Estimation Data term Smoothness term How to solve this optimization problem? Transform into min-cut / max-flow problem Solve it using min-cut / max-flow algorithm
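The second step, solving min-cut via max-flow, can be sketched with the Edmonds-Karp algorithm (shortest augmenting paths found by breadth-first search). The graph below is a small made-up example, not the one drawn on the following slides, and the function names and dict-of-dicts capacity format are assumptions. Practical graph-cut segmentation uses specialized max-flow solvers; this only shows the principle.

```python
from collections import deque

def bfs(residual, source):
    """Return parent pointers for all nodes reachable in the residual graph."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v, cap in residual[u].items():
            if cap > 0 and v not in parent:
                parent[v] = u
                queue.append(v)
    return parent

def min_cut(capacity, source, sink):
    """Return (max-flow value, set of nodes on the source side of a min cut)."""
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)  # reverse edges
    flow = 0
    while True:
        parent = bfs(residual, source)
        if sink not in parent:
            return flow, set(parent)  # nodes still reachable form the source set
        # bottleneck capacity along the augmenting path
        bottleneck, v = float("inf"), sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # push flow along the path and update residual capacities
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck

capacity = {"s": {"a": 4, "b": 2}, "a": {"b": 1, "t": 2}, "b": {"t": 4}}
value, source_set = min_cut(capacity, "s", "t")
```

In the segmentation setting, edges from the source and to the sink carry the data term, edges between pixel nodes carry the smoothness term, and the source set and sink set of the minimum cut give the two labels.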
Min-Cut Problem [Figure: directed graph from source to sink with edge costs; a cut splits the nodes into a source set S and a sink set T. Example cuts are shown with cost = 18, cost = 25, cost = 23, and cost = 15.]
Min-Cut Problem [Figure: the same graph with a binary variable $x_1, \dots, x_6$ on each node. After the substitution of the constraints, the cost is written in terms of source edges, sink edges, and edges between variables, where $x_i = 0$ means $x_i \in S$ and $x_i = 1$ means $x_i \in T$.]
Min-Cut Problem $C(x) = 5x_1 + 9x_2 + 4x_3 + 3x_3(1-x_1) + 2x_1(1-x_3) + 2x_2(1-x_3) + 5x_3(1-x_2) + 2x_4(1-x_1) + 1x_5(1-x_1) + 6x_5(1-x_3) + 5x_6(1-x_3) + 1x_3(1-x_6) + 3x_6(1-x_2) + 2x_5(1-x_4) + 3x_5(1-x_6) + 6(1-x_4) + 8(1-x_5) + 5(1-x_6)$ [Figure: source/sink graph with nodes $x_1, \dots, x_6$ and edge costs]
Foreground / Background Estimation source sink Rother et al. SIGGRAPH04
Object-class Segmentation Energy = data term + smoothness term. Data term: discriminatively trained classifier.
Object-class Segmentation [Figure sequence: original image; initial solution (grass); building expansion; sky expansion; tree expansion; final solution with labels sky, building, tree, aeroplane, grass]
Applications Graph cuts is a powerful global optimization technique (no need for any initialization) that has been used in many applications such as segmentation, stereo, human body pose estimation, etc.