CS 4495 Computer Vision Aaron Bobick (slides by Tucker Hermans) School of Interactive Computing
Administrivia PS 4: Out, but I was a bit late, so the due date is pushed back to Oct 29. OpenCV now has real SIFT again (the nonfree packages). If using Python and OpenCV, you should be able to use those calls. We're still investigating SIFT for Python for those *not* using OpenCV.
Why segmentation?
Segmentation of Coherent Regions Berkeley segmentation database: http://www.eecs.berkeley.edu/research/projects/cs/vision/grouping/segbench/ Slide by Svetlana Lazebnik
Grouping of Similar Neighbors X. Ren and J. Malik. Learning a classification model for segmentation. ICCV 2003. Slide by Svetlana Lazebnik
Figure Ground Separate the foreground object (figure) from the background (ground) D. Tsai, M. Flagg, and J. M. Rehg. Motion Coherent Tracking with Multi-label MRF optimization. BMVC 2010.
Extensions Beyond Single Images J. Strom, A. Richardson, E. Olson. Graph-based Segmentation for Colored 3D Laser Point Clouds. IROS 2010. M. Grundmann, V. Kwatra, M. Han, I. Essa. Efficient Hierarchical Graph-Based Video Segmentation. CVPR 2010.
Image segmentation: toy example [Figure: input image and its histogram (pixel count vs. intensity) with three peaks, labeled 1 to 3: black pixels, gray pixels, white pixels] These intensities define the three groups. We could label every pixel in the image according to which of these primary intensities it is, i.e., segment the image based on the intensity feature. What if the image isn't quite so simple? Kristen Grauman
Noisy Images [Figure: two input images, each with its histogram (pixel count vs. intensity)] Kristen Grauman
Noisy Images [Figure: input image and its histogram (pixel count vs. intensity)] Now how do we determine the three main intensities that define our groups? We need to cluster. Kristen Grauman
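Clustering [Figure: intensity axis marked at 0, 190, and 255, with the three groups labeled 1, 2, 3] Goal: choose three centers as the representative intensities, and label every pixel according to which of these centers it is nearest to. Best cluster centers are those that minimize the SSD between all points and their nearest cluster center c_i: $\sum_{\text{clusters } i} \; \sum_{p \in \text{cluster } i} \lVert p - c_i \rVert^2$ Kristen Grauman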
Clustering With this objective, it is a chicken and egg problem:
Q: How do we determine which points to associate with each cluster center c_i? A: For each point p, choose the closest c_i.
Q: If we knew the group memberships, how do we get the centers? A: Choose c_i to be the mean of all points in the cluster.
Kristen Grauman
K-means clustering: Algorithm
1. Randomly initialize the cluster centers c_1, ..., c_K.
2. Given cluster centers, determine the points in each cluster: for each point p, find the closest c_i and put p into cluster i.
3. Given the points in each cluster, solve for c_i: set c_i to be the mean of the points in cluster i.
4. If any c_i has changed, repeat from Step 2.
Text by Steve Seitz
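The four steps can be sketched in NumPy as follows. This is a minimal illustration, not the course's reference code: the function name `kmeans`, the iteration cap `n_iters`, the `seed` argument, and the choice to initialize from randomly chosen data points are all assumptions made for the example.

```python
import numpy as np

def kmeans(points, k, n_iters=100, seed=0):
    """Cluster an (N, D) array into k groups; returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    # Step 1: initialize centers with k distinct, randomly chosen data points
    # (an assumption; any random initialization fits the algorithm).
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iters):
        # Step 2: assign every point to its nearest center (squared distance).
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: move each center to the mean of the points assigned to it.
        new_centers = centers.copy()
        for i in range(k):
            members = points[labels == i]
            if len(members) > 0:  # an empty cluster keeps its old center
                new_centers[i] = members.mean(axis=0)
        # Step 4: stop once the centers no longer change.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels

# Usage: cluster pixel intensities (a 1-d feature) into three groups.
intensities = np.array([0, 3, 5, 185, 190, 195, 250, 253, 255], dtype=float)
centers, labels = kmeans(intensities.reshape(-1, 1), k=3)
```

Because the result depends on the random initialization, different seeds can end in different local minima of the SSD objective.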
Andrew Moore
Segmentation as clustering Depending on what we choose as the feature space, we can group pixels in different ways. Grouping pixels based on intensity similarity. Feature space: intensity value (1-d) Source: K. Grauman
Number of Clusters [Figure: results for K=2 and K=3, showing the quantization of the feature space and the segmentation label map]
Segmentation as clustering Depending on what we choose as the feature space, we can group pixels in different ways. Grouping pixels based on color similarity. Feature space: color value (3-d) [Figure: pixels plotted on R, G, B axes, e.g. (R=255, G=200, B=250), (R=245, G=220, B=248), (R=15, G=189, B=2), (R=3, G=12, B=2)] Kristen Grauman
Segmentation as clustering K-means clustering based on intensity or color is essentially vector quantization of the image attributes [Figure: image, intensity-based clusters, color-based clusters] Slide by Svetlana Lazebnik
Segmentation as clustering Depending on what we choose as the feature space, we can group pixels in different ways. Grouping pixels based on intensity+position similarity [Figure: feature space with axes Intensity, X, Y] Both regions are black, but if we also include position (x,y), then we could group the two into distinct segments; this is a way to encode both similarity and proximity. Kristen Grauman
Segmentation as clustering Source: K. Grauman
Segmentation as clustering Clustering based on (r,g,b,x,y) values enforces more spatial coherence Slide by Svetlana Lazebnik
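One way to build the (r,g,b,x,y) feature space is to stack each pixel's color with its coordinates, as in the sketch below. The function name `color_position_features` and the `position_weight` scale factor are hypothetical; the slides do not say how color and position should be balanced against each other.

```python
import numpy as np

def color_position_features(image, position_weight=1.0):
    """Return an (H*W, 5) array of (r, g, b, x, y) rows for an H x W x 3 image."""
    h, w, _ = image.shape
    ys, xs = np.mgrid[0:h, 0:w]              # row (y) and column (x) of every pixel
    colors = image.reshape(-1, 3).astype(float)
    # position_weight is an assumed knob: larger values favor proximity over
    # color similarity when distances are computed in this feature space.
    positions = np.stack([xs.ravel(), ys.ravel()], axis=1) * position_weight
    return np.hstack([colors, positions])
```

The resulting rows can be passed to any clustering routine; setting `position_weight` to 0 reduces the grouping to color alone.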
K-Means for segmentation
Pros: very simple method; converges to a local minimum of the error function.
Cons: memory-intensive; need to pick K; sensitive to initialization; sensitive to outliers; only finds spherical clusters.
Slide by Svetlana Lazebnik
Segmentation as clustering Color, brightness, position alone are not enough to distinguish all regions
Segmentation as clustering Depending on what we choose as the feature space, we can group pixels in different ways. Grouping pixels based on texture similarity [Figure: filter bank of 24 filters, F_1, F_2, ..., F_24] Feature space: filter bank responses (e.g., 24-d) Kristen Grauman
Aside: Texture representation example
Statistics to summarize patterns in small windows:
            mean d/dx value    mean d/dy value
Win. #1            4                 10
Win. #2           18                  7
Win. #9           20                 20
[Figure: windows plotted with Dimension 1 (mean d/dx value) against Dimension 2 (mean d/dy value); groups: windows with primarily horizontal edges, windows with primarily vertical edges, windows with both, and windows with small gradient in both directions]
Kristen Grauman
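A rough sketch of how such per-window statistics could be computed is given below. It assumes non-overlapping square windows, finite-difference derivatives from `np.gradient`, and the mean of the absolute derivative as the summary value; the slide does not specify any of these, and the function name `window_gradient_features` is hypothetical.

```python
import numpy as np

def window_gradient_features(image, win=8):
    """Per-window (mean |d/dx|, mean |d/dy|) for a 2-d grayscale image."""
    image = np.asarray(image, dtype=float)
    # np.gradient returns derivatives along axis 0 (rows, y) then axis 1 (columns, x).
    dy, dx = np.gradient(image)
    h, w = image.shape
    feats = []
    for r in range(0, h - win + 1, win):
        for c in range(0, w - win + 1, win):
            feats.append((np.abs(dx[r:r + win, c:c + win]).mean(),
                          np.abs(dy[r:r + win, c:c + win]).mean()))
    return np.array(feats)  # one 2-d feature point per window
```

Windows dominated by vertical edges get a large first coordinate, windows dominated by horizontal edges a large second coordinate, and flat windows stay near the origin.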
Aside: Texture features Find textons by clustering vectors of filter bank outputs. Describe texture in a window based on its texton histogram. [Figure: image, its texton map, and per-window histograms of count vs. texton index] Malik, Belongie, Leung and Shi. IJCV 2001. Adapted from Lana Lazebnik
Image segmentation example Kristen Grauman
Make it better K-means is heavily sensitive to initial conditions, and we (typically) need to know K in advance. Suppose we assume that there are a few modes in the image and that all the pixels come from these modes. If you could find the modes, you might be able to segment the image.
Mean shift algorithm The mean shift algorithm seeks modes, or local maxima of density, in the feature space [Figure: image and its feature space (L*u*v* color values)]
A digression about color Color is an inherently perceptual phenomenon. It is related to, but not the same as, the wavelength of light energy. In fact, only some colors are found in the spectrum.
Colors perceivable by the human eye CIE xy chromaticity diagram, 1931
CIE XYZ color space (1931) A space with desired properties: easy to compute (linear transform of CIE RGB). Y: perceived luminance. X, Z: perceived color. Represents a wide range of colors.
Colors perceivable by the human eye $x = \frac{X}{X + Y + Z}$, $y = \frac{Y}{X + Y + Z}$ CIE xy chromaticity diagram, 1931
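The chromaticity coordinates are the standard CIE normalization x = X / (X + Y + Z), y = Y / (X + Y + Z), which discards overall brightness. A small sketch follows; the function name `xyz_to_xy` is chosen for this example only.

```python
def xyz_to_xy(X, Y, Z):
    """Project CIE XYZ tristimulus values onto the xy chromaticity plane."""
    total = X + Y + Z
    if total == 0:
        raise ValueError("chromaticity is undefined for X + Y + Z = 0")
    return X / total, Y / total

# Scaling all three values changes brightness but not chromaticity.
x1, y1 = xyz_to_xy(0.2, 0.3, 0.5)
x2, y2 = xyz_to_xy(0.4, 0.6, 1.0)
```

Both calls in the usage lines describe the same point on the chromaticity diagram, which is why the diagram can be drawn in two dimensions.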
CIE L*a*b* color space L = 25% L = 50% L = 75%
Cylindrical view Think of chroma (here a*, b*) defining a planar disc at each luminance level (L)
HSL and HSV color spaces
But there are lots of color spaces
The one we know best RGB color space
My favorite
Like a squared double cone?
Mean shift algorithm The mean shift algorithm seeks modes, or local maxima of density, in the feature space [Figure: image and its feature space (L*u*v* color values)]
Mean shift [Figure: search window, its center of mass, and the mean shift vector from the window center to the center of mass] Slide by Y. Ukrainitz & B. Sarel
Mean shift [Figure: search window and its center of mass] Slide by Y. Ukrainitz & B. Sarel
Mean shift clustering Cluster: all data points in the attraction basin of a mode. Attraction basin: the region for which all trajectories lead to the same mode. Slide by Y. Ukrainitz & B. Sarel
Mean shift clustering/segmentation
1. Find features (color, gradients, texture, etc.).
2. Initialize windows at individual feature points (pixels).
3. Perform mean shift for each window (pixel) until convergence.
4. Merge windows (pixels) that end up near the same peak or mode.
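The procedure can be sketched with a flat (uniform) search window as below. This is an illustrative, unoptimized version: the flat kernel, the stopping tolerance `tol`, the iteration cap, and the rule of merging windows that end within half a bandwidth of each other are assumptions made for the example, not details given on the slides.

```python
import numpy as np

def mean_shift(points, bandwidth, n_iters=50, tol=1e-3):
    """Flat-kernel mean shift on an (N, D) array; returns (modes, labels)."""
    points = np.asarray(points, dtype=float)
    shifted = points.copy()              # one window per feature point
    for _ in range(n_iters):
        max_move = 0.0
        for i in range(len(shifted)):
            x = shifted[i].copy()
            # Search window: all original points within `bandwidth` of x.
            in_window = np.linalg.norm(points - x, axis=1) <= bandwidth
            center_of_mass = points[in_window].mean(axis=0)
            # The mean shift vector is center_of_mass - x; move the window by it.
            max_move = max(max_move, np.linalg.norm(center_of_mass - x))
            shifted[i] = center_of_mass
        if max_move < tol:               # every window has converged
            break
    # Merge windows that ended up near the same mode (assumed merge radius).
    modes = []
    labels = np.empty(len(points), dtype=int)
    for i, x in enumerate(shifted):
        for j, m in enumerate(modes):
            if np.linalg.norm(x - m) < bandwidth / 2:
                labels[i] = j
                break
        else:
            modes.append(x.copy())
            labels[i] = len(modes) - 1
    return np.array(modes), labels
```

Note that the number of clusters is not an input: it falls out of the window size (`bandwidth`), which is the one parameter that has to be chosen.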
Mean shift segmentation results http://www.caip.rutgers.edu/~comanici/mspami/mspamiresults.html
Mean shift segmentation results
Mean shift
Pros: does not assume a shape for the clusters; one parameter choice (window size); generic technique; finds multiple modes.
Cons: selection of window size; does not scale well with dimension of feature space.
Kristen Grauman
Images as graphs [Figure: pixels p and q joined by a link with weight w_pq] Fully-connected graph: a node (vertex) for every pixel; a link between every pair of pixels p, q; an affinity weight w_pq for each link (edge). w_pq measures similarity; similarity is inversely proportional to difference (in color and position). Source: Steve Seitz
Measuring affinity One possibility: $\text{aff}(x, y) = \exp\left(-\frac{1}{2\sigma^2} \lVert x - y \rVert^2\right)$ Small sigma: group only nearby points. Large sigma: group distant points. Kristen Grauman
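A common choice consistent with the role of sigma described here is a Gaussian of the feature distance, w_pq = exp(-||f_p - f_q||^2 / (2 sigma^2)). The sketch below assumes that form; the function name `gaussian_affinity` is hypothetical, and the dense N x N matrix it builds is only practical for small inputs.

```python
import numpy as np

def gaussian_affinity(features, sigma):
    """Dense affinity matrix W for an (N, D) array of per-pixel features."""
    features = np.asarray(features, dtype=float)
    diffs = features[:, None, :] - features[None, :, :]
    sq_dists = (diffs ** 2).sum(axis=2)       # squared distance for every pair
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

# Usage: the same pair of points is weakly linked for small sigma
# and strongly linked for large sigma.
pair = np.array([[0.0, 0.0], [3.0, 4.0]])     # distance 5 apart
w_small = gaussian_affinity(pair, sigma=1.0)[0, 1]
w_large = gaussian_affinity(pair, sigma=10.0)[0, 1]
```

With a small sigma the affinity falls off quickly, so only nearby points are effectively connected; with a large sigma distant points keep substantial weight.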
Segmentation by graph partitioning [Figure: graph with nodes i and j joined by a link of weight w_ij, partitioned into segments A, B, C] Break graph into segments: delete links that cross between segments. Easiest to break links that have low affinity: similar pixels should be in the same segments; dissimilar pixels should be in different segments. Source: S. Seitz
Graph cut [Figure: graph split into parts A and B] Set of edges whose removal makes a graph disconnected. Cost of a cut: sum of weights of cut edges, $cut(A, B) = \sum_{p \in A,\, q \in B} w_{p,q}$ A graph cut gives us a segmentation. What is a good graph cut and how do we find one? Source: S. Seitz
Cuts in a graph: Min cut [Figure: graph split into parts A and B] $cut(A, B) = \sum_{p \in A,\, q \in B} w_{p,q}$ Find the minimum cut: it gives you a segmentation; fast algorithms exist for doing this (we may see this). Source: Steve Seitz
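The cost of a cut, the sum of the weights w_pq over all p in A and q in B, is straightforward to evaluate for a given partition. The sketch below assumes the graph is stored as a symmetric affinity matrix and the partition as a boolean mask; the names `cut_cost`, `W`, and `in_A` are chosen for this example.

```python
import numpy as np

def cut_cost(W, in_A):
    """Sum of weights of the edges that cross between A and its complement B."""
    W = np.asarray(W, dtype=float)
    in_A = np.asarray(in_A, dtype=bool)
    # Rows restricted to A, columns restricted to B: exactly the cut edges.
    return W[np.ix_(in_A, ~in_A)].sum()

# Usage: a chain 0 - 1 - 2 - 3 with two strong links and a weak one in the middle.
W = np.array([[0.0, 5.0, 0.0, 0.0],
              [5.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 4.0],
              [0.0, 0.0, 4.0, 0.0]])
middle = cut_cost(W, [True, True, False, False])   # cuts only the weak link
```

This only scores a candidate partition; finding the minimizing partition requires a min-cut algorithm, which is not shown here.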
Minimum cut Problem with minimum cut: Weight of cut proportional to number of edges in the cut; tends to produce small, isolated components. [Shi & Malik, 2000 PAMI]
Cuts in a graph: Normalized cut [Figure: graph split into parts A and B] Normalized cut: fix the bias of min cut by normalizing for the size of the segments: $Ncut(A, B) = \frac{cut(A, B)}{assoc(A, V)} + \frac{cut(A, B)}{assoc(B, V)}$ where assoc(A,V) = sum of weights of all edges that touch A. The Ncut value is small when we get two clusters with many edges with high weights, and few edges of low weight between them. Approximate solution for minimizing the Ncut value: generalized eigenvalue problem. J. Shi and J. Malik, Normalized Cuts and Image Segmentation, CVPR, 1997 Source: Steve Seitz
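A small sketch of both pieces follows: evaluating Ncut(A,B) = cut(A,B)/assoc(A,V) + cut(A,B)/assoc(B,V) for a given partition, and approximating the minimizing partition from the generalized eigenvalue problem (D - W) y = lambda D y, where D is the diagonal matrix of node degrees. The function names are hypothetical, every node is assumed to have nonzero degree, and thresholding the second-smallest eigenvector at zero is one simple splitting rule assumed for the example.

```python
import numpy as np

def ncut_value(W, in_A):
    """Normalized cut value of the partition (A, B) given by a boolean mask."""
    W = np.asarray(W, dtype=float)
    in_A = np.asarray(in_A, dtype=bool)
    cut = W[np.ix_(in_A, ~in_A)].sum()
    assoc_A = W[in_A, :].sum()     # weights of all edges that touch A
    assoc_B = W[~in_A, :].sum()    # weights of all edges that touch B
    return cut / assoc_A + cut / assoc_B

def ncut_bipartition(W):
    """Approximate two-way normalized cut; returns a boolean mask for one side."""
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)                           # node degrees (diagonal of D)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # (D - W) y = lambda D y is equivalent to the symmetric problem
    # (I - D^{-1/2} W D^{-1/2}) z = lambda z with y = D^{-1/2} z.
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L_sym)    # eigenvalues in ascending order
    y = d_inv_sqrt * eigvecs[:, 1]              # second-smallest eigenvector
    return y > 0                                # assumed rule: split at zero

# Usage: two tightly linked pairs joined by one weak edge.
W = np.array([[0.0, 5.0, 0.0, 0.0],
              [5.0, 0.0, 0.1, 0.0],
              [0.0, 0.1, 0.0, 5.0],
              [0.0, 0.0, 5.0, 0.0]])
part = ncut_bipartition(W)
```

On this graph, splitting off a single node gives a much larger Ncut value than the balanced split, which is how the normalization counters the min-cut preference for small, isolated components.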
Example results
Results: Berkeley Segmentation Engine http://www.cs.berkeley.edu/~fowlkes/bse/
Normalized cuts: pros and cons
Pros: generic framework, flexible to choice of function that computes weights (affinities) between nodes; does not require a model of the data distribution.
Cons: time complexity can be high (dense, highly connected graphs need many affinity computations; solving the eigenvalue problem); preference for balanced partitions.
Kristen Grauman
The end
Geometry of Color (CIE) Perceptual color spaces are non-convex Three primaries can span the space, but weights may be negative. Curved outer edge consists of single wavelength primaries Source: Jim Rehg
RGB Color Space Many colors cannot be represented (phosphor limitations) Source: Jim Rehg
Uniform color spaces MacAdam ellipses (next slide) demonstrate that differences in x,y are a poor guide to differences in color. Construct color spaces so that differences in coordinates are a good guide to differences in color. Source: Jim Rehg
MacAdam ellipses Figures courtesy of D. Forsyth
LUV Color Space
HSV Color Space [Figure: RGB to HSV conversion] Source: Intel IPP
LUV Color Space [Figure: RGB to LUV conversion] For more info, see: http://software.intel.com/sites/products/documentation/hpc/ipp/ippi/ippi_ch6/ch6_color_models.html Source: Intel IPP