CS 5540 Spring 2013 Assignment 3, v1.0 Due: Apr. 24th 11:59PM

1 Introduction

In this programming project, you will carry out a simple image segmentation task. Given a grayscale image with a bright object against a dark background, you will make a binary decision for each pixel: does it belong to the foreground or to the background? You will work with both synthetic data and real data. You are expected to implement both the thresholding algorithm and the graph cut algorithm covered in lecture, and to understand them better as a result.

As before, this assignment can be done individually or in pairs, though we strongly encourage you to work in pairs. You may use any programming language. You must implement your own algorithms: you CANNOT directly use any built-in functions or third-party code for these algorithms, EXCEPT that you may use or translate the code we provide for solving the min-cut problem. If you have any questions about this requirement, please ask the TA (chenwang@cs.cornell.edu).

2 Assignment Submission

Please submit an archive file through CMS containing your report and source code. Include a brief description of the technique choices (parameters, energy functions, etc.) you made in your algorithms. For each image segmentation task, provide at least one mask image and its corresponding extracted foreground (see Sec. 3.1 for details and examples). It is better if you provide multiple results using different techniques and compare them.

3 Assignment

3.1 Task Definition

Definition 3.1 (Grayscale Image). In a computer system, a grayscale image can be represented by a 2-D matrix. Suppose the image has $H \times W$ pixels; we represent it as an $H \times W$ matrix $I$ in which each entry $I_{ij}$ is the intensity of the pixel in the $i$-th row and $j$-th column. Typically, the intensity is an integer in the range $[0, 255]$, where 0 indicates black and 255 indicates white.

Definition 3.2 (Binary Image Segmentation).
In a binary image segmentation task, we would like to classify each pixel into one of two categories (foreground or background). In other words, we would like to work out a function $f : [0, 255]^{H \times W} \to \{0, 1\}^{H \times W}$ that takes an image as input and outputs a 0/1 matrix of the same size, where 0 indicates a background pixel and 1 indicates a foreground pixel. We also call the output the mask matrix, and we can visualize it as an image.

Definition 3.3 (Extracted Foreground). We define the extracted foreground as an image that keeps the original foreground and sets all background pixels to black. Formally, suppose $Y = f(I)$; we define the extracted foreground $O$ by:

\[ O_{ij} = Y_{ij} I_{ij} \tag{1} \]

Figures 1 to 3 illustrate the original image, the mask image, and the extracted foreground. You are expected to include the mask image and the extracted foreground for each segmentation in your experiments.

Figure 1: Input Image
Figure 2: Mask Image (Binary Segmentation Result)
Figure 3: Extracted Foreground

3.2 Threshold-based Segmentation

3.2.1 Simple Thresholding Algorithm

Since in our simple binary image segmentation task we assume that the object is always bright while the background is always dark, a naive way to do the segmentation is to set a threshold $t$ and apply the thresholding function to each pixel. That is, given an image $I_{H \times W}$, we obtain the result (mask matrix) $Y_{H \times W}$ as:

\[ Y_{ij} = \begin{cases} 1, & I_{ij} \ge t \\ 0, & I_{ij} < t \end{cases} \tag{2} \]

However, real images are not ideal: the simple thresholding algorithm is vulnerable to noise, and the boundary it produces may not be smooth. Therefore, we always perform some pre-processing and post-processing to improve the result. I will introduce two typical filters, one for pre-processing and one for post-processing.

3.2.2 Mean Filter Denoising

We can generalize our 1-D mean filter to 2-D to denoise the image before segmentation. The idea is still to replace the intensity of each pixel by its local average. Typically, given a pixel, we define all pixels within Manhattan distance $k$ of it as its neighbors, i.e., a diamond-shaped region centered at the given pixel that spans $2k + 1$ pixels along its row and its column. Here, $k$ is a parameter. Note that boundary pixels are treated differently this time: the neighbor set is clipped to the image, as in the formal definition below. We define the neighbor set $N(x, y)$ and the denoised image $I'$ as follows:

\[ N(x, y) = \{ (i, j) \mid 1 \le i \le H,\ 1 \le j \le W,\ |x - i| + |y - j| \le k \} \tag{3} \]

\[ I'_{xy} = \frac{\sum_{(i,j) \in N(x,y)} I_{ij}}{|N(x, y)|} \tag{4} \]

3.2.3 Morphological Filters

To refine the segmentation result, we would like to post-process it. Morphological filters are classic tools for this. The two fundamental operations are dilation and erosion. In dilation, a pixel becomes foreground if any pixel in its neighborhood is foreground. In erosion, a pixel becomes background if any pixel in its neighborhood is background. A common morphological filter applies one dilation followed by one erosion (this combination is known as closing). It can fill in some holes in the segmentation result and make the boundary smoother. You also need to define the neighbor set for the morphological filter.
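As a concrete illustration of Sections 3.2.1 to 3.2.3, the following Python sketch applies the thresholding rule (2), the mean filter of (3) and (4), and dilation and erosion to a tiny made-up image. It is only one possible reading of the definitions, not a reference solution. The function names and the toy data are hypothetical, indices are 0-based rather than 1-based, and reusing the Manhattan-distance neighbor set for the morphological operations is an assumption.

```python
def neighbors(x, y, h, w, k):
    """Neighbor set N(x, y) of Eq. (3): pixels within Manhattan distance k,
    clipped to the image. Indices are 0-based here (an assumption)."""
    return [(i, j)
            for i in range(max(0, x - k), min(h, x + k + 1))
            for j in range(max(0, y - k), min(w, y + k + 1))
            if abs(x - i) + abs(y - j) <= k]


def mean_filter(img, k):
    """Replace each pixel by the average over its neighbor set, Eq. (4)."""
    h, w = len(img), len(img[0])
    out = []
    for x in range(h):
        row = []
        for y in range(w):
            nb = neighbors(x, y, h, w, k)
            row.append(sum(img[i][j] for i, j in nb) / len(nb))
        out.append(row)
    return out


def threshold(img, t):
    """Mask of Eq. (2): 1 where the intensity is at least t, else 0."""
    return [[1 if v >= t else 0 for v in row] for row in img]


def morph(mask, k, dilate):
    """Dilation takes the max over the neighbor set, erosion the min."""
    h, w = len(mask), len(mask[0])
    pick = max if dilate else min
    return [[pick(mask[i][j] for i, j in neighbors(x, y, h, w, k))
             for y in range(w)] for x in range(h)]


def extract_foreground(img, mask):
    """Extracted foreground of Eq. (1): O_ij = Y_ij * I_ij."""
    return [[m * v for m, v in zip(mrow, irow)]
            for mrow, irow in zip(mask, img)]


# Hypothetical 3x3 image: dark background with one bright noise pixel.
noisy = [[10, 10, 10],
         [10, 250, 10],
         [10, 10, 10]]

raw_mask = threshold(noisy, 128)                    # the noise pixel survives
clean_mask = threshold(mean_filter(noisy, 1), 128)  # averaged away first

# Dilation followed by erosion fills a one-pixel hole in a mask.
holed = [[1, 1, 1],
         [1, 0, 1],
         [1, 1, 1]]
closed = morph(morph(holed, 1, dilate=True), 1, dilate=False)
```

On this toy input the isolated bright pixel passes plain thresholding but not thresholding after the mean filter, which is the kind of behavior the noisy synthetic image is meant to expose.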
In practice, the neighbor set for morphological filters is typically very small, e.g., $k \le 3$ for the Manhattan distance defined above.

3.2.4 Your Task

You are supposed to do the following tasks in this part:

- Implement the thresholding algorithm and 2-D mean filters.
- (Optional) Implement morphological filters.
- Choose a proper threshold $t$ and apply the simple thresholding algorithm to groundtruth.png in the synthetic data.
- Apply the simple thresholding algorithm to noisy.png in the synthetic data. Analyze why a simple thresholding strategy fails in this case.
- Choose a proper size $k$, apply the mean filter to noisy.png, then apply the simple thresholding algorithm to do the segmentation. (Optional: apply morphological filters to refine the result.)
- With the experience gained on the synthetic data, segment each image in the real data using the thresholding algorithm. (Optional: you can add more fun things beyond the mean filter and morphological filters to make the results better.)

3.3 Graph Cut-based Segmentation

3.3.1 Preliminary

This part introduces some basic graph theory, including directed graphs, flows, cuts, etc. If you already know this material, or you are not interested in reading too much math, you can skip the formal definitions. But PLEASE try to understand all the examples here, which may help you understand the high-level ideas behind these mathematical tools and build better models for the image segmentation task. Some material in this part comes from Dexter Kozen's textbook [1]. If you are interested in this stuff, you can take CS 4820 or CS 6820 for more details.

Definition 3.4 (Directed Graph). A directed graph $G$ is defined by a tuple $G = (V, E)$. Here $V$ is a vertex set, which represents the nodes in the graph, and $E \subseteq V^2$ is an edge set, which represents the directed links between pairs of nodes.

Figure 4: Example for a graph

Example of graph: Fig. 4 illustrates an example of a directed graph. There are 3 nodes and 3 directed edges in the graph. Here our vertex set is $V = \{1, 2, 3\}$ and our edge set is $E = \{(1, 2), (2, 3), (3, 1)\}$. Note that we use an ordered pair to represent a directed edge, i.e., $(1, 2)$ is an edge pointing from node 1 to node 2 (see the arrow in the example), which is different from the edge $(2, 1)$.
Suppose we are given a tuple $G = (V, c, s, t)$, where $V$ is a set of vertices, $s, t \in V$ are distinguished vertices called the source and sink respectively, and $c$ is a function $c : V^2 \to \mathbb{R}_+$ assigning a nonnegative real capacity to each pair of vertices. We make $G$ into a directed graph by defining the set of directed edges:

\[ E = \{ (u, v) \mid c(u, v) > 0 \} \tag{5} \]

Definition 3.5 (Flow). A function $f : V^2 \to \mathbb{R}$ is called a flow if the following three conditions are satisfied:

- capacity constraints: for all vertices $u, v \in V$, we have $f(u, v) \le c(u, v)$
- skew symmetry: for all $u, v \in V$, we have $f(u, v) = -f(v, u)$
- conservation of flow: for every vertex $v \in V$ except $s$ and $t$, we have $\sum_{u \in V} f(u, v) = 0$

Definition 3.6 (Max Flow Problem). We define the value of a flow $f$ as the total amount of flow out of the source, i.e.,

\[ |f| = \sum_{v \in V} f(s, v) \tag{6} \]

The max flow problem is to find the flow $f$ with the maximum value $|f|$.

Example of flow: The intuition for a flow comes from water flowing through pipes. We use a graph to describe how the pipes are connected, the capacity function $c$ to describe the capacity of each pipe, and the flow function $f$ to describe the amount of water going through each pipe per unit time. The three properties in our definition of flow can be interpreted as follows: 1) The capacity constraints say that we cannot push more water through a pipe per unit time than its capacity allows. 2) Skew symmetry is somewhat counter-intuitive. "I give you 5 gallons of water" means "you receive 5 gallons of water from me"; an awkward way to express the same thing is "you give me $-5$ gallons of water". 3) Combined with the second property, conservation of flow says that for each internal node, the amount of water it receives from other nodes equals the amount of water it pushes to other nodes. In other words, internal nodes cannot store water. Only the source node $s$ can generate water, and only the sink node $t$ can store water in our water system.

Figure 5: Example for a flow

Fig. 5 illustrates an example of a flow network and its flow. The numbers $f/c$ on each edge are its flow and capacity. To keep the figure clean, I don't draw edges with zero capacity and negative flow, but we should keep in mind that $f(s, 1) = 2$ implies $f(1, s) = -2$. We can verify that the capacity constraints and conservation of flow hold in this example. For example, node 4 receives 1 gallon of water from node 1 and 3 gallons of water from node 2, and it pushes 4 gallons of water towards the sink node $t$. By the way, the flow illustrated in the figure is also the max flow, with value $|f| = 5$.
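The three conditions of Definition 3.5 and the value in (6) can be checked mechanically. The Python sketch below does so on a small made-up network; it is not the network of Fig. 5, whose capacities are not reproduced in the text. The vertex names, capacities, and helper names are all hypothetical.

```python
def skew_closure(positive):
    """Expand positive flows {(u, v): amount} into a skew-symmetric f."""
    f = {}
    for (u, v), amount in positive.items():
        f[(u, v)] = f.get((u, v), 0) + amount
        f[(v, u)] = f.get((v, u), 0) - amount
    return f


def is_flow(V, c, f, s, t):
    """Check the three conditions of Definition 3.5."""
    cap = lambda u, v: c.get((u, v), 0)   # missing pairs have capacity 0
    val = lambda u, v: f.get((u, v), 0)   # missing pairs carry no flow
    capacity_ok = all(val(u, v) <= cap(u, v) for u in V for v in V)
    skew_ok = all(val(u, v) == -val(v, u) for u in V for v in V)
    conservation_ok = all(sum(val(u, v) for u in V) == 0
                          for v in V if v not in (s, t))
    return capacity_ok and skew_ok and conservation_ok


def flow_value(V, f, s):
    """|f| of Eq. (6): total flow leaving the source."""
    return sum(f.get((s, v), 0) for v in V)


# Hypothetical network with two internal nodes a and b.
V = ["s", "a", "b", "t"]
c = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1, ("a", "t"): 2, ("b", "t"): 3}

good = skew_closure({("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1,
                     ("a", "t"): 2, ("b", "t"): 3})
bad = skew_closure({("s", "a"): 4, ("a", "t"): 4})  # exceeds capacities

print(is_flow(V, c, good, "s", "t"), flow_value(V, good, "s"))
print(is_flow(V, c, bad, "s", "t"))
```

Storing the negative reverse entries explicitly keeps the conservation check identical to the formula in the definition: the inflow and outflow of an internal node cancel in a single sum.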
Definition 3.7 (s,t-cut). An s,t-cut is a pair $A, B$ of disjoint subsets of $V$ whose union is $V$ such that $s \in A$ and $t \in B$. The capacity of the cut $A, B$, denoted $c(A, B)$, is:

\[ c(A, B) = \sum_{u \in A,\, v \in B} c(u, v) \tag{7} \]

Definition 3.8 (Min Cut Problem). Given the tuple $G = (V, c, s, t)$, find the s,t-cut $A, B$ with the minimal capacity.

Example of cut: In Fig. 6, the dotted line separates the vertex set into $A = \{s, 1, 2, 4\}$ and $B = \{3, t\}$. The capacity of this cut is $c(A, B) = 5$. By the way, this is also the min-cut of this graph.

Theorem 3.9 (Max Flow-Min Cut Theorem). Suppose $f$ is the max flow in $G = (V, c, s, t)$ and $A, B$ is the min cut in $G = (V, c, s, t)$; then we have $|f| = c(A, B)$.

An intuitive interpretation of this theorem is that the maximum flow in a pipe system is limited by the bottleneck pipes in the system. We can also see from the examples above that the value of the max flow agrees with the capacity of the min-cut.
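To make Definitions 3.7 and 3.8 concrete, the Python sketch below evaluates the cut capacity (7) and finds a min cut by trying every s,t-cut of a small made-up network. Brute force is only workable for toy graphs and is not the algorithm used later in this assignment. The network and all names are hypothetical.

```python
from itertools import combinations


def cut_capacity(c, A, B):
    """c(A, B) of Eq. (7): total capacity of edges going from A to B."""
    return sum(cap for (u, v), cap in c.items() if u in A and v in B)


def brute_force_min_cut(V, c, s, t):
    """Try every s,t-cut; exponential in |V|, so for toy graphs only."""
    inner = [v for v in V if v not in (s, t)]
    best = None
    for r in range(len(inner) + 1):
        for extra in combinations(inner, r):
            A = {s, *extra}
            B = set(V) - A
            capacity = cut_capacity(c, A, B)
            if best is None or capacity < best[0]:
                best = (capacity, A, B)
    return best


# Hypothetical network with two internal nodes a and b.
V = ["s", "a", "b", "t"]
c = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1, ("a", "t"): 2, ("b", "t"): 2}

capacity, A, B = brute_force_min_cut(V, c, "s", "t")
print(capacity, sorted(A), sorted(B))
```

In this toy network, sending 2 units along s, a, t and 2 units along s, b, t gives a flow of value 4, which equals the smallest cut capacity found here, as Theorem 3.9 predicts.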

Figure 6: Example for a cut

3.3.2 Graph Cut for Image Segmentation

Just as we learned in lecture, our energy function is a combination of a data term and a prior term:

\[ E(L) = \sum_{p} D_p(L_p) + \lambda \sum_{pq \in N} P_{pq}\, \mathbb{I}(L_p \ne L_q) \tag{8} \]

where $\mathbb{I}(\cdot)$ is the indicator function and both $D_p$ and $P_{pq}$ are penalty functions. $D_p(L_p)$ is the penalty for labeling pixel $p$ as $L_p$. $P_{pq}$ is the penalty for two neighboring pixels $p$ and $q$ having different labels. There are many choices for these penalty functions, and most of them depend on the pixel intensities. To convert this energy minimization problem into a min-cut problem, we define the flow network in the following way:

- We treat each pixel $p$ we need to label as one node $n_p$ in the graph. We also add one special source node $s$ and one special sink node $t$ to the graph.
- We add a directed edge from the source $s$ to each internal node $n_p$ with capacity $D_p(1)$, and a directed edge from each internal node $n_p$ to the sink $t$ with capacity $D_p(0)$.
- For each pair of pixels $p$ and $q$ that appears in our neighborhood system $N$, we add two directed edges $(n_p, n_q)$ and $(n_q, n_p)$, each with capacity $\lambda P_{pq}$.

Fig. 7 [2] is a very good example illustrating the procedure described above.

Suppose $(A, B)$ is an arbitrary cut in the graph defined above. We label each pixel $p \in A$ as 0 (background) and each pixel $q \in B$ as 1 (foreground). We claim that $c(A, B) = E(L)$. We can prove this by direct computation:

\[
\begin{aligned}
c(A, B) &= \sum_{p \in A,\, q \in B} c(p, q) && \text{(definition of cut)} \\
&= \sum_{q \in B \setminus \{t\}} c(s, q) + \sum_{p \in A \setminus \{s\}} c(p, t) + \sum_{p \in A \setminus \{s\},\, q \in B \setminus \{t\}} c(p, q) && \text{(note that } c(s, t) = 0\text{)} \\
&= \sum_{q \in B \setminus \{t\}} D_q(1) + \sum_{p \in A \setminus \{s\}} D_p(0) + \sum_{p \in A \setminus \{s\},\, q \in B \setminus \{t\},\, pq \in N} \lambda P_{pq} \\
&= \sum_{p} D_p(L_p) + \lambda \sum_{pq \in N,\, L_p \ne L_q} P_{pq} \\
&= \sum_{p} D_p(L_p) + \lambda \sum_{pq \in N} P_{pq}\, \mathbb{I}(L_p \ne L_q) \\
&= E(L)
\end{aligned}
\tag{9}
\]

Therefore, finding the labels that minimize the energy function in our binary segmentation task is equivalent to finding the min-cut in the graph we defined.

For more details about the formulation of the problem and the choices of penalty functions, please refer to Boykov and Funka-Lea's paper [2], especially Sections 2.1, 2.2, 2.3 and 2.5.
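The identity $c(A, B) = E(L)$ can also be checked numerically. The Python sketch below builds the capacities exactly as in the construction above for a made-up row of three pixels and compares the cut capacity with the energy (8) for every labeling. The penalty choices ($D_p(0) = I_p$, $D_p(1) = 255 - I_p$, constant $P_{pq} = 1$, $\lambda = 50$) are illustrative assumptions, not recommended settings, and all names are hypothetical.

```python
from itertools import product


def energy(labels, D, P, lam):
    """E(L) of Eq. (8); D[p][l] is the data penalty for label l at pixel p,
    and P maps each unordered neighbor pair (p, q) to its prior penalty."""
    data = sum(D[p][labels[p]] for p in range(len(labels)))
    prior = sum(w for (p, q), w in P.items() if labels[p] != labels[q])
    return data + lam * prior


def build_capacities(D, P, lam):
    """Capacities of the flow network described in Section 3.3.2."""
    c = {}
    for p in range(len(D)):
        c[("s", p)] = D[p][1]   # source -> pixel, capacity D_p(1)
        c[(p, "t")] = D[p][0]   # pixel -> sink, capacity D_p(0)
    for (p, q), w in P.items():
        c[(p, q)] = lam * w     # both directions carry lambda * P_pq
        c[(q, p)] = lam * w
    return c


def cut_capacity(c, labels):
    """Capacity of the cut where A = {s} + label-0 pixels, B = the rest."""
    def side(v):
        return 0 if v == "s" else 1 if v == "t" else labels[v]
    return sum(cap for (u, v), cap in c.items()
               if side(u) == 0 and side(v) == 1)


# Hypothetical 1-D "image" of three pixels with assumed penalties.
intensity = [200, 180, 30]
D = [(i, 255 - i) for i in intensity]   # (D_p(0), D_p(1))
P = {(0, 1): 1, (1, 2): 1}
lam = 50

c = build_capacities(D, P, lam)
labelings = list(product([0, 1], repeat=len(intensity)))
all_equal = all(cut_capacity(c, L) == energy(L, D, P, lam) for L in labelings)
best = min(labelings, key=lambda L: energy(L, D, P, lam))
print(all_equal, best, energy(best, D, P, lam))
```

With these assumed penalties the two bright pixels come out as foreground and the dark one as background, so the minimizing labeling pays the prior penalty exactly once, at the boundary between them.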

Figure 7: Example of image segmentation using graph cut [2]

3.3.3 Usage of Dinic Class

The algorithm for solving a max-flow (min-cut) problem is not trivial and goes well beyond the requirements of this course. Therefore, we provide an implementation of Dinic's algorithm that solves the max-flow and min-cut problems. You can use this code as a subroutine in your assignment. The code is written in C++, but you can easily translate it into your own programming language since there are only 144 lines of code. You don't need to understand how this code works. Follow these instructions to use it:

Step 1: Initialize an instance of the Dinic class. Specify the number of vertices and the number of edges of your graph as the first and second parameters of the constructor. They are used to allocate arrays in memory, so please make sure that they are big enough.

Step 2: Call AddEdge(s, t, c) to add the edges of the graph one by one. The parameters mean that we add a directed edge from node s to node t with capacity c. This operation is additive, i.e., AddEdge(1,2,1);AddEdge(1,2,2); is equivalent to AddEdge(1,2,3);. By the way, AddEdge(s,t,c,true); is syntactic sugar for AddEdge(s,t,c);AddEdge(t,s,c);. You may find this useful when you define prior terms.

Step 3: Call MaxFlow() to calculate the max flow of the graph.

Step 4: Call MinCut() to get the min cut set of the graph. This function returns a boolean array that indicates whether each vertex belongs to set A. Please note that the min-cut is computed from the max flow, so make sure that you call MaxFlow() before MinCut().

The main function in the provided code illustrates how to get the results in Fig. 5 and Fig. 6. It should produce a max flow of 5 and the min cut sets A = {0, 1, 2, 4} and B = {3, 5} (that is, the cut of Fig. 6 with s and t numbered 0 and 5). You can use this or some other toy problem as a test case for your own translated version. Please contact the TA if you have any questions about the usage of this code.
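If you translate the provided code, a small independent max-flow routine is handy for cross-checking results on toy graphs. The Python sketch below is such a stand-in. It is NOT the provided Dinic class: it uses plain BFS augmenting paths (the Edmonds-Karp method) rather than Dinic's algorithm, and the class name, method names, and signatures are hypothetical and only loosely mirror the four steps above. The toy network is made up as well.

```python
from collections import deque


class FlowNetwork:
    """Hypothetical stand-in that loosely mirrors Steps 1 to 4."""

    def __init__(self, n):
        self.n = n
        self.res = [dict() for _ in range(n)]  # residual capacities

    def add_edge(self, u, v, c, both=False):
        # Additive, like the AddEdge described above.
        self.res[u][v] = self.res[u].get(v, 0) + c
        self.res[v].setdefault(u, 0)           # reverse residual edge
        if both:
            self.res[v][u] += c

    def _reachable(self, s):
        """BFS over edges with positive residual capacity."""
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, cap in self.res[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    def max_flow(self, s, t):
        total = 0
        while True:
            parent = self._reachable(s)
            if t not in parent:
                return total
            # Find the bottleneck along the augmenting path.
            bottleneck, v = float("inf"), t
            while parent[v] is not None:
                bottleneck = min(bottleneck, self.res[parent[v]][v])
                v = parent[v]
            # Push the bottleneck amount along the path.
            v = t
            while parent[v] is not None:
                u = parent[v]
                self.res[u][v] -= bottleneck
                self.res[v][u] += bottleneck
                v = u
            total += bottleneck

    def min_cut(self, s):
        """True for vertices on the source side A; call after max_flow."""
        reach = self._reachable(s)
        return [v in reach for v in range(self.n)]


# Made-up toy network: 0 is the source, 3 is the sink.
net = FlowNetwork(4)
for u, v, cap in [(0, 1, 3), (0, 2, 2), (1, 2, 1), (1, 3, 2), (2, 3, 2)]:
    net.add_edge(u, v, cap)
value = net.max_flow(0, 3)
in_A = net.min_cut(0)
print(value, in_A)
```

As with the provided code, the cut is read off the residual graph left behind by the max-flow computation, which is why the flow must be computed first.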

3.3.4 Your Task

You are supposed to do the following things in this part:

- Implement the graph-cut algorithm for image segmentation, using (or translating) the program we provide for solving the min-cut problem as a subroutine.
- Choose proper parameters, define the penalty functions used in the energy function of the graph cut, and conduct graph-cut based segmentation for each image in both the synthetic and the real dataset. You may choose your own way to do the pre-processing and post-processing. Note that these parameters (including the parameters in the penalty functions) are task-specific, which means you may need to tune them for each image. The quality of the final segmentation result relies heavily on these parameters and functions.
- Compare the results of threshold-based segmentation and graph-cut based segmentation.

Please also note that for some big images, it may take several minutes for Dinic's algorithm to optimize the energy function (according to my own experiments), so please try to use an efficient programming language this time. The speed of the algorithm also depends on the size of your neighborhood system; please try a smaller neighborhood system (such as the 4-neighbor system) when it takes too much time. If the program is still very slow, you can downsample the original image to get a smaller image (which means a smaller graph). But remember, this is the last resort. Only use it when necessary.

4 Academic Integrity

Academic integrity is important in this course. You must follow the school's code (http://cuinfo.cornell.edu/academic/aic.html). Since this is a programming project, we would like to emphasize the following rules:

- Having discussions with other people, using open source code and public tools, and getting ideas from research papers are allowed, but proper citations and acknowledgements are required. Otherwise, any direct or indirect copying from others' work, the Internet, etc. is strictly forbidden.
- All the results you report in this project must be generated by your submitted programs.

Violations of academic integrity are taken very seriously. Please feel free to contact Professor Zabih if you have any questions or concerns about this topic, or if you feel there is any possibility that you may be violating the code of academic integrity.

References

[1] D. C. Kozen, The Design and Analysis of Algorithms, Springer, 1991.
[2] Y. Boykov and G. Funka-Lea, Graph cuts and efficient N-D image segmentation, International Journal of Computer Vision 70 (2006), pp. 109-131.