GENERALIZING CONTEXTS AMENABLE TO GREEDY AND GREEDY-LIKE ALGORITHMS. Yuli Ye


GENERALIZING CONTEXTS AMENABLE TO GREEDY AND GREEDY-LIKE ALGORITHMS

by

Yuli Ye

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy
Graduate Department of Computer Science
University of Toronto

Copyright 2013 by Yuli Ye

Abstract

Generalizing Contexts Amenable to Greedy and Greedy-Like Algorithms
Yuli Ye
Doctor of Philosophy
Graduate Department of Computer Science
University of Toronto
2013

One central question in theoretical computer science is how to solve problems accurately and quickly. Despite the encouraging development of various algorithmic techniques in the past, our understanding of these techniques is still at an early stage. One particularly interesting paradigm is the greedy algorithm paradigm. Informally, a greedy algorithm builds a solution to a problem incrementally by making locally optimal decisions at each step. Greedy algorithms are important in algorithm design as they are natural, conceptually simple to state and usually efficient. Despite the wide application of greedy algorithms in practice, their behaviour is not well understood. We do know, however, that in several specific settings greedy algorithms can achieve good results. This thesis focuses on examining contexts in which greedy and greedy-like algorithms are successful, and on extending them to more general settings. In particular, we investigate structural properties of graphs and set systems, families of special functions, and greedy approximation algorithms for several classic NP-hard problems in those contexts. A natural phenomenon we observe is a trade-off between the approximation ratio and the generality of those contexts.

Acknowledgements

It is my great honour to study theoretical computer science at the University of Toronto and to have a great advisor, Allan Borodin, who helped me through one of the important processes of my life. I am greatly indebted to him, for his long-time support, both emotionally and financially, for his inspiration and guidance, for his patience and encouragement. It has been a long journey, and without him, this would never have been possible. My deepest thanks to my committee members, Charles Rackoff and Derek Corneil, for their careful reading of the thesis and their comments and suggestions. I particularly enjoyed the conversations I had with Derek. His passion for research and his life philosophy make him a true role model for me. Special thanks to Faith Ellen, for serving on my committee for earlier checkpoints and for many constructive and helpful comments on drafts of the thesis. It is my honour to have Magnús M. Halldórsson as my external examiner. I thank him for his helpful comments about the thesis and his excellent research in the field of approximation algorithms. Many ideas in this thesis are borrowed from his papers and insights. I would like to thank all my coauthors, especially Stephen Cook and John Brzozowski. It is my privilege to work with them. I admire their work ethic and persistence, and have learnt a great deal from them. A very special thanks to Dai Le, a wonderful friend for the past four years. We had many enjoyable conversations, and walked almost every path on campus. It is great to have you as a companion during these years. I would also like to thank Renqiang Min, Phuong Nguyen, Yuval Filmus, Brendan Lucier, Justin Ward, Joel Oren, Xiaodan Zhu and many others; it is impossible to name everyone here. Finally, thanks to my family in Toronto and China, my wife Lingling and my dear daughter April. I know you have waited for so long. The dream finally comes true.

Contents

1 Introduction
    What is a Greedy Algorithm?
    Greedy Algorithms: A Brief History
    Approximation Algorithms
    Why Study Greedy Algorithms?
    A List of Problems
    Overview of the Thesis

2 Greedy Algorithms on Special Structures
    Chordal Graphs and Related Structures
    Interval Selection and Colouring
    Perfect Elimination Ordering
    Extending Perfect Elimination Orderings
    Inductive and Universal Neighbourhood Properties and Their Related Graph Classes
    Natural Subclasses of the Four Families
    Graphs Induced by the Job Interval Selection Problem
    Planar Graphs
    Disk and Unit Disk Graphs, Intersection Graphs of Convex Shapes
    More Subclasses

    2.3 Properties of G(IS_k) and G(CC_k)
    Greedy Algorithms for G(IS_k) and G(CC_k)
    Maximum Independent Set
    Minimum Vertex Colouring
    Minimum Vertex Cover
    Weighted Maximum c-colourable Subgraph
    The Graph Class G(CC_2)
    Matroids and Chordoids
    Matroids
    Greedy Algorithms and Matroids
    Chordoids

3 Greedy Algorithms for Special Functions
    Linear Functions and Submodular Functions
    Max-Sum Diversification
    A Greedy Algorithm and Its Analysis
    Further Discussions
    Weakly Submodular Functions
    Examples of Weakly Submodular Functions
    Weakly Submodular Function Maximization
    Further Discussions

4 Sum Colouring - A Case Study of Greedy Algorithms
    Introduction
    NP-Hardness for Penny Graphs
    Approximation Algorithms for d-claw-free Graphs and their Subclasses
    Compact Colouring for Ĝ(IS_k)
    Unit Square Graphs

    4.4 Priority Inapproximation for Sum Colouring
    Fixed Order and Adaptive Order
    Deriving Lower Bounds
    An Inapproximation Lower Bound for Sum Colouring
    Conclusion

5 Greedy Algorithms with Weight Scaling
    Introduction
    Weight Scaling for Degrees
    Weight Scaling for Claws
    Conclusion

6 Conclusion

Bibliography

List of Figures

2.1 A chordal graph
2.2 A set of eight intervals ordered by non-decreasing finish-time
2.3 An optimal solution of interval selection on the input in Fig. 2.2
2.4 An optimal solution of interval colouring on the input in Fig. 2.2
2.5 The interval graph of Fig. 2.2
A graph in Ĝ(IS_2) and G(IS_2) but not in Ĝ(CC_2) and G(CC_2)
An example of the job interval selection problem
No triangular face
One triangular face
Two adjacent triangular faces
An example of a disk graph
Partition the plane into six regions
An example of a circular-arc graph
An example of the construction for k =
A vertex v in H and one of its independent neighbours u
A mapping from O to A
A maximum matching between N(v) and N_2(v)
An example of a triangle weight decomposition of a graph
An optimal sum colouring of G
Unit interval graphs and proper interval graphs

4.3 Unit square graphs and proper intersection graphs of axis-parallel rectangles
Unit disk graphs and penny graphs
Transformation from planar graphs with maximum degree 3 to penny graphs
Transformation for straight pairs
Transformation for uneven pairs
Transformation for corner pairs
The edge gadget
Best colouring of the edge gadget when both u and v are coloured
Recolour v to improve the sum
The adjacent edge gadget
Transformation between two grid vertices
Corner cases in the first transformation
An overlapping adjacent pair
A degree-three corner
Two graphs for adaptive priority algorithms
An example for weight scaling

Chapter 1

Introduction

For many application areas, greedy strategies are a natural, conceptually simple, and efficient algorithmic approach. Although for the vast majority of optimization problems greedy algorithms are not optimal, in some specific settings, such as for simple graph and scheduling problems, greedy algorithms can find a global optimum. Natural questions to ask are what brings success to greedy algorithms and to what extent we can generalize this success. Before we investigate these questions, we first give some background and motivation; namely, understanding what greedy algorithms are, their history, settings in which they are effective and, in general, why they are an interesting subject to study.

1.1 What is a Greedy Algorithm?

Most greedy algorithms appear in the form of an iterative procedure that, at each step, makes locally optimal decisions with respect to a certain criterion. A commonly used example of a problem that can be solved using a greedy algorithm is interval selection. Given a set of intervals, each represented by its start-time and finish-time, we are to select a subset of non-overlapping intervals with maximum cardinality.

Natural greedy algorithms for this problem decide at each step which interval to consider next and what to do about it. The keyword greedy is reflected in this decision-making at each step. Usually, this decision has the following characteristics:

Local: decisions are based on information maintained locally for each input item. This can be the degree of a vertex, the weight of an edge, the distance to a vertex in a graph, or the index in a particular ordering.

Irrevocable: once determined, decisions cannot be changed afterwards.

Greedy: decisions are made so as to optimize a certain criterion.

Note that these characteristics may not always be present in designing a greedy algorithm. However, they do appear frequently. There are often many different choices for a greedy rule to optimize the decision at each step. For example, for the interval selection problem, the following rules may be used for selecting the next interval. At each step, we can choose:

(1) an interval with the smallest processing time, or
(2) an interval with the earliest start-time, or
(3) an interval with the smallest number of conflicting intervals, or
(4) an interval with the earliest finish-time.

By symmetry, choosing the earliest start-time/finish-time is equivalent to choosing the latest finish-time/start-time. Although all these choices are reasonable for a greedy algorithm, only (4) leads to an optimal solution. Therefore, choosing the right greedy rule is crucial when designing a greedy algorithm.

1.2 Greedy Algorithms: A Brief History

Greedy algorithms are natural and they have been widely used in practice. Some of these algorithms are so natural and important that people rarely associate them with greedy algorithms. In fact, it was not until the early 1970s that the term greedy algorithm started emerging. As algorithm design became a standard course in computer science, the term

became more popular, and people started to recognize greedy algorithms as a general way to solve problems. Greedy algorithms have a long history. In the early 13th century, in the book Liber Abaci, Fibonacci described a process for finding a representation of a fraction by a sum of unit fractions (i.e., fractions of the form 1/n) with different denominators. The problem is known as the Egyptian fraction problem. The process described by Fibonacci is a greedy algorithm. At each step, it finds the largest unit fraction not exceeding the remaining fraction and subtracts it from the remaining fraction, until the remaining fraction becomes zero. Fibonacci showed that such a greedy process terminates in a number of steps no greater than the numerator of the original fraction. Hence it produces a finite representation. Greedy algorithms appear in many important applications. For example, in graph theoretic problems, we have Prim's algorithm and Kruskal's algorithm for finding a minimum weight spanning tree. Prim's algorithm maintains a set S of discovered vertices. At each step, it adds to S the closest vertex to S. Kruskal's algorithm sorts all edges in non-decreasing order of weight. At each step, it selects the next edge if it does not create a cycle with previously selected edges. Like Prim's algorithm, Dijkstra's algorithm for finding shortest paths can also be viewed as a greedy algorithm. Dijkstra's algorithm maintains a set S of discovered vertices. At each step, it adds to S the vertex closest to the starting vertex via vertices in S. In data compression, we have Huffman's algorithm to generate the optimal prefix-code tree. Huffman's algorithm is a greedy algorithm in the sense that at each step, it extracts the two smallest elements from a set, combines them and puts the newly combined element back into the set. In scheduling, Graham's list scheduling algorithm for minimizing completion time is a greedy algorithm.
It sorts jobs in non-increasing order of processing time. At each step, it schedules the next job on the least loaded machine. Similarly, Johnson's first fit decreasing algorithm for bin packing is also a greedy algorithm. It sorts items in non-increasing order of size. At each step, it assigns the next item to the first bin into which it fits, opening a new bin if the item does not fit into any existing bin.
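Among these, Fibonacci's unit-fraction process is simple enough to transcribe directly into code. The following Python sketch (the function name is ours) uses exact rational arithmetic; at each step it takes the largest unit fraction not exceeding what remains:

```python
from fractions import Fraction
from math import ceil

def egyptian_expansion(p, q):
    """Greedy (Fibonacci) expansion of the fraction p/q into distinct
    unit fractions; returns the list of denominators."""
    remaining = Fraction(p, q)
    denominators = []
    while remaining > 0:
        # The largest unit fraction not exceeding `remaining` is
        # 1/n with n = ceil(1/remaining).
        n = ceil(1 / remaining)
        denominators.append(n)
        remaining -= Fraction(1, n)
    return denominators

print(egyptian_expansion(5, 6))   # [2, 3]:  5/6 = 1/2 + 1/3
print(egyptian_expansion(7, 15))  # [3, 8, 120]
```

As Fibonacci's argument shows, each subtraction strictly decreases the numerator of the remaining fraction, so the loop makes at most p iterations.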

There are also important developments in studying special structures related to greedy algorithms. Matroids and related systems were first studied in the 1950s by Rado [76], Gale [34] and Edmonds [27]. This work was later extended by Korte and Lovász [61]. In particular, matroids characterize those hereditary set systems for which the natural greedy algorithm always optimizes linear objectives. Related to this development, a similar concept, known as the Monge property, is studied in the literature on transportation problems. In his 1963 paper, Hoffman [49] gave a necessary and sufficient condition, the existence of a Monge sequence, that determines when a certain transportation problem can be solved greedily. Greedy algorithms are also studied for optimizing special functions. An important result of Nemhauser, Wolsey and Fisher [71] states that, over a uniform matroid, the natural greedy algorithm achieves an e/(e−1) approximation to the optimal value of a monotone submodular set function. This has led to applications in fields such as information retrieval [57] and natural language processing [69, 67, 68]. Recently, there has been growing interest in trying to better understand greedy algorithms. One key development is the priority framework initiated by Borodin, Nielsen and Rackoff [12]. It gives a precise model for greedy algorithms, so that their power and limitations can be analyzed. We briefly discuss the priority framework in Section 4.4.

1.3 Approximation Algorithms

Throughout the thesis, we focus on approximation algorithms for optimization problems. Most of these problems are NP-hard. The celebrated result of NP-completeness [20] shows that hard problems exist. Karp's landmark paper on twenty-one NP-complete problems [56] shows they are abundant. In practice, finding optimal solutions for such problems is computationally intractable.
They are commonly addressed with heuristics that provide a solution, but with no guarantee on the solution's quality. For many optimization problems, however, a sub-optimal solution

that is close to the global optimum, and computationally feasible, can sometimes be used as a good substitute for the computationally infeasible optimal solution. Algorithms that find such sub-optimal solutions are called approximation algorithms. Central to the framework of approximation algorithms is the definition of the approximation ratio, a mathematical measure which bounds the worst-case performance of such algorithms. For a given optimization problem with objective function φ(·), we let σ be an input instance, and let A(σ) and O(σ) be the algorithm's solution and the optimal solution respectively. If the problem is a minimization problem, then the approximation ratio is defined to be the supremum, over the set Σ of all possible input instances, of the ratio between the value of the algorithm's solution and the value of the optimal solution:

ρ(A) = sup_{σ ∈ Σ} φ[A(σ)] / φ[O(σ)].

It is not hard to see that, for minimization problems, the approximation ratio is no less than one. For a maximization problem, we take the convention that the approximation ratio is defined to be the supremum of the ratio between the value of the optimal solution and the value of the algorithm's solution over the set of all possible input instances:

ρ(A) = sup_{σ ∈ Σ} φ[O(σ)] / φ[A(σ)],

so that the approximation ratio is always no less than one. Based on these definitions, it is desirable to have a polynomial time algorithm with a ratio close to one. The approximation ratio provides a guarantee on the quality of the solution obtained in the worst case. In some cases, trade-offs between the approximation ratio and the running time are possible. An approximation algorithm is a polynomial-time approximation scheme (PTAS) if it takes an instance of an optimization problem and a fixed parameter ɛ > 0 and, in time polynomial in the input size n, produces a solution that is within a factor 1 + ɛ of the optimal.

An algorithm is a fully polynomial-time approximation scheme (FPTAS) if the running time is polynomial in both the input size n and 1/ɛ. An optimization problem having a polynomial-time approximation algorithm with approximation ratio bounded by a constant is said to be approximable. The class of such problems is called APX. In this thesis, we focus on approximation algorithms with small constant approximation ratios. They provide a good guarantee on the quality of a solution. In practical settings, when inputs are not chosen adversarially, these algorithms can achieve better results than their worst-case bounds.

1.4 Why Study Greedy Algorithms?

In this section, we give a series of motivating examples to show that greedy algorithms are a powerful, subtle and interesting class of algorithms. We start with a greedy algorithm for the set cover problem.

Problem. Given a universe U of n elements and a collection S of m subsets of U: S = {S_1, S_2, ..., S_m} such that U = ⋃_{i=1}^m S_i, the set cover problem asks to find a cover C ⊆ S with minimum size such that U = ⋃_{S_i ∈ C} S_i.

The following very simple and natural greedy algorithm has an H_n ≈ ln n approximation, where H_n = Σ_{i=1}^n 1/i is the n-th harmonic number. This bound is tight up to a constant factor under the assumption that P ≠ NP [78].

GREEDY SET COVER
1: C = ∅
2: while C does not cover all elements in U do
3:   Pick a set S_i ∈ S that covers the most uncovered elements in U
4:   Remove S_i from S
5:   Add S_i to C

6: end while
7: Return C

Now we look at a related problem, vertex cover.

Problem. Given a graph G = (V, E), the vertex cover problem asks to find a cover C ⊆ V with minimum size such that every edge in E is incident to at least one vertex in C.

The vertex cover problem can be viewed as a special case of the set cover problem by taking the universe to be the set of edges and each set in the collection to be the subset of edges incident to a particular vertex. Note that each edge appears in exactly two such sets, as it has two end vertices. This type of set cover problem is also referred to as the 2-frequency set cover problem. The following algorithm, known as the largest degree heuristic, is a direct translation of the greedy set cover algorithm above.

GREEDY VERTEX COVER
1: C = ∅
2: while C does not cover all edges in E do
3:   Pick a vertex v in G with the largest current degree
4:   Remove v and all its incident edges from G
5:   Add v to C
6: end while
7: Return C

Like the greedy set cover algorithm, the above greedy algorithm for vertex cover has an H_n approximation. However, as a special case of set cover, the vertex cover problem admits algorithms with better approximation ratios. The best-known ratio for vertex cover is two, and the bound is tight in the sense that a ratio of 2 − ɛ, for any constant ɛ > 0, is not possible assuming the unique games conjecture [58]. A greedy algorithm in the style of the algorithm above that achieves approximation ratio two was obtained by Clarkson in [19] using a slightly different greedy rule.
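For concreteness, GREEDY SET COVER can be transcribed into a few lines of Python (the data representation and names are ours, not the thesis's):

```python
def greedy_set_cover(universe, sets):
    """Greedy set cover: repeatedly pick the set covering the most
    still-uncovered elements; returns the indices of the chosen sets."""
    uncovered = set(universe)
    available = dict(enumerate(sets))  # index -> set of elements
    cover = []
    while uncovered:
        # Locally optimal choice: maximize the number of newly covered elements.
        best = max(available, key=lambda i: len(available[i] & uncovered))
        if not available[best] & uncovered:
            raise ValueError("the given sets do not cover the universe")
        uncovered -= available[best]
        del available[best]
        cover.append(best)
    return cover

# {1,2,3} covers three new elements, then {4,5} finishes the job.
print(greedy_set_cover({1, 2, 3, 4, 5},
                       [{1, 2, 3}, {2, 4}, {3, 4}, {4, 5}]))  # [0, 3]
```

The GREEDY VERTEX COVER heuristic in the text is exactly this procedure run on the 2-frequency instance whose sets are the edge sets incident to each vertex.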

CLARKSON'S ALGORITHM
1: C = ∅
2: For all v ∈ V, let w(v) = 1
3: while C does not cover all edges in E do
4:   Let d(v) be the current degree of v
5:   Select v in G minimizing w(v)/d(v)
6:   For any neighbour u of v in G, let w(u) = w(u) − w(v)/d(v)
7:   Remove v and all its incident edges from G
8:   Add v to C
9: end while
10: Return C

Theorem. [19] Clarkson's Algorithm achieves an approximation ratio of two for the vertex cover problem.

Note that without line six, Clarkson's Algorithm is the same as the greedy vertex cover algorithm. This is because w(v) will always have value one, and minimizing 1/d(v) at each step is the same as maximizing d(v). This demonstrates the flexibility of greedy algorithms. This flexibility essentially comes from the greedy choice we can make at each step. We know very little about how this flexibility can translate into the design of better algorithms. In some cases, slightly changing the greedy rule used in an algorithm can make its behaviour mysterious and its analysis challenging. For example, consider the following algorithm for vertex cover.

ANOTHER GREEDY ALGORITHM FOR VERTEX COVER
1: C = ∅
2: while C does not cover all edges in E do
3:   For all v ∈ V, let d(v) be its current degree and N(v) be the set of its neighbours
4:   Select v in G maximizing Σ_{u ∈ N(v)} 1/(d(u) − 1)

5:   Remove v and all its incident edges from G
6:   Add v to C
7: end while

It seems difficult to find an input instance leading to an approximation ratio greater than two for this algorithm, yet no proof of any constant approximation ratio is known.

1.5 A List of Problems

Throughout the thesis, we will discuss algorithms for many NP-hard problems. This section collects definitions of all these problems.

1. Weighted Maximum Independent Set (WMIS): Given a graph G = (V, E) and a weight function w : V → Z+, the goal is to find a subset S of vertices maximizing the total weight of S such that no two vertices in S are adjacent in G.

2. Maximum Independent Set (MIS): This is the unweighted version of WMIS obtained by taking the weight of each vertex to be one. The size of an MIS of a graph G is denoted by α(G).

3. Weighted Minimum Vertex Cover (WMVC): Given a graph G = (V, E) and a weight function w : V → Z+, the goal is to find a subset S of vertices minimizing the total weight of S such that every edge is incident to at least one vertex in S.

4. Minimum Vertex Cover (MVC): This is the unweighted version of WMVC obtained by taking the weight of each vertex to be one.

5. Weighted Maximum Clique (WMC): Given a graph G = (V, E) and a weight function w : V → Z+, the goal is to find a subset S of vertices maximizing the total weight of S such that every two vertices in S are adjacent in G.

6. Maximum Clique (MC): This is the unweighted version of WMC obtained by taking the weight of each vertex to be one.

7. Weighted Maximum c-colourable Subgraph (WCOL_c): Given a graph G = (V, E) and a weight function w : V → Z+, the goal is to find a subset S of vertices maximizing the total weight of S such that S can be partitioned into c independent subsets. This problem generalizes WMIS in the sense that WMIS is WCOL_c with c = 1.

8. Maximum c-colourable Subgraph (COL_c): This is the unweighted version of WCOL_c obtained by taking the weight of each vertex to be one.

9. Minimum Vertex Colouring (COL): Given a graph G = (V, E), the goal is to colour the vertices with a minimum number of colours such that no two vertices with the same colour are adjacent in G. The minimum number of colours needed to colour G is called the chromatic number of G, and is denoted by χ(G).

10. Minimum Clique Cover (MCC): Given a graph G = (V, E), the goal is to colour the vertices with a minimum number of colours such that every two vertices with the same colour are adjacent in G.

11. Sum Colouring (SC): Given a graph G = (V, E), a proper colouring of G is an assignment c : V → Z+ such that for any two adjacent vertices u, v, c(u) ≠ c(v). The goal of the problem is to give a proper colouring of G such that Σ_{v ∈ V} c(v) is minimized.

For the rest of the thesis, from time to time, we will use the above acronyms to refer to particular problems.

1.6 Overview of the Thesis

This thesis has two parts. The first half (Chapters 2 and 3) examines contexts where greedy algorithms have good performance, and extends them to more general settings. Chapter 2

studies structural properties of graphs and set systems. In particular, we define generalizations of chordal graphs called inductive k-independent graphs. We study properties of such families of graphs, and we show that several natural classes of graphs are inductive k-independent for small constants k; for example, planar graphs. For any fixed constant k, we develop simple, polynomial time approximation algorithms for inductive k-independent graphs for several well-studied NP-complete problems. For the extension to matroids, we give a new definition of a hereditary set system by replacing the augmentation property of a matroid with an ordered augmentation property. We present several related natural problems, and give positive and negative results about optimization problems over such set systems. In particular, the unweighted maximum independent set problem can be solved greedily in linear time given an ordering of elements satisfying the ordered augmentation property, while the corresponding weighted version of the problem is NP-hard. Chapter 3 focuses on optimization problems for special classes of functions. We consider the problem of maximizing a set function over a uniform matroid and over a general matroid. We extend known results for modular and monotone submodular functions to more general functions. One class of functions is the objective function in the max-sum diversification problem, which is a linear combination of a submodular function and the sum of metric distances of a set. The other class is weakly submodular functions, which generalize the objective function in max-sum diversification. We discuss greedy (and local search) algorithms for problems optimizing these functions and obtain constant approximation guarantees. In particular, for max-sum diversification, we obtain a greedy 2-approximation algorithm over a uniform matroid, and a 2-approximation local search algorithm over a matroid.
For weakly submodular functions, we obtain a 5.95-approximation greedy algorithm over a uniform matroid, and a 14.5-approximation local search algorithm over a matroid. The second half of the thesis (Chapters 4 and 5) presents some results about the design of greedy algorithms. Chapter 4 is a case study of greedy algorithms for the sum colouring

problem. In particular, we prove the problem is NP-hard for penny graphs, unit disk graphs and unit square graphs. We design approximation algorithms for the class of d-claw-free graphs and its subclasses: in particular, a (d − 1)-approximation greedy algorithm for d-claw-free graphs and a 2-approximation algorithm for unit square graphs. We use the priority framework developed in [12], and give a priority inapproximability result for the sum colouring problem on a specific subclass of d-claw-free graphs. Chapter 5 discusses the weight scaling technique for designing a greedy algorithm. We focus on graph optimization problems with weighted vertices. The weight scaling technique gives a scaling factor for each vertex. These scaling factors can be used to produce an ordering in which a greedy algorithm considers vertices. We prove general bounds for greedy algorithms using different scaling factors, and provide a uniform view of several results in the literature. Chapter 6 concludes the thesis.

Chapter 2

Greedy Algorithms on Special Structures

Although relatively rare, there are problems for which simple greedy algorithms can achieve an optimal solution. In many of those cases, it is the underlying structure of the problem that allows for the success of the algorithm. For example, although the maximum independent set problem is NP-hard for general graphs, a simple greedy algorithm solves it for chordal graphs in polynomial time. In this chapter, we discuss two different settings in which greedy algorithms achieve good performance. First, we consider properties of a graph based on the neighbourhood of nodes, extending chordal graphs and claw-free graphs. Then we discuss set systems generalizing matroids and chordal graphs.

2.1 Chordal Graphs and Related Structures

Throughout this chapter, we focus on graphs which are simple, connected and undirected. Vertices or edges of a graph may in some cases be weighted. Initially, we consider unweighted graphs. We start with an important graph class: chordal graphs. The study of chordal graphs can be traced back to the late 1950s; the first definition of chordal graphs was given by Hajnal and Surányi [40].

Definition. [40] A graph G is chordal if each cycle in G of length at least four has at least

one chord.

Figure 2.1: A chordal graph

Figure 2.1 shows a chordal graph with six vertices. The cycle has a chord 5-6. Chordal graphs appear frequently under other names in the literature, such as triangulated graphs, rigid circuit graphs, monotone transitive graphs, and perfect elimination graphs. Chordal graphs have many different characterizations. Dirac [25] proved that a graph is chordal if and only if every minimal vertex separator is a clique. Fulkerson and Gross [33] showed that a graph is chordal if and only if it admits a perfect elimination ordering. This was also observed by Rose [82]. Based on this definition, the first linear time recognition algorithm for chordal graphs [81] was devised using lexicographic breadth-first search. Later, Tarjan and Yannakakis [83] gave an even simpler recognition algorithm for chordal graphs using maximum cardinality search. Chordal graphs also have a beautiful characterization in intersection graph theory. Independently, Buneman [14], Gavril [36] and Walter [86] proved that a graph is chordal if and only if it is an intersection graph of subtrees of a tree. In fact, for every chordal graph, there is a subtree representation with a clique tree, and such a clique tree can be found in linear time; see the work of Hsu and Ma [50]. Not only do chordal graphs have rich characterizations and efficient recognition algorithms, they also contain many interesting subclasses, such as interval graphs, split graphs and k-trees. Many NP-hard problems also become easy for chordal graphs. For example, the maximum independent set problem, the maximum clique problem and the minimum vertex

colouring problem can all be solved in linear time for chordal graphs. Most notably, each of these algorithms is a greedy algorithm utilizing a perfect elimination ordering of the chordal graph. We explore this phenomenon in this chapter. For the purpose of illustration, we start with two problems on interval graphs.

2.1.1 Interval Selection and Colouring

The interval selection problem and the interval colouring problem are two examples of problems that admit optimal greedy algorithms.

Problem. Given a set S of n intervals, where each interval I_k is a half-open interval (s_k, f_k] with a start-time s_k and a finish-time f_k, the goal of the interval selection problem is to find a non-overlapping subset of S with maximum size.

Figure 2.2: A set of eight intervals ordered by non-decreasing finish-time

AN OPTIMAL GREEDY ALGORITHM FOR INTERVAL SELECTION
1: Sort all intervals according to non-decreasing finish-time
2: for i = 1, ..., n do
3:   Select the i-th interval if it does not overlap with anything selected before
4: end for

The set of intervals selected by the above greedy algorithm is shown in red in Fig. 2.3. Note that it is impossible to select more than four non-overlapping intervals for the input in Fig. 2.2; hence the solution produced is optimal.
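The earliest-finish-time rule is a one-pass scan once the intervals are sorted. A minimal Python sketch (our representation: intervals as (start, finish] pairs):

```python
def select_intervals(intervals):
    """Greedy interval selection: process intervals in non-decreasing
    order of finish-time, selecting each interval that does not overlap
    the last selected one. Intervals are half-open (start, finish]."""
    selected = []
    last_finish = float("-inf")
    for start, finish in sorted(intervals, key=lambda iv: iv[1]):
        # (s1, f1] and (s2, f2] with s2 >= f1 do not overlap.
        if start >= last_finish:
            selected.append((start, finish))
            last_finish = finish
    return selected

print(select_intervals([(0, 3), (2, 4), (3, 5), (5, 7)]))
# [(0, 3), (3, 5), (5, 7)]
```

Since only a sort and a single scan are needed, the algorithm runs in O(n log n) time.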

Figure 2.3: An optimal solution of interval selection on the input in Fig. 2.2

Problem. Given a set S of n intervals, where each interval I_k is a half-open interval (s_k, f_k] with a start-time s_k and a finish-time f_k, the goal of the interval colouring problem is to assign a colour to each interval such that any two intervals with the same colour do not overlap.

AN OPTIMAL GREEDY ALGORITHM FOR INTERVAL COLOURING
1: Sort all intervals according to non-decreasing start-time
2: for i = 1, ..., n do
3:   Colour the i-th interval with the first available colour, i.e., the smallest colour not used by any interval overlapping with the i-th interval
4: end for

Figure 2.4: An optimal solution of interval colouring on the input in Fig. 2.2

The colour assigned to each interval by the above greedy algorithm is shown in Fig. 2.4. Observe that intervals 1, 2 and 3 overlap with each other, so it is impossible to use fewer than three colours for the input in Fig. 2.2; hence the solution produced is optimal. Note that the first algorithm utilizes a non-decreasing order of finish-time. The second algorithm utilizes a non-decreasing order of start-time, which is equivalent to a non-increasing order of finish-time. A natural question to ask is what property of such orderings

allows both greedy algorithms to be optimal, and to what extent this can be generalized.

Perfect Elimination Ordering

A key property of the ordering of intervals by non-decreasing finish-time is that for any particular interval I_k = (s_k, f_k], all its overlapping intervals appearing later in the ordering also overlap with each other. More specifically, all of them must contain the time point f_k. The interval graph of a set of intervals is obtained by viewing each interval as a vertex, and drawing an edge between two vertices if and only if the two intervals overlap. The perfect elimination ordering is the generalization of the non-decreasing finish-time ordering of intervals to the graph-theoretic setting.

Definition Given a graph G = (V, E), a perfect elimination ordering is an ordering of the vertices such that for each vertex v, the neighbours of v that occur after v in the ordering form a clique.

Figure 2.5: The interval graph of Fig. 2.2

Figure 2.5 shows the graph representation of Fig. 2.2. In particular, the vertices of the graph are labeled according to the labelling of intervals in Fig. 2.2, and the ordering of these labels gives a perfect elimination ordering. In general, perfect elimination orderings characterize the class of chordal graphs.

Theorem [33, 82] A graph is a chordal graph if and only if it admits a perfect elimination ordering.
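The defining condition above can be checked directly; the following sketch is illustrative (not from the thesis), and the adjacency-dictionary representation of the graph is an assumption.

```python
from itertools import combinations

# Check the defining condition of a perfect elimination ordering: for each
# vertex v, the neighbours of v occurring after v in the ordering must be
# pairwise adjacent, i.e. form a clique.

def is_perfect_elimination_ordering(order, adj):
    """order: list of vertices; adj: dict mapping vertex -> set of neighbours."""
    pos = {v: i for i, v in enumerate(order)}
    for v in order:
        later = [u for u in adj[v] if pos[u] > pos[v]]
        if any(b not in adj[a] for a, b in combinations(later, 2)):
            return False
    return True
```

For instance, two triangles sharing an edge (a chordal graph) admit a perfect elimination ordering, while no ordering of a 4-cycle satisfies the condition.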

Many NP-hard optimization problems can be solved, or admit good approximation algorithms, on chordal graphs because of the existence of a perfect elimination ordering. In the subsequent subsections, we generalize perfect elimination orderings, hence extending chordal graphs to more general graph classes. We show later in this chapter that the problems mentioned above admit good approximation algorithms on these natural extensions.

Extending Perfect Elimination Orderings

In the definition of a perfect elimination ordering, we call the subgraph induced by the neighbours of v that occur after v in the ordering the inductive neighbourhood of v with respect to the given ordering. A perfect elimination ordering requires that, for any v in the given ordering, the size of an MIS in the inductive neighbourhood of v is one. A natural extension is to relax this MIS size to a general parameter k.

Definition [2, 89] Given a graph G = (V, E), a k-independence ordering is an ordering of the vertices such that for each vertex v, the size of an MIS of the inductive neighbourhood of v is at most k. The minimum such k over all possible orderings of the vertices of a graph G is called the inductive independence number of that graph (Akcoglu et al. [2] refer to this as the directed local independence number). We denote it by λ(G).

This extension of a perfect elimination ordering leads to a natural generalization of chordal graphs.

Definition [2, 89] A graph G is inductive k-independent if λ(G) ≤ k.

Surprisingly, this extension seems to have been proposed only relatively recently in [2] and not studied subsequently. It turns out to be a rich extension. We defer the discussion of natural subclasses of this extension to Section 2.2.

The way we extend perfect elimination orderings is to put a constraint on the inductive neighbourhoods of all the vertices. In fact, similar concepts exist in the literature. For example, a graph is k-degenerate [66] (also known as k-inductive in [51]) if every subgraph has a vertex of degree at most k. This definition was extended to the weighted case in [54] and was referred to as weighted inductiveness. In [53] and more recently in [55], an inductive neighbourhood property based on the size of an MCC is also studied. In the next subsection, we give a uniform view of graph classes based on inductive neighbourhood properties.

Inductive and Universal Neighbourhood Properties and Their Related Graph Classes

We define our terminology first. Let G = (V, E) be a graph on n vertices. If X ⊆ V, the subgraph of G induced by X is denoted by G[X]. For a particular vertex v_i ∈ V, let d(v_i) denote its degree and N(v_i) denote the set of neighbours of v_i, excluding v_i. Given an ordering of vertices v_1, v_2, ..., v_n, we use V_i to denote the set of vertices {v_i, ..., v_n}.

Let P be a graph property. It is closed on induced subgraphs if whenever P holds for a graph G, it also holds for any induced subgraph of G. A graph has an inductive neighbourhood property with respect to P if and only if there exists an ordering of vertices v_1, v_2, ..., v_n such that for each v_i, P holds on G[N(v_i) ∩ V_i]. The set of all graphs satisfying such an inductive neighbourhood property is denoted by G(P). Such an ordering of vertices is called an elimination ordering with respect to the property P. A graph has a universal neighbourhood property with respect to P if and only if P holds on G[N(v)] for every vertex v. The set of all graphs satisfying such a neighbourhood property is denoted by Ĝ(P).

Proposition If the property P is closed on induced subgraphs, then Ĝ(P) ⊆ G(P).

Proof: Let G be a graph with n vertices in the class Ĝ(P), and let v_1, v_2, ..., v_n be an arbitrary ordering of its vertices. For any vertex v_i, since the property P holds on G[N(v_i)] and the property is closed on induced subgraphs, P holds on G[N(v_i) ∩ V_i]. Therefore, the original

ordering of vertices v_1, v_2, ..., v_n is an elimination ordering for G with respect to the property P. Therefore, G is in the class G(P).

Proposition If the property P is closed on induced subgraphs, then for a graph G in G(P), any induced subgraph of G is also in G(P).

Proof: We prove this by contradiction. Suppose the statement is false, and let G be a graph in G(P) that contains an induced subgraph which is not in G(P). Let G′ be a minimum size induced subgraph of G that is not in G(P); let N(v) denote the set of neighbours of v in G and N′(v) the set of neighbours of v in G′. Note that for any vertex v in G′, the property P does not hold on G′[N′(v)]. Otherwise, G′ \ {v} cannot be in G(P) (for if it were, prepending v to its elimination ordering would show that G′ is in G(P)), so deleting v from G′ would create a smaller induced subgraph of G that is not in G(P), contradicting the minimality of G′.

Let v_1, v_2, ..., v_n be an elimination ordering of the vertices of G with respect to the property P, and let v_i be the first vertex in this ordering that appears in G′. Clearly, the property P holds on G[N(v_i) ∩ V_i] but does not hold on G′[N′(v_i)]. Since N′(v_i) ⊆ N(v_i) ∩ V_i, the graph G′[N′(v_i)] is an induced subgraph of G[N(v_i) ∩ V_i], and since P is closed on induced subgraphs, P must hold on G′[N′(v_i)]; this is a contradiction.

Theorem If the property P can be tested in O(p(n)) time, then a graph in Ĝ(P) can be recognized in O(np(n)) time.

Proof: The graph G is in Ĝ(P) if and only if the property P holds on G[N(v)] for every vertex v in G. Since the property P can be tested in O(p(n)) time, we can test it for all n vertices to determine whether G is in Ĝ(P).

Theorem If the property P is closed on induced subgraphs and can be tested in O(p(n)) time, then a graph in G(P) can be recognized in O(n^2 p(n)) time by the following algorithm. Furthermore, letting Q be a queue, an elimination ordering with respect to the property P is constructed and stored in Q.

RECURSIVE_TEST(G, P)

1: if G is empty then
2: Return TRUE
3: end if
4: for all v ∈ V do
5: if P holds on G[N(v)] then
6: Enqueue v to Q
7: Return RECURSIVE_TEST(G[V \ {v}], P)
8: end if
9: end for
10: Return FALSE

Proof: For a given graph G, if the above algorithm returns TRUE, then the ordering of vertices given by Q is an elimination ordering with respect to the property P; therefore G is in G(P). We prove the other direction by contradiction. Suppose the algorithm fails to recognize some graph in G(P), and let G be a minimum counter-example, that is, G is in G(P) but RECURSIVE_TEST on (G, P) returns FALSE. There are two cases:

1. It returns FALSE because for every vertex v in the graph G, the property P does not hold on G[N(v)]. Then no vertex can be first in an elimination ordering, so G is not in G(P), which is a contradiction.

2. It returns FALSE because the recursive call on (G[V \ {v}], P) returns FALSE. Since the property P is closed on induced subgraphs, by Proposition 2.1.9, G[V \ {v}] is in G(P). Hence, G[V \ {v}] is a smaller counter-example. This is also a contradiction.

Note that each recursive call requires checking the remaining vertices of the graph (in the worst case), and reduces the number of vertices of the graph by one. Therefore, the overall running time for recognizing a graph in G(P) is O(n^2) times the time to test the property P. Note that if G is in G(P), then the ordering of vertices in queue Q provides a certificate.
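The RECURSIVE_TEST procedure can be sketched compactly in Python (iteratively, to avoid recursion limits); this is an illustrative sketch, not the thesis's implementation, and the adjacency-dictionary graph representation and the predicate interface for P are assumptions. With P taken to be "is a clique" (i.e. MCC ≤ 1), it recognizes chordal graphs.

```python
def recursive_test(adj, prop):
    """Return an elimination ordering of the graph w.r.t. property P, or
    None if the graph is not in G(P).
    adj: dict vertex -> set of neighbours.
    prop(neighbours, adj): whether P holds on the subgraph induced by
    `neighbours` among the vertices still present in adj.
    P is assumed closed on induced subgraphs."""
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    order = []
    while adj:
        for v in list(adj):
            if prop(adj[v], adj):
                order.append(v)          # "enqueue v to Q"
                for u in adj[v]:
                    adj[u].discard(v)    # continue on G[V \ {v}]
                del adj[v]
                break
        else:
            return None  # no vertex qualifies: G is not in G(P)
    return order

def is_clique(vertices, adj):
    """Property MCC <= 1: the given vertices are pairwise adjacent."""
    return all(u in adj[w] for u in vertices for w in vertices if u != w)
```

On two triangles sharing an edge (a chordal graph) this returns an ordering, while on a 4-cycle it returns None.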

If G is not in G(P), then the remaining graph at the termination of RECURSIVE_TEST provides a certificate of why G is not in G(P).

In this chapter, we focus on graphs whose inductive and universal neighbourhoods satisfy the following two graph properties:

1. MCC ≤ k: the size of a minimum clique cover is no more than k. We denote the two classes of such graphs by G(CC_k) and Ĝ(CC_k).

2. MIS ≤ k: the size of a maximum independent set is no more than k. We denote the two classes of such graphs by G(IS_k) and Ĝ(IS_k).

Note that both properties are closed on induced subgraphs, and by Proposition 2.1.8 we have Ĝ(CC_k) ⊆ G(CC_k) and Ĝ(IS_k) ⊆ G(IS_k). It is not difficult to show that the inclusion is proper for every positive integer k.

Theorem For any positive integer k, Ĝ(CC_k) ⊊ G(CC_k) and Ĝ(IS_k) ⊊ G(IS_k).

Proof: Consider the complete bipartite graph K_{k,k+1}. It is in G(CC_k) and G(IS_k) but is in neither Ĝ(CC_k) nor Ĝ(IS_k).

Furthermore, as MIS ≤ k is a weaker property than MCC ≤ k (in fact, the gap between the size of an MIS and the size of an MCC can be arbitrarily large), we have Ĝ(CC_k) ⊆ Ĝ(IS_k) and G(CC_k) ⊆ G(IS_k).

Theorem For any positive integer k > 1, Ĝ(CC_k) ⊊ Ĝ(IS_k) and G(CC_k) ⊊ G(IS_k). For k = 1, Ĝ(CC_k) = Ĝ(IS_k) and G(CC_k) = G(IS_k).

Proof: We give an explicit construction to show the separations between Ĝ(CC_k) and Ĝ(IS_k), and between G(CC_k) and G(IS_k). For any k > 1, the construction starts with two cycles of length 2k + 1; we then connect every vertex in the first cycle to every vertex in the second cycle. See Fig. 2.6 for the case k = 2. By symmetry, the induced subgraph G[N(v)] for

any v is an odd cycle C_{2k+1} plus two independent vertices, each connected to every vertex in the cycle. It is not difficult to see that the size of an MIS of G[N(v)] is k, while the size of an MCC of G[N(v)] is k + 1. Therefore, the graph is in Ĝ(IS_k) and G(IS_k) but is in neither Ĝ(CC_k) nor G(CC_k).

Figure 2.6: A graph in Ĝ(IS_2) and G(IS_2) but not in Ĝ(CC_2) or G(CC_2)

This gives us four families of graphs having rich and interesting properties. Note that G(CC_1) = G(IS_1) is exactly the class of chordal graphs, using the characterization in terms of admitting a perfect elimination ordering. In the following section, we give natural examples of graphs contained in these families.

2.2 Natural Subclasses of the Four Families

The four families we have defined in the previous section, Ĝ(CC_k), Ĝ(IS_k), G(CC_k) and G(IS_k), have many interesting subclasses. In this section, we give some natural examples of graphs in these families with small constant parameters.

Graphs Induced by the Job Interval Selection Problem

In the job interval selection problem, we are given a set of jobs. Each job is a set of half-open intervals on the real line. To schedule a job, exactly one of these intervals must be selected.

To schedule several jobs, the intervals selected for the jobs must not overlap. The objective is to schedule as many jobs as possible under these constraints. We can view the job interval selection problem as an MIS problem on a special input graph as follows: the vertices of the graph are the intervals, and two vertices are adjacent if and only if the corresponding intervals overlap or belong to the same job. Fig. 2.7a shows an input instance of the job interval selection problem. Each job is denoted by a particular colour; the set of intervals associated with that job is coloured using that colour. The intervals are labeled by non-decreasing finish-time. Fig. 2.7b shows the corresponding graph of the input instance in Fig. 2.7a. Graphs induced by the job interval selection problem are also known as strip graphs [42]. They are a special type of 2-interval graph.

Figure 2.7: An example of the job interval selection problem; (a) an input instance, (b) the corresponding input graph

Observation Any graph induced by the job interval selection problem is in G(CC_2).

Proof: We order the intervals by non-decreasing finish-time and examine this ordering with respect to the input graph. For ease of explanation, we do not distinguish between an interval and its corresponding vertex. For any particular vertex v, we can partition the neighbours of v appearing later in the ordering into two groups: those containing the finish-time of v, and those that belong to the same job as v. If a neighbour of v satisfies both conditions, it does not matter into which group we classify it. Observe that within each group, any pair of vertices is adjacent. In other words, the inductive neighbourhood of v can be covered by two cliques. Therefore, any input graph of a job interval selection problem is in G(CC_2).
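The two-clique cover in the proof above can be made concrete; this sketch is illustrative (not from the thesis), and the (job, s, f) triple representation of an interval is an assumption. In finish-time order, every later neighbour of an interval either contains its finish-time or conflicts with it only through the job constraint.

```python
def inductive_two_clique_cover(intervals):
    """intervals: list of (job, s, f) triples, each a half-open interval (s, f].
    For each index i, split the later neighbours of interval i (in
    non-decreasing finish-time order) into the two clique groups used in
    the proof. Returns {i: (contains_finish, same_job)}."""
    order = sorted(range(len(intervals)), key=lambda i: intervals[i][2])
    groups = {}
    for pos, i in enumerate(order):
        job_i, s_i, f_i = intervals[i]
        contains_finish, same_job = [], []
        for j in order[pos + 1:]:
            job_j, s_j, f_j = intervals[j]
            if s_j < f_i:                 # j overlaps i, hence contains f_i
                contains_finish.append(j)
            elif job_j == job_i:          # j conflicts with i only via the job
                same_job.append(j)
        groups[i] = (contains_finish, same_job)
    return groups
```

All intervals in the first group contain the point f_i and therefore pairwise overlap; all intervals in the second group share a job and are therefore pairwise adjacent in the input graph.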

Planar Graphs

Planar graphs are well-studied objects in the literature, not only because of their numerous applications but also due to the existence of many interesting properties. In this section, we present a nice property of planar graphs in terms of their inductive neighbourhoods.

Theorem Any planar graph is in G(CC_3).

Proof: Proof by contradiction. Suppose there are planar graphs which are not in G(CC_3), and let G = (V, E) be a minimum size counter-example, so for any vertex v in G, the size of an MCC of G[N(v)] is greater than three; this forbids G from having vertices of degree three or less. We examine a planar embedding of G; in the sequel, we do not distinguish G from this planar embedding.

A vertex-edge pair is a pair (v, e) such that edge e is incident to v. We count the numbers of faces, edges and vertices by first charging them to vertex-edge pairs, and then summing up the charges. Let v be a vertex and d(v) the degree of v. For an edge e incident to v, the vertex-charge from v to (v, e) is 1/d(v), and the edge-charge from e to (v, e) is 1/2. For a face f, the number of boundary edges of f is called the degree of the face and denoted d(f). A pair (v, e) is incident to a face if e is a boundary edge of that face. The face-charge from f to an incident pair (v, e) is 1/(2d(f)). Note that these charges are carefully chosen so that summing the vertex-charges over all vertex-edge pairs gives the number of vertices, summing the edge-charges gives the number of edges, and summing the face-charges gives the number of faces. We provide an upper bound on the number of faces and derive a contradiction using the Euler characteristic. Depending on the degree of the vertex v of a vertex-edge pair (v, e), we have three cases:

1. The degree of v is four. Let A be the set of such vertices. Since the size of an MCC of G[N(v)] is greater than three, no two neighbours of v in G can be adjacent. Therefore, for each edge incident to v, the face to its left has degree at least four and so does the face to its right (note that they might be the same face). The face-charge received by such a vertex-edge pair is thus at most 2 · 1/8 = 1/4, and summing the face-charges of the four vertex-edge pairs containing v gives at most one.

2. The degree of v is five. Let B be the set of such vertices. Since the size of an MCC of G[N(v)] is greater than three, there are at most two triangular faces incident to v in G. We break this down further into three cases:

(a) If there is no triangular face incident to v, we denote the set of such vertices by B_1; see Fig. 2.8. As above, the face-charge received by each such vertex-edge pair is at most 1/4, so summing the face-charges of the five vertex-edge pairs containing v gives at most 5/4.

Figure 2.8: No triangular face

(b) If there is exactly one triangular face incident to v, we denote the set of such vertices by B_2; see Fig. 2.9. Observe that each of the two vertex-edge pairs involved in the triangular face receives a face-charge of at most 1/6 + 1/8 = 7/24. Summing the face-charges of the five vertex-edge pairs containing v gives at most 2 · 7/24 + 3 · 1/4 = 4/3.

Figure 2.9: One triangular face

(c) If there are exactly two triangular faces incident to v, we denote the set of such vertices by B_3; see Fig. 2.10. Note that these two triangular faces must be adjacent, for otherwise the size of an MCC of G[N(v)] is no more than three. Using a similar argument (the shared pair receives at most 2 · 1/6 = 1/3, the two outer triangle pairs at most 7/24 each, and the remaining two pairs at most 1/4 each), summing the face-charges of the five vertex-edge pairs containing v gives at most 1/3 + 2 · 7/24 + 2 · 1/4 = 17/12.

Figure 2.10: Two adjacent triangular faces

3. The degree of v is more than five. Let C be the set of such vertices. The face-charge received by each such vertex-edge pair is at most 1/3, so summing the face-charges of the vertex-edge pairs containing v gives at most d(v)/3.

Let F denote the number of faces. We count the total numbers of vertices and edges, and bound the total number of faces from above, by summing the vertex-charges, edge-charges and face-charges, respectively, over all vertex-edge pairs. The total number of vertices is

V = |A| + |B| + |C|.

The total number of edges is

E = (1/2)[4|A| + 5|B| + Σ_{v∈C} d(v)] = 2|A| + (5/2)|B| + (1/2) Σ_{v∈C} d(v).

For the total number of faces, we have

F ≤ |A| + (5/4)|B_1| + (4/3)|B_2| + (17/12)|B_3| + (1/3) Σ_{v∈C} d(v).

Therefore, bounding the Euler characteristic,

V − E + F ≤ −(1/4)|B_1| − (1/6)|B_2| − (1/12)|B_3| + |C| − (1/6) Σ_{v∈C} d(v) ≤ |C| − (1/6) Σ_{v∈C} d(v).

Since d(v) ≥ 6 for every v ∈ C, we have

V − E + F ≤ |C| − (1/6) Σ_{v∈C} d(v) ≤ 0.

This contradicts the fact that G is planar, since every planar graph has Euler characteristic 2. Therefore any planar graph is in G(CC_3).

Disk and Unit Disk Graphs, Intersection Graphs of Convex Shapes

Geometric intersection graphs play an important role in many applications: for example, interval graphs in scheduling, disk graphs and unit disk graphs in wireless communication, and intersecting rectangles in layout problems and bioinformatics. Due to geometric constraints, these graphs have many interesting properties. In this subsection, we study the relationship between disk graphs, unit disk graphs, translates of a convex shape, and the four families we defined in the previous section: Ĝ(CC_k), Ĝ(IS_k), G(CC_k) and G(IS_k).

We restrict our attention to the two-dimensional plane; each geometric object is a closed set in R^2. We define classes of geometric intersection graphs as follows. We are given a set of geometric objects in the plane; these are the vertices of the intersection graph. Two vertices are adjacent if and only if the two objects overlap, i.e., have a non-empty intersection. We first consider disk graphs and unit disk graphs, where the objects are (respectively) disks of arbitrary radius and disks of fixed radius. Figure 2.11 shows a set of disks in the plane and the corresponding disk graph. There are two important geometric properties of disks.

Figure 2.11: An example of a disk graph; (a) a set of disks on the plane, (b) the corresponding disk graph

Observation Given a set of disks in the plane and a particular disk s, the set of disks overlapping s and no smaller than s can be partitioned into six (possibly empty) subsets such that, within each subset, any two disks overlap.

Proof: For a given disk s, we construct the partition into six subsets explicitly. Take the centre of s as the origin and partition the plane into six regions, each with an angle of 60°; see Fig. 2.12. To be precise, each region includes its left boundary and excludes its right boundary. For any two disks no smaller than s, if they overlap with s and their centres lie in the same region, then they must overlap with each other due to geometric constraints. In fact, this is true even if their centres lie on adjacent boundaries. Therefore, we can partition the set of disks overlapping s and no smaller than s into six subsets based on which region the centre of the overlapping disk lies in.

Figure 2.12: Partition of the plane into six regions
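The six-region partition can be sketched as follows (an illustrative sketch, not from the thesis): each overlapping disk's centre is assigned to one of six 60° sectors around the centre of s, with each sector closed on its lower boundary angle. Representing centres as (x, y) pairs is an assumption of this sketch.

```python
import math

def sector_index(centre_s, centre_d):
    """Index in {0, ..., 5} of the 60-degree sector around centre_s that
    contains centre_d. Sector i covers angles [i*60, (i+1)*60) degrees."""
    dx = centre_d[0] - centre_s[0]
    dy = centre_d[1] - centre_s[1]
    angle = math.atan2(dy, dx) % (2 * math.pi)   # angle in [0, 2*pi)
    return int(angle // (math.pi / 3)) % 6       # % 6 guards float edge cases
```

Grouping the overlapping disks no smaller than s by sector_index yields the six subsets of the observation, and hence the six-clique cover used later for disk graphs.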

Observation Given a set of disks in the plane and a particular disk s, there are at most five disks overlapping s and no smaller than s such that no two of them overlap.

Proof: We prove this by contradiction. Suppose there are six pairwise non-overlapping disks, each overlapping s and no smaller than s. By the pigeonhole principle, there are at least two of these disks whose angle with respect to the centre of s is less than or equal to 60°. Therefore, these two disks must overlap, which is a contradiction.

Theorem Any disk graph is in G(IS_5) and G(CC_6).

Proof: This is immediate from the two observations above if we order the set of disks by non-decreasing size.

Corollary Any unit disk graph is in Ĝ(IS_5) and Ĝ(CC_6).

The properties of disk graphs and unit disk graphs inspire us to study more general geometric shapes: convex shapes. Here we focus on intersection graphs of uniform and non-uniform translates of a convex shape. For a given shape, a uniform translate is the same shape with the same size and orientation but a possibly different location. A non-uniform translate is the same shape with the same orientation but a possibly different size and location. It is clear that disk graphs are examples of intersection graphs of non-uniform translates, and unit disk graphs are examples of intersection graphs of uniform translates.

In [59], Kim, Kostochka and Nakprasit proved that an intersection graph of uniform translates of a convex shape with clique number k is (3k − 3)-degenerate. Although the statement of that result is not immediately applicable, the proof shows that the neighbourhood of the topmost object can be covered by three cliques. This leads to the following theorem.

Theorem [59] Any intersection graph of uniform translates of a convex shape is in G(CC_3).

A weaker version of Theorem 2.2.7, proved independently using a geometric argument, can be found in [89]. Similar to Observations 2.2.3 and 2.2.4, we can extend the disk graph results above to the following theorems.

Theorem [89] Any intersection graph of uniform translates of a convex shape is in Ĝ(IS_5) and Ĝ(CC_6).

Theorem [89] Any intersection graph of non-uniform translates of a convex shape is in G(IS_5) and G(CC_6).

More Subclasses

There are three more subclasses we want to mention: d-claw-free graphs, k-degenerate graphs, and circular-arc graphs.

Definition A graph is d-claw-free if every vertex has fewer than d independent neighbours.

Note that the class of d-claw-free graphs is exactly the class Ĝ(IS_{d−1}). There are many interesting subclasses of d-claw-free graphs; we give two examples.

1. Line graphs: Given a graph G, its line graph L(G) is a graph such that each vertex of L(G) is an edge of G, and two vertices of L(G) are adjacent if and only if their corresponding edges share a vertex in G. Observe that line graphs are in Ĝ(CC_2): any edge has two end vertices, and any other edge incident to it must share one of those end vertices.

2. Intersection graphs of k-sets: Given a universe U of n elements and m subsets of U, each with at most k elements, the intersection graph is the graph whose vertices are the subsets, with two vertices adjacent if and only if the two subsets have a non-empty intersection. Observe that intersection graphs of k-sets are in Ĝ(CC_k).

Definition A graph is k-degenerate if every subgraph has a vertex of degree at most k.

A very general subclass of k-degenerate graphs is the class of graphs with treewidth at most k. We give the definition of graphs with treewidth at most k in terms of k-trees. A k-tree can be formed by starting with a clique of size k and then repeatedly adding vertices in such a way that each added vertex has exactly k neighbours which form a clique. The graphs that have treewidth at most k are exactly the subgraphs of k-trees, and for this reason they are also called partial k-trees. Graphs with treewidth at most k are quite general and useful; the class includes rich subclasses even for small values of k. For example, series-parallel graphs and outerplanar graphs have treewidth at most 2. Many graph problems (to be more precise, all problems definable in monadic second-order logic) can be solved in polynomial time using dynamic programming on graphs with bounded treewidth [21].

Definition A graph is a circular-arc graph if it is the intersection graph of arcs of a circle.

Given a set of arcs of a circle, the vertices of the circular-arc graph are these arcs, and two vertices are adjacent if and only if the corresponding arcs overlap; see Fig. 2.13 for an example.

Figure 2.13: An example of a circular-arc graph; (a) a set of arcs of a circle, (b) the corresponding circular-arc graph

Given a circular-arc graph, the length of an arc is the corresponding angle if we connect both end-points to the centre of the circle. Consider the arc c with the smallest arc length. All arcs intersecting c contain either the left end-point of c or the right end-point of c. Therefore, circular-arc graphs are in the class G(CC_2).

We have seen so far several natural examples of graphs in the families Ĝ(CC_k), Ĝ(IS_k), G(CC_k) and G(IS_k) for small values of k. In the next section, we study properties of the graph classes G(IS_k) and G(CC_k) when k is a small constant.

2.3 Properties of G(IS_k) and G(CC_k)

First, we show that any graph in G(IS_k) can be recognized in polynomial time using the RECURSIVE_TEST algorithm.

Corollary A k-independence ordering of a graph G in G(IS_k) with n vertices can be constructed in O(k^2 n^{k+3}) time and linear space.

Proof: The property MIS ≤ k is closed on induced subgraphs and can be tested in O(k^2 n^{k+1}) time. By the O(n^2 p(n)) recognition result above, a graph in G(IS_k) can be recognized in O(k^2 n^{k+3}) time and space linear in the size of the graph.

By an observation of Itai and Rodeh [52] and results in [28], this running time can be improved for k = 2, 3 and 4, respectively. If we allow the algorithm to use O(n^{k+2}) space, we can further improve the running time of the recognition algorithm.

Proposition A k-independence ordering of a graph G in G(IS_k) with n vertices can be constructed in O(k^2 n^{k+2}) time and O(n^{k+2}) space.

Proof: Given a graph G, we build a bipartite graph H = (A, B) in the following way. We construct a vertex-node (a node in A) for each vertex of G and a subset-node (a node in B) for each subset of k + 1 vertices of G. We connect a vertex-node to a subset-node with a red edge if the vertex of the vertex-node is adjacent to all vertices in the subset-node and these vertices form an independent set. We connect a vertex-node to a subset-node with a black

edge if the vertex of the vertex-node is one of the vertices in the subset-node. See Fig. 2.14 for an example of the construction for k = 1.

Figure 2.14: An example of the construction for k = 1; (a) a graph G, (b) the corresponding graph H

Observe that constructing such a graph H takes O(k^2 n^{k+2}) time and O(n^{k+2}) space. Once H is constructed, we construct an ordering of the vertices of G in the following way. At each step, we look for a vertex-node in A that is not incident to any red edge. The vertex of such a vertex-node is then the next vertex added to the ordering. We then delete this vertex-node from A and all its neighbours in B, together with all incident (black and red) edges, and continue. If finally there are no vertices remaining in A, then we have constructed a k-independence ordering. Otherwise, at some step every vertex-node in A is incident to at least one red edge, and we can conclude that G is not an inductive k-independent graph.

It is known [26] that MIS is W[1]-complete for general graphs when parameterized by the size of the maximum independent set. By a reduction from MIS, finding the inductive independence number of a graph is also complete for W[1]; hence it is unlikely that there is a fixed-parameter tractable algorithm for recognizing a graph in G(IS_k) for general k. But this does not exclude the possibility of improving the current time complexity for small fixed parameters, for example, when k = 2 or 3. It is interesting to note that recognizing a chordal graph, i.e., a graph in G(IS_1), can be done in linear time (in the number of edges), while the generic algorithm above runs in O(n^3) time. It seems quite possible to improve the running

time of the generic recognition algorithm; we leave this as an open question.

We are primarily interested in graphs with a small inductive independence number. Graphs with a large inductive independence number are not interesting to us, as they cannot be recognized efficiently and do not provide good approximation bounds; we discuss the latter in Section 2.4. In many specific cases, not only do we know a priori that a graph has a small inductive independence number, as for the subclasses discussed in Section 2.2, but a k-independence ordering with the desired k can also be computed much more efficiently than the time complexity bound provided by the proposition above. We give several observations below, which all follow immediately from the specific orderings discussed in Section 2.2.

Observation [89] A 2-independence ordering can be computed in O(n log n) time for any input graph of the job interval selection problem with n intervals.

Observation [89] A 3-independence ordering can be computed in O(n log n) time for any intersection graph of n uniform translates of a convex shape.

Observation [89] A 5-independence ordering can be computed in O(n log n) time for any intersection graph of n non-uniform translates of a convex shape.

Next, we bound the inductive independence number of a graph in terms of its numbers of vertices and edges.

Theorem A graph G with n vertices and m edges has inductive independence number no more than min{ n/2, √m, (1 + √(4[(n choose 2) − m] + 1))/2 }.

Proof: Let λ(G) be the inductive independence number of G. We can then find an induced subgraph H of G in which every vertex has at least λ(G) independent neighbours. Let v be a vertex in H and u one of its λ(G) independent neighbours. Note that u must again have at least λ(G) independent neighbours. Furthermore, since u is not adjacent to any of the

other λ(G) − 1 independent neighbours of v, the two sets of λ(G) independent neighbours of u and of v have to be disjoint; see Fig. 2.15.

Figure 2.15: A vertex v in H and one of its independent neighbours u

Therefore, the total number of vertices is at least 2λ(G), the total number of edges is at least λ(G)^2, and the total number of missing edges is at least 2 · (λ(G) choose 2) = λ(G)(λ(G) − 1). Therefore, a graph G with n vertices and m edges has inductive independence number no more than min{ n/2, √m, (1 + √(4[(n choose 2) − m] + 1))/2 }.

We now consider the class G(CC_k). Unlike the property MIS ≤ k, the property MCC ≤ k is NP-hard to test for k > 2, so the recognition theorem above no longer applies. However, the property MCC ≤ 2 can be tested in linear time for general graphs (a graph satisfies MCC ≤ 2 if and only if its complement is bipartite), as testing bipartiteness can be done in linear time by a greedy algorithm. Hence, the graph class G(CC_2) can be recognized in polynomial time. By the RECURSIVE_TEST algorithm, the following corollary is immediate.

Corollary For any graph G in G(CC_2) with n vertices and m edges, an elimination ordering with respect to the property MCC ≤ 2 can be constructed in O(mn^2) time and linear space.

2.4 Greedy Algorithms for G(IS_k) and G(CC_k)

In this section, we focus on algorithmic aspects of the two families G(IS_k) and G(CC_k). We show that for several classic NP-hard problems, good approximation algorithms can be developed for graphs in these two classes; furthermore, all these algorithms are greedy-like

algorithms. For simplicity, other than Subsection 2.4.4, we focus mainly on the unweighted case of these problems. Note that the weighted case of maximum independent set requires an extension of the stack algorithm [11]. Since G(CC_k) ⊆ G(IS_k), we discuss algorithms for the class G(IS_k) whenever possible, since a result for G(IS_k) implies the same result for G(CC_k) but not vice versa. A discussion of the graph class G(CC_2) is given in Subsection 2.4.5, as by Corollary 2.3.7 graphs in G(CC_2) can be recognized in polynomial time. For most algorithms, we are more concerned with their approximation ratios than with their precise time complexities. The running time of an algorithm for the graph class G(IS_k) is usually dominated by the time for constructing a k-independence ordering. Nevertheless, for a fixed constant k, this running time is polynomial.

2.4.1 Maximum Independent Set

For general graphs, MIS is NP-hard, and even NP-hard to approximate within a factor of n^(1−ε) for any constant ε > 0. However, for chordal graphs, MIS can be solved by a greedy algorithm in polynomial time. We extend this result and show that a k-approximation for MIS can be achieved on G(IS_k).

A GREEDY ALGORITHM FOR MIS ON G(IS_k)
1: Sort all vertices according to a k-independence ordering
2: for i = 1,...,n do
3: Select the i-th vertex if it is not adjacent to anything selected before
4: end for

Theorem [2, 89] The above greedy algorithm achieves a k-approximation for MIS on G(IS_k).

Proof: We prove this using a charging argument. Let π be a k-independence ordering. Let O be the optimal solution and A be the greedy solution. We order the vertices in O and A according to the k-independence ordering used by the algorithm; let v_1, v_2,..., v_p and u_1, u_2,..., u_q be the induced orderings of the vertices in O and in A respectively according to π. We define the following mapping from O to A: a vertex v_i in O is mapped to the same vertex in A if v_i exists in A, or else to the first vertex in A that is adjacent to v_i; see Fig. 2.16 for an example.

Figure 2.16: A mapping from O to A

We observe the following properties of this mapping. First of all, every vertex in O maps to some vertex in A; for otherwise, A would include that vertex by the greedy selection rule. Furthermore, no vertex in O maps to a vertex in A that appears later in the k-independence ordering; for otherwise, A would again include that vertex by the greedy selection rule. Since the set of vertices that map to a particular vertex in A has to form an independent set, by the definition of the k-independence ordering, there are at most k vertices in O that map to the same vertex in A. Hence we can conclude that the size of O is at most k times the size of A. Therefore, the above algorithm is a k-approximation for MIS on G(IS_k).

For the weighted case, a local ratio algorithm [2] can achieve the same approximation ratio of k for graphs in G(IS_k). We state the theorem below without proof. This local ratio algorithm can be viewed as a two-pass greedy-like algorithm. A more general problem will be discussed in detail in Subsection 2.4.4, and we will obtain a result that implies Theorem .

Theorem [2] There is a k-approximation local ratio algorithm for WMIS on G(IS_k).
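As a concrete illustration of the unweighted greedy algorithm above, the selection rule fits in a few lines of Python. The adjacency-set representation and the example graph are hypothetical inputs; a k-independence ordering is assumed to be supplied by the caller.

```python
# A minimal sketch of the greedy MIS algorithm above.  Hypothetical input
# format: `adj` maps each vertex to its neighbour set, and `order` is
# assumed to be a k-independence ordering of the vertices.

def greedy_mis(order, adj):
    selected = set()
    for v in order:
        # keep v iff no previously selected vertex is adjacent to it
        if not (adj[v] & selected):
            selected.add(v)
    return selected

# Example: the path a-b-c-d is chordal, and a,b,c,d is a perfect
# elimination ordering (a 1-independence ordering), so greedy is optimal.
adj = {'a': {'b'}, 'b': {'a', 'c'}, 'c': {'b', 'd'}, 'd': {'c'}}
assert greedy_mis(['a', 'b', 'c', 'd'], adj) == {'a', 'c'}
```

For k = 1 the theorem above says this is exact; for larger k the same loop, run on a k-independence ordering, loses at most a factor of k.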

2.4.2 Minimum Vertex Colouring

The minimum vertex colouring problem is a well-studied NP-hard problem. For a graph with n vertices, it is NP-hard to approximate the chromatic number within n^(1−ε) for any fixed ε > 0. For chordal graphs, a greedy algorithm on the reverse of any perfect elimination ordering gives an optimal colouring. For graphs in G(IS_k), the same greedy algorithm achieves a k-approximation.

A GREEDY ALGORITHM FOR COL ON G(IS_k)
1: Sort all vertices according to the reverse of a k-independence ordering
2: for i = 1,...,n do
3: Colour the i-th vertex with the first available colour not used by any of its neighbours
4: end for

Theorem The above greedy algorithm achieves a k-approximation for COL on G(IS_k).

Proof: Let v_1, v_2,..., v_n be a k-independence ordering, so the algorithm colours the vertices in the order v_n, v_{n−1},..., v_1. Let V_i = {v_i,..., v_n}; we prove by induction that the algorithm achieves a k-approximation for G[V_i] for all i from n down to 1. The base case is clear, since when i = n, G[V_n] is just a single vertex. Now assume the statement holds for all i > t, i.e., the number of colours c_i used by the algorithm for G[V_i] satisfies c_i ≤ k·χ(G[V_i]). Now consider i = t. There are three cases:

1. If c_t = c_{t+1}, then the statement holds trivially, since c_t = c_{t+1} ≤ k·χ(G[V_{t+1}]) ≤ k·χ(G[V_t]).

2. If χ(G[V_t]) = χ(G[V_{t+1}]) + 1, then the statement also holds trivially, since c_t ≤ c_{t+1} + 1 ≤ k·χ(G[V_{t+1}]) + 1 ≤ k(χ(G[V_{t+1}]) + 1) = k·χ(G[V_t]).

3. The only remaining case is when c_t = c_{t+1} + 1 and χ(G[V_t]) = χ(G[V_{t+1}]). Suppose c_t > k·χ(G[V_t]). Since the algorithm has to introduce a new colour, there exist c_{t+1} neighbours of v_t, each having a different colour. These c_{t+1} neighbours together with v_t must be grouped into χ(G[V_t]) colour classes in the optimal colouring. Therefore at least one colour class in the optimal colouring has at least (c_{t+1} + 1)/χ(G[V_t]) vertices from the set N(v_t) ∩ V_t. Since (c_{t+1} + 1)/χ(G[V_t]) = c_t/χ(G[V_t]) > k, we have one colour class containing more than k vertices from N(v_t) ∩ V_t. This contradicts the fact that v_1, v_2,..., v_n is a k-independence ordering.

This completes the induction; therefore the algorithm achieves a k-approximation for COL on G(IS_k).

2.4.3 Minimum Vertex Cover

The minimum vertex cover problem is one of the most celebrated problems in the area of approximation algorithms, because there exist several simple 2-approximation algorithms, yet for general graphs no known algorithm can achieve an approximation ratio better than 2 − ε for any fixed ε > 0.⁴ The problem is NP-hard, and NP-hard to approximate within a factor of 1.36 [24]. In this subsection, we discuss approximation algorithms for MVC on G(IS_k). A graph is triangle-free if no three of its vertices form a triangle. We first discuss graphs in G(IS_k) that are triangle-free.

MVC on Triangle-Free G(IS_k)

For a given vertex v, let N_2(v) denote the set of vertices at distance two from v. We first prove the following lemma.

⁴ In fact, a ratio of 2 − ε for any fixed ε > 0 is not possible assuming the unique games conjecture; see [58].

Lemma Given a triangle-free graph G in G(IS_k), let v be a vertex of minimum degree. Then there is a matching of size d(v) − 1 between N(v) and N_2(v).

Proof: We colour the vertices in N(v) red (big) and the vertices in N_2(v) blue (small). Let M be a maximum matching between red and blue vertices; i.e., every edge in M has one red end vertex and one blue end vertex, and each vertex in N(v) ∪ N_2(v) occurs at most once in M. Let R_1 be the set of red vertices that participate in the matching, and R_2 be the set of remaining red vertices. Let B_1 be the set of blue vertices that participate in the matching, and B_2 be the set of remaining blue vertices; see Fig. 2.17.

Figure 2.17: A maximum matching between N(v) and N_2(v)

Note that no edge connects a vertex in R_2 to a vertex in B_2. Furthermore, since G is triangle-free, no edge connects any two vertices in N(v). Suppose that |M| < d(v) − 1; then R_2 is non-empty. For any vertex u ∈ R_2, its neighbours are contained in the set B_1 ∪ {v}, therefore d(u) ≤ |M| + 1 < d(v); this contradicts the fact that v is a vertex of minimum degree. Therefore, |M| ≥ d(v) − 1, and hence there is a matching of size d(v) − 1 between N(v) and N_2(v).

We now consider the following greedy algorithm for a triangle-free graph G in G(IS_k).

A GREEDY-LIKE ALGORITHM FOR MVC ON TRIANGLE-FREE G(IS_k)

1: C = ∅
2: while G is not empty do
3: Pick a vertex v with minimum degree
4: Let M be a matching of size d(v) − 1 between N(v) and N_2(v)
5: Let u be a vertex in N(v) that is not in the matching M
6: Add u and the vertices in M to C; remove them and their incident edges from G
7: Remove all isolated vertices from G
8: end while
9: Return C

Theorem 2.4.5 The above greedy-like algorithm achieves a (2 − 1/k)-approximation for MVC on triangle-free graphs in G(IS_k).

Proof: At each step of the algorithm, let M′ = M ∪ {uv} and let S be the set of vertices added to the cover C. Observe that S covers all edges in M′ and has size 2d(v) − 1. Let S* be a maximum-size subset of the vertices of M′ that covers M′ in an optimal solution. Note that |S*| ≥ d(v); furthermore, the set of edges in G covered by S* is a subset of the edges in G covered by S. Since G is in G(IS_k) and triangle-free, we have d(v) ≤ k. Therefore, the approximation ratio is at most (2d(v) − 1)/d(v) ≤ (2k − 1)/k = 2 − 1/k.

We can further improve the ratio in Theorem 2.4.5 to 2 − 2/(k+1) using the following result of Hochbaum [48], which is based on Nemhauser and Trotter's decomposition scheme [72].

Theorem 2.4.6 [48] Let G be a weighted graph with n vertices and m edges. If it takes only s steps to colour the vertices of G with c colours, then it takes only s + O(nm log n) steps to find a vertex cover whose weight is at most 2 − 2/c times the weight of an optimal vertex cover.

In order to use Theorem 2.4.6, we first prove the following two lemmas.

Lemma 2.4.7 If a graph in G(IS_k) with n vertices is triangle-free, then a k-independence ordering can be constructed in O(kn log n) time.

Proof: We store the vertices of the graph in a priority queue supporting updates, using their degrees as priorities. Since the graph is triangle-free, for any vertex v, N(v) is an independent set. By Theorem , a k-independence ordering can be constructed by repeatedly dequeuing a vertex of minimum degree and updating the degrees of its neighbours in the priority queue. As there are at most k neighbours at each step, these updates take at most O(k log n) time. Therefore, a k-independence ordering can be constructed in O(kn log n) time.

Lemma 2.4.8 If a graph in G(IS_k) is triangle-free, then a simple greedy algorithm can provide a valid colouring of its vertices using at most k + 1 colours.

Proof: By Lemma 2.4.7, for a triangle-free graph in G(IS_k), a k-independence ordering can be constructed efficiently. Suppose we colour the vertices of the graph according to the reverse of this ordering. Since the graph is triangle-free, whenever we colour a vertex v, at most k neighbours of v are already coloured. Therefore, we use at most k + 1 colours.

By Theorem 2.4.6, Lemma 2.4.7 and Lemma 2.4.8, the following theorem is immediate.

Theorem 2.4.9 There is an O(mn log n) time algorithm that achieves a (2 − 2/(k+1))-approximation for WMVC on triangle-free G(IS_k).

Although Theorem 2.4.9 has a better approximation ratio than Theorem 2.4.5, its running time is slightly less efficient, and the greedy algorithm of Theorem 2.4.5 is a much simpler combinatorial algorithm. Note that Halperin's algorithm [45] can achieve a (2 − (1 − o(1))·(2 ln ln k)/(ln k))-approximation for WMVC on triangle-free G(IS_k), but it uses a more complicated semidefinite programming (SDP) relaxation of vertex cover.
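The greedy-like algorithm of Theorem 2.4.5 can be sketched as follows. The adjacency-set input format is a hypothetical choice, and the matching of the lemma is found by a plain augmenting-path routine and then truncated to size d(v) − 1, which the lemma guarantees exists for triangle-free graphs in G(IS_k).

```python
# A sketch of the greedy-like MVC algorithm for triangle-free graphs.
# Hypothetical input format: `graph` maps each vertex to its neighbour set.

def _augmenting_matching(left, right, adj):
    """Maximum matching between `left` and `right` via augmenting paths."""
    match = {}                                 # right vertex -> left vertex
    def augment(u, seen):
        for x in adj[u] & right:
            if x in seen:
                continue
            seen.add(x)
            if x not in match or augment(match[x], seen):
                match[x] = u
                return True
        return False
    for u in left:
        augment(u, set())
    return [(u, x) for x, u in match.items()]

def greedy_mvc_triangle_free(graph):
    adj = {v: set(ns) for v, ns in graph.items()}
    cover = set()
    live = {v for v in adj if adj[v]}
    while live:
        v = min(live, key=lambda x: len(adj[x]))      # minimum degree
        nbrs = set(adj[v])
        n2 = set().union(*(adj[u] for u in nbrs)) - nbrs - {v}
        m = _augmenting_matching(nbrs, n2, adj)[:len(nbrs) - 1]
        matched = {x for e in m for x in e}
        u = next(iter(nbrs - matched))                # unmatched neighbour
        chosen = matched | {u}
        cover |= chosen
        for x in chosen:                              # delete chosen vertices
            for y in adj[x]:
                adj[y].discard(x)
            adj[x] = set()
        live = {x for x in live if adj[x]}            # drop isolated vertices
    return cover

# The 5-cycle is triangle-free; every round removes 2d(v) - 1 = 3 vertices,
# and the returned cover of size 3 is optimal here.
c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
cover = greedy_mvc_triangle_free(c5)
assert len(cover) == 3
```

This sketch favours clarity over the stated running time; in particular the matching routine is quadratic rather than the near-linear bookkeeping an efficient implementation would use.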

MVC on G(IS_k)

We now discuss approximating MVC for all graphs in G(IS_k), i.e., without the triangle-free assumption. Note that for a given graph G = (V,E), if S is an MIS of G then V \ S is an MVC of G. For k = 1, the graph class G(IS_1) is exactly the class of chordal graphs, and MVC can be solved optimally in polynomial time. For the remainder of this subsection, we assume k > 1. Note that if a graph contains a triangle, adding all three vertices of the triangle to the cover introduces at most one extra vertex relative to the optimal cover in terms of covering the three edges of the triangle. That means that if the approximation ratio we are aiming for is at least 3/2, then we can remove a triangle, add its vertices to the cover and reduce to a smaller problem without sacrificing the approximation ratio of the algorithm. This leads to the following meta-algorithm for graphs in G(IS_k).

A META-ALGORITHM FOR MVC ON G(IS_k)
1: C = ∅
2: Remove all triangles from G and add their vertices to C
3: Let C′ be the cover returned by running on G an approximation algorithm for MVC on triangle-free G(IS_k)
4: Return C ∪ C′

Note that removing all triangles can be done in matrix multiplication time O(n^ω), or in O(mn) time for sparse graphs. Combining this fact and the above meta-algorithm with Theorem 2.4.5 and Theorem 2.4.9, we have the following two theorems.

Theorem For k > 1, there is a polynomial time algorithm that achieves a (2 − 1/k)-approximation for MVC on G(IS_k). The algorithm runs in O(n^ω) time, or in O(mn) time for sparse graphs.

Theorem For k > 2, there is an algorithm that runs in O(mn log n) time and achieves a (2 − 2/(k+1))-approximation for MVC on G(IS_k).
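The triangle-removal step of the meta-algorithm can be sketched as below. This brute-force version scans neighbourhood pairs rather than using matrix multiplication, and the adjacency-set input format is a hypothetical choice.

```python
from itertools import combinations

def remove_triangles(graph):
    """Step 2 of the meta-algorithm: repeatedly find a triangle, add its
    three vertices to the cover, and delete them from the graph."""
    adj = {v: set(ns) for v, ns in graph.items()}
    cover = set()
    restart = True
    while restart:
        restart = False
        for v in list(adj):
            tri = next(({v, a, b} for a, b in combinations(adj[v], 2)
                        if b in adj[a]), None)
            if tri:
                cover |= tri
                for x in tri:                 # detach the triangle's vertices
                    for y in adj[x] - tri:
                        adj[y].discard(x)
                for x in tri:
                    del adj[x]
                restart = True
                break
    return cover, adj

# K3 on {0,1,2} plus a pendant edge 2-3: the triangle goes into the cover
# and the residual graph is triangle-free (here, edgeless).
g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
cover, residual = remove_triangles(g)
assert cover == {0, 1, 2}
```

The residual graph returned alongside the partial cover is exactly the triangle-free instance handed to the algorithm of line 3.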

For the weighted case, we can use a trick similar to the one used in [15] and obtain the following result.

Theorem For k > 2, there is a polynomial time algorithm that achieves a (2 − 2/(k+1))-approximation for WMVC on G(IS_k).

Proof: This follows from the local ratio vertex cover algorithm of Bar-Yehuda and Even [8], as we now explain. We perform the following triangle weight decomposition. Consider a given graph G in G(IS_k) with weights on its vertices. If there is a triangle with positive weights on all three of its vertices, let w_min be the minimum of the three weights. Take out this triangle and label its three vertices with w_min. Reduce the weight of the three vertices in the original graph by w_min. Repeat the above until there is no triangle with positive weights on all its vertices. An example of a triangle weight decomposition is shown in Fig. 2.18.

Figure 2.18: An example of a triangle weight decomposition of a graph: (a) the original graph; (b) a triangle weight decomposition

After we have a triangle weight decomposition of a graph G, we have a resulting graph G_r together with a set of weighted triangles T_1, T_2,..., T_p. We first take the vertices having weight 0 in G_r into C_1, and remove them from G_r; the resulting graph is G′_r. It is not hard to see that G′_r is triangle-free. Furthermore, since G′_r is an induced subgraph of G, it is still in G(IS_k). We then apply Theorem 2.4.9 to get a (2 − 2/(k+1))-approximation vertex cover C_2 for G′_r. Then C = C_1 ∪ C_2 is a vertex cover with an approximation ratio of (2 − 2/(k+1)) relative to the optimal vertex cover of G. To see this, we provide two observations:

1. The set C is a valid vertex cover for G. Suppose an edge e is not covered by C. Then e cannot be incident to a vertex of weight 0 in G_r; therefore, it must be an edge in G′_r. Since C_2 is a vertex cover for G′_r, C_2 covers e, and therefore e is covered by C. This is a contradiction.

2. The total weight of C is no more than (2 − 2/(k+1)) times that of the optimal vertex cover of G. Let the weight of an optimal vertex cover of G′_r be w_0; then the weight of an optimal vertex cover of G_r is also w_0. Let C_opt be an optimal vertex cover of G. Then the weight of C_opt on G_r (taking the weights of G_r instead of G) is at least w_0. Since C_2 is a (2 − 2/(k+1))-approximation for G′_r, and the vertices in C_1 have weight 0, the weight of C on G_r is at most (2 − 2/(k+1))·w_0. Now we add back T_1, T_2,..., T_p to G_r successively, one at a time. Let w(T_i) be the weight of T_i. At each step i, the weight of C_opt on the resulting graph increases by at least (2/3)·w(T_i), since at least two vertices of T_i are in C_opt, while the weight of C on the resulting graph increases by at most w(T_i). Therefore the final ratio between the weight of C and the weight of C_opt is at most

( (2 − 2/(k+1))·w_0 + Σ_{i=1}^p w(T_i) ) / ( w_0 + (2/3)·Σ_{i=1}^p w(T_i) ).

For k > 2, this ratio is at most 2 − 2/(k+1).

Therefore, the algorithm achieves a (2 − 2/(k+1))-approximation for WMVC on G(IS_k). Similarly, Halperin's algorithm gives a (2 − (1 − o(1))·(2 ln ln k)/(ln k))-approximation for WMVC on G(IS_k).

2.4.4 Weighted Maximum c-Colourable Subgraph

The interval selection problem discussed in Section is often extended to multiple machines. For identical machines, the graph-theoretic formulation of this problem leads to a natural generalization of MIS. In this section, we discuss the weighted version of this generalization: WCOL_c.

Recall that in the weighted maximum c-colourable subgraph problem, we are given a graph G = (V,E) with n vertices and m edges, and a weight function w : V → Z⁺. The goal is to find a subset S of vertices maximizing the total weight of S such that S can be partitioned into c independent subsets. This problem is also referred to as the weighted maximum c-partite induced subgraph problem in some of the graph theory literature [1]. The problem is known to be NP-hard [88] even for chordal graphs. Chakaravarthy and Roy [17] showed that for chordal graphs, the problem admits a simple and efficient 2-approximation algorithm. We strengthen this result and extend it to the graph class G(IS_k).

Theorem For all k ≥ 1 and c ≥ 1, there is a polynomial time algorithm that achieves a (k + 1 − 1/c)-approximation for WCOL_c on G(IS_k).

We describe an algorithm that achieves the approximation ratio of Theorem . The algorithm is called a stack algorithm, as modelled in [11]. For each colour class l, we allocate a stack S_l to temporarily store candidate vertices potentially assigned to that colour class. For each vertex v, let w_l(v) denote its updated weight with respect to the stack S_l.

A STACK ALGORITHM FOR WCOL_c ON G(IS_k)
1: Sort all vertices according to a k-independence ordering
2: for i = 1,...,n do
3: Let w_l(v_i) = w(v_i) − Σ_{v_j ∈ S_l ∩ N(v_i)} w_l(v_j) for each colour class l
4: if w_l(v_i) ≤ 0 for all l = 1,...,c then
5: Reject v_i without assigning any colour
6: else
7: Let h = argmax_{l=1,...,c} w_l(v_i) and push v_i onto S_h
8: end if
9: end for
10: for l = 1,...,c do
11: while S_l is not empty do

12: Pop v out of S_l
13: if v is adjacent to any vertex with colour l then
14: Reject v without assigning any colour
15: else
16: Assign colour l to v
17: end if
18: end while
19: end for

We call lines 2 to 9 the push phase of the algorithm and lines 10 to 19 the pop phase. For simplicity, we do not distinguish between a stack and the set of vertices it contains at the end of the push phase; it should be clear from context which is being referred to. Let W_l be the total updated weight of the vertices in S_l, and let W = Σ_{l=1}^c W_l. Before proving Theorem , we give three lemmas. Let M be a c by c square matrix, and let Σ be the set of all permutations of {1,2,...,c}. For any σ ∈ Σ, let σ_i be the i-th element of the permutation.

Lemma There exists a permutation σ such that Σ_i M_{iσ_i} ≤ (1/c)·Σ_{i,j} M_{ij}.

Proof: Suppose otherwise. Then for each permutation σ we have Σ_i M_{iσ_i} > (1/c)·Σ_{i,j} M_{ij}. We sum over all σ ∈ Σ. Since in total there are c! permutations, we have Σ_{σ∈Σ} Σ_i M_{iσ_i} > c!·(1/c)·Σ_{i,j} M_{ij} = (c−1)!·Σ_{i,j} M_{ij}. Since each M_{ij} is counted exactly (c−1)! times on the left hand side, we have (c−1)!·Σ_{i,j} M_{ij} > (c−1)!·Σ_{i,j} M_{ij}, which is a contradiction.
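The push and pop phases of the stack algorithm translate directly into code. The representation below (a neighbour-set dictionary and one dictionary of updated weights per colour class) is a hypothetical one, and the k-independence ordering is assumed given.

```python
def stack_wcol(order, adj, w, c):
    """Stack algorithm for WCOL_c: push each vertex onto the stack that
    maximises its updated weight (if positive), then pop each stack and
    keep a vertex iff its colour class stays independent."""
    stacks = [[] for _ in range(c)]
    updated = [{} for _ in range(c)]   # updated weight w_l(v) per class l
    for v in order:                    # push phase (lines 2-9)
        best_l, best_w = None, 0
        for l in range(c):
            w_l = w[v] - sum(updated[l][u] for u in adj[v] if u in updated[l])
            if w_l > best_w:
                best_l, best_w = l, w_l
        if best_l is not None:
            stacks[best_l].append(v)
            updated[best_l][v] = best_w
    colour = {}
    for l in range(c):                 # pop phase (lines 10-19)
        while stacks[l]:
            v = stacks[l].pop()
            if all(colour.get(u) != l for u in adj[v]):
                colour[v] = l
    return colour

# With c = 1 this is the stack algorithm for WMIS.  On the path 1-2-3
# with weights 1, 3, 1 and the perfect elimination ordering 1, 2, 3,
# only the heavy middle vertex is coloured.
adj = {1: {2}, 2: {1, 3}, 3: {2}}
assert stack_wcol([1, 2, 3], adj, {1: 1, 2: 3, 3: 1}, 1) == {2: 0}
```

Popping in reverse push order is what makes the pop phase safe: a popped vertex only competes with vertices pushed after it, whose charges its updated weight already accounts for.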

Lemma The solution of the algorithm has total weight at least W.

Proof: Let A_l be the set of vertices in the solution of the algorithm with colour l. For any given vertex v_i ∈ A_l, let S_l^i be the content of the stack S_l just before v_i is pushed onto the stack. We have w(v_i) = w_l(v_i) + Σ_{v_j ∈ S_l^i ∩ N(v_i)} w_l(v_j). If we sum over all v_i ∈ A_l, we have

Σ_{v_i ∈ A_l} w(v_i) = Σ_{v_i ∈ A_l} w_l(v_i) + Σ_{v_i ∈ A_l} Σ_{v_j ∈ S_l^i ∩ N(v_i)} w_l(v_j) ≥ Σ_{v_t ∈ S_l} w_l(v_t) = W_l.

The inequality holds because for any v_t ∈ S_l, we either have v_t ∈ S_l^i ∩ N(v_i) for some v_i ∈ A_l, or we have v_t ∈ A_l. Summing over the colour classes, we conclude that the solution of the algorithm has total weight at least W.

We now proceed to the proof of Theorem .

Proof: Let A be the solution of the algorithm and O be the optimal solution. For each vertex v_i ∈ O, let o_i be its colour class in O, and let a_i be its colour class in A if it is in A. Let S_{o_i}^i be the content of the stack S_{o_i} when the algorithm considers v_i. There are three cases:

1. If v_i is rejected during the push phase of the algorithm, then we have w(v_i) ≤ Σ_{v_j ∈ S_{o_i}^i ∩ N(v_i)} w_{o_i}(v_j). In this case, we charge w(v_i) to all w_{o_i}(v_j) with v_j ∈ S_{o_i}^i ∩ N(v_i). Each w_{o_i}(v_j) can be charged at most k times by charges coming from the same colour class.

2. If v_i is accepted into the same colour class during the push phase of the algorithm, then we have w(v_i) = w_{o_i}(v_i) + Σ_{v_j ∈ S_{o_i}^i ∩ N(v_i)} w_{o_i}(v_j).

In this case, we charge w(v_i) to w_{o_i}(v_i) and to all w_{o_i}(v_j) with v_j ∈ S_{o_i}^i ∩ N(v_i). Note that they all appear in the same colour class o_i; w_{o_i}(v_i) is charged at most once, and each w_{o_i}(v_j) is charged at most k times by charges coming from the same colour class.

3. If v_i is accepted into a different colour class during the push phase of the algorithm, then we have

w(v_i) = w_{o_i}(v_i) + Σ_{v_j ∈ S_{o_i}^i ∩ N(v_i)} w_{o_i}(v_j) ≤ w_{a_i}(v_i) + Σ_{v_j ∈ S_{o_i}^i ∩ N(v_i)} w_{o_i}(v_j).

In this case, we charge w(v_i) to w_{a_i}(v_i) and to all w_{o_i}(v_j) with v_j ∈ S_{o_i}^i ∩ N(v_i). Note that each w_{o_i}(v_j) appears in the same colour class o_i and is charged at most k times by charges coming from the same colour class. However, w_{a_i}(v_i) in this case is in a different colour class a_i and is charged at most once by a charge coming from a different colour class.

If we sum over all v_i ∈ O, we have

Σ_{v_i ∈ O} w(v_i) ≤ Σ_{v_i ∈ A∩O, o_i ≠ a_i} w_{a_i}(v_i) + k·Σ_{l=1}^c Σ_{v_t ∈ S_l} w_l(v_t).

The inequality holds because, when we sum over the weights of all v_i ∈ O, there are two types of charges to w_l(v_t) for any vertex v_t ∈ S_l: charges coming from the same colour class, of which there can be at most k, and charges coming from a different colour class. The latter appear only when v_i is accepted into a stack of a different colour class (compared to the optimal solution) during the push phase of the algorithm. Therefore there is at most one such charge, which leads to the extra term Σ_{v_i ∈ A∩O, o_i ≠ a_i} w_{a_i}(v_i). Note that if we could permute the colour classes of the optimal solution so that o_i = a_i for every v_i ∈ A∩O, then this extra term would disappear and we would achieve a k-approximation. But it might be the case that no matter how we permute the colour classes of the optimal solution, we always have some v_i ∈ A∩O with o_i ≠ a_i. We construct the weight matrix M in the following way. An assignment i → j assigns the colour class i of O to the colour class j of A.
A vertex is misplaced with respect to the assignment i → j if it is in A∩O and its colour class in O is i, but its colour class in A is not j. We then let M_{ij}

be the total updated weight of the misplaced vertices with respect to the assignment i → j. Note that the total weight of the matrix is (c − 1)·Σ_{v_i ∈ A∩O} w_{a_i}(v_i), and applying Lemma , there exists a permutation of the colour classes of O such that

Σ_{v_i ∈ A∩O, o_i ≠ a_i} w_{a_i}(v_i) ≤ ((c−1)/c)·Σ_{v_i ∈ A∩O} w_{a_i}(v_i) ≤ ((c−1)/c)·Σ_{v_i ∈ A} w(v_i).

Therefore, we have

Σ_{v_i ∈ O} w(v_i) ≤ ((c−1)/c)·Σ_{v_i ∈ A} w(v_i) + k·Σ_{l=1}^c Σ_{v_t ∈ S_l} w_l(v_t) ≤ ((c−1)/c)·Σ_{v_i ∈ A} w(v_i) + kW.

By Lemma , we have W ≤ Σ_{v_i ∈ A} w(v_i), and hence

Σ_{v_i ∈ O} w(v_i) ≤ ((c−1)/c)·Σ_{v_i ∈ A} w(v_i) + k·Σ_{v_i ∈ A} w(v_i) = (k + 1 − 1/c)·Σ_{v_i ∈ A} w(v_i).

Therefore, the algorithm achieves a (k + 1 − 1/c)-approximation for WCOL_c on G(IS_k).

Note that, given a k-independence ordering, the running time of the stack algorithm for Theorem is dominated by the push phase and can be bounded by O(min{m log c + n, m + cn}). The first quantity is obtained as follows: for each vertex, we maintain a priority queue of its updated weights over all colour classes. An update occurs for each edge in the graph, and the cost of such an update is O(log c); therefore the running time is bounded by O(m log c + n). For the second quantity, at each step we simply calculate the updated weight of the current vertex for all colour classes, and then find the best colour class onto whose stack to push the vertex. Calculating the updated weights over all vertices costs O(m) time, and finding the best colour class for each vertex costs O(c) time; therefore the running time is bounded by O(m + cn).

In general, by a result in [6], the existence of an r-approximation for WMIS always implies (using a greedy algorithm that repeatedly takes an r-approximation solution of WMIS in the remaining graph) an approximation algorithm for WCOL_c with ratio (cr)^c / ((cr)^c − (cr−1)^c). Note that when c = 1, (cr)^c / ((cr)^c − (cr−1)^c) = r. When c = 2 and r = 1, (cr)^c / ((cr)^c − (cr−1)^c) = 4/3. For the remaining cases, we have

(cr)^c / ((cr)^c − (cr−1)^c) = 1 / (1 − (1 − 1/(cr))^c) ≤ 1 / (1 − e^{−1/r}).

This ratio is no more than r + 1 − 1/c for all choices of r and c. However, the running time of this algorithm is O(c(m + n)) given a k-independence ordering, which is slightly worse than that of the stack algorithm.

2.4.5 The Graph Class G(CC_2)

The graph class G(CC_k) is a subclass of G(IS_k); hence all algorithms studied in this section apply to the graph class G(CC_k). Furthermore, if an elimination ordering with respect to the property MCC_k is given, then both WMC and MCC can be approximated within a ratio of k. However, as noted in Section 2.3, unlike the property MIS_k, the property MCC_k is NP-hard to test for k > 2; therefore, Theorem no longer applies. In this subsection, we focus on the graph class G(CC_2).

The graph class G(CC_2) contains several interesting subclasses, such as line graphs, translates of a uniform rectangle, input graphs of the job interval selection problem, and circular-arc graphs. Furthermore, by Corollary 2.3.7, for a graph in G(CC_2), an elimination ordering with respect to the property MCC_2 can be constructed in polynomial time. Here, we give an optimal algorithm for WMC and a 2-approximation algorithm for MCC on G(CC_2).

Theorem Given a graph in G(CC_2), there is an algorithm that solves WMC in polynomial time.

Proof: Let v_1, v_2,..., v_n be an elimination ordering with respect to the property MCC_2. For each v_i, let G_i = G[(N(v_i) ∪ {v_i}) ∩ V_i]. Since the size of an MCC of G_i is at most 2, the complement of G_i is a bipartite graph. Note that a WMIS in a bipartite graph can be computed in polynomial time [23][39]; hence a WMC in G_i can be computed in polynomial time. We compute a WMC for each G_i, and the largest one is a WMC for G. To see why, consider any weighted maximum clique C of G. Let v_j be the vertex in C that appears first in the elimination ordering. Then, when we compute a WMC for G_j, we catch a weighted maximum clique of G.

Theorem Given a graph in G(CC_2), there is a polynomial time 2-approximation algorithm for MCC.

Proof: Let v_1, v_2,..., v_n be an elimination ordering with respect to the property MCC_2. We construct an independent set S by repeatedly taking a vertex according to this elimination ordering and removing all its neighbours. For each v_i ∈ S, let G_i = G[(N(v_i) ∪ {v_i}) ∩ V_i]. Since the size of an MCC of G_i is at most 2, there are at most two cliques in an MCC of G_i. We take the union of those cliques over every G_i. It is clear that this is a clique cover for G, and it has size at most 2|S|. Since S is an independent set, an MCC of G has size at least |S|. Therefore, the algorithm achieves a 2-approximation.

2.5 Matroids and Chordoids

The previous sections discussed graph structures based on inductive and universal neighbourhood properties. In this section, we consider set systems. In particular, we discuss matroids, an extension of matroids, and greedy algorithms on these set systems.

2.5.1 Matroids

Matroids are well-studied objects in combinatorial optimization. A matroid M is a pair (U,F), where U is a set of ground elements and F is a family of subsets of U, called independent sets, with the following properties:

Trivial Property: ∅ ∈ F.

Hereditary Property: If A ∈ F and B ⊆ A, then B ∈ F.

Augmentation Property: If A, B ∈ F and |A| = |B| + 1, then there exists an element e ∈ A \ B such that B ∪ {e} ∈ F.
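For small ground sets, the three properties can be checked by brute force over an explicitly listed family; the sketch below does exactly that, on a made-up toy instance.

```python
from itertools import combinations

def is_matroid(fam):
    """Brute-force check of the trivial, hereditary and augmentation
    properties on an explicitly listed family of independent sets."""
    fam = {frozenset(s) for s in fam}
    if frozenset() not in fam:                       # trivial property
        return False
    for s in fam:                                    # hereditary property
        for r in range(len(s)):
            if any(frozenset(t) not in fam for t in combinations(s, r)):
                return False
    for a in fam:                                    # augmentation property
        for b in fam:
            if len(a) == len(b) + 1 and not any(b | {e} in fam for e in a - b):
                return False
    return True

# The uniform matroid of rank 2 on {1,2,3}: all subsets of size <= 2.
rank2 = [set(t) for r in range(3) for t in combinations({1, 2, 3}, r)]
assert is_matroid(rank2)
# A non-matroid: {emptyset, {1}, {2,3}} is not even hereditary ({2} is missing).
assert not is_matroid([set(), {1}, {2, 3}])
```

Such an exhaustive check is exponential in |U| and is only meant to make the axioms concrete, not to serve as a recognition algorithm.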

The maximal independent sets of a matroid are called bases. By the augmentation property, all bases have the same cardinality. For a given subset A of U, the rank of A, denoted r(A), is the size of the largest independent set contained in A. The rank function satisfies the following properties for all A, B ⊆ U:

Monotonicity: A ⊆ B implies r(A) ≤ r(B);

Submodularity: r(A ∪ B) + r(A ∩ B) ≤ r(A) + r(B).

The definition of a matroid captures the key notion of independence from both linear algebra and graph theory. For example, consider a set of vectors S in a vector space and let F denote the set of linearly independent subsets of S; then (S,F) is a matroid. Given a simple graph, let E be the set of edges and let F be the set of forests; then (E,F) is also a matroid. We give two more examples of matroids which we use in the thesis.

1. Uniform Matroid: Given a set U of ground elements, let F be the set of all subsets of U with no more than k elements. Then (U,F) is known as the uniform matroid of rank k.

2. Partition Matroid: Given a set U of ground elements which is partitioned into sets U_1, U_2,..., U_l, let F be the set of all subsets of U with no more than k_i elements from each part U_i, for i = 1, 2,..., l. Then (U,F) is a partition matroid.

Note that a uniform matroid is a special case of a partition matroid for which l = 1.

2.5.2 Greedy Algorithms and Matroids

One interesting aspect of matroids is their connection to greedy algorithms. Given a matroid M = (U,F) and a positive weight function w : U → R⁺, there is a natural optimization problem associated with the matroid M and the weight function w, namely that of finding an independent set of maximum total weight. We call this problem the maximum independent set problem on matroids. Let |U| = n. Sort the elements of U in non-increasing order of weights.

Let x_i denote the i-th element in this order. The following "natural" greedy algorithm solves the problem optimally:

GREEDY ALGORITHM FOR MATROIDS
1: S = ∅
2: for i = 1,...,n do
3: if S ∪ {x_i} ∈ F, add x_i to S
4: end for
5: return S

Theorem [76] The above greedy algorithm optimally solves the weighted maximum independent set problem on matroids.

Another interesting fact is the reverse direction of the implication for hereditary set systems.

Theorem [27] Let (U,F) be a hereditary set system. If for every choice of a weight function w : U → R⁺ the above greedy algorithm constructs a feasible set with maximum total weight, then F is the set of independent sets of a matroid M with underlying ground set U.

Matroids thus give a characterization of the hereditary set systems for which the "natural" greedy algorithm achieves the optimal solution for the maximum independent set problem.

2.5.3 Chordoids

Note that for the maximum independent set problem on a matroid, if the problem is unweighted, then any ordering of the elements will give an optimal solution for the greedy algorithm. This is different from the greedy algorithm for the maximum independent set problem on chordal graphs, where a specific ordering of the vertices has to be used. We extend the definition of matroid by replacing the augmentation property with the following property, which we call the ordered augmentation property.

Definition A set system (U,F) satisfies the ordered augmentation property if there is a total ordering e_1, e_2,..., e_n of its elements such that for any feasible set S ∈ F and any element e_i ∉ S, if S⁻ ∪ {e_i} ∈ F and S⁺ ≠ ∅, where S⁻ = {e_j ∈ S : j < i} and S⁺ = {e_j ∈ S : j > i}, then there exists an element e_k ∈ S⁺ such that S \ {e_k} ∪ {e_i} ∈ F.

It turns out that the ordered augmentation property is strictly weaker than the augmentation property of matroids.

Proposition The augmentation property implies the ordered augmentation property.

Proof: Let (U,F) be a set system that satisfies the augmentation property, and let e_1, e_2,..., e_n be an arbitrary ordering of the elements of U. For any feasible set S ∈ F and any element e_i ∉ S, if S⁻ ∪ {e_i} ∈ F and S⁺ ≠ ∅, then by the augmentation property we can repeatedly augment the set, starting with S⁻ ∪ {e_i} and using elements of S⁺, until its size is equal to the size of S. This implies that there exists an element e_k ∈ S⁺ such that S \ {e_k} ∪ {e_i} ∈ F.

Definition Let C = (U,F) be a set system. If C satisfies the trivial property, the hereditary property and the ordered augmentation property, then C is a chordoid.

By Proposition , chordoids generalize matroids. We give four additional examples of chordoids.

Example Let U be a set of elements and let w : U → Z⁺ be a positive weight function on the elements of U. Let B be a positive integer and let F = {S ⊆ U : Σ_{e∈S} w(e) ≤ B}. Then (U,F) is a chordoid. If we take an ordering of the elements in non-decreasing order of weights (breaking ties arbitrarily), then the set system satisfies the ordered augmentation property: let S be an independent set, let e_1 ∈ S and e_2 ∉ S, and let S′ = S \ {e_1} ∪ {e_2}. If w(e_2) ≤ w(e_1), then S′ is independent, since Σ_{e∈S′} w(e) ≤ B. Moreover, since each element has a positive weight, any subset of an independent set is also independent; therefore the set system (U,F) is a chordoid. Note

that the above constraint Σ_{e∈S} w(e) ≤ B is often referred to as a knapsack constraint. Therefore, any optimization problem over a knapsack constraint is an optimization problem over a chordoid.

Example 2.5.7 Let G = (V,E) be a chordal graph, and let F be the family of independent sets of G. Then (V,F) is a chordoid.

Any subset of an independent set is independent, so this set system satisfies the hereditary property. We now examine the ordered augmentation property. Consider a perfect elimination ordering of the vertices. For any independent set S and any vertex v ∉ S, let S⁻ be the set of vertices of S appearing earlier than v in the ordering and let S⁺ be the set of vertices of S appearing later than v. By the definition of a perfect elimination ordering, at most one vertex of S⁺ can be adjacent to v. Hence, if S⁻ ∪ {v} is independent, then to augment S with v while maintaining independence we need to remove at most one vertex of S⁺. Therefore a perfect elimination ordering of a chordal graph satisfies the ordered augmentation property, and the set system (V,F) is a chordoid.

Example 2.5.8 Given a set of codewords U over some alphabet Σ, let F be the set of prefix-free subsets of U. Then (U,F) is a chordoid.

A subset of a prefix-free set is clearly prefix-free; hence the hereditary property is satisfied. Consider an ordering of the codewords in non-increasing order of length (breaking ties arbitrarily). For any prefix-free set S and any codeword w, at most one codeword of smaller or equal length in S can be a prefix of w. Therefore (U,F) satisfies the ordered augmentation property.

Example 2.5.9 Given a set U of partial vectors of length n of the form (a_1, a_2, ..., a_i, ?, ?, ..., ?), where the unknown entries are marked with "?". The number of known entries of a vector is called its effective length.
A subset S of partial vectors is independent if no matter what the unknown

values are, the subset S is linearly independent. Let F be the family of independent sets of U. Then (U,F) is a chordoid.

Clearly, any subset of an independent set is independent. It remains to verify the ordered augmentation property.

Proposition 2.5.10 A set system (U,F), where U is a set of partial vectors of length n, satisfies the ordered augmentation property when the vectors are ordered in non-increasing order of effective length.

Proof: Let v_1, v_2, ..., v_n be the partial vectors in non-increasing order of effective length. In the sequel, we assume that all subsets of U are ordered using this ordering. Let S be an independent set and let v be a partial vector not in S. Let S⁻ denote the set of partial vectors of S appearing earlier than v in the ordering, and let S⁺ denote the set of partial vectors of S appearing later than v. We assume that S⁻ ∪ {v} is independent but S ∪ {v} is not. Call an independent set v-dependent if adding v to it makes it dependent. Let D denote the set of minimal v-dependent subsets of S. For any set D ∈ D, define the index of D to be the largest index (the position in the ordering v_1, v_2, ..., v_n) among the partial vectors in D. Let m be the maximum index over all sets in D. We claim that S \ {v_m} ∪ {v} is independent.

To see this, let T = {t_1, ..., t_k} be the set in D with index m. Then there exist non-zero constants α_1, α_2, ..., α_k such that

Σ_{i=1}^k α_i t_i + v = (0, 0, ..., 0, ?, ?, ..., ?),

where we adopt the convention that ? + a = ? and ? · a = ? for any real number a. Note that by our choice of T, t_k = v_m, and the effective length of Σ_{i=1}^k α_i t_i is no greater than the effective length of v. Furthermore, Σ_{i=1}^k α_i t_i agrees with −v on all of its known entries.

Now suppose that S \ {v_m} ∪ {v} is not independent. Let H = {h_1, ..., h_l} be a minimal subset of S \ {v_m} that is v-dependent. Then there exist non-zero constants β_1, β_2, ..., β_l such that

Σ_{i=1}^l β_i h_i + v = (0, 0, ..., 0, ?, ?, ..., ?).

Therefore, we have

Σ_{i=1}^l β_i h_i − Σ_{i=1}^k α_i t_i = (0, 0, ..., 0, ?, ?, ..., ?),

where at least one coefficient, namely the coefficient of t_k, is non-zero. Therefore the original set S is not independent, which is a contradiction. □

Given a chordoid (U,F), let u_1, u_2, ..., u_n be an ordering of the elements satisfying the ordered augmentation property. The following greedy algorithm is optimal for the maximum independent set problem over a chordoid.

AN OPTIMAL GREEDY ALGORITHM FOR MIS OVER A CHORDOID
1: S = ∅
2: for i = 1,...,n do
3:   Add u_i to S if the resulting set is independent
4: end for
5: return S

Theorem 2.5.11 The greedy algorithm solves the maximum independent set problem optimally for a chordoid.

Proof: Let A be the greedy solution and O an optimal solution. We gradually transform O into A. Let O_0 = O. Let π be an ordering of the elements satisfying the ordered augmentation property. Order the elements of A according to π as a_1, a_2, ..., a_m, and apply the following procedure at each step i, for i = 1,...,m. If a_i ∈ O_{i−1}, then O_i = O_{i−1}. Otherwise, we add the element a_i to O_{i−1}. Note that no element of O_{i−1} \ A appears earlier than a_i in the ordering, for otherwise it

would have been chosen by the greedy algorithm. By the ordered augmentation property, after adding a_i we need to remove at most one element of O_{i−1} appearing later than a_i to maintain independence. Let the resulting set be O_i. We make two observations:

1. O_{i−1} has the same size as O_i, for all i = 1,...,m.
2. O_m = A.

The first observation is easy: at each step we either do nothing, or add one element and remove at most one element. Since no set can be larger than the optimal solution, the sizes of all the O_i remain fixed. For the second observation, it is not hard to see that A ⊆ O_m. Furthermore, O_m cannot contain any extra element, for otherwise that element would have been chosen by the greedy algorithm. Therefore |A| = |O_0| = |O|, and the greedy algorithm is optimal. □

Theorem 2.5.12 The weighted maximum independent set problem for a chordoid is NP-hard.

Proof: This is immediate, since the knapsack problem is NP-hard and it is a special case of the weighted maximum independent set problem for a chordoid. □

There are other set system characterizations for greedy algorithms in the literature, most notably greedoids [61]. A set system (U,F) is a greedoid if it satisfies the trivial property, the augmentation property and the following accessible property instead of the hereditary property:

Accessible Property: If A ∈ F and A ≠ ∅, then there exists e ∈ A such that A \ {e} ∈ F.

Chordoids differ from matroids and greedoids in two key respects. First, for matroids and greedoids the greedy algorithm is always optimal for the weighted maximum independent set problem, while this does not hold for all chordoids. Second, in a matroid or greedoid all bases have the same size, while this is not true for chordoids.
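The chordoid greedy can be illustrated on the knapsack set system described above: scanning the elements in non-decreasing order of weight (an ordered-augmentation order) yields a maximum-cardinality feasible set. A minimal sketch; the weights and budget below are hypothetical.

```python
def greedy_chordoid(order, independent):
    # Scan elements in an order satisfying the ordered augmentation
    # property; keep each element whose addition preserves independence.
    S = []
    for u in order:
        if independent(S + [u]):
            S.append(u)
    return S

# Knapsack set system: S is independent iff its total weight is at most B.
w = {'a': 1, 'b': 2, 'c': 2, 'd': 5}    # hypothetical weights
B = 5
order = sorted(w, key=w.get)             # non-decreasing weight

S = greedy_chordoid(order, lambda T: sum(w[e] for e in T) <= B)
print(S)  # ['a', 'b', 'c']
```

The returned set has 3 elements, which is the maximum possible within the budget (any four elements here weigh at least 10); scanning in the opposite order can return a smaller set, which is why the ordering matters for chordoids but not for unweighted matroids.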

Chapter 3

Greedy Algorithms for Special Functions

An optimization problem takes the form of optimizing an objective function subject to some constraints. While the previous chapter dealt with special structures yielding constraints, in this chapter we study special families of objective functions. In order to make a fair comparison among different objective functions, we fix the constraint and consider a general class of optimization problems of the following form. Given a universe U and a set function f : 2^U → R, we want to find a subset S of U with cardinality p, where p is a fixed constant, maximizing f(S). The constraint of the problem is a very simple cardinality constraint, also known as a uniform matroid constraint.

The objective functions we consider in this chapter range from very simple linear functions and submodular functions to more general functions: functions modelling diversity and weakly submodular functions. A set function is monotone if f(S) ≤ f(T) for all S ⊆ T ⊆ U; it is normalized if f(∅) = 0. In this chapter, we restrict our attention to monotone and normalized functions.

3.1 Linear Functions and Submodular Functions

A set function f is linear if for all S,T ⊆ U,

f(S) + f(T) = f(S ∪ T) + f(S ∩ T).

Another way to view a linear function is that the contribution of an individual element e to a set is the value of that element, which is essentially f({e}). Therefore, we can use the following simple greedy algorithm to optimize a linear function over a uniform matroid.

LINEAR FUNCTION MAXIMIZATION
1: for i = 1,..., p do
2:   Choose an element giving the most increase in value to the current set
3:   Add that element to the current set
4: end for

It is not hard to see that the above greedy algorithm solves the problem optimally. A more general class of functions is the class of submodular functions. A set function f is submodular if for all S,T ⊆ U,

f(S) + f(T) ≥ f(S ∪ T) + f(S ∩ T).

Note that submodular functions are often studied in the value oracle model [73], where the only access to f(·) is through a black box returning f(S) for a given set S. We can also view a submodular function in terms of marginal gain. An equivalent definition is that for all S ⊆ T ⊆ U and x ∈ U \ T,

f(S ∪ {x}) − f(S) ≥ f(T ∪ {x}) − f(T).

This says that the marginal gain of adding an element to a set is no greater than the gain of adding it to any subset of that set. For the problem of maximizing a submodular function over a uniform matroid, we can use the same greedy algorithm.

SUBMODULAR FUNCTION MAXIMIZATION
1: for i = 1,..., p do
2:   Choose an element giving the most increase in value to the current set
3:   Add that element to the current set
4: end for
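For concreteness, the greedy above can be instantiated for maximum coverage, a canonical normalized monotone submodular objective: f(S) is the number of ground elements covered by the chosen sets. A minimal sketch; the instance is illustrative.

```python
def greedy_coverage(sets, p):
    # Repeatedly take the set with the largest marginal coverage gain.
    covered, chosen = set(), []
    for _ in range(p):
        best = max(range(len(sets)), key=lambda i: len(sets[i] - covered))
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered

sets = [{1, 2, 3}, {3, 4}, {4, 5, 6}, {1, 6}]
chosen, covered = greedy_coverage(sets, 2)
print(len(covered))  # 6
```

Here the greedy happens to be optimal (it covers all 6 elements with 2 sets); in general it only guarantees the constant factor discussed next.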

For submodular functions, the greedy algorithm does not always find an optimal solution. However, a result of Nemhauser, Wolsey and Fisher [71] shows that it achieves an approximation ratio of e/(e − 1) ≈ 1.58. Furthermore, this bound is known to be tight, both in the value oracle model and for explicitly posed instances, assuming P ≠ NP [31].

3.2 Max-Sum Diversification

We now turn our attention to more general functions. The linear or submodular functions discussed in the previous section often model the quality of a given subset. For some applications, this is not enough. For example, in portfolio management, allocating equities only according to the total expected return may incur a large potential risk because the portfolio is not diversified. A similar situation occurs in information retrieval. For example, when pre-knowledge of the user intent is not available, it is actually better for a search engine to diversify its displayed results to improve user satisfaction. In many such situations, diversity is an important measure that must be brought into consideration.

Recently, there has been rising interest in the notion of diversity, especially in the context of social media and web search. The concept of diversity is not new, however: there is a rich and long line of research dealing with a similar concept in the literature of location theory, in particular the placement of facilities on a network so as to maximize some function of the distances between facilities. This situation arises when proximity of facilities is undesirable, for example in the distribution of business franchises in a city. Such location problems are often referred to as dispersion problems; for more motivation and early work, see [29, 30, 64].
Analytical models for the dispersion problem assume that the given network is represented by a set V = {v_1, v_2, ..., v_n} of n vertices with a metric distance between every pair of vertices. The objective is to locate p facilities (p ≤ n) among the n vertices, with at most one facility per vertex, such that some function of the distances between facilities is maximized.

Different objective functions for dispersion problems have been considered in the literature: for example, the max-sum criterion (maximize the total distance between all pairs of facilities) in [87, 29, 77], the max-min criterion (maximize the minimum distance between a pair of facilities) in [64, 29, 77], the max-mst criterion (maximize the weight of a minimum spanning tree on the facilities) and many other related criteria in [41, 18]. The general problem (even in the metric case) is NP-hard for most of these criteria, and approximation algorithms have been developed and studied; see [18] for a summary of previously known results. In this section, we study a problem that extends the max-sum dispersion problem. We first give the definition of a metric distance function.

Definition 3.2.1 Let U be the underlying ground set. A distance function d(·,·) defined on every pair of elements is metric if it satisfies the following properties:

1. Non-negativity: for any x, y ∈ U, d(x, y) ≥ 0.
2. Coincidence axiom: for any x, y ∈ U, d(x, y) = 0 if and only if x = y.
3. Symmetry: for any x, y ∈ U, d(x, y) = d(y, x).
4. Triangle inequality: for any x, y, z ∈ U, d(x, y) + d(x, z) ≥ d(y, z).

Definition 3.2.2 Let U be the underlying ground set and d(·,·) a metric distance function. Given a fixed integer p, the goal of the max-sum dispersion problem is to find a subset S ⊆ U that maximizes Σ_{{u,v}⊆S} d(u, v) subject to |S| = p.

The max-sum dispersion problem is known to be NP-hard [46], but it is not known whether or not it admits a PTAS. In [77], Ravi, Rosenkrantz and Tayi give a greedy algorithm and prove that its approximation ratio is within a factor of four. This was later improved by Hassin, Rubinstein and Tamir [47], who give a different algorithm with an approximation ratio of two; this is the best ratio known today. We study a generalization of the max-sum dispersion problem, which we call the max-sum diversification problem.

Definition 3.2.3 Let U be the underlying ground set and d(·,·) a metric distance function on pairs of elements of U. Let f(·) be a non-negative set function measuring the weight of a subset. Given a fixed integer p, the goal of the max-sum diversification problem is to find a subset S ⊆ U that

maximizes f(S) + λ Σ_{{u,v}⊆S} d(u, v) subject to |S| = p,

where λ is a non-negative parameter specifying a desired trade-off between the two objectives, i.e., the quality of the set and the diversity of the set.

The max-sum diversification problem was first proposed and studied in the context of result diversification in [38]¹, where the function f(·) is linear. In that paper, the value f(S) measures the relevance of a given subset to a search query, and the value Σ_{{u,v}⊆S} d(u, v) gives a diversity measure on S; the parameter λ specifies a desired trade-off between relevance and diversity. They reduce the problem to the max-sum dispersion problem and, using an algorithm from [47], obtain an approximation ratio of two. We study the problem with more general weight functions: normalized, monotone submodular set functions. The problem therefore also extends the submodular maximization problem discussed in Section 3.1. Note that the results in [38] no longer apply once the weight functions are extended to submodular set functions.

A Greedy Algorithm and Its Analysis

In this subsection, we give a non-oblivious greedy algorithm for the max-sum diversification problem that achieves a 2-approximation. Before giving the algorithm, we first introduce our notation. We extend the distance function to sets: for disjoint subsets S,T ⊆ U, let

d(S) = Σ_{{u,v}⊆S} d(u, v) and d(S,T) = Σ_{u∈S, v∈T} d(u, v).

¹ In fact, they have a slightly different but equivalent formulation.

Now we define various types of marginal gain. For a given subset S ⊆ U and an element u ∈ U \ S, let φ(S) be the value of the objective function; let

d_u(S) = Σ_{v∈S} d(u, v)

be the marginal gain on the distance; let f_u(S) = f(S ∪ {u}) − f(S) be the marginal gain on the weight; and let φ_u(S) = f_u(S) + λ d_u(S) be the total marginal gain on the objective function. Let f′_u(S) = ½ f_u(S), and φ′_u(S) = f′_u(S) + λ d_u(S). We consider the following simple greedy algorithm:

A GREEDY ALGORITHM FOR MAX-SUM DIVERSIFICATION
1: S = ∅
2: while |S| < p do
3:   Find u ∈ U \ S maximizing φ′_u(S)
4:   S = S ∪ {u}
5: end while
6: return S

Note that the above greedy algorithm is non-oblivious: it selects the next element not with respect to the objective function itself but with respect to a closely related "potential function". To show a bounded approximation ratio for the algorithm, we utilize the following variation of a lemma in [77].

Lemma 3.2.4 Given a metric distance function d(·,·) on U and two disjoint subsets X and Y of U, we have the following inequality:

(|X| − 1) d(X,Y) ≥ |Y| d(X).

Proof: For any x_1, x_2 ∈ X and y ∈ Y, the triangle inequality gives d(x_1, y) + d(x_2, y) ≥ d(x_1, x_2). Summing over all unordered pairs {x_1, x_2}, we have (|X| − 1) d(X, y) ≥ d(X).

Summing over all y ∈ Y, we have (|X| − 1) d(X,Y) ≥ |Y| d(X). □

Theorem 3.2.5 The greedy algorithm achieves a 2-approximation for the max-sum diversification problem with normalized, monotone submodular set functions.

Proof: Let O be an optimal solution and G the greedy solution at the end of the algorithm. Let G_i be the greedy solution at the end of step i, i < p, and let A = O ∩ G_i, B = G_i \ A and C = O \ A. By Lemma 3.2.4, we have the following three inequalities:

(|C| − 1) d(B,C) ≥ |B| d(C)   (3.1)
(|C| − 1) d(A,C) ≥ |A| d(C)   (3.2)
(|A| − 1) d(A,C) ≥ |C| d(A)   (3.3)

Furthermore, we have

d(A,C) + d(A) + d(C) = d(O).   (3.4)

The algorithm clearly achieves an optimal solution if p = 1. If |C| = 1, then |A| = p − 1; since A ⊆ G_i and |G_i| = i ≤ p − 1, this forces i = p − 1 and G_i ⊂ O. Let v be the element of C, and let u be the element taken by the greedy algorithm in the next step; then φ′_u(G_i) ≥ φ′_v(G_i). Therefore

½ f_u(G_i) + λ d_u(G_i) ≥ ½ f_v(G_i) + λ d_v(G_i),

which implies

φ_u(G_i) = f_u(G_i) + λ d_u(G_i) ≥ ½ f_u(G_i) + λ d_u(G_i) ≥ ½ f_v(G_i) + λ d_v(G_i) ≥ ½ φ_v(G_i);

and hence φ(G) ≥ ½ φ(O).

Now we may assume that p > 1 and |C| > 1. We apply the following non-negative multipliers to inequalities (3.1), (3.2), (3.3) and equation (3.4) and add them:

(3.1) × 1/(|C| − 1) + (3.2) × (p − i)/(p(|C| − 1)) + (3.3) × i/(p(p − 1)) + (3.4) × i|C|/(p(p − 1));

we then have

d(A,C) + d(B,C) ≥ [i|C|(p − |C|)/(p(p − 1)(|C| − 1))] d(C) + [i|C|/(p(p − 1))] d(O).

Since p > |C|,

d(C,G_i) ≥ [i|C|/(p(p − 1))] d(O).

By submodularity and monotonicity of f(·), we have

Σ_{v∈C} f_v(G_i) ≥ f(C ∪ G_i) − f(G_i) ≥ f(O) − f(G).

Therefore,

Σ_{v∈C} φ′_v(G_i) = Σ_{v∈C} [½ f_v(G_i) + λ d({v},G_i)] = ½ Σ_{v∈C} f_v(G_i) + λ d(C,G_i) ≥ ½ [f(O) − f(G)] + [λ i|C|/(p(p − 1))] d(O).

Let u_{i+1} be the element taken at step i + 1. Since the greedy step maximizes φ′ and |C| ≤ p, we have

φ′_{u_{i+1}}(G_i) ≥ (1/|C|) Σ_{v∈C} φ′_v(G_i) ≥ (1/(2p)) [f(O) − f(G)] + [λ i/(p(p − 1))] d(O).

Summing over all i from 0 to p − 1 and telescoping (Σ_i f_{u_{i+1}}(G_i) = f(G) and Σ_i d_{u_{i+1}}(G_i) = d(G)), we have

½ f(G) + λ d(G) = Σ_{i=0}^{p−1} φ′_{u_{i+1}}(G_i) ≥ ½ [f(O) − f(G)] + (λ/2) d(O).

Hence,

f(G) + 2λ d(G) ≥ f(O) − f(G) + λ d(O),

and

φ(G) = f(G) + λ d(G) ≥ ½ [f(O) + λ d(O)] = ½ φ(O).

This completes the proof. □
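The non-oblivious greedy can be sketched directly from the definitions: at each step add the element maximizing φ′_u(S) = ½ f_u(S) + λ Σ_{v∈S} d(u,v). A minimal sketch; the one-dimensional metric instance and the zero weight function below are illustrative, not from the text.

```python
def greedy_diversify(U, f_marginal, d, lam, p):
    # Non-oblivious greedy: score each candidate by HALF its weight gain
    # plus lambda times its total distance to the current set.
    S = []
    while len(S) < p:
        u = max((x for x in U if x not in S),
                key=lambda x: 0.5 * f_marginal(S, x)
                              + lam * sum(d(x, v) for v in S))
        S.append(u)
    return S

# Illustrative instance: points on a line, zero weights (pure dispersion).
U = [0, 1, 9]
S = greedy_diversify(U, lambda S, x: 0.0, lambda a, b: abs(a - b), 1.0, 2)
print(S)  # [0, 9]
```

With zero weights the potential reduces to the distance term, so the second pick is the point farthest from the first; the halving of the weight gain only matters when f is non-trivial.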

Note that the approximation ratio of 2 obtained in Theorem 3.2.5 is tight with respect to this greedy algorithm. Consider the following example.

Example 3.2.6 Let U be a set of 2p elements and let A and B be a bipartition of U, each part containing p elements. The weight of each element is 0. The distance function d(·,·) is defined as follows: d(x, y) = 2 if x, y ∈ A (x ≠ y), and d(x, y) = 1 otherwise. Note that d(·,·) is a metric distance function. It is possible for the greedy algorithm to choose the set B as its solution, while the optimal solution is A, and φ(A) = 2 φ(B).

Further Discussions

It is natural to extend the cardinality constraint of the max-sum diversification problem to a general matroid constraint.

Definition 3.2.7 Let U be the underlying ground set, and let F be the set of independent subsets of U such that M = (U,F) is a matroid. Let d(·,·) be a metric distance function on pairs of elements, and let f(·) be a non-negative set function measuring the total weight of a subset. The goal of the max-sum diversification problem with a matroid constraint is to find a subset S ∈ F that

maximizes f(S) + λ Σ_{{u,v}⊆S} d(u, v),

where λ is a non-negative parameter specifying a desired trade-off between the two objectives.

As before, we let φ(S) denote the value of the objective function on a set S. The greedy algorithm of the previous subsection still applies, but it fails to achieve any constant approximation ratio. Consider the following partition matroid. The ground set is U = {e_1, e_2, e_3, e_4}. The bases of the matroid are {e_1,e_2}, {e_1,e_3}, {e_4,e_2}, {e_4,e_3}.

This is a partition matroid with one block {e_1, e_4} and the other block {e_2, e_3}, with one element allowed per block. The weight of each element is 0. The distances between pairs of elements² are defined as follows:

d(e_1,e_2) = d(e_1,e_3) = d(e_2,e_3) = 1; d(e_1,e_4) = d(e_2,e_4) = d(e_3,e_4) = n.

It is not hard to see that d(·,·) is a metric distance function. The greedy algorithm may pick e_1 in its first iteration; no matter what it picks in the second iteration, the resulting solution has value 1. However, there is a basis with value n. Therefore the approximation ratio is unbounded. This is in contrast to the greedy algorithm of Nemhauser, Wolsey and Fisher [71] for submodular function maximization, which still achieves a 2-approximation when the uniform matroid constraint is replaced by a general matroid constraint.

Note that the problem is trivial if the rank of the matroid is less than two. Therefore, without loss of generality, we assume the rank is at least two. Let

{x, y} = argmax_{{x,y}∈F} [f({x, y}) + λ d(x, y)].

We consider the following oblivious local search algorithm:

MAX-SUM DIVERSIFICATION WITH A MATROID CONSTRAINT
1: Let S be a basis of M containing both x and y
2: while there exist u ∈ U \ S and v ∈ S such that S ∪ {u} \ {v} ∈ F and φ(S ∪ {u} \ {v}) > φ(S) do
3:   S = S ∪ {u} \ {v}
4: end while
5: return S

It turns out that the above local search algorithm achieves an approximation ratio of 2. Note that if the rank of the matroid is two, then the algorithm is clearly optimal. From now

² For each pair (x, y), we only define d(x, y); the value of d(y, x) is the same as d(x, y).

on, we assume the rank of the matroid is greater than two. Before we prove the theorem, we need a few lemmas. First, we state a result from [13].

Lemma 3.2.8 [13] For any two sets X, Y ∈ F with |X| = |Y|, there is a bijective mapping g : X → Y such that X ∪ {g(x)} \ {x} ∈ F for any x ∈ X.

Let O be an optimal solution and S the solution at the end of the local search algorithm. Let A = O ∩ S, B = S \ A and C = O \ A. Since both S and O are bases of the matroid, they have the same cardinality; therefore B and C have the same cardinality. By Lemma 3.2.8, there is a bijective mapping g : B → C such that S ∪ {g(b)} \ {b} ∈ F for any b ∈ B. Let B = {b_1, b_2, ..., b_t}, and let c_i = g(b_i) for all i = 1,...,t. Without loss of generality, we assume t ≥ 2, for otherwise the algorithm is optimal by the local optimality condition.

Lemma 3.2.9 f(S) + Σ_{i=1}^t f(S ∪ {c_i} \ {b_i}) ≥ f(S \ {b_1,...,b_t}) + Σ_{i=1}^t f(S ∪ {c_i}).

Proof: Since f is submodular,

f(S) − f(S \ {b_1}) ≥ f(S ∪ {c_1}) − f(S ∪ {c_1} \ {b_1})
f(S \ {b_1}) − f(S \ {b_1,b_2}) ≥ f(S ∪ {c_2}) − f(S ∪ {c_2} \ {b_2})
...
f(S \ {b_1,...,b_{t−1}}) − f(S \ {b_1,...,b_t}) ≥ f(S ∪ {c_t}) − f(S ∪ {c_t} \ {b_t}).

Summing up these inequalities, we have

f(S) − f(S \ {b_1,...,b_t}) ≥ Σ_{i=1}^t f(S ∪ {c_i}) − Σ_{i=1}^t f(S ∪ {c_i} \ {b_i}),

and the lemma follows. □

Lemma 3.2.10 Σ_{i=1}^t f(S ∪ {c_i}) ≥ (t − 1) f(S) + f(S ∪ {c_1,...,c_t}).

Proof: Since f is submodular,

f(S ∪ {c_t}) − f(S) = f(S ∪ {c_t}) − f(S)
f(S ∪ {c_{t−1}}) − f(S) ≥ f(S ∪ {c_t, c_{t−1}}) − f(S ∪ {c_t})
f(S ∪ {c_{t−2}}) − f(S) ≥ f(S ∪ {c_t, c_{t−1}, c_{t−2}}) − f(S ∪ {c_t, c_{t−1}})
...
f(S ∪ {c_1}) − f(S) ≥ f(S ∪ {c_1,...,c_t}) − f(S ∪ {c_2,...,c_t}).

Summing up these inequalities, we have

Σ_{i=1}^t f(S ∪ {c_i}) − t f(S) ≥ f(S ∪ {c_1,...,c_t}) − f(S),

and the lemma follows. □

Lemma 3.2.11 Σ_{i=1}^t f(S ∪ {c_i} \ {b_i}) ≥ (t − 2) f(S) + f(O).

Proof: Combining Lemma 3.2.9 and Lemma 3.2.10, we have

f(S) + Σ_{i=1}^t f(S ∪ {c_i} \ {b_i}) ≥ f(S \ {b_1,...,b_t}) + Σ_{i=1}^t f(S ∪ {c_i})
≥ f(S \ {b_1,...,b_t}) + (t − 1) f(S) + f(S ∪ {c_1,...,c_t})
≥ (t − 1) f(S) + f(O),

where the last inequality uses f(S \ {b_1,...,b_t}) ≥ 0 and, by monotonicity, f(S ∪ {c_1,...,c_t}) ≥ f(O), since O ⊆ S ∪ {c_1,...,c_t}. Therefore the lemma follows. □

Lemma 3.2.12 If t > 2, then d(B,C) − Σ_{i=1}^t d(b_i,c_i) ≥ d(C).

Proof: For any b_i, c_j, c_k, we have

d(b_i,c_j) + d(b_i,c_k) ≥ d(c_j,c_k).

Summing these inequalities over all i, j, k with i ≠ j, i ≠ k, j ≠ k, each d(b_i,c_j) with i ≠ j is counted (t − 2) times, and each d(c_i,c_j) with i ≠ j is counted (t − 2) times. Therefore

(t − 2) [d(B,C) − Σ_{i=1}^t d(b_i,c_i)] ≥ (t − 2) d(C),

and the lemma follows. □

Lemma 3.2.13 Σ_{i=1}^t d(S ∪ {c_i} \ {b_i}) ≥ (t − 2) d(S) + d(O).

Proof:

Σ_{i=1}^t d(S ∪ {c_i} \ {b_i}) = Σ_{i=1}^t [d(S) + d(c_i, S \ {b_i}) − d(b_i, S \ {b_i})]
= t d(S) + Σ_{i=1}^t d(c_i, S \ {b_i}) − Σ_{i=1}^t d(b_i, S \ {b_i})
= t d(S) + Σ_{i=1}^t d(c_i, S) − Σ_{i=1}^t d(c_i, b_i) − Σ_{i=1}^t d(b_i, S \ {b_i})
= t d(S) + d(C,S) − Σ_{i=1}^t d(c_i, b_i) − d(A,B) − 2 d(B).

There are two cases. If t > 2, then by Lemma 3.2.12 we have

d(C,S) − Σ_{i=1}^t d(c_i,b_i) = d(A,C) + d(B,C) − Σ_{i=1}^t d(c_i,b_i) ≥ d(A,C) + d(C).

Furthermore, since d(S) = d(A) + d(B) + d(A,B), we have 2 d(S) − d(A,B) − 2 d(B) ≥ d(A). Therefore

Σ_{i=1}^t d(S ∪ {c_i} \ {b_i}) = t d(S) + d(C,S) − Σ_{i=1}^t d(c_i,b_i) − d(A,B) − 2 d(B)
≥ (t − 2) d(S) + d(A,C) + d(C) + d(A)
≥ (t − 2) d(S) + d(O).

If t = 2, then since the rank of the matroid is greater than two, A ≠ ∅. Let z be an element of A. Then

2 d(S) + d(C,S) − Σ_{i=1}^2 d(c_i,b_i) − d(A,B) − 2 d(B)
= d(A,C) + d(B,C) − Σ_{i=1}^2 d(c_i,b_i) + 2 d(A) + d(A,B)
≥ d(A,C) + d(c_1,b_2) + d(c_2,b_1) + d(A) + d(z,b_1) + d(z,b_2)
≥ d(A,C) + d(A) + d(c_1,z) + d(c_2,z)
≥ d(A,C) + d(A) + d(c_1,c_2)
= d(A,C) + d(A) + d(C) = d(O).

Therefore

Σ_{i=1}^t d(S ∪ {c_i} \ {b_i}) = t d(S) + d(C,S) − Σ_{i=1}^t d(c_i,b_i) − d(A,B) − 2 d(B) ≥ (t − 2) d(S) + d(O).

This completes the proof. □

Now we are ready to prove the theorem.

Theorem 3.2.14 The local search algorithm achieves an approximation ratio of 2 for the max-sum diversification problem with a matroid constraint.

Proof: Since S is a locally optimal solution, we have φ(S) ≥ φ(S ∪ {c_i} \ {b_i}) for all i. Therefore, for all i,

f(S) + λ d(S) ≥ f(S ∪ {c_i} \ {b_i}) + λ d(S ∪ {c_i} \ {b_i}).

Summing over all i, we have

t f(S) + λ t d(S) ≥ Σ_{i=1}^t f(S ∪ {c_i} \ {b_i}) + λ Σ_{i=1}^t d(S ∪ {c_i} \ {b_i}).

By Lemma 3.2.11, we have

t f(S) + λ t d(S) ≥ (t − 2) f(S) + f(O) + λ Σ_{i=1}^t d(S ∪ {c_i} \ {b_i}).

By Lemma 3.2.13, we have

t f(S) + λ t d(S) ≥ (t − 2) f(S) + f(O) + λ [(t − 2) d(S) + d(O)].

Therefore 2 f(S) + 2λ d(S) ≥ f(O) + λ d(O), and hence φ(S) ≥ ½ φ(O). This completes the proof. □

Theorem 3.2.14 shows that even in the more general case of a matroid constraint, we can still achieve an approximation ratio of 2. In fact, by Example 3.2.6, the set B in that example is a locally optimal set; therefore this ratio is tight. Note that, with a small sacrifice in the approximation ratio, the algorithm can be modified to run in polynomial time by looking for an ε-improvement instead of an arbitrary improvement.

3.3 Weakly Submodular Functions

Submodular functions are well-studied objects in combinatorial optimization, game theory and economics. Their natural diminishing-returns property makes them suitable for many applications. In this section, we study an extension of submodular functions which also generalizes the objective function of the max-sum diversification problem. Recall the definition of a submodular function: a function f(·) is submodular if for any two sets S and T, we have

f(S) + f(T) ≥ f(S ∪ T) + f(S ∩ T).

We consider the following variation: we call a function f(·) weakly submodular if for any two sets S and T, we have

|T| f(S) + |S| f(T) ≥ |S ∩ T| f(S ∪ T) + |S ∪ T| f(S ∩ T).

Examples of Weakly Submodular Functions

There are several natural examples of weakly submodular functions. Again, all functions considered here are normalized and monotone.

Submodular Functions

From the definition, it is not obvious that submodular functions are a subclass of weakly submodular functions. First, we prove that this is the case.

Proposition 3.3.1 Any submodular function is weakly submodular.

Proof: Given a monotone submodular function f(·) and two subsets S and T, assume without loss of generality that |S| ≤ |T|. Then

|T| f(S) + |S| f(T) = |S| [f(S) + f(T)] + (|T| − |S|) f(S).

By submodularity, f(S) + f(T) ≥ f(S ∪ T) + f(S ∩ T), and by monotonicity, f(S) ≥ f(S ∩ T); hence

|T| f(S) + |S| f(T) ≥ |S| [f(S ∪ T) + f(S ∩ T)] + (|T| − |S|) f(S ∩ T)
= |S| f(S ∪ T) + |T| f(S ∩ T)
= |S ∩ T| f(S ∪ T) + (|S| − |S ∩ T|) f(S ∪ T) + |T| f(S ∩ T).

And again by monotonicity, f(S ∪ T) ≥ f(S ∩ T), so

(|S| − |S ∩ T|) f(S ∪ T) + |T| f(S ∩ T) ≥ (|S| + |T| − |S ∩ T|) f(S ∩ T) = |S ∪ T| f(S ∩ T).

Therefore,

|T| f(S) + |S| f(T) ≥ |S ∩ T| f(S ∪ T) + |S ∪ T| f(S ∩ T),

and the proposition follows. □

Sum of Metric Distances of a Set

Let U be a metric space with a distance function d(·,·). For any subset S, define d(S) to be the sum of distances induced by S, i.e.,

d(S) = Σ_{{u,v}⊆S} d(u, v),

where d(u, v) measures the distance between u and v. We also extend the function to a pair of disjoint subsets S and T, defining d(S,T) to be the sum of distances between S and T, i.e.,

d(S,T) = Σ_{u∈S, v∈T} d(u, v).

We have the following proposition.

Proposition 3.3.2 The sum of metric distances of a set is weakly submodular.

Proof: Given two subsets S and T of U, let A = S \ T, B = T \ S and C = S ∩ T. Observe that by the triangle inequality,

|B| d(A,C) + |A| d(B,C) ≥ |C| d(A,B).

Therefore,

|T| d(S) + |S| d(T) = (|B| + |C|) [d(A) + d(C) + d(A,C)] + (|A| + |C|) [d(B) + d(C) + d(B,C)]
= |C| [d(A) + d(B) + d(C) + d(A,C) + d(B,C)] + (|A| + |B| + |C|) d(C) + |B| d(A) + |A| d(B) + |B| d(A,C) + |A| d(B,C)
≥ |C| [d(A) + d(B) + d(C) + d(A,C) + d(B,C)] + |S ∪ T| d(S ∩ T) + |C| d(A,B)
= |C| [d(A) + d(B) + d(C) + d(A,C) + d(B,C) + d(A,B)] + |S ∪ T| d(S ∩ T)
= |S ∩ T| d(S ∪ T) + |S ∪ T| d(S ∩ T). □
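The weak submodularity inequality for distance sums can be spot-checked by brute force on a small metric. A minimal sketch; the one-dimensional point set is illustrative.

```python
from itertools import combinations

# Brute-force check of |T| d(S) + |S| d(T) >= |S∩T| d(S∪T) + |S∪T| d(S∩T)
# for d(S) = sum of pairwise distances, over all subset pairs of a small
# 1-D metric space (distances are absolute differences).
pts = [0, 1, 3, 7]                      # hypothetical points on a line

def d(S):
    return sum(abs(u - v) for u, v in combinations(sorted(S), 2))

subsets = [set(s) for r in range(len(pts) + 1) for s in combinations(pts, r)]
ok = all(len(T) * d(S) + len(S) * d(T)
         >= len(S & T) * d(S | T) + len(S | T) * d(S & T)
         for S in subsets for T in subsets)
print(ok)  # True
```

The same loop, with d replaced by any candidate set function, is a convenient way to test weak submodularity on small ground sets before attempting a proof.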

Average Non-Negative Segmentation Functions

Given an m × n matrix M and any subset S ⊆ [m], a segmentation function σ(S) is the sum of the maximum elements of each column whose row indices appear in S, i.e.,

σ(S) = Σ_{j=1}^n max_{i∈S} M_{ij}.

A segmentation function is average non-negative if for each row i, the sum of all entries of that row of M is non-negative, i.e., Σ_{j=1}^n M_{ij} ≥ 0. We can use columns to model individuals and rows to model items; each entry M_{ij} then represents how much individual j likes item i. The average non-negative property basically requires that, for each item i, on average people do not hate it. Next, we show that an average non-negative segmentation function is weakly submodular. We first prove the following two lemmas.

Lemma 3.3.3 An average non-negative segmentation function is monotone.

Proof: Let S be a proper subset of [m], and let e be an element of [m] not in S. If S is empty, then by the average non-negative property we have σ({e}) = Σ_{j=1}^n M_{ej} ≥ 0 = σ(∅). Otherwise, adding e to S gives max_{i∈S∪{e}} M_{ij} ≥ max_{i∈S} M_{ij} for all 1 ≤ j ≤ n. Therefore σ(S ∪ {e}) ≥ σ(S). □

Lemma 3.3.4 For any non-disjoint sets S and T and an average non-negative segmentation function σ(·), we have σ(S) + σ(T) ≥ σ(S ∪ T) + σ(S ∩ T). This property is also referred to as meta-submodularity [60].

Proof: For any non-disjoint sets S and T, let σ_j(S) = max_{i∈S} M_{ij}. We show the stronger statement that for any j ∈ [n],

σ_j(S) + σ_j(T) ≥ σ_j(S ∪ T) + σ_j(S ∩ T).

Let e be an element of S ∪ T such that M_{ej} is maximum. Without loss of generality, assume e ∈ S; then σ_j(S) = σ_j(S ∪ T) = M_{ej}. Since S ∩ T ⊆ T, we have σ_j(T) ≥ σ_j(S ∩ T). Therefore

σ_j(S) + σ_j(T) ≥ σ_j(S ∪ T) + σ_j(S ∩ T).

Summing over all j ∈ [n], we have σ(S) + σ(T) ≥ σ(S ∪ T) + σ(S ∩ T), as desired. □

Proposition 3.3.5 Any average non-negative segmentation function is weakly submodular.

Proof: Let S and T be two sets and σ(·) an average non-negative segmentation function. If S and T are non-disjoint, then by Lemma 3.3.4 they satisfy the submodular inequality, and hence the weakly submodular inequality, by the argument of Proposition 3.3.1. If S and T are disjoint, then |S ∩ T| = 0 and |S ∪ T| = |S| + |T|. By the monotonicity property of Lemma 3.3.3, we also have σ(S) ≥ σ(S ∩ T) and σ(T) ≥ σ(S ∩ T). Therefore

|S ∩ T| σ(S ∪ T) + |S ∪ T| σ(S ∩ T) = |T| σ(S ∩ T) + |S| σ(S ∩ T) ≤ |T| σ(S) + |S| σ(T);

the weakly submodular inequality is again satisfied. □

Squares of Cardinality of a Set

For a given set S, let f(S) = |S|². We show that this function is also weakly submodular.

Proposition 3.3.6 The square of the cardinality of a set is weakly submodular.

Proof: Given two subsets $S$ and $T$ of $U$, let $a = |S \setminus T|$, $b = |T \setminus S|$ and $c = |S \cap T|$. Then

$|T| f(S) + |S| f(T) = (b+c)(a+c)^2 + (a+c)(b+c)^2$
$= (a+b+2c)(b+c)(a+c)$
$= (a+b+2c)(ab+ac+bc+c^2)$
$\geq (a+b+2c)(ac+bc+c^2)$
$= (a+b+2c)\,c\,(a+b+c)$
$= c(a+b+c)^2 + (a+b+c)c^2$
$= |S \cap T| f(S \cup T) + |S \cup T| f(S \cap T).$

The Objective Function of Max-Sum Diversification

We first show a property of weakly submodular functions.

Lemma. Non-negative linear combinations of weakly submodular functions are weakly submodular.

Proof: Consider weakly submodular functions $f_1, f_2, \ldots, f_n$ and non-negative numbers $\alpha_1, \alpha_2, \ldots, \alpha_n$. Let $g(S) = \sum_{i=1}^{n} \alpha_i f_i(S)$. Then for any two sets $S$ and $T$, we have

$|T|\,g(S) + |S|\,g(T) = |T| \sum_{i=1}^{n} \alpha_i f_i(S) + |S| \sum_{i=1}^{n} \alpha_i f_i(T)$
$= \sum_{i=1}^{n} \alpha_i \left[\,|T| f_i(S) + |S| f_i(T)\,\right]$
$\geq \sum_{i=1}^{n} \alpha_i \left[\,|S \cap T| f_i(S \cup T) + |S \cup T| f_i(S \cap T)\,\right]$
$= |S \cap T| \sum_{i=1}^{n} \alpha_i f_i(S \cup T) + |S \cup T| \sum_{i=1}^{n} \alpha_i f_i(S \cap T)$
$= |S \cap T|\,g(S \cup T) + |S \cup T|\,g(S \cap T).$

Therefore, $g(S)$ is weakly submodular.

Corollary. The objective function of the max-sum diversification problem is weakly submodular.

Proof: This follows immediately from the two preceding propositions and the lemma above.

Weakly Submodular Function Maximization

In this subsection, we discuss a greedy approximation algorithm for maximizing weakly submodular functions over a uniform matroid. Given an underlying set $U$ and a weakly submodular function $f(\cdot)$ defined on every subset of $U$, the goal is to select a subset $S$ maximizing $f(S)$ subject to a cardinality constraint $|S| \leq p$. We consider the following greedy algorithm.

GREEDY ALGORITHM FOR WEAKLY SUBMODULAR FUNCTION MAXIMIZATION
1: $S = \emptyset$
2: while $|S| < p$ do
3:   Find $u \in U \setminus S$ maximizing $f(S \cup \{u\}) - f(S)$
4:   $S = S \cup \{u\}$
5: end while
6: return $S$

Theorem. The above greedy algorithm achieves an approximation ratio that depends only on $p$ and converges to 5.95 as $p$ tends to infinity.

Before getting into the proof, we first prove two algebraic identities.

Lemma. $\displaystyle\sum_{j=1}^{n} \left(\frac{i+1}{i}\right)^{j-1} = i\left(\frac{i+1}{i}\right)^{n} - i.$
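As a quick numerical sanity check of this identity before the proof (an illustrative sketch, not part of the thesis):

```python
def geometric_sum(i, n):
    # Left-hand side: sum of ((i+1)/i)^(j-1) for j = 1, ..., n
    return sum(((i + 1) / i) ** (j - 1) for j in range(1, n + 1))

def closed_form(i, n):
    # Right-hand side: i * ((i+1)/i)^n - i
    return i * ((i + 1) / i) ** n - i

# The two sides agree (up to floating-point error) on a grid of (i, n).
assert all(abs(geometric_sum(i, n) - closed_form(i, n)) < 1e-8
           for i in range(1, 12) for n in range(1, 12))
```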

Proof: Note that the expression on the left-hand side is a geometric sum. Therefore, we have

$\sum_{j=1}^{n} \left(\frac{i+1}{i}\right)^{j-1} = \frac{\left(\frac{i+1}{i}\right)^{n} - 1}{\frac{i+1}{i} - 1} = i\left(\frac{i+1}{i}\right)^{n} - i.$

Lemma. $\displaystyle\sum_{j=1}^{n} j\left(\frac{i+1}{i}\right)^{j-1} = n i^2 \left(\frac{i+1}{i}\right)^{n+1} - (n+1)\, i^2 \left(\frac{i+1}{i}\right)^{n} + i^2.$

Proof: Consider the function $f(x) = \sum_{j=0}^{n} x^{j}$ with $x \neq 1$; its derivative is $f'(x) = \sum_{j=1}^{n} j x^{j-1}$. Since $f(x)$ is a geometric sum and $x \neq 1$, we have

$f(x) = \frac{x^{n+1} - 1}{x - 1}.$

Taking derivatives on both sides, we have

$f'(x) = \frac{(n+1)x^{n}(x-1) - (x^{n+1} - 1)}{(x-1)^2} = \frac{n x^{n+1} - (n+1)x^{n} + 1}{(x-1)^2}.$

Therefore, we have

$\sum_{j=1}^{n} j x^{j-1} = \frac{n x^{n+1} - (n+1)x^{n} + 1}{(x-1)^2}.$

Substituting $x$ with $\frac{i+1}{i}$, and noting $\left(\frac{i+1}{i} - 1\right)^2 = \frac{1}{i^2}$, we have

$\sum_{j=1}^{n} j \left(\frac{i+1}{i}\right)^{j-1} = \frac{n\left(\frac{i+1}{i}\right)^{n+1} - (n+1)\left(\frac{i+1}{i}\right)^{n} + 1}{\left(\frac{i+1}{i} - 1\right)^2} = n i^2 \left(\frac{i+1}{i}\right)^{n+1} - (n+1)\, i^2 \left(\frac{i+1}{i}\right)^{n} + i^2.$

Now we proceed to the proof of the theorem.

Proof: Let $S_i$ be the greedy solution after the $i$-th iteration; i.e., $|S_i| = i$. Let $O$ be an optimal solution, and let $C_i = O \setminus S_i$. Let $m_i = |C_i|$, and $C_i = \{c_1, c_2, \ldots, c_{m_i}\}$. By the weak submodularity definition, we get the following $m_i$ inequalities for each $0 < i < p$:

$(i + m_i - 1)\, f(S_i \cup \{c_1\}) + (i+1)\, f(S_i \cup \{c_2, \ldots, c_{m_i}\}) \geq i\, f(S_i \cup \{c_1, \ldots, c_{m_i}\}) + (i + m_i)\, f(S_i)$
$(i + m_i - 2)\, f(S_i \cup \{c_2\}) + (i+1)\, f(S_i \cup \{c_3, \ldots, c_{m_i}\}) \geq i\, f(S_i \cup \{c_2, \ldots, c_{m_i}\}) + (i + m_i - 1)\, f(S_i)$
$\vdots$
$(i + 1)\, f(S_i \cup \{c_{m_i - 1}\}) + (i+1)\, f(S_i \cup \{c_{m_i}\}) \geq i\, f(S_i \cup \{c_{m_i - 1}, c_{m_i}\}) + (i + 2)\, f(S_i)$
$i\, f(S_i \cup \{c_{m_i}\}) + (i+1)\, f(S_i) \geq i\, f(S_i \cup \{c_{m_i}\}) + (i + 1)\, f(S_i).$

Multiplying the $j$-th inequality by $\left(\frac{i+1}{i}\right)^{j-1}$ and summing all of them up, we have

$\sum_{j=1}^{m_i} (i + m_i - j)\left(\frac{i+1}{i}\right)^{j-1} f(S_i \cup \{c_j\}) + (i+1)\left(\frac{i+1}{i}\right)^{m_i - 1} f(S_i) \geq i\, f(S_i \cup \{c_1, \ldots, c_{m_i}\}) + \sum_{j=1}^{m_i} (i + m_i - j + 1)\left(\frac{i+1}{i}\right)^{j-1} f(S_i).$

By monotonicity, we have $f(S_i \cup \{c_1, \ldots, c_{m_i}\}) \geq f(O)$. Rearranging the inequality,

$\sum_{j=1}^{m_i} (i + m_i - j)\left(\frac{i+1}{i}\right)^{j-1} f(S_i \cup \{c_j\}) \geq i\, f(O) + \sum_{j=1}^{m_i - 1} (i + m_i - j + 1)\left(\frac{i+1}{i}\right)^{j-1} f(S_i).$

By the greedy selection rule, we know that $f(S_{i+1}) \geq f(S_i \cup \{c_j\})$ for any $1 \leq j \leq m_i$; therefore we have

$\sum_{j=1}^{m_i} (i + m_i - j)\left(\frac{i+1}{i}\right)^{j-1} f(S_{i+1}) \geq i\, f(O) + \sum_{j=1}^{m_i - 1} (i + m_i - j + 1)\left(\frac{i+1}{i}\right)^{j-1} f(S_i).$

For ease of notation, we let

$a_i = \sum_{j=1}^{m_i} (i + m_i - j)\left(\frac{i+1}{i}\right)^{j-1}, \qquad b_i = \sum_{j=1}^{m_i - 1} (i + m_i - j + 1)\left(\frac{i+1}{i}\right)^{j-1}.$

We first simplify $a_i$ and $b_i$:

$a_i = \sum_{j=1}^{m_i} (i + m_i - j)\left(\frac{i+1}{i}\right)^{j-1} = (i + m_i)\sum_{j=1}^{m_i} \left(\frac{i+1}{i}\right)^{j-1} - \sum_{j=1}^{m_i} j\left(\frac{i+1}{i}\right)^{j-1}.$

By the two identities above, we have

$a_i = (i + m_i)\left[i\left(\frac{i+1}{i}\right)^{m_i} - i\right] - m_i i^2 \left(\frac{i+1}{i}\right)^{m_i + 1} + (m_i + 1)\, i^2 \left(\frac{i+1}{i}\right)^{m_i} - i^2$
$= \left[i^2 + i m_i - m_i(i^2 + i) + (m_i + 1)\, i^2\right]\left(\frac{i+1}{i}\right)^{m_i} - 2i^2 - i m_i$
$= 2 i^2 \left(\frac{i+1}{i}\right)^{m_i} - 2 i^2 - i m_i.$

Similarly, we have

$b_i = \sum_{j=1}^{m_i - 1} (i + m_i - j + 1)\left(\frac{i+1}{i}\right)^{j-1} = (i + m_i + 1)\sum_{j=1}^{m_i - 1}\left(\frac{i+1}{i}\right)^{j-1} - \sum_{j=1}^{m_i - 1} j\left(\frac{i+1}{i}\right)^{j-1}$
$= (i + m_i + 1)\left[i\left(\frac{i+1}{i}\right)^{m_i - 1} - i\right] - (m_i - 1)\, i^2 \left(\frac{i+1}{i}\right)^{m_i} + m_i i^2 \left(\frac{i+1}{i}\right)^{m_i - 1} - i^2$
$= \left[i^2 + i m_i + i - (m_i - 1)(i^2 + i) + m_i i^2\right]\left(\frac{i+1}{i}\right)^{m_i - 1} - 2i^2 - i m_i - i$
$= 2 i (i+1)\left(\frac{i+1}{i}\right)^{m_i - 1} - 2i^2 - i m_i - i$
$= 2 i^2 \left(\frac{i+1}{i}\right)^{m_i} - 2i^2 - i m_i - i.$

Now let

$a_i' = \sum_{j=1}^{p} (i + p - j)\left(\frac{i+1}{i}\right)^{j-1}, \qquad b_i' = \sum_{j=1}^{p-1} (i + p - j + 1)\left(\frac{i+1}{i}\right)^{j-1}.$

We have $a_i' - a_i = b_i' - b_i \geq 0$. Therefore,

$a_i' f(S_{i+1}) - b_i' f(S_i) = a_i f(S_{i+1}) - b_i f(S_i) + (a_i' - a_i)\left[f(S_{i+1}) - f(S_i)\right].$

Since $f(\cdot)$ is monotone, we have $f(S_{i+1}) - f(S_i) \geq 0$. Therefore,

$a_i' f(S_{i+1}) - b_i' f(S_i) \geq a_i f(S_{i+1}) - b_i f(S_i) \geq i\, f(O).$

Then we have the following set of inequalities:

$a_1' f(S_2) \geq 1 \cdot f(O) + b_1' f(S_1)$
$a_2' f(S_3) \geq 2 \cdot f(O) + b_2' f(S_2)$
$\vdots$
$a_{p-2}' f(S_{p-1}) \geq (p-2)\, f(O) + b_{p-2}' f(S_{p-2})$
$a_{p-1}' f(S_p) \geq (p-1)\, f(O) + b_{p-1}' f(S_{p-1}).$

Multiplying the $i$-th inequality by $\frac{\prod_{j=1}^{i-1} a_j'}{\prod_{j=2}^{i} b_j'}$, summing all of them up and ignoring the term $b_1' f(S_1)$, we have

$\frac{\prod_{j=1}^{p-1} a_j'}{\prod_{j=2}^{p-1} b_j'}\; f(S_p) \geq \sum_{i=1}^{p-1} i\, \frac{\prod_{j=1}^{i-1} a_j'}{\prod_{j=2}^{i} b_j'}\; f(O).$

Therefore the approximation ratio satisfies

$\frac{f(O)}{f(S_p)} \leq \frac{\prod_{j=1}^{p-1} a_j'}{\prod_{j=2}^{p-1} b_j'} \Bigg/ \sum_{i=1}^{p-1} i\, \frac{\prod_{j=1}^{i-1} a_j'}{\prod_{j=2}^{i} b_j'} = \left(\sum_{i=1}^{p-1} i\, \frac{\prod_{j=i+1}^{p-1} b_j'}{\prod_{j=i}^{p-1} a_j'}\right)^{-1} = \left(\sum_{i=1}^{p-1} \left[\frac{i}{a_i'} \prod_{j=i+1}^{p-1} \frac{b_j'}{a_j'}\right]\right)^{-1}.$

Note that the approximation ratio is simply a function of $p$, and it converges to 5.95 (a number obtained by a computer program) as $p$ tends to infinity. In particular, the approximation ratio is 3.74 when $p = 10$, and is 5.62 for a larger value of $p$.

Further Discussions

As discussed in Subsection 3.2.2, it is natural to consider the general matroid constraint for the problem of weakly submodular function maximization. For this more general problem, the greedy algorithm in the previous section no longer achieves any constant approximation ratio. We consider the following oblivious local search algorithm:

WEAKLY SUBMODULAR FUNCTION MAXIMIZATION WITH A MATROID CONSTRAINT

1: Let $S$ be a basis of $M$
2: while there exist $u \in U \setminus S$ and $v \in S$ such that $S \cup \{u\} \setminus \{v\} \in F$ and $f(S \cup \{u\} \setminus \{v\}) > f(S)$ do
3:   $S = S \cup \{u\} \setminus \{v\}$
4: end while
5: return $S$

Before we prove the theorem, we need to prove several lemmas. Let $O$ be the optimal solution, and $S$ the solution at the end of the local search algorithm. Let $s$ be the size of a basis; let $A = O \cap S$, $B = S \setminus A$ and $C = O \setminus A$. By Lemma 3.2.8, there is a bijective mapping $g : B \to C$ such that $S \cup \{b\} \setminus \{g(b)\} \in F$ for any $b \in B$. Let $B = \{b_1, b_2, \ldots, b_t\}$, and let $c_i = g(b_i)$ for all $i = 1, \ldots, t$. We reorder $b_1, b_2, \ldots, b_t$ in different ways. Let $b_1', b_2', \ldots, b_t'$ be an ordering such that the corresponding $c_1', c_2', \ldots, c_t'$ maximizes the sum $\sum_{i=1}^{t} (s - i)\left(\frac{s+1}{s}\right)^{i-1} f(S \cup \{c_i'\})$; and let $b_1'', b_2'', \ldots, b_t''$ be an ordering such that the corresponding $c_1'', c_2'', \ldots, c_t''$ minimizes the sum $\sum_{i=1}^{t} (s + t - i)\left(\frac{s+1}{s}\right)^{i-1} f(S \cup \{c_i''\})$.

Lemma. Given three non-increasing non-negative sequences $\alpha_1 \geq \alpha_2 \geq \cdots \geq \alpha_n \geq 0$, $\beta_1 \geq \beta_2 \geq \cdots \geq \beta_n \geq 0$ and $x_1 \geq x_2 \geq \cdots \geq x_n \geq 0$, we have

$\sum_{i=1}^{n} \alpha_i x_i \sum_{i=1}^{n} \beta_i \geq \sum_{i=1}^{n} \beta_i x_{n+1-i} \sum_{i=1}^{n} \alpha_i.$
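Before the proof, the inequality can be checked numerically on random non-increasing sequences; the sketch below is illustrative and not part of the thesis:

```python
import random

def lemma_holds(alpha, beta, x):
    """Check sum(alpha_i * x_i) * sum(beta_i)
       >= sum(beta_i * x_{n+1-i}) * sum(alpha_i)
    for non-increasing non-negative sequences alpha, beta, x."""
    n = len(x)
    lhs = sum(a * xi for a, xi in zip(alpha, x)) * sum(beta)
    rhs = sum(b * x[n - 1 - i] for i, b in enumerate(beta)) * sum(alpha)
    return lhs >= rhs - 1e-9  # small tolerance for floating-point error

random.seed(0)
for _ in range(1000):
    n = random.randint(1, 6)
    # Draw three random sequences and sort them into non-increasing order.
    alpha, beta, x = (sorted((random.random() for _ in range(n)), reverse=True)
                      for _ in range(3))
    assert lemma_holds(alpha, beta, x)
```

Note that the inequality pairs the $\beta_i$ with the $x_i$ in reversed order, which is what the randomized check exercises.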

Proof: Since the sequences $(\alpha_i)$ and $(x_i)$ are both non-increasing, they are similarly ordered, and a standard rearrangement argument (as in Chebyshev's sum inequality) gives

$n \sum_{i=1}^{n} \alpha_i x_i \geq \left(\sum_{i=1}^{n} \alpha_i\right)\left(\sum_{i=1}^{n} x_i\right).$

Similarly, since $(\beta_i)$ is non-increasing while $(x_{n+1-i})$ is non-decreasing in $i$, the two sequences are oppositely ordered, and we have

$n \sum_{i=1}^{n} \beta_i x_{n+1-i} \leq \left(\sum_{i=1}^{n} \beta_i\right)\left(\sum_{i=1}^{n} x_i\right).$

Multiplying the first inequality by $\sum_{i=1}^{n} \beta_i \geq 0$ and the second by $\sum_{i=1}^{n} \alpha_i \geq 0$, we obtain

$\sum_{i=1}^{n} \alpha_i x_i \sum_{i=1}^{n} \beta_i \geq \frac{1}{n}\sum_{i=1}^{n}\alpha_i \sum_{i=1}^{n} x_i \sum_{i=1}^{n}\beta_i \geq \sum_{i=1}^{n} \beta_i x_{n+1-i} \sum_{i=1}^{n} \alpha_i.$

Therefore the lemma follows.

Lemma. For any ordering of the pairs $(b_i, c_i)$,

$\sum_{i=1}^{t} (s - i)\left(\frac{s+1}{s}\right)^{i-1} f(S \cup \{c_i\}) \leq s f(S) + \sum_{i=1}^{t} (s + 1 - i)\left(\frac{s+1}{s}\right)^{i-1} f(S \cup \{c_i\} \setminus \{b_i\}) - (s+1)\left(\frac{s+1}{s}\right)^{t-1} f(S \setminus \{b_1, \ldots, b_t\}).$

Proof: By the definition of weak submodularity, we have

$s f(S) + s f(S \cup \{c_1\} \setminus \{b_1\}) \geq (s-1)\, f(S \cup \{c_1\}) + (s+1)\, f(S \setminus \{b_1\})$
$s f(S \setminus \{b_1\}) + (s-1)\, f(S \cup \{c_2\} \setminus \{b_2\}) \geq (s-2)\, f(S \cup \{c_2\}) + (s+1)\, f(S \setminus \{b_1, b_2\})$
$\vdots$
$s f(S \setminus \{b_1, \ldots, b_{t-1}\}) + (s - t + 1)\, f(S \cup \{c_t\} \setminus \{b_t\}) \geq (s - t)\, f(S \cup \{c_t\}) + (s+1)\, f(S \setminus \{b_1, \ldots, b_t\}).$

Multiplying the $i$-th inequality by $\left(\frac{s+1}{s}\right)^{i-1}$ and summing all of them up, we get

$s f(S) + \sum_{i=1}^{t} (s + 1 - i)\left(\frac{s+1}{s}\right)^{i-1} f(S \cup \{c_i\} \setminus \{b_i\}) \geq \sum_{i=1}^{t} (s - i)\left(\frac{s+1}{s}\right)^{i-1} f(S \cup \{c_i\}) + (s+1)\left(\frac{s+1}{s}\right)^{t-1} f(S \setminus \{b_1, \ldots, b_t\}).$

After rearranging the inequality, we get

$\sum_{i=1}^{t} (s - i)\left(\frac{s+1}{s}\right)^{i-1} f(S \cup \{c_i\}) \leq s f(S) + \sum_{i=1}^{t} (s + 1 - i)\left(\frac{s+1}{s}\right)^{i-1} f(S \cup \{c_i\} \setminus \{b_i\}) - (s+1)\left(\frac{s+1}{s}\right)^{t-1} f(S \setminus \{b_1, \ldots, b_t\}).$

Lemma. For any ordering of the pairs $(b_i, c_i)$,

$\sum_{i=1}^{t} (s + t - i)\left(\frac{s+1}{s}\right)^{i-1} f(S \cup \{c_i\}) \geq s f(S \cup \{c_1, \ldots, c_t\}) + \sum_{i=1}^{t} (s + t + 1 - i)\left(\frac{s+1}{s}\right)^{i-1} f(S) - (s+1)\left(\frac{s+1}{s}\right)^{t-1} f(S).$

Proof: By the definition of weak submodularity, we have

$(s + t - 1)\, f(S \cup \{c_1\}) + (s+1)\, f(S \cup \{c_2, \ldots, c_t\}) \geq s f(S \cup \{c_1, \ldots, c_t\}) + (s + t)\, f(S)$
$\vdots$
$(s + 1)\, f(S \cup \{c_{t-1}\}) + (s+1)\, f(S \cup \{c_t\}) \geq s f(S \cup \{c_{t-1}, c_t\}) + (s + 2)\, f(S)$
$s f(S \cup \{c_t\}) + (s+1)\, f(S) \geq s f(S \cup \{c_t\}) + (s + 1)\, f(S).$

Multiplying the $i$-th inequality by $\left(\frac{s+1}{s}\right)^{i-1}$ and summing all of them up, we have

$\sum_{i=1}^{t} (s + t - i)\left(\frac{s+1}{s}\right)^{i-1} f(S \cup \{c_i\}) + (s+1)\left(\frac{s+1}{s}\right)^{t-1} f(S) \geq s f(S \cup \{c_1, \ldots, c_t\}) + \sum_{i=1}^{t} (s + t + 1 - i)\left(\frac{s+1}{s}\right)^{i-1} f(S).$

Therefore, we have

$\sum_{i=1}^{t} (s + t - i)\left(\frac{s+1}{s}\right)^{i-1} f(S \cup \{c_i\}) \geq s f(S \cup \{c_1, \ldots, c_t\}) + \sum_{i=1}^{t} (s + t + 1 - i)\left(\frac{s+1}{s}\right)^{i-1} f(S) - (s+1)\left(\frac{s+1}{s}\right)^{t-1} f(S).$

Let

$A = \sum_{i=1}^{t} (s - i)\left(\frac{s+1}{s}\right)^{i-1}, \qquad B = \sum_{i=1}^{t} (s + 1 - i)\left(\frac{s+1}{s}\right)^{i-1},$
$C = \sum_{i=1}^{t} (s + t - i)\left(\frac{s+1}{s}\right)^{i-1}, \qquad D = \sum_{i=1}^{t} (s + t + 1 - i)\left(\frac{s+1}{s}\right)^{i-1}.$

Lemma. $\displaystyle C \sum_{i=1}^{t} (s - i)\left(\frac{s+1}{s}\right)^{i-1} f(S \cup \{c_i'\}) \geq A \sum_{i=1}^{t} (s + t - i)\left(\frac{s+1}{s}\right)^{i-1} f(S \cup \{c_i''\}).$

Proof: This is immediate by the lemma on non-increasing sequences above.

Theorem. Let $s$ be the size of a basis. The local search algorithm achieves an approximation ratio of at most 14.5 for an arbitrary $s$; the ratio is a function of $s$, takes its approximate worst-case value when $s = 6$, and converges as $s$ tends to infinity.

Proof: Since $S$ is a locally optimal solution, we have $f(S) \geq f(S \cup \{c_i\} \setminus \{b_i\})$ for every $i$. Since $f(S \setminus \{b_1, \ldots, b_t\}) \geq 0$, by the first lemma above (applied to the ordering $c_1', \ldots, c_t'$), we have

$\sum_{i=1}^{t} (s - i)\left(\frac{s+1}{s}\right)^{i-1} f(S \cup \{c_i'\}) \leq s f(S) + \sum_{i=1}^{t} (s + 1 - i)\left(\frac{s+1}{s}\right)^{i-1} f(S).$

Therefore,

$\sum_{i=1}^{t} (s - i)\left(\frac{s+1}{s}\right)^{i-1} f(S \cup \{c_i'\}) \leq (s + B)\, f(S).$

On the other hand, we have $O \subseteq S \cup \{c_1, \ldots, c_t\}$; by monotonicity, we have $f(O) \leq f(S \cup \{c_1, \ldots, c_t\})$. By the second lemma above (applied to the ordering $c_1'', \ldots, c_t''$), we have

$\sum_{i=1}^{t} (s + t - i)\left(\frac{s+1}{s}\right)^{i-1} f(S \cup \{c_i''\}) \geq s f(O) + \left[D - (s+1)\left(\frac{s+1}{s}\right)^{t-1}\right] f(S).$

By the comparison lemma above, we have

$C \sum_{i=1}^{t} (s - i)\left(\frac{s+1}{s}\right)^{i-1} f(S \cup \{c_i'\}) \geq A \sum_{i=1}^{t} (s + t - i)\left(\frac{s+1}{s}\right)^{i-1} f(S \cup \{c_i''\}).$

Therefore,

$C (s + B)\, f(S) \geq A s\, f(O) + A\left[D - (s+1)\left(\frac{s+1}{s}\right)^{t-1}\right] f(S).$

Hence the approximation ratio satisfies

$\frac{f(O)}{f(S)} \leq \frac{C B - A D + C s + A(s+1)\left(\frac{s+1}{s}\right)^{t-1}}{A s} = \frac{C B - A D + C s}{A s} + \left(\frac{s+1}{s}\right)^{t}.$

Simplifying the notation, we have

$\frac{f(O)}{f(S)} \leq \frac{\sum_{i=1}^{t} (s^2 + st + ti - si)\left(\frac{s+1}{s}\right)^{i-1} + \sum_{i=t+1}^{2t-1} t(2t - i)\left(\frac{s+1}{s}\right)^{i-1}}{\sum_{i=1}^{t} s(s - i)\left(\frac{s+1}{s}\right)^{i-1}} + \left(\frac{s+1}{s}\right)^{t}.$

The expression is monotonically increasing with $t$ and is bounded from above by 14.5 for $s > 1$; this bound is obtained by a computer program. In particular, it takes its approximate worst-case value when $s = 6$, and the ratio converges as $s$ tends to infinity.
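To close the chapter's discussion with something executable, the sketch below (illustrative code and instance, not from the thesis) runs the cardinality-constrained greedy analysed earlier on a small average non-negative segmentation function:

```python
def sigma(M, S):
    """Segmentation function: for each column j, the largest entry M[i][j]
    over the chosen rows i in S, summed over all columns."""
    if not S:
        return 0
    return sum(max(M[i][j] for i in S) for j in range(len(M[0])))

def greedy(f, m, p):
    """Greedy for max f(S) s.t. |S| <= p: repeatedly add the row index
    with the largest marginal gain f(S + u) - f(S)."""
    S = set()
    while len(S) < min(p, m):
        u = max((i for i in range(m) if i not in S),
                key=lambda i: f(S | {i}) - f(S))
        S.add(u)
    return S

# Every row sums to a non-negative value, so sigma is average non-negative
# and hence weakly submodular by the earlier proposition.
M = [[3, -1, 2],
     [-2, 4, -1],
     [1, 1, 1],
     [0, 0, 0]]
assert all(sum(row) >= 0 for row in M)
S = greedy(lambda T: sigma(M, T), len(M), 2)
assert S == {0, 1} and sigma(M, S) == 9
```

On this instance the greedy first picks row 0 (marginal gain 4) and then row 1 (marginal gain 5); the columnwise maxima of the chosen rows are 3, 4 and 2.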

Chapter 4

Sum Colouring - A Case Study of Greedy Algorithms

In this chapter, we study greedy algorithms through a particular problem: the sum colouring problem. We focus on the class of d-claw-free graphs and its subclasses, proving NP-hardness and giving greedy approximation algorithms for the problem. Finally, we derive inapproximability lower bounds for the sum colouring problem on restricted families of graphs using the priority framework developed in [12].

4.1 Introduction

The sum colouring problem (SC), also known as the chromatic sum problem, was formally introduced in [62]. For a given graph $G = (V, E)$, a proper colouring of $G$ is an assignment of positive integers to its vertices, $\varphi : V \to \mathbb{Z}^{+}$, such that no two adjacent vertices are assigned the same colour. The sum colouring problem seeks a proper colouring such that the sum of colours over all vertices, $\sum_{v \in V} \varphi(v)$, is minimized. This minimum sum is called the chromatic sum of the graph $G$. Sum colouring has many applications in job scheduling and resource allocation. For example, consider an instance of job scheduling in which one is given a set of jobs $S$, each requiring unit execution time. One can view this

instance in a graph-theoretic sense: we construct a graph $G$ whose vertex set is in one-to-one correspondence with the set of input jobs $S$, and an edge exists between two vertices if and only if the corresponding jobs conflict for resources. In other words, we consider the underlying conflict graph $G$ of the job scheduling instance. Finding the chromatic sum of $G$ corresponds to minimizing the average job completion time.

The sum colouring problem has been studied extensively in the literature. The problem is NP-hard for general graphs [62], and cannot be approximated within $n^{1-\epsilon}$ for any constant $\epsilon > 0$ unless ZPP = NP [5][32]. Note that an optimal colouring of a graph does not necessarily yield an optimal sum colouring for this graph. Consider a graph $G$ and an optimal sum colouring of $G$ in Fig. 4.1: it uses three colours, while the chromatic number of $G$ is two.

Figure 4.1: An optimal sum colouring of G

In fact, the gap between the chromatic number and the number of colours used in an optimal sum colouring can be made arbitrarily large, even for the case of trees [62]. The sum colouring problem is polynomial time solvable for proper interval graphs [74] and trees [62]. However, the problem is APX-hard for both bipartite graphs [7] and interval graphs [70], which is a little surprising given that many NP-hard problems are solvable in polynomial time for these two classes. Constant-factor approximation algorithms are known for interval graphs [43] and for bipartite graphs [37].

In this chapter, we focus on the class of d-claw-free graphs and its subclasses. Recall that a graph is d-claw-free if every vertex has fewer than d independent neighbours. The class of

d-claw-free graphs is exactly the class $\hat{G}(IS_{d-1})$ discussed in Chapter 2. Here we give subclasses of d-claw-free graphs in addition to those given earlier. All these subclasses fall into the category of geometric intersection graphs defined earlier.

1. Unit Interval Graphs: The vertices are unit intervals on the real line, and two vertices are adjacent if and only if the two corresponding intervals overlap; see Fig. 4.2a.

2. Proper Interval Graphs: The vertices are intervals on the real line such that no interval is properly contained in another interval. Two vertices are adjacent if and only if the two corresponding intervals overlap; see Fig. 4.2b. It is known that the class of proper interval graphs and the class of unit interval graphs coincide [80]. Furthermore, a geometric representation of a proper interval graph can be transformed to a geometric representation of a unit interval graph in polynomial time using only expansion and contraction of intervals [9].

Figure 4.2: (a) A unit interval graph. (b) A proper interval graph.

3. Unit Square Graphs: The vertices are axis-parallel unit squares in a two-dimensional plane, and two vertices are adjacent if and only if the two corresponding squares overlap; see Fig. 4.3a. (Note that here we do not allow unit squares to rotate. For the rest of this chapter, whenever we say unit squares, we mean axis-parallel unit squares.)

4. Proper Intersection Graphs of Axis-Parallel Rectangles: The vertices are axis-parallel rectangles in a two-dimensional plane such that the projection of any rectangle onto either

the x-axis or the y-axis is not properly contained in that of another rectangle. Two vertices are adjacent if and only if the two corresponding rectangles intersect; see Fig. 4.3b.

Figure 4.3: (a) A unit square graph. (b) A proper intersection graph of axis-parallel rectangles.

5. Unit Disk Graphs: The vertices are unit disks in a two-dimensional plane, and two vertices are adjacent if and only if the two corresponding disks overlap; see Fig. 4.4a.

6. Penny Graphs: The vertices are unit disks in a two-dimensional plane that do not share a common interior point, and two vertices are adjacent if and only if the two corresponding disks touch each other at the boundary; see Fig. 4.4b.

Figure 4.4: (a) A unit disk graph. (b) A penny graph.
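These geometric classes, and the sum colouring objective from Section 4.1, can both be made concrete in a few lines. The sketch below (illustrative code and instance, not from the thesis) builds the intersection graph of unit disks from their centres, counting tangency as adjacency as in penny graphs, and computes the colour sum of a simple first-fit colouring:

```python
from math import hypot
from itertools import combinations

def unit_disk_graph(centers):
    """Adjacency lists of the intersection graph of radius-1 disks:
    two disks are adjacent when their centres are within distance 2
    (tangency counts, as in penny graphs)."""
    adj = {i: [] for i in range(len(centers))}
    for (i, p), (j, q) in combinations(list(enumerate(centers)), 2):
        if hypot(p[0] - q[0], p[1] - q[1]) <= 2:
            adj[i].append(j)
            adj[j].append(i)
    return adj

def first_fit_sum(adj, order):
    """First-fit greedy colouring in the given vertex order; returns the
    colouring and its colour sum (the quantity sum colouring minimizes)."""
    colour = {}
    for v in order:
        used = {colour[u] for u in adj[v] if u in colour}
        c = 1
        while c in used:
            c += 1
        colour[v] = c
    return colour, sum(colour.values())

# Three pennies in a row form a path on three vertices.
adj = unit_disk_graph([(0, 0), (2, 0), (4, 0)])
assert adj == {0: [1], 1: [0, 2], 2: [1]}
colouring, total = first_fit_sum(adj, [0, 2, 1])
assert total == 4   # colours 1, 2, 1 along the path 0-1-2
```

Note that the colour sum produced by first-fit depends on the vertex order: colouring vertex 1 first on the same path yields a sum of 5 instead of the optimal 4.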

It is not hard to see that unit interval graphs and proper interval graphs are 3-claw-free; unit square graphs and proper intersection graphs of axis-parallel rectangles are 5-claw-free; and unit disk graphs and penny graphs are 6-claw-free. We first show that the class of proper intersection graphs of axis-parallel rectangles and the class of unit square graphs coincide.

Theorem. The class of proper intersection graphs of axis-parallel rectangles is the same as the class of unit square graphs. Furthermore, a geometric representation of a proper intersection graph of axis-parallel rectangles can be transformed to a geometric representation of a unit square graph in polynomial time.

Proof: It is clear that unit square graphs are contained in the class of proper intersection graphs of axis-parallel rectangles, so we only need to show the reverse direction. Given a geometric representation of a proper intersection graph of axis-parallel rectangles, its projection onto each axis is a proper interval representation. By applying to both the x-axis and the y-axis the transformation given in [9], which converts a proper interval representation to a unit interval representation using only expansion and contraction of intervals, a geometric representation of a unit square graph can be constructed in polynomial time. Therefore, the two classes coincide.

4.2 NP-Hardness for Penny Graphs

In this section, we show that sum colouring is NP-hard for penny graphs. The reduction combines ideas in [16] and [44], and reduces from the maximum independent set problem on planar graphs with maximum degree 3. First, we make use of the following observation from Valiant [85].

Lemma [85]. A planar graph $G$ with maximum degree 3 can be embedded in the plane using $O(|V|^2)$ units of area in such a way that its vertices are at integer coordinates and its

edges are drawn so that they are made up of line segments of the form $x = i$ or $y = j$, for integers $i$ and $j$.

Given a planar graph $G$ with maximum degree 3, we first apply the lemma above to draw its embedding onto integer coordinates. Without loss of generality, we assume those coordinates are multiples of 8 units. We replace each vertex with a filled unit disk, and we replace each edge $uv$ with $l_{uv}$ tangent hollow unit disks, where $l_{uv}$ is the Manhattan distance between $u$ and $v$. We call the resulting penny graph $G'$; see Figure 4.5. Note that there are three types of adjacent pairs of unit disks. A corner pair refers to two adjacent disks such that one of them is at a corner; an uneven pair refers to two adjacent disks such that the centre of at least one of them does not lie on the grid; the rest of the pairs are straight pairs. It is not hard to observe the following relationship between the sizes of the maximum independent sets of the two graphs.

Figure 4.5: Transformation from planar graphs with maximum degree 3 to penny graphs

Lemma. Let $\alpha(\cdot)$ denote the size of a maximum independent set. Then

$\alpha(G') = \alpha(G) + \sum_{uv \in E} \frac{l_{uv}}{2}.$

Proof: We first show that $\alpha(G')$ is at least $\alpha(G) + \sum_{uv \in E} \frac{l_{uv}}{2}$. Given a maximum independent set $I$ of $G$, for any edge $uv$, at least one of $u$ and $v$ is not in $I$; hence we can add


More information

Graphs and Discrete Structures

Graphs and Discrete Structures Graphs and Discrete Structures Nicolas Bousquet Louis Esperet Fall 2018 Abstract Brief summary of the first and second course. É 1 Chromatic number, independence number and clique number The chromatic

More information

An Introduction to Chromatic Polynomials

An Introduction to Chromatic Polynomials An Introduction to Chromatic Polynomials Julie Zhang May 17, 2018 Abstract This paper will provide an introduction to chromatic polynomials. We will first define chromatic polynomials and related terms,

More information

Greedy algorithms is another useful way for solving optimization problems.

Greedy algorithms is another useful way for solving optimization problems. Greedy Algorithms Greedy algorithms is another useful way for solving optimization problems. Optimization Problems For the given input, we are seeking solutions that must satisfy certain conditions. These

More information

On the Max Coloring Problem

On the Max Coloring Problem On the Max Coloring Problem Leah Epstein Asaf Levin May 22, 2010 Abstract We consider max coloring on hereditary graph classes. The problem is defined as follows. Given a graph G = (V, E) and positive

More information

3 No-Wait Job Shops with Variable Processing Times

3 No-Wait Job Shops with Variable Processing Times 3 No-Wait Job Shops with Variable Processing Times In this chapter we assume that, on top of the classical no-wait job shop setting, we are given a set of processing times for each operation. We may select

More information

Problem Set 3. MATH 776, Fall 2009, Mohr. November 30, 2009

Problem Set 3. MATH 776, Fall 2009, Mohr. November 30, 2009 Problem Set 3 MATH 776, Fall 009, Mohr November 30, 009 1 Problem Proposition 1.1. Adding a new edge to a maximal planar graph of order at least 6 always produces both a T K 5 and a T K 3,3 subgraph. Proof.

More information

5 MST and Greedy Algorithms

5 MST and Greedy Algorithms 5 MST and Greedy Algorithms One of the traditional and practically motivated problems of discrete optimization asks for a minimal interconnection of a given set of terminals (meaning that every pair will

More information

CSE 431/531: Algorithm Analysis and Design (Spring 2018) Greedy Algorithms. Lecturer: Shi Li

CSE 431/531: Algorithm Analysis and Design (Spring 2018) Greedy Algorithms. Lecturer: Shi Li CSE 431/531: Algorithm Analysis and Design (Spring 2018) Greedy Algorithms Lecturer: Shi Li Department of Computer Science and Engineering University at Buffalo Main Goal of Algorithm Design Design fast

More information

CSC2420 Spring 2015: Lecture 2

CSC2420 Spring 2015: Lecture 2 CSC2420 Spring 2015: Lecture 2 Allan Borodin January 15,2015 1 / 1 Announcements and todays agenda First part of assignment 1 was posted last weekend. I plan to assign one or two more questions. Today

More information

Partitions and Packings of Complete Geometric Graphs with Plane Spanning Double Stars and Paths

Partitions and Packings of Complete Geometric Graphs with Plane Spanning Double Stars and Paths Partitions and Packings of Complete Geometric Graphs with Plane Spanning Double Stars and Paths Master Thesis Patrick Schnider July 25, 2015 Advisors: Prof. Dr. Emo Welzl, Manuel Wettstein Department of

More information

Approximation Algorithms

Approximation Algorithms Approximation Algorithms Given an NP-hard problem, what should be done? Theory says you're unlikely to find a poly-time algorithm. Must sacrifice one of three desired features. Solve problem to optimality.

More information

Advanced Algorithms Class Notes for Monday, October 23, 2012 Min Ye, Mingfu Shao, and Bernard Moret

Advanced Algorithms Class Notes for Monday, October 23, 2012 Min Ye, Mingfu Shao, and Bernard Moret Advanced Algorithms Class Notes for Monday, October 23, 2012 Min Ye, Mingfu Shao, and Bernard Moret Greedy Algorithms (continued) The best known application where the greedy algorithm is optimal is surely

More information

Fundamental Properties of Graphs

Fundamental Properties of Graphs Chapter three In many real-life situations we need to know how robust a graph that represents a certain network is, how edges or vertices can be removed without completely destroying the overall connectivity,

More information

A Note on Vertex Arboricity of Toroidal Graphs without 7-Cycles 1

A Note on Vertex Arboricity of Toroidal Graphs without 7-Cycles 1 International Mathematical Forum, Vol. 11, 016, no. 14, 679-686 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.1988/imf.016.667 A Note on Vertex Arboricity of Toroidal Graphs without 7-Cycles 1 Haihui

More information

5 MST and Greedy Algorithms

5 MST and Greedy Algorithms 5 MST and Greedy Algorithms One of the traditional and practically motivated problems of discrete optimization asks for a minimal interconnection of a given set of terminals (meaning that every pair will

More information

Pebble Sets in Convex Polygons

Pebble Sets in Convex Polygons 2 1 Pebble Sets in Convex Polygons Kevin Iga, Randall Maddox June 15, 2005 Abstract Lukács and András posed the problem of showing the existence of a set of n 2 points in the interior of a convex n-gon

More information

CS 580: Algorithm Design and Analysis. Jeremiah Blocki Purdue University Spring 2018

CS 580: Algorithm Design and Analysis. Jeremiah Blocki Purdue University Spring 2018 CS 580: Algorithm Design and Analysis Jeremiah Blocki Purdue University Spring 2018 Chapter 11 Approximation Algorithms Slides by Kevin Wayne. Copyright @ 2005 Pearson-Addison Wesley. All rights reserved.

More information

CMSC Honors Discrete Mathematics

CMSC Honors Discrete Mathematics CMSC 27130 Honors Discrete Mathematics Lectures by Alexander Razborov Notes by Justin Lubin The University of Chicago, Autumn 2017 1 Contents I Number Theory 4 1 The Euclidean Algorithm 4 2 Mathematical

More information

The Structure and Properties of Clique Graphs of Regular Graphs

The Structure and Properties of Clique Graphs of Regular Graphs The University of Southern Mississippi The Aquila Digital Community Master's Theses 1-014 The Structure and Properties of Clique Graphs of Regular Graphs Jan Burmeister University of Southern Mississippi

More information

Faster parameterized algorithms for Minimum Fill-In

Faster parameterized algorithms for Minimum Fill-In Faster parameterized algorithms for Minimum Fill-In Hans L. Bodlaender Pinar Heggernes Yngve Villanger Technical Report UU-CS-2008-042 December 2008 Department of Information and Computing Sciences Utrecht

More information

Graphs and trees come up everywhere. We can view the internet as a graph (in many ways) Web search views web pages as a graph

Graphs and trees come up everywhere. We can view the internet as a graph (in many ways) Web search views web pages as a graph Graphs and Trees Graphs and trees come up everywhere. We can view the internet as a graph (in many ways) who is connected to whom Web search views web pages as a graph Who points to whom Niche graphs (Ecology):

More information

Conflict Graphs for Combinatorial Optimization Problems

Conflict Graphs for Combinatorial Optimization Problems Conflict Graphs for Combinatorial Optimization Problems Ulrich Pferschy joint work with Andreas Darmann and Joachim Schauer University of Graz, Austria Introduction Combinatorial Optimization Problem CO

More information

Discrete mathematics II. - Graphs

Discrete mathematics II. - Graphs Emil Vatai April 25, 2018 Basic definitions Definition of an undirected graph Definition (Undirected graph) An undirected graph or (just) a graph is a triplet G = (ϕ, E, V ), where V is the set of vertices,

More information

Unit 8: Coping with NP-Completeness. Complexity classes Reducibility and NP-completeness proofs Coping with NP-complete problems. Y.-W.

Unit 8: Coping with NP-Completeness. Complexity classes Reducibility and NP-completeness proofs Coping with NP-complete problems. Y.-W. : Coping with NP-Completeness Course contents: Complexity classes Reducibility and NP-completeness proofs Coping with NP-complete problems Reading: Chapter 34 Chapter 35.1, 35.2 Y.-W. Chang 1 Complexity

More information

Vertex-Colouring Edge-Weightings

Vertex-Colouring Edge-Weightings Vertex-Colouring Edge-Weightings L. Addario-Berry a, K. Dalal a, C. McDiarmid b, B. A. Reed a and A. Thomason c a School of Computer Science, McGill University, University St. Montreal, QC, H3A A7, Canada

More information

val(y, I) α (9.0.2) α (9.0.3)

val(y, I) α (9.0.2) α (9.0.3) CS787: Advanced Algorithms Lecture 9: Approximation Algorithms In this lecture we will discuss some NP-complete optimization problems and give algorithms for solving them that produce a nearly optimal,

More information

Assignment 4 Solutions of graph problems

Assignment 4 Solutions of graph problems Assignment 4 Solutions of graph problems 1. Let us assume that G is not a cycle. Consider the maximal path in the graph. Let the end points of the path be denoted as v 1, v k respectively. If either of

More information

Weak Dynamic Coloring of Planar Graphs

Weak Dynamic Coloring of Planar Graphs Weak Dynamic Coloring of Planar Graphs Caroline Accurso 1,5, Vitaliy Chernyshov 2,5, Leaha Hand 3,5, Sogol Jahanbekam 2,4,5, and Paul Wenger 2 Abstract The k-weak-dynamic number of a graph G is the smallest

More information

In this lecture, we ll look at applications of duality to three problems:

In this lecture, we ll look at applications of duality to three problems: Lecture 7 Duality Applications (Part II) In this lecture, we ll look at applications of duality to three problems: 1. Finding maximum spanning trees (MST). We know that Kruskal s algorithm finds this,

More information

Graph Theory. Connectivity, Coloring, Matching. Arjun Suresh 1. 1 GATE Overflow

Graph Theory. Connectivity, Coloring, Matching. Arjun Suresh 1. 1 GATE Overflow Graph Theory Connectivity, Coloring, Matching Arjun Suresh 1 1 GATE Overflow GO Classroom, August 2018 Thanks to Subarna/Sukanya Das for wonderful figures Arjun, Suresh (GO) Graph Theory GATE 2019 1 /

More information

11. APPROXIMATION ALGORITHMS

11. APPROXIMATION ALGORITHMS 11. APPROXIMATION ALGORITHMS load balancing center selection pricing method: vertex cover LP rounding: vertex cover generalized load balancing knapsack problem Lecture slides by Kevin Wayne Copyright 2005

More information

Chapter 6 GRAPH COLORING

Chapter 6 GRAPH COLORING Chapter 6 GRAPH COLORING A k-coloring of (the vertex set of) a graph G is a function c : V (G) {1, 2,..., k} such that c (u) 6= c (v) whenever u is adjacent to v. Ifak-coloring of G exists, then G is called

More information

1 Bipartite maximum matching

1 Bipartite maximum matching Cornell University, Fall 2017 Lecture notes: Matchings CS 6820: Algorithms 23 Aug 1 Sep These notes analyze algorithms for optimization problems involving matchings in bipartite graphs. Matching algorithms

More information

Lecture 8: The Traveling Salesman Problem

Lecture 8: The Traveling Salesman Problem Lecture 8: The Traveling Salesman Problem Let G = (V, E) be an undirected graph. A Hamiltonian cycle of G is a cycle that visits every vertex v V exactly once. Instead of Hamiltonian cycle, we sometimes

More information

Artificial Intelligence

Artificial Intelligence Artificial Intelligence Graph theory G. Guérard Department of Nouvelles Energies Ecole Supérieur d Ingénieurs Léonard de Vinci Lecture 1 GG A.I. 1/37 Outline 1 Graph theory Undirected and directed graphs

More information

12.1 Formulation of General Perfect Matching

12.1 Formulation of General Perfect Matching CSC5160: Combinatorial Optimization and Approximation Algorithms Topic: Perfect Matching Polytope Date: 22/02/2008 Lecturer: Lap Chi Lau Scribe: Yuk Hei Chan, Ling Ding and Xiaobing Wu In this lecture,

More information

1. Lecture notes on bipartite matching

1. Lecture notes on bipartite matching Massachusetts Institute of Technology 18.453: Combinatorial Optimization Michel X. Goemans February 5, 2017 1. Lecture notes on bipartite matching Matching problems are among the fundamental problems in

More information

Subdivisions of Graphs: A Generalization of Paths and Cycles

Subdivisions of Graphs: A Generalization of Paths and Cycles Subdivisions of Graphs: A Generalization of Paths and Cycles Ch. Sobhan Babu and Ajit A. Diwan Department of Computer Science and Engineering, Indian Institute of Technology Bombay, Powai, Mumbai 400076,

More information

MATH 350 GRAPH THEORY & COMBINATORICS. Contents

MATH 350 GRAPH THEORY & COMBINATORICS. Contents MATH 350 GRAPH THEORY & COMBINATORICS PROF. SERGEY NORIN, FALL 2013 Contents 1. Basic definitions 1 2. Connectivity 2 3. Trees 3 4. Spanning Trees 3 5. Shortest paths 4 6. Eulerian & Hamiltonian cycles

More information

Copyright 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin Introduction to the Design & Analysis of Algorithms, 2 nd ed., Ch.

Copyright 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin Introduction to the Design & Analysis of Algorithms, 2 nd ed., Ch. Iterative Improvement Algorithm design technique for solving optimization problems Start with a feasible solution Repeat the following step until no improvement can be found: change the current feasible

More information

The Ordered Covering Problem

The Ordered Covering Problem The Ordered Covering Problem Uriel Feige Yael Hitron November 8, 2016 Abstract We introduce the Ordered Covering (OC) problem. The input is a finite set of n elements X, a color function c : X {0, 1} and

More information

Definition: A graph G = (V, E) is called a tree if G is connected and acyclic. The following theorem captures many important facts about trees.

Definition: A graph G = (V, E) is called a tree if G is connected and acyclic. The following theorem captures many important facts about trees. Tree 1. Trees and their Properties. Spanning trees 3. Minimum Spanning Trees 4. Applications of Minimum Spanning Trees 5. Minimum Spanning Tree Algorithms 1.1 Properties of Trees: Definition: A graph G

More information

Solutions for the Exam 6 January 2014

Solutions for the Exam 6 January 2014 Mastermath and LNMB Course: Discrete Optimization Solutions for the Exam 6 January 2014 Utrecht University, Educatorium, 13:30 16:30 The examination lasts 3 hours. Grading will be done before January 20,

More information

Introduction III. Graphs. Motivations I. Introduction IV

Introduction III. Graphs. Motivations I. Introduction IV Introduction I Graphs Computer Science & Engineering 235: Discrete Mathematics Christopher M. Bourke cbourke@cse.unl.edu Graph theory was introduced in the 18th century by Leonhard Euler via the Königsberg

More information

Computing minimum distortion embeddings into a path for bipartite permutation graphs and threshold graphs

Computing minimum distortion embeddings into a path for bipartite permutation graphs and threshold graphs Computing minimum distortion embeddings into a path for bipartite permutation graphs and threshold graphs Pinar Heggernes Daniel Meister Andrzej Proskurowski Abstract The problem of computing minimum distortion

More information

CS261: Problem Set #1

CS261: Problem Set #1 CS261: Problem Set #1 Due by 11:59 PM on Tuesday, April 21, 2015 Instructions: (1) Form a group of 1-3 students. You should turn in only one write-up for your entire group. (2) Turn in your solutions by

More information

The strong chromatic number of a graph

The strong chromatic number of a graph The strong chromatic number of a graph Noga Alon Abstract It is shown that there is an absolute constant c with the following property: For any two graphs G 1 = (V, E 1 ) and G 2 = (V, E 2 ) on the same

More information

On Covering a Graph Optimally with Induced Subgraphs

On Covering a Graph Optimally with Induced Subgraphs On Covering a Graph Optimally with Induced Subgraphs Shripad Thite April 1, 006 Abstract We consider the problem of covering a graph with a given number of induced subgraphs so that the maximum number

More information

Algorithm Design and Analysis

Algorithm Design and Analysis Algorithm Design and Analysis LECTURE 29 Approximation Algorithms Load Balancing Weighted Vertex Cover Reminder: Fill out SRTEs online Don t forget to click submit Sofya Raskhodnikova 12/7/2016 Approximation

More information

END-TERM EXAMINATION

END-TERM EXAMINATION (Please Write your Exam Roll No. immediately) Exam. Roll No... END-TERM EXAMINATION Paper Code : MCA-205 DECEMBER 2006 Subject: Design and analysis of algorithm Time: 3 Hours Maximum Marks: 60 Note: Attempt

More information

LECTURES 3 and 4: Flows and Matchings

LECTURES 3 and 4: Flows and Matchings LECTURES 3 and 4: Flows and Matchings 1 Max Flow MAX FLOW (SP). Instance: Directed graph N = (V,A), two nodes s,t V, and capacities on the arcs c : A R +. A flow is a set of numbers on the arcs such that

More information

Extremal Graph Theory: Turán s Theorem

Extremal Graph Theory: Turán s Theorem Bridgewater State University Virtual Commons - Bridgewater State University Honors Program Theses and Projects Undergraduate Honors Program 5-9-07 Extremal Graph Theory: Turán s Theorem Vincent Vascimini

More information

Thomas H. Cormen Charles E. Leiserson Ronald L. Rivest. Introduction to Algorithms

Thomas H. Cormen Charles E. Leiserson Ronald L. Rivest. Introduction to Algorithms Thomas H. Cormen Charles E. Leiserson Ronald L. Rivest Introduction to Algorithms Preface xiii 1 Introduction 1 1.1 Algorithms 1 1.2 Analyzing algorithms 6 1.3 Designing algorithms 1 1 1.4 Summary 1 6

More information

Interaction Between Input and Output-Sensitive

Interaction Between Input and Output-Sensitive Interaction Between Input and Output-Sensitive Really? Mamadou M. Kanté Université Blaise Pascal - LIMOS, CNRS Enumeration Algorithms Using Structure, Lorentz Institute, August 26 th, 2015 1 Introduction

More information

Matchings in Graphs. Definition 1 Let G = (V, E) be a graph. M E is called as a matching of G if v V we have {e M : v is incident on e E} 1.

Matchings in Graphs. Definition 1 Let G = (V, E) be a graph. M E is called as a matching of G if v V we have {e M : v is incident on e E} 1. Lecturer: Scribe: Meena Mahajan Rajesh Chitnis Matchings in Graphs Meeting: 1 6th Jan 010 Most of the material in this lecture is taken from the book Fast Parallel Algorithms for Graph Matching Problems

More information

The Probabilistic Method

The Probabilistic Method The Probabilistic Method Po-Shen Loh June 2010 1 Warm-up 1. (Russia 1996/4 In the Duma there are 1600 delegates, who have formed 16000 committees of 80 persons each. Prove that one can find two committees

More information

CS 341: Algorithms. Douglas R. Stinson. David R. Cheriton School of Computer Science University of Waterloo. February 26, 2019

CS 341: Algorithms. Douglas R. Stinson. David R. Cheriton School of Computer Science University of Waterloo. February 26, 2019 CS 341: Algorithms Douglas R. Stinson David R. Cheriton School of Computer Science University of Waterloo February 26, 2019 D.R. Stinson (SCS) CS 341 February 26, 2019 1 / 296 1 Course Information 2 Introduction

More information

Line Graphs and Circulants

Line Graphs and Circulants Line Graphs and Circulants Jason Brown and Richard Hoshino Department of Mathematics and Statistics Dalhousie University Halifax, Nova Scotia, Canada B3H 3J5 Abstract The line graph of G, denoted L(G),

More information