Analyzing a Greedy Approximation of an MDL Summarization

Peter Fontana
fontanap@seas.upenn.edu
Faculty Advisor: Dr. Sudipto Guha
April 10, 2007

Abstract

Many OLAP (On-line Analytical Processing) applications have produced data cubes that summarize and aggregate the details of data queries. These data cubes are multi-dimensional matrices in which each cell that satisfies a specific property or trait is represented as a 1, notated as a 1-cell in this report. A cell that does not satisfy that property is represented as a 0, notated as a 0-cell in this report. In order to compress the amount of space required to represent such a matrix completely, others have used MDL (Minimum Description Length) Summarization, including MDL Summarization with Holes. While it is NP-Hard to compute the optimal MDL Summarization with Holes for a data matrix of 2 or more dimensions (proven by Bu et al. [1]), there exists a greedy algorithm to approximate the MDL Summarization with Holes, proven to give an answer within a factor of l_m log(M) of the optimal solution (proven by Guha and Tan [3]), where M is the size of the data matrix. See the Technical Approach section of this report for a definition of l_m. However, Guha and Tan [3] mention that this bound has not been proven tight. I studied this question for 2-dimensional matrices where the algorithm can only compress by covering rows and columns (here l_m = 2). Currently, I have a proof that the greedy algorithm is a 4-approximation algorithm in this special 2-dimensional case and a constant-factor (2(κ − 2))-approximation algorithm in the general case. Furthermore, I have written a program that uses the greedy approximation to MDL Summarize with Holes an arbitrary n-by-n 2-dimensional matrix of 1's and 0's.

Related Work

Currently, OLAP (On-line Analytical Processing) database applications exist and have very powerful data processing abilities, including the ability to present data at many varying levels of detail. OLAP applications aggregate the detail of the data by performing rollup operations, which take a current data sheet and produce a higher-level data sheet by grouping data cells together or classifying data cells at a higher level. Rollups are further described in [2]. Sathe and Sarawagi [6] have developed intelligent methods of performing rollup operations. With these intelligent rollup operations, data queries can be abstracted to the level of detail where every cell is classified by whether or not it satisfies a given property. All cells that satisfy the property will be represented by a 1 (1-cells) and all other cells will be represented by a 0 (0-cells).

For example, let a data matrix consist of rows representing companies, columns representing cities, and cells (i, j) containing the revenue that company i makes in city j. Performing a rollup could produce a new matrix of the same size where cell (i, j) is 1 in the new matrix if the revenue in cell (i, j) of the data matrix was $10,000 or more, and 0 otherwise. This new matrix is an abstraction that describes all the (company, city) pairs where company i made at least $10,000 in city j.
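To make the rollup in this example concrete, here is a minimal C sketch of the thresholding step. It is my own illustration (the matrix sizes, array names, and revenue figures are invented for the example), not code from the report or from [6].

#include <stdio.h>

#define COMPANIES 3
#define CITIES    4
#define THRESHOLD 10000.0  /* the $10,000 cutoff used in the example above */

int main(void) {
    /* revenue[i][j]: revenue that company i makes in city j (invented demo numbers) */
    double revenue[COMPANIES][CITIES] = {
        { 12000.0,  3000.0, 25000.0,  9999.0 },
        {   500.0, 18000.0,  7000.0, 10000.0 },
        { 30000.0,   250.0,  4000.0, 11000.0 },
    };
    int cell[COMPANIES][CITIES];  /* the rolled-up matrix of 1-cells and 0-cells */

    for (int i = 0; i < COMPANIES; i++) {
        for (int j = 0; j < CITIES; j++) {
            /* 1-cell if the (company, city) pair satisfies the property, 0-cell otherwise */
            cell[i][j] = (revenue[i][j] >= THRESHOLD) ? 1 : 0;
            printf("%d ", cell[i][j]);
        }
        printf("\n");
    }
    return 0;
}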

Matrices describing the aggregated data, such as the matrix described in the example above, can be summarized using an MDL (Minimum Description Length) Summarization. MDL Summarization is the process of summarizing the data matrix by describing rectangular regions of 1-cells. That is, instead of describing each cell of a multi-dimensional matrix, the MDL Summarization can describe the whole data matrix merely by describing the rectangular regions of the matrix that contain 1-cells. These rectangular regions are referred to as non-trivial rectangles (trivial rectangles are the individual cells and the entire matrix). (The notion of a rectangle is taken directly from [3].) Here, the problem defines the cost so that describing any rectangular region of all 1-cells costs 1.

Many variants of MDL Summarization have been considered. Two such summarization methods are the Generalized MDL Summarization (see [4]) and the MDL Summarization with Holes (as defined in [1]; this definition is also described in the paragraph below). An MDL Summarization results in a more compact representation of the data. These summarization methods produce more compact representations that result in space savings, which are especially useful for making shorter database queries [5].

This project focuses on the MDL Summarization with Holes (as defined in [1]; the definition is paraphrased here). MDL Summarization with Holes is a specialized, refined MDL Summarization in which rectangles can contain 0-cells as well as 1-cells. Since the rectangles are used to represent a region of 1-cells, the MDL Summarization with Holes also describes each 0-cell in each rectangle (these are the holes) [1]. Each 0-cell is described only once by the summarization, even if it lies in overlapping chosen rectangles [1]. In the regular MDL Summarization, if a rectangular region contained a 0-cell, the summarization could not choose that rectangle even if all the other cells were 1-cells. However, the MDL Summarization with Holes can take this rectangle and then pay a cost of 1 for each 0-cell (the holes in the rectangle). By allowing holes in the rectangles, the chosen rectangles can become much larger and encapsulate more of the 1-cells, which results in a more compact description of the data matrix [1].
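The cost model just described can be written out explicitly. The following display is my own paraphrase of the model from [1] and [3] (the symbol R for the set of chosen rectangles is mine); the second line anticipates the greedy ratio used in the Technical Approach section.

% Total description length of a summarization with holes,
% for a chosen set of rectangles R (my paraphrase of the model above):
\[
  \mathrm{cost} \;=\; |\mathcal{R}|
  \;+\; \#\{\text{0-cells covered by some } r \in \mathcal{R}\}
  \;+\; \#\{\text{1-cells covered by no } r \in \mathcal{R}\}.
\]
% A candidate rectangle containing k uncovered 1-cells and z uncovered 0-cells
% is worth taking, compared with paying 1 for each of those k 1-cells, exactly when
\[
  1 + z \;<\; k
  \qquad\Longleftrightarrow\qquad
  \frac{1+z}{k} \;<\; 1.
\]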

For data matrices of 2 or more dimensions, producing an optimal MDL Summarization with Holes has been proven to be NP-Hard by Bu et al. [1]. However, Bu et al. [1] have also proposed heuristics to produce useful MDL Summarizations with Holes. One of these heuristics, which I have investigated, is the greedy approach. Guha and Tan [3] have also examined this greedy approach for approximating the MDL Summarization with Holes. Guha and Tan [3] have proven that the greedy approach gives an approximation that is O(l_m log(M)), where M is the size of the data matrix. See the Technical Approach section of this paper for a definition of l_m.

Guha and Tan have also examined a recursive solution that expands the MDL Summarization with Holes [3]. Here, the summarization recursively selects rectangles of 1-cells, then subtracts out sub-rectangles of 0-cells, and then recurses further, alternately adding and subtracting rectangles of 1-cells and 0-cells (respectively) up to k regions (this is the k-recursive solution described in [3]). Afterwards, all 0-cells included in a rectangle of 1-cells and all 1-cells not included in a rectangle are described as individual cells. Here, Guha and Tan [3] use a Linear Programming approach instead of a greedy approach when k ≥ 2. The MDL Summarization with Holes is also called the 1-recursive Summarization in [3].

My project focuses on the 1-recursive greedy approach to the MDL Summarization with Holes (as described in [3]). This means that the algorithm chooses only 1 level of regions before describing individual cells. While this greedy approximation has been proven to be an O(l_m log(M))-approximation algorithm for the optimal MDL Summarization with Holes by Guha and Tan [3], this bound has not previously been proven tight [3]. Guha and Tan [3] describe this as an open problem by giving a proof that the greedy algorithm is an O(l_m log(M))-approximation algorithm but only giving a proof that the greedy algorithm is at best an Ω(l_m)-approximation algorithm [3]. No current paper has answered this question as of April 1, 2007. My project shows that this greedy algorithm is a constant-factor (2(κ − 2))-approximation algorithm (the factor is a constant relative to M, i.e., an O(l_m)-approximation) of the optimal MDL Summarization with Holes in the general case, and a 4-approximation algorithm in the special case. For definitions of l_m and κ, see the subsection Definitions in the Technical Approach section of this paper.

Technical Approach

I studied the greedy algorithm given for MDL Summarization with Holes in [3] and tightened the analysis of the optimality of the greedy approximation algorithm, specifically focusing on the 2-dimensional case where the only non-trivial rectangles are rows and columns. In this section, I define l_m and κ, outline the greedy algorithm, give an example of the problem in the specialized case, give an example that shows that the greedy algorithm is an Ω(2)-approximation algorithm in the specialized case (in the case 2 = l_m = κ − 2), following the example in [3], and then describe some challenges in solving this problem.

Definitions

Note: The following definitions are paraphrased from [3].

l_m: When the rectangles are specified, one can consider which rectangles completely contain other rectangles. Now, for each cell u in the matrix, consider all of the rectangles that cover u. Call this set S_u. The contains (subset) relation forms a poset over the rectangles in S_u, and more specifically a lattice, because the entire matrix is a rectangle that contains u and the matrix contains all other rectangles. Here, l_m is the size of the largest antichain of the poset of S_u over all cells u.

κ: κ is the largest number of rectangles that all contain a common cell of the matrix. (κ is defined as such in [3].) Here κ − 2 is the largest number of non-trivial rectangles that all contain (cover) a common cell; the two excluded trivial rectangles are the entire matrix and a single cell. In general, (κ − 2) ≥ l_m. In this specialized case, it happens that 2 = l_m = κ − 2.

M: M is the size of the data matrix.

Description of the Greedy Algorithm

Here is the greedy algorithm: after reading in the data matrix, the greedy algorithm looks at all of the uncovered 0-cells and 1-cells (i.e., all 0's and 1's that are not contained in a chosen row or column) and, for each unchosen row and column, computes the ratio (1 + #(uncovered 0's)) / #(uncovered 1's). If any ratio is less than 1, the greedy algorithm chooses the row or column with the smallest ratio and covers the elements in that row or column. The greedy algorithm repeats this process until no ratio is less than one or until all possible rectangular regions are chosen. The final cost of the greedy solution is the number of rows and columns chosen + the number of uncovered 1's + the number of covered 0's. To generalize this to an arbitrary set of non-trivial rectangles (instead of rows and columns), just have the algorithm calculate this ratio for each of those non-trivial rectangles instead of for each row and column.
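The following C sketch illustrates the greedy procedure just described for the row/column case. It is my own minimal illustration, not the author's program (that program is described in the next subsection); in particular, I read the greedy ratio as being computed over the uncovered cells of the candidate row or column, and all function and variable names here are mine.

#include <stdio.h>

#define N 8  /* demo size; the procedure is the same for any n-by-n matrix */

/* Greedy ratio of an unchosen row (kind == 0) or column (kind == 1), index idx:
 * (1 + #(uncovered 0's in that line)) / #(uncovered 1's in that line). */
static double greedy_ratio(int m[N][N], int row_ch[N], int col_ch[N], int kind, int idx) {
    int zeros = 0, ones = 0;
    for (int t = 0; t < N; t++) {
        int i = (kind == 0) ? idx : t;
        int j = (kind == 0) ? t : idx;
        if (row_ch[i] || col_ch[j]) continue;   /* cell already covered */
        if (m[i][j] == 0) zeros++; else ones++;
    }
    if (ones == 0) return 1e9;                  /* no uncovered 1's: never beneficial */
    return (1.0 + zeros) / ones;
}

/* Repeatedly take the row or column with the smallest ratio, as long as some ratio < 1. */
static void run_greedy(int m[N][N], int row_ch[N], int col_ch[N]) {
    for (;;) {
        double best = 1.0;                      /* only ratios strictly below 1 are beneficial */
        int best_kind = -1, best_idx = -1;
        for (int kind = 0; kind < 2; kind++)
            for (int idx = 0; idx < N; idx++) {
                if ((kind == 0 && row_ch[idx]) || (kind == 1 && col_ch[idx])) continue;
                double r = greedy_ratio(m, row_ch, col_ch, kind, idx);
                if (r < best) { best = r; best_kind = kind; best_idx = idx; }
            }
        if (best_kind < 0) break;               /* no beneficial row or column left */
        if (best_kind == 0) row_ch[best_idx] = 1; else col_ch[best_idx] = 1;
        printf("Greedy took %s %d with ratio %.3f\n",
               best_kind == 0 ? "row" : "column", best_idx, best);
    }
}

/* Final cost: #(chosen rows and columns) + #(uncovered 1's) + #(covered 0's). */
static int greedy_cost(int m[N][N], int row_ch[N], int col_ch[N]) {
    int cost = 0;
    for (int i = 0; i < N; i++) cost += row_ch[i] + col_ch[i];
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            int covered = row_ch[i] || col_ch[j];
            if (m[i][j] == 1 && !covered) cost++;
            if (m[i][j] == 0 && covered)  cost++;
        }
    return cost;
}

int main(void) {
    /* Small invented demo instance: 1's along row 0 and column 0, with one hole at (0,3). */
    int m[N][N] = {{0}}, row_ch[N] = {0}, col_ch[N] = {0};
    for (int t = 0; t < N; t++) { m[0][t] = 1; m[t][0] = 1; }
    m[0][3] = 0;
    run_greedy(m, row_ch, col_ch);
    printf("Final greedy cost: %d\n", greedy_cost(m, row_ch, col_ch));
    return 0;
}

On the 8-by-8 example shown later in this report, no row or column ratio falls below 1, so a procedure of this kind takes nothing, which matches the program output reproduced there.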

Description of the Program Implementation

I have implemented the greedy approximation algorithm in C for the 2-dimensional case where the non-trivial rectangles are rows and columns. The program reads in a 2-dimensional matrix of 1's and 0's from a text file, where each 1-cell is represented by a 1. The program runs the greedy approximation algorithm on the matrix and outputs the result to the screen (or into another text file if redirected). The output includes an ordered list of the rows and columns taken by the algorithm, the resulting summarized matrix, and the cost of the space needed to store the data cube both before and after the compression.
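The report does not specify the exact text-file format, so the following is a hedged sketch of the input handling and of how the uncompressed description length (simply the number of 1-cells) can be computed. The whitespace-separated format, the fixed size N, and all names are my assumptions, not details of the author's program.

#include <stdio.h>

#define N 6  /* size of the first example below; the report's program handles arbitrary n-by-n matrices */

/* Hedged sketch: read an N-by-N matrix of 0's and 1's from a text file
 * (assumed whitespace-separated) and report the length needed to describe
 * the uncompressed matrix, i.e. the number of 1-cells. */
int main(int argc, char **argv) {
    if (argc != 2) {
        fprintf(stderr, "usage: %s matrix.txt\n", argv[0]);
        return 1;
    }
    FILE *f = fopen(argv[1], "r");
    if (f == NULL) {
        perror("fopen");
        return 1;
    }
    int m[N][N], ones = 0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            if (fscanf(f, "%d", &m[i][j]) != 1) {
                fprintf(stderr, "unexpected end of input\n");
                fclose(f);
                return 1;
            }
            if (m[i][j] == 1) ones++;
        }
    fclose(f);
    printf("%d by %d 2-dimensional matrix.\n", N, N);
    printf("Length needed to describe uncompressed matrix: %d\n", ones);
    return 0;
}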

An Example of the Greedy Algorithm

Here I give an ordinary example that illustrates the greedy algorithm and the results I produced. This example matrix is 6 by 6. Here is the output from my program:

6 by 6 2-dimensional matrix. Here, the 1st row is row 0 and the 1st column is column 0.
Original input matrix:
[matrix output not reproduced]
Length needed to describe uncompressed matrix: 22
Greedy Algorithm took column 0 with a greedy ratio of
Greedy Algorithm took row 5 with a greedy ratio of 0.5
Greedy Algorithm took column 5 with a greedy ratio of 0.5
Note: a covered 0 is printed as an x and a covered 1 is printed as a +.
Final compressed matrix from Greedy Algorithm:
[matrix output not reproduced]
Length needed to describe compressed matrix: 13

An Illustrative Example of the Greedy Algorithm

Using the program I wrote, I have run the 2-dimensional example for my special case of chosen coverable regions (these regions and the individual cells together are defined as rectangles) that is the example described in [3] proving that the greedy algorithm is an Ω(l_m)-approximation algorithm. Here is the output from my program for the example when the matrix is 8 by 8:

8 by 8 2-dimensional matrix. Here, the 1st row is row 0 and the 1st column is column 0.
Original input matrix:
[matrix output not reproduced]
Length needed to describe uncompressed matrix: 32
(Greedy Algorithm took nothing)
Note: a covered 0 is printed as an x and a covered 1 is printed as a +.
Final compressed matrix from Greedy Algorithm:
[matrix output not reproduced]
Length needed to describe compressed matrix: 32

Here the cost of the greedy solution is 32. However, the optimal solution is to choose rows 2-5 and columns 2-5 (i.e., cover the entire cross), which is shown below in the notation of a solution of the program:

Note: a covered 0 is printed as an x and a covered 1 is printed as a +.

00++++00
00++++00
++xxxx++
++xxxx++
++xxxx++
++xxxx++
00++++00
00++++00

The cost of this optimal solution is 24. Here, the ratio Cost of Greedy Solution / Cost of Optimal Solution = 32/24 = 4/3 < l_m = 2. Since here l_m = 2 (the rows and columns are the only overlapping rectangles), this example confirms the tightness of the specialized 2-dimensional case as proven in [3].

Now, to generalize this for an n-by-n matrix in my special case, the cost of the greedy solution becomes n · (n/2) = n²/2 (there are 4 · (n/2 · n/4) = n²/2 1-cells in this matrix and the greedy solution covers none of the cells, paying 1 for each 1-cell). The cost of the optimal solution becomes n + n²/4 (the optimal solution takes n/2 rows and n/2 columns and pays for the (n/2) · (n/2) = n²/4 0's that are covered). As n gets very large, the influence of the n term in the optimal cost becomes small (its contribution to the ratio approaches 0 as n approaches infinity), so as n approaches infinity the ratio becomes:

lim_{n→∞} (n²/2) / (n²/4 + n) = lim_{n→∞} 2 / (1 + 4/n) = 2,

which is l_m and κ − 2. This example and the reasoning behind it are a direct application of the proof in [3] that the greedy algorithm is an Ω(l_m)-approximation algorithm (and an Ω(κ − 2)-approximation algorithm) in the special 2-dimensional case. Thus, if an O(l_m)-approximation bound is proven for the greedy approximation algorithm, it must be tight.

Challenges

One challenge in solving this problem was to truly understand the formal description of the problem first, since this was one of the first times I was learning about a problem by reading research papers. By thoroughly reading the relevant sections of the relevant papers, especially [3], I gained a better understanding of the formal definitions of the problem, which helped me solve it. Terms that took me time to understand were κ and l_m. By re-reading [3] and getting a better understanding of l_m and κ, I was able to better understand what I was doing, which helped me check my proof and proof techniques and develop a correct proof.

Another challenge was learning to formally write the proof. While I often had valid ideas and could understand what I was thinking, I was inexperienced at writing a proof in a way that was concise, thorough and understandable.

This resulted in me spending much of my time rewriting the proof so that it was clearer and more understandable before my advisor could check it for correctness. While it was a challenge to learn how to properly write a proof, my advisor Dr. Sudipto Guha guided me along the way and used it as an opportunity to teach me how to write a proof.

Conclusion

I have developed a proof that the greedy algorithm is a 4-approximation algorithm in this specific case and a (2(κ − 2))-approximation algorithm in the general case. Through this project, I learned how to write a proof in a formal and understandable style that could be read by other researchers. Throughout the year, my advisor, Dr. Sudipto Guha, was very helpful, guiding me through the proof-writing process so that as I was writing proofs I was not only fixing technical errors but also writing the proof in a clearer, more concise and more precise way. This process of learning how to write a proof in a research paper has been tremendously helpful for me.

Something that enriched the process and made it easier for me to understand the problem was learning some fundamental concepts of Linear Programming, such as the Simplex Method, Duality and solving Network-Flow Problems, while I was solving this problem. This helped because [3] contains many algorithms for this problem and for similar problems that involve Linear Programming. By better understanding Linear Programming, I could better understand these algorithms, which in turn gave me a better understanding of the problem I was working on, which helped me solve it.

Throughout this report I frequently use the notation and the results in [3]. I do this because this project provides a proof that tightens a result in [3]. I have included my proof that this greedy algorithm is a 4-approximation algorithm in the special case and a (2(κ − 2))-approximation algorithm in the general case in the Proof of the MDL Greedy Approximation Bound section of this report.

References

[1] Bu, Shaofeng, Laks V.S. Lakshmanan and Raymond T. Ng. MDL Summarization with Holes. Proceedings of the 31st VLDB Conference, 2005.

[2] Chaudhuri, Surajit and Umeshwar Dayal. An Overview of Data Warehousing and OLAP Technology. ACM SIGMOD Record, Volume 26, Issue 1, pages 65-74, 1997.

[3] Guha, Sudipto and Jinsong Tan. Recursive MDL Summarization and Approximation Algorithms.

[4] Lakshmanan, Laks V.S., Raymond T. Ng, Christine Xing Wang, Xiaodong Zhou and Theodore J. Johnson. The Generalized MDL Approach for Summarization. Proceedings of the 28th VLDB Conference, 2002.

[5] Pu, Ken Q. and Alberto O. Mendelzon. Concise Descriptions of Subsets of Structured Sets. ACM Transactions on Database Systems (TODS), Vol. 30, No. 1, March 2005.

[6] Sathe, Gayatri and Sunita Sarawagi. Intelligent Rollups in Multidimensional OLAP Data. Proceedings of the 27th VLDB Conference, 2001.

[7] Vazirani, Vijay V. Approximation Algorithms. Berlin: Springer-Verlag. Corrected Second Printing.

Proof of the MDL Greedy Approximation Bound

This section proves the 4-approximation MDL Greedy Approximation Bound for the 2-dimensional case where the only rectangles are rows and columns, and an O(κ) = 2(κ − 2)-approximation bound in the general case, with an arbitrary number of dimensions and arbitrary non-trivial rectangles. In this proof, when the word rectangle is used, it refers to a non-trivial rectangle: any rectangle, other than the entire matrix, that contains 2 or more cells. Individual cells of the matrix and the matrix as a whole are trivial rectangles.

Definitions

The notations defined here will be used in the proofs.

M is the original n-dimensional matrix of data cells. A_G is the set of rectangles chosen by the greedy solution for the matrix M. A* is the set of rectangles chosen by the optimal solution.

Define the cost of the matrix M with respect to a solution A, denoted cost(A, M), to be the length (cost) of the description of M after applying A. By definition,

cost(A, M) = |A| + Σ_{u ∈ M, M[u]=1, u not covered by any rectangle in A} 1 + Σ_{u ∈ M, M[u]=0, u covered by at least one rectangle in A} 1.
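As a cross-check of this definition, here is a small C sketch that evaluates cost(A, M) term by term for the special case where the solution A consists of chosen rows and columns. This is my own illustration (the names cost_rows_cols, row_chosen and col_chosen are mine), not code from the report.

#include <stdio.h>

#define N 6  /* any n-by-n matrix works the same way */

/* cost(A, M) = |A| + #(1-cells not covered by any rectangle in A)
 *                  + #(0-cells covered by at least one rectangle in A),
 * specialized to solutions A made of whole rows and columns. */
static int cost_rows_cols(int m[N][N], int row_chosen[N], int col_chosen[N]) {
    int cost = 0;
    for (int i = 0; i < N; i++) cost += row_chosen[i];   /* |A|: chosen rows */
    for (int j = 0; j < N; j++) cost += col_chosen[j];   /* |A|: chosen columns */
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            int covered = row_chosen[i] || col_chosen[j];
            if (m[i][j] == 1 && !covered) cost++;        /* uncovered 1-cell */
            if (m[i][j] == 0 && covered)  cost++;        /* covered 0-cell (a hole) */
        }
    return cost;
}

int main(void) {
    /* Invented demo: 1's along row 2 with one missing; the solution chooses row 2 only. */
    int m[N][N] = {{0}}, rows[N] = {0}, cols[N] = {0};
    for (int j = 0; j < N; j++) m[2][j] = 1;
    m[2][4] = 0;
    rows[2] = 1;
    printf("cost(A, M) = %d\n", cost_rows_cols(m, rows, cols));  /* 1 rectangle + 1 hole = 2 */
    return 0;
}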

Define the cost of a region R of cells with respect to a solution A, denoted cost(A, R), to be the length (cost) of the description of R after applying A. Here this is used when R is a subregion of the matrix M. By definition,

cost(A, R) = |A| + Σ_{u ∈ R, R[u]=1, u not covered by any rectangle in A} 1 + Σ_{u ∈ R, R[u]=0, u covered by at least one rectangle in A} 1.

Define nont_cover(A, M) to be the region of M consisting of all the cells contained by any non-trivial rectangle chosen by A, i.e.,

nont_cover(A, M) = {u : (∃ r ∈ A)(u ∈ r)}.

Define the greedy estimate of a region R with respect to a solution A, denoted est(A, R), to be the cost the greedy solution estimates the region to cost if the algorithm chose every non-trivial rectangle that is not in A in addition to those already in A. Let Rt denote the set of all non-trivial rectangles that are not in A such that every cell each such rectangle contains is in the region R, i.e.,

Rt = {r : r ∉ A and (∀ u ∈ r) u ∈ R}.

So,

est(A, R) = cost(A, nont_cover(A, M)) + Σ_{r_i ∈ Rt} (1 + Σ_{u ∈ r_i, M[u]=0} 1).

Theorem and Proof

I prove the Theorem using Claims. I first state and prove the Claims, then I prove the Theorem.

Claim 1. cost(A_G, nont_cover(A*, M)) ≤ (κ − 2) · cost(A*, nont_cover(A*, M)).

Proof of Claim 1. The region nont_cover(A*, M) is the region containing only the cells that are contained by some non-trivial rectangle that A* chose. Look at each rectangle r that the Greedy Algorithm did not take but the optimal solution did. Now, consider what the greedy solution estimates the cost of r to be if it were to take it. That estimate of the cost of r is 1 + Σ_{u ∈ r, M[u]=0} 1. Summing all the estimates of these rectangles r (denote Rt as the collection of all these rectangles r), the total estimated additional cost is Σ_{r_i ∈ Rt} (1 + Σ_{u ∈ r_i, M[u]=0} 1). This cost is |Rt| + Σ_{u in some rectangle in Rt, M[u]=0} #{r ∈ Rt : r contains u}, which is at most |Rt| + (κ − 2) · Σ_{u in some rectangle in Rt, M[u]=0} 1, since each cell is covered by at most κ − 2 non-trivial rectangles.

Add this cost to the current cost of the rectangles chosen by A_G and this is est(A_G, nont_cover(A*, M)). Now, since the greedy algorithm did not take these rectangles, the cost of not taking the rectangles in Rt is less than the estimate of taking all of the rectangles in Rt. (If not, then some rectangle's estimate would be lower than the cost of not taking it, and then the greedy solution would have taken that rectangle.) Hence,

cost(A_G, nont_cover(A*, M)) ≤ est(A_G, nont_cover(A*, M)).

However, since the optimal solution took these rectangles, it must be less costly to take those rectangles than to not take them and pay for the cells individually. However, no matter which rectangles are taken or in what order they are taken, the optimal solution must pay at least a cost of |Rt| + Σ_{u in some rectangle in Rt, M[u]=0} 1. In the worst case, the entire cost of describing this region is the cost of describing Rt, so

est(A_G, nont_cover(A*, M)) ≤ (κ − 2) · cost(A*, nont_cover(A*, M)).

Therefore, cost(A_G, nont_cover(A*, M)) ≤ (κ − 2) · cost(A*, nont_cover(A*, M)).

Claim 2. cost(A_G, nont_cover(A_G, M)) ≤ (κ − 2) · cost((A_G ∩ A*), nont_cover(A_G, M)).

Proof of Claim 2. The region nont_cover(A_G, M) contains only (and all) the cells that are contained in a rectangle that A_G chose. Look at cost((A_G ∩ A*), nont_cover(A_G, M)), and now look at all the rectangles in A_G but not in A*. At each step, the greedy solution only takes beneficial rectangles, which are rectangles that, when chosen, will result in a reduced cost. Now there are two kinds of rectangles in A_G: rectangles in A_G ∩ A* and rectangles in A_G − A*.

Consider each rectangle r_g in A_G − A* and compare it to when it was taken relative to each rectangle r in A_G ∩ A*. Now, when r_g was taken in the greedy algorithm, it must have been beneficial to take. Now, examine the benefit of r_g after every rectangle in A_G ∩ A* has been taken. Since the order of rectangles has been changed, the benefit of each r_g can change. If r_g was taken after each r, there is no additional cost. However, if r_g was taken before r, its benefit will differ, so consider each cell u in the overlap of r_g and r. If u is a 0-cell, the ratio that the greedy algorithm sees is now even lower (more beneficial), and hence taking that rectangle results in an increased benefit to the solution formed by A_G ∩ A*.

If u is a 1-cell, then the greedy ratio for r_g can be at most one unit less beneficial to the solution, since the greedy ratio is beneficial except for the overlapped cell. This means that the greedy solution pays at most an additional cost of 1 for each rectangle r_g that covers u when the greedy solution takes r_g. Since there are at most (κ − 2) non-trivial rectangles that contain u, taking the rectangles that contain u results in at most an additional cost of (κ − 2) for u. In the worst case, the entire cost of describing this region is caused by the loss from these cells u, so the cost with the greedy rectangles is at most (κ − 2) times the cost of the solution (A_G ∩ A*) for this region.

Note: In the above paragraph, (κ − 2) and not κ is used because the individual cells and the matrix are excluded: the greedy algorithm will never take a trivial rectangle together with a non-trivial rectangle that contains or is contained by that trivial rectangle. This is because the greedy algorithm will initially take the matrix (if it is the most beneficial) or it will not take the matrix at all. As for individual cells, if A_G took a cell in addition to a non-trivial rectangle in (A_G ∩ A*) that contained the cell, the greedy solution would discard the individual cell.

Claim 3. Let R_1, R_2 be two regions, not necessarily disjoint, and let A be some solution. Then

cost(A, R_1 ∪ R_2) ≤ cost(A, R_1) + cost(A, R_2) ≤ 2 · cost(A, R_1 ∪ R_2).

Proof of Claim 3. Trivial.

Theorem 1. The MDL Greedy algorithm is a 2(κ − 2)-approximation algorithm in the general case.

Proof of Theorem 1. We will break the cost of the entire solution into the following regions, whose union is the matrix M:

1. nont_cover(A*, M) (denoted R*)
2. nont_cover(A_G, M) (denoted R_G)
3. M − (R* ∪ R_G)

By Claim 1, cost(A_G, R*) ≤ (κ − 2) · cost(A*, R*).

By Claim 2, cost(A_G, R_G) ≤ (κ − 2) · cost(A*, R_G).

Therefore, by Claim 3,

cost(A_G, R* ∪ R_G) ≤ 2(κ − 2) · cost(A*, R* ∪ R_G).

Now, cost(A_G, M − (R* ∪ R_G)) = cost(A*, M − (R* ∪ R_G)). This is because, by definition of this region, both A_G and A* pay for all the cells individually and do not have any non-trivial rectangles that cover any of these cells.

Here, since M − (R* ∪ R_G) and (R* ∪ R_G) are disjoint and no non-trivial rectangle in A* or A_G covers a single cell in the M − (R* ∪ R_G) region,

cost(A_G, M − (R* ∪ R_G)) + cost(A_G, R* ∪ R_G) = cost(A_G, M)

and

cost(A*, M − (R* ∪ R_G)) + cost(A*, R* ∪ R_G) = cost(A*, M).

Therefore,

cost(A_G, R* ∪ R_G) + cost(A_G, M − (R* ∪ R_G)) ≤ 2(κ − 2) · cost(A*, R* ∪ R_G) + cost(A*, M − (R* ∪ R_G))
cost(A_G, R* ∪ R_G) + cost(A_G, M − (R* ∪ R_G)) ≤ 2(κ − 2) · cost(A*, R* ∪ R_G) + 2(κ − 2) · cost(A*, M − (R* ∪ R_G))
(cost(A_G, R* ∪ R_G) + cost(A_G, M − (R* ∪ R_G))) ≤ 2(κ − 2) · (cost(A*, R* ∪ R_G) + cost(A*, M − (R* ∪ R_G))).

Hence, cost(A_G, M) ≤ 2(κ − 2) · cost(A*, M).

Corollary 1. In the special 2-dimensional case with only row and column non-trivial rectangles, the MDL Greedy algorithm is a 4-approximation algorithm.

Proof of Corollary 1. Immediate from Theorem 1, since (κ − 2) = 2 in this special case, because the row and column are the only overlapping non-trivial rectangles.
