New Complexity Results on Array Contraction and Related Problems


Laboratoire de l'Informatique du Parallélisme
École Normale Supérieure de Lyon
Unité Mixte de Recherche CNRS-INRIA-ENS LYON n° 5668

New Complexity Results on Array Contraction and Related Problems
(Extension to Research Report 22-7)

Alain Darte and Guillaume Huard

October 2002

Research Report N° 22-4

École Normale Supérieure de Lyon
46 Allée d'Italie, Lyon Cedex 07, France
E-mail: lip@ens-lyon.fr

New Complexity Results on Array Contraction and Related Problems
(Extension to Research Report 22-7)

Alain Darte and Guillaume Huard

October 2002

Abstract

Array contraction is an optimization that transforms array variables into scalar variables within a loop. While the opposite transformation, scalar expansion, is used to enable parallelism (at the price of a larger memory size), array contraction is used to save memory by removing temporary arrays and to increase locality. Several heuristics have already been proposed to perform array contraction through loop fusion and/or loop shifting. But, so far, the complexity of the problem was unknown and no exact approach was available (moreover, only a sufficient condition for array contraction was used). In this report, we focus on the theoretical aspects of the problem. We prove several NP-completeness results that characterize its complexity precisely, and we provide an integer linear programming formulation to solve the problem exactly. Our study also proves the NP-completeness of similar problems whose complexity had not been established so far.

Keywords: Code optimization, array contraction, memory reduction, complexity, NP-completeness, integer linear programming.

Résumé

La contraction de tableaux est une optimisation de code qui transforme des variables de type tableau en variables scalaires au sein d'une boucle. Alors que la transformation inverse, l'expansion de scalaire, est utilisée pour augmenter le parallélisme (avec une pénalité en taille mémoire), la contraction de tableau est utilisée pour économiser de la mémoire en supprimant des tableaux temporaires et pour augmenter la localité. Plusieurs heuristiques ont été proposées dans le passé pour rendre possible la contraction de tableaux par fusion de boucles et décalage d'instructions. Néanmoins, la complexité du problème était jusqu'à présent inconnue et aucune méthode de résolution exacte n'était disponible (qui plus est, seule une condition suffisante de contraction était utilisée). Dans ce rapport, nous démontrons plusieurs résultats de NP-complétude qui caractérisent précisément le problème et nous proposons une méthode de résolution exacte par programmation linéaire en nombres entiers. Notre étude démontre également la NP-complétude de problèmes voisins dont la complexité n'était pas établie jusqu'à présent.

Mots-clés: Optimisation de code, contraction de tableau, réduction mémoire, NP-complétude, programmation linéaire en nombres entiers.

Contents

1 Introduction
2 Program Model and Objectives
  2.1 Dependence Graph
  2.2 Contraction of Arcs, Contraction of Vertices
  2.3 Validity of Loop Transformations
    2.3.1 Loop Fusion
    2.3.2 Loop Shifting
3 Complexity
  3.1 Loop Fusion For Array Contraction
    3.1.1 With the Standard Condition
    3.1.2 With the Extended Condition
  3.2 Loop Shifting For Array Contraction
    3.2.1 With the Standard Condition
    3.2.2 With the Extended Condition
4 An Integer Linear Programming Formulation
  4.1 Loop Fusion For Array Contraction
  4.2 Loop Shifting For Array Contraction
5 Related Work
6 Summary and Future Work

New Complexity Results on Array Contraction and Related Problems

Alain Darte and Guillaume Huard
LIP, ENS-Lyon, 46 Allée d'Italie, 69007 Lyon, France.
{Alain.Darte,Guillaume.Huard}@ens-lyon.fr

30th October 2002

Abstract

Array contraction is an optimization that transforms array variables into scalar variables within a loop. While the opposite transformation, scalar expansion, is used to enable parallelism (at the price of a larger memory size), array contraction is used to save memory by removing temporary arrays and to increase locality. Several heuristics have already been proposed to perform array contraction through loop fusion and/or loop shifting. But, so far, the complexity of the problem was unknown and no exact approach was available (moreover, only a sufficient condition for array contraction was used). In this report, we focus on the theoretical aspects of the problem. We prove several NP-completeness results that characterize its complexity precisely, and we provide an integer linear programming formulation to solve the problem exactly. Our study also proves the NP-completeness of similar problems whose complexity had not been established so far.

1 Introduction

Memory optimizations are becoming more and more important. First, as the gap between the speed of general-purpose single-chip processors and the speed of memory grew, exploiting the memory hierarchy became fundamental to achieving good performance. A large amount of compiler work has therefore focused on loop transformations and optimized data layouts (see [28, 23, 22, 8] to quote but a few) for better cache reuse and data prefetching. Memory optimizations are now even more important in the context of compilation for embedded processor applications. Performance is not necessarily the only issue: power consumption and memory design (size, type, etc.) are new objectives to be considered. Improving locality to reduce memory traffic, deleting temporary arrays, and reducing memory sizes are important optimizations in the compilation process. Thus, architectural changes in the design and in the objectives have pushed compilers to develop sophisticated memory optimizations.

On the other end of the spectrum, the evolution of languages also pushes compilers to be smarter when allocating memory. Indeed, array languages such as Fortran 90, HPF [3, 3], or ZPL [6] require the introduction of many temporary arrays by the compiler, which increases memory usage (compared to what the user's code suggests) if no further optimizations are performed. Today, memory optimizations can be necessary for both reasons (architecture and languages) simultaneously. Indeed, even if most circuits are still developed at the register transfer level, several projects (such as PICO [25]) already target the compilation of circuits from C code, or even from higher-level languages such as Matlab. In this latter case, the compiler will have to introduce many temporary arrays (due to the input language) and perform both high-level and low-level optimizations, while being very careful about the final memory allocation (because of the hardware objectives).

Array contraction [8] is one of these memory transformations. When each element of an array is defined and used within the same iteration of the surrounding loops, the array can be replaced by a scalar variable that holds the values of all elements, consecutively. Typically, such a situation occurs in codes where the contracted array is a temporary array, i.e., either an array that the user introduced him/herself to store some intermediate computations, or an array that the compiler introduced for the same reason (again in array languages for example), and this temporary array is used, in the original code, in several successive loops. Therefore, in most practical cases and most benchmarks, the natural transformation that enables array contraction is loop fusion [3, page 35]. In an array language such as Fortran 90, because of its array constructs that support index shifts (when array sections differ by a constant) and negative strides, loop shifting (which takes the form of loop alignment in this case [3, page 322]) and loop reversal (the loop iterates in the other direction) are two other natural transformations for array contraction.

In [24], Vivek Sarkar and Guang Gao were the first to try to optimize explicitly for array contraction. They focused on finding the most suitable loop reversal to enable array contraction. Then, in [8], with R. Olsen and R. Thekkath, they mainly explored loop fusion for array contraction, developing a heuristic based on a maxflow-mincut algorithm. Since then, several authors (see the Related Work section) have contributed to loop fusion optimizations, but with slightly different objectives, focusing on loop fusion for locality [9], weighted loop fusion [2], maximal fusion (number of loops) [5], loop fusion for memory reduction [27, 7], etc. All these approaches keep array contraction in mind, but they do not optimize directly for it. They target variants of data locality (for example, the number of fused dependences) and, in favorable cases (but not always), they can achieve array contraction as a secondary effect. Nevertheless, since the work of Gao, Sarkar et al., several questions have remained open. What is the complexity of optimizing for array contraction? How costly is an exact approach? Do we have to rely on heuristics? The goal of this report is to answer these theoretical questions. We also show that the way arrays are traditionally contracted (what we call the standard condition) is a bit restrictive. We give a more accurate formulation (we call it the extended condition) that allows us to contract more arrays.

The rest of the report is organized as follows. Section 2 describes the program model we consider and defines the array contraction problems we address, mainly array contraction enabled by loop fusion, and by a combination of loop shifting and loop fusion. Section 3 gives several NP-completeness proofs that characterize their complexity. Our results show at the same time the NP-completeness of three other optimization problems whose complexity was not established so far. In Section 4, we show that both problems (array contraction with loop fusion and array contraction with loop shifting) can be solved thanks to an integer linear programming formulation. More related work is discussed in Section 5. We conclude in Section 6.
2 Program Model and Objectives

To simplify the discussion, we consider a sequence of simple (i.e., not nested) loops, with unitary loop steps, each loop containing one or several simple statements (assignments to an array or scalar variable). Dependences between statements exist that restrict the order in which statements can be executed. We first assume that each statement writes into a different variable and that all dependences are flow dependences (i.e., writes occur before reads). Following the terminology in [8, 2], we also assume that all loops are conformable (or of the same type, with the terminology in [2]), i.e., regardless of dependences, all loops could be fused without code generation or semantics problems (they have similar headers, the same control dependences, etc.).

Remark: In terms of NP-completeness, the simpler the input, the stronger the proof of NP-completeness. Restricting to simple cases is therefore not a restriction, but a strength. However, when solving the problem in practice, we will need to be able to extend the technique to more general cases. We will explain later when we can do so, and when problems remain to be solved.

2.1 Dependence Graph

The program is represented by a dependence graph G = (V,E), a directed graph, where each vertex in V corresponds to a statement and an arc e = (u,v) ∈ E states that the statement v should always be placed in the same loop as the statement u, or in a loop after it. We keep track of some information on dependences: their nature (flow, anti, or output dependences, even if so far we assume that all are flow dependences) and the dependence distances (differences between the loop index of the destination operation and the loop index of the source operation) each arc corresponds to. Dependence distances (or over-approximations) are used to decide whether a code transformation is valid or not.

Figure 1 shows a sample program fragment (this is a modified version of the examples from [2] and [8]), first written in Fortran 90 with array expressions, then written with loops where loop fusion has been greedily applied (from top to bottom). The graph on the right is the corresponding dependence graph (labels on arcs will be explained hereafter).

A(1:N) = E(0:N-1)
B = A*2 + 3
C = B + 99
D(1:N) = A(N:1:-1) + A(1:N)
E = B + C*D
F = E*4 + 2
G = E*8 - 3
H(1:N) = F(1:N) + G(1:N)*E(2:N+1)

DO I=1,N
  A(I) = E(I-1)
  B(I) = A(I)*2 + 3
  C(I) = B(I) + 99
ENDDO
DO I=1,N
  D(I) = A(N-I+1) + A(I)
  E(I) = B(I) + C(I)*D(I)
  F(I) = E(I)*4 + 2
  G(I) = E(I)*8 - 3
ENDDO
DO I=1,N
  H(I) = F(I) + G(I)*E(I+1)
ENDDO

[Dependence graph over the statements A, B, C, D, E, F, G, H; the arc from A to D is labeled *.]

Figure 1: Sample program fragment (array version, loop version, and dependences).
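As an illustration (our own sketch, not part of the report), such a dependence graph can be kept as a simple list of arcs with their distance information; the class name, the Python setting, and the use of None to encode a non-uniform distance such as the arc labeled * are assumptions of the example, and only a few of the arcs of Figure 1 are shown.

```python
# Minimal sketch of a dependence graph with distance information.
# None encodes a distance that is not numerically constant (the label *).
class DependenceGraph:
    def __init__(self):
        self.vertices = set()
        self.arcs = []                  # list of (source, destination, distance)

    def add_arc(self, src, dst, distance):
        self.vertices.update((src, dst))
        self.arcs.append((src, dst, distance))

def classify(distance):
    """Arc classification used for loop fusion alone (see Section 2.3.1)."""
    if distance is None or distance < 0:
        return "fusion-preventing"      # the two statements must stay in different loops
    if distance == 0:
        return "contractable"           # contractable if both ends end up in the same loop
    return "precedence"                 # fusion allowed, but the arc cannot be contracted

# A few arcs of Figure 1 (each vertex is named after the array it writes):
g = DependenceGraph()
g.add_arc("A", "D", None)   # D reads A(N-I+1): non-uniform distance, labeled * in Figure 1
g.add_arc("B", "C", 0)      # C(I) reads B(I): distance 0
g.add_arc("E", "H", -1)     # H(I) reads E(I+1): distance -1

for (u, v, d) in g.arcs:
    print(u, "->", v, classify(d))
```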

2.2 Contraction of Arcs, Contraction of Vertices

A dependence arc is contractable when it corresponds to a dependence weight equal to 0. It can be contracted when it is contractable and when the source and destination of the dependence are both in the same loop. Furthermore, in this case, the source should be placed textually before the destination to preserve the semantics of the program. There is then an immediate reuse of the data involved in the dependence. An array is contractable when all flow dependence arcs relative to this array are contractable. It can be contracted into (replaced by) a scalar variable when all flow dependence arcs relative to this array are contracted, i.e., when every element of the array is defined and used within the same iteration of a loop. The two problems of optimizing the contraction of arcs or the contraction of vertices are very similar, and we will use this similarity in our NP-completeness proofs.

Remark: We point out that this condition for the contraction of an array (we call it the standard condition) is the one used in all previous works on array contraction, but it is only a sufficient condition. Indeed, an array such that each element is read either within the same iteration (i.e., at distance 0) by statements textually after the writing statement, or in the next iteration (i.e., at distance 1) but by statements textually before the writing statement could also be contracted. See Figure 2 for an example. This more general condition (we call it the extended condition) makes the problem harder to formulate since an additional condition on the textual reordering of statements has to be ensured. From a complexity point of view, as we will prove later, the problem is NP-complete with the extended condition too. We also point out that, even if it is not mentioned in previous works, care should be taken when generating code, possibly using additional scalar variables, when several statements write into the same array and the array is involved at the same time in flow dependences and in output or anti dependences.

B(0) = 0
DO I=1,N
  A(I) = B(I-1) + 1
  B(I) = A(I) + 3
ENDDO

B = 0
DO I=1,N
  A = B + 1
  B = A + 3
ENDDO

Figure 2: An example of contraction with the extended condition.

We distinguish between live-in arrays that are defined before the code fragment to be optimized, live-out arrays that should be kept in memory for later use after the code fragment, and temporary arrays that are defined in the code fragment before being read and never used after the code fragment. In the code of Figure 1, E is a live-in array, H is a live-out array, and all other arrays are temporary arrays. If memory reduction is the main goal of array contraction, only temporary arrays are candidates for array contraction. However, if the goal of array contraction is to better use registers and to avoid memory traffic, we can also consider other arrays as candidates for array contraction. In this case, if a live-out array (the situation is similar for a live-in array) is written by several statements in the code fragment, the intermediate writes can be considered for array contraction, but the last writes should be kept in the array.
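To make the two contraction conditions concrete, here is a small sketch (our own illustration, not part of the report) that checks, for a given fusion partition and textual order, which arrays satisfy the standard condition and which satisfy the extended condition; the data layout (flow arcs as tuples labeled with the array they carry) is an assumption made for the example.

```python
# Each flow arc is (source, destination, distance, array): "array" is the
# array written by "source" and read by "destination".
def contractable_arrays(flow_arcs, cluster, position):
    """cluster[s]  : loop (cluster) in which statement s is placed
       position[s] : textual position of s inside its cluster"""
    standard, extended = set(), set()
    arrays = {a for (_, _, _, a) in flow_arcs}
    for a in arrays:
        arcs = [(u, v, d) for (u, v, d, arr) in flow_arcs if arr == a]
        same_loop = all(cluster[u] == cluster[v] for (u, v, _) in arcs)
        # Standard condition: every flow arc of the array has distance 0,
        # both ends in the same loop, and the definition textually before the use.
        if same_loop and all(d == 0 and position[u] < position[v]
                             for (u, v, d) in arcs):
            standard.add(a)
        # Extended condition: distance 0 with the use textually after the
        # definition, or distance 1 with the use textually before it.
        if same_loop and all((d == 0 and position[u] < position[v]) or
                             (d == 1 and position[v] < position[u])
                             for (u, v, d) in arcs):
            extended.add(a)
    return standard, extended

# The example of Figure 2: statement "A" reads B(I-1) (distance 1) and is
# textually before statement "B", which writes B(I).
arcs = [("B", "A", 1, "B"), ("A", "B", 0, "A")]
cluster = {"A": 0, "B": 0}
position = {"A": 0, "B": 1}
print(contractable_arrays(arcs, cluster, position))
# -> A satisfies both conditions; B satisfies only the extended condition.
```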

2.3 Validity of Loop Transformations

We label dependence arcs depending on whether or not they prevent the code transformations that enable the contraction of an array.

2.3.1 Loop Fusion

When considering loop fusion, we need to distinguish between negative dependence distances, positive dependence distances, and dependence distances equal to 0. The conditions for loop fusion are well known (see for example [3]). An arc that corresponds to a negative dependence distance is called a fusion-preventing arc: the source and the destination of the arc should be placed in two different loops, with the source in a loop textually before. In this case, contraction is certainly not possible since some data should be kept in memory for use in a subsequent loop (when the dependence is a flow dependence). An arc with a nonnegative distance (we call it a precedence arc) is not fusion-preventing. The source should be placed in a loop before the destination, or it can be placed in the same loop. If it is not contractable (i.e., if it corresponds to a positive distance), the corresponding array cannot be contracted (at least with the standard condition) since some data will be used in different iterations. And if it is contractable, as we recalled earlier, the source of the arc should be placed textually before its destination.

Following classical definitions on loop fusion, a fusion partition P of G = (V,E) is a partition of V (the vertices) into disjoint subsets (called clusters). A fusion partition is legal for G if and only if: (1) for each fusion-preventing arc, the source vertex and the destination vertex are in different clusters; (2) the fused dependence graph defined by the fusion partition (there is an arc from a cluster A ∈ P to a cluster B ∈ P, B ≠ A, if there is an arc e = (u,v) ∈ E such that u ∈ A and v ∈ B) is a directed graph with no circuit (a DAG). The dotted closed lines in Figure 1 correspond to the greedy fusion partition (each statement is placed in the first possible loop). Given a legal fusion partition, the output code can be obtained as follows: all statements that belong to the same cluster are placed in a single loop, following the partial order defined by the zero-weight arcs (to preserve the semantics of the program), and loops are textually ordered according to some topological sort defined by the arcs of the fused dependence graph. For this to be possible (in particular when the graph corresponds to a valid program), the graph should have no circuit containing a fusion-preventing arc, and no zero-weight circuit.

The problem Loop Fusion for Array Contraction is to find a legal fusion partition for a dependence graph G so that as many arrays as possible can be contracted after fusion. After contraction, the partition depicted in Figure 1 corresponds to the code in Figure 3(a), with 5 contracted arcs and 1 contracted array. The solution obtained in [2] would be the code in Figure 3(b), with 6 contracted arcs and 3 contracted arrays. The optimal solution for array contraction is given in Figure 3(c), with 6 contracted arcs and 5 contracted arrays.

(a)
DO I=1,N
  A(I) = E(I-1)
  B(I) = A(I)*2 + 3
  C(I) = B(I) + 99
ENDDO
DO I=1,N
  d = A(N-I+1) + A(I)
  E(I) = B(I) + C(I)*d
  F(I) = E(I)*4 + 2
  G(I) = E(I)*8 - 3
ENDDO
DO I=1,N
  H(I) = F(I) + G(I)*E(I+1)
ENDDO

(b)
DO I=1,N
  A(I) = E(I-1)
ENDDO
DO I=1,N
  b = A(I)*2 + 3
  c = b + 99
  d = A(N-I+1) + A(I)
  E(I) = b + c*d
  F(I) = E(I)*4 + 2
  G(I) = E(I)*8 - 3
ENDDO
DO I=1,N
  H(I) = F(I) + G(I)*E(I+1)
ENDDO

(c)
DO I=1,N
  A(I) = E(I-1)
ENDDO
DO I=1,N
  b = A(I)*2 + 3
  c = b + 99
  d = A(N-I+1) + A(I)
  E(I) = b + c*d
ENDDO
DO I=1,N
  f = E(I)*4 + 2
  g = E(I)*8 - 3
  H(I) = f + g*E(I+1)
ENDDO

Figure 3: Codes after loop fusion and array contraction with (a) the greedy partition, (b) the partition selected in [2], (c) an optimal partition for array contraction.
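The legality conditions above translate directly into a small check. The sketch below is our own illustration (the representation of arcs and clusters is assumed, and non-uniform distances are encoded as None): it verifies conditions (1) and (2), and additionally checks that the statements of each cluster can be ordered along the zero-weight arcs.

```python
from collections import defaultdict

def has_circuit(vertices, arcs):
    """Detect a circuit in a directed graph by repeatedly removing
    vertices without incoming arcs (Kahn's algorithm)."""
    indeg = {v: 0 for v in vertices}
    succ = defaultdict(list)
    for (u, v) in arcs:
        succ[u].append(v)
        indeg[v] += 1
    stack = [v for v in vertices if indeg[v] == 0]
    removed = 0
    while stack:
        u = stack.pop()
        removed += 1
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                stack.append(v)
    return removed != len(vertices)

def is_legal_fusion_partition(arcs, cluster):
    """arcs: list of (source, destination, distance); a distance of None means
    non-uniform (treated here as fusion-preventing).  cluster maps each
    statement to its cluster."""
    # (1) fusion-preventing arcs must have their two ends in different clusters.
    for (u, v, d) in arcs:
        if (d is None or d < 0) and cluster[u] == cluster[v]:
            return False
    # (2) the fused dependence graph must have no circuit.
    fused = {(cluster[u], cluster[v]) for (u, v, _) in arcs
             if cluster[u] != cluster[v]}
    if has_circuit(set(cluster.values()), fused):
        return False
    # Statements inside a cluster must be orderable along the zero-weight arcs.
    zero = [(u, v) for (u, v, d) in arcs if d == 0 and cluster[u] == cluster[v]]
    return not has_circuit(set(cluster.keys()), zero)
```

For instance, the greedy partition of Figure 1 passes this check, while placing the statement writing E and the statement writing H in the same cluster would violate condition (1) because of the arc of distance −1 from E to H.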

2.3.2 Loop Shifting

Loop shifting (also called loop alignment in [3]) consists in defining a map from V to Z (the relative integers) that assigns to each vertex (i.e., statement) of the dependence graph G a shift r(u), so that the operation corresponding to the statement u at iteration i in the original code is performed at iteration i + r(u) in the transformed code. A dependence distance from u to v originally equal to d(e) is equal, after the shift, to d(e) + r(v) − r(u). Given a shift r for each statement, one can define the corresponding dependence graph G_r where the dependence distance d_r(e) of an arc is d(e) + r(v) − r(u).

When considering loop shifting, the weights of arcs change, so more arcs can be considered as contractable and fewer arcs are fusion-preventing, but we need more information on dependence distances to see it. An arc that corresponds to a numerically constant dependence distance is a uniform arc; with an adequate shift, such an arc can be transformed into an arc with dependence distance equal to 0 (i.e., it becomes contractable). A precedence arc is an arc corresponding to dependence distances that are lower-bounded by a numerical constant; loop fusion may be possible with a sufficient shift, but if the arc is not uniform, the corresponding arc will not be contracted. Finally, any other arc is a fusion-preventing arc since, whatever the shift, the arc will still prevent fusion. In terms of direction vectors [29], a precedence arc corresponds to a label z+, where z is a relative integer, and a fusion-preventing arc corresponds to a label that is not of this form.

In the graph of Figure 1, all arcs are uniform except the arc with label *, which is a fusion-preventing arc. In the case of loop fusion alone, the arc with dependence distance −1 (from E to H) is also considered as a fusion-preventing arc, while for loop shifting it is just a precedence arc (it is uniform). With the terminology of [5], a legal shift is a shift r such that all arcs of G_r have a nonnegative weight. A fusion partition is legal with respect to a shift r (legal or not) if the partition is legal for G_r, i.e., if the fused dependence graph has no circuit, if r is a legal shift when considering only the arcs with both ends in the same cluster, and if G_r has no zero-weight circuit. Since the weight of a circuit is unchanged by a shift, a graph G that has a shift r and a corresponding legal fusion partition (in particular, the graph of a valid program) has only circuits of positive weight.

The problem Loop Shifting for Array Contraction is to find a shift r and a legal fusion partition for G_r so that as many arrays as possible can be contracted after shifting by r and fusion.
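As a small illustration of these definitions (our own sketch, not from the report; uniform distances are plain integers and non-uniform arcs are simply omitted), the code below applies a shift, recomputes the distances of G_r, and reports which arcs become contractable and whether the shift is legal.

```python
def shifted_distances(arcs, r):
    """arcs: list of (source, destination, uniform distance);
    r: shift value of each statement.  Returns the arcs of G_r."""
    return [(u, v, d + r[v] - r[u]) for (u, v, d) in arcs]

def report(arcs, r):
    shifted = shifted_distances(arcs, r)
    legal = all(d >= 0 for (_, _, d) in shifted)
    contractable = [(u, v) for (u, v, d) in shifted if d == 0]
    return legal, contractable

# Three uniform arcs of Figure 1: E -> H has distance -1 (H reads E(I+1)),
# F -> H and G -> H have distance 0.  Shifting H by one iteration makes
# the E -> H arc contractable but turns F -> H and G -> H into distance 1.
arcs = [("E", "H", -1), ("F", "H", 0), ("G", "H", 0)]
print(report(arcs, {"E": 0, "F": 0, "G": 0, "H": 0}))   # not legal: E -> H is negative
print(report(arcs, {"E": 0, "F": 0, "G": 0, "H": 1}))   # legal, E -> H becomes contractable
```

Shifting one statement can thus make one arc contractable at the price of others, and this is precisely the trade-off that Section 3.2 shows to be NP-complete to optimize.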
3 Complexity

In this section, we consider the following simplest case: all dependences are uniform (i.e., with constant dependence distances), all vertices in the dependence graph correspond to statements that write into different arrays, and all dependences are flow dependences. In this case, each vertex corresponds to a different contractable array, and there is a gain of one contracted array each time a vertex is in the same cluster as all its successors and all dependence distances from this vertex to these successors are contractable, i.e., with weight 0 for the standard condition, and with weight 0 (if the successor is textually after) or 1 (if the successor is textually before) for the extended condition.

3.1 Loop Fusion For Array Contraction

The first problem, Loop Fusion for Array Contraction, is the easiest to understand. We first focus on arrays contracted with the standard condition.

10 3.. With the Standard Condition If no arcs are fusion-preventing, then the optimal partition is to place all statements within the same loop. All contractable arcs can be contracted and all arrays whose outgoing flow arcs are all contractable can be contracted. When the dependence graph has only one fusion-preventing arc, then the problem can also be solved in polynomial time, as a variant of a maxflow-mincut algorithm (as noticed in [8]), even if some modifications have to be done so that the maxflowmincut algorithm, which can naturally maximize the number of contracted arcs, is able to maximize the number of contracted vertices. However, when the dependence graph has more fusion-preventing arcs, the complexity changes. Theorem Loop Fusion for Array Contraction is strongly NP-complete for directed graphs with no circuit but a chain of k 2 fusion-preventing arcs. Proof: The associated decision problem is obviously in NP. The rest of the proof is, as in [9], by reduction from the problem Multiway Cut [4], whose following instance is NP-complete for k 3 (and even for the fixed value k = 3). Multiway Cut: Instance: An undirected graph G = (V,E), k specified vertices (s i ) i k, and an integer K. Question: Is there a set E E of edges, of size at most K, such that the removal of E from E disconnects all s i from any other s j? (Such a set E is called a cut.) Transformation To transform an instance of Multiway Cut into an instance of Loop Fusion for Array Contraction, there are two minor difficulties: () we are not interested in maximizing the number of contracted arcs, but in maximizing the number of contracted vertices, (2) our dependence graphs are directed, which is not the case in Multiway Cut. From an instance G = (V,E) and (s i ) i k of Multiway Cut, we build a graph G = (V,E ) as follows. We first let V = V and, for i < k, we add to E a fusion-preventing arc from s i to s i+. For each undirected edge e = (u,v) in E, we add to V a new vertex n e and we add to E two contractable arcs, one from n e to u and one from n e to v. The resulting graph G has V + E vertices, 2 E + k arcs, it has no circuit, and all arcs go from a vertex in V \ V to a vertex in V, except the fusion-preventing arcs that form a directed chain of (k ) arcs between the k specified vertices in V. Furthermore, to count the number of contracted vertices after fusion, we can only consider the vertices in V \ V (the vertices n e ) since, whatever the partition, vertices in V are either never contractable (vertices s i for i < k, since they have a fusion-preventing outgoing arc) or always contractable (other vertices since they have no outgoing arc at all). To complete the proof, we now show that there is a valid cut for G of size at most K if and only if there is a legal fusion partition with at least E K contracted vertices in V \ V (resp. 2 E K contracted arcs in G ). Reduction Given a cut E for G, we define the following binary relation on V : for all w V, wrw, and for all e = (u,v) E \ E, urv, vru, n e Ru, urn e, n e Rv, and vrn e. The transitive closure of R is an equivalence relation whose equivalence classes define a partition. By construction, two vertices u V and v V are in the same equivalence class if and only if there is an undirected path in G from u to v with all edges in E \ E. Furthermore, the fused dependence graph has the following properties. If a cluster contains no vertex in V, then it contains a unique vertex in V \ V (i.e., a vertex of type n e ), and the cluster has no incoming 8

11 arc and two outgoing arcs. If a cluster contains a vertex in V, the cluster has no outgoing arc except possibly the fusion-preventing arcs because if a vertex in V \ V is in the cluster, its two successors are also in the cluster. Therefore, only fusion-preventing arcs can be involved in a circuit in the fused dependence graph. From these properties, it is now clear that, if E is a valid cut, the partition is a legal fusion partition. First, all s i are in different clusters since there is no path between any pair (s i,s j ) with edges all in E \ E. Furthermore, if the reduced dependence graph defined by the partition has a circuit, then the circuit corresponds to a circuit of fusion-preventing arcs involving the vertices s i. But this is impossible since the fusion-preventing arcs define a chain, not a circuit. The fusion partition we obtain has already the following property: it has 2( E E ) contracted arcs and E E contracted vertices in V \ V. Now, since the fused dependence graph has no circuit, we can number the clusters according to some topological sort, and for each edge e = (u,v) E, we can place n e in the same cluster as the vertex, between u or v, with smaller cluster number, without creating any circuit. Finally, we end up with a legal fusion partition with 2 E E contracted arcs and E E contracted vertices in V \ V. Conversely, for any legal fusion partition for G, no two s i are in the same cluster and each vertex n e is contracted if and only if it is in the same cluster as its two successors u and v. If a vertex n e is not in the same cluster as both u and v, then with the same technique as above, we can always place it in the same cluster as either u or v. After this transformation, if the number of contracted vertices in V \ V is E K, we get 2( E K) + K = 2 E K contracted arcs. And if we remove in G all edges e = (u,v) such that u and v are not in the same cluster in G, we get a valid cut of size K. We point out that the proof of Theorem proves at the same time that maximizing the number of contracted arcs with loop fusion is strongly NP-complete (if there are at least 2 fusion-preventing arcs), in other words that the problem Weighted Loop Fusion, introduced in [9], is strongly NP-complete. We think this is interesting to mention because the proof in [9], which is so far the main (if not only) NP-completeness result most papers on loop fusion refer to, turns out to be incorrect (the construction proposed in [9] does not guarantee that all valid cuts correspond to partitions without circuit). But the idea to use Multiway Cut in the reduction was the right one as the proof above shows With the Extended Condition The previous proof is also valid if we consider the extended condition for contractability, since we can always restrict, in the proof, to programs such that noncontractable arcs (arcs with nonzero weight) corresponds to dependences with weight. In this case, there is no arc with weight in the graph, and arrays contracted with the extended condition are the same as arrays contracted with the standard condition, and the problem remains NP-complete. However, as we mentioned earlier, when there are no fusion-preventing arcs for example when all weights are equal to or the complete fusion is always possible and contraction with the standard condition is easy. But is that always true for the extended condition? 
Maybe not, since we need to find an adequate ordering of statements such that an arc is contracted if it has a weight and the source of the arc is placed textually before its destination (standard condition), but also if it has a weight and the source is placed textually after its destination. Consider a directed graph G = (V,E,d) where d(u) {,} for all u V. The variant of Loop Fusion for Extended Array Contraction is, for such a simple instance, to determine, given an integer K, a total order of vertices such that e = (u,v) and d(e) = implies u v, and at least K vertices are contracted with the extended condition, where a vertex u is contracted if u v for each arc e = (u,v) with weight and v u for each arc e = (u,v) 9

12 with weight. For this to be possible, the graph obtained by removing all noncontracted arcs and by replacing each contracted arc (u,v) of weight by an arc (v,u) should have no circuit. Theorem 2 Loop Fusion for Extended Array Contraction is strongly NP-complete for directed graphs with no circuit and arcs with weights or. Proof: We use a reduction from Vertex Cover (Problem GT in [9, p. 9]) recalled below. Vertex Cover: Instance: An undirected graph G = (V,E) and an integer K. Question: Is there a subset V of V, of size at most K, such that for each edge (u,v) E at least one of u and v belongs to V? (Such a set V is called a vertex cover.) Loop Fusion for Extended Array Contraction is obviously in NP. Now consider an instance G = (V,E) of Vertex Cover. We use a reduction similar to the reduction for Feedback Arc Set (Problem GT8 in [9, p. 92]). We build a directed graph G = (V,E,d) with no circuit as follows: for each vertex u V, we define two vertices u and u in V with an arc from u to u with weight, and for each edge e = (u,v), we define an arc from u to v and an arc from v to u, both with weight. There are 2 V vertices and V + 2 E arcs in G. We now show that there is a vertex cover of size at most K in G if and only if there are at least 2 V K contracted vertices (resp. 2 E + V K contracted arcs) in G. Consider a vertex cover V for G of size K. Remove all arcs (u,u ) in G when u V and replace each remaining arc of the form (v,v ) by an arc (v,v). Since V is a cover for G, there cannot be any path of the form u u v v in this new graph since either (u,u) or (v,v) is not in the graph. So, there is no circuit in this graph and we can find an ordering of vertices that follows the direction of arcs (in other words, the set of arcs (u,u) that we removed is a feedback arc set for this graph). This leads to a solution of Loop Fusion for Extended Array Contraction with at least 2 V K contracted vertices (all vertices except possibly those that belong to V) and 2 E + V K contracted arcs (all arcs with weight plus the arcs (u,u ) with weight such that u does not belong to the vertex cover). Conversely, consider a valid fusion and ordering of vertices, and define V as the set of vertices u in V such that u has at least one noncontracted outgoing arc in G. By definition, if we remove from G all noncontracted arcs, and if we replace each contracted arc (u,u ) with weight by an arc (u,u), we get a directed graph with no circuit. This implies in particular that there is no circuit u u v v u, therefore at least one of (u,u ) and (v,v ) is not contracted. In other words, V is a vertex cover for G. Now, if there are 2 V K contracted vertices, then only K vertices are in V, so the size of V is exactly K. And if there are 2 E + V K contracted arcs, each noncontracted arc can contribute to at most one different vertex in V, so the size of V is at most K. Theorem 2 shows that the problem is more difficult with the extended condition. Simply finding an adequate statement ordering is difficult while, for the standard condition, the difficulty arises only when some arcs prevent the total fusion (or even a partition with 2 clusters). 3.2 Loop Shifting For Array Contraction When all dependences are uniform, Loop Fusion for Array Contraction is NP-complete because negative dependence distances are considered as fusion-preventing arcs. But the situation may be different when introducing loop shifting. 
Indeed, with a sufficient shift, any uniform dependence can be transformed into a dependence with a nonnegative distance (i.e., a fusion-preventing arc can become a precedence arc). Moreover, the complete fusion of a sequence of

13 loops with uniform dependences is always possible after suitable shifts. It is therefore legitimate to wonder if introducing loop shifting in the case of uniform dependences makes the problem easier. Also, this problem is of practical interest since many code fragments, for example coming from Fortran 9 array expressions, are codes with uniform dependences. Again, we first consider the case of contraction with the standard condition (no smart statement ordering to define), then with the extended condition With the Standard Condition Unlike Loop Fusion for Array Contraction, which is linked to a very close well-known problem (almost all difficulties are therefore pushed into the NP-completeness proof of Multiway Cut, which is quite long and difficult, see [4]), we have almost to start from scratch to establish the NP-completeness of Loop Shifting for Array Contraction. The proof has two parts. We first show that finding a shift that maximizes the number of contracted arcs (arcs with weight after the shift) is strongly NP-complete. Then, as in Theorem, we are able to reduce, from the maximization of contracted arcs, the maximization of contracted vertices we are interested in. We first need the following technical lemma. Lemma Let G = (V,E,d) be a directed graph where each arc e has a weight d(e) Z and such that all circuits have a positive weight. Let r be a shift for G and let P be a legal fusion partition with respect to r. Then there exists a legal shift r such that all arcs, contracted for r and P, are contracted for r and the partition P = {V } (total fusion). Furthermore, u V, r (u) e E d(e). Proof: To build r from r, we define a graph G = (V,E ) in which r will be computed. We first let G = G and for each arc e = (u,v) E such that e is contracted for r and P, we add a new arc e in E from v to u with weight d (e ) = d(e). Note first that all circuits in G r have a nonnegative weight (since each circuit should be part of a given cluster, and in each cluster all weights are nonnegative). The same is true in G r. Since weights of circuits are not modified by a shift, G has the same property and we can compute shortest paths in G. We define π(u) as the minimal weight of a path ending at u (a nonpositive quantity if, by convention, a path with no arc has weight ). For each arc e = (u,v) E, we have π(v) π(u) + d (e) (since the weight of the minimal path to v is less than or equal to the weight of any path that goes to v through u). With r (u) = π(u), we get d(e) + r (v) r (u) for all e = (u,v) E, thus r is legal for G and the partition P = {V } with only one cluster is legal with respect to r. Furthermore, because of the arcs e, we even have π(v) = π(u) + d (e) when e is contracted for r and P. Thus, arcs contracted for r and P are contracted for r and P. Finally, since r (u) is built as the opposite of the weight of an elementary path P(u) in G ending at u, we have r (u) = e P(u) d (e) e E d(e). Lemma shows that we can restrict to solutions that correspond to a total fusion with a legal shift. In practice however, when nonuniform and, in particular, fusion-preventing dependences exist, we will have to be able to take into account fusion partitions with more than one cluster (this will be done in the linear programming formulation presented in Section 4.2). For the NP-completeness proof itself, we now consider the following problem: Maximization of Local Accesses: Instance: A uniform dependence graph G = (V,E,d) and an integer K. 
Question: Is there a shift r of G and a legal fusion partition with respect to r such that at least K arcs are contracted?
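Before stating the hardness result, here is a small sketch (our own illustration, assuming the graph is given as a list of uniform weighted arcs with no circuit of negative weight) of the constructive part of Lemma 1: shortest paths yield a legal shift, computed below with a Bellman-Ford-style relaxation, and the zero-weight arcs it happens to produce can then be counted. Finding the shift that maximizes this count is exactly Maximization of Local Accesses, shown NP-complete in Theorem 3, so this sketch only provides a feasible shift, not an optimal one.

```python
def legal_shift(vertices, arcs):
    """arcs: list of (u, v, weight).  Computes pi(u), the minimal weight of a
    path ending at u (the empty path has weight 0), by Bellman-Ford-style
    relaxation, and returns the shift r(u) = -pi(u) as in Lemma 1."""
    pi = {v: 0 for v in vertices}
    for _ in range(len(vertices) - 1):
        changed = False
        for (u, v, w) in arcs:
            if pi[u] + w < pi[v]:
                pi[v] = pi[u] + w
                changed = True
        if not changed:
            break
    return {v: -pi[v] for v in vertices}

def zero_weight_arcs(arcs, r):
    return [(u, v) for (u, v, w) in arcs if w + r[v] - r[u] == 0]

# Example: a small chain with one negative (fusion-preventing) distance.
vertices = ["S1", "S2", "S3"]
arcs = [("S1", "S2", -2), ("S2", "S3", 1)]
r = legal_shift(vertices, arcs)        # here r = {S1: 0, S2: 2, S3: 1}
print(r, zero_weight_arcs(arcs, r))    # all shifted weights are nonnegative
```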

14 The next theorem characterizes the complexity of Maximization of Local Accesses. Theorem 3 Maximization of Local Accesses is strongly NP-complete. Proof: We first show that the problem belongs to NP. Thanks to Lemma, given a shift r and a corresponding partition, there is always a legal shift r of polynomial size with at least as many contracted arcs. Therefore, given a positive instance of our problem, there is a polynomial certificate (the shift r of polynomial size) for which we can check in polynomial time that at least K arcs are contracted. The rest of the proof is by reduction from the problem Not-All- Equal 3SAT (Problem LO3 in [9, p. 259]) that we recall here. Not-All-Equal 3SAT: Instance: A set U of n boolean variables and a set C of m clauses over U such that each clause c C has c = 3. Question: Is there a truth assignment for U such that each clause in C has at least one true literal and at least one false literal? Transformation Let (U, C) be an instance of Not-All-Equal 3SAT. We define an instance G = (V,E,d) of Maximization of Local Accesses as follows. We start from G = (V,E) with V = and E =. For each variable u U, we add to G two vertices u and ū, and two arcs, with weight, from u to ū and from ū to u (see Figure 4). for each clause c = {x,y,z} C, we add to G six arcs of weight, from x to y, from y to x, from x to z, from z to x, from y to z, and from z to y (see Figure 5). we let K = 2m + n (remember that n = U and m = C ). The graph G has a polynomial size, with 2 U vertices and 2 U + 6 C arcs. u u u {x, y, z} x z y Figure 4: Transformation of a variable. Figure 5: Transformation of a clause. Reduction Let (U,C) be a positive instance of Not-All-Equal 3SAT, and let T be a truth assignment for U such that each clause has at least one true literal and at least one false literal. We define, for each variable u U, r(u) = { if T(u) = true if T(u) = false r(ū) = r(u) First, note that for each arc e = (u,v) E, r(v) r(u) (since r takes values in {,}), thus r(v) r(u)+d(e) since all arcs in G have a weight equal to (by construction). In other 2

15 words, r is a legal shift for G. For each u U, r(u) r(ū) = ±. Thus, either d r ((u,ū)) = and d r ((ū,u)) = 2, or d r ((u,ū)) = 2 and d r ((ū,u)) =. Therefore, each structure associated to a variable generates exactly one zero-weight arc, i.e., n zero-weight arcs for all variables. For each clause c = {x,y,z} C, at least one literal is true and at least one is false. Thus, there are two literals, for example x and y, such that r(x) = r(y) and r(x) r(z) = r(y) r(z) = ±. Therefore, there is no zero-weight arc between x and y, and there is exactly one such arc between x and z, and one between y and z. In other words, two arcs have a zero weight in each structure associated with a clause, i.e., 2m arcs of zero weight for all clauses. In addition to the n zero-weight arcs we obtained for the variables, we get 2m + n = K zero-weight arcs in G r, and (G,K) is a positive instance of Maximization of Local Accesses. Conversely, suppose that (G, K) is a positive instance of Maximization of Local Accesses. Let r be a shift of G such that G r has at least K zero-weight arcs. We define, for each literal u U: { true if r(u) mod 2 = T(u) = false if r(u) mod 2 = Note that a shift does not change the total weight along a circuit, thus at most one of the two arcs associated to a variable can have a zero weight after the shift (otherwise the weight of the circuit would have a weight equal to and not to 2). We now show that at most two arcs in the structure associated to a clause can have a zero weight after the shift. First, the same observation as before shows that only one of the two arcs between two different literals can have a zero weight, therefore at most 3 such arcs for each clause. Suppose that there is a clause c = {x,y,z} with at least two zero-weight arcs after the shift, for example between x and y (thus r(y) = r(x) ± ), and between x and z (thus r(z) = r(x) ± ). Then, either r(y) = r(z), or r(y) = r(z) ± 2, and in both cases, there is no zero-weight arc between y and z. Therefore, each structure associated to a clause has at most 2 zero-weight arcs after the shift (actually, all this is true even if the shift is not legal). To summarize, G r has at most 2m + n zero-weight arcs, and by hypothesis, it has at least K = 2m + n zero-weight arcs. Thus, it has exactly K = 2m + n arcs of zero weight, i.e., one for each structure associated to a variable, and two for each structure associated to a clause. It remains to show that T is a truth assignment with at least one true and one false literal in each clause. Since there is a zero-weight arc between u and ū, r(u) = r(ū) ±. Thus, r(u) mod 2 r(ū) mod 2 and T(u) T(ū); Each clause c = {x,y,z} contains exactly two arcs of zero weight; consider one of them, for example between x and y. We have r(x) = r(y) ± and thus T(x) T(y). Therefore, (U,C) is a positive instance of Not-All-Equal 3SAT. We just proved that (G, K) is a positive instance of Maximization of Local Accesses if and only if (U,T) is a positive instance of Not-All-Equal 3SAT. This proves that Maximization of Local Accesses is strongly NP-complete. Note that the instance built in the previous proof is a graph that has always circuits. Nevertheless, it is possible to show that Maximization of Local Accesses is strongly NPcomplete even for a graph with no circuit (and even if the shift is supposed to be legal or not). The proof is more technical, but we give it here for completeness. 
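Before giving that proof, here is a small sketch of the transformation used in Theorem 3 (our own illustration; the encoding of a literal as +i or -i for variable i is an assumption of the example). It builds the graph from a Not-All-Equal 3SAT instance and, given a not-all-equal truth assignment, derives a shift in the spirit of the proof (values 0 or 1 depending on the truth value) and counts the zero-weight arcs, which reach K = 2m + n exactly when the assignment is valid.

```python
def build_reduction(n, clauses):
    """Vertices 'u1', 'nu1', ... for each variable; every arc has weight 1."""
    pos = lambda i: "u%d" % i     # vertex for the literal  x_i
    neg = lambda i: "nu%d" % i    # vertex for the literal !x_i
    arcs = []
    for i in range(1, n + 1):                      # variable gadget: a 2-circuit
        arcs += [(pos(i), neg(i), 1), (neg(i), pos(i), 1)]
    lit = lambda l: pos(l) if l > 0 else neg(-l)
    for (x, y, z) in clauses:                      # clause gadget: 6 arcs
        for a, b in [(x, y), (y, x), (x, z), (z, x), (y, z), (z, y)]:
            arcs.append((lit(a), lit(b), 1))
    return arcs, 2 * len(clauses) + n              # K = 2m + n

def shift_from_assignment(n, assignment):
    """Shift with values in {0,1}: a literal vertex gets 1 when the literal is true."""
    r = {}
    for i in range(1, n + 1):
        r["u%d" % i] = 1 if assignment[i] else 0
        r["nu%d" % i] = 1 - r["u%d" % i]
    return r

def count_zero_weight(arcs, r):
    return sum(1 for (u, v, w) in arcs if w + r[v] - r[u] == 0)

# (x1 or x2 or !x3) with the not-all-equal assignment x1=True, x2=False, x3=False.
arcs, K = build_reduction(3, [(1, 2, -3)])
r = shift_from_assignment(3, {1: True, 2: False, 3: False})
print(count_zero_weight(arcs, r), K)   # both equal 2*1 + 3 = 5
```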
Theorem 4 Maximization of Local Accesses is strongly NP-complete even for a graph with no circuit and weights equal to 0 or 1.

16 Proof: The same proof as in Theorem 3 shows that the problem belongs to NP. The rest of the proof is by reduction from the problem One-in-Three 3SAT (Problem LO4 in [9, p. 259]) that we recall here. One-In-Three 3SAT: Instance: A set U of n boolean variables and a set C of m clauses over U such that each clause c C has c = 3. Question: Is there a truth assignment for U such that each clause in C has exactly one true literal? Transformation Let (U,C) be an instance of One-in-Three 3SAT. We define an instance G = (V,E,d) of Maximization of local accesses as follows. We start from G = (V,E,d) with two vertices a and b (V = {a,b}) and E is a set of 48mn + 8m arcs with weight, from a to b. We call this initial structure the base structure (see Figure 6). a 48mn+8m b Figure 6: Base structure. For each variable u U, we add to G two vertices u and ū, and 24m arcs with weight, from ū to b, 6m arcs with weight from u to ū, 24m arcs with weight from a to u, 6m arcs of weight from a to ū, and 6m arcs with weight from u to b (see Figure 7). a u u b Figure 7: Transformation of a variable (each arc on the figure is repeated 8m times). For each clause c = {x,y,z} C, we consider what we call derivative clauses c = {x,ȳ, z}, c = { x,y, z}, and c 2 = { x,ȳ,z}. We add to G three vertices c, c, and c 2, and for each derivative clause c i = {x,y,z }, i {,,2}, we add an arc with weight from a to c i, 4

17 an arc with weight from c i to each of the vertices x, y, and z (built by the variables), and an arc with weight from each of the vertices x, y, and z to b (see Figure 8). x a c i y b z Figure 8: Transformation of a clause (for one derivative clause). We let K = 96mn + 22m. The graph G built this way has no circuit. It has 2n + 3m + 2 vertices and 44mn + 29m arcs. Reduction Let (U, C) be a positive instance of One-in-Three 3SAT, and let T be a truth assignment for U such that each clause has exactly one true literal. We define a shift r for G as follows. We let r(a) = r(b) =, for each u U: r(u) = { if T(u) = true if T(u) = false r(ū) = r(u) and for each clause c C, for each derivative clause c i = {x,y,z }, i {,,2}: { if T(x r(c i ) = ) = T(y ) = T(z ) = true otherwise Since r(a) = r(b), the shift is legal for the base structure of G and the 48mn+8m corresponding arcs have a weight equal to in G r. For each variable u U, either r(u) = and r(ū) =, or r(u) = and r(ū) =. In both cases, this gives rise to exactly 6 groups of 8m arcs with weight in each structure associated to a variable (i.e., 6 arcs with weight on Figure 7). Therefore, there is a total of 48mn arcs of weight for all variables (note also that the shift is legal for all arcs associated to variables). For each clause c C, exactly one of the literals of c is true, therefore exactly one derivative clause c i = {x,y,z }, i {,,2} has all its literals true. For this derivative clause, r(x ) = r(y ) = r(z ) = and r(c i ) =, thus the shift is legal for the associated structure and it gives rise to 6 arcs with weight. For any other derivative clause c j, r(c j ) = since at least one literal is false, and whatever the shift of the other vertices ( or ), the shift is legal for the structure associated to c j and gives rise to 4 arcs with weight. Therefore, there is a total of = 4 arcs with weight for each clause and 4m such arcs for all the clauses. For the full graph G r, we get a total of 96mn + 22m arcs with weight : (G,K) is a positive instance of Maximization of local accesses. Conversely, suppose that (G, K) is a positive instance of Maximization of local accesses. Let r be a shift of G such that G r has at least K zero-weight arcs. Without loss of generality, we can assume that r(a) =, otherwise we subtract r(a) to all values of the shift. Before going further, we prove some properties of the shift r. We first prove that r(a) = r(b) =, then that for each variable u U, r(u) and r(ū) are equal to or, and that they are not equal. 5

18 Note that each structure built from a variable contains 2 8m = 96m arcs and that each structure built from a clause contains 3 7 = 2 arcs, thus a total of 96mn + 2m arcs for all structures. Therefore, if G r contains K = 96mn + 22m arcs with weight or more, then some of them belong to the base structure. Since all arcs for the base structure have the same initial weight, the same source a, and the same destination b, they all have a weight in G r. Thus r(a) = r(b) = and there are exactly 48mn + 8m arcs with weight in the base structure. Now consider the structure built from a variable u U. Assume that r(u) < or r(u) > : in both cases, the arcs from a to u and the arcs from u to b have a nonzero weight after the shift (since r(a) = r(b) = ). Furthermore, the arcs from a to ū and from u to ū cannot have a weight simultaneously (otherwise, r(ū) = and r(u) = ). Therefore, at most 4m arcs have a weight (the arcs from a to ū or the arcs from u to ū, plus possibly the arcs from ū to b). Now suppose that r(ū) < or r(ū) > : in both cases, the arcs from a to ū and the arcs from ū to b have a nonzero weight (since r(a) = r(b) = ). Furthermore, the arcs from u to ū and the arcs from u to b cannot have a weight simultaneously since they have the same source u, the same initial weight, but r(ū) r(b). Therefore, again, the structure contains at most 4m arcs of weight (the arcs from u to ū or the arcs from u to b, plus possibly the arcs from a to u). Finally, we can easily check that if r(u) = r(ū) = or if r(u) = r(ū) =, then the structure corresponding to u contains exactly 4m arcs of weight, and if r(u) = and r(ū) =, or if r(u) = and r(ū) =, then it contains 48m arcs of weight. Now, if at least one of the structures built from a variable contains at most 4m arcs with weight, then the total number of arcs with weight in the structures associated to variables, plus the base structure, is at most 48mn + 8m + 48m(n ) + 4m = 96mn. By definition of K, we still need 22m arcs of weight in the structures associated to clauses, but the total number of arcs in these structures is only 2m. Therefore, each structure associated to a variable must contain exactly 48m arcs with weight after the shift, and for each variable u U, either r(u) = and r(ū) =, or r(u) = and r(ū) = (note also that the shift is legal for all arcs in the structure). We define, for each literal u U: { false if r(u) = T(u) = true otherwise { false if r(ū) = and T(ū) = true otherwise The previous study shows that for each variable u U, T(u) T(ū), thus T is a truth assignment. It remains to show that each clause contains exactly one true literal. Consider a derivative clause c i = {x,y,z } obtained from a clause c C: it corresponds to a structure in which r(x ), r(y ), and r(z ) are equal to or (since each such vertex is the u or ū contained in the structure associated to a variable u). Assume that r(c i ) < or r(c i ) >. Then the arc from a to c i and the arcs from c i to each of the vertices x, y, and z have a nonzero weight after the shift. In this case, at most 3 arcs have a weight. Furthermore, if r(c i ) =, then the structure has exactly 4 arcs with weight, and if r(c i ) =, then either r(x ) = r(y ) = r(z ) = and the structure contains exactly 6 arcs of weight, or it is easy to check that it contains at most 4 arcs of weight (and some have a negative weight). Finally, by construction of derivative clauses c, c, and c 2, for each pair of such clauses (c i,c j ), there is a variable u U such that u c i and ū c j. 
Therefore, at most one of the three derivative structures is such that r(x ) = r(y ) = r(z ) = and contains 6 arcs of weight (each other derivative clause contains at most 4 such arcs). If at least one clause has strictly less than arcs with weight in its 3 derivative clauses, then with the arcs of other clauses, we get at most ( ) (m ) + 3 = 4m arcs of weight. With the other structures, we get at most 48mn + 8m + 48mn + 4m = 96mn + 22m arcs of weight, which is not enough. Therefore, for each clause c C, exactly one derivative clause c i contains 6 arcs of zero weight and the two other derivative clauses contains 4 such arcs. This means that 6


More information

CMPSCI611: The SUBSET-SUM Problem Lecture 18

CMPSCI611: The SUBSET-SUM Problem Lecture 18 CMPSCI611: The SUBSET-SUM Problem Lecture 18 We begin today with the problem we didn t get to at the end of last lecture the SUBSET-SUM problem, which we also saw back in Lecture 8. The input to SUBSET-

More information

Bijective Proofs of Two Broken Circuit Theorems

Bijective Proofs of Two Broken Circuit Theorems Bijective Proofs of Two Broken Circuit Theorems Andreas Blass PENNSYLVANIA STATE UNIVERSITY UNIVERSITY PARK, PENNSYLVANIA 16802 Bruce Eli Sagan THE UNIVERSITY OF PENNSYLVANIA PHILADELPHIA, PENNSYLVANIA

More information

CMPSCI 311: Introduction to Algorithms Practice Final Exam

CMPSCI 311: Introduction to Algorithms Practice Final Exam CMPSCI 311: Introduction to Algorithms Practice Final Exam Name: ID: Instructions: Answer the questions directly on the exam pages. Show all your work for each question. Providing more detail including

More information

P and NP CISC4080, Computer Algorithms CIS, Fordham Univ. Instructor: X. Zhang

P and NP CISC4080, Computer Algorithms CIS, Fordham Univ. Instructor: X. Zhang P and NP CISC4080, Computer Algorithms CIS, Fordham Univ. Instructor: X. Zhang Efficient Algorithms So far, we have developed algorithms for finding shortest paths in graphs, minimum spanning trees in

More information

Lecture 9 - Matrix Multiplication Equivalences and Spectral Graph Theory 1

Lecture 9 - Matrix Multiplication Equivalences and Spectral Graph Theory 1 CME 305: Discrete Mathematics and Algorithms Instructor: Professor Aaron Sidford (sidford@stanfordedu) February 6, 2018 Lecture 9 - Matrix Multiplication Equivalences and Spectral Graph Theory 1 In the

More information

LECTURES 3 and 4: Flows and Matchings

LECTURES 3 and 4: Flows and Matchings LECTURES 3 and 4: Flows and Matchings 1 Max Flow MAX FLOW (SP). Instance: Directed graph N = (V,A), two nodes s,t V, and capacities on the arcs c : A R +. A flow is a set of numbers on the arcs such that

More information

Routing Reconfiguration/Process Number: Networks with Shared Bandwidth.

Routing Reconfiguration/Process Number: Networks with Shared Bandwidth. INSTITUT NATIONAL DE RECHERCHE EN INFORMATIQUE ET EN AUTOMATIQUE Routing Reconfiguration/Process Number: Networks with Shared Bandwidth. David Coudert Dorian Mazauric Nicolas Nisse N 6790 January 2009

More information

1 Variations of the Traveling Salesman Problem

1 Variations of the Traveling Salesman Problem Stanford University CS26: Optimization Handout 3 Luca Trevisan January, 20 Lecture 3 In which we prove the equivalence of three versions of the Traveling Salesman Problem, we provide a 2-approximate algorithm,

More information

x ji = s i, i N, (1.1)

x ji = s i, i N, (1.1) Dual Ascent Methods. DUAL ASCENT In this chapter we focus on the minimum cost flow problem minimize subject to (i,j) A {j (i,j) A} a ij x ij x ij {j (j,i) A} (MCF) x ji = s i, i N, (.) b ij x ij c ij,

More information

Definition: A graph G = (V, E) is called a tree if G is connected and acyclic. The following theorem captures many important facts about trees.

Definition: A graph G = (V, E) is called a tree if G is connected and acyclic. The following theorem captures many important facts about trees. Tree 1. Trees and their Properties. Spanning trees 3. Minimum Spanning Trees 4. Applications of Minimum Spanning Trees 5. Minimum Spanning Tree Algorithms 1.1 Properties of Trees: Definition: A graph G

More information

Fast algorithms for max independent set

Fast algorithms for max independent set Fast algorithms for max independent set N. Bourgeois 1 B. Escoffier 1 V. Th. Paschos 1 J.M.M. van Rooij 2 1 LAMSADE, CNRS and Université Paris-Dauphine, France {bourgeois,escoffier,paschos}@lamsade.dauphine.fr

More information

ALGORITHMS EXAMINATION Department of Computer Science New York University December 17, 2007

ALGORITHMS EXAMINATION Department of Computer Science New York University December 17, 2007 ALGORITHMS EXAMINATION Department of Computer Science New York University December 17, 2007 This examination is a three hour exam. All questions carry the same weight. Answer all of the following six questions.

More information

1 Definition of Reduction

1 Definition of Reduction 1 Definition of Reduction Problem A is reducible, or more technically Turing reducible, to problem B, denoted A B if there a main program M to solve problem A that lacks only a procedure to solve problem

More information

Lecture 7: Counting classes

Lecture 7: Counting classes princeton university cos 522: computational complexity Lecture 7: Counting classes Lecturer: Sanjeev Arora Scribe:Manoj First we define a few interesting problems: Given a boolean function φ, #SAT is the

More information

11/22/2016. Chapter 9 Graph Algorithms. Introduction. Definitions. Definitions. Definitions. Definitions

11/22/2016. Chapter 9 Graph Algorithms. Introduction. Definitions. Definitions. Definitions. Definitions Introduction Chapter 9 Graph Algorithms graph theory useful in practice represent many real-life problems can be slow if not careful with data structures 2 Definitions an undirected graph G = (V, E) is

More information

The optimal routing of augmented cubes.

The optimal routing of augmented cubes. The optimal routing of augmented cubes. Meirun Chen, Reza Naserasr To cite this version: Meirun Chen, Reza Naserasr. The optimal routing of augmented cubes.. Information Processing Letters, Elsevier, 28.

More information

Chapter 9 Graph Algorithms

Chapter 9 Graph Algorithms Chapter 9 Graph Algorithms 2 Introduction graph theory useful in practice represent many real-life problems can be slow if not careful with data structures 3 Definitions an undirected graph G = (V, E)

More information

NP-complete Reductions

NP-complete Reductions NP-complete Reductions 1. Prove that 3SAT P DOUBLE-SAT, i.e., show DOUBLE-SAT is NP-complete by reduction from 3SAT. The 3-SAT problem consists of a conjunction of clauses over n Boolean variables, where

More information

COMP260 Spring 2014 Notes: February 4th

COMP260 Spring 2014 Notes: February 4th COMP260 Spring 2014 Notes: February 4th Andrew Winslow In these notes, all graphs are undirected. We consider matching, covering, and packing in bipartite graphs, general graphs, and hypergraphs. We also

More information

ABC basics (compilation from different articles)

ABC basics (compilation from different articles) 1. AIG construction 2. AIG optimization 3. Technology mapping ABC basics (compilation from different articles) 1. BACKGROUND An And-Inverter Graph (AIG) is a directed acyclic graph (DAG), in which a node

More information

Feedback Arc Set in Bipartite Tournaments is NP-Complete

Feedback Arc Set in Bipartite Tournaments is NP-Complete Feedback Arc Set in Bipartite Tournaments is NP-Complete Jiong Guo 1 Falk Hüffner 1 Hannes Moser 2 Institut für Informatik, Friedrich-Schiller-Universität Jena, Ernst-Abbe-Platz 2, D-07743 Jena, Germany

More information

Efficient Polynomial-Time Nested Loop Fusion with Full Parallelism

Efficient Polynomial-Time Nested Loop Fusion with Full Parallelism Efficient Polynomial-Time Nested Loop Fusion with Full Parallelism Edwin H.-M. Sha Timothy W. O Neil Nelson L. Passos Dept. of Computer Science Dept. of Computer Science Dept. of Computer Science Erik

More information

Scheduling Tasks Sharing Files from Distributed Repositories (revised version)

Scheduling Tasks Sharing Files from Distributed Repositories (revised version) INSTITUT NATIONAL DE RECHERCHE EN INFORMATIQUE ET EN AUTOMATIQUE Scheduling Tasks Sharing Files from Distributed Repositories (revised version) Arnaud Giersch Yves Robert Frédéric Vivien N February 00

More information

NP versus PSPACE. Frank Vega. To cite this version: HAL Id: hal https://hal.archives-ouvertes.fr/hal

NP versus PSPACE. Frank Vega. To cite this version: HAL Id: hal https://hal.archives-ouvertes.fr/hal NP versus PSPACE Frank Vega To cite this version: Frank Vega. NP versus PSPACE. Preprint submitted to Theoretical Computer Science 2015. 2015. HAL Id: hal-01196489 https://hal.archives-ouvertes.fr/hal-01196489

More information

Computational problems. Lecture 2: Combinatorial search and optimisation problems. Computational problems. Examples. Example

Computational problems. Lecture 2: Combinatorial search and optimisation problems. Computational problems. Examples. Example Lecture 2: Combinatorial search and optimisation problems Different types of computational problems Examples of computational problems Relationships between problems Computational properties of different

More information

Vertex Cover Approximations

Vertex Cover Approximations CS124 Lecture 20 Heuristics can be useful in practice, but sometimes we would like to have guarantees. Approximation algorithms give guarantees. It is worth keeping in mind that sometimes approximation

More information

NP-Hardness. We start by defining types of problem, and then move on to defining the polynomial-time reductions.

NP-Hardness. We start by defining types of problem, and then move on to defining the polynomial-time reductions. CS 787: Advanced Algorithms NP-Hardness Instructor: Dieter van Melkebeek We review the concept of polynomial-time reductions, define various classes of problems including NP-complete, and show that 3-SAT

More information

Increasing Parallelism of Loops with the Loop Distribution Technique

Increasing Parallelism of Loops with the Loop Distribution Technique Increasing Parallelism of Loops with the Loop Distribution Technique Ku-Nien Chang and Chang-Biau Yang Department of pplied Mathematics National Sun Yat-sen University Kaohsiung, Taiwan 804, ROC cbyang@math.nsysu.edu.tw

More information

Computability Theory

Computability Theory CS:4330 Theory of Computation Spring 2018 Computability Theory Other NP-Complete Problems Haniel Barbosa Readings for this lecture Chapter 7 of [Sipser 1996], 3rd edition. Sections 7.4 and 7.5. The 3SAT

More information

Lecture 22 Tuesday, April 10

Lecture 22 Tuesday, April 10 CIS 160 - Spring 2018 (instructor Val Tannen) Lecture 22 Tuesday, April 10 GRAPH THEORY Directed Graphs Directed graphs (a.k.a. digraphs) are an important mathematical modeling tool in Computer Science,

More information

Recitation 4: Elimination algorithm, reconstituted graph, triangulation

Recitation 4: Elimination algorithm, reconstituted graph, triangulation Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.438 Algorithms For Inference Fall 2014 Recitation 4: Elimination algorithm, reconstituted graph, triangulation

More information

Digital Logic Design: a rigorous approach c

Digital Logic Design: a rigorous approach c Digital Logic Design: a rigorous approach c Chapter 4: Directed Graphs Guy Even Moti Medina School of Electrical Engineering Tel-Aviv Univ. October 31, 2017 Book Homepage: http://www.eng.tau.ac.il/~guy/even-medina

More information

8 Matroid Intersection

8 Matroid Intersection 8 Matroid Intersection 8.1 Definition and examples 8.2 Matroid Intersection Algorithm 8.1 Definitions Given two matroids M 1 = (X, I 1 ) and M 2 = (X, I 2 ) on the same set X, their intersection is M 1

More information

Optimizing Latency and Reliability of Pipeline Workflow Applications

Optimizing Latency and Reliability of Pipeline Workflow Applications Laboratoire de l Informatique du Parallélisme École Normale Supérieure de Lyon Unité Mixte de Recherche CNRS-INRIA-ENS LYON-UCBL n o 5668 Optimizing Latency and Reliability of Pipeline Workflow Applications

More information

Problem Set 7 Solutions

Problem Set 7 Solutions Design and Analysis of Algorithms March 0, 2015 Massachusetts Institute of Technology 6.046J/18.410J Profs. Erik Demaine, Srini Devadas, and Nancy Lynch Problem Set 7 Solutions Problem Set 7 Solutions

More information

Example of a Demonstration that a Problem is NP-Complete by reduction from CNF-SAT

Example of a Demonstration that a Problem is NP-Complete by reduction from CNF-SAT 20170926 CNF-SAT: CNF-SAT is a problem in NP, defined as follows: Let E be a Boolean expression with m clauses and n literals (literals = variables, possibly negated), in which - each clause contains only

More information

Eulerian disjoint paths problem in grid graphs is NP-complete

Eulerian disjoint paths problem in grid graphs is NP-complete Discrete Applied Mathematics 143 (2004) 336 341 Notes Eulerian disjoint paths problem in grid graphs is NP-complete Daniel Marx www.elsevier.com/locate/dam Department of Computer Science and Information

More information

Lecture 11: Maximum flow and minimum cut

Lecture 11: Maximum flow and minimum cut Optimisation Part IB - Easter 2018 Lecture 11: Maximum flow and minimum cut Lecturer: Quentin Berthet 4.4. The maximum flow problem. We consider in this lecture a particular kind of flow problem, with

More information

Representation of Finite Games as Network Congestion Games

Representation of Finite Games as Network Congestion Games Representation of Finite Games as Network Congestion Games Igal Milchtaich To cite this version: Igal Milchtaich. Representation of Finite Games as Network Congestion Games. Roberto Cominetti and Sylvain

More information

Complexity Classes and Polynomial-time Reductions

Complexity Classes and Polynomial-time Reductions COMPSCI 330: Design and Analysis of Algorithms April 19, 2016 Complexity Classes and Polynomial-time Reductions Lecturer: Debmalya Panigrahi Scribe: Tianqi Song 1 Overview In this lecture, we introduce

More information

We will focus on data dependencies: when an operand is written at some point and read at a later point. Example:!

We will focus on data dependencies: when an operand is written at some point and read at a later point. Example:! Class Notes 18 June 2014 Tufts COMP 140, Chris Gregg Detecting and Enhancing Loop-Level Parallelism Loops: the reason we can parallelize so many things If the compiler can figure out if a loop is parallel,

More information

On the Relationships between Zero Forcing Numbers and Certain Graph Coverings

On the Relationships between Zero Forcing Numbers and Certain Graph Coverings On the Relationships between Zero Forcing Numbers and Certain Graph Coverings Fatemeh Alinaghipour Taklimi, Shaun Fallat 1,, Karen Meagher 2 Department of Mathematics and Statistics, University of Regina,

More information

Notes for Lecture 24

Notes for Lecture 24 U.C. Berkeley CS170: Intro to CS Theory Handout N24 Professor Luca Trevisan December 4, 2001 Notes for Lecture 24 1 Some NP-complete Numerical Problems 1.1 Subset Sum The Subset Sum problem is defined

More information

CS261: Problem Set #1

CS261: Problem Set #1 CS261: Problem Set #1 Due by 11:59 PM on Tuesday, April 21, 2015 Instructions: (1) Form a group of 1-3 students. You should turn in only one write-up for your entire group. (2) Turn in your solutions by

More information

Best known solution time is Ω(V!) Check every permutation of vertices to see if there is a graph edge between adjacent vertices

Best known solution time is Ω(V!) Check every permutation of vertices to see if there is a graph edge between adjacent vertices Hard Problems Euler-Tour Problem Undirected graph G=(V,E) An Euler Tour is a path where every edge appears exactly once. The Euler-Tour Problem: does graph G have an Euler Path? Answerable in O(E) time.

More information

Disjoint Support Decompositions

Disjoint Support Decompositions Chapter 4 Disjoint Support Decompositions We introduce now a new property of logic functions which will be useful to further improve the quality of parameterizations in symbolic simulation. In informal

More information

Exam problems for the course Combinatorial Optimization I (DM208)

Exam problems for the course Combinatorial Optimization I (DM208) Exam problems for the course Combinatorial Optimization I (DM208) Jørgen Bang-Jensen Department of Mathematics and Computer Science University of Southern Denmark The problems are available form the course

More information

12.1 Formulation of General Perfect Matching

12.1 Formulation of General Perfect Matching CSC5160: Combinatorial Optimization and Approximation Algorithms Topic: Perfect Matching Polytope Date: 22/02/2008 Lecturer: Lap Chi Lau Scribe: Yuk Hei Chan, Ling Ding and Xiaobing Wu In this lecture,

More information

Matching Algorithms. Proof. If a bipartite graph has a perfect matching, then it is easy to see that the right hand side is a necessary condition.

Matching Algorithms. Proof. If a bipartite graph has a perfect matching, then it is easy to see that the right hand side is a necessary condition. 18.433 Combinatorial Optimization Matching Algorithms September 9,14,16 Lecturer: Santosh Vempala Given a graph G = (V, E), a matching M is a set of edges with the property that no two of the edges have

More information

Strongly connected: A directed graph is strongly connected if every pair of vertices are reachable from each other.

Strongly connected: A directed graph is strongly connected if every pair of vertices are reachable from each other. Directed Graph In a directed graph, each edge (u, v) has a direction u v. Thus (u, v) (v, u). Directed graph is useful to model many practical problems (such as one-way road in traffic network, and asymmetric

More information

6.2. Paths and Cycles

6.2. Paths and Cycles 6.2. PATHS AND CYCLES 85 6.2. Paths and Cycles 6.2.1. Paths. A path from v 0 to v n of length n is a sequence of n+1 vertices (v k ) and n edges (e k ) of the form v 0, e 1, v 1, e 2, v 2,..., e n, v n,

More information

In this lecture, we ll look at applications of duality to three problems:

In this lecture, we ll look at applications of duality to three problems: Lecture 7 Duality Applications (Part II) In this lecture, we ll look at applications of duality to three problems: 1. Finding maximum spanning trees (MST). We know that Kruskal s algorithm finds this,

More information

arxiv: v2 [cs.ds] 18 May 2015

arxiv: v2 [cs.ds] 18 May 2015 Optimal Shuffle Code with Permutation Instructions Sebastian Buchwald, Manuel Mohr, and Ignaz Rutter Karlsruhe Institute of Technology {sebastian.buchwald, manuel.mohr, rutter}@kit.edu arxiv:1504.07073v2

More information

A proof-producing CSP solver: A proof supplement

A proof-producing CSP solver: A proof supplement A proof-producing CSP solver: A proof supplement Report IE/IS-2010-02 Michael Veksler Ofer Strichman mveksler@tx.technion.ac.il ofers@ie.technion.ac.il Technion Institute of Technology April 12, 2010 Abstract

More information

CSE 421 Applications of DFS(?) Topological sort

CSE 421 Applications of DFS(?) Topological sort CSE 421 Applications of DFS(?) Topological sort Yin Tat Lee 1 Precedence Constraints In a directed graph, an edge (i, j) means task i must occur before task j. Applications Course prerequisite: course

More information

Small Survey on Perfect Graphs

Small Survey on Perfect Graphs Small Survey on Perfect Graphs Michele Alberti ENS Lyon December 8, 2010 Abstract This is a small survey on the exciting world of Perfect Graphs. We will see when a graph is perfect and which are families

More information

Boolean Functions (Formulas) and Propositional Logic

Boolean Functions (Formulas) and Propositional Logic EECS 219C: Computer-Aided Verification Boolean Satisfiability Solving Part I: Basics Sanjit A. Seshia EECS, UC Berkeley Boolean Functions (Formulas) and Propositional Logic Variables: x 1, x 2, x 3,, x

More information

Maximum flows & Maximum Matchings

Maximum flows & Maximum Matchings Chapter 9 Maximum flows & Maximum Matchings This chapter analyzes flows and matchings. We will define flows and maximum flows and present an algorithm that solves the maximum flow problem. Then matchings

More information

CSC 505, Fall 2000: Week 12

CSC 505, Fall 2000: Week 12 CSC 505, Fall 000: Week Proving the N P-completeness of a decision problem A:. Prove that A is in N P give a simple guess and check algorithm (the certificate being guessed should be something requiring

More information

Greedy Algorithms 1. For large values of d, brute force search is not feasible because there are 2 d

Greedy Algorithms 1. For large values of d, brute force search is not feasible because there are 2 d Greedy Algorithms 1 Simple Knapsack Problem Greedy Algorithms form an important class of algorithmic techniques. We illustrate the idea by applying it to a simplified version of the Knapsack Problem. Informally,

More information

Discrete Optimization. Lecture Notes 2

Discrete Optimization. Lecture Notes 2 Discrete Optimization. Lecture Notes 2 Disjunctive Constraints Defining variables and formulating linear constraints can be straightforward or more sophisticated, depending on the problem structure. The

More information

Strategies for Replica Placement in Tree Networks

Strategies for Replica Placement in Tree Networks Laboratoire de l Informatique du Parallélisme École Normale Supérieure de Lyon Unité Mixte de Recherche CNRS-INRIA-ENS LYON-UCBL n o 5668 Strategies for Replica Placement in Tree Networks Anne Benoit,

More information

Solution for Homework set 3

Solution for Homework set 3 TTIC 300 and CMSC 37000 Algorithms Winter 07 Solution for Homework set 3 Question (0 points) We are given a directed graph G = (V, E), with two special vertices s and t, and non-negative integral capacities

More information

Section 3.1: Nonseparable Graphs Cut vertex of a connected graph G: A vertex x G such that G x is not connected. Theorem 3.1, p. 57: Every connected

Section 3.1: Nonseparable Graphs Cut vertex of a connected graph G: A vertex x G such that G x is not connected. Theorem 3.1, p. 57: Every connected Section 3.1: Nonseparable Graphs Cut vertex of a connected graph G: A vertex x G such that G x is not connected. Theorem 3.1, p. 57: Every connected graph G with at least 2 vertices contains at least 2

More information

Math 443/543 Graph Theory Notes 11: Graph minors and Kuratowski s Theorem

Math 443/543 Graph Theory Notes 11: Graph minors and Kuratowski s Theorem Math 443/543 Graph Theory Notes 11: Graph minors and Kuratowski s Theorem David Glickenstein November 26, 2008 1 Graph minors Let s revisit some de nitions. Let G = (V; E) be a graph. De nition 1 Removing

More information

LID Assignment In InfiniBand Networks

LID Assignment In InfiniBand Networks LID Assignment In InfiniBand Networks Wickus Nienaber, Xin Yuan, Member, IEEE and Zhenhai Duan, Member, IEEE Abstract To realize a path in an InfiniBand network, an address, known as Local IDentifier (LID)

More information

/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Approximation algorithms Date: 11/27/18

/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Approximation algorithms Date: 11/27/18 601.433/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Approximation algorithms Date: 11/27/18 22.1 Introduction We spent the last two lectures proving that for certain problems, we can

More information

On Universal Cycles of Labeled Graphs

On Universal Cycles of Labeled Graphs On Universal Cycles of Labeled Graphs Greg Brockman Harvard University Cambridge, MA 02138 United States brockman@hcs.harvard.edu Bill Kay University of South Carolina Columbia, SC 29208 United States

More information

Lecture 10 October 7, 2014

Lecture 10 October 7, 2014 6.890: Algorithmic Lower Bounds: Fun With Hardness Proofs Fall 2014 Lecture 10 October 7, 2014 Prof. Erik Demaine Scribes: Fermi Ma, Asa Oines, Mikhail Rudoy, Erik Waingarten Overview This lecture begins

More information

Module 11. Directed Graphs. Contents

Module 11. Directed Graphs. Contents Module 11 Directed Graphs Contents 11.1 Basic concepts......................... 256 Underlying graph of a digraph................ 257 Out-degrees and in-degrees.................. 258 Isomorphism..........................

More information

Diverse Routing with the star property

Diverse Routing with the star property Diverse Routing with the star property Jean-Claude Bermond, David Coudert, Gianlorenzo D Angelo, Fatima Zahra Moataz RESEARCH REPORT N 8071 September 2012 Project-Team MASCOTTE ISSN 0249-6399 ISRN INRIA/RR--8071--FR+ENG

More information

CPSC 536N: Randomized Algorithms Term 2. Lecture 10

CPSC 536N: Randomized Algorithms Term 2. Lecture 10 CPSC 536N: Randomized Algorithms 011-1 Term Prof. Nick Harvey Lecture 10 University of British Columbia In the first lecture we discussed the Max Cut problem, which is NP-complete, and we presented a very

More information

Structure of spaces of rhombus tilings in the lexicograhic case

Structure of spaces of rhombus tilings in the lexicograhic case EuroComb 5 DMTCS proc. AE, 5, 45 5 Structure of spaces of rhombus tilings in the lexicograhic case Eric Rémila, Laboratoire de l Informatique du Parallélisme (umr 5668 CNRS-INRIA-Univ. Lyon -ENS Lyon),

More information

1 Introduction. 1. Prove the problem lies in the class NP. 2. Find an NP-complete problem that reduces to it.

1 Introduction. 1. Prove the problem lies in the class NP. 2. Find an NP-complete problem that reduces to it. 1 Introduction There are hundreds of NP-complete problems. For a recent selection see http://www. csc.liv.ac.uk/ ped/teachadmin/comp202/annotated_np.html Also, see the book M. R. Garey and D. S. Johnson.

More information

Global Register Allocation

Global Register Allocation Global Register Allocation Lecture Outline Memory Hierarchy Management Register Allocation via Graph Coloring Register interference graph Graph coloring heuristics Spilling Cache Management 2 The Memory

More information