
Bachelor Seminar: Complexity Analysis of Register Allocation

Markus Koenig, Embedded Systems Group, University of Kaiserslautern, m koenig13@cs.uni-kl.de

Abstract. Register allocation is the task of assigning the temporary variables of a program to the registers available in the machine. Optimal register allocation (minimizing the number of registers used) was proved NP-complete by Chaitin et al. via a reduction from the graph coloring problem: they showed that for any arbitrary graph there exists a program whose interference graph is exactly that graph. In this paper we study two existing analyses of the complexity of register allocation [5, 1]. [5] proves that although optimal register allocation can be done in polynomial time for programs in static single assignment (SSA) form, the problem after classical SSA elimination is again NP-complete. [1] shows that although register allocation is NP-complete because of its correspondence with graph coloring, the real complexity arises from the further optimizations of spilling and coalescing and from critical edges. Furthermore, we study a technique for solving register allocation and instruction scheduling in combination [4].

1 Introduction

In most programs values have to be stored for later use. This makes register allocation important, and it is a reason to perform it as fast as possible. The physical memory is split into two or more levels; here the simple picture of main memory plus registers is sufficient. Storing a variable in main memory costs, compared to a register, a lot of time, so it should be avoided whenever possible. Such a memory access is called a spill (a load/store). Sometimes it is useful to transfer a variable to another register: adding such a move instruction is called splitting, and removing one is called coalescing. Before register allocation is done it is therefore a good idea to check whether splitting or coalescing can save a spill. Another problem is to find the variable that is best to spill, and more generally the minimum set of variables that have to be spilled. It is also important to mention that the number of registers is part of the input, yet the algorithm should still find the smallest number of registers that need to be allocated. Static single assignment (SSA) form is used in compilers as a first step before a program is translated into an executable form; in SSA form every variable is defined exactly once and each use refers to exactly one definition. The second topic is the use of instruction scheduling during register allocation; the problems that arise when the two are applied separately are shown later. The combined method is called CRISP (Combined Register allocation and Instruction Scheduling Problem). To evaluate the combined solution a cost function is used, together with a detailed analysis of the single steps needed to reorder the basic blocks. Regarding complexity, an alternative view to plain graph coloring is shown, and finally an experiment provides numbers for comparison with other algorithms.

Section 3.1 introduces the SSA model: some transformations are performed before the coloring, in the hope that after the SSA process the chordal structure of the graph is good enough for a simple coloring. Section 3.2 looks at the problem structure and takes a closer look at Chaitin's proof: coloring an arbitrary graph is NP-complete, but not every instance is that hard, and by now there are several optimization algorithms; the two example figures show that such optimization can still save a register. Section 3.3 is another way to prepare the graph coloring, a combined method that includes instruction scheduling and register allocation. An example that shows the effect of the combined method is given in Figure 8, and an experiment compares this approach with the separate use of instruction scheduling and register allocation. Then the limitations of the different improvements are shown. Unfortunately, in some cases it is impossible to obtain an easy graph for coloring, so the time-expensive spilling and the NP-complete coloring still have to be done.

2 Related Work

The first NP-completeness proof for register allocation is due to R. Sethi. He modeled the problem as graph coloring: the variables are the vertices, and two vertices are connected by an edge if they are live at the same time during program execution; the number of available registers is the number k of possible colors. He used a DAG (directed acyclic graph) and found that the NP-completeness comes from the fact that the order of the instructions in the program code is not fixed. Already in this first approach we see that the problem has two parts: on the one hand deciding whether a variable has to be spilled or not, and on the other hand coloring the graph so that K registers suffice. One of these is defined precisely by Pereira and Palsberg [5]:

Core register allocation problem. Instance: a program P and a number K of available registers. Problem: can each of the temporaries of P be mapped to one of the K registers such that temporary variables with interfering live ranges are assigned to different registers?

To analyze the problem carefully it is important to look at the single steps of a problem instance. First we are given a program that consists of instructions over variables. Then, before a graph is modeled and colored, there are some optimization steps to take. The basic blocks deserve a special look: the similarity between a basic block, which is the smallest unit of the program that is analyzed, and the final coloring, which is the biggest step, becomes visible there.
Motwani et al. [4] give a formulation of the combined register allocation and instruction scheduling problem within a basic block as a single optimization problem, and show that even a simple instance of the combined problem (single register, no latencies, single functional unit) is NP-hard, although instruction scheduling for a basic block with 0/1 latencies on a single pipelined functional unit by itself is not NP-hard.
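To make the core register allocation problem concrete, the following small Python sketch builds an interference graph from live ranges and checks a given assignment of temporaries to K registers. It is only an illustration: the live ranges, the names and the overlap convention are invented here and are not taken from [5].

    # Minimal sketch of the core register allocation problem as a checking task.
    # Live ranges, the example program, and all names are illustrative only.

    def interference_graph(live_ranges):
        """Build an interference graph: two temporaries interfere
        if their live ranges [start, end) overlap."""
        temps = list(live_ranges)
        edges = set()
        for i, a in enumerate(temps):
            for b in temps[i + 1:]:
                sa, ea = live_ranges[a]
                sb, eb = live_ranges[b]
                if sa < eb and sb < ea:          # overlapping intervals
                    edges.add(frozenset((a, b)))
        return temps, edges

    def is_valid_allocation(assignment, edges, k):
        """Check that at most k registers are used and interfering
        temporaries never share a register."""
        if len(set(assignment.values())) > k:
            return False
        return all(assignment[a] != assignment[b] for a, b in map(tuple, edges))

    # Tiny example: three temporaries, two of them live at the same time.
    ranges = {"t1": (0, 4), "t2": (2, 6), "t3": (5, 8)}
    temps, edges = interference_graph(ranges)
    print(is_valid_allocation({"t1": "r0", "t2": "r1", "t3": "r0"}, edges, k=2))  # True

Deciding whether such a valid assignment exists at all for a given K is exactly the NP-complete question discussed above; the check itself is easy.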

Figure 1: (a) shows the matrix equation with the φ-functions; (b) and (c) show a matrix and its semantics. There is a φ-function for each of the n rows, and each column represents a different execution path in the program. [4]

The improvements will shorten the time needed for most steps: the graph structure can be shaped more favorably, perhaps into a chordal graph, and the instructions can be ordered favorably, although this is only possible in some special cases; nevertheless, with all these improvements register allocation remains an NP-complete problem.

3 The Solution

First, Section 3.1 introduces the SSA method, which uses interval graphs, SSA circular graphs and φ-functions. Second, Section 3.2 takes a closer look at the problem instance to find out which step contributes which complexity. The last part, Section 3.3, is another method, in contrast to SSA, for simplifying the program before the graph coloring starts.

3.1 The SSA (Static Single Assignment) Approach

Phi Functions

For the SSA form the φ-functions are important. They act like a naming system for the variables, choosing the correct name and value for each variable. They are needed because in SSA form there cannot be two definitions of the same name: if a variable is assigned a second time, or in another branch of the program, the φ-function removes the old name and introduces a new one. The syntax is described here as a matrix, as modeled by Hack et al. In Figure 1 the φ-functions are evaluated simultaneously when a basic block is entered. Because every column corresponds to a separate incoming path in the control flow graph, the variables within a row are independent of each other and can be allocated to the same register.
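Read this way, a φ-matrix is just one parallel copy per incoming control-flow edge. The following sketch is only an illustration of that reading (the function and the tiny example are made up, not code from [3] or [4]); it evaluates a φ-matrix when a block is entered from its j-th predecessor.

    # Evaluate a phi-matrix at block entry: row i defines dests[i], column j
    # belongs to the j-th incoming control-flow edge. All rows are read first
    # and written afterwards, i.e. the copies happen in parallel.

    def eval_phi_matrix(dests, matrix, pred_index, env):
        """dests: variables defined by the phi rows
        matrix: one row of source variable names per dest
        pred_index: which predecessor control flow came from
        env: current variable -> value mapping"""
        values = [env[row[pred_index]] for row in matrix]   # read phase
        for dest, value in zip(dests, values):              # parallel write phase
            env[dest] = value
        return env

    # Example in the spirit of Figure 1: (a1, b1) := phi[(a0, b0), (a2, b2)]
    env = {"a0": 1, "b0": 2, "a2": 10, "b2": 20}
    eval_phi_matrix(["a1", "b1"], [["a0", "a2"], ["b0", "b2"]], pred_index=1, env=env)
    print(env["a1"], env["b1"])   # 10 20 -- the values from the second predecessor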

Figure 2: (a) shows a classical program in SSA form, (b) the control flow graph, and (c) the program after SSA elimination. The three steps illustrate the transformation of a normal program into the post-SSA form. [5]

Referring to Figure 2, in (b) we can see in the second block that the variables v11 and i1 interfere, while v11 and i2 do not, exactly as the SSA discipline describes. The post-SSA form, also called SSA elimination and shown in (c), is the executable program; it is needed because φ-functions are not supported by every target language. This is why the number of variables over the whole program increases, while maxlive does not. Because a variable in SSA form may be used only once after it is defined, a new variable is introduced whenever a value has to be used a second time: the value is the same, but the name changes to satisfy the SSA condition. The compiler can perform the transformation into SSA form in cubic time. The advantage of the SSA transformation is that the interference graph of a program in SSA form is chordal and can be colored in linear time (a sketch of this greedy coloring follows below), and mapping a program into an SSA instance takes polynomial time, so all steps on this path run in polynomial time and the allocation is no longer NP-complete. This sounds good, but a problem appears once we look at all possibilities: the way back cannot be taken so easily, because information is lost in the individual steps, and a solution for the program in SSA form is not always a solution for the program we started with. The whole problem is therefore still NP-complete, but we have found a way that makes the allocation simple in some special cases.
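A minimal sketch of that greedy coloring follows, assuming a perfect elimination order of the chordal interference graph is already known (computing one, e.g. by maximum cardinality search, is standard but omitted here); for chordal graphs this never uses more colors than the size of the largest clique, i.e. maxlive.

    # Greedy coloring along a perfect elimination order (PEO). For chordal
    # graphs -- e.g. interference graphs of programs in SSA form -- coloring
    # the vertices in the reverse of a PEO needs at most maxlive colors.
    # The order below is assumed, not computed.

    def color_chordal(vertices_in_reverse_peo, adj):
        color = {}
        for v in vertices_in_reverse_peo:
            used = {color[w] for w in adj[v] if w in color}
            c = 0
            while c in used:        # smallest color not used by colored neighbors
                c += 1
            color[v] = c
        return color

    # Small chordal example: triangle a-b-c plus a pendant vertex d on c.
    adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
    print(color_chordal(["a", "b", "c", "d"], adj))  # triangle needs 3 colors, d reuses one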

SSA Circular Graphs

First we define the interval graph shown in Figure 3: two vertices are joined by an edge if and only if the intervals described by their end points d and u have a non-empty intersection. This gives a set of edges V, of which we can define three subsets:

Figure 3: (from [5])

    V_i = { (d,u) ∈ V | d < u }                      (1)
    V_l = { (d,u) ∈ V | d > u }                      (2)
    V_z = { (d,y) ∈ V | ∃u : (y,u) ∈ V_l }           (3)

The first set V_i contains the intervals whose start point d is smaller than their end point u; the second set V_l contains those that wrap around, i.e. where d is larger than u. The last set V_z marks the intersections: an interval (d,y) belongs to it if its end point y is also the start point of some arc (y,u) in V_l. For an SSA circular graph there are two additional conditions

    ∀(y,u) ∈ W_l : ∃d ∈ N : (d,y) ∈ W_z              (4)
    ∀(d,u) ∈ W_i \ W_z : ∀(d',u') ∈ W_l : u' < d     (5)

that make the difference. A vertex (a,b) consists of two extreme points a and b. The first condition says that every arc in W_l shares its starting extreme point with an interval in W_z, so two vertices share the same extreme point. The second condition is similar: every interval of W_i that is not in W_z begins only after all arcs of W_l have ended, so only the intervals of W_z share an extreme point with W_l. These shared extreme points are used as the parameters of the φ-functions. In the following, a mapping F is used that works on pairs (V,K) and splits the intervals of V_l. The results are given in Lemmas 1 to 5 by Pereira and Palsberg [5].

Lemma 1. If V is a circular graph and min(V) > K, then F(V,K) is an SSA circular graph.

Lemma 2. If W = F(V,K) is K-colorable, then two intervals in W_z and W_l that share an extreme point must be assigned the same color by any K-coloring of W.

Lemma 3. Suppose V is a circular graph and min(V) > K. Then V is K-colorable if and only if F(V,K) is K-colorable.

Lemma 4. Graph coloring for SSA circular graphs is NP-complete.
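The classification into the three subsets can be computed directly from the pairs. The sketch below follows my reading of equations (1)-(3), in particular of the definition of V_z, so treat it as an illustration rather than the paper's exact construction.

    # Classify the arcs (d, u) of a circular graph into the three subsets of
    # equations (1)-(3). The definition of V_z follows my reading of the text:
    # intervals whose end point is the start point of some wrapping arc in V_l.

    def classify_arcs(arcs):
        v_i = {(d, u) for (d, u) in arcs if d < u}     # proper intervals
        v_l = {(d, u) for (d, u) in arcs if d > u}     # arcs wrapping around
        starts_of_wrapping = {d for (d, u) in v_l}
        v_z = {(d, y) for (d, y) in arcs if y in starts_of_wrapping}
        return v_i, v_l, v_z

    arcs = {(1, 4), (3, 7), (8, 2), (5, 8)}            # (8, 2) wraps around
    print(classify_arcs(arcs))                         # (5, 8) ends where (8, 2) starts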

Figure 4: (a) shows, in the upper section, the interval graph and, below it, the corresponding graph with its edges; (b) shows the program defined by the graphs in (a). [5]

Post-SSA Programs

Lemma 5. We can color an SSA circular graph Y with K colors if and only if we can solve the core register allocation problem for H(Y,K) and K + 1 registers.

In this part, in contrast to the previous one, a new representation of circular graphs is used: the circular graph is converted into a finite list I whose elements are of the form def(j), use(j) or copy(j, j') with j, j' ∈ N. Here the j's are names used for temporaries, and when walking through the elements of I, the letters d and u denote the positions within the list of defs and copies and within the list of uses and copies, respectively. In the example of Figure 4 there are two parts to look at. First there is a loop that can be colored independently of the rest of the program, because nearly all variables inside it are used only locally; the other part is the remainder of the program outside the loop. This makes the coloring easier when programs are not as simple as this one. Once the coloring of the loop is done, it can be mapped onto the whole graph in linear time. The solution of the core register allocation problem needs K + 1 registers because of the loop control variables i and i2. A valid coloring of the example graph uses color one for a, a2, c, color two for b, d, t, t2, color three for e, e2, and the last color for the loop control variable. Unfortunately this improvement cannot reduce the complexity of the whole allocation problem, because a graph coloring still has to be done, but it shows that the worst case with its bad execution time can be avoided in some cases (see [3]).

Figure 5: An example of how to form a program from the interference graph on the left-hand side (from [1]).

3.2 Analyzing the Problem

Chaitin et al. proved that register allocation is NP-complete by a reduction from k-coloring (see [2]). Given an undirected graph G and a natural number k, the question is: can G be colored with k colors such that no two nodes joined by an edge get the same color? For k > 2 and arbitrary graphs this problem is NP-complete. For the reduction, Chaitin et al. model a program with the variables V and k as the number of available registers. Two variables u and v from V are joined by an edge if they have to be live at the same time, and the maximum number of variables that are live simultaneously at any point while the program is running is called maxlive. In Figure 5 the edges are realized as basic blocks. If, besides the variables of V, one further variable x has to be colored that in the worst case has an edge to every vertex, then a new color is needed for x; hence it is NP-complete to decide whether k + 1 registers are needed. This model, however, does not take into account the improvements that can be made before the graph coloring. Sometimes the program structure allows improvements by splitting or coalescing, and the graph is then no longer hard to color in all cases. In this example none of these optimizations can be used because the edges are critical: an edge is critical when it raises maxlive or, as in the example, when it creates a cycle. So the interesting part that makes the problem NP-complete lies in the critical edges, but only if maxlive > k or maxlive = k. The cases with maxlive < k are easy to solve: no spill is needed, and graph coloring is not a problem if we know before starting that there is at least one register for each variable. Maxlive is thus a lower bound on the number of registers needed for a fast allocation. If there were no critical edges and the optimizations produced a chordal graph, the coloring would take linear time. Hence Chaitin's proof is only relevant for the worst case with critical edges; otherwise an easier way to allocate the registers can be found. Without critical edges, splitting or coalescing could help to reduce maxlive when maxlive > k. These two steps, also called a shuffle, are tried before spilling because a shuffle is more time-efficient than a spill. If the shuffle cannot reduce maxlive, the next problem is to keep the number of spilled variables as small as possible.
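Maxlive itself is easy to compute with a sweep over the program points. The sketch below is illustrative only; it assumes live ranges are given as inclusive (start, end) intervals, which is a convention chosen here, not fixed by [1].

    # Compute maxlive: the maximum number of live ranges that overlap at any
    # program point. Live ranges are given as (start, end) with start <= end.

    def maxlive(live_ranges):
        events = []
        for start, end in live_ranges:
            events.append((start, 1))       # range becomes live
            events.append((end + 1, -1))    # range dies after its last use
        live = best = 0
        for _, delta in sorted(events):     # at ties, deaths are handled first
            live += delta
            best = max(best, live)
        return best

    print(maxlive([(0, 4), (2, 6), (5, 8)]))   # 2: at most two ranges live at once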

Figure 6: The example of Figure 5 with the critical edges split (from [1]).

Figure 6 shows a structure similar to Figure 5, but this time the critical edges are split: there are three variables u, x_u, y_u for each vertex u and a variable x_uv for each edge. It now looks as if more registers were needed for the graph coloring than before, but a split can be inserted before or after every basic block, so the graph becomes 3-colorable because the critical edges are gone. Every basic block needs three variables and, after execution, has produced three variables. For example, the first block on the left sets a, b and x_ab; the next block on the left sets y_a and y_b, so the registers originally used for a and b are overwritten by these two while their values are not lost. The last block needs all three variables but no more, because they are stored in the step between the last two blocks of the original program (Figure 5). The coloring is now easier because the variable that was connected to all others in Figure 5 is now independent and only connected to the node where it is needed. The coloring works as follows: vertex u gets one color, and the triangle formed with x_u and y_u uses the other two colors; u, v and x_uv form another triangle which, apart from u, is independent of the first one, so v and x_uv can take the same colors as x_u and y_u; the same holds for u, a further vertex w and x_uw. Proceeding like this for every node, the graph is colored with three colors, so one register less is needed than for the graph in Figure 5. Deciding 3-colorability is still NP-complete, though, so the overall complexity stays the same.

Figure 7: (from [4])

3.3 The Combined Allocation

Comparison of Three Algorithms Including the Combined Method

In this section we look at the example of Figure 7 to make a simple but very helpful observation. As in the earlier parts, a program can be split into basic blocks for a better overview and for the coloring; now the basic blocks are the focus of some improvements. The example is a basic block with six instructions, executed on a two-stage pipeline with two available registers. The first picture in Figure 8 shows instruction scheduling followed by register allocation. This method is used very often and tries to execute as many steps as possible in parallel. Parallelism is how manufacturers achieve efficiency, but in program code something really bad can happen: if more registers have to be allocated than are available, a spill instruction is needed, which takes a lot of time compared to a register access. This method therefore increases maxlive in some cases, but it avoids idle slots and achieves a better cycle count. The second picture in Figure 8 shows register allocation followed by instruction scheduling. It stems from a time when registers were scarce and a spill was not really an option; instead of an extra register, more time is spent. Compared to the first picture maxlive is one less, but the method needs two more cycles until the program execution finishes, so here we see the trade-off from the other side. The last picture in Figure 8 shows the combination of the two approaches: maxlive is two, as in the second variant, but the cycle count is one less; not as fast as the first variant, but better than each phase-ordered approach overall. In detail, the first method pushes v_5 between v_2 and v_3, so a register for v_5 is needed before v_2 is finished, which increases maxlive; the second inserts an idle slot between v_2 and v_3 and between v_5 and v_6 to avoid a collision, which increases the cycle count; the third places v_4 in the idle slot between v_2 and v_3 and also schedules v_5 late, which gives the best solution. After this observation a model is introduced for better understanding. A basic block consists of a set of instructions V = {v_1, ..., v_n}, and DG denotes the data dependence graph formed over V; an edge (v_i, v_j) means that v_i must start before v_j starts.

Figure 8: (from [4])

Every instruction of V needs an execution time t, and every edge carries an inter-instruction latency which says that the next instruction may start only a specific number of cycles after the previous one has completed. To simplify the example, every instruction takes one cycle and every latency is zero or one. A schedule specifies the start of every instruction and the end of all instructions. For the schedule, the sets use(v_i) and def(v_i) record for every instruction which variables are read and written; two further sets are needed in this context, the set of producers prod(v) and the set of consumers con(v), meaning that a variable is initialized by its producer and used by its consumers. The dependence graph DG accordingly has to know all data flows. The SSA form produces basic blocks in which every variable is defined and used only once. For each schedule, a value range of a virtual register is then defined by a triple (r, v_i, v_j) with the following properties. This notion differs from the usual live range of a variable: a virtual register can have more than one value range, because by definition a value range covers only one definition and one use, while a variable can have several uses, so every use contributes one item to the value ranges. The usual live range extends from the definition to the last use. For a single variable the value ranges cannot overlap. Virtual registers model the fact that the hardware can use the same register for input and output without any interference between the uses, which in SSA form is not possible. The next point is to define the set of spilled value ranges, SVR, and the set of active value ranges. Spilling a value range over a certain time interval means that it does not need a virtual register during this time, but it produces an overhead of store and load operations when the value range is pushed into main memory and reloaded into a register: a register for another value range is bought at the cost of time. Clearly, the following two conditions must hold: the number of available registers is always at least the bandwidth of the active value ranges, and the number of available functional units is always at least the number of instructions executing at the same time. This yields the combined register allocation and instruction scheduling problem (CRISP): CRISP can be formulated as a minimization problem in which both the number of spills and the completion time have to be reduced.
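The value ranges and the register-pressure condition can be checked mechanically for a concrete schedule. The sketch below is only an illustration: it splits each register's live range at every access, which is my reading of the value-range definition above, and all names and the tiny block are invented.

    # Derive value ranges and check the register-pressure condition of the model:
    # at every cycle, the number of registers with an active, unspilled value
    # range must not exceed the number of available registers.

    def value_ranges(schedule, defs, uses):
        """Split the live range of each virtual register at every access (def
        or use): one value range per pair of consecutive accesses."""
        regs = {r for s in list(defs.values()) + list(uses.values()) for r in s}
        ranges = []
        for r in regs:
            times = sorted(schedule[i] for i in schedule
                           if r in defs.get(i, set()) | uses.get(i, set()))
            ranges.extend((r, a, b) for a, b in zip(times, times[1:]))
        return ranges

    def pressure_ok(ranges, spilled, num_regs):
        lo = min(d for _r, d, _u in ranges)
        hi = max(u for _r, _d, u in ranges)
        for c in range(lo, hi + 1):
            active = {r for (r, d, u) in ranges
                      if d <= c <= u and (r, d, u) not in spilled}
            if len(active) > num_regs:
                return False
        return True

    # Tiny scheduled block: v1 defines a, v2 defines b using a, v3 defines c
    # using b, v4 uses a and c. With two registers the pressure check fails.
    sched = {"v1": 0, "v2": 1, "v3": 2, "v4": 3}
    defs = {"v1": {"a"}, "v2": {"b"}, "v3": {"c"}, "v4": set()}
    uses = {"v1": set(), "v2": {"a"}, "v3": {"b"}, "v4": {"a", "c"}}
    vr = value_ranges(sched, defs, uses)
    print(pressure_ok(vr, spilled=set(), num_regs=2))   # False: a, b, c live at cycle 2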
Analyzing the Complexity of CRISP and Improvements

Now the complexity of CRISP needs a closer look. Register allocation as a whole is still NP-complete, but under some circumstances it is not: instruction scheduling for a basic block with 0/1 latencies on a single functional unit is not NP-complete when a register allocation is given, and register allocation is not NP-complete when the schedule is given. Moreover, if we do not insist on the optimal solution but accept a near-optimal one, this algorithm gives a significant improvement. To show that CRISP is NP-complete, a reduction from a very similar problem, the Feedback Vertex Set (FVS) problem, is a natural choice. In detail, the two problems can be described as follows.

CRISP. Here it is a good idea to use a restricted version of CRISP, called RCRISP, which is defined like the normal CRISP with one difference: all edge latencies of the input are zero. The problem is to decide whether there is a permutation of the instructions such that the completion time is not larger than a maximum t.

Feedback Vertex Set (FVS). Given a directed graph G = (W,E) and a positive integer s, decide whether there is a feedback vertex set S ⊆ W of size at most s, where a feedback vertex set is a set of vertices whose removal (along with the incident edges) from G results in an acyclic graph.

The Reduction

Given an FVS instance, a graph G = (W,E) and an integer s, it is reduced to RCRISP as follows. The basic block is built from the vertices of W: for every vertex w there are three instructions w_1, w_2 and w_3 and a virtual register r_w with def(w_1) = use(w_2) = use(w_3) = r_w and use(w_1) = def(w_2) = def(w_3) = ∅. The only edges in the dependence graph are these: for each vertex w we add an edge from w_1 to w_2 and an edge from w_1 to w_3, and for every edge from v to w in G we connect v_3 with w_2. Finally a cost bound is defined as t = 3n + 2s, where n is the number of vertices and s is the integer from the FVS instance; indeed, the total number of instructions in the RCRISP instance is 3n and the number of allowed spills is s. The following two statements are important.

Lemma 1. The RCRISP instance has a solution σ with zero spills if and only if the FVS instance is a DAG (directed acyclic graph).

The advantage of a DAG is that there is a numbering f such that for two nodes v and w an edge from v to w exists only if f(v) < f(w). This can be used for a schedule for the RCRISP instance that orders the instructions according to the control flow graph. A bijection g : V -> {1, ..., 3n} is useful here, which assigns every instruction its position in the set of instructions; every instruction thus gets a time value by ordering it into the schedule with g, and we obtain a linear list of the instructions. This new list is constructed to respect the order given by the dependence graph: because of the condition that v < w in the graph implies f(v) < f(w), no instruction position can be swapped. Nor can too many or new instructions be created, because for every node there is only a bounded number of possibilities (def(v) and use(v)). This leads to the next point, that no spills are needed.

Theorem 1. There is a polynomial-time reduction from FVS to RCRISP, and hence RCRISP is NP-hard.

The reduction can be done in polynomial time, more precisely in time quadratic in the number of vertices. First it is important to recognize that the FVS instance and the schedule are connected by the cost relation: a schedule has cost C = 3n + 2N, where N is the number of spilled value ranges, so the FVS instance has a feedback vertex set S of size s if and only if the RCRISP instance has a schedule of total cost at most t = 3n + 2s. Thus the feedback vertex set has size at most s if and only if there are at most s spills in the schedule of the RCRISP instance. The subgraph induced by W \ S, where W is the set of vertices and S a feedback vertex set of size at most s, is a DAG, so this subgraph is the perfect candidate from which to build a schedule without any spills. Inserting the remaining variables at good places in the list is what makes the problem complex again, but before that point we have at least reduced the number of variables, which is better than inserting all of them in this way; perhaps the rest of the variables have to be spilled at some point, but that is also less complex because we start from a spill-free schedule.
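The vertex gadget of this reduction can be written down directly. The sketch below only builds the data of the RCRISP instance (instructions, def/use sets, dependence edges and the bound t = 3n + 2s); the representation is invented for illustration and is not the authors' code.

    # Build the RCRISP instance of the reduction from a Feedback Vertex Set
    # instance (G = (W, E), s): three instructions w1, w2, w3 and one virtual
    # register r_w per vertex, dependence edges w1->w2, w1->w3, and v3->w2 for
    # every edge (v, w) of G. The cost bound is t = 3n + 2s.

    def fvs_to_rcrisp(vertices, edges, s):
        instructions, defs, uses, dep_edges = [], {}, {}, []
        for w in vertices:
            w1, w2, w3, r = f"{w}_1", f"{w}_2", f"{w}_3", f"r_{w}"
            instructions += [w1, w2, w3]
            defs[w1], uses[w1] = {r}, set()
            defs[w2], uses[w2] = set(), {r}
            defs[w3], uses[w3] = set(), {r}
            dep_edges += [(w1, w2), (w1, w3)]
        for v, w in edges:
            dep_edges.append((f"{v}_3", f"{w}_2"))
        t = 3 * len(vertices) + 2 * s
        return instructions, defs, uses, dep_edges, t

    # A 2-cycle a <-> b needs a feedback vertex set of size 1.
    print(fvs_to_rcrisp(["a", "b"], [("a", "b"), ("b", "a")], s=1)[-1])   # t = 8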
Approximation Instead of the Optimal Solution

A good way to handle NP-complete problems is not to search for the best solution but for a nearly optimal one. For an instance I we write Opt(I) for the best solution of I and S(I) for a sub-optimal solution, and require S(I)/Opt(I) < r for the minimization problem; in this context r is the approximation ratio. In practice a constant-factor approximation algorithm is often also a good average-case algorithm. Hence CRISP is easier to approximate than to solve exactly with one of the earlier algorithms.

However, the approximation approach also has drawbacks. The standard method is to model the problem and then check by graph coloring whether the graph is K-colorable: the vertices are the live ranges of the variables, and there is an edge between two nodes if the live ranges overlap at any execution point. The coloring is NP-hard, and if the graph cannot be colored, variables have to be spilled until it becomes colorable. If an approximation ratio such as R/(R-1) is used to find the maximum R-colorable subgraph, accuracy becomes a problem: when the subgraph is formed from the original graph G, vertices and edges are removed until the approximation ratio is met, but the NP-hard problem may still be present in the subgraph, because not every graph structure becomes easier when some nodes and edges are left out. A further problem concerns the minimization of the spilled variables: when both are optimized together, the situation is similar to searching for the optimal solution, and under approximation there is not necessarily an equivalence between the graph coloring and the minimization of the spills, so an approximation may yield a bad solution with respect to the spilled variables. Finally, the graph coloring itself remains NP-complete and there is no reliable way to avoid it in all cases, which keeps the problem, whether solved optimally or approximately, NP-complete. This is the moment to approximate CRISP itself. Finding an approximation within O(log(n)) is NP-hard, and the best method presented has a ratio known to be Ω(log(n)), so it seems impossible to find a satisfying algorithm that beats the complexity of the current one. Under these circumstances it is interesting to take a closer look at the relationship between RCRISP and FVS. In the reduction, the FVS instance has a solution of value s if and only if the RCRISP instance has a solution with t = 3n + 2s, where n is the number of vertices of the FVS instance. Since a feedback vertex set is a subset of the vertices of the original graph, the cost of the RCRISP instance lies in the range 3n <= t <= 5n, so any greedy solution for RCRISP is already an α-approximation with α <= 5/3. This basic idea can be generalized to arbitrary RCRISP and hence CRISP instances. A general heuristic for CRISP supplements the previous effort. It is based on the following optimal algorithm for spill code generation with respect to the dependencies of the dependence graph of a given instruction schedule. The situation is similar to the one before, where a graph is reduced via the FVS, or in other words where vertices are removed one by one to obtain an R-colorable subgraph without any spills first. There is a greedy algorithm that works in linear time and finds an optimal solution with the help of a linear scan: it deletes every node that would cause a spill, so that a spill-free subgraph remains; therefore it is optimal with respect to spill minimization. To summarize the heuristic: a combined rank function orders the instructions into an increasing list without considering the register bound, and afterwards the variables to spill are chosen in another walk through the schedule. The worst-case complexity of these two operations is polynomial.
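The linear-scan idea can be sketched as a single pass over the value ranges. This is my own illustration of the described greedy pass, not the authors' implementation; in particular, spilling the range that ends last is an assumed heuristic, not something the paper fixes.

    # Linear-scan style greedy spill pass: walk the value ranges in order of
    # their start point; whenever more than k ranges are active at once, spill
    # one of them. Spilling the range that ends last is an assumption here.

    def greedy_spill(ranges, k):
        """ranges: list of (name, start, end); returns the set of spilled ranges."""
        active, spilled = [], set()
        for name, start, end in sorted(ranges, key=lambda r: r[1]):
            active = [r for r in active if r[2] >= start]    # expire ended ranges
            active.append((name, start, end))
            if len(active) > k:
                victim = max(active, key=lambda r: r[2])     # furthest end point
                active.remove(victim)
                spilled.add(victim)
        return spilled

    ranges = [("a", 0, 6), ("b", 1, 3), ("c", 2, 4), ("d", 5, 7)]
    print(greedy_spill(ranges, k=2))   # {('a', 0, 6)}: a overlaps both b and c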
The Experiment

To conclude the solution part, an experiment is presented. In the two tables of Figure 9, the combined heuristic is compared with, first, instruction scheduling followed by register allocation and, second, register allocation followed by instruction scheduling, and a cost ratio is reported that indicates whether the combined heuristic has lower cost than the phase-ordered method. For the tests, randomly generated DAGs were used on a two-stage pipeline with four, eight and 16 available registers.

Figure 9: (from [4])

The results of the first table show that the combined solution needed fewer spills in all cases and that its cost ratio is 16%-21% better than the phase-ordered method. The phase-ordered solution, however, shows the better makespan: its program schedule needs 13%-14% less time. In the second table the results are similar except for the makespan: here the combined method is better with four and eight registers, but with 16 registers the phase-ordered solution wins by 4%. Otherwise the picture is as above; the phase-ordered method performs 4%-21% worse in the cost ratio and 19%-35% worse in the number of spills. A new observation is that instruction scheduling seems to be the more important phase: while register allocation first is worse in nearly all measurements, instruction scheduling first produces a more time-efficient schedule. Accordingly, instruction scheduling matters more, and the combined solution is the best of the three.

4 Results

The core register allocation problem with SSA is NP-complete, but the single steps can improve the allocation in some special cases. We analyzed the SSA form and looked at its single steps. The φ-functions are an important tool for the SSA transformation: they choose for every variable the correct name and thereby the corresponding value, and they are useful for the transformation. Their role in generating the SSA form is that every variable is defined and used only once, and these copy instructions cannot raise the complexity of the allocation problem. The path via the interval graph and then to the post-SSA form brings more structure and also more overhead, but in the lucky case the outcome is a chordal graph that can be colored in polynomial time, which makes the transformation useful. Nevertheless there is a graph coloring problem at the end of this transformation, which is NP-complete, so the problem as a whole remains NP-complete. The step-wise observation of the whole allocation problem in the next part showed that it can be split into different phases which contribute different complexities to the overall problem. In combination with the last part we see that two parts of the allocation make it NP-complete: the graph coloring and the scheduling of the basic blocks. Coalescing and splitting are two good instructions to prevent a spill, which is the next improvement; here the graph structure is improved in a similar way as by the SSA method. Looking at the problem we also noticed that the hard case only arises when the number of registers is smaller than the number of variables that have to be allocated. While Chaitin et al. concentrated on the graph coloring, we concentrated on the improvements that can be made before the complex coloring, and with some small changes the coloring is not always NP-complete. This model also exhibits the critical edges that cause the register pressure. Critical edges occur in all models, but they do not always look the same: in the SSA approach they correspond to the set of spills, and in the combined register allocation and instruction scheduling they are what remains of the subgraph and is then spilled. Finally, CRISP also takes an important position when the improvements have to be made. The first result of this part was that the combined allocation beats the two separate phase orderings. This method, too, contains NP-completeness in its last step and cannot reduce the complexity of the problem; a better point, however, is that it produces a schedule for the instructions which is not always NP-complete to handle. In fact it is similar to the SSA approach in that it tries to reorder and build a better problem structure.

5 Conclusion

We looked at a number of different methods to improve the speed of register allocation and saw that every method works on a special set of cases and improves the solution there. In the end, however, it is not possible to reduce the complexity for an arbitrary input. With the help of these analyses we found out which parts of the problem instance keep register allocation an NP-complete problem.

References

[1] Florent Bouchez, Alain Darte, Christophe Guillon & Fabrice Rastello (2006): Register allocation: what does the NP-completeness proof of Chaitin et al. really prove? Or revisiting register allocation: why and how. In: LCPC 2006, Springer.

[2] Gregory J. Chaitin, Marc A. Auslander, Ashok K. Chandra, John Cocke, Martin E. Hopkins & Peter W. Markstein (1981): Register allocation via coloring. Computer Languages 6(1).

[3] Sebastian Hack, Daniel Grund & Gerhard Goos (2006): Register allocation for programs in SSA-form. In: Compiler Construction (CC 2006), Springer.

[4] Rajeev Motwani, Krishna V. Palem, Vivek Sarkar & Salem Reyen (1995): Combining register allocation and instruction scheduling. Courant Institute, New York University.

[5] Fernando Magno Quintao Pereira & Jens Palsberg (2006): Register allocation after classical SSA elimination is NP-complete. In: International Conference on Foundations of Software Science and Computation Structures, Springer.


More information

Lecture 24: More Reductions (1997) Steven Skiena. skiena

Lecture 24: More Reductions (1997) Steven Skiena.   skiena Lecture 24: More Reductions (1997) Steven Skiena Department of Computer Science State University of New York Stony Brook, NY 11794 4400 http://www.cs.sunysb.edu/ skiena Prove that subgraph isomorphism

More information

Maximal Independent Set

Maximal Independent Set Chapter 4 Maximal Independent Set In this chapter we present a first highlight of this course, a fast maximal independent set (MIS) algorithm. The algorithm is the first randomized algorithm that we study

More information

Faster parameterized algorithms for Minimum Fill-In

Faster parameterized algorithms for Minimum Fill-In Faster parameterized algorithms for Minimum Fill-In Hans L. Bodlaender Pinar Heggernes Yngve Villanger Technical Report UU-CS-2008-042 December 2008 Department of Information and Computing Sciences Utrecht

More information

Lecture outline. Graph coloring Examples Applications Algorithms

Lecture outline. Graph coloring Examples Applications Algorithms Lecture outline Graph coloring Examples Applications Algorithms Graph coloring Adjacent nodes must have different colors. How many colors do we need? Graph coloring Neighbors must have different colors

More information

Exercise set 2 Solutions

Exercise set 2 Solutions Exercise set 2 Solutions Let H and H be the two components of T e and let F E(T ) consist of the edges of T with one endpoint in V (H), the other in V (H ) Since T is connected, F Furthermore, since T

More information

Faster parameterized algorithms for Minimum Fill-In

Faster parameterized algorithms for Minimum Fill-In Faster parameterized algorithms for Minimum Fill-In Hans L. Bodlaender Pinar Heggernes Yngve Villanger Abstract We present two parameterized algorithms for the Minimum Fill-In problem, also known as Chordal

More information

NP and computational intractability. Kleinberg and Tardos, chapter 8

NP and computational intractability. Kleinberg and Tardos, chapter 8 NP and computational intractability Kleinberg and Tardos, chapter 8 1 Major Transition So far we have studied certain algorithmic patterns Greedy, Divide and conquer, Dynamic programming to develop efficient

More information

PACKING DIGRAPHS WITH DIRECTED CLOSED TRAILS

PACKING DIGRAPHS WITH DIRECTED CLOSED TRAILS PACKING DIGRAPHS WITH DIRECTED CLOSED TRAILS PAUL BALISTER Abstract It has been shown [Balister, 2001] that if n is odd and m 1,, m t are integers with m i 3 and t i=1 m i = E(K n) then K n can be decomposed

More information

CHAPTER 2. Graphs. 1. Introduction to Graphs and Graph Isomorphism

CHAPTER 2. Graphs. 1. Introduction to Graphs and Graph Isomorphism CHAPTER 2 Graphs 1. Introduction to Graphs and Graph Isomorphism 1.1. The Graph Menagerie. Definition 1.1.1. A simple graph G = (V, E) consists of a set V of vertices and a set E of edges, represented

More information

Chapter 3 Trees. Theorem A graph T is a tree if, and only if, every two distinct vertices of T are joined by a unique path.

Chapter 3 Trees. Theorem A graph T is a tree if, and only if, every two distinct vertices of T are joined by a unique path. Chapter 3 Trees Section 3. Fundamental Properties of Trees Suppose your city is planning to construct a rapid rail system. They want to construct the most economical system possible that will meet the

More information

April 15, 2015 More Register Allocation 1. Problem Register values may change across procedure calls The allocator must be sensitive to this

April 15, 2015 More Register Allocation 1. Problem Register values may change across procedure calls The allocator must be sensitive to this More Register Allocation Last time Register allocation Global allocation via graph coloring Today More register allocation Procedure calls Interprocedural April 15, 2015 More Register Allocation 1 Register

More information

Approximation Algorithms

Approximation Algorithms Approximation Algorithms Given an NP-hard problem, what should be done? Theory says you're unlikely to find a poly-time algorithm. Must sacrifice one of three desired features. Solve problem to optimality.

More information

Scan Scheduling Specification and Analysis

Scan Scheduling Specification and Analysis Scan Scheduling Specification and Analysis Bruno Dutertre System Design Laboratory SRI International Menlo Park, CA 94025 May 24, 2000 This work was partially funded by DARPA/AFRL under BAE System subcontract

More information

Greedy algorithms is another useful way for solving optimization problems.

Greedy algorithms is another useful way for solving optimization problems. Greedy Algorithms Greedy algorithms is another useful way for solving optimization problems. Optimization Problems For the given input, we are seeking solutions that must satisfy certain conditions. These

More information

Empirical analysis of procedures that schedule unit length jobs subject to precedence constraints forming in- and out-stars

Empirical analysis of procedures that schedule unit length jobs subject to precedence constraints forming in- and out-stars Empirical analysis of procedures that schedule unit length jobs subject to precedence constraints forming in- and out-stars Samuel Tigistu Feder * Abstract This paper addresses the problem of scheduling

More information

Complexity Results on Graphs with Few Cliques

Complexity Results on Graphs with Few Cliques Discrete Mathematics and Theoretical Computer Science DMTCS vol. 9, 2007, 127 136 Complexity Results on Graphs with Few Cliques Bill Rosgen 1 and Lorna Stewart 2 1 Institute for Quantum Computing and School

More information

K-SATURATED GRAPHS CLIFFORD BRIDGES, AMANDA DAY, SHELLY MANBER

K-SATURATED GRAPHS CLIFFORD BRIDGES, AMANDA DAY, SHELLY MANBER K-SATURATED GRAPHS CLIFFORD BRIDGES, AMANDA DAY, SHELLY MANBER Abstract. We present some properties of k-existentially-closed and k-saturated graphs, as steps toward discovering a 4-saturated graph. We

More information

Last week: Breadth-First Search

Last week: Breadth-First Search 1 Last week: Breadth-First Search Set L i = [] for i=1,,n L 0 = {w}, where w is the start node For i = 0,, n-1: For u in L i : For each v which is a neighbor of u: If v isn t yet visited: - mark v as visited,

More information

In this lecture we discuss the complexity of approximation problems, and show how to prove they are NP-hard.

In this lecture we discuss the complexity of approximation problems, and show how to prove they are NP-hard. In this lecture we discuss the complexity of approximation problems, and show how to prove they are NP-hard. 1 We will show how one can prove such results and then apply this technique to some approximation

More information

Chapter 18 out of 37 from Discrete Mathematics for Neophytes: Number Theory, Probability, Algorithms, and Other Stuff by J. M. Cargal.

Chapter 18 out of 37 from Discrete Mathematics for Neophytes: Number Theory, Probability, Algorithms, and Other Stuff by J. M. Cargal. Chapter 8 out of 7 from Discrete Mathematics for Neophytes: Number Theory, Probability, Algorithms, and Other Stuff by J. M. Cargal 8 Matrices Definitions and Basic Operations Matrix algebra is also known

More information

11/22/2016. Chapter 9 Graph Algorithms. Introduction. Definitions. Definitions. Definitions. Definitions

11/22/2016. Chapter 9 Graph Algorithms. Introduction. Definitions. Definitions. Definitions. Definitions Introduction Chapter 9 Graph Algorithms graph theory useful in practice represent many real-life problems can be slow if not careful with data structures 2 Definitions an undirected graph G = (V, E) is

More information

Dr. Amotz Bar-Noy s Compendium of Algorithms Problems. Problems, Hints, and Solutions

Dr. Amotz Bar-Noy s Compendium of Algorithms Problems. Problems, Hints, and Solutions Dr. Amotz Bar-Noy s Compendium of Algorithms Problems Problems, Hints, and Solutions Chapter 1 Searching and Sorting Problems 1 1.1 Array with One Missing 1.1.1 Problem Let A = A[1],..., A[n] be an array

More information

Chapter 9 Graph Algorithms

Chapter 9 Graph Algorithms Chapter 9 Graph Algorithms 2 Introduction graph theory useful in practice represent many real-life problems can be slow if not careful with data structures 3 Definitions an undirected graph G = (V, E)

More information

Scribe from 2014/2015: Jessica Su, Hieu Pham Date: October 6, 2016 Editor: Jimmy Wu

Scribe from 2014/2015: Jessica Su, Hieu Pham Date: October 6, 2016 Editor: Jimmy Wu CS 267 Lecture 3 Shortest paths, graph diameter Scribe from 2014/2015: Jessica Su, Hieu Pham Date: October 6, 2016 Editor: Jimmy Wu Today we will talk about algorithms for finding shortest paths in a graph.

More information

3 No-Wait Job Shops with Variable Processing Times

3 No-Wait Job Shops with Variable Processing Times 3 No-Wait Job Shops with Variable Processing Times In this chapter we assume that, on top of the classical no-wait job shop setting, we are given a set of processing times for each operation. We may select

More information

Combinatorics Prof. Dr. L. Sunil Chandran Department of Computer Science and Automation Indian Institute of Science, Bangalore

Combinatorics Prof. Dr. L. Sunil Chandran Department of Computer Science and Automation Indian Institute of Science, Bangalore Combinatorics Prof. Dr. L. Sunil Chandran Department of Computer Science and Automation Indian Institute of Science, Bangalore Lecture - 5 Elementary concepts and basic counting principles So, welcome

More information

Packing Edge-Disjoint Triangles in Given Graphs

Packing Edge-Disjoint Triangles in Given Graphs Electronic Colloquium on Computational Complexity, Report No. 13 (01) Packing Edge-Disjoint Triangles in Given Graphs Tomás Feder Carlos Subi Abstract Given a graph G, we consider the problem of finding

More information

11. APPROXIMATION ALGORITHMS

11. APPROXIMATION ALGORITHMS 11. APPROXIMATION ALGORITHMS load balancing center selection pricing method: vertex cover LP rounding: vertex cover generalized load balancing knapsack problem Lecture slides by Kevin Wayne Copyright 2005

More information

1 Better Approximation of the Traveling Salesman

1 Better Approximation of the Traveling Salesman Stanford University CS261: Optimization Handout 4 Luca Trevisan January 13, 2011 Lecture 4 In which we describe a 1.5-approximate algorithm for the Metric TSP, we introduce the Set Cover problem, observe

More information

Register Allocation. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice.

Register Allocation. Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP 412 at Rice. Register Allocation Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP at Rice. Copyright 00, Keith D. Cooper & Linda Torczon, all rights reserved.

More information

Algorithms Exam TIN093/DIT600

Algorithms Exam TIN093/DIT600 Algorithms Exam TIN093/DIT600 Course: Algorithms Course code: TIN 093 (CTH), DIT 600 (GU) Date, time: 24th October 2015, 14:00 18:00 Building: M Responsible teacher: Peter Damaschke, Tel. 5405. Examiner:

More information

Solutions for the Exam 6 January 2014

Solutions for the Exam 6 January 2014 Mastermath and LNMB Course: Discrete Optimization Solutions for the Exam 6 January 2014 Utrecht University, Educatorium, 13:30 16:30 The examination lasts 3 hours. Grading will be done before January 20,

More information

Graph Algorithms. Tours in Graphs. Graph Algorithms

Graph Algorithms. Tours in Graphs. Graph Algorithms Graph Algorithms Tours in Graphs Graph Algorithms Special Paths and Cycles in Graphs Euler Path: A path that traverses all the edges of the graph exactly once. Euler Cycle: A cycle that traverses all the

More information