CS 406/534 Compiler Construction Instruction Selection and Global Register Allocation


1 CS 406/534 Compiler Construction
Instruction Selection and Global Register Allocation
Prof. Li Xu, Dept. of Computer Science, UMass Lowell, Fall 2004
Part of the course lecture notes are based on Prof. Keith Cooper, Prof. Ken Kennedy and Dr. Linda Torczon's teaching materials at Rice University. All rights reserved.

2 What We Did Last Time
- Instruction scheduling beyond basic blocks
- Code shape and code generation
CS406/534 Fall 2004, Prof. Li Xu

3 Today's Goals
- Continue our tour of the back end
- Automated instruction selection through pattern matching
  - Peephole matching
  - Tree-pattern matching
- Global register allocation

4 The Problem
Modern computers (still) have many ways to do things
Consider a register-to-register copy in ILOC
- The obvious operation is i2i r_i ⇒ r_j
- Many others exist
    addI r_i,0 ⇒ r_j     multI r_i,1 ⇒ r_j     orI r_i,0 ⇒ r_j
    subI r_i,0 ⇒ r_j     divI r_i,1 ⇒ r_j      xorI r_i,0 ⇒ r_j
    lshiftI r_i,0 ⇒ r_j  rshiftI r_i,0 ⇒ r_j   and others
- A human would ignore all of these
- An algorithm must look at all of them & find a low-cost encoding
  - Take context into account (busy functional unit?)

5 The Goal
Want to automate generation of instruction selectors: retargetable compilers
[Figure: Front End, Middle End, and Back End on a shared Infrastructure; a Machine description feeds a Back-end Generator, which produces Tables for the Pattern Matching Engine.]
- Description-based retargeting
- Machine description should also help with scheduling & allocation

6 Automated Pattern Matching
Tree-oriented IR suggests pattern matching on trees
- Tree patterns as input, matcher as output
- Each pattern maps to a target-machine instruction sequence
- Use dynamic programming or bottom-up rewrite systems
Linear IR suggests using some sort of string matching
- Strings as input, matcher as output
- Each string maps to a target-machine instruction sequence
- Peephole matching

7 Peephole Matching
Basic idea
- Compiler can discover local improvements locally
  - Look at a small set of adjacent operations
  - Move a "peephole" over the code & search for improvement
- Store followed by load
    Original code             Improved code
    storeAI r1 ⇒ r0,8         storeAI r1 ⇒ r0,8
    loadAI  r0,8 ⇒ r15        i2i     r1 ⇒ r15

8 Peephole Matching (continued)
- Simple algebraic identities
    Original code             Improved code
    addI r2,0 ⇒ r7            mult r4,r2 ⇒ r10
    mult r4,r7 ⇒ r10

9 Peephole Matching (continued)
- Jump to a jump
    Original code             Improved code
    jumpI → L10               L10: jumpI → L11
    L10: jumpI → L11
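
The three window patterns on slides 7 to 9 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Op` tuple and the `peephole` function are hypothetical names invented here, not part of ILOC or of any real compiler, and the algebraic-identity case assumes without checking that the `addI` result is not used again later.

```python
from typing import NamedTuple, Optional

class Op(NamedTuple):
    # Hypothetical encoding of one ILOC operation (not a real compiler API).
    opcode: str
    srcs: tuple                  # operands read: registers, constants, labels
    dsts: tuple                  # operands written, or the store address
    label: Optional[str] = None  # label attached to this operation, if any

def peephole(code):
    """Slide a 2-op window over the code until no pattern applies."""
    code = list(code)
    changed = True
    while changed:
        changed = False
        for i in range(len(code) - 1):
            a, b = code[i], code[i + 1]
            if b.label is None and a.opcode == "storeAI" \
                    and b.opcode == "loadAI" and a.dsts == b.srcs:
                # Store followed by load of the same address: copy instead.
                code[i + 1] = Op("i2i", a.srcs, b.dsts)
            elif b.label is None and a.opcode == "addI" and a.srcs[1] == 0 \
                    and a.dsts[0] in b.srcs:
                # x + 0 feeding the next op. Assumes the addI result is
                # not used again later; a real pass must check liveness.
                srcs = tuple(a.srcs[0] if s == a.dsts[0] else s for s in b.srcs)
                code[i:i + 2] = [Op(b.opcode, srcs, b.dsts, a.label)]
            elif a.label is None and a.opcode == "jumpI" \
                    and b.label == a.srcs[0]:
                # Jump whose target is the very next operation.
                del code[i]
            else:
                continue
            changed = True
            break
    return code
```

Restarting the scan after every rewrite keeps the sketch short; a production peephole pass would instead keep the window position and only back up as far as needed.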

10 Peephole Matching
Implementing it
- Early systems used limited set of hand-coded patterns
- Window size ensured quick processing
Modern peephole instruction selectors
- Break problem into three tasks (Davidson)
    IR → Expander → LLIR → Simplifier → LLIR → Matcher → ASM
         (IR → LLIR)       (LLIR → LLIR)      (LLIR → ASM)
- Apply symbolic interpretation & simplification systematically

11 Peephole Matching
Expander (IR → LLIR)
- Turns IR code into a low-level IR (LLIR) such as RTL
- Operation-by-operation, template-driven rewriting
- LLIR form includes all direct effects (e.g., setting cc)
- Significant, albeit constant, expansion of size

12 Peephole Matching
Simplifier (LLIR → LLIR)
- Looks at LLIR through a window and rewrites it
- Uses forward substitution, algebraic simplification, local constant propagation, and dead-effect elimination
- Performs local optimization within the window
- This is the heart of the peephole system
  - Benefit of peephole optimization shows up in this step

13 Peephole Matching
Matcher (LLIR → ASM)
- Compares simplified LLIR against a library of patterns
- Picks low-cost pattern that captures effects
- Must preserve LLIR effects, may add new ones (e.g., set cc)
- Generates the assembly code output

14 Example
Original IR Code
    OP    Arg1  Arg2  Result
    mult  2     y     t1
    sub   x     t1    w
Expand into LLIR Code
    r10 ← 2
    r11 ← @y
    r12 ← r0 + r11
    r13 ← MEM(r12)
    r14 ← r10 × r13
    r15 ← @x
    r16 ← r0 + r15
    r17 ← MEM(r16)
    r18 ← r17 - r14
    r19 ← @w
    r20 ← r0 + r19
    MEM(r20) ← r18

15 Example
Simplify: the twelve-operation LLIR code from the Expander becomes
    r13 ← MEM(r0 + @y)
    r14 ← 2 × r13
    r17 ← MEM(r0 + @x)
    r18 ← r17 - r14
    MEM(r0 + @w) ← r18

16 Example
Match: simplified LLIR Code into ILOC Code
    LLIR Code                 ILOC Code
    r13 ← MEM(r0 + @y)        loadAI  r0,@y ⇒ r13
    r14 ← 2 × r13             multI   r13,2 ⇒ r14
    r17 ← MEM(r0 + @x)        loadAI  r0,@x ⇒ r17
    r18 ← r17 - r14           sub     r17,r14 ⇒ r18
    MEM(r0 + @w) ← r18        storeAI r18 ⇒ r0,@w
- Introduced all memory operations & temporary names
- Turned out pretty good code

17-27 Steps of the Simplifier (3-op window)
The window moves down the twelve-operation LLIR code (r10 ← 2 through MEM(r20) ← r18). Window contents at each slide:
    Slide  Window
    17     r10 ← 2;  r11 ← @y;  r12 ← r0 + r11
    18     r10 ← 2;  r12 ← r0 + @y;  r13 ← MEM(r12)
    19     r10 ← 2;  r13 ← MEM(r0 + @y);  r14 ← r10 × r13
    20     r13 ← MEM(r0 + @y);  r14 ← 2 × r13;  r15 ← @x
    21     r14 ← 2 × r13;  r15 ← @x;  r16 ← r0 + r15
           (the 1st op, r13 ← MEM(r0 + @y), has rolled out of the window)
    22     r14 ← 2 × r13;  r16 ← r0 + @x;  r17 ← MEM(r16)
    23     r14 ← 2 × r13;  r17 ← MEM(r0 + @x);  r18 ← r17 - r14
    24     r17 ← MEM(r0 + @x);  r18 ← r17 - r14;  r19 ← @w
    25     r18 ← r17 - r14;  r19 ← @w;  r20 ← r0 + r19
    26     r18 ← r17 - r14;  r20 ← r0 + @w;  MEM(r20) ← r18
    27     r18 ← r17 - r14;  MEM(r0 + @w) ← r18


29 Making It All Work
Details
- LLIR is largely machine independent (RTL)
- Target machine described as LLIR → ASM patterns
- Actual pattern matching
  - Use a hand-coded pattern matcher (GCC)
  - Turn patterns into grammar & use LR parser (VPO)
Several compilers use this technology
It seems to produce good portable instruction selectors
- Key strength appears to be late low-level optimization

30 Tree-Pattern Matching
Many compilers use tree-structured IRs
- Abstract syntax trees generated in the parser
- Trees or DAGs for expressions
These systems might well use trees to represent the target ISA
- Match these pattern trees against IR trees

31 The Concept
Low-level AST for w ← x - 2 × y
[Figure: tree rooted at GETS. Left subtree: +(VAL ARP, NUM 4). Right subtree: -(REF(REF(+(VAL ARP, NUM -26))), ×(NUM 2, REF(+(LAB @G, NUM 12)))).]
- ARP: r_arp
- NUM: constant
- LAB: ASM label
- w: at ARP+4
- x: at ARP-26
- y: at @G+12

32 Notation
To describe these trees, we need a concise notation
- Left subtree of GETS: +(VAL1,NUM1)
- Reference to x: REF(REF(+(VAL2,NUM2)))
- Product: *(NUM3,REF(+(LAB1,NUM3)))
- Right subtree of GETS: -(REF(REF(+(VAL2,NUM2))), *(NUM3,REF(+(LAB1,NUM3))))
- Whole tree: GETS(+(VAL1,NUM1), -(REF(REF(+(VAL2,NUM2))), *(NUM3,REF(+(LAB1,NUM3)))))
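
As a small illustration of this linear notation, the following Python sketch builds the example tree and prints it in prefix form. The `Node` class and `prefix` function are hypothetical helpers introduced here for illustration; the leaf names are copied from the slide.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    # Hypothetical AST node: an operator name plus ordered children.
    op: str
    kids: list = field(default_factory=list)

def prefix(n: Node) -> str:
    """Render a tree in the linear prefix form used on the slide."""
    if not n.kids:
        return n.op
    return f"{n.op}({','.join(prefix(k) for k in n.kids)})"

# The low-level AST for w <- x - 2 * y, built bottom-up.
w_addr = Node("+", [Node("VAL1"), Node("NUM1")])
x_val = Node("REF", [Node("REF", [Node("+", [Node("VAL2"), Node("NUM2")])])])
y_val = Node("REF", [Node("+", [Node("LAB1"), Node("NUM3")])])
tree = Node("GETS", [w_addr, Node("-", [x_val, Node("*", [Node("NUM3"), y_val])])])
print(prefix(tree))
```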

33 Tree-Pattern Matching
Goal is to "tile" the AST with operation trees
- A tiling is a collection of <ast, op> pairs
  - ast is a node in the AST
  - op is an operation tree
  - <ast, op> means that op could implement the subtree at ast
- A tiling "implements" an AST if it covers every node in the AST and the overlap between any two trees is limited to a single node
  - <ast, op> ∈ tiling means ast is also covered by a leaf in another operation tree in the tiling, unless it is the root
  - Where two operation trees meet, they must be compatible (expect the value in the same location)

34 Tiling the Tree
[Figure: the low-level AST for w ← x - 2 × y, partitioned into six tiles labeled Tile 1 through Tile 6.]
- Each tile corresponds to a sequence of operations
- Emitting those operations in an appropriate order implements the tree

35 Tiling the Tree
Given a tiled tree
- Postorder treewalk, with node-dependent order for children
  - Right child of GETS before its left child
  - Might impose "most demanding first" rule (Sethi)
- Emit code sequence for tiles, in order
- Tie boundaries together with register names
  - Tile 6 uses registers produced by tiles 1 & 5
  - Tile 6 emits "store r_tile5 ⇒ r_tile1"
  - Can incorporate a "real" allocator or can use "NextRegister++"

36 Tiling the Tree
Finding the matches to tile the tree
- Compiler writer connects operation trees to AST subtrees
  - Provide a set of rewrite rules
  - Encode tree syntax, in linear form
  - Associate each rule with a code template

37 Rewrite rules: AST into ILOC
    Rule                                     Cost  Template
     1  Goal   → Assign                      0
     2  Assign → GETS(Reg1,Reg2)             1     store   r2 ⇒ r1
     3  Assign → GETS(+(Reg1,Reg2),Reg3)     1     storeAO r3 ⇒ r1,r2
     4  Assign → GETS(+(Reg1,NUM2),Reg3)     1     storeAI r3 ⇒ r1,n2
     5  Assign → GETS(+(NUM1,Reg2),Reg3)     1     storeAI r3 ⇒ r2,n1
     6  Reg    → LAB1                        1     loadI   l1 ⇒ r_new
     7  Reg    → VAL1                        0
     8  Reg    → NUM1                        1     loadI   n1 ⇒ r_new
     9  Reg    → REF(Reg1)                   1     load    r1 ⇒ r_new
    10  Reg    → REF(+(Reg1,Reg2))           1     loadAO  r1,r2 ⇒ r_new
    11  Reg    → REF(+(Reg1,NUM2))           1     loadAI  r1,n2 ⇒ r_new
    12  Reg    → REF(+(NUM1,Reg2))           1     loadAI  r2,n1 ⇒ r_new

38 Rewrite rules: AST into ILOC
    Rule                          Cost  Template
    13  Reg → +(Reg1,Reg2)        1     add   r1,r2 ⇒ r_new
    14  Reg → +(Reg1,NUM2)        1     addI  r1,n2 ⇒ r_new
    15  Reg → +(NUM1,Reg2)        1     addI  r2,n1 ⇒ r_new
    16  Reg → -(Reg1,Reg2)        1     sub   r1,r2 ⇒ r_new
    17  Reg → -(Reg1,NUM2)        1     subI  r1,n2 ⇒ r_new
    18  Reg → -(NUM1,Reg2)        1     rsubI r2,n1 ⇒ r_new
    19  Reg → ×(Reg1,Reg2)        1     mult  r1,r2 ⇒ r_new
    20  Reg → ×(Reg1,NUM2)        1     multI r1,n2 ⇒ r_new
    21  Reg → ×(NUM1,Reg2)        1     multI r2,n1 ⇒ r_new
A real set of rules would cover more than signed integers

39-44 Tiling the Tree
Need an algorithm to match AST subtrees with the rules
Consider tile 3 in our example: a REF node over a + node, whose children are a LAB leaf (lower left) and NUM 12 (bottom right)
What rules match tile 3?
-  6: Reg → LAB1 tiles the lower left node
-  8: Reg → NUM1 tiles the bottom right node
- 13: Reg → +(Reg1,Reg2) tiles the + node
-  9: Reg → REF(Reg1) tiles the REF
We denote this match as <6,8,13,9>

45 Finding Matches
Many sequences match our subtree (the REF over + over a LAB leaf and NUM 12)
    Cost  Sequences
    2     6,11       8,12
    3     6,8,10     8,6,10     6,14,9     8,15,9
    4     6,8,13,9   8,6,13,9
In general, we want the low-cost sequence
- Each unit of cost is an operation (1 cycle)
- We should favor short sequences

46 Finding Matches
Low-cost matches: sequences with a cost of 2
    <6,11>    6: Reg → LAB1                  loadI  @G ⇒ r_i
             11: Reg → REF(+(Reg1,NUM2))     loadAI r_i,12 ⇒ r_j
    <8,12>    8: Reg → NUM1                  loadI  12 ⇒ r_i
             12: Reg → REF(+(NUM1,Reg2))     loadAI r_i,@G ⇒ r_j
These two are equivalent in cost
- 6,11 might be better, because @G may be longer than the immediate field

47 Tiling the Tree
    Tile(n)
      Label(n) ← Ø
      if n has two children then
        Tile(left child of n)
        Tile(right child of n)
        for each rule r that implements n
          if left(r) ∈ Label(left(n)) and right(r) ∈ Label(right(n))
            then Label(n) ← Label(n) ∪ {r}
      else if n has one child
        Tile(child of n)
        for each rule r that implements n
          if left(r) ∈ Label(child(n))
            then Label(n) ← Label(n) ∪ {r}
      else /* n is a leaf */
        Label(n) ← {all rules that implement n}
- Match binary nodes against binary rules
- Match unary nodes against unary rules
- Handle leaves with lookup in rule table

48 Tiling the Tree
The Tile(n) algorithm
- Finds all matches in rule set
- Labels node n with that set
- Can keep lowest-cost match at each point
- Leads to a notion of local optimality: lowest cost at each point
- Spends its time in the two matching loops
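
The remark that the matcher "can keep lowest-cost match at each point" can be illustrated with a small Python sketch. It is not the set-labeling Tile(n) routine itself: it is a simplified recursive variant, written for this note, that tries every rule at a node and keeps only the cheapest cover. The tuple encodings of rules and trees are assumptions made here, only rules 6 and 8 through 15 are included, and LAB and NUM leaves are matched strictly, so the sketch does not find the <8,12> sequence in which the label is used as the immediate operand.

```python
# Subset of the slides' rules as (rule number, pattern, cost). A pattern is
# an operator tuple such as ("+", "Reg", "NUM") or a leaf symbol.
RULES = [
    (6, "LAB", 1), (8, "NUM", 1),
    (9, ("REF", "Reg"), 1),
    (10, ("REF", ("+", "Reg", "Reg")), 1),
    (11, ("REF", ("+", "Reg", "NUM")), 1),
    (12, ("REF", ("+", "NUM", "Reg")), 1),
    (13, ("+", "Reg", "Reg"), 1),
    (14, ("+", "Reg", "NUM"), 1),
    (15, ("+", "NUM", "Reg"), 1),
]

def match(pat, node):
    """Return the subtrees bound to Reg leaves of pat, or None if no fit."""
    if pat == "Reg":
        return [node]
    if isinstance(pat, str):                  # terminal leaf: NUM or LAB
        return [] if node[0] == pat else None
    kids = [c for c in node[1:] if isinstance(c, tuple)]
    if node[0] != pat[0] or len(kids) != len(pat) - 1:
        return None
    bound = []
    for p, k in zip(pat[1:], kids):
        sub = match(p, k)
        if sub is None:
            return None
        bound += sub
    return bound

def best(node):
    """Cheapest (cost, rule sequence) reducing node to Reg, or None."""
    choice = None
    for num, pat, cost in RULES:
        bound = match(pat, node)
        if bound is None:
            continue
        subs = [best(s) for s in bound]
        if any(s is None for s in subs):
            continue
        total = cost + sum(c for c, _ in subs)
        seq = [r for _, rs in subs for r in rs] + [num]
        if choice is None or total < choice[0]:
            choice = (total, seq)
    return choice

# Tile 3 from the example: REF(+(LAB @G, NUM 12)).
tile3 = ("REF", ("+", ("LAB", "@G"), ("NUM", 12)))
print(best(tile3))
```

For tile 3 the sketch settles on the cost-2 sequence 6,11. A production matcher would memoize the per-node results or precompute them into tables instead of recomputing subtrees.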

49 Summary
- Tree patterns represent AST and ASM
- Can use matching algorithms to find low-cost tiling of AST
- Can turn a tiling into code using templates for matched rules
- Techniques (& tools) exist to do this efficiently

50 Global Register Allocation
Taking a global approach
- Abandon the distinction between local & global
- Make systematic use of registers or memory
- Adopt a general scheme to approximate a good allocation
Graph coloring paradigm (Lavrov & (later) Chaitin)
1 Build an interference graph G_I for the procedure
  - Computing live ranges in the global scope
  - Computing overlap of live ranges (interference)
2 Construct a k-coloring of the interference graph
  - Minimal coloring is NP-Complete
  - Spilling and splitting live ranges if necessary
3 Map colors onto physical registers

51 Web-based Live Ranges
Starting point: def-use chains (DU chains)
- Connect each definition to all reachable uses
Join defs and uses into the same web
- A def and all its reachable uses must be in the same web
- All defs that reach the same use must be in the same web
- Use a union-find algorithm
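
A minimal Python sketch of the union-find step, assuming the DU chains are already available as a dictionary from each definition site to the use sites it reaches. The function name `build_webs` and the site names in the test data are hypothetical, and all sites are assumed to belong to one variable.

```python
def build_webs(du_chains):
    """Group def and use sites into webs.

    du_chains maps a definition site to the use sites it reaches.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]     # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    for d, uses in du_chains.items():
        find(d)                               # a def with no uses is its own web
        for u in uses:
            union(d, u)                       # def and reachable use share a web

    webs = {}
    for site in list(parent):
        webs.setdefault(find(site), set()).add(site)
    return sorted(sorted(w) for w in webs.values())
```

Two definitions that reach a common use end up in one web because both are unioned with that use; nothing else has to be checked.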

52 Example
[Figure: control-flow graph containing defs and uses of x and y, grouped into four webs labeled l1, l2, l3, and l4.]

53 Interference
Two live ranges interfere if they overlap (have a nonempty intersection)
- Interference captures the conflicts for storage allocation
- If two live ranges interfere, their values must be stored in different registers or memory locations
- If two live ranges do not interfere, their values can be stored in the same register or memory location

54 Example
[Figure: the same control-flow graph with webs l1, l2, l3, and l4, shown next to the interference graph on the nodes l1, l2, l3, and l4.]

55 Graph Coloring (A Background Digression)
The problem
- A graph G is said to be k-colorable iff the nodes can be labeled with integers 1 … k so that no edge in G connects two nodes with the same label
Examples
[Figure: a 2-colorable graph and a 3-colorable graph.]
Each color can be mapped to a distinct physical register

56 Interference Graph
The interference graph, G_I
- Nodes in G_I represent values, or live ranges
- Edges in G_I represent individual interferences
  - For x, y ∈ G_I, the edge <x,y> is in G_I iff x and y interfere
A k-coloring of G_I can be mapped into an allocation to k registers

57 Building the Interference Graph
To build the interference graph
1 Discover live ranges
  > Modern compilers use SSA form
2 Compute LIVE sets for each block
  > Use data flow analysis
3 Iterate over each block
  > Track the current LIVE set
  > At each operation, add appropriate edges & update LIVE
    - Edge from result to each value in LIVE
    - Remove result from LIVE
    - Edge from each operand to each value in LIVE
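
Step 3 can be sketched for a single block as follows. This Python fragment assumes the common backward-walk formulation, in which the operands of each operation are added to LIVE after the result has been removed, so that later (earlier in program order) definitions pick up their edges. The function name, the `(result, operands)` encoding, and the graph representation are assumptions made for illustration, and the usual special case for copy operations is left out.

```python
def add_block_edges(ops, live_out, graph):
    """Walk one block bottom-up, adding interference edges to graph.

    ops:      list of (result, operands) pairs naming live ranges;
              result may be None for operations that define nothing.
    live_out: live ranges that are live at the end of the block.
    graph:    dict mapping live range -> set of neighbors (updated in place).
    Returns the set of live ranges live on entry to the block.
    """
    live = set(live_out)
    for result, operands in reversed(ops):
        if result is not None:
            graph.setdefault(result, set())
            for lr in live:
                if lr != result:
                    graph[result].add(lr)             # result interferes with
                    graph.setdefault(lr, set()).add(result)  # every live value
            live.discard(result)                      # not live above its def
        for lr in operands:
            graph.setdefault(lr, set())
            live.add(lr)                              # operands are live above
    return live
```

For the block `a = ...; b = ...; c = a + b; d = c + a` with only `d` live on exit, the walk records the edges a-b and a-c and nothing else.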

58 Observation on Coloring for Register Allocation
Suppose we have k registers: look for a k-coloring
- Any vertex n that has fewer than k neighbors in the interference graph (n° < k, where n° is the degree of n) can always be colored!
  - Pick any color not used by its neighbors: there must be one

59 Chaitin's Algorithm
1. While there are vertices with < k neighbors in G_I
   > Pick any vertex n such that n° < k and put it on the stack
   > Remove that vertex and all edges incident to it from G_I
     - This will lower the degree of n's neighbors
2. If G_I is non-empty (all vertices have k or more neighbors) then:
   > Pick a vertex n (using some heuristic) and spill the live range associated with n
   > Remove vertex n from G_I, along with all edges incident to it, and put it on the stack
   > If this causes some vertex in G_I to have fewer than k neighbors, then go to step 1; otherwise, repeat step 2
3. If no spill, successively pop vertices off the stack and color them in the lowest color not used by some neighbor; otherwise, insert spill code, recompute G_I, and start from step 1
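
One round of these three steps can be sketched in Python. The sketch makes several assumptions that are not in the slides: the graph is a dictionary from vertex to neighbor set, the spill heuristic is simply "highest remaining degree", colors are numbered from 0, and spilled vertices are kept in a separate list rather than on the stack (which is equivalent here, since nothing is colored when a spill occurred). Inserting spill code and rebuilding the graph is left to the caller.

```python
def chaitin_color(graph, k):
    """One simplify/select round in the style of Chaitin's algorithm.

    Returns (coloring, spilled). coloring is None when a spill was needed;
    the caller would then insert spill code, rebuild the graph, and retry.
    """
    work = {n: set(adj) for n, adj in graph.items()}
    stack, spilled = [], []
    while work:
        low = [n for n in work if len(work[n]) < k]
        if low:
            n = min(low)                  # any low-degree vertex would do
            stack.append(n)
        else:
            # Stand-in heuristic (an assumption): highest remaining degree.
            n = max(work, key=lambda v: (len(work[v]), v))
            spilled.append(n)
        for m in work.pop(n):             # remove n and its incident edges
            work[m].discard(n)
    if spilled:
        return None, spilled
    coloring = {}
    while stack:
        n = stack.pop()
        used = {coloring[m] for m in graph[n] if m in coloring}
        # A free color always exists: n had < k neighbors when removed.
        coloring[n] = min(c for c in range(k) if c not in used)
    return coloring, []
```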

60-69 Chaitin's Algorithm in Practice
[Figure sequence: an interference graph is reduced for 3 registers; vertices are pushed onto the stack one at a time, then popped and assigned one of the colors 1, 2, or 3.]

70-71 Improvement in Coloring Scheme
Optimistic Coloring (Briggs, Cooper, Kennedy, and Torczon)
- Instead of stopping at the end when all vertices have at least k neighbors, put each on the stack according to some priority
- When you pop them off they may still color!
[Figure: example graph with 2 registers; the graph turns out to be 2-colorable.]

72 Chaitin-Briggs Algorithm
1. While there are vertices with < k neighbors in G_I
   > Pick any vertex n such that n° < k and put it on the stack
   > Remove that vertex and all edges incident to it from G_I
     - This may create vertices with fewer than k neighbors
2. If G_I is non-empty (all vertices have k or more neighbors) then:
   > Pick a vertex n (using some heuristic condition), push n on the stack, and remove n from G_I, along with all edges incident to it
   > If this causes some vertex in G_I to have fewer than k neighbors, then go to step 1; otherwise, repeat step 2
3. Successively pop vertices off the stack and color them in the lowest color not used by some neighbor
   > If some vertex cannot be colored, then pick an uncolored vertex to spill, spill it, and restart at step 1
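
The difference from Chaitin's scheme shows up on a four-vertex cycle with k = 2: every vertex has two neighbors, so nothing can be removed in step 1, yet the graph is 2-colorable. The Python sketch below illustrates the optimistic variant under the same assumptions as before (dictionary graph, "highest remaining degree" as a stand-in heuristic, colors numbered from 0); `briggs_color` is a hypothetical name, and actually spilling and restarting is left to the caller.

```python
def briggs_color(graph, k):
    """Optimistic simplify/select. Returns (coloring, uncolored)."""
    work = {n: set(adj) for n, adj in graph.items()}
    stack = []
    while work:
        low = [n for n in work if len(work[n]) < k]
        # With no low-degree vertex, push a candidate anyway (do not spill yet).
        n = min(low) if low else max(work, key=lambda v: (len(work[v]), v))
        stack.append(n)
        for m in work.pop(n):
            work[m].discard(n)
    coloring, uncolored = {}, []
    while stack:
        n = stack.pop()
        used = {coloring[m] for m in graph[n] if m in coloring}
        free = [c for c in range(k) if c not in used]
        if free:
            coloring[n] = free[0]         # lowest color not used by a neighbor
        else:
            uncolored.append(n)           # only these need to be spilled
    return coloring, uncolored

# Four-vertex cycle: every vertex has 2 neighbors, yet 2 colors suffice.
square = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
print(briggs_color(square, 2))
```

On a triangle with k = 2 the same routine leaves one vertex uncolored, which is the case where a spill is genuinely required.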

73-74 Chaitin Allocator
Phases, in order (Chaitin's algorithm):
- renumber: Build SSA, build live ranges, rename
- build: Build the interference graph
- coalesce: Fold unneeded copies
    LR_x → LR_y, and <LR_x,LR_y> ∉ G_I ⇒ combine LR_x & LR_y
- spill costs: Estimate cost for spilling each live range
- simplify: Remove nodes from the graph
    while N is non-empty
      if ∃ n with n° < k then push n onto stack
      else pick n to spill, push n onto stack
      remove n from G_I
- select: While stack is non-empty, pop n, insert n into G_I, & try to color it
- spill: Spill uncolored definitions & uses
[Figure note on slide 74: "Watch this edge", pointing at one edge of the phase diagram.]

75 Chaitin-Briggs Allocator
Phases, in order (Briggs' algorithm, 1989):
- renumber: Build SSA, build live ranges, rename
- build: Build the interference graph
- coalesce: Fold unneeded copies
    LR_x → LR_y, and <LR_x,LR_y> ∉ G_I ⇒ combine LR_x & LR_y
- spill costs: Estimate cost for spilling each live range
- simplify: Remove nodes from the graph
    while N is non-empty
      if ∃ n with n° < k then push n onto stack
      else pick n to spill, push n onto stack
      remove n from G_I
- select: While stack is non-empty, pop n, insert n into G_I, & try to color it
- spill: Spill uncolored definitions & uses

76 Picking a Spill Candidate
When ∀ n ∈ G_I, n° ≥ k, simplify must pick a spill candidate
Chaitin's heuristic
- Minimize spill cost ÷ current degree
- If LR_x has a negative spill cost, spill it pre-emptively
  - Cheaper to spill it than to keep it in a register
- If LR_x has an infinite spill cost, it cannot be spilled
  - No value dies between its definition & its use
  - No more than k definitions since last value died (safety valve)
Spill cost is the weighted cost of the loads & stores needed to spill x
Bernstein et al. suggest repeating simplify, select, & spill with several different spill choice heuristics & keeping the best
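
A short Python sketch of the selection rule, assuming spill costs and current degrees have already been computed and are passed in as dictionaries. The function name and the data layout are hypothetical; infinite cost is represented with `math.inf`.

```python
import math

def pick_spill_candidate(costs, degree):
    """Pick a live range to spill using the cost / degree ratio.

    costs:  live range -> estimated spill cost (math.inf if unspillable).
    degree: live range -> current degree in the interference graph (> 0).
    Raises ValueError if every live range has infinite spill cost.
    """
    candidates = {lr: c for lr, c in costs.items() if not math.isinf(c)}
    negative = [lr for lr, c in candidates.items() if c < 0]
    if negative:
        # Negative cost: spilling is cheaper than keeping it in a register.
        return min(negative, key=lambda lr: candidates[lr])
    return min(candidates, key=lambda lr: candidates[lr] / degree[lr])
```

Dividing by the degree favors live ranges whose removal relieves the most interference, so a costlier live range can still be chosen when it has many neighbors.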

77 Other Improvements to Chaitin-Briggs
Spilling partial live ranges
- Bergner introduced interference region spilling
- Limits spilling to regions of high demand for registers
Splitting live ranges
- Simple idea: break up one or more live ranges
- Allocator can use different registers for distinct subranges
- Allocator can spill subranges independently (use 1 spill location)
Conservative coalescing
- Combining LR_x → LR_y to form LR_xy may increase register pressure
- Limit coalescing to the case where LR_xy° < k
- Iterative form tries to coalesce before spilling
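
The conservative test, exactly as stated on this slide (the merged live range must have degree below k), can be written as a small Python check. The function name and the dictionary graph representation are assumptions made for illustration.

```python
def can_coalesce(graph, x, y, k):
    """Conservative coalescing test as stated on the slide.

    graph maps each live range to its set of neighbors. The copy
    LR_x -> LR_y may be folded only if the two do not interfere and
    the combined live range would have fewer than k neighbors.
    """
    if y in graph[x]:
        return False                      # interfering ranges never coalesce
    merged = (graph[x] | graph[y]) - {x, y}
    return len(merged) < k
```

Shared neighbors are counted once, so coalescing two live ranges with the same neighbors does not raise the degree at all.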

78 Performance
Results are "pretty good"
- Simple procedures allocate without spills
- There is some room for improvement
  - Long blocks, regions of high pressure
  - Many implementation issues
Many people have looked at improving Chaitin-Briggs
- Better allocations
  - Better coloring
  - Softer coalescing
  - Better spilling
  - Spilling partial live ranges
- Better implementations
  - Faster graph construction
  - Faster coalescing

79 Rematerialization
Never-killed values can be rematerialized (rather than spilled)
- Operands are always available
- Computed in a single operation
- Cheaper to recompute than to store & reload (the classic spill)
Allocator must
- Discover & mark never-killed LRs
- Reflect rematerialization in spill costs
- Use all this knowledge to generate the right spills
Chaitin rematerialized LRs that were entirely never-killed
- We can do partial LRs

80 Bibliography
- Briggs, Cooper, & Torczon, "Improvements to Graph Coloring Register Allocation," ACM TOPLAS 16(3), May 1994.
- Bernstein, Goldin, Golumbic, Krawczyk, Mansour, Nahshon, & Pinter, "Spill Code Minimization Techniques for Optimizing Compilers," Proceedings of PLDI '89, SIGPLAN Notices 24(7), July 1989.
- George & Appel, "Iterated Register Coalescing," ACM TOPLAS 18(3), May 1996.
- Bergner, Dahl, Engebretsen, & O'Keefe, "Spill Code Minimization via Interference Region Spilling," Proceedings of PLDI '97, SIGPLAN Notices 32(6), June 1997.
- Cooper, Harvey, & Torczon, "How to Build an Interference Graph," Software: Practice and Experience 28(4), April 1998.
- Cooper & Simpson, "Live-range splitting in a graph coloring register allocator," Proceedings of the 1998 International Conference on Compiler Construction, LNCS 1381 (Springer), March/April 1998.

81 Summary
- Instruction selection
- Global register allocation

82 Next Class
Optimization
- Data-flow analysis
- SSA


More information

Global Register Allocation - Part 2

Global Register Allocation - Part 2 Global Register Allocation - Part 2 Y N Srikant Computer Science and Automation Indian Institute of Science Bangalore 560012 NPTEL Course on Compiler Design Outline Issues in Global Register Allocation

More information

Rematerialization. Graph Coloring Register Allocation. Some expressions are especially simple to recompute: Last Time

Rematerialization. Graph Coloring Register Allocation. Some expressions are especially simple to recompute: Last Time Graph Coloring Register Allocation Last Time Chaitin et al. Briggs et al. Today Finish Briggs et al. basics An improvement: rematerialization Rematerialization Some expressions are especially simple to

More information

Global Register Allocation

Global Register Allocation Global Register Allocation Y N Srikant Computer Science and Automation Indian Institute of Science Bangalore 560012 NPTEL Course on Compiler Design Outline n Issues in Global Register Allocation n The

More information

CS415 Compilers Register Allocation. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

CS415 Compilers Register Allocation. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University CS415 Compilers Register Allocation These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University Review: The Back End IR Instruction Selection IR Register

More information

Local Optimization: Value Numbering The Desert Island Optimization. Comp 412 COMP 412 FALL Chapter 8 in EaC2e. target code

Local Optimization: Value Numbering The Desert Island Optimization. Comp 412 COMP 412 FALL Chapter 8 in EaC2e. target code COMP 412 FALL 2017 Local Optimization: Value Numbering The Desert Island Optimization Comp 412 source code IR Front End Optimizer Back End IR target code Copyright 2017, Keith D. Cooper & Linda Torczon,

More information

CS415 Compilers. Intermediate Represeation & Code Generation

CS415 Compilers. Intermediate Represeation & Code Generation CS415 Compilers Intermediate Represeation & Code Generation These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University Review - Types of Intermediate Representations

More information

Intermediate Representations

Intermediate Representations COMP 506 Rice University Spring 2018 Intermediate Representations source code IR Front End Optimizer Back End IR target code Copyright 2018, Keith D. Cooper & Linda Torczon, all rights reserved. Students

More information

CSE P 501 Compilers. Register Allocation Hal Perkins Autumn /22/ Hal Perkins & UW CSE P-1

CSE P 501 Compilers. Register Allocation Hal Perkins Autumn /22/ Hal Perkins & UW CSE P-1 CSE P 501 Compilers Register Allocation Hal Perkins Autumn 2011 11/22/2011 2002-11 Hal Perkins & UW CSE P-1 Agenda Register allocation constraints Local methods Faster compile, slower code, but good enough

More information

Register Allocation 3/16/11. What a Smart Allocator Needs to Do. Global Register Allocation. Global Register Allocation. Outline.

Register Allocation 3/16/11. What a Smart Allocator Needs to Do. Global Register Allocation. Global Register Allocation. Outline. What a Smart Allocator Needs to Do Register Allocation Global Register Allocation Webs and Graph Coloring Node Splitting and Other Transformations Determine ranges for each variable can benefit from using

More information

Register allocation. Overview

Register allocation. Overview Register allocation Register allocation Overview Variables may be stored in the main memory or in registers. { Main memory is much slower than registers. { The number of registers is strictly limited.

More information

Code Shape II Expressions & Assignment

Code Shape II Expressions & Assignment Code Shape II Expressions & Assignment Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit permission to make

More information

Register Allocation (wrapup) & Code Scheduling. Constructing and Representing the Interference Graph. Adjacency List CS2210

Register Allocation (wrapup) & Code Scheduling. Constructing and Representing the Interference Graph. Adjacency List CS2210 Register Allocation (wrapup) & Code Scheduling CS2210 Lecture 22 Constructing and Representing the Interference Graph Construction alternatives: as side effect of live variables analysis (when variables

More information

Register Allocation. Register Allocation. Local Register Allocation. Live range. Register Allocation for Loops

Register Allocation. Register Allocation. Local Register Allocation. Live range. Register Allocation for Loops DF00100 Advanced Compiler Construction Register Allocation Register Allocation: Determines values (variables, temporaries, constants) to be kept when in registers Register Assignment: Determine in which

More information

k register IR Register Allocation IR Instruction Scheduling n), maybe O(n 2 ), but not O(2 n ) k register code

k register IR Register Allocation IR Instruction Scheduling n), maybe O(n 2 ), but not O(2 n ) k register code Register Allocation Part of the compiler s back end IR Instruction Selection m register IR Register Allocation k register IR Instruction Scheduling Machine code Errors Critical properties Produce correct

More information

CS 406/534 Compiler Construction Instruction Scheduling

CS 406/534 Compiler Construction Instruction Scheduling CS 406/534 Compiler Construction Instruction Scheduling Prof. Li Xu Dept. of Computer Science UMass Lowell Fall 2004 Part of the course lecture notes are based on Prof. Keith Cooper, Prof. Ken Kennedy

More information

CS415 Compilers. Instruction Scheduling and Lexical Analysis

CS415 Compilers. Instruction Scheduling and Lexical Analysis CS415 Compilers Instruction Scheduling and Lexical Analysis These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University Instruction Scheduling (Engineer

More information

Computing Inside The Parser Syntax-Directed Translation, II. Comp 412 COMP 412 FALL Chapter 4 in EaC2e. source code. IR IR target.

Computing Inside The Parser Syntax-Directed Translation, II. Comp 412 COMP 412 FALL Chapter 4 in EaC2e. source code. IR IR target. COMP 412 FALL 20167 Computing Inside The Parser Syntax-Directed Translation, II Comp 412 source code IR IR target Front End Optimizer Back End code Copyright 2017, Keith D. Cooper & Linda Torczon, all

More information

CS5363 Final Review. cs5363 1

CS5363 Final Review. cs5363 1 CS5363 Final Review cs5363 1 Programming language implementation Programming languages Tools for describing data and algorithms Instructing machines what to do Communicate between computers and programmers

More information

Register allocation. Register allocation: ffl have value in a register when used. ffl limited resources. ffl changes instruction choices

Register allocation. Register allocation: ffl have value in a register when used. ffl limited resources. ffl changes instruction choices Register allocation IR instruction selection register allocation machine code errors Register allocation: have value in a register when used limited resources changes instruction choices can move loads

More information

Local Register Allocation (critical content for Lab 2) Comp 412

Local Register Allocation (critical content for Lab 2) Comp 412 Updated After Tutorial COMP 412 FALL 2018 Local Register Allocation (critical content for Lab 2) Comp 412 source code IR Front End Optimizer Back End IR target code Copyright 2018, Keith D. Cooper & Linda

More information

Redundant Computation Elimination Optimizations. Redundancy Elimination. Value Numbering CS2210

Redundant Computation Elimination Optimizations. Redundancy Elimination. Value Numbering CS2210 Redundant Computation Elimination Optimizations CS2210 Lecture 20 Redundancy Elimination Several categories: Value Numbering local & global Common subexpression elimination (CSE) local & global Loop-invariant

More information

Outline. Register Allocation. Issues. Storing values between defs and uses. Issues. Issues P3 / 2006

Outline. Register Allocation. Issues. Storing values between defs and uses. Issues. Issues P3 / 2006 P3 / 2006 Register Allocation What is register allocation Spilling More Variations and Optimizations Kostis Sagonas 2 Spring 2006 Storing values between defs and uses Program computes with values value

More information

CSC D70: Compiler Optimization Register Allocation

CSC D70: Compiler Optimization Register Allocation CSC D70: Compiler Optimization Register Allocation Prof. Gennady Pekhimenko University of Toronto Winter 2018 The content of this lecture is adapted from the lectures of Todd Mowry and Phillip Gibbons

More information

Introduction to Optimization Local Value Numbering

Introduction to Optimization Local Value Numbering COMP 506 Rice University Spring 2018 Introduction to Optimization Local Value Numbering source IR IR target code Front End Optimizer Back End code Copyright 2018, Keith D. Cooper & Linda Torczon, all rights

More information

Register allocation. instruction selection. machine code. register allocation. errors

Register allocation. instruction selection. machine code. register allocation. errors Register allocation IR instruction selection register allocation machine code errors Register allocation: have value in a register when used limited resources changes instruction choices can move loads

More information

Lecture 21 CIS 341: COMPILERS

Lecture 21 CIS 341: COMPILERS Lecture 21 CIS 341: COMPILERS Announcements HW6: Analysis & Optimizations Alias analysis, constant propagation, dead code elimination, register allocation Available Soon Due: Wednesday, April 25 th Zdancewic

More information

Fall Compiler Principles Lecture 12: Register Allocation. Roman Manevich Ben-Gurion University

Fall Compiler Principles Lecture 12: Register Allocation. Roman Manevich Ben-Gurion University Fall 2014-2015 Compiler Principles Lecture 12: Register Allocation Roman Manevich Ben-Gurion University Syllabus Front End Intermediate Representation Optimizations Code Generation Scanning Lowering Local

More information

Global Register Allocation - Part 3

Global Register Allocation - Part 3 Global Register Allocation - Part 3 Y N Srikant Computer Science and Automation Indian Institute of Science Bangalore 560012 NPTEL Course on Compiler Design Outline Issues in Global Register Allocation

More information

Linear Scan Register Allocation. Kevin Millikin

Linear Scan Register Allocation. Kevin Millikin Linear Scan Register Allocation Kevin Millikin Register Allocation Register Allocation An important compiler optimization Compiler: unbounded # of virtual registers Processor: bounded (small) # of registers

More information

Computing Inside The Parser Syntax-Directed Translation, II. Comp 412

Computing Inside The Parser Syntax-Directed Translation, II. Comp 412 COMP 412 FALL 2018 Computing Inside The Parser Syntax-Directed Translation, II Comp 412 source code IR IR target Front End Optimizer Back End code Copyright 2018, Keith D. Cooper & Linda Torczon, all rights

More information

register allocation saves energy register allocation reduces memory accesses.

register allocation saves energy register allocation reduces memory accesses. Lesson 10 Register Allocation Full Compiler Structure Embedded systems need highly optimized code. This part of the course will focus on Back end code generation. Back end: generation of assembly instructions

More information

Introduction to Compiler

Introduction to Compiler Formal Languages and Compiler (CSE322) Introduction to Compiler Jungsik Choi chjs@khu.ac.kr 2018. 3. 8 Traditional Two-pass Compiler Source Front End Back End Compiler Target High level functions Recognize

More information

Register Allocation via Hierarchical Graph Coloring

Register Allocation via Hierarchical Graph Coloring Register Allocation via Hierarchical Graph Coloring by Qunyan Wu A THESIS Submitted in partial fulfillment of the requirements for the degree of MASTER OF SCIENCE IN COMPUTER SCIENCE MICHIGAN TECHNOLOGICAL

More information

CS 406/534 Compiler Construction Parsing Part I

CS 406/534 Compiler Construction Parsing Part I CS 406/534 Compiler Construction Parsing Part I Prof. Li Xu Dept. of Computer Science UMass Lowell Fall 2004 Part of the course lecture notes are based on Prof. Keith Cooper, Prof. Ken Kennedy and Dr.

More information

The View from 35,000 Feet

The View from 35,000 Feet The View from 35,000 Feet This lecture is taken directly from the Engineering a Compiler web site with only minor adaptations for EECS 6083 at University of Cincinnati Copyright 2003, Keith D. Cooper,

More information

Topic 12: Register Allocation

Topic 12: Register Allocation Topic 12: Register Allocation COS 320 Compiling Techniques Princeton University Spring 2016 Lennart Beringer 1 Structure of backend Register allocation assigns machine registers (finite supply!) to virtual

More information

Register Allocation. Preston Briggs Reservoir Labs

Register Allocation. Preston Briggs Reservoir Labs Register Allocation Preston Briggs Reservoir Labs An optimizing compiler SL FE IL optimizer IL BE SL A classical optimizing compiler (e.g., LLVM) with three parts and a nice separation of concerns: front

More information

CS415 Compilers. Lexical Analysis

CS415 Compilers. Lexical Analysis CS415 Compilers Lexical Analysis These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University Lecture 7 1 Announcements First project and second homework

More information

Code Shape Comp 412 COMP 412 FALL Chapters 4, 5, 6 & 7 in EaC2e. source code. IR IR target. code. Front End Optimizer Back End

Code Shape Comp 412 COMP 412 FALL Chapters 4, 5, 6 & 7 in EaC2e. source code. IR IR target. code. Front End Optimizer Back End COMP 412 FALL 2017 Code Shape Comp 412 source code IR IR target Front End Optimizer Back End code Copyright 2017, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at

More information

The C2 Register Allocator. Niclas Adlertz

The C2 Register Allocator. Niclas Adlertz The C2 Register Allocator Niclas Adlertz 1 1 Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated

More information

CS 406/534 Compiler Construction Instruction Scheduling beyond Basic Block and Code Generation

CS 406/534 Compiler Construction Instruction Scheduling beyond Basic Block and Code Generation CS 406/534 Compiler Construction Instruction Scheduling beyond Basic Block and Code Generation Prof. Li Xu Dept. of Computer Science UMass Lowell Fall 2004 Part of the course lecture notes are based on

More information

Implementing Control Flow Constructs Comp 412

Implementing Control Flow Constructs Comp 412 COMP 412 FALL 2018 Implementing Control Flow Constructs Comp 412 source code IR Front End Optimizer Back End IR target code Copyright 2018, Keith D. Cooper & Linda Torczon, all rights reserved. Students

More information

Combining Optimizations: Sparse Conditional Constant Propagation

Combining Optimizations: Sparse Conditional Constant Propagation Comp 512 Spring 2011 Combining Optimizations: Sparse Conditional Constant Propagation Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 512 at Rice University

More information

Just-In-Time Compilers & Runtime Optimizers

Just-In-Time Compilers & Runtime Optimizers COMP 412 FALL 2017 Just-In-Time Compilers & Runtime Optimizers Comp 412 source code IR Front End Optimizer Back End IR target code Copyright 2017, Keith D. Cooper & Linda Torczon, all rights reserved.

More information

The ILOC Simulator User Documentation

The ILOC Simulator User Documentation The ILOC Simulator User Documentation Comp 506, Spring 2017 The ILOC instruction set is taken from the book, Engineering A Compiler, published by the Morgan- Kaufmann imprint of Elsevier [1]. The simulator

More information

Live Range Splitting in a. the graph represent live ranges, or values. An edge between two nodes

Live Range Splitting in a. the graph represent live ranges, or values. An edge between two nodes Live Range Splitting in a Graph Coloring Register Allocator Keith D. Cooper 1 and L. Taylor Simpson 2 1 Rice University, Houston, Texas, USA 2 Trilogy Development Group, Austin, Texas, USA Abstract. Graph

More information

Low-Level Issues. Register Allocation. Last lecture! Liveness analysis! Register allocation. ! More register allocation. ! Instruction scheduling

Low-Level Issues. Register Allocation. Last lecture! Liveness analysis! Register allocation. ! More register allocation. ! Instruction scheduling Low-Level Issues Last lecture! Liveness analysis! Register allocation!today! More register allocation!later! Instruction scheduling CS553 Lecture Register Allocation I 1 Register Allocation!Problem! Assign

More information

Compiler Passes. Optimization. The Role of the Optimizer. Optimizations. The Optimizer (or Middle End) Traditional Three-pass Compiler

Compiler Passes. Optimization. The Role of the Optimizer. Optimizations. The Optimizer (or Middle End) Traditional Three-pass Compiler Compiler Passes Analysis of input program (front-end) character stream Lexical Analysis Synthesis of output program (back-end) Intermediate Code Generation Optimization Before and after generating machine

More information

Register Allocation & Liveness Analysis

Register Allocation & Liveness Analysis Department of Computer Sciences Register Allocation & Liveness Analysis CS502 Purdue University is an Equal Opportunity/Equal Access institution. Department of Computer Sciences In IR tree code generation,

More information

The ILOC Simulator User Documentation

The ILOC Simulator User Documentation The ILOC Simulator User Documentation COMP 412, Fall 2015 Documentation for Lab 1 The ILOC instruction set is taken from the book, Engineering A Compiler, published by the Elsevier Morgan-Kaufmann [1].

More information

CS 406/534 Compiler Construction Intermediate Representation and Procedure Abstraction

CS 406/534 Compiler Construction Intermediate Representation and Procedure Abstraction CS 406/534 Compiler Construction Intermediate Representation and Procedure Abstraction Prof. Li Xu Dept. of Computer Science UMass Lowell Fall 2004 Part of the course lecture notes are based on Prof. Keith

More information

Register Allocation. CS 502 Lecture 14 11/25/08

Register Allocation. CS 502 Lecture 14 11/25/08 Register Allocation CS 502 Lecture 14 11/25/08 Where we are... Reasonably low-level intermediate representation: sequence of simple instructions followed by a transfer of control. a representation of static

More information

Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit

Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit Intermediate Representations Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit permission to make copies

More information

EECS 583 Class 15 Register Allocation

EECS 583 Class 15 Register Allocation EECS 583 Class 15 Register Allocation University of Michigan November 2, 2011 Announcements + Reading Material Midterm exam: Monday, Nov 14?» Could also do Wednes Nov 9 (next week!) or Wednes Nov 16 (2

More information

Compilers and Code Optimization EDOARDO FUSELLA

Compilers and Code Optimization EDOARDO FUSELLA Compilers and Code Optimization EDOARDO FUSELLA Contents Data memory layout Instruction selection Register allocation Data memory layout Memory Hierarchy Capacity vs access speed Main memory Classes of

More information

Register Allocation in Just-in-Time Compilers: 15 Years of Linear Scan

Register Allocation in Just-in-Time Compilers: 15 Years of Linear Scan Register Allocation in Just-in-Time Compilers: 15 Years of Linear Scan Kevin Millikin Google 13 December 2013 Register Allocation Overview Register allocation Intermediate representation (IR): arbitrarily

More information

CS153: Compilers Lecture 15: Local Optimization

CS153: Compilers Lecture 15: Local Optimization CS153: Compilers Lecture 15: Local Optimization Stephen Chong https://www.seas.harvard.edu/courses/cs153 Announcements Project 4 out Due Thursday Oct 25 (2 days) Project 5 out Due Tuesday Nov 13 (21 days)

More information

Introduction to Optimization. CS434 Compiler Construction Joel Jones Department of Computer Science University of Alabama

Introduction to Optimization. CS434 Compiler Construction Joel Jones Department of Computer Science University of Alabama Introduction to Optimization CS434 Compiler Construction Joel Jones Department of Computer Science University of Alabama Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved.

More information

Lecture Compiler Backend

Lecture Compiler Backend Lecture 19-23 Compiler Backend Jianwen Zhu Electrical and Computer Engineering University of Toronto Jianwen Zhu 2009 - P. 1 Backend Tasks Instruction selection Map virtual instructions To machine instructions

More information

CS /534 Compiler Construction University of Massachusetts Lowell. NOTHING: A Language for Practice Implementation

CS /534 Compiler Construction University of Massachusetts Lowell. NOTHING: A Language for Practice Implementation CS 91.406/534 Compiler Construction University of Massachusetts Lowell Professor Li Xu Fall 2004 NOTHING: A Language for Practice Implementation 1 Introduction NOTHING is a programming language designed

More information

Faculty of Electrical Engineering, Mathematics, and Computer Science Delft University of Technology

Faculty of Electrical Engineering, Mathematics, and Computer Science Delft University of Technology Faculty of Electrical Engineering, Mathematics, and Computer Science Delft University of Technology exam Compiler Construction in4020 July 5, 2007 14.00-15.30 This exam (8 pages) consists of 60 True/False

More information

CS415 Compilers Overview of the Course. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

CS415 Compilers Overview of the Course. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University CS415 Compilers Overview of the Course These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University Critical Facts Welcome to CS415 Compilers Topics in the

More information

Compilers. Intermediate representations and code generation. Yannis Smaragdakis, U. Athens (original slides by Sam

Compilers. Intermediate representations and code generation. Yannis Smaragdakis, U. Athens (original slides by Sam Compilers Intermediate representations and code generation Yannis Smaragdakis, U. Athens (original slides by Sam Guyer@Tufts) Today Intermediate representations and code generation Scanner Parser Semantic

More information

The ILOC Simulator User Documentation

The ILOC Simulator User Documentation The ILOC Simulator User Documentation Spring 2015 Semester The ILOC instruction set is taken from the book, Engineering A Compiler, published by the Morgan- Kaufmann imprint of Elsevier [1]. The simulator

More information

A Bad Name. CS 2210: Optimization. Register Allocation. Optimization. Reaching Definitions. Dataflow Analyses 4/10/2013

A Bad Name. CS 2210: Optimization. Register Allocation. Optimization. Reaching Definitions. Dataflow Analyses 4/10/2013 A Bad Name Optimization is the process by which we turn a program into a better one, for some definition of better. CS 2210: Optimization This is impossible in the general case. For instance, a fully optimizing

More information

Data Structures and Algorithms in Compiler Optimization. Comp314 Lecture Dave Peixotto

Data Structures and Algorithms in Compiler Optimization. Comp314 Lecture Dave Peixotto Data Structures and Algorithms in Compiler Optimization Comp314 Lecture Dave Peixotto 1 What is a compiler Compilers translate between program representations Interpreters evaluate their input to produce

More information

The Software Stack: From Assembly Language to Machine Code

The Software Stack: From Assembly Language to Machine Code COMP 506 Rice University Spring 2018 The Software Stack: From Assembly Language to Machine Code source code IR Front End Optimizer Back End IR target code Somewhere Out Here Copyright 2018, Keith D. Cooper

More information

Register Allocation. Lecture 38

Register Allocation. Lecture 38 Register Allocation Lecture 38 (from notes by G. Necula and R. Bodik) 4/27/08 Prof. Hilfinger CS164 Lecture 38 1 Lecture Outline Memory Hierarchy Management Register Allocation Register interference graph

More information

A Parametric View of Retargetable. Register Allocation. January 24, Abstract

A Parametric View of Retargetable. Register Allocation. January 24, Abstract A Parametric View of Retargetable Register Allocation Kelvin S. Bryant Jon Mauney ksb@cs.umd.edu mauney@csc.ncsu.edu Dept. of Computer Science Dept. of Computer Science Univ. of Maryland, College Park,

More information

CS415 Compilers. Syntax Analysis. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

CS415 Compilers. Syntax Analysis. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University CS415 Compilers Syntax Analysis These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University Limits of Regular Languages Advantages of Regular Expressions

More information

Compiler Optimization Techniques

Compiler Optimization Techniques Compiler Optimization Techniques Department of Computer Science, Faculty of ICT February 5, 2014 Introduction Code optimisations usually involve the replacement (transformation) of code from one sequence

More information

Generating Code for Assignment Statements back to work. Comp 412 COMP 412 FALL Chapters 4, 6 & 7 in EaC2e. source code. IR IR target.

Generating Code for Assignment Statements back to work. Comp 412 COMP 412 FALL Chapters 4, 6 & 7 in EaC2e. source code. IR IR target. COMP 412 FALL 2017 Generating Code for Assignment Statements back to work Comp 412 source code IR IR target Front End Optimizer Back End code Copyright 2017, Keith D. Cooper & Linda Torczon, all rights

More information

Compiler construction in4303 lecture 9

Compiler construction in4303 lecture 9 Compiler construction in4303 lecture 9 Code generation Chapter 4.2.5, 4.2.7, 4.2.11 4.3 Overview Code generation for basic blocks instruction selection:[burs] register allocation: graph coloring instruction

More information

Lecture Outline. Intermediate code Intermediate Code & Local Optimizations. Local optimizations. Lecture 14. Next time: global optimizations

Lecture Outline. Intermediate code Intermediate Code & Local Optimizations. Local optimizations. Lecture 14. Next time: global optimizations Lecture Outline Intermediate code Intermediate Code & Local Optimizations Lecture 14 Local optimizations Next time: global optimizations Prof. Aiken CS 143 Lecture 14 1 Prof. Aiken CS 143 Lecture 14 2

More information

Middle End. Code Improvement (or Optimization) Analyzes IR and rewrites (or transforms) IR Primary goal is to reduce running time of the compiled code

Middle End. Code Improvement (or Optimization) Analyzes IR and rewrites (or transforms) IR Primary goal is to reduce running time of the compiled code Traditional Three-pass Compiler Source Code Front End IR Middle End IR Back End Machine code Errors Code Improvement (or Optimization) Analyzes IR and rewrites (or transforms) IR Primary goal is to reduce

More information

Register allocation. TDT4205 Lecture 31

Register allocation. TDT4205 Lecture 31 1 Register allocation TDT4205 Lecture 31 2 Variables vs. registers TAC has any number of variables Assembly code has to deal with memory and registers Compiler back end must decide how to juggle the contents

More information

Intermediate Representations

Intermediate Representations Most of the material in this lecture comes from Chapter 5 of EaC2 Intermediate Representations Note by Baris Aktemur: Our slides are adapted from Cooper and Torczon s slides that they prepared for COMP

More information

Instruction Scheduling Beyond Basic Blocks Extended Basic Blocks, Superblock Cloning, & Traces, with a quick introduction to Dominators.

Instruction Scheduling Beyond Basic Blocks Extended Basic Blocks, Superblock Cloning, & Traces, with a quick introduction to Dominators. Instruction Scheduling Beyond Basic Blocks Extended Basic Blocks, Superblock Cloning, & Traces, with a quick introduction to Dominators Comp 412 COMP 412 FALL 2016 source code IR Front End Optimizer Back

More information

Lab 3, Tutorial 1 Comp 412

Lab 3, Tutorial 1 Comp 412 COMP 412 FALL 2018 Lab 3, Tutorial 1 Comp 412 source code IR IR Front End Optimizer Back End target code Copyright 2018, Keith D. Cooper, Linda Torczon & Zoran Budimlić, all rights reserved. Students enrolled

More information

An Overview of Compilation

An Overview of Compilation An Overview of Compilation (www.cse.iitb.ac.in/ uday) Department of Computer Science and Engineering, Indian Institute of Technology, Bombay January 2014 cs306 Compilation Overview: Outline 1/18 Outline

More information

Intermediate Representations Part II

Intermediate Representations Part II Intermediate Representations Part II Types of Intermediate Representations Three major categories Structural Linear Hybrid Directed Acyclic Graph A directed acyclic graph (DAG) is an AST with a unique

More information

Variables vs. Registers/Memory. Simple Approach. Register Allocation. Interference Graph. Register Allocation Algorithm CS412/CS413

Variables vs. Registers/Memory. Simple Approach. Register Allocation. Interference Graph. Register Allocation Algorithm CS412/CS413 Variables vs. Registers/Memory CS412/CS413 Introduction to Compilers Tim Teitelbaum Lecture 33: Register Allocation 18 Apr 07 Difference between IR and assembly code: IR (and abstract assembly) manipulate

More information

Parsing II Top-down parsing. Comp 412

Parsing II Top-down parsing. Comp 412 COMP 412 FALL 2018 Parsing II Top-down parsing Comp 412 source code IR Front End Optimizer Back End IR target code Copyright 2018, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled

More information