Compilation Lecture 11a. Register Allocation Noam Rinetzky. Text book: Modern compiler implementation in C Andrew A.

Size: px
Start display at page:

Download "Compilation Lecture 11a. Register Allocation Noam Rinetzky. Text book: Modern compiler implementation in C Andrew A."

Transcription

1 Compilation Leture 11a Text book: Modern ompiler implementation in C Andrew A. Appel Register Alloation Noam Rinetzky 1

2 Registers Dediated memory loations that an be aessed quikly, an have omputations performed on them, and

3 Registers Dediated memory loations that an be aessed quikly, an have omputations performed on them, and Usages Operands of instrutions Store temporary results Can (should) be used as loop indexes due to frequent arithmeti operation Used to manage administrative info e.g., runtime stak

4 Register alloation Number of registers is limited Need to alloate them in a lever way Using registers intelligently is a ritial step in any ompiler A good register alloator an generate ode orders of magnitude better than a bad register alloator

5 Register Alloation Mahine-agnosti optimizations Assume unbounded number of registers Expression trees Basi bloks Mahine-dependent optimization K registers Some have speial purposes Control flow graphs (global register alloation)

6 Basi Compiler Phases Soure program (string) lexial analysis Tokens syntax analysis Abstrat syntax tree semanti analysis AST + Symbol table Translate Frame Intermediate representation Instrution seletion Assembly Global Register Alloation Fin. Assembly

7 Input: Global Register Alloation Sequene of mahine instrutions ( assembly ) Unbounded number of temporary variables aka symboli registers mahine desription Output # of registers, restritions Sequene of mahine instrutions using mahine registers (assembly) Mahine registers Some MOV instrutions removed

8 Global Register Alloation Input: Sequene of mahine ode instrutions (assembly) Output Unbounded number of temporary registers Sequene of mahine ode instrutions (assembly) Mahine registers Some MOVE instrutions removed Missing prologue and epilogue

9 Computing Liveness Information Dataflow analysis (previous leture)

10 Variable Liveness A statement x = y + z defines x uses y and z A variable x is live at a program point if its value (at this point) is used at a later point y = 42 z = 73 x = y + z print(x); x undef, y live, z undef x undef, y live, z live x is live, y dead, z dead x is dead, y dead, z dead (showing state after the statement)

11 Liveness Analysis b = a + 2 = b * b b = + 1 return b * a

12 Liveness Analysis b = a + 2 = b * b b = + 1 return b * a {b, a}

13 Liveness Analysis b = a + 2 = b * b b = + 1 return b * a {a, } {b, a}

14 Liveness Analysis b = a + 2 = b * b b = + 1 return b * a {b, a} {a, } {b, a}

15 Liveness Analysis b = a + 2 = b * b b = + 1 return b * a {a} {b, a} {a, } {b, a}

16 Interferene graph onstrution (Main idea) For every node n in CFG, we have out[n] Set of temporaries live out of n Two variables interfere if they appear in the same out[n] of any node n Cannot be alloated to the same register Conversely, if two variables do not interfere with eah other, they an be assigned the same register We say they have disjoint live ranges How to assign registers to variables?

17 Interferene graph Nodes of the graph = variables Edges onnet variables that interfere with one another Nodes will be assigned a olor orresponding to the register assigned to the variable Two olors an t be next to one another in the graph

18 Results of Liveness Analysis b = a + 2 = b * b b = + 1 return b * a {a} {b, a} {a, } {b, a}

19 Interferene graph olor register b = a + 2 = b * b b = + 1 return b * a {a} {b, a} {a, } {b, a} a eax ebx b

20 Colored graph olor register b = a + 2 = b * b b = + 1 return b * a {a} {b, a} {a, } {b, a} a eax ebx b

21 Graph oloring This problem is equivalent to grapholoring, whih is NP-hard if there are at least three registers No good polynomial-time algorithms (or even good approximations!) are known for this problem We have to be ontent with a heuristi that is good enough for RIGs that arise in pratie

22 Coloring by simplifiation [Kempe 1879] How to find a k-oloring of a graph Intuition: Suppose we are trying to k-olor a graph and find a node with fewer than k edges If we delete this node from the graph and olor what remains, we an find a olor for this node if we add it bak in Reason: fewer than k neighbors d some olor must be left over

23 Coloring by simplifiation [Kempe 1879] How to find a k-oloring of a graph Phase 1: Simplifiation Repeatedly simplify graph When a variable (i.e., graph node) is removed, push it on a stak Phase 2: Coloring Unwind stak and reonstrut the graph as follows: Pop variable from the stak Add it bak to the graph Color the node for that variable with a olor that it doesn t interfere with simplify olor

24 olor register eax ebx Coloring k=2 a b stak: d e

25 olor register eax ebx Coloring k=2 a b stak: d e

26 olor register eax ebx Coloring k=2 a b stak: d e e

27 olor register eax ebx Coloring k=2 a d b e stak: a e

28 olor register eax ebx Coloring k=2 a d b e stak: b a e

29 olor register eax ebx Coloring k=2 a d b e stak: d b a e

30 olor register eax ebx Coloring k=2 a d b e stak: b a e

31 olor register eax ebx Coloring k=2 a b stak: d e a e

32 olor register eax ebx Coloring k=2 a b stak: d e e

33 olor register eax ebx Coloring k=2 a b stak: d e

34 olor register eax ebx Coloring k=2 a b stak: d e

35 Failure of heuristi If the graph annot be olored, it will eventually be simplified to graph in whih every node has at least K neighbors Sometimes, the graph is still K-olorable! Finding a K-oloring in all situations is an NP-omplete problem We will have to approximate to make register alloators fast enough

36 olor register eax ebx Coloring k=2 a b stak: d e

37 olor register eax ebx Coloring k=2 Some graphs an t be olored in K olors: a b stak: d e

38 olor register eax ebx Coloring k=2 Some graphs an t be olored using K olors: a b d e a? e?

39 olor register eax ebx Coloring k=2 Some graphs an t be olored using K olors: a b d e a? e?

40 olor register eax ebx Coloring k=2 Some graphs an t be olored in K olors: a b stak: d e

41 olor register eax ebx Coloring k=2 Simplifiation gets stuk! a b stak: d e

42 Chaitin s algorithm Choose and remove an arbitrary node, marking it troublesome Use heuristis to hoose whih one When adding node bak in, it may be possible to find a valid olor Otherwise, we have to spill that node

43 Spilling Phase 3: spilling one all nodes have K or more neighbors, pik a node for spilling There are many heuristis that an be used to pik a node Try to pik node not used muh, not in inner loop Storage in ativation reord Remove it from graph We an now repeat phases 1-2 without this node Better approah rewrite ode to spill variable, reompute liveness information and try to olor again

44 olor register eax ebx Coloring k=2 Some graphs an t be olored in K olors: a d b e stak: e a d no olors left for e!

45 olor register eax ebx Coloring k=2 Some graphs an t be olored in K olors: a d b e stak: b e a d

46 olor register eax ebx Coloring k=2 Some graphs an t be olored in K olors: a d b e stak: e a d

47 olor register eax ebx Coloring k=2 Some graphs an t be olored in K olors: a b stak: a d d e

48 olor register eax ebx Coloring k=2 Some graphs an t be olored in K olors: a b stak: d d e

49 olor register eax ebx Coloring k=2 Some graphs an t be olored in K olors: a b stak: d e

50 Optimizing move instrutions Code generation produes a lot of extra mov instrutions mov t5, t9 If we an assign t5 and t9 to same register, we an get rid of the mov effetively, opy elimination at the register alloation level Idea: if t5 and t9 are not onneted in inferene graph, oalese them into a single variable; the move will be redundant Problem: oalesing nodes an make a graph un-olorable Conservative oalesing heuristi

51 Optimizing MOV instrutions Code generation produes a lot of extra mov instrutions mov t5, t9 If we an assign t5 and t9 to same register, we an get rid of the mov effetively, opy elimination at the register alloation level Idea: if t5 and t9 are not onneted in inferene graph, oalese them into a single variable; the move will be redundant Problem: oalesing nodes an make a graph un-olorable Conservative oalesing heuristi

52 Coalesing MOVs an be removed if the soure and the target share the same register The soure and the target of the move an be merged into a single node (unifying the sets of neighbors) May require more registers Conservative Coalesing Merge nodes only if the resulting node has fewer than K

53 Constrained Moves A instrution T S is onstrained if S and T interfere May happen after oalesing X Y Y Z X Y Z Constrained MOVs are not oalesed

54 Constrained Moves A instrution T S is onstrained if S and T interfere May happen after oalesing X Y X,Y Y Z Z Constrained MOVs are not oalesed

55 Constrained Moves A instrution T S is onstrained if S and T interfere May happen after oalesing X Y X,Y Y Z Z Constrained MOVs are not oalesed

56 Handling preolored nodes Some variables are pre-assigned to registers Eg: mul on x86/pentium uses eax; defines eax, edx Eg: all on x86/pentium Defines (trashes) aller-save registers eax, ex, edx To properly alloate registers, treat these register uses as speial temporary variables and enter into interferene graph as preolored nodes

57 Handling preolored nodes Simplify. Never remove a pre-olored node it already has a olor, i.e., it is a given register Coloring. One simplified graph is all olored nodes, add other nodes bak in and olor them using preolored nodes as starting point

58 Pre-Colored Nodes Some registers in the intermediate language are preolored: orrespond to real registers (stak-pointer, frame-pointer, parameters, ) Cannot be Simplified, Coalesed, or Spilled infinite degree Interfered with eah other But normal temporaries an be oalesed into pre-olored registers Register alloation is ompleted when all the nodes are pre-olored

59 Caller-Save and Callee-Save Registers allee-save-registers (MIPS 16-23) Saved by the allee when modified Values are automatially preserved aross alls aller-save-registers Saved by the aller when needed Values are not automatially preserved Usually the arhiteture defines aller-save and alleesave registers Separate ompilation Interoperability between ode produed by different ompilers/languages But ompilers an deide when to use aller/allee registers

60 Caller-Save vs. Callee-Save Registers int foo(int a) { int b=a+1; f1(); g1(b); return(b+2); } void bar (int y) { int x=y+1; f2(y); g2(2); }

61 Saving Callee-Save Registers enter: def(r 7 ) enter: def(r 7 ) t 231 r 7 exit: use(r 7 ) r 7 t 231 exit: use(r 7 )

62 Graph Coloring with Coalesing Build: Construt the interferene graph Simplify: Reursively remove non-mov nodes with less than K neighbors; Push removed nodes into stak Coalese: Conservatively merge unonstrained MOV related nodes with fewer than K heavy neighbors Freeze: Give-Up Coalesing on some MOV related nodes with low degree of interferene edges Speial ase: merged node has less than k neighbors All non-mov related nodes are heavy Potential-Spill: Spill some nodes and remove nodes Push removed nodes into stak Selet: Assign atual registers (from simplify/spill stak) Atual-Spill: Spill some potential spills and repeat the proess

63 A Complete Example Callee-saved registers Caller-saved registers

64 A Complete Example

65 A Complete Example Spill a & e Deg. of r1,ae,d < K r2 & b (Alt: ae+r1)

66 A Complete Example ae & r1 (Alt: ) pop freeze r 1 ae-d Simplify d (Alt: ae+r1) pop d d d

67 A Complete Example 1&r3, 2 &r3 a&e, b&r2

68 A Complete Example ae & r1 Simplify d Pop d d opt gen ode

69 Interproedural Alloation Alloate registers to multiple proedures Potential saving aller/allee save registers Parameter passing Return values But may inrease ompilation ost Funtion inline an help

70 Summary Two Register Alloation Methods Loal of every IR tree Simultaneous instrution seletion and register alloation Optimal (under ertain onditions) Global of every funtion Missing Applied after instrution seletion Performs well for mahines with many registers Can handle instrution level parallelism Interproedural alloation

71 The End

Compilation /17a Lecture 10. Register Allocation Noam Rinetzky

Compilation /17a Lecture 10. Register Allocation Noam Rinetzky Compilation 0368-3133 2016/17a Lecture 10 Register Allocation Noam Rinetzky 1 What is a Compiler? 2 Registers Dedicated memory locations that can be accessed quickly, can have computations performed on

More information

Fall Compiler Principles Lecture 12: Register Allocation. Roman Manevich Ben-Gurion University

Fall Compiler Principles Lecture 12: Register Allocation. Roman Manevich Ben-Gurion University Fall 2014-2015 Compiler Principles Lecture 12: Register Allocation Roman Manevich Ben-Gurion University Syllabus Front End Intermediate Representation Optimizations Code Generation Scanning Lowering Local

More information

Register Allocation III. Interference Graph Allocators. Computing the Interference Graph (in MiniJava compiler)

Register Allocation III. Interference Graph Allocators. Computing the Interference Graph (in MiniJava compiler) Register Alloation III Announements Reommen have interferene graph onstrution working by Monay Last leture Register alloation aross funtion alls Toay Register alloation options Interferene Graph Alloators

More information

Register Allocation III. Interference Graph Allocators. Coalescing. Granularity of Allocation (Renumber step in Briggs) Chaitin

Register Allocation III. Interference Graph Allocators. Coalescing. Granularity of Allocation (Renumber step in Briggs) Chaitin Register Alloation III Last time Register alloation aross funtion alls Toay Register alloation options Interferene Graph Alloators Chaitin Briggs CS553 Leture Register Alloation III 1 CS553 Leture Register

More information

Variables vs. Registers/Memory. Simple Approach. Register Allocation. Interference Graph. Register Allocation Algorithm CS412/CS413

Variables vs. Registers/Memory. Simple Approach. Register Allocation. Interference Graph. Register Allocation Algorithm CS412/CS413 Variables vs. Registers/Memory CS412/CS413 Introduction to Compilers Tim Teitelbaum Lecture 33: Register Allocation 18 Apr 07 Difference between IR and assembly code: IR (and abstract assembly) manipulate

More information

COMP 181. Prelude. Intermediate representations. Today. Types of IRs. High-level IR. Intermediate representations and code generation

COMP 181. Prelude. Intermediate representations. Today. Types of IRs. High-level IR. Intermediate representations and code generation Prelude COMP 181 Intermediate representations and ode generation November, 009 What is this devie? Large Hadron Collider What is a hadron? Subatomi partile made up of quarks bound by the strong fore What

More information

Department of Electrical Engineering and Computer Science MASSACHUSETTS INSTITUTE OF TECHNOLOGY Fall Test I Solutions

Department of Electrical Engineering and Computer Science MASSACHUSETTS INSTITUTE OF TECHNOLOGY Fall Test I Solutions Department of Eletrial Engineering and Computer iene MAACHUETT INTITUTE OF TECHNOLOGY 6.035 Fall 2016 Test I olutions 1 I Regular Expressions and Finite-tate Automata For Questions 1, 2, and 3, let the

More information

Outline: Software Design

Outline: Software Design Outline: Software Design. Goals History of software design ideas Design priniples Design methods Life belt or leg iron? (Budgen) Copyright Nany Leveson, Sept. 1999 A Little History... At first, struggling

More information

8 Instruction Selection

8 Instruction Selection 8 Instrution Seletion The IR ode instrutions were designed to do exatly one operation: load/store, add, subtrat, jump, et. The mahine instrutions of a real CPU often perform several of these primitive

More information

Register allocation. TDT4205 Lecture 31

Register allocation. TDT4205 Lecture 31 1 Register allocation TDT4205 Lecture 31 2 Variables vs. registers TAC has any number of variables Assembly code has to deal with memory and registers Compiler back end must decide how to juggle the contents

More information

Compilation /15a Lecture 7. Activation Records Noam Rinetzky

Compilation /15a Lecture 7. Activation Records Noam Rinetzky Compilation 0368-3133 2014/15a Lecture 7 Activation Records Noam Rinetzky 1 Code generation for procedure calls (+ a few words on the runtime system) 2 Code generation for procedure calls Compile time

More information

Liveness Analysis and Register Allocation. Xiao Jia May 3 rd, 2013

Liveness Analysis and Register Allocation. Xiao Jia May 3 rd, 2013 Liveness Analysis and Register Allocation Xiao Jia May 3 rd, 2013 1 Outline Control flow graph Liveness analysis Graph coloring Linear scan 2 Basic Block The code in a basic block has: one entry point,

More information

Reading Object Code. A Visible/Z Lesson

Reading Object Code. A Visible/Z Lesson Reading Objet Code A Visible/Z Lesson The Idea: When programming in a high-level language, we rarely have to think about the speifi ode that is generated for eah instrution by a ompiler. But as an assembly

More information

Reading Object Code. A Visible/Z Lesson

Reading Object Code. A Visible/Z Lesson Reading Objet Code A Visible/Z Lesson The Idea: When programming in a high-level language, we rarely have to think about the speifi ode that is generated for eah instrution by a ompiler. But as an assembly

More information

1 The Knuth-Morris-Pratt Algorithm

1 The Knuth-Morris-Pratt Algorithm 5-45/65: Design & Analysis of Algorithms September 26, 26 Leture #9: String Mathing last hanged: September 26, 27 There s an entire field dediated to solving problems on strings. The book Algorithms on

More information

Register allocation. Register allocation: ffl have value in a register when used. ffl limited resources. ffl changes instruction choices

Register allocation. Register allocation: ffl have value in a register when used. ffl limited resources. ffl changes instruction choices Register allocation IR instruction selection register allocation machine code errors Register allocation: have value in a register when used limited resources changes instruction choices can move loads

More information

Lecture 21 CIS 341: COMPILERS

Lecture 21 CIS 341: COMPILERS Lecture 21 CIS 341: COMPILERS Announcements HW6: Analysis & Optimizations Alias analysis, constant propagation, dead code elimination, register allocation Available Soon Due: Wednesday, April 25 th Zdancewic

More information

Announcements. Lecture Caching Issues for Multi-core Processors. Shared Vs. Private Caches for Small-scale Multi-core

Announcements. Lecture Caching Issues for Multi-core Processors. Shared Vs. Private Caches for Small-scale Multi-core Announements Your fous should be on the lass projet now Leture 17: Cahing Issues for Multi-ore Proessors This week: status update and meeting A short presentation on: projet desription (problem, importane,

More information

Algorithms, Mechanisms and Procedures for the Computer-aided Project Generation System

Algorithms, Mechanisms and Procedures for the Computer-aided Project Generation System Algorithms, Mehanisms and Proedures for the Computer-aided Projet Generation System Anton O. Butko 1*, Aleksandr P. Briukhovetskii 2, Dmitry E. Grigoriev 2# and Konstantin S. Kalashnikov 3 1 Department

More information

Runtime Support for OOLs Part II Comp 412

Runtime Support for OOLs Part II Comp 412 COMP 412 FALL 2017 Runtime Support for OOLs Part II Comp 412 soure IR Front End Optimizer Bak End IR target Copright 2017, Keith D. Cooper & Linda Torzon, all rights reserved. Students enrolled in Comp

More information

CS:APP2e Web Aside ASM:X87: X87-Based Support for Floating Point

CS:APP2e Web Aside ASM:X87: X87-Based Support for Floating Point CS:APP2e Web Aside ASM:X87: X87-Based Support for Floating Point Randal E. Bryant David R. O Hallaron June 5, 2012 Notie The material in this doument is supplementary material to the book Computer Systems,

More information

XML Data Streams. XML Stream Processing. XML Stream Processing. Yanlei Diao. University of Massachusetts Amherst

XML Data Streams. XML Stream Processing. XML Stream Processing. Yanlei Diao. University of Massachusetts Amherst XML Stream Proessing Yanlei Diao University of Massahusetts Amherst XML Data Streams XML is the wire format for data exhanged online. Purhase orders http://www.oasis-open.org/ommittees/t_home.php?wg_abbrev=ubl

More information

Data Structures in Java

Data Structures in Java Data Strutures in Java Leture 8: Trees and Tree Traversals. 10/5/2015 Daniel Bauer 1 Trees in Computer Siene A lot of data omes in a hierarhial/nested struture. Mathematial expressions. Program struture.

More information

Register allocation. CS Compiler Design. Liveness analysis. Register allocation. Liveness analysis and Register allocation. V.

Register allocation. CS Compiler Design. Liveness analysis. Register allocation. Liveness analysis and Register allocation. V. Register allocation CS3300 - Compiler Design Liveness analysis and Register allocation V. Krishna Nandivada IIT Madras Copyright c 2014 by Antony L. Hosking. Permission to make digital or hard copies of

More information

Topic 12: Register Allocation

Topic 12: Register Allocation Topic 12: Register Allocation COS 320 Compiling Techniques Princeton University Spring 2016 Lennart Beringer 1 Structure of backend Register allocation assigns machine registers (finite supply!) to virtual

More information

Parametric Abstract Domains for Shape Analysis

Parametric Abstract Domains for Shape Analysis Parametri Abstrat Domains for Shape Analysis Xavier RIVAL (INRIA & Éole Normale Supérieure) Joint work with Bor-Yuh Evan CHANG (University of Maryland U University of Colorado) and George NECULA (University

More information

Compilers and Code Optimization EDOARDO FUSELLA

Compilers and Code Optimization EDOARDO FUSELLA Compilers and Code Optimization EDOARDO FUSELLA Contents Data memory layout Instruction selection Register allocation Data memory layout Memory Hierarchy Capacity vs access speed Main memory Classes of

More information

Algorithms for External Memory Lecture 6 Graph Algorithms - Weighted List Ranking

Algorithms for External Memory Lecture 6 Graph Algorithms - Weighted List Ranking Algorithms for External Memory Leture 6 Graph Algorithms - Weighted List Ranking Leturer: Nodari Sithinava Sribe: Andi Hellmund, Simon Ohsenreither 1 Introdution & Motivation After talking about I/O-effiient

More information

Register allocation. instruction selection. machine code. register allocation. errors

Register allocation. instruction selection. machine code. register allocation. errors Register allocation IR instruction selection register allocation machine code errors Register allocation: have value in a register when used limited resources changes instruction choices can move loads

More information

Total 100

Total 100 CS331 SOLUTION Problem # Points 1 10 2 15 3 25 4 20 5 15 6 15 Total 100 1. ssume you are dealing with a ompiler for a Java-like language. For eah of the following errors, irle whih phase would normally

More information

This fact makes it difficult to evaluate the cost function to be minimized

This fact makes it difficult to evaluate the cost function to be minimized RSOURC LLOCTION N SSINMNT In the resoure alloation step the amount of resoures required to exeute the different types of proesses is determined. We will refer to the time interval during whih a proess

More information

Compiler Architecture

Compiler Architecture Code Generation 1 Compiler Architecture Source language Scanner (lexical analysis) Tokens Parser (syntax analysis) Syntactic structure Semantic Analysis (IC generator) Intermediate Language Code Optimizer

More information

Lecture Notes on Register Allocation

Lecture Notes on Register Allocation Lecture Notes on Register Allocation 15-411: Compiler Design Frank Pfenning Lecture 3 September 1, 2009 1 Introduction In this lecture we discuss register allocation, which is one of the last steps in

More information

splitting tehniques that partition live ranges have been proposed to solve both the spilling problem[5][8] and the assignment problem[8][9]. The parti

splitting tehniques that partition live ranges have been proposed to solve both the spilling problem[5][8] and the assignment problem[8][9]. The parti Load/Store Range Analysis for Global Register Alloation Priyadarshan Kolte and Mary Jean Harrold Department of Computer Siene Clemson University Abstrat Live range splitting tehniques divide the live ranges

More information

Background/Review on Numbers and Computers (lecture)

Background/Review on Numbers and Computers (lecture) Bakground/Review on Numbers and Computers (leture) ICS312 Mahine-Level and Systems Programming Henri Casanova (henri@hawaii.edu) Numbers and Computers Throughout this ourse we will use binary and hexadeimal

More information

Winter Compiler Construction T10 IR part 3 + Activation records. Today. LIR language

Winter Compiler Construction T10 IR part 3 + Activation records. Today. LIR language Winter 2006-2007 Compiler Construction T10 IR part 3 + Activation records Mooly Sagiv and Roman Manevich School of Computer Science Tel-Aviv University Today ic IC Language Lexical Analysis Syntax Analysis

More information

Recursion examples: Problem 2. (More) Recursion and Lists. Tail recursion. Recursion examples: Problem 2. Recursion examples: Problem 3

Recursion examples: Problem 2. (More) Recursion and Lists. Tail recursion. Recursion examples: Problem 2. Recursion examples: Problem 3 Reursion eamples: Problem 2 (More) Reursion and s Reursive funtion to reverse a string publi String revstring(string str) { if(str.equals( )) return str; return revstring(str.substring(1, str.length()))

More information

x86 Assembly Crash Course Don Porter

x86 Assembly Crash Course Don Porter x86 Assembly Crash Course Don Porter Registers ò Only variables available in assembly ò General Purpose Registers: ò EAX, EBX, ECX, EDX (32 bit) ò Can be addressed by 8 and 16 bit subsets AL AH AX EAX

More information

System-Level Parallelism and Throughput Optimization in Designing Reconfigurable Computing Applications

System-Level Parallelism and Throughput Optimization in Designing Reconfigurable Computing Applications System-Level Parallelism and hroughput Optimization in Designing Reonfigurable Computing Appliations Esam El-Araby 1, Mohamed aher 1, Kris Gaj 2, arek El-Ghazawi 1, David Caliga 3, and Nikitas Alexandridis

More information

Compiler Passes. Optimization. The Role of the Optimizer. Optimizations. The Optimizer (or Middle End) Traditional Three-pass Compiler

Compiler Passes. Optimization. The Role of the Optimizer. Optimizations. The Optimizer (or Middle End) Traditional Three-pass Compiler Compiler Passes Analysis of input program (front-end) character stream Lexical Analysis Synthesis of output program (back-end) Intermediate Code Generation Optimization Before and after generating machine

More information

WORKSHOP 20 CREATING PCL FUNCTIONS

WORKSHOP 20 CREATING PCL FUNCTIONS WORKSHOP 20 CREATING PCL FUNCTIONS WS20-1 WS20-2 Problem Desription This exerise involves reating two PCL funtions that an be used to easily hange the view of a model. The PCL funtions are reated by reording

More information

On - Line Path Delay Fault Testing of Omega MINs M. Bellos 1, E. Kalligeros 1, D. Nikolos 1,2 & H. T. Vergos 1,2

On - Line Path Delay Fault Testing of Omega MINs M. Bellos 1, E. Kalligeros 1, D. Nikolos 1,2 & H. T. Vergos 1,2 On - Line Path Delay Fault Testing of Omega MINs M. Bellos, E. Kalligeros, D. Nikolos,2 & H. T. Vergos,2 Dept. of Computer Engineering and Informatis 2 Computer Tehnology Institute University of Patras,

More information

238P: Operating Systems. Lecture 3: Calling conventions. Anton Burtsev October, 2018

238P: Operating Systems. Lecture 3: Calling conventions. Anton Burtsev October, 2018 238P: Operating Systems Lecture 3: Calling conventions Anton Burtsev October, 2018 What does CPU do internally? (Remember Lecture 01 - Introduction?) CPU execution loop CPU repeatedly reads instructions

More information

CA Compiler Construction

CA Compiler Construction CA4003 - Compiler Construction David Sinclair When procedure A calls procedure B, we name procedure A the caller and procedure B the callee. A Runtime Environment, also called an Activation Record, is

More information

The recursive decoupling method for solving tridiagonal linear systems

The recursive decoupling method for solving tridiagonal linear systems Loughborough University Institutional Repository The reursive deoupling method for solving tridiagonal linear systems This item was submitted to Loughborough University's Institutional Repository by the/an

More information

Real-Time Control for a Turbojet Engine

Real-Time Control for a Turbojet Engine A Multiproessor mplementation of Real-Time Control for a Turbojet Engine Phillip L. Shaffer ABSTRACT: A real-time ontrol program for a turbojet engine has been implemented on a four-proessor omputer, ahieving

More information

CS553 Lecture Introduction to Data-flow Analysis 1

CS553 Lecture Introduction to Data-flow Analysis 1 ! Ide Introdution to Dt-flow nlysis!lst Time! Implementing Mrk nd Sweep GC!Tody! Control flow grphs! Liveness nlysis! Register llotion CS553 Leture Introdution to Dt-flow Anlysis 1 Dt-flow Anlysis! Dt-flow

More information

CA Test Data Manager 4.x Implementation Proven Professional Exam (CAT-681) Study Guide Version 1.0

CA Test Data Manager 4.x Implementation Proven Professional Exam (CAT-681) Study Guide Version 1.0 Implementation Proven Professional Study Guide Version 1.0 PROPRIETARY AND CONFIDENTIAL INFORMATION 2017 CA. All rights reserved. CA onfidential & proprietary information. For CA, CA Partner and CA Customer

More information

Register Allocation, iii. Bringing in functions & using spilling & coalescing

Register Allocation, iii. Bringing in functions & using spilling & coalescing Register Allocation, iii Bringing in functions & using spilling & coalescing 1 Function Calls ;; f(x) = let y = g(x) ;; in h(y+x) + y*5 (:f (x

More information

Lecture 14. Recursion

Lecture 14. Recursion Leture 14 Reursion Announements for Today Prelim 1 Tonight at 5:15 OR 7:30 A D (5:15, Uris G01) E-K (5:15, Statler) L P (7:30, Uris G01) Q-Z (7:30, Statler) Graded y noon on Sun Sores will e in CMS In

More information

Allocating Rotating Registers by Scheduling

Allocating Rotating Registers by Scheduling Alloating Rotating Registers by Sheduling Hongbo Rong Hyunhul Park Cheng Wang Youfeng Wu Programming Systems Lab Intel Labs {hongbo.rong,hyunhul.park,heng..wang,youfeng.wu}@intel.om ABSTRACT A rotating

More information

W4118: PC Hardware and x86. Junfeng Yang

W4118: PC Hardware and x86. Junfeng Yang W4118: PC Hardware and x86 Junfeng Yang A PC How to make it do something useful? 2 Outline PC organization x86 instruction set gcc calling conventions PC emulation 3 PC board 4 PC organization One or more

More information

Chapter 2: Introduction to Maple V

Chapter 2: Introduction to Maple V Chapter 2: Introdution to Maple V 2-1 Working with Maple Worksheets Try It! (p. 15) Start a Maple session with an empty worksheet. The name of the worksheet should be Untitled (1). Use one of the standard

More information

Liveness Analysis and Register Allocation

Liveness Analysis and Register Allocation Liveness Analysis and Register Allocation Leonidas Fegaras CSE 5317/4305 L10: Liveness Analysis and Register Allocation 1 Liveness Analysis So far we assumed that we have a very large number of temporary

More information

CS577 Modern Language Processors. Spring 2018 Lecture Optimization

CS577 Modern Language Processors. Spring 2018 Lecture Optimization CS577 Modern Language Processors Spring 2018 Lecture Optimization 1 GENERATING BETTER CODE What does a conventional compiler do to improve quality of generated code? Eliminate redundant computation Move

More information

THEORY OF COMPILATION

THEORY OF COMPILATION Lecture 10 Activation Records THEORY OF COMPILATION EranYahav www.cs.technion.ac.il/~yahave/tocs2011/compilers-lec10.pptx Reference: Dragon 7.1,7.2. MCD 6.3,6.4.2 1 You are here Compiler txt Source Lexical

More information

Where we are. Instruction selection. Abstract Assembly. CS 4120 Introduction to Compilers

Where we are. Instruction selection. Abstract Assembly. CS 4120 Introduction to Compilers Where we are CS 420 Introduction to Compilers Andrew Myers Cornell University Lecture 8: Instruction Selection 5 Oct 20 Intermediate code Canonical intermediate code Abstract assembly code Assembly code

More information

143A: Principles of Operating Systems. Lecture 5: Calling conventions. Anton Burtsev January, 2017

143A: Principles of Operating Systems. Lecture 5: Calling conventions. Anton Burtsev January, 2017 143A: Principles of Operating Systems Lecture 5: Calling conventions Anton Burtsev January, 2017 Stack and procedure calls Stack Main purpose: Store the return address for the current procedure Caller

More information

Practical Malware Analysis

Practical Malware Analysis Practical Malware Analysis Ch 4: A Crash Course in x86 Disassembly Revised 1-16-7 Basic Techniques Basic static analysis Looks at malware from the outside Basic dynamic analysis Only shows you how the

More information

Lecture #16: Introduction to Runtime Organization. Last modified: Fri Mar 19 00:17: CS164: Lecture #16 1

Lecture #16: Introduction to Runtime Organization. Last modified: Fri Mar 19 00:17: CS164: Lecture #16 1 Lecture #16: Introduction to Runtime Organization Last modified: Fri Mar 19 00:17:19 2010 CS164: Lecture #16 1 Status Lexical analysis Produces tokens Detects & eliminates illegal tokens Parsing Produces

More information

Today More register allocation Clarifications from last time Finish improvements on basic graph coloring concept Procedure calls Interprocedural

Today More register allocation Clarifications from last time Finish improvements on basic graph coloring concept Procedure calls Interprocedural More Register Allocation Last time Register allocation Global allocation via graph coloring Today More register allocation Clarifications from last time Finish improvements on basic graph coloring concept

More information

Wednesday, October 15, 14. Functions

Wednesday, October 15, 14. Functions Functions Terms void foo() { int a, b;... bar(a, b); void bar(int x, int y) {... foo is the caller bar is the callee a, b are the actual parameters to bar x, y are the formal parameters of bar Shorthand:

More information

Colouring contact graphs of squares and rectilinear polygons de Berg, M.T.; Markovic, A.; Woeginger, G.

Colouring contact graphs of squares and rectilinear polygons de Berg, M.T.; Markovic, A.; Woeginger, G. Colouring ontat graphs of squares and retilinear polygons de Berg, M.T.; Markovi, A.; Woeginger, G. Published in: nd European Workshop on Computational Geometry (EuroCG 06), 0 Marh - April, Lugano, Switzerland

More information

Project Compiler. CS031 TA Help Session November 28, 2011

Project Compiler. CS031 TA Help Session November 28, 2011 Project Compiler CS031 TA Help Session November 28, 2011 Motivation Generally, it s easier to program in higher-level languages than in assembly. Our goal is to automate the conversion from a higher-level

More information

Flow Demands Oriented Node Placement in Multi-Hop Wireless Networks

Flow Demands Oriented Node Placement in Multi-Hop Wireless Networks Flow Demands Oriented Node Plaement in Multi-Hop Wireless Networks Zimu Yuan Institute of Computing Tehnology, CAS, China {zimu.yuan}@gmail.om arxiv:153.8396v1 [s.ni] 29 Mar 215 Abstrat In multi-hop wireless

More information

Midterm II CS164, Spring 2006

Midterm II CS164, Spring 2006 Midterm II CS164, Spring 2006 April 11, 2006 Please read all instructions (including these) carefully. Write your name, login, SID, and circle the section time. There are 10 pages in this exam and 4 questions,

More information

Extracting Partition Statistics from Semistructured Data

Extracting Partition Statistics from Semistructured Data Extrating Partition Statistis from Semistrutured Data John N. Wilson Rihard Gourlay Robert Japp Mathias Neumüller Department of Computer and Information Sienes University of Strathlyde, Glasgow, UK {jnw,rsg,rpj,mathias}@is.strath.a.uk

More information

CSE P 501 Compilers. Register Allocation Hal Perkins Autumn /22/ Hal Perkins & UW CSE P-1

CSE P 501 Compilers. Register Allocation Hal Perkins Autumn /22/ Hal Perkins & UW CSE P-1 CSE P 501 Compilers Register Allocation Hal Perkins Autumn 2011 11/22/2011 2002-11 Hal Perkins & UW CSE P-1 Agenda Register allocation constraints Local methods Faster compile, slower code, but good enough

More information

Low-Level Issues. Register Allocation. Last lecture! Liveness analysis! Register allocation. ! More register allocation. ! Instruction scheduling

Low-Level Issues. Register Allocation. Last lecture! Liveness analysis! Register allocation. ! More register allocation. ! Instruction scheduling Low-Level Issues Last lecture! Liveness analysis! Register allocation!today! More register allocation!later! Instruction scheduling CS553 Lecture Register Allocation I 1 Register Allocation!Problem! Assign

More information

143A: Principles of Operating Systems. Lecture 4: Calling conventions. Anton Burtsev October, 2017

143A: Principles of Operating Systems. Lecture 4: Calling conventions. Anton Burtsev October, 2017 143A: Principles of Operating Systems Lecture 4: Calling conventions Anton Burtsev October, 2017 Recap from last time Stack and procedure calls What is stack? Stack It's just a region of memory Pointed

More information

DECT Module Installation Manual

DECT Module Installation Manual DECT Module Installation Manual Rev. 2.0 This manual desribes the DECT module registration method to the HUB and fan airflow settings. In order for the HUB to ommuniate with a ompatible fan, the DECT module

More information

A Bad Name. CS 2210: Optimization. Register Allocation. Optimization. Reaching Definitions. Dataflow Analyses 4/10/2013

A Bad Name. CS 2210: Optimization. Register Allocation. Optimization. Reaching Definitions. Dataflow Analyses 4/10/2013 A Bad Name Optimization is the process by which we turn a program into a better one, for some definition of better. CS 2210: Optimization This is impossible in the general case. For instance, a fully optimizing

More information

Automatic Physical Design Tuning: Workload as a Sequence Sanjay Agrawal Microsoft Research One Microsoft Way Redmond, WA, USA +1-(425)

Automatic Physical Design Tuning: Workload as a Sequence Sanjay Agrawal Microsoft Research One Microsoft Way Redmond, WA, USA +1-(425) Automati Physial Design Tuning: Workload as a Sequene Sanjay Agrawal Mirosoft Researh One Mirosoft Way Redmond, WA, USA +1-(425) 75-357 sagrawal@mirosoft.om Eri Chu * Computer Sienes Department University

More information

x86 assembly CS449 Fall 2017

x86 assembly CS449 Fall 2017 x86 assembly CS449 Fall 2017 x86 is a CISC CISC (Complex Instruction Set Computer) e.g. x86 Hundreds of (complex) instructions Only a handful of registers RISC (Reduced Instruction Set Computer) e.g. MIPS

More information

What are Cycle-Stealing Systems Good For? A Detailed Performance Model Case Study

What are Cycle-Stealing Systems Good For? A Detailed Performance Model Case Study What are Cyle-Stealing Systems Good For? A Detailed Performane Model Case Study Wayne Kelly and Jiro Sumitomo Queensland University of Tehnology, Australia {w.kelly, j.sumitomo}@qut.edu.au Abstrat The

More information

Reverse Engineering of Assembler Programs: A Model-Based Approach and its Logical Basis

Reverse Engineering of Assembler Programs: A Model-Based Approach and its Logical Basis Reverse Engineering of Assembler Programs: A Model-Based Approah and its Logial Basis Tom Lake and Tim Blanhard, InterGlossa Ltd., Reading, UK Tel: +44 174 561919 email: {Tom.Lake,Tim.Blanhard}@glossa.o.uk

More information

Interconnection Styles

Interconnection Styles Interonnetion tyles oftware Design Following the Export (erver) tyle 2 M1 M4 M5 4 M3 M6 1 3 oftware Design Following the Export (Client) tyle e 2 e M1 M4 M5 4 M3 M6 1 e 3 oftware Design Following the Export

More information

1 Disjoint-set data structure.

1 Disjoint-set data structure. CS 124 Setion #4 Union-Fin, Greey Algorithms 2/20/17 1 Disjoint-set ata struture. 1.1 Operations Disjoint-set ata struture enale us to effiiently perform operations suh as plaing elements into sets, querying

More information

Zippy - A coarse-grained reconfigurable array with support for hardware virtualization

Zippy - A coarse-grained reconfigurable array with support for hardware virtualization Zippy - A oarse-grained reonfigurable array with support for hardware virtualization Christian Plessl Computer Engineering and Networks Lab ETH Zürih, Switzerland plessl@tik.ee.ethz.h Maro Platzner Department

More information

Winter Compiler Construction T11 Activation records + Introduction to x86 assembly. Today. Tips for PA4. Today:

Winter Compiler Construction T11 Activation records + Introduction to x86 assembly. Today. Tips for PA4. Today: Winter 2006-2007 Compiler Construction T11 Activation records + Introduction to x86 assembly Mooly Sagiv and Roman Manevich School of Computer Science Tel-Aviv University Today ic IC Language Lexical Analysis

More information

CS 406/534 Compiler Construction Putting It All Together

CS 406/534 Compiler Construction Putting It All Together CS 406/534 Compiler Construction Putting It All Together Prof. Li Xu Dept. of Computer Science UMass Lowell Fall 2004 Part of the course lecture notes are based on Prof. Keith Cooper, Prof. Ken Kennedy

More information

Definitions Homework. Quine McCluskey Optimal solutions are possible for some large functions Espresso heuristic. Definitions Homework

Definitions Homework. Quine McCluskey Optimal solutions are possible for some large functions Espresso heuristic. Definitions Homework EECS 33 There be Dragons here http://ziyang.ees.northwestern.edu/ees33/ Teaher: Offie: Email: Phone: L477 Teh dikrp@northwestern.edu 847 467 2298 Today s material might at first appear diffiult Perhaps

More information

Sequential Incremental-Value Auctions

Sequential Incremental-Value Auctions Sequential Inremental-Value Autions Xiaoming Zheng and Sven Koenig Department of Computer Siene University of Southern California Los Angeles, CA 90089-0781 {xiaominz,skoenig}@us.edu Abstrat We study the

More information

Drawing lines. Naïve line drawing algorithm. drawpixel(x, round(y)); double dy = y1 - y0; double dx = x1 - x0; double m = dy / dx; double y = y0;

Drawing lines. Naïve line drawing algorithm. drawpixel(x, round(y)); double dy = y1 - y0; double dx = x1 - x0; double m = dy / dx; double y = y0; Naïve line drawing algorithm // Connet to grid points(x0,y0) and // (x1,y1) by a line. void drawline(int x0, int y0, int x1, int y1) { int x; double dy = y1 - y0; double dx = x1 - x0; double m = dy / dx;

More information

Layout Compliance for Triple Patterning Lithography: An Iterative Approach

Layout Compliance for Triple Patterning Lithography: An Iterative Approach Layout Compliane for Triple Patterning Lithography: An Iterative Approah Bei Yu, Gilda Garreton, David Z. Pan ECE Dept. University of Texas at Austin, Austin, TX, USA Orale Las, Orale Corporation, Redwood

More information

Topic 5: semantic analysis. 5.2 Attribute Grammars

Topic 5: semantic analysis. 5.2 Attribute Grammars Topi 5: semanti analysis 5.2 Semanti Analysis Attribute Grammar: An attribute grammar is an extension of a ontextfree grammar, with two extensions: 1.Attributes on Symbols: ah grammar symbol S, terminal

More information

LAB 4: Operations on binary images Histograms and color tables

LAB 4: Operations on binary images Histograms and color tables LAB 4: Operations on binary images Histograms an olor tables Computer Vision Laboratory Linköping University, Sween Preparations an start of the lab system You will fin a ouple of home exerises (marke

More information

Graph-Based vs Depth-Based Data Representation for Multiview Images

Graph-Based vs Depth-Based Data Representation for Multiview Images Graph-Based vs Depth-Based Data Representation for Multiview Images Thomas Maugey, Antonio Ortega, Pasal Frossard Signal Proessing Laboratory (LTS), Eole Polytehnique Fédérale de Lausanne (EPFL) Email:

More information

x86 assembly CS449 Spring 2016

x86 assembly CS449 Spring 2016 x86 assembly CS449 Spring 2016 CISC vs. RISC CISC [Complex instruction set Computing] - larger, more feature-rich instruction set (more operations, addressing modes, etc.). slower clock speeds. fewer general

More information

CA Privileged Access Manager 3.x Proven Implementation Professional Exam (CAT-661) Study Guide Version 1.0

CA Privileged Access Manager 3.x Proven Implementation Professional Exam (CAT-661) Study Guide Version 1.0 Exam (CAT-661) Study Guide Version 1.0 PROPRIETARY AND CONFIDENTIAL INFMATION 2018 CA. All rights reserved. CA onfidential & proprietary information. For CA, CA Partner and CA Customer use only. No unauthorized

More information

Lecture 25: Register Allocation

Lecture 25: Register Allocation Lecture 25: Register Allocation [Adapted from notes by R. Bodik and G. Necula] Topics: Memory Hierarchy Management Register Allocation: Register interference graph Graph coloring heuristics Spilling Cache

More information

Run Time Environment. Implementing Object-Oriented Languages

Run Time Environment. Implementing Object-Oriented Languages Run Time Environment Implementing Objet-Oriented Languages Copright 2017, Pedro C. Diniz, all rights reserved. Students enrolled in the Compilers lass at the Universit of Southern California have epliit

More information

CA Release Automation 5.x Implementation Proven Professional Exam (CAT-600) Study Guide Version 1.1

CA Release Automation 5.x Implementation Proven Professional Exam (CAT-600) Study Guide Version 1.1 Exam (CAT-600) Study Guide Version 1.1 PROPRIETARY AND CONFIDENTIAL INFORMATION 2016 CA. All rights reserved. CA onfidential & proprietary information. For CA, CA Partner and CA Customer use only. No unauthorized

More information

Detection and Recognition of Non-Occluded Objects using Signature Map

Detection and Recognition of Non-Occluded Objects using Signature Map 6th WSEAS International Conferene on CIRCUITS, SYSTEMS, ELECTRONICS,CONTROL & SIGNAL PROCESSING, Cairo, Egypt, De 9-31, 007 65 Detetion and Reognition of Non-Oluded Objets using Signature Map Sangbum Park,

More information

Performance Benchmarks for an Interactive Video-on-Demand System

Performance Benchmarks for an Interactive Video-on-Demand System Performane Benhmarks for an Interative Video-on-Demand System. Guo,P.G.Taylor,E.W.M.Wong,S.Chan,M.Zukerman andk.s.tang ARC Speial Researh Centre for Ultra-Broadband Information Networks (CUBIN) Department

More information

PASCAL 64. "The" Pascal Compiler for the Commodore 64. A Data Becker Product. >AbacusiII Software P.O. BOX 7211 GRAND RAPIDS, MICK 49510

PASCAL 64. The Pascal Compiler for the Commodore 64. A Data Becker Product. >AbacusiII Software P.O. BOX 7211 GRAND RAPIDS, MICK 49510 PASCAL 64 "The" Pasal Compiler for the Commodore 64 A Data Beker Produt >AbausiII Software P.O. BOX 7211 GRAND RAPIDS, MICK 49510 7010 COPYRIGHT NOTICE ABACUS Software makes this pakage available for use

More information

UCSB Math TI-85 Tutorials: Basics

UCSB Math TI-85 Tutorials: Basics 3 UCSB Math TI-85 Tutorials: Basis If your alulator sreen doesn t show anything, try adjusting the ontrast aording to the instrutions on page 3, or page I-3, of the alulator manual You should read the

More information

CSE 401 Final Exam. December 16, 2010

CSE 401 Final Exam. December 16, 2010 CSE 401 Final Exam December 16, 2010 Name You may have one sheet of handwritten notes plus the handwritten notes from the midterm. You may also use information about MiniJava, the compiler, and so forth

More information

Multiple Assignments

Multiple Assignments Two Outputs Conneted Together Multiple Assignments Two Outputs Conneted Together if (En1) Q

More information

13.1 Numerical Evaluation of Integrals Over One Dimension

13.1 Numerical Evaluation of Integrals Over One Dimension 13.1 Numerial Evaluation of Integrals Over One Dimension A. Purpose This olletion of subprograms estimates the value of the integral b a f(x) dx where the integrand f(x) and the limits a and b are supplied

More information