Exhaustive Phase Order Search Space Exploration and Evaluation

Size: px
Start display at page:

Download "Exhaustive Phase Order Search Space Exploration and Evaluation"

Transcription

1 Exhaustive Phase Order Search Space Exploration and Evaluation by Prasad Kulkarni (Florida State University) / 55

2 Compiler Optimizations To improve efficiency of compiler generated code Optimization phases require enabling conditions need specific patterns in the code many also need available registers Phases interact with each other Applying optimizations in different orders generates different code 2 / 55

3 Phase Ordering Problem To find an ordering of optimization phases that produces optimal code with respect to possible phase orderings Evaluating each sequence involves compiling, assembling, linking, execution and verifying results Best optimization phase ordering depends on source application target platform implementation of optimization phases Long standing problem in compiler optimization!! 3 / 55

4 Phase Ordering Space Current compilers incorporate numerous different optimization phases 15 distinct phases in our compiler backend 15! = 1,307,674,368,000 Phases can enable each other any phase can be active multiple times = 437,893,890,380,859,375 cannot restrict sequence length to = * / 55

5 Addressing Phase Ordering Exhaustive Search universally considered intractable We are now able to exhaustively evaluate the optimization phase order space. 5 / 55

6 Re-stating of Phase Ordering Earlier approach explicitly enumerate all possible optimization phase orderings Our approach explicitly enumerate all function instances that can be produced by any combination of phases 6 / 55

7 Outline Experimental framework Exhaustive phase order space evaluation Faster conventional compilation Conclusions Summary of my other work Future research directions 7 / 55

8 Outline Experimental framework Exhaustive phase order space evaluation Faster conventional compilation Conclusions Summary of my other work Future research directions 8 / 55

9 Experimental Framework We used the VPO compilation system established compiler framework, started development in 1988 comparable performance to gcc O2 VPO performs all transformations on a single representation (RTLs), so it is possible to perform most phases in an arbitrary order Experiments use all the 15 re-orderable optimization phases in VPO Target architecture was the StrongARM SA-100 processor 9 / 55

10 VPO Optimization Phases ID Optimization Phase ID Optimization Phase b branch chaining l loop transformations c common subexpr. elim. n code abstraction d remv. unreachable code o eval. order determin. g loop unrolling q strength reduction h dead assignment elim. r reverse branches i block reordering s instruction selection j minimize loop jumps u remv. useless jumps k register allocation 10 / 55

11 Disclaimers Did not include optimization phases normally associated with compiler front ends no memory hierarchy optimizations no inlining or other interprocedural optimizations Did not vary how phases are applied Did not include optimizations that require profile data 11 / 55

12 Benchmarks 12 MiBench benchmarks; 244 functions Category auto network telecomm consumer security office Program bitcount qsort dijkstra patricia fft adpcm jpeg tiff2bw sha blowfish stringsearch ispell Description test processor bit manipulation abilities sort strings using quicksort sorting algorithm Dijkstra s shortest path algorithm construct patricia trie for IP traffic fast fourier transform compress 16-bit linear PCM samples to 4-bit image compression and decompression convert color.tiff image to b&w image secure hash algorithm symmetric block cipher with variable length key searches for given words in phrases fast spelling checker 12 / 55

13 Outline Experimental framework Exhaustive phase order space evaluation Faster conventional compilation Conclusions Summary of my other work Future research directions 13 / 55

14 Terminology Active phase An optimization phase that modifies the function representation Dormant phase A phase that is unable to find any opportunity to change the function Function instance any semantically, syntactically, and functionally correct representation of the source function (that can be produced by our compiler) 14 / 55

15 Naïve Optimization Phase Order Space All combinations of optimization phase sequences are attempted L0 a b c d L1 a b c d a d a d a d b c b c b c L2 15 / 55

16 Eliminating Consecutively Applied Phases A phase just applied in our compiler cannot be immediately active again L0 a b c d L1 a b c d a b c d a b c d a b c d L2 16 / 55

17 Eliminating Dormant Phases Get feedback from the compiler indicating if any transformations were successfully applied in a phase. L0 a b c d L1 b c d a c d a b d a b c L2 17 / 55

18 Identical Function Instances Some optimization phases are independent example: branch chaining & register allocation Different phase sequences can produce the same code r[2] = 1; r[3] = r[4] + r[2]; instruction selection r[3] = r[4] + 1; r[2] = 1; r[3] = r[4] + r[2]; constant propagation r[2] = 1; r[3] = r[4] + 1; dead assignment elimination r[3] = r[4] + 1; 18 / 55

19 Equivalent Function Instances sum = 0; for (i = 0; i < 1000; i++ ) sum += a [ i ]; r[10]=0; r[12]=hi[a]; r[12]=r[12]+lo[a]; r[1]=r[12]; r[9]=4000+r[12]; L3 r[8]=m[r[1]]; r[10]=r[10]+r[8]; r[1]=r[1]+4; IC=r[1]?r[9]; PC=IC<0,L3; Register Allocation before Code Motion Source Code r[11]=0; r[10]=hi[a]; r[10]=r[10]+lo[a]; r[1]=r[10]; r[9]=4000+r[10]; L5 r[8]=m[r[1]]; r[11]=r[11]+r[8]; r[1]=r[1]+4; IC=r[1]?r[9]; PC=IC<0,L5; Code Motion before Register Allocation r[32]=0; r[33]=hi[a]; r[33]=r[33]+lo[a]; r[34]=r[33]; r[35]=4000+r[33]; L01 r[36]=m[r[34]]; r[32]=r[32]+r[36]; r[34]=r[34]+4; IC=r[34]?r[35]; PC=IC<0,L01; After Mapping Registers 19 / 55

20 Efficient Detection of Unique Function Instances After pruning dormant phases there may be tens or hundreds of thousands of unique instances Use a CRC (cyclic redundancy check) checksum on the bytes of the RTLs representing the instructions Used a hash table to check if an identical or equivalent function instance already exists in the DAG 20 / 55

21 Eliminating Identical/Equivalent Function Instances Resulting search space is a DAG of function instances L0 a b c c d a d a d L1 L2 21 / 55

22 Static Enumeration Results Function Inst. Fn_inst Len CF Batch Vs. optimal start_input_bmp (j) 1, , correct(i) 1,295 1,348, main (t) 1,276 2,882, parse_switches (j) 1, , start_input_gif (j) , start_input_tga (j) , askmode (i) , skiptoword (i) , start_input_ppm (j) 795 8, Average (234) , / 55

23 Exhaustively enumerated the optimization phase order space to find an optimal phase ordering with respect to code-size [Published in CGO 06] 23 / 55

24 Determining Program Performance Almost 175,000 distinct function instances, on average largest enumerated function has 2,882,021 instances Too time consuming to execute each distinct function instance assemble link execute more expensive than compilation Many embedded development environments use simulation simulation orders of magnitude more expensive than execution Use data obtained from a few executions to estimate the performance of all remaining function instances 24 / 55

25 Determining Program Performance (cont...) Function instances having identical control-flow graphs execute each block the same number of times Execute application once for each controlflow structure Statically estimate the number of cycles required to execute each basic block dynamic frequency measure = Σ (static cycles * block frequency) 25 / 55

26 Predicting Relative Performance I cycles 4 cycles cycles cycles 20 cycles 2 2 cycles 22 cycles 2 2 cycles 5 5 cycles cycles 5 10 cycles cycles Total cycles = 744 Total cycles = / 55

27 Dynamic Frequency Results Function Inst. Fn_inst Len CF Leaf % from optimal Batch Worst main (t) 1,276 2,882, , parse_switches (j) 1, , , askmode (i) , skiptoword (i) , , start_input_ppm (j) 795 8, pfx_list_chk (i) 640 1,269, , main (f) 624 2,789, , sha_transform (h) , , main (p) , Average (79) , / 55

28 Correlation Dynamic Frequency Counts Vs. Simulator Cycles Static performance estimation is inaccurate ignored cache/branch misprediction penalties Dynamic frequency counts may be sufficiently accurate simplification of the estimation problem most embedded systems have simpler architectures We show strong correlation between our measure of performance and simulator cycles 28 / 55

29 Complete Function Correlation Example: init_search in stringsearch 29 / 55

30 Leaf Function Correlation Leaf function instances are generated when no additional phases can be successfully applied Leaf instances provide a good sampling represents the only code that can be generated by an aggressive compiler, like VPO at least one leaf instance represents an optimal phase ordering for over 86% of functions significant percent of leaf instances among optimal 30 / 55

31 Leaf Function Correlation Statistics Pearson s correlation coefficient Pcorr = Σxy (ΣxΣy)/n sqrt( (Σx 2 (Σx) 2 /n) * (Σy 2 -(Σy) 2 /n) ) Accuracy of our estimate of optimal perf. cycle count for best leaf Lcorr = cy. cnt for leaf with best dynamic freq count 31 / 55

32 32 / 55 Leaf Function Correlation Statistics (cont ) Leaves Ratio Lcorr 0% Leaves Ratio average dijkstra(d) dequeue(d) ntbl_bit (b) ntbl_bitcnt(b) 23 main(b) bitcount(b) 2 bit_shifter(b) 2 bit_count.(b) 2 BW_btbl...(b) 1 AR_btbl...(b) Lcorr 1% Pcorr Function

33 Exhaustively evaluated the optimization phase order space to find a near-optimal phase ordering with respect to simulator cycles [Published in LCTES 06] 33 / 55

34 Outline Experimental framework Exhaustive phase order space evaluation Faster conventional compilation Conclusions Summary of my other work Future research directions 34 / 55

35 Phase Enabling Interaction benables a along the path a-b-a a b c b c c a b a d 35 / 55

36 36 / 55 Phase Enabling Probabilities g u s r q o n l k j i h d c b u s r q o n l k j i h g d c b St Ph

37 Phase Disabling Interaction b disables a along the path b-c-d a b c b c c a b a d 37 / 55

38 38 / 55 Disabling Probabilities u s r q o n l k j i h g d c b u s r q o n l k j i h g d c b Ph

39 Faster Conventional Compiler Modified VPO to use enabling and disabling phase probabilities to decrease compilation time # p[i] - current probability of phase i being active # e[i][j] - probability of phase j enabling phase i # d[i][j] - probability of phase j disabling phase i For each phase i do p[i] = e[i][st]; While (any p[i] > 0) do Select j as the current phase with highest probability of being active Apply phase j If phase j was active then For each phase i, where i!= j do p[i] += ((1-p[i]) * e[i][j]) - (p[i] * d[i][j]) p[j] = 0 39 / 55

40 40 / 55 Probabilistic Compilation Results average dijkstra(d) main(j) N/A LZWReadByte(j) N/A read_scan...(j) sha_trans...(h) main(f) fft_float(f) start_inp...(j) N/A start_inp...(j) N/A start_inp...(j) parse_swi...(j) N/A start_inp...(j) Speed Size Time Active Attempted Active Attempted Prob. / Old Prob. Compilation Old Compilation Function

41 Outline Experimental framework Exhaustive phase order space evaluation Faster conventional compilation Conclusions Summary of my other work Future research directions 41 / 55

42 Conclusions Phase ordering problem long standing problem in compiler optimization exhaustive evaluation always considered infeasible Exhaustively evaluated the phase order space re-interpretation of the problem novel application of search algorithms fast pruning techniques accurate prediction of relative performance Analyzed properties of the phase order space to speedup conventional compilation published in CGO 06, LCTES 06, submitted to TOPLAS 42 / 55

43 Challenges Exhaustive phase order search is a severe stress test for the compiler isolate analysis required and invalidated by each phase produce correct code for all phase orderings eliminate all memory leaks Search algorithm needs to be highly efficient used CRCs and hashes for function comparisons stored intermediate function instances to reduce disk access maintained logs to restart search after crash 43 / 55

44 Outline Experimental framework Exhaustive phase order space evaluation Faster conventional compilation Conclusions Summary of my other work Future research directions 44 / 55

45 VISTA Provides an interactive code improvement paradigm view low-level program representation apply existing phases and manual changes in any order browse and undo previous changes automatically obtain performance information automatically search for effective phase sequences Useful as a research as well as teaching tool employed in three universities published in LCTES 03, TECS / 55

46 VISTA Main Window 46 / 55

47 Faster Genetic Algorithm Searches Improving performance of genetic algorithms avoid redundant executions of the application over 87% of executions were avoided reduce search time by 62% modify search to obtain comparable results in fewer generations reduced GA generations by 59% reduce search time by 35% published in PLDI 04, TACO / 55

48 Heuristic Search Algorithms Analyzing the phase order space to improve heuristic algorithms detailed performance and cost comparison of different heuristic algorithms demonstrated the importance and difficulty of selecting the correct sequence length illustrated the importance of leaf function instances proposed modifications to existing algorithms, and new search algorithms published in CGO / 55

49 Dynamic Compilation Explored asynchronous dynamic compilation in a virtual machine demonstrated shortcomings of current popular compilation strategy describe importance of minimum compiler utilization discussed new compilation strategies explored the changes needed to current compilation strategies to exploit free cycles Will be published in VEE / 55

50 Outline Experimental framework Exhaustive phase order space evaluation Faster conventional compilation Conclusions Summary of my other work Future research directions 50 / 55

51 Compiler Technology Challenges High Level Language c o m p i l e r Machine Architecture Support for parallelism traditional languages express parallelism dynamic scheduling Virtual machines dynamic code generation and optimization Push compilation decisions further down multi-core heterogeneous cores No great solution performance monitoring software-controlled reconfiguration Can no longer do it alone 51 / 55

52 Iterative Compilation & Machine Learning Improved scope for iterative compilation & machine learning proliferation of new architectures automate tuning compiler heuristics tuning important libraries using performance monitors dynamic JIT compilers How to use machine learning to optimize and schedule more efficiently? 52 / 55

53 Heterogeneous Multi-core Architectures Can provide the best performance, cost, power balance Challenges schedule tasks, allocate resources dynamic core-specific optimization automatic data layout to prevent conflicts 53 / 55

54 Dynamic Compilation Virtual machines likely to grow in importance productivity, portability, interoperability, isolation... Challenges when, what, how to parallelize using hardware performance monitors using static analyses to aid dynamic compilation debugging tools for correctness and performance debugging 54 / 55

55 Questions? 55 / 55

56 Results Code Size Summary Exhaustively enumerated 234 out of 244 functions Sequence length Maximum: 44; Average: Distinct function instances Maximum: 2,882,021; Average: 89,947 Distinct control-flows Maximum: 920; Average: 36.2 Code size improvement over default sequence Maximum: 63%; Average: 6.46% 56 / 55

57 Results Performance Summary Exhaustively evaluated 79 out of 88 functions Sequence length Maximum: 44; Average: 16.1 Distinct function instances Maximum: 2,882,021; Average: 174,574.8 Distinct control-flows Maximum: 920; Average: 47.4 Performance improvement over default sequence Maximum: 15%; Average: 4.8% 57 / 55

58 Leaf Vs. Non-Leaf Performance 58 / 55

59 Phase Order Space Evaluation Summary last phase active? Y identical function instance? N equivalent function instance? N generate next optimization sequence N Y Y Y add node to DAG calculate function performance simulate application N seen control-flow structure? 59 / 55

60 Phase Order Space Evaluation Summary last phase active? Y identical function instance? N equivalent function instance? N generate next optimization sequence N Y Y Y add node to DAG calculate function performance simulate application N seen control-flow structure? 60 / 55

61 Phase Order Space Evaluation Summary last phase active? Y identical function instance? N equivalent function instance? N generate next optimization sequence N Y Y Y add node to DAG calculate function performance simulate application N seen control-flow structure? 61 / 55

62 Phase Order Space Evaluation Summary last phase active? Y identical function instance? N equivalent function instance? N generate next optimization sequence N Y Y Y add node to DAG calculate function performance simulate application N seen control-flow structure? 62 / 55

63 Phase Order Space Evaluation Summary last phase active? Y identical function instance? N equivalent function instance? N generate next optimization sequence N Y Y Y add node to DAG calculate function performance simulate application N seen control-flow structure? 63 / 55

64 Phase Order Space Evaluation Summary last phase active? Y identical function instance? N equivalent function instance? N generate next optimization sequence N Y Y Y add node to DAG calculate function performance simulate application N seen control-flow structure? 64 / 55

65 Phase Order Space Evaluation Summary last phase active? Y identical function instance? N equivalent function instance? N generate next optimization sequence N Y Y Y add node to DAG calculate function performance simulate application N seen control-flow structure? 65 / 55

66 Phase Order Space Evaluation Summary last phase active? Y identical function instance? N equivalent function instance? N generate next optimization sequence N Y Y Y add node to DAG calculate function performance simulate application N seen control-flow structure? 66 / 55

67 Weighted Function Instance DAG Each node is weighted by the number of paths to a leaf node a b 5 [abc] c 2 [bc] 1 [c] 2 [ab] b c c a b 1 [a] 1 [d] 1 a d / 55

68 Predicting Relative Performance II 5 4 cycles 5 10 cycles 5 10 cycles 5 10 cycles? 15 cycles? 15 cycles?? 4 cycles? 10 cycles 26 cycles? 90 cycles? 44 cycles? 2 cycles Total cycles = 170 Total cycles =?? 68 / 55

69 Case when No Leaf is Optimal 69 / 55

In Search of Near-Optimal Optimization Phase Orderings

In Search of Near-Optimal Optimization Phase Orderings In Search of Near-Optimal Optimization Phase Orderings Prasad A. Kulkarni David B. Whalley Gary S. Tyson Jack W. Davidson Languages, Compilers, and Tools for Embedded Systems Optimization Phase Ordering

More information

Evaluating Heuristic Optimization Phase Order Search Algorithms

Evaluating Heuristic Optimization Phase Order Search Algorithms Evaluating Heuristic Optimization Phase Order Search Algorithms by Prasad A. Kulkarni David B. Whalley Gary S. Tyson Jack W. Davidson Computer Science Department, Florida State University, Tallahassee,

More information

Fast Searches for Effective Optimization Phase Sequences

Fast Searches for Effective Optimization Phase Sequences Fast Searches for Effective Optimization Phase Sequences Prasad Kulkarni, Stephen Hines, Jason Hiser David Whalley, Jack Davidson, Douglas Jones Computer Science Department, Florida State University, Tallahassee,

More information

Evaluating Heuristic Optimization Phase Order Search Algorithms

Evaluating Heuristic Optimization Phase Order Search Algorithms Evaluating Heuristic Optimization Phase Order Search Algorithms Prasad A. Kulkarni, David B. Whalley, Gary S. Tyson Florida State University Computer Science Department Tallahassee, FL 32306-4530 {kulkarni,whalley,tyson}@cs.fsu.edu

More information

Improving Both the Performance Benefits and Speed of Optimization Phase Sequence Searches

Improving Both the Performance Benefits and Speed of Optimization Phase Sequence Searches Improving Both the Performance Benefits and Speed of Optimization Phase Sequence Searches Prasad A. Kulkarni Michael R. Jantz University of Kansas Department of Electrical Engineering and Computer Science,

More information

Understanding Optimization Phase Interactions to Reduce the Phase Order Search Space

Understanding Optimization Phase Interactions to Reduce the Phase Order Search Space Understanding Optimization Phase Interactions to Reduce the Phase Order Search Space Michael R. Jantz Submitted to the graduate degree program in Electrical Engineering and Computer Science and the Graduate

More information

Exploiting Phase Inter-Dependencies for Faster Iterative Compiler Optimization Phase Order Searches

Exploiting Phase Inter-Dependencies for Faster Iterative Compiler Optimization Phase Order Searches 1/26 Exploiting Phase Inter-Dependencies for Faster Iterative Compiler Optimization Phase Order Searches Michael R. Jantz Prasad A. Kulkarni Electrical Engineering and Computer Science, University of Kansas

More information

Compiler Optimization

Compiler Optimization Compiler Optimization The compiler translates programs written in a high-level language to assembly language code Assembly language code is translated to object code by an assembler Object code modules

More information

Improving Data Access Efficiency by Using Context-Aware Loads and Stores

Improving Data Access Efficiency by Using Context-Aware Loads and Stores Improving Data Access Efficiency by Using Context-Aware Loads and Stores Alen Bardizbanyan Chalmers University of Technology Gothenburg, Sweden alenb@chalmers.se Magnus Själander Uppsala University Uppsala,

More information

Introduction to optimizations. CS Compiler Design. Phases inside the compiler. Optimization. Introduction to Optimizations. V.

Introduction to optimizations. CS Compiler Design. Phases inside the compiler. Optimization. Introduction to Optimizations. V. Introduction to optimizations CS3300 - Compiler Design Introduction to Optimizations V. Krishna Nandivada IIT Madras Copyright c 2018 by Antony L. Hosking. Permission to make digital or hard copies of

More information

ABC basics (compilation from different articles)

ABC basics (compilation from different articles) 1. AIG construction 2. AIG optimization 3. Technology mapping ABC basics (compilation from different articles) 1. BACKGROUND An And-Inverter Graph (AIG) is a directed acyclic graph (DAG), in which a node

More information

Continuous Compilation for Aggressive and Adaptive Code Transformation

Continuous Compilation for Aggressive and Adaptive Code Transformation Continuous Compilation for Aggressive and Adaptive Code Transformation Bruce R. Childers University of Pittsburgh Pittsburgh, Pennsylvania, USA childers@cs.pitt.edu http://www.cs.pitt.edu/coco Today s

More information

CSE 501: Compiler Construction. Course outline. Goals for language implementation. Why study compilers? Models of compilation

CSE 501: Compiler Construction. Course outline. Goals for language implementation. Why study compilers? Models of compilation CSE 501: Compiler Construction Course outline Main focus: program analysis and transformation how to represent programs? how to analyze programs? what to analyze? how to transform programs? what transformations

More information

Building a Runnable Program and Code Improvement. Dario Marasco, Greg Klepic, Tess DiStefano

Building a Runnable Program and Code Improvement. Dario Marasco, Greg Klepic, Tess DiStefano Building a Runnable Program and Code Improvement Dario Marasco, Greg Klepic, Tess DiStefano Building a Runnable Program Review Front end code Source code analysis Syntax tree Back end code Target code

More information

Goals of Program Optimization (1 of 2)

Goals of Program Optimization (1 of 2) Goals of Program Optimization (1 of 2) Goal: Improve program performance within some constraints Ask Three Key Questions for Every Optimization 1. Is it legal? 2. Is it profitable? 3. Is it compile-time

More information

Improving Program Efficiency by Packing Instructions into Registers

Improving Program Efficiency by Packing Instructions into Registers Improving Program Efficiency by Packing Instructions into Registers Stephen Hines, Joshua Green, Gary Tyson and David Whalley Florida State University Computer Science Dept. Tallahassee, Florida 32306-4530

More information

Tour of common optimizations

Tour of common optimizations Tour of common optimizations Simple example foo(z) { x := 3 + 6; y := x 5 return z * y } Simple example foo(z) { x := 3 + 6; y := x 5; return z * y } x:=9; Applying Constant Folding Simple example foo(z)

More information

Reducing Power Consumption for High-Associativity Data Caches in Embedded Processors

Reducing Power Consumption for High-Associativity Data Caches in Embedded Processors Reducing Power Consumption for High-Associativity Data Caches in Embedded Processors Dan Nicolaescu Alex Veidenbaum Alex Nicolau Dept. of Information and Computer Science University of California at Irvine

More information

Automatic Selection of GCC Optimization Options Using A Gene Weighted Genetic Algorithm

Automatic Selection of GCC Optimization Options Using A Gene Weighted Genetic Algorithm Automatic Selection of GCC Optimization Options Using A Gene Weighted Genetic Algorithm San-Chih Lin, Chi-Kuang Chang, Nai-Wei Lin National Chung Cheng University Chiayi, Taiwan 621, R.O.C. {lsch94,changck,naiwei}@cs.ccu.edu.tw

More information

Compiler Passes. Optimization. The Role of the Optimizer. Optimizations. The Optimizer (or Middle End) Traditional Three-pass Compiler

Compiler Passes. Optimization. The Role of the Optimizer. Optimizations. The Optimizer (or Middle End) Traditional Three-pass Compiler Compiler Passes Analysis of input program (front-end) character stream Lexical Analysis Synthesis of output program (back-end) Intermediate Code Generation Optimization Before and after generating machine

More information

Middle End. Code Improvement (or Optimization) Analyzes IR and rewrites (or transforms) IR Primary goal is to reduce running time of the compiled code

Middle End. Code Improvement (or Optimization) Analyzes IR and rewrites (or transforms) IR Primary goal is to reduce running time of the compiled code Traditional Three-pass Compiler Source Code Front End IR Middle End IR Back End Machine code Errors Code Improvement (or Optimization) Analyzes IR and rewrites (or transforms) IR Primary goal is to reduce

More information

Using De-optimization to Re-optimize Code

Using De-optimization to Re-optimize Code Using De-optimization to Re-optimize Code Stephen Hines, Prasad Kulkarni, David Whalley Computer Science Dept. Florida State University Tallahassee, FL 32306-4530 (hines, kulkarni, whalley)@cs.fsu.edu

More information

More Code Generation and Optimization. Pat Morin COMP 3002

More Code Generation and Optimization. Pat Morin COMP 3002 More Code Generation and Optimization Pat Morin COMP 3002 Outline DAG representation of basic blocks Peephole optimization Register allocation by graph coloring 2 Basic Blocks as DAGs 3 Basic Blocks as

More information

Reducing Instruction Fetch Cost by Packing Instructions into Register Windows

Reducing Instruction Fetch Cost by Packing Instructions into Register Windows Reducing Instruction Fetch Cost by Packing Instructions into Register Windows Stephen Hines, Gary Tyson, David Whalley Computer Science Dept. Florida State University November 14, 2005 ➊ Introduction Reducing

More information

Dynamic Binary Translation for Generation of Cycle AccurateOctober Architecture 28, 2008 Simulators 1 / 1

Dynamic Binary Translation for Generation of Cycle AccurateOctober Architecture 28, 2008 Simulators 1 / 1 Dynamic Binary Translation for Generation of Cycle Accurate Architecture Simulators Institut für Computersprachen Technische Universtät Wien Austria Andreas Fellnhofer Andreas Krall David Riegler Part

More information

CODE GENERATION Monday, May 31, 2010

CODE GENERATION Monday, May 31, 2010 CODE GENERATION memory management returned value actual parameters commonly placed in registers (when possible) optional control link optional access link saved machine status local data temporaries A.R.

More information

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING Subject Name: CS2352 Principles of Compiler Design Year/Sem : III/VI

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING Subject Name: CS2352 Principles of Compiler Design Year/Sem : III/VI DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING Subject Name: CS2352 Principles of Compiler Design Year/Sem : III/VI UNIT I - LEXICAL ANALYSIS 1. What is the role of Lexical Analyzer? [NOV 2014] 2. Write

More information

Office Hours: Mon/Wed 3:30-4:30 GDC Office Hours: Tue 3:30-4:30 Thu 3:30-4:30 GDC 5.

Office Hours: Mon/Wed 3:30-4:30 GDC Office Hours: Tue 3:30-4:30 Thu 3:30-4:30 GDC 5. CS380C Compilers Instructor: TA: lin@cs.utexas.edu Office Hours: Mon/Wed 3:30-4:30 GDC 5.512 Jia Chen jchen@cs.utexas.edu Office Hours: Tue 3:30-4:30 Thu 3:30-4:30 GDC 5.440 January 21, 2015 Introduction

More information

A Feasibility Study for Methods of Effective Memoization Optimization

A Feasibility Study for Methods of Effective Memoization Optimization A Feasibility Study for Methods of Effective Memoization Optimization Daniel Mock October 2018 Abstract Traditionally, memoization is a compiler optimization that is applied to regions of code with few

More information

Group B Assignment 8. Title of Assignment: Problem Definition: Code optimization using DAG Perquisite: Lex, Yacc, Compiler Construction

Group B Assignment 8. Title of Assignment: Problem Definition: Code optimization using DAG Perquisite: Lex, Yacc, Compiler Construction Group B Assignment 8 Att (2) Perm(3) Oral(5) Total(10) Sign Title of Assignment: Code optimization using DAG. 8.1.1 Problem Definition: Code optimization using DAG. 8.1.2 Perquisite: Lex, Yacc, Compiler

More information

Intelligent Compilation

Intelligent Compilation Intelligent Compilation John Cavazos Department of Computer and Information Sciences University of Delaware Autotuning and Compilers Proposition: Autotuning is a component of an Intelligent Compiler. Code

More information

Compilers and Interpreters

Compilers and Interpreters Overview Roadmap Language Translators: Interpreters & Compilers Context of a compiler Phases of a compiler Compiler Construction tools Terminology How related to other CS Goals of a good compiler 1 Compilers

More information

USC 227 Office hours: 3-4 Monday and Wednesday CS553 Lecture 1 Introduction 4

USC 227 Office hours: 3-4 Monday and Wednesday  CS553 Lecture 1 Introduction 4 CS553 Compiler Construction Instructor: URL: Michelle Strout mstrout@cs.colostate.edu USC 227 Office hours: 3-4 Monday and Wednesday http://www.cs.colostate.edu/~cs553 CS553 Lecture 1 Introduction 3 Plan

More information

Global Register Allocation

Global Register Allocation Global Register Allocation Y N Srikant Computer Science and Automation Indian Institute of Science Bangalore 560012 NPTEL Course on Compiler Design Outline n Issues in Global Register Allocation n The

More information

Lecture 14 Pointer Analysis

Lecture 14 Pointer Analysis Lecture 14 Pointer Analysis Basics Design Options Pointer Analysis Algorithms Pointer Analysis Using BDDs Probabilistic Pointer Analysis [ALSU 12.4, 12.6-12.7] Phillip B. Gibbons 15-745: Pointer Analysis

More information

Compilation 2012 Static Analysis

Compilation 2012 Static Analysis Compilation 2012 Jan Midtgaard Michael I. Schwartzbach Aarhus University Interesting Questions Is every statement reachable? Does every non-void method return a value? Will local variables definitely be

More information

CS 701. Class Meets. Instructor. Teaching Assistant. Key Dates. Charles N. Fischer. Fall Tuesdays & Thursdays, 11:00 12: Engineering Hall

CS 701. Class Meets. Instructor. Teaching Assistant. Key Dates. Charles N. Fischer. Fall Tuesdays & Thursdays, 11:00 12: Engineering Hall CS 701 Charles N. Fischer Class Meets Tuesdays & Thursdays, 11:00 12:15 2321 Engineering Hall Fall 2003 Instructor http://www.cs.wisc.edu/~fischer/cs703.html Charles N. Fischer 5397 Computer Sciences Telephone:

More information

Optimization. ASU Textbook Chapter 9. Tsan-sheng Hsu.

Optimization. ASU Textbook Chapter 9. Tsan-sheng Hsu. Optimization ASU Textbook Chapter 9 Tsan-sheng Hsu tshsu@iis.sinica.edu.tw http://www.iis.sinica.edu.tw/~tshsu 1 Introduction For some compiler, the intermediate code is a pseudo code of a virtual machine.

More information

Enhancing Energy Efficiency of Processor-Based Embedded Systems thorough Post-Fabrication ISA Extension

Enhancing Energy Efficiency of Processor-Based Embedded Systems thorough Post-Fabrication ISA Extension Enhancing Energy Efficiency of Processor-Based Embedded Systems thorough Post-Fabrication ISA Extension Hamid Noori, Farhad Mehdipour, Koji Inoue, and Kazuaki Murakami Institute of Systems, Information

More information

Register Allocation. CS 502 Lecture 14 11/25/08

Register Allocation. CS 502 Lecture 14 11/25/08 Register Allocation CS 502 Lecture 14 11/25/08 Where we are... Reasonably low-level intermediate representation: sequence of simple instructions followed by a transfer of control. a representation of static

More information

9/5/17. The Design and Implementation of Programming Languages. Compilation. Interpretation. Compilation vs. Interpretation. Hybrid Implementation

9/5/17. The Design and Implementation of Programming Languages. Compilation. Interpretation. Compilation vs. Interpretation. Hybrid Implementation Language Implementation Methods The Design and Implementation of Programming Languages Compilation Interpretation Hybrid In Text: Chapter 1 2 Compilation Interpretation Translate high-level programs to

More information

The analysis part breaks up the source program into constituent pieces and creates an intermediate representation of the source program.

The analysis part breaks up the source program into constituent pieces and creates an intermediate representation of the source program. COMPILER DESIGN 1. What is a compiler? A compiler is a program that reads a program written in one language the source language and translates it into an equivalent program in another language-the target

More information

Lecture 16 Pointer Analysis

Lecture 16 Pointer Analysis Pros and Cons of Pointers Lecture 16 Pointer Analysis Basics Design Options Pointer Analysis Algorithms Pointer Analysis Using BDDs Probabilistic Pointer Analysis Many procedural languages have pointers

More information

Running class Timing on Java HotSpot VM, 1

Running class Timing on Java HotSpot VM, 1 Compiler construction 2009 Lecture 3. A first look at optimization: Peephole optimization. A simple example A Java class public class A { public static int f (int x) { int r = 3; int s = r + 5; return

More information

Lecture 27. Pros and Cons of Pointers. Basics Design Options Pointer Analysis Algorithms Pointer Analysis Using BDDs Probabilistic Pointer Analysis

Lecture 27. Pros and Cons of Pointers. Basics Design Options Pointer Analysis Algorithms Pointer Analysis Using BDDs Probabilistic Pointer Analysis Pros and Cons of Pointers Lecture 27 Pointer Analysis Basics Design Options Pointer Analysis Algorithms Pointer Analysis Using BDDs Probabilistic Pointer Analysis Many procedural languages have pointers

More information

Lecture 20 Pointer Analysis

Lecture 20 Pointer Analysis Lecture 20 Pointer Analysis Basics Design Options Pointer Analysis Algorithms Pointer Analysis Using BDDs Probabilistic Pointer Analysis (Slide content courtesy of Greg Steffan, U. of Toronto) 15-745:

More information

IA-64 Compiler Technology

IA-64 Compiler Technology IA-64 Compiler Technology David Sehr, Jay Bharadwaj, Jim Pierce, Priti Shrivastav (speaker), Carole Dulong Microcomputer Software Lab Page-1 Introduction IA-32 compiler optimizations Profile Guidance (PGOPTI)

More information

FLORIDA STATE UNIVERSITY COLLEGE OF ARTS AND SCIENCES DEEP: DEPENDENCY ELIMINATION USING EARLY PREDICTIONS LUIS G. PENAGOS

FLORIDA STATE UNIVERSITY COLLEGE OF ARTS AND SCIENCES DEEP: DEPENDENCY ELIMINATION USING EARLY PREDICTIONS LUIS G. PENAGOS FLORIDA STATE UNIVERSITY COLLEGE OF ARTS AND SCIENCES DEEP: DEPENDENCY ELIMINATION USING EARLY PREDICTIONS By LUIS G. PENAGOS A Thesis submitted to the Department of Computer Science in partial fulfillment

More information

VIVA QUESTIONS WITH ANSWERS

VIVA QUESTIONS WITH ANSWERS VIVA QUESTIONS WITH ANSWERS 1. What is a compiler? A compiler is a program that reads a program written in one language the source language and translates it into an equivalent program in another language-the

More information

Chapter S:II. II. Search Space Representation

Chapter S:II. II. Search Space Representation Chapter S:II II. Search Space Representation Systematic Search Encoding of Problems State-Space Representation Problem-Reduction Representation Choosing a Representation S:II-1 Search Space Representation

More information

SYED AMMAL ENGINEERING COLLEGE (An ISO 9001:2008 Certified Institution) Dr. E.M. Abdullah Campus, Ramanathapuram

SYED AMMAL ENGINEERING COLLEGE (An ISO 9001:2008 Certified Institution) Dr. E.M. Abdullah Campus, Ramanathapuram CS6660 COMPILER DESIGN Question Bank UNIT I-INTRODUCTION TO COMPILERS 1. Define compiler. 2. Differentiate compiler and interpreter. 3. What is a language processing system? 4. List four software tools

More information

Network Design Considerations for Grid Computing

Network Design Considerations for Grid Computing Network Design Considerations for Grid Computing Engineering Systems How Bandwidth, Latency, and Packet Size Impact Grid Job Performance by Erik Burrows, Engineering Systems Analyst, Principal, Broadcom

More information

Instruction Set Principles and Examples. Appendix B

Instruction Set Principles and Examples. Appendix B Instruction Set Principles and Examples Appendix B Outline What is Instruction Set Architecture? Classifying ISA Elements of ISA Programming Registers Type and Size of Operands Addressing Modes Types of

More information

Energy Consumption Evaluation of an Adaptive Extensible Processor

Energy Consumption Evaluation of an Adaptive Extensible Processor Energy Consumption Evaluation of an Adaptive Extensible Processor Hamid Noori, Farhad Mehdipour, Maziar Goudarzi, Seiichiro Yamaguchi, Koji Inoue, and Kazuaki Murakami December 2007 Outline Introduction

More information

EXTRACTION OF RELEVANT WEB PAGES USING DATA MINING

EXTRACTION OF RELEVANT WEB PAGES USING DATA MINING Chapter 3 EXTRACTION OF RELEVANT WEB PAGES USING DATA MINING 3.1 INTRODUCTION Generally web pages are retrieved with the help of search engines which deploy crawlers for downloading purpose. Given a query,

More information

A Sparse Algorithm for Predicated Global Value Numbering

A Sparse Algorithm for Predicated Global Value Numbering Sparse Predicated Global Value Numbering A Sparse Algorithm for Predicated Global Value Numbering Karthik Gargi Hewlett-Packard India Software Operation PLDI 02 Monday 17 June 2002 1. Introduction 2. Brute

More information

Administration. Prerequisites. Meeting times. CS 380C: Advanced Topics in Compilers

Administration. Prerequisites. Meeting times. CS 380C: Advanced Topics in Compilers Administration CS 380C: Advanced Topics in Compilers Instructor: eshav Pingali Professor (CS, ICES) Office: POB 4.126A Email: pingali@cs.utexas.edu TA: TBD Graduate student (CS) Office: Email: Meeting

More information

Sireesha R Basavaraju Embedded Systems Group, Technical University of Kaiserslautern

Sireesha R Basavaraju Embedded Systems Group, Technical University of Kaiserslautern Sireesha R Basavaraju Embedded Systems Group, Technical University of Kaiserslautern Introduction WCET of program ILP Formulation Requirement SPM allocation for code SPM allocation for data Conclusion

More information

Local Optimization: Value Numbering The Desert Island Optimization. Comp 412 COMP 412 FALL Chapter 8 in EaC2e. target code

Local Optimization: Value Numbering The Desert Island Optimization. Comp 412 COMP 412 FALL Chapter 8 in EaC2e. target code COMP 412 FALL 2017 Local Optimization: Value Numbering The Desert Island Optimization Comp 412 source code IR Front End Optimizer Back End IR target code Copyright 2017, Keith D. Cooper & Linda Torczon,

More information

Administrivia. CS 133: Databases. Cost-based Query Sub-System. Goals for Today. Midterm on Thursday 10/18. Assignments

Administrivia. CS 133: Databases. Cost-based Query Sub-System. Goals for Today. Midterm on Thursday 10/18. Assignments Administrivia Midterm on Thursday 10/18 CS 133: Databases Fall 2018 Lec 12 10/16 Prof. Beth Trushkowsky Assignments Lab 3 starts after fall break No problem set out this week Goals for Today Cost-based

More information

What Do Compilers Do? How Can the Compiler Improve Performance? What Do We Mean By Optimization?

What Do Compilers Do? How Can the Compiler Improve Performance? What Do We Mean By Optimization? What Do Compilers Do? Lecture 1 Introduction I What would you get out of this course? II Structure of a Compiler III Optimization Example Reference: Muchnick 1.3-1.5 1. Translate one language into another

More information

Compiler construction 2009

Compiler construction 2009 Compiler construction 2009 Lecture 3 JVM and optimization. A first look at optimization: Peephole optimization. A simple example A Java class public class A { public static int f (int x) { int r = 3; int

More information

Register Allocation 1

Register Allocation 1 Register Allocation 1 Lecture Outline Memory Hierarchy Management Register Allocation Register interference graph Graph coloring heuristics Spilling Cache Management The Memory Hierarchy Registers < 1

More information

Machine-Independent Optimizations

Machine-Independent Optimizations Chapter 9 Machine-Independent Optimizations High-level language constructs can introduce substantial run-time overhead if we naively translate each construct independently into machine code. This chapter

More information

CSC D70: Compiler Optimization Register Allocation

CSC D70: Compiler Optimization Register Allocation CSC D70: Compiler Optimization Register Allocation Prof. Gennady Pekhimenko University of Toronto Winter 2018 The content of this lecture is adapted from the lectures of Todd Mowry and Phillip Gibbons

More information

Addressed Issue. P2P What are we looking at? What is Peer-to-Peer? What can databases do for P2P? What can databases do for P2P?

Addressed Issue. P2P What are we looking at? What is Peer-to-Peer? What can databases do for P2P? What can databases do for P2P? Peer-to-Peer Data Management - Part 1- Alex Coman acoman@cs.ualberta.ca Addressed Issue [1] Placement and retrieval of data [2] Server architectures for hybrid P2P [3] Improve search in pure P2P systems

More information

THE FLORIDA STATE UNIVERSITY COLLEGE OF ART & SCIENCES REDUCING THE WCET OF APPLICATIONS ON LOW END EMBEDDED SYSTEMS WANKANG ZHAO

THE FLORIDA STATE UNIVERSITY COLLEGE OF ART & SCIENCES REDUCING THE WCET OF APPLICATIONS ON LOW END EMBEDDED SYSTEMS WANKANG ZHAO THE FLORIDA STATE UNIVERSITY COLLEGE OF ART & SCIENCES REDUCING THE WCET OF APPLICATIONS ON LOW END EMBEDDED SYSTEMS By WANKANG ZHAO A Dissertation submitted to the Department of Computer Science in partial

More information

Outline. Register Allocation. Issues. Storing values between defs and uses. Issues. Issues P3 / 2006

Outline. Register Allocation. Issues. Storing values between defs and uses. Issues. Issues P3 / 2006 P3 / 2006 Register Allocation What is register allocation Spilling More Variations and Optimizations Kostis Sagonas 2 Spring 2006 Storing values between defs and uses Program computes with values value

More information

CSC D70: Compiler Optimization

CSC D70: Compiler Optimization CSC D70: Compiler Optimization Prof. Gennady Pekhimenko University of Toronto Winter 2018 The content of this lecture is adapted from the lectures of Todd Mowry and Phillip Gibbons CSC D70: Compiler Optimization

More information

Summary: Open Questions:

Summary: Open Questions: Summary: The paper proposes an new parallelization technique, which provides dynamic runtime parallelization of loops from binary single-thread programs with minimal architectural change. The realization

More information

Code optimization. Have we achieved optimal code? Impossible to answer! We make improvements to the code. Aim: faster code and/or less space

Code optimization. Have we achieved optimal code? Impossible to answer! We make improvements to the code. Aim: faster code and/or less space Code optimization Have we achieved optimal code? Impossible to answer! We make improvements to the code Aim: faster code and/or less space Types of optimization machine-independent In source code or internal

More information

Intermediate Representations

Intermediate Representations COMP 506 Rice University Spring 2018 Intermediate Representations source code IR Front End Optimizer Back End IR target code Copyright 2018, Keith D. Cooper & Linda Torczon, all rights reserved. Students

More information

High Level, high speed FPGA programming

High Level, high speed FPGA programming Opportunity: FPAs High Level, high speed FPA programming Wim Bohm, Bruce Draper, Ross Beveridge, Charlie Ross, Monica Chawathe Colorado State University Reconfigurable Computing technology High speed at

More information

Register Allocation. Lecture 38

Register Allocation. Lecture 38 Register Allocation Lecture 38 (from notes by G. Necula and R. Bodik) 4/27/08 Prof. Hilfinger CS164 Lecture 38 1 Lecture Outline Memory Hierarchy Management Register Allocation Register interference graph

More information

1 Motivation for Improving Matrix Multiplication

1 Motivation for Improving Matrix Multiplication CS170 Spring 2007 Lecture 7 Feb 6 1 Motivation for Improving Matrix Multiplication Now we will just consider the best way to implement the usual algorithm for matrix multiplication, the one that take 2n

More information

Static Analysis of Worst-Case Stack Cache Behavior

Static Analysis of Worst-Case Stack Cache Behavior Static Analysis of Worst-Case Stack Cache Behavior Florian Brandner Unité d Informatique et d Ing. des Systèmes ENSTA-ParisTech Alexander Jordan Embedded Systems Engineering Sect. Technical University

More information

Speculative Tag Access for Reduced Energy Dissipation in Set-Associative L1 Data Caches

Speculative Tag Access for Reduced Energy Dissipation in Set-Associative L1 Data Caches Speculative Tag Access for Reduced Energy Dissipation in Set-Associative L1 Data Caches Alen Bardizbanyan, Magnus Själander, David Whalley, and Per Larsson-Edefors Chalmers University of Technology, Gothenburg,

More information

Just-In-Time Compilers & Runtime Optimizers

Just-In-Time Compilers & Runtime Optimizers COMP 412 FALL 2017 Just-In-Time Compilers & Runtime Optimizers Comp 412 source code IR Front End Optimizer Back End IR target code Copyright 2017, Keith D. Cooper & Linda Torczon, all rights reserved.

More information

Combining Analyses, Combining Optimizations - Summary

Combining Analyses, Combining Optimizations - Summary Combining Analyses, Combining Optimizations - Summary 1. INTRODUCTION Cliff Click s thesis Combining Analysis, Combining Optimizations [Click and Cooper 1995] uses a structurally different intermediate

More information

Global Register Allocation - Part 2

Global Register Allocation - Part 2 Global Register Allocation - Part 2 Y N Srikant Computer Science and Automation Indian Institute of Science Bangalore 560012 NPTEL Course on Compiler Design Outline Issues in Global Register Allocation

More information

Hugh Leather, Edwin Bonilla, Michael O'Boyle

Hugh Leather, Edwin Bonilla, Michael O'Boyle Automatic Generation for Machine Learning Based Optimizing Compilation Hugh Leather, Edwin Bonilla, Michael O'Boyle Institute for Computing Systems Architecture University of Edinburgh, UK Overview Introduction

More information

Operating system integrated energy aware scratchpad allocation strategies for multiprocess applications

Operating system integrated energy aware scratchpad allocation strategies for multiprocess applications University of Dortmund Operating system integrated energy aware scratchpad allocation strategies for multiprocess applications Robert Pyka * Christoph Faßbach * Manish Verma + Heiko Falk * Peter Marwedel

More information

Compiler Optimization and Code Generation

Compiler Optimization and Code Generation Compiler Optimization and Code Generation Professor: Sc.D., Professor Vazgen Melikyan 1 Course Overview Introduction: Overview of Optimizations 1 lecture Intermediate-Code Generation 2 lectures Machine-Independent

More information

Advances in Compilers

Advances in Compilers Advances in Compilers Prof. (Dr.) K.R. Chowdhary, Director COE Email: kr.chowdhary@jietjodhpur.ac.in webpage: http://www.krchowdhary.com JIET College of Engineering August 4, 2017 kr chowdhary Compilers

More information

HW 2 is out! Due 9/25!

HW 2 is out! Due 9/25! HW 2 is out! Due 9/25! CS 6290 Static Exploitation of ILP Data-Dependence Stalls w/o OOO Single-Issue Pipeline When no bypassing exists Load-to-use Long-latency instructions Multiple-Issue (Superscalar),

More information

Register Allocation (via graph coloring) Lecture 25. CS 536 Spring

Register Allocation (via graph coloring) Lecture 25. CS 536 Spring Register Allocation (via graph coloring) Lecture 25 CS 536 Spring 2001 1 Lecture Outline Memory Hierarchy Management Register Allocation Register interference graph Graph coloring heuristics Spilling Cache

More information

An Evaluation of Autotuning Techniques for the Compiler Optimization Problems

An Evaluation of Autotuning Techniques for the Compiler Optimization Problems An Evaluation of Autotuning Techniques for the Compiler Optimization Problems Amir Hossein Ashouri, Gianluca Palermo and Cristina Silvano Politecnico di Milano, Milan, Italy {amirhossein.ashouri,ginaluca.palermo,cristina.silvano}@polimi.it

More information

Optimization Prof. James L. Frankel Harvard University

Optimization Prof. James L. Frankel Harvard University Optimization Prof. James L. Frankel Harvard University Version of 4:24 PM 1-May-2018 Copyright 2018, 2016, 2015 James L. Frankel. All rights reserved. Reasons to Optimize Reduce execution time Reduce memory

More information

Register Allocation. Lecture 16

Register Allocation. Lecture 16 Register Allocation Lecture 16 1 Register Allocation This is one of the most sophisticated things that compiler do to optimize performance Also illustrates many of the concepts we ve been discussing in

More information

Cross-Layer Memory Management to Reduce DRAM Power Consumption

Cross-Layer Memory Management to Reduce DRAM Power Consumption Cross-Layer Memory Management to Reduce DRAM Power Consumption Michael Jantz Assistant Professor University of Tennessee, Knoxville 1 Introduction Assistant Professor at UT since August 2014 Before UT

More information

Advanced Database Systems

Advanced Database Systems Lecture IV Query Processing Kyumars Sheykh Esmaili Basic Steps in Query Processing 2 Query Optimization Many equivalent execution plans Choosing the best one Based on Heuristics, Cost Will be discussed

More information

COMS W4115 Programming Languages and Translators Lecture 21: Code Optimization April 15, 2013

COMS W4115 Programming Languages and Translators Lecture 21: Code Optimization April 15, 2013 1 COMS W4115 Programming Languages and Translators Lecture 21: Code Optimization April 15, 2013 Lecture Outline 1. Code optimization strategies 2. Peephole optimization 3. Common subexpression elimination

More information

Announcements. My office hours are today in Gates 160 from 1PM-3PM. Programming Project 3 checkpoint due tomorrow night at 11:59PM.

Announcements. My office hours are today in Gates 160 from 1PM-3PM. Programming Project 3 checkpoint due tomorrow night at 11:59PM. IR Generation Announcements My office hours are today in Gates 160 from 1PM-3PM. Programming Project 3 checkpoint due tomorrow night at 11:59PM. This is a hard deadline and no late submissions will be

More information

Introduction to Machine-Independent Optimizations - 1

Introduction to Machine-Independent Optimizations - 1 Introduction to Machine-Independent Optimizations - 1 Department of Computer Science and Automation Indian Institute of Science Bangalore 560 012 NPTEL Course on Principles of Compiler Design Outline of

More information

Cache Aware Optimization of Stream Programs

Cache Aware Optimization of Stream Programs Cache Aware Optimization of Stream Programs Janis Sermulins, William Thies, Rodric Rabbah and Saman Amarasinghe LCTES Chicago, June 2005 Streaming Computing Is Everywhere! Prevalent computing domain with

More information

Life Cycle of Source Program - Compiler Design

Life Cycle of Source Program - Compiler Design Life Cycle of Source Program - Compiler Design Vishal Trivedi * Gandhinagar Institute of Technology, Gandhinagar, Gujarat, India E-mail: raja.vishaltrivedi@gmail.com Abstract: This Research paper gives

More information

Parallel Query Optimisation

Parallel Query Optimisation Parallel Query Optimisation Contents Objectives of parallel query optimisation Parallel query optimisation Two-Phase optimisation One-Phase optimisation Inter-operator parallelism oriented optimisation

More information

Old formulation of branch paths w/o prediction. bne $2,$3,foo subu $3,.. ld $2,..

Old formulation of branch paths w/o prediction. bne $2,$3,foo subu $3,.. ld $2,.. Old formulation of branch paths w/o prediction bne $2,$3,foo subu $3,.. ld $2,.. Cleaner formulation of Branching Front End Back End Instr Cache Guess PC Instr PC dequeue!= dcache arch. PC branch resolution

More information

Comp 204: Computer Systems and Their Implementation. Lecture 22: Code Generation and Optimisation

Comp 204: Computer Systems and Their Implementation. Lecture 22: Code Generation and Optimisation Comp 204: Computer Systems and Their Implementation Lecture 22: Code Generation and Optimisation 1 Today Code generation Three address code Code optimisation Techniques Classification of optimisations

More information

Under the Compiler's Hood: Supercharge Your PLAYSTATION 3 (PS3 ) Code. Understanding your compiler is the key to success in the gaming world.

Under the Compiler's Hood: Supercharge Your PLAYSTATION 3 (PS3 ) Code. Understanding your compiler is the key to success in the gaming world. Under the Compiler's Hood: Supercharge Your PLAYSTATION 3 (PS3 ) Code. Understanding your compiler is the key to success in the gaming world. Supercharge your PS3 game code Part 1: Compiler internals.

More information