FSU. Av oiding Unconditional Jumps by Code Replication. Frank Mueller and David B. Whalley. Florida State University DEPARTMENT OF COMPUTER SCIENCE
|
|
- Cordelia Charles
- 6 years ago
- Views:
Transcription
1 Av oiding Unconditional Jumps by Code Replication by Frank Mueller and David B. Whalley Florida State University
2 Overview uncond. jumps occur often in programs [4-0% dynamically] produced by loops, conditional statements, etc. can almost always be avoided technique to avoid uncond. jumps introduction evaluation 2
3 Motivation code generated includes uncond. jumps for while-loops and for-loops typically if-then-else construct always other language constructs (break, goto, continue in C) asaside-effect of optimizations uncond. jump instruction can be avoided when code is replicated from the target methods often employed for certain class of loops in frontend new method ispart of optimizations in back-end of compiler can be applied universally to all uncond. jumps may introduce sources for other optimizations 3
4 Example While-Loop (RTLs for 68020) without replication i=; while (x[i]) { x[i-] = x[i]; i++; } with replication a[0]=a[6]+x.+; NZ=B[a[6]+x.+]?0; a[]=a[0]; PC=NZ==0,L6; L5 a[0]=a[6]+x.; NZ=B[a[0]]?0; L000 PC=NZ==0,L6; B[a[0]]=B[a[0]+]; B[a[0]-]=B[a[]++]; a[0]=a[0]+; a[0]=a[0]+; NZ=B[a[0]+]?0; PC=L5; PC=NZ!=0,L000; L6... L6... 4
5 Example 2 For-Loop (RTLs for 68020) without replication for (i = k; i < 0; i++) x[i] = y[i]; with replication L9 L8 d[0]=l[a[6]+k.]; a[0]=d[0]+a[6]+x.; a[]=d[0]+a[6]+y.; PC=L8; B[a[0]++]=B[a[]++]; d[0]=d[0]+; L9 d[0]=l[a[6]+k.]; NZ=d[0]?0; PC=NZ>=0,L000; a[0]=d[0]+a[6]+x.; a[]=d[0]+a[6]+y.; B[a[0]++]=B[a[]++]; d[0]=d[0]+; NZ=d[0]?0; NZ=d[0]?0; PC=NZ<0,L9; PC=NZ<0,L9;... L
6 Example 3 Exit Condition in the Middle of a Loop (RTLs for 68020) i=; while (i++ < n) x[i-] = x[i]; without replication with replication L5 d[]=; a[0]=a[6]+x.; d[0]=; d[]=2; NZ=d[0]?L[_n]; PC=NZ>=0,L6; a[0]=a[6]+x.+; d[0]=d[]; a[0]=a[0]+; d[]=d[]+; L000 NZ=d[0]?L[_n]; B[a[0]]=B[a[0]+]; PC=NZ>=0,L6; a[0]=a[0]+; B[a[0]]=B[a[0]+]; d[0]=d[]; PC=L5; d[]=d[]+; L6... NZ=d[0]?L[_n] PC=NZ<0,L000; L6... 6
7 Example 4 If-Then-Else Statement (RTLs for 68020) without replication if (i > 5) i=i/n; else i=i*n; return(i); with replication L22 L23 NZ=L[a[6]+i.]?5; PC=NZ<=0,L22; d[0]=l[a[6]+i.]; d[0]=d[0]/l[a[6]+n.]; L[a[6]+i.]=d[0]; PC=L23; d[0]=l[a[6]+i.]; d[0]=d[0]*l[a[6]+n.]; L[a[6]+i.]=d[0]; a[6]=uk; PC=RT; L22 NZ=L[a[6]+i.]?5; PC=NZ<=0,L22; d[0]=l[a[6]+i.]; d[0]=d[0]/l[a[6]+n.]; L[a[6]+i.]=d[0]; a[6]=uk; PC=RT; d[0]=l[a[6]+i.]; d[0]=d[0]*l[a[6]+n.]; L[a[6]+i.]=d[0]; a[6]=uk; PC=RT; 7
8 Remote Preheader No Preheader Remote Preheader 8
9 Algorithm JUMPS. set up matrix (used to find shortest replication sequence) 2. traverse basic blocks until uncond. jump found 3. choose replication sequence (towards return or loop) 4. expand replication sequence to include all blocks within a loop 5. replicate code and adjust its control flow 6. adjust control flow ofportion of loops which was not copied 7. if control flow has become non-reducible then remove replicated code 9
10 Interference with Natural Loops Without Replication With Partial Replication With Loop Replication
11 Partial Overlapping of Natural Loops Initial Control Flow After Replication Adjusted Control Flow 2
12 Measurements for Motorola 68020/6888 and Sun SPARC test programs included benchmarks UNIX utilities applications instrumentation of programs at code generation time compiled with different sets of optimizations: SIMPLE: standard opt. LOOPS: standard opt. + code replication at loops JUMPS: standard opt. + generalized code repl. 3
13 Measurements (cont.) Number of Static and Dynamic Instructions (Sun SPARC) Sun SPARC program static instructions dynamic instructions executed SIMPLE LOOPS JUMPS SIMPLE LOOPS JUMPS cal % +2.89% 37, % -3.5% quicksort % +50.6% 836, % -4.2% wc % % 540, % -.96% grep % %,930, % -3.57% sort, % +89.7%,8, % -0.49% od, % +95.9% 2,336, % -0.22% mincost, % % 335, % -3.9% bubblesort % +5.4% 29,07, % -0.07% matmult % +3.67% 4,403, % -0.28% banner % % 2, % -0.25% sieve % +3.23% 2,84, % -3.73% compact, % +75.8% 3,409, % -4.86% queens % +7.89% 263, % -0.03% deroff 7, % % 448,58-0.0% -3.3% average, % % 4,784, % -5.7% 5
14 Measurements (cont.) Number of Static and Dynamic Instructions (Motorola 68020) Motorola program static instructions dynamic instructions executed SIMPLE LOOPS JUMPS SIMPLE LOOPS JUMPS cal % % 36, % -3.7% quicksort % % 536, % -3.96% wc % % 42, % -5.32% grep % %,309, % -3.44% sort, % % 902, % -2.43% od, % %,980, % -0.30% mincost % % 302, % -5.3% bubblesort % +2.92% 20,340, % -8.92% matmult % +3.42% 4,89, % -0.2% banner % % 2, % -3.34% sieve % +.43%,759, % -8.53% compact, % % 0,602, % -5.26% queens % +2.77% 89, % -0.05% deroff 5, % +55.7% 360, % -7.05% average % % 3,6, % -6.94% 6
15 Future Work handle indirect jumps in algorithm should improve dynamic savings may reduce size of generated code limit length of replication sequence, use depth-bound DFS should reduce size of generated code should improve compile-time overhead of optimization phase determine best phase ordering trade-off compile-time / exploit optimizations 8
16 Conclusions code replication avoids almost all uncond. jumps reduces number of executed instructions by 6% increases number of instructions between branches by.5 on SPARC results in 4% decreased cache work (except for small caches) increases code size by 53% outperforms traditional methods to avoid uncond. jumps should be applied in the back-end of highly optimizing compilers 9
17 Order of Optimizations branch chaining; dead code elimination; reorder basic blocks to minimize jumps; code replication (either JUMPS or LOOPS); dead code elimination; instruction selection; register assignment; if (change) instruction selection; do { register allocation by register coloring; instruction selection; common subexpression elimination; dead variable elimination; code motion; strength reduction; recurrences; instruction selection; branch chaining; constant folding at conditional branches; code replication (either JUMPS or LOOPS); dead code elimination; } while (change); filling of delay slots for RISCs; 20
Avoiding Unconditional Jumps by Code Replication. Frank Mueller and David B. Whalley. from the execution of a number of programs.
voiding Unconditional Jumps by Code Replication Frank Mueller and David B Whalley Department of Computer Science, B-73 Florida State University Tallahassee, Florida 32306-09 e-mail: whalley@csfsuedu bstract
More informationWhat Do Compilers Do? How Can the Compiler Improve Performance? What Do We Mean By Optimization?
What Do Compilers Do? Lecture 1 Introduction I What would you get out of this course? II Structure of a Compiler III Optimization Example Reference: Muchnick 1.3-1.5 1. Translate one language into another
More informationEvaluating Heuristic Optimization Phase Order Search Algorithms
Evaluating Heuristic Optimization Phase Order Search Algorithms by Prasad A. Kulkarni David B. Whalley Gary S. Tyson Jack W. Davidson Computer Science Department, Florida State University, Tallahassee,
More informationControl flow graphs and loop optimizations. Thursday, October 24, 13
Control flow graphs and loop optimizations Agenda Building control flow graphs Low level loop optimizations Code motion Strength reduction Unrolling High level loop optimizations Loop fusion Loop interchange
More informationFSU DEPARTMENT OF COMPUTER SCIENCE
mueller@cs.fsu.edu whalley@cs.fsu.edu of Computer Science Department State University Florida On Debugging RealTime Applications Frank Mueller and David Whalley Tallahassee, FL 323044019 email: On Debugging
More informationr[2] = M[x]; M[x] = r[2]; r[2] = M[x]; M[x] = r[2];
Using a Swap Instruction to Coalesce Loads and Stores Apan Qasem, David Whalley, Xin Yuan, and Robert van Engelen Department of Computer Science, Florida State University Tallahassee, FL 32306-4530, U.S.A.
More informationCompiler Optimization
Compiler Optimization The compiler translates programs written in a high-level language to assembly language code Assembly language code is translated to object code by an assembler Object code modules
More informationLecture Notes on Loop Optimizations
Lecture Notes on Loop Optimizations 15-411: Compiler Design Frank Pfenning Lecture 17 October 22, 2013 1 Introduction Optimizing loops is particularly important in compilation, since loops (and in particular
More informationCSC D70: Compiler Optimization
CSC D70: Compiler Optimization Prof. Gennady Pekhimenko University of Toronto Winter 2018 The content of this lecture is adapted from the lectures of Todd Mowry and Phillip Gibbons CSC D70: Compiler Optimization
More informationCompiler Optimizations. Chapter 8, Section 8.5 Chapter 9, Section 9.1.7
Compiler Optimizations Chapter 8, Section 8.5 Chapter 9, Section 9.1.7 2 Local vs. Global Optimizations Local: inside a single basic block Simple forms of common subexpression elimination, dead code elimination,
More informationTour of common optimizations
Tour of common optimizations Simple example foo(z) { x := 3 + 6; y := x 5 return z * y } Simple example foo(z) { x := 3 + 6; y := x 5; return z * y } x:=9; Applying Constant Folding Simple example foo(z)
More informationLecture 4: Instruction Set Design/Pipelining
Lecture 4: Instruction Set Design/Pipelining Instruction set design (Sections 2.9-2.12) control instructions instruction encoding Basic pipelining implementation (Section A.1) 1 Control Transfer Instructions
More informationCompiler Optimizations. Chapter 8, Section 8.5 Chapter 9, Section 9.1.7
Compiler Optimizations Chapter 8, Section 8.5 Chapter 9, Section 9.1.7 2 Local vs. Global Optimizations Local: inside a single basic block Simple forms of common subexpression elimination, dead code elimination,
More informationStatic Single Assignment Form in the COINS Compiler Infrastructure
Static Single Assignment Form in the COINS Compiler Infrastructure Masataka Sassa (Tokyo Institute of Technology) Background Static single assignment (SSA) form facilitates compiler optimizations. Compiler
More informationIn Search of Near-Optimal Optimization Phase Orderings
In Search of Near-Optimal Optimization Phase Orderings Prasad A. Kulkarni David B. Whalley Gary S. Tyson Jack W. Davidson Languages, Compilers, and Tools for Embedded Systems Optimization Phase Ordering
More informationLanguages and Compiler Design II IR Code Optimization
Languages and Compiler Design II IR Code Optimization Material provided by Prof. Jingke Li Stolen with pride and modified by Herb Mayer PSU Spring 2010 rev.: 4/16/2010 PSU CS322 HM 1 Agenda IR Optimization
More informationOptimization Prof. James L. Frankel Harvard University
Optimization Prof. James L. Frankel Harvard University Version of 4:24 PM 1-May-2018 Copyright 2018, 2016, 2015 James L. Frankel. All rights reserved. Reasons to Optimize Reduce execution time Reduce memory
More informationCOP5621 Exam 4 - Spring 2005
COP5621 Exam 4 - Spring 2005 Name: (Please print) Put the answers on these sheets. Use additional sheets when necessary. Show how you derived your answer when applicable (this is required for full credit
More informationIntermediate Code & Local Optimizations
Lecture Outline Intermediate Code & Local Optimizations Intermediate code Local optimizations Compiler Design I (2011) 2 Code Generation Summary We have so far discussed Runtime organization Simple stack
More informationFSU DEPARTMENT OF COMPUTER SCIENCE
mueller@cs.fsu.edu whalley@cs.fsu.edu of Computer Science Department State University Florida Predicting Instruction Cache Behavior Frank Mueller () David Whalley () Marion Harmon(FAMU) Tallahassee, FL
More informationfor (sp = line; *sp; sp++) { for (sp = line; ; sp++) { switch (*sp) { switch (*sp) { case p :... case k :...
Eectively Exploiting Indirect Jumps GANG-RYUNG UH Lucent Technologies, Allentown, PA 18103 tel: 610-712-2447, email:uh@lucent.com and DAVID B. WHALLEY Department of Computer Science, Florida State University,
More informationGoals of Program Optimization (1 of 2)
Goals of Program Optimization (1 of 2) Goal: Improve program performance within some constraints Ask Three Key Questions for Every Optimization 1. Is it legal? 2. Is it profitable? 3. Is it compile-time
More informationCODE GENERATION Monday, May 31, 2010
CODE GENERATION memory management returned value actual parameters commonly placed in registers (when possible) optional control link optional access link saved machine status local data temporaries A.R.
More informationMethods for Saving and Restoring Register Values across Function Calls
Methods for Saving and Restoring Register Values across Function Calls JACK W DAVIDSON AND DAVID B WHALLEY Department of Computer Science, University of Virginia, Charlottesville, VA 22903, USA SUMMARY
More informationECE 486/586. Computer Architecture. Lecture # 7
ECE 486/586 Computer Architecture Lecture # 7 Spring 2015 Portland State University Lecture Topics Instruction Set Principles Instruction Encoding Role of Compilers The MIPS Architecture Reference: Appendix
More informationLoops. Lather, Rinse, Repeat. CS4410: Spring 2013
Loops or Lather, Rinse, Repeat CS4410: Spring 2013 Program Loops Reading: Appel Ch. 18 Loop = a computation repeatedly executed until a terminating condition is reached High-level loop constructs: While
More informationLecture 3 Local Optimizations, Intro to SSA
Lecture 3 Local Optimizations, Intro to SSA I. Basic blocks & Flow graphs II. Abstraction 1: DAG III. Abstraction 2: Value numbering IV. Intro to SSA ALSU 8.4-8.5, 6.2.4 Phillip B. Gibbons 15-745: Local
More informationOptimization. ASU Textbook Chapter 9. Tsan-sheng Hsu.
Optimization ASU Textbook Chapter 9 Tsan-sheng Hsu tshsu@iis.sinica.edu.tw http://www.iis.sinica.edu.tw/~tshsu 1 Introduction For some compiler, the intermediate code is a pseudo code of a virtual machine.
More informationDecreasing Process Memory Requirements by Overlapping Program Portions
Decreasing Process Memory Requirements by Overlapping Program Portions Richard L. Bowman Emily J. Ratliff David B. Whalley Harris, Melbourne IBM, Fort Lauderdale Florida State Univ., CS Dept bowman@harris.com
More information4.1 Interval Scheduling
41 Interval Scheduling Interval Scheduling Interval scheduling Job j starts at s j and finishes at f j Two jobs compatible if they don't overlap Goal: find maximum subset of mutually compatible jobs a
More informationCompiler and Architectural Support for. Jerey S. Snyder, David B. Whalley, and Theodore P. Baker. Department of Computer Science
Fast Context Switches: Compiler and Architectural Support for Preemptive Scheduling Jerey S. Snyder, David B. Whalley, and Theodore P. Baker Department of Computer Science Florida State University, Tallahassee,
More informationMore Code Generation and Optimization. Pat Morin COMP 3002
More Code Generation and Optimization Pat Morin COMP 3002 Outline DAG representation of basic blocks Peephole optimization Register allocation by graph coloring 2 Basic Blocks as DAGs 3 Basic Blocks as
More informationControl Flow Analysis & Def-Use. Hwansoo Han
Control Flow Analysis & Def-Use Hwansoo Han Control Flow Graph What is CFG? Represents program structure for internal use of compilers Used in various program analyses Generated from AST or a sequential
More informationCMSC 611: Advanced Computer Architecture
CMSC 611: Advanced Computer Architecture Compilers Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier Science
More informationUSC 227 Office hours: 3-4 Monday and Wednesday CS553 Lecture 1 Introduction 4
CS553 Compiler Construction Instructor: URL: Michelle Strout mstrout@cs.colostate.edu USC 227 Office hours: 3-4 Monday and Wednesday http://www.cs.colostate.edu/~cs553 CS553 Lecture 1 Introduction 3 Plan
More informationLecture 8: Induction Variable Optimizations
Lecture 8: Induction Variable Optimizations I. Finding loops II. III. Overview of Induction Variable Optimizations Further details ALSU 9.1.8, 9.6, 9.8.1 Phillip B. Gibbons 15-745: Induction Variables
More informationFinal Exam Review Questions
EECS 665 Final Exam Review Questions 1. Give the three-address code (like the quadruples in the csem assignment) that could be emitted to translate the following assignment statement. However, you may
More informationPipelining and Vector Processing
Chapter 8 Pipelining and Vector Processing 8 1 If the pipeline stages are heterogeneous, the slowest stage determines the flow rate of the entire pipeline. This leads to other stages idling. 8 2 Pipeline
More informationWhat Compilers Can and Cannot Do. Saman Amarasinghe Fall 2009
What Compilers Can and Cannot Do Saman Amarasinghe Fall 009 Optimization Continuum Many examples across the compilation pipeline Static Dynamic Program Compiler Linker Loader Runtime System Optimization
More informationSoftware Exorcism: A Handbook for Debugging and Optimizing Legacy Code
Software Exorcism: A Handbook for Debugging and Optimizing Legacy Code BILL BLUNDEN Apress About the Author Acknowledgments Introduction xi xiii xv Chapter 1 Preventative Medicine 1 1.1 Core Problems 2
More informationHY425 Lecture 09: Software to exploit ILP
HY425 Lecture 09: Software to exploit ILP Dimitrios S. Nikolopoulos University of Crete and FORTH-ICS November 4, 2010 ILP techniques Hardware Dimitrios S. Nikolopoulos HY425 Lecture 09: Software to exploit
More informationOptimizations. Optimization Safety. Optimization Safety CS412/CS413. Introduction to Compilers Tim Teitelbaum
Optimizations CS412/CS413 Introduction to Compilers im eitelbaum Lecture 24: s 24 Mar 08 Code transformations to improve program Mainly: improve execution time Also: reduce program size Can be done at
More informationAggressive Loop Unrolling in a Retargetable, Optimizing Compiler JACK W. DAVIDSON and SAN JAY JINTURKAR. Abstract
Aggressive Loop Unrolling in a Retargetable, Optimizing Compiler JACK W. DAVIDSON and SAN JAY JINTURKAR {jwd, sj 3e} Ovirginia. edu Department of Computer Science, Thornton Hall University of Virginia,
More informationHY425 Lecture 09: Software to exploit ILP
HY425 Lecture 09: Software to exploit ILP Dimitrios S. Nikolopoulos University of Crete and FORTH-ICS November 4, 2010 Dimitrios S. Nikolopoulos HY425 Lecture 09: Software to exploit ILP 1 / 44 ILP techniques
More informationCompiler Theory. (Intermediate Code Generation Abstract S yntax + 3 Address Code)
Compiler Theory (Intermediate Code Generation Abstract S yntax + 3 Address Code) 006 Why intermediate code? Details of the source language are confined to the frontend (analysis phase) of a compiler, while
More informationPipelining, Branch Prediction, Trends
Pipelining, Branch Prediction, Trends 10.1-10.4 Topics 10.1 Quantitative Analyses of Program Execution 10.2 From CISC to RISC 10.3 Pipelining the Datapath Branch Prediction, Delay Slots 10.4 Overlapping
More informationIntermediate Representations. Reading & Topics. Intermediate Representations CS2210
Intermediate Representations CS2210 Lecture 11 Reading & Topics Muchnick: chapter 6 Topics today: Intermediate representations Automatic code generation with pattern matching Optimization Overview Control
More informationMachine-Independent Optimizations
Chapter 9 Machine-Independent Optimizations High-level language constructs can introduce substantial run-time overhead if we naively translate each construct independently into machine code. This chapter
More informationInduction Variable Identification (cont)
Loop Invariant Code Motion Last Time Uses of SSA: reaching constants, dead-code elimination, induction variable identification Today Finish up induction variable identification Loop invariant code motion
More informationAvoiding Conditional Branches by Code Replication
Avoiding Conditional Branches by Code Replication FRANK MUELLER DAVID B. WHALLEY Fachbereich Informatik Department of Computer Science Humboldt-Universitȧ. tzuberlin Florida State University Unter den
More informationCompiler Optimization and Code Generation
Compiler Optimization and Code Generation Professor: Sc.D., Professor Vazgen Melikyan 1 Course Overview Introduction: Overview of Optimizations 1 lecture Intermediate-Code Generation 2 lectures Machine-Independent
More informationMulti-dimensional Arrays
Multi-dimensional Arrays IL for Arrays & Local Optimizations Lecture 26 (Adapted from notes by R. Bodik and G. Necula) A 2D array is a 1D array of 1D arrays Java uses arrays of pointers to arrays for >1D
More informationCIS 1.5 Course Objectives. a. Understand the concept of a program (i.e., a computer following a series of instructions)
By the end of this course, students should CIS 1.5 Course Objectives a. Understand the concept of a program (i.e., a computer following a series of instructions) b. Understand the concept of a variable
More informationARSITEKTUR SISTEM KOMPUTER. Wayan Suparta, PhD 17 April 2018
ARSITEKTUR SISTEM KOMPUTER Wayan Suparta, PhD https://wayansuparta.wordpress.com/ 17 April 2018 Reduced Instruction Set Computers (RISC) CISC Complex Instruction Set Computer RISC Reduced Instruction Set
More informationCS553 Lecture Profile-Guided Optimizations 3
Profile-Guided Optimizations Last time Instruction scheduling Register renaming alanced Load Scheduling Loop unrolling Software pipelining Today More instruction scheduling Profiling Trace scheduling CS553
More informationCode Generation. Frédéric Haziza Spring Department of Computer Systems Uppsala University
Code Generation Frédéric Haziza Department of Computer Systems Uppsala University Spring 2008 Operating Systems Process Management Memory Management Storage Management Compilers Compiling
More informationLiveness Analysis and Register Allocation. Xiao Jia May 3 rd, 2013
Liveness Analysis and Register Allocation Xiao Jia May 3 rd, 2013 1 Outline Control flow graph Liveness analysis Graph coloring Linear scan 2 Basic Block The code in a basic block has: one entry point,
More informationImplementation of Additional Optimizations for VPO on the Power Platform
Implementation of Additional Optimizations for VPO on the Power Platform Chungsoo Lim, Gary A. Smith, Won So {clim, gasmith, wso}@ncsu.edu http://www4.ncsu.edu/~gasmith/csc791a_proj Spring 2005 CSC791A
More informationSoot A Java Bytecode Optimization Framework. Sable Research Group School of Computer Science McGill University
Soot A Java Bytecode Optimization Framework Sable Research Group School of Computer Science McGill University Goal Provide a Java framework for optimizing and annotating bytecode provide a set of API s
More informationCompiler Passes. Optimization. The Role of the Optimizer. Optimizations. The Optimizer (or Middle End) Traditional Three-pass Compiler
Compiler Passes Analysis of input program (front-end) character stream Lexical Analysis Synthesis of output program (back-end) Intermediate Code Generation Optimization Before and after generating machine
More informationAn Overview of GCC Architecture (source: wikipedia) Control-Flow Analysis and Loop Detection
An Overview of GCC Architecture (source: wikipedia) CS553 Lecture Control-Flow, Dominators, Loop Detection, and SSA Control-Flow Analysis and Loop Detection Last time Lattice-theoretic framework for data-flow
More informationCode optimization. Have we achieved optimal code? Impossible to answer! We make improvements to the code. Aim: faster code and/or less space
Code optimization Have we achieved optimal code? Impossible to answer! We make improvements to the code Aim: faster code and/or less space Types of optimization machine-independent In source code or internal
More informationRedundant Computation Elimination Optimizations. Redundancy Elimination. Value Numbering CS2210
Redundant Computation Elimination Optimizations CS2210 Lecture 20 Redundancy Elimination Several categories: Value Numbering local & global Common subexpression elimination (CSE) local & global Loop-invariant
More information5008: Computer Architecture HW#2
5008: Computer Architecture HW#2 1. We will now support for register-memory ALU operations to the classic five-stage RISC pipeline. To offset this increase in complexity, all memory addressing will be
More informationINSTITUTE OF AERONAUTICAL ENGINEERING
INSTITUTE OF AERONAUTICAL ENGINEERING (Autonomous) Dundigal, Hyderabad -500 043 INFORMATION TECHNOLOGY TUTORIAL QUESTION BANK Course Name : DESIGN AND ANALYSIS OF ALGORITHMS Course Code : AIT001 Class
More informationBentley Rules for Optimizing Work
6.172 Performance Engineering of Software Systems SPEED LIMIT PER ORDER OF 6.172 LECTURE 2 Bentley Rules for Optimizing Work Charles E. Leiserson September 11, 2012 2012 Charles E. Leiserson and I-Ting
More informationReducing the Cost of Branches by Using Registersf-
Reducing the Cost of Branches by Using Registersf- JACK W. DAVIDSON AND DAVID B. WHALLEY Department of Computer Science University of Virginia Charlottesville, VA 22903, U. S. A. ABSTRACT In an attempt
More informationUsing a Swap Instruction to Coalesce Loads and Stores
Using a Swap Instruction to Coalesce Loads and Stores Apan Qasem, David Whalley, Xin Yuan, and Robert van Engelen Department of Computer Science, Florida State University Tallahassee, FL 32306-4530, U.S.A.
More informationLoop Invariant Code Motion. Background: ud- and du-chains. Upward Exposed Uses. Identifying Loop Invariant Code. Last Time Control flow analysis
Loop Invariant Code Motion Loop Invariant Code Motion Background: ud- and du-chains Last Time Control flow analysis Today Loop invariant code motion ud-chains A ud-chain connects a use of a variable to
More informationUsing Cache Line Coloring to Perform Aggressive Procedure Inlining
Using Cache Line Coloring to Perform Aggressive Procedure Inlining Hakan Aydın David Kaeli Department of Electrical and Computer Engineering Northeastern University Boston, MA, 02115 {haydin,kaeli}@ece.neu.edu
More informationCOMS W4115 Programming Languages and Translators Lecture 21: Code Optimization April 15, 2013
1 COMS W4115 Programming Languages and Translators Lecture 21: Code Optimization April 15, 2013 Lecture Outline 1. Code optimization strategies 2. Peephole optimization 3. Common subexpression elimination
More informationPrinciples of Compiler Design
Principles of Compiler Design Code Generation Compiler Lexical Analysis Syntax Analysis Semantic Analysis Source Program Token stream Abstract Syntax tree Intermediate Code Code Generation Target Program
More informationCSE443 Compilers. Dr. Carl Alphonce 343 Davis Hall
CSE443 Compilers Dr. Carl Alphonce alphonce@buffalo.edu 343 Davis Hall http://www.cse.buffalo.edu/faculty/alphonce/sp17/cse443/index.php https://piazza.com/class/iybn4ndqa1s3ei Announcements Grading survey
More informationIntermediate representation
Intermediate representation Goals: encode knowledge about the program facilitate analysis facilitate retargeting facilitate optimization scanning parsing HIR semantic analysis HIR intermediate code gen.
More informationIntermediate Representations Part II
Intermediate Representations Part II Types of Intermediate Representations Three major categories Structural Linear Hybrid Directed Acyclic Graph A directed acyclic graph (DAG) is an AST with a unique
More informationEECS 583 Class 3 More on loops, Region Formation
EECS 583 Class 3 More on loops, Region Formation University of Michigan September 19, 2016 Announcements & Reading Material HW1 is out Get busy on it!» Course servers are ready to go Today s class» Trace
More informationOffice Hours: Mon/Wed 3:30-4:30 GDC Office Hours: Tue 3:30-4:30 Thu 3:30-4:30 GDC 5.
CS380C Compilers Instructor: TA: lin@cs.utexas.edu Office Hours: Mon/Wed 3:30-4:30 GDC 5.512 Jia Chen jchen@cs.utexas.edu Office Hours: Tue 3:30-4:30 Thu 3:30-4:30 GDC 5.440 January 21, 2015 Introduction
More informationMemory Bandwidth Optimizations for Wide-Bus Machines
Reprinted from Proceedings of the Hawaii International Conference on System Sciences, January 1993 Memory Bandwidth Optimizations for Wide-Bus Machines Michael A. Alexander, Mark W. Bailey, Bruce R. Childers,
More informationIntermediate Code Generation
Intermediate Code Generation In the analysis-synthesis model of a compiler, the front end analyzes a source program and creates an intermediate representation, from which the back end generates target
More informationWilliam Stallings Computer Organization and Architecture. Chapter 12 Reduced Instruction Set Computers
William Stallings Computer Organization and Architecture Chapter 12 Reduced Instruction Set Computers Major Advances in Computers(1) The family concept IBM System/360 1964 DEC PDP-8 Separates architecture
More informationSireesha R Basavaraju Embedded Systems Group, Technical University of Kaiserslautern
Sireesha R Basavaraju Embedded Systems Group, Technical University of Kaiserslautern Introduction WCET of program ILP Formulation Requirement SPM allocation for code SPM allocation for data Conclusion
More informationTopic 9: Control Flow
Topic 9: Control Flow COS 320 Compiling Techniques Princeton University Spring 2016 Lennart Beringer 1 The Front End The Back End (Intel-HP codename for Itanium ; uses compiler to identify parallelism)
More informationModel-based Software Development
Model-based Software Development 1 SCADE Suite Application Model in SCADE (data flow + SSM) System Model (tasks, interrupts, buses, ) SymTA/S Generator System-level Schedulability Analysis Astrée ait StackAnalyzer
More informationIntermediate Code & Local Optimizations. Lecture 20
Intermediate Code & Local Optimizations Lecture 20 Lecture Outline Intermediate code Local optimizations Next time: global optimizations 2 Code Generation Summary We have discussed Runtime organization
More informationFast Searches for Effective Optimization Phase Sequences
Fast Searches for Effective Optimization Phase Sequences Prasad Kulkarni, Stephen Hines, Jason Hiser David Whalley, Jack Davidson, Douglas Jones Computer Science Department, Florida State University, Tallahassee,
More informationTopic I (d): Static Single Assignment Form (SSA)
Topic I (d): Static Single Assignment Form (SSA) 621-10F/Topic-1d-SSA 1 Reading List Slides: Topic Ix Other readings as assigned in class 621-10F/Topic-1d-SSA 2 ABET Outcome Ability to apply knowledge
More informationWhy Global Dataflow Analysis?
Why Global Dataflow Analysis? Answer key questions at compile-time about the flow of values and other program properties over control-flow paths Compiler fundamentals What defs. of x reach a given use
More informationCalvin Lin The University of Texas at Austin
Loop Invariant Code Motion Last Time SSA Today Loop invariant code motion Reuse optimization Next Time More reuse optimization Common subexpression elimination Partial redundancy elimination February 23,
More informationInstruction-Level Parallelism. Instruction Level Parallelism (ILP)
Instruction-Level Parallelism CS448 1 Pipelining Instruction Level Parallelism (ILP) Limited form of ILP Overlapping instructions, these instructions can be evaluated in parallel (to some degree) Pipeline
More informationCompiler Optimization Techniques
Compiler Optimization Techniques Department of Computer Science, Faculty of ICT February 5, 2014 Introduction Code optimisations usually involve the replacement (transformation) of code from one sequence
More informationLecture 1 Introduc-on
Lecture 1 Introduc-on What would you get out of this course? Structure of a Compiler Op9miza9on Example 15-745: Introduc9on 1 What Do Compilers Do? 1. Translate one language into another e.g., convert
More informationCompiler Design. Fall Control-Flow Analysis. Prof. Pedro C. Diniz
Compiler Design Fall 2015 Control-Flow Analysis Sample Exercises and Solutions Prof. Pedro C. Diniz USC / Information Sciences Institute 4676 Admiralty Way, Suite 1001 Marina del Rey, California 90292
More informationOperating system integrated energy aware scratchpad allocation strategies for multiprocess applications
University of Dortmund Operating system integrated energy aware scratchpad allocation strategies for multiprocess applications Robert Pyka * Christoph Faßbach * Manish Verma + Heiko Falk * Peter Marwedel
More informationA main goal is to achieve a better performance. Code Optimization. Chapter 9
1 A main goal is to achieve a better performance Code Optimization Chapter 9 2 A main goal is to achieve a better performance source Code Front End Intermediate Code Code Gen target Code user Machineindependent
More informationTuning Programs For Performance
Tuning Programs For Performance Don t tune your program unless there is a performance issue Use performance instrumentation tools: prof: performance profiler based on sampling gprof: extension to prof
More informationECE902 Virtual Machine Final Project: MIPS to CRAY-2 Binary Translation
ECE902 Virtual Machine Final Project: MIPS to CRAY-2 Binary Translation Weiping Liao, Saengrawee (Anne) Pratoomtong, and Chuan Zhang Abstract Binary translation is an important component for translating
More informationOther Forms of Intermediate Code. Local Optimizations. Lecture 34
Other Forms of Intermediate Code. Local Optimizations Lecture 34 (Adapted from notes by R. Bodik and G. Necula) 4/18/08 Prof. Hilfinger CS 164 Lecture 34 1 Administrative HW #5 is now on-line. Due next
More informationAdministrative. Other Forms of Intermediate Code. Local Optimizations. Lecture 34. Code Generation Summary. Why Intermediate Languages?
Administrative Other Forms of Intermediate Code. Local Optimizations HW #5 is now on-line. Due next Friday. If your test grade is not glookupable, please tell us. Please submit test regrading pleas to
More informationContents. Preface xvii Acknowledgments. CHAPTER 1 Introduction to Parallel Computing 1. CHAPTER 2 Parallel Programming Platforms 11
Preface xvii Acknowledgments xix CHAPTER 1 Introduction to Parallel Computing 1 1.1 Motivating Parallelism 2 1.1.1 The Computational Power Argument from Transistors to FLOPS 2 1.1.2 The Memory/Disk Speed
More informationCISC / RISC. Complex / Reduced Instruction Set Computers
Systems Architecture CISC / RISC Complex / Reduced Instruction Set Computers CISC / RISC p. 1/12 Instruction Usage Instruction Group Average Usage 1 Data Movement 45.28% 2 Flow Control 28.73% 3 Arithmetic
More information