Code Analysis and Optimization. Pat Morin COMP 3002

Outline
  Basic blocks and flow graphs
  Local register allocation
  Global register allocation
  Selected optimization topics

The Big Picture
By now, we know enough to compile a programming language into machine code.
But the machine code isn't terribly efficient.
[Figure: compiler pipeline with stages labeled syntactic analyzer, interm. rep., code generator, machine indep. opts., interm. rep., basic code gen., machine dep. opts.]

Today's Lecture
We will look at different kinds of optimizations a compiler can perform.
Different optimizations apply to different architectures or at different times:
  Virtual stack machines
  3-address instructions
  Register-based machines

Basic Blocks
A basic block is a block of (machine or intermediate) code that always runs straight through without interruption.
A block head is
  the target of a (conditional or unconditional) jump, or
  the code immediately after a jump or function call, or
  the first line of code in a function.
A basic block starts at a block head and continues to the next block head (or the end of the code/function).

Basic Block Example
    getstatic java/lang/System/out Ljava/io/PrintStream;
    iload 0
    ifeq false_label
    ldc "true"
    goto print_it
    false_label:
    ldc "false"
    print_it:
    invokevirtual java/io/PrintStream/println(Ljava/lang/String;)V
    return
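The block-head rule can be sketched in a few lines of Python. This is only an illustration, not the course's implementation: the tuple format `(label, opcode, operand)`, the opcode sets `JUMPS` and `CALLS`, and the function names are all assumptions made for the example, and operands are abbreviated.

```python
# Hypothetical instruction format: (label_or_None, opcode, operand_or_None).
JUMPS = {"goto", "ifeq", "ifge"}            # assumed jump opcodes; operand is the target label
CALLS = {"invokestatic", "invokevirtual"}   # assumed call opcodes

def find_block_heads(code):
    """Return the sorted indices of the instructions that start a basic block."""
    label_at = {label: i for i, (label, _, _) in enumerate(code) if label}
    heads = {0} if code else set()           # first line of code in the function
    for i, (_, op, arg) in enumerate(code):
        if op in JUMPS:
            heads.add(label_at[arg])         # target of a jump
        if (op in JUMPS or op in CALLS) and i + 1 < len(code):
            heads.add(i + 1)                 # code immediately after a jump or call
    return sorted(heads)

def split_into_blocks(code):
    """Cut the code into basic blocks, each running from one head to the next."""
    bounds = find_block_heads(code) + [len(code)]
    return [code[a:b] for a, b in zip(bounds, bounds[1:])]

# The print-true-or-false example, with operands shortened.
example = [
    (None, "getstatic", "java/lang/System/out"),
    (None, "iload", "0"),
    (None, "ifeq", "false_label"),
    (None, "ldc", '"true"'),
    (None, "goto", "print_it"),
    ("false_label", "ldc", '"false"'),
    ("print_it", "invokevirtual", "println"),
    (None, "return", None),
]
blocks = split_into_blocks(example)
```

Because the rule treats the code after a function call as a block head, the final `return` ends up in a block of its own.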

Basic Block Example
Identify the basic blocks in the following:
    ldc 0.0
    fstore 1
    start:
    fload 1    ; load i
    fload 0    ; load n
    fcmpl
    ifge done
    fload 1
    invokestatic SimpleTest/printFloat(F)V
    fload 1
    ldc 1.0
    fadd
    fstore 1
    goto start
    done:
    return

Why Basic Blocks?
Because basic blocks always run straight through, without interruption, we are free to modify a lot of the code within a basic block.
If a variable is set within a basic block, then we know the value of that variable for the remainder of the block (or until it is set again).

Transformations on Basic Blocks
Common subexpression elimination
Works because we know the values of all variables that have been set within that block
    a := b+c
    b := a-d
    c := b+c
    d := a-d    ; replace with d := b
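One way to detect such repeats is to remember which expressions are still available, and to version every variable so that an assignment invalidates older expressions that used it. The sketch below is a minimal illustration under assumed conventions: statements are tuples `(dest, left, op, right)`, a copy is written `(dest, src, None, None)`, and the function name is made up for the example.

```python
def eliminate_common_subexpressions(block):
    """Replace a recomputed expression with a copy from the variable that holds it."""
    version = {}     # variable -> number of assignments seen so far
    available = {}   # (left, left_ver, op, right, right_ver) -> (holder, holder_ver)
    out = []
    for dest, left, op, right in block:
        key = (left, version.get(left, 0), op, right, version.get(right, 0))
        hit = available.get(key)
        # The holder is only usable if it has not been reassigned since.
        if hit is not None and version.get(hit[0], 0) == hit[1]:
            out.append((dest, hit[0], None, None))
            reused = True
        else:
            out.append((dest, left, op, right))
            reused = False
        version[dest] = version.get(dest, 0) + 1
        if not reused:
            available[key] = (dest, version[dest])
    return out

block = [
    ("a", "b", "+", "c"),
    ("b", "a", "-", "d"),
    ("c", "b", "+", "c"),
    ("d", "a", "-", "d"),
]
optimized = eliminate_common_subexpressions(block)
```

On the four statements above, the third `b+c` is not treated as a repeat of the first because `b` has been reassigned in between, while the second `a-d` is replaced by a copy from `b`.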

Transformations on Basic Blocks
Useless code elimination
We can determine that some statements have no effect outside the basic block and can be eliminated
    iload 0
    ldc 1
    iadd
    istore 0
    iload 0    ; eliminate
    pop        ; eliminate
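A very small peephole pass covers the case shown here: a side-effect-free push that is immediately popped. The sketch assumes instructions are tuples whose first element is the opcode, and the set `PURE_PUSHES` is an assumption chosen for this example, not a complete list.

```python
# Assumed: opcodes that push exactly one value and have no other effect.
PURE_PUSHES = {"iload", "fload", "ldc"}

def remove_push_pop(code):
    """Drop every pure push that is immediately followed by a pop."""
    out = []
    for instr in code:
        if instr[0] == "pop" and out and out[-1][0] in PURE_PUSHES:
            out.pop()          # remove the push and skip the pop
        else:
            out.append(instr)
    return out

code = [
    ("iload", "0"),
    ("ldc", "1"),
    ("iadd",),
    ("istore", "0"),
    ("iload", "0"),
    ("pop",),
]
cleaned = remove_push_pop(code)
```

This is only safe inside a basic block, where no jump can land between the push and the pop.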

Transformations on Basic Blocks
Renaming temporary variables (3-address codes) and reordering instructions can be useful
    t1 := b+c
    t2 := x+y    ; can reorder if b,c != t2 and x,y != t1

Transformations on Basic Blocks
We can use algebraic identities to simplify code or use less expensive instructions.
Usually applies when one of the operands is a constant
    x := x + 0    ; eliminate
    x := x * 1    ; eliminate
    x := y + 0    ; x := y
    x := y * 1    ; x := y
    x := y * 2    ; x := y + y might be faster
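These five rewrites can be expressed as a small function on 3-address statements. The sketch assumes statements are tuples `(dest, left, op, right)` with constants written as strings, handles only a constant in the right operand, and uses `(dest, src, None, None)` for a copy; all of these conventions are assumptions for the example.

```python
def simplify(stmt):
    """Apply the algebraic identities; return None if the statement can be eliminated."""
    dest, left, op, right = stmt
    is_identity = (op == "+" and right == "0") or (op == "*" and right == "1")
    if is_identity:
        if dest == left:
            return None                      # x := x + 0 or x := x * 1
        return (dest, left, None, None)      # x := y
    if op == "*" and right == "2":
        return (dest, left, "+", left)       # x := y + y might be faster
    return stmt
```

A fuller version would also check the left operand, since `0 + y` and `1 * y` simplify in the same way.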

Register Machines

A typical computer has a fixed number of registers.
All operations require that the operands be contained in these registers.
Reading data from memory into registers (load) and writing it back (store) is slow.
We want to minimize the number of loads and stores.
Problem: Many functions will have more variables than available registers.

Next-Use Information
When inspecting a basic block, it can be helpful to know when each variable will be used next
    ; code for x := y + z
    mov y, R0     ; put y into register 0
    mov z, R1     ; put z into register 1
    add R0, R1    ; store result of add in R0
    mov R0, x     ; store x
    ; code for p := y * 2
    mov y, R0     ; put y into register 0
    ld 2, R1      ; put 2 into register 1
    mul R0, R1    ; store result of mul in R0
    mov R0, p     ; store p

Next-Use Information (Cont'd)
An improved use of registers
    ; code for x := y + z
    mov y, R0     ; put y into register 0
    mov z, R1     ; put z into register 1
    add R1, R0    ; store result of add in R1
    mov R1, x     ; store x
    ; code for p := y * 2
                  ; y is still in R0
    ld 2, R1      ; put 2 into register 1
    mul R0, R1    ; store result of mul in R0
    mov R0, p     ; store p

Computing Next-Use Information
By scanning backwards we can compute next-use information for each variable used in each line of a basic block.
With each variable, we know
  the next time it is used in an expression
  the next time its value is changed
Aliasing (pointers and references) can complicate matters.

Next-Use Information Example
    1. t1 := b * b       ; t1(5) b(never)
    2. t2 := 4 * a       ; t2(3) a(6)
    3. t3 := t2 * c      ; t3(4) t2(never) c(never)
    4. t4 := sqrt(t3)    ; t4(5) t3(never)
    5. t5 := t1 - t4     ; t5(7) t1(never) t4(never)
    6. t6 := 2 * a       ; t6(7) a(never)
    7. t7 := t5 / t6     ; t7(8) t5(never) t6(never)
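The backward scan can be sketched as follows. The representation is an assumption made for the example: each statement is `(dest, operands)` with constants left out, line numbers are 1-based, and `None` stands for "never". The sketch also assumes that nothing is live after the block, so the last temporary reports `None` rather than a line past the end.

```python
def next_use(block):
    """For each statement, map its variables to the line of their next use (or None)."""
    upcoming = {}    # variable -> next use as seen from the current scan position
    table = []
    for line in range(len(block), 0, -1):
        dest, operands = block[line - 1]
        # Record what is known before this statement's own effects are applied.
        table.append({v: upcoming.get(v) for v in (dest, *operands)})
        upcoming[dest] = None        # dest is overwritten here, so earlier values are dead
        for v in operands:
            upcoming[v] = line       # operands are used on this line
    table.reverse()
    return table

block = [
    ("t1", ("b", "b")),
    ("t2", ("a",)),
    ("t3", ("t2", "c")),
    ("t4", ("t3",)),
    ("t5", ("t1", "t4")),
    ("t6", ("a",)),
    ("t7", ("t5", "t6")),
]
table = next_use(block)
```

The order inside the loop matters: the destination is cleared before the operands are marked, so a statement such as `i := i + 1` still records a use of `i` on that line.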

Generating Code From Next-Use
Scan the block from beginning to end, keeping track of where each variable is stored (in which register or in memory).
To generate code for x := y + z (assume x, y, and z are distinct):
  If x is in a register Ri, then mark Ri as free.
  If y and z are not in registers, then bring them into registers.
  Do the addition (now x is stored in a register).

Bringing a Variable into a Register
To load a variable y into a register:
  If some register is free, then use that register.
  Otherwise, consider registers that store values also stored in memory and use one of those.
  Otherwise, write a register into memory and use it.
  In the case of ties, write the register holding the variable whose next use is farthest into the future.
At the end of the basic block, generate code to write all registers back to memory.
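The selection rule might look like this in code. Everything here is a hypothetical sketch: `registers` maps register names to the variable they hold (or `None` if free), `in_memory` is the set of variables whose current value is also in memory, and `next_use` maps variables to the line of their next use (`None` for never). Applying the farthest-next-use tie-break among clean registers as well is an assumption of this sketch.

```python
import math

def choose_register(registers, in_memory, next_use):
    """Pick a register for a new variable; return (register, needs_store)."""
    def distance(reg):
        use = next_use.get(registers[reg])
        return math.inf if use is None else use   # "never" is farthest of all

    free = [r for r, v in registers.items() if v is None]
    if free:
        return free[0], False                      # a free register costs nothing
    clean = [r for r, v in registers.items() if v in in_memory]
    if clean:
        return max(clean, key=distance), False     # value is already safe in memory
    return max(registers, key=distance), True      # must write the old value back first
```

For example, with `t1` (next used on line 5) in R0 and `t2` (next used on line 3) in R1, and neither saved in memory, the sketch evicts R0 and reports that a store is needed.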

Code Generation - Example
Generate code for this on a 2-register machine
    1. t1 := b * b       ; t1(5) b(never)
    2. t2 := 4 * a       ; t2(3) a(6)
    3. t3 := t2 * c      ; t3(4) t2(never) c(never)
    4. t4 := sqrt(t3)    ; t4(5) t3(never)
    5. t5 := t1 - t4     ; t5(7) t1(never) t4(never)
    6. t6 := 2 * a       ; t6(7) a(never)
    7. t7 := t5 / t6     ; t7(8) t5(never) t6(never)

The Pains of Pointers
In languages with pointers, basic register allocation becomes much more difficult.
This is especially true in languages like C and C++ with very flexible pointers.
For this reason, compilers for languages with more restricted pointers can sometimes produce faster code than even the best optimizing C compilers.
    int *a;
    int x, y, z, w;
    ...
    *a = 23;    // this may have modified x, y, z, or w
                // a C compiler has to work hard to
                // know that it doesn't

The Control Flow Graph

The (control) flow graph is a directed graph whose vertices are the basic blocks.
An edge goes from block A to block B if
  A terminates with a (conditional) jump to B, or
  B comes after A and A's last statement is anything other than a goto or return (unconditional jump).
The flow graph tells us, for every block, which blocks we might visit next.

    getstatic java/lang/System/out Ljava/io/PrintStream;
    iload 0
    ifeq false_label
    ldc "true"
    goto print_it
    false_label:
    ldc "false"
    print_it:
    invokevirtual java/io/PrintStream/println(Ljava/lang/String;)V
    return
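The two edge rules translate directly into code. In this sketch the block list is written out by hand for the print-true-or-false example, instructions use an assumed `(label, opcode, operand)` format with shortened operands, and `COND_JUMPS` is an assumed set of conditional jump opcodes.

```python
COND_JUMPS = {"ifeq", "ifge"}    # assumed conditional jump opcodes

def build_flow_graph(blocks):
    """Return {block index: set of successor block indices}."""
    index_of_label = {}
    for i, block in enumerate(blocks):
        label = block[0][0]               # a jump target is always a block head
        if label:
            index_of_label[label] = i
    edges = {i: set() for i in range(len(blocks))}
    for i, block in enumerate(blocks):
        _, op, arg = block[-1]
        if op == "goto" or op in COND_JUMPS:
            edges[i].add(index_of_label[arg])     # A ends with a jump to B
        if op not in ("goto", "return") and i + 1 < len(blocks):
            edges[i].add(i + 1)                   # B follows A and A can fall through
    return edges

blocks = [
    [(None, "getstatic", "out"), (None, "iload", "0"), (None, "ifeq", "false_label")],
    [(None, "ldc", '"true"'), (None, "goto", "print_it")],
    [("false_label", "ldc", '"false"')],
    [("print_it", "invokevirtual", "println")],
    [(None, "return", None)],
]
edges = build_flow_graph(blocks)
```

The first block gets two successors, one for each outcome of `ifeq`, and both of those blocks lead to the block that calls `println`.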

Flow Graph Example
Construct the control flow graph:
    ldc 0.0
    fstore 1
    start:
    fload 1    ; load i
    fload 0    ; load n
    fcmpl
    ifge done
    fload 1
    invokestatic SimpleTest/printFloat(F)V
    fload 1
    ldc 1.0
    fadd
    fstore 1
    goto start
    done:
    return

Global Register Allocation
We have seen an efficient algorithm for managing registers within a block.
Summary:
  Keep track of which values are in which registers
  Only store a register when necessary
  Store all dirty registers at the end of a block
Problem: It's often worth keeping variables in registers across blocks
  loop indices are a common example

Example
Source:
    i := 0
    start: i := i + 1
    ...
    if i < 1000 goto start
With i kept in R0 across blocks:
    ldc R0, 0
    start: inc R0
    ...
    ldc R1, 1000
    sub R1, R0
    jmplt R1, start
With i stored at the end of each block and reloaded at the start of the next:
    ldc R0, 0
    mov R0, i          ; store i
    start: mov i, R0   ; load i
    inc R0
    ...
    ldc R1, 1000
    sub R1, R0
    mov R0, i          ; store i
    jmplt R1, start

Global Register Allocation
Designate one or more registers as variable registers that will be used to store local variables.
Analyze loops and decide which variables get to become register variables.

Assigning Register Variables
Easy case: 1 block in a loop
Calculate the savings for each variable:
  save 1 load if the variable is accessed
  save 1 store if the variable is modified
Example:
    start: i := i + 1
           a := b + c
           ...
           if i < 1000 goto start
  i is used and modified (1 load + 1 store)
  a is modified but not used (1 store)
  b and c are used but not modified (1 load each)
  putting i in a register yields the greatest savings
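The savings count for the one-block case is simple enough to sketch directly. The statement format `(dest, operands)` is an assumption for the example, with `dest` set to `None` for statements such as the conditional jump that assign nothing.

```python
def register_savings(loop_body):
    """Map each variable to the loads plus stores saved per iteration."""
    used, modified = set(), set()
    for dest, operands in loop_body:
        used.update(operands)
        if dest is not None:
            modified.add(dest)
    # One load saved if the variable is accessed, one store if it is modified.
    return {v: int(v in used) + int(v in modified) for v in used | modified}

loop_body = [
    ("i", ("i",)),        # i := i + 1
    ("a", ("b", "c")),    # a := b + c
    (None, ("i",)),       # if i < 1000 goto start
]
savings = register_savings(loop_body)
best = max(savings, key=savings.get)
```

Here `i` scores 2 and every other variable scores 1, so `i` is the first candidate for a register.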

More Complicated Variants
A cycle with an if statement
Count savings only half as much in the red boxes

More Complicated Variants
Nested cycles
Pay a penalty for choosing a different variable to use in the inner cycle

Other Control Flow Graph Tricks
The control flow graph allows several other useful optimizations based on reachability analysis.
Can we get to a basic block B from a basic block A?
This question is answered by computing the transitive closure of the control flow graph.
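One standard way to compute the transitive closure is Warshall's algorithm, sketched here on a graph stored as a dictionary of successor sets. The graph and the function name are made up for the example.

```python
def transitive_closure(edges):
    """Return {A: set of every block B reachable from A by one or more edges}."""
    reach = {a: set(succs) for a, succs in edges.items()}
    for k in edges:                   # allow paths that pass through block k
        for a in edges:
            if k in reach[a]:
                reach[a] |= reach[k]
    return reach

# Hypothetical flow graph: 3 -> 0 -> 1 -> 2 -> 1
edges = {0: {1}, 1: {2}, 2: {1}, 3: {0}}
closure = transitive_closure(edges)
```

Block 1 reaches itself because it sits on a cycle, while block 0 does not, which is exactly the distinction a loop analysis needs.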

Dead Code Elimination
A piece of code is dead if it cannot be reached in any execution path.
For a function, look at the first basic block of the function (A):
  code B is dead if A->B is not in the transitive closure
Dead code never executes and can therefore be eliminated.
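For dead code only reachability from the first block matters, so a single depth-first search is enough and the full transitive closure is not required. The sketch below uses that simpler search; the graph representation (a dictionary of successor sets) and the names are assumptions for the example.

```python
def reachable_from(edges, start):
    """Return the set of blocks reachable from start, including start itself."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for succ in edges[node]:
            if succ not in seen:
                seen.add(succ)
                stack.append(succ)
    return seen

def dead_blocks(edges, entry=0):
    """Blocks that no execution path starting at the entry block can reach."""
    return set(edges) - reachable_from(edges, entry)

# Hypothetical flow graph in which block 2 has no incoming path from block 0.
edges = {0: {1}, 1: {3}, 2: {3}, 3: set()}
```

In this graph `dead_blocks(edges)` reports block 2, which can be removed together with its outgoing edge.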

No Longer Used Variables
At some point during the execution of a function, a local variable may never be used again.
We can avoid unnecessarily storing this variable.
If variable i is modified in basic block A:
  Check if there is any block B such that i is used in block B, and A -> B is in the transitive closure
  If not, then i is never used again after visiting A
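This check can be sketched as a search over the successors of A. The inputs are assumptions for the example: `edges` maps each block to its successor set and `uses` maps each block to the set of variables it reads. Like the rule above, the sketch is conservative and ignores redefinitions along the way.

```python
def used_later(edges, uses, block, var):
    """True if some block reachable from `block` reads var."""
    seen = set()
    stack = list(edges[block])
    while stack:
        b = stack.pop()
        if b in seen:
            continue
        seen.add(b)
        if var in uses[b]:
            return True
        stack.extend(edges[b])
    return False

# Hypothetical loop: block 0 -> block 1 (loops on itself) -> block 2
edges = {0: {1}, 1: {1, 2}, 2: set()}
uses = {0: set(), 1: {"i"}, 2: {"n"}}
```

Block 1 counts as reachable from itself because of the loop edge, so a store to `i` at the end of block 1 cannot be skipped.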

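When to Construct the Flow Graph
The best time to construct the control flow graph is after some optimizations have been done on the basic blocks.
This may reduce the number of edges in the graph.
    start: ...
           t0 = 1 < 3
           if t0 goto start
Here t0 is always true, so the conditional jump always goes to start and the edge to the following block can be dropped.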

Summary
Basic blocks and control flow graphs represent a compiler's understanding of how a program executes.
Basic blocks always run right through
  We understand enough about values in basic blocks to optimize aggressively
Flow graphs represent execution paths
  Give more information about data in basic blocks
  Allow for reachability analysis