CPSC 510 (Fall 2007, Winter 2008) Compiler Construction II


CPSC 510 (Fall 2007, Winter 2008) Compiler Construction II Robin Cockett Department of Computer Science University of Calgary Alberta, Canada robin@cpsc.ucalgary.ca Sept. 2007

Introduction to the course
  The course
  A first look at general optimization
  Architecture-dependent optimizations
  Bibliography

What is a compiler? A compiler translates source programs into target programs. Usually source programs are in a high-level, human-readable form, while target programs are machine executable. Programs specify a dynamic (run-time) behaviour and are subject to certain static (compile-time) requirements. Compilers must correctly implement the behaviour specified by programs... and must do so in a reasonable amount of time, producing reasonably efficient target code.

Structure of a compiler

  SC --(p)--> AST --(sa)--> IR --(opt)--> IR --(cg)--> TC

SC = source code, AST = abstract syntax tree, IR = intermediate representation, TC = target code
p = lexing and parsing, sa = semantic analysis, opt = optimization, cg = code generation

This course: the last two steps!

Compilers are fundamental. Compiler technology has a long history, and theoretical ideas are heavily used. Modern machine architectures are unusable without an (optimizing) compiler. Architecture and compiler technology interact: RISC/CISC micro-processor architectures, pipelined architectures, multi-core processors... Ideas from compilers are used everywhere...

Organizing compiler development: translating each source language (SML, Java, C#, OCaml, Haskell, ...) directly to each target architecture (x86, Sparc, MIPS, PPC, ARM, ...) requires n times m compilers.

Organizing compiler development: translating each source language (SML, Java, C#, OCaml, Haskell) into a common IR, and the IR to each target architecture (x86, Sparc, MIPS, PPC, ARM), requires only n plus m compilers.

Organizing compiler development The strategy: Have good intermediate representations! Do generally applicable optimizations at the level of the intermediate representation. Do architecture-specific optimizations when generating code...

Focus of the course... This course is about optimizing compilers (parsing, lexing, and type checking are assumed...). Main emphasis on general optimization techniques: control-flow analysis, code transformations, instruction selection and scheduling, register allocation. Secondary concerns: compiling should not take forever, the code footprint should be reasonable... and architectural features: pipelining, parallelism.

Focus of the course... Compilers directly use theoretical ideas: algorithms are everywhere, data structures are everywhere, data-flow analysis uses graph theory, and transformations are based on the semantics of the computation.

The course I. First semester: the theory and practice of optimization. Basic register allocation and instruction selection. Control flow graphs (CFGs), dominators, loop structure, and reducibility. Data-flow analysis: solving fixed-point equations on CFGs. Static single assignment (SSA) form (φ-functions), translation into a control-flow language, program reduction. Arrays, garbage collection, functions, ... Code generation issues: instruction scheduling...
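To make the SSA idea concrete before it is treated properly later in the course, here is a minimal sketch. The source fragment is if (c) x = 1; else x = 2; y = x + 1; the renamed variables x1, x2, x3 are hypothetical, and the φ-function is written as a C conditional so the fragment stays compilable:

    /* SSA sketch: every assignment defines a fresh name, and a
       phi-function selects the incoming value at the join point. */
    int f(int c) {
        int x1 = 1;            /* definition on the true path       */
        int x2 = 2;            /* definition on the false path      */
        int x3 = c ? x1 : x2;  /* stands for x3 = phi(x1, x2)       */
        return x3 + 1;         /* every later use refers to x3 only */
    }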

The course II. Second semester: implementing a substantial component of a compiler. Project initiation: project description and website (25%). Four project milestones (25%). Final presentation and project submission (50%).

Evaluation of the course First semester: Midterm exam (25%). Final exam: take-home exam (35%). Assignments: register allocation and instruction selection (10%), data-flow analysis (10%), code reduction in a control-flow language (10%).

What is better? Focus on execution speed...
  Reduce the number of instructions
  Decrease memory usage
  Increase locality
  Replace expensive instructions with cheap ones
  Increase pipelining and parallelism
  Use special architectural features
WANT: the optimized program performs no worse!

Architecture-independent optimizations... The above optimizations, except the first, depend on the machine architecture. BUT there are many general optimization steps which reduce the number of instructions...
  1. Constant propagation, constant folding
  2. Dead-code elimination, common subexpression elimination
  3. Store/load elimination
  4. Strength reduction and algebraic simplification
  5. Loop-invariant code motion, loop unrolling
  6. Loop fusion and loop splitting
  7. Induction variable elimination
These types of optimization are the main focus of the course...
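As a small taste (not from the slides), here are three of these steps applied by hand to a C fragment; a compiler performs the same rewrites on the intermediate representation:

    #include <stdio.h>

    int before(void) {
        int a = 5, b = 27;
        int dead = a * b;          /* dead code: never used below       */
        int n = a + b;             /* constants propagate and fold: 32  */
        int s = 0;
        for (int i = 0; i < n; ++i)
            s += 2 * i;            /* strength reduction: 2*i -> i << 1 */
        return s;
    }

    int after(void) {              /* what the optimizer should produce */
        int s = 0;
        for (int i = 0; i < 32; ++i)
            s += i << 1;
        return s;
    }

    int main(void) {
        printf("%d %d\n", before(), after());   /* both print 992 */
        return 0;
    }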

Constant propagation... Replace variables by constants if they are known...

Before:
  a = 5; b = 27;
  ...
  n = a + b;
  for (i=0; i<n; ++i) ...

After:
  a = 5; b = 27;
  ...
  n = 5 + 27;
  for (i=0; i<n; ++i) ...

Constant folding... Evaluate expressions involving constants...

Before:
  a = 5; b = 27;
  ...
  n = a + b;
  for (i=0; i<n; ++i) ...

After:
  a = 5; b = 27;
  ...
  n = 32;
  for (i=0; i<n; ++i) ...
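How might a compiler do this? Below is a minimal sketch of bottom-up folding over an expression tree (the AST structure and names are made up for illustration; a real pass would handle more operators, overflow, and floating point with care):

    #include <stdio.h>
    #include <stdlib.h>

    /* Tiny AST: a node is a constant or a binary '+' / '*' operator. */
    typedef struct Expr {
        char op;                   /* 'k' = constant, '+' or '*'      */
        int  val;                  /* meaningful only when op == 'k'  */
        struct Expr *l, *r;
    } Expr;

    static Expr *konst(int v) {
        Expr *e = malloc(sizeof *e);
        e->op = 'k'; e->val = v; e->l = e->r = NULL;
        return e;
    }

    static Expr *binop(char op, Expr *l, Expr *r) {
        Expr *e = malloc(sizeof *e);
        e->op = op; e->val = 0; e->l = l; e->r = r;
        return e;
    }

    /* Fold bottom-up: when both children are constants, replace the
       operator node by the computed constant. */
    static Expr *fold(Expr *e) {
        if (e->op == 'k') return e;
        e->l = fold(e->l);
        e->r = fold(e->r);
        if (e->l->op == 'k' && e->r->op == 'k') {
            int v = (e->op == '+') ? e->l->val + e->r->val
                                   : e->l->val * e->r->val;
            free(e->l); free(e->r);
            e->op = 'k'; e->val = v; e->l = e->r = NULL;
        }
        return e;
    }

    int main(void) {
        Expr *n = fold(binop('+', konst(5), konst(27)));  /* n = 5 + 27 */
        printf("folded to %d\n", n->val);                 /* 32         */
        return 0;
    }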

Common subexpression elimination... Replace expressions which have already been evaluated by a variable which holds the value of the computation...

Before:
  a = c * d;
  ...
  d = (c * d + t);

After:
  a = c * d;
  ...
  d = a + t;

Eliminating dead code is less easy: how do we know that some code will never be used? A global analysis is needed.
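The standard way to spot such redundancies in straight-line code is local value numbering; a minimal sketch follows (the mini-IR, variable encoding, and names are made up for illustration; a real pass must also handle commutativity and reassignment):

    #include <stdio.h>

    #define NVARS   8
    #define MAXEXPR 64

    typedef struct { char op; int vl, vr; } EKey;

    static EKey table[MAXEXPR];       /* expressions seen so far        */
    static int  ntable = 0;
    static int  vn_of_var[NVARS];     /* current value number per var   */

    /* Return the value number of (op, vl, vr), recording whether the
       expression had been computed before. */
    static int lookup(char op, int vl, int vr, int *found) {
        for (int i = 0; i < ntable; i++)
            if (table[i].op == op && table[i].vl == vl && table[i].vr == vr) {
                *found = 1;
                return NVARS + i;     /* expression VNs start at NVARS  */
            }
        *found = 0;
        table[ntable].op = op; table[ntable].vl = vl; table[ntable].vr = vr;
        return NVARS + ntable++;
    }

    /* Process the instruction  dst = s1 op s2. */
    static void instr(const char *text, int dst, int s1, char op, int s2) {
        int found;
        vn_of_var[dst] = lookup(op, vn_of_var[s1], vn_of_var[s2], &found);
        printf("%-10s %s\n", text,
               found ? "redundant: reuse the earlier value" : "compute");
    }

    int main(void) {
        for (int i = 0; i < NVARS; i++) vn_of_var[i] = i;
        /* variables: a=0, b=1, x=2, y=3 */
        instr("x = a + b", 2, 0, '+', 1);
        instr("y = a + b", 3, 0, '+', 1);   /* same VN triple: redundant */
        return 0;
    }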

Algebraic simplification... Replace an expression with a simpler expression known to be equal, using algebraic identities:

  a = c * 0;   -->   a = 0;
  b = c + 0;   -->   b = c;
  c = d * 1;   -->   c = d;
  m = 2 * n;   -->   m = shift(n, 1);

The last is a strength reduction, as shifting is cheaper than multiplying...

Code motion... Move expressions which are invariant in loops out of loops:

Before:
  for (i=0; i<100; ++i)
    for (j=0; j<100; ++j)
      for (k=0; k<100; ++k)
        a[i][j][k] = i*j*k;

After:
  for (i=0; i<100; ++i)
    for (j=0; j<100; ++j) {
      t1 = a[i][j];
      t2 = i*j;
      for (k=0; k<100; ++k)
        t1[k] = t2 * k;
    }

This is a very important optimization but it is tricky!!

So what is the difficulty? These optimizations may look easy... but they are not. The big question is how to do them automatically! They also interact: optimizing in one area can make available an optimization in another, and so on. ORGANIZING them so that one optimization does not get in the way of another is crucial!... and so, in the end, everything has been done which is (reasonably) possible. REMEMBER: a small optimization in a loop can save hours of CPU time! Using the machine architecture to advantage is important too...

Locality... Why is memory usage important?

  Location     Type of memory   Quantity   Access time
  CPU          registers        32 B       3 ns
  CPU          cache            2 KB       6 ns
  CPU          main memory      2 MB       60 ns
  Hard drive   disk             2 GB       8 ms

Memory hierarchy: the importance of locality.
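Not on the slide, but the classic demonstration: both functions below compute the same sum over a row-major C array, yet the second typically runs several times faster because it walks memory in the order it is laid out:

    #include <stdio.h>

    #define N 1024
    static double a[N][N];

    /* Column-major walk: consecutive accesses are N*8 bytes apart,
       so nearly every access touches a new cache line. */
    double sum_cols(void) {
        double s = 0;
        for (int j = 0; j < N; ++j)
            for (int i = 0; i < N; ++i)
                s += a[i][j];
        return s;
    }

    /* Row-major walk: consecutive accesses are adjacent in memory,
       so most accesses hit the cache line already loaded. */
    double sum_rows(void) {
        double s = 0;
        for (int i = 0; i < N; ++i)
            for (int j = 0; j < N; ++j)
                s += a[i][j];
        return s;
    }

    int main(void) {
        printf("%g %g\n", sum_cols(), sum_rows());
        return 0;
    }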

Register allocation. The statement x = (x+y)*z becomes, on a load/store machine:

  r1 = M[x];
  r2 = M[y];
  r1 = r1 + r2;
  r2 = M[z];
  r1 = r1 * r2;
  M[x] = r1;

Register allocation, in general, is an NP-complete problem. However, many of the register allocation problems particular to compilers can be solved in near-linear time. Poor register allocation leads to spilling and loss of locality... a big hit!
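One near-linear technique is linear scan: sort the live intervals of the values by start point and hand out registers greedily, spilling when none is free. A minimal sketch (the intervals and the two-register budget are made up):

    #include <stdio.h>

    #define NREGS 2
    #define NVALS 5

    /* Live interval [start, end] of each virtual register,
       pre-sorted by start point. */
    typedef struct { const char *name; int start, end; } Interval;

    int main(void) {
        Interval iv[NVALS] = {
            {"v1", 0, 4}, {"v2", 1, 3}, {"v3", 4, 8},
            {"v4", 5, 7}, {"v5", 6, 9},
        };
        int free_at[NREGS] = {0, 0};  /* point at which each register frees */

        for (int i = 0; i < NVALS; i++) {
            int r = -1;
            for (int k = 0; k < NREGS; k++)
                if (free_at[k] <= iv[i].start) {  /* register has expired */
                    r = k;
                    break;
                }
            if (r >= 0) {
                free_at[r] = iv[i].end;
                printf("%s -> r%d\n", iv[i].name, r);
            } else {
                printf("%s -> spill\n", iv[i].name);  /* no free register */
            }
        }
        return 0;
    }

Here v1..v4 fit in two registers because earlier intervals expire in time, while v5 overlaps both live values and is spilled.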

Instruction selection. Most architectures provide a variety of instructions which can be used in different ways to perform a given computation. There are dynamic-programming algorithms for instruction selection which minimize the number/cost of the instructions chosen.
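A minimal sketch of the dynamic-programming idea, on a made-up instruction set with three patterns (LI loads a constant, ADD adds two registers, ADDI adds a register and an immediate, each at cost 1): the cheapest tiling of each subtree is computed bottom-up.

    #include <stdio.h>
    #include <limits.h>

    /* Expression tree over ADD nodes and two kinds of leaves. */
    typedef struct Node {
        int is_const;           /* constant leaf                */
        int is_reg;             /* value already in a register  */
        struct Node *l, *r;     /* children of an ADD node      */
    } Node;

    static int cost(const Node *n) {
        if (n->is_reg)   return 0;                  /* nothing to emit */
        if (n->is_const) return 1;                  /* LI              */
        int c_add  = 1 + cost(n->l) + cost(n->r);   /* ADD r,r,r       */
        int c_addi = n->r->is_const                 /* ADDI r,r,imm    */
                   ? 1 + cost(n->l) : INT_MAX;      /* folds constant  */
        return c_addi < c_add ? c_addi : c_add;
    }

    int main(void) {
        /* (a + 4) + 8, with 'a' already in a register */
        Node a     = {0, 1, NULL, NULL};
        Node four  = {1, 0, NULL, NULL};
        Node eight = {1, 0, NULL, NULL};
        Node inner = {0, 0, &a, &four};
        Node outer = {0, 0, &inner, &eight};
        printf("min cost: %d\n", cost(&outer));  /* 2: two ADDIs, not
                                                    two LIs + two ADDs */
        return 0;
    }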

Pipelining... Consider the loop

  for (i=1; i<=m; i++)
    for (j=1; j<=n; j++)
      x[i] = x[i] + y[i,j];

Pipelined execution for m = 4 and n = 3:

  Time   Processor 1       Processor 2       Processor 3
  1      x[1] += y[1,1]
  2      x[2] += y[2,1]    x[1] += y[1,2]
  3      x[3] += y[3,1]    x[2] += y[2,2]    x[1] += y[1,3]
  4      x[4] += y[4,1]    x[3] += y[3,2]    x[2] += y[2,3]
  5                        x[4] += y[4,2]    x[3] += y[3,3]
  6                                          x[4] += y[4,3]

Instruction scheduling. Instructions should be scheduled to make use of pipelining and parallelism.

Texts...
  Modern Compiler Implementation (Appel)
  Building an Optimizing Compiler (Morgan)
  Advanced Compiler Design and Implementation (Muchnick)
  Compilers: Principles, Techniques, and Tools (Aho, Sethi, Ullman)

Journals...
  IEEE: Computer; Transactions on Computers; Concurrency; Transactions on Parallel and Distributed Systems
  ACM: TOPLAS (Transactions on Programming Languages and Systems); Transactions on Computer Systems
  JPDS: Journal of Parallel and Distributed Computing
  JSC: Journal of Supercomputing
  JPP: International Journal of Parallel Programming
  PC: Parallel Computing
  JPL: Journal of Programming Languages

Conferences...
  PLDI: ACM Symposium on Programming Language Design and Implementation
  POPL: ACM Symposium on Principles of Programming Languages
  ASPLOS: ACM Symposium on Architectural Support for Programming Languages and Operating Systems
  PPOPP: ACM Symposium on Principles and Practice of Parallel Programming
  ICPP: International Conference on Parallel Processing
  ICS: International Conference on Supercomputing
  PACT: Parallel Architectures and Compilation Techniques