Overview of a Compiler

High-level View of a Compiler

Copyright 2010, Pedro C. Diniz, all rights reserved. Students enrolled in the Compilers class at the University of Southern California have explicit permission to make copies of these materials for their personal use.

Implications
- Must recognize legal (and illegal) programs
- Must generate correct code
- Must manage storage of all variables (and code)
- Must agree with the OS & linker on the format for object code
A compiler is a big step up from assembly language: it lets the programmer use higher-level notations.

Traditional Two-pass Compiler
- Uses an intermediate representation (IR)
- The front end maps legal source code into IR
- The back end maps IR into target machine code
- Admits multiple front ends & multiple passes (better code)
- Typically, the front end is O(n) or O(n log n), while the back end contains NP-complete problems

A Common Fallacy
With front ends for languages such as Scheme and Smalltalk and back ends for Targets 1, 2, and 3, can we build n x m compilers with only n + m components?
- Must encode all language-specific knowledge in each front end
- Must encode all features in a single IR
- Must encode all target-specific knowledge in each back end
- This approach has had limited success, mostly in systems with very low-level IRs

The Front End
Responsibilities:
- Recognize legal (& illegal) programs
- Report errors in a useful way
- Produce IR & a preliminary storage map
- Shape the IR for the back end
Much of front-end construction can be automated.

The Scanner
The scanner maps the character stream into words, the basic unit of syntax, producing pairs of a word & its part of speech:
  x = x + y ;   becomes   <id,x> = <id,x> + <id,y> ;
Here "word" corresponds to lexeme and "part of speech" to token type; in casual speech, we call the pair a token. Typical tokens include number, identifier, +, -, new, while, and if. The scanner eliminates white space (including comments). Speed is important.
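The scanner's behavior on the slide's example can be sketched in a few lines of Python. This is a hypothetical illustration, not part of the original slides; the token categories and the `scan` helper are assumptions for this sketch.

```python
import re

# Hypothetical token categories covering just the slide's example; a real
# scanner would also handle keywords, comparison operators, comments, etc.
TOKEN_SPEC = [
    ("id",     r"[A-Za-z_][A-Za-z0-9_]*"),
    ("number", r"\d+"),
    ("op",     r"[+\-=;]"),
    ("ws",     r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def scan(source):
    """Map a character stream into <part-of-speech, lexeme> pairs,
    discarding white space, as the slide describes."""
    tokens = []
    for m in MASTER.finditer(source):
        if m.lastgroup != "ws":
            tokens.append((m.lastgroup, m.group()))
    return tokens

print(scan("x = x + y ;"))
# [('id', 'x'), ('op', '='), ('id', 'x'), ('op', '+'), ('id', 'y'), ('op', ';')]
```

Production scanners are usually generated from regular expressions by a tool (lex/flex), but the character-stream-to-token mapping is the same idea.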

The Parser
The parser consumes the scanner's tokens. Responsibilities:
- Recognize context-free syntax & report errors
- Guide context-sensitive ("semantic") analysis, such as type checking
- Build IR for the source program
Hand-coded parsers are fairly easy to build, though most books advocate using automatic parser generators.

Context-free syntax is specified with a grammar:
  SheepNoise → SheepNoise baa
             | baa
This grammar defines the set of noises that a sheep makes under normal circumstances. It is written in a variant of Backus-Naur Form (BNF).

Formally, a grammar G = (S, N, T, P), where:
- S is the start symbol
- N is a set of non-terminal symbols
- T is a set of terminal symbols, or words
- P is a set of productions, or rewrite rules (P : N → (N ∪ T)*)
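The SheepNoise grammar is simple enough that its recognizer can be written directly: the language it generates is exactly one or more "baa" words. A minimal sketch (the function name is hypothetical):

```python
def recognize_sheep_noise(words):
    """Recognizer for  SheepNoise -> SheepNoise baa | baa.
    A sentence in this grammar is one or more 'baa' tokens."""
    return len(words) >= 1 and all(w == "baa" for w in words)

print(recognize_sheep_noise(["baa", "baa"]))  # True
print(recognize_sheep_noise([]))              # False: need at least one baa
```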

The Back End
Responsibilities:
- Translate IR into target machine code
- Choose instructions to implement each IR operation
- Decide which values to keep in registers
- Ensure conformance with system interfaces
Automation has been less successful in the back end.

Instruction Selection
- Produce fast, compact code
- Take advantage of target features such as addressing modes
- Usually viewed as a pattern-matching problem: ad hoc methods, pattern matching, dynamic programming
- This was "the problem of the future" in 1978, spurred by the transition from the PDP-11 to the VAX-11; the orthogonality of RISC simplified the problem

Register Allocation
- Have each value in a register when it is used
- Manage a limited set of resources
- Can change instruction choices & insert LOADs & STOREs
- Optimal allocation is NP-complete (with 1 or k registers)
Compilers approximate solutions to NP-complete problems.

Instruction Scheduling
- Avoid hardware stalls and interlocks
- Use all functional units productively
- Can increase the lifetimes of variables (changing the allocation)
- Optimal scheduling is NP-complete in nearly all cases; heuristic techniques are well developed

Traditional Three-pass Compiler: The Optimizer (or Middle End)
Front End → Opt 1 → Opt 2 → Opt 3 → ... → Opt n → Back End
Code improvement (or optimization):
- Analyzes IR and rewrites (or transforms) IR
- Primary goal is to reduce the running time of the compiled code
- May also improve space, power consumption, ...
- Must preserve the meaning of the code, as measured by the values of named variables
Modern optimizers are structured as a series of passes.

Typical Transformations
- Discover & propagate some constant value
- Move a computation to a less frequently executed place
- Specialize some computation based on context
- Discover a redundant computation & remove it
- Remove useless or unreachable code
- Encode an idiom in some particularly efficient form
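One of the listed transformations, "discover a redundant computation & remove it", can be illustrated with local value numbering over a basic block. This is a sketch: the quad-style IR and all names are assumptions, and it ignores redefinition of names (a real implementation renames or invalidates stale value numbers).

```python
def value_number(block):
    """Rewrite redundant computations in a basic block of
    (dest, op, arg1, arg2) quads as simple copies."""
    vn = {}       # name or (op, vn1, vn2) expression -> value number
    holder = {}   # value number -> a name currently holding that value
    out = []
    next_vn = 0

    def number_of(name):
        nonlocal next_vn
        if name not in vn:
            vn[name] = next_vn
            next_vn += 1
        return vn[name]

    for dest, op, a, b in block:
        key = (op, number_of(a), number_of(b))
        if key in vn and vn[key] in holder:
            # Same op over the same value numbers: reuse the earlier result.
            out.append((dest, "copy", holder[vn[key]], None))
            vn[dest] = vn[key]
        else:
            if key not in vn:
                vn[key] = next_vn
                next_vn += 1
            vn[dest] = vn[key]
            holder[vn[dest]] = dest
            out.append((dest, op, a, b))
    return out

block = [("t1", "+", "a", "b"),
         ("t2", "+", "a", "b")]   # textually distinct, same value
print(value_number(block))
# [('t1', '+', 'a', 'b'), ('t2', 'copy', 't1', None)]
```

The copy left behind is typically cleaned up by a later copy-propagation or dead-code pass, in keeping with the "series of passes" structure described above.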

Example: Optimization of Subscript Expressions in a Loop
  Address(A(I,J)) = address(A(0,0)) + J * (column size) + I
Does the user realize a multiplication is generated here?
  DO I = 1, M
    A(I,J) = A(I,J) + C
  ENDDO
The compiler can hoist the invariant part of the address computation out of the loop:
  compute addr(A(0,J))
  DO I = 1, M
    add 1 to get addr(A(I,J))
    A(I,J) = A(I,J) + C
  ENDDO

Modern Restructuring Compiler
Source → HL AST → Restructurer → HL AST → IR Gen → IR → Opt + Back End
Typical restructuring transformations:
- Blocking for memory hierarchy and reuse
- Vectorization
- Parallelization
All are based on dependence; also full and partial inlining.

Role of the Run-Time System
- Memory management services: allocate (in the heap or in an activation record / stack frame), deallocate, collect garbage
- Run-time type checking
- Error processing
- Interface to the operating system: input and output
- Support of parallelism: parallel thread initiation, communication and synchronization

1957: The FORTRAN Automatic Coding System
Front End → Index Optimization → Code Merge → bookkeeping → Flow Analysis → Register Allocation → Final Assembly
- Six passes in a fixed order
- Generated good code; assumed unlimited index registers
- Code motion out of loops, with ifs and gotos
- Did flow analysis & register allocation
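The subscript-expression optimization at the start of this example can be sketched in Python: both functions compute the sequence of addresses the loop touches, one with a multiply per reference and one with the invariant part hoisted and the multiply replaced by an add. The base address, column size, and unit element size are assumed values for illustration.

```python
def addresses_naive(base, colsize, J, M):
    """address(A(I,J)) = address(A(0,0)) + J*colsize + I,
    recomputed (multiply included) on every loop iteration."""
    return [base + J * colsize + I for I in range(1, M + 1)]

def addresses_reduced(base, colsize, J, M):
    """Same sequence after the optimization: compute addr(A(1,J)) once
    before the loop, then add 1 per iteration (strength reduction)."""
    addr = base + J * colsize + 1   # loop-invariant part, hoisted
    out = []
    for _ in range(1, M + 1):
        out.append(addr)
        addr += 1                   # add 1 to get the next addr(A(I,J))
    return out

assert addresses_naive(1000, 100, 3, 5) == addresses_reduced(1000, 100, 3, 5)
print(addresses_reduced(1000, 100, 3, 5))
# [1301, 1302, 1303, 1304, 1305]
```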

1969: IBM's FORTRAN H Compiler
Scan & Parse → Build CFG & DOM → Find Busy Vars → CSE → Inv. Code Motion → Copy Elim. → OSR → Re-assoc. (consts) → Reg. Alloc. → Final Assy.
- Used a low-level IR (quads); identified loops with dominators
- Focused on optimizing loops ("inside-out" order)
- Passes are familiar today
- Simple front end, simple back end for the IBM 370

1975: BLISS-11 Compiler (Wulf et al., CMU)
Lex-Syn-Flo → Delay → TLA → Rank → Pack → Code → Final
- The great compiler for the PDP-11
- Seven passes in a fixed order
- Focused on code shape & instruction selection
- Lex-Syn-Flo did preliminary flow analysis
- Final included a grab-bag of peephole optimizations
- Basis for early VAX & Tartan Labs compilers

1980: IBM's PL.8 Compiler
- Many passes, 1 front end, several back ends
- A collection of 10 or more passes; some passes and analyses are repeated
- Represents complex operations at 2 levels, below machine level
Passes included: dead code elimination, CSE, code motion, constant folding, strength reduction, value numbering, dead store elimination, code straightening, trap elimination, and algebraic reassociation. Multi-level IR has since become common wisdom.

1986: HP's PA-RISC Compiler
- Several front ends, an optimizer, and a back end
- Four fixed-order choices for optimization (9 passes)
- Coloring register allocator, instruction scheduler, peephole optimizer

1999: The SUIF Compiler System
Another classically-built compiler:
- 3 front ends (Fortran 77, C/C++), 3 back ends (Alpha, x86, ...)
- 18 passes, configurable order
- Two-level IR (High SUIF, Low SUIF)
- Intended as research infrastructure
Scalar passes: SSA construction, dead code elimination, partial redundancy elimination, constant propagation, value numbering, strength reduction, reassociation, instruction scheduling, register allocation.
Parallelization passes: data dependence analysis, scalar & array privatization, reduction recognition, pointer analysis, affine loop transformations, blocking, capturing object definitions, virtual function call elimination, garbage collection.

Another classically-built compiler: 3 front ends, 1 back end, five levels of IR.
- Interprocedural: classic analysis; inlining (user & library code); cloning (constants & locality); dead function elimination; dead variable elimination
- Loop optimization: dependence analysis; parallelization transformations (fission, fusion, interchange, peeling, tiling, unroll & jam); array privatization
- Global optimization: SSA-based analysis & optimization: constant propagation, PRE, OSR+LFTR, DVNT, DCE (also used by other phases)
- Code generation: if conversion & predication; code motion (including software pipelining); peephole optimization

Summary
Even a 2000-era JIT fits the mold, albeit with fewer passes: byte code → Front End → Middle End → Back End → native code, with the run-time environment handling the classical compiler tasks elsewhere.
- Few (if any) optimizations
- Avoid expensive analysis
- Emphasis on generating native code
- Compilation must be profitable: static vs. dynamic

This lecture covered an overview of a compiler's tasks, basic translation from a high-level language to machine level, and the structure of a (classical) compiler: the traditional three-phase front end / middle end / back end structure.