Advanced C Programming


Advanced C Programming: Compilers
Sebastian Hack (hack@cs.uni-sb.de)
Christoph Weidenbach (weidenbach@mpi-inf.mpg.de)
20.01.2009, Saarland University, Computer Science

Contents
- Overview
- Optimizations
- Program Representations: Abstract Syntax Trees, Control-Flow Graphs
- Some Simple Optimizations: Dead Code Elimination, Constant Folding
- Static Single Assignment
- Scalar Variables, Memory, and State
- Summary

Goals
- Get an impression of what compilers can do
- Write programs in a way such that compilers can optimize them well
- Get an impression of what compilers cannot do
- Do some important optimizations by hand

Compilers: Architecture
Front End
- Syntactic/semantic analysis of the input program
- Dependent on the programming language
Middle End
- Heart of the compiler
- Independent of language and target architecture; most optimizations are implemented here
Back End
- Transforms the program to machine code
- Dependent on the target architecture
- Implements the resource constraints of the machine and runtime system

Optimizations
"Optimization" is the wrong word: it is a mathematical term describing the task of solving an optimization problem. Compiler optimizations merely transform the program and should thus be called transformations. We call them optimizations anyway.
- Many interesting optimizations are NP-complete or even uncomputable
- Since compilation speed also matters, much in compilers is about finding fast heuristics for extremely difficult problems
- A challenging engineering task:
  - Very diverse inputs
  - Complex data structures
  - Complex invariants
  - No tolerance of failure: must work for every input

Optimizations: The Full Employment Theorem
Compiler writers have a mathematically provable job guarantee.

Given: a program P that does not emit anything.
Wanted: the smallest binary for P.

Theorem. There exists no compiler that can produce such a binary for every P.

Proof. If P does not terminate, its smallest implementation is

    L1: jmp L1

So the compiler would have to determine whether P terminates, i.e. solve the (undecidable) halting problem.

Program Representations
- Compilers process data like any other program; however, the data they process are programs
- To get an idea of what compilers can do, we need to understand how they represent programs
- Each end (front, middle, back) uses its own intermediate representation (IR)
- The effectiveness of many optimizations depends on the degree of abstraction and the shape of the IR
- Most compilers use 4 IRs

Program Representations: Front End
Abstract Syntax Tree (AST)
- The program is represented by its syntactic structure: basically a large tree plus a name table
- Nodes represent the type of structural entity: function, statement, operator, ...
- Mainly used for:
  - Name resolution
  - Type checking
  - High-level transformations (e.g. loop transformations)

Program Representation: AST

Source:

    int sum_upto(int n) {
        int i, res = 0;
        for (i = 0; i < n; ++i)
            res += i;
        return res;
    }

AST (excerpt):

    FUNCTION_DECL name:sum_upto
      ARG name:n type:int
      BODY
        STATEMENT_LIST
          VAR_DECL name:i type:int
          ...
          FOR_LOOP
            ASSIGN
              VAR_EXPR name:i
              CONST_EXPR value:0
            CMP_EXPR op:<
              VAR_EXPR name:i
              VAR_EXPR name:n
            ...

Program Representations: Middle to Back End
Control-Flow Graphs (CFG)
- High-level control structures (for, if, ...) are gone
- Nodes of the CFG are basic blocks; edges represent the flow of control
- Instructions in a basic block are in triple form: each instruction has the form z <- op(x_1, ..., x_n), often with n = 2
- No expression trees anymore; the notion of a statement is no longer present
- z, x_1, ..., x_n are scalar variables of machine types

Definition (Basic Block). A basic block B is a maximal sequence of instructions I_1, ..., I_n for which
1. I_i is a control-flow predecessor of I_{i+1}
2. if any I_i is executed, so are all other I_j

Program Representation: CFG / Triple Code

Source:

    int sum_upto(int n) {
        int i, r = 0;
        for (i = 0; i < n; ++i)
            r += i;
        return r;
    }

Triple-code CFG:

    start:
        r <- cnst 0
        i <- cnst 0
    loop:
        b <- cmplt(i, n)
        cond(b, T, F)
    T:
        r <- add(r, i)
        i <- add(i, 1)
        (back to loop)
    F:
        ret(r)

Program Representations: Back End
- Nowadays similar to the middle end: CFGs with machine instructions, registers instead of variables
- At the very end, a list of assembly instructions is generated; the CFG is flattened
- Flattening is important:
  - Use fall-throughs to save jump instructions
  - Arrange blocks carefully to aid branch prediction
- Other minor things to care about: instruction encoding, alignment, data layout, ...

Dead Code Elimination
- Eliminates code that has no effect
- Such code need not have been written by the user: it can also result as garbage from other transformations

    [CFG sketch: two branches join before ret(z); the right branch computes
     z <- op(x, ...) and then redefines x. That definition of x in the right
     branch is dead: the value computed there will never be used.]

- How do we find dead computations? Data-flow analysis
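As a small C sketch of the idea (not from the original slides): the first assignment below is dead because its value is overwritten before any use, and data-flow analysis lets the compiler remove it:

    int f(int a) {
        int x = a * 2;   /* dead: overwritten before any use */
        x = a + 1;       /* only this definition of x reaches the return */
        return x;
    }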

Constant Folding
- Computes constant expressions at compile time

    [CFG sketch: left block: x <- cnst 0; y <- cnst 10; z <- add(y, x).
     Right block: x <- call(). Join block: ret(x).]

- The addition in the left block can be optimized: z <- add(y, x) becomes z <- cnst 10
- The use of x in the bottom block cannot: x has unknown contents when coming from the right branch
- Again, data-flow analysis is used to determine whether a variable has known constant contents
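A hedged C-level sketch of the same situation (external_call is a hypothetical function with an unknown result):

    extern int external_call(void);

    int g(int flag) {
        int x, z = 0;
        if (flag) {
            x = 0;
            z = 10 + x;          /* foldable: x is known to be 0 here, so z becomes 10 */
        } else {
            x = external_call(); /* x has unknown contents here */
        }
        return x;                /* not foldable: x depends on the branch taken */
    }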

Static Single Assignment (SSA)
- Performing data-flow analyses all the time is laborious: each time the program changes, the analysis information has to be updated
- Both transformations above needed the same information, reaching definitions: for a use of a variable x, which definitions of x can write the value read at that use?
- Solution: encode this directly in the IR
  - Allow every variable to have only one instruction that writes its value
  - At each use of a variable there is then exactly one reaching definition
  - Variables and program points are now identical
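A minimal sketch (not from the slides) of SSA renaming; the pseudo-IR in the comments shows how each assignment gets a fresh name:

    int h(int a) {
        int x = a + 1;  /* SSA: x1 <- add(a, 1) */
        x = x * 2;      /* SSA: x2 <- mul(x1, 2); x1 itself is never overwritten */
        return x;       /* SSA: ret(x2); the use names exactly one definition */
    }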

Dead Code Elimination Revisited: SSA

    [CFG sketch, non-SSA vs. SSA:
     non-SSA: block A defines x and z; block B computes z <- op(x, ...) and
     redefines x; the join block executes ret(z).
     SSA: block A defines x1 and z1; block B computes z2 <- op(x1, ...) and
     defines x2; the join block gets z3 <- phi(z2, z1) and executes ret(z3).]

- Which z is used at the return? Use phi-functions to propagate SSA variables over control flow
- Each variable which has no use is dead (here: x2); apply that criterion transitively

Constant Folding Revisited: SSA

    [CFG sketch, non-SSA vs. SSA:
     non-SSA: left block: x <- cnst 0; y <- cnst 10; z <- add(y, x);
     right block: x <- call(); join block: ret(x).
     SSA: left block: x1 <- cnst 0; y1 <- cnst 10; z1 <- add(y1, x1);
     right block: x2 <- call(); join block: x3 <- phi(x1, x2); ret(x3).]

- Each variable has only one definition: either the value at its definition is constant or it is not
- We see directly that x3 is not constant, because not all arguments of the phi are constant

SSA ... is Functional Programming (Kelsey 1995)

CFG:

    start(a, b): if b < a then goto f1 else goto f2
    f1:          c1 <- b - a; goto f3
    f2:          c2 <- 0; goto f3
    f3:          c3 <- phi(c1, c2); return c3

The same program as functions:

    fun start a b = if b < a then f1 a b else f2
    fun f1 a b    = let c = b - a in f3 c
    fun f2        = f3 0
    fun f3 c      = c

- Each basic block is a function
- In FP, each variable can be bound only once (here we go!)
- Control flow is modeled by function calls

Scalar Variables, Memory, and State
- Up to now, all variables were scalar: they resemble machine types (int, float, double); no arrays or structs
- All variables were alias-free: each variable was accessible only through a single name, and every modification happened through that name
- Under SSA this is equivalent to the variable concept in FP: in FP there is no difference between the name and the variable
- Scalar, alias-free variables are good for code generation: they can be put into a register
- What about non-scalar variables? What about variables referenced by pointers?
  - We are able to reference the same variable through different names
  - In imperative programming, names and variables are not the same
  - This makes life much harder for the compiler

Scalar Variables, Memory, and State: How Are Non-Scalar Variables Implemented?
Arrays
- Arrays define potentially aliased variables: each array element can be accessed by an indexing expression
- The value of the index expression might not be known at compile time
- To disambiguate two accesses a[i] and a[j], the compiler needs to prove i != j
Structs
- ... are simpler: unless the address of an element is taken, they can be scalarized

    /* before scalarization */      /* after scalarization */
    int foo(void) {                 int foo(void) {
        vec3_t vec;                     float x, y, z;
        ...                             ...
    }                               }

Scalar Variables, Memory, and State: Aliased Variables

    int global_var;

    int foo(int *p) {
        global_var = 2;
        *p = 3;
        return global_var;
    }

- We cannot optimize this to return 2: p might point to global_var, so global_var is potentially aliased
- How can we find out? Look at all callers of foo and the passed argument; thus probably also all the callers of the callers, and so on
- What if we do not know all the callers?

Scalar Variables, Memory, and State: Aliased Variables
- We can help the compiler: if the address of global_var is never taken and we define it as static, it can only be modified by functions in the current file, and never through a second name, so it cannot be aliased
- Be as precise as possible with your declarations
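A hedged sketch of the improved declaration; assuming the address of global_var is never taken anywhere in this file, the compiler may fold the return value:

    static int global_var;   /* file-local: no other translation unit can name it */

    int foo(int *p) {
        global_var = 2;
        *p = 3;              /* cannot alias global_var: its address is never taken */
        return global_var;   /* may now be optimized to: return 2; */
    }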

Scalar Variables, Memory, and State
- We implement aliased variables by a global memory (of course, it is the other way around ...)
- This memory belongs to the state
- The main difference between functional and imperative programming is the presence of state
- What else belongs to the state is a question of the programming language's semantics
- How that state is updated is (mostly) decided by the memory model
- For correct compilation, the visible effects on the state and their order have to be preserved; both are defined in the PL's semantics
- How do we model the state in the IR?

Scalar Variables, Memory, and State: Representation of Memory
- The memory is also represented as an SSA variable: each load and store takes a memory variable and gives back a new one
- Memory is thus treated functionally, similar to the concept of a monad (cf. Haskell)

    int global_var;

    int foo(int *p) {
        global_var = 2;
        *p = 3;
        return global_var;
    }

IR with an explicit memory variable:

    M1 <- getarg(0)
    a  <- symcnst global_var
    c1 <- cnst 2
    M2 <- store(M1, a, c1)
    p  <- getarg(1)
    c2 <- cnst 3
    M3 <- store(M2, p, c2)
    (M4, r) <- load(M3, a)
    ret(M4, r)

Scalar Variables, Memory, and State: Representation of Memory
- We can also have multiple memory variables!
- They must, however, be implemented with the single memory we have; we must make sure that they represent pairwise disjoint variables

    M2 <- store(M1, p, v)
    M3 <- store(M1, q, w)

  only works if p != q
- Benefit: variables may be scalarized in some regions of the code, and the order of memory accesses can be changed (important for code generation)
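A hedged C sketch of how a programmer can assert exactly this pairwise disjointness: restrict-qualified parameters promise the compiler that the pointers never alias, so the two stores may live in separate memory variables:

    /* Sketch: restrict asserts p != q for all uses, so the compiler may keep
       the stores in independent memory variables and reorder them freely. */
    void split(int *restrict p, int *restrict q, int v, int w) {
        *p = v;
        *q = w;   /* may legally be moved before the store to *p */
    }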

Scalar Variables, Memory, and State: Points-to Analysis
- Subdividing memory needs the results of a points-to analysis: for each use of a pointer, determine an (over-approximated) set of variables the pointer might point to
- One of the hardest analyses: interprocedural (whole-program), long runtime, large memory consumption
- Do not count on it; many compilers sacrifice precision to save compilation time
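A small sketch (not from the slides) of why the over-approximation matters: after the conditional below, any points-to analysis must assume p can point to either a or b, so the store may modify either variable:

    int a, b;

    int *choose(int cond) {
        int *p = cond ? &a : &b;  /* points-to set of p: {a, b} */
        *p = 1;                   /* may modify a or b: neither can be assumed unchanged */
        return p;
    }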

Summary
- Scalar, alias-free variables are good!
  - Many analyses are easy for them
  - Most optimizations only work on scalar, alias-free variables
  - They can be allocated to a processor register
- Having many scalar variables is no problem: the register allocator will decide which ones to spill, and where
- Know that accesses to non-scalar variables might result in memory accesses
- Always program as scalar as possible
- Always convey as much information as possible
- Do not overly rely on points-to analysis

Being Scalar: Best Practices (Arrays)
Prefer

    typedef struct { float x, y, z, w; } vec_t;

over

    typedef float vec_t[4];

- The compiler might have trouble analysing indexing expressions; a.x is much clearer
- The struct can be scalarized more easily; some compilers do not consider arrays for scalarization

Being Scalar: Best Practices (Arrays)
Prefer

    int x = p[i];
    int y = p[i + 1];
    int z = p[i + 2];

over

    int *q = p + i;
    int x = *q++;
    int y = *q++;
    int z = *q++;

- The array base pointer stays the same; inequality of indices is often easier to analyze than a pointer update
- The compiler will do this transformation itself if it knows that it can save a register

Being Scalar: Best Practices (Avoid Pointer Dereferencing)
Prefer

    void isqrt(unsigned long a, unsigned long *q, unsigned long *r)
    {
        unsigned long qq, rr;
        qq = a;
        if (a > 0) {
            while (qq > (rr = a / qq)) {
                qq = (qq + rr) >> 1;
            }
        }
        rr = a - qq * qq;
        *q = qq;
        *r = rr;
    }

over

    void isqrt(unsigned long a, unsigned long *q, unsigned long *r)
    {
        *q = a;
        if (a > 0) {
            while (*q > (*r = a / *q)) {
                *q = (*q + *r) >> 1;
            }
        }
        *r = a - *q * *q;
    }

- The first version makes explicit that we assume q != r
- In C99 you could use restrict, but then you rely on the compiler to do it right
- If all these memory accesses stay, performance is worse
- Treat memory accesses like reading from a file
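For completeness, a hedged sketch of the C99 restrict variant mentioned above; it encodes the q != r assumption in the signature, but you then rely on the compiler to exploit it:

    /* Sketch: restrict promises that q and r never alias, so the compiler may
       keep *q and *r in registers for the whole loop. Every caller must
       actually pass non-aliasing pointers. */
    void isqrt(unsigned long a, unsigned long *restrict q, unsigned long *restrict r)
    {
        *q = a;
        if (a > 0) {
            while (*q > (*r = a / *q)) {
                *q = (*q + *r) >> 1;
            }
        }
        *r = a - *q * *q;
    }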