High-level View of a Compiler

Copyright 2010, Pedro C. Diniz, all rights reserved. Students enrolled in the Compilers class at the University of Southern California have explicit permission to make copies of these materials for their personal use.

Overview of a Compiler

Implications:
- Must recognize legal (and illegal) programs
- Must generate correct code
- Must manage storage of all variables (and code)
- Must agree with OS & linker on format for object code
- Big step up from assembly language: use higher-level notations

Traditional Two-pass Compiler

Implications:
- Use an intermediate representation (IR)
- The front end maps legal source code into the IR
- The back end maps the IR into target machine code
- Admits multiple front ends & multiple passes (better code)
- Typically, the front end is O(n) or O(n log n), while the back end is NP-complete

A Common Fallacy

Several front ends (e.g., Scheme, Smalltalk) feed several back ends (Target 1, Target 2, Target 3). Can we build n x m compilers with n + m components?
- Must encode all language-specific knowledge in each front end
- Must encode all features in a single IR
- Must encode all target-specific knowledge in each back end
- Limited success in systems with very low-level IRs

The Front End

(source code -> scanner -> tokens -> parser -> IR)

Responsibilities:
- Recognize legal (& illegal) programs
- Report errors in a useful way
- Produce the IR & a preliminary storage map
- Shape the code for the back end
- Much of front-end construction can be automated

The Scanner:
- Maps the character stream into words, the basic unit of syntax
- Produces pairs: a word & its part of speech
    x = x + y ;  becomes  <id,x> = <id,x> + <id,y> ;
  (word ~ lexeme, part of speech ~ token type; in casual speech, we call the pair a token)
- Typical tokens include number, identifier, +, -, new, while, if
- The scanner eliminates white space (including comments)
- Speed is important
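The scanner described above can be sketched in a few lines. This is a minimal illustration, not the course's implementation; the token classes and the regular-expression approach are assumptions made for the sake of the example.

```python
import re

# A minimal scanner sketch: maps a character stream into
# <token-type, lexeme> pairs and eliminates white space.
TOKEN_SPEC = [
    ("number", r"\d+"),
    ("id",     r"[A-Za-z_]\w*"),
    ("op",     r"[+\-=;]"),
    ("ws",     r"\s+"),        # recognized, but never emitted
]
TOKEN_RE = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))

def scan(source):
    tokens = []
    for m in TOKEN_RE.finditer(source):
        if m.lastgroup != "ws":           # the scanner eliminates white space
            tokens.append((m.lastgroup, m.group()))
    return tokens

print(scan("x = x + y ;"))
# [('id', 'x'), ('op', '='), ('id', 'x'), ('op', '+'), ('id', 'y'), ('op', ';')]
```

The output is exactly the <id,x> = <id,x> + <id,y> ; stream of pairs from the slide, with the token type standing in for the part of speech.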
The Parser:
- Recognizes context-free syntax & reports errors
- Guides context-sensitive ("semantic") analysis (type checking)
- Builds an IR for the source program
- Hand-coded parsers are fairly easy to build
- Most books advocate using automatic parser generators

Context-free syntax is specified with a grammar:

    SheepNoise -> SheepNoise baa
               |  baa

This grammar defines the set of noises that a sheep makes under normal circumstances. It is written in a variant of Backus-Naur Form (BNF).

Formally, a grammar G = (S, N, T, P):
- S is the start symbol
- N is a set of non-terminal symbols
- T is a set of terminal symbols or words
- P is a set of productions or rewrite rules (P : N -> (N ∪ T)*)

Context-free syntax can be put to better use:

    1. goal -> expr
    2. expr -> expr op term
    3.       |  term
    4. term -> number
    5.       |  id
    6. op   -> +
    7.       |  -

    S = goal
    T = { number, id, +, - }
    N = { goal, expr, term, op }
    P = { 1, 2, 3, 4, 5, 6, 7 }

This grammar defines simple expressions with addition & subtraction over number and id. This grammar, like many, falls in a class called context-free grammars, abbreviated CFG.

Given a CFG, we can derive sentences by repeated substitution:

    Production   Result
                 goal
        1        expr
        2        expr op term
        5        expr op y
        7        expr - y
        2        expr op term - y
        4        expr op 2 - y
        6        expr + 2 - y
        3        term + 2 - y
        5        x + 2 - y

To recognize a valid sentence in some CFG, we reverse this process and build up a parse.

A parse can be represented by a tree (parse tree or syntax tree). For x + 2 - y, the parse tree expands goal through expr, op, and term nodes all the way down to the leaves <id,x>, <number,2>, and <id,y>; it contains a lot of unneeded information.

Compilers often use an abstract syntax tree (AST) instead: a "-" node whose left child is a "+" node over <id,x> and <number,2>, and whose right child is <id,y>. The AST summarizes grammatical structure without including detail about the derivation, and is much more concise. ASTs are one kind of intermediate representation (IR).
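A hand-coded parser for this grammar really is easy to build, as the slide claims. The sketch below is illustrative, not from the course: it rewrites the left-recursive rule expr -> expr op term as the iteration expr -> term (op term)* so that recursive descent terminates, and it builds a tuple-based AST directly.

```python
def parse(tokens):
    """tokens: a list of (type, lexeme) pairs, e.g. from a scanner.
    Returns an AST: leaves are (kind, lexeme), interior nodes are
    (op, left, right), mirroring the slide's tree for x + 2 - y."""
    pos = 0

    def term():
        nonlocal pos
        kind, lexeme = tokens[pos]       # term -> number | id
        assert kind in ("number", "id"), f"unexpected token {lexeme!r}"
        pos += 1
        return (kind, lexeme)

    # expr -> term (op term)*, left-associative like the derivation
    node = term()
    while pos < len(tokens) and tokens[pos][1] in ("+", "-"):
        op = tokens[pos][1]              # op -> + | -
        pos += 1
        node = (op, node, term())
    return node

ast = parse([("id", "x"), ("op", "+"), ("number", "2"),
             ("op", "-"), ("id", "y")])
print(ast)
# ('-', ('+', ('id', 'x'), ('number', '2')), ('id', 'y'))
```

Note that the AST has "-" at the root and "+" below it, matching the parse the derivation above produces for x + 2 - y.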
The Back End

Responsibilities:
- Translate the IR into target machine code
- Choose instructions to implement each IR operation
- Decide which values to keep in registers
- Ensure conformance with system interfaces
- Automation has been less successful in the back end

Instruction Selection:
- Produce fast, compact code
- Take advantage of target features such as addressing modes
- Usually viewed as a pattern-matching problem (ad hoc methods, pattern matching, dynamic programming)
- This was "the problem of the future" in 1978, spurred by the transition from the PDP-11 to the VAX-11; the orthogonality of RISC simplified this problem

Register Allocation:
- Have each value in a register when it is used
- Manage a limited set of resources
- Can change instruction choices & insert LOADs & STOREs
- Optimal allocation is NP-complete (1 or k registers)
- Compilers approximate solutions to NP-complete problems

Instruction Scheduling:
- Avoid hardware stalls and interlocks
- Use all functional units productively
- Can increase the lifetime of variables (changing the allocation)
- Optimal scheduling is NP-complete in nearly all cases
- Heuristic techniques are well developed

Traditional Three-pass Compiler

(source code -> front end -> IR -> middle end -> IR -> back end -> machine code)

The Optimizer (or Middle End)

(IR -> Opt 1 -> Opt 2 -> Opt 3 -> ... -> Opt n -> IR)

Code Improvement (or Optimization):
- Analyzes the IR and rewrites (or transforms) it
- Primary goal is to reduce the running time of the compiled code
- May also improve space, power consumption, ...
- Must preserve the "meaning" of the code, measured by the values of named variables
- Modern optimizers are structured as a series of passes

Typical Transformations:
- Discover & propagate some constant value
- Move a computation to a less frequently executed place
- Specialize some computation based on context
- Discover a redundant computation & remove it
- Remove useless or unreachable code
- Encode an idiom in some particularly efficient form
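The first and fourth transformations in the list, propagating constants and removing redundant computations, can be illustrated with a tiny constant-folding pass over tuple ASTs like the one built by the front end. This is a hedged sketch, not any optimizer's actual code.

```python
def fold(node):
    """Constant-fold an AST of ('+'/'-', left, right) nodes with
    ('number', lexeme) and ('id', lexeme) leaves."""
    if node[0] in ("+", "-"):
        op, l, r = node
        l, r = fold(l), fold(r)
        if l[0] == "number" and r[0] == "number":
            value = int(l[1]) + int(r[1]) if op == "+" else int(l[1]) - int(r[1])
            return ("number", str(value))    # redundant run-time work removed
        return (op, l, r)
    return node                              # leaf: nothing to fold

# (2 + 3) - y folds to 5 - y; the "meaning" of the code, the values
# of named variables, is preserved.
print(fold(("-", ("+", ("number", "2"), ("number", "3")), ("id", "y"))))
# ('-', ('number', '5'), ('id', 'y'))
```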
Example: Optimization of Subscript Expressions in Fortran

    Address(A(I,J)) = address(A(0,0)) + J * (column size) + I

Does the user realize a multiplication is generated here?

    DO I = 1, M
      A(I,J) = A(I,J) + C
    ENDDO

The optimizer moves the loop-invariant part of the address computation out of the loop and reduces the rest to an addition:

    compute addr(A(0,J))
    DO I = 1, M
      add 1 to get addr(A(I,J))
      A(I,J) = A(I,J) + C
    ENDDO

Modern Restructuring Compiler

(source code -> front end -> HL AST -> restructurer -> HL AST -> opt + code gen -> machine code)

Typical Restructuring Transformations:
- Blocking for memory hierarchy and register reuse
- Vectorization
- Parallelization
- All based on dependence
- Also full and partial inlining

Role of the Run-Time System:
- Memory management services
  - Allocate, in the heap or in an activation record (stack frame)
  - Deallocate
  - Collect garbage
- Run-time type checking
- Error processing
- Interface to the operating system: input and output
- Support of parallelism: parallel thread initiation, communication and synchronization

1957: The FORTRAN Automatic Coding System

(front end -> index optimization -> code merge -> bookkeeping -> flow analysis -> register allocation -> final assembly)

- Six passes in a fixed order
- Generated good code; assumed unlimited index registers
- Code motion out of loops, with ifs and gotos
- Did flow analysis & register allocation
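The subscript-expression rewrite shown earlier can be checked with a small sketch. The function names are illustrative, and the element size is 1, matching the "add 1" step on the slide: the loop-invariant part, addr(A(0,J)), is computed once, and the per-iteration multiplication becomes an addition.

```python
def addresses_naive(base, colsize, J, M):
    # One multiplication per iteration, as in
    # address(A(I,J)) = address(A(0,0)) + J * (column size) + I
    return [base + J * colsize + I for I in range(1, M + 1)]

def addresses_reduced(base, colsize, J, M):
    # Compute addr(A(0,J)) once, then add 1 each trip around the loop
    addr = base + J * colsize          # addr(A(0,J)), loop-invariant
    out = []
    for _ in range(1, M + 1):
        addr += 1                      # add 1 to get addr(A(I,J))
        out.append(addr)
    return out

# Both versions visit the same address sequence
assert addresses_naive(1000, 100, 3, 5) == addresses_reduced(1000, 100, 3, 5)
```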
1969: IBM's FORTRAN H Compiler

(scan & parse -> build CFG & DOM -> find busy vars -> CSE -> loop-invariant code motion -> copy elimination -> OSR -> re-association (consts) -> register allocation -> final assembly)

- Used a low-level IR (quads), identified loops with dominators
- Focused on optimizing loops ("inside out" order)
- Passes are familiar today
- Simple front end, simple back end for the IBM 370

1975: BLISS-11 Compiler (Wulf et al., CMU)

(Lex-Syn-Flo -> Delay -> TLA -> Rank -> Pack -> Final, with register allocation handled by the middle passes)

- The great compiler for the PDP-11
- Seven passes in a fixed order
- Focused on code shape & instruction selection
- LexSynFlo did preliminary flow analysis
- Final included a grab-bag of peephole optimizations
- Basis for early VAX & Tartan Labs compilers

1980: IBM's PL.8 Compiler

- Many passes, one front end, several back ends
- The middle end is a collection of 10 or more passes; some passes and analyses are repeated
- Represents complex operations at 2 levels, below the machine-level IR
- Passes include: dead code elimination, CSE, code motion, constant folding, strength reduction, value numbering, dead store elimination, code straightening, trap elimination, algebraic reassociation
- Multi-level IR has become common wisdom

1986: HP's PA-RISC Compiler

- Several front ends, an optimizer, and a back end
- Four fixed-order choices for optimization (9 passes)
- Coloring allocator, instruction scheduler, peephole optimizer

1999: The SUIF Compiler System

(front ends, including Fortran 77 and C/C++ -> middle end -> back ends for Alpha and x86)

- Another classically-built compiler
- 3 front ends, 3 back ends
- 18 passes, configurable order
- Two-level IR (High SUIF, Low SUIF)
- Intended as research infrastructure
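Several of the passes named above, CSE in FORTRAN H and value numbering in PL.8, rest on the same idea: detect that two operations compute the same value. The following is a minimal local value numbering sketch over three-address quads; the quad format and helper names are assumptions for the example, and it assumes each result name is fresh (redefinitions are not handled).

```python
def value_number(block):
    """block: list of (op, arg1, arg2, result) quads in one basic block.
    Rewrites a redundant quad as a copy from the variable that already
    holds its value."""
    vn = {}          # variable name or (op, vn, vn) key -> value number
    holder = {}      # value number -> a variable holding that value
    next_vn = 0
    out = []

    def number_of(name):
        nonlocal next_vn
        if name not in vn:
            vn[name] = next_vn
            next_vn += 1
        return vn[name]

    for op, a, b, result in block:
        key = (op, number_of(a), number_of(b))
        if key in vn:                    # same op on same values: redundant
            out.append(("copy", holder[vn[key]], None, result))
            vn[result] = vn[key]
        else:
            vn[key] = next_vn
            next_vn += 1
            vn[result] = vn[key]
            holder[vn[key]] = result
            out.append((op, a, b, result))
    return out

block = [("add", "x", "y", "t1"),
         ("add", "x", "y", "t2")]        # same expression recomputed
print(value_number(block))
# [('add', 'x', 'y', 't1'), ('copy', 't1', None, 't2')]
```

Real compilers extend this with commutativity, constant folding, and scope beyond a single block, but the core table-lookup idea is the same.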
1999: The SUIF Compiler System (continued)

Scalar and back-end passes:
- SSA construction
- Dead code elimination
- Partial redundancy elimination
- Constant propagation
- Value numbering
- Strength reduction
- Reassociation
- Instruction scheduling
- Register allocation

Parallelization and memory-hierarchy passes:
- Data dependence analysis
- Scalar & array privatization
- Reduction recognition
- Pointer analysis
- Affine loop transformations
- Blocking

Object-oriented passes:
- Capturing object definitions
- Virtual function call elimination
- Garbage collection

Another Modern Compiler

- 3 front ends, 1 back end
- Five levels of IR
- Interprocedural optimization:
  - Classic analysis
  - Inlining (user & library code)
  - Cloning (constants & locality)
  - Dead function elimination
  - Dead variable elimination
- Loop optimization:
  - Dependence analysis
  - Parallelization transformations (fission, fusion, interchange, peeling, tiling, unroll & jam)
  - Array privatization
- Global (SSA-based) analysis & optimization:
  - Constant propagation, PRE, OSR+LFTR, DVNT, DCE (also used by other phases)
- Code generation:
  - If conversion & predication
  - Code motion and instruction scheduling (including software pipelining)
  - Peephole optimization
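Blocking (tiling), listed among the loop transformations here, can be shown on a matrix transpose: the i/j iteration space is walked in B x B tiles so that a block of data is reused while it is still in cache. The function and tile size below are illustrative, not taken from SUIF or any other compiler.

```python
def transpose_blocked(A, n, B=4):
    """Transpose an n x n matrix (list of lists) tile by tile."""
    T = [[0] * n for _ in range(n)]
    for ii in range(0, n, B):                      # tile loops
        for jj in range(0, n, B):
            for i in range(ii, min(ii + B, n)):    # intra-tile loops
                for j in range(jj, min(jj + B, n)):
                    T[j][i] = A[i][j]
    return T

n = 6
A = [[i * n + j for j in range(n)] for i in range(n)]
T = transpose_blocked(A, n)
assert all(T[j][i] == A[i][j] for i in range(n) for j in range(n))
```

The result is identical to an untiled transpose; only the order of the memory accesses changes, which is exactly why the legality of such transformations rests on dependence analysis.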
Summary

Overview of a compiler's tasks:
- Basic translation from high-level code to machine-level code
- Structure of a (classical) compiler: front end, middle end, back end, plus the run-time environment
- Traditional three-phase structure
- Static vs. dynamic compilation

Even a 2000-era JIT fits the mold, albeit with fewer passes (byte code -> native code):
- Classical compiler tasks are handled elsewhere
- Few (if any) optimizations; avoid expensive analysis
- Emphasis on generating native code
- Compilation must be profitable