Today s objectives. CSE401: Introduction to Compiler Construction. What is a compiler? Administrative Details. Why study compilers?

Similar documents
COP4020 Programming Languages. Compilers and Interpreters Prof. Robert van Engelen

CS 11 C track: lecture 1

Python Programming: An Introduction to Computer Science

Pseudocode ( 1.1) Analysis of Algorithms. Primitive Operations. Pseudocode Details. Running Time ( 1.1) Estimating performance

COP4020 Programming Languages. Functional Programming Prof. Robert van Engelen

Chapter 1. Introduction to Computers and C++ Programming. Copyright 2015 Pearson Education, Ltd.. All rights reserved.

n Some thoughts on software development n The idea of a calculator n Using a grammar n Expression evaluation n Program organization n Analysis

Data Structures and Algorithms. Analysis of Algorithms

Chapter 5. Functions for All Subtasks. Copyright 2015 Pearson Education, Ltd.. All rights reserved.

Solutions to Final COMS W4115 Programming Languages and Translators Monday, May 4, :10-5:25pm, 309 Havemeyer

Running Time. Analysis of Algorithms. Experimental Studies. Limitations of Experiments

Chapter 4. Procedural Abstraction and Functions That Return a Value. Copyright 2015 Pearson Education, Ltd.. All rights reserved.

Last class. n Scheme. n Equality testing. n eq? vs. equal? n Higher-order functions. n map, foldr, foldl. n Tail recursion

Running Time ( 3.1) Analysis of Algorithms. Experimental Studies. Limitations of Experiments

Analysis of Algorithms

Appendix D. Controller Implementation

Chapter 9. Pointers and Dynamic Arrays. Copyright 2015 Pearson Education, Ltd.. All rights reserved.

CMPT 125 Assignment 2 Solutions

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Python Programming: An Introduction to Computer Science

COSC 1P03. Ch 7 Recursion. Introduction to Data Structures 8.1

Chapter 10. Defining Classes. Copyright 2015 Pearson Education, Ltd.. All rights reserved.

1 Enterprise Modeler

From last week. Lecture 5. Outline. Principles of programming languages

Chapter 11. Friends, Overloaded Operators, and Arrays in Classes. Copyright 2014 Pearson Addison-Wesley. All rights reserved.

implement language system

Abstract. Chapter 4 Computation. Overview 8/13/18. Bjarne Stroustrup Note:

Classes and Objects. Again: Distance between points within the first quadrant. José Valente de Oliveira 4-1

Goals of this Lecture Activity Diagram Example

CSC 220: Computer Organization Unit 11 Basic Computer Organization and Design

Outline and Reading. Analysis of Algorithms. Running Time. Experimental Studies. Limitations of Experiments. Theoretical Analysis

Analysis Metrics. Intro to Algorithm Analysis. Slides. 12. Alg Analysis. 12. Alg Analysis

Lecture 1: Introduction and Strassen s Algorithm

EE 459/500 HDL Based Digital Design with Programmable Logic. Lecture 13 Control and Sequencing: Hardwired and Microprogrammed Control

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

University of Waterloo Department of Electrical and Computer Engineering ECE 250 Algorithms and Data Structures

top() Applications of Stacks

Abstract Syntax Trees. AST Data Structure. Visitor Interface. Accept methods. Visitor Methodology for AST Traversal CS412/CS413

CIS 121 Data Structures and Algorithms with Java Spring Stacks, Queues, and Heaps Monday, February 18 / Tuesday, February 19

EE University of Minnesota. Midterm Exam #1. Prof. Matthew O'Keefe TA: Eric Seppanen. Department of Electrical and Computer Engineering

Τεχνολογία Λογισμικού

Elementary Educational Computer

Analysis of Algorithms

Analysis of Algorithms

COMP Parallel Computing. PRAM (1): The PRAM model and complexity measures

n Theory and practice of parsing n Underlying language theory (CFGs,...) n Top-down parsing (and be able to do it)

Chapter 4 The Datapath

Behavioral Modeling in Verilog

CS 111 Green: Program Design I Lecture 27: Speed (cont.); parting thoughts

Objectives: parsing lectures. CSE401: Parsing. Parsing: two jobs. Parsing. CFG terminology. Context-free grammars (CFGs) Understand:

Code Review Defects. Authors: Mika V. Mäntylä and Casper Lassenius Original version: 4 Sep, 2007 Made available online: 24 April, 2013

COP4020 Programming Languages. Subroutines and Parameter Passing Prof. Robert van Engelen

CSC165H1 Worksheet: Tutorial 8 Algorithm analysis (SOLUTIONS)

9.1. Sequences and Series. Sequences. What you should learn. Why you should learn it. Definition of Sequence

Chapter 2. C++ Basics. Copyright 2015 Pearson Education, Ltd.. All rights reserved.

CMSC Computer Architecture Lecture 12: Virtual Memory. Prof. Yanjing Li University of Chicago

Chapter 5: Processor Design Advanced Topics. Microprogramming: Basic Idea

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Basic allocator mechanisms The course that gives CMU its Zip! Memory Management II: Dynamic Storage Allocation Mar 6, 2000.

Last Class. Announcements. Lecture Outline. Types. Structural Equivalence. Type Equivalence. Read: Scott, Chapters 7 and 8. T2 y; x = y; n Types

Goals of the Lecture Object Constraint Language

A SOFTWARE MODEL FOR THE MULTILAYER PERCEPTRON

Programming with Shared Memory PART II. HPC Spring 2017 Prof. Robert van Engelen

EE260: Digital Design, Spring /16/18. n Example: m 0 (=x 1 x 2 ) is adjacent to m 1 (=x 1 x 2 ) and m 2 (=x 1 x 2 ) but NOT m 3 (=x 1 x 2 )

Recursion. Computer Science S-111 Harvard University David G. Sullivan, Ph.D. Review: Method Frames

Chapter 8. Strings and Vectors. Copyright 2014 Pearson Addison-Wesley. All rights reserved.

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor. Part A Datapath Design

Outline. Research Definition. Motivation. Foundation of Reverse Engineering. Dynamic Analysis and Design Pattern Detection in Java Programs

Math 10C Long Range Plans

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Lecture 5. Counting Sort / Radix Sort

Lecture 1. Topics. Principles of programming languages (2007) Lecture 1. What makes programming languages such an interesting subject?

Task scenarios Outline. Scenarios in Knowledge Extraction. Proposed Framework for Scenario to Design Diagram Transformation

CS 111: Program Design I Lecture # 7: First Loop, Web Crawler, Functions

Chapter 8. Strings and Vectors. Copyright 2015 Pearson Education, Ltd.. All rights reserved.

Exercise 6 (Week 42) For the foreign students only.

Matrix representation of a solution of a combinatorial problem of the group theory

n Haskell n Syntax n Lazy evaluation n Static typing and type inference n Algebraic data types n Pattern matching n Type classes

Structuring Redundancy for Fault Tolerance. CSE 598D: Fault Tolerant Software

CSE 417: Algorithms and Computational Complexity

. Written in factored form it is easy to see that the roots are 2, 2, i,

Lower Bounds for Sorting

CIS 121 Data Structures and Algorithms with Java Spring Stacks and Queues Monday, February 12 / Tuesday, February 13

Contextual Analysis. Overview of Lecture 3. Ch 4 Syntactic Analysis. Mededelingen

What are we going to learn? CSC Data Structures Analysis of Algorithms. Overview. Algorithm, and Inputs

Software development of components for complex signal analysis on the example of adaptive recursive estimation methods.

Solution printed. Do not start the test until instructed to do so! CS 2604 Data Structures Midterm Spring, Instructions:

CS553 Lecture Reuse Optimization: Common SubExpr Elim 3

CMSC Computer Architecture Lecture 3: ISA and Introduction to Microarchitecture. Prof. Yanjing Li University of Chicago

NTH, GEOMETRIC, AND TELESCOPING TEST

n Maurice Wilkes, 1949 n Organize software to minimize errors. n Eliminate most of the errors we made anyway.

Lecture 5: Recursion. Recursion Overview. Recursion is a powerful technique for specifying funclons, sets, and programs

Chapter 3 DB-Gateways

CS : Programming for Non-Majors, Summer 2007 Programming Project #3: Two Little Calculations Due by 12:00pm (noon) Wednesday June

Morgan Kaufmann Publishers 26 February, COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 5

High-Order Language APPLICATION LEVEL HIGH-ORDER LANGUAGE LEVEL ASSEMBLY LEVEL OPERATING SYSTEM LEVEL INSTRUCTION SET ARCHITECTURE LEVEL

CS211 Fall 2003 Prelim 2 Solutions and Grading Guide

Module Instantiation. Finite State Machines. Two Types of FSMs. Finite State Machines. Given submodule mux32two: Instantiation of mux32two

Threads and Concurrency in Java: Part 1

CS 111: Program Design I Lecture 15: Objects, Pandas, Modules. Robert H. Sloan & Richard Warner University of Illinois at Chicago October 13, 2016

Transcription:

CSE401: Itroductio to Compiler Costructio Larry Ruzzo Sprig 2004 Today s objectives Admiistrative details Defie compilers ad why we study them Defie the high-level structure of compilers Associate specific tasks, theories, ad techologies with achievig the differet structural elemets of a compiler Ad build some iitial ituitio about why these are eeded Slides by Chambers, Eggers, Notki, Ruzzo, ad others W.L.Ruzzo & UW CSE 1994-2004 Admiistrative Details Course Web: http://www.cs.washigto.edu/401 Gradig Homeworks ~20% Project ~40% Midterm ~15% Fial ~25% Project: toy compiler a (almost) real oe. Staged. Optioal teams of 2-3 people. What is a compiler? Compiler Source Code Executable Code A software tool that traslates a program i source code form to a equivalet program i a executable (target) form Coverts from a form good for people to a form good for computers Examples Why study compilers? Source laguages Java C C++ LISP ML COBOL Target architectures MIPS x86 SPARC Alpha C 1

CSE401 s project-orieted approach More o the project Start with a compiler for PL/0, writte i C++ We defie additioal laguage features Such as commets, arrays, call-by-referece parameters, result-returig procedures, for loops, etc. You modify the compiler to traslate the exteded PL/0 laguage Project completed i well-defied stages Strogly recommeded that you work i twoperso teams for the quarter Gradig based o correctess clarity of desig ad implemetatio quality of testig Provides experiece with object-orieted desig ad with C++ Provides experiece with workig i a team What's hard about compilig Example I will preset a small program to you, character by character Idetify problems that you ca see that you will ecouter i compilig this program Here s a example problem Whe we see a character 1 followed by a character 7, we have to covert it to the iteger 17. 1. i 2. 3. t 4. _ 5. i 6. ; 7. _ 8. i 9. : 10. = 11. 1 12. 7 13. _ 14. ; 15. p 16. r 17. i 18. 19. t 20. ( 21. i 22. * 23. i 24. + 25. 2 26. ) 27. ; _ is the space character This is ot a PL/0 program! Structure of compilers A commo compiler structure has bee defied Years ad years of deep, difficult research itermixed with buildig of thousads of compilers Actual compilers ofte differ from this prototype Primary differeces are the orderig ad clarity with which the pieces are actually separated But the model is still extremely useful You will see the structure to a large degree i the PL/0 compiler Prototype compiler structure Source Stream of characters Sequece of tokes Abstract Sytax Tree (AST) AST+ ad Lexical aalysis Sytactic aalysis Sematic aalysis Storage layout Executable code Itermediate represetatio Code geeratio Optimizatio Target Itermediate represetatio Itermediate code geeratio AST++ ad 2

Source stream of characters sequece of tokes abstract sytax tree (AST) AST+ ad Lexical aalysis Sytactic aalysis Sematic aalysis Storage layout Executable code Itermediate represetatio Itermediate represetatio AST++ ad Target Code geeratio Optimizatio Itermediate code geeratio Frot- ad back-ed. These parts are ofte lumped ito two categories The frot-ed Focuses o (repeated) aalysis Determies what the program is The back-ed Focuses o sythesis Produces target program equivalet to source program A example compilatio module mai; var x:it, result: it; procedure square(:it); begi result := *; ed square; begi x := iput; while x <> 0 do square(x); output := result; x := iput; ed; ed mai. A real PL/0 program We ll step through Lexical aalysis Sytactic aalysis Sematic aalysis Storage layout Code geeratio Lexical aalysis (AKA scaig ad tokeizig) Sytactic aalysis (AKA parsig) Read i characters ad clump them ito tokes Idet Also strip out white space ad commets Specify tokes with regular Letter expressios Use fiite state machies to Digit sca Remember the coectio betwee regular expressios ad fiite state machies ::= Letter AlphaNum* Iteger ::= Digit+ AlphaNum ::= Letter Digit ::= a z A Z ::= 0 9 E.g.: While x <> 0 do keywd id op it keywd Tur toke stream ito tree based o the program s sytactic structure Defie sytax usig cotext free grammar (CFG) EBNF is a commo otatio for defiig cocrete sytax Cares about semi-colos, pares, ad such Parser usually costructs AST represetig abstract sytax Cares about statemet structures, precedece ad such Stmt ::= Astmt IfStmt Astmt ::= Lvalue := Expr ; Lvalue ::= Id IfStmt ::= if Test the Stmt [else Stmt] ; Test ::= Expr = Expr Expr < Expr Expr ::= Term + Term Term Term Term Term ::= Factor * Factor Factor Factor ::= - Factor Id It ( Expr ) Lvalue Sytactic aalysis example Stmt Astmt Fact Expr Term Fact Id := Id * Id ; result := * ; Stmt ::= Astmt IfStmt Astmt ::= Lvalue := Expr ; Lvalue ::= Id IfStmt ::= if Test the Stmt [else Stmt] ; Test ::= Expr = Expr Expr < Expr Expr ::= Term + Term Term Term Term Term ::= Factor * Factor Factor Factor ::= - Factor Id It ( Expr ) Sematic aalysis (Name resolutio ad type checkig) Give AST figure out what declaratio each ame refers to perform static cosistecy checks Key data structure: maps ames to iformatio about ame derived from declaratio Sematic aalysis steps Process each scope, top dow Process declaratios i each scope ito for scope Process body of each scope i cotext of 3

Sematic aalysis example Storage layout it x; it y(void); it mai(void) { double x,y; x = x + 5; pritf("x is %d",x); x = y(); retur 1/2 ; } Which var with which decl? What type? Operators legal o those types? Coercio? Fuctio arg & retur types too? Overloadig? Goto/case labels uique? Give, determie how ad where variables will be stored at rutime What represetatio is used for each kid of data? How much space does each variable require? I what kid of memory should it be placed? static, global memory stack memory heap memory Where i memory should it be placed? e.g., what stack offset? Storage layout example Code geeratio it x; it y(void); it mai(void) { double x,y; x = x + 5; pritf("x is %d",x); x = y(); retur 1/2 ; } Outer x: 4 bytes, static Ier x,y: 8 bytes each o stack What address? How does pritf fid its parameters? How does mai retur a value? Give aotated AST ad, produce target code Ofte doe as three steps Produce machie-idepedet low-level represetatio of the program (itermediate represetatio or IR) Perform machie-idepedet optimizatios (optioal) Traslate IR ito machie-specific target istructios Istructio selectio Register allocatio Codege example x = x + y; t42 fl x lw $2, 48($fp) t43 fl y lw $3, 52($fp) t44 fl t42 + t43 add $2, $2, $3 x fl t44 x = x * 2;t45 fl x lw $2, 48($fp) t46 fl 2 li $3, 2 t47 fl t45 * t46 mul $2, $2, $3 x fl t47 x += y; t48 fl x lw $2, 48($fp) t49 fl y lw $3, 52($fp) t50 fl t48 + t49 add $2, $2, $3 x fl t50 Optimizatio Ca you see simple chages that would streamlie the code above? How could you fid them automatically? 4

Does this structure work well? Compilers vs. iterpreters FORTRAN I Compiler (circa 1954-56) 18 perso years PL/0 Compiler By the ed of the quarter, you'll have a workig compiler that's way better tha FORTRAN I i most respects (key exceptio: optimizatio) Compilers implemet laguages by traslatio Iterpreters implemet laguages directly Note: the lie is ot always crystal-clear Compilers ad iterpreters have tradeoffs Executio speed of program Start-up overhead, tur-aroud time Ease of implemetatio mig eviromet facilities Coceptual clarity Compiler egieerig issues Portability Ideal is multiple frot-eds ad multiple back-eds with a shared itermediate laguage Sequecig phases of compilatio Stream-based vs. sytax-directed Multiple, separate passes vs. fewer, itegrated passes How to avoid compiler bugs? Objectives: ext lecture Defie overall theory ad practical structure of lexical aalysis Briefly recap regular expressios, fiite state machies, ad their relatioship Eve briefer recap of the laguage hierarchy Show how to defie tokes with regular expressios Show how to leverage this style of toke defiitio i implemetig a lexer 5