CSEP 501 Compilers. Overview and Administrivia Hal Perkins Winter /8/ Hal Perkins & UW CSE A-1

Similar documents
CSE 401/M501 Compilers

CSE 413 Programming Languages & Implementation. Hal Perkins Autumn 2012 Grammars, Scanners & Regular Expressions

CS415 Compilers Overview of the Course. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

CSE 413 Programming Languages & Implementation. Hal Perkins Winter 2019 Grammars, Scanners & Regular Expressions

CSEP 501 Compilers. Languages, Automata, Regular Expressions & Scanners Hal Perkins Winter /8/ Hal Perkins & UW CSE B-1

Compilers and Interpreters

CSE P 501 Compilers. Parsing & Context-Free Grammars Hal Perkins Spring UW CSE P 501 Spring 2018 C-1

CS415 Compilers Overview of the Course. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

High-level View of a Compiler

Regular Expressions. Agenda for Today. Grammar for a Tiny Language. Programming Language Specifications

CSE 582 Autumn 2002 Exam 11/26/02

Compilers Crash Course

Compiler Construction LECTURE # 1

CS 4120 and 5120 are really the same course. CS 4121 (5121) is required! Outline CS 4120 / 4121 CS 5120/ = 5 & 0 = 1. Course Information

Compilers and Code Optimization EDOARDO FUSELLA

CSE 504: Compiler Design

Compiler Design. Dr. Chengwei Lei CEECS California State University, Bakersfield

COMP 181 Compilers. Administrative. Last time. Prelude. Compilation strategy. Translation strategy. Lecture 2 Overview

Compiling Regular Expressions COMP360

CS 415 Midterm Exam Spring 2002

Compiling Techniques

CSE 401 Compilers. Overview and Administrivia Hal Perkins Winter UW CSE 401 Winter 2017 A-1

CSE4305: Compilers for Algorithmic Languages CSE5317: Design and Construction of Compilers

Compilers. Computer Science 431

Compiler Theory Introduction and Course Outline Sandro Spina Department of Computer Science

CS 406/534 Compiler Construction Putting It All Together

CSE P 501 Compilers. Intermediate Representations Hal Perkins Spring UW CSE P 501 Spring 2018 G-1

INTRODUCTION PRINCIPLES OF PROGRAMMING LANGUAGES. Norbert Zeh Winter Dalhousie University 1/10

Compiling and Interpreting Programming. Overview of Compilers and Interpreters

Compiler Design (40-414)

Languages and Automata

Why are there so many programming languages? Why do we have programming languages? What is a language for? What makes a language successful?

Crafting a Compiler with C (II) Compiler V. S. Interpreter

Languages, Automata, Regular Expressions & Scanners. Winter /8/ Hal Perkins & UW CSE B-1

CSE 401/M501 Compilers

CS Compiler Construction West Virginia fall semester 2014 August 18, 2014 syllabus 1.0

Static Semantics. Winter /3/ Hal Perkins & UW CSE I-1

The View from 35,000 Feet

CST-402(T): Language Processors

COP4020 Programming Languages. Compilers and Interpreters Robert van Engelen & Chris Lacher

CSE 582 Autumn 2002 Exam Sample Solution

Front End. Hwansoo Han

CA Compiler Construction

Working of the Compilers

General Course Information. Catalogue Description. Objectives

Compilers. Prerequisites

CSCI 565 Compiler Design and Implementation Spring 2014

Lexical Scanning COMP360

ECE573 Introduction to Compilers & Translators

CS 432 Fall Mike Lam, Professor. Compilers. Advanced Systems Elective

What is a compiler? Xiaokang Qiu Purdue University. August 21, 2017 ECE 573

Compiler Construction

Compilers. Lecture 2 Overview. (original slides by Sam

What is a compiler? var a var b mov 3 a mov 4 r1 cmpi a r1 jge l_e mov 2 b jmp l_d l_e: mov 3 b l_d: ;done

9/5/17. The Design and Implementation of Programming Languages. Compilation. Interpretation. Compilation vs. Interpretation. Hybrid Implementation

CS 415 Midterm Exam Spring SOLUTION

Structure of a compiler. More detailed overview of compiler front end. Today we ll take a quick look at typical parts of a compiler.

Introduction to Compiler

CSE4305: Compilers for Algorithmic Languages CSE5317: Design and Construction of Compilers

Overview of the Class

CSE4305: Compilers for Algorithmic Languages CSE5317: Design and Construction of Compilers

The Front End. The purpose of the front end is to deal with the input language. Perform a membership test: code source language?

CS5363 Final Review. cs5363 1

! Broaden your language horizons! Different programming languages! Different language features and tradeoffs. ! Study how languages are implemented

Semantic Analysis. Lecture 9. February 7, 2018

CS 132 Compiler Construction

CS153: Compilers Lecture 1: Introduction

15-411/ Compiler Design

CS 4240: Compilers and Interpreters Project Phase 1: Scanner and Parser Due Date: October 4 th 2015 (11:59 pm) (via T-square)

Overview of the Class

Early computers (1940s) cost millions of dollars and were programmed in machine language. less error-prone method needed

LECTURE NOTES ON COMPILER DESIGN P a g e 2

Compiler construction

CS202 Compiler Construction. Christian Skalka. Course prerequisites. Solid programming skills a must.

PLAGIARISM. Administrivia. Course home page: Introduction to Programming Languages and Compilers

Introduction to Programming Languages and Compilers. CS164 11:00-12:00 MWF 306 Soda

Lecture 1: Course Introduction

What do Compilers Produce?

CS Lecture 2. The Front End. Lecture 2 Lexical Analysis

Theory and Compiling COMP360

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

INF5110 Compiler Construction

Welcome to CSE131b: Compiler Construction

CS415 Compilers. Lexical Analysis

Design & Implementation Overview

Software II: Principles of Programming Languages

Compilers for Modern Architectures Course Syllabus, Spring 2015

1. true / false By a compiler we mean a program that translates to code that will run natively on some machine.

CSE 401/M501 Compilers

CS 352: Compilers: Principles and Practice

When do We Run a Compiler?

Why are there so many programming languages?

G53CMP: Lecture 1. Administrative Details 2016 and Introduction to Compiler Construction. Henrik Nilsson. University of Nottingham, UK

CSE P 501 Compilers. Implementing ASTs (in Java) Hal Perkins Winter /22/ Hal Perkins & UW CSE H-1

CSE P 501 Compilers. Parsing & Context-Free Grammars Hal Perkins Winter UW CSE P 501 Winter 2016 C-1

CSE P 501 Compilers. Parsing & Context-Free Grammars Hal Perkins Winter /15/ Hal Perkins & UW CSE C-1

Compiling Techniques

Lecture 1: Course Introduction

Defining Program Syntax. Chapter Two Modern Programming Languages, 2nd ed. 1

CS426 Compiler Construction Fall 2006

Transcription:

CSEP 501 Compilers Overview and Administrivia Hal Perkins Winter 2008 1/8/2008 2002-08 Hal Perkins & UW CSE A-1

Credits Some direct ancestors of this course Cornell CS 412-3 (Teitelbaum, Perkins) Rice CS 412 (Cooper, Kennedy, Torczon) UW CSE 401 (Chambers, Ruzzo, et al) UW CSE 582 (Perkins) Many grad compiler courses Many books (Appel; Cooper/Torczon; Aho, [Lam,] Sethi, Ullman [Dragon Book]) 1/8/2008 2002-08 Hal Perkins & UW CSE A-2

Agenda Introductions What s a compiler? Administrivia 1/8/2008 2002-08 Hal Perkins & UW CSE A-3

CSEP 501 Personel Instructor: Hal Perkins CSE 548; perkins@cs Office hours: after class + drop in when you re around + appointments TA: Hao Lu hlv@cs Office hours: CSE 216, Tue 5-6:15 + appointments 1/8/2008 2002-08 Hal Perkins & UW CSE A-4

And the point is Execute this! How? int npos = 0; int k = 0; while (k < length) { if (a[k] > 0) { npos++; } } 1/8/2008 2002-08 Hal Perkins & UW CSE A-5

Interpreters & Compilers Interpreter A program that reads an source program and produces the results of executing that program Compiler A program that translates a program from one language (the source) to another (the target) 1/8/2008 2002-08 Hal Perkins & UW CSE A-6

Common Issues Compilers and interpreters both must read the input a stream of characters and understand it; analysis w h i l e ( k < l e n g t h ) { <nl> <tab> i f ( a [ k ] > 0 ) <nl> <tab> <tab>{ n P o s + + ; } <nl> <tab> } 1/8/2008 2002-08 Hal Perkins & UW CSE A-7

Interpreter Interpreter Execution engine Program execution interleaved with analysis running = true; while (running) { analyze next statement; execute that statement; } Usually need repeated analysis of statements (particularly in loops, functions) But: immediate execution, good debugging & interaction 1/8/2008 2002-08 Hal Perkins & UW CSE A-8

Compiler Read and analyze entire program Translate to semantically equivalent program in another language Presumably easier to execute or more efficient Should improve the program in some fashion Offline process Tradeoff: compile time overhead (preprocessing step) vs execution performance 1/8/2008 2002-08 Hal Perkins & UW CSE A-9

Typical Implementations Compilers FORTRAN, C, C++, Java, COBOL, etc. etc. Strong need for optimization in many cases Interpreters PERL, Python, Ruby, awk, sed, shells, Scheme/Lisp/ML, postscript/pdf, Java VM Particularly effective if interpreter overhead is low relative to execution cost of individual statements 1/8/2008 2002-08 Hal Perkins & UW CSE A-10

Hybrid approaches Well-known example: Java Compile Java source to byte codes Java Virtual Machine language (.class files) Execution Interpret byte codes directly, or Compile some or all byte codes to native code Just-In-Time compiler (JIT) detect hot spots & compile on the fly to native code standard these days Variation:.NET Compilers generate MSIL All IL compiled to native code before execution 1/8/2008 2002-08 Hal Perkins & UW CSE A-11

Why Study Compilers? (1) Become a better programmer(!) Insight into interaction between languages, compilers, and hardware Understanding of implementation techniques What is all that stuff in the debugger anyway? Better intuition about what your code does 1/8/2008 2002-08 Hal Perkins & UW CSE A-12

Why Study Compilers? (2) Compiler techniques are everywhere Parsing (little languages, interpreters, XML) Database engines, query languages AI: domain-specific languages Text processing Tex/LaTex -> dvi -> Postscript -> pdf Hardware: VHDL; model-checking tools Mathematics (Mathematica, Matlab) 1/8/2008 2002-08 Hal Perkins & UW CSE A-13

Why Study Compilers? (3) Fascinating blend of theory and engineering Direct applications of theory to practice Parsing, scanning, static analysis Some very difficult problems (NP-hard or worse) Resource allocation, optimization, etc. Need to come up with good-enough approximations/heuristics 1/8/2008 2002-08 Hal Perkins & UW CSE A-14

Why Study Compilers? (4) Ideas from many parts of CSE AI: Greedy algorithms, heuristic search Algorithms: graph algorithms, dynamic programming, approximation algorithms Theory: Grammars, DFAs and PDAs, pattern matching, fixed-point algorithms Systems: Allocation & naming, synchronization, locality Architecture: pipelines, instruction set use, memory hierarchy management 1/8/2008 2002-08 Hal Perkins & UW CSE A-15

Why Study Compilers? (5) You might even write a compiler some day! You ll almost certainly write parsers and interpreters in some context if you haven t already 1/8/2008 2002-08 Hal Perkins & UW CSE A-16

Structure of a Compiler First approximation Front end: analysis Read source program and understand its structure and meaning Back end: synthesis Generate equivalent target language program Source Front End Back End Target 1/8/2008 2002-08 Hal Perkins & UW CSE A-17

Implications Must recognize legal programs (& complain about illegal ones) Must generate correct code Must manage storage of all variables/data Must agree with OS & linker on target format Source Front End Back End Target 1/8/2008 2002-08 Hal Perkins & UW CSE A-18

More Implications Need some sort of Intermediate Representation(s) (IR) Front end maps source into IR Back end maps IR to target machine code Often multiple IRs higher level at first, lower level in later phases Source Front End Back End Target 1/8/2008 2002-08 Hal Perkins & UW CSE A-19

Front End source tokens IR Scanner Parser Split into two parts Scanner: Responsible for converting character stream to token stream Also strips out white space, comments Parser: Reads token stream; generates IR Both of these can be generated automatically Source language specified by a formal grammar Tools read the grammar and generate scanner & parser (either table-driven or hard-coded) 1/8/2008 2002-08 Hal Perkins & UW CSE A-20

Tokens Token stream: Each significant lexical chunk of the program is represented by a token Operators & Punctuation: {}[]!+-=*;: Keywords: if while return goto Identifiers: id & actual name Constants: kind & value; int, floating-point character, string, 1/8/2008 2002-08 Hal Perkins & UW CSE A-21

Scanner Example Input text // this statement does very little if (x >= y) y = 42; Token Stream IF LPAREN ID(x) GEQ ID(y) RPAREN ID(y) BECOMES INT(42) SCOLON Notes: tokens are atomic items, not character strings; comments & whitespace are not tokens (in most languages) 1/8/2008 2002-08 Hal Perkins & UW CSE A-22

Parser Output (IR) Many different forms Engineering tradeoffs have changed over time (e.g., memory is (almost) free these days) Common output from a parser is an abstract syntax tree Essential meaning of the program without the syntactic noise 1/8/2008 2002-08 Hal Perkins & UW CSE A-23

Parser Example Token Stream Input Abstract Syntax Tree IF LPAREN ID(x) ifstmt GEQ ID(y) RPAREN ID(y) BECOMES >= assign INT(42) SCOLON ID(x) ID(y) ID(y) INT(42) 1/8/2008 2002-08 Hal Perkins & UW CSE A-24

Static Semantic Analysis During or (more common) after parsing Type checking Check language requirements like proper declarations, etc. Preliminary resource allocation Collect other information needed by back end analysis and code generation 1/8/2008 2002-08 Hal Perkins & UW CSE A-25

Back End Responsibilities Translate IR into target machine code Should produce good code good = fast, compact, low power consumption (pick some) Should use machine resources effectively Registers Instructions Memory hierarchy 1/8/2008 2002-08 Hal Perkins & UW CSE A-26

Back End Structure Typically split into two major parts with sub phases Optimization code improvements Often works on lower-level IR than parser AST Code generation Instruction selection & scheduling Register allocation 1/8/2008 2002-08 Hal Perkins & UW CSE A-27

The Result Input if (x >= y) >= y = 42; ifstmt assign Output mov eax,[ebp+16] cmp eax,[ebp-8] jl L17 mov [ebp-8],42 L17: ID(x) ID(y) ID(y) INT(42) 1/8/2008 2002-08 Hal Perkins & UW CSE A-28

Some History (1) 1950 s. Existence proof FORTRAN I (1954) competitive with hand-optimized code 1960 s New languages: ALGOL, LISP, COBOL, SIMULA Formal notations for syntax, esp. BNF Fundamental implementation techniques Stack frames, recursive procedures, etc. 1/8/2008 2002-08 Hal Perkins & UW CSE A-29

Some History (2) 1970 s Syntax: formal methods for producing compiler front-ends; many theorems Late 1970 s, 1980 s New languages (functional; Smalltalk & object-oriented) New architectures (RISC machines, parallel machines, memory hierarchy issues) More attention to back-end issues 1/8/2008 2002-08 Hal Perkins & UW CSE A-30

Some History (3) 1990s and beyond Compilation techniques appearing in many new places Just-in-time compilers (JITs) Software analysis, verification, security Phased compilation blurring the lines between compile time and runtime Using machine learning techniques to control optimizations(!) Compiler technology critical to effective use of new hardware (RISC, Itanium, complex memory heirarchies) The new 800 lb gorilla - multicore 1/8/2008 2002-08 Hal Perkins & UW CSE A-31

CSEP 501 Course Project Best way to learn about compilers is to build one CSEP 501 course project: Implement an x86 compiler in Java for an object-oriented programming language MiniJava subset of Java from Appel book Includes core object-oriented parts (classes, instances, and methods, including subclasses and inheritance) Basic control structures (if, while) Integer variables and expressions 1/8/2008 2002-08 Hal Perkins & UW CSE A-32

Project Details Goal: large enough language to be interesting; small enough to be tractable Project due in phases Final result is the main thing, but timeliness and quality of intermediate work counts for something Final report & short conference at end of the course Core requirements, then open-ended Reasonably open to alternative projects; let s discuss Most likely would be a different implementation language (C#, ML, F#,?) or target (MIPS/SPIM, x86-64, ) 1/8/2008 2002-08 Hal Perkins & UW CSE A-33

Prerequisites Assume undergrad courses in: Data structures & algorithms Linked lists, dictionaries, trees, hash tables, &c Formal languages & automata Regular expressions, finite automata, context-free grammars, maybe a little parsing Machine organization Assembly-level programming for some machine (not necessarily x86) Gaps can usually be filled in But be prepared to put in extra time if needed 1/8/2008 2002-08 Hal Perkins & UW CSE A-34

Project Groups You are encouraged to work in groups of 2 or 3 Pair programming strongly encouraged Space for CVS or SVN repositories + other shared files available on UW CSE machines Use if desired; not required Mail to instructor/ta if you want this 1/8/2008 2002-08 Hal Perkins & UW CSE A-35

Programming Environments Whatever you want! But assuming you re using Java, your code should compile & run using standard Sun javac/java Generics (Java 5/6) are fine If you use C# or something else, you assume some risk of the unknown Work with other members of the class and pull together Class discussion list can be very helpful here If you re looking for a Java IDE, try Eclipse Or netbeans, or <name your favorite> javac/java + emacs for the truly hardcore 1/8/2008 2002-08 Hal Perkins & UW CSE A-36

Requirements & Grading Roughly 50% project 20% individual written homework 25% exam (Thur. evening, about 2/3 of the way through the course) 5% other Intent is to have homework submission online with graded work returned via email Will adjust as needed 1/8/2008 2002-08 Hal Perkins & UW CSE A-37

CSE 582 Administrivia 1 lecture per week Tuesday 6:30-9:20, CSE 305 + MSFT Carpools? Office Hours Perkins: after class, drop-ins Lu: Tue. 5-6:15, UW CSE 216 Also appointments Suggestions for other times/locations? 1/8/2008 2002-08 Hal Perkins & UW CSE A-38

CSEP 501 Web Everything is (or will be) at www.cs.washington.edu/csep501 Lecture slides will be on the course web by mid-afternoon before each class Printed copies available in class at UW, but you may want to read or print in advance Live video during class But do try to join us (questions, etc.) Archived video and slides from class will be posted a day or two later 1/8/2008 2002-08 Hal Perkins & UW CSE A-39

Communications Course web site Mailing list You are automatically subscribed if you are enrolled Will keep this fairly low-volume; limited to things that everyone needs to read Link is on course web page Discussion board Also linked from course web Use for anything relevant to the course let s try to build a community Can configure to have postings sent via email 1/8/2008 2002-08 Hal Perkins & UW CSE A-40

Books Three good books: Aho, Lam, Sethi, Ullman, Dragon Book, 2 nd ed (but 1 st ed is also fine) Appel, Modern Compiler Implementation in Java, 2 nd ed. Cooper & Torczon, Engineering a Compiler Dragon book is the official text, but all would work & we ll draw on all three (and more) If we put these on reserve in the engineering library, would anyone notice? 1/8/2008 2002-08 Hal Perkins & UW CSE A-41

Academic Integrity We want a cooperative group working together to do great stuff! Possibilities include bounties for first person to solve vexing problems But: you must never misrepresent work done by someone else as your own, without proper credit OK to share ideas & help each other out, but your project should ultimately be created by your group & solo homework / tests should be your own 1/8/2008 2002-08 Hal Perkins & UW CSE A-42

Any questions? Your job is to ask questions to be sure you understand what s happening and to slow me down Otherwise, I ll barrel on ahead 1/8/2008 2002-08 Hal Perkins & UW CSE A-43

Coming Attractions Review of formal grammars Lexical analysis scanning Background for first part of the project Followed by parsing Good time to read the first couple of chapters of (any of) the book(s) 1/8/2008 2002-08 Hal Perkins & UW CSE A-44