CPSC 411: Introduction to Compiler Construction
Kris De Volder
kdvolder@cs.ubc.ca
http://www.ugrad.cs.ubc.ca/~cs411/
This Lecture
- Course Organization and Policies
  - Website
  - Newsgroup
  - Assignments
    - General Information
    - Collaboration and Plagiarism
    - Assignment 1: due this Friday at midnight!
    - Assignment 2: due next Friday at midnight!
- Introduction: What is this course about?
  - Textbook
  - Course Overview
Course Website
http://ugrad.cs.ubc.ca/~cs411/
- Repository for all important course information, e.g.:
  - Assignment deadlines and starter code
  - Lecture material (if available)
  - Course policies
  - TA and instructor email and office hours
  - ...
- You are expected to read everything on the website.
- Also check the webpage regularly for updates and important news items.
Course Newsgroup / Email
- Newsgroup: ubc.courses.cpsc.411
- Setup instructions: http://www.cs.ubc.ca/local/computing/accounts/email_news.shtml
- The newsgroup is for:
  - student questions of a public nature: clarification about assignments, course material, etc. Do not use email for this purpose!
  - last-minute announcements by the TA and instructor.
- Email is appropriate if your question is of a private nature, e.g. marking issues, personal problems, etc.
Course Newsgroup / Email (continued)
- You are expected to check the newsgroup regularly.
- You are encouraged to discuss the assignments in the newsgroup.
- The TA and instructor will check the newsgroup regularly.
- You are encouraged to answer other students' questions.
- Use good judgement when posting code to the newsgroup.
  - Posting small pieces of code to explain a problem is OK.
  - But be aware that copying code for your solution may violate the course plagiarism rules!
Assignments / Project
Two kinds of assignments:
- Small warm-up exercises: get yourself set up and acquainted with the tools (Java, Eclipse, JavaCC, ...).
- Larger project: build a MiniJava compiler in different stages.
Warm-up Assignments
- The first two assignments are posted on the website.
- Deadlines:
  - Assignment 1 is due this Friday, January 8.
  - Assignment 2 is due next Friday, January 15.
  - Assignments are always due at 23:59 on the given date.
- Assignment 1 is meant to refresh your Java and get you set up for working with Eclipse.
Textbook
Modern Compiler Implementation in Java, 2nd Ed., Andrew W. Appel.
- Chapters 1 to 12 will be covered.
- Some later chapters may also be covered.
- Some extra material from other sources may be added.
- Reading the covered chapters of the textbook is mandatory.
- The course webpage will be updated to reflect the (extra) material covered.
- You are expected to take notes during lectures on any extra material, and to study the extra lecture materials posted on the website.
Assigned Reading
- You should read the textbook ahead of lecture: Modern Compiler Implementation in Java, 2nd Ed., Andrew W. Appel.
- For next lecture you should read:
  - The course website (all of it)
  - Chapter 1 of the textbook
Chapter 1: Introduction
GOAL this lecture: What is this course about? A high-level perspective.
OVERVIEW
- Levels of Programming Languages
- Language processors
- Compiler pipeline architecture
- Syntax: Grammars and ASTs
- Example: the Straightline programming language
Levels of Programming Languages
High-level program:
    class Triangle {
      ...
      float surface() { return b*h/2; }
    }
Low-level program:
    LOAD r1,b
    LOAD r2,h
    MUL r1,r2
    DIV r1,#2
    RET
Executable machine code:
    0001001001000101
    0010010011101100
    10101101001...
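For reference, a minimal runnable sketch of the high-level fragment. The fields `b` and `h`, the constructor, and the `TriangleDemo` class are assumptions; the slide only shows `surface()`.

```java
// Sketch only: completes the Triangle fragment so it compiles and runs.
// Fields, constructor and TriangleDemo are illustrative assumptions.
class Triangle {
    private final float b; // base (assumed meaning)
    private final float h; // height (assumed meaning)

    Triangle(float b, float h) {
        this.b = b;
        this.h = h;
    }

    // One expression here corresponds to the whole
    // LOAD / LOAD / MUL / DIV / RET sequence in the low-level program.
    float surface() {
        return b * h / 2;
    }
}

class TriangleDemo {
    public static void main(String[] args) {
        System.out.println(new Triangle(4f, 3f).surface()); // prints 6.0
    }
}
```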
Levels of Programming Languages
- Some high-level languages: Pascal, Java, C, C++, Ada, ...
- Some low-level languages: x86 assembly language, PowerPC assembly language, ...
Levels of Programming Languages
What makes a high-level language different from a low-level language?
Things found in HL languages but typically not in LL languages:
- Expressions
- Control structures/abstractions: while, repeat until, if then else, procedures
- Data types
  - distinguish several different types of data
  - composite data types
  - user-defined data types
- Encapsulation: modules, procedures, objects
Abstraction
- A high-level language is more abstract than a low-level language.
- More abstract? What does that mean?
- Abstraction: separate the "how" from the "what", i.e. what is implemented from how it is implemented.
  - e.g. procedural abstraction = separate what a procedure does from how it does it
- HL languages abstract away from the underlying machine => much more portable
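A small sketch of procedural abstraction in Java. The names `SumTo`, `LoopSum` and `FormulaSum` are hypothetical, chosen only for illustration: the interface states what is computed, and each class is one possible how.

```java
// "What": the sum of the integers 1..n. Callers depend only on this contract.
interface SumTo {
    int apply(int n);
}

// "How", version 1: add the numbers one by one.
class LoopSum implements SumTo {
    public int apply(int n) {
        int total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }
}

// "How", version 2: closed-form formula n(n+1)/2.
class FormulaSum implements SumTo {
    public int apply(int n) {
        return n * (n + 1) / 2;
    }
}

class AbstractionDemo {
    public static void main(String[] args) {
        SumTo[] implementations = { new LoopSum(), new FormulaSum() };
        for (SumTo sum : implementations) {
            // The caller cannot tell which "how" it is using.
            System.out.println(sum.apply(10)); // prints 55 both times
        }
    }
}
```

Swapping one implementation for the other changes nothing for the caller, which is the point of separating the what from the how.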
Levels of Programming Languages
Q: How do the following make a HL language more abstract?
- Expressions
- Control structures: while, repeat until, if then else, procedures
- Data types
- Encapsulation: modules, procedures, objects
Language Processors: What are they?
A programming language processor is any system (software or hardware) that manipulates programs.
Examples:
- Editors
- Translators (e.g. compiler, assembler, disassembler)
- Interpreters
Language Processors: Why do we need them?
- Programmer: "Compute surface area of a triangle?"
- Hardware: 0101001001...
- How to bridge the semantic gap?
[Figure: layers between programmer and hardware: Concepts and Ideas -> Java Program -> JVM Assembly code -> JVM Binary code -> JVM Interpreter -> Pentium Hardware]
Typical Compiler
A typical compiler is a pipeline of successive program transformation phases, eventually producing executable code.
Source Program -> [Lex] -> Tokens -> [Parse] -> Abstract Syntax Tree -> [Semantic Analysis] -> Decorated AST -> [Translate] -> IR Trees -> ... -> Machine Code
A tree representation of the program is passed between many of the phases.
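The pipeline idea can be sketched in a few lines of Java. This is a toy under stated assumptions, not the course's MiniJava compiler: the input language is only `num + num + ...`, the lexer just splits on whitespace, semantic analysis is skipped, and the `PUSH`/`ADD` target is a hypothetical stack machine.

```java
import java.util.ArrayList;
import java.util.List;

class PipelineSketch {
    // Phase 1 (Lex): source text -> tokens. Splitting on whitespace is a simplification.
    static List<String> lex(String source) {
        List<String> tokens = new ArrayList<>();
        for (String t : source.trim().split("\\s+")) {
            tokens.add(t);
        }
        return tokens;
    }

    // Tree representation passed between the phases.
    static abstract class Exp {}
    static final class Num extends Exp {
        final int value;
        Num(int value) { this.value = value; }
    }
    static final class Add extends Exp {
        final Exp left, right;
        Add(Exp left, Exp right) { this.left = left; this.right = right; }
    }

    // Phase 2 (Parse): tokens -> AST for  num ('+' num)* , grouped to the left.
    static Exp parse(List<String> tokens) {
        if (tokens.size() % 2 == 0) {
            throw new IllegalArgumentException("expected: num (+ num)*");
        }
        Exp tree = new Num(Integer.parseInt(tokens.get(0)));
        for (int i = 1; i < tokens.size(); i += 2) {
            if (!tokens.get(i).equals("+")) {
                throw new IllegalArgumentException("expected '+', got " + tokens.get(i));
            }
            tree = new Add(tree, new Num(Integer.parseInt(tokens.get(i + 1))));
        }
        return tree;
    }

    // Phase 3 (Translate): AST -> instructions for a hypothetical stack machine.
    static void translate(Exp e, List<String> out) {
        if (e instanceof Num) {
            out.add("PUSH " + ((Num) e).value);
        } else {
            Add a = (Add) e;
            translate(a.left, out);
            translate(a.right, out);
            out.add("ADD");
        }
    }

    // The compiler is the composition of the phases.
    static List<String> compile(String source) {
        List<String> code = new ArrayList<>();
        translate(parse(lex(source)), code);
        return code;
    }

    public static void main(String[] args) {
        System.out.println(compile("1 + 2 + 3")); // prints [PUSH 1, PUSH 2, ADD, PUSH 3, ADD]
    }
}
```

Each phase consumes the previous phase's output, and the tree built by `parse` is the only thing `translate` needs to see.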
Course Overview
- Introduction (Chapter 1)
- Compiler Frontend:
  - Lexical Analysis & Parsing (Chapters 2, 3, 4)
  - Semantic Analysis (Chapter 5)
  - Activation Records (Chapter 6)
  - Translation to Intermediate Code (Chapter 7)
  - Basic Blocks and Traces (Chapter 8)
- Compiler Backend:
  - Instruction Selection (Chapter 9)
  - Liveness Analysis (Chapter 10)
  - Register Allocation (Chapter 11)
  - Code Emission (Chapter 12)
Specifying Tree Languages
- You probably already know that we can specify programming language syntax using grammars (e.g. written in EBNF or BNF notation).
- Grammars can also specify tree languages.
- Example: let's define the abstract syntax of a simple programming language.
Example: The Straightline Language
Example program:
    a := 5 + 3 ;
    b := ( print ( a, a - 1 ), 10 * a ) ;
    print ( b )
Example: The Straightline Language
    Stm     ::= Stm ; Stm          (CompoundStm)
    Stm     ::= <id> := Exp        (AssignStm)
    Stm     ::= print ( ExpList )  (PrintStm)
    Exp     ::= <id>               (IdExp)
    Exp     ::= <num>              (NumExp)
    Exp     ::= Exp BinOp Exp      (OpExp)
    Exp     ::= ( Stm , Exp )      (EseqExp)
    ExpList ::= Exp , ExpList      (PairExpList)
    ExpList ::= Exp                (LastExpList)
    BinOp   ::= +                  (Plus)
    BinOp   ::= -                  (Minus)
    BinOp   ::= *                  (Times)
    BinOp   ::= /                  (Div)
Note: This grammar cannot be used for parsing! Why?
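One way to approach the question: the rule Exp ::= Exp BinOp Exp says nothing about grouping, so a text such as 10 - 2 - 3 has more than one tree. The sketch below builds both trees by hand and evaluates them. The `eval` method and the class names `AmbiguityDemo` and `MinusExp` are illustrative assumptions, not part of the course material.

```java
class AmbiguityDemo {
    // Minimal expression trees, just enough to show the two groupings.
    static abstract class Exp {
        abstract int eval();
    }
    static final class NumExp extends Exp {
        final int num;
        NumExp(int num) { this.num = num; }
        int eval() { return num; }
    }
    static final class MinusExp extends Exp {
        final Exp left, right;
        MinusExp(Exp left, Exp right) { this.left = left; this.right = right; }
        int eval() { return left.eval() - right.eval(); }
    }

    // Tree 1: (10 - 2) - 3
    static Exp leftGrouped() {
        return new MinusExp(new MinusExp(new NumExp(10), new NumExp(2)), new NumExp(3));
    }

    // Tree 2: 10 - (2 - 3)
    static Exp rightGrouped() {
        return new MinusExp(new NumExp(10), new MinusExp(new NumExp(2), new NumExp(3)));
    }

    public static void main(String[] args) {
        System.out.println(leftGrouped().eval());  // prints 5
        System.out.println(rightGrouped().eval()); // prints 11
    }
}
```

Both trees are derivable from the grammar for the same token sequence, yet they mean different things, so a parser needs extra information (precedence, associativity) to choose one.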
Abstract Syntax
The grammar does, however, serve quite well as a specification of the structure of abstract syntax trees for Straightline programs.
Example program:
    a := 5 + 3 ;
    b := ( print ( a, a - 1 ), 10 * a ) ;
    print ( b )
Exercises:
1) Draw the abstract syntax tree for the above program on the board.
2) Sketch out how to implement Java classes to represent Straightline ASTs.
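One possible sketch for exercise 2, not the only valid design: one abstract class per nonterminal (Stm, Exp, ExpList) and one concrete subclass per grammar rule, named after the rule labels. The field names, the `BinOp` enum and the `StraightlineExample` class are assumptions. The second print argument is taken to be `a - 1`, as in the textbook's version of this program.

```java
// One abstract class per nonterminal, one subclass per production.
abstract class Stm {}
final class CompoundStm extends Stm {
    final Stm stm1, stm2;
    CompoundStm(Stm stm1, Stm stm2) { this.stm1 = stm1; this.stm2 = stm2; }
}
final class AssignStm extends Stm {
    final String id; final Exp exp;
    AssignStm(String id, Exp exp) { this.id = id; this.exp = exp; }
}
final class PrintStm extends Stm {
    final ExpList exps;
    PrintStm(ExpList exps) { this.exps = exps; }
}

abstract class Exp {}
final class IdExp extends Exp {
    final String id;
    IdExp(String id) { this.id = id; }
}
final class NumExp extends Exp {
    final int num;
    NumExp(int num) { this.num = num; }
}
final class OpExp extends Exp {
    final Exp left; final BinOp oper; final Exp right;
    OpExp(Exp left, BinOp oper, Exp right) { this.left = left; this.oper = oper; this.right = right; }
}
final class EseqExp extends Exp {
    final Stm stm; final Exp exp;
    EseqExp(Stm stm, Exp exp) { this.stm = stm; this.exp = exp; }
}

abstract class ExpList {}
final class PairExpList extends ExpList {
    final Exp head; final ExpList tail;
    PairExpList(Exp head, ExpList tail) { this.head = head; this.tail = tail; }
}
final class LastExpList extends ExpList {
    final Exp head;
    LastExpList(Exp head) { this.head = head; }
}

// The four BinOp rules carry no data, so an enum is used here (a design assumption).
enum BinOp { PLUS, MINUS, TIMES, DIV }

class StraightlineExample {
    // AST for:  a := 5 + 3 ; b := ( print ( a, a - 1 ), 10 * a ) ; print ( b )
    static Stm program() {
        return new CompoundStm(
            new AssignStm("a", new OpExp(new NumExp(5), BinOp.PLUS, new NumExp(3))),
            new CompoundStm(
                new AssignStm("b", new EseqExp(
                    new PrintStm(new PairExpList(
                        new IdExp("a"),
                        new LastExpList(new OpExp(new IdExp("a"), BinOp.MINUS, new NumExp(1))))),
                    new OpExp(new NumExp(10), BinOp.TIMES, new IdExp("a")))),
                new PrintStm(new LastExpList(new IdExp("b")))));
    }

    public static void main(String[] args) {
        System.out.println(program().getClass().getSimpleName()); // prints CompoundStm
    }
}
```

Note that building the tree by hand forces a grouping for `Stm ; Stm ; Stm`; here the second and third statements are grouped together, which is one of the two trees the grammar allows.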
Conclusion / Summary
- This course is about compilers.
- Compilers are language processors that:
  - translate a high-level language into a low-level language
  - help bridge the semantic gap
- Compilers typically work in multiple phases.
  - Phases often use tree representations (ASTs) as input / output.
- We can describe AST structure with a CFG (context-free grammar): from the CFG we can derive implementations (e.g. Java classes) for representing the ASTs.