Lecture 1 CIS 341: COMPILERS

Administrivia
- Instructor: Steve Zdancewic
  - Office hours: Fridays 3:30-5:00 & by appointment, Levine 511
- TAs: Olek Gierczak, Nicolas Koh, Yishuai Li, Richard Zhang
  - Office hours: TBD
- E-mail: cis341@seas.upenn.edu
- Web site: http://www.seas.upenn.edu/~cis341
- Piazza: http://piazza.com/upenn/spring2018/cis341
Zdancewic, CIS 341: Compilers

Why CIS 341?
You will learn:
- Practical applications of theory
- Lexing/Parsing/Interpreters
- How high-level languages are implemented in machine language
- (A subset of) Intel x86 architecture
- More about common compilation tools like GCC and LLVM
- A deeper understanding of code
- A little about programming language semantics & types
- Functional programming in OCaml
- How to manipulate complex data structures
- How to be a better programmer
Expect this to be a very challenging, implementation-oriented course. Programming projects can take tens of hours per week.

Course projects: The CIS341 Compiler
- HW1: OCaml Programming
- HW2: X86lite interpreter
- HW3: LLVMlite compiler
- HW4: Lexing, Parsing, simple compilation
- HW5: Higher-level Features
- HW6: Analysis and Optimizations I
- HW7: Optimizations II
Goal: build a complete compiler from a high-level, type-safe language to x86 assembly.
*HW4-7 are undergoing a re-design this semester, so they're a bit in flux.

Resources
- Course textbook (recommended, not required):
  - Modern Compiler Implementation in ML (Appel)
- Additional compilers books:
  - Compilers: Principles, Techniques & Tools (Aho, Lam, Sethi, Ullman), a.k.a. The Dragon Book
  - Advanced Compiler Design & Implementation (Muchnick)
- About OCaml:
  - Real World OCaml (Minsky, Madhavapeddy, Hickey), realworldocaml.org
  - Introduction to Objective Caml (Hickey)

Why OCaml?
- OCaml is a dialect of ML ("Meta Language")
  - It was designed to enable easy manipulation of abstract syntax trees
  - Type-safe, mostly pure, functional language with support for polymorphic (generic) algebraic datatypes, modules, and mutable state
  - The OCaml compiler itself is well engineered: you can study its source!
  - It is the right tool for this job
- Forgot about OCaml after CIS120?
  - The next couple of lectures will (re)introduce it
  - The first two projects will help you get up to speed
  - See "Introduction to Objective Caml" by Jason Hickey (book available on the course web pages, referred to in HW1)

HW1: Hellocaml
- Homework 1 is available on the course web site.
  - Individual project, no groups
  - Due: Wednesday, 24 Jan. 2018 at 11:59pm
  - Topic: OCaml programming, an introduction to interpreters
- OCaml head start on eniac:
  - Run ocaml from the command line to invoke the top-level loop
  - Run ocamlbuild main.native to run the compiler and build the project
- We recommend using Emacs/Vim + merlin (less recommended: Eclipse with the OcaIDE plugin)
  - See the course web pages about the CIS341 tool chain to get started

Homework Policies
- Homework (except HW1) may be done individually or in pairs
- Late projects:
  - up to 24 hours late: 15 point penalty
  - up to 48 hours late: 30 point penalty
  - after 48 hours: not accepted
- Submission policy:
  - Projects that don't compile will get no credit
  - Partial credit will be awarded according to the guidelines in the project description
- Academic integrity: don't cheat
  - This course will abide by the University's Code of Academic Integrity
  - "Low level" and "high level" discussions across groups are fine
  - "Mid level" discussions / code sharing are not permitted
  - General principle: when in doubt, ask!

Course Policies
- Prerequisites: CIS121 and CIS240 (262 useful too!)
  - Significant programming experience
  - If HW1 is a struggle, this class might not be a good fit for you (HW1 is significantly simpler than the rest)
- Grading:
  - 70% Projects: Compiler (groups of 1 or 2 students, implemented in OCaml)
  - 12% Midterm
  - 18% Final exam
- Lecture attendance is crucial
- No laptops (or other devices)! It's too distracting for me and for others in the class.

COMPILERS
What is a compiler?

What is a Compiler?
- A compiler is a program that translates from one programming language to another.
- Typically: high-level source code to low-level machine code (object code)
- Not always: source-to-source translators, Java bytecode compiler, GWT (Java → JavaScript)
High-level Code → ? → Low-level Code

Historical Aside
This is an old problem!
- Until the 1950s: computers were programmed in assembly.
- 1951-1952: Grace Hopper developed the A-0 system for the UNIVAC I
  - She later contributed significantly to the design of COBOL
- 1957: the FORTRAN compiler was built at IBM
  - Team led by John Backus
- 1960s: development of the first bootstrapping compiler for LISP
- 1970s: language/compiler design blossomed
- Today: thousands of languages (most little used)
  - Some better designed than others...
ML family timeline:
- 1980s: ML / LCF
- 1984: Standard ML
- 1987: Caml
- 1991: Caml Light
- 1995: Caml Special Light
- 1996: Objective Caml

Source Code
Optimized for human readability
- Expressive: matches human ideas of grammar / syntax / meaning
- Redundant: more information than needed to help catch errors
- Abstract: exact computation possibly not fully determined by code
Example C source:

    #include <stdio.h>

    int factorial(int n) {
      int acc = 1;
      while (n > 0) {
        acc = acc * n;
        n = n - 1;
      }
      return acc;
    }

    int main(int argc, char *argv[]) {
      printf("factorial(6) = %d\n", factorial(6));
    }

Low-level code
Optimized for hardware
- Machine code is hard for people to read
- Redundancy and ambiguity are reduced
- Abstractions & information about intent are lost
Assembly language, then machine language
The listing shows (unoptimized) 32-bit code for the factorial function:

    _factorial:
    ## BB#0:
        pushl %ebp
        movl  %esp, %ebp
        subl  $8, %esp
        movl  8(%ebp), %eax
        movl  %eax, -4(%ebp)
        movl  $1, -8(%ebp)
    LBB0_1:
        cmpl  $0, -4(%ebp)
        jle   LBB0_3
    ## BB#2:
        movl  -8(%ebp), %eax
        imull -4(%ebp), %eax
        movl  %eax, -8(%ebp)
        movl  -4(%ebp), %eax
        subl  $1, %eax
        movl  %eax, -4(%ebp)
        jmp   LBB0_1
    LBB0_3:
        movl  -8(%ebp), %eax
        addl  $8, %esp
        popl  %ebp
        retl

How to translate?
Mismatch between source code and machine code
- Some languages are farther from machine code than others:
  - Consider: C, C++, Java, Lisp, ML, Haskell, Ruby, Python, Javascript
- Goals of translation:
  - Source level expressiveness for the task
  - Best performance for the concrete computation
  - Reasonable translation efficiency (< O(n^3))
  - Maintainable code
  - Correctness!

Correct Compilation
- Programming languages describe computation precisely
  - Therefore, translation can be precisely described
  - A compiler can be correct with respect to the source and target language semantics.
- Correctness is important!
  - Broken compilers generate broken code.
  - It is hard to debug source programs if the compiler is incorrect.
  - Failure has dire consequences for development cost, security, etc.
- This course: some techniques for building correct compilers
  - "Finding and Understanding Bugs in C Compilers", Yang et al., PLDI 2011
  - There is much ongoing research about proving compilers correct. (Google for CompCert, Verified Software Toolchain, or Vellvm)
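To make "correct with respect to the source and target language semantics" concrete, here is a minimal sketch in OCaml. Everything in it is an illustrative assumption rather than course code: the tiny exp type, the eval function standing in for the language semantics, and constant folding as the translation under test (chosen only because it is small). The property checked is that the translation never changes what a program computes.

```ocaml
(* Hypothetical expression language, for illustration only. *)
type exp =
  | Const of int
  | Var of string
  | Add of exp * exp
  | Mul of exp * exp

(* Semantics: the value of an expression, given the values of its variables. *)
let rec eval (env : string -> int) (e : exp) : int =
  match e with
  | Const n -> n
  | Var x -> env x
  | Add (a, b) -> eval env a + eval env b
  | Mul (a, b) -> eval env a * eval env b

(* A small translation: replace operations on two constants by their result. *)
let rec fold (e : exp) : exp =
  match e with
  | Const _ | Var _ -> e
  | Add (a, b) ->
    (match fold a, fold b with
     | Const x, Const y -> Const (x + y)
     | a', b' -> Add (a', b'))
  | Mul (a, b) ->
    (match fold a, fold b with
     | Const x, Const y -> Const (x * y)
     | a', b' -> Mul (a', b'))

(* Correctness as a testable property: folding preserves the meaning. *)
let preserves_meaning env e = eval env (fold e) = eval env e

let () =
  let env = function "x" -> 5 | _ -> 0 in
  let e = Add (Mul (Const 2, Const 3), Var "x") in
  assert (fold e = Add (Const 6, Var "x"));
  assert (preserves_meaning env e)
```

Testing a property on sample inputs only finds bugs; the verified-compiler projects named above aim to prove such a property for every program.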

Idea: Translate in Steps
- Compile via a series of program representations
- Intermediate representations are optimized for program manipulation of various kinds:
  - Semantic analysis: type checking, error checking, etc.
  - Optimization: dead-code elimination, common subexpression elimination, function inlining, register allocation, etc.
  - Code generation: instruction selection
- Representations are more machine specific, less language specific as translation proceeds
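As a rough sketch of moving from one representation to a lower-level one, the OCaml below translates a tree-shaped expression into a flat instruction list for a stack machine, then runs that list. The exp and insn types and the stack machine itself are hypothetical, invented for this example; they are not one of the course's intermediate representations.

```ocaml
(* Source representation: arithmetic expression trees (hypothetical). *)
type exp =
  | Const of int
  | Add of exp * exp
  | Mul of exp * exp

(* Lower-level representation: instructions for a simple stack machine. *)
type insn =
  | Push of int
  | IAdd
  | IMul

(* Translation: emit code for both operands, then the operator. *)
let rec compile (e : exp) : insn list =
  match e with
  | Const n -> [Push n]
  | Add (a, b) -> compile a @ compile b @ [IAdd]
  | Mul (a, b) -> compile a @ compile b @ [IMul]

(* Target semantics: execute the instructions against a stack of ints. *)
let run (prog : insn list) : int =
  let step stack i =
    match i, stack with
    | Push n, _ -> n :: stack
    | IAdd, y :: x :: rest -> (x + y) :: rest
    | IMul, y :: x :: rest -> (x * y) :: rest
    | _ -> failwith "stack underflow"
  in
  match List.fold_left step [] prog with
  | [v] -> v
  | _ -> failwith "malformed program"

let () =
  let e = Add (Const 1, Mul (Const 2, Const 3)) in
  assert (compile e = [Push 1; Push 2; Push 3; IMul; IAdd]);
  assert (run (compile e) = 7)
```

The tree structure of the source is gone in the instruction list, which is closer to what a machine executes: a small instance of representations becoming less language specific as translation proceeds.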

(Simplified) Compiler Structure
Source Code (character stream), e.g. if (b == 0) a = 0;
  → Lexical Analysis → Token Stream
  → Parsing → Abstract Syntax Tree
  → Intermediate Code Generation → Intermediate Code
  → Code Generation → Assembly Code, e.g. CMP ECX, 0; SETBZ EAX
Phases: Front End (machine independent), Middle End (compiler dependent), Back End (machine dependent)

Typical Compiler Stages
- Lexing → token stream
- Parsing → abstract syntax
- Disambiguation → abstract syntax
- Semantic analysis → annotated abstract syntax
- Translation → intermediate code
- Control-flow analysis → control-flow graph
- Data-flow analysis → interference graph
- Register allocation → assembly
- Code emission
Optimizations may be done at many of these stages
Different source language features may require more/different stages
Assembly code is not the end of the story
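The first stage, lexing, can be sketched by hand in a few lines of OCaml. The token type and the lex function below are assumptions made for this example (a real project would more likely use a lexer generator such as ocamllex); the sketch handles only the handful of symbols needed for a statement like if (b == 0) a = 0;.

```ocaml
(* Hypothetical token type covering just enough for the example input. *)
type token =
  | IF
  | IDENT of string
  | INT of int
  | EQEQ
  | EQ
  | LPAREN
  | RPAREN
  | SEMI

let is_digit c = '0' <= c && c <= '9'
let is_alpha c = ('a' <= c && c <= 'z') || ('A' <= c && c <= 'Z') || c = '_'

(* Convert a character stream (here, a string) into a list of tokens. *)
let lex (s : string) : token list =
  let n = String.length s in
  (* First index >= i at which predicate p stops holding. *)
  let rec span p i = if i < n && p s.[i] then span p (i + 1) else i in
  let rec go i acc =
    if i >= n then List.rev acc
    else
      match s.[i] with
      | ' ' | '\t' | '\n' -> go (i + 1) acc
      | '(' -> go (i + 1) (LPAREN :: acc)
      | ')' -> go (i + 1) (RPAREN :: acc)
      | ';' -> go (i + 1) (SEMI :: acc)
      | '=' ->
        if i + 1 < n && s.[i + 1] = '=' then go (i + 2) (EQEQ :: acc)
        else go (i + 1) (EQ :: acc)
      | c when is_digit c ->
        let j = span is_digit i in
        go j (INT (int_of_string (String.sub s i (j - i))) :: acc)
      | c when is_alpha c ->
        let j = span (fun c -> is_alpha c || is_digit c) i in
        let word = String.sub s i (j - i) in
        go j ((if word = "if" then IF else IDENT word) :: acc)
      | c -> failwith (Printf.sprintf "unexpected character '%c'" c)
  in
  go 0 []

let () =
  assert (lex "if (b == 0) a = 0;"
          = [IF; LPAREN; IDENT "b"; EQEQ; INT 0; RPAREN;
             IDENT "a"; EQ; INT 0; SEMI])
```

The parser then consumes this token list instead of raw characters, so whitespace and the difference between = and == are already settled before any syntax tree is built.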

Compilation & Execution
Source code (foo.c)
  → Compiler (gcc -S) → Assembly code (foo.s)
  → Assembler (as) → Object code (foo.o)
  → Linker (ld), together with library code → Fully-resolved machine code (foo)
  → Loader → Executable image
(Usually: gcc -o foo foo.c)

OCAML
- Introduction to OCaml programming
- A little background about ML
- Interactive tour via the OCaml top-loop & Emacs
- Writing simple interpreters

ML's History
- 1971: Robin Milner starts the LCF Project at Stanford
  - LCF: "logic for computable functions"
- 1973: At Edinburgh, Milner implemented his theorem prover and dubbed its language "Meta Language" (ML)
- 1984: ML escaped into the wild and became "Standard ML"
  - SML '97 is the newest version of the standard
  - There is a whole family of SML compilers:
    - SML/NJ: developed at AT&T Bell Labs
    - MLton: whole program, optimizing compiler
    - Poly/ML
    - Moscow ML
    - ML Kit compiler
    - MLj: SML to Java bytecode compiler
- ML 2000: failed revised standardization
- sml: successor ML, discussed intermittently
- 2014: sml-family.org + definition on github

OCaml's History
- The Formel project at the Institut National de Recherche en Informatique et en Automatique (INRIA)
- 1987: Guy Cousineau re-implemented a variant of ML
  - Implementation targeted the "Categorical Abstract Machine" (CAM)
  - As a pun, "CAM-ML" became "CAML"
- 1991: Xavier Leroy and Damien Doligez wrote Caml-light
  - Compiled CAML to a virtual machine with simple bytecode (much faster!)
- 1996: Xavier Leroy, Jérôme Vouillon, and Didier Rémy
  - Add an object system to create OCaml
  - Add native code compilation
- Many updates and extensions since
- Microsoft's F# language is a descendant of OCaml
- 2013: ocaml.org

OCaml Tools
- ocaml: the top-level interactive loop
- ocamlc: the bytecode compiler
- ocamlopt: the native code compiler
- ocamldep: the dependency analyzer
- ocamldoc: the documentation generator
- ocamllex: the lexer generator
- ocamlyacc: the parser generator
- menhir: a more modern parser generator
- ocamlbuild: a compilation manager
- utop: a more fully-featured interactive top-level
- opam: package manager

Distinguishing Characteristics
- Functional & (mostly) "pure"
  - Programs manipulate values rather than issue commands
  - Functions are first-class entities
  - Results of computation can be named using let
  - Has relatively few "side effects" (imperative updates to memory)
- Strongly & statically typed
  - Compiler typechecks every expression of the program and issues errors if it can't prove that the program is type safe
  - Good support for type inference & generic (polymorphic) types
  - Rich user-defined "algebraic data types" with pervasive use of pattern matching
  - Very strong and flexible module system for constructing large projects
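A few lines of OCaml illustrate several of these characteristics at once. The names below (square, compose, counter, and so on) are made up for this sketch; the types in the comments are what the compiler infers without any annotations.

```ocaml
(* Results are named with let; the type int -> int is inferred. *)
let square x = x * x

(* Functions are first-class: they can be passed as arguments... *)
let squares = List.map square [1; 2; 3]

(* ...and returned as results. The inferred type is polymorphic:
   ('b -> 'c) -> ('a -> 'b) -> 'a -> 'c *)
let compose f g = fun x -> f (g x)
let add_one_then_square = compose square (fun x -> x + 1)

(* Side effects exist but are explicit: a ref cell is a mutable box. *)
let counter = ref 0
let () = counter := !counter + 1

let () =
  assert (squares = [1; 4; 9]);
  assert (add_one_then_square 2 = 9);
  assert (!counter = 1)
```

Writing something ill-typed, such as square "two", is rejected at compile time rather than failing when the program runs.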

Most Important Features for CIS341
- Types: int, bool, int32, int64, char, string, built-in lists, tuples, records, functions
- Concepts:
  - Pattern matching
  - Recursive functions over algebraic (i.e. tree-structured) datatypes
- Libraries: Int32, Int64, List, Printf, Format
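The sketch below touches most of the items on this list in one place: an algebraic datatype, recursive functions defined by pattern matching, a record, and the List, Printf, and Int64 libraries. The tree and summary types and all function names are hypothetical, chosen only for illustration.

```ocaml
(* A tree-structured (algebraic) datatype with integers at the leaves. *)
type tree =
  | Leaf of int
  | Node of tree * tree

(* Recursive functions over the datatype, one case per constructor. *)
let rec sum (t : tree) : int =
  match t with
  | Leaf n -> n
  | Node (l, r) -> sum l + sum r

let rec depth (t : tree) : int =
  match t with
  | Leaf _ -> 0
  | Node (l, r) -> 1 + max (depth l) (depth r)

let rec leaves (t : tree) : int list =
  match t with
  | Leaf n -> [n]
  | Node (l, r) -> leaves l @ leaves r

(* A record groups named fields. *)
type summary = { total : int; height : int }

let summarize t = { total = sum t; height = depth t }

(* List and Printf from the standard library. *)
let show (t : tree) : string =
  let { total; height } = summarize t in
  Printf.sprintf "leaves=[%s] total=%d height=%d"
    (String.concat "; " (List.map string_of_int (leaves t)))
    total height

(* Int64 literals carry an L suffix and use their own arithmetic functions. *)
let big : int64 = Int64.mul 4_000_000_000L 4L

let () =
  let t = Node (Leaf 1, Node (Leaf 2, Leaf 3)) in
  assert (show t = "leaves=[1; 2; 3] total=6 height=2");
  assert (big = 16_000_000_000L)
```

If a constructor is later added to tree, the compiler warns about every match that does not handle it, which is a large part of why this style suits compiler code.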

INTERPRETERS
- How to represent programs as data structures.
- How to write programs that process programs.

Factorial: Everyone's Favorite Function
Consider this implementation of factorial in a hypothetical programming language:

    X = 6;
    ANS = 1;
    whilenz (X) {
      ANS = ANS * X;
      X = X + -1;
    }

We need to describe the constructs of this hypothetical language
- Syntax: which sequences of characters count as a legal "program"?
- Semantics: what is the meaning (behavior) of a legal "program"?

Grammar for a Simple Language

    <exp> ::= <X>
            | <exp> + <exp>
            | <exp> * <exp>
            | <exp> < <exp>
            | <integer constant>
            | (<exp>)

    <cmd> ::= skip
            | <X> = <exp>
            | ifnz <exp> { <cmd> } else { <cmd> }
            | whilenz <exp> { <cmd> }
            | <cmd>; <cmd>

- Concrete syntax (grammar) for a simple imperative language
  - Written in "Backus-Naur form"
  - <exp> and <cmd> are nonterminals
  - The ::=, |, and <...> symbols are part of the meta language
  - Keywords, like skip and ifnz, and symbols, like { and +, are part of the object language
- Need to represent the abstract syntax (i.e. hide the irrelevant details of the concrete syntax)
- Implement the operational semantics (i.e. define the behavior, or meaning, of the program)
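One possible way to carry out these two steps in OCaml is sketched below: one datatype constructor per grammar production for the abstract syntax, and a pair of recursive functions for the semantics. This is an independent illustration, not the course's simple.ml. The constructor names, the choice to represent program state as a function from variable names to ints, the default value 0 for unassigned variables, and the encoding of < as 1 or 0 are all assumptions.

```ocaml
type var = string

(* Abstract syntax: one constructor per grammar production.
   Parentheses need no constructor; the tree shape records grouping. *)
type exp =
  | Var of var
  | Add of exp * exp
  | Mul of exp * exp
  | Lt of exp * exp
  | Lit of int

type cmd =
  | Skip
  | Assn of var * exp
  | IfNZ of exp * cmd * cmd
  | WhileNZ of exp * cmd
  | Seq of cmd * cmd

(* Program state: a map from variables to values (assumed 0 if unassigned). *)
type state = var -> int

let init_state : state = fun _ -> 0

let update (s : state) (x : var) (v : int) : state =
  fun y -> if y = x then v else s y

(* Meaning of an expression in a given state. *)
let rec interp_exp (s : state) (e : exp) : int =
  match e with
  | Var x -> s x
  | Add (a, b) -> interp_exp s a + interp_exp s b
  | Mul (a, b) -> interp_exp s a * interp_exp s b
  | Lt (a, b) -> if interp_exp s a < interp_exp s b then 1 else 0
  | Lit n -> n

(* Meaning of a command: a transformation from state to state. *)
let rec interp_cmd (s : state) (c : cmd) : state =
  match c with
  | Skip -> s
  | Assn (x, e) -> update s x (interp_exp s e)
  | IfNZ (e, c1, c2) ->
    if interp_exp s e <> 0 then interp_cmd s c1 else interp_cmd s c2
  | WhileNZ (e, body) ->
    (* Run the body once, then re-run the whole loop in the new state. *)
    if interp_exp s e <> 0 then interp_cmd (interp_cmd s body) c else s
  | Seq (c1, c2) -> interp_cmd (interp_cmd s c1) c2

(* X = 6; ANS = 1; whilenz (X) { ANS = ANS * X; X = X + -1; } *)
let factorial : cmd =
  Seq (Assn ("X", Lit 6),
  Seq (Assn ("ANS", Lit 1),
       WhileNZ (Var "X",
                Seq (Assn ("ANS", Mul (Var "ANS", Var "X")),
                     Assn ("X", Add (Var "X", Lit (-1)))))))

let () =
  let final = interp_cmd init_state factorial in
  assert (final "ANS" = 720)
```

Note that the factorial value is built directly as a tree here; turning the concrete text into such a tree is the job of the lexer and parser.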

OCaml Demo: simple.ml