Finding People and Information (1) G53CMP: Lecture 1. Aims and Motivation (1) Finding People and Information (2)

Similar documents
G53CMP: Lecture 1. Administrative Details 2016 and Introduction to Compiler Construction. Henrik Nilsson. University of Nottingham, UK

Recap: Typical Compiler Structure. G53CMP: Lecture 2 A Complete (Albeit Small) Compiler. This Lecture (2) This Lecture (1)

Compiler Construction LECTURE # 1

Philadelphia University Faculty of Information Technology Department of Computer Science --- Semester, 2007/2008. Course Syllabus

COMP 3002: Compiler Construction. Pat Morin School of Computer Science

Compilers Crash Course

Compiling Techniques

CS 4120 and 5120 are really the same course. CS 4121 (5121) is required! Outline CS 4120 / 4121 CS 5120/ = 5 & 0 = 1. Course Information

Formal Languages and Compilers Lecture I: Introduction to Compilers

C Compilation Model. Comp-206 : Introduction to Software Systems Lecture 9. Alexandre Denault Computer Science McGill University Fall 2006

Compiler construction

CS Compiler Construction West Virginia fall semester 2014 August 18, 2014 syllabus 1.0

CSE4305: Compilers for Algorithmic Languages CSE5317: Design and Construction of Compilers

CS415 Compilers Overview of the Course. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

G.PULLAIH COLLEGE OF ENGINEERING & TECHNOLOGY

Compiler Design. Dr. Chengwei Lei CEECS California State University, Bakersfield

INF5110 Compiler Construction

CSE4305: Compilers for Algorithmic Languages CSE5317: Design and Construction of Compilers

CS 432 Fall Mike Lam, Professor. Compilers. Advanced Systems Elective

Compiler Design (40-414)

Compilers and Interpreters

CA Compiler Construction

CMPE 152 Compiler Design

Compiler Construction

SRM UNIVERSITY FACULTY OF ENGINEERING AND TECHNOLOGY

CST-402(T): Language Processors

ECE573 Introduction to Compilers & Translators

Compilers and computer architecture: introduction

General Course Information. Catalogue Description. Objectives

CSE4305: Compilers for Algorithmic Languages CSE5317: Design and Construction of Compilers

CS 536. Class Meets. Introduction to Programming Languages and Compilers. Instructor. Key Dates. Teaching Assistant. Charles N. Fischer.

Compiler Construction 2010 (6 credits)

12048 Compilers II. Computer Science Engineering, 8º semester Mandatory, 2º Cycle. Theoretical credits: 3.0 Lab credits: 1.5. Theory: José Neira

COMP 181 Compilers. Administrative. Last time. Prelude. Compilation strategy. Translation strategy. Lecture 2 Overview

Compiler Construction Principles And Practice Solution Manual

Compiler Construction

CSE 582 Autumn 2002 Exam 11/26/02

Important Project Dates

CS 236 Language and Computation

Earlier edition Dragon book has been revised. Course Outline Contact Room 124, tel , rvvliet(at)liacs(dot)nl

Compiling Regular Expressions COMP360

Introduction to Functional Programming. Slides by Koen Claessen and Emil Axelsson

CS/SE 153 Concepts of Compiler Design

understanding recursive data types, recursive functions to compute over them, and structural induction to prove things about them

CS 432 Fall Mike Lam, Professor. Compilers. Advanced Systems Elective

CMPE 152 Compiler Design

CS164: Programming Assignment 5 Decaf Semantic Analysis and Code Generation

SRM UNIVERSITY FACULTY OF ENGINEERING AND TECHNOLOGY SCHOOL OF COMPUTING DEPARTMENT OF CSE COURSE PLAN

9/5/17. The Design and Implementation of Programming Languages. Compilation. Interpretation. Compilation vs. Interpretation. Hybrid Implementation

CS 415 Midterm Exam Spring SOLUTION

Compiler Construction D7011E

Introduction - CAP Course

Compilers. Prerequisites

Compilers for Modern Architectures Course Syllabus, Spring 2015

CS131: Programming Languages and Compilers. Spring 2017

CSE P 501 Compilers. Parsing & Context-Free Grammars Hal Perkins Spring UW CSE P 501 Spring 2018 C-1

HOLY ANGEL UNIVERSITY COLLEGE OF INFORMATION AND COMMUNICATIONS TECHNOLOGY COMPILER THEORY COURSE SYLLABUS

Compiler Theory Introduction and Course Outline Sandro Spina Department of Computer Science

CSCI 565 Compiler Design and Implementation Spring 2014

CS 314 Principles of Programming Languages

G53CMP: Lecture 4. Syntactic Analysis: Parser Generators. Henrik Nilsson. University of Nottingham, UK. G53CMP: Lecture 4 p.1/32

CS 406/534 Compiler Construction Putting It All Together

CS/SE 153 Concepts of Compiler Design

CS202 Compiler Construction. Christian Skalka. Course prerequisites. Solid programming skills a must.

Introduction to Compilers and Language Design Copyright (C) 2017 Douglas Thain. All rights reserved.

Introduction to Compiler Construction

What do Compilers Produce?

CSC488S/CSC2107S - Compilers and Interpreters. CSC 488S/CSC 2107S Lecture Notes

INCORPORATING ADVANCED PROGRAMMING TECHNIQUES IN THE COMPUTER INFORMATION SYSTEMS CURRICULUM

Semantic Analysis. Lecture 9. February 7, 2018

CSE 504: Compiler Design

Compiler Design Overview. Compiler Design 1

What Is Computer Science? The Scientific Study of Computation. Expressing or Describing

CS415 Compilers. Procedure Abstractions. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

Anatomy of a Compiler. Overview of Semantic Analysis. The Compiler So Far. Why a Separate Semantic Analysis?

Introduction. Compiler Design CSE Overview. 2 Syntax-Directed Translation. 3 Phases of Translation

What the CPU Sees Basic Flow Control Conditional Flow Control Structured Flow Control Functions and Scope. C Flow Control.

CSCE 314 Programming Languages. Type System

Compiler Optimisation

Lecture 1: Introduction

Overview. Overview. Programming problems are easier to solve in high-level languages

LECTURE NOTES ON COMPILER DESIGN P a g e 2

CS-XXX: Graduate Programming Languages. Lecture 9 Simply Typed Lambda Calculus. Dan Grossman 2012

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler Front-End

The role of semantic analysis in a compiler

CIS 341 Midterm February 28, Name (printed): Pennkey (login id): SOLUTIONS

Languages and Automata

A Brief Haskell and GHC Refresher

Parsing II Top-down parsing. Comp 412

Introduction to Compiler Construction

CS164: Midterm I. Fall 2003

Compilers and Code Optimization EDOARDO FUSELLA

Programming Assignment IV Due Monday, November 8 (with an automatic extension until Friday, November 12, noon)

G52LAC Languages and Computation Lecture 6

CISC 124: Introduction To Computing Science II

CMPE 152 Compiler Design

Lecture Notes on Compiler Design: Overview

Compiler Construction

Overview of the Class

7. Introduction to Denotational Semantics. Oscar Nierstrasz

Transcription:

Finding People and Information (1) G53CMP: Lecture 1 Administrative Details 2017 and Introduction to Compiler Construction Henrik Nilsson Henrik Nilsson Room A08, Computer Science Building e-mail: nhn@cs.nott.ac.uk Teaching Assistants: Jan Bracker www.cs.nott.ac.uk/~jzb Martin Handley www.cs.nott.ac.uk/~psxmah University of Nottingham, UK G53CMP: Lecture 1 p.1/36 G53CMP: Lecture 1 p.2/36 Finding People and Information (2) Main module web page: www.cs.nott.ac.uk/~psznhn/g53cmp Moodle: moodle.nottingham.ac.uk/ course/view.php?id=53723 Direct questions concerning lectures and coursework to the Moodle G53CMP Forum. Anyone can ask and answer questions, but you must not post exact solutions to the coursework. Aims and Motivation (1) Why study Compiler Construction? Why did you opt to take this module? More generally, what do you think are good reasons to take this module? G53CMP: Lecture 1 p.3/36 G53CMP: Lecture 1 p.4/36

Aims and Motivation (2) Aims: Deepened understanding of: how compilers (and interpreters) work and are constructed programming language design and semantics The former is a great, hands on, learning-by-doing way to learn the latter. Aims and Motivation (3) Why? The ACM/IEEE 2013 CS Curriculum Guidelines: G53CMP: Lecture 1 p.5/36 G53CMP: Lecture 1 p.6/36 Aims and Motivation (4) Moreover: Compilers: a microcosm of computer science [CT04] Formal Languages and Automata Theory Datastructures and algorithms Computer architecture Programming language semantics Formal reasoning about programs Software engineering aspects Thus, capstone module tying everything together. G53CMP: Lecture 1 p.7/36 Aims and Motivation (5) Or, in terms of modules, G53CMP directly draws from/informs: G52LAC: formal language theory, grammars, (D)FAs G51MCS, G52ACE: formal reasoning, structural induction G51PGA, G51PGP: programming, understanding programming languages G51CSF, G51CSA: how computers work G54FOP/FPP: programming language theory G53CMP: Lecture 1 p.8/36

Aims and Motivation (6) Jobs? There are plenty of companies out there with in-house languages or that critically rely on compiler/interpreter expertise for other reasons. Some possibly surprising examples: Facebook Standard Chartered Bank Jane Street Learning Outcomes Knowledge of language and compiler design, semantics, key ideas and techniques. Experience of compiler construction tools. Experience of working with a medium-sized program. Programming in various paradigms Capturing design through formal specifications and deriving implementations from those. G53CMP: Lecture 1 p.9/36 G53CMP: Lecture 1 p.10/36 Literature (1) David A. Watt and Deryck F. Brown. Programming Language Processors in Java, Prentice-Hall, 1999. Used to be the main book. The lectures partly follow the structure of this book. The coursework was originally based on it. Hands-on approach to compiler construction. Particularly good if you like Java. Considers software engineering aspects. A bit weak on linking theory with practice. Literature (2) An alternative: Keith D. Cooper and Linda Torczon. Engineering a Compiler, Elsevier, 2004. Covers more ground in greater depth than this module. Gradually becoming the new main reference for the module. For each lecture, there are references to the relevant chapter(s) of both books (see lecture overview on the G53CMP web page). G53CMP: Lecture 1 p.11/36 G53CMP: Lecture 1 p.12/36

Literature (3) Great supplement: Alfred V Aho, Ravi Sethi, Jeffrey D. Ullman. Compilers Principles, Techniques, and Tools, Addison-Wesley, 1986. (The Dragon Book.) Classic reference in the field. Covers much more ground in greater depth than this module. A book that will last for years. There is a New(-ish) 2007 edition! Literature (4) Other useful references: Benjamin C. Pierce. Types and Programming Languages. Graham Hutton. Programming in Haskell. Books seem a bit old? Sure! They focus on core principles of lasting value that it pays off to learn. Cf. ACM/IEEE 2013 Curriculum Guidelines G53CMP: Lecture 1 p.13/36 G53CMP: Lecture 1 p.14/36 Lectures and Handouts Come prepared to take notes. There will be some handouts, but for the most part not. All electronic slides, program code, and other supporting material in electronic form used during the lectures, will be made available on the course web page. However! The electronic record of the lectures is neither guaranteed to be complete nor self-contained! Medium of Instruction Haskell used as medium of instruction throughout the module as: An ideal language for illustrating and discussing all aspects of compiler construction (and similar applications). Functional language notation is closely aligned with mathematical notation and formalisms used in text books on compilers. In practice, often a good choice for implementing compilers (and much else beside). G53CMP: Lecture 1 p.15/36 G53CMP: Lecture 1 p.16/36

Assessment First sit: The exam counts for 75 % of the total mark. The coursework counts for the remaining 25 %. 2 h exam, 3 questions, each worth 25 %. Bonus! There will be (sub)question(s) on the exam closely related to the coursework! Effectively, the weight of the coursework is thus more like 50 %, except partly examined later. Resit: 100 % exam Assessment (2) Why such emphasis on the coursework? Compiler construction is best learnt by doing. Thus, if you do and understand the coursework, you will be handsomely rewarded. Past experience shows that students who don t engage with the coursework struggle to pass. G53CMP: Lecture 1 p.17/36 G53CMP: Lecture 1 p.18/36 Coursework You will be given partial implementations of a compiler for a small language called MiniTrinagle. You will be asked to: answer theoretical questions related to the code extend the code with new features. Detailed instructions for the coursework available (soon) from the module web page (Part I: 6 Oct.). Study these instructions very carefully! Coursework Assessment (1) Two parts to the coursework: I and II Each part to be solved individually Submission for each part: - Brief written report (hard copy & PDF) - All source code (electronically) For part II, compulsory 10 minute oral examination in assigned slot during one of the lab sessions after the submission deadline. Catch-up slots only if missed slot with good cause; personal tutor to request on your behalf. G53CMP: Lecture 1 p.19/36 G53CMP: Lecture 1 p.20/36

Coursework Assessment (2) A number of weighted questions for each. part Written answer to each question assessed on - Correctness (0, 1, or 2 marks) - Style (0, 1, or 2 marks) In the oral examination (part II only), you explain your answers. Your explanations are assessed as follows: - 2: 100 % of mark for written answer - 1: 65 % of mark for written answer - 0: 0 % of mark for written answer Coursework Deadlines Coursework deadlines: Part I: Monday 30 October, 15:00. Part II: Monday 27 November, 15:00. Oral examinations during the lab sessions the following two Fridays; i.e. 1 and 8 December. Start early! It is not possible to do this coursework at the last minute. First lab session: Friday 6 October, 13:00 15:00. G53CMP: Lecture 1 p.21/36 G53CMP: Lecture 1 p.22/36 What is a Compiler? (1) Compilers are program translators: source program Typical example: Source language: C compiler error diagnostics target program Target language: x86 assembler Why? To make it easier to program computers! What is a Compiler? (2) GCC translates this C program int main(int argc, char *argv) { printf("%d\n", argc - 1); } into this x86 assembly code (excerpt): movl decl subl pushl pushl call 8(%ebp), %eax %eax $8, %esp %eax $.LC0 printf G53CMP: Lecture 1 p.23/36 addl $16, %esp G53CMP: Lecture 1 p.24/36

Source and Target Languages Large spectrum of possibilities, for example: Source languages: - (High-level) programming languages - Modelling languages - Document description languages - Database query languages Target languages: - High-level programming language - Low-level programming language (assembler or machine code, byte code) G53CMP: Lecture 1 p.25/36 Compilers vs. Interpreters Interpreters are another class of translators: Compiler: translates a program once and for all into target language. Interpreter: effectively translates (the used parts of) a source program every time it is run. Techniques like Just-In-Time Compilation (JIT) blurs this distinction. Compilers and interpreters sometimes used together, e.g. Java: Java compiled into Java byte code, byte code interpreted by a Java Virtual Machine (JVM), JVM might use JIT. G53CMP: Lecture 1 p.26/36 Inside the Compiler (1) Inside the Compiler (2) Traditionally, a compiler is broken down into several phases: Scanner: lexical analysis Parser: syntactic analysis Checker: contextual analysys (e.g. type checking) Optimizer: code improvement Code generator Front End Middle Section/ Back End scanner parser checker optimizer/ code generator sequence of characters sequence of tokens Abstract Syntax Tree (AST) Intermediate Representation (IR), e.g. verified/annotated AST target code Lexical Analysis Syntactic Analysis/Parsing Contextual Analysis/checking Static Semantics (e.g. Type Checking) Optimization and Code Generation (possibly many steps involving a number of intermediary representations) G53CMP: Lecture 1 p.27/36 G53CMP: Lecture 1 p.28/36

Inside the Compiler (3) Lexical Analysis: - Verify that input character sequence is lexically valid. - Group characters into sequence of lexical symbols, tokens. - Discard white space and comments (typically). Inside the Compiler (4) Syntactic Analysis/Parsing - Verify that the input program is syntactically valid, i.e. conforms to the Context Free Grammar of the language. - Determine the program structure. - Construct a representation of the program reflecting that structure without unnecessary details, usually an Abstract Syntax Tree (AST). G53CMP: Lecture 1 p.29/36 G53CMP: Lecture 1 p.30/36 Example: TXL into C compiler Scenario: We wish to develop a compiler for TXL: Trivial expression Language. To save ourselves some effort, we are going to compile TXL into C, and then use an existing C compiler (GCC) to translate into executable machine code. Informal TXL Syntax and Semantics Some examples of TXL programs, concrete syntax, and their intended meaning, semantics: 1 + 3 Semantics: 4 1 + (3 * (2 + 2)) Semantics: 13 let x = 3 * 7 in x + 3 Semantics: 24 This is dynamic semantics: what does a program mean when run? G53CMP: Lecture 1 p.31/36 G53CMP: Lecture 1 p.32/36

Inside the Compiler (5) Contextual Analysis/Checking Static Semantics: - Resolve meaning of symbols. - Report undefined symbols. - Type checking. -... Informal TXL Syntax and Semantics let x = 3 * 7 in let x = x * 3 in x - 21 Semantics:??? Some static semantics possibilities: Disallow re-definition of entities already in scope. Allow nested scopes, decide how to disambiguate; e.g., closest containing scope. Recursive definitions or not? I.e., is the defined entity in scope in its own definition? We opt for nesting, closest containing scope, no recursion. (Dynamic) semantics: 42 G53CMP: Lecture 1 p.33/36 G53CMP: Lecture 1 p.34/36 Informal TXL Syntax and Semantics let x = 3 in y + x Semantics:??? Some possibilities: Insist all variables be defined. The program can then be statically rejected as meaningless. Catch use of undefined variables dynamically, making the meaning of the program undefined. Assume some default value, like 0, for variables that are not explicitly defined. G53CMP: Lecture 1 p.35/36 Inside the Compiler (5) Optimization: - Code improvements aiming at making it run faster and/or use less space, energy, etc. - Almost always heuristics: cannot guarantee optimal result. Code Generation: - Output the appropriate sequence of target language instructions. - Might involve further low-level (target-specific) optimization. G53CMP: Lecture 1 p.36/36