COP4020 Programming Languages. Compilers and Interpreters Robert van Engelen & Chris Lacher

Similar documents
Compiling and Interpreting Programming. Overview of Compilers and Interpreters

CST-402(T): Language Processors

COP4020 Programming Languages. Compilers and Interpreters Prof. Robert van Engelen

Working of the Compilers

LECTURE 3. Compiler Phases

9/5/17. The Design and Implementation of Programming Languages. Compilation. Interpretation. Compilation vs. Interpretation. Hybrid Implementation

LECTURE 2. Compilers and Interpreters

Compilation I. Hwansoo Han

Why are there so many programming languages? Why do we have programming languages? What is a language for? What makes a language successful?

Why are there so many programming languages?

Early computers (1940s) cost millions of dollars and were programmed in machine language. less error-prone method needed

Language Translation. Compilation vs. interpretation. Compilation diagram. Step 1: compile. Step 2: run. compiler. Compiled program. program.

COMPILER DESIGN LECTURE NOTES


COP 3402 Systems Software. Lecture 4: Compilers. Interpreters

Compilers. Prerequisites

Concepts of Programming Languages

Introduction. Compilers and Interpreters

Compiler Design. Computer Science & Information Technology (CS) Rank under AIR 100

Programming Languages

Crafting a Compiler with C (II) Compiler V. S. Interpreter

PRINCIPLES OF COMPILER DESIGN UNIT I INTRODUCTION TO COMPILERS

Pioneering Compiler Design

CS 415 Midterm Exam Spring 2002

Introduction to Programming Languages. CSE 307 Principles of Programming Languages Stony Brook University

Compilers and Interpreters

Informal Semantics of Data. semantic specification names (identifiers) attributes binding declarations scope rules visibility

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

Compiling Regular Expressions COMP360

Introduction to Compiler Construction

SE352b: Roadmap. SE352b Software Engineering Design Tools. W3: Programming Paradigms

Introduction to Compiler Construction

COP4020 Programming Languages. Semantics Robert van Engelen & Chris Lacher

The role of semantic analysis in a compiler

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler Front-End

SKILL AREA 304: Review Programming Language Concept. Computer Programming (YPG)

A Simple Syntax-Directed Translator

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler so far

Introduction to Java Programming

COP4020 Programming Languages. Syntax Prof. Robert van Engelen

Introduction. Interpreters High-level language intermediate code which is interpreted. Translators. e.g.

COMPILER DESIGN. For COMPUTER SCIENCE

CS 4201 Compilers 2014/2015 Handout: Lab 1

PRINCIPLES OF COMPILER DESIGN UNIT I INTRODUCTION TO COMPILING

Design & Implementation Overview

COP4020 Programming Languages. Syntax Prof. Robert van Engelen

Chapter 1. Preview. Reason for Studying OPL. Language Evaluation Criteria. Programming Domains

CS 330 Lecture 18. Symbol table. C scope rules. Declarations. Chapter 5 Louden Outline

Programming 1. Lecture 1 COP 3014 Fall August 28, 2017

Why study Programming Language Concepts? Chapter One. Language Evaluation Criteria. Programming Domains. Readability Writability Reliability Cost

What is a compiler? var a var b mov 3 a mov 4 r1 cmpi a r1 jge l_e mov 2 b jmp l_d l_e: mov 3 b l_d: ;done

Programmiersprachen (Programming Languages)

The Compiler So Far. CSC 4181 Compiler Construction. Semantic Analysis. Beyond Syntax. Goals of a Semantic Analyzer.

CMPT 379 Compilers. Anoop Sarkar.

What do Compilers Produce?

CSCE 314 Programming Languages. Type System

Compilers and Code Optimization EDOARDO FUSELLA

CS 321 IV. Overview of Compilation

History of Compilers The term

CD Assignment I. 1. Explain the various phases of the compiler with a simple example.

General Concepts. Abstraction Computational Paradigms Implementation Application Domains Influence on Success Influences on Design

Compilers and interpreters

Where We Are. Lexical Analysis. Syntax Analysis. IR Generation. IR Optimization. Code Generation. Machine Code. Optimization.

Programming 1 - Honors

COP4020 Programming Languages. Semantics Prof. Robert van Engelen

COSC121: Computer Systems: Runtime Stack

About the Tutorial. Audience. Prerequisites. Copyright & Disclaimer. Compiler Design

Syntax-Directed Translation Part I

1. true / false By a compiler we mean a program that translates to code that will run natively on some machine.

ST. XAVIER S COLLEGE

What is a compiler? Xiaokang Qiu Purdue University. August 21, 2017 ECE 573

Earlier edition Dragon book has been revised. Course Outline Contact Room 124, tel , rvvliet(at)liacs(dot)nl

CSE450. Translation of Programming Languages. Lecture 11: Semantic Analysis: Types & Type Checking

Programming Languages, Summary CSC419; Odelia Schwartz

Informatica 3 Syntax and Semantics

Intermediate Code Generation

SEMANTIC ANALYSIS TYPES AND DECLARATIONS

Announcements. My office hours are today in Gates 160 from 1PM-3PM. Programming Project 3 checkpoint due tomorrow night at 11:59PM.

Principles of Programming Languages. Lecture Outline

Anatomy of a Compiler. Overview of Semantic Analysis. The Compiler So Far. Why a Separate Semantic Analysis?

Static and Dynamic Semantics

CSC488S/CSC2107S - Compilers and Interpreters. CSC 488S/CSC 2107S Lecture Notes

Undergraduate Compilers in a Day

Högnivåspråk och översättning

SLIDE 2. At the beginning of the lecture, we answer question: On what platform the system will work when discussing this subject?

CS152 Programming Language Paradigms Prof. Tom Austin, Fall Syntax & Semantics, and Language Design Criteria

Life Cycle of Source Program - Compiler Design

Compiling Techniques

Lexical Scanning COMP360

Formats of Translated Programs

Intermediate Representations

Chapter 1. Preliminaries

8/27/17. CS-3304 Introduction. What will you learn? Semester Outline. Websites INTRODUCTION TO PROGRAMMING LANGUAGES

CS 242. Fundamentals. Reading: See last slide

Compiler Design (40-414)

A simple syntax-directed

Chapter 2 A Quick Tour

Course introduction. Advanced Compiler Construction Michel Schinz

Introduction. CS 2210 Compiler Design Wonsun Ahn

Time : 1 Hour Max Marks : 30

Transcription:

COP4020 ming Languages Compilers and Interpreters Robert van Engelen & Chris Lacher

Overview Common compiler and interpreter configurations Virtual machines Integrated development environments Compiler phases Lexical analysis Syntax analysis Semantic analysis Intermediate (machine-independent) code generation Intermediate code optimization Target (machine-dependent) code generation Target code optimization

Compilers versus Interpreters The compiler versus interpreter implementation is often fuzzy One can view an interpreter as a virtual machine that executes highlevel code Java is compiled to bytecode Java bytecode is interpreted by the Java virtual machine (JVM) or translated to machine code by a just-in-time compiler (JIT) A processor (CPU) can be viewed as an implementation in hardware of a virtual machine (e.g. bytecode can be executed in hardware) Some programming languages cannot be purely compiled into machine code alone Some languages allow programs to rewrite/add code to the code base dynamically Some languages allow programs to translate data to code for execution (interpretation)

Compilers versus Interpreters Compilers try to be as smart as possible to fix decisions that can be taken at compile time to avoid to generate code that makes this decision at run time Type checking at compile time vs. runtime Static allocation Static linking Code optimization Compilation leads to better performance in general Allocation of variables without variable lookup at run time Aggressive code optimization to exploit hardware features Interpretation facilitates interactive debugging and testing Interpretation leads to better diagnostics of a programming problem Procedures can be invoked from command line by a user Variable values can be inspected and modified by a user

Compilation Compilation is the conceptual process of translating source code into a CPU-executable binary target code Compiler runs on the same platform X as the target code Source Compiler Target Debug on X Compile on X Input Target Output Run on X

Cross Compilation Compiler runs on platform X, target code runs on platform Y Source Cross Compiler Compile on X Target Copy to Y Debug on X (= emulate Y) Input Target Run on Y Output

Interpretation Interpretation is the conceptual process of running highlevel code by an interpreter Source Input Interpreter Output

Virtual Machines A virtual machine executes an instruction stream in software Adopted by Pascal, Java, Smalltalk-80, C#, functional and logic languages, and some scripting languages Pascal compilers generate P-code that can be interpreted or compiled into object code Java compilers generate bytecode that is interpreted by the Java virtual machine (JVM) The JVM may translate bytecode into machine code by just-intime (JIT) compilation

Compilation and Execution on Virtual Machines Compiler generates intermediate program Virtual machine interprets the intermediate program Source Compiler Compile on X Intermediate Run on VM Input Virtual Machine Run on X, Y, Z, Output

Pure Compilation and Static Linking Adopted by the typical Fortran implementation Library routines are separately linked (merged) with the object code of the program Source Compiler Incomplete Object Code extern printf(); _printf _fget _fscan Static Library Object Code Linker Binary Executable

Compilation, Assembly, and Static Linking Facilitates debugging of the compiler Source Compiler Assembly Assembler Static Library Object Code Linker Binary Executable

Compilation, Assembly, and Dynamic Linking Dynamic libraries (DLL,.so,.dylib) are linked at run-time by the OS (via stubs in the executable) Source Compiler Assembly Assembler Shared Dynamic Libraries Input Incomplete Executable Output

Preprocessing Most C and C++ compilers use a preprocessor to expand macros Source Preprocessor Modified Source Compiler Assembly or Object Code

The CPP Preprocessor Early C++ compilers used the CPP preprocessor to generated C code for compilation C++ Source Code C++ Preprocessor C Source Code C Compiler Assembly or Object Code

Integrated Development Environments ming tools function together in concert Editors Compilers/preprocessors/interpreters Debuggers Emulators Assemblers Linkers Advantages Tools and compilation stages are hidden Automatic source-code dependency checking Debugging made simpler Editor with search facilities Examples Smalltalk-80, Eclipse, MS VisualStudio, Borland

Compilation Phases and Passes Compilation of a program proceeds through a fixed series of phases Each phase use an (intermediate) form of the program produced by an earlier phase Subsequent phases operate on lower-level code representations Each phase may consist of a number of passes over the program representation Pascal, FORTRAN, C languages designed for one-pass compilation, which explains the need for function prototypes Single-pass compilers need less memory to operate Java and ADA are multi-pass

Front end analysis Back end synthesis Compiler Front- and Back-end Source program (character stream) Scanner (lexical analysis) Tokens Parser (syntax analysis) Parse tree Semantic Analysis and Intermediate Code Generation Abstract syntax tree or other intermediate form Abstract syntax tree or other intermediate form Machine- Independent Code Improvement Modified intermediate form Target Code Generation Assembly or object code Machine-Specific Code Improvement Modified assembly or object code

Scanner: Lexical Analysis Lexical analysis breaks up a program into tokens program gcd (input, output); var i, j : integer; begin read (i, j); while i <> j do if i > j then i := i - j else j := j - i; writeln (i) end. program gcd ( input, output ) ; var i, j : integer ; begin read ( i, j ) ; while i <> j do if i > j then i := i - j else j := i - i ; writeln ( i ) end.

Context-Free Grammars A context-free grammar defines the syntax of a programming language The syntax defines the syntactic categories for language constructs Statements Expressions Declarations Categories are subdivided into more detailed categories A Statement is a For-statement If-statement Assignment <statement> <for-statement> <assignment> ::= <for-statement> <if-statement> <assignment> ::= for ( <expression> ; <expression> ; <expression> ) <statement> ::= <identifier> := <expression>

Example: Micro Pascal <> ::= program <id> ( <id> <More_ids> ) ; <Block>. <Block> ::= <Variables> begin <Stmt> <More_Stmts> end <More_ids> ::=, <id> <More_ids> <Variables> ::= var <id> <More_ids> : <Type> ; <More_Variables> <More_Variables> ::= <id> <More_ids> : <Type> ; <More_Variables> <Stmt> ::= <id> := <Exp> if <Exp> then <Stmt> else <Stmt> while <Exp> do <Stmt> begin <Stmt> <More_Stmts> end <Exp> ::= <num> <id> <Exp> + <Exp> <Exp> - <Exp>

Parser: Syntax Analysis Parsing organizes tokens into a hierarchy called a parse tree (more about this later) Essentially, a grammar of a language defines the structure of the parse tree, which in turn describes the program structure A syntax error is produced by a compiler when the parse tree cannot be constructed for a program

Semantic Analysis Semantic analysis is applied by a compiler to discover the meaning of a program by analyzing its parse tree or abstract syntax tree Static semantic checks are performed at compile time Type checking Every variable is declared before used Identifiers are used in appropriate contexts Check subroutine call arguments Check labels Dynamic semantic checks are performed at run time, and the compiler produces code that performs these checks Array subscript values are within bounds Arithmetic errors, e.g. division by zero Pointers are not dereferenced unless pointing to valid object A variable is used but hasn't been initialized When a check fails at run time, an exception is raised

Semantic Analysis and Strong Typing A language is strongly typed "if (type) errors are always detected" Errors are either detected at compile time or at run time Examples of such errors are listed on previous slide Languages that are strongly typed are Ada, Java, ML, Haskell Languages that are not strongly typed are Fortran, Pascal, C/C++, Lisp Strong typing makes language safe and easier to use, but potentially slower because of dynamic semantic checks In some languages, most (type) errors are detected late at run time which is detrimental to reliability e.g. early Basic, Lisp, Prolog, some script languages

Code Generation and Intermediate Code Forms A typical intermediate form of code produced by the semantic analyzer is an abstract syntax tree (AST) The AST is annotated with useful information such as pointers to the symbol table entry of identifiers Example AST for the gcd program in Pascal

Target Code Generation and Optimization The AST with the annotated information is traversed by the compiler to generate a low-level intermediate form of code, close to assembly This machine-independent intermediate form is optimized From the machine-independent form assembly or object code is generated by the compiler This machine-specific code is optimized to exploit specific hardware features