Crafting a Compiler with C (II) Compiler V. S. Interpreter

Similar documents
What do Compilers Produce?

Compiler Design (40-414)

Chapter 3 (part 3) Describing Syntax and Semantics

Formats of Translated Programs

The Structure of a Syntax-Directed Compiler

The Structure of a Syntax-Directed Compiler

Compiling and Interpreting Programming. Overview of Compilers and Interpreters

10/18/18. Outline. Semantic Analysis. Two types of semantic rules. Syntax vs. Semantics. Static Semantics. Static Semantics.

CS 4201 Compilers 2014/2015 Handout: Lab 1

The Structure of a Compiler

COP4020 Programming Languages. Compilers and Interpreters Robert van Engelen & Chris Lacher

A Simple Syntax-Directed Translator

What is a compiler? var a var b mov 3 a mov 4 r1 cmpi a r1 jge l_e mov 2 b jmp l_d l_e: mov 3 b l_d: ;done

CST-402(T): Language Processors

When do We Run a Compiler?

Principles of Programming Languages. Lecture Outline

COMPILER DESIGN. For COMPUTER SCIENCE

Compilers and Interpreters

Working of the Compilers

COLLEGE OF ENGINEERING, NASHIK. LANGUAGE TRANSLATOR

Chapter 3: Syntax and Semantics. Syntax and Semantics. Syntax Definitions. Matt Evett Dept. Computer Science Eastern Michigan University 1999

COMPILER DESIGN LECTURE NOTES

What is a compiler? Xiaokang Qiu Purdue University. August 21, 2017 ECE 573

Chapter 3. Describing Syntax and Semantics ISBN

PRINCIPLES OF COMPILER DESIGN UNIT I INTRODUCTION TO COMPILERS

Compilers. History of Compilers. A compiler allows programmers to ignore the machine-dependent details of programming.

9/5/17. The Design and Implementation of Programming Languages. Compilation. Interpretation. Compilation vs. Interpretation. Hybrid Implementation

LECTURE 3. Compiler Phases


Chapter 3. Describing Syntax and Semantics

Chapter 3. Syntax - the form or structure of the expressions, statements, and program units

COP 3402 Systems Software. Lecture 4: Compilers. Interpreters

Language Translation. Compilation vs. interpretation. Compilation diagram. Step 1: compile. Step 2: run. compiler. Compiled program. program.

COMPILER CONSTRUCTION LAB 2 THE SYMBOL TABLE. Tutorial 2 LABS. PHASES OF A COMPILER Source Program. Lab 2 Symbol table

About the Tutorial. Audience. Prerequisites. Copyright & Disclaimer. Compiler Design

About the Authors... iii Introduction... xvii. Chapter 1: System Software... 1

The Structure of a Syntax-Directed Compiler

Compiling Regular Expressions COMP360

Structure of a compiler. More detailed overview of compiler front end. Today we ll take a quick look at typical parts of a compiler.

CS152 Programming Language Paradigms Prof. Tom Austin, Fall Syntax & Semantics, and Language Design Criteria

CMSC 330: Organization of Programming Languages. Formal Semantics of a Prog. Lang. Specifying Syntax, Semantics

CSE 413 Languages & Implementation. Hal Perkins Winter 2019 Structs, Implementing Languages (credits: Dan Grossman, CSE 341)

Optimizing Finite Automata

The Compiler So Far. CSC 4181 Compiler Construction. Semantic Analysis. Beyond Syntax. Goals of a Semantic Analyzer.

CS 314 Principles of Programming Languages

Syntax. A. Bellaachia Page: 1

CJT^jL rafting Cm ompiler

SYED AMMAL ENGINEERING COLLEGE (An ISO 9001:2008 Certified Institution) Dr. E.M. Abdullah Campus, Ramanathapuram

Software II: Principles of Programming Languages

Question Bank. 10CS63:Compiler Design

CMPT 379 Compilers. Anoop Sarkar.

Semantic Analysis. Lecture 9. February 7, 2018

COMPILER CONSTRUCTION Seminar 02 TDDB44

Programmiersprachen (Programming Languages)

Compilers. Prerequisites

Introduction to Compiler


Reading Assignment. Scanner. Read Chapter 3 of Crafting a Compiler.

Compiler Design. Computer Science & Information Technology (CS) Rank under AIR 100

Introduction. Interpreters High-level language intermediate code which is interpreted. Translators. e.g.

Programming Languages

CS321 Languages and Compiler Design I. Winter 2012 Lecture 1

Why are there so many programming languages? Why do we have programming languages? What is a language for? What makes a language successful?

Programming Languages Third Edition. Chapter 7 Basic Semantics

3. DESCRIBING SYNTAX AND SEMANTICS

Chapter 3. Topics. Languages. Formal Definition of Languages. BNF and Context-Free Grammars. Grammar 2/4/2019

CS 242. Fundamentals. Reading: See last slide

CS 415 Midterm Exam Spring 2002

CS Lecture 2. The Front End. Lecture 2 Lexical Analysis

CS 314 Principles of Programming Languages. Lecture 3

Front End. Hwansoo Han

Fall Compiler Principles Lecture 5: Intermediate Representation. Roman Manevich Ben-Gurion University of the Negev

Theoretical Part. Chapter one:- - What are the Phases of compiler? Answer:

PRINCIPLES OF COMPILER DESIGN UNIT I INTRODUCTION TO COMPILING

Why are there so many programming languages?

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler Front-End

SEM / YEAR : VI / III CS2352 PRINCIPLES OF COMPLIERS DESIGN UNIT I - LEXICAL ANALYSIS PART - A

Chapter 3. Describing Syntax and Semantics

Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan

The role of semantic analysis in a compiler

Gujarat Technological University Sankalchand Patel College of Engineering, Visnagar B.E. Semester VII (CE) July-Nov Compiler Design (170701)

Lexical Analysis. Lexical analysis is the first phase of compilation: The file is converted from ASCII to tokens. It must be fast!

Chapter 1. Preview. Reason for Studying OPL. Language Evaluation Criteria. Programming Domains

Computer Science Department Carlos III University of Madrid Leganés (Spain) David Griol Barres

A simple syntax-directed

4. An interpreter is a program that

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler so far

Properties of Regular Expressions and Finite Automata

TDDD55 - Compilers and Interpreters Lesson 3

Syntax and Grammars 1 / 21

Program Abstractions, Language Paradigms. CS152. Chris Pollett. Aug. 27, 2008.

Compiler Code Generation COMP360

SYSTEMS PROGRAMMING. Srimanta Pal. Associate Professor Indian Statistical Institute Kolkata OXFORD UNIVERSITY PRESS

Undergraduate Compilers in a Day

Formal Languages and Compilers Lecture I: Introduction to Compilers

Chapter 3. Semantics. Topics. Introduction. Introduction. Introduction. Introduction

Alternatives for semantic processing

TDDD55- Compilers and Interpreters Lesson 3

CS1622. Semantic Analysis. The Compiler So Far. Lecture 15 Semantic Analysis. How to build symbol tables How to use them to find

CS 406/534 Compiler Construction Putting It All Together

Transcription:

Crafting a Compiler with C (II) 資科系 林偉川 Compiler V S Interpreter Compilation - Translate high-level program to machine code Lexical Analyzer, Syntax Analyzer, Intermediate code generator(semantics Analyzer), Optimization, Code Generation use symbol table - Slow translation + Linker - Fast execution - Von-Neumann bottleneck connection between computer s memory and CPU 2 1

Compiler V S Interpreter Pure interpretation for source-code debugger - No translation - Fetch/decode/execution cycle - Slow execution (every-time statement decoding) LISP, shell of UNIX, Script language 3 Compiler V S Interpreter Interpreter directly interpret the source programs they process or syntactically transformed of them A compiler has distinct translation and execution phase The translation phase may generate a virtual machine language that is interpreted by a particular computer, either in Firmware or HW 4 2

Compiler V S Interpreter 5 Hybrid Implementation system 6 3

Phase 1 Editor Disk Program is created in an editor and stored on disk in a file ending with java Phase 2 Compiler Disk Compiler creates bytecodes and stores them on disk in a file ending with class Phase 3 Class Loader Primary Memory Class loader reads class files containing bytecodes from disk and puts those bytecodes in memory Disk Phase 4 Bytecode Verifier Primary Memory Bytecode verifier confirms that all bytecodes are valid and do not violate Java s security restrictions Phase 5 Interpreter Primary Memory Interpreter reads bytecodes and translates them into a language that the computer can understand, possibly storing data values as the program executes 7 Compiler Structure Scanner Scan source program and build tokens such as identifier, integer, real, reserved words, delimiters (tokens are encoded as integer) Put program into a compact and uniform formats (token) Eliminates unneeded information (comments) Enter preliminary information into symbol and attribute tables Formats and lists the source program use regular expression to describe tokens and can be converted into Finite Automata [scanner generator Lex] Hand-coding scanner VS Table-driven scanner (generator produce) 8 4

Compiler Structure Parser Build the syntactic structure defined by a programming language syntax from tokes Parser reads tokens and group them into units as specified by the productions of CFG Parsers are driven by tables created from CFG by a parser generator (Yacc) The parser verifies correct syntax, if a syntax error is found, it can repair the error or error recovery by consulting error repair table created by parser/repair generator 9 Compiler Structure Semantics routine supply the meaning of the program based on the syntactic structure to generate intermediate or target code Check the static semantics of each construct to be legal, if the construct is correct, semantic routines also do the actual translation (intermediate code generation ) attribute grammar Static semantic checking is machine independent, intermediate code generation is machine dependent If two components are separated, retargeting a compiler to generate code for a different target machine is simplified 10 5

Compiler Structure Optimizer intermediate code generation is analyzed and transformed into improved code by optimizer Very complex and slow Maybe turn off to speed up translation Usually it was skipped Peephole optimization Attempt to do simple, often machine dependent, code improvements subset optimizer to reduce the instructions: Elimination the instructions such as V * 1 or V + 0 Eliminate Store Reg_a into memory(a), Load memory(a) to Reg_a 11 Compiler Structure Code generator intermediate code is mapped into target code by code generator Need detailed information about the target machine Register allocation Addressing mode (direct/indirect) Choice of instruction formats (RR, RM, MM) Generation of good target machine code requires consideration of many special cases 12 6

Compiler Structure Compilers analyze their input, construct a semantic representation, and synthesize their output from it (analysis-synthesis paradigm) A program consists of a front-end which analyzes the text and constructs internally a table of (length, frequency) pairs 13 Syntax-directed compiler 14 7

Compiler Architecture 15 Compiler method One-pass compiler no optimization is needed, just merge code generation with semantic routine and intermediate code Automatic generation of code generators has been widely studied automatically match a low-level IR to target instruction templates Localizes the target machine dependencies of a compiler and makes it possible to retarget a compiler to a new target machine (Assembly into another machine s) 16 8

Compiler method Symbol and attribute tables A symbol table is a mechanism that allow information (attributes) to be associated with identifiers Each time an identifier is used, a symbol table provides access to the information collected about the identifier when its declaration was processed Symbol tables can be used by any of the compiler components to enter, share, and later retrieve information about variables, procedures name, labels, etc 17 Compiler method Compiler writing tools Compiler-compilers or compiler generator includes scanner and parser generator and some also include symbol table routines and code-generation capability These generators greatly aid in building piece of compilers, but the great bulk of the effort in building a compiler lies in writing and debugging semantic routines Describe what a program should do and generate a program from it by a program generator 18 9

Advantage of using compiler generator Input to compiler generator is much higher level of abstraction than handwritten program This increases the chance that the program will be correct Allows increasing flexibility and modifiability If a small change in syntax of a language, a handwritten parser would be a major block to any such change With a generated parser, just changes the syntax description and generates a new parser 19 Advantage of using compiler generator Pre-canned or tailored code can be added to the generated program, enhancing its power at hardly any cost Input error handling is usually a difficult affair in handwritten parser A generated parser can include tailored error correction code with no effort on the part of the part of the programmer 20 10

Advantage of using compiler generator A formal description can be used to generate more than one type of program Generated compiler program may be more or less efficient than handwritten one whenever the possibility exists Generating a compiler program is always to be preferred Using a parser generator, a parser can be generated automatically Especially for the file conversion system profited from compiler construction technique such as converting text file to ps or pdf 21 Compiler Structure Compilers analyze their input, construct a semantic representation, and synthesize their output from it (analysis-synthesis paradigm) A program consists of a front-end which analyzes the text and constructs internally a table of (length, frequency) pairs 22 11

Semantic representation Semantic representation takes the intermediate code as data structure such as annotated abstract syntax tree (parse tree) AST Syntax tree (parse tree) of program text is a data structure which shows how the various segments of the program text are to be viewed in term of grammar Parsing is the process of structuring a text according to a given grammar 23 Semantic representation The parser can be written by hand if the grammar is very small and simple For larger and/or more complicated grammars it can be generated by a parser generator The expression b*b-4*a*c, the annotated parse tree according to the grammar shown as followed: expression expression + term expression - term term term term * factor term / factor factor factor identifier constant ( expression ) 24 12

Parse tree of an expression b*b-4*a*c 25 Annotated AST of an expression b*b-4*a*c 26 13

Structure of the compiler 27 Syntax of PL CFG Context-free syntax defines legal sequences of symbols independent of any notion of what the symbols mean Context-free cannot describe type compatibility (type checking) and scoping rules (block structure) (CFGs limitation) context-sensitive a=b+c is illegal if b or c is undeclared 28 14

Another semantic of PL - Context-sensitive Context-sensitive restriction can be handled by semantic process which can be divided into static and run-time semantics Static semantics rules define that all identifiers should be declared, operators and operands should be typecompatible attribute grammar Ex E 1 E 2 +T Should be augmented with a type attribute for E and T, a predicate requiring type compatibility (E 2 type = numeric) and (Ttype = numeric) 29 Attribute Grammar Attribute grammar Rule 1 Syntax rule: <assign> <var> = <expr> Semantic rules: <expr>expected_type <var>actual_type 由 LHS type 決定 Rule 2 Syntax rule: <expr> <var>[2] + <var>[3] Semantic rules: <expr>actual_type if (<var>[2]actual_type = int) and (<var>[3]actual_type = int) then int else real Predicate: <expr>actual_type = <expr>expected_type 30 15

Attribute grammar 3 Syntax rule: <expr> <var> Semantic rule: <expr>actual_type <var>actual_type Predicate: <expr>actual_type = <expr>expected_type 4 Syntax rule: <var> A B C Semantic rule: <var>actual_type Look-up(<var>string) Look-up function looks up a given variable name in the symbol table and returns the variable s type 31 Example of attribute grammar Ex A=A+B <assign> <var> = <expr> 1 <var>actual_type Look-up(A) [Rule 4] 2 <expr>expect_type <var>actual_type [Rule 1] 3 <var>[2]actual_type Look-up(A) [Rule 4] <var>[3]actual_type Look-up(B) [Rule 4] 4 <expr>actual_type either int or real [Rule 2] 5 <expr>expected_type =<expr>actual_type is either TRUE or FALSE [Rule 2] 32 16

Attribute grammar Characteristics: Attribute grammar is used to describe the syntax and the static semantics of a PL Advantage: Attribute grammar can be used as the formal definition of a language that can be input to a compiler generation System 33 Attribute grammar Disadvantage: Hard to describe all of the syntax and static semantics of a real PL because its size and complexity The number of attributes and semantics rules make such grammars difficult to read and write The attribute values on a large parse tree are expensive 34 17

Runtime semantic Axiomatic semantics based on specified relation or predicates that relate program variables 35 Runtime semantic Ex The axiom definition var = exp usually states that a predicate involving var is true after statement execution iff the predicate obtained by replacing all occurrences of var by exp is true for y > 3 to be true after executing the statement y=x+1, the predicate x+1>3 should have to be true before the statement is executed Aliasing makes axiomatic definitions much more complex 36 18

Advantage Axiomatic semantics Good for deriving proofs of program correctness Concentrate on how relations among variables are changed by statement execution Disadvantage Difficult to be used to completely define most PL Cannot model implementation consideration such as run out of memory 37 Denotational model Rely on notation from mathematics and often fairly compact A syntax-directed definition in term of the meaning of its immediate constituents Ex Define integer addition (cannot overflow) E[T 1 +T 2 ] = E[T 1 ] is integer and E[T 2 ] is integer range(e[t 1 ]+E[T 2 ]) else error The first Ada compiler basis 38 19