Compiler construction in4020 lecture 5

Similar documents
Compiler construction 2002 week 5

Compiler construction 2005 lecture 5

IN4305 Engineering project Compiler construction

Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan

Parsing How parser works?

Syntax Analysis Part IV

COMPILER CONSTRUCTION LAB 2 THE SYMBOL TABLE. Tutorial 2 LABS. PHASES OF A COMPILER Source Program. Lab 2 Symbol table

Compiler construction lecture 3

Using an LALR(1) Parser Generator

CSCI Compiler Design

Compiler construction in4303 lecture 3

Yacc: A Syntactic Analysers Generator

FUNCTIONAL AND LOGIC PROGRAMS

Compiler Construction

Lexical and Syntax Analysis

CSE302: Compiler Design

Compiler Construction

Compiler Lab. Introduction to tools Lex and Yacc

LECTURE 11. Semantic Analysis and Yacc

SEMANTIC ANALYSIS TYPES AND DECLARATIONS

Compiler construction in4303 answers

COMPILER CONSTRUCTION Seminar 02 TDDB44

Syntax-Directed Translation

As we have seen, token attribute values are supplied via yylval, as in. More on Yacc s value stack

UNIVERSITY OF CALIFORNIA

TDDD55 - Compilers and Interpreters Lesson 3

Faculty of Electrical Engineering, Mathematics, and Computer Science Delft University of Technology

Faculty of Electrical Engineering, Mathematics, and Computer Science Delft University of Technology

SYED AMMAL ENGINEERING COLLEGE (An ISO 9001:2008 Certified Institution) Dr. E.M. Abdullah Campus, Ramanathapuram

Yacc. Generator of LALR(1) parsers. YACC = Yet Another Compiler Compiler symptom of two facts: Compiler. Compiler. Parser

Anatomy of a Compiler. Overview of Semantic Analysis. The Compiler So Far. Why a Separate Semantic Analysis?

L L G E N. Generator of syntax analyzier (parser)

Engineering project Compiler Construction Assignments

Context-free grammars

Yacc Yet Another Compiler Compiler

COMPILERS AND INTERPRETERS Lesson 4 TDDD16

Compiler Construction: Parsing

TDDD55- Compilers and Interpreters Lesson 3

Principles of Programming Languages

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

JavaCC Parser. The Compilation Task. Automated? JavaCC Parser

G53CMP: Lecture 4. Syntactic Analysis: Parser Generators. Henrik Nilsson. University of Nottingham, UK. G53CMP: Lecture 4 p.1/32

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

Compiler construction assignments

A clarification on terminology: Recognizer: accepts or rejects strings in a language. Parser: recognizes and generates parse trees (imminent topic)

Compilers Project 3: Semantic Analyzer

The Compiler So Far. CSC 4181 Compiler Construction. Semantic Analysis. Beyond Syntax. Goals of a Semantic Analyzer.

Compilers. Type checking. Yannis Smaragdakis, U. Athens (original slides by Sam

Semantic actions for declarations and expressions

Conflicts in LR Parsing and More LR Parsing Types

Semantic actions for declarations and expressions. Monday, September 28, 15

Semantic Analysis. Lecture 9. February 7, 2018

COMPILER (CSE 4120) (Lecture 6: Parsing 4 Bottom-up Parsing )

Question Points Score

Lex & Yacc. By H. Altay Güvenir. A compiler or an interpreter performs its task in 3 stages:

Lex & Yacc. by H. Altay Güvenir. A compiler or an interpreter performs its task in 3 stages:

Syn S t yn a t x a Ana x lysi y s si 1

PRACTICAL CLASS: Flex & Bison

RYERSON POLYTECHNIC UNIVERSITY DEPARTMENT OF MATH, PHYSICS, AND COMPUTER SCIENCE CPS 710 FINAL EXAM FALL 96 INSTRUCTIONS

An Introduction to LEX and YACC. SYSC Programming Languages

Compiler Construction

Lex & Yacc (GNU distribution - flex & bison) Jeonghwan Park

Introduction to Yacc. General Description Input file Output files Parsing conflicts Pseudovariables Examples. Principles of Compilers - 16/03/2006

Introduction to Lex & Yacc. (flex & bison)

Project Compiler. CS031 TA Help Session November 28, 2011

A Bison Manual. You build a text file of the production (format in the next section); traditionally this file ends in.y, although bison doesn t care.

Undergraduate Compilers in a Day

Big Picture: Compilation Process. CSCI: 4500/6500 Programming Languages. Big Picture: Compilation Process. Big Picture: Compilation Process.

Parser Tools: lex and yacc-style Parsing

Parser Tools: lex and yacc-style Parsing

Syntax Analysis Part VIII

Context Analysis. Mooly Sagiv. html://

Syntax Analysis The Parser Generator (BYacc/J)

Topic 5: Syntax Analysis III

Chapter 3: Describing Syntax and Semantics. Introduction Formal methods of describing syntax (BNF)

Preparing for the ACW Languages & Compilers

Programming Languages

COP4020 Programming Assignment 2 - Fall 2016

Context-Free Grammar. Concepts Introduced in Chapter 2. Parse Trees. Example Grammar and Derivation

CS1622. Semantic Analysis. The Compiler So Far. Lecture 15 Semantic Analysis. How to build symbol tables How to use them to find

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler Front-End

Project 2 Interpreter for Snail. 2 The Snail Programming Language

The role of semantic analysis in a compiler

Simple LR (SLR) LR(0) Drawbacks LR(1) SLR Parse. LR(1) Start State and Reduce. LR(1) Items 10/3/2012

CSE P 501 Compilers. Static Semantics Hal Perkins Winter /22/ Hal Perkins & UW CSE I-1

Semantic actions for declarations and expressions

Today's Topics. CISC 458 Winter J.R. Cordy

Chapter 2: Syntax Directed Translation and YACC

CSE 401 Midterm Exam Sample Solution 11/4/11

Automatic Scanning and Parsing using LEX and YACC

CS143 Handout 12 Summer 2011 July 1 st, 2011 Introduction to bison

Part III : Parsing. From Regular to Context-Free Grammars. Deriving a Parser from a Context-Free Grammar. Scanners and Parsers.

Compiler Construction Assignment 3 Spring 2018

Review main idea syntax-directed evaluation and translation. Recall syntax-directed interpretation in recursive descent parsers

Syntax-Directed Translation. Lecture 14

Bottom-Up Parsing. Lecture 11-12

CS 426 Fall Machine Problem 1. Machine Problem 1. CS 426 Compiler Construction Fall Semester 2017

CS 415 Midterm Exam Spring 2002

LECTURE 3. Compiler Phases

Software II: Principles of Programming Languages

Transcription:

Compiler construction in4020 lecture 5 Semantic analysis Assignment #1 Chapter 6.1

Overview semantic analysis identification symbol tables type checking CS assignment yacc LLgen language grammar parser generator program text lexical analysis tokens syntax analysis AST context handling annotated AST

Semantic analysis information is scattered throughout the program identifiers serve as connectors find defining occurrence of each applied occurrence of an identifier in the AST undefined identifiers error unused identifiers warning check rules in the language definition type checking control flow (dead code)

Symbol table global storage used by all compiler phases holds information about identifiers: type location size program text lexical analysis syntax analysis context handling annotated AST optimizations code generation executable s y m b o l t a b l e

Symbol table implementation extensible string-indexable array linear list next tree name type hash table bucket 0 bucket 1 bucket 2 bucket 3 next name type aap next name type hash function: string int noot mies

Identification different kinds of identifiers variables type names field selectors name spaces typedef int i int j void foo(int j) { struct i {i i} i scopes i: i.i = 3 printf( "%d\n", i.i) }

Handling scopes Stack of scope elements when entering a scope a new element is pushed on the stack declared identifiers are entered in the top scope element applied identifiers are looked up in the scope elements from top to bottom the top element is removed upon scope exit

A scoped hash-based symbol table aap( int noot) scope stack { int mies, aap 2 }... noot decl 1 prop 1 0 hash table bucket 0 bucket 1 bucket 2 aap decl 2 prop 0 prop bucket 3 mies level decl 2 prop

Identification: complications overloading operators: N*2 prijs*2.20371 functions: PUT(s:STRING) PUT(i:INTEGER) solution: yield set of possibilities (to be constrained by type checking) imported scopes C++ scope resolution operator x:: Modula FROM module IMPORT... solution: stack (or merge) the new scope

Type checking operators and functions impose restrictions on the types of the arguments types basic types structured types type names typedef struct { double re double im } complex

Forward declarations recursive data structures TYPE Tree = POINTER TO Node Type Node = RECORD element : Integer left, right : Tree END RECORD type information must be stored type table type information must be resolved undefined types circularities

Type equivalence name equivalence [all types get a unique name] VAR a : ARRAY [Integer 1..10] OF Real VAR b : ARRAY [Integer 1..10] OF Real structural equivalence [difficult to check] TYPE c = RECORD i : Integer p : POINTER TO c END RECORD TYPE d = RECORD i : Integer p : POINTER TO RECORD i : Integer p : POINTER to c END RECORD END RECORD

Coercions implicit data and type conversion to match operand (argument) type coercions complicate identification (ambiguity) VAR a : Real... a := 5 3.14 + 7 8 + 9 two phase approach expand a type to a set by applying coercions reduce type sets based on constraints imposed by (overloaded) operators and language semantics

Variable: value or location? two usages of variables rvalue: value lvalue: location VAR p : Real VAR q : Real... p := q insert coercion to dereference variable checking rules: := found lvalue rvalue expected lvalue rvalue - deref ERROR - (location of) p deref (location of) q

Exercise (3 min.) complete the table expression construct constant variable identifier (non-variable) &lvalue *rvalue V[rvalue] rvalue + rvalue lvalue = rvalue result kind (lvalue/rvalue) rvalue lvalue rvalue rvalue lvalue V rvalue rvalue V stands for lvalue or rvalue

Assignment for CS students Asterix compiler Asterix program token description Asterix grammar lex yacc lexical analysis syntax analysis 1. replace yacc by LLgen 2. make Asterix object-oriented classes and objects inheritance and dynamic binding context handling code generation C-code

Assignment for CE students Asterix compiler Asterix program token description Asterix grammar lex yacc lexical analysis syntax analysis 1. extend Asterix with records 2. code generation context handling code generation C-code

Online assistance: Panoramix records assembly objects Candy? Name or group number: Code to compile: Go! Browse...

Online assistance: Diagnostix testing Candy? Name or group number: Assignment: Tar file with test set: Go! Browse...

Case study: parser generators Yacc bottom-up LALR(1) LLgen top-down LL(1) language grammar parser generator program text lexical analysis tokens syntax analysis AST context handling annotated AST

Yet another compiler compiler yacc (bison): parser generator for UNIX LALR(1) grammar C code format of the yacc input file: definitions %% rules %% user code tokens + properties grammar rules + actions auxiliary C-code

Yacc-based expression interpreter input file %token DIGIT %% line : expr '\n' { printf("%d\n", $1)} expr : expr '+' expr { $$ = $1 + $3} expr '*' expr { $$ = $1 * $3} '(' expr ')' { $$ = $2} DIGIT %% grammar semantics yacc maintains a stack of values that may be referenced ($i) in the semantic actions

Yacc interface to lexical analyzer yacc invokes yylex() to get the next token the value of a token must be stored in the global variable yylval the default value type is int, but can be changed %% yylex() { } int c c = getchar() if (isdigit(c)) { yylval = c - '0' return DIGIT } return c

Yacc interface to back-end yacc generates a function named yyparse() syntax errors are reported by invoking a callback function yyerror() %% yylex() {... } main() { yyparse() } yyerror() { printf("syntax error\n") exit(1) }

Yacc-based expression interpreter input file (desk0.y) run yacc %% line : expr '\n' { printf("%d\n", $1)} expr : expr '+' expr { $$ = $1 + $3} expr '*' expr { $$ = $1 * $3} '(' expr ')' { $$ = $2} DIGIT %% > make desk0 bison -v desk0.y desk0.y contains 4 shift/reduce conflicts. gcc -o desk0 desk0.tab.c >

Yacc-based expression interpreter input file (desk0.y) %% run yacc run desk0, is it correct? > desk0 2*3+4 14 %% line : expr '\n' { printf("%d\n", $1)} expr : expr '+' expr { $$ = $1 + $3} expr '*' expr { $$ = $1 * $3} '(' expr ')' { $$ = $2} DIGIT NO operator precedence: Meneer Please Van Excuse DalenMy Wacht Dear Op Aunt Antwoord Sally

Operator precedence in Yacc priority from top (low) to bottom (high) %token DIGIT %left '+' %left '*' %% line : expr '\n' { printf("%d\n", $1)} expr : expr '+' expr { $$ = $1 + $3} expr '*' expr { $$ = $1 * $3} '(' expr ')' { $$ = $2} DIGIT %%

Exercise (5 min.) multiple lines: %% lines: line lines line line : expr '\n' { printf("%d\n", $1)} expr : expr '+' expr { $$ = $1 + $3} expr '*' expr { $$ = $1 * $3} '(' expr ')' { $$ = $2} DIGIT %% Extend the interpreter to a desk calculator with registers named a z. Example input: v=3*(w+4)

Answers

LLgen: LL(1) parser generator LLgen is part of the Amsterdam Compiler Kit takes LL(1) grammar + semantic actions in C and generates a recursive descent parser LLgen features: repetition operators advanced error handling parameter passing control over semantic actions dynamic conflict resolvers

LLgen example: expression interpreter start from LR(1) grammar make grammar LL(1) left recursion operator precedence use repetition operators %token DIGIT add semantic actions attach parameters to grammar rules insert C-code between the symbols main : line+ line : expr '\n' expr : term [ '+' term ]* term : factor [ '*' factor ]* factor : '(' expr ') DIGIT LLgen

main : line+ line {int e} : expr(&e) '\n' { printf("%d\n", e)} expr(int *e) {int t} : term(e) [ '+' term(&t) { *e = *e + t} ]* term(int *t) {int f} : factor(t) [ '*' factor(&f) { *t = *t * f} ]* factor(int *f) : '(' expr(f) ')' DIGIT { *f = yylval} grammar semantics

LLgen interface to lexical analyzer by default LLgen invokes yylex() to get the next token yylex() { int c the value of a token can be stored in any global variable (yylval) of any type (int) } c = getchar() if (isdigit(c)) { yylval = c - '0' return DIGIT } return c

LLgen interface to back-end LLgen generates a user-named function (parse) LLgen handles syntax errors by inserting missing tokens and deleting unexpected tokens LLmessage() is invoked to notify the lexical analyzer %start parse, main LLmessage(int class) { switch (class) { case -1: printf("expecting EOF, ") case 0: printf("deleting token (%d)\n",llsymb) break default: /* push back token LLsymb */ printf("inserting token (%d)\n",class) break } }

Exercise (5 min.) extend LLgen-based interpreter to a desk calculator with registers named a z. Example input: v=3*(w+4)

Answers