Attributes can be added to the grammar symbols, and program fragments can be added to the grammar as semantic actions, to form a syntax-directed translation scheme. Some attributes may be set by the lexical analysis, and some may be computed in the semantic actions. Examples of attributes include values of evaluated subtrees, type information, and source file coordinates. By injecting the corresponding code fragments into the parser implementation, the semantic actions can be executed during the parse. This is known as syntax-directed translation.

Example from the book (Section 2.3): translating infix arithmetic expressions into postfix notation.

Postfix notation for arithmetic expressions puts the operator after its operands instead of between them (which is called infix notation). With postfix notation, no parentheses are needed. This is a good example, since postfix notation is similar to the stack machine code that you will generate in the first lab assignment.

The postfix notation of a single constant num is defined as just that constant. The postfix notation of an infix expression of the form (E) is defined as the postfix notation of E. The postfix notation of an infix expression of the form E1 op E2, where op is some binary operator, is defined as the postfix notation of E1, followed by the postfix notation of E2, followed by op. Note that this definition is not concerned with the precedence or associativity of operators. It assumes that the intended order in which the operators are applied is already reflected in the parse tree for the expression, and the same application order is used in the resulting postfix expression.
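The three cases of the definition can be sketched as a small recursive function. This is only an illustration: the tuple encoding of the tree and the name postfix are assumptions made for this sketch, not something taken from the book or the slides.

```python
# Tree encoding (an assumption for this sketch):
#   ("num", 3)            a single constant
#   ("paren", E)          a parenthesized expression (E)
#   ("op", "+", E1, E2)   a binary operation E1 op E2

def postfix(tree):
    """Return the postfix notation of an expression tree as a string."""
    kind = tree[0]
    if kind == "num":
        # A constant is its own postfix notation.
        return str(tree[1])
    if kind == "paren":
        # Parentheses disappear: the postfix of (E) is the postfix of E.
        return postfix(tree[1])
    if kind == "op":
        # Postfix of E1, then postfix of E2, then the operator.
        _, op, left, right = tree
        return f"{postfix(left)} {postfix(right)} {op}"
    raise ValueError(f"unknown node kind: {kind!r}")

# The infix expression (9 - 5) + 2
example = ("op", "+", ("paren", ("op", "-", ("num", 9), ("num", 5))), ("num", 2))
print(postfix(example))  # 9 5 - 2 +
```

Note that the function never looks at precedence or associativity. The shape of the tree already fixes the order in which the operators are applied.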

This translation scheme is based on the earlier expression grammar (we will take care of the left recursion later). The code fragments should be executed by the parser as soon as the production has been identified. Semantic actions can also be placed in the middle of production bodies. We assume that the scanner has attached an attribute value to the num tokens (the book uses a nonterminal). Note that attributes can also be attached to nonterminals, and that the semantic actions may change the attributes in order to propagate information to different parts of the parse tree.

The semantic actions can be seen as grammar symbols. If they are inserted as leaves in the parse tree, they are executed in the order given by a depth-first, left-to-right traversal of the tree.
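A minimal sketch of this view, assuming a parse tree for 9 - 5 + 2 built by hand: inner nodes are Python lists, terminals are strings, and action leaves are callables. The helper names (run_actions, emit, term, binary) are hypothetical and chosen only for this illustration.

```python
def run_actions(node, out):
    """Depth-first, left-to-right traversal that executes action leaves."""
    if callable(node):
        node(out)                      # an action leaf: execute it now
    elif isinstance(node, list):
        for child in node:             # an inner node: visit children in order
            run_actions(child, out)
    # plain strings are terminals and carry no action

def emit(text):
    """Build an action leaf that appends text to the output."""
    return lambda out: out.append(text)

def term(value):
    # term -> num { print(num.value) }
    return [str(value), emit(str(value))]

def binary(left, op, right):
    # expr -> expr op term { print(op) }
    return [left, op, right, emit(op)]

# Parse tree for 9 - 5 + 2, grouped as (9 - 5) + 2
tree = binary(binary([term(9)], "-", term(5)), "+", term(2))
out = []
run_actions(tree, out)
print(" ".join(out))  # 9 5 - 2 +
```

The actions are never scheduled explicitly. Their order of execution follows entirely from where they sit in the tree.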

Since we treat the semantic actions as grammar symbols, they can be included in the left recursion elimination. Here the same translation scheme is shown with the left recursion removed using the simple procedure shown before. Note that since the semantic actions now appear in the middle of the production bodies, they should be executed as soon as the parser has processed the symbols to their left in the body. Getting this wrong is a common mistake in the first lab assignment. For example, make sure that the expression 3 - 2 - 1 is translated as (3 - 2) - 1 and not as 3 - (2 - 1)!
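The following recursive-descent sketch shows where the action has to go. It assumes the transformed grammar has the usual shape expr -> term rest, rest -> + term {print('+')} rest | - term {print('-')} rest | empty, term -> num {print(num.value)}; the slide's exact grammar is not reproduced here, and the function names are hypothetical.

```python
import re

def tokenize(text):
    # Numbers and the operators + and -; all other characters are ignored in this sketch.
    return re.findall(r"\d+|[-+]", text)

def translate(text):
    """Translate an infix expression over + and - into postfix notation."""
    tokens = tokenize(text)
    pos = 0
    out = []

    def term():
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError("number expected")
        out.append(tokens[pos])        # action { print(num.value) }
        pos += 1

    def rest():
        nonlocal pos
        if pos < len(tokens) and tokens[pos] in ("+", "-"):
            op = tokens[pos]
            pos += 1
            term()
            out.append(op)             # action in the MIDDLE of the body: runs before rest()
            rest()
        # otherwise: the empty production, nothing to do

    term()
    rest()
    if pos != len(tokens):
        raise SyntaxError("unexpected token")
    return " ".join(out)

print(translate("3 - 2 - 1"))  # 3 2 - 1 -
```

If the line out.append(op) were moved after the recursive call to rest(), the result would be 3 2 1 - -, which corresponds to 3 - (2 - 1). That is exactly the mistake described above.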

Syntax-directed definitions are similar to syntax-directed translation schemes, but more abstract and declarative. A syntax-directed definition extends the grammar in two ways: it attaches attributes to grammar symbols (terminals and nonterminals), and it attaches semantic rules to productions that define the values of the attributes. In contrast to semantic actions, no evaluation order is specified explicitly; it is instead implied by the dependencies between the attributes in the definition. It is common to add subscripts to grammar symbols that occur several times in the same production, to be able to distinguish them in the semantic rules. The table shows a syntax-directed definition for the infix-to-postfix translation. The operator used in the semantic rules denotes string concatenation.

Note that the lexical analysis should not assume that the input program is syntactically correct. For instance, the following regular expression, used to distinguish the keyword if from identifiers starting with the letters if, is problematic:

    if[ \t\n]*(

The problem is that it assumes that the next token is a left parenthesis. That holds if the program is syntactically correct, but this might not be the case, and then the keyword is not recognized as a keyword.
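The effect can be illustrated with Python's re module. This is a sketch under assumptions: the pattern names and the classify_first helper are made up for the illustration, the left parenthesis is escaped as Python requires, and the alternative pattern (if not followed by an identifier character) is one possible fix, not necessarily the one intended in the lecture.

```python
import re

# The problematic pattern: 'if' is only a keyword when a '(' follows.
KEYWORD_WITH_PAREN = re.compile(r"if[ \t\n]*\(")
# One possible alternative: 'if' not followed by an identifier character.
KEYWORD_ONLY = re.compile(r"if(?![A-Za-z0-9_])")

def classify_first(pattern, text):
    """Classify the word at the start of text as keyword IF or identifier ID."""
    return "IF" if pattern.match(text) else "ID"

for source in ["if (x) y", "if x) y", "iffy = 1"]:
    print(repr(source),
          classify_first(KEYWORD_WITH_PAREN, source),
          classify_first(KEYWORD_ONLY, source))
```

For the syntactically incorrect input "if x) y", the first pattern classifies if as an identifier, so the parser reports a confusing error about an unexpected identifier instead of a missing parenthesis. The second pattern still returns the keyword token and leaves the error to the parser.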

The typesetting language TeX has support for configuring the lexical analysis. For example, it is possible to change which character is used to start comments. Another example: if the macro \A expands to some and the macro \B expands to macro, then \csname\A\B\endcsname generates a call to the macro \somemacro. In this case, a token (a control sequence token) has been generated from the expansion of other macros.

The lexical analysis can also be implemented as a DFA, which is invoked each time GetNextToken() is called. The tokens recognized by the DFA in this example are:

    <     Less than
    <=    Less than or equal
    >     Greater than
    >=    Greater than or equal
    =     Equal
    <>    Not equal

The * at states 4 and 8 means that the current input position must be moved back one step, because the DFA has read one character beyond the end of the lexeme.

Optional exercise: Draw a DFA that recognizes the following tokens (from the language C):

    add     Plus operator: +
    incr    Increment operator: ++
    sub     Minus operator: -
    decr    Decrement operator: --
    arrow   Struct member accessor: ->
    id      Identifiers: [a-zA-Z_][a-zA-Z0-9_]*
    if      The keyword if

This table encodes the DFA from the previous slide. Green cells mark success: a token is returned. Red cells mark a lexical error. In reality, other characters in state 0 would be the start of some other lexemes, since it is uncommon for a language to contain only relational operators.
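A table-driven scanner for the relational operators can be sketched as follows. The state names, the token names and the table layout are assumptions made for this sketch and do not claim to match the numbering on the slide. A table entry is either a next state, a pair (token, retract) for success, or None for a lexical error.

```python
START, SEEN_LT, SEEN_GT = 0, 1, 2

def char_class(ch):
    """Map an input character to a column of the table."""
    return ch if ch in ("<", "=", ">") else "other"

# TABLE[state][column]: int = next state, tuple = (token, retract), None = error
TABLE = {
    START:   {"<": SEEN_LT,   "=": ("EQ", 0), ">": SEEN_GT,   "other": None},
    SEEN_LT: {"<": ("LT", 1), "=": ("LE", 0), ">": ("NE", 0), "other": ("LT", 1)},
    SEEN_GT: {"<": ("GT", 1), "=": ("GE", 0), ">": ("GT", 1), "other": ("GT", 1)},
}

def get_next_token(text, pos):
    """Run the DFA from position pos; return (token, position after the lexeme)."""
    state = START
    while True:
        ch = text[pos] if pos < len(text) else ""   # "" stands for end of input
        pos += 1
        entry = TABLE[state][char_class(ch)]
        if entry is None:
            raise ValueError(f"lexical error at position {pos - 1}")
        if isinstance(entry, tuple):
            token, retract = entry
            return token, pos - retract             # move back if we read too far
        state = entry

print(get_next_token("<=", 0))
print(get_next_token("<a", 0))
```

The entries with retract equal to 1 play the role of the states marked with * on the slide: the token is only known once the following character has been read, so that character has to be given back to the input.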

With a keyword table, recognized identifiers can be checked against the keywords in the table to see if they should be returned as keyword tokens instead. Another strategy is to always test for keywords before identifiers, e.g. by constructing the DFA this way.
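The keyword table strategy can be sketched like this. The contents of the table, the token names and the function scan_word are hypothetical and only serve as an illustration.

```python
import re

# Hypothetical keyword table: lexeme -> token name
KEYWORDS = {"if": "IF", "else": "ELSE", "while": "WHILE"}
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def scan_word(text, pos=0):
    """Scan an identifier at pos; return (token, lexeme, end position) or None."""
    match = IDENTIFIER.match(text, pos)
    if match is None:
        return None
    lexeme = match.group()
    # The whole lexeme is looked up, so 'iffy' stays an identifier.
    return KEYWORDS.get(lexeme, "ID"), lexeme, match.end()

print(scan_word("if (x)"))
print(scan_word("iffy"))
```

With this design the DFA stays small, since it only needs to recognize identifiers, and adding a keyword only requires a new table entry.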
