Lecture 10 Parsing 10.1

Similar documents
COP 3402 Systems Software Top Down Parsing (Recursive Descent)

CMPS Programming Languages. Dr. Chengwei Lei CEECS California State University, Bakersfield

Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part 1

Syntax. In Text: Chapter 3

Homework & Announcements

CPS 506 Comparative Programming Languages. Syntax Specification

announcements CSE 311: Foundations of Computing review: regular expressions review: languages---sets of strings

Lecture 4: Syntax Specification

Chapter 3. Describing Syntax and Semantics ISBN

Chapter 3. Describing Syntax and Semantics

COP 3402 Systems Software Syntax Analysis (Parser)

Last time. What are compilers? Phases of a compiler. Scanner. Parser. Semantic Routines. Optimizer. Code Generation. Sunday, August 29, 2010

Lexical and Syntax Analysis. Top-Down Parsing

Dr. D.M. Akbar Hussain

Principles of Programming Languages COMP251: Syntax and Grammars

CSE450 Translation of Programming Languages. Lecture 4: Syntax Analysis

3. Context-free grammars & parsing

Chapter 3: Describing Syntax and Semantics. Introduction Formal methods of describing syntax (BNF)

Describing Syntax and Semantics

CIT Lecture 5 Context-Free Grammars and Parsing 4/2/2003 1

CSE 130 Programming Language Principles & Paradigms Lecture # 5. Chapter 4 Lexical and Syntax Analysis

Lexical and Syntax Analysis

10/5/17. Lexical and Syntactic Analysis. Lexical and Syntax Analysis. Tokenizing Source. Scanner. Reasons to Separate Lexical and Syntax Analysis

Chapter 4. Syntax - the form or structure of the expressions, statements, and program units

Compiler Design Concepts. Syntax Analysis

CSE 311 Lecture 21: Context-Free Grammars. Emina Torlak and Kevin Zatloukal

EECS 6083 Intro to Parsing Context Free Grammars

Introduction to Parsing

CSE 12 Abstract Syntax Trees

F453 Module 7: Programming Techniques. 7.2: Methods for defining syntax

10/4/18. Lexical and Syntactic Analysis. Lexical and Syntax Analysis. Tokenizing Source. Scanner. Reasons to Separate Lexical and Syntactic Analysis

Programming Language Specification and Translation. ICOM 4036 Fall Lecture 3

Defining syntax using CFGs

A programming language requires two major definitions A simple one pass compiler

Introduction to Lexing and Parsing

4. Lexical and Syntax Analysis

ICOM 4036 Spring 2004

Programming Languages Third Edition

Chapter 4. Lexical and Syntax Analysis

Theoretical Part. Chapter one:- - What are the Phases of compiler? Answer:

Programming Language Syntax and Analysis

CSE 3302 Programming Languages Lecture 2: Syntax

Formal Languages and Compilers Lecture I: Introduction to Compilers

4. Lexical and Syntax Analysis

Part 5 Program Analysis Principles and Techniques

Consider a description of arithmetic. It includes two equations that define the structural types of digit and operator:

Chapter 3. Describing Syntax and Semantics ISBN

Chapter 3. Describing Syntax and Semantics

A simple syntax-directed

Types and Static Type Checking (Introducing Micro-Haskell)

Building Compilers with Phoenix

Structure of a compiler. More detailed overview of compiler front end. Today we ll take a quick look at typical parts of a compiler.

Chapter 3. Describing Syntax and Semantics

Habanero Extreme Scale Software Research Project

Syntax/semantics. Program <> program execution Compiler/interpreter Syntax Grammars Syntax diagrams Automata/State Machines Scanning/Parsing

Lexical and Syntax Analysis

Parsing. source code. while (k<=n) {sum = sum+k; k=k+1;}

CS415 Compilers. Syntax Analysis. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

EDAN65: Compilers, Lecture 04 Grammar transformations: Eliminating ambiguities, adapting to LL parsing. Görel Hedin Revised:

Parsing II Top-down parsing. Comp 412

Chapter 9: Dealing with Errors

Syntax Analysis/Parsing. Context-free grammars (CFG s) Context-free grammars vs. Regular Expressions. BNF description of PL/0 syntax

Programming Language Definition. Regular Expressions

2068 (I) Attempt all questions.

Context-free grammars (CFG s)

Compiler Design Overview. Compiler Design 1

1. The output of lexical analyser is a) A set of RE b) Syntax Tree c) Set of Tokens d) String Character

Syntax. 2.1 Terminology

Introduction to Syntax Analysis Recursive-Descent Parsing

flex is not a bad tool to use for doing modest text transformations and for programs that collect statistics on input.

Syntax. A. Bellaachia Page: 1

Fundamental Concepts. Chapter 1

CSCI 1260: Compilers and Program Analysis Steven Reiss Fall Lecture 4: Syntax Analysis I

LANGUAGE PROCESSORS. Introduction to Language processor:

Time : 1 Hour Max Marks : 30

Theory and Compiling COMP360

COSC252: Programming Languages: Semantic Specification. Jeremy Bolton, PhD Adjunct Professor

Languages and Compilers

LL(k) Parsing. Predictive Parsers. LL(k) Parser Structure. Sample Parse Table. LL(1) Parsing Algorithm. Push RHS in Reverse Order 10/17/2012

Compilers - Chapter 2: An introduction to syntax analysis (and a complete toy compiler)

Derivations vs Parses. Example. Parse Tree. Ambiguity. Different Parse Trees. Context Free Grammars 9/18/2012

CSCI312 Principles of Programming Languages!

CS323 Lecture - Specifying Syntax and Semantics Last revised 1/16/09

Compiler Construction

List of Figures. About the Authors. Acknowledgments

Syntax Analysis. COMP 524: Programming Language Concepts Björn B. Brandenburg. The University of North Carolina at Chapel Hill

CS 314 Principles of Programming Languages

Syntax Analysis. The Big Picture. The Big Picture. COMP 524: Programming Languages Srinivas Krishnan January 25, 2011

CS 406/534 Compiler Construction Parsing Part I

Lexical Scanning COMP360

Principles of Programming Languages COMP251: Syntax and Grammars

Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part2 3.3 Parse Trees and Abstract Syntax Trees

Introduction to Parsing. Lecture 8

Question Bank. 10CS63:Compiler Design

ITEC2620 Introduction to Data Structures

This book is licensed under a Creative Commons Attribution 3.0 License

CMSC 330: Organization of Programming Languages. Context Free Grammars

Syntactic Analysis. The Big Picture Again. Grammar. ICS312 Machine-Level and Systems Programming

COLLEGE OF ENGINEERING, NASHIK. LANGUAGE TRANSLATOR

CS 415 Midterm Exam Spring 2002

Transcription:

10.1 The next two lectures cover parsing. To parse a sentence in a formal language is to break it down into its syntactic components. Parsing is one of the most basic functions every compiler carries out, as well has having many other uses. In this lecture, we ll be looking at how to define a formal language, and we ll study an example of parsing and evaluating an arithmetic expression.

10.2 Parsing is an operation that s carried out on formally defined languages. So how do we formally define a language? We do it using another formally defined language, called a meta-language, such as Backus-Naur Form (or BNF). Here s an example of BNF. It defines what a product is. <product> ::= <number> <product> * <number> This says that a product is either a number or a number followed by * followed by another product. Note that it s a recursive definition. The symbol is part of BNF, and it means or. (The * symbol is not part of BNF, but is part of the language being defined.)

10.3 The following BNF definitions describe the syntax of an arithmetic expression. <expression> ::= <term> <expression> + <term> <expression> <term> <term> ::= <factor> <term> * <factor> <term> / <factor> <factor> ::= <number> ( <expression> )

10.4 From the preceding definitions, it should be clear that the following are all syntactically correct expressions: 5 5 + 8 (5 + 8) 5 * 2 + 1 5 * (2 + 1) while the following are not: + 5 + ((5 + 8) 5 * 2 + + 1

10.5 Before we parse a sentence in a formal language, it s convenient to break it down into a list of lexemes. A lexeme (sometimes also called token) is the smallest syntactic unit in a language. The following are typical kinds of lexemes: Numbers (ie: numeric strings) Names (ie alphanumeric strings) Operators (including brackets) (eg: *, +, ^, etc) For example, the sentence (64 + 7) * 128 is broken down into a list of seven lexemes: (, 64, +, 7, ), *, and 128.

10.6 In what follows, we ll assume that the expression is a string pointed to by a pointer expression. We ll also assume we have the following access functions are already written: Don t worry about the data type string and how these access functions are written for now. We will deal with them later.

10.7 Here s the BNF for an expression again. <expression> ::= <term> <expression> + <term> <expression> <term> This definition is left-recursive the recursive part is the leftmost item on the right-hand-side on the definition. If we translated this directly into a recursive procedure, it would go into an infinite loop. Here s a right-recursive version which defines the same thing. <expression> ::= <term> <term> + <expression> <term> <expression> The right-recursive version can be translated directly into a recursive parsing procedure, but processes expressions in the reverse order to what we require. So we use the left-recursive version, and translate it into an iterative procedure.

10.8 Here s the BNF again. <expression> ::= <term> <expression> + <term> <expression> <term> To evaluate an expression E iteratively, this is what you do. Evaluate the next term in E, and call the result X. While the next character in E is an operator ( + or ), Read past the operator. Evaluate the next term in E, calling the result Y. Let X be X + Y or X Y, depending on the operator. The value of E is the final value of X.

10.9 Here s the C++ code. It should be self-explanatory.

10.10 Evaluating a term is almost identical. Here s the BNF. <term> ::= <factor> <term> * <factor> <term> / <factor> To evaluate a term T, this is what you do. Evaluate the next factor in T, and call the result X. While the next character in T is an operator ( * or / ), Read past the operator. Evaluate the next factor in T, calling the result Y. Let X be X * Y or X / Y, depending on the operator. The value of T is the final value of X.

10.11 And here s the C++ code.

10.12 Evaluating a factor is a little different, and involves a recursive call. Here s the BNF again. <factor> ::= <number> ( <expression> ) To evaluate a factor F, this is what you do. If the next character in F isn t ( then let X be the first number in F. Otherwise evaluate the next expression in F, and call the result X. Then read past the ). The value of F is X.

10.13 Here s the C++ code.

10.14 Note that evalexpression(), evalterm() and evalfactor() are mutually recursive. That is: evalexpression()---calls---evalterm()---calls---evalfactor() calls We need to use function prototype to specify functions arguments before they are used. Here are the function prototype for all three functions:

10.15 Borland C++ understand a data type called string. Here are some examples of access functions for string data type. (They are actually membership functions, something we will consider in a later lecture.)

10.16 Now we can write our access routines for this program:

10.17

10.18

10.19 expression 7 * ( 5 + 4 )

10.20 expression 7 * ( 5 + 4 )

10.21 expression 7 * ( 5 + 4 ) Returns result = 7 expression string = *(5+4)

10.22 op = * expression ( 5 + 4 ) result = 7

10.23 expression nextchar() = ( 5 + 4 )

10.24 expression nextchar() = ( 5 + 4 )

10.25 expression Result = 5 5 + 4 )

10.26 op = + expression Returns temp = 4 + 4 ) result = 9

10.27 expression result = 9 )

10.28 expression result = 7 temp = 9

10.29 expression final result = 63