Error Handling Syntax-Directed Translation Recursive Descent Parsing

Similar documents
Error Handling Syntax-Directed Translation Recursive Descent Parsing

Error Handling Syntax-Directed Translation Recursive Descent Parsing

Error Handling Syntax-Directed Translation Recursive Descent Parsing. Lecture 6

Ambiguity and Errors Syntax-Directed Translation

Abstract Syntax Trees & Top-Down Parsing

Abstract Syntax Trees & Top-Down Parsing

Abstract Syntax Trees & Top-Down Parsing

Programming Languages & Translators PARSING. Baishakhi Ray. Fall These slides are motivated from Prof. Alex Aiken: Compilers (Stanford)

Introduction to Parsing Ambiguity and Syntax Errors

Introduction to Parsing Ambiguity and Syntax Errors

Administrativia. WA1 due on Thu PA2 in a week. Building a Parser III. Slides on the web site. CS164 3:30-5:00 TT 10 Evans.

Administrativia. PA2 assigned today. WA1 assigned today. Building a Parser II. CS164 3:30-5:00 TT 10 Evans. First midterm. Grammars.

CS1622. Today. A Recursive Descent Parser. Preliminaries. Lecture 9 Parsing (4)

Derivations vs Parses. Example. Parse Tree. Ambiguity. Different Parse Trees. Context Free Grammars 9/18/2012

Ambiguity. Grammar E E + E E * E ( E ) int. The string int * int + int has two parse trees. * int

Outline. Limitations of regular languages Parser overview Context-free grammars (CFG s) Derivations Syntax-Directed Translation

Ambiguity, Precedence, Associativity & Top-Down Parsing. Lecture 9-10

Building a Parser II. CS164 3:30-5:00 TT 10 Evans. Prof. Bodik CS 164 Lecture 6 1

Outline. Parser overview Context-free grammars (CFG s) Derivations Syntax-Directed Translation

Lecture 14 Sections Mon, Mar 2, 2009

Introduction to Parsing. Lecture 8

Context-Free Grammars

Chapter 4: Syntax Analyzer

Syntax-Directed Translation. Lecture 14

Outline. Regular languages revisited. Introduction to Parsing. Parser overview. Context-free grammars (CFG s) Lecture 5. Derivations.

Outline. Limitations of regular languages. Introduction to Parsing. Parser overview. Context-free grammars (CFG s)

ΕΠΛ323 - Θεωρία και Πρακτική Μεταγλωττιστών

Building a Parser III. CS164 3:30-5:00 TT 10 Evans. Prof. Bodik CS 164 Lecture 6 1

Some Basic Definitions. Some Basic Definitions. Some Basic Definitions. Language Processing Systems. Syntax Analysis (Parsing) Prof.

( ) i 0. Outline. Regular languages revisited. Introduction to Parsing. Parser overview. Context-free grammars (CFG s) Lecture 5.

A programming language requires two major definitions A simple one pass compiler

Abstract Syntax Trees Synthetic and Inherited Attributes

Introduction to Syntax Analysis. Compiler Design Syntax Analysis s.l. dr. ing. Ciprian-Bogdan Chirila

Syntax Analysis. Chapter 4

Recursive Descent Parsers

Introduction to Parsing. Lecture 5. Professor Alex Aiken Lecture #5 (Modified by Professor Vijay Ganesh)

Section A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous.

Principles of Programming Languages

Announcements. Written Assignment 1 out, due Friday, July 6th at 5PM.

Bottom-Up Parsing II (Different types of Shift-Reduce Conflicts) Lecture 10. Prof. Aiken (Modified by Professor Vijay Ganesh.

syntax tree - * * * - * * * * * 2 1 * * 2 * (2 * 1) - (1 + 0)

CSE450 Translation of Programming Languages. Lecture 4: Syntax Analysis

Introduction to Parsing. Lecture 5

CS /534 Compiler Construction University of Massachusetts Lowell

Review of CFGs and Parsing II Bottom-up Parsers. Lecture 5. Review slides 1

Syntax Analysis Part I

Syntax-Directed Translation. Introduction

Review: Shift-Reduce Parsing. Bottom-up parsing uses two actions: Bottom-Up Parsing II. Shift ABC xyz ABCx yz. Lecture 8. Reduce Cbxy ijk CbA ijk

LECTURE 3. Compiler Phases

Compilers. Compiler Construction Tutorial The Front-end

Bottom-Up Parsing. Lecture 11-12

Top down vs. bottom up parsing

LR Parsing LALR Parser Generators

Outline. The strategy: shift-reduce parsing. Introduction to Bottom-Up Parsing. A key concept: handles

Front End. Hwansoo Han

CS453 : JavaCUP and error recovery. CS453 Shift-reduce Parsing 1

Syntax Errors; Static Semantics

CS 406/534 Compiler Construction Putting It All Together

LR Parsing LALR Parser Generators

Introduction to Parsing. Lecture 5

CS153: Compilers Lecture 4: Recursive Parsing

Semantic Analysis. Lecture 9. February 7, 2018

Syntax-Directed Translation

Syntactic Analysis. Top-Down Parsing. Parsing Techniques. Top-Down Parsing. Remember the Expression Grammar? Example. Example

Syntax Intro and Overview. Syntax

CSE443 Compilers. Dr. Carl Alphonce 343 Davis Hall

In One Slide. Outline. LR Parsing. Table Construction

Undergraduate Compilers in a Day

CS 314 Principles of Programming Languages

CS2210: Compiler Construction Syntax Analysis Syntax Analysis

Today. Assignments. Lecture Notes CPSC 326 (Spring 2019) Quiz 5. Exam 1 overview. Type checking basics. HW4 due. HW5 out, due in 2 Tuesdays

CS5363 Final Review. cs5363 1

Lexical and Syntax Analysis. Top-Down Parsing

Syntax-Directed Translation

Lecture 14: Parser Conflicts, Using Ambiguity, Error Recovery. Last modified: Mon Feb 23 10:05: CS164: Lecture #14 1

Error Recovery during Top-Down Parsing: Acceptable-sets derived from continuation

Static Semantics. Lecture 15. (Notes by P. N. Hilfinger and R. Bodik) 2/29/08 Prof. Hilfinger, CS164 Lecture 15 1

CMPT 379 Compilers. Parse trees

Syntax Analysis/Parsing. Context-free grammars (CFG s) Context-free grammars vs. Regular Expressions. BNF description of PL/0 syntax

UNIT-4 (COMPILER DESIGN)

COL728 Minor1 Exam Compiler Design Sem II, Answer all 5 questions Max. Marks: 20

Today s Topics. Last Time Top-down parsers - predictive parsing, backtracking, recursive descent, LL parsers, relation to S/SL

Introduction to Compiler

Bottom-Up Parsing. Lecture 11-12

Topic 3: Syntax Analysis I

SYED AMMAL ENGINEERING COLLEGE (An ISO 9001:2008 Certified Institution) Dr. E.M. Abdullah Campus, Ramanathapuram

A Simple Syntax-Directed Translator

Introduction to Bottom-Up Parsing

CS 314 Principles of Programming Languages. Lecture 9

Context-free grammars (CFG s)

Syntactic Analysis. CS345H: Programming Languages. Lecture 3: Lexical Analysis. Outline. Lexical Analysis. What is a Token? Tokens

CS415 Compilers. Syntax Analysis. These slides are based on slides copyrighted by Keith Cooper, Ken Kennedy & Linda Torczon at Rice University

Type Checking. Outline. General properties of type systems. Types in programming languages. Notation for type rules.

A simple syntax-directed

CS1622. Semantic Analysis. The Compiler So Far. Lecture 15 Semantic Analysis. How to build symbol tables How to use them to find

Outline. General properties of type systems. Types in programming languages. Notation for type rules. Common type rules. Logical rules of inference

Bottom Up Parsing. Shift and Reduce. Sentential Form. Handle. Parse Tree. Bottom Up Parsing 9/26/2012. Also known as Shift-Reduce parsing

Syntax-Directed Translation Part I

It parses an input string of tokens by tracing out the steps in a leftmost derivation.

Syntax-Directed Translation. CS Compiler Design. SDD and SDT scheme. Example: SDD vs SDT scheme infix to postfix trans

Transcription:

Announcements rror Handling Syntax-Directed ranslation Lecture 6 PA1 & WA1 Due today at midnight PA2 & WA2 Assigned today Prof. Aiken CS 14 Lecture 6 1 Prof. Aiken CS 14 Lecture 6 2 Outline xtensions of CFG for parsing Precedence declarations rror handling Semantic actions Constructing a parse tree Recursive descent Prof. Aiken CS 14 Lecture 6 rror Handling Purpose of the compiler is o detect non-valid programs o translate the valid ones Many kinds of possible errors (e.g. in C) rror kind xample Detected by Lexical $ Lexer Syntax x *% Parser Semantic int x; y = x(); ype checker Correctness your favorite program ester/user Prof. Aiken CS 14 Lecture 6 4 Syntax rror Handling rror handler should Report errors accurately and clearly Recover from an error quickly Not slow down compilation of valid code Approaches to Syntax rror Recovery From simple to complex Panic mode rror productions Automatic local or global correction Good error handling is not easy to achieve Not all are supported by all parser generators Prof. Aiken CS 14 Lecture 6 5 Prof. Aiken CS 14 Lecture 6 6 1

rror Recovery: Panic Mode Simplest, most popular method When an error is detected: Discard tokens until one with a clear role is found Continue from there Such tokens are called synchronizing tokens ypically the statement or expression terminators Syntax rror Recovery: Panic Mode (Cont.) Consider the erroneous expression (1 + + 2) + Panic-mode recovery: Skip ahead to next integer and then continue Bison: use the special terminal error to describe how much input to skip int + ( ) error int ( error ) Prof. Aiken CS 14 Lecture 6 7 Prof. Aiken CS 14 Lecture 6 8 Syntax rror Recovery: rror Productions rror Recovery: Local and Global Correction Idea: specify in the grammar known common mistakes ssentially promotes common errors to alternative syntax xample: Write 5 x instead of 5 * x Add the production Disadvantage Complicates the grammar Idea: find a correct nearby program ry token insertions and deletions xhaustive search Disadvantages: Hard to implement Slows down parsing of correct programs Nearby is not necessarily the intended program Not all tools support it Prof. Aiken CS 14 Lecture 6 9 Prof. Aiken CS 14 Lecture 6 10 Syntax rror Recovery: Past and Present Past Slow recompilation cycle (even once a day) Find as many errors in one cycle as possible Researchers could not let go of the topic Present Quick recompilation cycle Users tend to correct one error/cycle Complex error recovery is less compelling Panic-mode seems enough Abstract Syntax rees So far a parser traces the derivation of a sequence of tokens he rest of the compiler needs a structural representation of the program Abstract syntax trees Like parse trees but ignore some details Abbreviated as AS Prof. Aiken CS 14 Lecture 6 11 Prof. Aiken CS 14 Lecture 6 12 2

Abstract Syntax ree (Cont.) xample of Parse ree Consider the grammar int ( ) + And the string 5 + (2 + ) int 5 + ( ) races the operation of the parser Does capture the nesting structure After lexical analysis (a list of tokens) int 5 + ( int 2 + int ) During parsing we build a parse tree + int 2 int But too much info Parentheses Single-successor nodes Prof. Aiken CS 14 Lecture 6 1 Prof. Aiken CS 14 Lecture 6 14 xample of Abstract Syntax ree PLUS PLUS 5 2 Also captures the nesting structure But abstracts from the concrete syntax => more compact and easier to use An important data structure in a compiler Prof. Aiken CS 14 Lecture 6 15 Semantic Actions his is what we ll use to construct ASs ach grammar symbol may have attributes For terminal symbols (lexical tokens) attributes can be calculated by the lexer ach production may have an action Written as: X Y 1 Y n { action } hat can refer to or compute symbol attributes Prof. Aiken CS 14 Lecture 6 16 Semantic Actions: An xample Consider the grammar int + ( ) For each symbol X define an attribute X.val For terminals, val is the associated lexeme For non-terminals, val is the expression s value (and is computed from values of subexpressions) We annotate the grammar with actions: int {.val = int.val } 1 + 2 {.val = 1.val + 2.val } ( 1 ) {.val = 1.val } Prof. Aiken CS 14 Lecture 6 17 Semantic Actions: An xample (Cont.) String: 5 + (2 + ) okens: int 5 + ( int 2 + int ) Productions quations 1 + 2.val = 1.val + 2.val 1 int 5 1.val = int 5.val = 5 2 ( ) 2.val =.val 4 + 5.val = 4.val + 5.val 4 int 2 4.val = int 2.val = 2 5 int 5.val = int.val = Prof. Aiken CS 14 Lecture 6 18

Semantic Actions: Notes Semantic actions specify a system of equations Declarative Style Order of resolution is not specified he parser figures it out Imperative Style he order of evaluation is fixed Important if the actions manipulate global state Semantic Actions: Notes We ll explore actions as pure equations Style 1 But note bison has a fixed order of evaluation for actions xample:.val = 4.val + 5.val Must compute 4.val and 5.val before.val We say that.val depends on 4.val and 5.val Prof. Aiken CS 14 Lecture 6 19 Prof. Aiken CS 14 Lecture 6 20 Dependency Graph 1 + 2 int 5 5 + ( + ) ach node labeled has one slot for the val attribute Note the dependencies valuating Attributes An attribute must be computed after all its successors in the dependency graph have been computed In previous example attributes can be computed bottom-up 4 + int 2 2 5 int Such an order exists when there are no cycles Cyclically defined attributes are not legal Prof. Aiken CS 14 Lecture 6 21 Prof. Aiken CS 14 Lecture 6 22 Dependency Graph 10 1 5 + 2 5 int 5 5 ( 5 ) Semantic Actions: Notes (Cont.) Synthesized attributes Calculated from attributes of descendents in the parse tree.val is a synthesized attribute Can always be calculated in a bottom-up order 4 2 + int 2 2 5 int Grammars with only synthesized attributes are called S-attributed grammars Most common case Prof. Aiken CS 14 Lecture 6 2 Prof. Aiken CS 14 Lecture 6 24 4

Inherited Attributes Another kind of attribute Calculated from attributes of parent and/or siblings in the parse tree xample: a line calculator A Line Calculator ach line contains an expression int + ach line is terminated with the = sign L = + = In second form the value of previous line is used as starting value A program is a sequence of lines P ε P L Prof. Aiken CS 14 Lecture 6 25 Prof. Aiken CS 14 Lecture 6 26 Attributes for the Line Calculator ach has a synthesized attribute val Calculated as before ach L has an attribute val L = { L.val =.val } + = { L.val =.val + L.prev } We need the value of the previous line We use an inherited attribute L.prev Attributes for the Line Calculator (Cont.) ach P has a synthesized attribute val he value of its last line P ε { P.val = 0 } P 1 L { P.val = L.val; L.prev = P 1.val } ach L has an inherited attribute prev L.prev is inherited from sibling P 1.val xample Prof. Aiken CS 14 Lecture 6 27 Prof. Aiken CS 14 Lecture 6 28 xample of Inherited Attributes xample of Inherited Attributes P val synthesized P 5 val synthesized P ε 0 L + + 4 int 2 2 + + 5 int = prev inherited All can be computed in depth-first order P ε 0 0 L 5 0 + 5 + 4 2 5 int 2 2 int = prev inherited All can be computed in depth-first order Prof. Aiken CS 14 Lecture 6 29 Prof. Aiken CS 14 Lecture 6 0 5

Semantic Actions: Notes (Cont.) Semantic actions can be used to build ASs And many other things as well Also used for type checking, code generation, Process is called syntax-directed translation Substantial generalization over CFGs Constructing An AS We first define the AS data type Supplied by us for the project Consider an abstract tree type with two constructors: mkplus( mkleaf(n), ) = = n PLUS 1 2 1 2 Prof. Aiken CS 14 Lecture 6 1 Prof. Aiken CS 14 Lecture 6 2 Constructing a Parse ree We define a synthesized attribute ast Values of ast values are ASs We assume that int.lexval is the value of the integer lexeme Computed using semantic actions int.ast = mkleaf(int.lexval) 1 + 2.ast = mkplus( 1.ast, 2.ast) ( 1 ).ast = 1.ast Prof. Aiken CS 14 Lecture 6 Parse ree xample Consider the string int 5 + ( int 2 + int ) A bottom-up evaluation of the ast attribute:.ast = mkplus(mkleaf(5), mkplus(mkleaf(2), mkleaf()) PLUS PLUS 5 2 Prof. Aiken CS 14 Lecture 6 4 Summary Intro to op-down Parsing: he Idea We can specify language syntax using CFG A parser will answer whether s L(G) and will build a parse tree which we convert to an AS and pass on to the rest of the compiler he parse tree is constructed From the top From left to right erminals are seen in order of appearance in the token stream: t 2 t 5 t 6 t 8 t 9 1 t 2 4 t 5 t 6 7 t 8 t 9 Prof. Aiken CS 14 Lecture 6 5 Prof. Aiken CS 14 Lecture 6 6 6

Consider the grammar + int int * ( ) oken stream is: + int int * ( ) Start with top-level non-terminal ry the rules for in order Prof. Aiken CS 14 Lecture 6 7 Prof. Aiken CS 14 Lecture 6 8 + int int * ( ) + int int * ( ) int Mismatch: int is not (! Backtrack Prof. Aiken CS 14 Lecture 6 9 Prof. Aiken CS 14 Lecture 6 40 + int int * ( ) + int int * ( ) int * Mismatch: int is not (! Backtrack Prof. Aiken CS 14 Lecture 6 41 Prof. Aiken CS 14 Lecture 6 42 7

+ int int * ( ) + int int * ( ) ( ) Match! Advance input. Prof. Aiken CS 14 Lecture 6 4 Prof. Aiken CS 14 Lecture 6 44 + int int * ( ) + int int * ( ) ( ) ( ) Prof. Aiken CS 14 Lecture 6 45 Prof. Aiken CS 14 Lecture 6 46 + int int * ( ) + int int * ( ) ( ) Match! Advance input. ( ) Match! Advance input. Prof. Aiken CS 14 Lecture 6 47 int Prof. Aiken CS 14 Lecture 6 48 int 8

+ int int * ( ) A Recursive Descent Parser. Preliminaries Let OKN be the type of tokens Special tokens IN, OPN, CLOS, PLUS, IMS Let the global next point to the next token ( ) nd of input, accept. Prof. Aiken CS 14 Lecture 6 49 int Prof. Aiken CS 14 Lecture 6 50 A (Limited) Recursive Descent Parser (2) Define boolean functions that check the token string for a match of A given token terminal bool term(okn tok) { return *next++ == tok; } he nth production of S: bool S n () { } ry all productions of S: bool S() { } Prof. Aiken CS 14 Lecture 6 51 A (Limited) Recursive Descent Parser () For production bool 1 () { return (); } For production + bool 2 () { return () && term(plus) && (); } For all productions of (with backtracking) bool () { OKN *save = next; return (next = save, 1 ()) (next = save, 2 ()); } Prof. Aiken CS 14 Lecture 6 52 A (Limited) Recursive Descent Parser (4) Functions for non-terminal bool 1 () { return term(in); } bool 2 () { return term(in) && term(ims) && (); } bool () { return term(opn) && () && term(clos); } bool () { OKN *save = next; return (next = save, 1 ()) (next = save, 2 ()) (next = save, ()); }. Notes. o start the parser Initialize next to point to first token Invoke () Notice how this simulates the example parse asy to implement by hand But not completely general Cannot backtrack once a production is successful Works for grammars where at most one production can succeed for a non-terminal Prof. Aiken CS 14 Lecture 6 5 Prof. Aiken CS 14 Lecture 6 54 9

xample When Recursive Descent Does Not Work + ( int ) int int * ( ) bool term(okn tok) { return *next++ == tok; } bool 1 () { return (); } bool 2 () { return () && term(plus) && (); } bool () {OKN *save = next; return (next = save, 1 ()) (next = save, 2 ()); } bool 1 () { return term(in); } bool 2 () { return term(in) && term(ims) && (); } bool () { return term(opn) && () && term(clos); } bool () { OKN *save = next; return (next = save, 1 ()) (next = save, 2 ()) (next = save, ()); } Consider a production S S a bool S 1 () { return S() && term(a); } bool S() { return S 1 (); } S() goes into an infinite loop A left-recursive grammar has a non-terminal S S + Sα for some α Recursive descent does not work in such cases Prof. Aiken CS 14 Lecture 6 55 Prof. Aiken CS 14 Lecture 6 56 limination of Left Recursion Consider the left-recursive grammar S S α β S generates all strings starting with a β and followed by a number of α Can rewrite using right-recursion S β S S α S ε More limination of Left-Recursion In general S S α 1 S α n β 1 β m All strings derived from S start with one of β 1,,β m and continue with several instances of α 1,,α n Rewrite as S β 1 S β m S S α 1 S α n S ε Prof. Aiken CS 14 Lecture 6 57 Prof. Aiken CS 14 Lecture 6 58 General Left Recursion Summary of Recursive Descent he grammar S A α δ A S β is also left-recursive because S + S β α his left-recursion can also be eliminated Simple and general parsing strategy Left-recursion must be eliminated first but that can be done automatically Unpopular because of backtracking hought to be too inefficient See Dragon Book for general algorithm Section 4. In practice, backtracking is eliminated by restricting the grammar Prof. Aiken CS 14 Lecture 6 59 Prof. Aiken CS 14 Lecture 6 60 10