For example, we might have productions such as:

Similar documents
Fall Compiler Principles Lecture 5: Parsing part 4. Roman Manevich Ben-Gurion University

Fall Compiler Principles Lecture 4: Parsing part 3. Roman Manevich Ben-Gurion University of the Negev

COMP 181. Prelude. Prelude. Summary of parsing. A Hierarchy of Grammar Classes. More power? Syntax-directed translation. Analysis

A clarification on terminology: Recognizer: accepts or rejects strings in a language. Parser: recognizes and generates parse trees (imminent topic)

JavaCUP. There are also many parser generators written in Java

POLITECNICO DI TORINO. (01JEUHT) Formal Languages and Compilers. Laboratory N 3. Lab 3. Cup Advanced Use

CS453 : JavaCUP and error recovery. CS453 Shift-reduce Parsing 1

Last Time. What do we want? When do we want it? An AST. Now!

Simple LR (SLR) LR(0) Drawbacks LR(1) SLR Parse. LR(1) Start State and Reduce. LR(1) Items 10/3/2012

CUP. Lecture 18 CUP User s Manual (online) Robb T. Koether. Hampden-Sydney College. Fri, Feb 27, 2015

Abstract Syntax. Mooly Sagiv. html://

Error Recovery. Computer Science 320 Prof. David Walker - 1 -

Syntax Errors; Static Semantics

Operator Precedence a+b*c b*c E + T T * P ( E )

ΕΠΛ323 - Θεωρία και Πρακτική Μεταγλωττιστών

Introduction to Syntax Analysis. Compiler Design Syntax Analysis s.l. dr. ing. Ciprian-Bogdan Chirila

Context-free grammars (CFG s)

Using JFlex. "Linux Gazette...making Linux just a little more fun!" by Christopher Lopes, student at Eastern Washington University April 26, 1999

Project 2 Interpreter for Snail. 2 The Snail Programming Language

Compilers. Bottom-up Parsing. (original slides by Sam

Syntax Analysis/Parsing. Context-free grammars (CFG s) Context-free grammars vs. Regular Expressions. BNF description of PL/0 syntax

Compilers and Language Processing Tools

CS 4240: Compilers and Interpreters Project Phase 1: Scanner and Parser Due Date: October 4 th 2015 (11:59 pm) (via T-square)

CSE 582 Autumn 2002 Exam Sample Solution

ASTs, Objective CAML, and Ocamlyacc

Agenda. Previously. Tentative syllabus. Fall Compiler Principles Lecture 5: Parsing part 4 12/2/2015. Roman Manevich Ben-Gurion University

Syntax Analysis. Chapter 4

CSE 401 Midterm Exam 11/5/10

Introduction to Compiler Design

Part III : Parsing. From Regular to Context-Free Grammars. Deriving a Parser from a Context-Free Grammar. Scanners and Parsers.

Properties of Regular Expressions and Finite Automata

Lecture 14: Parser Conflicts, Using Ambiguity, Error Recovery. Last modified: Mon Feb 23 10:05: CS164: Lecture #14 1

Error Management in Compilers and Run-time Systems

Syntax/semantics. Program <> program execution Compiler/interpreter Syntax Grammars Syntax diagrams Automata/State Machines Scanning/Parsing

Yacc: A Syntactic Analysers Generator

CPS 506 Comparative Programming Languages. Syntax Specification

Simple Lexical Analyzer

COMPILER CONSTRUCTION LAB 2 THE SYMBOL TABLE. Tutorial 2 LABS. PHASES OF A COMPILER Source Program. Lab 2 Symbol table

Building a Parser Part III

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

Let s look at the CUP specification for CSX-lite. Recall its CFG is

The Structure of a Syntax-Directed Compiler

CSE 340 Fall 2014 Project 4

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

Defining syntax using CFGs

Last time. What are compilers? Phases of a compiler. Scanner. Parser. Semantic Routines. Optimizer. Code Generation. Sunday, August 29, 2010

CMSC 350: COMPILER DESIGN

CSEP 501 Compilers. Languages, Automata, Regular Expressions & Scanners Hal Perkins Winter /8/ Hal Perkins & UW CSE B-1

A Bison Manual. You build a text file of the production (format in the next section); traditionally this file ends in.y, although bison doesn t care.

CSE302: Compiler Design

Parser and syntax analyzer. Context-Free Grammar Definition. Scanning and parsing. How bottom-up parsing works: Shift/Reduce tecnique.

A Short Summary of Javali

COMPILERS BASIC COMPILER FUNCTIONS

Syntax. A. Bellaachia Page: 1

Compilation 2013 Parser Generators, Conflict Management, and ML-Yacc

Introduction to Parsing Ambiguity and Syntax Errors

Lexical Analysis. Lexical analysis is the first phase of compilation: The file is converted from ASCII to tokens. It must be fast!

Semantic Analysis. Outline. The role of semantic analysis in a compiler. Scope. Types. Where we are. The Compiler Front-End

CS 536 Midterm Exam Spring 2013

Section A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous.

Introduction to Parsing Ambiguity and Syntax Errors

CSE 401 Midterm Exam Sample Solution 11/4/11

The TinyJ Compiler's Static and Stack-Dynamic Memory Allocation Rules Static Memory Allocation for Static Variables in TinyJ:

Syntactic Analysis. The Big Picture Again. Grammar. ICS312 Machine-Level and Systems Programming

TML Language Reference Manual

Defining Program Syntax. Chapter Two Modern Programming Languages, 2nd ed. 1

Some Basic Definitions. Some Basic Definitions. Some Basic Definitions. Language Processing Systems. Syntax Analysis (Parsing) Prof.

Context-free grammars

The Structure of a Syntax-Directed Compiler

Principles of Programming Languages COMP251: Syntax and Grammars

PLT 4115 LRM: JaTesté

Left to right design 1

GAWK Language Reference Manual

3. Parsing. 3.1 Context-Free Grammars and Push-Down Automata 3.2 Recursive Descent Parsing 3.3 LL(1) Property 3.4 Error Handling

Programming Language Syntax and Analysis

Name EID. (calc (parse '{+ {with {x {+ 5 5}} {with {y {- x 3}} {+ y y} } } z } ) )

Regular Expressions. Agenda for Today. Grammar for a Tiny Language. Programming Language Specifications

Chapter 4: Syntax Analyzer

Compiler Theory. (Semantic Analysis and Run-Time Environments)

flex is not a bad tool to use for doing modest text transformations and for programs that collect statistics on input.

In Java, data type boolean is used to represent Boolean data. Each boolean constant or variable can contain one of two values: true or false.

Compiler Design (40-414)

How do LL(1) Parsers Build Syntax Trees?

Full file at

Lexical and Syntax Analysis

A Simple Syntax-Directed Translator

shift-reduce parsing

Topic 3: Syntax Analysis I

Table-Driven Top-Down Parsers

Theoretical Part. Chapter one:- - What are the Phases of compiler? Answer:

MiniJava Parser and AST. Winter /27/ Hal Perkins & UW CSE E1-1

Typescript on LLVM Language Reference Manual

Using an LALR(1) Parser Generator

CA Compiler Construction

SEMANTIC ANALYSIS TYPES AND DECLARATIONS

Syntax Analysis Check syntax and construct abstract syntax tree

CSE302: Compiler Design

RYERSON POLYTECHNIC UNIVERSITY DEPARTMENT OF MATH, PHYSICS, AND COMPUTER SCIENCE CPS 710 FINAL EXAM FALL 96 INSTRUCTIONS

CS1622. Semantic Analysis. The Compiler So Far. Lecture 15 Semantic Analysis. How to build symbol tables How to use them to find

Ambiguity and Errors Syntax-Directed Translation

Transcription:

1. Error Recovery This is an extract from http://www.cs.pr inceton.edu/ appel/moder n/java/cup/manual.html#errors A impor tant aspect of building parsers with CUP is support for syntactic error recovery. CUP uses the same error recovery mechanisms as YACC. In par ticular, it suppor ts a special error symbol (denoted simply as error). This symbol plays the role of a special non-terminal which, instead of being defined by productions, instead matches an erroneous input sequence. The error symbol only comes into play if a syntax error is detected. If a syntax error is detected then the parser tries to replace some por tion of the input token stream with error and then continue parsing. For example, we might have productions such as: stmt ::= expr SEMI while_stmt SEMI if_stmt SEMI... error SEMI ; This indicates that if none of the normal productions for stmt can be matched by the input, then a syntax error should be declared, and recovery should be made by skipping erroneous tokens (equivalent to matching and replacing them with error) up to a point at which the parse can be continued with a semicolon (and additional context that legally follows a statement). Anerror is considered to be recovered from if and only if a sufficient number of tokens past the error symbol can be successfully parsed. (The number of tokens required is determined by the error_sync_size() method of the parser and defaults to 3).

-2- Specifically, the parser first looks for the closest state to the top of the parse stack that has an outgoing transition under error. This generally corresponds to wor king from productions that represent more detailed constructs (such as a specific kind of statement) up to productions that represent more general or enclosing constr ucts (such as the general production for all statements or a production representing a whole section of declarations) until we get to a place where an error recovery production has been provided for. Once the parser is placed into a configuration that has an immediate error recovery (by popping the stack tothe first such state), the parser begins skipping tokens to find a point at which the parse can be continued. After discarding each token, the parser attempts to parse ahead in the input (without executing any embedded semantic actions). If the parser can successfully parse past the required number of tokens, then the input is backed uptothe point of recovery and the parse is resumed normally (executing all actions). If the parse cannot be continued far enough, then another token is discarded and the parser again tries to parse ahead. If the end of input is reached without making a successful recover y (or there was no suitable error recovery state found on the parse stack tobegin with) then error recovery fails. 1.1. Cup Error Methods public void report_error(str ing message, Object info) This method should be called whenever an error message is to be issued. In the default implementation of this method, the first parameter provides the text of a message which is printed on System.err and the second parameter is simply ignored. It is ver y typical to overr ide this method in order to provide a more sophisticated error reporting mechanism.

-3- public void report_fatal_error(str ing message, Object info) This method should be called whenever a non-recoverable error occurs. It responds by calling report_error(), then aborts parsing by calling the parser method done_parsing(), and finally throws an exception. (In general done_parsing() should be called at any point that parsing needs to be terminated early). public void syntax_error(symbol cur_token) This method is called by the parser as soon as a syntax error is detected (but before error recovery is attempted). In the default implementation it calls: report_error("syntax error", null);. public void unrecovered_syntax_error(symbol cur_token) This method is called by the parser if it is unable to recover from a syntax error. In the default implementation it calls: repor t_fatal_error("couldn t repair and continue parse", null);. protected int error_sync_size() This method is called by the parser to determine how many tokens it must successfully parse in order to consider an error recovery successful. The default implementation returns 3. Values below 2 are not recommended. 1.2. Example Parser.cup: 1 // JavaCup specification for a simple expression evaluator (w/ a 2 3 import java_cup.runtime.*; 4 5 parser code {: 6 7 protected int error_sync_size () { 8 System.out.println(""); 9 return 1; // not recommended by the CUP manual

-4-10 // but we need recovery after the next successfu 11 } 12 public void syntax_error(symbol cur_token) { 13 System.out.println("I m sorry, but I have to "); 14 System.out.println("report a syntax error."); 15 System.out.println("The last symbol was: "+ Scanner.lastSymb 16 " (See sym.java)"); 17 System.out.println("at line number " + Main.s.getLineNumber( 18 19 report_error("syntax error", null); 20 } 21 22 :} 23 24 /* Terminals (tokens returned by the scanner). */ 25 terminal SEMI, PLUS, MINUS, TIMES, DIVIDE, MOD; 26 terminal LPAREN, RPAREN; 27 terminal Integer NUMBER; 28 29 /* Non terminals */ 30 non terminal Object expr_list, expr_part; 31 non terminal Integer expr; 32 33 /* Precedences */ 34 precedence left PLUS, MINUS; 35 precedence left TIMES, DIVIDE, MOD; precedence left LPAREN; /* T 36 expr_part 37 ; 38 39 expr_part ::= expr:e 40 {: System.out.println("= " + e); :} 41 SEMI 42 /* 43 error:e 44 {: System.out.println("error "); 45 parser.report_error("this input is not cor

-5-46 System.out.println("e = " + e); 47 :} 48 SEMI 49 */ 50 ; 51 52 expr ::= expr:e1 PLUS expr:e2 53 {: RESULT = new Integer(e1.intValue() + e2.in 54 expr:e1 MINUS expr:e2 55 {: RESULT = new Integer(e1.intValue() - e2.in 56 expr:e1 TIMES expr:e2 57 {: RESULT = new Integer(e1.intValue() * e2.in 58 expr:e1 DIVIDE expr:e2 59 {: if ( e2.intvalue()!= 0 ) 60 RESULT = new Integer(e1.intValue() / e2. 61 else { 62 System.out.println("Zero Division"); 63 RESULT = new Integer(0); 64 } 65 :} 66 expr:e1 MOD expr:e2 67 {: if ( e2.intvalue()!= 0 ) 68 RESULT = new Integer(e1.intValue() % e2. 69 else { 70 System.out.println("Zero Mod Operation") 71 RESULT = new Integer(0); 72 } 73 :} 74 NUMBER:n 75 {: RESULT = n; :} 76 LPAREN expr:e RPAREN 77 {: RESULT = e; :} 78 79 error:e 80 {: 81 parser.report_error("operand missing", nul

-6-82 System.out.println("e = " + e); 83 RESULT = new Integer(42); 84 :} 85 86 ; Source Code: Src/11/Parser.cup Execution java Main 2 - ; I m sorry, but I have to report a syntax error. The last symbol was: #2 (See sym.java) at line number 1 Syntax error 2 + 3; Operand missing e = null = -40 = 5 2 + 3; = 5 1.3. CUP Manual: Appendix D: Bugs In this version of CUP it s difficult for the semantic action phrases (Java code attached to productions) to access the report_error method and other similar methods and objects defined in the parser code directive.

-7-