Parser and syntax analyzer. Context-Free Grammar Definition. Scanning and parsing. How bottom-up parsing works: Shift/Reduce technique.

Transcription:

POLITECNICO DI TORINO
Parser and syntax analyzer
(01JEUHT) Formal Languages and Compilers
Laboratory N 2
Stefano Scanzio
Web: http://www.skenz.it/compilers

Given a non-ambiguous grammar and a sequence of input symbols, a parser is a program that verifies whether the sequence can be generated by means of a derivation from the grammar.
A syntax analyzer (parser) is a program capable of associating the correct parse tree to the input sequence.
Parsers can be classified as:
  - top-down (the parse tree is built from the root to the leaves)
  - bottom-up (the parse tree is built from the leaves to the root): CUP

Scanning and parsing

Input: 3+2= 3*(8-4)=
Scanner (JFlex): turns the input into tokens such as PLUS EQ STAR RO MIN RC EQ
Grammar (G): E ::= E + T, E ::= E - T
Parser (Cup): decides whether the token sequence is generated by G or not generated by G

Context-Free Grammar Definition

A CF grammar is described by T, NT, S, PR:
  - T: Terminals / tokens of the language
  - NT: Non-terminals. They denote sets of strings generated by the grammar
  - S: Start symbol, S ∈ NT
  - PR: Production rules. They indicate how T and NT are combined to generate valid strings
        PR: NT ::= T | NT

Example

Derivation: a sequence of grammar rule applications and substitutions that transforms a starting non-terminal into a sequence of terminals (tokens).

    assign_stmt ::= ID EQ expr S
    expr        ::= expr operator term
    expr        ::= term
    term        ::= ID
    term        ::= FLOAT
    term        ::= INT
    operator    ::= PLUS
    operator    ::= MIN

How bottom-up parsing works: Shift/Reduce technique

A stack, initially empty, is used to keep track of the symbols already recognized.
Terminal symbols are pushed onto the stack (shift) until the top of the stack contains a handle (the right-hand side of a production): the handle is then replaced by the corresponding non-terminal (reduce).
Note that the reduce operation may only be applied to the top of the stack.
Parsing is successful only when, at the end of the input stream, the stack contains only the start symbol.
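The (T, NT, S, PR) description above can be written down as plain data. The following is a minimal sketch, not part of the original slides: the class and method names (GrammarSketch, Production, isWellFormed) are hypothetical, and the choice of assign_stmt as start symbol is an assumption, since the Example slide does not state it. It only stores the grammar and checks that S belongs to NT and that every production uses declared symbols.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical container for a context-free grammar (T, NT, S, PR). */
class GrammarSketch {
    /** One production: a non-terminal on the left, 0 or more symbols on the right. */
    static class Production {
        final String lhs;
        final List<String> rhs;

        Production(String lhs, String... rhs) {
            this.lhs = lhs;
            this.rhs = Arrays.asList(rhs);
        }
    }

    final Set<String> terminals = new HashSet<String>();
    final Set<String> nonTerminals = new HashSet<String>();
    final List<Production> productions = new ArrayList<Production>();
    String start;

    /** True when T and NT are disjoint, S is in NT, and PR only uses declared symbols. */
    boolean isWellFormed() {
        if (!Collections.disjoint(terminals, nonTerminals)) return false;
        if (!nonTerminals.contains(start)) return false;
        for (Production p : productions) {
            if (!nonTerminals.contains(p.lhs)) return false;
            for (String symbol : p.rhs) {
                if (!terminals.contains(symbol) && !nonTerminals.contains(symbol)) return false;
            }
        }
        return true;
    }

    /** The assignment grammar of the Example slide (start symbol assumed to be assign_stmt). */
    static GrammarSketch assignmentGrammar() {
        GrammarSketch g = new GrammarSketch();
        g.terminals.addAll(Arrays.asList("ID", "EQ", "S", "FLOAT", "INT", "PLUS", "MIN"));
        g.nonTerminals.addAll(Arrays.asList("assign_stmt", "expr", "term", "operator"));
        g.start = "assign_stmt";
        g.productions.add(new Production("assign_stmt", "ID", "EQ", "expr", "S"));
        g.productions.add(new Production("expr", "expr", "operator", "term"));
        g.productions.add(new Production("expr", "term"));
        g.productions.add(new Production("term", "ID"));
        g.productions.add(new Production("term", "FLOAT"));
        g.productions.add(new Production("term", "INT"));
        g.productions.add(new Production("operator", "PLUS"));
        g.productions.add(new Production("operator", "MIN"));
        return g;
    }

    public static void main(String[] args) {
        GrammarSketch g = assignmentGrammar();
        System.out.println("productions: " + g.productions.size());
        System.out.println("well formed: " + g.isWellFormed());
    }
}
```

Note that the terminal named S in this grammar is a token of the language, while the S in (T, NT, S, PR) is just the conventional name for the start symbol.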

Parse Trees and Shift/Reduce

Input String: a1, a2, a3
Scanner: a1, a2, a3 -> EL CM EL CM EL
Parse Tree (figure): EL CM EL CM EL
Left-recursive grammar:
    ::= CM EL
    ::= EL

Action    Stack
          ε
Shift     EL
Reduce
Shift     CM
Shift     CM EL
Reduce
Shift     CM
Shift     CM EL
Reduce

Introduction to CUP

Cup is a parser generator that transforms the definition of a context-free grammar into a Java program that parses sequences of input symbols according to the grammar itself.
Besides defining syntax rules, it is possible to specify actions to be executed whenever a production is reduced.
The parser must be integrated with a scanner: some conventions simplify the integration of Cup-generated parsers with JFlex-generated scanners.
Official manual: http://www.cs.princeton.edu/~appel/modern/java/cup/manual.html

Source file format

A Cup source file has a syntax very similar to that of Java programs. It can be ideally divided into the following sections:
  - Setup
  - Terminals and non-terminals
  - Precedences (Next lesson)
  - Rules
Comments are allowed following Java syntax (enclosed in /* and */, or preceded by //).

Setup section

This section contains all the directives needed for the parser:
  - Inclusion of the Cup library and other libraries:
        import java_cup.runtime.*;
  - User code (Next lesson):
      - Redefinition of Cup internal methods
      - Integration with a scanner other than JFlex

Terminals and non-terminals section

It contains the definition of:
  - Terminals: passed by JFlex
  - Non-Terminals
  - The grammar start symbol

Terminals
    terminal <terminal_1>, ..., <terminal_n>;
<terminal>: name containing letters, _, . and digits (the first character must be a letter).
Terminals are recognized by JFlex.

Non-Terminals
    non terminal <non_terminal_1>, ..., <non_terminal_n>;
<non_terminal>: name containing letters, _, . and digits (the first character must be a letter).

Start symbol
    start with <non_terminal_name>;
It is the root of the parse tree.
Only one occurrence of this keyword is allowed.
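The shift/reduce trace for the comma-separated list can be reproduced with a few lines of Java. This is a hand-written sketch under stated assumptions, not what Cup generates: the non-terminal name LIST is hypothetical (the name used on the slide is not recoverable), and the greedy "reduce as soon as a handle is on top of the stack, longest handle first" strategy is a simplification that happens to work for this small left-recursive grammar. Cup itself drives shift and reduce decisions from LALR parsing tables.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hand-written shift/reduce recognizer for: LIST ::= LIST CM EL | EL (LIST is a hypothetical name). */
class ShiftReduceSketch {

    /** Replaces a handle on top of the stack with LIST; returns false if no handle is there. */
    static boolean reduce(List<String> stack) {
        int n = stack.size();
        // LIST ::= LIST CM EL  (the longest handle is checked first)
        if (n >= 3 && stack.get(n - 3).equals("LIST")
                && stack.get(n - 2).equals("CM")
                && stack.get(n - 1).equals("EL")) {
            stack.subList(n - 3, n).clear();
            stack.add("LIST");
            return true;
        }
        // LIST ::= EL
        if (n >= 1 && stack.get(n - 1).equals("EL")) {
            stack.set(n - 1, "LIST");
            return true;
        }
        return false;
    }

    /** Shifts every token, reducing whenever possible; records each step in trace. */
    static boolean parse(List<String> tokens, List<String> trace) {
        List<String> stack = new ArrayList<String>();
        for (String token : tokens) {
            stack.add(token);                       // shift
            trace.add("Shift:  " + stack);
            while (reduce(stack)) {                 // reduce only acts on the top of the stack
                trace.add("Reduce: " + stack);
            }
        }
        // Success only if, at end of input, the stack holds just the start symbol.
        return stack.size() == 1 && stack.get(0).equals("LIST");
    }

    public static void main(String[] args) {
        List<String> trace = new ArrayList<String>();
        boolean ok = parse(Arrays.asList("EL", "CM", "EL", "CM", "EL"), trace);
        for (String step : trace) {
            System.out.println(step);
        }
        System.out.println(ok ? "accepted" : "rejected");
    }
}
```

With the token sequence EL CM EL CM EL the stack is reduced to the single start symbol, so the input is accepted; a sequence such as EL EL leaves two symbols on the stack and is rejected.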

Example

Input string: char *argv[10]
Parse tree (figure): L a a T P ID SO SC S over the input sequence char * argv [ 10 ]
Productions (grammar rules): T L S L L L CM P a a a SO SC a ID
Start symbol: start with
Non-Terminals: non terminal, L,, a (note that the start symbol is a non-terminal)
Terminals: terminal T, P, ID, terminal SO, SC, CM, S (recognized by JFlex)

Rules section

The Rules section contains one or more productions in the form:
    <non_terminal> ::= Right_Hand_Side ;
where Right_Hand_Side is a sequence of 0 or more symbols.
An action can be associated with each production; it must be enclosed between {: and :}.
Note: the action is executed right before the reduce operation takes place.
Example:
    ::= T L S {: System.out.println("Declaration found"); :}

Rules section (2)

If more than one production exists for a given non-terminal, they must be grouped and separated by |. E.g.:
    funz ::= type ID RO L RC S {: System.out.println("Function prototype"); :}
           | type ID RO L RC BO stmt_list BC {: System.out.println("Function"); :}
           ;
NB: the use of the | character generates two separate rules.
It is important to remember that the code between {: and :} is executed only when a given rule is matched.

Rules section: Example

    import java_cup.runtime.*;
    //Terminals / Non-Terminals Section
    terminal T, P, ID,, S, CM, SO, SC
    non terminal,, L, a
    start with
    //Rule Section
    ::= T L S
    L::= L CM
    ::= P a
    a::= a SO SC ID

Figure (parse tree for the example): L L a a. Productions: T L S L CM P a a SO SC ID

Integrating JFlex and Cup

Parser and scanner must agree on the values associated with each token (terminal).
When the scanner recognizes a token, it must pass a suitable value to the parser. This is done by means of the Symbol class, whose constructors are:
    public Symbol(int sym_id)
    public Symbol(int sym_id, int left, int right)
    public Symbol(int sym_id, Object o)
    public Symbol(int sym_id, int left, int right, Object o)
The class Symbol can be found in the Cup installation directory: java_cup/runtime/Symbol.java
When a terminal is defined by means of the terminal keyword, Cup associates an integer value with that token. This mapping is contained in the file generated by Cup during the compiling process.

Integrating JFlex and Cup (2)

If the following list of terminal symbols has been declared in the parser:
    terminal T, P, ID,, P, CM, SO, SC, S
they can be used inside the scanner and passed to the parser in the following way:
    [a-zA-Z_][a-zA-Z0-9_]*   { return new Symbol(sym.ID); }
    \[                       { return new Symbol(sym.SO); }
    \]                       { return new Symbol(sym.SC); }

Scanner modifications

  - Include the Cup library (java_cup.runtime.*) in the code section
  - Activate Cup compatibility by means of the %cup directive in the Declarations section

    import java_cup.runtime.*;
    %%
    %cup
    %%
    [a-z]+   { return new Symbol(sym.EL); }
    ,        { return new Symbol(sym.CM); }

The Cup parser

    import java_cup.runtime.*;
    terminal EL, CM
    non terminal, E
    start with E
    E ::= {: System.out.println( found"); :}
        | {: System.out.println("Empty list"); :}
    ::= CM EL
    ::= EL

Main

    import java.io.*;

    public class Main {
        static public void main(String argv[]) {
            try {
                /* Instantiate the scanner and open input file argv[0] */
                Yylex l = new Yylex(new FileReader(argv[0]));
                /* Instantiate the parser */
                parser p = new parser(l);
                /* Start the parser */
                Object result = p.parse();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

Compiling

    scanner.
    java java_cup.Main
In the case of shift/reduce or reduce/reduce conflicts:
    java java_cup.Main -expect <number_of_conflicts>
To draw the parse tree (useful for debugging):
    java java_cup.MainDrawTree
This can be used in LABINF or at home by installing a modified version of the parser.

Or, for the compilation of all the files of the project:
    javac *.java
To execute the program using <file> as input:
    java Main <file>
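The agreement on token codes described under "Integrating JFlex and Cup" can be illustrated without either tool. The sketch below is an assumption-laden stand-in, not generated code: Sym mimics the class of integer constants that Cup produces for the declared terminals (the numeric values here are made up), Token mimics two of the java_cup.runtime.Symbol constructors, and scan plays the role of the two JFlex rules for EL and CM. Skipping whitespace between tokens is an extra assumption that the scanner on the slide does not show.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Stand-in for the scanner/parser token-code convention (all names here are hypothetical). */
class TokenCodesSketch {

    /** Plays the role of the Cup-generated constants: one integer per terminal (assumed values). */
    static class Sym {
        static final int EOF = 0;
        static final int EL = 2;
        static final int CM = 3;
    }

    /** Plays the role of java_cup.runtime.Symbol: a token code plus an optional value. */
    static class Token {
        final int sym;
        final Object value;

        Token(int symId) {
            this(symId, null);
        }

        Token(int symId, Object o) {
            this.sym = symId;
            this.value = o;
        }
    }

    // Mirrors the two scanner rules: [a-z]+ gives EL, a comma gives CM.
    private static final Pattern RULES = Pattern.compile("\\s*(?:([a-z]+)|(,))");

    /** Converts the input text into tokens, ending with an EOF token. */
    static List<Token> scan(String input) {
        String text = input.trim();
        List<Token> tokens = new ArrayList<Token>();
        Matcher m = RULES.matcher(text);
        int pos = 0;
        while (pos < text.length()) {
            m.region(pos, text.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("Unexpected character at position " + pos);
            }
            if (m.group(1) != null) {
                tokens.add(new Token(Sym.EL, m.group(1)));  // keep the matched text as the value
            } else {
                tokens.add(new Token(Sym.CM));
            }
            pos = m.end();
        }
        tokens.add(new Token(Sym.EOF));
        return tokens;
    }

    public static void main(String[] args) {
        for (Token t : scan("a, b, c")) {
            System.out.println(t.sym + (t.value != null ? " (" + t.value + ")" : ""));
        }
    }
}
```

In a real project the constants come from the class that Cup generates and the scanner returns java_cup.runtime.Symbol objects, so the numeric codes never have to be written by hand.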