Operator Precedence. Java CUP.

Operator Precedence

Most programming languages have operator precedence rules that state the order in which operators are applied (in the absence of explicit parentheses). Thus in C, Java, and CSX, a+b*c means compute b*c, then add in a.

These operator precedence rules can be incorporated directly into a CFG. Consider

    E → E + T | T
    T → T * P | P
    P → id | ( E )

Does a+b*c mean (a+b)*c or a+(b*c)? The grammar tells us! Look at the derivation tree (children are indented under their parent):

    E
      E
        T
          P
            id   (a)
      +
      T
        T
          P
            id   (b)
        *
        P
          id   (c)

The other grouping can't be obtained unless explicit parentheses are used. (Why?)

Java CUP

Java CUP is a parser-generation tool, similar to Yacc. CUP builds a Java parser for LALR(1) grammars from production rules and associated Java code fragments. When a particular production is recognized, its associated code fragment is executed (typically to build an AST).

CUP generates a Java source file parser.java. It contains a class parser, with a method

    Symbol parse()

The Symbol returned by the parser is associated with the grammar's start symbol and contains the AST for the whole source program.

The file sym.java is also built for use with a JLex-built scanner (so that both scanner and parser use the same token codes).

If an unrecovered syntax error occurs, an Exception is thrown by the parser.

CUP and Yacc accept exactly the same class of grammars: the LALR(1) grammars, which cover nearly all LL(1) grammars met in practice, plus many useful non-LL(1) grammars.

CUP is called as

    java java_cup.Main < file.cup
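Returning to the expression grammar E → E + T | T, T → T * P | P, P → id | ( E ): the following is a minimal sketch, not CUP output, of how that grammar fixes the grouping of a+b*c. It is a hand-written recursive-descent recognizer in which the left-recursive rules are rewritten as loops, and it prints the grouping with explicit parentheses. The class name PrecedenceDemo, the method group, and the restriction to single-letter identifiers are assumptions made for illustration.

```java
public class PrecedenceDemo {
    private final String src;
    private int pos = 0;

    private PrecedenceDemo(String src) {
        this.src = src.replace(" ", "");
    }

    /** Returns the input with the grouping implied by the grammar made explicit. */
    public static String group(String input) {
        PrecedenceDemo p = new PrecedenceDemo(input);
        String result = p.parseE();
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException("unexpected input at position " + p.pos);
        }
        return result;
    }

    // E -> E + T | T, written as the loop T (+ T)* so that + groups to the left
    private String parseE() {
        String left = parseT();
        while (peek() == '+') {
            pos++;
            left = "(" + left + "+" + parseT() + ")";
        }
        return left;
    }

    // T -> T * P | P, written as the loop P (* P)*; because E is built from T's,
    // every * is grouped before the enclosing + sees it
    private String parseT() {
        String left = parseP();
        while (peek() == '*') {
            pos++;
            left = "(" + left + "*" + parseP() + ")";
        }
        return left;
    }

    // P -> id | ( E ); an id is assumed to be a single letter
    private String parseP() {
        char c = peek();
        if (c == '(') {
            pos++;
            String inner = parseE();
            if (peek() != ')') {
                throw new IllegalArgumentException("expected ')' at position " + pos);
            }
            pos++;
            return inner;
        }
        if (Character.isLetter(c)) {
            pos++;
            return String.valueOf(c);
        }
        throw new IllegalArgumentException("expected id or '(' at position " + pos);
    }

    private char peek() {
        return pos < src.length() ? src.charAt(pos) : '\0';
    }

    public static void main(String[] args) {
        System.out.println(group("a+b*c"));   // prints (a+(b*c))
        System.out.println(group("(a+b)*c")); // prints ((a+b)*c)
    }
}
```

The multiplication is grouped first only because T sits below E in the grammar; nothing in the code mentions precedence explicitly.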

Java CUP Specifications

Java CUP specifications are of the form:

    Package and import specifications
    User code additions
    Terminal and non-terminal declarations
    A context-free grammar, augmented with Java code fragments

Package and Import Specifications

You define a package name as:

    package name;

You add imports to be used as:

    import java_cup.runtime.*;

User Code Additions

You may define Java code to be included within the generated parser:

    action code {: /* java code */ :};

This code is placed within the generated action class (which holds user-specified production actions).

    parser code {: /* java code */ :};

This code is placed within the generated parser class.

    init with {: /* java code */ :};

This code is used to initialize the generated parser.

    scan with {: /* java code */ :};

This code is used to tell the generated parser how to get tokens from the scanner.

Terminal and Non-terminal Declarations

You define terminal symbols you will use as:

    terminal classname name1, name2, ...;

classname is a class used by the scanner for tokens (CSXToken, CSXIdentifierToken, etc.)

You define non-terminal symbols you will use as:

    non terminal classname name1, name2, ...;

classname is the class for the AST node associated with the non-terminal (stmtnode, exprnode, etc.)

Production Rules

Production rules are of the form

    name ::= name1 name2 ... action ;

or

    name ::= name1 name2 ... action1
           | name3 name4 ... action2
           | ...
           ;

Names are the names of terminals or non-terminals, as declared earlier.

Actions are Java code fragments, of the form

    {: /* java code */ :}

The Java object associated with a symbol (a token or AST node) may be named by adding a :id suffix to a terminal or non-terminal in a rule.

RESULT names the left-hand side non-terminal.

The Java classes of the symbols are defined in the terminal and non-terminal declaration sections.

For example,

    prog ::= LBRACE:l stmts:s RBRACE
        {: RESULT = new csxlitenode(s, l.linenum, l.colnum); :}

This corresponds to the production

    prog → { stmts }

The left brace is named l; the stmts non-terminal is called s.

In the action code, a new csxlitenode is created and assigned to prog. It is constructed from the AST node associated with s. Its line and column numbers are those given to the left brace, l (by the scanner).

To tell CUP what non-terminal to use as the start symbol (prog in our example), we use the directive:

    start with prog;

Example

Let's look at the CUP specification for CSX-lite. Recall its CFG is

    program → { stmts }
    stmts   → stmt stmts | λ
    stmt    → id = expr ; | if ( expr ) stmt
    expr    → expr + id | expr - id | id

The corresponding CUP specification is:

    /***
     This Is A Java CUP Specification For CSX-lite, a Small Subset
     of The CSX Language, Used In Cs536
     ***/

    /* Preliminaries to set up and use the scanner. */
    import java_cup.runtime.*;

    parser code {:
        public void syntax_error(Symbol cur_token) {
            report_error("CSX syntax error at line " +
                String.valueOf(((CSXToken) cur_token.value).linenum), null);
        }
    :};

    init with {: :};
    scan with {: return Scanner.next_token(); :};

    /* Terminals (tokens returned by the scanner). */
    terminal CSXIdentifierToken IDENTIFIER;
    terminal CSXToken SEMI, LPAREN, RPAREN, ASG, LBRACE, RBRACE;
    terminal CSXToken PLUS, MINUS, rw_if;

    /* Non terminals */
    non terminal csxlitenode prog;
    non terminal stmtsnode stmts;
    non terminal stmtnode stmt;
    non terminal exprnode exp;
    non terminal ident;

    start with prog;

    prog ::= LBRACE:l stmts:s RBRACE
            {: RESULT = new csxlitenode(s, l.linenum, l.colnum); :}
        ;

    stmts ::= stmt:s1 stmts:s2
            {: RESULT = new stmtsnode(s1, s2, s1.linenum, s1.colnum); :}
        |
            {: RESULT = stmtsnode.NULL; :}
        ;

    stmt ::= ident:id ASG exp:e SEMI
            {: RESULT = new asgnode(id, e, id.linenum, id.colnum); :}
        | rw_if:i LPAREN exp:e RPAREN stmt:s
            {: RESULT = new ifthennode(e, s, stmtnode.NULL, i.linenum, i.colnum); :}
        ;

    exp ::= exp:leftval PLUS:op ident:rightval
            {: RESULT = new binaryopnode(leftval, sym.PLUS, rightval, op.linenum, op.colnum); :}
        | exp:leftval MINUS:op ident:rightval
            {: RESULT = new binaryopnode(leftval, sym.MINUS, rightval, op.linenum, op.colnum); :}
        | ident:i
            {: RESULT = i; :}
        ;

    ident ::= IDENTIFIER:i
            {: RESULT = new (new (i.identifiertext, i.linenum, i.colnum),
                             exprnode.NULL, i.linenum, i.colnum); :}
        ;

Let's parse

    { a = b ; }

First, a is parsed using

    ident ::= IDENTIFIER:i
        {: RESULT = new (new (i.identifiertext, i.linenum, i.colnum),
                         exprnode.NULL, i.linenum, i.colnum); :}

We build the AST subtree for a.

Next, b is parsed using

    ident ::= IDENTIFIER:i
        {: RESULT = new (new (i.identifiertext, i.linenum, i.colnum),
                         exprnode.NULL, i.linenum, i.colnum); :}

We build the AST subtree for b.

Then b's subtree is recognized as an exp:

    | ident:i
        {: RESULT = i; :}

Now the assignment statement is recognized:

    stmt ::= ident:id ASG exp:e SEMI
        {: RESULT = new asgnode(id, e, id.linenum, id.colnum); :}

We build an asgnode whose children are the subtrees for a and b.

The stmts → λ production is matched (indicating that there are no more statements in the program). CUP matches

    stmts ::=
        {: RESULT = stmtsnode.NULL; :}

and we build a nullstmtsnode.

Next, stmts → stmt stmts is matched using

    stmts ::= stmt:s1 stmts:s2
        {: RESULT = new stmtsnode(s1, s2, s1.linenum, s1.colnum); :}

This builds

    stmtsnode
      asgnode
      nullstmtsnode

As the last step of the parse, the parser matches program → { stmts } using the CUP rule

    prog ::= LBRACE:l stmts:s RBRACE
        {: RESULT = new csxlitenode(s, l.linenum, l.colnum); :}
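The sequence of reductions above can be replayed in plain Java by calling constructors in the same bottom-up order the parser uses. This is only a sketch for the input { a = b ; }: the classes Ident, Asg, NullStmts, Stmts and Program are hypothetical stand-ins, not the real CSX AST classes, and line and column numbers are left out.

```java
public class AstSketch {
    /** Minimal stand-in for an AST node; show() prints the tree in prefix form. */
    public interface Node {
        String show();
    }

    static final class Ident implements Node {
        final String name;
        Ident(String name) { this.name = name; }
        public String show() { return "ident(" + name + ")"; }
    }

    static final class Asg implements Node {
        final Node target, source;
        Asg(Node target, Node source) { this.target = target; this.source = source; }
        public String show() { return "asg(" + target.show() + ", " + source.show() + ")"; }
    }

    static final class NullStmts implements Node {
        public String show() { return "nullStmts"; }
    }

    static final class Stmts implements Node {
        final Node head, rest;
        Stmts(Node head, Node rest) { this.head = head; this.rest = rest; }
        public String show() { return "stmts(" + head.show() + ", " + rest.show() + ")"; }
    }

    static final class Program implements Node {
        final Node body;
        Program(Node body) { this.body = body; }
        public String show() { return "program(" + body.show() + ")"; }
    }

    /** Replays the reductions for { a = b ; } in the order the parser performs them. */
    public static Node buildExample() {
        Node a = new Ident("a");            // ident ::= IDENTIFIER   (for a)
        Node b = new Ident("b");            // ident ::= IDENTIFIER   (for b)
        Node e = b;                         // exp ::= ident          (RESULT = i)
        Node asg = new Asg(a, e);           // stmt ::= ident ASG exp SEMI
        Node none = new NullStmts();        // stmts ::= (empty)
        Node stmts = new Stmts(asg, none);  // stmts ::= stmt stmts
        return new Program(stmts);          // prog ::= LBRACE stmts RBRACE
    }

    public static void main(String[] args) {
        // prints program(stmts(asg(ident(a), ident(b)), nullStmts))
        System.out.println(buildExample().show());
    }
}
```

Each line of buildExample corresponds to one action in the CUP specification; the value assigned there plays the role of RESULT.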

The final AST returned by the parser is

    csxlitenode
      stmtsnode
        asgnode
        nullstmtsnode

Errors in Context-Free Grammars

Context-free grammars can contain errors, just as programs do. Some errors are easy to detect and fix; others are more subtle.

In context-free grammars we start with the start symbol, and apply productions until a terminal string is produced.

Some context-free grammars may contain useless non-terminals. Non-terminals that are unreachable (from the start symbol) or that derive no terminal string are considered useless.

Useless non-terminals (and productions that involve them) can be safely removed from a grammar without changing the language defined by the grammar.

A grammar containing useless non-terminals is said to be non-reduced. After useless non-terminals are removed, the grammar is reduced.

Consider

    S → A B | x
    B → b
    A → a A
    C → d

Which non-terminals are unreachable? Which derive no terminal string?
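The two questions can be checked mechanically. The sketch below assumes the example grammar reads S → A B | x, B → b, A → a A, C → d, and computes both sets with two standard techniques: a fixed-point iteration for the non-terminals that derive a terminal string, and a worklist traversal for reachability. The class and method names are illustrative only.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class UselessNonTerminals {
    /** Grammar as a map: non-terminal -> list of right-hand sides. Any other symbol is a terminal. */
    public static Map<String, List<List<String>>> exampleGrammar() {
        Map<String, List<List<String>>> g = new LinkedHashMap<>();
        g.put("S", List.of(List.of("A", "B"), List.of("x")));
        g.put("B", List.of(List.of("b")));
        g.put("A", List.of(List.of("a", "A")));
        g.put("C", List.of(List.of("d")));
        return g;
    }

    /** Non-terminals that derive at least one terminal string (iterate until nothing changes). */
    public static Set<String> productive(Map<String, List<List<String>>> g) {
        Set<String> result = new TreeSet<>();
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Map.Entry<String, List<List<String>>> e : g.entrySet()) {
                if (result.contains(e.getKey())) continue;
                for (List<String> rhs : e.getValue()) {
                    boolean ok = true;
                    for (String sym : rhs) {
                        // a symbol is fine if it is a terminal or an already-productive non-terminal
                        if (g.containsKey(sym) && !result.contains(sym)) { ok = false; break; }
                    }
                    if (ok) { result.add(e.getKey()); changed = true; break; }
                }
            }
        }
        return result;
    }

    /** Non-terminals reachable from the start symbol (worklist traversal). */
    public static Set<String> reachable(Map<String, List<List<String>>> g, String start) {
        Set<String> seen = new TreeSet<>();
        Deque<String> work = new ArrayDeque<>();
        seen.add(start);
        work.push(start);
        while (!work.isEmpty()) {
            String nt = work.pop();
            for (List<String> rhs : g.get(nt)) {
                for (String sym : rhs) {
                    if (g.containsKey(sym) && seen.add(sym)) work.push(sym);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Map<String, List<List<String>>> g = exampleGrammar();
        Set<String> unreachable = new TreeSet<>(g.keySet());
        unreachable.removeAll(reachable(g, "S"));
        Set<String> unproductive = new TreeSet<>(g.keySet());
        unproductive.removeAll(productive(g));
        System.out.println("unreachable: " + unreachable);                 // prints unreachable: [C]
        System.out.println("derive no terminal string: " + unproductive);  // prints derive no terminal string: [A]
    }
}
```

When reducing a grammar, the usual order is to remove the non-terminals that derive no terminal string first and then recompute reachability: once A and the production S → A B are gone, B is no longer reachable either.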