Last Time. What do we want? When do we want it? An AST. Now!

Similar documents
CUP. Lecture 18 CUP User s Manual (online) Robb T. Koether. Hampden-Sydney College. Fri, Feb 27, 2015

Fall Compiler Principles Lecture 5: Parsing part 4. Roman Manevich Ben-Gurion University

Fall Compiler Principles Lecture 4: Parsing part 3. Roman Manevich Ben-Gurion University of the Negev

CS 536 Midterm Exam Spring 2013

Properties of Regular Expressions and Finite Automata

Syntax-Directed Translation. Lecture 14

Abstract Syntax. Mooly Sagiv. html://

Last time. What are compilers? Phases of a compiler. Scanner. Parser. Semantic Routines. Optimizer. Code Generation. Sunday, August 29, 2010

Operator Precedence a+b*c b*c E + T T * P ( E )

Defining syntax using CFGs

CS 11 Ocaml track: lecture 6

Agenda. Previously. Tentative syllabus. Fall Compiler Principles Lecture 5: Parsing part 4 12/2/2015. Roman Manevich Ben-Gurion University

A programming language requires two major definitions A simple one pass compiler

EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

How do LL(1) Parsers Build Syntax Trees?

PLT 4115 LRM: JaTesté

CPS 506 Comparative Programming Languages. Syntax Specification

CS 541 Fall Programming Assignment 3 CSX_go Parser

Yacc Yet Another Compiler Compiler

Using an LALR(1) Parser Generator

CSEP 501 Compilers. Languages, Automata, Regular Expressions & Scanners Hal Perkins Winter /8/ Hal Perkins & UW CSE B-1

Compilation 2013 Parser Generators, Conflict Management, and ML-Yacc

Regular Expressions. Agenda for Today. Grammar for a Tiny Language. Programming Language Specifications

Compilers. Compiler Construction Tutorial The Front-end

Introduction to Yacc. General Description Input file Output files Parsing conflicts Pseudovariables Examples. Principles of Compilers - 16/03/2006

Syntax Analysis The Parser Generator (BYacc/J)

ASTs, Objective CAML, and Ocamlyacc

Question Points Score

CSE 401 Midterm Exam Sample Solution 11/4/11

The Structure of a Syntax-Directed Compiler

Programming Languages (CS 550) Lecture 4 Summary Scanner and Parser Generators. Jeremy R. Johnson

TML Language Reference Manual

Simple LR (SLR) LR(0) Drawbacks LR(1) SLR Parse. LR(1) Start State and Reduce. LR(1) Items 10/3/2012

Syntax/semantics. Program <> program execution Compiler/interpreter Syntax Grammars Syntax diagrams Automata/State Machines Scanning/Parsing

Topic 5: Syntax Analysis III

CS164: Programming Assignment 3 Dpar Parser Generator and Decaf Parser

Parser and syntax analyzer. Context-Free Grammar Definition. Scanning and parsing. How bottom-up parsing works: Shift/Reduce tecnique.

Lex & Yacc (GNU distribution - flex & bison) Jeonghwan Park

Syntax-Directed Translation. Introduction

CSE 3302 Programming Languages Lecture 2: Syntax

Announcements. Working in pairs is only allowed for programming assignments and not for homework problems. H3 has been posted

Syntax Analysis/Parsing. Context-free grammars (CFG s) Context-free grammars vs. Regular Expressions. BNF description of PL/0 syntax

Syntax-Directed Translation

A Simple Syntax-Directed Translator

For example, we might have productions such as:

A Bison Manual. You build a text file of the production (format in the next section); traditionally this file ends in.y, although bison doesn t care.

JavaCUP. There are also many parser generators written in Java

Table-Driven Top-Down Parsers

Context-free grammars (CFG s)

CSE 401 Midterm Exam 11/5/10

CS 536 Spring Programming Assignment 1 Identifier Cross-Reference Analysis

Context-free grammars

Abstract Syntax Trees

The Structure of a Syntax-Directed Compiler

Assignment 1. Due 08/28/17

Compilers and Language Processing Tools

Compiler Construction: Parsing

COMPILER CONSTRUCTION LAB 2 THE SYMBOL TABLE. Tutorial 2 LABS. PHASES OF A COMPILER Source Program. Lab 2 Symbol table

Lesson 10. CDT301 Compiler Theory, Spring 2011 Teacher: Linus Källberg

CS 536 Spring Programming Assignment 3 CSX Parser. Due: Tuesday, March 24, 2015 Not accepted after Tuesday, March 31, 2015

Project 3 - Parsing. Misc Details. Parser.java Recursive Descent Parser... Using Lexer.java

Semantic Analysis Attribute Grammars

LECTURE 11. Semantic Analysis and Yacc

CSE P 501 Compilers. Implementing ASTs (in Java) Hal Perkins Autumn /20/ Hal Perkins & UW CSE H-1

More On Syntax Directed Translation

Ray Pereda Unicon Technical Report UTR-03. February 25, Abstract

A lexical analyzer generator for Standard ML. Version 1.6.0, October 1994

Principles of Programming Languages COMP251: Syntax and Grammars

Syntax Analysis Part IV

G53CMP: Lecture 4. Syntactic Analysis: Parser Generators. Henrik Nilsson. University of Nottingham, UK. G53CMP: Lecture 4 p.1/32

The Structure of a Compiler

Syntactic Analysis. The Big Picture Again. Grammar. ICS312 Machine-Level and Systems Programming

Lexical and Syntax Analysis

Parser Tools: lex and yacc-style Parsing

CSE 413 Programming Languages & Implementation. Hal Perkins Autumn 2012 Grammars, Scanners & Regular Expressions

Lex and Yacc. More Details

CSE P 501 Compilers. Implementing ASTs (in Java) Hal Perkins Winter /22/ Hal Perkins & UW CSE H-1

Compilers. Bottom-up Parsing. (original slides by Sam

COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

LECTURE 3. Compiler Phases

COP4020 Programming Languages. Syntax Prof. Robert van Engelen

POLITECNICO DI TORINO. (01JEUHT) Formal Languages and Compilers. Laboratory N 3. Lab 3. Cup Advanced Use

Context-Free Grammars

Defining syntax using CFGs

Syntactic Analysis. Syntactic analysis, or parsing, is the second phase of compilation: The token file is converted to an abstract syntax tree.

Semantic actions for expressions

Optimizing Finite Automata

CSE 413 Programming Languages & Implementation. Hal Perkins Winter 2019 Grammars, Scanners & Regular Expressions

Name EID. (calc (parse '{+ {with {x {+ 5 5}} {with {y {- x 3}} {+ y y} } } z } ) )

Lexical and Syntax Analysis

Parsing II Top-down parsing. Comp 412

Compiler Passes. Syntactic Analysis. Context-free Grammars. Syntactic Analysis / Parsing. EBNF Syntax of initial MiniJava.

CS Lecture 2. The Front End. Lecture 2 Lexical Analysis

A clarification on terminology: Recognizer: accepts or rejects strings in a language. Parser: recognizes and generates parse trees (imminent topic)

Parser Tools: lex and yacc-style Parsing

COMP3131/9102: Programming Languages and Compilers

COP4020 Programming Languages. Syntax Prof. Robert van Engelen

Lexical Analysis. Chapter 1, Section Chapter 3, Section 3.1, 3.3, 3.4, 3.5 JFlex Manual

([1-9] 1[0-2]):[0-5][0-9](AM PM)? What does the above match? Matches clock time, may or may not be told if it is AM or PM.

Chapter 3. Parsing #1

Transcription:

Java CUP 1

Last Time What do we want? An AST When do we want it? Now! 2

This Time A little review of ASTs The philosophy and use of a Parser Generator 3

Translating Lists CFG IdList -> id IdList comma id AST Input x, y, z IdList IdNode x IdNode y IdNode z IdList, id z IdList id x, id y 4

Parser Generators Tools that take an SDT spec and build an AST YACC: Yet Another Compiler Compiler Java CUP: Constructor of Useful Parsers Conceptually similar to JLex Input: Language rules + actions Output: Java code Parser spec (xxx.cup) Java CUP Parser Source (parser.java) Symbols (sym.java) 5

Java CUP Parser.java Constructor takes arg of type Scanner (i.e., yylex) Contains a parsing method return: Symbol whose value contains translation of root nonterminal Uses output of JLex Depends on scanner and TokenVals sym.java defines the communication language Uses defs of AST classes Also in xxx.cup Parser Source (parser.java) Parser spec (xxx.cup) Java CUP Defines the token names used by both JLex and Java CUP Symbols (sym.java) 6

Java CUP Input Spec Terminal & nonterminal declarations Optional precedence and associativity declarations Grammar with rules and actions [no actions shown here] Grammar rules Expr ::= intliteral id Expr plus Expr Expr times Expr lparens Expr rparens Terminal and Nonterminals terminal intliteral; terminal id; terminal plus; terminal minus; terminal times; terminal lparen; terminal rparen; non terminal Expr; lowest precedence first Precedence and Associativity precedence left plus, minus; precedence left times; prededence nonassoc less; 7

Java CUP Example Assume ExpNode subclasses PlusNode, TimesNode have 2 children for operands IdNode has a String field IntLitNode has an int field Assume Token classes IntLitTokenVal with field intval for the value of the integer literal IdTokenVal with field idval for the actual identifier Step 1: Add types to terminals terminal IntLitTokenVal intliteral; terminal IdTokenVal id; terminal plus; terminal times; terminal lparen; terminal rparen; non terminal ExpNode expr; 8

Java CUP Example Expr ::= intliteral id Expr plus Expr Expr times Expr lparen Expr rparen ; 9

Java CUP Example Expr ::= intliteral:i RESULT = new IntLitNode(i.intVal); id Expr plus Expr Expr times Expr lparen Expr rparen ; 10

Java CUP Example Expr ::= intliteral:i RESULT = new IntLitNode(i.intVal); id:i RESULT = new IdNode(i.idVal); Expr:e1 plus Expr:e2 RESULT = new PlusNode(e1,e2); Expr:e1 times Expr:e2 RESULT = new TimesNode(e1,e2); lparen Expr:e rparen RESULT = e; ; 11

Java CUP Example Input: 2 + 3 Expr PlusNode left: right: IntLitNode Expr plus Expr IntLitNode val: 2 val: 3 IntLitTokenVal linenum: charnum: intval: intliteral intliteral IntLitTokenVal linenum: charnum: intval: 2 3 Purple = Terminal Token (Built by Scanner) Blue = Symbol (Built by Parser) 12

Handling Lists in Java CUP stmtlist ::= stmtlist:sl stmt:s sl.addtoend(s); RESULT = sl; /* epsilon */ RESULT = new Sequence(); ; Another issue: left-recursion (as above) or right-recursion? For top-down parsers, must use right-recursion Left-recursion causes an infinite loop With Java CUP, use left-recursion! Java CUP is a bottom-up parser (LALR(1)) Left-recursion allows a bottom-up parser to recognize a list s1, s2, s3, s4 with no extra stack space: recognize instance of stmtlist ::= epsilon (current nonterminal stmtlist) recognize instance of stmtlist ::= stmtlist:current stmt:s1 [s1] recognize instance of stmtlist ::= stmtlist:current stmt:s2 [s1, s2] recognize instance of stmtlist ::= stmtlist:current stmt:s3 [s1, s2, s3] recognize instance of stmtlist ::= stmtlist:current stmt:s4 [s1, s2, s3, s4] 13

UMINUS is a phony token never returned by the scanner. UMINUS is solely for the purpose of being used in %prec UMINUS Handling Unary Minus /* precedences and associativities of operators */ precedence left PLUS, MINUS; precedence left TIMES, DIVIDE; precedence nonassoc UMINUS; // Also used for precedence of unary minus The precedence of a rule is that of the last token of the rule, unless assigned a specific precedence via %prec <TOKEN> exp ::=... MINUS exp:e RESULT = new UnaryMinusNode(e); %prec UMINUS /* artificially elevate the precedence to that of UMINUS */ exp:e1 PLUS exp:e2 RESULT = new PlusNode(e1, e2); exp:e1 MINUS exp:e2 RESULT = new MinusNode(e1, e2);... ; 14

Java CUP Demo 15