COMP 181. Prelude. Summary of parsing. A Hierarchy of Grammar Classes. More power? Syntax-directed translation. Analysis

Prelude
COMP 181, October, 9

What is triskaidekaphobia?
  Fear of the number 13
  No aisle 13 in airplanes, no 13th floor in buildings
Fear of Friday the 13th?
  Paraskevidekatriaphobia or friggatriskaidekaphobia
Why unlucky?
  Partly: 12 considered a perfect number
  Christianity: the 13th guest at the Last Supper was Judas
  Norse myth: the 13th god, Loki, arranged the murder of Baldr
  Code of Hammurabi: no number 13
Lucky
  Christianity: 13 guests at the Last Supper, 13 attributes of mercy
  Judaism: 13 principles of Maimonides, Bar Mitzvah at age 13
  Sikhism: lucky number

Tufts University Computer Science

Prelude: Analysis
What is the probability that the 13th occurs on a Friday?

  Day of the week   Number of occurrences
  Sunday            687
  Monday            685
  Tuesday           685
  Wednesday         687
  Thursday          684
  Friday            688
  Saturday          684

The counts cover one 400-year Gregorian cycle (4800 months), so Friday is the most likely weekday for the 13th, at 688/4800.

Summary of parsing
  A solid foundation: context-free grammars
  A simple parser: LL(1)
  A more powerful parser: LR(1)
  An efficiency hack: LALR(1)
  LALR(1) parser generators

A Hierarchy of Grammar Classes
  [Figure: hierarchy of grammar classes, from Andrew Appel, Modern Compiler Implementation in Java]

More power?
  So far: LALR and SLR reduce the size of the tables
    They also reduce the space of languages
  What if I want to expand the space of languages?
  What could I do at a reduce/reduce conflict?
    Try both reductions!
  GLR parsing
    At a choice: split the stack, explore both possibilities
    If one doesn't work out, kill it
    Run time proportional to the amount of ambiguity
    Must design the stack data structure very carefully

General algorithms
  Parsers for the full class of context-free grammars
    Mostly used in linguistics; constructive proof of decidability
  CYK (1960s)
    Bottom-up dynamic programming algorithm
    O(n^3)
  Earley's algorithm (1970)
    Top-down dynamic programming algorithm
    Developed the notation for a partial production (the LR item)
    Worst-case O(n^3) running time
    But O(n^2) for unambiguous grammars
  GLR
    Worst-case O(n^3), but O(n) for deterministic (LR) grammars

LR parsing
  [Figure: LR parsing engine. The scanner supplies the input a1 ... ai ... an $; the stack holds state/symbol pairs sm Xm, sm-1 Xm-1, ...; the engine consults the Action and Goto tables, which a parser generator builds from the grammar.]

Real world parsers
  Real generated code
    lex, flex, yacc, bison
  Interaction between lexer and parser
    C typedef problem
    Merging two languages
  Debugging
    Diagnosing reduce/reduce conflicts
    How to step through an LR parser

Parser generators
  Example: JavaCUP
    LALR(1) parser generator
    Input: grammar specification
    Output: Java classes
      Generic engine
      Action/goto tables
    Separate scanner specification
  Similar tools
    SableCC
    yacc and bison generate C/C++ parsers
    JavaCC: similar, but generates an LL(1) parser

JavaCUP example
  Simple expression grammar
  Operations over numbers only

    // Import generic engine code
    import java_cup.runtime.*;

    /* Preliminaries to set up and use the scanner. */
    init with {: scanner.init(); :};
    scan with {: return scanner.next_token(); :};

  Note: interface to scanner
  One issue: how to agree on the names of the tokens

  Define terminals and non-terminals
  Indicate operator precedence

    /* Terminals (tokens returned by the scanner). */
    terminal SEMI, PLUS, MINUS, TIMES, DIVIDE, MOD;
    terminal UMINUS, LPAREN, RPAREN;
    terminal Integer NUMBER;

    /* Non terminals */
    non terminal expr_list, expr_part;
    non terminal Integer expr, term, factor;

    /* Precedences */
    precedence left PLUS, MINUS;
    precedence left TIMES, DIVIDE, MOD;

Grammar rules

    expr_list ::= expr_list expr_part
                | expr_part
                ;

    expr_part ::= expr SEMI ;

    expr ::= expr PLUS expr
           | expr MINUS expr
           | expr TIMES expr
           | expr DIVIDE expr
           | expr MOD expr
           | LPAREN expr RPAREN
           | NUMBER
           ;

Roadmap
  [Figure: text → chars → Lexical analyzer → tokens → Parser → IR, with Errors reported along the way]
  Parsing
    Tells us if the input is syntactically correct
    Gives us a derivation or parse tree
  But we want to do more:
    Build some data structure: the IR
    Perform other checks and computations

In practice:
  Fold some computations into parsing
  Computations are triggered by parsing steps
  Parser generators
    Add action code to do something
    Typically build the IR
  How much can we do during parsing?

General strategy
  Associate values with grammar symbols
  Associate computations with productions
  Implementation approaches
    Formal: attribute grammars
    Informal: ad-hoc translation schemes
  Some things cannot be folded into parsing

Desk calculator
  Expression grammar
    G → E
    E → E + T | T
    T → T * F | F
    F → ( E ) | num
  Build the parse tree
  Evaluate the resulting tree
  [Figure: parse tree for an example expression using * and +; the operands were lost in extraction]

Can we evaluate the expression without building the tree first?
  Piggyback on parsing
  [Figure: the same parse tree with a value computed at each node]
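The "piggyback on parsing" idea can be sketched in a few lines of Java. This is only an illustration, not the course's code: the class name DeskCalculator and its methods are hypothetical, it uses hand-written recursive descent rather than a generated LR parser, and the left-recursive rules are rewritten as loops so that recursive descent terminates. Each method returns the value of the phrase it has just parsed, so no tree is ever built.

```java
/** Evaluates + and * expressions while parsing; no parse tree is ever built. */
final class DeskCalculator {
    private final String src;
    private int pos = 0;

    private DeskCalculator(String src) {
        this.src = src.replace(" ", "");
    }

    /** G -> E : parse the whole input and return the value of E. */
    static int eval(String text) {
        DeskCalculator p = new DeskCalculator(text);
        int value = p.expr();
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException("unexpected input at offset " + p.pos);
        }
        return value;
    }

    /** E -> E + T | T, written as a loop; each step adds the value of T. */
    private int expr() {
        int value = term();
        while (peek() == '+') {
            pos++;
            value += term();
        }
        return value;
    }

    /** T -> T * F | F, written as a loop; each step multiplies by the value of F. */
    private int term() {
        int value = factor();
        while (peek() == '*') {
            pos++;
            value *= factor();
        }
        return value;
    }

    /** F -> ( E ) | num */
    private int factor() {
        if (peek() == '(') {
            pos++;
            int value = expr();
            if (peek() != ')') {
                throw new IllegalArgumentException("')' expected at offset " + pos);
            }
            pos++;
            return value;
        }
        int start = pos;
        while (Character.isDigit(peek())) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("number expected at offset " + pos);
        }
        return Integer.parseInt(src.substring(start, pos));
    }

    /** Current character, or '\0' at end of input. */
    private char peek() {
        return pos < src.length() ? src.charAt(pos) : '\0';
    }

    public static void main(String[] args) {
        System.out.println(eval("2 * 3 + 4")); // prints 10
    }
}
```

The same shape carries over to a generated parser: there the per-production action code plays the role of the arithmetic inside expr, term, and factor.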

Codify:
  Store intermediate values with non-terminals
  Perform computations in each production

  Production      Computation
  G → E           print(E.val)
  E → E1 + T      E.val ← E1.val + T.val
  E → T           E.val ← T.val
  T → T1 * F      T.val ← T1.val * F.val
  T → F           T.val ← F.val
  F → ( E )       F.val ← E.val
  F → num         F.val ← valueof(num)

Attribute grammars
  A context-free grammar with a set of rules
    Each symbol has a set of values, or attributes
    Semantic rules: how to compute each attribute
  The bad news:
    Formal attribute grammars were never widely adopted
  Why study them?
    The attribute grammar formalism is important
    Succinctly makes many points clear
    Sets the stage for actual, ad-hoc practice
    The problems motivate practice

Grammar:
    Number → Sign List
    Sign → + | -
    List → List Bit | Bit
    Bit → 0 | 1
  Describes signed binary numbers
  We would like to augment it with rules that compute the decimal value of each valid input string
  [Figure: two example derivations from Number; the derived strings were lost in extraction]

Attribute grammar
  Goal: compute the value of the binary number
  Information we need
    Position of each bit, to compute its place value (for example, 101 = 1 * 2^2 + 0 * 2^1 + 1 * 2^0)
    Sum of bit values
  Computation
    Propagate position information
    Accumulate the sums

Attribution rules
  Number → Sign List
    How to compute Number.val from Sign and List?
    if Sign.neg then Number.val ← -1 * List.val else Number.val ← List.val
  Sign → + | -
    Sign.neg is true for -, false for +
  List → List Bit | Bit
    How to compute List.val?
    List0.val ← List1.val + Bit.val, or List.val ← Bit.val
  Bit → 0 | 1
    How to compute Bit.val?
    Bit.val ← 0 or Bit.val ← 2^Bit.pos (pos is the bit position)
    Where does pos come from?

Attribution rules
  Number → Sign List
    Start at bit position 0: List.pos ← 0
  List → List Bit | Bit
    Push position information down
      List1.pos ← List0.pos + 1 and Bit.pos ← List0.pos (for List0 → List1 Bit)
      or Bit.pos ← List.pos (for List → Bit)
  Bit → 0 | 1
    Now we can compute the value: Bit.val ← 0 or Bit.val ← 2^(Bit.pos)

Attribution rules
  List0 → List1 Bit
    List1.pos ← List0.pos + 1
    Bit.pos ← List0.pos
    List0.val ← List1.val + Bit.val
  Notice: information can flow top-down or bottom-up
    (Also: left-to-right or right-to-left)

Attribution rules
  Top-down values are inherited attributes
  Bottom-up values are synthesized attributes
  Values with no dependence are called independent attributes

Attribute grammars
  Specification:
    Attributes
      Associated with nodes in the parse tree
      Distinguish multiple occurrences of a non-terminal with an index
    Rules
      Value assignments associated with productions
      Information is entirely local: a rule can only refer to values in the given production
  Given a parse tree
    Rules form a dependence graph between attributes
  Result: a high-level, functional specification

Evaluation
  Tricky part
    Values flow both up and down in the tree
    How do we order the computation?
    And how does that relate to parsing order? (i.e., the order in which parse tree nodes are created)
  Key
    Must obey the dependences between attributes
  Let's look at a general technique for evaluation
  [Figure: parse tree of a signed binary number, rooted at Number with a Sign child; the example string was lost in extraction]
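A small Java sketch may make the two directions of flow concrete for the signed binary number example. Everything named here is an assumption for illustration: the class SignedBinary, its methods, and the nonterminal names Number, List, and Bit used in the comments. The bit position is passed down as a parameter (inherited), and the value comes back as the return value (synthesized).

```java
/** Attribute evaluation for signed binary numbers such as "-101". */
final class SignedBinary {
    /** Number -> Sign List: List starts at position 0; the sign decides the final value. */
    static int numberVal(String text) {
        if (text.length() < 2) {
            throw new IllegalArgumentException("need a sign and at least one bit");
        }
        boolean neg = signNeg(text.charAt(0));
        int listVal = listVal(text.substring(1), 0);
        return neg ? -listVal : listVal;
    }

    /** Sign -> + | - : the synthesized attribute neg. */
    static boolean signNeg(char sign) {
        if (sign == '-') return true;
        if (sign == '+') return false;
        throw new IllegalArgumentException("sign expected, got " + sign);
    }

    /**
     * List -> List Bit | Bit. The position is inherited (a parameter);
     * the value is synthesized (the return value). The last character
     * of bits is the Bit belonging to this List.
     */
    static int listVal(String bits, int pos) {
        int last = bits.length() - 1;
        int bitVal = bitVal(bits.charAt(last), pos);       // Bit gets this List's position
        if (last == 0) {
            return bitVal;                                  // List -> Bit
        }
        // List -> List Bit: the inner List sits one position higher.
        return listVal(bits.substring(0, last), pos + 1) + bitVal;
    }

    /** Bit -> 0 | 1: value is 0, or 2^pos for a 1 bit (pos must stay below 31 here). */
    static int bitVal(char bit, int pos) {
        if (bit == '0') return 0;
        if (bit == '1') return 1 << pos;
        throw new IllegalArgumentException("bit expected, got " + bit);
    }

    public static void main(String[] args) {
        System.out.println(numberVal("-101")); // prints -5
    }
}
```

Passing inherited attributes as parameters and returning synthesized ones works here because every dependence points either down from the parent or up from an already visited child; attribute grammars in general need the dependence-based ordering discussed next.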

[Figures: a sequence of slides builds the attribute dependence graph on the parse tree of a signed binary number. Apart from Sign's neg: true, the node labels and attribute values were lost in extraction. The slide captions, in order:]
  Annotate the parse tree with attributes
  Inherited attributes flow down in the tree
  At the leaves, add dependences between inherited and synthesized attributes
  Collect the synthesized attributes
  Complete graph
  Now, throw away the parse tree
  Need a method to solve the set of constraints described by this dependence graph
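One way to "solve" such a dependence graph is a topological sort, which yields an order in which every attribute is evaluated after the attributes it reads. The Java sketch below uses Kahn's algorithm; the class AttributeOrder, the string names for attribute instances, and the tiny example graph in main are all made up for illustration and are not taken from the slides.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Orders attribute instances so each one comes after everything it depends on. */
final class AttributeOrder {
    /** dependsOn maps each attribute to the attributes that its rule reads. */
    static List<String> evaluationOrder(Map<String, List<String>> dependsOn) {
        Map<String, Integer> unmet = new LinkedHashMap<>();       // inputs not yet evaluated
        Map<String, List<String>> users = new LinkedHashMap<>();  // reverse edges
        for (Map.Entry<String, List<String>> e : dependsOn.entrySet()) {
            unmet.merge(e.getKey(), e.getValue().size(), Integer::sum);
            for (String input : e.getValue()) {
                unmet.putIfAbsent(input, 0);
                users.computeIfAbsent(input, k -> new ArrayList<>()).add(e.getKey());
            }
        }

        // Attributes with no unmet inputs can be evaluated immediately.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : unmet.entrySet()) {
            if (e.getValue() == 0) {
                ready.add(e.getKey());
            }
        }

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String attr = ready.remove();
            order.add(attr);
            for (String user : users.getOrDefault(attr, List.of())) {
                if (unmet.merge(user, -1, Integer::sum) == 0) {
                    ready.add(user);
                }
            }
        }
        if (order.size() != unmet.size()) {
            throw new IllegalStateException("circular attribute dependence");
        }
        return order;
    }

    public static void main(String[] args) {
        // Hypothetical attribute instances for a one-bit signed number.
        Map<String, List<String>> graph = new LinkedHashMap<>();
        graph.put("Number.val", List.of("Sign.neg", "List.val"));
        graph.put("List.val", List.of("Bit.val"));
        graph.put("Bit.val", List.of("Bit.pos"));
        graph.put("Bit.pos", List.of("List.pos"));
        System.out.println(evaluationOrder(graph));
    }
}
```

A cycle in the graph means no valid evaluation order exists, which is why the sketch throws instead of returning a partial order.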

Evaluation
  Dynamic, dependence-based methods
    Build the parse tree and the dependence graph
    Topologically sort the graph
    Gives us an order of evaluation
  Rule-based methods
    Analyze the rules at compiler-generation time
    Determine a fixed (static) ordering
    Evaluate nodes in that order
  Oblivious methods
    Ignore rules and parse tree
    Pick a convenient order (at design time) and use it

Attribute grammars
  Clean, declarative
  Handle a wide variety of problems
  BUT, have limitations and evaluation issues

Reality
  Never widely adopted
  In practice:
    Apply arbitrary code actions on attributes
    Order of evaluation dictated by the parsing algorithm
    Only works for limited classes of attribute grammars

Special class: L-attributed
  L-attributed definition
    Use values from the parent and from siblings to the left
    For a production A → X1 X2 ... Xn, each inherited attribute of Xi depends only on
      Attributes of X1 X2 ... Xi-1, and
      Inherited attributes of A
  Suited to LL parsing
    Evaluate in a single top-down pass (left to right)
    Pass values down through recursive descent
    Emit code to compute attributes in the appropriate procedures

Special class: S-attributed
  S-attributed definition
    All attributes are synthesized
    For a production A → X1 X2 ... Xn, the value of A is computed as a function of the attributes already computed for X1 X2 ... Xn
  Suited to LR parsing
    Can be computed in a single bottom-up pass
    Associate pieces of code with each production
    At each reduction, the code is executed

Another example

    expr_part ::= expr SEMI ;

    expr ::= expr:e1 PLUS expr:e2
               {: RESULT = new Integer(e1.intValue() + e2.intValue()); :}
           | expr:e1 MINUS expr:e2
               {: RESULT = new Integer(e1.intValue() - e2.intValue()); :}
           ...
           | LPAREN expr:e RPAREN
               {: RESULT = e; :}
           | NUMBER:n
               {: RESULT = n; :}
           ;

  The label after the colon (e1, e2, e, n) is a name for the value associated with that symbol of the production
  Arbitrary code between {: and :}
  RESULT refers to the attribute of the LHS non-terminal

Build an abstract syntax tree

    expr_part ::= expr SEMI ;

    expr ::= expr:e1 PLUS expr:e2
               {: RESULT = new AddNode(e1, e2); :}
           | expr:e1 MINUS expr:e2
               {: RESULT = new SubNode(e1, e2); :}
           ...
           | LPAREN expr:e RPAREN
               {: RESULT = e; :}
           | NUMBER:n
               {: RESULT = new Node(n); :}
           ;
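The slides name the classes AddNode, SubNode, and Node but do not show them. The following Java sketch is one guess at what such classes might look like; the BinaryNode helper, the eval and toString methods, and the AstDemo driver are hypothetical additions. Because the grammar calls new Node(n) directly, Node is kept concrete here and doubles as the number leaf.

```java
/** Small driver; builds by hand the tree the actions would build for 1 + (5 - 2). */
final class AstDemo {
    public static void main(String[] args) {
        Node tree = new AddNode(new Node(1), new SubNode(new Node(5), new Node(2)));
        System.out.println(tree + " = " + tree.eval()); // prints (1 + (5 - 2)) = 4
    }
}

/** Number leaf; in this sketch it also serves as the base class of all tree nodes. */
class Node {
    private final int value;

    Node(int value) {
        this.value = value;
    }

    int eval() {
        return value;
    }

    @Override
    public String toString() {
        return Integer.toString(value);
    }
}

/** Shared storage for the two operands of a binary operator. */
abstract class BinaryNode extends Node {
    protected final Node left;
    protected final Node right;

    BinaryNode(Node left, Node right) {
        super(0); // the leaf value is unused by operator nodes
        this.left = left;
        this.right = right;
    }
}

final class AddNode extends BinaryNode {
    AddNode(Node left, Node right) {
        super(left, right);
    }

    @Override
    int eval() {
        return left.eval() + right.eval();
    }

    @Override
    public String toString() {
        return "(" + left + " + " + right + ")";
    }
}

final class SubNode extends BinaryNode {
    SubNode(Node left, Node right) {
        super(left, right);
    }

    @Override
    int eval() {
        return left.eval() - right.eval();
    }

    @Override
    public String toString() {
        return "(" + left + " - " + right + ")";
    }
}
```

With classes like these, the non-terminal expr would presumably be declared with type Node rather than Integer, since RESULT now carries a tree instead of a number.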

Implementation
  How does this work?
    Where are the attributes stored?
    What do e1, e2, n, RESULT refer to?
  Key: store attributes on the stack
  At a reduction of A → β
    Pop |β| stack entries, each holding symbol, state, attribute value
    Map values to names: for expr:e1 PLUS expr:e2, e2 = top of stack, then PLUS, then e1
    Invoke the action code on the values, store the result in RESULT
    Push RESULT back on the stack with the new symbol and state

Next
  Static checking
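The reduction step can be sketched as follows. This is a simplified illustration, not JavaCUP's actual runtime: the names ValueStackDemo, Entry, and reduce are hypothetical, and the goto state is passed in as a parameter where a real LR engine would look it up in the goto table using the state exposed after the pops.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Function;

/** One reduction step of an LR parser that keeps attribute values on its stack. */
final class ValueStackDemo {
    /** A stack entry: grammar symbol, parser state, attribute value. */
    static final class Entry {
        final String symbol;
        final int state;
        final Object value;

        Entry(String symbol, int state, Object value) {
            this.symbol = symbol;
            this.state = state;
            this.value = value;
        }
    }

    /**
     * Reduce by lhs -> beta, where beta has rhsLength symbols: pop the entries,
     * run the action on their values, and push the result under the symbol lhs.
     */
    static void reduce(Deque<Entry> stack, String lhs, int rhsLength,
                       Function<Object[], Object> action, int gotoState) {
        Object[] values = new Object[rhsLength];
        for (int i = rhsLength - 1; i >= 0; i--) {  // top of stack is the rightmost symbol
            values[i] = stack.pop().value;
        }
        Object result = action.apply(values);        // plays the role of RESULT
        stack.push(new Entry(lhs, gotoState, result));
    }

    public static void main(String[] args) {
        Deque<Entry> stack = new ArrayDeque<>();
        stack.push(new Entry("expr", 2, 3));      // becomes e1
        stack.push(new Entry("PLUS", 5, null));   // terminals without a value carry null
        stack.push(new Entry("expr", 7, 4));      // becomes e2 (top of stack)

        // Action for expr ::= expr:e1 PLUS expr:e2
        reduce(stack, "expr", 3, v -> (Integer) v[0] + (Integer) v[2], 2);
        System.out.println(stack.peek().value);   // prints 7
    }
}
```

The array index of each popped value corresponds to the position of its symbol in the production, which is how a generator can bind the labels e1 and e2 to the right stack slots.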