# Compiler Construction

Size: px
Start display at page:

Transcription

1 Compiler Construction Thomas Noll Software Modeling and Verification Group RWTH Aachen University

2 Recap: LR(1) Parsing LR(1) Items and Sets Observation: not every element of fo(a) can follow every occurrence of A = refinement of LR(0) items by adding possible lookahead symbols Definition (LR(1) items and sets) Let G = N, Σ, P, S CFG Σ be start separated by S S. If S r αaaw r αβ 1 β 2 aw, then [A β 1 β 2, a] is called an LR(1) item for αβ 1. If S r αa r αβ 1 β 2, then [A β 1 β 2, ε] is called an LR(1) item for αβ 1. Given γ X, LR(1)(γ) denotes the set of all LR(1) items for γ, called the LR(1) set (or: LR(1) information) of γ. LR(1)(G) := {LR(1)(γ) γ X }. 3 of 34 Compiler Construction

3 Recap: LR(1) Parsing LR(1) Conflicts Definition (LR(1) conflicts) Let G = N, Σ, P, S CFG Σ and I LR(1)(G). I has a shift/reduce conflict if there exist A α 1 aα 2, B β P and x Σ ε such that [A α 1 aα 2, x], [B β, a] I. I has a reduce/reduce conflict if there exist x Σ ε and A α, B β P with A B or α β such that [A α, x], [B β, x] I. Lemma G LR(1) iff no I LR(1)(G) contains conflicting items. 4 of 34 Compiler Construction

4 Recap: LR(1) Parsing The LR(1) Action Function Definition (LR(1) action function) The LR(1) action function act : LR(1)(G) Σ ε {red i i [p]} {shift, accept, error} is defined by act(i, x) := red i if i 0, π i = A α and [A α, x] I shift if [A α 1 xα 2, y] I and x Σ accept if [S S, ε] I and x = ε error otherwise Corollary For every G CFG Σ, G LR(1) iff its LR(1) action function is well defined. 5 of 34 Compiler Construction

5 Generating Top-Down Parsers Using ANTLR Overview of ANTLR ANother Tool for Language Recognition Input: language description using EBNF grammars Output: recogniser for the language Supports recognisers for three kinds of input: character streams (generation of scanner) token streams (generation of parser) node streams (generation of tree walker) Current version: ANTLR 4.7 generates LL( ) recognisers: flexible choice of lookahead length applies longest match principle supports ambiguous grammars by using first match principle for rules supports direct left recursion targets Java, C++, C#, Python, JavaScript, Go, Swift Details: T. Parr: The Definitive ANTLR 4 Reference, Pragmatic Bookshelf, of 34 Compiler Construction

6 Generating Top-Down Parsers Using ANTLR Example: Infix Postfix Translator (Simple.g4) grammar Simple; // productions for syntax analysis program returns [String s]: e=expr EOF {\$s = \$e.s;}; expr returns [String s]: t=term r=rest {\$s = \$t.s + \$r.s;}; rest returns [String s]: PLUS t=term r=rest {\$s = \$t.s + "+" + \$r.s;} MINUS t=term r=rest {\$s = \$t.s + "-" + \$r.s;} /* empty */ {\$s = "";}; term returns [String s]: DIGIT {\$s = \$DIGIT.text;}; // productions for lexical analysis PLUS : + ; MINUS : - ; DIGIT : [0-9]; 8 of 34 Compiler Construction

7 Generating Top-Down Parsers Using ANTLR Java Code for Using Translator import java.io.*; import org.antlr.v4.runtime.*; public class SimpleMain { public static void main(final String[] args) throws IOException { String printsource = null, printsymtab = null, printir = null, printasm = null; SimpleLexer lexer = /* Create instance of lexer */ new SimpleLexer(new ANTLRInputStream(args[0])); SimpleParser parser = /* Create instance of parser */ new SimpleParser(new CommonTokenStream(lexer)); String postfix = parser.program().s; /* Run translator */ System.out.println(postfix); } } 9 of 34 Compiler Construction

8 Generating Top-Down Parsers Using ANTLR An Example Run 1. After installation, invoke ANTLR: \$ java -jar /usr/local/lib/antlr-4.7-complete.jar Simple.g4 (will generate SimpleLexer.java, SimpleParser.java, Simple.tokens, and SimpleLexer.tokens) 2. Use Java compiler: \$ javac -cp /usr/local/lib/antlr-4.7-complete.jar Simple*.java 3. Run translator: \$ java -cp.:/usr/local/lib/antlr-4.7-complete.jar SimpleMain of 34 Compiler Construction

9 Generating Top-Down Parsers Using ANTLR Advantages of ANTLR Advantages of ANTLR Generated (Java) code is similar to hand-written code possible (and easy) to read and debug generated code Syntax for specifying scanners, parsers and tree walkers is identical Support for many target programming languages ANTLR is well supported and has an active user community 11 of 34 Compiler Construction

10 Generating Bottom-Up Parsers Using yacc and bison The yacc and bison Tools Usage of yacc ( yet another compiler compiler ): yacc [f]lex spec.y y.tab.c lex.yy.c spec.l yacc specification Parser source Scanner source [f]lex specification cc a.out Executable LALR(1) parser Like for [f]lex, a yacc specification is of the form Declarations (optional) %% Rules %% Auxiliary procedures (optional) bison : upward-compatible GNU implementation of yacc (more flexible w.r.t. file names,...) 13 of 34 Compiler Construction

11 Generating Bottom-Up Parsers Using yacc and bison yacc Specifications Declarations: Token definitions: %token Tokens Not every token needs to be declared ( +, =,...) Start symbol: %start Symbol (optional) C code for declarations etc.: %{ Code %} Rules: context-free productions and semantic actions A α 1 α 2... α n represented as A : α 1 {Action 1 } α 2. {Action 2 } α n {Action n }; Semantic actions = C statements for computing attribute values \$\$ = attribute value of A \$i = attribute value of i-th symbol on right-hand side Default action: \$\$ = \$1 Auxiliary procedures: scanner (if not generated by [f]lex), error routines, of 34 Compiler Construction

12 Generating Bottom-Up Parsers Using yacc and bison Example: Simple Desk Calculator I %{/* SLR(1) grammar for arithmetic expressions (Example 9.9) */ #include <stdio.h> #include <ctype.h> %} %token DIGIT %% line : expr \n { printf("%d\n", \$1); }; expr : expr + term { \$\$ = \$1 + \$3; } term { \$\$ = \$1; }; term : term * factor { \$\$ = \$1 * \$3; } factor { \$\$ = \$1; }; factor : ( expr ) { \$\$ = \$2; } DIGIT { \$\$ = \$1; }; %% yylex() { int c; c = getchar(); if (isdigit(c)) yylval = c - 0 ; return DIGIT; return c; } 15 of 34 Compiler Construction

13 Generating Bottom-Up Parsers Using yacc and bison Example: Simple Desk Calculator II \$ yacc calc.y \$ cc y.tab.c -ly \$ a.out \$ a.out 2+3* of 34 Compiler Construction

14 Generating Bottom-Up Parsers Using yacc and bison An Ambiguous Grammar I %{/* Ambiguous grammar for arithmetic expressions (Example 10.17) */ #include <stdio.h> #include <ctype.h> %} %token DIGIT %% line : expr \n { printf("%d\n", \$1); }; expr : expr + expr { \$\$ = \$1 + \$3; } expr * expr { \$\$ = \$1 * \$3; } DIGIT { \$\$ = \$1; }; %% yylex() { int c; c = getchar(); if (isdigit(c)) {yylval = c - 0 ; return DIGIT;} return c; } 17 of 34 Compiler Construction

15 Generating Bottom-Up Parsers Using yacc and bison An Ambiguous Grammar II Invoking yacc with the option -v produces a report y.output: State 8 2 expr: expr. + expr 2 expr + expr. 3 expr. * expr + shift and goto state 6 * shift and goto state 7 + [reduce with rule 2 (expr)] * [reduce with rule 2 (expr)] State 9 2 expr: expr. + expr 3 expr. * expr 3 expr * expr. + shift and goto state 6 * shift and goto state 7 + [reduce with rule 3 (expr)] * [reduce with rule 3 (expr)] 18 of 34 Compiler Construction

16 Generating Bottom-Up Parsers Using yacc and bison Conflict Handling in yacc Default conflict resolving strategy in yacc: reduce/reduce: choose first conflicting production in specification shift/reduce: prefer shift resolves dangling-else ambiguity (Example 10.18) correctly also adequate for strong following weak operator (* after +; Example 10.17) and for right-associative operators not appropriate for weak following strong operator and for left-associative operators ( = reduce; see Example 10.17) For ambiguous grammar: \$ yacc ambig.y conflicts: 4 shift/reduce \$ cc y.tab.c -ly \$ a.out 2+3*5 17 \$ a.out 2* of 34 Compiler Construction

17 Generating Bottom-Up Parsers Using yacc and bison Precedences and Associativities in yacc I General mechanism for resolving conflicts: %[left right] Operators 1. %[left right] Operators n operators in one line have given associativity and same precedence precedence increases over lines Example 11.1 %left + - %left * / %right ^ ^ (right associative) binds stronger than * and / (left associative), which in turn bind stronger than + and - (left associative) 20 of 34 Compiler Construction

18 Generating Bottom-Up Parsers Using yacc and bison Precedences and Associativities in yacc II %{/* Ambiguous grammar for arithmetic expressions with precedences and associativities */ #include <stdio.h> #include <ctype.h> %} %token DIGIT %left + %left * %% line : expr \n { printf("%d\n", \$1); }; expr : expr + expr { \$\$ = \$1 + \$3; } expr * expr { \$\$ = \$1 * \$3; } DIGIT { \$\$ = \$1; }; %% yylex() { int c; c = getchar(); if (isdigit(c)) {yylval = c - 0 ; return DIGIT;} return c; } 21 of 34 Compiler Construction

19 Generating Bottom-Up Parsers Using yacc and bison Precedences and Associativities in yacc III \$ yacc ambig-prio.y \$ cc y.tab.c -ly \$ a.out 2* \$ a.out 2+3* of 34 Compiler Construction

20 LL and LR Parsing in Practice LL and LR Parsing in Practice In practice: use of LL(1)/LL( ) or LALR(1) Detailed comparison (cf. Fischer/LeBlanc: Crafting a Compiler, Benjamin/Cummings, 1988): Simplicity: LL wins LL parsing technique easier to understand recursive-descent parser easier to debug than LALR action tables Generality: LALR wins almost LL(1) LALR(1) (only pathological counterexamples) LL requires elimination of left recursion and left factorization Semantic actions: (see semantic analysis) LL wins actions can be placed anywhere in LL parsers without causing conflicts in LALR: implicit ε-productions = may generate conflicts Error handling: LL wins top-down approach provides context information = better basis for reporting and/or repairing errors Parser 24 of 34 size: Compilercomparable Construction Lecture LL: 11: action Syntax Analysis table VII (Practical Issues) &

21 Overview Conceptual Structure of a Compiler Source code Lexical analysis (Scanner) Asg Syntax analysis (Parser) Var Exp Sum Semantic analysis Var Con Asg ok attribute grammars Generation of intermediate code int Var Exp int Sum int Code optimisation int Var Con int Generation of target code Target code 26 of 34 Compiler Construction

22 Semantic Analysis Beyond Syntax To generate (efficient) code, the compiler needs to answer many questions: Are there identifiers that are not declared? Declared but not used? Is x a scalar, an array, or a procedure? Of which type? Which declaration of x is used by each reference? Is x defined before it is used? Is the expression 3 * x + y type consistent? Where should the value of x be stored (register/stack/heap)? Do p and q refer to the same memory location (aliasing)?... These cannot be expressed using context-free grammars! (For example, {ww w Σ + } / CFL Σ ) 28 of 34 Compiler Construction

23 Semantic Analysis Static Semantics Static semantics Refers to properties of program constructs which are true for every occurrence of this construct in every program execution and can be decided at compile time ( static ) but are context-sensitive and thus not expressible using context-free grammars ( semantics ). Example properties Static: type or declaredness of an identifier, number of registers required to evaluate an expression,... Dynamic: value of an expression, size of procedure stack, of 34 Compiler Construction

24 Attribute Grammars Attribute Grammars I Goal: compute context-dependent but runtime-independent properties of a given program Idea: enrich context-free grammar by semantic rules which annotate syntax tree with attribute values = Semantic analysis = attribute evaluation Result: attributed syntax tree In greater detail: With every grammar symbol a set of attributes is associated. Two types of attributes are distinguished: Synthesized: bottom-up computation (from the leaves to the root) Inherited: top-down computation (from the root to the leaves) With every production a set of semantic rules is associated. 31 of 34 Compiler Construction

25 Attribute Grammars Attribute Grammars II Advantage: attribute grammars provide a very flexible and broadly applicable mechanism for transporting information through the syntax tree ( syntax-directed translation ) Attribute values: symbol tables, data types, code, error flags,... Application in Compiler Construction: static semantics program analysis for optimisation code generation error handling Automatic attribute evaluation by compiler generators (cf. ANTLR s and yacc s synthesized attributes) Originally designed by D. Knuth for defining the semantics of context-free languages (Math. Syst. Theory 2 (1968), pp ) 32 of 34 Compiler Construction

26 Attribute Grammars Example: Type Checking I Example 11.2 (Attribute grammar for type checking) Pgm Dcl Cmd ok.0 = ok.2 env.2 = st.1 Dcl ε st.0 = [id err id Id] Typ var;dcl st.0 = st.4[id.2 typ.1] Typ int typ.0 = int Typ bool typ.0 = bool Cmd ε ok.0 = true var := Exp;Cmd ok.0 = (env.0(id.1) = typ.3 ok.5) env.3 = env.0 env.5 = env.0 Exp num typ.0 = int var typ.0 = env.0(id.1) Exp + Exp typ.0 = (typ.1 = typ.3 = int? int:err) env.1 = env.0 env.3 = env.0 Exp < Exp typ.0 = (typ.1 = typ.3 = int? bool:err) env.1 = env.0 env.3 = env.0 Exp && Exp typ.0 = (typ.1 = typ.3 = bool? bool:err) env.1 = env.0 env.3 = env.0 Synthesized attributes: id (identifier name), ok (Boolean result), st (symbol table, mapping identifiers to types), typ (data type in {bool, int, err}) Inherited attributes: env (environment same type as symbol table) 33 of 34 Compiler Construction

27 Attribute Grammars Example: Type Checking II Example 11.3 (Attributed syntax tree) For program int x; bool y; x := x+1; y := x && y: Pgm o Dcl s e Cmd o Typ t var i ; Dcl s var i := e Exp t ; e Cmd o int Typ t var i ; Dcl s e Exp t + e Exp t var i := Exp e t ; e Cmd o bool ε var i num e Exp t && e Exp t ε ( e = env, i = id, o = ok, s = st, t = typ) var i var i 34 of 34 Compiler Construction

### Compiler Construction

Compiler Construction Thomas Noll Software Modeling and Verification Group RWTH Aachen University https://moves.rwth-aachen.de/teaching/ss-17/cc/ Recap: LR(1) Parsing Outline of Lecture 11 Recap: LR(1)

### Compiler Construction

Compiler Construction Thomas Noll Software Modeling and Verification Group RWTH Aachen University https://moves.rwth-aachen.de/teaching/ws-1819/cc/ Recap: LR(1) Parsing Outline of Lecture 11 Recap: LR(1)

### Syntax Analysis Part IV

Syntax Analysis Part IV Chapter 4: Bison Slides adapted from : Robert van Engelen, Florida State University Yacc and Bison Yacc (Yet Another Compiler Compiler) Generates LALR(1) parsers Bison Improved

### CSE302: Compiler Design

CSE302: Compiler Design Instructor: Dr. Liang Cheng Department of Computer Science and Engineering P.C. Rossin College of Engineering & Applied Science Lehigh University March 27, 2007 Outline Recap General/Canonical

### Syntax-Directed Translation

Syntax-Directed Translation ALSU Textbook Chapter 5.1 5.4, 4.8, 4.9 Tsan-sheng Hsu tshsu@iis.sinica.edu.tw http://www.iis.sinica.edu.tw/~tshsu 1 What is syntax-directed translation? Definition: The compilation

### COMPILER CONSTRUCTION LAB 2 THE SYMBOL TABLE. Tutorial 2 LABS. PHASES OF A COMPILER Source Program. Lab 2 Symbol table

COMPILER CONSTRUCTION Lab 2 Symbol table LABS Lab 3 LR parsing and abstract syntax tree construction using ''bison' Lab 4 Semantic analysis (type checking) PHASES OF A COMPILER Source Program Lab 2 Symtab

### Context-free grammars

Context-free grammars Section 4.2 Formal way of specifying rules about the structure/syntax of a program terminals - tokens non-terminals - represent higher-level structures of a program start symbol,

### Compiler Construction: Parsing

Compiler Construction: Parsing Mandar Mitra Indian Statistical Institute M. Mitra (ISI) Parsing 1 / 33 Context-free grammars. Reference: Section 4.2 Formal way of specifying rules about the structure/syntax

### Conflicts in LR Parsing and More LR Parsing Types

Conflicts in LR Parsing and More LR Parsing Types Lecture 10 Dr. Sean Peisert ECS 142 Spring 2009 1 Status Project 2 Due Friday, Apr. 24, 11:55pm The usual lecture time is being replaced by a discussion

### Yacc: A Syntactic Analysers Generator

Yacc: A Syntactic Analysers Generator Compiler-Construction Tools The compiler writer uses specialised tools (in addition to those normally used for software development) that produce components that can

### Principles of Programming Languages

Principles of Programming Languages h"p://www.di.unipi.it/~andrea/dida2ca/plp- 15/ Prof. Andrea Corradini Department of Computer Science, Pisa Lesson 10! LR parsing with ambiguous grammars Error recovery

### EDAN65: Compilers, Lecture 06 A LR parsing. Görel Hedin Revised:

EDAN65: Compilers, Lecture 06 A LR parsing Görel Hedin Revised: 2017-09-11 This lecture Regular expressions Context-free grammar Attribute grammar Lexical analyzer (scanner) Syntactic analyzer (parser)

### Using an LALR(1) Parser Generator

Using an LALR(1) Parser Generator Yacc is an LALR(1) parser generator Developed by S.C. Johnson and others at AT&T Bell Labs Yacc is an acronym for Yet another compiler compiler Yacc generates an integrated

### Lecture 14: Parser Conflicts, Using Ambiguity, Error Recovery. Last modified: Mon Feb 23 10:05: CS164: Lecture #14 1

Lecture 14: Parser Conflicts, Using Ambiguity, Error Recovery Last modified: Mon Feb 23 10:05:56 2015 CS164: Lecture #14 1 Shift/Reduce Conflicts If a DFA state contains both [X: α aβ, b] and [Y: γ, a],

### Bottom-Up Parsing. Lecture 11-12

Bottom-Up Parsing Lecture 11-12 (From slides by G. Necula & R. Bodik) 2/20/08 Prof. Hilfinger CS164 Lecture 11 1 Administrivia Test I during class on 10 March. 2/20/08 Prof. Hilfinger CS164 Lecture 11

### CS 2210 Sample Midterm. 1. Determine if each of the following claims is true (T) or false (F).

CS 2210 Sample Midterm 1. Determine if each of the following claims is true (T) or false (F). F A language consists of a set of strings, its grammar structure, and a set of operations. (Note: a language

### TDDD55 - Compilers and Interpreters Lesson 3

TDDD55 - Compilers and Interpreters Lesson 3 November 22 2011 Kristian Stavåker (kristian.stavaker@liu.se) Department of Computer and Information Science Linköping University LESSON SCHEDULE November 1,

### G53CMP: Lecture 4. Syntactic Analysis: Parser Generators. Henrik Nilsson. University of Nottingham, UK. G53CMP: Lecture 4 p.1/32

G53CMP: Lecture 4 Syntactic Analysis: Parser Generators Henrik Nilsson University of Nottingham, UK G53CMP: Lecture 4 p.1/32 This Lecture Parser generators ( compiler compilers ) The parser generator Happy

### Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan

Language Processing Systems Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan Syntax Analysis (Parsing) 1. Uses Regular Expressions to define tokens 2. Uses Finite Automata to

### COMPILER CONSTRUCTION Seminar 02 TDDB44

COMPILER CONSTRUCTION Seminar 02 TDDB44 Martin Sjölund (martin.sjolund@liu.se) Adrian Horga (adrian.horga@liu.se) Department of Computer and Information Science Linköping University LABS Lab 3 LR parsing

### COMPILERS AND INTERPRETERS Lesson 4 TDDD16

COMPILERS AND INTERPRETERS Lesson 4 TDDD16 Kristian Stavåker (kristian.stavaker@liu.se) Department of Computer and Information Science Linköping University TODAY Introduction to the Bison parser generator

### Lecture 8: Deterministic Bottom-Up Parsing

Lecture 8: Deterministic Bottom-Up Parsing (From slides by G. Necula & R. Bodik) Last modified: Fri Feb 12 13:02:57 2010 CS164: Lecture #8 1 Avoiding nondeterministic choice: LR We ve been looking at general

### CSCI Compiler Design

Syntactic Analysis Automatic Parser Generators: The UNIX YACC Tool Portions of this lecture were adapted from Prof. Pedro Reis Santos s notes for the 2006 Compilers class lectured at IST/UTL in Lisbon,

### Compiler construction in4020 lecture 5

Compiler construction in4020 lecture 5 Semantic analysis Assignment #1 Chapter 6.1 Overview semantic analysis identification symbol tables type checking CS assignment yacc LLgen language grammar parser

### TDDD55- Compilers and Interpreters Lesson 3

TDDD55- Compilers and Interpreters Lesson 3 Zeinab Ganjei (zeinab.ganjei@liu.se) Department of Computer and Information Science Linköping University 1. Grammars and Top-Down Parsing Some grammar rules

### Compiler construction 2002 week 5

Compiler construction in400 lecture 5 Koen Langendoen Delft University of Technology The Netherlands Overview semantic analysis identification symbol tables type checking assignment yacc LLgen language

### Lecture 7: Deterministic Bottom-Up Parsing

Lecture 7: Deterministic Bottom-Up Parsing (From slides by G. Necula & R. Bodik) Last modified: Tue Sep 20 12:50:42 2011 CS164: Lecture #7 1 Avoiding nondeterministic choice: LR We ve been looking at general

### Parsing How parser works?

Language Processing Systems Prof. Mohamed Hamada Software Engineering Lab. The University of Aizu Japan Syntax Analysis (Parsing) 1. Uses Regular Expressions to define tokens 2. Uses Finite Automata to

### Lexical and Syntax Analysis

Lexical and Syntax Analysis (of Programming Languages) Bison, a Parser Generator Lexical and Syntax Analysis (of Programming Languages) Bison, a Parser Generator Bison: a parser generator Bison Specification

### LR Parsing LALR Parser Generators

LR Parsing LALR Parser Generators Outline Review of bottom-up parsing Computing the parsing DFA Using parser generators 2 Bottom-up Parsing (Review) A bottom-up parser rewrites the input string to the

### In One Slide. Outline. LR Parsing. Table Construction

LR Parsing Table Construction #1 In One Slide An LR(1) parsing table can be constructed automatically from a CFG. An LR(1) item is a pair made up of a production and a lookahead token; it represents a

### Compilers. Bottom-up Parsing. (original slides by Sam

Compilers Bottom-up Parsing Yannis Smaragdakis U Athens Yannis Smaragdakis, U. Athens (original slides by Sam Guyer@Tufts) Bottom-Up Parsing More general than top-down parsing And just as efficient Builds

### Bottom-Up Parsing. Lecture 11-12

Bottom-Up Parsing Lecture 11-12 (From slides by G. Necula & R. Bodik) 9/22/06 Prof. Hilfinger CS164 Lecture 11 1 Bottom-Up Parsing Bottom-up parsing is more general than topdown parsing And just as efficient

### Compiler Lab. Introduction to tools Lex and Yacc

Compiler Lab Introduction to tools Lex and Yacc Assignment1 Implement a simple calculator with tokens recognized using Lex/Flex and parsing and semantic actions done using Yacc/Bison. Calculator Input:

### Yacc Yet Another Compiler Compiler

LEX and YACC work as a team Yacc Yet Another Compiler Compiler How to work? Some material adapted from slides by Andy D. Pimentel LEX and YACC work as a team Availability call yylex() NUM + NUM next token

### Compiler construction 2005 lecture 5

Compiler construction in400 lecture 5 Semantic analysis Assignment #1 Chapter 6.1 Overview semantic analysis identification symbol tables type checking CS assignment yacc LLgen language parser generator

### Compilation 2013 Parser Generators, Conflict Management, and ML-Yacc

Compilation 2013 Parser Generators, Conflict Management, and ML-Yacc Erik Ernst Aarhus University Parser generators, ML-Yacc LR parsers are tedious to write, but can be generated, e.g., by ML-Yacc Input:

### LR Parsing LALR Parser Generators

Outline LR Parsing LALR Parser Generators Review of bottom-up parsing Computing the parsing DFA Using parser generators 2 Bottom-up Parsing (Review) A bottom-up parser rewrites the input string to the

### Lexical and Syntax Analysis. Bottom-Up Parsing

Lexical and Syntax Analysis Bottom-Up Parsing Parsing There are two ways to construct derivation of a grammar. Top-Down: begin with start symbol; repeatedly replace an instance of a production s LHS with

### Simple LR (SLR) LR(0) Drawbacks LR(1) SLR Parse. LR(1) Start State and Reduce. LR(1) Items 10/3/2012

LR(0) Drawbacks Consider the unambiguous augmented grammar: 0.) S E \$ 1.) E T + E 2.) E T 3.) T x If we build the LR(0) DFA table, we find that there is a shift-reduce conflict. This arises because the

### A Bison Manual. You build a text file of the production (format in the next section); traditionally this file ends in.y, although bison doesn t care.

A Bison Manual 1 Overview Bison (and its predecessor yacc) is a tool that take a file of the productions for a context-free grammar and converts them into the tables for an LALR(1) parser. Bison produces

### Compiler Construction

Compiler Construction Thomas Noll Software Modeling and Verification Group RWTH Aachen University https://moves.rwth-aachen.de/teaching/ss-16/cc/ Recap: Circularity of Attribute Grammars Circularity of

### JavaCC Parser. The Compilation Task. Automated? JavaCC Parser

JavaCC Parser The Compilation Task Input character stream Lexer stream Parser Abstract Syntax Tree Analyser Annotated AST Code Generator Code CC&P 2003 1 CC&P 2003 2 Automated? JavaCC Parser The initial

### Wednesday, September 9, 15. Parsers

Parsers What is a parser A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure of a program (think: diagramming a sentence) Agenda

### Parsers. What is a parser. Languages. Agenda. Terminology. Languages. A parser has two jobs:

What is a parser Parsers A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure of a program (think: diagramming a sentence) Agenda

### LR Parsing. Leftmost and Rightmost Derivations. Compiler Design CSE 504. Derivations for id + id: T id = id+id. 1 Shift-Reduce Parsing.

LR Parsing Compiler Design CSE 504 1 Shift-Reduce Parsing 2 LR Parsers 3 SLR and LR(1) Parsers Last modifled: Fri Mar 06 2015 at 13:50:06 EST Version: 1.7 16:58:46 2016/01/29 Compiled at 12:57 on 2016/02/26

### IN4305 Engineering project Compiler construction

IN4305 Engineering project Compiler construction Koen Langendoen Delft University of Technology The Netherlands Course organization kick/off lectures (2x) lab work (14x) practice makes perfect NO exam,

### Section A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous.

Section A 1. What do you meant by parser and its types? A parser for grammar G is a program that takes as input a string w and produces as output either a parse tree for w, if w is a sentence of G, or

### Syn S t yn a t x a Ana x lysi y s si 1

Syntax Analysis 1 Position of a Parser in the Compiler Model Source Program Lexical Analyzer Token, tokenval Get next token Parser and rest of front-end Intermediate representation Lexical error Syntax

### 4. Lexical and Syntax Analysis

4. Lexical and Syntax Analysis 4.1 Introduction Language implementation systems must analyze source code, regardless of the specific implementation approach Nearly all syntax analysis is based on a formal

### COMP 181. Prelude. Prelude. Summary of parsing. A Hierarchy of Grammar Classes. More power? Syntax-directed translation. Analysis

Prelude COMP 8 October, 9 What is triskaidekaphobia? Fear of the number s? No aisle in airplanes, no th floor in buildings Fear of Friday the th? Paraskevidedekatriaphobia or friggatriskaidekaphobia Why

### Programming Languages (CS 550) Lecture 4 Summary Scanner and Parser Generators. Jeremy R. Johnson

Programming Languages (CS 550) Lecture 4 Summary Scanner and Parser Generators Jeremy R. Johnson 1 Theme We have now seen how to describe syntax using regular expressions and grammars and how to create

### Parser. Larissa von Witte. 11. Januar Institut für Softwaretechnik und Programmiersprachen. L. v. Witte 11. Januar /23

Parser Larissa von Witte Institut für oftwaretechnik und Programmiersprachen 11. Januar 2016 L. v. Witte 11. Januar 2016 1/23 Contents Introduction Taxonomy Recursive Descent Parser hift Reduce Parser

### Formal Languages and Compilers Lecture VII Part 4: Syntactic A

Formal Languages and Compilers Lecture VII Part 4: Syntactic Analysis Free University of Bozen-Bolzano Faculty of Computer Science POS Building, Room: 2.03 artale@inf.unibz.it http://www.inf.unibz.it/

### As we have seen, token attribute values are supplied via yylval, as in. More on Yacc s value stack

More on Yacc s value stack As we noted last time, Yacc uses a second stack to store the attribute values of the tokens and terminals in the parse stack. For a token, the attributes are computed by the

### Fall Compiler Principles Lecture 4: Parsing part 3. Roman Manevich Ben-Gurion University of the Negev

Fall 2016-2017 Compiler Principles Lecture 4: Parsing part 3 Roman Manevich Ben-Gurion University of the Negev Tentative syllabus Front End Intermediate Representation Optimizations Code Generation Scanning

### Compiler Construction

Compiler Construction Thomas Noll Software Modeling and Verification Group RWTH Aachen University https://moves.rwth-aachen.de/teaching/ws-1819/cc/ Generation of Intermediate Code Outline of Lecture 15

### Let us construct the LR(1) items for the grammar given below to construct the LALR parsing table.

MODULE 18 LALR parsing After understanding the most powerful CALR parser, in this module we will learn to construct the LALR parser. The CALR parser has a large set of items and hence the LALR parser is

### Specifying Syntax. An English Grammar. Components of a Grammar. Language Specification. Types of Grammars. 1. Terminal symbols or terminals, Σ

Specifying Syntax Language Specification Components of a Grammar 1. Terminal symbols or terminals, Σ Syntax Form of phrases Physical arrangement of symbols 2. Nonterminal symbols or syntactic categories,

### Parsers. Xiaokang Qiu Purdue University. August 31, 2018 ECE 468

Parsers Xiaokang Qiu Purdue University ECE 468 August 31, 2018 What is a parser A parser has two jobs: 1) Determine whether a string (program) is valid (think: grammatically correct) 2) Determine the structure

### Compiler Construction

Compiler Construction Thomas Noll Software Modeling and Verification Group RWTH Aachen University https://moves.rwth-aachen.de/teaching/ss-16/cc/ Recap: First-Longest-Match Analysis Outline of Lecture

### CPS 506 Comparative Programming Languages. Syntax Specification

CPS 506 Comparative Programming Languages Syntax Specification Compiling Process Steps Program Lexical Analysis Convert characters into a stream of tokens Lexical Analysis Syntactic Analysis Send tokens

### COMPILER (CSE 4120) (Lecture 6: Parsing 4 Bottom-up Parsing )

COMPILR (CS 4120) (Lecture 6: Parsing 4 Bottom-up Parsing ) Sungwon Jung Mobile Computing & Data ngineering Lab Dept. of Computer Science and ngineering Sogang University Seoul, Korea Tel: +82-2-705-8930

### Configuration Sets for CSX- Lite. Parser Action Table

Configuration Sets for CSX- Lite State s 6 s 7 Cofiguration Set Prog { Stmts } Eof Stmts Stmt Stmts State s s Cofiguration Set Prog { Stmts } Eof Prog { Stmts } Eof Stmts Stmt Stmts Stmts λ Stmt if ( Expr

### Parsing Wrapup. Roadmap (Where are we?) Last lecture Shift-reduce parser LR(1) parsing. This lecture LR(1) parsing

Parsing Wrapup Roadmap (Where are we?) Last lecture Shift-reduce parser LR(1) parsing LR(1) items Computing closure Computing goto LR(1) canonical collection This lecture LR(1) parsing Building ACTION

### Syntax Analysis The Parser Generator (BYacc/J)

Syntax Analysis The Parser Generator (BYacc/J) CMPSC 470 Lecture 09-2 Topics: Yacc, BYacc/J A. Yacc Yacc is a computer program that generate LALR parser. Yacc stands for Yet Another Compiler-Compiler.

### PRACTICAL CLASS: Flex & Bison

Master s Degree Course in Computer Engineering Formal Languages FORMAL LANGUAGES AND COMPILERS PRACTICAL CLASS: Flex & Bison Eliana Bove eliana.bove@poliba.it Install On Linux: install with the package

### A Simple Syntax-Directed Translator

Chapter 2 A Simple Syntax-Directed Translator 1-1 Introduction The analysis phase of a compiler breaks up a source program into constituent pieces and produces an internal representation for it, called

### Gujarat Technological University Sankalchand Patel College of Engineering, Visnagar B.E. Semester VII (CE) July-Nov Compiler Design (170701)

Gujarat Technological University Sankalchand Patel College of Engineering, Visnagar B.E. Semester VII (CE) July-Nov 2014 Compiler Design (170701) Question Bank / Assignment Unit 1: INTRODUCTION TO COMPILING

### Hyacc comes under the GNU General Public License (Except the hyaccpar file, which comes under BSD License)

HYACC User Manual Created on 3/12/07. Last modified on 1/19/2017. Version 0.98 Hyacc comes under the GNU General Public License (Except the hyaccpar file, which comes under BSD License) Copyright 2007-2017.

### Fall Compiler Principles Lecture 5: Parsing part 4. Roman Manevich Ben-Gurion University

Fall 2014-2015 Compiler Principles Lecture 5: Parsing part 4 Roman Manevich Ben-Gurion University Tentative syllabus Front End Intermediate Representation Optimizations Code Generation Scanning Lowering

### 4. Lexical and Syntax Analysis

4. Lexical and Syntax Analysis 4.1 Introduction Language implementation systems must analyze source code, regardless of the specific implementation approach Nearly all syntax analysis is based on a formal

### Syntax Analysis. Martin Sulzmann. Martin Sulzmann Syntax Analysis 1 / 38

Syntax Analysis Martin Sulzmann Martin Sulzmann Syntax Analysis 1 / 38 Syntax Analysis Objective Recognize individual tokens as sentences of a language (beyond regular languages). Example 1 (OK) Program

### LR Parsing. Table Construction

#1 LR Parsing Table Construction #2 Outline Review of bottom-up parsing Computing the parsing DFA Closures, LR(1) Items, States Transitions Using parser generators Handling Conflicts #3 In One Slide An

### Compiler Construction

Compiler Construction Thomas Noll Software Modeling and Verification Group RWTH Aachen University https://moves.rwth-aachen.de/teaching/ss-17/cc/ Recap: First-Longest-Match Analysis The Extended Matching

### Introduction to Yacc. General Description Input file Output files Parsing conflicts Pseudovariables Examples. Principles of Compilers - 16/03/2006

Introduction to Yacc General Description Input file Output files Parsing conflicts Pseudovariables Examples General Description A parser generator is a program that takes as input a specification of a

### A programming language requires two major definitions A simple one pass compiler

A programming language requires two major definitions A simple one pass compiler [Syntax: what the language looks like A context-free grammar written in BNF (Backus-Naur Form) usually suffices. [Semantics:

### Automatic Scanning and Parsing using LEX and YACC

Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology ISSN 2320 088X IMPACT FACTOR: 6.017 IJCSMC,

### Syntax and Parsing COMS W4115. Prof. Stephen A. Edwards Fall 2003 Columbia University Department of Computer Science

Syntax and Parsing COMS W4115 Prof. Stephen A. Edwards Fall 2003 Columbia University Department of Computer Science Lexical Analysis (Scanning) Lexical Analysis (Scanning) Goal is to translate a stream

### Compiler Construction

Compiler Construction Thomas Noll Software Modeling and Verification Group RWTH Aachen University https://moves.rwth-aachen.de/teaching/ss-16/cc/ Conceptual Structure of a Compiler Source code x1 := y2

### COMP-421 Compiler Design. Presented by Dr Ioanna Dionysiou

COMP-421 Compiler Design Presented by Dr Ioanna Dionysiou Administrative! Any questions about the syllabus?! Course Material available at www.cs.unic.ac.cy/ioanna! Next time reading assignment [ALSU07]

### Syntax Analysis. Amitabha Sanyal. (www.cse.iitb.ac.in/ as) Department of Computer Science and Engineering, Indian Institute of Technology, Bombay

Syntax Analysis (www.cse.iitb.ac.in/ as) Department of Computer Science and Engineering, Indian Institute of Technology, Bombay September 2007 College of Engineering, Pune Syntax Analysis: 2/124 Syntax

### Compiler Construction

Compiler Construction Thomas Noll Software Modeling and Verification Group RWTH Aachen University https://moves.rwth-aachen.de/teaching/ss-16/cc/ Seminar Analysis and Verification of Pointer Programs (WS

### Monday, September 13, Parsers

Parsers Agenda Terminology LL(1) Parsers Overview of LR Parsing Terminology Grammar G = (Vt, Vn, S, P) Vt is the set of terminals Vn is the set of non-terminals S is the start symbol P is the set of productions

### Compiler Construction

Compiler Construction Thomas Noll Software Modeling and Verification Group RWTH Aachen University https://moves.rwth-aachen.de/teaching/ss-16/cc/ Seminar Analysis and Verification of Pointer Programs (WS

### Part III : Parsing. From Regular to Context-Free Grammars. Deriving a Parser from a Context-Free Grammar. Scanners and Parsers.

Part III : Parsing From Regular to Context-Free Grammars Deriving a Parser from a Context-Free Grammar Scanners and Parsers A Parser for EBNF Left-Parsable Grammars Martin Odersky, LAMP/DI 1 From Regular

### shift-reduce parsing

Parsing #2 Bottom-up Parsing Rightmost derivations; use of rules from right to left Uses a stack to push symbols the concatenation of the stack symbols with the rest of the input forms a valid bottom-up

### How do LL(1) Parsers Build Syntax Trees?

How do LL(1) Parsers Build Syntax Trees? So far our LL(1) parser has acted like a recognizer. It verifies that input token are syntactically correct, but it produces no output. Building complete (concrete)

### CS664 Compiler Theory and Design LIU 1 of 16 ANTLR. Christopher League* 17 February Figure 1: ANTLR plugin installer

CS664 Compiler Theory and Design LIU 1 of 16 ANTLR Christopher League* 17 February 2016 ANTLR is a parser generator. There are other similar tools, such as yacc, flex, bison, etc. We ll be using ANTLR

### Compiler Construction

Compiler Construction Thomas Noll Software Modeling and Verification Group RWTH Aachen University https://moves.rwth-aachen.de/teaching/ss-17/cc/ Generation of Intermediate Code Outline of Lecture 15 Generation

### Compiler Construction

Compiler Construction Thomas Noll Software Modeling and Verification Group RWTH Aachen University https://moves.rwth-aachen.de/teaching/ss-17/cc/ Generation of Intermediate Code Conceptual Structure of

### The Parsing Problem (cont d) Recursive-Descent Parsing. Recursive-Descent Parsing (cont d) ICOM 4036 Programming Languages. The Complexity of Parsing

ICOM 4036 Programming Languages Lexical and Syntax Analysis Lexical Analysis The Parsing Problem Recursive-Descent Parsing Bottom-Up Parsing This lecture covers review questions 14-27 This lecture covers

### LECTURE 3. Compiler Phases

LECTURE 3 Compiler Phases COMPILER PHASES Compilation of a program proceeds through a fixed series of phases. Each phase uses an (intermediate) form of the program produced by an earlier phase. Subsequent

### 3. Parsing. Oscar Nierstrasz

3. Parsing Oscar Nierstrasz Thanks to Jens Palsberg and Tony Hosking for their kind permission to reuse and adapt the CS132 and CS502 lecture notes. http://www.cs.ucla.edu/~palsberg/ http://www.cs.purdue.edu/homes/hosking/

### MIT Parse Table Construction. Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology

MIT 6.035 Parse Table Construction Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology Parse Tables (Review) ACTION Goto State ( ) \$ X s0 shift to s2 error error goto s1

### LR(0) Parsing Summary. LR(0) Parsing Table. LR(0) Limitations. A Non-LR(0) Grammar. LR(0) Parsing Table CS412/CS413

LR(0) Parsing ummary C412/C41 Introduction to Compilers Tim Teitelbaum Lecture 10: LR Parsing February 12, 2007 LR(0) item = a production with a dot in RH LR(0) state = set of LR(0) items valid for viable

### Building Compilers with Phoenix

Building Compilers with Phoenix Parser Generators: ANTLR History of ANTLR ANother Tool for Language Recognition Terence Parr's dissertation: Obtaining Practical Variants of LL(k) and LR(k) for k > 1 PCCTS: