Parser Combinators
11/3/2003 IPT, ICS


Parser combinator library
- Similar to those from Grammars & Parsing
- But more efficient, self-analysing, with error recovery

Basic combinators
Similar to (E)BNF

    EBNF      combinator      usage
    's'       psym 's'        symbol
    x | y     x <|> y         alternatives
    x y       x <*> y         sequence
    x?        x `opt` ()      optional
    ε         psucceed ()     empty
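To make the table concrete, here is a minimal sketch of how such combinators could be defined. It is not the library used in these slides: it is a plain list-of-successes parser with no error recovery, and `Parser`, `runParser` and `parseAll` are names assumed for illustration. Only `psym`, `psucceed`, `opt`, `<|>` and `<*>` mirror the table.

```haskell
import Control.Applicative (Alternative (..))

-- A toy parser: consumes a prefix of the input and returns every
-- possible (result, remaining input) pair. No error recovery.
newtype Parser a = Parser { runParser :: String -> [(a, String)] }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \inp -> [ (f a, rest) | (a, rest) <- p inp ]

instance Applicative Parser where
  -- "empty" in the table: succeed without consuming input
  pure a = Parser $ \inp -> [(a, inp)]
  -- sequence: run the first parser, then the second on what is left
  Parser pf <*> Parser pa = Parser $ \inp ->
    [ (f a, rest') | (f, rest) <- pf inp, (a, rest') <- pa rest ]

instance Alternative Parser where
  empty = Parser (const [])                                -- always fails
  Parser p <|> Parser q = Parser $ \inp -> p inp ++ q inp  -- alternatives

psucceed :: a -> Parser a
psucceed = pure

-- symbol: accept exactly the character c
psym :: Char -> Parser Char
psym c = Parser $ \inp -> case inp of
  (x:xs) | x == c -> [(c, xs)]
  _               -> []

-- optional: try p, or succeed with the default value v
opt :: Parser a -> a -> Parser a
p `opt` v = p <|> psucceed v

-- Keep only the parses that consumed the whole input.
parseAll :: Parser a -> String -> [a]
parseAll p inp = [ a | (a, "") <- runParser p inp ]
```

With these definitions, `parseAll (psym 'a' <|> psym 'b') "b"` yields `"b"`, and `parseAll (psym 'a') "b"` yields the empty list, where the library in the slides would instead repair the input and report what it changed.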

Symbol

    > parse (psym 'a') "a"
    ('a',"")                                                        -- symbol
    > parse (psym 'a') "b"
    ('a'," Deleted : 'b' before eof\n Inserted: 'a' before eof\n")  -- error recovery

    t p inp -- test parser p on input inp
      = do let (res, msgs) = parse p inp
           putStr (if null msgs then "" else "Errors:\n" ++ msgs)
           putStr ("\n" ++ show res)

    > t (psym 'a') "b"
    Errors:
     Deleted : 'b' before eof
     Inserted: 'a' before eof

    'a'

Empty & optional

    > t (psucceed 'a') "a"
    Errors:
     Not used: 'a'

    'a'                                 -- empty

    > t (psym 'a' `opt` 'b') "a"
    'a'
    > t (psym 'a' `opt` 'b') "c"
    Errors:
     Not used: 'c'

    'b'                                 -- optional

Sequence

    t ( psucceed (\a b -> [b]++[a]) <*> psym 'a' <*> psym 'b' ) "ab"
    "ba"                                -- sequence

    f <$> p = psucceed f <*> p

    t ( (\a b -> [b]++[a]) <$> psym 'a' <*> psym 'b' ) "ab"
    "ba"                                -- application

Alternative & range

    > t (psym 'a' <|> psym 'b') "a"
    'a'
    > t (psym 'a' <|> psym 'b') "b"
    'b'
    > t (psym 'a' <|> psym 'b') "c"
    Errors:
     Deleted : 'c' before eof
     Inserted: 'a' before eof

    'a'                                 -- alternative

    > t (panysym "ab") "a"
    > t (panysym "ab") "b"

    pany f l  = foldr1 (<|>) (map f l)
    panysym l = pany psym l

    > t ('a' <..> 'b') "a"              -- range

Derived parsers
Created by combining basic parsers (hence the name combinators)

    combinator   EBNF        usage
    pfoldr       x*          sequence of x with result folding
    plist        x*          sequence of x, result is list
    pchainr      x (y x)*    sequence of x, separated by y

Repetition

    > t (pfoldr ((+),0) ( (\x -> ord x - ord '0') <$> '0' <..> '9' ) ) "34521"
    15                                  -- sequence folding

    plist p = pfoldr ((:),[]) p

    t ( foldr (+) 0 . map (\x -> ord x - ord '0') <$> plist ('0' <..> '9') ) "34521"
    t ( foldr (+) 0 <$> plist ( (\x -> ord x - ord '0') <$> '0' <..> '9' ) ) "34521"
                                        -- sequence listing
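The relation between `pfoldr` and `plist` can be tried out without the slides' library. The sketch below rebuilds both on top of `Text.ParserCombinators.ReadP` from GHC's base package, which is a stand-in chosen here, not the library the slides use. `pdigit` and the use of `satisfy isDigit` in place of `'0' <..> '9'` are likewise assumptions.

```haskell
import Data.Char (isDigit, ord)
import Text.ParserCombinators.ReadP (ReadP, eof, readP_to_S, satisfy, (<++))

-- Fold a sequence of p results with (op, e), like foldr over a list.
-- (<++) is ReadP's left-biased choice: stop only when p no longer matches.
pfoldr :: (a -> b -> b, b) -> ReadP a -> ReadP b
pfoldr alg@(op, e) p = (op <$> p <*> pfoldr alg p) <++ pure e

-- Collecting the results in a list is the fold with (:) and [].
plist :: ReadP a -> ReadP [a]
plist p = pfoldr ((:), []) p

-- One decimal digit as an Int (stand-in for '0' <..> '9').
pdigit :: ReadP Int
pdigit = (\x -> ord x - ord '0') <$> satisfy isDigit
```

In GHCi, `readP_to_S (pfoldr ((+), 0) pdigit <* eof) "34521"` gives `[(15,"")]`, matching the sum on the slide, and `readP_to_S (plist pdigit <* eof) "345"` gives `[([3,4,5],"")]`.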

Chain

    t (pchainr ( (+) <$ psym '+' <|> (-) <$ psym '-' )
               ( (\x -> ord x - ord '0') <$> '0' <..> '9' )
      ) "3+4-5+2-1"
    1                                   -- chaining

Evaluates 3+(4-(5+(2-1)))
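The grouping matters for the result. As a rough illustration, the following sketch compares right and left chaining on the same input, using `chainr1` and `chainl1` from `Text.ParserCombinators.ReadP` as a stand-in for the slides' library. The wrappers `pchainr` and `pchainl` only swap the argument order to match the slides; `digit`, `addOp` and `evalWith` are hypothetical helper names.

```haskell
import Control.Applicative ((<|>))
import Data.Char (isDigit, ord)
import Text.ParserCombinators.ReadP
  (ReadP, chainl1, chainr1, char, eof, readP_to_S, satisfy)

-- One decimal digit as an Int.
digit :: ReadP Int
digit = (\x -> ord x - ord '0') <$> satisfy isDigit

-- An operator symbol, returning the function it denotes.
addOp :: ReadP (Int -> Int -> Int)
addOp = (+) <$ char '+' <|> (-) <$ char '-'

-- Operator-first argument order, as on the slides.
pchainr, pchainl :: ReadP (a -> a -> a) -> ReadP a -> ReadP a
pchainr op p = chainr1 p op   -- groups to the right: 3+(4-(5+(2-1)))
pchainl op p = chainl1 p op   -- groups to the left:  (((3+4)-5)+2)-1

-- All results of parses that consume the whole input.
evalWith :: ReadP Int -> String -> [Int]
evalWith p inp = [ v | (v, "") <- readP_to_S (p <* eof) inp ]
```

Here `evalWith (pchainr addOp digit) "3+4-5+2-1"` gives `[1]`, the value on the slide, while `evalWith (pchainl addOp digit) "3+4-5+2-1"` gives `[3]`, the usual arithmetic reading.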

Ambiguity and greediness

    t ((,) <$> plist (psym 'a') <*> plist (psym 'a') ) "aaaa"
    ("aaaa","")                         -- the 1st or 2nd? greedy: 1st takes it all

    t ((,) <$> plist_ng (psym 'a') <*> plist (psym 'a') ) "aaaa"
    ("","aaaa")                         -- non greedy variant
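The ambiguity can be made visible with a parser that returns every parse instead of picking one. The sketch below uses `Text.ParserCombinators.ReadP` for that purpose, as a stand-in rather than the slides' library: its `many` explores all alternatives, while `munch` always takes the longest run. `ambiguous`, `greedy` and `allParses` are names assumed for illustration.

```haskell
import Text.ParserCombinators.ReadP (ReadP, char, eof, many, munch, readP_to_S)

-- Both lists may take any number of 'a's, so every split is a valid parse.
ambiguous :: ReadP (String, String)
ambiguous = (,) <$> many (char 'a') <*> many (char 'a')

-- munch consumes the longest possible run, so the first list takes it all.
greedy :: ReadP (String, String)
greedy = (,) <$> munch (== 'a') <*> many (char 'a')

-- All results of parses that consume the whole input.
allParses :: ReadP a -> String -> [a]
allParses p inp = [ r | (r, "") <- readP_to_S (p <* eof) inp ]
```

`allParses ambiguous "aaaa"` contains one result per way of splitting the input, including both `("aaaa","")` and `("","aaaa")`, whereas `allParses greedy "aaaa"` is just `[("aaaa","")]`. The library in the slides commits to one of these readings, which is why it offers a greedy `plist` and a non-greedy `plist_ng`.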

Example: expression parser

    module Expr where
    import UU_Parsing_Core
    import UU_Parsing_Derived

    instance Symbol Char

    pparens p = psym '(' *> p <* psym ')'
    pdigit    = (\d -> ord d - ord '0') <$> panysym ['0'..'9']
    pnat      = foldl (\a b -> a*10 + b) 0 <$> plist1 pdigit
    pfact     = pnat <|> pparens pexpr
    pterm     = pchainl ((*) <$ psym '*' <|> div <$ psym '/' ) pfact
    pexpr     = pchainl ((+) <$ psym '+' <|> (-) <$ psym '-' ) pterm

    on :: Show a => Parser Char a -> [Char] -> IO ()
    on p inp -- run parser p on input inp
      = do let (res, msgs) = parse p inp
           putStr (if null msgs then "" else "Errors:\n" ++ show msgs)
           putStr ("\n" ++ show res ++ "\n")

    main :: IO ()
    main = do putStr "Enter expression: "
              inp <- getLine
              pexpr `on` inp
              main
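For readers without the UU parsing modules at hand, the same grammar can be tried with `Text.ParserCombinators.ReadP` from GHC's base package. This is a sketch under that substitution, not the slides' code: there is no error recovery, `chainl1` takes the operand parser first, and `evalExpr` is a hypothetical helper that returns `Nothing` instead of repairing bad input.

```haskell
import Control.Applicative ((<|>))
import Data.Char (isDigit, ord)
import Text.ParserCombinators.ReadP
  (ReadP, chainl1, char, eof, munch1, readP_to_S)

-- A parser between parentheses.
pparens :: ReadP a -> ReadP a
pparens p = char '(' *> p <* char ')'

-- A natural number: fold the digits left to right.
pnat :: ReadP Int
pnat = foldl (\a d -> a * 10 + (ord d - ord '0')) 0 <$> munch1 isDigit

-- factor ::= nat | '(' expr ')'
-- term   ::= factor (('*' | '/') factor)*
-- expr   ::= term   (('+' | '-') term)*
pfact, pterm, pexpr :: ReadP Int
pfact = pnat <|> pparens pexpr
pterm = chainl1 pfact ((*) <$ char '*' <|> div <$ char '/')
pexpr = chainl1 pterm ((+) <$ char '+' <|> (-) <$ char '-')

-- Evaluate an expression; Nothing if the input is not a complete expression.
evalExpr :: String -> Maybe Int
evalExpr inp = case [ v | (v, "") <- readP_to_S (pexpr <* eof) inp ] of
  [v] -> Just v
  _   -> Nothing
```

For example, `evalExpr "2*(3+4)-5"` is `Just 9` and `evalExpr "1+"` is `Nothing`. The sketch accepts no whitespace and does not guard against division by zero.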

Why use parser combinators?

Reasons not to use
- because Haskell is so weird
- everybody elsewhere uses (e.g.) Java

Reason(s) to use
- parser combinators are simple compared to...

Example using JavaCC
- JavaCC generates Java source code from a grammar specification

    javacc Expr.jj
    javac *.java
    java Expr

- JavaCC used for Sun's Java compiler

Example: Expr.jj

    PARSER_BEGIN(Expr)
    public class Expr {
      static int total;
      static java.util.Stack argstack = new java.util.Stack();

      public static void main(String args[]) throws ParseException {
        Expr parser = new Expr(System.in);
        while (true) {
          System.out.print("Enter Expression: ");
          System.out.flush();
          try {
            switch (parser.one_line()) {
              case -1: System.exit(0);
              case 0:  break;
              case 1:
                Object x = argstack.pop();
                System.out.println("Evaluation result = " + x.toString());
                break;
            }
          } catch (ParseException x) {
            System.out.println("Exiting.");
            throw x;
          }
        }
      }
    }
    PARSER_END(Expr)

    SKIP : { " " | "\r" | "\t" }

    TOKEN : { < EOL: "\n" > }

    TOKEN : /* OPERATORS */
    {
      < PLUS: "+" >
    | < MINUS: "-" >
    | < MULTIPLY: "*" >
    | < DIVIDE: "/" >
    }

    TOKEN :
    {
      < CONSTANT: ( <DIGIT> )+ >
    | < #DIGIT: ["0" - "9"] >
    }

Example: Expr.jj

    int one_line() :
    {}
    {
      expr() <EOL> { return 1; }
    | <EOL>        { return 0; }
    | <EOF>        { return -1; }
    }

    void expr() :
    { Token x; }
    {
      term()
      ( ( x = <PLUS> | x = <MINUS> ) term()
        {
          int a = ((Integer) argstack.pop()).intValue();
          int b = ((Integer) argstack.pop()).intValue();
          if ( x.kind == PLUS ) argstack.push(new Integer(b + a));
          else argstack.push(new Integer(b - a));
        }
      )*
    }

    void term() :
    { Token x; }
    {
      factor()
      ( ( x = <MULTIPLY> | x = <DIVIDE> ) factor()
        {
          int a = ((Integer) argstack.pop()).intValue();
          int b = ((Integer) argstack.pop()).intValue();
          if ( x.kind == MULTIPLY ) argstack.push(new Integer(b * a));
          else argstack.push(new Integer(b / a));
        }
      )*
    }

    void factor() :
    {}
    {
      <CONSTANT>
      {
        try {
          int x = Integer.parseInt(token.image);
          argstack.push(new Integer(x));
        } catch (NumberFormatException ee) {
          argstack.push(new Integer(0));
        }
      }
    | "(" expr() ")"
    }