CSCE 314 Programming Languages. Monadic Parsing

CSCE 314 Programming Languages: Monadic Parsing
Dr. Hyunyoung Lee

What is a Parser?

A parser is a program that takes a string of characters (or a sequence of tokens) as input and determines its syntactic structure.

    String or [Token]  -->  Parser  -->  syntactic structure

For example, 2*3+4 means:

          +
         / \
        *   4
       / \
      2   3

The Parser Type

In a functional language such as Haskell, parsers can naturally be viewed as functions.

    type Parser = String -> Tree

A parser is a function that takes a string and returns some form of tree. However, a parser might not require all of its input string, so we also return any unused input:

    type Parser = String -> (Tree,String)

A string might be parsable in many ways, including none, so we generalize to a list of results:

    type Parser = String -> [(Tree,String)]

The empty list denotes failure; a singleton list denotes success.

Furthermore, a parser might not always produce a tree, so we generalize to a value of any type:

    type Parser a = String -> [(a,String)]

Finally, a parser might take token streams instead of character streams:

    type TokenParser b a = [b] -> [(a,[b])]

Note: For simplicity, we will only consider parsers that either fail and return the empty list of results, or succeed and return a singleton list.

Basic Parsers (Building Blocks)

The parser item fails if the input is empty, and consumes the first character otherwise:

    item :: Parser Char
    --   :: String -> [(Char, String)]
    --   :: [Char] -> [(Char, [Char])]
    item = \inp -> case inp of
                     []     -> []
                     (x:xs) -> [(x,xs)]

Example:

    > item "Howdy all"
    [('H',"owdy all")]
    > item ""
    []

We can make it more explicit by letting the function parse apply a parser to a string:

    parse :: Parser a -> String -> [(a,String)]
    parse p inp = p inp    -- essentially the id function

Example:

    > parse item "Howdy all"
    [('H',"owdy all")]
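The definitions so far can be collected into a small module that loads in GHC. This is only a sketch: the newtype wrapper and its constructor name P are assumptions that are not in the slides (which use a plain type synonym). They are used here so that class instances can be declared for the type later on.

```haskell
-- Sketch only: the slides use a type synonym; the newtype wrapper and
-- the constructor name P are assumptions made for this example.
newtype Parser a = P (String -> [(a, String)])

-- Apply a parser to an input string by unwrapping the function.
parse :: Parser a -> String -> [(a, String)]
parse (P p) inp = p inp

-- Consume one character; fail (empty list) on empty input.
item :: Parser Char
item = P $ \inp -> case inp of
  []       -> []
  (x : xs) -> [(x, xs)]
```

With the wrapper in place, a parser is no longer a bare function, so parse is the only way to run one.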

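Sequencing Parser

Often, we need to combine parsers in sequence, e.g., for the following grammar:

    <if-stmt> ::= if (<expr>) then <stmt>

First parse if, then (, then <expr>, then ), and so on. To combine parsers in sequence, we make the Parser type into a monad:

    instance Monad Parser where
      -- (>>=) :: Parser a -> (a -> Parser b) -> Parser b
      p >>= f = \inp -> case parse p inp of
                          []        -> []
                          [(v,out)] -> parse (f v) out

Note: with Parser defined as a type synonym, GHC rejects this instance declaration. A compilable version wraps the function in a newtype and also supplies Functor and Applicative instances; the slides leave the wrapper out.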

Sequencing Parser (do)

Now a sequence of parsers can be combined as a single composite parser using the keyword do. Example:

    three :: Parser (Char,Char)
    three = do x <- item
               item
               z <- item
               return (x,z)

    > parse three "abcd"
    [(('a','c'),"d")]

Meaning: The value of x is generated by the item parser.

The parser return v always succeeds, returning the value v without consuming any input:

    return :: a -> Parser a
    return v = \inp -> [(v,inp)]

If any parser in a sequence of parsers fails, then the sequence as a whole fails. For example:

    three :: Parser (Char,Char)
    three = do x <- item
               item
               y <- item
               return (x,y)

    > parse three "abcdef"
    [(('a','c'),"def")]
    > parse three "ab"
    []
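To run the sequencing examples in current GHC, the parser type has to be a newtype, and Functor and Applicative instances must exist before Monad can be declared. The following is a minimal sketch under those assumptions; the constructor name P and the Functor and Applicative definitions are not taken from the slides.

```haskell
-- Sketch only: the newtype wrapper, the constructor P, and the Functor and
-- Applicative instances are assumptions needed to make the Monad instance compile.
newtype Parser a = P (String -> [(a, String)])

parse :: Parser a -> String -> [(a, String)]
parse (P p) inp = p inp

item :: Parser Char
item = P $ \inp -> case inp of
  []       -> []
  (x : xs) -> [(x, xs)]

instance Functor Parser where
  -- Apply g to the result of a successful parse.
  fmap g p = P $ \inp -> [(g v, out) | (v, out) <- parse p inp]

instance Applicative Parser where
  -- Succeed with v without consuming input (the slides' return).
  pure v = P $ \inp -> [(v, inp)]
  -- Run pg, then px on the remaining input, and apply the function.
  pg <*> px = P $ \inp ->
    [(g v, out') | (g, out) <- parse pg inp, (v, out') <- parse px out]

instance Monad Parser where
  -- Fail if p fails; otherwise feed p's result to f and continue.
  p >>= f = P $ \inp -> case parse p inp of
    []             -> []
    ((v, out) : _) -> parse (f v) out

-- Keep the first and third characters, discarding the second.
three :: Parser (Char, Char)
three = do
  x <- item
  _ <- item
  z <- item
  return (x, z)
```

Here parse three "abcdef" yields [(('a','c'),"def")], while parse three "ab" yields [] because the third item has no input left.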

Making Choices

What if we have to backtrack? First try to parse p, then q? The parser p <|> q behaves as the parser p if it succeeds, and as the parser q otherwise.

    empty :: Parser a
    empty = \inp -> []    -- always fails

    (<|>) :: Parser a -> Parser a -> Parser a
    p <|> q = \inp -> case parse p inp of
                        []        -> parse q inp
                        [(v,out)] -> [(v,out)]

Example:

    > parse empty "abc"
    []
    > parse (item <|> return 'd') "abc"
    [('a',"bc")]

The Monadic Way

Parser sequencing operator:

    (>>=) :: Parser a -> (a -> Parser b) -> Parser b
    p >>= f = \inp -> case parse p inp of
                        []         -> []
                        [(v, out)] -> parse (f v) out

p >>= f fails if p fails; otherwise it applies f to the result of p. This results in a new parser, which is then applied to the remaining input.

Example:

    > parse ((empty <|> item) >>= (\_ -> item)) "abc"
    [('b',"c")]

Examples

    > parse item ""
    []
    > parse item "abc"
    [('a',"bc")]
    > parse empty "abc"
    []
    > parse (return 1) "abc"
    [(1,"abc")]
    > parse (item <|> return 'd') "abc"
    [('a',"bc")]
    > parse (empty <|> return 'd') "abc"
    [('d',"abc")]
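The choice examples can be checked with a sketch that makes the parser an instance of the standard Alternative class from Control.Applicative, which supplies the names empty and <|>. The newtype wrapper, the constructor P, and the Functor and Applicative instances are assumptions added so that the code compiles; they are not in the slides.

```haskell
import Control.Applicative (Alternative (..))

-- Sketch only: the newtype wrapper and constructor P are assumptions.
newtype Parser a = P (String -> [(a, String)])

parse :: Parser a -> String -> [(a, String)]
parse (P p) inp = p inp

item :: Parser Char
item = P $ \inp -> case inp of
  []       -> []
  (x : xs) -> [(x, xs)]

instance Functor Parser where
  fmap g p = P $ \inp -> [(g v, out) | (v, out) <- parse p inp]

instance Applicative Parser where
  pure v = P $ \inp -> [(v, inp)]
  pg <*> px = P $ \inp ->
    [(g v, out') | (g, out) <- parse pg inp, (v, out') <- parse px out]

instance Monad Parser where
  p >>= f = P $ \inp -> case parse p inp of
    []             -> []
    ((v, out) : _) -> parse (f v) out

instance Alternative Parser where
  -- The parser that always fails.
  empty = P (const [])
  -- Try p; only if it fails, run q on the same, unconsumed input.
  p <|> q = P $ \inp -> case parse p inp of
    [] -> parse q inp
    rs -> rs
```

Because q is run on the original input, a failed first alternative consumes nothing, which is what gives backtracking its meaning here.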

>>= or do

Using >>=:

    p1 >>= \v1 ->
    p2 >>= \v2 ->
    ...
    pn >>= \vn ->
    return (f v1 v2 ... vn)

Using do notation:

    do v1 <- p1
       v2 <- p2
       ...
       vn <- pn
       return (f v1 v2 ... vn)

If some vi is not needed, vi <- pi can be written as pi, which corresponds to pi >>= \_ -> ...

Example

Using >>=:

    rev3 = item >>= \v1 ->
           item >>= \v2 ->
           item >>= \_  ->
           item >>= \v3 ->
           return $ reverse (v1:v2:v3:[])

Using do notation:

    rev3 = do v1 <- item
              v2 <- item
              item
              v3 <- item
              return $ reverse (v1:v2:v3:[])

    > rev3 "abcdef"
    [("dba","ef")]
    > (rev3 >>= (\_ -> item)) "abcde"
    [('e',"")]
    > (rev3 >>= (\_ -> item)) "abcd"
    []
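As a check that the two notations describe the same parser, both versions of rev3 can be written side by side. This is a sketch, not the slides' code: the names rev3Bind and rev3Do are hypothetical, and the newtype wrapper with constructor P plus the Functor and Applicative instances are assumptions needed for compilation.

```haskell
-- Sketch only: the newtype wrapper and constructor P are assumptions.
newtype Parser a = P (String -> [(a, String)])

parse :: Parser a -> String -> [(a, String)]
parse (P p) inp = p inp

item :: Parser Char
item = P $ \inp -> case inp of
  []       -> []
  (x : xs) -> [(x, xs)]

instance Functor Parser where
  fmap g p = P $ \inp -> [(g v, out) | (v, out) <- parse p inp]

instance Applicative Parser where
  pure v = P $ \inp -> [(v, inp)]
  pg <*> px = P $ \inp ->
    [(g v, out') | (g, out) <- parse pg inp, (v, out') <- parse px out]

instance Monad Parser where
  p >>= f = P $ \inp -> case parse p inp of
    []             -> []
    ((v, out) : _) -> parse (f v) out

-- Explicit binds: read four characters, drop the third, reverse the rest.
rev3Bind :: Parser String
rev3Bind =
  item >>= \v1 ->
  item >>= \v2 ->
  item >>= \_  ->
  item >>= \v3 ->
  return (reverse [v1, v2, v3])

-- The same parser in do notation (hypothetical name rev3Do).
rev3Do :: Parser String
rev3Do = do
  v1 <- item
  v2 <- item
  _  <- item
  v3 <- item
  return (reverse [v1, v2, v3])
```

Both parsers consume four characters, so on "abcd" nothing is left for a following item and the combined parser fails.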

Key benefit: The result of the first parse is available to the subsequent parsers.

    > parse (item >>= (\x -> item >>= (\y -> return (y:[x])))) "ab"
    [("ba","")]

Derived Primitives

Parsing a character that satisfies a predicate:

    sat :: (Char -> Bool) -> Parser Char
    sat p = do x <- item
               if p x then return x else empty

Examples:

    > parse (sat (== 'a')) "abc"
    [('a',"bc")]
    > parse (sat (== 'b')) "abc"
    []
    > parse (sat isLower) "abc"
    [('a',"bc")]
    > parse (sat isUpper) "abc"
    []

Derived Parsers from sat

    digit, letter, alphanum :: Parser Char
    digit    = sat isDigit
    letter   = sat isAlpha
    alphanum = sat isAlphaNum

    lower, upper :: Parser Char
    lower = sat isLower
    upper = sat isUpper

    char :: Char -> Parser Char
    char x = sat (== x)
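A compilable sketch of sat and a few of the parsers derived from it is given below. The predicates come from Data.Char. The newtype wrapper, the constructor P, and the Functor, Applicative and Alternative instances are assumptions added to make the definitions load in GHC.

```haskell
import Control.Applicative (Alternative (..))
import Data.Char (isDigit, isLower, isUpper)

-- Sketch only: the newtype wrapper and constructor P are assumptions.
newtype Parser a = P (String -> [(a, String)])

parse :: Parser a -> String -> [(a, String)]
parse (P p) inp = p inp

item :: Parser Char
item = P $ \inp -> case inp of
  []       -> []
  (x : xs) -> [(x, xs)]

instance Functor Parser where
  fmap g p = P $ \inp -> [(g v, out) | (v, out) <- parse p inp]

instance Applicative Parser where
  pure v = P $ \inp -> [(v, inp)]
  pg <*> px = P $ \inp ->
    [(g v, out') | (g, out) <- parse pg inp, (v, out') <- parse px out]

instance Monad Parser where
  p >>= f = P $ \inp -> case parse p inp of
    []             -> []
    ((v, out) : _) -> parse (f v) out

instance Alternative Parser where
  empty = P (const [])
  p <|> q = P $ \inp -> case parse p inp of
    [] -> parse q inp
    rs -> rs

-- Accept one character if it satisfies the predicate, otherwise fail.
sat :: (Char -> Bool) -> Parser Char
sat p = do
  x <- item
  if p x then return x else empty

digit, lower, upper :: Parser Char
digit = sat isDigit
lower = sat isLower
upper = sat isUpper

-- Accept exactly the given character.
char :: Char -> Parser Char
char x = sat (== x)
```

Note that sat decides after item has already read the character, but since failure returns the empty list, no input is lost to a later alternative.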

To accept a particular string

Use sequencing recursively:

    string :: String -> Parser String
    string []     = return []
    string (x:xs) = do char x
                       string xs
                       return (x:xs)

The entire parse fails if any of the recursive calls fails:

    > parse (string "if [") "if (a<b) return;"
    []
    > parse (string "if (") "if (a<b) return;"
    [("if (","a<b) return;")]
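The recursive string parser can be tried out with the following sketch. As before, the newtype wrapper, the constructor P, and the Functor, Applicative and Alternative instances are assumptions added for compilation rather than definitions from the slides.

```haskell
import Control.Applicative (Alternative (..))

-- Sketch only: the newtype wrapper and constructor P are assumptions.
newtype Parser a = P (String -> [(a, String)])

parse :: Parser a -> String -> [(a, String)]
parse (P p) inp = p inp

item :: Parser Char
item = P $ \inp -> case inp of
  []       -> []
  (x : xs) -> [(x, xs)]

instance Functor Parser where
  fmap g p = P $ \inp -> [(g v, out) | (v, out) <- parse p inp]

instance Applicative Parser where
  pure v = P $ \inp -> [(v, inp)]
  pg <*> px = P $ \inp ->
    [(g v, out') | (g, out) <- parse pg inp, (v, out') <- parse px out]

instance Monad Parser where
  p >>= f = P $ \inp -> case parse p inp of
    []             -> []
    ((v, out) : _) -> parse (f v) out

instance Alternative Parser where
  empty = P (const [])
  p <|> q = P $ \inp -> case parse p inp of
    [] -> parse q inp
    rs -> rs

sat :: (Char -> Bool) -> Parser Char
sat p = do
  x <- item
  if p x then return x else empty

char :: Char -> Parser Char
char x = sat (== x)

-- Match the given string character by character; any mismatch fails the
-- whole parse because failure propagates through >>=.
string :: String -> Parser String
string []       = return []
string (x : xs) = do
  _ <- char x
  _ <- string xs
  return (x : xs)
```

The base case string [] succeeds without consuming input, so string "" matches at the start of any input.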

many applies the same parser many times

    many :: Parser a -> Parser [a]
    many p = some p <|> return []

    some :: Parser a -> Parser [a]
    some p = do v  <- p
                vs <- many p
                return (v:vs)

Examples:

    > parse (many digit) "123ab"
    [("123","ab")]
    > parse (many digit) "ab123ab"
    [("","ab123ab")]
    > parse (many alphanum) "ab123ab"
    [("ab123ab","")]
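The mutual recursion between many and some can be run with the sketch below. It imports only empty and <|> from Control.Applicative so that many and some can be defined at top level exactly as on the slide, instead of using the class defaults. The newtype wrapper, the constructor P, and the supporting instances are assumptions.

```haskell
import Control.Applicative (Alternative (empty, (<|>)))
import Data.Char (isDigit)

-- Sketch only: the newtype wrapper and constructor P are assumptions.
newtype Parser a = P (String -> [(a, String)])

parse :: Parser a -> String -> [(a, String)]
parse (P p) inp = p inp

item :: Parser Char
item = P $ \inp -> case inp of
  []       -> []
  (x : xs) -> [(x, xs)]

instance Functor Parser where
  fmap g p = P $ \inp -> [(g v, out) | (v, out) <- parse p inp]

instance Applicative Parser where
  pure v = P $ \inp -> [(v, inp)]
  pg <*> px = P $ \inp ->
    [(g v, out') | (g, out) <- parse pg inp, (v, out') <- parse px out]

instance Monad Parser where
  p >>= f = P $ \inp -> case parse p inp of
    []             -> []
    ((v, out) : _) -> parse (f v) out

instance Alternative Parser where
  empty = P (const [])
  p <|> q = P $ \inp -> case parse p inp of
    [] -> parse q inp
    rs -> rs

sat :: (Char -> Bool) -> Parser Char
sat p = do
  x <- item
  if p x then return x else empty

digit :: Parser Char
digit = sat isDigit

-- Zero or more repetitions: try one or more, else succeed with [].
many :: Parser a -> Parser [a]
many p = some p <|> return []

-- One or more repetitions: one p, followed by zero or more.
some :: Parser a -> Parser [a]
some p = do
  v  <- p
  vs <- many p
  return (v : vs)
```

The recursion terminates because each successful p consumes input; many applied to a parser that succeeds without consuming anything would loop.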

Example
(Lee, CSCE 314, TAMU)

We can now define a parser that consumes a bracketed, comma-separated list of one or more digits from a string:

    p :: Parser String
    p = do char '['
           d  <- digit
           ds <- many (do char ','
                          digit)
           char ']'
           return (d:ds)

    > parse p "[1,2,3,4]"
    [("1234","")]
    > parse p "[1,2,3,4"
    []

Note: More sophisticated parsing libraries can indicate and/or recover from errors in the input string.
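The digit-list parser can be exercised with the sketch below. The name digitList is hypothetical (the slide calls the parser p), many is the default supplied by the Alternative class, and the newtype wrapper, the constructor P, and the supporting instances are assumptions.

```haskell
import Control.Applicative (Alternative (..))
import Data.Char (isDigit)

-- Sketch only: the newtype wrapper and constructor P are assumptions.
newtype Parser a = P (String -> [(a, String)])

parse :: Parser a -> String -> [(a, String)]
parse (P p) inp = p inp

item :: Parser Char
item = P $ \inp -> case inp of
  []       -> []
  (x : xs) -> [(x, xs)]

instance Functor Parser where
  fmap g p = P $ \inp -> [(g v, out) | (v, out) <- parse p inp]

instance Applicative Parser where
  pure v = P $ \inp -> [(v, inp)]
  pg <*> px = P $ \inp ->
    [(g v, out') | (g, out) <- parse pg inp, (v, out') <- parse px out]

instance Monad Parser where
  p >>= f = P $ \inp -> case parse p inp of
    []             -> []
    ((v, out) : _) -> parse (f v) out

instance Alternative Parser where
  empty = P (const [])
  p <|> q = P $ \inp -> case parse p inp of
    [] -> parse q inp
    rs -> rs

sat :: (Char -> Bool) -> Parser Char
sat p = do
  x <- item
  if p x then return x else empty

digit :: Parser Char
digit = sat isDigit

char :: Char -> Parser Char
char x = sat (== x)

-- '[' digit (',' digit)* ']', returning the digits as a string.
-- digitList is a hypothetical name for the slide's parser p.
digitList :: Parser String
digitList = do
  _  <- char '['
  d  <- digit
  ds <- many (char ',' >> digit)
  _  <- char ']'
  return (d : ds)
```

A missing closing bracket makes the final char ']' fail, and the failure propagates so the whole parse returns [] with no indication of where the error occurred.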

Example: Parsing a token

    space :: Parser ()
    space = do many (sat isSpace)
               return ()

    token :: Parser a -> Parser a
    token p = do space
                 v <- p
                 space
                 return v

    identifier :: Parser String
    identifier = token ident

    ident :: Parser String
    ident = do x  <- sat isLower
               xs <- many (sat isAlphaNum)
               return (x:xs)
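The token-level parsers can be tested with the following sketch, which uses isSpace, isLower and isAlphaNum from Data.Char and the default many from the Alternative class. The newtype wrapper, the constructor P, and the supporting instances are assumptions added for compilation.

```haskell
import Control.Applicative (Alternative (..))
import Data.Char (isAlphaNum, isLower, isSpace)

-- Sketch only: the newtype wrapper and constructor P are assumptions.
newtype Parser a = P (String -> [(a, String)])

parse :: Parser a -> String -> [(a, String)]
parse (P p) inp = p inp

item :: Parser Char
item = P $ \inp -> case inp of
  []       -> []
  (x : xs) -> [(x, xs)]

instance Functor Parser where
  fmap g p = P $ \inp -> [(g v, out) | (v, out) <- parse p inp]

instance Applicative Parser where
  pure v = P $ \inp -> [(v, inp)]
  pg <*> px = P $ \inp ->
    [(g v, out') | (g, out) <- parse pg inp, (v, out') <- parse px out]

instance Monad Parser where
  p >>= f = P $ \inp -> case parse p inp of
    []             -> []
    ((v, out) : _) -> parse (f v) out

instance Alternative Parser where
  empty = P (const [])
  p <|> q = P $ \inp -> case parse p inp of
    [] -> parse q inp
    rs -> rs

sat :: (Char -> Bool) -> Parser Char
sat p = do
  x <- item
  if p x then return x else empty

-- Skip zero or more whitespace characters.
space :: Parser ()
space = do
  _ <- many (sat isSpace)
  return ()

-- Run p, ignoring whitespace before and after it.
token :: Parser a -> Parser a
token p = do
  space
  v <- p
  space
  return v

-- A lower-case letter followed by zero or more alphanumeric characters.
ident :: Parser String
ident = do
  x  <- sat isLower
  xs <- many (sat isAlphaNum)
  return (x : xs)

identifier :: Parser String
identifier = token ident
```

For instance, parse identifier "  abc1 def" skips the surrounding spaces and returns [("abc1","def")].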