Deductive Parsing with Sequentially Indexed Grammars
Jan van Eijck

May 25, 2005

Abstract

This paper extends the Earley parsing algorithm for context free languages [3] to the case of sequentially indexed languages. Sequentially indexed languages are related to indexed languages [1, 2]. The difference is that parallel processing of index stacks is replaced by sequential processing [4]. This paper contains the full code of an implementation in Haskell [6], in literate programming style [7], of an algorithm for deductive parsing based on [8], focussing on the case of an Earley style parsing algorithm for sequentially indexed languages.

Keywords: Deductive parsing, context free grammars, indexed languages, nested stack automata, Earley parsing algorithm, Haskell, literate programming.

1 Introduction

Indexed grammars [1] are quadruples G = (N, T, P, S), where N is a finite nonterminal alphabet, T is a finite terminal alphabet with N ∩ T = ∅, P is a finite set of productions of the form (X, α) with X ∈ N ∪ N^N and α ∈ (N ∪ N^N ∪ T)*, where N^N is the set of all X^Y with X, Y ∈ N, and S ∈ N is the start symbol. A production (X, α) is written as X → α.

Let G = (N, T, P, S) be an indexed grammar. A pair (X, [X1, ..., Xn]), with X, X1, ..., Xn ∈ N, is called an indexed nonterminal. Indexed nonterminals are written as X^{X1...Xn}. Let N̄ be the set of all X^{X1...Xn}, with X, X1, ..., Xn ∈ N. Then a sentential form for G is a string α in (N̄ ∪ T)*.

To define the one step derivation relation, we need a preliminary definition:

Definition 1 If δ ∈ (N ∪ N^N ∪ T)* and ζ ∈ N*, then δ^ζ is given by the following recursion:

    ε^ζ = ε
    (w : δ)^ζ = w : δ^ζ              if w ∈ T,
    (Y : δ)^ζ = Y^ζ : δ^ζ            if Y ∈ N,
    (Y^Z : δ)^ζ = Y^{(Z:ζ)} : δ^ζ    if Y^Z ∈ N^N.
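Definition 1 can be run directly. The sketch below is an illustration, not part of the paper's code: the `Sym`/`ISym` encoding and the name `att` are invented here, with terminals, plain nonterminals, and nonterminals carrying one pushed index as separate constructors.

```haskell
-- Sketch of Definition 1 (hypothetical encoding, not the paper's DPS module):
-- attach an index stack zeta to every nonterminal of a righthand side,
-- pushing the rule's own index on top where one is present.
data Sym  = Term Char | Nt Char | NtIdx Char Char  deriving (Eq, Show)
data ISym = ITerm Char | INt Char [Char]           deriving (Eq, Show)

att :: [Sym] -> [Char] -> [ISym]
att [] _                  = []
att (Term w    : ds) zeta = ITerm w          : att ds zeta  -- terminals get no stack
att (Nt y      : ds) zeta = INt y zeta       : att ds zeta  -- Y becomes Y^zeta
att (NtIdx y z : ds) zeta = INt y (z : zeta) : att ds zeta  -- Y^Z becomes Y^(Z:zeta)
```

For example, `att [Nt 'Y', NtIdx 'Y' 'Z', Term 'w'] "X"` yields `[INt 'Y' "X", INt 'Y' "ZX", ITerm 'w']`, mirroring the three clauses of the recursion.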
Note that as a special case of this, we have (Y^Z)^θ = Y^{(Z:θ)}. Using the definition of δ^ζ, we can define one step derivations:

Definition 2 Let α, β be sentential forms for indexed grammar G. Then α ⇒_G β iff

1. α = γ1 X^ζ γ2, X → δ is a production of the grammar, and β = γ1 δ^ζ γ2, or
2. α = γ1 X^{(Y:ζ)} γ2, X^Y → δ is a production of the grammar, and β = γ1 δ^ζ γ2.

In terms of this, α ⇒*_G β is defined in the usual way. This definition is equivalent to the definition in [1].

Sequentially indexed grammars use indices that get pushed to an arbitrary nonterminal in the righthand side of a production. Sequentially indexed grammars look just like indexed grammars, but the definition of derivation is different. The following definition uses list concatenation: if ζ is the result of concatenating ζ1 and ζ2, we denote this as ζ = ζ1 ++ ζ2.

Definition 3 If δ ∈ (N ∪ N^N ∪ T)* and ζ ∈ N*, then (δ)^ζ is the subset of (N̄ ∪ T)* defined recursively as:

    (ε)^[] = {ε}
    (ε)^ζ = ∅                                                      if ζ ≠ [],
    (w : δ)^ζ = {w : δ' | δ' ∈ (δ)^ζ}                              if w ∈ T,
    (C : δ)^ζ = {C^{ζ1} : δ' | ζ = ζ1 ++ ζ2, δ' ∈ (δ)^{ζ2}}        if C ∈ N,
    (C^Y : δ)^ζ = {C^{(Y:ζ1)} : δ' | ζ = ζ1 ++ ζ2, δ' ∈ (δ)^{ζ2}}  if C^Y ∈ N^N.

The relation of one-step derivation is defined in terms of (δ)^ζ, as follows:

Definition 4 Let α, β be sentential forms for indexed grammar G. Then α ⇒_G β iff

1. α = γ1 B^ζ γ2, B → δ is a production of the grammar, and β = γ1 δ' γ2, where δ' ∈ (δ)^ζ, or
2. α = γ1 B^{(Y:ζ)} γ2, B^Y → δ is a production of the grammar, and β = γ1 δ' γ2, where δ' ∈ (δ)^ζ.

In derivations with sequentially indexed grammars, stacks are never allowed to disappear, and stacks are never allowed to get duplicated. In particular, a production B → ε will not allow a one-step derivation like B^{YYX} ⇒ ε, and a production B → CD will not allow a one-step derivation like B^{YYX} ⇒ C^{YYX} D^{YYX} (but it will allow B^{YYX} ⇒ C^{YYX} D, B^{YYX} ⇒ C^{YY} D^{X}, B^{YYX} ⇒ C^{Y} D^{YX}, and B^{YYX} ⇒ C D^{YYX}). A production like B^X → ε can lead to a one-step derivation B^X ⇒ ε. This effectively treats X as a trace.
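Definition 3 can likewise be prototyped. The sketch below uses an illustrative encoding invented here (not the paper's code): symbols are `Either Char Char` with `Left` for terminals and `Right` for nonterminals, and indexed nonterminals are omitted to keep it small. It enumerates all ways to distribute an index stack over a righthand side; a terminal receives nothing, and a leftover stack yields no derivation at all.

```haskell
import Data.List (inits, tails)

-- All ways to cut a list in two (prefix, suffix).
splits :: [a] -> [([a], [a])]
splits xs = zip (inits xs) (tails xs)

-- Distribute an index stack over a righthand side, per Definition 3:
-- pair each symbol with the (possibly empty) stack it receives.
distribute :: [Either Char Char] -> [Char] -> [[(Either Char Char, [Char])]]
distribute [] []   = [[]]          -- (epsilon)^[] = {epsilon}
distribute [] _    = []            -- (epsilon)^zeta = {} for nonempty zeta
distribute (Left w : ds) zeta =    -- terminals receive no indices
  [ (Left w, []) : rest | rest <- distribute ds zeta ]
distribute (Right c : ds) zeta =   -- a nonterminal takes a prefix of the stack
  [ (Right c, z1) : rest | (z1, z2) <- splits zeta, rest <- distribute ds z2 ]
```

With righthand side CD and stack YYX this yields exactly the four distributions listed above, and a righthand side consisting of a single terminal consumes no stack at all.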
Sequentially indexed grammars are different from an earlier proposal for a restricted form of indexed grammars in [5]. Gazdar proposed to use index lists that get copied to a single nonterminal in the righthand sides of productions, but in such a way that this heir-nonterminal has to be indicated in the rule.
2 General Data Structures

module DPS where

import List
import Char
import System.IO.Unsafe (unsafePerformIO)

Terminal and nonterminal symbols:

data Symbol a b = T a | N b | D b | I b b
  deriving (Eq,Ord,Read)

The T constructor is for terminals and N for nonterminals. The D nonterminal is useful for extending a grammar with a new start symbol. The I constructor indicates a nonterminal indexed with another nonterminal.

Given show functions for the types a and b, we define a show function for Symbol a b as follows:

instance (Show a, Show b) => Show (Symbol a b) where
  show (T x)   = show x
  show (N x)   = show x
  show (D x)   = '#' : show x
  show (I x y) = show x ++ "[" ++ show y ++ "]"

The property of being a nonterminal:

nonterm :: Symbol a b -> Bool
nonterm (T _) = False
nonterm _     = True

Category of a nonterminal:
ntcat :: Symbol a b -> [b]
ntcat (N x)   = [x]
ntcat (I x _) = [x]
ntcat _       = []

Index of a nonterminal:

ntidx :: Symbol a b -> [b]
ntidx (N _)   = []
ntidx (I _ y) = [y]
ntidx _       = []

The property of being a dummy symbol:

dummy :: Symbol a b -> Bool
dummy (D _) = True
dummy _     = False

The property of being an indexed symbol:

indexed :: Symbol a b -> Bool
indexed (I _ _) = True
indexed _       = False

Grammar rules:

data Rule a b = Rule (Symbol a b) [Symbol a b] deriving Eq

A show function for grammar rules:

instance (Show a, Show b) => Show (Rule a b) where
  show (Rule y zs) = show y ++ "-->" ++ show zs
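The display convention can be checked in isolation. The block below is a standalone copy of the Symbol type and its Show instance as given above, so that the conventions ('#' marks the fresh start symbol, brackets mark an index) can be tried out directly:

```haskell
-- Standalone copy of the paper's Symbol type and Show instance.
data Symbol a b = T a | N b | D b | I b b deriving (Eq, Ord, Read)

instance (Show a, Show b) => Show (Symbol a b) where
  show (T x)   = show x
  show (N x)   = show x
  show (D x)   = '#' : show x            -- dummy start symbol marked with '#'
  show (I x y) = show x ++ "[" ++ show y ++ "]"  -- index in brackets
```

For instance, `show (I 'S' 'X')` gives `"'S'['X']"` and `show (D 'S')` gives `"#'S'"`.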
Reading a grammar rule:

instance (Read a, Read b) => Read (Rule a b) where
  readsPrec p = \ r ->
    [ (Rule symbol rhs, u) | (symbol, s) <- reads r,
                             ("-->", t)  <- lex s,
                             (rhs, u)    <- reads t ]

Example:

DPS> read "N 'S' --> [T 'a', N 'S', T 'a']" :: Rule Char Char
'S'-->['a','S','a']

Functions for accessing the left- and righthand sides of a rule:

lhs :: Rule a b -> Symbol a b
lhs (Rule x ys) = x

rhs :: Rule a b -> [Symbol a b]
rhs (Rule x ys) = ys

Function for counting the number of nonterminals in the righthand side of a rule:

ntc :: [Symbol a b] -> Int
ntc []             = 0
ntc (N _   : rest) = 1 + ntc rest
ntc (I _ _ : rest) = 1 + ntc rest
ntc (_     : rest) = ntc rest

A grammar is a list of rules:

type Grammar a b = [Rule a b]

When specifying a grammar we adopt the convention that the lefthand side symbol of the first grammar rule is the start symbol.
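The reading instance can be reproduced standalone. The block below is an illustrative simplification (not the paper's code): the Symbol type is replaced by a plain String head and String symbols, so the reads/lex pipeline can be tested on its own.

```haskell
-- Simplified stand-in for the paper's Rule type: String head, String symbols.
data Rule = Rule String [String] deriving (Eq, Show)

-- Same reads/lex pipeline as the paper's instance: parse the head,
-- expect the "-->" token, then parse the righthand-side list.
instance Read Rule where
  readsPrec _ r =
    [ (Rule sym rhs, u) | (sym, s)   <- reads r,
                          ("-->", t) <- lex s,
                          (rhs, u)   <- reads t ]
```

For example, `read "\"S\" --> [\"a\",\"S\",\"b\"]" :: Rule` gives `Rule "S" ["a","S","b"]`.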
start :: Grammar a b -> Symbol a b
start grammar = lhs (head grammar)

Converting a list of strings into a grammar:

readGrammar :: (Read a, Read b) => [String] -> Grammar a b
readGrammar ls = map read ls'
  where ls'      = filter nonempty ls
        nonempty = \ s -> dropWhile isSpace s /= []

A function for reading a grammar from a file:

getGrammar :: (Read a, Read b) => FilePath -> IO (Grammar a b)
getGrammar filename = do
  str <- readFile filename
  return (readGrammar (lines str))

Same, avoiding the IO monad:

getGr :: (Read a, Read b) => FilePath -> Grammar a b
getGr filename = unsafePerformIO (getGrammar filename)

3 Example Grammars for CF Languages

For concreteness' sake, let us assume that terminal and nonterminal symbols are of type Char. Here is an example grammar, read in from file grammar0 (it is assumed that the file grammar0 is in the current directory):

DPS> getGr "grammar0" :: Grammar String String
["S"-->["a","S","b"],"S"-->["a","b"]]
Here is another example grammar:

grammar1 :: Grammar Char Char
grammar1 = [Rule (N 'S') [T 'a', N 'S', T 'a'],
            Rule (N 'S') [T 'b', N 'S', T 'b'],
            Rule (N 'S') [T 'a'],
            Rule (N 'S') [T 'b']]

An example of a grammar with epsilon rules:

grammar2 :: Grammar Char Char
grammar2 = [Rule (N 'S') [T 'a', N 'S', T 'a'],
            Rule (N 'S') [T 'b', N 'S', T 'b'],
            Rule (N 'S') [T 'a'],
            Rule (N 'S') [T 'b'],
            Rule (N 'S') []]

A grammar for balanced parentheses:

grammar3 :: Grammar Char Char
grammar3 = [Rule (N 'S') [T '(', N 'S', T ')', N 'S'],
            Rule (N 'S') []]

4 Grammars for Non-CF Languages

grammar4 :: Grammar Char Char
grammar4 = [Rule (N 'S') [T 'a', I 'S' 'X'],
            Rule (N 'S') [N 'A'],
            Rule (I 'A' 'X') [T 'b', N 'A', T 'c'],
            Rule (N 'A') []]
grammar5 :: Grammar Char Char
grammar5 = [Rule (N 'S') [T 'a', I 'S' 'X'],
            Rule (N 'S') [T 'b', I 'S' 'Y'],
            Rule (N 'S') [N 'A'],
            Rule (I 'A' 'X') [N 'A', T 'a'],
            Rule (I 'A' 'Y') [N 'A', T 'b'],
            Rule (N 'A') []]

grammar6 :: Grammar Char Char
grammar6 = [Rule (N 'A') [I 'A' 'X'],
            Rule (N 'A') [N 'B'],
            Rule (I 'B' 'X') [T 'a', N 'B'],
            Rule (N 'B') []]

5 Derivation Trees

Here is a data type for derivation trees:

data Tree a b = Leaf a | Node b [b] [Tree a b]
  deriving (Eq,Ord,Show)

Here is an example:

tree0 = Node 'S' [] [Leaf 'a', Leaf 'b']

Displaying a tree on the screen:
displayTree :: (Show a, Show b) => Tree a b -> IO ()
displayTree tr = mapM_ putStrLn (showTree 0 tr)
  where
    showTree :: (Show a, Show b) => Int -> Tree a b -> [String]
    showTree i (Leaf x) = [map (\ _ -> ' ') [1..i] ++ show x]
    showTree i (Node x [] ts) =
      (map (\ _ -> ' ') [1..i] ++ show x)
      : concat (map (showTree (i+5)) ts)
    showTree i (Node x xs ts) =
      (map (\ _ -> ' ') [1..i] ++ show x ++ show xs)
      : concat (map (showTree (i+5)) ts)

The example tree gets displayed as follows:

DPS> displayTree tree0
'S'
     'a'
     'b'

Displaying a tree list:

displayTrees :: (Show a, Show b) => [Tree a b] -> IO ()
displayTrees trees = sequence_ (map displayTree trees)

6 Earley Items, Axioms, Goals, Consequences

Earley items. Earley items for context free parsing are of the form ⟨i, A → α • β, j⟩. They consist of a rule A → αβ with a dot • in its righthand side to indicate the part of the righthand side that was recognized so far, a pointer i to the parent node where the rule was invoked, and a pointer j to the position in the input that recognition has reached. For parsing indexed languages, we will use three extra components:

1. a stack of the indices at the point where the rule was invoked,
2. a stack of indices for the first nonterminal to the right of •,
3. a stack of indices for the tail of the nonterminal list to the right of •.

We will use Greek letters η, ζ, θ for index stacks.
The item format now becomes ⟨i, θ, A → α • β, η, ζ, j⟩, where θ, η, ζ are stacks of indices (nonterminals). The item indicates the following:

- grammar rule A → αβ was invoked at point i,
- at the point of invocation, the top node A has associated stack θ,
- at point j, part α of the righthand side of the rule has been successfully recognized,
- η is the stack for the first nonterminal among β (if β has no nonterminals, then η is empty),
- ζ is the stack for the remainder of the nonterminals in β (if β has fewer than two nonterminals, then ζ is empty).

For good measure, we also include a derivation tree component, by putting a list of derivation trees as the last component of an Earley item.

data Item a b = Item Int [b] (Symbol a b) [Symbol a b] [Symbol a b] [b] [b] Int [Tree a b]
  deriving (Eq,Ord)

A show function for items, using * for the dot, and suppressing the derivation tree component:
instance (Show a, Show b) => Show (Item a b) where
  show (Item i theta b symbols symbols' eta zeta j ts) =
    "(" ++ show i ++ "," ++ show theta ++ "," ++ show b ++ "==>"
        ++ show symbols ++ "*" ++ show symbols' ++ ","
        ++ show eta ++ "," ++ show zeta ++ "," ++ show j ++ ")"

A function for extracting the list of derivation trees from an Earley item:

getTrees :: Item a b -> [Tree a b]
getTrees (Item i theta b symbols symbols' eta zeta j ts) = ts

Axiom

In the case of Earley parsing with CF grammars, there is one axiom. It has the form ⟨0, S' → • S, 0⟩, where S is the start symbol of the grammar and S' is a new start symbol. Adapting this to the case of parsing with sequentially indexed grammars, the axiom takes the shape ⟨0, [], S' → • S, [], [], 0⟩, indicating that at the beginning of the parse, there is one pending nonterminal, and all stack components are empty.

axioms :: Grammar a b -> [Item a b]
axioms grammar = [Item 0 [] (D x) [] [N x] [] [] 0 []]
  where (N x) = start grammar

Goal

In the case of Earley parsing with CF grammars, there is one goal. It has the form ⟨0, S' → S •, n⟩, where S is the start symbol of the grammar, S' is the new start symbol used in the axiom, and n is the length of the input. For the case of Earley style parsing with indexed grammars, we also require that the index stack components are empty at the end of the parse, so the goal shape becomes ⟨0, [], S' → S •, [], [], n⟩.
Here is a function for recognizing goals:

goal :: (Eq a, Eq b) => Grammar a b -> [a] -> Item a b -> Bool
goal grammar tokens (Item i theta symbol symbols symbols' eta zeta k trees) =
  i == 0 && theta == [] && dummy symbol
         && symbols == [start grammar] && symbols' == []
         && eta == [] && zeta == [] && k == length tokens

Consequences

As in the case of Earley parsing with CF grammars, there are three kinds of consequences, for scanning, prediction and completion.

consequences :: (Eq a, Eq b) =>
  Grammar a b -> [a] -> Item a b -> [Item a b] -> [Item a b]
consequences grammar tokens trigger stored =
     scan tokens trigger
  ++ predict tokens grammar trigger
  ++ complete grammar trigger stored

Scanning

The scanning rule for Earley parsing with CF grammars is the rule that shifts the dot across a terminal. It has the form (derivation tree component omitted):

    ⟨i, A → α • wβ, j⟩
    --------------------
    ⟨i, A → αw • β, j + 1⟩
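Stripped of index stacks and derivation trees, this scanning step amounts to the following. The block is a simplified standalone sketch invented here, with items as plain tuples rather than the paper's Item type:

```haskell
-- Item simplified to (parent pointer, recognized part, remaining part,
-- input position); terminals are plain characters.
type MiniItem = (Int, String, String, Int)

scan :: String -> MiniItem -> [MiniItem]
scan tokens (i, alpha, w:beta, j)
  | j < length tokens && tokens !! j == w
      = [(i, alpha ++ [w], beta, j + 1)]  -- shift the dot past w
scan _ _ = []                             -- nothing expected, or mismatch
```

For instance, `scan "ab" (0, "", "ab", 0)` yields `[(0, "a", "b", 1)]`, while a mismatching expected terminal produces no consequences.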
For parsing sequentially indexed languages, three index stack components are added to this. Scanning does not change the index stacks θ, η, ζ:

    ⟨i, θ, A → α • wβ, η, ζ, j⟩
    -----------------------------
    ⟨i, θ, A → αw • β, η, ζ, j + 1⟩

scan :: (Eq a, Eq b) => [a] -> Item a b -> [Item a b]
scan tokens (Item i theta a alpha [] eta zeta j ts) = []
scan tokens (Item i theta a alpha (symbol:beta) eta zeta j ts)
  | j >= length tokens = []
  | otherwise =
      [ Item i theta a (alpha ++ [symbol]) beta eta zeta (j+1)
             (ts ++ [Leaf (tokens !! j)])
      | symbol == T (tokens !! j) ]

Prediction

The prediction rule for Earley parsing is the rule that initializes a new rule B → γ on the basis of a premise indicating that B is expected at the current point in the input. In the CF grammar case it has the following form (derivation tree component omitted):

    ⟨i, A → α • Bβ, j⟩
    ------------------   B → γ
    ⟨j, B → • γ, j⟩

In the case of Earley-style parsing with sequentially indexed grammars this splits into four rules. The rules split the first index stack. For this we need some terminology. If γ is a list of grammar symbols and η, η', η'' are index stacks, then c(γ) is the number of nonterminals in γ, and C(η, η', η'', γ) is the following constraint:

    η = η' ++ η''  ∧  (c(γ) = 0 → η' = [] ∧ η'' = [])  ∧  (c(γ) = 1 → η'' = []).

Splitting a list in two sublists:
split :: [a] -> [([a],[a])]
split []     = [([],[])]
split (x:xs) = ([],x:xs) : map (\ (us,vs) -> (x:us,vs)) (split xs)

Implementation of the constraint:

constraint :: (Eq a, Eq b) => ([b],[b],[Symbol a b]) -> Bool
constraint (stack1,stack2,symbols) =
     (ntc symbols /= 0 || (stack1 == [] && stack2 == []))
  && (ntc symbols /= 1 || stack2 == [])

The first prediction rule covers the case of an expected nonterminal B matched against a rule with head B. The rule distributes the appropriate stack over the new item, in accordance with the constraint imposed by the number of nonterminals in the righthand side of the grammar rule used in the prediction.

    ⟨i, θ, A → α • Bβ, η, ζ, j⟩
    ---------------------------   B → γ, C(η, η', η'', γ)
    ⟨j, η, B → • γ, η', η'', j⟩

The second rule covers the case of an expected nonterminal B matched against a rule with head B^X. This rule pops the index stack associated with B.

    ⟨i, θ, A → α • Bβ, (X : η), ζ, j⟩
    ---------------------------------   B^X → γ, C(η, η', η'', γ)
    ⟨j, η, B^X → • γ, η', η'', j⟩

The third rule covers the case of an expected nonterminal B^Y matched against a rule B → γ:

    ⟨i, θ, A → α • B^Y β, η, ζ, j⟩
    ---------------------------------   B → γ, C((Y : η), η', η'', γ), n − j > |η|
    ⟨j, (Y : η), B → • γ, η', η'', j⟩

Note the side condition on the rule. The side condition prevents unlimited growth of the stack. This is needed to prevent a rule like A → A^Y from causing an unbounded number of pushes.

The fourth rule covers the case of an expected nonterminal B^Y matched against a rule B^Y → γ:

    ⟨i, θ, A → α • B^Y β, η, ζ, j⟩
    ------------------------------   B^Y → γ, C(η, η', η'', γ)
    ⟨j, η, B^Y → • γ, η', η'', j⟩

If no further symbols are expected, nothing is predicted:
predict :: (Eq a, Eq b) => [a] -> Grammar a b -> Item a b -> [Item a b]
predict tokens grammar (Item i theta a alpha [] eta zeta j ts) = []

If a nonterminal without index is expected, we get:

predict tokens grammar (Item i theta a alpha (N x:beta) eta zeta j ts) =
  [ Item j eta (N x) [] gamma eta' eta'' j []
  | Rule (N z) gamma <- grammar,
    x == z,
    (eta',eta'') <- split eta,
    constraint (eta',eta'',gamma) ]
  ++
  [ Item j (tail eta) (I x' y) [] gamma eta' eta'' j []
  | Rule (I x' y) gamma <- grammar,
    x == x',
    eta /= [],
    head eta == y,
    (eta',eta'') <- split (tail eta),
    constraint (eta',eta'',gamma) ]

If a nonterminal with an index is expected, we get:
predict tokens grammar (Item i theta a alpha (I x y:beta) eta zeta j ts) =
  [ Item j (y:eta) (N x) [] gamma eta' eta'' j []
  | Rule (N x') gamma <- grammar,
    x == x',
    (eta',eta'') <- split (y:eta),
    constraint (eta',eta'',gamma),
    length tokens - j > length eta ]
  ++
  [ Item j eta (I x' y') [] gamma eta' eta'' j []
  | Rule (I x' y') gamma <- grammar,
    x == x',
    y == y',
    (eta',eta'') <- split eta,
    constraint (eta',eta'',gamma) ]

Finally, we need a catch-all clause to indicate that these are all the predict consequences. This covers the case where the next expected symbol is a terminal.

predict tokens grammar (Item i theta a alpha beta eta zeta j ts) = []

Completion

The completion rule for Earley parsing is the rule that shifts the dot across a nonterminal. It has two premises, and it is of the following form (derivation tree component omitted):

    ⟨i, A → α • Bβ, k⟩   ⟨k, B → γ •, j⟩
    ------------------------------------
    ⟨i, A → αB • β, j⟩

For the case of Earley-style parsing with sequentially indexed grammars, this splits into four rules, as follows. The first rule checks that the lefthand tail index stack of the first premise matches the head index stack of the second premise, for the case of a match of expected symbol B against completed rule B → γ.

    ⟨i, θ, A → α • Bβ, η, ζ, k⟩   ⟨k, η, B → γ •, [], [], j⟩
    --------------------------------------------------------   C(ζ, ζ', ζ'', β)
    ⟨i, θ, A → αB • β, ζ', ζ'', j⟩

The second rule covers the case of a match of expected symbol B against completed rule B^Y → γ.

    ⟨i, θ, A → α • Bβ, (Y : η), ζ, k⟩   ⟨k, η, B^Y → γ •, [], [], j⟩
    ----------------------------------------------------------------   C(ζ, ζ', ζ'', β)
    ⟨i, θ, A → αB • β, ζ', ζ'', j⟩

The third rule covers the case of a match of expected symbol B^Y against completed rule B → γ.

    ⟨i, θ, A → α • B^Y β, η, ζ, k⟩   ⟨k, (Y : η), B → γ •, [], [], j⟩
    -----------------------------------------------------------------   C(ζ, ζ', ζ'', β)
    ⟨i, θ, A → αB^Y • β, ζ', ζ'', j⟩

The fourth rule covers the case of a match of expected symbol B^Y against completed rule B^Y → γ.

    ⟨i, θ, A → α • B^Y β, η, ζ, k⟩   ⟨k, η, B^Y → γ •, [], [], j⟩
    -------------------------------------------------------------   C(ζ, ζ', ζ'', β)
    ⟨i, θ, A → αB^Y • β, ζ', ζ'', j⟩

In the implementation this is handled by distinguishing four cases:

- Trigger of the form ⟨i, θ, A → α • Bβ, η, ζ, k⟩: look for a completed item with head B or B^Y on the chart.
- Trigger of the form ⟨i, θ, A → α • B^Y β, η, ζ, k⟩: look for a completed item with head B or B^Y on the chart.
- Trigger of the form ⟨k, η, B → γ •, [], [], j⟩: look for an item with expected symbol B or B^Y on the chart.
- Trigger of the form ⟨k, η, B^Y → γ •, [], [], j⟩: look for an item with expected symbol B or B^Y on the chart.
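All four completion rules (and the prediction rules) rely on the same stack-splitting discipline. The paper's split together with the constraint can be explored standalone; the helper admissible below is a name invented here, taking the nonterminal count c(γ) directly as a number:

```haskell
-- The paper's split: all ways to cut a list in two.
split :: [a] -> [([a], [a])]
split []     = [([],[])]
split (x:xs) = ([], x:xs) : map (\ (us, vs) -> (x:us, vs)) (split xs)

-- Splittings of eta admitted by the constraint C(eta, eta', eta'', gamma),
-- with c(gamma) passed in as n: no nonterminals means the whole stack must
-- be empty; one nonterminal means the tail stack must be empty.
admissible :: Int -> [b] -> [([b], [b])]
admissible n eta =
  [ (e1, e2) | (e1, e2) <- split eta,
               n /= 0 || (null e1 && null e2),
               n /= 1 || null e2 ]
```

For example, `admissible 0 "X"` is empty (a stack cannot vanish into a righthand side without nonterminals), `admissible 1 "YX"` is `[("YX","")]`, and `admissible 2 "YX"` contains all three splittings.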
complete :: (Eq a, Eq b) => Grammar a b -> Item a b -> [Item a b] -> [Item a b]
complete grammar (Item i theta a alpha (N x:beta) eta zeta k ts) stored =
  [ Item i theta a (alpha++[N x]) beta zeta' zeta'' j
         (ts ++ [Node x eta ts'])
  | (Item k' eta' symbol gamma [] [] [] j ts') <- stored,
    (zeta',zeta'') <- split zeta,
    constraint (zeta',zeta'',beta),
    k == k',
    eta == eta',
    symbol == N x ]
  ++
  [ Item i theta a (alpha++[N x]) beta zeta' zeta'' j
         (ts ++ [Node x eta ts'])
  | (Item k' eta' (I x' y) gamma [] [] [] j ts') <- stored,
    k == k',
    x == x',
    eta /= [],
    head eta == y,
    tail eta == eta',
    (zeta',zeta'') <- split zeta,
    constraint (zeta',zeta'',beta) ]
complete grammar (Item i theta a alpha (I x y:beta) eta zeta k ts) stored =
  [ Item i theta a (alpha++[I x y]) beta zeta' zeta'' j
         (ts ++ [Node x eta' ts'])
  | (Item k' eta' symbol gamma [] [] [] j ts') <- stored,
    eta' /= [],
    head eta' == y,
    tail eta' == eta,
    (zeta',zeta'') <- split zeta,
    constraint (zeta',zeta'',beta),
    k == k',
    symbol == N x ]
  ++
  [ Item i theta a (alpha++[I x y]) beta zeta' zeta'' j
         (ts ++ [Node x (y:eta) ts'])
  | (Item k' eta' symbol gamma [] [] [] j ts') <- stored,
    k == k',
    symbol == I x y,
    eta == eta',
    (zeta',zeta'') <- split zeta,
    constraint (zeta',zeta'',beta) ]
complete grammar (Item k eta (N x) gamma [] [] [] j ts) stored =
  [ Item i theta a (alpha++[N x]) beta zeta' zeta'' j
         (ts' ++ [Node x eta ts])
  | (Item i theta a alpha (symbol:beta) eta' zeta k' ts') <- stored,
    k == k',
    eta == eta',
    symbol == N x,
    (zeta',zeta'') <- split zeta,
    constraint (zeta',zeta'',beta) ]
  ++
  [ Item i theta a (alpha++[I x' y]) beta zeta' zeta'' j
         (ts' ++ [Node x eta ts])
  | (Item i theta a alpha (I x' y:beta) eta' zeta k' ts') <- stored,
    k == k',
    eta /= [],
    head eta == y,
    tail eta == eta',
    x == x',
    (zeta',zeta'') <- split zeta,
    constraint (zeta',zeta'',beta) ]

complete grammar (Item k eta (I x y) gamma [] [] [] j ts) stored =
  [ Item i theta a (alpha++[N x]) beta zeta' zeta'' j
         (ts' ++ [Node x (y:eta) ts])
  | (Item i theta a alpha (symbol:beta) (y':eta') zeta k' ts') <- stored,
    k == k',
    y == y',
    eta == eta',
    symbol == N x,
    (zeta',zeta'') <- split zeta,
    constraint (zeta',zeta'',beta) ]
  ++
  [ Item i theta a (alpha++[I x' y']) beta zeta' zeta'' j
         (ts' ++ [Node x (y:eta) ts])
  | (Item i theta a alpha (I x' y':beta) eta' zeta k' ts') <- stored,
    k == k',
    x == x',
    y == y',
    eta == eta',
    (zeta',zeta'') <- split zeta,
    constraint (zeta',zeta'',beta) ]
In the implementation, we also have to specify what happens to premises of the form ⟨i, θ, A → α • wβ, η, ζ, k⟩. This is the final case of the catch-all pattern.

complete grammar item stored = []

This completes the Earley-specific part of the story.

7 Chart and Agenda

A chart plus agenda is a pair of item lists. Call this datatype a store.

type Store a b = ([Item a b],[Item a b])

The idea is to use the agenda for those items that have been proved, but whose direct consequences have not yet been derived, and the chart for the proved items the consequences of which have also been computed. We start out with an empty chart and with a list of all axioms on the agenda.

initStore :: (Eq a, Eq b) => Grammar a b -> [a] -> Store a b
initStore grammar tokens = ([], axioms grammar)

Next, we tackle the items on the agenda one by one:

- add their consequences to the agenda,
- move them from the agenda to the chart (as their consequences have been computed).
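This agenda discipline is an instance of a generic closure computation. A simplified standalone sketch (invented here, over an abstract item type, without the parsing specifics) captures the loop:

```haskell
import Data.List ((\\))

-- Compute the closure of a set of axioms under a consequence function.
-- Agenda items are proved but unprocessed; processed items move to the chart.
closure :: Eq i => (i -> [i] -> [i]) -> [i] -> [i]
closure conseq axioms = go [] axioms
  where
    go chart []             = chart
    go chart (trigger:rest) =
      let new = conseq trigger chart \\ (chart ++ (trigger:rest))
      in  go (chart ++ [trigger]) (rest ++ new)
```

For a toy deduction system where each item i under 3 yields i+1, `closure (\i _ -> [i + 1 | i < 3]) [0]` returns `[0,1,2,3]`; termination relies on the consequence relation producing only finitely many items.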
exhaustAgenda :: (Eq a, Eq b) => Grammar a b -> [a] -> Store a b -> Store a b
exhaustAgenda grammar tokens (chart,[]) = (chart,[])
exhaustAgenda grammar tokens (chart,agenda@(trigger:rest)) =
  exhaustAgenda grammar tokens (newchart,newagenda)
  where
    newchart  = chart ++ [trigger]
    store     = chart ++ agenda
    conseq    = consequences grammar tokens trigger chart
    new       = conseq \\ store
    newagenda = rest ++ new

Check whether a goal item has been found, and return the list of goal items:

goalFound :: (Eq a, Eq b) => Grammar a b -> [a] -> [Item a b] -> [Item a b]
goalFound grammar tokens store = filter gl store
  where gl = goal grammar tokens

If a parse is successful, it is nice to display the chart:

display :: Show a => [a] -> IO ()
display []     = return ()
display (x:xs) = do print x
                    display xs

Rather than displaying the whole chart, we will display only the records of the nodes that have been successfully created. To that end, we prune the chart using the following filter:

pruned :: (Eq a, Eq b) => [Item a b] -> [Item a b]
pruned = filter (\ (Item i theta s symbols symbols' eta zeta j ts) -> symbols' == [])

As output of a parse we allow either a parse tree or a chart, depending on an output flag.
data OutputKind = Tree | Chart deriving Eq

Parsing is now a matter of initializing the store, exhausting the agenda, and checking whether a goal item has been found in the chart.

parse :: (Eq a, Show a, Eq b, Show b) => Grammar a b -> [a] -> OutputKind -> IO ()
parse grammar tokens output =
  if goals /= []
    then if output == Tree
           then displayTrees ptrees
           else display (pruned chart)
    else putStrLn "no parse"
  where
    goals  = goalFound grammar tokens chart
    ptrees = getTrees (head goals)
    ptree  = head ptrees
    init   = initStore grammar tokens
    result = exhaustAgenda grammar tokens init
    chart  = fst result

Incomplete parses (for debugging):

iparse :: (Eq a, Show a, Eq b, Show b) => Grammar a b -> [a] -> IO ()
iparse grammar tokens = display chart
  where
    init   = initStore grammar tokens
    result = exhaustAgenda grammar tokens init
    chart  = fst result

Parsing with a grammar read from a file:

prs :: String -> [String] -> OutputKind -> IO ()
prs string tokens output = do
  grammar <- getGrammar string :: IO (Grammar String String)
  parse grammar tokens output
8 Testing

parseTest :: (Eq a, Eq b) => Grammar a b -> [a] -> Bool
parseTest grammar tokens = goals /= []
  where
    goals  = goalFound grammar tokens chart
    init   = initStore grammar tokens
    result = exhaustAgenda grammar tokens init
    chart  = fst result

test :: (Eq a, Show a, Eq b, Show b) => (Grammar a b, [a]) -> String
test (grammar, tokens) =
  if parseTest grammar tokens
    then show grammar ++ " " ++ show tokens ++ " succeeds"
    else show grammar ++ " " ++ show tokens ++ " fails"
suite1 :: [(Grammar Char Char, [Char])]
suite1 = [ (grammar1, ""), (grammar1, "abba"), (grammar1, "aba"),
           (grammar2, ""), (grammar2, "aba"), (grammar2, "abba"),
           (grammar2, "aaabbaaa"),
           (grammar3, ""), (grammar3, "(()())"), (grammar3, "(()()"),
           (grammar3, "((((())))()"), (grammar3, "((((())))())"),
           (grammar4, ""), (grammar4, "aabbcc"), (grammar4, "aabbbcc"),
           (grammar4, "aabbbccc"), (grammar4, "aaaaabbbbbccccc"),
           (grammar5, ""), (grammar5, "aabaaab"), (grammar5, "aabaab"),
           (grammar5, "aaaaabbaaaaabb"),
           (grammar6, ""), (grammar6, "a"), (grammar6, "ab") ]

runTests :: IO ()
runTests = sequence_ (map (putStrLn . test) suite1)

9 Function for Stand-alone Use

Module declaration:
module Main where

import DPS
import System

Definition of main function:

main :: IO ()
main = do
  args <- getArgs
  prs (args !! 0) (words (args !! 1)) Tree
This allows:

[jve@water sig]$ more grammar6
N "S" --> [N "NP", N "VP"]
N "VP" --> [N "TV", N "NP"]
N "VP" --> [T "talked"]
N "VP" --> [T "smiled"]
N "NP" --> [N "Det", N "CN"]
N "NP" --> [T "John"]
N "NP" --> [T "Mary"]
N "TV" --> [T "loved"]
N "TV" --> [T "hated"]
N "Det" --> [T "the"]
N "Det" --> [T "some"]
N "CN" --> [T "man"]
N "CN" --> [T "woman"]
N "CN" --> [N "CN", T "that", I "S" "NP"]
I "NP" "NP" --> []
[jve@water sig]$ runhugs Main grammar6 "John hated the man that loved Mary"
"S"
     "NP"
          "John"
     "VP"
          "TV"
               "hated"
          "NP"
               "Det"
                    "the"
               "CN"
                    "CN"
                         "man"
                    "that"
                    "S"["NP"]
                         "NP"["NP"]
                         "VP"
                              "TV"
                                   "loved"
                              "NP"
                                   "Mary"
[jve@water sig]$

References

[1] Aho, A. V. Indexed grammars: an extension of context-free grammars. Journal of the ACM 15, 4 (1968).
[2] Aho, A. V. Nested stack automata. Journal of the ACM 16, 3 (1969).

[3] Earley, J. An efficient context-free parsing algorithm. Communications of the ACM 13 (1970).

[4] Eijck, J. van. Sequentially indexed grammars. Manuscript, Centre for Mathematics and Computer Science, Amsterdam.

[5] Gazdar, G. Applicability of indexed grammars to natural languages. In Natural Language Parsing and Linguistic Theories, U. Reyle and C. Rohrer, Eds. Reidel, Dordrecht, 1988.

[6] Jones, S. P., Hughes, J., et al. Report on the programming language Haskell 98. Available from the Haskell homepage.

[7] Knuth, D. Literate Programming. CSLI Lecture Notes, no. 27. CSLI, Stanford.

[8] Shieber, S., Schabes, Y., and Pereira, F. Principles and implementation of deductive parsing. Journal of Logic Programming 24 (1995).
More informationSyntax Analysis: Context-free Grammars, Pushdown Automata and Parsing Part - 4. Y.N. Srikant
Syntax Analysis: Context-free Grammars, Pushdown Automata and Part - 4 Department of Computer Science and Automation Indian Institute of Science Bangalore 560 012 NPTEL Course on Principles of Compiler
More informationWhere We Are. CMSC 330: Organization of Programming Languages. This Lecture. Programming Languages. Motivation for Grammars
CMSC 330: Organization of Programming Languages Context Free Grammars Where We Are Programming languages Ruby OCaml Implementing programming languages Scanner Uses regular expressions Finite automata Parser
More informationMA513: Formal Languages and Automata Theory Topic: Context-free Grammars (CFG) Lecture Number 18 Date: September 12, 2011
MA53: Formal Languages and Automata Theory Topic: Context-free Grammars (CFG) Lecture Number 8 Date: September 2, 20 xercise: Define a context-free grammar that represents (a simplification of) expressions
More informationModels of Computation II: Grammars and Pushdown Automata
Models of Computation II: Grammars and Pushdown Automata COMP1600 / COMP6260 Dirk Pattinson Australian National University Semester 2, 2018 Catch Up / Drop in Lab Session 1 Monday 1100-1200 at Room 2.41
More informationWednesday, August 31, Parsers
Parsers How do we combine tokens? Combine tokens ( words in a language) to form programs ( sentences in a language) Not all combinations of tokens are correct programs (not all sentences are grammatically
More informationAssignment 4 CSE 517: Natural Language Processing
Assignment 4 CSE 517: Natural Language Processing University of Washington Winter 2016 Due: March 2, 2016, 1:30 pm 1 HMMs and PCFGs Here s the definition of a PCFG given in class on 2/17: A finite set
More informationMonday, September 13, Parsers
Parsers Agenda Terminology LL(1) Parsers Overview of LR Parsing Terminology Grammar G = (Vt, Vn, S, P) Vt is the set of terminals Vn is the set of non-terminals S is the start symbol P is the set of productions
More informationCSCE 314 TAMU Fall CSCE 314: Programming Languages Dr. Flemming Andersen. Haskell Functions
1 CSCE 314: Programming Languages Dr. Flemming Andersen Haskell Functions 2 Outline Defining Functions List Comprehensions Recursion 3 Conditional Expressions As in most programming languages, functions
More informationSyntax Analysis. Prof. James L. Frankel Harvard University. Version of 6:43 PM 6-Feb-2018 Copyright 2018, 2015 James L. Frankel. All rights reserved.
Syntax Analysis Prof. James L. Frankel Harvard University Version of 6:43 PM 6-Feb-2018 Copyright 2018, 2015 James L. Frankel. All rights reserved. Context-Free Grammar (CFG) terminals non-terminals start
More informationCompiler Construction: Parsing
Compiler Construction: Parsing Mandar Mitra Indian Statistical Institute M. Mitra (ISI) Parsing 1 / 33 Context-free grammars. Reference: Section 4.2 Formal way of specifying rules about the structure/syntax
More informationLecture 8: Context Free Grammars
Lecture 8: Context Free s Dr Kieran T. Herley Department of Computer Science University College Cork 2017-2018 KH (12/10/17) Lecture 8: Context Free s 2017-2018 1 / 1 Specifying Non-Regular Languages Recall
More informationTop-Down Parsing and Intro to Bottom-Up Parsing. Lecture 7
Top-Down Parsing and Intro to Bottom-Up Parsing Lecture 7 1 Predictive Parsers Like recursive-descent but parser can predict which production to use Predictive parsers are never wrong Always able to guess
More informationContext-Free Languages and Parse Trees
Context-Free Languages and Parse Trees Mridul Aanjaneya Stanford University July 12, 2012 Mridul Aanjaneya Automata Theory 1/ 41 Context-Free Grammars A context-free grammar is a notation for describing
More information4. Lexical and Syntax Analysis
4. Lexical and Syntax Analysis 4.1 Introduction Language implementation systems must analyze source code, regardless of the specific implementation approach Nearly all syntax analysis is based on a formal
More informationCompiler Design 1. Bottom-UP Parsing. Goutam Biswas. Lect 6
Compiler Design 1 Bottom-UP Parsing Compiler Design 2 The Process The parse tree is built starting from the leaf nodes labeled by the terminals (tokens). The parser tries to discover appropriate reductions,
More informationTop-Down Parsing and Intro to Bottom-Up Parsing. Lecture 7
Top-Down Parsing and Intro to Bottom-Up Parsing Lecture 7 1 Predictive Parsers Like recursive-descent but parser can predict which production to use Predictive parsers are never wrong Always able to guess
More informationBottom-Up Parsing. Lecture 11-12
Bottom-Up Parsing Lecture 11-12 (From slides by G. Necula & R. Bodik) 2/20/08 Prof. Hilfinger CS164 Lecture 11 1 Administrivia Test I during class on 10 March. 2/20/08 Prof. Hilfinger CS164 Lecture 11
More informationIntroduction to Programming, Aug-Dec 2006
Introduction to Programming, Aug-Dec 2006 Lecture 3, Friday 11 Aug 2006 Lists... We can implicitly decompose a list into its head and tail by providing a pattern with two variables to denote the two components
More information4. Lexical and Syntax Analysis
4. Lexical and Syntax Analysis 4.1 Introduction Language implementation systems must analyze source code, regardless of the specific implementation approach Nearly all syntax analysis is based on a formal
More informationParsing - 1. What is parsing? Shift-reduce parsing. Operator precedence parsing. Shift-reduce conflict Reduce-reduce conflict
Parsing - 1 What is parsing? Shift-reduce parsing Shift-reduce conflict Reduce-reduce conflict Operator precedence parsing Parsing-1 BGRyder Spring 99 1 Parsing Parsing is the reverse of doing a derivation
More informationCS1622. Today. A Recursive Descent Parser. Preliminaries. Lecture 9 Parsing (4)
CS1622 Lecture 9 Parsing (4) CS 1622 Lecture 9 1 Today Example of a recursive descent parser Predictive & LL(1) parsers Building parse tables CS 1622 Lecture 9 2 A Recursive Descent Parser. Preliminaries
More informationCS 457/557: Functional Languages
CS 457/557: Functional Languages Lists and Algebraic Datatypes Mark P Jones Portland State University 1 Why Lists? Lists are a heavily used data structure in many functional programs Special syntax is
More informationParsing. Roadmap. > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing
Roadmap > Context-free grammars > Derivations and precedence > Top-down parsing > Left-recursion > Look-ahead > Table-driven parsing The role of the parser > performs context-free syntax analysis > guides
More informationCSCI312 Principles of Programming Languages!
CSCI312 Principles of Programming Languages!! Chapter 3 Regular Expression and Lexer Xu Liu Recap! Copyright 2006 The McGraw-Hill Companies, Inc. Clite: Lexical Syntax! Input: a stream of characters from
More informationLR Parsing. Leftmost and Rightmost Derivations. Compiler Design CSE 504. Derivations for id + id: T id = id+id. 1 Shift-Reduce Parsing.
LR Parsing Compiler Design CSE 504 1 Shift-Reduce Parsing 2 LR Parsers 3 SLR and LR(1) Parsers Last modifled: Fri Mar 06 2015 at 13:50:06 EST Version: 1.7 16:58:46 2016/01/29 Compiled at 12:57 on 2016/02/26
More informationSyntax Analysis Part I
Syntax Analysis Part I Chapter 4: Context-Free Grammars Slides adapted from : Robert van Engelen, Florida State University Position of a Parser in the Compiler Model Source Program Lexical Analyzer Token,
More informationAmbiguous Grammars and Compactification
Ambiguous Grammars and Compactification Mridul Aanjaneya Stanford University July 17, 2012 Mridul Aanjaneya Automata Theory 1/ 44 Midterm Review Mathematical Induction and Pigeonhole Principle Finite Automata
More informationLecture Notes on Shift-Reduce Parsing
Lecture Notes on Shift-Reduce Parsing 15-411: Compiler Design Frank Pfenning, Rob Simmons, André Platzer Lecture 8 September 24, 2015 1 Introduction In this lecture we discuss shift-reduce parsing, which
More informationLexical Analysis. Dragon Book Chapter 3 Formal Languages Regular Expressions Finite Automata Theory Lexical Analysis using Automata
Lexical Analysis Dragon Book Chapter 3 Formal Languages Regular Expressions Finite Automata Theory Lexical Analysis using Automata Phase Ordering of Front-Ends Lexical analysis (lexer) Break input string
More informationFinite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018
Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018 Lecture 11 Ana Bove April 26th 2018 Recap: Regular Languages Decision properties of RL: Is it empty? Does it contain this word? Contains
More informationCMSC 330: Organization of Programming Languages
CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler
More informationUNIT III & IV. Bottom up parsing
UNIT III & IV Bottom up parsing 5.0 Introduction Given a grammar and a sentence belonging to that grammar, if we have to show that the given sentence belongs to the given grammar, there are two methods.
More information1 Lexical Considerations
Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.035, Spring 2013 Handout Decaf Language Thursday, Feb 7 The project for the course is to write a compiler
More informationCompilers. Yannis Smaragdakis, U. Athens (original slides by Sam
Compilers Parsing Yannis Smaragdakis, U. Athens (original slides by Sam Guyer@Tufts) Next step text chars Lexical analyzer tokens Parser IR Errors Parsing: Organize tokens into sentences Do tokens conform
More informationSyntax Analysis Check syntax and construct abstract syntax tree
Syntax Analysis Check syntax and construct abstract syntax tree if == = ; b 0 a b Error reporting and recovery Model using context free grammars Recognize using Push down automata/table Driven Parsers
More informationLL(1) predictive parsing
LL(1) predictive parsing Informatics 2A: Lecture 11 Mary Cryan School of Informatics University of Edinburgh mcryan@staffmail.ed.ac.uk 10 October 2018 1 / 15 Recap of Lecture 10 A pushdown automaton (PDA)
More informationCMSC 330: Organization of Programming Languages. Architecture of Compilers, Interpreters
: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Scanner Parser Static Analyzer Intermediate Representation Front End Back End Compiler / Interpreter
More informationTalen en Compilers. Jurriaan Hage , period 2. November 13, Department of Information and Computing Sciences Utrecht University
Talen en Compilers 2017-2018, period 2 Jurriaan Hage Department of Information and Computing Sciences Utrecht University November 13, 2017 1. Introduction 1-1 This lecture Introduction Course overview
More informationChapter 3. Describing Syntax and Semantics
Chapter 3 Describing Syntax and Semantics Chapter 3 Topics Introduction The General Problem of Describing Syntax Formal Methods of Describing Syntax Attribute Grammars Describing the Meanings of Programs:
More informationCSCE 314 Programming Languages
CSCE 314 Programming Languages Syntactic Analysis Dr. Hyunyoung Lee 1 What Is a Programming Language? Language = syntax + semantics The syntax of a language is concerned with the form of a program: how
More informationDefining Program Syntax. Chapter Two Modern Programming Languages, 2nd ed. 1
Defining Program Syntax Chapter Two Modern Programming Languages, 2nd ed. 1 Syntax And Semantics Programming language syntax: how programs look, their form and structure Syntax is defined using a kind
More informationSLR parsers. LR(0) items
SLR parsers LR(0) items As we have seen, in order to make shift-reduce parsing practical, we need a reasonable way to identify viable prefixes (and so, possible handles). Up to now, it has not been clear
More informationFormal Languages and Compilers Lecture VII Part 3: Syntactic A
Formal Languages and Compilers Lecture VII Part 3: Syntactic Analysis Free University of Bozen-Bolzano Faculty of Computer Science POS Building, Room: 2.03 artale@inf.unibz.it http://www.inf.unibz.it/
More informationProgramming Languages (CS 550) Lecture 4 Summary Scanner and Parser Generators. Jeremy R. Johnson
Programming Languages (CS 550) Lecture 4 Summary Scanner and Parser Generators Jeremy R. Johnson 1 Theme We have now seen how to describe syntax using regular expressions and grammars and how to create
More informationCMSC 330: Organization of Programming Languages
CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler
More informationLANGUAGE PROCESSORS. Presented By: Prof. S.J. Soni, SPCE Visnagar.
LANGUAGE PROCESSORS Presented By: Prof. S.J. Soni, SPCE Visnagar. Introduction Language Processing activities arise due to the differences between the manner in which a software designer describes the
More informationParser Tools: lex and yacc-style Parsing
Parser Tools: lex and yacc-style Parsing Version 6.11.0.6 Scott Owens January 6, 2018 This documentation assumes familiarity with lex and yacc style lexer and parser generators. 1 Contents 1 Lexers 3 1.1
More informationSyntax Analysis. Martin Sulzmann. Martin Sulzmann Syntax Analysis 1 / 38
Syntax Analysis Martin Sulzmann Martin Sulzmann Syntax Analysis 1 / 38 Syntax Analysis Objective Recognize individual tokens as sentences of a language (beyond regular languages). Example 1 (OK) Program
More informationLR Parsing Techniques
LR Parsing Techniques Introduction Bottom-Up Parsing LR Parsing as Handle Pruning Shift-Reduce Parser LR(k) Parsing Model Parsing Table Construction: SLR, LR, LALR 1 Bottom-UP Parsing A bottom-up parser
More informationChapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part 1
Chapter 3: CONTEXT-FREE GRAMMARS AND PARSING Part 1 1. Introduction Parsing is the task of Syntax Analysis Determining the syntax, or structure, of a program. The syntax is defined by the grammar rules
More informationLecture 7: Deterministic Bottom-Up Parsing
Lecture 7: Deterministic Bottom-Up Parsing (From slides by G. Necula & R. Bodik) Last modified: Tue Sep 20 12:50:42 2011 CS164: Lecture #7 1 Avoiding nondeterministic choice: LR We ve been looking at general
More informationSection A. A grammar that produces more than one parse tree for some sentences is said to be ambiguous.
Section A 1. What do you meant by parser and its types? A parser for grammar G is a program that takes as input a string w and produces as output either a parse tree for w, if w is a sentence of G, or
More informationChapter 4: LR Parsing
Chapter 4: LR Parsing 110 Some definitions Recall For a grammar G, with start symbol S, any string α such that S called a sentential form α is If α Vt, then α is called a sentence in L G Otherwise it is
More informationCMSC 330: Organization of Programming Languages. Context Free Grammars
CMSC 330: Organization of Programming Languages Context Free Grammars 1 Architecture of Compilers, Interpreters Source Analyzer Optimizer Code Generator Abstract Syntax Tree Front End Back End Compiler
More informationPROGRAMMING IN HASKELL. Chapter 2 - First Steps
PROGRAMMING IN HASKELL Chapter 2 - First Steps 0 The Hugs System Hugs is an implementation of Haskell 98, and is the most widely used Haskell system; The interactive nature of Hugs makes it well suited
More informationLecture 9: General and Bottom-Up Parsing. Last modified: Sun Feb 18 13:49: CS164: Lecture #9 1
Lecture 9: General and Bottom-Up Parsing Last modified: Sun Feb 18 13:49:40 2018 CS164: Lecture #9 1 A Little Notation Here and in lectures to follow, we ll often have to refer to general productions or
More informationIn One Slide. Outline. LR Parsing. Table Construction
LR Parsing Table Construction #1 In One Slide An LR(1) parsing table can be constructed automatically from a CFG. An LR(1) item is a pair made up of a production and a lookahead token; it represents a
More informationContext-Free Grammars
Department of Linguistics Ohio State University Syntax 2 (Linguistics 602.02) January 3, 2012 (CFGs) A CFG is an ordered quadruple T, N, D, P where a. T is a finite set called the terminals; b. N is a
More informationPROGRAMMING IN HASKELL. CS Chapter 6 - Recursive Functions
PROGRAMMING IN HASKELL CS-205 - Chapter 6 - Recursive Functions 0 Introduction As we have seen, many functions can naturally be defined in terms of other functions. factorial :: Int Int factorial n product
More informationThe List Datatype. CSc 372. Comparative Programming Languages. 6 : Haskell Lists. Department of Computer Science University of Arizona
The List Datatype CSc 372 Comparative Programming Languages 6 : Haskell Lists Department of Computer Science University of Arizona collberg@gmail.com All functional programming languages have the ConsList
More informationMIT Parse Table Construction. Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology
MIT 6.035 Parse Table Construction Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology Parse Tables (Review) ACTION Goto State ( ) $ X s0 shift to s2 error error goto s1
More informationChapter 4. Lexical and Syntax Analysis
Chapter 4 Lexical and Syntax Analysis Chapter 4 Topics Introduction Lexical Analysis The Parsing Problem Recursive-Descent Parsing Bottom-Up Parsing Copyright 2012 Addison-Wesley. All rights reserved.
More informationA left-sentential form is a sentential form that occurs in the leftmost derivation of some sentence.
Bottom-up parsing Recall For a grammar G, with start symbol S, any string α such that S α is a sentential form If α V t, then α is a sentence in L(G) A left-sentential form is a sentential form that occurs
More informationMIT Specifying Languages with Regular Expressions and Context-Free Grammars. Martin Rinard Massachusetts Institute of Technology
MIT 6.035 Specifying Languages with Regular essions and Context-Free Grammars Martin Rinard Massachusetts Institute of Technology Language Definition Problem How to precisely define language Layered structure
More informationMIT Specifying Languages with Regular Expressions and Context-Free Grammars
MIT 6.035 Specifying Languages with Regular essions and Context-Free Grammars Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology Language Definition Problem How to precisely
More informationLexical and Syntax Analysis. Top-Down Parsing
Lexical and Syntax Analysis Top-Down Parsing Easy for humans to write and understand String of characters Lexemes identified String of tokens Easy for programs to transform Data structure Syntax A syntax
More informationCS453 : JavaCUP and error recovery. CS453 Shift-reduce Parsing 1
CS453 : JavaCUP and error recovery CS453 Shift-reduce Parsing 1 Shift-reduce parsing in an LR parser LR(k) parser Left-to-right parse Right-most derivation K-token look ahead LR parsing algorithm using
More informationLexical Analysis (ASU Ch 3, Fig 3.1)
Lexical Analysis (ASU Ch 3, Fig 3.1) Implementation by hand automatically ((F)Lex) Lex generates a finite automaton recogniser uses regular expressions Tasks remove white space (ws) display source program
More informationLanguages and Compilers
Principles of Software Engineering and Operational Systems Languages and Compilers SDAGE: Level I 2012-13 3. Formal Languages, Grammars and Automata Dr Valery Adzhiev vadzhiev@bournemouth.ac.uk Office:
More informationLecture 8: Deterministic Bottom-Up Parsing
Lecture 8: Deterministic Bottom-Up Parsing (From slides by G. Necula & R. Bodik) Last modified: Fri Feb 12 13:02:57 2010 CS164: Lecture #8 1 Avoiding nondeterministic choice: LR We ve been looking at general
More informationINFOB3TC Solutions for the Exam
Department of Information and Computing Sciences Utrecht University INFOB3TC Solutions for the Exam Johan Jeuring Monday, 13 December 2010, 10:30 13:00 lease keep in mind that often, there are many possible
More informationThe CYK Algorithm. We present now an algorithm to decide if w L(G), assuming G to be in Chomsky Normal Form.
CFG [1] The CYK Algorithm We present now an algorithm to decide if w L(G), assuming G to be in Chomsky Normal Form. This is an example of the technique of dynamic programming Let n be w. The natural algorithm
More informationCOMPILER DESIGN - QUICK GUIDE COMPILER DESIGN - OVERVIEW
COMPILER DESIGN - QUICK GUIDE http://www.tutorialspoint.com/compiler_design/compiler_design_quick_guide.htm COMPILER DESIGN - OVERVIEW Copyright tutorialspoint.com Computers are a balanced mix of software
More informationUnit 13. Compiler Design
Unit 13. Compiler Design Computers are a balanced mix of software and hardware. Hardware is just a piece of mechanical device and its functions are being controlled by a compatible software. Hardware understands
More informationCMSC 330: Organization of Programming Languages
CMSC 330: Organization of Programming Languages Context Free Grammars and Parsing 1 Recall: Architecture of Compilers, Interpreters Source Parser Static Analyzer Intermediate Representation Front End Back
More information