Applications of Tree Automata Theory Lecture I: Tree Automata


Applications of Tree Automata Theory
Lecture I: Tree Automata
Andreas Maletti
Institute of Computer Science, Universität Leipzig, Germany
on leave from: Institute for Natural Language Processing, Universität Stuttgart, Germany
Yekaterinburg, August 23, 2014

Roadmap
1. Theory of Tree Automata
2. Parsing: Basics and Evaluation
3. Parsing: Advanced Topics
4. Machine Translation: Basics and Evaluation
5. Theory of Tree Transducers
6. Machine Translation: Advanced Topics
Always ask questions right away!

Trees: Motivation and Notation

Trees: Parses
Example sentence: "We must bear in mind the Community as a whole"
[Figure: constituency parse tree of the sentence with root S and preterminals PRP, MD, VB, IN, NN, DT.]

Trees: Parses
Example sentence (Russian): «И вы действительно сможете изменить свою жизнь, воспользовавшись принципом 80 к 20» ("And you really can change your life by using the 80/20 principle")
[Figure: dependency parse of the sentence with edge labels conj, subj, adv, auxs, acc, adj, misc, ins, gen, card.]

Trees: XML
<library>
  <book>
    <title>The Golden Ticket</title>
    <author>Lance Fortnow</author>
    <publisher>Princeton Univ. Press</publisher>
  </book>
  <book>
    <title>Anna Karenina</title>
    <author>Leo Tolstoy</author>
    <publisher>The Russian Messenger</publisher>
  </book>
</library>
[Figure: the same document drawn as a tree. The root library has two book children; each book has the children title, author, publisher, whose leaves carry the text content.]

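Trees
Sets $\Sigma$ and $Q$
Definition. The set $T_\Sigma(Q)$ of $\Sigma$-trees indexed by $Q$ is the smallest set $T$ such that
- $q \in T$ for all $q \in Q$
- $\sigma(t_1, \dots, t_k) \in T$ for all $k \in \mathbb{N}$, $\sigma \in \Sigma$, and $t_1, \dots, t_k \in T$
We assume $\Sigma \cap Q = \emptyset$ and write $\sigma$ instead of $\sigma()$.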

Trees
Example. $\Sigma = \{\sigma, \alpha\}$ and $Q = \{q, p\}$
$\sigma(\sigma(p, q), \sigma(\sigma(q, \sigma))) \in T_\Sigma(Q)$
[Figure: the tree drawn top-down. The root $\sigma$ has the children $\sigma(p, q)$ and $\sigma(\sigma(q, \sigma))$.]
Notes: obvious recursion & induction principle

Trees: Recursion
Definition (Gorn address). Mapping $\mathrm{pos} \colon T_\Sigma(Q) \to 2^{\mathbb{N}^*}$ assigning positions
- $\mathrm{pos}(q) = \{\varepsilon\}$
- $\mathrm{pos}(\sigma(t_1, \dots, t_k)) = \{\varepsilon\} \cup \{i.w \mid 1 \leq i \leq k,\ w \in \mathrm{pos}(t_i)\}$
Definition. Leaves of $t \in T_\Sigma(Q)$: $\mathrm{leaves}(t) = \{w \in \mathrm{pos}(t) \mid w.1 \notin \mathrm{pos}(t)\}$
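The two definitions above can be sketched in a few lines. This is an illustration under assumptions: trees are nested tuples `(label, child_1, ..., child_k)`, and a Gorn address is a tuple of positive integers with the empty tuple standing for $\varepsilon$. The function names mirror the slide; the encoding is not from the lecture.

```python
def pos(tree):
    """Set of Gorn addresses of tree; the root has the address ()."""
    _, *children = tree
    result = {()}
    for i, child in enumerate(children, start=1):
        result |= {(i,) + w for w in pos(child)}
    return result

def leaves(tree):
    """Positions w of tree such that w.1 is not a position of tree."""
    positions = pos(tree)
    return {w for w in positions if w + (1,) not in positions}

# The tree σ(σ(p, q), σ(σ(q, σ))) in the assumed tuple encoding
EXAMPLE = ("σ", ("σ", ("p",), ("q",)), ("σ", ("σ", ("q",), ("σ",))))
```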

Trees: Recursion
[Figure: parse tree of "We must bear in mind the Community as a whole" with the leaves marked in red.]
[Figure: the same tree with each edge annotated by the child index used in Gorn addresses.]
Address of the marked DT: [value lost in extraction]

Trees
Trees $t, u \in T_\Sigma(Q)$ and position $w \in \mathrm{pos}(t)$ in $t$
Notation
- $t(w)$ = label of $t$ at position $w$
- $t|_w$ = subtree rooted in $w$
- $t[u]_w$ = tree obtained by replacing the subtree at $w$ in $t$ by $u$
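The three notations correspond to three small operations on trees. A minimal sketch, again assuming the nested-tuple encoding `(label, child_1, ..., child_k)` with positions as tuples of child indices; the names `label_at`, `subtree`, and `replace` are hypothetical and valid positions are assumed.

```python
def subtree(tree, w):
    """t|_w: the subtree of tree rooted at position w."""
    for i in w:
        tree = tree[i]        # child i sits at tuple index i (index 0 holds the label)
    return tree

def label_at(tree, w):
    """t(w): the label of tree at position w."""
    return subtree(tree, w)[0]

def replace(tree, w, u):
    """t[u]_w: tree with the subtree at position w replaced by u."""
    if not w:
        return u
    i = w[0]
    return tree[:i] + (replace(tree[i], w[1:], u),) + tree[i + 1:]

# The tree σ(σ(p, q), σ(σ(q, σ))) in the assumed tuple encoding
EXAMPLE = ("σ", ("σ", ("p",), ("q",)), ("σ", ("σ", ("q",), ("σ",))))
```

Replacing a subtree by itself returns the original tree, which mirrors the identity $t = t[t|_w]_w$.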

Trees
[Figure: the parse tree of "We must bear in mind the Community as a whole" with child indices on its edges.]
- $t(2.2.1.1) = \text{bear}$
- $t|_w = X(\mathrm{DT}(\text{the}), \mathrm{NN}(\text{Community}))$
- $t = t[X(\mathrm{DT}(\text{the}), \mathrm{NN}(\text{Community}))]_w$
Here $w$ is the position of the subtree above "the Community" and $X$ is its root label; both are missing from the transcript.

Tree Language Representations

Tree Language
Tree language (or forest) = subset of $T_\Sigma(Q)$
Motivation
- parses of a sentence: parse forest
- translations of an input sentence: translation forest
- valid XML documents: XML schema
- ...

Tree Language
How to represent a set of trees?
- enumerate them
- enumerate them cleverly (e.g., add sharing)

Tree Language
$L = \{\sigma(\sigma(p, q), \sigma(\sigma(q, \sigma))),\ \alpha(\sigma(q, \sigma), \alpha)\}$
Enumeration of $L$: [Figure: the two trees drawn side by side.]
Subtree-shared enumeration of $L$ (packed forest): [Figure: both trees drawn with the common subtree $\sigma(q, \sigma)$ shared.]

Tree Language
How to represent a set of trees?
- enumerate them
- enumerate them cleverly (packed forest)
- parse forest of a CFG

Parse Forest of a CFG
Example
[Figure: a list of CFG productions (partly lost in extraction; it includes MD → must) next to the parse tree of "We must bear in mind the Community as a whole", which is built up production by production.]

Local Tree Grammar
Definition. A local tree grammar (LTG) is a CFG $G = (N, Q, S, P)$ with
- finite set $N$ of nonterminals
- finite set $Q$ of terminals
- $S \subseteq N$ start nonterminals
- finite set $P \subseteq N \times (N \cup Q)^*$ of productions
It will compute the derivation trees of the CFG.

Local Tree Grammar
LTG $G = (N, Q, S, P)$
Definition (Generated tree language). $L(G)$ contains exactly the trees $t \in T_N(Q)$ such that
- $t(\varepsilon) \in S$ (root label in $S$)
- $t(w) \to t(w.1) \cdots t(w.k) \in P$ for every internal node $w \in \mathrm{pos}(t)$ with $\{i \in \mathbb{N} \mid w.i \in \mathrm{pos}(t)\} = \{1, \dots, k\}$ (label → child labels is a production of $G$)
- $t(w) \in Q$ for all $w \in \mathrm{leaves}(t)$ (leaves labeled by $Q$)
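The three conditions on $L(G)$ can be checked by one pass over the tree. The sketch below assumes the nested-tuple tree encoding `(label, child_1, ..., child_k)` and productions given as pairs `(lhs, rhs)` with `rhs` a tuple of labels. The example grammar for "My dog sleeps" uses the phrase labels NP and VP, which are assumptions since the transcript lost the corresponding labels.

```python
def ltg_accepts(tree, start, productions, terminals):
    """Check the three conditions that define L(G) for a local tree grammar."""
    def ok(node):
        label, *children = node
        if not children:                        # leaf: must carry a terminal
            return label in terminals
        rhs = tuple(child[0] for child in children)
        # internal node: label -> child labels must be a production
        return (label, rhs) in productions and all(ok(c) for c in children)

    return tree[0] in start and ok(tree)

# Hypothetical example grammar (labels NP and VP are assumed)
START = {"S"}
TERMINALS = {"My", "dog", "sleeps"}
PRODUCTIONS = {
    ("S", ("NP", "VP")),
    ("NP", ("PRP$", "NN")),
    ("VP", ("VBZ",)),
    ("PRP$", ("My",)),
    ("NN", ("dog",)),
    ("VBZ", ("sleeps",)),
}
TREE = ("S",
        ("NP", ("PRP$", ("My",)), ("NN", ("dog",))),
        ("VP", ("VBZ", ("sleeps",))))
```

Every node is inspected once and each check only looks at a node and its children, which is the sense in which the language is "local".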

Local Tree Grammar
Observation. LTGs generate exactly the parse forests of CFGs.
Theorem. Local tree languages (LTL), the tree languages generated by LTGs, have the following closure properties:

closed under                 CFL (strings)   LTL (trees)
union                        yes             no
intersection                 no              yes
(rank-bounded) complement    no              no
relabeling                   yes             no

Local Tree Grammar
Properties
- simple
- no ambiguity (unique explanation for each generated tree)
- not closed under (most) Boolean operations (union/intersection/complement: no/yes/no)
- not closed under (non-injective) relabelings
- ...

Local Tree Grammar
LTG $G = (N, Q, S, P)$
No ambiguity
[Figure: parse tree of "My dog sleeps" with root S and preterminals PRP$, NN, VBZ.]
The tree is in $L(G)$ if and only if
1. S is a start nonterminal
2. all the productions in it are in $P$
3. all leaves are labeled by elements of $Q$

Local Tree Grammar
Observation. Local tree languages are not closed under union.
Proof. The following single-element tree languages are local:
[Figure: the parse tree of "My dog sleeps" (preterminals PRP$, NN, VBZ) and the parse tree of "I scored well" (preterminals PRP, VBD, RB).]
But their union is not local, as it must also generate trees for "My dog scored well" and "*I sleeps".
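The argument can be replayed mechanically: any LTG that generates both sample trees must contain their root labels, productions, and leaf labels, and therefore also generates every tree assembled from those pieces. The sketch below assumes the nested-tuple tree encoding; the phrase labels NP, VP, and ADVP are assumptions because the transcript lost them, and all function names are hypothetical.

```python
def productions_of(tree):
    """All (label, child-labels) pairs at the internal nodes of tree."""
    label, *children = tree
    if not children:
        return set()
    prods = {(label, tuple(c[0] for c in children))}
    for c in children:
        prods |= productions_of(c)
    return prods

def leaf_labels(tree):
    """Labels of the leaves of tree."""
    label, *children = tree
    if not children:
        return {label}
    return set().union(*(leaf_labels(c) for c in children))

def forced_by(samples, tree):
    """True if every local tree language containing all samples contains tree."""
    starts = {t[0] for t in samples}
    prods = set().union(*(productions_of(t) for t in samples))
    terminals = set().union(*(leaf_labels(t) for t in samples))
    return (tree[0] in starts
            and productions_of(tree) <= prods
            and leaf_labels(tree) <= terminals)

T1 = ("S", ("NP", ("PRP$", ("My",)), ("NN", ("dog",))),
           ("VP", ("VBZ", ("sleeps",))))
T2 = ("S", ("NP", ("PRP", ("I",))),
           ("VP", ("VBD", ("scored",)), ("ADVP", ("RB", ("well",)))))
# "My dog scored well": subject of T1 combined with the verb phrase of T2
MIXED = ("S", T1[1], T2[2])
```

Here `forced_by([T1, T2], MIXED)` holds although `MIXED` is neither `T1` nor `T2`, so no local tree language equals the union of the two singletons.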

Tree Language
How to represent a set of trees?
- enumerate them
- enumerate them cleverly (packed forest)
- parse forest of a CFG (local tree languages)
- tree substitution grammar

Local Tree Grammar
CFG production $L \to R_1 R_2 R_3$ represented by the tree $L(R_1, R_2, R_3)$
Glue fragments together to obtain larger trees:
[Figure: the parse tree starting with S, PRP(We), MD(must), VB, PP is assembled step by step from one-level fragments.]
But why only small tree fragments?

Tree Substitution Grammar
Definition (JOSHI 1969). A tree substitution grammar (TSG) is a tuple $(N, Q, S, P)$ with
- finite set $N$ of nonterminals
- finite set $Q$ of terminals
- $S \subseteq N$ start nonterminals
- finite set $P \subseteq T_N(Q) \setminus Q$ of tree fragments
Typical fragments [POST 2011]
[Figure: example fragments over the labels S, VBD, PP, CD, PRP, TO.]

Tree Substitution Grammar
TSG $G = (N, Q, S, P)$ and sentential forms $\xi, \zeta \in T_N(Q)$
Definition. $\xi \Rightarrow_G \zeta$ if there exist a tree fragment $t \in P$ and a position $w \in \mathrm{leaves}(\xi)$ such that $\xi = \xi[t(\varepsilon)]_w$ and $\zeta = \xi[t]_w$.
Intuition. In a derivation step:
1. Find a leaf labeled by $n \in N$
2. Find a fragment $t \in P$ such that $t(\varepsilon) = n$
3. Replace the selected $n$ by $t$
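The three-step intuition can be sketched as a function that enumerates all one-step successors of a sentential form. The code assumes the nested-tuple tree encoding and positions as tuples of child indices; the fragment list uses the labels NP and VP as assumptions (the transcript lost them), and the function names are hypothetical.

```python
def leaf_positions(tree, prefix=()):
    """Yield the Gorn addresses of all leaves of tree."""
    _, *children = tree
    if not children:
        yield prefix
    for i, child in enumerate(children, start=1):
        yield from leaf_positions(child, prefix + (i,))

def label_at(tree, w):
    """Label of tree at position w."""
    for i in w:
        tree = tree[i]
    return tree[0]

def replace(tree, w, u):
    """Tree obtained from tree by replacing the subtree at w by u."""
    if not w:
        return u
    i = w[0]
    return tree[:i] + (replace(tree[i], w[1:], u),) + tree[i + 1:]

def derivation_steps(xi, fragments):
    """All zeta such that xi derives zeta in one step."""
    return [
        replace(xi, w, t)                 # step 3: substitute the fragment
        for w in leaf_positions(xi)       # step 1: pick a leaf
        for t in fragments
        if t[0] == label_at(xi, w)        # step 2: fragment root matches the leaf
    ]

# Hypothetical fragments (labels NP and VP are assumed)
FRAGMENTS = [
    ("S", ("NP",), ("VP",)),
    ("NP", ("PRP",)),
    ("PRP", ("We",)),
    ("VP", ("MD",), ("VP",)),
    ("MD", ("must",)),
]
```

Starting from the single node `("S",)`, repeated calls to `derivation_steps` enumerate the sentential forms of the grammar level by level.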

Tree Substitution Grammar
Example
Fragments: S(_, _), _(MD, _), _(VB, PP, _), _(PRP), PRP(We), MD(must), where each blank marks a label lost in extraction
[Figure: step-by-step derivation that substitutes these fragments to build the tree starting with S, PRP(We), MD(must), VB, PP.]

Tree Substitution Grammar
TSG $G = (N, Q, S, P)$ and $\mathrm{TSL} = \{L(G) \mid G \text{ TSG}\}$
Definition. For all $n \in N$:
$L(G, n) = \{t \in T_N(Q) \mid \forall w \in \mathrm{leaves}(t) \colon t(w) \in Q,\ n \Rightarrow_G^* t\}$
$L(G) = \bigcup_{s \in S} L(G, s)$
Theorem
1. $\mathrm{FIN} \subseteq \mathrm{TSL}$ (all finite languages are TSL)
2. $\mathrm{LTL} \subseteq \mathrm{TSL}$ (all local tree languages are TSL)

Tree Substitution Grammar
Theorem. Tree substitution languages (TSL) have the following properties:

closed under                 CFL    LTL    TSL
union                        yes    no     no
intersection                 no     yes    no
(rank-bounded) complement    no     no     no
relabeling                   yes    no     no

Tree Substitution Grammar
Properties
- simple
- more expressive than local tree grammars
- ambiguity (several explanations for a generated tree)
- not closed under Boolean operations (union/intersection/complement: no/no/no)
- not closed under (non-injective) relabelings
- ...

Tree Substitution Grammar
Theorem. Tree substitution languages are not closed under union.
Proof. The counterexample must be infinite; an artificial example:
$L_1 = \{S(C^n(a), a) \mid n \in \mathbb{N}\}$
$L_2 = \{S(C^n(b), b) \mid n \in \mathbb{N}\}$
[Figure: one tree of each language, with a chain of C nodes above a (respectively b) as first child of S.]
Their union is not a tree substitution language.

Tree Substitution Grammar
Theorem. Tree substitution languages are not closed under intersection.
Proof. Ideas?

Tree Language
How to represent a set of trees?
- enumerate them
- enumerate them cleverly (packed forest)
- parse forest of a CFG (local tree languages)
- tree substitution grammar
- regular tree grammar

Regular Tree Grammar
Definition (BRAINERD, 1969). A regular tree grammar (RTG) is a tuple $G = (N, \Sigma, S, P)$ with
- finite set $N$ of nonterminals
- finite set $\Sigma$ of terminals
- $S \subseteq N$ start nonterminals
- finite set $P \subseteq N \times T_\Sigma(N)$ of productions
Remark. Instead of $(n, t)$ we write $n \to t$.

Regular Tree Grammar
Example
$N = \{n_0, n_1, n_2, n_3, n_4, n_5, n_6\}$, $\Sigma$ contains S among other labels, and $S = \{n_0\}$
[Figure: the productions drawn as trees over the nonterminals $n_0, \dots, n_6$; their exact shape is not recoverable from the transcript.]

Regular Tree Grammar
RTG $G = (N, \Sigma, S, P)$ and sentential forms $\xi, \zeta \in T_\Sigma(N)$
Definition (Derivation Semantics). $\xi \Rightarrow_G \zeta$ if there exist a production $n \to t \in P$ and a position $w \in \mathrm{leaves}(\xi)$ such that $\xi = \xi[n]_w$ and $\zeta = \xi[t]_w$.
Definition (Recognized tree language). $L(G) = \{t \in T_\Sigma \mid \exists s \in S \colon s \Rightarrow_G^* t\}$

Regular Tree Grammar
[Figure: the example productions over $n_0, \dots, n_6$ and a derivation $n_0 \Rightarrow_G \cdots \Rightarrow_G \cdots$ that expands one nonterminal per step; the trees are not recoverable from the transcript.]

Regular Tree Grammar
Regular tree languages: $\mathrm{RTL} = \{L(G) \mid G \text{ RTG}\}$
Theorem. tree substitution languages $\subsetneq$ regular tree languages
Proof. We can express the union counterexample easily.

Regular Tree Grammar
Theorem. Regular tree languages (RTL) have the following properties:

closed under                 CFL    LTL    TSL    RTL
union                        yes    no     no     yes
intersection                 no     yes    no     yes
(rank-bounded) complement    no     no     no     yes
relabeling                   yes    no     no     yes

Regular Tree Grammar
Properties
- simple
- more expressive than tree substitution grammars
- ambiguity (several explanations for a generated tree)
- closed under all Boolean operations (union/intersection/complement: yes/yes/yes)
- closed under (non-injective) relabelings
- ...

Regular Tree Grammar
RTG $G = (N, \Sigma, S, P)$
Definition (BRAINERD, 1969). $G$ is in normal form if $t = \sigma(n_1, \dots, n_k)$ with $\sigma \in \Sigma$ and $n_1, \dots, n_k \in N$ for all $n \to t \in P$.
[Figure: the example productions over $n_0, \dots, n_6$, used to illustrate which of them are in normal form.]

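Regular Tree Grammar
Theorem (BRAINERD, 1969). Every RTG is equivalent to an RTG in normal form.
Proof. Simply cut large rules, introducing new states.
[Figure: a production for $n_0$ whose right-hand side has root S, the child $n_6$, and a nested subtree over $n_2$ and $n_4$ is cut into two productions: $n_0 \to S(n_6, n)$ and a production from the new state $n$ to the nested subtree.]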
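The cutting step from the proof can be sketched as a worklist algorithm. This is an illustration under stated assumptions: right-hand sides are nested tuples `(label, child_1, ..., child_k)` with nonterminals as leaves, there are no chain productions $n \to n'$ (the slide does not treat them), and the generated names `_new0`, `_new1`, ... do not clash with existing nonterminals. The label VP in the example is assumed, since the transcript lost the inner label.

```python
from itertools import count

def normalize(productions, nonterminals):
    """Cut large right-hand sides into normal-form productions.

    Input productions are pairs (n, t); output productions are pairs
    (n, (sigma, n_1, ..., n_k)) whose arguments are nonterminal names.
    """
    fresh = (f"_new{i}" for i in count())      # hypothetical fresh-name scheme
    result = []
    todo = list(productions)
    while todo:
        n, (sigma, *children) = todo.pop()
        new_children = []
        for child in children:
            if len(child) == 1 and child[0] in nonterminals:
                new_children.append(child[0])  # already a nonterminal leaf
            else:
                m = next(fresh)                # new state for this subtree
                todo.append((m, child))
                new_children.append(m)
        result.append((n, (sigma, *new_children)))
    return result

NONTERMINALS = {"n0", "n2", "n4", "n6"}
# n0 -> S(n6, VP(n2, n4)), with the inner label VP assumed
LARGE = [("n0", ("S", ("n6",), ("VP", ("n2",), ("n4",))))]
```

Each new state derives exactly the subtree it replaced, so the generated tree language is unchanged.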

Standard Representation: Tree Automata

Tree Automaton
Definition (THATCHER, 1970; ROUNDS, 1970). A tree automaton (TA) is an RTG in normal form.
Remarks
- bottom-up: rules written as $\sigma(n_1, \dots, n_k) \to n$
- top-down: rules written as $n \to \sigma(n_1, \dots, n_k)$

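Tree Automaton
Definition
- top-down deterministic if for all $n \in N$, $k \in \mathbb{N}$, $\sigma \in \Sigma$ there is at most one $(n_1, \dots, n_k) \in N^k$ with $n \to \sigma(n_1, \dots, n_k) \in P$, and $|S| = 1$
- bottom-up deterministic if for all $k \in \mathbb{N}$, $\sigma \in \Sigma$, $n_1, \dots, n_k \in N$ there is at most one $n \in N$ with $\sigma(n_1, \dots, n_k) \to n \in P$
(On the slide, the parts marked in red determine the parts marked in blue.)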

Tree Automaton
Theorem (THATCHER, WRIGHT, 1968; DONER, 1970). top-down det. RTL $\subsetneq$ bottom-up det. RTL $=$ RTL
Proof. Let $G = (N, \Sigma, S, P)$ be a tree automaton. We construct the bottom-up det. TA $G' = (2^N, \Sigma, S', P')$ with
- $S' = \{N' \subseteq N \mid N' \cap S \neq \emptyset\}$ (contains a start nonterminal)
- for each $\sigma \in \Sigma$, $N_1, \dots, N_k \subseteq N$, and $k \in \mathrm{rk}_P(\sigma)$: $\{n \mid n \to \sigma(n_1, \dots, n_k) \in P,\ n_i \in N_i\} \to \sigma(N_1, \dots, N_k) \in P'$
- no other productions are in $P'$
Strictness by the tree language $L = \{\sigma(\alpha, \beta), \sigma(\beta, \alpha)\}$.
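Instead of materializing all of $2^N$, the run of the powerset automaton $G'$ on a given tree can be computed on the fly: the state at each node is the set of all nonterminals of $G$ that derive the subtree below it. The sketch assumes trees as nested tuples and productions $n \to \sigma(n_1, \dots, n_k)$ encoded as triples `(n, sigma, (n_1, ..., n_k))`; the function names and the small grammar for the strictness language are illustrative.

```python
def powerset_state(tree, productions):
    """State of the powerset automaton at the root of tree."""
    sigma, *children = tree
    child_sets = [powerset_state(c, productions) for c in children]
    return frozenset(
        n for (n, s, args) in productions
        if s == sigma
        and len(args) == len(child_sets)
        and all(a in cs for a, cs in zip(args, child_sets))
    )

def accepts(tree, start, productions):
    """tree is accepted iff its powerset state contains a start nonterminal."""
    return bool(powerset_state(tree, productions) & set(start))

# Hypothetical TA for L = {σ(α, β), σ(β, α)}
START = {"s"}
PRODUCTIONS = {
    ("s", "σ", ("a", "b")),
    ("s", "σ", ("b", "a")),
    ("a", "α", ()),
    ("b", "β", ()),
}
```

The state at a node depends only on the node label and the states of its children, which is exactly bottom-up determinism.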

Tree Automaton
[Figure: inclusion diagram of the classes RTL, top-down det. RTL, TSL, bottom-up det. RTL, LTL, and FIN.]
Remark. finite tree languages $\not\subseteq$ top-down deterministic RTL

Tree Automaton
Theorem. Regular tree languages are closed under
- all Boolean operations
- substitution (quotients) and iteration
- (non-deterministic) relabelings
- linear homomorphisms
- inverse homomorphisms

Tree Automaton
Theorem. Regular tree languages are closed under substitution.
Definition. Let $L, L' \subseteq T_\Sigma$ be tree languages and $\alpha \in \Sigma$. Then $L[\alpha \leftarrow L']$ contains all trees obtained from a tree of $L$ by replacing each leaf labeled $\alpha$ by a tree of $L'$; different occurrences can be replaced differently.
[Figure: a tree $t \in L$ whose $\alpha$-leaves are replaced by trees $t_1, t_2, t_3 \in L'$.]

Tree Automaton
DTA = bottom-up deterministic tree automaton
Definition. A TA is minimal in $C$ if all equivalent TAs of $C$ have at least as many states.
Theorem. Complexity of minimization problems:

output C \ input model    DTA                        TA
DTA                       PTime                      ExpTime
TA                        PSpace-hard, in ExpTime    ExpTime

Tree Automaton
Definition. Height of a tree $t \in T_\Sigma(Q)$:
- $\mathrm{height}(q) = 0$
- $\mathrm{height}(\sigma(t_1, \dots, t_k)) = 1 + \max \{\mathrm{height}(t_i) \mid 1 \leq i \leq k\}$
Intuition. Height of $t$ = number of edges in a maximal path from the root to a leaf
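The recursion for the height is short enough to state directly. The sketch assumes the nested-tuple tree encoding and, as an assumption that follows the edge-counting intuition, assigns height 0 to every leaf, including nullary symbols of $\Sigma$.

```python
def height(tree):
    """Number of edges on a longest path from the root to a leaf."""
    _, *children = tree
    if not children:
        return 0                   # assumption: every leaf has height 0
    return 1 + max(height(c) for c in children)

# The tree σ(σ(p, q), σ(σ(q, σ))) in the assumed tuple encoding
EXAMPLE = ("σ", ("σ", ("p",), ("q",)), ("σ", ("σ", ("q",), ("σ",))))
```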

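Tree Automaton
Theorem (Pumping Lemma). For every regular tree language $L \subseteq T_\Sigma$ there exists $n \in \mathbb{N}$ such that for every $t \in L$ with $\mathrm{height}(t) \geq n$ there exist contexts $c, c' \in C_\Sigma$ and $t' \in T_\Sigma$ such that
- $t = c[c'[t']]$
- $c' \neq \square$
- $c[c'[c'[\cdots c'[t'] \cdots]]] \in L$ for any number of $c'$
[Figure: $t$ decomposed into the context $c$ on top, the context $c'$ in the middle, and the tree $t'$ at the bottom.]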

Tree Automaton
Complexity

Problem         DTA            TA
Emptiness       PTime          PTime
Finiteness      PTime          PTime
Universality    PTime          ExpTime
Inclusion       PTime          ExpTime
Equivalence     PTime          ExpTime
Membership      in LOGDCFL     in LOGCFL
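As an illustration of why emptiness is decidable in polynomial time, the following sketch computes the productive nonterminals (those that derive at least one tree) by a fixpoint iteration; the language is empty if and only if no start nonterminal is productive. Productions $n \to \sigma(n_1, \dots, n_k)$ are assumed to be encoded as triples `(n, sigma, (n_1, ..., n_k))`, and the function name is hypothetical. This is one standard way to decide emptiness, not necessarily the method intended in the lecture.

```python
def is_empty(start, productions):
    """Return True if the tree automaton recognizes no tree at all."""
    productive = set()
    changed = True
    while changed:                             # at most |N| + 1 rounds
        changed = False
        for n, _sigma, args in productions:
            if n not in productive and all(a in productive for a in args):
                productive.add(n)              # n derives some tree
                changed = True
    return not (productive & set(start))

# Hypothetical TA for L = {σ(α, β), σ(β, α)}
PRODUCTIONS = {
    ("s", "σ", ("a", "b")),
    ("s", "σ", ("b", "a")),
    ("a", "α", ()),
    ("b", "β", ()),
}
```

Each round either adds a nonterminal or stops, so the loop runs at most $|N| + 1$ times over the production set.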

Literature
Textbooks
- COMON, DAUCHET et al.: Tree Automata Techniques and Applications
- GÉCSEG, STEINBY: Tree Languages. Handbook of Formal Languages, vol. 3, Springer, 1997
- GÉCSEG, STEINBY: Tree Automata. Akadémiai Kiadó, 1984

Multi Context-Free Tree Grammars and Multi-component Tree Adjoining Grammars

Multi Context-Free Tree Grammars and Multi-component Tree Adjoining Grammars Multi ontext-free Tree Grammars and Multi-component Tree djoining Grammars Joost Engelfriet 1 ndreas Maletti 2 1 LIS,, Leiden, The Netherlands 2 Institute of omputer Science,, Leipzig, Germany maletti@informatik.uni-leipzig.de

More information

Decision Properties for Context-free Languages

Decision Properties for Context-free Languages Previously: Decision Properties for Context-free Languages CMPU 240 Language Theory and Computation Fall 2018 Context-free languages Pumping Lemma for CFLs Closure properties for CFLs Today: Assignment

More information

Ambiguous Grammars and Compactification

Ambiguous Grammars and Compactification Ambiguous Grammars and Compactification Mridul Aanjaneya Stanford University July 17, 2012 Mridul Aanjaneya Automata Theory 1/ 44 Midterm Review Mathematical Induction and Pigeonhole Principle Finite Automata

More information

Context-Free Languages and Parse Trees

Context-Free Languages and Parse Trees Context-Free Languages and Parse Trees Mridul Aanjaneya Stanford University July 12, 2012 Mridul Aanjaneya Automata Theory 1/ 41 Context-Free Grammars A context-free grammar is a notation for describing

More information

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018 Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018 Lecture 11 Ana Bove April 26th 2018 Recap: Regular Languages Decision properties of RL: Is it empty? Does it contain this word? Contains

More information

CS5371 Theory of Computation. Lecture 8: Automata Theory VI (PDA, PDA = CFG)

CS5371 Theory of Computation. Lecture 8: Automata Theory VI (PDA, PDA = CFG) CS5371 Theory of Computation Lecture 8: Automata Theory VI (PDA, PDA = CFG) Objectives Introduce Pushdown Automaton (PDA) Show that PDA = CFG In terms of descriptive power Pushdown Automaton (PDA) Roughly

More information

