Lecture 8: Context Free Grammars


1 Lecture 8: Context Free Grammars
Dr Kieran T. Herley
Department of Computer Science, University College Cork
KH (12/10/17) Lecture 8: Context Free Grammars

2 Specifying Non-Regular Languages
Recall: Language Observations
Not every language is regular, e.g. L = {a^n b^n : n a nonnegative integer}
Consider the following recursive rules defining L:
  1. ɛ ∈ L
  2. if α ∈ L, then so is a α b
Every string derived by repeated application of the above rules is in L
Every string a^n b^n in L can be formed by these rules: start from ɛ (first rule) and apply the second rule n times

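The two recursive rules can be read backwards as a membership test. The following Python sketch is only an illustration; the function name in_L is a hypothetical choice and not part of the lecture.

```python
def in_L(s: str) -> bool:
    """Return True if s is in L = {a^n b^n : n >= 0}, using the two recursive rules."""
    # Rule 1: the empty string is in L.
    if s == "":
        return True
    # Rule 2, read backwards: s is in L if s = a + alpha + b with alpha in L.
    if len(s) >= 2 and s[0] == "a" and s[-1] == "b":
        return in_L(s[1:-1])
    return False


for candidate in ["", "ab", "aabb", "aaab", "abba"]:
    print(repr(candidate), in_L(candidate))
```

Each recursive call peels off one application of the second rule, so the recursion depth equals n for a string a^n b^n.
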
3 Context Free Grammars
Idea: capture the above idea using a context-free grammar (CFG).
  <S> → ɛ
  <S> → a <S> b
Intuitive Explanation
  Symbol <S>: a substitutable placeholder
  Productions are substitution rules
  Language consists of strings derivable by
    starting with <S>
    repeatedly applying rules
    continuing until no placeholders are left

4 CFG cont'd
  <S> → ɛ
  <S> → a <S> b
Examples of Derivations
  <S> ⇒ ɛ
  <S> ⇒ a <S> b ⇒ ab
  <S> ⇒ a <S> b ⇒ aa <S> bb ⇒ aabb
So ɛ, ab, aabb, ... belong to the language

5 Some Terminology
A (context-free) grammar consists of one or more productions, e.g. <S> → a <S> b
  LHS: a single nonterminal (here <S>)
  RHS: a sequence of one or more symbols (here a <S> b) composed of terminals, nonterminals and ɛs
  (→ separates the two; sometimes ::= etc. is used instead)
where
  Terminals are symbols from the underlying alphabet, e.g. {a, b}
  Nonterminals are placeholder symbols, e.g. <S> (here enclosed in angle brackets for clarity)
  Start Symbol: a nonterminal (<S>); the first nonterminal by default

6 Derivations
  <S> → ɛ
  <S> → a <S> b
Derivation: transformation of the start symbol into a sentence (sequence of terminals) by repeated application of grammar productions, i.e. substitution of the RHS of some production for the nonterminal in its LHS
Example: <S> ⇒ a <S> b ⇒ aa <S> bb ⇒ aabb
The intermediate stages, e.g. a <S> b, are known as sentential forms
Definition: the sentences derivable from the start symbol constitute the language defined by the grammar

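A derivation can be traced mechanically by rewriting the placeholder step by step. The sketch below is a hypothetical helper (the name derive is an assumption) that lists the sentential forms on the way from <S> to a^n b^n.

```python
def derive(n: int) -> list[str]:
    """List the sentential forms in the derivation of a^n b^n from <S>."""
    current = "<S>"
    forms = [current]
    for _ in range(n):
        # Apply the production <S> -> a <S> b.
        current = current.replace("<S>", "a<S>b")
        forms.append(current)
    # Finish with the production <S> -> epsilon (the placeholder disappears).
    current = current.replace("<S>", "")
    forms.append(current)
    return forms


print(" => ".join(derive(2)))
```

The last entry of the list is the sentence; every earlier entry still contains the placeholder and is therefore a sentential form.
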
7-10 More Examples and Counterexamples
  <S> → ɛ
  <S> → a <S> b
aaabbb? aaab? abba?
Upshot: the grammar specifies the language L = {a^n b^n : n ≥ 0}
Note: if we interpret a as ( and b as ), this captures the set of nested parentheses

11 Another Grammar
  <S> → <N>
  <S> → <S> <N>
  <N> → ɛ
  <N> → ( <S> )
Features
  Start: <S>
  Nonterminals: <S>, <N>
  Terminals: left and right parenthesis symbols, ( and ) (shown in boldface on the slides)

12 Left Recursion
  <S> → <N>
  <S> → <S> <N>
  <N> → ɛ
  <N> → ( <S> )
Note: the grammar is left recursive: it embodies rules of the form <X> → <X> α [1]
This is one of the standard grammar idioms used to express repetition
Some parsing techniques disfavour left recursion; the grammar can usually be recast to avoid it
[1] More indirect forms of left recursion are also possible

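One common textbook way to recast immediate left recursion (not necessarily the one intended in the lecture) rewrites <X> → <X> α | β as <X> → β <X'> and <X'> → α <X'> | ɛ. The sketch below assumes productions are given as lists of symbols, with the empty list standing for ɛ; the function name and the primed nonterminal are hypothetical.

```python
def remove_left_recursion(nt, alternatives):
    """Rewrite X -> X alpha | beta as X -> beta X', X' -> alpha X' | epsilon.

    Handles immediate left recursion only; [] stands for epsilon.
    """
    new_nt = nt + "'"
    # Tails (alpha) of the alternatives that start with the nonterminal itself.
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == nt]
    # Alternatives (beta) that do not start with the nonterminal.
    others = [alt for alt in alternatives if not alt or alt[0] != nt]
    if not recursive:
        return {nt: alternatives}
    return {
        nt: [beta + [new_nt] for beta in others],
        new_nt: [alpha + [new_nt] for alpha in recursive] + [[]],
    }


# The rules S -> N and S -> S N from the slide.
print(remove_left_recursion("S", [["N"], ["S", "N"]]))
```

For the slide's grammar this yields S → N S' and S' → N S' | ɛ, which still rolls out a sequence of one or more <N>s.
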
13 Another Grammar cont'd
  <S> → <N>
  <S> → <S> <N>
  <N> → ɛ
  <N> → ( <S> )
Observation: the first two rules imply
  <S> ⇒ <N>
  <S> ⇒ <S> <N> ⇒ <N> <N>
  <S> ⇒ <S> <N> ⇒ <S> <N> <N> ⇒ <N> <N> <N>
i.e. <S> can roll out a sequence of one or more <N>s depending on the number of applications of Rule 2. This is a standard CFG idiom to specify repetition.

14-16 Some More Derivations
  <S> → <N>
  <S> → <S> <N>
  <N> → ɛ
  <N> → ( <S> )
Some Derivations
  <S> ⇒ <N> ⇒ ɛ
  <S> ⇒ <N> ⇒ ( <S> ) ⇒ ( <N> ) ⇒ ()
  <S> ⇒ <N> ⇒ ( <S> ) ⇒ ( <N> ) ⇒ (( <S> )) ⇒ (( <N> )) ⇒ (())

17-18 Some More Derivations
  <S> → <N>
  <S> → <S> <N>
  <N> → ɛ
  <N> → ( <S> )
More Derivations
  <S> ⇒ <S> <N> ⇒ <N> <N> ⇒ ( <S> ) <N> ⇒ ( <N> ) <N> ⇒ () <N> ⇒ () ( <S> ) ⇒ () ( <N> ) ⇒ ()()
Upshot: the grammar captures the set of balanced parentheses as found in validly formatted arithmetic expressions.

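The balanced-parentheses language can be recognised by a small hand-written routine that mirrors the grammar: <S> is treated as a sequence of <N>s, and each non-empty <N> as ( <S> ). This is a sketch under that reading; the left-recursive rule is replaced by a loop, and the names balanced and parse_S are hypothetical.

```python
def balanced(s: str) -> bool:
    """Return True if s is a string of balanced parentheses."""

    def parse_S(i: int) -> int:
        # <S> rolls out a sequence of <N>s; a non-empty <N> is ( <S> ).
        while i < len(s) and s[i] == "(":
            i = parse_S(i + 1)  # the <S> nested inside the parentheses
            if i >= len(s) or s[i] != ")":
                raise ValueError("missing closing parenthesis")
            i += 1
        return i  # position just after the longest <S> starting at i

    try:
        return parse_S(0) == len(s)
    except ValueError:
        return False


for candidate in ["", "()", "(())", "()()", "(()", ")("]:
    print(repr(candidate), balanced(candidate))
```
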
19-20 Parse Trees
  <S> → <N>
  <S> → <S> <N>
  <N> → ɛ
  <N> → ( <S> )
Sentence/Source: ()()()
Parse Tree: tree representation of a derivation
  start symbol at root
  terminals at leaves
  each non-leaf reflects a production
  in-order traversal (leaves only) yields the sentence

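As an illustration of these properties, one possible parse tree for ()()() can be written as nested (label, children) pairs, and the sentence recovered by reading the leaves left to right. The tuple representation and the names group, tree and leaves are assumptions made for this sketch only.

```python
# A node is either a terminal (a plain string) or a pair (nonterminal, children).
# <N> -> epsilon is represented by a node with no children.
group = ("N", ["(", ("S", [("N", [])]), ")"])  # <N> => ( <S> ) => ( <N> ) => ()

# <S> => <S> <N> => <S> <N> <N> => <N> <N> <N>, each <N> expanding to ()
tree = ("S", [("S", [("S", [group]), group]), group])


def leaves(node) -> str:
    """Concatenate the terminal leaves of the tree from left to right."""
    if isinstance(node, str):
        return node
    _label, children = node
    return "".join(leaves(child) for child in children)


print(leaves(tree))
```

The root carries the start symbol, every inner node corresponds to one production, and the leaf string is the sentence.
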
21 Parse Trees cont'd
Tree representation encodes the connection between source and grammar
Compilers often use such trees to model the detailed structure of the source, to drive code generation, for example

22 Parse Trees: Notational Note
Productions sharing the same LHS can be combined using the symbol | (read "or"). So
  <X> → α
  <X> → β
  <X> → γ
can be abbreviated to
  <X> → α | β | γ

23 CFGs and Programming Language Syntax: Grammar for Simple Arithmetic Expressions
  <expr> → <expr> + <term> | <expr> - <term> | <term>
  <term> → <term> * <factor> | <term> / <factor> | <factor>
  <factor> → NUM | ( <expr> )
Terminal NUM stands for a number (i.e. a sequence of digits).
CFGs can be used to specify the syntax of arithmetic expressions and of most programming languages
CFG-based tools allow us to generate, automatically, a parser capable of recognizing expressions

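To make the expression grammar concrete, here is a hand-written recursive-descent sketch. It is not the CFG-based tooling the slide refers to, and it swaps each left-recursive rule for a loop, which accepts the same token sequences. The token-list input format and the nested-tuple output are assumptions of this example.

```python
def parse(tokens):
    """Parse a list of tokens such as ["NUM", "+", "NUM"] into a nested tuple."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def expr():
        # <expr> -> <term> followed by any number of (+|-) <term>
        node = term()
        while peek() in ("+", "-"):
            op = peek()
            advance()
            node = (op, node, term())  # left-associative, as in the grammar
        return node

    def term():
        # <term> -> <factor> followed by any number of (*|/) <factor>
        node = factor()
        while peek() in ("*", "/"):
            op = peek()
            advance()
            node = (op, node, factor())
        return node

    def factor():
        # <factor> -> NUM | ( <expr> )
        if peek() == "NUM":
            advance()
            return "NUM"
        if peek() == "(":
            advance()
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected )")
            advance()
            return node
        raise SyntaxError(f"unexpected token {peek()!r}")

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


print(parse(["NUM", "+", "NUM", "*", "NUM"]))
```

Because <term> sits below <expr> in the grammar, the multiplication ends up deeper in the tree than the addition, which is how the grammar encodes operator precedence.
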
24 CFGs and Programming Language Syntax: Some Examples of Valid Expressions
  1. NUM
  2. NUM [op] NUM
  3. NUM + NUM
  4. NUM + NUM [op] NUM
  5. NUM [op] (NUM + NUM)
([op] marks a binary operator whose symbol is missing from this transcript)

25-28 CFGs and Programming Language Syntax: Examples 1-4
  <expr> → <expr> + <term> | <expr> - <term> | <term>
  <term> → <term> * <factor> | <term> / <factor> | <factor>
  <factor> → NUM | ( <expr> )
Each example pairs an expression with its parse tree (parse tree figures not reproduced in this transcript)
  Example 1: NUM
  Example 2: NUM [op] NUM
  Example 3: NUM + NUM
  Example 4: NUM + NUM [op] NUM
([op] marks a binary operator whose symbol is missing from this transcript)

29 CYK Algorithm: Parsing Algorithm
  <expr> → <expr> + <term> | <expr> - <term> | <term>
  <term> → <term> * <factor> | <term> / <factor> | <factor>
  <factor> → NUM | ( <expr> )
For a CFG G and string s, how do we determine if s ∈ L(G)?
Could try enumerating all possible derivations but TGBABW...

30 CYK Algorithm [2]
  for i ← 1 to n do
      V[i, 1] ← {A | A → a is a production and the ith symbol of x is a}
  for j ← 2 to n do
      for i ← 1 to n − j + 1 do
          V[i, j] ← {}
          for k ← 1 to j − 1 do
              V[i, j] ← V[i, j] ∪ {A | A → BC is a production, B is in V[i, k] and C is in V[i+k, j−k]}
Computes (in V[i, j]) the set of nonterminals <X> for which a derivation <X> ⇒* x_i x_{i+1} ... x_{i+j−1} exists, where x_i x_{i+1} ... x_{i+j−1} denotes the substring of the source x beginning at x_i and of length j; n is the length of x.
[2] See J. E. Hopcroft and J. D. Ullman, Introduction to Automata Theory, Languages, and Computation, Addison-Wesley, 1979 (pp )

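The pseudocode translates almost line for line into Python. The sketch below keeps the 1-based indices of the slide; the function name, the rule encoding (sets of tuples) and the small example grammar for {a^n b^n : n ≥ 1} are assumptions made for illustration and do not come from the lecture.

```python
def cyk(x, unit_rules, pair_rules, start):
    """Decide whether the CNF grammar derives x.

    unit_rules: set of (A, a) for productions A -> a
    pair_rules: set of (A, B, C) for productions A -> B C
    """
    n = len(x)
    if n == 0:
        return False  # a CNF grammar of this form cannot derive the empty string
    # V[i][j]: nonterminals deriving the substring of length j starting at x_i.
    V = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i in range(1, n + 1):
        V[i][1] = {A for (A, a) in unit_rules if a == x[i - 1]}
    for j in range(2, n + 1):
        for i in range(1, n - j + 2):
            for k in range(1, j):
                # Split the substring into a prefix of length k and a suffix of length j - k.
                V[i][j] |= {A for (A, B, C) in pair_rules
                            if B in V[i][k] and C in V[i + k][j - k]}
    return start in V[1][n]


# Assumed example grammar: S -> A B | A T, T -> S B, A -> a, B -> b
unit_rules = {("A", "a"), ("B", "b")}
pair_rules = {("S", "A", "B"), ("S", "A", "T"), ("T", "S", "B")}

for word in ["ab", "aabb", "aaab", "abba"]:
    print(word, cyk(word, unit_rules, pair_rules, "S"))
```

The three nested loops over j, i and k give a running time cubic in the length of x, multiplied by the number of productions examined at each step.
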
31 CYK Algorithm: Chomsky Normal Form (CNF)
Any grammar generating a language without ɛ can be recast to use only productions of the form
  <A> → <B> <C>
  <A> → a
where <A>, <B>, <C> are nonterminals and a is a terminal.
Transformation reasonably straightforward, but not discussed here

32 CYK Algorithm
Determines, for any CNF grammar G and string s, whether s ∈ L(G)
(Can be modified to produce a derivation/parse tree)
(Dynamic Programming!)
