Parsing as Deduction
S. M. Shieber, Y. Schabes, F. Pereira: Principles and Implementation of Deductive Parsing
Andreas Daul, Natalie Clarius
Eberhard Karls Universität Tübingen, Seminar für Sprachwissenschaft (SfS)
HS Parsing with Prolog, Daniël de Kok, SS 2016
08.07.2016
Outline
- Introduction
- Recap: Grammars and derivations
- The formal framework
Introduction
What is meant by "Parsing as Deduction"?
- Deduction is the process of reasoning from statements to reach a logical conclusion.
- Parsing strategies can be represented as deduction rules.
- Inference rules and a formalized grammar are created to encode different parsing strategies.
- The rules and the grammar are applied to the deduction engine introduced here.
- This allows rapid testing and prototyping of different parsing strategies.
Introduction
Formalization as stated by the paper [1]:
1. Existing logics can be used as a basis for new grammar formalisms with desirable representational or computational properties.
2. The modular separation of parsing into a logic of grammaticality claims and a proof search procedure allows the investigation of a wide range of existing grammar formalisms by selecting specific classes of grammaticality claims and specific search procedures.
[1] Shieber et al. (1995), p. 4
Introduction
- The inference rules tell us which items are to be computed.
- The deduction procedure tells us in what order these items are to be computed.
Chomsky hierarchy and Chomsky Normal Form

Chomsky hierarchy:

Type     Grammar             Production rules
Type 0   unrestricted        $\alpha \to \beta$
Type 1   context-sensitive   $\alpha A \beta \to \alpha \gamma \beta$
Type 2   context-free        $A \to \gamma$
Type 3   regular             $A \to aB$ or $A \to Ba$

Chomsky normal form (ChNF): each production rule has one of the forms
- $A \to BC$ (binary branching into nonterminals)
- $A \to a$ (unary branching into a terminal)
- $S \to \epsilon$ ($S$ must then not appear on the right-hand side of any grammar rule)
Grammars and derivations

Context-free grammar $G = \langle N, \Sigma, P, S \rangle$ with
- $N$ = the set of nonterminal symbols (including $S$)
- $\Sigma$ = the set of terminal symbols
- $P$ = the set of production rules
- $S$ = the start symbol
- ($V = N \cup \Sigma$ = the vocabulary of the grammar)

Derivations:
- $\Rightarrow$ = immediate derivation
- $\Rightarrow^*$ = indirect derivation (the reflexive, transitive closure of the derivation relation)
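To make the two derivation relations concrete, here is a small worked example. The grammar is a hypothetical toy grammar chosen only for illustration (it is not taken from the slides): $S \to \mathit{NP}\ \mathit{VP}$, $\mathit{NP} \to \mathit{Det}\ N$, $\mathit{VP} \to \mathit{IV}$, $\mathit{Det} \to \text{a}$, $N \to \text{lindy}$, $\mathit{IV} \to \text{swings}$.

```latex
\begin{align*}
S &\Rightarrow \mathit{NP}\ \mathit{VP}
    && \text{by } S \to \mathit{NP}\ \mathit{VP} \\
  &\Rightarrow \mathit{Det}\ N\ \mathit{VP}
    && \text{by } \mathit{NP} \to \mathit{Det}\ N \\
  &\Rightarrow \text{a}\ N\ \mathit{VP}
    && \text{by } \mathit{Det} \to \text{a} \\
  &\Rightarrow \text{a lindy}\ \mathit{VP}
    && \text{by } N \to \text{lindy} \\
  &\Rightarrow \text{a lindy}\ \mathit{IV}
    && \text{by } \mathit{VP} \to \mathit{IV} \\
  &\Rightarrow \text{a lindy swings}
    && \text{by } \mathit{IV} \to \text{swings}
\end{align*}
```

Each line is one immediate derivation step ($\Rightarrow$). Chaining the steps gives $S \Rightarrow^* \text{a lindy swings}$, and because $\Rightarrow^*$ is reflexive, $S \Rightarrow^* S$ holds as well.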
The formal framework
All the deductive parsing systems we will define consist of the following four components:
- a class of items: these are the formulas in our logic
- a set of axioms: sound axioms are true claims grounded in the lexical items that occur in the string
- a set of inference rules: logical inference rules are used to deduce one item from others; they are written as
  \[ \frac{\text{antecedents}}{\text{consequent}} \quad \langle \text{side conditions} \rangle \]
- a subclass of items, the goal items: these are what we want to prove if the string to be parsed is to be valid in the grammar
The top-down (recursive-descent) deductive parsing system
- Item form: $[\bullet\,\beta,\ j]$ where $0 \le j \le n$
- Axioms: $[\bullet\,S,\ 0]$
- Goals: $[\bullet,\ n]$
- Inference rules:
  Scanning: \[ \frac{[\bullet\, w_{j+1}\beta,\ j]}{[\bullet\,\beta,\ j+1]} \]
  Prediction: \[ \frac{[\bullet\, B\beta,\ j]}{[\bullet\,\gamma\beta,\ j]} \quad B \to \gamma \]
Top-down: Item form
- $[\bullet\,\beta,\ j]$ asserts that the substring of the string $w$ up to and including the $j$-th element, when followed by $\beta$, forms a sentential form of the language.
- More formally:
  \[ [\bullet\,\beta,\ j] = \begin{cases} 1 & \text{iff } S \Rightarrow^* w_1 \cdots w_j\,\beta \\ 0 & \text{else} \end{cases} \]
- The dot indicates the break point in the sentential form between the substring that has already been recognised and the substring that is still to be recognised.

Top-down: Axioms
- $[\bullet\,S,\ 0]$: nothing has been recognised yet, and the first symbol still to be recognised is $S$.
- This axiom is sound because $S \Rightarrow^* S$ holds trivially.

Top-down: Goals
- $[\bullet,\ n]$ makes the claim that $S \Rightarrow^* w_1 \cdots w_n = w$, i.e. that $w$ is a valid string produced by the grammar.
- If this goal item can be proved from the axioms and the inference rules, the string is in $L(G)$.
- The recognition algorithm that constructs this proof follows a pure top-down, left-to-right regime: it is a recursive-descent algorithm.

Top-down: Inference rules
- Scanning rule: the two items $[\bullet\, w_{j+1}\beta,\ j]$ and $[\bullet\,\beta,\ j+1]$ both make the claim that $S \Rightarrow^* w_1 \cdots w_j w_{j+1} \beta$; therefore we can infer the latter from the former:
  \[ \frac{[\bullet\, w_{j+1}\beta,\ j]}{[\bullet\,\beta,\ j+1]} \]
- Prediction rule: the item $[\bullet\, B\beta,\ j]$ claims that $S \Rightarrow^* w_1 \cdots w_j B\beta$. Together with a production $B \to \gamma$ this gives $S \Rightarrow^* w_1 \cdots w_j \gamma\beta$, which is the claim made by $[\bullet\,\gamma\beta,\ j]$; therefore we can infer the latter from the former:
  \[ \frac{[\bullet\, B\beta,\ j]}{[\bullet\,\gamma\beta,\ j]} \quad B \to \gamma \]
Top-down: Example derivation
- The example derivation contains only those steps that are strictly necessary for the proof.
- In an actual search procedure, items will also be generated that are either dead ends or redundant.
- With an ambiguous grammar, there will also be different proofs corresponding to different parses.
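A minimal derivation of this kind can be written out as follows. It is only a sketch under assumptions: the sentence is "a lindy swings" (so $n = 3$), and the grammar is a hypothetical toy grammar that is not taken from the slides: $S \to \mathit{NP}\ \mathit{VP}$, $\mathit{NP} \to \mathit{Det}\ N$, $\mathit{VP} \to \mathit{IV}$, $\mathit{Det} \to \text{a}$, $N \to \text{lindy}$, $\mathit{IV} \to \text{swings}$.

```latex
\[
\begin{array}{rll}
 1 & [\bullet\, S,\ 0]                                   & \text{axiom} \\
 2 & [\bullet\, \mathit{NP}\ \mathit{VP},\ 0]            & \text{predict from 1, } S \to \mathit{NP}\ \mathit{VP} \\
 3 & [\bullet\, \mathit{Det}\ N\ \mathit{VP},\ 0]        & \text{predict from 2, } \mathit{NP} \to \mathit{Det}\ N \\
 4 & [\bullet\, \text{a}\ N\ \mathit{VP},\ 0]            & \text{predict from 3, } \mathit{Det} \to \text{a} \\
 5 & [\bullet\, N\ \mathit{VP},\ 1]                      & \text{scan from 4} \\
 6 & [\bullet\, \text{lindy}\ \mathit{VP},\ 1]           & \text{predict from 5, } N \to \text{lindy} \\
 7 & [\bullet\, \mathit{VP},\ 2]                         & \text{scan from 6} \\
 8 & [\bullet\, \mathit{IV},\ 2]                         & \text{predict from 7, } \mathit{VP} \to \mathit{IV} \\
 9 & [\bullet\, \text{swings},\ 2]                       & \text{predict from 8, } \mathit{IV} \to \text{swings} \\
10 & [\bullet,\ 3]                                       & \text{scan from 9 (goal item)}
\end{array}
\]
```

Every step uses exactly one of the two rules, and the last line is the goal item $[\bullet,\ n]$ with $n = 3$.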
The bottom-up (shift-reduce) deductive parsing system
- Item form: $[\alpha\,\bullet,\ j]$ where $0 \le j \le n$
- Axioms: $[\bullet,\ 0]$
- Goals: $[S\,\bullet,\ n]$
- Inference rules:
  Shift: \[ \frac{[\alpha\,\bullet,\ j]}{[\alpha w_{j+1}\,\bullet,\ j+1]} \]
  Reduce: \[ \frac{[\alpha\gamma\,\bullet,\ j]}{[\alpha B\,\bullet,\ j]} \quad B \to \gamma \]
Bottom-up: Item form
- Item form: $[\alpha\,\bullet,\ j]$ where $0 \le j \le n$
- It asserts that $\alpha w_{j+1} \cdots w_n \Rightarrow^* w_1 \cdots w_n$ (which also means that $\alpha \Rightarrow^* w_1 \cdots w_j$).

Bottom-up: Axioms
- Axiom: $[\bullet,\ 0]$, i.e. we start with an empty parse stack.

    initial_item(item([], 0)).

Bottom-up: Goals
- Goals: $[S\,\bullet,\ n]$, i.e. a sentential form has been completed for the entire input length.

    final_item(item([Value], Length), Value) :-
        sentencelength(Length),
        startsymbol(Value).

Bottom-up: Inference rules (shift)
- Shift: \[ \frac{[\alpha\,\bullet,\ j]}{[\alpha w_{j+1}\,\bullet,\ j+1]} \]
- Antecedent and consequent both claim that $\alpha w_{j+1} \cdots w_n \Rightarrow^* w_1 \cdots w_n$.

    inference(shift,
              [ item(Beta, I) ],
              % -------------------------------
              item([B|Beta], I1),
              % where
              [ I1 is I + 1,
                word(I1, Bterm),
                lex(Bterm, B) ]).

Bottom-up: Inference rules (reduce)
- Reduce: \[ \frac{[\alpha\gamma\,\bullet,\ j]}{[\alpha B\,\bullet,\ j]} \quad B \to \gamma \]
- If $\alpha\gamma w_{j+1} \cdots w_n \Rightarrow^* w_1 \cdots w_n$ and $B \to \gamma$, then it also holds that $\alpha B w_{j+1} \cdots w_n \Rightarrow^* w_1 \cdots w_n$.

    inference(reduce,
              [ item(BetaAlpha, I) ],
              % --------------------------------
              item([A|Alpha], I),
              % where
              [ (A ---> Beta),
                reverse(Beta, BetaR),
                append(BetaR, Alpha, BetaAlpha) ]).
Bottom-up: Example derivation
- Derivation steps for the sentence: A lindy swings
- Notice the last entry, the goal item, showing that the sentence is parsable with regard to the given grammar.
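One possible shift-reduce derivation for this sentence can be sketched as follows. The grammar is an assumption made for illustration (the grammar used in the talk is not given in these slides): $S \to \mathit{NP}\ \mathit{VP}$, $\mathit{NP} \to \mathit{Det}\ N$, $\mathit{VP} \to \mathit{IV}$, $\mathit{Det} \to \text{a}$, $N \to \text{lindy}$, $\mathit{IV} \to \text{swings}$.

```latex
\[
\begin{array}{rll}
 1 & [\bullet,\ 0]                                   & \text{axiom} \\
 2 & [\text{a}\,\bullet,\ 1]                         & \text{shift from 1} \\
 3 & [\mathit{Det}\,\bullet,\ 1]                     & \text{reduce from 2, } \mathit{Det} \to \text{a} \\
 4 & [\mathit{Det}\ \text{lindy}\,\bullet,\ 2]       & \text{shift from 3} \\
 5 & [\mathit{Det}\ N\,\bullet,\ 2]                  & \text{reduce from 4, } N \to \text{lindy} \\
 6 & [\mathit{NP}\,\bullet,\ 2]                      & \text{reduce from 5, } \mathit{NP} \to \mathit{Det}\ N \\
 7 & [\mathit{NP}\ \text{swings}\,\bullet,\ 3]       & \text{shift from 6} \\
 8 & [\mathit{NP}\ \mathit{IV}\,\bullet,\ 3]         & \text{reduce from 7, } \mathit{IV} \to \text{swings} \\
 9 & [\mathit{NP}\ \mathit{VP}\,\bullet,\ 3]         & \text{reduce from 8, } \mathit{VP} \to \mathit{IV} \\
10 & [S\,\bullet,\ 3]                                & \text{reduce from 9, } S \to \mathit{NP}\ \mathit{VP} \text{ (goal item)}
\end{array}
\]
```

The formal rules shift the word $w_{j+1}$ itself, whereas the Prolog encoding of the shift rule looks the word up with `lex/2` and pushes its category directly, so steps such as 2 and 3 collapse into one there.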
Earley parser
- In recursive-descent parsing we kept a partial sentential form for material yet to be parsed; the dot at the beginning of the string tells us that these symbols come after the point already reached in the recognition process.
- In shift-reduce parsing we kept a partial sentential form for material that has already been parsed; the dot at the end of the string is a reminder that those symbols come before the point reached in the recognition process.
- In Earley parsing we keep both partial sentential forms, with the dot marking a middle position in the recognition process. The dot is therefore a necessary component now, not just a mnemonic aid.
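The Earley deductive parsing system
- Item form: $[i,\ A \to \alpha \bullet \beta,\ j]$
- Axioms: $[0,\ S' \to \bullet\, S,\ 0]$
- Goals: $[0,\ S' \to S\, \bullet,\ n]$
- Inference rules:
  Scanning: \[ \frac{[i,\ A \to \alpha \bullet w_{j+1}\beta,\ j]}{[i,\ A \to \alpha w_{j+1} \bullet \beta,\ j+1]} \]
  Prediction: \[ \frac{[i,\ A \to \alpha \bullet B\beta,\ j]}{[j,\ B \to \bullet\,\gamma,\ j]} \quad B \to \gamma \]
  Completion: \[ \frac{[i,\ A \to \alpha \bullet B\beta,\ j] \qquad [j,\ B \to \gamma\,\bullet,\ k]}{[i,\ A \to \alpha B \bullet \beta,\ k]} \]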
Earley: Item form
- Item form: $[i,\ A \to \alpha \bullet \beta,\ j]$ with $\alpha, \beta$ strings in $V^*$ and $A \to \alpha\beta$ a production of the grammar.
- $j$ again shows the position in the string that recognition has reached; the dot marks that point in the partial sentential form.
- $i$ marks the starting point of the partial sentential form.
- The item makes the top-down claim that $S \Rightarrow^* w_1 \cdots w_i A \gamma$ for some $\gamma$, and the bottom-up claim that $\alpha w_{j+1} \cdots w_n \Rightarrow^* w_{i+1} \cdots w_n$.

Earley: Axioms
- Axiom [2]: $[0,\ S' \to \bullet\, S,\ 0]$

    initial_item(item('<start>', [], [Start], 0, 0)) :-
        startsymbol(Start).

[2] inf-earley.pl

Earley: Goals
- Goal: $[0,\ S' \to S\, \bullet,\ n]$

    final_item(item('<start>', [Start], [], 0, Length), Start) :-
        startsymbol(Start),
        sentencelength(Length).

Earley: Prediction
- Prediction: \[ \frac{[i,\ A \to \alpha \bullet B\beta,\ j]}{[j,\ B \to \bullet\,\gamma,\ j]} \quad B \to \gamma \]

    inference(predictor,
              [ item(_A, _Alpha, [B|_Beta], _I, J) ],
              % ----------------------------------------
              item(B, [], Gamma, J, J),
              % where
              [ (B ---> Gamma) ]).

Earley: Scanning
- Scanning: \[ \frac{[i,\ A \to \alpha \bullet w_{j+1}\beta,\ j]}{[i,\ A \to \alpha w_{j+1} \bullet \beta,\ j+1]} \]

    inference(scanner,
              [ item(A, Alpha, [B|Beta], I, J) ],
              % -------------------------------------
              item(A, [B|Alpha], Beta, I, J1),
              % where
              [ J1 is J + 1,
                word(J1, Bterm),
                lex(Bterm, B) ]).

Earley: Completion
- Completion: \[ \frac{[i,\ A \to \alpha \bullet B\beta,\ j] \qquad [j,\ B \to \gamma\,\bullet,\ k]}{[i,\ A \to \alpha B \bullet \beta,\ k]} \]

    inference(completor,
              [ item(A, Alpha, [B|Beta], I, J),
                item(B, _Gamma, [], J, K) ],
              % --------------------------------
              item(A, [B|Alpha], Beta, I, K),
              % where
              [] ).
Earley: Example derivation
- Derivation steps for the sentence: A lindy swings
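A possible Earley derivation for this sentence is sketched below. The grammar is a hypothetical toy grammar assumed for illustration, not the one used in the talk: $S \to \mathit{NP}\ \mathit{VP}$, $\mathit{NP} \to \mathit{Det}\ N$, $\mathit{VP} \to \mathit{IV}$, $\mathit{Det} \to \text{a}$, $N \to \text{lindy}$, $\mathit{IV} \to \text{swings}$, with the added start production $S' \to S$. Only the items needed for the proof are listed.

```latex
\[
\begin{array}{rll}
 1 & [0,\ S' \to \bullet\, S,\ 0]                              & \text{axiom} \\
 2 & [0,\ S \to \bullet\, \mathit{NP}\ \mathit{VP},\ 0]        & \text{predict from 1} \\
 3 & [0,\ \mathit{NP} \to \bullet\, \mathit{Det}\ N,\ 0]       & \text{predict from 2} \\
 4 & [0,\ \mathit{Det} \to \bullet\, \text{a},\ 0]             & \text{predict from 3} \\
 5 & [0,\ \mathit{Det} \to \text{a}\, \bullet,\ 1]             & \text{scan from 4} \\
 6 & [0,\ \mathit{NP} \to \mathit{Det} \bullet N,\ 1]          & \text{complete from 3 and 5} \\
 7 & [1,\ N \to \bullet\, \text{lindy},\ 1]                    & \text{predict from 6} \\
 8 & [1,\ N \to \text{lindy}\, \bullet,\ 2]                    & \text{scan from 7} \\
 9 & [0,\ \mathit{NP} \to \mathit{Det}\ N\, \bullet,\ 2]       & \text{complete from 6 and 8} \\
10 & [0,\ S \to \mathit{NP} \bullet \mathit{VP},\ 2]           & \text{complete from 2 and 9} \\
11 & [2,\ \mathit{VP} \to \bullet\, \mathit{IV},\ 2]           & \text{predict from 10} \\
12 & [2,\ \mathit{IV} \to \bullet\, \text{swings},\ 2]         & \text{predict from 11} \\
13 & [2,\ \mathit{IV} \to \text{swings}\, \bullet,\ 3]         & \text{scan from 12} \\
14 & [2,\ \mathit{VP} \to \mathit{IV}\, \bullet,\ 3]           & \text{complete from 11 and 13} \\
15 & [0,\ S \to \mathit{NP}\ \mathit{VP}\, \bullet,\ 3]        & \text{complete from 10 and 14} \\
16 & [0,\ S' \to S\, \bullet,\ 3]                              & \text{complete from 1 and 15 (goal item)}
\end{array}
\]
```

In every completion step the end index of the first antecedent equals the start index of the second, which is exactly the shared variable `J` in the `completor` rule.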
The CYK deductive parsing system
- Item form: $[A,\ i,\ j]$
- Axioms: $[A,\ i,\ i+1]$ with $A \to w_{i+1}$
- Goals: $[S,\ 0,\ n]$
- Inference rules:
  \[ \frac{[B,\ i,\ j] \qquad [C,\ j,\ k]}{[A,\ i,\ k]} \quad A \to BC \]
CYK: Item form
- $[A,\ i,\ j]$: $A \Rightarrow^* w_{i+1} \cdots w_j$
- The nonterminal $A$ derives the substring between indices $i$ and $j$ in the string.

CYK: Axioms
- For each word $w_{i+1}$ in the string and each rule $A \to w_{i+1}$, $[A,\ i,\ i+1]$ is a true claim.
- Therefore $[A,\ i,\ i+1]$ with $A \to w_{i+1}$ is axiomatic.

CYK: Goal items
- $[S,\ 0,\ n]$ asserts that $S \Rightarrow^* w_1 \cdots w_n$.
- If this item is deducible, the string is admitted by the grammar, because $w_1 \cdots w_n = w$.

CYK: Inference rules
- Whenever we know that $B \Rightarrow^* w_{i+1} \cdots w_j$ and $C \Rightarrow^* w_{j+1} \cdots w_k$ with $A \to BC$, it is sound to conclude that $A \Rightarrow^* w_{i+1} \cdots w_k$.
- Therefore, from the two items $[B,\ i,\ j]$ and $[C,\ j,\ k]$ and the production rule $A \to BC$, we can infer $[A,\ i,\ k]$:
  \[ \frac{[B,\ i,\ j] \qquad [C,\ j,\ k]}{[A,\ i,\ k]} \quad A \to BC \]
CYK: Encoding
This deduction system can be encoded in the following way:

    % Axiom:
    nt(A, I1, I) :-          % item: [A, i, i+1]
        word(I, W),
        (A ---> [W]),        % A ---> [X1, ..., Xn] encodes the production rule A -> X1 ... Xn
        I1 is I - 1.

    % Inference rule:
    nt(A, I, K) :-           % consequent item: [A, i, k]
        nt(B, I, J),         % premise item
        nt(C, J, K),         % premise item
        (A ---> [B, C]).     % production rule (side condition)
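Read as an ordinary Prolog program, the direct encoding is left-recursive in its second clause, so a plain depth-first query may not terminate. The following is a minimal sketch of one terminating variant, not the authors' code: it assumes that both string positions are bound when `nt/3` is called, looks up the production first, and enumerates only split points strictly between the two positions. The toy grammar, the `word/2` facts and the operator priority are assumptions made for this example.

```prolog
% Sketch only: hypothetical toy grammar in Chomsky normal form.
:- op(1200, xfx, --->).

s   ---> [np, vp].
np  ---> [det, n].
det ---> [a].
n   ---> [lindy].
vp  ---> [swings].

% Input string "a lindy swings": word(Position, Word).
word(1, a).
word(2, lindy).
word(3, swings).

% Axiom [A, i, i+1]: A derives the single word w_{i+1}.
% Both I and J must be bound when nt/3 is called.
nt(A, I, J) :-
    J =:= I + 1,
    word(J, W),
    (A ---> [W]).

% Inference rule: [B, i, j] and [C, j, k] with A ---> [B, C] give [A, i, k].
% The split point J is restricted to i < j < k, so each recursive call
% covers a strictly shorter span and the recursion terminates.
nt(A, I, K) :-
    K > I + 1,
    (A ---> [B, C]),
    Lo is I + 1,
    Hi is K - 1,
    between(Lo, Hi, J),
    nt(B, I, J),
    nt(C, J, K).
```

With these facts loaded, the goal item $[S, 0, 3]$ corresponds to the query `nt(s, 0, 3)`. This variant recomputes shared sub-spans instead of caching them, which is one of the reasons a chart is introduced for the general engine.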
- Augmented phrase structure formalisms
- Combinatory categorial grammars
- Tree-adjoining grammars and related formalisms
- Efficiency
- Implementation of agenda and chart
- Implementation of the deduction engine
- Implementation of other aspects
Implementation
- So far: only the specification of inference rules.
- Remaining: integration into a frame that allows these inference rules to be used in an actual parsing algorithm.
- Most important part: choosing a deduction procedure to operate over the inference rules.
Chart
- We do not want to enumerate an item more than once.
- A cache (chart) of lemmas keeps track of the items we have already encountered.
- This is similar to the chart in chart parsing, the well-formed substring table in CYK parsing, or the state sets in Earley parsing.
Agenda
- Each item added to the chart might generate new consequences.
- When do we compute the consequences of a new item?
- Solution: a separate agenda of items that have been proved but whose consequences have not yet been computed.
- When an item's consequences are computed, the item is moved from the agenda to the chart and the consequences are added to the agenda for later consideration.
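Chart-based, agenda-driven deduction procedure
Basic deduction procedure making use of both chart and agenda:
1. Initialize the chart and the agenda.
2. Remove items from the agenda and process them until the agenda is empty:
   (a) get the next item (the trigger item) from the agenda
   (b) add it to the chart
   (c) add its consequences to the agenda
3. Try to find a goal item in the chart.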
Eliminating redundancy
- Redundancy in the chart: only items that do not already exist in the chart should be added to it (step (2b)).
- Redundancy in the agenda: only items that are new immediate consequences, i.e. that do not already occur in the chart or in the agenda, should be added to the agenda (step (2c)).
- Triggering the generation of new immediate consequences: when generating all items that are new immediate consequences of the trigger item together with all other items in the chart, we want to avoid generating redundant items, i.e. items that would already follow from the other chart items (without the trigger item).
- The search for new immediate consequences can therefore be limited to just those where at least one of the antecedents is the trigger item.
Providing efficient access
- Storing items (e.g. an item like $[i,\ A \to \alpha \bullet \beta,\ j]$) in the chart and in the agenda should allow for efficient access, i.e. for directly indexing into the stored items.
- Indexing for redundancy checking: attributes of items such as the indices, the production rule used and the dot position might be used.
- Indexing for antecedent lookup: not all of the above information is available; instead use, e.g., the first index $j$ and the main functor $B$ on the left-hand side.
- Variable renaming: matching items against inference rules produces further instantiations, which should not affect the variables already stored. Variables in agenda and chart items should therefore be renamed consistently before they are used further.
Agenda and chart
- Agenda and chart are stored in one structure to handle redundancy checking: stored/2 (items.pl).
- The first argument represents the position, allowing direct access to any stored item.

    stored(1, item(...)).        % beginning of chart
    stored(2, item(...)).
    ...
    stored(i - 1, item(...)).    % end of chart
    stored(i, item(...)).        % head of agenda
    stored(i + 1, item(...)).
    ...
    stored(k - 1, item(...)).
    stored(k, item(...)).        % tail of agenda
Agenda and chart: implementation
- The agenda is characterized by two indices, representing the first (head) and the last (tail) agenda item.
- Since new items are always added at the end, we gain an implicit distinction between chart and agenda: chart items are exactly those with an index smaller than the agenda head.
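The head/tail scheme can be sketched in a few lines of Prolog. This is an illustration under assumptions, not the contents of items.pl: the term `queue(Head, Tail)` and the predicates `empty_store/1`, `add_item/3` and `in_chart/2` are hypothetical names introduced here, the tail index is inclusive, and items are assumed to be ground so that plain unification suffices as a redundancy check.

```prolog
:- dynamic stored/2.

% An empty store: no stored/2 facts, agenda head 1, tail 0 (head > tail).
empty_store(queue(1, 0)) :-
    retractall(stored(_, _)).

% The agenda is empty once the head has moved past the tail.
is_empty_agenda(queue(Head, Tail)) :-
    Head > Tail.

% Add Item at the tail unless an identical item is already stored
% anywhere (chart or agenda); this is the redundancy check.
add_item(Item, queue(Head, Tail0), queue(Head, Tail)) :-
    (   stored(_, Item)
    ->  Tail = Tail0
    ;   Tail is Tail0 + 1,
        assertz(stored(Tail, Item))
    ).

% Popping only advances the head. The stored/2 fact stays where it is,
% so the popped index now lies in the chart part of the store.
pop_agenda(queue(Head0, Tail), Head0, queue(Head, Tail)) :-
    Head0 =< Tail,
    Head is Head0 + 1.

% Chart items are exactly those with an index below the agenda head.
in_chart(Item, queue(Head, _)) :-
    stored(Index, Item),
    Index < Head.
```

A design consequence of this layout is that moving an item from the agenda to the chart costs nothing beyond incrementing one integer, and a single lookup in `stored/2` covers redundancy checks against both chart and agenda.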
Agenda-driven, chart-based deduction
The deduction engine is encoded in the following way. Initialization [3]:

    parse(Value) :-
        % (1) Initialize the chart and agenda
        init_chart,
        init_agenda(Agenda),
        % (2) Remove items from the agenda and process
        %     until the agenda is empty
        exhaust(Agenda),
        % (3) Try to find a goal item in the chart
        goal_item_in_chart(Goal).

[3] driver.pl
Agenda-driven, chart-based deduction
Processing of trigger items until the agenda is exhausted:

    exhaust(Empty) :-
        % (2) If the agenda is empty, we're done
        is_empty_agenda(Empty).

    exhaust(Agenda0) :-
        % (2a) Otherwise get the next item index from the agenda
        pop_agenda(Agenda0, Index, Agenda1),
        % (2b) Add it to the chart
        add_item_to_chart(Index),
        % (2c) Add its consequences to the agenda
        add_consequences_to_agenda(Index, Agenda1, Agenda),
        % (2) Continue processing the agenda until empty
        exhaust(Agenda).
Agenda-driven, chart-based deduction
Generating the consequences of each item and placing them in the agenda:

    add_consequences_to_agenda(Index, Agenda0, Agenda) :-
        findall(Consequence,
                consequence(Index, Consequence),
                Consequences),
        add_items_to_agenda(Consequences, Agenda0, Agenda).

add_items_to_agenda/3 stores the items in the new agenda, taking care of the indices.
Agenda-driven, chart-based deduction
When does a trigger item have a consequence?
- if it matches an antecedent of some rule,
- possibly together with other antecedents that have already been proved and are stored in the chart;
- any side conditions have to hold as well.

    consequence(Index, Consequent) :-
        index_to_item(Index, Trigger),
        matching_rule(Trigger, RuleName, Others, Consequent, SideConds),
        items_in_chart(Others, Index),
        hold(SideConds).
Agenda-driven, chart-based deduction
We assume inference rules to be stored as predicates of the form inference(RuleName, Antecedents, Consequent, SideConds):
- RuleName: a mnemonic name for the rule
- Antecedents: the list of antecedent items of that rule
- Consequent: the single consequent item
- SideConds: the list of side conditions
Then pick a rule in which some antecedent matches the trigger and split off the unmatched antecedents (to be checked for in the chart):

    matching_rule(Trigger, RuleName, Others, Consequent, SideConds) :-
        inference(RuleName, Antecedents, Consequent, SideConds),
        split(Trigger, Antecedents, Others).
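To see the pieces work together, here is a heavily simplified, self-contained sketch of such an engine, instantiated with the bottom-up (shift-reduce) rules. It is not the original driver.pl: the agenda is a plain list instead of indices into `stored/2`, the chart is a dynamic predicate `chart/1`, `select/3` plays the role of `split/3`, `final_item/1` drops the value argument, items are assumed to be ground, and `recognize/0` as well as the toy grammar, lexicon and input are hypothetical.

```prolog
:- op(1200, xfx, --->).
:- dynamic chart/1.

% Hypothetical toy grammar, lexicon and input "a lindy swings".
s  ---> [np, vp].
np ---> [det, n].
vp ---> [iv].

lex(a, det).
lex(lindy, n).
lex(swings, iv).

word(1, a).
word(2, lindy).
word(3, swings).
sentencelength(3).
startsymbol(s).

% Bottom-up deduction system; the parse stack is stored top-first.
initial_item(item([], 0)).
final_item(item([S], N)) :- startsymbol(S), sentencelength(N).

inference(shift,
          [item(Beta, I)],
          item([B|Beta], I1),
          [I1 is I + 1, word(I1, W), lex(W, B)]).
inference(reduce,
          [item(BetaAlpha, I)],
          item([A|Alpha], I),
          [(A ---> Beta), reverse(Beta, BetaR), append(BetaR, Alpha, BetaAlpha)]).

% (1) initialize chart and agenda, (2) exhaust the agenda,
% (3) look for a goal item in the chart.
recognize :-
    retractall(chart(_)),
    findall(Item, initial_item(Item), Agenda),
    exhaust(Agenda),
    final_item(Goal),
    chart(Goal).

exhaust([]).
exhaust([Item|Agenda0]) :-
    (   chart(Item)                     % already in the chart: redundant
    ->  exhaust(Agenda0)
    ;   assertz(chart(Item)),           % (2b) add the trigger to the chart
        findall(C, consequence(Item, C), Cs),
        append(Agenda0, Cs, Agenda),    % (2c) queue its consequences
        exhaust(Agenda)
    ).

% A consequence exists if the trigger matches one antecedent of a rule,
% the remaining antecedents are in the chart, and the side conditions hold.
consequence(Trigger, Consequent) :-
    inference(_RuleName, Antecedents, Consequent, SideConds),
    select(Trigger, Antecedents, Others),
    maplist(chart, Others),
    maplist(call, SideConds).
```

Calling `recognize` succeeds for the toy input and leaves the proved items in `chart/1`. Swapping in another set of `initial_item/1`, `final_item/1` and `inference/4` clauses changes the parsing strategy without touching the engine, which is the modularity the deductive view is meant to provide.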
Implementation of other aspects
The complete system:
- organising modules: infer.pl
- input and encoding of the string to be parsed: input.pl, readin.pl
- the deduction engine driver, including generation of consequences: driver.pl
- encoding of the storage of items, including the chart and agenda: item.pl, agenda.pl, chart.pl
- encoding of deduction systems: inference.pl, inf-top-down.pl, inf-bottom-up.pl, inf-earley.pl, inf-ccg.pl, inf-lig-tag.pl, inf-tag-cky.pl
- inclusion of the grammars: grammars.pl, gram-dcgl.pl, gram-ccg.pl, gram-dcg-lc.pl, lig-gram.pl
- other utilities, such as subsumption checking: utilities.pl
- monitoring and debugging: monitor.pl
And now a live demonstration!
What we have achieved:
- describing parsing as a deduction process using inference rules
- showing the commonalities and relationships in the logic of parsing algorithms, while abstracting away from incidental differences of control
- application not only to context-free grammars, but also to alternative grammar formalisms like tree-adjoining grammars and categorial grammars
Literature
- Shieber, S. M., Schabes, Y., & Pereira, F. C. N. (1995). Principles and implementation of deductive parsing. The Journal of Logic Programming, 24(1-2), 3-36.
Source code
- http://lanl.arxiv.org/e-print/cmp-lg/9404008v1 (accessed: 23rd June 2016, 14:00; modified for compatibility with SWI-Prolog)