S Det N VP V The dog barked LTOP h 1 INDEX e 2 def q rel bark v rel prpstn m rel LBL h 4 dog n rel LBL h RELS LBL h 1 ARG0 x 5 LBL h 9 8 ARG0 e MARG h 3 RSTR h 6 ARG0 x 2 5 ARG1 x BODY h 5 7 HCONS h 3 = q h 9, h 6 = q h 8 Algorithms for AI and NLP (INF4820 TFSs) HEAD 1 SPR 2 SPR, COMPS COMPS 3 HEAD 1 SPR 2 COMPS 3 Stephan Oepen and Jan Tore Lønning Universitetet i Oslo {oe jtl }@ifi.uio.no
A (Simplified) PCFG Estimation Example S S S VP VP VP Kim V shot VP elephants P in PP pajamas Kim V loves cake P with PP chocolate Kim VP V arrived PP P in Oslo P(RHS LHS) CFG Rule S VP VP VP PP VP V PP P PP VP V Estimate rule probability from observed distribution; conditional probabilities: P(RHS LHS) = C(LHS, RHS) C(LHS) Natural Language Understanding (2)
Formally: Probabilistic Context-Free Grammars Formally, a context-free grammar (CFG) is a quadruple: C, Σ,P,S... P is a set of category rewrite rules (aka productions), each with a conditional probability P(RHS LHS), e.g.... Kim [0.6] snow [0.4]... for each rule α β 1, β 2,...,β n P : α C and β i C Σ; 1 i n;... for each α C, the probabilities of all rules R α... must sum to 1. Natural Language Understanding (3)
Limitations of Context-Free Grammar Agreement and Valency (For Example) That dog barks. That dogs barks. Those dogs barks. The dog chased a cat. The dog barks a cat. The dog chased. The dog chased a cat my neighbours. The cat was chased by a dog. The cat was chased of a dog.... Natural Language Understanding (4)
Unification-Based Grammar: Structured Categories All (constituent) categories in the grammar are typed feature structures; feature structures are recursive, record-like objects: attribute value sets; typing very similar to OO programming: a multipe-inheritance hierarchy; specific TFS configurations may correspond to traditional categories; labels like S or are mere abbreviations, not elements of the theory. word HEAD noun [ SPR HEAD det COMPS ] HEAD verb SPR COMPS HEAD verb [ SPR HEAD noun COMPS ] N S VP Natural Language Understanding (5)
The Type Hierarchy: Fundamentals Types represent groups of entities with similar properties ( classes ); types ordered by specificity: subtypes inherit properties of (all) parents; type hierarchy determines which types are compatible (and which not). *top* *list* *string* feat-struc *cons* *null* expression pos word noun verb det Natural Language Understanding (6)
Type Hierarchies: Multiple Inheritance flyer and swimmer no common descendants: they are incompatible; flyer and bee stand in hierarchical relationship: they unify to subtype; flyer and invertebrate have one unique greatest lower bound ( glb ). *top* animal flyer swimmer invertebrate vertebrate bee fish cod guppy Natural Language Understanding (7)
Typed Feature Structure Subsumption Typed feature structures can be partially ordered by information content; a more general structure is said to subsume a more specific one; *top* is the most general feature structure (while is inconsistent); ( square subset or equal ) conventionally used to depict subsumption. Feature structure F subsumes feature structure G (F G) iff: (1) if path p is defined in F then p is also defined in G and the type of the value of p in F is a supertype or equal to the type of the value of p in G, and (2) all paths that are reentrant in F are also reentrant in G. Natural Language Understanding (8)
Feature Structure Subsumption: Examples TFS 1 : TFS 3 : a b FOO x BAR x FOO y BAR x BAZ x TFS 2 : TFS 4 : a a FOO x BAR y FOO 1 x BAR 1 Hierarchy a FOO BAR b BAZ x y Feature structure F subsumes feature structure G (F G) iff: (1) if path p is defined in F then p is also defined in G and the type of the value of p in F is a supertype or equal to the type of the value of p in G, and (2) all paths that are reentrant in F are also reentrant in G. Natural Language Understanding (9)
Typed Feature Structure Unification Decide whether two typed feature structures are mutually compatible; determine combination of two TFSs to give the most general feature structure which retains all information which they individually contain; if there is no such feature structure, unification fails (depicted as ); unification monotonically combines information from both input TFSs; relation to subsumption the unification of two structures F and G is the most general TFS which is subsumed by both F and G (if it exists). ( square set intersection ) conventionally used to depict unification. Natural Language Understanding (10)
Typed Feature Structure Unification: Examples TFS 1 : TFS 3 : a b FOO x BAR x FOO y BAR x BAZ x TFS 2 : TFS 4 : a a FOO x BAR y FOO 1 x BAR 1 Hierarchy a FOO BAR b BAZ x y TFS 1 TFS 2 TFS 2 TFS 1 TFS 3 TFS 3 TFS 3 TFS 4 b FOO 1 y BAR 1 BAZ x Natural Language Understanding (11)
Typed Feature Structure Example (as AVM) HEAD verb ARGS *cons* FIRST word ORTH chased HEAD verb REST *cons* FIRST expression HEAD noun REST *null* Natural Language Understanding (12)
Typed Feature Structure Example (as Graph) HEAD verb ARGS *cons* FIRST word REST ORTH chased HEAD verb *cons* expression FIRST HEAD noun REST *null* Natural Language Understanding (13)
Reentrancy in a Typed Feature Structure (AVM) HEAD 1 verb ARGS *ne-list* FIRST word ORTH chased HEAD 1 REST *ne-list* FIRST expression HEAD noun REST *null* Natural Language Understanding (14)
Reentrancy in a Typed Feature Structure (Graph) HEAD verb ARGS HEAD *ne-list*first word ORTH chased REST expression *ne-list*first HEAD noun REST *null* Natural Language Understanding (15)