Context-Free Grammars

Similar documents
Context-Free Grammars. Carl Pollard Ohio State University. Linguistics 680 Formal Foundations Tuesday, November 10, 2009

Context-Free Grammars

1 Informal Motivation

Computational Linguistics: Syntax-Semantics Interface

Typed Lambda Calculus for Syntacticians

Computational Linguistics: Syntax-Semantics

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2016

Chapter 3. Describing Syntax and Semantics

Ling/CSE 472: Introduction to Computational Linguistics. 5/9/17 Feature structures and unification

Proseminar on Semantic Theory Fall 2013 Ling 720 An Algebraic Perspective on the Syntax of First Order Logic (Without Quantification) 1

Typed Lambda Calculus

Computational Linguistics: Feature Agreement

Chapter 3. Semantics. Topics. Introduction. Introduction. Introduction. Introduction

Reflection in the Chomsky Hierarchy

Introduction to Semantics. Expanding Our Formalism, Part 2 1

2 Ambiguity in Analyses of Idiomatic Phrases

Homework & NLTK. CS 181: Natural Language Processing Lecture 9: Context Free Grammars. Motivation. Formal Def of CFG. Uses of CFG.

Chapter 3. Syntax - the form or structure of the expressions, statements, and program units

Chapter 3. Describing Syntax and Semantics ISBN

Chapter 3. Describing Syntax and Semantics

1 Introduction CHAPTER ONE: SETS

Introduction to Lexical Analysis

Context-Free Grammars

Context Free Grammars. CS154 Chris Pollett Mar 1, 2006.

Some Interdefinability Results for Syntactic Constraint Classes

Describing Syntax and Semantics

What if current foundations of mathematics are inconsistent? Vladimir Voevodsky September 25, 2010

In Our Last Exciting Episode

Ling 571: Deep Processing for Natural Language Processing

Finite Automata. Dr. Nadeem Akhtar. Assistant Professor Department of Computer Science & IT The Islamia University of Bahawalpur

STRUCTURES AND STRATEGIES FOR STATE SPACE SEARCH

The Earley Parser

Context-Free Grammars

TAFL 1 (ECS-403) Unit- V. 5.1 Turing Machine. 5.2 TM as computer of Integer Function

COP 3402 Systems Software Syntax Analysis (Parser)

Type Systems COMP 311 Rice University Houston, Texas

This is already grossly inconvenient in present formalisms. Why do we want to make this convenient? GENERAL GOALS

Midterm Exam II CIS 341: Foundations of Computer Science II Spring 2006, day section Prof. Marvin K. Nakayama

Syntax and Grammars 1 / 21

Dependency grammar and dependency parsing

Theory of Computer Science. D2.1 Introduction. Theory of Computer Science. D2.2 LOOP Programs. D2.3 Syntactic Sugar. D2.

Dowty Friday, July 22, 11

Dependency grammar and dependency parsing

Lecture 5. Logic I. Statement Logic

Propositional Logic Formal Syntax and Semantics. Computability and Logic

1 Introduction. 2 Set-Theory Formalisms. Formal Semantics -W2: Limitations of a Set-Theoretic Model SWU LI713 Meagan Louie August 2015

3.4 Deduction and Evaluation: Tools Conditional-Equational Logic

Defining Program Syntax. Chapter Two Modern Programming Languages, 2nd ed. 1

Syntax. In Text: Chapter 3

Ling/CSE 472: Introduction to Computational Linguistics. 5/4/17 Parsing

Contents. Chapter 1 SPECIFYING SYNTAX 1

Propositional Logic. Part I

3.1 Constructions with sets

Intro to semantics; Small-step semantics Lecture 1 Tuesday, January 29, 2013

CMPS Programming Languages. Dr. Chengwei Lei CEECS California State University, Bakersfield

Semantics and Generative Grammar. Expanding Our Formalism, Part 2 1. These more complex functions are very awkward to write in our current notation

Syntax Analysis. Chapter 4

Randomized Algorithms Part Four

Randomized Algorithms Part Four

Chapter 3: Syntax and Semantics. Syntax and Semantics. Syntax Definitions. Matt Evett Dept. Computer Science Eastern Michigan University 1999

Trees Rooted Trees Spanning trees and Shortest Paths. 12. Graphs and Trees 2. Aaron Tan November 2017

Introduction to Lexical Functional Grammar. Wellformedness conditions on f- structures. Constraints on f-structures

The CKY Parsing Algorithm and PCFGs. COMP-599 Oct 12, 2016

Formal Languages and Compilers Lecture I: Introduction to Compilers

CSCE 314 Programming Languages

Complexity Theory. Compiled By : Hari Prasad Pokhrel Page 1 of 20. ioenotes.edu.np

Towards a basic DCG for English: X-bar Theory. A DCG for English using gap threading for unbounded dependencies. Verb phrases and sentences

Squibs and Discussions

Chapter 4. Syntax - the form or structure of the expressions, statements, and program units

1 The Axiom of Extensionality

Dependency grammar and dependency parsing

Lecture 4: Syntax Specification

L322 Syntax. Chapter 3: Structural Relations. Linguistics 322 D E F G H. Another representation is in the form of labelled brackets:

Formal Systems and their Applications

Principles of Programming Languages COMP251: Syntax and Grammars

The Turing Machine. Unsolvable Problems. Undecidability. The Church-Turing Thesis (1936) Decision Problem. Decision Problems

CHAPTER 10 GRAPHS AND TREES. Alessandro Artale UniBZ - artale/

The Diagonal Lemma: An Informal Exposition

Languages and Compilers

MIT Specifying Languages with Regular Expressions and Context-Free Grammars. Martin Rinard Massachusetts Institute of Technology

Grammar Development with LFG and XLE. Miriam Butt University of Konstanz

MIT Specifying Languages with Regular Expressions and Context-Free Grammars

Non-deterministic Finite Automata (NFA)

CSCI.6962/4962 Software Verification Fundamental Proof Methods in Computer Science (Arkoudas and Musser) Sections p.

Assignment 4 CSE 517: Natural Language Processing

Introduction to Automata Theory. BİL405 - Automata Theory and Formal Languages 1

Chapter 3 (part 3) Describing Syntax and Semantics

Ambiguous Grammars and Compactification

This book is licensed under a Creative Commons Attribution 3.0 License

FUZZY SPECIFICATION IN SOFTWARE ENGINEERING

UNIVERSITY OF EDINBURGH COLLEGE OF SCIENCE AND ENGINEERING SCHOOL OF INFORMATICS INFR08008 INFORMATICS 2A: PROCESSING FORMAL AND NATURAL LANGUAGES

Computational Linguistics (INF2820 TFSs)

Regular Expressions. Agenda for Today. Grammar for a Tiny Language. Programming Language Specifications

Context-Free Grammars

A Primer on Graph Processing with Bolinas

More Theories, Formal semantics

Introduction to Lexical Analysis

Syntax-semantics interface and the non-trivial computation of meaning

Optimizing Finite Automata

Lecture 5: The Halting Problem. Michael Beeson

Transcription:

Department of Linguistics Ohio State University Syntax 2 (Linguistics 602.02) January 3, 2012

(CFGs) A CFG is an ordered quadruple T, N, D, P where a. T is a finite set called the terminals; b. N is a finite set called the nonterminals c. D is a finite subset of N T called the lexical entries; d. P is a finite subset of N N + called the phrase structure rules (PSRs).

CFG Notation a. A t means A, t D. b. A A 0... A n 1 means A, A 0... A n 1 P. c. A {s 0,... s n 1 } abbreviates A s i (i < n).

A Toy CFG for English (1/2) T = {Fido, Felix, Mary, barked, bit, gave, believed, heard, the, cat, dog, yesterday} N = {S, NP, VP, TV, DTV, SV, Det, N, Adv} D consist of the following lexical entries: NP {Fido, Felix, Mary} VP barked TV bit DTV gave SV {believed, heard} Det the N {cat, dog} Adv yesterday

A Toy CFG for English (2/2) P consists of the following PSRs: S NP VP VP {TV NP, DTV NP NP, SV S, VP Adv} NP Det N

Context-Free Languages (CFLs) a. Given a CFG T, N, D, P, we can define a function C from N to (T -)languages (we write C A for C(A)) as described below. b. The C A are called the syntactic categories of the CFG (and so a nointerminal can be thought of as a name of a syntactic category). c. A language is called context-free if it is a syntactic category of some CFG.

Historical Notes Up until the mid 1980 s an open research questions was whether NLs (considered as sets of word strings) were context-free languages (CFLs). Chomsky maintained they were not, and his invention of transformational grammar (TG) was motivated in large part by the perceived need to go beyond the expressive power of CFGs. Gazdar and Pullum (early 1980 s) refuted all published arguments that NLs could not be CFLs. Together with Klein and Sag, they developed a context-free framework, generalized phrase structure grammar (GPSG), for syntactic theory. But in 1985, Shieber published a paper arguing that Swiss German cannot be a CFL. Shieber s argument is still generally accepted today.

Defining the Syntactic Categories of a CFG (1/2) a. We will recursively define a function h : ω (T ) N. b. Intuitively, for each nonterminal A, the sets h(n)(a) are successively larger approximations of C A. c. Then C A is defined to be C A = def n ωh(n)(a).

Defining the Syntactic Categories of a CFG (2/2) d. We define h using the Recursion Theorem (RT) with X, x, F set as follows: i. X = (T ) N ii. x is the function that maps each A N to the set of length-one strings t such that A t. iii. F is the function from X to X that maps a function L : N (T ) to the function that maps each nonterminal A to the union of L(A) with the set of all strings that can be obtained by applying a PSR A A 0... A n 1 to strings s 0,..., s n 1, where, for each i < n, s i belongs to L(A i ). I.e. F (L)(A) = L(A) {L(A 0 )... L(A n 1 ) A A 0... A n 1 }. iv. Given these values of X, x, and F, the RT guarantees the existence of a unique function h from ω to functions from N to (T ).

Proving that a String Belongs to a Category (1/2) a. With the C A formally defined as above, the following two clauses amount to an (informal) simultaneous recursive definition of the syntactic categories: i. ( Base Clause) If A t, then t C A. ii. (Recursion Clause) If A A 0... A n 1 and for each i < n, s i C Ai, then s 0... s n 1 C A. b. This in turn provides a simple-minded way to prove that a string belongs to a syntactic category (if in fact it does!).

Proving that a String Belongs to a Category (2/2) c. By way of illustration, consider the string s = Mary heard Fido bit Felix yesterday. d. We can (and will) prove that s C S. e. But most syntacticians would say that s corresponds to two different sentences, one roughly paraphrasable as Mary heard yesterday that Fido bit Felix and another roughly paraphrasable as Mary heard that yesterday, Fido bit Felix. f. Of course, these two sentences mean different things; but more relevant for our present purposes is that we can also characterize the difference between the two sentences purely in terms of two distinct ways of proving that s C S.

First Proof a. From the lexicon and the base clause, we know that Mary, Fido, Felix C NP, heard C SV, bit C TV, and yesterday C Adv. b. Then, by repeated applications of the recursion clause, it follows that: 1. since bit C TV and Felix C NP, bit Felix C VP ; 2. since bit Felix C VP and yesterday C Adv, bit Felix yesterday C VP ; 3. since Fido C NP and bit Felix yesterday C VP, Fido bit Felix yesterday C S ; 4. since heard C SV and Fido bit Felix yesterday C S, heard Fido bit Felix yesterday CP VP ; and finally, 5. since Mary C NP and heard Fido bit Felix yesterday C VP, Mary heard Fido bit Felix yesterday C S.

Second Proof a. Same as for first proof. b. Then, by repeated applications of the recursion clause, it follows that: 1. since Fido C NP and bit Felix C VP, Fido bit Felix C S ; 2. since heard C SV and Fido bit Felix C S, heard Fido bit Felix C VP ; 3. since heard Fido bit Felix C VP and yesterday C Adv, heard Fido bit Felix yesterday C VP ; and finally, 4. since Mary C NP and heard Fido bit Felix yesterday C VP, Mary heard Fido bit Felix yesterday C S.

Proofs vs. Phrase Structure Trees The analysis of NL syntax in terms of proofs is characteristic of the family of theoretical approaches collectively known as categorial grammar, initiated by Lambek (1958). But the most widely practiced approaches (sometimes referred to as mainstream generative grammar) analyze NL syntax in terms of trees, which will be introduced presently. For now, we just note that the two proofs above would correspond in a more mainstream syntactic approach to the two trees represented informally by the following two diagrams:

Tree Corresponding to First Proof S NP VP Mary SV S heard NP VP Fido VP Adv TV NP yesterday bit Felix

Tree Corresponding to Second Proof S NP VP Mary VP Adv SV S yesterday heard NP VP Fido TV NP bit Felix

Syntactic Analysis as Deduction Intuitively, it seems clear that there is a close relationship between the proof-based approach and the one based on phrase-structure trees. This intuition can be made mathematically precise, but we needn t go into it here. Instead, we will develop a grammar framework in which: grammars are logical theories words are axioms grammar rules are logical inference rules syntactic analyses are formal proofs.