Question Answering Systems


Question Answering Systems: An Introduction
Saeedeh Momtazi, Information Systems Group
Potsdam, Germany, 14 July 2011

Outline 2
1 Introduction
2 History
3 QA Architecture: Factoid QA, Opinion QA
4 Summary

Why QA 4-5

QA vs. SE 6
- Longer input: natural language questions instead of keywords
- Shorter output: short answer strings instead of documents

QA Types 7
- Closed-domain: answers only questions from a specific domain.
- Open-domain: answers domain-independent questions.


History 9
- BASEBALL [Green et al., 1963]: one of the earliest question answering systems, developed to answer user questions about dates, locations, and the results of baseball matches.
- LUNAR [Woods, 1977]: developed to answer natural language questions about the geological analysis of rocks returned by the Apollo moon missions; able to answer 90% of questions in its domain posed by people not trained on the system.

History 10
- STUDENT: built to answer high-school students' questions about algebra exercises.
- PHLIQA: developed to answer users' questions about European computer systems.
- UC (Unix Consultant): answered questions about the Unix operating system.
- LILOG: able to answer questions about tourism information for cities in Germany.

History 11
Closed-domain systems
- Extract answers from structured data (a database): labor-intensive to build
- Convert the natural language question into a database query: easy to implement

Open-domain QA 12
From closed-domain QA to open-domain QA: using a large collection of unstructured data (e.g., the Web) instead of databases
Advantages:
- Covers many subjects
- Information is constantly added and updated
- No manual work needed to build a database
Disadvantages:
- Sometimes information is not up to date
- Some information is wrong
- More irrelevant information
- More complex systems are required

Open-domain QA 13
START [Katz, 1997]
- Utilized a knowledge base to answer users' questions
- The knowledge base was first created automatically from unstructured Internet data
- It was then used to answer natural language questions

IBM Watson 14
Webpage: IBM Watson
http://www-03.ibm.com/innovation/us/watson/index.html
Video: IBM and the Jeopardy Challenge
http://www.youtube.com/watch?v=fc3irywr4c8&feature=relmfu
http://www.youtube.com/watch?v=_1c7s7-3fxi
Video: A Brief Overview of the DeepQA Project
http://www.youtube.com/watch?v=3g2h3dz8rnc


Architecture 16
Example: Who is Warren Moon's agent? → Leigh Steinberg
Pipeline (over a corpus and the Web): Question Analysis → Question Classification → Query Construction → Document Retrieval → Sentence Retrieval → Sentence Annotation → Answer Extraction → Answer Validation
Underlying fields: Natural Language Processing, Information Retrieval, Information Extraction

Question Analysis 17
Target Extraction
- Extracting the target of the question
- Using the question target at the query construction step
Pattern Extraction (a small pattern-matching sketch follows)
- Extracting a pattern from the question
- Matching the pattern against a list of pre-defined question patterns
- Finding the corresponding answer pattern
- Locating the position of the answer in the sentence at the answer extraction step
Example:
  Question: In what country was Albert Einstein born?
  Question Pattern: In what country was X born?
  Answer Pattern: X was born in Y.
Parsing
- Using a dependency parser to extract the syntactic relations between question terms
- Using the dependency relation path between question words to extract the correct answer at the answer extraction step
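As a concrete illustration of pattern extraction, here is a minimal sketch in Python: a hand-written question pattern is matched against the input question, and the captured target is carried over into the corresponding answer pattern. The pattern table and regular expressions are illustrative assumptions, not the lecture's actual resources.

```python
import re

# Hypothetical pattern table: question regex -> answer regex template.
# "X" is the question target captured from the question; "Y" marks the answer.
PATTERNS = [
    (r"In what country was (?P<X>.+) born\?",
     r"{X} was born in (?P<Y>[A-Z]\w+)"),
]

def question_to_answer_pattern(question):
    """Return a compiled answer-pattern regex for the first matching question pattern."""
    for q_pat, a_template in PATTERNS:
        m = re.match(q_pat, question)
        if m:
            target = m.group("X")                      # e.g. "Albert Einstein"
            return re.compile(a_template.format(X=re.escape(target)))
    return None

answer_re = question_to_answer_pattern("In what country was Albert Einstein born?")
match = answer_re.search("Albert Einstein was born in Germany.")
print(match.group("Y"))  # -> Germany
```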

Question Classification 18
- Classifying the input question into a set of question types
- Defining a mapping between question types and available named entity labels
- Using the question type to extract strings of the same type as the input question at the answer extraction step
Example:
  Question: In what country was Albert Einstein born?
  Type: Country

Question Classification 19
Classification taxonomies:
- BBN
- Pasca & Harabagiu
- Li & Roth: 6 coarse-grained and 50 fine-grained classes; coarse types are HUMAN, ABBREVIATION, DESCRIPTION, LOCATION, NUMERIC, ENTITY (e.g., LOCATION includes fine-grained classes such as mountain, state, country, city)

Question Classification 20
Using available classifiers based on the preferred model (a small sketch follows):
- SVM: SVM-light
- Maximum Entropy: Maxent, Yasmet
- Naive Bayes
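A minimal question-type classifier sketch, here using scikit-learn (an assumption; the lecture names SVM-light, Maxent, and Yasmet). The tiny inline training set and the type-to-entity-label mapping are illustrative only; a real system would train on a labeled taxonomy such as Li & Roth's question set.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy training data: (question, question type).
train_questions = [
    "In what country was Albert Einstein born?",
    "What city hosted the 1936 Olympic Games?",
    "Who is the director of the Hermitage Museum?",
    "Who wrote the opera Don Giovanni?",
    "When was Mozart born?",
    "How many moons does Mars have?",
]
train_labels = ["LOCATION:country", "LOCATION:city", "HUMAN:individual",
                "HUMAN:individual", "NUMERIC:date", "NUMERIC:count"]

# Linear SVM over word and bigram TF-IDF features.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(train_questions, train_labels)

print(clf.predict(["In what country was Marie Curie born?"]))

# Hypothetical mapping from question types to named-entity labels,
# used later at the answer extraction step.
TYPE_TO_NE_LABEL = {"LOCATION:country": "COUNTRY",
                    "HUMAN:individual": "PERSON",
                    "NUMERIC:date": "DATE"}
```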

Query Construction 21
Goal: formulating a query with a high chance of retrieving relevant documents
Tasks:
- Assigning a higher weight to the question target
- Using query expansion techniques to expand the query: thesaurus-based query expansion, relevance feedback, pseudo-relevance feedback (sketched below)
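A minimal sketch of pseudo-relevance feedback under simple assumptions: the top-ranked documents are treated as relevant, and their most frequent non-query, non-stopword terms are appended to the query. The toy documents, stopword list, and plain term-frequency heuristic are illustrative only.

```python
from collections import Counter

STOPWORDS = {"the", "in", "was", "and", "a", "of", "to", "later"}

def expand_query(query_terms, top_docs, n_expansion_terms=3):
    """Add the most frequent new terms from the top retrieved documents to the query."""
    query = {t.lower() for t in query_terms}
    counts = Counter()
    for doc in top_docs:
        for word in doc.lower().replace(",", " ").replace(".", " ").split():
            if word not in query and word not in STOPWORDS:
                counts[word] += 1
    return list(query_terms) + [w for w, _ in counts.most_common(n_expansion_terms)]

top_docs = [
    "Albert Einstein was born in Ulm, Germany, in 1879.",
    "Einstein was born in the German Empire and later moved to Switzerland.",
]
print(expand_query(["Albert", "Einstein", "born"], top_docs))
```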

Document Retrieval 22
Importance:
- QA components use computationally intensive algorithms
- The time complexity of the system strongly depends on the size of the corpus to be processed
Tasks:
- Reducing the search space for the subsequent modules
- Retrieving relevant documents from a large corpus
- Selecting the top n retrieved documents for the next steps

Document Retrieval 23
Using available information retrieval models:
- Vector Space Model
- Probabilistic Model
- Language Model
Using available information retrieval toolkits (a vector-space sketch follows)
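A minimal vector-space retrieval sketch, assuming scikit-learn is available: documents are ranked by cosine similarity between the TF-IDF vectors of the query and each document. The three-document corpus is a toy stand-in for a real collection or an IR toolkit.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = [
    "Albert Einstein was born in Ulm, Germany.",
    "Warren Moon is a former professional quarterback.",
    "The Hermitage Museum is located in Saint Petersburg.",
]
query = "In what country was Albert Einstein born?"

# Build TF-IDF vectors for the corpus and project the query into the same space.
vectorizer = TfidfVectorizer(stop_words="english")
doc_vectors = vectorizer.fit_transform(corpus)
query_vector = vectorizer.transform([query])

# Rank documents by cosine similarity to the query.
scores = cosine_similarity(query_vector, doc_vectors).ravel()
for idx in scores.argsort()[::-1]:
    print(f"{scores[idx]:.3f}  {corpus[idx]}")
```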

Sentence Retrieval 24
Task: finding a small segment of text that contains the answer
Benefits beyond document retrieval:
- Documents are very large
- Documents span different subject areas
- The relevant information is expressed locally
- Retrieving sentences simplifies the answer extraction step

Sentence Retrieval 25
Language model-based sentence retrieval: the query Q is scored against each candidate sentence S1 ... S7 by the likelihood P(Q | S) of the query under the sentence's language model.
Query likelihood model: P(Q | S) = ∏_{i=1}^{M} P(q_i | S), where q_1 ... q_M are the query terms.
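A minimal sketch of query-likelihood sentence retrieval. The smoothing choice (Jelinek-Mercer interpolation with the collection model) is an assumption on top of the slide's unsmoothed product; without smoothing, any unseen query term would zero out the score.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().replace(".", "").replace(",", "").split()

def log_p_query_given_sentence(query, sentence, collection, lam=0.5):
    """log P(Q|S) = sum_i log[ lam * P_ml(q_i|S) + (1 - lam) * P_ml(q_i|Collection) ]"""
    s_tokens = tokenize(sentence)
    c_tokens = [t for s in collection for t in tokenize(s)]
    s_counts, c_counts = Counter(s_tokens), Counter(c_tokens)
    score = 0.0
    for q in tokenize(query):
        p_s = s_counts[q] / len(s_tokens)
        p_c = c_counts[q] / len(c_tokens)
        score += math.log(lam * p_s + (1 - lam) * p_c + 1e-12)  # guard against log(0)
    return score

sentences = [
    "Albert Einstein was born in Germany.",
    "Albert Einstein was born on 14 March 1879.",
    "Warren Moon's agent is Leigh Steinberg.",
]
query = "country Albert Einstein born"
ranked = sorted(sentences,
                key=lambda s: log_p_query_given_sentence(query, s, sentences),
                reverse=True)
print(ranked[0])
```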

Sentence Retrieval 26
Challenge: the term mismatch problem is more critical in sentence retrieval than in document retrieval
Approaches:
- Query expansion, (pseudo-)relevance feedback: does not work well at the sentence level
- Term relationship models: WordNet, term clustering models, translation models, triggering models

Sentence Annotation 27
Annotating relevant sentences using linguistic analyses (an NER sketch follows):
- Named entity recognition
- Dependency parsing
- Noun phrase chunking
- Semantic role labeling
Example (NER), for the question "In what country was Albert Einstein born?":
  Sentence 1: Albert Einstein [Person Name] was born on 14 March 1879 [Date].
  Sentence 2: Albert Einstein [Person Name] was born in Germany [Country].
  Sentence 3: Albert Einstein [Person Name] was born in a Jewish [Religion] family.
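A minimal NER annotation sketch using spaCy; the toolkit choice and model name are assumptions, since the lecture does not prescribe a particular tool (requires `pip install spacy` and `python -m spacy download en_core_web_sm`).

```python
import spacy

# Small English pipeline with a statistical named-entity recognizer.
nlp = spacy.load("en_core_web_sm")

sentences = [
    "Albert Einstein was born on 14 March 1879.",
    "Albert Einstein was born in Germany.",
]
for sent in sentences:
    doc = nlp(sent)
    print(sent)
    for ent in doc.ents:
        # spaCy uses labels such as PERSON, DATE, GPE (geo-political entity).
        print(f"  {ent.text:20s} {ent.label_}")
```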

Answer Extraction 28
Extracting candidate answers based on various sources of information (a small sketch follows):
- The extracted patterns from question analysis
- The dependency parse of the question from question analysis
- The question type from question classification
- All annotated data from sentence annotation
Example:
  Question: In what country was Albert Einstein born?
  Question Pattern: In what country was X born?
  Answer Pattern: X was born in Y.
  Question Type: LOCATION - COUNTRY
  Pattern match:
    Sentence 2: Albert Einstein was born in Germany. → Y = Germany
    Sentence 3: Albert Einstein was born in a Jewish family. → Y = a Jewish family
  NER filter (only candidates of the expected type Country survive):
    Sentence 2: Germany [Country] → accepted
    Sentence 3: Jewish [Religion] → rejected
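A minimal sketch combining the two cues from the example above: the answer pattern from question analysis proposes candidates, and the named-entity annotations filter them by the expected question type. The annotation dictionaries are hypothetical stand-ins for real sentence-annotation output.

```python
import re

answer_pattern = re.compile(r"Albert Einstein was born in (?P<Y>.+?)\.")
expected_type = "COUNTRY"

annotated_sentences = [
    {"text": "Albert Einstein was born in Germany.",
     "entities": {"Albert Einstein": "PERSON", "Germany": "COUNTRY"}},
    {"text": "Albert Einstein was born in a Jewish family.",
     "entities": {"Albert Einstein": "PERSON", "Jewish": "RELIGION"}},
]

candidates = []
for sent in annotated_sentences:
    m = answer_pattern.search(sent["text"])
    if not m:
        continue
    answer = m.group("Y")
    # Keep the candidate only if an entity inside the matched span carries
    # the named-entity label expected for this question type.
    if any(label == expected_type and ent in answer
           for ent, label in sent["entities"].items()):
        candidates.append(answer)

print(candidates)  # -> ['Germany']
```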

Answer Validation 29
Using the Web as a knowledge resource:
- Sending question keywords and answer candidates to a search engine
- Finding the frequency of each answer candidate within the Web data
- Selecting the most likely answers based on these frequencies

Answer Validation 30
Query models (a small sketch follows):
- Bag-of-Words
- Noun-Phrase-Chunks
- Declarative-Form
Example:
  Question: In what country was Albert Einstein born?
  Answer Candidate: Germany
  Bag-of-Words query: Albert Einstein born Germany
  Noun-Phrase-Chunks query: "Albert Einstein" born Germany
  Declarative-Form query: "Albert Einstein was born in Germany"
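A minimal validation sketch under a simplifying assumption: instead of querying a live search engine, the web-frequency idea is approximated by counting how often each candidate co-occurs with the question keywords in a list of retrieved snippets. The snippets and keyword handling are illustrative only.

```python
from collections import Counter

question_keywords = {"albert", "einstein", "born"}
candidates = ["Germany", "Ulm", "1879"]

# Stand-in for snippets returned by a search engine for the validation query.
snippets = [
    "Albert Einstein was born in Ulm, Germany, on 14 March 1879.",
    "Einstein was born in Germany and later emigrated to the United States.",
    "Germany celebrates Albert Einstein as one of its most famous scientists.",
]

scores = Counter()
for snippet in snippets:
    tokens = set(snippet.lower().replace(",", "").replace(".", "").split())
    if question_keywords & tokens:           # snippet mentions the question at all
        for cand in candidates:
            if cand.lower() in tokens:
                scores[cand] += 1

print(scores.most_common())  # the most frequent candidate is selected
```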


Fact vs. Opinion 32
Factual questions:
- Who is Warren Moon's agent?
- When was Mozart born?
- In what country was Albert Einstein born?
- Who is the director of the Hermitage Museum?

Fact vs. Opinion 33
Opinionated questions:
- What do people like about Wikipedia?
- What organizations are against universal health care?
- What are the public opinions on human cloning?
- How do students feel about Microsoft products?


Architecture 35 (Opinion QA)
Example: What do people like about Wikipedia?
The factoid pipeline is extended with opinion-specific components:
- Question Analysis is complemented by Q Semantic Classification and Q Polarity Classification
- Sentence Retrieval is followed by S Opinion Classification and S Polarity Classification
Full pipeline (over a corpus and the Web): Question Analysis (+ Q Semantic Classification, Q Polarity Classification) → Query Construction → Document Retrieval → Sentence Retrieval → S Opinion Classification → S Polarity Classification → Sentence Annotation → Answer Extraction → Answer Validation
Underlying fields: Natural Language Processing, Information Retrieval, Information Extraction, Opinion Mining

Q Polarity Classification 36
Running in parallel with question classification
Task: decide whether the input question has a positive or negative polarity
Examples:
- What do people like about Wikipedia? (positive)
- Why do people hate reading Wikipedia articles? (negative)
Approaches (a rule-based sketch follows):
- Using a rule-based model based on a subjectivity lexicon
- Running a classifier trained on an annotated corpus: training on the overall vocabulary of the dataset, or training on the polarity expressions from the subjectivity lexicon
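A minimal rule-based polarity sketch with a tiny hand-written subjectivity lexicon; the word lists are illustrative assumptions, whereas a real system would use a resource such as the MPQA subjectivity lexicon.

```python
POSITIVE = {"like", "love", "enjoy", "good", "great", "favorite"}
NEGATIVE = {"hate", "dislike", "bad", "terrible", "against", "worst"}

def question_polarity(question):
    """Classify a question as positive, negative, or neutral by lexicon lookup."""
    tokens = {t.strip("?.,!").lower() for t in question.split()}
    pos, neg = len(tokens & POSITIVE), len(tokens & NEGATIVE)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"

print(question_polarity("What do people like about Wikipedia?"))            # positive
print(question_polarity("Why do people hate reading Wikipedia articles?"))  # negative
```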

S Opinion Classification 37
Importance:
- The sentence retrieval output is mixed (factual and opinionated)
- Opinion question answering systems are looking for opinionated sentences
Goal: classifying retrieved sentences as opinionated or factual
Example, for the question "What do people like about Wikipedia?":
- S1: I agree Wikipedia is very much in handy when your online; however, I can not use it when I am not online. → Opinion
- S2: Wikipedia began as a complementary project for Nupedia, a free online English-language encyclopedia project. → Fact
- S3: Jimmy Wales and Larry Sanger co-founded Wikipedia in January 2001. → Fact
- S4: Wikipedia is a great way to access lots of information. → Opinion
Approaches: running a classifier trained on an annotated corpus (SVM, Maximum Entropy, Naive Bayes) — a small sketch follows
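A minimal opinion-vs-fact classifier sketch using a Naive Bayes model, one of the options named above (scikit-learn is an assumption). The four labeled sentences from the example serve as a toy training set; a real system needs a much larger annotated corpus.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

sentences = [
    "I agree Wikipedia is very much in handy when your online.",
    "Wikipedia is a great way to access lots of information.",
    "Wikipedia began as a complementary project for Nupedia.",
    "Jimmy Wales and Larry Sanger co-founded Wikipedia in January 2001.",
]
labels = ["opinion", "opinion", "fact", "fact"]

# Bag-of-words features fed into a multinomial Naive Bayes classifier.
clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(sentences, labels)

print(clf.predict(["Wikipedia articles are a great resource for students."]))
```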

S Polarity Classification 38
Task:
- Distinguishing between positive and negative sentences
- Returning sentences that have the same polarity as the input question
Approach: using a classifier to label opinionated sentences as positive or negative, with
- a small set of lightweight linguistic polarity features,
- the distance between polarity features and the topic in the sentence (a small sketch of this feature follows), and
- syntactic features obtained from a dependency parser
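A minimal sketch of one of the features mentioned above: the token distance between a polarity word and the sentence topic. The lexicon and topic handling are simplified placeholders; in a full system this value would feed into the polarity classifier alongside the other features.

```python
# Tiny illustrative polarity lexicon: word -> prior polarity.
POLARITY_WORDS = {"great": "positive", "hate": "negative", "handy": "positive"}

def polarity_with_distance(sentence, topic):
    """Return (polarity word, polarity, token distance to the topic) triples."""
    tokens = [t.strip(".,;?!").lower() for t in sentence.split()]
    topic_positions = [i for i, t in enumerate(tokens) if t == topic.lower()]
    features = []
    for i, tok in enumerate(tokens):
        if tok in POLARITY_WORDS and topic_positions:
            distance = min(abs(i - j) for j in topic_positions)
            features.append((tok, POLARITY_WORDS[tok], distance))
    return features

print(polarity_with_distance(
    "Wikipedia is a great way to access lots of information.", "Wikipedia"))
# -> [('great', 'positive', 3)]
```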


Summary 40

Next Semester 41
Master Seminar on Question Answering Systems
Topics: Question Analysis, Q Semantic Classification, Q Polarity Classification, Query Construction, Document Retrieval, Sentence Retrieval, S Opinion Classification, S Polarity Classification, Sentence Annotation, Answer Extraction, Answer Validation