A Hybrid Neural Model for Type Classification of Entity Mentions


Motivation
- Types group entities into categories.
- Entity types are important for various NLP tasks.
- Our task: predict an entity mention's type.
- Input: left context (c_1 ... c_S), mention (w_1 ... w_n), right context (c_1 ... c_S). Output: type.
- Example: [an initiative sponsored by] [Bill & Melinda Gates Foundation] [to fight HIV infection] -> Organization

Mention
- Bill & Melinda Gates Foundation (Organization)
  - Bill, Melinda, Gates -> {Person Name}
  - {Person Name} + Foundation -> Organization
Context
- [The greater part of ][Gates][' population is in Marion County.] (Location)
- [Gates][ was a baseball player.] (Person)

Architecture
- Decision model: softmax classifier over the joint representation x,
  $y = \frac{1}{\sum_{j=1}^{C} e^{\theta_j^\top x}} \left[ e^{\theta_1^\top x}, \dots, e^{\theta_C^\top x} \right]^\top$
- Components: context model (left), mention model, context model (right).
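The decision layer above can be sketched in a few lines of numpy. This is a minimal illustration, not the authors' implementation: `theta` holds one weight row per type and `x` stands for the concatenated mention/context representation; both are random placeholders here.

```python
import numpy as np

def softmax_classify(x, theta):
    """Softmax decision layer: theta has one weight row per type C.
    Returns a probability distribution over the C types."""
    scores = theta @ x                 # shape (C,)
    scores = scores - scores.max()     # subtract max for numerical stability
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum()

# Toy example: 3 types, 4-dimensional joint representation.
rng = np.random.default_rng(0)
theta = rng.normal(size=(3, 4))
x = rng.normal(size=4)
y = softmax_classify(x, theta)
```

The predicted type is then simply `np.argmax(y)`.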

RNN-based Mention Model
- Learns composition patterns for entity mentions:
  - {Name} + Foundation / University -> (Organization)
  - {Body Region} + {Disease} -> (Disease)
- Recurrent neural networks (Elman networks):
  - Use a global composition matrix to compute the representation recurrently.
  - A natural way to learn composition patterns.
- Recurrence for the mention "Bill & Melinda Gates Foundation" (words w_1 ... w_5):
  $v_2 = \tanh(W_m [w_1; w_2] + b_m)$
  $v_3 = \tanh(W_m [v_2; w_3] + b_m)$
  ...
  $v_5 = \tanh(W_m [v_4; w_5] + b_m)$
- The final state v_5 is the mention representation.
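The recurrence above can be sketched as follows, assuming the bracket notation [v; w] means concatenation of the previous state with the next word vector. The dimensions, random initialization, and the choice v_1 = w_1 are illustrative assumptions.

```python
import numpy as np

def mention_representation(word_vecs, W_m, b_m):
    """Elman-style recurrence over the mention words:
    v_t = tanh(W_m [v_{t-1}; w_t] + b_m), starting from v_1 = w_1.
    The final state is the mention representation."""
    v = word_vecs[0]
    for w in word_vecs[1:]:
        v = np.tanh(W_m @ np.concatenate([v, w]) + b_m)
    return v

d = 5
rng = np.random.default_rng(1)
W_m = rng.normal(scale=0.1, size=(d, 2 * d))    # global composition matrix
b_m = np.zeros(d)
words = [rng.normal(size=d) for _ in range(5)]  # e.g. the 5 mention tokens
v5 = mention_representation(words, W_m, b_m)
```

The same `W_m` is reused at every step, which is what lets the model learn patterns like {Name} + Foundation regardless of mention length.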

MLP-based Context Model
- Use context to disambiguate:
  - [The greater part of ][Gates][' population is in Marion County.] (Location)
  - [Gates][ was a baseball player.] (Person)
- Multilayer perceptron: location-aware, jointly trained.
- Layers: left context representation and right context representation -> concatenation layer -> hidden layer.
- Example context: "an initiative sponsored by" (left), "to fight HIV infection" (right).
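A minimal sketch of the concatenation-then-hidden-layer structure. The slide does not say how the context word vectors are pooled before the concatenation layer, so averaging each side is an assumption made here for illustration; the sizes and random weights are also placeholders.

```python
import numpy as np

def context_representation(left_vecs, right_vecs, W_h, b_h):
    """Location-aware context model: pool the left and right context
    windows separately (assumption: by averaging), concatenate them,
    then apply one tanh hidden layer."""
    left = np.mean(left_vecs, axis=0)
    right = np.mean(right_vecs, axis=0)
    concat = np.concatenate([left, right])  # concatenation layer
    return np.tanh(W_h @ concat + b_h)      # hidden layer

d = 4
rng = np.random.default_rng(2)
W_h = rng.normal(scale=0.1, size=(d, 2 * d))
b_h = np.zeros(d)
left = [rng.normal(size=d) for _ in range(4)]   # "an initiative sponsored by"
right = [rng.normal(size=d) for _ in range(4)]  # "to fight HIV infection"
h = context_representation(left, right, W_h, b_h)
```

Keeping the left and right sides separate before concatenation is what makes the model location-aware: the same word contributes differently depending on which side of the mention it appears on.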

Model Training
- Objective function:
  $\min_\theta \; -\sum_i \sum_j t_i^j \log y_i^j \;+\; \frac{\lambda}{2} \lVert \theta \rVert_2^2$
  (cross-entropy loss plus L2 regularization)
- Back-propagation: errors of the softmax classifier are back-propagated to the other layers.
- Optimization: mini-batched AdaGrad.
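The objective and one AdaGrad step can be sketched as below. This is a generic illustration of the training recipe named on the slide, not the paper's code; the learning rate and regularization constant are arbitrary.

```python
import numpy as np

def loss(y_pred, t, theta_flat, lam=1e-4):
    """Cross-entropy over the (mini-batch of) predictions plus
    L2 regularization on the flattened parameters."""
    ce = -np.sum(t * np.log(y_pred + 1e-12))
    return ce + (lam / 2) * np.sum(theta_flat ** 2)

def adagrad_step(theta, grad, hist, lr=0.1, eps=1e-8):
    """One AdaGrad update: each parameter gets its own learning rate,
    shrinking with its accumulated squared gradient."""
    hist += grad ** 2
    theta -= lr * grad / (np.sqrt(hist) + eps)
    return theta, hist

# Toy step: parameters start at zero, one gradient is applied.
theta = np.zeros(3)
hist = np.zeros(3)
grad = np.array([2.0, 0.0, -2.0])
theta, hist = adagrad_step(theta, grad, hist, lr=0.1)
```

Because the effective step size is lr / sqrt(accumulated squared gradient), frequently updated parameters take smaller steps over time, which tends to suit the sparse word-level features here.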

Automatically Generating Training Data
- A Wikipedia anchor link yields a (left context, mention, right context) triple plus a Wikipedia ID.
- The Wikipedia ID maps to a DBpedia entity, whose rdf:type (e.g., Organization) labels the example.

Automatically Generating Training Data (cont.)
- DBpedia ontology: 22 top-level types -> the Wiki-22 dataset.
- #Train: 2 million, #Dev: 0.1 million, #Test: 0.28 million.

Evaluation on Wiki-22
- Metrics: micro-F1 / macro-F1 score.
- Baseline methods:
  - Support Vector Machine (SVM)
  - Multinomial Naive Bayes (MNB)
  - Summed word vectors (ADD) with a softmax classifier
- Variants: *-mention (mention only), *-context (context only), *-joint (both mention and context).

Comparison with Previous Systems
- HYENA [Yosef et al., 2012]: Support Vector Machine with unigrams, bigrams, and trigrams of mentions, surrounding sentences, mention paragraphs, part-of-speech tags of context words, and a gazetteer dictionary.
- FIGER [Ling and Weld, 2012]: perceptron with unigrams, word shapes, part-of-speech tags, length, Brown clusters, head words, dependency structures, and ReVerb patterns.

Evaluation on Unseen Mentions
- Evaluate on unseen mentions (length > 2): mentions that do not appear in the training set.
- This measures how well the model handles uncommon or unseen mentions.
- The RNN-based mention model exploits the compositional nature of mentions.

Examples: Compositionality of Mentions
- Query for similar mention examples using the cosine similarity of the mentions' vector representations.
- Mentions with similar composition patterns lie closer together in the vector space.
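The similarity query described above amounts to ranking mention vectors by cosine similarity. A minimal sketch, with toy 2-dimensional vectors standing in for learned mention representations:

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two mention representations."""
    return (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

def most_similar(query_vec, mention_vecs, k=3):
    """Indices of the k mentions closest to the query, best first."""
    sims = np.array([cosine_similarity(query_vec, m) for m in mention_vecs])
    return np.argsort(sims)[::-1][:k]

# Toy example: the third vector nearly matches the first.
mentions = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 0.1])]
top = most_similar(mentions[0], mentions, k=2)
```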

Evaluation in Question Answering (QA)
- Web-based QA system [Cucerzan and Agichtein, 2005; Lin, 2007], extended with a question/answer type-interaction feature template.
- Q: "who is the ceo of microsoft?" -> answer type classifier (18 types) predicts Person.
- A search engine (Bing) returns candidates extracted from titles and snippets:
  - [left context] [Satya Nadella] [right context] -> Person
  - [left context] [Xbox] [right context] -> Device
- The ranker adds the feature template {Type(Q), Type(A)}:
  - {Person, Person} receives a positive weight; {Person, Device} a negative weight.

Evaluation in Question Answering (QA) (cont.)
- WebQuestions dataset [Berant et al., 2013]: manually annotated question-answer pairs.
- Our type classifier improves the accuracy of the QA system.

Conclusion and Future Work
Conclusion
- Recurrent neural networks are good at learning soft patterns: they capture the compositional nature of entity mentions and generalize to unseen or uncommon mentions.
- Training data can be generated automatically instead of annotated manually.
- Type information is important for many NLP tasks.
Future work
- Fine-grained type classification (Person -> doctor, actor, etc.), utilizing a hierarchical taxonomy and multi-label prediction.
- Utilize global information (e.g., document topic).

THANKS!