Graph-Based Entity Linking

A. Naderi, J. Turmo, H. Rodríguez
TALP Research Center, Universitat Politècnica de Catalunya
February 20, 2014

Table of contents
1 Introduction
    Entity Linking
    EL applications
    EL problem in real-life entities
    Our proposal
2 Our Approach
    Candidate Generation
    Graph Generation
    Graph-based Ranking
3 Evaluation Framework

Section 1: Introduction

Importance of World Knowledge
The need for world knowledge in Artificial Intelligence applications has been growing rapidly.
Knowledge Bases (KBs) store and categorize entities and their relations as a part of world knowledge.

Avoiding the high cost of manual elicitation
However, the high cost of manually eliciting knowledge to create KBs pushes toward automatic acquisition from text. This requires two main abilities:
1) extracting relevant information about the mentioned entities from unstructured text (Slot Filling), and
2) linking these entities with entries in the KB (Entity Linking).

Entity Linking (EL)
Also known as record linkage, grounding, or named entity disambiguation.

EL applications
Automatic link generation for entity references in news articles.
E-mail processing, to identify references in messages to people in the contact list.
Company monitoring, to track events such as mergers or new product releases.
Automated corporate customer care, to process inquiries or complaints using the information provided by the customers.

EL Problems

Our Proposal for EL
We take advantage of a graph-based structure for EL. A graph:
presents facts in visual form;
makes facts clearer and more understandable;
shows and compares relationships and changes convincingly;
conveys information compactly;
emphasizes the main points.
Some EL research has used graph-based approaches, but they remain under-explored compared with other approaches.

Section 2: Our Graph-Based Approach for EL

EL Task Definition
An example Mono-Lingual EL query is:
<query id="EL000304">
  <name>barnhill</name>
  <docid>eng-ng-31-100578-11879229</docid>
  <startoffset>xxx</startoffset>
  <endoffset>yyy</endoffset>
</query>
The answer is the ID of the KB entry to which the name refers, or a NILxxxx ID.
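As a rough illustration of the query format above, the following Python sketch parses one query with the standard-library `xml.etree.ElementTree` module. The `ELQuery` class and the `parse_query` function are hypothetical names introduced here, and the quoted `id` attribute is an assumption about the original markup. The offsets are kept as text because the slide shows only the placeholders `xxx` and `yyy`.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class ELQuery:
    """One entity linking query (hypothetical container, not from the slides)."""
    query_id: str
    name: str
    doc_id: str
    start_offset: str  # kept as text: the slide shows only the placeholder "xxx"
    end_offset: str    # kept as text: the slide shows only the placeholder "yyy"


def parse_query(xml_text: str) -> ELQuery:
    """Parse a single <query> element into an ELQuery."""
    root = ET.fromstring(xml_text)
    return ELQuery(
        query_id=root.attrib["id"],
        name=root.findtext("name", default=""),
        doc_id=root.findtext("docid", default=""),
        start_offset=root.findtext("startoffset", default=""),
        end_offset=root.findtext("endoffset", default=""),
    )


SAMPLE = """<query id="EL000304">
  <name>barnhill</name>
  <docid>eng-ng-31-100578-11879229</docid>
  <startoffset>xxx</startoffset>
  <endoffset>yyy</endoffset>
</query>"""

query = parse_query(SAMPLE)
print(query.query_id, query.name, query.doc_id)
```

The system's answer for such a query would then be either a KB entry ID or a NIL ID, as stated on the slide.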

Our Approach

Graph Generation: initial graph
A graph for the query name Picasso.

Graph Generation
Each edge in this graph has two characteristics:
1) Attributes (e.g., Place of Birth, Age)
2) An assigned weight
We manually assign different weights to the edges according to the reliability of the information extracted for each candidate.
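One possible way to hold such edges in memory is sketched below in Python. The `Edge` and `ELGraph` classes, the node labels, and the weight values are all hypothetical and chosen only for illustration; the slides do not say how the graph is stored or which weights were assigned.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Edge:
    """An edge carrying the two characteristics named on the slide."""
    source: str
    target: str
    attribute: str  # e.g. "Place of Birth"
    weight: float   # manually assigned reliability (illustrative value)


@dataclass
class ELGraph:
    """Minimal edge-list graph (hypothetical helper)."""
    edges: List[Edge] = field(default_factory=list)

    def add_edge(self, source: str, target: str, attribute: str, weight: float) -> None:
        self.edges.append(Edge(source, target, attribute, weight))

    def edges_of(self, node: str) -> List[Edge]:
        """Return every edge that touches the given node."""
        return [e for e in self.edges if node in (e.source, e.target)]


graph = ELGraph()
# Illustrative nodes and weights only; not taken from the slides.
graph.add_edge("candidate:Pablo Picasso", "Málaga", "Place of Birth", 0.9)
graph.add_edge("query:Picasso", "Málaga", "Place of Birth", 0.5)

for edge in graph.edges_of("Málaga"):
    print(edge.source, edge.attribute, edge.weight)
```

An edge list keeps the attribute and weight together on each edge, which is all the later ranking step needs.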

Graph Generation
The system then tries to find properties that the candidates share with the query, and generates a path between each candidate and the query through these common properties.
Figure: common properties between query and candidates.
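The path generation step could look roughly like the Python sketch below. It assumes each node's properties are stored as a mapping from an (attribute, value) pair to the weight of the edge reaching that property; the function name `build_paths`, the "Occupation" attribute, and all weights are hypothetical. Each path is recorded as the list of edge weights from the query to the candidate.

```python
from typing import Dict, List, Tuple

Property = Tuple[str, str]          # (attribute, value)
PropertyWeights = Dict[Property, float]


def build_paths(
    query_props: PropertyWeights,
    candidates: Dict[str, PropertyWeights],
) -> Dict[str, List[List[float]]]:
    """For each candidate, create one path per property shared with the query.

    A path is the list [query-edge weight, candidate-edge weight].
    """
    paths: Dict[str, List[List[float]]] = {}
    for name, props in candidates.items():
        shared = sorted(set(query_props) & set(props))
        paths[name] = [[query_props[p], props[p]] for p in shared]
    return paths


# Illustrative data only; not taken from the slides.
query_props = {
    ("Place of Birth", "Málaga"): 0.5,
    ("Occupation", "painter"): 0.25,
}
candidates = {
    "c1": {("Occupation", "painter"): 0.5},
    "c2": {("Occupation", "painter"): 0.75, ("Place of Birth", "Málaga"): 1.0},
    "c3": {},
}

paths = build_paths(query_props, candidates)
print(paths)
```

A candidate with no shared property, such as `c3` here, ends up with an empty path list.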

Graph-Based Ranking
$C = (c_1, c_2, \ldots, c_n)$ is the set of candidate nodes and $q$ is the query node in the graph $G$.
$P_{c_k} = (P^1_{c_k}, P^2_{c_k}, \ldots, P^m_{c_k})$ is the set of paths between $q$ and $c_k$.
Each path $P^i_{c_k}$ is represented by the sequence of weights corresponding to the edges in the path, $P^i_{c_k} = \langle w_1, w_2, \ldots, w_r \rangle$.
If $s_{c_k}$ is the score of the candidate node $c_k$, then:

$$
s_{c_k} =
\begin{cases}
\sum_{P^i_{c_k} \in P_{c_k}} \sum_{w_j \in P^i_{c_k}} w_j & \text{if } P_{c_k} \neq \emptyset \\
0 & \text{if } P_{c_k} = \emptyset
\end{cases}
$$

Graph-Based Ranking
A sample view of our graph structure:
$s_{c_1} = s_q (w_{q2} + w_{c11})$
$s_{c_2} = s_q (w_{q2} + w_{c12}) + s_q (w_{q1} + w_{c22})$
$s_{c_3} = 0$
The initial score is zero for the candidates and 1 for the query.
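The sample scores above can be reproduced with a few lines of Python. This is a minimal sketch that reads the score as the query score times the sum of the edge weights on each path, added up over all paths of a candidate. The function name `score_candidate` and the numeric weights are hypothetical, since the slides give no concrete values.

```python
from typing import List


def score_candidate(paths: List[List[float]], query_score: float = 1.0) -> float:
    """Sum, over all paths, of query_score times the path's edge weights.

    A candidate with no path to the query keeps its initial score of zero.
    """
    if not paths:
        return 0.0
    return sum(query_score * sum(path) for path in paths)


# Illustrative weights only (assumed values, not from the slides).
w_q1, w_q2 = 0.5, 0.25
w_c11, w_c12, w_c22 = 0.5, 0.75, 1.0

candidate_paths = {
    "c1": [[w_q2, w_c11]],
    "c2": [[w_q2, w_c12], [w_q1, w_c22]],
    "c3": [],
}

scores = {name: score_candidate(p) for name, p in candidate_paths.items()}
print(scores)  # {'c1': 0.75, 'c2': 2.5, 'c3': 0.0}
```

With the query score fixed at 1, a candidate that shares more properties with the query accumulates more paths and therefore a higher score.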

Graph-Based Ranking
Taking $m_q$ as the query name and $S = \{s_{c_k}\}$, the link between $m_q$ and the KB is obtained as follows:

$$
link(m_q, KB) =
\begin{cases}
c & \text{if } \exists c \in C : s_c = \max(S) \geq \beta \\
NIL & \text{otherwise}
\end{cases}
$$
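The linking decision can be sketched in Python as follows. This assumes that β acts as a minimum score threshold and that the comparison is "greater than or equal"; the operator is not legible in the slide text, so that choice is an assumption. The function name `link` mirrors the slide, while the candidate names and the threshold value are hypothetical.

```python
from typing import Dict


def link(scores: Dict[str, float], beta: float) -> str:
    """Return the top-scoring candidate if its score reaches beta, else "NIL".

    Assumption: the threshold test is `>=`; ties go to the first candidate seen.
    """
    if not scores:
        return "NIL"
    best = max(scores, key=scores.get)
    return best if scores[best] >= beta else "NIL"


# Illustrative scores and thresholds only; not taken from the slides.
example_scores = {"c1": 0.75, "c2": 2.5, "c3": 0.0}
print(link(example_scores, beta=1.0))  # c2
print(link(example_scores, beta=3.0))  # NIL
```

Raising β makes the system more willing to answer NIL, which matters when the queried entity has no entry in the KB.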

Section 3: Evaluation

Evaluation
To test our baseline EL system, we participated in the TAC-KBP English Mono-Lingual Entity Linking (MLEL) evaluation in 2012.
E. González, H. Rodríguez, J. Turmo, P.R. Comas, A. Naderi, A. Ageno, E. Sapena, M. Vila, M.A. Marti. 2012. The TALP participation at TAC-KBP 2012. Text Analysis Conference, Gaithersburg, MD, USA.

Thank You