A Neighborhood Relevance Model for Entity Linking
Jeffrey Dalton
University of Massachusetts
140 Governors Drive
Amherst, MA, U.S.A.

Laura Dietz
University of Massachusetts
140 Governors Drive
Amherst, MA, U.S.A.

ABSTRACT
Entity linking is the task of mapping a string in a document to its entity in a knowledge base. One of the crucial tasks is to identify disambiguating context; joint assignment models leverage the relationships within the knowledge base. We demonstrate how joint assignment models can be approximated with information retrieval. We introduce the neighborhood relevance model, which uses relevance feedback techniques to identify the salience of entity context using cross-document evidence. We show that this model is more effective than local document models for ranking KB entities. Experiments on the TAC KBP entity linking task demonstrate that our model is the best performing system for strings that are linkable to the knowledge base.

1. INTRODUCTION
Entity linking is important because most content created is unstructured text in the form of news, blogs, forums, and microblogs such as Twitter and Facebook. A key challenge is to link these unstructured text documents to the Web of Data. Entity linking bridges the structure gap between text documents and linked data by identifying mentions of entities in free text and linking them to knowledge bases. It enriches unstructured documents with links to people, places, and concepts in the world. Entity linking is a fundamental building block that supports a wide variety of information extraction, document summarization, and data mining tasks. For example, linked entities in documents can be used to expand existing knowledge base entries with new facts and relationships.

The major challenge in entity linking is ambiguity. An entity mention in text may be ambiguous for a wide variety of reasons: multiple entities share the same name (e.g. Michael Jordan), entities are referred to incompletely (e.g. Justin for Justin Bieber), by pseudonyms or nicknames (Christopher George Latore Wallace is also known as The Notorious B.I.G.), and are often abbreviated (e.g. UW for the University of Wisconsin as well as the University of Washington).

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. OAIR'13, May 22-24, 2013, Lisbon, Portugal. Copyright 2013 CID

The entity linking problem has been studied over several years in the TAC Knowledge Base Population venue with the following task definition:

Entity Linking: Given a string mention q in a document, predict the entity c in the knowledge base which the string represents, or NIL if no such entity is available.

A typical entity linking system consists of four phases: 1) query expansion, 2) candidate generation, 3) entity ranking, and 4) handling NIL cases. The goal of the first two steps is to achieve a high-recall set of Wikipedia entities. Given the candidate set, most effective approaches, e.g., [15, 4, 16], leverage contextual entities as disambiguating evidence in step 3. One issue is that the candidate generation step is often performed using string matching heuristics, resulting in large candidate sets that may contain hundreds or thousands of entities for ambiguous matches. Candidate generation and ranking are often separate and not well aligned.

We advocate an information retrieval approach that uses one probabilistic model to approach steps 1-3. We introduce our linking system, KB Bridge. Supplementary materials for this work are available on the KB Bridge website (footnote 1). Existing entity linking methods only employ IR to a minor degree.
We model the entire linking task as a retrieval problem, including formalising joint assignment models within the retrieval framework. The graphical modeling framework allows us to ground our work on models from both information extraction and information retrieval.

For a given entity mention, the correct knowledge base entry is likely to share important pieces of contextual information: lexically similar names, shared topical similarity reflected in word usage, and shared named entities. Additional context could be included, but throughout the paper we focus on the previously described context: name variations, surrounding sentences, and neighboring entities.

Entity linking poses some unusual challenges. The typical IR setting addresses short queries by using relevance feedback [17] to expand the query model. In entity linking, the query is an entity string embedded in a longer document, providing an abundance of context which could be leveraged. However, not all context is equally helpful, whether because of ambiguity, heterogeneity in topic, or spurious collocations. Consider the example "ABC shot the TV drama Lost in Australia." with the task of linking "ABC" to the entity American Broadcasting Company. The named entity span "Australia" is not relevant for the true answer. It might actually misguide the process to link to the wrong entity, Australian Broadcasting Corporation.

Footnote 1: jdalton/kbbridge

We introduce the neighborhood relevance model to estimate the salience of context with the goal of filtering and weighting (as opposed to expanding) the context model. The neighborhood relevance model is based on ideas of pseudo-relevance feedback [20] and latent concept expansion [13] to leverage co-occurrence evidence across topically similar documents.

Our main contributions are:
- An unsupervised approach to entity linking based upon the Markov Random Field information retrieval model that provides competitive performance out-of-the-box.
- A unified retrieval-based approach to linking that combines candidate generation and ranking in a single retrieval framework, with more than 95% recall in the highest ranked 10 entities.
- A mention-specific approach for identifying salient neighboring entities using across-document evidence based on relevance feedback.
- A demonstration of the benefits of the entity neighborhood relevance model in combination with a supervised learning to rank framework, resulting in the best ranking for entities linkable to the knowledge base for the TAC KBP task.

2. BACKGROUND AND RELATED WORK

2.1 Related Work
Early work on entity linking was performed by Bunescu and Pasca [2] and Cucerzan [3] to link mentions of topics to their Wikipedia pages. In contrast to their models, we focus on a retrieval approach that leverages text-based ranking without exploiting extensive Wikipedia-specific structure.

Our work is related to that of Gottipati and Jiang [5], who apply a language modeling approach to entity linking. They expand the original query mention with contextual information from the language model of the document. We use the local weighting as a starting point for estimating the entity salience and compare against it as a baseline.

Entity linking has been studied in a variety of recent venues.
At INEX, the Link the Wiki task explored automatically discovering links that should be created in a Wikipedia article [7]. More recently, it is one of the principal tasks studied at the ongoing Text Analysis Conference Knowledge Base Population track (TAC KBP). Ji et al. [9, 8] provide an overview of the recent systems and approaches.

Instead of linking individual mentions one at a time, recent work [3, 16, 19, 11, 6] focuses on linking the set of mentions, M, that occur in the query document d. These models perform joint inference over the link assignments to identify a coherent assignment of KB entries. In our work we also leverage the set of mentions M in the document, but as context in an information retrieval model. We focus on identifying salient entity mentions in the context, because mentions in the document may be spurious or only tangentially related. This is especially true if the document contains multiple topics.

2.2 Joint Neighborhood Assignment Model
Graphical models [10], such as Markov random fields, are a popular tool in both information extraction and information retrieval. Graphical models provide the mathematical framework for formalizing intuitions on how available data (e.g., the query mention) and quantities of interest (e.g., the entity in the knowledge base) are connected. After casting data and quantities of interest as random variables, dependencies between two (or more) variables are encoded by factor functions φ that assign a non-negative score to each combination of variable settings. Factors φ are often expressed by a log-linear function of a feature vector. The joint configuration of all variables is scored by the likelihood function L, which is represented by the normalized product over scores of all factor functions for the given variable settings. The factor functions φ are designed to encode our intuition on which variable choices go well together.
For instance, early approaches to entity linking [1] heuristically extract a set of entity candidates and use a graphical model with a single factor φ_me(q, c). The factor φ_me formalizes the intuition that query mentions q and entities c are compatible when their names match and the document surrounding q has terms similar to the article of the entity candidate c.

This basic model is extended in joint neighborhood assignment models [3, 16, 12]. The idea is that knowledge base entities which are mentioned in the same documents are also likely to be structurally related in the knowledge base. When the query mention is ambiguous, several candidate entities have an equally high score under the factor φ_me(q, c). Joint assignment models explicitly incorporate entity mentions from the same document as the query mention q, which we call neighbor mentions henceforth. Each neighbor mention m_i is associated with a latent variable z_i referring to the respective entity for the neighbor. We expect the neighbor entities z to help disambiguate the true candidate c. This expectation is formalized by a second factor function φ_ee(z, c), which captures intuitions on when entities z and c are mentioned in the context of each other. For the factor function φ_ee, Ratinov et al. [16] use the set similarity of inlinks (and outlinks) for c and z; Cucerzan [4, 3] uses the overlap in Wikipedia categories and the presence of links from z to c and vice versa.

For each neighbor mention m_i a set of candidates is heuristically identified, where the variable z_i represents a choice of the candidate. The model is visualized in Figure 1a, where variables q and m_i are observed, but settings of variables c and z_i are chosen from the respective candidate sets. A factor is represented as a small black box which connects its input variables.

If each neighboring mention is linked to its correct entity candidate z_i, the true candidate c will achieve a high score of φ_ee(z_i, c) for most of the neighbors z_i.
This information is combined across all neighbors and taken together with the name factor φ_me(q, c). The dilemma is that linking all m_i to z_i requires that we solve the entity linking problem as part of the solution. For this model, the joint inference problem does not have a closed-form solution and therefore requires approximate inference such as belief propagation [12] or iterative heuristics [4, 16].

As the task is to link only the single query mention, the neighboring entity links are to be marginalized out by integration over z. Candidates c for the query are scored by Equation 1.
\[ L(c) = \phi_{me}(q, c) \prod_{m} \left( \int \phi_{me}(m, z)\, \phi_{ee}(z, c)\, dz \right) \tag{1} \]

Figure 1: Neighborhood Models. (a) Joint Assignment Model. (b) Neighborhood Query Model.

3. QUERY MODEL
In this section we integrate graphical models for entity linking as developed in the information extraction community with graphical models used for information retrieval. We utilize the fact that query models, such as query likelihood and the sequential dependence model [14], have an underlying graphical model that gives rise to a score of a document.

3.1 Neighborhood Query Model
We demonstrate how the retrieval engine can be used to solve the joint inference problem of Equation 1 whenever the factor functions φ_me and φ_ee are expressible as query operators. This allows us to combine the candidate generation (step 2) with the entity ranking (step 3) and have it carried out by a retrieval engine that optimizes over all possible entities in the knowledge base at once. In contrast to the joint neighborhood assignment approaches, no candidate set has to be identified beforehand.

The key insight is that for particular choices of features for the name-match factor φ_me(m, z) and the entity-link factor φ_ee(z, c), it is possible to solve the integral over z in Equation 1 with smart preprocessing and indexing.

For instance, consider the following simple factor functions for φ_me and φ_ee. We choose φ_me(m, z) to yield 1 if the mention m is a string match to the title of z and 0 otherwise, and φ_ee(z, c) to yield 1 if z links to c. In this case, we can replace the integral over z with a factor φ_me(m, c) that yields 1 if the mention m matches a title of an inlink of c. Such a factor can be analytically solved by transforming the Wikipedia snapshot so that the indexed article of entity c is enriched with the titles of Wikipedia pages that link to c in a separate field (or extent). A search query for the string m in the field of inlinks represents a graphical model that yields the score of φ_me(m, c).
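
The collapse of the integral over z can be checked on a toy example. The sketch below is only an illustration under stated assumptions: the four-page link graph, the case-insensitive exact title match, and all function names (`phi_me`, `phi_ee`, `marginal_over_z`, `phi_replacement`) are hypothetical and not taken from KB Bridge. It compares the discrete sum over z with a lookup in a precomputed inlink-title index.

```python
# Toy knowledge base (hypothetical data): page title -> titles it links to.
OUTLINKS = {
    "Lost (TV series)": {"American Broadcasting Company"},
    "Australia": {"Australian Broadcasting Corporation"},
    "American Broadcasting Company": set(),
    "Australian Broadcasting Corporation": {"Australia"},
}
ENTITIES = list(OUTLINKS)


def phi_me(mention, z):
    """1 if the mention string equals the title of z (case-insensitive), else 0."""
    return int(mention.lower() == z.lower())


def phi_ee(z, c):
    """1 if the page of z links to c, else 0."""
    return int(c in OUTLINKS[z])


def marginal_over_z(mention, c):
    """Discrete counterpart of the integral over z in Equation 1."""
    return sum(phi_me(mention, z) * phi_ee(z, c) for z in ENTITIES)


def build_inlink_titles():
    """Preprocessing step: for every entity c, collect lower-cased titles of pages linking to c."""
    inlinks = {c: set() for c in ENTITIES}
    for z, targets in OUTLINKS.items():
        for c in targets:
            inlinks[c].add(z.lower())
    return inlinks


INLINK_TITLES = build_inlink_titles()


def phi_replacement(mention, c):
    """Replacement factor: 1 if the mention matches the title of an inlink of c."""
    return int(mention.lower() in INLINK_TITLES[c])


if __name__ == "__main__":
    for c in ENTITIES:
        print(c, marginal_over_z("Australia", c), phi_replacement("Australia", c))
```

Because titles are unique in this toy graph, the sum over z is either 0 or 1 and agrees with the index lookup for every candidate. With softer factor functions the two quantities would be real-valued scores rather than indicators.
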
For many choices of factor functions for φ_me and φ_ee, the knowledge base can be transformed to allow for efficient replacement factors φ_me(m, c) that can be directly optimized within the retrieval model framework. By integrating z out, the joint neighborhood assignment model depicted in Figure 1a becomes the neighborhood query model in Figure 1b with the following likelihood function.

\[ L(c) = \phi_{me}(q, c) \prod_{m} \phi_{me}(m, c) \tag{2} \]

Many complex factor functions are possible, but for the remainder of this publication we use two simple, yet effective factor functions φ:
- Factor φ_me(q, c) encodes matches of the string q and terms from the surrounding text in any field of the Wikipedia article of the candidate c; this includes names as well as the full text of the article.
- Factor φ_me(m, c) matches the string representation of m in the full text and titles of entities that link to (and are linked from) the Wikipedia entry of c.

3.2 Relevance-weighted Neighborhood Query Model
As pointed out in the introduction, not all contextual entities are equally salient. For each contextual entity m, its salience for disambiguating query q is denoted by ρ_q(m), ranging on a scale between 0 and 1. If the salience ρ_q(m) is 0, we want to remove the effect of φ_me(m, c) on the likelihood function. Based on the geometric mean, which is the natural choice for averages of probabilities, we achieve the weighting with the geometric interpolated model of Equation 3.

\[ L(c) = \phi_{me}(q, c) \prod_{m} \phi_{me}(m, c)^{\rho_q(m)} \tag{3} \]

Notice that the unweighted model follows as a special case where all saliences are 1.

We also introduce the parameter λ_M that allows trading off between the direct similarity of the query and candidate as expressed by φ_me(q, c) and the aggregated influence of the M contextual entities. Exploiting that the sort order induced by L is invariant with respect to logarithms, we cast the optimization in log-space as in Equation 4.
\[ \log L(c) = \log \phi_{me}(q, c) + \lambda_M \frac{1}{M} \sum_{m} \rho_q(m) \log \phi_{me}(m, c) \tag{4} \]

3.3 Context from Query Document
The factor φ_me(q, c) formalizes our intuition on compatibility between the query mention q and the true candidate c. This includes name matches of the string representation of the mention with names listed in the knowledge base (e.g. title, redirect, anchor text). We further extend it to other similarity measures that are independent of the neighbor mentions (which are represented by φ_me(m, c)).

Name variations v of the query string can be extracted from the query document to add robustness to the entity linking inference. This is especially important if the query string is an acronym or an ambiguous reference to the entity. We also incorporate the words, s, of the sentences that surround the query mention or one of its name variations to preserve verbs, adjectives, and standing expressions that might disambiguate the candidate.

We introduce separate weight parameters λ_Q, λ_V, λ_S to individually control the influence of name matches of the query mention, name matches of name variations v, and sentence context, respectively. Accordingly, we model the factor function φ_me(q, c) by the likelihood of a graphical model itself, represented by a log-linear function of potential functions φ_name for name similarity and φ_sent for sentence context.

The resulting optimization criterion of the candidate answer c for the query model given the query q, V name variations v, S contextual phrases s, and M neighbor mentions m is given in Equation 5.
\[
\begin{aligned}
\log L(c) ={} & \lambda_Q \log \phi_{name}(q, c) \\
& + \lambda_V \frac{1}{V} \sum_{v} \log \phi_{name}(v, c) \\
& + \lambda_S \frac{1}{S} \sum_{s} \log \phi_{sent}(s, c) \\
& + \lambda_M \frac{1}{M} \sum_{m} \rho_q(m) \log \phi_{me}(m, c)
\end{aligned}
\tag{5}
\]

3.4 Joint Inference with Galago
Using log-linear models for factors φ with features that are readily available in the Indri and Galago query languages, we can leverage the retrieval engine to optimize Equation 5. This is possible because the geometric interpolations with weights λ and ρ are expressed with the weighted #combine operator.

We rely on the sequential dependence model [14], a query model that models dependencies between adjacent query words. It strikes a balance between a model that ignores the order of the query terms (e.g. the query likelihood model) and a model that requires the exact ordering of the words (n-gram or phrase model). For a given sequence of terms t_1, t_2, ..., t_n, the sequential dependence model scores documents by a graphical model combining term factors φ_t(t_i), bi-gram factors φ_o(t_i, t_i+1), and factors capturing whether two subsequent query terms t_i and t_i+1 occur within a window of 8 words.

We use simple factors in the remainder of the paper. The name-match factor φ_name(q, c) tests all of the entity's indexed document for the presence of the string representation of the query mention q. Because q consists of multiple terms, we use the sequential dependence model. We use a query likelihood operator for φ_sent. For the neighbor factor φ_me(m, c) we also use the sequential dependence model. With these feature functions, the optimization criterion of Equation 5 is equivalent to the Galago query given in Figure 2.

We restrict ourselves to simple factor functions to demonstrate general applicability. Field-retrieval models provide further options, e.g., to let φ_me(q, c) distinguish matches in different name fields (title, anchor text, etc.), and to let φ_me(m, c) distinguish matches in the full text, the titles of in-links, and the titles of out-links.
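
To make the correspondence between Equation 5 and the query of Figure 2 concrete, the following sketch assembles such a query string from the four context components. It is a minimal illustration, not the KB Bridge implementation: the function names, the three-decimal weight formatting, and the example weights are assumptions, and no escaping of special characters is attempted. Only the #combine and #seqdep operators and the `:i=weight` annotation are taken from Figure 2.

```python
def seqdep(text):
    """Wrap a whitespace-normalized string in the #seqdep operator used in Figure 2."""
    return "#seqdep(" + " ".join(text.split()) + ")"


def combine(parts, weights=None):
    """Build '#combine( ... )', optionally with ':i=w' weight annotations."""
    spec = ""
    if weights is not None:
        spec = "".join(f":{i}={w:.3f}" for i, w in enumerate(weights))
    return f"#combine{spec}( " + " ".join(parts) + " )"


def build_query(q, variations, sentences, neighbor_salience, lam):
    """Assemble the salience-weighted neighborhood query.

    q                 -- query mention string
    variations        -- list of name variation strings v
    sentences         -- list of (stopped) sentence strings s
    neighbor_salience -- dict: neighbor mention m -> salience rho_q(m)
    lam               -- dict with keys "Q", "V", "S", "M" (hypothetical layout)
    Empty components are skipped together with their weight.
    """
    components = [(lam["Q"], seqdep(q))]
    if variations:
        components.append((lam["V"], combine([seqdep(v) for v in variations])))
    if sentences:
        components.append((lam["S"], combine([seqdep(s) for s in sentences])))
    if neighbor_salience:
        mentions = list(neighbor_salience)
        components.append((
            lam["M"],
            combine([seqdep(m) for m in mentions],
                    [neighbor_salience[m] for m in mentions]),
        ))
    return combine([part for _, part in components],
                   [weight for weight, _ in components])


if __name__ == "__main__":
    print(build_query(
        "ABC",
        ["American Broadcasting Company"],
        ["abc shot tv drama lost australia"],
        {"Lost": 0.8, "Australia": 0.1},          # illustrative saliences
        {"Q": 0.4, "V": 0.3, "S": 0.1, "M": 0.2},  # illustrative weights
    ))
```

Setting a neighbor's salience to 0 gives its #seqdep clause zero weight in the inner #combine, which is the filtering behavior Section 3.2 asks for.
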
The factor functions can make use of any feature that can be expressed in the Galago query language while still being optimized inside the retrieval engine.

4. NEIGHBORHOOD RELEVANCE MODEL
In the previous section we introduced a query model containing salience-weighted entity mentions. We now discuss methods for estimating these salience weights ρ_q(m) in an unsupervised manner. The idea is to assume a high salience of m for the query mention q if both are frequently mentioned together. It is important to note that even unambiguous mentions are not necessarily useful for disambiguating other mentions.

  #combine:0=λ_Q:1=λ_V:2=λ_S:3=λ_M(
    #seqdep(q)
    #combine(#seqdep(v_0) ... #seqdep(v_V))
    #combine(#seqdep(s_0), ..., #seqdep(s_S))
    #combine:0=ρ(m_0):...:k=ρ(m_k)(#seqdep(m_0), ..., #seqdep(m_k))
  )

Figure 2: Salience-weighted neighborhood query model in Galago syntax.

4.1 Local Document Neighborhood Model
As a first approach, the salience of the neighboring mentions for the query mention can be estimated from the document surrounding the query. This technique was used by Gottipati and Jiang [5] to build a multinomial language model of entity mentions m from the query document d_q with occurrence count n_{m,d_q}. We refer to this simple estimation technique as the local model.

\[ \rho^{local}_q(m) = \frac{n_{m,d_q}}{\sum_{m'} n_{m',d_q}} \tag{6} \]

Gottipati also tested weighting schemes that incorporate distance, but found that these did not significantly improve the results. We suspect that, especially when the query is not the main focus of the query document, many contextual entities are not relevant for disambiguation and may actually lead to worse performance (see experimental evaluation).

4.2 Across-document Neighborhood Relevance Model
We suggest the neighborhood relevance model, which estimates entity saliences ρ from across-document evidence.
The idea is that a neighbor m is important if it occurs frequently in the context of the query mention within the document as well as across other documents that are topically related and contain mentions of the query mention q.

Our method is based on pseudo-relevance feedback [13], which analyses the results of a first retrieval pass to expand the query with related words. We first identify the query string q with its name variations v and neighborhood m, and compute the local document saliences ρ_local. We then use the query model given in Equation 5 to search for documents d that contain coreferent mentions. We identify coreferent mentions by comparison to the name variations v and q; we call these the pseudo-coreferent mentions.

Given a set of retrieved pseudo-relevant documents D, we use the retrieval probability of the document as an estimate of how likely the pseudo-coreferent mention refers to the same entity as the query q. As most retrieval frameworks return only unnormalized (rank-equivalent) retrieval scores L(d), the estimate has to be approximated with L(d) / Σ_{d' ∈ D} L(d').

Counting the occurrence frequency n_{m,d} of string-identical mentions of m, we build a multinomial language model across the pseudo-relevant documents with relevance-model weighting as follows.
\[ \rho^{nr}_q(m) = \frac{1}{\sum_{d' \in D} L(d')} \sum_{d \in D} L(d)\, \frac{n_{m,d}}{\sum_{m'} n_{m',d}} \tag{7} \]

In other words, the salience of a mention m in the neighborhood is expressed by accumulating the relative retrieval probabilities of documents according to how often they contain the mention.

Any corpus that contains documents with properties similar to the query document is a reasonable target corpus for the feedback query. In this work, we use the complete TAC source corpus, but other choices, such as restrictions to news or blogs depending on the query document, are possible.

Typically, relevance feedback models are used to expand the query with new terms. This model is capable of introducing new entity mentions that are not contained in the query document. However, the context of the query document is already very rich, and a preliminary experiment demonstrated that it is better to use the relevance model to reduce and weight the context found in the query document.

5. KB BRIDGE: ENTITY LINKING SYSTEM
In this section we describe KB Bridge, our retrieval-based entity linking system, which is implemented using the Galago search engine and the MRF information retrieval framework. The system links entity mentions in the source document to knowledge base entities.

The ranking of the entities is a two-stage process. First, entities are ranked using the Galago retrieval model described in Figure 2. The ranking is then refined with a supervised learning to rank model using RankLib (footnote 3). The final step is NIL handling, which determines whether the mention is in the knowledge base or whether it is unknown.

5.1 Knowledge Base Representation
Our system addresses text-driven knowledge bases in which each entity is associated with free text, with relationships between entities from hyperlinks or other sources. Wikipedia is one representation of such a knowledge base, but our system would likely perform well on other knowledge bases. In order to efficiently search over knowledge bases with millions or billions of entities we use an information retrieval system.
For these experiments, we index the full text of the Wikipedia article, title, redirects, Freebase name variations, internal anchor text, and web anchor text.

5.2 Document Analysis
The first step in linking is to identify the entity query span q in the document and to find disambiguating contextual information for the query model introduced in Section 3. This includes name variations v, contextual sentences s, and other neighboring mentions m.

In the TAC KBP challenge, entities of type person, organization, or location are the main focus of the linking effort, so the system detects entities using standard named entity recognition tools, including UMass's factorie and Stanford CoreNLP. These provide the mention spans from which to derive query mentions q, name variations v, and neighboring entities m. Beyond the standard entity classes, our approach is general enough to also link other entity types if a suitable detector is incorporated.

Footnote 3: vdang/ranklib.html

Given a target entity mention, q, the system needs to identify name variations, v, in the document, such as "Steve" to "Steve Jobs" or "IOC" to "International Olympic Committee". The goal is to identify alternative names that are less ambiguous than the query mention. We use the within-document coreference tool from UMass's factorie, together with capitalized word sequences that contain the query string (ignoring capitalization and punctuation for the matching), to extract name variations v.

From the set of coreferent mentions, we extract the sentences s they occur within. After removing stopwords, casing, and punctuation, they represent non-NER context such as verbs, adjectives, and multi-word phrases.

5.3 Cross-document evidence
Instead of analyzing one document in isolation, KB Bridge leverages topically similar entity mentions across documents. A full-text index of a corpus that contains similar documents is constructed. We then generate a query according to Figure 2, using local salience weights ρ_local, to retrieve documents from the collection.
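
The two salience estimates used in this feedback step, the local model of Equation 6 and the neighborhood relevance model of Equation 7, can be sketched as follows. This is a minimal illustration under assumptions: the function names and the input layout (a list of score and mention-count pairs) are hypothetical, the retrieval scores are taken to be non-negative likelihoods L(d) rather than log scores, and the result is restricted to mentions found in the query document, following the filtering-and-weighting use described in Section 4.2.

```python
from collections import Counter


def local_salience(mention_counts):
    """Equation 6: relative frequency of each mention in the query document."""
    total = sum(mention_counts.values())
    if total == 0:
        return {}
    return {m: n / total for m, n in mention_counts.items()}


def neighborhood_relevance(feedback_docs, query_neighbors):
    """Equation 7: relevance-weighted mention frequencies across feedback documents.

    feedback_docs   -- list of (score, Counter) pairs, where score is the
                       unnormalized retrieval likelihood L(d) (assumed >= 0)
                       and the Counter holds mention counts n_{m,d}
    query_neighbors -- mentions from the query document to be weighted
    """
    score_sum = sum(score for score, _ in feedback_docs)
    rho = {m: 0.0 for m in query_neighbors}
    if score_sum == 0:
        return rho
    for score, counts in feedback_docs:
        doc_total = sum(counts.values())
        if doc_total == 0:
            continue  # a document without mentions contributes nothing
        for m in query_neighbors:
            rho[m] += (score / score_sum) * counts.get(m, 0) / doc_total
    return rho


if __name__ == "__main__":
    docs = [
        (0.6, Counter({"Lost": 3, "Disney": 1})),
        (0.4, Counter({"Lost": 1, "Australia": 1})),
    ]
    print(local_salience(Counter({"Lost": 1, "Australia": 1})))
    print(neighborhood_relevance(docs, ["Lost", "Australia"]))
```

In this toy input, "Disney" occurs in a feedback document but not in the query document, so it receives no weight; dropping the restriction to `query_neighbors` would turn the same computation into query expansion.
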
The documents are ranked by the likelihood of containing relevant entity mentions for the original query mention, q. These documents are used in the neighborhood relevance model to identify salience weights, ρ_nr.

5.4 KB Entity Ranking
The query model with salience weights from local document analysis and the neighborhood relevance model ρ_nr(m) is executed against the search index of KB entries as shown in Equation 5. Our system supports any feature function expressible in Galago query notation. Beyond this initial ranking, we can further refine the ranking using more complex features in learning to rank models.

The ranking is refined using supervised machine learning in a learning to rank (LTR) model. The refinement employs more extensive feature comparisons which would be expensive to compute over the entire collection. For these experiments we use the LambdaMART ranking model, a type of gradient boosted decision tree that is state-of-the-art and captures non-linear dependencies in the data. The model includes dozens of features. A description of the features used in the model is found in Table 1.

5.5 NIL Handling
After the entities are ranked, the last step is to determine whether the top-ranked entity for a mention is correct and should be linked to the KB entry, or whether the mention instead refers to an entity not in the knowledge base, in which case NIL should be returned. For these experiments, we use a simple NIL handling strategy. We return NIL if the supervised score of the top-ranked entity is below a threshold τ. The NIL threshold τ is tuned on the training data.

For the special case of the TAC KBP challenge, the reference knowledge base is a subset of Wikipedia. We exploit this fact by returning NIL whenever the top-ranked Wikipedia entity is not contained in the reference knowledge base.

6. EXPERIMENTAL EVALUATION

6.1 Setup
We base our experimental evaluation on four data sets from the TAC KBP entity linking competition from 2009 to 2012.
Feature Set | Type | Description
Character Similarity | q, v | Lower-cased normalized string similarity: exact match, prefix match, Dice, Jaccard, Levenshtein, Jaro-Winkler
Token Similarity | q, v | Lower-cased normalized token similarity: exact match, Dice, Jaccard
Acronym match | q | Tests if the query is an acronym, if first letters match, and if the KB entry name is a possible acronym expansion
Field matches | q, v | Field counts and query likelihood probabilities for title, anchor text, redirects, and alternative names fields
Link Probability | q, v | p(anchor | KB entry): the fraction of internal and external total anchor strings targeting the entity
Inlink count | document prior | Log of the number of internal and external links to the target KB entry
Text Similarity | document | Normalized text similarity of document and KB entity: cosine with TF-IDF, KL, JS, Jaccard token overlap
Neighborhood text similarity | document | Normalized neighborhood similarity: KL divergence, number of matches, match probability
Neighborhood link similarity | document | Neighborhood similarity with in/out links: KL divergence, Jensen-Shannon divergence, Dice overlap, Jaccard
Rank features | retrieval | Raw retrieval log likelihood, normalized posterior probability, 1/retrieval rank
Context Rank Features | retrieval | Retrieval scores for each contextual component: q, v, s, m_nr, m_local

Table 1: Features of the query mention and candidate Wikipedia entity.

Over the years, the TAC organizers and the Linguistic Data Consortium came up with evaluation queries with varying characteristics both in terms of ambiguity (average unique mentions per entity) and variety (average number of entities per mention).

6.1.1 Data
The TAC KBP Knowledge Base was constructed from a dump of English Wikipedia from October 2008 containing 818,741 entries. The source collection includes over 1.2 million newswire documents, approximately 500 thousand web documents, and hundreds of transcribed spoken documents. Across all years there are a total of 12,130 query entity mentions.
We use all query mentions with odd-numbered IDs as training data, and those with even-numbered IDs for evaluation. We inspected the distribution of the queries in the split to ensure they were representative for both NIL queries and queries with a ground truth entity ("in-KB"), as well as for the entity type distribution (PER/ORG/GPE). The training set contains 6043 queries, 3034 with a ground truth entity c and 3009 NIL queries. The evaluation set contains 6087 queries, with 3058 NIL and 3029 in-KB. This training set is used to learn the parameters of our query model as well as the parameters of the supervised reranker. We learn across all years and evaluate year-by-year for comparison with previous results.

6.1.2 Wikipedia dump
For evaluating a retrieval approach to linking, we use a more recent dump of Wikipedia. We use a Freebase dump of the English Wikipedia from January 2012, which contains over 3.8 million articles. It includes the full text of the article along with metadata including redirects, disambiguation links, outgoing links, and anchor text. We also use the Google Cross-Wiki dictionary [18] for external link information from the web. We derive a mapping between our snapshot and the official TAC KBP knowledge base using title matches and article redirects. Using a more recent snapshot of Wikipedia is a common practice employed by the top performing linking systems in TAC. The full snapshot provides full text as well as rich category and link structure.

6.2 Context Modeling
We first evaluate the contributions of the different types of context for the entity query. The context includes the entity query string q, name variations v, the sentences s surrounding the query or name variations, as well as neighboring entity spans m. The combinations of these context components are indicated by Q, V, S, or M in the method prefix.

We evaluate three context weighting methods. The first is uniform weighting (M). The second is the local document model by Gottipati [5] (indicated by the suffix local).
The third is our neighborhood relevance model (indicated by the suffix nr). We compare both for estimating the salience ρ(m) of neighboring entity mentions.

Baselines are the methods using only the query string (Q), the combination of query and name variations (QV), as well as the local context weighting (M_local). Our suggested methods are QVSM_nr and QVM_nr. These models are the full query model with neighborhood relevance weighting, with and without sentences.

For each of the compared methods, we train separate λ parameters on the training data using a coordinate ascent learning algorithm. The estimated λ parameters differ across methods. For the QVSM_nr model the estimated parameters are: λ_Q = 0.321, λ_V = 0.293, λ_S = 0.155, and λ_M = [...].

Figure 3 visualizes an ablation study for the context components using precision at rank one for evaluation. Figure 3a shows the cumulative improvements as context is added and weighted with the neighborhood relevance (M_nr). M with uniform neighborhood weighting performs similarly to M_local weighting (not shown). We observe that adding sentence context does not significantly improve performance.

Figure 3b details the individual contributions of contextual components (omitting sentences). It is interesting that the method QV (entity name plus name variations) and the method QM_nr (entity name plus weighted entity spans) are comparable in effectiveness. This is useful when no high quality name variations are extractable from the text, as is the case in informal text from social media. The cumulative figure shows that when combined, these features yield further improvement. Across all years the neighborhood relevance model achieves better effectiveness than the local model.

6.3 Ranking Distribution
In the previous section we examined the effectiveness of the different contextual components on the top-ranked result. Table 2 presents the contextual ranking models evaluated using mean reciprocal rank (MRR).
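
For reference, the two evaluation measures used in this section, precision at rank one and mean reciprocal rank, can be computed as in the sketch below. The function names and input layout are illustrative assumptions; in particular, the sketch scores a query as 0 when the gold entity does not appear in the ranking, and it does not model how NIL queries are treated in the official TAC evaluation.

```python
def reciprocal_rank(ranking, gold):
    """1/rank of the gold entity in the ranked list, or 0.0 if it is absent."""
    for rank, entity in enumerate(ranking, start=1):
        if entity == gold:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(rankings, golds):
    """Average reciprocal rank over all queries."""
    if not rankings:
        return 0.0
    return sum(reciprocal_rank(r, g) for r, g in zip(rankings, golds)) / len(rankings)


def precision_at_1(rankings, golds):
    """Fraction of queries whose top-ranked entity is the gold entity."""
    if not rankings:
        return 0.0
    hits = sum(1 for r, g in zip(rankings, golds) if r and r[0] == g)
    return hits / len(rankings)


if __name__ == "__main__":
    rankings = [["a", "b"], ["b", "a"], ["c"]]
    golds = ["a", "a", "x"]
    print(mean_reciprocal_rank(rankings, golds))
    print(precision_at_1(rankings, golds))
```

MRR rewards a correct entity at rank two or three, whereas P@1 only counts exact hits at the top, which is why the two measures can rank methods differently.
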
Similar to the previous results, Table 2 shows that the most effective models include the neighborhood relevance weighting scheme (nr). M nr and SM nr are significantly better than the baseline. The only exception is in 2010, when the queries are easier. In this case only the M nr method is significantly better. Additionally, M nr is significantly better than the local weighting (Gottipati) in some years; however, there is no significant difference in 2012. We hypothesize that the reason the neighborhood model does not improve over the local model in 2012 is that the queries are significantly more ambiguous and the quality of the retrieved feedback documents is lower.

Figure 3: Ablation study for the suggested method in terms of P@1, by year. (a) Cumulative. (b) Individual contributions.

Table 2: Ranking results on TAC by year with varying context methods with mean reciprocal rank (MRR). The best results for each year are highlighted in bold. Results that are statistically significant with α = 5% over the baseline are indicated with *.
Method
M nr: *, 0.825*
M
M nr: 0.795*, *, 0.715*
M local: 0.784*, *
S: *
SM nr: 0.792*, *, 6*
SM local: 0.780*, *, 0.719*
all context: 0.786*, *, 0.735*

Table 3: Learning to rank refinement results with mean reciprocal rank (MRR) for M nr and M nr LTR. All LTR results are statistically significant with α = 5% over the unsupervised M nr.

Figure 4: Average recall at rank cutoff k for M nr and M nr LTR.

We refine the retrieval ranking using a supervised learning to rank model. The features in the ranking model are described in Table 1. The top 100 results from the best ranking, M nr, are reranked. The results are shown in Table 3. They show significant improvement over the initial retrieval ranking by leveraging more features that perform more extensive contextual comparison. The results for 2012 are still well below the other years, indicating the difficulty of these queries even when leveraging the more complex contextual features.
This indicates that a better feature representation is needed to address some of these difficult-to-resolve mentions.

Recall

The previous results use mean reciprocal rank to measure the retrieval effectiveness. We now examine the rank distribution in more detail, looking at the recall at a given rank cutoff. The entity recall is critical because it is an upper bound on the effectiveness of downstream systems. Achieving a minimum 90% recall threshold across all years requires hundreds of candidates for the query (Q) model, 20 for QV, 16 for M nr, and only 3 for M nr LTR. The learning to rank model achieves at least 95% recall across all years within 10 results.

6.4 TAC KBP results

In this section, we evaluate the ranking as part of the entire linking pipeline described in Section 5. We report the micro-averaged accuracy because we do not focus on clustering NIL entity mentions. The results are in Table 4. The unsupervised retrieval M nr performs well, above the median in 2012 and competitive in previous years. The supervised ranking models improve effectiveness significantly. The in-KB ranking results outperform the best performing systems in 2009 through 2011 and are comparable in 2012. The main focus of this work is ranking, and this shows the effectiveness of our approach.

We now examine the overall (all) results, including the NIL handling. The results show that M nr with LTR reranking and NIL handling outperforms the top system in 2009 and is competitive with the best performing systems in subsequent years. Applying the score threshold improves the overall accuracy despite decreasing in-KB effectiveness. This is because some correctly linked entities are marked as NIL, but this loss is outweighed by the greater reduction in false positive entity links. The NIL handling strategy based on thresholding the ranking score is effective, but could be improved further. Other linking systems use a supervised NIL classifier for this step, allowing them to perform well despite less effective in-KB ranking.

Table 4: TAC Entity Linking performance in micro-average accuracy (in-KB, NIL, and all, per year) for M nr, M nr LTR, RM nr LTR NIL, and the best performer.

7. CONCLUSION

In this paper we propose an approach to entity linking based upon the Markov Random Field information retrieval model (MRF-IR). We focus on the task of ranking knowledge base entities. We demonstrate how concepts from joint neighborhood models can be incorporated within the MRF-IR framework. Further, we propose a neighborhood relevance model (NRM) that uses relevance feedback techniques to identify salient entity context across documents. Our experiments on the TAC KBP entity linking data show that the neighborhood relevance model outperforms other contextual models. When the ranking is refined with a learning to rank model, the results beat the current best performing systems on in-KB ranking accuracy. Combined with a simple NIL handling strategy, the overall effectiveness on all mentions is comparable to, and sometimes better than, other state-of-the-art entity linking systems.

Acknowledgements

This work was supported in part by the Center for Intelligent Information Retrieval and in part under subcontract # from SRI International, prime contractor to DARPA contract #HR C. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect those of the sponsor.

8. REFERENCES

[1] Entity disambiguation for knowledge base population. In COLING.
[2] Razvan Bunescu and Marius Pasca. Using encyclopedic knowledge for named entity disambiguation. In European Chapter of the Association for Computational Linguistics (EACL).
[3] S. Cucerzan. Large-scale named entity disambiguation based on Wikipedia data. In EMNLP.
[4] S. Cucerzan. TAC entity linking by performing full-document entity extraction and disambiguation. In Proceedings of the Text Analysis Conference.
[5] Swapna Gottipati and Jing Jiang. Linking entities to a knowledge base with query expansion. In EMNLP.
[6] Johannes Hoffart, Mohamed A. Yosef, Ilaria Bordino, Hagen Fürstenau, Manfred Pinkal, Marc Spaniol, Bilyana Taneva, Stefan Thater, and Gerhard Weikum. Robust disambiguation of named entities in text. In EMNLP.
[7] Darren W. Huang, Yue Xu, Andrew Trotman, and Shlomo Geva. Focused Access to XML Documents, chapter Overview of INEX 2007 Link the Wiki Track.
[8] Heng Ji and Ralph Grishman. Knowledge base population: successful approaches and challenges. In ACL-HLT.
[9] Heng Ji, Ralph Grishman, and Hoa Dang. Overview of the TAC2011 knowledge base population track. In TAC 2011 Proceedings Papers.
[10] Daphne Koller and Nir Friedman. Probabilistic Graphical Models: Principles and Techniques. MIT Press.
[11] Sayali Kulkarni, Amit Singh, Ganesh Ramakrishnan, and Soumen Chakrabarti. Collective annotation of Wikipedia entities in web text. In KDD.
[12] Paul McNamee, Veselin Stoyanov, James Mayfield, Tim Finin, Tim Oates, Tan Xu, Douglas W. Oard, and Dawn Lawrie. HLTCOE participation at TAC 2012: Entity linking and cold start knowledge base construction. In TAC KBP.
[13] D. Metzler and W.B. Croft. Latent concept expansion using Markov random fields. In SIGIR.
[14] Donald Metzler and W. Bruce Croft. A Markov random field model for term dependencies. In SIGIR.
[15] S. Monahan, J. Lehmann, T. Nyberg, J. Plymale, and A. Jung. Cross-lingual cross-document coreference with entity linking. In Proceedings of the Text Analysis Conference.
[16] L. Ratinov, D. Roth, D. Downey, and M. Anderson. Local and global algorithms for disambiguation to Wikipedia. In ACL.
[17] J. J. Rocchio. Relevance feedback in information retrieval. In G. Salton, editor, The SMART Retrieval System: Experiments in Automatic Document Processing, chapter 14.
[18] Valentin I. Spitkovsky and Angel X. Chang. A cross-lingual dictionary for English Wikipedia concepts. In LREC.
[19] Veselin Stoyanov, James Mayfield, Tan Xu, Douglas W. Oard, Dawn Lawrie, Tim Oates, and Tim Finin. A context-aware approach to entity linking. In AKBC-WEKEX.
[20] J. Xu and W.B. Croft. Query expansion using local and global document analysis. In SIGIR, 1996.
More informationEntity Linking at Web Scale
Entity Linking at Web Scale Thomas Lin, Mausam, Oren Etzioni Computer Science & Engineering University of Washington Seattle, WA 98195, USA {tlin, mausam, etzioni}@cs.washington.edu Abstract This paper
More informationDesign Optimization of Mixed Time/Event-Triggered Distributed Embedded Systems
Design Optiization of Mixed Tie/Event-Triggered Distributed Ebedded Systes Traian Pop, Petru Eles, Zebo Peng Dept. of Coputer and Inforation Science, Linköping University {trapo, petel, zebpe}@ida.liu.se
More informationA Novel Fast Constructive Algorithm for Neural Classifier
A Novel Fast Constructive Algorith for Neural Classifier Xudong Jiang Centre for Signal Processing, School of Electrical and Electronic Engineering Nanyang Technological University Nanyang Avenue, Singapore
More informationCollaborative Web Caching Based on Proxy Affinities
Collaborative Web Caching Based on Proxy Affinities Jiong Yang T J Watson Research Center IBM jiyang@usibco Wei Wang T J Watson Research Center IBM ww1@usibco Richard Muntz Coputer Science Departent UCLA
More informationA Low-Cost Multi-Failure Resilient Replication Scheme for High Data Availability in Cloud Storage
216 IEEE 23rd International Conference on High Perforance Coputing A Low-Cost Multi-Failure Resilient Replication Schee for High Data Availability in Cloud Storage Jinwei Liu* and Haiying Shen *Departent
More informationAdaptive Holistic Scheduling for In-Network Sensor Query Processing
Adaptive Holistic Scheduling for In-Network Sensor Query Processing Hejun Wu and Qiong Luo Departent of Coputer Science and Engineering Hong Kong University of Science & Technology Clear Water Bay, Kowloon,
More informationSummary. Reconstruction of data from non-uniformly spaced samples
Is there always extra bandwidth in non-unifor spatial sapling? Ralf Ferber* and Massiiliano Vassallo, WesternGeco London Technology Center; Jon-Fredrik Hopperstad and Ali Özbek, Schluberger Cabridge Research
More informationA CRYPTANALYTIC ATTACK ON RC4 STREAM CIPHER
A CRYPTANALYTIC ATTACK ON RC4 STREAM CIPHER VIOLETA TOMAŠEVIĆ, SLOBODAN BOJANIĆ 2 and OCTAVIO NIETO-TALADRIZ 2 The Mihajlo Pupin Institute, Volgina 5, 000 Belgrade, SERBIA AND MONTENEGRO 2 Technical University
More informationRobust Relevance-Based Language Models
Robust Relevance-Based Language Models Xiaoyan Li Department of Computer Science, Mount Holyoke College 50 College Street, South Hadley, MA 01075, USA Email: xli@mtholyoke.edu ABSTRACT We propose a new
More informationPERFORMANCE MEASURES FOR INTERNET SERVER BY USING M/M/m QUEUEING MODEL
IJRET: International Journal of Research in Engineering and Technology ISSN: 239-63 PERFORMANCE MEASURES FOR INTERNET SERVER BY USING M/M/ QUEUEING MODEL Raghunath Y. T. N. V, A. S. Sravani 2 Assistant
More informationDesign and Implementation of Business Logic Layer Object-Oriented Design versus Relational Design
Design and Ipleentation of Business Logic Layer Object-Oriented Design versus Relational Design Ali Alharthy Faculty of Engineering and IT University of Technology, Sydney Sydney, Australia Eail: Ali.a.alharthy@student.uts.edu.au
More informationVision Based Mobile Robot Navigation System
International Journal of Control Science and Engineering 2012, 2(4): 83-87 DOI: 10.5923/j.control.20120204.05 Vision Based Mobile Robot Navigation Syste M. Saifizi *, D. Hazry, Rudzuan M.Nor School of
More informationMassive amounts of high-dimensional data are pervasive in multiple domains,
Challenges of Feature Selection for Big Data Analytics Jundong Li and Huan Liu, Arizona State University Massive aounts of high-diensional data are pervasive in ultiple doains, ranging fro social edia,
More information