A Neighborhood Relevance Model for Entity Linking

Size: px
Start display at page:

Download "A Neighborhood Relevance Model for Entity Linking"

Transcription

1 A Neighborhood Relevance Model for Entity Linking Jeffrey Dalton University of Massachusetts 140 Governors Drive Aherst, MA, U.S.A. Laura Dietz University of Massachusetts 140 Governors Drive Aherst, MA, U.S.A. ABSTRACT Entity Linking is the task of apping a string in a docuent to its entity in a knowledge base. One of the crucial tasks is to identify disabiguating context; joint assignent odels leverage the relationships within the knowledge base. We deonstrate how joint assignent odels can be approxiated with inforation retrieval. We introduce the neighborhood relevance odel which uses relevance feedback techniques to identify the salience of entity context using cross-docuent evidence. We show that this odel is ore effective than local docuent odels for ranking KB entities. Experients on the TAC KBP entity linking task deonstrate that our odel is the best perforing syste for strings that are linkable to the knowledge base. 1. INTRODUCTION Entity linking is iportant because ost content created is unstructured text in the for of news, blogs, forus, and icroblogs such as Twitter and Facebook. A key challenge is to link these unstructured text docuents to the Web of Data. Entity linking bridges the structure gap between text docuents and linked data by identifying entions of entities in free text and linking the to knowledge bases. It enriches unstructured docuents with links to people, places, and concepts in the world. Entity linking is a fundaental building block that supports a wide variety of inforation extraction, docuent suarization, and data ining tasks. For exaple, linked entities in docuents can be used to expand existing knowledge base entries with new facts and relationships. The ajor challenge in entity linking is abiguity. An entity ention in text ay be abiguous for a wide variety of reasons: ultiple entities share the sae nae (e.g. Michael Jordan), entities are referred to incopletely (e.g. Justin for Justin Bieber), by pseudonys or nicknaes (Christopher George Latore Wallace is also known as The Notorious B.I.G.), and are often abbreviated (e.g. UW for the University of Wisconsin as well as University of Washington). Perission to ake digital or hard copies of all or part of this work for personal or classroo use is granted without fee provided that copies are not ade or distributed for profit or coercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific perission and/or a fee. OAIR 13, May 22-24, 2013, Lisbon, Portugal. Copyright 2013 CID The entity linking proble has been studied over several years in the TAC Knowledge Base Population venue with the following task definition: Entity Linking: Given a string ention q in a docuent, predict the entity c in the knowledge base which the string represents, or NIL if no such entity is available. A typical entity linking syste consists of four phases: 1) query expansion, 2) candidate generation, 3) entity ranking, and 4) handling NIL cases. The goal of the first two steps is to achieve a high-recall set of Wikipedia entities. Given the candidate set, ost effective approaches, e.g., [15, 4, 16], leverage contextual entities as disabiguating evidence in step 3. One issue is the candidate generation step is often perfored using string atching heuristics, resulting in large candidate sets that ay contain hundreds or thousands of entities for abiguous atches. The connection between candidate generation and ranking are often separate and not well aligned. We advocate an inforation retrieval approach that uses one probabilistic odel to approach steps 1-3. We introduce our linking syste, KB Bridge. Suppleentary aterials for this work is available on the KB Bridge website 1. Existing entity linking ethods only eploy IR to a inor degree. We odel the entire linking task as a retrieval proble, including foralising joint assignent odels within the retrieval fraework. The graphical odeling fraework allows us to ground our work on odels fro both inforation extraction and inforation retrieval. For a given entity ention, the correct knowledge base entry is likely to share iportant pieces of contextual inforation: lexically siilar naes, shared topical siilarity reflected in word usage, and shared naed entities. Additional context could be included, but throughout the paper we focus the previously described context: nae variations, surrounding sentences, and neighboring entities. Entity linking provides soe unusual challenges. The typical IR setting addresses short queries by using relevance feedback [17] to expand the query odel. In entity linking, the query is an entity string ebedded in a longer docuent, providing an abundance of context which could be leveraged. However, not all context is equally helpful, either because of abiguity, heterogeneity in topic, or spurious collocations. Consider the exaple ABC shot the TV draa Lost in Australia. with the task of linking ABC to the entity Aerican Broadcasting Copany. The naed entity span Australia is not relevant for the true answer. It ight 1 jdalton/kbbridge

2 actually isguide the process to link to the wrong entity Australian Broadcasting Corporation. We introduce the neighborhood relevance odel to estiate the salience of context with the goal of filtering and weighting (as opposed to expanding) the context odel. The neighborhood relevance odel is based on ideas of pseudorelevance feedback [20] and latent concept expansion [13] to leverage cooccurrence evidence across topically siilar docuents. Our ain contributions are: An unsupervised approach to entity linking based upon the Markov Rando Field inforation retrieval odel that provides copetitive perforance out-of-the-box. A unified retrieval based approach to linking cobining candidate generation and ranking in a single retrieval fraework, with ore than 95% recall in the highest ranked 10 entities. A ention-specific approach for identifying salient neighboring entities using across-docuent evidence based on relevance feedback. Deonstrating the benefits of the entity neighborhood relevance odel in cobination with a supervised learning to rank fraework, resulting in the best ranking for entities linkable to the knowledge base for the TAC KBP task. 2. BACKGROUND AND RELATED WORK 2.1 Related Work Early work on entity linking was perfored by Bunescu and Pasca [2] and Cucerzan [3] to link entions of topics to their Wikipedia pages. In contrast to their odels, we focus on a retrieval approach that leverages text based ranking without exploiting extensive Wikipedia-specific structure. Our work is related to that of Gottipati and Jiang [5] who apply a language odeling approach to entity linking. They expand the original query ention with contextual inforation fro the language odel of the docuent. We use the local weighting as a starting point for estiating the entity salience and copare against it as a baseline. Entity linking has been studied in a variety of recent venues. At INEX the Link the Wiki task explored autoatically discovering links that should be created in a Wikipedia article [7]. More recently, it is one of the principle tasks studied at the ongoing Text Analysis Conference Knowledge Base Population track (TAC KBP). Ji et al. [9, 8] provide an overview of the recent systes and approaches. Instead of linking individual entions one at a tie, recent work [3, 16, 19, 11, 6] focuses on linking the set of entions, M, that occur in the query docuent d. These odels perfor joint inference over the link assignents to identify a coherent assignent of KB entries. In our work we leverage the set of entions M in the docuent as context in an inforation retrieval odel. In this work we instead focus on identifying salient entity entions in the context, because entions in the docuent ay be spurious or only tangentially related. This is especially true if the docuent contains ultiple topics. 2.2 Joint Neighborhood Assignent Model Graphical odels [10], such as Markov Rando fields, are a popular tool in both inforation extraction and inforation retrieval. Graphical odels provide the atheatical fraework for foralizing intuitions on how available data (e.g., the query ention) and quantities of interest (e.g. entity in the knowledge base) are connected. After casting data and quantities of interest as rando variables, dependencies between two (or ore) variables are encoded by factor functions φ that assign a non-negative score to each cobination of variable settings. Factors φ are often expressed by a loglinear function of a feature vector. The joint configuration of all variables is scored by the likelihood function L, which is represented by the noralized product over scores of all factor functions for the given variable settings. The factor functions φ are designed to encode our intuition on which variable choices go well together. For instance, early approaches to entity linking [1] heuristically extract a set of entity candidates and use a graphical odel with single factor φ e (q, c). The factor φ e foralizes the intuition that query entions q and entities c are copatible when their naes atch and the docuent surrounding q has ters siilar to the article of the entity candidate c. This basic odel is extended in joint neighborhood assignent odels [3, 16, 12]. The idea is that knowledge base entities which are entioned in the sae docuents are also likely to be structurally related in the knowledge base. When the query ention is abiguous, several candidate entities have an equally high score under the factor φ e (q, c). Joint assignent odels explicitly incorporate entity entions fro the sae docuent as the query ention q, which we call neighbor entions henceforth. Each neighbor ention i is associated with a latent variable z i referring to the respective entity for the neighbor. We expect the neighbor entities z to help disabiguate the true candidate c. This expectation is foralized by a second factor function φ ee (z, c), which captures intuitions on when entities z and c are entioned in the context of each other. For the factor function φ ee, Ratinov et al. [16] set siilarity of inlinks (and outlinks) for c and z; Cucerzan [4, 3] use the overlap in Wikipedia categories and presence of links fro z to c and vice versa. For each neighbor ention i a set of candidates are heuristically identified, where the variable z i represents a choice of the candidate. The odel is visualized in Figure 1a, where variables q and i are observed, but settings of variables c and z i are chosen fro respective candidate sets. A factor is represented as a sall black box which connect its input variables. If each neighboring ention is linked to its correct entity candidate z i, the true candidate c will achieve a high score of φ ee (z i, c) for ost of the neighbors z i. This inforation is cobined across all neighbors and taken together with the nae-factor φ e (q, c). The dilea is that linking all i to z i requires that we solve the entity linking proble as part of the solution. For this odel, the joint inference proble does not have a closed-for solution and therefore requires approxiate inference such as belief propagation [12] or iterative heuristics [4, 16]. As the task is to link only the single query ention, the neighboring entity links are to be arginalized out by integration over z. Candidates c for the query are scored by Equation 1.

3 (a) Joint Assignent Model(b) Neighborhood uery Model Figure 1: Neighborhood Models L(c) = φ e (q, c) ( φ e (, z) φ ee (z, c) dz) (1) 3. UERY MODEL In this section we integrate graphical odels for entity linking as developed in the inforation extraction counity with graphical odels used for inforation retrieval. We utilize the fact that query odels, such as query likelihood and the sequential dependence odel [14] have an underlying graphical odel that gives rise to a score of a docuent. 3.1 Neighborhood uery Model We deonstrate how the retrieval engine can be used to infer the joint inference proble of Equation 1 whenever factor functions φ e and φ ee are expressible as query operators. This allows us to cobine the candidate generation (step 2) with the entity ranking (step 3) and have it carried out by a retrieval engine that optiizes over all possible entities in the knowledge base at once. In contrast to the joint neighborhood assignent approaches, no candidate set has to be identified beforehand. The key insight is that for particular choices of features for the nae-atch factor φ e (, z) and the entity-link factor φ ee (z, c), it is possible to solve the integral over z in Equation 1, with sart preprocessing and indexing. For instance, consider the following siple factor functions for φ e and φ ee. We choose φ e (, z) to yield 1 if the ention is a string atch to the title of z and 0 otherwise; and φ ee (z, c) to yield 1 if z links to c. In this case, we can replace the integral over z with a factor φ e (, c) that yields 1 if the ention atches a title of an inlink of c. Such a factor can be analytically solved by transforing the Wikipedia snapshot so that the indexed article of entity c is enriched with titles of Wikipedia pages that link to c in a separate field (or extents). A search query for the string in the field of inlinks represents a graphical odel that represents the score of φ e. For any choices of factor functions for φ e and φ ee, the knowledge base can be transfored to allow for efficient replaceent factors φ e that can be directly optiized within the retrieval odel fraework. By integrating z out, the joint neighborhood assignent odel depicted in Figure 1a becoes the neighborhood query odel in Figure 1b with the following likelihood function. L(c) = φ e (q, c) φ e (, c) (2) Many coplex factor functions are possible, but for the reainder of this publication we use two siple, yet effective factor functions φ: Factor φ e (q, c) encodes atches of the string q and ters fro the surrounding text in any field of the Wikipedia article of the candidate c, this includes naes as well as the full-text of the article. Factor φ e (, c) atches the string representation of in the full-text and titles entities that link to (and are linked fro) the Wikipedia entry of c. 3.2 Relevance-weighted Neighborhood uery Model As pointed out in the introduction, not all contextual entities are equally salient. For each contextual entity, its salience for disabiguating query q is denoted by ρ q(), ranging on a scale between 0 and 1. If the salience ρ q() is 0, we want to reove the effect of φ e (, c) on the likelihood function. Based on the geoetric ean, which is the natural choice for averages of probabilities, we achieve the weighting with the geoetric interpolated odel of Equation 3. L(c) = φ e (q, c) ( ) ρq() φ e (, c) Notice, that the unweighted odel follows as a special case where all saliences are 1. We also introduce the paraeter λ M that allows trading off between the direct siilarity of the query and candidate as expressed by φ e (q, c) and the aggregated influence of the M contextual entities. Exploiting that the sort-order induced by L is invariant with respect to logariths, we cast the optiization in log-space as in Equation 4. log L(c) = logφ e (q, c) + λ M 1 M (3) ( ) ρ q() log φ e (, c) 3.3 Context fro uery Docuent The factor φ e foralizes our intuition on copatibility between the query ention q and the true candidate c. This includes nae atches of the string representation of with naes listed in the knowledge base (e.g. title, redirect, anchor text). We further extend it to other siilarity easures that are independent of the neighbor entions (which are represented by φ e ). Nae variations v of the query string can be extracted fro the query docuent, to add robustness to the entity linking inference. This is especially iportant if the query string is an acrony or an abiguous reference to the entity. We also incorporate the words, s, of the sentence that surround the query ention or one of its nae variations to preserve verbs, adjectives, and standing expressions that ight disabiguate the candidate. We introduce separate weight paraeters λ, λ V, λ S to individually control influence of nae-atches of the query ention, nae-atches of nae variations v, and sentence context respectively. Accordingly, odel the factor function φ e (q, c) by the likelihood of a graphical odel itself, represented by a log-linear function of potential functions φ nae for nae-siilarity and φ sent for sentence context. The resulting optiization criterion of the candidate answer c for the query odel given the query q, V nae variations v, S contextual phrases s, and M neighbor entions is given in Equation 5. (4)

4 log L(c) = λ logφ nae (q, c) (5) + λ V 1 log φ nae (v, c) V v + λ S 1 log φ sent (s, c) S + λ M 1 M s ( ) ρ q() log φ e (, c) 3.4 Joint Inference with Galago Using log-linear odels for factors φ with features that are readily available in the Indri and Galago 2 query languages, we can leverage the retrieval engine to optiize Equation 5. This is possible because the geoetric interpolations with weights λ and ρ are expressed with the weighted #cobine operator. We rely on the sequential dependence odel [14], which is a query odel for odels dependencies between adjacent query words. It strikes a balance between a odel that ignores the order between the query ters (e.g. query likelihood odel) and a odel that requires the exact ordering of the words (n-gra or phrase odel). For a given sequence of ters t 1, t 2,..., t n, the sequential dependence odel scores docuents by a graphical odel cobining ter factors φ t (t i), bi-gra factors φ o (t i, t i+1), and factors capturing whether two subsequent query ters t i and t i+1 occur within a window of 8 words. We use siple factors in the reainder of the paper. For the nae-atch factor φ nae (q, c) that tests all of the entity s indexed docuent for presence of the string representation of the query ention q. Because q consists of ultiple ters, we use the sequential dependence odel. We use a query likelihood operator for φ sent. For the neighbor factor φ e we also use the sequential dependence odel. With these feature functions, the optiization criterion of Equation 5 is equivalent to the Galago query given in Figure 2. We restrict ourselves to a siple factor functions to deonstrate general applicability. Field-retrieval odels provide further options, e.g., to let φ e distinguish atches in different nae fields (title, anchor text, etc); and φ e distinguish atches in the full-text, the title of in-links and titles of out-links. The factor functions can ake use of any feature that can be expressed in Galago query language, while still allowing to be optiized inside the retrieval engine. 4. NEIGHBORHOOD RELEVANCE MODEL In the previous section we introduced a query odel containing salience-weighted entity entions. We now discuss ethods for estiating these salience weights ρ q() in an unsupervised anner. The idea is to assue a high salience of for the query ention q, if both are frequently entioned together. It is iportant to note that even unabiguous entions are not necessarily useful for disabiguating other entions. 2 #cobine:0=λ :1=λ V :2=λ S :3=λ M ( ) #seqdep(q) #cobine(#seqdep(v 0)... #seqdep(v V )) #cobine(#seqdep(s 0),..., #seqdep(s S)) #cobine:0 = ρ( 0) :... k = ρ( k )( ) #seqdep( 0),..., #seqdep( k ) Figure 2: Salience-weighted neighborhood query odel in Galago syntax. 4.1 Local Docuent Neighborhood Model As a first approach, salience of the neighboring entions on the query ention can be estiated fro the docuent surrounding the query. This technique was used by Gottipati and Jiang [5] to build a ultinoial language odel of entity entions fro the query docuent d q with occurrence count n,dq. We refer to this siple estiation technique as the local odel. ρ local n,dq q () = (6) n,d q Gottipati also tested weighting schees that incorporates distance, but found that these did not significantly iprove the results. We suspect that especially when the query is not the ain focus of the query docuent, any contextual entities are not relevant for disabiguation and ay actually lead to worse perforance (see experiental evaluation). 4.2 Across-docuent Neighborhood Relevance Model We suggest the neighborhood relevance odel which estiates entity saliences ρ fro across-docuent evidence. The idea is that a neighbor is iportant if it occurs frequently in the context of the query ention within the docuent as well as across other docuents that are topically related and contain entions of the query ention q. Our ethod is based on pseudo-relevance feedback [13], which analyses results a first retrieval pass to expand the query with related words. We first identify the query string q, with nae variations v, and neighborhood, and using the local docuent saliences ρ local. We use the query odel given in Equation 5 to search for docuents d that contain coreferent entions. We identify coreferent entions by coparison to the nae variations v and q we call the pseudo-coreferent entions. Given a set of retrieved pseudo-relevant docuents D, we use the retrieval probability of the docuent as an estiate of how likely the pseudo-coreferent ention refers to the sae entity as the query q. As ost retrieval fraeworks return only unnoralized (rank-equivalent) retrieval scores L(d) L(d), the estiate has to be approxiated with d. D L(d ) Counting the occurrence frequency n,d of string-identical entions of, we build a ultinoial language odel across the pseudo-relevant docuents with relevance-odel weighting as follows.

5 ρ nr q () = 1 d D L(d ) d D n,d L(d) (7) n,d In other words, the salience of a ention in the neighborhood is expressed by accuulating relative retrieval probabilities of docuents according to how often they contain the ention. Any corpus that contains docuents with siilar properties as the query docuent is a reasonable target corpus for the feedback query. In this work, we use the coplete TAC source corpus, but other choices such as restrictions to news or blogs, depending on the query docuent are possible. Typically, relevance feedback odels are used to expand the query with new ters. This odel is capable of introducing new entity entions that are not contained in the query docuent. However, since the context of the query docuent is already very rich, a preliinary experient deonstrated that it is better to use the relevance odel to reduce and weight the context found in the query docuent. 5. KB BRIDGE: ENTITY LINKING SYSTEM In this section we describe KB Bridge, our retrieval-based entity linking syste which is ipleented using the Galago search engine and the MRF inforation retrieval fraework. The syste links entity entions in the source docuent to knowledge base entities. The ranking of the entities is a two-stage process. First, entities are ranked using the Galago retrieval odel described in Figure 2. The ranking is then refined with a supervised learning to rank odel using RankLib 3. The final step is NIL handling which deterines if the ention is in the knowledge base or whether it is unknown. 5.1 Knowledge Base Representation Our syste addresses text-driven knowledge bases in which each entity is associated with free text, with relationships between entities fro hyperlinks or other sources. Wikipedia is one representation of such a knowledge base, but our syste would likely perfor well on other knowledge bases. In order to efficiently search over knowledge bases with illions or billions of entities we use an inforation retrieval syste. For these experients, we index the full text of the Wikipedia article, title, redirects, Freebase nae variations, internal anchor text, and web anchor text. 5.2 Docuent Analysis The first step in linking is to identify the entity query span q in the docuent and to find disabiguating contextual inforation for the query odel introduced in Section 3. This includes nae variations v, contextual sentences s, and other neighboring entions. In the TAC KBP challenge, the entities of type person, organization, or location are the ain focus of the linking effort and so the syste detects entities using standard naed entity recognition tools, including UMass s factorie 4 and Stanford CoreNLP 5. These provide the entions spans to derive query entions q, nae variations v, and neighboring entities. Beyond the standard entity classes, our approach 3 vdang/ranklib.htl is general enough to also link other entity types if a suitable detector is incorporated. Given a target entity ention, q, the syste needs to identify nae variations, v, in the docuent, such as Steve to Steve Jobs or IOC to International Olypic Coittee. The goal is to identify alternative naes that are less abiguous than the query ention. We use the within-docuent coreference tool fro UMass s factorie, together with capitalized word sequences that contain the query string (ignoring capitalization and punctuation for the atching) to extract nae variations v. Fro the set of coreferent entions, we extract the sentences s they occur within. After reoving stopwords, casing and punctuation they represent non-ner context such as verbs, adjectives, and ulti-word phrases. 5.3 Cross-docuent evidence Instead of analyzing one docuent in isolation, KB Bridge leverages topically siilar entity entions across docuents. A full-text index of a corpus that contains siilar docuents is constructed. We then generate a query according to Figure 2, using local salience weights ρ local, to retrieve docuents fro the collection. The docuents are ranked by the likelihood of containing relevant entity entions for original query ention, q. These docuents are used in the neighborhood relevance odel to identify salience weights, ρ nr. 5.4 KB Entity Ranking The query odel with salience weights fro local docuent analysis and the neighborhood relevance odel ρ nr () is executed against the search index of KB entries as shown in Equation 5. Our syste supports any feature function expressible in Galago query notation. Beyond this initial ranking, we can further refine the ranking using ore coplex features in learning to rank odels. The ranking is refined using supervised achine learning in a learning to rank (LTR) odel. The refineent eploys ore extensive feature coparisons which would be expensive to copute over the entire collection. For these experients we use the LabdaMART ranking odel, a type of gradient boosted decision tree that is state-of-the-art and captures non-linear dependencies in the data. The odel includes dozens of features. A description of the features used in the odel is found in Table NIL Handling After the entities are ranked, the last step is to deterine if the top-ranked entity for a ention is correct and should be linked to the KB entry or instead refers to an entity not in the knowledge base, in which case NIL should be returned. For these experients, we use a siple NIL handling strategy. We return NIL, if the supervised score of the top ranked entity is below a threshold τ. The NIL threshold τ is tuned on the training data. For the special case of the TAC KBP challenge, the reference knowledge base is a subset of Wikipedia. We exploit this fact by returning NIL whenever the top ranked Wikipedia entity is not contained in the reference knowledge base. 6. EXPERIMENTAL EVALUATION 6.1 Setup We base our experiental evaluation on four data sets fro the TAC KBP entity linking copetition fro 2009 to 2012.

6 Feature Set Type Description Character Siilarity q, v Lower-cased noralized string siilarity: Exact atch, prefix atch, Dice, Jaccard, Levenstein, Jaro-Winkler Token Siilarity q,v Lower-cased noralized token siilarity: Exact atch, Dice, Jaccard Acrony atch q Tests if query is an acrony, if first letters atch, and if KB entry nae is a possible acroyn expansion Field atches q, v Field counts and query likelihood probabilities for title, anchor text, redirects, alternative naes fields Link Probability q, v p (anchor KB entry) - the fraction of internal and external total anchor strings targeting the entity Inlink count docuent prior Log of the nuber of internal and external links to the target KB entry Text Siilarity docuent Noralized text siilarity of docuent and KB entity: Cosine with TF-IDF, KL, JS, Jaccard token overlap Neighborhood text siilarity docuent Noralized neighborhood siilarity: KL Divergence, Nuber of atches, atch probability Neighborhood link siilarity docuent Neighborhood siilarity with in/out links: KL divergence, Jensen-Shannon Divergence, Dice overlap, Jaccard Rank features retrieval Raw retrieval log likelihood, Noralized posterior probability, 1/retrieval rank Context Rank Features retrieval retrieval scores for each contextual coponents: q, v, s, nr, local Table 1: Features of the query ention and candidate Wikipedia entity. Over the years, the TAC organizers and the Linguistic Data Consortiu cae up with evaluation queries with varying characteristics both in ters of abiguity (average unique entions per entity) and variety (average nuber of entities per ention) Data The TAC KBP Knowledge Base was constructed fro a dup of English Wikipedia fro October 2008 containing 818,741 entries. The source collection includes over 1.2 illion newswire docuents, approxiately 500 thousand web docuents and hundreds of transcribed spoken docuents. Across all years there are a total of 12,130 query entity entions. We use all query entions with odd nubered IDs as training data, and the even for evaluation. We inspected the distribution of the queries in the split to ensure they were representative for both NIL and queries with a ground truth entity ( in-kb ) as well as the entity type distribution (Per/Org/GPE). The training set contains 6043 queries, 3034 with a ground truth entity c and 3009 NIL queries. The evaluation set contains 6087 queries with 3058 NIL and 3029 in-kb. This training set is used to learn paraeters of our query odel, as well as paraeters of the supervised reranker. We learn across all years and evaluate year-by-year for coparison with previous results Wikipedia dup For evaluating a retrieval approach to linking, we use a ore recent dup of Wikipedia. We use a Freebase dup of the English Wikipedia fro January 2012, which contains over 3.8 illion articles. It includes the full-text of the article along with etadata including redirects, disabiguation links, outgoing links, and anchor text. We also use the Google Cross-Wiki dictionary [18] for external link inforation fro the web. We derive a apping between our snapshot and the official TAC KBP knowledge base using title atches and article redirects. Using a ore recent snapshot of Wikipedia is a coon practice eployed by the top perforing linking systes in TAC. The full snapshot provides full text as well as rich category and link structure. 6.2 Context Modeling We first evaluate the contributions of the different types of context for the entity query. The context includes the entity query string q, nae variations v, the sentences s surrounding the query or nae variations, as well as neighboring entity spans. The cobinations of these context coponents is indicated by, V, S, or M in the ethod prefix. We evaluate three context weighting ethods. The first is unifor weighting (M). Second is the local docuent odel by Gottipati [5] (indicated by local). Third is our neighborhood relevance odel (indicated by the suffix nr). We copare both for estiating the salience ρ() of neighboring entity entions. Baselines are the ethods using only the query string (), the cobination of query and nae variations (), as well as the local context weighting (M local). Our suggested ethods are SM nr and M nr. These odels are the full query odel with neighborhood relevance weighting with and without sentences. For each of the copared ethods, we train separate λ paraeters on the training data using a coordinate ascent learning algorith. Estiated λ paraeters differ across ethods. For the SM nr odel the estiated paraeters are: λ = 0.321, λ V = 0.293, λ S = 0.155, and λ M = Figure 3 visualizes an ablation study for the context coponents using precision at rank one for evaluation. Figure 3a shows the cuulative iproveents as context is added and weighted with the neighborhood relevance (M nr). M with unifor neighborhood weighting perfors siilarly to M local weighting (not shown). We observe that adding sentence context does not significantly iprove perforance. Figure 3b details the individual contributions of contextual coponents (oitting sentences). It is interesting that the ethod (entity nae plus nae variations) and M nr (entity nae plus weighted entity spans) are coparable in effectiveness. This is useful when no high quality nae variations are extractable fro the text, as is the case in inforal text fro social edia. The cuulative figure above shows that when cobined these features yield further iproveent. Across all years the neighborhood relevance odel achieves better effectiveness than the local odel. 6.3 Ranking Distribution In the previous section we exained the effectiveness of the different contextual coponents on the top-ranked result. Table 2 presents the contextual ranking odels evaluated using ean reciprocal rank (MRR). Siilar to the previous results, it shows that the ost effective odels include the neighborhood relevance weighting schee (nr). M nr and SM nr are significantly better than the baseline. The only exception is in 2010, when the queries are

7 P1 P M M_nr SM_nr all 2009 M_local M_nr M M_nr SM_nr all M M_nr SM_nr all (a) Cuulative M_local M_nr M_local M_nr M M_nr SM_nr all 2012 M_local M_nr Method M nr M nr LTR Table 3: Learning to rank refineent results with ean recipocal rank (MRR). All LTR results are statistically significant with α = 5% over the unsupervised M nr average recall M_nr M_nr LTR Figure 4: Average recall at rank cutoff k cutoff rank k (b) Individual Contributions. Figure 3: Ablation study for the suggested ethod in ters of 1. Method M nr * 0.825* M M nr 0.795* * 0.715* M local 0.784* * S * SM nr 0.792* * 6* SM local 0.780* * 0.719* all context 0.786* * 0.735* Table 2: Ranking results on TAC by year with varying context ethods with ean reciprocal rank (MRR). The best results for each year are highlighted in bold. Results that are statistically significant with α = 5% over the baseline are indicated with *. easier. In this case only the M nr ethod is significantly better. Additionally, M nr is significantly better than the local weighting (Gottipati) for However, there is no significant difference in We hypothesize that the reason the neighborhood odel does not iprove over the local odel in 2012 is because the queries are significantly ore abiguous and the quality of the retrieved feedback docuents is lower. We refine the retrieval ranking using a supervised learning to rank odel. The the features in the ranking odel are described in Table 1. The top 100 results fro the best ranking, M nr are reranked. The results of this are shown in Table 3. The results show significant iproveent over the initial retrieval ranking leveraging ore features that perfor ore extensive contextual coparison. The results for 2012 are still well below the other years, indicating the difficulty of these queries even leveraging the ore coplex contextual features. This indicates that a better feature representation is needed to address soe of these difficult to resolve entions Recall The previous results use ean reciprocal rank to easure the retrieval effectiveness. We now exaine the rank distribution in ore detail, exaining the recall at a given rank cutoff. The entity recall is critical because it is an upper bound on the effectiveness of downstrea systes. To achieve a iniu 90% recall threshold across all years requires hundreds of candidates for the query () odel, 20 for, 16 for M nr, and only 3 for M nr LTR. The learning to rank odel achieves at least 95% recall across all years within 10 results. 6.4 TAC KBP results In this section, we evaluate the ranking as part of the entire linking pipeline described in Section 5. We report the icro-averaged accuracy because we do not focus on clustering NIL entity entions. The results are in Table 4. The unsupervised retrieval M nr perfors well, above the edian in 2012 and copetitive in previous years. The supervised ranking odels iprove effectiveness significantly. The in-kb ranking results outperfor the best perforing systes in 2009 through 2011 and are coparable in The ain focus of this work is ranking, and this shows the effectiveness of our approach. We now exaine the overall (all) results, including the NIL handling. The results show that the M nr with LTR reranking and NIL handling outperfors the top syste in 2009 and is copetitive with the best perforing systes in subsequent years. Applying the score threshold iproves the overall accuracy despite decreasing in-kb effectiveness. This is because soe correctly linked entities are arked as NIL, but are outweighed by the greater reduction in false

8 in-kb NIL all in-kb NIL all in-kb NIL all in-kb NIL all M nr M nr LTR RM nr LTR NIL Best Perforer Table 4: TAC Entity Linking perforance in icro-average accuracy. positive entity links. The NIL handling strategy based on thresholding the ranking score is effective, but could be iproved further. Other linking systes use a supervised NIL classifier for this step, allowing the to perfor well despite less effective in-kb ranking. 7. CONCLUSION In this paper we propose an approach to entity linking based upon the Markov Rando Field inforation retrieval odel (MRF-IR). We focus on the task of ranking knowledge base entities. We deonstrate how concepts fro joint neighborhood odels can be incorporated within the MRF-IR fraework. Further, we propose a neighborhood relevance odel (NRM) that uses relevance feedback techniques to identify salient entity context across docuents. Our experients on the TAC KBP entity linking data show that the neighborhood relevance odel outperfors other contextual odels. When the ranking is refined with a learning to rank odel the results beat the current best perforing systes on in-kb ranking accuracy. Cobined with a siple NIL handling strategy the overall effectiveness on all entions is coparable to, and soeties better than, other state-of-the-art entity linking systes. Acknowledgeents This work was supported in part by the Center for Intelligent Inforation Retrieval and in part under subcontract # fro SRI International, prie contractor to DARPA contract #HR C Any opinions, findings and conclusions or recoendations expressed in this aterial are those of the authors and do not necessarily reflect those of the sponsor. 8. REFERENCES [1] Entity disabiguation for knowledge base population. In COLING, [2] Razvan Bunescu and Marius Pasca. Using encyclopedic knowledge for naed entity disabiguation. In European Chapter of the Association for Coputational Linguistics (EACL), [3] S. Cucerzan. Large-Scale Naed Entity Disabiguation Based on Wikipedia Data. EMNLP, [4] S. Cucerzan. Tac entity linking by perforing full-docuent entity extraction and disabiguation. Proceedings of the Text Analysis Conference, [5] Swapna Gottipati and Jing Jiang. Linking entities to a knowledge base with query expansion. In EMNLP, [6] Johannes Hoffart, Mohaed A. Yosef, Ilaria Bordino, Hagen Fürstenau, Manfred Pinkal, Marc Spaniol, Bilyana Taneva, Stefan Thater, and Gerhard Weiku. Robust disabiguation of naed entities in text. In EMNLP, [7] Darren W. Huang, Yue Xu, Andrew Trotan, and Shloo Geva. Focused access to XML docuents. chapter Overview of INEX 2007 Link the Wiki Track [8] Heng Ji and Ralph Grishan. Knowledge base population: successful approaches and challenges. In ACL-HLT, [9] Heng Ji, Ralph Grishan, and Hoa Dang. Overview of the TAC2011 knowledge base population track. In TAC 2011 Proceedings Papers, [10] Daphne Koller and Nir Friedan. Probabilistic graphical odels: principles and techniques. MIT press, [11] Sayali Kulkarni, Ait Singh, Ganesh Raakrishnan, and Souen Chakrabarti. Collective annotation of wikipedia entities in web text. In KDD, [12] Paul McNaee, Veselin Stoyanov, Jaes Mayfield, Ti Finin, Ti Oates, Tan Xu, Douglas W. Oard, and Dawn Lawrie. HLTCOE participation at TAC 2012: Entity linking and cold start knowledge base construction. In TAC KBP, [13] D. Metzler and W.B. Croft. Latent concept expansion using arkov rando fields. In SIGIR, [14] Donald Metzler and W. Bruce Croft. A arkov rando field odel for ter dependencies. In SIGIR, [15] S. Monahan, J. Lehann, T. Nyberg, J. Plyale, and A. Jung. Cross-lingual cross-docuent coreference with entity linking. Proceedings of the Text Analysis Conference, [16] L. Ratinov, D. Roth, D. Downey, and M. Anderson. Local and global algoriths for disabiguation to wikipedia. In ACL, [17] J. J. Rocchio. Relevance feedback in inforation retrieval. In G. Salton, editor, The SMART Retrieval Syste: Experients in Autoatic Docuent Processing, chapter 14, pages [18] Valentin I. Spitkovsky and Angel X. Chang. A Cross-Lingual dictionary for english wikipedia concepts. In LREC, [19] Veselin Stoyanov, Jaes Mayfield, Tan Xu, Douglas W. Oard, Dawn Lawrie, Ti Oates, and Ti Finin. A context-aware approach to entity linking. In AKBC-WEKEX, [20] J. Xu and W.B. Croft. uery expansion using local and global docuent analysis. In SIGIR, 1996.

A Neighborhood Relevance Model for Entity Linking

A Neighborhood Relevance Model for Entity Linking A Neighborhood Relevance Model for Entity Linking Jeffrey Dalton University of Massachusetts 140 Governors Drive Aherst, MA, U.S.A. jdalton@cs.uass.edu Laura Dietz University of Massachusetts 140 Governors

More information

Leveraging Relevance Cues for Improved Spoken Document Retrieval

Leveraging Relevance Cues for Improved Spoken Document Retrieval Leveraging Relevance Cues for Iproved Spoken Docuent Retrieval Pei-Ning Chen 1, Kuan-Yu Chen 2 and Berlin Chen 1 National Taiwan Noral University, Taiwan 1 Institute of Inforation Science, Acadeia Sinica,

More information

Bi-directional Linkability From Wikipedia to Documents and Back Again: UMass at TREC 2012 Knowledge Base Acceleration Track

Bi-directional Linkability From Wikipedia to Documents and Back Again: UMass at TREC 2012 Knowledge Base Acceleration Track Bi-directional Linkability From Wikipedia to Documents and Back Again: UMass at TREC 2012 Knowledge Base Acceleration Track Jeffrey Dalton University of Massachusetts, Amherst jdalton@cs.umass.edu Laura

More information

Meta-Path-Based Ranking with Pseudo Relevance Feedback on Heterogeneous Graph for Citation Recommendation

Meta-Path-Based Ranking with Pseudo Relevance Feedback on Heterogeneous Graph for Citation Recommendation Meta-Path-Based Ranking with Pseudo Relevance Feedback on Heterogeneous Graph for Citation Recoendation Xiaozhong Liu School of Inforatics and Coputing Indiana University Blooington Blooington, IN, USA,

More information

Detection of Outliers and Reduction of their Undesirable Effects for Improving the Accuracy of K-means Clustering Algorithm

Detection of Outliers and Reduction of their Undesirable Effects for Improving the Accuracy of K-means Clustering Algorithm Detection of Outliers and Reduction of their Undesirable Effects for Iproving the Accuracy of K-eans Clustering Algorith Bahan Askari Departent of Coputer Science and Research Branch, Islaic Azad University,

More information

1 Extended Boolean Model

1 Extended Boolean Model 1 EXTENDED BOOLEAN MODEL It has been well-known that the Boolean odel is too inflexible, requiring skilful use of Boolean operators to obtain good results. On the other hand, the vector space odel is flexible

More information

EE 364B Convex Optimization An ADMM Solution to the Sparse Coding Problem. Sonia Bhaskar, Will Zou Final Project Spring 2011

EE 364B Convex Optimization An ADMM Solution to the Sparse Coding Problem. Sonia Bhaskar, Will Zou Final Project Spring 2011 EE 364B Convex Optiization An ADMM Solution to the Sparse Coding Proble Sonia Bhaskar, Will Zou Final Project Spring 20 I. INTRODUCTION For our project, we apply the ethod of the alternating direction

More information

Identifying Converging Pairs of Nodes on a Budget

Identifying Converging Pairs of Nodes on a Budget Identifying Converging Pairs of Nodes on a Budget Konstantina Lazaridou Departent of Inforatics Aristotle University, Thessaloniki, Greece konlaznik@csd.auth.gr Evaggelia Pitoura Coputer Science and Engineering

More information

OPTIMAL COMPLEX SERVICES COMPOSITION IN SOA SYSTEMS

OPTIMAL COMPLEX SERVICES COMPOSITION IN SOA SYSTEMS Key words SOA, optial, coplex service, coposition, Quality of Service Piotr RYGIELSKI*, Paweł ŚWIĄTEK* OPTIMAL COMPLEX SERVICES COMPOSITION IN SOA SYSTEMS One of the ost iportant tasks in service oriented

More information

An Automatic Method for Summary Evaluation Using Multiple Evaluation Results by a Manual Method

An Automatic Method for Summary Evaluation Using Multiple Evaluation Results by a Manual Method An Autoatic Method for Suary Evaluation Using Multiple Evaluation Results by a Manual Method Hidetsugu Nanba Faculty of Inforation Sciences, Hiroshia City University 3-4-1 Ozuka, Hiroshia, 731-3194 Japan

More information

Solving the Damage Localization Problem in Structural Health Monitoring Using Techniques in Pattern Classification

Solving the Damage Localization Problem in Structural Health Monitoring Using Techniques in Pattern Classification Solving the Daage Localization Proble in Structural Health Monitoring Using Techniques in Pattern Classification CS 9 Final Project Due Dec. 4, 007 Hae Young Noh, Allen Cheung, Daxia Ge Introduction Structural

More information

Geo-activity Recommendations by using Improved Feature Combination

Geo-activity Recommendations by using Improved Feature Combination Geo-activity Recoendations by using Iproved Feature Cobination Masoud Sattari Middle East Technical University Ankara, Turkey e76326@ceng.etu.edu.tr Murat Manguoglu Middle East Technical University Ankara,

More information

Different criteria of dynamic routing

Different criteria of dynamic routing Procedia Coputer Science Volue 66, 2015, Pages 166 173 YSC 2015. 4th International Young Scientists Conference on Coputational Science Different criteria of dynaic routing Kurochkin 1*, Grinberg 1 1 Kharkevich

More information

Collection Selection Based on Historical Performance for Efficient Processing

Collection Selection Based on Historical Performance for Efficient Processing Collection Selection Based on Historical Perforance for Efficient Processing Christopher T. Fallen and Gregory B. Newby Arctic Region Supercoputing Center University of Alaska Fairbanks Fairbanks, Alaska

More information

TALLINN UNIVERSITY OF TECHNOLOGY, INSTITUTE OF PHYSICS 17. FRESNEL DIFFRACTION ON A ROUND APERTURE

TALLINN UNIVERSITY OF TECHNOLOGY, INSTITUTE OF PHYSICS 17. FRESNEL DIFFRACTION ON A ROUND APERTURE 7. FRESNEL DIFFRACTION ON A ROUND APERTURE. Objective Exaining diffraction pattern on a round aperture, deterining wavelength of light source.. Equipent needed Optical workbench, light source, color filters,

More information

The optimization design of microphone array layout for wideband noise sources

The optimization design of microphone array layout for wideband noise sources PROCEEDINGS of the 22 nd International Congress on Acoustics Acoustic Array Systes: Paper ICA2016-903 The optiization design of icrophone array layout for wideband noise sources Pengxiao Teng (a), Jun

More information

An Efficient Approach for Content Delivery in Overlay Networks

An Efficient Approach for Content Delivery in Overlay Networks An Efficient Approach for Content Delivery in Overlay Networks Mohaad Malli, Chadi Barakat, Walid Dabbous Projet Planète, INRIA-Sophia Antipolis, France E-ail:{alli, cbarakat, dabbous}@sophia.inria.fr

More information

THE rapid growth and continuous change of the real

THE rapid growth and continuous change of the real IEEE TRANSACTIONS ON SERVICES COMPUTING, VOL. 8, NO. 1, JANUARY/FEBRUARY 2015 47 Designing High Perforance Web-Based Coputing Services to Proote Teleedicine Database Manageent Syste Isail Hababeh, Issa

More information

A Context-Aware Approach to Entity Linking

A Context-Aware Approach to Entity Linking A Context-Aware Approach to Entity Linking Veselin Stoyanov and James Mayfield HLTCOE Johns Hopkins University Dawn Lawrie Loyola University in Maryland Tan Xu and Douglas W. Oard University of Maryland

More information

MULTI-INDEX VOTING FOR ASYMMETRIC DISTANCE COMPUTATION IN A LARGE-SCALE BINARY CODES. Chih-Yi Chiu, Yu-Cyuan Liou, and Sheng-Hao Chou

MULTI-INDEX VOTING FOR ASYMMETRIC DISTANCE COMPUTATION IN A LARGE-SCALE BINARY CODES. Chih-Yi Chiu, Yu-Cyuan Liou, and Sheng-Hao Chou MULTI-INDEX VOTING FOR ASYMMETRIC DISTANCE COMPUTATION IN A LARGE-SCALE BINARY CODES Chih-Yi Chiu, Yu-Cyuan Liou, and Sheng-Hao Chou Departent of Coputer Science and Inforation Engineering, National Chiayi

More information

Brian Noguchi CS 229 (Fall 05) Project Final Writeup A Hierarchical Application of ICA-based Feature Extraction to Image Classification Brian Noguchi

Brian Noguchi CS 229 (Fall 05) Project Final Writeup A Hierarchical Application of ICA-based Feature Extraction to Image Classification Brian Noguchi A Hierarchical Application of ICA-based Feature Etraction to Iage Classification Introduction Iage classification poses one of the greatest challenges in the achine vision and achine learning counities.

More information

Designing High Performance Web-Based Computing Services to Promote Telemedicine Database Management System

Designing High Performance Web-Based Computing Services to Promote Telemedicine Database Management System Designing High Perforance Web-Based Coputing Services to Proote Teleedicine Database Manageent Syste Isail Hababeh 1, Issa Khalil 2, and Abdallah Khreishah 3 1: Coputer Engineering & Inforation Technology,

More information

Feature Based Registration for Panoramic Image Generation

Feature Based Registration for Panoramic Image Generation IJCSI International Journal of Coputer Science Issues, Vol. 10, Issue 6, No, Noveber 013 www.ijcsi.org 13 Feature Based Registration for Panoraic Iage Generation Kawther Abbas Sallal 1, Abdul-Mone Saleh

More information

POSITION-PATCH BASED FACE HALLUCINATION VIA LOCALITY-CONSTRAINED REPRESENTATION. Junjun Jiang, Ruimin Hu, Zhen Han, Tao Lu, and Kebin Huang

POSITION-PATCH BASED FACE HALLUCINATION VIA LOCALITY-CONSTRAINED REPRESENTATION. Junjun Jiang, Ruimin Hu, Zhen Han, Tao Lu, and Kebin Huang IEEE International Conference on ultiedia and Expo POSITION-PATCH BASED FACE HALLUCINATION VIA LOCALITY-CONSTRAINED REPRESENTATION Junjun Jiang, Ruiin Hu, Zhen Han, Tao Lu, and Kebin Huang National Engineering

More information

Clustering. Cluster Analysis of Microarray Data. Microarray Data for Clustering. Data for Clustering

Clustering. Cluster Analysis of Microarray Data. Microarray Data for Clustering. Data for Clustering Clustering Cluster Analysis of Microarray Data 4/3/009 Copyright 009 Dan Nettleton Group obects that are siilar to one another together in a cluster. Separate obects that are dissiilar fro each other into

More information

Mapping Data in Peer-to-Peer Systems: Semantics and Algorithmic Issues

Mapping Data in Peer-to-Peer Systems: Semantics and Algorithmic Issues Mapping Data in Peer-to-Peer Systes: Seantics and Algorithic Issues Anastasios Keentsietsidis Marcelo Arenas Renée J. Miller Departent of Coputer Science University of Toronto {tasos,arenas,iller}@cs.toronto.edu

More information

Computer Aided Drafting, Design and Manufacturing Volume 26, Number 2, June 2016, Page 13

Computer Aided Drafting, Design and Manufacturing Volume 26, Number 2, June 2016, Page 13 Coputer Aided Drafting, Design and Manufacturing Volue 26, uber 2, June 2016, Page 13 CADDM 3D reconstruction of coplex curved objects fro line drawings Sun Yanling, Dong Lijun Institute of Mechanical

More information

Performance Analysis of RAID in Different Workload

Performance Analysis of RAID in Different Workload Send Orders for Reprints to reprints@benthascience.ae 324 The Open Cybernetics & Systeics Journal, 2015, 9, 324-328 Perforance Analysis of RAID in Different Workload Open Access Zhang Dule *, Ji Xiaoyun,

More information

Effective Tracking of the Players and Ball in Indoor Soccer Games in the Presence of Occlusion

Effective Tracking of the Players and Ball in Indoor Soccer Games in the Presence of Occlusion Effective Tracking of the Players and Ball in Indoor Soccer Gaes in the Presence of Occlusion Soudeh Kasiri-Bidhendi and Reza Safabakhsh Airkabir Univerisity of Technology, Tehran, Iran {kasiri, safa}@aut.ac.ir

More information

Debiasing Crowdsourced Batches

Debiasing Crowdsourced Batches Debiasing Crowdsourced Batches Honglei Zhuang, Aditya Paraeswaran, Dan Roth and Jiawei Han Departent of Coputer Science University of Illinois at Urbana-Chapaign {hzhuang3, adityagp, danr, hanj}@illinois.edu

More information

Feature Selection to Relate Words and Images

Feature Selection to Relate Words and Images The Open Inforation Systes Journal, 2009, 3, 9-13 9 Feature Selection to Relate Words and Iages Wei-Chao Lin 1 and Chih-Fong Tsai*,2 Open Access 1 Departent of Coputing, Engineering and Technology, University

More information

Development of an Integrated Cost Estimation and Cost Control System for Construction Projects

Development of an Integrated Cost Estimation and Cost Control System for Construction Projects ABSTRACT Developent of an Integrated Estiation and Control Syste for Construction s by Salan Azhar, Syed M. Ahed and Aaury A. Caballero Florida International University 0555 W. Flagler Street, Miai, Florida

More information

Modeling Parallel Applications Performance on Heterogeneous Systems

Modeling Parallel Applications Performance on Heterogeneous Systems Modeling Parallel Applications Perforance on Heterogeneous Systes Jaeela Al-Jaroodi, Nader Mohaed, Hong Jiang and David Swanson Departent of Coputer Science and Engineering University of Nebraska Lincoln

More information

Scalable search-based image annotation

Scalable search-based image annotation Multiedia Systes DOI 17/s53-8-128-y REGULAR PAPER Scalable search-based iage annotation Changhu Wang Feng Jing Lei Zhang Hong-Jiang Zhang Received: 18 October 27 / Accepted: 15 May 28 Springer-Verlag 28

More information

Boosted Detection of Objects and Attributes

Boosted Detection of Objects and Attributes L M M Boosted Detection of Objects and Attributes Abstract We present a new fraework for detection of object and attributes in iages based on boosted cobination of priitive classifiers. The fraework directly

More information

Supplementary Section. A. Algorithm. B. Datasets. C. More on Labeling Functions

Supplementary Section. A. Algorithm. B. Datasets. C. More on Labeling Functions Suppleentary Section In this section, we include ore details about our algorith, labeling functions, datasets as well as additional results and coparisons, which could not be included in the ain paper

More information

A Novel 2D Texture Classifier For Gray Level Images

A Novel 2D Texture Classifier For Gray Level Images 2012, TextRoad Publication ISSN 2090-4304 Journal of Basic and Applied Scientific Research www.textroad.co A Novel 2D Texture Classifier For Gray Level Iages B.S. Mousavi 1 Young Researchers Club, Zahedan

More information

A Learning Framework for Nearest Neighbor Search

A Learning Framework for Nearest Neighbor Search A Learning Fraework for Nearest Neighbor Search Lawrence Cayton Departent of Coputer Science University of California, San Diego lcayton@cs.ucsd.edu Sanjoy Dasgupta Departent of Coputer Science University

More information

A wireless sensor network for visual detection and classification of intrusions

A wireless sensor network for visual detection and classification of intrusions A wireless sensor network for visual detection and classification of intrusions ANDRZEJ SLUZEK 1,3, PALANIAPPAN ANNAMALAI 2, MD SAIFUL ISLAM 1 1 School of Coputer Engineering, 2 IntelliSys Centre Nanyang

More information

TensorFlow and Keras-based Convolutional Neural Network in CAT Image Recognition Ang LI 1,*, Yi-xiang LI 2 and Xue-hui LI 3

TensorFlow and Keras-based Convolutional Neural Network in CAT Image Recognition Ang LI 1,*, Yi-xiang LI 2 and Xue-hui LI 3 2017 2nd International Conference on Coputational Modeling, Siulation and Applied Matheatics (CMSAM 2017) ISBN: 978-1-60595-499-8 TensorFlow and Keras-based Convolutional Neural Network in CAT Iage Recognition

More information

A Novel Fuzzy Chinese Address Matching Engine Based on Full-text Search Technology

A Novel Fuzzy Chinese Address Matching Engine Based on Full-text Search Technology Based on Full-text Search Technology 12 Institute of Reote Sensing and Digital Earth National Engineering Research Center for Reote Sensing Applications,Beijing,100101, China E-ail: yaoxj@radi.ac.cn Xiang

More information

Oblivious Routing for Fat-Tree Based System Area Networks with Uncertain Traffic Demands

Oblivious Routing for Fat-Tree Based System Area Networks with Uncertain Traffic Demands Oblivious Routing for Fat-Tree Based Syste Area Networks with Uncertain Traffic Deands Xin Yuan Wickus Nienaber Zhenhai Duan Departent of Coputer Science Florida State University Tallahassee, FL 3306 {xyuan,nienaber,duan}@cs.fsu.edu

More information

EFFICIENT VIDEO SEARCH USING IMAGE QUERIES A. Araujo1, M. Makar2, V. Chandrasekhar3, D. Chen1, S. Tsai1, H. Chen1, R. Angst1 and B.

EFFICIENT VIDEO SEARCH USING IMAGE QUERIES A. Araujo1, M. Makar2, V. Chandrasekhar3, D. Chen1, S. Tsai1, H. Chen1, R. Angst1 and B. EFFICIENT VIDEO SEARCH USING IMAGE QUERIES A. Araujo1, M. Makar2, V. Chandrasekhar3, D. Chen1, S. Tsai1, H. Chen1, R. Angst1 and B. Girod1 1 Stanford University, USA 2 Qualco Inc., USA ABSTRACT We study

More information

The Internal Conflict of a Belief Function

The Internal Conflict of a Belief Function The Internal Conflict of a Belief Function Johan Schubert Abstract In this paper we define and derive an internal conflict of a belief function We decopose the belief function in question into a set of

More information

International Journal of Scientific & Engineering Research, Volume 4, Issue 10, October ISSN

International Journal of Scientific & Engineering Research, Volume 4, Issue 10, October ISSN International Journal of Scientific & Engineering Research, Volue 4, Issue 0, October-203 483 Design an Encoding Technique Using Forbidden Transition free Algorith to Reduce Cross-Talk for On-Chip VLSI

More information

Optimal Route Queries with Arbitrary Order Constraints

Optimal Route Queries with Arbitrary Order Constraints IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL.?, NO.?,? 20?? 1 Optial Route Queries with Arbitrary Order Constraints Jing Li, Yin Yang, Nikos Maoulis Abstract Given a set of spatial points DS,

More information

Derivation of an Analytical Model for Evaluating the Performance of a Multi- Queue Nodes Network Router

Derivation of an Analytical Model for Evaluating the Performance of a Multi- Queue Nodes Network Router Derivation of an Analytical Model for Evaluating the Perforance of a Multi- Queue Nodes Network Router 1 Hussein Al-Bahadili, 1 Jafar Ababneh, and 2 Fadi Thabtah 1 Coputer Inforation Systes Faculty of

More information

On computing global similarity in images

On computing global similarity in images On coputing global siilarity in iages S Ravela and R Manatha Coputer Science Departent University of Massachusetts, Aherst, MA 01003 Eail: ravela,anatha @csuassedu Abstract The retrieval of iages based

More information

A Learning Framework for Nearest Neighbor Search

A Learning Framework for Nearest Neighbor Search A Learning Fraework for Nearest Neighbor Search Lawrence Cayton Departent of Coputer Science University of California, San Diego lcayton@cs.ucsd.edu Sanjoy Dasgupta Departent of Coputer Science University

More information

COLOR HISTOGRAM AND DISCRETE COSINE TRANSFORM FOR COLOR IMAGE RETRIEVAL

COLOR HISTOGRAM AND DISCRETE COSINE TRANSFORM FOR COLOR IMAGE RETRIEVAL COLOR HISTOGRAM AND DISCRETE COSINE TRANSFORM FOR COLOR IMAGE RETRIEVAL 1 Te-Wei Chiang ( 蔣德威 ), 2 Tienwei Tsai ( 蔡殿偉 ), 3 Jeng-Ping Lin ( 林正平 ) 1 Dept. of Accounting Inforation Systes, Chilee Institute

More information

Carving Differential Unit Test Cases from System Test Cases

Carving Differential Unit Test Cases from System Test Cases Carving Differential Unit Test Cases fro Syste Test Cases Sebastian Elbau, Hui Nee Chin, Matthew B. Dwyer, Jonathan Dokulil Departent of Coputer Science and Engineering University of Nebraska - Lincoln

More information

AN APPROACH ON BIMODAL BIOMETRIC SYSTEMS

AN APPROACH ON BIMODAL BIOMETRIC SYSTEMS AN APPROACH ON BIODAL BIOETRIC SYSTES Eugen LUPU, Siina EERICH Technical University of Cluj-Napoca, 26-28 Baritiu str. Cluj-Napoca phone: +40-264-40-266; fax: +40-264-592-055; e-ail: Eugen.Lupu @co.utcluj.ro

More information

The Boundary Between Privacy and Utility in Data Publishing

The Boundary Between Privacy and Utility in Data Publishing The Boundary Between Privacy and Utility in Data Publishing Vibhor Rastogi Dan Suciu Sungho Hong ABSTRACT We consider the privacy proble in data publishing: given a database instance containing sensitive

More information

A Trajectory Splitting Model for Efficient Spatio-Temporal Indexing

A Trajectory Splitting Model for Efficient Spatio-Temporal Indexing A Trajectory Splitting Model for Efficient Spatio-Teporal Indexing Slobodan Rasetic Jörg Sander Jaes Elding Mario A. Nasciento Departent of Coputing Science University of Alberta Edonton, Alberta, Canada

More information

A simplified approach to merging partial plane images

A simplified approach to merging partial plane images A siplified approach to erging partial plane iages Mária Kruláková 1 This paper introduces a ethod of iage recognition based on the gradual generating and analysis of data structure consisting of the 2D

More information

Evaluation of a multi-frame blind deconvolution algorithm using Cramér-Rao bounds

Evaluation of a multi-frame blind deconvolution algorithm using Cramér-Rao bounds Evaluation of a ulti-frae blind deconvolution algorith using Craér-Rao bounds Charles C. Beckner, Jr. Air Force Research Laboratory, 3550 Aberdeen Ave SE, Kirtland AFB, New Mexico, USA 87117-5776 Charles

More information

QUERY ROUTING OPTIMIZATION IN SENSOR COMMUNICATION NETWORKS

QUERY ROUTING OPTIMIZATION IN SENSOR COMMUNICATION NETWORKS QUERY ROUTING OPTIMIZATION IN SENSOR COMMUNICATION NETWORKS Guofei Jiang and George Cybenko Institute for Security Technology Studies and Thayer School of Engineering Dartouth College, Hanover NH 03755

More information

Gromov-Hausdorff Distance Between Metric Graphs

Gromov-Hausdorff Distance Between Metric Graphs Groov-Hausdorff Distance Between Metric Graphs Jiwon Choi St Mark s School January, 019 Abstract In this paper we study the Groov-Hausdorff distance between two etric graphs We copute the precise value

More information

Energy-Efficient Disk Replacement and File Placement Techniques for Mobile Systems with Hard Disks

Energy-Efficient Disk Replacement and File Placement Techniques for Mobile Systems with Hard Disks Energy-Efficient Disk Replaceent and File Placeent Techniques for Mobile Systes with Hard Disks Young-Jin Ki School of Coputer Science & Engineering Seoul National University Seoul 151-742, KOREA youngjk@davinci.snu.ac.kr

More information

Image Filter Using with Gaussian Curvature and Total Variation Model

Image Filter Using with Gaussian Curvature and Total Variation Model IJECT Vo l. 7, Is s u e 3, Ju l y - Se p t 016 ISSN : 30-7109 (Online) ISSN : 30-9543 (Print) Iage Using with Gaussian Curvature and Total Variation Model 1 Deepak Kuar Gour, Sanjay Kuar Shara 1, Dept.

More information

Approximate String Matching with Reduced Alphabet

Approximate String Matching with Reduced Alphabet Approxiate String Matching with Reduced Alphabet Leena Salela 1 and Jora Tarhio 2 1 University of Helsinki, Departent of Coputer Science leena.salela@cs.helsinki.fi 2 Aalto University Deptartent of Coputer

More information

Smarter Balanced Assessment Consortium Claims, Targets, and Standard Alignment for Math

Smarter Balanced Assessment Consortium Claims, Targets, and Standard Alignment for Math Sarter Balanced Assessent Consortiu s, s, Stard Alignent for Math The Sarter Balanced Assessent Consortiu (SBAC) has created a hierarchy coprised of clais targets that together can be used to ake stateents

More information

Galois Homomorphic Fractal Approach for the Recognition of Emotion

Galois Homomorphic Fractal Approach for the Recognition of Emotion Galois Hooorphic Fractal Approach for the Recognition of Eotion T. G. Grace Elizabeth Rani 1, G. Jayalalitha 1 Research Scholar, Bharathiar University, India, Associate Professor, Departent of Matheatics,

More information

Structural Balance in Networks. An Optimizational Approach. Andrej Mrvar. Faculty of Social Sciences. University of Ljubljana. Kardeljeva pl.

Structural Balance in Networks. An Optimizational Approach. Andrej Mrvar. Faculty of Social Sciences. University of Ljubljana. Kardeljeva pl. Structural Balance in Networks An Optiizational Approach Andrej Mrvar Faculty of Social Sciences University of Ljubljana Kardeljeva pl. 5 61109 Ljubljana March 23 1994 Contents 1 Balanced and clusterable

More information

Theoretical Analysis of Local Search and Simple Evolutionary Algorithms for the Generalized Travelling Salesperson Problem

Theoretical Analysis of Local Search and Simple Evolutionary Algorithms for the Generalized Travelling Salesperson Problem Theoretical Analysis of Local Search and Siple Evolutionary Algoriths for the Generalized Travelling Salesperson Proble Mojgan Pourhassan ojgan.pourhassan@adelaide.edu.au Optiisation and Logistics, The

More information

Utility-Based Resource Allocation for Mixed Traffic in Wireless Networks

Utility-Based Resource Allocation for Mixed Traffic in Wireless Networks IEEE IFOCO 2 International Workshop on Future edia etworks and IP-based TV Utility-Based Resource Allocation for ixed Traffic in Wireless etworks Li Chen, Bin Wang, Xiaohang Chen, Xin Zhang, and Dacheng

More information

3D Building Detection and Reconstruction from Aerial Images Using Perceptual Organization and Fast Graph Search

3D Building Detection and Reconstruction from Aerial Images Using Perceptual Organization and Fast Graph Search 436 Journal of Electrical Engineering & Technology, Vol. 3, No. 3, pp. 436~443, 008 3D Building Detection and Reconstruction fro Aerial Iages Using Perceptual Organization and Fast Graph Search Dong-Min

More information

PROBABILISTIC LOCALIZATION AND MAPPING OF MOBILE ROBOTS IN INDOOR ENVIRONMENTS WITH A SINGLE LASER RANGE FINDER

PROBABILISTIC LOCALIZATION AND MAPPING OF MOBILE ROBOTS IN INDOOR ENVIRONMENTS WITH A SINGLE LASER RANGE FINDER nd International Congress of Mechanical Engineering (COBEM 3) Noveber 3-7, 3, Ribeirão Preto, SP, Brazil Copyright 3 by ABCM PROBABILISTIC LOCALIZATION AND MAPPING OF MOBILE ROBOTS IN INDOOR ENVIRONMENTS

More information

Verifying the structure and behavior in UML/OCL models using satisfiability solvers

Verifying the structure and behavior in UML/OCL models using satisfiability solvers IET Cyber-Physical Systes: Theory & Applications Review Article Verifying the structure and behavior in UML/OCL odels using satisfiability solvers ISSN 2398-3396 Received on 20th October 2016 Revised on

More information

News Events Clustering Method Based on Staging Incremental Single-Pass Technique

News Events Clustering Method Based on Staging Incremental Single-Pass Technique News Events Clustering Method Based on Staging Increental Single-Pass Technique LI Yongyi 1,a *, Gao Yin 2 1 School of Electronics and Inforation Engineering QinZhou University 535099 Guangxi, China 2

More information

AN INTEGRATED APPROACH TO MUSIC BOUNDARY DETECTION

AN INTEGRATED APPROACH TO MUSIC BOUNDARY DETECTION 10th International Society for Music Inforation Retrieval Conference (ISMIR 2009) AN INTEGRATED APPROACH TO MUSIC BOUNDARY DETECTION Min-Yian Su, Yi-Hsuan Yang, Yu-Ching Lin, Hoer Chen National Taiwan

More information

Defining and Surveying Wireless Link Virtualization and Wireless Network Virtualization

Defining and Surveying Wireless Link Virtualization and Wireless Network Virtualization 1 Defining and Surveying Wireless Link Virtualization and Wireless Network Virtualization Jonathan van de Belt, Haed Ahadi, and Linda E. Doyle The Centre for Future Networks and Counications - CONNECT,

More information

HKBU Institutional Repository

HKBU Institutional Repository Hong Kong Baptist University HKBU Institutional Repository HKBU Staff Publication 18 Towards why-not spatial keyword top-k ueries: A direction-aware approach Lei Chen Hong Kong Baptist University, psuanle@gail.co

More information

Joint Measurement- and Traffic Descriptor-based Admission Control at Real-Time Traffic Aggregation Points

Joint Measurement- and Traffic Descriptor-based Admission Control at Real-Time Traffic Aggregation Points Joint Measureent- and Traffic Descriptor-based Adission Control at Real-Tie Traffic Aggregation Points Stylianos Georgoulas, Panos Triintzios and George Pavlou Centre for Counication Systes Research, University

More information

Improve Peer Cooperation using Social Networks

Improve Peer Cooperation using Social Networks Iprove Peer Cooperation using Social Networks Victor Ponce, Jie Wu, and Xiuqi Li Departent of Coputer Science and Engineering Florida Atlantic University Boca Raton, FL 33431 Noveber 5, 2007 Corresponding

More information

An Integrated Processing Method for Multiple Large-scale Point-Clouds Captured from Different Viewpoints

An Integrated Processing Method for Multiple Large-scale Point-Clouds Captured from Different Viewpoints 519 An Integrated Processing Method for Multiple Large-scale Point-Clouds Captured fro Different Viewpoints Yousuke Kawauchi 1, Shin Usuki, Kenjiro T. Miura 3, Hiroshi Masuda 4 and Ichiro Tanaka 5 1 Shizuoka

More information

Entity Search Engine: Towards Agile Best-Effort Information Integration over the Web

Entity Search Engine: Towards Agile Best-Effort Information Integration over the Web Entity Search Engine: Towards Agile Best-Effort Inforation Integration over the Web Tao Cheng, Kevin Chen-Chuan Chang University of Illinois at Urbana-Chapaign {tcheng3, kcchang}@cs.uiuc.edu. INTRODUCTION

More information

INTRINSIC DECOMPOSITION FOR STEREOSCOPIC IMAGES

INTRINSIC DECOMPOSITION FOR STEREOSCOPIC IMAGES INTRINSIC DECOMPOSITION FOR STEREOSCOPIC IMAGES Dehua Xie 1, Shuaicheng Liu 1, Kaio Lin 2, Shuyuan Zhu 1, and Bing Zeng 1 1 School of Electronic Engineering, University of Electronic Science and Technology

More information

Detecting Anomalous Structures by Convolutional Sparse Models

Detecting Anomalous Structures by Convolutional Sparse Models Detecting Anoalous Structures by Convolutional Sparse Models Diego Carrera, Giacoo Boracchi Dipartiento di Elettronica, Inforazione e Bioingegeria Politecnico di Milano, Italy {giacoo.boracchi, diego.carrera}@polii.it

More information

NUS-I2R: Learning a Combined System for Entity Linking

NUS-I2R: Learning a Combined System for Entity Linking NUS-I2R: Learning a Combined System for Entity Linking Wei Zhang Yan Chuan Sim Jian Su Chew Lim Tan School of Computing National University of Singapore {z-wei, tancl} @comp.nus.edu.sg Institute for Infocomm

More information

Resolution. Super-Resolution Imaging. Problem

Resolution. Super-Resolution Imaging. Problem Resolution Super-Resolution Iaging Resolution: Sallest easurable detail in a visual presentation Subhasis Chaudhuri Departent of Electrical Engineering Indian institute of Technology Bobay Powai, Mubai-400

More information

Efficient Estimation of Inclusion Coefficient using HyperLogLog Sketches

Efficient Estimation of Inclusion Coefficient using HyperLogLog Sketches Efficient Estiation of Inclusion Coefficient using HyperLogLog Sketches Azade Nazi, Bolin Ding, Vivek Narasayya, Surajit Chaudhuri Microsoft Research {aznazi, bolind, viveknar, surajitc}@icrosoft.co ABSTRACT

More information

NON-RIGID OBJECT TRACKING: A PREDICTIVE VECTORIAL MODEL APPROACH

NON-RIGID OBJECT TRACKING: A PREDICTIVE VECTORIAL MODEL APPROACH NON-RIGID OBJECT TRACKING: A PREDICTIVE VECTORIAL MODEL APPROACH V. Atienza; J.M. Valiente and G. Andreu Departaento de Ingeniería de Sisteas, Coputadores y Autoática Universidad Politécnica de Valencia.

More information

ELEVATION SURFACE INTERPOLATION OF POINT DATA USING DIFFERENT TECHNIQUES A GIS APPROACH

ELEVATION SURFACE INTERPOLATION OF POINT DATA USING DIFFERENT TECHNIQUES A GIS APPROACH ELEVATION SURFACE INTERPOLATION OF POINT DATA USING DIFFERENT TECHNIQUES A GIS APPROACH Kulapraote Prathuchai Geoinforatics Center, Asian Institute of Technology, 58 Moo9, Klong Luang, Pathuthani, Thailand.

More information

Novel Image Representation and Description Technique using Density Histogram of Feature Points

Novel Image Representation and Description Technique using Density Histogram of Feature Points Novel Iage Representation and Description Technique using Density Histogra of Feature Points Keneilwe ZUVA Departent of Coputer Science, University of Botswana, P/Bag 00704 UB, Gaborone, Botswana and Tranos

More information

6.1 Topological relations between two simple geometric objects

6.1 Topological relations between two simple geometric objects Chapter 5 proposed a spatial odel to represent the spatial extent of objects in urban areas. The purpose of the odel, as was clarified in Chapter 3, is ultifunctional, i.e. it has to be capable of supplying

More information

Data Caching for Enhancing Anonymity

Data Caching for Enhancing Anonymity Data Caching for Enhancing Anonyity Rajiv Bagai and Bin Tang Departent of Electrical Engineering and Coputer Science Wichita State University Wichita, Kansas 67260 0083, USA Eail: {rajiv.bagai, bin.tang}@wichita.edu

More information

Entity Linking at Web Scale

Entity Linking at Web Scale Entity Linking at Web Scale Thomas Lin, Mausam, Oren Etzioni Computer Science & Engineering University of Washington Seattle, WA 98195, USA {tlin, mausam, etzioni}@cs.washington.edu Abstract This paper

More information

Design Optimization of Mixed Time/Event-Triggered Distributed Embedded Systems

Design Optimization of Mixed Time/Event-Triggered Distributed Embedded Systems Design Optiization of Mixed Tie/Event-Triggered Distributed Ebedded Systes Traian Pop, Petru Eles, Zebo Peng Dept. of Coputer and Inforation Science, Linköping University {trapo, petel, zebpe}@ida.liu.se

More information

A Novel Fast Constructive Algorithm for Neural Classifier

A Novel Fast Constructive Algorithm for Neural Classifier A Novel Fast Constructive Algorith for Neural Classifier Xudong Jiang Centre for Signal Processing, School of Electrical and Electronic Engineering Nanyang Technological University Nanyang Avenue, Singapore

More information

Collaborative Web Caching Based on Proxy Affinities

Collaborative Web Caching Based on Proxy Affinities Collaborative Web Caching Based on Proxy Affinities Jiong Yang T J Watson Research Center IBM jiyang@usibco Wei Wang T J Watson Research Center IBM ww1@usibco Richard Muntz Coputer Science Departent UCLA

More information

A Low-Cost Multi-Failure Resilient Replication Scheme for High Data Availability in Cloud Storage

A Low-Cost Multi-Failure Resilient Replication Scheme for High Data Availability in Cloud Storage 216 IEEE 23rd International Conference on High Perforance Coputing A Low-Cost Multi-Failure Resilient Replication Schee for High Data Availability in Cloud Storage Jinwei Liu* and Haiying Shen *Departent

More information

Adaptive Holistic Scheduling for In-Network Sensor Query Processing

Adaptive Holistic Scheduling for In-Network Sensor Query Processing Adaptive Holistic Scheduling for In-Network Sensor Query Processing Hejun Wu and Qiong Luo Departent of Coputer Science and Engineering Hong Kong University of Science & Technology Clear Water Bay, Kowloon,

More information

Summary. Reconstruction of data from non-uniformly spaced samples

Summary. Reconstruction of data from non-uniformly spaced samples Is there always extra bandwidth in non-unifor spatial sapling? Ralf Ferber* and Massiiliano Vassallo, WesternGeco London Technology Center; Jon-Fredrik Hopperstad and Ali Özbek, Schluberger Cabridge Research

More information

A CRYPTANALYTIC ATTACK ON RC4 STREAM CIPHER

A CRYPTANALYTIC ATTACK ON RC4 STREAM CIPHER A CRYPTANALYTIC ATTACK ON RC4 STREAM CIPHER VIOLETA TOMAŠEVIĆ, SLOBODAN BOJANIĆ 2 and OCTAVIO NIETO-TALADRIZ 2 The Mihajlo Pupin Institute, Volgina 5, 000 Belgrade, SERBIA AND MONTENEGRO 2 Technical University

More information

Robust Relevance-Based Language Models

Robust Relevance-Based Language Models Robust Relevance-Based Language Models Xiaoyan Li Department of Computer Science, Mount Holyoke College 50 College Street, South Hadley, MA 01075, USA Email: xli@mtholyoke.edu ABSTRACT We propose a new

More information

PERFORMANCE MEASURES FOR INTERNET SERVER BY USING M/M/m QUEUEING MODEL

PERFORMANCE MEASURES FOR INTERNET SERVER BY USING M/M/m QUEUEING MODEL IJRET: International Journal of Research in Engineering and Technology ISSN: 239-63 PERFORMANCE MEASURES FOR INTERNET SERVER BY USING M/M/ QUEUEING MODEL Raghunath Y. T. N. V, A. S. Sravani 2 Assistant

More information

Design and Implementation of Business Logic Layer Object-Oriented Design versus Relational Design

Design and Implementation of Business Logic Layer Object-Oriented Design versus Relational Design Design and Ipleentation of Business Logic Layer Object-Oriented Design versus Relational Design Ali Alharthy Faculty of Engineering and IT University of Technology, Sydney Sydney, Australia Eail: Ali.a.alharthy@student.uts.edu.au

More information

Vision Based Mobile Robot Navigation System

Vision Based Mobile Robot Navigation System International Journal of Control Science and Engineering 2012, 2(4): 83-87 DOI: 10.5923/j.control.20120204.05 Vision Based Mobile Robot Navigation Syste M. Saifizi *, D. Hazry, Rudzuan M.Nor School of

More information

Massive amounts of high-dimensional data are pervasive in multiple domains,

Massive amounts of high-dimensional data are pervasive in multiple domains, Challenges of Feature Selection for Big Data Analytics Jundong Li and Huan Liu, Arizona State University Massive aounts of high-diensional data are pervasive in ultiple doains, ranging fro social edia,

More information