Entity Search Engine: Towards Agile Best-Effort Information Integration over the Web

Size: px
Start display at page:

Download "Entity Search Engine: Towards Agile Best-Effort Information Integration over the Web"

Transcription

1 Entity Search Engine: Towards Agile Best-Effort Inforation Integration over the Web Tao Cheng, Kevin Chen-Chuan Chang University of Illinois at Urbana-Chapaign {tcheng3, INTRODUCTION The iense scale and wide spread has rendered the Web as an ultiate inforation repository as not only the sources where we find but also the destinations where we publish inforation. The dual forces have enriched the Web with all kinds of data, uch beyond the conventional page view of the Web as a corpus of HTML pages, or docuents. The Web has thus becoe a rich collection of data-rich pages, on the surface Web of static URLs (e.g., personal or copany hoepages) as well as the deep Web of database-backed contents (e.g., flights fro aa.co), as Figure (a) shows. The richness of data, while a proising opportunity, has challenged us to effectively find data we need, fro one or ultiple sources. In particular, we are otivated by, when building the Meta- Querier at UIUC, the need of large scale on-the-fly integration for online structured data. The MetaQuerier syste, as we reported in CIDR 2005 [4], ais at finding and querying data sources on the deep Web. However, as the last ile for etaquerying, when users can query ultiple sources on the fly [8], or when data is autoatically crawled fro sources [6], how do we identify and integrate the structured data ebedded in unstructured result pages? (e.g., the query results of aazon.co and bn.co). Further, we realized, as observed earlier, beyond our Meta- Querier experience, such data richness is pervasive fro the depth as well as the surface of the Web. While data is proliferating, however, we are currently not able to effectively access such data. To otivate, consider a few scenarios, for user Ay: Scenario : To call Aazon.co for her online purchase, how can Ay find their phone nuber? To begin with, what should be the right keywords for finding pages with such nubers? A query like aazon custoer service phone ay not work; often a phone is siply shown without keyword phone (e.g., custoer service: ). On the other hand, aazon custoer service could be too broad. Then, for each querying, she ust sift through the returned pages to dig for the phone nubers. This overall process can be unnec- This article is published under a Creative Coons License Agreeent ( You ay copy, distribute, display, and perfor the work, ake derivative works and ake coercial use of the work, but you ust attribute the work to the author and CIDR Cars.co BN.co Figure : The data-rich Web. Aazon.co AA.co essarily tie consuing. Scenario 2: To apply for graduate schools, how can she find the list of professors in the database area? She has to go through all the CS departent pages or, even worse, check faculty s hoepages one by one a very laborious process. Scenario 3: As a graduate student, Ay needs to prepare a seinar presentation for her choice of recent papers. Can she find papers that coe readily with presentations, i.e., a PDF file along with a PPT file, say fro SIGMOD 2006? Scenario 4: Now Ay wants to buy a copy of Shakespeare s Halet to read; how can she find the prices and cover iages of available choices fro, say, Borders.co and BN.co. She would have to look at and copare the results fro ultiple online bookstores one by one. In these scenarios, like every user in any siilar situations, Ay is looking for particular types of data, which we call entities, e.g., a phone nuber, a book cover iage, a PDF, a PPT, a nae, a date, an eail address, etc. She is not, we stress, looking for pages as relevant docuents to read, but entities as data for her subsequent tasks (to contact, to apply, to present, to buy, etc.). We are facing a dilea: On one hand, for accessing over the sheer size of the Web, we are ostly relying on search engines, such as Google, Yahoo, or MSN, which search for pages by keywords. On this extree, while being IR-style with a scalable text processing fraework, they are not data aware. On the other hand, integration services, such as Expedia.co or PriceGrabber.co, exist online for specific doains. On this extree, while providing DB-style precise querying, they can hardly scale the aount of data and the nuber of sources on the Web. As our solution, which this paper will propose, we believe the two extrees ust eet, with a synergistic arriage in

2 the iddle. Fro the search perspective: Can we build a search engine with data awareness? Fro the integration perspective: Can we develop an integration syste with scale awareness, even with liited best-effort seantics? As we will discuss in Section 3, the two perspectives together lead to our objective of entity search As a search syste, we propose our EntitySearch engine to search directly for target inforation entities, and holistically across their occurrences in all pages. As an integration syste, it will assue only a lightweight off-the-shelf entity extraction layer, and thus scalable to any sources; further, it will construct derived relations as the search output, on the fly, and thus flexible for ad-hoc user needs. In suary, this paper proposes the concept of entity search and its initial ipleentation as an agile best-effort fraework for realizing large scale inforation integration over the Web. We will otivate and define entity search in Section 2 and Section 3 respectively. Section 4 will then discuss our initial pilot ipleentation for exploring research issues. Section 5 will deonstrate the usefulness of entity search using real world scenarios. We will conclude in Section THE DILEMMA To access the Web, we ust resort to search engines or integration services. The state of the art, however, presents a clear dilea between data and scale, and we are thus lacking large scale data-based access to the Web. 2. Search Engines: Data Awareness? On one hand, our inforation access is ostly relying on current search systes, such as popular engines like Google, Yahoo, and MSN Search However, while reaching extensive parts of the Web, with their inherent page view of the Web, they are lacking even inial data awareness and thus incapable for such tasks as finding data entities. That is, current search systes are not data aware. In their rather siplistic page view, with a text retrieval syste as their backend engine, they naturally view the Web as siply a repository of interlinked docuents, or pages, each containing soe keywords upon which searches can be perfored. Our inforation access aounts to finding relevant pages by keywords regardless of what data entities we are looking for. In other words, fro this search perspective, we are lacking a search syste that is aware of data. Specifically, this lacking leads to two ajor liitations of current search systes: Indirect Input and Output. In ters of the input and output, current engines are searching indirectly. To begin with, to forulate queries as input, users cannot directly describe what they want. Ay has to forulate her needs indirectly as keyword queries, often in a non-trivial and non-intuitive way, with a hope to hit relevant pages that ay or ay not contain target entities. For Scenario, will aazon custoer service phone work? Then, as query output, users cannot directly get what they want. The engine will only take Ay, indirectly, to a list of pages, and she ust anually sift through the to find the phone nuber. Can we help users to search directly in both describing and getting what they want? Singular Matching Mechanis. In ters of the atching echanis, current search engines are finding each page singularly. Our target entities often coe fro ultiple pages. In Scenario, the sae phone nuber of Aazon ay appear in the copany s Web site, online user forus, or even blogs. In this case, we should collect, for each phone, all its occurrences fro ultiple pages as supporting evidences of atching. In Scenario 2, the list of professors probably cannot be found in any single page. In this case, again, we ust look at any pages to coe up with the list of proising naes (and each nae ay appear in ultiple pages, like the earlier case). Can we help users to search holistically for atching entities across the Web corpus as a whole, instead of each individual page? Thus, fro the search perspective, we are facing the challenge of building data awareness into Web search systes. As data proliferates on the Web, the classic view of the Web as a page repository is increasingly inadequate. A step forward, can we search for data, or specific entities, as our target directly and across all pages holistically? That is, while keyword-based search is scalable to the Web and easy to use, can it be orphed to be data aware? 2.2 Integration Systes: Scale Awareness? On the other hand, there already eerged any integration systes, such as the various specialized engines in coparison shopping (e.g., PriceGrabber.co, NextTag.co and BizRate.co) and vertical search services (e.g., Realtor.co for real estates, Expedia.co for airfares, and Google Base for various doains) However, while offering precise data-based access, with their inherent database view, they are not not designed with scale in ind, for the large aount of data-rich pages on the Web and the large diversity of user needs. That is, current integration systes are not scale aware. In their rather rigid database view, they naturally assue and can only query data prepared in a structured relational forat in certain prescribed scheas upon which SQL queries can be perfored. Our inforation access is thus significantly liited by the availability of data in such rigid scheas, as well as the types of queries that the prescribed schea can support. In other words, fro this integration perspective, we are lacking an integration syste that is aware of scale the proliferation of Web data and online users. In particular, because their database view requires data in well-structured forat, an integration syste ust build wrapper for each source, which works as a relation extractor for each site, to precisely extract its unstructured text into structured DBMS, upon which to perfor SQL queries. We believe that, while achieving structural rigor, this technique is not viable, with two liitations: Liited Sources. In ters of sources, it lacks scalability. As repeatedly reported [7, 5], per-source wrappers are not only laborious to build but also fragile to aintain, and thus a cost barrier. To incorporate yriad potential sources (and pages), we shall not rely on per-source training or construction. Liited Queries. In ters of queries, it lacks flexibility: A wrapper ust ake a hard decision on the scheas of the relations to extract e.g., for book, to extract a relation with forat, publisher, price, cover iage, or all? The ore coplex a schea is, the ore likely its extraction ay fail, while a sipler schee ay be useless for any users. Thus, fro the integration perspective, we are facing the challenge of building scale awareness into Web integration This is perhaps why today s Web integration systes work by assuing a sall set of pre-configured sources, by user subission (e.g., Google Base), or by relying on a centralized database, e.g., Sabre [3] for airfares and MLS [2] for real estate

3 rank rank 2 phone nuber PDF sigod6.pdf surajit2.pdf score PPT urls aazon.co/support.ht yblog.org/shopping Dell.co/supportors xyz.co hp.co sigod6.ppt surajit2.ppt score urls db.co,sigod.co s.co Figure 2: Query Results of Q and Q3. systes. While the prevalence of data on the Web presents novel opportunities for integration, this Web-scale scenario contrasts traditional settings and defeats current techniques We ust rethink not only new techniques but also realistic objectives. To bring yriads heterogeneous sources to eet ad-hoc users, as the scalability and flexibility andate, can we develop agile integration without rigid scheas, even with best effort seantics? 3. OUR PROPOSAL: ENTITY SEARCH Approaching the current barriers fro the dual perspectives, we are inspired that they see to converge to the sae solution: The dual perspectives represent the two extrees of the spectru: siple keyword search to fully transparent inforation integration. Meeting in the iddle, they suggest the arriage of scale-aware search and data-aware integration. Towards searching directly and holistically, for finding specific types of inforation, we thus propose to support entity search. First, as input, users forulate queries to directly describe what types of data they are looking for: She can siply specify what her target entities are, and what keywords ay appear in the context with a right answer. To distinguish between entities to look for and keywords in the context, we use a prefix #, e.g., #phone for the phone entity. Our scenarios will naturally lead to the following queries: Q: (aazon custoer service #phone) Q2: (#professor #university #research= database ) Q3: ow(sigod 2006 #pdf file #ppt file) Q4: (#title= halet #iage #price) In the queries, there are two coponents: () Context pattern, how will the target entities appear? Q says that #phone will appear with these keywords in the default pattern of near (or proxiity). We ay also explicitly specify the pattern, e.g., Q3 uses ow to ean order-window (the keywords ust appear before #pdf file and then #ppt file. (2) Content restriction: A target entity will atch any instances of that entity type, subject to optional restriction on their content values e.g., Q will atch every phone instance, while Q 2 will only atch research area database. (In addition to equality =, other restriction operators are possible, such as contain. ) Second, as output, users will get directly the entities that they are looking for. That is, as a query specifies what entity types are the targets, its results are those entity instances (or literal values) that atch the query, in a ranked order by their atching scores. (We will discuss this atching next.) Figure 2 shows soe exaple results for Q and Q3. Third, as search echanis, entity search will find atching entities holistically, where an instance will be found and atched in all the pages where it occurs; e.g., a #phone ay occur at ultiple URLs as Figure 2 shows. For each instance, all its atching occurrences will be aggregated to for the final ranking; e.g., a phone nuber that ore frequently occurs at where aazon custoer service is entioned ay rank higher than others. Thus, entity search will not only find entities as the priary result, but also return pages where each entity is found as the secondary result and supporting evidences. To support entity-based querying, the syste ust be fundaentally entity aware: As a departure fro current search engines built around the notions of pages and keywords, we ust generalize the to support entity as a first-class concept. With this awareness, as our data odel, we will ove fro the current page view, i.e., the Web as a docuent collection, to the new entity view, i.e., the Web as an entity repository. Upon this foundation, as our query echanis, we can then develop entity search, where users specify what they are looking for with keywords and target entities, as Q Q4 illustrated. We now foralize these notions. Data Model: Entity View. How should we view the Web as our database to search over? In the standard page view, the Web is a set of docuents (or pages) D = {d, d 2,..., d n}. Our data odel takes an entity view: We consider the Web as priarily a repository of entities: E = {E, E 2,..., E n}, where each E i is an entity type. For instance, to support Scenario (Section ), the syste ight be constructed with entities E = {E : #phone, E 2 : #eail}. Further, each entity type E i is a set of entity instances that are extracted fro the corpus, i.e., literal values of entity type E i that occurs soewhere in soe d D. We use e i to denote an entity instance of entity type E i. In our exaple, e.g., by recognizing phonenuber patterns (say, in a regular expression of digits) fro D, we ay extract #phone = { , , (27) ,...} As the starting point for data-aware search, this tagging, or entity extraction, is to recognize each entity fro Web pages. Entity extraction has been well studied in the context of general inforation extraction, and techniques abound, fro siple syntactic pattern atching to sophisticated statistical taggers, with any off-the-shelf tools available. We stress that, however, while ipleentations ay differ, such extractors are inherently iperfect, and entity search ust essentially deal with uncertainty. Used as a black box, in abstraction, an entity extractor will return, for each occurrence, the extraction confidence probability and the position of extraction. We thus transfor the page view into our entity view, E = {E, E 2,..., E n}. The Entity Search Proble: Given: An entity collection E = {E, E 2,..., E N } Input: Query: β-α(k, K 2,..., K l, E, E 2,..., E ) Output: t = e, e 2,..., e : sorted by score(t), the tuple score of t with respect to β and α. As input, as Section 3 discussed, an entity-search query is siilar to standard keyword queries, but now users can specify entity types, E, E 2,..., and E, in addition to keywords K,..., K l. Optionally, in a coplete for, a query can also specify a atching pattern α (e.g., ow in Q3), to restrict when an occurrence of an instance t = e, e 2,..., e is considered a atching tuple, and a scoring easure β, to specify how all

4 the atching instances are ranked. Depending on the ipleentation and application settings, the choices of the β pattern and the α can be syste built-in or user specified, and they together deterine the ranking scores. As output, for a query with target entities, each atching result is a -ary entity tuple, i.e., t = e, e 2,..., e, i.e., a cobined instance. Note that an entity tuple ay contain one or ultiple instances, each of which is associated with one entity type. For exaple, David, david@hotail.co, is an entity tuple of type #nae, #eail, #phone, and Canada, Ottawa of type #country, #capital city. The objective is to find, fro the space of E... E, the atching tuples in ranked order by how well they atch the query, i.e., how well entity tuple t and the specified keywords associate, as atched by pattern α and ranked by easure β, which result in the tuple score score(t). To suarize, and to put our proposal in perspectives, we exaine it for both the search and integration point of views: First, as a search fraework, entity search is data aware. With the probabilistic tagging of entities, we view the Web as a repository of entities. Users directly specify their target entities, and the syste holistically evaluates the atchings. Second, as an integration fraework, entity search is scale aware, towards agile requireent and best-effort seantics for large scale deployent: On one hand, as deployent requireent, it is agile, requiring only inial tagging of the data doain of interest It can be uniforly applied to any sources, without requiring building wrappers. We stress that each entity is independently extracted in a soft probabilistic sense, and thus requires no rigid full relation extraction that often andates per-source wrappers. It can adapt to diverse inforation needs, without fixed pre-scribed scheas. The desired entities are only associated by ad-hoc queries at query tie Thus, users ay ask #phone with ib thinkpad or bill gates, or they ask to pair #phone with, say, #eail for white house. Supporting such online atching and association is exactly the challenge (and usefulness) of entity search. On the other hand, as query seantics, it is only best effort, returning the result in ranked, best-first anner. In essence, the syste assues integration in a probabilistic sense : Given independent entities with probabilities, entity search finds atching instances that ost likely for a desired tuple. To understand the proise of the syste and the practical issues, we decided to start with a quick Version 0. pilot ipleentation, which is the focus of our deonstration. 4. THE PILOT SYSTEM To realize our proposal in Section 3, we ipleented a pilot syste for epirical study. First, as we abstracted in the Entity Search proble, we need to coe up with specific atching patterns α as well as scoring easure β. Figures 3 and 4 suarizes our currently ipleented α and β. To specify a tuple function operator, the user or the application will choose the exact easures to use. The α Measure: An α easure qualifies if atch of occurrences of entities atch the desired tuple, in the atching pattern α(x). In principle, any Web presentation features can be incorporated for pattern expression, e.g., linguistic, proxiity, or visual features. In this work, we treat webpages as linear docuents. Consequently, we ipleented several siple α(x) qualification rule phrase(x) x atches a phrase as sepcified uwn(x) all ters in x occur in window of size n, any order own(x) all ters in x occur in window of size n, in the order nnuwn(x) uwn(x), and the ters are nearest neighbors: i.e., the su of the distances fro the left ost ter to all others is the sallest, aong other choices. nnown(x) own(x), and the ters are nearest neighbors as above tf dtf i Figure 3: Pattern easures. f ( e,..., e ) E tscore f ( e,..., e ), where with D as corpus size: windowsize f ( e l windowsize f k i) ( j ) E = f ( e ) i= 2 j= D D cprod f ( e,..., e) = [ e~,..., e ~ ] α ( x) f ( e,..., e ) log f ( e ) i = i forula [ ~..., ]) [ ~, where (, ] ~ ( ~ D e e = D e,..., e ) ( ~ ~ 2 e. pos e. pos) [ ~ e,..., ~ ] ( ) e x α i= conf ( ei) Figure 4: Scoring easures. position-based, page-bounded α easures as Figure 3 suarizes. The β Measures: Once instances of entities are atched, they for tuples. A β easure quantifies how proising a tuple e,...,e is. A tuple e,...,e ay appear in the corpus any ties let [ẽ,..., ẽ ] be one such occurrence, i.e., [ẽ,..., ẽ ] α(x). We note that a scoring function β for deterining the tuple score, i.e., β(e,...,e ), can build upon the following quantities:. Frequency of the tuple: How any ties has e,...,e occurred? That is, how any [ẽ,..., ẽ ] are atched? 2. Strength of each occurrence: How well does each [ẽ,..., ẽ ] atch the pattern? 3. Frequency of individual entity instance: How any ties has e i appeared in the corpus? A tuple ay be frequent siply because its entity instances are coon. 4. Uncertainty of entity instance: What is the probability of instance e i being of the entity type E i. Our initial ipleentation thus tried, for experientation, to use all four in various ways, as Figure 4 suarizes a few representative ones: tf : tuple frequency (using ); dtf : distance weighted tuple frequency (using, 3); i: utual inforation (using, 2); t-score(using, 3); conf (using 4). While these easures are validated in our experients to be useful, we believe a ore systeatic study of the scoring easure is still needed. Coing up with ore effective scoring easures is itself an interesting research proble. We plan to work towards this direction in the future. Next, we describe the key syste coponents to accoplish the goal of entity search, towards agile best-effort inforation integration. The overall syste architecture is illustrated in Figure 5. We now zoo into each specific syste coponent. Data Collector: Our data webpages could coe fro both the surface Web and the deep Web. To get webpages fro the surface Web, our Data Collector works like a crawler, getting webpages related with a specific topic/application. Our current ipleentation obtains webpages fro the Stanford Web- Base Project 2. To get webpages fro the deep Web sources, 2 testbed/doc2/webbase/ i i+

5 CEO Apple Crop Users, Applications EntitySearch query Query Engine Results, scores Relational Operations Constructor Price 7.99$ 6.99$ 5.00$ 7.35$ Copany Microsoft Apple Intel Aazon Indices Extraction Storage Figure 6: Syste Interface. Annotated Webpages Entity Extractor Entity Models Webpages Data Collector Figure 5: Syste Architecture our Data Collector works by collecting result pages of specific queries on deep Web sources, e.g., houses available in a certain zipcode area, books written by soe specific authors, etc. Entity Extractor: In order to discover seantics of the doain, our Entity Extractor extracts doain entities, the key coponents of the doain, independently. Each entities is extracted using an entity odel, which describes how to identify instances of an entity in the webpages. Query Engine This coponent essentially supports online querying by perforing pattern atching and tuple scoring, as we abstracted in Section 3 and described in the beginning of this Section. As we produce tuples in the results, naturally any relational operations could be perfored, which corresponds to the Relational Operations sub coponent in Figure 5. Our Query Engine is orphed using the Leur Toolkit (version 2.2), an inforation retrieval engine []. This IR engine facilitates easy storage of the extracted entities in the for of ordered lists based on docuent ID, uch like storing inverted indices for keywords. It also enables the construction in a natural way as sort-erge-join based on docuent ID is one of the ost coon operations in answering IR queries. 5. DEMONSTRATION In this deonstration, we specifically show the online coponent of our syste, which is our query engine. The current sever is running on a Pentiu-4 2.6GHz PC with GB eory. This section will first discuss the interface of our query engine. Then two deonstration scenarios are presented to show the effectiveness of our syste. Finally, the deonstration plan is described. 5. Syste Interface We show our query interface for the query engine of Entity- Search in Figure 6. The first three input boxes in Figure 6 directly refers to the operators explained in Section 4. Matching Pattern specifies a pattern α(x), in which x is a list of ters to be joined, as either entities E i or literal keywords K j, and α is a pattern easure, shown in Figure 3, specifying how the ters are connected into a pattern. Figure 7: Saple Output of Query C. Scoring Measure specifies which β scoring easure, aong the ones we support in Figure 4, will be used to score the atched tuples. Entity Filter enables specifying conditions on each entity. In principle, any filter operation could be applied to an entity. Our syste currently supports two operators: equalto (string value of the instance atches exactly with the specified keywords) and contains (string value of the instance contains the specified keywords). In addition, We provide three auxiliary features to iprove the effectiveness of the engine. These features could be regarded as a light-weight Relational Operation layer, which perfors inial but useful relational operations upon the output relation. We discuss these three features one by one. Corpus Restriction: The Corpus Restriction operator is designed to facilitate application s control over the corpus. We support Corpus Restriction by URL doains as regular patterns. For instance, we could set this field to *.azon.co, which atches the sub-corpus of pages at the doains with that pattern (e.g., book.azon.co). Pages Per Answer: This feature requests the nuber of Web pages to return as support pages or evidences for each tuple returned by the query engine because results are integrated fro ultiple Web pages across the corpus. Order By specifies the order of listing results (e.g., by #research alphabetically) uch like the sae clause in SQL. By default it will rank the results according to their ranking score calculated by the β Measure. Application issues queries by filling in the interface (Matching Pattern, Scoring Measure and Entity Filter), as well as the other three auxiliary operators and then clicking the Subit Query button. Figure 7 shows a saple syste output. As you can see fro the figure, it is in the for of a relation. Each tuple has a score associated with it. The URL below each tuple shows the supporting pages where this tuple is found. 5.2 Deonstration Scenarios In this subsection, we use two concrete scenarios on real data to deonstrate the practical usage of our syste. The two scenarios we selected are very different in nature. The first scenario, regarding education doain regarding idwest CS departents, focuses on the surface Web whose pages are

6 ostly unstructured. In contrast, the second scenario, using the book doain, focuses on the deep Web. Result pages collected fro the deep Web tend to be ore structured. We show that our syste works well in both scenarios. Scenario : Midwest Coputer Science Doain Currently, we have collected pages regarding CS departent in six idwest universities (IIT, Illinois, Indiana, Michigan, Purdue and Wisconsin) fro the WebBase project. The Doain Extractor extracts the following attributes: professor, research, university, eail and phone. Three exaple queries: C: eails of professors across universities C2: research areas of professors across universities C3: professors conducting DB related research across universities Above we have shown three exaple queries that could be asked in this doain. Detailed query operators for each query is excluded. In Query C, the application is interested in integrating all the eails of professors across universities into one table. Figure 7 shows a snippet of the result, which is good. We anually checked the result returned by our syste for Query C, over 90% of professor s eails are included in the output relation. And if for each unique professor, we only reain the tuple with the highest score in the relation, we can achieve precision over 85%. Please refer to the results in our online deo for ore details. Siilar results are observed fro other exaple queries, such as Query C2 and C3. Scenario 2: Shakespeare s Book Doain This scenario focuses on the deep Web. We issued a query by filling author attribute with Shakespeare to the three ajor online book stores: We then anually collect webpages containing the top 00 results fro each site. The Doain Extractor extracts the following attributes: title, author, iage, price, date. Three exaple queries: B: title and price of books B2: iages of books with title containing Halet B3: title of books that are in stock Above we have shown three exaple queries that could be asked in this doain, excluding detailed query operators. Query B is a very coon integration scenario where the application wants to find all the books and their prices across ultiple websites. More interestingly, Query B2 asks for iages of titles with keyword Halet in it. As you can see fro the top results shown in Figure 8, the iages indeed are all cover iages for books with keyword Halet in their titles. Query B3 could be issued using the title attribute together with keywords in stock as application finds vendors norally put keywords in stock to indicate the books currently available for purchase. Siilarly, we can query for books that are out of stock, on order, etc. Due to the lack of space, we are only able to briefly show very liited exaple queries and results in our two scenarios. To experience ore about the above two scenarios and have a real feeling of the results, we invite users to the following online deo site Figure 8: Saple Output of Query B Deo Plan Users can follow the link fro our deo site to experience our two deo scenarios. For each scenario, we have listed a set of exaple queries, including all the exaple queries shown in Section 5.2. By clicking on an exaple query, the query operators of the chosen query will be autoatically filled into the input boxes in our query interface. Alternatively, users are welcoe to odify the query operators of the exaple queries or coe up with their own queries. Query results are norally returned within seconds. 6. CONCLUSION In this paper, we proposed the concept of Entity Search for supporting agile best-effort inforation integration over the Web. While we observe entity search, as a id-way arriage of search and integration, is eaningful and useful to access the data-rich Web, there are several open research issues towards its full realization: First, as the core, how to rank each tuple, so that the best atching surfaces to the top? Second, to provide efficient online search, can we generalize the current page-based search engine into an entity search engine? Finally, as entity search returns a ranked list of tuples (e.g., Figure 2), how can we integrate the derived relations with relational SQL-based querying e.g., joining #copany, #phone with #copany, #eail? Or, filtering only those copanies with.co? We are continuing to investigate these challenging and interesting probles as our future research agenda. 7. REFERENCES [] Leur toolkit for language odeling and inforation retrieval. leur. [2] Multiple listing service. Listing Service. [3] Sabre holdings corporation. [4] K. C.-C. Chang, B. He, and Z. Zhang. Toward large scale integration: Building a etaquerier over databases on the web. In Proceedings CIDR [5] W. W. Cohen. Soe practical observations on integration of web inforation. In WebDB (Inforal Proceedings), pages 55 60, 999. [6] S. Raghavan and H. Garcia-Molina. Crawling the hidden web. In VLDB, pages 29 38, 200. [7] L. J. Seligan, A. Rosenthal, P. E. Lehner, and A. Sith. Data integration: Where does the tie go? IEEE Data Eng. Bull., 25(3):3 0, [8] Z. Zhang, B. He, and K. C.-C. Chang. Light-weight doain-based for assistant: Querying web databases on the fly. In Proceedings of VLDB 2005.

1 Extended Boolean Model

1 Extended Boolean Model 1 EXTENDED BOOLEAN MODEL It has been well-known that the Boolean odel is too inflexible, requiring skilful use of Boolean operators to obtain good results. On the other hand, the vector space odel is flexible

More information

Mapping Data in Peer-to-Peer Systems: Semantics and Algorithmic Issues

Mapping Data in Peer-to-Peer Systems: Semantics and Algorithmic Issues Mapping Data in Peer-to-Peer Systes: Seantics and Algorithic Issues Anastasios Keentsietsidis Marcelo Arenas Renée J. Miller Departent of Coputer Science University of Toronto {tasos,arenas,iller}@cs.toronto.edu

More information

Leveraging Relevance Cues for Improved Spoken Document Retrieval

Leveraging Relevance Cues for Improved Spoken Document Retrieval Leveraging Relevance Cues for Iproved Spoken Docuent Retrieval Pei-Ning Chen 1, Kuan-Yu Chen 2 and Berlin Chen 1 National Taiwan Noral University, Taiwan 1 Institute of Inforation Science, Acadeia Sinica,

More information

Defining and Surveying Wireless Link Virtualization and Wireless Network Virtualization

Defining and Surveying Wireless Link Virtualization and Wireless Network Virtualization 1 Defining and Surveying Wireless Link Virtualization and Wireless Network Virtualization Jonathan van de Belt, Haed Ahadi, and Linda E. Doyle The Centre for Future Networks and Counications - CONNECT,

More information

Computer Aided Drafting, Design and Manufacturing Volume 26, Number 2, June 2016, Page 13

Computer Aided Drafting, Design and Manufacturing Volume 26, Number 2, June 2016, Page 13 Coputer Aided Drafting, Design and Manufacturing Volue 26, uber 2, June 2016, Page 13 CADDM 3D reconstruction of coplex curved objects fro line drawings Sun Yanling, Dong Lijun Institute of Mechanical

More information

Design and Implementation of Business Logic Layer Object-Oriented Design versus Relational Design

Design and Implementation of Business Logic Layer Object-Oriented Design versus Relational Design Design and Ipleentation of Business Logic Layer Object-Oriented Design versus Relational Design Ali Alharthy Faculty of Engineering and IT University of Technology, Sydney Sydney, Australia Eail: Ali.a.alharthy@student.uts.edu.au

More information

COLOR HISTOGRAM AND DISCRETE COSINE TRANSFORM FOR COLOR IMAGE RETRIEVAL

COLOR HISTOGRAM AND DISCRETE COSINE TRANSFORM FOR COLOR IMAGE RETRIEVAL COLOR HISTOGRAM AND DISCRETE COSINE TRANSFORM FOR COLOR IMAGE RETRIEVAL 1 Te-Wei Chiang ( 蔣德威 ), 2 Tienwei Tsai ( 蔡殿偉 ), 3 Jeng-Ping Lin ( 林正平 ) 1 Dept. of Accounting Inforation Systes, Chilee Institute

More information

Design Optimization of Mixed Time/Event-Triggered Distributed Embedded Systems

Design Optimization of Mixed Time/Event-Triggered Distributed Embedded Systems Design Optiization of Mixed Tie/Event-Triggered Distributed Ebedded Systes Traian Pop, Petru Eles, Zebo Peng Dept. of Coputer and Inforation Science, Linköping University {trapo, petel, zebpe}@ida.liu.se

More information

Collection Selection Based on Historical Performance for Efficient Processing

Collection Selection Based on Historical Performance for Efficient Processing Collection Selection Based on Historical Perforance for Efficient Processing Christopher T. Fallen and Gregory B. Newby Arctic Region Supercoputing Center University of Alaska Fairbanks Fairbanks, Alaska

More information

Energy-Efficient Disk Replacement and File Placement Techniques for Mobile Systems with Hard Disks

Energy-Efficient Disk Replacement and File Placement Techniques for Mobile Systems with Hard Disks Energy-Efficient Disk Replaceent and File Placeent Techniques for Mobile Systes with Hard Disks Young-Jin Ki School of Coputer Science & Engineering Seoul National University Seoul 151-742, KOREA youngjk@davinci.snu.ac.kr

More information

Verifying the structure and behavior in UML/OCL models using satisfiability solvers

Verifying the structure and behavior in UML/OCL models using satisfiability solvers IET Cyber-Physical Systes: Theory & Applications Review Article Verifying the structure and behavior in UML/OCL odels using satisfiability solvers ISSN 2398-3396 Received on 20th October 2016 Revised on

More information

Feature Selection to Relate Words and Images

Feature Selection to Relate Words and Images The Open Inforation Systes Journal, 2009, 3, 9-13 9 Feature Selection to Relate Words and Iages Wei-Chao Lin 1 and Chih-Fong Tsai*,2 Open Access 1 Departent of Coputing, Engineering and Technology, University

More information

CS 361 Meeting 8 9/24/18

CS 361 Meeting 8 9/24/18 CS 36 Meeting 8 9/4/8 Announceents. Hoework 3 due Friday. Review. The closure properties of regular languages provide a way to describe regular languages by building the out of sipler regular languages

More information

TensorFlow and Keras-based Convolutional Neural Network in CAT Image Recognition Ang LI 1,*, Yi-xiang LI 2 and Xue-hui LI 3

TensorFlow and Keras-based Convolutional Neural Network in CAT Image Recognition Ang LI 1,*, Yi-xiang LI 2 and Xue-hui LI 3 2017 2nd International Conference on Coputational Modeling, Siulation and Applied Matheatics (CMSAM 2017) ISBN: 978-1-60595-499-8 TensorFlow and Keras-based Convolutional Neural Network in CAT Iage Recognition

More information

QUERY ROUTING OPTIMIZATION IN SENSOR COMMUNICATION NETWORKS

QUERY ROUTING OPTIMIZATION IN SENSOR COMMUNICATION NETWORKS QUERY ROUTING OPTIMIZATION IN SENSOR COMMUNICATION NETWORKS Guofei Jiang and George Cybenko Institute for Security Technology Studies and Thayer School of Engineering Dartouth College, Hanover NH 03755

More information

EE 364B Convex Optimization An ADMM Solution to the Sparse Coding Problem. Sonia Bhaskar, Will Zou Final Project Spring 2011

EE 364B Convex Optimization An ADMM Solution to the Sparse Coding Problem. Sonia Bhaskar, Will Zou Final Project Spring 2011 EE 364B Convex Optiization An ADMM Solution to the Sparse Coding Proble Sonia Bhaskar, Will Zou Final Project Spring 20 I. INTRODUCTION For our project, we apply the ethod of the alternating direction

More information

Boosted Detection of Objects and Attributes

Boosted Detection of Objects and Attributes L M M Boosted Detection of Objects and Attributes Abstract We present a new fraework for detection of object and attributes in iages based on boosted cobination of priitive classifiers. The fraework directly

More information

Carving Differential Unit Test Cases from System Test Cases

Carving Differential Unit Test Cases from System Test Cases Carving Differential Unit Test Cases fro Syste Test Cases Sebastian Elbau, Hui Nee Chin, Matthew B. Dwyer, Jonathan Dokulil Departent of Coputer Science and Engineering University of Nebraska - Lincoln

More information

Identifying Converging Pairs of Nodes on a Budget

Identifying Converging Pairs of Nodes on a Budget Identifying Converging Pairs of Nodes on a Budget Konstantina Lazaridou Departent of Inforatics Aristotle University, Thessaloniki, Greece konlaznik@csd.auth.gr Evaggelia Pitoura Coputer Science and Engineering

More information

Privacy-preserving String-Matching With PRAM Algorithms

Privacy-preserving String-Matching With PRAM Algorithms Privacy-preserving String-Matching With PRAM Algoriths Report in MTAT.07.022 Research Seinar in Cryptography, Fall 2014 Author: Sander Sii Supervisor: Peeter Laud Deceber 14, 2014 Abstract In this report,

More information

Collaborative Web Caching Based on Proxy Affinities

Collaborative Web Caching Based on Proxy Affinities Collaborative Web Caching Based on Proxy Affinities Jiong Yang T J Watson Research Center IBM jiyang@usibco Wei Wang T J Watson Research Center IBM ww1@usibco Richard Muntz Coputer Science Departent UCLA

More information

Development of an Integrated Cost Estimation and Cost Control System for Construction Projects

Development of an Integrated Cost Estimation and Cost Control System for Construction Projects ABSTRACT Developent of an Integrated Estiation and Control Syste for Construction s by Salan Azhar, Syed M. Ahed and Aaury A. Caballero Florida International University 0555 W. Flagler Street, Miai, Florida

More information

The Boundary Between Privacy and Utility in Data Publishing

The Boundary Between Privacy and Utility in Data Publishing The Boundary Between Privacy and Utility in Data Publishing Vibhor Rastogi Dan Suciu Sungho Hong ABSTRACT We consider the privacy proble in data publishing: given a database instance containing sensitive

More information

Improve Peer Cooperation using Social Networks

Improve Peer Cooperation using Social Networks Iprove Peer Cooperation using Social Networks Victor Ponce, Jie Wu, and Xiuqi Li Departent of Coputer Science and Engineering Florida Atlantic University Boca Raton, FL 33431 Noveber 5, 2007 Corresponding

More information

Geo-activity Recommendations by using Improved Feature Combination

Geo-activity Recommendations by using Improved Feature Combination Geo-activity Recoendations by using Iproved Feature Cobination Masoud Sattari Middle East Technical University Ankara, Turkey e76326@ceng.etu.edu.tr Murat Manguoglu Middle East Technical University Ankara,

More information

Structuring Business Metadata in Data Warehouse Systems for Effective Business Support

Structuring Business Metadata in Data Warehouse Systems for Effective Business Support Structuring Business Metadata in Data Warehouse Systes for Effective Business Support arxiv:cs/0110020v1 [cs.db] 8 Oct 2001 N.L. Sarda Departent of Coputer Science and Engineering Indian Institute of Technology

More information

Analysing Real-Time Communications: Controller Area Network (CAN) *

Analysing Real-Time Communications: Controller Area Network (CAN) * Analysing Real-Tie Counications: Controller Area Network (CAN) * Abstract The increasing use of counication networks in tie critical applications presents engineers with fundaental probles with the deterination

More information

An Architecture for a Distributed Deductive Database System

An Architecture for a Distributed Deductive Database System IEEE TENCON '93 / B eih An Architecture for a Distributed Deductive Database Syste M. K. Mohania N. L. Sarda bept. of Coputer Science and Engineering, Indian Institute of Technology, Bobay 400 076, INDIA

More information

Designing High Performance Web-Based Computing Services to Promote Telemedicine Database Management System

Designing High Performance Web-Based Computing Services to Promote Telemedicine Database Management System Designing High Perforance Web-Based Coputing Services to Proote Teleedicine Database Manageent Syste Isail Hababeh 1, Issa Khalil 2, and Abdallah Khreishah 3 1: Coputer Engineering & Inforation Technology,

More information

Clustering. Cluster Analysis of Microarray Data. Microarray Data for Clustering. Data for Clustering

Clustering. Cluster Analysis of Microarray Data. Microarray Data for Clustering. Data for Clustering Clustering Cluster Analysis of Microarray Data 4/3/009 Copyright 009 Dan Nettleton Group obects that are siilar to one another together in a cluster. Separate obects that are dissiilar fro each other into

More information

A Trajectory Splitting Model for Efficient Spatio-Temporal Indexing

A Trajectory Splitting Model for Efficient Spatio-Temporal Indexing A Trajectory Splitting Model for Efficient Spatio-Teporal Indexing Slobodan Rasetic Jörg Sander Jaes Elding Mario A. Nasciento Departent of Coputing Science University of Alberta Edonton, Alberta, Canada

More information

POSITION-PATCH BASED FACE HALLUCINATION VIA LOCALITY-CONSTRAINED REPRESENTATION. Junjun Jiang, Ruimin Hu, Zhen Han, Tao Lu, and Kebin Huang

POSITION-PATCH BASED FACE HALLUCINATION VIA LOCALITY-CONSTRAINED REPRESENTATION. Junjun Jiang, Ruimin Hu, Zhen Han, Tao Lu, and Kebin Huang IEEE International Conference on ultiedia and Expo POSITION-PATCH BASED FACE HALLUCINATION VIA LOCALITY-CONSTRAINED REPRESENTATION Junjun Jiang, Ruiin Hu, Zhen Han, Tao Lu, and Kebin Huang National Engineering

More information

Fair Energy-Efficient Sensing Task Allocation in Participatory Sensing with Smartphones

Fair Energy-Efficient Sensing Task Allocation in Participatory Sensing with Smartphones Advance Access publication on 24 February 2017 The British Coputer Society 2017. All rights reserved. For Perissions, please eail: journals.perissions@oup.co doi:10.1093/cojnl/bxx015 Fair Energy-Efficient

More information

On computing global similarity in images

On computing global similarity in images On coputing global siilarity in iages S Ravela and R Manatha Coputer Science Departent University of Massachusetts, Aherst, MA 01003 Eail: ravela,anatha @csuassedu Abstract The retrieval of iages based

More information

Planet Scale Software Updates

Planet Scale Software Updates Planet Scale Software Updates Christos Gkantsidis 1, Thoas Karagiannis 2, Pablo Rodriguez 1, and Milan Vojnović 1 1 Microsoft Research 2 UC Riverside Cabridge, UK Riverside, CA, USA {chrisgk,pablo,ilanv}@icrosoft.co

More information

An Efficient Approach for Content Delivery in Overlay Networks

An Efficient Approach for Content Delivery in Overlay Networks An Efficient Approach for Content Delivery in Overlay Networks Mohaad Malli, Chadi Barakat, Walid Dabbous Projet Planète, INRIA-Sophia Antipolis, France E-ail:{alli, cbarakat, dabbous}@sophia.inria.fr

More information

Adaptive Holistic Scheduling for In-Network Sensor Query Processing

Adaptive Holistic Scheduling for In-Network Sensor Query Processing Adaptive Holistic Scheduling for In-Network Sensor Query Processing Hejun Wu and Qiong Luo Departent of Coputer Science and Engineering Hong Kong University of Science & Technology Clear Water Bay, Kowloon,

More information

Data Caching for Enhancing Anonymity

Data Caching for Enhancing Anonymity Data Caching for Enhancing Anonyity Rajiv Bagai and Bin Tang Departent of Electrical Engineering and Coputer Science Wichita State University Wichita, Kansas 67260 0083, USA Eail: {rajiv.bagai, bin.tang}@wichita.edu

More information

A Novel Fast Constructive Algorithm for Neural Classifier

A Novel Fast Constructive Algorithm for Neural Classifier A Novel Fast Constructive Algorith for Neural Classifier Xudong Jiang Centre for Signal Processing, School of Electrical and Electronic Engineering Nanyang Technological University Nanyang Avenue, Singapore

More information

THE rapid growth and continuous change of the real

THE rapid growth and continuous change of the real IEEE TRANSACTIONS ON SERVICES COMPUTING, VOL. 8, NO. 1, JANUARY/FEBRUARY 2015 47 Designing High Perforance Web-Based Coputing Services to Proote Teleedicine Database Manageent Syste Isail Hababeh, Issa

More information

Homework 1. An Introduction to Neural Networks

Homework 1. An Introduction to Neural Networks Hoework An Introduction to Neural Networks -785: Introduction to Deep Learning Spring 09 OUT: January 4, 09 DUE: February 6, 09, :59 PM Start Here Collaboration policy: You are expected to coply with the

More information

Data & Knowledge Engineering

Data & Knowledge Engineering Data & Knowledge Engineering 7 (211) 17 187 Contents lists available at ScienceDirect Data & Knowledge Engineering journal hoepage: www.elsevier.co/locate/datak An approxiate duplicate eliination in RFID

More information

I ve Seen Enough : Incrementally Improving Visualizations to Support Rapid Decision Making

I ve Seen Enough : Incrementally Improving Visualizations to Support Rapid Decision Making I ve Seen Enough : Increentally Iproving Visualizations to Support Rapid Decision Making Sajjadur Rahan Marya Aliakbarpour 2 Ha Kyung Kong Eric Blais 3 Karrie Karahalios,4 Aditya Paraeswaran Ronitt Rubinfield

More information

Scalable search-based image annotation

Scalable search-based image annotation Multiedia Systes DOI 17/s53-8-128-y REGULAR PAPER Scalable search-based iage annotation Changhu Wang Feng Jing Lei Zhang Hong-Jiang Zhang Received: 18 October 27 / Accepted: 15 May 28 Springer-Verlag 28

More information

M Software management

M Software management M Software anageent This docuent is part of the UCISA Inforation Security Toolkit providing guidance on the policies and processes needed to ipleent an organisational inforation security policy. To use

More information

News Events Clustering Method Based on Staging Incremental Single-Pass Technique

News Events Clustering Method Based on Staging Incremental Single-Pass Technique News Events Clustering Method Based on Staging Increental Single-Pass Technique LI Yongyi 1,a *, Gao Yin 2 1 School of Electronics and Inforation Engineering QinZhou University 535099 Guangxi, China 2

More information

Heterogeneous Radial Basis Function Networks

Heterogeneous Radial Basis Function Networks Proceedings of the International Conference on Neural Networks (ICNN ), vol. 2, pp. 23-2, June. Heterogeneous Radial Basis Function Networks D. Randall Wilson, Tony R. Martinez e-ail: randy@axon.cs.byu.edu,

More information

MULTI-INDEX VOTING FOR ASYMMETRIC DISTANCE COMPUTATION IN A LARGE-SCALE BINARY CODES. Chih-Yi Chiu, Yu-Cyuan Liou, and Sheng-Hao Chou

MULTI-INDEX VOTING FOR ASYMMETRIC DISTANCE COMPUTATION IN A LARGE-SCALE BINARY CODES. Chih-Yi Chiu, Yu-Cyuan Liou, and Sheng-Hao Chou MULTI-INDEX VOTING FOR ASYMMETRIC DISTANCE COMPUTATION IN A LARGE-SCALE BINARY CODES Chih-Yi Chiu, Yu-Cyuan Liou, and Sheng-Hao Chou Departent of Coputer Science and Inforation Engineering, National Chiayi

More information

A Hybrid Network Architecture for File Transfers

A Hybrid Network Architecture for File Transfers JOURNAL OF IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 6, NO., JANUARY 9 A Hybrid Network Architecture for File Transfers Xiuduan Fang, Meber, IEEE, Malathi Veeraraghavan, Senior Meber,

More information

6.1 Topological relations between two simple geometric objects

6.1 Topological relations between two simple geometric objects Chapter 5 proposed a spatial odel to represent the spatial extent of objects in urban areas. The purpose of the odel, as was clarified in Chapter 3, is ultifunctional, i.e. it has to be capable of supplying

More information

New Measures for the Evaluation of Interactive Information Retrieval Systems: Normalized Task Completion Time and Normalized User Effectiveness

New Measures for the Evaluation of Interactive Information Retrieval Systems: Normalized Task Completion Time and Normalized User Effectiveness New Measures for the Evaluation of Interactive Inforation Retrieval Systes: Noralized Task Copletion Tie and Noralized User Effectiveness Jing Cheng Graduate School of Library and Inforation Science, University

More information

MAPPING THE DATA FLOW MODEL OF COMPUTATION INTO AN ENHANCED VON NEUMANN PROCESSOR * Peter M. Maurer

MAPPING THE DATA FLOW MODEL OF COMPUTATION INTO AN ENHANCED VON NEUMANN PROCESSOR * Peter M. Maurer MAPPING THE DATA FLOW MODEL OF COMPUTATION INTO AN ENHANCED VON NEUMANN PROCESSOR * Peter M. Maurer Departent of Coputer Science and Engineering University of South Florida Tapa, FL 33620 Abstract -- The

More information

Performance Analysis of RAID in Different Workload

Performance Analysis of RAID in Different Workload Send Orders for Reprints to reprints@benthascience.ae 324 The Open Cybernetics & Systeics Journal, 2015, 9, 324-328 Perforance Analysis of RAID in Different Workload Open Access Zhang Dule *, Ji Xiaoyun,

More information

Two hours UNIVERSITY OF MANCHESTER. January Answer any THREE questions. Answer each question in a separate book

Two hours UNIVERSITY OF MANCHESTER. January Answer any THREE questions. Answer each question in a separate book Two hours UNIVERSITY OF MANCHESTER Database Architecture Models and Design January 2004 Answer any THREE questions Answer each question in a separate book The use of electronic calculators is not peritted.

More information

NON-RIGID OBJECT TRACKING: A PREDICTIVE VECTORIAL MODEL APPROACH

NON-RIGID OBJECT TRACKING: A PREDICTIVE VECTORIAL MODEL APPROACH NON-RIGID OBJECT TRACKING: A PREDICTIVE VECTORIAL MODEL APPROACH V. Atienza; J.M. Valiente and G. Andreu Departaento de Ingeniería de Sisteas, Coputadores y Autoática Universidad Politécnica de Valencia.

More information

Efficient Estimation of Inclusion Coefficient using HyperLogLog Sketches

Efficient Estimation of Inclusion Coefficient using HyperLogLog Sketches Efficient Estiation of Inclusion Coefficient using HyperLogLog Sketches Azade Nazi, Bolin Ding, Vivek Narasayya, Surajit Chaudhuri Microsoft Research {aznazi, bolind, viveknar, surajitc}@icrosoft.co ABSTRACT

More information

Ascending order sort Descending order sort

Ascending order sort Descending order sort Scalable Binary Sorting Architecture Based on Rank Ordering With Linear Area-Tie Coplexity. Hatrnaz and Y. Leblebici Departent of Electrical and Coputer Engineering Worcester Polytechnic Institute Abstract

More information

Adaptive Parameter Estimation Based Congestion Avoidance Strategy for DTN

Adaptive Parameter Estimation Based Congestion Avoidance Strategy for DTN Proceedings of the nd International onference on oputer Science and Electronics Engineering (ISEE 3) Adaptive Paraeter Estiation Based ongestion Avoidance Strategy for DTN Qicai Yang, Futong Qin, Jianquan

More information

An Ad Hoc Adaptive Hashing Technique for Non-Uniformly Distributed IP Address Lookup in Computer Networks

An Ad Hoc Adaptive Hashing Technique for Non-Uniformly Distributed IP Address Lookup in Computer Networks An Ad Hoc Adaptive Hashing Technique for Non-Uniforly Distributed IP Address Lookup in Coputer Networks Christopher Martinez Departent of Electrical and Coputer Engineering The University of Texas at San

More information

A Spectral Framework for Detecting Inconsistency across Multi-Source Object Relationships

A Spectral Framework for Detecting Inconsistency across Multi-Source Object Relationships A Spectral Fraework for Detecting Inconsistency across Multi-Source Object Relationships Jing Gao, Wei Fan, Deepak Turaga, Srinivasan Parthasarathy, and Jiawei Han University of Illinois at Urbana-Chapaign,

More information

Optimal Route Queries with Arbitrary Order Constraints

Optimal Route Queries with Arbitrary Order Constraints IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL.?, NO.?,? 20?? 1 Optial Route Queries with Arbitrary Order Constraints Jing Li, Yin Yang, Nikos Maoulis Abstract Given a set of spatial points DS,

More information

Image Filter Using with Gaussian Curvature and Total Variation Model

Image Filter Using with Gaussian Curvature and Total Variation Model IJECT Vo l. 7, Is s u e 3, Ju l y - Se p t 016 ISSN : 30-7109 (Online) ISSN : 30-9543 (Print) Iage Using with Gaussian Curvature and Total Variation Model 1 Deepak Kuar Gour, Sanjay Kuar Shara 1, Dept.

More information

Different criteria of dynamic routing

Different criteria of dynamic routing Procedia Coputer Science Volue 66, 2015, Pages 166 173 YSC 2015. 4th International Young Scientists Conference on Coputational Science Different criteria of dynaic routing Kurochkin 1*, Grinberg 1 1 Kharkevich

More information

Closing The Performance Gap between Causal Consistency and Eventual Consistency

Closing The Performance Gap between Causal Consistency and Eventual Consistency Closing The Perforance Gap between Causal Consistency and Eventual Consistency Jiaqing Du Călin Iorgulescu Aitabha Roy Willy Zwaenepoel EPFL ABSTRACT It is well known that causal consistency is ore expensive

More information

A Learning Framework for Nearest Neighbor Search

A Learning Framework for Nearest Neighbor Search A Learning Fraework for Nearest Neighbor Search Lawrence Cayton Departent of Coputer Science University of California, San Diego lcayton@cs.ucsd.edu Sanjoy Dasgupta Departent of Coputer Science University

More information

OPTIMAL COMPLEX SERVICES COMPOSITION IN SOA SYSTEMS

OPTIMAL COMPLEX SERVICES COMPOSITION IN SOA SYSTEMS Key words SOA, optial, coplex service, coposition, Quality of Service Piotr RYGIELSKI*, Paweł ŚWIĄTEK* OPTIMAL COMPLEX SERVICES COMPOSITION IN SOA SYSTEMS One of the ost iportant tasks in service oriented

More information

A Learning Framework for Nearest Neighbor Search

A Learning Framework for Nearest Neighbor Search A Learning Fraework for Nearest Neighbor Search Lawrence Cayton Departent of Coputer Science University of California, San Diego lcayton@cs.ucsd.edu Sanjoy Dasgupta Departent of Coputer Science University

More information

Multi Packet Reception and Network Coding

Multi Packet Reception and Network Coding The 2010 Military Counications Conference - Unclassified Progra - etworking Protocols and Perforance Track Multi Packet Reception and etwork Coding Aran Rezaee Research Laboratory of Electronics Massachusetts

More information

Investigation of The Time-Offset-Based QoS Support with Optical Burst Switching in WDM Networks

Investigation of The Time-Offset-Based QoS Support with Optical Burst Switching in WDM Networks Investigation of The Tie-Offset-Based QoS Support with Optical Burst Switching in WDM Networks Pingyi Fan, Chongxi Feng,Yichao Wang, Ning Ge State Key Laboratory on Microwave and Digital Counications,

More information

An Integrated Processing Method for Multiple Large-scale Point-Clouds Captured from Different Viewpoints

An Integrated Processing Method for Multiple Large-scale Point-Clouds Captured from Different Viewpoints 519 An Integrated Processing Method for Multiple Large-scale Point-Clouds Captured fro Different Viewpoints Yousuke Kawauchi 1, Shin Usuki, Kenjiro T. Miura 3, Hiroshi Masuda 4 and Ichiro Tanaka 5 1 Shizuoka

More information

Joint Measurement- and Traffic Descriptor-based Admission Control at Real-Time Traffic Aggregation Points

Joint Measurement- and Traffic Descriptor-based Admission Control at Real-Time Traffic Aggregation Points Joint Measureent- and Traffic Descriptor-based Adission Control at Real-Tie Traffic Aggregation Points Stylianos Georgoulas, Panos Triintzios and George Pavlou Centre for Counication Systes Research, University

More information

Oblivious Routing for Fat-Tree Based System Area Networks with Uncertain Traffic Demands

Oblivious Routing for Fat-Tree Based System Area Networks with Uncertain Traffic Demands Oblivious Routing for Fat-Tree Based Syste Area Networks with Uncertain Traffic Deands Xin Yuan Wickus Nienaber Zhenhai Duan Departent of Coputer Science Florida State University Tallahassee, FL 3306 {xyuan,nienaber,duan}@cs.fsu.edu

More information

On the Computation and Application of Prototype Point Patterns

On the Computation and Application of Prototype Point Patterns On the Coputation and Application of Prototype Point Patterns Katherine E. Tranbarger Freier 1 and Frederic Paik Schoenberg 2 Abstract This work addresses coputational probles related to the ipleentation

More information

Utility-Based Resource Allocation for Mixed Traffic in Wireless Networks

Utility-Based Resource Allocation for Mixed Traffic in Wireless Networks IEEE IFOCO 2 International Workshop on Future edia etworks and IP-based TV Utility-Based Resource Allocation for ixed Traffic in Wireless etworks Li Chen, Bin Wang, Xiaohang Chen, Xin Zhang, and Dacheng

More information

A High-Speed VLSI Fuzzy Inference Processor for Trapezoid-Shaped Membership Functions *

A High-Speed VLSI Fuzzy Inference Processor for Trapezoid-Shaped Membership Functions * JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 21, 607-626 (2005) A High-Speed VLSI Fuzzy Inference Processor for Trapezoid-Shaped Mebership Functions * SHIH-HSU HUANG AND JIAN-YUAN LAI + Departent of

More information

Distributed Multicast Tree Construction in Wireless Sensor Networks

Distributed Multicast Tree Construction in Wireless Sensor Networks Distributed Multicast Tree Construction in Wireless Sensor Networks Hongyu Gong, Luoyi Fu, Xinzhe Fu, Lutian Zhao 3, Kainan Wang, and Xinbing Wang Dept. of Electronic Engineering, Shanghai Jiao Tong University,

More information

Modeling Parallel Applications Performance on Heterogeneous Systems

Modeling Parallel Applications Performance on Heterogeneous Systems Modeling Parallel Applications Perforance on Heterogeneous Systes Jaeela Al-Jaroodi, Nader Mohaed, Hong Jiang and David Swanson Departent of Coputer Science and Engineering University of Nebraska Lincoln

More information

A Measurement-Based Model for Parallel Real-Time Tasks

A Measurement-Based Model for Parallel Real-Time Tasks A Measureent-Based Model for Parallel Real-Tie Tasks Kunal Agrawal 1 Washington University in St. Louis St. Louis, MO, USA kunal@wustl.edu https://orcid.org/0000-0001-5882-6647 Sanjoy Baruah 2 Washington

More information

A Fast Multi-Objective Genetic Algorithm for Hardware-Software Partitioning In Embedded System Design

A Fast Multi-Objective Genetic Algorithm for Hardware-Software Partitioning In Embedded System Design A Fast Multi-Obective Genetic Algorith for Hardware-Software Partitioning In Ebedded Syste Design 1 M.Jagadeeswari, 2 M.C.Bhuvaneswari 1 Research Scholar, P.S.G College of Technology, Coibatore, India

More information

PERFORMANCE MEASURES FOR INTERNET SERVER BY USING M/M/m QUEUEING MODEL

PERFORMANCE MEASURES FOR INTERNET SERVER BY USING M/M/m QUEUEING MODEL IJRET: International Journal of Research in Engineering and Technology ISSN: 239-63 PERFORMANCE MEASURES FOR INTERNET SERVER BY USING M/M/ QUEUEING MODEL Raghunath Y. T. N. V, A. S. Sravani 2 Assistant

More information

Database Design on Mechanical Equipment Operation Management System Zheng Qiu1, Wu kaiyuan1, Wu Chengyan1, Liu Lei2

Database Design on Mechanical Equipment Operation Management System Zheng Qiu1, Wu kaiyuan1, Wu Chengyan1, Liu Lei2 2nd International Conference on Advances in Mechanical Engineering and Industrial Inforatics (AMEII 206) Database Design on Mechanical Equipent Manageent Syste Zheng Qiu, Wu kaiyuan, Wu Chengyan, Liu Lei2

More information

Set Theoretic Estimation for Problems in Subtractive Color

Set Theoretic Estimation for Problems in Subtractive Color Set Theoretic Estiation for Probles in Subtractive Color Gaurav Shara Digital Iaging Technology Center, Xerox Corporation, MS0128-27E, 800 Phillips Rd, Webster, NY 14580 Received 25 March 1999; accepted

More information

A Discrete Spring Model to Generate Fair Curves and Surfaces

A Discrete Spring Model to Generate Fair Curves and Surfaces A Discrete Spring Model to Generate Fair Curves and Surfaces Atsushi Yaada Kenji Shiada 2 Tootake Furuhata and Ko-Hsiu Hou 2 Tokyo Research Laboratory IBM Japan Ltd. LAB-S73 623-4 Shiotsurua Yaato Kanagawa

More information

AN INTEGRATED APPROACH TO MUSIC BOUNDARY DETECTION

AN INTEGRATED APPROACH TO MUSIC BOUNDARY DETECTION 10th International Society for Music Inforation Retrieval Conference (ISMIR 2009) AN INTEGRATED APPROACH TO MUSIC BOUNDARY DETECTION Min-Yian Su, Yi-Hsuan Yang, Yu-Ching Lin, Hoer Chen National Taiwan

More information

Resource Optimization for Web Service Composition

Resource Optimization for Web Service Composition Resource Optiization for Web Coposition Xia Gao, Ravi Jain, Zulfikar Razan, Ulas Kozat Abstract coposition recently eerged as a costeffective way to quickly create new services within a network. Soe research

More information

THE rounding operation is performed in almost all arithmetic

THE rounding operation is performed in almost all arithmetic This is the author's version of an article that has been published in this journal. Changes were ade to this version by the publisher prior to publication. The final version of record is available at http://dx.doi.org/.9/tvlsi.5.538

More information

Derivation of an Analytical Model for Evaluating the Performance of a Multi- Queue Nodes Network Router

Derivation of an Analytical Model for Evaluating the Performance of a Multi- Queue Nodes Network Router Derivation of an Analytical Model for Evaluating the Perforance of a Multi- Queue Nodes Network Router 1 Hussein Al-Bahadili, 1 Jafar Ababneh, and 2 Fadi Thabtah 1 Coputer Inforation Systes Faculty of

More information

A Low-Cost Multi-Failure Resilient Replication Scheme for High Data Availability in Cloud Storage

A Low-Cost Multi-Failure Resilient Replication Scheme for High Data Availability in Cloud Storage 216 IEEE 23rd International Conference on High Perforance Coputing A Low-Cost Multi-Failure Resilient Replication Schee for High Data Availability in Cloud Storage Jinwei Liu* and Haiying Shen *Departent

More information

Meta-Path-Based Ranking with Pseudo Relevance Feedback on Heterogeneous Graph for Citation Recommendation

Meta-Path-Based Ranking with Pseudo Relevance Feedback on Heterogeneous Graph for Citation Recommendation Meta-Path-Based Ranking with Pseudo Relevance Feedback on Heterogeneous Graph for Citation Recoendation Xiaozhong Liu School of Inforatics and Coputing Indiana University Blooington Blooington, IN, USA,

More information

HKBU Institutional Repository

HKBU Institutional Repository Hong Kong Baptist University HKBU Institutional Repository HKBU Staff Publication 18 Towards why-not spatial keyword top-k ueries: A direction-aware approach Lei Chen Hong Kong Baptist University, psuanle@gail.co

More information

Real-Time Detection of Invisible Spreaders

Real-Time Detection of Invisible Spreaders Real-Tie Detection of Invisible Spreaders MyungKeun Yoon Shigang Chen Departent of Coputer & Inforation Science & Engineering University of Florida, Gainesville, FL 3, USA {yoon, sgchen}@cise.ufl.edu Abstract

More information

Gromov-Hausdorff Distance Between Metric Graphs

Gromov-Hausdorff Distance Between Metric Graphs Groov-Hausdorff Distance Between Metric Graphs Jiwon Choi St Mark s School January, 019 Abstract In this paper we study the Groov-Hausdorff distance between two etric graphs We copute the precise value

More information

Scheduling Parallel Task Graphs on (Almost) Homogeneous Multi-cluster Platforms

Scheduling Parallel Task Graphs on (Almost) Homogeneous Multi-cluster Platforms 1 Scheduling Parallel Task Graphs on (Alost) Hoogeneous Multi-cluster Platfors Pierre-François Dutot, Tchiou N Takpé, Frédéric Suter and Henri Casanova Abstract Applications structured as parallel task

More information

Inverted Index Compression via Online Document Routing

Inverted Index Compression via Online Document Routing Inverted Index Copression via Online Docuent Routing Gal Lavee, Ronny Lepel, Edo Liberty, and Oren Soekh Yahoo! Labs., Haifa, Israel Eail: {gallavee, rlepel, edo, orens}@yahoo-inc.co ABSTRACT Modern search

More information

Massive amounts of high-dimensional data are pervasive in multiple domains,

Massive amounts of high-dimensional data are pervasive in multiple domains, Challenges of Feature Selection for Big Data Analytics Jundong Li and Huan Liu, Arizona State University Massive aounts of high-diensional data are pervasive in ultiple doains, ranging fro social edia,

More information

Galois Homomorphic Fractal Approach for the Recognition of Emotion

Galois Homomorphic Fractal Approach for the Recognition of Emotion Galois Hooorphic Fractal Approach for the Recognition of Eotion T. G. Grace Elizabeth Rani 1, G. Jayalalitha 1 Research Scholar, Bharathiar University, India, Associate Professor, Departent of Matheatics,

More information

Scheduling Parallel Real-Time Recurrent Tasks on Multicore Platforms

Scheduling Parallel Real-Time Recurrent Tasks on Multicore Platforms IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL., NO., NOV 27 Scheduling Parallel Real-Tie Recurrent Tasks on Multicore Platfors Risat Pathan, Petros Voudouris, and Per Stenströ Abstract We

More information

Ranking Spatial Data by Quality Preferences

Ranking Spatial Data by Quality Preferences 1 Ranking Spatial Data by Quality Preferences Man Lung Yiu, Hua Lu, Nikos Maoulis, and Michail Vaitis Abstract A spatial preference query ranks objects based on the qualities of features in their spatial

More information

Effective Tracking of the Players and Ball in Indoor Soccer Games in the Presence of Occlusion

Effective Tracking of the Players and Ball in Indoor Soccer Games in the Presence of Occlusion Effective Tracking of the Players and Ball in Indoor Soccer Gaes in the Presence of Occlusion Soudeh Kasiri-Bidhendi and Reza Safabakhsh Airkabir Univerisity of Technology, Tehran, Iran {kasiri, safa}@aut.ac.ir

More information

Feature Based Registration for Panoramic Image Generation

Feature Based Registration for Panoramic Image Generation IJCSI International Journal of Coputer Science Issues, Vol. 10, Issue 6, No, Noveber 013 www.ijcsi.org 13 Feature Based Registration for Panoraic Iage Generation Kawther Abbas Sallal 1, Abdul-Mone Saleh

More information