Overview of the Integration Wizard Project for Querying and Managing Semistructured Data in Heterogeneous Sources

In Proceedings of the Fifth National Computer Science and Engineering Conference (NSEC 2001), Chiang Mai University, Chiang Mai, Thailand, November

Joachim Hammer
Computer & Information Science & Eng., University of Florida
Box , Gainesville, FL 3261 USA

Charnyote Pluempitiwiriyawej
Computer Science Department, Mahidol University
Rama VI Rd., Bangkok, THAILAND
cccpt@mahidol.ac.th

Abstract

We describe the Integration Wizard (IWIZ) system for retrieving heterogeneous information from multiple data sources. IWIZ provides users with an integrated, queriable view of the information that is available in the sources, without requiring them to know where the information comes from or how it is accessed. Due to the popularity of the Web, we focus on sources containing semistructured data. IWIZ uses novel mediation and wrapper technologies to process multi-source queries, transform data from the native source context into the internal IWIZ model, which is based on XML, and merge the results. To improve query response time, a data warehouse is used to cache results of frequently asked queries. This paper provides an overview of the IWIZ architecture and reports on some of our experience with the prototype implementation.

1. Introduction

The need to access and manage information from a variety of sources and applications using different data models, representations and interfaces has created a great demand for tools supporting data and systems integration. It has also provided continuous motivation for research projects, as seen in the literature [1, 2, 5, 12]. One reason for this need was the paradigm shift from centralized to client-server and distributed systems, with multiple autonomous sources producing and managing their own data. A more recent cause for the interest in integration technologies is the emergence of E-Commerce and its need for accessing repositories, applications and legacy systems¹ [4] located across the corporate intranet or at partner companies on the Internet.

¹ For simplicity, we will collectively refer to repositories, applications, legacy sources, etc. as sources.

In order to combine information from independently managed data sources, integration systems need to overcome the discrepancies in the way source data is maintained, modeled and queried. Some aspects of these heterogeneities are due to the use of different hardware and software platforms to manage data. The emergence of standard protocols and middleware components (e.g., CORBA, DCOM, ODBC, and JDBC) has simplified remote access to many standard source systems. As a result, most of the research initiatives for integrating heterogeneous data sources have focused on resolving the schematic and semantic discrepancies that exist among related data in different sources, assuming the sources can be reliably and efficiently accessed using the above protocols. For example, Bill Kent's article [11] clearly illustrates the problems associated with the fact that the same real-world object can be represented in many different ways in different sources.

There are two approaches to building integration systems. The data warehousing approach [9] prefetches interesting information from the sources, merges the data and resolves existing discrepancies, and stores the integrated information in a central repository before users submit their queries. Since users never query the sources directly, warehousing data is an efficient mechanism to support frequently asked queries as long as the data is available in the warehouse.

The second approach is referred to as virtual warehousing or mediation [20] and provides users with a queriable, integrated view of the underlying sources. No data is actually stored at the mediator, hence the term virtual warehouse. Users can query the mediator, which in turn queries the relevant sources and integrates the individual results into a format consistent with the mediator view. Unlike the warehousing approach, data retrieval and processing are done at query time. The mediation approach is preferred when user queries are unpredictable or the contents of the sources change rapidly.

We have designed and implemented an integration system, called Information Integration Wizard (IWIZ), which combines the data warehousing and mediation approaches. IWIZ allows end-users to issue queries based on a global schema to retrieve information from various sources without knowledge about their location, API, and data representation. However, unlike existing systems, queries that can be satisfied using the contents of the IWIZ warehouse are answered quickly and efficiently without connecting to the sources. When the relevant data is not available in the warehouse or its contents are out-of-date, the query is submitted to the sources via the IWIZ mediator; the warehouse is also updated for future use. An additional advantage of IWIZ is that even though sources may be temporarily unavailable, IWIZ may still be able to answer queries as long as the information has been previously cached in the warehouse.

Due to the popularity of the Web and the fact that much of the interesting information is available in the form of Web pages, catalogs, or reports with mostly loose schemas and few constraints, we have focused on integrating semistructured data [17]. Semistructured data collectively refers to information whose contents and structure are flexible and thus cannot be described and managed by the more rigid traditional data models (e.g., the relational model).

[Figure 1: Schematic description of the IWIZ architecture and its main components. Labels in the figure: Front-end, Middleware, Back-end; Querying & Browsing Interfaces (QBI), Metadata Repository, IWIZ Mediator, Warehouse, Warehouse Manager, IWIZ Wrapper 1 ... IWIZ Wrapper n, Source 1 ... Source n.]

2. Overview of the IWIZ Architecture

A conceptual overview of the IWIZ system is shown in Figure 1. System components can be grouped into two categories: storage and control. Storage components include the sources, the warehouse, and the metadata repository. Control components include the querying and browsing interface (QBI), the warehouse manager, the mediator, and the wrappers. In addition, there is information not explicitly shown in the figure, which includes the global schema, the queries and the data. The global schema, which is created by a domain expert, describes the information available in the underlying sources and consists of a hierarchy of concepts and their definitions as well as the constraints. Internally, all data are represented in the form of XML documents [19], which are manipulated through queries expressed in XML-QL [3]. The global schema, which describes the structure of all internal data, is represented as a Document Type Definition (DTD), a sample of which is shown later in the paper. The definition of concepts and terms used in the schema is stored in the global ontology [6].

As indicated in Figure 1, users interact with IWIZ through QBI, which provides a conceptual overview of the source contents in the form of the global IWIZ schema and shields users from the intricacies of XML and XML-QL. QBI translates user requests into equivalent XML-QL queries, which are submitted to the warehouse manager. If the query can be answered by the warehouse, the answer is returned to QBI. Otherwise, the query is processed by the mediator, which retrieves the requested information from the relevant sources through the wrappers. The contents of the warehouse are updated whenever a query cannot be satisfied or whenever existing content has become stale. Our update policy ensures that, over time, the warehouse contains as much of the result set as possible to answer the majority of the frequently asked queries. We now describe the three main control components in detail.

Wrappers

Source-specific wrappers provide access to the underlying sources and support schema restructuring [8].
Specifically, a wrapper maps the data model used in the associated source into the data model used by the integration system. Furthermore, it has to determine the correspondence between concepts presented in the global schema and those presented in the source schema and carry out the restructuring. In IWIZ, currently all of the sources are based on XML; hence, only structural conversions are necessary. These structural conversions are captured in the form of mappings, which are generated when the wrapper is configured. To generate the mappings, the wrapper uses the explicit source schema defined in the form of a DTD as well as a local ontology. This local ontology describes the meaning of the source vocabulary in terms of the concepts of the global ontology. If the underlying sources have no explicitly defined DTD, one must first be inferred by the wrapper [7].

At run-time, the wrapper receives XML-QL queries from the mediator and transforms them into equivalent XML-QL queries that can be executed on the XML source document using the wrapper's own query processor; note that we are assuming that our sources have no query capabilities of their own. The result of the query is converted into an XML document consistent with the global IWIZ schema and returned to the mediator. IWIZ wrappers automate much of the setup and the generation of conversion specifications; in addition, they can be generated efficiently with minimal human intervention. Details describing the IWIZ wrapper design and implementation can be found in [18].

Mediator

The mediator supports querying, reconciling and cleansing of related data from the underlying sources. The mediator accepts a user query that is written against the global schema and generates one or more subqueries that retrieve from the sources the data necessary to satisfy the original user query. To do this, the mediator rewrites the user query into multiple source-specific queries; furthermore, it generates a query plan that describes an execution sequence for combining the partial results from the individual sources. After the partial results have been merged, the mediator reconciles the data into the integrated result requested in the user query. Data reconciliation refers to the resolution of potential data conflicts, such as multiple occurrences of the same real-world object or inconsistencies in the data among related objects. We have classified all possible conflicts that can occur when reconciling XML-based data and developed a novel hierarchical clustering model to support automatic reconciliation. Our experiments have shown that, on average, our clustering strategy automatically reduces the number of duplicates by more than 50%, while at the same time reducing the number of incorrectly matched objects by up to 43% compared to no clustering [14].
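The duplicate-detection part of reconciliation can be illustrated with a small sketch. It is not the hierarchical clustering model of [14]; as a simple stand-in, it greedily groups titles whose normalized text is nearly identical, using Python's standard difflib module. The function names, the 0.9 similarity threshold, and the sample titles are assumptions made for illustration only.

```python
from difflib import SequenceMatcher


def normalize(title):
    """Lower-case a title and collapse whitespace so trivial variants compare equal."""
    return " ".join(title.lower().split())


def similar(a, b, threshold=0.9):
    """Return True when two titles are near-identical after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold


def cluster_duplicates(titles, threshold=0.9):
    """Greedily group titles: each title joins the first cluster whose
    representative (its first member) it resembles, else it starts a new cluster."""
    clusters = []
    for title in titles:
        for cluster in clusters:
            if similar(cluster[0], title, threshold):
                cluster.append(title)
                break
        else:
            clusters.append([title])
    return clusters


# Hypothetical partial results returned by different sources.
titles = [
    "Optimization by Simulated Annealing",
    "optimization by  simulated annealing",
    "Superviews: Virtual Integration of Multiple Databases",
]
clusters = cluster_duplicates(titles)
reconciled = [cluster[0] for cluster in clusters]  # keep one representative per cluster
```

A real reconciliation step would compare whole objects (authors, year, and so on) rather than titles alone, and would also have to resolve conflicting values inside each cluster.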
The knowledge needed to generate subqueries and configure the clustering model for data reconciliation is captured (with human input) in the mediation specification and used to configure the mediator at build-time. To the best of our knowledge, our IWIZ mediator is the only mediator that supports automatic reconciliation when merging the data returned to form the integrated result. Details about the classification, the clustering model as well as the mediator implementation can be found in [14, 16].

Data Warehouse Manager

In order to warehouse data items that are represented as XML documents, a persistent storage manager for XML is needed. We found only a few systems for persistently storing XML/DOM objects [13]. Therefore, we decided to use Oracle 8i as the underlying storage manager and develop XML extensions for converting between XML and Oracle. The decision to use an RDBMS was based mostly on the maturity and widespread usage of relational systems and on the fact that many relational database vendors are trying to enhance their systems with XML extensions. We also developed an XML wrapper to encapsulate the functionality of Oracle and provide an API that is consistent with the XML data model used by the other IWIZ components.

The XML wrapper is part of the warehouse manager, which controls and maintains information that is stored in the data warehouse. At build-time, it creates the relational database schema that corresponds to the global IWIZ schema. At run-time, it translates XML-QL queries into equivalent SQL statements that can be executed on the relational schema in the warehouse; it converts a relational query result into an XML document that exhibits the same structure as specified by the original XML-QL query; finally, it maintains the contents of the warehouse in light of updates to the sources as well as to the query mix of the IWIZ users.
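To make the conversion between XML and relational storage concrete, the following is a minimal sketch. It assumes a single generic node table in SQLite rather than the Oracle 8i schema that IWIZ derives from the global DTD, so it should be read as an illustration of the general idea, not as the warehouse manager's actual design. The table layout and the function names `create_schema`, `store`, and `load` are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def create_schema(conn):
    """One row per XML element; parent_id and position keep the hierarchy and order."""
    conn.execute(
        "CREATE TABLE node (id INTEGER PRIMARY KEY, parent_id INTEGER, "
        "position INTEGER, tag TEXT, text TEXT)"
    )


def store(conn, element, parent_id=None, position=0):
    """Insert one element and, recursively, its children; return the new row id."""
    text = (element.text or "").strip()
    cur = conn.execute(
        "INSERT INTO node (parent_id, position, tag, text) VALUES (?, ?, ?, ?)",
        (parent_id, position, element.tag, text),
    )
    node_id = cur.lastrowid
    for i, child in enumerate(element):
        store(conn, child, node_id, i)
    return node_id


def load(conn, node_id):
    """Rebuild the XML subtree rooted at node_id from the node table."""
    tag, text = conn.execute(
        "SELECT tag, text FROM node WHERE id = ?", (node_id,)
    ).fetchone()
    element = ET.Element(tag)
    element.text = text or None
    rows = conn.execute(
        "SELECT id FROM node WHERE parent_id = ? ORDER BY position", (node_id,)
    ).fetchall()
    for (child_id,) in rows:
        element.append(load(conn, child_id))
    return element


doc = ET.fromstring(
    "<article><author><lastname>Kent</lastname></author>"
    "<title>The Many Forms of a Single Fact</title></article>"
)
conn = sqlite3.connect(":memory:")
create_schema(conn)
root_id = store(conn, doc)
restored = load(conn, root_id)

# A path query such as article/title becomes a self-join on parent_id.
title_rows = conn.execute(
    "SELECT t.text FROM node a JOIN node t ON t.parent_id = a.id "
    "WHERE a.tag = 'article' AND t.tag = 'title'"
).fetchall()
```

The self-join at the end hints at why translating path expressions into SQL takes care: each step of the path adds another join in this generic layout. A schema derived from the DTD can avoid some of these joins.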
In order to understand how the warehouse manager maintains the contents of the warehouse, we need to briefly explain the sequence of events that occur when a user query is submitted to IWIZ. The query is forwarded to the warehouse manager, which analyzes whether or not the requested data are in the warehouse, and if so, whether the contents are up-to-date. To determine if the query can be satisfied by the warehouse, we use results from query containment theory. To determine whether the contents are up-to-date, we use a pre-defined consistency threshold, which specifies the time interval for which a result is valid. If the query cannot be satisfied by the warehouse, it is sent to the mediator, which retrieves the data from the sources. In that case, the warehouse manager generates one or more maintenance queries to update the warehouse contents. Note that since the warehouse schema and the global IWIZ schema have a different structure, the original user query cannot be used to maintain the warehouse.

Converting data and queries between the hierarchical graph model of XML and the flat structure of the relational model is not straightforward. The warehouse manager uses novel techniques for preserving the hierarchical structure of XML when storing and retrieving XML documents to and from the warehouse. Details regarding the warehouse manager and its implementation can be found in [10, 15].

3. Case Study: Integrating Bibliography Data in IWIZ

In order to demonstrate how the IWIZ mediator works, we describe a simple integration scenario involving multiple bibliography sources. Our global schema, which is represented in the form of a DTD, contains terms such as article, author, book, editor, proceedings, title, year, etc. A snapshot of the DTD is shown in Figure 2. The +, ? and * symbols indicate XML element constraints which refer to one-or-more, zero-or-one, and zero-or-more, respectively. For example, ontology, which is the root element for the schema, may contain zero-or-more bib elements, which in turn may contain zero-or-more bibliographical objects such as article, book, booklet, and so on. The symbol #PCDATA means that the element in the corresponding XML document must have a value.

    <!ELEMENT ontology (bib)*>
    <!ELEMENT bib (article | book | booklet | ...)*>
    <!ELEMENT article (author+, title, year, month?, pages?, note?, journal)>
    <!ELEMENT address (#PCDATA)>
    <!ELEMENT author (firstname?, lastname, address?)>
    <!ELEMENT firstname (#PCDATA)>
    <!ELEMENT lastname (#PCDATA)>
    <!ELEMENT journal (title, year?, month?, volume?, number?)>
    <!ELEMENT month (#PCDATA)>
    <!ELEMENT note (#PCDATA)>
    <!ELEMENT number (#PCDATA)>
    <!ELEMENT title (#PCDATA)>
    <!ELEMENT type (#PCDATA)>
    <!ELEMENT volume (#PCDATA)>
    <!ELEMENT year (#PCDATA)>

Figure 2: Sample DTD describing the structure of the concept article in an ontology of the bibliography domain

The global schema is used by all three control components: by the warehouse manager to create the relational schema for storing query results, by the wrappers as a target schema during the restructuring of query results, and by the mediator when merging and reconciling the source results. In our case study, we consider eight overlapping sources, each capable of providing some of the data for the concepts in the global schema.

The current implementation of IWIZ supports three different types of queries with varying degrees of complexity: a simple query, which contains no nesting, no projections and no joins; a projection query, which contains one or more conditions on a particular concept; and a join query, which contains join conditions between two or more concepts.
More complex types of queries such as nested or group-by queries will be supported in the next version of IWIZ. Because of space limitations, we only demonstrate the mediation of a simple query.

    function query() {
      WHERE
        <ontology>
          <bib>
            <article>
              <title>
                <PCDATA> $title </PCDATA>
              </title>
            </article>
          </bib>
        </ontology>
      IN IWIZ
      CONSTRUCT
        <article_in_iwiz> $title </article_in_iwiz>
    }

Figure 3: Simple XML-QL query produced by QBI

Figure 3 shows a sample XML-QL query that retrieves article titles (as bound by the path expression <ontology><bib><article><title><PCDATA> $title ...) from the IWIZ system and also specifies the format of the result (as defined in the CONSTRUCT clause). The IN clause, which usually specifies the name of the XML document on which the query is to be executed, indicates that the answer is to be retrieved from the IWIZ system. The query is submitted to the warehouse manager, which determines whether the desired titles exist in the warehouse. To demonstrate the mediation process, we assume here that the requested data must be fetched from the sources.

The mediator creates a query plan and a set of subqueries against those sources that contain articles and their titles. Note that since not all source results are complete, it is the job of the mediator to merge the data into a complete result, which may not always be possible. This is accomplished by one or more join queries, which are executed against the partial source results to produce a single answer. In our simple case, only one join query is necessary. Note that creating the query plan requires significant knowledge about the contents and query capabilities of the sources/wrappers. Finally, the mediator reconciles any remaining discrepancies in the integrated result using the clustering module to detect duplicates and to resolve inconsistencies among related data items [14].
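The processing steps for this query can be sketched in a few lines of Python. The sketch makes several simplifying assumptions that are not part of IWIZ: the warehouse is a dictionary keyed by the exact query text (standing in for the query containment test), the sources are in-memory XML strings, and the mediator joins the partial results of exactly two sources on equal titles. All class and function names are hypothetical.

```python
import xml.etree.ElementTree as ET


class Warehouse:
    """Caches query results together with the time they were stored."""

    def __init__(self, consistency_threshold):
        self.consistency_threshold = consistency_threshold  # seconds a result stays valid
        self._results = {}  # query text -> (stored_at, result)

    def lookup(self, query, now):
        """Return a cached result, or None if it is missing or stale."""
        entry = self._results.get(query)
        if entry is None:
            return None
        stored_at, result = entry
        if now - stored_at > self.consistency_threshold:
            return None
        return result

    def store(self, query, result, now):
        self._results[query] = (now, result)


def titles_in(source_xml):
    """Evaluate the path ontology/bib/article/title on one source document."""
    root = ET.fromstring(source_xml)  # root is the <ontology> element
    return {t.text.strip() for t in root.findall("./bib/article/title")}


def mediate(sources):
    """Join the partial results of two sources on equal titles."""
    first, second = (titles_in(s) for s in sources)
    return sorted(first & second)


def answer(query, warehouse, sources, now):
    """Answer from the warehouse when possible; otherwise mediate and refresh it."""
    cached = warehouse.lookup(query, now)
    if cached is not None:
        return cached, "warehouse"
    result = mediate(sources)
    warehouse.store(query, result, now)
    return result, "mediator"


SOURCE1 = (
    "<ontology><bib>"
    "<article><title>Optimization by Simulated Annealing</title></article>"
    "</bib></ontology>"
)
SOURCE2 = (
    "<ontology><bib>"
    "<article><title>Optimization by Simulated Annealing</title></article>"
    "<article><title>Superviews: Virtual Integration of Multiple Databases</title></article>"
    "</bib></ontology>"
)

warehouse = Warehouse(consistency_threshold=60)
first = answer("article titles", warehouse, [SOURCE1, SOURCE2], now=0)
second = answer("article titles", warehouse, [SOURCE1, SOURCE2], now=30)
third = answer("article titles", warehouse, [SOURCE1, SOURCE2], now=100)
```

With a threshold of 60 seconds, the first call goes to the mediator, the second is served from the warehouse, and the third goes back to the mediator because the cached result has become stale.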
    <?xml version="1.0"?>
    <!DOCTYPE QueryPlan SYSTEM "QueryPlan.dtd">
    <QueryPlan uquid="0001" forelement="ontology.bib.article.title">
      <ExecutionTree queryprocessor="xmlql.cmd" queryfilename="0001.et1.xmlql"/>
    </QueryPlan>

Figure 4a: Sample query plan

    /* 0001.et1.xmlql */
    function query() {
      WHERE
        <ontology><bib><article><title>
          <PCDATA>$med_Title1</>
        </></></></> IN source1.xml,
        <ontology><bib><article><title>
          <PCDATA>$med_Title2</>
        </></></></> IN source2.xml,
        $med_Title1 = $med_Title2
      CONSTRUCT
        <ontology><bib><article><title>
          <PCDATA>$med_Title1</>
        </></></></>
    }

Figure 4b: Join query that is referenced in the query plan in Figure 4a

Figure 4a shows a sample query plan and its execution tree. The execution tree includes a reference to the query file containing the join query shown in Figure 4b, as well as a reference to the query processor on which to execute it². In the current version, the mediator invokes the XML-QL processor and executes the XML-QL join query. The final answer, which is returned to the user, is shown in Figure 5. Note that the schema of the result is consistent with the CONSTRUCT clause (i.e., the requested user view) in Figure 3.

² This allows us to easily link in a newer version of the same or a different query processor without recompilation.

    <?xml version="1.0" encoding="utf-8"?>
    <XML ID="1.whr.genoid_0">
      <article_in_iwiz>
        <title>Superviews: Virtual Integration of Multiple Databases</title>
      </article_in_iwiz>
      <article_in_iwiz>
        <title>Optimization by Simulated Annealing</title>
      </article_in_iwiz>
      :
      :
    </XML>

Figure 5: Snapshot of query result returned to the user

4. Conclusion and Future Research

We have introduced a solution to data integration, which allows end-users to access and retrieve information from multiple sources through a consistent, integrated view. IWIZ uses a combined data warehousing-mediation approach for enhanced query performance and increased reliability. Specifically, it uses novel wrapper and mediation technologies to reduce human involvement as much as possible. Given the popularity of the Web, IWIZ is designed to integrate sources that provide XML-based, semistructured data.
In the future, we plan to extend the system to support additional source data models, including unstructured sources. Within IWIZ, data is represented as XML documents whose schema is defined by a DTD. Given the rapid evolution of XML and its related technologies, our next version of the prototype will move towards XML Schema for its ability to represent a richer set of data modeling constructs. Other plans include the support of more complex queries in the mediator and more sophisticated warehouse maintenance procedures, which rely on source monitors for determining when the warehouse needs updating rather than on user-defined refresh times. We will continue to report on our progress in future conferences and workshops.

5. References

[1] S. Chawathe, H. Garcia-Molina, J. Hammer, K. Ireland, Y. Papakonstantinou, J. Ullman, and J. Widom, "The TSIMMIS Project: Integration of Heterogeneous Information Sources," presented at The 10th Anniversary Meeting of the Information Processing Society of Japan, Tokyo, Japan.
[2] W. W. Cohen, "The WHIRL Approach to Data Integration," IEEE Intelligent Systems, vol. 13.
[3] A. Deutsch, M. Fernandez, D. Florescu, A. Levy, and D. Suciu, "A Query Language for XML," presented at Proceedings of the 8th International World Wide Web Conference (WWW8), Toronto, Canada.
[4] K. Geihs, "Middleware Challenges Ahead," in IEEE Computer, vol. 34, 2001.
[5] M. R. Genesereth, A. M. Keller, and O. M. Duschka, "Infomaster: An Information Integration System," presented at Proceedings of the 1997 ACM SIGMOD International Conference on Management of Data, Tucson, Arizona.
[6] T. R. Gruber, "A Translation Approach to Portable Ontologies," Knowledge Acquisition, vol. 5.
[7] H. Gu, "Designing and Implementing a DTD Inference Engine for the IWIZ Project," in Computer and Information Science and Engineering Department. Gainesville: University of Florida, 2000, pp. 76.
[8] J. Hammer, M. Breunig, and H. Garcia-Molina, "Template-Based Wrappers in the TSIMMIS System," presented at The 23rd ACM SIGMOD International Conference on Management of Data, Tucson, Arizona, 1997.
[9] W. H. Inmon and C. Kelley, Rdb/VMS: Developing the Data Warehouse. Boston, London, Toronto: QED Publishing Group.
[10] R. Kanna, "Managing XML Data in a Relational Warehouse: On Query Translation, Warehouse Maintenance, and Data Staleness," in Computer and Information Science and Engineering Department. Gainesville: University of Florida.
[11] W. Kent, "The Many Forms of a Single Fact," presented at Proceedings of the IEEE Spring Compcon, San Francisco, CA, 1989.
[12] A. Levy, "The Information Manifold Approach to Data Integration," IEEE Intelligent Systems, vol. 13.
[13] J. McHugh, S. Abiteboul, R. Goldman, D. Quass, and J. Widom, "Lore: A Database Management System for Semistructured Data," SIGMOD Record, vol. 23.
[14] C. Pluempitiwiriyawej, "A New Hierarchical Clustering Model for Speeding Up the Reconciliation of XML-Based, Semistructured Data in Mediation Systems," in Computer and Information Science and Engineering Department. Gainesville: University of Florida, 2001.
[15] R. Ramani, "A Toolkit for Managing XML Data With a Relational Database Management System," in Computer and Information Science and Engineering Department. Gainesville: University of Florida, 2001, pp. 54.
[16] A. Shah, "Source Specific Query Rewriting and Query Plan Generation for Merging XML-Based Semistructured Data in Mediation Systems," in Computer and Information Science and Engineering Department. Gainesville: University of Florida.
[17] D. Suciu, Proceedings of the Workshop on Management of Semistructured Data. Tucson, AZ.
[18] A. Teterovskaya, "Conflict Detection and Resolution During Restructuring of XML Data," in Computer and Information Science and Engineering Department. Gainesville: University of Florida, 2000.
[19] W3C, The World Wide Web Consortium (W3C).
[20] G. Wiederhold, "Mediators in the Architecture of Future Information Systems," in IEEE Computer, vol. 25. U.S. Naval Postgraduate School, Monterey, California, 1992.


PROJECT OBJECTIVES 2. PREVIOUS RESULTS 1 Project Title: Integration of Product Design and Project Activities using Process Specification and Simulation Access Languages Principal Investigator: Kincho H. Law, Stanford University Duration: September

More information

Storing and Maintaining Semistructured Data Efficiently in an Object-Relational Database

Storing and Maintaining Semistructured Data Efficiently in an Object-Relational Database Storing and Maintaining Semistructured Data Efficiently in an Object-Relational Database Yuanying Mo National University of Singapore moyuanyi@comp.nus.edu.sg Tok Wang Ling National University of Singapore

More information

Fundamentals of. Database Systems. Shamkant B. Navathe. College of Computing Georgia Institute of Technology PEARSON.

Fundamentals of. Database Systems. Shamkant B. Navathe. College of Computing Georgia Institute of Technology PEARSON. Fundamentals of Database Systems 5th Edition Ramez Elmasri Department of Computer Science and Engineering The University of Texas at Arlington Shamkant B. Navathe College of Computing Georgia Institute

More information

SEEK: Scalable Extraction of Enterprise Knowledge

SEEK: Scalable Extraction of Enterprise Knowledge SEEK: Scalable Extraction of Enterprise Knowledge Joachim Hammer Dept. of CISE University of Florida 26-Feb-2002 1 Project Overview Faculty Joachim Hammer Mark Schmalz Computer Science Students Sangeetha

More information

Extending CMIS Standard for XML Databases

Extending CMIS Standard for XML Databases Extending CMIS Standard for XML Databases Mihai Stancu * *Faculty of Mathematics and Computer Science, Department of Computer Science, University of Craiova, Romania (e-mail: mihai.stancu@yahoo.com) Abstract:

More information

Introduction to Federation Server

Introduction to Federation Server Introduction to Federation Server Alex Lee IBM Information Integration Solutions Manager of Technical Presales Asia Pacific 2006 IBM Corporation WebSphere Federation Server Federation overview Tooling

More information

Features and Requirements for an XML View Definition Language: Lessons from XML Information Mediation

Features and Requirements for an XML View Definition Language: Lessons from XML Information Mediation Page 1 of 5 Features and Requirements for an XML View Definition Language: Lessons from XML Information Mediation 1. Introduction C. Baru, B. Ludäscher, Y. Papakonstantinou, P. Velikhov, V. Vianu XML indicates

More information

Data Integration and Data Warehousing Database Integration Overview

Data Integration and Data Warehousing Database Integration Overview Data Integration and Data Warehousing Database Integration Overview Sergey Stupnikov Institute of Informatics Problems, RAS ssa@ipi.ac.ru Outline Information Integration Problem Heterogeneous Information

More information

Chapter 1: Introduction

Chapter 1: Introduction Chapter 1: Introduction Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Outline The Need for Databases Data Models Relational Databases Database Design Storage Manager Query

More information

Browsing in the tsimmis System æ. Stanford University. Tel.: è415è Fax: è415è

Browsing in the tsimmis System æ. Stanford University. Tel.: è415è Fax: è415è Information Translation, Mediation, and Mosaic-Based Browsing in the tsimmis System æ SIGMOD Demo Proposal èænal versionè Joachim Hammer, Hector Garcia-Molina, Kelly Ireland, Yannis Papakonstantinou, Jeærey

More information

Introduction to Semistructured Data and XML. Overview. How the Web is Today. Based on slides by Dan Suciu University of Washington

Introduction to Semistructured Data and XML. Overview. How the Web is Today. Based on slides by Dan Suciu University of Washington Introduction to Semistructured Data and XML Based on slides by Dan Suciu University of Washington CS330 Lecture April 8, 2003 1 Overview From HTML to XML DTDs Querying XML: XPath Transforming XML: XSLT

More information

Teiid Designer User Guide 7.5.0

Teiid Designer User Guide 7.5.0 Teiid Designer User Guide 1 7.5.0 1. Introduction... 1 1.1. What is Teiid Designer?... 1 1.2. Why Use Teiid Designer?... 2 1.3. Metadata Overview... 2 1.3.1. What is Metadata... 2 1.3.2. Editing Metadata

More information

Some aspects of references behaviour when querying XML with XQuery

Some aspects of references behaviour when querying XML with XQuery Some aspects of references behaviour when querying XML with XQuery c B.Khvostichenko boris.khv@pobox.spbu.ru B.Novikov borisnov@acm.org Abstract During the XQuery query evaluation, the query output is

More information

Ontology-Based Schema Integration

Ontology-Based Schema Integration Ontology-Based Schema Integration Zdeňka Linková Institute of Computer Science, Academy of Sciences of the Czech Republic Pod Vodárenskou věží 2, 182 07 Prague 8, Czech Republic linkova@cs.cas.cz Department

More information

Teiid Designer User Guide 7.7.0

Teiid Designer User Guide 7.7.0 Teiid Designer User Guide 1 7.7.0 1. Introduction... 1 1.1. What is Teiid Designer?... 1 1.2. Why Use Teiid Designer?... 2 1.3. Metadata Overview... 2 1.3.1. What is Metadata... 2 1.3.2. Editing Metadata

More information

Navigational Integration of Autonomous Web Information Sources by Mobile Users

Navigational Integration of Autonomous Web Information Sources by Mobile Users Navigational Integration of Autonomous Web Information Sources by Mobile Users Wisut Sae-Tung Tadashi OHMORI Mamoru HOSHI Graduate School of Information Systems The University of Electro-Communications,

More information

A Data warehouse within a Federated database architecture

A Data warehouse within a Federated database architecture Association for Information Systems AIS Electronic Library (AISeL) AMCIS 1997 Proceedings Americas Conference on Information Systems (AMCIS) 8-15-1997 A Data warehouse within a Federated database architecture

More information

Development of an Ontology-Based Portal for Digital Archive Services

Development of an Ontology-Based Portal for Digital Archive Services Development of an Ontology-Based Portal for Digital Archive Services Ching-Long Yeh Department of Computer Science and Engineering Tatung University 40 Chungshan N. Rd. 3rd Sec. Taipei, 104, Taiwan chingyeh@cse.ttu.edu.tw

More information

Indexing XML Data with ToXin

Indexing XML Data with ToXin Indexing XML Data with ToXin Flavio Rizzolo, Alberto Mendelzon University of Toronto Department of Computer Science {flavio,mendel}@cs.toronto.edu Abstract Indexing schemes for semistructured data have

More information

Chapter 1: Introduction. Chapter 1: Introduction

Chapter 1: Introduction. Chapter 1: Introduction Chapter 1: Introduction Database System Concepts, 5th Ed. See www.db-book.com for conditions on re-use Chapter 1: Introduction Purpose of Database Systems View of Data Database Languages Relational Databases

More information

Extracting Semistructured Information from the Web

Extracting Semistructured Information from the Web Extracting Semistructured Information from the Web J. Hammer, H. Garcia-Molina, J. Cho, R. Aranha, and A. Crespo Department of Computer Science Stanford University Stanford, CA 94305-9040 {hector,joachim,cho,aranha,crespo@cs.stanford.edu

More information

Introduction to Database Systems CSE 414

Introduction to Database Systems CSE 414 Introduction to Database Systems CSE 414 Lecture 14-15: XML CSE 414 - Spring 2013 1 Announcements Homework 4 solution will be posted tomorrow Midterm: Monday in class Open books, no notes beyond one hand-written

More information

STRUCTURE-BASED QUERY EXPANSION FOR XML SEARCH ENGINE

STRUCTURE-BASED QUERY EXPANSION FOR XML SEARCH ENGINE STRUCTURE-BASED QUERY EXPANSION FOR XML SEARCH ENGINE Wei-ning Qian, Hai-lei Qian, Li Wei, Yan Wang and Ao-ying Zhou Computer Science Department Fudan University Shanghai 200433 E-mail: wnqian@fudan.edu.cn

More information

A MODEL FOR ADVANCED QUERY CAPABILITY DESCRIPTION IN MEDIATOR SYSTEMS

A MODEL FOR ADVANCED QUERY CAPABILITY DESCRIPTION IN MEDIATOR SYSTEMS A MODEL FOR ADVANCED QUERY CAPABILITY DESCRIPTION IN MEDIATOR SYSTEMS Alberto Pan, Paula Montoto and Anastasio Molano Denodo Technologies, Almirante Fco. Moreno 5 B, 28040 Madrid, Spain Email: apan@denodo.com,

More information

Bonus Content. Glossary

Bonus Content. Glossary Bonus Content Glossary ActiveX control: A reusable software component that can be added to an application, reducing development time in the process. ActiveX is a Microsoft technology; ActiveX components

More information

Protégé-2000: A Flexible and Extensible Ontology-Editing Environment

Protégé-2000: A Flexible and Extensible Ontology-Editing Environment Protégé-2000: A Flexible and Extensible Ontology-Editing Environment Natalya F. Noy, Monica Crubézy, Ray W. Fergerson, Samson Tu, Mark A. Musen Stanford Medical Informatics Stanford University Stanford,

More information

CS425 Fall 2016 Boris Glavic Chapter 1: Introduction

CS425 Fall 2016 Boris Glavic Chapter 1: Introduction CS425 Fall 2016 Boris Glavic Chapter 1: Introduction Modified from: Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Textbook: Chapter 1 1.2 Database Management System (DBMS)

More information

A Bottom-up Strategy for Query Decomposition

A Bottom-up Strategy for Query Decomposition A Bottom-up Strategy for Query Decomposition Le Thi Thu Thuy, Doan Dai Duong, Virendrakumar C. Bhavsar and Harold Boley Faculty of Computer Science, University of New Brunswick Fredericton, New Brunswick,

More information

Mediating and Metasearching on the Internet

Mediating and Metasearching on the Internet Mediating and Metasearching on the Internet Luis Gravano Computer Science Department Columbia University www.cs.columbia.edu/ gravano Yannis Papakonstantinou Computer Science and Engineering Department

More information

Database Systems: Design, Implementation, and Management Tenth Edition. Chapter 14 Database Connectivity and Web Technologies

Database Systems: Design, Implementation, and Management Tenth Edition. Chapter 14 Database Connectivity and Web Technologies Database Systems: Design, Implementation, and Management Tenth Edition Chapter 14 Database Connectivity and Web Technologies Database Connectivity Mechanisms by which application programs connect and communicate

More information

B.H.GARDI COLLEGE OF MASTER OF COMPUTER APPLICATION. Ch. 1 :- Introduction Database Management System - 1

B.H.GARDI COLLEGE OF MASTER OF COMPUTER APPLICATION. Ch. 1 :- Introduction Database Management System - 1 Basic Concepts :- 1. What is Data? Data is a collection of facts from which conclusion may be drawn. In computer science, data is anything in a form suitable for use with a computer. Data is often distinguished

More information

A Framework for Processing Complex Document-centric XML with Overlapping Structures Ionut E. Iacob and Alex Dekhtyar

A Framework for Processing Complex Document-centric XML with Overlapping Structures Ionut E. Iacob and Alex Dekhtyar A Framework for Processing Complex Document-centric XML with Overlapping Structures Ionut E. Iacob and Alex Dekhtyar ABSTRACT Management of multihierarchical XML encodings has attracted attention of a

More information

Symmetrically Exploiting XML

Symmetrically Exploiting XML Symmetrically Exploiting XML Shuohao Zhang and Curtis Dyreson School of E.E. and Computer Science Washington State University Pullman, Washington, USA The 15 th International World Wide Web Conference

More information

Database Management Systems (CPTR 312)

Database Management Systems (CPTR 312) Database Management Systems (CPTR 312) Preliminaries Me: Raheel Ahmad Ph.D., Southern Illinois University M.S., University of Southern Mississippi B.S., Zakir Hussain College, India Contact: Science 116,

More information

Ontology Structure of Elements for Web-based Natural Disaster Preparedness Systems

Ontology Structure of Elements for Web-based Natural Disaster Preparedness Systems Association for Information Systems AIS Electronic Library (AISeL) AMCIS 2007 Proceedings Americas Conference on Information Systems (AMCIS) December 2007 Ontology Structure of Elements for Web-based Natural

More information

VALUE RECONCILIATION IN MEDIATORS OF HETEROGENEOUS INFORMATION COLLECTIONS APPLYING WELL-STRUCTURED CONTEXT SPECIFICATIONS

VALUE RECONCILIATION IN MEDIATORS OF HETEROGENEOUS INFORMATION COLLECTIONS APPLYING WELL-STRUCTURED CONTEXT SPECIFICATIONS VALUE RECONCILIATION IN MEDIATORS OF HETEROGENEOUS INFORMATION COLLECTIONS APPLYING WELL-STRUCTURED CONTEXT SPECIFICATIONS D. O. Briukhov, L. A. Kalinichenko, N. A. Skvortsov, S. A. Stupnikov Institute

More information

EXTRACTION AND ALIGNMENT OF DATA FROM WEB PAGES

EXTRACTION AND ALIGNMENT OF DATA FROM WEB PAGES EXTRACTION AND ALIGNMENT OF DATA FROM WEB PAGES Praveen Kumar Malapati 1, M. Harathi 2, Shaik Garib Nawaz 2 1 M.Tech, Computer Science Engineering, 2 M.Tech, Associate Professor, Computer Science Engineering,

More information

1 Copyright 2011, Oracle and/or its affiliates. All rights reserved.

1 Copyright 2011, Oracle and/or its affiliates. All rights reserved. 1 Copyright 2011, Oracle and/or its affiliates. All rights reserved. Integrating Complex Financial Workflows in Oracle Database Xavier Lopez Seamus Hayes Oracle PolarLake, LTD 2 Copyright 2011, Oracle

More information

Fundarnentals of. Sharnkant B. Navathe College of Computing Georgia Institute of Technology

Fundarnentals of. Sharnkant B. Navathe College of Computing Georgia Institute of Technology Fundarnentals of n I 5th Edition Ramez Elmasri Department of Computer Science and Engineering The University of Texas at Arlington Sharnkant B. Navathe College of Computing Georgia Institute of Technology

More information

Keywords Data alignment, Data annotation, Web database, Search Result Record

Keywords Data alignment, Data annotation, Web database, Search Result Record Volume 5, Issue 8, August 2015 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Annotating Web

More information

METADATA INTERCHANGE IN SERVICE BASED ARCHITECTURE

METADATA INTERCHANGE IN SERVICE BASED ARCHITECTURE UDC:681.324 Review paper METADATA INTERCHANGE IN SERVICE BASED ARCHITECTURE Alma Butkovi Tomac Nagravision Kudelski group, Cheseaux / Lausanne alma.butkovictomac@nagra.com Dražen Tomac Cambridge Technology

More information

Chapter 1: Introduction

Chapter 1: Introduction Chapter 1: Introduction Chapter 1: Introduction Purpose of Database Systems Database Languages Relational Databases Database Design Data Models Database Internals Database Users and Administrators Overall

More information

Ryan Stephens. Ron Plew Arie D. Jones. Sams Teach Yourself FIFTH EDITION. 800 East 96th Street, Indianapolis, Indiana, 46240

Ryan Stephens. Ron Plew Arie D. Jones. Sams Teach Yourself FIFTH EDITION. 800 East 96th Street, Indianapolis, Indiana, 46240 Ryan Stephens Ron Plew Arie D. Jones Sams Teach Yourself FIFTH EDITION 800 East 96th Street, Indianapolis, Indiana, 46240 Table of Contents Part I: An SQL Concepts Overview HOUR 1: Welcome to the World

More information

An UML-XML-RDB Model Mapping Solution for Facilitating Information Standardization and Sharing in Construction Industry

An UML-XML-RDB Model Mapping Solution for Facilitating Information Standardization and Sharing in Construction Industry An UML-XML-RDB Model Mapping Solution for Facilitating Information Standardization and Sharing in Construction Industry I-Chen Wu 1 and Shang-Hsien Hsieh 2 Department of Civil Engineering, National Taiwan

More information

Metadata. Data Warehouse

Metadata. Data Warehouse A DECISION SUPPORT PROTOTYPE FOR THE KANSAS STATE UNIVERSITY LIBRARIES Maria Zamr Bleyberg, Raghunath Mysore, Dongsheng Zhu, Radhika Bodapatla Computing and Information Sciences Department Karen Cole,

More information

MULTIMEDIA DATABASES OVERVIEW

MULTIMEDIA DATABASES OVERVIEW MULTIMEDIA DATABASES OVERVIEW Recent developments in information systems technologies have resulted in computerizing many applications in various business areas. Data has become a critical resource in

More information

Practical Database Design Methodology and Use of UML Diagrams Design & Analysis of Database Systems

Practical Database Design Methodology and Use of UML Diagrams Design & Analysis of Database Systems Practical Database Design Methodology and Use of UML Diagrams 406.426 Design & Analysis of Database Systems Jonghun Park jonghun@snu.ac.kr Dept. of Industrial Engineering Seoul National University chapter

More information

Inferring Structure in Semistructured Data

Inferring Structure in Semistructured Data Inferring Structure in Semistructured Data SVETLOZAR NESTOROV æ SERGE ABITEBOUL y RAJEEV MOTWANI z Department of Computer Science Stanford University Stanford, CA 94305-9040 fevtimov,abiteboug@db.stanford.edu,

More information

AN APPROACH TO UNIFICATION OF XML AND OBJECT- RELATIONAL DATA MODELS

AN APPROACH TO UNIFICATION OF XML AND OBJECT- RELATIONAL DATA MODELS AN APPROACH TO UNIFICATION OF XML AND OBJECT- RELATIONAL DATA MODELS Iryna Kozlova 1 ), Norbert Ritter 1 ) Abstract The emergence and wide deployment of XML technologies in parallel to the (object-)relational

More information

Resolving Schema and Value Heterogeneities for XML Web Querying

Resolving Schema and Value Heterogeneities for XML Web Querying Resolving Schema and Value Heterogeneities for Web ing Nancy Wiegand and Naijun Zhou University of Wisconsin 550 Babcock Drive Madison, WI 53706 wiegand@cs.wisc.edu, nzhou@wisc.edu Isabel F. Cruz and William

More information

Oracle Data Integration and OWB: New for 11gR2

Oracle Data Integration and OWB: New for 11gR2 Oracle Data Integration and OWB: New for 11gR2 C. Antonio Romero, Oracle Corporation, Redwood Shores, US Keywords: data integration, etl, real-time, data warehousing, Oracle Warehouse Builder, Oracle Data

More information

DESIGNING AND IMPLEMENTING THE DTD INFERENCE ENGINE FOR THE I-WIZ PROJECT HONGYU GUO

DESIGNING AND IMPLEMENTING THE DTD INFERENCE ENGINE FOR THE I-WIZ PROJECT HONGYU GUO DESIGNING AND IMPLEMENTING THE DTD INFERENCE ENGINE FOR THE I-WIZ PROJECT By HONGYU GUO A THESIS PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS

More information

FUNDAMENTALS OF. Database S wctpmc. Shamkant B. Navathe College of Computing Georgia Institute of Technology. Addison-Wesley

FUNDAMENTALS OF. Database S wctpmc. Shamkant B. Navathe College of Computing Georgia Institute of Technology. Addison-Wesley FUNDAMENTALS OF Database S wctpmc SIXTH EDITION Ramez Elmasri Department of Computer Science and Engineering The University of Texas at Arlington Shamkant B. Navathe College of Computing Georgia Institute

More information

A Web Service-Based System for Sharing Distributed XML Data Using Customizable Schema

A Web Service-Based System for Sharing Distributed XML Data Using Customizable Schema Proceedings of the 2009 IEEE International Conference on Systems, Man, and Cybernetics San Antonio, TX, USA - October 2009 A Web Service-Based System for Sharing Distributed XML Data Using Customizable

More information

A MAS Based ETL Approach for Complex Data

A MAS Based ETL Approach for Complex Data A MAS Based ETL Approach for Complex Data O. Boussaid, F. Bentayeb, J. Darmont Abstract : In a data warehousing process, the phase of data integration is crucial. Many methods for data integration have

More information

Sangam: A Framework for Modeling Heterogeneous Database Transformations

Sangam: A Framework for Modeling Heterogeneous Database Transformations Sangam: A Framework for Modeling Heterogeneous Database Transformations Kajal T. Claypool University of Massachusetts-Lowell Lowell, MA Email: kajal@cs.uml.edu Elke A. Rundensteiner Worcester Polytechnic

More information

FedX: A Federation Layer for Distributed Query Processing on Linked Open Data

FedX: A Federation Layer for Distributed Query Processing on Linked Open Data FedX: A Federation Layer for Distributed Query Processing on Linked Open Data Andreas Schwarte 1, Peter Haase 1,KatjaHose 2, Ralf Schenkel 2, and Michael Schmidt 1 1 fluid Operations AG, Walldorf, Germany

More information

An Improving for Ranking Ontologies Based on the Structure and Semantics

An Improving for Ranking Ontologies Based on the Structure and Semantics An Improving for Ranking Ontologies Based on the Structure and Semantics S.Anusuya, K.Muthukumaran K.S.R College of Engineering Abstract Ontology specifies the concepts of a domain and their semantic relationships.

More information

Systems:;-'./'--'.; r. Ramez Elmasri Department of Computer Science and Engineering The University of Texas at Arlington

Systems:;-'./'--'.; r. Ramez Elmasri Department of Computer Science and Engineering The University of Texas at Arlington Data base 7\,T"] Systems:;-'./'--'.; r Modelsj Languages, Design, and Application Programming Ramez Elmasri Department of Computer Science and Engineering The University of Texas at Arlington Shamkant

More information

Routing XQuery in A P2P Network Using Adaptable Trie-Indexes

Routing XQuery in A P2P Network Using Adaptable Trie-Indexes Routing XQuery in A P2P Network Using Adaptable Trie-Indexes Florin Dragan, Georges Gardarin, Laurent Yeh PRISM laboratory Versailles University & Oxymel, France {firstname.lastname@prism.uvsq.fr} Abstract

More information

Querying XML data: Does One Query Language Fit All? Abstract 1.0 Introduction 2.0 Background: Querying XML documents

Querying XML data: Does One Query Language Fit All? Abstract 1.0 Introduction 2.0 Background: Querying XML documents Querying XML data: Does One Query Language Fit All? V. Ramesh, Arijit Sengupta and Bryan Reinicke venkat@indiana.edu, asengupt@indiana.edu, breinick@indiana.edu Kelley School of Business, Indiana University,

More information

XViews: XML views of relational schemas

XViews: XML views of relational schemas SDSC TR-1999-3 XViews: XML views of relational schemas Chaitanya Baru San Diego Supercomputer Center, University of California San Diego La Jolla, CA 92093, USA baru@sdsc.edu October 7, 1999 San Diego

More information

Semantic Data Extraction for B2B Integration

Semantic Data Extraction for B2B Integration Silva, B., Cardoso, J., Semantic Data Extraction for B2B Integration, International Workshop on Dynamic Distributed Systems (IWDDS), In conjunction with the ICDCS 2006, The 26th International Conference

More information

Conception of Information Systems Part 3: Integration of Heterogeneous Databases

Conception of Information Systems Part 3: Integration of Heterogeneous Databases Conception of Information Systems Part 3: Integration of Heterogeneous Databases 2004, Karl Aberer, EPFL-SSC, Laboratoire de systèmes d'informations rèpartis Part 2-1 1 PART III - Integration of Heterogeneous

More information

The Evolution of Data Warehousing. Data Warehousing Concepts. The Evolution of Data Warehousing. The Evolution of Data Warehousing

The Evolution of Data Warehousing. Data Warehousing Concepts. The Evolution of Data Warehousing. The Evolution of Data Warehousing The Evolution of Data Warehousing Data Warehousing Concepts Since 1970s, organizations gained competitive advantage through systems that automate business processes to offer more efficient and cost-effective

More information

Inferring Structure in Semistructured Data

Inferring Structure in Semistructured Data Inferring Structure in Semistructured Data SVETLOZAR NESTOROV SERGE ABITEBOUL RAJEEV MOTWANI æ Department of Computer Science Stanford University Stanford, CA 94305-9040 fevtimov,abiteboug@db.stanford.edu,

More information

P2P Knowledge Management: an Investigation of the Technical Architecture and Main Processes

P2P Knowledge Management: an Investigation of the Technical Architecture and Main Processes P2P Management: an Investigation of the Technical Architecture and Main Processes Oscar Mangisengi, Wolfgang Essmayr Software Competence Center Hagenberg (SCCH) Hauptstrasse 99, A-4232 Hagenberg, Austria

More information

Databases and Information Retrieval Integration TIETS42. Kostas Stefanidis Autumn 2016

Databases and Information Retrieval Integration TIETS42. Kostas Stefanidis Autumn 2016 + Databases and Information Retrieval Integration TIETS42 Autumn 2016 Kostas Stefanidis kostas.stefanidis@uta.fi http://www.uta.fi/sis/tie/dbir/index.html http://people.uta.fi/~kostas.stefanidis/dbir16/dbir16-main.html

More information

Improving Resource Management And Solving Scheduling Problem In Dataware House Using OLAP AND OLTP Authors Seenu Kohar 1, Surender Singh 2

Improving Resource Management And Solving Scheduling Problem In Dataware House Using OLAP AND OLTP Authors Seenu Kohar 1, Surender Singh 2 Improving Resource Management And Solving Scheduling Problem In Dataware House Using OLAP AND OLTP Authors Seenu Kohar 1, Surender Singh 2 1 M.tech Computer Engineering OITM Hissar, GJU Univesity Hissar

More information

Conception of Information Systems Lecture 3: Integration of Heterogeneous Databases

Conception of Information Systems Lecture 3: Integration of Heterogeneous Databases Conception of Information Systems Lecture 3: Integration of Heterogeneous Databases 22 March 2005 http://lsirwww.epfl.ch/courses/cis/2005ss/ 2004-2005, Karl Aberer & J.P. Martin-Flatin 1 1 Outline Introduction

More information

Interrogation System Architecture of Heterogeneous Data for Decision Making

Interrogation System Architecture of Heterogeneous Data for Decision Making Interrogation System Architecture of Heterogeneous Data for Decision Making Cécile Nicolle, Youssef Amghar, Jean-Marie Pinon Laboratoire d'ingénierie des Systèmes d'information INSA de Lyon Abstract Decision

More information

Introduction to Semistructured Data and XML

Introduction to Semistructured Data and XML Introduction to Semistructured Data and XML Chapter 27, Part D Based on slides by Dan Suciu University of Washington Database Management Systems, R. Ramakrishnan 1 How the Web is Today HTML documents often

More information

Whitepaper. Solving Complex Hierarchical Data Integration Issues. What is Complex Data? Types of Data

Whitepaper. Solving Complex Hierarchical Data Integration Issues. What is Complex Data? Types of Data Whitepaper Solving Complex Hierarchical Data Integration Issues What is Complex Data? Historically, data integration and warehousing has consisted of flat or structured data that typically comes from structured

More information

Semistructured Data: The Tsimmis Experience. Stanford University.

Semistructured Data: The Tsimmis Experience. Stanford University. Semistructured Data: The Tsimmis Experience Joachim Hammer, Jason McHugh, and Hector Garcia-Molina Department of Computer Science Stanford University Stanford, CA 94305{9040 U.S.A. fjoachim,mchughj,hectorg@db.stanford.edu

More information

IJSRD - International Journal for Scientific Research & Development Vol. 4, Issue 06, 2016 ISSN (online):

IJSRD - International Journal for Scientific Research & Development Vol. 4, Issue 06, 2016 ISSN (online): IJSRD - International Journal for Scientific Research & Development Vol. 4, Issue 06, 2016 ISSN (online): 2321-0613 Tanzeela Khanam 1 Pravin S.Metkewar 2 1 Student 2 Associate Professor 1,2 SICSR, affiliated

More information