Generic Model Management

A Database Infrastructure for Schema Manipulation
Philip A. Bernstein, Microsoft Corporation
April 29, 2002
© 2002 Microsoft Corp.

The Problem
- There are 30 years of DB research on meta data, but we don't have great infrastructure to offer:
  - Most design tools and web services store meta data in files, not DBs
  - OODBMSs are not a huge success
  - Most meta-data-driven tools use their own infrastructure
- Goal: a generic meta data manipulation infrastructure that reduces the amount of programming required to build meta-data-driven applications
- Proposal: define an algebra to manipulate meta data in large chunks, called models and mappings

Outline
- Overview of Model Management
- Solutions to classical meta data problems
- Recent technical results

Models and Mappings
- Model: a complex information structure
  - XML schema, SQL schema, OO interface, UML model, web site map, make script, ...
- Mapping: a representation of a transformation from one model into another
  - Map between two XML schemas
  - Map a SQL schema to an XML schema
  - Map data sources to a data warehouse
  - Map an ER diagram to a SQL schema
  - Map a process definition to a workflow script

Representation
- A model is a directed graph with one root.
- A mapping is a model, each of whose nodes connects nodes of two other models.

[Figure: map1 connects a relational schema Emp (E#, Dept#, Name) to an XSD Emp (E#, Dept#, Name, with Name containing First and Last).]
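
The representation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names `Model` and `Mapping` and their methods are hypothetical and do not come from the talk; the element names follow the figure.

```python
from dataclasses import dataclass, field


@dataclass
class Model:
    """A rooted directed graph: each node name maps to its child names."""
    root: str
    children: dict = field(default_factory=dict)

    def add(self, parent, child):
        # Record the edge parent -> child and make sure both nodes exist.
        self.children.setdefault(parent, []).append(child)
        self.children.setdefault(child, [])

    def nodes(self):
        return set(self.children) | {self.root}


@dataclass
class Mapping:
    """A model whose nodes each connect a node of `left` to a node of `right`."""
    left: Model
    right: Model
    pairs: list = field(default_factory=list)

    def connect(self, left_node, right_node):
        # A mapping node may only reference nodes that exist in the two models.
        if left_node not in self.left.nodes() or right_node not in self.right.nodes():
            raise KeyError((left_node, right_node))
        self.pairs.append((left_node, right_node))


# Relational schema Emp(E#, Dept#, Name)
rel = Model("Emp")
for column in ("E#", "Dept#", "Name"):
    rel.add("Emp", column)

# XSD Emp with Name split into First and Last
xsd = Model("Emp")
for element in ("E#", "Dept#", "Name"):
    xsd.add("Emp", element)
xsd.add("Name", "First")
xsd.add("Name", "Last")

map1 = Mapping(rel, xsd)
for name in ("E#", "Dept#", "Name"):
    map1.connect(name, name)
```

Keeping the mapping as its own object, rather than as annotations on either schema, is what lets the later operators take mappings as arguments and return them as results.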

Model Management Algebra
- Match
- Merge
- Compose
- Diff
- Select
- Enumerate
- ApplyFunction
- Copy
- Invert
- Update operations

Match
- Match(M1, M2) returns the best mapping between M1 and M2.

[Figure: M1 is Emp (E#, Dept#, Name, Addr); M2 is Emp (E#, Dept#, Phone, Name with First and Last). map1 relates E# = E#, Dept# = Dept#, and Name ≈ Name.]
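
To make the idea of a mapping with exact and approximate correspondences concrete, here is a deliberately naive matcher. It is an assumption-laden sketch, not the matching algorithm discussed later in the talk: it pairs elements by identical name only, and marks a pair as approximate ("~") when the two elements have different sub-elements. The function name `match` and the dictionary encoding are hypothetical.

```python
def match(m1, m2):
    """Naive name-equality matcher.

    m1, m2: dict mapping each element name to the list of its sub-element names.
    Returns a list of (m1_element, relation, m2_element) triples, where
    relation is "=" for identical structure and "~" for a name match whose
    sub-elements differ.
    """
    mapping = []
    for name, sub_elements in m1.items():
        if name not in m2:
            continue  # no candidate with the same name; leave unmatched
        relation = "=" if sorted(sub_elements) == sorted(m2[name]) else "~"
        mapping.append((name, relation, name))
    return mapping


M1 = {"E#": [], "Dept#": [], "Name": [], "Addr": []}
M2 = {"E#": [], "Dept#": [], "Phone": [], "Name": ["First", "Last"]}

map1 = match(M1, M2)
# map1 == [("E#", "=", "E#"), ("Dept#", "=", "Dept#"), ("Name", "~", "Name")]
```

Addr and Phone stay unmatched, which is the information a later Diff or Merge step would need.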

Merge
- Merge(M1, M2, map) returns the union of models M1 and M2.
- Use map to guide the Merge: if elements x = y in map, then collapse them into one element.

[Figure: M1 is Emp (Addr, Name) and M2 is Emp (Name, Phone), with mapc stating Name = Name. The merged model is Emp (Addr, Name, Phone).]
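
A minimal sketch of the collapse rule, assuming flat models represented as lists of element names. The function name `merge` and this encoding are illustrative assumptions; conflict handling and name clashes between unmapped elements are out of scope here.

```python
def merge(m1, m2, mapping):
    """Union of two flat models, collapsing elements that the mapping equates.

    m1, m2: lists of element names.
    mapping: list of (x, y) pairs meaning element x of m1 equals element y of m2.
    """
    equated_in_m2 = {y for _, y in mapping}
    merged = list(m1)  # every element of m1 survives
    for y in m2:
        if y not in equated_in_m2:
            merged.append(y)  # only m2 elements without an m1 counterpart are added
    return merged


M1 = ["Addr", "Name"]
M2 = ["Name", "Phone"]
merged = merge(M1, M2, [("Name", "Name")])
# merged == ["Addr", "Name", "Phone"]
```

With an empty mapping the same call returns both Name elements, which shows why Merge needs the mapping as input rather than relying on names.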

Left Composition ( f )
- mapc = mapa f mapb

M1 (Emp)   mapa   M2 (Emp)   mapb   M3 (Emp)
Addr       a1     Name       b1     Name
Street     a2     Street     b2     StAddr
City       a3     City       b3     Town

M1 (Emp)   mapc   M3 (Emp)
Addr       c1
Street     c2     StAddr
City       c3     Town
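
One way to read the composition figure is as a join of the two mappings on the middle model, where every node of the left mapping is kept even if it has nothing to join with. The sketch below follows that reading. It is an assumption: the function name `left_compose`, the pair encoding, and the treatment of a1 as having no counterpart that carries through to M3 are not stated in the talk.

```python
def left_compose(map_a, map_b):
    """Compose two mappings given as lists of (source, target) pairs.

    A target of None marks a mapping node with no counterpart on that side.
    Every node of map_a produces at least one node in the result.
    """
    result = []
    for source, middle in map_a:
        targets = [dst for mid, dst in map_b if middle is not None and mid == middle]
        if targets:
            result.extend((source, dst) for dst in targets)
        else:
            result.append((source, None))  # keep the dangling left node
    return result


map_a = [("Addr", None), ("Street", "Street"), ("City", "City")]        # M1 -> M2
map_b = [("Name", "Name"), ("Street", "StAddr"), ("City", "Town")]      # M2 -> M3

map_c = left_compose(map_a, map_b)
# map_c == [("Addr", None), ("Street", "StAddr"), ("City", "Town")]
```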

Model Management Algebra
- map = Match(M1, M2)
- M3 = Merge(M1, M2, map)
- map3 = Compose(map1, map2)
- M2 = Diff(M1, map)
- M2 = Select(M1, pred)
- list = Enumerate(M)
- ApplyFunction(M, f)
- M2 = Copy(M1)
- Update operations

- They're generic = data model independent
- Well implemented on an extended ER model with an extensibility story

Example
Given
- map1 from SQL schema rdb1 to xsd1
- xsd2, which is similar to xsd1
Produce a map between xsd2 and a relational schema.

1. map2 = Match(xsd1, xsd2)
2. map3 = Compose(map1, map2)
3. <map4, rdb2> = Copy(map3)
4. Use ApplyFunction(map4) to map each x in Diff(xsd2, map4) into rdb2

Theme
- Classic meta data problems can be solved using Model Management operations:
  - Schema integration
  - Schema evolution
  - Data migration
  - Reverse engineering
- Published solutions to these problems help us produce generic implementations of model management operations

Outline
- Overview of Model Management
- Solutions to classical meta data problems
  - Schema integration
  - Schema evolution
  - Reverse engineering
  - Data migration
- Recent technical results

Schema Integration
Given two view schemas, V1 and V2
Produce an integrated schema, S

1. map = Match(V1, V2)
2. S' = Merge(V1, V2, map)
3. ApplyFunction(S') // to resolve conflicts in S', producing S

[Figure: schema integration example. V1 is Emp (E#, Dept#, Addr, Name) and V2 is Emp (E#, Dept#, Phone, FirstName, LastName). map relates E# = E#, Dept# = Dept#, and Name ≈ FirstName, LastName. The merged schema S' contains E#, Dept#, Addr, Phone, and the conflicting name elements Name, FirstName, and LastName.]

1. map = Match(V1, V2)
2. S' = Merge(V1, V2, map)
3. Use ApplyFunction(S') to resolve conflicts, producing S
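
The three-step script can be run end to end on the Emp example. The sketch below is a simplification under stated assumptions: schemas are flat lists, `match` uses exact names only (so it does not find the approximate Name correspondence), and the conflict-resolution policy of dropping Name in favor of FirstName and LastName is a hypothetical choice made for illustration.

```python
def match(v1, v2):
    # Exact-name correspondences only.
    return [(x, x) for x in v1 if x in v2]


def merge(v1, v2, mapping):
    # Union of v1 and v2, collapsing the elements equated by the mapping.
    equated = {y for _, y in mapping}
    return list(v1) + [y for y in v2 if y not in equated]


def apply_function(model, f):
    # Apply f to every element; f returns the list of elements that replace it.
    result = []
    for element in model:
        for replacement in f(element):
            if replacement not in result:
                result.append(replacement)
    return result


V1 = ["E#", "Dept#", "Addr", "Name"]
V2 = ["E#", "Dept#", "Phone", "FirstName", "LastName"]

mapping = match(V1, V2)              # step 1
s_prime = merge(V1, V2, mapping)     # step 2


def resolve(element):
    # Hypothetical policy: Name overlaps FirstName/LastName, so drop it.
    return [] if element == "Name" else [element]


S = apply_function(s_prime, resolve)  # step 3
# S == ["E#", "Dept#", "Addr", "Phone", "FirstName", "LastName"]
```

The first two steps are generic; only `resolve` encodes a decision specific to this pair of schemas.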

Schema Evolution
Given
- map_SV from schema S to view V
- a modified version S' of S
Produce a mapping map_S'V from S' to V (i.e., a view definition for V over S').

1. map_S'S = Match(S', S)
2. map_S'V = Compose(map_S'S, map_SV)
3. Use ApplyFunction(V) to delete elements not derivable from S'
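
Steps 2 and 3 can be illustrated with mappings stored as dictionaries. All schema and element names below are invented for the example, and treating a mapping as a one-to-one dictionary is a simplifying assumption.

```python
def compose(first, second):
    """Compose two one-to-one mappings held as dicts (source -> target)."""
    return {a: second[b] for a, b in first.items() if b in second}


# Result of Match(S', S): the evolved schema S' renamed two columns and dropped Addr.
map_s_new_to_s = {"emp_id": "E#", "dept": "Dept#"}

# Existing view definition: S -> V.
map_s_to_v = {"E#": "EmpNo", "Dept#": "DeptNo", "Addr": "Address"}

# Step 2: the view definition over S'.
map_s_new_to_v = compose(map_s_new_to_s, map_s_to_v)
# map_s_new_to_v == {"emp_id": "EmpNo", "dept": "DeptNo"}

# Step 3: view elements that can no longer be derived and must be deleted from V.
underivable = sorted(set(map_s_to_v.values()) - set(map_s_new_to_v.values()))
# underivable == ["Address"]
```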

Reverse Engineering
Given
- Model M (e.g., an ER model)
- Model G (e.g., SQL) generated via map_MG from M
- A modified version G' of G
Produce a modified version M' of M that generates G'.

1. map_GG' = Match(G, G')
2. map_MG' = Compose(map_MG, map_GG')
3. <M', map_M'G'> = Copy(map_MG')
4. Use ApplyFunction(map_M'G') to reverse engineer each g in Diff(G', map_M'G') into M'

Data Migration
Given
- a schema S and its database D
- an evolved schema S'
Produce a procedure for mapping D into an S' database D'.

1. map_SS' = Match(S, S')
2. Use Enum(S) to generate a data migration script

[Figure: map_SS' relates S to S'; Enum generates a migration script, which is run on D to produce D'.]
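
As a rough illustration of step 2, the sketch below enumerates the columns of the old schema and emits one SQL statement that copies the mapped columns into the new table. The table names, column names, and the helper names `enumerate_elements` and `migration_script` are all hypothetical; a real script would also need type conversions and handling for unmapped columns.

```python
def enumerate_elements(schema):
    # Stand-in for Enum(S): yield the schema's elements one at a time.
    yield from schema


def migration_script(old_table, new_table, old_schema, mapping):
    """Build an INSERT ... SELECT statement from a column mapping.

    mapping: dict from old column name to new column name (the Match result).
    Columns of the old schema without a mapping entry are skipped.
    """
    pairs = [(col, mapping[col]) for col in enumerate_elements(old_schema) if col in mapping]
    sources = ", ".join(old for old, _ in pairs)
    targets = ", ".join(new for _, new in pairs)
    return f"INSERT INTO {new_table} ({targets}) SELECT {sources} FROM {old_table};"


old_schema = ["emp_no", "name", "fax"]
map_s_s_new = {"emp_no": "emp_id", "name": "full_name"}

script = migration_script("emp_v1", "emp_v2", old_schema, map_s_s_new)
# script == "INSERT INTO emp_v2 (emp_id, full_name) SELECT emp_no, name FROM emp_v1;"
```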

Data Translation
- Like data migration, except S and S' are expressed in different data models.

Outline
- Overview of Model Management
- Solutions to classical meta data problems
- Recent technical results

Status Report
- Vision [Bernstein, Halevy, & Pottinger, SIGMOD Record 12/00]
- Data Warehouse Examples [Bernstein & Rahm, ER 00]
- Match Operation
  - Survey: [Rahm & Bernstein, VLDB Journal, 12/01]
  - Prototype: [Madhavan, Bernstein, & Rahm, VLDB 01]
- Merge Operation: coming soon
- Theory [Alagic & Bernstein, DBPL 01]

Schema Matching Approaches
- About a dozen published algorithms. Many good ideas, but none are robust.

Individual matchers
- Schema-based
  - Per-element
    - Linguistic: names, descriptions
    - Constraint-based: types, keys
  - Structural
    - Constraint-based: graph matching
- Content-based
  - Per-element
    - Linguistic: IR (word frequencies, key terms)
    - Constraint-based: value patterns and ranges

Combined matchers
- Hybrid
- Composite
  - Manual composition
  - Automatic composition

The Cupid Algorithm
- Computes linguistic similarity of element pairs
- Computes structural similarity of element pairs
- Generates a mapping

[Figure: schema PO (POShipTo and POBillTo, each with City and Street) is matched against schema PurchaseOrder (DeliverTo and InvoiceTo, each containing Address with City and Street); matching leaves raise the structural similarity (ssim++).]
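
The interplay of the two similarity measures can be sketched as follows. This is a toy loosely modeled on the idea of combining linguistic and structural similarity, not the Cupid implementation: the synonym table, the 0.5 threshold, the equal weighting, and the function names `lsim`, `ssim`, and `wsim` are assumptions made for the example.

```python
# Hypothetical thesaurus entries; a real matcher would use a much richer resource.
SYNONYMS = {
    frozenset(("POShipTo", "DeliverTo")),
    frozenset(("POBillTo", "InvoiceTo")),
}


def lsim(a, b):
    """Linguistic similarity: 1.0 for equal names or listed synonyms, else 0.0."""
    return 1.0 if a == b or frozenset((a, b)) in SYNONYMS else 0.0


def ssim(leaves_a, leaves_b, threshold=0.5):
    """Structural similarity: fraction of leaves with a similar leaf on the other side."""
    total = len(leaves_a) + len(leaves_b)
    if total == 0:
        return 0.0
    linked_a = sum(1 for x in leaves_a if any(lsim(x, y) >= threshold for y in leaves_b))
    linked_b = sum(1 for y in leaves_b if any(lsim(x, y) >= threshold for x in leaves_a))
    return (linked_a + linked_b) / total


def wsim(name_a, leaves_a, name_b, leaves_b, w_struct=0.5):
    """Weighted combination of structural and linguistic similarity."""
    return w_struct * ssim(leaves_a, leaves_b) + (1 - w_struct) * lsim(name_a, name_b)


address = ["City", "Street"]
ship_vs_deliver = wsim("POShipTo", address, "DeliverTo", address)   # 1.0
ship_vs_invoice = wsim("POShipTo", address, "InvoiceTo", address)   # 0.5
```

Structure alone cannot separate DeliverTo from InvoiceTo here, since both contain the same leaves; the linguistic score breaks the tie.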

Merge(M1, map, M2, M3)
- [Buneman, Davidson, Kosky, EDBT 92]
  - Meta-model has aggregation & generalization only
  - Do a union and collapse objects having the same name
  - Fix-up step for inconsistencies created by merging
  - Successive fix-ups lead to different results; batch them at the end to produce a unique minimal result
- Now enrich the meta-model (containment, complex mappings, ...) & merge semantics (conflicts, deletes)

[Figure: small graphs over nodes X, Y, Z, and W with edges labeled a, illustrating the fix-up step.]

Implementation Vision

[Figure: architecture diagram. Labels: Model-Driven UI Generator; Generic Tools (Browser, Import/export, Scripting, Editors, Catalogs); Model Manager (Match, Merge, Apply, Compose, Copy); Operation Specializations; Object-Oriented Repository; MM Meta-Model; OR Mapper; Inferencing Engine; SQL DBMS.]

Related Work
- There's a lot of it. Apply it to model management!
- Platforms: OODBs, datalog, deductive OODBs (Telos/ConceptBase, F-Logic)
- Inferencing on mappings: AQUV, description logic
- Transitive closure and recursive QP
- Differencing: text, trees, graphs
- Data translation: algebras, schema evolution
- Data integration: schema match, view generation

Summary
- Raise the level of abstraction of meta data programming by using:
  - models and mappings as objects
  - an algebra that manipulates models and mappings on a generic meta-model
- Classical meta data problems can be expressed using this algebra
- Implementations of classic problems offer guidance on implementing the algebra

References
http://www.research.microsoft.com/~philbe
- P. Bernstein & E. Rahm, Data Warehouse Scenarios for Model Management, ER 2000 Conference
- P. Bernstein, A. Levy, R. Pottinger, A Vision for Management of Complex Models, SIGMOD Record, Dec. 2000
- E. Rahm, P. Bernstein, On Matching Schemas Automatically, VLDB Journal, Dec. 01
- J. Madhavan, P. Bernstein, E. Rahm, Generic Schema Matching with Cupid, VLDB 2001
- S. Alagic, P. Bernstein, A Model Theory for Generic Schema Management, DBPL 2001