ON SCHEMA DISCOVERY ICDM Renée J. Miller

1 ON SCHEMA DISCOVERY ICDM 2011 Renée J. Miller

2 What are Schemas?
- Schema: from the Greek "σχήμα", meaning shape or, more generally, plan
- Structure and constraints the data (should) satisfy
  - Attribute structure
    - Grouping into (nested) tables
  - Constraints
    - Keys, functional dependencies, rules
    - Referential constraints (foreign keys)
    - Domain constraints
    - More general assertions, exclusion constraints, etc.
- Encode some (important) data semantics

3 Where are Schemas Used?
- Enterprise Information
  - Designed, curated, and valuable
  - Used in decision making
  - A single source may be massive
  - Foundation for Business Intelligence/Analytics
- Web/Personal Information
  - Light-weight structure, little/no curation or design
  - Convenient human memory aid
  - Often small, but numerous, data sources
- Spectrum of information

4 Evolving Role of Schemas
- Old: Prescriptive Role
  - Time-invariant portion of data
  - Used to ensure data consistency
- New: Descriptive Role
  - Evolve as data semantics evolves
  - Used to describe, understand, query the data

5 Goals for Talk
- Give an overview of some of our work on schema discovery
  - Robust and well-studied area for enterprise data
- Present some new challenges for modern information
- Unifying theme
  - Leveraging data semantics

6 Outline
- Motivation
- Structure & Constraint Discovery
  - Dependency Discovery
  - Constraint Repair
- Schema Alignment (Data Integration)
  - Schema Mapping Discovery
  - Ontology Alignment Discovery
- Data Alignment (time permitting)
  - New Challenges posed by Linked Open Data
- Conclusions and Open Problems

7 Joint Work
- Series of papers (including, but not exclusively)
  - IEEE DE Bulletin 03, Schema Discovery
  - SIGMOD 04 [Andritsos, Tsaparas, M-]
  - ICDE 06 [Andritsos, Fuxman, M-]
  - SIGMOD 07 [Udrea, Getoor, M-]
  - VLDB 08 [Chiang, M-]
  - ICDE 11 [Chiang, M-]
- Clio [Fagin, Haas, Hernandez, M-, Popa, Velegrakis]
  - Clio: VLDB 00/02 through to the 2009 book chapter
- Data Exchange: [Fagin, Kolaitis, M-, Popa ICDT03, TCS05]
- Fei Chiang graduating Spring 2012

8 What is a Good Schema?
- Traditional answer: minimize redundancy
  - Redundancy can lead to inconsistency
  - BCNF: the left-hand side of every nontrivial functional dependency (FD) is a (super)key
  - "Every attribute shall depend on the key, the whole key, and nothing but the key ... so help me Codd"
- Information-theoretic characterization
  - Given any set of cells (attribute values) in a table, one should not be able to predict the value of another cell
  - If the only constraints allowed are FDs, we can show that BCNF minimizes redundancy [Arenas 06 Dissertation; Kolahi 07 Dissertation on 3NF]
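To make the BCNF condition concrete, here is a minimal sketch that tests an FD and a candidate superkey against a single table instance. The `orders` table and the helper names `fd_holds` and `is_superkey` are illustrative assumptions, not part of the talk, and an instance-level check can only give evidence about a schema-level property.

```python
def fd_holds(rows, lhs, rhs):
    """True if any two rows that agree on lhs also agree on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores the first rhs value seen for this lhs value
        if seen.setdefault(key, val) != val:
            return False
    return True


def is_superkey(rows, attrs):
    """True if no two rows of this instance share the same attrs values."""
    keys = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(keys)) == len(keys)


# Hypothetical table: the customer's city is repeated on every order.
orders = [
    {"order_id": 1, "customer": "Ann", "city": "Toronto"},
    {"order_id": 2, "customer": "Ann", "city": "Toronto"},
    {"order_id": 3, "customer": "Bob", "city": "Ottawa"},
]

customer_determines_city = fd_holds(orders, ["customer"], ["city"])
customer_is_superkey = is_superkey(orders, ["customer"])
# An FD whose left-hand side is not a superkey signals redundancy (a BCNF violation).
bcnf_violation = customer_determines_city and not customer_is_superkey
```

In this instance the city of a repeat customer can be predicted from another row, which is exactly the kind of redundancy the information-theoretic characterization rules out.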

9 Finding Good Schemas?
- Today, can no longer assume data was designed
  - A table may contain more than one type of entity
  - May be result of integration or information extraction
  - May not have constraints enforced on it
- Violations of constraints
  - May be errors
  - May be due to evolving (unknown) data semantics
- Need to discover/maintain schema & constraints

10 Legacy Relations

| OiD | Type | Category | Start | End | SalesAssc | Time |
|---|---|---|---|---|---|---|
| 123 | Voice | Unlimited | 1/20/10 | 1/20/11 | Fred10 | 10:17:… |
| … | Data | Unlimited | 4/20/05 | 3/20/09 | Pat11 | 23:15:… |
| … | Data | Weekend | 4/20/05 | 4/20/06 | Pat11 | 01:01:… |
| … | Voice | 120HR | 2/20/08 | 2/20/10 | Fred10 | 23:15:… |
| … | Charger | Motorola | 2/14/10 | NULL | CRM | 10:17:… |
| … | Phone | MotoRZ2 | 12/1/10 | NULL | CRM | 11:15:… |
| … | Phone | BlackBerry | 12/1/10 | NULL | CRM | 12:12:22 |

- Tables & attributes may become overloaded
  - Service orders and product orders

11 Information Theoretic Clustering
- We use LIMBO, a scalable algorithm that clusters categorical data [Andritsos, M-, Tsaparas SIGMOD04]
- Idea: compress tuples, T, into a clustering C so that the information preserved about the attribute values is maximized [Slonim, Tishby NIPS99]
- Naturally finds (groups) redundant values
- LIMBO builds compact summaries that represent the data
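The "lose as little information as possible" criterion can be sketched with the merge cost used in agglomerative information bottleneck clustering: the loss from merging two clusters is their combined weight times the weighted Jensen-Shannon divergence of their attribute-value distributions. This is a simplified illustration, not LIMBO itself (it omits LIMBO's summary structures). It assumes each tuple is a uniform distribution over its attribute values, and the function names and the three sample tuples are hypothetical.

```python
import math


def kl(p, q):
    """Kullback-Leibler divergence in bits; q must cover the support of p."""
    return sum(pi * math.log2(pi / q[k]) for k, pi in p.items() if pi > 0)


def merge_cost(w1, p1, w2, p2):
    """Information loss from merging clusters with weights w1, w2 and
    value distributions p1, p2 (dicts mapping value -> probability)."""
    w = w1 + w2
    values = set(p1) | set(p2)
    merged = {v: (w1 * p1.get(v, 0.0) + w2 * p2.get(v, 0.0)) / w for v in values}
    js = (w1 / w) * kl(p1, merged) + (w2 / w) * kl(p2, merged)
    return w * js


def as_distribution(values):
    """Treat a tuple as a uniform distribution over its attribute values."""
    return {v: 1.0 / len(values) for v in values}


# Three tuples described by (Type, Category, SalesAssc) values.
t1 = as_distribution(["Data", "Unlimited", "Pat11"])
t2 = as_distribution(["Data", "Weekend", "Pat11"])
t3 = as_distribution(["Phone", "BlackBerry", "CRM"])

weight = 1.0 / 3.0  # each tuple is equally likely
cost_service_pair = merge_cost(weight, t1, weight, t2)
cost_mixed_pair = merge_cost(weight, t1, weight, t3)
```

Merging the two service orders costs less than merging a service order with a product order, so a greedy agglomerative pass would group the service orders first.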

12 Finding good horizontal decompositions
- [Table: same order table as on slide 10]
- Merge tuples to lose as little information about attribute values as possible

13 Applications
- Horizontally decompose legacy tables
  - Group tuples into semantically meaningful types [Andritsos, M-, Tsaparas SIGMOD04]
- Find potential duplicate tuples in data
  - Entity identification in relational data [Andritsos, Fuxman, M- ICDE06]
- Create probabilistic DBs from duplicated data
  - Compute meaningful probabilities that reflect duplication in data [Hassanzadeh, M- VLDB Journal09]

14 Constraint Discovery
- Functional Dependencies (includes Keys)
  - FDEP [Flach, Savnik AI Communications 99], TANE [Huhtala et al. Computing J. 99], FastFDs [Wyss, Giannella, Robertson DaWaK01]
- Inclusion Dependencies (includes Foreign Keys)
  - General Inclusion Deps [Bauckmann et al. ICDE07; De Marchi, Lopes, Petit EDBT09] and many others
  - Foreign Keys [Zhang et al. PVLDB10]
- Mining may give a large number of dependencies
  - Not always intuitive
  - Not all are useful for re-designing the data
  - May be accidental
- Goal: find interesting dependencies efficiently

15 Conditional Rules
- Rules may not hold over entire table
  - FDs assume table represents a single entity type
- Conditional FDs [Maher TCS97, Bohannon et al. ICDE07]
  - Applications to cleaning and understanding data

| Airline | Status | Miles | Seating | Boarding |
|---|---|---|---|---|
| One World | Bronze | 1x | Standard | Standard |
| Meridan | Silver | 1x | Elite | Standard |
| Skyway | Silver | 2x | Preferred | Standard |
| Skyway | Gold | 2x | Elite | First |
| One World | Gold | 3x | Preferred | Priority |
| Skyway | Gold | 2x | Preferred | Priority |
| One World | Silver | 2x | Preferred | Priority |

- Functional Dependency: [Airline, Status] → [Miles]
- Conditional Functional Dep: [Status = Gold, Seating] → [Boarding]
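The difference between the two rules can be checked directly on the slide's airline table. The sketch below assumes a hypothetical helper `fd_holds` with an optional `condition` argument that restricts the check to matching rows. It also assumes the trailing "One World Silver" row belongs to the table and writes "One world" as "One World".

```python
def fd_holds(rows, lhs, rhs, condition=None):
    """Check lhs -> rhs over the rows that satisfy `condition`
    (a dict of attribute -> required constant); no condition means a plain FD."""
    condition = condition or {}
    seen = {}
    for row in rows:
        if any(row[a] != v for a, v in condition.items()):
            continue  # row is outside the scope of the conditional rule
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False
    return True


COLUMNS = ["Airline", "Status", "Miles", "Seating", "Boarding"]
DATA = [
    ("One World", "Bronze", "1x", "Standard", "Standard"),
    ("Meridan", "Silver", "1x", "Elite", "Standard"),
    ("Skyway", "Silver", "2x", "Preferred", "Standard"),
    ("Skyway", "Gold", "2x", "Elite", "First"),
    ("One World", "Gold", "3x", "Preferred", "Priority"),
    ("Skyway", "Gold", "2x", "Preferred", "Priority"),
    ("One World", "Silver", "2x", "Preferred", "Priority"),
]
rows = [dict(zip(COLUMNS, t)) for t in DATA]

plain_fd = fd_holds(rows, ["Airline", "Status"], ["Miles"])
unconditional = fd_holds(rows, ["Seating"], ["Boarding"])
conditional = fd_holds(rows, ["Seating"], ["Boarding"], condition={"Status": "Gold"})
```

Here `Seating → Boarding` fails over the whole table (Elite seating maps to both Standard and First boarding) but holds once the rows are restricted to `Status = Gold`.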

16 Discovery of Conditional FDs
- Modification of TANE search to consider conditions [Chiang, M- VLDB08]
- Standard measures of rule interest
  - Support, Entropy, Conviction, ...
- Find approximate conditional FDs and deviants (data that form candidates for cleaning)

17 Inconsistent Data
- Data may become inconsistent with constraints
- Options
  - Drop constraints and discover new constraints that fit
  - Assume constraints are correct and repair data [Bohannon et al. SIGMOD05], [Cong et al. VLDB07], [Kolahi, Lakshmanan ICDT09]
    - For FD: X → Y, find minimal-cost changes (e.g., edit distance) to the Y values
  - Repair data and constraints to fit [Chiang, M- ICDE11]

18 Data & Constraint Repair

| District | Region | Municipal | AC | Street | City | Prov | PCode |
|---|---|---|---|---|---|---|---|
| Brook | Granville | Glendale | 412 | Roslin | Toronto | ON | M4N 1Y3 |
| Brook | Granville | Glendale | 412 | Roslin | Toronto | OH | M4N 1Y3 |
| Brook | Granville | Guildwood | 553 | Sidney | Belleville | ON | K8P 3Y9 |
| Brook | Granville | Guildwood | 553 | Sidney | Belleville | ON | K8P 1J7 |
| Fife | Parkhill | Moore | 725 | Poth | Dundee | ON | NOB 2E0 |
| Fife | Parkhill | Moore | 725 | Roseville | Dundee | ON | NOB 2E0 |
| Fife | Parkhill | Napa | 228 | Roslin | Toronto | ON | M4N 1Y3 |

- F1: [District, Region] → [AC] (the slide annotates F1 with Municipal)
- F2: [PCode] → [City, Prov]
- 1) Repair the data or the constraints?
- 2) How to find the repairs?
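The two questions on this slide can be explored by listing the violating groups for each FD. The sketch below uses a hypothetical helper `fd_violations` on the table above. It only detects violations and tests one candidate constraint repair (adding Municipal to the left-hand side of F1). It does not implement the cost model of [Chiang, M- ICDE11].

```python
from collections import defaultdict


def fd_violations(rows, lhs, rhs):
    """Map each lhs value that has more than one distinct rhs value to those values."""
    groups = defaultdict(set)
    for row in rows:
        groups[tuple(row[a] for a in lhs)].add(tuple(row[a] for a in rhs))
    return {k: v for k, v in groups.items() if len(v) > 1}


COLUMNS = ["District", "Region", "Municipal", "AC", "Street", "City", "Prov", "PCode"]
DATA = [
    ("Brook", "Granville", "Glendale", "412", "Roslin", "Toronto", "ON", "M4N 1Y3"),
    ("Brook", "Granville", "Glendale", "412", "Roslin", "Toronto", "OH", "M4N 1Y3"),
    ("Brook", "Granville", "Guildwood", "553", "Sidney", "Belleville", "ON", "K8P 3Y9"),
    ("Brook", "Granville", "Guildwood", "553", "Sidney", "Belleville", "ON", "K8P 1J7"),
    ("Fife", "Parkhill", "Moore", "725", "Poth", "Dundee", "ON", "NOB 2E0"),
    ("Fife", "Parkhill", "Moore", "725", "Roseville", "Dundee", "ON", "NOB 2E0"),
    ("Fife", "Parkhill", "Napa", "228", "Roslin", "Toronto", "ON", "M4N 1Y3"),
]
rows = [dict(zip(COLUMNS, t)) for t in DATA]

f1 = fd_violations(rows, ["District", "Region"], ["AC"])
f1_with_municipal = fd_violations(rows, ["District", "Region", "Municipal"], ["AC"])
f2 = fd_violations(rows, ["PCode"], ["City", "Prov"])
```

On this instance F1 is violated by every (District, Region) group and the violations disappear once Municipal is added, which suggests a constraint repair. F2 is violated only by the single "OH" value, which looks more like a data error.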

19 Key Ideas
- Use variance of information to select attributes for doing constraint repair
  - An attribute with no variance of information with respect to the inconsistent data is a perfect repair
- Unified cost model for comparing Constraint Repair with Data Repair

20 Summary: Constraint Discovery
- Goal is to discover or repair constraints to find an accurate model of the data
- Can be used to
  - Query and understand data
  - Maintain data consistency
  - Clean data or correct data entry errors
  - Perform semantic query optimization

21 Outline
- Motivation
- Constraint Discovery
  - Dependency Discovery
  - Constraint Repair
- Schema Alignment (Data Integration)
  - Schema Mapping Discovery
  - Ontology Alignment Discovery
- Data Alignment
  - New Challenges posed by Linked Open Data
- Conclusions and Open Problems

22 Discovery of Schema Mappings
- [Diagram: schema S (instance I_S, query Q_S) and schema T (instance I_T, query Q_T) related by mapping M; M + I_S yields I_T, and M + Q_T yields Q_S]
- Schema mappings are declarative specifications that describe the relationship between a source schema S and a target schema T
- Key to both
  - Data exchange (creating I_T), aka Data Warehousing
  - Query rewriting (creating Q_S), aka Data Federation

23 Mapping example
- Source: P(A, B, C) with tuple <a, b, c>; Q(D, E, F) with tuple <d, e, f>; R(G, H, I) with tuple <g, h, i>
- Target: T(A, E, I)
- Mapping (slide label: Referential Constraint)
  - P(a,b,c) → ∃Y ∃Z T(a,Y,Z)
  - Q(d,e,f) → ∃X ∃U T(X,e,U)
  - R(g,h,i) → ∃V ∃W T(V,W,i)
- Candidate target instances
  - J1 = {(a, e, i)}
  - J = {(a, Y0, Z0), (X0, e, U0), (V0, W0, i)}
  - J2 = {(a, e, Z1), (V1, W1, i)}
- There may be many solutions for T (J, J1, J2, etc.)
- However, J seems to be more general
  - X0, Y0, Z0 represent unknown values (or "nulls")
  - Intuitively, J1 and J2 have extra information

24 Mapping example
- Source: Emp(Name, B, C) with tuple <Pat, b, c>; Worksin(D, Salary, F) with tuple <d, 100K, f>; Dept(G, H, Bonus) with tuple <g, h, .10>
- Target: Employee(A, E, I)
- Candidate target instances
  - J1 = {(Pat, 100K, .10)}
  - J = {(Pat, Y0, Z0), (X0, 100K, U0), (V0, W0, .10)}
  - J2 = {(Pat, 100K, Z1), (V1, W1, .10)}
- J1 and J2 assert extra information: Pat's salary is 100K
  - This is not required by source or mapping
- Homomorphisms h1: J → J1 and h2: J → J2
  - h1 = {Y0 → 100K, Z0 → .10}
  - Constants: h(c) = c
  - For each tuple (X, Y, Z) of J, (h(X), h(Y), h(Z)) is a tuple of J1

25 Universal Solutions
- Given a data exchange setting (S, T, M, Σt), where Σt are constraints on the target T, and a source instance I, a universal solution is a target instance J such that:
  - J is a solution for I
  - For all solutions J' for I, there is a homomorphism h: J → J'
- For the example, J is a universal solution
  - There are homomorphisms h1: J → J1 and h2: J → J2
  - There are no homomorphisms from J1 or J2 to J
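The homomorphism claims for the running example can be verified by brute force. The sketch below assumes a single target relation T, marks labeled nulls with a leading "?" (a convention chosen here, not taken from the talk), and uses a hypothetical function `find_homomorphism`. It is exponential in the worst case and is meant only for tiny instances like these.

```python
def is_null(value):
    """Labeled nulls are strings starting with '?'; everything else is a constant."""
    return isinstance(value, str) and value.startswith("?")


def find_homomorphism(src, dst):
    """Return a mapping of the nulls of src that sends every src tuple to some
    dst tuple while keeping constants fixed, or None if no such mapping exists."""
    src = list(src)

    def extend(i, h):
        if i == len(src):
            return h
        for candidate in dst:
            h2 = dict(h)
            ok = True
            for s, d in zip(src[i], candidate):
                if is_null(s):
                    if h2.setdefault(s, d) != d:
                        ok = False  # null already mapped to a different value
                        break
                elif s != d:
                    ok = False  # constants must map to themselves
                    break
            if ok:
                result = extend(i + 1, h2)
                if result is not None:
                    return result
        return None

    return extend(0, {})


J = [("a", "?Y0", "?Z0"), ("?X0", "e", "?U0"), ("?V0", "?W0", "i")]
J1 = [("a", "e", "i")]
J2 = [("a", "e", "?Z1"), ("?V1", "?W1", "i")]

h1 = find_homomorphism(J, J1)
h2 = find_homomorphism(J, J2)
back_from_j1 = find_homomorphism(J1, J)
back_from_j2 = find_homomorphism(J2, J)
```

The search finds mappings from J into both J1 and J2 but none in the other direction, matching the slide's statement that J is the more general instance.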

26 Universal Solutions in Data Exchange
- We introduced the notion of universal solutions as the "best" solutions in data exchange
  - The most general solutions
- Foundational results [Fagin, Kolaitis, M-, Popa, ICDT03, TCS05]
  - Universal solutions are unique up to homomorphic equivalence; they represent the entire solution space
  - The chase procedure produces a universal solution in polynomial time
  - The certain answers of target conjunctive queries can be obtained by evaluation on an arbitrary universal solution; universal solutions are the only solutions with this property

27 Mapping example
- Source: P(A, B, C) with tuple <a, b, c>; Q(D, E, F) with tuple <b, e, f>; R(G, H, I) with tuple <f, h, i>
- Target: T(A, E, I)
- Source constraints can be used to improve mapping
  - ∀d,e,f Q(d,e,f) → ∃a,c P(a,d,c)
  - ∀d,e,f Q(d,e,f) → ∃h,i R(f,h,i)
- M
  - P(a,b,c) → ∃Y ∃Z T(a,Y,Z)
  - Q(d,e,f) → ∃X ∃U T(X,e,U)
  - R(g,h,i) → ∃V ∃W T(V,W,i)
- J is universal solution for M
  - J = {(a, Y0, Z0), (X0, e, U0), (V0, W0, i)}
- M2: P(a,b,c), Q(b,e,f), R(f,h,i) → T(a,e,i)
- J1 is universal solution for M2
  - J1 = {(a, e, i)}
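As a rough sketch of how the two mappings produce different target instances, the code below applies M tuple by tuple with fresh labeled nulls, and applies M2 as a join over the source constraints' shared values. It assumes there are no target constraints, so one pass over the source suffices. The function names, the rule encoding, and the "?N" null naming are illustrative assumptions.

```python
import itertools


def apply_m(source, rules):
    """Apply single-atom source-to-target rules. Each rule is (relation, template),
    where template[j] is the source position copied into target column j,
    or None to invent a fresh labeled null."""
    fresh = itertools.count()
    target = []
    for relation, template in rules:
        for tup in source.get(relation, []):
            n = next(fresh)  # one batch of fresh nulls per rule firing
            target.append(tuple(
                tup[pos] if pos is not None else f"?N{n}_{j}"
                for j, pos in enumerate(template)
            ))
    return target


def apply_m2(source):
    """M2: P(a,b,c), Q(b,e,f), R(f,h,i) -> T(a,e,i), evaluated as a join."""
    return [
        (a, e, i)
        for (a, b, c) in source["P"]
        for (d, e, f) in source["Q"]
        for (g, h, i) in source["R"]
        if b == d and f == g
    ]


source = {
    "P": [("a", "b", "c")],
    "Q": [("b", "e", "f")],
    "R": [("f", "h", "i")],
}
M = [
    ("P", (0, None, None)),   # P(a,b,c) -> T(a, Y, Z)
    ("Q", (None, 1, None)),   # Q(d,e,f) -> T(X, e, U)
    ("R", (None, None, 2)),   # R(g,h,i) -> T(V, W, i)
]

J = apply_m(source, M)
J1 = apply_m2(source)
```

M yields three target tuples padded with six distinct nulls, while M2 uses the join paths implied by the source constraints to yield the single tuple (a, e, i).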

28 Mapping example
- Source: Emp(Name, Eid, Addr) with tuple <Pat, E1, NY>; Worksin(Eid, Salary, Did) with tuple <E1, 100K, D1>; Dept(Did, Dname, Bonus) with tuple <D1, CS, .10>
- Target: Employee(Name, Salary, Bonus)
- If the meaning of the Employee table coincides with the WorksIn relationship, then the mapping should be:
  - Emp(Name,Eid,Addr), Worksin(Eid,Salary,Did), Dept(Did,Dname,Bonus) → Employee(Name,Salary,Bonus)
- J1 is best (universal) solution
  - J1 = {(Pat, 100K, .10)}

29 Schema Mapping Discovery
- Use chase (logical inference) to infer connections in source and target schemas
  - Potential semantic relationships
- Guide user in selecting correct semantic relationships to use in mapping
- Transformation code generation
  - SQL, XQuery, XSLT transforms, ...
  - Include (Skolem) terms to generate labeled nulls
- Technology in several IBM product lines, including IBM's InfoSphere Data Architect
- [M-, Haas, Hernandez VLDB00] through to [Fagin+ 2009]

30 Schema Mapping Summary
- Schema mapping leverages
  - Statistical inference to infer attribute matches
  - Logical inference to give matches a semantic interpretation
    - As inter-schema constraints
- Can we extend this idea to a closer integration of approaches?

31 Ontology Mapping
- [Figure: two example ontologies annotated with the axioms listed on slide 35]

32-34 Example OWL Lite Ontologies
- An entity can be a:
  - Class
  - Instance
  - Property

35 Example OWL Lite Ontologies
- Axioms
  - (discoveredby, owl:inverseof, discoverer)
  - (discoveredby, owl:type, owl:functionalproperty)
  - (associatedwith, owl:type, owl:transitiveproperty)
  - (resultsfrom, rdfs:subpropertyof, associatedwith)

36 Computing Similarity
- sim_lexical: Jaro-Winkler and WordNet
- sim_structural: Jaccard for neighborhoods
- sim_extensional: Jaccard on extensions
- Standard used in schema/ontology matchers [COMA++ & others]
- Parameters λ_x, λ_s, λ_e differ for classes, instances and properties
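A minimal sketch of how the three measures might be combined, assuming a weighted sum with weights λ_x, λ_s, λ_e (the slide names the parameters but not the combination formula). The lexical measure here is `difflib.SequenceMatcher` from the Python standard library, used only as a stand-in for Jaro-Winkler and WordNet. The entity dictionaries and the weights are made-up values.

```python
from difflib import SequenceMatcher


def jaccard(a, b):
    """Jaccard coefficient of two collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def lexical(name1, name2):
    """Stand-in lexical similarity in [0, 1] (not Jaro-Winkler)."""
    return SequenceMatcher(None, name1.lower(), name2.lower()).ratio()


def combined_similarity(e1, e2, lam_x, lam_s, lam_e):
    """Weighted sum of lexical, structural and extensional similarity (assumed form)."""
    return (
        lam_x * lexical(e1["name"], e2["name"])
        + lam_s * jaccard(e1["neighbors"], e2["neighbors"])
        + lam_e * jaccard(e1["extension"], e2["extension"])
    )


# Hypothetical class entities from two ontologies.
c1 = {"name": "E-ColiPoisoning", "neighbors": {"Disease", "Bacteria"}, "extension": {"case1", "case2"}}
c2 = {"name": "E-Coli", "neighbors": {"Disease", "Infection"}, "extension": {"case2", "case3"}}

score = combined_similarity(c1, c2, lam_x=0.5, lam_s=0.25, lam_e=0.25)
```

With weights that sum to one the score stays in [0, 1]. Separate weight triples for classes, instances and properties would be passed in per entity kind.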

37 Similarity

38 Does Similarity agree with Semantics?

39 Performing logical inference

40 Performing logical inference
- Candidate: (E-ColiPoisoning, owl:sameas, E-Coli)
- Consequence: (TheodorEscherich, owl:sameas, T.S. Escherich) is a logical consequence of the candidate

41 Combining Evidence
- If logical consequence(s) are similar, then increase similarity of candidate by inference similarity
- Candidate: pair of entities (ci, cj) asserted to be same
  - (E-coli Poisoning same-as E-coli), similarity 0.5
- Consequence: pair of entities (ei, ej) inferred to be same
  - (Theodore same-as T. S.), similarity 0.6
- Inference similarity
  - Greater than one if consequences are similar
  - Less than one otherwise
- Multiply candidate similarity by inference similarity
  - (E-coli Poisoning same-as E-coli): 0.5 * 1.5 = 0.75
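The update rule on this slide is a single multiplication, sketched below. The inference similarity is taken as a given input because the slide does not say how it is derived from the consequence similarities. The cap at 1.0 and the function name are assumptions added here.

```python
def adjust_similarity(candidate_sim, inference_sim, cap=1.0):
    """Scale a candidate's similarity by the inference similarity.
    Values above 1 reward candidates whose logical consequences look similar;
    values below 1 penalize them. The cap is an assumption, not from the slide."""
    return min(cap, candidate_sim * inference_sim)


# Slide example: candidate similarity 0.5, inference similarity 1.5.
boosted = adjust_similarity(0.5, 1.5)
# A candidate whose consequences look dissimilar is pushed down instead.
penalized = adjust_similarity(0.5, 0.8)
```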

42 Experimental Framework
- ILIADS-tailored uses best set of parameters for each pair of ontologies
- ILIADS-fixed uses one set of parameters for all pairs of ontologies
- FCA-merge [Stumme and Maedche, IJCAI 2001] uses formal concept analysis and an external document corpus
- COMA++ [Aumueller et al., SIGMOD 2005] implements multiple match strategies, including a robust collection of similarity functions
- 30 pairs of real-world ontologies (up to 20,000 triples)
  - From a variety of domains: medical, geographical, economic, biological
  - Ground truth provided by human reviewers

43 Precision/recall for ontologies with substantial instance data

44 Ontology Mapping Summary
- Use logical constraints in schemas/ontologies
- Able to improve recall substantially over standard statistical inference techniques that are based on lexical, structural, semantic similarity
- Introduced notion of inference similarity [Udrea, Getoor, M- SIGMOD07]

45 Outline
- Motivation
- Constraint Discovery
  - Dependency Discovery
  - Constraint Repair
- Schema Alignment (Data Integration)
  - Schema Mapping Discovery
  - Ontology Alignment Discovery
- Data Alignment
  - New Challenges posed by Linked Open Data
- Conclusions and Open Problems

46 Vision: reconciling data
- Linked Open Data
  - Publish Web data with useful semantic information
- LinQuer
  - Scalable, declarative (native DBMS) support for syntactic and semantic data matching
  - So far leveraging domain constraints (and semantics of domains), but can we do more?
  - Applications to Linked Clinical Trials
- Work of Oktie Hassanzadeh [Hassanzadeh et al. CIKM09], starting soon at IBM Watson

47 Conclusions
- Schema Discovery
  - Need to be able to discover and maintain structural and semantic information
  - Role in prescribing and enforcing data consistency
  - Role in cleaning, querying, and understanding data
- More flexible view of schemas
  - Support what-if style analysis
    - Postulate constraints that match your model of a domain
    - DBMS gives answers that are consistent with constraints
  - "If I believe that Customers are uniquely identified by their phone number, zip code, and last name, then how many customers do I have?"
  - [Fuxman 2008 ACM SIGMOD Jim Gray Dissertation Award] [Fuxman, Fazli, M- SIGMOD05]


Data about data is database Select correct option: True False Partially True None of the Above Within a table, each primary key value. is a minimal super key is always the first field in each table must be numeric must be unique Foreign Key is A field in a table that matches a key field in another

More information

Unit 3 : Relational Database Design

Unit 3 : Relational Database Design Unit 3 : Relational Database Design Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Content Relational Model: Basic concepts, Attributes and Domains, CODD's Rules, Relational

More information

Informal Design Guidelines for Relational Databases

Informal Design Guidelines for Relational Databases Outline Informal Design Guidelines for Relational Databases Semantics of the Relation Attributes Redundant Information in Tuples and Update Anomalies Null Values in Tuples Spurious Tuples Functional Dependencies

More information

CS 338 Functional Dependencies

CS 338 Functional Dependencies CS 338 Functional Dependencies Bojana Bislimovska Winter 2016 Outline Design Guidelines for Relation Schemas Functional Dependency Set and Attribute Closure Schema Decomposition Boyce-Codd Normal Form

More information

Overview. Data-mining. Commercial & Scientific Applications. Ongoing Research Activities. From Research to Technology Transfer

Overview. Data-mining. Commercial & Scientific Applications. Ongoing Research Activities. From Research to Technology Transfer Data Mining George Karypis Department of Computer Science Digital Technology Center University of Minnesota, Minneapolis, USA. http://www.cs.umn.edu/~karypis karypis@cs.umn.edu Overview Data-mining What

More information

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe CHAPTER 14 Basics of Functional Dependencies and Normalization for Relational Databases Slide 14-2 Chapter Outline 1 Informal Design Guidelines for Relational Databases 1.1 Semantics of the Relation Attributes

More information

The interaction of theory and practice in database research

The interaction of theory and practice in database research The interaction of theory and practice in database research Ron Fagin IBM Research Almaden 1 Purpose of This Talk Encourage collaboration between theoreticians and system builders via two case studies

More information

Schema Refinement: Dependencies and Normal Forms

Schema Refinement: Dependencies and Normal Forms Schema Refinement: Dependencies and Normal Forms Grant Weddell Cheriton School of Computer Science University of Waterloo CS 348 Introduction to Database Management Spring 2016 CS 348 (Intro to DB Mgmt)

More information

DATABASE MANAGEMENT SYSTEM SHORT QUESTIONS. QUESTION 1: What is database?

DATABASE MANAGEMENT SYSTEM SHORT QUESTIONS. QUESTION 1: What is database? DATABASE MANAGEMENT SYSTEM SHORT QUESTIONS Complete book short Answer Question.. QUESTION 1: What is database? A database is a logically coherent collection of data with some inherent meaning, representing

More information

From ER Diagrams to the Relational Model. Rose-Hulman Institute of Technology Curt Clifton

From ER Diagrams to the Relational Model. Rose-Hulman Institute of Technology Curt Clifton From ER Diagrams to the Relational Model Rose-Hulman Institute of Technology Curt Clifton Review Entity Sets and Attributes Entity set: collection of things in the DB Attribute: property of an entity calories

More information

Functional Dependencies & Normalization for Relational DBs. Truong Tuan Anh CSE-HCMUT

Functional Dependencies & Normalization for Relational DBs. Truong Tuan Anh CSE-HCMUT Functional Dependencies & Normalization for Relational DBs Truong Tuan Anh CSE-HCMUT 1 2 Contents 1 Introduction 2 Functional dependencies (FDs) 3 Normalization 4 Relational database schema design algorithms

More information

Schema Refinement and Normal Forms

Schema Refinement and Normal Forms Schema Refinement and Normal Forms Chapter 19 Quiz #2 Next Wednesday Comp 521 Files and Databases Fall 2010 1 The Evils of Redundancy Redundancy is at the root of several problems associated with relational

More information

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe CHAPTER 26 Enhanced Data Models: Introduction to Active, Temporal, Spatial, Multimedia, and Deductive Databases 26.1 Active Database Concepts and Triggers Database systems implement rules that specify

More information

The Relational Model

The Relational Model The Relational Model Grant Weddell David R. Cheriton School of Computer Science University of Waterloo CS 348 Introduction to Database Management Spring 2012 CS 348 (Intro to DB Mgmt) Relational Model

More information

Conceptual Design. The Entity-Relationship (ER) Model

Conceptual Design. The Entity-Relationship (ER) Model Conceptual Design. The Entity-Relationship (ER) Model CS430/630 Lecture 12 Slides based on Database Management Systems 3 rd ed, Ramakrishnan and Gehrke Database Design Overview Conceptual design The Entity-Relationship

More information

The Relational Model. Outline. Why Study the Relational Model? Faloutsos SCS object-relational model

The Relational Model. Outline. Why Study the Relational Model? Faloutsos SCS object-relational model The Relational Model CMU SCS 15-415 C. Faloutsos Lecture #3 R & G, Chap. 3 Outline Introduction Integrity constraints (IC) Enforcing IC Querying Relational Data ER to tables Intro to Views Destroying/altering

More information

Graph Databases. Guilherme Fetter Damasio. University of Ontario Institute of Technology and IBM Centre for Advanced Studies IBM Corporation

Graph Databases. Guilherme Fetter Damasio. University of Ontario Institute of Technology and IBM Centre for Advanced Studies IBM Corporation Graph Databases Guilherme Fetter Damasio University of Ontario Institute of Technology and IBM Centre for Advanced Studies Outline Introduction Relational Database Graph Database Our Research 2 Introduction

More information

Data Mining. Data preprocessing. Hamid Beigy. Sharif University of Technology. Fall 1394

Data Mining. Data preprocessing. Hamid Beigy. Sharif University of Technology. Fall 1394 Data Mining Data preprocessing Hamid Beigy Sharif University of Technology Fall 1394 Hamid Beigy (Sharif University of Technology) Data Mining Fall 1394 1 / 15 Table of contents 1 Introduction 2 Data preprocessing

More information

Relational Design Theory. Relational Design Theory. Example. Example. A badly designed schema can result in several anomalies.

Relational Design Theory. Relational Design Theory. Example. Example. A badly designed schema can result in several anomalies. Relational Design Theory Relational Design Theory A badly designed schema can result in several anomalies Update-Anomalies: If we modify a single fact, we have to change several tuples Insert-Anomalies:

More information

Relational Database Design (II)

Relational Database Design (II) Relational Database Design (II) 1 Roadmap of This Lecture Algorithms for Functional Dependencies (cont d) Decomposition Using Multi-valued Dependencies More Normal Form Database-Design Process Modeling

More information

CS425 Fall 2016 Boris Glavic Chapter 1: Introduction

CS425 Fall 2016 Boris Glavic Chapter 1: Introduction CS425 Fall 2016 Boris Glavic Chapter 1: Introduction Modified from: Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Textbook: Chapter 1 1.2 Database Management System (DBMS)

More information

Chapter 10. Normalization. Chapter Outline. Chapter Outline(contd.)

Chapter 10. Normalization. Chapter Outline. Chapter Outline(contd.) Chapter 10 Normalization Chapter Outline 1 Informal Design Guidelines for Relational Databases 1.1Semantics of the Relation Attributes 1.2 Redundant Information in Tuples and Update Anomalies 1.3 Null

More information

Distributed Database Systems By Syed Bakhtawar Shah Abid Lecturer in Computer Science

Distributed Database Systems By Syed Bakhtawar Shah Abid Lecturer in Computer Science Distributed Database Systems By Syed Bakhtawar Shah Abid Lecturer in Computer Science 1 Distributed Database Systems Basic concepts and Definitions Data Collection of facts and figures concerning an object

More information

Database Design and Tuning

Database Design and Tuning Database Design and Tuning Chapter 20 Comp 521 Files and Databases Spring 2010 1 Overview After ER design, schema refinement, and the definition of views, we have the conceptual and external schemas for

More information

The Relational Model. Why Study the Relational Model? Relational Database: Definitions

The Relational Model. Why Study the Relational Model? Relational Database: Definitions The Relational Model Database Management Systems, R. Ramakrishnan and J. Gehrke 1 Why Study the Relational Model? Most widely used model. Vendors: IBM, Microsoft, Oracle, Sybase, etc. Legacy systems in

More information

Database Management

Database Management Database Management - 2011 Model Answers 1. a. A data model should comprise a structural part, an integrity part and a manipulative part. The relational model provides standard definitions for all three

More information

CPSC 421 Database Management Systems. Lecture 19: Physical Database Design Concurrency Control and Recovery

CPSC 421 Database Management Systems. Lecture 19: Physical Database Design Concurrency Control and Recovery CPSC 421 Database Management Systems Lecture 19: Physical Database Design Concurrency Control and Recovery * Some material adapted from R. Ramakrishnan, L. Delcambre, and B. Ludaescher Agenda Physical

More information

A Deeper Look at Data Modeling. Shan-Hung Wu & DataLab CS, NTHU

A Deeper Look at Data Modeling. Shan-Hung Wu & DataLab CS, NTHU A Deeper Look at Data Modeling Shan-Hung Wu & DataLab CS, NTHU Outline More about ER & Relational Models Weak Entities Inheritance Avoiding redundancy & inconsistency Functional Dependencies Normal Forms

More information

Applied Databases. Sebastian Maneth. Lecture 5 ER Model, normal forms. University of Edinburgh - January 25 th, 2016

Applied Databases. Sebastian Maneth. Lecture 5 ER Model, normal forms. University of Edinburgh - January 25 th, 2016 Applied Databases Lecture 5 ER Model, normal forms Sebastian Maneth University of Edinburgh - January 25 th, 2016 Outline 2 1. Entity Relationship Model 2. Normal Forms Keys and Superkeys 3 Superkey =

More information

Relational Database design. Slides By: Shree Jaswal

Relational Database design. Slides By: Shree Jaswal Relational Database design Slides By: Shree Jaswal Topics: Design guidelines for relational schema, Functional Dependencies, Definition of Normal Forms- 1NF, 2NF, 3NF, BCNF, Converting Relational Schema

More information

Chapter 14. Database Design Theory: Introduction to Normalization Using Functional and Multivalued Dependencies

Chapter 14. Database Design Theory: Introduction to Normalization Using Functional and Multivalued Dependencies Chapter 14 Database Design Theory: Introduction to Normalization Using Functional and Multivalued Dependencies Copyright 2012 Ramez Elmasri and Shamkant B. Navathe Chapter Outline 1 Informal Design Guidelines

More information

ERBlox: Combining Matching Dependencies with Machine Learning for Entity Resolution Leopoldo Bertossi Carleton University School of Computer Science Institute for Data Science Ottawa, Canada bertossi@scs.carleton.ca

More information

Chapter 1: Introduction

Chapter 1: Introduction Chapter 1: Introduction Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Outline The Need for Databases Data Models Relational Databases Database Design Storage Manager Query

More information

Chapter 4. The Relational Model

Chapter 4. The Relational Model Chapter 4 The Relational Model Chapter 4 - Objectives Terminology of relational model. How tables are used to represent data. Connection between mathematical relations and relations in the relational model.

More information

Chapter 1: Introduction. Chapter 1: Introduction

Chapter 1: Introduction. Chapter 1: Introduction Chapter 1: Introduction Database System Concepts, 5th Ed. See www.db-book.com for conditions on re-use Chapter 1: Introduction Purpose of Database Systems View of Data Database Languages Relational Databases

More information

Normalization. Murali Mani. What and Why Normalization? To remove potential redundancy in design

Normalization. Murali Mani. What and Why Normalization? To remove potential redundancy in design 1 Normalization What and Why Normalization? To remove potential redundancy in design Redundancy causes several anomalies: insert, delete and update Normalization uses concept of dependencies Functional

More information

Elmasri/Navathe, Fundamentals of Database Systems, Fourth Edition Chapter 10-2

Elmasri/Navathe, Fundamentals of Database Systems, Fourth Edition Chapter 10-2 Elmasri/Navathe, Fundamentals of Database Systems, Fourth Edition Chapter 10-2 Chapter Outline 1 Informal Design Guidelines for Relational Databases 1.1Semantics of the Relation Attributes 1.2 Redundant

More information

CSE 544 Principles of Database Management Systems. Alvin Cheung Fall 2015 Lecture 8 - Data Warehousing and Column Stores

CSE 544 Principles of Database Management Systems. Alvin Cheung Fall 2015 Lecture 8 - Data Warehousing and Column Stores CSE 544 Principles of Database Management Systems Alvin Cheung Fall 2015 Lecture 8 - Data Warehousing and Column Stores Announcements Shumo office hours change See website for details HW2 due next Thurs

More information

MaanavaN.Com DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING QUESTION BANK

MaanavaN.Com DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING QUESTION BANK CS1301 DATABASE MANAGEMENT SYSTEM DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING QUESTION BANK Sub code / Subject: CS1301 / DBMS Year/Sem : III / V UNIT I INTRODUCTION AND CONCEPTUAL MODELLING 1. Define

More information
