Model Management and Schema Mappings: Theory and Practice (Part II)
|
|
- Victoria McCormick
- 5 years ago
- Views:
Transcription
1 IBM Research Model Management and Schema Mappings: Theory and Practice (Part II) Howard Ho IBM Almaden Research Center For VLDB07 Tutorial with Phil Bernstein, Microsoft Research 2006 IBM Corporation
2 The Integration Challenge Complex and heterogeneous environments Many different types of systems Many inter-related applications Escalating needs Variety, velocity, volume People are expensive 1
3 Outline Clio: Basic Features of a Schema Mapping System Schema Matching Schema Mapping Query Generation IBM Rational Data Architect Product MAUI: Advanced Features of a Schema Mapping System Nested Mapping Model Mapping-Based XML Transformation Engine Schema Integration Schema Evolution Clio2010: Mapping-Based Authoring of Data Flows ETL: Mapping ETL Conversion Web-Service Composition: Mapping for web-service data sources Mashups: Mapping Mashup Generation Reuse: Mapping Polymorphism Conclusions and Future Directions 2
4 Clio: A Schema Mapping System XML Schema DTD Relational User mapping Mapping Generation Wants data from S Understands T May not understand S Source schema S conforms to Logical mapping (internal) Xformation Query Generation Target schema T conforms to data Low-level mapping (SQL, SQL/XML, XQuery, XSLT and Java code) Logical mappings can be used for both target materialization or query rewriting 3
5 Major Features (and Challenges) Schemas can be arbitrarily different Element correspondences Human friendly Automatic discovery Support Nested Structures Nested Relational Model Nested Constraints Produce Correct Grouping Preserve data meaning Discover associations Use constraints & schema Create New Target Values and 4
6 Generate Transformation Queries (XQuery) <?xml version="1.0" encoding="utf-8"?> <statisticsdb> <citystatistics> <city/>, distinct ( FOR $x0 IN $doc/expensedb/grant, $x1 IN $doc/expensedb/company WHERE $x1/cid/text() = $x0/cid/text() RETURN <organization> <orgid> $x0/cid/text() </orgid>, <oname> $x1/cname/text() </oname>, distinct ( FOR $x0l1 IN $doc/expensedb/grant, $x1l1 IN $doc/expensedb/company WHERE $x1l1/cid/text() = $x0l1/cid/text() AND $x1/cname/text() = $x1l1/cname/text() AND $x0/cid/text() = $x0l1/cid/text() RETURN <funding> <fid> "Sk35(", $x0l1/amount/text(), ", ", $x1l1/cname/text(), ", ", $x0l1/cid/text(), ")" </fid>, <proj> "Sk36(", $x0l1/amount/text(), ", ", $x1l1/cname/text(), ", ", $x0l1/cid/text(), ")" </proj>, <aid> "Sk32(", $x0l1/amount/text(), ", ", $x1l1/cname/text(), ", ", $x0l1/cid/text(), ")" </aid> </funding> ) </organization> ), distinct ( FOR $x0 IN $doc/expensedb/grant, $x1 IN $doc/expensedb/company WHERE $x1/cid/text() = $x0/cid/text() RETURN <financial> <aid> "Sk32(", $x0/amount/text(), ", ", $x1/cname/text(), ", ", $x0/cid/text(), ")" </aid>, <amount> $x0/amount/text() </amount> </financial> ) </citystatistics> </statisticsdb> 5
7 Generate Transformation Scripts (XSLT) <?xml version="1.0" encoding="utf-8"?> <xsl:stylesheet version="1.0" xmlns:xsl=" <xsl:output method="xml" version="1.0" encoding="utf-8" indent="yes"/> <xsl:template match="/"> <result> <xsl:call-template name="q0"/> </result> </xsl:template> <xsl:template name="q0"> <xsl:element name="statisticsdb"> <xsl:attribute name="isroot">true</xsl:attribute> <xsl:element name="clioset"> <xsl:attribute name="id">sk_statisticsdb()</xsl:attribute> </xsl:element> </xsl:element> <xsl:for-each select="/expensedb/grant"> <xsl:variable name="x0" select="."/> <xsl:for-each select="/expensedb/company"> <xsl:variable name="x1" select="."/> <xsl:if test="$x1/cid=$x0/cid"> <xsl:element name="citystatistics"> <xsl:attribute name="inset">sk_statisticsdb()</xsl:attribute> <xsl:element name="city"/> <xsl:element name="clioset"> <xsl:attribute name="id">sk_statisticsdb_0_1(sk_statisticsdb())</xsl:attribute> </xsl:element> <xsl:element name="clioset"> <xsl:attribute name="id">sk_statisticsdb_0_2(sk_statisticsdb())</xsl:attribute> </xsl:element> </xsl:element> <xsl:element name="organization"> <xsl:attribute name="inset">sk_statisticsdb_0_1(sk_statisticsdb())</xsl:attribute> <xsl:element name="orgid"><xsl:value-of select="$x0/cid"/></xsl:element> <xsl:element name="oname"><xsl:value-of select="$x1/cname"/></xsl:element> <xsl:element name="clioset"> <xsl:attribute name="id">sk_statisticsdb_0_1_0_2(<xsl:value-of select="$x0/cid"/>, <xsl:value-of select="$x1/cname"/>, Sk_statisticsDB_0_1(Sk_statisticsDB())) </xsl:attribute> </xsl:element> </xsl:element> <xsl:element name="funding"> <xsl:attribute name="inset">sk_statisticsdb_0_1_0_2(<xsl:value-of select="$x0/cid"/>, <xsl:value-of select="$x1/cname"/>, Sk_statisticsDB_0_1(Sk_statisticsDB())) </xsl:attribute> <xsl:element name="fid"> Sk35(<xsl:value-of select="$x0/amount"/>, <xsl:value-of select="$x1/cname"/>, <xsl:value-of select="$x0/cid"/>) </xsl:element>
8 (Flat) Mapping Generation Source schema S Schema Correspondences Target schema T Mappings Source Concepts (relational views) Target Concepts (relational views) Step 1. Extraction of concepts (in each schema). Concept = one category of data that can exist in the schema Step 2. Mapping generation Enumerate all non-redundant maps between pairs of concepts [Popa, Velegrakis, Miller, Hernandez, Fagin. VLDB02] [Fagin, Kolatios, Miller, Popa. ICDT 03] [Haas, Hernandez, Ho, Popa, Roth. SIGMOD 05] 7
9 (Flat) Mapping Example m 1 maps proj to dept-projects m 1 : (p 0 in proj) (d in dept) (p in d.projects) p 0.dname = d.dname p 0.pname = p.pname m 2 : (p 0 in proj) (e 0 in p 0.emps) (d in dept) (p in d.projects) (e in d.emps) (w in e.workson) w.pid = p.pid p 0.dname = d.dname p 0.pname = p.pname e 0.ename = e.ename expression for e 0.salary = e.salary dept-empsworksonprojects The concept of project of a department m m 2 1 proj: Set [ dname pname emps: Set [ ename salary ] ] Two basic mappings (or source-to-target tgds or GLAV formulas) dept: Set [ dname budget emps: Set [ ename salary workson: Set [ pid ] ] projects: Set [ pid pname ] ] m 2 maps proj-emps to dept-emps-workson-projects The concept of project of an employee of a department 8
10 IBM Rational Data Architect Product 9
11 IBM Rational Data Architect Product Schema Matching, Schema Mapping and Query Generation Technologies from Clio Value Correspondences in the GUI Blue Lines: Confirmed by the users Gray Lines: Suggested by the schema matching algorithms Schema Matching Five different algorithms: two name-based (including thesaurus lookup) and three instance-based Users can choose Any weighted combination of the 5 schema matching algorithms Source or target One element (element/attribute or column) or a group of elements (subtree or table) The value k (for the top-k matches) The system returns the top-k matches for each element Current Release Source is relational (other IBM products support XML sources) Target can be relational (generates SQL) or XML (generates SQL/XML) Mapping is standardized within IBM, as an EMF in-memory object and as a serialized XML document 10
12 Outline Clio: Basic Features of a Schema Mapping System MAUI: Advanced Features of a Schema Mapping System Nested Mapping Model [Fuxman, Hernandez, Ho, Miller, Papotti, Popa. VLDB 06] Mapping-Based XML Transformation Engine Schema Integration Schema Evolution Clio2010: Mapping-Based Authoring of Data Flows Conclusions and Future Directions 11
13 New Nested-Mapping Engine for Clio Existing Clio engine is based on a flat mapping model Pros: easier to implement Cons: Fragmentation into many overlapping mappings Inefficiency in execution Redundancy in the output data No user-defined grouping semantics New Clio engine is based on a nested-mapping model Cons: more challenging to design and implement Pros: overcomes the above problems in the flat mapping model 12
14 (Flat) Mapping Example m 1 maps proj to dept-projects m 1 : (p 0 in proj) (d in dept) (p in d.projects) p 0.dname = d.dname p 0.pname = p.pname m 2 : (p 0 in proj) (e 0 in p 0.emps) (d in dept) (p in d.projects) (e in d.emps) (w in e.workson) w.pid = p.pid p 0.dname = d.dname p 0.pname = p.pname e 0.ename = e.ename expression for e 0.salary = e.salary dept-empsworksonprojects The concept of project of a department m m 2 1 proj: Set [ dname pname emps: Set [ ename salary ] ] Two basic mappings (or source-to-target tgds or GLAV formulas) dept: Set [ dname budget emps: Set [ ename salary workson: Set [ pid ] ] projects: Set [ pid pname ] ] m 2 maps proj-emps to dept-emps-workson-projects The concept of project of an employee of a department 13
15 Correlating Mapping Formulas m 1 : (p 0 in proj) (d in dept) (p in d.projects) p 0.dname = d.dname p 0.pname = p.pname proj tuples mapped only once m 2 : (p 0 in proj) (e 0 in p 0.emps) (d in dept) (p in d.projects) (e in d.emps) (w in e.workson) w.pid=p.pid p 0.dname = d.dname p 0.pname = p.pname e 0.ename = e.ename e 0.salary = e.salary Submapping, correlated to the parent mapping For every proj tuple, we map all employees, as a group. (Source grouping is preserved) Replace with n: (p 0 in proj) (d in dept) (p in d.projects) p 0.dname = d.dname p 0.pname = p.pname [ (e 0 in p 0.emps) (e in d.emps) (w in e.workson) w.pid=p.pid e 0.ename = e.ename e 0.salary = e.salary ] This is a nested mapping 14
16 Performance Flat query execution time / Nested query execution time KB 514 KB 312 KB Execution time for flat: 22mins Execution time for nested: 1.1s Flat query output size / Nested query output size Nesting Level 2111 KB 514 KB 312 KB Size of generated data (flat) including duplicates: 45MB Size of generated data (nested): 552KB Nesting Level The nested mapping generates much more efficient execution and less redundant data 15
17 Outline Clio: Basic Features of a Schema Mapping System MAUI: Advanced Features of a Schema Mapping System Nested Mapping Model Mapping-Based XML Transformation Engine [Jiang, Ho, Popa, Han. WWW 07] Schema Integration Schema Evolution Clio2010: Mapping-Based Authoring of Data Flows Conclusions and Future Directions 16
18 Performance in Executing Mappings Source schema S conforms to Logical mapping (internal) Query Generation Target schema T conforms to data Low-level mapping (SQL/XQuery/XSLT) Mappings are translated into general purpose query languages E.g., SQL, XQuery, XSLT Query optimization issues are left to the runtime engine of each of these languages to decide i.e., Q.O. decisions are not encoded in the queries. Idea: Execute mappings directly in our own runtime. 17
19 Mapping Execution Engine in Java Source schema S Logical mapping (internal) Target schema T conforms to data Mapping Execution Engine conforms to Motivation Grouping (on multiple levels) of transformed values. XQuery & XSLT have no specific constructs (hence algorithms) for such grouping tasks Scalability and efficiency Controllable memory resource usage Speed of transformation almost linear to input sizes Mapping-based Model high-level mapping semantics using IBM Mapping Specification Language (MSL) standard 18
20 Three-Phase Algorithm for Executing Mappings Phase 1 (Extract): tuple extraction Streaming holistic twig join Streaming and stored/indexed XML SQL queries through ODBC Relational source Phase 2 (Transform): generate an XML tree from each tuple Phase 3 (Merge): data merging A dynamic, scalable merge algorithm Hash-based vs. sort-based algorithms 19
21 XML-to-XML Comparative Results Running time in second Galax Quip MonetXQ Saxon Ours Data sizes (MB) 20
22 XML-to-XML Scalability Results Our Engine Running time (seconds DBLP data sizes (MB) 21
23 Outline Clio: Basic Features of a Schema Mapping System MAUI: Advanced Features of a Schema Mapping System Nested Mapping Model Mapping-Based XML Transformation Engine Schema Integration [Chiticariu, Hernandez, Popa, Kolaitis. VLDB 07 demo] Schema Evolution Clio2010: Mapping-Based Authoring of Data Flows Conclusions and Future Directions 22
24 Schema Integration Problem Given multiple overlapping schemas in the same domain, consolidate them into one. Output Input Source 1 Schema Source 2 Schema Schema Mappings Source n Schema Applications Provide a standard representation of the data (for unified querying, data warehousing, reference point, etc.) Metadata chaos reduction Schema integration is a hard problem Requires a lot of human interaction/feedback A series of recipes that arrive at a single integrated schema Related work [Pottinger, Bernstein. VLDB 03] [Chiticariu, Hernandez, Popa, Kolaitis. VLDB 07 demo] 23
25 Schema Integration [Chiticariu et al.] 1 User schema S 1 Schema Correspondences User schema S 2 2 Integrated schema I 1 3 Final schema I S 1 concepts S 2 concepts Integrated schema I n employee of a department B 1 B 4 dname budget dname budget ename B 3 B 2 sal address dname budget ename sal address pid pname department dname budget pid pname project of a department project of an employee of a department Step 1. Extraction of concepts from a hierarchy Step 2. Consider all possible ways of performing a hierarchical merge of the related concepts Generate initial set of candidate good integrated schemas Duplication-free enumeration algorithm Avoids duplicates by using constraints (Horn clauses) Polynomial-delay algorithm for enumerating satisfying assignments Step 3. Browse/Search/Refine set of candidate schemas Combination of partial enumeration with user constraints Based on the schemas seen so far, users express constraints on the future schemas to be generated 24
26 25
27 Outline Clio: Basic Features of a Schema Mapping System MAUI: Advanced Features of a Schema Mapping System Nested Mapping Model Mapping-Based XML Transformation Engine Schema Integration Schema Evolution Clio2010: Mapping-Based Authoring of Data Flows Conclusions and Future Directions 26
28 Schema Evolution The mapping Source Schema SS M ST Adapted Derived MTT T Target Schema Diff. Script Given or discovered T evolves Mapping Composition M ST = M ST M TT T When the target schema evolves, the system can generate an initial default mapping for T T automatically The user adds new mapping lines for new schema elements in T to complete the mapping M TT (for T T ) The new mapping M ST (S T ) is a composition of M ST (S T) and M TT (T T ) 27
29 Schema Evolution S evolves SS M ST T Diff. Script Given or discovered Derived MS S S S Adapted M S T = M S S M ST Mapping Composition When the source schema evolves, the system can generate an initial default mapping for S S automatically The user adds new mapping lines for new schema elements in S to complete the mapping M S S (for S S) The new mapping M S T (S T) is a composition of M S S (S S) and M ST (S T) 28
30 Related Theory and Algorithms Mapping Composition [Madhavan, Halevy. VLDB 03] [Fagin, Kolaitis, Popa, Tan. PODS 04] [Nash, Bernstein, Melnik. PODS 05] [Bernstein, Green, Melnik, Nash. VLDB 06] Mapping Inversion [Fagin. PODS 06] [Fagin, Kolaitis, Popa, Tan. PODS 07] Schema Evolution [Rahm, Bernstein. SIGMOD Rec. Dec 06] Mapping Adaptation under Evolving Schemas [Velegrakis, Miller, Popa. VLDB 03] [Yu, Popa. VLDB 05] Query rewrite [Yu, Popa. SIGMOD 04] 29
31 Outline Clio: Basic Features of a Schema Mapping System MAUI: Advanced Features of a Schema Mapping System Clio2010: Mapping-Based Authoring of Data Flows ETL: Mapping ETL Conversion Web-Service Composition: Mapping for web-service data sources Mashups: Mapping Mashup Generation Reuse: Mapping Polymorphism Conclusions and Future Directions 30
32 Mapping-Based Authoring of Data Flows Motivation Schema mappings are the building blocks for larger data transformation and integration applications The graph of mappings is a declarative specification of the flow of data Similar to ETL & mashups Need to take functional (web-service) data sources Need to extend Clio where multiple mappings can be defined, loaded (sharing a context), reused and managed 31
33 Clio2010 Clio2010 Automatically assemble a graph of initial, uncorrelated mappings into larger, richer mappings Support more complicated mapping composition and mapping merge Support functional data sources (e.g., web services) Compile the larger mappings into a global execution plan in Queries (XQuery) transformation scripts (XSLT) ETL flows (e.g., IBM WebSphere DataStage) mashups (e.g., IBM DAMIA) Support mapping reuse through mapping polymorphism framework 32
34 Outline Clio: Basic Features of a Schema Mapping System MAUI: Advanced Features of a Schema Mapping System Clio2010: Mapping-Based Authoring of Data Flows ETL: Mapping ETL Conversion [Hernandez et al. 07] Web-Service Composition: Mapping for web-service data sources Mashups: Mapping Mashup Generation Reuse: Mapping Polymorphism Conclusions and Future Directions 33
35 Orchid: Mapping ETL Conversion Round-tripping between ETL (Extract, Transform, Load) scripts and declarative mappings Generate ETL script from mappings Extract mapping information from ETL scripts Track and propagate mapping-related modifications Optimization of ETL scripts Remove redundant operations Push computation to other runtimes Rewrite ETL scripts EII Mapping in RDA ETL Flow Orchid 34
36 Mapping ETL Generation The developer inspects the connections between the sources, rules, and data warehouse model and generates a DataStage job to move data 35
37 ETL Mapping Abstraction The developer refines, tests and deploys the DataStage job to production. Need ETL Mapping: To reflect changes made in the ETL script back in the mapping. To extract mappings out of ETL scripts. 36
38 Outline Clio: Basic Features of a Schema Mapping System MAUI: Advanced Features of a Schema Mapping System Clio2010: Mapping-Based Authoring of Data Flows ETL: Mapping ETL Conversion Web-Service Composition: Mappings for web-service data sources [Alexe et al. 07] Mashups: Mapping Mashup Generation Reuse: Mapping Polymorphism Conclusions and Future Directions 37
39 Flows of Mappings getibmhotels : [ state, city ] Set [ hotel, geo, country, airport ] Source 1 Source 2 M 2 M 1 gethotelreviews : [ hotel, state, city ] Set [ image rooms rating reviews: Set [ review ] ] Target root: Set [ state, city, hotel, ratings: Set [ rating reviews: Set [ review ] ] M 3 38
40 Functional Data Sources Web Service Definition Argument Type Result Type Mapping between functional sources 39
41 Outline Clio: Basic Features of a Schema Mapping System MAUI: Advanced Features of a Schema Mapping System Clio2010: Mapping-Based Authoring of Data Flows ETL: Mapping ETL Conversion Web-Service Composition: Mapping for web-service data sources Mashups: Mapping Mashup Generation [Camacho et al. 07] Reuse: Mapping Polymorphism Conclusions and Future Directions 40
42 IBM DAMIA: A Mashup Tool IBM DAMIA Produces easily customizable flows Mashup fabric for data aggregation and transformation Web-based tool with userfriendly interface No scripting/programming skills required XQuery-like model 41
43 Generate DAMIA Flows from Clio Clio High level mappings Compiler SQL XSLT XQuery DAMIA flow 42
44 From XQuery to ETL The XQuery has a specific flow pattern, with several phases: Source extraction queries (projection, navigation, join) Result: sets of flat tuples Key generation Union Duplicate elimination Computation of target groups Generate target output (hierarchy) Back to hierarchical data We can visualize all this as a graph of operators we get the equivalent of an ETL job for XML 43
45 Doc 1 Student2, Doc 2 Student1 courseeval /Student1/* De-dup /Student2/* /courseeval/* // // // XML sid relational name sid π course π ûü eval_key=eval_key name grade sid sid name name eid = Key π π cname Key to Sk1( ) Eval_2 file Course_2 = F_course(sid, name) Key Key eid = Key Sk2( ) to Pid_0 = F_course ( ) Key Course_2 Eval_2 value_key π Pid_0 Key cname eid Key Pid_0 = F_course ( ) value_key Pid_0 π cname Key eid value_key Key value_key Course_1 Student_0 Eval_2 De-dup value_key Course_2 = Pid_0 value_key Nest-join Group-by Pid_0 concatenate Student, Course, Eval De-dup value_key relational XML... 44
46 Outline Clio: Basic Features of a Schema Mapping System MAUI: Advanced Features of a Schema Mapping System Clio2010: Mapping-Based Authoring of Data Flows ETL: Mapping ETL Conversion Web-Service Composition: Mapping for web-service data sources Mashups: Mapping Mashup Generation Reuse: Mapping Polymorphism [Wisnesky et al. 07] Conclusions and Future Directions 45
47 Mapping Reuse Motivaion: Existing mapping formalisms implicitly contain information that we need to be explicit Example: A Clio nested mapping expression implicitly defines a class of schemas for which it has an interpretation But in Clio mapping representation (XSML), every mapping instance requires concrete source and target schemas The flexibility to re-use mappings is lost Approach: We need a formal language for talking about mappings Require theory and tools to manipulate mappings as blocks Solution: Mapping Polymorphism 46
48 Clio Language Embedding into MF The bijection respects mapping semantics. Terms of Polymorphic Mapping Type Instantiation with concrete types Terms of Mapping Type Bijection Clio Nested Mapping Language Terms in MF MF also includes useful terms that manipulate mappings. 47
49 Outline Clio: Basic Features of a Schema Mapping System MAUI: Advanced Features of a Schema Mapping System Clio2010: Mapping-Based Authoring of Data Flows Conclusion and Related Work 48
50 Clio Innovations Over Time Product component Conference 99 SQL Assist Chocolate VLDB 2002 Cinnamon (CM 8.3) SIGMOD 2004 VLDB 2005 Criollo (RDA) SIGMOD VLDB 2001 & ICDE PODS SIGMOD 2000 demo demo VLDB 2006 Join-path generation Attribute matching XML support XSLT generation Deep union Mapping composition Schema integration SQL generation Clio research component XQuery generation Data viewer Hybrid algorithm SQL/XML generation Mapping language XQuery rewrite Schema evolution Java runtime engine Java code generation Nested mapping engine Flow mapping Web services 49
51 Conclusion Mappings define relationships between schemas of data sources Mappings and transformation queries are the new metadata Customers want I (for integration), not EAI, EII, ETL, etc. Consumability for information integration 50
52 Related Work: Very Small Subset Information Integration Survey [Haas ICDT 07] Beauty and the Beast: The Theory and Practice of Information Integration [Kolatis PODS 05] Schema mappings, data exchange, and metadata management [Halevy, Rajaraman, Ordille VLDB 06] Data Integration: The Teenage Years as part of their 10-year Best Paper Award on Information Manifold paper Schema Matching Survey by Erhard Rahm and Philip Bernstein in VLDB J. 01 Much work by Philip Bernstein, Anhai Doan, Alon Halevy, Jayant Madhavan Information Discovery project at IBM Almaden (Berthold Reinwald et al.) Data cleansing, conflict resolution, data quality, duplicate detection, entity resolution, data provenance Orchestra project at U Penn Schema Mapping Debugger [Alexe, Chiticariu, Tan. VLDB 06] A Benchmark for Schema Mapping System [Alexe, Tan, Velegrakis. 07] A new schema mapping GUI based on XQBE ideas [Ceri, Hernandez, Raffio et al. 07] Deep Web, peer-to-peer, ontology, etc 51
53 Acknowledgements: Clio Research Team The Project Founders: Laura Haas and Renee Miller (around 1999) Regulars: Clio Group: Mauricio Hernandez, Howard Ho, Lucian Popa, Ioana Stanoi Theory Group: Ron Fagin, Phokion Kolaitis, Alan Nash Past Visiting scientists: Stefan Dessloch, Paolo Papotti, Wang-Chiew Tan, Yannis Velegrakis, Cathy Wyss Past Interns and Post-docs: Bogdan Alexe, Alfredo Camacho, Laura Chiticariu, Ariel Fuxman, Michael Gubanov, Haifeng Jiang, Felix Naumann, Paolo Papotti, Ahmed Radwan, Alessandro Raffio, Michael Richmond, Shivkumar Shivaji, Yannis Velegrakis, Melanie Weis, Ryan Wisnesky, Lingling Yan, Cong Yu, Jindan Zhou 52
Simplifying Information Integration: Object-Based Flow-of-Mappings Framework for Integration
Simplifying Information Integration: Object-Based Flow-of-Mappings Framework for Integration Howard Ho IBM Almaden Research Center Take Home Messages Clio (Schema Mapping for EII) has evolved to Clio 2010
More informationBio/Ecosystem Informatics
Bio/Ecosystem Informatics Renée J. Miller University of Toronto DB research problem: managing data semantics R. J. Miller University of Toronto 1 Managing Data Semantics Semantics modeled by Schemas (structure
More informationLearning mappings and queries
Learning mappings and queries Marie Jacob University Of Pennsylvania DEIS 2010 1 Schema mappings Denote relationships between schemas Relates source schema S and target schema T Defined in a query language
More informationDBAI-TR UMAP: A Universal Layer for Schema Mapping Languages
DBAI-TR-2012-76 UMAP: A Universal Layer for Schema Mapping Languages Florin Chertes and Ingo Feinerer Technische Universität Wien, Vienna, Austria Institut für Informationssysteme FlorinChertes@acm.org
More informationNested Mappings: Schema Mapping Reloaded
Nested Mappings: Schema Mapping Reloaded Ariel Fuxman University of Toronto afuxman@cs.toronto.edu Renee J. Miller University of Toronto miller@cs.toronto.edu Mauricio A. Hernandez IBM Almaden Research
More informationClio Grows Up: From Research Prototype to Industrial Tool
Clio Grows Up: From Research Prototype to Industrial Tool Laura M. Haas IBM Silicon Valley Labs laura@almaden.ibm.com Mauricio A. Hernández IBM Almaden Research Center mauricio@almaden.ibm.com Lucian Popa
More informationOrchid: Integrating Schema Mapping and ETL
Orchid: Integrating Schema Mapping and ETL (Exted Version) Stefan Dessloch, Mauricio A. Hernández, Ryan Wisnesky, Ahmed Radwan, Jindan Zhou Department of Computer Science, University of Kaiserslautern
More informationA Non intrusive Data driven Approach to Debugging Schema Mappings for Data Exchange
1. Problem and Motivation A Non intrusive Data driven Approach to Debugging Schema Mappings for Data Exchange Laura Chiticariu and Wang Chiew Tan UC Santa Cruz {laura,wctan}@cs.ucsc.edu Data exchange is
More informationSchema Management. Abstract
Schema Management Periklis Andritsos Λ Ronald Fagin y Ariel Fuxman Λ Laura M. Haas y Mauricio A. Hernández y Ching-Tien Ho y Anastasios Kementsietsidis Λ Renée J. Miller Λ Felix Naumann y Lucian Popa y
More informationThe interaction of theory and practice in database research
The interaction of theory and practice in database research Ron Fagin IBM Research Almaden 1 Purpose of This Talk Encourage collaboration between theoreticians and system builders via two case studies
More informationScalable Data Exchange with Functional Dependencies
Scalable Data Exchange with Functional Dependencies Bruno Marnette 1, 2 Giansalvatore Mecca 3 Paolo Papotti 4 1: Oxford University Computing Laboratory Oxford, UK 2: INRIA Saclay, Webdam Orsay, France
More informationComposing Schema Mapping
Composing Schema Mapping An Overview Phokion G. Kolaitis UC Santa Cruz & IBM Research Almaden Joint work with R. Fagin, L. Popa, and W.C. Tan 1 Data Interoperability Data may reside at several different
More informationCreating a Mediated Schema Based on Initial Correspondences
Creating a Mediated Schema Based on Initial Correspondences Rachel A. Pottinger University of Washington Seattle, WA, 98195 rap@cs.washington.edu Philip A. Bernstein Microsoft Research Redmond, WA 98052-6399
More informationA Classification of Schema Mappings and Analysis of Mapping Tools
A Classification of Schema Mappings and Analysis of Mapping Tools Frank Legler IBM Deutschland Entwicklung GmbH flegler@de.ibm.com Felix Naumann Hasso-Plattner-Institut, Potsdam naumann@hpi.uni-potsdam.de
More informationIntroduction Data Integration Summary. Data Integration. COCS 6421 Advanced Database Systems. Przemyslaw Pawluk. CSE, York University.
COCS 6421 Advanced Database Systems CSE, York University March 20, 2008 Agenda 1 Problem description Problems 2 3 Open questions and future work Conclusion Bibliography Problem description Problems Why
More informationAlgebraic Model Management: A Survey
Algebraic Model Management: A Survey Patrick Schultz 1, David I. Spivak 1, and Ryan Wisnesky 2 1 Massachusetts Institute of Technology 2 Categorical Informatics, Inc. Abstract. We survey the field of model
More informationFoundations of Data Exchange and Metadata Management. Marcelo Arenas Ron Fagin Special Event - SIGMOD/PODS 2016
Foundations of Data Exchange and Metadata Management Marcelo Arenas Ron Fagin Special Event - SIGMOD/PODS 2016 The need for a formal definition We had a paper with Ron in PODS 2004 Back then I was a Ph.D.
More informationFunction Symbols in Tuple-Generating Dependencies: Expressive Power and Computability
Function Symbols in Tuple-Generating Dependencies: Expressive Power and Computability Georg Gottlob 1,2, Reinhard Pichler 1, and Emanuel Sallinger 2 1 TU Wien and 2 University of Oxford Tuple-generating
More informationSchema Exchange: a Template-based Approach to Data and Metadata Translation
Schema Exchange: a Template-based Approach to Data and Metadata Translation Paolo Papotti and Riccardo Torlone Università Roma Tre {papotti,torlone}@dia.uniroma3.it Abstract. In this paper we study the
More informationData Integration and Data Warehousing Database Integration Overview
Data Integration and Data Warehousing Database Integration Overview Sergey Stupnikov Institute of Informatics Problems, RAS ssa@ipi.ac.ru Outline Information Integration Problem Heterogeneous Information
More informationBioFederator: A Data Federation System for Bioinformatics on the Web
BioFederator: A Data Federation System for Bioinformatics on the Web Ahmed Radwan, Akmal Younis, Mauricio A. Hernandez, Howard Ho, Department of Electrical and Computer Engineering Lucian Popa, Shivkumar
More informationXML. Objectives. Duration. Audience. Pre-Requisites
XML XML - extensible Markup Language is a family of standardized data formats. XML is used for data transmission and storage. Common applications of XML include business to business transactions, web services
More informationNew Requirements. Advanced Query Processing. Top-N/Bottom-N queries Interactive queries. Skyline queries, Fast initial response time!
Lecture 13 Advanced Query Processing CS5208 Advanced QP 1 New Requirements Top-N/Bottom-N queries Interactive queries Decision making queries Tolerant of errors approximate answers acceptable Control over
More informationSchema Exchange: a Template-based Approach to Data and Metadata Translation
Schema Exchange: a Template-based Approach to Data and Metadata Translation Paolo Papotti and Riccardo Torlone Università Roma Tre {papotti,torlone}@dia.uniroma3.it Abstract. We study the schema exchange
More informationUniversity of Massachusetts Amherst Department of Computer Science Prof. Yanlei Diao
University of Massachusetts Amherst Department of Computer Science Prof. Yanlei Diao CMPSCI 445 Midterm Practice Questions NAME: LOGIN: Write all of your answers directly on this paper. Be sure to clearly
More informationMapper An Efficient Data Transformation Operator
Mapper An Efficient Data Transformation Operator Paulo Carreira University of Lisbon 2 Context Source: IMPCRED CREDTID MERCH FT546083 RT546084 8 PCS LARGE CONTAINER MODULE X067 100 PCS COTTON CORDUROY
More informationKanata: Adaptation and Evolution in Data Sharing Systems
Kanata: Adaptation and Evolution in Data Sharing Systems Periklis Andritsos Ariel Fuxman Anastasios Kementsietsidis Renée J. Miller Yannis Velegrakis Department of Computer Science University of Toronto
More informationAuthor: Irena Holubová Lecturer: Martin Svoboda
NPRG036 XML Technologies Lecture 6 XSLT 9. 4. 2018 Author: Irena Holubová Lecturer: Martin Svoboda http://www.ksi.mff.cuni.cz/~svoboda/courses/172-nprg036/ Lecture Outline XSLT Principles Templates Instructions
More informationGeneric Model Management
Generic Model Management A Database Infrastructure for Schema Manipulation Philip A. Bernstein Microsoft Corporation April 29, 2002 2002 Microsoft Corp. 1 The Problem ithere is 30 years of DB Research
More informationSCHEMA MAPPING DESIGN SYSTEMS 1. Schema Mapping Design Systems: Example-Driven and Semantic Approaches. Kathryn Dahlgren. CSU Stanislaus CS4960
SCHEMA MAPPING DESIGN SYSTEMS 1 Schema Mapping Design Systems: Example-Driven and Semantic Approaches Kathryn Dahlgren CSU Stanislaus CS4960 Dr. Melanie Martin SCHEMA MAPPING DESIGN SYSTEMS 2 Introduction
More informationDatabase Design and Tuning
Database Design and Tuning Chapter 20 Comp 521 Files and Databases Spring 2010 1 Overview After ER design, schema refinement, and the definition of views, we have the conceptual and external schemas for
More informationDISCUSSION 5min 2/24/2009. DTD to relational schema. Inlining. Basic inlining
XML DTD Relational Databases for Querying XML Documents: Limitations and Opportunities Semi-structured SGML Emerging as a standard E.g. john 604xxxxxxxx 778xxxxxxxx
More informationPhysical Database Design and Tuning. Review - Normal Forms. Review: Normal Forms. Introduction. Understanding the Workload. Creating an ISUD Chart
Physical Database Design and Tuning R&G - Chapter 20 Although the whole of this life were said to be nothing but a dream and the physical world nothing but a phantasm, I should call this dream or phantasm
More informationDatabase Management Systems (COP 5725) Homework 3
Database Management Systems (COP 5725) Homework 3 Instructor: Dr. Daisy Zhe Wang TAs: Yang Chen, Kun Li, Yang Peng yang, kli, ypeng@cise.uf l.edu November 26, 2013 Name: UFID: Email Address: Pledge(Must
More informationCSIT5300: Advanced Database Systems
CSIT5300: Advanced Database Systems L11: Physical Database Design Dr. Kenneth LEUNG Department of Computer Science and Engineering The Hong Kong University of Science and Technology Hong Kong SAR, China
More informationCall: Datastage 8.5 Course Content:35-40hours Course Outline
Datastage 8.5 Course Content:35-40hours Course Outline Unit -1 : Data Warehouse Fundamentals An introduction to Data Warehousing purpose of Data Warehouse Data Warehouse Architecture Operational Data Store
More informationCS425 Fall 2016 Boris Glavic Chapter 1: Introduction
CS425 Fall 2016 Boris Glavic Chapter 1: Introduction Modified from: Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Textbook: Chapter 1 1.2 Database Management System (DBMS)
More informationChapter 1: Introduction
Chapter 1: Introduction Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Outline The Need for Databases Data Models Relational Databases Database Design Storage Manager Query
More informationDatabase Design Process
Database Design Process Real World Functional Requirements Requirements Analysis Database Requirements Functional Analysis Access Specifications Application Pgm Design E-R Modeling Choice of a DBMS Data
More informationXML Wrap-up. CS 431 March 1, 2006 Carl Lagoze Cornell University
XML Wrap-up CS 431 March 1, 2006 Carl Lagoze Cornell University XSLT Processing Model Input XSL doc parse Input XML doc parse Parsed tree serialize Input XML doc Parsed tree Xformed tree Output doc (xml,
More informationIntroduction to Federation Server
Introduction to Federation Server Alex Lee IBM Information Integration Solutions Manager of Technical Presales Asia Pacific 2006 IBM Corporation WebSphere Federation Server Federation overview Tooling
More informationA Mapping Model for Transforming Nested XML Documents
IJCSNS International Journal of Computer Science and Network Security, VOL.6 No.2A, February 2006 83 A Mapping Model for Transforming Nested XML Documents Gang Qian, and Yisheng Dong, Department of Computer
More informationWhat s a database system? Review of Basic Database Concepts. Entity-relationship (E/R) diagram. Two important questions. Physical data independence
What s a database system? Review of Basic Database Concepts CPS 296.1 Topics in Database Systems According to Oxford Dictionary Database: an organized body of related information Database system, DataBase
More informationA FRAMEWORK FOR EFFICIENT DATA SEARCH THROUGH XML TREE PATTERNS
A FRAMEWORK FOR EFFICIENT DATA SEARCH THROUGH XML TREE PATTERNS SRIVANI SARIKONDA 1 PG Scholar Department of CSE P.SANDEEP REDDY 2 Associate professor Department of CSE DR.M.V.SIVA PRASAD 3 Principal Abstract:
More informationOutline. q Database integration & querying. q Peer-to-Peer data management q Stream data management q MapReduce-based distributed data management
Outline n Introduction & architectural issues n Data distribution n Distributed query processing n Distributed query optimization n Distributed transactions & concurrency control n Distributed reliability
More informationCOSC Dr. Ramon Lawrence. Emp Relation
COSC 304 Introduction to Database Systems Normalization Dr. Ramon Lawrence University of British Columbia Okanagan ramon.lawrence@ubc.ca Normalization Normalization is a technique for producing relations
More informationA GML SCHEMA MAPPING APPROACH TO OVERCOME SEMANTIC HETEROGENEITY IN GIS
A GML SCHEMA MAPPING APPROACH TO OVERCOME SEMANTIC HETEROGENEITY IN GIS Manoj Paul, S. K. Ghosh School of Information Technology, Indian Institute of Technology, Kharagpur 721302, India - (mpaul, skg)@sit.iitkgp.ernet.in
More informationStructural characterizations of schema mapping languages
Structural characterizations of schema mapping languages Balder ten Cate INRIA and ENS Cachan (research done while visiting IBM Almaden and UC Santa Cruz) Joint work with Phokion Kolaitis (ICDT 09) Schema
More informationFoundations and Applications of Schema Mappings
Foundations and Applications of Schema Mappings Phokion G. Kolaitis University of California Santa Cruz & IBM Almaden Research Center The Data Interoperability Challenge Data may reside at several different
More informationPhysical Database Design and Tuning. Chapter 20
Physical Database Design and Tuning Chapter 20 Introduction We will be talking at length about database design Conceptual Schema: info to capture, tables, columns, views, etc. Physical Schema: indexes,
More informationCOMP7640 Assignment 2
COMP7640 Assignment 2 Due Date: 23:59, 14 November 2014 (Fri) Description Question 1 (20 marks) Consider the following relational schema. An employee can work in more than one department; the pct time
More informationDesigning your BI Architecture
IBM Software Group Designing your BI Architecture Data Movement and Transformation David Cope EDW Architect Asia Pacific 2007 IBM Corporation DataStage and DWE SQW Complex Files SQL Scripts ERP ETL Engine
More informationADT 2009 Other Approaches to XQuery Processing
Other Approaches to XQuery Processing Stefan Manegold Stefan.Manegold@cwi.nl http://www.cwi.nl/~manegold/ 12.11.2009: Schedule 2 RDBMS back-end support for XML/XQuery (1/2): Document Representation (XPath
More informationOverview of Data Management
Overview of Data Management School of Computer Science University of Waterloo Databases CS348 (University of Waterloo) Overview of Data Management 1 / 21 What is Data ANSI definition of data: 1 A representation
More informationAdministrivia. Physical Database Design. Review: Optimization Strategies. Review: Query Optimization. Review: Database Design
Administrivia Physical Database Design R&G Chapter 16 Lecture 26 Homework 5 available Due Monday, December 8 Assignment has more details since first release Large data files now available No class Thursday,
More informationDatabases. Relational Model, Algebra and operations. How do we model and manipulate complex data structures inside a computer system? Until
Databases Relational Model, Algebra and operations How do we model and manipulate complex data structures inside a computer system? Until 1970.. Many different views or ways of doing this Could use tree
More informationQUERY PROCESSING & OPTIMIZATION CHAPTER 19 (6/E) CHAPTER 15 (5/E)
QUERY PROCESSING & OPTIMIZATION CHAPTER 19 (6/E) CHAPTER 15 (5/E) 2 LECTURE OUTLINE Query Processing Methodology Basic Operations and Their Costs Generation of Execution Plans 3 QUERY PROCESSING IN A DDBMS
More informationCore Schema Mappings: Computing Core Solution with Target Dependencies in Data Exchange
Core Schema Mappings: Computing Core Solution with Target Dependencies in Data Exchange S. Ravichandra, and D.V.L.N. Somayajulu Abstract Schema mapping is a declarative specification of the relationship
More informationCOSC 304 Introduction to Database Systems SQL. Dr. Ramon Lawrence University of British Columbia Okanagan
COSC 304 Introduction to Database Systems SQL Dr. Ramon Lawrence University of British Columbia Okanagan ramon.lawrence@ubc.ca SQL Queries Querying with SQL is performed using a SELECT statement. The general
More information10. Record-Oriented DB Interface
10 Record-Oriented DB Interface Theo Härder wwwhaerderde Goals - Design principles for record-oriented and navigation on logical access paths - Development of a scan technique and a Main reference: Theo
More informationQUERY OPTIMIZATION [CH 15]
Spring 2017 QUERY OPTIMIZATION [CH 15] 4/12/17 CS 564: Database Management Systems; (c) Jignesh M. Patel, 2013 1 Example SELECT distinct ename FROM Emp E, Dept D WHERE E.did = D.did and D.dname = Toy EMP
More informationXML. COSC Dr. Ramon Lawrence. An attribute is a name-value pair declared inside an element. Comments. Page 3. COSC Dr.
COSC 304 Introduction to Database Systems XML Dr. Ramon Lawrence University of British Columbia Okanagan ramon.lawrence@ubc.ca XML Extensible Markup Language (XML) is a markup language that allows for
More informationMore SQL: Complex Queries, Triggers, Views, and Schema Modification
Chapter 5 More SQL: Complex Queries, Triggers, Views, and Schema Modification Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 5 Outline More Complex SQL Retrieval Queries
More information6232B: Implementing a Microsoft SQL Server 2008 R2 Database
6232B: Implementing a Microsoft SQL Server 2008 R2 Database Course Overview This instructor-led course is intended for Microsoft SQL Server database developers who are responsible for implementing a database
More informationRelational Databases
Relational Databases Jan Chomicki University at Buffalo Jan Chomicki () Relational databases 1 / 49 Plan of the course 1 Relational databases 2 Relational database design 3 Conceptual database design 4
More informationIntroduction to Query Processing and Query Optimization Techniques. Copyright 2011 Ramez Elmasri and Shamkant Navathe
Introduction to Query Processing and Query Optimization Techniques Outline Translating SQL Queries into Relational Algebra Algorithms for External Sorting Algorithms for SELECT and JOIN Operations Algorithms
More informationSQL Queries. COSC 304 Introduction to Database Systems SQL. Example Relations. SQL and Relational Algebra. Example Relation Instances
COSC 304 Introduction to Database Systems SQL Dr. Ramon Lawrence University of British Columbia Okanagan ramon.lawrence@ubc.ca SQL Queries Querying with SQL is performed using a SELECT statement. The general
More informationPhil Bernstein. Microsoft Research. Most slides come from SIGMOD 07 Keynote & Bridging Apps & DB, both with Sergey Melnik. Nov.
Phil Bernstein Microsoft Research Nov. 26, 2007 Most slides come from SIGMOD 07 Keynote & Bridging Apps & DB, both with Sergey Melnik 1 Make it easier to write programs that access databases Traditionally,
More informationWhat happens. 376a. Database Design. Execution strategy. Query conversion. Next. Two types of techniques
376a. Database Design Dept. of Computer Science Vassar College http://www.cs.vassar.edu/~cs376 Class 16 Query optimization What happens Database is given a query Query is scanned - scanner creates a list
More informationInformation Systems. Database System Architecture. Relational Databases. Nikolaj Popov
Information Systems Database System Architecture. Relational Databases Nikolaj Popov Research Institute for Symbolic Computation Johannes Kepler University of Linz, Austria popov@risc.uni-linz.ac.at Outline
More informationDatabase Technology Introduction. Heiko Paulheim
Database Technology Introduction Outline The Need for Databases Data Models Relational Databases Database Design Storage Manager Query Processing Transaction Manager Introduction to the Relational Model
More informationQuestion 1 (a) 10 marks
Question 1 (a) Consider the tree in the next slide. Find any/ all violations of a B+tree structure. Identify each bad node and give a brief explanation of each error. Assume the order of the tree is 4
More informationData Integration: Schema Mapping
Data Integration: Schema Mapping Jan Chomicki University at Buffalo and Warsaw University March 8, 2007 Jan Chomicki (UB/UW) Data Integration: Schema Mapping March 8, 2007 1 / 13 Data integration Data
More information(Big Data Integration) : :
(Big Data Integration) : : 3 # $%&'! ()* +$,- 2/30 ()* + # $%&' = 3 : $ 2 : 17 ;' $ # < 2 6 ' $%&',# +'= > 0 - '? @0 A 1 3/30 3?. - B 6 @* @(C : E6 - > ()* (C :(C E6 1' +'= - ''3-6 F :* 2G '> H-! +'-?
More informationData Integration: Schema Mapping
Data Integration: Schema Mapping Jan Chomicki University at Buffalo and Warsaw University March 8, 2007 Jan Chomicki (UB/UW) Data Integration: Schema Mapping March 8, 2007 1 / 13 Data integration Jan Chomicki
More informationDesigning Information-Preserving Mapping Schemes for XML
Designing Information-Preserving Mapping Schemes for XML Denilson Barbosa Juliana Freire Alberto O. Mendelzon VLDB 2005 Motivation An XML-to-relational mapping scheme consists of a procedure for shredding
More informationSemi-structured Data 11 - XSLT
Semi-structured Data 11 - XSLT Andreas Pieris and Wolfgang Fischl, Summer Term 2016 Outline What is XSLT? XSLT at First Glance XSLT Templates Creating Output Further Features What is XSLT? XSL = extensible
More informationExcel to XML v3. Compatibility Switch 13 update 1 and higher. Windows or Mac OSX.
App documentation Page 1/5 Excel to XML v3 Description Excel to XML will let you submit an Excel file in the format.xlsx to a Switch flow where it will be converted to XML and/or metadata sets. It will
More informationSemantic Errors in Database Queries
Semantic Errors in Database Queries 1 Semantic Errors in Database Queries Stefan Brass TU Clausthal, Germany From April: University of Halle, Germany Semantic Errors in Database Queries 2 Classification
More informationSemantic Optimization of Preference Queries
Semantic Optimization of Preference Queries Jan Chomicki University at Buffalo http://www.cse.buffalo.edu/ chomicki 1 Querying with Preferences Find the best answers to a query, instead of all the answers.
More informationIMPORTANT: Circle the last two letters of your class account:
Fall 2001 University of California, Berkeley College of Engineering Computer Science Division EECS Prof. Michael J. Franklin FINAL EXAM CS 186 Introduction to Database Systems NAME: STUDENT ID: IMPORTANT:
More informationPhysical Database Design and Tuning
Physical Database Design and Tuning CS 186, Fall 2001, Lecture 22 R&G - Chapter 16 Although the whole of this life were said to be nothing but a dream and the physical world nothing but a phantasm, I should
More informationAlgorithms for Query Processing and Optimization. 0. Introduction to Query Processing (1)
Chapter 19 Algorithms for Query Processing and Optimization 0. Introduction to Query Processing (1) Query optimization: The process of choosing a suitable execution strategy for processing a query. Two
More informationExcel to XML v4. Version adds two Private Data sets
Excel to XML v4 Page 1/6 Excel to XML v4 Description Excel to XML will let you submit an Excel file in the format.xlsx to a Switch flow were it will be converted to XML and/or metadata sets. It will accept
More informationPart V. Working with Information Systems. Marc H. Scholl (DBIS, Uni KN) Information Management Winter 2007/08 1
Part V Working with Information Systems Marc H. Scholl (DBIS, Uni KN) Information Management Winter 2007/08 1 Outline of this part 1 Introduction to Database Languages Declarative Languages Option 1: Graphical
More informationMOC 20463C: Implementing a Data Warehouse with Microsoft SQL Server
MOC 20463C: Implementing a Data Warehouse with Microsoft SQL Server Course Overview This course provides students with the knowledge and skills to implement a data warehouse with Microsoft SQL Server.
More informationOracle Database 10g: Introduction to SQL
ORACLE UNIVERSITY CONTACT US: 00 9714 390 9000 Oracle Database 10g: Introduction to SQL Duration: 5 Days What you will learn This course offers students an introduction to Oracle Database 10g database
More informationCopyright 2016 Ramez Elmasri and Shamkant B. Navathe
CHAPTER 19 Query Optimization Introduction Query optimization Conducted by a query optimizer in a DBMS Goal: select best available strategy for executing query Based on information available Most RDBMSs
More informationIntroduction to Data Management. Lecture #17 (Physical DB Design!)
Introduction to Data Management Lecture #17 (Physical DB Design!) Instructor: Mike Carey mjcarey@ics.uci.edu Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Announcements v Homework info:
More informationIntroduction to Wireless Sensor Network. Peter Scheuermann and Goce Trajcevski Dept. of EECS Northwestern University
Introduction to Wireless Sensor Network Peter Scheuermann and Goce Trajcevski Dept. of EECS Northwestern University 1 A Database Primer 2 A leap in history A brief overview/review of databases (DBMS s)
More informationMobile and Heterogeneous databases Distributed Database System Query Processing. A.R. Hurson Computer Science Missouri Science & Technology
Mobile and Heterogeneous databases Distributed Database System Query Processing A.R. Hurson Computer Science Missouri Science & Technology 1 Note, this unit will be covered in four lectures. In case you
More informationSQL. CS 564- Fall ACKs: Dan Suciu, Jignesh Patel, AnHai Doan
SQL CS 564- Fall 2015 ACKs: Dan Suciu, Jignesh Patel, AnHai Doan MOTIVATION The most widely used database language Used to query and manipulate data SQL stands for Structured Query Language many SQL standards:
More informationOutline. 1 CS520-5) Data Exchange
Outline 0) Course Info 1) Introduction 2) Data Preparation and Cleaning 3) Schema matching and mapping 4) Virtual Data Integration 5) Data Exchange 6) Data Warehousing 7) Big Data Analytics 8) Data Provenance
More informationGrid Computing Systems: A Survey and Taxonomy
Grid Computing Systems: A Survey and Taxonomy Material for this lecture from: A Survey and Taxonomy of Resource Management Systems for Grid Computing Systems, K. Krauter, R. Buyya, M. Maheswaran, CS Technical
More informationKeyword Search over Hybrid XML-Relational Databases
SICE Annual Conference 2008 August 20-22, 2008, The University Electro-Communications, Japan Keyword Search over Hybrid XML-Relational Databases Liru Zhang 1 Tadashi Ohmori 1 and Mamoru Hoshi 1 1 Graduate
More informationB.H.GARDI COLLEGE OF MASTER OF COMPUTER APPLICATION. Ch. 1 :- Introduction Database Management System - 1
Basic Concepts :- 1. What is Data? Data is a collection of facts from which conclusion may be drawn. In computer science, data is anything in a form suitable for use with a computer. Data is often distinguished
More informationXML Systems & Benchmarks
XML Systems & Benchmarks Christoph Staudt Peter Chiv Saarland University, Germany July 1st, 2003 Main Goals of our talk Part I Show up how databases and XML come together Make clear the problems that arise
More informationDatabase Applications (15-415)
Database Applications (15-415) ER to Relational & Relational Algebra Lecture 4, January 20, 2015 Mohammad Hammoud Today Last Session: The relational model Today s Session: ER to relational Relational algebra
More informationSTBenchmark: Towards a Benchmark for Mapping Systems
STBenchmark: Towards a Benchmark for Mapping Systems Bogdan Alexe UC Santa Cruz abogdan@cs.ucsc.edu Wang-Chiew Tan UC Santa Cruz wctan@cs.ucsc.edu Yannis Velegrakis University of Trento velgias@disi.unitn.eu
More informationSQL STRUCTURED QUERY LANGUAGE
STRUCTURED QUERY LANGUAGE SQL Structured Query Language 4.1 Introduction Originally, SQL was called SEQUEL (for Structured English QUery Language) and implemented at IBM Research as the interface for an
More information