FDB: A Query Engine for Factorised Relational Databases
FDB: A Query Engine for Factorised Relational Databases
Nurzhan Bakibayev, Dan Olteanu, and Jakub Zavodny (Oxford CS)
Slides by Christan Grant (cgrant@cise.ufl.edu), University of Florida, November 1, 2013

Contents
1. Intro
2. F-Representation
3. F-Trees
4. Query Evaluation
   - Normalisation Operator
   - Swap Operator
   - Cartesian Product Operator
   - Selection Operator (Merge Selection, Absorb Selection, Selection with Constant)
   - Projection Operator
5. Query Optimization
   - Cost of an F-Plan
   - Exhaustive Search
   - Greedy Heuristic
6. Experiments
   - Experiment #1
   - Experiment #2
   - Experiment #3
   - Experiment #4
7. Conclusion
8. Thanks

Intro
- A succinct representation for large relations.
- It is used in Google's F1 data management system for ML.
- Used for large incomplete/uncertain relations.
- FDB can speed up expected queries.
- This paper presents the basics of the FDB system.

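TL;DR
- Given a query: Q1 = Order ⋈_item Store ⋈_location Disp
- And a result: [figure: flat query result, not captured]
- We can represent it as this: [figure: factorised representation, not captured]
- Or a tree: [figure: f-tree, not captured]
- For SPJ (select-project-join) queries this can take exponentially less space and less time.
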
Example Schema
[figure: example schema, not captured]

Factorized Representations
- Relational algebra expressions.
- Constructed from singleton relations ⟨A:a⟩ with the union and product operators.
- Can be exponentially more succinct than the relations they encode.
- Allow fast (constant-delay) enumeration of tuples.

Factorized Representations: Definition 1
A factorised representation E, or f-representation for short, over a set S of attributes and domain D is a relational algebra expression of the form
- ∅, the empty relation over S;
- ⟨⟩, the relation consisting of the nullary tuple, if S = ∅;
- ⟨A:a⟩, the unary relation with a single tuple with value a, if S = {A} and a is a value in the domain D;
- (E), where E is an f-representation over S;
- E_1 ∪ ⋯ ∪ E_n, where each E_i is an f-representation over S;
- E_1 × ⋯ × E_n, where each E_i is an f-representation over S_i and S is the disjoint union of all S_i.

Factorized Representations
- An expression ⟨A:a⟩ is called a singleton.
- The size |E| of an f-representation E is the number of singletons in it.
- A single relation may have many different f-representations.
- The tuples of a given f-representation can be enumerated with O(|S|) additional space and delay.
- [figures: a factorized representation and a more compact factorized representation, not captured]

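The definition and the size measure can be made concrete with a minimal Python sketch. The class names `Singleton`, `Union`, and `Product` and the helpers `size` and `tuples` are illustrative choices, not part of FDB, and the recursive enumerator below makes no attempt to achieve the O(|S|) delay bound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Singleton:
    attr: str       # attribute name A
    value: object   # value a, so this node stands for <A:a>


@dataclass(frozen=True)
class Union:
    parts: tuple    # f-representations over the same attribute set


@dataclass(frozen=True)
class Product:
    parts: tuple    # f-representations over disjoint attribute sets


def size(e):
    """Size of an f-representation: the number of singletons in it."""
    if isinstance(e, Singleton):
        return 1
    return sum(size(p) for p in e.parts)


def tuples(e):
    """Yield every tuple encoded by e as a dict from attribute to value."""
    if isinstance(e, Singleton):
        yield {e.attr: e.value}
    elif isinstance(e, Union):
        # An empty union yields nothing: the empty relation.
        for part in e.parts:
            yield from tuples(part)
    else:
        # A product combines one tuple from each factor.
        # An empty product yields the single nullary tuple {}.
        def combine(parts):
            if not parts:
                yield {}
                return
            for head in tuples(parts[0]):
                for rest in combine(parts[1:]):
                    yield {**head, **rest}
        yield from combine(e.parts)


# (<A:1> U <A:2>) x (<B:x> U <B:y> U <B:z>)
example = Product((
    Union((Singleton("A", 1), Singleton("A", 2))),
    Union((Singleton("B", "x"), Singleton("B", "y"), Singleton("B", "z"))),
))
```

Here `example` uses five singletons to encode six tuples, which as a flat relation would hold twelve values.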
F-Trees: Definition
- An f-tree over a schema S of attributes is an unordered rooted forest with each node labelled by a non-empty subset of S such that each attribute of S labels exactly one node.
- [figures: Q1 f-tree and Q1 alternate f-tree, not captured]
- The factorisation of a relation R over an f-tree is created in O(|S| · |R| · log |R|).
- We can go back to R in O(|S| · |R|).

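A rough sketch of how a relation can be factorised over an f-tree and expanded back is given here. It assumes one attribute per node, assumes the relation really does factorise over the given tree, and uses a nested list-and-dict encoding chosen for this example only. The attribute names and data are hypothetical, and the sketch does not meet the time bounds quoted above.

```python
def factorise(rows, forest):
    """Factorise `rows` (a list of dicts) over `forest`.

    `forest` is a list of (attribute, child_forest) pairs. The result is a
    product, encoded as a list with one (attribute, union) entry per tree,
    where each union maps a value to the factorisation of the subtree below it.
    """
    result = []
    for attr, children in forest:
        union = {}
        for value in sorted({row[attr] for row in rows}):
            matching = [row for row in rows if row[attr] == value]
            union[value] = factorise(matching, children)
        result.append((attr, union))
    return result


def flatten(fact):
    """Expand a factorisation back into a list of flat rows (dicts)."""
    rows = [{}]
    for attr, union in fact:
        branch = []
        for value, sub in union.items():
            for tail in flatten(sub):
                branch.append({attr: value, **tail})
        # Product: pair every row built so far with every row of this factor.
        rows = [{**left, **right} for left in rows for right in branch]
    return rows


# Hypothetical data: for each item, order ids and locations are independent.
rows = [
    {"item": "milk", "oid": 1, "location": "A"},
    {"item": "milk", "oid": 1, "location": "B"},
    {"item": "milk", "oid": 2, "location": "A"},
    {"item": "milk", "oid": 2, "location": "B"},
    {"item": "cheese", "oid": 3, "location": "A"},
]
forest = [("item", [("oid", []), ("location", [])])]
fact = factorise(rows, forest)
```

With this data the factorisation stores 8 values (milk, 1, 2, A, B, cheese, 3, A) where the flat relation stores 15.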
F-Trees: Queries
- Given a query Q = π_P σ_φ (R_1 × ⋯ × R_n).
- An f-tree T is an f-tree of Q if and only if it satisfies the path constraint: dependent attributes can only label nodes along the same root-to-leaf path.
- Attributes of a join are dependent only if they are in the projection list P or in the same relation R_i.

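The path constraint can be checked mechanically. The sketch below assumes the f-tree is given as a child-to-parent map and that the dependent attribute pairs are already known. The helper names, the attribute names `oid` and `dispatcher`, and the dependency pairs are assumptions made for illustration.

```python
def ancestors_or_self(node, parent):
    """Return the set containing `node` and all of its ancestors."""
    seen = set()
    while node is not None:
        seen.add(node)
        node = parent.get(node)
    return seen


def satisfies_path_constraint(parent, node_of, dependent_pairs):
    """True if every pair of dependent attributes lies on one root-to-leaf path.

    parent:          maps each f-tree node to its parent (roots map to None)
    node_of:         maps each attribute to the node it labels
    dependent_pairs: iterable of (attribute, attribute) pairs
    """
    for a, b in dependent_pairs:
        node_a, node_b = node_of[a], node_of[b]
        on_same_path = (
            node_a in ancestors_or_self(node_b, parent)
            or node_b in ancestors_or_self(node_a, parent)
        )
        if not on_same_path:
            return False
    return True


# Hypothetical attributes for Order, Store and Disp; one attribute per node.
node_of = {a: a for a in ("item", "oid", "location", "dispatcher")}
dependent = [("oid", "item"), ("item", "location"), ("location", "dispatcher")]

# item is the root; oid and location are its children; dispatcher is below location.
good_tree = {"item": None, "oid": "item", "location": "item", "dispatcher": "location"}

# Same tree, but dispatcher is placed below oid, away from location.
bad_tree = {"item": None, "oid": "item", "location": "item", "dispatcher": "oid"}
```

`satisfies_path_constraint(good_tree, node_of, dependent)` holds, while `bad_tree` fails because `location` and `dispatcher` end up on different branches.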
F-Trees: Size Bounds
- Let s(T) be the maximal edge cover number over the root-to-leaf paths of T.
- For any database D and f-tree T, the factorised result size is at most O(|P| · |D|^s(T)).
- Let s(Q) be the minimal s(T) over all f-trees of Q.
- The bound |P| · |D|^s(Q) can be asymptotically smaller than the flat result size |Q(D)|, which can mean exponential size and time savings.

F-plans
- Any select-project-join query can be evaluated by a sequential composition of operators called an f-plan.
- F-trees uniquely identify f-representations, so the operators are described on f-trees.
- The time complexity of each f-plan operator is O(|T|^2 · N · log N), where N is the sum of the sizes of the input and output f-representations and T is the input f-tree.

Normalisation Operator
- Push-up operator ψ_B; normalisation η(T).
- A tree is normalized if the push-up operator cannot be applied.
- The push-up operator is applied bottom-up.
- All other operators expect normalized inputs.
- [figures: ψ_B and η(T), not captured]

Swap Operator χ_{A,B}
- Exchanges B with its parent A and promotes children.
- [figure not captured]

Cartesian Product Operator
- T = T_1 × T_2
- [figure not captured]

Merge Selection Operator µ_{A,B}
- Applies if A and B are siblings.
- [figure: example of µ applied to two sibling unions, not captured]

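On the data level, merging two sibling unions under the selection A = B amounts to keeping only the values that occur in both. The sketch below assumes each union is stored as a dict from value to the sub-expression below it. That encoding and the name `merge_siblings` are illustrative, not FDB's actual data structures.

```python
def merge_siblings(union_a, union_b):
    """Merge the sibling unions over A and B under the selection A = B.

    Each union maps a value to the expression stored below that value.
    Only values present on both sides survive; the merged node carries the
    sub-expressions of both A and B as a pair (their product).
    """
    merged = {}
    for value in sorted(union_a.keys() & union_b.keys()):
        merged[value] = (union_a[value], union_b[value])
    return merged


# Placeholder strings stand in for the sub-expressions below each value.
union_a = {1: "E1", 2: "E2", 3: "E3"}
union_b = {2: "F2", 3: "F3", 4: "F4"}
merged = merge_siblings(union_a, union_b)
```

Values 1 and 4 have no partner and are dropped, so `merged` keeps only the entries for 2 and 3.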
Absorb Selection Operator α_{A,B}
- Applies if A is an ancestor of B.
- [figure: example, not captured]

Selection with Constant Operator σ_{A θ c}
- Filters out every singleton ⟨A:a⟩ whose value a does not satisfy a θ c.

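A selection with a constant can be sketched as a recursive filter that also prunes branches left empty. The nested encoding (a list of `(attribute, {value: sub-factorisation})` entries) and the function name are assumptions for this example, as is the toy data.

```python
def select_constant(fact, attr, keep):
    """Apply a selection on `attr` to a factorisation.

    fact: a product, encoded as a list of (attribute, union) entries, where
          each union maps a value to the factorisation below it
    keep: predicate implementing the comparison with the constant
    Returns the filtered factorisation, or None if it encodes no tuples.
    """
    out = []
    for name, union in fact:
        kept = {}
        for value, sub in union.items():
            if name == attr and not keep(value):
                continue  # drop singletons that fail the condition
            new_sub = select_constant(sub, attr, keep)
            if new_sub is not None:
                kept[value] = new_sub
        if not kept:
            return None  # one empty factor makes the whole product empty
        out.append((name, kept))
    return out


# Hypothetical factorisation: items with the order ids below them.
fact = [("item", {
    "milk": [("oid", {1: [], 2: []})],
    "cheese": [("oid", {3: []})],
})]
selected = select_constant(fact, "oid", lambda v: v > 2)
```

The `milk` branch disappears entirely because none of its order ids pass the filter, which is the pruning step that keeps the result a valid factorisation.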
Projection Operator π_A
- Removes attribute A from the tree.
- Only permitted if A labels a leaf node or shares its node with other attributes.

Query Optimization
Query optimization has two objectives:
- minimizing the cost of computing a factorised query result;
- minimizing the size of this output representation.

Query Optimization
- Optimize the f-tree, then the f-representation.
- Products and selections are always evaluated first (they are cheapest).
- A selection A = B can only be performed if A and B are on the same path or are siblings.

Cost of an F-Plan
- Either use the asymptotic bounds given by s(T): an f-plan f has an evaluation time of O(|D|^s(f) · log |D|), where s(f) = max(s(T_0), …, s(T_k));
- or use cardinality estimates for the intermediate f-trees, computed with RDBMS-style techniques.

Exhaustive Search
- Enumerate the plans; the least-cost plan is the shortest path (Dijkstra) from T_init to T_final.
- Search space is O(n 4n ) with n p attributes in the projection list.

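The shortest-path formulation can be sketched with a generic Dijkstra search over f-tree states. This is only an illustration: it assumes each operator application has an additive, non-negative cost, and the `neighbours` callback, the state names, and the toy costs are hypothetical stand-ins rather than FDB's optimiser.

```python
import heapq
from itertools import count


def cheapest_plan(start, is_final, neighbours):
    """Dijkstra search from `start` to any state accepted by `is_final`.

    neighbours(state) yields (operator_name, next_state, step_cost) triples
    with non-negative costs. Returns (total_cost, [operator names]) for a
    cheapest plan, or None if no final state is reachable.
    """
    tie = count()  # tie-breaker so the heap never compares states or plans
    frontier = [(0, next(tie), start, [])]
    best = {start: 0}
    while frontier:
        cost, _, state, plan = heapq.heappop(frontier)
        if is_final(state):
            return cost, plan
        if cost > best.get(state, float("inf")):
            continue  # stale queue entry
        for op, nxt, step in neighbours(state):
            new_cost = cost + step
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(frontier, (new_cost, next(tie), nxt, plan + [op]))
    return None


# Toy state graph with made-up operator costs.
toy_graph = {
    "T_init": [("swap", "T1", 4), ("merge", "T2", 1)],
    "T1": [("absorb", "T_final", 1)],
    "T2": [("swap", "T_final", 2)],
    "T_final": [],
}
result = cheapest_plan(
    "T_init",
    lambda state: state == "T_final",
    lambda state: toy_graph[state],
)
```

On the toy graph the search prefers the merge-then-swap route, with total cost 3, over the swap-then-absorb route, with total cost 5.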
Greedy Heuristic
- Only restructures nodes that appear in selections and projections.
- When ordering projections, chooses the least-cost option at each step.
- Computing the tree takes O(|T|^2).

Implementation
- Compares SQLite and PostgreSQL with FDB and RDB.
- For SQLite and PostgreSQL, expensive writes were disabled where possible (to approximate in-memory operation).
- Implemented in C++.
- Optimal f-representation computed with a multi-way merge-sort join algorithm.

Experiment Design
- Generate random data for the queries.
- Repeat a number of times, measuring optimization time, execution time, representation sizes, and f-plan quality.
- Tuples are generated independently with a uniform or Zipf distribution.
- Queries are equality joins.

Experiment #1
- Average time for optimising a query on flat data.
- A = 40 attributes, R = 1, …, 8 relations, K = 1, …, 9 selections.
- [figures not captured]

Experiment #2
[figures not captured]

Experiment #3
- Compare performance of FDB, RDB, SQLite and PostgreSQL.
- Each tuple size decreases the result size by 20.
- [figures: result size and evaluation time, not captured]

Experiment #3 (continued)
- Compare performance of FDB, RDB, SQLite and PostgreSQL.
- Q_A is a many-to-many query. Its result is highly compressible, so the factorized result is small.

Experiment #4
- Experiment over factorized data.
- [figures: result size and evaluation time, not captured]

Conclusion
- Factorizing relational data prevents the explosion of data size for large joins.
- This paper described techniques and experiments that decrease this size and improve query performance.
- For MLNs and other graph DBs that require many joins, this can significantly improve query performance.

Thanks
Thank you. Questions?
FDB: A Query Engine for Factorised Relational Databases
FDB: A Query Engine for Factorised Relational Databases Nurzhan Bakibayev, Dan Olteanu, and Jakub Závodný Department of Computer Science, University of Oxford, OX1 3QD, UK {nurzhan.bakibayev, dan.olteanu,
More informationQueries with Order-by Clauses and Aggregates on Factorised Relational Data
Queries with Order-by Clauses and Aggregates on Factorised Relational Data Tomáš Kočiský Magdalen College University of Oxford A fourth year project report submitted for the degree of Masters of Mathematics
More informationarxiv: v1 [cs.db] 1 Jul 2013
Aggregation and Ordering in Factorised Databases Nurzhan Bakibayev, Tomáš Kočiský, Dan Olteanu, and Jakub Závodný Department of Computer Science, University of Oxford, OX1 3QD, UK {nurzhan.bakibayev, tomas.kocisky,
More informationQuery Optimization. Query Optimization. Optimization considerations. Example. Interaction of algorithm choice and tree arrangement.
COS 597: Principles of Database and Information Systems Query Optimization Query Optimization Query as expression over relational algebraic operations Get evaluation (parse) tree Leaves: base relations
More informationQuery Solvers on Database Covers
Query Solvers on Database Covers Wouter Verlaek Kellogg College University of Oxford Supervised by Prof. Dan Olteanu A thesis submitted for the degree of Master of Science in Computer Science Trinity 2018
More informationCSE 530A. B+ Trees. Washington University Fall 2013
CSE 530A B+ Trees Washington University Fall 2013 B Trees A B tree is an ordered (non-binary) tree where the internal nodes can have a varying number of child nodes (within some range) B Trees When a key
More informationGreedy Approach: Intro
Greedy Approach: Intro Applies to optimization problems only Problem solving consists of a series of actions/steps Each action must be 1. Feasible 2. Locally optimal 3. Irrevocable Motivation: If always
More informationUnion/Find Aka: Disjoint-set forest. Problem definition. Naïve attempts CS 445
CS 5 Union/Find Aka: Disjoint-set forest Alon Efrat Problem definition Given: A set of atoms S={1, n E.g. each represents a commercial name of a drugs. This set consists of different disjoint subsets.
More informationChapter 2: Intro to Relational Model
Non è possibile visualizzare l'immagine. Chapter 2: Intro to Relational Model Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Example of a Relation attributes (or columns)
More informationQuery Processing & Optimization
Query Processing & Optimization 1 Roadmap of This Lecture Overview of query processing Measures of Query Cost Selection Operation Sorting Join Operation Other Operations Evaluation of Expressions Introduction
More informationChapter 2: Intro to Relational Model
Chapter 2: Intro to Relational Model Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Example of a Relation attributes (or columns) tuples (or rows) 2.2 Attribute Types The
More informationParallel Query Optimisation
Parallel Query Optimisation Contents Objectives of parallel query optimisation Parallel query optimisation Two-Phase optimisation One-Phase optimisation Inter-operator parallelism oriented optimisation
More informationDatabase Systems External Sorting and Query Optimization. A.R. Hurson 323 CS Building
External Sorting and Query Optimization A.R. Hurson 323 CS Building External sorting When data to be sorted cannot fit into available main memory, external sorting algorithm must be applied. Naturally,
More informationlooking ahead to see the optimum
! Make choice based on immediate rewards rather than looking ahead to see the optimum! In many cases this is effective as the look ahead variation can require exponential time as the number of possible
More informationIntroduction to Optimization
Introduction to Optimization Greedy Algorithms October 28, 2016 École Centrale Paris, Châtenay-Malabry, France Dimo Brockhoff Inria Saclay Ile-de-France 2 Course Overview Date Fri, 7.10.2016 Fri, 28.10.2016
More informationQUERY OPTIMIZATION FOR DATABASE MANAGEMENT SYSTEM BY APPLYING DYNAMIC PROGRAMMING ALGORITHM
QUERY OPTIMIZATION FOR DATABASE MANAGEMENT SYSTEM BY APPLYING DYNAMIC PROGRAMMING ALGORITHM Wisnu Adityo NIM 13506029 Information Technology Department Institut Teknologi Bandung Jalan Ganesha 10 e-mail:
More informationQuery Optimization. Shuigeng Zhou. December 9, 2009 School of Computer Science Fudan University
Query Optimization Shuigeng Zhou December 9, 2009 School of Computer Science Fudan University Outline Introduction Catalog Information for Cost Estimation Estimation of Statistics Transformation of Relational
More informationFactorized Databases
Factorized Databases http://www.cs.ox.ac.uk/projects/fdb/ Dan Olteanu Maximilian Schleich Department of Computer Science, University of Oxford ABSTRACT This paper overviews factorized databases and their
More informationOptimization Overview
Lecture 17 Optimization Overview Lecture 17 Lecture 17 Today s Lecture 1. Logical Optimization 2. Physical Optimization 3. Course Summary 2 Lecture 17 Logical vs. Physical Optimization Logical optimization:
More informationCourse Review for Finals. Cpt S 223 Fall 2008
Course Review for Finals Cpt S 223 Fall 2008 1 Course Overview Introduction to advanced data structures Algorithmic asymptotic analysis Programming data structures Program design based on performance i.e.,
More informationIntroduction to Query Processing and Query Optimization Techniques. Copyright 2011 Ramez Elmasri and Shamkant Navathe
Introduction to Query Processing and Query Optimization Techniques Outline Translating SQL Queries into Relational Algebra Algorithms for External Sorting Algorithms for SELECT and JOIN Operations Algorithms
More informationQuestion 1 (a) 10 marks
Question 1 (a) Consider the tree in the next slide. Find any/ all violations of a B+tree structure. Identify each bad node and give a brief explanation of each error. Assume the order of the tree is 4
More informationmanaging an evolving set of connected components implementing a Union-Find data structure implementing Kruskal s algorithm
Spanning Trees 1 Spanning Trees the minimum spanning tree problem three greedy algorithms analysis of the algorithms 2 The Union-Find Data Structure managing an evolving set of connected components implementing
More informationRelational Model: History
Relational Model: History Objectives of Relational Model: 1. Promote high degree of data independence 2. Eliminate redundancy, consistency, etc. problems 3. Enable proliferation of non-procedural DML s
More informationReference Sheet for CO142.2 Discrete Mathematics II
Reference Sheet for CO14. Discrete Mathematics II Spring 017 1 Graphs Defintions 1. Graph: set of N nodes and A arcs such that each a A is associated with an unordered pair of nodes.. Simple graph: no
More informationIntroduction Alternative ways of evaluating a given query using
Query Optimization Introduction Catalog Information for Cost Estimation Estimation of Statistics Transformation of Relational Expressions Dynamic Programming for Choosing Evaluation Plans Introduction
More informationQuery Evaluation and Optimization
Query Evaluation and Optimization Jan Chomicki University at Buffalo Jan Chomicki () Query Evaluation and Optimization 1 / 21 Evaluating σ E (R) Jan Chomicki () Query Evaluation and Optimization 2 / 21
More informationCMSC 424 Database design Lecture 18 Query optimization. Mihai Pop
CMSC 424 Database design Lecture 18 Query optimization Mihai Pop More midterm solutions Projects do not be late! Admin Introduction Alternative ways of evaluating a given query Equivalent expressions Different
More informationPrinciples of Data Management. Lecture #12 (Query Optimization I)
Principles of Data Management Lecture #12 (Query Optimization I) Instructor: Mike Carey mjcarey@ics.uci.edu Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Today s Notable News v B+ tree
More informationAdvanced Databases. Lecture 4 - Query Optimization. Masood Niazi Torshiz Islamic Azad university- Mashhad Branch
Advanced Databases Lecture 4 - Query Optimization Masood Niazi Torshiz Islamic Azad university- Mashhad Branch www.mniazi.ir Query Optimization Introduction Transformation of Relational Expressions Catalog
More informationTwo hours UNIVERSITY OF MANCHESTER SCHOOL OF COMPUTER SCIENCE
COMP 62421 Two hours UNIVERSITY OF MANCHESTER SCHOOL OF COMPUTER SCIENCE Querying Data on the Web Date: Wednesday 24th January 2018 Time: 14:00-16:00 Please answer all FIVE Questions provided. They amount
More informationMinimum-Spanning-Tree problem. Minimum Spanning Trees (Forests) Minimum-Spanning-Tree problem
Minimum Spanning Trees (Forests) Given an undirected graph G=(V,E) with each edge e having a weight w(e) : Find a subgraph T of G of minimum total weight s.t. every pair of vertices connected in G are
More informationTechnical University of Denmark
Technical University of Denmark Written examination, May 7, 27. Course name: Algorithms and Data Structures Course number: 2326 Aids: Written aids. It is not permitted to bring a calculator. Duration:
More informationDatabase System Concepts
Chapter 13: Query Processing s Departamento de Engenharia Informática Instituto Superior Técnico 1 st Semester 2008/2009 Slides (fortemente) baseados nos slides oficiais do livro c Silberschatz, Korth
More information1 Format. 2 Topics Covered. 2.1 Minimal Spanning Trees. 2.2 Union Find. 2.3 Greedy. CS 124 Quiz 2 Review 3/25/18
CS 124 Quiz 2 Review 3/25/18 1 Format You will have 83 minutes to complete the exam. The exam may have true/false questions, multiple choice, example/counterexample problems, run-this-algorithm problems,
More informationLecture 13. Reading: Weiss, Ch. 9, Ch 8 CSE 100, UCSD: LEC 13. Page 1 of 29
Lecture 13 Connectedness in graphs Spanning trees in graphs Finding a minimal spanning tree Time costs of graph problems and NP-completeness Finding a minimal spanning tree: Prim s and Kruskal s algorithms
More informationCS521 \ Notes for the Final Exam
CS521 \ Notes for final exam 1 Ariel Stolerman Asymptotic Notations: CS521 \ Notes for the Final Exam Notation Definition Limit Big-O ( ) Small-o ( ) Big- ( ) Small- ( ) Big- ( ) Notes: ( ) ( ) ( ) ( )
More informationCSE 100: GRAPH ALGORITHMS
CSE 100: GRAPH ALGORITHMS Dijkstra s Algorithm: Questions Initialize the graph: Give all vertices a dist of INFINITY, set all done flags to false Start at s; give s dist = 0 and set prev field to -1 Enqueue
More informationAdministrivia. CS 133: Databases. Cost-based Query Sub-System. Goals for Today. Midterm on Thursday 10/18. Assignments
Administrivia Midterm on Thursday 10/18 CS 133: Databases Fall 2018 Lec 12 10/16 Prof. Beth Trushkowsky Assignments Lab 3 starts after fall break No problem set out this week Goals for Today Cost-based
More informationAlgorithms for Query Processing and Optimization. 0. Introduction to Query Processing (1)
Chapter 19 Algorithms for Query Processing and Optimization 0. Introduction to Query Processing (1) Query optimization: The process of choosing a suitable execution strategy for processing a query. Two
More informationσ (R.B = 1 v R.C > 3) (S.D = 2) Conjunctive normal form Topics for the Day Distributed Databases Query Processing Steps Decomposition
Topics for the Day Distributed Databases Query processing in distributed databases Localization Distributed query operators Cost-based optimization C37 Lecture 1 May 30, 2001 1 2 Query Processing teps
More informationContents Contents Introduction Basic Steps in Query Processing Introduction Transformation of Relational Expressions...
Contents Contents...283 Introduction...283 Basic Steps in Query Processing...284 Introduction...285 Transformation of Relational Expressions...287 Equivalence Rules...289 Transformation Example: Pushing
More information( ) D. Θ ( ) ( ) Ο f ( n) ( ) Ω. C. T n C. Θ. B. n logn Ο
CSE 0 Name Test Fall 0 Multiple Choice. Write your answer to the LEFT of each problem. points each. The expected time for insertion sort for n keys is in which set? (All n! input permutations are equally
More informationCS 6783 (Applied Algorithms) Lecture 5
CS 6783 (Applied Algorithms) Lecture 5 Antonina Kolokolova January 19, 2012 1 Minimum Spanning Trees An undirected graph G is a pair (V, E); V is a set (of vertices or nodes); E is a set of (undirected)
More informationSchema for Examples. Query Optimization. Alternative Plans 1 (No Indexes) Motivating Example. Alternative Plans 2 With Indexes
Schema for Examples Query Optimization (sid: integer, : string, rating: integer, age: real) (sid: integer, bid: integer, day: dates, rname: string) Similar to old schema; rname added for variations. :
More informationSets MAT231. Fall Transition to Higher Mathematics. MAT231 (Transition to Higher Math) Sets Fall / 31
Sets MAT231 Transition to Higher Mathematics Fall 2014 MAT231 (Transition to Higher Math) Sets Fall 2014 1 / 31 Outline 1 Sets Introduction Cartesian Products Subsets Power Sets Union, Intersection, Difference
More information( ) n 3. n 2 ( ) D. Ο
CSE 0 Name Test Summer 0 Last Digits of Mav ID # Multiple Choice. Write your answer to the LEFT of each problem. points each. The time to multiply two n n matrices is: A. Θ( n) B. Θ( max( m,n, p) ) C.
More informationQuery Optimization. Schema for Examples. Motivating Example. Similar to old schema; rname added for variations. Reserves: Sailors:
Query Optimization Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Schema for Examples (sid: integer, sname: string, rating: integer, age: real) (sid: integer, bid: integer, day: dates,
More informationEvaluating XPath Queries
Chapter 8 Evaluating XPath Queries Peter Wood (BBK) XML Data Management 201 / 353 Introduction When XML documents are small and can fit in memory, evaluating XPath expressions can be done efficiently But
More informationChapter 11: Query Optimization
Chapter 11: Query Optimization Chapter 11: Query Optimization Introduction Transformation of Relational Expressions Statistical Information for Cost Estimation Cost-based optimization Dynamic Programming
More informationCS301 - Data Structures Glossary By
CS301 - Data Structures Glossary By Abstract Data Type : A set of data values and associated operations that are precisely specified independent of any particular implementation. Also known as ADT Algorithm
More informationBrute Force: Selection Sort
Brute Force: Intro Brute force means straightforward approach Usually based directly on problem s specs Force refers to computational power Usually not as efficient as elegant solutions Advantages: Applicable
More informationQuery Optimization. Schema for Examples. Motivating Example. Similar to old schema; rname added for variations. Reserves: Sailors:
Query Optimization atabase Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Schema for Examples (sid: integer, sname: string, rating: integer, age: real) (sid: integer, bid: integer, day: dates,
More informationChapter 13: Query Optimization. Chapter 13: Query Optimization
Chapter 13: Query Optimization Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 13: Query Optimization Introduction Equivalent Relational Algebra Expressions Statistical
More informationImproving Query Plans. CS157B Chris Pollett Mar. 21, 2005.
Improving Query Plans CS157B Chris Pollett Mar. 21, 2005. Outline Parse Trees and Grammars Algebraic Laws for Improving Query Plans From Parse Trees To Logical Query Plans Syntax Analysis and Parse Trees
More informationChapter 12: Query Processing
Chapter 12: Query Processing Overview Catalog Information for Cost Estimation $ Measures of Query Cost Selection Operation Sorting Join Operation Other Operations Evaluation of Expressions Transformation
More informationExam 1. March 20th, CS525 - Midterm Exam Solutions
Name CWID Exam 1 March 20th, 2017 CS525 - Midterm Exam s Please leave this empty! 1 2 3 4 Sum Things that you are not allowed to use Personal notes Textbook Printed lecture notes Phone The exam is 75 minutes
More informationCopyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley. Chapter 6 Outline. Unary Relational Operations: SELECT and