Chapter 4 Distributed Query Processing


Table of Contents
  1. Overview of Query Processing
  2. Query Decomposition and Data Localization
  3. Optimization of Distributed Queries

1. Overview of Query Processing

Objectives of Query Processing
  Transform an SQL query into an efficient execution strategy expressed in a low-level language.
  Query optimization: choose the strategy that minimizes a cost measure.
    Total cost: CPU, I/O, and communication cost
    Response time: reduced through parallelization

Layers of Query Processing
  Input: SQL query on global relations
  Control site
    Query decomposition (uses the global schema)
      -> algebraic query on global relations
    Data localization (uses the fragment schema)
      -> fragment query
    Global optimization (uses statistics on fragments)
      -> optimized fragment query with communication operations
  Local sites
    Local optimization (uses the local schema)
      -> optimized local queries

2. Query Decomposition and Data Localization

Query Decomposition
  Maps an SQL query on global relations to a relational algebra query on global relations.
  Uses the same techniques as a centralized DBMS.
Data Localization
  Maps the relational algebra query on global relations to a relational algebra query on fragments (the fragment query).
  Fragment query and node communication cost.

2.1 Query Decomposition

Output Queries of Query Decomposition
  Semantically correct
  Good in the sense that redundant work is avoided
Steps of Query Decomposition
  1. Normalization
  2. Analysis
  3. Elimination of redundancy
  4. Rewriting

Normalization
  Goal: transform the query into a normalized form to facilitate further processing.
  Conjunctive normal form (CNF) vs. disjunctive normal form (DNF)
  Example
    SELECT ENAME
    FROM E, G
    WHERE E.ENO = G.ENO AND G.JNO = 'J1' AND (DUR = 12 OR DUR = 24);
  CNF: E.ENO = G.ENO ∧ G.JNO = 'J1' ∧ (DUR = 12 ∨ DUR = 24)
  DNF: (E.ENO = G.ENO ∧ G.JNO = 'J1' ∧ DUR = 12) ∨ (E.ENO = G.ENO ∧ G.JNO = 'J1' ∧ DUR = 24)
  The DNF repeats the first two predicates in every disjunct: redundant expressions!

Analysis
  Goal: detect normalized queries that are type incorrect or semantically incorrect.
  Type incorrect query
    SELECT E# FROM E WHERE ENAME > 2000;
  E# is not an attribute of E.
  ENAME is of string type, so the comparison > 2000 is incompatible.

Analysis: Semantically Incorrect Query
    SELECT ENAME, RESP
    FROM E, G, J
    WHERE E.ENO = G.ENO AND DUR > 36 AND TITLE = 'Programmer' AND JNAME = 'CAD/CAM';
  How is the predicate JNAME = 'CAD/CAM' related to the rest of the query?
  Query graph
    Node: result relation and operand relations
    Edge: operation (join, project)
  Join graph: subgraph of the query graph

Analysis: Query Graph
    SELECT ENAME, RESP
    FROM E, G, J
    WHERE E.ENO = G.ENO AND G.JNO = J.JNO AND DUR > 36 AND TITLE = 'Programmer' AND JNAME = 'CAD/CAM';
  Query graph (figure)
    Nodes: E, G, J, Result
    E - G: E.ENO = G.ENO
    G - J: G.JNO = J.JNO
    E - Result: ENAME
    G - Result: RESP
    Selections: TITLE = 'Programmer' on E, DUR > 36 on G, JNAME = 'CAD/CAM' on J
  Join graph (figure)
    E - G: E.ENO = G.ENO
    G - J: G.JNO = J.JNO
  If the query graph is disconnected, the query is semantically incorrect. The first query, which lacks G.JNO = J.JNO, leaves J disconnected.

Elimination of Redundancy
  Simplify the qualification by applying idempotency rules.
  Example
    SELECT TITLE
    FROM E
    WHERE (NOT (TITLE = 'Programmer') AND (TITLE = 'Programmer' OR TITLE = 'Engineer') AND NOT (TITLE = 'Engineer')) OR ENAME = 'J.Doe';
  is equivalent to
    SELECT TITLE
    FROM E
    WHERE ENAME = 'J.Doe';
  With p1: TITLE = 'Programmer', p2: TITLE = 'Engineer', p3: ENAME = 'J.Doe':
    (¬p1 ∧ (p1 ∨ p2) ∧ ¬p2) ∨ p3 = p3

Rewriting
  Goal: translate the SQL query into relational algebra, then restructure the relational algebra query into a better one.
  Output: relational algebra tree
  Relational algebra tree
    Leaf nodes: relations in the FROM clause
    Root node: the SELECT clause, as a projection
    Internal nodes: the WHERE clause, as selection and join operations
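The simplification in the elimination-of-redundancy example can be checked mechanically. The following is a minimal sketch, not part of the original slides, that enumerates all truth assignments of p1, p2, p3 and compares both forms; the function names are illustrative.

```python
from itertools import product


def original(p1: bool, p2: bool, p3: bool) -> bool:
    # (NOT p1 AND (p1 OR p2) AND NOT p2) OR p3
    return ((not p1) and (p1 or p2) and (not p2)) or p3


def simplified(p1: bool, p2: bool, p3: bool) -> bool:
    # The qualification after eliminating redundancy.
    return p3


def equivalent() -> bool:
    # Compare both forms on all 2^3 truth assignments.
    return all(
        original(*values) == simplified(*values)
        for values in product([False, True], repeat=3)
    )


print(equivalent())
```

The first disjunct can never be true: ¬p1 ∧ ¬p2 contradicts p1 ∨ p2, so only p3 remains.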

Rewriting: Example
    SELECT ENAME
    FROM J, G, E
    WHERE G.ENO = E.ENO AND G.JNO = J.JNO AND ENAME ≠ 'J.Doe' AND JNAME = 'CAD/CAM' AND (DUR = 12 OR DUR = 24);
  Relational algebra tree (root first)
    Π_ENAME
      σ_{DUR = 12 ∨ DUR = 24}
        σ_{JNAME = 'CAD/CAM'}
          σ_{ENAME ≠ 'J.Doe'}
            JN_JNO
              J
              JN_ENO
                G
                E

Rewriting
  Combine a sequence of unary operations:
    Π_ENAME
      σ_{ENAME ≠ 'J.Doe' ∧ JNAME = 'CAD/CAM' ∧ (DUR = 12 ∨ DUR = 24)}
        JN_JNO
          J
          JN_ENO
            G
            E
  Perform selection as early as possible:
    Π_ENAME
      JN_JNO
        Π_JNO
          σ_{JNAME = 'CAD/CAM'}
            J
        Π_{JNO, ENAME}
          JN_ENO
            Π_{JNO, ENO}
              σ_{DUR = 12 ∨ DUR = 24}
                G
            Π_{ENO, ENAME}
              σ_{ENAME ≠ 'J.Doe'}
                E

2.2 Data Localization
  Converts the relational algebra tree on global relations into a relational algebra tree on fragments.
  Generic query: the relational algebra tree in which each leaf node is replaced by its fragment definition.
  The generic query is then simplified by reduction.

Reduction for Primary Horizontal Fragmentation
  Rule 1: Reduction with Selection
    R is fragmented into R_1, R_2, ..., R_w, with R_k = σ_{p_k}(R).
    σ_{p_i}(R_k) = ∅ if ∀x in R: ¬(p_i(x) ∧ p_k(x))
    The fragment R_k is then removed from the relational algebra tree.
  Rule 2: Reduction with Join
    R_i JN R_k = ∅ if ∀x in R_i, ∀y in R_k: ¬(p_i(x) ∧ p_k(y))
    Joins that involve an empty fragment, and joins known to be empty, are removed.
  Is the reduced query better than the generic query?
    Worst case: Cartesian product of the two sets of fragments
    Partial joins can be done in parallel.

Example of Rule 1
  Sample relation E
    E_1 = σ_{ENO ≤ 'E3'}(E)
    E_2 = σ_{'E3' < ENO ≤ 'E6'}(E)
    E_3 = σ_{ENO > 'E6'}(E)
  Sample query
    SELECT * FROM E WHERE ENO = 'E5';
  Generic query: σ_{ENO = 'E5'}(E_1 ∪ E_2 ∪ E_3)
  Reduced query: σ_{ENO = 'E5'}(E_2)

Example of Rule 2
  Sample relations E (fragmented as above) and G
    G_1 = σ_{ENO ≤ 'E3'}(G)
    G_2 = σ_{ENO > 'E3'}(G)
  Sample query
    SELECT * FROM E, G WHERE E.ENO = G.ENO;
  Generic query: (E_1 ∪ E_2 ∪ E_3) JN_ENO (G_1 ∪ G_2)
  Reduced query: (E_1 JN_ENO G_1) ∪ (E_2 JN_ENO G_2) ∪ (E_3 JN_ENO G_2)
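The two reduction rules for horizontal fragmentation amount to pruning fragments whose defining predicates contradict the query predicate or each other. The sketch below is one possible illustration under simplifying assumptions that the slides do not make: each fragment is defined by a range on a single attribute, ranges are written as (low, high] intervals over strings, and the attribute name ENO and all function names are illustrative.

```python
from typing import Dict, List, Optional, Tuple

# (low, high]: low is exclusive, high is inclusive, None means unbounded.
Interval = Tuple[Optional[str], Optional[str]]

E_FRAGMENTS: Dict[str, Interval] = {
    "E1": (None, "E3"),
    "E2": ("E3", "E6"),
    "E3": ("E6", None),
}
G_FRAGMENTS: Dict[str, Interval] = {
    "G1": (None, "E3"),
    "G2": ("E3", None),
}


def contains(interval: Interval, value: str) -> bool:
    low, high = interval
    return (low is None or value > low) and (high is None or value <= high)


def overlaps(a: Interval, b: Interval) -> bool:
    # Two (low, high] intervals intersect iff the larger low is below the smaller high.
    lows = [x for x in (a[0], b[0]) if x is not None]
    highs = [x for x in (a[1], b[1]) if x is not None]
    if not lows or not highs:
        return True
    return max(lows) < min(highs)


def reduce_selection(fragments: Dict[str, Interval], value: str) -> List[str]:
    # Rule 1: keep only fragments whose predicate can be true for ENO = value.
    return [name for name, iv in fragments.items() if contains(iv, value)]


def reduce_join(
    left: Dict[str, Interval], right: Dict[str, Interval]
) -> List[Tuple[str, str]]:
    # Rule 2: keep only partial joins whose fragment predicates are compatible.
    return [
        (l_name, r_name)
        for l_name, l_iv in left.items()
        for r_name, r_iv in right.items()
        if overlaps(l_iv, r_iv)
    ]


print(reduce_selection(E_FRAGMENTS, "E5"))    # ['E2']
print(reduce_join(E_FRAGMENTS, G_FRAGMENTS))  # [('E1', 'G1'), ('E2', 'G2'), ('E3', 'G2')]
```

Of the six possible partial joins, only three survive, which is the reduced query of the Rule 2 example.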

Reduction for Vertical Fragmentation
  Rule 3
    R is fragmented into R_1, R_2, ..., R_w, with R_k = Π_{A'}(R), where A' ⊆ A.
    Π_{D, Key}(R_k) is useless if the projection attributes D are not in A'.
  Reduction for vertical fragmentation
    Determine the useless intermediate relations.
    Remove the sub-trees that produce them.

Example of Rule 3
  Sample relation E
    E_1 = Π_{ENO, ENAME}(E)
    E_2 = Π_{ENO, TITLE}(E)
  Sample query
    SELECT ENAME FROM E;
  Generic query: Π_ENAME(E_1 JN_ENO E_2)
  Reduced query: Π_ENAME(E_1)

Reduction for Derived Fragmentation
  Transformation
    Distribute joins over unions.
    Apply rule 2 (join reduction).
  The reduced query is better than the generic query!
  Sample relations E, G
    E_1 = σ_{TITLE = 'Programmer'}(E)
    E_2 = σ_{TITLE ≠ 'Programmer'}(E)
    G_1 = G SJ_ENO E_1
    G_2 = G SJ_ENO E_2
  Sample query
    SELECT * FROM E, G WHERE G.ENO = E.ENO AND TITLE = 'Engineer';
  Generic query:
    (G_1 ∪ G_2) JN_ENO σ_{TITLE = 'Engineer'}(E_1 ∪ E_2)
  Query after pushing selection down:
    (G_1 ∪ G_2) JN_ENO σ_{TITLE = 'Engineer'}(E_2)
  Query after moving unions up:
    (G_1 JN_ENO σ_{TITLE = 'Engineer'}(E_2)) ∪ (G_2 JN_ENO σ_{TITLE = 'Engineer'}(E_2))
  Reduced query:
    G_2 JN_ENO σ_{TITLE = 'Engineer'}(E_2)

Reduction for Hybrid Fragmentation
  Sample relation E
    E_1 = σ_{ENO ≤ 'E4'}(Π_{ENO, ENAME}(E))
    E_2 = σ_{ENO > 'E4'}(Π_{ENO, ENAME}(E))
    E_3 = Π_{ENO, TITLE}(E)
  Sample query
    SELECT ENAME FROM E WHERE ENO = 'E5';
  Generic query: Π_ENAME(σ_{ENO = 'E5'}((E_1 ∪ E_2) JN_ENO E_3))
  Reduced query: Π_ENAME(σ_{ENO = 'E5'}(E_2))

3. Optimization of Distributed Queries
  Goal of query optimization: finding an optimal ordering of operations for a given query.
  Table of Contents
    3.1 Inputs to Query Optimization
    3.2 Centralized Query Optimization
    3.3 Join Ordering in Fragment Queries
    3.4 Distributed Query Optimization Algorithms

3.1 Inputs to Query Optimization

Cost Model
  Reducing the total cost
    Total cost = C_CPU * #insts + C_I/O * #I/Os + C_MSG * #msgs + C_TR * #bytes
    Major factor in a WAN: communication cost
    Major factors in a LAN: all three components (CPU, I/O, communication)
  Reducing the response time
    Same formula, but each #x counts only the operations that must be performed sequentially; work done in parallel does not add to the response time.
    How does response time relate to total cost (?)

Database Statistics
  Relation R with attributes A = {A_1, A_2, ..., A_n} and fragments F = {R_1, R_2, ..., R_r}
    length(A_i): length of attribute A_i
    card(Π_{A_i}(R_k)): number of distinct values of A_i in R_k
    min(A_i), max(A_i)
    card(dom[A_i])
    card(R_k)
    size(R) = card(R) * length(R)
  Join selectivity factor SF_J of relations R and S
    SF_J(R, S) = card(R JN S) / (card(R) * card(S))

Cardinalities of Intermediate Results
  Selection: card(σ_F(R)) = SF_S(F) * card(R)
    SF_S(A = value) = 1 / card(Π_A(R))
    SF_S(A > value) = (max(A) - value) / (max(A) - min(A))
  Projection: card(Π_A(R)) = card(R), if A is a key
  Cartesian product: card(R × S) = card(R) * card(S)
  Join: card(R JN_{A=B} S) = card(S), if A is a key of R
        card(R JN S) = SF_J * card(R) * card(S)
  Semijoin: card(R SJ_A S) = SF_SJ(S.A) * card(R)
        SF_SJ(R SJ_A S) = card(Π_A(S)) / card(dom[A])
  Union: between max{card(R), card(S)} and card(R) + card(S)
  Difference: between 0 and card(R)

3.2 Centralized Query Optimization
  Relationship between centralized query optimization (CQO) and distributed query optimization (DQO)
    A distributed query is translated into local queries, and each local query is optimized with CQO techniques.
    DQO techniques are extensions of CQO techniques.
    DQO = CQO + communication effect
  Two algorithms
    INGRES algorithm: dynamic optimization
    System R algorithm: static optimization
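The cardinality formulas for selection, join, and semijoin translate directly into small estimator functions. The following is a minimal sketch under the same uniformity assumptions as the formulas; the function names and the sample statistics (DUR between 0 and 48, 1000 tuples) are hypothetical and not taken from the slides.

```python
def sf_equal(distinct_values: int) -> float:
    # SF_S(A = value) = 1 / card(PI_A(R))
    return 1.0 / distinct_values


def sf_greater(value: float, min_a: float, max_a: float) -> float:
    # SF_S(A > value) = (max(A) - value) / (max(A) - min(A))
    return (max_a - value) / (max_a - min_a)


def card_selection(sf: float, card_r: int) -> float:
    # card(sigma_F(R)) = SF_S(F) * card(R)
    return sf * card_r


def card_join(sf_join: float, card_r: int, card_s: int) -> float:
    # card(R JN S) = SF_J * card(R) * card(S)
    return sf_join * card_r * card_s


def card_semijoin(sf_semijoin: float, card_r: int) -> float:
    # card(R SJ_A S) = SF_SJ(S.A) * card(R)
    return sf_semijoin * card_r


# Hypothetical statistics: G has 1000 tuples and DUR ranges from 0 to 48.
sf = sf_greater(36, min_a=0, max_a=48)
print(sf, card_selection(sf, 1000))
```

Estimates like these feed the cost model; they are only as good as the uniform-distribution assumption behind the selectivity factors.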

INGRES Algorithm
  Basic idea
    Decompose a multi-variable query into a sequence of mono-variable queries.
    Q: Q_1 -> Q_2 -> ... -> Q_n (Q_i uses the result of Q_{i-1})
  Detachment: split off a subquery with a single-variable predicate; its result is stored as a relation.
  Substitution: when no further detachment is possible, replace a relation in the multi-variable query by its tuples, which yields mono-variable queries.

Example: Detachment
  Sample query Q
    SELECT E.ENAME
    FROM E, G, J
    WHERE E.ENO = G.ENO AND G.JNO = J.JNO AND JNAME = 'CAD/CAM';
  1st detachment: Q_1
    SELECT J.JNO INTO JVAR
    FROM J
    WHERE JNAME = 'CAD/CAM';
  2nd detachment: Q_2
    SELECT G.ENO INTO GVAR
    FROM G, JVAR
    WHERE G.JNO = JVAR.JNO;
  3rd detachment: Q_3
    SELECT E.ENAME
    FROM E, GVAR
    WHERE E.ENO = GVAR.ENO;
  Q = Q_1 -> Q_2 -> Q_3

Example: Substitution
  3rd detachment: Q_3
    SELECT E.ENAME
    FROM E, GVAR
    WHERE E.ENO = GVAR.ENO;
  Assume GVAR consists of two tuples, with ENO values 'E1' and 'E2'.
  1st substitution: Q_31
    SELECT E.ENAME FROM E WHERE E.ENO = 'E1';
  2nd substitution: Q_32
    SELECT E.ENAME FROM E WHERE E.ENO = 'E2';

System R Algorithm
  input: QT, a query tree with n relations
  output: the result of execution
  begin
    for each relation R_i ∈ QT do
      for each access path AP_ij to R_i do
        determine cost(AP_ij)
      end-for
      best_AP_i <- AP_ij with minimum cost
    end-for
    for each order (R_i1, R_i2, ..., R_in) with i = 1, ..., n! do
      build strategy (...((best_AP_i1 JN R_i2) JN R_i3) JN ... JN R_in)
      compute the cost of strategy
    end-for
    output strategy with minimum cost
  end.
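The System R pseudocode above can be sketched in Python as follows. This is an illustration only: the access-path costs, cardinalities, join selectivities, and the nested-loop style cost formula are all hypothetical assumptions chosen for the example, and the attribute name ENO is assumed. As in the pseudocode, every permutation of the relations is costed as a left-deep strategy.

```python
from itertools import permutations
from typing import Dict, FrozenSet, Tuple

# Hypothetical inputs (not from the slides).
ACCESS_COSTS: Dict[str, Dict[str, float]] = {
    "E": {"sequential scan": 100, "index on ENO": 120},
    "G": {"sequential scan": 150, "index on JNO": 200},
    "J": {"sequential scan": 80, "index on JNAME": 10},
}
CARD: Dict[str, float] = {"E": 400, "G": 1000, "J": 200}
SELECTIVITY: Dict[FrozenSet[str], float] = {
    frozenset(("E", "G")): 1 / 400,
    frozenset(("G", "J")): 1 / 200,
}


def best_access_paths(costs: Dict[str, Dict[str, float]]) -> Dict[str, Tuple[str, float]]:
    # For each relation, keep the (path, cost) pair with minimum cost.
    return {rel: min(paths.items(), key=lambda kv: kv[1]) for rel, paths in costs.items()}


def strategy_cost(order, best, card, sel) -> float:
    # Left-deep strategy (((R1 JN R2) JN R3) ...).
    # Assumed cost: access the next relation once per tuple of the left input.
    cost = best[order[0]][1]
    left_card = card[order[0]]
    joined = {order[0]}
    for rel in order[1:]:
        cost += left_card * best[rel][1]
        factor = 1.0  # stays 1.0 when no join predicate applies (Cartesian product)
        for other in joined:
            factor *= sel.get(frozenset((other, rel)), 1.0)
        left_card = left_card * card[rel] * factor
        joined.add(rel)
    return cost


def system_r(costs, card, sel):
    best = best_access_paths(costs)
    # Enumerate all n! orders and keep the cheapest strategy.
    return min(
        ((strategy_cost(order, best, card, sel), order) for order in permutations(costs)),
        key=lambda pair: pair[0],
    )


cost, order = system_r(ACCESS_COSTS, CARD, SELECTIVITY)
print(order, cost)
```

Enumerating n! orders is only practical for a handful of relations, so an optimizer built this way would need pruning of dominated and Cartesian-product orders.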

Example: System R Algorithm
  Sample query Q
    SELECT E.ENAME
    FROM E, G, J
    WHERE E.ENO = G.ENO AND G.JNO = J.JNO AND JNAME = 'CAD/CAM';
  Assume
    E has an index on ENO.
    G has an index on JNO.
    J has an index on JNO and an index on JNAME.
  Join graph
    E - G: ENO
    G - J: JNO
  Single-relation access plan
    E: sequential scan
    G: sequential scan
    J: index scan on JNAME

Alternative Join Orders
  First level: E, G, J
  Second level: E JN G, E × J, G JN E, G JN J, J JN G, J × E
  Cost comparison: (E JN G) > (G JN E) and (G JN J) > (J JN G), so E JN G and G JN J are pruned.
  Remaining strategies: (G JN E) JN J and (J JN G) JN E

3.3 Join Ordering in Fragment Queries

Join Ordering in a Distributed Context
  Join ordering matters because of communication cost.
  Semijoins can be used to reduce communication cost.

Join Ordering
  R JN S, with R at site 1 and S at site 2
    If size(R) < size(S): send R to site 2.
    If size(S) < size(R): send S to site 1.

Example: Join Ordering
  Query: J JN_JNO G JN_ENO E, with E at site 1, G at site 2, J at site 3
  Alternative strategies
    1. E -> site 2; (E JN G) -> site 3
    2. G -> site 1; (E JN G) -> site 3
    3. G -> site 3; (G JN J) -> site 1
    4. J -> site 2; (J JN G) -> site 1
    5. E -> site 2; J -> site 2
  Choosing among them requires size(E), size(G), size(J), size(E JN G), and size(G JN J).

Semijoin Based Algorithms
  R JN_A S (R at site 1, S at site 2)
    1. Π_A(S) -> site 1
    2. Site 1 computes R' = R SJ_A S
    3. R' -> site 2
    4. Site 2 computes R' JN_A S
  Semijoin is better if size(Π_A(S)) + size(R SJ_A S) < size(R).
  General idea
    E JN G JN J = E' JN G' JN J (E' = E SJ G, G' = G SJ J)
    E can be reduced further: E'' = E SJ (G SJ J), and so on.
  Semijoin program
  Full reducer using an acyclic join graph
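The semijoin condition above compares the data shipped by each approach. A minimal sketch of that comparison follows; the function names and the sizes in the usage line are hypothetical, and only transfer volume is counted (no message overhead or local processing).

```python
def join_transfer(size_r: float) -> float:
    # Plain join: ship all of R to the site of S.
    return size_r


def semijoin_transfer(size_proj_a_s: float, size_r_semijoin_s: float) -> float:
    # Semijoin: ship PI_A(S) to the site of R, then ship R SJ_A S back.
    return size_proj_a_s + size_r_semijoin_s


def semijoin_is_better(size_proj_a_s: float, size_r_semijoin_s: float, size_r: float) -> bool:
    # size(PI_A(S)) + size(R SJ_A S) < size(R)
    return semijoin_transfer(size_proj_a_s, size_r_semijoin_s) < join_transfer(size_r)


# Hypothetical sizes in bytes.
print(semijoin_is_better(size_proj_a_s=100, size_r_semijoin_s=300, size_r=1000))  # True
```

When the semijoin filters out few tuples of R, the second term approaches size(R) and the extra transfer of Π_A(S) makes the semijoin the worse choice.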

Join vs. Semijoin
  Semijoin is effective if
    length(join attribute) < length(entire tuple)
    the semijoin has good selectivity
  Semijoin is not effective if
    neither of the above conditions is met
    communication cost is not significant
    the increased local processing cost outweighs the savings
    indexes cannot be used on intermediate results

3.4 Distributed Query Optimization Algorithms
  R* algorithm
    Extension of System R's optimizer
    Networks: LAN, WAN
  SDD-1 algorithm
    Extension of the hill-climbing algorithm
    Semijoin algorithm
    Network: WAN

R* Algorithm
  Exhaustive search
    Permutations of join order, join method, result site, access path to the internal relation, and intersite transfer mode
    Heuristics reduce the search space.
  Intersite transfer mode
    Ship-whole
    Fetch-as-needed: the external relation is scanned, and for each of its tuples the matching tuples of the internal relation are fetched (equivalent to a semijoin).

R* Algorithm: R JN S (R: external, S: internal)
  Notation
    LC: local processing cost (I/O + CPU time)
    CC: communication cost
    s = card(S SJ_A R) / card(R): average number of tuples of S that match one tuple of R
  There are four strategies for R JN S.

Strategy 1: ship the entire external relation R to the site of S
  Cost = LC(retrieve card(R) tuples from R)
       + CC(size(R))
       + LC(retrieve s tuples from S) * card(R)
Strategy 2: ship the entire internal relation S to the site of R
  Cost = LC(retrieve card(S) tuples from S)
       + CC(size(S))
       + LC(store card(S) tuples in T)
       + LC(retrieve card(R) tuples from R)
       + LC(retrieve s tuples from T) * card(R)
Strategy 3: scan R and fetch the matching tuples of S as needed
  Cost = LC(retrieve card(R) tuples from R)
       + CC(length(A)) * card(R)
       + LC(retrieve s tuples from S) * card(R)
       + CC(s * length(S)) * card(R)
Strategy 4: ship both R and S to a third site
  Cost = LC(retrieve card(S) tuples from S)
       + CC(size(S))
       + LC(retrieve card(R) tuples from R)
       + CC(size(R))
       + LC(store card(S) tuples in T)
       + LC(retrieve s tuples from T) * card(R)

R* Algorithm: Comparison
  If size(R) >> size(S): strategy 2 has the lowest communication cost, but storing S raises its local processing cost compared with strategies 1 and 3.
  If size(R) is large and the join selectivity is good: strategy 3.
  Otherwise: strategy 1.
  Strategy 4 applies when the join is computed at a third site.
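The four cost formulas can be evaluated side by side once LC and CC are given concrete shapes. The sketch below assumes, purely for illustration, that LC is linear in the number of tuples and that CC is a fixed per-message cost plus a per-byte cost; these shapes, the constants, and the sample statistics are hypothetical and not part of the slides.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Stats:
    card_r: int      # card(R), external relation
    card_s: int      # card(S), internal relation
    length_r: int    # tuple length of R in bytes
    length_s: int    # tuple length of S in bytes
    length_a: int    # length of join attribute A in bytes
    s: float         # average number of S tuples matching one R tuple


def lc(tuples: float, per_tuple: float = 1.0) -> float:
    # Assumed local processing cost: linear in the number of tuples.
    return per_tuple * tuples


def cc(nbytes: float, c_msg: float = 10.0, c_tr: float = 0.1) -> float:
    # Assumed communication cost: one message plus per-byte transfer.
    return c_msg + c_tr * nbytes


def strategy_costs(st: Stats) -> Dict[int, float]:
    size_r = st.card_r * st.length_r
    size_s = st.card_s * st.length_s
    probe = lc(st.s) * st.card_r  # LC(retrieve s tuples) * card(R)
    return {
        1: lc(st.card_r) + cc(size_r) + probe,
        2: lc(st.card_s) + cc(size_s) + lc(st.card_s) + lc(st.card_r) + probe,
        3: lc(st.card_r) + cc(st.length_a) * st.card_r + probe
           + cc(st.s * st.length_s) * st.card_r,
        4: lc(st.card_s) + cc(size_s) + lc(st.card_r) + cc(size_r)
           + lc(st.card_s) + probe,
    }


# Hypothetical case with size(R) >> size(S).
costs = strategy_costs(Stats(card_r=10000, card_s=100, length_r=100,
                             length_s=50, length_a=4, s=0.01))
print(min(costs, key=costs.get), costs)
```

With these numbers strategy 2 is cheapest, in line with the case size(R) >> size(S). Strategy 3 is penalized by the assumed per-message cost, since it sends messages for every tuple of R.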

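SDD-1 Algorithm
  Basic idea
  Selection of beneficial semijoins
    Cost(R SJ_A S) = C_MSG + C_TR * size(Π_A(S))
    Benefit(R SJ_A S) = (1 - SF_SJ(S.A)) * size(R) * C_TR
    Beneficial semijoin: Benefit > Cost
  Assembly site selection
    Select site i such that i stores the largest amount of data after all local operations.
  Post-optimization
    Remove semijoins that only reduce relations stored at the assembly site, since they bring no benefit.

Example: SDD-1 Algorithm
  Sample query Q
    SELECT *
    FROM E, G, J
    WHERE E.ENO = G.ENO AND G.JNO = J.JNO;
  Relation statistics (relation sizes as used in the calculations that follow; the card and tuple size columns are not legible in the source)
    relation   relation size
    E          1500
    G          3000
    J          2000
  Attribute statistics (values as used in the calculations that follow)
    attribute   SF_SJ   size(Π_attribute)
    E.ENO       0.3     120
    G.ENO       0.8     400
    G.JNO       1.0     400
    J.JNO       0.4     200
  Join graph
    E (site 1) - G (site 2): ENO
    G (site 2) - J (site 3): JNO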

Selection of Beneficial Semijoins
  Step 1: Calculate the benefit and cost of semijoins
    SJ_1: G SJ E, benefit = (1 - 0.3) * 3000 = 2100, cost = 120
    SJ_2: G SJ J, benefit = (1 - 0.4) * 3000 = 1800, cost = 200
    SJ_3: E SJ G, benefit = (1 - 0.8) * 1500 = 300, cost = 400
    SJ_4: J SJ G, benefit = (1 - 1.0) * 2000 = 0, cost = 400
    Most beneficial semijoin = SJ_1
    Change the size of G to 900 = 3000 * 0.3
    Change SF_SJ(G.ENO) to 0.24 = 0.8 * 0.3
    Insert SJ_1 into the execution strategy
  Step 2: Calculate again the benefit and cost of semijoins
    SJ_2: G SJ J, benefit = (1 - 0.4) * 900 = 540, cost = 200
    SJ_3: E SJ G, benefit = (1 - 0.24) * 1500 = 1140, cost = 400
    Select SJ_3 and change the size of E to 360 = 1500 * 0.24
  Step 3: Select SJ_2; the size of G is set to 360 = 900 * 0.4

Assembly Site Selection
  After reduction, the amount of data stored at each site is
    Site 1 (E): 360
    Site 2 (G): 360
    Site 3 (J): 2000, chosen as assembly site
  Post-optimization
    No semijoin reduces J, so post-optimization removes nothing.
  Final execution strategy
    (G SJ E) SJ J -> site 3
    E SJ G -> site 3
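Step 1 of the example can be reproduced with a few lines of code. This sketch uses the selectivity factors, relation sizes, and costs listed in Step 1, and assumes C_TR = 1 and C_MSG = 0, which is what the cost figures (equal to the projected attribute sizes) suggest. Ranking candidates by benefit minus cost is an assumption of this sketch; the slides only name the most beneficial semijoin.

```python
from typing import List, Tuple

# (name, semijoin, SF_SJ of the reducing attribute, size of reduced relation, size of projection)
CANDIDATES: List[Tuple[str, str, float, float, float]] = [
    ("SJ1", "G SJ E", 0.3, 3000, 120),
    ("SJ2", "G SJ J", 0.4, 3000, 200),
    ("SJ3", "E SJ G", 0.8, 1500, 400),
    ("SJ4", "J SJ G", 1.0, 2000, 400),
]


def benefit(sf: float, size_r: float, c_tr: float = 1.0) -> float:
    # Benefit(R SJ_A S) = (1 - SF_SJ(S.A)) * size(R) * C_TR
    return (1 - sf) * size_r * c_tr


def cost(size_proj: float, c_msg: float = 0.0, c_tr: float = 1.0) -> float:
    # Cost(R SJ_A S) = C_MSG + C_TR * size(PI_A(S))
    return c_msg + c_tr * size_proj


def beneficial(candidates) -> List[str]:
    # A semijoin is beneficial when its benefit exceeds its cost.
    return [name for name, _, sf, size_r, proj in candidates
            if benefit(sf, size_r) > cost(proj)]


def most_beneficial(candidates) -> str:
    # Assumed ranking: largest net gain (benefit minus cost).
    best = max(candidates, key=lambda c: benefit(c[2], c[3]) - cost(c[4]))
    return best[0]


print(beneficial(CANDIDATES))       # ['SJ1', 'SJ2']
print(most_beneficial(CANDIDATES))  # SJ1
```

SJ3 is not beneficial in step 1 (300 < 400), but once SJ1 lowers SF_SJ(G.ENO) to 0.24 its benefit rises to 1140, which is why it is selected in step 2.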

Optimization of Distributed Queries

Optimization of Distributed Queries Query Optimization Optimization of Distributed Queries Issues in Query Optimization Joins and Semijoins Query Optimization Algorithms Centralized query optimization: Minimize the cots function Find (the

More information

Introduction Background Distributed DBMS Architecture Distributed Database Design Semantic Data Control Distributed Query Processing

Introduction Background Distributed DBMS Architecture Distributed Database Design Semantic Data Control Distributed Query Processing Outline Introduction Background Distributed DBMS Architecture Distributed Database Design Semantic Data Control Distributed Query Processing Query Processing Methodology Distributed Query Optimization

More information

Introduction Background Distributed DBMS Architecture Distributed Database Design Semantic Data Control Distributed Query Processing

Introduction Background Distributed DBMS Architecture Distributed Database Design Semantic Data Control Distributed Query Processing Outline Introduction Background Distributed DBMS Architecture Distributed Database Design Semantic Data Control Distributed Query Processing Query Processing Methodology Distributed Query Optimization

More information

CS54200: Distributed. Introduction

CS54200: Distributed. Introduction CS54200: Distributed Database Systems Query Processing 9 March 2009 Prof. Chris Clifton Query Processing Introduction Converting user commands from the query language (SQL) to low level data manipulation

More information

Mobile and Heterogeneous databases Distributed Database System Query Processing. A.R. Hurson Computer Science Missouri Science & Technology

Mobile and Heterogeneous databases Distributed Database System Query Processing. A.R. Hurson Computer Science Missouri Science & Technology Mobile and Heterogeneous databases Distributed Database System Query Processing A.R. Hurson Computer Science Missouri Science & Technology 1 Note, this unit will be covered in four lectures. In case you

More information

Query Processing. high level user query. low level data manipulation. query processor. commands

Query Processing. high level user query. low level data manipulation. query processor. commands Query Processing high level user query query processor low level data manipulation commands 1 Selecting Alternatives SELECT ENAME FROM EMP,ASG WHERE EMP.ENO = ASG.ENO AND DUR > 37 Strategy A ΠENAME(σDUR>37

More information

Query Processing SL03

Query Processing SL03 Distributed Database Systems Fall 2016 Query Processing Overview Query Processing SL03 Distributed Query Processing Steps Query Decomposition Data Localization Query Processing Overview/1 Query processing:

More information

Integration of Transactional Systems

Integration of Transactional Systems Integration of Transactional Systems Distributed Query Processing Robert Wrembel Poznań University of Technology Institute of Computing Science Robert.Wrembel@cs.put.poznan.pl www.cs.put.poznan.pl/rwrembel

More information

Query Decomposition and Data Localization

Query Decomposition and Data Localization Query Decomposition and Data Localization Query Decomposition and Data Localization Query decomposition and data localization consists of two steps: Mapping of calculus query (SQL) to algebra operations

More information

Distributed Databases Systems

Distributed Databases Systems Distributed Databases Systems Lecture No. 05 Query Processing Naeem Ahmed Email: naeemmahoto@gmail.com Department of Software Engineering Mehran Univeristy of Engineering and Technology Jamshoro Outline

More information

Introduction to Query Processing and Query Optimization Techniques. Copyright 2011 Ramez Elmasri and Shamkant Navathe

Introduction to Query Processing and Query Optimization Techniques. Copyright 2011 Ramez Elmasri and Shamkant Navathe Introduction to Query Processing and Query Optimization Techniques Outline Translating SQL Queries into Relational Algebra Algorithms for External Sorting Algorithms for SELECT and JOIN Operations Algorithms

More information

On Building Integrated and Distributed Database Systems

On Building Integrated and Distributed Database Systems On Building Integrated and Distributed Database Systems Distributed Query Processing and Optimization Robert Wrembel Poznań University of Technology Institute of Computing Science Poznań,, Poland Robert.Wrembel@cs.put.poznan.pl

More information

Query Processing and Query Optimization. Prof Monika Shah

Query Processing and Query Optimization. Prof Monika Shah Query Processing and Query Optimization Query Processing SQL Query Is in Library Cache? System catalog (Dict / Dict cache) Scan and verify relations Parse into parse tree (relational Calculus) View definitions

More information

Algorithms for Query Processing and Optimization. 0. Introduction to Query Processing (1)

Algorithms for Query Processing and Optimization. 0. Introduction to Query Processing (1) Chapter 19 Algorithms for Query Processing and Optimization 0. Introduction to Query Processing (1) Query optimization: The process of choosing a suitable execution strategy for processing a query. Two

More information

QUERY PROCESSING & OPTIMIZATION CHAPTER 19 (6/E) CHAPTER 15 (5/E)

QUERY PROCESSING & OPTIMIZATION CHAPTER 19 (6/E) CHAPTER 15 (5/E) QUERY PROCESSING & OPTIMIZATION CHAPTER 19 (6/E) CHAPTER 15 (5/E) 2 LECTURE OUTLINE Query Processing Methodology Basic Operations and Their Costs Generation of Execution Plans 3 QUERY PROCESSING IN A DDBMS

More information

Chapter 3. Algorithms for Query Processing and Optimization

Chapter 3. Algorithms for Query Processing and Optimization Chapter 3 Algorithms for Query Processing and Optimization Chapter Outline 1. Introduction to Query Processing 2. Translating SQL Queries into Relational Algebra 3. Algorithms for External Sorting 4. Algorithms

More information

Distributed Query Optimization: Use of mobile Agents Kodanda Kumar Melpadi

Distributed Query Optimization: Use of mobile Agents Kodanda Kumar Melpadi Distributed Query Optimization: Use of mobile Agents Kodanda Kumar Melpadi M.Tech (IT) GGS Indraprastha University Delhi mk_kumar_76@yahoo.com Abstract DDBS adds to the conventional centralized DBS some

More information

Query Processing and Optimization *

Query Processing and Optimization * OpenStax-CNX module: m28213 1 Query Processing and Optimization * Nguyen Kim Anh This work is produced by OpenStax-CNX and licensed under the Creative Commons Attribution License 3.0 Query processing is

More information

Query optimization. Elena Baralis, Silvia Chiusano Politecnico di Torino. DBMS Architecture D B M G. Database Management Systems. Pag.

Query optimization. Elena Baralis, Silvia Chiusano Politecnico di Torino. DBMS Architecture D B M G. Database Management Systems. Pag. Database Management Systems DBMS Architecture SQL INSTRUCTION OPTIMIZER MANAGEMENT OF ACCESS METHODS CONCURRENCY CONTROL BUFFER MANAGER RELIABILITY MANAGEMENT Index Files Data Files System Catalog DATABASE

More information

Relational Query Optimization. Highlights of System R Optimizer

Relational Query Optimization. Highlights of System R Optimizer Relational Query Optimization Chapter 15 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Highlights of System R Optimizer v Impact: Most widely used currently; works well for < 10 joins.

More information

Outline. Query Processing Overview Algorithms for basic operations. Query optimization. Sorting Selection Join Projection

Outline. Query Processing Overview Algorithms for basic operations. Query optimization. Sorting Selection Join Projection Outline Query Processing Overview Algorithms for basic operations Sorting Selection Join Projection Query optimization Heuristics Cost-based optimization 19 Estimate I/O Cost for Implementations Count

More information

Query Processing and Optimization

Query Processing and Optimization Query Processing and Optimization (Part-1) Prof Monika Shah Overview of Query Execution SQL Query Compile Optimize Execute SQL query parse parse tree statistics convert logical query plan apply laws improved

More information

Relational Model, Relational Algebra, and SQL

Relational Model, Relational Algebra, and SQL Relational Model, Relational Algebra, and SQL August 29, 2007 1 Relational Model Data model. constraints. Set of conceptual tools for describing of data, data semantics, data relationships, and data integrity

More information

Ian Kenny. December 1, 2017

Ian Kenny. December 1, 2017 Ian Kenny December 1, 2017 Introductory Databases Query Optimisation Introduction Any given natural language query can be formulated as many possible SQL queries. There are also many possible relational

More information

Relational Model: History

Relational Model: History Relational Model: History Objectives of Relational Model: 1. Promote high degree of data independence 2. Eliminate redundancy, consistency, etc. problems 3. Enable proliferation of non-procedural DML s

More information

Web Science & Technologies University of Koblenz Landau, Germany. Relational Data Model

Web Science & Technologies University of Koblenz Landau, Germany. Relational Data Model Web Science & Technologies University of Koblenz Landau, Germany Relational Data Model Overview Relational data model; Tuples and relations; Schemas and instances; Named vs. unnamed perspective; Relational

More information

Relational Databases

Relational Databases Relational Databases Jan Chomicki University at Buffalo Jan Chomicki () Relational databases 1 / 49 Plan of the course 1 Relational databases 2 Relational database design 3 Conceptual database design 4

More information

CSE 444: Database Internals. Lecture 22 Distributed Query Processing and Optimization

CSE 444: Database Internals. Lecture 22 Distributed Query Processing and Optimization CSE 444: Database Internals Lecture 22 Distributed Query Processing and Optimization CSE 444 - Spring 2014 1 Readings Main textbook: Sections 20.3 and 20.4 Other textbook: Database management systems.

More information

Published by: PIONEER RESEARCH & DEVELOPMENT GROUP (www.prdg.org) 1

Published by: PIONEER RESEARCH & DEVELOPMENT GROUP (www.prdg.org) 1 Optimization of Join Queries on Distributed Relations Using Semi-Joins Suresh Sapa 1, K. P. Supreethi 2 1, 2 JNTUCEH, Hyderabad, India Abstract The processing and optimizing a join query in distributed

More information

QUERY PROCESSING & OPTIMIZATION CHAPTER 19 (6/E) CHAPTER 15 (5/E)

QUERY PROCESSING & OPTIMIZATION CHAPTER 19 (6/E) CHAPTER 15 (5/E) QUERY PROCESSING & OPTIMIZATION CHAPTER 19 (6/E) CHAPTER 15 (5/E) 2 LECTURE OUTLINE Query Processing Methodology Basic Operations and Their Costs Generation of Execution Plans 3 QUERY PROCESSING IN A DDBMS

More information

Chapter 12: Query Processing

Chapter 12: Query Processing Chapter 12: Query Processing Overview Catalog Information for Cost Estimation $ Measures of Query Cost Selection Operation Sorting Join Operation Other Operations Evaluation of Expressions Transformation

More information

Overview of Implementing Relational Operators and Query Evaluation

Overview of Implementing Relational Operators and Query Evaluation Overview of Implementing Relational Operators and Query Evaluation Chapter 12 Motivation: Evaluating Queries The same query can be evaluated in different ways. The evaluation strategy (plan) can make orders

More information

Relational Query Optimization. Overview of Query Evaluation. SQL Refresher. Yanlei Diao UMass Amherst October 23 & 25, 2007

Relational Query Optimization. Overview of Query Evaluation. SQL Refresher. Yanlei Diao UMass Amherst October 23 & 25, 2007 Relational Query Optimization Yanlei Diao UMass Amherst October 23 & 25, 2007 Slide Content Courtesy of R. Ramakrishnan, J. Gehrke, and J. Hellerstein 1 Overview of Query Evaluation Query Evaluation Plan:

More information

Database Systems External Sorting and Query Optimization. A.R. Hurson 323 CS Building

Database Systems External Sorting and Query Optimization. A.R. Hurson 323 CS Building External Sorting and Query Optimization A.R. Hurson 323 CS Building External sorting When data to be sorted cannot fit into available main memory, external sorting algorithm must be applied. Naturally,

More information

Query Evaluation and Optimization

Query Evaluation and Optimization Query Evaluation and Optimization Jan Chomicki University at Buffalo Jan Chomicki () Query Evaluation and Optimization 1 / 21 Evaluating σ E (R) Jan Chomicki () Query Evaluation and Optimization 2 / 21

More information

What happens. 376a. Database Design. Execution strategy. Query conversion. Next. Two types of techniques

What happens. 376a. Database Design. Execution strategy. Query conversion. Next. Two types of techniques 376a. Database Design Dept. of Computer Science Vassar College http://www.cs.vassar.edu/~cs376 Class 16 Query optimization What happens Database is given a query Query is scanned - scanner creates a list

More information

Implementation of Relational Operations. Introduction. CS 186, Fall 2002, Lecture 19 R&G - Chapter 12

Implementation of Relational Operations. Introduction. CS 186, Fall 2002, Lecture 19 R&G - Chapter 12 Implementation of Relational Operations CS 186, Fall 2002, Lecture 19 R&G - Chapter 12 First comes thought; then organization of that thought, into ideas and plans; then transformation of those plans into

More information

FDB: A Query Engine for Factorised Relational Databases

FDB: A Query Engine for Factorised Relational Databases FDB: A Query Engine for Factorised Relational Databases Nurzhan Bakibayev, Dan Olteanu, and Jakub Zavodny Oxford CS Christan Grant cgrant@cise.ufl.edu University of Florida November 1, 2013 cgrant (UF)

More information

Module 9: Selectivity Estimation

Module 9: Selectivity Estimation Module 9: Selectivity Estimation Module Outline 9.1 Query Cost and Selectivity Estimation 9.2 Database profiles 9.3 Sampling 9.4 Statistics maintained by commercial DBMS Web Forms Transaction Manager Lock

More information

Database Languages and their Compilers

Database Languages and their Compilers Database Languages and their Compilers Prof. Dr. Torsten Grust Database Systems Research Group U Tübingen Winter 2010 2010 T. Grust Database Languages and their Compilers 4 Query Normalization Finally,

More information

Query optimization. Query Optimization. Query Optimization. Cost estimation. Another reason why plain IOs not enough: Parallelism

Query optimization. Query Optimization. Query Optimization. Cost estimation. Another reason why plain IOs not enough: Parallelism Query optimization Query Optimization It is safer to accept any chance that offers itself, and extemporize a procedure to fit it, than to get a good plan matured, and wait for a chance of using it. Thomas

More information

Relational Query Optimization. Overview of Query Evaluation. SQL Refresher. Yanlei Diao UMass Amherst March 8 and 13, 2007

Relational Query Optimization. Overview of Query Evaluation. SQL Refresher. Yanlei Diao UMass Amherst March 8 and 13, 2007 Relational Query Optimization Yanlei Diao UMass Amherst March 8 and 13, 2007 Slide Content Courtesy of R. Ramakrishnan, J. Gehrke, and J. Hellerstein 1 Overview of Query Evaluation Query Evaluation Plan:

More information

The query processor turns user queries and data modification commands into a query plan - a sequence of operations (or algorithm) on the database

The query processor turns user queries and data modification commands into a query plan - a sequence of operations (or algorithm) on the database query processing Query Processing The query processor turns user queries and data modification commands into a query plan - a sequence of operations (or algorithm) on the database from high level queries

More information

Principles of Data Management. Lecture #12 (Query Optimization I)

Principles of Data Management. Lecture #12 (Query Optimization I) Principles of Data Management Lecture #12 (Query Optimization I) Instructor: Mike Carey mjcarey@ics.uci.edu Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Today s Notable News v B+ tree

More information

Query Processing & Optimization

Query Processing & Optimization Query Processing & Optimization 1 Roadmap of This Lecture Overview of query processing Measures of Query Cost Selection Operation Sorting Join Operation Other Operations Evaluation of Expressions Introduction

More information






