Query Processing. high level user query. low level data manipulation. query processor. commands
|
|
- Jonas Lamb
- 6 years ago
- Views:
Transcription
1 Query Processing high level user query query processor low level data manipulation commands 1
2 Selecting Alternatives SELECT ENAME FROM EMP,ASG WHERE EMP.ENO = ASG.ENO AND DUR > 37 Strategy A ΠENAME(σDUR>37 EMP.ENO=ASG.ENO (EMP ASG)) Strategy B ΠENAME(EMP ENO (σdur>37 (ASG))) Strategy B avoids Cartesian product, so is better 2
3 What is the Problem? Site 1 Site 2 ASG1=σENO E3 (ASG) Site 3 ASG2=σENO> E3 (ASG) EMP1=σENO E3 (EMP) result = EMP1 EMP2 EMP1 Site 4 EMP2 =EMP2 ASG1 Site 1 ASG1 =σdur>37(asg1) EMP2=σENO> E3 (EMP) result2=(emp1 EMP2) EMP2 Site 3 ENOASG1 Site 5 Result Site 5 Site 5 EMP1 =EMP1 Site 4 Site 2 ENOASG2 ENOσDUR>37(ASG1 ASG1) ASG1 ASG2 EMP1 EMP2 Site 1 Site 2 Site 3 Site 4 ASG2 ASG2 =σdur>37(asg2) 3
4 Cost of Alternatives Assume: size(emp) = 400, size(asg) = 1000 uniform distribution of tuples tuple access cost = 1 unit; tuple transfer cost = 10 units there are 20 ASG tuples with DUR>37 4
5 Strategy 1 produce ASG': (10+10) tuple access cost 20 transfer ASG' to the sites of EMP: (10+10) tuple transfer cost 200 produce EMP': (10+10) tuple access cost 2 40 transfer EMP' to result site: (10+10) tuple transfer cost 200 Total cost 460 5
6 Strategy 2 transfer EMP to site 5: 400 tuple transfer cost 4000 transfer ASG to site 5: 1000 tuple transfer cost produce ASG':1000 tuple access cost 1,000 join EMP and ASG': tuple access cost 8,000 Total cost 23,000 6
7 Optimization Objectives Minimize a cost function I/O cost + CPU cost + communication cost These might have different weights in different distributed environments Wide area networks communication cost will dominate low bandwidth low speed high protocol overhead most algorithms ignore all other cost components Local area networks communication cost not that dominant total cost function should be considered 7
8 Query Optimization Issues: Optimization Granularity Single query at a time cannot use common intermediate results Multiple queries at a time efficient if many similar queries decision space is much larger 8
9 Query Optimization Issues Statistics Relation cardinality size of a tuple fraction of tuples participating in a join with another relation Attribute cardinality of domain actual number of distinct values Common assumptions independence between different attribute values uniform distribution of attribute values within their domain 9
10 Query Optimization Issues Decision Sites Centralized single site determines the best schedule simple need knowledge about the entire distributed database Distributed cooperation among sites to determine the schedule need only local information cost of cooperation Hybrid one site determines the global schedule each site optimizes the local subqueries 10
11 Query Optimization Issues Network Topology Wide area networks (WAN) Local area networks (LAN) System (cluster) area networks (SAN) 11
12 Distributed Query Processing Methodology Calculus Query on Distributed Relations Query Decomposition GLOBAL SCHEMA Algebraic Query on Distributed Relations CONTROL SITE Data Localization FRAGMENT SCHEMA Fragment Query Global Optimization STATS ON FRAGMENTS Optimized Fragment Query with Communication Operations LOCAL SITES Local Optimization LOCAL SCHEMAS Optimized Local Queries 12
13 Query Decomposition Input : Calculus query on global relations Normalization Analysis Simplification Translation/Restructuring Also called: NFST (Normalization, Factorization, Semantic Analysis, Translation) 13
14 Normalization Lexical and syntactic analysis check validity (similar to compilers) check for attributes and relations type checking on the qualification Put into normal form Conjunctive normal form (p 11 p 12 p 1n ) (p m1 p m2 p mn ) Disjunctive normal form (p 11 p 12 p 1n ) (p m1 p m2 p mn ) OR's mapped into union AND's mapped into join or selection 14
15 Analysis Annotation with type information Scoping for attributes Produce dependency graph 15
16 Simplification Why simplify? Avoid unnecessary operations (run-time and compile-time) How? Use transformation rules elimination of redundancy idempotency rules p1 ( p1) false, p1 (p1 p2) p1, p1 false p1 application of transitivity use of integrity rules 16
17 Simplification Example SELECT TITLE FROM EMP WHERE EMP.ENAME = J. Doe OR (NOT(EMP.TITLE = Programmer ) AND (EMP.TITLE = Programmer OR EMP.TITLE = Elect. Eng. ) AND NOT(EMP.TITLE = Elect. Eng. )) SELECT TITLE FROM EMP WHERE EMP.ENAME = J. Doe 17
18 Translation/Restructuring Convert relational calculus to relational algebra Example ΠENAME σdur=12 OR DUR=24 Find the names of employees other than J. Doe who worked on the CAD/CAM project for either 1 or 2 years. σpname= CAD/CAM SELECT ENAME σename J. DOE FROM EMP, ASG, PROJ WHERE EMP.ENO = ASG.ENO AND ASG.PNO = PROJ.PNO AND ENAME J. Doe AND PNAME = CAD/CAM AND (DUR = 12 OR DUR = 24) Project Select PNO Join ENO PROJ ASG EMP 18
19 Canonical Translation π A1,..., An σ P select distinct A1,..., An from R1,..., Rk where P Rk R3 R1 R2 19
20 Equivalences 1. (R ) ( (σcn(r )) )) σc1 c2... cn σc1(σc2 2. ((R )) (σc1((r )) σc1(σc2 σc2 3. πl1(π ( (π Ln(R )) )) (R ) L2 πl1 if L1 L2... Ln sch(r) 4. πa1,, (σc(r )) An σc (πa1,, An(R )) if C uses only A1,, An 5.,, and A are commutative 20
21 Equivalences 6. σc(r Aj S) σc(r) Aj S if c uses only attributes of R Combne with other rules for σ, e.g. σc1 c2(r A j S) σc1(r) A j (σc2(s)) 21
22 Equivalences 7. πl (R Ac S) πl sch(r) (R) Ac πl sch(s)(s) if C uses only attributes from L else (C set of attributes used in C) πl(racs) πl(π(l C) sch(r)(r) Ac π(l C) sch(s)(s)) (applicable to with C=true) 22
23 Equivalences 8. For Φ {A,,, }: (R Φ S) Φ T R Φ (S Φ T) (Associativity) 9. For Φ {,, }: σc(r Φ S) σc(r) Φ σc(s) 10. πc(r S) πc (R) πc (S) 23
24 Equivalences 11. (c1 c2) ( c1) ( c2) (c1 c2) ( c1) ( c2) 12. σc(r S ) R Ac S 24
25 Restructuring Heuristics 1. Split predicate use Rule 1 2. Use Rules 2, 4, 6, 9 to push selections down 3. Form joins using Rule Use 3, 4, 7, 10 to push projectons 5. Reduce number of operators using Rules 1,3 25
26 Example (Translation) ΠENAME SELECT σpname= CAD/CAM Recall the previous example: Find the names of employees other than J. Doe who worked on the CAD/CAM project for either one or two years. (DUR=12 DUR=24) ENAME J. DOE WHERE ASG.ENO=EMP.ENO SELECT ENAME ASG.PNO=PROJ.ENO FROM PROJ, ASG, EMP WHERE ASG.ENO=EMP.ENO AND ASG.PNO=PROJ.PNO AND ENAME J. Doe AND PROJ.PNAME= CAD/CAM AND (DUR=12 OR DUR=24) PROJ ASG FROM EMP 26
27 Restructuring Π ENAME PNO Π PNO,ENAME ENO Π PNO Π PNO,ENO Π PNO,ENAME σ PNAME = "CAD/CAM" σ DUR =12 DUR=24 σ ENAME "J. Doe" PROJ ASG EMP 27
28 Distributed Query Processing Methodology Calculus Query on Distributed Relations Query Decomposition GLOBAL SCHEMA Algebraic Query on Distributed Relations CONTROL SITE Data Localization FRAGMENT SCHEMA Fragment Query Global Optimization STATS ON FRAGMENTS Optimized Fragment Query with Communication Operations LOCAL SITES Local Optimization LOCAL SCHEMAS Optimized Local Queries 28
29 Step 2 Data Localization Input: Algebraic query on distributed relations Determine which fragments are involved Localization program substitute for each global relation its reconstruction expression optimize 29
30 Example Π Assume ENAME σdur=12 OR DUR=24 EMP is fragmented into EMP1, EMP2, EMP3 as follows: EMP1=σENO E3 (EMP) EMP2= σ E3 <ENO E6 (EMP) EMP3=σENO E6 (EMP) σpname= CAD/CAM σename J. DOE ASG fragmented into ASG1 and ASG2 as follows: PNO ASG1=σENO E3 (ASG) ENO ASG2=σENO> E3 (ASG) Replace EMP by (EMP1 EMP2 EMP3 ) and (ASG1 ASG2) in any query PROJ ASG by EMP1 EMP2 EMP3 ASG1 ASG2 30
31 Provides Parallellism... ENO EMP1 ASG1 ENO EMP2 ASG2 ENO EMP3 ASG1 ENO EMP3 ASG2 31
32 Eliminates Unnecessary Work ENO EMP1 ENO ENO ASG1 EMP2 ASG2 EMP3 ASG2 32
33 Reduction for PHF Reduction with selection Relation R and FR={R1, R2,, Rw} where Rj=σ pj(r) σ p (Rj)= if x in R: (pi(x) pj(x)) i Example σ ENO= E5 EMP1 EMP2 SELECT * FROM EMP σ ENO= E5 WHERE ENO= E5 EMP3 EMP2 33
34 Reduction for PHF Reduction with join Possible if fragmentation is done on join attribute Distribute join over union (R 1 R 2 ) S (R 1 S) (R 2 S) Given R i = σ pi (R) and R j = σ pj (R) R i R j = if x in R i, y in R j : (p i (x) p j (y)) 34
35 Reduction for PHF Reduction with join - Example Assume EMP is fragmented as before and ASG1: σeno "E3"(ASG) ASG2: σeno > "E3"(ASG) Consider the query SELECT* FROM EMP, ASG WHERE EMP.ENO=ASG.ENO ENO EMP1 EMP2 EMP3 ASG1 ASG2 35
36 Reduction for PHF Reduction with join - Example Distribute join over unions Apply the reduction rule ENO EMP1 ENO ASG1 EMP2 ENO ASG2 EMP3 ASG2 36
37 Reduction for VF Find useless (not empty) intermediate relations Relation R defined over attributes A = {A1,..., An} vertically fragmented as Ri = ΠA' (R) where A' A: ΠD,K(Ri) is useless if the set of projection attributes D is not in A' Example: EMP1= ΠENO,ENAME (EMP); EMP2= ΠENO,TITLE (EMP) SELECT ENAME FROM EMP ΠENAME ΠENAME ENO EMP1 EMP2 EMP1 37
38 Reduction for DHF Rule : Distribute joins over unions Apply the join reduction for horizontal fragmentation Example ASG1: ASG ENO EMP1 ASG2: ASG ENO EMP2 EMP1: σ TITLE= Programmer (EMP) EMP2: σ TITLE Programmer (EMP) Query SELECT * FROM EMP, ASG WHERE ASG.ENO = EMP.ENO AND EMP.TITLE = Mech. Eng. 38
39 Reduction for DHF Generic query ENO σ TITLE= Mech. Eng. ASG 1 Selections first ASG 2 EMP 1 EMP 2 ENO σ TITLE= Mech. Eng. ASG 1 ASG 2 EMP 2 39
40 Reduction for DHF Joins over unions ENO ENO σ TITLE= Mech. Eng. σ TITLE= Mech. Eng. ASG 1 EMP 2 ASG 2 EMP 2 Elimination of the empty intermediate relations (left sub-tree) ENO σ TITLE= Mech. Eng. ASG 2 EMP 2 40
41 Reduction for HF Combine the rules already specified: Remove empty relations generated by contradicting selections on horizontal fragments; Remove useless relations generated by projections on vertical fragments; Distribute joins over unions in order to isolate and remove useless joins. 41
42 Reduction for HF ΠENAME Example Consider the following hybrid fragmentation: EMP1=σENO "E4" (ΠENO,ENAME (EMP)) σ ENO= E5 EMP2=σENO>"E4" (ΠENO,ENAME (EMP)) ENO EMP3= ΠENO,TITLE (EMP) and the query SELECT FROM ΠENAME ENAME σ ENO= E5 EMP2 EMP WHERE ENO= E5 EMP1 EMP2 EMP3 42
43 Distributed Query Processing Methodology Calculus Query on Distributed Relations Query Decomposition GLOBAL SCHEMA Algebraic Query on Distributed Relations CONTROL SITE Data Localization FRAGMENT SCHEMA Fragment Query Global Optimization STATS ON FRAGMENTS Optimized Fragment Query with Communication Operations LOCAL SITES Local Optimization LOCAL SCHEMAS Optimized Local Queries 43
44 Step 3 Global Query Optimization Input: Fragment query Find the best (not necessarily optimal) global schedule Minimize a cost function Distributed join processing Bushy vs. linear trees Which relation to ship where? Ship-whole vs ship-as-needed Decide on the use of semijoins Semijoin saves on communication at the expense of more local processing. Join methods nested loop vs ordered joins (merge join or hash join) 44
45 Join Ordering Consider two relations only if size (R) < size (S) R S if size (R) > size (S) Multiple relations more difficult because too many alternatives. Compute the cost of all alternatives and select the best one. Necessary to compute the size of intermediate relations which is difficult. Use heuristics 45
46 Join Ordering Example Consider PROJ PNOASG ENOEMP Site 2 ASG ENO EMP Site 1 PNO PROJ Site 3 46
47 Join Ordering Example 1. EMP Site 2 2. ASG Site 1 Site 2 computes EMP'=EMP ASG Site 1 computes EMP'=EMP ASG EMP' Site 3 EMP' Site 3 Site 3 computes EMP PROJ Site 3 computes EMP PROJ 3. ASG Site 3 4. PROJ Site 2 Site 3 computes ASG'=ASG PROJ Site 2 computes PROJ'=PROJ ASG ASG' Site 1 PROJ' Site 1 Site 1 computes ASG' EMP Site 1 computes PROJ' EMP 5. EMP Site 2 PROJ Site 2 Site 2 computes EMP PROJ ASG 47
48 Semijoin Algorithms Consider the join of two relations: R[A] (located at site 1) S[A] (located at site 2) Alternatives: 1 Do the join R A S 2 Perform one of the semijoin equivalents R A S (R R (R A A S) (S A S) A A S R) A (S A R) 48
49 Semijoin Algorithms Perform the join send R to Site 2 Site 2 computes R S A Consider semijoin (R A S) A S S' A(S) S' Site 1 Site 1 computes R' = R A S' R' Site 2 Site 2 computes R' A S Semijoin is better if size(πa(s)) + size(r A S)) < size(r) 49
50 Step 4 Local Optimization Input: Best global execution schedule Select the best access path Use the centralized optimization techniques 50
Query Processing SL03
Distributed Database Systems Fall 2016 Query Processing Overview Query Processing SL03 Distributed Query Processing Steps Query Decomposition Data Localization Query Processing Overview/1 Query processing:
More informationQuery Decomposition and Data Localization
Query Decomposition and Data Localization Query Decomposition and Data Localization Query decomposition and data localization consists of two steps: Mapping of calculus query (SQL) to algebra operations
More informationIntroduction Background Distributed DBMS Architecture Distributed Database Design Semantic Data Control Distributed Query Processing
Outline Introduction Background Distributed DBMS Architecture Distributed Database Design Semantic Data Control Distributed Query Processing Query Processing Methodology Distributed Query Optimization
More informationIntroduction Background Distributed DBMS Architecture Distributed Database Design Semantic Data Control Distributed Query Processing
Outline Introduction Background Distributed DBMS Architecture Distributed Database Design Semantic Data Control Distributed Query Processing Query Processing Methodology Distributed Query Optimization
More informationCS54200: Distributed. Introduction
CS54200: Distributed Database Systems Query Processing 9 March 2009 Prof. Chris Clifton Query Processing Introduction Converting user commands from the query language (SQL) to low level data manipulation
More informationDistributed Databases Systems
Distributed Databases Systems Lecture No. 05 Query Processing Naeem Ahmed Email: naeemmahoto@gmail.com Department of Software Engineering Mehran Univeristy of Engineering and Technology Jamshoro Outline
More informationQUERY PROCESSING & OPTIMIZATION CHAPTER 19 (6/E) CHAPTER 15 (5/E)
QUERY PROCESSING & OPTIMIZATION CHAPTER 19 (6/E) CHAPTER 15 (5/E) 2 LECTURE OUTLINE Query Processing Methodology Basic Operations and Their Costs Generation of Execution Plans 3 QUERY PROCESSING IN A DDBMS
More informationQUERY PROCESSING & OPTIMIZATION CHAPTER 19 (6/E) CHAPTER 15 (5/E)
QUERY PROCESSING & OPTIMIZATION CHAPTER 19 (6/E) CHAPTER 15 (5/E) 2 LECTURE OUTLINE Query Processing Methodology Basic Operations and Their Costs Generation of Execution Plans 3 QUERY PROCESSING IN A DDBMS
More informationOptimization of Distributed Queries
Query Optimization Optimization of Distributed Queries Issues in Query Optimization Joins and Semijoins Query Optimization Algorithms Centralized query optimization: Minimize the cots function Find (the
More informationDistributed caching for multiple databases
Distributed caching for multiple databases K. V. Santhilata, Post Graduate Research Student, Department of Informatics, School of Natural and Mathematical Sciences, King's College London, London, U.K 1
More informationChapter 4 Distributed Query Processing
Chapter 4 Distributed Query Processing Table of Contents Overview of Query Processing Query Decomposition and Data Localization Optimization of Distributed Queries Chapter4-1 1 1. Overview of Query Processing
More informationDistributed Query Optimization: Use of mobile Agents Kodanda Kumar Melpadi
Distributed Query Optimization: Use of mobile Agents Kodanda Kumar Melpadi M.Tech (IT) GGS Indraprastha University Delhi mk_kumar_76@yahoo.com Abstract DDBS adds to the conventional centralized DBS some
More informationMobile and Heterogeneous databases Distributed Database System Query Processing. A.R. Hurson Computer Science Missouri Science & Technology
Mobile and Heterogeneous databases Distributed Database System Query Processing A.R. Hurson Computer Science Missouri Science & Technology 1 Note, this unit will be covered in four lectures. In case you
More informationOutline. Distributed DBMS Page 5. 1
Outline Introduction Background Distributed DBMS Architecture Distributed Database Design Fragmentation Data Location Semantic Data Control Distributed Query Processing Distributed Transaction Management
More informationMobile and Heterogeneous databases
Mobile and Heterogeneous databases Heterogeneous Distributed Databases Query Processing A.R. Hurson Computer Science Missouri Science & Technology 1 Note, this unit will be covered in two lectures. In
More informationOutline. Distributed DBMS 1998 M. Tamer Özsu & Patrick Valduriez
Outline Introduction Background Distributed DBMS Architecture Distributed Database Design Semantic Data Control View Management Data Security Semantic Integrity Control Distributed Query Processing Distributed
More informationCS54200: Distributed Database Systems
CS54200: Distributed Database Systems Distributed Database Design 23 February, 2009 Prof. Chris Clifton Design Problem In the general setting: Making decisions about the placement of data and programs
More informationWhat is Data? Volatile vs. persistent data Our concern is primarily with persistent data
What is? ANSI definition: ❶ A representation of facts, concepts, or instructions in a formalized manner suitable for communication, interpretation, or processing by humans or by automatic means. ❷ Any
More informationWhat is Data? ANSI definition: Volatile vs. persistent data. Data. Our concern is primarily with persistent data
What is Data? ANSI definition: Data ❶ A representation of facts, concepts, or instructions in a formalized manner suitable for communication, interpretation, or processing by humans or by automatic means.
More informationBackground. Chapter Overview of Relational DBMS Relational Database Concepts
Chapter 2 Background As indicated in the previous chapter, there are two technological bases for distributed database technology: database management and computer networks. In this chapter, we provide
More informationIntegration of Transactional Systems
Integration of Transactional Systems Distributed Query Processing Robert Wrembel Poznań University of Technology Institute of Computing Science Robert.Wrembel@cs.put.poznan.pl www.cs.put.poznan.pl/rwrembel
More informationOutline. File Systems. Page 1
Outline Introduction What is a distributed DBMS Problems Current state-of-affairs Background Distributed DBMS Architecture Distributed Database Design Semantic Data Control Distributed Query Processing
More informationOutline. q Database integration & querying. q Peer-to-Peer data management q Stream data management q MapReduce-based distributed data management
Outline n Introduction & architectural issues n Data distribution n Distributed query processing n Distributed query optimization n Distributed transactions & concurrency control n Distributed reliability
More informationIntroduction to Query Processing and Query Optimization Techniques. Copyright 2011 Ramez Elmasri and Shamkant Navathe
Introduction to Query Processing and Query Optimization Techniques Outline Translating SQL Queries into Relational Algebra Algorithms for External Sorting Algorithms for SELECT and JOIN Operations Algorithms
More informationAlgorithms for Query Processing and Optimization. 0. Introduction to Query Processing (1)
Chapter 19 Algorithms for Query Processing and Optimization 0. Introduction to Query Processing (1) Query optimization: The process of choosing a suitable execution strategy for processing a query. Two
More informationChapter 19 Query Optimization
Chapter 19 Query Optimization It is an activity conducted by the query optimizer to select the best available strategy for executing the query. 1. Query Trees and Heuristics for Query Optimization - Apply
More informationChapter 3. Algorithms for Query Processing and Optimization
Chapter 3 Algorithms for Query Processing and Optimization Chapter Outline 1. Introduction to Query Processing 2. Translating SQL Queries into Relational Algebra 3. Algorithms for External Sorting 4. Algorithms
More informationCs712 Important Questions & Past Papers
Cs712 Distributed Database Q1. Differentiate Horizontal and vertical partitions? Vertical Fragmentation: Different subsets of attributes are stored at different places, like, Table EMP (eid, ename, edept,
More informationRelational Model History. COSC 304 Introduction to Database Systems. Relational Model and Algebra. Relational Model Definitions.
COSC 304 Introduction to Database Systems Relational Model and Algebra Dr. Ramon Lawrence University of British Columbia Okanagan ramon.lawrence@ubc.ca Relational Model History The relational model was
More informationCOSC 122 Computer Fluency. Databases. Dr. Ramon Lawrence University of British Columbia Okanagan
COSC 122 Computer Fluency Databases Dr. Ramon Lawrence University of British Columbia Okanagan ramon.lawrence@ubc.ca Key Points 1) Databases allow for easy storage and retrieval of large amounts of information.
More informationDatabase Systems External Sorting and Query Optimization. A.R. Hurson 323 CS Building
External Sorting and Query Optimization A.R. Hurson 323 CS Building External sorting When data to be sorted cannot fit into available main memory, external sorting algorithm must be applied. Naturally,
More informationQuery optimization. Elena Baralis, Silvia Chiusano Politecnico di Torino. DBMS Architecture D B M G. Database Management Systems. Pag.
Database Management Systems DBMS Architecture SQL INSTRUCTION OPTIMIZER MANAGEMENT OF ACCESS METHODS CONCURRENCY CONTROL BUFFER MANAGER RELIABILITY MANAGEMENT Index Files Data Files System Catalog DATABASE
More informationRelational Query Optimization. Overview of Query Evaluation. SQL Refresher. Yanlei Diao UMass Amherst October 23 & 25, 2007
Relational Query Optimization Yanlei Diao UMass Amherst October 23 & 25, 2007 Slide Content Courtesy of R. Ramakrishnan, J. Gehrke, and J. Hellerstein 1 Overview of Query Evaluation Query Evaluation Plan:
More informationKey Points. COSC 122 Computer Fluency. Databases. What is a database? Databases in the Real-World DBMS. Database System Approach
COSC 122 Computer Fluency Dr. Ramon Lawrence University of British Columbia Okanagan ramon.lawrence@ubc.ca Key Points 1) allow for easy storage and retrieval of large amounts of information. 2) Relational
More informationChapter 12: Query Processing
Chapter 12: Query Processing Overview Catalog Information for Cost Estimation $ Measures of Query Cost Selection Operation Sorting Join Operation Other Operations Evaluation of Expressions Transformation
More informationTotalCost = 3 (1, , 000) = 6, 000
156 Chapter 12 HASH JOIN: Now both relations are the same size, so we can treat either one as the smaller relation. With 15 buffer pages the first scan of S splits it into 14 buckets, each containing about
More informationQuery Processing and Query Optimization. Prof Monika Shah
Query Processing and Query Optimization Query Processing SQL Query Is in Library Cache? System catalog (Dict / Dict cache) Scan and verify relations Parse into parse tree (relational Calculus) View definitions
More informationCopyright 2016 Ramez Elmasri and Shamkant B. Navathe
CHAPTER 19 Query Optimization Introduction Query optimization Conducted by a query optimizer in a DBMS Goal: select best available strategy for executing query Based on information available Most RDBMSs
More informationDatabase Tuning and Physical Design: Basics of Query Execution
Database Tuning and Physical Design: Basics of Query Execution Spring 2018 School of Computer Science University of Waterloo Databases CS348 (University of Waterloo) Query Execution 1 / 43 The Client/Server
More informationOutline. Query Processing Overview Algorithms for basic operations. Query optimization. Sorting Selection Join Projection
Outline Query Processing Overview Algorithms for basic operations Sorting Selection Join Projection Query optimization Heuristics Cost-based optimization 19 Estimate I/O Cost for Implementations Count
More informationQuery Optimization Overview
Query Optimization Overview parsing, syntax checking semantic checking check existence of referenced relations and attributes disambiguation of overloaded operators check user authorization query rewrites
More informationQuery Processing & Optimization
Query Processing & Optimization 1 Roadmap of This Lecture Overview of query processing Measures of Query Cost Selection Operation Sorting Join Operation Other Operations Evaluation of Expressions Introduction
More informationDatabase Architectures
Database Architectures CPS352: Database Systems Simon Miner Gordon College Last Revised: 11/15/12 Agenda Check-in Centralized and Client-Server Models Parallelism Distributed Databases Homework 6 Check-in
More informationDatabase System Concepts
Chapter 14: Optimization Departamento de Engenharia Informática Instituto Superior Técnico 1 st Semester 2007/2008 Slides (fortemente) baseados nos slides oficiais do livro c Silberschatz, Korth and Sudarshan.
More informationAdvanced Databases. Lecture 4 - Query Optimization. Masood Niazi Torshiz Islamic Azad university- Mashhad Branch
Advanced Databases Lecture 4 - Query Optimization Masood Niazi Torshiz Islamic Azad university- Mashhad Branch www.mniazi.ir Query Optimization Introduction Transformation of Relational Expressions Catalog
More informationCMSC 424 Database design Lecture 18 Query optimization. Mihai Pop
CMSC 424 Database design Lecture 18 Query optimization Mihai Pop More midterm solutions Projects do not be late! Admin Introduction Alternative ways of evaluating a given query Equivalent expressions Different
More informationWhat happens. 376a. Database Design. Execution strategy. Query conversion. Next. Two types of techniques
376a. Database Design Dept. of Computer Science Vassar College http://www.cs.vassar.edu/~cs376 Class 16 Query optimization What happens Database is given a query Query is scanned - scanner creates a list
More informationDatabases. Relational Model, Algebra and operations. How do we model and manipulate complex data structures inside a computer system? Until
Databases Relational Model, Algebra and operations How do we model and manipulate complex data structures inside a computer system? Until 1970.. Many different views or ways of doing this Could use tree
More informationCS 4604: Introduction to Database Management Systems. B. Aditya Prakash Lecture #10: Query Processing
CS 4604: Introduction to Database Management Systems B. Aditya Prakash Lecture #10: Query Processing Outline introduction selection projection join set & aggregate operations Prakash 2018 VT CS 4604 2
More informationOverview of Data Management
Overview of Data Management School of Computer Science University of Waterloo Databases CS348 (University of Waterloo) Overview of Data Management 1 / 21 What is Data ANSI definition of data: 1 A representation
More informationRelational Databases
Relational Databases Jan Chomicki University at Buffalo Jan Chomicki () Relational databases 1 / 49 Plan of the course 1 Relational databases 2 Relational database design 3 Conceptual database design 4
More informationMahathma Gandhi University
Mahathma Gandhi University BSc Computer science III Semester BCS 303 OBJECTIVE TYPE QUESTIONS Choose the correct or best alternative in the following: Q.1 In the relational modes, cardinality is termed
More informationDistributed Databases
Distributed Databases Chapter 1: Introduction Syllabus Data Independence and Distributed Data Processing Definition of Distributed databases Promises of Distributed Databases Technical Problems to be Studied
More informationBasant Group of Institution
Basant Group of Institution Visual Basic 6.0 Objective Question Q.1 In the relational modes, cardinality is termed as: (A) Number of tuples. (B) Number of attributes. (C) Number of tables. (D) Number of
More informationCOSC 304 Introduction to Database Systems SQL. Dr. Ramon Lawrence University of British Columbia Okanagan
COSC 304 Introduction to Database Systems SQL Dr. Ramon Lawrence University of British Columbia Okanagan ramon.lawrence@ubc.ca SQL Queries Querying with SQL is performed using a SELECT statement. The general
More informationSchema for Examples. Query Optimization. Alternative Plans 1 (No Indexes) Motivating Example. Alternative Plans 2 With Indexes
Schema for Examples Query Optimization (sid: integer, : string, rating: integer, age: real) (sid: integer, bid: integer, day: dates, rname: string) Similar to old schema; rname added for variations. :
More informationName: 1. (a) SQL is an example of a non-procedural query language.
Name: 1 1. (20 marks) erminology: or each of the following statements, state whether it is true or false. If it is false, correct the statement without changing the underlined text. (Note: there might
More informationDatabase Architectures
Database Architectures CPS352: Database Systems Simon Miner Gordon College Last Revised: 4/15/15 Agenda Check-in Parallelism and Distributed Databases Technology Research Project Introduction to NoSQL
More informationQuery Optimization. Schema for Examples. Motivating Example. Similar to old schema; rname added for variations. Reserves: Sailors:
Query Optimization atabase Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Schema for Examples (sid: integer, sname: string, rating: integer, age: real) (sid: integer, bid: integer, day: dates,
More informationQuery Optimization. Schema for Examples. Motivating Example. Similar to old schema; rname added for variations. Reserves: Sailors:
Query Optimization Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Schema for Examples (sid: integer, sname: string, rating: integer, age: real) (sid: integer, bid: integer, day: dates,
More informationCh 5 : Query Processing & Optimization
Ch 5 : Query Processing & Optimization Basic Steps in Query Processing 1. Parsing and translation 2. Optimization 3. Evaluation Basic Steps in Query Processing (Cont.) Parsing and translation translate
More informationParallel DBMS. Parallel Database Systems. PDBS vs Distributed DBS. Types of Parallelism. Goals and Metrics Speedup. Types of Parallelism
Parallel DBMS Parallel Database Systems CS5225 Parallel DB 1 Uniprocessor technology has reached its limit Difficult to build machines powerful enough to meet the CPU and I/O demands of DBMS serving large
More informationAdvanced Databases: Parallel Databases A.Poulovassilis
1 Advanced Databases: Parallel Databases A.Poulovassilis 1 Parallel Database Architectures Parallel database systems use parallel processing techniques to achieve faster DBMS performance and handle larger
More informationSQL Queries. COSC 304 Introduction to Database Systems SQL. Example Relations. SQL and Relational Algebra. Example Relation Instances
COSC 304 Introduction to Database Systems SQL Dr. Ramon Lawrence University of British Columbia Okanagan ramon.lawrence@ubc.ca SQL Queries Querying with SQL is performed using a SELECT statement. The general
More informationR & G Chapter 13. Implementation of single Relational Operations Choices depend on indexes, memory, stats, Joins Blocked nested loops:
Relational Query Optimization R & G Chapter 13 Review Implementation of single Relational Operations Choices depend on indexes, memory, stats, Joins Blocked nested loops: simple, exploits extra memory
More informationQUERY OPTIMIZATION. CS 564- Spring ACKs: Jeff Naughton, Jignesh Patel, AnHai Doan
QUERY OPTIMIZATION CS 564- Spring 2018 ACKs: Jeff Naughton, Jignesh Patel, AnHai Doan WHAT IS THIS LECTURE ABOUT? What is a query optimizer? Generating query plans Cost estimation of query plans 2 ARCHITECTURE
More informationRelational Model History. COSC 416 NoSQL Databases. Relational Model (Review) Relation Example. Relational Model Definitions. Relational Integrity
COSC 416 NoSQL Databases Relational Model (Review) Dr. Ramon Lawrence University of British Columbia Okanagan ramon.lawrence@ubc.ca Relational Model History The relational model was proposed by E. F. Codd
More informationOverview of Implementing Relational Operators and Query Evaluation
Overview of Implementing Relational Operators and Query Evaluation Chapter 12 Motivation: Evaluating Queries The same query can be evaluated in different ways. The evaluation strategy (plan) can make orders
More informationRelational Query Optimization
Relational Query Optimization Module 4, Lectures 3 and 4 Database Management Systems, R. Ramakrishnan 1 Overview of Query Optimization Plan: Tree of R.A. ops, with choice of alg for each op. Each operator
More informationFaloutsos 1. Carnegie Mellon Univ. Dept. of Computer Science Database Applications. Outline
Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications Lecture #14: Implementation of Relational Operations (R&G ch. 12 and 14) 15-415 Faloutsos 1 introduction selection projection
More informationQuery Processing & Optimization. CS 377: Database Systems
Query Processing & Optimization CS 377: Database Systems Recap: File Organization & Indexing Physical level support for data retrieval File organization: ordered or sequential file to find items using
More informationCSE 544 Principles of Database Management Systems. Alvin Cheung Fall 2015 Lecture 7 - Query optimization
CSE 544 Principles of Database Management Systems Alvin Cheung Fall 2015 Lecture 7 - Query optimization Announcements HW1 due tonight at 11:45pm HW2 will be due in two weeks You get to implement your own
More informationDatabase Systems. Announcement. December 13/14, 2006 Lecture #10. Assignment #4 is due next week.
Database Systems ( 料 ) December 13/14, 2006 Lecture #10 1 Announcement Assignment #4 is due next week. 2 1 Overview of Query Evaluation Chapter 12 3 Outline Query evaluation (Overview) Relational Operator
More informationIntroduction Alternative ways of evaluating a given query using
Query Optimization Introduction Catalog Information for Cost Estimation Estimation of Statistics Transformation of Relational Expressions Dynamic Programming for Choosing Evaluation Plans Introduction
More information7. Query Processing and Optimization
7. Query Processing and Optimization Processing a Query 103 Indexing for Performance Simple (individual) index B + -tree index Matching index scan vs nonmatching index scan Unique index one entry and one
More informationCOSC 304 Introduction to Database Systems SQL DDL. Dr. Ramon Lawrence University of British Columbia Okanagan
COSC 304 Introduction to Database Systems SQL DDL Dr. Ramon Lawrence University of British Columbia Okanagan ramon.lawrence@ubc.ca SQL Overview Structured Query Language or SQL is the standard query language
More informationQuery Optimization. Query Optimization. Optimization considerations. Example. Interaction of algorithm choice and tree arrangement.
COS 597: Principles of Database and Information Systems Query Optimization Query Optimization Query as expression over relational algebraic operations Get evaluation (parse) tree Leaves: base relations
More informationChapter 13: Query Optimization. Chapter 13: Query Optimization
Chapter 13: Query Optimization Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 13: Query Optimization Introduction Equivalent Relational Algebra Expressions Statistical
More informationDISTRIBUTED DATABASES CS561-SPRING 2012 WPI, MOHAMED ELTABAKH
DISTRIBUTED DATABASES CS561-SPRING 2012 WPI, MOHAMED ELTABAKH 1 RECAP: PARALLEL DATABASES Three possible architectures Shared-memory Shared-disk Shared-nothing (the most common one) Parallel algorithms
More informationFinal Exam Review 2. Kathleen Durant CS 3200 Northeastern University Lecture 23
Final Exam Review 2 Kathleen Durant CS 3200 Northeastern University Lecture 23 QUERY EVALUATION PLAN Representation of a SQL Command SELECT {DISTINCT} FROM {WHERE
More informationCOSC Dr. Ramon Lawrence. Emp Relation
COSC 304 Introduction to Database Systems Normalization Dr. Ramon Lawrence University of British Columbia Okanagan ramon.lawrence@ubc.ca Normalization Normalization is a technique for producing relations
More informationChapter 6 Part I The Relational Algebra and Calculus
Chapter 6 Part I The Relational Algebra and Calculus Copyright 2004 Ramez Elmasri and Shamkant Navathe Database State for COMPANY All examples discussed below refer to the COMPANY database shown here.
More informationDistributed Data Management
Distributed Data Management Wolf-Tilo Balke Christoph Lofi Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de 2.0 Introduction 3.0 Query Processing 3.1 Basic
More informationCSE 544 Principles of Database Management Systems. Magdalena Balazinska Fall 2007 Lecture 9 - Query optimization
CSE 544 Principles of Database Management Systems Magdalena Balazinska Fall 2007 Lecture 9 - Query optimization References Access path selection in a relational database management system. Selinger. et.
More informationCopyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley. Chapter 6 Outline. Unary Relational Operations: SELECT and
Chapter 6 The Relational Algebra and Relational Calculus Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 6 Outline Unary Relational Operations: SELECT and PROJECT Relational
More informationQUERY OPTIMIZATION [CH 15]
Spring 2017 QUERY OPTIMIZATION [CH 15] 4/12/17 CS 564: Database Management Systems; (c) Jignesh M. Patel, 2013 1 Example SELECT distinct ename FROM Emp E, Dept D WHERE E.did = D.did and D.dname = Toy EMP
More informationRelational Model: History
Relational Model: History Objectives of Relational Model: 1. Promote high degree of data independence 2. Eliminate redundancy, consistency, etc. problems 3. Enable proliferation of non-procedural DML s
More informationQuery Execution [15]
CSC 661, Principles of Database Systems Query Execution [15] Dr. Kalpakis http://www.csee.umbc.edu/~kalpakis/courses/661 Query processing involves Query processing compilation parsing to construct parse
More informationOn Building Integrated and Distributed Database Systems
On Building Integrated and Distributed Database Systems Distributed Query Processing and Optimization Robert Wrembel Poznań University of Technology Institute of Computing Science Poznań,, Poland Robert.Wrembel@cs.put.poznan.pl
More informationNote that it works well to answer questions on computer instead of on the board.
1) SELECT pname FROM Proj WHERE budget > 250000; 2) SELECT eno FROM Emp where salary < 30000; 3) SELECT DISTINCT resp from WorksOn; 4) SELECT ename FROM Emp WHERE bdate > DATE 1970-07-01' and salary >
More information1. Introduction; Selection, Projection. 2. Cartesian Product, Join. 5. Formal Definitions, A Bit of Theory
Relational Algebra 169 After completing this chapter, you should be able to enumerate and explain the operations of relational algebra (there is a core of 5 relational algebra operators), write relational
More informationQuery Evaluation and Optimization
Query Evaluation and Optimization Jan Chomicki University at Buffalo Jan Chomicki () Query Evaluation and Optimization 1 / 21 Evaluating σ E (R) Jan Chomicki () Query Evaluation and Optimization 2 / 21
More informationContents Contents Introduction Basic Steps in Query Processing Introduction Transformation of Relational Expressions...
Contents Contents...283 Introduction...283 Basic Steps in Query Processing...284 Introduction...285 Transformation of Relational Expressions...287 Equivalence Rules...289 Transformation Example: Pushing
More informationCompSci 516 Data Intensive Computing Systems
CompSci 516 Data Intensive Computing Systems Lecture 9 Join Algorithms and Query Optimizations Instructor: Sudeepa Roy CompSci 516: Data Intensive Computing Systems 1 Announcements Takeaway from Homework
More informationPrinciples of Database Management Systems
Principles of Database Management Systems 5: Query Processing Pekka Kilpeläinen (partially based on Stanford CS245 slide originals by Hector Garcia-Molina, Jeff Ullman and Jennifer Widom) Query Processing
More informationExamples of Physical Query Plan Alternatives. Selected Material from Chapters 12, 14 and 15
Examples of Physical Query Plan Alternatives Selected Material from Chapters 12, 14 and 15 1 Query Optimization NOTE: SQL provides many ways to express a query. HENCE: System has many options for evaluating
More informationDatabase Management Systems (COP 5725) Homework 3
Database Management Systems (COP 5725) Homework 3 Instructor: Dr. Daisy Zhe Wang TAs: Yang Chen, Kun Li, Yang Peng yang, kli, ypeng@cise.uf l.edu November 26, 2013 Name: UFID: Email Address: Pledge(Must
More informationChapter 14: Query Optimization
Chapter 14: Query Optimization Database System Concepts 5 th Ed. See www.db-book.com for conditions on re-use Chapter 14: Query Optimization Introduction Transformation of Relational Expressions Catalog
More informationLecture 19: Query Optimization (1)
Lecture 19: Query Optimization (1) May 17, 2010 Dan Suciu -- 444 Spring 2010 1 Announcements Homework 3 due on Wednesday in class How is it going? Project 4 posted Due on June 2 nd Start early! Dan Suciu
More informationChapter 11: Query Optimization
Chapter 11: Query Optimization Chapter 11: Query Optimization Introduction Transformation of Relational Expressions Statistical Information for Cost Estimation Cost-based optimization Dynamic Programming
More information