CS 347 Parallel and Distributed Data Processing

Size: px
Start display at page:

Download "CS 347 Parallel and Distributed Data Processing"

Transcription

1 C 347 Parallel and Distributed Data Processing pring 2016 Query Processing Decomposition Localization Optimization Notes 3: Query Processing C 347 Notes 3 2 Decomposition ame as in centralized system 1. Normalization 2. Eliminating redundancy 3. lgebraic rewriting Normalization Convert from general language to a standard form E.g., relational algebra C 347 Notes 3 3 C 347 Notes 3 4

2 Normalization Example: Normalization lso detect invalid expressions.,.c select.,.c from, where. =. and σ (.B=1.C>3).D=2 select * from where.d = 3 does not have D attribute ((.B = 1 and.d = 2) or (.C > 3 and.d = 2)) conjunctive normal form C 347 Notes 3 5 C 347 Notes 3 6 Eliminating edundancy E.g., in conditions Eliminating edundancy E.g., sharing common sub-expressions (. = 1) (. > 5) false (. < 10) (. < 5). < 5 σ cond σ cond T σ cond T C 347 Notes 3 7 C 347 Notes 3 8

3 lgebraic ewriting E.g., pushing conditions down σ cond σ cond3 σ cond1 σ cond2 Query Processing Decomposition One or more algebraic query trees on relations Localization eplacing relations by corresponding fragments C 347 Notes 3 9 C 347 Notes 3 10 Localization teps (1) tart with query (2) eplace relations by fragments (3) Push up and, σ down (apply C 245 rules) Localization Notation for fragment [ : cond] conditions its tuples satisfy fragment (4) implify ~ eliminate unnecessary operations C 347 Notes 3 11 C 347 Notes 3 12

4 Example (1) Example (2) σ E=3 σ E=3 [ 1: E<10] [ 2: E 10] C 347 Notes 3 13 C 347 Notes 3 14 Example Example (3) (3) σ E=3 σ E=3 [ 1: E<10] [ 2: E 10] σ E=3 σ E=3 [ 1: E<10] [ 2: E 10] Ø C 347 Notes 3 15 C 347 Notes 3 16

5 Example (4) Localization ule 1 σ E=3 [ 1: E<10] σ c1 [: c2] [: false] σ c1 [: c1 c2] Ø C 347 Notes 3 17 C 347 Notes 3 18 Example Example B σ E=3 [ 2: E 10] σ E=3 [ 2: E=3 E 10] σ E=3 [ 2: false] Ø (1) = common attribute C 347 Notes 3 19 C 347 Notes 3 20

6 Example B Example B (2) (3) [1: <5] [2: 5 10] [3: >10] [1: <5] [2: 5] [1: <5] [1: <5] [1: <5] [2: 5] [3: >10] [1: <5] [3: >10] [2: 5] [2: 5 10] [1: <5] [2: 5 10] [2: 5] C 347 Notes 3 21 C 347 Notes 3 22 Example B Example B (3) (4) [1: <5] [1: <5] [1: <5] [2: 5] [3: >10] [1: <5] [3: >10] [2: 5] [1: <5] [1: <5] [2: 5 10] [2: 5] [3: >10] [2: 5] [2: 5 10] [1: <5] [2: 5 10] [2: 5] C 347 Notes 3 23 C 347 Notes 3 24

7 Localization ule 2 Example B [ 1: < 5] [ 2: 5] [: c 1] [: c 2] [ : c 1 c 2.=.] [ 1 2: 1. < = 2.] [: false] Ø [ 1 2: false] Ø C 347 Notes 3 25 C 347 Notes 3 26 Example C Example C (2) (3) K K K K K [1: <10] [2: 10] [1: K=.K.<10] [2: K=.K. 10] derived fragmentation [1] [1] [1] [2] [2] [1] [2] [2] C 347 Notes 3 27 C 347 Notes 3 28

8 Example C (3) Example C (4) K K K K K K [1] [1] [1] [2] [2] [1] [2] [2] [1: <10] [1: K=.K.<10] [2: 10] [2: K=.K. 10] C 347 Notes 3 29 C 347 Notes 3 30 Example C Example D K [ 1: < 10] [ 2: K =.K. 10] (1) [ 1 K 2: 1. < 10 2.K =.K K = 2.K] 1 (K,, B) 2 (K, C, D) [ 1 Ø K 2: false] vertical fragmentation C 347 Notes 3 31 C 347 Notes 3 32

9 Example D Example D (2) (2) K K 1 (K,,B) 2 (K,C,D) K, K, 1 (K,,B) 2 (K,C,D) C 347 Notes 3 33 C 347 Notes 3 34 Example D ule 3 (4) Consider the vertical fragmentation 1 (K,, B) i = (), i i Then for any B i B() = B( i B i Ø ) C 347 Notes 3 35 C 347 Notes 3 36

10 Query Processing Decomposition Localization Overview Optimization process 1. Generate query plans P 1 P 2 P 3 P n Optimization Overview Joins and other operations Inter-operation parallelism Optimization strategies 2. Estimate size of intermediate results 3. Estimate cost of plan C 1 C 2 C 3 C n C 347 Notes Pick minimum C 347 Notes 3 38 Overview Differences from centralized optimization New strategies for some operations Parallel/distributed sort Parallel/distributed join emi-join Privacy preserving join Duplicate elimination ggregation election Many ways to assign and schedule processors C 347 Notes 3 39 Parallel/Distributed ort Input a) elation on single site/disk b) elation fragmented/partitioned by sort attribute c) elation fragmented/partitioned by another attribute C 347 Notes 3 40

11 Parallel/Distributed ort Output a) orted on single site/disk b) orted fragments/partitions Basic ort (K, ) to be sorted on K Horizontally fragmented on K using vector (k 0, k 1,..., k n) k0 k F 1 F 2 F 3 C 347 Notes 3 41 C 347 Notes 3 42 Basic ort Basic ort lgorithm 1. ort each fragment independently 2. hip results if necessary ame idea on different architectures hared nothing P 1 P 2 M F 1 M F 2 hared memory P 1 P 2 F 1 M F 2 C 347 Notes 3 43 C 347 Notes 3 44

12 ange-partition ort (K, ) to be sorted on K located at one or more site/disk, not fragmented on K ange-partition ort lgorithm 1. ange partition on K 2. Basic sort a 1 local sort 1 k0 b 2 local sort 2 result k1 3 local sort 3 C 347 Notes 3 45 C 347 Notes 3 46 Partition Vectors Problem elect a good partition vector given fragments a b c 4 Partition Vectors Example approach Each site sends to coordinator MIN sort key MX sort key Number of tuples Coordinator Computes vector and distributes to sites Decides the number of sites to perform local sorts C 347 Notes 3 47 C 347 Notes 3 48

13 Partition Vectors ample scenario Coordinator receives : MIN = 5 MX = 9 # of tuples = 10 B: MIN = 7 MX = 16 # of tuples = 10 Partition Vectors ample scenario Coordinator receives : MIN = 5 MX = 9 # of tuples = 10 B: MIN = 7 MX = 16 # of tuples = 10 Expected tuples 2 = 10 / ( ) 1 = 10 / ( ) k0? assuming we want to sort at 2 sites C 347 Notes 3 49 C 347 Notes 3 50 Partition Vectors Partition Vectors ample scenario Expected tuples 2 Variations end more info to coordinator: a) Partition vector for local site Expected tuples with key < k 0 = half of total tuples 2(k 0 5) + (k 0 7) = 10 k 0 = ( ) / 3 = k0? 1 b) Histogram # of tuples local vector C 347 Notes 3 51 C 347 Notes 3 52

14 Partition Vectors Variations Multiple rounds 1. ites send range and # of tuples to coordinator 2. Coordinator distributes preliminary vector V 0 3. ites send coordinator # of tuples in each V 0 range Partition Vectors Variations Distributed algorithm that does not require a coordinator? 4. Coordinator computes and distributes final vector V f C 347 Notes 3 53 C 347 Notes 3 54 Parallel External ort-merge ame as range-partition sort except does sorting first a local sort a 1 Parallel/Distributed Join Input elations, May or may not be fragmented b local sort b k0 2 k1 result Output esult at one or more sites 3 in order merge C 347 Notes 3 55 C 347 Notes 3 56

15 Partitioned Join (Equijoin) local join Partitioned Join (Equijoin) ame partition function f is used for both and 1 1 f can be range or hash partitioning B 2 2 B Local join can be of any type (use any C 245 optimization) f() 3 result 3 f() C Various scheduling options, e.g., a) Partition ; partition ; join b) Partition ; build local hash table for ; partition and join C 347 Notes 3 57 C 347 Notes 3 58 Partitioned Join (Equijoin) We already know why it works Partitioned Join (Equijoin) electing a good partition function f is very important Goal is to make all i + i the same size [1] [2] [3] [1] [2] [3] [1] [1] [2] [2] [3] [3] ometimes we partition just to make this join possible C 347 Notes 3 59 C 347 Notes 3 60

16 symmetric Fragment + eplicate Join symmetric Fragment + eplicate Join 1 local join Can use any partition function f for (even round robin) Can do any join, not just equijoin (e.g., ).<.B B 2 B partition 3 result union C 347 Notes 3 61 C 347 Notes 3 62 General Fragment + eplicate Join General Fragment + eplicate Join B 2 2 B 2 2 f partition 3 fragments 3 3 n copies of each fragment f partition 3 fragments 3 3 n copies of each fragment is partitioned in similar fashion C 347 Notes 3 63 C 347 Notes 3 64

17 General Fragment + eplicate Join General Fragment + eplicate Join The asymmetric F+ join is special case of the general F symmetric F+ may be good if is small n m pairings of, fragments lso works on non-equijoins result C 347 Notes 3 65 C 347 Notes 3 66 emi-join emi-join Goal: reduce communication traffic ( ) or ( ) or ( ) ( ) B 2 a 10 b 25 c 30 d C 3 x 10 y 15 z 25 w 32 x C 347 Notes 3 67 C 347 Notes 3 68

18 emi-join emi-join B 2 a 10 b 25 c 30 d C 3 x 10 y 15 z 25 w 32 x B 2 a 10 b 25 c 30 d C 3 x 10 y 15 z 25 w 32 x = [2, 10, 25, 30] = [2, 10, 25, 30] 10 y = 25 w C 347 Notes 3 69 C 347 Notes 3 70 emi-join emi-join B 2 a 10 b 25 c 30 d C 3 x 10 y 15 z 25 w 32 x ssume is the smaller relation Then, in general, ( ) is better than ( ) if size( ) + size ( ) < size ( ) Transmitted data With semi-join ( ): T = C + result With join : T = 4 + B + result better if B is large Can do similar comparison for other joins Only taking into account transmission cost C 347 Notes 3 71 C 347 Notes 3 72

19 emi-join Trick: encode (or ) as a bit vector emi-join Useful for three way joins T keys in one bit/possible key C 347 Notes 3 73 C 347 Notes 3 74 emi-join Useful for three way joins T emi-join Useful for three way joins T Option 1: T where = = T Option 1: T where = = T Option 2: T where = = T C 347 Notes 3 75 C 347 Notes 3 76

20 emi-join Useful for three way joins T Option 1: T where = = T Option 2: T where = = T Many options ~ exponential in # of relations Privacy Preserving Join ite 1 has (, B) ite 2 has (, C) Want to compute ite 1 should not discover any info not in the join ite 2 should not discover any info not in the join C 347 Notes 3 77 C 347 Notes 3 78 Privacy Preserving Join emi-join won t work If site 1 sends to site 2, site 2 learns all keys of Privacy Preserving Join Fix: send hash keys only ite 1 hashes each value of before sending ite 2 hashes its own values (same h) to see what tuples match a1 a2 a3 B b1 b2 b3 = (a1, a2, a3, a4) a1 a3 a5 C c1 c2 c3 a1 a2 a3 B b1 b2 b3 = (h(a1), h(a2), h(a3), h(a4)) site 2 sees it has h(a1), h(a3) a1 a3 a5 C c1 c2 c3 a4 b4 a7 c4 site 1 site 2 a4 b4 a7 c4 (a1, c1), (a3, c3) site 1 site 2 C 347 Notes 3 79 C 347 Notes 3 80

21 Privacy Preserving Join Privacy Preserving Join What is the problem? What is the problem? a1 a2 a3 B b1 b2 b3 = (h(a1), h(a2), h(a3), h(a4)) site 2 sees it has h(a1), h(a3) a1 a3 a5 C c1 c2 c3 a1 a2 a3 B b1 b2 b3 = (h(a1), h(a2), h(a3), h(a4)) site 2 sees it has h(a1), h(a3) a1 a3 a5 C c1 c2 c3 a4 b4 a7 c4 (a1, c1), (a3, c3) site 1 site 2 a4 b4 a7 c4 (a1, c1), (a3, c3) site 1 site 2 Dictionary attack ite 2 can take all keys a 1, a 2, a 3, and checks if h(a 1), h(a 2), h(a 3), matches what site 1 sent C 347 Notes 3 81 C 347 Notes 3 82 Privacy Preserving Join Our adversary model: honest but curious Dictionary attack is possible (cheating is internal and can t be caught) ending incorrect keys not possible (cheater could be caught) Privacy Preserving Join One solution Information haring cross Private Databases. grawal et al., IGMOD 2003 Use commutative encryption functions E i(x) = x encrypted using a key private to site i E 1(E 2(x)) = E 2(E 1(x)) horthand: E 1(x) is x, E 2(x) is x, E 1(E 2(x)) is x C 347 Notes 3 83 C 347 Notes 3 84

22 Privacy Preserving Join Other Privacy Preserving Operations One solution Inequality join.>. a1 a2 B b1 b2 ( a1, a2, a3, a4 ) a1 a3 C c1 c2 imilarity join sim(.,.)<e a3 b3 ( a1, a2, a3, a4 ) a5 c3 a4 b4 a7 c4 ( a1, a3, a5, a7 ) site 1 site 2 Computes ( a1, a3, a5, a7 ), intersects with ( a1, a2, a3, a4 ) (a1, b1), (a3, b3) C 347 Notes 3 85 C 347 Notes 3 86 Other Parallel Operations Parallel/Distributed ggregates Duplicate elimination ort first (in parallel) then eliminate duplicates in result Partition tuples (range or hash) and eliminate locally a 1 toy 10 2 toy 20 3 sales 15 ggregates Partition by grouping attributes; compute aggregate locally b 4 sales 5 5 toy 20 6 mgmt 15 7 sales 10 8 mgmt 30 sum (sal) group by dept C 347 Notes 3 87 C 347 Notes 3 88

23 Parallel/Distributed ggregates Parallel/Distributed ggregates a b 1 toy 10 2 toy 20 3 sales 15 4 sales 5 5 toy 20 6 mgmt 15 7 sales 10 8 mgmt 30 1 toy 10 2 toy 20 5 toy 20 6 mgmt 15 8 mgmt 30 3 sales 15 4 sales 5 7 sales 10 a b 1 toy 10 2 toy 20 3 sales 15 4 sales 5 5 toy 20 6 mgmt 15 7 sales 10 8 mgmt 30 1 toy 10 2 toy 20 5 toy 20 6 mgmt 15 8 mgmt 30 3 sales 15 4 sales 5 7 sales 10 sum sum dept sum toy 50 mgmt 45 dept sum sales 30 sum (sal) group by dept sum (sal) group by dept C 347 Notes 3 89 C 347 Notes 3 90 Parallel/Distributed ggregates Parallel/Distributed ggregates a 1 toy 10 2 toy 20 3 sales 15 a 1 toy 10 2 toy 20 3 sales 15 sum dept salary toy 30 toy 20 mgmt 45 b 4 sales 5 5 toy 20 6 mgmt 15 7 sales 10 8 mgmt 30 b 4 sales 5 5 toy 20 6 mgmt 15 7 sales 10 8 mgmt 30 sum dept salary sales 15 sales 15 sum (sal) group by dept with less data sum (sal) group by dept with less data C 347 Notes 3 91 C 347 Notes 3 92

24 Parallel/Distributed ggregates Parallel/Distributed ggregates a 1 toy 10 2 toy 20 3 sales 15 sum dept salary toy 30 toy 20 mgmt 45 sum dept sum toy 50 mgmt 45 Enhancement Perform aggregate during partition to reduce data transmitted Does not work for all aggregate functions which ones? b 4 sales 5 5 toy 20 6 mgmt 15 7 sales 10 8 mgmt 30 sum dept salary sales 15 sales 15 sum dept sum sales 30 sum (sal) group by dept with less data C 347 Notes 3 93 C 347 Notes 3 94 Preview: Mapeduce Parallel/Distributed election traightforward if one can leverage range or hash partitioning data 1 map data B 1 reduce data C 1 How do indexes work? data 2 data B 2 data C 1 data 3 C 347 Notes 3 95 C 347 Notes 3 96

25 Parallel/Distributed election Partition vector can act as the root of a distributed index Parallel/Distributed election Distributed indexes on a non-partition attributes get complicated k0 k1 k0 k1 local indexes index sites site 1 site 2 site 3 tuple sites C 347 Notes 3 97 C 347 Notes 3 98 Parallel/Distributed election Indexing schemes Which one is better in a distributed environment? How to make updates and expansion efficient? Where to store the directory and set of participants? Is global knowledge necessary? If the index is not too big, it may be better to keep it whole and replicate it Query Processing Decomposition Localization Optimization Overview Joins and other operations Inter-operation parallelism Optimization strategies C 347 Notes 3 99 C 347 Notes 3 100

26 Inter-Operation Parallelism Pipelined Parallelism Pipelined Independent site 2 σ c site 1 site 2 site 1 tuples matching σc join probe result C 347 Notes C 347 Notes Pipelined Parallelism Pipelining cannot be used in all cases Independent Parallelism site 1 site 2 T V stream of tuples stream of tuples 1. temp 1 temp 2 T V 2. result temp 1 temp 2 C 347 Notes C 347 Notes 3 104

27 ummary s we consider query plans for optimization, we must consider various new strategies for individual operations scheduling operations C 347 Notes 3 105

σ (R.B = 1 v R.C > 3) (S.D = 2) Conjunctive normal form Topics for the Day Distributed Databases Query Processing Steps Decomposition

σ (R.B = 1 v R.C > 3) (S.D = 2) Conjunctive normal form Topics for the Day Distributed Databases Query Processing Steps Decomposition Topics for the Day Distributed Databases Query processing in distributed databases Localization Distributed query operators Cost-based optimization C37 Lecture 1 May 30, 2001 1 2 Query Processing teps

More information

Parallel DBMS. Parallel Database Systems. PDBS vs Distributed DBS. Types of Parallelism. Goals and Metrics Speedup. Types of Parallelism

Parallel DBMS. Parallel Database Systems. PDBS vs Distributed DBS. Types of Parallelism. Goals and Metrics Speedup. Types of Parallelism Parallel DBMS Parallel Database Systems CS5225 Parallel DB 1 Uniprocessor technology has reached its limit Difficult to build machines powerful enough to meet the CPU and I/O demands of DBMS serving large

More information

! Parallel machines are becoming quite common and affordable. ! Databases are growing increasingly large

! Parallel machines are becoming quite common and affordable. ! Databases are growing increasingly large Chapter 20: Parallel Databases Introduction! Introduction! I/O Parallelism! Interquery Parallelism! Intraquery Parallelism! Intraoperation Parallelism! Interoperation Parallelism! Design of Parallel Systems!

More information

Chapter 20: Parallel Databases

Chapter 20: Parallel Databases Chapter 20: Parallel Databases! Introduction! I/O Parallelism! Interquery Parallelism! Intraquery Parallelism! Intraoperation Parallelism! Interoperation Parallelism! Design of Parallel Systems 20.1 Introduction!

More information

Chapter 20: Parallel Databases. Introduction

Chapter 20: Parallel Databases. Introduction Chapter 20: Parallel Databases! Introduction! I/O Parallelism! Interquery Parallelism! Intraquery Parallelism! Intraoperation Parallelism! Interoperation Parallelism! Design of Parallel Systems 20.1 Introduction!

More information

What happens. 376a. Database Design. Execution strategy. Query conversion. Next. Two types of techniques

What happens. 376a. Database Design. Execution strategy. Query conversion. Next. Two types of techniques 376a. Database Design Dept. of Computer Science Vassar College http://www.cs.vassar.edu/~cs376 Class 16 Query optimization What happens Database is given a query Query is scanned - scanner creates a list

More information

Chapter 17: Parallel Databases

Chapter 17: Parallel Databases Chapter 17: Parallel Databases Introduction I/O Parallelism Interquery Parallelism Intraquery Parallelism Intraoperation Parallelism Interoperation Parallelism Design of Parallel Systems Database Systems

More information

Chapter 18: Parallel Databases

Chapter 18: Parallel Databases Chapter 18: Parallel Databases Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 18: Parallel Databases Introduction I/O Parallelism Interquery Parallelism Intraquery

More information

Chapter 18: Parallel Databases. Chapter 18: Parallel Databases. Parallelism in Databases. Introduction

Chapter 18: Parallel Databases. Chapter 18: Parallel Databases. Parallelism in Databases. Introduction Chapter 18: Parallel Databases Chapter 18: Parallel Databases Introduction I/O Parallelism Interquery Parallelism Intraquery Parallelism Intraoperation Parallelism Interoperation Parallelism Design of

More information

Query Processing. Introduction to Databases CompSci 316 Fall 2017

Query Processing. Introduction to Databases CompSci 316 Fall 2017 Query Processing Introduction to Databases CompSci 316 Fall 2017 2 Announcements (Tue., Nov. 14) Homework #3 sample solution posted in Sakai Homework #4 assigned today; due on 12/05 Project milestone #2

More information

CS 347 Parallel and Distributed Data Processing

CS 347 Parallel and Distributed Data Processing C 37 Parallel and Distributed Data Processing pring 2016 Notes : Query Optimization Query Optimization Cost estimation trategies for exploring plans Q min C 37 Notes 2 Based on estimating result sizes

More information

Query Optimization. Query Optimization. Optimization considerations. Example. Interaction of algorithm choice and tree arrangement.

Query Optimization. Query Optimization. Optimization considerations. Example. Interaction of algorithm choice and tree arrangement. COS 597: Principles of Database and Information Systems Query Optimization Query Optimization Query as expression over relational algebraic operations Get evaluation (parse) tree Leaves: base relations

More information

Relational Algebra Equivalencies. Database Systems: The Complete Book Ch

Relational Algebra Equivalencies. Database Systems: The Complete Book Ch elational Algebra Equivalencies Database ystems: he Complete Book Ch. 16.2-16.3 Implementing: Joins olution 1 (Nested-Loop) For Each (a in A) { For Each (b in B) { emit (a, b); }} A B 2 Implementing: Joins

More information

Advanced Databases. Lecture 15- Parallel Databases (continued) Masood Niazi Torshiz Islamic Azad University- Mashhad Branch

Advanced Databases. Lecture 15- Parallel Databases (continued) Masood Niazi Torshiz Islamic Azad University- Mashhad Branch Advanced Databases Lecture 15- Parallel Databases (continued) Masood Niazi Torshiz Islamic Azad University- Mashhad Branch www.mniazi.ir Parallel Join The join operation requires pairs of tuples to be

More information

Query Evaluation and Optimization

Query Evaluation and Optimization Query Evaluation and Optimization Jan Chomicki University at Buffalo Jan Chomicki () Query Evaluation and Optimization 1 / 21 Evaluating σ E (R) Jan Chomicki () Query Evaluation and Optimization 2 / 21

More information

Introduction to Query Processing and Query Optimization Techniques. Copyright 2011 Ramez Elmasri and Shamkant Navathe

Introduction to Query Processing and Query Optimization Techniques. Copyright 2011 Ramez Elmasri and Shamkant Navathe Introduction to Query Processing and Query Optimization Techniques Outline Translating SQL Queries into Relational Algebra Algorithms for External Sorting Algorithms for SELECT and JOIN Operations Algorithms

More information

Query optimization. Query Optimization. Query Optimization. Cost estimation. Another reason why plain IOs not enough: Parallelism

Query optimization. Query Optimization. Query Optimization. Cost estimation. Another reason why plain IOs not enough: Parallelism Query optimization Query Optimization It is safer to accept any chance that offers itself, and extemporize a procedure to fit it, than to get a good plan matured, and wait for a chance of using it. Thomas

More information

Relational Algebra. Relational Algebra Overview. Relational Algebra Overview. Unary Relational Operations 8/19/2014. Relational Algebra Overview

Relational Algebra. Relational Algebra Overview. Relational Algebra Overview. Unary Relational Operations 8/19/2014. Relational Algebra Overview The Relational Algebra Relational Algebra Relational algebra is the basic set of operations for the relational model These operations enable a user to specify basic retrieval requests (or queries) Relational

More information

Magda Balazinska - CSE 444, Spring

Magda Balazinska - CSE 444, Spring Query Optimization Algorithm CE 444: Database Internals Lectures 11-12 Query Optimization (part 2) Enumerate alternative plans (logical & physical) Compute estimated cost of each plan Compute number of

More information

Parallel DBMS. Prof. Yanlei Diao. University of Massachusetts Amherst. Slides Courtesy of R. Ramakrishnan and J. Gehrke

Parallel DBMS. Prof. Yanlei Diao. University of Massachusetts Amherst. Slides Courtesy of R. Ramakrishnan and J. Gehrke Parallel DBMS Prof. Yanlei Diao University of Massachusetts Amherst Slides Courtesy of R. Ramakrishnan and J. Gehrke I. Parallel Databases 101 Rise of parallel databases: late 80 s Architecture: shared-nothing

More information

Chapter 12: Query Processing. Chapter 12: Query Processing

Chapter 12: Query Processing. Chapter 12: Query Processing Chapter 12: Query Processing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 12: Query Processing Overview Measures of Query Cost Selection Operation Sorting Join

More information

Advanced Databases: Parallel Databases A.Poulovassilis

Advanced Databases: Parallel Databases A.Poulovassilis 1 Advanced Databases: Parallel Databases A.Poulovassilis 1 Parallel Database Architectures Parallel database systems use parallel processing techniques to achieve faster DBMS performance and handle larger

More information

Chapter 12: Query Processing

Chapter 12: Query Processing Chapter 12: Query Processing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Overview Chapter 12: Query Processing Measures of Query Cost Selection Operation Sorting Join

More information

Query Execution [15]

Query Execution [15] CSC 661, Principles of Database Systems Query Execution [15] Dr. Kalpakis http://www.csee.umbc.edu/~kalpakis/courses/661 Query processing involves Query processing compilation parsing to construct parse

More information

Query Processing. Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016

Query Processing. Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016 Query Processing Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016 Slides re-used with some modification from www.db-book.com Reference: Database System Concepts, 6 th Ed. By Silberschatz,

More information

CSIT5300: Advanced Database Systems

CSIT5300: Advanced Database Systems CSIT5300: Advanced Database Systems L10: Query Processing Other Operations, Pipelining and Materialization Dr. Kenneth LEUNG Department of Computer Science and Engineering The Hong Kong University of Science

More information

Chapter 13: Query Optimization. Chapter 13: Query Optimization

Chapter 13: Query Optimization. Chapter 13: Query Optimization Chapter 13: Query Optimization Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 13: Query Optimization Introduction Equivalent Relational Algebra Expressions Statistical

More information

Database Systems. Announcement. December 13/14, 2006 Lecture #10. Assignment #4 is due next week.

Database Systems. Announcement. December 13/14, 2006 Lecture #10. Assignment #4 is due next week. Database Systems ( 料 ) December 13/14, 2006 Lecture #10 1 Announcement Assignment #4 is due next week. 2 1 Overview of Query Evaluation Chapter 12 3 Outline Query evaluation (Overview) Relational Operator

More information

EECS 647: Introduction to Database Systems

EECS 647: Introduction to Database Systems EECS 647: Introduction to Database Systems Instructor: Luke Huan Spring 2009 External Sorting Today s Topic Implementing the join operation 4/8/2009 Luke Huan Univ. of Kansas 2 Review DBMS Architecture

More information

Introduction to Wireless Sensor Network. Peter Scheuermann and Goce Trajcevski Dept. of EECS Northwestern University

Introduction to Wireless Sensor Network. Peter Scheuermann and Goce Trajcevski Dept. of EECS Northwestern University Introduction to Wireless Sensor Network Peter Scheuermann and Goce Trajcevski Dept. of EECS Northwestern University 1 A Database Primer 2 A leap in history A brief overview/review of databases (DBMS s)

More information

Algorithms for Query Processing and Optimization. 0. Introduction to Query Processing (1)

Algorithms for Query Processing and Optimization. 0. Introduction to Query Processing (1) Chapter 19 Algorithms for Query Processing and Optimization 0. Introduction to Query Processing (1) Query optimization: The process of choosing a suitable execution strategy for processing a query. Two

More information

CMP-3440 Database Systems

CMP-3440 Database Systems CMP-3440 Database Systems Relational DB Languages Relational Algebra, Calculus, SQL Lecture 05 zain 1 Introduction Relational algebra & relational calculus are formal languages associated with the relational

More information

Chapter 11: Query Optimization

Chapter 11: Query Optimization Chapter 11: Query Optimization Chapter 11: Query Optimization Introduction Transformation of Relational Expressions Statistical Information for Cost Estimation Cost-based optimization Dynamic Programming

More information

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe CHAPTER 19 Query Optimization Introduction Query optimization Conducted by a query optimizer in a DBMS Goal: select best available strategy for executing query Based on information available Most RDBMSs

More information

Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley. Chapter 6 Outline. Unary Relational Operations: SELECT and

Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley. Chapter 6 Outline. Unary Relational Operations: SELECT and Chapter 6 The Relational Algebra and Relational Calculus Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 6 Outline Unary Relational Operations: SELECT and PROJECT Relational

More information

Track Join. Distributed Joins with Minimal Network Traffic. Orestis Polychroniou! Rajkumar Sen! Kenneth A. Ross

Track Join. Distributed Joins with Minimal Network Traffic. Orestis Polychroniou! Rajkumar Sen! Kenneth A. Ross Track Join Distributed Joins with Minimal Network Traffic Orestis Polychroniou Rajkumar Sen Kenneth A. Ross Local Joins Algorithms Hash Join Sort Merge Join Index Join Nested Loop Join Spilling to disk

More information

Database System Concepts

Database System Concepts Chapter 13: Query Processing s Departamento de Engenharia Informática Instituto Superior Técnico 1 st Semester 2008/2009 Slides (fortemente) baseados nos slides oficiais do livro c Silberschatz, Korth

More information

Advanced Database Systems

Advanced Database Systems Lecture IV Query Processing Kyumars Sheykh Esmaili Basic Steps in Query Processing 2 Query Optimization Many equivalent execution plans Choosing the best one Based on Heuristics, Cost Will be discussed

More information

Operator Implementation Wrap-Up Query Optimization

Operator Implementation Wrap-Up Query Optimization Operator Implementation Wrap-Up Query Optimization 1 Last time: Nested loop join algorithms: TNLJ PNLJ BNLJ INLJ Sort Merge Join Hash Join 2 General Join Conditions Equalities over several attributes (e.g.,

More information

Chapter 13: Query Processing

Chapter 13: Query Processing Chapter 13: Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 13.1 Basic Steps in Query Processing 1. Parsing

More information

Securing Distributed Computation via Trusted Quorums. Yan Michalevsky, Valeria Nikolaenko, Dan Boneh

Securing Distributed Computation via Trusted Quorums. Yan Michalevsky, Valeria Nikolaenko, Dan Boneh Securing Distributed Computation via Trusted Quorums Yan Michalevsky, Valeria Nikolaenko, Dan Boneh Setting Distributed computation over data contributed by users Communication through a central party

More information

Chapter 18: Parallel Databases

Chapter 18: Parallel Databases Chapter 18: Parallel Databases Introduction Parallel machines are becoming quite common and affordable Prices of microprocessors, memory and disks have dropped sharply Recent desktop computers feature

More information

Query Processing & Optimization

Query Processing & Optimization Query Processing & Optimization 1 Roadmap of This Lecture Overview of query processing Measures of Query Cost Selection Operation Sorting Join Operation Other Operations Evaluation of Expressions Introduction

More information

PARALLEL & DISTRIBUTED DATABASES CS561-SPRING 2012 WPI, MOHAMED ELTABAKH

PARALLEL & DISTRIBUTED DATABASES CS561-SPRING 2012 WPI, MOHAMED ELTABAKH PARALLEL & DISTRIBUTED DATABASES CS561-SPRING 2012 WPI, MOHAMED ELTABAKH 1 INTRODUCTION In centralized database: Data is located in one place (one server) All DBMS functionalities are done by that server

More information

Outline. Query Processing Overview Algorithms for basic operations. Query optimization. Sorting Selection Join Projection

Outline. Query Processing Overview Algorithms for basic operations. Query optimization. Sorting Selection Join Projection Outline Query Processing Overview Algorithms for basic operations Sorting Selection Join Projection Query optimization Heuristics Cost-based optimization 19 Estimate I/O Cost for Implementations Count

More information

! A relational algebra expression may have many equivalent. ! Cost is generally measured as total elapsed time for

! A relational algebra expression may have many equivalent. ! Cost is generally measured as total elapsed time for Chapter 13: Query Processing Basic Steps in Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 1. Parsing and

More information

Chapter 13: Query Processing Basic Steps in Query Processing

Chapter 13: Query Processing Basic Steps in Query Processing Chapter 13: Query Processing Basic Steps in Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 1. Parsing and

More information

Parallel Databases C H A P T E R18. Practice Exercises

Parallel Databases C H A P T E R18. Practice Exercises C H A P T E R18 Parallel Databases Practice Exercises 181 In a range selection on a range-partitioned attribute, it is possible that only one disk may need to be accessed Describe the benefits and drawbacks

More information

Database System Concepts

Database System Concepts Chapter 14: Optimization Departamento de Engenharia Informática Instituto Superior Técnico 1 st Semester 2007/2008 Slides (fortemente) baseados nos slides oficiais do livro c Silberschatz, Korth and Sudarshan.

More information

Chapter 18 Strategies for Query Processing. We focus this discussion w.r.t RDBMS, however, they are applicable to OODBS.

Chapter 18 Strategies for Query Processing. We focus this discussion w.r.t RDBMS, however, they are applicable to OODBS. Chapter 18 Strategies for Query Processing We focus this discussion w.r.t RDBMS, however, they are applicable to OODBS. 1 1. Translating SQL Queries into Relational Algebra and Other Operators - SQL is

More information

Advanced Databases. Lecture 4 - Query Optimization. Masood Niazi Torshiz Islamic Azad university- Mashhad Branch

Advanced Databases. Lecture 4 - Query Optimization. Masood Niazi Torshiz Islamic Azad university- Mashhad Branch Advanced Databases Lecture 4 - Query Optimization Masood Niazi Torshiz Islamic Azad university- Mashhad Branch www.mniazi.ir Query Optimization Introduction Transformation of Relational Expressions Catalog

More information

data parallelism Chris Olston Yahoo! Research

data parallelism Chris Olston Yahoo! Research data parallelism Chris Olston Yahoo! Research set-oriented computation data management operations tend to be set-oriented, e.g.: apply f() to each member of a set compute intersection of two sets easy

More information

Chapter 3. Algorithms for Query Processing and Optimization

Chapter 3. Algorithms for Query Processing and Optimization Chapter 3 Algorithms for Query Processing and Optimization Chapter Outline 1. Introduction to Query Processing 2. Translating SQL Queries into Relational Algebra 3. Algorithms for External Sorting 4. Algorithms

More information

Chapter 6 Part I The Relational Algebra and Calculus

Chapter 6 Part I The Relational Algebra and Calculus Chapter 6 Part I The Relational Algebra and Calculus Copyright 2004 Ramez Elmasri and Shamkant Navathe Database State for COMPANY All examples discussed below refer to the COMPANY database shown here.

More information

DBMS Y3/S5. 1. OVERVIEW The steps involved in processing a query are: 1. Parsing and translation. 2. Optimization. 3. Evaluation.

DBMS Y3/S5. 1. OVERVIEW The steps involved in processing a query are: 1. Parsing and translation. 2. Optimization. 3. Evaluation. Query Processing QUERY PROCESSING refers to the range of activities involved in extracting data from a database. The activities include translation of queries in high-level database languages into expressions

More information

Incremental Evaluation of Sliding-Window Queries over Data Streams

Incremental Evaluation of Sliding-Window Queries over Data Streams Incremental Evaluation of Sliding-Window Queries over Data Streams Thanaa M. Ghanem 1 Moustafa A. Hammad 2 Mohamed F. Mokbel 3 Walid G. Aref 1 Ahmed K. Elmagarmid 1 1 Department of Computer Science, Purdue

More information

Query Processing with Indexes. Announcements (February 24) Review. CPS 216 Advanced Database Systems

Query Processing with Indexes. Announcements (February 24) Review. CPS 216 Advanced Database Systems Query Processing with Indexes CPS 216 Advanced Database Systems Announcements (February 24) 2 More reading assignment for next week Buffer management (due next Wednesday) Homework #2 due next Thursday

More information

Physical Database Design and Tuning

Physical Database Design and Tuning Physical Database Design and Tuning CS 186, Fall 2001, Lecture 22 R&G - Chapter 16 Although the whole of this life were said to be nothing but a dream and the physical world nothing but a phantasm, I should

More information

QUERY OPTIMIZATION [CH 15]

QUERY OPTIMIZATION [CH 15] Spring 2017 QUERY OPTIMIZATION [CH 15] 4/12/17 CS 564: Database Management Systems; (c) Jignesh M. Patel, 2013 1 Example SELECT distinct ename FROM Emp E, Dept D WHERE E.did = D.did and D.dname = Toy EMP

More information

Contents Contents Introduction Basic Steps in Query Processing Introduction Transformation of Relational Expressions...

Contents Contents Introduction Basic Steps in Query Processing Introduction Transformation of Relational Expressions... Contents Contents...283 Introduction...283 Basic Steps in Query Processing...284 Introduction...285 Transformation of Relational Expressions...287 Equivalence Rules...289 Transformation Example: Pushing

More information

The Relational Algebra

The Relational Algebra The Relational Algebra Relational Algebra Relational algebra is the basic set of operations for the relational model These operations enable a user to specify basic retrieval requests (or queries) 27-Jan-14

More information

Faloutsos 1. Carnegie Mellon Univ. Dept. of Computer Science Database Applications. Outline

Faloutsos 1. Carnegie Mellon Univ. Dept. of Computer Science Database Applications. Outline Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications Lecture #14: Implementation of Relational Operations (R&G ch. 12 and 14) 15-415 Faloutsos 1 introduction selection projection

More information

Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications. Administrivia. Administrivia. Faloutsos/Pavlo CMU /615

Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications. Administrivia. Administrivia. Faloutsos/Pavlo CMU /615 Carnegie Mellon Univ. Dept. of Computer Science 15-415/615 - DB Applications C. Faloutsos A. Pavlo Lecture#14(b): Implementation of Relational Operations Administrivia HW4 is due today. HW5 is out. Faloutsos/Pavlo

More information

Chapter 14: Query Optimization

Chapter 14: Query Optimization Chapter 14: Query Optimization Database System Concepts 5 th Ed. See www.db-book.com for conditions on re-use Chapter 14: Query Optimization Introduction Transformation of Relational Expressions Catalog

More information

Query optimization. Elena Baralis, Silvia Chiusano Politecnico di Torino. DBMS Architecture D B M G. Database Management Systems. Pag.

Query optimization. Elena Baralis, Silvia Chiusano Politecnico di Torino. DBMS Architecture D B M G. Database Management Systems. Pag. Database Management Systems DBMS Architecture SQL INSTRUCTION OPTIMIZER MANAGEMENT OF ACCESS METHODS CONCURRENCY CONTROL BUFFER MANAGER RELIABILITY MANAGEMENT Index Files Data Files System Catalog DATABASE

More information

DBMS Query evaluation

DBMS Query evaluation Data Management for Data Science DBMS Maurizio Lenzerini, Riccardo Rosati Corso di laurea magistrale in Data Science Sapienza Università di Roma Academic Year 2016/2017 http://www.dis.uniroma1.it/~rosati/dmds/

More information

Storage hierarchy. Textbook: chapters 11, 12, and 13

Storage hierarchy. Textbook: chapters 11, 12, and 13 Storage hierarchy Cache Main memory Disk Tape Very fast Fast Slower Slow Very small Small Bigger Very big (KB) (MB) (GB) (TB) Built-in Expensive Cheap Dirt cheap Disks: data is stored on concentric circular

More information

Evaluation of Relational Operations

Evaluation of Relational Operations Evaluation of Relational Operations Chapter 14 Comp 521 Files and Databases Fall 2010 1 Relational Operations We will consider in more detail how to implement: Selection ( ) Selects a subset of rows from

More information

Query Processing and Query Optimization. Prof Monika Shah

Query Processing and Query Optimization. Prof Monika Shah Query Processing and Query Optimization Query Processing SQL Query Is in Library Cache? System catalog (Dict / Dict cache) Scan and verify relations Parse into parse tree (relational Calculus) View definitions

More information

Chapter 12: Query Processing

Chapter 12: Query Processing Chapter 12: Query Processing Overview Catalog Information for Cost Estimation $ Measures of Query Cost Selection Operation Sorting Join Operation Other Operations Evaluation of Expressions Transformation

More information

Data Storage and Query Answering. Indexing and Hashing (5)

Data Storage and Query Answering. Indexing and Hashing (5) Data Storage and Query Answering Indexing and Hashing (5) Linear Hash Tables No directory. Introduction Hash function computes sequences of k bits. Take only the i last of these bits and interpret them

More information

Physical Database Design and Tuning. Chapter 20

Physical Database Design and Tuning. Chapter 20 Physical Database Design and Tuning Chapter 20 Introduction We will be talking at length about database design Conceptual Schema: info to capture, tables, columns, views, etc. Physical Schema: indexes,

More information

Parallel DBMS. Chapter 22, Part A

Parallel DBMS. Chapter 22, Part A Parallel DBMS Chapter 22, Part A Slides by Joe Hellerstein, UCB, with some material from Jim Gray, Microsoft Research. See also: http://www.research.microsoft.com/research/barc/gray/pdb95.ppt Database

More information

CSE 190D Spring 2017 Final Exam Answers

CSE 190D Spring 2017 Final Exam Answers CSE 190D Spring 2017 Final Exam Answers Q 1. [20pts] For the following questions, clearly circle True or False. 1. The hash join algorithm always has fewer page I/Os compared to the block nested loop join

More information

Advances in Data Management Query Processing and Query Optimisation A.Poulovassilis

Advances in Data Management Query Processing and Query Optimisation A.Poulovassilis 1 Advances in Data Management Query Processing and Query Optimisation A.Poulovassilis 1 General approach to the implementation of Query Processing and Query Optimisation functionalities in DBMSs 1. Parse

More information

New Requirements. Advanced Query Processing. Top-N/Bottom-N queries Interactive queries. Skyline queries, Fast initial response time!

New Requirements. Advanced Query Processing. Top-N/Bottom-N queries Interactive queries. Skyline queries, Fast initial response time! Lecture 13 Advanced Query Processing CS5208 Advanced QP 1 New Requirements Top-N/Bottom-N queries Interactive queries Decision making queries Tolerant of errors approximate answers acceptable Control over

More information

Parallel DBs. April 23, 2018

Parallel DBs. April 23, 2018 Parallel DBs April 23, 2018 1 Why Scale? Scan of 1 PB at 300MB/s (SATA r2 Limit) Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) ~1 Hour Why Scale Up? Scan of 1 PB at 300MB/s (SATA r2 Limit) (x1000)

More information

Relational Databases

Relational Databases Relational Databases Jan Chomicki University at Buffalo Jan Chomicki () Relational databases 1 / 49 Plan of the course 1 Relational databases 2 Relational database design 3 Conceptual database design 4

More information

Physical Database Design and Tuning. Review - Normal Forms. Review: Normal Forms. Introduction. Understanding the Workload. Creating an ISUD Chart

Physical Database Design and Tuning. Review - Normal Forms. Review: Normal Forms. Introduction. Understanding the Workload. Creating an ISUD Chart Physical Database Design and Tuning R&G - Chapter 20 Although the whole of this life were said to be nothing but a dream and the physical world nothing but a phantasm, I should call this dream or phantasm

More information

Query Decomposition and Data Localization

Query Decomposition and Data Localization Query Decomposition and Data Localization Query Decomposition and Data Localization Query decomposition and data localization consists of two steps: Mapping of calculus query (SQL) to algebra operations

More information

Name: Problem 1 Consider the following two transactions: T0: read(a); read(b); if (A = 0) then B = B + 1; write(b);

Name: Problem 1 Consider the following two transactions: T0: read(a); read(b); if (A = 0) then B = B + 1; write(b); Name: Problem 1 Consider the following two transactions: T0: read(a); read(b); if (A = 0) then B = B + 1; write(b); T1: read(b); read(a); if (B = 0) then A = A + 1; write(a); Let the consistency requirement

More information

Chapter 19 Query Optimization

Chapter 19 Query Optimization Chapter 19 Query Optimization It is an activity conducted by the query optimizer to select the best available strategy for executing the query. 1. Query Trees and Heuristics for Query Optimization - Apply

More information

Implementation of Relational Operations. Introduction. CS 186, Fall 2002, Lecture 19 R&G - Chapter 12

Implementation of Relational Operations. Introduction. CS 186, Fall 2002, Lecture 19 R&G - Chapter 12 Implementation of Relational Operations CS 186, Fall 2002, Lecture 19 R&G - Chapter 12 First comes thought; then organization of that thought, into ideas and plans; then transformation of those plans into

More information

Chapter 6 The Relational Algebra and Calculus

Chapter 6 The Relational Algebra and Calculus Chapter 6 The Relational Algebra and Calculus 1 Chapter Outline Example Database Application (COMPANY) Relational Algebra Unary Relational Operations Relational Algebra Operations From Set Theory Binary

More information

Announcement. Reading Material. Overview of Query Evaluation. Overview of Query Evaluation. Overview of Query Evaluation 9/26/17

Announcement. Reading Material. Overview of Query Evaluation. Overview of Query Evaluation. Overview of Query Evaluation 9/26/17 Announcement CompSci 516 Database Systems Lecture 10 Query Evaluation and Join Algorithms Project proposal pdf due on sakai by 5 pm, tomorrow, Thursday 09/27 One per group by any member Instructor: Sudeepa

More information

Lassonde School of Engineering Winter 2016 Term Course No: 4411 Database Management Systems

Lassonde School of Engineering Winter 2016 Term Course No: 4411 Database Management Systems Lassonde School of Engineering Winter 2016 Term Course No: 4411 Database Management Systems Last Name: First Name: Student ID: 1. Exam is 2 hours long 2. Closed books/notes Problem 1 (6 points) Consider

More information

Introduction to Data Management. Lecture #17 (Physical DB Design!)

Introduction to Data Management. Lecture #17 (Physical DB Design!) Introduction to Data Management Lecture #17 (Physical DB Design!) Instructor: Mike Carey mjcarey@ics.uci.edu Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Announcements v Homework info:

More information

Chapter 5 Relational Algebra. Nguyen Thi Ai Thao

Chapter 5 Relational Algebra. Nguyen Thi Ai Thao Chapter 5 Nguyen Thi Ai Thao thaonguyen@cse.hcmut.edu.vn Spring- 2016 Contents 1 Unary Relational Operations 2 Operations from Set Theory 3 Binary Relational Operations 4 Additional Relational Operations

More information

From SQL-query to result Have a look under the hood

From SQL-query to result Have a look under the hood From SQL-query to result Have a look under the hood Classical view on RA: sets Theory of relational databases: table is a set Practice (SQL): a relation is a bag of tuples R π B (R) π B (R) A B 1 1 2

More information

Examples of Physical Query Plan Alternatives. Selected Material from Chapters 12, 14 and 15

Examples of Physical Query Plan Alternatives. Selected Material from Chapters 12, 14 and 15 Examples of Physical Query Plan Alternatives Selected Material from Chapters 12, 14 and 15 1 Query Optimization NOTE: SQL provides many ways to express a query. HENCE: System has many options for evaluating

More information

The Extended Algebra. Duplicate Elimination. Sorting. Example: Duplicate Elimination

The Extended Algebra. Duplicate Elimination. Sorting. Example: Duplicate Elimination The Extended Algebra Duplicate Elimination 2 δ = eliminate duplicates from bags. τ = sort tuples. γ = grouping and aggregation. Outerjoin : avoids dangling tuples = tuples that do not join with anything.

More information

CS 582 Database Management Systems II

CS 582 Database Management Systems II Review of SQL Basics SQL overview Several parts Data-definition language (DDL): insert, delete, modify schemas Data-manipulation language (DML): insert, delete, modify tuples Integrity View definition

More information

Rajiv GandhiCollegeof Engineering& Technology, Kirumampakkam.Page 1 of 10

Rajiv GandhiCollegeof Engineering& Technology, Kirumampakkam.Page 1 of 10 Rajiv GandhiCollegeof Engineering& Technology, Kirumampakkam.Page 1 of 10 RAJIV GANDHI COLLEGE OF ENGINEERING & TECHNOLOGY, KIRUMAMPAKKAM-607 402 DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING QUESTION BANK

More information

Raunak Rathi 1, Prof. A.V.Deorankar 2 1,2 Department of Computer Science and Engineering, Government College of Engineering Amravati

Raunak Rathi 1, Prof. A.V.Deorankar 2 1,2 Department of Computer Science and Engineering, Government College of Engineering Amravati Analytical Representation on Secure Mining in Horizontally Distributed Database Raunak Rathi 1, Prof. A.V.Deorankar 2 1,2 Department of Computer Science and Engineering, Government College of Engineering

More information

Administrivia. Physical Database Design. Review: Optimization Strategies. Review: Query Optimization. Review: Database Design

Administrivia. Physical Database Design. Review: Optimization Strategies. Review: Query Optimization. Review: Database Design Administrivia Physical Database Design R&G Chapter 16 Lecture 26 Homework 5 available Due Monday, December 8 Assignment has more details since first release Large data files now available No class Thursday,

More information

Query Optimization. Introduction to Databases CompSci 316 Fall 2018

Query Optimization. Introduction to Databases CompSci 316 Fall 2018 Query Optimization Introduction to Databases CompSci 316 Fall 2018 2 Announcements (Tue., Nov. 20) Homework #4 due next in 2½ weeks No class this Thu. (Thanksgiving break) No weekly progress update due

More information

RELATIONAL DATA MODEL: Relational Algebra

RELATIONAL DATA MODEL: Relational Algebra RELATIONAL DATA MODEL: Relational Algebra Outline 1. Relational Algebra 2. Relational Algebra Example Queries 1. Relational Algebra A basic set of relational model operations constitute the relational

More information

Database Design and Tuning

Database Design and Tuning Database Design and Tuning Chapter 20 Comp 521 Files and Databases Spring 2010 1 Overview After ER design, schema refinement, and the definition of views, we have the conceptual and external schemas for

More information

What s a database system? Review of Basic Database Concepts. Entity-relationship (E/R) diagram. Two important questions. Physical data independence

What s a database system? Review of Basic Database Concepts. Entity-relationship (E/R) diagram. Two important questions. Physical data independence What s a database system? Review of Basic Database Concepts CPS 296.1 Topics in Database Systems According to Oxford Dictionary Database: an organized body of related information Database system, DataBase

More information

Parallel DBMS. Lecture 20. Reading Material. Instructor: Sudeepa Roy. Reading Material. Parallel vs. Distributed DBMS. Parallel DBMS 11/15/18

Parallel DBMS. Lecture 20. Reading Material. Instructor: Sudeepa Roy. Reading Material. Parallel vs. Distributed DBMS. Parallel DBMS 11/15/18 Reading aterial CompSci 516 atabase Systems Lecture 20 Parallel BS Instructor: Sudeepa Roy [RG] Parallel BS: Chapter 22.1-22.5 [GUW] Parallel BS and map-reduce: Chapter 20.1-20.2 Acknowledgement: The following

More information