Shuigeng Zhou. May 18, 2016 School of Computer Science Fudan University

Size: px
Start display at page:

Download "Shuigeng Zhou. May 18, 2016 School of Computer Science Fudan University"

Transcription

1 Query Processing Shuigeng Zhou May 18, 2016 School of Comuter Science Fudan University

2 Overview Outline Measures of Query Cost Selection Oeration Sorting Join Oeration Other Oerations Evaluation of Exressions 16/5/25 School of C.S., Fudan Univ. 2

3 Basic Stes in Query Processing 1. Parsing and translation 2. Otimization 3. Evaluation 16/5/25 School of C.S., Fudan Univ. 3

4 Basic Stes in Query Processing (Cont.) Parsing and translation n translate the query into its internal form, which is then translated into relational algebra n Parser checks syntax, verifies relations Evaluation n The query-execution engine takes a queryevaluation lan, executes that lan, and returns the answers to the query 16/5/25 School of C.S., Fudan Univ. 4

5 Basic Stes in Query Processing: Otimization A relational algebra exression may have many equivalent exressions Each relational algebra oeration can be evaluated using one of several different algorithms Annotated exression secifying detailed evaluation strategy is called an evaluation-lan n can use an index on balance to find accounts with balance < 2500, n or can erform comlete relation scan and discard accounts with balance /5/25 School of C.S., Fudan Univ. 5

6 Basic Stes in Query Processing: Otimization Examle: select * from account where balance < 2500 can use an index on balance to find accounts with balance < 2500, or can erform comlete relation scan and discard accounts with balance /5/25 School of C.S., Fudan Univ. 6

7 Basic Stes: Otimization (Cont.) Query otimization: amongst all equivalent evaluation lans choose the one with lowest cost. n Cost is estimated using statistical information from the database catalog In this lecture we study n How to measure query costs n Algorithms for evaluating relational algebra oerations n How to combine algorithms for individual oerations in order to evaluate a comlete exression In next lecture n We study how to otimize queries, that is, how to find an evaluation lan with lowest estimated cost 16/5/25 School of C.S., Fudan Univ. 7

8 Measures of Query Cost

9 Measures of Query Cost Cost is generally measured as total elased time for answering query n disk accesses, CPU, or even network communication Tyically disk access is the redominant cost, and is also relatively easy to estimate. Measured by taking into account n Number of seeks --- average-seek-cost n Number of blocks read --- average-block-read-cost n Number of blocks written --- average-block-write-cost l Cost to write a block is greater than cost to read a block data is read back after being written to ensure that the write was successful 16/5/25 School of C.S., Fudan Univ. 9

10 Measures of Query Cost (Cont.) For simlicity we just use number of block transfers from disk as the cost measure n n We ignore the difference in cost between sequential and random I/O for simlicity We also ignore CPU costs for simlicity Costs deends on the size of the buffer in main memory n n Having more memory reduces need for disk access We often use worst case estimates, assuming only the minimum amount of memory needed for the oeration is available Real systems take CPU cost into account, differentiate between sequential and random I/O, and take buffer size into account We do not include cost to writing outut to disk in our cost formulae 16/5/25 School of C.S., Fudan Univ. 10

11 Selection σ (r) or select * from r where

12 Selection Oeration File scan search algorithms that locate and retrieve records that fulfill a selection condition. Algorithm A1 (linear search) n Cost estimate (number of disk blocks scanned) = b r n If selection is on a key attribute, average cost = (b r /2) l sto on finding record n Linear search can be alied regardless of l selection condition or l ordering of records in the file, or l availability of indices 16/5/25 School of C.S., Fudan Univ. 12

13 Selection Oeration (Cont.) A2 (binary search) n Alicable if selection is an equality comarison on the attribute on which file is ordered. n Assume that the blocks of a relation are stored contiguously n Cost estimate (number of disk blocks to be scanned): l log 2 (b r ) cost of locating the first tule by a binary search on the blocks l Plus number of blocks containing records that satisfy selection condition Will see how to estimate this cost in next lecture 16/5/25 School of C.S., Fudan Univ. 13

14 Selections Using Indices Index scan search algorithms that use an index n selection condition must be on search-key of index. A3 (rimary index on candidate key, equality) n Retrieve a single record that satisfies the corresonding equality condition n Cost = HT i + 1 (B + -tree) A4 (rimary index on nonkey, equality) : Retrieve multile records n Records will be on consecutive blocks n Cost = HT i + number of blocks containing retrieved records 16/5/25 School of C.S., Fudan Univ. 14

15 Selections Using Indices (Cont.) A5 (equality on search-key of secondary index) n Retrieve a single record if the search-key is a candidate key l Cost = HT i + 1 n Retrieve multile records if search-key is not a candidate key l Cost = HT i + number of records retrieved Can be very exensive! l Each record may be on a different block One block access for each retrieved record 16/5/25 School of C.S., Fudan Univ. 15

16 Selections Involving Comarisons Can imlement selections of the form σ A V (r) or σ A V (r) by using n a linear file scan or binary search, n or by using indices in the following ways: A6 (rimary index, comarison) n Relation is sorted on A n For σ A V (r) use index to find first tule v and scan relation sequentially from there n For σ A V (r) just scan relation sequentially till first tule > v; do not use index 16/5/25 School of C.S., Fudan Univ. 16

17 Selections Involving Comarisons (Cont.) A7 (secondary index, comarison) n For σ A V (r) use index to find first index entry v and scan index sequentially from there, to find ointers to records. n For σ A V (r) just scan leaf ages of index finding ointers to records, till first entry > v n In either case, retrieve records that are ointed to l requires an I/O for each record l Linear file scan may be cheaer if many records are to be fetched! 16/5/25 School of C.S., Fudan Univ. 17

18 Imlementation of Comlex Selections Conjunction ( 合取 ): σ θ 1 θ 2... θ n(r) A8 (conjunctive selection using one index) n Select a condition of θ i and algorithms A1 through A7 that results in the least cost forσ θi (r). n Test other conditions on tule after fetching it into memory buffer. A9 (conjunctive selection using multile-key index) n Use aroriate comosite (multile-key) index if available. 16/5/25 School of C.S., Fudan Univ. 18

19 Imlementation of Comlex Selections (Cont.) A10 (conjunctive selection by intersection of identifiers) n Requires indices with record ointers. n Use corresonding index for each condition, and take intersection of all the obtained sets of record ointers. n Then fetch records from file n If some conditions do not have aroriate indices, aly test in memory. 16/5/25 School of C.S., Fudan Univ. 19

20 Imlementation of Comlex Selections (Cont.) Disjunction ( 析取 ):σ θ 1 θ 2... θ n (r). A11 (disjunctive selection by union of identifiers) n Alicable if all conditions have available indices. l Otherwise use linear scan. n Use corresonding index for each condition, and take union of all the obtained sets of record ointers. n Then fetch records from file 16/5/25 School of C.S., Fudan Univ. 20

21 Imlementation of Comlex Selections (Cont.) Negation: σ θ (r) n Use linear scan on file n If very few records satisfy θ, and an index is alicable to θ l Find satisfying records using index and fetch from file n Examle: σ (A 0) (r)= σ A=0 (r) 16/5/25 School of C.S., Fudan Univ. 21

22 Sorting

23 Why sorting? Sorting n For a SQL query with order by clause, query answers must be returned in order n For join oeration with two relations, if the relations are sorted by the join attributes, the join oeration can be evaluated very efficiently n Sorting for constructing ordered indices 16/5/25 School of C.S., Fudan Univ. 23

24 Sorting How to sort relations? n For relations that fit in memory, techniques like quick-sort can be used n We may build an index on the relation, and then use the index to read the relation in sorted order. May lead to one disk block access for each tule (for non-rimary indices) n For relations that don t fit in memory, external sort-merge is a good choice 16/5/25 School of C.S., Fudan Univ. 24

25 External Sort-Merge Let M denote memory size (in ages) 1. Create sorted runs. Let i be 0 initially. Reeatedly do the following till the end of the relation: (a) Read M blocks of relation into memory (b) Sort the in-memory blocks (c) Write sorted data to run R i ; increment i. Let the final value of i be N 2. Merge the runs (N-way merge). We assume (for now) that N < M. 1. Use N blocks of memory to buffer inut runs, and 1 block to buffer outut. Read the first block of each run into its buffer age 2. reeat 1. Select the first record (in sort order) among all buffer ages 2. Write the record to the outut buffer. If the outut buffer is full write it to disk. 3. Delete the record from its inut buffer age. If the buffer age becomes emty then read the next block of the run into the buffer. 3. until all inut buffer ages are emty: 16/5/25 School of C.S., Fudan Univ. 25

26 External Sort-Merge INPUT 1... INPUT 2... Disk Make runs INPUT M M Main memory buffers INPUT 1 Disk... INPUT 2... OUTPUT INPUT N Merge Disk M Main memory buffers Disk 16/5/25 School of C.S., Fudan Univ. 26

27 External Sort-Merge (Cont.) If N M, several merge asses are required. n In each ass, contiguous grous of M - 1 runs are merged. n A ass reduces the number of runs by a factor of M -1. l E.g. If M=11, and there are 90 runs, one ass reduces the number of runs to 9, each 10 times the size of the initial runs n Reeated asses are erformed till all runs have been merged into one. 16/5/25 School of C.S., Fudan Univ. 27

28 Examle: External Sorting Using Sort-Merge Sort on the first column! M=3 16/5/25 School of C.S., Fudan Univ. 28

29 External Merge Sort (Cont.) Cost analysis: n Total number of merge asses required: log M 1 (b r /M). n Disk accesses for initial run creation as well as in each ass is 2b r l for final ass, we don t count write cost we ignore final write cost for all oerations since the outut of an oeration may be sent to the arent oeration without being written to disk Thus total number of disk accesses for external sorting: b r ( 2 log M 1 (b r / M) + 1)=b r ( 2 log M 1 (b r / M) )-b r + 2b r 16/5/25 School of C.S., Fudan Univ. 29

30 Join n Nested-loo join n Block nested-loo join n Indexed nested-loo join n Merge-join n Hash-join

31 Join Oeration Several different algorithms to imlement joins n Nested-loo join n Block nested-loo join n Indexed nested-loo join n Merge-join n Hash-join Choice based on cost estimate Examles use the following information n Number of records of customer: 10,000 deositor: 5000 n Number of blocks of customer: 400 deositor: /5/25 School of C.S., Fudan Univ. 31

32 Tule Nested-Loo Join To comute the theta join r θ s for each tule t r in r do begin for each tule t s in s do begin test air (t r,t s ) to see if they satisfy the join condition θ if they do, add t r t s to the result. end end r is called the outer relation and s the inner relation Requires no indices and can be used with any kind of join condition. Exensive since it examines every air of tules in the two relations. 16/5/25 School of C.S., Fudan Univ. 32

33 Tule nested-loo Join (Cont.) In the worst case, the estimated cost is n r b s + b r disk accesses. If the best case, reduces cost to b r + b s. Assuming worst case memory availability cost estimate is n = 2,000,100 disk accesses with deositor as outer relation, and n = 1,000,400 disk accesses with customer as the outer relation. If smaller relation (deositor) fits entirely in memory, the cost estimate will be 500 disk accesses. Block nested-loos algorithm (next slide) is referable. 16/5/25 School of C.S., Fudan Univ. 33

34 Block Nested-Loo Join Variant of nested-loo join in which every block of inner relation is aired with every block of outer relation. for each block B r of r do begin for each block B s of s do begin for each tule t r in B r do begin for each tule t s in B s do begin Check if (t r,t s ) satisfy the join condition if they do, add t r t s to the result. end end end end 16/5/25 School of C.S., Fudan Univ. 34

35 Block Nested-Loo Join (Cont.) r join s Hash table for block of r (M-2 blocks)... Join Result Inut buffer for s Outut buffer 16/5/25 School of C.S., Fudan Univ. 35

36 Block Nested-Loo Join (Cont.) Worst case estimate: b r b s + b r block accesses. Best case: b r + b s block accesses. Imrovements : n In block nested-loo, use (M 2) disk blocks as blocking unit for outer relations, where M = memory size in blocks; use remaining two blocks to buffer inner relation and outut l Cost = b r / (M-2) b s + b r n If equi-join attribute forms a key of inner relation, sto inner loo on first match n Scan inner relation forward and backward alternately, to make use of the blocks remaining in buffer. 16/5/25 School of C.S., Fudan Univ. 36

37 Indexed Nested-Loo Join Index lookus can relace file scans if n join is an equi-join or natural join and n an index is available on the inner relation s join attribute For each tule t r in the outer relation r, use the index to look u tules in s Cost of the join: b r + n r c n Where c is the cost of traversing index and fetching all matching s tules for one tule of r If indices are available on join attributes of both r and s, use the relation with fewer tules as the outer relation. 16/5/25 School of C.S., Fudan Univ. 37

38 Examle of Nested-Loo Join Costs Comute deositor customer. Let customer have a rimary B + -tree index on the join attribute customer-name, which contains 20 entries in each index node. Since customer has 10,000 tules (400 blocks), the height of the tree is 4 = [log_20 ^10,000]+1, and one more access to find the actual data deositor has 5000 tules -> 100 blocks Cost of block nested loos join n 400* = 40,100 disk accesses Cost of indexed nested loos join n * 5 = 25,100 disk accesses. 16/5/25 School of C.S., Fudan Univ. 38

39 Merge-Join 1. Sort both relations on their join attribute (if not already sorted on the join attributes). 2. Merge the sorted relations to join them 1. Join ste is similar to the merge stage of the sort-merge algorithm. 2. Main difference is handling of dulicate values in join attribute every air with similar value on join attribute must be matched 3. Detailed algorithm in book 16/5/25 School of C.S., Fudan Univ. 39

40 Merge-Join (Cont.) Can be used only for equi-joins and natural joins Each block needs to be read only once (assuming all tules for any given value of the join attributes fit in memory) Thus number of block accesses for merge-join is b r + b s + the cost of sorting if relations are unsorted. hybrid merge-join (combining indices with merge-join): If one relation is sorted, and the other has a secondary B + -tree index on the join attribute n n n Merge the sorted relation with the leaf entries of the B + -tree, the result file contains tules from the sorted file and the addresses from the unsorted file Sort the result file on the addresses of the unsorted relation s tules Scan the unsorted relation in hysical address order and merge with revious result, to relace addresses by the actual tules 16/5/25 School of C.S., Fudan Univ. 40

41 Hash-Join Alicable for equi-joins and natural joins. A hash function h is used to artition tules of both relations h mas JoinAttrs values to {0, 1,..., n}. n r 0, r 1,..., r n denote artitions of r tules n s 0, s 1..., s n denotes artitions of s tules Note: in the book, r i is denoted as H ri,, s i is denoted as H si and n is denoted as n h 16/5/25 School of C.S., Fudan Univ. 41

42 Hash-Join (Cont.) 16/5/25 School of C.S., Fudan Univ. 42

43 Hash-Join (Cont.) r tules in r i need only to be comared with s tules in s i. n an r tule and an s tule that satisfy the join condition will have the same value for the join attributes. n If that value is hashed to some value i, the r tule has to be in r i and the s tule in s i. 16/5/25 School of C.S., Fudan Univ. 43

44 Hash-Join Algorithm The hash-join of r and s is comuted as follows: 1. Partition the relation s using hashing function h. When artitioning a relation, one block of memory is reserved as the outut buffer for each artition 2. Partition r similarly 3. For each i: (a) Load s i into memory and build an in-memory hash index on it using the join attribute. This hash index uses a different hash function than the earlier one h. (b) Read the tules in r i from the disk one by one. For each tule t r locate each matching tule t s in s i using the in-memory hash index. Outut the concatenation of their attributes Relation s is called the build inut and r is called the robe inut 16/5/25 School of C.S., Fudan Univ. 44

45 Hash-Join Algorithm Partition both relations using hash fn h: r tules in artition i will only match s tules in artition i Original Relation... INPUT hash function h OUTPUT 1 n 2 Partitions 1 2 n Disk M main memory buffers Disk Read in a artition of r, hash it using h2 (<> h!). Scan matching artition of s, search for matches Partitions of r & s hash fn h2 Disk Hash table for artition si (k < M-1 ages) h2 Inut buffer For ri Outut buffer M main memory buffers Join Result Disk 16/5/25 School of C.S., Fudan Univ. 45

46 Hash-Join algorithm (Cont.) The value n (# artitions) and the hash function h is chosen such that each s i should fit in memory (say M blocks) n Tyically n is chosen as b s /M * f where f is a fudge factor, tyically around 1.2 n The robe relation artitions r i need not fit in memory Recursive artitioning required if number of artitions n is greater than number of ages M of memory. n instead of artitioning n ways, use M 1 artitions for s n Further artition the M 1 artitions using a different hash function n Use same artitioning method on r 16/5/25 School of C.S., Fudan Univ. 46

47 Handling of Overflows Hash-table overflow occurs in artition s i if s i does not fit in memory. Reasons could be n n Many tules in s with same value for join attributes Bad hash function Overflow resolution can be done in build hase n n Partition s i is further artitioned using different hash function. Partition r i must be similarly artitioned. Overflow avoidance erforms artitioning carefully to avoid overflows during build hase n E.g. artition build relation into many artitions, then combine them Both aroaches fail with large numbers of dulicates n Fallback otion: use block nested loos join on overflowed artitions 16/5/25 School of C.S., Fudan Univ. 47

48 Cost of Hash-Join If recursive artitioning is not required: cost of hash join is 2(b r + b s ) +(b r + b s ) + 4n If recursive artitioning required, number of asses required for artitioning s is log M 1 (b s ) 1. the number of asses for artitioning of r is also the same as for s. Therefore it is best to choose the smaller relation as the build relation. Total cost estimate is: 2(b r + b s ) log M 1 (b s ) 1 + b r + b s If the entire build inut can be ket in main memory, n can be set to 0 and the algorithm does not artition the relations into temorary files. Cost estimate goes down to b r + b s. 16/5/25 School of C.S., Fudan Univ. 48

49 Examle of Cost of Hash-Join customer deositor Assume that memory size is 22 blocks b deositor = 100 and b customer = 400. deositor is to be used as build inut. Partition it into five artitions, each of size 20 blocks. This artitioning can be done in one ass. Similarly, artition customer into five artitions,each of size 80. This is also done in one ass. Therefore total cost: 3( ) = 1500 block transfers n ignores cost of writing artially filled blocks 16/5/25 School of C.S., Fudan Univ. 49

50 Hybrid Hash Join Useful when memory size are relatively large, and the build inut is bigger than memory. Main feature of hybrid hash join: Kee the first artition of the build relation in memory. E.g. With memory size of 25 blocks, deositor can be artitioned into five artitions, each of size 20 blocks. Division of memory: n n The first artition occuies 20 blocks of memory 1 block is used for inut, and 1 block each for buffering the other 4 artitions. customer is similarly artitioned into five artitions each of size 80; the first is used right away for robing, instead of being written out and read back. Cost of 3( ) = 1300 block transfers for hybrid hash join, instead of 1500 with lain hash-join. Hybrid hash-join most useful if M >> bs 16/5/25 School of C.S., Fudan Univ. 50

51 Comlex Joins Join with a conjunctive condition: n r θ1 θ 2... θ n s Either use nested loos/block nested loos, or n Comute the result of one of the simler joins r θi s l final result comrises those tules in the intermediate result that satisfy the remaining conditions θ 1... θ i 1 θ i θ n Join with a disjunctive condition n r θ1 θ2... θns Either use nested loos/block nested loos, or n Comute as the union of the records in individual joins r θ i s: (r θ1s) (r θ2 s)... (r θn s) 16/5/25 School of C.S., Fudan Univ. 51

52 Comlex Joins Join involving three relations: loan deositor customer Strategy 1. Comute deositor customer; use result to comute loan (deositor customer) Strategy 2. Comuter loan deositor first, and then join the result with customer. Strategy 3. Perform the air of joins at once. Build and index on loan for loan-number, and on customer for customer-name. n n For each tule t in deositor, look u the corresonding tules in customer and the corresonding tules in loan. Each tule of deositor is examined exactly once. Strategy 3 combines two oerations into one secial-urose oeration that is more efficient than imlementing two joins of two relations. 16/5/25 School of C.S., Fudan Univ. 52

53 Other Oerations

54 Other Oerations Dulicate elimination can be imlemented via hashing or sorting. n On sorting dulicates will come adjacent to each other, and all but one coy of dulicates can be deleted. Otimization: dulicates can be deleted during run generation as well as at intermediate merge stes in external sort-merge. n Hashing is similar dulicates will come into the same bucket. Projection is imlemented by erforming rojection on each tule followed by dulicate elimination. 16/5/25 School of C.S., Fudan Univ. 54

55 Other Oerations: Aggregation Aggregation can be imlemented in a manner similar to dulicate elimination. n Sorting or hashing can be used to bring tules in the same grou together, and then the aggregate functions can be alied on each grou. n Otimization: combine tules in the same grou during run generation and intermediate merges, by comuting artial aggregate values l For count, min, max, sum: kee aggregate values on tules found so far in the grou. l For avg, kee sum and count, and divide sum by count at the end 16/5/25 School of C.S., Fudan Univ. 55

56 Other Oerations: Set Oerations Set oerations (, and ): can either use variant of mergejoin after sorting, or variant of hash-join. E.g., Set oerations using hashing: 1. Partition both relations using the same hash function, thereby creating, r 1,.., r n and s 1, s 2.., s n 2. Process each artition i as follows. Using a different hashing function, build an in-memory hash index on r i after it is brought into memory. 3. r s: Add tules in s i to the hash index if they are not already in it. At end of s i, add the tules in the hash index to the result. r s: outut tules in s i to the result if they are already there in the hash index. r s: for each tule in s i, if it is there in the hash index, delete it from the index. At end of s i, add remaining tules in the hash index to the result. 16/5/25 School of C.S., Fudan Univ. 56

57 Other Oerations: Outer Join Outer join can be comuted either as n n A join followed by addition of null-added non-articiating tules. by modifying the join algorithms. Modifying merge join to comute r s n In r s, non articiating tules are those in r Π R (r s) n Modify merge-join to comute r s: During merging, for every tule t r from r that do not match any tule in s, outut t r added with nulls. n Right outer-join and full outer-join can be comuted similarly. Modifying hash join to comute r s n n If r is robe relation, outut non-matching r tules added with nulls If r is build relation, when robing kee track of which r tules matched s tules. At end of s i outut non-matched r tules added with nulls 16/5/25 School of C.S., Fudan Univ. 57

58 Evaluation of Exressions

59 Evaluation of Exressions So far: we have seen algorithms for individual oerations Alternatives for evaluating an entire exression tree n Materialization: generate results of an exression whose inuts are relations or are already comuted, materialize (store) it on disk. n Pielining: ass on tules to arent oerations even as an oeration is being executed We study above alternatives in more detail 16/5/25 School of C.S., Fudan Univ. 59

60 Materialization Materialized evaluation: evaluate one oeration at a time, starting at the lowest-level. E.g., in figure below, comute and store σ ( account balance<2500 ) then comute and store the revious result join with customer, and finally comute the rojections on customer-name. 16/5/25 School of C.S., Fudan Univ. 60

61 Materialization (Cont.) Materialized evaluation is always alicable Cost of writing results to disk and reading them back can be quite high n Our cost formulas for oerations ignore cost of writing results to disk, so l Overall cost = Sum of costs of individual oerations + cost of writing intermediate results to disk Double buffering: use two outut buffers for each oeration, when one is full write it to disk while the other is getting filled n Allows overla of disk writes with comutation and reduces execution time 16/5/25 School of C.S., Fudan Univ. 61

62 Pielining Pielined evaluation : evaluate several oerations simultaneously, assing the results of one oeration on to the next. E.g., in revious exression tree, don t store result of σ balance<2500 ( account ) n instead, ass tules directly to the join.. Similarly, don t store result of join, ass tules directly to rojection. Much cheaer than materialization. Pielining may not always be ossible e.g., sort, hash-join. Pielines can be executed in two ways: demand driven and roducer driven 16/5/25 School of C.S., Fudan Univ. 62

63 Pielining (Cont.) In demand driven or lazy evaluation n n n n system reeatedly requests next tule from to level oeration Each oeration requests next tule from children oerations as required, in order to outut its next tule Between calls, oeration has to maintain state so it knows what to return next Each oeration is imlemented as an iterator imlementing the following oerations l oen() l E.g. file scan: initialize file scan, store ointer to beginning of file as state E.g.merge join: sort relations and store ointers to beginning of sorted relations as state next() E.g. for file scan: Outut next tule, and advance and store file ointer E.g. for merge join: continue with merge from earlier state till next outut tule is found. Save ointers as iterator state. l close() 16/5/25 School of C.S., Fudan Univ. 63

64 Pielining (Cont.) In roducer-driven or eager ielining n Oerators roduce tules eagerly and ass them u to their arents l Buffer maintained between oerators, child uts tules in buffer, arent removes tules from buffer l if buffer is full, child waits till there is sace in the buffer, and then generates more tules n System schedules oerations that have sace in outut buffer and can rocess more inut tules 16/5/25 School of C.S., Fudan Univ. 64

65 Assignments Practice exercises: 13.3, 13.6 Exercises: 13.11, /5/25 School of C.S., Fudan Univ. 65

Chapter 13: Query Processing

Chapter 13: Query Processing Chapter 13: Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 13.1 Basic Steps in Query Processing 1. Parsing

More information

! A relational algebra expression may have many equivalent. ! Cost is generally measured as total elapsed time for

! A relational algebra expression may have many equivalent. ! Cost is generally measured as total elapsed time for Chapter 13: Query Processing Basic Steps in Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 1. Parsing and

More information

Chapter 13: Query Processing Basic Steps in Query Processing

Chapter 13: Query Processing Basic Steps in Query Processing Chapter 13: Query Processing Basic Steps in Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 1. Parsing and

More information

Chapter 12: Query Processing

Chapter 12: Query Processing Chapter 12: Query Processing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Overview Chapter 12: Query Processing Measures of Query Cost Selection Operation Sorting Join

More information

Query Processing. Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016

Query Processing. Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016 Query Processing Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016 Slides re-used with some modification from www.db-book.com Reference: Database System Concepts, 6 th Ed. By Silberschatz,

More information

Chapter 12: Query Processing. Chapter 12: Query Processing

Chapter 12: Query Processing. Chapter 12: Query Processing Chapter 12: Query Processing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 12: Query Processing Overview Measures of Query Cost Selection Operation Sorting Join

More information

Database System Concepts

Database System Concepts Chapter 13: Query Processing s Departamento de Engenharia Informática Instituto Superior Técnico 1 st Semester 2008/2009 Slides (fortemente) baseados nos slides oficiais do livro c Silberschatz, Korth

More information

Advanced Database Systems

Advanced Database Systems Lecture IV Query Processing Kyumars Sheykh Esmaili Basic Steps in Query Processing 2 Query Optimization Many equivalent execution plans Choosing the best one Based on Heuristics, Cost Will be discussed

More information

Chapter 12: Query Processing

Chapter 12: Query Processing Chapter 12: Query Processing Overview Catalog Information for Cost Estimation $ Measures of Query Cost Selection Operation Sorting Join Operation Other Operations Evaluation of Expressions Transformation

More information

Query Processing & Optimization

Query Processing & Optimization Query Processing & Optimization 1 Roadmap of This Lecture Overview of query processing Measures of Query Cost Selection Operation Sorting Join Operation Other Operations Evaluation of Expressions Introduction

More information

CSIT5300: Advanced Database Systems

CSIT5300: Advanced Database Systems CSIT5300: Advanced Database Systems L10: Query Processing Other Operations, Pipelining and Materialization Dr. Kenneth LEUNG Department of Computer Science and Engineering The Hong Kong University of Science

More information

CMSC424: Database Design. Instructor: Amol Deshpande

CMSC424: Database Design. Instructor: Amol Deshpande CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu Databases Data Models Conceptual representa1on of the data Data Retrieval How to ask ques1ons of the database How to answer those ques1ons

More information

DBMS Y3/S5. 1. OVERVIEW The steps involved in processing a query are: 1. Parsing and translation. 2. Optimization. 3. Evaluation.

DBMS Y3/S5. 1. OVERVIEW The steps involved in processing a query are: 1. Parsing and translation. 2. Optimization. 3. Evaluation. Query Processing QUERY PROCESSING refers to the range of activities involved in extracting data from a database. The activities include translation of queries in high-level database languages into expressions

More information

Query processing and optimization

Query processing and optimization Query processing and optimization These slides are a modified version of the slides of the book Database System Concepts (Chapter 13 and 14), 5th Ed., McGraw-Hill, by Silberschatz, Korth and Sudarshan.

More information

Advanced Databases. Lecture 1- Query Processing. Masood Niazi Torshiz Islamic Azad university- Mashhad Branch

Advanced Databases. Lecture 1- Query Processing. Masood Niazi Torshiz Islamic Azad university- Mashhad Branch Advanced Databases Lecture 1- Query Processing Masood Niazi Torshiz Islamic Azad university- Mashhad Branch www.mniazi.ir Overview Measures of Query Cost Selection Operation Sorting Join Operation Other

More information

Announcement. Reading Material. Overview of Query Evaluation. Overview of Query Evaluation. Overview of Query Evaluation 9/26/17

Announcement. Reading Material. Overview of Query Evaluation. Overview of Query Evaluation. Overview of Query Evaluation 9/26/17 Announcement CompSci 516 Database Systems Lecture 10 Query Evaluation and Join Algorithms Project proposal pdf due on sakai by 5 pm, tomorrow, Thursday 09/27 One per group by any member Instructor: Sudeepa

More information

Ch 5 : Query Processing & Optimization

Ch 5 : Query Processing & Optimization Ch 5 : Query Processing & Optimization Basic Steps in Query Processing 1. Parsing and translation 2. Optimization 3. Evaluation Basic Steps in Query Processing (Cont.) Parsing and translation translate

More information

Introduction to Query Processing and Query Optimization Techniques. Copyright 2011 Ramez Elmasri and Shamkant Navathe

Introduction to Query Processing and Query Optimization Techniques. Copyright 2011 Ramez Elmasri and Shamkant Navathe Introduction to Query Processing and Query Optimization Techniques Outline Translating SQL Queries into Relational Algebra Algorithms for External Sorting Algorithms for SELECT and JOIN Operations Algorithms

More information

1.5 Case Study. dynamic connectivity quick find quick union improvements applications

1.5 Case Study. dynamic connectivity quick find quick union improvements applications . Case Study dynamic connectivity quick find quick union imrovements alications Subtext of today s lecture (and this course) Stes to develoing a usable algorithm. Model the roblem. Find an algorithm to

More information

Lecture 8: Orthogonal Range Searching

Lecture 8: Orthogonal Range Searching CPS234 Comutational Geometry Setember 22nd, 2005 Lecture 8: Orthogonal Range Searching Lecturer: Pankaj K. Agarwal Scribe: Mason F. Matthews 8.1 Range Searching The general roblem of range searching is

More information

Review. Support for data retrieval at the physical level:

Review. Support for data retrieval at the physical level: Query Processing Review Support for data retrieval at the physical level: Indices: data structures to help with some query evaluation: SELECTION queries (ssn = 123) RANGE queries (100

More information

Query Optimization. Shuigeng Zhou. December 9, 2009 School of Computer Science Fudan University

Query Optimization. Shuigeng Zhou. December 9, 2009 School of Computer Science Fudan University Query Optimization Shuigeng Zhou December 9, 2009 School of Computer Science Fudan University Outline Introduction Catalog Information for Cost Estimation Estimation of Statistics Transformation of Relational

More information

Lecture 18. Today, we will discuss developing algorithms for a basic model for parallel computing the Parallel Random Access Machine (PRAM) model.

Lecture 18. Today, we will discuss developing algorithms for a basic model for parallel computing the Parallel Random Access Machine (PRAM) model. U.C. Berkeley CS273: Parallel and Distributed Theory Lecture 18 Professor Satish Rao Lecturer: Satish Rao Last revised Scribe so far: Satish Rao (following revious lecture notes quite closely. Lecture

More information

Faloutsos 1. Carnegie Mellon Univ. Dept. of Computer Science Database Applications. Outline

Faloutsos 1. Carnegie Mellon Univ. Dept. of Computer Science Database Applications. Outline Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications Lecture #14: Implementation of Relational Operations (R&G ch. 12 and 14) 15-415 Faloutsos 1 introduction selection projection

More information

Database Applications (15-415)

Database Applications (15-415) Database Applications (15-415) DBMS Internals- Part VI Lecture 14, March 12, 2014 Mohammad Hammoud Today Last Session: DBMS Internals- Part V Hash-based indexes (Cont d) and External Sorting Today s Session:

More information

Algorithms for Query Processing and Optimization. 0. Introduction to Query Processing (1)

Algorithms for Query Processing and Optimization. 0. Introduction to Query Processing (1) Chapter 19 Algorithms for Query Processing and Optimization 0. Introduction to Query Processing (1) Query optimization: The process of choosing a suitable execution strategy for processing a query. Two

More information

Query Processing & Optimization. CS 377: Database Systems

Query Processing & Optimization. CS 377: Database Systems Query Processing & Optimization CS 377: Database Systems Recap: File Organization & Indexing Physical level support for data retrieval File organization: ordered or sequential file to find items using

More information

Optimizing Dynamic Memory Management!

Optimizing Dynamic Memory Management! Otimizing Dynamic Memory Management! 1 Goals of this Lecture! Hel you learn about:" Details of K&R hea mgr" Hea mgr otimizations related to Assignment #6" Faster free() via doubly-linked list, redundant

More information

CS 4604: Introduction to Database Management Systems. B. Aditya Prakash Lecture #10: Query Processing

CS 4604: Introduction to Database Management Systems. B. Aditya Prakash Lecture #10: Query Processing CS 4604: Introduction to Database Management Systems B. Aditya Prakash Lecture #10: Query Processing Outline introduction selection projection join set & aggregate operations Prakash 2018 VT CS 4604 2

More information

Evaluation of Relational Operations

Evaluation of Relational Operations Evaluation of Relational Operations Chapter 14 Comp 521 Files and Databases Fall 2010 1 Relational Operations We will consider in more detail how to implement: Selection ( ) Selects a subset of rows from

More information

Introduction to Parallel Algorithms

Introduction to Parallel Algorithms CS 1762 Fall, 2011 1 Introduction to Parallel Algorithms Introduction to Parallel Algorithms ECE 1762 Algorithms and Data Structures Fall Semester, 2011 1 Preliminaries Since the early 1990s, there has

More information

Efficient Processing of Top-k Dominating Queries on Multi-Dimensional Data

Efficient Processing of Top-k Dominating Queries on Multi-Dimensional Data Efficient Processing of To-k Dominating Queries on Multi-Dimensional Data Man Lung Yiu Deartment of Comuter Science Aalborg University DK-922 Aalborg, Denmark mly@cs.aau.dk Nikos Mamoulis Deartment of

More information

Lecture 19 Query Processing Part 1

Lecture 19 Query Processing Part 1 CMSC 461, Database Management Systems Spring 2018 Lecture 19 Query Processing Part 1 These slides are based on Database System Concepts 6 th edition book (whereas some quotes and figures are used from

More information

Implementation of Relational Operations: Other Operations

Implementation of Relational Operations: Other Operations Implementation of Relational Operations: Other Operations Module 4, Lecture 2 Database Management Systems, R. Ramakrishnan 1 Simple Selections SELECT * FROM Reserves R WHERE R.rname < C% Of the form σ

More information

Range Searching. Data structure for a set of objects (points, rectangles, polygons) for efficient range queries.

Range Searching. Data structure for a set of objects (points, rectangles, polygons) for efficient range queries. Range Searching Data structure for a set of objects (oints, rectangles, olygons) for efficient range queries. Y Q Deends on tye of objects and queries. Consider basic data structures with broad alicability.

More information

Equality-Based Translation Validator for LLVM

Equality-Based Translation Validator for LLVM Equality-Based Translation Validator for LLVM Michael Ste, Ross Tate, and Sorin Lerner University of California, San Diego {mste,rtate,lerner@cs.ucsd.edu Abstract. We udated our Peggy tool, reviously resented

More information

Convex Hulls. Helen Cameron. Helen Cameron Convex Hulls 1/101

Convex Hulls. Helen Cameron. Helen Cameron Convex Hulls 1/101 Convex Hulls Helen Cameron Helen Cameron Convex Hulls 1/101 What Is a Convex Hull? Starting Point: Points in 2D y x Helen Cameron Convex Hulls 3/101 Convex Hull: Informally Imagine that the x, y-lane is

More information

CAS CS 460/660 Introduction to Database Systems. Query Evaluation II 1.1

CAS CS 460/660 Introduction to Database Systems. Query Evaluation II 1.1 CAS CS 460/660 Introduction to Database Systems Query Evaluation II 1.1 Cost-based Query Sub-System Queries Select * From Blah B Where B.blah = blah Query Parser Query Optimizer Plan Generator Plan Cost

More information

Query Evaluation! References:! q [RG-3ed] Chapter 12, 13, 14, 15! q [SKS-6ed] Chapter 12, 13!

Query Evaluation! References:! q [RG-3ed] Chapter 12, 13, 14, 15! q [SKS-6ed] Chapter 12, 13! Query Evaluation! References:! q [RG-3ed] Chapter 12, 13, 14, 15! q [SKS-6ed] Chapter 12, 13! q Overview! q Optimization! q Measures of Query Cost! Query Evaluation! q Sorting! q Join Operation! q Other

More information

Database Applications (15-415)

Database Applications (15-415) Database Applications (15-415) DBMS Internals- Part VI Lecture 17, March 24, 2015 Mohammad Hammoud Today Last Two Sessions: DBMS Internals- Part V External Sorting How to Start a Company in Five (maybe

More information

Evaluation of relational operations

Evaluation of relational operations Evaluation of relational operations Iztok Savnik, FAMNIT Slides & Textbook Textbook: Raghu Ramakrishnan, Johannes Gehrke, Database Management Systems, McGraw-Hill, 3 rd ed., 2007. Slides: From Cow Book

More information

Evaluation of Relational Operations: Other Techniques

Evaluation of Relational Operations: Other Techniques Evaluation of Relational Operations: Other Techniques [R&G] Chapter 14, Part B CS4320 1 Using an Index for Selections Cost depends on #qualifying tuples, and clustering. Cost of finding qualifying data

More information

Administriva. CS 133: Databases. General Themes. Goals for Today. Fall 2018 Lec 11 10/11 Query Evaluation Prof. Beth Trushkowsky

Administriva. CS 133: Databases. General Themes. Goals for Today. Fall 2018 Lec 11 10/11 Query Evaluation Prof. Beth Trushkowsky Administriva Lab 2 Final version due next Wednesday CS 133: Databases Fall 2018 Lec 11 10/11 Query Evaluation Prof. Beth Trushkowsky Problem sets PSet 5 due today No PSet out this week optional practice

More information

10 File System Mass Storage Structure Mass Storage Systems Mass Storage Structure Mass Storage Structure FILE SYSTEM 1

10 File System Mass Storage Structure Mass Storage Systems Mass Storage Structure Mass Storage Structure FILE SYSTEM 1 10 File System 1 We will examine this chater in three subtitles: Mass Storage Systems OERATING SYSTEMS FILE SYSTEM 1 File System Interface File System Imlementation 10.1.1 Mass Storage Structure 3 2 10.1

More information

CS122 Lecture 15 Winter Term,

CS122 Lecture 15 Winter Term, CS122 Lecture 15 Winter Term, 2014-2015 2 Index Op)miza)ons So far, only discussed implementing relational algebra operations to directly access heap Biles Indexes present an alternate access path for

More information

Evaluation of Relational Operations

Evaluation of Relational Operations Evaluation of Relational Operations Yanlei Diao UMass Amherst March 13 and 15, 2006 Slides Courtesy of R. Ramakrishnan and J. Gehrke 1 Relational Operations We will consider how to implement: Selection

More information

Query Processing. Solutions to Practice Exercises Query:

Query Processing. Solutions to Practice Exercises Query: C H A P T E R 1 3 Query Processing Solutions to Practice Exercises 13.1 Query: Π T.branch name ((Π branch name, assets (ρ T (branch))) T.assets>S.assets (Π assets (σ (branch city = Brooklyn )(ρ S (branch)))))

More information

Randomized algorithms: Two examples and Yao s Minimax Principle

Randomized algorithms: Two examples and Yao s Minimax Principle Randomized algorithms: Two examles and Yao s Minimax Princile Maximum Satisfiability Consider the roblem Maximum Satisfiability (MAX-SAT). Bring your knowledge u-to-date on the Satisfiability roblem. Maximum

More information

CS2 Algorithms and Data Structures Note 8

CS2 Algorithms and Data Structures Note 8 CS2 Algorithms and Data Structures Note 8 Heasort and Quicksort We will see two more sorting algorithms in this lecture. The first, heasort, is very nice theoretically. It sorts an array with n items in

More information

15-415/615 Faloutsos 1

15-415/615 Faloutsos 1 Carnegie Mellon Univ. Dept. of Computer Science 15-415/615 - DB Applications Lecture #14: Implementation of Relational Operations (R&G ch. 12 and 14) 15-415/615 Faloutsos 1 Outline introduction selection

More information

Implementation of Relational Operations. Introduction. CS 186, Fall 2002, Lecture 19 R&G - Chapter 12

Implementation of Relational Operations. Introduction. CS 186, Fall 2002, Lecture 19 R&G - Chapter 12 Implementation of Relational Operations CS 186, Fall 2002, Lecture 19 R&G - Chapter 12 First comes thought; then organization of that thought, into ideas and plans; then transformation of those plans into

More information

Chapter 3. Algorithms for Query Processing and Optimization

Chapter 3. Algorithms for Query Processing and Optimization Chapter 3 Algorithms for Query Processing and Optimization Chapter Outline 1. Introduction to Query Processing 2. Translating SQL Queries into Relational Algebra 3. Algorithms for External Sorting 4. Algorithms

More information

Lecture 2: Fixed-Radius Near Neighbors and Geometric Basics

Lecture 2: Fixed-Radius Near Neighbors and Geometric Basics structure arises in many alications of geometry. The dual structure, called a Delaunay triangulation also has many interesting roerties. Figure 3: Voronoi diagram and Delaunay triangulation. Search: Geometric

More information

Cross products. p 2 p. p p1 p2. p 1. Line segments The convex combination of two distinct points p1 ( x1, such that for some real number with 0 1,

Cross products. p 2 p. p p1 p2. p 1. Line segments The convex combination of two distinct points p1 ( x1, such that for some real number with 0 1, CHAPTER 33 Comutational Geometry Is the branch of comuter science that studies algorithms for solving geometric roblems. Has alications in many fields, including comuter grahics robotics, VLSI design comuter

More information

Database Applications (15-415)

Database Applications (15-415) Database Applications (15-415) DBMS Internals- Part VIII Lecture 16, March 19, 2014 Mohammad Hammoud Today Last Session: DBMS Internals- Part VII Algorithms for Relational Operations (Cont d) Today s Session:

More information

Evaluation of Relational Operations: Other Techniques

Evaluation of Relational Operations: Other Techniques Evaluation of Relational Operations: Other Techniques Chapter 12, Part B Database Management Systems 3ed, R. Ramakrishnan and Johannes Gehrke 1 Using an Index for Selections v Cost depends on #qualifying

More information

The Priority R-Tree: A Practically Efficient and Worst-Case Optimal R-Tree

The Priority R-Tree: A Practically Efficient and Worst-Case Optimal R-Tree The Priority R-Tree: A Practically Efficient and Worst-Case Otimal R-Tree Lars Arge Deartment of Comuter Science Duke University, ox 90129 Durham, NC 27708-0129 USA large@cs.duke.edu Mark de erg Deartment

More information

CS2 Algorithms and Data Structures Note 8

CS2 Algorithms and Data Structures Note 8 CS2 Algorithms and Data Structures Note 8 Heasort and Quicksort We will see two more sorting algorithms in this lecture. The first, heasort, is very nice theoretically. It sorts an array with n items in

More information

Advances in Data Management Query Processing and Query Optimisation A.Poulovassilis

Advances in Data Management Query Processing and Query Optimisation A.Poulovassilis 1 Advances in Data Management Query Processing and Query Optimisation A.Poulovassilis 1 General approach to the implementation of Query Processing and Query Optimisation functionalities in DBMSs 1. Parse

More information

Efficient Parallel Hierarchical Clustering

Efficient Parallel Hierarchical Clustering Efficient Parallel Hierarchical Clustering Manoranjan Dash 1,SimonaPetrutiu, and Peter Scheuermann 1 Deartment of Information Systems, School of Comuter Engineering, Nanyang Technological University, Singaore

More information

Implementation of Relational Operations

Implementation of Relational Operations Implementation of Relational Operations Module 4, Lecture 1 Database Management Systems, R. Ramakrishnan 1 Relational Operations We will consider how to implement: Selection ( ) Selects a subset of rows

More information

Autonomic Physical Database Design - From Indexing to Multidimensional Clustering

Autonomic Physical Database Design - From Indexing to Multidimensional Clustering Autonomic Physical Database Design - From Indexing to Multidimensional Clustering Stehan Baumann, Kai-Uwe Sattler Databases and Information Systems Grou Technische Universität Ilmenau, Ilmenau, Germany

More information

Learning Motion Patterns in Crowded Scenes Using Motion Flow Field

Learning Motion Patterns in Crowded Scenes Using Motion Flow Field Learning Motion Patterns in Crowded Scenes Using Motion Flow Field Min Hu, Saad Ali and Mubarak Shah Comuter Vision Lab, University of Central Florida {mhu,sali,shah}@eecs.ucf.edu Abstract Learning tyical

More information

search(i): Returns an element in the data structure associated with key i

search(i): Returns an element in the data structure associated with key i CS161 Lecture 7 inary Search Trees Scribes: Ilan Goodman, Vishnu Sundaresan (2015), Date: October 17, 2017 Virginia Williams (2016), and Wilbur Yang (2016), G. Valiant Adated From Virginia Williams lecture

More information

Something to think about. Problems. Purpose. Vocabulary. Query Evaluation Techniques for large DB. Part 1. Fact:

Something to think about. Problems. Purpose. Vocabulary. Query Evaluation Techniques for large DB. Part 1. Fact: Query Evaluation Techniques for large DB Part 1 Fact: While data base management systems are standard tools in business data processing they are slowly being introduced to all the other emerging data base

More information

CSE 544 Principles of Database Management Systems

CSE 544 Principles of Database Management Systems CSE 544 Principles of Database Management Systems Alvin Cheung Fall 2015 Lecture 6 Lifecycle of a Query Plan 1 Announcements HW1 is due Thursday Projects proposals are due on Wednesday Office hour canceled

More information

CMSC424: Database Design. Instructor: Amol Deshpande

CMSC424: Database Design. Instructor: Amol Deshpande CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu Databases Data Models Conceptual representa1on of the data Data Retrieval How to ask ques1ons of the database How to answer those ques1ons

More information

Source Coding and express these numbers in a binary system using M log

Source Coding and express these numbers in a binary system using M log Source Coding 30.1 Source Coding Introduction We have studied how to transmit digital bits over a radio channel. We also saw ways that we could code those bits to achieve error correction. Bandwidth is

More information

Outline. Query Processing Overview Algorithms for basic operations. Query optimization. Sorting Selection Join Projection

Outline. Query Processing Overview Algorithms for basic operations. Query optimization. Sorting Selection Join Projection Outline Query Processing Overview Algorithms for basic operations Sorting Selection Join Projection Query optimization Heuristics Cost-based optimization 19 Estimate I/O Cost for Implementations Count

More information

Signature File Hierarchies and Signature Graphs: a New Index Method for Object-Oriented Databases

Signature File Hierarchies and Signature Graphs: a New Index Method for Object-Oriented Databases Signature File Hierarchies and Signature Grahs: a New Index Method for Object-Oriented Databases Yangjun Chen* and Yibin Chen Det. of Business Comuting University of Winnieg, Manitoba, Canada R3B 2E9 ABSTRACT

More information

OMNI: An Efficient Overlay Multicast. Infrastructure for Real-time Applications

OMNI: An Efficient Overlay Multicast. Infrastructure for Real-time Applications OMNI: An Efficient Overlay Multicast Infrastructure for Real-time Alications Suman Banerjee, Christoher Kommareddy, Koushik Kar, Bobby Bhattacharjee, Samir Khuller Abstract We consider an overlay architecture

More information

What happens. 376a. Database Design. Execution strategy. Query conversion. Next. Two types of techniques

What happens. 376a. Database Design. Execution strategy. Query conversion. Next. Two types of techniques 376a. Database Design Dept. of Computer Science Vassar College http://www.cs.vassar.edu/~cs376 Class 16 Query optimization What happens Database is given a query Query is scanned - scanner creates a list

More information

Query Processing Strategies and Optimization

Query Processing Strategies and Optimization Query Processing Strategies and Optimization CPS352: Database Systems Simon Miner Gordon College Last Revised: 10/25/12 Agenda Check-in Design Project Presentations Query Processing Programming Project

More information

Hash-Based Indexing 165

Hash-Based Indexing 165 Hash-Based Indexing 165 h 1 h 0 h 1 h 0 Next = 0 000 00 64 32 8 16 000 00 64 32 8 16 A 001 01 9 25 41 73 001 01 9 25 41 73 B 010 10 10 18 34 66 010 10 10 18 34 66 C Next = 3 011 11 11 19 D 011 11 11 19

More information

An improved algorithm for Hausdorff Voronoi diagram for non-crossing sets

An improved algorithm for Hausdorff Voronoi diagram for non-crossing sets An imroved algorithm for Hausdorff Voronoi diagram for non-crossing sets Frank Dehne, Anil Maheshwari and Ryan Taylor May 26, 2006 Abstract We resent an imroved algorithm for building a Hausdorff Voronoi

More information

Chapter 14 Query Optimization

Chapter 14 Query Optimization Chapter 14 Query Optimization Chapter 14: Query Optimization! Introduction! Catalog Information for Cost Estimation! Estimation of Statistics! Transformation of Relational Expressions! Dynamic Programming

More information

Chapter 14 Query Optimization

Chapter 14 Query Optimization Chapter 14: Query Optimization Chapter 14 Query Optimization! Introduction! Catalog Information for Cost Estimation! Estimation of Statistics! Transformation of Relational Expressions! Dynamic Programming

More information

Chapter 14 Query Optimization

Chapter 14 Query Optimization Chapter 14 Query Optimization Chapter 14: Query Optimization! Introduction! Catalog Information for Cost Estimation! Estimation of Statistics! Transformation of Relational Expressions! Dynamic Programming

More information

37. Hard Disk Drives. Operating System: Three Easy Pieces

37. Hard Disk Drives. Operating System: Three Easy Pieces 37. Hard Disk Drives Oerating System: Three Easy Pieces AOS@UC 1 Hard Disk Driver Hard disk driver have been the main form of ersistent data storage in comuter systems for decades ( and consequently file

More information

Administrivia. CS 133: Databases. Cost-based Query Sub-System. Goals for Today. Midterm on Thursday 10/18. Assignments

Administrivia. CS 133: Databases. Cost-based Query Sub-System. Goals for Today. Midterm on Thursday 10/18. Assignments Administrivia Midterm on Thursday 10/18 CS 133: Databases Fall 2018 Lec 12 10/16 Prof. Beth Trushkowsky Assignments Lab 3 starts after fall break No problem set out this week Goals for Today Cost-based

More information

Introduction Alternative ways of evaluating a given query using

Introduction Alternative ways of evaluating a given query using Query Optimization Introduction Catalog Information for Cost Estimation Estimation of Statistics Transformation of Relational Expressions Dynamic Programming for Choosing Evaluation Plans Introduction

More information

Evaluation of Relational Operations. Relational Operations

Evaluation of Relational Operations. Relational Operations Evaluation of Relational Operations Chapter 14, Part A (Joins) Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Relational Operations v We will consider how to implement: Selection ( )

More information

CS122 Lecture 4 Winter Term,

CS122 Lecture 4 Winter Term, CS122 Lecture 4 Winter Term, 2014-2015 2 SQL Query Transla.on Last time, introduced query evaluation pipeline SQL query SQL parser abstract syntax tree SQL translator relational algebra plan query plan

More information

Databases 2 Lecture IV. Alessandro Artale

Databases 2 Lecture IV. Alessandro Artale Free University of Bolzano Database 2. Lecture IV, 2003/2004 A.Artale (1) Databases 2 Lecture IV Alessandro Artale Faculty of Computer Science Free University of Bolzano Room: 221 artale@inf.unibz.it http://www.inf.unibz.it/

More information

Evaluation of Relational Operations: Other Techniques. Chapter 14 Sayyed Nezhadi

Evaluation of Relational Operations: Other Techniques. Chapter 14 Sayyed Nezhadi Evaluation of Relational Operations: Other Techniques Chapter 14 Sayyed Nezhadi Schema for Examples Sailors (sid: integer, sname: string, rating: integer, age: real) Reserves (sid: integer, bid: integer,

More information

CS122 Lecture 8 Winter Term,

CS122 Lecture 8 Winter Term, CS122 Lecture 8 Winter Term, 2014-2015 2 Other Join Algorithms Nested- loops join is generally useful, but slow Most joins involve equality tests against attributes Such joins are called equijoins Two

More information

Chapter 14: Query Optimization

Chapter 14: Query Optimization Chapter 14: Query Optimization Database System Concepts 5 th Ed. See www.db-book.com for conditions on re-use Chapter 14: Query Optimization Introduction Transformation of Relational Expressions Catalog

More information

Chapter 12: Indexing and Hashing. Basic Concepts

Chapter 12: Indexing and Hashing. Basic Concepts Chapter 12: Indexing and Hashing! Basic Concepts! Ordered Indices! B+-Tree Index Files! B-Tree Index Files! Static Hashing! Dynamic Hashing! Comparison of Ordered Indexing and Hashing! Index Definition

More information

22. Swaping: Policies

22. Swaping: Policies 22. Swaing: Policies Oerating System: Three Easy Pieces 1 Beyond Physical Memory: Policies Memory ressure forces the OS to start aging out ages to make room for actively-used ages. Deciding which age to

More information

SQL QUERY EVALUATION. CS121: Relational Databases Fall 2017 Lecture 12

SQL QUERY EVALUATION. CS121: Relational Databases Fall 2017 Lecture 12 SQL QUERY EVALUATION CS121: Relational Databases Fall 2017 Lecture 12 Query Evaluation 2 Last time: Began looking at database implementation details How data is stored and accessed by the database Using

More information

Truth Trees. Truth Tree Fundamentals

Truth Trees. Truth Tree Fundamentals Truth Trees 1 True Tree Fundamentals 2 Testing Grous of Statements for Consistency 3 Testing Arguments in Proositional Logic 4 Proving Invalidity in Predicate Logic Answers to Selected Exercises Truth

More information

Fundamentals of Database Systems

Fundamentals of Database Systems Fundamentals of Database Systems Assignment: 4 September 21, 2015 Instructions 1. This question paper contains 10 questions in 5 pages. Q1: Calculate branching factor in case for B- tree index structure,

More information

Real Time Compression of Triangle Mesh Connectivity

Real Time Compression of Triangle Mesh Connectivity Real Time Comression of Triangle Mesh Connectivity Stefan Gumhold, Wolfgang Straßer WSI/GRIS University of Tübingen Abstract In this aer we introduce a new comressed reresentation for the connectivity

More information

Chapter 12: Indexing and Hashing

Chapter 12: Indexing and Hashing Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B+-Tree Index Files B-Tree Index Files Static Hashing Dynamic Hashing Comparison of Ordered Indexing and Hashing Index Definition in SQL

More information

10. Parallel Methods for Data Sorting

10. Parallel Methods for Data Sorting 10. Parallel Methods for Data Sorting 10. Parallel Methods for Data Sorting... 1 10.1. Parallelizing Princiles... 10.. Scaling Parallel Comutations... 10.3. Bubble Sort...3 10.3.1. Sequential Algorithm...3

More information

Cost-based Query Sub-System. Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications. Last Class.

Cost-based Query Sub-System. Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications. Last Class. Cost-based Query Sub-System Carnegie Mellon Univ. Dept. of Computer Science 15-415/615 - DB Applications Queries Select * From Blah B Where B.blah = blah Query Parser Query Optimizer C. Faloutsos A. Pavlo

More information

Parallel Construction of Multidimensional Binary Search Trees. Ibraheem Al-furaih, Srinivas Aluru, Sanjay Goil Sanjay Ranka

Parallel Construction of Multidimensional Binary Search Trees. Ibraheem Al-furaih, Srinivas Aluru, Sanjay Goil Sanjay Ranka Parallel Construction of Multidimensional Binary Search Trees Ibraheem Al-furaih, Srinivas Aluru, Sanjay Goil Sanjay Ranka School of CIS and School of CISE Northeast Parallel Architectures Center Syracuse

More information

DBMS Query evaluation

DBMS Query evaluation Data Management for Data Science DBMS Maurizio Lenzerini, Riccardo Rosati Corso di laurea magistrale in Data Science Sapienza Università di Roma Academic Year 2016/2017 http://www.dis.uniroma1.it/~rosati/dmds/

More information

Evaluation of Relational Operations: Other Techniques

Evaluation of Relational Operations: Other Techniques Evaluation of Relational Operations: Other Techniques Chapter 14, Part B Database Management Systems 3ed, R. Ramakrishnan and Johannes Gehrke 1 Using an Index for Selections Cost depends on #qualifying

More information

Contents Contents Introduction Basic Steps in Query Processing Introduction Transformation of Relational Expressions...

Contents Contents Introduction Basic Steps in Query Processing Introduction Transformation of Relational Expressions... Contents Contents...283 Introduction...283 Basic Steps in Query Processing...284 Introduction...285 Transformation of Relational Expressions...287 Equivalence Rules...289 Transformation Example: Pushing

More information