Query Processing Transparencies. Pearson Education Limited 1995, 2005


Chapter 21: Query Processing

Chapter 21 - Objectives
- Objectives of query processing and optimization.
- Static versus dynamic query optimization.
- How a query is decomposed and semantically analyzed.
- How to create a relational algebra tree (R.A.T.) to represent a query.
- Rules of equivalence for relational algebra (RA) operations.
- How to apply heuristic transformation rules to improve the efficiency of a query.

Chapter 21 - Objectives
- Types of database statistics required to estimate the cost of operations.
- Different strategies for implementing selection, and how to evaluate its cost and size.
- Different strategies for implementing join, and how to evaluate its cost and size.
- Different strategies for implementing projection, and how to evaluate its cost and size.
- How to evaluate the cost and size of other RA operations.
- How pipelining can be used to improve the efficiency of queries.
- Difference between materialization and pipelining.
- Advantages of left-deep trees.
- Approaches to finding an optimal execution strategy.
- How Oracle handles query optimization (QO).

Introduction
In network and hierarchical DBMSs, a low-level procedural query language is generally embedded in a high-level programming language. It is the programmer's responsibility to select the most appropriate execution strategy. With declarative languages such as SQL, the user specifies what data is required rather than how it is to be retrieved. This relieves the user of knowing what constitutes a good execution strategy.

Introduction
It also gives the DBMS more control over system performance. There are two main techniques for query optimization:
- heuristic rules that order the operations in a query;
- comparing different strategies based on their relative costs, and selecting the one that minimizes resource usage.
Disk access tends to be the dominant cost in query processing for a centralized DBMS.

Query Processing
The activities involved in retrieving data from the database. The aims of query processing (QP) are to:
- transform a query written in a high-level language (e.g. SQL) into a correct and efficient execution strategy expressed in a low-level language (implementing RA);
- execute the strategy to retrieve the required data.

Query Optimization
The activity of choosing an efficient execution strategy for processing a query. As there are many equivalent transformations of the same high-level query, the aim of QO is to choose the one that minimizes resource usage. Generally, this means reducing the total execution time of the query; it may also mean reducing the response time of the query. The problem is computationally intractable with a large number of relations, so the strategy adopted is reduced to finding a near-optimum solution.

Example - Different Strategies
Find all Managers who work at a London branch.

SELECT *
FROM Staff s, Branch b
WHERE s.branchNo = b.branchNo AND
      (s.position = 'Manager' AND b.city = 'London');

Example - Different Strategies
Three equivalent RA queries are:
(1) σ_{(position='Manager') ∧ (city='London') ∧ (Staff.branchNo=Branch.branchNo)}(Staff × Branch)
(2) σ_{(position='Manager') ∧ (city='London')}(Staff ⋈_{Staff.branchNo=Branch.branchNo} Branch)
(3) (σ_{position='Manager'}(Staff)) ⋈_{Staff.branchNo=Branch.branchNo} (σ_{city='London'}(Branch))

Example - Different Strategies
Assume:
- 1000 tuples in Staff; 50 tuples in Branch;
- 50 Managers; 5 London branches;
- no indexes or sort keys;
- results of any intermediate operations are stored on disk;
- cost of the final write is ignored;
- tuples are accessed one at a time.

Example - Cost Comparison
Costs (in disk accesses) are:
(1) (1000 + 50) + 2*(1000 * 50) = 101 050
(2) 2*1000 + (1000 + 50) = 3 050
(3) 1000 + 2*50 + 5 + (50 + 5) = 1 160
Cartesian product and join operations are much more expensive than selection, and the third option significantly reduces the size of the relations being joined together.
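The cost comparison can be reproduced with a few lines of arithmetic. The sketch below is only an illustration under the assumptions listed on the slide (every intermediate result is written to disk and read back, one access per tuple); the function and constant names are invented for this example, and strategy 2 additionally assumes that every Staff tuple matches exactly one Branch tuple, so the join produces 1000 tuples.

```python
# Assumptions from the slide: no indexes, intermediate results go to disk,
# the final write is ignored, and tuples are accessed one at a time.
STAFF = 1000     # tuples in Staff
BRANCH = 50      # tuples in Branch
MANAGERS = 50    # Staff tuples with position = 'Manager'
LONDON = 5       # Branch tuples with city = 'London'


def cost_strategy_1() -> int:
    """Selection applied to the Cartesian product Staff x Branch."""
    read_inputs = STAFF + BRANCH
    product = STAFF * BRANCH            # written once, then read back once
    return read_inputs + 2 * product


def cost_strategy_2() -> int:
    """Join first, then select the London managers."""
    read_inputs = STAFF + BRANCH
    join_result = STAFF                 # assumed: one matching branch per staff member
    return 2 * join_result + read_inputs


def cost_strategy_3() -> int:
    """Select on each relation first, then join the reduced relations."""
    select_staff = STAFF + MANAGERS     # read Staff, write the managers
    select_branch = BRANCH + LONDON     # read Branch, write the London branches
    join_inputs = MANAGERS + LONDON     # read both reduced relations for the join
    return select_staff + select_branch + join_inputs


print(cost_strategy_1(), cost_strategy_2(), cost_strategy_3())
```

Under these assumptions the third strategy is cheaper than the first by a factor of roughly 87, purely because the selections shrink the join inputs.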

Phases of Query Processing
QP has four main phases:
- decomposition (consisting of parsing and validation);
- optimization;
- code generation;
- execution.

[Figure: Phases of Query Processing]

Dynamic versus Static Optimization
There are two times when the first three phases of QP can be carried out:
- dynamically, every time the query is run;
- statically, when the query is first submitted.
The advantages of dynamic QO arise from the fact that the information is up to date. The disadvantages are that the performance of the query is affected and that time may limit finding the optimum strategy.

Dynamic versus Static Optimization
The advantages of static QO are the removal of runtime overhead and more time to find the optimum strategy. The disadvantages arise from the fact that the chosen execution strategy may no longer be optimal when the query is run. A hybrid approach could be used to overcome this.

Query Decomposition
The aims are to transform the high-level query into an RA query and to check that the query is syntactically and semantically correct. Typical stages are:
- analysis,
- normalization,
- semantic analysis,
- simplification,
- query restructuring.

Analysis
Analyze the query lexically and syntactically using compiler techniques. Verify that the relations and attributes exist. Verify that the operations are appropriate for the object type.

Analysis - Example

SELECT staff_no
FROM Staff
WHERE position > 10;

This query would be rejected on two grounds:
- staff_no is not defined for the Staff relation (it should be staffNo);
- the comparison > 10 is incompatible with the type of position, which is a variable character string.

Analysis
Finally, the query is transformed into some internal representation more suitable for processing. Some kind of query tree is typically chosen, constructed as follows:
- A leaf node is created for each base relation.
- A non-leaf node is created for each intermediate relation produced by an RA operation.
- The root of the tree represents the query result.
- The sequence is directed from leaves to root.

[Figure: Example R.A.T.]

Normalization
Converts the query into a normalized form for easier manipulation. The predicate can be converted into one of two forms:
Conjunctive normal form:
(position = 'Manager' ∨ salary > 20000) ∧ (branchNo = 'B003')
Disjunctive normal form:
(position = 'Manager' ∧ branchNo = 'B003') ∨ (salary > 20000 ∧ branchNo = 'B003')

Semantic Analysis
Rejects normalized queries that are incorrectly formulated or contradictory. A query is incorrectly formulated if components do not contribute to the generation of the result. A query is contradictory if its predicate cannot be satisfied by any tuple. Algorithms to determine correctness exist only for queries that do not contain disjunction and negation.

Semantic Analysis
For these queries, we could construct:
- a relation connection graph;
- a normalized attribute connection graph.
Relation connection graph: create a node for each relation and a node for the result. Create edges between two nodes that represent a join, and edges between nodes that represent projection. If the graph is not connected, the query is incorrectly formulated.

Semantic Analysis - Normalized Attribute Connection Graph
Create a node for each reference to an attribute, or the constant 0. Create a directed edge between nodes that represent a join, and a directed edge between an attribute node and the 0 node that represents a selection. Weight edges a → b with value c if they represent the inequality condition (a ≤ b + c); weight edges 0 → a with −c if they represent the inequality condition (a ≥ c). If the graph has a cycle for which the valuation sum is negative, the query is contradictory.

Example - Checking Semantic Correctness

SELECT p.propertyNo, p.street
FROM Client c, Viewing v, PropertyForRent p
WHERE c.clientNo = v.clientNo AND
      c.maxRent >= 500 AND
      c.prefType = 'Flat' AND p.ownerNo = 'CO93';

The relation connection graph is not fully connected, so the query is not correctly formulated. The join condition (v.propertyNo = p.propertyNo) has been omitted.

Example - Checking Semantic Correctness
[Figure: relation connection graph and normalized attribute connection graph]

Example - Checking Semantic Correctness

SELECT p.propertyNo, p.street
FROM Client c, Viewing v, PropertyForRent p
WHERE c.maxRent > 500 AND
      c.clientNo = v.clientNo AND
      v.propertyNo = p.propertyNo AND
      c.prefType = 'Flat' AND c.maxRent < 200;

The normalized attribute connection graph has a cycle between the nodes c.maxRent and 0 with a negative valuation sum, so the query is contradictory.
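The negative-cycle test on the normalized attribute connection graph can be sketched with a Bellman-Ford style relaxation. This is an illustration rather than the algorithm a particular DBMS uses: the function name and edge representation are invented here, and the strict comparisons > and < are treated as ≥ and ≤ for simplicity.

```python
from typing import List, Tuple

Edge = Tuple[str, str, float]  # (source node, target node, weight)


def has_negative_cycle(nodes: List[str], edges: List[Edge]) -> bool:
    """Return True if the weighted directed graph contains a negative cycle."""
    # Starting every node at distance 0 acts like a virtual source connected
    # to all nodes, so a negative cycle anywhere in the graph is detected.
    dist = {n: 0.0 for n in nodes}
    for _ in range(len(nodes) - 1):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            return False
    # If any edge can still be relaxed, some cycle has a negative sum.
    return any(dist[u] + w < dist[v] for u, v, w in edges)


nodes = ["0", "c.maxRent"]
# c.maxRent > 500 is read as (a >= c): edge 0 -> a with weight -500.
# c.maxRent < 200 is read as (a <= 0 + 200): edge a -> 0 with weight 200.
contradictory = [("0", "c.maxRent", -500.0), ("c.maxRent", "0", 200.0)]
# Replacing the upper bound with 800 makes the predicate satisfiable.
satisfiable = [("0", "c.maxRent", -500.0), ("c.maxRent", "0", 800.0)]

print(has_negative_cycle(nodes, contradictory))  # cycle sum is -300
print(has_negative_cycle(nodes, satisfiable))    # cycle sum is +300
```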

Simplification
Detects redundant qualifications, eliminates common sub-expressions, and transforms the query to a semantically equivalent but more easily and efficiently computed form. Typically, access restrictions, view definitions, and integrity constraints are considered. Assuming the user has the appropriate access privileges, first apply the well-known idempotency rules of boolean algebra.

Transformation Rules for RA Operations
Conjunctive Selection operations can cascade into individual Selection operations (and vice versa).
σ_{p ∧ q ∧ r}(R) = σ_p(σ_q(σ_r(R)))
Sometimes referred to as cascade of Selection. For example:
σ_{branchNo='B003' ∧ salary>15000}(Staff) = σ_{branchNo='B003'}(σ_{salary>15000}(Staff))

Transformation Rules for RA Operations
Commutativity of Selection.
σ_p(σ_q(R)) = σ_q(σ_p(R))
For example:
σ_{branchNo='B003'}(σ_{salary>15000}(Staff)) = σ_{salary>15000}(σ_{branchNo='B003'}(Staff))

Transformation Rules for RA Operations
In a sequence of Projection operations, only the last in the sequence is required.
Π_L Π_M … Π_N(R) = Π_L(R)
For example:
Π_{lName} Π_{branchNo, lName}(Staff) = Π_{lName}(Staff)

Transformation Rules for RA Operations
Commutativity of Selection and Projection. If predicate p involves only attributes in the projection list, the Selection and Projection operations commute:
Π_{A_1, …, A_m}(σ_p(R)) = σ_p(Π_{A_1, …, A_m}(R)) where p ∈ {A_1, A_2, …, A_m}
For example:
Π_{fName, lName}(σ_{lName='Beech'}(Staff)) = σ_{lName='Beech'}(Π_{fName, lName}(Staff))

Transformation Rules for RA Operations
Commutativity of Theta join (and Cartesian product).
R ⋈_p S = S ⋈_p R
R × S = S × R
The rule also applies to Equijoin and Natural join. For example:
Staff ⋈_{Staff.branchNo=Branch.branchNo} Branch = Branch ⋈_{Staff.branchNo=Branch.branchNo} Staff

Transformation Rules for RA Operations
Commutativity of Selection and Theta join (or Cartesian product). If the selection predicate involves only attributes of one of the join relations, the Selection and Join (or Cartesian product) operations commute:
σ_p(R ⋈_r S) = (σ_p(R)) ⋈_r S
σ_p(R × S) = (σ_p(R)) × S
where p ∈ {A_1, A_2, …, A_n}

Transformation Rules for RA Operations
If the selection predicate is a conjunctive predicate of the form (p ∧ q), where p involves only attributes of R and q only attributes of S, the Selection and Theta join operations commute as:
σ_{p ∧ q}(R ⋈_r S) = (σ_p(R)) ⋈_r (σ_q(S))
σ_{p ∧ q}(R × S) = (σ_p(R)) × (σ_q(S))

Transformation Rules for RA Operations
For example:
σ_{position='Manager' ∧ city='London'}(Staff ⋈_{Staff.branchNo=Branch.branchNo} Branch) = (σ_{position='Manager'}(Staff)) ⋈_{Staff.branchNo=Branch.branchNo} (σ_{city='London'}(Branch))

Transformation Rules for RA Operations
Commutativity of Projection and Theta join (or Cartesian product). If the projection list is of the form L = L_1 ∪ L_2, where L_1 has only attributes of R and L_2 has only attributes of S, then, provided the join condition contains only attributes of L, Projection and Theta join commute:
Π_{L_1 ∪ L_2}(R ⋈_r S) = (Π_{L_1}(R)) ⋈_r (Π_{L_2}(S))
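The rule that pushes a conjunctive selection below a join can be checked on a toy instance. The sketch below is illustrative only: the rows are made-up sample data (not taken from the slides), and `select` and `join` are small helper functions written for this example, not part of any library.

```python
# Made-up sample rows for illustration only.
staff = [
    {"staffNo": "S1", "position": "Manager", "branchNo": "B005"},
    {"staffNo": "S2", "position": "Assistant", "branchNo": "B005"},
    {"staffNo": "S3", "position": "Manager", "branchNo": "B003"},
]
branch = [
    {"branchNo": "B005", "city": "London"},
    {"branchNo": "B003", "city": "Glasgow"},
]


def select(rows, predicate):
    """Selection: keep the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def join(left, right, key):
    """Equijoin on a shared attribute name, as a simple nested loop."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def is_manager(row):
    return row["position"] == "Manager"


def in_london(row):
    return row["city"] == "London"


# Left-hand side of the rule: select after the join.
late = select(join(staff, branch, "branchNo"),
              lambda row: is_manager(row) and in_london(row))

# Right-hand side of the rule: select on each relation, then join.
early = join(select(staff, is_manager), select(branch, in_london), "branchNo")

print(late == early, [row["staffNo"] for row in early])
```

Both sides return the same rows, but the second form joins 2 Staff rows with 1 Branch row instead of 3 with 2, which is the point of performing selections early.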

Transformation Rules for RA Operations
If the join condition contains additional attributes not in L (M = M_1 ∪ M_2, where M_1 has only attributes of R and M_2 has only attributes of S), a final projection operation is required:
Π_{L_1 ∪ L_2}(R ⋈_r S) = Π_{L_1 ∪ L_2}((Π_{L_1 ∪ M_1}(R)) ⋈_r (Π_{L_2 ∪ M_2}(S)))

Transformation Rules for RA Operations
For example:
Π_{position, city, branchNo}(Staff ⋈_{Staff.branchNo=Branch.branchNo} Branch) = (Π_{position, branchNo}(Staff)) ⋈_{Staff.branchNo=Branch.branchNo} (Π_{city, branchNo}(Branch))
and using the latter rule:
Π_{position, city}(Staff ⋈_{Staff.branchNo=Branch.branchNo} Branch) = Π_{position, city}((Π_{position, branchNo}(Staff)) ⋈_{Staff.branchNo=Branch.branchNo} (Π_{city, branchNo}(Branch)))

Transformation Rules for RA Operations
Commutativity of Union and Intersection (but not Set difference).
R ∪ S = S ∪ R
R ∩ S = S ∩ R

Transformation Rules for RA Operations
Commutativity of Selection and set operations (Union, Intersection, and Set difference).
σ_p(R ∪ S) = σ_p(R) ∪ σ_p(S)
σ_p(R ∩ S) = σ_p(R) ∩ σ_p(S)
σ_p(R − S) = σ_p(R) − σ_p(S)

Transformation Rules for RA Operations
Commutativity of Projection and Union.
Π_L(R ∪ S) = Π_L(S) ∪ Π_L(R)
Associativity of Union and Intersection (but not Set difference).
(R ∪ S) ∪ T = S ∪ (R ∪ T)
(R ∩ S) ∩ T = S ∩ (R ∩ T)

Transformation Rules for RA Operations
Associativity of Theta join (and Cartesian product). Cartesian product and Natural join are always associative:
(R ⋈ S) ⋈ T = R ⋈ (S ⋈ T)
(R × S) × T = R × (S × T)
If the join condition q involves attributes only from S and T, then Theta join is associative:
(R ⋈_p S) ⋈_{q ∧ r} T = R ⋈_{p ∧ r} (S ⋈_q T)

Transformation Rules for RA Operations
For example:
(Staff ⋈_{Staff.staffNo=PropertyForRent.staffNo} PropertyForRent) ⋈_{ownerNo=Owner.ownerNo ∧ Staff.lName=Owner.lName} Owner = Staff ⋈_{Staff.staffNo=PropertyForRent.staffNo ∧ Staff.lName=lName} (PropertyForRent ⋈_{ownerNo} Owner)

Example 21.3 Use of Transformation Rules
For prospective renters of flats, find properties that match their requirements and are owned by CO93.

SELECT p.propertyNo, p.street
FROM Client c, Viewing v, PropertyForRent p
WHERE c.prefType = 'Flat' AND
      c.clientNo = v.clientNo AND
      v.propertyNo = p.propertyNo AND
      c.maxRent >= p.rent AND
      c.prefType = p.type AND
      p.ownerNo = 'CO93';

[Figures: Example 21.3 Use of Transformation Rules]

[Figure: Example 21.3 Use of Transformation Rules]

Heuristical Processing Strategies
- Perform Selection operations as early as possible. Keep predicates on the same relation together.
- Combine a Cartesian product with a subsequent Selection whose predicate represents a join condition into a Join operation.
- Use associativity of binary operations to rearrange leaf nodes so that the leaf nodes with the most restrictive Selection operations are executed first.

Heuristical Processing Strategies
- Perform Projection as early as possible. Keep projection attributes on the same relation together.
- Compute common expressions once. If a common expression appears more than once, and the result is not too large, store the result and reuse it when required. This is useful when querying views, as the same expression is used to construct the view each time.

Cost Estimation for RA Operations
There are many different ways of implementing RA operations. The aim of QO is to choose the most efficient one. Use formulae that estimate the costs for a number of options, and select the one with the lowest cost. Consider only the cost of disk access, which is usually the dominant cost in QP. Many estimates are based on the cardinality of the relation, so we need to be able to estimate this.

Database Statistics
The success of estimation depends on the amount and currency of the statistical information the DBMS holds. Keeping statistics current can be problematic. If statistics were updated every time a tuple is changed, this would impact performance. The DBMS could update statistics on a periodic basis, for example nightly, or whenever the system is idle.

Typical Statistics for Relation R
- nTuples(R) - number of tuples in R.
- bFactor(R) - blocking factor of R.
- nBlocks(R) - number of blocks required to store R:
  nBlocks(R) = ⌈nTuples(R)/bFactor(R)⌉

Typical Statistics for Attribute A of Relation R
- nDistinct_A(R) - number of distinct values that appear for attribute A in R.
- min_A(R), max_A(R) - minimum and maximum possible values for attribute A in R.
- SC_A(R) - selection cardinality of attribute A in R: the average number of tuples that satisfy an equality condition on attribute A.

Statistics for Multilevel Index I on Attribute A
- nLevels_A(I) - number of levels in I.
- nLfBlocks_A(I) - number of leaf blocks in I.

Selection Operation
The predicate may be simple or composite. There are a number of different implementations, depending on the file structure and on whether the attribute(s) involved are indexed/hashed. The main strategies are:
- Linear search (unordered file, no index).
- Binary search (ordered file, no index).
- Equality on hash key.
- Equality condition on primary key.
- Inequality condition on primary key.
- Equality condition on clustering (secondary) index.
- Equality condition on a non-clustering (secondary) index.
- Inequality condition on a secondary B+-tree index.

Estimating Cardinality of Selection
Assume attribute values are uniformly distributed within their domain and attributes are independent.
nTuples(S) = SC_A(R)
For any attribute B ≠ A of S, nDistinct_B(S) =
- nTuples(S) if nTuples(S) < nDistinct_B(R)/2
- nDistinct_B(R) if nTuples(S) > 2*nDistinct_B(R)
- ⌈(nTuples(S) + nDistinct_B(R))/3⌉ otherwise

Linear Search (Unordered File, No Index)
May need to scan each tuple in each block to check whether it satisfies the predicate. For an equality condition on a key attribute, the cost estimate is:
⌈nBlocks(R)/2⌉
For any other condition, the entire file may need to be searched, so the more general cost estimate is:
nBlocks(R)

Binary Search (Ordered File, No Index)
If the predicate is of the form A = x, and the file is ordered on key attribute A, the cost estimate is:
⌈log2(nBlocks(R))⌉
Generally, the cost estimate is:
⌈log2(nBlocks(R))⌉ + ⌈SC_A(R)/bFactor(R)⌉ − 1
The first term represents the cost of finding the first tuple using binary search. We expect there to be SC_A(R) tuples satisfying the predicate.

Equality of Hash Key
If attribute A is the hash key, apply the hashing algorithm to calculate the target address for the tuple. If there is no overflow, the expected cost is 1. If there is overflow, additional accesses may be necessary.

Equality Condition on Primary Key
Can use the primary index to retrieve the single record satisfying the condition. Need to read one more block than the number of index accesses, which is equivalent to the number of levels in the index, so the estimated cost is:
nLevels_A(I) + 1

Inequality Condition on Primary Key
Can first use the index to locate the record satisfying the predicate (A = x). Provided the index is sorted, the required records can be found by accessing all records before/after this one. Assuming a uniform distribution, we would expect half the records to satisfy the inequality, so the estimated cost is:
nLevels_A(I) + ⌈nBlocks(R)/2⌉

Equality Condition on Clustering Index
Can use the index to retrieve the required records. The estimated cost is:
nLevels_A(I) + ⌈SC_A(R)/bFactor(R)⌉
The second term is an estimate of the number of blocks required to store the tuples that satisfy the equality condition, represented as SC_A(R).

Equality Condition on Non-Clustering Index
Can use the index to retrieve the required records. Have to assume that the tuples are on different blocks (the index is not clustered this time), so the estimated cost becomes:
nLevels_A(I) + ⌈SC_A(R)⌉

Inequality Condition on a Secondary B+-Tree Index
From the leaf nodes of the tree, can scan keys from the smallest value up to x (< or <=) or from x up to the maximum value (> or >=). Assuming a uniform distribution, we would expect half the leaf node blocks to be accessed and, via the index, half the file records to be accessed. The estimated cost is:
nLevels_A(I) + ⌈nLfBlocks_A(I)/2 + nTuples(R)/2⌉

Composite Predicates - Conjunction without Disjunction
May consider the following approaches:
- If one attribute has an index or is ordered, can use one of the above selection strategies. Can then check each retrieved record.
- For equality on two or more attributes, with a composite index (or hash key) on the combined attributes, can search the index directly.
- With secondary indexes on one or more attributes (involved only in equality conditions in the predicate), could use record pointers if they exist.
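The selection cost estimates given for the individual strategies can be collected into small functions for side-by-side comparison. This is a sketch only: the function names are invented for this example, the primary-key estimate follows the statement that one more block is read than the number of index levels, and the statistics at the bottom are illustrative values rather than figures from the slides.

```python
import math


def n_blocks(n_tuples: int, b_factor: int) -> int:
    """nBlocks(R) = ceil(nTuples(R) / bFactor(R))."""
    return math.ceil(n_tuples / b_factor)


def linear_search(nblocks: int, equality_on_key: bool = False) -> int:
    """Half the file on average for key equality, otherwise the whole file."""
    return math.ceil(nblocks / 2) if equality_on_key else nblocks


def binary_search(nblocks: int, sc: float, b_factor: int) -> int:
    """Find the first match, then read the blocks holding the SC matches."""
    return math.ceil(math.log2(nblocks)) + math.ceil(sc / b_factor) - 1


def primary_key_equality(n_levels: int) -> int:
    """Index levels plus one data block."""
    return n_levels + 1


def primary_key_inequality(n_levels: int, nblocks: int) -> int:
    return n_levels + math.ceil(nblocks / 2)


def clustering_index_equality(n_levels: int, sc: float, b_factor: int) -> int:
    return n_levels + math.ceil(sc / b_factor)


def non_clustering_index_equality(n_levels: int, sc: float) -> int:
    """Each matching tuple is assumed to sit on a different block."""
    return n_levels + math.ceil(sc)


def btree_inequality(n_levels: int, n_lf_blocks: int, n_tuples: int) -> int:
    return n_levels + math.ceil(n_lf_blocks / 2 + n_tuples / 2)


# Illustrative statistics (assumed for this example).
blocks = n_blocks(n_tuples=3000, b_factor=30)
print(blocks, linear_search(blocks), binary_search(blocks, sc=1, b_factor=30))
print(clustering_index_equality(2, sc=6, b_factor=30),
      non_clustering_index_equality(2, sc=6))
```

With these figures a non-clustering index on a predicate matching 6 tuples costs 8 accesses against 100 for a full scan, but the B+-tree inequality estimate (over 1500 accesses) is far worse than scanning the file, which is why the optimizer needs the estimates rather than a fixed preference for indexes.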

Composite Predicates - Selections with Disjunction
If one term contains an OR, and that term requires a linear search, the entire selection requires a linear search. Only if an index or sort order exists on every term can the selection be optimized by retrieving the records that satisfy each condition and applying the union operator. Again, record pointers can be used if they exist.

Join Operation
The main strategies for implementing join are:
- Block Nested Loop Join.
- Indexed Nested Loop Join.
- Sort-Merge Join.
- Hash Join.

Estimating Cardinality of Join
The cardinality of the Cartesian product is:
nTuples(R) * nTuples(S)
It is more difficult to estimate the cardinality of any join, as it depends on the distribution of values. In the worst case, it cannot be any greater than this value.

Estimating Cardinality of Join
If we assume a uniform distribution, we can estimate for Equijoins with a predicate (R.A = S.B) as follows:
- If A is a key of R: nTuples(T) ≤ nTuples(S)
- If B is a key of S: nTuples(T) ≤ nTuples(R)
Otherwise, we could estimate the cardinality of the join as:
nTuples(T) = SC_A(R)*nTuples(S)
or
nTuples(T) = SC_B(S)*nTuples(R)

Block Nested Loop Join
The simplest join algorithm is a nested loop that joins two relations together a tuple at a time. The outer loop iterates over each tuple in R, and the inner loop iterates over each tuple in S. As the basic unit of reading/writing is a disk block, it is better to have two extra loops that process blocks. The estimated cost of this approach is:
nBlocks(R) + (nBlocks(R) * nBlocks(S))

Block Nested Loop Join
Could read as many blocks as possible of the smaller relation, R say, into the database buffer, saving one block for the inner relation and one for the result. The new cost estimate becomes:
nBlocks(R) + ⌈nBlocks(S)*(nBlocks(R)/(nBuffer − 2))⌉
If all blocks of R can be read into the buffer, this reduces to:
nBlocks(R) + nBlocks(S)
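The four-loop structure of the block nested loop join can be sketched as follows. This is an in-memory illustration, not a disk-based implementation: each relation is modelled as a list of blocks, each block as a list of dictionaries, and the names `block_nested_loop_join` and `block_nested_loop_cost` are invented for this example.

```python
from typing import Dict, Iterator, List

Row = Dict[str, object]
Block = List[Row]


def block_nested_loop_join(r_blocks: List[Block], s_blocks: List[Block],
                           key: str) -> Iterator[Row]:
    """Equijoin R and S on a shared attribute, one block pair at a time."""
    for r_block in r_blocks:          # read each block of the outer relation once
        for s_block in s_blocks:      # re-read the inner relation per outer block
            for r in r_block:         # the two inner loops work in memory
                for s in s_block:
                    if r[key] == s[key]:
                        yield {**r, **s}


def block_nested_loop_cost(nblocks_r: int, nblocks_s: int) -> int:
    """nBlocks(R) + nBlocks(R) * nBlocks(S), with R as the outer relation."""
    return nblocks_r + nblocks_r * nblocks_s


# Made-up data: two blocks of R, one block of S.
r_blocks = [[{"branchNo": "B003", "staffNo": "S1"}],
            [{"branchNo": "B005", "staffNo": "S2"}]]
s_blocks = [[{"branchNo": "B005", "city": "London"},
             {"branchNo": "B003", "city": "Glasgow"}]]

for row in block_nested_loop_join(r_blocks, s_blocks, "branchNo"):
    print(row)
print(block_nested_loop_cost(len(r_blocks), len(s_blocks)))
```

Because the inner relation is re-read once per outer block, the cost formula favours placing the relation with fewer blocks in the outer loop.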

Indexed Nested Loop Join
If there is an index (or hash function) on the join attributes of the inner relation, can use an index lookup. For each tuple in R, use the index to retrieve the matching tuples of S. The cost of scanning R is nBlocks(R), as before. The cost of retrieving the matching tuples in S depends on the type of index and the number of matching tuples. If the join attribute A in S is the primary key, the cost estimate is:
nBlocks(R) + nTuples(R)*(nLevels_A(I) + 1)

Sort-Merge Join
For Equijoins, the most efficient join is when both relations are sorted on the join attributes. Can look for qualifying tuples by merging the relations. May need to sort the relations first. Now tuples with the same join value are in order. If we assume the join is *:* and each set of tuples with the same join value can be held in the database buffer at the same time, then each block of each relation need only be read once.

Sort-Merge Join
The cost estimate for the sort-merge join is:
nBlocks(R) + nBlocks(S)
If a relation has to be sorted, R say, add:
nBlocks(R)*⌈log2(nBlocks(R))⌉

Hash Join
For a Natural join or Equijoin, a hash join may be used. The idea is to partition the relations according to some hash function that provides uniformity and randomness. Each equivalent partition should hold the same value for the join attributes, although it may hold more than one value. The cost estimate of the hash join is:
3(nBlocks(R) + nBlocks(S))
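To compare the join strategies on the same statistics, the cost formulas can be written as functions. The sketch below only transcribes the estimates stated above; the function names and the example statistics are assumptions made for illustration.

```python
import math


def buffered_block_nested_loop(nblocks_r: int, nblocks_s: int, n_buffer: int) -> int:
    """R is read in chunks of (nBuffer - 2) blocks; S is re-read per chunk."""
    return nblocks_r + math.ceil(nblocks_s * (nblocks_r / (n_buffer - 2)))


def indexed_nested_loop_pk(nblocks_r: int, ntuples_r: int, n_levels: int) -> int:
    """Join attribute is the primary key of S: one index probe per tuple of R."""
    return nblocks_r + ntuples_r * (n_levels + 1)


def sort_cost(nblocks: int) -> int:
    """Cost added when a relation must be sorted first."""
    return nblocks * math.ceil(math.log2(nblocks))


def sort_merge(nblocks_r: int, nblocks_s: int,
               sort_r: bool = False, sort_s: bool = False) -> int:
    cost = nblocks_r + nblocks_s
    if sort_r:
        cost += sort_cost(nblocks_r)
    if sort_s:
        cost += sort_cost(nblocks_s)
    return cost


def hash_join(nblocks_r: int, nblocks_s: int) -> int:
    return 3 * (nblocks_r + nblocks_s)


# Illustrative statistics (assumed): R has 3000 tuples in 100 blocks,
# S has 400 blocks, and the index on S has 2 levels.
print(buffered_block_nested_loop(100, 400, n_buffer=52))
print(indexed_nested_loop_pk(100, 3000, n_levels=2))
print(sort_merge(100, 400), sort_merge(100, 400, sort_r=True, sort_s=True))
print(hash_join(100, 400))
```

For these figures the sort-merge join is cheapest when both inputs are already sorted, while the hash join wins if both would have to be sorted first.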

Projection Operation
To implement projection, need to:
- remove attributes that are not required;
- eliminate any duplicate tuples produced from the previous step. This is only required if the projection attributes do not include a key.
There are two main approaches to eliminating duplicates:
- sorting;
- hashing.

Estimating Cardinality of Projection
When the projection contains a key, the cardinality is:
nTuples(S) = nTuples(R)
If the projection consists of a single non-key attribute, the estimate is:
nTuples(S) = SC_A(R)
Otherwise, could estimate the cardinality as:
nTuples(S) ≤ min(nTuples(R), Π_{i=1}^{m} nDistinct_{a_i}(R))

Duplicate Elimination using Sorting
Sort the tuples of the reduced relation using all remaining attributes as the sort key. Duplicates will now be adjacent and can be removed easily. The estimated cost of sorting is:
nBlocks(R)*⌈log2(nBlocks(R))⌉
The combined cost is:
nBlocks(R) + nBlocks(R)*⌈log2(nBlocks(R))⌉

Duplicate Elimination using Hashing
There are two phases: partitioning and duplicate elimination. In the partitioning phase, for each tuple in R, remove the unwanted attributes, apply a hash function to the combination of remaining attributes, and write the reduced tuple to the hashed value. Two tuples that belong to different partitions are guaranteed not to be duplicates. The estimated cost is:
nBlocks(R) + nb
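Both duplicate-elimination approaches can be sketched in memory. This is an illustration only: the helper names, the number of partitions, and the sample rows are assumptions, and a real implementation would write partitions to disk rather than keep them in lists.

```python
from itertools import groupby


def project(rows, attrs):
    """Remove unwanted attributes, keeping duplicates for now."""
    return [tuple(row[a] for a in attrs) for row in rows]


def distinct_by_sorting(tuples):
    """Sort on all remaining attributes; duplicates become adjacent."""
    return [key for key, _ in groupby(sorted(tuples))]


def distinct_by_hashing(tuples, n_partitions=4):
    """Partition by hash value, then remove duplicates within each partition."""
    partitions = [[] for _ in range(n_partitions)]
    for t in tuples:
        partitions[hash(t) % n_partitions].append(t)
    result = []
    for partition in partitions:
        seen = set()  # equal tuples always hash to the same partition
        for t in partition:
            if t not in seen:
                seen.add(t)
                result.append(t)
    return result


# Made-up sample rows.
staff = [
    {"staffNo": "S1", "position": "Manager", "branchNo": "B005"},
    {"staffNo": "S2", "position": "Assistant", "branchNo": "B005"},
    {"staffNo": "S3", "position": "Manager", "branchNo": "B005"},
    {"staffNo": "S4", "position": "Manager", "branchNo": "B003"},
]
reduced = project(staff, ["position", "branchNo"])
print(distinct_by_sorting(reduced))
print(sorted(distinct_by_hashing(reduced)))
```

The sorting variant returns its result in sorted order, which can be useful to later operations; the hashing variant returns the same set of tuples in an order that depends on the hash function.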

Set Operations
Can be implemented by sorting both relations on the same attributes, and scanning through each of the sorted relations once to obtain the desired result. Could use the sort-merge join as a basis. The estimated cost in all cases is:
nBlocks(R) + nBlocks(S) + nBlocks(R)*⌈log2(nBlocks(R))⌉ + nBlocks(S)*⌈log2(nBlocks(S))⌉
Could also use a hashing algorithm.

Estimating Cardinality of Set Operations
As duplicates are eliminated when performing Union, it is difficult to estimate the cardinality, but we can give an upper and lower bound:
max(nTuples(R), nTuples(S)) ≤ nTuples(T) ≤ nTuples(R) + nTuples(S)
For Set difference, we can also give an upper and lower bound:
0 ≤ nTuples(T) ≤ nTuples(R)

Aggregate Operations

SELECT AVG(salary)
FROM Staff;

To implement this query, could scan the entire Staff relation and maintain a running count of the number of tuples read and the sum of all salaries. It is easy to compute the average from these two running counts.

Aggregate Operations

SELECT AVG(salary)
FROM Staff
GROUP BY branchNo;

For grouping queries, can use sorting or hashing algorithms similar to those for duplicate elimination. Can estimate the cardinality of the result using the estimates derived earlier for selection.
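The running-count approach to AVG, and its hash-based extension to GROUP BY, can be sketched as below. The rows are made-up sample data and the function names are chosen for this example; a dictionary keyed on branchNo stands in for the hash partitions.

```python
def average_salary(staff):
    """Single scan keeping a running count and a running sum."""
    count, total = 0, 0.0
    for row in staff:
        count += 1
        total += row["salary"]
    return total / count if count else None


def average_salary_by_branch(staff):
    """Hash-based grouping: one (count, sum) pair per branchNo."""
    groups = {}
    for row in staff:
        count, total = groups.get(row["branchNo"], (0, 0.0))
        groups[row["branchNo"]] = (count + 1, total + row["salary"])
    return {branch: total / count for branch, (count, total) in groups.items()}


# Made-up sample rows.
staff = [
    {"staffNo": "S1", "branchNo": "B005", "salary": 30000},
    {"staffNo": "S2", "branchNo": "B003", "salary": 12000},
    {"staffNo": "S3", "branchNo": "B003", "salary": 18000},
]
print(average_salary(staff))
print(average_salary_by_branch(staff))
```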

Enumeration of Alternative Strategies
Fundamental to the efficiency of QO are the search space of possible execution strategies and the enumeration algorithm used to search this space. A query with 2 joins gives 12 join orderings:
R ⋈ (S ⋈ T)   R ⋈ (T ⋈ S)   (S ⋈ T) ⋈ R   (T ⋈ S) ⋈ R
S ⋈ (R ⋈ T)   S ⋈ (T ⋈ R)   (R ⋈ T) ⋈ S   (T ⋈ R) ⋈ S
T ⋈ (R ⋈ S)   T ⋈ (S ⋈ R)   (R ⋈ S) ⋈ T   (S ⋈ R) ⋈ T
With n relations, there are (2(n − 1))!/(n − 1)! orderings. If n = 4 this is 120; if n = 10 this is more than 17.6 billion. The problem is compounded by the different selection/join methods.

Pipelining
Materialization - the output of one operation is stored in a temporary relation for processing by the next. Could also pipeline the results of one operation to another without creating a temporary relation. This is known as pipelining or on-the-fly processing. Pipelining can save on the cost of creating temporary relations and reading the results back in again. Generally, a pipeline is implemented as a separate process or thread.
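The growth in the number of join orderings under Enumeration of Alternative Strategies can be checked by evaluating the formula (2(n − 1))!/(n − 1)! directly. The function name below is invented for this sketch.

```python
from math import factorial


def join_orderings(n_relations: int) -> int:
    """Number of join orderings for n relations: (2(n - 1))! / (n - 1)!."""
    return factorial(2 * (n_relations - 1)) // factorial(n_relations - 1)


for n in (3, 4, 10):
    print(n, join_orderings(n))
```

Evaluating the formula gives 12 orderings for three relations (two joins), 120 for four, and 17,643,225,600 for ten.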

[Figure: Types of Trees]

Pipelining
With linear trees, the relation on one side of each operator is always a base relation. However, as we need to examine the entire inner relation for each tuple of the outer relation, inner relations must always be materialized. This makes left-deep trees appealing, as the inner relations are always base relations. This reduces the search space for the optimum strategy, and allows QO to use dynamic processing. Not all execution strategies are considered.

Physical Operators & Strategies
The term physical operator refers to a specific algorithm that implements a logical operation, such as selection or join. For example, can use sort-merge join to implement the join operation. Replacing the logical operations in a R.A.T. with physical operators produces an execution strategy (or query evaluation plan or access plan).

[Figure: Physical Operators & Strategies]

Reducing the Search Space
Restriction 1: Unary operations are processed on-the-fly: selections are processed as relations are accessed for the first time; projections are processed as the results of other operations are generated.
Restriction 2: Cartesian products are never formed unless the query itself specifies one.
Restriction 3: The inner operand of each join is a base relation, never an intermediate result. This uses the fact that with left-deep trees the inner operand is a base relation and so is already materialized.
Restriction 3 excludes many alternative strategies but significantly reduces the number to be considered.

Dynamic Programming
Enumeration of left-deep trees using dynamic programming was first proposed for the System R QO. The algorithm is based on the assumption that the cost model satisfies the principle of optimality. Thus, to obtain the optimal strategy for a query with n joins, only need to consider the optimal strategies for subexpressions with (n − 1) joins and extend those strategies with an additional join. The remaining suboptimal strategies can be discarded.

Dynamic Programming
To ensure that some potentially useful strategies are not discarded, the algorithm retains strategies with interesting orders: an intermediate result has an interesting order if it is sorted by a final ORDER BY attribute, a GROUP BY attribute, or any attributes that participate in subsequent joins.

Dynamic Programming

SELECT p.propertyNo, p.street
FROM Client c, Viewing v, PropertyForRent p
WHERE c.maxRent < 500 AND
      c.clientNo = v.clientNo AND
      v.propertyNo = p.propertyNo;

The attributes c.clientNo, v.clientNo, v.propertyNo, and p.propertyNo are interesting. If any intermediate result is sorted on any of these attributes, then the corresponding partial strategy must be included in the search.

Dynamic Programming
The algorithm proceeds from the bottom up and constructs all alternative join trees that satisfy the restrictions above, as follows:
Pass 1: Enumerate the strategies for each base relation using a linear search and all available indexes on the relation. These partial strategies are partitioned into equivalence classes based on any interesting orders. An additional equivalence class is created for the partial strategies with no interesting order.

Dynamic Programming
For each equivalence class, the strategy with the lowest cost is retained for consideration in the next pass. Do not retain the equivalence class with no interesting order if its lowest cost strategy is not lower than all other strategies. For a given relation R, any selections involving only attributes of R are processed on-the-fly. Similarly, any attributes of R that are not part of the SELECT clause and do not contribute to any subsequent join can be projected out at this stage (restriction 1 above).

Dynamic Programming
Pass 2: Generate all 2-relation strategies by considering each strategy retained after Pass 1 as the outer relation, discarding any Cartesian products generated (restriction 2 above). Again, any on-the-fly processing is performed and the lowest cost strategy in each equivalence class is retained.
Pass n: Generate all n-relation strategies by considering each strategy retained after Pass (n − 1) as the outer relation, discarding any Cartesian products generated. After pruning, we now have the lowest overall strategy for processing the query.

Dynamic Programming
Although the algorithm is still exponential, there are query forms for which it only generates O(n^3) strategies, so for n = 10 the number is 1,000, which is significantly better than the more than 17.6 billion join orders given by the formula (2(n − 1))!/(n − 1)!.
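The pass structure of the dynamic programming algorithm can be sketched in a simplified form. The sketch below is not the System R algorithm itself: it ignores interesting orders and access paths, keeps a single best left-deep plan per set of relations, and uses a hypothetical cost model (the sum of estimated intermediate result sizes). The function name, cardinalities, and selectivities are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, Tuple

Plan = Tuple[float, float, Tuple[str, ...]]  # (cost, estimated rows, join order)


def best_left_deep_plan(cards: Dict[str, int],
                        selectivity: Dict[FrozenSet[str], float]) -> Plan:
    """Return the cheapest left-deep join order under the toy cost model."""
    # Pass 1: one plan per base relation (cost 0 in this simplified model).
    best: Dict[FrozenSet[str], Plan] = {
        frozenset([r]): (0.0, float(n), (r,)) for r, n in cards.items()
    }
    # Pass k: extend each retained (k - 1)-relation plan with one base relation.
    for size in range(2, len(cards) + 1):
        candidates: Dict[FrozenSet[str], Plan] = {}
        for subset, (cost, rows, order) in best.items():
            if len(subset) != size - 1:
                continue
            for r in cards:
                if r in subset:
                    continue
                sel, connected = 1.0, False
                for s in subset:
                    pair = frozenset([r, s])
                    if pair in selectivity:
                        sel *= selectivity[pair]
                        connected = True
                if not connected:
                    continue  # restriction 2: never form a Cartesian product
                new_rows = rows * cards[r] * sel
                new_cost = cost + new_rows
                new_set = subset | {r}
                # Principle of optimality: keep only the cheapest plan per set.
                if new_set not in candidates or new_cost < candidates[new_set][0]:
                    candidates[new_set] = (new_cost, new_rows, order + (r,))
        best.update(candidates)
    return best[frozenset(cards)]  # raises KeyError if the join graph is disconnected


# Hypothetical statistics for the three relations in the example query.
cards = {"Client": 50, "Viewing": 1000, "PropertyForRent": 200}
selectivity = {
    frozenset(["Client", "Viewing"]): 0.01,
    frozenset(["Viewing", "PropertyForRent"]): 0.005,
}
cost, rows, order = best_left_deep_plan(cards, selectivity)
print(order, cost, rows)
```

With these made-up statistics, joining Client with Viewing first yields a smaller intermediate result than joining Viewing with PropertyForRent, so the plan that adds PropertyForRent last is retained.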

Semantic Query Optimization
Based on constraints specified on the database schema to reduce the search space. For example, if a constraint states that staff cannot supervise more than 100 properties, any query searching for staff who supervise more than 100 properties will produce zero rows. Now consider:

CREATE ASSERTION ManagerSalary
CHECK (salary > AND position = 'Manager')

SELECT s.staffNo, fName, lName, propertyNo
FROM Staff s, PropertyForRent p
WHERE s.staffNo = p.staffNo AND position = 'Manager';

Semantic Query Optimization
Can rewrite this query as:

SELECT s.staffNo, fName, lName, propertyNo
FROM Staff s, PropertyForRent p
WHERE s.staffNo = p.staffNo AND salary > AND position = 'Manager';

The additional predicate may be very useful if the only index for Staff is a B+-tree on the salary attribute. However, the additional predicate would complicate the query if no such index existed.

Query Optimization in Oracle
Oracle supports two approaches to query optimization: rule-based and cost-based.
Rule-based: 15 rules, ranked in order of efficiency. A particular access path for a table is only chosen if the statement contains a predicate or other construct that makes that access path available. A score is assigned to each execution strategy using these rankings, and the strategy with the best (lowest) score is selected.

QO in Oracle - Rule-Based
When 2 strategies have the same score, the tie-break is resolved by making a decision based on the order in which the tables occur in the SQL statement.

QO in Oracle - Rule-Based: Example

SELECT propertyNo
FROM PropertyForRent
WHERE rooms > 7 AND city = 'London'

- Single-column access path using the index on city from the WHERE condition (city = 'London'). Rank 9.
- Unbounded range scan using the index on rooms from the WHERE condition (rooms > 7). Rank 11.
- Full table scan. Rank 15.
Although there is an index on propertyNo, the column does not appear in the WHERE clause and so is not considered by the optimizer. Based on these paths, the rule-based optimizer will choose to use the index on the city column.

QO in Oracle - Cost-Based
To improve QO, Oracle introduced the cost-based optimizer in Oracle 7, which selects the strategy that requires the minimal resource use necessary to process all rows accessed by the query (avoiding the above tie-break anomaly). The user can select whether minimal resource usage is based on throughput or on response time by setting the OPTIMIZER_MODE initialization parameter. The cost-based optimizer also takes into consideration hints that the user may provide.

QO in Oracle: Statistics
The cost-based optimizer depends on statistics for all tables, clusters, and indexes accessed by the query. It is the user's responsibility to generate these statistics and keep them current. The package DBMS_STATS can be used to generate and manage statistics. Whenever possible, Oracle uses a parallel method to gather statistics, although index statistics are collected serially.
EXECUTE DBMS_STATS.GATHER_SCHEMA_STATS('Manager');

QO in Oracle: Histograms
Earlier cost estimates assumed that the data values within the columns of a table are uniformly distributed. A histogram of values and their relative frequencies gives the optimizer improved selectivity estimates when the distribution is non-uniform.
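To see why the uniformity assumption can mislead, the sketch below compares the two selectivity estimates for the predicate rooms > 7. The frequency table `ROOM_COUNTS` is invented for illustration (100 rows, values 1 to 10) and does not come from the slides; the function names are likewise hypothetical.

```python
from typing import Dict

# Hypothetical non-uniform distribution of the rooms attribute (100 rows in total).
ROOM_COUNTS: Dict[int, int] = {1: 5, 2: 5, 3: 10, 4: 20, 5: 25,
                               6: 15, 7: 10, 8: 5, 9: 3, 10: 2}


def uniform_selectivity(low: int, high: int, pred_low: int, pred_high: int) -> float:
    """Estimate the fraction of rows with pred_low <= value <= pred_high,
    assuming every integer value in [low, high] is equally frequent."""
    matching_values = max(0, min(high, pred_high) - max(low, pred_low) + 1)
    return matching_values / (high - low + 1)


def histogram_selectivity(counts: Dict[int, int], pred_low: int, pred_high: int) -> float:
    """Estimate the same fraction from per-value frequencies."""
    total = sum(counts.values())
    matching = sum(c for v, c in counts.items() if pred_low <= v <= pred_high)
    return matching / total


# rooms > 7 is the integer range 8..10
uniform_estimate = uniform_selectivity(1, 10, 8, 10)
histogram_estimate = histogram_selectivity(ROOM_COUNTS, 8, 10)
print(uniform_estimate, histogram_estimate)
```

With this invented data the uniform estimate is three times the estimate from the actual frequencies, which is the kind of gap that can lead an optimizer to a poor plan.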

QO in Oracle: Histograms
Figure: (a) uniform distribution of rooms; (b) actual non-uniform distribution.
Distribution (a) can be stored compactly as a low value (1), a high value (10), and a total count of all frequencies (in this case, 100).

QO in Oracle: Histograms
A histogram is a data structure that can improve estimates of the number of tuples in a result. There are two types of histogram:
width-balanced histogram, which divides the data into a fixed number of equal-width ranges (called buckets), each containing a count of the number of values falling within that bucket;
height-balanced histogram, which places approximately the same number of values in each bucket, so that the end points of each bucket are determined by how many values are in that bucket.
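The two histogram types can be built as follows. This is a simplified sketch, assuming integer values, a value range that divides evenly into the buckets, and a row count that divides evenly as well; the data set is invented (100 rows with values 1 to 10) and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical frequencies for the rooms attribute (100 rows in total).
ROOM_COUNTS: Dict[int, int] = {1: 5, 2: 5, 3: 10, 4: 20, 5: 25,
                               6: 15, 7: 10, 8: 5, 9: 3, 10: 2}
values: List[int] = [v for v, c in ROOM_COUNTS.items() for _ in range(c)]


def width_balanced(data: List[int], low: int, high: int,
                   n_buckets: int) -> List[Tuple[Tuple[int, int], int]]:
    """Equal-width buckets; each bucket stores how many values fall inside it."""
    width = (high - low + 1) // n_buckets  # assumes the range divides evenly
    counts = [0] * n_buckets
    for v in data:
        counts[(v - low) // width] += 1
    return [((low + i * width, low + (i + 1) * width - 1), c)
            for i, c in enumerate(counts)]


def height_balanced(data: List[int], n_buckets: int) -> List[int]:
    """Equal-count buckets; each bucket is described by its end-point value."""
    ordered = sorted(data)
    per_bucket = len(ordered) // n_buckets  # assumes the row count divides evenly
    return [ordered[(i + 1) * per_bucket - 1] for i in range(n_buckets)]


print(width_balanced(values, 1, 10, 5))
print(height_balanced(values, 5))
```

In the width-balanced form the bucket boundaries are fixed and the counts vary; in the height-balanced form every bucket holds 20 values and the end points vary, so frequent values show up as narrow buckets.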

QO in Oracle: Histograms
Figure: (a) width-balanced histogram for rooms with 5 buckets, each bucket of equal width covering 2 values (1-2, 3-4, etc.); (b) height-balanced histogram, where the height of each column is 20 (100/5).

QO in Oracle: Viewing Execution Plan

Introduction to Query Processing and Query Optimization Techniques. Copyright 2011 Ramez Elmasri and Shamkant Navathe

Introduction to Query Processing and Query Optimization Techniques. Copyright 2011 Ramez Elmasri and Shamkant Navathe Introduction to Query Processing and Query Optimization Techniques Outline Translating SQL Queries into Relational Algebra Algorithms for External Sorting Algorithms for SELECT and JOIN Operations Algorithms

More information

CS317 File and Database Systems

CS317 File and Database Systems CS317 File and Database Systems Lecture 3 Relational Calculus and Algebra Part-2 September 10, 2017 Sam Siewert RDBMS Fundamental Theory http://dilbert.com/strips/comic/2008-05-07/ Relational Algebra and

More information

Chapter 12: Query Processing

Chapter 12: Query Processing Chapter 12: Query Processing Overview Catalog Information for Cost Estimation $ Measures of Query Cost Selection Operation Sorting Join Operation Other Operations Evaluation of Expressions Transformation

More information

Algorithms for Query Processing and Optimization. 0. Introduction to Query Processing (1)

Algorithms for Query Processing and Optimization. 0. Introduction to Query Processing (1) Chapter 19 Algorithms for Query Processing and Optimization 0. Introduction to Query Processing (1) Query optimization: The process of choosing a suitable execution strategy for processing a query. Two

More information

Query Processing & Optimization

Query Processing & Optimization Query Processing & Optimization 1 Roadmap of This Lecture Overview of query processing Measures of Query Cost Selection Operation Sorting Join Operation Other Operations Evaluation of Expressions Introduction

More information

Relational Algebra and Relational Calculus. Pearson Education Limited 1995,

Relational Algebra and Relational Calculus. Pearson Education Limited 1995, Relational Algebra and Relational Calculus 1 Objectives Meaning of the term relational completeness. How to form queries in relational algebra. How to form queries in tuple relational calculus. How to

More information

Chapter 3. Algorithms for Query Processing and Optimization

Chapter 3. Algorithms for Query Processing and Optimization Chapter 3 Algorithms for Query Processing and Optimization Chapter Outline 1. Introduction to Query Processing 2. Translating SQL Queries into Relational Algebra 3. Algorithms for External Sorting 4. Algorithms

More information

What happens. 376a. Database Design. Execution strategy. Query conversion. Next. Two types of techniques

What happens. 376a. Database Design. Execution strategy. Query conversion. Next. Two types of techniques 376a. Database Design Dept. of Computer Science Vassar College http://www.cs.vassar.edu/~cs376 Class 16 Query optimization What happens Database is given a query Query is scanned - scanner creates a list

More information

CMP-3440 Database Systems

CMP-3440 Database Systems CMP-3440 Database Systems Relational DB Languages Relational Algebra, Calculus, SQL Lecture 05 zain 1 Introduction Relational algebra & relational calculus are formal languages associated with the relational

More information

STRUCTURED QUERY LANGUAGE (SQL)

STRUCTURED QUERY LANGUAGE (SQL) STRUCTURED QUERY LANGUAGE (SQL) EGCO321 DATABASE SYSTEMS KANAT POOLSAWASD DEPARTMENT OF COMPUTER ENGINEERING MAHIDOL UNIVERSITY SQL TIMELINE SCOPE OF SQL THE ISO SQL DATA TYPES SQL identifiers are used

More information

Chapter 12: Query Processing

Chapter 12: Query Processing Chapter 12: Query Processing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Overview Chapter 12: Query Processing Measures of Query Cost Selection Operation Sorting Join

More information

Chapter 12: Query Processing. Chapter 12: Query Processing

Chapter 12: Query Processing. Chapter 12: Query Processing Chapter 12: Query Processing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 12: Query Processing Overview Measures of Query Cost Selection Operation Sorting Join

More information

Chapter 6. SQL Data Manipulation

Chapter 6. SQL Data Manipulation Chapter 6 SQL Data Manipulation Pearson Education 2014 Chapter 6 - Objectives Purpose and importance of SQL. How to retrieve data from database using SELECT and: Use compound WHERE conditions. Sort query

More information

Chapter 13: Query Processing

Chapter 13: Query Processing Chapter 13: Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 13.1 Basic Steps in Query Processing 1. Parsing

More information

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe CHAPTER 19 Query Optimization Introduction Query optimization Conducted by a query optimizer in a DBMS Goal: select best available strategy for executing query Based on information available Most RDBMSs

More information

Query Processing. Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016

Query Processing. Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016 Query Processing Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016 Slides re-used with some modification from www.db-book.com Reference: Database System Concepts, 6 th Ed. By Silberschatz,

More information

! A relational algebra expression may have many equivalent. ! Cost is generally measured as total elapsed time for

! A relational algebra expression may have many equivalent. ! Cost is generally measured as total elapsed time for Chapter 13: Query Processing Basic Steps in Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 1. Parsing and

More information

Chapter 13: Query Processing Basic Steps in Query Processing

Chapter 13: Query Processing Basic Steps in Query Processing Chapter 13: Query Processing Basic Steps in Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 1. Parsing and

More information

Database System Concepts

Database System Concepts Chapter 13: Query Processing s Departamento de Engenharia Informática Instituto Superior Técnico 1 st Semester 2008/2009 Slides (fortemente) baseados nos slides oficiais do livro c Silberschatz, Korth

More information

Advanced Database Systems

Advanced Database Systems Lecture IV Query Processing Kyumars Sheykh Esmaili Basic Steps in Query Processing 2 Query Optimization Many equivalent execution plans Choosing the best one Based on Heuristics, Cost Will be discussed

More information

Hash-Based Indexing 165

Hash-Based Indexing 165 Hash-Based Indexing 165 h 1 h 0 h 1 h 0 Next = 0 000 00 64 32 8 16 000 00 64 32 8 16 A 001 01 9 25 41 73 001 01 9 25 41 73 B 010 10 10 18 34 66 010 10 10 18 34 66 C Next = 3 011 11 11 19 D 011 11 11 19

More information

Chapter 5. Relational Algebra and Relational Calculus

Chapter 5. Relational Algebra and Relational Calculus Chapter 5 Relational Algebra and Relational Calculus Overview The previous chapter covers the relational model, which provides a formal description of the structure of a database This chapter covers the

More information

Outline. Query Processing Overview Algorithms for basic operations. Query optimization. Sorting Selection Join Projection

Outline. Query Processing Overview Algorithms for basic operations. Query optimization. Sorting Selection Join Projection Outline Query Processing Overview Algorithms for basic operations Sorting Selection Join Projection Query optimization Heuristics Cost-based optimization 19 Estimate I/O Cost for Implementations Count

More information

CS317 File and Database Systems

CS317 File and Database Systems CS317 File and Database Systems Lecture 3 Relational Calculus and Algebra Part-2 September 7, 2018 Sam Siewert RDBMS Fundamental Theory http://dilbert.com/strips/comic/2008-05-07/ Relational Algebra and

More information

Announcement. Reading Material. Overview of Query Evaluation. Overview of Query Evaluation. Overview of Query Evaluation 9/26/17

Announcement. Reading Material. Overview of Query Evaluation. Overview of Query Evaluation. Overview of Query Evaluation 9/26/17 Announcement CompSci 516 Database Systems Lecture 10 Query Evaluation and Join Algorithms Project proposal pdf due on sakai by 5 pm, tomorrow, Thursday 09/27 One per group by any member Instructor: Sudeepa

More information

Query Processing & Optimization. CS 377: Database Systems

Query Processing & Optimization. CS 377: Database Systems Query Processing & Optimization CS 377: Database Systems Recap: File Organization & Indexing Physical level support for data retrieval File organization: ordered or sequential file to find items using

More information

Query optimization. Elena Baralis, Silvia Chiusano Politecnico di Torino. DBMS Architecture D B M G. Database Management Systems. Pag.

Query optimization. Elena Baralis, Silvia Chiusano Politecnico di Torino. DBMS Architecture D B M G. Database Management Systems. Pag. Database Management Systems DBMS Architecture SQL INSTRUCTION OPTIMIZER MANAGEMENT OF ACCESS METHODS CONCURRENCY CONTROL BUFFER MANAGER RELIABILITY MANAGEMENT Index Files Data Files System Catalog DATABASE

More information

Contents Contents Introduction Basic Steps in Query Processing Introduction Transformation of Relational Expressions...

Contents Contents Introduction Basic Steps in Query Processing Introduction Transformation of Relational Expressions... Contents Contents...283 Introduction...283 Basic Steps in Query Processing...284 Introduction...285 Transformation of Relational Expressions...287 Equivalence Rules...289 Transformation Example: Pushing

More information

Chapter 13: Query Optimization. Chapter 13: Query Optimization

Chapter 13: Query Optimization. Chapter 13: Query Optimization Chapter 13: Query Optimization Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 13: Query Optimization Introduction Equivalent Relational Algebra Expressions Statistical

More information

ATYPICAL RELATIONAL QUERY OPTIMIZER

ATYPICAL RELATIONAL QUERY OPTIMIZER 14 ATYPICAL RELATIONAL QUERY OPTIMIZER Life is what happens while you re busy making other plans. John Lennon In this chapter, we present a typical relational query optimizer in detail. We begin by discussing

More information

Relational Query Optimization. Overview of Query Evaluation. SQL Refresher. Yanlei Diao UMass Amherst October 23 & 25, 2007

Relational Query Optimization. Overview of Query Evaluation. SQL Refresher. Yanlei Diao UMass Amherst October 23 & 25, 2007 Relational Query Optimization Yanlei Diao UMass Amherst October 23 & 25, 2007 Slide Content Courtesy of R. Ramakrishnan, J. Gehrke, and J. Hellerstein 1 Overview of Query Evaluation Query Evaluation Plan:

More information

Relational Query Optimization

Relational Query Optimization Relational Query Optimization Module 4, Lectures 3 and 4 Database Management Systems, R. Ramakrishnan 1 Overview of Query Optimization Plan: Tree of R.A. ops, with choice of alg for each op. Each operator

More information

Query Optimization. Shuigeng Zhou. December 9, 2009 School of Computer Science Fudan University

Query Optimization. Shuigeng Zhou. December 9, 2009 School of Computer Science Fudan University Query Optimization Shuigeng Zhou December 9, 2009 School of Computer Science Fudan University Outline Introduction Catalog Information for Cost Estimation Estimation of Statistics Transformation of Relational

More information

Introduction Alternative ways of evaluating a given query using

Introduction Alternative ways of evaluating a given query using Query Optimization Introduction Catalog Information for Cost Estimation Estimation of Statistics Transformation of Relational Expressions Dynamic Programming for Choosing Evaluation Plans Introduction

More information

Standard Query Language. SQL: Data Definition Transparencies

Standard Query Language. SQL: Data Definition Transparencies Standard Query Language SQL: Data Definition Transparencies Chapter 6 - Objectives Data types supported by SQL standard. Purpose of integrity enhancement feature of SQL. How to define integrity constraints

More information

Chapter 19 Query Optimization

Chapter 19 Query Optimization Chapter 19 Query Optimization It is an activity conducted by the query optimizer to select the best available strategy for executing the query. 1. Query Trees and Heuristics for Query Optimization - Apply

More information

Query Optimization. Query Optimization. Optimization considerations. Example. Interaction of algorithm choice and tree arrangement.

Query Optimization. Query Optimization. Optimization considerations. Example. Interaction of algorithm choice and tree arrangement. COS 597: Principles of Database and Information Systems Query Optimization Query Optimization Query as expression over relational algebraic operations Get evaluation (parse) tree Leaves: base relations

More information

Advances in Data Management Query Processing and Query Optimisation A.Poulovassilis

Advances in Data Management Query Processing and Query Optimisation A.Poulovassilis 1 Advances in Data Management Query Processing and Query Optimisation A.Poulovassilis 1 General approach to the implementation of Query Processing and Query Optimisation functionalities in DBMSs 1. Parse

More information

Query processing and optimization

Query processing and optimization Query processing and optimization These slides are a modified version of the slides of the book Database System Concepts (Chapter 13 and 14), 5th Ed., McGraw-Hill, by Silberschatz, Korth and Sudarshan.

More information

Advanced Databases. Lecture 4 - Query Optimization. Masood Niazi Torshiz Islamic Azad university- Mashhad Branch

Advanced Databases. Lecture 4 - Query Optimization. Masood Niazi Torshiz Islamic Azad university- Mashhad Branch Advanced Databases Lecture 4 - Query Optimization Masood Niazi Torshiz Islamic Azad university- Mashhad Branch www.mniazi.ir Query Optimization Introduction Transformation of Relational Expressions Catalog

More information

Ch 5 : Query Processing & Optimization

Ch 5 : Query Processing & Optimization Ch 5 : Query Processing & Optimization Basic Steps in Query Processing 1. Parsing and translation 2. Optimization 3. Evaluation Basic Steps in Query Processing (Cont.) Parsing and translation translate

More information

DBMS Y3/S5. 1. OVERVIEW The steps involved in processing a query are: 1. Parsing and translation. 2. Optimization. 3. Evaluation.

DBMS Y3/S5. 1. OVERVIEW The steps involved in processing a query are: 1. Parsing and translation. 2. Optimization. 3. Evaluation. Query Processing QUERY PROCESSING refers to the range of activities involved in extracting data from a database. The activities include translation of queries in high-level database languages into expressions

More information

Review. Relational Query Optimization. Query Optimization Overview (cont) Query Optimization Overview. Cost-based Query Sub-System

Review. Relational Query Optimization. Query Optimization Overview (cont) Query Optimization Overview. Cost-based Query Sub-System Review Relational Query Optimization R & G Chapter 12/15 Implementation of single Relational Operations Choices depend on indexes, memory, stats, Joins Blocked nested loops: simple, exploits extra memory

More information

Parser: SQL parse tree

Parser: SQL parse tree Jinze Liu Parser: SQL parse tree Good old lex & yacc Detect and reject syntax errors Validator: parse tree logical plan Detect and reject semantic errors Nonexistent tables/views/columns? Insufficient

More information

Chapter 6. SQL: SubQueries

Chapter 6. SQL: SubQueries Chapter 6 SQL: SubQueries Pearson Education 2009 Definition A subquery contains one or more nested Select statements Example: List the staff who work in the branch at 163 Main St SELECT staffno, fname,

More information

Query Processing. Solutions to Practice Exercises Query:

Query Processing. Solutions to Practice Exercises Query: C H A P T E R 1 3 Query Processing Solutions to Practice Exercises 13.1 Query: Π T.branch name ((Π branch name, assets (ρ T (branch))) T.assets>S.assets (Π assets (σ (branch city = Brooklyn )(ρ S (branch)))))

More information

CMSC424: Database Design. Instructor: Amol Deshpande

CMSC424: Database Design. Instructor: Amol Deshpande CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu Databases Data Models Conceptual representa1on of the data Data Retrieval How to ask ques1ons of the database How to answer those ques1ons

More information

Chapter 11: Query Optimization

Chapter 11: Query Optimization Chapter 11: Query Optimization Chapter 11: Query Optimization Introduction Transformation of Relational Expressions Statistical Information for Cost Estimation Cost-based optimization Dynamic Programming

More information

Chapter 14 Query Optimization

Chapter 14 Query Optimization Chapter 14 Query Optimization Chapter 14: Query Optimization! Introduction! Catalog Information for Cost Estimation! Estimation of Statistics! Transformation of Relational Expressions! Dynamic Programming

More information

Chapter 14 Query Optimization

Chapter 14 Query Optimization Chapter 14 Query Optimization Chapter 14: Query Optimization! Introduction! Catalog Information for Cost Estimation! Estimation of Statistics! Transformation of Relational Expressions! Dynamic Programming

More information

Chapter 14 Query Optimization

Chapter 14 Query Optimization Chapter 14: Query Optimization Chapter 14 Query Optimization! Introduction! Catalog Information for Cost Estimation! Estimation of Statistics! Transformation of Relational Expressions! Dynamic Programming

More information

Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley. Chapter 6 Outline. Unary Relational Operations: SELECT and

Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley. Chapter 6 Outline. Unary Relational Operations: SELECT and Chapter 6 The Relational Algebra and Relational Calculus Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 6 Outline Unary Relational Operations: SELECT and PROJECT Relational

More information

Overview of Implementing Relational Operators and Query Evaluation

Overview of Implementing Relational Operators and Query Evaluation Overview of Implementing Relational Operators and Query Evaluation Chapter 12 Motivation: Evaluating Queries The same query can be evaluated in different ways. The evaluation strategy (plan) can make orders

More information

Lecture 6 Structured Query Language (SQL)

Lecture 6 Structured Query Language (SQL) ITM661 Database Systems Lecture 6 Structured Query Language (SQL) (Data Definition) T. Connolly, and C. Begg, Database Systems: A Practical Approach to Design, Implementation, and Management, 5th edition,

More information

Relational Query Optimization. Highlights of System R Optimizer

Relational Query Optimization. Highlights of System R Optimizer Relational Query Optimization Chapter 15 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Highlights of System R Optimizer v Impact: Most widely used currently; works well for < 10 joins.

More information

Chapter 18 Strategies for Query Processing. We focus this discussion w.r.t RDBMS, however, they are applicable to OODBS.

Chapter 18 Strategies for Query Processing. We focus this discussion w.r.t RDBMS, however, they are applicable to OODBS. Chapter 18 Strategies for Query Processing We focus this discussion w.r.t RDBMS, however, they are applicable to OODBS. 1 1. Translating SQL Queries into Relational Algebra and Other Operators - SQL is

More information

Database Applications (15-415)

Database Applications (15-415) Database Applications (15-415) DBMS Internals- Part VI Lecture 14, March 12, 2014 Mohammad Hammoud Today Last Session: DBMS Internals- Part V Hash-based indexes (Cont d) and External Sorting Today s Session:

More information

Lecture 5 Data Definition Language (DDL)

Lecture 5 Data Definition Language (DDL) ITM-661 ระบบฐานข อม ล (Database system) Walailak - 2013 Lecture 5 Data Definition Language (DDL) Walailak University T. Connolly, and C. Begg, Database Systems: A Practical Approach to Design, Implementation,

More information

Chapter 12: Indexing and Hashing. Basic Concepts

Chapter 12: Indexing and Hashing. Basic Concepts Chapter 12: Indexing and Hashing! Basic Concepts! Ordered Indices! B+-Tree Index Files! B-Tree Index Files! Static Hashing! Dynamic Hashing! Comparison of Ordered Indexing and Hashing! Index Definition

More information

Faloutsos 1. Carnegie Mellon Univ. Dept. of Computer Science Database Applications. Outline

Faloutsos 1. Carnegie Mellon Univ. Dept. of Computer Science Database Applications. Outline Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications Lecture #14: Implementation of Relational Operations (R&G ch. 12 and 14) 15-415 Faloutsos 1 introduction selection projection

More information

Query Optimization. Schema for Examples. Motivating Example. Similar to old schema; rname added for variations. Reserves: Sailors:

Query Optimization. Schema for Examples. Motivating Example. Similar to old schema; rname added for variations. Reserves: Sailors: Query Optimization Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Schema for Examples (sid: integer, sname: string, rating: integer, age: real) (sid: integer, bid: integer, day: dates,

More information

QUERY OPTIMIZATION E Jayant Haritsa Computer Science and Automation Indian Institute of Science. JAN 2014 Slide 1 QUERY OPTIMIZATION

QUERY OPTIMIZATION E Jayant Haritsa Computer Science and Automation Indian Institute of Science. JAN 2014 Slide 1 QUERY OPTIMIZATION E0 261 Jayant Haritsa Computer Science and Automation Indian Institute of Science JAN 2014 Slide 1 Database Engines Main Components Query Processing Transaction Processing Access Methods JAN 2014 Slide

More information

CS 4604: Introduction to Database Management Systems. B. Aditya Prakash Lecture #10: Query Processing

CS 4604: Introduction to Database Management Systems. B. Aditya Prakash Lecture #10: Query Processing CS 4604: Introduction to Database Management Systems B. Aditya Prakash Lecture #10: Query Processing Outline introduction selection projection join set & aggregate operations Prakash 2018 VT CS 4604 2

More information

Chapter 12: Indexing and Hashing

Chapter 12: Indexing and Hashing Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B+-Tree Index Files B-Tree Index Files Static Hashing Dynamic Hashing Comparison of Ordered Indexing and Hashing Index Definition in SQL

More information

Relational Model: History

Relational Model: History Relational Model: History Objectives of Relational Model: 1. Promote high degree of data independence 2. Eliminate redundancy, consistency, etc. problems 3. Enable proliferation of non-procedural DML s

More information

Schema for Examples. Query Optimization. Alternative Plans 1 (No Indexes) Motivating Example. Alternative Plans 2 With Indexes

Schema for Examples. Query Optimization. Alternative Plans 1 (No Indexes) Motivating Example. Alternative Plans 2 With Indexes Schema for Examples Query Optimization (sid: integer, : string, rating: integer, age: real) (sid: integer, bid: integer, day: dates, rname: string) Similar to old schema; rname added for variations. :

More information

Chapter 14: Query Optimization

Chapter 14: Query Optimization Chapter 14: Query Optimization Database System Concepts 5 th Ed. See www.db-book.com for conditions on re-use Chapter 14: Query Optimization Introduction Transformation of Relational Expressions Catalog

More information

Integration of Transactional Systems

Integration of Transactional Systems Integration of Transactional Systems Distributed Query Processing Robert Wrembel Poznań University of Technology Institute of Computing Science Robert.Wrembel@cs.put.poznan.pl www.cs.put.poznan.pl/rwrembel

More information

Evaluation of Relational Operations

Evaluation of Relational Operations Evaluation of Relational Operations Chapter 14 Comp 521 Files and Databases Fall 2010 1 Relational Operations We will consider in more detail how to implement: Selection ( ) Selects a subset of rows from

More information

Principles of Data Management. Lecture #9 (Query Processing Overview)

Principles of Data Management. Lecture #9 (Query Processing Overview) Principles of Data Management Lecture #9 (Query Processing Overview) Instructor: Mike Carey mjcarey@ics.uci.edu Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Today s Notable News v Midterm

More information

Relational Algebra. Relational Algebra Overview. Relational Algebra Overview. Unary Relational Operations 8/19/2014. Relational Algebra Overview

Relational Algebra. Relational Algebra Overview. Relational Algebra Overview. Unary Relational Operations 8/19/2014. Relational Algebra Overview The Relational Algebra Relational Algebra Relational algebra is the basic set of operations for the relational model These operations enable a user to specify basic retrieval requests (or queries) Relational

More information

CSIT5300: Advanced Database Systems

CSIT5300: Advanced Database Systems CSIT5300: Advanced Database Systems L10: Query Processing Other Operations, Pipelining and Materialization Dr. Kenneth LEUNG Department of Computer Science and Engineering The Hong Kong University of Science

More information

QUERY OPTIMIZATION FOR DATABASE MANAGEMENT SYSTEM BY APPLYING DYNAMIC PROGRAMMING ALGORITHM

QUERY OPTIMIZATION FOR DATABASE MANAGEMENT SYSTEM BY APPLYING DYNAMIC PROGRAMMING ALGORITHM QUERY OPTIMIZATION FOR DATABASE MANAGEMENT SYSTEM BY APPLYING DYNAMIC PROGRAMMING ALGORITHM Wisnu Adityo NIM 13506029 Information Technology Department Institut Teknologi Bandung Jalan Ganesha 10 e-mail:

More information

Principles of Data Management. Lecture #12 (Query Optimization I)

Principles of Data Management. Lecture #12 (Query Optimization I) Principles of Data Management Lecture #12 (Query Optimization I) Instructor: Mike Carey mjcarey@ics.uci.edu Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Today s Notable News v B+ tree

More information

Administrivia. CS 133: Databases. Cost-based Query Sub-System. Goals for Today. Midterm on Thursday 10/18. Assignments

Administrivia. CS 133: Databases. Cost-based Query Sub-System. Goals for Today. Midterm on Thursday 10/18. Assignments Administrivia Midterm on Thursday 10/18 CS 133: Databases Fall 2018 Lec 12 10/16 Prof. Beth Trushkowsky Assignments Lab 3 starts after fall break No problem set out this week Goals for Today Cost-based

More information

Database System Concepts

Database System Concepts Chapter 14: Optimization Departamento de Engenharia Informática Instituto Superior Técnico 1 st Semester 2007/2008 Slides (fortemente) baseados nos slides oficiais do livro c Silberschatz, Korth and Sudarshan.

More information

Ian Kenny. November 28, 2017

Ian Kenny. November 28, 2017 Ian Kenny November 28, 2017 Introductory Databases Relational Algebra Introduction In this lecture we will cover Relational Algebra. Relational Algebra is the foundation upon which SQL is built and is

More information

Query Optimization. Schema for Examples. Motivating Example. Similar to old schema; rname added for variations. Reserves: Sailors:

Query Optimization. Schema for Examples. Motivating Example. Similar to old schema; rname added for variations. Reserves: Sailors: Query Optimization atabase Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Schema for Examples (sid: integer, sname: string, rating: integer, age: real) (sid: integer, bid: integer, day: dates,

More information

Final Exam Review 2. Kathleen Durant CS 3200 Northeastern University Lecture 23

Final Exam Review 2. Kathleen Durant CS 3200 Northeastern University Lecture 23 Final Exam Review 2 Kathleen Durant CS 3200 Northeastern University Lecture 23 QUERY EVALUATION PLAN Representation of a SQL Command SELECT {DISTINCT} FROM {WHERE

More information

3ISY402 DATABASE SYSTEMS

3ISY402 DATABASE SYSTEMS 3ISY402 DATABASE SYSTEMS - SQL: Data Definition 1 Leena Gulabivala Material from essential text: T CONNOLLY & C BEGG. Database Systems A Practical Approach to Design, Implementation and Management, 4th

More information

Query Processing SL03

Query Processing SL03 Distributed Database Systems Fall 2016 Query Processing Overview Query Processing SL03 Distributed Query Processing Steps Query Decomposition Data Localization Query Processing Overview/1 Query processing:

More information

Query Processing and Optimization *

Query Processing and Optimization * OpenStax-CNX module: m28213 1 Query Processing and Optimization * Nguyen Kim Anh This work is produced by OpenStax-CNX and licensed under the Creative Commons Attribution License 3.0 Query processing is

More information

Chapter 11: Indexing and Hashing

Chapter 11: Indexing and Hashing Chapter 11: Indexing and Hashing Basic Concepts Ordered Indices B + -Tree Index Files B-Tree Index Files Static Hashing Dynamic Hashing Comparison of Ordered Indexing and Hashing Index Definition in SQL

More information

CMP-3440 Database Systems

CMP-3440 Database Systems CMP-3440 Database Systems Advanced SQL Lecture 07 zain 1 Select Statement - Aggregates ISO standard defines five aggregate functions: COUNT returns number of values in specified column. SUM returns sum

More information

Review. Support for data retrieval at the physical level:

Review. Support for data retrieval at the physical level: Query Processing Review Support for data retrieval at the physical level: Indices: data structures to help with some query evaluation: SELECTION queries (ssn = 123) RANGE queries (100

More information
