Query Processing Strategies in Distributed Database


Kunal Jamsutkar, M.Tech, Department of Computer Engineering and Information Technology, V.J.T.I., Mumbai
Viki Patil, M.Tech, Department of Computer Engineering and Information Technology, V.J.T.I., Mumbai
Dr. B. B. Meshram, Professor, Department of Computer Engineering and Information Technology, V.J.T.I., Mumbai

ABSTRACT
Query optimization is an important part of a database management system. This paper surveys query optimization technology, drawing on a number of optimization algorithms commonly used for distributed queries, with the aim of arriving at an optimal query processing plan for a given distributed query. Under this approach, query plans whose required data reside close to each other are considered more efficient, so such plans should result in efficient query processing.

Keywords: Database, query processing, distributed query strategy, system model, query processing cost, cost measures.

Introduction
In recent years, with the development of computer network and database technology, distributed databases have become more and more widely used. As applications expand, data queries grow increasingly complex and efficiency requirements increasingly high, so query processing is a key issue in a distributed database system. In a distributed database environment, data is stored at different sites connected through a network. A distributed database management system (DDBMS) supports the creation and maintenance of a distributed database. The research literature proposes a wide variety of query optimization algorithms. Yu and Chang give a comprehensive overview of various query optimization techniques for distributed database management systems [20]. However, this overview does not attempt to develop a model of query optimization that explains and presents the algorithms in a uniform way.
Such an understanding is needed if we want to change or extend existing algorithms to adapt them to new requirements. In this research we consider query processing algorithms for a distributed database system. Much research has been done on distributed query processing methods (see [2], [3]). Increased reliability and performance can also be attained with a distributed database. All database systems must be able to respond to requests for information from the user, i.e. process queries. How a DBMS processes queries and the methods it uses to optimize their performance are the topics covered in this paper. In certain sections, concepts are illustrated with an example. Since many optimization algorithms differ in their computational behavior while also reflecting aspects of the implementation environment, the purpose of this paper is to explain all of them through a few simple concepts. Finally, we summarize our findings and discuss future work.

General aspects of optimization
This section explains what we mean by the terms query, query processing and query optimization. We then discuss the query optimization concepts that can be found in the optimization algorithms described in the cited papers.

Definitions And Examples
A. What Is A Query?
A database query is the act of instructing a DBMS to update or retrieve specific data to or from the physical storage medium. The actual updating and retrieval of data is performed through various low-level operations. For a relational DBMS, examples of such operations are relational algebra operations such as project, join, select and Cartesian product.

B. The Query Processor
A query passes through three phases while the DBMS processes it:
1. Parsing and translation
2. Optimization
3. Evaluation
Most queries submitted to a DBMS are in a high-level language such as SQL.
During the parsing and translation stage, the human-readable form of the query is translated into forms usable by the DBMS. These can be a relational algebra expression, a query tree or a query graph.

Blue Ocean Research Journals 71

Consider the following SQL query:

SELECT make FROM vehicles WHERE make = 'Toyota'

This can be translated into either of the following relational algebra expressions:

$\sigma_{\text{make}=\text{'Toyota'}}(\pi_{\text{make}}(\text{vehicles}))$
$\pi_{\text{make}}(\sigma_{\text{make}=\text{'Toyota'}}(\text{vehicles}))$

It can also be represented as a query graph.

Fig 2. Query Graph

After parsing and translation into a relational algebra expression, the query is transformed into a form, usually a query tree or graph, that can be handled by the optimization engine. The optimization engine then performs various analyses on the query, generating a number of valid evaluation plans. From these, it determines the most appropriate evaluation plan to execute. After the evaluation plan has been selected, it is passed to the DBMS query-execution engine (also referred to as the runtime database processor), where the plan is executed and the results are returned.

B.1- Parsing and Translating the Query
The first step in processing a query submitted to a DBMS is to convert the query into a form usable by the query processing engine. High-level query languages such as SQL represent a query as a string, or sequence, of characters. Certain sequences of characters represent various types of tokens such as keywords, operators, operands and literal strings. As in all languages, there are rules (syntax and grammar) that govern how the tokens can be combined into understandable (i.e. valid) statements. The primary job of the parser is to extract the tokens from the raw string of characters and translate them into the corresponding internal data elements (i.e. relational algebra operations and operands) and structures (i.e. query tree, query graph). The last job of the parser is to verify the validity and syntax of the original query string.

B.2- Optimizing the Query
In this stage, the query processor applies rules to the internal data structures of the query to transform these structures into equivalent but more efficient representations.
The rules can be based upon mathematical models of the relational algebra expression and tree (heuristics), upon cost estimates of different algorithms applied to operations, or upon the semantics within the query and the relations it involves. Selecting the proper rules to apply, when to apply them and how they are applied is the function of the query optimization engine.

B.3- Evaluating the Query
The final step in processing a query is the evaluation phase. The best evaluation plan candidate generated by the optimization engine is selected and then executed. (Note that there can be multiple methods of executing a query. Besides processing a query in a simple sequential manner, some of a query's individual operations can be processed in parallel, either as independent processes or as interdependent pipelines of processes or threads. Regardless of the method chosen, the actual results should be the same.)

C. Query Processing
Query processing is defined as the activities involved in parsing, validating, optimizing and executing a query. The main aim of query processing is to transform a query written in a high-level language (e.g. SQL) into a correct and efficient execution strategy expressed in a low-level language (implementing relational algebra), and to find information in one or more databases and deliver it to the user quickly and efficiently.

Fig.3 Flow of Query Processing (high-level user query, query processor, low-level data manipulation commands)

D. Query Optimization
Query optimization is defined as the activity of choosing an efficient execution strategy for processing a query. Query optimization is a part of query processing. The main aims of query optimization are to choose a transformation that minimizes resource usage, to reduce the total execution time of the query, and to reduce its response time.

Journal of Engineering, Computers & Applied Sciences (JEC&AS) ISSN No:

Distributed Query Processing Methodology:
Distributed query processing contains four stages:
1. Query decomposition
2. Data localization
3. Global optimization
4. Local optimization

D.1- Query decomposition
This stage takes a calculus query on global relations as input and produces an algebraic query as output. It is in turn divided into four steps:
Normalization: manipulate query quantifiers and qualification.
Analysis: detect and reject incorrect queries. This is possible for only a subset of relational calculus.
Simplification: eliminate redundant predicates.
Restructuring: translate the calculus query into an algebraic query. More than one translation is possible, so transformation rules are used.

D.2- Data localization
The input to this stage is an algebraic query on distributed relations and the output is a fragment query. This stage determines which fragments are involved.

D.3- Global optimization
The input to this stage is a fragment query and the output is an optimized fragment query. The best global schedule is found in this stage.

D.4- Local optimization
The input to this stage is the best global execution schedule and the output is a set of locally optimized queries. It contains two sub-stages: selecting the best access path, and applying centralized optimization techniques.

E. Distributed Query Optimization:
Distributed query optimization is defined as finding an efficient execution strategy in a distributed network. Query optimization is difficult in a distributed environment. Distributed query optimization has three components: Access Method, Join Criteria, and Transmission Costs.

Access Method: the methods used to access data in a distributed environment, such as hashing and indexing.

Join Criteria: in a distributed database, data resides at different sites. The join criteria determine how data from the different sites is joined to obtain an optimized result.
Transmission Costs: if data from multiple sites must be joined to satisfy a single query, then the cost of transmitting the results of intermediate steps needs to be factored into the equation. At times, it may be more cost-effective simply to ship entire tables across the network so that processing can occur at a single site, thereby reducing overall transmission costs. This component of query optimization is an issue only in a distributed environment.

There are many distributed query optimization issues; some of them are types of optimizers, optimization granularity, network topologies and optimization timing.

Fig.4 Query processing methodology

3. Optimal Distribution Strategies for Simple Queries
This section considers query optimization algorithms that derive optimal distribution strategies for a class of distributed queries called simple queries.

Various algorithms are used for query optimization. Algorithm PARALLEL [3] derives a minimal response time distribution strategy for any given simple query. The strategy of Algorithm SERIAL [3] consists of transmitting each relation in a serial order. Algorithm GENERAL minimizes response time and total time through three different versions of the algorithm:
A. Response Time Version
B. Total Time Version
C. Handling Redundant Data Transmission

Algorithm-S is a static algorithm, as are PARALLEL, SERIAL, GENERAL, and D. In a static algorithm, the strategy is generated before any transmission or intersite joining takes place. Therefore, the algorithm must include some method for estimating the effect of a semijoin on the parameters.

Related Work
In GENERAL, the query is decomposed into single-joining-attribute subqueries. Candidate schedules are generated for each subquery separately. There is an integration step but no synchronization step. By contrast, Algorithm-S uses a more precise interpretation of attribute independence, which takes into account forced reductions in the projected size of nonjoining attributes with low value multiplicity (keys, for instance). Since reductions are not restricted to single attributes, the decomposition into subqueries is no longer desirable and is not done. The integration step that follows is very similar to that of GENERAL. The final SYNCHRONIZE step is used to detect beneficial semijoin delays that might have been missed because integrated schedules are generated for each relation separately. Modifying and extending GENERAL in this way yields different strategies with reduced costs. These substantial cost savings show up under the response time minimization objective as well as the total time minimization objective. For most complex queries, Algorithm-S provides the same strategy whether the response-time or total-time version is used.
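The semijoin reductions that these algorithms schedule can be illustrated with a small sketch. The Python code below is only a minimal illustration under stated assumptions, not the procedure of PARALLEL, SERIAL, GENERAL or Algorithm-S: the relation contents, the attribute name student_id and the helper functions semijoin and join are hypothetical, and the two sites are simulated as in-memory lists.

```python
from typing import Dict, List

Row = Dict[str, object]


def semijoin(r: List[Row], s: List[Row], attr: str) -> List[Row]:
    """Return the rows of r whose attr value also occurs in s (r semijoin s)."""
    # Only this projection of s on the joining attribute has to cross the network.
    s_values = {row[attr] for row in s}
    return [row for row in r if row[attr] in s_values]


def join(r: List[Row], s: List[Row], attr: str) -> List[Row]:
    """Plain hash join of r and s on a single attribute."""
    index: Dict[object, List[Row]] = {}
    for row in s:
        index.setdefault(row[attr], []).append(row)
    return [{**a, **b} for a in r for b in index.get(a[attr], [])]


# Hypothetical data: enrollment is held at Site 1, student at Site 2.
enrollment = [
    {"student_id": 1, "course": "PHYS101"},
    {"student_id": 3, "course": "PHYS101"},
    {"student_id": 3, "course": "PHYS202"},
]
student = [
    {"student_id": 1, "name": "Asha"},
    {"student_id": 2, "name": "Ben"},
    {"student_id": 3, "name": "Chen"},
    {"student_id": 4, "name": "Dana"},
]

# Step 1: reduce student at Site 2 using only the student_id values of enrollment.
reduced_student = semijoin(student, enrollment, "student_id")

# Step 2: ship the reduced relation to Site 1 and finish the join there.
result = join(enrollment, reduced_student, "student_id")
```

In this toy data set only two of the four student rows would need to be shipped, and the final join is unchanged. A static algorithm has to estimate that reduction before any data moves, which is why the effect of a semijoin on the size parameters must be estimated.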
A.1- AN OPTIMIZATION EXAMPLE
Assume that the COURSE table and the ENROLLMENT table exist at Site 1 and the STUDENT table exists at Site 2. The query joins all three tables to find seniors enrolled in physics courses. If either all of the tables existed at a single site, or the DBMS supported distributed multi-site requests, the DBMS itself could optimize the join. However, if the DBMS cannot perform (or optimize) distributed multi-site requests, programmatic optimization must be performed. There are at least six different ways to go about optimizing this three-table join.

Option 1: Start with Site 1 and join COURSE and ENROLLMENT, selecting only physics courses. For each qualifying row, move it to Site 2 to be joined with STUDENT to see if any are seniors.
Option 2: Start with Site 1 and join COURSE and ENROLLMENT, selecting only physics courses, and move the entire result set to Site 2 to be joined with STUDENT, checking for senior students only.
Option 3: Start with Site 2 and select only seniors from STUDENT. For each of these, examine the join of COURSE and ENROLLMENT at Site 1 for physics classes.
Option 4: Start with Site 2 and select only seniors from STUDENT, and move the entire result set to Site 1 to be joined with COURSE and ENROLLMENT, checking for physics classes only.
Option 5: Move the COURSE and ENROLLMENT tables to Site 2 and proceed with a local three-table join.
Option 6: Move the STUDENT table to Site 1 and proceed with a local three-table join.

Which of these six options will perform the best? Unfortunately, the only correct answer is "It depends." The optimal choice will depend upon:
1. the size of the tables;
2. the size of the result sets, that is, the number of qualifying rows and their length in bytes; and
3. the efficiency of the network.

B. THE ROLE OF INDEXES
The utilization of indexes can dramatically reduce the execution time of various operations such as select and join. Let us review some of the types of index file structures and the roles they play in reducing execution time and overhead:

Dense Index: The data file is ordered by the search key and every search key value has a separate index record. This structure requires only a single seek to find the first occurrence of a set of contiguous records with the desired search value.

Sparse Index: The data file is ordered by the index search key and only some of the search key values have corresponding index records. Each index record's data-file pointer points to the first data-file record with the search key value. While this structure can be less efficient (in terms of number of disk accesses) than a dense index for finding the desired records, it requires less storage space and less overhead during insertion and deletion operations.

Primary Index: The data file is ordered by the attribute that is also the search key in the index file. Primary indices can be dense or sparse. This is also referred to as an Index-Sequential File [5]. For scanning through a relation's records in sequential order by a key value, this is one of the fastest and most efficient structures: locating a record has a cost of 1 seek, and the contiguous layout of the records in sorted order minimizes the number of blocks that have to be read. However, after large numbers of insertions and deletions, performance can degrade quite quickly, and the only way to restore it is to reorganize the file.

Secondary Index: The data file is ordered by an attribute that is different from the search key in the index file. Secondary indices must be dense.

Multi-Level Index: An index structure consisting of 2 or more tiers of records, where an upper tier's records point to associated index records of the tier below. The bottom tier's index records contain the pointers to the data-file records. Multi-level indices can be used, for instance, to reduce the number of disk block reads needed during a binary search.
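As an illustration of the dense and sparse index structures described above, the following Python sketch contrasts the two lookups. It is a simplified in-memory model under stated assumptions, not a description of any particular DBMS: the block layout, the sample keys and the function names sparse_lookup and dense_lookup are hypothetical, search keys are assumed unique, and a dictionary stands in for the ordered dense index file.

```python
from bisect import bisect_right

# Hypothetical data file: records sorted by search key and grouped into blocks.
blocks = [
    [(5, "a"), (8, "b")],
    [(12, "c"), (15, "d")],
    [(21, "e"), (30, "f")],
]

# Sparse index: one entry per block, holding the first search key of that block.
sparse_index = [block[0][0] for block in blocks]

# Dense index: one entry per search key, pointing at (block number, slot).
dense_index = {
    key: (b, slot)
    for b, block in enumerate(blocks)
    for slot, (key, _) in enumerate(block)
}


def sparse_lookup(key):
    """Find the last index entry <= key, then scan that block sequentially."""
    pos = bisect_right(sparse_index, key) - 1
    if pos < 0:
        return None  # key is smaller than every indexed key
    for k, value in blocks[pos]:
        if k == key:
            return value
    return None


def dense_lookup(key):
    """Follow the index entry straight to the record, with no block scan."""
    location = dense_index.get(key)
    if location is None:
        return None
    b, slot = location
    return blocks[b][slot][1]
```

The sparse index here holds three entries for six records, at the price of a scan inside the chosen block. The dense index holds six entries and goes straight to the record, which mirrors the space versus access trade-off noted in the text.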
Clustering Index: A two-level index structure where the records in the first level contain the clustering field value in one field and a second field pointing to a block (of second-level records) in the second level. The records in the second level have one field that points to an actual data-file record or to another second-level block.

B+-tree Index: A multi-level index with a balanced-tree structure. The cost of finding a search key value in a B+-tree is proportional to the height of the tree: the maximum number of seeks equals the height, which grows logarithmically with the number of search key values. While this is, on average, more than a single-level dense index that requires only one seek, the B+-tree structure has a distinct advantage in that it does not require reorganization. It is self-optimizing because the tree is kept balanced during insertions and deletions. Many mission-critical applications require high performance with near-100% uptime, which cannot be achieved with structures requiring reorganization. The leaves of the B+-tree are used to reorganize the data file.

C. New query optimization techniques in distributed database:

C.1- Cost based query optimization:
The objective of cost-based query optimization is to estimate the cost of different equivalent query expressions and choose the execution plan with the lowest cost. Cost-based query optimization mainly depends on two factors: the solution space and the cost function.
Solution space: this depends on the set of equivalent algebraic expressions.
Cost function: the cost function is the sum of the I/O cost, the CPU cost and the communication cost. It also depends on the particular distributed environment.
Taking these factors into account, cost-based query optimization is carried out in a distributed environment.

C.2- Heuristic based query optimization:
The heuristic-based query optimization process involves the following steps:
1) Perform Selection operations as early as possible.
2) Combine a Cartesian product with a subsequent Selection whose predicate represents a join condition into a Join operation.
3) Use the associativity of binary operations to rearrange leaf nodes so that the leaf nodes with the most restrictive Selection operations are executed first.
4) Perform Projection operations as early as possible.
5) Eliminate duplicate computations.
It is mainly used to minimize the cost of selecting sites for multi-join operations.

Advantages of Distributed query optimization:
Distributed query optimization techniques provide exact results in a distributed environment. These techniques provide efficient performance in different distributed networks. On the internet, these techniques help to search for exact information and extract what is required.

D. Query Processing in Relational Database Systems
The conventional method of processing a query in a relational DBMS is to parse the SQL statement and produce a relational calculus-like logical representation of the query, and then to invoke the query optimizer, which generates a query plan. The query plan is fed into an execution engine that directly executes it, typically with little or no runtime decision-making (Figure 5). The query plan can be thought of as a tree of unary and binary relational algebra operators, where each operator is annotated with specific details about the algorithm to use (e.g., nested loops join versus hash join) and how to allocate resources (e.g., memory). In many cases the query plan also includes low-level physical operations such as sorting and network shipping that do not affect the logical representation of the data. Certain query processors consider only restricted types of queries, rather than full SQL. A common example of this is select-project-join or SPJ queries: an SPJ query essentially represents a single SQL SELECT-FROM-WHERE block with no aggregation or subqueries.

Example for SPJ queries:
SELECT * FROM R,S,T,U WHERE R.s=S.a AND S.b=T.b AND T.c=U.c

Fig 5. Query Plan (user query, query optimizer, query plan, executor, query result)

E. Results for related Algorithm
Based on the above examples, we summarize the complexities of all the algorithms.

Table 1 Complexity table
Algorithm                      Complexity
1. PARALLEL                    O(m^2)
2. SERIAL                      O(m log_2 m)
3. GENERAL
   3.1 Procedure TOTAL         O(σ m^2)
   3.2 Procedure RESPONSE
4. Algorithm-S                 O(m log m)
Where m is the number of required relations in the query.

Conclusion
Algorithm-S is a straightforward modification and extension of Apers, Hevner, and Yao's algorithm GENERAL. In GENERAL, the attribute independence assumption is interpreted to mean that a semijoin has no effect on the projected size of nonjoining attributes. This is significant, since low response time costs and low total time costs are both desirable objectives, even though one may predominate in a given situation. Most real-world data is not well structured. Today's databases typically contain much non-structured data such as text, images, video, and audio, often distributed across computer networks. Processing these kinds of data and optimizing queries on them requires distributed query optimization techniques such as these.

References
[1] A. R. Hevner and S. B. Yao, "Query Processing in Distributed Database Systems," IEEE Trans. Software Eng., vol. SE-5, pp. , May.
[2] William Perrizo, "A Method for Processing Distributed Database Queries," IEEE Trans. Software Eng., vol. SE-10, No. 4, July 1984.
[3] Peter M. G. Apers, Alan R. Hevner, and S. Bing Yao, "Optimization Algorithms for Distributed Queries," IEEE Trans., 1983.
[4] M. Tamer Ozsu, GTE Laboratories, and Patrick Valduriez, "Distributed Database Systems: Where Are We Now?" IEEE, INRIA.
[5] Sakti Pramanik and David Vineyard, "Optimizing Join Queries in Distributed Databases," IEEE.

[6] Arbee L. P. Chen and Victor O. K. Li, "Improvement Algorithms for Semijoin Query Processing Programs in Distributed Database Systems," IEEE.
[7] Avi Silberschatz, Hank Korth and S. Sudarshan, Database System Concepts, 4th Edition. McGraw-Hill.
[8] Ramez Elmasri and Shamkant B. Navathe, Fundamentals of Database Systems, Second Edition. Addison-Wesley Publishing Company.
[9] Donald Kossmann and Konrad Stocker, "Iterative Dynamic Programming: A New Class of Query Optimization Algorithms," ACM Transactions on Database Systems, Vol. 25, No. 1, March 2000.
[10] Hsiao-Fei Liu, Ya-Hui Chang and Kun-Mao Chao, "An Optimal Algorithm for Querying Tree Structures and its Applications in Bioinformatics," ACM SIGMOD Record, Vol. 33, No. 2, June.
[11] Thomas Schwentick, "XPath Query Containment," ACM SIGMOD Record, Vol. 33, No. 1, March.
[12] Wesley W. Chu and Paul Hurley, "Optimal Query Processing for Distributed Database Systems," IEEE Trans. Computers, vol. C-31, No. 9, September.
[13] W. Cellary, Z. Krolikowski and T. Morzy, "Other Comments on Optimization Algorithms for Distributed Queries," IEEE Trans. on Software Engineering, vol. 14, No. 4, April.
[14] Paura S. M. Tsai and Arbee L. P. Chen, "Optimizing Queries with Foreign Function in a Distributed Environment," IEEE Trans. on Knowledge and Data Engineering, vol. 14, no. 4, July/August.
[15] Dave D. Straube and M. Tamer Ozsu, "Query Optimization and Execution Plan Generation in Object Oriented Data Management Systems," IEEE Trans. on Knowledge and Data Engineering, vol. 7, No. 2, April.
[16] Stefano Ceri and Georg Gottlob, "Translating SQL into Relational Algebra: Optimization, Semantics and Equivalence of SQL Queries," IEEE Trans. on Software Engineering, vol. SE-11, No. 4, April.
[17] P. A. Bernstein, N. Goodman, E. Wong, G. L. Reeve and J. Rothnie, "Query Processing in a System for Distributed Databases (SDD-1)," ACM Trans. Database Syst., Vol. 6, Dec.
[18] S. Chaudhuri and K. Shim, "Query Optimization in Presence of Foreign Function," Proc. Intl. Conf. Very Large Data Bases.
[19] D. Chiu and Y. Ho, "A Methodology for Interpreting Tree Queries into Optimal Semi-join Expression," in Proc. ACM SIGMOD, May.
[20] C. Yu and C. Chang, "Distributed Query Processing," ACM Comput. Surveys, Vol. 16, no. 4, Dec.
[21] Ming-Syan Chen and Philip S. Yu, "Using Combination Joins and Semijoins Operations for Distributed Query Processing," IEEE Transactions on Knowledge and Data Engineering.
[22] Chihping Wang and Ming-Syan Chen, "On the Complexity of Distributed Query Optimization," IEEE Transactions on Knowledge and Data Engineering, Volume 8, no. 4, Aug.
[23] Ming-Syan Chen and Philip S. Yu, "Using Join Operations as Reducer for Distributed Query Processing," IEEE Transactions on Knowledge and Data Engineering.
[24] Konrad Stocker, Donald Kossmann, Reinhard Braumandl and Alfons Kemper, "Integrating Semi-join Reducers into State-of-the-Art Query Processors," IEEE.


Efficiency. Efficiency: Indexing. Indexing. Efficiency Techniques. Inverted Index. Inverted Index (COSC 488) Efficiency Efficiency: Indexing (COSC 488) Nazli Goharian nazli@cs.georgetown.edu Difficult to analyze sequential IR algorithms: data and query dependency (query selectivity). O(q(cf max )) -- high estimate-

More information

An Overview of various methodologies used in Data set Preparation for Data mining Analysis

An Overview of various methodologies used in Data set Preparation for Data mining Analysis An Overview of various methodologies used in Data set Preparation for Data mining Analysis Arun P Kuttappan 1, P Saranya 2 1 M. E Student, Dept. of Computer Science and Engineering, Gnanamani College of

More information

QUERY OPTIMIZATION E Jayant Haritsa Computer Science and Automation Indian Institute of Science. JAN 2014 Slide 1 QUERY OPTIMIZATION

QUERY OPTIMIZATION E Jayant Haritsa Computer Science and Automation Indian Institute of Science. JAN 2014 Slide 1 QUERY OPTIMIZATION E0 261 Jayant Haritsa Computer Science and Automation Indian Institute of Science JAN 2014 Slide 1 Database Engines Main Components Query Processing Transaction Processing Access Methods JAN 2014 Slide

More information

Chapter 11: Indexing and Hashing

Chapter 11: Indexing and Hashing Chapter 11: Indexing and Hashing Basic Concepts Ordered Indices B + -Tree Index Files B-Tree Index Files Static Hashing Dynamic Hashing Comparison of Ordered Indexing and Hashing Index Definition in SQL

More information

Chapter 12: Query Processing. Chapter 12: Query Processing

Chapter 12: Query Processing. Chapter 12: Query Processing Chapter 12: Query Processing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 12: Query Processing Overview Measures of Query Cost Selection Operation Sorting Join

More information

Chapter 12: Query Processing

Chapter 12: Query Processing Chapter 12: Query Processing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Basic Steps in Query Processing 1. Parsing and translation 2. Optimization 3. Evaluation 12.2

More information

Horizontal Aggregations for Mining Relational Databases

Horizontal Aggregations for Mining Relational Databases Horizontal Aggregations for Mining Relational Databases Dontu.Jagannadh, T.Gayathri, M.V.S.S Nagendranadh. Department of CSE Sasi Institute of Technology And Engineering,Tadepalligudem, Andhrapradesh,

More information

Query optimization. Elena Baralis, Silvia Chiusano Politecnico di Torino. DBMS Architecture D B M G. Database Management Systems. Pag.

Query optimization. Elena Baralis, Silvia Chiusano Politecnico di Torino. DBMS Architecture D B M G. Database Management Systems. Pag. Database Management Systems DBMS Architecture SQL INSTRUCTION OPTIMIZER MANAGEMENT OF ACCESS METHODS CONCURRENCY CONTROL BUFFER MANAGER RELIABILITY MANAGEMENT Index Files Data Files System Catalog DATABASE

More information

Review. Support for data retrieval at the physical level:

Review. Support for data retrieval at the physical level: Query Processing Review Support for data retrieval at the physical level: Indices: data structures to help with some query evaluation: SELECTION queries (ssn = 123) RANGE queries (100

More information

Horizontal Aggregations in SQL to Prepare Data Sets Using PIVOT Operator

Horizontal Aggregations in SQL to Prepare Data Sets Using PIVOT Operator Horizontal Aggregations in SQL to Prepare Data Sets Using PIVOT Operator R.Saravanan 1, J.Sivapriya 2, M.Shahidha 3 1 Assisstant Professor, Department of IT,SMVEC, Puducherry, India 2,3 UG student, Department

More information

Query Processing SL03

Query Processing SL03 Distributed Database Systems Fall 2016 Query Processing Overview Query Processing SL03 Distributed Query Processing Steps Query Decomposition Data Localization Query Processing Overview/1 Query processing:

More information

Advanced Databases: Parallel Databases A.Poulovassilis

Advanced Databases: Parallel Databases A.Poulovassilis 1 Advanced Databases: Parallel Databases A.Poulovassilis 1 Parallel Database Architectures Parallel database systems use parallel processing techniques to achieve faster DBMS performance and handle larger

More information

File Structures and Indexing

File Structures and Indexing File Structures and Indexing CPS352: Database Systems Simon Miner Gordon College Last Revised: 10/11/12 Agenda Check-in Database File Structures Indexing Database Design Tips Check-in Database File Structures

More information

CSE 544 Principles of Database Management Systems

CSE 544 Principles of Database Management Systems CSE 544 Principles of Database Management Systems Alvin Cheung Fall 2015 Lecture 5 - DBMS Architecture and Indexing 1 Announcements HW1 is due next Thursday How is it going? Projects: Proposals are due

More information

Data Modeling and Databases Ch 9: Query Processing - Algorithms. Gustavo Alonso Systems Group Department of Computer Science ETH Zürich

Data Modeling and Databases Ch 9: Query Processing - Algorithms. Gustavo Alonso Systems Group Department of Computer Science ETH Zürich Data Modeling and Databases Ch 9: Query Processing - Algorithms Gustavo Alonso Systems Group Department of Computer Science ETH Zürich Transactions (Locking, Logging) Metadata Mgmt (Schema, Stats) Application

More information

Hash-Based Indexing 165

Hash-Based Indexing 165 Hash-Based Indexing 165 h 1 h 0 h 1 h 0 Next = 0 000 00 64 32 8 16 000 00 64 32 8 16 A 001 01 9 25 41 73 001 01 9 25 41 73 B 010 10 10 18 34 66 010 10 10 18 34 66 C Next = 3 011 11 11 19 D 011 11 11 19

More information

Published by: PIONEER RESEARCH & DEVELOPMENT GROUP (www.prdg.org) 1

Published by: PIONEER RESEARCH & DEVELOPMENT GROUP (www.prdg.org) 1 Optimization of Join Queries on Distributed Relations Using Semi-Joins Suresh Sapa 1, K. P. Supreethi 2 1, 2 JNTUCEH, Hyderabad, India Abstract The processing and optimizing a join query in distributed

More information

Chapter 12: Indexing and Hashing. Basic Concepts

Chapter 12: Indexing and Hashing. Basic Concepts Chapter 12: Indexing and Hashing! Basic Concepts! Ordered Indices! B+-Tree Index Files! B-Tree Index Files! Static Hashing! Dynamic Hashing! Comparison of Ordered Indexing and Hashing! Index Definition

More information

Mobile and Heterogeneous databases Distributed Database System Query Processing. A.R. Hurson Computer Science Missouri Science & Technology

Mobile and Heterogeneous databases Distributed Database System Query Processing. A.R. Hurson Computer Science Missouri Science & Technology Mobile and Heterogeneous databases Distributed Database System Query Processing A.R. Hurson Computer Science Missouri Science & Technology 1 Note, this unit will be covered in four lectures. In case you

More information

Algorithms for Query Processing and Optimization. 0. Introduction to Query Processing (1)

Algorithms for Query Processing and Optimization. 0. Introduction to Query Processing (1) Chapter 19 Algorithms for Query Processing and Optimization 0. Introduction to Query Processing (1) Query optimization: The process of choosing a suitable execution strategy for processing a query. Two

More information

Chapter 12: Indexing and Hashing

Chapter 12: Indexing and Hashing Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B+-Tree Index Files B-Tree Index Files Static Hashing Dynamic Hashing Comparison of Ordered Indexing and Hashing Index Definition in SQL

More information

Optimization of Queries in Distributed Database Management System

Optimization of Queries in Distributed Database Management System Optimization of Queries in Distributed Database Management System Bhagvant Institute of Technology, Muzaffarnagar Abstract The query optimizer is widely considered to be the most important component of

More information

Data Modeling and Databases Ch 10: Query Processing - Algorithms. Gustavo Alonso Systems Group Department of Computer Science ETH Zürich

Data Modeling and Databases Ch 10: Query Processing - Algorithms. Gustavo Alonso Systems Group Department of Computer Science ETH Zürich Data Modeling and Databases Ch 10: Query Processing - Algorithms Gustavo Alonso Systems Group Department of Computer Science ETH Zürich Transactions (Locking, Logging) Metadata Mgmt (Schema, Stats) Application

More information

Database Systems External Sorting and Query Optimization. A.R. Hurson 323 CS Building

Database Systems External Sorting and Query Optimization. A.R. Hurson 323 CS Building External Sorting and Query Optimization A.R. Hurson 323 CS Building External sorting When data to be sorted cannot fit into available main memory, external sorting algorithm must be applied. Naturally,

More information

Chapter 17: Parallel Databases

Chapter 17: Parallel Databases Chapter 17: Parallel Databases Introduction I/O Parallelism Interquery Parallelism Intraquery Parallelism Intraoperation Parallelism Interoperation Parallelism Design of Parallel Systems Database Systems

More information

Query processing and optimization

Query processing and optimization Query processing and optimization These slides are a modified version of the slides of the book Database System Concepts (Chapter 13 and 14), 5th Ed., McGraw-Hill, by Silberschatz, Korth and Sudarshan.

More information

The Hibernate Framework Query Mechanisms Comparison

The Hibernate Framework Query Mechanisms Comparison The Hibernate Framework Query Mechanisms Comparison Tisinee Surapunt and Chartchai Doungsa-Ard Abstract The Hibernate Framework is an Object/Relational Mapping technique which can handle the data for applications

More information

Database Technology. Topic 7: Data Structures for Databases. Olaf Hartig.

Database Technology. Topic 7: Data Structures for Databases. Olaf Hartig. Topic 7: Data Structures for Databases Olaf Hartig olaf.hartig@liu.se Database System 2 Storage Hierarchy Traditional Storage Hierarchy CPU Cache memory Main memory Primary storage Disk Tape Secondary

More information

CSE 444: Database Internals. Lectures 5-6 Indexing

CSE 444: Database Internals. Lectures 5-6 Indexing CSE 444: Database Internals Lectures 5-6 Indexing 1 Announcements HW1 due tonight by 11pm Turn in an electronic copy (word/pdf) by 11pm, or Turn in a hard copy in my office by 4pm Lab1 is due Friday, 11pm

More information

Architecture of a Database Management System Ray Lockwood

Architecture of a Database Management System Ray Lockwood Assorted Topics Architecture of a Database Management System Pg 1 Architecture of a Database Management System Ray Lockwood Points: A DBMS is divided into modules or layers that isolate functionality.

More information

SDD-1 Algorithm Implementation

SDD-1 Algorithm Implementation National Institute of Technology Karnataka, Surathkal Project Report on SDD-1 Algorithm Implementation Under the Guidance of: Mr. Dr. Anantha Narayana (Professor) Submitted by: Mr. Vasanth Raja Chittampally

More information

Module 9: Selectivity Estimation

Module 9: Selectivity Estimation Module 9: Selectivity Estimation Module Outline 9.1 Query Cost and Selectivity Estimation 9.2 Database profiles 9.3 Sampling 9.4 Statistics maintained by commercial DBMS Web Forms Transaction Manager Lock

More information

Query Execution [15]

Query Execution [15] CSC 661, Principles of Database Systems Query Execution [15] Dr. Kalpakis http://www.csee.umbc.edu/~kalpakis/courses/661 Query processing involves Query processing compilation parsing to construct parse

More information

Parser: SQL parse tree

Parser: SQL parse tree Jinze Liu Parser: SQL parse tree Good old lex & yacc Detect and reject syntax errors Validator: parse tree logical plan Detect and reject semantic errors Nonexistent tables/views/columns? Insufficient

More information

Query Optimization. Query Optimization. Optimization considerations. Example. Interaction of algorithm choice and tree arrangement.

Query Optimization. Query Optimization. Optimization considerations. Example. Interaction of algorithm choice and tree arrangement. COS 597: Principles of Database and Information Systems Query Optimization Query Optimization Query as expression over relational algebraic operations Get evaluation (parse) tree Leaves: base relations

More information

User Perspective. Module III: System Perspective. Module III: Topics Covered. Module III Overview of Storage Structures, QP, and TM

User Perspective. Module III: System Perspective. Module III: Topics Covered. Module III Overview of Storage Structures, QP, and TM Module III Overview of Storage Structures, QP, and TM Sharma Chakravarthy UT Arlington sharma@cse.uta.edu http://www2.uta.edu/sharma base Management Systems: Sharma Chakravarthy Module I Requirements analysis

More information

Query Optimization in Distributed Databases. Dilşat ABDULLAH

Query Optimization in Distributed Databases. Dilşat ABDULLAH Query Optimization in Distributed Databases Dilşat ABDULLAH 1302108 Department of Computer Engineering Middle East Technical University December 2003 ABSTRACT Query optimization refers to the process of

More information

Chapter 3. Algorithms for Query Processing and Optimization

Chapter 3. Algorithms for Query Processing and Optimization Chapter 3 Algorithms for Query Processing and Optimization Chapter Outline 1. Introduction to Query Processing 2. Translating SQL Queries into Relational Algebra 3. Algorithms for External Sorting 4. Algorithms

More information

Principles of Data Management. Lecture #9 (Query Processing Overview)

Principles of Data Management. Lecture #9 (Query Processing Overview) Principles of Data Management Lecture #9 (Query Processing Overview) Instructor: Mike Carey mjcarey@ics.uci.edu Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Today s Notable News v Midterm

More information

CS 245 Midterm Exam Solution Winter 2015

CS 245 Midterm Exam Solution Winter 2015 CS 245 Midterm Exam Solution Winter 2015 This exam is open book and notes. You can use a calculator and your laptop to access course notes and videos (but not to communicate with other people). You have

More information

Overview of Query Processing and Optimization

Overview of Query Processing and Optimization Overview of Query Processing and Optimization Source: Database System Concepts Korth and Silberschatz Lisa Ball, 2010 (spelling error corrections Dec 07, 2011) Purpose of DBMS Optimization Each relational

More information

Query Processing and Optimization *

Query Processing and Optimization * OpenStax-CNX module: m28213 1 Query Processing and Optimization * Nguyen Kim Anh This work is produced by OpenStax-CNX and licensed under the Creative Commons Attribution License 3.0 Query processing is

More information

Indexing. Week 14, Spring Edited by M. Naci Akkøk, , Contains slides from 8-9. April 2002 by Hector Garcia-Molina, Vera Goebel

Indexing. Week 14, Spring Edited by M. Naci Akkøk, , Contains slides from 8-9. April 2002 by Hector Garcia-Molina, Vera Goebel Indexing Week 14, Spring 2005 Edited by M. Naci Akkøk, 5.3.2004, 3.3.2005 Contains slides from 8-9. April 2002 by Hector Garcia-Molina, Vera Goebel Overview Conventional indexes B-trees Hashing schemes

More information

Data about data is database Select correct option: True False Partially True None of the Above

Data about data is database Select correct option: True False Partially True None of the Above Within a table, each primary key value. is a minimal super key is always the first field in each table must be numeric must be unique Foreign Key is A field in a table that matches a key field in another

More information

Physical Level of Databases: B+-Trees

Physical Level of Databases: B+-Trees Physical Level of Databases: B+-Trees Adnan YAZICI Computer Engineering Department METU (Fall 2005) 1 B + -Tree Index Files l Disadvantage of indexed-sequential files: performance degrades as file grows,

More information

! Parallel machines are becoming quite common and affordable. ! Databases are growing increasingly large

! Parallel machines are becoming quite common and affordable. ! Databases are growing increasingly large Chapter 20: Parallel Databases Introduction! Introduction! I/O Parallelism! Interquery Parallelism! Intraquery Parallelism! Intraoperation Parallelism! Interoperation Parallelism! Design of Parallel Systems!

More information

Chapter 20: Parallel Databases

Chapter 20: Parallel Databases Chapter 20: Parallel Databases! Introduction! I/O Parallelism! Interquery Parallelism! Intraquery Parallelism! Intraoperation Parallelism! Interoperation Parallelism! Design of Parallel Systems 20.1 Introduction!

More information

Chapter 20: Parallel Databases. Introduction

Chapter 20: Parallel Databases. Introduction Chapter 20: Parallel Databases! Introduction! I/O Parallelism! Interquery Parallelism! Intraquery Parallelism! Intraoperation Parallelism! Interoperation Parallelism! Design of Parallel Systems 20.1 Introduction!

More information

What s a database system? Review of Basic Database Concepts. Entity-relationship (E/R) diagram. Two important questions. Physical data independence

What s a database system? Review of Basic Database Concepts. Entity-relationship (E/R) diagram. Two important questions. Physical data independence What s a database system? Review of Basic Database Concepts CPS 296.1 Topics in Database Systems According to Oxford Dictionary Database: an organized body of related information Database system, DataBase

More information

Basant Group of Institution

Basant Group of Institution Basant Group of Institution Visual Basic 6.0 Objective Question Q.1 In the relational modes, cardinality is termed as: (A) Number of tuples. (B) Number of attributes. (C) Number of tables. (D) Number of

More information

Notes. Some of these slides are based on a slide set provided by Ulf Leser. CS 640 Query Processing Winter / 30. Notes

Notes. Some of these slides are based on a slide set provided by Ulf Leser. CS 640 Query Processing Winter / 30. Notes uery Processing Olaf Hartig David R. Cheriton School of Computer Science University of Waterloo CS 640 Principles of Database Management and Use Winter 2013 Some of these slides are based on a slide set

More information

Database Tuning and Physical Design: Basics of Query Execution

Database Tuning and Physical Design: Basics of Query Execution Database Tuning and Physical Design: Basics of Query Execution Spring 2018 School of Computer Science University of Waterloo Databases CS348 (University of Waterloo) Query Execution 1 / 43 The Client/Server

More information

International Journal of Modern Trends in Engineering and Research e-issn: p-issn:

International Journal of Modern Trends in Engineering and Research  e-issn: p-issn: International Journal of Modern Trends in Engineering and Research www.ijmter.com Fragmentation as a Part of Security in Distributed Database: A Survey Vaidik Ochurinda 1 1 External Student, MCA, IGNOU.

More information

Mahathma Gandhi University

Mahathma Gandhi University Mahathma Gandhi University BSc Computer science III Semester BCS 303 OBJECTIVE TYPE QUESTIONS Choose the correct or best alternative in the following: Q.1 In the relational modes, cardinality is termed

More information

Indexing and Hashing

Indexing and Hashing C H A P T E R 1 Indexing and Hashing This chapter covers indexing techniques ranging from the most basic one to highly specialized ones. Due to the extensive use of indices in database systems, this chapter

More information

Query Processing Strategies and Optimization

Query Processing Strategies and Optimization Query Processing Strategies and Optimization CPS352: Database Systems Simon Miner Gordon College Last Revised: 10/25/12 Agenda Check-in Design Project Presentations Query Processing Programming Project

More information

Course Outline and Objectives: Database Programming with SQL

Course Outline and Objectives: Database Programming with SQL Introduction to Computer Science and Business Course Outline and Objectives: Database Programming with SQL This is the second portion of the Database Design and Programming with SQL course. In this portion,

More information

CSE 544, Winter 2009, Final Examination 11 March 2009

CSE 544, Winter 2009, Final Examination 11 March 2009 CSE 544, Winter 2009, Final Examination 11 March 2009 Rules: Open books and open notes. No laptops or other mobile devices. Calculators allowed. Please write clearly. Relax! You are here to learn. Question

More information

CPSC 421 Database Management Systems. Lecture 11: Storage and File Organization

CPSC 421 Database Management Systems. Lecture 11: Storage and File Organization CPSC 421 Database Management Systems Lecture 11: Storage and File Organization * Some material adapted from R. Ramakrishnan, L. Delcambre, and B. Ludaescher Today s Agenda Start on Database Internals:

More information

Query Decomposition and Data Localization

Query Decomposition and Data Localization Query Decomposition and Data Localization Query Decomposition and Data Localization Query decomposition and data localization consists of two steps: Mapping of calculus query (SQL) to algebra operations

More information

Introduction Alternative ways of evaluating a given query using

Introduction Alternative ways of evaluating a given query using Query Optimization Introduction Catalog Information for Cost Estimation Estimation of Statistics Transformation of Relational Expressions Dynamic Programming for Choosing Evaluation Plans Introduction

More information

Chapter 19 Query Optimization

Chapter 19 Query Optimization Chapter 19 Query Optimization It is an activity conducted by the query optimizer to select the best available strategy for executing the query. 1. Query Trees and Heuristics for Query Optimization - Apply

More information

Relational Query Optimization

Relational Query Optimization Relational Query Optimization Chapter 15 Ramakrishnan & Gehrke (Sections 15.1-15.6) CPSC404, Laks V.S. Lakshmanan 1 What you will learn from this lecture Cost-based query optimization (System R) Plan space

More information

CS 245 Midterm Exam Winter 2014

CS 245 Midterm Exam Winter 2014 CS 245 Midterm Exam Winter 2014 This exam is open book and notes. You can use a calculator and your laptop to access course notes and videos (but not to communicate with other people). You have 70 minutes

More information

Chapter 18: Parallel Databases

Chapter 18: Parallel Databases Chapter 18: Parallel Databases Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 18: Parallel Databases Introduction I/O Parallelism Interquery Parallelism Intraquery

More information

Chapter 18: Parallel Databases. Chapter 18: Parallel Databases. Parallelism in Databases. Introduction

Chapter 18: Parallel Databases. Chapter 18: Parallel Databases. Parallelism in Databases. Introduction Chapter 18: Parallel Databases Chapter 18: Parallel Databases Introduction I/O Parallelism Interquery Parallelism Intraquery Parallelism Intraoperation Parallelism Interoperation Parallelism Design of

More information

Three Read Priority Locking for Concurrency Control in Distributed Databases

Three Read Priority Locking for Concurrency Control in Distributed Databases Three Read Priority Locking for Concurrency Control in Distributed Databases Christos Papanastasiou Technological Educational Institution Stereas Elladas, Department of Electrical Engineering 35100 Lamia,

More information

Information Management (IM)

Information Management (IM) 1 2 3 4 5 6 7 8 9 Information Management (IM) Information Management (IM) is primarily concerned with the capture, digitization, representation, organization, transformation, and presentation of information;

More information

Intro to DB CHAPTER 12 INDEXING & HASHING

Intro to DB CHAPTER 12 INDEXING & HASHING Intro to DB CHAPTER 12 INDEXING & HASHING Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B+-Tree Index Files B-Tree Index Files Static Hashing Dynamic Hashing Comparison of Ordered Indexing

More information

Principles of Parallel Algorithm Design: Concurrency and Decomposition

Principles of Parallel Algorithm Design: Concurrency and Decomposition Principles of Parallel Algorithm Design: Concurrency and Decomposition John Mellor-Crummey Department of Computer Science Rice University johnmc@rice.edu COMP 422/534 Lecture 2 12 January 2017 Parallel

More information

Review. Relational Query Optimization. Query Optimization Overview (cont) Query Optimization Overview. Cost-based Query Sub-System

Review. Relational Query Optimization. Query Optimization Overview (cont) Query Optimization Overview. Cost-based Query Sub-System Review Relational Query Optimization R & G Chapter 12/15 Implementation of single Relational Operations Choices depend on indexes, memory, stats, Joins Blocked nested loops: simple, exploits extra memory

More information

CSC 742 Database Management Systems

CSC 742 Database Management Systems CSC 742 Database Management Systems Topic #16: Query Optimization Spring 2002 CSC 742: DBMS by Dr. Peng Ning 1 Agenda Typical steps of query processing Two main techniques for query optimization Heuristics

More information

Query Optimization Overview. COSC 404 Database System Implementation. Query Optimization. Query Processor Components The Parser

Query Optimization Overview. COSC 404 Database System Implementation. Query Optimization. Query Processor Components The Parser COSC 404 Database System Implementation Query Optimization Query Optimization Overview The query processor performs four main tasks: 1) Verifies the correctness of an SQL statement 2) Converts the SQL

More information

Query Processing. high level user query. low level data manipulation. query processor. commands

Query Processing. high level user query. low level data manipulation. query processor. commands Query Processing high level user query query processor low level data manipulation commands 1 Selecting Alternatives SELECT ENAME FROM EMP,ASG WHERE EMP.ENO = ASG.ENO AND DUR > 37 Strategy A ΠENAME(σDUR>37

More information

D B M G Data Base and Data Mining Group of Politecnico di Torino

D B M G Data Base and Data Mining Group of Politecnico di Torino Database Management Data Base and Data Mining Group of tania.cerquitelli@polito.it A.A. 2014-2015 Optimizer operations Operation Evaluation of expressions and conditions Statement transformation Description

More information