Wentworth Institute of Technology COMP570 Database Applications Fall 2014 Derbinsky. Physical Tuning. Lecture 10. Physical Tuning

Similar documents
Wentworth Institute of Technology COMP2670 Databases Spring 2016 Derbinsky. Physical Tuning. Lecture 12. Physical Tuning

Chapter 18 Strategies for Query Processing. We focus this discussion w.r.t RDBMS, however, they are applicable to OODBS.

Database design process

SQL STRUCTURED QUERY LANGUAGE

Copyright 2007 Ramez Elmasri and Shamkant B. Navathe Slide 16-1

CSIT5300: Advanced Database Systems

Data about data is database Select correct option: True False Partially True None of the Above

Physical Database Design and Tuning. Review - Normal Forms. Review: Normal Forms. Introduction. Understanding the Workload. Creating an ISUD Chart

What happens. 376a. Database Design. Execution strategy. Query conversion. Next. Two types of techniques

A subquery is a nested query inserted inside a large query Generally occurs with select, from, where Also known as inner query or inner select,

Lecture #16 (Physical DB Design)

Physical Database Design and Tuning

Introduction to Query Processing and Query Optimization Techniques. Copyright 2011 Ramez Elmasri and Shamkant Navathe

More SQL: Complex Queries, Triggers, Views, and Schema Modification

Algorithms for Query Processing and Optimization. 0. Introduction to Query Processing (1)

Chapter 3. Algorithms for Query Processing and Optimization

Introduction to SQL. ECE 650 Systems Programming & Engineering Duke University, Spring 2018

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

The Relational Algebra

Physical Database Design and Tuning. Chapter 20

Chapter 8 SQL-99: Schema Definition, Basic Constraints, and Queries

COSC344 Database Theory and Applications. σ a= c (P) S. Lecture 4 Relational algebra. π A, P X Q. COSC344 Lecture 4 1

Interview Questions on DBMS and SQL [Compiled by M V Kamal, Associate Professor, CSE Dept]

Chapter 8. Joined Relations. Joined Relations. SQL-99: Schema Definition, Basic Constraints, and Queries

Chapter 6 - Part II The Relational Algebra and Calculus

More SQL: Complex Queries, Triggers, Views, and Schema Modification

Administrivia. Physical Database Design. Review: Optimization Strategies. Review: Query Optimization. Review: Database Design

Slides by: Ms. Shree Jaswal

COSC344 Database Theory and Applications. Lecture 5 SQL - Data Definition Language. COSC344 Lecture 5 1

Copyright 2007 Ramez Elmasri and Shamkant B. Navathe Slide 25-1

Chapter 19 Query Optimization

Database Technology. Topic 3: SQL. Olaf Hartig.

ECE 650 Systems Programming & Engineering. Spring 2018

Overview Relational data model

NESTED QUERIES AND AGGREGATION CHAPTER 5 (6/E) CHAPTER 8 (5/E)

We shall represent a relation as a table with columns and rows. Each column of the table has a name, or attribute. Each row is called a tuple.

More SQL: Complex Queries, Triggers, Views, and Schema Modification

Relational Database design. Slides By: Shree Jaswal

Querying Data with Transact SQL

Simple SQL Queries (contd.)

MANAGING DATA(BASES) USING SQL (NON-PROCEDURAL SQL, X401.9)

CPSC 421 Database Management Systems. Lecture 19: Physical Database Design Concurrency Control and Recovery

Introduction to Data Management. Lecture #17 (Physical DB Design!)

Chapter 6 The Relational Algebra and Calculus

SQL-99: Schema Definition, Basic Constraints, and Queries. Create, drop, alter Features Added in SQL2 and SQL-99

SQL. Copyright 2013 Ramez Elmasri and Shamkant B. Navathe

SQL Queries. COSC 304 Introduction to Database Systems SQL. Example Relations. SQL and Relational Algebra. Example Relation Instances

Chapter 10. Normalization. Chapter Outline. Chapter Outline(contd.)

Chapter 10. Chapter Outline. Chapter Outline. Functional Dependencies and Normalization for Relational Databases

Chapter 14. Database Design Theory: Introduction to Normalization Using Functional and Multivalued Dependencies

Discuss physical db design and workload What choises we have for tuning a database How to tune queries and views

Chapter 5 More SQL: Complex Queries, Triggers, Views, and Schema Modification

Database Design and Tuning

Friday Nights with Databases!

NESTED QUERIES AND AGGREGATION CHAPTER 5 (6/E) CHAPTER 8 (5/E)

COSC 304 Introduction to Database Systems SQL. Dr. Ramon Lawrence University of British Columbia Okanagan

Chapter 4. Basic SQL. SQL Data Definition and Data Types. Basic SQL. SQL language SQL. Terminology: CREATE statement

SQL: A COMMERCIAL DATABASE LANGUAGE. Data Change Statements,

Chapter 6: RELATIONAL DATA MODEL AND RELATIONAL ALGEBRA

Relational Algebra. Relational Algebra Overview. Relational Algebra Overview. Unary Relational Operations 8/19/2014. Relational Algebra Overview

Database Technology. Topic 2: Relational Databases and SQL. Olaf Hartig.

Elmasri/Navathe, Fundamentals of Database Systems, Fourth Edition Chapter 10-2

Modern Database Systems Lecture 1

Informal Design Guidelines for Relational Databases

Query Processing & Optimization. CS 377: Database Systems

Teradata. This was compiled in order to describe Teradata and provide a brief overview of common capabilities and queries.

SQL: Advanced Queries, Assertions, Triggers, and Views. Copyright 2012 Ramez Elmasri and Shamkant B. Navathe

RELATIONAL DATA MODEL: Relational Algebra

A taxonomy of SQL queries Learning Plan

8) A top-to-bottom relationship among the items in a database is established by a

Functional Dependencies and. Databases. 1 Informal Design Guidelines for Relational Databases. 4 General Normal Form Definitions (For Multiple Keys)

RDBMS- Day 4. Grouped results Relational algebra Joins Sub queries. In today s session we will discuss about the concept of sub queries.

Chapter 14 Outline. Normalization for Relational Databases: Outline. Chapter 14: Basics of Functional Dependencies and

Relational Algebra Part I. CS 377: Database Systems

COSC344 Database Theory and Applications. Lecture 6 SQL Data Manipulation Language (1)

Chapter 6 5/2/2008. Chapter Outline. Database State for COMPANY. The Relational Algebra and Calculus

King Fahd University of Petroleum and Minerals

Chapter 6 Part I The Relational Algebra and Calculus

CMP-3440 Database Systems

Course Modules for MCSA: SQL Server 2016 Database Development Training & Certification Course:

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

ECE 650 Systems Programming & Engineering. Spring 2018

COSC Assignment 2

Ian Kenny. November 28, 2017

Databasesystemer, forår 2005 IT Universitetet i København. Forelæsning 8: Database effektivitet. 31. marts Forelæser: Rasmus Pagh

Mobile and Heterogeneous databases Distributed Database System Query Processing. A.R. Hurson Computer Science Missouri Science & Technology

L130 - DATABASE MANAGEMENT SYSTEMS LAB CYCLE-1 1) Create a table STUDENT with appropriate data types and perform the following queries.

Normalization in DBMS

The Relational Algebra and Calculus. Copyright 2013 Ramez Elmasri and Shamkant B. Navathe

COSC Dr. Ramon Lawrence. Emp Relation

Database Administration and Tuning

Outline. Textbook Chapter 6. Note 1. CSIE30600/CSIEB0290 Database Systems Basic SQL 2

Chapter 8: The Relational Algebra and The Relational Calculus

Storage and Indexing

Database Design Theory and Normalization. CS 377: Database Systems

Database Systems External Sorting and Query Optimization. A.R. Hurson 323 CS Building

Examination examples

QUERY PROCESSING & OPTIMIZATION CHAPTER 19 (6/E) CHAPTER 15 (5/E)

DATA AND SCHEMA MODIFICATIONS CHAPTERS 4,5 (6/E) CHAPTER 8 (5/E)

CS403- Database Management Systems Solved MCQS From Midterm Papers. CS403- Database Management Systems MIDTERM EXAMINATION - Spring 2010

Transcription:

Lecture 10 1

Context Influential Factors Knobs Denormalization Database Design Query Design Outline 2

Database Design and Implementation Process 3

Factors that Influence Attributes: Queries and Transactions Queried = good for indexes Updated = bad for indexes Unique = should be indexed Frequency: Queries and Transactions 80/20 rule -> effective profiling Constraints: Queries and Transactions E.g. must complete within X seconds Frequency: Updates Statistics Storage allocation I/O performance Query execution time 4

Tools at Your Disposal Indexes Covered in last lecture Denormalization Materialized views Database design Query design 5

Denormalization The goal of normalization is to yield a database schema that is free from redundancies Depending upon performance constraints and the job mix, sometimes it is appropriate to introduce redundancies (i.e. denormalize to 1/2NF) in the name of performance improvement (e.g. to avoid joins) Note: a schema should always be fully normalized first, and denormalization considered during physical tuning upon analysis of constraints/performance This technique should be deliberate and is not an excuse for sloppy database design 6

Example: Employee Assignment Roster ASSIGN(Emp_id, Proj_id, Emp_name, Emp_job_title, Percent_assigned, Proj_name, Proj_mgr_id, Proj_mgr_name) Proj_id Proj_name, Proj_mgr_id Proj_mgr_id Proj_mgr_name Emp_id Emp_name, Emp_job_title EMP (Emp_id, Emp_name, Emp_job_title) PROJ (Proj_id, Proj_name, Proj_mgr_id) EMP_PROJ (Emp_id, Proj_id, Percent_assigned) 7

Main Approaches to Denormalizing Use materialized views Create a new relation on disk, DBMS responsible for automatically updating w.r.t. base relations Denormalize the logical data design Implement constraints via DBMS (e.g. triggers) or application logic 8

Denormalization Examples Storing derived attributes Storing the aggregation of the "many" objects in a one-to-many relationship as an attribute of the "one" relation (e.g. count, sum(expr)) Adding attributes to a relation from another relation with which it will be joined Storing results of calculations on one or more fields within the same relation 9

Database Design Tuning Denormalization is one method by which to alter database design to achieve performance goals Others common approaches Vertical partitioning Horizontal partitioning 10

Vertical Partitioning Given a normalized relation [typically with many attributes], break into two or more relations, each duplicating the PK, but separating attribute groups Example: Given R(K,A,B,C,G,H, ) Knowing that (A,B,C) typically together, distinct from (G, H, ) Yield R1(K,A,B,C) and R2(K,G,H, ) 11

Horizontal Partitioning Given a normalized relation [typically with many rows], break into two or more relations, each with the same columns, but a different subset of rows Example: Given ORDER(ID,REGION_ID, ) Knowing that typical queries are specific to a region Yield ORDER_R1(ID, ), ORDER_R2(ID, ), Will require multiple queries/union if all orders are to be considered at once 12

Query Design Tuning Indications Profiling indicates too much I/O and/or time The query plan (via EXPLAIN) shows that relevant indexes are not being used The following slides offer common situations in which query tuning might be applicable. For any particular DBMS, see vendor documentation and trade literature. Generally speaking, do not attempt to pre-optimize for these situations let the DBMS/profiling tell you when there is a problem (i.e. avoid premature optimization). 13

Query Issues (1) Many query optimizers do not use indexes in the presence of arithmetic expressions (such as Salary/365 > 10.50), numerical comparisons of attributes of different sizes and precision (such as Aqty = Bqty where Aqty is of type INTEGER and Bqty is of type SMALLINTEGER), NULL comparisons (such as Bdate IS NULL), and substring comparisons (such as Lname LIKE %mann ) Some of this (e.g. arithmetic expressions) can be ameliorated with denormalization 14

Query Issues (2) Indexes are often not used for nested queries using IN; for example, the following query: SELECT Ssn FROM EMPLOYEE WHERE Dno IN ( SELECT Dnumber FROM DEPARTMENT WHERE Mgr_ssn = 333445555 ); may not use the index on Dno in EMPLOYEE, whereas using Dno=Dnumber in the WHERE-clause with a single block query may cause the index to be used Introducing additional calls to your application may alleviate this type of issue, assuming communication I/O is not prohibitively expensive 15

Query Issues (3) Some DISTINCTs may be redundant and can be avoided without changing the result. A DISTINCT often causes a sort operation and must be avoided as much as possible 16

Query Issues (4) Avoid correlated queries where possible. Consider the following query, which retrieves the highest paid employee in each department: SELECT Ssn FROM EMPLOYEE E WHERE Salary = SELECT MAX (Salary) FROM EMPLOYEE AS M WHERE M.Dno = E.Dno; This has the potential danger of searching all of the inner EMPLOYEE table M for each tuple from the outer EMPLOYEE table E To make the execution more efficient, the process can be rewritten such that one query computes the maximum salary in each department and then is joined 17

Query Issues (5) If multiple options for a join condition are possible, choose one that avoids string comparisons For example, assuming that the Name attribute is a candidate key in EMPLOYEE and STUDENT, it is better to use EMPLOYEE.Ssn = STUDENT.Ssn as a join condition rather than EMPLOYEE.Name = STUDENT.Name 18

Query Issues (6) One idiosyncrasy with some query optimizers is that the order of tables in the FROM-clause may affect the join processing. If that is the case, one may have to switch this order so that the smaller of the two relations is scanned and the larger relation is used with an appropriate index. Some DBMSs have commands by which to influence query optimization (e.g. HINT) 19

Query Issues (7) A query with multiple selection conditions that are connected via OR may not be prompting the query optimizer to use any index. Such a query may be split up and expressed as a union of queries, each with a condition on an attribute that causes an index to be used. For example, SELECT Fname, Lname, Salary, Age7 FROM EMPLOYEE WHERE Age > 45 OR Salary < 50000; may be executed using table scan giving poor performance. Splitting it up as SELECT Fname, Lname, Salary, Age FROM EMPLOYEE WHERE Age>45 UNION SELECT Fname, Lname, Salary, Age FROM EMPLOYEE WHERE Salary < 50000; may utilize indexes on Age as well as on Salary 20