Relational Database Systems 2 1. System Architecture

Similar documents
Relational Database Systems 2 1. System Architecture

0. Organizational Issues. 0. Why should you be here? 0. Why should you be here? 0. Why should you be here? Relational Database Systems

Relational Database Systems 1

Relational Database Systems 1 Wolf-Tilo Balke Hermann Kroll

Database systems. Jaroslav Porubän, Miroslav Biňas, Milan Nosáľ (c)

Course: Database Management Systems. Lê Thị Bảo Thu

(Extended) Entity Relationship

Relational Database Systems 1

Databases 1. Daniel POP

Relational Database Systems 2 5. Query Processing

Dr. Angelika Reiser Chair for Database Systems (I3)

5.3 Parser and Translator. 5.3 Parser and Translator. 5.3 Parser and Translator. 5.3 Parser and Translator. 5.3 Parser and Translator

Course Logistics & Chapter 1 Introduction

D.Hemavathi & R.Venkatalakshmi, Assistant Professor, SRM University, Kattankulathur

CS 525 Advanced Database Organization - Spring 2017 Mon + Wed 1:50-3:05 PM, Room: Stuart Building 111

Who, where, when. Database Management Systems (LIX022B05) Literature. Evaluation. Lab Sessions. About this course. After this course...

Quick Facts about the course. CS 2550 / Spring 2006 Principles of Database Systems. Administrative. What is a Database Management System?

Introduction. Example Databases

CT13 DATABASE MANAGEMENT SYSTEMS DEC 2015

Database Management System. Fundamental Database Concepts

B.H.GARDI COLLEGE OF MASTER OF COMPUTER APPLICATION. Ch. 1 :- Introduction Database Management System - 1

XML Databases 1. Introduction,

COMP3311 Database Systems

Database System Concepts

Chapter 1: Introduction

DB Basic Concepts. Rab Nawaz Jadoon DCS. Assistant Professor. Department of Computer Science. COMSATS IIT, Abbottabad Pakistan

; Spring 2008 Prof. Sang-goo Lee (14:30pm: Mon & Wed: Room ) ADVANCED DATABASES

What is Data? ANSI definition: Volatile vs. persistent data. Data. Our concern is primarily with persistent data

What is Data? Volatile vs. persistent data Our concern is primarily with persistent data

Administration Naive DBMS CMPT 454 Topics. John Edgar 2

Relational Database Systems 2 5. Query Processing

Databases and Database Management Systems

CMPT 354 Database Systems I. Spring 2012 Instructor: Hassan Khosravi

Chapter 1 Chapter-1

ADVANCED DATABASES ; Spring 2015 Prof. Sang-goo Lee (11:00pm: Mon & Wed: Room ) Advanced DB Copyright by S.-g.

Introduction: Database Concepts Slides by: Ms. Shree Jaswal

CS425 Fall 2016 Boris Glavic Chapter 1: Introduction

Data Warehousing & Mining Techniques

Database Management Systems. Chapter 1

2. Summary. 2.1 Basic Architecture. 2. Architecture. 2.1 Staging Area. 2.1 Operational Data Store. Last week: Architecture and Data model

Multimedia Databases

Data Warehousing & Mining Techniques

Database Systems Concepts *

SRM UNIVERSITY FACULTY OF ENGINEERING AND TECHNOLOGY SCHOOL OF COMPUTING DEPARTMENT OF CSE COURSE PLAN

Database. Università degli Studi di Roma Tor Vergata. ICT and Internet Engineering. Instructor: Andrea Giglio

Architecture and Implementation of Database Systems (Summer 2018)

Chapter 1: Introduction

LECTURE1: PRINCIPLES OF DATABASES

5/23/2014. Limitations of File-based Approach. Limitations of File-based Approach CS235/CS334 DATABASE TECHNOLOGY CA 40%

DATABASE MANAGEMENT SYSTEMS. UNIT I Introduction to Database Systems

Informatics 1: Data & Analysis

Relational Database Systems Recovery

Course Introduction & Foundational Concepts

Multimedia Databases. 0. Organizational Issues. 0. Organizational Issues. 0. Organizational Issues. 0. Organizational Issues. 1.

Informatics 1: Data & Analysis

MIS Database Systems.

UNIT I. Introduction

BIS Database Management Systems.

Databases. Jörg Endrullis. VU University Amsterdam

Introduction to Database Management Systems

Data about data is database Select correct option: True False Partially True None of the Above

Deccan Education Society s FERGUSSON COLLEGE, PUNE (AUTONOMOUS) SYLLABUS UNDER AUTONOMY. FIRST YEAR B.Sc. COMPUTER SCIENCE SEMESTER I

Database Management System (15ECSC208) UNIT I: Chapter 1: Introduction to DBMS and ER-Model

35 Database benchmarking 25/10/17 12:11 AM. Database benchmarking

Chapter 1: Introduction. Chapter 1: Introduction

Relational Database Systems 1

CMSC 424 Database design Lecture 2: Design, Modeling, Entity-Relationship. Book: Chap. 1 and 6. Mihai Pop

CS 4604: Introduction to Database Management Systems. B. Aditya Prakash Lecture #1: Introduction

Evolution of Database Systems

Database Management Systems Introduction to DBMS

DATABASE MANAGEMENT SYSTEM SUBJECT CODE: CE 305

Chapter 1: Introduction

Unit 2. Unit 3. Unit 4

Database Technology Introduction. Heiko Paulheim

8) A top-to-bottom relationship among the items in a database is established by a

Relational Database Systems 1

Unit I. By Prof.Sushila Aghav MIT

Chapter 1: Introduction

15CS53: DATABASE MANAGEMENT SYSTEM

Database Management Systems (CPTR 312)

Introduction to the course

DATABASE MANAGEMENT SYSTEM SHORT QUESTIONS. QUESTION 1: What is database?

Specific Objectives Contents Teaching Hours 4 the basic concepts 1.1 Concepts of Relational Databases

Practical Database Design Methodology and Use of UML Diagrams Design & Analysis of Database Systems

DATABASE MANAGEMENT SYSTEM ARCHITECTURE

G64DBS Database Systems. G64DBS Module. Recommended Textbook. Assessment. Recommended Textbook. Recommended Textbook.

CSE 3241: Database Systems I Databases Introduction (Ch. 1-2) Jeremy Morris

Introduction. Who wants to study databases?

Relational Database Systems Part 01. Karine Reis Ferreira

Database Management System Implementation. Who am I? Who is the teaching assistant? TR, 10:00am-11:20am NTRP B 140 Instructor: Dr.

Overview of Data Management

Introduction and Overview

CSC 261/461 Database Systems Lecture 20. Spring 2017 MW 3:25 pm 4:40 pm January 18 May 3 Dewey 1101

Database System Concepts and Architecture

Introduction to Databases

Database Principle. Zhuo Wang Spring

ABD - Database Administration

Database Systems. Sven Helmer. Database Systems p. 1/567

Relational Database Systems 2 3. Indexing and Access Paths

CAS CS 460/660 Introduction to Database Systems. Fall

Transcription:

Relational Database Systems 2 1. System Architecture Silke Eckstein Andreas Kupfer Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de

Organizational Information Lecture 01. April 2009 08. July 2009, 09:45h-11:15h Exercise 01. April 2009 08. July 2009, 11:30h-12:15h In fact, we will interleave lectures and exercises 4 Credits Exams Oral exams, 13. 15. July or 11.-12. August 2009 50% of total exercise point are needed to be eligible for the exams Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 2

Recommended Literature Fundamentals of Database Systems (EN) Elmasri & Navathe Addison Wesley, ISBN 032141506X Database Systems Concepts (SKS) Silberschatz, Korth & Sudarshan McGraw Hill, ISBN 0072958863 Database Systems (GUW) Garcia-Molina, Ullman & Widom Prentice Hall, ISBN 0130319953 Datenbanksysteme (KE) Kemper & Eickler Oldenbourg, ISBN 3486576909 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 3

Recommended Literature Transactional Information Systems (WV) Weikum & Vossen Morgan Kaufmann, ISBN 1558605088 Transaction Processing (GR) Gray & Reuter Morgan Kaufmann, ISBN 1558601902 Database Security (CFMS) Castano, Fugini, Martella & Samarati Addison Wesley, ISBN 0201593750 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 4

1. System Architecture 1.1 Characteristics of Databases 1.2 Data-Models and Schemas Data Independence Three Schema Architecture System Catalogs 1.3 System Structure 1.4 Quality Benchmarks Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 5

1.1 What is a Database? A database (DB) is a collection of related data Represents some aspects of the real world Universe of Discourse (UoD) Data is logically coherent Is provided for an intended group of users and applications A database management system (DBMS) is a collection of programs to maintain a database, i.e. for Definition of Data and Structure Physical Construction Manipulation Sharing/Protecting Persistence/Recovery EN 1.1 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 6

1.1 Example Classic Example: Banking Systems DBMS used in banking since ca. 1960 Huge amounts of data on customers, accounts, loans, balances, Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 7

1.1 Why not use the File System? File management systems are physical interfaces Account Data F i l e App 1 Customer Letters Customer Data Loans S y s t e m App 2 Money Transfer Balance Sheets Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 8

1.1 File Systems Advantages Fast and easy access Disadvantages Uncontrolled redundancy Inconsistent data Limited data sharing and access rights Poor enforcement of standards Excessive data and access paths maintenance Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 9

1.1 Databases Databases are logical interfaces Controlled redundancy Data consistency & integrity constraints Integration of data Effective and secure data sharing Backup and recovery However More complex More expensive data access Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 10

1.1 Example DBMS replaced previously dominant file-based systems in banking due to special requirements Simultaneous and quick access is necessary Failures and loss of data cannot be tolerated Data always has to remain in a consistent state Frequent queries and modifications Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 11

1.1 Characteristics of Databases Databases control redundancy Same data used by different applications/tasks is only stored once Access via a single interface provided by DBMS Redundancy only purposefully used to speed up data access (e.g. materialized views) Problems of uncontrolled redundancy Difficulties in consistently updating data Waste of storage space EN 1.6.1 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 12

1.1 Characteristics of Databases Databases are well-structured (e.g. ER-Model) Catalog (data dictionary) contains all meta-data Defines the structure of the data in the database Example: ER-Model Simple banking system ID AccNo firstname customer has account balance lastname address type EN 1.3 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 13

1.1 Characteristics of Databases Databases aim at efficient manipulation of data Physical tuning allows for good data allocation Indexes speed up search and access Query plans are optimized for improved performance Example: Simple Index Index File AccNo 1278945 5539783 9134354 Data File AccNo type balance 1278945 saving 312.10 2437954 saving 1324.82 4543032 checking -43.03 5539783 saving 12.54 7809849 checking 7643.89 8942214 checking -345.17 9134354 saving 2.22 9543252 saving 524.89 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 14

1.1 Characteristics of Databases Isolation between applications and data Database employs data abstraction by providing data models Applications work only on the conceptual representation of data Data is strictly typed (Integer, Timestamp, VarChar, ) Details on where data is actually stored and how it is accessed is hidden by the DBMS Applications can access and manipulate data by invoking abstract operations (e.g. SQL Select statements) DBMS-controlled parts of the file system are strongly protected against outside manipulation (tablespaces) EN 1.3 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 15

1.1 Characteristics of Databases Example: Schema is changed and table-space moved without an application noticing Application SELEC T AccNo FROM account WHERE balance>0 DBMS Disk 1 Disk 2 AccNo balance 1278945 312.10 2437954 1324.82 4543032-43.03 5539783 12.54 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 16

1.1 Characteristics of Databases Example: Schema is changed and table-space moved without an application noticing Application SELEC T AccNo FROM account WHERE balance>0 DBMS Disk 1 Disk 2 AccNo balance 1278945 312.10 2437954 1324.82 4543032-43.03 5539783 12.54 AccNo type balance 1278945 saving 312.10 2437954 saving 1324.82 4543032 checking -43.03 5539783 saving 12.54 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 17

1.1 Characteristics of Databases Supports multiple views of the data Views provide a different perspective of the DB A user s conceptual understanding or task-based excerpt of all data (e.g. aggregations) Security considerations and access control (e.g. projections) For the application, a view does not differ from a table Views may contain subsets of a DB and/or contain virtual data Virtual data is derived from the DB (mostly by simple SQL statements, e.g. joins over several tables) Can either be computed at query time or materialized upfront EN 1.3 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 18

1.1 Characteristics of Databases Example Views: Projection Saving account clerk vs. checking account clerk Saving View Original Table AccNo balance AccNo type balance 1278945 312.10 1278945 saving 312.10 2437954 saving 1324.82 4543032 checking -43.03 5539783 saving 12.54 7809849 checking 7643.89 8942214 checking -345.17 9134354 saving 2.22 9543252 saving 524.89 2437954 1324.82 5539783 12.54 9134354 2.22 9543252 524.89 Checking View AccNo balance 4543032-43.03 7809849 7643.89 8942214-345.17 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 19

1.1 Characteristics of Databases Sharing of data and support for atomic multiuser transactions Multiple user and applications may access the DB at the same time Concurrency control is necessary for maintaining consistency Transactions need to be atomic and isolated from each other EN 1.3 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 20

1.1 Characteristics of Databases Example: Atomic Transactions Program: Transfer X Euro from Account 1 to Account 2 1. Deduce amount X from Account 1 2. Add amount X to Account 2 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 21

1.1 Characteristics of Databases Example: Atomic Transactions Program: Transfer X Euro from Account 1 to Account 2 1. Deduce amount X from Account 1 2. Add amount X to Account 2 But what happens if system fails between step 1 and 2? Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 22

1.1 Characteristics of Databases Example: Multi-User Transactions Program: Deduce amount X from Account 1 1. Read old balance from DB 2. New balance := old balance X 3. Write new balance back to the DB Problem: Dirty Read Account 1 has 500 User 1 wants deduce 20 User 2 wants to deduce 80 at the same time Without multi-user transaction, account will have either 480 or 420, but not the correct 400 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 23

1.1 Characteristics of Databases Persistence of data and disaster recovery Data needs to be persistent and accessible at all times Quick recovery from system crashes without data loss Recovery from natural desasters ( fire, earthquakes, ) EN 1.3 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 24

Aim of this Lecture What concepts does a DBMS need and how do you actually implement the concepts to build a DBMS? Basic concepts Query processing and optimization Transaction concept and implementing concurrent usage Logging and recovery concepts Implementing access control Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 25

1.2 Data Models A Data Model describes data objects, operations and their effects Data Definition Language (DDL) Create Table, Create View, Constraint/Check, etc. Data Manipulation Language (DML) Select, Insert, Delete, Update, etc. DML and DDL are usually clearly separated, since they handle data and meta-data, respectively EN 2.1 KE 1.4 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 26

1.2 Data Models Conceptual Data Models ER Model Semantic Data Models UML class diagrams Logical Data Models Model Types Relational Data Model (in this lecture) Network Models Object Models Schema describing Structure High Level Operations Physical Data Models Describes how data is stored, i.e. formats, ordering and access paths like tablespaces or indexes EN 2.1 KE 1.4 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 27

1.2 DBMS Meta-Data Environments Schemas Describe a part of the structure of the stored data as tables, attributes, views, constraints, relationships, etc. (Meta-Data) System Catalogs A collection of schemas Contain special schemas describing the schema collection Clusters (optionally) A collection of catalogs May be individually defined for each user (access control) Represent the maximal query scope GUW8.3 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 28

1.2 DBMS Meta-Data Environments DBMS Environment Catalog Schema Schema Cluster = Max. Query Scope Catalog Schema Schema Catalog Schema Schema GUW 8.3 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 29

1.2 Meta-Data - Example DBMS: IBM DB2 V9 Catalog: HORIZON Example Meta-Data View: SYSIBM.TABLES Describes all tables of the catalog Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 30

1.2 Schemas and Instances Schemas describe the structure of part of the DB data (intensional database) Entities (a real world concept) as tables and their attributes (a property of an entity) Types of attributes and integrity constraints Relationships between entities as tables Schemas are intended to be stable and not change often Basic operations Operations for selections, insertions and updates Optionally user defined operations (User Defined Functions (UDFs), stored procedures) and types (UDTs) May be used for more complex computations on data EN 2.1 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 31

1.2 Schemas and Instances The actually stored data is called an instance of a schema (extensional database) Warning: some DBMS (e.g. IBM DB2) call a set of schemas and physical parameters (tablespaces, etc.) instances of a database IntensionalDB ExtensionalDB Table account AccNo type balance 1278945 saving 312.10 2437954 saving 1324.82 4543032 checking 43.03 5539783 saving 12.54 7809849 checking 7643.89 8942214 checking 345.17 9134354 saving 2.22 9543252 saving 524.89 Primary key AccNo Check balance> 0 EN 2.1 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 32

1.2 Three Schema Architecture Remember: DBs should be well structured and efficient Programs and data should be isolated Different views for different user groups are necessary Thus, DBs are organized using 3 levels of schemas Internal Schema (physical schema) Describes the physical storage and access paths Uses physical models Conceptual Schema (logical schema) Describes structure of the whole DB, hiding physical details Uses conceptual models External Schema (views) Describes parts of the DB structure for a certain user group as views Hides the conceptual details EN 2.2 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 33

1.2 Three Schema Architecture End Users or Applications View 1 View 2 View n Conceptual Schema Internal Schema Stored Data ANSI/SPARC (1975) American National Standards Institute / Standards Planning and Requirements Committee EN 2.2 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 34

1.2 Data Independence Ability to change schema of one level without changing the others Logical Data Independence Change of conceptual schema without change of external schemas (and thus applications) Examples: adding attributes, changing constraints, But: for example dropping an attribute used in some user s/application s view will violate independence EN 2.2.2 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 35

1.2 Data Independence Physical Data Independence Changes of the internal schema do not affect the conceptual schema Important for reorganizing data on the disk (moving or splitting tablespaces) Adding or changing access paths (new indices, etc.) Physical tuning is one of the most important maintenance tasks of DB administrators Physical independence is also supported by having a declarative query language in relational databases What to access vs. how to access EN 2.2.2 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 36

1.3 System Structure - Overview Database characteristics lead to layered architecture Query Processor Query Optimization Query Planning Storage Manager Access Paths Physical sets, pages, buffers Accesses disks through OS May be avoided using raw devices for direct data access Applications /Queries DBMS Query Processor Storage Manager Operating System Disks EN 1.1 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 37

1.3 System Structure Application Programmers DB Administrators Application Interfaces Application Programs Direct Query DB Scheme Transaction Manager Query Processor Applications Embedded Programs DML Object Code Precompiler DML Compiler DDL Interpreter Storage Manager File Manager Buffer Manager Query Evaluation Engine Data Indices Statistics Catalog/ Dictionary SKS 1.9 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 38

1.3 Storage Manager The storage manager provides the interface between the data stored in the database and the application programs and queries submitted to the system The storage manager is responsible for Interaction with the file manager Efficient storing, retrieving and updating of data Tasks: Storage access File organization Indexing and hashing Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 39

1.3 Query Processor The query processor parses queries, optimizes query plans and evaluates the query Alternative ways of evaluating a given query due to equivalent expressions Different algorithms for each operation Cost difference between good and bad ways of evaluating a query can be enormous Needs to estimate the cost of operations Depends critically on statistical information about relations which the DBMS maintains Need to estimate statistics for intermediate results to compute cost of complex expressions (join order, etc.) Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 40

1.3 Transaction Manager A transaction is a collection of operations that performs a single logical function in a database application The transaction manager Ensures that the database remains in a correct state despite system failures (like power failures, operating system crashes, or transaction failures) Controls the interaction among concurrent transactions to ensure the database consistency Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 41

1.4 DBMS Quality How do you know whether you built or bought a good DBMS? Always: depends on the application Analyze data volume, typical DB queries and transactions (what do you really need?) Analyze expected frequency of invocation of queries and transactions (what has to be supported?) Analyze time constraints of queries and transactions (how fast does it have to be?) Analyze expected frequency of update operations (does it deal with rather static or with volatile data?) Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 42

1.4 Performance Measures Basically analytical & experimental approaches on typical characteristics like Response time: how long can a query/update be expected to take? On average or at peak times (worst case) Transaction throughput: how many transactions can be processed per second/millisecond? On average or at peak times (worst case) Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 43

1.4 Industry Standard Benchmarks How to compare database performance across vendors? The Transaction Processing Performance Council Aims are significant disk input/output, moderate system and application execution time, and transaction integrity Defines certain scenarios with standard data sets, schemas and queries http://www.tpc.org Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 44

1.4 Industry Standard Benchmarks E.g. TPC-D (used until 1999) Decision Support Applications Performance Metrics: Throughput measured in transactions per second (tps) Response time of transaction (transaction elapse time) Cost metric (in $/tps) OLTP multiple on-line terminal sessions modeled by transaction arrival distribution Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 45

1.4 Example query from TPC-D 2.3 Forecasting Revenue Query (Q6) This query quantifies the amount of revenue increase that would have resulted from eliminating company-wide discounts in a given percentage range in a given year. Asking this type of what if query can be used to look for ways to increase revenues. 2.3.1 Business Question The Forecasting Revenue Change Query considers all the lineitems shipped in a given year with discounts between DISCOUNT+0.01 and DISCOUNT-0.01. The query list the amount by which the total revenues would have decreased if these discounts had been eliminated for lineitems with item quantities less than QUANTITY. Note that the potential revenue increase is equal to the sum of (L_EXTENDEDPRICE * L_DISCOUNT) for all lineitems with quantities and discounts in the qualifying range. Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 46

1.4 Example query from TPC-D 2.3.2 Functional Query Definition SELECT SUM(L_EXTENDEDPRICE*L_DISCOUNT) AS REVENUE FROM LINEITEM WHERE L_SHIPDATE >= DATE [DATE]] AND L_SHIPDATE < DATE [DATE] + INTERVAL 1 YEAR AND L_DISCOUNT BETWEEN [DISCOUNT] - 0.01 AND [DISCOUNT] + 0.01 AND L_QUANTITY < [QUANTITY] 2.8.3 Substitution Parameters Values for the following substitution parameters must be generated and used to build the executable query text DATE is the first of January of a randomly selected year within [1993-1997] DISCOUNT is randomly selected within [0.02.. 0.09] QUANTITY is randomly selected within [24.. 25] Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 47

1.4 Example query from TPC-D 2.8.4 Query Validation For validation against the qualification database the query must be executed using the following values for the substitution parameters 1. DATE = 1994-01-01 2. DISCOUNT = 0.06 3. QUANTITY = 24 Query validation output data: 1 row returned REVENUE 11450588.04 Query validation demonstrates the integrity of an implementation Query phrasings are run against 100MB data set If the answer sets don t match, the benchmark is invalid! Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 48

1.4 Results as of 1999 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 49

1.4 Current TPC Benchmarks TPC-C Standard for comparing On-Line Transaction Processing (OLTP) performance on various hardware and software configurations since 1992 (currently in version 5.9) TPC-App Application server and web services benchmark simulates the activities of a business-to-business transactional application server operating in a 24/7 environment Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 50

1.4 Current TPC Benchmarks TPC-E New OLTP workload benchmark Simulates the OLTP workload of a brokerage firm focussing on a central database that executes transactions related to the firm s customer accounts TPC-H Ad-hoc, decision support benchmark Consists of a suite of business oriented ad-hoc queries and concurrent data modifications Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 51