Relational Database Systems 2 1. System Architecture

Similar documents
Relational Database Systems 2 1. System Architecture

Relational Database Systems 1

0. Organizational Issues. 0. Why should you be here? 0. Why should you be here? 0. Why should you be here? Relational Database Systems

Relational Database Systems 1 Wolf-Tilo Balke Hermann Kroll

Database systems. Jaroslav Porubän, Miroslav Biňas, Milan Nosáľ (c)

Course: Database Management Systems. Lê Thị Bảo Thu

(Extended) Entity Relationship

Relational Database Systems 1

CS 525 Advanced Database Organization - Spring 2017 Mon + Wed 1:50-3:05 PM, Room: Stuart Building 111

Course Logistics & Chapter 1 Introduction

; Spring 2008 Prof. Sang-goo Lee (14:30pm: Mon & Wed: Room ) ADVANCED DATABASES

Dr. Angelika Reiser Chair for Database Systems (I3)

Quick Facts about the course. CS 2550 / Spring 2006 Principles of Database Systems. Administrative. What is a Database Management System?

Databases 1. Daniel POP

D.Hemavathi & R.Venkatalakshmi, Assistant Professor, SRM University, Kattankulathur

Administration Naive DBMS CMPT 454 Topics. John Edgar 2

COMP3311 Database Systems

CT13 DATABASE MANAGEMENT SYSTEMS DEC 2015

Database Management System. Fundamental Database Concepts

Chapter 1: Introduction

Databases and Database Management Systems

B.H.GARDI COLLEGE OF MASTER OF COMPUTER APPLICATION. Ch. 1 :- Introduction Database Management System - 1

Chapter 1 Chapter-1

Relational Database Systems 2 5. Query Processing

Who, where, when. Database Management Systems (LIX022B05) Literature. Evaluation. Lab Sessions. About this course. After this course...

CMPT 354 Database Systems I. Spring 2012 Instructor: Hassan Khosravi

5.3 Parser and Translator. 5.3 Parser and Translator. 5.3 Parser and Translator. 5.3 Parser and Translator. 5.3 Parser and Translator

ADVANCED DATABASES ; Spring 2015 Prof. Sang-goo Lee (11:00pm: Mon & Wed: Room ) Advanced DB Copyright by S.-g.

What is Data? ANSI definition: Volatile vs. persistent data. Data. Our concern is primarily with persistent data

What is Data? Volatile vs. persistent data Our concern is primarily with persistent data

CS425 Fall 2016 Boris Glavic Chapter 1: Introduction

Introduction: Database Concepts Slides by: Ms. Shree Jaswal

Data Warehousing & Mining Techniques

Database System Concepts

2. Summary. 2.1 Basic Architecture. 2. Architecture. 2.1 Staging Area. 2.1 Operational Data Store. Last week: Architecture and Data model

Data Warehousing & Mining Techniques

Database Management Systems Introduction to DBMS

Introduction. Example Databases

UNIT I. Introduction

Relational Database Systems 2 5. Query Processing

Course Introduction & Foundational Concepts

Database Management Systems. Chapter 1

DATABASE MANAGEMENT SYSTEM SUBJECT CODE: CE 305

DB Basic Concepts. Rab Nawaz Jadoon DCS. Assistant Professor. Department of Computer Science. COMSATS IIT, Abbottabad Pakistan

MIS Database Systems.

BIS Database Management Systems.

DATABASE MANAGEMENT SYSTEM SHORT QUESTIONS. QUESTION 1: What is database?

Chapter 1: Introduction

Unit 2. Unit 3. Unit 4

DATABASE MANAGEMENT SYSTEMS. UNIT I Introduction to Database Systems

HyPer-sonic Combined Transaction AND Query Processing

SRM UNIVERSITY FACULTY OF ENGINEERING AND TECHNOLOGY SCHOOL OF COMPUTING DEPARTMENT OF CSE COURSE PLAN

Practical Database Design Methodology and Use of UML Diagrams Design & Analysis of Database Systems

Multimedia Databases

Relational Database Systems Part 01. Karine Reis Ferreira

Introduction to the course

LECTURE1: PRINCIPLES OF DATABASES

Data Warehousing & Data Mining

Database Technology Introduction. Heiko Paulheim

Chapter 1: Introduction. Chapter 1: Introduction

CS 4604: Introduction to Database Management Systems. B. Aditya Prakash Lecture #1: Introduction

Specific Objectives Contents Teaching Hours 4 the basic concepts 1.1 Concepts of Relational Databases

CMSC 424 Database design Lecture 2: Design, Modeling, Entity-Relationship. Book: Chap. 1 and 6. Mihai Pop

Database System Concepts and Architecture

Department of Information Technology B.E/B.Tech : CSE/IT Regulation: 2013 Sub. Code / Sub. Name : CS6302 Database Management Systems

Chapter 1: Introduction

Databases. Jörg Endrullis. VU University Amsterdam

Database Systems Concepts *

15CS53: DATABASE MANAGEMENT SYSTEM

Introduction to Database Management Systems

Chapter 1: Introduction

Overview of Data Management

Informatics 1: Data & Analysis

Architecture and Implementation of Database Systems (Summer 2018)

Deccan Education Society s FERGUSSON COLLEGE, PUNE (AUTONOMOUS) SYLLABUS UNDER AUTONOMY. FIRST YEAR B.Sc. COMPUTER SCIENCE SEMESTER I

5/23/2014. Limitations of File-based Approach. Limitations of File-based Approach CS235/CS334 DATABASE TECHNOLOGY CA 40%

Relational Database Systems 1

Multimedia Databases. 0. Organizational Issues. 0. Organizational Issues. 0. Organizational Issues. 0. Organizational Issues. 1.

Relational Database Systems 1

CSE 3241: Database Systems I Databases Introduction (Ch. 1-2) Jeremy Morris

Database Management Systems (CPTR 312)

Evolution of Database Systems

G64DBS Database Systems. G64DBS Module. Recommended Textbook. Assessment. Recommended Textbook. Recommended Textbook.

CSC 355 Database Systems

Data about data is database Select correct option: True False Partially True None of the Above

CAS CS 460/660 Introduction to Database Systems. Fall

M S Ramaiah Institute of Technology Department of Computer Science And Engineering

DATABASE MANAGEMENT SYSTEM ARCHITECTURE

Relational Database Systems 2 3. Indexing and Access Paths

Database. Università degli Studi di Roma Tor Vergata. ICT and Internet Engineering. Instructor: Andrea Giglio

Unit I. By Prof.Sushila Aghav MIT

Course Introduction & Foundational Concepts

Introduction and Overview

Rajiv GandhiCollegeof Engineering& Technology, Kirumampakkam.Page 1 of 10

Database Management System (15ECSC208) UNIT I: Chapter 1: Introduction to DBMS and ER-Model

HyPer-sonic Combined Transaction AND Query Processing

Informatics 1: Data & Analysis

Chapter 1: Introduction

Database Principle. Zhuo Wang Spring

John Edgar 2

Transcription:

Relational Database Systems 2 1. System Architecture Wolf-Tilo Balke Jan-Christoph Kalo Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de

1 Organizational Issues Lecture Regular dates from 6. April to 13. July Große Übung (Exercise) Interleaved with lecture Discussion of exercises, examples, detours,... Oral Exams Will be announced later 5 Credits Lecture Webpage http://www.ifis.cs.tu-bs.de/teaching/ss-16/rdb2 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 2

1 Organizational Issues Scope of the lecture Architecture of RDBs Storage Hardware Indexing Basic Index Structures Tree Index Structures Query Processing Basics of QP (Algebraic) Query Optimization (Heuristical) Query Optimization (Statistical & Join-Order) Transaction Management Locking Protocols Alternative Protocols Data Protection Recovery & Durability Security Trends & Alternatives Alternative Implementations Alternative Applications Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 3

1 Recommended Literature Fundamentals of Database Systems (EN) Elmasri & Navathe Addison Wesley, ISBN 032141506X Database Systems Concepts (SKS) Silberschatz, Korth & Sudarshan McGraw Hill, ISBN 0072958863 Database Systems (GUW) Garcia-Molina, Ullman & Widom Prentice Hall, ISBN 0130319953 Datenbanksysteme (KE) Kemper & Eickler Oldenbourg, ISBN 3486576909 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 4

1 Recommended Literature Transactional Information Systems (WV) Weikum & Vossen Morgan Kaufmann, ISBN 1558605088 Transaction Processing (GR) Gray & Reuter Morgan Kaufmann, ISBN 1558601902 Database Security (CFMS) Castano, Fugini, Martella & Samarati Addison Wesley, ISBN 0201593750 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 5

1 System Architecture 1.1 Characteristics of Databases 1.2 Data-Models and Schemas Data Independence Three Schema Architecture System Catalogs 1.3 System Structure 1.4 Quality Benchmarks Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 6

1.1 What is a Database? A database (DB) is a collection of related data Represents some aspects of the real world Universe of Discourse (UoD) Data is logically coherent Is provided for an intended group of users and applications A database management system (DBMS) is a collection of programs to maintain a database, i.e. for Definition of Data and Structure Physical Construction Manipulation Sharing/Protecting Persistence/Recovery EN 1.1 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 7

1.1 Example Classic Example: Banking Systems DBMS used in banking since ca. 1960 Huge amounts of data on customers, accounts, loans, balances, Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 8

1.1 Why not use the File System? File management systems are physical interfaces Account Data F i l e App 1 Customer Letters Customer Data Loans S y s t e m App 2 Money Transfer Balance Sheets Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 9

1.1 File Systems Advantages Fast and easy access Disadvantages Uncontrolled redundancy Inconsistent data Limited data sharing and access rights Poor enforcement of standards Excessive data and access paths maintenance Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 10

1.1 Databases Databases are logical interfaces Controlled redundancy Data consistency & integrity constraints Integration of data Effective and secure data sharing Backup and recovery However More complex More expensive data access Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 11

1.1 Example DBMS replaced previously dominant file-based systems in banking due to special requirements Simultaneous and quick access is necessary Failures and loss of data cannot be tolerated Data always has to remain in a consistent state Frequent queries and modifications Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 12

1.1 Characteristics of Databases Databases control redundancy Same data used by different applications/tasks is only stored once Access via a single interface provided by DBMS Redundancy only purposefully used to speed up data access (e.g. materialized views) Problems of uncontrolled redundancy Difficulties in consistently updating data Waste of storage space EN 1.6.1 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 13

1.1 Characteristics of Databases Databases are well-structured (e.g. ER-Model) Catalog (data dictionary) contains all meta-data Defines the structure of the data in the database Example: ER-Model Simple banking system ID AccNo firstname customer has account balance lastname address type EN 1.3 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 14

1.1 Characteristics of Databases Databases aim at efficient manipulation of data Physical tuning allows for good data allocation Indexes speed up search and access Query plans are optimized for improved performance Example: Simple Index Index File AccNo 1278945 5539783 9134354 Data File AccNo type balance 1278945 saving 312.10 2437954 saving 1324.82 4543032 checking -43.03 5539783 saving 12.54 7809849 checking 7643.89 8942214 checking -345.17 9134354 saving 2.22 9543252 saving 524.89 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 15

1.1 Characteristics of Databases Isolation between applications and data Database employs data abstraction by providing data models Applications work only on the conceptual representation of data Data is strictly typed (Integer, Timestamp, VarChar, ) Details on where data is actually stored and how it is accessed is hidden by the DBMS Applications can access and manipulate data by invoking abstract operations (e.g. SQL Select statements) DBMS-controlled parts of the file system are strongly protected against outside manipulation (tablespaces) EN 1.3 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 16

1.1 Characteristics of Databases Example: Schema is changed and table-space moved without an application noticing Application SELECT AccNo FROM account WHERE balance > 0 DBMS Disk 1 Disk 2 AccNo balance 1278945 312.10 2437954 1324.82 4543032-43.03 5539783 12.54 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 17

1.1 Characteristics of Databases Example: Schema is changed and table-space moved without an application noticing Application SELECT AccNo FROM account WHERE balance > 0 DBMS Disk 1 Disk 2 AccNo balance 1278945 312.10 2437954 1324.82 4543032-43.03 5539783 12.54 AccNo type balance 1278945 saving 312.10 2437954 saving 1324.82 4543032 checking -43.03 5539783 saving 12.54 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 18

1.1 Characteristics of Databases Supports multiple views of the data Views provide a different perspective of the DB A user s conceptual understanding or task-based excerpt of all data (e.g. aggregations) Security considerations and access control (e.g. projections) For the application, a view does not differ from a table Views may contain subsets of a DB and/or contain virtual data Virtual data is derived from the DB (mostly by simple SQL statements, e.g. joins over several tables) Can either be computed at query time or materialized upfront EN 1.3 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 19

1.1 Characteristics of Databases Example Views: Projection Saving account clerk vs. checking account clerk Saving View Original Table AccNo balance AccNo type balance 1278945 saving 312.10 2437954 saving 1324.82 4543032 checking -43.03 5539783 saving 12.54 7809849 checking 7643.89 8942214 checking -345.17 9134354 saving 2.22 9543252 saving 524.89 1278945 312.10 2437954 1324.82 5539783 12.54 9134354 2.22 9543252 524.89 Checking View AccNo balance 4543032-43.03 7809849 7643.89 8942214-345.17 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 20

1.1 Characteristics of Databases Sharing of data and support for atomic multiuser transactions Multiple user and applications may access the DB at the same time Concurrency control is necessary for maintaining consistency Transactions need to be atomic and isolated from each other EN 1.3 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 21

1.1 Characteristics of Databases Example: Atomic Transactions Program: Transfer X Euro from Account 1 to Account 2 1. Deduce amount X from Account 1 2. Add amount X to Account 2 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 22

1.1 Characteristics of Databases Example: Atomic Transactions Program: Transfer X Euro from Account 1 to Account 2 1. Deduce amount X from Account 1 2. Add amount X to Account 2 But what happens if system fails between step 1 and 2? Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 23

1.1 Characteristics of Databases Example: Multi-User Transactions Program: Deduce amount X from Account 1 1. Read old balance from DB 2. New balance := old balance X 3. Write new balance back to the DB Problem: Dirty Read Account 1 has 500 User 1 wants deduce 20 User 2 wants to deduce 80 at the same time Without multi-user transaction, account will have either 480 or 420, but not the correct 400 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 24

1.1 Characteristics of Databases Persistence of data and disaster recovery Data needs to be persistent and accessible at all times Quick recovery from system crashes without data loss Recovery from natural desasters ( fire, earthquakes, ) EN 1.3 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 25

1.1 Aim of this Lecture What concepts does a DBMS need and how do you actually implement the concepts to build a DBMS? Basic concepts Query processing and optimization Transaction concept and implementing concurrent usage Logging and recovery concepts Implementing access control Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 26

1.2 Data Models A Data Model describes data objects, operations and their effects Data Definition Language (DDL) Create Table, Create View, Constraint/Check, etc. Data Manipulation Language (DML) Select, Insert, Delete, Update, etc. DML and DDL are usually clearly separated, since they handle data and meta-data, respectively EN 2.1 KE 1.4 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 27

1.2 Data Models Conceptual Data Models ER Model Semantic Data Models UML class diagrams Logical Data Models Model Types Relational Data Model (in this lecture) Network Models Object Models Schema describing Structure High Level Operations Physical Data Models Describes how data is stored, i.e. formats, ordering and access paths like tablespaces or indexes EN 2.1 KE 1.4 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 28

1.2 DBMS Meta-Data Environments Schemas Describe a part of the structure of the stored data as tables, attributes, views, constraints, relationships, etc. (Meta-Data) System Catalogs A collection of schemas Contain special schemas describing the schema collection Clusters (optionally) A collection of catalogs May be individually defined for each user (access control) Represent the maximal query scope GUW8.3 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 29

1.2 DBMS Meta-Data Environments DBMS Environment Catalog Schema Schema Cluster = Max. Query Scope Catalog Schema Schema Catalog Schema Schema GUW 8.3 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 30

1.2 Meta-Data - Example DBMS: IBM DB2 V9 Catalog: HORIZON Example Meta-Data View: SYSIBM.TABLES Describes all tables of the catalog Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 31

1.2 Schemas and Instances Schemas describe the structure of part of the DB data (intensional database) Entity Types (a real world concept) as tables and their attributes (a property of an entity) Types of attributes and integrity constraints Relationships between entity types as tables Schemas are intended to be stable and not change often Basic operations Operations for selections, insertions and updates Optionally user defined operations (User Defined Functions (UDFs), stored procedures) and types (UDTs) May be used for more complex computations on data EN 2.1 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 32

1.2 Schemas and Instances The actually stored data is called an instance of a schema (extensional database) Warning: some DBMS (e.g. IBM DB2) call a set of schemas and physical parameters (tablespaces, etc.) instances of a database Intensional DB Table account AccNo type balance 1278945 saving 312.10 2437954 saving 1324.82 Primary key AccNo Check balance > 0 Extensional DB 4543032 checking 43.03 5539783 saving 12.54 7809849 checking 7643.89 8942214 checking 345.17 9134354 saving 2.22 9543252 saving 524.89 EN 2.1 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 33

1.2 Three Schema Architecture Remember: DBs should be well structured and efficient Programs and data should be isolated Different views for different user groups are necessary Thus, DBs are organized using 3 levels of schemas Internal Schema (physical layer) Describes the physical storage and access paths Uses physical models Conceptual Schema (logical layer) Describes structure of the whole DB, hiding physical details Uses logical data models External Schema (presentation layer) Describes parts of the DB structure for a certain user group as views Hides the conceptual details EN 2.2 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 34

1.2 Three Schema Architecture End Users or Applications View 1 View 2 View n Conceptual Schema Internal Schema Stored Data ANSI/SPARC (1975) American National Standards Institute / Standards Planning and Requirements Committee EN 2.2 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 35

1.2 Data Independence Ability to change schema of one level without changing the others Logical Data Independence Change of conceptual schema without change of external schemas (and thus applications) Examples: adding attributes, changing constraints, But: for example dropping an attribute used in some user s/application s view will violate independence EN 2.2.2 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 36

1.2 Data Independence Physical Data Independence Changes of the internal schema do not affect the conceptual schema Important for reorganizing data on the disk (moving or splitting tablespaces) Adding or changing access paths (new indices, etc.) Physical tuning is one of the most important maintenance tasks of DB administrators Physical independence is also supported by having a declarative query language in relational databases What to access vs. how to access EN 2.2.2 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 37

1.3 System Structure - Overview Database characteristics lead to layered architecture Query Processor Query Optimization Query Planning Storage Manager Access Paths Physical sets, pages, buffers Accesses disks through OS May be avoided using raw devices for direct data access Applications /Queries DBMS Query Processor Storage Manager Operating System Disks EN 1.1 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 38

1.3 System Structure Application Programmers DB Administrators Application Interfaces Application Programs Direct Query DB Scheme Transaction Manager Applications Programs Object Code Query Processor Embedded DML Precompiler DML Compiler DDL Interpreter Storage Manager File Manager Buffer Manager Query Evaluation Engine Data Indices Statistics Catalog/ Dictionary SKS 1.9 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 39

1.3 Storage Manager The storage manager provides the interface between the data stored in the database and the application programs and queries submitted to the system The storage manager is responsible for Interaction with the file manager Efficient storing, retrieving and updating of data Tasks: Storage access File organization Indexing and hashing Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 40

1.3 Query Processor The query processor parses queries, optimizes query plans and evaluates the query Alternative ways of evaluating a given query due to equivalent expressions Different algorithms for each operation Cost difference between good and bad ways of evaluating a query can be enormous Needs to estimate the cost of operations Depends critically on statistical information about relations which the DBMS maintains Need to estimate statistics for intermediate results to compute cost of complex expressions (join order, etc.) Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 41

1.3 Transaction Manager A transaction is a collection of operations that performs a single logical function in a database application The transaction manager Ensures that the database remains in a correct state despite system failures (like power failures, operating system crashes, or transaction failures) Controls the interaction among concurrent transactions to ensure the database consistency Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 42

1.4 DBMS Quality How do you know whether you built or bought a good DBMS? Always: depends on the application Analyze data volume, typical DB queries and transactions (what do you really need?) Analyze expected frequency of invocation of queries and transactions (what has to be supported?) Analyze time constraints of queries and transactions (how fast does it have to be?) Analyze expected frequency of update operations (does it deal with rather static or with volatile data?) Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 43

1.4 Performance Measures Basically analytical & experimental approaches on typical characteristics like Response time: how long can a query/update be expected to take? On average or at peak times (worst case) Transaction throughput: how many transactions can be processed per second/millisecond? On average or at peak times (worst case) Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 44

1.4 Industry Standard Benchmarks How to compare database performance across vendors? The Transaction Processing Performance Council Aims are significant disk input/output, moderate system and application execution time, and transaction integrity Defines certain scenarios with standard data sets, schemas and queries http://www.tpc.org Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 45

1.4 Industry Standard Benchmarks Performance Metrics Throughput measured in transactions per second (tps) Response time of transaction (transaction elapse time) Cost metric (in $/tps) TPC-D Ad hoc business questions, e.g. sales trends Decision Support Applications Long, complex read-only queries Infrequent updates Access large portions of the database Used until 1999 Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 46

1.4 Current TPC Benchmarks TPC-C Standard for comparing On-Line Transaction Processing (OLTP) performance on various hardware and software configurations since 1992 Regular business operations, e.g. order-entry processing OLTP applications Update intensive Shorter transactions that access a small portion of a database Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 47

1.4 TPC-C Database Schema Numbers in entity blocks represent the cardinality of the tables (number of rows). These numbers are factored by W. The numbers next to the relationship arrows represent the cardinality of the relationships. The plus (+) symbol is used after the cardinality of a relationship or table to illustrate that this number is subject to small variations in the initial database population over the measurement interval as rows are added or deleted. Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 48

1.4 TPC-C Queries New-Order Transaction Entering a complete order through a single database transaction. Payment Transaction Updates the customer's balance and reflects the payment on the district and warehouse sales statistics. Order-Status Transaction Queries the status of a customer's last order. Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 49

1.4 TPC-C Queries Delivery Transaction Consists of processing a batch of 10 new (not yet delivered) orders. Each order is processed (delivered) in full within the scope of a read-write database transaction. Stock-Level Transaction Determines the number of recently sold items that have a stock level below a specified threshold. Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 50

1.4 TPC-C Results Performance only Price/Performance Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 51

1.4 Current TPC Benchmarks TPC-E New OLTP workload benchmark Simulates the OLTP workload of a brokerage firm focusing on a central database that executes transactions related to the firm s customer accounts TPC-H Ad-hoc, decision support benchmark Consists of a suite of business oriented ad-hoc queries and concurrent data modifications Relational Database Systems 2 Wolf-Tilo Balke Institut für Informationssysteme TU Braunschweig 52

1 Next Lecture How is data physically stored? Memory types Hard disks RAID SAN NAS etc. Relational Database Systems 2 Wolf-Tilo Balke - Benjamin Köhncke Institut für Informationssysteme 53