Administration Naive DBMS CMPT 454 Topics. John Edgar 2

Similar documents
Goals for Today. CS 133: Databases. Final Exam: Logistics. Why Use a DBMS? Brief overview of course. Course evaluations

John Edgar 2

TRANSACTION PROPERTIES

Page 1. Goals for Today" What is a Database " Key Concept: Structured Data" CS162 Operating Systems and Systems Programming Lecture 13.

User Perspective. Module III: System Perspective. Module III: Topics Covered. Module III Overview of Storage Structures, QP, and TM

CHAPTER 3 RECOVERY & CONCURRENCY ADVANCED DATABASE SYSTEMS. Assist. Prof. Dr. Volkan TUNALI

CMSC 461 Final Exam Study Guide

Database Management Systems Introduction to DBMS

Introduction to Transaction Management

Department of Information Technology B.E/B.Tech : CSE/IT Regulation: 2013 Sub. Code / Sub. Name : CS6302 Database Management Systems

CMPT 354: Database System I. Lecture 11. Transaction Management

Big Data Processing Technologies. Chentao Wu Associate Professor Dept. of Computer Science and Engineering

Final Exam CSE232, Spring 97

Introduction and Overview

CSE 190D Database System Implementation

Database Systems Management

COURSE 1. Database Management Systems

Database Management Systems. Chapter 1

CS145: Intro to Databases. Lecture 1: Course Overview

Introduction and Overview

Transaction Management & Concurrency Control. CS 377: Database Systems

MaanavaN.Com DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING QUESTION BANK

CSE 344 Final Review. August 16 th

CMPT 354 Database Systems I. Spring 2012 Instructor: Hassan Khosravi

Data Modeling and Databases Ch 14: Data Replication. Gustavo Alonso, Ce Zhang Systems Group Department of Computer Science ETH Zürich

What s a database system? Review of Basic Database Concepts. Entity-relationship (E/R) diagram. Two important questions. Physical data independence

Overview. Introduction to Transaction Management ACID. Transactions

SQL: Transactions. Announcements (October 2) Transactions. CPS 116 Introduction to Database Systems. Project milestone #1 due in 1½ weeks

BBM371- Data Management. Lecture 1: Course policies, Introduction to DBMS

Page 1. Quiz 18.1: Flow-Control" Goals for Today" Quiz 18.1: Flow-Control" CS162 Operating Systems and Systems Programming Lecture 18 Transactions"

Data! CS 133: Databases. Goals for Today. So, what is a database? What is a database anyway? From the textbook:

October/November 2009

About the Tutorial. Audience. Prerequisites. Copyright & Disclaimer DBMS

Databases. Jörg Endrullis. VU University Amsterdam

Database Management System


1/9/13. + The Transaction Concept. Transaction Processing. Multiple online users: Gives rise to the concurrency problem.

CPSC 421 Database Management Systems. Lecture 19: Physical Database Design Concurrency Control and Recovery

Transaction Management

Database Management Systems CSEP 544. Lecture 9: Transactions and Recovery

Database Systems. Announcement

Introduction to Data Management. Lecture #1 (Course Trailer )

) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons)

CS564: Database Management Systems. Lecture 1: Course Overview. Acks: Chris Ré 1

Rajiv GandhiCollegeof Engineering& Technology, Kirumampakkam.Page 1 of 10

Databases - Transactions

Unit 2. Unit 3. Unit 4

CSCE 4523 Introduction to Database Management Systems Final Exam Spring I have neither given, nor received,unauthorized assistance on this exam.

Database Management Systems Paper Solution

Introduction. Example Databases

T ransaction Management 4/23/2018 1

Transactions Processing (i)

SYED AMMAL ENGINEERING COLLEGE

Transactions. Kathleen Durant PhD Northeastern University CS3200 Lesson 9

Transactions. ACID Properties of Transactions. Atomicity - all or nothing property - Fully performed or not at all

Database Management Systems

Intro to Transactions

In This Lecture. Transactions and Recovery. Transactions. Transactions. Isolation and Durability. Atomicity and Consistency. Transactions Recovery

DATABASE MANAGEMENT SYSTEMS

B.H.GARDI COLLEGE OF MASTER OF COMPUTER APPLICATION. Ch. 1 :- Introduction Database Management System - 1

Introduction to Data Management. Lecture #1 (Course Trailer ) Instructor: Chen Li

Large-Scale Key-Value Stores Eventual Consistency Marco Serafini

Overview of Transaction Management

Techno India Batanagar Computer Science and Engineering. Model Questions. Subject Name: Database Management System Subject Code: CS 601

Database Systems CSE 414

CSE 344 MARCH 25 TH ISOLATION

Introduction TRANSACTIONS & CONCURRENCY CONTROL. Transactions. Concurrency

TRANSACTION MANAGEMENT

TRANSACTION PROCESSING MONITOR OVERVIEW OF TPM FOR DISTRIBUTED TRANSACTION PROCESSING

Topics in Reliable Distributed Systems

Introduction to Transaction Management

Introduction to Data Management. Lecture #1 (Course Trailer )

Introduction to Database Management Systems

XI. Transactions CS Computer App in Business: Databases. Lecture Topics

Lecture X: Transactions

CSC 261/461 Database Systems Lecture 21 and 22. Spring 2017 MW 3:25 pm 4:40 pm January 18 May 3 Dewey 1101

Intro to Transaction Management

Final Exam CSE232, Spring 97, Solutions

SQL: Transactions. Introduction to Databases CompSci 316 Fall 2017

Transaction Management. Pearson Education Limited 1995, 2005

Problems Caused by Failures

Lectures 8 & 9. Lectures 7 & 8: Transactions

A REVIEW OF BASIC KNOWLEDGE OF DATABASE SYSTEM

Database Architectures

Transaction Management: Concurrency Control

Introduction Aggregate data model Distribution Models Consistency Map-Reduce Types of NoSQL Databases

Introduction to Data Management CSE 344

CSC 261/461 Database Systems Lecture 20. Spring 2017 MW 3:25 pm 4:40 pm January 18 May 3 Dewey 1101

CAS CS 460/660 Introduction to Database Systems. Fall

What are Transactions? Transaction Management: Introduction (Chap. 16) Major Example: the web app. Concurrent Execution. Web app in execution (CS636)

CSCE 4523 Introduction to Database Management Systems Final Exam Fall I have neither given, nor received,unauthorized assistance on this exam.

Weak Levels of Consistency

CSE 344 MARCH 21 ST TRANSACTIONS

Modern Database Systems CS-E4610

Actions are never left partially executed. Actions leave the DB in a consistent state. Actions are not affected by other concurrent actions

Transaction Management: Introduction (Chap. 16)

Database Management Systems

Transaction Processing. Introduction to Databases CompSci 316 Fall 2018

Database Applications (15-415)

Database Management System Prof. D. Janakiram Department of Computer Science & Engineering Indian Institute of Technology, Madras Lecture No.

Transcription:

Administration Naive DBMS CMPT 454 Topics John Edgar 2

http://www.cs.sfu.ca/coursecentral/454/johnwill/ John Edgar 4

Assignments 25% Midterm exam in class 20% Final exam 55% John Edgar 5

A database stores the data for some organization The focus of a conventional database is on recording and querying individual data items The data for a very simple system might consist of a single file Which presumably would not require a database So why use a database? Larger organizations require many disparate types of records to be stored Often used by different parts of the organization Consider what information SFU needs to record These records need to be accessed by many users at the same time John Edgar 7

Let's imagine a naïve implementation of a Database Management System (DBMS) The query processor is responsible for accessing data and determining the result of a query query files* query processor result ** * on hard disk ** in main memory John Edgar 8

One file for each table Records are separated by newline characters Fields in records separated by some special character e.g. the Customer file might store Kent#123#journalist Banner#322#unemployed Store the database schema in a file e.g. the Customer and Account schema Customer#name#STR#id#INT#job#STR Account#acc_id#INT#id#INT#balance#FLOAT John Edgar 9

SELECT * FROM Customer WHERE job = 'journalist' Read the file schema to find Customer attributes Check the condition is semantically valid for customers Create a new file (T) for the query results Read the Customer file, and for each line (i.e. record) Check condition, c If c is true write the line to T Add a line for T to the file schema John Edgar 10

SELECT balance FROM Customer C, Account A WHERE C.name = 'Jones' AND C.id = A.id Simple join* algorithm: FOR each record c in Customer FOR each record a in Account IF c and a satisfy the WHERE condition THEN add the balance field from Account to result Note that the query does not specify a join but a Cartesian product followed by a complex selection so we assume some level of query optimization John Edgar 11

Searching for a subset of records entails reading the entire file There is no efficient method of just retrieving customers who are journalists Or of retrieving individual customers There is no efficient way to compute complex queries The join algorithm is relatively expensive Note that every customer is matched to every account regardless of the customer name What is its O notation running time? John Edgar 12

Changing a single record entails reading the entire file and writing it back What happens when two people want to change account balances at the same time? Either something bad happens Or we prevent one person from making changes What happens if the system crashes as changes are being made to data? The data is lost or possibly something worse John Edgar 13

CMPT 354 database design, creation, and use ER model and relational model Relational algebra and SQL Implementation of database applications CMPT 454 database management system design How is data stored and accessed? How are SQL queries processed? What is a transaction, and how do multiple users use the same database? What happens if there is a system failure? John Edgar 14

Processing occurs in main memory Data is stored in secondary storage and has to be retrieved to be processed Reading or writing data in secondary storage is much slower than accessing main memory Typically, the cost metric for DB operations is based around disk access time John Edgar 16

Mechanics of disks Access characteristics Organizing related data on disk Algorithms for disk access Disk failures Improving access and reliability RAID Solid State Drives John Edgar 17

Arranging records on a disk Fixed length records Variable length records Representing addresses and pointers BLOBs John Edgar 18

An index is a structure that speeds up access to records based on search criteria For example, finding student data efficiently using student ID There are different index structures, with their own strengths and weaknesses B trees Hash tables Specialized index techniques John Edgar 19

Processing an SQL query requires methods for satisfying SQL operators Selections Projections Joins Set operations Aggregations... There is more than one algorithm for each of these operations John Edgar 21

SQL is a procedural query language That specifies the operations to be performed to satisfy a query Most queries have equivalent queries That use different operations, or order of operations, but that return the same result Query optimization is the process of finding the best equivalent query Or at least one that is good enough John Edgar 22

Once equivalent queries are derived they have to be evaluated What is the cost metric? What is the size of intermediate relations? The result of each operation is a relation For multiple relation queries, how does the choice of join order affect the cost of a query? How are the results of one operation passed to the next? John Edgar 23

A transaction is a single logical unit of work Who is the owner of the largest account? Which students have a GPA less than 2.0? Transfer $200 from Bob to Kate Add 5% interest to all accounts Enroll student 123451234 in CMPT 454... Many transactions entail multiple actions John Edgar 25

time Transferring $200 from one bank account to another is a single transaction With multiple actions Action Bob Kate Read Bob's balance 347 Read Kate's balance 191 Subtract $200 from Bob's balance (147) Add $200 to Kate's balance (391) Write Bob's new balance 147 Write Kate's new balance 391 John Edgar 26

A typical OLTP 1 database is expected to be accessed by multiple users concurrently Consider the Student Information System Concurrency increases throughput 2 without crying please Actions of different transactions may be interleaved rather than processing each transaction in series Interleaving transactions may leave the database in an inconsistent state 1 Online Transaction Processing 2 Throughput is a measure of the number of transactions processed over time John Edgar 27

time T1 Transfer $200 from Bob to Kate T2 Deposit $7,231 in Bob's Account Action Bob Kate T1 Read Bob's balance 347 T2 Read Bob's balance 347 T1 Read Kate's balance 191 T2 Add $7,231 to Bob's balance (7,578) T1 Subtract $200 from Bob's balance (147) T2 Write Bob's new balance 7,578 T1 Add $200 to Kate's balance (391) T1 Write Bob's new balance 147 Bob is probably not happy This transaction schedule should be prevented T1 Write Kate's new balance 391 John Edgar 28

Transactions should maintain the ACID properties Atomic Consistent Isolated Durable John Edgar 29

Serial and serializable schedules Conflict serializability Locking Two phase locking Locking scheduler Lock modes Architecture Optimistic concurrency control Deadlocks John Edgar 30

Processing is performed in main memory But database objects are only persistent when written to long term storage Once a transaction has completed it should be persistent If the system crashes after completion but before changes are written to disk those changes are lost Recovery is the process of returning a database to a consistent state after a system crash John Edgar 32

Transactions Undo logging Redo logging Undo/Redo logging Media failures John Edgar 33

Massive datasets are currently maintained across the web Some of these are stored in relational databases And others are not There are many NoSQL data stores What are the issues in maintaining a distributed database? John Edgar 35

Distributed vs. parallel databases Horizontal and vertical fragmentation Distributed query processing Distributed transactions Cloud databases John Edgar 36