Introduction to Transactions: Controlling Concurrent Behavior; Virtual and Materialized Views; Indexes: Speeding Accesses to Data

Transaction = a process involving database queries and/or modification. A transaction is a collection of one or more operations that must be executed atomically: either all are executed or none are. Normally there are also strong requirements regarding concurrency. In SQL, a transaction is formed from a single statement or under explicit programmer control. Depending on the implementation, a transaction may start:
  Implicitly, with the execution of a SELECT, UPDATE, ... statement, or
  Explicitly, with a BEGIN TRANSACTION statement.
A transaction finishes with a COMMIT or ROLLBACK statement.

Database systems are normally accessed by many users or processes at the same time. This is the case for both queries and modifications. A DBMS needs to keep processes from troublesome interactions. As well, we have been assuming that operations are carried out atomically, and that the hardware or software can't fail in the middle of a modification. Clearly this is an unrealistic assumption.

You and your domestic partner each take $100 from different ATMs at about the same time from the same account. The DBMS had better make sure that neither account deduction gets lost. Compare: an OS allows two people to edit a document at the same time. If both write, one person's changes get lost.

Properly executed transactions are said to meet the ACID criteria. ACID transactions are:
  Atomic: Whole transaction or none is done.
  Consistent: Database constraints preserved.
  Isolated: It appears to the user as if only one process executes at a time.
  Durable: Effects of a process survive a crash.
Weaker forms of transactions are often supported as well.

The SQL statement COMMIT causes a transaction to complete. Its database modifications are now permanent in the database. Prior to a COMMIT, other transactions (commonly) cannot see the partial or tentative changes. I.e., the database system might lock the changed items until the COMMIT is executed.

The SQL statement ROLLBACK also causes the transaction to end, but by aborting: the transaction has no effects on the database. Failures like division by 0 or a constraint violation can also cause rollback, even if the programmer does not request it. One use of ROLLBACK: abort a database modification via a trigger.
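As a rough illustration of COMMIT and ROLLBACK, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The CHECK constraint and the negative price are assumptions added only to force a failure; they are not part of the Sells schema used in these slides. In sqlite3, using the connection as a context manager commits when the block succeeds and rolls back when it raises.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: the CHECK constraint exists only to provoke a failure.
conn.execute(
    "CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL CHECK (price > 0))"
)
conn.executemany(
    "INSERT INTO Sells VALUES (?, ?, ?)",
    [("Joe's Bar", "Export", 2.50), ("Joe's Bar", "Sleeman", 3.00)],
)
conn.commit()


def joes_prices():
    """Return Joe's current prices in ascending order."""
    rows = conn.execute(
        "SELECT price FROM Sells WHERE bar = ? ORDER BY price", ("Joe's Bar",)
    )
    return [price for (price,) in rows]


# Attempt 1: the INSERT violates the CHECK constraint, the exception leaves
# the "with" block, and the whole transaction (DELETE included) is rolled back.
try:
    with conn:
        conn.execute("DELETE FROM Sells WHERE bar = ?", ("Joe's Bar",))
        conn.execute(
            "INSERT INTO Sells VALUES (?, ?, ?)", ("Joe's Bar", "Heineken", -3.50)
        )
except sqlite3.IntegrityError:
    pass
after_rollback = joes_prices()

# Attempt 2: both statements succeed, so leaving the block commits them.
with conn:
    conn.execute("DELETE FROM Sells WHERE bar = ?", ("Joe's Bar",))
    conn.execute(
        "INSERT INTO Sells VALUES (?, ?, ?)", ("Joe's Bar", "Heineken", 3.50)
    )
after_commit = joes_prices()

print(after_rollback)  # [2.5, 3.0]
print(after_commit)    # [3.5]
```

After the failed attempt the old prices are still there, which is the "all or nothing" behavior described above.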

Assume the usual Sells(bar, beer, price) relation, and suppose that Joe's Bar sells only Export for $2.50 and Sleeman for $3.00. Sally is querying Sells for the highest and lowest price Joe charges. Joe decides to stop selling Export and Sleeman, and to sell only Heineken at $3.50.

Sally executes the following two SQL statements, called (min) and (max) to help us remember what they do.
    (max)  SELECT MAX(price) FROM Sells
           WHERE bar = 'Joe''s Bar';
    (min)  SELECT MIN(price) FROM Sells
           WHERE bar = 'Joe''s Bar';

At about the same time, Joe executes the following steps, (del) and (ins).
    (del)  DELETE FROM Sells
           WHERE bar = 'Joe''s Bar';
    (ins)  INSERT INTO Sells
           VALUES('Joe''s Bar', 'Heineken', 3.50);

Although (max) must come before (min), and (del) must come before (ins), there are no other constraints on the order of these statements, unless we group Sally's and/or Joe's statements into transactions.

Suppose the steps execute in the order (max)(del)(ins)(min).

    Statement   Joe's prices    Result
    (max)       {2.50, 3.00}    3.00
    (del)       {2.50, 3.00}
    (ins)       {3.50}
    (min)       {3.50}          3.50

Sally sees MAX < MIN!
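The interleaving can be replayed with Python's built-in sqlite3 module. This is only a sketch: a single connection in autocommit mode stands in for both Sally and Joe, on the assumption that without transactions only the order of the four statements matters.

```python
import sqlite3

# isolation_level=None puts sqlite3 in autocommit mode: every statement is
# its own transaction, so nothing groups (max) with (min).
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL)")
conn.executemany(
    "INSERT INTO Sells VALUES (?, ?, ?)",
    [("Joe's Bar", "Export", 2.50), ("Joe's Bar", "Sleeman", 3.00)],
)

JOE = ("Joe's Bar",)

# (max): Sally reads the old prices.
sally_max = conn.execute(
    "SELECT MAX(price) FROM Sells WHERE bar = ?", JOE
).fetchone()[0]

# (del) and (ins): Joe replaces his price list.
conn.execute("DELETE FROM Sells WHERE bar = ?", JOE)
conn.execute("INSERT INTO Sells VALUES (?, ?, ?)", ("Joe's Bar", "Heineken", 3.50))

# (min): Sally reads the new prices.
sally_min = conn.execute(
    "SELECT MIN(price) FROM Sells WHERE bar = ?", JOE
).fetchone()[0]

print(sally_max, sally_min)  # 3.0 3.5
```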

If we group Sally's statements (max)(min) into one transaction, then she cannot see this inconsistency. She sees Joe's prices at some fixed time: either before or after he changes prices, or in the middle, but the MAX and MIN are computed from the same prices.

Suppose Joe executes (del)(ins), not as a transaction, but after executing these statements, thinks better of it and issues a ROLLBACK statement. If Sally executes her statements after (ins) but before the rollback, she sees a value, 3.50, that never existed in the database.

If Joe executes (del)(ins) as a transaction, its effect cannot be seen by others until the transaction executes COMMIT. If the transaction executes ROLLBACK instead, then its effects can never be seen.

Locking or isolating part of the database can be expensive. The isolation level controls the extent to which a transaction is exposed to the effects of other concurrent transactions. SQL defines four isolation levels, i.e., choices about what interactions are allowed by transactions that execute at about the same time. Only one level (SERIALIZABLE) gives ACID transactions. Each DBMS implements transactions in its own way.

Within a transaction, we can say:
    SET TRANSACTION ISOLATION LEVEL X
where X =
  1. SERIALIZABLE
  2. REPEATABLE READ
  3. READ COMMITTED
  4. READ UNCOMMITTED

If Sally = (max)(min) and Joe = (del)(ins) are each transactions, and Sally runs with isolation level SERIALIZABLE, then she will see the database either before or after Joe runs, but not in the middle. If a transaction T is SERIALIZABLE, then it reads only changes made by committed transactions. As well, no value read or written by T is changed by another transaction until T completes.

Your choice, e.g., run serializable, affects only how you see the database, not how others see it. Example: If Joe runs serializable, but Sally doesn't, then Sally might see no prices for Joe's Bar. I.e., it looks to Sally as if she ran in the middle of Joe's transaction.

If Sally runs with isolation level READ COMMITTED, then she can see only committed data, but not necessarily the same data each time. Example: Under READ COMMITTED, the interleaving (max)(del)(ins)(min) is allowed, as long as Joe commits. Sally sees MAX < MIN. With READ COMMITTED a transaction T reads only changes made by committed transactions, and no value written by T can be changed by other transactions until T is complete. A value read by T may be changed by another transaction while T is in progress.

REPEATABLE READ: the requirement is like READ COMMITTED, plus: if data is read again, then everything seen the first time will be seen the second time. But the second and subsequent reads may see more tuples as well.

Suppose Sally runs under REPEATABLE READ, and the order of execution is (max)(del)(ins)(min). (max) sees prices 2.50 and 3.00. (min) can see 3.50, but must also see 2.50 and 3.00, because they were seen on the earlier read by (max).

A transaction running under READ UNCOMMITTED can see data in the database, even if it was written by a transaction that has not committed (and may never). Example: If Sally runs under READ UNCOMMITTED, she could see a price 3.50 even if Joe later aborts. Common terminology:
  Dirty data is data written by a transaction that has not yet committed.
  A dirty read is a read of dirty data written by another transaction.
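The four levels can be compared on the running example with a small toy model. This is not how any DBMS implements isolation; it only encodes the rules stated on these slides for the schedule (max)(del)(ins)(min), and the function name sally_sees and the flag joe_committed are made up for illustration.

```python
BEFORE = {2.50, 3.00}  # Joe's prices before (del)(ins)
AFTER = {3.50}         # Joe's prices after (del)(ins)


def sally_sees(level, joe_committed):
    """Return (MAX, MIN) as Sally computes them for (max)(del)(ins)(min).

    joe_committed says whether Joe has committed by the time (min) runs.
    """
    first_read = BEFORE  # (max) always runs before Joe changes anything
    if level == "SERIALIZABLE":
        # As if Sally ran entirely before Joe.
        second_read = BEFORE
    elif level == "REPEATABLE READ":
        # Everything seen earlier is seen again; committed new tuples may appear.
        second_read = BEFORE | (AFTER if joe_committed else set())
    elif level == "READ COMMITTED":
        # Only committed data, but not necessarily the same data each time.
        second_read = AFTER if joe_committed else BEFORE
    elif level == "READ UNCOMMITTED":
        # Dirty read: Joe's changes are visible even if uncommitted.
        second_read = AFTER
    else:
        raise ValueError("unknown isolation level: " + level)
    return max(first_read), min(second_read)


for level in ("SERIALIZABLE", "REPEATABLE READ", "READ COMMITTED"):
    print(level, sally_sees(level, joe_committed=True))
print("READ UNCOMMITTED", sally_sees("READ UNCOMMITTED", joe_committed=False))
# SERIALIZABLE (3.0, 2.5)
# REPEATABLE READ (3.0, 2.5)
# READ COMMITTED (3.0, 3.5)
# READ UNCOMMITTED (3.0, 3.5)
```

Only the two weakest levels let Sally observe MAX < MIN in this schedule, and READ UNCOMMITTED does so even when Joe never commits.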

A view is a relation defined in terms of stored tables (called base tables) and other views. Two kinds:
  1. Virtual = not stored in the database; just a query for constructing the relation.
  2. Materialized = actually constructed and stored.

Declare a view by:
    CREATE [MATERIALIZED] VIEW <name> AS <query>;
The default is virtual.

CanDrink(customer, beer) is a view containing the customer-beer pairs such that the customer frequents at least one bar that serves the beer:
    CREATE VIEW CanDrink AS
        SELECT customer, beer
        FROM Frequents, Sells
        WHERE Frequents.bar = Sells.bar;
Renaming attributes:
    CREATE VIEW CanDrink(person, beverage) AS
        SELECT customer, beer
        FROM Frequents, Sells
        WHERE Frequents.bar = Sells.bar;

A view can be queried as if it were a base table. There is also a limited ability to modify views, if the change makes sense as a modification of one underlying base table. Example query:
    SELECT beer FROM CanDrink
    WHERE customer = 'Sally';
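A small runnable sketch of the CanDrink view, using Python's built-in sqlite3 module. The sample rows are invented for illustration. Because the view is virtual, a later change to a base table shows up the next time the view is queried.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Frequents (customer TEXT, bar TEXT);
    CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL);

    -- Sample data, invented for this example.
    INSERT INTO Frequents VALUES ('Sally', 'Joe''s Bar');
    INSERT INTO Sells VALUES ('Joe''s Bar', 'Export', 2.50);
    INSERT INTO Sells VALUES ('Joe''s Bar', 'Sleeman', 3.00);

    CREATE VIEW CanDrink AS
        SELECT customer, beer
        FROM Frequents, Sells
        WHERE Frequents.bar = Sells.bar;
    """
)


def beers_for(customer):
    """Query the view exactly as if it were a base table."""
    rows = conn.execute(
        "SELECT beer FROM CanDrink WHERE customer = ? ORDER BY beer", (customer,)
    )
    return [beer for (beer,) in rows]


beers_before = beers_for("Sally")

# The view stores no tuples, so a base-table change is visible immediately.
conn.execute("INSERT INTO Sells VALUES (?, ?, ?)", ("Joe's Bar", "Heineken", 3.50))
beers_after = beers_for("Sally")

print(beers_before)  # ['Export', 'Sleeman']
print(beers_after)   # ['Export', 'Heineken', 'Sleeman']
```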

In general, it is impossible to modify a virtual view, because a virtual view doesn't exist. But an INSTEAD OF trigger lets us interpret view modifications in a way that makes sense. Example: View Synergy has (customer, beer, bar) triples such that the bar serves the beer, the customer frequents the bar and likes the beer.

    CREATE VIEW Synergy AS
        SELECT Likes.customer, Likes.beer, Sells.bar
        FROM Likes, Sells, Frequents
        WHERE Likes.customer = Frequents.customer
          AND Likes.beer = Sells.beer
          AND Sells.bar = Frequents.bar;
The SELECT clause picks one copy of each attribute; the FROM and WHERE clauses form the natural join of Likes, Sells, and Frequents.

We cannot insert into Synergy; it is a virtual view. But we can use an INSTEAD OF trigger to turn a (customer, beer, bar) triple into three insertions of projected pairs, one for each of Likes, Sells, and Frequents. Sells.price will have to be NULL.

    CREATE TRIGGER ViewTrig
        INSTEAD OF INSERT ON Synergy
        REFERENCING NEW ROW AS n
        FOR EACH ROW
        BEGIN
            INSERT INTO LIKES VALUES(n.customer, n.beer);
            INSERT INTO SELLS(bar, beer) VALUES(n.bar, n.beer);
            INSERT INTO FREQUENTS VALUES(n.customer, n.bar);
        END;
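The trigger can be tried out in SQLite through Python's sqlite3 module. One dialect difference to be aware of: SQLite has no REFERENCING clause and refers to the inserted row as NEW, so the sketch below adapts the trigger accordingly. The inserted triple is sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Likes (customer TEXT, beer TEXT);
    CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL);
    CREATE TABLE Frequents (customer TEXT, bar TEXT);

    CREATE VIEW Synergy AS
        SELECT Likes.customer, Likes.beer, Sells.bar
        FROM Likes, Sells, Frequents
        WHERE Likes.customer = Frequents.customer
          AND Likes.beer = Sells.beer
          AND Sells.bar = Frequents.bar;

    -- SQLite dialect: the new row is always called NEW.
    CREATE TRIGGER ViewTrig
    INSTEAD OF INSERT ON Synergy
    FOR EACH ROW
    BEGIN
        INSERT INTO Likes VALUES (NEW.customer, NEW.beer);
        INSERT INTO Sells (bar, beer) VALUES (NEW.bar, NEW.beer);
        INSERT INTO Frequents VALUES (NEW.customer, NEW.bar);
    END;
    """
)

# The insert targets the view; the trigger turns it into three base-table inserts.
conn.execute(
    "INSERT INTO Synergy (customer, beer, bar) VALUES (?, ?, ?)",
    ("Sally", "Heineken", "Joe's Bar"),
)

synergy_rows = conn.execute("SELECT customer, beer, bar FROM Synergy").fetchall()
sells_rows = conn.execute("SELECT bar, beer, price FROM Sells").fetchall()

print(synergy_rows)  # [('Sally', 'Heineken', "Joe's Bar")]
print(sells_rows)    # [("Joe's Bar", 'Heineken', None)]
```

The price column of the new Sells tuple is NULL (None in Python), as noted above.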

Example:
    CREATE VIEW BeerPrices AS
        SELECT beer, price
        FROM Sells;
Recall: Sells has attributes bar, beer, price. The primary key of Sells is (bar, beer).

Problem: each time a base table changes, the materialized view may change. Cannot afford to recompute the view with each change. Solution (sometimes): Periodic reconstruction of the materialized view, which is otherwise out of date.

Sales of items require frequent modification of a database. If aggregated data (for example) is to be used to analyse sales, then materialized views may be used. These may be updated once a day. Justification: Things don't change much over 24 hours.

Canadian Tire stores every sale at every store in a database. Overnight, the sales for the day are used to update a data warehouse = materialized views of the sales. The warehouse is used by analysts to predict trends and move goods around. We'll see more on data warehouses later.
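Periodic reconstruction can be imitated in SQLite, which has no CREATE MATERIALIZED VIEW, by keeping the aggregated data in an ordinary table that is rebuilt on demand. Everything named here (Sales, StoreTotals, refresh_store_totals, the sample rows) is hypothetical and only illustrates the idea of a view that is stale between refreshes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Sales (store TEXT, item TEXT, amount REAL);
    -- Stands in for a materialized view: an ordinary table holding a query result.
    CREATE TABLE StoreTotals (store TEXT PRIMARY KEY, total REAL);
    """
)


def refresh_store_totals():
    """Rebuild the stored result from scratch, as one transaction."""
    with conn:
        conn.execute("DELETE FROM StoreTotals")
        conn.execute(
            "INSERT INTO StoreTotals "
            "SELECT store, SUM(amount) FROM Sales GROUP BY store"
        )


def totals():
    return dict(conn.execute("SELECT store, total FROM StoreTotals"))


def record_sale(store, item, amount):
    with conn:
        conn.execute("INSERT INTO Sales VALUES (?, ?, ?)", (store, item, amount))


record_sale("Burnaby", "tire", 120.0)
refresh_store_totals()           # e.g. the overnight run
fresh_day1 = totals()

record_sale("Burnaby", "wrench", 30.0)
stale = totals()                 # the new sale is not reflected yet

refresh_store_totals()           # next overnight run
fresh_day2 = totals()

print(fresh_day1)  # {'Burnaby': 120.0}
print(stale)       # {'Burnaby': 120.0}
print(fresh_day2)  # {'Burnaby': 150.0}
```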

An index for an attribute (or attributes) of a relation is a data structure used to speed access to tuples of a relation, given values of the attribute(s). In a DBMS it is typically a balanced search tree with giant nodes (each a full disk page), called a B-tree. An index can make query answering and joins involving the attribute much faster. On the other hand, modifications are more complex and take longer. Covered in depth in CMPT454.

There is no standard for declaring indexes! Typical syntax:
    CREATE INDEX BeerInd ON Beers(manf);
    CREATE INDEX SellInd ON Sells(bar, beer);

Given a value v, the index takes us to only those tuples that have v in the attribute(s) of the index. Example: use BeerInd and SellInd to find the prices of beers manufactured by Pete's and sold by Joe (next slide). With the indices, just retrieve tuples satisfying these conditions. Clearly, this can result in huge savings (vs. retrieving all tuples from the mentioned relations).

    SELECT price
    FROM Beers, Sells
    WHERE manf = 'Pete''s' AND
          Beers.name = Sells.beer AND
          bar = 'Joe''s Bar';
  1. Use BeerInd to get all the beers made by Pete's.
  2. Then use SellInd to get prices of those beers, with bar = 'Joe''s Bar'.
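To see whether a DBMS actually uses the indexes, one can ask it for its query plan. The sketch below does this in SQLite with EXPLAIN QUERY PLAN, via Python's sqlite3 module. The sample rows are invented, and the exact plan text differs between SQLite versions, so the code prints the plan rather than relying on a particular wording; the join order SQLite picks may also differ from the two steps listed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Beers (name TEXT, manf TEXT);
    CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL);

    -- Sample data, invented for this example.
    INSERT INTO Beers VALUES ('Pete''s Ale', 'Pete''s');
    INSERT INTO Beers VALUES ('Export', 'Molson');
    INSERT INTO Sells VALUES ('Joe''s Bar', 'Pete''s Ale', 4.00);
    INSERT INTO Sells VALUES ('Joe''s Bar', 'Export', 2.50);
    INSERT INTO Sells VALUES ('Sue''s Bar', 'Pete''s Ale', 4.50);
    """
)

QUERY = (
    "SELECT price FROM Beers, Sells "
    "WHERE manf = ? AND Beers.name = Sells.beer AND bar = ?"
)
ARGS = ("Pete's", "Joe's Bar")


def plan_text():
    """Return the plan as one string; the detail text is the last column."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, ARGS).fetchall()
    return " | ".join(row[-1] for row in rows)


plan_before = plan_text()

conn.execute("CREATE INDEX BeerInd ON Beers(manf)")
conn.execute("CREATE INDEX SellInd ON Sells(bar, beer)")

plan_after = plan_text()
prices = [p for (p,) in conn.execute(QUERY, ARGS)]

print("before:", plan_before)
print("after: ", plan_after)
print(prices)  # [4.0]
```

The query returns the same answer with or without the indexes; only the access path reported in the plan changes.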

A major problem in making a database run fast is deciding which indexes to create. Recall:
  Pro: An index speeds up queries that can use it.
  Con: An index slows down modifications on its relation because the index must be modified too.
The key for a relation is usually the most useful attribute to have an index on:
  Queries in which a value for a key is specified are common.
  For a given key value there is only one tuple. Thus the index returns at most one tuple, requiring just 1 page from the relation instance to be retrieved.

Suppose the only things we did with our beers database were:
  1. Insert new facts into a relation (10%).
  2. Find the price of a given beer at a given bar (90%).
Then SellInd on Sells(bar, beer) would be wonderful, but BeerInd on Beers(manf) would be harmful.

Use a tuning advisor to help determine appropriate indexes. A major research thrust. Hand tuning is difficult and inaccurate.

An advisor gets a query load, e.g.:
  1. Choose random queries from the history of queries run on the database, or
  2. The designer provides a sample workload or constraints.

The advisor generates candidate indexes and evaluates each on the workload. Feed each sample query to the query optimizer, which assumes only this one index is available. Measure the improvement/degradation in the average running time of the queries. Problem: Indexes are not independent: the choice of one may affect the performance of another. In practice, a greedy algorithm works well.
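A greedy selection loop might look like the following sketch. The cost model is entirely made up: it stands in for calls to a real query optimizer and loosely mirrors the 10% insert / 90% lookup workload from the earlier example, with the inserts split evenly between Sells and Beers as a further assumption. To account for indexes not being independent, each candidate is evaluated together with the indexes already chosen.

```python
def workload_cost(indexes):
    """Hypothetical average cost per operation for a given set of indexes.

    A real advisor would obtain these numbers from the query optimizer.
    """
    # Price lookup by (bar, beer): cheap with SellInd, a full scan without it.
    lookup = 2.0 if "SellInd" in indexes else 100.0
    # Every index on a relation makes inserts into that relation slower.
    insert_sells = 1.0 + (1.0 if "SellInd" in indexes else 0.0)
    insert_beers = 1.0 + (1.0 if "BeerInd" in indexes else 0.0)
    return 0.90 * lookup + 0.05 * insert_sells + 0.05 * insert_beers


def greedy_advisor(candidates, cost):
    """Repeatedly add the candidate that lowers the workload cost the most."""
    chosen = []
    remaining = list(candidates)
    best = cost(frozenset(chosen))
    while remaining:
        # Evaluate each remaining candidate on top of what is already chosen.
        trial = {c: cost(frozenset(chosen + [c])) for c in remaining}
        winner = min(trial, key=trial.get)
        if trial[winner] >= best:
            break  # no remaining candidate helps; stop
        chosen.append(winner)
        remaining.remove(winner)
        best = trial[winner]
    return chosen, best


chosen, final_cost = greedy_advisor(["BeerInd", "SellInd"], workload_cost)
print(chosen)  # ['SellInd']
```

Under this toy model the advisor keeps SellInd and rejects BeerInd, since BeerInd speeds up no query in the workload and only adds insert overhead.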