Introduction to Data Management. Lecture #1 (Course Trailer)

Instructor: Mike Carey (mjcarey@ics.uci.edu)
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke

Today's Topics
- Welcome to one of my biggest classes ever!
- Read (and live by) the course wiki page: http://www.ics.uci.edu/~cs122a/
- Also follow (and live by) the Piazza page: https://piazza.com/uci/winter2015/cs122a/home
- Let's look at both of these, and then let's also look at a preview of what lies ahead.
- Note: There will be a quiz in this week's discussions, and you will need to prepare (by reading about Academic Honesty)...!

What is a Database System?
- What's a database? A very large, integrated collection of data
- Usually a model of a real-world enterprise
  - Entities (e.g., students, courses, Facebook users, ...) with attributes (e.g., name, birthdate, GPA, ...)
  - Relationships (e.g., Susan is taking CS 234, Susan is a friend of Lynn, ...)
- What's a database management system (DBMS)? A software system designed to store, manage, and provide access to one or more databases

File Systems vs. DBMS
- Application programs must sometimes stage large datasets between main memory and secondary storage (for buffering huge data sets, getting page-oriented access, etc.)
- Special code needed for different queries, and that code must be (stay) correct and efficient
- Must protect data from inconsistency due to multiple concurrent users
- Crash recovery is important since data is now the currency of the day (corporate jewels)
- Security and access control are also important(!)

Evolution of DBMS
- Files: Manual Coding
  - Byte streams
  - Majority of application development effort goes towards building and then maintaining data access logic
- CODASYL/IMS: Early DBMS Technologies
  - Records and pointers
  - Large, carefully tuned data access programs that have dependencies on physical access paths, indexes, etc.
- Relational: Relational DB Systems
  - Declarative approach
  - Tables and views bring data independence
  - Details left to system
  - Designed to simplify data-centric application development

Why Use a DBMS?
- Data independence.
- Efficient data access.
- Reduced application development time.
- Data integrity and security.
- Uniform data administration.
- Concurrent access, recovery from crashes.

Why Study Databases?
- Shift from computation to information
  - At the low end: explosion of the web (a mess!)
  - At the high end: scientific applications, social data analytics, ...
- Datasets increasing in diversity and volume
  - Digital libraries, interactive video, Human Genome project, EOS project, the Web itself, ...?!
  - Mobile devices, Internet of Things, ... need for DBMS exploding!
- DBMS field encompasses most of CS!!
  - OS, languages, theory, AI, multimedia, logic, ...

Why Study Databases (Really)?
- Big Data! :-)

Data Models
- A data model is a collection of concepts for describing data
- A schema is a description of a particular collection of data, using a given data model
- The relational model is (still) the most widely used data model today
  - Relation: basically a table with rows and (named) columns
  - Schema: describes the tables and their columns

Levels of Abstraction
- Many views of one conceptual (logical) schema and an underlying physical schema
  - Views describe how different users see the data.
  - Conceptual schema defines the logical structure of the database
  - Physical schema describes the files and indexes used under the covers
[Figure: View 1, View 2, and View 3 on top of the Conceptual Schema, which sits on top of the Physical Schema]

Example: University DB
- Conceptual schema:
    Students(sid: string, name: string, login: string, age: integer, gpa: real)
    Courses(cid: string, cname: string, credits: integer)
    Enrolled(sid: string, cid: string, grade: string)
- Physical schema:
  - Relations stored as unordered files
  - Index on first and third columns of Students
- External schema (a.k.a. view):
    CourseInfo(cid: string, cname: string, enrollment: integer)

Data Independence
- Applications are insulated (at multiple levels) from how data is actually structured and stored
  - Logical data independence: Protection from changes in the logical structure of data
  - Physical data independence: Protection from changes in the physical structure of data
- One of the most important benefits of DBMS use!
  - Allows changes to occur w/o application rewrites!
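A minimal sketch of physical data independence, using Python's built-in sqlite3 module as a stand-in DBMS. The sample rows, the index name, and the body of the CourseInfo view are assumptions made for illustration; the slides give only the view's signature.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conceptual schema from the slide (string/integer/real mapped to SQLite types)
    CREATE TABLE Students(sid TEXT, name TEXT, login TEXT, age INTEGER, gpa REAL);
    CREATE TABLE Courses(cid TEXT, cname TEXT, credits INTEGER);
    CREATE TABLE Enrolled(sid TEXT, cid TEXT, grade TEXT);

    -- External schema (the view body is assumed, not given in the slides)
    CREATE VIEW CourseInfo AS
        SELECT c.cid AS cid, c.cname AS cname, COUNT(*) AS enrollment
        FROM Enrolled e, Courses c
        WHERE e.cid = c.cid
        GROUP BY c.cid, c.cname;

    -- Made-up sample rows
    INSERT INTO Students VALUES ('s1', 'Susan', 'susan', 20, 3.8),
                                ('s2', 'Lynn', 'lynn', 21, 3.5);
    INSERT INTO Courses VALUES ('c1', 'Computer Game Design', 4);
    INSERT INTO Enrolled VALUES ('s1', 'c1', 'A'), ('s2', 'c1', 'B');
""")


def app_queries(conn):
    """The 'application': it names only tables, views, and columns."""
    student = conn.execute(
        "SELECT name, gpa FROM Students WHERE sid = ?", ("s1",)
    ).fetchall()
    courses = conn.execute(
        "SELECT cid, enrollment FROM CourseInfo ORDER BY cid"
    ).fetchall()
    return student, courses


before = app_queries(conn)

# Physical schema change: index on the first and third columns of Students.
conn.execute("CREATE INDEX students_sid_login ON Students(sid, login)")

after = app_queries(conn)
print(before == after)  # the application code and its answers are unchanged
```

The application function is never edited when the index is added; only the way the DBMS finds the rows may change.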

Example: University DB (cont.)
- User query (in SQL, against the external schema):
    SELECT c.cid, c.enrollment
    FROM CourseInfo c
    WHERE c.cname = 'Computer Game Design'
- Equivalent query (against the conceptual schema):
    SELECT e.cid, COUNT(*)
    FROM Enrolled e, Courses c
    WHERE e.cid = c.cid AND c.cname = 'Computer Game Design'
    GROUP BY e.cid
- Under the hood (against the physical schema)
  - Access Courses: use index on cname to find associated cid
  - Access Enrolled: use index on cid to count the enrollments

Databases: The Cast
- End users and DBMS software vendors
- DB application programmers
  - E.g., smart webmasters
- Database administrator (DBA)
  - Designs logical and physical schemas
  - Handles security and authorization
  - Ensures data availability, crash recovery
  - Tunes the database (physical schema) as needs evolve
-> (DBA must understand how a DBMS works!) <-
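The equivalence of the two queries in the University DB example can be checked with a small sketch in Python's sqlite3 module. The view definition, the sample rows, and the index names are assumptions for illustration; a real DBMS may define the view and choose its access paths differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Courses(cid TEXT, cname TEXT, credits INTEGER);
    CREATE TABLE Enrolled(sid TEXT, cid TEXT, grade TEXT);

    -- Assumed definition of the external schema
    CREATE VIEW CourseInfo AS
        SELECT c.cid AS cid, c.cname AS cname, COUNT(*) AS enrollment
        FROM Enrolled e, Courses c
        WHERE e.cid = c.cid
        GROUP BY c.cid, c.cname;

    -- Indexes matching the under-the-hood plan (names are hypothetical)
    CREATE INDEX courses_cname ON Courses(cname);
    CREATE INDEX enrolled_cid ON Enrolled(cid);

    -- Made-up sample rows
    INSERT INTO Courses VALUES ('c1', 'Computer Game Design', 4),
                               ('c2', 'Databases', 4);
    INSERT INTO Enrolled VALUES ('s1', 'c1', 'A'), ('s2', 'c1', 'B'),
                                ('s1', 'c2', 'A');
""")

# User query, written against the external schema (the view).
view_query = """
    SELECT c.cid, c.enrollment
    FROM CourseInfo c
    WHERE c.cname = ?
"""

# Equivalent query, written against the conceptual schema (the base tables).
base_query = """
    SELECT e.cid, COUNT(*)
    FROM Enrolled e, Courses c
    WHERE e.cid = c.cid AND c.cname = ?
    GROUP BY e.cid
"""

name = ("Computer Game Design",)
view_rows = conn.execute(view_query, name).fetchall()
base_rows = conn.execute(base_query, name).fetchall()
print(view_rows, base_rows)

# The plan text depends on the SQLite version; it may mention the two indexes.
for row in conn.execute("EXPLAIN QUERY PLAN " + base_query, name):
    print(row[-1])
```

Both queries return the same single row for the course, while the plan output shows the physical level that neither query had to mention.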

Concurrency Control
- Concurrent execution of user programs is essential for good DBMS performance.
  - Because disk accesses are frequent, and relatively slow, it is crucial to keep the CPUs (cores!) humming by working on multiple users' programs concurrently.
- Interleaving actions of different user programs can lead to inconsistency: e.g., a bank transfer is run while a customer's assets are being totalled.
- DBMS ensures that such problems don't arise: users/programmers can pretend they're using a single-user system.

Transaction: An Execution of a DB Program
- Key concept is transaction: An atomic sequence of database actions (e.g., reads/writes).
- Each transaction, when executed completely, must leave the DB in a consistent state if the DB is consistent before it was executed.
  - Users can specify simple integrity constraints on the data, and the DBMS will enforce these constraints.
  - Beyond this, the DBMS is happily clueless about the data semantics (e.g., how bank interest is computed).
  - Note: Ensuring that a given transaction (if run all by itself) preserves consistency is the user's (app's) job!
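A minimal sketch of a transaction with a simple integrity constraint, again using Python's sqlite3 module. The Accounts table, the owner names, and the transfer() helper are hypothetical; the point is only that the DBMS enforces the declared constraint and that a failed transaction leaves no partial effects.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the code
# below controls transaction boundaries itself with BEGIN/COMMIT/ROLLBACK.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE Accounts("
    "owner TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"  # the integrity constraint
)
conn.execute("INSERT INTO Accounts VALUES ('susan', 100), ('lynn', 100)")


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction; True if it committed."""
    conn.execute("BEGIN")
    try:
        conn.execute(
            "UPDATE Accounts SET balance = balance + ? WHERE owner = ?",
            (amount, dst),
        )
        conn.execute(
            "UPDATE Accounts SET balance = balance - ? WHERE owner = ?",
            (amount, src),
        )
        conn.execute("COMMIT")
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the debit; undo the credit as well.
        conn.execute("ROLLBACK")
        return False


ok_small = transfer(conn, "susan", "lynn", 30)   # fits within the balance
ok_large = transfer(conn, "susan", "lynn", 500)  # would overdraw susan
balances = dict(conn.execute("SELECT owner, balance FROM Accounts").fetchall())
print(ok_small, ok_large, balances)
```

The DBMS knows nothing about what a "transfer" means; it only enforces balance >= 0. Writing transfer() so that money is neither created nor lost remains the application's job, as the slide notes.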

Concurrent DBMS Transactions
- DBMS ensures that execution of {T1, ..., Tn} is equivalent to some (in fact, any!) serial execution.
  - Before reading/writing an object, a transaction requests a lock on the object and waits till the DBMS gives it the lock. (Locks are released together at end of transaction.)
  - Key Idea: If any action of Ti (e.g., writing X) impacts Tj (e.g., reading X), one will get a lock on X first and the other will wait until the first one is done; this orders the transactions!

Ensuring Atomicity
- DBMS ensures atomicity (all-or-nothing property) even if system crashes in the middle of a Xact.
- Idea: Keep a log (history) of all actions carried out by the DBMS while executing a set of Xacts:
  - Before a change is made to the database, a corresponding log entry is forced to a safe (different) location.
  - In the event of a crash, the effects of partially executed transactions can first be undone using the log.
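The locking idea from the Concurrent DBMS Transactions slide can be sketched with Python threads. This is a simplified illustration, not how any particular DBMS implements locking: each transaction here takes all of its locks up front and in a fixed order (an assumption added to avoid deadlock), and the account names and helper functions are hypothetical. What it keeps from the slide is that locks are released together at the end, so a transfer and a totalling run can never interleave.

```python
import threading
import time

balances = {"checking": 100, "savings": 100}
locks = {name: threading.Lock() for name in balances}  # one lock per object


def run_transaction(objects, body):
    """Lock every object the transaction touches, run it, then release all."""
    ordered = sorted(objects)  # fixed lock order (assumption, avoids deadlock)
    for name in ordered:
        locks[name].acquire()
    try:
        return body()
    finally:
        # Locks are released together at the end of the transaction.
        for name in reversed(ordered):
            locks[name].release()


def transfer_one():
    balances["checking"] -= 1
    time.sleep(0.001)  # widen the window in which the data is inconsistent
    balances["savings"] += 1


def total_assets():
    return balances["checking"] + balances["savings"]


totals = []


def mover():
    for _ in range(20):
        run_transaction(["checking", "savings"], transfer_one)


def auditor():
    for _ in range(20):
        totals.append(run_transaction(["checking", "savings"], total_assets))


threads = [threading.Thread(target=mover), threading.Thread(target=auditor)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(set(totals), balances)
```

Because the auditor waits for the lock, it only ever observes the state before or after a whole transfer, so every total it records is the same.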

The Log
- The following actions are recorded in the log:
  - Ti writes an object: The old value and the new value.
    - Log record must go to disk before the changed DB page!
  - Ti commits/aborts: A log record indicating the action.
- Log records are linked by Xact id, so it's easy to undo a specific Xact (e.g., if it has to abort, or following a crash).
- Log is usually replicated on stable storage.
- All logging (and in fact, all the stuff we're talking about) is handled transparently by the DBMS.

Architecture of a DBMS
- A typical DBMS has a layered architecture.
- Note: This figure doesn't show the locking and recovery components.
- This is one of several possible architectures; each actual system has its own variations.
[Figure: layered stack, top to bottom: Queries; Query Optimization and Execution; Relational Operators; Files and Access Methods; Buffer Management; Disk Space Management; DB. Note: These layers must consider concurrency control and recovery]
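Returning to the slide "The Log": the rules there (record the old and new value before changing the data, record commits, undo unfinished Xacts after a crash) can be sketched in a few lines of Python. The TinyDB class and its method names are hypothetical, in-memory stand-ins; a real DBMS also handles abort records, redo, and forcing the log to stable storage, all of which are left out here.

```python
class TinyDB:
    """Toy key-value store with an undo log (illustration only)."""

    def __init__(self, data):
        self.disk = dict(data)  # stands in for the DB pages on disk
        self.log = []           # stands in for the log on stable storage

    def write(self, xact, key, new_value):
        # The log record (old and new value) is appended BEFORE the data changes.
        # Assumes the key already exists, so there is always an old value.
        self.log.append(("write", xact, key, self.disk[key], new_value))
        self.disk[key] = new_value

    def commit(self, xact):
        self.log.append(("commit", xact))

    def recover(self):
        """After a crash, undo writes of Xacts that have no commit record."""
        committed = {rec[1] for rec in self.log if rec[0] == "commit"}
        for rec in reversed(self.log):  # newest change first
            if rec[0] == "write" and rec[1] not in committed:
                _, _, key, old_value, _ = rec
                self.disk[key] = old_value


db = TinyDB({"A": 100, "B": 100})

# T1 transfers 30 from A to B and commits.
db.write("T1", "A", 70)
db.write("T1", "B", 130)
db.commit("T1")

# T2 starts a transfer, then the system "crashes" before it can finish.
db.write("T2", "A", 20)

db.recover()
print(db.disk)
```

After recovery, T1's committed changes survive and T2's partial change to A is rolled back to the old value stored in its log record.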

What's Exciting in DB Land Today?
- The Web is full of database challenges
  - Click streams and social networks generate lots of data
    - How can I query and analyze all of that data?
  - A box for keywords only goes so far
    - How can I query the web, e.g., "Find me 5-string Fender bass guitars for sale in the $1500-2000 price range"
- Ubiquitous computing is data-rich, too
  - Build, deploy, and use location-based data services
  - Query and aggregate streams of sensor or video data
  - Internet of things, SoLoMo (Social/Local/Mobile), ...
- There's data everywhere, and of all shapes and sizes
  - How do we integrate it, e.g., for rapid crisis response?
  - And when we do, how do we ensure privacy/security?

Summary
- DBMS is used to maintain & query large datasets.
- Benefits include recovery from system crashes, concurrent access, quick application development, data integrity and security.
- Levels of abstraction give data independence.
- A DBMS typically has a layered architecture.
- DBAs (and friends) hold responsible jobs and they are also well-paid! :-)
- Data-related R&D is one of the broadest, most exciting areas in CS.

Questions?