Chapter 1: Introduction

Chapter 1: Introduction
Chapter 2: Intro. to the Relational Model
Database System Concepts, 6th Ed. See www.db-book.com for conditions on re-use

Database Management System (DBMS)
A DBMS is:
- A collection of interrelated data, a.k.a. the database (DB for short)
- A set of programs to access the data:
  - Find individual data items or sets
  - Data mining: find patterns! (Ch. 20)
  - Modify (e.g. data cleaning)
  - Insert + Delete

1.1 Examples of DB Applications
- Banking: transactions
- Airlines: reservations, schedules
- Universities: registration, grades
- Sales: customers, products, purchases
- Online retailers: order tracking, customized recommendations
- Manufacturing: production, inventory, orders, supply chain
- Human resources: employee records, salaries, tax deductions

Today's databases can be very large
- There are dozens of companies today with DBs in the 100 PB (petabyte) range.
- The LHC at CERN produces about 25 PB of data every year, for an estimated 300 PB total so far. Hurray for the Higgs boson!
- Facebook had about 300 PB in 2014, and its users are producing about 84 PB a year.

[Figure: robotic data tape storage library at one of the CERN-LHC Tier 1 centers]
Source: http://www.quantumdiaries.org/2012/09/18/higgs-hunting-software/

Source: http://www.digitaltrends.com/computing/amazon-snowmobile-data-migration-cloud-storage-truck/

The University DB example used in our text
Stores info about:
- Instructors
- Students
- Departments
- Course offerings
Application program examples:
- Add new students, instructors, and courses
- Register students for courses, generate class rosters
- Assign grades to students, compute GPAs
- Generate transcripts

Before DBMSs existed, DB applications were built directly on top of file systems
- The system stores permanent records in various files.
- If a new dept. is created, then a new file is needed to store info about all the instructors in that dept.
- Application programs extract records from, and add records to, the appropriate files.
- Application programs are also needed to enforce the various rules (constraints), e.g. a student cannot take more than 6 courses at the same time.

Drawbacks of using file systems
- Data redundancy and inconsistency
  - Multiple file formats, duplication of information in different files
- Difficulty in accessing data
  - Need to write a new program to carry out each new task
- Data isolation
  - Multiple files and formats
- Integrity problems
  - Integrity constraints (e.g., account balance > 0) become buried in program code rather than being stated explicitly
  - Hard to add new constraints or change existing ones

Drawbacks of using file systems (cont.)
- Atomicity of updates
  - Failures may leave the database in an inconsistent state, with partial updates carried out
  - Example: a transfer of funds from one account to another should either complete or not happen at all
- Concurrent access by multiple users
  - Concurrent access is needed for performance
  - Uncontrolled concurrent accesses can lead to inconsistencies
  - Example: two people reading a balance (say $100) and updating it by withdrawing money (say $50 each) at the same time
- Security problems
  - Hard to provide user access to some, but not all, data
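The funds-transfer example can be tried out with Python's built-in sqlite3 module. This is only a sketch: the account table, the account numbers A-101 and A-215, and the rule balance >= 0 are assumptions made for illustration. The credit is applied first on purpose, so that a failing debit leaves a partial update that the DBMS has to undo.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from account `src` to `dst` as one atomic unit."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            # Credit first, so a failing debit leaves a partial update to undo.
            conn.execute(
                "update account set balance = balance + ? where account_number = ?",
                (amount, dst),
            )
            conn.execute(
                "update account set balance = balance - ? where account_number = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

conn = sqlite3.connect(":memory:")
# Hypothetical schema: the constraint is stated once, in the DDL,
# instead of being buried in application code.
conn.execute(
    "create table account ("
    " account_number text primary key,"
    " balance integer not null check (balance >= 0))"
)
with conn:
    conn.executemany(
        "insert into account values (?, ?)", [("A-101", 100), ("A-215", 20)]
    )

ok_small = transfer(conn, "A-101", "A-215", 50)   # both updates are kept
ok_large = transfer(conn, "A-101", "A-215", 500)  # debit fails, credit is undone
balances = dict(conn.execute("select account_number, balance from account"))
print(ok_small, ok_large, balances)
```

After the failed transfer, neither account shows any trace of the attempted credit, which is the "complete or not happen at all" behavior the slide asks for.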

Drawbacks of using file systems - Conclusion
- Redundancy
- Inconsistency
- Difficulty of access
- Isolation
- Integrity
- Atomicity
- Concurrency
- Security
Try to remember 3 of these!
DBMSs offer solutions to all the above problems!

2.0 Mathematical Concept of Relation (not in text)
Example relations between the same two sets of objects
Source: http://www.math-only-math.com/worksheet-on-functions-or-mapping.html

Mathematical Concept of Relation (not in text)
Extreme cases: the empty relation and the total relation
[Figure: two mapping diagrams between the sets {a, b, c} and {d, e, f}]

Mathematical Concept of Relation (not in text)
Food for thought: How many different relations exist between these two sets of elements, {a, b, c} and {d, e, f}?
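One way to approach the food-for-thought question is to count subsets: a relation between two sets is any subset of their Cartesian product, so each possible pair is either in the relation or not. The sketch below assumes the two sets are {a, b, c} and {d, e, f}, and checks the formula against a brute-force enumeration.

```python
from itertools import combinations, product

left = ["a", "b", "c"]
right = ["d", "e", "f"]

pairs = list(product(left, right))   # the 9 possible (x, y) pairs
formula_count = 2 ** len(pairs)      # each pair is either in or out

# Brute-force check: enumerate every subset of the 9 pairs, by size.
enumerated_count = sum(
    1 for k in range(len(pairs) + 1) for _ in combinations(pairs, k)
)
print(len(pairs), formula_count, enumerated_count)
```

The empty relation and the total relation from the previous slide are the two subsets of size 0 and size 9.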

Mathematical Concept of Relation (not in text)
In a Relational DB, the relation is represented by a table

QUIZ: Draw the tables corresponding to the relations b, c, and d

2.1 Relations and Tables
- Attributes (or columns)
- Tuples (or rows); these are called 4-tuples

QUIZ
The IDs have 5 digits, the names are strings of 20 characters, the dept. names are chosen from a set of 50, and the salaries are integers with 6 digits. How many different tuples can there be?

Attribute Types
- The set of allowed values for each attribute is called the domain of the attribute.
- Attribute values are (normally) required to be atomic; that is, indivisible.
- The special value null is a member of every domain.
- The null value causes complications in the definition of many operations.

2.2 Relation Schema and Instance
- $A_1, A_2, \ldots, A_n$ are attributes
- $R = (A_1, A_2, \ldots, A_n)$ is a relation schema
  Example: instructor = (ID, name, dept_name, salary)
- Formally, given sets $D_1, D_2, \ldots, D_n$, a relation $r$ is a subset of the Cartesian product $D_1 \times D_2 \times \cdots \times D_n$
- Thus, a relation is a set of n-tuples $(a_1, a_2, \ldots, a_n)$ where each $a_i \in D_i$
- The current values (relation instance) of a relation are specified by a table
- An element $t$ of $r$ is a tuple, represented by a row in a table
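The formal definition can be made concrete with a few lines of Python. This is a sketch with deliberately tiny, made-up domains (two IDs, two names, two departments) so that the whole Cartesian product fits on screen. A real domain such as "strings of up to 20 characters" is far too large to enumerate.

```python
from itertools import product

# Tiny hypothetical domains, chosen only to keep the product small.
D_id = {"10101", "22222"}
D_name = {"Srinivasan", "Einstein"}
D_dept = {"Comp. Sci.", "Physics"}

# All possible 3-tuples: 2 * 2 * 2 = 8 of them.
cartesian = set(product(D_id, D_name, D_dept))

# A relation instance is some subset of the Cartesian product.
instructor = {
    ("10101", "Srinivasan", "Comp. Sci."),
    ("22222", "Einstein", "Physics"),
}
is_relation = instructor <= cartesian  # subset test
print(len(cartesian), is_relation)
```

Most of the 8 possible tuples (for example Einstein in Comp. Sci.) are simply not in the current instance. The schema says which tuples are possible, the instance says which ones are present.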

Relations are Unordered!
- The order of tuples is irrelevant (so tuples may be stored in arbitrary order!)
- Example: instructor relation with unordered tuples
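A short sketch of what "unordered" means in practice: the same rows listed in two different orders are different as lists but equal as sets. The sample rows are illustrative values, not data taken from the slide's figure.

```python
rows_in_one_order = [
    ("10101", "Srinivasan", "Comp. Sci.", 65000),
    ("22222", "Einstein", "Physics", 95000),
    ("12121", "Wu", "Finance", 90000),
]
rows_in_another_order = list(reversed(rows_in_one_order))

# Lists care about order; sets (and therefore relations) do not.
same_as_lists = rows_in_one_order == rows_in_another_order
same_as_relations = set(rows_in_one_order) == set(rows_in_another_order)
print(same_as_lists, same_as_relations)
```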

We have covered the following textbook sections: 1.1, 1.2, 2.1, 2.2
See you at 2:40 in the lab! SCIENCE 206
EOL 1

QUIZ
- What is the difference between a Database (DB) and a Database Management System (DBMS)?
- How was data stored and managed before the introduction of DB technology?
- Briefly outline three problems with the method above.

QUIZ
- What is the difference between a relation and a table?
- Explain the following DB concepts:
  - Domain
  - Atomic
  - Null

QUIZ
Are these different relations/tables?
No, because relations are by definition unordered!

1.3 View of Data
We use abstraction to manage any complex system.
[Figure: levels of abstraction, labeled "Catalog, tables, functions, ..." and "Files, hard-disk blocks"]

Who uses these levels?
- DB user (e.g. Registrar's office clerk)
- DB programmer (e.g. the person who loads data into the DB)
- DBMS developer (e.g. Oracle, Teradata)
- DB Admin straddles the levels

Instances and Schemas
- Schema: the logical structure of the database
  - Example: the database consists of information about a set of customers and accounts and the relationship between them
  - Analogous to the type of a variable in programming (e.g. array of characters)
  - Physical schema: database design at the physical level
  - Logical schema: database design at the logical level
- Instance: the actual content of the database at a particular point in time
  - Analogous to the value of a variable (e.g. "Hello, world!")

Physical Data Independence = the ability to modify the physical schema without changing the logical schema
- This is an example of modularity!
- Applications depend on the logical schema
- In general, the interfaces between the various levels and components should be well defined, so that changes in some parts do not seriously influence others.

Data model = collection of conceptual tools for describing
- Data
- Data relationships
- Data semantics (e.g. constraints)
Examples of data models:
- Relational
- Entity-Relationship (E-R) model (mainly for DB design)
- Object-based: encapsulation, methods, inheritance (Object-oriented and Object-relational)
- Semistructured: a type does not necessarily mean the same set of attributes! (XML)
- Other, older models:
  - Network
  - Hierarchical

1.4 and 1.5 DB Languages
Data Manipulation Language (DML)
- For accessing and manipulating the data
- A.k.a. query language
- Two classes of languages:
  - Procedural: the user specifies what data is required and how to get those data
  - Declarative (nonprocedural!): the user specifies what is required without specifying how to get it
- SQL is the most widely used query language, and it is declarative.

SQL example: Find the name of the instructor with ID 22222
    select name
    from instructor
    where instructor.ID = '22222'

SQL example: Find the ID and building of instructors in the Physics dept.
    select instructor.ID, department.building
    from instructor, department
    where instructor.dept_name = department.dept_name
      and department.dept_name = 'Physics'
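The two queries above can be run against a throwaway in-memory database using Python's built-in sqlite3 module. This is a sketch under assumptions: the column types are simplified to what SQLite offers, and the handful of sample rows is there only to give the queries something to find.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table department (dept_name text, building text, budget numeric);
    create table instructor (ID text, name text, dept_name text, salary numeric);
    -- Sample rows, for illustration only.
    insert into department values ('Physics', 'Watson', 70000),
                                  ('Comp. Sci.', 'Taylor', 100000);
    insert into instructor values ('22222', 'Einstein', 'Physics', 95000),
                                  ('10101', 'Srinivasan', 'Comp. Sci.', 65000);
""")

# Single-table query: filter rows with where, pick columns with select.
name_rows = conn.execute(
    "select name from instructor where instructor.ID = '22222'"
).fetchall()

# Two-table query: the first where condition matches each instructor
# with the row of his or her own department.
physics_rows = conn.execute("""
    select instructor.ID, department.building
    from instructor, department
    where instructor.dept_name = department.dept_name
      and department.dept_name = 'Physics'
""").fetchall()
print(name_rows, physics_rows)
```

Note that string values such as 'Physics' and '22222' are written in single quotes. Without the quotes, Physics would be read as a column name.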

SQL Queries in a nutshell:
    select ...
    from ...
    where ...

QUIZ: SQL
Write a query to find the building where the instructor with ID 10101 works

QUIZ: SQL
Write a query to find the building where the instructor with ID 10101 works
    select department.building
    from instructor, department
    where instructor.dept_name = department.dept_name
      and instructor.ID = '10101'

QUIZ: SQL
Find the name of all instructors in the Comp. Sci. dept.

QUIZ: SQL
Find the name of all instructors in the Comp. Sci. dept. who make more than $80,000

QUIZ: SQL
Find the name of instructors who work in a dept. with a budget over $100,000
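One possible set of answers to the three quizzes above, written so that they can be checked by running them. As before, this is a sketch: it uses Python's sqlite3 module, simplified column types, and a few sample rows that are assumptions made for the demonstration. The order by clauses are not required by the quizzes; they only make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table department (dept_name text, building text, budget numeric);
    create table instructor (ID text, name text, dept_name text, salary numeric);
    -- Sample rows, for illustration only.
    insert into department values
        ('Comp. Sci.', 'Taylor', 100000),
        ('Physics', 'Watson', 70000),
        ('Finance', 'Painter', 120000);
    insert into instructor values
        ('10101', 'Srinivasan', 'Comp. Sci.', 65000),
        ('45565', 'Katz', 'Comp. Sci.', 75000),
        ('83821', 'Brandt', 'Comp. Sci.', 92000),
        ('22222', 'Einstein', 'Physics', 95000),
        ('12121', 'Wu', 'Finance', 90000);
""")

def names(sql):
    """Run a one-column query and return its values as a list."""
    return [row[0] for row in conn.execute(sql)]

# Quiz 1: one table, one condition.
comp_sci = names("""
    select name from instructor
    where dept_name = 'Comp. Sci.'
    order by name
""")

# Quiz 2: same table, two conditions combined with and.
comp_sci_over_80k = names("""
    select name from instructor
    where dept_name = 'Comp. Sci.' and salary > 80000
    order by name
""")

# Quiz 3: the budget lives in department, so two tables are needed.
big_budget = names("""
    select instructor.name
    from instructor, department
    where instructor.dept_name = department.dept_name
      and department.budget > 100000
    order by instructor.name
""")
print(comp_sci, comp_sci_over_80k, big_budget)
```

In the sample data the Comp. Sci. budget is exactly 100000, so it does not count as "over $100,000" and only the Finance instructor is returned by the third query.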

SQL
Application programs generally access databases through one of:
- Language extensions that allow embedded SQL
- An application program interface (e.g., ODBC/JDBC) that allows SQL queries to be sent to a database
(Chapters 3, 4 and 5)

Data Definition Language (DDL)
- Specification notation for defining the database schema
  Example:
    create table instructor (
        ID         char(5),
        name       varchar(20),
        dept_name  varchar(20),
        salary     numeric(8,2))
- The DDL compiler generates a set of table templates stored in a data dictionary
- The data dictionary contains metadata (i.e., data about data):
  - Database schema
  - Integrity constraints
    - Primary key (ID uniquely identifies instructors)
    - Referential integrity (references constraint in SQL), e.g. the dept_name value in any instructor tuple must appear in the department relation
  - Authorization
EoL
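To see that a DBMS really does store "data about data", one can create the instructor table and then ask the system what it knows about it. The sketch below uses SQLite through Python's sqlite3 module. The catalog table sqlite_master and the table_info pragma are specific to SQLite; other systems expose their data dictionary under different names. The primary key clause is added here to illustrate the constraint mentioned on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    create table instructor (
        ID         char(5) primary key,
        name       varchar(20),
        dept_name  varchar(20),
        salary     numeric(8,2)
    )
""")

# SQLite keeps its schema metadata in a catalog table called sqlite_master.
catalog = conn.execute(
    "select type, name from sqlite_master where name = 'instructor'"
).fetchall()

# pragma table_info returns one row per column:
# (cid, name, declared type, notnull, default value, pk).
# Keep the column name and the primary-key flag.
columns = [
    (row[1], row[5]) for row in conn.execute("pragma table_info(instructor)")
]
print(catalog)
print(columns)
```

No rows were inserted into instructor, yet the database already contains information: the table's existence, its column names, and which column is the primary key.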

QUIZ: SQL
Write SQL DDL code to create a table named account, with two columns:
- Account number, which is a string of 10 characters
- Balance, which is an integer

QUIZ: SQL
Write SQL DDL code to create a table named account, with two columns:
- Account number, which is a string of 10 characters
- Balance, which is an integer
    create table account (
        account_number char(10),
        balance integer)

1.6.4 Database design
- Database = multiple relations
- Information about an enterprise is broken up into parts (relations): instructor, department, student, advisor, etc.
- Bad design: place all information in one huge table, e.g.
[Figure: one huge table]

Database design
Bad design: all information in one huge table results in
- repetition of information (e.g., two students have the same instructor)
- the need for null values (e.g., to represent a student with no advisor)
- problems when updating the table
Normalization theory (Chapter 7) deals with how to design good relations.

Let's break the big table, then!
Do you see a problem with this DB?

The correct solution preserves the connections in the data!

2.3 Keys
Remember:
- Relations are sets of tuples (tables are sets of rows).
- Mathematical sets cannot, by definition, have duplicates.
- Therefore, no two tuples in a relation can have exactly the same values for all attributes.

QUIZ: Keys
Is this a valid table?

2.3 Keys
- No two tuples in a relation can have exactly the same values for all attributes.
- If we look at all columns of the table, the combinations of values in them are unique, but...
- Question: Can we find a smaller set of attributes whose combination is unique for the given relation/table?

QUIZ: Keys
What is the smallest key for this table?
    A   B   C
    A1  B1  C1
    A2  B1  C2
    A2  B1  C3
    A3  B2  C2
    A4  B2  C1
    A5  B3  C4
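A brute-force way to check an answer to this quiz is to try every subset of the attributes and see whether its values are unique across the six rows. The sketch below does that in plain Python. One caveat, stated as a limitation of the sketch: it only inspects this one table instance, whereas a key is a property of the schema and must hold for every instance the relation could ever have.

```python
from itertools import combinations

attributes = ("A", "B", "C")
rows = [
    ("A1", "B1", "C1"),
    ("A2", "B1", "C2"),
    ("A2", "B1", "C3"),
    ("A3", "B2", "C2"),
    ("A4", "B2", "C1"),
    ("A5", "B3", "C4"),
]

def is_unique(attr_subset):
    """True if no two rows agree on every attribute in attr_subset."""
    idx = [attributes.index(a) for a in attr_subset]
    projected = [tuple(row[i] for i in idx) for row in rows]
    return len(set(projected)) == len(projected)

# Every attribute set that identifies rows uniquely (superkey-like sets).
unique_sets = [
    set(c)
    for k in range(1, len(attributes) + 1)
    for c in combinations(attributes, k)
    if is_unique(c)
]

# Minimal sets: no proper subset is also unique (candidate-key-like sets).
minimal_sets = [s for s in unique_sets if not any(t < s for t in unique_sets)]
print(unique_sets)
print(minimal_sets)
```

No single column is unique in this table (A2, B1, and C1 all repeat), so the smallest identifying sets have two attributes, and there is more than one of them.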

Mathematical Notations for Keys
- K is a subset of the attributes of R: $K \subseteq R$
- K is a superkey of R if values for K are sufficient to identify a unique tuple of each possible relation r(R)
  Example: {ID} and {ID, name} are both superkeys of instructor.
- Superkey K is a candidate key if K is minimal
  Example: {ID} is a candidate key for instructor
- One of the candidate keys is selected to be the primary key. Which one? Depends on the application.

Schema diagram for University DB: Look at the keys in each table!

Individual work (try to solve before the next class)
End-of-chapter exercises
- 1.1, 1.4, 1.8, 1.9
- 2.1, 2.3, 2.4

We have covered the following textbook sections: 1.3, 1.4, 1.5, 1.6.4, 2.3
See you at 2:40 in the lab! SCIENCE 206
EOL 2

QUIZ
What are the data models we mentioned?
- Relational
- Entity-Relationship (E-R)
- Object-based: Object-oriented and Object-relational
- Semistructured: a type does not necessarily mean the same set of attributes! (e.g. XML, JSON)
- Older models:
  - Network
  - Hierarchical

QUIZ
- What are the two extreme cases of structuring of a DB into tables/relations? Why is either one bad?
- Name two major players in the DBMS market.

QUIZ: SQL
- Is SQL a procedural language? Explain.
- Is SQL a DML or a DDL? Explain.

QUIZ: SQL
Write (Postgre)SQL commands to do the following:
- Create a new database electronics
- Create a table circuits in electronics, with three columns:
  - Name of chip (e.g. 74LS04)
  - Type of gates (e.g. NOT)
  - Nr. of gates on a chip (e.g. 6)
- Delete the whole table
- Delete the whole database

    CREATE DATABASE electronics;
    -- Must connect to the DB to create a table!
    \connect electronics
    CREATE TABLE circuits (
        name varchar(15),
        type varchar(10),
        nr_gates integer
    );
    DROP TABLE circuits;
    -- Cannot drop the DB we are connected to!
    \connect postgres
    DROP DATABASE electronics;

QUIZ: Keys

QUIZ: Keys
Practice Exercise 2.1
Consider the following relational DB:
    Employee(person_name, street, city)
    Works(person_name, company_name, salary)
    Company(company_name, city)
What are the appropriate primary keys?

1.6.1 The DB design process
1. A high-level data model (e.g. the E-R model, see below) is chosen as a conceptual framework to specify the requirements
   - The DB designer interacts with users (a.k.a. domain experts)
   - The outcome is a (complete) specification of user requirements
2. A data model (e.g. the relational model) is chosen
3. Conceptual design: the DB designer translates user requirements into a conceptual schema for the model chosen
   - What attributes to represent in the DB
   - How to group attributes into tables
In the language of section 1.3.2, conceptual schema comes before logical schema!

1.6.1 The DB design process (cont.)
3. Conceptual design: the DB designer translates user requirements into a conceptual schema for the model chosen
   - Functional requirements = what kind of operations must be performed on the data
4. Logical design
   - The result is the logical schema of the DB (e.g. SQL DDL)
5. Physical design
   - The result is the physical schema of the DB (e.g. files, disk arrays, SAN)

1.6.2 Example: DB design for the University DB
Read and understand! See also Appendix A!

1.6.3 The E-R model
Detailed coverage to follow in Ch. 7.

SKIP Sections 1.7 through 1.12
READ and TAKE NOTES: 1.13 History of DB Systems

2.3 (contd.) Using Keys to Connect Different Relations/Tables
- Foreign key constraint: a value in one table must appear in another
  - Referencing table (source)
  - Referenced table (destination)
- Example: instructor.dept_name is a foreign key in instructor. It references department.dept_name
- Important: foreign keys are not keys! (Their values need not be unique in the referencing table.)

QUIZ: Foreign Keys
Practice Exercise 2.2
Give an example of a tuple insert and an example of a tuple delete that would cause violations of the FK constraint.
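The two kinds of violation asked about in this exercise can be provoked on a small test database. The sketch below uses the instructor/department example from the previous slide rather than the exercise's own schema, with Python's sqlite3 module. The sample rows, the instructor ID 99999, and the department name 'Alchemy' are made up for the demonstration. Note the pragma on the second line: SQLite only enforces foreign keys when asked to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")  # SQLite does not enforce FKs by default
conn.executescript("""
    create table department (dept_name text primary key, building text);
    create table instructor (
        ID        text primary key,
        name      text,
        dept_name text references department(dept_name)
    );
    insert into department values ('Physics', 'Watson');
    insert into instructor values ('22222', 'Einstein', 'Physics');
""")

def violates_fk(sql):
    """Try one statement; report whether the DBMS rejected it."""
    try:
        with conn:  # rolls back automatically if the statement fails
            conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# Insert into the referencing table a value missing from the referenced table.
bad_insert = violates_fk(
    "insert into instructor values ('99999', 'Nobody', 'Alchemy')"
)
# Delete from the referenced table a value still used by the referencing table.
bad_delete = violates_fk("delete from department where dept_name = 'Physics'")
print(bad_insert, bad_delete)
```

The pattern is general: inserts are dangerous on the referencing side, deletes on the referenced side.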

2.4 Schema diagrams

2.4 Schema diagrams
An almost equivalent schema is this (What is missing?)

2.5 Relational Query Languages
Very little new info beyond Ch. 1. Read!

2.5 Relational Operations

Relational Algebra: Selection of tuples (rows)
- Relation r
- Select tuples with A = B and D > 5: $\sigma_{A=B \wedge D>5}(r)$

Relational Algebra: Selection of Columns (Attributes)
- Relation r
- Select A and C, a.k.a. projection: $\Pi_{A,C}(r)$

QUIZ: Relational Algebra
Write a relational algebra expression that returns those values of A that are associated with values of C between 5 and 15 (inclusive).

Relational Algebra and SQL
- The selection operator $\sigma$ is analogous to the SQL clause ____
- The projection operator $\Pi$ is analogous to the SQL clause ____
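Selection and projection are simple enough to write by hand, which can help in seeing what each one does to a table. In the sketch below a relation is a list of dictionaries, selection keeps some rows, and projection keeps some columns. The four rows of r are made up for illustration, and the function names select and project are this sketch's own choice.

```python
# Hypothetical relation r with attributes A, B, C, D.
r = [
    {"A": "a", "B": "a", "C": 1, "D": 7},
    {"A": "a", "B": "b", "C": 5, "D": 7},
    {"A": "b", "B": "b", "C": 12, "D": 3},
    {"A": "b", "B": "b", "C": 23, "D": 10},
]

def select(relation, predicate):
    """Selection: keep the tuples for which the predicate holds."""
    return [t for t in relation if predicate(t)]

def project(relation, attrs):
    """Projection: keep only attrs; the result is a set, so duplicates vanish."""
    return {tuple(t[a] for a in attrs) for t in relation}

# Tuples with A = B and D > 5.
selected = select(r, lambda t: t["A"] == t["B"] and t["D"] > 5)

# Columns A and C only.
projected = project(r, ["A", "C"])

# The two operators compose: first pick rows, then pick columns.
# Values of A associated with C between 5 and 15 (inclusive).
composed = project(select(r, lambda t: 5 <= t["C"] <= 15), ["A"])
print(selected, projected, composed)
```

Selection never changes the number of columns and projection never adds rows. Projection can, however, shrink the number of rows when dropping a column makes two tuples identical.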

More Relational Algebra

Joining two relations
- Cartesian product is the most general join
- Relations r, s; result: r × s

Joining two relations
- Natural join is the most restrictive join
- Only attributes with the same name and type are matched!
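The difference between the two joins shows up clearly on a small example. In the sketch below, the Cartesian product pairs every tuple of one relation with every tuple of the other, while the natural join keeps only the pairs that agree on the attributes the two relations share. The sample tuples and the prefixes r. and s. (used to keep same-named attributes apart in the product) are assumptions of this sketch.

```python
from itertools import product

instructor = [
    {"ID": "22222", "name": "Einstein", "dept_name": "Physics"},
    {"ID": "10101", "name": "Srinivasan", "dept_name": "Comp. Sci."},
]
department = [
    {"dept_name": "Physics", "building": "Watson"},
    {"dept_name": "Comp. Sci.", "building": "Taylor"},
]

def cartesian_product(r, s):
    """Pair every tuple of r with every tuple of s.

    Attribute names are prefixed so that same-named attributes do not clash.
    """
    return [
        {**{f"r.{k}": v for k, v in t.items()},
         **{f"s.{k}": v for k, v in u.items()}}
        for t, u in product(r, s)
    ]

def natural_join(r, s):
    """Keep only the pairs that agree on all attributes with the same name.

    Assumes both relations are non-empty, so the attribute names can be
    read off the first tuple of each.
    """
    common = set(r[0]) & set(s[0])
    return [
        {**t, **u}
        for t, u in product(r, s)
        if all(t[a] == u[a] for a in common)
    ]

all_pairs = cartesian_product(instructor, department)
joined = natural_join(instructor, department)
print(len(all_pairs), len(joined))
```

The product has 2 x 2 = 4 tuples, including nonsensical ones such as Einstein paired with the Comp. Sci. department. The natural join keeps only the 2 tuples where dept_name matches on both sides.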

Union of two relations
Relations r, s; result: $r \cup s$

Set difference of two relations
Relations r, s; result: $r - s$

Set intersection of two relations
Relations r, s; result: $r \cap s$
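Because relations are sets of tuples, these three operations map directly onto Python's built-in set operators. The sketch below uses two small made-up relations with the same two attributes, which is what the operations require in order to make sense.

```python
# Two hypothetical relations over the same two attributes.
r = {("a", 1), ("a", 2), ("b", 1)}
s = {("a", 2), ("b", 3)}

union = r | s          # tuples in r or in s (duplicates appear once)
difference = r - s     # tuples in r but not in s
intersection = r & s   # tuples in both r and s

# Intersection can be expressed with difference alone: r - (r - s).
derived_intersection = r - (r - s)
print(union, difference, intersection)
```

The last line shows why intersection is often treated as a convenience rather than a basic operator: it adds no expressive power beyond set difference.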

Practice Exercise 2.6
Explain in words what these expressions do:

QUIZ: Relational Algebra
Explain in words what this relational algebra expression returns:

Individual work (try to solve before the next class)
End-of-chapter Practice Exercises 2.3, 2.4, 2.5

Homework for Chs. 1 + 2
- 1.11, 1.12, 1.15
- 2.8, 2.9, 2.13, 2.14, 2.15
Due next Tuesday, Jan. 31
Hint: For the Bank database used in several exercises, you may use the schema provided on the next slide.

Schema for Bank database (fig. 2.15)

Text sections covered:
- 1.1 through 1.6 + 1.13 (History)
- Entire Ch. 2
EOL 3