CSE 344 JANUARY 5 TH INTRO TO THE RELATIONAL DATABASE

Similar documents
CSE 344 JANUARY 8 TH SQLITE AND JOINS

Introduction to Databases CSE 414. Lecture 2: Data Models

Announcements. Using Electronics in Class. Review. Staff Instructor: Alvin Cheung Office hour on Wednesdays, 1-2pm. Class Overview

Introduction to Data Management CSE 344. Lecture 2: Data Models

Introduction to Database Systems CSE 414. Lecture 3: SQL Basics

Introduction to Database Systems CSE 414. Lecture 3: SQL Basics

Lecture 2: Introduction to SQL

Introduction to SQL Part 1 By Michael Hahsler based on slides for CS145 Introduction to Databases (Stanford)

DS Introduction to SQL Part 1 Single-Table Queries. By Michael Hahsler based on slides for CS145 Introduction to Databases (Stanford)

Big Data Processing Technologies. Chentao Wu Associate Professor Dept. of Computer Science and Engineering

DS Introduction to SQL Part 2 Multi-table Queries. By Michael Hahsler based on slides for CS145 Introduction to Databases (Stanford)

Introduction to Database Systems CSE 444

CMPT 354: Database System I. Lecture 2. Relational Model

Lecture 3: SQL Part II

Introduction to Data Management CSE 414

MTAT Introduction to Databases

Lecture 3 Additional Slides. CSE 344, Winter 2014 Sudeepa Roy

Announcements. Multi-column Keys. Multi-column Keys (3) Multi-column Keys. Multi-column Keys (2) Introduction to Data Management CSE 414

INF 212 ANALYSIS OF PROG. LANGS SQL AND SPREADSHEETS. Instructors: Crista Lopes Copyright Instructors.

Announcements. Agenda. Database/Relation/Tuple. Schema. Discussion. CSE 444: Database Internals

Announcements. Multi-column Keys. Multi-column Keys. Multi-column Keys (3) Multi-column Keys (2) Introduction to Data Management CSE 414

Polls on Piazza. Open for 2 days Outline today: Next time: "witnesses" (traditionally students find this topic the most difficult)

Agenda. Discussion. Database/Relation/Tuple. Schema. Instance. CSE 444: Database Internals. Review Relational Model

Introduction to Data Management CSE 344

Database Systems CSE 303. Lecture 02

The Relational Data Model

Database Systems CSE 303. Lecture 02

SQL. Assit.Prof Dr. Anantakul Intarapadung

COMP 430 Intro. to Database Systems

Principles of Database Systems CSE 544. Lecture #1 Introduction and SQL

Database Systems CSE 414

Where Are We? Next Few Lectures. Integrity Constraints Motivation. Constraints in E/R Diagrams. Keys in E/R Diagrams

Principles of Database Systems CSE 544. Lecture #1 Introduction and SQL

Principles of Database Systems CSE 544. Lecture #2 SQL The Complete Story

CSE 344 JANUARY 3 RD - INTRODUCTION

Announcements. Agenda. Database/Relation/Tuple. Discussion. Schema. CSE 444: Database Internals. Room change: Lab 1 part 1 is due on Monday

Introduction to Data Management CSE 344

Announcements. JSon Data Structures. JSon Syntax. JSon Semantics: a Tree! JSon Primitive Datatypes. Introduction to Database Systems CSE 414

user specifies what is wanted, not how to find it

Before we talk about Big Data. Lets talk about not-so-big data

Database Systems CSE 414

CSE 344 APRIL 16 TH SEMI-STRUCTURED DATA

JSON - Overview JSon Terminology

Data! CS 133: Databases. Goals for Today. So, what is a database? What is a database anyway? From the textbook:

Introduction to Data Management CSE 344

Keys, SQL, and Views CMPSCI 645

SQL Functionality SQL. Creating Relation Schemas. Creating Relation Schemas

Lecture 4. Lecture 4: The E/R Model

Introduction to Database Systems CSE 344

CSE 544 Principles of Database Management Systems

L12: ER modeling 5. CS3200 Database design (sp18 s2) 2/22/2018

SQL Introduction. CS 377: Database Systems

CSE 344 APRIL 11 TH DATALOG

Introduction to Database Systems CSE 414

COMP 430 Intro. to Database Systems

CS 564 Midterm Review

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Fall 2006 Lecture 3 - Relational Model

Introduction to Database Systems. The Relational Data Model

CSE 344 JULY 30 TH DB DESIGN (CH 4)

Introduction to Database Systems. The Relational Data Model. Werner Nutt

CSE 544 Principles of Database Management Systems

SQL: Concepts. Todd Bacastow IST 210: Organization of Data 2/17/ IST 210

Announcements. Two Classes of Database Applications. Class Overview. NoSQL Motivation. RDBMS Review: Serverless

Class Overview. Two Classes of Database Applications. NoSQL Motivation. RDBMS Review: Client-Server. RDBMS Review: Serverless

SQLite, Firefox, and our small IMDB movie database. CS3200 Database design (sp18 s2) Version 1/17/2018

Entity/Relationship Modelling

Announcements. Database Design. Database Design. Database Design Process. Entity / Relationship Diagrams. Introduction to Data Management CSE 344

Chapter 4. Basic SQL. SQL Data Definition and Data Types. Basic SQL. SQL language SQL. Terminology: CREATE statement

CSE 344 JANUARY 19 TH SUBQUERIES 2 AND RELATIONAL ALGEBRA

SQL Simple Queries. Chapter 3.1 V3.01. Napier University

L07: SQL: Advanced & Practice. CS3200 Database design (sp18 s2) 1/11/2018

CSE 544 Principles of Database Management Systems

Basic SQL. Dr Fawaz Alarfaj. ACKNOWLEDGEMENT Slides are adopted from: Elmasri & Navathe, Fundamentals of Database Systems MySQL Documentation

Lecture 5. Lecture 5: The E/R Model

CSE 344 FEBRUARY 14 TH INDEXING

Database Systems CSE 414

Overview of the Class and Introduction to DB schemas and queries. Lois Delcambre

CSE 344 JULY 9 TH NOSQL

Lecture 2 SQL. Announcements. Recap: Lecture 1. Today s topic. Semi-structured Data and XML. XML: an overview 8/30/17. Instructor: Sudeepa Roy

SQL Data Definition and Data Manipulation Languages (DDL and DML)

CMPT 354: Database System I. Lecture 3. SQL Basics

CSE 344 Midterm. Monday, Nov 4, 2013, 9:30-10:20. Question Points Score Total: 100

CSC 261/461 Database Systems Lecture 7

Operating systems fundamentals - B07

SQL. Structured Query Language

CS 327E Lecture 5. Shirley Cohen. September 14, 2016

Lectures 2&3: Introduction to SQL

COURSE OVERVIEW THE RELATIONAL MODEL. CS121: Relational Databases Fall 2017 Lecture 1

Information Systems Engineering. SQL Structured Query Language DDL Data Definition (sub)language

COSC344 Database Theory and Applications. Lecture 6 SQL Data Manipulation Language (1)

Data about data is database Select correct option: True False Partially True None of the Above

COURSE OVERVIEW THE RELATIONAL MODEL. CS121: Introduction to Relational Database Systems Fall 2016 Lecture 1

CSE 344 MAY 7 TH EXAM REVIEW

ECE 650 Systems Programming & Engineering. Spring 2018

CSE 414 Midterm. April 28, Name: Question Points Score Total 101. Do not open the test until instructed to do so.

Announcements. Outline UNIQUE. (Inner) joins. (Inner) Joins. Database Systems CSE 414. WQ1 is posted to gradebook double check scores

Databases. Jörg Endrullis. VU University Amsterdam

Introduction to Data Management. Lecture #4 (E-R Relational Translation)

Introduction to Data Management CSE 344

CSE 344 Introduction to Data Management. Section 2: More SQL

Transcription:

CSE 344 JANUARY 5 TH INTRO TO THE RELATIONAL DATABASE

ADMINISTRATIVE MINUTIAE Midterm Exam: February 9 th : 3:30-4:20 Final Exam: March 15 th : 2:30 4:20

ADMINISTRATIVE MINUTIAE Midterm Exam: February 9 th : 3:30-4:20 Final Exam: March 15 th : 2:30 4:20 HW#1 Out on Monday Online Quiz #1 out on Monday Syllabus and course website Expect email w/link to Piazza over the weekend

ADMINISTRATIVE MINUTIAE Midterm Exam: February 9 th : 3:30-4:20 Final Exam: March 15 th : 2:30 4:20 HW#1 Out on Monday Online Quiz #1 out on Monday Syllabus and course website Expect email w/link to Piazza over the weekend Next week: section will be very helpful setting up git and SQLite. Don t hesitate to come to OH if you re having trouble tutorial w/ lecture slides

CLASS OVERVIEW Unit 1: Intro Unit 2: Relational Data Models and Query Languages Data models, SQL RA, Datalog Unit 3: Non-relational data Unit 4: RDMBS internals and query optimization Unit 5: Parallel query processing Unit 6: DBMS usability, conceptual design Unit 7: Transactions Unit 8: Advanced topics (time permitting)

REVIEW What is a database? A collection of files storing related data What is a DBMS? An application program that allows us to manage efficiently the collection of data files

DATA MODELS Recall our example: want to design a database of books: author, title, publisher, pub date, price, etc How should we describe this data? Data model = mathematical formalism (or conceptual way) for describing the data

DATA MODELS Relational Data represented as relations Semi-structured (Json/XML) Data represented as trees Key-value pairs Unit 2 Unit 3 Used by NoSQL systems Graph Object-oriented

DATABASES VS. DATA STRUCTURES What are some important distinctions between database systems, and data structure systems?

DATABASES VS. DATA STRUCTURES What are some important distinctions between database systems, and data structure systems? Structure:

DATABASES VS. DATA STRUCTURES What are some important distinctions between database systems, and data structure systems? Structure: Java concerned with physical structure. DBMS concerned with conceptual structure

DATABASES VS. DATA STRUCTURES What are some important distinctions between database systems, and data structure systems? Structure: Java concerned with physical structure. DBMS concerned with conceptual structure Operations: Java low level, DBMS restricts allowable operations. Why?

DATABASES VS. DATA STRUCTURES What are some important distinctions between database systems, and data structure systems? Structure: Java concerned with physical structure. DBMS concerned with conceptual structure Operations: Java low level, DBMS restricts allowable operations. Efficiency and data control

DATABASES VS. DATA STRUCTURES What are some important distinctions between database systems, and data structure systems? Structure: Java concerned with physical structure. DBMS concerned with conceptual structure Operations: Java low level, DBMS restricts allowable operations. Efficiency and data control Data constraints:

DATABASES VS. DATA STRUCTURES What are some important distinctions between database systems, and data structure systems? Structure: Java concerned with physical structure. DBMS concerned with conceptual structure Operations: Java low level, DBMS restricts allowable operations. Efficiency and data control Data constraints: Enforced typing allows us to maximize our memory usage and to be confident our operations are successful

3 ELEMENTS OF DATA MODELS Instance The actual data Schema Describe what data is being stored Query language How to retrieve and manipulate data

RELATIONAL MODEL Data is a collection of relations / tables: columns / attributes / fields rows / tuples / records cname country no_employees for_profit GizmoWorks USA 20000 True Canon Japan 50000 True Hitachi Japan 30000 True HappyCam Canada 500 False mathematically, relation is a set of tuples each tuple (or entry) must have a value for each attribute order of the rows is unspecified

RELATIONAL MODEL Data is a collection of relations / tables: columns / attributes / fields rows / tuples / records cname country no_employees for_profit GizmoWorks USA 20000 True Canon Japan 50000 True Hitachi Japan 30000 True HappyCam Canada 500 False mathematically, relation is a set of tuples each tuple (or entry) must have a value for each attribute order of the rows is unspecified What is the schema for this table?

RELATIONAL MODEL Data is a collection of relations / tables: columns / attributes / fields rows / tuples / records cname country no_employees for_profit GizmoWorks USA 20000 True Canon Japan 50000 True Hitachi Japan 30000 True HappyCam Canada 500 False mathematically, relation is a set of tuples each tuple (or entry) must have a value for each attribute order of the rows is unspecified What is the schema for this table? Company(cname, country, no_employees, for_profit)

THE RELATIONAL DATA MODEL Degree (arity) of a relation = #attributes Each attribute has a type. Examples types: Strings: CHAR(20), VARCHAR(50), TEXT Numbers: INT, SMALLINT, FLOAT MONEY, DATETIME, Few more that are vendor specific Statically and strictly enforced

KEYS Key = one (or multiple) attributes that uniquely identify a record

KEYS Key = one (or multiple) attributes that uniquely identify a record Key cname country no_employees for_profit GizmoWorks USA 20000 True Canon Japan 50000 True Hitachi Japan 30000 True HappyCam Canada 500 False

KEYS Key = one (or multiple) attributes that uniquely identify a record Key Not a key cname country no_employees for_profit GizmoWorks USA 20000 True Canon Japan 50000 True Hitachi Japan 30000 True HappyCam Canada 500 False

KEYS Key = one (or multiple) attributes that uniquely identify a record Key Not a key Is this a key? cname country no_employees for_profit GizmoWorks USA 20000 True Canon Japan 50000 True Hitachi Japan 30000 True HappyCam Canada 500 False

KEYS Key = one (or multiple) attributes that uniquely identify a record Key Not a key Is this a key? No: future updates to the database may create duplicate no_employees cname country no_employees for_profit GizmoWorks USA 20000 True Canon Japan 50000 True Hitachi Japan 30000 True HappyCam Canada 500 False

MULTI-ATTRIBUTE KEY Key = fname,lname (what does this mean?) fname lname Income Department Alice Smith 20000 Testing Alice Thompson 50000 Testing Bob Thompson 30000 SW Carol Smith 50000 Testing

MULTIPLE KEYS Key Another key SSN fname lname Income Department 111-22-3333 Alice Smith 20000 Testing 222-33-4444 Alice Thompson 50000 Testing 333-44-5555 Bob Thompson 30000 SW 444-55-6666 Carol Smith 50000 Testing We can choose one key and designate it as primary key E.g.: primary key = SSN

FOREIGN KEY Company(cname, country, no_employees, for_profit) Country(name, population) Company cname country no_employees for_profit Canon Japan 50000 Y Hitachi Japan 30000 Y Country Foreign key to Country.name name USA Japan population 320M 127M

KEYS: SUMMARY Key = columns that uniquely identify tuple Usually we underline A relation can have many keys, but only one can be chosen as primary key Foreign key: Attribute(s) whose value is a key of a record in some other relation Foreign keys are sometimes called semantic pointer

QUERY LANGUAGE SQL Structured Query Language Developed by IBM in the 70s Most widely used language to query relational data Other relational query languages Datalog, relational algebra

OUR FIRST DBMS SQL Lite Will switch to SQL Server later in the quarter

DEMO 1

DEMO 1 What operations should we expect SQLite (or any DBMS) to support just on what we know right now?

DEMO 1 What operations should we expect SQLite (or any DBMS) to support just on what we know right now? create table insert into select delete from

DEMO 1 What operations should we expect SQLite (or any DBMS) to support just on what we know right now? create table insert into select delete from What sorts of inputs do these functions need to have?

DEMO 1 What operations should we expect SQLite (or any DBMS) to support just on what we know right now? create table insert into select delete from What sorts of inputs do these functions need to have? create table:

DEMO 1 What operations should we expect SQLite (or any DBMS) to support just on what we know right now? create table insert into select delete from What sorts of inputs do these functions need to have? create table: table name, schema

DEMO 1 What operations should we expect SQLite (or any DBMS) to support just on what we know right now? create table insert into select delete from What sorts of inputs do these functions need to have? create table: table name, schema insert into:

DEMO 1 What operations should we expect SQLite (or any DBMS) to support just on what we know right now? create table insert into select delete from What sorts of inputs do these functions need to have? create table: table name, schema insert into: table name, tuple

DEMO 1 What operations should we expect SQLite (or any DBMS) to support just on what we know right now? create table insert into select delete from What sorts of inputs do these functions need to have? create table: table name, schema insert into: table name, tuple select:

DEMO 1 What operations should we expect SQLite (or any DBMS) to support just on what we know right now? create table insert into select delete from What sorts of inputs do these functions need to have? create table: table name, schema insert into: table name, tuple select: table name, attributes

DEMO 1 What operations should we expect SQLite (or any DBMS) to support just on what we know right now? create table insert into select delete from What sorts of inputs do these functions need to have? create table: table name, schema insert into: table name, tuple select: table name, attributes delete from: table name, condition

DEMO 1 What operations should we expect SQLite (or any DBMS) to support just on what we know right now? create table insert into select delete from What other behavior do we expect from these functions?

DEMO 1 What operations should we expect SQLite (or any DBMS) to support just on what we know right now? create table insert into select delete from What other behavior do we expect from these functions? Much of the behavior is similar to a dictionary from 332. Create table ~= new DS(), insert into ~= insert(k,v), select! ~= find(k), delete from ~= remove(k)

DEMO 1 What operations should we expect SQLite (or any DBMS) to support just on what we know right now? create table insert into select delete from What other behavior do we expect from these functions? Much of the behavior is similar to a dictionary from 332. Create table ~= new DS(), insert into ~= insert(k,v), select! ~= find(k), delete from ~= remove(k) Also have the key constraints!

DEMO 1 Common Syntax CREATE TABLE [tablename] ([att1] [type1], [att2] [type2] ); INSERT INTO [tablename] VALUES ([val1],[val2] ); SELECT * FROM [tablename]

DEMO 1 Common Syntax CREATE TABLE [tablename] ([att1] [type1], [att2] [type2] ); INSERT INTO [tablename] VALUES ([val1],[val2] ); SELECT [att1],[att2], FROM [tablename]

DEMO 1 Common Syntax CREATE TABLE [tablename] ([att1] [type1], [att2] [type2] ); INSERT INTO [tablename] VALUES ([val1],[val2] ); SELECT [att1],[att2], FROM [tablename] WHERE [condition]

DEMO 1 Common Syntax CREATE TABLE [tablename] ([att1] [type1], [att2] [type2] ); INSERT INTO [tablename] VALUES ([val1],[val2] ); SELECT [att1],[att2], FROM [tablename] WHERE [condition] DELETE FROM [tablename]

DEMO 1 Common Syntax CREATE TABLE [tablename] ([att1] [type1], [att2] [type2] ); INSERT INTO [tablename] VALUES ([val1],[val2] ); SELECT [att1],[att2], FROM [tablename] WHERE [condition] DELETE FROM [tablename] WHERE [condition]

DEMO 1

DISCUSSION Two other operations we want to support ALTER TABLE: Adds a new attribute to the table UPDATE: Change the attribute for a particular tuple in the table. Common Syntax ALTER TABLE [tablename] ADD [attname] [atttype] UPDATE [tablename] SET [attname]=[value]

DISCUSSION Two other operations we want to support ALTER TABLE: Adds a new attribute to the table UPDATE: Change the attribute for a particular tuple in the table. Common Syntax ALTER TABLE [tablename] ADD [attname] [atttype] UPDATE [tablename] SET [attname]=[value] WHERE [condition]

DEMO 2

DISCUSSION Tables are NOT ordered they are sets or multisets (bags) Tables are FLAT No nested attributes Tables DO NOT prescribe how they are implemented / stored on disk This is called physical data independence

TABLE IMPLEMENTATION How would you implement this? cname country no_employees for_profit GizmoWorks USA 20000 True Canon Japan 50000 True Hitachi Japan 30000 True HappyCam Canada 500 False

TABLE IMPLEMENTATION How would you implement this? cname country no_employees for_profit GizmoWorks USA 20000 True Canon Japan 50000 True Hitachi Japan 30000 True HappyCam Canada 500 False Row major: as an array of objects GizmoWorks USA 20000 True Canon Japan 50000 True Hitachi Japan 30000 True HappyCam Canada 500 False

TABLE IMPLEMENTATION How would you implement this? cname country no_employees for_profit GizmoWorks USA 20000 True Canon Japan 50000 True Hitachi Japan 30000 True HappyCam Canada 500 False Column major: as one array per attribute GizmoWorks Canon Hitachi HappyCam USA Japan Japan Canada 20000 50000 30000 500 True True True False

TABLE IMPLEMENTATION How would you implement this? cname country no_employees for_profit GizmoWorks USA 20000 True Canon Japan 50000 True Hitachi Japan 30000 True HappyCam Canada 500 False Physical data independence The logical definition of the data remains unchanged, even when we make changes to the actual implementation

FIRST NORMAL FORM cname country no_employees for_profit Canon Japan 50000 Y Hitachi Japan 30000 Y All relations must be flat: we say that the relation is in first normal form

FIRST NORMAL FORM cname country no_employees for_profit Canon Japan 50000 Y Hitachi Japan 30000 Y All relations must be flat: we say that the relation is in first normal form E.g. we want to add products manufactured by each company:

FIRST NORMAL FORM cname country no_employees for_profit Canon Japan 50000 Y Hitachi Japan 30000 Y All relations must be flat: we say that the relation is in first normal form E.g. we want to add products manufactured by each company: cname country no_employees for_profit products Canon Japan 50000 Y Hitachi Japan 30000 Y pname price category SingleTouch 149.99 Photography Gadget 200 Toy pname price category AC 300 Appliance

FIRST NORMAL FORM cname country no_employees for_profit Canon Japan 50000 Y Hitachi Japan 30000 Y All relations must be flat: we say that the relation is in first normal form E.g. we want to add products manufactured by each company: cname country no_employees for_profit products Non-1NF! Canon Japan 50000 Y Hitachi Japan 30000 Y pname price category SingleTouch 149.99 Photography Gadget 200 Toy pname price category AC 300 Appliance

FIRST NORMAL FORM Now it s in 1NF Company cname country no_employees for_profit Canon Japan 50000 Y Hitachi Japan 30000 Y Products pname price category manufacturer SingleTouch 149.99 Photography Canon AC 300 Appliance Hitachi Gadget 200 Toy Canon

DEMO 3

DATA MODELS: SUMMARY Schema + Instance + Query language Relational model: Database = collection of tables Each table is flat: first normal form Key: may consists of multiple attributes Foreign key: semantic pointer Physical data independence