Principles of Data Management

Similar documents
The SQL data-definition language (DDL) allows defining :

Chapter 3: Introduction to SQL

Chapter 3: Introduction to SQL

Chapter 3: Introduction to SQL. Chapter 3: Introduction to SQL

Lecture 6 - More SQL

Lecture 3 SQL. Shuigeng Zhou. September 23, 2008 School of Computer Science Fudan University

Chapter 4. Basic SQL. SQL Data Definition and Data Types. Basic SQL. SQL language SQL. Terminology: CREATE statement

Unit 2: SQL AND PL/SQL

QQ Group

Basic Structure Set Operations Aggregate Functions Null Values Nested Subqueries Derived Relations Views Modification of the Database Data Definition

CS425 Fall 2017 Boris Glavic Chapter 4: Introduction to SQL

Slides by: Ms. Shree Jaswal

Chapter 4. Basic SQL. Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley

Chapter 4: SQL. Basic Structure

SQL. Lecture 4 SQL. Basic Structure. The select Clause. The select Clause (Cont.) The select Clause (Cont.) Basic Structure.

CS 582 Database Management Systems II

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Database Systems: Design, Implementation, and Management Tenth Edition. Chapter 7 Introduction to Structured Query Language (SQL)

Chapter 4: Intermediate SQL

Sql Server Syllabus. Overview

Relational Algebra and SQL

DATABASTEKNIK - 1DL116

DATABASE TECHNOLOGY - 1MB025

Outline. Textbook Chapter 6. Note 1. CSIE30600/CSIEB0290 Database Systems Basic SQL 2

Chapter 3: SQL. Database System Concepts, 5th Ed. Silberschatz, Korth and Sudarshan See for conditions on re-use

Chapter 3: SQL. Chapter 3: SQL

Silberschatz, Korth and Sudarshan See for conditions on re-use

DATABASE TECHNOLOGY - 1MB025

WHAT IS SQL. Database query language, which can also: Define structure of data Modify data Specify security constraints

Database System Concepts, 5th Ed.! Silberschatz, Korth and Sudarshan See for conditions on re-use "

Querying Data with Transact SQL

SQL functions fit into two broad categories: Data definition language Data manipulation language

CSIE30600 Database Systems Basic SQL 2. Outline

Explain in words what this relational algebra expression returns:

DATABASE TECHNOLOGY. Spring An introduction to database systems

DATABASE DESIGN I - 1DL300

SQL Functionality SQL. Creating Relation Schemas. Creating Relation Schemas

COGS 121 HCI Programming Studio. Week 03 - Tech Lecture

The SQL database language Parts of the SQL language

Basic SQL. Dr Fawaz Alarfaj. ACKNOWLEDGEMENT Slides are adopted from: Elmasri & Navathe, Fundamentals of Database Systems MySQL Documentation

CS 327E Class 3. February 11, 2019

618 Index. BIT data type, 108, 109 BIT_LENGTH, 595f BIT VARYING data type, 108 BLOB data type, 108 Boolean data type, 109

CSCB20 Week 4. Introduction to Database and Web Application Programming. Anna Bretscher Winter 2017

An Introduction to Structured Query Language

SQL (Structured Query Language)

SQL: Data Definition Language. csc343, Introduction to Databases Diane Horton Fall 2017

COSC 304 Introduction to Database Systems SQL DDL. Dr. Ramon Lawrence University of British Columbia Okanagan

An Introduction to Structured Query Language

Introduction to SQL. ECE 650 Systems Programming & Engineering Duke University, Spring 2018

An Introduction to Structured Query Language

An Introduction to Structured Query Language

Chapter 8 SQL-99: Schema Definition, Basic Constraints, and Queries

CSEN 501 CSEN501 - Databases I

CSCB20 Week 3. Introduction to Database and Web Application Programming. Anna Bretscher Winter 2017

Oracle Syllabus Course code-r10605 SQL

Database design process

INDEX. 1 Basic SQL Statements. 2 Restricting and Sorting Data. 3 Single Row Functions. 4 Displaying data from multiple tables

GridDB Advanced Edition SQL reference

An Introduction to Structured Query Language

SQL STRUCTURED QUERY LANGUAGE

Database Systems SQL SL03

5. Single-row function

Oracle Database 11g: SQL and PL/SQL Fundamentals

Course Outline and Objectives: Database Programming with SQL

Intermediate SQL ( )

EGCI 321: Database Systems. Dr. Tanasanee Phienthrakul

Chapter 4: Intermediate SQL

Database Systems SQL SL03

Optional SQL Feature Summary

Working with DB2 Data Using SQL and XQuery Answers

Introduction to Computer Science and Business

STRUCTURED QUERY LANGUAGE (SQL)

20461: Querying Microsoft SQL Server 2014 Databases

COSC344 Database Theory and Applications. Lecture 5 SQL - Data Definition Language. COSC344 Lecture 5 1

Table of Contents. PDF created with FinePrint pdffactory Pro trial version

CS 4604: Introduc0on to Database Management Systems. B. Aditya Prakash Lecture #4: SQL---Part 2

SQL-99: Schema Definition, Basic Constraints, and Queries. Create, drop, alter Features Added in SQL2 and SQL-99

Course Modules for MCSA: SQL Server 2016 Database Development Training & Certification Course:

SQL Fundamentals. Chapter 3. Class 03: SQL Fundamentals 1

Structured Query Language (SQL)

SQL: Concepts. Todd Bacastow IST 210: Organization of Data 2/17/ IST 210

1) Introduction to SQL

Chapter 4: Intermediate SQL

MTA Database Administrator Fundamentals Course

Debapriyo Majumdar DBMS Fall 2016 Indian Statistical Institute Kolkata

BASIC SQL CHAPTER 4 (6/E) CHAPTER 8 (5/E)

Querying Data with Transact-SQL

SQL. Copyright 2013 Ramez Elmasri and Shamkant B. Navathe

Chapter 8. Joined Relations. Joined Relations. SQL-99: Schema Definition, Basic Constraints, and Queries

1 Writing Basic SQL SELECT Statements 2 Restricting and Sorting Data

Constraints. Primary Key Foreign Key General table constraints Domain constraints Assertions Triggers. John Edgar 2

BASIC SQL CHAPTER 4 (6/E) CHAPTER 8 (5/E)

Implementing Table Operations Using Structured Query Language (SQL) Using Multiple Operations. SQL: Structured Query Language

SQL Interview Questions

T-SQL Training: T-SQL for SQL Server for Developers

Index. Bitmap Heap Scan, 156 Bitmap Index Scan, 156. Rahul Batra 2018 R. Batra, SQL Primer,

Interview Questions on DBMS and SQL [Compiled by M V Kamal, Associate Professor, CSE Dept]

SQL DATA DEFINITION LANGUAGE

Lab # 4. Data Definition Language (DDL)

user specifies what is wanted, not how to find it

Transcription:

Principles of Data Management Alvin Lin August 2018 - December 2018 Structured Query Language Structured Query Language (SQL) was created at IBM in the 80s: SQL-86 (first standard) SQL-89 SQL-92 (what all SQL compliant DBs use) SQL-1999 SQL-2003 SQL is a commercial language that is intended to be a data manipulation language, but supports features of data definition languages. It supports the following features: Schema Domain Integrity Constraints Indicies Security Physical Storage 1

Domain Types char(n) - n number of characters int - numbers smallint - smaller value number numeric(p,d) - precision decimal real,double,float - infinite precision decimals varchar(n) - variable length characters Create Table create table r ( A1 D1, A2 D2,..., An Dn) ( integrity constraints ); Example: create table dog ( ID int, Name varchar ( 20), Salary numeric (8,2), goodboy smallint ); Example: create table p( ID int, Name varchar (4) not null, DOB varchar (10), dog_id int, primary key (ID, Name ), foreign key ( dog_id ) references dog (ID) ); CRUD insert into r values (v1,v2,vn); insert into dog values (1, cat,10.00,0) ; delete from dog ; drop table r; alter table dog add iscat int not null default 0; alter table dog drop iscat ; 2

Select select a1,a2,a3,an from r1,r2,rn where predicate ; select name from dog ; select distinct name from dog ; select all name from dog ; select joe as name ; select name, lab as breed from dog ; select * from dog ; Explicitly specify what attributes to return, SELECT * should be avoided with the exception of manually viewing data. select id, name, salary as pay from dog where salary > 80000; The where keyword allows for conditions by which we can filter the returned rows. We can specify modifications to the returned attribute and use boolean conditions to create compound conditionals. select id, name, salary /26 as pay from dog where salary > 80000 and breed = lab ; The between modifier allows us to select a range as a predicate. select * from dog where age between 6 and 9; Selecting from multiple tables using SELECT * returns a Cartesian product. select * from dog, cat ; This is not particularly useful, it is much more useful to use a predicate to match the data. select * from dog, cat where dog. name = cat. name and dog. age = 4; select * from dog as D, cat as C where D. age = C. age ; Examples table dog name owner Charlie Snoopy Woodstock Snoopy Lucy Ricky Snoopy Sam Finding all the owners with dogs named Woodstock: select d2. owner from dog where name = woodstock ; Finding all the dogs owned by Snoopy: 3

select name from dog where owner = snoopy ; Selecting the owner s owner of Charlie: select owner from dog d1, dog d2 where d1. name = charlie and d1. owner = d2. name ; String Operations The percent sign (%) is used for partial substring matches while the underscore ( ) is used for single character matches. select * from dog where name like % py ; select * from dog where name like Sn\ escape \ ; Concatenation is expressed with double pipes ( ). select * from fname lname like % foo ; Ordering The order by keyword allows for queries to be sorted. Specifying more than one sort option allows for sorting by multiple fields. select name, lname from dog order by name asc ; select name, lname from dog order by name, lname desc ; select name, lname from dog order by name desc, lname ; Set Operations Set operation keywords are valid in SQL. ( select name from dog ) union ( select name from cat ); ( select name from dog ) intersect ( select name from cat ); ( select name from dog ) except ( select name from cat ); Null Values select field from table where age is null Boolean evaluation: 1. True or null evaluates to true 2. False or null evaluates to null 4

3. Null or null evaluates to null 4. True and null evaluates to null 5. False and null evaluates to false 6. Null and null evaluates to null Aggregate Functions There are 4 1 aggregate functions: avg, min, max, sum, count. Count does not 2 entirely count as an aggregate function. We can use this to count the number of non-null rows: select count (*) from penguin ; Because no rows can be non-null, this just returns the number of rows in your database. We can also select the number of non-null values for a field: select count ( age ) from penguin ; To count distinct values in a field: select count ( distinct name ) from dog ; The other aggregate functions can be used in a similar way: select avg ( salary ) from dog where state = CA ; If there are no dogs, then 0 is returned. If some dogs have a null salary, then null is returned. select avg ( salary ) as avgs, state from dog group by state ; select avg ( salary ) as avgs, state, name from dog group by state, name ; Results like this can be filtered with the having clause: select avg ( salary ) as avgs, state from dog group by state having avgs > 5; Subqueries Subquery expressions can be used for set membership, comparisons, and cardinality expressions. 5

select name, age from dog where age in ( select age from cat where state = sleeping ); select name, age from dog where ( age, legs ) in ( select age, legs from cat where state = sleeping ); select name from dog where age > some ( select age from cat where state = CA ); select name from dog where age > all ( select age from cat where state = CA ); select name from dog where age > ( select min ( age ) from cat where state = CA ); Because nested subqueries can get very messy, we use the with clause as a way of defining a temporary relation that only exists in the clause in which the with clause occurs. with dept_ total ( dept_ name, value ) as ( select dept_ name, sum ( salary ) from instructor group by dept_name ), dept_total_avg ( value ) as ( select avg ( value ) from dept_total ) select dept_ name from dept_ total, dept_ total_ avg where dept_ total. value > dept_ total_ avg. value ; Scalar subqueries must be used where only a single value is expected. select dept_ name, ( select count (*) from instructor where department. dept_ name = instructor. dept_ name ) as num_ instructors from department ; Database Modification Deleting records (also accepts a valid where clause): delete from instructor ; delete from instructor where dept_ name = Finance ; delete from instructor where dept_ name in ( select dept_ name from department where building = Watson ); delete from instructor where salary < ( select avg ( salary ) from instructor ); Records can be inserted into databases: insert into course ( course_ id, title, dept_ name, credits ) values ( CS -437, Database Systems, Comp Sci, 4); insert into course values ( CS -437, Database Systems, Comp Sci, 4); 6

insert into student select ID, name, dept_ name, 0 from instructor ; Records can be updated in databases: update instructor set salary = salary * 1. 03 where salary > 100000; update instructor set salary = salary * 1. 05 where salary <= 100000; As an aside, since we have no guarantee to the order of execution of the previous statements, some instructors might get more than one raise. We can use the case statement to solve this. update instructor set salary = case when salary <= 100000 then salary * 1. 05 else salary * 1. 03 end Updates can also take scalar subqueries: update student S set total_ credits = ( select sum ( credits ) from takes, course where takes. course_ id = course. course_ id and S.ID = takes.id and takes. grade <> F and takes. grade is not null ); Database Joins Join types: inner left outer right outer full outer Conditions: natural on <predicate>, using (attributes) 7

Natural joins match on all attributes with the same name in both tables. Inner joins only select rows on the matching attribute (essentially enforcing that no row in the output has null values) while full outer joins preserve all information in both tables. Dog Name Age A 5 B 10 C 7 Cat ID Name State 1 C VT 2 D IA select * from dog natural left outer join cat ; Name Age ID State A 5 NULL NULL B 10 NULL NULL C 7 1 VT select * dog natural right outer join cat ; Name Age ID State C 7 1 VT D NULL 2 IA select * from dog natural full outer join cat ; Name Age ID State A 5 NULL NULL B 10 NULL NULL C 7 1 VT D NULL 2 IA select * from dog inner join cat on cat. name = dog. name ; Name Age ID State C 7 1 VT Suppose we have two cats with the same name: 8

Dog Name Age A 5 B 10 C 7 Cat ID Name State 1 C VT 2 D IA 3 C MI select * from dog natural full outer join cat ; Views Name Age ID State A 5 NULL NULL B 10 NULL NULL C 7 1 VT D NULL 2 IA C 7 3 MI Views are used to restrict access to attributes or rows. It represents a virtual table. create view old_ dogs as ( select name, age from dog where age > 10) ; Executing any query on old dogs is equivalent to using the view as a subquery. This is generally used to restrict access to the table, prevent users from querying all dogs or querying certain attributes of dogs. We can also insert into views. insert into old_dogs values ( Jeff, 13) values ( Cheese, 8); While this might work, you will never see Cheese by querying the view because the view restricts access to dogs whose age is over 10. create view dogs2 as ( select name from dog, cat where dog. lives_in = cat. state ); Inserting into this view would lead to ambiguity, so in general, there are certain restrictions for inserting into views. 1. Only a single table is present in the from clause. 2. No aggregations, distinct, etc. 3. No group by or having clauses. 4. Any attribute not in the selection can be set to null. There exist the concept of materialized views, which are physical tables that contain the results of a view. It is a cache of data used for long running queries that needs to be updated. 9

Transactions In most flavors of SQL, transactions begin with most statements and usually end with a commit or rollback. You can specify for a group of statements to be atomic however. begin atomic ; select * from dog ; insert into cat... update turtle... commit ; Integrity Constraints Integrity contraints for a single table include the following: not null unique primary key: not null, unique, indexed check <predicate> create table dog ( id int primary key, name varchar ( 30) not null, salary int, check ( salary > 5), check ( age < 10), check ( breed in ( Lab, Dalmatian, GSD )) ) Integrity constraints can apply across multiple tables. foreign keys to ensure some condition. create table dog ( id int primary key, name varchar ( 20) not null, foreign key ( owner ) references person ( id) on update cascade ) Referential integrity uses If the referenced person is deleted or updated in database, we specify how the database handles the constraint. on update cascade makes any changes from the 10

person table cascade into the dog table to update its value. Alternatively, we can specify on delete set null to set the value to null if the referenced person is deleted. Another option is to specify set default 0 to specify a default value if the referenced person is deleted. Complex check statements can also be used like this, but it is not usually recommended since there are better options. create table dog ( check owner_ id in ( select id from person ) ) Additional Data Types Date - yyyy-mm-dd Time - HH:MM:SS Timestamp - Date and time Interval - the resulting of subtracting a date/time related data type Blob - binary large object (cannot be indexed) Clob - character large object (cannot be indexed) Blobs and clobs are referenced by a pointer stored in the database. Because they are not indexed, relevant metadata needs to be stored as well. Indices Suppose have a dog table with the attributes id, name, age, and state: create index dog_ index on dog ( state, age ); Creating an index affects the structure of how the data is stored on disk to allow efficient access. Queries that use an index must use all the attributes that the index applies to. An index creates a table of all the distinct values for an attribute currently in the table. Each entry in the index holds a list of pointers to all matching entries in the table. The index does not have to be specified in order for it to be used to optimize a query. After it is created, the database will use automatically for any matching predicates. 11

Functions/Procedures We can define functions and procedures according to the SQL-1999 standard. The syntax for this will often be vendor-specific. create function dog_ count ( state varchar (2) ) returns integer begin declare dc integer ; select count (*) into dc from dog where dog. state = state ; return dc; end After defining this function, we can use it in future queries: select name, age from cat where dog_ count ( OR ) > 5; Subqueries can be also generally be used as arguments in functions. All functions must have a begin and end clause, a return type specified, and the actual value returned. Tables can also be returned as of SQL-2003: create function dogs ( st varchar (2) ) returns table ( id int, name varchar ( 10), breed varchar (5) ) begin return table ( select id, name, breed from dog where dog. state = state ) end Procedures can also be used as function like constructs: create procedure dog_ count ( state varchar (2), out dcount integer ) begin select count (*) into dcount from dog where dog. state = state ; end This could be used in subsequent SQL statements: declare dc int ; call dog_count ( OR, dc); Iteration SQL-1999 also supports iteration using while: while boolean_ exp do... end while and also repeat: 12

repeat... until expression end repeat Another option available is for loops: declare r integer default 0; for r as select age from dog ; do set n = n + r. age end for Triggers Triggers are statements that happen automatically upon an event. A condition to trigger upon and a subsequent action must be specified. Like functions and procedures, the syntax here will likely be vendor-specific. Triggers can be set on insert, update, and delete events. create trigger trigger1 setnull before update of dog referencing new row as nrow for each row when ( nrow. age > 15) begin atomic set nrow. age = null ; end ; create trigger myt after update of dog referencing new row as nrow for each row when ( nrow. age > 30) begin atomic set nrow. age = null end ; create trigger doginfo_ trigger after update of dog on ( age ) referencing new row as nrow referencing old row as orow for each row when ( nrow. breed <> GSD and nrow. breed is not null ) and ( orow. breed = GSD or orow. breed is null ) begin atomic update cat set cat. age = cat. age + ( select age from dog where cat. id = nrow. id) where cat. breed = orow. breed end ; Triggers are generally used to accumulate summary data, perform data replication, and set default values (although these are all really bad reasons to use them). They 13

have issues with cascading runs and their best use is probably with audit data to record when changes are made in a database. You can find all my notes at http://omgimanerd.tech/notes. If you have any questions, comments, or concerns, please contact me at alvin@omgimanerd.tech 14