Programming and Database Fundamentals for Data Scientists

Similar documents
Information Systems for Engineers. Exercise 10. ETH Zurich, Fall Semester Hand-out Due

Relational Algebra. Spring 2012 Instructor: Hassan Khosravi

Information Systems Engineering. SQL Structured Query Language DDL Data Definition (sub)language

Relational Model, Key Constraints

Introduction to SQL on GRAHAM ED ARMSTRONG SHARCNET AUGUST 2018

Basic SQL. Basic SQL. Basic SQL

SQL: Concepts. Todd Bacastow IST 210: Organization of Data 2/17/ IST 210

SQL: Data Definition Language

AC61/AT61 DATABASE MANAGEMENT SYSTEMS DEC 2013

High-Level Database Models. Spring 2011 Instructor: Hassan Khosravi

Relational Model. CSE462 Database Concepts. Demian Lessa. Department of Computer Science and Engineering State University of New York, Buffalo

Set Operations, Union

Lab # 2 Hands-On. DDL Basic SQL Statements Institute of Computer Science, University of Tartu, Estonia

Basic SQL. Dr Fawaz Alarfaj. ACKNOWLEDGEMENT Slides are adopted from: Elmasri & Navathe, Fundamentals of Database Systems MySQL Documentation

SQL Functionality SQL. Creating Relation Schemas. Creating Relation Schemas

Databases - Relations in Databases. (N Spadaccini 2010) Relations in Databases 1 / 16

1 INTRODUCTION TO EASIK 2 TABLE OF CONTENTS

Niklas Fors The Relational Data Model 1 / 17

Data Types in MySQL CSCU9Q5. MySQL. Data Types. Consequences of Data Types. Common Data Types. Storage size Character String Date and Time.

Database Systems. Basics of the Relational Data Model

BASIC SQL CHAPTER 4 (6/E) CHAPTER 8 (5/E)

Database Modifications and Transactions

SQL Data Definition and Data Manipulation Languages (DDL and DML)

1. Given the name of a movie studio, find the net worth of its president.

BASIC SQL CHAPTER 4 (6/E) CHAPTER 8 (5/E)

The M in LAMP: MySQL CSCI 470: Web Science Keith Vertanen Copyright 2014

Database Systems CSE 303. Lecture 02

Database Systems CSE 303. Lecture 02

SQL DATA DEFINITION LANGUAGE

Joins, NULL, and Aggregation

SQL DATA DEFINITION LANGUAGE

CS W Introduction to Databases Spring Computer Science Department Columbia University

SQL Overview. CSCE 315, Fall 2017 Project 1, Part 3. Slides adapted from those used by Jeffrey Ullman, via Jennifer Welch

Midterm 1 157A Fall /22

Data Definition Language (DDL), Views and Indexes Instructor: Shel Finkelstein

Databases-1 Lecture-01. Introduction, Relational Algebra

user specifies what is wanted, not how to find it

Department of Computer Science University of Cyprus. EPL342 Databases. Lab 2

SQL queries II. Set operations and joins

SQL Fundamentals. Chapter 3. Class 03: SQL Fundamentals 1

Data Storage and Query Answering. Data Storage and Disk Structure (4)

Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa

TINYINT[(M)] [UNSIGNED] [ZEROFILL] A very small integer. The signed range is -128 to 127. The unsigned range is 0 to 255.

SQL DATA DEFINITION LANGUAGE

SQL: Data De ni on. B0B36DBS, BD6B36DBS: Database Systems. h p:// Lecture 3

EGCI 321: Database Systems. Dr. Tanasanee Phienthrakul

Creating Tables, Defining Constraints. Rose-Hulman Institute of Technology Curt Clifton

Simple Quesries in SQL & Table Creation and Data Manipulation

Constraints. Primary Key Foreign Key General table constraints Domain constraints Assertions Triggers. John Edgar 2

SQL. Often times, in order for us to build the most functional website we can, we depend on a database to store information.

Basis Data Terapan. Yoannita

Data Definition and Data Manipulation. Lecture 5: SQL s Data Definition Language CS1106/CS5021/CS6503 Introduction to Relational Databases

Lecture 5: SQL s Data Definition Language

Information Systems for Engineers Fall Data Definition with SQL

Data and Tables. Bok, Jong Soon

Chapter 4. Basic SQL. Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley

SQL OVERVIEW. CS121: Relational Databases Fall 2017 Lecture 4

MySQL: an application

Database Programming with PL/SQL

Information Systems Engineering. SQL Structured Query Language DML Data Manipulation (sub)language

Operating systems fundamentals - B07

Database Systems. Bence Molnár

Chapter 8 How to work with data types

Lab 4: Tables and Constraints

Organization of Records in Blocks

THE AUSTRALIAN NATIONAL UNIVERSITY. Mid-Semester Examination August 2006 RELATIONAL DATABASES (COMP2400)

COGS 121 HCI Programming Studio. Week 03 - Tech Lecture

INFORMATION TECHNOLOGY NOTES

More MySQL ELEVEN Walkthrough examples Walkthrough 1: Bulk loading SESSION

Department of Computer Science University of Cyprus. EPL342 Databases. Lab 1. Introduction to SQL Server Panayiotis Andreou

Algebraic laws extensions to relational algebra

SQL Data Definition Language: Create and Change the Database Ray Lockwood

SQL Introduction. CS 377: Database Systems

MTAT Introduction to Databases

Optimization of logical query plans Eliminating redundant joins

Informatics 1: Data & Analysis

Chapter 3 Introduction to relational databases and MySQL

Exam I Computer Science 420 Dr. St. John Lehman College City University of New York 12 March 2002

SRM UNIVERSITY FACULTY OF ENGINEERING AND TECHNOLOGY SCHOOL OF COMPUTING DEPARTMENT OF CSE COURSE PLAN

Flexible Data Modeling in MariaDB

Introduction to SQL Server 2005/2008 and Transact SQL

Lab # 4. Data Definition Language (DDL)

Tool/Web URL Features phpmyadmin. More on phpmyadmin under User Intefaces. MySQL Query Browser

Chapter 2 The relational Model of data. Relational model introduction

Logical database design

CSCI3030U Database Models

Representing Data Elements

The Relational Model of Data (ii)

Database Systems Architecture. Stijn Vansummeren

Informatics 1: Data & Analysis

COMP102: Introduction to Databases, 5 & 6

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Documentation Accessibility. Access to Oracle Support. Supported Browsers

Embedded SQL. csc343, Introduction to Databases Diane Horton with examples from Ullman and Widom Fall 2014

CSCI-UA: Database Design & Web Implementation. Professor Evan Sandhaus

Full file at

Relational Model and Relational Algebra A Short Review Class Notes - CS582-01

DS Introduction to SQL Part 1 Single-Table Queries. By Michael Hahsler based on slides for CS145 Introduction to Databases (Stanford)

IBM DB2 UDB V7.1 Family Fundamentals.

Query Processing and Optimization

Transcription:

Programming and Database Fundamentals for Data Scientists Database Fundamentals Varun Chandola School of Engineering and Applied Sciences State University of New York at Buffalo Buffalo, NY, USA chandola@buffalo.edu Chandola@UB EAS 503 1 / 22

Outline Overview Data Model Schemas SQL Basics Chandola@UB EAS 503 2 / 22

Overview Design of databases Entity Relationship Model Chapters 2, 4 (until section 4.5) Database programming SQL Chapters 6,7, and 8 SQL in a server environment Embedding SQL in Python Chapter 9 (partly) Book Database Systems, The Complete Book (2nd Ed.), Hector Garcia-Molina, Jeffrey D. Ullman, Jennifer Widom (2009), Prentice Hall. Chandola@UB EAS 503 3 / 22

What is a Data Model? Mathematical representation of data Examples: Relational, Semi-structured, Hierarchical, Network Operations on data Constraints Chandola@UB EAS 503 4 / 22

Relational Model - A Relation is a Table Data arranged as rows in a table, each row has related information about one data entity Consider the following relation (or table) Movies title year length genre Gone with the wind 1939 231 drama Star wars 1977 124 scifi Wayne s world 1992 95 comedy Attributes (column headers) Tuples (rows) Relation name (movies) Chandola@UB EAS 503 5 / 22

Schemas Relation schema Relation name and attribute list Type of each attribute E.g., title - String, year - Integer, etc. A Database is a collection of relations (tables). The collection of all relation schemas in the database is the database schema Chandola@UB EAS 503 6 / 22

Why Relational Model Most popular - simple and limited Allows for highly efficient implementations to operate on the data Allows implement high-level languages, such as SQL Chandola@UB EAS 503 7 / 22

Relational Model Basics Relation Attributes Tuples Schemas Domains Relation instance Relation keys title year length genre Gone with the wind 1939 231 drama Star wars 1977 124 scifi Wayne s world 1992 95 comedy Chandola@UB EAS 503 8 / 22

A Simple Database Movies Movies title: string, year: int, length: int, genre: string, studioname: string, producercertificatenum: int Studio name: string, address: string presidentcertificatenum: int StarsIn movietitle: string, movieyear: int, starname: string MovieStar name: string, birthdate: date, address: string, gender: string MovieExecutive name: string, certificatenum: int, address: string, networth: int Chandola@UB EAS 503 9 / 22

Starting with SQL Structure Query Language or SQL is the language to interact with a relational database management system Has two uses 1. Data definition creating database schemas, etc. 2. Data manipulation querying, modifying database tables Chandola@UB EAS 503 10 / 22

Creating a Database CREATE DATABASE moviedb; Chandola@UB EAS 503 11 / 22

Creating/Deleting Tables create movies table CREATE TABLE movies ( title VARCHAR(128) NOT NULL, year INT, length INT, studioname VARCHAR(128), executivenumber INT ); deleting a table DROP TABLE movies; Chandola@UB EAS 503 12 / 22

SQL Types - Numeric Type Storage Min Max Signed/Unsigned Signed/Unsigned TINYINT 1-128 127 0 255 SMALLINT 2-2 15 2 15 1 0 2 1 6 1 MEDIUMINT 3 2 23 2 23 1 0 2 24 1 INT 4 2 31 2 31 1 0 2 32 1 BIGINT 8 2 63 2 63 1 0 2 64 1 Chandola@UB EAS 503 13 / 22

SQL Types - Optional Width Argument One can optionally set the display width - INT(4) Chandola@UB EAS 503 14 / 22

SQL Types - Floating Points FLOAT and DOUBLE keywords to specify fields with single and double precision values, respectively. Chandola@UB EAS 503 15 / 22

SQL Types - Date and Time Types DATE - Only date and no time. DATE values are displayed as YYYY-MM-DD The supported range is 1000-01-01 to 9999-12-31. DATETIME - Both date and time DATETIME values are displayed as YYYY-MM-DD HH:MM:SS The supported range is 1000-01-01 00:00:00 to 9999-12-31 23:59:59. TIMESTAMP - Full timestamp (stored as UTC but displayed using current time zone) TIMESTAMP values are displayed as YYYY-MM-DD HH:MM:SS The supported range is 1970-01-01 00:00:01 UTC to 2038-01-19 03:14:07 UTC. Chandola@UB EAS 503 16 / 22

SQL Types - Text CHAR and VARCHAR are declared with a length specifying the maximumum length string that can be stored in that field Difference between the two Maximum length for a CHAR field is 255 bytes, while maximum length for a VARCHAR field is 65,535 bytes CHAR(4) will always use 4 bytes (shorter strings are padded with empty space) VARCHAR(4) will use one byte to store the length of the stored string but only use the exact length Example: The string ab will use 4 bytes when stored as CHAR(4) and 3 bytes when stored as VARCHAR(4) Example: The string abcd will use 4 bytes when stored as CHAR(4) and 5 bytes when stored as VARCHAR(4) Chandola@UB EAS 503 17 / 22

SQL Types - More Text BINARY and VARBINARY For binary data (length specified as number of bytes) TEXT and BLOB For very long strings and binary data, respectively Chandola@UB EAS 503 18 / 22

Modifying Schema DROP already seen Adding or deleting columns add a new column to an existing table ALTER TABLE movies ADD genre VARCHAR(16); change type of an existing column ALTER TABLE movies MODIFY genre VARCHAR(32); delete an existing column from a table ALTER TABLE movies DROP genre; Chandola@UB EAS 503 19 / 22

Defining Keys An attribute or list of attributes (say S) may be declared PRIMARY KEY or UNIQUE. For UNIQUE, two tuples cannot agree on all of the attributes in S, unless the values for S in one of the tuple is NULL For PRIMARY KEY, attributes in S are not allowed to have NULL as a value for their components Chandola@UB EAS 503 20 / 22

Example create movies table with UNIQUE attribute CREATE TABLE movies ( title VARCHAR(128) UNIQUE, year INT, length INT, studioname VARCHAR(128), executivenumber INT ); create movies table with primary key as ( title, year) CREATE TABLE movies ( title VARCHAR(128), year INT, length INT, studioname VARCHAR(128), executivenumber INT, PRIMARY KEY (title,year) ); Chandola@UB EAS 503 21 / 22

References Chandola@UB EAS 503 22 / 22