INF 212 ANALYSIS OF PROG. LANGS SQL AND SPREADSHEETS. Instructors: Crista Lopes Copyright Instructors.

Similar documents
Introduction to SQL Part 1 By Michael Hahsler based on slides for CS145 Introduction to Databases (Stanford)

DS Introduction to SQL Part 2 Multi-table Queries. By Michael Hahsler based on slides for CS145 Introduction to Databases (Stanford)

Lecture 3: SQL Part II

Lecture 2: Introduction to SQL

DS Introduction to SQL Part 1 Single-Table Queries. By Michael Hahsler based on slides for CS145 Introduction to Databases (Stanford)

Big Data Processing Technologies. Chentao Wu Associate Professor Dept. of Computer Science and Engineering

Introduction to Database Systems CSE 444

Before we talk about Big Data. Lets talk about not-so-big data

Database Systems CSE 303. Lecture 02

Database Systems CSE 303. Lecture 02

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Fall 2006 Lecture 3 - Relational Model

Introduction to Database Systems CSE 414. Lecture 3: SQL Basics

SQL. Assit.Prof Dr. Anantakul Intarapadung

Lecture 3 Additional Slides. CSE 344, Winter 2014 Sudeepa Roy

Data Informatics. Seon Ho Kim, Ph.D.

CSE 544 Principles of Database Management Systems

Introduction to Databases CSE 414. Lecture 2: Data Models

Polls on Piazza. Open for 2 days Outline today: Next time: "witnesses" (traditionally students find this topic the most difficult)

Introduction to Database Systems CSE 414. Lecture 3: SQL Basics

CSE 544 Principles of Database Management Systems

Principles of Database Systems CSE 544. Lecture #2 SQL The Complete Story

Agenda. Discussion. Database/Relation/Tuple. Schema. Instance. CSE 444: Database Internals. Review Relational Model

SQLite, Firefox, and our small IMDB movie database. CS3200 Database design (sp18 s2) Version 1/17/2018

SQL. Structured Query Language

CSE 344 JANUARY 8 TH SQLITE AND JOINS

CSE 344 JANUARY 5 TH INTRO TO THE RELATIONAL DATABASE

Lectures 2&3: Introduction to SQL

Introduction to Relational Databases

The Relational Data Model

Introduction to Data Management CSE 414

L07: SQL: Advanced & Practice. CS3200 Database design (sp18 s2) 1/11/2018

Announcements. Agenda. Database/Relation/Tuple. Discussion. Schema. CSE 444: Database Internals. Room change: Lab 1 part 1 is due on Monday

Announcements. Agenda. Database/Relation/Tuple. Schema. Discussion. CSE 444: Database Internals

COMP 430 Intro. to Database Systems

Principles of Database Systems CSE 544. Lecture #1 Introduction and SQL

Principles of Database Systems CSE 544. Lecture #1 Introduction and SQL

Database Systems CSE 414

Introduction to Database Systems. The Relational Data Model

Introduction to Database Systems. The Relational Data Model. Werner Nutt

Introduction to Data Management CSE 344

CS 564 Midterm Review

MTAT Introduction to Databases

CPSC 504 Background (aka, all you need to know about databases for this course in two lectures) Rachel Pottinger January 8 and 12, 2015

COMP 430 Intro. to Database Systems

Announcements. Multi-column Keys. Multi-column Keys (3) Multi-column Keys. Multi-column Keys (2) Introduction to Data Management CSE 414

Announcements. Multi-column Keys. Multi-column Keys. Multi-column Keys (3) Multi-column Keys (2) Introduction to Data Management CSE 414

Database Management Concepts I

Introduction to Data Management CSE 344. Lecture 2: Data Models

CSC 261/461 Database Systems Lecture 7

Administrative notes. Levels of Abstraction. Overview of the next two classes. Schema and Instances. Conceptual Database Design

CSC 261/461 Database Systems Lecture 6. Spring 2017 MW 3:25 pm 4:40 pm January 18 May 3 Dewey 1101

Introduction to Data Management CSE 344

CS317 File and Database Systems

Principles of Database Systems CSE 544p. Lecture #1 September 28, 2011

Announcements. Using Electronics in Class. Review. Staff Instructor: Alvin Cheung Office hour on Wednesdays, 1-2pm. Class Overview

Database design and implementation CMPSCI 645. Lecture 04: SQL and Datalog

CMPT 354: Database System I. Lecture 3. SQL Basics

Part I: Structured Data

Review for Exam 1 CS474 (Norton)

Lecture 4. Lecture 4: The E/R Model

Data, Databases, and DBMSs

Database Languages. A DBMS provides two types of languages: Language for accessing & manipulating the data. Language for defining a database schema

Announcements. Outline UNIQUE. (Inner) joins. (Inner) Joins. Database Systems CSE 414. WQ1 is posted to gradebook double check scores

Lecture 04: SQL. Monday, April 2, 2007

Oracle Database 10g: Introduction to SQL

Lecture 5. Lecture 5: The E/R Model

ADVANCED DATABASES ; Spring 2015 Prof. Sang-goo Lee (11:00pm: Mon & Wed: Room ) Advanced DB Copyright by S.-g.

Database Technology Introduction. Heiko Paulheim

Introduction. Example Databases

Informatics 1: Data & Analysis

CS425 Fall 2016 Boris Glavic Chapter 1: Introduction

Section 2.2: Relational Databases

Further GroupBy & Extend Operations

Learn Well Technocraft

Chapter 1: Introduction

Unit I. By Prof.Sushila Aghav MIT

CS143: Relational Model

Keys, SQL, and Views CMPSCI 645

CMPT 354: Database System I. Lecture 2. Relational Model

Certification Exam Preparation Seminar: Oracle Database SQL

Administration Naive DBMS CMPT 454 Topics. John Edgar 2

Lecture 04: SQL. Wednesday, October 4, 2006

Querying Data with Transact SQL

SELECT Product.name, Purchase.store FROM Product JOIN Purchase ON Product.name = Purchase.prodName

CSC 261/461 Database Systems Lecture 5. Fall 2017

Please pick up your name card

L12: ER modeling 5. CS3200 Database design (sp18 s2) 2/22/2018

The Entity-Relationship Model (ER Model) - Part 1

B.H.GARDI COLLEGE OF MASTER OF COMPUTER APPLICATION. Ch. 1 :- Introduction Database Management System - 1

Writing Analytical Queries for Business Intelligence

8) A top-to-bottom relationship among the items in a database is established by a

CS

Overview of Data Management

Informatics 1: Data & Analysis

Database Lab Lab 6 DML part 3

Course Outline. Querying Data with Transact-SQL Course 20761B: 5 days Instructor Led

Introduction to Database Systems CSE 414

Querying Data with Transact-SQL

What is Data? ANSI definition: Volatile vs. persistent data. Data. Our concern is primarily with persistent data

What is Data? Volatile vs. persistent data Our concern is primarily with persistent data

Transcription:

INF 212 ANALYSIS OF PROG. LANGS SQL AND SPREADSHEETS Instructors: Crista Lopes Copyright Instructors.

Data-Centric Programming Focus on data Interactive data: SQL Dataflow: Spreadsheets Dataflow: Iterators, generators, coroutines

SQL Standard Query Language

History Data banks since 1950s Disks (direct access storage) in 1960s How to store and retrieve data from disk Efficiently, cleanly Before 1970: Hierarchical models (trees) Network models (graphs) E. Codd, 1970: Relational model

Relational model Logic deductive system minus deductions Data independence isolate applications from data representations Data inconsistency Relation as in Mathematics: Given sets S 1, S 2,... S n : R is a relation on these sets iff R = {{e 1, e 2,..., e n },...} where e i S i R S 1 x S 2 x... xs n (R is a subset of the Cartesian product)

Relations Each column represents a domain Ordering of columns is important order of the domains of R Each row represents an n-tuple of R Ordering of rows is immaterial All rows are distinct S 1 S 2 S 3 S 4 S 5 S 6

Math: Relations vs. Functions Relation >> Function Function = relation where each element of the domain corresponds to one element of the range Int Int -2 3 5 3 7 3 12 3 Relation? Function? Int Int -2 3-2 0 7 3 12 1 Relation? Function?

Relations in data bases Relations define subsets of the domain: R S 1 x S 2 x... xs n supply (supplier part project quantity) 1 2 5 17 1 3 5 23 2 3 7 9 2 7 5 4 4 1 1 12 Supply is a relation (subset) from supplier x part x project x quantity supplier x part x project x quantity And supplier, part, project, quantity all subsets of Int

Relations in data bases Relations may include repeated domains component (part part quantity) 1 5 9 2 5 7 3 5 2 2 6 12 3 6 3 Part 1 is a subpart of part 5, and there needs to be 9 part 1s to make a part 5 Component is a relation (subset) from part x part x quantity part x part x quantity And part, part, project, quantity all subsets of Int

Relationships A relationship is an equivalence class of those relations that are equivalent under permutation of domains Order not important Same domains are distinguished by role names User-facing model component (subpart part quantity) 1 5 9 2 5 7 3 5 2 2 6 12 3 6 3

Cross-references Elements of a relation can cross-reference elements of the same or another relation Done via Keys Student (id name) 1 John Smith 2 Alice 3 Bob Log (id action student_id) 1 upload 2 2 upload 3 3 delete 2

Operations on relations Permutation Interchanging columns yields converse relations Subsetting Selecting only a subset of tuples Projection Selection of only a subset of columns Join Merging two or more relations without loss of information

Relational Model SQL Data Definition Language (DDL) Create/alter/delete tables and their attributes Data Manipulation Language (DML) Query one or more tables Insert/delete/modify tuples in tables

Subsetting Product PName Price Category Manufacturer Gizmo $19.99 Gadgets GizmoWorks Powergizmo $29.99 Gadgets GizmoWorks SingleTouch $149.99 Photography Canon MultiTouch $203.99 Household Hitachi SELECT * FROM Product WHERE category= Gadgets selection PName Price Category Manufacturer Gizmo $19.99 Gadgets GizmoWorks Powergizmo $29.99 Gadgets GizmoWorks

Projection+Subsetting Product PName Price Category Manufacturer Gizmo $19.99 Gadgets GizmoWorks Powergizmo $29.99 Gadgets GizmoWorks SingleTouch $149.99 Photography Canon MultiTouch $203.99 Household Hitachi SELECT PName, Price, Manufacturer FROM Product WHERE Price > 100 selection and projection PName Price Manufacturer SingleTouch $149.99 Canon MultiTouch $203.99 Hitachi

Joins Product (pname, price, category, manufacturer) Company (cname, stockprice, country) Find all products under $200 manufactured in Japan; return their names and prices. SELECT PName, Price FROM Product, Company WHERE Manufacturer=CName AND Country= Japan AND Price <= 200 Join between Product and Company

Joins Product PName Price Category Manufacturer Gizmo $19.99 Gadgets GizmoWorks Powergizmo $29.99 Gadgets GizmoWorks SingleTouch $149.99 Photography Canon MultiTouch $203.99 Household Hitachi Company Cname StockPrice Country GizmoWorks 25 USA Canon 65 Japan Hitachi 15 Japan SELECT PName, Price FROM Product, Company WHERE Manufacturer=CName AND Country= Japan AND Price <= 200 PName Price SingleTouch $149.99

Full SQL Very powerful query language Ordering, Grouping, aggregation, rich type system,... Declarative Say what you want, not how you want it to happen Nothing related to query processing or internal data representations

Spreadsheets

Spreadsheets

Spreadsheets One of the most successful software genres Centuries-old accounting practices... Some cells contain primitive values Some cells contain values derived from formulas...with computers Automatic update of derived values when primitive values change Dataflow programming

In Python: Columns = 2-part lists: data formula All formulas run on updates

In OOP Columns = Objects with 2 parts, data and formula Formulas = Objects with method execute Map function: applies a given function to one or more list of values Check for equivalents in C++ (Boost maybe?), C# (Select) Not hard to do by hand: iterate Homework: no ugly code! Think carefully. Model it nicely. Use the right words. Use ADTs.