INTRODUCTION TO DATABASES IN PYTHON. Filtering and Targeting Data

Similar documents
CSC Web Programming. Introduction to SQL

IMPORTING DATA IN PYTHON I. Introduction to relational databases

INTRODUCTION TO DATA VISUALIZATION WITH PYTHON. Visualizing Regressions

HKTA TANG HIN MEMORIAL SECONDARY SCHOOL SECONDARY 3 COMPUTER LITERACY. Name: ( ) Class: Date: Databases and Microsoft Access

12. MS Access Tables, Relationships, and Queries

Implementing Table Operations Using Structured Query Language (SQL) Using Multiple Operations. SQL: Structured Query Language

Chapter 1 : Informatics Practices. Class XII ( As per CBSE Board) Advance operations on dataframes (pivoting, sorting & aggregation/descriptive

Data Manipulation Language (DML)

Rows and Range, Preceding and Following

Database Systems: Design, Implementation, and Management Tenth Edition. Chapter 7 Introduction to Structured Query Language (SQL)

Computing for Medicine (C4M) Seminar 3: Databases. Michelle Craig Associate Professor, Teaching Stream

Chapter # 7 Introduction to Structured Query Language (SQL) Part II

COMP 244 DATABASE CONCEPTS & APPLICATIONS

NCSS: Databases and SQL

Thomas Vincent Head of Data Science, Getty Images

STOP DROWNING IN DATA. START MAKING SENSE! An Introduction To SQLite Databases. (Data for this tutorial at

Based on the following Table(s), Write down the queries as indicated: 1. Write an SQL query to insert a new row in table Dept with values: 4, Prog, MO

Language. f SQL. Larry Rockoff COURSE TECHNOLOGY. Kingdom United States. Course Technology PTR. A part ofcenqaqe Learninq

SQL: Data Querying. B0B36DBS, BD6B36DBS: Database Systems. h p:// Lecture 4

SQL functions fit into two broad categories: Data definition language Data manipulation language

SQL Data Querying and Views

tablename ORDER BY column ASC tablename ORDER BY column DESC sortingorder, } The WHERE and ORDER BY clauses can be combined in one

STIDistrict Query (Basic)

Full file at

ASSIGNMENT NO Computer System with Open Source Operating System. 2. Mysql

Jarek Szlichta

Visual C# 2012 How to Program by Pe ars on Ed uc ati on, Inc. All Ri ght s Re ser ve d.

ARTIFICIAL INTELLIGENCE AND PYTHON

CS 582 Database Management Systems II

DSC 201: Data Analysis & Visualization

DSC 201: Data Analysis & Visualization

CGS 3066: Spring 2017 SQL Reference

Structure Query Language (SQL)

2) SQL includes a data definition language, a data manipulation language, and SQL/Persistent stored modules. Answer: TRUE Diff: 2 Page Ref: 36

CSCI/CMPE Object-Oriented Programming in Java JDBC. Dongchul Kim. Department of Computer Science University of Texas Rio Grande Valley

OVERVIEW OF RELATIONAL DATABASES: KEYS

Chapter 16: Databases

XQ: An XML Query Language Language Reference Manual

SQL Data Query Language

Unit Assessment Guide

MANAGING DATA(BASES) USING SQL (NON-PROCEDURAL SQL, X401.9)

You can write a command to retrieve specified columns and all rows from a table, as illustrated

Exact Numeric Data Types

Advanced tabular data processing with pandas. Day 2

Insertions, Deletions, and Updates

Test Bank for Database Processing Fundamentals Design and Implementation 13th Edition by Kroenke

Python for Data Analysis. Prof.Sushila Aghav-Palwe Assistant Professor MIT

Scientific Programming. Lecture A07 Pandas

Databases II: Microsoft Access

Import and Browse. Review data. bp_stages is a chart based on a graphic

NESTED QUERIES AND AGGREGATION CHAPTER 5 (6/E) CHAPTER 8 (5/E)

Chapter 7. Introduction to Structured Query Language (SQL) Database Systems: Design, Implementation, and Management, Seventh Edition, Rob and Coronel

INTRODUCTION TO DATABASES IN PYTHON. Creating Databases and Tables

In This Lecture. Yet More SQL SELECT ORDER BY. SQL SELECT Overview. ORDER BY Example. ORDER BY Example. Yet more SQL

SQL 2 (The SQL Sequel)

SQL CHEAT SHEET. created by Tomi Mester

30. Structured Query Language (SQL)

Operating systems fundamentals - B07

Chapter 4. Basic SQL. Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley

II (The Sequel) We will use the following database as an example throughout this lab, found in students.db.

origin destination duration New York London 415 Shanghai Paris 760 Istanbul Tokyo 700 New York Paris 435 Moscow Paris 245 Lima New York 455

Relational Database Management Systems for Epidemiologists: SQL Part I

T-SQL Training: T-SQL for SQL Server for Developers

Logical Operators and aggregation

ORACLE TRAINING CURRICULUM. Relational Databases and Relational Database Management Systems

Key Points. COSC 122 Computer Fluency. Databases. What is a database? Databases in the Real-World DBMS. Database System Approach

Querying Data with Transact SQL

Stat Wk 3. Stat 342 Notes. Week 3, Page 1 / 71

Polaris SQL Introduction. Michael Fields Central Library Consortium

This lecture. Databases - SQL II. Counting students. Summary Functions

Introduction to SQL Server 2005/2008 and Transact SQL

Databases - SQL II. (GF Royle, N Spadaccini ) Structured Query Language II 1 / 22

Database Management Systems,

Why Relational Databases? Relational databases allow for the storage and analysis of large amounts of data.

CST272 SQL Server, SQL and the SqlDataSource Page 1

Python, PySpark and Riak TS. Stephen Etheridge Lead Solution Architect, EMEA

Data Wrangling with Python and Pandas

MariaDB Crash Course. A Addison-Wesley. Ben Forta. Upper Saddle River, NJ Boston. Indianapolis. Singapore Mexico City. Cape Town Sydney.

Final Exam, Version 3 CSci 127: Introduction to Computer Science Hunter College, City University of New York

SQL Structured Query Language Introduction

CS2 Current Technologies Lecture 2: SQL Programming Basics

Lesson 2. Data Manipulation Language

1.8.1 Research various database careers SE: 335

Databases - SQLite Wrap-Up

SQL. Char (30) can store ram, ramji007 or 80- b

Data Infrastructure IRAP Training 6/27/2016

Instructor: Craig Duckett. Lecture 03: Tuesday, April 3, 2018 SQL Sorting, Aggregates and Joining Tables

SE: 4, 6, 31 (Knowledge Check), 60, (Knowledge Check) 1.2 Explain the purpose of a relational Explain how a database is

Python data pipelines similar to R Documentation

Exam Questions 1z0-882

Lab 8 - Subset Selection in Python

Introduction to PROC SQL

Follow these steps to get started: o Launch MS Access from your start menu. The MS Access startup panel is displayed:

SQL BASICS WITH THE SMALLBANKDB STEFANO GRAZIOLI & MIKE MORRIS

NESTED QUERIES AND AGGREGATION CHAPTER 5 (6/E) CHAPTER 8 (5/E)

Report Writing: Basic Steps

Hustle Documentation. Release 0.1. Tim Spurway

SQL: Aggregate Functions with Grouping

MIS2502: Data Analytics SQL Getting Information Out of a Database Part 1: Basic Queries

The UOB Python Lectures: Part 3 - Python for Data Analysis

Transcription:

INTRODUCTION TO DATABASES IN PYTHON Filtering and Targeting Data

Where Clauses In [1]: stmt = select([census]) In [2]: stmt = stmt.where(census.columns.state == 'California') In [3]: results = connection.execute(stmt).fetchall() In [4]: for result in results:...: print(result.state, result.age) Out[4]: California 0 California 1 California 2 California 3 California 4 California 5...

Where Clauses Restrict data returned by a query based on boolean conditions Compare a column against a value or another column Often used comparisons: '==', '<=', '>=', or '!='

Expressions Provide more complex conditions than simple operators Eg. in_(), like(), between() Many more in documentation Available as method on a Column

Expressions In [1]: stmt = select([census]) In [2]: stmt = stmt.where( census.columns.state.startswith('new')) In [3]: for result in connection.execute(stmt):...: print(result.state, result.pop2000) Out[3]: New Jersey 56983 New Jersey 56686 New Jersey 57011...

Conjunctions Allow us to have multiple criteria in a where clause Eg. and_(), not_(), or_()

Conjunctions In [1]: from sqlalchemy import or_ In [2]: stmt = select([census]) In [3]: stmt = stmt.where(...: or_(census.columns.state == 'California',...: census.columns.state == 'New York'...: )...: ) In [4]: for result in connection.execute(stmt):...: print(result.state, result.sex) Out[4]: New York M California F

INTRODUCTION TO DATABASES IN PYTHON Let s practice!

Introduction to Relational Databases in Python INTRODUCTION TO DATABASES IN PYTHON Ordering Query Results

Order by Clauses Allows us to control the order in which records are returned in the query results Available as a method on statements order_by()

Order by Ascending In [1]: print(results[:10]) Out[1]: [('Illinois',), ] In [3]: stmt = select([census.columns.state]) In [4]: stmt = stmt.order_by(census.columns.state) In [5]: results = connection.execute(stmt).fetchall() In [6]: print(results[:10]) Out[6]: [('Alabama',), ]

Order by Descending Wrap the column with desc() in the order_by() clause

Order by Multiple Just separate multiple columns with a comma Orders completely by the first column Then if there are duplicates in the first column, orders by the second column repeat until all columns are ordered

Order by Multiple In [6]: print(results) Out[6]: ('Alabama', 'M') In [7]: stmt = select([census.columns.state,...: census.columns.sex]) In [8]: stmt = stmt.order_by(census.columns.state,...: census.columns.sex) In [9]: results = connection.execute(stmt).first() In [10]: print(results) Out[10]:('Alabama', 'F') ('Alabama', 'F') ('Alabama', 'M')

INTRODUCTION TO DATABASES IN PYTHON Let s practice!

INTRODUCTION TO DATABASES IN PYTHON Counting, Summing and Grouping Data

SQL Functions E.g. Count, Sum from sqlalchemy import func More efficient than processing in Python Aggregate data

Sum Example In [1]: from sqlalchemy import func In [2]: stmt = select([func.sum(census.columns.pop2008)]) In [3]: results = connection.execute(stmt).scalar() In [4]: print(results) Out[4]: 302876613

Group by Allows us to group row by common values

Group by In [1]: stmt = select([census.columns.sex,...: func.sum(census.columns.pop2008)...: ]) In [2]: stmt = stmt.group_by(census.columns.sex) In [3]: results = connection.execute(stmt).fetchall() In [4]: print(results) Out[4]: [('F', 153959198), ('M', 148917415)]

Group by Supports multiple columns to group by with a pattern similar to order_by() Requires all selected columns to be grouped or aggregated by a function

Group by Multiple In [1]: stmt = select([census.columns.sex,...: census.columns.age,...: func.sum(census.columns.pop2008)...: ]) In [2]: stmt = stmt.group_by(census.columns.sex,...: census.columns.age) In [2]: results = connection.execute(stmt).fetchall() In [3]: print(results) Out[3]: [('F', 0, 2105442), ('F', 1, 2087705), ('F', 2, 2037280), ('F', 3, 2012742), ('F', 4, 2014825), ('F', 5, 1991082), ('F', 6, 1977923), ('F', 7, 2005470), ('F', 8, 1925725),

Handling ResultSets from Functions SQLAlchemy auto generates column names for functions in the ResultSet The column names are often func_# such as count_1 Replace them with the label() method

Using label() In [1]: print(results[0].keys()) Out[1]: ['sex', u'sum_1'] In [2]: stmt = select([census.columns.sex,...: func.sum(census.columns.pop2008).label(...: 'pop2008_sum')...: ]) In [3]: stmt = stmt.group_by(census.columns.sex) In [4]: results = connection.execute(stmt).fetchall() In [5]: print(results[0].keys()) Out[5]: ['sex', 'pop2008_sum']

INTRODUCTION TO DATABASES IN PYTHON Let s practice!

INTRODUCTION TO DATABASES IN PYTHON SQLAlchemy and Pandas for Visualization

SQLAlchemy and Pandas DataFrame can take a SQLAlchemy ResultSet Make sure to set the DataFrame columns to the ResultSet keys

DataFrame Example In [1]: import pandas as pd In [2]: df = pd.dataframe(results) In [3]: df.columns = results[0].keys() In [4]: print(df) Out[4]: sex pop2008_sum 0 F 2105442 1 F 2087705 2 F 2037280 3 F 2012742 4 F 2014825 5 F 1991082

Graphing We can graph just like we would normally

Graphing Example In [1]: import matplotlib.pyplot as plt In [2]: df[10:20].plot.barh() In [3]: plt.show()

Graphing Output Introduction to Databases in Python

INTRODUCTION TO DATABASES \ IN PYTHON Let s practice!