Relational Databases for Biologists: Efficiently Managing and Manipulating Your Data

Similar documents
Databases for Biologists

Relational Databases for Biologists

Session 2 Outline. Building An E-R Diagram. Database Basics. Number Data Types. db4bio E-R Diagram II. Relational Databases for Biologists

Relational Databases for Biologists: Efficiently Managing and Manipulating Your Data

Relational Databases for Biologists: Efficiently Managing and Manipulating Your Data

Unit 1 - Chapter 4,5

Data Modelling and Databases. Exercise Session 7: Integrity Constraints

Oracle Database 10g: Introduction to SQL

Table of Contents. PDF created with FinePrint pdffactory Pro trial version

Chapter 9: Working With Data

This lab will introduce you to MySQL. Begin by logging into the class web server via SSH Secure Shell Client

Using MySQL on the Winthrop Linux Systems

Introduction to Computer Science and Business

Course Outline and Objectives: Database Programming with SQL

Outline. Introduction to SQL. What happens when you run an SQL query? There are 6 possible clauses in a select statement. Tara Murphy and James Curran

CSC Web Programming. Introduction to SQL

Operating systems fundamentals - B07

Mastering phpmyadmiri 3.4 for

Exact Numeric Data Types

Advanced SQL Tribal Data Workshop Joe Nowinski

Chapter 3. Introduction to relational databases and MySQL. 2010, Mike Murach & Associates, Inc. Murach's PHP and MySQL, C3

Model Question Paper. Credits: 4 Marks: 140

Provider: MySQLAB Web page:

Oracle Database: SQL and PL/SQL Fundamentals NEW

Introduction to SQL. Tara Murphy and James Curran. 15th April, 2009

Oracle Database: SQL and PL/SQL Fundamentals Ed 2

MySQL. A practical introduction to database design

Oracle Database 11g: SQL and PL/SQL Fundamentals

Index. Bitmap Heap Scan, 156 Bitmap Index Scan, 156. Rahul Batra 2018 R. Batra, SQL Primer,

Introduction to relational databases and MySQL

Chapter 1 SQL and Data

MySQL for Developers Ed 3

MySQL 5.0 Certification Study Guide

Databases (MariaDB/MySQL) CS401, Fall 2015

MySQL by Examples for Beginners

More MySQL ELEVEN Walkthrough examples Walkthrough 1: Bulk loading SESSION

Week 11 ~ Chapter 8 MySQL Command Line. PHP and MySQL CIS 86 Mission College

SQL Interview Questions

Using SQL with SQL Developer 18.2

Oracle Database: Introduction to SQL Ed 2

Using Relational Databases for Digital Research

Lecture 5. Monday, September 15, 2014

CGS 3066: Spring 2017 SQL Reference

Manual Trigger Sql Server Example Update Column Value

How to use SQL to create a database

User's Guide c-treeace SQL Explorer

MySQL for Developers Ed 3

Oracle Database: Introduction to SQL

T-SQL Training: T-SQL for SQL Server for Developers

Oracle Database: Introduction to SQL

MySQL for Developers with Developer Techniques Accelerated

5. Single-row function

Exploring the Microsoft Access User Interface and Exploring Navicat and Sequel Pro, and refer to chapter 5 of The Data Journalist.

Oracle Syllabus Course code-r10605 SQL

Oracle Database: Introduction to SQL

Optional SQL Feature Summary

Bayesian Pathway Analysis (BPA) Tutorial

Database Systems. phpmyadmin Tutorial

Oracle Database: SQL and PL/SQL Fundamentals

MySQL for Beginners Ed 3

1 INTRODUCTION TO EASIK 2 TABLE OF CONTENTS

Database Management System 9

MTA Database Administrator Fundamentals Course

Bitnami MariaDB for Huawei Enterprise Cloud

A practical introduction to database design

Chapter 9: MySQL for Server-Side Data Storage

Oracle Schema Create Date Index Trunc >>>CLICK HERE<<<

Oracle Database 12c SQL Fundamentals

MySQL Schema Best Practices

Bitnami MySQL for Huawei Enterprise Cloud

Alter Change Default Schema Oracle Sql Developer

1 Writing Basic SQL SELECT Statements 2 Restricting and Sorting Data

Connecting Spotfire to Data Sources with Information Designer

Lab # 2. Data Definition Language (DDL) Eng. Alaa O Shama

INDEX. 1 Basic SQL Statements. 2 Restricting and Sorting Data. 3 Single Row Functions. 4 Displaying data from multiple tables

DEVELOPING DATABASE APPLICATIONS (INTERMEDIATE MICROSOFT ACCESS, X405.5)

DB2 SQL Class Outline

Midterm Review. Winter Lecture 13

Using SQL with SQL Developer Part I

Topics of the talk. Biodatabases. Data types. Some sequence terminology...

Ryan Stephens. Ron Plew Arie D. Jones. Sams Teach Yourself FIFTH EDITION. 800 East 96th Street, Indianapolis, Indiana, 46240

Module 3 MySQL Database. Database Management System

Automated SQL Ownage Techniques. OWASP October 30 th, The OWASP Foundation

Teradata SQL Features Overview Version

Using SQL with SQL Developer 18.2 Part I

81067AE Development Environment Introduction in Microsoft

Oracle Way To Grant Schema Privileges All Tables

ASSIGNMENT NO 2. Objectives: To understand and demonstrate DDL statements on various SQL objects

Introduction to Computer Science and Business

Some Useful Options. Code Sample: MySQLMonitor/Demos/Create-DB.bat

Tutorial 1: Using Excel to find unique values in a list

normalization are being violated o Apply the rule of Third Normal Form to resolve a violation in the model

"Charting the Course... Oracle18c SQL (5 Day) Course Summary

Oracle SQL & PL SQL Course

Lecture 8. Database Management and Queries

Type Java.sql.sqlexception Error Code 0 Sql State S1000

EE221 Databases Practicals Manual

TopView SQL Configuration

Oracle Database 11g: Introduction to SQLRelease 2

MySQL Introduction. Practical Course Integrative Bioinformatics. March 7, 2012

Transcription:

Relational Databases for Biologists: Efficiently Managing and Manipulating Your Data Session 3 Building and modifying a database with SQL George Bell, Ph.D. WIBR Bioinformatics and Research Computing

Session 3 Outline SQL query review Creating databases Creating tables Altering table structure Inserting data Deleting data Updating/modifying data Automating repetitive tasks

SELECT > SELECT * FROM Data LIMIT 5; > # Comments after # # Get non-redundant list SELECT DISTINCT species FROM LocusDescr; +-----------------+----------+-------+ affyid exptid level +-----------------+----------+-------+ AFFX-MurIL2_at hs-cer-1 20 AFFX-MurIL10_at hs-cer-1 8 AFFX-MurIL4_at hs-cer-1 77 AFFX-MurFAS_at hs-cer-1 30 AFFX-BioB-5_at hs-cer-1 258 +-----------------+----------+-------+ +---------+ species +---------+ Hs Mm +---------+

WHERE And ORDER BY > SELECT * FROM RefSeqs WHERE linkid BETWEEN 50 AND 100 LIMIT 5; > SELECT * FROM RefSeqs WHERE linkid BETWEEN 50 AND 100 ORDER BY ntrefseq DESC LIMIT 5; +--------+-----------+-----------+ linkid ntrefseq aarefseq +--------+-----------+-----------+ 50 NM_001098 NP_001089 51 NM_004035 NP_004026 52 NM_004300 NP_004291 53 NM_001610 NP_001601 54 NM_001611 NP_001602 +--------+-----------+-----------+ +--------+-----------+-----------+ linkid ntrefseq aarefseq +--------+-----------+-----------+ 70 NM_005159 NP_005150 81 NM_004924 NP_004915 91 NM_004302 NP_004293 86 NM_004301 NP_004292 52 NM_004300 NP_004291 +--------+-----------+-----------+

GROUP BY And HAVING > SELECT affyid, MIN(level) as min, MAX(level) as max FROM Data GROUP BY affyid HAVING max - min > 5000 LIMIT 5; > SELECT gbid, count(affyid) AS num_affyids FROM Targets GROUP BY gbid HAVING COUNT(gbId) > 4 ORDER BY num_affyids DESC LIMIT 5; +-------------+------+-------+ affyid min max +-------------+------+-------+ 100047_at 20 7784 100068_at 414 5883 100069_at 616 6349 100329_at 20 21455 100342_i_at 786 7931 +-------------+------+-------+ +----------+-------------+ gbid num_affyids +----------+-------------+ J04423 14 AC002397 12 AF109905 9 AF100956 9 AL031228 8 +----------+-------------+

Table Joining > SELECT DISTINCT Unigenes.uId, GO_Descr.description AS GO_description FROM Unigenes, LocusLinks, Ontologies, GO_Descr WHERE Unigenes.linkId=LocusLinks.linkId AND LocusLinks.linkId=Ontologies.linkId AND Ontologies.goAcc=GO_Descr.goAcc LIMIT 5; +-----------+-------------------------------+ uid GO_description +-----------+-------------------------------+ Hs.373554 calcium ion binding Hs.74561 protein carrier Hs.155956 arylamine N-acetyltransferase Hs.2 arylamine N-acetyltransferase Hs.234726 serine protease inhibitor +-----------+-------------------------------+

Output Formats Query from MySQL prompt Ending query with \G (in place of ; ) mysql < q.sql tab-delimited output gbid num_affyids J04423 14 AC002397 12 AF109905 9 AF100956 9 AL031228 8 +----------+-------------+ gbid num_affyids +----------+-------------+ J04423 14 AC002397 12 AF109905 9 AF100956 9 AL031228 8 +----------+-------------+ *************************** 1. row *************************** gbid: J04423 num_affyids: 14 *************************** 2. row *************************** gbid: AC002397 num_affyids: 12 *************************** 3. row *************************** gbid: AF109905 num_affyids: 9 *************************** 4. row *************************** gbid: AF100956 num_affyids: 9 *************************** 5. row *************************** gbid: AL031228 num_affyids: 8

Access Privileges Restrict access and prevent accidental alteration of important information Can limit what individual users can see and do on particular databases and specific tables Access privileges are stored in the mysql database > GRANT ALL PRIVILEGES ON db4bio.* TO superuser@ % IDENTIFIED BY password ; > GRANT SELECT,INSERT ON db4bio.data TO admin@ 18.157.*.* IDENTIFIED BY pass2 ;

CREATE DATABASE Allows you to create a new database on the database server (if you have permission) > SHOW DATABASES; > CREATE DATABASE go; > SHOW DATABASES; > USE go; +----------+ Database +----------+ anno cpa db4bio go goaway mirna mysql sirna2 test wibrunix +----------+

CREATE TABLE Translate an E-R diagram (schema) into a functioning database Descriptions gbid description > CREATE TABLE Descriptions ( gbid VARCHAR(20) NOT NULL, description VARCHAR(100), PRIMARY KEY (gbid) ); +-------------+--------------+------+-----+---------+-------+ Field Type Null Key Default Extra +-------------+--------------+------+-----+---------+-------+ gbid varchar(20) PRI description varchar(100) YES NULL +-------------+--------------+------+-----+---------+-------+

CREATE TABLE Targets affyid gbid species > CREATE TABLE Targets ( affyid VARCHAR(20) NOT NULL, gbid VARCHAR(20) NOT NULL, species VARCHAR(20), PRIMARY KEY (affyid, gbid) ); +---------+-------------+------+-----+---------+-------+ Field Type Null Key Default Extra +---------+-------------+------+-----+---------+-------+ affyid varchar(20) PRI gbid varchar(20) PRI species varchar(20) YES NULL +---------+-------------+------+-----+---------+-------+

ALTER TABLE Modify a table s attributes Attribute names, type, null, key, default Add or drop attributes > ALTER TABLE Data CHANGE level level DOUBLE; > ALTER TABLE Data RENAME level expression; > ALTER TABLE Data ADD PRIMARY KEY (exptid); > ALTER TABLE Data DROP COLUMN affyid; > ALTER TABLE Data ADD date TIMESTAMP; > DROP TABLE Data;

INSERT INTO Finally, add data into tables > INSERT INTO Data (level, exptid, affyid) EXPLICIT ORDER VALUES (215, hs-hrt-1, 100008_at ); > INSERT INTO Data IMPLIED ORDER VALUES ( 100008_at, hs-hrt-1, 215); > INSERT INTO Data2 (affyid2,level2) DATA COPYING SELECT Data.affyId, Data.level FROM Data WHERE Data.level < 250;

DELETE FROM Delete data from tables Similar syntax as SELECT > DELETE FROM Data WHERE exptid= hs-hrt-1 ; > DELETE FROM Sources BE CONSISTENT WHERE exptid= hs-hrt-1 ;

UPDATE Modify data already stored in a table Again, similar syntax as SELECT > UPDATE Data MODIFY SET exptid= hs-hrt-2 WHERE exptid= hs-hrt-1 ; > UPDATE Source FIX SET exptid= ms-hrt-1, source= Mm WHERE exptid= hs-hrt-1 ; > UPDATE Data INTERNAL SET level=level*1.27 NORMALIZATION WHERE exptid= hs-hrt-1 ;

LOAD DATA And Export Read rows from a text file (in the current directory) into a table and vice versa > LOAD DATA LOCAL INFILE data.txt INTO TABLE db4bio.data FIELDS TERMINATED BY \t LINES TERMINATED BY \n ; > LOAD DATA LOCAL INFILE data.txt INTO TABLE db4bio.data; > SELECT * INTO OUTFILE data.txt FIELDS TERMINATED BY, FROM Data; Standard line ends: Macintosh = \r Windows = \r\n Assumes tabdelimited file, with lines ending in \n But need access to computer with MySQL

LOAD DATA warnings mysql> LOAD DATA LOCAL INFILE "Hs_sources_test.txt" -> INTO TABLE Sources; Query OK, 4 rows affected, 3 warnings (0.00 sec) Records: 4 Deleted: 0 Skipped: 0 Warnings: 3 mysql> SHOW warnings; +---------+----------------------------------------------------+ Level Code Message +---------+------+---------------------------------------------+ Warning 1265 Data truncated for column 'exptid' at row 3 Warning 1265 Data truncated for column 'exptid' at row 4 Warning 1262 Row 4 was truncated; it contained --- +---------+------+---------------------------------------------+ 3 rows in set (0.00 sec) mysql> LOAD DATA LOCAL INFILE "Hs_sources_test.txt" -> INTO TABLE Sources; Query OK, 0 rows affected, 3 warnings (0.00 sec) Records: 4 Deleted: 0 Skipped: 4 Warnings: 3

Automating Repetitive Tasks Use.SQL files to perform SQL commands automatically Automatically create a series of tables % mysql -h hebrides.wi.mit.edu -u guest -p -D databasename < create.sql Feed a complicated query to the database and receive the results in A text file % mysql -h hebrides.wi.mit.edu -u web -p -D db4bio < query1.sql > query1.out

Summary Design databases with E-R diagrams Data mine using combinations of SELECT/FROM with WHERE, GROUP BY, HAVING, ORDER BY, and aggregates Create and implement databases Input and output data from databases Modify existing data within databases

Advanced topics Query optimization (adding indexes) Dates and times all expected functionality Mathematics functions: logs, trig, etc. String (text) functions substring, concatenate, replace, case change, etc. Nested queries SELECT * FROM Ontologies WHERE linkid IN (SELECT linkid FROM LocusLinks WHERE gbid LIKE A82% );

Where To Go From Here? Consult SQL And MySQL Resources http://www.mysql.com Tutorial, Reference Manual Graphical interfaces to MySQL DBDesigner (free) MySQL Administrator SQL4XManagerJ (inexpensive) Visio (Microsoft) Visual Case (expensive) Ensembl databases with open access Sources of data to build your own: UCSC Bioinformatics; Gene Ontology; Entrez Gene

Course Goals Conceptualize data in terms of relations (database tables) Design relational databases Use SQL commands to extract data from (mine) databases Use SQL commands to build and modify databases

Exercises Create tables Input data Modify/delete particular data Accessing your own database: mysql -u username -p -D username -h hebrides.wi.mit.edu