1/5/11. Announcements. Topics. Lab hour/discussion section: Introduction (SDM, e-Science, scientific workflows). Databases. Assignment 1


1/5/11

Announcements
- Lab hour/discussion section: next Thursday 2-3pm, 93 Hutchison
- Assignment 1: Getting started with Python & SQLite. Out: end of this week, due: end of next week (tbd)
- Readings: two papers posted on web site, please read! Send questions to mailing list!
- Visit site, subscribe to mailing list (=group): sites.google.com/site/ecs166wq11 (new reading materials there, additional online book, etc.), groups.google.com/group/ecs166wq

Topics
- Introduction (SDM, e-Science, scientific workflows)
- Databases
  - Basic concepts of relational databases (crash course)
  - Metadata, ontologies
  - Hands-on exercises with SQLite, MySQL, Python
- Scientific Workflows
  - Basic concepts
  - Hands-on exercises with Kepler, Taverna, Python, ...
  - Data Provenance (theory & examples)
  - Parallel Execution (Map Reduce)

Taverna workflow (The 4th Paradigm)
Chapter 3, page 137 in the 4th Paradigm book.
[Figure 1: A Taverna workflow that connects several internationally distributed datasets to identify candidate genes that could be implicated in resistance to African trypanosomiasis (sleeping sickness) [11]. Source: The Fourth Paradigm. Legend: workflow inputs and outputs, input and output ports, local service, Beanshell, Soaplab service, string constant, Biomart service.]

Microbial Ecology, Metagenomics: what microbes are in my favorite environment?
WATERS: Workflow for Alignment, Taxonomy, Ecology of Ribosomal Sequences (Amber Hartman; Eisen Lab; UC Davis)
[Figure: WATERS workflow. Input: assembled contigs. Steps: profile alignment (STAP or Infernal), chimera check (Mallard), find OTUs (OTUHunter), build phylogenetic tree (RAxML or Quicktree), view tree (Dendroscope), UniFrac (tree & environment file), assign taxonomy (STAP). Several steps are marked "+/- cluster" or "+/- cipres". Also shown: visualization tools (Cytoscape networks & heat map), diversity statistics (text: OTU list, Chao1, Shannon; graphs: rarefaction curves, rank abundance curves), and metadata.]
STAP (ss-rRNA Taxonomy Assigning Pipeline): D. Wu, A.L. Hartman, N. Ward, J.A. Eisen, PLoS ONE, June 2008. Li Weng et al., Genome Res.

Executable WATERS Workflow in Kepler
[Figure: the WATERS workflow as an executable Kepler workflow.]

myexperiment.org
myExperiment allows users to find, use and share scientific workflows and other Research Objects, and to build communities.

Simple Kepler analysis workflow
- Data source from EcoGrid (metadata-driven ingestion)
- R processing script: res <- lm(BARO ~ T_AIR); res; plot(T_AIR, BARO); abline(res)

Scientific Workflow for Phylogenetic Analysis (Dan Higgins, NCEAS)
- SciWF ~ executable spec of a scientific data analysis method
- Figure labels: actors (Clustal, Quicktree, DrawTree), data (AA-Sequences, Aligned AA-Sequences, Newick Tree), ports, channels
- Tokens: int, string, record{..}, array[..], ..

From Climate Gate to Reproducible Science

Provenance and Scientific Workflows
prov·e·nance (noun): place of origin; derivation
For the scientist (focus on data derivation):
- Evaluate results based on actors and data used, parameter settings, etc.
- Automate metadata creation
- Maintain a record of what was done within a project, etc.
- Provide a high-level view of what a workflow did, dependencies, etc.
For the engineer (focus on processing history):
- Monitor, benchmark, and optimize workflow performance
- Record resources used during workflow execution
- Checkpoint and restart workflows
- Optimization (e.g., minimize unnecessary recomputations)
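A record of "what was done" and its dependencies can be kept as a list of trace records and then queried for lineage. The following is a minimal sketch in Python, not Kepler's actual provenance format: the record layout, the data item names (seq1, alignment1, ...), and the helper functions are all hypothetical, and only the actor names are borrowed from the phylogenetics workflow above.

```python
# Hypothetical execution trace: one record per actor firing, saying which
# inputs were consumed and which output was produced. Illustrative only.
TRACE = [
    {"actor": "Clustal", "inputs": ["seq1", "seq2"], "output": "alignment1"},
    {"actor": "Quicktree", "inputs": ["alignment1"], "output": "tree1"},
    {"actor": "DrawTree", "inputs": ["tree1"], "output": "plot1"},
]
ALL_INPUTS = ["seq1", "seq2", "seq3"]  # seq3 is supplied but never consumed


def build_index(trace):
    """Map each produced item to the trace record that produced it."""
    return {rec["output"]: rec for rec in trace}


def lineage(item, index):
    """Return (data items, actors) that `item` transitively depends on."""
    data, actors = set(), set()
    stack = [item]
    while stack:
        current = stack.pop()
        rec = index.get(current)
        if rec is None:  # a workflow input: no actor produced it
            continue
        actors.add(rec["actor"])
        for src in rec["inputs"]:
            if src not in data:
                data.add(src)
                stack.append(src)
    return data, actors


index = build_index(TRACE)

# Which items and actors did tree1 depend on?
deps, actors = lineage("tree1", index)

# Which workflow inputs were not used to derive the final plot?
used_by_plot, _ = lineage("plot1", index)
unused = [s for s in ALL_INPUTS if s not in used_by_plot]
```

Here `deps` holds the alignment and the two sequences it was built from, `actors` holds the actors involved in creating the tree, and `unused` lists inputs that contributed to no output.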

Provenance questions a scientist might ask
- Which DNA sequences were input to the workflow?
- Which phylogenetic trees were created?
- Which actor created this phylogenetic tree?
- Which input sequences did this tree depend on?
- What input sequences were not used to derive any output consensus trees?
- What sequence alignment was used to infer this tree?
- Which actors were involved in creating this tree?

How can we answer these questions?
- By recording what happens during the workflow run
- We often call the result a workflow execution trace
- What is recorded depends on what can be observed during the run
- What can be observed depends on the model of computation (MoC)
- Sometimes the MoC isn't enough and the observables must be augmented to capture provenance

Provenance (Data Lineage) Graphs
[Figure: provenance graphs for a Kepler/COMAD workflow.]

Scientific Workflows & Data Mining: Kepler/WEKA

Provenance
- Scientific workflows: to specify and execute computational pipelines
- Provenance information: to capture data lineage and processing history

Topics
- Introduction (SDM, e-Science, scientific workflows)
- Databases
  - Basic concepts of relational databases (crash course)
  - Metadata, ontologies
  - Hands-on exercises with SQLite, MySQL, Python
- Scientific Workflows
  - Basic concepts
  - Hands-on exercises with Kepler, Taverna, Python, ...
  - Data Provenance (theory & examples)
  - Parallel Execution (Map Reduce)

Introduction to Data(base) Management

Why study data(base) management?
- Critical to business, government, science, culture, society, ...
- Determines success of many corporations (even their existence)
- Many tech companies are built on data management (Google, Amazon, Yahoo!, Facebook, ...) or offer database products (Microsoft, IBM, Oracle)
- Database systems span major areas of computer science
  - Operating systems (file, memory, process management)
  - Theory (languages, algorithms, complexity)
  - Artificial Intelligence (knowledge-based systems, logic, search)
  - Software Engineering (application development)
  - Data structures (trees, hash tables)
- ... and the DB research community continues to be very active

Databases are everywhere ("Every-Ware")

Regularly Structured Data
- Sets the structure once (e.g., table attributes) and then has many instances (records) that use that structure
- Examples of regularly structured data
  - Employee, payroll, bank account
  - Data captured on web forms
- Examples of unstructured, a.k.a. loosely or semi-structured, data
  - Documents, (heaps of) video, audio, images, maps, ...

We Focus on Regularly Structured Data
- We focus on relational database management systems (abbreviated: DBMS or RDBMS)
  - Mainly designed to store, manage, and retrieve structured data
  - We use SQL to manage and retrieve (query) data from databases (abbreviated: DB)
- Unstructured data (e.g., documents) is managed mainly by content management and information retrieval systems
  - Includes search engines on the web
  - Querying involves indexing words in text, ranking results, etc.
  - Includes Web 2.0 features like tagging/labeling
* Many DBMSs now support unstructured and semi-structured data too

Some Characteristics of Data in Databases
- Data is persistent
  - One or more applications use the same data
  - Data stored between applications
- Data often too large to easily manage in-memory
  - DBMSs handle this for free
  - Manually handling data (files) is usually ad hoc (each app. does it differently) and can be inefficient
- Data may be very large (business, government, science, ...)
  - Library of Congress: > 20 terabytes of print
  - Amazon.com: > 42 terabytes of data
  - YouTube: > 45 terabytes of video
  - AT&T: > 323 terabytes of call records
  - National Energy Research Scientific Computing Center: > 2.8 petabytes
* 1 terabyte = 1,000,000,000,000 bytes
* 1 petabyte = 1,000,000,000,000,000 bytes (and there is talk about exabytes at DOE)

Lots of Data Everywhere
- History: According to Kevin Kelly in The New York Times, "the entire [written] works of humankind, from the beginning of recorded history, in all languages" would amount to 50 petabytes of data. [1]
- Computer hardware: Teradata Database 12 has a capacity of 50 petabytes of compressed data. [2][3]
- Telecoms: AT&T has about 16 petabytes of data transferred through their networks each day. [4]
- Archives: The Internet Archive contains about 3 petabytes of data, and is growing at the rate of about 100 terabytes per month as of March, [5][6]
- Internet: Google processes about 20 petabytes of data per day. [7]
- Physics: The 4 experiments in the Large Hadron Collider will produce about 15 petabytes of data per year, which will be distributed over the LHC Computing Grid. [8]
- P2P networks: As of October 2009, Isohunt has about 9.76 petabytes of files contained in torrents indexed globally. [9]
- Games: World of Warcraft utilizes 1.3 petabytes of storage to maintain its game. [10]

What is a DB?
- A database (DB) is a (structured) collection of persistent data
  - NB (the picky guy): DB schema vs. DB instance
- A database management system (DBMS) is a software system that supports the definition, population, and query of a database
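The three things a DBMS supports (definition, population, query) and the schema/instance distinction can be tried out with Python's built-in sqlite3 module. This is a minimal sketch under assumptions: the Account table, its Balance column, and all row values are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Definition: the schema (table name, attributes, types).
conn.execute(
    "CREATE TABLE Account (Number INTEGER PRIMARY KEY, Owner TEXT, Balance REAL)"
)
schema_before = conn.execute(
    "SELECT sql FROM sqlite_master WHERE name = 'Account'"
).fetchone()[0]

# Population: the instance is whatever rows are currently stored.
conn.executemany(
    "INSERT INTO Account VALUES (?, ?, ?)",
    [(101, "W. Yu", 500.0), (102, "R. Jones", 1200.0)],
)

# Query: ask the DBMS for the rows that satisfy a condition.
rich_owners = conn.execute(
    "SELECT Owner FROM Account WHERE Balance > 1000"
).fetchall()

# The schema is unchanged by inserting rows; only the instance grew.
schema_after = conn.execute(
    "SELECT sql FROM sqlite_master WHERE name = 'Account'"
).fetchone()[0]
instance_size = conn.execute("SELECT COUNT(*) FROM Account").fetchone()[0]
```

Comparing `schema_before` with `schema_after` shows that populating the table leaves the schema alone, while `instance_size` tracks the current instance.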

Basic Database Architecture / Query Processing / Query Execution
[Figure: DBMS architecture, shown on two slides. Web forms, application front ends, and the SQL interface send SQL commands to the DBMS. Inside the DBMS: the query evaluation engine (parser, optimizer, plan executor, operator evaluator); file and access methods; buffer; disk space; transaction, lock, and recovery components (concurrency control). On disk: index files, data files, system catalog.]

Computer Science in a Nutshell
"All computer science students must learn to integrate theory and practice, to recognize the importance of abstraction, and to appreciate the value of good engineering design."
Final report of the Joint ACM/IEEE-CS Task Force on Computing Curricula 2005 for Computer Science

Computer Science in a Nutshell: this course
- Strong emphasis: Practice
  - Practical concepts, skills, tools
- Important formalizations: Theory
  - Formal definitions, mathematical results
  - Formalization may not exactly match practical concept (often the core, e.g., SQL vs. Relational Algebra)
  - This is one of the really fun things about studying database systems!!!
- Only a bit: Engineering
  - Performance tradeoffs, scalability, reliability
  - Focus of the DB research community

Introduction to Relational Databases
- Assume this table has been defined to keep track of bank accounts
- The name of the table (relation)
- A table is also referred to as a relation

[Figure: the bank account table, annotated on successive slides with the terms below. Owners shown include W. Yu and R. Jones; other cell values not recoverable.]

- The name of the table (relation)
- The name of the attributes (columns)
- The schema of the table
  - The schema sets the structure of the table
  - The schema is the definition of the table
  - It generally includes more than what is shown here, e.g., data types and constraints
- Rows
  - Each entry in the table is called a row, tuple, or record (often used interchangeably)
- Instance
  - The instance of the schema is the current set of rows
- The intension of the table (its schema) and the current extension (or extent) of the table (its instance)
  - These terms are not used as often in relational databases; they appear mainly in deductive (logic-based) and object-oriented databases
- Degree or arity of a table is the number of attributes
  - Arity of this relation is [value not preserved] (because there are [value not preserved] attributes)
- Cardinality of a table is the number of rows in the current instance
  - Cardinality of this instance is 6 (because there are 6 rows)

[Tables: Deposit (columns include Transaction id, Date) and Check (columns include Check number, Date); cell values not recoverable.]
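Arity and cardinality can both be read off a live table. The sketch below uses Python's sqlite3 module with a hypothetical four-column Account table and six made-up rows; the column list is an assumption for illustration, not the table from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Account (Number INTEGER, Owner TEXT, Balance REAL, Type TEXT)"
)
conn.executemany(
    "INSERT INTO Account VALUES (?, ?, ?, ?)",
    [(100 + i, f"Owner {i}", 10.0 * i, "checking") for i in range(6)],
)

# Arity (degree): number of attributes, a property of the schema.
# PRAGMA table_info returns one row per column.
arity = len(conn.execute("PRAGMA table_info(Account)").fetchall())

# Cardinality: number of rows, a property of the current instance.
cardinality = conn.execute("SELECT COUNT(*) FROM Account").fetchone()[0]

# Changing the instance changes the cardinality but not the arity.
conn.execute("DELETE FROM Account WHERE Number = 100")
cardinality_after_delete = conn.execute(
    "SELECT COUNT(*) FROM Account"
).fetchone()[0]
arity_after_delete = len(conn.execute("PRAGMA table_info(Account)").fetchall())
```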

[Tables: Deposit (columns include Transaction id, Date) and Check (columns include Check number, Date), repeated on each slide; cell values not recoverable.]

- Each table (typically) has a key
  - The values of the key must be unique
  - A key consists of one or more attributes
  - We often underline key attributes
- What is the key for the table?
- Is this legal? [Table: Deposit with an additional row; cell values not recoverable.]
  - If not, how do we prevent it from happening?
- We say that Deposit.Account is a foreign key that references Account.Number
  - i.e., each Deposit (row) must refer to an Account (row)
  - If the DBMS enforces this constraint, we have referential integrity
- Are there any foreign keys in the Check table?
  - Yes, Check.Account is a foreign key that references Account.Number
- Foreign keys may or may not be part of the key for the table
  - The foreign key in Deposit is not part of the key for its table
  - The foreign key in Check is part of the key for its table
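Key uniqueness and referential integrity can be observed directly in SQLite. This is a sketch under assumptions: the column lists, the choice of Transaction_id as the Deposit key, the composite (Account, Check_number) key for Check, and all row values are hypothetical reconstructions. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`, and that the table name Check must be quoted because CHECK is an SQL keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite

conn.executescript("""
CREATE TABLE Account (
    Number INTEGER PRIMARY KEY,
    Owner  TEXT
);
CREATE TABLE Deposit (
    Account        INTEGER REFERENCES Account(Number),  -- foreign key, not in the key
    Transaction_id INTEGER PRIMARY KEY,
    Date           TEXT,
    Amount         REAL
);
CREATE TABLE "Check" (
    Account      INTEGER REFERENCES Account(Number),    -- foreign key, part of the key
    Check_number INTEGER,
    Date         TEXT,
    Amount       REAL,
    PRIMARY KEY (Account, Check_number)
);
""")

conn.execute("INSERT INTO Account VALUES (101, 'W. Yu')")
conn.execute("INSERT INTO Deposit VALUES (101, 1, '10/22/09', 50.0)")


def try_insert(sql):
    """Run an INSERT and report whether the DBMS accepted it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


# A deposit into an account that does not exist violates referential integrity.
dangling = try_insert("INSERT INTO Deposit VALUES (999, 2, '1/5/11', 10.0)")
# A second deposit with the same transaction id violates key uniqueness.
duplicate = try_insert("INSERT INTO Deposit VALUES (101, 1, '1/5/11', 10.0)")
# A check for an existing account is accepted.
check_ok = try_insert("INSERT INTO \"Check\" VALUES (101, 92, '10/23/09', 20.0)")

deposit_rows = conn.execute("SELECT COUNT(*) FROM Deposit").fetchone()[0]
```

Both bad inserts are refused by the DBMS itself, so no application has to re-implement the check.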

Consider the following sample data from a table
[Table: sample data with rows for Jones and Smith and the values $50,000 and $60,000, shown first without and then with column names; other cells not recoverable.]
- Can you tell what the key for this table is?
- Now can you tell what the key for the table is?

One possibility: Person table with Id as the key
[Table: columns Id, Name, Age, Salary; rows Jones ($50,000) and Smith ($60,000).]

Another possibility: Sales Commission table, by client company, per day
[Table: columns Salesperson, Company, Day, Commission; rows Jones ($50,000) and Smith ($60,000).]

Keys, table names, and attribute names (help to) tell us what the table is (the meaning, or semantics, of the relation => a form of metadata)

For every attribute of every table, the schema specifies allowable values. For example,
- Number must be an integer
- Owner must be a 30-character string
- Type must be "checking" or "savings"
The set of allowed values for an attribute is called the domain of the attribute

- Select the tables, with a name for each table
  - A database schema may have multiple tables
  - Each table has its own schema
- Select attributes for each table and give the domain for each attribute
  - This is the basis of a relation (or table) schema
- Also: specify the key(s) for each table
  - There can be more than one key for a table
  - There is only one primary key (more on this later)
- Specify all appropriate foreign keys
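Domains can be approximated in SQLite with CHECK constraints. The sketch below is one possible encoding, with assumptions stated: "a 30-character string" is interpreted as "at most 30 characters", the `accepts` helper is hypothetical, and the inserted values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Account (
    Number INTEGER PRIMARY KEY,
    Owner  TEXT NOT NULL CHECK (length(Owner) <= 30),
    Type   TEXT NOT NULL CHECK (Type IN ('checking', 'savings'))
)
""")


def accepts(number, owner, acct_type):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO Account VALUES (?, ?, ?)", (number, owner, acct_type)
        )
        return True
    except sqlite3.IntegrityError:
        return False


in_domain = accepts(101, "W. Yu", "checking")
bad_type = accepts(102, "R. Jones", "brokerage")  # Type outside its domain
bad_owner = accepts(103, "x" * 31, "savings")     # Owner longer than 30 characters
```

Only the first row is stored; the other two fall outside the declared domains and are rejected at insert time.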

Another example database
More standard notation; each table has one primary key

Teacher(Number, Name, Office, E-mail)
Course(Number, Name, Description)
Class Offering(Quarter, Course, Section, Teacher, TimeDay)
Student(Number, Name, Major, Advisor)
Completed(Student, Course, Quarter, Section, Grade)

The same schema is then shown again with some foreign keys indicated informally, with these questions:
- What foreign keys are missing?
- What are the limitations of this schema?
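One way to explore the example database is to write it down in SQLite and run a query across the tables. Everything beyond the table and attribute names is an assumption here: the transcript does not preserve which attributes were underlined as keys or which foreign keys were drawn, so the primary keys, foreign keys, data types, and sample rows below are one plausible guess, not the schema from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Assumed keys and foreign keys; sample rows are made up.
conn.executescript("""
CREATE TABLE Teacher (Number INTEGER PRIMARY KEY, Name TEXT, Office TEXT, Email TEXT);
CREATE TABLE Course  (Number TEXT PRIMARY KEY, Name TEXT, Description TEXT);
CREATE TABLE ClassOffering (
    Quarter TEXT,
    Course  TEXT REFERENCES Course(Number),
    Section INTEGER,
    Teacher INTEGER REFERENCES Teacher(Number),
    TimeDay TEXT,
    PRIMARY KEY (Quarter, Course, Section)
);
CREATE TABLE Student (
    Number  INTEGER PRIMARY KEY,
    Name    TEXT,
    Major   TEXT,
    Advisor INTEGER REFERENCES Teacher(Number)
);
CREATE TABLE Completed (
    Student INTEGER REFERENCES Student(Number),
    Course  TEXT,
    Quarter TEXT,
    Section INTEGER,
    Grade   TEXT,
    PRIMARY KEY (Student, Course, Quarter, Section),
    FOREIGN KEY (Quarter, Course, Section)
        REFERENCES ClassOffering(Quarter, Course, Section)
);
INSERT INTO Teacher VALUES (1, 'T. Example', 'Room 1', 't@example.org');
INSERT INTO Course VALUES ('C1', 'Databases', 'Intro course');
INSERT INTO ClassOffering VALUES ('W11', 'C1', 1, 1, 'TR 10:30');
INSERT INTO Student VALUES (7, 'S. Example', 'CS', 1);
INSERT INTO Completed VALUES (7, 'C1', 'W11', 1, 'A');
""")

# Who taught each class a student completed, and what grade was given?
rows = conn.execute("""
    SELECT s.Name, c.Name, t.Name, d.Grade
    FROM Completed AS d
    JOIN Student AS s ON s.Number = d.Student
    JOIN ClassOffering AS o
         ON o.Quarter = d.Quarter AND o.Course = d.Course AND o.Section = d.Section
    JOIN Teacher AS t ON t.Number = o.Teacher
    JOIN Course AS c ON c.Number = d.Course
""").fetchall()
```

Writing the foreign keys out explicitly like this is a convenient way to check an answer to "what foreign keys are missing?", since the DBMS will refuse rows that break any declared reference.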


More information
