TITLE OF COURSE SYLLABUS, SEMESTER, YEAR

Similar documents
LIS 2680: Database Design and Applications

Course and Contact Information. Course Description. Course Objectives

San José State University College of Science / Department of Computer Science Introduction to Database Management Systems, CS157A-3-4, Fall 2017

Prof. David Yarowsky

CS157a Fall 2018 Sec3 Home Page/Syllabus

Course and Contact Information. Course Description. Course Objectives

San José State University Computer Science Department CS157A: Introduction to Database Management Systems Sections 5 and 6, Fall 2015

Avi Silberschatz, Henry F. Korth, S. Sudarshan, Database System Concept, McGraw- Hill, ISBN , 6th edition.

Rochester Institute of Technology Golisano College of Computing and Information Sciences Department of Information Sciences and Technologies

Common Syllabus revised

CSC 407 Database System I COURSE PARTICULARS COURSE INSTRUCTORS COURSE DESCRIPTION

CIS 408 Internet Computing (3-0-3)

MWF 9:00-9:50AM & 12:00-12:50PM (ET)

City University of Hong Kong Course Syllabus. offered by Department of Computer Science with effect from Semester A 2017/18

CMPS 182: Introduction to Database Management Systems. Instructor: David Martin TA: Avi Kaushik. Syllabus

Advanced Topics in Database Systems Spring 2016

Specific Objectives Contents Teaching Hours 4 the basic concepts 1.1 Concepts of Relational Databases

CPS352 - DATABASE SYSTEMS. Professor: Russell C. Bjork Spring semester, Office: KOSC 242 x4377

Advanced Relational Database Management MISM Course F A Fall 2017 Carnegie Mellon University

Advanced Relational Database Management MISM Course S A3 Spring 2019 Carnegie Mellon University

IS Spring 2018 Database Design, Management and Applications

COLLEGE OF INFORMATION TECHNOLOGY DEPARTMENT OF INFORMATION TECHNOLOGY COURSE SYLLABUS/SPECIFICATION

San José State University Department of Computer Science CS-174, Server-side Web Programming, Section 2, Spring 2018

CPS352 Database Systems Syllabus Fall 2012

San Jose State University College of Science Department of Computer Science CS151, Object-Oriented Design, Sections 1, 2, and 3, Spring 2018

CSCI 201L Syllabus Principles of Software Development Spring 2018

San Jose State University College of Science Department of Computer Science CS151, Object-Oriented Design, Sections 1,2 and 3, Spring 2017

TEACHING & ASSESSMENT PLAN

IS 331-Fall 2017 Database Design, Management and Applications

Course Design Document: IS202 Data Management. Version 4.5

PELLISSIPPI STATE COMMUNITY COLLEGE MASTER SYLLABUS ADVANCED DATABASE MANAGEMENT SYSTEMS CSIT 2550

Database Systems: Concepts, design, and implementation ISE 382 (3 Units)

CPS352 - DATABASE SYSTEMS. Professor: Russell C. Bjork Spring semester, Office: KOSC 242 x4377

CIS 3308 Web Application Programming Syllabus

USC Viterbi School of Engineering

Del Mar College Master Course Syllabus. UNIX System Administration Course Number: ITSC1358

INFS 2150 (Section A) Fall 2018

Course Name: Database Design Course Code: IS414

San Jose State University - Department of Computer Science

CSCI 6312 Advanced Internet Programming

1. Query and manipulate data with Entity Framework.

Database Design and Management - BADM 352 Fall 2009 Syllabus and Schedule

SULTAN QABOOS UNIVERSITY COURSE OUTLINE PROGRAM: B.Sc. in Computer Science. Laboratory (Practical) Field or Work Placement

San Jose State University College of Science Department of Computer Science CS185C, NoSQL Database Systems, Section 1, Spring 2018

Course and Contact Information. Catalog Description. Course Objectives

KOMAR UNIVERSITY OF SCIENCE AND TECHNOLOGY (KUST)

BMI 544: Databases Winter term, 2015

Syllabus COSC-051-x - Computer Science I Fall Office Hours: Daily hours will be entered on Course calendar (or by appointment)

KUWAIT UNIVERSITY College of Business Administration Department of Quantitative Methods and Information Systems

A: 90% - 100% B: 80% - <90% C: 70% - <80% D: 60% - <70% F: < 60% Important Dates:

San José State University Science/Computer Science Database Management System I

DATABASE MANAGEMENT SYSTEM SUBJECT CODE: CE 305

ESSEX COUNTY COLLEGE Business Division CIS 152 Internet Concepts Course Outline

University of Maryland at College Park Department of Geographical Sciences GEOG 477/ GEOG777: Mobile GIS Development

PELLISSIPPI STATE TECHNICAL COMMUNITY COLLEGE MASTER SYLLABUS HPC INTERNETWORKING & GRID TECHNOLOGY HPC 1020

HOLY ANGEL UNIVERSITY COLLEGE OF INFORMATION and COMMUNICATIONS TECHNOLOGY DATABASE MANAGEMENT SYSTEM COURSE SYLLABUS

CSC 261/461 Database Systems. Fall 2017 MW 12:30 pm 1:45 pm CSB 601

CSc 2310 Principles of Programming (Java) Jyoti Islam

Introduction To Data Processing COMP 153 Business Administration Program/Administrative Studies. Course Outline

FOUNDATIONS OF INFORMATION SYSTEMS MIS 2749 COURSE SYLLABUS Fall, Course Title and Description

Syllabus Revised 01/03/2018

IT 341 Fall 2017 Syllabus. Department of Information Sciences and Technology Volgenau School of Engineering George Mason University

Syllabus. ICS103: Computer Programming in C 2017 / 2018 First Semester (Term 171) INSTRUCTOR Office Phone Address Office Hours

IST659 Spring2015 M001 Wang Syllabus Data Administration Concepts and Database Management

CMPUT 391 Database Management Systems. Fall Semester 2006, Section A1, Dr. Jörg Sander. Introduction

9/8/2018. Prerequisites. Grading. People & Contact Information. Textbooks. Course Info. CS430/630 Database Management Systems Fall 2018

The University of Jordan

ISM 324: Information Systems Security Spring 2014

ITSC 1319 INTERNET/WEB PAGE DEVELOPMENT SYLLABUS

Web Development: Client Side

Syllabus for CSC 455 Database Systems 3 Credit Hours Spring 2012

San Jose State University College of Science Department of Computer Science CS185C, Introduction to NoSQL databases, Spring 2017

CASPER COLLEGE COURSE SYLLABUS MSFT 1600 Managing Microsoft Exchange Server 2003 Semester/Year: Fall 2007

IST659 Fall 2018 M004 Class Syllabus. Data Administration Concepts and Database Management

Geography 4150/5150. Teaching Assistant: Chenjun Ling Office: McEniry 427 Office Hours: Tuesday 2:30-4:40pm

Advanced Database Organization INF613

Database Management Systems CS Spring 2017

Big Sandy Community and Technical College. Course Syllabus

The University of Jordan. Accreditation & Quality Assurance Center. COURSE Syllabus

San José State University Computer Science CS 122 Advanced Python Programming Spring 2018

San José State University Department of Computer Science CS049J, Programming in Java, Section 2, Fall, 2016

IST659 Spring 2016 Huang Syllabus Data Administration Concepts and Database Management

NEW YORK CITY COLLEGE OF TECHNOLOGY COMPUTER SYSTEMS TECHNOLOGY DEPARTMENT CST4714 DATABASE ADMINISTRATION (2 class hours, 2 lab hours, 3 credits)

CS430/630 Database Management Systems Spring, Betty O Neil University of Massachusetts at Boston

San José State University Computer Science Department CS49J, Section 3, Programming in Java, Fall 2015

San José State University Department of Computer Science CS166, Information Security, Section 1, Fall, 2018

Textbook(s) and other required material: Raghu Ramakrishnan & Johannes Gehrke, Database Management Systems, Third edition, McGraw Hill, 2003.

Nashville State Community College Computer and Engineering Technologies Division Computer Information Systems. Master Course Syllabus

COURSE OUTLINE. Last Amendment Edition Procedure No. Lecturer /blog Room No. Phone No. / Name.

Web Programming Fall 2011

CMPSCI 645 Database Design & Implementation

CS/SE 153 Concepts of Compiler Design

Syllabus Revised 08/21/17

Database Management System Implementation. Who am I? Who is the teaching assistant? TR, 10:00am-11:20am NTRP B 140 Instructor: Dr.

Updated: 2/14/2017 Page 1 of 6

INSTITUTE OF AERONAUTICAL ENGINEERING

Cleveland State University

Programming 1. Outline (111) Lecture 0. Important Information. Lecture Protocol. Subject Overview. General Overview.

Database Security MET CS 674 On-Campus/Blended

Course Design Document. IS410: Advanced Data Management. Version 5.1

Transcription:

TITLE OF COURSE SYLLABUS, SEMESTER, YEAR Instructor Contact Information Jennifer Weller Jweller2@uncc.edu Office Hours Time/Location of Course Mon 9-11am MW 8-9:15am, BINF 105 Textbooks Needed: none required, see below COURSE DESCRIPTION Goals: a. Students will learn where and how to obtain information from public biological databases and how to assess the quality of that information. b. Students will learn what a data model is and how to design, implement, populate and query relational databases to support their own research, which often requires the integration of public data with new data, and merging data from multiple sources having different formats. Methods: This course is taught as a Research Methods course, in which students learn practical research skills, including the theory of the relational model and the practice of creating a well-constructed database within a relational database management system. Students will build the skill set needed to create a research database to support scientific inquiry; students also learn methods required to access, retrieve and integrate data from diverse public databases as well as query that database to uncover new information.

Background: There are dozens of large public repositories of biological data and thousands of smaller databases. These data sources use a variety of designs, including hierarchical and relational schemas. Regardless of the underlying model, these are often housed within relational database management systems. Successful data retrieval is very dependent on implementation of the model as a schema and in how data is exposed to users. For some databases screen scraping is the only mechanism for retrieving the data, while access to a schema, SQL and APIs permits more effective use. Serialization methods are important to maintain structure during data transfer, we will discuss XML and XML Schema as examples of ways to accomplish data transfer without loss of organization. o Because the generation of biological data is very complex and our understanding is not complete, there are many valid data representations. Throughout the course we will consider the availability of representation standards and their proper use, taking as our exemplar the Generic Model Organism Databases as a standardized but extensible model, with the associated toolkits. o In the framework of the relational database management system the language used for creating and retrieving data is SQL, to which students will be introduced. As students become familiar with SQL and other programming tools they will be expected to create a project database in which they demonstrate their abilities to locate, acquire, integrate and interpret data from public repositories. o Particular topics will include formats and schemas in important biological data repositories, as databases (GEO/SRA, Expression Atlas, PDB), or knowledge bases (KEGG, PharmGKB). o We will also discuss biological ontologies (such as SO and GO) and how their deployment affects database design, integration and queries. PRE/CO- REQUISITES The lecture and lab are co-requisites. Prerequisite: It is assumed that students have some background in cell and molecular biology, such that they understand the basic biological units being modeled and how they relate to one another. No prior programming or database skills are required. Pre-requisite: Admission to graduate standing and undergraduate training in Computer Science or one of the Life Sciences

LEARNING OBJECTIVES Having successfully completed this course, the student will be able to explain and apply: o The relational model and Codd s rules; normalization. o Standard conventions for communicating relational data models. o Install and use a freely available RDBMS. o Load data into an RDBMS using several methods and the verify outcome. o Use SQL for data retrieval from public databases, perform data transformation and load data into a personal database o Interpret and use BioOntologies and XML; explain foundations of interoperability. o Data Cubes and other non-normalized systems are used for establishing Knowledge Bases. o Create a research database of his/her own design; implement and demonstrate its use. o Present a working version of the research database in the context of a scientific problem that it solves. o Write a paper describing the project database in a journal format. o Research ethics with a specific emphasis on how the interoperability of disparate databases can lead to unintentional invasion of privacy. INSTRUCTIONAL METHODS The course is divided between lecture and lab, with more lectures at the beginning. o Concepts are presented primarily as lectures elucidating important concepts presented in detail in reading assignments, with presentation of factual material in a standard lecture style, and discussion of alternative approaches presented in readings. o Labs will include explanation of methods by the instructor, demonstrations of methods using software applications by the instructor or TA, followed by supervised student activities in which they carry out tasks using the methods.

Expected Class Work: Reading o Papers, blogs, links to demonstrations, etc. about data models, schemas, management systems, integration methods and new approaches will be assigned weekly. Problem Sets There will be 5 problem sets, which are aimed at having you practice specific technical skills. o Use a design tool to create a conceptual model of a data model o Install SQLite and a Browser, use documentation to understand the application o Answer questions about Relational Algebra o Practice writing SQL queries on a class database o Practice interpreting XML and ontologies Homework There will be 5 homework assignments, which help you use skills to build up the components necessary to create your project database. o Critical Evaluation of a Database o Identifying and mapping key data elements in GEO and SRA files o Conceptual Design of your Project Database iteration 1 o Normalization of your Project Data Model o Creation and Testing of your Project Schema Exams There will be one midterm (given during class) and one final exam (turned in during the scheduled final period) covering the lecture content. Questions are multiple choice or short answer with one or two using SQL as the DML or DDL problems to solve. Student Projects: Construct a database that supports scientific research using genomic data, in an area decided each year by the professor.

GRADING Students will be evaluated on their ability to synthesize lecture and lab material effectively, including their ability to answer factual questions regarding lecture and reading material on exams, their ability to practice specific skills in problem sets, their ability to reason about novel problems based on that material in homework assignments, and their ability to put into practice concepts and skills to create a project, which must be documented and presented live to the class. PhD students are required to undertake more complex projects, integrating additional information into the schema and carrying out more complex queries for their presentations. Homework 20% (5) Problem Sets 20% (5) Midterm exam 15% (Comprehensive) Final exam 15% (Comprehensive) Project Presentation (live) 15% Final Project Report (documentation) 15% TENTATIVE SCHEDULE (WEEKLY, OR TOPICS) Date (Week) Subject/Topic Assignments Due 1 Introduction: Scientific Data and Databases, Standards, Exchange methods, Ontologies 2 Genomics Experiments and Data types; Methods and Measurements 3 Data Models and DBMS 4 Entity Relational Modeling 5 The relational model and Normal Forms 6 SQL and SQLite 7 Generic Model Organism Databases and CHADO 8 Extract/Transform/Load and Advanced Queries 9 Ontologies: Gene Ont. and Sequence Ont. 10 Resource Description Framework and OBO 11 Serialization methods for structures data transfer 12 NoSQL databases, Data Cubes and Knowledge Bases 13 Responsible Conduct of Research 14 Presentations of Projects

Note on textbooks: I do not require a text, as there are none that use scientific applications as the teaching paradigm. There is an on-line text that I can recommend: Database design with UML and SQL, 3 rd edition, by Tom Jewett (http://www.tomjewett.com/dbdesign/dbdesign.php ) and I will assign content from specific Web sites as well as a fair number of journal papers. For examples of current scientific databases I strongly recommend looking at recent editions of the Nucleic Acids Research database edition that appears annually, as well as on-line documentation of the RDBMS, tools and specific databases that we discuss in class. The basics of the relational model and relational schemas and SQL are covered in almost any beginning database text I will provide likely chapter titles to guide your reading. Good information may be found in the following texts, available in the library, relevant chapters will be cited for each lecture: 1. Introduction to Bio-Ontologies: by Peter N Robinson and Sebastian Bauer, CRC (Chapman and Hall) Press, First Edition, 2011. 2. Database Systems Concepts by A. Silberschatz, H. Korth and S. Sudershan, ed. 6, 2010, McGraw-Hill ISBN 0-07-352332-1. 3. Database Systems: Design, Implementation and Management, Seventh Edition by Peter Rob and Carlos Coronel 2007, from Course Technology: Thomson Learning, Boston MA. ISBN 13: 978-1-4188-3593-4. Note there is now an 8 th edition. Note on Lab Applications: You will be required to install and learn to use SQLite v 3.8.7.1, a single-file RDBMS Database browsers SQLite Manager v 0.8.1 (a Firefox plugin which is lightweight) and DB Browser for SQLite v 3.4.0,which has many more features MySQL Workbench v 6.2.3, which is a data-modeling tool. We will demonstrate the use of these tools during the lab section of the course, but you are also expected to install and use them on whatever computer you are using to carry out your project.

POLICIES AND PROCEDURES ACADEMIC INTEGRITY All students are required to read and abide by the Code of Student Academic Integrity. Violations of the Code of Student Academic Integrity, including plagiarism, will result in disciplinary action as provided in the Code. Definitions and examples of plagiarism are set forth in the Code. The Code is available from the Dean of Students Office or online at: http://www.legal.uncc.edu/policies/ps- 105.html. A set of links to various resources on plagiarism and how to avoid it is available at the UNCC Library website: http://library.uncc.edu/display/?dept=instruction&format=open&page=920. ATTENDANCE I do not take attendance. I also don t go over lecture material for students who miss lectures (barring illness), office hours are for those who have questions about a specific lecture point or homework and practice problems or for help with the project. Note that there is no official textbook that carries the burden of delivering the bulk of instruction so that might weigh into the attendance decision. OTHER Students are encouraged to use computers during the lecture and lab periods for note-taking and other course-related activities. However, the use of cell phones, beepers or other communication devices, and using class laptops for email, social media, etc. is disruptive to your concentration and your neighbors hence these uses are prohibited during class. Except for emergencies (for which the instructor has been alerted in advance), those using such devices for those purposes must leave the classroom for the remainder of the class period.