USC Viterbi School of Engineering

Similar documents
INF 315E Introduction to Databases School of Information Fall 2015

Oklahoma State University Institute of Technology Face-to-Face Common Syllabus Fall 2017

Pre-Requisites: CS2510. NU Core Designations: AD


Discover Viterbi: Computer Science, Cyber Security & Informatics Programs. Viterbi School of Engineering University of Southern California Fall 2017

San José State University College of Science / Department of Computer Science Introduction to Database Management Systems, CS157A-3-4, Fall 2017

HOUSTON COMMUNITY COLLEGE BUSINESS TECHNOLOGY NORTHEAST COLLEGE-NORTHLINE LOCATION COURSE SYLLABUS FALL 2011 COMPUTER APPLICATION I POFI 1301

TITLE OF COURSE SYLLABUS, SEMESTER, YEAR

NOTE: This syllabus is subject to change during the semester. Please check this syllabus on a regular basis for any updates.

Oklahoma State University Institute of Technology Face-to-Face Common Syllabus Fall 2017

Meetings This class meets on Mondays from 6:20 PM to 9:05 PM in CIS Room 1034 (in class delivery of instruction).

LIS 2680: Database Design and Applications

Course and Contact Information. Course Description. Course Objectives

NOTE: This syllabus is subject to change during the semester. Please check this syllabus on a regular basis for any updates.

Textbook(s) and other required material: Raghu Ramakrishnan & Johannes Gehrke, Database Management Systems, Third edition, McGraw Hill, 2003.

CSCI 201L Syllabus Principles of Software Development Spring 2018

San José State University Computer Science Department CS157A: Introduction to Database Management Systems Sections 5 and 6, Fall 2015

COSC-589 Web Search and Sense-making Information Retrieval In the Big Data Era. Spring Instructor: Grace Hui Yang

Part A: Course Outline

The Linux Command Line: A Complete Introduction, 1 st ed., by William E. Shotts, Jr., No Starch Press, 2012.

San Jose State University College of Science Department of Computer Science CS185C, Introduction to NoSQL databases, Spring 2017

San José State University Department of Computer Science CS049J, Programming in Java, Section 2, Fall, 2016

CSC 261/461 Database Systems. Fall 2017 MW 12:30 pm 1:45 pm CSB 601

Course and Contact Information. Course Description. Course Objectives

IST659 Fall 2018 M004 Class Syllabus. Data Administration Concepts and Database Management

INFS 2150 (Section A) Fall 2018

Oklahoma State University Institute of Technology Online Common Syllabus Spring 2019

CSE 504: Compiler Design

KOMAR UNIVERSITY OF SCIENCE AND TECHNOLOGY (KUST)

ESET 369 Embedded Systems Software, Fall 2017

IT 341 Fall 2017 Syllabus. Department of Information Sciences and Technology Volgenau School of Engineering George Mason University

San Jose State University College of Science Department of Computer Science CS151, Object-Oriented Design, Sections 1, 2, and 3, Spring 2018

Compilers for Modern Architectures Course Syllabus, Spring 2015

MWF 9:00-9:50AM & 12:00-12:50PM (ET)

CS 241 Data Organization using C

South Portland, Maine Computer Information Security

ISM 324: Information Systems Security Spring 2014

CPS352 Database Systems Syllabus Fall 2012

CS157a Fall 2018 Sec3 Home Page/Syllabus

SCHEME OF COURSE WORK. Data Warehousing and Data mining

San José State University Computer Science Department CS49J, Section 3, Programming in Java, Fall 2015

IST359 - INTRODUCTION TO DATABASE MANAGEMENT SYSTEMS

San Jose State University College of Science Department of Computer Science CS151, Object-Oriented Design, Sections 1,2 and 3, Spring 2017

University of Asia Pacific (UAP) Department of Computer Science and Engineering (CSE)

Syllabus Revised 08/21/17

School of Computer Science

Lecture Notes CPSC 321 (Fall 2018) Today... Survey. Course Overview. Homework. HW1 (out) S. Bowers 1 of 8

FIT3056 Secure and trusted software systems. Unit Guide. Semester 2, 2010

Syllabus Revised 01/03/2018

San José State University Department of Computer Science CS-144, Advanced C++ Programming, Section 1, Fall 2017

Syllabus for HPE 021 Advanced Golf and Fitness 1 Credit Hour Fall 2014

Rochester Institute of Technology Golisano College of Computing and Information Sciences Department of Information Sciences and Technologies

VIS II: Design Communication Graphic Design Basics, Photoshop and InDesign Spring 2018

College of San Mateo Course Outline

Murach's HTML and CSS3 3 rd Edition By Boehm, Anne Fresno, Calif Publisher: Mike Murach & Associates, 2015 ISBN-13:

San José State University Department of Computer Science CS166, Information Security, Section 1, Fall, 2018

CASPER COLLEGE COURSE SYLLABUS MSFT 1600 Managing Microsoft Exchange Server 2003 Semester/Year: Fall 2007

CSCE 441 Computer Graphics Fall 2018

San Jose State University College of Science Department of Computer Science CS185C, NoSQL Database Systems, Section 1, Spring 2018

IS Spring 2018 Database Design, Management and Applications

Syllabus for HPE 099 Aerobic Proficiency 1 Credit Hour Fall 2012

MORGAN STATE UNIVERSITY DEPARTMENT OF ELECTRICAL AND COMPUTER ENGINEERING COURSE SYLLABUS FALL, 2015

CIS 3308 Web Application Programming Syllabus

Database Systems (INFR10070) Dr Paolo Guagliardo. University of Edinburgh. Fall 2016

BEST BIG DATA CERTIFICATIONS

Model 4.2 Faculty member + student Course syllabus for Advanced programming language - CS313D

CSE 344 JANUARY 3 RD - INTRODUCTION

Big Sandy Community and Technical College. Course Syllabus

University of Asia Pacific (UAP) Department of Electrical and Electronics Engineering (EEE) Course Outline

Syllabus for HPE 099 Aerobic Proficiency 1 Credit Hour Spring 2015

San José State University Computer Science CS 122 Advanced Python Programming Spring 2018

ESET 349 Microcontroller Architecture, Fall 2018

Murach's HTML and CSS3 3 rd Edition By Boehm, Anne Fresno, Calif Publisher: Mike Murach & Associates, 2015 ISBN-13:

Database Design and Management - BADM 352 Fall 2009 Syllabus and Schedule

In this course, you need to use Pearson etext. Go to "Pearson etext and Video Notes".

AE Computer Programming for Aerospace Engineers

Geography 4150/5150. Teaching Assistant: Chenjun Ling Office: McEniry 427 Office Hours: Tuesday 2:30-4:40pm

COURSE SYLLABUS ****************************************************************************** YEAR COURSE OFFERED: 2015

CSCI 6312 Advanced Internet Programming

Introduction to Information Technology ITP 101x (4 Units)

Database Concepts. CS 377: Database Systems

Fundamentals of Computer Science CSCI 136 Syllabus Fall 2018

Computer Science Technology Department

ICS111 Introduction to Computer Science

San José State University Science/Computer Science Database Management System I

Course Title: Computer Networking 2. Course Section: CNS (Winter 2018) FORMAT: Face to Face

Object-Oriented Programming for Managers

CSC 1052 Algorithms & Data Structures II: Introduction

Operating Systems, Spring 2015 Course Syllabus

TCOM 663/CFRS Intrusion Detection and Forensics Department of Electrical and Computer Engineering George Mason University Fall, 2010

Syllabus Revised 08/15/2018

IS 331-Fall 2017 Database Design, Management and Applications

Advanced Topics in Database Systems Spring 2016

Instructor: Anna Miller

Compulsory course in Computer Science

Survey of Programming Languages Dr. R. M. Siegfried 407 Science (516) (not for homework submission)

CS/SE 153 Concepts of Compiler Design

Eight units must be completed and passed to be awarded the Diploma.

COURSE SYLLABUS ETHC 210 SCIENCE AND SOCIETY

University of Asia Pacific (UAP) Department of Computer Science and Engineering (CSE)

Transcription:

Introduction to Computational Thinking and Data Science USC Viterbi School of Engineering http://www.datascience4all.org Term: Fall 2016 Time: Tues- Thur 10am- 11:50am Location: Allan Hancock Foundation Building (AHF) 145D Instructor: Dr. Yolanda Gil Office: GER 207 Office Hours: Thursdays 9am- 10am Contact: gil@isi.edu Instructor: Dr. Atefeh Farzindar Office: GER 207 Office Hours: Thursdays 11:50 am- 12:50 pm Contact: farzinda@usc.edu USC course number: INF 549 Units: 4 Catalogue Course Description Introduction to data analysis techniques and associated computing concepts for non- programmers. Topics include foundations for data analysis, visualization, parallel processing, metadata, provenance, and data stewardship. Expanded Course Description This course will teach non- programmers to think in computing terms about modern topics, and to approach real- world phenomena through data science. The course will enable students to: Acquire computational thinking skills that will enable students to represent and reason about complex problems in the digital arena Understand in terms of their possibilities and limitations to approach complex problems cast in terms of the emerging field of data science Become data science scholars through best practices in data documentation and dissemination The course is intended for students in disciplines outside of computer science, so no prior experience with computer science is assumed. The course topics will be particularly relevant to students interested in physical sciences and social sciences. This class will include eight homework assignments and a final exam. Learning Objectives This course teaches non- programmers to think in computing terms about modern topics, and to approach real- world phenomena through data science. The course introduces and

corresponding approaches to data analysis, including geospatial data, time series, networks, and multimedia data. Students learn to run multi- step analysis through a graphical workflow interface, and will experience first hand complex concepts in data science such as parallel computing, provenance, and visualization. Students also learn to use ontologies and logic representations to capture metadata and other knowledge about complex data. The course includes practical lessons to use workflow and ontology development toolkits, as well as best practices for data stewardship and dissemination. Prerequisite(s): none Co- Requisite (s): none Recommended Preparation: Mathematics and logic undergraduate courses. Description and Assessment of Assignments There will be a homework assignment every 3 or 4 lectures. The assignments must be submitted individually and students will receive individual scores. Students may work in groups to complete the tasks. The homework assignments are expected to take 6-8 hours. Each assignment is graded on a scale of 0-100 and the grading criteria will be specified in each assignment. The homework topics are listed in the Course Schedule. The homeworks include a class project that will be developed by the students independently in 3 separate stages, getting feedback from the instructors at each stage. Syllabus and Class Schedule Date Topic Material Covered s Section I: Introduction to Computational Thinking and Data Science 1 8/23 Computation What is computational thinking al thinking Computational thinking for and data reasoning and analysis science What is data science Data scientists The context of data science 2 8/25 Data What is data What is not (yet) data Time series data Networked data Geospatial data Text data Labeled and annotated data Big data HW1: Project part 1 (due 9/6) 3 8/30 Data analysis software Programs for data analysis Inputs and Outputs Program Parameters Programming Languages Programs as Black Boxes Algorithms versus software Syllabus for INF 549 Fall 2016, Page 2

4 9/1 Multi- step data analysis as workflows 5 9/6 Workflow practicum Building workflows by composing software Pre- processing and post- processing data Workflows for data analysis Workflow inputs and parameters Executing workflows Exploring data through workflows Workflows in practice The WINGS workflow system Workflows in practice 6 9/8 Provenance What is provenance Provenance concerning objects Provenance concerning people and institutions Provenance concerning processes Provenance models Provenance standards Section II: Data Analysis 7 9/13 Data pre- processing 8 9/15 Data analysis tasks (I) 9 9/20 Data analysis tasks (II) 10 9/22 Data analysis tasks (III) Data cleaning Quality control Data integration Feature selection Feature construction Data analysis tasks in data mining, statistics, and machine learning Supervised learning o Classification tasks o Classification algorithms o Evaluation of classifiers Unsupervised learning o Clustering o Pattern detection o Anomaly detection Simulation and prediction Causality o Probabilistic graphical models o Bayesian networks o Causal models HW2: Exploring data analysis workflows (due 9/15) HW3: Analyzing data with workflows (due 9/29) Syllabus for INF 549 Fall 2016, Page 3

11 9/27 Data lifecycle Data collection Data storage Data extraction and querying Data integration Data presentation Section III: Data Analysis in Practice 12 9/29 Analyzing (I) 13 10/4 Analyzing (II) 14 10/6 Parallel and distributed computing for big data (I) 15 10/11 Parallel and distributed computing for big data (II) Analyzing multimedia data o Pre- processing images o Segmentation o Edge detection o Object detection o Video analysis Analyzing geospatial data o Coordinate systems GIS systems Analyzing time series data o Collecting time series data o Pre- processing time series data o Event detection o Granger causality Cost of computation Divide and conquer Parallel computing Multi- core computing Distributed computing Cluster computers Cloud computing Grid computing Virtual machines Web services Speedup with parallel computing Dependencies and message passing Limits of speedup: Critical path Amdahl s law Embarrassingly parallel computations When problems are not parallelizable Execution failures Reduction through HW4: Project part 2 (due 10/11) Syllabus for INF 549 Fall 2016, Page 4

16 10/13 Data visualization 17 10/18 Analyzing (III) 18 10/20 Analyzing (IV) 19 10/25 Analyzing (V) 20 12/27 Project management MapReduce/Hadoop Quality of visualizations Major types of visualizations Time series visualizations Geospatial visualizations Multi- dimensional spaces Network visualizations Analyzing text data o Pre- processing text o Document classification o Document clustering o Topic detection Sentiment analysis Analyzing network data o Network structure o Dynamic networks o Scale- free networks o Network analysis Social media Analysis Geolocation Reading:NLP for social media, Ch1 and Ch3 Multidisciplinary collaborations Project management Lean Sixsigma Reading:NLP for social media, Ch4 HW5: Data visualization (due 10/25) Section IV: Metadata 21 11/1 Semantic metadata What is metadata Basic metadata versus semantic metadata Metadata about data collection Metadata about data processing Metadata for search and retrieval Metadata standards Domain metadata and ontologies 22 11/3 Ontologies (I) What is an ontology Taxonomies and class inheritance Properties Logical constraints Ontologies (II) o Logical reasoning and 6: Parallel Processing (due 11/8) Syllabus for INF 549 Fall 2016, Page 5

23 11/8 Ontologies (III) Section V: Data Dissemination 24 11/10 Data stewardship 25 11/15 Data formats and standards 26 11/17 Tracking metadata and provenance Section VI: Advanced Topics 27 11/22 Privacy and ethics in data science inference o Expressivity and computation o The Semantic Web Practicum: the PROTÉGÉ ontology editor Data sharing Data identifiers Licenses for data Data citation and attribution Software and other work products Data formats Data standards Data repositories Data services The Semantic Web and linked open data Combining computation with metadata and provenance Validating a data analysis method Tracking provenance during data analysis Automatically generating metadata for data analysis Privacy o Fair Information Practices o Managing sensitive data o Anonymizing sensitive data, k- anonymity, ial privacy o Re- identifying datasets Reproducibility Societal value of data 28 11/29 Databases File systems vs databases Relational databases o Data models o SQL o Transactions HW7: Developing ontologies (due 11/22) HW8: Project part 3- final report (due 12/1) Syllabus for INF 549 Fall 2016, Page 6

NoSQL databases 29 12/1 Review Selected topics Assignment Submission Policy assignments are due at 11:59pm on the due date and should be submitted in Blackboard. will be accepted up to one week late as long as the student requested a late submission ahead of the deadline, and in that case the assignment will be graded at 20% less than the possible points for the assignment. After one week, the assignment will not be graded. Grading Breakdown Quizzes: There will be weekly quizzes based on the material from the week before. There is no mid- term for this class. : There will be eight homework assignments throughout the course. The homework topics are listed in the Syllabus and Class Schedule. Final Exam: There is a final exam at the end of the semester covering all of the material covered in the class. Grading Schema: Quizzes 20% assignments 50% Class participation 10% Final: 20% Total 100% Grades will range from A through F. Syllabus for INF 549 Fall 2016, Page 7