Data Analyst Nanodegree Syllabus

Discover Insights from Data with Python, R, SQL, and Tableau

Before You Start

Prerequisites: To succeed in this program, we recommend having experience working with data in SQL and/or a spreadsheet tool like Microsoft Excel. You should also have a good understanding of descriptive statistics, including how to calculate and interpret measures of center (mean, median, mode) and measures of spread (standard deviation, 5-number summary), and how to build bar charts, histograms, boxplots, and scatterplots.

Educational Objectives: Learn to organize data, uncover patterns and insights, draw meaningful conclusions, and clearly communicate critical findings. Learn to use Python, R, SQL, and Tableau. Gain all the skills necessary to get a job as a data analyst.

Program Design

Length of Program*: The program is divided into two terms of three months each (approx. 13 weeks). We expect students to work 10 hours/week on average. Estimated time commitment is 130 hours per term.
Textbooks required: None
Instructional Tools Available: Video lectures, personalized project reviews, live chat help, dedicated mentor

*The length is an estimate of the total hours the average student may take to complete all required coursework, including lecture and project time. Actual hours may vary.

TERM 1: DATA ANALYSIS WITH PYTHON AND SQL

Intro Project: Explore Weather Trends (5 hrs)
This project will introduce you to the key steps of the data analysis process. You'll do so by analyzing data from a bike share company found in the San Francisco Bay Area. You'll submit this project in your first 7 days, and by the end you'll be able to:
- Use basic Python code to clean a dataset for analysis
- Run code to create visualizations from the wrangled data
- Analyze trends shown in the visualizations and report your conclusions
- Determine if this program is a good fit for your time and talents
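As a quick refresher on the descriptive statistics named in the prerequisites, the following is a minimal sketch using only Python's standard library. The `durations` list is hypothetical illustration data, not taken from any program dataset, and the quartiles use the "inclusive" method; other tools may define quartiles slightly differently.

```python
import statistics

# Hypothetical trip durations in minutes (illustrative data only).
durations = [5, 7, 7, 8, 10, 12, 15, 21]

# Measures of center
mean = statistics.mean(durations)
median = statistics.median(durations)
mode = statistics.mode(durations)

# Measures of spread
stdev = statistics.stdev(durations)  # sample standard deviation
q1, q2, q3 = statistics.quantiles(durations, n=4, method="inclusive")
five_number_summary = (min(durations), q1, q2, q3, max(durations))

print("mean:", mean, "median:", median, "mode:", mode)
print("sample standard deviation:", round(stdev, 2))
print("5-number summary:", five_number_summary)
```

The 5-number summary is the set of values a boxplot draws: minimum, first quartile, median, third quartile, and maximum.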

Project: Explore US Bikeshare Data (40 hrs)
You will use Python to perform steps of the data analysis process on bikeshare trip data collected from three US cities. You will write code to clean the data, compute descriptive statistics, and create basic visualizations of the distribution of data.

Supporting Lesson Content: Introduction to Python Programming

NUMBERS AND STRINGS
- Learn about Python's numeric and string data types
- Use variables to store data
- Use built-in functions and methods

FUNCTIONS, INSTALLATION, AND CONDITIONALS
- Install Python on your computer
- Organize your code into functions
- Use conditionals to make decisions

DATA STRUCTURES AND LOOPS
- Use collection data types: lists, sets, and dictionaries
- Write `for` and `while` loops to express repetition
- Practice refactoring and problem solving

FILES AND MODULES
- Use modules from the Python standard library and from third-party libraries
- Read data from files on disk
- Use online resources to help solve problems

Project: Investigate a Dataset (40 hrs)
In this project, you'll choose one of Udacity's curated datasets and investigate it using NumPy and pandas. You'll complete the entire data analysis process, starting by posing a question and finishing by sharing your findings.

Supporting Lesson Content: Introduction to Data Analysis

Data Analysis in Python

DATA ANALYSIS PROCESS
- Learn about the key steps of the data analysis process
- Investigate multiple datasets using Python and pandas

PANDAS AND NUMPY: CASE STUDY 1
- Perform the entire data analysis process on a dataset
- Learn to use NumPy and pandas to wrangle, explore, analyze, and visualize data

PANDAS AND NUMPY: CASE STUDY 2
- Perform the entire data analysis process on a dataset
- Learn more about NumPy and pandas to wrangle, explore, analyze, and visualize data

Introduction to SQL

BASIC SQL
- Write common SQL commands including SELECT, FROM, and WHERE, as well as corresponding logical operators

SQL JOINS
- Write JOINs in SQL to combine data from multiple sources and answer more complex business questions

SQL AGGREGATIONS
- Write common aggregations in SQL including COUNT, SUM, MIN, and MAX
- Write CASE and DATE functions, as well as work with NULLs

ADVANCED SQL QUERIES
- Edit a database using CREATE TABLE, INSERT INTO, UPDATE, and other statements
- Use window functions and subqueries to add steps to a query
- Use documentation to learn new functions and complete complex tasks

Project: Analyze Experiment Results (45 hrs)
In this project, you will be provided a dataset reflecting data collected from an experiment. You'll use statistical techniques to answer questions about the data and present your conclusions and recommendations in a report.

Supporting Lesson Content: Practical Statistics

STANDARDIZING
- Convert distributions into the standard normal distribution using the Z-score
- Compute proportions using standardized distributions

NORMAL DISTRIBUTION
- Use normal distributions to compute probabilities
- Use the Z-table to look up the proportions of observations above, below, or in between values

SAMPLING DISTRIBUTIONS
- Apply the concepts of probability and normalization to sample data sets

ESTIMATION
- Estimate population parameters from sample statistics using confidence intervals
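The standardizing and estimation outcomes above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the course's own material: the helper names `z_score` and `confidence_interval` are hypothetical, the numbers are made up, and the interval uses a normal critical value, whereas a t critical value is usually more appropriate for small samples.

```python
import math
from statistics import NormalDist, mean, stdev

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma

def confidence_interval(sample, level=0.95):
    """Normal-approximation confidence interval for the population mean."""
    n = len(sample)
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    margin = z_crit * stdev(sample) / math.sqrt(n)  # critical value times standard error
    center = mean(sample)
    return center - margin, center + margin

# Hypothetical values: a score of 130 from a distribution with mean 100 and SD 15.
z = z_score(130, mu=100, sigma=15)
proportion_below = NormalDist().cdf(z)  # what a Z-table lookup would give

# Hypothetical sample used to estimate a population mean.
low, high = confidence_interval([10, 12, 14, 16, 18])

print("z =", z, "proportion below =", round(proportion_below, 4))
print("95% interval for the mean:", (round(low, 2), round(high, 2)))
```

`NormalDist().cdf` plays the role of the Z-table here: it returns the proportion of observations below a given Z-score.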

HYPOTHESIS TESTING
- Use critical values to make decisions on whether or not a treatment has changed the value of a population parameter

T-TESTS
- Test the effect of a treatment or compare the difference in means for two groups when we have small sample sizes

REGRESSION
- Build a linear regression model to understand the relationship between independent and dependent variables
- Use linear regression results to make a prediction

TERM 2: ADVANCED DATA ANALYSIS

Intro Project: Test a Perceptual Phenomenon (10 hrs)
In this project, you'll use descriptive statistics and a statistical test to analyze the Stroop effect, a classic result of experimental psychology. Communicate your understanding of the data and use statistical inference to draw a conclusion based on the results.

Supporting Lesson Content: Practical Statistics

Project: Wrangle and Analyze Data (50 hrs)
Real-world data rarely comes clean. Using Python, you'll gather data from a variety of sources, assess its quality and tidiness, then clean it. You'll document your wrangling efforts in a Jupyter Notebook, plus showcase them through analyses and visualizations using Python and SQL.

Supporting Lesson Content: Data Wrangling

INTRO TO DATA WRANGLING
- Identify each step of the data wrangling process (gathering, assessing, and cleaning)
- Wrangle a CSV file downloaded from Kaggle using fundamental gathering, assessing, and cleaning code

GATHERING DATA
- Gather data from multiple sources, including gathering files, programmatically downloading files, web-scraping data, and accessing data from APIs
- Import data of various file formats into pandas, including flat files (e.g. TSV), HTML files, TXT files, and JSON files
- Store gathered data in a PostgreSQL database
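To make the gathering outcomes concrete, here is a minimal sketch that combines records from a TSV flat file and a JSON payload, then stores them in a SQL table. Several things are assumptions made so the example runs anywhere: the input strings are hypothetical stand-ins for a downloaded file and an API response, the `trips` table and its columns are invented for illustration, and SQLite (from the standard library) stands in for the PostgreSQL database named in the syllabus.

```python
import csv
import io
import json
import sqlite3

# Hypothetical inputs standing in for a downloaded flat file and an API response.
tsv_text = "trip_id\tcity\tminutes\n1\tA\t12\n2\tB\t7\n"
json_text = '[{"trip_id": 3, "city": "C", "minutes": 20}]'

# Gather: parse both formats into a single list of dictionaries.
rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
rows += json.loads(json_text)

# TSV fields arrive as strings, so normalize the types before storing.
records = [
    {"trip_id": int(r["trip_id"]), "city": r["city"], "minutes": int(r["minutes"])}
    for r in rows
]

# Store: SQLite in memory is used here in place of PostgreSQL (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trips (trip_id INTEGER PRIMARY KEY, city TEXT, minutes INTEGER)"
)
conn.executemany(
    "INSERT INTO trips (trip_id, city, minutes) VALUES (:trip_id, :city, :minutes)",
    records,
)
conn.commit()

# Check the load with a simple aggregation.
total, avg_minutes = conn.execute(
    "SELECT COUNT(*), AVG(minutes) FROM trips"
).fetchone()
print("rows stored:", total, "average minutes:", avg_minutes)
```

The same gather-then-store shape would apply with pandas and a PostgreSQL driver; only the parsing and connection calls would change.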

ASSESSING DATA
- Assess data visually and programmatically using pandas
- Distinguish between dirty data (content or quality issues) and messy data (structural or tidiness issues)
- Identify data quality issues and categorize them using metrics: validity, accuracy, completeness, consistency, and uniformity

CLEANING DATA
- Identify each step of the data cleaning process (defining, coding, and testing)
- Clean data using Python and pandas
- Test cleaning code visually and programmatically using Python

Project: Explore and Summarize Data (50 hrs)
In this project, you'll use R and apply exploratory data analysis techniques to explore a selected data set for distributions, outliers, and anomalies.

Supporting Lesson Content: Data Analysis with R

WHAT IS EDA?
- Define and identify the importance of exploratory data analysis (EDA)

R BASICS
- Install RStudio and packages
- Write basic R scripts to inspect datasets

EXPLORE ONE VARIABLE
- Quantify and visualize individual variables within a dataset
- Create histograms and boxplots
- Transform variables
- Examine and identify tradeoffs in visualizations

EXPLORE TWO VARIABLES
- Properly apply relevant techniques for exploring the relationship between any two variables in a data set
- Create scatter plots
- Calculate correlations
- Investigate conditional means

EXPLORE MANY VARIABLES
- Reshape data frames and use aesthetics like color and shape to uncover information

DIAMONDS AND PRICE PREDICTIONS
- Use predictive modeling to determine a good price for a diamond
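Two of the two-variable outcomes above, calculating correlations and investigating conditional means, can be illustrated with a short sketch. It is written in Python rather than R only to match the language used in the other examples here; the lessons themselves use R. The helper names `pearson_r` and `conditional_means` are hypothetical, and the carat, price, and cut values are made-up illustration data, not the course's diamonds dataset.

```python
import math
from collections import defaultdict
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def conditional_means(groups, values):
    """Mean of values within each group label."""
    buckets = defaultdict(list)
    for group, value in zip(groups, values):
        buckets[group].append(value)
    return {group: mean(vals) for group, vals in buckets.items()}

# Hypothetical diamond-like records (illustrative values only).
carat = [0.3, 0.5, 0.7, 1.0, 1.5]
price = [400, 1200, 2300, 4800, 9500]
cut = ["Good", "Ideal", "Good", "Ideal", "Ideal"]

r = pearson_r(carat, price)
means_by_cut = conditional_means(cut, price)

print("correlation between carat and price:", round(r, 3))
print("mean price by cut:", means_by_cut)
```

A correlation close to 1 indicates a strong positive linear relationship, while the conditional means show how the average of one variable shifts across the levels of another.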

Project: Create a Tableau Story (20 hrs)
In this project, you'll create a data visualization, using Tableau, from a data set that tells a story or highlights trends or patterns in the data. Your work should be a reflection of the theory and practice of data visualization, harnessing visual encodings and design principles for effective communication.

Supporting Lesson Content: Data Visualization with Tableau

DATA VISUALIZATION FUNDAMENTALS
- Understand the importance of data visualization
- Know how different data types are encoded in visualizations

DESIGN PRINCIPLES
- Select the most effective chart or graph based on the data being displayed
- Use color, shape, size, and other elements effectively

CREATING VISUALIZATIONS WITH TABLEAU
- Become proficient in basic Tableau functionality, including charts, filters, hierarchies, etc.
- Create calculated fields in Tableau

TELLING STORIES WITH TABLEAU
- Create Tableau dashboards and stories to effectively communicate data