Statistics Lecture 6. Looking at data: one variable

Transcription:

Statistics 111 - Lecture 6
Looking at data: one variable
Chapter 1.1, Moore, McCabe and Craig

Probability vs. Statistics

Probability
1. We know the distribution of the random variable (Normal, Binomial)
2. We know how to compute different parameters of the random variable (Expected value, Variance)

Statistics
1. We have a sample (dataset) of n observations
2. We don't know the underlying distribution
3. We try to estimate and make inferences about the parameters of the population

Dataset Definitions
- Individuals are the objects described in a set of data.
- A variable is any characteristic of an individual. A variable can take different values for different individuals.
- A categorical variable places an individual into one of several groups. Examples: gender, race
- A quantitative variable takes on numerical values, which are usually treated as continuous. Examples: height, age, wages

Distributions
A distribution describes what values a variable takes and how frequently these values occur.
The distribution of a variable can be described graphically:
- Categorical variable: bar plot, pie chart
- Quantitative variable: boxplot, histogram
Characteristics of distributions: center, spread, shape, outliers
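As a minimal sketch of "what values a variable takes and how frequently these values occur", the distribution of a categorical variable can be tabulated by counting. The responses below are hypothetical and are not data from the lecture.

```python
from collections import Counter

# Hypothetical responses for a categorical variable (favorite color).
responses = ["blue", "red", "blue", "green", "blue", "red", "purple", "green"]

# Counting how often each value occurs gives the distribution.
counts = Counter(responses)
n = len(responses)

# Relative frequency = count / total number of individuals.
relative = {color: count / n for color, count in counts.items()}

# Print a small frequency table, most common category first.
for color, count in counts.most_common():
    print(f"{color:<8}{count:>3}{relative[color]:>8.3f}")
```

A bar plot draws one bar per row of this table, with the bar height equal to the count or the relative frequency.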

Barplots and Pie Charts
For categorical variables, we can graph the distribution using bar plots and pie charts.
[Figure: bar plot and pie chart of "Your favorite color"]

Barplots and Pie Charts
Pie charts are generally not as useful as bar plots:
- All categories are needed to make a pie chart, so it is harder to compare subsets of categories
- The scale of a pie chart can sometimes be misleading, so it is harder to see small differences

Boxplots
Box plots are an effective tool for conveying information about continuous variables.
- The box contains the central 50% of the data, with a line indicating the median
- The median is the value with 50% of the data on either side
- The whiskers contain most of the rest of the data, except for suspected outliers
- Outliers are suspiciously large or small values

Boxplots
Box plots were originally designed to visually diagnose a normal distribution.

Boxplot: Shoe Size of Stat 111 Class
- Almost all values are between 5 and 13
- 50% of values are between 7.5 and 10
- Center (median) is around 8.5
- A couple of suspected outliers: 14 and 14.5
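The numbers a boxplot displays can be computed directly. The sketch below uses hypothetical shoe sizes, chosen only so that the summary resembles the one described above; they are not the class data. The 1.5 x IQR rule for flagging suspected outliers is a common convention and is assumed here, since the slides do not say which rule was used.

```python
import statistics

# Hypothetical shoe sizes (21 values), not the actual Stat 111 data.
shoe_sizes = [5, 6, 6.5, 7, 7, 7.5, 7.5, 8, 8, 8.5, 8.5,
              8.5, 9, 9, 9.5, 10, 10.5, 11, 13, 14, 14.5]

# Quartiles: the box runs from q1 to q3, with a line at the median.
q1, median, q3 = statistics.quantiles(shoe_sizes, n=4, method="inclusive")
iqr = q3 - q1

# Assumed convention: points beyond 1.5 * IQR from the box are suspected outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in shoe_sizes if x < lower_fence or x > upper_fence]

# Whiskers extend to the most extreme values that are not flagged.
inside = [x for x in shoe_sizes if lower_fence <= x <= upper_fence]
whisker_low, whisker_high = min(inside), max(inside)

print("box:", q1, "to", q3, "median:", median)
print("whiskers:", whisker_low, "to", whisker_high)
print("suspected outliers:", outliers)
```

Different software uses slightly different quartile definitions, so the box edges can shift a little between packages for the same data.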

6/3/2010

Summary of Boxplots
- Useful for displaying the center and spread of a distribution, as well as potential outliers
- However, a boxplot doesn't really give us much of an idea of the shape of the distribution
- Histograms are much better graphical summaries of shape

Histograms
Histograms emphasize the frequency of different values in the distribution.
[Figure: histogram of Height (x-axis, 60 to 74) against Frequency (y-axis, 0 to 12)]
- X-axis: values are divided into bins
- Y-axis: the height of each bin is the frequency with which values from that bin appear in the dataset
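The binning step behind a histogram can be sketched as follows. The heights are hypothetical, and the bin width of 2 starting at 60 is an assumption chosen to resemble the axis in the figure.

```python
# Hypothetical heights in inches, not the actual class data.
heights = [61, 62, 63.5, 64, 64, 65, 66, 66, 67, 67.5, 68, 68, 69, 70, 71, 73]

# Assumed binning: 7 bins of width 2 covering 60 up to (but not including) 74.
low, bin_width, n_bins = 60, 2, 7

# Each value falls in the bin [left, left + bin_width); count values per bin.
counts = [0] * n_bins
for h in heights:
    index = int((h - low) // bin_width)
    counts[index] += 1

# Text version of the histogram: one row per bin, one star per observation.
for i, count in enumerate(counts):
    left = low + i * bin_width
    bar = "*" * count
    print(f"[{left}, {left + bin_width})  {bar}  {count}")
```

Changing the bin width changes the picture: very wide bins hide the shape, while very narrow bins make it look noisy.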

Another Example: Height in Stat 111
[Figure: two histograms of Height (x-axis, 60 to 72), one with Frequency (0 to 12) and one with Density (0.00 to 0.15) on the vertical axis]
The vertical axis is sometimes the relative frequency: the frequency of the bin divided by the total number of observations. A density scale additionally divides by the bin width, so that the areas of the bars sum to 1.

Histograms versus Boxplots
- Both graphs give a good idea of the spread
- Boxplots may be a little clearer in terms of the center and outliers in a distribution
[Figure: boxplot and histogram annotated with "center", "outliers", and "spread of likely values"]
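To make the vertical-axis scales concrete, the sketch below converts bin counts to relative frequencies and then to densities. The counts and the bin width of 2 are hypothetical. Relative frequency is the count divided by the number of observations; a density scale also divides by the bin width.

```python
# Hypothetical bin counts for 7 bins of width 2 (not the class data).
counts = [1, 2, 3, 4, 3, 2, 1]
bin_width = 2
n = sum(counts)

# Relative frequency: share of observations in each bin (heights sum to 1).
relative = [c / n for c in counts]

# Density: relative frequency per unit of the variable (areas sum to 1).
density = [r / bin_width for r in relative]

print("relative frequencies:", relative)
print("densities:", density)
print("sum of heights (relative):", sum(relative))
print("sum of areas (density):", sum(d * bin_width for d in density))
```

All three scales give a histogram of the same shape; only the labels on the vertical axis change.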

Histograms versus Boxplots
Histograms are much more effective at displaying the shape of a distribution:
- Skewness: departure from left-right symmetry
- Multi-modality: presence of multiple high-frequency values
[Figure: "Symmetry - Histograms vs. Boxplots", with panels annotated "clearly not symmetric", "not symmetric?", and "second peak"]

Density Curves
It is often easier to examine a distribution with a smooth curve instead of a histogram.
Example: vocabulary scores from 947 seventh graders in Gary, Indiana

Example with Test Score Data
- The number of scores less than 6 in the population is 287 out of 947, so the relative frequency is 0.303
- Using a density curve (normal distribution), the approximate relative frequency is 0.293
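The comparison above can be reproduced roughly as follows. The counts 287 and 947 come from the slide. The normal curve's mean (6.84) and standard deviation (1.55) are not given in these slides; they are assumed here for illustration, so the result only approximately matches the quoted 0.293.

```python
import math

def normal_cdf(x, mean, sd):
    """Area under a normal density curve to the left of x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

# Observed relative frequency of scores below 6 (counts from the slide).
observed = 287 / 947

# ASSUMPTION: parameters of the fitted normal curve, not stated in the slides.
ASSUMED_MEAN = 6.84
ASSUMED_SD = 1.55

# Approximate relative frequency: area under the curve to the left of 6.
approx = normal_cdf(6, ASSUMED_MEAN, ASSUMED_SD)

print(f"observed relative frequency: {observed:.3f}")
print(f"normal approximation:        {approx:.3f}")
```

The small gap between the two numbers illustrates that a density curve is an idealized description of the data rather than an exact fit.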

Approximations
Real data will never exactly fit a density curve, i.e. be exactly symmetric or normally distributed.

Graphs that made a difference.

Time to JMP!