Stat 528 (Autumn 2008) Density Curves and the Normal Distribution. Measures of center and spread. Features of the normal distribution

Stat 528 (Autumn 2008)
Density Curves and the Normal Distribution
Reading: Section 1.3
Density curves
An example: GRE scores
Measures of center and spread
The normal distribution
Features of the normal distribution
Useful rules of thumb
z-scores and the standard normal
Calculations using the normal distribution
Normal quantile plots (checking normality graphically)

Density curves
Think of the histogram. It is often useful to have a mathematical description of the distributions we observe in data.
Is it possible to model our data using a smooth curve?
A density curve is a function describing the height of a relative frequency histogram at given values of the distribution. It is a mathematical model for the pattern of a distribution.
The density curve always has non-negative height.
The area under the density curve is one.
Given a useful model for the data we observe, we can then answer questions about the data using the model.

An example: GRE scores
The Graduate Record Examinations (GRE) are widely used to help predict the performance of applicants to graduate schools. Suppose a psychology department at a university has 34 applicants with the following quantitative GRE scores:
569 528 698 676 543 702 436 482 655 500
536 617 548 334 620 567 605 564 545 647
579 518 465 744 645 513 728 627 584 572
449 575 399 797
[Figure: histogram of the GRE scores; x-axis: GRE score, y-axis: Frequency]

Histogram of the GRE scores
Summarize the distribution of the GRE scores.
Is there some smooth curve which is a good summary for this distribution?
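One way to start summarizing the 34 scores is to compute a few numerical summaries and tabulate a histogram. The sketch below is illustrative only: the bin width of 50 points and the helper name `bin_counts` are assumptions, not taken from the course materials.

```python
from collections import Counter
from statistics import mean, median, stdev

# The 34 quantitative GRE scores listed in the example.
gre_scores = [
    569, 528, 698, 676, 543, 702, 436, 482, 655, 500,
    536, 617, 548, 334, 620, 567, 605, 564, 545, 647,
    579, 518, 465, 744, 645, 513, 728, 627, 584, 572,
    449, 575, 399, 797,
]


def bin_counts(scores, width=50):
    """Count scores in bins [k*width, (k+1)*width).

    Hypothetical helper; the bin width is an assumed choice.
    Returns a dict mapping each bin's lower edge to its count.
    """
    counts = Counter((s // width) * width for s in scores)
    return dict(sorted(counts.items()))


# Numerical summaries of center and spread.
summary = {
    "n": len(gre_scores),
    "mean": mean(gre_scores),
    "median": median(gre_scores),
    "stdev": stdev(gre_scores),  # sample standard deviation
}
histogram = bin_counts(gre_scores)

if __name__ == "__main__":
    print(summary)
    for lower_edge, count in histogram.items():
        # A crude text histogram: one asterisk per applicant.
        print(f"{lower_edge}-{lower_edge + 49}: {'*' * count}")
```

The text histogram makes it easy to judge by eye whether the scores look roughly symmetric and single-peaked before fitting any smooth curve.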

Measures of center and spread for density curves
The median of a density curve is the value for which 50% of the area under the curve is on the left and 50% is on the right.
The mean of a density curve is the balancing point.
The mode(s) of a density curve is/are the value(s) with the highest density (height of the curve).
The mean and median are the same for a symmetric density curve.
We need some math (calculus) to calculate the measures of spread for density curves.

An example
Example 1.83: Figure 1.36 of the book (page 85) displays three density curves, each with three points marked on it. At which points do the mean and median fall?

The normal distribution
The most important distribution in statistics.
It turns up everywhere, e.g., heights and weights, test scores, measurement errors in scientific experiments, concentrations of chemicals.
It also occurs because certain averages are approximately normally distributed (see later).
A normal distribution is described by a density curve with a given equation. The height of the density curve at a point x is
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right).$$
The normal distribution is determined by two parameters called µ and σ. µ can be any value, but σ > 0.
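The normal density can be evaluated directly from its equation. The following is a minimal sketch, not part of the course materials: the function names `normal_pdf` and `area_under_curve` are hypothetical, and the area check uses a simple trapezoid rule over µ ± 8σ as an assumed stand-in for the exact integral.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Height of the N(mu, sigma) density curve at x."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def area_under_curve(mu, sigma, half_width=8.0, steps=10_000):
    """Trapezoid-rule approximation of the area between mu +/- half_width*sigma."""
    a = mu - half_width * sigma
    b = mu + half_width * sigma
    h = (b - a) / steps
    total = 0.5 * (normal_pdf(a, mu, sigma) + normal_pdf(b, mu, sigma))
    for i in range(1, steps):
        total += normal_pdf(a + i * h, mu, sigma)
    return total * h


if __name__ == "__main__":
    # Peak height is 1 / (sigma * sqrt(2*pi)), so a larger sigma gives a flatter curve.
    print(normal_pdf(0.0, mu=0.0, sigma=1.0))
    print(normal_pdf(5.0, mu=5.0, sigma=5.0))
    # The total area should be very close to one for any valid mu and sigma.
    print(area_under_curve(5.0, 5.0))
```

Changing µ slides the curve along the x-axis without changing its shape, while changing σ stretches or compresses it.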

Plots of the normal distribution
[Figure: four panels plotting the normal density f(x) against x, for mu=0, sigma=1; mu=5, sigma=1; mu=0, sigma=5; and mu=5, sigma=5]

Features of the normal distribution
The distribution is symmetric, so the median is equal to µ.
There are points of inflection at µ − σ and µ + σ. What this means: the curve turns downwards (is concave down) between µ − σ and µ + σ, and turns upwards (is concave up) outside that interval.
The mean of a normal distribution is µ.
The standard deviation is σ.

Useful rules of thumb for the normal distribution
For a normal distribution with mean µ and standard deviation σ:
about 68% of the observations are in the range µ − σ to µ + σ.
about 95% of the observations are in the range µ − 2σ to µ + 2σ.
about 99.7% of the observations are in the range µ − 3σ to µ + 3σ.
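These percentages can be checked numerically. The sketch below relies on the standard identity P(µ − kσ < X < µ + kσ) = erf(k/√2) for a normal variable X; the function name `prob_within` is a hypothetical choice for illustration.

```python
import math


def prob_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))


# Coverage for one, two and three standard deviations.
coverage = {k: prob_within(k) for k in (1, 2, 3)}

if __name__ == "__main__":
    for k, p in coverage.items():
        print(f"within {k} sd: {100 * p:.2f}%")
```

The result does not depend on µ or σ, which is why a single rule of thumb serves every normal distribution.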

Example
Systolic pressure is the force of blood in the arteries as the heart beats. Suppose that the systolic blood pressure for males aged 40-49 is normally distributed with a mean of 134.7 mmHg and a standard deviation of 3.1 mmHg. Answer the following questions.
1. Plot the density curve for this distribution.
2. Between what systolic blood pressure values do the middle 95% of all males aged 40-49 lie?
3. How small are the smallest 2.5% of all blood pressures for males aged 40-49?
4. How large are the largest 2.5% of all blood pressures for males aged 40-49?
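Questions 2 to 4 can be approached with the 95% rule of thumb: the middle 95% lies roughly within two standard deviations of the mean, and by symmetry the remaining 5% splits into two tails of about 2.5% each. The sketch below only encodes that arithmetic; the helper name `middle_95_range` is hypothetical, and the rule of thumb is an approximation rather than an exact calculation.

```python
def middle_95_range(mu, sigma):
    """Approximate range holding the middle 95%, using mu +/- 2*sigma."""
    return mu - 2.0 * sigma, mu + 2.0 * sigma


# Parameters from the example: N(134.7, 3.1), in mmHg.
lower, upper = middle_95_range(134.7, 3.1)

if __name__ == "__main__":
    print(f"middle 95%: about {lower:.1f} to {upper:.1f} mmHg")
    print(f"smallest 2.5%: below about {lower:.1f} mmHg")
    print(f"largest 2.5%: above about {upper:.1f} mmHg")
```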

Example (cont.)

Remarks
We use the notation N(µ, σ) to denote a normal distribution with mean µ and standard deviation σ, e.g., the distribution of blood pressures is N(134.7, 3.1).
In the last example, we answered questions about a variable using a mathematical model, not actual data (the source of the model for the data is not specified in the example).

GRE example
Suppose that the quantitative GRE scores for applicants in the psychology department are approximately normal with mean µ = 544 and standard deviation σ = 103.
What proportion of applicants have a score less than 500?
What proportion of applicants have a score larger than 700?
What proportion of applicants have a score between 500 and 700?

Evaluating proportions using a density curve
Areas under the density curve represent the relative frequency or proportion of ranges of values occurring.
For the normal distribution, we evaluate these areas using z-scores.
The z-score or standardized value of an observation x from a distribution with mean µ and standard deviation σ is
z = (x − µ)/σ.
The z-score measures how many standard deviations x is away from the mean. A z-score can be positive or negative.

The standard normal distribution
µ = 0 and σ = 1 corresponds to the standard normal distribution, i.e., N(0, 1).
Key Fact: If we have a variable, X, with a N(µ, σ) distribution, then the standardized variable
Z = (X − µ)/σ
has a N(0, 1) distribution.
Table A (inside cover of the textbook) tabulates the areas to the left of a value in the standard normal distribution. This is the only table we need.
Game plan:
State the problem.
Standardize by converting from N(µ, σ) to N(0, 1).
Use the table to evaluate the area under the curve to the left of the z-score.
Answer the question.
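The game plan can be mirrored in code, with a computed area standing in for the Table A lookup. This is a sketch under stated assumptions rather than the course's method: the names `standardize`, `phi` and `area_left` are hypothetical, and the area to the left of z is obtained from the identity Φ(z) = ½(1 + erf(z/√2)).

```python
import math


def standardize(x, mu, sigma):
    """Convert a value x from N(mu, sigma) to its z-score."""
    return (x - mu) / sigma


def phi(z):
    """Area to the left of z under the standard normal curve (what Table A tabulates)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def area_left(x, mu, sigma):
    """Proportion of a N(mu, sigma) distribution lying below x."""
    return phi(standardize(x, mu, sigma))


if __name__ == "__main__":
    # A value one standard deviation above the mean has z = 1.
    print(standardize(10.0, 8.0, 2.0))
    print(area_left(10.0, 8.0, 2.0))
```

Areas to the right follow as 1 minus the area to the left, and the area between two values is the difference of their left areas.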

GRE example (cont.)
Let X denote the quantitative GRE scores for psychology applicants. X has an N(544, 103) distribution.
Part (a): What proportion of applicants have a score less than 500?

GRE example (cont.)
Part (b): What proportion of applicants have a score larger than 700?

GRE example (cont.)
Part (c): What proportion of applicants have a score between 500 and 700?
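The slides leave parts (a) to (c) to be worked with Table A. As a way to check hand calculations, the sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) in place of the table; the variable names are illustrative, and small differences from table-based answers are expected because the table rounds z to two decimals.

```python
from statistics import NormalDist

# Model from the example: X ~ N(544, 103).
gre = NormalDist(mu=544, sigma=103)

# Part (a): area to the left of 500.
p_below_500 = gre.cdf(500)

# Part (b): area to the right of 700 is one minus the area to the left.
p_above_700 = 1.0 - gre.cdf(700)

# Part (c): area between 500 and 700 is a difference of left areas.
p_between = gre.cdf(700) - gre.cdf(500)

if __name__ == "__main__":
    print(f"P(X < 500)       = {p_below_500:.4f}")
    print(f"P(X > 700)       = {p_above_700:.4f}")
    print(f"P(500 < X < 700) = {p_between:.4f}")
```

The three proportions add up to one, which is a quick sanity check on any table-based answer.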

Blood pressures revisited
Suppose that the systolic blood pressure for males aged 40-49 is normally distributed with a mean of 134.7 mmHg and a standard deviation of 3.1 mmHg.
What blood pressure value will place a male aged 40-49 in the top 5%? In the top 1%?
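This question runs the table lookup backwards: find the z-score with a given area to its left, then convert back with x = µ + zσ. The sketch below does this with `NormalDist.inv_cdf` from the Python standard library (Python 3.8 or later) as an assumed substitute for reading Table A in reverse; the variable names are illustrative.

```python
from statistics import NormalDist

# Model from the example: N(134.7, 3.1), in mmHg.
bp = NormalDist(mu=134.7, sigma=3.1)

# Top 5% means 95% of the area lies to the left of the cutoff.
top_5_cutoff = bp.inv_cdf(0.95)

# Top 1% means 99% of the area lies to the left of the cutoff.
top_1_cutoff = bp.inv_cdf(0.99)

if __name__ == "__main__":
    print(f"top 5% starts near {top_5_cutoff:.1f} mmHg")
    print(f"top 1% starts near {top_1_cutoff:.1f} mmHg")
```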

Normal quantile plots
The normal quantile plot is a method we use to determine whether a sample of observations can be modeled by a normal distribution.
The procedure:
1. Sort the data from smallest to largest.
2. Calculate the percentile of each data value (for i = 1, ..., n, the ith smallest value is the (i − 0.5)/n × 100% percentile).
3. Calculate the z-score for each percentile.
4. Plot the data values on the y-axis versus the z-scores on the x-axis.
If the distribution is close to normal, the plotted points will lie close to a straight line.
We let MINITAB do the calculations.
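The four steps of the procedure can be sketched as follows. This is an illustration in Python rather than the MINITAB workflow the course uses: the function name `normal_quantile_points` is hypothetical, the data are the 34 GRE scores from the earlier example, and the z-scores come from `statistics.NormalDist.inv_cdf` (Python 3.8 or later). The sketch returns the plot coordinates and leaves the drawing to whatever plotting tool is at hand.

```python
from statistics import NormalDist

# The 34 quantitative GRE scores from the earlier example.
gre_scores = [
    569, 528, 698, 676, 543, 702, 436, 482, 655, 500,
    536, 617, 548, 334, 620, 567, 605, 564, 545, 647,
    579, 518, 465, 744, 645, 513, 728, 627, 584, 572,
    449, 575, 399, 797,
]


def normal_quantile_points(data):
    """Return (z-score, data value) pairs for a normal quantile plot."""
    ordered = sorted(data)              # step 1: sort smallest to largest
    n = len(ordered)
    std_normal = NormalDist()           # N(0, 1)
    points = []
    for i, value in enumerate(ordered, start=1):
        p = (i - 0.5) / n               # step 2: percentile as a proportion
        z = std_normal.inv_cdf(p)       # step 3: z-score for that percentile
        points.append((z, value))       # step 4: x = z-score, y = data value
    return points


points = normal_quantile_points(gre_scores)

if __name__ == "__main__":
    for z, value in points:
        print(f"{z:7.3f}  {value}")
```

Using (i − 0.5)/n rather than i/n keeps every proportion strictly between 0 and 1, so the largest observation still gets a finite z-score.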

MINITAB Example
Load the GRE scores data from the class website.
Select the menu command Graph > Probability Plot.
Select the Simple graph type.
In the dialog box for Graph variables, select C3.
Click Distribution: under the Data Display tab, untick Show confidence interval and click OK.
Click Scale: under the Axes and Ticks tab, select Transpose Y and X. Under the Y-Scale Type tab, select Score, and click OK.
Click OK again to produce the figure.

Normal quantile plot of GRE scores
Conclusions?

Normal quantile plot of the hurricane losses (from the Introduction notes)
Are the hurricane losses normally distributed?