Chapter 5: The normal model

Similar documents
appstats6.notebook September 27, 2016

Lecture 3 Questions that we should be able to answer by the end of this lecture:

Lecture 3 Questions that we should be able to answer by the end of this lecture:

Density Curve (p52) Density curve is a curve that - is always on or above the horizontal axis.

Check Homework. More with normal distribution

Chapter 5: The standard deviation as a ruler and the normal model p131

10.4 Measures of Central Tendency and Variation

10.4 Measures of Central Tendency and Variation

Female Brown Bear Weights

Distributions of random variables

Chapter 6 The Standard Deviation as Ruler and the Normal Model

STA Module 4 The Normal Distribution

STA /25/12. Module 4 The Normal Distribution. Learning Objectives. Let s Look at Some Examples of Normal Curves

Prepare a stem-and-leaf graph for the following data. In your final display, you should arrange the leaves for each stem in increasing order.

CHAPTER 2 DESCRIPTIVE STATISTICS

Name: Date: Period: Chapter 2. Section 1: Describing Location in a Distribution

Section 1.2. Displaying Quantitative Data with Graphs. Mrs. Daniel AP Stats 8/22/2013. Dotplots. How to Make a Dotplot. Mrs. Daniel AP Statistics

CHAPTER 2: DESCRIPTIVE STATISTICS Lecture Notes for Introductory Statistics 1. Daphne Skipper, Augusta University (2016)

Chapter 3 Analyzing Normal Quantitative Data

Stat 528 (Autumn 2008) Density Curves and the Normal Distribution. Measures of center and spread. Features of the normal distribution

Ch6: The Normal Distribution

Math 14 Lecture Notes Ch. 6.1

UNIT 1A EXPLORING UNIVARIATE DATA

Univariate Statistics Summary

Chapter 2 Modeling Distributions of Data

Section 9: One Variable Statistics

Chapter 5. Normal. Normal Curve. the Normal. Curve Examples. Standard Units Standard Units Examples. for Data

Section 6.3: Measures of Position

Lecture Notes 3: Data summarization

Chapter 11. Worked-Out Solutions Explorations (p. 585) Chapter 11 Maintaining Mathematical Proficiency (p. 583)

23.2 Normal Distributions

Unit 8: Normal Calculations

Lecture 6: Chapter 6 Summary

Math 120 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency

MAT 102 Introduction to Statistics Chapter 6. Chapter 6 Continuous Probability Distributions and the Normal Distribution

Learning Log Title: CHAPTER 7: PROPORTIONS AND PERCENTS. Date: Lesson: Chapter 7: Proportions and Percents

Today s Topics. Percentile ranks and percentiles. Standardized scores. Using standardized scores to estimate percentiles

Section 10.4 Normal Distributions

Student Learning Objectives

IT 403 Practice Problems (1-2) Answers

Chapter 2: The Normal Distributions

MAT 142 College Mathematics. Module ST. Statistics. Terri Miller revised July 14, 2015

Further Maths Notes. Common Mistakes. Read the bold words in the exam! Always check data entry. Write equations in terms of variables

Introduction to the Practice of Statistics Fifth Edition Moore, McCabe

Averages and Variation

Measures of Position

Data can be in the form of numbers, words, measurements, observations or even just descriptions of things.

Measures of Central Tendency. A measure of central tendency is a value used to represent the typical or average value in a data set.

The standard deviation 1 n

6th Grade Vocabulary Mathematics Unit 2

Chapter 6. THE NORMAL DISTRIBUTION

Name: Stat 300: Intro to Probability & Statistics Textbook: Introduction to Statistical Investigations

Chapter 6 Normal Probability Distributions

Measures of Central Tendency

Homework Packet Week #3

Chapter 6. THE NORMAL DISTRIBUTION

No. of blue jelly beans No. of bags

Key: 5 9 represents a team with 59 wins. (c) The Kansas City Royals and Cleveland Indians, who both won 65 games.

AP Statistics Prerequisite Packet

MATH& 146 Lesson 8. Section 1.6 Averages and Variation

STA Rev. F Learning Objectives. Learning Objectives (Cont.) Module 3 Descriptive Measures

A straight line is the graph of a linear equation. These equations come in several forms, for example: change in x = y 1 y 0

8 2 Properties of a normal distribution.notebook Properties of the Normal Distribution Pages

The Normal Distribution

Linear Regression. Problem: There are many observations with the same x-value but different y-values... Can t predict one y-value from x. d j.

Measures of Dispersion

MAT 110 WORKSHOP. Updated Fall 2018

STANDARDS OF LEARNING CONTENT REVIEW NOTES ALGEBRA I. 4 th Nine Weeks,

AP Statistics. Study Guide

MEASURES OF CENTRAL TENDENCY

3.5 Applying the Normal Distribution: Z-Scores

Downloaded from

STANDARDS OF LEARNING CONTENT REVIEW NOTES. ALGEBRA I Part II. 3 rd Nine Weeks,

Normal Distribution. 6.4 Applications of Normal Distribution

Processing, representing and interpreting data

CHAPTER 2 Modeling Distributions of Data

Create a bar graph that displays the data from the frequency table in Example 1. See the examples on p Does our graph look different?

4.3 The Normal Distribution

Normal Data ID1050 Quantitative & Qualitative Reasoning

MATH& 146 Lesson 10. Section 1.6 Graphing Numerical Data

The main issue is that the mean and standard deviations are not accurate and should not be used in the analysis. Then what statistics should we use?

CHAPTER 2 Modeling Distributions of Data

CHAPTER 3: Data Description

2.1 Objectives. Math Chapter 2. Chapter 2. Variable. Categorical Variable EXPLORING DATA WITH GRAPHS AND NUMERICAL SUMMARIES

STA Module 2B Organizing Data and Comparing Distributions (Part II)

STA Learning Objectives. Learning Objectives (cont.) Module 2B Organizing Data and Comparing Distributions (Part II)

Math 085 Final Exam Review

Advanced Algebra I Simplifying Expressions

Vocabulary. 5-number summary Rule. Area principle. Bar chart. Boxplot. Categorical data condition. Categorical variable.

+ Statistical Methods in

Section 2.2 Normal Distributions. Normal Distributions

6-1 THE STANDARD NORMAL DISTRIBUTION

Chapter 2: The Normal Distribution

AND NUMERICAL SUMMARIES. Chapter 2

MATH 112 Section 7.2: Measuring Distribution, Center, and Spread

Unit 0: Extending Algebra 1 Concepts

AP Statistics Summer Assignment:

Distributions of Continuous Data

L E A R N I N G O B JE C T I V E S

Section 3.1 Shapes of Distributions MDM4U Jensen

Transcription:

Chapter 5: The normal model Objective (1) Learn how rescaling a distribution affects its summary statistics. (2) Understand the concept of normal model. (3) Learn how to analyze distributions using the normal model. Concept briefs: * Rescaling effect = +/- operations shift the median & mean, but not IQR or SD. x/ operations scale all summary statistics. * Z-score = Rescaled data value = Distance of data from the mean, measured in SD s. * Standardizing data = Rescaling all data values into Z-scores. * Normal model = A perfect bell-shaped histogram, often used to approximate symmetric, unimodal distributions. * The 68-95-99.7 rule = Indicates proportions of spread in normal models. * Standard normal Vs. other normal models.

Warmup exercise Consider the following examples of quantitative distributions: (1) Average local temperature in January for the past 100 years. (2) Actual volume of soda in a thousand 12 oz cans. (3) Height of all women students (or all men students) at EC. (4) Weight of peaches harvested in a crop of, say, 10,000 peaches. Identify one thing common in all these distributions. Illustration (Thanks to Prof. Sarah Mabrouk, Math Dept, FSC.) Height limitations for flight attendants: To work as a flight attendant for United Airlines, you must be between 5 2 and 6 tall. For 18- to 24- year olds in the United States, the mean height for males is about 70.1 with a standard deviation of 2.7, and the mean for females is 64.8 with a standard deviation of 2.5. From this information, is it possible to estimate what % of men meet the requirement, and what % of women meet it? Suppose we add the information: Both distributions are approximately normal. Simplified version: Suppose the height requirement is: 62.5" to 72.5" Women's height: Mean = 65", SD = 2.5" Can you estimate the % of women who qualify?

Effect of rescaling data distributions Firstly, what does rescaling mean? * It is somewhat like changing units (e.g, change all heights from feet to inches). * It involves performing the same sequence of arithmetic operations on every data value. Two key operations exist: (1) addition (OR, subtraction) of a constant. (2) multiplication (OR, division) by a constant. Summary of effect of rescaling Addition / subtraction affects the center but not the spread of the distribution. It shifts the position of every data value by the same amount, so the mean, median, and positions of Q1, Q3, etc. change. But this doesn t change the range, IQR or standard deviation. Multiplication / division affect the center and the spread of the distribution. If we multiply by m, the mean, median, Q1, Q3, get multiplied by m as well. Furthermore, the IQR and standard deviation get multiplied by m as well.

Standardizing and z-scores Z-score is a very, very important concept! * Each data value in a distribution can be converted (rescaled) into a z-score. * Here is the shortest, easiest, insightful way to think of a z-score z-score = How many SD s away from the mean is this data value? E.g., Suppose the mean height of 10 students in a group is 68 and the SD is 4 (i) Beth is one of these 10 students. Her height is 70. Thus, she is one half SD above the mean. Her z-score is 0.5. (ii) Mary is also part of this group, and she is 64 tall. So, she is 1 SD below the mean. Her z-score is -1.0. * Important point: Z-scores have no units, even if the original data has units. Standardizing This means converting all values in a distribution to z-scores. Now guess what is the mean & SD of a standardized distribution?

The Normal Model What is the meaning of the term model in mathematics? It means an idealization of a real-life situation in order to make it feasible to analyze, understand & make useful predictions. The Normal model refers to a particular graphical shape applicable to analyzing unimodal, symmetric distributions. It looks like the bell curve. No real-life data distribution is ever perfectly Normal, but many can be analyzed quite nicely by treating them as normal distributions, if they are roughly symmetric and unimodal. Q: Consider the following 2 distributions: (1) Height of students in this class; and (2) Shoe-size of students in this class. Is it possible to treat both as normal distributions? How? # of students # of students 50 55 60 65 70 75 80 Height (inches) 5 6 7 8 9 10 11 12 Shoe size (US units)

Answer: Yes, we can analyze both as normal distributions! Simply standardize & write both distributions in terms of z-scores. Then, if they re both normal, the two graphs will look identical (including same #s on the x-y axes). IN FACT, any data distribution, if it is normal, should look like the following graph after standardizing: Points to note: * This is a smooth, continuous curve, not a histogram. So, how does it apply to histograms? Answer: We look at the overall profile of histograms. * The above graph is called the standard normal model (because it has mean=0, and standard deviation=1). * The normal model can be rescaled to have any other combination of mean and SD. It is not the value of mean/sd that is important, but the shape of the distribution itself.

Key attributes of all normal distributions * If a distribution is normal, it has some very specific statistical properties. * This includes the 68-95-99.7 percent rule, which says: (1) 68% of all data lies within 1 standard deviation of the mean. (2) 95% of all data lies within 2 SD s. (3) 99.7% of all data lies within 3 SD s. * Standard notation: = mean; = standard deviation. The Normal Model Recall: You can have a normal model with any combination of and * Therefore, there is also special notation for referring to normal models in general N (, ) denotes the normal model with mean= and SD= * Note that once you specify, there is exactly one normal model with those particular parameters.

Examples of normal models with different choices of and : So how do we use all this stuff in practice? * The key point here is: We have a mathematical model that is a good approximation to certain kinds of data. * When a model fits some specific data, it becomes very easy to analyze & predict key attributes related to that variable. E.g., Suppose you had a model for the fluctuations in price of a company s stock. You could make a fortune, quit school & retire in a few months! * Many real-life distributions are very close to normal: * Average local temperature in January for the past 100 years. * Weight of peaches harvested in a crop. * Student shoe-sizes on campus * Volume of soda in 12 oz cans. * Blood-pressure or pulse-rates of any (large) group of people. If we know a distribution is approximately normal, we can estimate many useful details about it.

Illustration: Suppose the distribution of shoe-sizes of students in this class is (roughly) symmetric and unimodal, with mean 8.0 and standard deviation 1.0. Estimate each of the following statistics: (A) What % of students have shoe-size larger than 9.0? (B) What % have shoe-size smaller than 6.0? (C) What % have shoe-sizes between 6.0 and 7.0? Answers: (a) 32 / 2 = 16% (b) 5 / 2 = 2.5 % (c) 27 / 2 = 13.5 %