Continuous Improvement Toolkit: Normal Distribution



The most common and most useful continuous distribution. A symmetrical probability distribution where most results are located in the middle and few are spread on both sides. It has the shape of a bell. It can be described entirely by its mean and standard deviation.

Can be found practically everywhere: In nature. In engineering and industrial processes. In social and human sciences. Many everyday data sets follow approximately the normal distribution.

Examples: The body temperature for healthy humans. The heights and weights of adults. The thickness and dimensions of a product. IQ and standardized test scores. Quality control test results. Errors in measurements.

Why? Used to illustrate the shape and variability of the data. Used to estimate future process performance. Normality is an important assumption when conducting statistical analysis. Certain SPC charts and many statistical inference tests require the data to be normally distributed.

Normal Curve: A graphical representation of the normal distribution. It is determined by the mean and the standard deviation. It is a symmetric, unimodal, bell-shaped curve. Its tails extend infinitely in both directions. The wider the curve, the larger the standard deviation and the more variation exists in the process. By convention, the spread of the process is taken as six standard deviations (three on each side of the mean), which covers about 99.73% of the values.

Normal Curve: Helps calculate the probabilities for normally distributed populations. The probabilities are represented by the area under the normal curve. The total area under the curve is equal to 100% (or 1.00). This represents the whole population of observations. We can estimate the probability above a value, below a value, or between any two values.

Normal Curve: Since the normal curve is symmetrical, 50 percent of the data lie on each side of the mean. [Figure: normal curve with the mean marked on the X-axis, 50% of the area on each side, an inflection point indicated, and total probability = 100%.]

Empirical Rule: For any normally distributed data: About 68% (68.26%) of the data fall within 1 standard deviation of the mean. About 95% (95.44%) of the data fall within 2 standard deviations of the mean. About 99.7% (99.73%) of the data fall within 3 standard deviations of the mean. [Figure: normal curve from -3σ to +3σ around μ; the successive one-sigma bands hold 2.1%, 13.6%, 34.1%, 34.1%, 13.6% and 2.1% of the area.]
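
The percentages in the empirical rule can be checked numerically. The following Python sketch relies on the standard identity that the share of a normal population within k standard deviations of the mean equals erf(k/√2); the function name `coverage_within` is made up for this illustration.

```python
import math

def coverage_within(k: float) -> float:
    """Share of a normal population lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

# Print the coverage for 1, 2 and 3 standard deviations.
for k in (1, 2, 3):
    print(f"within ±{k} sigma: {coverage_within(k):.2%}")
```

The printed values round to the 68%, 95% and 99.7% quoted in the rule.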

Empirical Rule: Suppose that the heights of a sample of men are normally distributed, with a mean height of 178 cm and a standard deviation of 7 cm. We can generalize that 68% of the population are between 171 cm and 185 cm. This might be a generalization, but it's true if the data is normally distributed. [Figure: normal curve centered at 178 cm with 68% of the area between 171 cm and 185 cm.]

Empirical Rule: For a stable normally distributed process, 99.73% of the values lie within ±3 standard deviations of the mean. [Figure: normal curve with the axis marked from -3 to +3 standard deviations.]

Standard Normal Distribution: It is common practice to convert any normal distribution to the standardized form and then use the standard normal table to find probabilities. The Standard Normal Distribution (Z distribution) is a way of standardizing the normal distribution. It always has a mean of 0 and a standard deviation of 1. [Figure: standard normal curve with the Z-axis marked from -3 to +3.]

Standard Normal Distribution: Any normally distributed data can be converted to the standardized form using the formula: Z = (X - μ) / σ where: X is the data point in question, μ is the mean and σ is the standard deviation. Z (or Z-score) is a measure of the number of standard deviations of that data point from the mean.

Standard Normal Distribution: Converting from X to Z: With σ = 0.2 and the X-scale centered at 4.5, the specification limits X_L = 4.0 mm (LSL) and X_U = 5.0 mm (USL) correspond to Z = -2.5 and Z = +2.5 respectively. The X-scale is for the actual values and the Z-scale is for the standardized values. [Figure: normal curve with an X-scale from 3.9 to 5.1 and a matching Z-scale from -3 to +3, with the specification limits marked.]
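
As a rough illustration of the conversion, the Python sketch below applies the Z formula to the two specification limits. The mean of 4.5 is not stated in the text; it is inferred from the figure's axes (3.9 lines up with Z = -3 and 5.1 with Z = +3), and the helper name `z_score` is hypothetical.

```python
def z_score(x: float, mean: float, sigma: float) -> float:
    """Number of standard deviations between x and the mean."""
    return (x - mean) / sigma

mean, sigma = 4.5, 0.2   # mean inferred from the figure's axes (assumption)
lsl, usl = 4.0, 5.0      # lower and upper specification limits in mm

z_lsl = z_score(lsl, mean, sigma)
z_usl = z_score(usl, mean, sigma)
print(f"Z at LSL: {z_lsl:.2f}, Z at USL: {z_usl:.2f}")
```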

Standard Normal Distribution: You can then use this information to determine the area under the normal distribution curve that is: To the right of your data point. To the left of the data point. Between two data points. Outside of two data points. [Figure: four normal curves, one for each case, with the data points X, or X1 and X2, marked.]

Z-Table: Used to find probabilities associated with the standard normal curve. You may also use a Z-table calculator instead of looking up the Z-table manually.

Cumulative Z-Table (excerpt):

Z     +0.00    +0.01    +0.02    +0.03
0.0   0.50000  0.50399  0.50798  0.51197
0.1   0.53983  0.54380  0.54776  0.55172
0.2   0.57926  0.58317  0.58706  0.59095
0.3   0.61791  0.62172  0.62552  0.62930

Z-Table: There are different forms of the Z-table: Cumulative: gives the proportion of the population to the left of (below) the Z-score. Complementary cumulative: gives the proportion of the population to the right of (above) the Z-score. Cumulative from mean: gives the proportion of the population between the mean and the Z-score.

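Z-Table: For the specification limits X_L = 4.0 (LSL) and X_U = 5.0 (USL) with σ = 0.2, the tail areas are P(Z < -2.5) = 0.0062 and P(Z > 2.5) = 0.0062. The area between Z = -2.5 and Z = +2.5 cannot be looked up directly in the Z-table. However, since the total area under the curve is 1, it can be found by subtracting the known tail areas from 1: 1 - 0.0062 - 0.0062 = 0.9876. [Figure: normal curve with an X-scale from 3.9 to 5.1 and a Z-scale from -3 to +3, with the limits at Z = -2.5 and Z = +2.5 marked.]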
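
The subtraction can be reproduced without a printed table. This is a minimal Python sketch that uses the standard library's `statistics.NormalDist` as a stand-in for the Z-table; the variable names are illustrative only.

```python
from statistics import NormalDist

z = NormalDist()                 # standard normal: mean 0, standard deviation 1

upper_tail = 1 - z.cdf(2.5)      # P(Z > 2.5), area beyond the upper limit
lower_tail = z.cdf(-2.5)         # P(Z < -2.5), equal to the upper tail by symmetry
between = 1 - upper_tail - lower_tail   # area between the two limits

print(f"upper tail {upper_tail:.4f}, lower tail {lower_tail:.4f}, between {between:.4f}")
```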

Exercise: Question: For a process with a mean of 100, a standard deviation of 10 and an upper specification of 120, what is the probability that a randomly selected item is defective (or beyond the upper specification limit)? Time allowed: 10 minutes

Exercise: Answer: The Z-score is (120 - 100) / 10 = 2. This means that the upper specification limit is 2 standard deviations above the mean. Now that we have the Z-score, we can use the Z-table to find the probability. From the Z-table (the complementary cumulative table), the area under the curve beyond a Z-value of 2 is 0.02275, or 2.275%. This means that any randomly selected item has a 2.275% chance of being defective.
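
The same answer can be cross-checked in a few lines of Python. This sketch assumes the process is normally distributed, as the exercise implies, and uses `statistics.NormalDist` in place of the complementary cumulative table.

```python
from statistics import NormalDist

process = NormalDist(mu=100, sigma=10)   # process mean and standard deviation
usl = 120                                # upper specification limit

z_usl = (usl - process.mean) / process.stdev   # Z-score of the limit
p_defective = 1 - process.cdf(usl)             # area above the limit

print(f"Z = {z_usl:.1f}, P(defective) = {p_defective:.5f}")
```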

Normality Testing in Minitab: Many statistical tests require that the data be normally distributed. Several tools are available to assess the normality of data: Using a histogram to visually explore the data. Producing a normal probability plot. Carrying out an Anderson-Darling normality test. All these tools are easy to use in Minitab statistical software.

Histograms: An efficient graphical method for describing the distribution of data. It is always good practice to plot your data in a histogram after collecting it. This will give you an insight into the shape of the distribution. If the data is symmetrically distributed and most results are located in the middle, it is reasonable to treat the data as approximately normal, pending a more formal check.

Histograms: Suppose that a line manager is seeking to assess how consistently a production line is producing. He is interested in the weight of a food product with a target of 50 grams per item. He takes a random sample of 40 products and measures their weights.

Histograms: Remember to copy the data from the Excel worksheet and paste it into the Minitab worksheet.

50.9 49.3 49.6 51.1 49.8 49.6 49.9 49.9 49.7 50.8
48.8 49.8 50.4 48.9 50.6 50.3 50.4 50.7 50.2 50.0
50.8 50.3 50.4 49.9 49.4 50.0 50.5 50.3 50.1 49.9
49.8 50.2 49.3 50.0 49.7 50.3 49.7 50.2 50.0 49.1

Select Graph > Histogram > With Fit. Specify the column of data to analyze. Click OK.

The histogram suggests that the data is normally distributed. Notice how the data is symmetrically distributed and concentrated in the center of the histogram. [Figure: histogram of Weight (49.0 to 51.0 g) against Frequency (0 to 9) with a fitted normal curve; Mean 50.02, StDev 0.5289, N 40.]
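
The summary statistics reported with the histogram can be reproduced outside Minitab. The Python sketch below recomputes them from the 40 weights listed earlier, using the sample standard deviation (n - 1 in the denominator); the variable names are illustrative.

```python
from statistics import mean, median, stdev

# The 40 product weights (grams) from the example.
weights = [
    50.9, 49.3, 49.6, 51.1, 49.8, 49.6, 49.9, 49.9, 49.7, 50.8,
    48.8, 49.8, 50.4, 48.9, 50.6, 50.3, 50.4, 50.7, 50.2, 50.0,
    50.8, 50.3, 50.4, 49.9, 49.4, 50.0, 50.5, 50.3, 50.1, 49.9,
    49.8, 50.2, 49.3, 50.0, 49.7, 50.3, 49.7, 50.2, 50.0, 49.1,
]

n = len(weights)
sample_mean = mean(weights)
sample_stdev = stdev(weights)     # sample standard deviation (n - 1)
sample_median = median(weights)   # close to the mean when the data is symmetric

print(f"N = {n}, mean = {sample_mean:.2f}, stdev = {sample_stdev:.4f}, median = {sample_median:.2f}")
```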

Normal Probability Plots: Used to test the assumption of normality. Provides a more decisive approach than a histogram. For normally distributed data, the points should approximately form a straight line that falls between the 95% confidence interval limits.

Normal Probability Plots: To create a normal probability plot: Select Graph > Probability Plot > Single. Specify the column of data to analyze. Leave the distribution option set to Normal. Click OK.

Normal Probability Plots: Here is the result for our previous example: The data points approximately follow a straight line that falls mainly between the CI limits. It can be concluded that the data is normally distributed. [Figure: normal probability plot of the product weights.]

What about the data in the following probability plot? [Figure: normal probability plot with CI limits.]

Anderson-Darling Normality Test: A statistical test that compares the actual distribution with the theoretical distribution. Measures how well the data follow the normal distribution (or any particular distribution). A p-value lower than the significance level (normally 0.05) indicates a lack of normality in the data. Remember to look at the histogram and the normal probability plot alongside the Anderson-Darling test before making any decision.

Anderson-Darling Normality Test: To conduct an Anderson-Darling normality test: Select Stat > Basic Statistics > Normality Test. Specify the column of data to analyze. Specify the test method to be Anderson-Darling. Click OK. The p-value in our product weight example was 0.819. Since this is well above 0.05, there is no evidence against normality and the data can be treated as normally distributed.
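
For readers who want to see what the test computes, here is a minimal Python sketch of the textbook Anderson-Darling statistic applied to the product weights. It is not Minitab's implementation and it does not compute a p-value. The small-sample adjustment factor and the 5% critical value of 0.752 are the commonly tabulated figures for the case where the mean and standard deviation are estimated from the data; treat both as assumptions here. The function name `anderson_darling_normal` is hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

weights = [
    50.9, 49.3, 49.6, 51.1, 49.8, 49.6, 49.9, 49.9, 49.7, 50.8,
    48.8, 49.8, 50.4, 48.9, 50.6, 50.3, 50.4, 50.7, 50.2, 50.0,
    50.8, 50.3, 50.4, 49.9, 49.4, 50.0, 50.5, 50.3, 50.1, 49.9,
    49.8, 50.2, 49.3, 50.0, 49.7, 50.3, 49.7, 50.2, 50.0, 49.1,
]

def anderson_darling_normal(data):
    """Return (A2, adjusted A2) against a normal fitted with the sample mean and stdev."""
    n = len(data)
    fitted = NormalDist(mean(data), stdev(data))
    cdf = [fitted.cdf(x) for x in sorted(data)]
    # Weighted sum over the ordered data, pairing the i-th smallest
    # value with the i-th largest value.
    total = sum(
        (2 * i + 1) * (math.log(cdf[i]) + math.log(1 - cdf[n - 1 - i]))
        for i in range(n)
    )
    a2 = -n - total / n
    a2_adjusted = a2 * (1 + 0.75 / n + 2.25 / n ** 2)   # assumed small-sample adjustment
    return a2, a2_adjusted

CRITICAL_5_PERCENT = 0.752   # assumed tabulated value, parameters estimated

a2, a2_adjusted = anderson_darling_normal(weights)
looks_normal = a2_adjusted < CRITICAL_5_PERCENT
print(f"A2 = {a2:.3f}, adjusted A2 = {a2_adjusted:.3f}, no evidence against normality: {looks_normal}")
```

A statistic below the critical value corresponds to a p-value above 0.05, which agrees with the conclusion drawn from the Minitab output.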

Further Information: The normal distribution is also known as the Gaussian distribution, after Carl Friedrich Gauss, who developed the mathematical formula of the curve. Sometimes the process itself produces an approximately normal distribution. Other times an approximately normal distribution can be obtained by performing a mathematical transformation on the data or by using the means of subgroups.

Further Information: The Central Limit Theorem is a useful statistical concept. It states that the distribution of the means of random samples approaches a normal distribution as the sample size grows, regardless of the shape of the underlying distribution (provided its variance is finite). Even if the population is skewed, the subgroup means will be approximately normal if the sample size is large enough (n >= 30 is a common rule of thumb).
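
A small simulation makes the theorem concrete. The Python sketch below draws samples of size 30 from a strongly skewed exponential population (mean 1, standard deviation 1, skewness 2) and looks at the distribution of the sample means. The population, the seed and the number of replications are arbitrary choices made for this illustration.

```python
import math
import random
from statistics import stdev

def skewness(values):
    """Moment-based skewness: 0 for a symmetric distribution."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

rng = random.Random(0)          # fixed seed so the run is repeatable
n, replications = 30, 5000

# Mean of each sample of n draws from an exponential population with mean 1.
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n
    for _ in range(replications)
]

center = sum(sample_means) / replications
spread = stdev(sample_means)
skew_of_means = skewness(sample_means)

print(f"mean of means  = {center:.3f} (population mean 1)")
print(f"stdev of means = {spread:.3f} (theory: 1/sqrt(n) = {1 / math.sqrt(n):.3f})")
print(f"skewness       = {skew_of_means:.3f} (population skewness 2)")
```

The sample means cluster around the population mean with a spread close to σ/√n, and their skewness is far smaller than the population's, though not exactly zero at n = 30.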

Further Information: This means that statistical methods that work for normal distributions can often be applied to data from other distributions when they operate on subgroup means. For example, certain SPC charts (such as XBar-R charts) can be used with non-normal data.

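Further Information: In Excel, you may calculate normal probabilities using the NORM.DIST function. Simply write: =NORM.DIST(x, mean, standard deviation, TRUE) where x is the data point in question. With TRUE as the last argument, the function returns the cumulative probability of a value at or below x. With FALSE, it returns the height of the normal curve at x (the probability density), which is not a probability.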
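
For comparison outside Excel, the two modes of NORM.DIST (cumulative when the last argument is TRUE, density when it is FALSE) map onto the `cdf` and `pdf` methods of Python's `statistics.NormalDist`. The mean of 100, standard deviation of 10 and x of 120 below are borrowed from the earlier exercise purely as an illustration.

```python
from statistics import NormalDist

dist = NormalDist(mu=100, sigma=10)

cumulative = dist.cdf(120)   # counterpart of =NORM.DIST(120, 100, 10, TRUE)
density = dist.pdf(120)      # counterpart of =NORM.DIST(120, 100, 10, FALSE)

print(f"P(X <= 120) = {cumulative:.5f}")
print(f"curve height at 120 = {density:.7f}")
```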

Further Information: In Excel, NORM.INV is the inverse of the cumulative NORM.DIST function. It returns the X value for a given cumulative probability. You can generate random numbers that follow the normal distribution by using the NORM.INV function. Use the formula: =NORM.INV(RAND(), mean, standard deviation)
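
The same inverse-transform idea can be sketched in Python with `NormalDist.inv_cdf`. The parameters, the seed and the helper name `random_normal` are illustrative assumptions, not part of the original material.

```python
import random
from statistics import NormalDist

dist = NormalDist(mu=100, sigma=10)

# Counterpart of =NORM.INV(0.975, 100, 10): the value with 97.5% of the area below it.
x_at_975 = dist.inv_cdf(0.975)

def random_normal(d: NormalDist, rng: random.Random) -> float:
    """Counterpart of =NORM.INV(RAND(), mean, sd): push a uniform draw through the inverse CDF."""
    u = rng.random()
    while u == 0.0:          # inv_cdf is only defined strictly between 0 and 1
        u = rng.random()
    return d.inv_cdf(u)

rng = random.Random(42)      # fixed seed so the run is repeatable
draws = [random_normal(dist, rng) for _ in range(10_000)]
draws_mean = sum(draws) / len(draws)

print(f"x at 97.5% = {x_at_975:.2f}, mean of simulated draws = {draws_mean:.2f}")
```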

Further Information: For non-continuous data, we can estimate the Z-value by treating the defects per unit (DPU) as an area under the curve and finding the equivalent Z-value for that area. Because discrete data is one-sided, the computed area is the total area in a single tail. For a process that yields 95% (or has 5% defects), the DPU is 0.05. In the Z-table, a tail area of 0.05 has a Z-value of about 1.64 (more precisely 1.645). [Figure: normal curve with 5% defects in the tail beyond Z = 1.64.]
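
The table lookup can be reproduced with an inverse cumulative calculation. This short Python sketch assumes, as the text does, that the whole defect rate sits in one tail.

```python
from statistics import NormalDist

dpu = 0.05                                  # defects per unit, i.e. 5% defective
z_value = NormalDist().inv_cdf(1 - dpu)     # Z with 5% of the area in the upper tail

print(f"Z = {z_value:.3f}")
```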

Further Information - Exercise: An engineer collected 30 measurements. He calculated the average to be 80 and the standard deviation to be 10. The lower specification limit is 70. What is the Z-value? What percentage of the product is expected to be out of specification?

Further Information: More examples using the Z-distribution and the Z-table. Use the Z-table to find: The Z-value corresponding to a proportion defective of 0.05 (in one tail). The proportion of output greater than the process average. The yield of a stable process whose specification limits are set at ±3 standard deviations from the process average. Question: The average weight of a certain product is 4.4 g with a standard deviation of 1.3 g. What is the probability that a randomly selected product would weigh at least 3.1 g but not more than 7.0 g?
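
One possible way to work the weight question is sketched below in Python. It assumes the weights are normally distributed, which the question does not state explicitly, and finds the area between the two limits as the difference of two cumulative areas.

```python
from statistics import NormalDist

weight = NormalDist(mu=4.4, sigma=1.3)   # assumes normally distributed weights
low, high = 3.1, 7.0

z_low = (low - weight.mean) / weight.stdev     # about -1
z_high = (high - weight.mean) / weight.stdev   # about +2

# Area between the limits = area below the upper limit minus area below the lower limit.
p_between = weight.cdf(high) - weight.cdf(low)

print(f"Z from {z_low:.1f} to {z_high:.1f}, P = {p_between:.4f}")
```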

Further Information: More examples using the Z-distribution and the Z-table: The life of a fully-charged cell phone battery is normally distributed with a mean of 14 hours and a standard deviation of 1 hour. What is the probability that a battery lasts at least 13 hours? (The mean may be written x̄ or μ, and the standard deviation s or σ.)

Further Information: More examples using the Z-distribution and the Z-table: The maximum weight for a certain product should not exceed 52 g. After sampling 40 products from the line, none of the weights exceeded the upper specification limit. It may seem that there will be no overweight products; however, the normal curve suggests that there might be some over the long term. The normal distribution can be used to make a better prediction of the number of failures that will occur in the long term. In our case, the Z-table gives an area under the curve of about 0.6% beyond a Z-value of 2.5. This is a better prediction than the 0% assumed earlier.