Process Capability Calculations with Extremely Non-Normal Data

Similar documents
This copy of the Validation Protocol / Report for. # ZTC-10, for version 2 of ZTCP's. "MDD-PLUS C=0 SAMPLING PLANS.xls"

Chapter 2 Describing, Exploring, and Comparing Data

An Excel Add-In for Capturing Simulation Statistics

Overview. Frequency Distributions. Chapter 2 Summarizing & Graphing Data. Descriptive Statistics. Inferential Statistics. Frequency Distribution

revision of the validation protocol and of the completely executed validation report.

Accelerated Life Testing Module Accelerated Life Testing - Overview

Chapter 3: Rate Laws Excel Tutorial on Fitting logarithmic data

Fathom Dynamic Data TM Version 2 Specifications

Part I, Chapters 4 & 5. Data Tables and Data Analysis Statistics and Figures

Building Better Parametric Cost Models

The Power and Sample Size Application

Bluman & Mayer, Elementary Statistics, A Step by Step Approach, Canadian Edition

Frequency Distributions

Data Statistics Population. Census Sample Correlation... Statistical & Practical Significance. Qualitative Data Discrete Data Continuous Data

MAT 110 WORKSHOP. Updated Fall 2018

TYPES OF NUMBER P1 P2 P3 Learning Objective Understand place value in large numbers Add and subtract large numbers (up to 3 digits) Multiply and

2.830J / 6.780J / ESD.63J Control of Manufacturing Processes (SMA 6303) Spring 2008

Further Maths Notes. Common Mistakes. Read the bold words in the exam! Always check data entry. Write equations in terms of variables

Pre-Lab Excel Problem

QCTools. an Excel 97 AddIn. William F Lyle Copyright 1999

MATHEMATICS SYLLABUS SECONDARY 4th YEAR

Minitab detailed

0 Graphical Analysis Use of Excel

Dealing with Data in Excel 2013/2016

Math 227 EXCEL / MEGASTAT Guide

Data Analysis and Solver Plugins for KSpread USER S MANUAL. Tomasz Maliszewski

9-1 GCSE Maths. GCSE Mathematics has a Foundation tier (Grades 1 5) and a Higher tier (Grades 4 9).

Mathematics Year 9-11 Skills and Knowledge Checklist. Name: Class: Set : 1 Date Year 9 MEG :

calculations and approximations.

Error Analysis, Statistics and Graphing

QQ normality plots Harvey Motulsky, GraphPad Software Inc. July 2013

ENGR Fall Exam 1

Excel 2010 with XLSTAT

Section 2-2 Frequency Distributions. Copyright 2010, 2007, 2004 Pearson Education, Inc

Page 1. Graphical and Numerical Statistics

Grade 6 Curriculum and Instructional Gap Analysis Implementation Year

IQR = number. summary: largest. = 2. Upper half: Q3 =

Microarray Excel Hands-on Workshop Handout

ENGR Fall Exam 1 PRACTICE EXAM

Data Presentation. Figure 1. Hand drawn data sheet

Mathematics Year 9-11 Skills and Knowledge Checklist. Name: Class: Set : Premier Date Year 9 MEG :

Microsoft Excel Using Excel in the Science Classroom

Downloaded from

Descriptive Statistics, Standard Deviation and Standard Error

Chapter Two: Descriptive Methods 1/50

Lesson 18-1 Lesson Lesson 18-1 Lesson Lesson 18-2 Lesson 18-2

Year Nine Scheme of Work. Overview

Multivariate Capability Analysis

Parallel Lines Investigation

Correlation of the ALEKS courses Algebra 1 and High School Geometry to the Wyoming Mathematics Content Standards for Grade 11

SLStats.notebook. January 12, Statistics:

Probability Models.S4 Simulating Random Variables

Describe the Squirt Studio

AQA GCSE Maths - Higher Self-Assessment Checklist

Math 7 Glossary Terms

And the benefits are immediate minimal changes to the interface allow you and your teams to access these

Preliminary Lab Exercise Part 1 Calibration of Eppendorf Pipette

Statistical Graphs and Calculations

ME 142 Engineering Computation I. Graphing with Excel

Contents 10. Graphs of Trigonometric Functions

StatsMate. User Guide

Review Ch. 15 Spreadsheet and Worksheet Basics. 2010, 2006 South-Western, Cengage Learning

Click on the topic to go to the page

- 1 - Class Intervals

MAT 003 Brian Killough s Instructor Notes Saint Leo University

Univariate Statistics Summary

Chapter 2 - Graphical Summaries of Data

MATH1635, Statistics (2)

Chapter 2 Organizing and Graphing Data. 2.1 Organizing and Graphing Qualitative Data

Slide Copyright 2005 Pearson Education, Inc. SEVENTH EDITION and EXPANDED SEVENTH EDITION. Chapter 13. Statistics Sampling Techniques

Measures of Central Tendency

STAT 311 (3 CREDITS) VARIANCE AND REGRESSION ANALYSIS ELECTIVE: ALL STUDENTS. CONTENT Introduction to Computer application of variance and regression

Things to Know for the Algebra I Regents

Chemistry Excel. Microsoft 2007

How to Make Graphs in EXCEL

Central Valley School District Math Curriculum Map Grade 8. August - September

SAS/STAT 13.1 User s Guide. The Power and Sample Size Application

KS3 MATHEMATICS THRESHOLD DESCRIPTORS NUMBER (Incl. RATIO & PROPORTION)

Summer Packet 7 th into 8 th grade. Name. Integer Operations = 2. (-7)(6)(-4) = = = = 6.

Ch3 E lement Elemen ar tary Descriptive Descriptiv Statistics

Stat 528 (Autumn 2008) Density Curves and the Normal Distribution. Measures of center and spread. Features of the normal distribution

Appendix 14C: TIMSS 2015 Eighth Grade Mathematics Item Descriptions Developed During the TIMSS 2015 Benchmarking

Using Excel for Graphical Analysis of Data

Statistics & Curve Fitting Tool

Test Bank for Privitera, Statistics for the Behavioral Sciences

CHAPTER 1. Introduction. Statistics: Statistics is the science of collecting, organizing, analyzing, presenting and interpreting data.

BODMAS and Standard Form. Integers. Understand and use coordinates. Decimals. Introduction to algebra, linear equations

Chapter 2 Assignment (due Thursday, April 19)

Simplified Whisker Risk Model Extensions

Supplementary Material

Higher Course Year /16

Assignment 3 due Thursday Oct. 11

9.8 Rockin the Residuals

Markscheme May 2017 Mathematical studies Standard level Paper 1

Adjacent sides are next to each other and are joined by a common vertex.

Confidence Level Red Amber Green

Linear and Quadratic Least Squares

Table of Contents (As covered from textbook)

Describe the Squirt Studio Open Office Version

Microsoft Office Word 2013 Intermediate. Course 01 Working with Tables and Charts

Transcription:

Process Capability Calculations with Extremely Non-Normal Data copyright 2015 (all rights reserved), by: John N. Zorich Jr., MS, CQE Statistical Consultant & Trainer home office: Houston TX 408-203-8811 (cell) www.johnzorich.com JOHNZORICH@YAHOO.COM Reliability Plotting One of the best ways to determine process capability with extremely non-normal data is to use Reliability Plotting. "Reliability" is the % of the lot that meets specification. Reliability plotting involves analyzing the data graphically using the methods and formulas of linear regression. Sample Sizes The output of reliability plotting is a lower 1-sided confidence limit on the observed %-in-specification, at the confidence level used in the calculation. Any sample size is valid, because the smaller the sample size, the farther away from the observed %-inspec is the %-in-spec that you can claim. 90% 100% Confidence limit based upon small sample Confidence limit base upon large sample % reliability observed in sample Houston Regional Quality Conference, Nov. 13, 2015 Copyright 2015 by John Zorich 1

An incorrect distribution assumption can make huge differences in claimed % in-specification. What can be done if raw data cannot be transformed to Normality? QC Spec is 1.00 0 1.00 2.00 Charts copied from http://www.itl.nist.gov/div898/handbook/eda/section3/eda366.htm 10 devices put on-test: a geometric progression: 1, 2, 4, 8, 16, 32, 64, 128, 256, 512 30 devices put on-test: only 9 life-failures @ hours: 367, 422, 476, 508, 552, 589, 642, 683, 738 10 devices put on-test: tensile failures: 0.35, 0.35, 0.35, 0.35, 0.68, 1.22, 1.56, 1.91, 2.17, 2.18 60 sterile-barrier pouches put on-test: burst-pressure results show two distributions Reliability Plotting is one of the best ways to calculate process capability with data such as on previous slide. Reliability Plotting requires a transformation that gives a straight line (by Linear Regression) that can be extrapolated to the Specification value. % 1 Reliability Specification DATA Extrapolating a curved line is never discussed in reliability textbooks. 95% Confidence limit Transformed Specification Transformed Confidence limit on extrapolated point can be calculated by a standard linear regression formula. SEE NEXT SLIDE, FOR REFERENCE AFTER SEMINAR Houston Regional Quality Conference, Nov. 13, 2015 Copyright 2015 by John Zorich 2

Formula for calculating the 1-sided confidence limit on the mean Y-value at a single X-value on a linear regression line (i.e., the transformed Y-value for hollow triangle on previous slide) (note: The generalized linear regression equation is... Yei = a + b Xi ) = YSL +/ t x See x [(1 / N )+(( XSL Xavg )^2) / (Sum(( Xi Xavg)^2 ))]^0.5 where... YSL = transformed Y-axis value corresponding to the Specification Limit (i.e., transformed Y-value for solid triangle on the chart on previous slide) +/ = use + if out-of-specification is below the spec limit; otherwise use t = one-side, t-table value at alpha = 1 Confidence, and df = N 2 using Excel: t = TINV ( 2 * (1 Confidence), ( N 2 ) ) See = Std Error of Estimate = [ ( Sum ( ( Yei Yi ) ^2 ) ) / ( N 2 ) ] ^0.5 Yi = transformed plotted Y-axis value corresponding to a plotted Xi N = number of X,Y points plotted on the chart (not the same as sample size) XSL = transformed Specification Limit Xavg = average of the transformed X values of the plotted X,Y points (this is not the same as the average of the transformed raw data) (do not include the specification limit in this average) Xi = each of the transformed X values of the plotted X,Y points "Sum" here means to add up each (Xi Xavg)^2, from i = 1 thru i = N Decimal %Cumulative vs. Median Rank Decimal %cumulative = ( Rank ) / ( SampleSize ) where rank" of the lowest value in the data set = 1, next lowest value = 2, next lowest = 3, and so on, and Sample Size = total # of devices or parts being tested. Median Rank can be calculate by either... = ( Rank 0.3 ) / ( SampleSize + 0.4 ) = BETAINV ( 0.5, Rank, SampleSize Rank + 1 ) Those 2 median-rank formulas give the same result to MANY digits. On Y-axis, plot either... decimal %cumulative, or median rank. Reliability Plotting: EXACT vs. INTERVAL Rank 1 2 3 4 5 6 7 8 9 10 RAW DATA CUMULATIVE MEDIAN ( SORTED ) PERCENT DECIMAL RANK 78.2 10% 0.10 0.067 87.3 20% 0.20 0.163 89.9 30% 0.30 0.260 94.8 40% 0.40 0.356 99.7 50% 0.50 0.452 103.5 60% 0.60 0.548 108.9 70% 0.70 0.644 112.1 80% 0.80 0.740 118.6 90% 0.90 0.837 124.8 ( cannot plot 100%!! ) 0.933 Exact data are the individual measurement values obtained during a study. It is usually better to pool identical values into groups ("intervals"), the way it is done for a histogram. VERY IMPORTANT: Plot Median Rank on Y-axis for exact data. Plot decimal %cumulative for interval data. 26 Houston Regional Quality Conference, Nov. 13, 2015 Copyright 2015 by John Zorich 3

Examples of useful X-axis Transformations ( using Excel Formulas ) In the formulas below, A, B, C, D, and E are constants (negative or positive, whole numbers or fractional) chosen to help linearize the reliability plot. = ASINH ( SQRT ( X ) ) = SQRT ( X + A ) = ( ( X ^ B ) 1 ) / B (Box-Cox Transformation) = ASINH ( X / C ) (from Johnson SU Transformation) = LN ( X + D ) (from Johnson SL Transformation) = 1 / ( X + E ) Examples of useful Y-axis Transformations ( using Excel Formulas ) F = Median Rank (exact) or Decimal%Cum (interval) C = the user-chosen shape parameter constant ** = other parameters set to a value of 1.000 = NORMSINV ( F ) ( = Normal ) = NORMSINV [ 1 (1 F ) ^ ( 1 / C ) ] ** = EXP ( NORMSINV [ 1 (1 F ) ^ ( 1 / C ) ] ) ** = LN ( LN ( 1 / ( 1 F ) ) ) ( = SEV ) = LN ( 1 / ( 1 F ) ) ^ ( 1 / C ) ** ( = Weibull ) = LN ( F / ( 1 F ) ) ( = Logistic ) = LN ( 1 / ( LN ( 1 / F ) ) ) ( = LEV ) = TAN ( PI() * ( F 0.5 ) ) ( = Cauchy ) 20 10 devices put on-test; results have a geometric progression: 1, 2, 4, 8, 16, 32, 64, 128, 256, 512 30 units on-test; 9 life-failures occurred at hours: 367, 422, 476, 508, 552, 589, 642, 683, 738 Largest Extreme Value distribution yields 99.8% reliable at 95% confidence vs. spec of 0.100 16 Data is right censored ; that is, the 21 data points on the right side of the time line (i.e., greater than 738 hours) are not known. But sample size = 30 is used for calculation of y-axis median-rank values. Power- LogNormal distribution yields 99.8% reliable at 95% confidence vs. spec of 300 Houston Regional Quality Conference, Nov. 13, 2015 Copyright 2015 by John Zorich 4

10 devices put on-test; tensile failures were 0.35, 0.35, 0.35, 0.35, 0.68, 1.22, 1.56, 1.91, 2.17, 2.18 Leftmost dot represents the four 0.35 values (%cumulative = 40%) that are left-censored. Weibull distribution yields 62% reliable at 95% confidence vs. spec of 0.100 The 2.18 value is right-censored, i.e. not plotted, which makes it easier to find the transformation to linearity for this interval data. 16 60 sterile-barrier pouches put on-test; Failure (burst under pressure) occurred either at one of the 3 seams created by the pouch manufacturer, or at the seam created by purchaser s testing personnel. Sample shows 2 separate distributions: QC-spec here user s seal Mfg s seal 16 Burst strength (actual data) of very large, sterile-barrier pouches 60 pouches tested; the minimum spec was 0.40 psi. The (sorted) raw data was... 0.48 0.55 0.60 0.62 0.68 0.72 0.48 0.56 0.61 0.63 0.68 0.73 0.49 0.56 0.61 0.64 0.68 0.75 0.51 0.56 0.61 0.64 0.68 0.76 0.52 0.56 0.61 0.64 0.69 0.83 0.53 0.57 0.61 0.65 0.69 0.86 0.53 0.58 0.61 0.66 0.69 0.86 0.53 0.59 0.61 0.66 0.71 0.99 0.53 0.59 0.62 0.66 0.72 1.02 0.55 0.59 0.62 0.67 0.72 1.22 Burst strength Many replicate values, & therefore should (usually) plot as interval data rather than exact data. 0.48 0.55 0.60 0.62 0.68 0.72 0.48 0.56 0.61 0.63 0.68 0.73 0.49 0.56 0.61 0.64 0.68 0.75 0.51 0.56 0.61 0.64 0.68 0.76 0.52 0.56 0.61 0.64 0.69 0.83 0.53 0.57 0.61 0.65 0.69 0.86 0.53 0.58 0.61 0.66 0.69 0.86 0.53 0.59 0.61 0.66 0.71 0.99 0.53 0.59 0.62 0.66 0.72 1.02 0.55 0.59 0.62 0.67 0.72 1.22 Houston Regional Quality Conference, Nov. 13, 2015 Copyright 2015 by John Zorich 5

How to convert the ( n = 60 ) data, with its many exact but replicate values, into interval data: 0.48 0.55 0.60 0.48 0.56 0.61 0.49 0.56 0.61 0.51 0.56 0.61 0.52 0.56 0.61 0.53 0.57 0.61 0.53 0.58 0.61 0.53 0.59 0.61 0.53 0.59 0.62 0.55 0.59 0.62 Do NOT plot zero values; e.g., do not plot 54 -- 0 48 -- 2 = 2/60 = 0.033 49 -- 1 = 3/60 = 0.050 51 -- 1 = 4/60 = 0.067 52 -- 1 = 5/60 = 0.083 53 -- 4 = 9/60 = 0.150 55 -- 1 56 -- 4 57 -- 1 58 -- 1 59 -- 3 60 -- 1 61 -- 7 etc. Burst strength for n=60 pouches On a basic cumulative plot ( i.e., no transformations of either axis ) the data set comprises 2 different populations: Right-censoring is valid only if testing vs. a lower spec limit! Must "censor" these data, so that they do not interfere with identifying best transformation. Burst strength for n=60 pouches If convert from "exact" to "interval" and censor values above 0.8 psi, then results are... 99.9 %Reliability at 95% confidence using... Log(X) vs Z(%cum) ( i.e., LogNormal ) When analyzing reliability vs. upper specification limit, typically cannot use interval method because get nonsense results (> 100% reliability!), therefore must use exact method: Transformed Cumulative Decimal Fraction.. Transformed Cumulative Decimal Fraction.. 180% 160% 140% 120% 100% 80% 60% 40% 20% 0% 4 3 2 1 0-1 -2-3 0.2 0.25 0.3 0.35 0.4 0.45 0.5 0.55 Transformed Treatment Values 0.2 0.25 0.3 0.35 0.4 0.45 0.5 0.55 INTERVAL yields 140% reliability, which is nonsense. EXACT yields 99.9% reliability. Transformed Treatment Values Houston Regional Quality Conference, Nov. 13, 2015 Copyright 2015 by John Zorich 6

Reliability vs. 2-sided specification limit : calculate each one separately, using the same confidence level and transformation in each case, then pool the information like so: % out-ofspec vs. 100% lower spec % out-ofspec vs. upper spec For example: if reliability vs. lower spec is 97% and vs. upper is 96%, then overall reliability... = 100% (100% 97%) ( 100% 96%) = 100 3 4 = 93% reliability at 95% confidence Reliability Plotting vs. 2-sided specification limits : 98% reliable at Upper Spec Limit Therefore we are 95% confident that the population is 93% in-spec (100% 3% 4% = 93%) 97% reliable (= 3% out-of-spec) at upper 95% confidence limit 96% reliable (= 4% outof-spec) at lower 95% confidence limit 99% reliable at Lower Spec Limit Linearity of a Reliability Plot must be determined visually, NOT by trusting the correlation coefficient. 1000 0.955 800 0.955 600 0.955 400 200 Correlation Coefficients 0 0 5 10 15 20 Examples of Y-axis Reverse Transformations ( using Excel Formulas ) To determine the % Reliability, the y-axis confidence limit must be reverse transformed. For example... If Smallest Extreme Value (= SEV) is used, & the upper 95% confidence limit on an extrapolated lower specification is 2.820, then... 2.820 = LN ( LN ( 1 / ( 1 F ) ) ) F = 1 1 / EXP(EXP( 2.820)) = 0.05876 = 5.786% %Reliability = 100% 5.786% = 94.214% at 95% confidence 20 Houston Regional Quality Conference, Nov. 13, 2015 Copyright 2015 by John Zorich 7

In summary... Reliability Plotting requires the user to make decisions about... what transformations yield best straight line what data needs to be converted to intervals what data needs to be censored Spreadsheets can be used to perform reliability plotting; or high-end statistics programs. Biggest challenge is to explain the method to auditors (in the speaker s 15 years of experience using reliability plotting, such auditors are very receptive to this method, or very intimidated by it). There is no one textbook that discusses all of the Reliability Plotting topics that were demonstrated in this seminar, but the following textbooks discuss various individual aspects of it: Dovich: Quality Engineering Statistics Dovich: Reliability Statistics Juran: Juran s Quality Handbook Natrella: Experimental Statistics NIST Engineering Statistics Internet Handbook, found at http://www.itl.nist.gov/div898/handbook/index.htm Tobias & Trindade, Applied Reliability This has an entire chapter on reliability plotting. To receive some relevant computer files, including a spreadsheet that demonstrates Reliability Plotting, send an email request to... JOHNZORICH@YAHOO.COM or visit: www.johnzorich.com Houston Regional Quality Conference, Nov. 13, 2015 Copyright 2015 by John Zorich 8