Generic Graphics for Uncertainty and Sensitivity Analysis

Similar documents
Chapter 2 Describing, Exploring, and Comparing Data

DOWNLOAD PDF BIG IDEAS MATH VERTICAL SHRINK OF A PARABOLA

Use of GeoGebra in teaching about central tendency and spread variability

Frequency Distributions

Graphical Presentation for Statistical Data (Relevant to AAT Examination Paper 4: Business Economics and Financial Mathematics) Introduction

Chapter 2. Descriptive Statistics: Organizing, Displaying and Summarizing Data

PACKING DIGRAPHS WITH DIRECTED CLOSED TRAILS

Meeting 1 Introduction to Functions. Part 1 Graphing Points on a Plane (REVIEW) Part 2 What is a function?

UNIT 15 GRAPHICAL PRESENTATION OF DATA-I

Chapter 5. Transforming Shapes

Math 227 EXCEL / MEGASTAT Guide

The Rectangular Coordinate System and Equations of Lines. College Algebra

JUST THE MATHS UNIT NUMBER STATISTICS 1 (The presentation of data) A.J.Hobson

Chapter 2 Basic Structure of High-Dimensional Spaces

Section 2.2 Normal Distributions. Normal Distributions

STA 570 Spring Lecture 5 Tuesday, Feb 1

Data analysis using Microsoft Excel

Middle Years Data Analysis Display Methods

Linear programming II João Carlos Lourenço

Averages and Variation

Interactive Math Glossary Terms and Definitions

Rational Numbers: Graphing: The Coordinate Plane

1. To condense data in a single value. 2. To facilitate comparisons between data.

Scatterplot: The Bridge from Correlation to Regression

Graphing Linear Equations and Inequalities: Graphing Linear Equations and Inequalities in One Variable *

INDEPENDENT SCHOOL DISTRICT 196 Rosemount, Minnesota Educating our students to reach their full potential

Chapter Two: Descriptive Methods 1/50

Section 3.1 Graphing Using the Rectangular Coordinate System

Appendix A: Graph Types Available in OBIEE

Error Analysis, Statistics and Graphing

2017 SOLUTIONS (PRELIMINARY VERSION)

Chapter 15 Introduction to Linear Programming

Optimizations and Lagrange Multiplier Method

Univariate Statistics Summary

SNAP Centre Workshop. Graphing Lines

FUNCTIONS AND MODELS

Excel Core Certification

Mathematics 308 Geometry. Chapter 9. Drawing three dimensional objects

Graphical Methods in Linear Programming

Substituting a 2 b 2 for c 2 and using a little algebra, we can then derive the standard equation for an ellipse centred at the origin,

Plane Wave Imaging Using Phased Array Arno Volker 1

CAPE. Community Behavioral Health Data. How to Create CAPE. Community Assessment and Education to Promote Behavioral Health Planning and Evaluation

NARROW CORRIDOR. Teacher s Guide Getting Started. Lay Chin Tan Singapore

Creating a Basic Chart in Excel 2007

Overview. Frequency Distributions. Chapter 2 Summarizing & Graphing Data. Descriptive Statistics. Inferential Statistics. Frequency Distribution

Measures of Central Tendency

Tutorial 1: Welded Frame - Problem Description

A-SSE.1.1, A-SSE.1.2-

LOESS curve fitted to a population sampled from a sine wave with uniform noise added. The LOESS curve approximates the original sine wave.

Math 120 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency

On Page Rank. 1 Introduction

UNC Charlotte 2010 Comprehensive

Probability Models.S4 Simulating Random Variables

(Refer Slide Time: 00:02:00)

General Program Description

Chapter 1. Linear Equations and Straight Lines. 2 of 71. Copyright 2014, 2010, 2007 Pearson Education, Inc.

Three-Dimensional Coordinate Systems

Common Core Vocabulary and Representations

6th Bay Area Mathematical Olympiad

Grade 9 Math Terminology

Chapter 6: DESCRIPTIVE STATISTICS

Measures of Central Tendency

Middle School Summer Review Packet for Abbott and Orchard Lake Middle School Grade 7

Middle School Summer Review Packet for Abbott and Orchard Lake Middle School Grade 7

Building Better Parametric Cost Models

Parallel and perspective projections such as used in representing 3d images.

WHAT YOU SHOULD LEARN

Describing Data: Frequency Tables, Frequency Distributions, and Graphic Presentation

DAY 52 BOX-AND-WHISKER

3. Data Analysis and Statistics

Fathom Dynamic Data TM Version 2 Specifications

Getting to Know Your Data

Name: -1 and and and and

demonstrate an understanding of the exponent rules of multiplication and division, and apply them to simplify expressions Number Sense and Algebra

How to use FSBforecast Excel add in for regression analysis

Middle School Math Course 2

Basics of Computational Geometry

Integrated Mathematics I Performance Level Descriptors

Data Analysis and Solver Plugins for KSpread USER S MANUAL. Tomasz Maliszewski

form are graphed in Cartesian coordinates, and are graphed in Cartesian coordinates.

TAGUCHI TECHNIQUES FOR 2 k FRACTIONAL FACTORIAL EXPERIMENTS

Downloaded from

An introduction to plotting data

Vocabulary Unit 2-3: Linear Functions & Healthy Lifestyles. Scale model a three dimensional model that is similar to a three dimensional object.

Fathom Tutorial. Dynamic Statistical Software. for teachers of Ontario grades 7-10 mathematics courses

2003/2010 ACOS MATHEMATICS CONTENT CORRELATION GRADE ACOS 2010 ACOS

(Refer Slide Time: 00:02:02)

) 2 + (y 2. x 1. y c x2 = y

Mathematics in Art and Architecture GEK1518

8: Statistics. Populations and Samples. Histograms and Frequency Polygons. Page 1 of 10

= f (a, b) + (hf x + kf y ) (a,b) +

Mathematical Reasoning. Lesson 37: Graphing Quadratic Equations. LESSON 37: Graphing Quadratic Equations

ASSOCIATION BETWEEN VARIABLES: SCATTERGRAMS (Like Father, Like Son)

Distributions of random variables

Integers & Absolute Value Properties of Addition Add Integers Subtract Integers. Add & Subtract Like Fractions Add & Subtract Unlike Fractions

Graphical Analysis. Figure 1. Copyright c 1997 by Awi Federgruen. All rights reserved.

Test designs for evaluating the effectiveness of mail packs Received: 30th November, 2001

D-Optimal Designs. Chapter 888. Introduction. D-Optimal Design Overview

Unit 1, Lesson 1: Moving in the Plane

Exam Review: Ch. 1-3 Answer Section

Transcription:

Generic Graphics for Uncertainty and Sensitivity Analysis R. M. Cooke Dept. Mathematics, TU Delft, The Netherlands J. M. van Noortwijk HKV Consultants, Lelystad, The Netherlands ABSTRACT: We discuss graphical methods which may be employed generically for uncertainty and sensitivity analysis. This field is rather new, and the literature reveals very little in the way of theoretical development. Perhaps it is the nature of these methods that one simply sees what is going on. Apart from statistical reference books [Cleveland, 1993] which focus on visualizing data, the main sources for graphical methods are software packages. Our focus is visualization to support uncertainty and sensitivity analysis. A simple problem for illustrating four generic graphical techniques, namely tornado graphs, radar plots, multiple scatter plots, and cobweb pots. Uncertainty analysis codes, with or without graphic facilities, were benchmarked in a recent workshop of the technical committee Uncertainty Modelling of the European Safety and Reliability Association. The report [Cooke 1997] contains descriptions and references to the codes, as well as simple test problems. An extended version of this paper will appear in [Saltelli appearing]. 1. A SIMPLE PROBLEM The following problem has been developed for the Cambridge Course for Industry, Dependence Modeling and Risk Management [Cambridge Course for Industry 1998], and it serves to illustrate the generic techniques. Suppose we are interested in how long a car will start after the headlights have stopped working. We build a simple reliability model of the car consisting of three components: the battery (bat), the headlight lampbulb (bulb), the starter motor (strtr). The headlight fails when either the battery or the bulb fail. The car s ignition fails when either the battery or the starter motor fail. Thus considering bat, bulb and strtr as life variables: headlite = min(bat, bulb), ignitn = min(bat, strtr). The variable of interest is then ign-head = ignitn - headlite. Note that this quantity may be either positive or negative, and that it equals zero whenever the battery fails before the bulb and before the starter motor. We shall assume that bat, bulb and strtr are independent exponential variables with unit expected lifetime. The question is, which variable is most important to the quantity of interest ignhead? 2. TORNADO GRAPHS Tornado graphs are simply bar graphs arranged vertically in order of descending absolute value. The spreadsheet add-on Crystal Ball performs uncertainty analysis and gives graphic output for sensitivity analysis in the form of tornado graphs (without using this designation). After selecting a target forecast variable, in this case ign-head, Crystal Ball shows the rank correlations of other input variables and other forecast variables as in Figure 1. The values, in this case rank correlation coefficients, are arranged in decreasing order of absolute value. Hence the variable strtr with rank correlation 0.56 is first, and bulb with rank correlation -0.54 is second, and so on. When influence on the target variable, ign-head, is interpreted as rank correlation, it is easy to pick out the most important variables from such graphs. Note that bat is shown as having rank correlation 0 with the target variable ign-head. This would suggest that bat was completely unimportant for ign-head. 1

strtr.56 bulb -.54 ign.48 headlite -.46 bat -.00-1 -0.5 0 0.5 1 Figure 1 Measured by Rank Correlation 3. RADAR GRAPHS Radar graphs provide another way of showing the information in Figure 1. Figure 2 shows a radar graph made in EXCEL by entering the rank correlations from Figure 1. Each variable corresponds to a ray in the graph. The variable with the highest rank correlation is plotted furthest from the midpoint, and the variable with the lowest rank correlation is plotted closest to the midpoint. The real value of radar plots lies in their ability to handle a large number of variables. Figure 3 is a radar plot showing six variables of interest (radiation dose to various organs) in different colors, against 161 explanatory variables [Goossens et al 1997]. In A3 it is possible to view all this information at one time. 4. MULTIPLE SCATTER PLOTS Statistical packages support varieties of scatter plots. For example, SPSS provides a multiple scatter plot facility. Simulation data produced by UNICORN for 1000 samples has been read into SPSS to produce the plot matrix shown in Figure 6. We see pairwise scatter plots of the variables in our problem. The first row, for example, shows the scatter plots of ign-head and, respectively, bat, bulb, strtr, ignitn and headlite. Figure 4 contains much more information than Figure 1. Let (a,b) denote the scatter plot in row a and column b, thus (1,2) denotes the second plot in the first row with ign-head on the vertical axis and bat on the horizontal axis. Note that (2,1) shows the same scatter plot, but with bat on the vertical and ing-head on the horizontal axes. Although Figure 1 suggested that bat was unimportant for ign-head, Figure 4 shows that the value of bat can say a great deal about ign-head. Thus, if bat assumes its lowest possible value, then the values of ign-head are severely constrained. This reflects the fact that if bat is smaller than bulb and strtr, then ignitn = headlite, and ign-head = 0. From (1,3) we see that large values of bulb tend to associate with small values if ign-head; if bulb is large, then the headlight may live longer than the ignition making ign-head negative. Similarly, (1,4) shows that large values of strtr are associated with large values of ign-head. These facts are reflected in the rank correlations of Figure 1. bat headlite 0.5 0-0.5 strtr 1-1 ignitn bulb Series1 In spite of the above remarks, the relation between rank correlations depicted in Figure 1 and the scatter plots of Figure 4 is not direct. Thus, bat and bulb are statistically independent, but if we look at (2,3), we might infer that high values of bat tend to associate with low values of bulb. This however is an artifact of the simulation. There are very few very high values of bat, and as these are independent of bulb, the corresponding values of bulb are not extreme. If we had a scatter plot of the rank of bat with the rank of bulb, then the points would be uniformly distributed on the unit square. Figure 2 2

Sensitivity for High collective Dose v1 4.00E-01 v2v3v4v5v6 v7v8v9v10 v11 v12 3.00E-01 v13 v14 v15 v16 v17 2.00E-01 v18 v19 v140 v141 v142 v143 v144 v145 v146 v147 v148 v149 v150 v151 v152 v153 v154 v155 v156 v157 v158 v159 v20 v21 1.00E-01 v22 v23 v136 v137 v138 v139 v24 v25 0.00E+00 v26 v27 v132 v133 v134 v135 v28-1.00e-01 v29 v30 v31 v128 v129 v130 v131-2.00e-01 v32 v33 v34 v125 v126 v127-3.00e-01 v35 v36 v124-4.00e-01 v37 v122 v123 v38 v39 v121-5.00e-01 v40 v120 v41 v119 v118 v43 v42 v117 v44 v116 v45 v115 v114 v47 v46 v113 v48 v112 v111 v50 v49 v110 v109 v52 v51 v108 v107 v54 v53 v106 v55 v105 v104 v57 v56 v103 v102 v59 v58 v101 v100 v61 v60 v99 v98 v63 v62 v97 v96 v65 v64 v95 v94 v67 v66 v93 v92 v91 v70 v69 v68 v90 v89 v88 v73 v72 v71 v87 v86 v85 v84 v83 v82 v81 v80 v79 v78 v77 v76 v75 v74 collective cd1 dose cd2 dose cd3 dose cd4 dose cd5 dose cd6 Figure 3 BAT BULB STRTR IGNITN HEADLITE Figure 4 Figure 5 shows an overlay scatter plot. Ign-head is depicted on the vertical axis, and the values for bat, bulb and strtr are shown as blue, green and red points respectively, when viewed in color. Figure 5 is a superposition of plots (1,2), (1,3) and (1,4) of Figure 5. However, Figure 5 is more than just a superposition. Inspecting Figure 5 closely, we see that there are always a red, green and blue point corresponding to each realized value on the vertical axis. Thus at the very top there is a green point at 3 ign-head = 238 and bulb slightly greater than zero. There are blue and red points also corresponding to ign-head = 238. These three points correspond to the same sample. Indeed, ign-head attains its maximum value when strtr is very large and bulb is very small. If a value of ign-head is realized twice, then there will be two triples of blue-green-red points on a horizontal line corresponding to this value, and it is impossible to resolve the two separate data points. For ign-head = 0, there are about 300 realizations.

300 200 100 0-100 -200-300 -100 0 100 200 300 400 500 600 Figure 5 700 BAT BULB STRTR possible values of these variables as parallel vertical lines. One sample from this distribution is a six-vector. We mark the six values on the six vertical lines and connect the marks by a jagged line. If we repeat this 1000 times we get Figure 6 below. We can recognize the exponential distributions of bat, bulb and strtr. Ignitn and headlite, being the minimum of independent exponentials, are also exponential. Ign-head has a more complicated distribution. The graphs at the top are the cross densities ; they show the density of line crossings midway between the vertical axes. The role of these in depicting dependence becomes clear when we transform the six variables to ranks or percentiles, as in Figure 7: The distribution underlying Figure 4 is six dimensional. Figure 4 does not show this distribution, but rather shows 36 two-dimensional projections from this distribution. Figure 5 shows more than a collection of two dimensional projections, as we can sometimes resolve the individual data points for bat bulb, strtr and ignhead, but it does not enables us to resolve all data points. The full distribution is shown in cobweb plots. 5. COBWEB PLOTS The uncertainty analysis program UNICORN contains a graphical feature that enables interactive visualization of a moderately high dimensional distribution. Our sample problem contains six random variables. Suppose we represent the Line shading is introduced to identify the four quartiles of the left most variable. A number of striking features emerge when we transform to the percentile cobweb plot. First of all, there is a curious hole in the distribution of ign-head. This is explained as follows. On one third of the samples, bat is the minimum of {bat, bulb, strtr}. On these samples ignitn=headlite and ign-head = 0. Hence the distribution of ign-head has an atom at zero with weight 0.33. On one-third of the samples strtr is the minimum and on these samples ign-headlite is negative, and on the remaining third bulb is the minimum and ign-head is positive. Hence, the atom at zero means that the percentiles 0.33 up to 0.66 are all equal to zero. The first positive number is the 67-th percentile. Figure 6 4

Figure 7 Note the cross densities in Figure 7. One can show the following for two adjacent continuously distributed variables X and Y in a percentile cobweb plot 1 : If the rank correlation between X and Y = 1 then the cross density is uniform If X and Y are independent (rank correlation 0) then the cross density is triangular If the rank correlation between X and Y = -1, then the cross density is a spike in the middle. Intermediate values of the rank correlation yield intermediate pictures. The cross density of ignitn and headlite tends toward uniform, and the rank correlation between these variables is 0.42. Cobweb plots support interactive conditionalization; that is, the user can define regions on the various axes and select only those samples which intersect the chosen region. Figure 8 shows the result of conditionalizing on ign-head = 0, Notice that if ign-head = 0, then bat is almost always the minimum of {bat,bulb,strtr}, and ignitn 1 These statements are easily proved with a remark by Tim Bedford. Notice that the cross density is the density of X+Y, where X and Y are uniformly distributed on the unit square. If X and Y have rank correlation 1, then X+Y is uniform [0,2]; if they have rank correlation -1 then X + Y = 1; if X and Y are independent then the mass for X+Y=a is proportional to the length of the segment x+y=a. 5 is almost always equal to headlite. This is reflected in the conditional rank correlation between ignitn and headlite almost equal to 1. We see that the conditional correlation as in Figure 8 can be very different from the unconditional correlation of Figure 7. From Figure 8 we also see that bat is almost always less than bulb and strtr. Cobweb plots allow us to examine local sensitivity. Thus we can say, suppose ign-head is very large, what values should the other variables take? The answer is gotten simply by conditionalizing on high values if ign-head. Figure 9 shows conditionalization on high values of ign-head, Figure 10 conditionalizes on low values. If ignhead is high, then bat, strtr and ignitn must be high. If ign-head is low then bat, bulb and headlite must be high. Note that bat is high in one case and low in the other. Hence, we should conclude that bat is very important both for high values and for low values of ign-head. This is a different conclusion than we would have drawn if we considered only the rank correlations of Figures 1 and 2 These facts can also be readily understood from the formulae themselves. Of course the methods come into their own in complex problems where we cannot see these relationships immediately from the formula. The graphical methods then draw our

attention to patterns which we must then seek to understand. REFERENCES Cambridge Course for Industry, 1998, Dependence Modelling and Risk Management, University of Cambridge, Cambridge. Cleveland, W.S. Visualizing Data, 1993, Marray Hill, New Jersey: AT&T Bell Laboratories. Cooke, R.M. 1995, UNICORN: Methods and code for uncertainty analysis, AEA Technology, Cheshire. Figure 8 Crystal Ball, version 4.0, 1996, Users Manual, Decisioneering Inc. Aurora, Colorado. European Safety and Reliability Association Technical Committee Uncertainty Modelling, 1997, Benchmark Workshop (Cooke, R.M. ed), Delft. EXCEL for Windows 95, 1995, Microsoft. Goossens, L.H.J., J.D. Harrison, B.C.P. Kraan, R.M. Cooke, F.T. Harper, and S.C. Hora, Probabilistic Accident Consequence Uncertainty Analysis: uncertainty assessment for internal dosimetry, NUREG/CR-6571, EUR 16773, SAND98-0119, vol. 1, prepared for U.S. Nuclear Regulatory Commission, Commission of the European Communities, 1997. Figure 9 Saltelli, A. (appearing), Mathematical and Statistical Methods for Sensitivity Analysis of Model Output, Wiley, New York. SPSS for Windows, 8.00, 1997, SPSS. Figure 10 6