Statistical Analysis of Output from Terminating Simulations
Outline
- Time frame of simulations
- Strategy for data collection and analysis
- Confidence intervals
- Comparing two scenarios using Output Analyzer
- Searching for an optimal scenario with OptQuest
Motivation
- Run a simulation (once): what does it mean? Was this run typical or not? How much variability is there from run to run (of the same model)?
- Need statistical analysis of output data:
  - From a single model configuration
  - To compare two or more different configurations
  - To search for an optimal configuration
- Statistical analysis of output is often ignored. This is a big mistake: it leaves no idea of the precision of the results.
- It is not hard or time-consuming to do. It just takes a little planning and thought, then some (cheap) computer time.
Time Frame of Simulations
- Terminating: specific starting and stopping conditions. The run length is well-defined (and finite).
- Steady-state: long-run behavior over a theoretically infinite time frame. Initial conditions for the simulation don't matter. A steady-state simulation still has to stop at some point, and as you might guess, these runs can be pretty long.
Strategy for Data Collection and Analysis
- For the terminating case, make some number n of independent and identically distributed (IID) replications.
- To do this, open the Run > Setup > Replication Parameters dialog box and enter the number of replications n.
- Check both boxes under Initialize Between Replications so that both the system state variables and the statistical accumulators are cleared at the end of each replication.
Separate results for each replication: Category by Replication report for 10 replications
- Consider a bank with 5 tellers and one FIFO queue, open 9am to 5pm, with customers flushed out before stopping.
- Inter-arrival times ~ expo (mean = 1 min.), service times ~ expo (mean = 4 min.)
- Simulating 10 days (replications), each starting with no customers, gives the following summary measures.

Replication  Number served  Finish time (hours)  Average delay in queue (minutes)  Average queue length
1            469            8.243                1.0452                            0.9869
2            482            8.045                1.279                             1.2735
3            509            8.026                1.5341                            1.6151
4            530            8.049                1.5592                            1.7045
5            510            8.125                2.4832                            2.5874
6            548            8.15                 4.7143                            5.2616
7            484            8.103                3.7054                            3.6731
8            490            8.36                 2.62                              2.55
9            511            8.2                  1.33                              1.37
10           483            8.148                1.09                              1.07

Note the variability of the summary measures across replications.
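As a rough illustration of what "n independent replications" means for this bank, here is a minimal Python sketch. It is not the Arena model: it only assumes the parameters stated on the slide (5 tellers, one FIFO queue, exponential inter-arrival and service times with means 1 and 4 minutes, an 8-hour day, customers flushed out after closing). The function names and the seed are hypothetical, and the numbers it produces will differ from the Arena report above.

```python
import heapq
import random


def simulate_bank_day(rng, n_tellers=5, mean_interarrival=1.0,
                      mean_service=4.0, open_minutes=480.0):
    """Simulate one day (one replication), starting empty and idle."""
    # Min-heap of the times at which each teller next becomes free.
    free_at = [0.0] * n_tellers
    heapq.heapify(free_at)
    clock = 0.0
    total_delay = 0.0
    served = 0
    last_departure = 0.0
    while True:
        clock += rng.expovariate(1.0 / mean_interarrival)
        if clock > open_minutes:
            break  # doors close: no more arrivals, but the queue is flushed
        # FIFO: the next customer takes whichever teller frees up first.
        earliest_free = heapq.heappop(free_at)
        start = max(clock, earliest_free)
        departure = start + rng.expovariate(1.0 / mean_service)
        heapq.heappush(free_at, departure)
        total_delay += start - clock
        served += 1
        last_departure = max(last_departure, departure)
    finish = max(open_minutes, last_departure)
    return {
        "served": served,
        "finish_hours": finish / 60.0,
        "avg_delay_min": total_delay / served if served else 0.0,
        # Time-average queue length = (sum of delays) / (elapsed time).
        "avg_queue_len": total_delay / finish,
    }


def run_replications(n, seed=12345):
    """Run n replications; state is reset for each one, random numbers are not reused."""
    rng = random.Random(seed)
    return [simulate_bank_day(rng) for _ in range(n)]


if __name__ == "__main__":
    for i, r in enumerate(run_replications(10), start=1):
        print(i, r["served"], round(r["finish_hours"], 3),
              round(r["avg_delay_min"], 4), round(r["avg_queue_len"], 4))
```

Each call to `simulate_bank_day` starts from an empty bank, which mirrors checking both Initialize Between Replications boxes, while the single random stream keeps the replications independent.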
Confidence Intervals for Terminating Systems
Summary statistics across the 10 replications listed above:

                     Average delay in queue (minutes)  Average queue length
Sample mean          2.13604                           2.20921
Standard deviation   1.239555412                       1.362130309
Half width           0.886662919                       0.974341627
Min                  1.0452                            0.9869
Max                  4.7143                            5.2616
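Half Width, Number of Replications
- Prefer smaller confidence intervals (more precision).
- Notation: $n$ = number of replications, $\bar{X}$ = sample mean, $s$ = sample standard deviation, $t_{n-1,1-\alpha/2}$ = critical value from t tables.
- Confidence interval: $\bar{X} \pm t_{n-1,1-\alpha/2}\, s/\sqrt{n}$
- Half-width = $t_{n-1,1-\alpha/2}\, s/\sqrt{n}$
- Can't control t or s, so we must increase n. How much? That depends on how one defines precision.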
Example (average delay in queue):
- Confidence level = 95%, n = 10 replications
- s (waiting time) = 1.23
- Half-width = 0.886
- Point estimate = 2.13
- Confidence interval = 2.13 ± 0.886
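The interval for average delay in queue can be reproduced from the ten replication values with a few lines of Python. This is a minimal sketch using only the standard library. The critical value 2.262 is the usual t-table entry for 9 degrees of freedom at the two-sided 95% level and is hard-coded here as an assumption, and the function name is hypothetical.

```python
import math
import statistics

# Average delay in queue (minutes) from the 10 replications.
avg_delay = [1.0452, 1.279, 1.5341, 1.5592, 2.4832,
             4.7143, 3.7054, 2.62, 1.33, 1.09]

# Assumed t-table value: df = n - 1 = 9, two-sided 95% confidence.
T_CRIT_9DF_95 = 2.262


def confidence_interval(data, t_crit):
    """Return (sample mean, sample std dev, half-width) across replications."""
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (n - 1 divisor)
    half_width = t_crit * s / math.sqrt(n)
    return mean, s, half_width


mean, s, half_width = confidence_interval(avg_delay, T_CRIT_9DF_95)
print(f"{mean:.3f} +/- {half_width:.3f}  (s = {s:.3f})")
```

For a different number of replications, the critical value has to be looked up again for n - 1 degrees of freedom.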
Confidence Intervals for Terminating Systems
- If there is more than one replication, Arena uses the cross-replication data as above.
- Possibly the most useful part: a 95% confidence interval on the expected values.
Half Width, Number of Replications
- Want the half-width to be small, say < h, where h is prespecified.
- Set half-width = h and solve for n: $n = t_{n-1,1-\alpha/2}^2\, s^2 / h^2$
- This is not really solved for n, since t and s depend on n.
- Approximation: replace t by z, the corresponding normal critical value, and pretend that the current s will hold for larger samples.
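Half Width, Number of Replications
- Get s = sample standard deviation from an initial number $n_0$ of replications, then use $n \approx z_{1-\alpha/2}^2\, s^2 / h^2$
- For a 95% confidence level, z = 1.96, and h = 10% of the point estimate (2.13) = 0.213: $n \approx 1.96^2 \times 1.23^2 / 0.213^2 \approx 128$ replications
- n grows quadratically as h decreases.

[Figure: output after running the model for 128 replications]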
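The normal-approximation calculation can be checked with a short Python sketch. It assumes the values quoted on the slide (s = 1.23, h = 0.213) and uses the standard library's `statistics.NormalDist` to obtain z. The function name is hypothetical.

```python
from statistics import NormalDist


def replications_normal_approx(s, h, confidence=0.95):
    """Approximate n from n ~ z^2 * s^2 / h^2 (t replaced by z, s held fixed)."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)  # about 1.96 for 95%
    return (z * s / h) ** 2


n_needed = replications_normal_approx(s=1.23, h=0.213)
print(f"approximately {n_needed:.1f} replications")
```

The slide rounds the result to 128. Rounding up to the next whole number would be the more conservative choice.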
Half Width, Number of Replications
- Easier but different approximation, with $h_0$ = half-width from the initial number $n_0$ of replications: $n \approx n_0\, h_0^2 / h^2$
- $n \approx 10 \times 0.886^2 / 0.213^2 \approx 173$ replications
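The half-width ratio approximation is a one-liner. The sketch below assumes the slide's values (n0 = 10, h0 = 0.886, h = 0.213), and the function name is hypothetical.

```python
def replications_from_half_width(n0, h0, h):
    """Approximate n from n ~ n0 * h0^2 / h^2, given an initial half-width h0."""
    return n0 * (h0 / h) ** 2


n_needed_ratio = replications_from_half_width(n0=10, h0=0.886, h=0.213)
print(f"approximately {n_needed_ratio:.1f} replications")
```

This estimate is larger than the normal-approximation one mainly because h0 was computed with the t critical value for 9 degrees of freedom (about 2.262) rather than z = 1.96.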
Warm Up Period
- Warm-up period: the initial part of a run, beyond which the output no longer appears to be drifting statistically (the system appears to have reached steady state).
- Arena's companion package, Output Analyzer, can be used to plot any model parameter over a replication.
- In our example, we want plots of Total WIP and of Total Number in Queue for 10 replications.
- Using the Statistic module, save each output to a .dat file.
Warm Up Period
- Open Output Analyzer.
- Click on the Plot icon.
- Browse for the file you saved.
- For Replications, choose All.
[Figure: Warm Up Period, plot of Total WIP]
[Figure: Warm Up Period, plot of Total Number in Queue]
Comparing Two Scenarios
Two scenarios:
1) The base scenario is the Bank Example (5 tellers).
2) The new scenario is that we hire 6 tellers instead of 5 tellers.

W1 = waiting time in queue under Scenario 1, W2 = waiting time in queue under Scenario 2, Z = W1 - W2

Replication  W1      W2      Z = W1 - W2
1            1.0452  0.3651  0.6801
2            1.279   0.2145  1.0645
3            1.5341  0.4643  1.0698
4            1.5592  0.3585  1.2007
5            2.4832  0.1066  2.3766
6            4.7143  0.4175  4.2968
7            3.7054  0.6298  3.0756
8            2.62    0.7119  1.9081
9            1.33    0.5928  0.7372
10           1.09    0.7317  0.3583

Average of Z: 1.67677
Standard deviation of Z: 1.244945066
Confidence interval: 1.68 ± 0.8905
Comparing Two Scenarios
- Reasonable but not-quite-right idea: make confidence intervals on the expected outputs from each scenario and see if they overlap. This doesn't allow for a precise, efficient statistical conclusion.
- Instead, look at the difference in waiting time in queue, replication by replication, and form a paired-t confidence interval on the mean difference.
- The 95% paired-t confidence interval is [0.78, 2.57]. Since the interval does not contain 0, the two systems are significantly different. In addition, scenario 2 appears to be better, since customers spend less time waiting in queue.
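The paired-t interval can be reproduced by differencing the two scenarios replication by replication and then treating the differences as a single sample. This is a minimal standard-library sketch. The critical value 2.262 (t table, 9 degrees of freedom, two-sided 95%) is hard-coded as an assumption, and the function name is hypothetical.

```python
import math
import statistics

# Waiting time in queue (minutes) per replication: 5 tellers vs. 6 tellers.
w1 = [1.0452, 1.279, 1.5341, 1.5592, 2.4832,
      4.7143, 3.7054, 2.62, 1.33, 1.09]
w2 = [0.3651, 0.2145, 0.4643, 0.3585, 0.1066,
      0.4175, 0.6298, 0.7119, 0.5928, 0.7317]

# Assumed t-table value: df = 9, two-sided 95% confidence.
T_CRIT_9DF_95 = 2.262


def paired_t_interval(a, b, t_crit):
    """Confidence interval on the mean of the paired differences a[i] - b[i]."""
    diffs = [x - y for x, y in zip(a, b)]
    mean = statistics.mean(diffs)
    s = statistics.stdev(diffs)
    half_width = t_crit * s / math.sqrt(len(diffs))
    return mean - half_width, mean + half_width


low, high = paired_t_interval(w1, w2, T_CRIT_9DF_95)
print(f"[{low:.3f}, {high:.3f}]")
# If the interval excludes 0, the difference is statistically significant.
print("significant" if low > 0 or high < 0 else "not significant")
```

Pairing requires the same number of replications in both scenarios, with replication i of one scenario matched to replication i of the other.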
Comparing Two Scenarios Using Output Analyzer
- Output Analyzer is a separate application that operates on .dat files produced by Arena.
- Launch it separately from Windows, not from Arena.
- To save output values (Expressions) of entries in the Statistic data module (Type = Output), enter filename.dat in the Output File column.
- This was done for both scenarios.
Comparing Two Scenarios Using Output Analyzer
- Choose the Analyze > Compare Means menu option.
- Add data files A and B for the two scenarios.
- Select Lumped for the Replications field.
- Set the title and confidence level, accept Paired-t Test, and do not Scale Display, since the two output performance measures have different units.
Compare Means via Output Analyzer
Results for 10 replications:
- Point estimate = 1.68
- 95% paired-t confidence interval: [0.78, 2.57]

Comparing Two Scenarios Using Output Analyzer over 100 Replications
- Point estimate = 1.61
- 95% paired-t confidence interval: [1.37, 1.84]
Searching for an Optimal Scenario with OptQuest (not included in student version of Arena)
- Seek input controls minimizing Total Cost while keeping Percent Rejected <= 5.
- Explore all possibilities: add resources in any combination.
- New rules:
  - 26 <= number of trunk lines <= 50
  - Total number of new employees of all five types <= 15
File > OptQuest Optimization: OptQuest (Model 5-3)
- Formulate as an optimization problem:
  Minimize Total Cost (the objective function is a simulation-model output)
  Subject to
    26 <= MR(Trunk Line) <= 50 (constraint on an input control (decision) variable)
    0 <= New Sales + New Tech 1 + New Tech 2 + New Tech 3 + New Tech All <= 15 (constraint on input control (decision) variables)
    Percent Rejected <= 5 (constraint on another output)
- OptQuest requires a finite Replication Length.
- Tools > OptQuest for Arena
- Choose New Optimization, or Browse for a saved one (.opt).
- In the tree on the left, expand Controls and Responses.
OptQuest: in this toolbar we enter the parameters of our optimization problem.
Searching for an Optimal Scenario with OptQuest: Controls, Responses
- Controls > Resources > Trunk Line: Integer, Lower Bound = 26, Suggested Value = 29, Upper Bound = 50
- Controls > User Specified > New Sales: Integer, Lower Bound = 0, Suggested Value = 3, Upper Bound = 15
- Click on Included to collect selections at top or bottom.
- Responses > User Specified > Output: check Percent Rejected and Total Cost.
Searching for an Optimal Scenario with OptQuest: Constraints, Objective
- Constraints:
  - Add button, then each of the first five controls joined by +, then <= 15
  - Add button, then Percent Rejected, then <= 5
- Objectives: Add button, Total Cost, Minimize radio button
- Options:
  - Stopping rules
  - Tolerance for regarding results as equal
  - Replications per simulation
  - Solutions log file location
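To make the structure of the problem concrete, the sketch below encodes the constraints and objective in plain Python. It is not OptQuest and says nothing about how OptQuest searches. It only shows how a scenario is judged once its simulated responses are known. All names (`Controls`, `controls_feasible`, `best_feasible`) are hypothetical, and the Total Cost and Percent Rejected values are assumed to come from simulation runs done elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Controls:
    """Input controls (decision variables) for one scenario."""
    trunk_lines: int
    new_sales: int
    new_tech_1: int
    new_tech_2: int
    new_tech_3: int
    new_tech_all: int


def controls_feasible(c):
    """Constraints on the inputs: 26 <= trunk lines <= 50, 0 <= new staff total <= 15."""
    staff = [c.new_sales, c.new_tech_1, c.new_tech_2, c.new_tech_3, c.new_tech_all]
    return (26 <= c.trunk_lines <= 50
            and all(x >= 0 for x in staff)
            and sum(staff) <= 15)


def response_feasible(percent_rejected):
    """Constraint on a simulation output: Percent Rejected <= 5."""
    return percent_rejected <= 5.0


def best_feasible(candidates):
    """Pick the lowest-cost feasible scenario.

    candidates: iterable of (controls, total_cost, percent_rejected) tuples,
    where the two responses were obtained by simulating that scenario.
    Returns None when no candidate satisfies every constraint.
    """
    feasible = [cand for cand in candidates
                if controls_feasible(cand[0]) and response_feasible(cand[2])]
    return min(feasible, key=lambda cand: cand[1]) if feasible else None
```

Input constraints can be checked before a scenario is simulated. The Percent Rejected constraint can only be checked afterwards, because it is itself a simulation output.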
Searching for an Optimal Scenario with OptQuest: Running
- Running: the run icon, Run > Start, or F5
- Use the Optimization branch on the tree to watch progress, scenarios so far, and the best scenario so far.
- Can't absolutely guarantee a true optimum, but it usually finds a far better configuration than is possible by hand.
Output