Monte Carlo Techniques for Estimating Power in Aircraft T&E Tests. Todd Remund Dr. William Kitto EDWARDS AFB, CA. July 2011

Similar documents
Edwards Air Force Base Accelerates Flight Test Data Analysis Using MATLAB and Math Works. John Bourgeois EDWARDS AFB, CA. PRESENTED ON: 10 June 2010

The Last Word on TLE. James Brownlow AIR FORCE TEST CENTER EDWARDS AFB, CA May, Approved for public release ; distribution is unlimited.

Can Wing Tip Vortices Be Accurately Simulated? Ryan Termath, Science and Teacher Researcher (STAR) Program, CAL POLY SAN LUIS OBISPO

DoD Common Access Card Information Brief. Smart Card Project Managers Group

Smart Data Selection (SDS) Brief

Accuracy of Computed Water Surface Profiles

Multi-Modal Communication

Service Level Agreements: An Approach to Software Lifecycle Management. CDR Leonard Gaines Naval Supply Systems Command 29 January 2003

Design Improvement and Implementation of 3D Gauss-Markov Mobility Model. Mohammed Alenazi, Cenk Sahin, and James P.G. Sterbenz

Dana Sinno MIT Lincoln Laboratory 244 Wood Street Lexington, MA phone:

ARINC653 AADL Annex Update

Empirically Based Analysis: The DDoS Case

COMPUTATIONAL FLUID DYNAMICS (CFD) ANALYSIS AND DEVELOPMENT OF HALON- REPLACEMENT FIRE EXTINGUISHING SYSTEMS (PHASE II)

Running CyberCIEGE on Linux without Windows

4. Lessons Learned in Introducing MBSE: 2009 to 2012

Information, Decision, & Complex Networks AFOSR/RTC Overview

FUDSChem. Brian Jordan With the assistance of Deb Walker. Formerly Used Defense Site Chemistry Database. USACE-Albuquerque District.

High-Assurance Security/Safety on HPEC Systems: an Oxymoron?

Using Model-Theoretic Invariants for Semantic Integration. Michael Gruninger NIST / Institute for Systems Research University of Maryland

ASSESSMENT OF A BAYESIAN MODEL AND TEST VALIDATION METHOD

Basing a Modeling Environment on a General Purpose Theorem Prover

Concept of Operations Discussion Summary

Kathleen Fisher Program Manager, Information Innovation Office

Space and Missile Systems Center

Technological Advances In Emergency Management

HEC-FFA Flood Frequency Analysis

Washington University

A Distributed Parallel Processing System for Command and Control Imagery

AFRL-ML-WP-TM

A Methodology for End-to-End Evaluation of Arabic Document Image Processing Software

DEVELOPMENT OF A NOVEL MICROWAVE RADAR SYSTEM USING ANGULAR CORRELATION FOR THE DETECTION OF BURIED OBJECTS IN SANDY SOILS

Computer Aided Munitions Storage Planning

Space and Missile Systems Center

Web Site update. 21st HCAT Program Review Toronto, September 26, Keith Legg

2013 US State of Cybercrime Survey

MODELING AND SIMULATION OF LIQUID MOLDING PROCESSES. Pavel Simacek Center for Composite Materials University of Delaware

Architecting for Resiliency Army s Common Operating Environment (COE) SERC

Setting the Standard for Real-Time Digital Signal Processing Pentek Seminar Series. Digital IF Standardization

Vision Protection Army Technology Objective (ATO) Overview for GVSET VIP Day. Sensors from Laser Weapons Date: 17 Jul 09 UNCLASSIFIED

SURVIVABILITY ENHANCED RUN-FLAT

C2-Simulation Interoperability in NATO

Radar Tomography of Moving Targets

Towards a Formal Pedigree Ontology for Level-One Sensor Fusion

Data Warehousing. HCAT Program Review Park City, UT July Keith Legg,

75th Air Base Wing. Effective Data Stewarding Measures in Support of EESOH-MIS

Lessons Learned in Adapting a Software System to a Micro Computer

Topology Control from Bottom to Top

Uniform Tests of File Converters Using Unit Cubes

Distributed Real-Time Embedded Video Processing

REPORT DOCUMENTATION PAGE

The C2 Workstation and Data Replication over Disadvantaged Tactical Communication Links

SINOVIA An open approach for heterogeneous ISR systems inter-operability

ATCCIS Replication Mechanism (ARM)

U.S. Army Research, Development and Engineering Command (IDAS) Briefer: Jason Morse ARMED Team Leader Ground System Survivability, TARDEC

M&S Strategic Initiatives to Support Test & Evaluation

ENVIRONMENTAL MANAGEMENT SYSTEM WEB SITE (EMSWeb)

A Review of the 2007 Air Force Inaugural Sustainability Report

An Update on CORBA Performance for HPEC Algorithms. Bill Beckwith Objective Interface Systems, Inc.

DoD M&S Project: Standardized Documentation for Verification, Validation, and Accreditation

Use of the Polarized Radiance Distribution Camera System in the RADYO Program

CENTER FOR ADVANCED ENERGY SYSTEM Rutgers University. Field Management for Industrial Assessment Centers Appointed By USDOE

ASPECTS OF USE OF CFD FOR UAV CONFIGURATION DESIGN

WAITING ON MORE THAN 64 HANDLES

Android: Call C Functions with the Native Development Kit (NDK)

US Army Industry Day Conference Boeing SBIR/STTR Program Overview

Cyber Threat Prioritization

Stereo Vision Inside Tire

Wireless Connectivity of Swarms in Presence of Obstacles

Guide to Windows 2000 Kerberos Settings

Open Systems Development Initiative (OSDI) Open Systems Project Engineering Conference (OSPEC) FY 98 Status Review 29 April - 1 May 1998

Corrosion Prevention and Control Database. Bob Barbin 07 February 2011 ASETSDefense 2011

Dr. Stuart Dickinson Dr. Donald H. Steinbrecher Naval Undersea Warfare Center, Newport, RI May 10, 2011

Wind Tunnel Validation of Computational Fluid Dynamics-Based Aero-Optics Model

SETTING UP AN NTP SERVER AT THE ROYAL OBSERVATORY OF BELGIUM

Headquarters U.S. Air Force. EMS Play-by-Play: Using Air Force Playbooks to Standardize EMS

32nd Annual Precise Time and Time Interval (PTTI) Meeting. Ed Butterline Symmetricom San Jose, CA, USA. Abstract

LARGE AREA, REAL TIME INSPECTION OF ROCKET MOTORS USING A NOVEL HANDHELD ULTRASOUND CAMERA

Using Templates to Support Crisis Action Mission Planning

Balancing Transport and Physical Layers in Wireless Ad Hoc Networks: Jointly Optimal Congestion Control and Power Control

BUPT at TREC 2009: Entity Track

2011 NNI Environment, Health, and Safety Research Strategy

Ross Lazarus MB,BS MPH

Introducing I 3 CON. The Information Interpretation and Integration Conference

Defense Hotline Allegations Concerning Contractor-Invoiced Travel for U.S. Army Corps of Engineers' Contracts W912DY-10-D-0014 and W912DY-10-D-0024

Col Jaap de Die Chairman Steering Committee CEPA-11. NMSG, October

Shallow Ocean Bottom BRDF Prediction, Modeling, and Inversion via Simulation with Surface/Volume Data Derived from X-Ray Tomography

Fall 2014 SEI Research Review Verifying Evolving Software

73rd MORSS CD Cover Page UNCLASSIFIED DISCLOSURE FORM CD Presentation

UMass at TREC 2006: Enterprise Track

73rd MORSS CD Cover Page UNCLASSIFIED DISCLOSURE FORM CD Presentation

TARGET IMPACT DETECTION ALGORITHM USING COMPUTER-AIDED DESIGN (CAD) MODEL GEOMETRY

Title: An Integrated Design Environment to Evaluate Power/Performance Tradeoffs for Sensor Network Applications 1

Current Threat Environment

INTEGRATING LOCAL AND GLOBAL NAVIGATION IN UNMANNED GROUND VEHICLES

Nationwide Automatic Identification System (NAIS) Overview. CG 939 Mr. E. G. Lockhart TEXAS II Conference 3 Sep 2008

Approaches to Improving Transmon Qubits

Maintenance Program and New Materials on Boeing Commercial Aircraft Joseph H. Osborne The Boeing Company Seattle, WA

Dynamic Information Management and Exchange for Command and Control Applications

75 TH MORSS CD Cover Page. If you would like your presentation included in the 75 th MORSS Final Report CD it must :

An Efficient Architecture for Ultra Long FFTs in FPGAs and ASICs

Transcription:

AFFTC-PA-11244 Monte Carlo Techniques for Estimating Power in Aircraft T&E Tests A F F T C Todd Remund Dr. William Kitto AIR FORCE FLIGHT TEST CENTER EDWARDS AFB, CA July 2011 Approved for public release A: distribution is unlimited. AIR FORCE FLIGHT TEST CENTER EDWARDS AIR FORCE BASE, CALIFORNIA AIR FORCE MATERIEL COMMAND UNITED STATES AIR FORCE

REPORT DOCUMENTATION PAGE Form Approved OMB No. 0704-0188 Public reporting burden for this collection of information is estimated to average 1 hour per response, including the time for reviewing instructions, searching existing data sources, gathering and maintaining the data needed, and completing and reviewing this collection of information. Send comments regarding this burden estimate or any other aspect of this collection of information, including suggestions for reducing this burden to Department of Defense, Washington Headquarters Services, Directorate for Information Operations and Reports (0704-0188), 1215 Jefferson Davis Highway, Suite 1204, Arlington, VA 22202-4302. Respondents should be aware that notwithstanding any other provision of law, no person shall be subject to any penalty for failing to comply with a collection of information if it does not display a currently valid OMB control number. PLEASE DO NOT RETURN YOUR FORM TO THE ABOVE ADDRESS. 1. REPORT DATE (DD-MM-YYYY) 21-07-2011 4. TITLE AND SUBTITLE 2. REPORT TYPE ITEA Conference Monte Carlo Techniques for Estimating Power in Aircraft T&E Tests 3. DATES COVERED (From - To) 20-07-2011 to 22-07-2011 5a. CONTRACT NUMBER NA 5b. GRANT NUMBER 5c. PROGRAM ELEMENT NUMBER 6. AUTHOR(S) Todd Remund, Dr. William Kitto 5d. PROJECT NUMBER 5e. TASK NUMBER 5f. WORK UNIT NUMBER 7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES) AND ADDRESS(ES) 812 TSS 307 E. Popson Ave Edwards AFB, CA 93524 8. PERFORMING ORGANIZATION REPORT NUMBER AFFTC-PA-11244 9. SPONSORING / MONITORING AGENCY NAME(S) AND ADDRESS(ES) 812 TSS 307 E. Popson Ave Edwards AFB, CA 93524 12. DISTRIBUTION / AVAILABILITY STATEMENT Approved for public release A: distribution is unlimited. 10. SPONSOR/MONITOR S ACRONYM(S) N/A 11. SPONSOR/MONITOR S REPORT NUMBER(S) N/A 13. SUPPLEMENTARY NOTES CA: Air Force Flight Test Center Edwards AFB CA CC: 012100 14. ABSTRACT Edwards AFB, as a matter of policy, requires statistical rigor be a part of test design and analysis. Statistically defensible methods are used to gain as much information as possible from each test. This requires: Statistically defensible methods be identified and applied to each test Setting up tests to maximize scope of inference, and Determining the power or each test to optimize sample size This paper demonstrates how Monte Carlo techniques may be applied to aircraft test and evaluation to determine the power of the test and the associated sample size requirements. Traditional methods for determining the power of a test are based on distributional assumptions associated with data. These assumptions may not be appropriate; a distribution-free Monte Carlo technique for power assessment for tests with (possible) serially correlated data is presented. The technique is illustrated with an example from a target location error (TLE) test. Power of the test and appropriate sample sizes are derived using Monte Carlo simulation implemented in R. 15. SUBJECT TERMS Power, statistics, resampling, Monte Carlo simulation, R, sample size, CEP, CE90, circular error. 16. SECURITY CLASSIFICATION OF: Unclassified a. REPORT Unclassified b. ABSTRACT Unclassified 17. LIMITATION OF ABSTRACT 18. NUMBER OF PAGES c. THIS PAGE Unclassified None 21 19a. NAME OF RESPONSIBLE PERSON 412 TENG/EN (Tech Pubs) 19b. TELEPHONE NUMBER (include area code) 661-277-8615 Standard Form 298 (Rev. 8-98) Prescribed by ANSI Std. Z39.18

Air Force Flight Test Center War-Winning Capabilities On Time, On Cost Monte Carlo Techniques for Estimating Power in Aircraft T&E Tests July 2011 Approved for public release; distribution is unlimited. AFFTC-PA-11244 Todd Remund & William Kitto 412 TW 661-277-6384 I n t e g r i t y - S e r v i c e - E x c e l l e n c e 1

Power what is it? Power = Pr(H A H A is true) α Choose sample size, n, to get this Also need to decide what you want to see stay tuned = Pr(H A H o is true) Choose this number directly Normally 0.05 or 0.1 2

Uncertainty and its vicissitudes The mean is never equal to 0. But can we see the difference given how much spread/uncertainty there is in the sample? 3

Uncertainty and its vicissitudes The confidence intervals put a measure on uncertainty to help us make a decision. Here a difference is not detected because the confidence intervals cross zero. 4

Again Power what is it? Power is the proportion of times, in the long run, that our test (t-test, CI) identifies a difference, when it really exists. WRT the mean, we need to decide how big of a difference from zero do we care about. This is the effect size, called δ. If we choose enough samples the CI shrinks, and it is easier to see a difference. But how large of a sample do we need to get to see the δ we want? 5

Power via Monte Carlo For many applications, such as the one given, power calculations are closed form. For other difficult applications, the calculation doesn t exist. Generate Alternate Population With the desired effect With the distribution characteristics needed Sample from it repeatedly Each time analyzing the sample and record significance Compute the proportion of significant outcomes This is power via Monte Carlo 6

Setting Up the Minimal Alternative (1-sample t-test) Couch the example in a one-sample t-test for simplicity s sake. We want to see at least a δ effect size 80% of the time 7

Setting Up the Minimal Alternative (1-sample t-test) Couch the example in a one-sample t-test for simplicity s sake. With this much uncertainty (characteristics of population). We want to see at least a δ effect size 80% of the time 8

Using the Alternate World Couch the example in a one-sample t-test for simplicity s sake. Now repeatedly sample from the population and run t- tests each time. Each sample will have a different estimate for mean and standard deviation. With this much uncertainty (characteristics of population). 95% CI Sample 1 0.45 1.72 Sample 2-0.02 1.81 Sample 3 0.25 2.08 Sample 4 0.39 1.85...... Sample 1000 0.41 1.59 We want to see a δ effect size 80% of the time 9

Power Estimate With n=10, st.dev=1, δ=1 for Normal distribution we get Using the Monte Carlo method to calculate power for the one-sample t-test: Power = 80.2% This method has a little variation in the estimate because it is a simulation approach. Using conventional methods: Power = 80.31% 10

Click through PDF SerialMeans.pdf File SERIAL CORRELATION 11

2-Sample t-test w/ Serial Corr. Incorrect SE of mean from an individual sample. Actual SE of mean. (An adjustment is needed in analysis) 12

Need a different version of test. The difference in a regular 2-sample t-test, and an adjusted test is, Estimate the autocorrelation, r Adjust the SE of the test statistic: SE( x CI x y) y adjusted z 1 2 SE( x SE( x y)* y) 1 1 adjusted r r What is the conventional method of computing power for this? 13

How do we do it? 1. Create the minimal alternate hypothetical population MAHP 2. Take sample of size n from the MAHP 3. Test to see significance with chosen test, (here we re using the adjusted CI previous page). 4. Repeat steps 2 and 3 1000, 10000, or more times while recording how many are significant. 5. Find proportion that are significant out of number of repeated loops. 14

Method: Plug and Play Question: What n do we use to detect δ with chosen power? 1. Generate MAHP Minimal Alternate Hypothetical Population Size 100,000 (or something real big) There may be multiple ways of testing effect. Method 1 Method 2 Method 3 Which is most powerful for given sample size? 2. Sample n values from MAHP 3. Test effect with method Method 1 4. Record outcome from test (significant or not) Power Repeat loop 1K or 10K times 1 1000 Get vector, v, of 0/1, 1 for significant outcome 1000 v i i 1 15

CE90 Power We want to know how many runs we need to prove CE90 is meeting spec for a targeting device. How close to spec do you want to be before you are willing to concede that you are no different from spec? We want to have proof of meeting spec if it is at least 2 feet beyond CE90. 16

CE90 MAHP 17

CE90 MAHP 18

CE90 MAHP 19

CE90 Power With δ=2 and sample size of 60, Power = 70% This power calculation is only for the specific situations similar to the MAHP. Any different pattern in CE will require a separate power analysis. 20

Summary Monte Carlo power estimation is versatile and can handle most situations. It is difficult sometimes to create the MAHP. A statistician is likely needed to aid in the process. 21

Means from Different Samples 2 0 2 4 6 8 Mean(X) AR(1) Mean(Y) AR(1) Mean(X) Independent Mean(Y) Independent