Test Oracles and Randomness


1 Test Oracles and Randomness. Ralph Guderlei and Johannes Mayer (rjg@mathematik.uni-ulm.de, jmayer@mathematik.uni-ulm.de), University of Ulm, Department of Stochastics and Department of Applied Information Processing. NetObject Days 2004.

2 Introduction: Standard methods of software testing do not provide information about software reliability, which motivates random testing. The main problem in random testing is verifying the actual results of the Implementation Under Test (IUT), which motivates oracles.


4 Oracles: An oracle provides methods to generate expected results for test cases and to compare the expected results to the actual results of the IUT. It consists of two parts: the result generator, which obtains expected results, and the comparator, which verifies the actual results of the IUT.

5 Standard Types of Oracles: Oracles do not apply generally, only in special cases. Standard types are the Perfect Oracle, the Gold Standard Oracle, and the Parametric Oracle or Heuristic Oracle.

6 Perfect Oracle: equivalent to the IUT and completely trusted; accepts every input specified for the IUT; always produces the correct result; a defect-free version of the IUT.

7 Gold Standard Oracle: Use one or more versions of an existing application system (e.g. a legacy system) to generate expected results. [Diagram: the test case input goes to both the golden implementation, which yields the golden result, and the IUT, which yields the actual result; the comparator decides pass / no pass.]
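
The comparison described on this slide can be sketched in a few lines. This is a minimal illustration, not code from the talk: the function `gold_standard_oracle` and the hand-written `my_isqrt` are hypothetical names, and Python's `math.isqrt` merely plays the role of a trusted golden implementation.

```python
import math
from typing import Callable, Iterable, List


def gold_standard_oracle(
    iut: Callable[[int], int],
    golden: Callable[[int], int],
    inputs: Iterable[int],
) -> List[int]:
    """Return the inputs on which the IUT disagrees with the golden implementation."""
    failures = []
    for x in inputs:
        expected = golden(x)    # result generator: golden result
        actual = iut(x)         # actual result of the IUT
        if actual != expected:  # comparator: pass / no pass
            failures.append(x)
    return failures


def my_isqrt(n: int) -> int:
    """Hypothetical IUT: integer square root by linear search."""
    r = 0
    while (r + 1) * (r + 1) <= n:
        r += 1
    return r


# An empty list means every test case passed.
print(gold_standard_oracle(my_isqrt, math.isqrt, range(200)))
```

Returning the failing inputs rather than a single boolean keeps the comparator's verdict traceable to individual test cases.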

8 Parametric Oracle: Use an algorithm to compute parameters from the actual results and compare these actual parameters to the expected parameter values. [Diagram: test input; IUT producing actual results; output converter producing actual parameters; trusted algorithm producing reference parameters; comparator deciding pass / no pass.]


12 Statistical Oracle: a special case of a parametric oracle; the parameters are computed with statistical tools; the comparison is done in a statistical way; generating random input data allows a large number of test cases.

13 Micropattern: Distributional properties of the generator and the IUT are used for the comparison. [Diagram: random test input generator; test case input; IUT; actual results; statistical analyzer; characteristics; distributional parameters; comparator; pass / no pass.]

14 Requirements: the statistical characteristics of the IUT have to be known; a large number of input data (> 30) is needed for stable results.

15 Possible Uses: (scientific) applications dealing with randomness, e.g. simulators and data analysis (e.g. in banking or image analysis); applications with complicated input data, where reference values are difficult to obtain. As in the example: images are difficult to analyze, whereas randomly generating them and comparing the mean values is simple.


18 Necessary Statistics: Let $X_i$ be iid random variables with mean $\mu$ and variance $\sigma^2$. The sample mean of $n$ random variables is $\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$. The sample variance of $n$ random variables is $S_n^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X}_n)^2$.
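
As a small sketch of these two definitions (the function names and the sample data are illustrative assumptions, not taken from the slides), the sample mean and the sample variance with the $n - 1$ denominator can be computed as follows:

```python
import statistics
from typing import Sequence


def sample_mean(xs: Sequence[float]) -> float:
    """Sample mean: (1/n) * sum of x_i."""
    return sum(xs) / len(xs)


def sample_variance(xs: Sequence[float]) -> float:
    """Sample variance with the 1/(n-1) normalisation used on the slide."""
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(sample_mean(data))      # 5.0
print(sample_variance(data))  # 32/7, about 4.571
# The standard library uses the same n-1 convention:
print(statistics.variance(data))
```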

19 Distributional Properties: By the central limit theorem, the standardized sample mean can be assumed to be (asymptotically) normally distributed: $\sqrt{n}\,\frac{\bar{X}_n - \mu}{\sigma} \xrightarrow{d} N(0, 1)$ as $n \to \infty$.
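
The convergence can be checked empirically. The following is a minimal sketch that is not part of the original slides: it assumes Uniform(0, 1) inputs (so $\mu = 1/2$ and $\sigma^2 = 1/12$), and the function name `standardized_mean`, the seed, and the sample sizes are illustrative choices.

```python
import math
import random


def standardized_mean(rng: random.Random, n: int) -> float:
    """sqrt(n) * (sample mean - mu) / sigma for n Uniform(0, 1) draws."""
    mu = 0.5
    sigma = math.sqrt(1.0 / 12.0)
    mean = sum(rng.random() for _ in range(n)) / n
    return math.sqrt(n) * (mean - mu) / sigma


rng = random.Random(1)  # fixed seed so the experiment is repeatable
values = [standardized_mean(rng, 50) for _ in range(2000)]

# For a standard normal variable, about 95% of the mass lies in [-1.96, 1.96].
inside = sum(abs(z) <= 1.96 for z in values) / len(values)
print(inside)
```

If the normal approximation is adequate, the printed fraction should be close to 0.95.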


22 Simple Approach: The expected value $\mu_0$ is known. Compute the empirical mean of the $n$ actual results $x_i$: $\bar{x}_n = \frac{1}{n} \sum_{i=1}^{n} x_i$. The IUT passes if $\frac{|\bar{x}_n - \mu_0|}{\mu_0} < \varepsilon$, e.g. $\varepsilon = 0.1$.
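
A minimal sketch of this pass criterion follows. The function name `simple_oracle` and the example numbers are assumptions for illustration; the absolute value of $\mu_0$ is used in the denominator so that the check also behaves sensibly for a negative expected value.

```python
from typing import Sequence


def simple_oracle(results: Sequence[float], mu0: float, eps: float = 0.1) -> bool:
    """Pass if the relative deviation of the empirical mean from mu0 is below eps."""
    mean = sum(results) / len(results)
    return abs(mean - mu0) / abs(mu0) < eps


print(simple_oracle([9.5, 10.5, 10.2], mu0=10.0))   # relative deviation below 0.1
print(simple_oracle([12.0, 12.0, 12.0], mu0=10.0))  # relative deviation 0.2
```

The criterion gives no control over error probabilities, which is why it suits smoke tests and plausibility checks rather than final verification.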


24 First Attempt: The natural choice is a t-test of whether the mean $\mu$ of the actual results equals the expected result $\mu_0$. Type I error: the IUT does not pass although it is correct (false alarm). Type II error: the IUT passes although it is not correct (false pass). However, the t-test controls only the Type I error.
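
For illustration, a two-sided one-sample t-test can be sketched as below. This is an assumption-laden sketch rather than the authors' code: the function names and data are hypothetical, and the critical value is passed in by the caller (here 2.365, the 0.975-quantile of the t-distribution with 7 degrees of freedom) because the Python standard library has no t-distribution.

```python
import math
import statistics
from typing import Sequence


def t_statistic(xs: Sequence[float], mu0: float) -> float:
    """One-sample t statistic: sqrt(n) * (sample mean - mu0) / s_n."""
    n = len(xs)
    return math.sqrt(n) * (statistics.mean(xs) - mu0) / statistics.stdev(xs)


def t_test_passes(xs: Sequence[float], mu0: float, critical_value: float) -> bool:
    """Pass (do not reject mu = mu0) if |t| stays within the critical value."""
    return abs(t_statistic(xs, mu0)) <= critical_value


data = [9.8, 10.1, 10.0, 10.3, 9.9, 10.2, 9.7, 10.0]
print(t_test_passes(data, mu0=10.0, critical_value=2.365))
print(t_test_passes(data, mu0=11.0, critical_value=2.365))
```

Note that a pass here only means that the hypothesis $\mu = \mu_0$ was not rejected; the probability of a false pass is not bounded by the chosen level.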


26 Advanced Approach: An alternative choice is to use the intersection-union method to invert the test hypothesis, so that the controllable probability of the Type I error becomes the probability of a false pass. With $\delta > 0$, one can define an interval around $\mu_0$ such that $\bar{X}_n \notin [\mu_0 - \delta, \mu_0 + \delta]$ for a given probability $\alpha$.


28 The Test Statistic: For a given probability $\alpha \in (0, \frac{1}{2})$, the IUT passes if $\sqrt{n}\,\frac{\bar{x}_n - (\mu_0 - \delta)}{s_n} > t_{n-1,\alpha/2}$ and $\sqrt{n}\,\frac{\bar{x}_n - (\mu_0 + \delta)}{s_n} < -t_{n-1,\alpha/2}$, where $t_{n-1,\alpha/2}$ denotes the $(1 - \alpha/2)$-quantile of the Student t-distribution with $n - 1$ degrees of freedom.
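
The two conditions can be sketched as follows. This is a hedged illustration, not the authors' implementation: the function name `equivalence_test_passes` and the data are hypothetical, and the quantile $t_{n-1,\alpha/2}$ is supplied by the caller as `critical_value` (2.365 is the 0.975-quantile for 7 degrees of freedom, i.e. $\alpha = 0.05$ in the slide's $\alpha/2$ convention).

```python
import math
import statistics
from typing import Sequence


def equivalence_test_passes(
    xs: Sequence[float], mu0: float, delta: float, critical_value: float
) -> bool:
    """Pass only if both one-sided statistics clear the critical value."""
    n = len(xs)
    mean = statistics.mean(xs)
    s = statistics.stdev(xs)  # square root of the (n-1)-normalised sample variance
    lower = math.sqrt(n) * (mean - (mu0 - delta)) / s
    upper = math.sqrt(n) * (mean - (mu0 + delta)) / s
    return lower > critical_value and upper < -critical_value


data = [9.8, 10.1, 10.0, 10.3, 9.9, 10.2, 9.7, 10.0]
print(equivalence_test_passes(data, mu0=10.0, delta=0.5, critical_value=2.365))
print(equivalence_test_passes(data, mu0=10.0, delta=0.1, critical_value=2.365))
print(equivalence_test_passes(data, mu0=11.0, delta=0.5, critical_value=2.365))
```

The second call fails even though the sample mean is close to $\mu_0$: with a narrow interval, eight observations do not give enough evidence that the mean lies inside it.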

29 An Example from Image Analysis: compute morphological properties such as area or boundary length; fit stochastic models to the given data.

30 Random Input - The Boolean Model: a computationally simple model; flexible; good fit to real data; the expected mean area and mean boundary length are known in explicit form.

31 Micropattern Revisited: The random input generator is a generator for the Boolean model. The IUT computes e.g. the area or the boundary length. [Diagram: generator for the Boolean model; test case input; IUT; actual results, e.g. area; statistical analyzer; characteristics, e.g. mean area; theoretical results from the model; comparator; pass / no pass.]
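
The whole pattern can be sketched end to end under simplifying assumptions that are not from the slides: the grains are discs of fixed radius $r$, the germs form a Poisson process of intensity $\lambda$, and the expected area fraction is then $1 - e^{-\lambda \pi r^2}$. The function `area_fraction` is a hypothetical stand-in for the IUT, and the comparator is the simple relative-deviation check with $\varepsilon = 0.1$.

```python
import math
import random
from typing import List


def poisson(rng: random.Random, mean: float) -> int:
    """Poisson variate: count rate-1 exponential arrivals in [0, mean]."""
    count = 0
    total = rng.expovariate(1.0)
    while total < mean:
        count += 1
        total += rng.expovariate(1.0)
    return count


def boolean_model_image(rng: random.Random, intensity: float, radius: float,
                        grid: int = 20) -> List[List[bool]]:
    """Random input generator: binary image of a Boolean model on the unit square."""
    # Germs are placed in an enlarged window so that discs centred just
    # outside the unit square can still cover pixels inside it.
    side = 1.0 + 2.0 * radius
    germs = [(rng.uniform(-radius, 1.0 + radius), rng.uniform(-radius, 1.0 + radius))
             for _ in range(poisson(rng, intensity * side * side))]
    r2 = radius * radius
    return [[any((x - gx) ** 2 + (y - gy) ** 2 <= r2 for gx, gy in germs)
             for x in ((j + 0.5) / grid for j in range(grid))]
            for y in ((i + 0.5) / grid for i in range(grid))]


def area_fraction(image: List[List[bool]]) -> float:
    """Stand-in for the IUT: fraction of covered pixels."""
    return sum(sum(row) for row in image) / sum(len(row) for row in image)


rng = random.Random(2004)
intensity, radius, n_images = 50.0, 0.05, 100

# Statistical analyzer: mean area fraction over all generated images.
results = [area_fraction(boolean_model_image(rng, intensity, radius))
           for _ in range(n_images)]
mean_area = sum(results) / n_images

# Theoretical result from the model, then the comparator.
mu0 = 1.0 - math.exp(-intensity * math.pi * radius ** 2)
passed = abs(mean_area - mu0) / mu0 < 0.1
print(mean_area, mu0, passed)
```

Sampling each image at pixel centres keeps the estimate of the area fraction unbiased, because every fixed point is covered with the same probability under the model.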


33 Usage of the Simple Approach: The test is performed for each computed characteristic (e.g. the area) for different Boolean models. The simple approach was used as an advanced smoke test to detect severe bugs in the program flow and as a plausibility check for the computed values.

34 Usage of the Advanced Approach: verification of the results of the simple approach; by choosing a small $\alpha$, one can say that the IUT produces correct results (only with respect to the tested characteristics), with the probability of a false pass bounded by $\alpha$.


38 Numerical Example: sample of $n = 500$ images, error probability $\alpha = [\ldots]$; here $x_i \mathrel{\hat{=}}$ boundary length. $\bar{x}_n = [\ldots]$, $\mu_0 = [\ldots]$, $\delta = [\ldots] = 1\%\,\mu_0$. Then $\sqrt{n}\,\frac{\bar{x}_n - (\mu_0 - \delta)}{s_n} = [\ldots] > 3.92 = t_{499,\,[\ldots]}$ and $\sqrt{n}\,\frac{\bar{x}_n - (\mu_0 + \delta)}{s_n} = [\ldots] < -3.92 = -t_{499,\,[\ldots]}$. So the IUT passes the test.

39 Conclusion: The approach includes information about the reliability with respect to the tested characteristics. The approach makes it possible to handle randomness in tests. It is possible to test for characteristics other than the mean. The approach shown above does not apply in all cases.

40 Thank you for your attention.


More information

Chapter 6 Normal Probability Distributions

Chapter 6 Normal Probability Distributions Chapter 6 Normal Probability Distributions 6-1 Review and Preview 6-2 The Standard Normal Distribution 6-3 Applications of Normal Distributions 6-4 Sampling Distributions and Estimators 6-5 The Central

More information

Statistical Performance Comparisons of Computers

Statistical Performance Comparisons of Computers Tianshi Chen 1, Yunji Chen 1, Qi Guo 1, Olivier Temam 2, Yue Wu 1, Weiwu Hu 1 1 State Key Laboratory of Computer Architecture, Institute of Computing Technology (ICT), Chinese Academy of Sciences, Beijing,

More information

ECE 470: Homework 5. Due Tuesday, October 27 in Seth Hutchinson. Luke A. Wendt

ECE 470: Homework 5. Due Tuesday, October 27 in Seth Hutchinson. Luke A. Wendt ECE 47: Homework 5 Due Tuesday, October 7 in class @:3pm Seth Hutchinson Luke A Wendt ECE 47 : Homework 5 Consider a camera with focal length λ = Suppose the optical axis of the camera is aligned with

More information

Monte Carlo for Spatial Models

Monte Carlo for Spatial Models Monte Carlo for Spatial Models Murali Haran Department of Statistics Penn State University Penn State Computational Science Lectures April 2007 Spatial Models Lots of scientific questions involve analyzing

More information

Chapter 6. THE NORMAL DISTRIBUTION

Chapter 6. THE NORMAL DISTRIBUTION Chapter 6. THE NORMAL DISTRIBUTION Introducing Normally Distributed Variables The distributions of some variables like thickness of the eggshell, serum cholesterol concentration in blood, white blood cells

More information

Three challenges in route choice modeling

Three challenges in route choice modeling Three challenges in route choice modeling Michel Bierlaire and Emma Frejinger transp-or.epfl.ch Transport and Mobility Laboratory, EPFL Three challenges in route choice modeling p.1/61 Route choice modeling

More information

Ground Tracking in Ground Penetrating Radar

Ground Tracking in Ground Penetrating Radar Ground Tracking in Ground Penetrating Radar Kyle Bradbury, Peter Torrione, Leslie Collins QMDNS Conference May 19, 2008 The Landmine Problem Landmine Monitor Report, 2007 Cost of Landmine Detection Demining

More information

Don t just read it; fight it! Ask your own questions, look for your own examples, discover your own proofs. Is the hypothesis necessary?

Don t just read it; fight it! Ask your own questions, look for your own examples, discover your own proofs. Is the hypothesis necessary? Don t just read it; fight it! Ask your own questions, look for your own examples, discover your own proofs. Is the hypothesis necessary? Is the converse true? What happens in the classical special case?

More information

Epistemic/Non-probabilistic Uncertainty Propagation Using Fuzzy Sets

Epistemic/Non-probabilistic Uncertainty Propagation Using Fuzzy Sets Epistemic/Non-probabilistic Uncertainty Propagation Using Fuzzy Sets Dongbin Xiu Department of Mathematics, and Scientific Computing and Imaging (SCI) Institute University of Utah Outline Introduction

More information

1. (10 points) Draw the state diagram of the DFA that recognizes the language over Σ = {0, 1}

1. (10 points) Draw the state diagram of the DFA that recognizes the language over Σ = {0, 1} CSE 5 Homework 2 Due: Monday October 6, 27 Instructions Upload a single file to Gradescope for each group. should be on each page of the submission. All group members names and PIDs Your assignments in

More information

Hellenic Complex Systems Laboratory

Hellenic Complex Systems Laboratory Hellenic Complex Systems Laboratory Technical Report No IV Calculation of the confidence bounds for the fraction nonconforming of normal populations of measurements in clinical laboratory medicine Aristides

More information

Math 494: Mathematical Statistics

Math 494: Mathematical Statistics Math 494: Mathematical Statistics Instructor: Jimin Ding jmding@wustl.edu Department of Mathematics Washington University in St. Louis Class materials are available on course website (www.math.wustl.edu/

More information

Geometric approximation of curves and singularities of secant maps Ghosh, Sunayana

Geometric approximation of curves and singularities of secant maps Ghosh, Sunayana University of Groningen Geometric approximation of curves and singularities of secant maps Ghosh, Sunayana IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish

More information
