Section 6.2: Generating Discrete Random Variates

Similar documents
Section 6.5: Random Sampling and Shuffling

Simulation Input Data Modeling

Probability Models.S4 Simulating Random Variables

Package simed. November 27, 2017

Use of Extreme Value Statistics in Modeling Biometric Systems

Functions. How is this definition written in symbolic logic notation?

Probability Model for 2 RV s

Probability and Statistics for Final Year Engineering Students

Discrete-Event Simulation:

1KOd17RMoURxjn2 CSE 20 DISCRETE MATH Fall

Biostatistics & SAS programming. Kevin Zhang

Chapter 6: Simulation Using Spread-Sheets (Excel)

Chapter 3. Bootstrap. 3.1 Introduction. 3.2 The general idea

YEAR 12 Core 1 & 2 Maths Curriculum (A Level Year 1)

Information Theory and Communication

Laplace Transform of a Lognormal Random Variable

Lecture 8: Jointly distributed random variables

Definition. A Taylor series of a function f is said to represent function f, iff the error term converges to 0 for n going to infinity.

Lecture 2: Introduction to Numerical Simulation

Topic 5 - Joint distributions and the CLT

Simulation Modeling and Analysis

Functions. Def. Let A and B be sets. A function f from A to B is an assignment of exactly one element of B to each element of A.

Sorting. There exist sorting algorithms which have shown to be more efficient in practice.

MULTI-DIMENSIONAL MONTE CARLO INTEGRATION

MATH : EXAM 3 INFO/LOGISTICS/ADVICE

Will Monroe July 21, with materials by Mehran Sahami and Chris Piech. Joint Distributions

Computational Methods. Randomness and Monte Carlo Methods

Today s outline: pp

Crowd-Blending Privacy

7.1 INVERSE FUNCTIONS

Lab 5 Monte Carlo integration

PHYSICS 115/242 Homework 5, Solutions X = and the mean and variance of X are N times the mean and variance of 12/N y, so

Section 2.3: Monte Carlo Simulation

Lecture 4. Digital Image Enhancement. 1. Principle of image enhancement 2. Spatial domain transformation. Histogram processing

What We ll Do... Random

COMPUTER SCIENCE LARGE PRACTICAL.

Testing Continuous Distributions. Artur Czumaj. DIMAP (Centre for Discrete Maths and it Applications) & Department of Computer Science

Functions 2/1/2017. Exercises. Exercises. Exercises. and the following mathematical appetizer is about. Functions. Functions

Exam 2 is Tue Nov 21. Bring a pencil and a calculator. Discuss similarity to exam1. HW3 is due Tue Dec 5.

EDAA40 At home exercises 1

Recursive Estimation

CREATING SIMULATED DATASETS Edition by G. David Garson and Statistical Associates Publishing Page 1

correlated to the Michigan High School Mathematics Content Expectations

Sampling from distributions

Physics 736. Experimental Methods in Nuclear-, Particle-, and Astrophysics. - Statistical Methods -

Generating random samples from user-defined distributions

STAT 725 Notes Monte Carlo Integration

QED Q: Why is it called the triangle inequality? A: Analogue with euclidean distance in the plane: picture Defn: Minimum Distance of a code C:

Understanding Disconnection and Stabilization of Chord

CCSSM Curriculum Analysis Project Tool 1 Interpreting Functions in Grades 9-12

Biostatistics 615/815 Lecture 16: Importance sampling Single dimensional optimization

Probabilistic Graphical Models

Monte Carlo Methods and Statistical Computing: My Personal E

StatsMate. User Guide

Discrete Structures. Fall Homework3

8 1 Sequences. Defining a Sequence. A sequence {a n } is a list of numbers written in explicit order. e.g.

SigmaXL Feature List Summary, What s New in Versions 6.0, 6.1 & 6.2, Installation Notes, System Requirements and Getting Help

Discrete-Event Simulation:

mywbut.com Informed Search Strategies-I

1 Overview. 2 Applications of submodular maximization. AM 221: Advanced Optimization Spring 2016

Week 5. Convex Optimization

A metric space is a set S with a given distance (or metric) function d(x, y) which satisfies the conditions

Algorithms (III) Yijia Chen Shanghai Jiaotong University

About the Author. Dependency Chart. Chapter 1: Logic and Sets 1. Chapter 2: Relations and Functions, Boolean Algebra, and Circuit Design

Statool User s Manual

Expectation Propagation

Algorithms (III) Yu Yu. Shanghai Jiaotong University

4&5 Binary Operations and Relations. The Integers. (part I)

Functions and Sequences Rosen, Secs. 2.3, 2.4

Intensity Transformations and Spatial Filtering

Section 3.2: Multi Stream Lehmer RNGs

From Ver(ces to Fragments: Rasteriza(on

COPULA MODELS FOR BIG DATA USING DATA SHUFFLING

Answers to specimen paper questions. Most of the answers below go into rather more detail than is really needed. Please let me know of any mistakes.

Fathom Dynamic Data TM Version 2 Specifications

Distribution of tree parameters

Mathematics. Smyth County Schools Curriculum Map Grade:11-12 Subject:Math Analysis MA.1, MA.6, MA.8, MA.9 MA.1, MA.2, MA.3.

Monte Carlo Integration and Random Numbers

Advanced Operations Research Techniques IE316. Quiz 1 Review. Dr. Ted Ralphs

Phil 320 Chapter 1: Sets, Functions and Enumerability I. Sets Informally: a set is a collection of objects. The objects are called members or

PIANOS requirements specifications

CS 97SI: INTRODUCTION TO PROGRAMMING CONTESTS. Jaehyun Park

An interpolating 4-point C 2 ternary stationary subdivision scheme

Distributed Computing over Communication Networks: Leader Election

4.1 Performance Analysis

Amphitheater School District End Of Year Algebra II Performance Assessment Review

On Modularity Clustering. Group III (Ying Xuan, Swati Gambhir & Ravi Tiwari)

Introduction to Machine Learning CMU-10701

Floating Point Numbers

Floating Point Numbers

Package UnivRNG. R topics documented: January 10, Type Package

COMPUTATIONAL STATISTICS UNSUPERVISED LEARNING

Modules and Clients 1 / 21

Image Processing Lecture 10

Generalised Mean Averaging Interpolation by Discrete Cubic Splines

MAT Business Calculus - Quick Notes

Practical 4 Programming in R

Getting started with simulating data in R: some helpful functions and how to use them Ariel Muldoon August 28, 2018

Chapter 6: Continuous Random Variables & the Normal Distribution. 6.1 Continuous Probability Distribution

Transcription:

Section 6.2: Generating Discrete Random Variates Discrete-Event Simulation: A First Course c 2006 Pearson Ed., Inc. 0-13-142917-5 Discrete-Event Simulation: A First Course Section 6.2: Generating Discrete Random Variates 1/ 24

Section 6.2: Generating Discrete Random Variates The inverse distribution function (idf) of X is the function F : (0,1) X for all u (0,1) as F( ) is the cdf of X F (u) = min x {x : u < F(x)} That is, if F (u) = x, x is the smallest possible value of X for which F(x) is greater than u Discrete-Event Simulation: A First Course Section 6.2: Generating Discrete Random Variates 2/ 24

Example 6.2.1 Two common ways of plotting a cdf with X = {a,a + 1,...,b}: 1.0 1.0 F (u) = x F (u) = x F( ) u F( ) u 0.0 a x b 0.0 a x b Theorem (6.2.1) Let X = {a,a + 1,...,b} where b may be and F( ) be the cdf of X For any u (0,1), if u < F(a), F (u) = a else F (u) = x where x X is the unique possible value of X for which F(x 1) u < F(x) Discrete-Event Simulation: A First Course Section 6.2: Generating Discrete Random Variates 3/ 24

Algorithm 6.2.1 For X = {a,a + 1,...,b}, the following linear search algorithm defines F (u) Algorithm 6.2.1 x = a; while (F(x) <= u) x++; return x; /*x is F (u)*/ Average case analysis: Let Y be the number of while loop passes Y = X a E[Y] = E[X a] = E[X] a = µ a Discrete-Event Simulation: A First Course Section 6.2: Generating Discrete Random Variates 4/ 24

Algorithm 6.2.2 Idea: start at a more likely point For X = {a,a + 1,...,b}, a more efficient linear search algorithm defines F (u) Algorithm 6.2.2 x = mode; /*initialize with the mode of X */ if (F(x) <= u) while (F(x) <= u) x++; else if (F(a) <= u) while (F(x-1) > u) x--; else x = a; return x; /* x is F (u)*/ For large X, consider binary search Discrete-Event Simulation: A First Course Section 6.2: Generating Discrete Random Variates 5/ 24

Idf Examples In some cases F (u) can be determined explicitly If X is Bernoulli(p) and F(x) = u, then x = 0 iff 0 < u < 1 p: { F 0 0 < u < 1 p (u) = 1 1 p u < 1 Discrete-Event Simulation: A First Course Section 6.2: Generating Discrete Random Variates 6/ 24

Example 6.2.3: Idf for Equilikely If X is Equilikely(a,b), F(x) = x a + 1 b a + 1 x = a,a + 1,...,b For 0 < u < F(a), For F(a) u < 1, F (u) = a F(x 1) u < F(x) (x 1) a + 1 u < x a + 1 b a + 1 b a + 1 x a + (b a + 1)u < x + 1 Therefore, for all u (0,1) F (u) = a + (b a + 1)u Discrete-Event Simulation: A First Course Section 6.2: Generating Discrete Random Variates 7/ 24

Example 6.2.4: Idf for Geometric If X is Geometric(p), F(x) = 1 p x+1 x = 0,1,2,... For 0 < u < F(0), F (u) = 0 For F(0) u < 1, F(x 1) u < F(x) 1 p x u < 1 p x+1. x ln(1 u) ln(p) < x + 1 For all u (0,1) ln(1 u) F (u) = ln(p) Discrete-Event Simulation: A First Course Section 6.2: Generating Discrete Random Variates 8/ 24

Random Variate Generation By Inversion X is a discrete random variable with idf F ( ) Continuous random variable U is Uniform(0,1) Z is the discrete random variable defined by Z = F (U) Theorem (6.2.2) Z and X are identically distributed Theorem 6.2.2 allows any discrete random variable (with known idf) to be generated with one call to Random() Algorithm 6.2.3 If X is a discrete random variable with idf F ( ), a random variate x can be generated as u = Random(); return F (u); Discrete-Event Simulation: A First Course Section 6.2: Generating Discrete Random Variates 9/ 24

Proof for Theorem 6.2.2 Prove that X = Z F : (0, 1) X, so u (0, 1) such that F (u) = x Z = F (U) It follows that x Z so X Z From definition of Z, if z Z then u (0, 1) such that F (u) = z F : (0, 1) X It follows that z X so Z X Prove that Z and X have the same pdf Let X = Z = {a, a + 1,...,b}, from definition of Z and F ( ) and theorem 6.2.1: if z = a, if z Z, z a, Pr(Z = a) = Pr(U < F(a)) = F(a) = f (a) Pr(Z = z) = Pr(F(z 1) U < F(z)) = F(z) F(z 1) = f (z) Discrete-Event Simulation: A First Course Section 6.2: Generating Discrete Random Variates 10/ 24

Inversion Examples Example 6.2.5 Consider X with pdf 0.1 x=2 f (x) = 0.3 x=3 0.6 x=6 The cdf for X is plotted using two formats 1.0 u 1.0 u F( ) 0.4 F( ) 0.4 0.1 0.0 2 3 4 5 6 x 0.1 0.0 2 3 4 5 6 x Discrete-Event Simulation: A First Course Section 6.2: Generating Discrete Random Variates 11/ 24

Algorithm for Example 6.2.5 Example 6.2.5 if (u < 0.1) return 2; else if (u < 0.4) return 3; else return 6; returns 2 with probability 0.1, 3 with probability 0.3 and 6 with probability 0.6 which corresponds to the pdf of X This example can be made more efficient: check the ranges for u associated with x = 6 first (the mode), then x = 3, then x = 2 Problems may arise when X is large or infinite Discrete-Event Simulation: A First Course Section 6.2: Generating Discrete Random Variates 12/ 24

More Inversion Examples Example 6.2.6: Generating a Bernoulli(p) Random Variate u = Random(); if (u < 1-p) return 0; else return 1; Example 6.2.7: Generating an Equilikely(a,b) Random Variate u = Random(); return a + (long) (u * (b - a + 1)); Example 6.2.8: Generating a Geometric(p) Random Variate u = Random() return (long) (log(1.0 - u) / log(p)); Discrete-Event Simulation: A First Course Section 6.2: Generating Discrete Random Variates 13/ 24

Example 6.2.9 X is a Binomial(n,p) random variate F(x) = x t=0 ( n x ) p x (1 p) n x x = 0,1,2,...,n Incomplete beta function { 1 I(x + 1,n x,p) x = 0,1,...,n 1 F(x) = 1 x = n Except for special cases, an incomplete beta function cannot be inverted to form a closed form expression for the idf Inversion is not easily applied to generation of Binomial(n,p) random variates Discrete-Event Simulation: A First Course Section 6.2: Generating Discrete Random Variates 14/ 24

Algorithm Design Criteria The design of a correct, exact and efficient algorithm to generate corresponding random variates is often complex Portability - implementable in high-level languages Exactness - histogram of variates should converge to pdf Robustness - performance should be insensitive to small changes in parameters and should work properly for all reasonable parameter values Efficiency - it should be time efficient (set-up time and marginal execution time) and memory efficient Clarity - it is easy to understand and implement Synchronization - exactly one call to Random() is required Monotonicity - it is synchronized and the transformation from u to x is monotone increasing (or decreasing) Inversion satisfies some criteria, but not necessarily all Discrete-Event Simulation: A First Course Section 6.2: Generating Discrete Random Variates 15/ 24

Example 6.2.10 To generate Binomial(10,0.4), the pdf is (to 0.ddd precision) x : 0 1 2 3 4 5 6 7 8 9 10 f (x) : 0.006 0.040 0.121 0.215 0.251 0.201 0.111 0.042 0.011 0.002 0.000 Random variates can be generated by filling a 1000-element integer-valued array a[ ] with 6 0 s, 40 1 s, 121 2 s, etc. Binomial(10, 0.4) Random Variate j = Equilikely(0,999); return a[j]; This algorithm is portable, robust, clear, synchronized and monotone, with small marginal execution time The algorithm is not exact: f (10) = 1/9765625 Set-up time and memory efficiency could be problematic: for 0.ddddd precision, need 100 000-element array Discrete-Event Simulation: A First Course Section 6.2: Generating Discrete Random Variates 16/ 24

Example 6.2.11: Exact Algorithm for Binomial(10, 0.4) An exact algorithm is based on filling an 11-element floating-point array with cdf values then using Alg. 6.2.2 with x = 4 to initialize the search In general, to generate Binomial(n,p) by inversion compute a floating-point array of n + 1 cdf values use Alg. 6.2.2 with x = np to initialize the search The library rvms can be used to compute the cdf array by calling cdfbinomial(n,p,x) for x = 0,1,...,n Only drawback is some inefficiency (setup time and memory) Discrete-Event Simulation: A First Course Section 6.2: Generating Discrete Random Variates 17/ 24

Example 6.2.12 The cdf array from Example 6.2.11 can be eliminated cdf values computed as needed by Alg. 6.2.2 Reduces set-up time and memory Increases marginal execution time Function idfbinomial(n,p,u) in library rvms does this Binomial(n,p) random variates can be generated by inversion Generating a Binomial Random Variate u = Random(); return idfbinomial(n, p, u); /* in library rvms*/ Inversion can be used for the six models: Ideal for Equilikely(a, b), Bernoulli(p) and Geometric(p) For Binomial(n, p), Pascal(n, p) and Poisson(µ), time and memory efficiency can be a problem if inversion is used Discrete-Event Simulation: A First Course Section 6.2: Generating Discrete Random Variates 18/ 24

Alternative Random Variate Generation Algorithms Example 6.2.13 Binomial Random Variates A Binomial(n,p) random variate can be generated by summing an iid Bernoulli(p) sequence Generating a Binomial Random Variate x = 0; for (i = 0; i < n; i++) x += Bernoulli(p); return x; The algorithm is: portable, exact, robust, clear The algorithm is not: synchronized or monotone Marginal execution: O(n) complexity Discrete-Event Simulation: A First Course Section 6.2: Generating Discrete Random Variates 19/ 24

Poisson Random Variates A Poisson(µ) random variable is the n limiting case of a Binomial(n,µ/n) random variable For large n, Poisson(µ) Binomial(n, µ/n) The previous O(n) algorithm for Binomial(n,p) should not be used when n is large The Poisson(µ) cdf F( ) is equal to an incomplete gamma function F(x) = 1 P(x + 1,µ) x = 0,1,2,... An incomplete gamma function cannot be inverted to form an idf Inversion to generate a Poisson(µ) requires searching the cdf as in Examples 6.2.11 and 6.2.12 Discrete-Event Simulation: A First Course Section 6.2: Generating Discrete Random Variates 20/ 24

Example 6.2.14 Generating a Poisson Random Variate a = 0.0; x = 0; while (a < µ) { a += Exponential(1.0); x++; } return x - 1; The algorithm does not rely on inversion or the large n version of Binomial(n,p) The algorithm is: portable, exact, robust; not synchronized or monotone; marginal execution time can be long for large µ It is obscure. Clarity will be provided in Section 7.3 Discrete-Event Simulation: A First Course Section 6.2: Generating Discrete Random Variates 21/ 24

Pascal Random Variates A Pascal(n,p) cdf is equal to an incomplete beta function: F(x) = 1 I(x + 1,n,p) x = 0,1,2,... X is Pascal(n,p) iff X = X 1 + X 2 + + X n where X 1,X 2,...,X n is an iid Geometric(p) sequence Example 6.2.15 Summing Geometric(p) random variates to generate a Pascal(n,p) random variate Generating a Pascal Random Variate x = 0; for(i = 0; i < n; i++) x += Geometric(p); return x; The algorithm is: portable, exact, robust, clear; not synchronized or monotone; marginal execution complexity is O(n) Discrete-Event Simulation: A First Course Section 6.2: Generating Discrete Random Variates 22/ 24

Library rvgs Includes 6 discrete random variate generators (as below) and 7 continuous random variate generators long Bernoulli(double p) long Binomial(long n, double p) long Equilikely(long a, long b) long Geometric(double p) long Pascal(long n, double p) long Poisson(double µ) Functions Bernoulli, Equilikely, Geometric use inversion; essentially ideal Functions Binomial, Pascal, Poisson do not use inversion Discrete-Event Simulation: A First Course Section 6.2: Generating Discrete Random Variates 23/ 24

Library rvms Provides accurate pdf, cdf, idf functions for many random variates Idfs can be used to generate random variates by inversion Functions idfbinomial, idfpascal, idfpoisson may have high marginal execution times Not recommended when many observations are needed due to time inefficiency Array of cdf values with inversion may be preferred Discrete-Event Simulation: A First Course Section 6.2: Generating Discrete Random Variates 24/ 24