Simulation Monte Carlo

Monte Carlo simulation
The outcome of a single stochastic simulation run is always random: it is a single realization of a random variable.
The goal of a simulation experiment is to learn about the distribution of that random variable (mean, variance).
The true value is deterministic, but it cannot be determined explicitly.

Buffon's needle
A classical example of a simulation experiment for which the exact result is known.
The Comte de Buffon presented a method for determining the value of π in 1733.
Throw a needle of length l onto a plane ruled with parallel lines a distance d apart.
Count how often the needle crosses a line.
Compute the experimental hit probability P = #hits / #trials.

Buffon's needle
The needle hits a line if the distance from the center of the needle to the closest line is less than (l/2) sin a, where a is the angle between the needle and the lines.
Angle a ~ Unif(0, π/2)
Distance of the center from the closest line ~ Unif(0, d/2)
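
Buffon's needle
The hit probability is the area under the curve (l/2) sin a, 0 ≤ a ≤ π/2, divided by the area of the rectangle [0, π/2] × [0, d/2] (assuming l ≤ d):
p = 2l/(πd)
Hence the experimental value P gives the estimate π ≈ 2l/(Pd).
(Figure: the curve (l/2) sin a, peaking at l/2, inside the rectangle with sides π/2 and d/2.)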
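The experiment can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names, the default values l = 1 and d = 2, and the fixed seed are assumptions made for the example. The hit test uses the half-length (l/2) sin a, which is the form consistent with p = 2l/(πd). Note that drawing the angle from Unif(0, π/2) already uses the value of π, so this only illustrates the estimator.

```python
import math
import random


def buffon_hit_fraction(n_throws, needle_len=1.0, line_dist=2.0, seed=0):
    """Return the fraction P of throws in which the needle crosses a line."""
    if needle_len > line_dist:
        raise ValueError("this sketch assumes needle_len <= line_dist")
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = 0
    for _ in range(n_throws):
        # Angle between the needle and the lines, Unif(0, pi/2).
        angle = rng.uniform(0.0, math.pi / 2)
        # Distance from the needle center to the closest line, Unif(0, d/2).
        center = rng.uniform(0.0, line_dist / 2)
        if center < (needle_len / 2) * math.sin(angle):
            hits += 1
    return hits / n_throws


def estimate_pi(n_throws, needle_len=1.0, line_dist=2.0, seed=0):
    """Invert p = 2l/(pi d) using the experimental hit fraction P."""
    hit_fraction = buffon_hit_fraction(n_throws, needle_len, line_dist, seed)
    return 2 * needle_len / (hit_fraction * line_dist)
```

Calling `estimate_pi(200_000)` should give a value close to π, with the accuracy limited by the random fluctuation of the hit fraction.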

Buffon's needle
The result of a single throw is random, and so is the average of N throws.
What do we know after N throws?
Can we determine the distribution of the average P after N throws, or at least its expectation and variance?
P is an average of N independent random variables.
A single throw is a Bernoulli variable with mean p (= 2l/(πd)), so E(P) = p.

Buffon's needle
The variance of the result of one throw is p(1-p) (the result is a Bernoulli(p) variable).
The variance of the average of N independent throws is therefore Var(P) = p(1-p)/N.
So we now have an observation of a random variable whose variance is known as a function of p.
We can estimate the relationship between the sample average and the expectation.
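The expectation and variance of the average follow from linearity and independence. As a worked sketch, write X_i for the outcome of throw i (1 for a hit, 0 for a miss). The symbol X_i and the numbers l = 1, d = 2, N = 10^4 are illustrative assumptions, not values from the slides.

```latex
\begin{align*}
P &= \frac{1}{N}\sum_{i=1}^{N} X_i, \qquad X_i \sim \mathrm{Bernoulli}(p),\\
E(P) &= \frac{1}{N}\sum_{i=1}^{N} E(X_i) = p,\\
\operatorname{Var}(P) &= \frac{1}{N^2}\sum_{i=1}^{N} \operatorname{Var}(X_i) = \frac{p(1-p)}{N}.\\[4pt]
\text{Example: } & l = 1,\ d = 2 \;\Rightarrow\; p = \frac{2l}{\pi d} = \frac{1}{\pi} \approx 0.318,\\
& N = 10^{4} \;\Rightarrow\; \sqrt{\operatorname{Var}(P)} \approx \sqrt{\frac{0.318 \cdot 0.682}{10^{4}}} \approx 0.0047.
\end{align*}
```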

Confidence interval
Assume we know the sample average of a random variable.
Where is the true expectation with, say, 99% probability?
Define a so-called confidence interval with half-width δ for which Pr(P - δ < p < P + δ) > 0.99.
It can be determined if the distribution of P is known.
P is the average of N independent Bernoulli variables.
For large N, P is approximately normally distributed.
δ is of the form c(p) N^(-1/2).
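Under the normal approximation the interval can be computed directly. The sketch below is one common way to do it and rests on an extra assumption: since the true p is unknown, the observed fraction P is plugged into the variance p(1-p)/N. The function name and arguments are hypothetical.

```python
import math
from statistics import NormalDist


def normal_confidence_interval(hits, n_trials, level=0.99):
    """Approximate confidence interval for p from a hit count.

    Assumes n_trials is large enough for the normal approximation and
    uses the observed fraction in place of the unknown p in the variance.
    """
    p_hat = hits / n_trials
    # Two-sided quantile of the standard normal, about 2.576 for 99 %.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n_trials)
    return p_hat - half_width, p_hat + half_width
```

With the same observed fraction, four times as many trials give an interval half as wide, which is the N^(-1/2) behavior in practice.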

Monte Carlo integration
Buffon's needle sampled a random variable whose expectation can be written as a definite integral.
The same approach can be used generally to approximate integrals.
Integrate f over [a,b], given that 0 < f < c.
If x ~ Unif(a,b) and y ~ Unif(0,c), determine experimentally the probability p that y < f(x).
The sought integral is p(b-a)c.
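The hit-or-miss recipe translates almost word for word into code. The following is a minimal sketch; the function name, the seed, and the test integrand f(x) = x² + 0.5 on [0, 1] with bound c = 2 are assumptions chosen for illustration.

```python
import random


def hit_or_miss_integral(f, a, b, c, n_samples, seed=0):
    """Estimate the integral of f over [a, b], assuming 0 < f(x) < c."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        x = rng.uniform(a, b)    # x ~ Unif(a, b)
        y = rng.uniform(0.0, c)  # y ~ Unif(0, c)
        if y < f(x):             # the point (x, y) lies under the curve
            hits += 1
    p_hat = hits / n_samples     # experimental probability of y < f(x)
    return p_hat * (b - a) * c   # fraction of the box area under the curve


def example_integrand(x):
    """Illustrative integrand with exact integral 5/6 on [0, 1]."""
    return x * x + 0.5
```

For example, `hit_or_miss_integral(example_integrand, 0.0, 1.0, 2.0, 200_000)` should land near 5/6.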

Monte Carlo integration
More experiments lead to a more accurate estimate of p.
The width of the confidence interval (the error) is proportional to N^(-1/2).
This is not efficient for one-dimensional integrals.
The length of the confidence interval and its asymptotic behavior depend only on p, not on the dimension of the integral.
It is an efficient way to get rough estimates of high-dimensional integrals.
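As a hedged illustration of the high-dimensional case, the sketch below estimates the volume of the unit ball in `dim` dimensions by hit-or-miss sampling in the enclosing cube [-1, 1]^dim. The choice of the ball as the test problem, the function names, and the 99 % level are assumptions for the example. The code is the same for every dimension; only p changes, and here it shrinks as the dimension grows.

```python
import math
import random
from statistics import NormalDist


def ball_volume_mc(dim, n_samples, level=0.99, seed=0):
    """Return (volume estimate, confidence half-width) for the unit ball."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        point = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        if sum(t * t for t in point) < 1.0:  # the point is inside the ball
            hits += 1
    p_hat = hits / n_samples
    cube_volume = 2.0 ** dim
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # Normal-approximation half-width for p, scaled to a volume.
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n_samples)
    return p_hat * cube_volume, half_width * cube_volume


def exact_ball_volume(dim):
    """Closed-form volume of the unit ball, used only for comparison."""
    return math.pi ** (dim / 2) / math.gamma(dim / 2 + 1)
```

Comparing `ball_volume_mc(6, 100_000)` with `exact_ball_volume(6)` gives a rough estimate from a modest number of samples, where a tensor-product quadrature grid in six dimensions would already need many more function evaluations.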

Monte Carlo
The previous Monte Carlo method does not apply directly to all cases, e.g. an unbounded interval or an unbounded function.
It is possible to drop the y variable and compute only E(f(x)). This is cheaper, but the error analysis is more demanding.
The flat upper bound c can be replaced: find a pdf g such that f(x) < c g(x), draw the x values from the distribution g, and aim for a success rate p ~ 1.
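Both variants can be sketched briefly. The first function drops y and averages f(x) for x ~ Unif(a, b), so the integral is (b - a) E(f(x)). The second replaces the flat bound by an envelope c g(x): draw x from g, draw y ~ Unif(0, c g(x)), count a hit when y < f(x), and estimate the integral as c times the hit fraction. The concrete example is an assumption made for illustration: f(x) = exp(-x²/2) on the unbounded interval [0, ∞), with g the Exp(1) density and c = e^(1/2), for which f(x) ≤ c g(x) holds with equality only at x = 1.

```python
import math
import random


def sample_mean_integral(f, a, b, n_samples, seed=0):
    """Estimate the integral of f over [a, b] as (b - a) * mean of f(x)."""
    rng = random.Random(seed)
    total = sum(f(rng.uniform(a, b)) for _ in range(n_samples))
    return (b - a) * total / n_samples


def envelope_integral(n_samples, seed=0):
    """Estimate the integral of exp(-x^2/2) over [0, inf) with an envelope.

    Illustrative assumption: g is the Exp(1) density and c = exp(0.5),
    so that exp(-x^2/2) <= c * exp(-x) for all x >= 0.
    """
    rng = random.Random(seed)
    c = math.exp(0.5)
    hits = 0
    for _ in range(n_samples):
        x = rng.expovariate(1.0)                  # x drawn from the pdf g
        y = rng.uniform(0.0, c * math.exp(-x))    # y ~ Unif(0, c g(x))
        if y < math.exp(-x * x / 2):              # hit if y < f(x)
            hits += 1
    # Pr(hit) = (1/c) * integral of f, so the integral is c * hit fraction.
    return c * hits / n_samples
```

In this example the success rate is about 0.76; a tighter envelope would push it closer to 1 and waste fewer samples. The exact value of the integral is sqrt(π/2) ≈ 1.2533.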

Monte Carlo applications
A typical Monte Carlo case is a (very) high-dimensional integral arising from modelling ray propagation in a material.
Each collision is modelled with a multidimensional integral (probabilities of absorption and scattering as functions of incident angle, energy, particle shapes, surface properties, absorption along the free path, etc.).
For a single ray, the complexity grows only linearly with the number of collisions.

Monte Carlo example
Consider the scattering of a laser beam from a material layer (MSc thesis of Jukka Räbinä, 2005).
The goal is to simulate different statistics of the scattered image using Monte Carlo.

Experimental setup

Scattering
Simulate the propagation of a ray in a cloud of particles.
This is basically ray tracing.
The positions and scattering directions of the particles are random.

Goal of simulation
Compute the intensity, center of mass, etc. of the scatter image captured by the camera, i.e. an integral of a function involving the intensity distribution.

Simulation experiment
Send parallel rays with normally distributed intensity.
Collect the (few) rays scattered to the camera.

Simulation results
Three different implementations of Monte Carlo, each with 100M simulated rays.
They differ in execution times and confidence intervals.
The differences can be explained after learning about variance reduction methods.