CS580: Computer Graphics
KAIST School of Computing

Chapter 3
MULTI-DIMENSIONAL MONTE CARLO INTEGRATION
Monte Carlo Integration

Monte Carlo integration is a simple technique for the numerical evaluation of integrals. Suppose

  I = \int_a^b f(x) dx,  x \in [a,b].

Monte Carlo converts this into an expected-value computation, where p(x) is some probability density function (PDF):

  I = \int (f(x)/p(x)) p(x) dx = E[f(x)/p(x)].

Monte Carlo Integration

The value can be estimated by taking N samples x_1, x_2, ..., x_N drawn from the PDF p(x):

  I = E[f(x)/p(x)] \approx <I> = (1/N) \sum_{i=1}^{N} f(x_i)/p(x_i).

The variance of this estimator is proportional to 1/N, so the error (the standard deviation) decreases as 1/sqrt(N).
Monte Carlo Integration

The expected value of the estimator <I> is:

  E[<I>] = E[(1/N) \sum_{i=1}^{N} f(x_i)/p(x_i)]
         = (1/N) \sum_{i=1}^{N} E[f(x_i)/p(x_i)]
         = (1/N) N \int (f(x)/p(x)) p(x) dx
         = \int f(x) dx = I.

The variance of the estimator is:

  \sigma^2 = (1/N) \int (f(x)/p(x) - I)^2 p(x) dx.

Example

Evaluate I = \int_0^1 5x^4 dx; we know the answer should be 1.0. With uniform samples (p(x) = 1):

  <I> = (1/N) \sum_{i=1}^{N} 5x_i^4.

The variance is:

  \sigma^2 = (1/N) \int_0^1 (5x^4 - 1)^2 dx = 16/(9N).

(Figure: estimation error vs. number of samples, with +/- sigma bounds around I.)
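The uniform-sampling example above can be sketched in C. This is a minimal illustration, not code from the course; the helper names (`uniform01`, `mc_estimate`) and the sample counts are our own choices:

```c
#include <stdlib.h>

/* Uniform random number in [0,1). */
static double uniform01(void)
{
    return (double)rand() / ((double)RAND_MAX + 1.0);
}

/* Monte Carlo estimate of I = \int_0^1 5x^4 dx with p(x) = 1 (uniform). */
double mc_estimate(int n)
{
    double sum = 0.0;
    for (int i = 0; i < n; i++) {
        double x = uniform01();
        sum += 5.0 * x * x * x * x;  /* f(x)/p(x) = 5x^4 */
    }
    return sum / n;
}
```

Since the true value is 1 and the variance is 16/(9N), an estimate from N = 100000 samples should typically land within about one percent of 1.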
Bias

When the expected value of the estimator E[<I>] is exactly the value of the integral I, the estimator is called unbiased. Otherwise the bias is:

  B[<I>] = E[<I>] - I.

The estimator is consistent if the bias vanishes in the limit: lim_{N -> inf} B[<I>] = 0.

Accuracy: Chebyshev's Inequality

The probability that a sample deviates from the solution by a value greater than sqrt(\sigma^2 / \delta), where \delta is an arbitrary positive number, is smaller than \delta:

  Pr[ |<I> - E[<I>]| >= sqrt(\sigma^2 / \delta) ] <= \delta.

For the N-sample estimator (variance \sigma^2 / N), choosing \delta = 1/10000 gives:

  Pr[ |<I> - E[<I>]| >= 100 \sigma / sqrt(N) ] <= 1/10000.
Accuracy: Central Limit Theorem

As N -> inf, the values of the estimator approach a normal distribution.

(Figure: normal distribution with mode = median = mean at \mu; 68.3% of the mass lies within \mu +/- 1\sigma, 95.4% within \mu +/- 2\sigma, and 99.7% within \mu +/- 3\sigma.)

Deterministic Quadrature vs. MC

One approximation of the integral I:

  I \approx \sum_{i=1}^{N} w_i f(x_i) = \sum_{i=1}^{N} f(x_i)(b - a)/N.

Extending such deterministic quadrature rules to a d-dimensional integral would require N^d samples.
Multidimensional MC Integration

Multidimensional Monte Carlo integration:

  I = \int\int f(x, y) dx dy,
  <I> = (1/N) \sum_{i=1}^{N} f(x_i, y_i) / p(x_i, y_i).

Note that, unlike deterministic quadrature techniques, Monte Carlo techniques permit an arbitrary choice of N.

MC Integration over a Hemisphere

Consider a light source L. The irradiance can be computed as an integral over the hemisphere:

  I = \int L_source(\Theta) cos\theta d\omega_\Theta
    = \int_0^{2\pi} \int_0^{\pi/2} L_source(\Theta) cos\theta sin\theta d\theta d\phi,

where the differential solid angle is d\omega_\Theta = sin\theta d\theta d\phi.
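Before specializing to the hemisphere, the general multidimensional estimator can be sketched for a simple case, \int_0^1 \int_0^1 x y dx dy = 1/4, with the uniform joint PDF p(x, y) = 1. This is an illustration of ours, not from the slides:

```c
#include <stdlib.h>

static double uniform01_2d(void)
{
    return (double)rand() / ((double)RAND_MAX + 1.0);
}

/* MC estimate of \int_0^1 \int_0^1 x*y dx dy = 1/4, with p(x,y) = 1. */
double mc_estimate_2d(int n)
{
    double sum = 0.0;
    for (int i = 0; i < n; i++) {
        double x = uniform01_2d();
        double y = uniform01_2d();
        sum += x * y;  /* f(x,y)/p(x,y) */
    }
    return sum / n;
}
```

Note that N is again arbitrary; nothing in the estimator depends on the dimension being one.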
MC Integration over a Hemisphere

The estimator for irradiance is:

  <I> = (1/N) \sum_{i=1}^{N} L_source(\Theta_i) cos\theta_i sin\theta_i / p(\Theta_i).

If we choose our samples from the probability distribution

  p(\Theta_i) = cos\theta sin\theta / \pi,

the estimator for irradiance simplifies to:

  <I> = (\pi/N) \sum_{i=1}^{N} L_source(\Theta_i).

Summary of Monte Carlo

A Monte Carlo estimator for an integral I = \int f(x) dx is:

  <I> = (1/N) \sum_{i=1}^{N} f(x_i)/p(x_i).

The variance of this estimator is:

  \sigma^2 = (1/N) \int (f(x)/p(x) - I)^2 p(x) dx.

MC computation steps:
1. Sample according to a PDF.
2. Evaluate the integrand and the PDF at that sample.
3. Average these appropriately weighted sample values.
Chapter 3
RANDOM(?) SAMPLING

Sampling

The Monte Carlo technique computes samples from a probability distribution p(x). We want to find samples such that the distribution of the samples matches p(x).

Suppose p(x) is known. How could we achieve such sampling?
- Inverse cumulative distribution function
- Rejection sampling
Inverse Cumulative Distribution Function

Given a set of probabilities p_i for discrete random variables x_i, the discrete cumulative distribution function (CDF) corresponding to the p_i is:

  F_i = \sum_{j=1}^{i} p_j.

Compute a sample u that is uniformly distributed over the domain [0,1), and select the index k such that:

  F_{k-1} <= u < F_k;
  \sum_{j=1}^{k-1} p_j <= u < \sum_{j=1}^{k} p_j;
  \sum_{j=1}^{k-1} p_j <= u < F_{k-1} + p_k.

Uniform Probability

A uniform PDF over [a,b]:

  p_u(x) = 1/(b - a).

The probability that x falls in a subinterval [a',b']:

  Pr(x \in [a',b']) = \int_{a'}^{b'} 1/(b - a) dx = (b' - a')/(b - a).

The CDF:

  Pr(x <= y) = F(y) = \int_a^y 1/(b - a) dx = (y - a)/(b - a);

when a = 0 and b = 1, Pr(x <= y) = F(y) = y.

The probability that the value of u lies between F_{k-1} and F_k is therefore:

  p_k = F_k - F_{k-1}.
Inverse Cumulative Distribution Function

Continuous random variables: a sample can be generated according to a given distribution p(x) by applying the inverse of its cumulative distribution function

  F(y) = \int_{-inf}^{y} p(x) dx

to a uniformly generated random variable u over the interval [0,1):

1. Pick u uniformly from [0,1).
2. Output y = F^{-1}(u).

Why this works: based on the CDF of the uniform PDF, we know Pr[u <= X] = X. Therefore:

  Pr[F^{-1}(u) <= F^{-1}(X)] = X.

Setting X = F(Y):

  Pr[y <= Y] = F(Y) = \int_{-inf}^{Y} p(x) dx,

so the outputs y are indeed distributed according to p(x).
Example

Draw random samples from the PDF p(x) = 5x^4 on [0,1].

CDF:

  F(x) = \int_0^x p(t) dt = x^5.

Sample:

  x_i = F^{-1}(u_i) = u_i^{1/5}.

Cosine Sampling Example

A cosine weighting factor arises in the rendering equation; therefore it is often useful to sample the hemisphere with a cosine PDF when computing radiance. The hemisphere can be sampled such that the samples are weighted by the cosine term. The PDF is:

  p(\theta, \phi) = cos\theta / \pi.
Cosine Sampling Example

Its CDF is computed as:

  F = (1/\pi) \int cos\theta d\omega;
  F(\theta, \phi) = (1/\pi) \int_0^\phi \int_0^\theta cos\theta' sin\theta' d\theta' d\phi'
                  = (1/\pi) \int_0^\phi d\phi' \int_0^\theta cos\theta' sin\theta' d\theta'
                  = (\phi/\pi) [ -cos^2\theta' / 2 ]_0^\theta
                  = (\phi / 2\pi)(1 - cos^2\theta).

Cosine Sampling Example

The CDF is separable into \phi and \theta factors:

  F_\phi = \phi / 2\pi,  F_\theta = 1 - cos^2\theta.

Therefore, given two uniformly distributed samples u_1 and u_2, we compute:

  \phi_i = 2\pi u_1,  \theta_i = cos^{-1}(sqrt(u_2)),

where 1 - u_2 has been replaced by u_2, since both are uniform random variables in the domain [0,1). These \phi_i and \theta_i values are distributed according to the cosine PDF.
Empirical CDF

Sometimes we need to learn the CDF from observed data. Consider the following dataset: [4 0 3 2 2].

The formula for the empirical CDF F_n(t) is:

  F_n(t) = (# of sample values <= t) / n.

The empirical CDF is a step function that has a step of 1/n = 1/5 = 0.2 at each of the observed data points.

  t   F_n(t)
  0   1/5
  1   1/5
  2   3/5
  3   4/5
  4   5/5

https://onlinecourses.science.psu.edu/stat464/node/84

Empirical CDF

If there are k observations with the same sample value t, then the size of the step at t is k/n. For our example, the size of the step at t = 2 is 2/5, since two of the observations equal 2. Evaluating between the data points:

  t    F_n(t)
  0.0  1/5
  0.5  1/5
  1.0  1/5
  1.5  1/5
  2.0  3/5
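The empirical CDF of the example dataset can be computed directly. A sketch (the function name is our own):

```c
/* Empirical CDF: F_n(t) = (# of sample values <= t) / n. */
double empirical_cdf(const double *data, int n, double t)
{
    int count = 0;
    for (int i = 0; i < n; i++)
        if (data[i] <= t)
            count++;
    return (double)count / n;
}
```

For the dataset [4 0 3 2 2], `empirical_cdf(data, 5, 2.0)` gives 3/5 = 0.6, matching the table above.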
Draw Samples from a PDF

To draw random samples x_i from a PDF p(x) with range (-inf, inf):
- Compute the CDF F(x) = \int_{-inf}^{x} p(t) dt.
- Sample the inverse CDF uniformly: x_i = F^{-1}(u_i), u_i \in [0,1).

Inverse CDF

With this procedure, index k is selected with probability p_k, which is exactly what we want.
- Computing the table of F values takes O(N) time.
- Looking up the appropriate value to output takes O(log_2 N) time per sample, by conducting a binary search on the precomputed F table.
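The O(log_2 N) lookup can be sketched as a binary search over the precomputed CDF table (the function name is our own):

```c
/* Given a precomputed, nondecreasing CDF table F[0..n-1] with F[n-1] = 1,
   return the smallest index k such that u < F[k]. Runs in O(log2 n). */
int sample_discrete(const double *F, int n, double u)
{
    int lo = 0, hi = n - 1;
    while (lo < hi) {
        int mid = (lo + hi) / 2;
        if (u < F[mid])
            hi = mid;      /* answer is at or before mid */
        else
            lo = mid + 1;  /* answer is after mid */
    }
    return lo;
}
```

For probabilities {0.1, 0.3, 0.6} the table is F = {0.1, 0.4, 1.0}; a uniform u then selects index k with probability p_k.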
Rejection Sampling

It is often impossible to derive an analytical formula for the inverse of the cumulative distribution function. Rejection sampling is an alternative: samples are tentatively proposed, then tested to determine acceptance or rejection.

The method raises the dimension of the function being sampled by one, and then uniformly samples the bounding box that includes the entire PDF. This sampling technique yields samples with the appropriate distribution.

Rejection Sampling

Let M be the maximum value of a 1D PDF p(x) over the domain [a,b] to be sampled:
- Create the 2D region [a,b] x [0,M].
- Sample (x,y) uniformly from this region.
- Reject samples for which p(x) < y.

The distribution of the accepted x values is the PDF p(x).
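Rejection sampling can be sketched for the earlier PDF p(x) = 5x^4 on [0,1], whose maximum is M = 5 (function names are our own). The accepted samples should have the same mean, E[x] = 5/6, as the inverse-CDF method:

```c
#include <stdlib.h>

static double rej_uniform01(void)
{
    return (double)rand() / ((double)RAND_MAX + 1.0);
}

/* Rejection-sample p(x) = 5x^4 on [0,1]; max of p is M = 5.
   Propose (x, y) uniformly in [0,1] x [0,M]; reject when p(x) < y. */
double rejection_sample_5x4(void)
{
    for (;;) {
        double x = rej_uniform01();
        double y = 5.0 * rej_uniform01();
        if (y <= 5.0 * x * x * x * x)  /* accept */
            return x;
    }
}

/* Sample mean over n accepted draws; should approach 5/6. */
double rej_mean(int n)
{
    double s = 0.0;
    for (int i = 0; i < n; i++)
        s += rejection_sample_5x4();
    return s / n;
}
```

Note the cost: only a fraction 1/M of the proposals is accepted on average, which is why rejection sampling is used when the inverse CDF is unavailable.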
Jittered Sampling

- Generate N x N stratified samples per pixel at (i,j).
- Generate random variables \lambda_1 and \lambda_2 to jitter each stratified sample position.
- Generate a ray from the COP through the sampled position (i + \lambda_1, j + \lambda_2).
- Radiance = total radiance / N_RAYS_PER_PIXEL.

Stratified Sampling

A uniform random sample is random: each type of pattern is equally probable. A stratified sample, where we sample randomly within strata, significantly reduces the variance and enhances the convergence speed.
Stratified Sampling

The basic idea in stratified sampling is to split the integration domain into m disjoint subdomains (so-called strata) and evaluate the integral in each of the subdomains separately with one or more samples:

  \int_0^1 f(x) dx = \int_0^{\alpha_1} f(x) dx + \int_{\alpha_1}^{\alpha_2} f(x) dx + ...
                   + \int_{\alpha_{m-2}}^{\alpha_{m-1}} f(x) dx + \int_{\alpha_{m-1}}^{1} f(x) dx.

N-Rooks Algorithm

One major disadvantage of stratified sampling arises when it is used for higher-dimensional sampling. The N-Rooks algorithm distributes N samples evenly among the strata, one per row and one per column, like non-attacking rooks.
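One-dimensional stratified sampling can be sketched for the running example \int_0^1 5x^4 dx = 1, using m equal strata with one sample each (function names are our own):

```c
#include <stdlib.h>

static double strat_uniform01(void)
{
    return (double)rand() / ((double)RAND_MAX + 1.0);
}

/* Stratified MC estimate of \int_0^1 5x^4 dx:
   split [0,1] into m equal strata and draw one sample in each. */
double stratified_estimate(int m)
{
    double sum = 0.0;
    for (int k = 0; k < m; k++) {
        /* x is uniform within stratum [k/m, (k+1)/m). */
        double x = (k + strat_uniform01()) / m;
        sum += 5.0 * x * x * x * x;
    }
    return sum / m;  /* each stratum has width 1/m */
}
```

With the same total number of samples, the stratified estimate is far closer to 1 than the plain uniform estimate, illustrating the variance reduction.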
Example

Pseudo-Random Numbers

C/C++ has a built-in random number generator. For example:

  #include <stdlib.h>  /* rand, RAND_MAX */

  /* generate random number in range [0,1] */
  float uniform(void)
  {
      return (float)rand() / (float)RAND_MAX;
  }

To choose an event with probability p, use:

  if (uniform() < p)
      /* do the event */;
  else
      /* don't */;
Random Numbers

The built-in random number generator is usually not very good: the samples are not stratified, etc. Better (quasi-)random numbers can be obtained from the Halton sequence, for example in C:

  /* Radical-inverse (Halton) sequence value for a given index and base. */
  double halton(int index, int base)
  {
      double result = 0.0;
      double f = 1.0 / base;
      int i = index;
      while (i > 0) {
          result += f * (i % base);
          i /= base;   /* integer division = floor */
          f /= base;
      }
      return result;
  }

Chapter 3
VARIANCE REDUCTION
Variance Reduction

Monte Carlo techniques:
- Blind Monte Carlo: no information is used to choose the probability function (what we have seen so far, assuming a uniform distribution).
- Informed Monte Carlo: uses such information, and is more accurate.

Designing efficient estimators is a major area of Monte Carlo research, e.g., importance sampling, stratified sampling, multiple importance sampling, quasi-Monte Carlo, etc.

Importance Sampling

Importance sampling uses a non-uniform probability distribution function to generate samples. Consider N random samples over the domain, drawn with probability p(x), and define an estimator of the integral as:

  <I> = (1/N) \sum_{i=1}^{N} f(x_i)/p(x_i).

The expected value of this estimator is I.
Importance Sampling

Since a perfect estimator would have zero variance, we look for the p(x) that minimizes the variance. Define a Lagrangian L for the perfect estimator:

  L(p) = \int_D (f(x)/p(x))^2 p(x) dx + \lambda \int_D p(x) dx,

where the integral of p(x) over the integration domain D must be 1:

  \int_D p(x) dx = 1.

We need to find the scalar \lambda (and the function p) that minimize L.
Importance Sampling The constant is a scaling factor, such that p(x) can fulfill the boundary condition. The optimal p(x) is then given by: f (x) p(x) = f (x)dx D We use p that resembles f. Does not change convergence rate (still sqrt) bad uniform good 43 Example Evaluate: Use PDF Yields: I = 1 0 5x 4 dx p(x) = 5x 4 I 1 N 5x 4 = 1 N 5x 4 i=1 44 22
Example: Glossy Rendering

(Figures: glossy rendering results.)
Antialiasing in Ray Tracing

In order to reduce aliasing due to undersampling in ray tracing, each pixel may be sampled multiple times and the average radiance per pixel computed. A stratified sample over the pixel is preferable to a uniform random sample, especially when the gradient within the pixel changes sharply.