A Random Number Based Method for Monte Carlo Integration

J. Wang and G. Harrell
Department of Math and CS, Valdosta State University, Valdosta, Georgia, USA

Abstract - A new method is proposed for Monte Carlo integration. This method is more efficient and has wider coverage, including improper integrals, whereas the classical (simple) Monte Carlo integration can only handle integrals over bounded domains. To implement this method in a computer program, you only need a random number generator. Unlike the deterministic numerical integration methods, the expected error of this method is independent of the integral dimensionality. This method is powerful and dominates other numerical integration methods for higher-dimensional integrals.

Keywords: Monte Carlo integration; Output analysis; Trapezoid rule; Simpson rule; Error analysis; Improper integral

1 Introduction

In single variable calculus ([8]), the general way to evaluate a definite integral $\int_a^b f(x)\,dx$ is to find a formula $F(x)$ for one of the antiderivatives of $f(x)$, so that $\int_a^b f(x)\,dx = F(b) - F(a)$. However, some antiderivatives are not easy to find, and for other integrands it has been proved that no closed-form formula exists [8]. There is a need to evaluate the definite integrals of such functions. Two different types of numerical methods are used to approximate $\int_a^b f(x)\,dx$. The first type is deterministic integration, such as the trapezoidal rule and the Simpson rule. The second type is Monte Carlo integration.

In Section 2, we introduce Monte Carlo integration in a general $d$-dimensional setting. Selecting a probability density function that has the same domain as the integral is an important part of Monte Carlo integration. The efficiency of Monte Carlo integration depends on how difficult it is to generate a random sample from the selected density function. The most popular method in practice is the simple Monte Carlo integration. This method only works for a bounded integral domain. It does not work for any improper integrals with unbounded domains.

In Section 3, we propose a new method for Monte Carlo integration. It is simple and efficient, and it can be used to approximate integrals with unbounded domains in any number of dimensions. The key idea is to convert the original integral domain to a new integral domain: a unit hyper-cube. The implementation of this new method is much easier, because a random number generator is the only thing you need.

In Section 4, we conduct the variance analysis. It helps us decide when the Monte Carlo simulation run should stop, and what the random sample size should be. For the proposed method, the expected value of the error is $O(n^{-1/2})$, where $n$ is the sample size, which is independent of the integral dimensionality. Compared with the Trapezoid rule and the Simpson rule, the new method has the better error order once the integral dimension is larger than 8. It is particularly attractive and powerful for higher-dimensional integrals.

2 Monte Carlo Integration

Monte Carlo integration ([2], [4]) is a simple and powerful method for approximating complicated integrals. In general, we are interested in evaluating the integral of a function $f$ over a domain $D$,

$$I = \int_D f(\mathbf{x})\,d\mathbf{x}. \tag{1}$$

Here we use the vector notation $\mathbf{x} = (x_1, \dots, x_d)$ to indicate that $f$ is a $d$-dimensional function. In fact, Monte Carlo techniques are more attractive for estimating higher-dimensional integrals. We assume that there is a probability density function (pdf) $p(\mathbf{x})$ defined on the same domain $D$, with $p(\mathbf{x}) > 0$ on $D$. Therefore (1) equals

$$I = \int_D \left[\frac{f(\mathbf{x})}{p(\mathbf{x})}\right] p(\mathbf{x})\,d\mathbf{x}. \tag{2}$$

The integral (2) is the expectation of $f(\mathbf{X})/p(\mathbf{X})$ with respect to a random vector $\mathbf{X}$ distributed according to the pdf $p$,

$$I = E\left[\frac{f(\mathbf{X})}{p(\mathbf{X})}\right]. \tag{3}$$

According to the pdf $p$, we generate $n$ independent random samples $\mathbf{X}_1, \dots, \mathbf{X}_n$. The Monte Carlo integration estimator is defined as follows,

$$\hat{I}_n = \frac{1}{n}\sum_{i=1}^{n} \frac{f(\mathbf{X}_i)}{p(\mathbf{X}_i)}. \tag{4}$$
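As an illustration of estimator (4), the following is a minimal Python sketch and not part of the original method description. It assumes the integrand $x^2 e^{-x}$ on $[0, \infty)$, whose exact integral is 2, and picks the exponential density $p(x) = e^{-x}$ on the same domain. The names `mc_integrate`, `f`, `p`, and `sample_p` are hypothetical.

```python
import math
import random


def mc_integrate(f, p, sample_p, n, rng):
    """Estimator (4): average of f(X_i) / p(X_i) with X_i drawn from the pdf p."""
    total = 0.0
    for _ in range(n):
        x = sample_p(rng)
        total += f(x) / p(x)
    return total / n


# Assumed example integrand on D = [0, infinity); its exact integral is 2.
def f(x):
    return x * x * math.exp(-x)


# Assumed density on the same domain: exponential with rate 1.
def p(x):
    return math.exp(-x)


# Draw one sample from p using the standard library's exponential sampler.
def sample_p(rng):
    return rng.expovariate(1.0)


rng = random.Random(12345)  # fixed seed so the run is repeatable
estimate = mc_integrate(f, p, sample_p, 200_000, rng)
print(estimate)
```

The sketch only works because an exponential sampler is readily available; for a less convenient density, drawing the samples can be the expensive part of the method.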

Based on the Strong Law of Large Numbers ([1]), we have, with probability 1,

$$\hat{I}_n = \frac{1}{n}\sum_{i=1}^{n} \frac{f(\mathbf{X}_i)}{p(\mathbf{X}_i)} \longrightarrow E\left[\frac{f(\mathbf{X})}{p(\mathbf{X})}\right] = I \quad \text{as } n \to \infty.$$

This means that the Monte Carlo estimator $\hat{I}_n$ of $E[f(\mathbf{X})/p(\mathbf{X})]$ is consistent. The efficiency of this method depends on how the random samples are generated. The importance sampling technique ([2], [7]) can be used here to reduce the Monte Carlo integration error (the sample variance). How to pick a pdf $p$ defined on $D$ is a major concern in using importance sampling techniques. Another important efficiency factor is the cost of generating the random sample. Sometimes it may be very expensive, depending on the selected pdf.

In practice, the most commonly used method is the simple Monte Carlo integration. We define the volume of the integration domain $D$ in (1),

$$V = \int_D d\mathbf{x}.$$

The pdf selected for the simple Monte Carlo integration is the uniform density on $D$,

$$p(\mathbf{x}) = \begin{cases} 1/V, & \mathbf{x} \in D, \\ 0, & \text{otherwise.} \end{cases}$$

With this choice, estimator (4) reduces to $\hat{I}_n = \frac{V}{n}\sum_{i=1}^{n} f(\mathbf{X}_i)$. In the implementation of simple Monte Carlo integration, we only need to generate random samples uniformly over the integral domain $D$. The efficiency depends on the shape of the integral domain.

3 Using Random Numbers to Approximate Integrals

As discussed in the previous section, the shape of the integral domain $D$ is an issue for the simple Monte Carlo integration. It is only feasible when $D$ is a hyper-rectangle. For an improper integral with an unbounded domain, the simple Monte Carlo technique does not work, because it is impossible to generate random samples uniformly over an unbounded region. For the general Monte Carlo integration technique, the issue is how to select a pdf $p$. The necessary condition for $p$ is that it has the same domain $D$. Can we generate random samples from this density function efficiently, and how?
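To make the bounded-domain restriction concrete, here is a minimal sketch of simple Monte Carlo integration over a hyper-rectangle. It is an illustration under assumptions rather than a reference implementation: the function name `simple_mc`, the `bounds` argument, and the test integrand $f(x, y) = xy$ on $[0, 2] \times [0, 3]$ (exact value 9) are all hypothetical choices.

```python
import random


def simple_mc(f, bounds, n, rng):
    """Simple Monte Carlo: V times the mean of f over uniform samples in D.

    bounds is a list of (low, high) pairs, one per dimension; every bound
    must be finite, otherwise the volume V is not defined.
    """
    volume = 1.0
    for low, high in bounds:
        volume *= (high - low)
    total = 0.0
    for _ in range(n):
        # One uniform sample point in the hyper-rectangle.
        x = [rng.uniform(low, high) for low, high in bounds]
        total += f(x)
    return volume * total / n


# Assumed test integrand: f(x, y) = x * y on [0, 2] x [0, 3]; exact value 9.
def f(x):
    return x[0] * x[1]


rng = random.Random(2024)  # fixed seed so the run is repeatable
estimate = simple_mc(f, [(0.0, 2.0), (0.0, 3.0)], 100_000, rng)
print(estimate)
```

If any bound is infinite, the volume becomes infinite and uniform sampling has no meaning, which is the limitation described above.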
To solve the issues discussed above, we propose a new random number based method for Monte Carlo integration. The key idea is to use the change of variable (substitution) method to transfer the original integral domain (including the unbounded case) into a new domain: a unit hyper-cube. We can then easily generate random samples uniformly over the unit hyper-cube. We introduce the new method for one-dimensional integration first. The method can be easily generalized to $d$-dimensional situations.

What is a random number? By definition, a random number $U$ is a continuous random variable following a uniform distribution over the interval $(0, 1)$. Its pdf is given by

$$p(u) = \begin{cases} 1, & 0 < u < 1, \\ 0, & \text{otherwise.} \end{cases} \tag{5}$$

The cumulative distribution function (cdf) is given by

$$F(u) = \begin{cases} 0, & u \le 0, \\ u, & 0 < u < 1, \\ 1, & u \ge 1. \end{cases} \tag{6}$$

To connect random numbers with integration, we begin with a special case. Let $g$ be a one-dimensional function whose integral over $(0, 1)$ we want to approximate,

$$I = \int_0^1 g(u)\,du. \tag{7}$$

If $U$ is a random number having pdf (5), then the expectation of $g(U)$ is given by

$$E[g(U)] = \int_0^1 g(u) \cdot 1\,du.$$

Therefore $I = E[g(U)]$. If $U_1, \dots, U_n$ are independent random numbers, then $g(U_1), \dots, g(U_n)$ are independent and identically distributed random variables with mean $E[g(U)]$, which is equal to $I$. Therefore, by the Strong Law of Large Numbers, we have, with probability 1,

$$\frac{1}{n}\sum_{i=1}^{n} g(U_i) \longrightarrow I \quad \text{as } n \to \infty.$$

We can use a large number of random numbers to approximate the definite integral (7). The natural approximation of $I$ based on $n$ random numbers is defined by

$$\hat{I}_n = \frac{1}{n}\sum_{i=1}^{n} g(U_i). \tag{8}$$

We define this approximation technique as a random number based method for Monte Carlo integration.

In general, we want to approximate the integral

$$I = \int_a^b f(x)\,dx. \tag{9}$$

Here the lower and upper bounds $a$ and $b$ can be any real numbers, or negative and positive infinity. For any combination of $a$ and $b$ in (9), we provide detailed steps to convert (9) into a basic integral (7) over the interval $(0, 1)$. The key idea is to find a suitable variable substitution $x = x(u)$ that expresses $x$ in terms of $u$. This substitution transfers the original integral interval $(a, b)$ into the new integral interval $(0, 1)$. The transformation can be described as follows,

$$\int_a^b f(x)\,dx = \int_0^1 g(u)\,du, \tag{10}$$

where the new integrand is $g(u) = f(x(u))\,|x'(u)|$, and $x'(u)$ is the Jacobian of the variable substitution. If the transformation (10) exists, we can use the method (8) to approximate $I$ in (9),

$$\hat{I}_n = \frac{1}{n}\sum_{i=1}^{n} g(U_i). \tag{11}$$

Here $\hat{I}_n$ converges to the true integral value $I$ in (9) with probability 1. Let us classify all possible combinations of the lower and upper bounds $a$ and $b$ in (9) into different cases, and then discuss each one.

Case A: For $\int_a^b f(x)\,dx$, where $a$ and $b$ are both finite real numbers and $a < b$, we define the following substitution,

$$x = a + (b - a)u.$$

Then $x = a$ implies $u = 0$, $x = b$ implies $u = 1$, and $dx = (b - a)\,du$. This gives

$$\int_a^b f(x)\,dx = (b - a)\int_0^1 f\big(a + (b - a)u\big)\,du,$$

and the transformation is done.

Case B: For $\int_a^{\infty} f(x)\,dx$, where $a$ is a finite real number, we define a nonlinear transformation, for example

$$x = a - 1 + \frac{1}{u}.$$

Then $x = a$ implies $u = 1$, $x \to \infty$ implies $u \to 0^{+}$, and $dx = -\frac{1}{u^2}\,du$. Using the negative sign to switch the integral lower and upper bounds, we have

$$\int_a^{\infty} f(x)\,dx = \int_1^0 f\Big(a - 1 + \frac{1}{u}\Big)\Big(-\frac{1}{u^2}\Big)\,du = \int_0^1 f\Big(a - 1 + \frac{1}{u}\Big)\frac{1}{u^2}\,du.$$

This implies $g(u) = f(a - 1 + 1/u)/u^2$. Here $g(u)$ may be unbounded when $u \to 0$. However, this improper integral is convergent if $\int_a^{\infty} f(x)\,dx$ is finite. There is no problem when we perform the Monte Carlo simulation, since the pseudo random numbers are drawn from the open interval $(0, 1)$ and cannot be zero.

Case C: For $\int_{-\infty}^{b} f(x)\,dx$, where $b$ is a finite real number, we define a nonlinear transformation, for example

$$x = b + 1 - \frac{1}{u}.$$

Then $x \to -\infty$ implies $u \to 0^{+}$, $x = b$ implies $u = 1$, and $dx = \frac{1}{u^2}\,du$. This implies

$$\int_{-\infty}^{b} f(x)\,dx = \int_0^1 f\Big(b + 1 - \frac{1}{u}\Big)\frac{1}{u^2}\,du.$$

A typical application of Case C is the Normal cdf,

$$\Phi(b) = \int_{-\infty}^{b} \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\,dx.$$

For this special integral, a closed form of the antiderivative does not exist.
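Cases A to C and estimator (11) can be sketched in Python as follows. This is a minimal illustration under assumptions: the nonlinear maps $x = a - 1 + 1/u$ and $x = b + 1 - 1/u$ are one possible choice of substitution, and the names `to_unit_interval`, `rn_integrate`, and `normal_pdf` are hypothetical.

```python
import math
import random


def to_unit_interval(f, a, b):
    """Return g whose integral over (0, 1) equals the integral of f over (a, b).

    The nonlinear maps used for Cases B and C are assumed choices.
    """
    if math.isfinite(a) and math.isfinite(b):   # Case A: x = a + (b - a) u
        return lambda u: (b - a) * f(a + (b - a) * u)
    if math.isfinite(a) and b == math.inf:      # Case B: x = a - 1 + 1/u
        return lambda u: f(a - 1.0 + 1.0 / u) / (u * u)
    if a == -math.inf and math.isfinite(b):     # Case C: x = b + 1 - 1/u
        return lambda u: f(b + 1.0 - 1.0 / u) / (u * u)
    raise ValueError("bounds not covered by Cases A to C")


def rn_integrate(f, a, b, n, rng):
    """Estimator (11): average of g(U_i) over n random numbers U_i."""
    g = to_unit_interval(f, a, b)
    total = 0.0
    for _ in range(n):
        u = 1.0 - rng.random()  # random() lies in [0, 1), so u lies in (0, 1]
        total += g(u)
    return total / n


def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


rng = random.Random(7)  # fixed seed so the run is repeatable
phi_0 = rn_integrate(normal_pdf, -math.inf, 0.0, 200_000, rng)  # exact: 0.5
area = rn_integrate(lambda x: x * x, 0.0, 2.0, 200_000, rng)    # exact: 8/3
tail = rn_integrate(lambda x: math.exp(-x), 0.0, math.inf, 200_000, rng)  # exact: 1
print(phi_0, area, tail)
```

Python's `random.random()` can return exactly 0.0, so the sketch uses `1.0 - rng.random()` to keep every draw strictly positive before dividing by `u * u`.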

Case D: For $\int_{-\infty}^{\infty} f(x)\,dx$, we rewrite this integral as a sum of two integrals,

$$\int_{-\infty}^{\infty} f(x)\,dx = \int_{-\infty}^{0} f(x)\,dx + \int_{0}^{\infty} f(x)\,dx.$$

We apply the Case C method to the first integral and the Case B method to the second integral, where $b = 0$ and $a = 0$ respectively. Therefore, we have

$$\int_{-\infty}^{\infty} f(x)\,dx = \int_0^1 \big[g_1(u) + g_2(u)\big]\,du,$$

where $g_1$ and $g_2$ denote the transformed integrands produced by Case C and Case B.

We have now discussed all possible situations of the integral (9). In other words, we can use the random number based method to approximate the integral (9) for any lower and upper bounds.

This idea can be generalized to approximate multiple integrals. For a general $d$-dimensional integral,

$$I = \int_{a_1}^{b_1} \cdots \int_{a_d}^{b_d} f(x_1, \dots, x_d)\,dx_d \cdots dx_1, \tag{12}$$

we may apply the appropriate one-dimensional substitution to each coordinate and transfer (12) into the following integral,

$$I = \int_0^1 \cdots \int_0^1 g(u_1, \dots, u_d)\,du_d \cdots du_1. \tag{13}$$

Integral (13) has a unit hyper-cube domain and is equivalent to $I = E[g(U_1, \dots, U_d)]$, where $U_1, \dots, U_d$ are independent random numbers. To approximate the value of the $d$-dimensional integral (13), we only need to generate $n$ independent random vectors over the unit hyper-cube: $\mathbf{U}_i = (U_{i1}, \dots, U_{id})$, $i = 1, \dots, n$. Therefore, we can estimate the integral value of (13) by

$$\hat{I}_n = \frac{1}{n}\sum_{i=1}^{n} g(\mathbf{U}_i). \tag{14}$$

By the Strong Law of Large Numbers, with probability 1, $\hat{I}_n \to I$ as $n \to \infty$. We only need a random number generator ([5], [6]) to implement our method in a computer program.

4 Error Analysis

The implementation of the Monte Carlo integration is a computer simulation. The Monte Carlo integration estimate converges to the true integral value with probability 1. In practice, we need to consider when to stop the Monte Carlo simulation, which essentially means deciding the appropriate sample size. The quality of the Monte Carlo integration estimator depends on this sample size. The typical error analysis in Monte Carlo simulation is called output analysis, which uses the sample variance to evaluate the quality of the Monte Carlo estimator ([2], [7]). In this section, we assume that the variance is finite so that the Central Limit Theorem can be applied.

The mean value of $\hat{I}_n$ in (14) is

$$E[\hat{I}_n] = I.$$

This implies that $\hat{I}_n$ is an unbiased estimator. The variance of $\hat{I}_n$ is given by

$$\mathrm{Var}(\hat{I}_n) = \frac{\sigma^2}{n},$$

where $\sigma^2 = \mathrm{Var}(g(\mathbf{U}))$ is the variance of $g(\mathbf{U})$ and $\mathbf{U}$ is the unit uniform random vector over the hyper-cube ([2], [7]). In practice, we use the following sample variance to estimate the true variance $\sigma^2$,

$$s^2 = \frac{1}{n - 1}\sum_{i=1}^{n} \big(g(\mathbf{U}_i) - \hat{I}_n\big)^2.$$

The variance of the estimator decreases asymptotically to zero as $n \to \infty$. The expected value of the error is proportional to the standard deviation $\sigma/\sqrt{n}$, which is $O(n^{-1/2})$. From the Central Limit Theorem, the confidence level is about 68% for an error within one standard deviation, about 95% within two standard deviations, and about 99.7% within three standard deviations. We can decide the sample size based on the desired confidence level. The swing digit idea in ([9]) is a good method for reporting simulation outputs. To gain one more digit of accuracy, you need to increase the sample size 100 times.

The expected value of the error for this new method is always $O(n^{-1/2})$, which does not depend on the integral dimensionality. For deterministic numerical integration, however, the error does depend on the integral dimensionality. For a $d$-dimensional integral evaluated at $n$ points, the Trapezoid rule provides an error of $O(n^{-2/d})$ ([3]), and the Simpson rule provides an error of $O(n^{-4/d})$. Comparing error orders, for lower-dimensional integrals the Trapezoid rule and the Simpson rule are better than the Monte Carlo integration. As the dimensionality increases, for $d > 4$ the Monte Carlo integration is better than the Trapezoid rule, and for $d > 8$ the Monte Carlo integration is better than the Simpson rule. In particular, the Monte Carlo method is powerful for high-dimensional integrals. When the dimension is higher than 20, Monte Carlo integration is the only working numerical method in many existing software packages.
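The output analysis above can be sketched as follows. This is a minimal illustration under assumptions, not a prescribed implementation: the function name `rn_integrate_with_error` and the test integrand $g(\mathbf{u}) = u_1^2 + \cdots + u_d^2$ with $d = 10$ (exact integral $d/3$) are hypothetical choices. The sketch reports the estimate (14) together with the estimated standard deviation of the estimator, $s/\sqrt{n}$.

```python
import math
import random


def rn_integrate_with_error(g, d, n, rng):
    """Return (estimate, standard_error) for the integral of g over the unit hyper-cube.

    estimate is the sample mean of g(U_i); standard_error is s / sqrt(n),
    where s^2 is the sample variance with divisor n - 1.
    """
    total = 0.0
    total_sq = 0.0
    for _ in range(n):
        u = [rng.random() for _ in range(d)]  # one point in the unit hyper-cube
        y = g(u)
        total += y
        total_sq += y * y
    mean = total / n
    sample_var = (total_sq - n * mean * mean) / (n - 1)
    return mean, math.sqrt(max(sample_var, 0.0) / n)


# Assumed test integrand: sum of squared coordinates, exact integral d / 3.
def g(u):
    return sum(x * x for x in u)


d = 10
rng = random.Random(99)  # fixed seed so the run is repeatable
results = []
for n in (1_000, 100_000):
    est, se = rn_integrate_with_error(g, d, n, rng)
    results.append((n, est, se))
    # Two-standard-deviation interval, roughly a 95% confidence interval.
    print(n, est, se, (est - 2 * se, est + 2 * se))
```

Raising the sample size by a factor of 100 should shrink the reported standard error by about a factor of 10, which matches the one-extra-digit rule stated above.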

5 Conclusions

In practice, an analytical form for a given integral is sometimes difficult to find, and we turn to numerical integration. The popular deterministic numerical methods are the Trapezoid rule and the Simpson rule. The statistical numerical method is the Monte Carlo integration. In general, you need to select a probability density function that has the same domain as the given integral. Generating a sample according to the selected density function is sometimes very expensive or inefficient. For the most popular method, simple Monte Carlo integration, the integral domain has to be a bounded hyper-rectangle. This method does not work for any integral with an unbounded domain.

The method proposed in this paper is more efficient and easier to implement in a computer program. We have covered all possible integral situations, including improper integrals. Detailed steps are provided for converting the given integral domain into a new domain: a unit hyper-cube. You only need to generate random samples uniformly over a unit hyper-cube. When the integral dimension is low, there is no significant difference in errors among the numerical integration methods. When the integral dimension is higher, say larger than 8, the proposed method for Monte Carlo integration dominates all other numerical integration methods.

6 References

[1] Billingsley, P. Convergence of Probability Measures. Wiley, 2nd Edition, New York, NY, 1999.
[2] Bratley, P., Fox, B. L., and Schrage, L. E. A Guide to Simulation. Springer-Verlag, New York, NY, 1987.
[3] Burden, R. L., Faires, J. D., and Reynolds, A. C. Numerical Analysis. PWS Publishers, 2nd Edition, Boston, MA, 1979.
[4] Hammersley, J. M. and Handscomb, D. C. Monte Carlo Methods. Methuen, London, 1964.
[5] Lehmer, D. H. Mathematical Methods in Large-Scale Computing Units. Annals of the Computation Laboratory of Harvard University, 26, 141-146, 1951.
[6] L'Ecuyer, P. Uniform Random Number Generation. Annals of Operations Research, 53, 77-120, 1994.
[7] Ross, S. M. A Course in Simulation. MacMillan, New York, NY, 1990.
[8] Thomas Jr., G. B., Weir, M. D., Hass, J. R., and Giordano, F. R. Thomas' Calculus Early Transcendentals. Addison Wesley, 10th Edition, New York, NY, 2001.
[9] Wang, A. L. and Kicey, C. J. On the Accuracy of Buffon's Needle: A Simulation Output Analysis. Proceedings of the 49th Annual ACM SE Conference, ACM Press, New York, NY, 233-236, 2011.