Monte Carlo Techniques for Bayesian Statistical Inference A comparative review
BY Y. FAN, D. WANG
School of Mathematics and Statistics, University of New South Wales, Sydney 2052, AUSTRALIA
18th January, 2007

ABSTRACT

In this article, we summarise Monte Carlo simulation methods commonly used in Bayesian statistical computing. We describe each algorithm and provide R code for its implementation via a simple 2-dimensional example. We compare the relative merits of these methods qualitatively, by considering their general user-friendliness, and numerically, in terms of mean squared error and computational time. We conclude with some general guidelines and recommendations.

Some keywords: Monte Carlo; Markov Chain Monte Carlo; Simulation; Rejection Sampling; Importance Sampling; Gibbs Sampler; Metropolis-Hastings; Adaptive Rejection; Slice Sampler; Sequential Monte Carlo.

1 Introduction

As more complex data have become available, Bayesian statistical models have become more sophisticated, making analytical calculations tedious, or simply intractable in many cases. There is an increasing reliance on simulation-based methods for such inferences. Suppose we have a distribution $f(\theta)$ for a vector of real-valued parameters $\theta \in \Theta$; we may be interested in summary statistics for $\theta$ such as the mean, the mode or the quantiles. When $f$ is complex, inference for $\theta$ is often carried out using Monte Carlo simulation.

In this article, we summarise some of the most commonly used Monte Carlo methods, including importance sampling, rejection sampling, sequential Monte Carlo, the Gibbs sampler, the adaptive rejection sampler, the slice sampler and the Metropolis-Hastings sampler. We first present the algorithms listed above, and provide simple R code to demonstrate how they are implemented for a simple example (with the exception of adaptive rejection sampling, where we make use of the WinBUGS software).
We compare how well each method performs in terms of general applicability and user-friendliness. Quantitative measures, namely the mean squared error of the estimates and the computational time, are also recorded via the example. Throughout the paper, we base our comparisons on a bivariate Normal distribution with varying degrees of correlation as the benchmark distribution. While this benchmark distribution is relatively simple, it makes it conceptually easy for the reader to grasp the effects of the different samplers on the various statistics we monitor.
In Section 2, we first set out some notation and background for the examples. In Sections 3 and 4, we present the above algorithms, separated into direct and iterative algorithms respectively. We compare the results of our simulation study in Section 5, and conclude with some recommendations in Section 6.

2 Notation and Benchmark Distribution

Let $\theta = (\theta_1, \dots, \theta_d) \in \Theta$ be the $d$-dimensional parameter vector about which we wish to make inference, and let $f(\theta)$ denote the distribution of $\theta$. We wish to simulate $N$ samples of $\theta$ from $f(\theta)$, which we refer to as the target distribution. As a running example, taking $d = 2$, a bivariate Normal distribution with varying correlation coefficient is used to benchmark the comparisons. Here we give details of the target distribution and the notation used throughout the paper. Let

$$\begin{pmatrix} \theta_1 \\ \theta_2 \end{pmatrix} \sim N(\mu, \Sigma), \qquad \mu = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \qquad \Sigma = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix},$$

where $\rho$ is the correlation between the parameters $\theta_1$ and $\theta_2$. The joint distribution of $\theta_1, \theta_2$ is

$$f(\theta_1, \theta_2) = \frac{1}{2\pi\sqrt{1-\rho^2}} \exp\left\{ -\frac{1}{2(1-\rho^2)} \left( \theta_1^2 + \theta_2^2 - 2\rho\theta_1\theta_2 \right) \right\}. \qquad (1)$$

Hence the conditional distribution of $\theta_1 \mid \theta_2$ is $N(\rho\theta_2, 1-\rho^2)$ and, similarly, the conditional distribution of $\theta_2 \mid \theta_1$ is $N(\rho\theta_1, 1-\rho^2)$.

In some of the applications that follow, it is easier to transform the parameter space onto the unit square. We transform $\theta_1 \in (-\infty, \infty)$ and $\theta_2 \in (-\infty, \infty)$ to $u \in (0, 1)$ and $v \in (0, 1)$ via the logistic functions (other transformations are also possible)

$$u = \frac{1}{1 + \exp(-\theta_1)}, \qquad v = \frac{1}{1 + \exp(-\theta_2)}.$$

The transformed function equivalent to Equation 1 now becomes

$$f_s(u, v) = \frac{1}{2\pi\sqrt{1-\rho^2}} \exp\left\{ -\frac{s^2 + t^2 - 2\rho s t}{2(1-\rho^2)} \right\} \frac{1}{u v (1-u)(1-v)}, \qquad (2)$$

where $s = \log\left(\frac{u}{1-u}\right)$ and $t = \log\left(\frac{v}{1-v}\right)$.

Finally, we stress that we do not claim any optimality for the use of this particular example for benchmarking. Indeed, it is chosen for its simplicity and intuitiveness. Multimodality, correlations between parameters and the dimensionality of the target distribution all play vital roles in determining how each method performs.
We make our comparisons in this paper under the assumption that the user has minimal knowledge about the distribution from which they wish to sample.
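Equations 1 and 2 translate directly into R. The sketch below is one possible way to write the density functions `f` and `f.s` that the later code listings call. The function bodies are a reconstruction from the two equations, not the authors' own code, and the grid check at the end is an illustrative assumption.

```r
# Density of Equation 1: bivariate Normal with zero means, unit variances
# and correlation rho.
f <- function(theta1, theta2, rho) {
  q <- theta1^2 + theta2^2 - 2 * rho * theta1 * theta2
  exp(-q / (2 * (1 - rho^2))) / (2 * pi * sqrt(1 - rho^2))
}

# Density of Equation 2: Equation 1 mapped onto the unit square.
f.s <- function(u, v, rho) {
  s <- log(u / (1 - u))                        # s plays the role of theta1
  t <- log(v / (1 - v))                        # t plays the role of theta2
  f(s, t, rho) / (u * v * (1 - u) * (1 - v))   # Jacobian of the logit map
}

f(0, 0, 0)          # equals 1 / (2 * pi)
f.s(0.5, 0.5, 0)    # equals 16 / (2 * pi)

# A midpoint-rule check that f.s integrates to roughly one over the unit square.
grid <- (1:400 - 0.5) / 400
mean(outer(grid, grid, f.s, rho = 0.5))
```

Writing `f.s` in terms of `f` keeps the two equations consistent by construction: only the Jacobian factor differs.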
3 Direct Simulation Methods

In this section, we present three of the most popular direct simulation methods: importance sampling, rejection sampling and sequential Monte Carlo sampling. Common to all three methods is that samples are not obtained iteratively (that is, depending on previous samples), and that all samples obtained are used for statistical inference.

3.1 Importance Sampling

We begin by introducing the classical Monte Carlo algorithm, see Wasserman (2004) and Robert and Casella (2004), for approximating the integral associated with $E(h(\theta))$ for some function $h$. The classical Monte Carlo algorithm begins by drawing $N$ samples $\theta^{(i)}$, $i = 1, \dots, N$, uniformly over $\Theta$ and approximates the integral by the sample mean of $h(\theta^{(i)}) f(\theta^{(i)})$. Importance sampling extends this by drawing samples from a trial distribution $g$. A more efficient algorithm is obtained when $g$ is close to $f$. Importance sampling produces weighted samples, with weights given by the ratio $f/g$. One can either work directly with the weighted samples, or resample with respect to the weights to obtain a set of unweighted samples. We give the algorithm below.

Importance Sampling algorithm: (IS)
1. Draw $N$ samples $\theta^{(1)}, \dots, \theta^{(N)}$ from $g(\theta)$.
2. Evaluate the weights $f(\theta^{(i)}) / g(\theta^{(i)})$, for $i = 1, \dots, N$.

Example: Since the trial distribution $g$ has to be chosen to have the same support as the target distribution $f$, we transform $\theta_1$ and $\theta_2$ onto the unit square and work with the transformed function $f_s$ of Equation 2. Note that better trial distributions may be found, although the process is non-trivial.
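We provide the R code for the importance sampler below.

    is.bvn <- function(N, rho) {
        # draw from the uniform trial distribution on the unit square
        theta1 <- runif(N); theta2 <- runif(N)
        # importance weights f.s / g, with g = 1 on the unit square
        weights <- f.s(theta1, theta2, rho)
        # resample with respect to the weights to obtain unweighted samples
        ind <- sample(c(1:N), replace = TRUE, prob = weights)
        # transform back to the original parameter space
        return(list(theta1 = log(theta1[ind] / (1 - theta1[ind])),
                    theta2 = log(theta2[ind] / (1 - theta2[ind]))))
    }

The function f.s(theta1, theta2, rho) computes the transformed function of Equation 2.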
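Section 3.1 notes that one can also work directly with the weighted samples instead of resampling. A minimal sketch of that alternative follows, assuming the same uniform trial distribution and a reconstruction of `f.s` from Equation 2. The sample size, the seed, the choice of summaries and the effective sample size diagnostic are illustrative choices, not part of the original study.

```r
# Transformed target of Equation 2 (reconstructed so the sketch is self-contained).
f.s <- function(u, v, rho) {
  s <- log(u / (1 - u)); t <- log(v / (1 - v))
  exp(-(s^2 + t^2 - 2 * rho * s * t) / (2 * (1 - rho^2))) /
    (2 * pi * sqrt(1 - rho^2) * u * v * (1 - u) * (1 - v))
}

set.seed(1)
N <- 1e5; rho <- 0.5
u <- runif(N); v <- runif(N)            # draws from the trial density g = Unif(0, 1)^2
w <- f.s(u, v, rho)                     # weights f.s / g, with g = 1
w <- w / sum(w)                         # normalise the weights to sum to one
theta1 <- log(u / (1 - u))              # back-transform to the original scale
theta2 <- log(v / (1 - v))

est.mean1 <- sum(w * theta1)            # weighted estimate of E(theta1), which is 0
est.cross <- sum(w * theta1 * theta2)   # weighted estimate of E(theta1 * theta2), which is rho
ess <- 1 / sum(w^2)                     # effective sample size of the weighted sample
```

Working with the weights avoids the extra Monte Carlo noise introduced by the resampling step, at the cost of carrying the weights through every later summary.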
3.2 Rejection Sampling

For a trial density $g$ and a constant $M$ such that $f(\theta) \le M g(\theta)$, the rejection sampling algorithm, Ripley (1987), is given by

Rejection Sampling algorithm: (RS)
1. Draw $N$ samples $\theta^{(1)}, \dots, \theta^{(N)}$ from $g(\theta)$.
2. Draw $N$ samples $u^{(1)}, \dots, u^{(N)}$ from Unif(0, 1).
3. Accept $\theta^{(i)}$ if $u^{(i)} \le \frac{f(\theta^{(i)})}{M g(\theta^{(i)})}$, for $i = 1, \dots, N$.

Unlike the importance sampling algorithm, rejection sampling produces independent and identically distributed samples from $f$. However, the constant $M$ must also be calculated, and it can be difficult to find in high-dimensional problems.

Example: Again, we use the transformed distribution of Equation 2, and take $g$ to be the Unif$(0, 1)^2$ distribution. Thus $M$ is taken as $f_s(0.5, 0.5)$, the value at the point corresponding to $\theta = (0, 0)$, that is, the mode of the distribution. The acceptance rate in this example was around 20%. We give R code for the rejection sampler below.

    rs.bvn <- function(N, rho) {
        # draw from the uniform trial distribution on the unit square
        theta1 <- runif(N); theta2 <- runif(N)
        # bounding constant, attained at the mode
        M <- f.s(0.5, 0.5, rho)
        # accept each draw with probability f.s / M
        ind <- runif(N) < (f.s(theta1, theta2, rho) / M)
        # transform the accepted draws back to the original parameter space
        return(list(theta1 = log(theta1[ind] / (1 - theta1[ind])),
                    theta2 = log(theta2[ind] / (1 - theta2[ind]))))
    }

3.3 Sequential Monte Carlo

The sequential Monte Carlo (SMC) sampler can be viewed as an extension of importance sampling that allows intermediary steps and propagates moves within each distribution. Crucially, SMC does not require an initial distribution with the same support as the target distribution. This can be considered a major advantage, particularly for high-dimensional problems. Furthermore, SMC is able to deal with far more complex problems by allowing corrections to the initial samples iteratively. As in importance sampling, SMC produces weighted samples. We give the algorithm below.
Sequential Monte Carlo algorithm: (SMC)
1. Draw $N$ samples $\theta_0^{(1)}, \dots, \theta_0^{(N)}$ from the initial distribution $f_0(\theta)$.
2. Initialise weights $w_0^{(i)} = 1$, $i = 1, \dots, N$. Set $t = 1$.
3. For $i = 1, \dots, N$,
   (a) Move samples $\theta_{t-1}^{(i)}$ according to the forward transition kernel $K_t(\theta_{t-1}^{(i)}, \theta_t^{(i)})$.
   (b) For some arbitrary backward transition kernel $L_{t-1}(\theta_t^{(i)}, \theta_{t-1}^{(i)})$, set weights
       $$w_t^{(i)} = w_{t-1}^{(i)} \, \frac{f_t(\theta_t^{(i)}) \, L_{t-1}(\theta_t^{(i)}, \theta_{t-1}^{(i)})}{f_{t-1}(\theta_{t-1}^{(i)}) \, K_t(\theta_{t-1}^{(i)}, \theta_t^{(i)})}.$$
4. If $\left[ \sum_{i=1}^{N} (w_t^{(i)})^2 \right]^{-1} < N/2$, with the weights normalised to sum to one, resample with replacement the samples $\{\theta_t^{(i)}\}$ with weights $\{w_t^{(i)}\}$ to obtain new samples $\{\theta_t^{(i)}\}$, and set weights $\{w_t^{(i)} = 1\}$.
5. Increment $t = t + 1$; if $t < T$, return to Step 3.

Here the initial distribution $f_0$ can be any distribution that we can sample from directly. The $f_t$ can be viewed as intermediary distributions bridging between the initial distribution and the final distribution $f_T$ from which samples are required. The number of intermediary distributions $T$, as well as the forward and backward transition kernels, are arbitrary, but consecutive distributions $f_i$, $f_{i+1}$ should be close. Del Moral et al. (2006) provide details on how to make these choices.

Example: Contrary to importance and rejection sampling, sequential Monte Carlo does not require the initial trial distribution to have the same support as the target distribution. However, in this example, for ease of comparison, we choose the initial trial distribution to be the same as the $g$ used in the previous examples, and let $f_0$ take uniform values over the unit square. Again we work with the transformed function $f_s$. We set $T = 11$, and $f_t = f_0^{1-\epsilon} f_s^{\epsilon}$, $\epsilon = 0, 0.1, \dots, 1$. We choose the forward transition kernel $K_t(\theta_{t-1}, \theta_t)$ to be a Beta random walk distribution, $\mathrm{Beta}\left(\frac{1000\,\theta_{t-1}}{1-\theta_{t-1}}, 1000\right)$, which has mean $\theta_{t-1}$, and let the backward kernel be $L_{t-1}(\theta_t, \theta_{t-1}) = \mathrm{Beta}\left(\frac{1000\,\theta_t}{1-\theta_t}, 1000\right)$. R code for the SMC sampler is given below; the R function f.t(theta1, theta2, rho, eps) computes the function $f_t$ above.

    smc.bvn <- function(N, rho) {
        nT <- 11                                   # number of distributions, T = 11
        theta1 <- runif(N); theta2 <- runif(N)     # samples from f_0
        weights <- rep(1, N); eps <- seq(0, 1, length = nT)
        for (i in c(1:(nT - 1))) {
            # forward Beta random walk move
            alpha1 <- theta1 * 1000 / (1 - theta1); alpha2 <- theta2 * 1000 / (1 - theta2)
            y1 <- rbeta(N, alpha1, 1000); y2 <- rbeta(N, alpha2, 1000)
            alphay1 <- y1 * 1000 / (1 - y1); alphay2 <- y2 * 1000 / (1 - y2)
            # incremental weight: target ratio times backward kernel over forward kernel
            ratio1 <- f.t(y1, y2, rho, eps[i + 1]) / f.t(theta1, theta2, rho, eps[i])
            ratio2 <- dbeta(theta1, alphay1, 1000) * dbeta(theta2, alphay2, 1000)
            ratio3 <- dbeta(y1, alpha1, 1000) * dbeta(y2, alpha2, 1000)
            weights <- weights * ratio1 * ratio2 / ratio3
            theta1 <- y1; theta2 <- y2
            # resample when the effective sample size of the normalised weights drops below N/2
            if ((1 / sum((weights / sum(weights))^2)) < N / 2) {
                ind <- sample(c(1:N), replace = TRUE, prob = weights)
                theta1 <- theta1[ind]; theta2 <- theta2[ind]
                weights <- rep(1, N)
            }
        }
        return(list(theta1 = log(theta1 / (1 - theta1)), theta2 = log(theta2 / (1 - theta2))))
    }

4 Iterative Simulation Methods

In this section, Markov chain Monte Carlo (MCMC) methods are presented. Unlike direct simulation methods, these methods rely on the construction of a Markov chain. Starting the Markov chain from an arbitrary point, standard MCMC theory guarantees that the chain will converge to the correct distribution; see Gilks et al. (1996) for more details. One crucial difference between the iterative methods and the direct simulation methods is that iterative methods produce serially correlated samples. It is vitally important that the initial portion of the MCMC sample be discarded (usually termed burn-in). The determination of the length of the burn-in and of the total length of the Markov chain is collectively known as convergence diagnostics. Cowles and Carlin (1996) give a comparative review of the various methods available in the literature for the assessment of convergence.

4.1 Gibbs Sampling

The Gibbs sampler is a Markov chain sampler that starts at an arbitrary initial state. The chain is then iteratively updated for some specified $N$ iterations.
At every iteration, it cycles through each of the $d$ components of the parameter $\theta = (\theta_1, \dots, \theta_d)$ in turn. Each parameter is updated to a new sample according to its distribution conditioned on the current values of all other parameters. Casella and George (1992) provide an easy-to-read explanation of how the Gibbs sampler works. Here we give the Gibbs sampling algorithm for sampling $\theta = (\theta_1, \dots, \theta_d)$.
Gibbs Sampling algorithm: (Gibbs)
1. Initialise $\theta_1^{(1)}, \dots, \theta_d^{(1)}$, set $i = 1$.
2. For $j = 1, \dots, d$,
   (a) Sample $\theta_1^{(i+1)}$ from the conditional distribution $f(\theta_1 \mid \theta_2^{(i)}, \dots, \theta_d^{(i)})$.
   (b) Sample $\theta_2^{(i+1)}$ from the conditional distribution $f(\theta_2 \mid \theta_1^{(i+1)}, \theta_3^{(i)}, \dots, \theta_d^{(i)})$.
   (c) $\vdots$
   (d) Sample $\theta_d^{(i+1)}$ from the conditional distribution $f(\theta_d \mid \theta_1^{(i+1)}, \dots, \theta_{d-1}^{(i+1)})$.
3. Increment $i = i + 1$; if $i < N$, return to Step 2.

The Gibbs sampler depends on the availability of the conditional distributions, from which direct sampling must be possible.

Example: Here we choose an arbitrary initial value. The full conditional distributions for $\theta_1$ and $\theta_2$ are available and can be sampled from directly. That is, $f(\theta_1^{(i)} \mid \theta_2^{(i-1)})$ is $N(\rho\theta_2^{(i-1)}, 1-\rho^2)$ and $f(\theta_2^{(i)} \mid \theta_1^{(i)})$ is $N(\rho\theta_1^{(i)}, 1-\rho^2)$. R code for the Gibbs sampler is provided below.

    gibbs.bvn <- function(N, rho, start) {
        theta1 <- rep(NA, N); theta2 <- rep(NA, N)
        theta1[1] <- start[1]; theta2[1] <- start[2]
        for (i in c(2:N)) {
            # simulate from the conditional distributions
            theta1[i] <- rnorm(1, rho * theta2[i - 1], sqrt(1 - (rho * rho)))
            theta2[i] <- rnorm(1, rho * theta1[i], sqrt(1 - (rho * rho)))
        }
        return(list(theta1 = theta1, theta2 = theta2))
    }

4.2 Adaptive Rejection Sampling

Gilks and Wild (1992) introduced adaptive rejection sampling for log-concave densities. The algorithm proceeds as in the Gibbs sampler, cycling through each of the $d$ univariate parameters in turn and sampling from the conditional densities. Whereas the Gibbs sampler requires these conditional densities to be standard distributions that are easy to sample from, the adaptive rejection sampling method will work for any log-concave conditional density. Specifically, the difference between the two algorithms is in Step 2 of the Gibbs sampler. We describe the adaptive rejection sampler for updating the $j$th parameter $\theta_j$ in Step 2 of the Gibbs algorithm. For some initial abscissae containing $K$ points $T_K = \{x_k, k = 1, \dots, K\}$, $x_1 \le x_2 \le \dots \le x_K$, over the parameter space of $\theta_j$, let

$$h(\theta_j) = \ln f(\theta_j \mid \theta_1^{(i+1)}, \dots, \theta_{j-1}^{(i+1)}, \theta_{j+1}^{(i)}, \dots, \theta_d^{(i)}),$$

and

$$z_k = \frac{h(x_{k+1}) - h(x_k) - x_{k+1} h'(x_{k+1}) + x_k h'(x_k)}{h'(x_k) - h'(x_{k+1})}, \quad k = 1, \dots, K-1,$$

$$u(\theta) = h(x_k) + (\theta - x_k) h'(x_k), \quad \theta \in [z_{k-1}, z_k], \quad k = 1, \dots, K,$$

$$l(\theta) = \frac{(x_{k+1} - \theta) h(x_k) + (\theta - x_k) h(x_{k+1})}{x_{k+1} - x_k}, \quad \theta \in [x_k, x_{k+1}], \quad k = 1, \dots, K-1,$$

$$s(\theta) = \frac{\exp u(\theta)}{\int \exp u(\theta') \, d\theta'}, \quad \theta \in [z_{k-1}, z_k], \quad k = 1, \dots, K.$$

Here $z_0$ and $z_K$ are the lower and upper bounds on the support of $\theta_j$ respectively, and $l(\theta) = -\infty$ for $\theta < x_1$ or $\theta > x_K$. The algorithm below samples from the conditional density $f(\theta_j \mid \theta_1^{(i+1)}, \dots, \theta_{j-1}^{(i+1)}, \theta_{j+1}^{(i)}, \dots, \theta_d^{(i)})$.

Adaptive Rejection algorithm: (ARS)
1. Initialise the $K$ abscissae $T_K = \{x_k, k = 1, \dots, K\}$.
2. Sample $y$ from $s(\theta)$ and sample $w$ from Unif(0, 1). If $w \le \exp\{l(y) - u(y)\}$, set $\theta_j^{(i+1)} = y$. Otherwise go to Step 3.
3. If $w \le \exp\{h(y) - u(y)\}$, set $\theta_j^{(i+1)} = y$. Otherwise go to Step 4.
4. Set $T_{K+1} = T_K \cup \{y\}$, $K = K + 1$ and go to Step 2.

Example: We implemented the example in WinBUGS, which requires specific coding that is generally based on statistical model specifications. Since our example here is artificial and not model based, we needed to use some of the tricks from Spiegelhalter et al. (2003) to sample from Equation 1. We do not include this code here.

4.3 Slice Sampling

The slice sampler generates a random sample from a given distribution by using an auxiliary variable. We give the algorithm for the slice sampler based on the single-variable slice sampler. As with the ARS and Gibbs samplers, each parameter is updated in turn. We give the algorithm for updating the $j$th parameter in Step 2 of the Gibbs algorithm.
Slice Sampling algorithm: (SLI)
1. Sample $u$ from Unif$(0, f(\theta_j^{(i)} \mid \theta_1^{(i+1)}, \dots, \theta_{j-1}^{(i+1)}, \theta_{j+1}^{(i)}, \dots, \theta_d^{(i)}))$.
2. Sample $\theta_j^{(i+1)}$ uniformly from the set $A = \{\theta_j : f(\theta_j \mid \theta_1^{(i+1)}, \dots, \theta_{j-1}^{(i+1)}, \theta_{j+1}^{(i)}, \dots, \theta_d^{(i)}) > u\}$.

The algorithm we gave is the simplest form of the slice sampler. Multivariate updates using the slice sampler are also possible. However, these are far more complex; see Neal (2003) for further details.

Example: We will assume that we cannot sample easily from the conditional distributions of $\theta_1$ and $\theta_2$ here. R code for the slice sampler is given below.

    sli.bvn <- function(N, rho, start) {
        theta1 <- rep(NA, N); theta2 <- rep(NA, N)
        theta1[1] <- start[1]; theta2[1] <- start[2]
        for (i in c(2:N)) {
            # update theta1: draw the auxiliary variable, then sample uniformly on the slice
            u <- runif(1, 0, f(theta1[i - 1], theta2[i - 1], rho))
            x1 <- left(theta2[i - 1], u, rho)
            x2 <- right(theta2[i - 1], u, rho)
            theta1[i] <- runif(1, x1, x2)
            # update theta2 in the same way, given the new theta1
            u <- runif(1, 0, f(theta1[i], theta2[i - 1], rho))
            y1 <- left(theta1[i], u, rho)
            y2 <- right(theta1[i], u, rho)
            theta2[i] <- runif(1, y1, y2)
        }
        return(list(theta1 = theta1, theta2 = theta2))
    }

Here the function f(theta1, theta2, rho) computes the density of Equation 1, and the functions left(theta, u, rho) and right(theta, u, rho) return the left and right limits of the interval $A$. Solving $f > u$ in Equation 1 for the parameter being updated, with the other parameter fixed at $\theta$, gives the limits

$$\rho\theta \pm \sqrt{-(1-\rho^2)\left[ 2\log\left(2\pi\sqrt{1-\rho^2}\, u\right) + \theta^2 \right]}.$$

4.4 Metropolis-Hastings Sampling

Metropolis-Hastings algorithms rely on the construction of a reversible Markov chain. At each iteration of the chain, a candidate sample is proposed from an arbitrary candidate generating function $Q$; this sample is then either accepted or rejected according to an acceptance ratio. Chib and Greenberg (1995) provide an expository article on the algorithm. The general Metropolis-Hastings algorithm is given below:
Metropolis-Hastings Algorithm: (MH)
1. Initialise $\theta^{(1)}$, set $i = 1$.
2. Generate $y$ from $Q(\theta^{(i)}, \cdot)$ and $U$ from Unif(0, 1).
3. Let $\theta^{(i+1)} = y$ if $U \le \min\left(1, \frac{f(y) Q(y, \theta^{(i)})}{f(\theta^{(i)}) Q(\theta^{(i)}, y)}\right)$, otherwise let $\theta^{(i+1)} = \theta^{(i)}$.
4. Increment $i = i + 1$; if $i < N$, return to Step 2.

The quantity $\min\left(1, \frac{f(y) Q(y, \theta^{(i)})}{f(\theta^{(i)}) Q(\theta^{(i)}, y)}\right)$ is commonly referred to as the acceptance probability. $Q(\theta^{(i)}, y)$ is an arbitrary candidate generating function giving the probability of the new point $y$ given the current point $\theta^{(i)}$.

Note that the algorithm given above updates the $d$-dimensional parameter vector simultaneously. However, it is also possible to update smaller blocks of size $1 \le s < d$ and cycle through each block in turn, in the manner of the Gibbs sampler. Note that when $s = 1$ for all blocks, and the candidate generating function is the full conditional distribution, the Metropolis-Hastings sampler is equivalent to the Gibbs sampler. A combination of Gibbs and Metropolis-Hastings moves is called the hybrid sampler; we do not give further details here but refer the reader to Gilks et al. (1996) for further reading.

Here we give two of the most popular MH samplers in detail, the random walk Metropolis-Hastings sampler (RW-MH) and the independence sampler (IND-MH). The RW-MH takes $Q(\theta^{(i)}, y) = g(|y - \theta^{(i)}|)$ as the candidate generating function, where $Q$ is symmetric with $Q(\theta^{(i)}, y) = Q(y, \theta^{(i)})$; hence the corresponding acceptance probability is given by $\min\left(1, \frac{f(y)}{f(\theta^{(i)})}\right)$. The IND-MH takes as candidate generating function $Q(\theta^{(i)}, y) = g(y)$, where $g$ is independent of the current state of the chain. Note here that the acceptance probability

$$\min\left(1, \frac{f(y) g(\theta^{(i)})}{f(\theta^{(i)}) g(y)}\right) = \min\left(1, \frac{f(y)/g(y)}{f(\theta^{(i)})/g(\theta^{(i)})}\right)$$

is the ratio of the weights used in importance sampling.

Example: For the RW-MH we let the proposal distribution $Q$ be a bivariate normal distribution.
We take the mean to be the value of the current iteration $\theta^{(i)}$, and we tune the covariance matrix to obtain an acceptance rate close to the optimal value; see Roberts and Rosenthal (2001) for more details on this acceptance rate calculation. Here, the sampler works with respect to Equation 1. The R code below for the RW-MH requires the mvtnorm package.

    library(mvtnorm)
    rw.bvn <- function(N, rho, start) {
        theta1 <- rep(NA, N); theta2 <- rep(NA, N)
        theta1[1] <- start[1]; theta2[1] <- start[2]
        Id <- matrix(c(1, 0, 0, 1), ncol = 2, byrow = TRUE)
        sigma <- matrix(c(1, rho, rho, 1), ncol = 2, byrow = TRUE)
        for (i in c(2:N)) {
            # propose from a bivariate normal centred at the current state
            prop <- rmvnorm(1, c(theta1[i - 1], theta2[i - 1]), 5.6 * Id)
            # ratio of target densities at the proposed and current states
            accept <- dmvnorm(prop, c(0, 0), sigma) /
                dmvnorm(c(theta1[i - 1], theta2[i - 1]), c(0, 0), sigma)
            if (runif(1) < accept) {
                theta1[i] <- prop[1]; theta2[i] <- prop[2]
            } else {
                theta1[i] <- theta1[i - 1]; theta2[i] <- theta2[i - 1]
            }
        }
        return(list(theta1 = theta1, theta2 = theta2))
    }

For the independence sampler, we take the candidate distribution $Q$ to be Unif(0, 1). Again, the proposal distribution needs to have the same support as the target distribution, otherwise parts of the target distribution will never be visited by the chain. Thus we choose the Uniform distribution and again use the transformed bivariate normal distribution, so that this algorithm is broadly comparable with the direct simulation methods. We omit the R code for the IND-MH here, as it differs from the RW-MH only in the candidate generating function and the consequent calculation of the acceptance probability.

5 Results of Comparisons

We used the same example throughout this paper. For direct simulation methods, we transformed our bivariate Normal distribution onto the unit square and used a Uniform distribution over the unit square as the trial/initial distribution. We note that this is not the optimal choice of trial distribution, but a convenient one. We kept user-specified choices as similar as possible across all our algorithms to facilitate comparison. For direct simulation methods, we drew $N = 1{,}000$ samples each. For the iterative methods, we chose independent random starting points, discarded 500 initial samples as burn-in and used the remaining 1,000 samples for inference. We calculated the mean squared errors of quantile estimators for all samplers using 5 replications; results are shown in Table 1.
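The MSE calculation described above can be sketched as follows for the Gibbs sampler. This is an illustrative reconstruction, not the authors' script: the value of `rho`, the seed, the variable names and the use of `quantile()` with its default estimator are assumptions, while the burn-in of 500, the 1,000 retained samples, the 5 replications and the summation over $\theta_1$ and $\theta_2$ follow the text and the caption of Table 1.

```r
# Gibbs sampler for the bivariate Normal benchmark (Section 4.1).
gibbs.bvn <- function(N, rho, start) {
  theta1 <- rep(NA, N); theta2 <- rep(NA, N)
  theta1[1] <- start[1]; theta2[1] <- start[2]
  for (i in 2:N) {
    theta1[i] <- rnorm(1, rho * theta2[i - 1], sqrt(1 - rho^2))
    theta2[i] <- rnorm(1, rho * theta1[i], sqrt(1 - rho^2))
  }
  list(theta1 = theta1, theta2 = theta2)
}

set.seed(2007)
probs <- c(0.025, 0.25, 0.5, 0.75, 0.975)
true.q <- qnorm(probs)                 # both marginals are N(0, 1)
n.rep <- 5; burn <- 500; N <- 1500; rho <- 0.5

sq.err <- matrix(NA, n.rep, length(probs))
for (r in 1:n.rep) {
  out <- gibbs.bvn(N, rho, start = rnorm(2))   # independent random starting point
  keep <- (burn + 1):N                         # discard the burn-in
  q1 <- quantile(out$theta1[keep], probs)
  q2 <- quantile(out$theta2[keep], probs)
  # squared errors, summed over theta1 and theta2
  sq.err[r, ] <- (q1 - true.q)^2 + (q2 - true.q)^2
}
mse <- colMeans(sq.err)                # average over the replications
names(mse) <- paste0("q", probs)
```

With only 5 replications the resulting MSE values are themselves noisy, so small differences between samplers should be read with caution.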
We restrict our comparisons to be within the direct and the iterative simulation methods separately. The direct simulation methods were not optimised, whereas the iterative simulations were in a sense optimised (with the exception of the IND-MH, which is comparable to the direct simulation methods); this may explain the apparently smaller MSE values for the iterative methods.

For the direct simulation methods, the MSE performances are similar for all three methods for the quantiles that are close to the mode of the distribution, i.e., $\theta_{0.25}$, $\theta_{0.5}$, $\theta_{0.75}$. Some differences for the tails, $\theta_{0.025}$ and $\theta_{0.975}$, are apparent, though all samplers deteriorate as the correlation $\rho$ between the parameters increases. The IS appears to have a small advantage over the other two methods, particularly for small $\rho$, i.e., when the two parameters are nearly independent. We note that SMC has many possibilities for improvement; it is potentially a better method, particularly in high dimensions. However, such improvement is beyond the capabilities of the general user.

For iterative simulations, it is clear that the Gibbs type algorithms (GIBBS, ARS and SLI) gave better MSEs than the MH (RW and IND) methods, particularly in the tails. In results not shown here, for $\rho = 0.99$, the RW-MH algorithm appears to converge faster than the Gibbs type algorithms. Hence block updating of highly correlated parameters may be preferable here. Perhaps more interestingly, the slice sampler appeared to converge faster than the Gibbs sampler. In an informal comparison, it took about 5,500 iterations for the Gibbs sampler to converge at $\rho = 0.99$, compared with about 3,500 for the single-variable slice sampler. Roberts and Rosenthal (1999) give some theoretical support for why this may occur for a two-dimensional parameter space.

Table 2 gives a list of the main features of each method, together with an estimate of their computational cost in terms of running time. Clearly, in terms of computational time, IS and RS are much faster than SMC. For the MCMC methods, both the RW-MH and SLI samplers require large computational time, with Gibbs the fastest. We note that ARS is computed using WinBUGS, not R, and hence is more efficient.

6 Discussions and recommendations

Although IS is by far the easiest algorithm to use, RS produces i.i.d. samples. We recommend the use of rejection sampling (RS) when the dimension of the target distribution is small, i.e., one or two dimensions, and a good enveloping function $g$ and constant $M$ can easily be found. Transforming the target distribution onto a bounded space, as we did in our example, may sometimes help the search for $g$. In higher dimensions, SMC and MCMC will be preferable. Although the application of SMC requires more tuning than MCMC, it does not require the additional computation of convergence assessment.
Of the MCMC methods, we recommend using a combination of the Gibbs sampler, when the full conditional distributions can easily be sampled from, and the single-variable slice sampler. When the parameters are known to be highly correlated, we recommend updating these parameters in a block using RW-MH, tuned to have an acceptance rate close to the optimal value discussed in Section 4.4. Finally, if the user is familiar with the WinBUGS language, statistical models can be fitted using ARS; however, any flexibility to block update highly correlated parameters is lost. We note that the slice sampler is not restricted to updating one parameter at a time: multivariate slice samplers can also be used instead of the RW-MH for block updating, but the algorithm is far more complex. However, the slice sampler does not require user-specific tuning, and has the potential to be made into generic software. Given its nice properties, supported by our simulation study, this would be highly recommended.

Acknowledgements

The authors wish to thank the Faculty of Science and the School of Mathematics and Statistics at UNSW. The second author was supported by an FRG (UNSW) grant.
References

Casella, G. and E. I. George (1992). Explaining the Gibbs sampler. Journal of the American Statistical Association 46.
Chib, S. and E. Greenberg (1995). Understanding the Metropolis-Hastings algorithm. The American Statistician 49.
Cowles, M. K. and B. P. Carlin (1996). Markov chain Monte Carlo convergence diagnostics: A comparative review. Journal of the American Statistical Association 91.
Del Moral, P., A. Doucet, and A. Jasra (2006). Sequential Monte Carlo samplers. Journal of the Royal Statistical Society: Series B 68.
Gilks, W. R., S. Richardson, and D. J. Spiegelhalter (1996). Markov Chain Monte Carlo in Practice. Chapman and Hall.
Gilks, W. R. and P. Wild (1992). Adaptive rejection sampling for Gibbs sampling. Applied Statistics 41.
Neal, R. M. (2003). Slice sampling. Annals of Statistics 31.
Ripley, B. D. (1987). Stochastic Simulation. John Wiley and Sons.
Robert, C. P. and G. Casella (2004). Monte Carlo Statistical Methods (2nd ed.). Springer Verlag.
Roberts, G. and J. S. Rosenthal (2001). Optimal scaling for various Metropolis-Hastings algorithms. Statistical Science 16.
Roberts, G. O. and J. S. Rosenthal (1999). Convergence of slice sampler Markov chains. Journal of the Royal Statistical Society: Series B 61.
Spiegelhalter, D. J., A. Thomas, N. G. Best, and D. Lunn (2003). WinBUGS version 1.4 User Manual. MRC Biostatistics Unit, Cambridge.
Wasserman, L. (2004). All of Statistics - A Concise Course in Statistical Inference. Springer Verlag.
Table 1: Comparisons of MSE (sum of $\theta_1$ and $\theta_2$) for direct and iterative simulation methods.
[Table body: MSE by value of $\rho$ for each of the five quantiles (including $\theta_{0.25}$, $\theta_{0.5}$ and $\theta_{0.75}$). Direct simulation rows: IS, RS, SMC. Iterative simulation rows: GIBBS, ARS, SLI, RW-MH, IND-MH. The numerical entries did not survive transcription.]
Table 2: Summary table for direct and iterative simulation methods. Process time is the amount of real time elapsed, in seconds, for computing 10,000 iterations on a Pentium machine.

    Method   Time   Advantages and disadvantages
    IS       0.04   produces weighted samples; requires an enveloping function g
    RS       0.04   produces i.i.d. samples; requires an enveloping function g; calculation of M
    SMC      1.51   suitable for high dimensions; requires tuning; produces weighted samples; consecutive distributions should be close
    GIBBS    0.84   easy implementation; need to sample directly from cond. distr.; no block updating
    RW-MH           easy implementation; requires tuning; block update
    IND-MH   3.00   easy to implement; requires a good proposal distribution; block update
    ARS      5.00   implemented in WinBUGS; log-concave densities only; no block updating
    SLI             no tuning is required; implementation is more complicated; block update
1 Introduction Simulating from the Polya posterior by Glen Meeden, glen@stat.umn.edu March 06 The Polya posterior is an objective Bayesian approach to finite population sampling. In its simplest form it
More informationSTAT 725 Notes Monte Carlo Integration
STAT 725 Notes Monte Carlo Integration Two major classes of numerical problems arise in statistical inference: optimization and integration. We have already spent some time discussing different optimization
More informationNested Sampling: Introduction and Implementation
UNIVERSITY OF TEXAS AT SAN ANTONIO Nested Sampling: Introduction and Implementation Liang Jing May 2009 1 1 ABSTRACT Nested Sampling is a new technique to calculate the evidence, Z = P(D M) = p(d θ, M)p(θ
More informationClustering Relational Data using the Infinite Relational Model
Clustering Relational Data using the Infinite Relational Model Ana Daglis Supervised by: Matthew Ludkin September 4, 2015 Ana Daglis Clustering Data using the Infinite Relational Model September 4, 2015
More informationAMCMC: An R interface for adaptive MCMC
AMCMC: An R interface for adaptive MCMC by Jeffrey S. Rosenthal * (February 2007) Abstract. We describe AMCMC, a software package for running adaptive MCMC algorithms on user-supplied density functions.
More informationBayesian Spatiotemporal Modeling with Hierarchical Spatial Priors for fmri
Bayesian Spatiotemporal Modeling with Hierarchical Spatial Priors for fmri Galin L. Jones 1 School of Statistics University of Minnesota March 2015 1 Joint with Martin Bezener and John Hughes Experiment
More informationSampling informative/complex a priori probability distributions using Gibbs sampling assisted by sequential simulation
Sampling informative/complex a priori probability distributions using Gibbs sampling assisted by sequential simulation Thomas Mejer Hansen, Klaus Mosegaard, and Knud Skou Cordua 1 1 Center for Energy Resources
More informationLinear Modeling with Bayesian Statistics
Linear Modeling with Bayesian Statistics Bayesian Approach I I I I I Estimate probability of a parameter State degree of believe in specific parameter values Evaluate probability of hypothesis given the
More informationarxiv: v3 [stat.co] 27 Apr 2012
A multi-point Metropolis scheme with generic weight functions arxiv:1112.4048v3 stat.co 27 Apr 2012 Abstract Luca Martino, Victor Pascual Del Olmo, Jesse Read Department of Signal Theory and Communications,
More informationMCMC Diagnostics. Yingbo Li MATH Clemson University. Yingbo Li (Clemson) MCMC Diagnostics MATH / 24
MCMC Diagnostics Yingbo Li Clemson University MATH 9810 Yingbo Li (Clemson) MCMC Diagnostics MATH 9810 1 / 24 Convergence to Posterior Distribution Theory proves that if a Gibbs sampler iterates enough,
More informationPackage batchmeans. R topics documented: July 4, Version Date
Package batchmeans July 4, 2016 Version 1.0-3 Date 2016-07-03 Title Consistent Batch Means Estimation of Monte Carlo Standard Errors Author Murali Haran and John Hughes
More informationA noninformative Bayesian approach to small area estimation
A noninformative Bayesian approach to small area estimation Glen Meeden School of Statistics University of Minnesota Minneapolis, MN 55455 glen@stat.umn.edu September 2001 Revised May 2002 Research supported
More informationIssues in MCMC use for Bayesian model fitting. Practical Considerations for WinBUGS Users
Practical Considerations for WinBUGS Users Kate Cowles, Ph.D. Department of Statistics and Actuarial Science University of Iowa 22S:138 Lecture 12 Oct. 3, 2003 Issues in MCMC use for Bayesian model fitting
More informationPackage mcmcse. February 15, 2013
Package mcmcse February 15, 2013 Version 1.0-1 Date 2012 Title Monte Carlo Standard Errors for MCMC Author James M. Flegal and John Hughes Maintainer James M. Flegal
More informationThe Cross-Entropy Method for Mathematical Programming
The Cross-Entropy Method for Mathematical Programming Dirk P. Kroese Reuven Y. Rubinstein Department of Mathematics, The University of Queensland, Australia Faculty of Industrial Engineering and Management,
More informationStochastic Function Norm Regularization of DNNs
Stochastic Function Norm Regularization of DNNs Amal Rannen Triki Dept. of Computational Science and Engineering Yonsei University Seoul, South Korea amal.rannen@yonsei.ac.kr Matthew B. Blaschko Center
More informationMSA101/MVE Lecture 5
MSA101/MVE187 2017 Lecture 5 Petter Mostad Chalmers University September 12, 2017 1 / 15 Importance sampling MC integration computes h(x)f (x) dx where f (x) is a probability density function, by simulating
More informationPost-Processing for MCMC
ost-rocessing for MCMC Edwin D. de Jong Marco A. Wiering Mădălina M. Drugan institute of information and computing sciences, utrecht university technical report UU-CS-23-2 www.cs.uu.nl ost-rocessing for
More informationMonte Carlo for Spatial Models
Monte Carlo for Spatial Models Murali Haran Department of Statistics Penn State University Penn State Computational Science Lectures April 2007 Spatial Models Lots of scientific questions involve analyzing
More informationJournal of Statistical Software
JSS Journal of Statistical Software December 2007, Volume 23, Issue 9. http://www.jstatsoft.org/ WinBUGSio: A SAS Macro for the Remote Execution of WinBUGS Michael K. Smith Pfizer Global Research and Development
More informationA guided walk Metropolis algorithm
Statistics and computing (1998) 8, 357±364 A guided walk Metropolis algorithm PAUL GUSTAFSON Department of Statistics, University of British Columbia Vancouver, B.C., Canada V6T 1Z2 Submitted July 1997
More informationPackage MfUSampler. June 13, 2017
Package MfUSampler June 13, 2017 Type Package Title Multivariate-from-Univariate (MfU) MCMC Sampler Version 1.0.4 Date 2017-06-09 Author Alireza S. Mahani, Mansour T.A. Sharabiani Maintainer Alireza S.
More informationDynamic Thresholding for Image Analysis
Dynamic Thresholding for Image Analysis Statistical Consulting Report for Edward Chan Clean Energy Research Center University of British Columbia by Libo Lu Department of Statistics University of British
More informationOverview. Monte Carlo Methods. Statistics & Bayesian Inference Lecture 3. Situation At End Of Last Week
Statistics & Bayesian Inference Lecture 3 Joe Zuntz Overview Overview & Motivation Metropolis Hastings Monte Carlo Methods Importance sampling Direct sampling Gibbs sampling Monte-Carlo Markov Chains Emcee
More informationHybrid Quasi-Monte Carlo Method for the Simulation of State Space Models
The Tenth International Symposium on Operations Research and Its Applications (ISORA 211) Dunhuang, China, August 28 31, 211 Copyright 211 ORSC & APORC, pp. 83 88 Hybrid Quasi-Monte Carlo Method for the
More informationThe Cross-Entropy Method
The Cross-Entropy Method Guy Weichenberg 7 September 2003 Introduction This report is a summary of the theory underlying the Cross-Entropy (CE) method, as discussed in the tutorial by de Boer, Kroese,
More informationRolling Markov Chain Monte Carlo
Rolling Markov Chain Monte Carlo Din-Houn Lau Imperial College London Joint work with Axel Gandy 4 th July 2013 Predict final ranks of the each team. Updates quick update of predictions. Accuracy control
More informationBayesian Modelling with JAGS and R
Bayesian Modelling with JAGS and R Martyn Plummer International Agency for Research on Cancer Rencontres R, 3 July 2012 CRAN Task View Bayesian Inference The CRAN Task View Bayesian Inference is maintained
More informationAn Introduction to Markov Chain Monte Carlo
An Introduction to Markov Chain Monte Carlo Markov Chain Monte Carlo (MCMC) refers to a suite of processes for simulating a posterior distribution based on a random (ie. monte carlo) process. In other
More informationRolling Markov Chain Monte Carlo
Rolling Markov Chain Monte Carlo Din-Houn Lau Imperial College London Joint work with Axel Gandy 4 th September 2013 RSS Conference 2013: Newcastle Output predicted final ranks of the each team. Updates
More informationModified Metropolis-Hastings algorithm with delayed rejection
Modified Metropolis-Hastings algorithm with delayed reection K.M. Zuev & L.S. Katafygiotis Department of Civil Engineering, Hong Kong University of Science and Technology, Hong Kong, China ABSTRACT: The
More informationAn Efficient Model Selection for Gaussian Mixture Model in a Bayesian Framework
IEEE SIGNAL PROCESSING LETTERS, VOL. XX, NO. XX, XXX 23 An Efficient Model Selection for Gaussian Mixture Model in a Bayesian Framework Ji Won Yoon arxiv:37.99v [cs.lg] 3 Jul 23 Abstract In order to cluster
More informationAssessing the Quality of the Natural Cubic Spline Approximation
Assessing the Quality of the Natural Cubic Spline Approximation AHMET SEZER ANADOLU UNIVERSITY Department of Statisticss Yunus Emre Kampusu Eskisehir TURKEY ahsst12@yahoo.com Abstract: In large samples,
More informationImage analysis. Computer Vision and Classification Image Segmentation. 7 Image analysis
7 Computer Vision and Classification 413 / 458 Computer Vision and Classification The k-nearest-neighbor method The k-nearest-neighbor (knn) procedure has been used in data analysis and machine learning
More informationBayesian data analysis using R
Bayesian data analysis using R BAYESIAN DATA ANALYSIS USING R Jouni Kerman, Samantha Cook, and Andrew Gelman Introduction Bayesian data analysis includes but is not limited to Bayesian inference (Gelman
More informationEstimation of Item Response Models
Estimation of Item Response Models Lecture #5 ICPSR Item Response Theory Workshop Lecture #5: 1of 39 The Big Picture of Estimation ESTIMATOR = Maximum Likelihood; Mplus Any questions? answers Lecture #5:
More informationSamuel Coolidge, Dan Simon, Dennis Shasha, Technical Report NYU/CIMS/TR
Detecting Missing and Spurious Edges in Large, Dense Networks Using Parallel Computing Samuel Coolidge, sam.r.coolidge@gmail.com Dan Simon, des480@nyu.edu Dennis Shasha, shasha@cims.nyu.edu Technical Report
More informationMonte Carlo sampling
1 y u theta 0 x 1 Monte Carlo sampling Problem 1 Suppose we want to sample uniformly at random from the triangle defined by the points (0,0), (0,1), (1,0). First Sampling Algorithm: We decide to do this
More informationCS281 Section 9: Graph Models and Practical MCMC
CS281 Section 9: Graph Models and Practical MCMC Scott Linderman November 11, 213 Now that we have a few MCMC inference algorithms in our toolbox, let s try them out on some random graph models. Graphs
More informationConvexization in Markov Chain Monte Carlo
in Markov Chain Monte Carlo 1 IBM T. J. Watson Yorktown Heights, NY 2 Department of Aerospace Engineering Technion, Israel August 23, 2011 Problem Statement MCMC processes in general are governed by non
More informationApproximate Bayesian Computation methods and their applications for hierarchical statistical models. University College London, 2015
Approximate Bayesian Computation methods and their applications for hierarchical statistical models University College London, 2015 Contents 1. Introduction 2. ABC methods 3. Hierarchical models 4. Application
More informationParallel Multivariate Slice Sampling
Parallel Multivariate Slice Sampling Matthew M. Tibbits Department of Statistics Pennsylvania State University mmt143@stat.psu.edu Murali Haran Department of Statistics Pennsylvania State University mharan@stat.psu.edu
More informationAn Analysis of a Variation of Hit-and-Run for Uniform Sampling from General Regions
An Analysis of a Variation of Hit-and-Run for Uniform Sampling from General Regions SEKSAN KIATSUPAIBUL Chulalongkorn University ROBERT L. SMITH University of Michigan and ZELDA B. ZABINSKY University
More informationModel validation through "Posterior predictive checking" and "Leave-one-out"
Workshop of the Bayes WG / IBS-DR Mainz, 2006-12-01 G. Nehmiz M. Könen-Bergmann Model validation through "Posterior predictive checking" and "Leave-one-out" Overview The posterior predictive distribution
More informationBART STAT8810, Fall 2017
BART STAT8810, Fall 2017 M.T. Pratola November 1, 2017 Today BART: Bayesian Additive Regression Trees BART: Bayesian Additive Regression Trees Additive model generalizes the single-tree regression model:
More informationBayesian Analysis of Extended Lomax Distribution
Bayesian Analysis of Extended Lomax Distribution Shankar Kumar Shrestha and Vijay Kumar 2 Public Youth Campus, Tribhuvan University, Nepal 2 Department of Mathematics and Statistics DDU Gorakhpur University,
More informationMonte Carlo Methods and Statistical Computing: My Personal E
Monte Carlo Methods and Statistical Computing: My Personal Experience Department of Mathematics & Statistics Indian Institute of Technology Kanpur November 29, 2014 Outline Preface 1 Preface 2 3 4 5 6
More informationA Random Number Based Method for Monte Carlo Integration
A Random Number Based Method for Monte Carlo Integration J Wang and G Harrell Department Math and CS, Valdosta State University, Valdosta, Georgia, USA Abstract - A new method is proposed for Monte Carlo
More informationMissing Data Analysis for the Employee Dataset
Missing Data Analysis for the Employee Dataset 67% of the observations have missing values! Modeling Setup Random Variables: Y i =(Y i1,...,y ip ) 0 =(Y i,obs, Y i,miss ) 0 R i =(R i1,...,r ip ) 0 ( 1
More informationLevel-set MCMC Curve Sampling and Geometric Conditional Simulation
Level-set MCMC Curve Sampling and Geometric Conditional Simulation Ayres Fan John W. Fisher III Alan S. Willsky February 16, 2007 Outline 1. Overview 2. Curve evolution 3. Markov chain Monte Carlo 4. Curve
More informationA One-Pass Sequential Monte Carlo Method for Bayesian Analysis of Massive Datasets
A One-Pass Sequential Monte Carlo Method for Bayesian Analysis of Massive Datasets Suhrid Balakrishnan and David Madigan Department of Computer Science, Rutgers University, Piscataway, NJ 08854, USA Department
More informationFitting Social Network Models Using the Varying Truncation S. Truncation Stochastic Approximation MCMC Algorithm
Fitting Social Network Models Using the Varying Truncation Stochastic Approximation MCMC Algorithm May. 17, 2012 1 This talk is based on a joint work with Dr. Ick Hoon Jin Abstract The exponential random
More informationApproximate Bayesian Computation using Auxiliary Models
Approximate Bayesian Computation using Auxiliary Models Tony Pettitt Co-authors Chris Drovandi, Malcolm Faddy Queensland University of Technology Brisbane MCQMC February 2012 Tony Pettitt () ABC using
More informationUNIVERSITY OF OSLO. Please make sure that your copy of the problem set is complete before you attempt to answer anything. k=1
UNIVERSITY OF OSLO Faculty of mathematics and natural sciences Eam in: STK4051 Computational statistics Day of eamination: Thursday November 30 2017. Eamination hours: 09.00 13.00. This problem set consists
More informationWeb page recommendation using a stochastic process model
Data Mining VII: Data, Text and Web Mining and their Business Applications 233 Web page recommendation using a stochastic process model B. J. Park 1, W. Choi 1 & S. H. Noh 2 1 Computer Science Department,
More informationProbabilistic Graphical Models
Overview of Part Two Probabilistic Graphical Models Part Two: Inference and Learning Christopher M. Bishop Exact inference and the junction tree MCMC Variational methods and EM Example General variational
More informationGiRaF: a toolbox for Gibbs Random Fields analysis
GiRaF: a toolbox for Gibbs Random Fields analysis Julien Stoehr *1, Pierre Pudlo 2, and Nial Friel 1 1 University College Dublin 2 Aix-Marseille Université February 24, 2016 Abstract GiRaF package offers
More informationA Bayesian approach to parameter estimation for kernel density estimation via transformations
A Bayesian approach to parameter estimation for kernel density estimation via transformations Qing Liu,, David Pitt 2, Xibin Zhang 3, Xueyuan Wu Centre for Actuarial Studies, Faculty of Business and Economics,
More informationParallel Gibbs Sampling From Colored Fields to Thin Junction Trees
Parallel Gibbs Sampling From Colored Fields to Thin Junction Trees Joseph Gonzalez Yucheng Low Arthur Gretton Carlos Guestrin Draw Samples Sampling as an Inference Procedure Suppose we wanted to know the
More informationBUGS: Language, engines, and interfaces
BUGS: Language, engines, and interfaces Patrick Breheny January 17 Patrick Breheny BST 701: Bayesian Modeling in Biostatistics 1/18 The BUGS framework The BUGS project (Bayesian inference using Gibbs Sampling)
More informationMATH3016: OPTIMIZATION
MATH3016: OPTIMIZATION Lecturer: Dr Huifu Xu School of Mathematics University of Southampton Highfield SO17 1BJ Southampton Email: h.xu@soton.ac.uk 1 Introduction What is optimization? Optimization is
More informationCOPULA MODELS FOR BIG DATA USING DATA SHUFFLING
COPULA MODELS FOR BIG DATA USING DATA SHUFFLING Krish Muralidhar, Rathindra Sarathy Department of Marketing & Supply Chain Management, Price College of Business, University of Oklahoma, Norman OK 73019
More informationWinBUGS A Bayesian modelling framework: Concepts, structure, and extensibility
Statistics and Computing (2000) 10, 325 337 WinBUGS A Bayesian modelling framework: Concepts, structure, and extensibility DAVID J. LUNN, ANDREW THOMAS, NICKY BEST and DAVID SPIEGELHALTER Department of
More informationSlice sampler algorithm for generalized Pareto distribution
Slice sampler algorithm for generalized Pareto distribution Mohammad Rostami, Mohd Bakri Adam Yahya, Mohamed Hisham Yahya, Noor Akma Ibrahim Abstract In this paper, we developed the slice sampler algorithm
More informationPolyEDA: Combining Estimation of Distribution Algorithms and Linear Inequality Constraints
PolyEDA: Combining Estimation of Distribution Algorithms and Linear Inequality Constraints Jörn Grahl and Franz Rothlauf Working Paper 2/2004 January 2004 Working Papers in Information Systems University
More informationMissing Data Analysis for the Employee Dataset
Missing Data Analysis for the Employee Dataset 67% of the observations have missing values! Modeling Setup For our analysis goals we would like to do: Y X N (X, 2 I) and then interpret the coefficients
More informationAN ADAPTIVE POPULATION IMPORTANCE SAMPLER. Luca Martino*, Victor Elvira\ David Luengcfi, Jukka Corander*
AN ADAPTIVE POPULATION IMPORTANCE SAMPLER Luca Martino*, Victor Elvira\ David Luengcfi, Jukka Corander* * Dep. of Mathematics and Statistics, University of Helsinki, 00014 Helsinki (Finland). t Dep. of
More informationProbabilistic Graphical Models
10-708 Probabilistic Graphical Models Homework 4 Due Apr 27, 12:00 noon Submission: Homework is due on the due date at 12:00 noon. Please see course website for policy on late submission. You must submit
More informationBayesian parameter estimation in Ecolego using an adaptive Metropolis-Hastings-within-Gibbs algorithm
IT 16 052 Examensarbete 30 hp Juni 2016 Bayesian parameter estimation in Ecolego using an adaptive Metropolis-Hastings-within-Gibbs algorithm Sverrir Þorgeirsson Institutionen för informationsteknologi
More informationPackage atmcmc. February 19, 2015
Type Package Package atmcmc February 19, 2015 Title Automatically Tuned Markov Chain Monte Carlo Version 1.0 Date 2014-09-16 Author Jinyoung Yang Maintainer Jinyoung Yang
More informationHierarchical Modelling for Large Spatial Datasets
Hierarchical Modelling for Large Spatial Datasets Sudipto Banerjee 1 and Andrew O. Finley 2 1 Department of Forestry & Department of Geography, Michigan State University, Lansing Michigan, U.S.A. 2 Biostatistics,
More informationOn the Parameter Estimation of the Generalized Exponential Distribution Under Progressive Type-I Interval Censoring Scheme
arxiv:1811.06857v1 [math.st] 16 Nov 2018 On the Parameter Estimation of the Generalized Exponential Distribution Under Progressive Type-I Interval Censoring Scheme Mahdi Teimouri Email: teimouri@aut.ac.ir
More informationYou ve already read basics of simulation now I will be taking up method of simulation, that is Random Number Generation
Unit 5 SIMULATION THEORY Lesson 39 Learning objective: To learn random number generation. Methods of simulation. Monte Carlo method of simulation You ve already read basics of simulation now I will be
More informationMarkov Decision Processes and Reinforcement Learning
Lecture 14 and Marco Chiarandini Department of Mathematics & Computer Science University of Southern Denmark Slides by Stuart Russell and Peter Norvig Course Overview Introduction Artificial Intelligence
More informationApproximate (Monte Carlo) Inference in Bayes Nets. Monte Carlo (continued)
Approximate (Monte Carlo) Inference in Bayes Nets Basic idea: Let s repeatedly sample according to the distribution represented by the Bayes Net. If in 400/1000 draws, the variable X is true, then we estimate
More informationComparing different interpolation methods on two-dimensional test functions
Comparing different interpolation methods on two-dimensional test functions Thomas Mühlenstädt, Sonja Kuhnt May 28, 2009 Keywords: Interpolation, computer experiment, Kriging, Kernel interpolation, Thin
More informationBAYESIAN OUTPUT ANALYSIS PROGRAM (BOA) VERSION 1.0 USER S MANUAL
BAYESIAN OUTPUT ANALYSIS PROGRAM (BOA) VERSION 1.0 USER S MANUAL Brian J. Smith January 8, 2003 Contents 1 Getting Started 4 1.1 Hardware/Software Requirements.................... 4 1.2 Obtaining BOA..............................
More informationarxiv: v2 [stat.co] 19 Feb 2016
Noname manuscript No. (will be inserted by the editor) Issues in the Multiple Try Metropolis mixing L. Martino F. Louzada Received: date / Accepted: date arxiv:158.4253v2 [stat.co] 19 Feb 216 Abstract
More informationPackage HKprocess. R topics documented: September 6, Type Package. Title Hurst-Kolmogorov process. Version
Package HKprocess September 6, 2014 Type Package Title Hurst-Kolmogorov process Version 0.0-1 Date 2014-09-06 Author Maintainer Imports MCMCpack (>= 1.3-3), gtools(>= 3.4.1) Depends
More informationNonparametric regression using kernel and spline methods
Nonparametric regression using kernel and spline methods Jean D. Opsomer F. Jay Breidt March 3, 016 1 The statistical model When applying nonparametric regression methods, the researcher is interested
More informationR package mcll for Monte Carlo local likelihood estimation
R package mcll for Monte Carlo local likelihood estimation Minjeong Jeon University of California, Berkeley Sophia Rabe-Hesketh University of California, Berkeley February 4, 2013 Cari Kaufman University
More informationTime Series Analysis by State Space Methods
Time Series Analysis by State Space Methods Second Edition J. Durbin London School of Economics and Political Science and University College London S. J. Koopman Vrije Universiteit Amsterdam OXFORD UNIVERSITY
More information