Approximation methods and quadrature points in PROC NLMIXED: A simulation study using structured latent curve models


Nathan Smith and Shelley A. Blozis, Ph.D.
University of California, Davis, Davis, CA, USA

ABSTRACT

Structured latent curve models, an alternative to nonlinear mixed models, allow researchers to account for individual variability in longitudinal responses through individual latent curves that may vary from the mean response curve. These models can be fit in SAS using PROC NLMIXED, a procedure that offers a variety of approximation methods and options, the choice of which may affect both parameter estimates and run times. Previous research conducted by the authors using empirical data showed that combinations of approximation procedures and estimating options produced different results depending on the combination of conditions. These conditions included the four integral approximations (GAUSS, FIRO, HARDY, and ISAMP) available in NLMIXED. Within each estimating procedure, the effect of the number of quadrature points (q-points), used adaptively or non-adaptively, will be tested on parameter estimation and processing speed. As an additional condition, the feasibility of providing good, poor, or no starting values will be evaluated across estimating and q-point options. This paper presents the results of the follow-up data simulation and comments on the differences between the findings of the empirical study and the simulation study. The data simulation was conducted in SAS, and the syntax used to set up the simulations is presented and discussed in the context of the results. Outcome variables of interest for the different combinations of conditions were computational demand, accuracy in recovering parameters, and convergence behavior.

INTRODUCTION

The purpose of a longitudinal data analysis is often to characterize and evaluate change in a response across multiple occasions of measurement. A nonlinear mixed model allows use of a nonlinear function to characterize responses for a given data set. The model admits both fixed and random effects, thus allowing the coefficients of the function to differ across individuals. These models are applicable to a variety of observed processes that typically follow a nonlinear trajectory, such as measures from learning and memory studies, in which it is important to ascertain individual variability. Maximum likelihood estimation (MLE) is a widely used method for estimating the parameters of a statistical model. MLE selects as parameter estimates the values that maximize a likelihood function; that is, the procedure selects parameter values that are most likely to have produced the sample data under the proposed model. PROC NLMIXED allows users to fit nonlinear mixed models by maximizing an approximate likelihood integrated over the random effects. The procedure offers a selection from four different integral approximations: Gaussian quadrature, Hardy quadrature, importance sampling, and a first-order Taylor series expansion, in addition to several optimization algorithms. Parameter estimates and standard errors for a given model and sample data may vary based on the applied method. The theory and computational techniques of the NLMIXED procedure are based mainly on Pinheiro and Bates (1995).
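As a point of reference for what these approximations target, the likelihood being maximized integrates the conditional density of the responses over the random effects. In generic notation (a standard formulation, not a formula reproduced from the paper):

$L(\theta) = \prod_{i=1}^{N} \int p(y_i \mid \eta_i, \theta)\, p(\eta_i \mid \theta)\, d\eta_i$

Because this integral rarely has a closed form for nonlinear models, each METHOD= option in PROC NLMIXED corresponds to a different way of approximating it.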
In the current paper, the accuracy and computational efficiency of different estimating procedures for structured latent curve models are compared. Pinheiro and Bates used both empirical and simulated data to evaluate variations in approximation techniques for a logistic mixed model with a single random effect and a first-order compartment model with two random effects. Based on the comparisons, their results suggest that many methods can be used to obtain accurate parameter estimates but that adaptive Gaussian quadrature should be used for its balance of computational efficiency and accuracy. Lesaffre and Spiessens (2001) subsequently reviewed the effect of the number of quadrature points on parameter estimates using both adaptive and non-adaptive Gaussian quadrature and suggested that parameter estimates can depend on the number and positioning of quadrature points (q-points) used to evaluate the likelihood integral. This paper follows up a study by the authors that used empirical data to examine the different integral approximation methods and quadrature point options in PROC NLMIXED (Smith & Blozis, 2014). Data from a learning study were used to evaluate the approximation methods under a variety of conditions. The data were performance scores on a quantitative procedural learning task. Study participants learned a set of declarative rules for identifying characteristics of visual stimuli presented in a series.

The task was memory focused, and the response variable was the time it took the individual to complete the quantitative learning task, measured across multiple occasions. The total sample size for these data was 288 individuals with complete data at all 12 measurement occasions. Figure 1A presents a randomly selected sample of 20 individuals, and Figure 1B shows the mean response of all participants (N=288) across the 12 measurement occasions.

Figure 1A: Twenty Randomly Sampled Individuals
Figure 1B: Sample Average QTR Score

The sample average follows a nonlinear trajectory and approaches an asymptote greater than zero. Although many of the individual scores seem to follow this pattern as well, there is a great deal of variability in individual responses. A structured latent curve model (SLCM) allows researchers to account for individual variability in longitudinal responses through individual latent curves that may vary from the mean response curve. The data were evaluated under the three integral approximation procedures (FIRO, GAUSS, and ISAMP) amenable to the SLCM and at several different q-point conditions (1, 5, 10, 20, 30). Additionally, the GAUSS and ISAMP options were tested adaptively (the default) and non-adaptively (the NOAD option). The effect of providing starting values on parameter estimation was combined with measures of computational efficiency and model fit for each testable variation. The purpose was to identify sensitive areas in likelihood approximation for structured latent curve models and to propose estimating procedures that balance accuracy and computational efficiency. The general finding was that with good starting values the adaptive conditions were the least variable and, at low q-point conditions, the most efficient. Increasing the number of quadrature points did not affect parameter estimates when used adaptively, but unless good starting values were provided, the adaptive procedures were unable to converge on any estimates. Non-adaptive methods were extremely variable at low q-point conditions and did not stabilize until at least 20 q-points were used. This is important to note because in some situations a non-adaptive method will converge on a solution where an adaptive method will not; this may happen because the adaptive methods use information from the data and from the starting values provided to locate the best points on the x-axis at which to evaluate the integral. The general recommendation that emerged from these findings was to use adaptive quadrature whenever possible, but if non-adaptive methods are required, to use a sufficiently high number of q-points to ensure parameter stability. For a more complete report of the results, see Smith and Blozis (2014).

The goal of the current paper was to conduct a simulation study that allowed for more generalized conclusions about approximation methods for SLCMs. Data were simulated to have nonlinear characteristics and to closely resemble the empirical data used in the first study. Three integral approximations were tested (GAUSS, ISAMP, and FIRO) adaptively and non-adaptively under a range of q-point conditions. For this simulation, the effect of providing poor or no starting values in comparison to good starting values was not evaluated. A total of 250 datasets, each with a sample size of N=250, were generated using PROC IML and code adapted from Wicklin (2013).
Review of the Estimating Procedures Available in PROC NLMIXED and the Use of the Number of Quadrature Points

Within PROC NLMIXED the four available integral estimating procedures are Gaussian quadrature (GAUSS), a first-order Taylor series expansion (FIRO), importance sampling (ISAMP), and Hardy quadrature (HARDY). Hardy quadrature is not used in this study because the model to be fitted to the data includes more than one random effect, and the Hardy procedure is limited to a single random effect. These estimating procedures maximize an approximation to the likelihood function integrated over the random effects. A variety of optimization algorithms is also available to the researcher, although in this paper only the default, a dual quasi-Newton algorithm, is considered. The default estimating procedure in PROC NLMIXED is adaptive Gaussian quadrature. This approximation method makes use of predetermined abscissas, or set locations on the x-axis, to evaluate the integral.

In this sense, Gaussian quadrature can be thought of as a type of Monte Carlo integration in which the points at which the integral is evaluated are pre-set and random effect vectors are generated from a normal distribution. This deterministic quality means that, given the same input, the same output will always be generated if the number of quadrature points remains constant. The weights are fixed beforehand in the Gaussian quadrature rule, and the scale is centered at zero, unless adaptive Gaussian quadrature is used, in which case the scale is centered at the conditional modes of the estimated random effects (Pinheiro & Bates, 1995). It is important to note that the actual number of points on the grid used in the estimation is the number of quadrature points raised to the power of the number of random effects. The number of quadrature points can be specified in PROC NLMIXED using the QPOINTS= option. If none is specified, the procedure automatically selects an appropriate number for the model, not exceeding 31 quadrature points. Pinheiro and Bates (1995) have suggested that the greatest gain in precision in estimating parameters comes from the centering on conditional modes that takes place in adaptive Gaussian quadrature and that relatively little is gained by increasing the number of q-points past 1. An advantage of non-adaptive Gaussian quadrature, which can be specified using the NOAD option in PROC NLMIXED, is that it does not require the posterior modes of the random effects to be calculated at each iteration (Pinheiro & Bates, 1995); it is thus less computationally demanding and does not depend on the procedure being able to correctly locate the posterior modes of the random effects.

Importance sampling (ISAMP) is performed using Monte Carlo integration and is more flexible than Gaussian quadrature because it allows the user to specify the distribution that created the data, which in turn makes the samples that are drawn more likely to be relevant to the integral. The area that is most likely to be important to the integral is then oversampled. Because the user may specify the distribution the data come from, this technique can be more appropriate for applications to non-normal data, provided an appropriate distribution is chosen. The approximation obtained from importance sampling is similar to that obtained from Gaussian quadrature, the main difference being that importance sampling uses a pseudo-random mechanism to select quadrature points, whereas Gaussian quadrature uses predetermined weights and abscissas. The stochastic variability associated with different importance samples makes it difficult to compare small changes in parameter values because each importance sample will yield slightly different results. In PROC NLMIXED the user may specify a SEED= value that allows for comparisons across different conditions.

Alternatively, the first-order method of Beal and Sheiner (1982, 1988) is available. This method takes a first-order Taylor series expansion (FIRO) around the empirical best linear unbiased predictions of the random effects. This method requires that the user specify a NORMAL distribution in the RANDOM statement. As will be shown later, this method generally converges on an estimate quickly and can often be used to generate starting values for parameter estimates when none are available. The HARDY option makes use of Hardy quadrature, which uses an adaptive trapezoidal rule. This option is only available for one-dimensional integrals, that is, models that specify only one random effect.
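To make these options concrete, the following self-contained sketch simulates a simple random-intercept model and fits it under several of the settings described above. This is a hypothetical toy example, not the paper's model; the dataset name, variable names, parameter values, and seeds are all illustrative assumptions.

/* Hypothetical toy data: 100 subjects, 5 occasions, random intercept */
data toy;
   call streaminit(20140903);
   do id = 1 to 100;
      b = rand('normal', 0, 2);             /* random intercept, sd = 2 */
      do t = 1 to 5;
         y = 3 + b + rand('normal', 0, 1);  /* response with residual error */
         output;
      end;
   end;
run;

/* Adaptive Gaussian quadrature (the default) with 5 quadrature points */
proc nlmixed data=toy method=gauss qpoints=5;
   parms beta0=3 s2b=4 s2e=1;
   mean = beta0 + b;
   model y ~ normal(mean, s2e);
   random b ~ normal(0, s2b) subject=id;
run;

/* Non-adaptive importance sampling; SEED= fixes the pseudo-random
   selection of points so that runs are comparable */
proc nlmixed data=toy method=isamp noad qpoints=20 seed=4322;
   parms beta0=3 s2b=4 s2e=1;
   mean = beta0 + b;
   model y ~ normal(mean, s2e);
   random b ~ normal(0, s2b) subject=id;
run;

/* First-order method; requires a NORMAL distribution in the RANDOM statement */
proc nlmixed data=toy method=firo;
   parms beta0=3 s2b=4 s2e=1;
   mean = beta0 + b;
   model y ~ normal(mean, s2e);
   random b ~ normal(0, s2b) subject=id;
run;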
METHODS

Model

A structured latent curve model (SLCM) uses random coefficients combined with a factor matrix of basis functions to yield individual latent curves. The mean response is assumed to follow a target function on which the basis functions of the factor matrix are based (Blozis, 2004). In a SLCM the columns of the factor matrix define the shape of the common curve from which individuals are allowed to vary. A SLCM is an extension of a latent curve model (LCM), which, in contrast, does not allow for individual variation in model parameters that enter the target function in a nonlinear way. The two models share a similar structure, however. Given a response variable $y_i$, a latent curve model (similar to a structured latent curve model) may be expressed as

$y_i = \Lambda_i \eta_i + \varepsilon_i$

where $\Lambda_i$ is a factor loading matrix whose elements are the basis functions evaluated according to time:

$\Lambda_i = \begin{bmatrix} \lambda_{11} & \lambda_{12} & \cdots & \lambda_{1J} \\ \lambda_{21} & \lambda_{22} & \cdots & \lambda_{2J} \\ \vdots & \vdots & & \vdots \\ \lambda_{T_i 1} & \lambda_{T_i 2} & \cdots & \lambda_{T_i J} \end{bmatrix}$

Each row corresponds to a measurement occasion and each column corresponds to the elements that define the shape of the curve. The columns of the $\Lambda_i$ matrix are often referred to as basis functions. The factor $\eta_i$ is a vector of random coefficients that denotes how much an individual latent response curve depends on each basis curve. Although the parameters that define the factor matrix may enter the functions nonlinearly, the random coefficients enter the individual-level model linearly (Blozis, 2004). Measurement error ($\varepsilon_i$) is assumed to be independent of the random coefficients ($\eta_i$) and to be normally distributed. This allows for the estimation of a nonlinear function using standard maximum likelihood estimation techniques. The data are assumed to be multivariate normal.
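Although the paper does not write them out, the model-implied moments under these assumptions follow the standard latent curve algebra. With $\mu_\eta$ and $\Psi$ denoting the mean vector and covariance matrix of $\eta_i$, and $\Theta$ the covariance matrix of $\varepsilon_i$ (symbols introduced here for orientation, not taken from the paper):

$E(y_i) = \Lambda_i \mu_\eta, \qquad \mathrm{Cov}(y_i) = \Lambda_i \Psi \Lambda_i' + \Theta$

These are the moments that the maximum likelihood fit matches to the observed means and covariances under multivariate normality.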

The function believed to represent the mean response of the data is often referred to as the target function (Browne, 1993). The target function is the researcher's assumption that the mean response of the data will follow some predetermined trajectory. Several possible target functions could be used to generate the $\Lambda$ matrix for the structured latent curve model. The target function generates a nonlinear curve that the mean response across time points is expected to follow under a given theory. The curve follows the smooth function $f(\theta, t)$, where $\theta = (\theta_1, \ldots, \theta_J)$ is a vector of unknown fixed parameters and $t = (1, 2, \ldots, T)$ is a vector of discrete measurement occasions. When evaluated at a discrete time point $T$, the target function gives the model-implied mean, $f(\theta, T) = \mu_T$. The same function used in Blozis (2004), a negatively accelerated exponential function decreasing monotonically to an asymptote greater than zero, was used to create the factor matrix for the current study. This function is

$f(\theta, t) = \theta_1 - (\theta_1 - \theta_2)\exp[-\theta_3(t-1)]$

where $t$ refers to the measurement occasion and is centered at the first learning trial. The first-order partial derivatives of this function make up the columns of the factor matrix $\Lambda_i$, resulting in a 12x3 matrix. $\theta_1$ is the potential response time or asymptote, $\theta_2$ is the initial response time or intercept, and $\theta_3$ is the population initial rate of change. Let $f_j(\theta, t)$ represent the first partial derivative of the target function $f(\theta, t)$ with respect to $\theta_j$:

$f_j(\theta, t) = \frac{\partial f(\theta, t)}{\partial \theta_j}$

Given the target function specified above, based on three $\theta$ parameters and 12 measurement occasions, the $\Lambda$ matrix for the structured latent curve is then

$\Lambda = \begin{bmatrix} f_1(\theta, 1) & f_2(\theta, 1) & f_3(\theta, 1) \\ f_1(\theta, 2) & f_2(\theta, 2) & f_3(\theta, 2) \\ \vdots & \vdots & \vdots \\ f_1(\theta, 12) & f_2(\theta, 12) & f_3(\theta, 12) \end{bmatrix}$

This expression can then accommodate a population curve that follows the form of the target function $f(\theta, t)$ and individual-level variation in the vector $\eta_i$, which represents individual $i$'s dependence on the $j$th basis function (Blozis, 2004).

Data Simulation

Although these procedures may behave differently at small sample sizes, the current study considers only one sample size of N=250. This sample size was chosen because it was close to the size of the sample tested using the empirical data (N=228) and because we wanted to avoid any complications that might arise from a smaller sample size. Future studies should consider the application of these methods to smaller samples. The first step of the data simulation was to generate the $\eta_i$ matrix, the vector of random coefficients that differs for every individual, across all 250 sets. The following code, adapted from Wicklin (2013), simulates multivariate normal data given the mean and covariance of the variables to be tested.

proc iml;
N = 250;                /* individuals per replication */
NumSamples = 250;       /* number of replications */
/* specify population mean and covariance */
Mean = {8.6 16.3 0.64};
/* The covariance values were lost in transcription. The entries below are
   implied by the PARMS statement given later in the paper; cov(t1,t3) was
   also lost, and the 0 shown here is a placeholder, not the paper's value. */
Cov = {4.32 7.26 0   ,
       7.26 34.9 0.48,
       0    0.48 0.15};
call randseed(4322);
X = RandNormal(N*NumSamples, Mean, Cov);
/* Create data set from simulated data */
ID = colvec(repeat(t(1:NumSamples), 1, N));  /* 1,1,1,...,2,2,2,...,3,3,3,... */
Z = ID || X;
create raneff from Z[c={"ID" "t1" "t2" "t3"}];
append from Z;
close raneff;
quit;
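The paper does not show the next step, described in the following paragraph: multiplying each row of the simulated coefficients by the $\Lambda$ matrix and adding measurement error. The sketch below is one way to do it under stated assumptions: $\Lambda$ is built from the basis functions evaluated at the simulation values $\theta$ = (8.6, 16.3, .64), the error variance uses the s2e1 = .93 value from the PARMS statement shown later, and the seed is illustrative.

proc iml;
use raneff;
read all var {t1 t2 t3} into Eta;    /* 62,500 x 3 random coefficients */
close raneff;

theta = {8.6 16.3 0.64};
time = T(1:12);
basis1 = 1 - exp(-theta[3]*(time-1));                              /* df/d(theta1) */
basis2 = exp(-theta[3]*(time-1));                                  /* df/d(theta2) */
basis3 = (theta[1]-theta[2]) # (time-1) # exp(-theta[3]*(time-1)); /* df/d(theta3) */
Lambda = basis1 || basis2 || basis3;                               /* 12 x 3 factor matrix */

Y = Eta * t(Lambda);                        /* 62,500 x 12 latent response curves */
E = j(nrow(Y), ncol(Y));
call randseed(999);                         /* illustrative seed */
call randgen(E, 'Normal', 0, sqrt(0.93));   /* measurement error, variance = s2e1 */
Y = Y + E;                                  /* simulated observed scores */
quit;

Reshaping Y to long format with subject, set, and occasion identifiers for PROC NLMIXED is omitted here.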

The RANDNORMAL step generates a 62,500 x 4 matrix with all the individual-level information across all replications; ID is an indicator variable that specifies the replication number. Once the ID variable was selected out, the resulting $\eta_i$ matrix was combined with the $\Lambda$ matrix of basis functions (12x3) at each time point to generate one complete data set with 250 sets of 250 individuals at 12 time points. In other words, each row of the $\eta_i$ matrix was multiplied by the $\Lambda$ matrix to generate a 1x12 vector for each individual across all individuals and replications. The resulting data set had 62,500 individuals in 250 sets. At this point, error variance was also added to each individual. Figure 2 shows the simulated scores of a random sample of 24 individuals from the data.

Figure 2: Random sample of simulated data across occasions

Procedure

For each of the estimating procedures considered (GAUSS, ISAMP, FIRO), and holding other conditions constant, the effect of the number of quadrature points (1, 2, 3, 4, 5, 10, 20, 30) was tested when used adaptively or non-adaptively on parameter estimation and processing speed. In summary, 2 (adaptive vs. non-adaptive quadrature) x 8 (1, 2, 3, 4, 5, 10, 20, and 30 q-points) x 2 (estimating procedures GAUSS and ISAMP) + 1 (FIRO) = 33 variations were evaluated and compared based on parameter estimates, processing speed, and model fit. Each of these 33 conditions was run across all 250 replications. The starting values provided to the procedure were the same values used to simulate the data and so can be considered good starting values; this also allowed a bias to be computed. The SAS code used to test these conditions is included below.

/* METHOD= and QPOINTS= are set to the condition under test;
   NOAD is included only for the non-adaptive runs */
PROC NLMIXED MAXITER=10000 GCONV=1e-10 METHOD=(INPUT) QPOINTS=(INPUT) NOAD(OPTIONAL);
   /* basis functions: first partial derivatives of the target function */
   basis1 = 1-exp(-t3*(time-1));                  /* df/d(theta1) */
   basis2 = exp(-t3*(time-1));                    /* df/d(theta2) */
   basis3 = (t1-t2)*(time-1)*exp(-t3*(time-1));   /* df/d(theta3) */
   /* starting values equal the simulation values; the cn3n1 starting
      value was lost in transcription, and 0 is shown as a placeholder */
   PARMS t1=8.6 t2=16.3 t3=.64 s2e1=.93
         s2n1=4.32 cn2n1=7.26 s2n2=34.9
         cn3n1=0 cn3n2=.48 s2n3=.15;
   mean = basis1*(n1+t1) + basis2*(n2+t2) + basis3*n3;
   var = s2e1;
   MODEL qrts ~ NORMAL(mean, var);
   RANDOM n1 n2 n3 ~ NORMAL([0,0,0],
          [s2n1, cn2n1, s2n2, cn3n1, cn3n2, s2n3]) SUBJECT=subid;
   BY set;
RUN;

Note: Because convergence can sometimes be an issue, especially when considering several variations of quadrature points and estimation methods, the user can manually set the maximum number of iterations higher than the default using the MAXITER= option. Similarly, the GCONV= option specifies a convergence criterion for the relative gradient: when the relative gradient falls below the specified value, the convergence criterion has been met and the procedure stops the iterative process.

Results

The analyses conducted previously on the empirical learning data suggested that parameter estimates may be unstable at lower quadrature point conditions, but only when non-adaptive approximation methods are used. The same pattern was observed in the analyses of the simulated data. Figure 3 plots the mean values of all three of the main fixed effects across quadrature point conditions and approximation methods. $\theta_1$ is the asymptote parameter, $\theta_2$ is the intercept, and $\theta_3$ is the rate of change parameter. The greatest change in the mean parameter estimates occurs between 1 and 5 quadrature points. It should be noted that the adaptive importance sampling, adaptive Gaussian quadrature, and FIRO methods yielded exactly the same parameter estimates and standard errors, regardless of the number of quadrature points. This finding is in line with the conclusions of Pinheiro and Bates (1995), who also found that increasing the number of quadrature points past 1 for an adaptive approximation did not increase accuracy. What is interesting about the behavior of the parameter estimates in this study is that although the adaptive methods showed greater stability across q-point conditions, they did not recover the true parameters better than the non-adaptive methods when a larger number of q-points was used.

Figure 3: Average values for fixed effects across quadrature point conditions
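The paper does not show the harness that cycled PROC NLMIXED through the 33 conditions. One plausible way to drive the method and q-point grid is a small macro wrapper; the macro name, the data set name simdata, and the 0 placeholder for the lost cn3n1 starting value are assumptions, not the authors' code.

%macro fit_slcm(method=, qpoints=, opts=);
   proc nlmixed data=simdata maxiter=10000 gconv=1e-10
                method=&method qpoints=&qpoints &opts;
      basis1 = 1-exp(-t3*(time-1));
      basis2 = exp(-t3*(time-1));
      basis3 = (t1-t2)*(time-1)*exp(-t3*(time-1));
      parms t1=8.6 t2=16.3 t3=.64 s2e1=.93
            s2n1=4.32 cn2n1=7.26 s2n2=34.9
            cn3n1=0 cn3n2=.48 s2n3=.15;   /* cn3n1 start is a placeholder */
      mean = basis1*(n1+t1) + basis2*(n2+t2) + basis3*n3;
      model qrts ~ normal(mean, s2e1);
      random n1 n2 n3 ~ normal([0,0,0],
             [s2n1, cn2n1, s2n2, cn3n1, cn3n2, s2n3]) subject=subid;
      by set;
   run;
%mend fit_slcm;

/* Examples: one adaptive and one non-adaptive condition */
%fit_slcm(method=GAUSS, qpoints=5);
%fit_slcm(method=ISAMP, qpoints=20, opts=noad);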

Averages of the fixed effects estimates were not consistent across all parameters. Although only three parameters are included here for illustrative purposes, an additional seven parameters were estimated by the NLMIXED procedure. These parameters included the variances and covariances of the three random effects, as well as the error term at the first level of the model. The non-adaptive Gaussian quadrature approximation with QPOINTS=1 was unable to converge on any solutions for the covariance and variance estimates across all replications and instead took the starting values provided to the procedure as parameter estimates. Non-adaptive importance sampling was the least accurate and most biased of all the methods at low quadrature point conditions.

Tables 1A-D: Results from simulated data for fixed effects at QPOINTS=1, 5, 10, 30

Tables 1A-D include the means, relative bias, and standard deviations of the fixed effects estimates across all replications. The adaptive methods always recovered the nonlinear rate parameter estimate of $\theta_3$ with the least bias. An interesting trend in these tables is that the adaptive estimates did not improve across QPOINT conditions, whereas the non-adaptive methods clearly did, showing less variability and bias with an increase in quadrature points. This again demonstrates the relation between quadrature points and accurate parameter estimation and suggests that when using an approximation method non-adaptively, one should be wary of using a small number of quadrature points. The question then arises: if an increase in the number of quadrature points can improve estimates and decrease bias, how many should be used? The answer often comes down to the varying computational efficiency of the different procedures.
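The text does not define the relative bias reported in Tables 1A-D; assuming the conventional definition over the R = 250 replications, with $\hat{\theta}_r$ the estimate from replication $r$ and $\theta$ the true value used to simulate the data:

$\text{relative bias} = \frac{\frac{1}{R}\sum_{r=1}^{R}\hat{\theta}_r - \theta}{\theta}$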

Some of the differences in computational efficiency become more apparent when considering the time necessary to complete the different procedures adaptively and non-adaptively. Tables 2A and 2B report the model fit and time used by each procedure at each level of tested quadrature point options for adaptive and non-adaptive methods, respectively. The run times are in minutes and approximate how long it took to complete an analysis on one simulated dataset. Because the analyses were looped through all 250 replications, the log file only displays the time it took to process all replications; the run times displayed in Tables 2A and 2B are the total time divided by 250. This is informative because a researcher who uses PROC NLMIXED to run an analysis is likely to run the procedure many fewer times than it was run for this simulation.

Table 2A: Run times and fit for adaptive methods
Table 2B: Run times and fit for non-adaptive methods

The FIRO approximation method was among the quickest to produce estimates and provided parameter estimates comparable to the adaptive ISAMP and GAUSS approximations. As the number of quadrature points for the ISAMP and GAUSS approximations increased, so did the run times. In the case of the GAUSS approximation, only 0.11 minutes (6.6 seconds) were required to generate parameter estimates with one adaptive quadrature point, but nearly 5.5 hours were needed to evaluate the same function with 30 adaptive quadrature points. In this simulation the increase in quadrature points was unnecessary given that parameter estimates and fit did not improve past one adaptive quadrature point. It is interesting to note, however, that although the parameter estimates do not change past 1 q-point for adaptive importance sampling, the fit statistics both improve and display less variability as more quadrature points are used.

For the non-adaptive conditions, fit improves and run times increase as the number of quadrature points goes up. Run times were also shorter for non-adaptive methods. This was unsurprising because adaptive methods begin the iterative process of approximating the likelihood function at a location on the x-axis that is informed by characteristics of the data. The most dramatic improvement in fit is observed in the importance sampling approximation, and the most dramatic increase in run time is observed with the Gaussian quadrature approximation. A 30 q-point non-adaptive Gaussian approximation took around 1.5 hours to complete, whereas a corresponding importance sampling approximation with 30 q-points took only 12 seconds.

Conclusions

The structured latent curve model is not technically a fully nonlinear mixed model. While the shape of the mean curve is defined to be nonlinear, the random effects enter this model linearly. This facilitates the use of standard maximum likelihood techniques but may be the reason that the parameter estimates observed in this study are so similar under the adaptive conditions. Because the model is easier to estimate and is conditionally linear, the procedure may be converging on a solution instead of just an approximation. It would be instructive for future studies to apply these same experimental conditions and data to a different, fully nonlinear mixed model.

The FIRO estimates were exactly the same as the ISAMP and GAUSS estimates across all conditions and were converged upon more quickly than under any other condition, including when no starting values were provided. Much of the variability in the parameter estimates was observed when non-adaptive quadrature points were used. The use of non-adaptive quadrature points took less CPU time but required more quadrature points to reach the same level of accuracy. When using the NOAD option, the user should be aware that parameter estimates may vary widely based on the number of quadrature points and the approximation method used. In this vein, the authors suggest that if the NOAD option is to be used, a greater number of quadrature points should also be considered to ensure accuracy.

A potential solution to long run times for the Gaussian quadrature approximations is to use importance sampling with a very large number of quadrature points. The main difference between these two approximation methods is the mechanism by which the locations of the quadrature points are chosen, and it seems to follow that when using a semi-random method of selecting quadrature points (ISAMP), a larger number would provide better coverage. At 30 q-points the run time for non-adaptive GAUSS was approximately 1.5 hours, whereas 30 q-points of non-adaptive ISAMP took only 22 seconds. A further study could investigate very high numbers of q-points for ISAMP approximations and determine whether the approach is more computationally efficient than the GAUSS procedures while remaining highly accurate. This particular data set had no missing values; missing data are common in longitudinal studies involving measurements taken on humans and further complicate the approximation procedures. A future study would include a number of data simulations that would allow for a closer examination of some of the findings of this study.
It would be interesting and informative to re-evaluate these procedures with data that are less ideal and that may complicate the approximation process. A next step would be to include additional starting value conditions in which poor to average starting values are compared against the good and no starting value conditions. It is possible that we observed such consistency in parameter estimates because the starting values used were already very close to the values of the parameter estimates.

References

1. Blozis, S. (2004). Structured latent curve models for the study of change in multivariate repeated measures. Psychological Methods, 9(3).
2. Beal, S.L., & Sheiner, L.B. (1982). Estimating population kinetics. CRC Critical Reviews in Biomedical Engineering, 8.
3. Beal, S.L., & Sheiner, L.B. (1988). Heteroskedastic nonlinear regression. Technometrics, 30.
4. Davidian, M., & Giltinan, D.M. (1995). Nonlinear Models for Repeated Measurement Data. London: Chapman & Hall.
5. Lesaffre, E., & Spiessens, B. (2001). On the number of quadrature points in a logistic random effects model: An example. Applied Statistics, 50.
6. Pinheiro, J.C., & Bates, D.M. (1995). Approximations to the log-likelihood function in the nonlinear mixed-effects model. Journal of Computational and Graphical Statistics, 4.
7. Smith, N., & Blozis, S. (2014). Options in estimating nonlinear mixed models: Quadrature points and approximation methods. Paper presented at Western Users of SAS Software 2014, San Jose, California, 3-5 September. Archived in the WUSS 2014 conference proceedings.
8. Wicklin, R. (2013). Simulating Data with SAS. Cary, NC: SAS Institute Inc.

CONTACT INFORMATION

Your comments and questions are valued and encouraged. Contact the authors at:

Nathan Smith, University of California, Davis, Department of Psychology, nabsmith@ucdavis.edu
Shelley A. Blozis, Ph.D., University of California, Davis, Department of Psychology, sablozis@ucdavis.edu

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies.


More information

RELIABILITY DATA ANALYSIS FOR DESIGNED EXPERIMENTS

RELIABILITY DATA ANALYSIS FOR DESIGNED EXPERIMENTS RELIABILITY DATA ANALYSIS FOR DESIGNED EXPERIMENTS Laura J. Freeman and G. Geoffrey Vining Department of Statistics, Virginia Tech, Blacksburg, VA 24061-0439 ABSTRACT Product reliability is an important

More information

CSC 2515 Introduction to Machine Learning Assignment 2

CSC 2515 Introduction to Machine Learning Assignment 2 CSC 2515 Introduction to Machine Learning Assignment 2 Zhongtian Qiu(1002274530) Problem 1 See attached scan files for question 1. 2. Neural Network 2.1 Examine the statistics and plots of training error

More information

Mixture Models and the EM Algorithm

Mixture Models and the EM Algorithm Mixture Models and the EM Algorithm Padhraic Smyth, Department of Computer Science University of California, Irvine c 2017 1 Finite Mixture Models Say we have a data set D = {x 1,..., x N } where x i is

More information

Predict Outcomes and Reveal Relationships in Categorical Data

Predict Outcomes and Reveal Relationships in Categorical Data PASW Categories 18 Specifications Predict Outcomes and Reveal Relationships in Categorical Data Unleash the full potential of your data through predictive analysis, statistical learning, perceptual mapping,

More information

CPSC 340: Machine Learning and Data Mining. Probabilistic Classification Fall 2017

CPSC 340: Machine Learning and Data Mining. Probabilistic Classification Fall 2017 CPSC 340: Machine Learning and Data Mining Probabilistic Classification Fall 2017 Admin Assignment 0 is due tonight: you should be almost done. 1 late day to hand it in Monday, 2 late days for Wednesday.

More information

ESTIMATING DENSITY DEPENDENCE, PROCESS NOISE, AND OBSERVATION ERROR

ESTIMATING DENSITY DEPENDENCE, PROCESS NOISE, AND OBSERVATION ERROR ESTIMATING DENSITY DEPENDENCE, PROCESS NOISE, AND OBSERVATION ERROR Coinvestigators: José Ponciano, University of Idaho Subhash Lele, University of Alberta Mark Taper, Montana State University David Staples,

More information

High-Performance Procedures in SAS 9.4: Comparing Performance of HP and Legacy Procedures

High-Performance Procedures in SAS 9.4: Comparing Performance of HP and Legacy Procedures Paper SD18 High-Performance Procedures in SAS 9.4: Comparing Performance of HP and Legacy Procedures Jessica Montgomery, Sean Joo, Anh Kellermann, Jeffrey D. Kromrey, Diep T. Nguyen, Patricia Rodriguez

More information

Simulation Calibration with Correlated Knowledge-Gradients

Simulation Calibration with Correlated Knowledge-Gradients Simulation Calibration with Correlated Knowledge-Gradients Peter Frazier Warren Powell Hugo Simão Operations Research & Information Engineering, Cornell University Operations Research & Financial Engineering,

More information

1 2 (3 + x 3) x 2 = 1 3 (3 + x 1 2x 3 ) 1. 3 ( 1 x 2) (3 + x(0) 3 ) = 1 2 (3 + 0) = 3. 2 (3 + x(0) 1 2x (0) ( ) = 1 ( 1 x(0) 2 ) = 1 3 ) = 1 3

1 2 (3 + x 3) x 2 = 1 3 (3 + x 1 2x 3 ) 1. 3 ( 1 x 2) (3 + x(0) 3 ) = 1 2 (3 + 0) = 3. 2 (3 + x(0) 1 2x (0) ( ) = 1 ( 1 x(0) 2 ) = 1 3 ) = 1 3 6 Iterative Solvers Lab Objective: Many real-world problems of the form Ax = b have tens of thousands of parameters Solving such systems with Gaussian elimination or matrix factorizations could require

More information

Assessing the Quality of the Natural Cubic Spline Approximation

Assessing the Quality of the Natural Cubic Spline Approximation Assessing the Quality of the Natural Cubic Spline Approximation AHMET SEZER ANADOLU UNIVERSITY Department of Statisticss Yunus Emre Kampusu Eskisehir TURKEY ahsst12@yahoo.com Abstract: In large samples,

More information

3 Nonlinear Regression

3 Nonlinear Regression CSC 4 / CSC D / CSC C 3 Sometimes linear models are not sufficient to capture the real-world phenomena, and thus nonlinear models are necessary. In regression, all such models will have the same basic

More information

Ludwig Fahrmeir Gerhard Tute. Statistical odelling Based on Generalized Linear Model. íecond Edition. . Springer

Ludwig Fahrmeir Gerhard Tute. Statistical odelling Based on Generalized Linear Model. íecond Edition. . Springer Ludwig Fahrmeir Gerhard Tute Statistical odelling Based on Generalized Linear Model íecond Edition. Springer Preface to the Second Edition Preface to the First Edition List of Examples List of Figures

More information

Use of Extreme Value Statistics in Modeling Biometric Systems

Use of Extreme Value Statistics in Modeling Biometric Systems Use of Extreme Value Statistics in Modeling Biometric Systems Similarity Scores Two types of matching: Genuine sample Imposter sample Matching scores Enrolled sample 0.95 0.32 Probability Density Decision

More information

Expectation Maximization (EM) and Gaussian Mixture Models

Expectation Maximization (EM) and Gaussian Mixture Models Expectation Maximization (EM) and Gaussian Mixture Models Reference: The Elements of Statistical Learning, by T. Hastie, R. Tibshirani, J. Friedman, Springer 1 2 3 4 5 6 7 8 Unsupervised Learning Motivation

More information

Generalized least squares (GLS) estimates of the level-2 coefficients,

Generalized least squares (GLS) estimates of the level-2 coefficients, Contents 1 Conceptual and Statistical Background for Two-Level Models...7 1.1 The general two-level model... 7 1.1.1 Level-1 model... 8 1.1.2 Level-2 model... 8 1.2 Parameter estimation... 9 1.3 Empirical

More information

3 Nonlinear Regression

3 Nonlinear Regression 3 Linear models are often insufficient to capture the real-world phenomena. That is, the relation between the inputs and the outputs we want to be able to predict are not linear. As a consequence, nonlinear

More information

Large-Scale Lasso and Elastic-Net Regularized Generalized Linear Models

Large-Scale Lasso and Elastic-Net Regularized Generalized Linear Models Large-Scale Lasso and Elastic-Net Regularized Generalized Linear Models DB Tsai Steven Hillion Outline Introduction Linear / Nonlinear Classification Feature Engineering - Polynomial Expansion Big-data

More information

A GENERAL GIBBS SAMPLING ALGORITHM FOR ANALYZING LINEAR MODELS USING THE SAS SYSTEM

A GENERAL GIBBS SAMPLING ALGORITHM FOR ANALYZING LINEAR MODELS USING THE SAS SYSTEM A GENERAL GIBBS SAMPLING ALGORITHM FOR ANALYZING LINEAR MODELS USING THE SAS SYSTEM Jayawant Mandrekar, Daniel J. Sargent, Paul J. Novotny, Jeff A. Sloan Mayo Clinic, Rochester, MN 55905 ABSTRACT A general

More information