Maximum Likelihood Estimation 5.0


Maximum Likelihood Estimation 5.0
for GAUSS™ Mathematical and Statistical System
Aptech Systems, Inc.

Information in this document is subject to change without notice and does not represent a commitment on the part of Aptech Systems, Inc. The software described in this document is furnished under a license agreement or nondisclosure agreement. The software may be used or copied only in accordance with the terms of the agreement. The purchaser may make one copy of the software for backup purposes. No part of this manual may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying and recording, for any purpose other than the purchaser's personal use without the written permission of Aptech Systems, Inc.

© Copyright by Aptech Systems, Inc., Black Diamond, WA. All Rights Reserved.

GAUSS, GAUSS Engine and GAUSS Light are trademarks of Aptech Systems, Inc. Other trademarks are the property of their respective owners.

Part Number:
Version 5.0
Documentation Revision: 2173, June 12, 2012

Contents

1 Installation
    1.1 UNIX/Linux/Mac
        Download
        CD
    1.2 Windows
        Download
        CD
        64-Bit Windows
    1.3 Difference Between the UNIX and Windows Versions

2 Getting Started

3 Maximum Likelihood Estimation
    3.1 The Log-likelihood Function
    3.2 Algorithm
        Derivatives
        The Secant Algorithms
        Convergence
        Berndt, Hall, Hall, and Hausman's (BHHH) Method
        Polak-Ribiere-type Conjugate Gradient (PRCG)
        Line Search Methods
        Random Search
        Weighted Maximum Likelihood
        Active and Inactive Parameters
        Example
    3.3 Managing Optimization
        Scaling
        Condition
        Starting Point
        Diagnosis
    3.4 Gradients
        Analytical Gradient
        User-Supplied Numerical Gradient
        Algorithmic Derivatives
        Analytical Hessian
        User-Supplied Numerical Hessian
        Switching Algorithms Automatically
    3.5 FASTMAX Fast Execution
        Undefined Function Evaluation
    3.6 Inference
        Wald Inference
        Profile Likelihood Inference
        Profile Trace Plots
        Bootstrap
        Pseudo-Random Number Generators
        Bayesian Inference
    Run-Time Switches
    Calling MAXLIK Recursively
    Using MAXLIK Directly
    Error Handling
        Return Codes
        Error Trapping
    References

4 Maximum Likelihood Reference
    FASTMAX, FASTBayes, FASTBoot, FASTPflClimits, FASTProfile, MAXLIK, MAXBayes, MAXBoot, MAXBlimits, MAXCLPrt, MAXDensity, MAXHist, MAXProfile, MAXPflClimits, MAXPrt, MAXSet, MAXTlimits

5 Event Count and Duration Regression
    README Files
    Setup
    About the COUNT Procedures
    Inputs
    Outputs
    Global Control Variables
    Statistical Inference
    Problems with Convergence
    Annotated Bibliography

6 Count Reference
    CountCLPrt, CountPrt, CountSet, Expgam, Expon, Hurdlep, Negbin, Pareto, Poisson, Supreme, Supreme2

Index

1 Installation

1.1 UNIX/Linux/Mac

If you are unfamiliar with UNIX/Linux/Mac, see your system administrator or system documentation for information on the system commands referred to below.

Download

1. Copy the .tar.gz or .zip file to /tmp.

2. If the file has a .tar.gz extension, unzip it using gunzip. Otherwise skip to step 3.

       gunzip app_appname_vernum.revnum_unix.tar.gz

3. cd to your GAUSS or GAUSS Engine installation directory. We are assuming /usr/local/gauss in this case.

       cd /usr/local/gauss

4. Use tar or unzip, depending on the file name extension, to extract the file.

       tar xvf /tmp/app_appname_vernum.revnum_unix.tar

   or

       unzip /tmp/app_appname_vernum.revnum_unix.zip

CD

1. Insert the Apps CD into your machine's CD-ROM drive.

2. Open a terminal window.

3. cd to your current GAUSS or GAUSS Engine installation directory. We are assuming /usr/local/gauss in this case.

       cd /usr/local/gauss

4. Use tar or unzip, depending on the file name extensions, to extract the files found on the CD. For example:

       tar xvf /cdrom/apps/app_appname_vernum.revnum_unix.tar

   or

       unzip /cdrom/apps/app_appname_vernum.revnum_unix.zip

   However, note that the paths may be different on your machine.

1.2 Windows

Download

Unzip the .zip file into your GAUSS or GAUSS Engine installation directory.

CD

1. Insert the Apps CD into your machine's CD-ROM drive.

2. Unzip the .zip files found on the CD to your GAUSS or GAUSS Engine installation directory.

64-Bit Windows

If you have both the 64-bit version of GAUSS and the 32-bit Companion Edition installed on your machine, you need to install any GAUSS applications you own in both GAUSS installation directories.

1.3 Difference Between the UNIX and Windows Versions

If the functions can be controlled during execution by entering keystrokes from the keyboard, it may be necessary to press ENTER after the keystroke in the UNIX version.


2 Getting Started


3 Maximum Likelihood Estimation

3.1 The Log-likelihood Function

Maximum Likelihood is a set of procedures for the estimation of the parameters of models via the maximum likelihood method with general constraints on the parameters, along with an additional set of procedures for statistical inference. Maximum Likelihood solves the general maximum likelihood problem

    L = \sum_{i=1}^{N} w_i \log P(Y_i; \theta)

where N is the number of observations, P(Y_i; θ) is the probability of Y_i given θ, a vector of parameters, and w_i is the weight of the i-th observation.
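As an illustration of the per-observation term this sum is built from, the following is a minimal sketch of a procedure returning the vector of log P(Y_i; θ) for a simple normal model; the procedure name lpr_normal and the assumption that the dependent variable is in column 1 of the data are invented for this example, and the full calling conventions are described in the sections below.

proc lpr_normal(b,z);
    local mu, s2;
    mu = b[1];              /* mean */
    s2 = b[2];              /* variance */
    if s2 <= 0;
        retp(error(0));     /* log-likelihood undefined for nonpositive variance */
    endif;
    /* vector of per-observation log-likelihoods, log P(Y_i; theta) */
    retp(-0.5*ln(2*pi*s2) - (z[.,1]-mu).*(z[.,1]-mu)/(2*s2));
endp;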

The Maximum Likelihood procedure Maxlik finds values for the parameters in θ such that L is maximized. In fact, Maxlik minimizes −L. It is important to note, however, that the user must specify the log-probability to be maximized. Maxlik transforms the function into the form to be minimized.

Maxlik has been designed to make the specification of the function and the handling of the data convenient. The user supplies a procedure that computes log P(Y_i; θ), i.e., the log-likelihood, given the parameters in θ, for either an individual observation or a set of observations (i.e., it must return either the log-likelihood for an individual observation or a vector of log-likelihoods for a matrix of observations; see the discussion of the global variable __row below). Maxlik uses this procedure to construct the function to be minimized.

3.2 Algorithm

Maximum Likelihood finds values for the parameters using an iterative method. In this method the parameters are updated in a series of iterations beginning with starting values that you provide. Let θ_t be the current parameter values. Then the succeeding values are

    \theta_{t+1} = \theta_t + \rho\delta

where δ is a k × 1 direction vector and ρ a scalar step length.

Direction

Define

    \Sigma(\theta) = \frac{\partial^2 L}{\partial\theta\,\partial\theta'}

    \Psi(\theta) = \frac{\partial L}{\partial\theta}

The direction, δ, is the solution to

    \Sigma(\theta_t)\,\delta = \Psi(\theta_t)

This solution requires that Σ be positive definite.

Line Search

The line search finds a value of ρ that minimizes or decreases L(θ_t + ρδ).

Derivatives

The minimization requires the calculation of the Hessian, Σ, and the gradient, Ψ. Maxlik computes these numerically if procedures to compute them are not supplied. If you provide a proc for computing Ψ, the first derivative of L, Maxlik uses it in computing Σ, the second derivative of L, i.e., Σ is computed as the Jacobian of the gradient. This improves the computational precision of the Hessian by about four places. The accuracy of the gradient is also improved, and thus the iterations converge in fewer iterations. Moreover, the convergence takes less time because of a decrease in function calls: the numerical gradient requires k function calls while an analytical gradient reduces that to one.

The Secant Algorithms

The Hessian may be very expensive to compute at every iteration, and poor start values may produce an ill-conditioned Hessian. For these reasons alternative algorithms are
provided in Maxlik for updating the Hessian rather than computing it directly at each iteration. These algorithms, as well as the step length methods, may be modified during the execution of Maxlik.

Beginning with an initial estimate of the Hessian, or a conformable identity matrix, an update is calculated. The update at each iteration adds more information to the estimate of the Hessian, improving its ability to project the direction of the descent. Thus after several iterations the secant algorithm should do nearly as well as Newton iteration with much less computation.

There are two basic types of secant methods, the BFGS (Broyden, Fletcher, Goldfarb, and Shanno) and the DFP (Davidon, Fletcher, and Powell). They are both rank two updates, that is, they are analogous to adding two rows of new data to a previously computed moment matrix. The Cholesky factorization of the estimate of the Hessian is updated using the functions cholup and choldn.

In addition, Maxlik includes a scoring method, BHHH (Berndt, Hall, Hall, and Hausman). This method computes the gradient of the likelihood by observation, i.e., the Jacobian, and estimates Σ as the cross-product of this Jacobian.

Secant Methods (BFGS and DFP)

BFGS is the method of Broyden, Fletcher, Goldfarb, and Shanno, and DFP is the method of Davidon, Fletcher, and Powell. These methods are complementary (Luenberger 1984, page 268). BFGS and DFP are like the NEWTON method in that they use both first and second derivative information. However, in DFP and BFGS the Hessian is approximated, reducing considerably the computational requirements. Because they do not explicitly calculate the second derivatives they are sometimes called quasi-Newton methods. While they take more iterations than the NEWTON method, the use of an approximation produces a gain because it can be expected to converge in less overall time (unless analytical second derivatives are available, in which case it might be a toss-up).

The secant methods are commonly implemented as updates of the inverse of the Hessian. This is not the best method numerically for the BFGS algorithm (Gill and Murray, 1972).
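For reference, the standard rank-two BFGS update of the Hessian approximation is the textbook form below; Maxlik itself works with the Cholesky factors rather than with this matrix directly, as described next:

    H_{t+1} = H_t + \frac{y_t y_t'}{y_t' s_t} - \frac{H_t s_t s_t' H_t}{s_t' H_t s_t},
    \qquad s_t = \theta_{t+1} - \theta_t, \quad y_t = \Psi(\theta_{t+1}) - \Psi(\theta_t)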

This version of Maxlik, following Gill and Murray (1972), updates the Cholesky factorization of the Hessian instead, using the functions cholup and choldn for BFGS. The new direction is then computed using cholsol, a Cholesky solve, as applied to the updated Cholesky factorization of the Hessian and the gradient.

Convergence

Convergence is declared when the relative gradient is less than _max_gradtol. The relative gradient is a scaled gradient and is used for determining convergence in order to reduce the effects of scale. It is defined as the absolute value of the gradient times the absolute value of the parameter vector divided by the larger of one and the absolute value of the function. By default, _max_gradtol = 1e-5.

Berndt, Hall, Hall, and Hausman's (BHHH) Method

BHHH is a method proposed by Berndt, Hall, Hall and Hausman (1974) for the maximization of log-likelihood functions. It is a scoring method that uses the cross-product of the matrix of first derivatives to estimate the Hessian matrix. This calculation can be time-consuming, especially for large data sets, since a gradient matrix exactly the same size as the data set must be computed. For that reason BHHH cannot be considered a preferred choice for an optimization algorithm.

Polak-Ribiere-type Conjugate Gradient (PRCG)

The conjugate gradient method is an improvement on the steepest descent method without the increase in memory and computational requirements of the secant methods. Only the gradient is stored, and the calculation of the new direction is different:

    d_{t+1} = -g_{t+1} + \beta_t d_t

where t indicates the t-th iteration, d is the direction, and g is the gradient. The conjugate gradient method used in Maxlik is a variation called the Polak-Ribiere method, where

    \beta_t = \frac{(g_{t+1} - g_t)'\, g_{t+1}}{g_t'\, g_t}

The Newton and secant methods require storage on the order of the Hessian in memory, i.e., 8k² bytes of memory, where k is the number of parameters. For a very large problem this can be prohibitive. For example, 200 parameters will require 320 kilobytes of memory, and this doesn't count the copies of the Hessian that may be generated by the program. For large problems, then, the PRCG and STEEP methods may be the only alternative. As described above, STEEP can be very inefficient in the region of the minimum, and therefore the PRCG is the method of choice in these cases.

Line Search Methods

Given a direction vector d, the updated estimate of the parameters is computed

    \theta_{t+1} = \theta_t + \rho\delta

where ρ is a constant, usually called the step length, that increases the descent of the function given the direction. Maxlik includes a variety of methods for computing ρ. The value of the function to be minimized as a function of ρ is

    m(\rho) = L(\theta_t + \rho\delta)

Given θ and d, this is a function of the single variable ρ. Line search methods attempt to find a value for ρ that decreases m. STEPBT is a polynomial fitting method; BRENT and HALF are iterative search methods. A fourth method called ONE forces a step length of 1. The default line search method is STEPBT. If this, or any selected method, fails, then BRENT is tried. If BRENT fails, then HALF is tried. If all of the line search methods fail, then a random search is tried (provided _max_randradius is greater than zero).

STEPBT

STEPBT is an implementation of a similarly named algorithm described in Dennis and Schnabel (1983). It first attempts to fit a quadratic function to m(θ_t + ρδ) and computes a ρ that minimizes the quadratic. If that fails, it attempts to fit a cubic function. The cubic function more accurately portrays the function, which is not likely to be very quadratic, but it is more costly to compute. STEPBT is the default line search method because it generally produces the best results for the least cost in computational resources.

BRENT

This method is a variation on the golden section method due to Brent (1972). In this method, the function is evaluated at a sequence of test values for ρ. These test values are determined by extrapolation and interpolation using the constant (√5 − 1)/2 = 0.6180. This constant is the inverse of the so-called golden ratio ((√5 + 1)/2 = 1.6180), and is why the method is called a golden section method. This method is generally more efficient than STEPBT but requires significantly more function evaluations.

HALF

This method first computes m(x + d), i.e., sets ρ = 1. If m(x + d) < m(x), then the step length is set to 1. If not, then it tries m(x + .5d). The attempted step length is halved each time the function fails to decrease, and the method exits with the current value when it does decrease. This method usually requires the fewest function evaluations (it often requires only one), but it is the least efficient in that it is not very likely to find the step length that decreases m the most.
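The logic of the HALF method can be sketched as a stand-alone GAUSS procedure; this is for illustration only and is not Maxlik's internal routine, and the procedure name halfstep and its arguments (a pointer to the function being minimized, the current parameters x, and the direction d) are invented for the example.

/* Illustrative step-halving line search: returns a step length rho
   such that fct(x + rho*d) < fct(x), or the last rho tried. */
proc halfstep(&fct, x, d);
    local fct:proc;
    local f0, rho, maxtries, i;
    f0 = fct(x);
    rho = 1;                        /* start with a full step */
    maxtries = 20;
    i = 1;
    do while i <= maxtries;
        if fct(x + rho*d) < f0;     /* accept the first decrease found */
            retp(rho);
        endif;
        rho = rho/2;                /* otherwise halve the step and retry */
        i = i + 1;
    endo;
    retp(rho);
endp;

A call such as rho = halfstep(&m, x, d); would then supply the step length used in the update θ_{t+1} = θ_t + ρδ.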

BHHHStep

This is a variation on the golden search method. A sequence of step lengths is computed, interpolating or extrapolating using a golden ratio, and the method exits when the function decreases by an amount determined by _max_interp.

Random Search

If the line search fails, i.e., no ρ is found such that m(θ_t + ρδ) < m(θ_t), then a search is attempted for a random direction that decreases the function. The radius of the random search is fixed by the global variable _max_randradius (default = .01), times a measure of the magnitude of the gradient. Maxlik makes _max_maxtry attempts to find a direction that decreases the function, and if all of them fail, the direction with the smallest value for m is selected.

The function should never increase, but this assumes a well-defined problem. In practice, many functions are not so well-defined, and it often is the case that convergence is more likely achieved by a direction that puts the function somewhere else on the hyper-surface, even if it is at a higher point on the surface. Another reason for permitting an increase in the function here is that halting the minimization altogether is the only alternative if it is not at the minimum, and so one might as well retreat to another starting point. If the function repeatedly increases, then you would do well to consider improving either the specification of the problem or the starting point.

Weighted Maximum Likelihood

Weights are specified by setting the GAUSS global __weight to a weighting vector, or by assigning it the name of a column in the GAUSS data set being used in the estimation. Thus if a data matrix is being analyzed, __weight must be assigned to a vector. Maxlik assumes that the weights sum to the number of observations, i.e., that the weights
are frequencies. This will be an issue only with statistical inference. Otherwise, any multiple of the weights will produce the same results.

Active and Inactive Parameters

The Maxlik global _max_active may be used to fix parameters to their start values. This allows estimation of different models without having to modify the function procedure. _max_active must be set to a vector of the same length as the vector of start values. Elements of _max_active set to zero will be fixed to their starting values, while nonzero elements will be estimated.

This feature may also be used for model testing. Twice _max_numobs times the difference between the function values (the second return argument in the call to Maxlik) is chi-squared distributed with degrees of freedom equal to the number of fixed parameters in _max_active.

Example

This example estimates coefficients for a tobit model:

library maxlik;
#include maxlik.ext;

maxset;

proc lpr(x,z);
    local t,s,m,u;
    s = x[4];
    if s <= 1e-4;
        retp(error(0));
    endif;
    m = z[.,2:4]*x[1:3,.];
    u = z[.,1] ./= 0;
    t = z[.,1]-m;
    retp(u.*(-(t.*t)./(2*s)-.5*ln(2*s*pi)) + (1-u).*(ln(cdfnc(m/sqrt(s)))));
endp;

x0 = { 1, 1, 1, 1 };
title = "tobit example";

{ x,f,g,cov,ret } = maxlik("tobit",0,&lpr,x0);
call maxprt(x,f,g,cov,ret);

The output is:

===========================================================================
                               tobit example
===========================================================================

MAXLIK Version
===========================================================================
Data Set: tobit

return code = 0
normal convergence

Mean log-likelihood
Number of cases                      100

Covariance matrix of the parameters computed by the following method:
Inverse of computed Hessian

Parameters    Estimates    Std. err.    Est./s.e.    Prob.    Gradient
P
P
P
P

Correlation matrix of the parameters
Number of iterations        17
Minutes to convergence

3.3 Managing Optimization

The critical elements in optimization are scaling, the starting point, and the condition of the model. When the data are scaled, the starting point is reasonably close to the solution, and the data and model go together well, the iterations converge quickly and without difficulty.

For best results, therefore, you want to prepare the problem so that the model is well-specified, the data are scaled, and a good starting point is available. The tradeoff among algorithms and step length methods is between speed and demands on the starting point and condition of the model. The less demanding methods are generally time consuming and computationally intensive, whereas the quicker methods (either in terms of time or number of iterations to convergence) are more sensitive to conditioning and the quality of the starting point.

Scaling

For best performance, the diagonal elements of the Hessian matrix should be roughly equal. If some diagonal elements contain numbers that are very large and/or very small with respect to the others, Maxlik has difficulty converging. How to scale the diagonal elements of the Hessian may not be obvious, but it may suffice to ensure that the constants (or data) used in the model are about the same magnitude.
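As a simple illustration of putting the data on a common scale, the sketch below divides each column of a data matrix by its mean absolute value before estimation; the data set name is invented for the example, and the estimated coefficients would of course have to be interpreted on (or transformed back from) the rescaled units.

/* Rescale each column of the data so its typical magnitude is near 1 */
z = loadd("mydata");            /* hypothetical GAUSS data set */
scale = meanc(abs(z));          /* typical magnitude of each column */
zs = z ./ scale';               /* each column now has mean absolute value 1 */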

Condition

The specification of the model can be measured by the condition of the Hessian. The solution of the problem is found by searching for parameter values for which the gradient is zero. If, however, the Jacobian of the gradient (i.e., the Hessian) is very small for a particular parameter, then Maxlik has difficulty determining the optimal values since a large region of the function appears virtually flat to Maxlik. When the Hessian has very small elements, the inverse of the Hessian has very large elements and the search direction gets buried in the large numbers.

Poor condition can be caused by bad scaling. It can also be caused by a poor specification of the model or by bad data. Bad models and bad data are two sides of the same coin. If the problem is highly nonlinear, it is important that data be available to describe the features of the curve described by each of the parameters. For example, one of the parameters of the Weibull function describes the shape of the curve as it approaches the upper asymptote. If data are not available on that portion of the curve, then that parameter is poorly estimated. The gradient of the function with respect to that parameter is very flat, the elements of the Hessian associated with that parameter are very small, and the inverse of the Hessian contains very large numbers. In this case it is necessary to respecify the model in a way that excludes that parameter.

Computer Arithmetic

Computer arithmetic is fundamentally flawed by the fact that the computer number is finite (see Higham, 1996, for a general discussion). The standard double precision number in PCs carries about 16 significant decimal places. A simple operation can destroy nearly all of those places. The most destructive operations on a computer are addition and subtraction. Numbers are stored in a computer in the form of an abscissa and an exponent. There are about 16 decimal places of precision on most computers. The problem occurs when adding numbers that are of very different size. Before adding, the numbers must be transformed so that the exponents are the same. For example, consider adding a number of the order of e-07 to a number of the order of e+00.

When the exponents are made equal, the smaller number is shifted to the right and its trailing digits fall off the end of the 16-place abscissa; in this example eight places are lost in the smaller number. If the difference in exponents were 16, all of the places in the smaller number would be lost. This problem is due to the finiteness of the computer number, not to the implementation of the operators. It is an inherent problem in all computers, and the only solution, adding more bits to the computer number, is only temporary, because sooner or later a problem will arise where that quantity of bits won't be enough.

The first lesson to be learned from this is to avoid operations combining very small numbers with relatively large numbers. And for very small numbers, 1 can be a large number, as the example shows.

The standard method for evaluating the precision lost in computing a matrix inverse is the ratio of the largest to the smallest eigenvalue of the matrix. This quantity is sometimes called the condition number. The log of the condition number to the base 10 is approximately the number of decimal places lost in computing the inverse. A condition number greater than 1e16 therefore indicates that all of the 16 decimal places available in the standard double precision floating point number are lost.

The BFGS optimization method in Maxlik has been successful primarily because its method of generating an approximation to the Hessian encourages better conditioning. The implementation of the NEWTON method involves a numerical calculation of the Hessian. A numerical Hessian, like all numerical derivatives, is computed by first computing a difference, the most destructive operation as we've seen, and then compounding that by dividing the difference by a very small quantity. In general, when using double precision with 16 places of accuracy, about four places are lost in calculating a first derivative and another four with the second derivative. The numerical Hessian therefore begins with a loss of eight places of precision. If there are any problems computing the function itself, or if the model itself contains any problems of condition, there may be nothing left at all.
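The decimal-places-lost rule of thumb is easy to check in GAUSS; the sketch below assumes a symmetric Hessian already stored in a matrix h (for example, one retrieved through _max_diagnostic as described in the Diagnosis section below) and is illustrative only.

/* Condition number of the Hessian and approximate digits lost in inverting it */
ev = eigh(h);                            /* eigenvalues of the symmetric matrix h */
condnum = maxc(abs(ev))/minc(abs(ev));   /* ratio of largest to smallest eigenvalue */
digitslost = log(condnum);               /* log() is base-10 in GAUSS */
print "condition number " condnum;
print "approximate decimal places lost " digitslost;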

The BFGS method avoids many of the problems in computing a numerical Hessian. It produces an approximation by building information slowly with each iteration. Initially the Hessian is set to the identity matrix, the matrix with the best condition but the least information. Information is increased at each iteration with a method that guarantees a positive definite result. This provides for more stable, though slower, progress towards convergence.

The implementation has been designed to minimize the damage to the precision of the optimization problem. The BFGS method avoids a direct calculation of the numerical Hessian, and uses sophisticated techniques for calculating the direction that preserve as much precision as possible. However, all of this can be defeated by a poorly scaled problem or a poorly specified model.

When the objective function being optimized is a log-likelihood, the inverse of the Hessian is an estimate of the covariance matrix of the sampling distribution of the parameters. The condition of the Hessian is related to (i) the scaling of the parameters, and (ii) the degree to which there are linear dependencies in the sampling distribution of the parameters.

Scaling

Scaling is under the direct control of the investigator and should never be an issue in the optimization. It might not always be obvious how to do it, though. In estimation problems, scaling of the parameters is usually implemented by scaling the data. In regression models this is simple to accomplish, but in more complicated models it might be more difficult to do. It might be necessary to experiment with different scalings to get it right.

The goal is to optimize the condition of the Hessian. The definition of the condition number implies that we endeavor to minimize the ratio of the largest to the smallest eigenvalue of the Hessian. A rule of thumb for this is to scale the Hessian so that the diagonal elements are all about the same magnitude.

If the scaling of the Hessian proves too difficult, an alternative method is to scale the parameters directly in the procedure computing the log-likelihood. Multiply or divide the parameter values being passed to the procedure by fixed constants before their use in the calculation of the log-likelihood. Experiment with different values until the diagonal
elements of the Hessian are all about the same magnitude.

Linear Dependencies or Nearly Linear Dependencies in the Sampling Distribution

This is the most common difficulty in estimation and arises because of a discrepancy between the data and the model. If the data do not contain sufficient information to identify a parameter or set of parameters, a linear dependency is generated. A simple example occurs with a regressor that cannot be distinguished from the constant because its variation is too small. When this happens, the sampling distribution of these two parameters becomes highly collinear. This collinearity will produce an eigenvalue approaching zero in the Hessian, increasing the number of places lost in the calculation of the inverse of the Hessian and degrading the optimization.

In the real world, the data we have available will frequently fail to contain the information we need to estimate all of the parameters of our models. This means that it is a constant struggle to achieve a well-conditioned estimation. When the condition deteriorates to the point that the optimization fails, or the statistical inference fails through a failure to invert the Hessian, either more data must be found or the model must be re-specified. Re-specification means either the direct reduction of the parameter space, that is, a parameter is deleted from the model, or some sort of restriction is applied to the parameters.

Diagnosing the Linear Dependency

At times it may be very difficult to determine the cause of the ill-conditioning. If the Hessian being computed at convergence for the covariance matrix of the parameters fails to invert, try the following: first generate the pivoted QR factorization of the Hessian,

{ R,E } = qre(h);

The linearly dependent columns of H are pivoted to the end of the R matrix. E contains the new order of the columns of H after pivoting. The number of linearly dependent columns is found by looking at the number of nearly zero elements at the end of the diagonal of R. We can compute a coefficient matrix of the linear relationship of the dependent columns on the remaining columns by computing

    R_{11}^{-1} R_{12}

where R_{11} is that portion of the R matrix associated with the independent columns and R_{12} is the portion relating the independent columns to the dependent ones. Rather than use the inverse function in GAUSS, we use a special solve function that takes advantage of the triangular shape of R_{11}. Suppose that the last two elements of the diagonal of R are nearly zero; then

r0 = rows(r);
r1 = rows(r) - 1;
r2 = rows(r) - 2;
B = utrisol(r[1:r2,r1:r0], r[1:r2,1:r2]);

B describes the linear dependencies among the columns of H and can be used to diagnose the ill-conditioning in the Hessian.

Starting Point

When the model is not particularly well-defined, the starting point can be critical. When the optimization doesn't seem to be working, try different starting points. A closed form solution may exist for a simpler problem with the same parameters. For example, ordinary least squares estimates may be used for nonlinear least squares problems or for nonlinear regressions like probit or logit. There are no general methods for computing start values, and it may be necessary to attempt the estimation from a variety of starting points.
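For instance, in a model with a linear index such as the tobit example above, ordinary least squares coefficients can serve as rough start values; the sketch below is illustrative only and assumes the same data layout as the earlier examples (dependent variable in column 1 of z, regressors in columns 2 through 4).

/* OLS coefficients and residual variance as rough start values */
y = z[.,1];
x = z[.,2:4];
b_ols = invpd(x'x)*(x'y);       /* closed-form least squares solution */
e = y - x*b_ols;
s0 = e'e/rows(e);               /* residual variance as a start value for the variance parameter */
x0 = b_ols | s0;                /* stacked start vector in the order expected by lpr */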

Diagnosis

When the optimization is not proceeding well, it is sometimes useful to examine the function, the gradient Ψ, the direction δ, the Hessian Σ, the parameters θ_t, or the step length ρ during the iterations. The current values of these matrices can be printed out or stored in the global _max_diagnostic by setting _max_diagnostic to a nonzero value. Setting it to 1 causes Maxlik to print them to the screen or output file, 2 causes Maxlik to store them in _max_diagnostic, and 3 does both.

When you have selected _max_diagnostic = 2 or 3, Maxlik inserts the matrices into _max_diagnostic using the vput command. The matrices are extracted using the vread command. For example,

_max_diagnostic = 2;
call MAXPrt(maxlik("tobit",0,&lpr,x0));
h = vread(_max_diagnostic,"hessian");
d = vread(_max_diagnostic,"direct");

The following table contains the strings to be used to retrieve the various matrices in the vread command:

    θ      params
    δ      direct
    Σ      hessian
    Ψ      gradient
    ρ      step

When nested calls to Maxlik are made, i.e., when the procedure for computing the log-likelihood itself calls its own version of Maxlik, _max_diagnostic returns the matrices of the outer call to Maxlik only.
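Continuing the example, the remaining stored quantities can be retrieved the same way; the lines below are simply an illustrative use of vread with the strings in the table.

p = vread(_max_diagnostic,"params");      /* current parameter values */
g = vread(_max_diagnostic,"gradient");    /* current gradient */
s = vread(_max_diagnostic,"step");        /* current step length */
print "Hessian diagonal " diag(h)';       /* widely differing magnitudes suggest a scaling problem */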

3.4 Gradients

Analytical Gradient

To increase accuracy and reduce time, you may supply a procedure for computing the gradient, Ψ(θ) = ∂L/∂θ, analytically. This procedure has two input arguments, a K × 1 vector of parameters and an N_i × L submatrix of the input data set. The number of rows of the data set passed in the argument to the call of this procedure may be less than the total number of observations when the data are stored in a GAUSS data set and there was not enough space to store the data set in RAM in its entirety. In that case subsets of the data set are passed to the procedure in sequence.

The gradient procedure must be written to return a gradient (or more accurately, a Jacobian) with as many rows as the input submatrix of the data set. Thus the gradient procedure returns an N_i × K matrix of gradients of the N_i observations with respect to the K parameters. The Maxlik global _max_gradproc is then set to the pointer to that procedure. For example,

library maxlik;
#include maxlik.ext;

maxset;

proc lpsn(b,z);    /* Function - Poisson Regression */
    local m;
    m = z[.,2:4]*b;
    retp(z[.,1].*m-exp(m));
endp;

proc lgd(b,z);     /* Gradient */
    retp((z[.,1]-exp(z[.,2:4]*b)).*z[.,2:4]);
endp;

x0 = { .5, .5, .5 };

_max_gradproc = &lgd;
_max_gradchecktol = 1e-3;

{ x,f0,g,h,retcode } = MAXLIK("psn",0,&lpsn,x0);
call MAXPrt(x,f0,g,h,retcode);

In practice, unfortunately, much of the time spent on writing the gradient procedure is devoted to debugging. To help in this debugging process, Maxlik can be instructed to compute the numerical gradient along with your prospective analytical gradient for comparison purposes. In the example above this is accomplished by setting _max_gradchecktol to 1e-3.

User-Supplied Numerical Gradient

You may substitute your own numerical gradient procedure for the one used by Maxlik by default. This is done by setting the Maxlik global _max_usernumgrad to a pointer to the procedure. Maxlik includes some numerical gradient functions in gradient.src which can be invoked using this global. One of these procedures, gradre, computes numerical gradients using the Richardson Extrapolation method. To use this method set

_max_usernumgrad = &gradre;

Algorithmic Derivatives

Algorithmic Derivatives is a program that can be used to generate a GAUSS procedure to compute derivatives of the log-likelihood function. If you have Algorithmic Derivatives, be sure to read its manual for details on doing this.

First, copy the procedure computing the log-likelihood to a separate file. Second, from the command line enter

ad file_name d_file_name

where file_name is the name of the file containing the input function procedure, and d_file_name is the name of the file containing the output derivative procedure. If the input function procedure is named lpr, the output derivative procedure has the name d_1_lpr, where the added _1_ indicates that the derivative is with respect to the first of the two arguments.

For example, put the following function into a file called lpr.fct:

proc lpr(x,z);
    local s,m,u;
    s = x[4];
    m = z[.,2:4]*x[1:3,.];
    u = z[.,1] ./= 0;
    retp(u.*lnpdfmvn(z[.,1]-m,s) + (1-u).*(lncdfnc(m/sqrt(s))));
endp;

Then enter the following at the GAUSS command line:

library ad;
ad lpr.fct d_lpr.fct;

If successful, the following is printed to the screen

java -jar d:\gauss6.0\src\gaussad.jar lpr.fct d_lpr.fct

and the derivative procedure is written to a file named d_lpr.fct:

/* Version:1.0 - May 15, 2004 */
/* Generated from:lpr.fct */
/* Taking derivative with respect to argument 1 */

Proc(1)=d_1_lpr(x, z);
    Clearg _AD_fnValue;
    Local s, m, u;
    s = x[(4)] ;
    Local _AD_t1;
    _AD_t1 = x[(1):(3),.] ;
    m = z[.,(2):(4)] * _AD_t1;
    u = z[.,(1)] ./= 0;
    _AD_fnValue = (u.* lnpdfmvn( z[.,(1)] - m, s)) + ((1 - u).* lncdfnc(m / sqrt(s)));
    /* retp(_AD_fnValue); */
    /* endp; */
    struct _ADS_optimum _AD_d__AD_t1,_AD_d_x,_AD_d_s,_AD_d_m,_AD_d__AD_fnValue;
    /* _AD_d__AD_t1 = 0; _AD_d_s = 0; _AD_d_m = 0; */
    _AD_d__AD_fnValue = _ADP_d_x_dx(_AD_fnValue);
    _AD_d_s = _ADP_DtimesD(_AD_d__AD_fnValue, _ADP_DplusD(_ADP_DtimesD(_ADP_d_xplusy_dx(u.* lnpdfmvn( z[.,(1)] - m, s), (1 - u).* lncdfnc(m / sqrt(s))), _ADP_DtimesD(_ADP_d_ydotx_dx(u, lnpdfmvn( z[.,(1)] - m, s)), _ADP_DtimesD(_ADP_internal(d_2_lnpdfmvn( z[.,(1)] - m, s)), _ADP_d_x_dx(s)))), _ADP_DtimesD(_ADP_d_yplusx_dx(u.* lnpdfmvn( z[.,(1)] - m, s), (1 - u).* lncdfnc(m / sqrt(s))), _ADP_DtimesD(_ADP_d_ydotx_dx(1 - u, lncdfnc(m / sqrt(s))), _ADP_DtimesD(_ADP_d_lncdfnc(m / sqrt(s)), _ADP_DtimesD(_ADP_d_ydivx_dx(m, sqrt(s)), _ADP_DtimesD(_ADP_d_sqrt(s), _ADP_d_x_dx(s))))))));
    _AD_d_m = _ADP_DtimesD(_AD_d__AD_fnValue, _ADP_DplusD(_ADP_DtimesD(_ADP_d_xplusy_dx(u.* lnpdfmvn( z[.,(1)] - m, s), (1 - u).* lncdfnc(m / sqrt(s))), _ADP_DtimesD(_ADP_d_ydotx_dx(u, lnpdfmvn( z[.,(1)] - m, s)), _ADP_DtimesD(_ADP_internal(d_1_lnpdfmvn( z[.,(1)] - m, s)), _ADP_DtimesD(_ADP_d_yminusx_dx( z[.,(1)], m), _ADP_d_x_dx(m))))), _ADP_DtimesD(_ADP_d_yplusx_dx(u.* lnpdfmvn( z[.,(1)] - m, s), (1 - u).* lncdfnc(m / sqrt(s))), _ADP_DtimesD(_ADP_d_ydotx_dx(1 - u, lncdfnc(m / sqrt(s) )), _ADP_DtimesD(_ADP_d_lncdfnc(m / sqrt(s)), _ADP_DtimesD(_ADP_d_xdivy_dx(m, sqrt(s)), _ADP_d_x_dx(m)))))));
    /* u = z[.,(1)] ./= 0; */
    _AD_d__AD_t1 = _ADP_DtimesD(_AD_d_m, _ADP_DtimesD(_ADP_d_yx_dx( z[.,(2):(4)], _AD_t1), _ADP_d_x_dx(_AD_t1)));
    Local _AD_sr_x, _AD_sc_x;
    _AD_sr_x = _ADP_seqaMatrixRows(x);
    _AD_sc_x = _ADP_seqaMatrixCols(x);
    _AD_d_x = _ADP_DtimesD(_AD_d__AD_t1, _ADP_d_x2Idx_dx(x, _AD_sr_x[(1):(3)], _AD_sc_x[0] ));
    Local _AD_s_x;
    _AD_s_x = _ADP_seqaMatrix(x);
    _AD_d_x = _ADP_DplusD(_ADP_DtimesD(_AD_d_s, _ADP_d_xIdx_dx(x, _AD_s_x[(4)] )), _AD_d_x);
    retp(_ADP_external(_AD_d_x));
endp;

If there's a syntax error in the input function procedure, the following is written to the screen

java -jar d:\gauss6.0\src\gaussad.jar lpr.fct d_lpr.fct
Command java -jar d:\gauss6.0\src\gaussad.jar lpr.fct d_lpr.fct exited

with the exit status 1 indicating that an error has occurred. The output file then contains the reason for the error:

/* Version:1.0 - May 15, 2004 */
/* Generated from:lpr.fct */
/* Taking derivative with respect to argument 1 */

proc lpr(x,z);
    local s,m,u;
    s = x[4];
    m = z[.,2:4]*x[1:3,.];
    u = z[.,1] ./= 0;
    retp(u.*lnpdfmvn(z[.,1]-m,s) + (1-u).*(lncdfnc(m/sqrt(s)));

Error: lpr.fct:12:63: expecting ), found ;

Finally, set the global _max_gradproc equal to a pointer to the generated derivative procedure, for example,

library maxlik,ad;
#include ad.sdf

x0 = { 1, 1, 1, 1 };
title = "tobit example";

_max_bounds = { , , , .1 10 };

_max_gradproc = &d_1_lpr;

Maxlik("tobit",0,&lpr,x0);

Speeding Up the Algorithmic Derivative

A slightly faster derivative procedure can be generated by modifying the log-likelihood proc to return a scalar sum of the log-likelihoods in the input file in the call to AD. It is important to note that this derivative function based on a scalar return cannot be used for computing the QML covariance matrix of the parameters. Thus if you want both a derivative procedure based on a scalar return and QML standard errors, you will need to provide both types of gradient procedures.

To accomplish this, first copy both versions of the log-likelihood procedure into separate files and run AD on both of them with different output files. Then copy both of these derivative procedures to the command file. Note: the log-likelihood procedure that returns a vector of log-likelihoods should remain in the command file, i.e., don't use the version of the log-likelihood that returns a scalar in the command file.

For example, enlarging on the example in the previous section, put the following into a separate file:

36 Maxlik 5.0 for GAUSS proc lpr2(x,z); local s,m,u,logl; s = x[4]; m = z[.,2:4]*x[1:3,.]; u = z[.,1]./= 0; logl = u.*lnpdfmvn(z[.,1]-m,s) + (1-u).*(lncdfnc(m/sqrt(s))); retp(sumc(logl)); endp; Then enter on the command line ad lpr2.src d_lpr2.src and copy the contents of d lpr2.src into the command file. Our comand file now contains two derivative procedures, one based on a scalar result and another on a vector result. The one in the previous section d_1_lpr is our vector result derivative, and the from run above, d_1_lpr2 is our scalar result derivative. We want to use d_1_lpr2 for the iterations because it will be faster (it is computing a 1 K vector gradient), and for the QML covariance matrix of the parameters we will use d_1_lpr which returns a N K matrix of derivatives as required for the QML covariance matrix. Our command file will be library maxlik,ad; #include ad.sdf x0 = { 1, 1, 1, 1 }; title = "tobit example"; _max_bounds = { , , 3-24

_max_bounds = { , , , .1 10 };

_max_qmlproc = &d_1_lpr;
_max_gradproc = &d_1_lpr2;

Maxlik("tobit",0,&lpr,x0);

in addition to the two derivative procedures.

Analytical Hessian

You may provide a procedure for computing the Hessian, Σ(θ) = ∂²L/∂θ∂θ′. This procedure has two arguments, the K × 1 vector of parameters and an N_i × L submatrix of the input data set (where N_i may be less than N), and returns a K × K symmetric matrix of second derivatives of the objective function with respect to the parameters. The pointer to this procedure is stored in the global variable _max_hessproc.

In practice, unfortunately, much of the time spent on writing the Hessian procedure is devoted to debugging. To help in this debugging process, Maxlik can be instructed to compute the numerical Hessian along with your prospective analytical Hessian for comparison purposes. To accomplish this, _max_gradchecktol is set to a small nonzero value.

library maxlik;
#include maxlik.ext;

proc lnlk(b,z);
    local dev,s2;
    dev = z[.,1] - b[1] * exp(-b[2]*z[.,2]);
    s2 = dev'dev/rows(dev);
    retp(-0.5*(dev.*dev/s2 + ln(2*pi*s2)));
endp;

proc grdlk(b,z);
    local d,s2,dev,r;
    d = exp(-b[2]*z[.,2]);
    dev = z[.,1] - b[1]*d;
    s2 = dev'dev/rows(dev);
    r = dev.*d/s2;
    /* retp(r ~ (-b[1]*z[.,2].*r));    correct gradient */
    retp(r ~ (z[.,2].*r));             /* incorrect gradient */
endp;

proc hslk(b,z);
    local d,s2,dev,r,hss;
    d = exp(-b[2]*z[.,2]);
    dev = z[.,1] - b[1]*d;
    s2 = dev'dev/rows(dev);
    if s2 <= 0;
        retp(error(0));
    endif;
    r = z[.,2].*d.*(b[1].*d - dev)/s2;
    hss = -d.*d/s2 ~ r ~ -b[1].*z[.,2].*r;
    retp(xpnd(sumc(hss)));
endp;

maxset;

_max_hessproc = &hslk;
_max_gradproc = &grdlk;
_max_gradchecktol = 1e-3;

startv = { 2, 1 };

{ x,f0,g,cov,retcode } = MAXLIK("nlls",0,&lnlk,startv);
call MAXPrt(x,f0,g,cov,retcode);

The gradient is incorrectly computed, and Maxlik responds with an error message. It is clear that the error is in the calculation of the gradient for the second parameter.

analytical and numerical gradients differ

   numerical      analytical

========================================================================
            analytical Hessian and analytical gradient
========================================================================

MAXLIK Version
========================================================================
Data Set: nlls

return code = 7
function cannot be evaluated at initial parameter values

Mean log-likelihood
Number of cases                      150

The covariance of the parameters failed to invert

Parameters    Estimates    Gradient
P
P

Number of iterations
Minutes to convergence

User-Supplied Numerical Hessian

You may substitute your own numerical Hessian procedure for the one used by Maxlik by default. This is done by setting the Maxlik global _max_userhess to a pointer to the procedure. This procedure has three input arguments, a pointer to the log-likelihood function, a K × 1 vector of parameters, and an N_i × K matrix containing the data. It must return a K × K matrix which is the estimated Hessian evaluated at the parameter vector.
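As an illustration of the required form, here is a minimal central-difference numerical Hessian with the calling convention just described; it is a sketch only (fixed step size, no refinements), and the procedure name numhess is invented for the example.

/* Illustrative user-supplied numerical Hessian: central differences on the
   sum of the per-observation log-likelihoods returned by the passed procedure. */
proc numhess(&fct, b, z);
    local fct:proc;
    local k, h, hmat, i, j, ei, ej, fpp, fpm, fmp, fmm;
    k = rows(b);
    h = 1e-4;                       /* fixed step size; a sketch only */
    hmat = zeros(k,k);
    i = 1;
    do while i <= k;
        j = 1;
        do while j <= i;
            ei = zeros(k,1);
            ei[i] = h;
            ej = zeros(k,1);
            ej[j] = h;
            fpp = sumc(fct(b+ei+ej, z));
            fpm = sumc(fct(b+ei-ej, z));
            fmp = sumc(fct(b-ei+ej, z));
            fmm = sumc(fct(b-ei-ej, z));
            hmat[i,j] = (fpp - fpm - fmp + fmm)/(4*h*h);
            hmat[j,i] = hmat[i,j];
            j = j + 1;
        endo;
        i = i + 1;
    endo;
    retp(hmat);
endp;

_max_userhess = &numhess;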

Switching Algorithms Automatically

The global variable _max_switch can be used to switch algorithms automatically during the iterations. If _max_switch has one column, the algorithm is switched once during the iterations, and if it has two columns it is switched back and forth. The conditions for the switching are determined by the elements of _max_switch in the second through fourth rows. If these rows are not supplied, default values are entered.

The first row contains the algorithm numbers to switch to, or, if there are two columns, to switch to and from. The algorithm switches if the log-likelihood function improves by less than the quantity in the second row, or if the number of iterations exceeds the quantity in the third row, or if the line search changes by less than the quantity in the fourth row. If only the first row is specified in the command file, that is, if only the algorithm numbers are entered, the second, third and fourth rows are set by default to .001, 10, and .001, respectively.

3.5 FASTMAX Fast Execution

Depending on the type of problem, FASTMAX, the fast version of Maxlik, can be called with speed-ups from 10 percent to 500 percent over the regular version of Maxlik. This is achieved at the expense of losing some features; in particular, it won't print any iteration information to the screen, the globals cannot be modified on the fly, and it can't print or store diagnostic information. Moreover, the dataset must be entirely storable in RAM.

The gain in time depends on the type of problem. The greatest speedup occurs with problems that are function call intensive. The speedup will be less if gradients and/or Hessians are provided. The least speedup occurs for problems where convergence is quick, and the most where convergence is slow. Thus FASTMAX will least affect a bootstrap or profile likelihood estimation for models that converge quickly, and most affect those that don't. FASTMAX is most useful for problems that will be repeated in some way such as in a Monte
Carlo study or a bootstrap. The initial runs would use Maxlik, where monitoring the progress is most important, and subsequent runs would use FASTMAX.

FASTMAX has the same arguments and returns as Maxlik, and thus to call it you may change the name Maxlik in your command file to FASTMAX. FASTMAX does require that the dataset be storable in memory in its entirety, however, and if that isn't possible FASTMAX will fail. In a similar way, for the fast versions of MAXBOOT, MAXPROFILE, and MAXBAYES, change the calls to FASTBOOT, FASTPROFILE, and FASTBAYES, respectively. No changes in input or output arguments are necessary.

Undefined Function Evaluation

On occasion the log-likelihood function will evaluate to an undefined value; for example, the log-likelihood procedure may attempt to take the log of a negative quantity for one or more observations. If you have written your procedure to return a scalar missing value when this happens, Maxlik will succeed in recovering in most cases. That is, depending on circumstances it will find another set of parameter values or use a different line search method.

If you are using FASTMAX, you can try a different strategy. Write your procedure to enter a missing value in the log-likelihood vector for each observation for which the calculation is undefined. FASTMAX will compute gradients and function values by list-wise deletion. In other words, it will compute the function and gradient from the available observations.

3.6 Inference

Maxlik includes four classes of methods for analyzing the distributions of the estimated parameters: Wald inference, profile likelihood inference, the bootstrap, and Bayesian inference.
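The first of these, Wald inference, works directly from the covariance matrix returned by Maxlik. The lines below are a generic illustration, not one of the manual's worked examples, of forming approximate 95 percent Wald limits from the returns of the earlier tobit call; they assume that cov is the parameter covariance matrix as printed by maxprt.

/* Approximate 95 percent Wald confidence limits from the Maxlik returns */
{ x,f,g,cov,ret } = maxlik("tobit",0,&lpr,x0);
se = sqrt(diag(cov));            /* standard errors, assuming cov is the parameter covariance matrix */
lo = x - 1.96*se;                /* lower limits */
hi = x + 1.96*se;                /* upper limits */
print x~se~lo~hi;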


Box-Cox Transformation for Simple Linear Regression Chapter 192 Box-Cox Transformation for Simple Linear Regression Introduction This procedure finds the appropriate Box-Cox power transformation (1964) for a dataset containing a pair of variables that are

More information

06: Logistic Regression

06: Logistic Regression 06_Logistic_Regression 06: Logistic Regression Previous Next Index Classification Where y is a discrete value Develop the logistic regression algorithm to determine what class a new input should fall into

More information

CS281 Section 3: Practical Optimization

CS281 Section 3: Practical Optimization CS281 Section 3: Practical Optimization David Duvenaud and Dougal Maclaurin Most parameter estimation problems in machine learning cannot be solved in closed form, so we often have to resort to numerical

More information

Title. Description. stata.com

Title. Description. stata.com Title stata.com optimize( ) Function optimization Description Syntax Remarks and examples Conformability Diagnostics References Also see Description These functions find parameter vector or scalar p such

More information

10/11/2013. Chapter 3. Objectives. Objectives (continued) Introduction. Attributes of Algorithms. Introduction. The Efficiency of Algorithms

10/11/2013. Chapter 3. Objectives. Objectives (continued) Introduction. Attributes of Algorithms. Introduction. The Efficiency of Algorithms Chapter 3 The Efficiency of Algorithms Objectives INVITATION TO Computer Science 1 After studying this chapter, students will be able to: Describe algorithm attributes and why they are important Explain

More information

DOWNLOAD PDF BIG IDEAS MATH VERTICAL SHRINK OF A PARABOLA

DOWNLOAD PDF BIG IDEAS MATH VERTICAL SHRINK OF A PARABOLA Chapter 1 : BioMath: Transformation of Graphs Use the results in part (a) to identify the vertex of the parabola. c. Find a vertical line on your graph paper so that when you fold the paper, the left portion

More information

Roundoff Errors and Computer Arithmetic

Roundoff Errors and Computer Arithmetic Jim Lambers Math 105A Summer Session I 2003-04 Lecture 2 Notes These notes correspond to Section 1.2 in the text. Roundoff Errors and Computer Arithmetic In computing the solution to any mathematical problem,

More information

Reals 1. Floating-point numbers and their properties. Pitfalls of numeric computation. Horner's method. Bisection. Newton's method.

Reals 1. Floating-point numbers and their properties. Pitfalls of numeric computation. Horner's method. Bisection. Newton's method. Reals 1 13 Reals Floating-point numbers and their properties. Pitfalls of numeric computation. Horner's method. Bisection. Newton's method. 13.1 Floating-point numbers Real numbers, those declared to be

More information

An interesting related problem is Buffon s Needle which was first proposed in the mid-1700 s.

An interesting related problem is Buffon s Needle which was first proposed in the mid-1700 s. Using Monte Carlo to Estimate π using Buffon s Needle Problem An interesting related problem is Buffon s Needle which was first proposed in the mid-1700 s. Here s the problem (in a simplified form). Suppose

More information

AM205: lecture 2. 1 These have been shifted to MD 323 for the rest of the semester.

AM205: lecture 2. 1 These have been shifted to MD 323 for the rest of the semester. AM205: lecture 2 Luna and Gary will hold a Python tutorial on Wednesday in 60 Oxford Street, Room 330 Assignment 1 will be posted this week Chris will hold office hours on Thursday (1:30pm 3:30pm, Pierce

More information

Iterative Algorithms I: Elementary Iterative Methods and the Conjugate Gradient Algorithms

Iterative Algorithms I: Elementary Iterative Methods and the Conjugate Gradient Algorithms Iterative Algorithms I: Elementary Iterative Methods and the Conjugate Gradient Algorithms By:- Nitin Kamra Indian Institute of Technology, Delhi Advisor:- Prof. Ulrich Reude 1. Introduction to Linear

More information

Description Syntax Remarks and examples Conformability Diagnostics References Also see

Description Syntax Remarks and examples Conformability Diagnostics References Also see Title stata.com solvenl( ) Solve systems of nonlinear equations Description Syntax Remarks and examples Conformability Diagnostics References Also see Description The solvenl() suite of functions finds

More information

Tree-GP: A Scalable Bayesian Global Numerical Optimization algorithm

Tree-GP: A Scalable Bayesian Global Numerical Optimization algorithm Utrecht University Department of Information and Computing Sciences Tree-GP: A Scalable Bayesian Global Numerical Optimization algorithm February 2015 Author Gerben van Veenendaal ICA-3470792 Supervisor

More information

Chapter 1. Math review. 1.1 Some sets

Chapter 1. Math review. 1.1 Some sets Chapter 1 Math review This book assumes that you understood precalculus when you took it. So you used to know how to do things like factoring polynomials, solving high school geometry problems, using trigonometric

More information

Generalized Additive Model

Generalized Additive Model Generalized Additive Model by Huimin Liu Department of Mathematics and Statistics University of Minnesota Duluth, Duluth, MN 55812 December 2008 Table of Contents Abstract... 2 Chapter 1 Introduction 1.1

More information

E04DGF NAG Fortran Library Routine Document

E04DGF NAG Fortran Library Routine Document E04 Minimizing or Maximizing a Function E04DGF NAG Fortran Library Routine Document Note. Before using this routine, please read the Users Note for your implementation to check the interpretation of bold

More information

Cost Functions in Machine Learning

Cost Functions in Machine Learning Cost Functions in Machine Learning Kevin Swingler Motivation Given some data that reflects measurements from the environment We want to build a model that reflects certain statistics about that data Something

More information

Contents. I Basics 1. Copyright by SIAM. Unauthorized reproduction of this article is prohibited.

Contents. I Basics 1. Copyright by SIAM. Unauthorized reproduction of this article is prohibited. page v Preface xiii I Basics 1 1 Optimization Models 3 1.1 Introduction... 3 1.2 Optimization: An Informal Introduction... 4 1.3 Linear Equations... 7 1.4 Linear Optimization... 10 Exercises... 12 1.5

More information

Optimization and least squares. Prof. Noah Snavely CS1114

Optimization and least squares. Prof. Noah Snavely CS1114 Optimization and least squares Prof. Noah Snavely CS1114 http://cs1114.cs.cornell.edu Administrivia A5 Part 1 due tomorrow by 5pm (please sign up for a demo slot) Part 2 will be due in two weeks (4/17)

More information

CS 6210 Fall 2016 Bei Wang. Review Lecture What have we learnt in Scientific Computing?

CS 6210 Fall 2016 Bei Wang. Review Lecture What have we learnt in Scientific Computing? CS 6210 Fall 2016 Bei Wang Review Lecture What have we learnt in Scientific Computing? Let s recall the scientific computing pipeline observed phenomenon mathematical model discretization solution algorithm

More information

Recent advances in Metamodel of Optimal Prognosis. Lectures. Thomas Most & Johannes Will

Recent advances in Metamodel of Optimal Prognosis. Lectures. Thomas Most & Johannes Will Lectures Recent advances in Metamodel of Optimal Prognosis Thomas Most & Johannes Will presented at the Weimar Optimization and Stochastic Days 2010 Source: www.dynardo.de/en/library Recent advances in

More information

Lecture 6 - Multivariate numerical optimization

Lecture 6 - Multivariate numerical optimization Lecture 6 - Multivariate numerical optimization Björn Andersson (w/ Jianxin Wei) Department of Statistics, Uppsala University February 13, 2014 1 / 36 Table of Contents 1 Plotting functions of two variables

More information

Curriculum Map: Mathematics

Curriculum Map: Mathematics Curriculum Map: Mathematics Course: Honors Advanced Precalculus and Trigonometry Grade(s): 11-12 Unit 1: Functions and Their Graphs This chapter will develop a more complete, thorough understanding of

More information

A Study on the Optimization Methods for Optomechanical Alignment

A Study on the Optimization Methods for Optomechanical Alignment A Study on the Optimization Methods for Optomechanical Alignment Ming-Ta Yu a, Tsung-Yin Lin b *, Yi-You Li a, and Pei-Feng Shu a a Dept. of Mech. Eng., National Chiao Tung University, Hsinchu 300, Taiwan,

More information

Lecture 1 Contracts. 1 A Mysterious Program : Principles of Imperative Computation (Spring 2018) Frank Pfenning

Lecture 1 Contracts. 1 A Mysterious Program : Principles of Imperative Computation (Spring 2018) Frank Pfenning Lecture 1 Contracts 15-122: Principles of Imperative Computation (Spring 2018) Frank Pfenning In these notes we review contracts, which we use to collectively denote function contracts, loop invariants,

More information

GAUSS TM 10. Quick Start Guide

GAUSS TM 10. Quick Start Guide GAUSS TM 10 Quick Start Guide Information in this document is subject to change without notice and does not represent a commitment on the part of Aptech Systems, Inc. The software described in this document

More information

Introduction to MATLAB

Introduction to MATLAB Introduction to MATLAB Introduction MATLAB is an interactive package for numerical analysis, matrix computation, control system design, and linear system analysis and design available on most CAEN platforms

More information

( ) = Y ˆ. Calibration Definition A model is calibrated if its predictions are right on average: ave(response Predicted value) = Predicted value.

( ) = Y ˆ. Calibration Definition A model is calibrated if its predictions are right on average: ave(response Predicted value) = Predicted value. Calibration OVERVIEW... 2 INTRODUCTION... 2 CALIBRATION... 3 ANOTHER REASON FOR CALIBRATION... 4 CHECKING THE CALIBRATION OF A REGRESSION... 5 CALIBRATION IN SIMPLE REGRESSION (DISPLAY.JMP)... 5 TESTING

More information

A projected Hessian matrix for full waveform inversion Yong Ma and Dave Hale, Center for Wave Phenomena, Colorado School of Mines

A projected Hessian matrix for full waveform inversion Yong Ma and Dave Hale, Center for Wave Phenomena, Colorado School of Mines A projected Hessian matrix for full waveform inversion Yong Ma and Dave Hale, Center for Wave Phenomena, Colorado School of Mines SUMMARY A Hessian matrix in full waveform inversion (FWI) is difficult

More information

1. NUMBER SYSTEMS USED IN COMPUTING: THE BINARY NUMBER SYSTEM

1. NUMBER SYSTEMS USED IN COMPUTING: THE BINARY NUMBER SYSTEM 1. NUMBER SYSTEMS USED IN COMPUTING: THE BINARY NUMBER SYSTEM 1.1 Introduction Given that digital logic and memory devices are based on two electrical states (on and off), it is natural to use a number

More information

Repetition Through Recursion

Repetition Through Recursion Fundamentals of Computer Science I (CS151.02 2007S) Repetition Through Recursion Summary: In many algorithms, you want to do things again and again and again. For example, you might want to do something

More information

Parallel Implementations of Gaussian Elimination

Parallel Implementations of Gaussian Elimination s of Western Michigan University vasilije.perovic@wmich.edu January 27, 2012 CS 6260: in Parallel Linear systems of equations General form of a linear system of equations is given by a 11 x 1 + + a 1n

More information

ECE 204 Numerical Methods for Computer Engineers MIDTERM EXAMINATION /4:30-6:00

ECE 204 Numerical Methods for Computer Engineers MIDTERM EXAMINATION /4:30-6:00 ECE 4 Numerical Methods for Computer Engineers ECE 4 Numerical Methods for Computer Engineers MIDTERM EXAMINATION --7/4:-6: The eamination is out of marks. Instructions: No aides. Write your name and student

More information

Floating-Point Numbers in Digital Computers

Floating-Point Numbers in Digital Computers POLYTECHNIC UNIVERSITY Department of Computer and Information Science Floating-Point Numbers in Digital Computers K. Ming Leung Abstract: We explain how floating-point numbers are represented and stored

More information

Chapter 5. Repetition. Contents. Introduction. Three Types of Program Control. Two Types of Repetition. Three Syntax Structures for Looping in C++

Chapter 5. Repetition. Contents. Introduction. Three Types of Program Control. Two Types of Repetition. Three Syntax Structures for Looping in C++ Repetition Contents 1 Repetition 1.1 Introduction 1.2 Three Types of Program Control Chapter 5 Introduction 1.3 Two Types of Repetition 1.4 Three Structures for Looping in C++ 1.5 The while Control Structure

More information

Computing Basics. 1 Sources of Error LECTURE NOTES ECO 613/614 FALL 2007 KAREN A. KOPECKY

Computing Basics. 1 Sources of Error LECTURE NOTES ECO 613/614 FALL 2007 KAREN A. KOPECKY LECTURE NOTES ECO 613/614 FALL 2007 KAREN A. KOPECKY Computing Basics 1 Sources of Error Numerical solutions to problems differ from their analytical counterparts. Why? The reason for the difference is

More information

Logic, Words, and Integers

Logic, Words, and Integers Computer Science 52 Logic, Words, and Integers 1 Words and Data The basic unit of information in a computer is the bit; it is simply a quantity that takes one of two values, 0 or 1. A sequence of k bits

More information

Applying Machine Learning to Real Problems: Why is it Difficult? How Research Can Help?

Applying Machine Learning to Real Problems: Why is it Difficult? How Research Can Help? Applying Machine Learning to Real Problems: Why is it Difficult? How Research Can Help? Olivier Bousquet, Google, Zürich, obousquet@google.com June 4th, 2007 Outline 1 Introduction 2 Features 3 Minimax

More information

5.12 EXERCISES Exercises 263

5.12 EXERCISES Exercises 263 5.12 Exercises 263 5.12 EXERCISES 5.1. If it s defined, the OPENMP macro is a decimal int. Write a program that prints its value. What is the significance of the value? 5.2. Download omp trap 1.c from

More information

AMS526: Numerical Analysis I (Numerical Linear Algebra)

AMS526: Numerical Analysis I (Numerical Linear Algebra) AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 20: Sparse Linear Systems; Direct Methods vs. Iterative Methods Xiangmin Jiao SUNY Stony Brook Xiangmin Jiao Numerical Analysis I 1 / 26

More information

CS 450 Numerical Analysis. Chapter 7: Interpolation

CS 450 Numerical Analysis. Chapter 7: Interpolation Lecture slides based on the textbook Scientific Computing: An Introductory Survey by Michael T. Heath, copyright c 2018 by the Society for Industrial and Applied Mathematics. http://www.siam.org/books/cl80

More information

Box-Cox Transformation

Box-Cox Transformation Chapter 190 Box-Cox Transformation Introduction This procedure finds the appropriate Box-Cox power transformation (1964) for a single batch of data. It is used to modify the distributional shape of a set

More information

Chapter 2. Data Representation in Computer Systems

Chapter 2. Data Representation in Computer Systems Chapter 2 Data Representation in Computer Systems Chapter 2 Objectives Understand the fundamentals of numerical data representation and manipulation in digital computers. Master the skill of converting

More information

Lab copy. Do not remove! Mathematics 152 Spring 1999 Notes on the course calculator. 1. The calculator VC. The web page

Lab copy. Do not remove! Mathematics 152 Spring 1999 Notes on the course calculator. 1. The calculator VC. The web page Mathematics 152 Spring 1999 Notes on the course calculator 1. The calculator VC The web page http://gamba.math.ubc.ca/coursedoc/math152/docs/ca.html contains a generic version of the calculator VC and

More information

Leaning Graphical Model Structures using L1-Regularization Paths (addendum)

Leaning Graphical Model Structures using L1-Regularization Paths (addendum) Leaning Graphical Model Structures using -Regularization Paths (addendum) Mark Schmidt and Kevin Murphy Computer Science Dept. University of British Columbia {schmidtm,murphyk}@cs.ubc.ca 1 Introduction

More information

Introduction to Programming in C Department of Computer Science and Engineering. Lecture No. #16 Loops: Matrix Using Nested for Loop

Introduction to Programming in C Department of Computer Science and Engineering. Lecture No. #16 Loops: Matrix Using Nested for Loop Introduction to Programming in C Department of Computer Science and Engineering Lecture No. #16 Loops: Matrix Using Nested for Loop In this section, we will use the, for loop to code of the matrix problem.

More information

General Instructions. Questions

General Instructions. Questions CS246: Mining Massive Data Sets Winter 2018 Problem Set 2 Due 11:59pm February 8, 2018 Only one late period is allowed for this homework (11:59pm 2/13). General Instructions Submission instructions: These

More information

(Sparse) Linear Solvers

(Sparse) Linear Solvers (Sparse) Linear Solvers Ax = B Why? Many geometry processing applications boil down to: solve one or more linear systems Parameterization Editing Reconstruction Fairing Morphing 1 Don t you just invert

More information

Statistics & Analysis. A Comparison of PDLREG and GAM Procedures in Measuring Dynamic Effects

Statistics & Analysis. A Comparison of PDLREG and GAM Procedures in Measuring Dynamic Effects A Comparison of PDLREG and GAM Procedures in Measuring Dynamic Effects Patralekha Bhattacharya Thinkalytics The PDLREG procedure in SAS is used to fit a finite distributed lagged model to time series data

More information

An Improved Measurement Placement Algorithm for Network Observability

An Improved Measurement Placement Algorithm for Network Observability IEEE TRANSACTIONS ON POWER SYSTEMS, VOL. 16, NO. 4, NOVEMBER 2001 819 An Improved Measurement Placement Algorithm for Network Observability Bei Gou and Ali Abur, Senior Member, IEEE Abstract This paper

More information

Accelerating the Hessian-free Gauss-Newton Full-waveform Inversion via Preconditioned Conjugate Gradient Method

Accelerating the Hessian-free Gauss-Newton Full-waveform Inversion via Preconditioned Conjugate Gradient Method Accelerating the Hessian-free Gauss-Newton Full-waveform Inversion via Preconditioned Conjugate Gradient Method Wenyong Pan 1, Kris Innanen 1 and Wenyuan Liao 2 1. CREWES Project, Department of Geoscience,

More information

Lecture 1 Contracts : Principles of Imperative Computation (Fall 2018) Frank Pfenning

Lecture 1 Contracts : Principles of Imperative Computation (Fall 2018) Frank Pfenning Lecture 1 Contracts 15-122: Principles of Imperative Computation (Fall 2018) Frank Pfenning In these notes we review contracts, which we use to collectively denote function contracts, loop invariants,

More information

Course Number 432/433 Title Algebra II (A & B) H Grade # of Days 120

Course Number 432/433 Title Algebra II (A & B) H Grade # of Days 120 Whitman-Hanson Regional High School provides all students with a high- quality education in order to develop reflective, concerned citizens and contributing members of the global community. Course Number

More information

Maths for Signals and Systems Linear Algebra in Engineering. Some problems by Gilbert Strang

Maths for Signals and Systems Linear Algebra in Engineering. Some problems by Gilbert Strang Maths for Signals and Systems Linear Algebra in Engineering Some problems by Gilbert Strang Problems. Consider u, v, w to be non-zero vectors in R 7. These vectors span a vector space. What are the possible

More information

Errors in Computation

Errors in Computation Theory of Errors Content Errors in computation Absolute Error Relative Error Roundoff Errors Truncation Errors Floating Point Numbers Normalized Floating Point Numbers Roundoff Error in Floating Point

More information

Floating-Point Numbers in Digital Computers

Floating-Point Numbers in Digital Computers POLYTECHNIC UNIVERSITY Department of Computer and Information Science Floating-Point Numbers in Digital Computers K. Ming Leung Abstract: We explain how floating-point numbers are represented and stored

More information

2 Computation with Floating-Point Numbers

2 Computation with Floating-Point Numbers 2 Computation with Floating-Point Numbers 2.1 Floating-Point Representation The notion of real numbers in mathematics is convenient for hand computations and formula manipulations. However, real numbers

More information

WE consider the gate-sizing problem, that is, the problem

WE consider the gate-sizing problem, that is, the problem 2760 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I: REGULAR PAPERS, VOL 55, NO 9, OCTOBER 2008 An Efficient Method for Large-Scale Gate Sizing Siddharth Joshi and Stephen Boyd, Fellow, IEEE Abstract We consider

More information