PART III

Non-Parametric and Semi-Parametric Methods for Longitudinal Data


CHAPTER 8

Non-parametric and semi-parametric regression methods: Introduction and overview

Xihong Lin and Raymond J. Carroll

Contents
8.1 Introduction and overview
8.2 Brief review of non-parametric and semi-parametric regression methods for independent data
    8.2.1 Local polynomial kernels
    8.2.2 Smoothing splines
    8.2.3 Regression splines and penalized splines (P-splines)
8.3 Overview of non-parametric and semi-parametric regression for longitudinal data
References

8.1 Introduction and overview

Parametric regression methods for longitudinal data have been well developed over the last 20 years. Such methods can be classified broadly as estimating-equation-based methods, such as generalized estimating equations (Liang and Zeger, 1986) and their extensions (Chapter 3), and mixed-effects models (Laird and Ware, 1982; Breslow and Clayton, 1993; see also Chapter 4). Diggle et al. (2002) provide an excellent overview of these parametric regression methods. For recent developments, see Chapter 3 through Chapter 6.

A major limitation of these methods is that the relationship of the mean of a longitudinal response to the covariates is assumed to be fully parametric. Although such parametric mean models enjoy simplicity, they suffer from inflexibility in modeling complicated relationships between the response and covariates in various longitudinal studies. Examples include hormone profiles over the menstrual cycle in reproductive health (Brumback and Rice, 1998; Zhang et al., 1998); longitudinal CD4 trajectories in AIDS research (Zeger and Diggle, 1994; Lin and Ying, 2001); age effects on childhood respiratory disease (Diggle et al., 2002; Lin and Zhang, 1999); time trajectories in speech research and growth curves (Brumback and Lindstrom, 2004; Gasser et al., 1984); time-varying treatment/exposure effects (Hogan, Lin, and Herman, 2004; Huang, Wu, and Zhou, 2002); and time-course analysis of microarray gene expression (Luan and Li, 2003; Storey et al., 2005). These practical applications have created a strong demand over the last 10 years for non-parametric and semi-parametric regression methods for longitudinal data, in which flexible functional forms can be estimated from the data to capture possibly complicated relationships between longitudinal outcomes and covariates.

Non-parametric and semi-parametric regression methods for independent data have been well developed in the last two decades. Non-parametric regression methods can be broadly classified into kernel methods (Wand and Jones, 1995), which are often based on local likelihoods (Fan and Gijbels, 1996), and splines, which include smoothing splines (Green and Silverman, 1994; Wahba, 1990), penalized splines (Eilers and Marx, 1996; Ruppert, Wand, and Carroll, 2003), and regression splines (Stone et al., 1997). Both smoothing splines and penalized splines are based on penalized likelihoods. Silverman (1984) demonstrated a close connection between kernel smoothing and spline smoothing, showing that kernel and smoothing spline estimators are asymptotically equivalent for independent data and that smoothing splines correspond to higher-order kernels.

Semi-parametric regression methods for independent data have been equally well developed (Härdle, Liang, and Gao, 1999; Green and Silverman, 1994, Chapter 4). Such models are sometimes referred to as (generalized) partial linear models, in which the mean, or the mean transformed by a parametric link function, of an outcome variable is modeled using parametric functions of a subset of the covariates and non-parametric functions of the other covariates. Profile-kernel and profile-spline methods have been proposed for estimation in such partial linear models (Heckman, 1984; Speckman, 1988; Carroll et al., 1997).

Non-parametric and semi-parametric regression methods for longitudinal data using kernel and spline methods have enjoyed substantial development in the last 10 years. Chapter 9 through Chapter 12 provide reviews of these methods. To help the reader understand these developments for longitudinal data, in the next section we provide an overview of non-parametric and semi-parametric regression methods using kernels and splines for independent data.

8.2 Brief review of non-parametric and semi-parametric regression methods for independent data

8.2.1 Local polynomial kernels

Traditional kernel regression estimates a non-parametric regression function at a target point using local weighted averages; the Nadaraya–Watson estimator is an example. The most popular kernel regression method is local polynomial regression (Wand and Jones, 1995; Fan and Gijbels, 1996). Consider the simplest non-parametric regression model,

    Y_i = θ(Z_i) + ε_i,    (8.1)

where Y_i is a scalar continuous outcome, Z_i is a scalar covariate, θ(z) is an unknown smooth function, and the ε_i ~ N(0, σ²) are independent and identically distributed (i = 1, ..., N). The idea of the local dth-order polynomial regression estimator of θ(z) is to approximate θ(Z_i) locally around an arbitrary point z by a dth-order polynomial,

    θ(Z_i) ≈ α_0 + α_1(Z_i − z) + ... + α_d(Z_i − z)^d = Z_i(z)'α,

where Z_i(z) = {1, (Z_i − z), ..., (Z_i − z)^d}' and α = (α_0, ..., α_d)', and to estimate α by maximizing the local log-likelihood, which, apart from a constant, is

    −(1/2σ²) Σ_{i=1}^N K_h(Z_i − z) {Y_i − Z_i(z)'α}²,

where K_h(s) = h^{-1}K(s/h), h is a bandwidth, and K(·) is a kernel function, often chosen to be a symmetric density function with mean 0. Commonly used kernel functions include the Gaussian, uniform, and Epanechnikov kernels, the latter being K(s) = (3/4)(1 − s²)_+, where a_+ = a if a > 0 and 0 otherwise. The resulting kernel estimating equation is

    Σ_{i=1}^N Z_i(z) K_h(Z_i − z) {Y_i − Z_i(z)'α} = 0.    (8.2)

The dth-order kernel estimator at the target point z is θ̂(z) = α̂_0. If d = 0, we have the traditional local average kernel estimator, which corresponds to the Nadaraya–Watson estimator,

    θ̂(z) = Σ_{i=1}^N K_h(Z_i − z) Y_i / Σ_{i=1}^N K_h(Z_i − z).    (8.3)

The local linear kernel estimator (d = 1) has been commonly used because of its better bias properties.
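As a concrete illustration (not part of the original chapter), the short Python sketch below solves the kernel-weighted least-squares problem underlying (8.2) at a target point z; setting d = 0 reproduces the Nadaraya–Watson estimator (8.3), while d = 1 gives the local linear estimator. The function names, the Epanechnikov kernel, the fixed bandwidth, and the simulated data are illustrative assumptions rather than choices made in the chapter.

```python
import numpy as np

def epanechnikov(s):
    """Epanechnikov kernel K(s) = (3/4)(1 - s^2)_+."""
    return 0.75 * np.clip(1.0 - s**2, 0.0, None)

def local_poly_fit(z0, Z, Y, h, d=1, kernel=epanechnikov):
    """Local dth-order polynomial estimate of theta at the target point z0.

    Solves the weighted least-squares problem behind the kernel
    estimating equation (8.2) and returns alpha_0 = theta_hat(z0).
    d = 0 is the Nadaraya-Watson estimator (8.3); d = 1 is local linear.
    """
    w = kernel((Z - z0) / h) / h                      # K_h(Z_i - z0)
    X = np.vander(Z - z0, N=d + 1, increasing=True)   # rows {1, (Z_i - z0), ..., (Z_i - z0)^d}
    XtW = X.T * w                                     # X' diag(w)
    alpha = np.linalg.lstsq(XtW @ X, XtW @ Y, rcond=None)[0]
    return alpha[0]

# Illustrative simulated data (an assumption, for demonstration only).
rng = np.random.default_rng(0)
Z = np.sort(rng.uniform(0.0, 1.0, 200))
Y = np.sin(2.0 * np.pi * Z) + rng.normal(scale=0.3, size=Z.size)

grid = np.linspace(0.05, 0.95, 10)
theta_nw = [local_poly_fit(z, Z, Y, h=0.1, d=0) for z in grid]   # Nadaraya-Watson (8.3)
theta_ll = [local_poly_fit(z, Z, Y, h=0.1, d=1) for z in grid]   # local linear
```

In practice the fixed bandwidth used in this sketch would be chosen data-adaptively, as discussed next.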

Bandwidth selection is important in kernel smoothing. The bandwidth h can be selected by cross-validation; other approaches include plug-in estimators (Wand and Jones, 1995; Fan and Gijbels, 1996) and empirical-bias bandwidth selection (Ruppert, 1997), among others.

A key feature of kernel smoothing for independent data is that it is local, in the sense that θ̂(z) places more weight on observations whose Z_i values lie in the neighborhood of z and downweights observations far from z. This can be seen from (8.3). Specifically, as the bandwidth h → 0 and the sample size N → ∞, only the observations in a shrinking neighborhood of z contribute to the estimation of θ(z). As we will see, this locality property no longer holds for superior kernel and spline smoothing methods for longitudinal data (Chapter 9; see also Lin et al., 2004; Welsh, Lin, and Carroll, 2002).

Fan (1993) showed that the local polynomial kernel estimator enjoys minimax efficiency among the class of all linear smoothers. The Epanechnikov kernel is optimal in the sense that it minimizes the mean squared error of the local polynomial kernel estimator. The local polynomial kernel estimator (8.2) can be extended easily to non-parametric regression for non-normal outcomes within the generalized linear model framework (Fan and Gijbels, 1996, Chapter 5).

8.2.2 Smoothing splines

A smoothing spline estimates the non-parametric regression function θ(z) using a piecewise polynomial function with all the observed covariate values {Z_i} used as knots, where smoothness constraints are imposed at the knots (Wahba, 1990; Green and Silverman, 1994). The most commonly used smoothing spline is the natural cubic smoothing spline, which assumes that θ(z) is a piecewise cubic function, is linear outside of min(Z_i) and max(Z_i), and is continuous and twice continuously differentiable, with a third derivative that is a step function with jumps at the knots {Z_i}. The natural cubic smoothing spline estimator can be obtained by maximizing a penalized log-likelihood as follows. Under the simple non-parametric model (8.1), the penalized log-likelihood can be written, apart from a constant, as

    −(1/2σ²) Σ_{i=1}^N {Y_i − θ(Z_i)}² − λ ∫ {θ^(2)(z)}² dz = −(1/2σ²) Σ_{i=1}^N {Y_i − θ(Z_i)}² − λ θ'Ψθ,

where λ is a smoothing parameter, Ψ is the cubic smoothing spline penalty matrix (Green and Silverman, 1994, Equation 2.3), θ = {θ(Z_1), ..., θ(Z_N)}', and θ^(2)(z) denotes the second derivative of θ(z). The smoothing parameter λ controls the trade-off between goodness of fit and smoothness of the curve: the smoothing spline estimator interpolates the data if λ = 0 and forces θ(z) to be linear as λ → ∞. The resulting cubic smoothing spline estimator takes the form of a ridge regression estimator,

    θ̂ = (I + λΨ)^{-1} Y,    (8.4)

where Y = (Y_1, ..., Y_N)'. Efficient algorithms, such as the Reinsch algorithm (Green and Silverman, 1994), can be used to calculate θ̂ in O(N) arithmetic operations. The smoothing parameter λ can be estimated using cross-validation, generalized cross-validation (Wahba, 1990), or generalized maximum likelihood (GML) (Wahba, 1985).

There is a close connection between a smoothing spline estimator and a linear mixed model. Specifically, the GML estimator of λ corresponds to the restricted maximum likelihood estimator in the corresponding mixed model. This connection, as well as the Bayesian formulation of the smoothing spline, is discussed in more detail in Chapters 9, 11, and 12. Silverman (1984) showed that the smoothing spline estimator is asymptotically equivalent to a local average kernel estimator. Using Silverman's (1984) results, Nychka (1995) established the asymptotic properties of the smoothing spline estimator (8.4) by deriving its asymptotic bias and variance. Smoothing spline estimation has been extended to generalized linear models (Green and Silverman, 1994) and generalized additive models (Hastie and Tibshirani, 1990). Bayesian spline estimation can be found in Hastie and Tibshirani (2000). Further discussion of the use of smoothing splines for longitudinal data can be found in Chapter 9 and Chapter 11.
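For illustration only (not from the chapter), the following Python sketch forms the natural cubic smoothing spline penalty matrix Ψ = QR^{-1}Q' from the banded matrices of Green and Silverman (1994, Section 2.1) and then applies the ridge form (8.4) with a dense solve. The function names and the value of λ are assumptions; a production implementation would instead use the Reinsch algorithm mentioned above to exploit the banded structure and obtain O(N) cost.

```python
import numpy as np

def cubic_spline_penalty(z):
    """Penalty matrix Psi with theta' Psi theta = integral {theta''(z)}^2 dz
    for the natural cubic spline interpolating (z_i, theta_i); Psi = Q R^{-1} Q'
    (Green and Silverman, 1994, Section 2.1). Assumes z is sorted and distinct."""
    z = np.asarray(z, dtype=float)
    n = z.size
    h = np.diff(z)                       # knot spacings h_i = z_{i+1} - z_i
    Q = np.zeros((n, n - 2))
    R = np.zeros((n - 2, n - 2))
    for j in range(1, n - 1):            # interior knots z_2, ..., z_{N-1}
        Q[j - 1, j - 1] = 1.0 / h[j - 1]
        Q[j, j - 1] = -1.0 / h[j - 1] - 1.0 / h[j]
        Q[j + 1, j - 1] = 1.0 / h[j]
        R[j - 1, j - 1] = (h[j - 1] + h[j]) / 3.0
        if j < n - 2:
            R[j - 1, j] = R[j, j - 1] = h[j] / 6.0
    return Q @ np.linalg.solve(R, Q.T)

def smoothing_spline_fit(Z, Y, lam):
    """Ridge-form smoothing spline estimator (8.4): theta_hat = (I + lam*Psi)^{-1} Y,
    evaluated at the observed design points Z_i."""
    Psi = cubic_spline_penalty(Z)
    return np.linalg.solve(np.eye(len(Y)) + lam * Psi, np.asarray(Y, dtype=float))

# Illustrative usage on simulated data (an assumption, as in the kernel sketch above).
rng = np.random.default_rng(1)
Z = np.sort(rng.uniform(0.0, 1.0, 100))
Y = np.sin(2.0 * np.pi * Z) + rng.normal(scale=0.3, size=Z.size)
theta_hat = smoothing_spline_fit(Z, Y, lam=1e-4)
```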

8.2.3 Regression splines and penalized splines (P-splines)

A key advantage of a smoothing spline is that all the observed design points are used as knots, so one does not need to choose knots. However, when the sample size is large, the computational burden increases substantially. Regression splines (Stone et al., 1997) are a basis-function-based non-parametric regression method that uses a small number of knots and proceeds with a parametric regression on the resulting bases. Denote by {s_1, ..., s_L} a set of L knots, where L is often small (e.g., 5 or 6), and by {B_1(z), ..., B_L(z)} a set of basis functions (e.g., a B-spline basis or a plus-function basis). For the simple non-parametric regression model (8.1), one approximates θ(z) by

    θ(z) ≈ Σ_{l=1}^L B_l(z) α_l.    (8.5)

One then estimates α = (α_1, ..., α_L)' by fitting the parametric model

    Y_i = Σ_{l=1}^L B_l(Z_i) α_l + ε_i    (8.6)

via standard least squares. The resulting non-parametric regression spline estimator of θ(z) is θ̂(z) = Σ_{l=1}^L B_l(z) α̂_l, where α̂ is the maximum likelihood estimate under (8.6). A key advantage of the regression spline is its computational simplicity, since one only needs to fit a parametric model. However, the choices of the number of knots and of their locations are critical, and the estimate of θ(z) can be sensitive to these choices. Adaptive knot allocation strategies have been recommended (Stone et al., 1997).

Penalized splines (P-splines) are a hybrid of regression splines and smoothing splines (Eilers and Marx, 1996; Ruppert, Wand, and Carroll, 2003). One approximates θ(z) using the basis expansion (8.5) with a large number of knots L, where L is often much smaller than the sample size N but much larger than the number of knots typically used in regression splines (e.g., L = 20 to 30). P-spline estimation proceeds by fitting (8.6) with a quadratic penalty on the {α_l}. For example, if the {B_l(z)} are plus basis functions and L is the number of interior knots, the dth-order P-spline model is

    θ(z; α) = α_0 + α_1 z + ... + α_d z^d + Σ_{l=1}^L α_{l+d} (z − s_l)^d_+.

One estimates θ(z; α) by maximizing the penalized log-likelihood, which, apart from a constant, is

    −(1/2σ²) Σ_{i=1}^N {Y_i − θ(Z_i; α)}² − λ Σ_{l=1}^L α²_{l+d}.

If the {B_l(z)} are B-spline basis functions, a second-order difference penalty on the α_l can be used instead (Fahrmeir, Kneib, and Lang, 2004; Lang and Brezger, 2004). A key advantage of P-splines is that they reduce the computational burden of smoothing splines when the sample size is large and are less sensitive to the allocation of the knots than regression splines. The smoothing parameter can be treated as a variance component using the connection between P-splines and mixed models, and can be estimated by restricted maximum likelihood. Several recent attempts have been made to understand the theoretical properties of P-splines in special situations (Hall and Opsomer, 2005). More details about the use of regression splines and P-splines for longitudinal data can be found in Chapter 12.
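To make the P-spline recipe concrete, here is a minimal Python sketch (not from the chapter) using the plus-function basis and the ridge penalty displayed above, with the factor 1/(2σ²) absorbed into λ. The quantile-based knot placement, the default degree, the value of λ, and all names are illustrative assumptions; as noted above, in practice λ is often estimated by restricted maximum likelihood through the mixed-model representation.

```python
import numpy as np

def pspline_fit(Z, Y, num_knots=20, degree=2, lam=1.0):
    """Penalized least-squares fit of the dth-order P-spline model
    theta(z; alpha) = alpha_0 + ... + alpha_d z^d + sum_l alpha_{l+d} (z - s_l)^d_+,
    penalizing only the plus-function coefficients: lam * sum_l alpha_{l+d}^2."""
    Z, Y = np.asarray(Z, float), np.asarray(Y, float)
    knots = np.quantile(Z, np.linspace(0.0, 1.0, num_knots + 2)[1:-1])  # interior knots s_1, ..., s_L
    X = np.hstack([np.vander(Z, N=degree + 1, increasing=True),         # 1, z, ..., z^d
                   np.clip(Z[:, None] - knots[None, :], 0.0, None) ** degree])  # (z - s_l)^d_+
    D = np.diag(np.r_[np.zeros(degree + 1), np.ones(num_knots)])        # penalize knot coefficients only
    alpha = np.linalg.solve(X.T @ X + lam * D, X.T @ Y)
    return alpha, knots

def pspline_predict(z, alpha, knots, degree=2):
    """Evaluate the fitted P-spline theta(z; alpha_hat) at new points z."""
    z = np.atleast_1d(np.asarray(z, float))
    X = np.hstack([np.vander(z, N=degree + 1, increasing=True),
                   np.clip(z[:, None] - knots[None, :], 0.0, None) ** degree])
    return X @ alpha

# Illustrative usage on simulated data (an assumption).
rng = np.random.default_rng(2)
Z = rng.uniform(0.0, 1.0, 200)
Y = np.sin(2.0 * np.pi * Z) + rng.normal(scale=0.3, size=Z.size)
alpha_hat, knots = pspline_fit(Z, Y, num_knots=20, degree=2, lam=0.1)
theta_hat = pspline_predict(np.linspace(0.0, 1.0, 50), alpha_hat, knots, degree=2)
```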

8.3 Overview of non-parametric and semi-parametric regression for longitudinal data

Although non-parametric and semi-parametric regression methods are well developed for independent data, their development for longitudinal data has occurred only in recent years. A major difficulty in the analysis of longitudinal data is that the data are subject to within-subject correlation among the repeated measures over time. This correlation presents significant challenges for the development of kernel and spline smoothing methods for longitudinal data; in particular, it creates a need for non-conventional smoothing methods and for a better understanding of their properties. Specifically, traditional local-likelihood-based kernel methods are not able to account effectively for the within-subject correlation (Lin and Carroll, 2000), and a consistent and efficient non-parametric estimator for longitudinal data needs to be non-local (Welsh, Lin, and Carroll, 2002; Lin et al., 2004). Standard functional data analysis techniques are not directly applicable to longitudinal data either, as repeated measures are often obtained at irregular, sparse time points and are often noisier (Yao, Müller, and Wang, 2005).

Chapter 9 provides an overview of both estimating-equation-based and likelihood-based methods for non-parametric and semi-parametric regression using kernel and spline smoothing for longitudinal data. Chapter 10 surveys the use of functional data analysis methods for non-parametric regression with longitudinal data by treating the data as samples of random curves. Chapter 11 reviews smoothing spline methods for longitudinal data, while Chapter 12 reviews penalized spline methods. These chapters contain detailed discussions of the attractive connection between spline estimation and mixed models (Brumback and Rice, 1998; Wang, 1998; Zhang et al., 1998; Lin and Zhang, 1999; Verbyla et al., 1999).

References

Breslow, N. E. and Clayton, D. G. (1993). Approximate inference in generalized linear mixed models. Journal of the American Statistical Association 88.
Brumback, B. and Rice, J. A. (1998). Smoothing spline models for the analysis of nested and crossed samples of curves (with discussion). Journal of the American Statistical Association 93.
Brumback, L. C. and Lindstrom, M. J. (2004). Self modeling with flexible, random time transformations. Biometrics 60.
Carroll, R. J., Fan, J., Gijbels, I., and Wand, M. P. (1997). Generalized partially linear single-index models. Journal of the American Statistical Association 92.
Diggle, P. J., Heagerty, P. J., Liang, K. Y., and Zeger, S. L. (2002). Analysis of Longitudinal Data. Oxford: Oxford University Press.
Eilers, P. H. and Marx, B. D. (1996). Flexible smoothing with B-splines and penalties (with discussion). Statistical Science 11.
Fahrmeir, L., Kneib, T., and Lang, S. (2004). Penalized structured additive regression for space-time data: A Bayesian perspective. Statistica Sinica 14.

Fan, J. (1993). Local linear regression smoothers and their minimax efficiencies. Annals of Statistics 21.
Fan, J. and Gijbels, I. (1996). Local Polynomial Modelling and Its Applications. London: Chapman & Hall.
Gasser, T., Müller, H. G., Köhler, W., Molinari, L., and Prader, A. (1984). Nonparametric regression analysis of growth curves. Annals of Statistics 12.
Green, P. J. and Silverman, B. W. (1994). Nonparametric Regression and Generalized Linear Models: A Roughness Penalty Approach. London: Chapman & Hall.
Hall, P. and Opsomer, J. (2005). Theory for penalised spline regression. Biometrika 92.
Härdle, W., Liang, H., and Gao, J. (1999). Partially Linear Models. New York: Springer-Verlag.
Hastie, T. and Tibshirani, R. (1990). Generalized Additive Models. London: Chapman & Hall.
Hastie, T. and Tibshirani, R. (2000). Bayesian backfitting. Statistical Science 15.
Heckman, N. (1984). Spline smoothing in partial linear models. Journal of the Royal Statistical Society, Series B 48.
Hogan, J. W., Lin, X., and Herman, B. (2004). Mixtures of varying coefficient models for longitudinal data with discrete or continuous nonignorable dropout. Biometrics 60.
Huang, J., Wu, C., and Zhou, L. (2002). Varying-coefficient models and basis function approximation for the analysis of repeated measures. Biometrika 89.
Laird, N. M. and Ware, J. H. (1982). Random-effects models for longitudinal data. Biometrics 38.
Lang, S. and Brezger, A. (2004). Bayesian P-splines. Journal of Computational and Graphical Statistics 13.
Liang, K. Y. and Zeger, S. L. (1986). Longitudinal data analysis using generalized linear models. Biometrika 73.
Lin, D. and Ying, Z. (2001). Semiparametric and nonparametric regression analysis of longitudinal data. Journal of the American Statistical Association 96.
Lin, X. and Carroll, R. J. (2000). Nonparametric function estimation for clustered data when the predictor is measured without/with error. Journal of the American Statistical Association 95.
Lin, X. and Zhang, D. (1999). Inference in generalized additive mixed models using smoothing splines. Journal of the Royal Statistical Society, Series B 61.
Lin, X., Wang, N., Welsh, A., and Carroll, R. J. (2004). Equivalent kernels of smoothing splines in nonparametric regression for clustered data. Biometrika 91.
Luan, Y. and Li, H. (2003). Clustering of time-course gene expression data using a mixed-effects model with B-splines. Bioinformatics 19.
Nychka, D. (1995). Splines as local smoothers. Annals of Statistics 23.
Ruppert, D. (1997). Empirical-bias bandwidths for local polynomial nonparametric regression and density estimation. Journal of the American Statistical Association 92.
Ruppert, D., Wand, M. P., and Carroll, R. J. (2003). Semiparametric Regression. Cambridge: Cambridge University Press.
Silverman, B. W. (1984). Spline smoothing: The equivalent variable kernel method. Annals of Statistics 12.
Speckman, P. (1988). Kernel smoothing in partial linear models. Journal of the Royal Statistical Society, Series B 50.
Stone, C. J., Hansen, M., Kooperberg, C., and Truong, Y. K. (1997). Polynomial splines and their tensor products in extended linear modeling (with discussion). Annals of Statistics 25.
Storey, J. D., Xiao, W., Leek, J. T., Tompkins, R. G., and Davis, R. W. (2005). Significance analysis of time course microarray experiments. Proceedings of the National Academy of Sciences 102.

Verbyla, A. P., Cullis, B. R., Kenward, M. G., and Welham, S. J. (1999). The analysis of designed experiments and longitudinal data using smoothing splines. Applied Statistics 48.
Wahba, G. (1985). A comparison of GCV and GML for choosing the smoothing parameter in the generalized spline problem. Annals of Statistics 13.
Wahba, G. (1990). Spline Models for Observational Data. Philadelphia: SIAM.
Wand, M. P. and Jones, M. C. (1995). Kernel Smoothing. London: Chapman & Hall.
Wang, Y. (1998). Mixed effects smoothing spline analysis of variance. Journal of the Royal Statistical Society, Series B 60.
Welsh, A. H., Lin, X., and Carroll, R. J. (2002). Marginal longitudinal nonparametric regression: Locality and efficiency of spline and kernel methods. Journal of the American Statistical Association 97.
Zeger, S. L. and Diggle, P. J. (1994). Semiparametric models for longitudinal data with application to CD4 cell numbers in HIV seroconverters. Biometrics 50.
Zhang, D., Lin, X., Raz, J., and Sowers, M. (1998). Semiparametric stochastic mixed models for longitudinal data. Journal of the American Statistical Association 93.
