DATA ANALYSIS USING HIERARCHICAL GENERALIZED LINEAR MODELS WITH R

1 DATA ANALYSIS USING HIERARCHICAL GENERALIZED LINEAR MODELS WITH R Lee, Rönnegård & Noh LRN@du.se Lee, Rönnegård & Noh HGLM book 1 / 24

2 Overview
  1 Background to the book
  2 Crack growth example
  3 Contents
  4 A last example - Street magician

3 Development of books

4 Model checking plots in linear regression
$y = X\beta + e$
[Figure: Normal Q-Q plot of standardized residuals against theoretical quantiles for lm(y ~ x)]
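A minimal sketch of how such a Q-Q check can be produced in base R. The slide does not give its data, so the data below are simulated purely for illustration, and the names `x`, `y`, `fit`, `r_std` and `qq` are assumptions.

```r
# Simulated data (illustrative assumption; not the data behind the slide)
set.seed(1)
x <- runif(50)
y <- 1 + 2 * x + rnorm(50, sd = 0.5)

fit <- lm(y ~ x)

# Standardized residuals and the matching theoretical normal quantiles
r_std <- rstandard(fit)
qq <- qqnorm(r_std, plot.it = FALSE)

# In an interactive session, draw the standard Normal Q-Q diagnostic plot
if (interactive()) plot(fit, which = 2)
```

Points close to a straight line in the plot suggest that the normality assumption for the residuals is reasonable.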

5 Regression models using other distributions
Gaussian: $L = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{1}{2\sigma^2}(y_i - \mu)^2}$, with $\mu = X\beta$
Poisson: $L = \prod_{i=1}^{n} \frac{\mu^{y_i} e^{-\mu}}{y_i!}$, with $\log(\mu) = X\beta$
Binomial: $L = \prod_{i=1}^{n} \binom{n}{y_i} \mu^{y_i} (1 - \mu)^{n - y_i}$, with $\mathrm{logit}(\mu) = X\beta$
Gamma: $L = \prod_{i=1}^{n} \frac{1}{\Gamma(k)\,\theta^{k}} y_i^{k-1} e^{-y_i/\theta}$, with $k\theta = \mu$; $1/\mu = X\beta$

6 Generalized Linear Models
The four models above (Gaussian, Poisson, Binomial, Gamma) are all GLMs.
GLM: common estimation algorithm using iterative regression
- Fast and easy to implement
- Linear regression model checking tools!!!
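The "iterative regression" idea can be sketched as iterative weighted least squares (IWLS) for a Poisson model with log link. This is an illustrative sketch on simulated data, not code from the book; all variable names are assumptions. Each iteration is an ordinary weighted linear regression of a working response on X.

```r
# Simulated Poisson data (illustrative assumption)
set.seed(42)
n <- 200
x <- rnorm(n)
y <- rpois(n, lambda = exp(0.5 + 0.8 * x))
X <- cbind(1, x)

beta <- c(log(mean(y)), 0)            # starting values
for (iter in 1:25) {
  eta <- drop(X %*% beta)             # linear predictor
  mu  <- exp(eta)                     # inverse log link
  z   <- eta + (y - mu) / mu          # working response
  w   <- mu                           # working weights for Poisson/log
  beta_new <- unname(lm.wfit(X, z, w)$coefficients)
  converged <- max(abs(beta_new - beta)) < 1e-10
  beta <- beta_new
  if (converged) break
}

# glm() uses the same kind of algorithm internally
fit_glm <- glm(y ~ x, family = poisson(link = "log"))
```

Because every step is a weighted linear regression, the usual regression diagnostics (residual plots, hat values) carry over to the fitted GLM.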

7 Hierarchical Generalized Linear Models
Lee Y. & Nelder J. A. (1996)
GLM approach for fitting
- Linear mixed models
- Generalized linear mixed models (Laplace approximation)
- Mixed models with non-Gaussian random effects
- Above models + dispersion modelling

8 Hierarchical Generalized Linear Models
Linear model with random effects
$y = X\beta + Zu + e$, $u \sim N(0, \sigma_u^2)$, $e \sim N(0, \sigma_e^2)$
Random effects
- Repeated measurements (on the same subject) are the most common application
- The parameter is assumed to be sampled from a population of parameters (not fixed)
- The variance component $\sigma_u^2$ is of interest
- Estimating $\hat{u}$ can be of interest for prediction (including spatial kriging) and for ranking subjects
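To make the estimation of $\hat{u}$ concrete, here is a small sketch that solves Henderson's mixed model equations for a random-intercept model. It assumes the variance components are known, which is a simplification (in practice they are estimated), and uses simulated data; all names are illustrative assumptions.

```r
# Simulated repeated measurements: q subjects, m observations each (assumption)
set.seed(3)
q <- 10
m <- 5
subject <- factor(rep(seq_len(q), each = m))
x <- rnorm(q * m)
sigma2_u <- 1      # assumed known for this sketch
sigma2_e <- 0.5    # assumed known for this sketch
u <- rnorm(q, sd = sqrt(sigma2_u))
y <- 2 + 1.5 * x + u[as.integer(subject)] + rnorm(q * m, sd = sqrt(sigma2_e))

X <- cbind(1, x)                     # fixed-effect design
Z <- model.matrix(~ 0 + subject)     # random-intercept design

# Mixed model equations: ridge-type penalty lambda on the random effects
lambda <- sigma2_e / sigma2_u
lhs <- rbind(cbind(crossprod(X),    crossprod(X, Z)),
             cbind(crossprod(Z, X), crossprod(Z) + lambda * diag(q)))
rhs <- c(crossprod(X, y), crossprod(Z, y))
sol <- solve(lhs, rhs)

beta_hat <- sol[1:2]       # estimates of the fixed effects
u_hat    <- sol[-(1:2)]    # predicted random effects (BLUP)

# Ranking subjects by their predicted random effect
ranking <- order(u_hat, decreasing = TRUE)
```

The penalty `lambda` shrinks each subject's effect towards zero, which is what distinguishes $\hat{u}$ from a fixed-effect estimate per subject.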

9 Hierarchical Generalized Linear Models
Linear model: $y = X\beta + e$, $e \sim N(0, \sigma_e^2)$
Linear model including a dispersion model: $y = X\beta + e$, $e \sim N(0, \exp(x_d \beta_d))$
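One way to fit a linear model with a dispersion model is to alternate between two regressions: a weighted linear model for the mean, and a Gamma GLM with log link for the squared residuals. The sketch below is a plain maximum-likelihood version on simulated data, without the leverage adjustment used in REML-type fitting; the variable names and true parameter values are assumptions for illustration.

```r
# Simulated heteroscedastic data (illustrative assumption)
set.seed(7)
n  <- 500
x  <- runif(n)     # covariate for the mean
xd <- runif(n)     # covariate for the dispersion
y  <- 1 + 2 * x + rnorm(n, sd = sqrt(exp(-1 + 2 * xd)))

w <- rep(1, n)     # start from a constant residual variance
for (iter in 1:20) {
  # Mean model: weighted least squares with weights 1 / variance
  fit_mean <- lm(y ~ x, weights = w)
  # Dispersion model: squared residuals as a Gamma response with log link
  d <- residuals(fit_mean)^2
  fit_disp <- glm(d ~ xd, family = Gamma(link = "log"))
  w <- 1 / fitted(fit_disp)
}

coef(fit_mean)   # estimates of beta
coef(fit_disp)   # estimates of beta_d
```

Both steps reuse standard regression software, which is the practical appeal of treating the dispersion as a second GLM.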

11 Hierarchical Generalized Linear Models
- Linear model
- Linear mixed model (LMM)
- Generalized linear model (GLM)
- Factor analysis
- Generalized linear mixed model (GLMM): generalized linear model including Gaussian random effects
- Joint GLM: generalized linear model including dispersion model with fixed effects
- Structural Equation Models (SEM)
- Hierarchical GLM (HGLM): generalized linear model including Gaussian and/or non-Gaussian random effects. Dispersion can be modelled using fixed effects.
- HGLMs with correlated random effects: including spatial, temporal correlations, splines, GAM
- Double HGLM (DHGLM): HGLM including dispersion model with both fixed and random effects
- Frailty HGLM: HGLMs for survival analysis including competing risk models

13 A motivating example from my own research
An important field of application for random effect estimation is animal breeding, where the animals are ranked by their genetic potential, given by the estimated random effects in a linear mixed model
$y = X\beta + Za + e$
The outcome $y$ can for instance be litter size in pigs or milk yield in dairy cows, and the random effects $a \sim N(0, \sigma_a^2 A)$ have a correlation matrix $A$ computed from pedigree information (see e.g., Chapter 17 of Pawitan (2001)). The ranking is based on the best linear unbiased predictor $\hat{a}$, referred to as estimated breeding values (for the mean). By selecting animals with high estimated breeding values, the mean of the response variable is increased for each generation of selection.
Possible to estimate breeding values for the variance!

14 [Figure 1.4: Distributions of observed litter sizes and the number of observations per sow.]

15 Crack growth data
Hudak et al. (1978): crack lengths measured on a compact tension steel.
- 21 metallic specimens, crack lengths recorded every $10^4$ cycles
- y = increment of crack length
- crack0 = covariate for the mean part of the model
- cycle = covariate for the dispersion part of the model

16 Crack growth data

19 Crack growth - Using the hglm package
    res_glm <- glm(y ~ crack0, family = Gamma(link = log),
                   data = data_crack_growth)

    library(hglm)
    ## GLMM ##
    res_glmm <- hglm2(y ~ crack0 + (1 | specimen), family = Gamma(link = log),
                      data = data_crack_growth)

    ## GLMM with dispersion model ##
    res_glmm2 <- hglm2(y ~ crack0 + (1 | specimen), family = Gamma(link = log),
                       data = data_crack_growth, disp = ~ cycle)

21 Crack growth - Using the dhglm package
    library(dhglm)
    ## HGLM I ##
    model_mu <- DHGLMMODELING(Model = "mean", Link = "log",
                              LinPred = y ~ crack0 + (1 | specimen),
                              RandDist = "inverse-gamma")
    model_phi <- DHGLMMODELING(Model = "dispersion")
    res_hglm1 <- dhglmfit(respdist = "gamma", DataMain = data_crack_growth,
                          MeanModel = model_mu, DispersionModel = model_phi)

    ## HGLM II ##
    model_mu <- DHGLMMODELING(Model = "mean", Link = "log",
                              LinPred = y ~ crack0 + (1 | specimen),
                              RandDist = "inverse-gamma")
    model_phi <- DHGLMMODELING(Model = "dispersion", Link = "log",
                               LinPred = phi ~ cycle)
    res_hglm2 <- dhglmfit(respdist = "gamma", DataMain = data_crack_growth,
                          MeanModel = model_mu, DispersionModel = model_phi)

22 Crack growth - Using the dhglm package
    ## DHGLM I ##
    model_mu <- DHGLMMODELING(Model = "mean", Link = "log",
                              LinPred = y ~ crack0 + (1 | specimen),
                              RandDist = "inverse-gamma")
    model_phi <- DHGLMMODELING(Model = "dispersion", Link = "log",
                               LinPred = phi ~ cycle + (1 | specimen),
                               RandDist = "gaussian")
    res_dhglm1 <- dhglmfit(respdist = "gamma", DataMain = data_crack_growth,
                           MeanModel = model_mu, DispersionModel = model_phi)

23 Crack growth data - results
[Figure panels: HGLM I, HGLM II, DHGLM I]

24 Crack growth data - estimates
[Model output; the numeric values are not preserved in this transcript]
Estimates from the model(mu): y ~ crack0 + (1 | specimen), link "log"
  Columns: Estimate, Std. Error, t-value
  Rows: (Intercept), crack0
Estimates for logarithm of lambda=var(u_mu)
  Columns: Estimate, Std. Error, t-value
  Rows: specimen
Estimates from the model(phi): phi ~ cycle + (1 | specimen), link "log"
  Columns: Estimate, Std. Error, t-value
  Rows: (Intercept), cycle
Estimates for logarithm of var(u_phi)
  Columns: Estimate, Std. Error, t-value
  Rows: specimen

25 Contents
List of notations
Preface
1 Introduction
  - Motivating examples
  - Regarding criticisms of the h-likelihood
  - R code
  - Exercises
2 GLMs via iterative weighted least squares
  - Examples
  - R code
  - Fisher's classical likelihood
  - Iterative weighted least squares
  - Model checking using residual plots
  - Hat values
  - Exercises
3 Inference for models with unobservables
  - Examples
  - R code
  - Likelihood inference for random effects
  - Extended likelihood principle
  - Laplace approximations for the integrals
  - Street magician
  - H-likelihood and empirical Bayes
  - Exercises
4 HGLMs: from method to algorithm
  - Examples
  - R code
  - IWLS algorithm for interconnected GLMs
  - IWLS algorithm for augmented GLM
  - IWLS algorithm for HGLMs
  - Estimation algorithm for a Poisson GLMM
  - Exercises
5 HGLMs modeling in R
  - Examples
  - R code
  - Exercises
6 Double HGLMs - using the dhglm package
  - Model description for DHGLMs
  - Examples
  - An extension of linear mixed models via DHGLM
  - Implementation in the dhglm package
  - Exercises
7 Fitting multivariate HGLMs
  - Examples
  - Implementation in the mdhglm package
  - Exercises
8 Survival analysis
  - Examples
  - Competing risk models
  - Comparison with alternative R procedures
  - H-likelihood theory for the frailty model
  - Running the frailtyHL package
  - Exercises
9 Joint models
  - Examples
  - H-likelihood approach to joint models
  - Exercises
10 Further topics
  - Examples
  - Variable selections
  - Examples
  - Hypothesis testing
References
Data Index
Author Index
Subject Index

28 Hierarchical Likelihood
The hierarchical likelihood for observations $y$, fixed parameters $\theta$ and random effects $v$ is defined as
$H(\theta, v; y) = f_\theta(y \mid v)\, f_\theta(v)$
- Inference of fixed effects: $\int H(\theta, v; y)\, dv$
- Inference of random effects: $H(\theta, v; y)$
- Inference of dispersion parameters: $f_\theta(y \mid \hat{\beta})$, i.e. REML

29 Street magician
Here an example is presented to illustrate the fundamental idea of likelihood inference and how it may differ from Bayesian inference.
You walk down the street and come across a street magician. He has a small bag with a number of dice. There are two types of dice in the bag: white ones and blue ones. The white dice are numbered 1 to 6, while the blue dice have three sides with 1s and three sides with 2s. The magician draws a die at random from the bag without showing it to you and rolls it. He claims that the number that turns up is a 2.
Which type of die would you guess he has rolled, a white or a blue one?
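The arithmetic behind the example can be sketched in a few lines of R. The likelihoods follow directly from the description of the dice; the prior proportions of blue dice passed to `posterior_blue()` are hypothetical values, since the slide does not say how many dice of each colour are in the bag.

```r
# Probability of rolling a 2 with each type of die
lik_white <- 1 / 6   # one side out of six shows a 2
lik_blue  <- 3 / 6   # three sides out of six show a 2

# Likelihood ratio: the observed 2 supports "blue" three times as strongly
lik_ratio <- lik_blue / lik_white

# A Bayesian answer also needs the proportion of blue dice in the bag
# (the prior), which is unknown here; the values below are hypothetical
posterior_blue <- function(prior_blue) {
  prior_blue * lik_blue /
    (prior_blue * lik_blue + (1 - prior_blue) * lik_white)
}

posterior_blue(0.5)   # equal numbers of blue and white dice
posterior_blue(0.1)   # mostly white dice in the bag
```

The likelihood ratio is the same whatever is in the bag, whereas the posterior probability changes with the assumed prior.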

30 Street magician - extended
Instead of drawing a die, the street magician generates a random variable $v$ from $N(0, 1)$. Suppose that he observes $v = v_0$. Then, without revealing the realized value $v_0$, he generates a random variable $Y$ from $N(v_0, 1)$ and informs us of the observed value $y_0$.
Can we make an inference about $v_0$ given the observed data $y_0$?
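A numerical sketch of this question, using the hierarchical likelihood $f(y_0 \mid v) f(v)$ on the log scale. The observed value `y0 = 1.3` is a hypothetical number chosen for illustration, and the function and variable names are assumptions. For this normal-normal setup the maximizer is $y_0/2$ and the marginal distribution of $Y$ is $N(0, 2)$, which the code reproduces numerically.

```r
y0 <- 1.3   # hypothetical observed value

# Log hierarchical likelihood: log f(y0 | v) + log f(v)
h <- function(v) {
  dnorm(y0, mean = v, sd = 1, log = TRUE) +
    dnorm(v, mean = 0, sd = 1, log = TRUE)
}

# Maximizing h over v gives a point estimate of the realized v0
v_hat <- optimize(h, interval = c(-10, 10), maximum = TRUE)$maximum

# Integrating v out of exp(h) gives the marginal density of y0
marginal <- integrate(function(v) exp(h(v)), -Inf, Inf)$value

c(v_hat = v_hat, closed_form = y0 / 2)
c(marginal = marginal, closed_form = dnorm(y0, mean = 0, sd = sqrt(2)))
```

The estimate is shrunk halfway towards the mean of $v$, reflecting that the distribution of $v$ and the observation $y_0$ carry equal amounts of information here.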

31 Development of books

32 Thank you!
All R code included in the book available at:


More information

Variability in Annual Temperature Profiles

Variability in Annual Temperature Profiles Variability in Annual Temperature Profiles A Multivariate Spatial Analysis of Regional Climate Model Output Tamara Greasby, Stephan Sain Institute for Mathematics Applied to Geosciences, National Center

More information

Missing Data. Where did it go?

Missing Data. Where did it go? Missing Data Where did it go? 1 Learning Objectives High-level discussion of some techniques Identify type of missingness Single vs Multiple Imputation My favourite technique 2 Problem Uh data are missing

More information

GENREG DID THAT? Clay Barker Research Statistician Developer JMP Division, SAS Institute

GENREG DID THAT? Clay Barker Research Statistician Developer JMP Division, SAS Institute GENREG DID THAT? Clay Barker Research Statistician Developer JMP Division, SAS Institute GENREG WHAT IS IT? The Generalized Regression platform was introduced in JMP Pro 11 and got much better in version

More information

Asreml-R: an R package for mixed models using residual maximum likelihood

Asreml-R: an R package for mixed models using residual maximum likelihood Asreml-R: an R package for mixed models using residual maximum likelihood David Butler 1 Brian Cullis 2 Arthur Gilmour 3 1 Queensland Department of Primary Industries Toowoomba 2 NSW Department of Primary

More information

Introduction to Mixed Models: Multivariate Regression

Introduction to Mixed Models: Multivariate Regression Introduction to Mixed Models: Multivariate Regression EPSY 905: Multivariate Analysis Spring 2016 Lecture #9 March 30, 2016 EPSY 905: Multivariate Regression via Path Analysis Today s Lecture Multivariate

More information

ASReml: AN OVERVIEW. Rajender Parsad 1, Jose Crossa 2 and Juan Burgueno 2

ASReml: AN OVERVIEW. Rajender Parsad 1, Jose Crossa 2 and Juan Burgueno 2 ASReml: AN OVERVIEW Rajender Parsad 1, Jose Crossa 2 and Juan Burgueno 2 1 I.A.S.R.I., Library Avenue, New Delhi - 110 012, India 2 Biometrics and statistics Unit, CIMMYT, Mexico ASReml is a statistical

More information

Applying multiple imputation on waterbird census data Comparing two imputation methods

Applying multiple imputation on waterbird census data Comparing two imputation methods Applying multiple imputation on waterbird census data Comparing two imputation methods ISEC, Montpellier, 1 july 2014 Thierry Onkelinx, Koen Devos & Paul Quataert Thierry.Onkelinx@inbo.be Contents 1 Introduction

More information

An introduction to SPSS

An introduction to SPSS An introduction to SPSS To open the SPSS software using U of Iowa Virtual Desktop... Go to https://virtualdesktop.uiowa.edu and choose SPSS 24. Contents NOTE: Save data files in a drive that is accessible

More information

Lecture 24: Generalized Additive Models Stat 704: Data Analysis I, Fall 2010

Lecture 24: Generalized Additive Models Stat 704: Data Analysis I, Fall 2010 Lecture 24: Generalized Additive Models Stat 704: Data Analysis I, Fall 2010 Tim Hanson, Ph.D. University of South Carolina T. Hanson (USC) Stat 704: Data Analysis I, Fall 2010 1 / 26 Additive predictors

More information

GAMs, GAMMs and other penalized GLMs using mgcv in R. Simon Wood Mathematical Sciences, University of Bath, U.K.

GAMs, GAMMs and other penalized GLMs using mgcv in R. Simon Wood Mathematical Sciences, University of Bath, U.K. GAMs, GAMMs and other penalied GLMs using mgcv in R Simon Wood Mathematical Sciences, University of Bath, U.K. Simple eample Consider a very simple dataset relating the timber volume of cherry trees to

More information

Package glmmml. R topics documented: March 25, Encoding UTF-8 Version Date Title Generalized Linear Models with Clustering

Package glmmml. R topics documented: March 25, Encoding UTF-8 Version Date Title Generalized Linear Models with Clustering Encoding UTF-8 Version 1.0.3 Date 2018-03-25 Title Generalized Linear Models with Clustering Package glmmml March 25, 2018 Binomial and Poisson regression for clustered data, fixed and random effects with

More information

Part I. Hierarchical clustering. Hierarchical Clustering. Hierarchical clustering. Produces a set of nested clusters organized as a

Part I. Hierarchical clustering. Hierarchical Clustering. Hierarchical clustering. Produces a set of nested clusters organized as a Week 9 Based in part on slides from textbook, slides of Susan Holmes Part I December 2, 2012 Hierarchical Clustering 1 / 1 Produces a set of nested clusters organized as a Hierarchical hierarchical clustering

More information

Modelling and Quantitative Methods in Fisheries

Modelling and Quantitative Methods in Fisheries SUB Hamburg A/553843 Modelling and Quantitative Methods in Fisheries Second Edition Malcolm Haddon ( r oc) CRC Press \ y* J Taylor & Francis Croup Boca Raton London New York CRC Press is an imprint of

More information

P-spline ANOVA-type interaction models for spatio-temporal smoothing

P-spline ANOVA-type interaction models for spatio-temporal smoothing P-spline ANOVA-type interaction models for spatio-temporal smoothing Dae-Jin Lee and María Durbán Universidad Carlos III de Madrid Department of Statistics IWSM Utrecht 2008 D.-J. Lee and M. Durban (UC3M)

More information

Markov chain Monte Carlo methods

Markov chain Monte Carlo methods Markov chain Monte Carlo methods (supplementary material) see also the applet http://www.lbreyer.com/classic.html February 9 6 Independent Hastings Metropolis Sampler Outline Independent Hastings Metropolis

More information

C A S I O f x L B UNIVERSITY OF SOUTHERN QUEENSLAND. The Learning Centre Learning and Teaching Support Unit

C A S I O f x L B UNIVERSITY OF SOUTHERN QUEENSLAND. The Learning Centre Learning and Teaching Support Unit C A S I O f x - 8 2 L B UNIVERSITY OF SOUTHERN QUEENSLAND The Learning Centre Learning and Teaching Support Unit MASTERING THE CALCULATOR USING THE CASIO fx-82lb Learning and Teaching Support Unit (LTSU)

More information

Missing Data and Imputation

Missing Data and Imputation Missing Data and Imputation NINA ORWITZ OCTOBER 30 TH, 2017 Outline Types of missing data Simple methods for dealing with missing data Single and multiple imputation R example Missing data is a complex

More information

Package mcemglm. November 29, 2015

Package mcemglm. November 29, 2015 Type Package Package mcemglm November 29, 2015 Title Maximum Likelihood Estimation for Generalized Linear Mixed Models Version 1.1 Date 2015-11-28 Author Felipe Acosta Archila Maintainer Maximum likelihood

More information

Preface to the Second Edition. Preface to the First Edition. 1 Introduction 1

Preface to the Second Edition. Preface to the First Edition. 1 Introduction 1 Preface to the Second Edition Preface to the First Edition vii xi 1 Introduction 1 2 Overview of Supervised Learning 9 2.1 Introduction... 9 2.2 Variable Types and Terminology... 9 2.3 Two Simple Approaches

More information

Bayesian Modelling with JAGS and R

Bayesian Modelling with JAGS and R Bayesian Modelling with JAGS and R Martyn Plummer International Agency for Research on Cancer Rencontres R, 3 July 2012 CRAN Task View Bayesian Inference The CRAN Task View Bayesian Inference is maintained

More information

S H A R P E L R H UNIVERSITY OF SOUTHERN QUEENSLAND. The Learning Centre Learning and Teaching Support Unit

S H A R P E L R H UNIVERSITY OF SOUTHERN QUEENSLAND. The Learning Centre Learning and Teaching Support Unit S H A R P E L - 5 3 1 R H UNIVERSITY OF SOUTHERN QUEENSLAND The Learning Centre Learning and Teaching Support Unit TABLE OF CONTENTS PAGE Introduction 1 A word about starting out 2 1. Addition and subtraction

More information

Performance of Sequential Imputation Method in Multilevel Applications

Performance of Sequential Imputation Method in Multilevel Applications Section on Survey Research Methods JSM 9 Performance of Sequential Imputation Method in Multilevel Applications Enxu Zhao, Recai M. Yucel New York State Department of Health, 8 N. Pearl St., Albany, NY

More information

Linear Models in Medical Imaging. John Kornak MI square February 19, 2013

Linear Models in Medical Imaging. John Kornak MI square February 19, 2013 Linear Models in Medical Imaging John Kornak MI square February 19, 2013 Acknowledgement / Disclaimer Many of the slides in this lecture have been adapted from slides available in talks available on the

More information

Resources for statistical assistance. Quantitative covariates and regression analysis. Methods for predicting continuous outcomes.

Resources for statistical assistance. Quantitative covariates and regression analysis. Methods for predicting continuous outcomes. Resources for statistical assistance Quantitative covariates and regression analysis Carolyn Taylor Applied Statistics and Data Science Group (ASDa) Department of Statistics, UBC January 24, 2017 Department

More information

Note Set 4: Finite Mixture Models and the EM Algorithm

Note Set 4: Finite Mixture Models and the EM Algorithm Note Set 4: Finite Mixture Models and the EM Algorithm Padhraic Smyth, Department of Computer Science University of California, Irvine Finite Mixture Models A finite mixture model with K components, for

More information

Example Using Missing Data 1

Example Using Missing Data 1 Ronald H. Heck and Lynn N. Tabata 1 Example Using Missing Data 1 Creating the Missing Data Variable (Miss) Here is a data set (achieve subset MANOVAmiss.sav) with the actual missing data on the outcomes.

More information

Binary Regression in S-Plus

Binary Regression in S-Plus Fall 200 STA 216 September 7, 2000 1 Getting Started in UNIX Binary Regression in S-Plus Create a class working directory and.data directory for S-Plus 5.0. If you have used Splus 3.x before, then it is

More information

An Introduction to the Bootstrap

An Introduction to the Bootstrap An Introduction to the Bootstrap Bradley Efron Department of Statistics Stanford University and Robert J. Tibshirani Department of Preventative Medicine and Biostatistics and Department of Statistics,

More information

MCMC Methods for data modeling

MCMC Methods for data modeling MCMC Methods for data modeling Kenneth Scerri Department of Automatic Control and Systems Engineering Introduction 1. Symposium on Data Modelling 2. Outline: a. Definition and uses of MCMC b. MCMC algorithms

More information

Package ToTweedieOrNot

Package ToTweedieOrNot Type Package Package ToTweedieOrNot December 1, 2014 Title Code for the paper Generalised linear models for aggregate claims; to Tweedie or not? Version 1.0 Date 2014-11-27 Author Oscar Alberto Quijano

More information

Modelling Proportions and Count Data

Modelling Proportions and Count Data Modelling Proportions and Count Data Rick White May 4, 2016 Outline Analysis of Count Data Binary Data Analysis Categorical Data Analysis Generalized Linear Models Questions Types of Data Continuous data:

More information

Missing Data Analysis for the Employee Dataset

Missing Data Analysis for the Employee Dataset Missing Data Analysis for the Employee Dataset 67% of the observations have missing values! Modeling Setup For our analysis goals we would like to do: Y X N (X, 2 I) and then interpret the coefficients

More information

Linear Modeling with Bayesian Statistics

Linear Modeling with Bayesian Statistics Linear Modeling with Bayesian Statistics Bayesian Approach I I I I I Estimate probability of a parameter State degree of believe in specific parameter values Evaluate probability of hypothesis given the

More information

CS 229 Midterm Review

CS 229 Midterm Review CS 229 Midterm Review Course Staff Fall 2018 11/2/2018 Outline Today: SVMs Kernels Tree Ensembles EM Algorithm / Mixture Models [ Focus on building intuition, less so on solving specific problems. Ask

More information

Optimization and Simulation

Optimization and Simulation Optimization and Simulation Statistical analysis and bootstrapping Michel Bierlaire Transport and Mobility Laboratory School of Architecture, Civil and Environmental Engineering Ecole Polytechnique Fédérale

More information

Stat 8053, Fall 2013: Additive Models

Stat 8053, Fall 2013: Additive Models Stat 853, Fall 213: Additive Models We will only use the package mgcv for fitting additive and later generalized additive models. The best reference is S. N. Wood (26), Generalized Additive Models, An

More information