Generalized additive models II
|
|
- Thomasine Ferguson
- 6 years ago
- Views:
Transcription
1 Generalized additive models II Patrick Breheny October 13 Patrick Breheny BST 764: Applied Statistical Modeling 1/23
2 Coronary heart disease study Today s lecture will feature several case studies involving the application of smoothing splines and GAMs to real data sets Our first study looks at many potential risk factors for coronary heart disease (CHD): sbp: Systolic blood pressure tobacco: Cumulative lifetime tobacco consumption (kg) ldl: Low-density lipoprotein adiposity: % of weight from fat obesity: Body mass index famhist: Family history of CHD typea: A measure of stress alcohol: Current alcohol consumption age Patrick Breheny BST 764: Applied Statistical Modeling 2/23
3 CHD study: Plan of analysis There are many ways one could analyze this data, but since we are interested in all the potential risk factors, one reasonable way to proceed is to model each of the continuous factors using smoothing splines, with the degrees of freedom for each smooth estimated by GCV Here there is one categorical factor, family history, which is highly significant (p < ) with an odds ratio of 2.6 (1.6, 4.1) Patrick Breheny BST 764: Applied Statistical Modeling 3/23
4 CHD study: Smoothing term tests The remaining 8 predictors are continuous; cumulatively they add degrees of freedom to the model We are interested in carrying out likelihood ratio tests for these terms Because there are so many smooth terms, it is desirable to fix the smoothing parameters {λ j } at their full-model values; otherwise the hypothesis tests will become muddled by the fact that the reduced model differs not only in terms of its components, but also how flexible each term is allowed to be Patrick Breheny BST 764: Applied Statistical Modeling 4/23
5 CHD GAM Case study #1 Overall effect: p = 0.21 Linearity: p = 0.12 Overall effect: p = < Linearity: p = 0.02 s(sbp,1.24) s(tobacco,5.87) sbp tobacco Overall effect: p = < 0.01 Overall effect: p = 0.02 Linearity: p = 0.03 s(ldl,1) s(obesity,2.2) ldl obesity Patrick Breheny BST 764: Applied Statistical Modeling 5/23
6 CHD GAM (cont d) Overall effect: p = 0.20 Overall effect: p = < 0.01 Linearity: p = 0.05 s(adiposity,1) s(typea,3.33) adiposity typea Overall effect: p = 0.90 Overall effect: p = < 0.01 Linearity: p = 0.15 s(alcohol,1) s(age,3.39) alcohol age Patrick Breheny BST 764: Applied Statistical Modeling 6/23
7 Asthma CAFO study As another example, I collaborated with Dr. Sanderson on a project involving exposure to air pollution from concentrated animal feeding operations (CAFOs) and its effect on developing asthma in children The exposure metric was calculated based on the proximity of the home to a CAFO (or CAFOs), the size of those CAFOs, and adjusted for wind patterns to develop an overall CAFO exposure score Based on previous studies of asthma, we also adjusted for a number of risk factors: age, gender, socioeconomic status, allergies, premature birth, early respiratory infection, and whether the individual worked with pigs Patrick Breheny BST 764: Applied Statistical Modeling 7/23
8 CAFO study: Plan of analysis We could once again model each continuous term using smoothing splines, although since the focus of the study was on a single risk factor, I modeled CAFO exposure nonparametrically and included the others as simple linear terms The estimation of how CAFO exposure affects asthma risk was the purpose of the study; an interesting additional question is whether or not any interactions are present involving this exposure In the course of the study, we found some evidence for an interaction with early respiratory illness (the test for interaction had a p-value of 0.06) Patrick Breheny BST 764: Applied Statistical Modeling 8/23
9 CAFO study: Results log(cafo) Log odds (Asthma) Patrick Breheny BST 764: Applied Statistical Modeling 9/23
10 CAFO study: Results (cont d) Deviance Model explained Base 28.8% Linear 31.1% Spline 33.0% p-value for overall effect of CAFO:.001 p-value for nonlinear component of CAFO:.03 Patrick Breheny BST 764: Applied Statistical Modeling 10/23
11 CAFO study: Possible interaction No early respiratory infection log(cafo) Effect on log(odds) Early respiratory infection log(cafo) Effect on log(odds) Patrick Breheny BST 764: Applied Statistical Modeling 11/23
12 GAMMs Case study #1 I am glossing over a complication present in the CAFO study: the observations were not all independent Specifically, sampling was done at the household/family level and one would expect some correlation among subjects within a household Possibilities for dealing with this are to include a random intercept (a Generalized Additive Mixed Model, or GAMM), or to use generalized estimating equations Patrick Breheny BST 764: Applied Statistical Modeling 12/23
13 Case study #3 GAMs are wonderful tools, but they do rely on that assumption of additivity; what if effects are not additive? To put it another way, suppose we have x 1 and x 2, both continuous variables; can we estimate E(y x) = f(x 1, x 2 ) in a completely nonparametric way, without making any assumptions on f? Patrick Breheny BST 764: Applied Statistical Modeling 13/23
14 Tensor product splines Case study #3 One approach to constructing multidimensional splines is using the tensor product basis Suppose we specify a set of basis functions {h 1k } for x 1 and {h 2k } for x 2, with M 1 and M 2 elements, respectively The tensor product basis for the two-dimensional smooth function of x 1 and x 2 is given by and has M 1 M 2 elements g jk (x 1, x 2 ) = h 1j (x 1 )h 2k (x 2 ) Patrick Breheny BST 764: Applied Statistical Modeling 14/23
15 Thin plate splines Case study #1 Case study #3 The multidimensional analog of smoothing splines are called thin plate regression splines For two dimensions, we find f(x 1, x 2 ) that minimizes n l{y i, f(x 1i, x 2i )}+λ i=1 [ 2 ] 2 [ f 2 ] 2 f u u v [ 2 f v 2 Thin plate regression splines have fairly complicated basis functions ] 2 dudv Patrick Breheny BST 764: Applied Statistical Modeling 15/23
16 2D smoothing Case study #1 Case study #3 Thin plate spline ldl age Patrick Breheny BST 764: Applied Statistical Modeling 16/23
17 Case study #3 Restrictions imposed by various models GLM GLM w/ interaction ldl ldl age age GAM Thin plate ldl ldl age age Patrick Breheny BST 764: Applied Statistical Modeling 17/23
18 The curse of dimensionality Case study #3 Thin plate regression splines can be extended further into higher dimensions, but they become rather computationally intensive as the dimension exceeds 2 Also, the curse of dimensionality implies that we need an exponentially increasing amount of data to maintain accuracy as p increases Furthermore, multidimensional smooth functions are harder to visualize and interpret In other words, it is nice to have the option of including interactions, but one generally cannot stray too far from the structure provided by the additive model Patrick Breheny BST 764: Applied Statistical Modeling 18/23
19 WISEWOMAN Case study #1 Case study #3 Our final case study involves WISEWOMAN, a project initiated by the CDC offering early screenings of risk factors for heart disease, diabetes, and other chronic diseases in financially disadvantaged women In Iowa, the project contained an intervention component designed to effect lifestyle changes in at-risk individuals Patrick Breheny BST 764: Applied Statistical Modeling 19/23
20 Attendance Case study #1 Case study #3 However, no intervention can be successful if the individuals targeted do not attend In order to design better interventions, it is of interest to study factors associated with attendance vs. nonattendance In particular, investigators are interested in the role that travel distance plays in attendance Patrick Breheny BST 764: Applied Statistical Modeling 20/23
21 WISEWOMAN study: Analysis Case study #3 As with the CAFO study, the emphasis was on a particular term, so distance was modeled using a smoothing spline while other factors affecting attendance were controlled for using linear terms In this study, we found significant evidence for an interaction between travel distance and age Furthermore, we found evidence of a three-way interaction, in that the above two-dimensional smooth function differed in urban and rural settings Patrick Breheny BST 764: Applied Statistical Modeling 21/23
22 WISEWOMAN: Rural results Case study #3 Probability(Attend) 0.9 Probability(Attend) Distance Age Age Distance Patrick Breheny BST 764: Applied Statistical Modeling 22/23
23 Case study #3 WISEWOMAN: Urban vs. rural (cross-sections) Patrick Breheny BST 764: Applied Statistical Modeling 23/23
Generalized additive models I
I Patrick Breheny October 6 Patrick Breheny BST 764: Applied Statistical Modeling 1/18 Introduction Thus far, we have discussed nonparametric regression involving a single covariate In practice, we often
More informationPatrick Breheny. November 10
Patrick Breheny November Patrick Breheny BST 764: Applied Statistical Modeling /6 Introduction Our discussion of tree-based methods in the previous section assumed that the outcome was continuous Tree-based
More informationSplines. Patrick Breheny. November 20. Introduction Regression splines (parametric) Smoothing splines (nonparametric)
Splines Patrick Breheny November 20 Patrick Breheny STA 621: Nonparametric Statistics 1/46 Introduction Introduction Problems with polynomial bases We are discussing ways to estimate the regression function
More informationSplines and penalized regression
Splines and penalized regression November 23 Introduction We are discussing ways to estimate the regression function f, where E(y x) = f(x) One approach is of course to assume that f has a certain shape,
More informationGeneralized Additive Models
:p Texts in Statistical Science Generalized Additive Models An Introduction with R Simon N. Wood Contents Preface XV 1 Linear Models 1 1.1 A simple linear model 2 Simple least squares estimation 3 1.1.1
More informationThis is called a linear basis expansion, and h m is the mth basis function For example if X is one-dimensional: f (X) = β 0 + β 1 X + β 2 X 2, or
STA 450/4000 S: February 2 2005 Flexible modelling using basis expansions (Chapter 5) Linear regression: y = Xβ + ɛ, ɛ (0, σ 2 ) Smooth regression: y = f (X) + ɛ: f (X) = E(Y X) to be specified Flexible
More informationLecture 16: High-dimensional regression, non-linear regression
Lecture 16: High-dimensional regression, non-linear regression Reading: Sections 6.4, 7.1 STATS 202: Data mining and analysis November 3, 2017 1 / 17 High-dimensional regression Most of the methods we
More informationMSA220/MVE440 Statistical Learning for Big Data
MSA220/MVE440 Statistical Learning for Big Data Lecture 2 Rebecka Jörnsten Mathematical Sciences University of Gothenburg and Chalmers University of Technology Classification - selection of tuning parameters
More informationGoals of the Lecture. SOC6078 Advanced Statistics: 9. Generalized Additive Models. Limitations of the Multiple Nonparametric Models (2)
SOC6078 Advanced Statistics: 9. Generalized Additive Models Robert Andersen Department of Sociology University of Toronto Goals of the Lecture Introduce Additive Models Explain how they extend from simple
More informationGeneralized Additive Model
Generalized Additive Model by Huimin Liu Department of Mathematics and Statistics University of Minnesota Duluth, Duluth, MN 55812 December 2008 Table of Contents Abstract... 2 Chapter 1 Introduction 1.1
More informationStat 8053, Fall 2013: Additive Models
Stat 853, Fall 213: Additive Models We will only use the package mgcv for fitting additive and later generalized additive models. The best reference is S. N. Wood (26), Generalized Additive Models, An
More informationNonparametric Regression
Nonparametric Regression John Fox Department of Sociology McMaster University 1280 Main Street West Hamilton, Ontario Canada L8S 4M4 jfox@mcmaster.ca February 2004 Abstract Nonparametric regression analysis
More informationStatistics & Analysis. Fitting Generalized Additive Models with the GAM Procedure in SAS 9.2
Fitting Generalized Additive Models with the GAM Procedure in SAS 9.2 Weijie Cai, SAS Institute Inc., Cary NC July 1, 2008 ABSTRACT Generalized additive models are useful in finding predictor-response
More informationGeneralized Additive Models
Generalized Additive Models Statistics 135 Autumn 2005 Copyright c 2005 by Mark E. Irwin Generalized Additive Models GAMs are one approach to non-parametric regression in the multiple predictor setting.
More informationRandom Forest A. Fornaser
Random Forest A. Fornaser alberto.fornaser@unitn.it Sources Lecture 15: decision trees, information theory and random forests, Dr. Richard E. Turner Trees and Random Forests, Adele Cutler, Utah State University
More informationMission Lipid Data Management Software User s Guide
Mission Lipid Data Management Software User s Guide V1.0 September 2018 Table of Contents 1. Overview...1 1.1 About the Mission Lipid Data Management Software...1 1.2 System Requirements...1 1.3 Materials
More informationGAMs semi-parametric GLMs. Simon Wood Mathematical Sciences, University of Bath, U.K.
GAMs semi-parametric GLMs Simon Wood Mathematical Sciences, University of Bath, U.K. Generalized linear models, GLM 1. A GLM models a univariate response, y i as g{e(y i )} = X i β where y i Exponential
More informationGLM II. Basic Modeling Strategy CAS Ratemaking and Product Management Seminar by Paul Bailey. March 10, 2015
GLM II Basic Modeling Strategy 2015 CAS Ratemaking and Product Management Seminar by Paul Bailey March 10, 2015 Building predictive models is a multi-step process Set project goals and review background
More informationThe Bootstrap and Jackknife
The Bootstrap and Jackknife Summer 2017 Summer Institutes 249 Bootstrap & Jackknife Motivation In scientific research Interest often focuses upon the estimation of some unknown parameter, θ. The parameter
More informationRegression. Dr. G. Bharadwaja Kumar VIT Chennai
Regression Dr. G. Bharadwaja Kumar VIT Chennai Introduction Statistical models normally specify how one set of variables, called dependent variables, functionally depend on another set of variables, called
More informationPredict Outcomes and Reveal Relationships in Categorical Data
PASW Categories 18 Specifications Predict Outcomes and Reveal Relationships in Categorical Data Unleash the full potential of your data through predictive analysis, statistical learning, perceptual mapping,
More informationVine Medical Group Patient Registration Form Your Information
Your Information Welcome to Vine Medical Group. In order for us to offer you the high standards of clinical care we give to our patients, we ask that you complete this registration form. Before we are
More informationRoad Map for CAT4 Suite. CAT4 Road Map. Road Map for CAT4 Suite
Road Map for CAT4 Suite CAT4 Road Map Road Map for CAT4 Suite The Pen CS Clinical Audit Tool (CAT Plus Suite) is an extraction tool that takes a snapshot of your clinical data and allows you to analyse
More informationCH9.Generalized Additive Model
CH9.Generalized Additive Model Regression Model For a response variable and predictor variables can be modeled using a mean function as follows: would be a parametric / nonparametric regression or a smoothing
More informationUsing the SemiPar Package
Using the SemiPar Package NICHOLAS J. SALKOWSKI Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455, USA salk0008@umn.edu May 15, 2008 1 Introduction The
More informationWhat is machine learning?
Machine learning, pattern recognition and statistical data modelling Lecture 12. The last lecture Coryn Bailer-Jones 1 What is machine learning? Data description and interpretation finding simpler relationship
More informationIPUMS Training and Development: Requesting Data
IPUMS Training and Development: Requesting Data IPUMS PMA Exercise 2 OBJECTIVE: Gain an understanding of how IPUMS PMA service delivery point datasets are structured and how it can be leveraged to explore
More informationKeywords- Classification algorithm, Hypertensive, K Nearest Neighbor, Naive Bayesian, Data normalization
GLOBAL JOURNAL OF ENGINEERING SCIENCE AND RESEARCHES APPLICATION OF CLASSIFICATION TECHNIQUES TO DETECT HYPERTENSIVE HEART DISEASE Tulasimala B. N* 1, Elakkiya S 2 & Keerthana N 3 *1 Assistant Professor,
More informationA popular method for moving beyond linearity. 2. Basis expansion and regularization 1. Examples of transformations. Piecewise-polynomials and splines
A popular method for moving beyond linearity 2. Basis expansion and regularization 1 Idea: Augment the vector inputs x with additional variables which are transformation of x use linear models in this
More informationCurve fitting. Lab. Formulation. Truncation Error Round-off. Measurement. Good data. Not as good data. Least squares polynomials.
Formulating models We can use information from data to formulate mathematical models These models rely on assumptions about the data or data not collected Different assumptions will lead to different models.
More informationLecture 24: Generalized Additive Models Stat 704: Data Analysis I, Fall 2010
Lecture 24: Generalized Additive Models Stat 704: Data Analysis I, Fall 2010 Tim Hanson, Ph.D. University of South Carolina T. Hanson (USC) Stat 704: Data Analysis I, Fall 2010 1 / 26 Additive predictors
More informationLast time... Bias-Variance decomposition. This week
Machine learning, pattern recognition and statistical data modelling Lecture 4. Going nonlinear: basis expansions and splines Last time... Coryn Bailer-Jones linear regression methods for high dimensional
More informationA Feasibility and Acceptability Study of the Provision
A Feasibility and Acceptability Study of the Provision of mhealth Interventions for Behavior Change in Prehypertensive subjects in Argentina, Guatemala, and Peru. Med e Tel 2012 Beratarrechea A 1, Fernandez
More informationStatistical Methods for the Analysis of Repeated Measurements
Charles S. Davis Statistical Methods for the Analysis of Repeated Measurements With 20 Illustrations #j Springer Contents Preface List of Tables List of Figures v xv xxiii 1 Introduction 1 1.1 Repeated
More informationLearning-based Neuroimage Registration
Learning-based Neuroimage Registration Leonid Teverovskiy and Yanxi Liu 1 October 2004 CMU-CALD-04-108, CMU-RI-TR-04-59 School of Computer Science Carnegie Mellon University Pittsburgh, PA 15213 Abstract
More informationLecture 17: Smoothing splines, Local Regression, and GAMs
Lecture 17: Smoothing splines, Local Regression, and GAMs Reading: Sections 7.5-7 STATS 202: Data mining and analysis November 6, 2017 1 / 24 Cubic splines Define a set of knots ξ 1 < ξ 2 < < ξ K. We want
More informationMultiple imputation using chained equations: Issues and guidance for practice
Multiple imputation using chained equations: Issues and guidance for practice Ian R. White, Patrick Royston and Angela M. Wood http://onlinelibrary.wiley.com/doi/10.1002/sim.4067/full By Gabrielle Simoneau
More informationHeart Disease Prediction on Continuous Time Series Data with Entropy Feature Selection and DWT Processing
Heart Disease Prediction on Continuous Time Series Data with Entropy Feature Selection and DWT Processing 1 Veena N, 2 Dr.Anitha N 1 Research Scholar(VTU), Information Science and Engineering, B M S Institute
More informationSandeep Kharidhi and WenSui Liu ChoicePoint Precision Marketing
Generalized Additive Model and Applications in Direct Marketing Sandeep Kharidhi and WenSui Liu ChoicePoint Precision Marketing Abstract Logistic regression 1 has been widely used in direct marketing applications
More informationEYECARE REGISTRATION AND HISTORY
EYECARE REGISTRATION AND HISTORY PATIENT INFORMATION INSURANCE Date Who is responsible for this account? Patient Relationship to Patient Address Insurance Co. Group # City State Zip Is patient covered
More informationSection 2.2 Normal Distributions. Normal Distributions
Section 2.2 Normal Distributions Normal Distributions One particularly important class of density curves are the Normal curves, which describe Normal distributions. All Normal curves are symmetric, single-peaked,
More information2017 Partners in Excellence Executive Overview, Targets, and Methodology
2017 Partners in Excellence Executive Overview, s, and Methodology Overview The Partners in Excellence program forms the basis for HealthPartners financial and public recognition for medical or specialty
More informationStatistical Tests for Variable Discrimination
Statistical Tests for Variable Discrimination University of Trento - FBK 26 February, 2015 (UNITN-FBK) Statistical Tests for Variable Discrimination 26 February, 2015 1 / 31 General statistics Descriptional:
More informationGAMs, GAMMs and other penalized GLMs using mgcv in R. Simon Wood Mathematical Sciences, University of Bath, U.K.
GAMs, GAMMs and other penalied GLMs using mgcv in R Simon Wood Mathematical Sciences, University of Bath, U.K. Simple eample Consider a very simple dataset relating the timber volume of cherry trees to
More informationUnit 5 Logistic Regression Practice Problems
Unit 5 Logistic Regression Practice Problems SOLUTIONS R Users Source: Afifi A., Clark VA and May S. Computer Aided Multivariate Analysis, Fourth Edition. Boca Raton: Chapman and Hall, 2004. Exercises
More informationFitting latency models using B-splines in EPICURE for DOS
Fitting latency models using B-splines in EPICURE for DOS Michael Hauptmann, Jay Lubin January 11, 2007 1 Introduction Disease latency refers to the interval between an increment of exposure and a subsequent
More informationSTAT 705 Introduction to generalized additive models
STAT 705 Introduction to generalized additive models Timothy Hanson Department of Statistics, University of South Carolina Stat 705: Data Analysis II 1 / 22 Generalized additive models Consider a linear
More informationUsing Multivariate Adaptive Regression Splines (MARS ) to enhance Generalised Linear Models. Inna Kolyshkina PriceWaterhouseCoopers
Using Multivariate Adaptive Regression Splines (MARS ) to enhance Generalised Linear Models. Inna Kolyshkina PriceWaterhouseCoopers Why enhance GLM? Shortcomings of the linear modelling approach. GLM being
More informationCS6375: Machine Learning Gautam Kunapuli. Mid-Term Review
Gautam Kunapuli Machine Learning Data is identically and independently distributed Goal is to learn a function that maps to Data is generated using an unknown function Learn a hypothesis that minimizes
More informationMissing Data Missing Data Methods in ML Multiple Imputation
Missing Data Missing Data Methods in ML Multiple Imputation PRE 905: Multivariate Analysis Lecture 11: April 22, 2014 PRE 905: Lecture 11 Missing Data Methods Today s Lecture The basics of missing data:
More informationFMA901F: Machine Learning Lecture 3: Linear Models for Regression. Cristian Sminchisescu
FMA901F: Machine Learning Lecture 3: Linear Models for Regression Cristian Sminchisescu Machine Learning: Frequentist vs. Bayesian In the frequentist setting, we seek a fixed parameter (vector), with value(s)
More informationdavidr Cornell University
1 NONPARAMETRIC RANDOM EFFECTS MODELS AND LIKELIHOOD RATIO TESTS Oct 11, 2002 David Ruppert Cornell University www.orie.cornell.edu/ davidr (These transparencies and preprints available link to Recent
More informationIBM SPSS Categories. Predict outcomes and reveal relationships in categorical data. Highlights. With IBM SPSS Categories you can:
IBM Software IBM SPSS Statistics 19 IBM SPSS Categories Predict outcomes and reveal relationships in categorical data Highlights With IBM SPSS Categories you can: Visualize and explore complex categorical
More informationIBM SPSS Categories 23
IBM SPSS Categories 23 Note Before using this information and the product it supports, read the information in Notices on page 55. Product Information This edition applies to version 23, release 0, modification
More informationUser Guide Chronic Disease Management Tariffs e-form
User Guide Chronic Disease Management Tariffs e-form Page 1 of 15 Table of Contents Computer Requirements... 3 Accessing the e-form... 3 Are eforms Secure?... 4 Completing the Chronic Disease Management
More informationStat 528 (Autumn 2008) Density Curves and the Normal Distribution. Measures of center and spread. Features of the normal distribution
Stat 528 (Autumn 2008) Density Curves and the Normal Distribution Reading: Section 1.3 Density curves An example: GRE scores Measures of center and spread The normal distribution Features of the normal
More informationPATIENT INFORMATION SPOUSE INFORMATION REFERRAL INFORMATION INSURANCE INFORMATION IN CASE OF EMERGENCY
Today s date: PATIENT INFORMATION Patient s Last name: First: Middle: Physician Name: Mr. Sex: Marital status (circle one) Single / Mar / Div / Sep / Wid Mailing address: City: State: ZIP Code: D.O.B:
More informationStatistics & Analysis. A Comparison of PDLREG and GAM Procedures in Measuring Dynamic Effects
A Comparison of PDLREG and GAM Procedures in Measuring Dynamic Effects Patralekha Bhattacharya Thinkalytics The PDLREG procedure in SAS is used to fit a finite distributed lagged model to time series data
More informationApplied Statistics : Practical 9
Applied Statistics : Practical 9 This practical explores nonparametric regression and shows how to fit a simple additive model. The first item introduces the necessary R commands for nonparametric regression
More information4.5 The smoothed bootstrap
4.5. THE SMOOTHED BOOTSTRAP 47 F X i X Figure 4.1: Smoothing the empirical distribution function. 4.5 The smoothed bootstrap In the simple nonparametric bootstrap we have assumed that the empirical distribution
More informationMoving Beyond Linearity
Moving Beyond Linearity The truth is never linear! 1/23 Moving Beyond Linearity The truth is never linear! r almost never! 1/23 Moving Beyond Linearity The truth is never linear! r almost never! But often
More informationPerception Viewed as an Inverse Problem
Perception Viewed as an Inverse Problem Fechnerian Causal Chain of Events - an evaluation Fechner s study of outer psychophysics assumes that the percept is a result of a causal chain of events: Distal
More information2018 Partners in Excellence Executive Overview, Targets, and Methodology
2018 Partners in Excellence Executive Overview, s, and Methodology Overview The Partners in Excellence program forms the basis for HealthPartners financial and public recognition for medical or specialty
More informationExample Using Missing Data 1
Ronald H. Heck and Lynn N. Tabata 1 Example Using Missing Data 1 Creating the Missing Data Variable (Miss) Here is a data set (achieve subset MANOVAmiss.sav) with the actual missing data on the outcomes.
More informationBuilding Better Parametric Cost Models
Building Better Parametric Cost Models Based on the PMI PMBOK Guide Fourth Edition 37 IPDI has been reviewed and approved as a provider of project management training by the Project Management Institute
More informationHierarchical Generalized Linear Models
Generalized Multilevel Linear Models Introduction to Multilevel Models Workshop University of Georgia: Institute for Interdisciplinary Research in Education and Human Development 07 Generalized Multilevel
More information2015 Partners in Excellence Executive Overview, Targets, and Methodology
2015 Partners in Excellence Executive Overview, s, and Methodology Overview The Partners in Excellence program forms the basis for HealthPartners financial and public recognition for medical or specialty
More informationAcquisition Description Exploration Examination Understanding what data is collected. Characterizing properties of data.
Summary Statistics Acquisition Description Exploration Examination what data is collected Characterizing properties of data. Exploring the data distribution(s). Identifying data quality problems. Selecting
More information1. Study Registration. 2. Confirm Registration
USER MANUAL 1. Study Registration Diabetic patients are more susceptible to experiencing cardiovascular events, but this can be minimized with control of blood glucose levels and other risk factors (blood
More informationLog-linear Models of Contingency Tables: Multidimensional Tables
CSSS/SOC/STAT 536: Logistic Regression and Log-linear Models Log-linear Models of Contingency Tables: Multidimensional Tables Christopher Adolph University of Washington, Seattle March 10, 2005 Assistant
More informationMachine Learning. Topic 4: Linear Regression Models
Machine Learning Topic 4: Linear Regression Models (contains ideas and a few images from wikipedia and books by Alpaydin, Duda/Hart/ Stork, and Bishop. Updated Fall 205) Regression Learning Task There
More informationPreface to the Second Edition. Preface to the First Edition. 1 Introduction 1
Preface to the Second Edition Preface to the First Edition vii xi 1 Introduction 1 2 Overview of Supervised Learning 9 2.1 Introduction... 9 2.2 Variable Types and Terminology... 9 2.3 Two Simple Approaches
More informationINTRODUCTION to SAS STATISTICAL PACKAGE LAB 3
Topics: Data step Subsetting Concatenation and Merging Reference: Little SAS Book - Chapter 5, Section 3.6 and 2.2 Online documentation Exercise I LAB EXERCISE The following is a lab exercise to give you
More informationPackage glmpath. R topics documented: January 28, Version 0.98 Date
Version 0.98 Date 2018-01-27 Package glmpath January 28, 2018 Title L1 Regularization Path for Generalized Linear Models and Cox Proportional Hazards Model Author Mee Young Park, Trevor Hastie Maintainer
More informationMachine Learning / Jan 27, 2010
Revisiting Logistic Regression & Naïve Bayes Aarti Singh Machine Learning 10-701/15-781 Jan 27, 2010 Generative and Discriminative Classifiers Training classifiers involves learning a mapping f: X -> Y,
More informationLecture 27: Review. Reading: All chapters in ISLR. STATS 202: Data mining and analysis. December 6, 2017
Lecture 27: Review Reading: All chapters in ISLR. STATS 202: Data mining and analysis December 6, 2017 1 / 16 Final exam: Announcements Tuesday, December 12, 8:30-11:30 am, in the following rooms: Last
More informationCPSC 340: Machine Learning and Data Mining
CPSC 340: Machine Learning and Data Mining Feature Selection Original version of these slides by Mark Schmidt, with modifications by Mike Gelbart. Admin Assignment 3: Due Friday Midterm: Feb 14 in class
More informationMixed Effects Models. Biljana Jonoska Stojkova Applied Statistics and Data Science Group (ASDa) Department of Statistics, UBC.
Mixed Effects Models Biljana Jonoska Stojkova Applied Statistics and Data Science Group (ASDa) Department of Statistics, UBC March 6, 2018 Resources for statistical assistance Department of Statistics
More informationChapter 19 Estimating the dose-response function
Chapter 19 Estimating the dose-response function Dose-response Students and former students often tell me that they found evidence for dose-response. I know what they mean because that's the phrase I have
More information* * * * * * * * * * * * * * * ** * **
Generalized additive models Trevor Hastie and Robert Tibshirani y 1 Introduction In the statistical analysis of clinical trials and observational studies, the identication and adjustment for prognostic
More informationMissing Data and Imputation
Missing Data and Imputation Hoff Chapter 7, GH Chapter 25 April 21, 2017 Bednets and Malaria Y:presence or absence of parasites in a blood smear AGE: age of child BEDNET: bed net use (exposure) GREEN:greenness
More informationNonparametric Risk Attribution for Factor Models of Portfolios. October 3, 2017 Kellie Ottoboni
Nonparametric Risk Attribution for Factor Models of Portfolios October 3, 2017 Kellie Ottoboni Outline The problem Page 3 Additive model of returns Page 7 Euler s formula for risk decomposition Page 11
More informationExploring Persuasiveness of Just-in-time Motivational Messages for Obesity Management
Exploring Persuasiveness of Just-in-time Motivational Messages for Obesity Management Megha Maheshwari 1, Samir Chatterjee 1, David Drew 2 1 Network Convergence Lab, Claremont Graduate University http://ncl.cgu.edu
More informationA toolbox of smooths. Simon Wood Mathematical Sciences, University of Bath, U.K.
A toolbo of smooths Simon Wood Mathematical Sciences, University of Bath, U.K. Smooths for semi-parametric GLMs To build adequate semi-parametric GLMs requires that we use functions with appropriate properties.
More informationWeek 2: Frequency distributions
Types of data Health Sciences M.Sc. Programme Applied Biostatistics Week 2: distributions Data can be summarised to help to reveal information they contain. We do this by calculating numbers from the data
More informationMachine Learning in the Wild. Dealing with Messy Data. Rajmonda S. Caceres. SDS 293 Smith College October 30, 2017
Machine Learning in the Wild Dealing with Messy Data Rajmonda S. Caceres SDS 293 Smith College October 30, 2017 Analytical Chain: From Data to Actions Data Collection Data Cleaning/ Preparation Analysis
More informationMachine Learning with MATLAB --classification
Machine Learning with MATLAB --classification Stanley Liang, PhD York University Classification the definition In machine learning and statistics, classification is the problem of identifying to which
More informationNina Zumel and John Mount Win-Vector LLC
SUPERVISED LEARNING IN R: REGRESSION Logistic regression to predict probabilities Nina Zumel and John Mount Win-Vector LLC Predicting Probabilities Predicting whether an event occurs (yes/no): classification
More informationSTENO Introductory R-Workshop: Loading a Data Set Tommi Suvitaival, Steno Diabetes Center June 11, 2015
STENO Introductory R-Workshop: Loading a Data Set Tommi Suvitaival, tsvv@steno.dk, Steno Diabetes Center June 11, 2015 Contents 1 Introduction 1 2 Recap: Variables 2 3 Data Containers 2 3.1 Vectors................................................
More informationCPSC 695. Methods for interpolation and analysis of continuing surfaces in GIS Dr. M. Gavrilova
CPSC 695 Methods for interpolation and analysis of continuing surfaces in GIS Dr. M. Gavrilova Overview Data sampling for continuous surfaces Interpolation methods Global interpolation Local interpolation
More informationLecture 7: Linear Regression (continued)
Lecture 7: Linear Regression (continued) Reading: Chapter 3 STATS 2: Data mining and analysis Jonathan Taylor, 10/8 Slide credits: Sergio Bacallado 1 / 14 Potential issues in linear regression 1. Interactions
More informationIntroduction to RStudio
First, take class through processes of: Signing in Changing password: Tools -> Shell, then use passwd command Installing packages Check that at least these are installed: MASS, ISLR, car, class, boot,
More informationEXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY
EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA, 2015 MODULE 4 : Modelling experimental data Time allowed: Three hours Candidates should answer FIVE questions. All questions carry equal
More informationUnified Methods for Censored Longitudinal Data and Causality
Mark J. van der Laan James M. Robins Unified Methods for Censored Longitudinal Data and Causality Springer Preface v Notation 1 1 Introduction 8 1.1 Motivation, Bibliographic History, and an Overview of
More informationExpectation Maximization (EM) and Gaussian Mixture Models
Expectation Maximization (EM) and Gaussian Mixture Models Reference: The Elements of Statistical Learning, by T. Hastie, R. Tibshirani, J. Friedman, Springer 1 2 3 4 5 6 7 8 Unsupervised Learning Motivation
More informationmhealth Applications in CVD Prevention and Treatment Intersection of mhealth and CVD Physical Activity 2/18/2015
mhealth Applications in CVD Prevention and Treatment Theodore Feldman, MD, FACC, FACP Medical Director, Center for Prevention and Wellness at Baptist Health South Florida Medical Director, Miami Cardiac
More information1. Two double lectures about deformable contours. 4. The transparencies define the exam requirements. 1. Matlab demonstration
Practical information INF 5300 Deformable contours, I An introduction 1. Two double lectures about deformable contours. 2. The lectures are based on articles, references will be given during the course.
More informationSimulation studies. Patrick Breheny. September 8. Monte Carlo simulation Example: Ridge vs. Lasso vs. Subset
Simulation studies Patrick Breheny September 8 Patrick Breheny BST 764: Applied Statistical Modeling 1/17 Introduction In statistics, we are often interested in properties of various estimation and model
More informationmhealth Services to Link Health and Environment
mhealth Services to Link Health and Environment A/Prof. Vijay Sivaraman School of Electrical Eng. & Telecoms. University of New South Wales http://www2.ee.unsw.edu.au/~vijay/ 1 Overview I am a technologist:
More informationStatistical Learning Part 2 Nonparametric Learning: The Main Ideas. R. Moeller Hamburg University of Technology
Statistical Learning Part 2 Nonparametric Learning: The Main Ideas R. Moeller Hamburg University of Technology Instance-Based Learning So far we saw statistical learning as parameter learning, i.e., given
More information