Hierarchical Modelling for Large Spatial Datasets

Size: px

Start display at page:

Download "Hierarchical Modelling for Large Spatial Datasets"

Sabina Powell
5 years ago
Views:

1 Hierarchical Modelling for Large Spatial Datasets Sudipto Banerjee 1 and Andrew O. Finley 2 1 Department of Forestry & Department of Geography, Michigan State University, Lansing Michigan, U.S.A. 2 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. July 2,

2 Big N problem Introduction The Big n issue Univariate spatial regression Y = Xβ + w + ɛ, Estimation involves (σ 2 R(φ) + τ 2 I) 1, which is n n. Matrix computations occur in each MCMC iteration. Known as the Big-N problem in geostatistics. Approach: Use a model Y = Xβ + Zw + ɛ. But what Z? 2 TIES 2009 Hierarchical Modeling and Analysis

3 Big N problem Predictive Process Models Consider knots S = {s 1,..., s n } with n << n. Let w = {w(s i )}n i=1 Z(θ) = {cov(w(s i ), w(s j ))} {var(w )} 1 is n n. Predictive process regression model Y = Xβ + Z(θ)w + ɛ, Fitting requires only n n matrix computations (n << n). Key attraction: The above arises as a process model: w(s) GP (0, σ 2 w ρ( ; φ)) instead of w(s). ρ(s 1, s 2 ; φ) = cov(w(s 1 ), w )var(w ) 1 cov(w, w(s 2 )) 3 TIES 2009 Hierarchical Modeling and Analysis

4 Big N problem Selection of knots Knots: A Knotty problem?? Knot selection: Regular grid? More knots near locations we have sampled more? Formal spatial design paradigm: maximize information metrics (Zhu and Stein, 2006; Diggle & Lophaven, 2006) Geometric considerations: space-filling designs (Royle & Nychka, 1998); various clustering algorithms Compare performance of estimation of range and smoothness by varying knot size. Stein (2007, 2008): method may not work for fine-scale spatial data Still a popular choice seamlessly adapts to multivariate and spatiotemporal settings. 4 TIES 2009 Hierarchical Modeling and Analysis

5 Big N problem Selection of knots tauˆ knots 5 TIES 2009 Hierarchical Modeling and Analysis

6 Big N problem Comparisons: Unrectified VS Rectified A rectified predictive process is defined as ɛ(s) indep w ɛ (s) = w(s) + ɛ(s), where N(0, σ 2 w(1 r(s, φ) R 1 (φ)r(s, φ))). Maximum likelihood estimates of τ 2 : # of Knots Predictive Process Rectified Predictive Process exact TIES 2009 Hierarchical Modeling and Analysis

7 from: Banerjee, S., A.O. Finley, P. Waldmann, and T. Ericsson. (2009). Hierarchical multivariate modeling of large spatially referenced genetic trial datasets. Journal of the American Statistical Association. Finley, A.O., S. Banerjee, P. Waldmann, and T. Ericsson. (2008) Hierarchical spatial modeling of additive and dominance genetic variance for large spatial trial datasets. Biometrics. DOI: /j x 7 TIES 2009 Hierarchical Modeling and Analysis

8 Modeling genetic variation in Scots pine (Pinus sylvestris L.), long-term progeny study in northern Sweden. Quantitative genetics: studies the inheritance of polygenic traits, focusing upon estimation of additive genetic variance, σa, 2 and the heritability h 2 = σa/σ 2 T 2 ot, where the σ2 T ot represents the total genetic and unexplained variation. A high heritability, h 2, should result in a larger selection response (i.e., a higher probability for genetic gain in future generations). 8 TIES 2009 Hierarchical Modeling and Analysis

9 Observed trees Data overview: established in 1971 (by Skogforsk) partial diallel design of 52 parent trees Observed height 8,160 planted randomly on 2.2m squares 1997 reinventory of 4,970 surviving trees, height, DBH, branch angle, etc. 9 TIES 2009 Hierarchical Modeling and Analysis

10 Genetic effects model: Y i = x T i β + a i + d i + ɛ i, a = [a i ] n i=1 MV N(0, σ2 aa) d = [d i ] n i=1 MV N(0, σ2 d D) ɛ = [ɛ i ] n i=1 N(0, τ 2 I n ) A and D are fixed relationship matrices (See e.g., Henderson, 1985; Lynch and Walsh,1998) Note, genetic variance is further partitioned into additive and the non-additive dominance component σ 2 d 10 TIES 2009 Hierarchical Modeling and Analysis

11 Genetic effects model: Y i = x T i β + a i + d i + ɛ i, Common feature is systematic heterogeneity among observational units (i.e., violation of ɛ N(0, τ 2 I n )) Spatial heterogeneity arises from: soil characteristics micro-climates light availability Residual correlation among units as a function of distance and/or direction = erroneous parameter estimates (e.g., biased h 2 ) 11 TIES 2009 Hierarchical Modeling and Analysis

12 Genetic model results Parameter credible intervals, 50% (2.5%, 97.5%) for the non-spatial models Scots pine trial. Non-spatial Parameter Add. Add. Dom. β (69.66, 75.08) (70.04, 74.57) σa (18.30, 49.85) (14.12, 43.96) σd (11.24, 40.11) τ (121.18, ) (100.51, ) h (0.12, 0.28) 0.15 (0.09, 0.26) 12 TIES 2009 Hierarchical Modeling and Analysis

13 Genetic model results, cont d. Residual surface Residual semivariogram So, ɛ N(0, τ 2 I n ). Consider a spatial model. 13 TIES 2009 Hierarchical Modeling and Analysis

14 Previous approaches to accommodating residual spatial dependence: Manipulate the mean function constructing covariates using residuals from neighboring units (see e.g., Wilkinson et al., 1983; Besag and Kempton, 1986; Williams, 1986) Geostatitical spatial process formed AR(1) col AR(1) row (Martin, 1990; Cullis et al., 1998) classical geostatistical method (Zimmerman and Harville, 1991) All are computationally feasible, but ad hoc and/or restrictive from a modeling perspective. 14 TIES 2009 Hierarchical Modeling and Analysis

15 Spatial model for genetic trials: Y (s i ) = x T (s i )β + a i + d i + w(s i ) + ɛ i, a = [a i ] n i=1 MV N(0, σ2 aa) d = [d i ] n i=1 MV N(0, σ2 d D) w = [w(s i )] n i=1 MV N(0, σ2 wc(θ)) ɛ = [ɛ i ] n i=1 N(0, τ 2 I n ) Tools used to estimate parameters: Markov chain Monte Carlo (MCMC) - iterative Gibbs sampler (β, a, d, w) Metropolis-Hastings and Slice samplers (θ) Here MCMC is computationally infeasible because of Big-N! 15 TIES 2009 Hierarchical Modeling and Analysis

16 Trick to sample genetic effects: Gibbs draw for random effects, e.g., a MV N(µ a, Σ a ), [ ] 1 where calculating Σ a = 1 A 1 + In σa 2 τ is computationally 2 expensive! However A and D are known, so use initial spectral decomposition i.e., A 1 = P T Λ 1 P. ( 1 Thus, Σ a = P T 1 Λ 1 + I) 1 σa 2 τ P to achieve computational 2 benefits. 16 TIES 2009 Hierarchical Modeling and Analysis

17 Unfortunately, this trick does not work for w. Rather, we proposed the knot-based predictive process. Corresponding predictive process model: Y (s i ) = x T (s i )β + a i + d i + w(s i ) + ɛ i, w(s i) = c(s i; θ) T C(θ) 1 (θ)w where, w = [w(s i )] m i=1 MV N(0, C (θ)) and C (θ) = [C(s i, s j ; θ)] m i,j=1 Projection = 17 TIES 2009 Hierarchical Modeling and Analysis

18 w can accomodate complex spatial dependence structures, E.g., anisotropic Matérn correlation function: ρ(s i, s j ; θ) = ( 1/Γ(ν)2 ν 1) ( 2 νd ij ) ν κ ν (2 νd ij ), where d ij = (s i s j ) T Σ 1 (s i s j ), Σ = G(ψ)Λ 2 G T (ψ). Thus, θ = (ν, ψ, Λ). Simulated Predicted 18 TIES 2009 Hierarchical Modeling and Analysis

19 Genetic + spatial effects models Candidate spatial models (i.e., specifications of C (θ)): 1 AR(1) col AR(1) row 2 isotropic Matérn 3 anisotropic Matérn Each model evaluated using 64, 144, and 256 knot grids. Model choice using Deviance Information Criterion (DIC) (Spiegelhalter et al., 2002) 19 TIES 2009 Hierarchical Modeling and Analysis

20 Table: Model comparisons using the DIC criterion for the Scots pine dataset. Model p D DIC Non-spatial Add , Add. Dom , Spatial Isotropic 64 Knots , Knots , Knots , Spatial Anisotropic 64 Knots , Knots , Knots , TIES 2009 Hierarchical Modeling and Analysis

21 Genetic + spatial effects models results Parameter credible intervals, 50% (2.5%, 97.5%) for the isotropic Matérn and 64 and 256 knots Scots pine trial. Spatial Parameter 64 Knots 256 Knots β (69.00, 76.05) (69.66, 79.66) σa (17.14, 41.82) (18.19, 53.69) σd (6.00, 34.27) (7.65, 27.05) σw (23.71, 73.34) (30.24, 88.10) τ (72.11, 99.65) (67.90, 96.16) ν 0.83 ( 0.31, 1.46) 0.47 (0.26, 1.28) φ 0.05 (0.02, 0.09) 0.04 (0.02, 0.09) Eff. Range (44.66, ) (45.22, ) h (0.13, 0.31) 0.25 (0.15, 0.39) Decrease in τ 2 due to removal of spatial variation, results in increase in h 2 (i.e., 0.25 vs with confounding). 21 TIES 2009 Hierarchical Modeling and Analysis

Predictive process balance model richness with computational

22 Genetic + spatial effects models results, cont d. Genetic model residuals w(s), 64 knots w(s), 256 knots Predictive process balance model richness with computational feasibility (e.g., 4,970 4,970 vs ). 22 TIES 2009 Hierarchical Modeling and Analysis

23 Summary Summary Challenge - to meet modeling needs: ensure computationally feasible reduce algorithmic complexity = cheap tricks (e.g., spectral decomp. of A prior to MCMC) reduce dimensionality = predictive process maintain richness and flexibility focus on the model not how to estimate the parameters = embrace new tools (MCMC) for estimating highly flexible hierarchical models truly acknowledge sources of uncertainty propagate uncertainty through hierarchical structures (e.g., recognize uncertainty in C(θ)) 23 TIES 2009 Hierarchical Modeling and Analysis

Monte Carlo for Spatial Models

Monte Carlo for Spatial Models Murali Haran Department of Statistics Penn State University Penn State Computational Science Lectures April 2007 Spatial Models Lots of scientific questions involve analyzing