INLA: an introduction


Håvard Rue, Norwegian University of Science and Technology, Trondheim, Norway (May). Joint work with S. Martino (Trondheim) and N. Chopin (Paris).

Latent Gaussian models

Stage 1: Observed data y = (y_i), with y_i | x, θ ~ π(y_i | x_i, θ)
Stage 2: Latent Gaussian field x | θ ~ N(μ, Q(θ)^{-1}), Ax = 0
Stage 3: Priors for the hyperparameters, θ ~ π(θ)

These three stages unify many of the most used models in statistics.

Structured additive regression models

Linear predictor:

    η_i = ∑_k β_k z_{ki} + ∑_j w_{ji} f_j(z_{ji}) + ε_i

- Linear effects of covariates {z_{ki}}
- Effects of functions f_j(·), with fixed weights {w_{ji}}; commonly f_j(z_{ji}) = f_{j, z_{ji}}
- The f_j(·) account for smooth responses to covariates, temporally or spatially indexed covariates, and unstructured terms ("random effects")
- The f_j(·) depend on some parameters θ
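
In R-INLA, a predictor of this form is written as a model formula; a minimal sketch (the response y, covariate z1, and index columns time and id are hypothetical names):

    # linear effect of z1, smooth RW2 effect of time, iid random effect of id
    formula <- y ~ z1 + f(time, model = "rw2") + f(id, model = "iid")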

Structured additive regression models (continued)

    η_i = ∑_k β_k z_{ki} + ∑_j w_{ji} f_j(z_{ji}) + ε_i

Observations y with

    π(y | x, θ) = ∏_i π(y_i | η_i, θ),

for example from an exponential family with mean μ_i = g^{-1}(η_i). This is a latent Gaussian model if

    x = ({β_k}, {f_{ji}}, {η_i}) | θ ~ N(μ, Q(θ)^{-1}).

Examples
- Dynamic linear models
- Stochastic volatility
- Generalised linear (mixed) models
- Generalised additive (mixed) models
- Spline smoothing
- Semiparametric regression
- Space-varying (semiparametric) regression models
- Disease mapping
- Log-Gaussian Cox processes
- Model-based geostatistics (*)
- Spatio-temporal models

Example: disease mapping (BYM model)
- Data: y_i ~ Poisson(E_i exp(η_i))
- Log-relative risk: η_i = u_i + v_i + β^T z_i
- Structured component u
- Unstructured component v
- Covariates z_i
- Hyperparameters: log-precisions log κ_u and log κ_v
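
As a concrete sketch, the BYM model above could be fit with R-INLA roughly as follows; the data frame d (columns y, E, region, z) and the graph file name are hypothetical, and model = "bym" is R-INLA's combined structured-plus-unstructured spatial term:

    library(INLA)
    formula <- y ~ z + f(region, model = "bym", graph = "adjacency.graph")
    result  <- inla(formula, family = "poisson", data = d, E = E)  # E_i: expected counts
    summary(result)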

Characteristic features
- Large dimension of the latent Gaussian field
- A lot of conditional independence in the latent Gaussian field
- Few hyperparameters θ: dim(θ) between 1 and 5
- Non-Gaussian data

Main task
- Compute the posterior marginals for the latent field: π(x_i | y), i = 1, ..., n
- Compute the posterior marginals for the hyperparameters: π(θ_j | y), j = 1, ..., dim(θ)

Today's standard approach is to make use of MCMC. Its main difficulties here are CPU time and additive Monte Carlo errors.

Integrated nested Laplace approximations (INLA)
- Utilise that the latent field is Gaussian: Laplace approximations
- Utilise the conditional independence properties of the latent Gaussian field: numerical algorithms for sparse matrices
- Utilise the small dim(θ): integrated nested Laplace approximations

Main results
- A huge improvement in both speed and accuracy compared to MCMC alternatives
- The approximation error is relative; results are practically exact (though counter-examples can be constructed)
- Extensions: marginal likelihood, DIC, cross-validation, ...
- INLA enables us to treat Bayesian latent Gaussian models properly, and to bring these models from the research communities to the end-users

Main ideas (I)
Start from the identity

    π(z) = π(x, z) / π(x | z),

leading to the approximation

    π̃(z) = π(x, z) / π̃(x | z), evaluated at x = mode(z).

When π̃(x | z) is the Gaussian approximation, this is the Laplace approximation. We want π(x | z) to be almost Gaussian.
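
As a toy illustration of this identity (a sketch, not the INLA implementation): approximate π(θ | y) for the one-dimensional model y_i | x ~ Poisson(exp(x)), x | θ ~ N(0, exp(-θ)), assuming a flat prior on θ; all names below are illustrative:

    set.seed(1)
    y <- rpois(5, lambda = exp(0.5))

    log_joint <- function(x, theta) {        # log pi(y, x | theta)
      sum(dpois(y, exp(x), log = TRUE)) +
        dnorm(x, mean = 0, sd = exp(-theta / 2), log = TRUE)
    }

    log_post_theta <- function(theta) {      # Laplace: log pi(theta | y), up to a constant
      x_mode <- optimize(function(x) -log_joint(x, theta), c(-10, 10))$minimum
      h <- 1e-4                              # numerical curvature at the mode
      curv <- -(log_joint(x_mode + h, theta) - 2 * log_joint(x_mode, theta) +
                log_joint(x_mode - h, theta)) / h^2
      # log pi(y, x*, theta) minus the log of the Gaussian approximation at its mode
      log_joint(x_mode, theta) - 0.5 * log(curv / (2 * pi))
    }

    thetas <- seq(-2, 4, by = 0.1)
    lp <- sapply(thetas, log_post_theta)
    post <- exp(lp - max(lp)) / (sum(exp(lp - max(lp))) * 0.1)  # normalise on the grid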

Main ideas (II)
Posterior:

    π(x, θ | y) ∝ π(θ) π(x | θ) ∏_{i ∈ I} π(y_i | x_i, θ)

Do the integration with respect to θ numerically:

    π(x_i | y) = ∫ π(θ | y) π(x_i | θ, y) dθ
    π(θ_j | y) = ∫ π(θ | y) dθ_{-j}

Remarks
1. Expect π̃(θ | y) to be accurate: x | θ is a priori Gaussian and likelihood models are well-behaved, so π(x | θ, y) is almost Gaussian.
2. There are no distributional assumptions on θ | y.
3. Similar remarks apply to π̃(x_i | θ, y).

Computational issues
- Our approach in its raw form is not computationally feasible
- The main issue is the large dimension, n, of the latent field
- Various strategies/tricks are required to obtain a practical, good solution

Gaussian Markov random fields
Make use of the conditional independence properties of the latent field:

    x_i ⊥ x_j | x_{-ij}  ⟺  Q_{ij} = 0,

where Q is the precision matrix (inverse covariance). Use numerical methods for sparse matrices!
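
A small sketch of this point with R's Matrix package: the RW1 precision matrix is tridiagonal, so x_i ⊥ x_j | x_{-ij} whenever |i - j| > 1, and the sparse Cholesky factorisation is cheap (the small value added to the diagonal, making the intrinsic model proper, is for illustration only):

    library(Matrix)
    n <- 1000
    # RW1 precision: nonzeros only on the main and first off-diagonals
    Q <- bandSparse(n, k = c(0, 1),
                    diagonals = list(c(1, rep(2, n - 2), 1) + 1e-5,
                                     rep(-1, n - 1)),
                    symmetric = TRUE)
    L <- Cholesky(Q, LDL = FALSE)   # sparse Cholesky factorisation
    b <- rnorm(n)
    x <- solve(L, b)                # solves Q x = b using the factorisation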

Rank one updates
- We have computed the Cholesky factorisation Prec(x) = Q = LL^T
- We want the conditional mean and variances of x_{-i} | x_i
- There is no need to factorise Prec(x_{-i} | x_i); we can compute the correction for conditioning on x_i directly.
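
The standard Gaussian conditioning identities behind this correction, written in covariance form (Σ = Q^{-1}):

    E(x_{-i} | x_i)   = μ_{-i} + Σ_{-i,i} Σ_{ii}^{-1} (x_i - μ_i)
    Cov(x_{-i} | x_i) = Σ_{-i,-i} - Σ_{-i,i} Σ_{ii}^{-1} Σ_{i,-i}

Conditioning on a single component changes the covariance by a rank-one term, which can be applied as a cheap correction once Q has been factorised.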

Simplified Laplace approximation
Expand the Laplace approximation of π(x_i | θ, y), up to an additive constant:

    log π(x_i | θ, y) = -(1/2) x_i^2 + b_i x_i + (1/6) d_i x_i^3 + ...

Remarks:
- Correct the Gaussian approximation for errors in shift and skewness through b_i and d_i
- Fit a skew-normal density, 2 φ(x) Φ(a x)
- Computationally fast
- Sufficiently accurate for most applications
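
The skew-normal form used here is easy to plot directly in base R; a quick sketch (the skewness value a = 3 is arbitrary):

    # skew-normal density 2*phi(x)*Phi(a*x); a controls the skewness
    dskewnorm <- function(x, a) 2 * dnorm(x) * pnorm(a * x)
    curve(dskewnorm(x, a = 3), from = -4, to = 4, ylab = "density")
    curve(dnorm(x), add = TRUE, lty = 2)   # symmetric Gaussian for comparison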

The integrated nested Laplace approximation (INLA), step I
Explore π(θ | y):
- Locate the mode
- Use the Hessian to construct new variables
- Grid search

The integrated nested Laplace approximation (INLA), step II
For each θ_j and each i, compute the (simplified) Laplace approximation of π(x_i | θ_j, y).

The integrated nested Laplace approximation (INLA), step III
Sum out θ_j: for each i,

    π(x_i | y) ∝ ∑_j π(x_i | y, θ_j) π(θ_j | y)

Build a log-spline corrected Gaussian,

    N(x_i; μ_i, σ_i^2) × exp(spline),

to represent π(x_i | y).
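
A sketch of this summation step, with hypothetical grid weights and conditional moments standing in for the quantities INLA actually computes:

    theta_w <- c(0.2, 0.5, 0.3)       # pi(theta_j | y), normalised over the grid
    cond_mu <- c(1.10, 1.00, 0.80)    # E(x_i | theta_j, y)
    cond_sd <- c(0.30, 0.25, 0.35)    # sd(x_i | theta_j, y)

    post_xi <- function(x) {          # pi(x_i | y) as a finite mixture over the grid
      out <- 0
      for (j in seq_along(theta_w))
        out <- out + theta_w[j] * dnorm(x, cond_mu[j], cond_sd[j])
      out
    }
    curve(post_xi(x), from = -1, to = 3)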

Assessing the error
How can we assess the errors in the approximations? This is important, but asymptotic arguments are difficult: dim(y) = O(n) and dim(x) = O(n).

Errors in the approximations of π(x_i | y)
Compare a sequence of improved approximations:
1. Gaussian approximation
2. Simplified Laplace
3. Laplace
Compute the full Laplace approximation of π(x_i | y, θ_j) only if the Gaussian and the simplified Laplace approximations disagree.

Overall check: equivalent number of replicates
Tool 3: estimate the effective number of parameters, from the deviance information criterion:

    p_D(θ) ≈ n - trace( Q_prior(θ) Q_post(θ)^{-1} )

Compare with the number of observations: a high ratio #observations / p_D(θ) is good. This check has a theoretical justification.
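
A toy numerical check of this formula for a purely Gaussian model, where the posterior precision is the prior precision plus the observation precisions (all values are made up):

    n <- 50
    Q_prior <- diag(2, n)                            # prior precision
    Q_post  <- Q_prior + diag(0.5, n)                # posterior precision
    p_D <- n - sum(diag(Q_prior %*% solve(Q_post)))  # effective number of parameters
    n / p_D                                          # equivalent replicates per parameter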

Examples (workshop programme)
- 10:30-11:30 Andrea Riebler: Performance of INLA analysing bivariate meta-regression and age-period-cohort models
- 12:30-13:30 Birgit Schrödle: Spatio-temporal disease mapping using INLA
- 13:45-14:45 Virgilio Gómez-Rubio: Approximate Bayesian inference for small area estimation
- 15:00-16:00 Lea Fortunato: About the Rapid Inquiry Facility, and spatial analyses with WinBUGS and INLA
- 09:00-10:00 Rupali Akerkar: Approximate Bayesian inference for survival models
- 10:15-11:30 Ingelin Steinsland & Anna Marie Holand: Animal model and INLA

Extensions: marginal likelihood
The marginal likelihood is the normalising constant for π(θ | y).
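
Schematically, this constant can be approximated from quantities already computed, with x*(θ) the conditional mode and π_G the Gaussian approximation (a sketch of the idea, not the exact implementation):

    π(y) ≈ ∫ [ π(x, θ, y) / π_G(x | θ, y) ] evaluated at x = x*(θ) dθ,

computed numerically over the same θ-grid used for π(θ | y).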

Extensions: deviance information criterion (DIC)

    D(x; θ) = -2 ∑_i log π(y_i | x_i, θ)
    DIC = 2 Mean(D(x; θ)) - D(Mean(x); θ*)
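
A small sketch of these two ingredients for Poisson data, using made-up posterior draws of the linear predictor in place of INLA's internal quantities:

    set.seed(2)
    y <- rpois(20, lambda = 3)
    eta_draws <- matrix(rnorm(1000 * 20, mean = log(3), sd = 0.1), nrow = 1000)

    deviance <- function(eta) -2 * sum(dpois(y, exp(eta), log = TRUE))
    D_bar <- mean(apply(eta_draws, 1, deviance))   # Mean(D(x; theta))
    D_hat <- deviance(colMeans(eta_draws))         # D(Mean(x); theta*)
    DIC   <- 2 * D_bar - D_hat
    p_D   <- D_bar - D_hat                         # effective number of parameters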

Cross-validation
Based on

    π(x_i | y_{-i}, θ) ∝ π(x_i | y, θ) / π(y_i | x_i, θ),

we can compute π(y_i | y_{-i}). Similarly for π(θ | y_{-i}); keep the integration points {θ_j} fixed.
Detect surprising observations via Prob(y_i^new ≤ y_i | y_{-i}).
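
The identity above also gives the Monte Carlo form π(y_i | y_{-i}) = 1 / E[ 1 / π(y_i | x_i, θ) | y ]; a sketch for a single Poisson observation, with stand-in posterior draws:

    set.seed(3)
    y_i <- 4
    x_i_draws <- rnorm(5000, mean = log(3), sd = 0.2)  # stand-in draws from pi(x_i | y)
    cpo_i <- 1 / mean(1 / dpois(y_i, exp(x_i_draws)))  # pi(y_i | y_{-i})
    pit_i <- mean(ppois(y_i, exp(x_i_draws)))          # crude PIT check, without the
                                                       # leave-one-out correction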

Recent developments in (R-)INLA
- Improving the code: speedups and improved parallel performance
- Improving the R interface
- Extending the building blocks: prior models and likelihood models
- Lots of things still to do: documentation, worked-out examples, webpage, ...

Improving the code
The two largest changes the last year:
- (Summer 2008) Changed the way inla works in a multi-core environment; especially important for more than dual-core.
- (Xmas 2008) Implemented a more efficient rank-one update formula (thanks Birgit!); especially important for models with many constraints, but it gave a huge speedup also for models with a single constraint.

inla & multi-core
- Gradient and Hessian computations, ∂π(θ | y)/∂θ_i and ∂^2 π(θ | y)/∂θ_i ∂θ_j, are done in parallel
- All computations of π(x_i | θ_j, y) are now done in parallel with respect to j, not i as before

Multi-core example
inla example 7: CANCER-INCIDENCE, n = 7138, dim(θ) = 3

    Number of cores   Time used
    4                 9.1s
    6                 8.6s
    8                 8.3s

Linear algebra (Cholesky/solve/inverse): 50%. Administration: 50%.

Recent developments in (R-)INLA: improved R interface
- The inla binary is now bundled with the R package
- Make use of model.matrix() in R: formula ~ ... + x*z expands to formula ~ ... + x + z + x:z
- Allow formula ~ ... + offset(a) + offset(b), defining a+b to be a fixed offset in the formula
- Larger changes for survival models (more on this tomorrow ...)
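
A quick base-R check of this expansion:

    d <- data.frame(x = rnorm(4), z = rnorm(4))
    colnames(model.matrix(~ x * z, d))
    # "(Intercept)" "x" "z" "x:z"  -- x*z is shorthand for x + z + x:z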

Building blocks

    > names(inla.models()$models)
     [1] "iid"                "rw1"                "rw2"
     [4] "crw2"               "seasonal"           "besag"
     [7] "ar1"                "generic"            "2diid"
    [10] "2diidwishart"       "2diidwishartpart0"  "2diidwishartpart1"
    [13] "3diidwishartpart0"  "3diidwishartpart1"  "3diidwishartpart2"
    [16] "z"                  "rw2d"               "matern2d"

The z-model
The z-model is η = ... + Zz + ..., where Z is an n × k matrix and z ~ N_k(0, τI).

Likelihood models

    > names(inla.models()$lmodels)
     [1] "poisson"                 "binomial"
     [3] "exponential"             "piecewise.constant"
     [5] "piecewise.linear"        "gaussian"
     [7] "laplace"                 "weibull"
     [9] "weibullcure"             "stochvol_t"
    [11] "zeroinflated_poisson_0"  "zeroinflated_poisson_1"
    [13] "zeroinflated_binomial_0" "zeroinflated_binomial_1"
    [15] "T"                       "stochvol.nig"

Zero-inflated models
The type-0 likelihood is defined as

    Prob(y | ...) = p 1[y = 0] + (1 - p) Poisson(y | y > 0),

and the type-1 likelihood as

    Prob(y | ...) = p 1[y = 0] + (1 - p) Poisson(y).

[Birgit: a mixture over p is still on the list ... sorry.]
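
Both densities are easy to write out; a sketch in base R (y a non-negative integer count, lambda the Poisson mean):

    # type 1: inflate the zero, otherwise an ordinary Poisson
    dzip1 <- function(y, p, lambda)
      p * (y == 0) + (1 - p) * dpois(y, lambda)

    # type 0: inflate the zero, otherwise a zero-truncated Poisson
    dzip0 <- function(y, p, lambda)
      p * (y == 0) + (1 - p) * (y > 0) * dpois(y, lambda) / (1 - dpois(0, lambda))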

The TODO list
- Better support for spatial (GMRF) models; more on this tomorrow (Finn L.)
- Alternative spline models, B-splines? (Thomas K.?)
- Simultaneous credibility intervals
- Documentation, worked-out examples, webpage, ...

Summary (I)
Latent Gaussian models unify many models into the same framework:
- A unified approach towards Bayesian inference
- The same computer code for all of them
- Near-optimal numerical algorithms for the sparse matrices
- Nice speed-ups in a multi-core environment
- Practically exact results

Prototype implementation, the inla program:
- Inference for structured additive regression models
- GAM-like R interface to the inla program
- Available for Linux/Mac/Windows
- Open source

Summary (II)
It is our view that the prospects of this work are more important than the work itself:
- Make latent Gaussian models more applicable, useful and appealing for the end user
- Allow us to use latent Gaussian models as baseline models, even in cases where more complex models are more appropriate


Package ICsurv. February 19, 2015

Package ICsurv. February 19, 2015 Package ICsurv February 19, 2015 Type Package Title A package for semiparametric regression analysis of interval-censored data Version 1.0 Date 2014-6-9 Author Christopher S. McMahan and Lianming Wang

More information

Multiple imputation using chained equations: Issues and guidance for practice

Multiple imputation using chained equations: Issues and guidance for practice Multiple imputation using chained equations: Issues and guidance for practice Ian R. White, Patrick Royston and Angela M. Wood http://onlinelibrary.wiley.com/doi/10.1002/sim.4067/full By Gabrielle Simoneau

More information

Preface to the Second Edition. Preface to the First Edition. 1 Introduction 1

Preface to the Second Edition. Preface to the First Edition. 1 Introduction 1 Preface to the Second Edition Preface to the First Edition vii xi 1 Introduction 1 2 Overview of Supervised Learning 9 2.1 Introduction... 9 2.2 Variable Types and Terminology... 9 2.3 Two Simple Approaches

More information

An introduction to SPSS

An introduction to SPSS An introduction to SPSS To open the SPSS software using U of Iowa Virtual Desktop... Go to https://virtualdesktop.uiowa.edu and choose SPSS 24. Contents NOTE: Save data files in a drive that is accessible

More information

Structured Learning. Jun Zhu

Structured Learning. Jun Zhu Structured Learning Jun Zhu Supervised learning Given a set of I.I.D. training samples Learn a prediction function b r a c e Supervised learning (cont d) Many different choices Logistic Regression Maximum

More information

Bayesian Estimation for Skew Normal Distributions Using Data Augmentation

Bayesian Estimation for Skew Normal Distributions Using Data Augmentation The Korean Communications in Statistics Vol. 12 No. 2, 2005 pp. 323-333 Bayesian Estimation for Skew Normal Distributions Using Data Augmentation Hea-Jung Kim 1) Abstract In this paper, we develop a MCMC

More information

Hierarchical Modelling for Large Spatial Datasets

Hierarchical Modelling for Large Spatial Datasets Hierarchical Modelling for Large Spatial Datasets Sudipto Banerjee 1 and Andrew O. Finley 2 1 Department of Forestry & Department of Geography, Michigan State University, Lansing Michigan, U.S.A. 2 Biostatistics,

More information

The glmmml Package. August 20, 2006

The glmmml Package. August 20, 2006 The glmmml Package August 20, 2006 Version 0.65-1 Date 2006/08/20 Title Generalized linear models with clustering A Maximum Likelihood and bootstrap approach to mixed models. License GPL version 2 or newer.

More information

SGN (4 cr) Chapter 11

SGN (4 cr) Chapter 11 SGN-41006 (4 cr) Chapter 11 Clustering Jussi Tohka & Jari Niemi Department of Signal Processing Tampere University of Technology February 25, 2014 J. Tohka & J. Niemi (TUT-SGN) SGN-41006 (4 cr) Chapter

More information

Bayesian Methods. David Rosenberg. April 11, New York University. David Rosenberg (New York University) DS-GA 1003 April 11, / 19

Bayesian Methods. David Rosenberg. April 11, New York University. David Rosenberg (New York University) DS-GA 1003 April 11, / 19 Bayesian Methods David Rosenberg New York University April 11, 2017 David Rosenberg (New York University) DS-GA 1003 April 11, 2017 1 / 19 Classical Statistics Classical Statistics David Rosenberg (New

More information

GAMs, GAMMs and other penalized GLMs using mgcv in R. Simon Wood Mathematical Sciences, University of Bath, U.K.

GAMs, GAMMs and other penalized GLMs using mgcv in R. Simon Wood Mathematical Sciences, University of Bath, U.K. GAMs, GAMMs and other penalied GLMs using mgcv in R Simon Wood Mathematical Sciences, University of Bath, U.K. Simple eample Consider a very simple dataset relating the timber volume of cherry trees to

More information

ECE521: Week 11, Lecture March 2017: HMM learning/inference. With thanks to Russ Salakhutdinov

ECE521: Week 11, Lecture March 2017: HMM learning/inference. With thanks to Russ Salakhutdinov ECE521: Week 11, Lecture 20 27 March 2017: HMM learning/inference With thanks to Russ Salakhutdinov Examples of other perspectives Murphy 17.4 End of Russell & Norvig 15.2 (Artificial Intelligence: A Modern

More information

Summary: A Tutorial on Learning With Bayesian Networks

Summary: A Tutorial on Learning With Bayesian Networks Summary: A Tutorial on Learning With Bayesian Networks Markus Kalisch May 5, 2006 We primarily summarize [4]. When we think that it is appropriate, we comment on additional facts and more recent developments.

More information

Package mcemglm. November 29, 2015

Package mcemglm. November 29, 2015 Type Package Package mcemglm November 29, 2015 Title Maximum Likelihood Estimation for Generalized Linear Mixed Models Version 1.1 Date 2015-11-28 Author Felipe Acosta Archila Maintainer Maximum likelihood

More information

Scalable Multidimensional Hierarchical Bayesian Modeling on Spark

Scalable Multidimensional Hierarchical Bayesian Modeling on Spark Scalable Multidimensional Hierarchical Bayesian Modeling on Spark Robert Ormandi, Hongxia Yang and Quan Lu Yahoo! Sunnyvale, CA 2015 Click-Through-Rate (CTR) Prediction Estimating the probability of click

More information

Discussion on Bayesian Model Selection and Parameter Estimation in Extragalactic Astronomy by Martin Weinberg

Discussion on Bayesian Model Selection and Parameter Estimation in Extragalactic Astronomy by Martin Weinberg Discussion on Bayesian Model Selection and Parameter Estimation in Extragalactic Astronomy by Martin Weinberg Phil Gregory Physics and Astronomy Univ. of British Columbia Introduction Martin Weinberg reported

More information

A Bayesian General Linear Modeling Approach to Cortical Surface fmri Data Analysis

A Bayesian General Linear Modeling Approach to Cortical Surface fmri Data Analysis A Bayesian General Linear Modeling Approach to Cortical Surface fmri Data Analysis Amanda F. Mejia a, Yu Ryan Yue b, David Bolin c, Finn Lindgren d and Martin A. Lindquist e arxiv:1706.00959v1 [stat.ap]

More information

3D Human Motion Analysis and Manifolds

3D Human Motion Analysis and Manifolds D E P A R T M E N T O F C O M P U T E R S C I E N C E U N I V E R S I T Y O F C O P E N H A G E N 3D Human Motion Analysis and Manifolds Kim Steenstrup Pedersen DIKU Image group and E-Science center Motivation

More information

Linear Modeling with Bayesian Statistics

Linear Modeling with Bayesian Statistics Linear Modeling with Bayesian Statistics Bayesian Approach I I I I I Estimate probability of a parameter State degree of believe in specific parameter values Evaluate probability of hypothesis given the

More information

low bias high variance high bias low variance error test set training set high low Model Complexity Typical Behaviour Lecture 11:

low bias high variance high bias low variance error test set training set high low Model Complexity Typical Behaviour Lecture 11: Lecture 11: Overfitting and Capacity Control high bias low variance Typical Behaviour low bias high variance Sam Roweis error test set training set November 23, 4 low Model Complexity high Generalization,

More information

Introduction to Mobile Robotics

Introduction to Mobile Robotics Introduction to Mobile Robotics Gaussian Processes Wolfram Burgard Cyrill Stachniss Giorgio Grisetti Maren Bennewitz Christian Plagemann SS08, University of Freiburg, Department for Computer Science Announcement

More information

Splines and penalized regression

Splines and penalized regression Splines and penalized regression November 23 Introduction We are discussing ways to estimate the regression function f, where E(y x) = f(x) One approach is of course to assume that f has a certain shape,

More information

Wildfire Chances and Probabilistic Risk Assessment

Wildfire Chances and Probabilistic Risk Assessment Wildfire Chances and Probabilistic Risk Assessment D R Brillinger, H K Preisler and H M Naderi Statistics Department University of California Berkeley, CA, 97-386 Pacific Southwest Research Station USDA

More information

Issues in MCMC use for Bayesian model fitting. Practical Considerations for WinBUGS Users

Issues in MCMC use for Bayesian model fitting. Practical Considerations for WinBUGS Users Practical Considerations for WinBUGS Users Kate Cowles, Ph.D. Department of Statistics and Actuarial Science University of Iowa 22S:138 Lecture 12 Oct. 3, 2003 Issues in MCMC use for Bayesian model fitting

More information

Machine Learning / Jan 27, 2010

Machine Learning / Jan 27, 2010 Revisiting Logistic Regression & Naïve Bayes Aarti Singh Machine Learning 10-701/15-781 Jan 27, 2010 Generative and Discriminative Classifiers Training classifiers involves learning a mapping f: X -> Y,

More information

Spatial Statistics using R-INLA and Gaussian Markov random fields

Spatial Statistics using R-INLA and Gaussian Markov random fields Spatial Statistics using R-INLA and Gaussian Markov random fields David Bolin and Johan Lindström 1 Introduction In this lab we will look at an example of how to use the SPDE models in the INLA package.

More information

Infectious Disease Models. Angie Dobbs. A Thesis. Presented to. The University of Guelph. In partial fulfilment of requirements.

Infectious Disease Models. Angie Dobbs. A Thesis. Presented to. The University of Guelph. In partial fulfilment of requirements. Issues of Computational Efficiency and Model Approximation for Individual-Level Infectious Disease Models by Angie Dobbs A Thesis Presented to The University of Guelph In partial fulfilment of requirements

More information

Missing Data Analysis for the Employee Dataset

Missing Data Analysis for the Employee Dataset Missing Data Analysis for the Employee Dataset 67% of the observations have missing values! Modeling Setup For our analysis goals we would like to do: Y X N (X, 2 I) and then interpret the coefficients

More information

Regression. Dr. G. Bharadwaja Kumar VIT Chennai

Regression. Dr. G. Bharadwaja Kumar VIT Chennai Regression Dr. G. Bharadwaja Kumar VIT Chennai Introduction Statistical models normally specify how one set of variables, called dependent variables, functionally depend on another set of variables, called

More information

Spatial Interpolation & Geostatistics

Spatial Interpolation & Geostatistics (Z i Z j ) 2 / 2 Spatial Interpolation & Geostatistics Lag Lag Mean Distance between pairs of points 1 Tobler s Law All places are related, but nearby places are related more than distant places Corollary:

More information

ECE521 Lecture 18 Graphical Models Hidden Markov Models

ECE521 Lecture 18 Graphical Models Hidden Markov Models ECE521 Lecture 18 Graphical Models Hidden Markov Models Outline Graphical models Conditional independence Conditional independence after marginalization Sequence models hidden Markov models 2 Graphical

More information

Computer vision: models, learning and inference. Chapter 10 Graphical Models

Computer vision: models, learning and inference. Chapter 10 Graphical Models Computer vision: models, learning and inference Chapter 10 Graphical Models Independence Two variables x 1 and x 2 are independent if their joint probability distribution factorizes as Pr(x 1, x 2 )=Pr(x

More information