Modeling Criminal Careers as Departures From a Unimodal Population Age-Crime Curve: The Case of Marijuana Use

Similar documents
Smoothing and Forecasting Mortality Rates with P-splines. Iain Currie. Data and problem. Plan of talk

Analysis of Incomplete Multivariate Data

Deposited on: 07 September 2010

Additive hedonic regression models for the Austrian housing market ERES Conference, Edinburgh, June

CHAPTER 1 INTRODUCTION

Statistical Matching using Fractional Imputation

Monte Carlo for Spatial Models

Spatially Adaptive Bayesian Penalized Regression Splines (P-splines)

Doubly Cyclic Smoothing Splines and Analysis of Seasonal Daily Pattern of CO2 Concentration in Antarctica

Non-Parametric and Semi-Parametric Methods for Longitudinal Data

Bayesian Modelling with JAGS and R

Bayesian Spatiotemporal Modeling with Hierarchical Spatial Priors for fmri

Hierarchical Mixture Models for Nested Data Structures

MCMC Methods for data modeling

REGULARIZED REGRESSION FOR RESERVING AND MORTALITY MODELS GARY G. VENTER

GENREG DID THAT? Clay Barker Research Statistician Developer JMP Division, SAS Institute

Markov chain Monte Carlo methods

Theoretical Concepts of Machine Learning

Hierarchical Modelling for Large Spatial Datasets

Machine Learning and Data Mining. Clustering (1): Basics. Kalev Kask

Bayesian Estimation for Skew Normal Distributions Using Data Augmentation

Machine Learning Techniques for Detecting Hierarchical Interactions in GLM s for Insurance Premiums

FUNCTIONAL magnetic resonance imaging (fmri) is

Quantitative Biology II!

Performance of Sequential Imputation Method in Multilevel Applications

Statistics (STAT) Statistics (STAT) 1. Prerequisites: grade in C- or higher in STAT 1200 or STAT 1300 or STAT 1400

CSCI 599 Class Presenta/on. Zach Levine. Markov Chain Monte Carlo (MCMC) HMM Parameter Es/mates

Lecture VIII. Global Approximation Methods: I

NONPARAMETRIC REGRESSION WIT MEASUREMENT ERROR: SOME RECENT PR David Ruppert Cornell University

FMA901F: Machine Learning Lecture 3: Linear Models for Regression. Cristian Sminchisescu

A noninformative Bayesian approach to small area estimation

Bayesian Statistics Group 8th March Slice samplers. (A very brief introduction) The basic idea

A Short History of Markov Chain Monte Carlo

Statistical techniques for data analysis in Cosmology

A spatio-temporal model for extreme precipitation simulated by a climate model.

Computer vision: models, learning and inference. Chapter 10 Graphical Models

Semiparametric Mixed Effecs with Hierarchical DP Mixture

Clustering Relational Data using the Infinite Relational Model

Approximate Bayesian Computation. Alireza Shafaei - April 2016

Curve Sampling and Geometric Conditional Simulation

Interpolation by Spline Functions

MCMC Diagnostics. Yingbo Li MATH Clemson University. Yingbo Li (Clemson) MCMC Diagnostics MATH / 24

Ludwig Fahrmeir Gerhard Tute. Statistical odelling Based on Generalized Linear Model. íecond Edition. . Springer

A Nonparametric Bayesian Approach to Detecting Spatial Activation Patterns in fmri Data

Lecture 27, April 24, Reading: See class website. Nonparametric regression and kernel smoothing. Structured sparse additive models (GroupSpAM)

Rolling Markov Chain Monte Carlo

Network-based auto-probit modeling for protein function prediction

Convexization in Markov Chain Monte Carlo

STAT 705 Introduction to generalized additive models

REPLACING MLE WITH BAYESIAN SHRINKAGE CAS ANNUAL MEETING NOVEMBER 2018 GARY G. VENTER

Nonparametric regression using kernel and spline methods

GAMs semi-parametric GLMs. Simon Wood Mathematical Sciences, University of Bath, U.K.

Rolling Markov Chain Monte Carlo

Generalized Additive Models

Lecture 21 : A Hybrid: Deep Learning and Graphical Models

Variability in Annual Temperature Profiles

Model validation through "Posterior predictive checking" and "Leave-one-out"

Bayesian Robust Inference of Differential Gene Expression The bridge package

Predictor Selection Algorithm for Bayesian Lasso

Linear Modeling with Bayesian Statistics

Calibration and emulation of TIE-GCM

Warped Mixture Models

INLA: an introduction

Small area estimation by model calibration and "hybrid" calibration. Risto Lehtonen, University of Helsinki Ari Veijanen, Statistics Finland

CS281 Section 9: Graph Models and Practical MCMC

arxiv: v1 [stat.me] 2 Jun 2017

A GENERAL GIBBS SAMPLING ALGORITHM FOR ANALYZING LINEAR MODELS USING THE SAS SYSTEM

Package hergm. R topics documented: January 10, Version Date

Slice sampler algorithm for generalized Pareto distribution

Level-set MCMC Curve Sampling and Geometric Conditional Simulation

08 An Introduction to Dense Continuous Robotic Mapping

Time Series Analysis by State Space Methods

Liangjie Hong*, Dawei Yin*, Jian Guo, Brian D. Davison*

Robust Poisson Surface Reconstruction

Markov Chain Monte Carlo (part 1)

Bayesian adaptive distributed lag models

1.7.1 Laplacian Smoothing

Computational Physics PHYS 420

Smoothing Dissimilarities for Cluster Analysis: Binary Data and Functional Data

BART STAT8810, Fall 2017

Approximate Bayesian Computation methods and their applications for hierarchical statistical models. University College London, 2015

Robust line segmentation for handwritten documents

Introduction to Optimization

davidr Cornell University

Splines. Patrick Breheny. November 20. Introduction Regression splines (parametric) Smoothing splines (nonparametric)

PSY 9556B (Feb 5) Latent Growth Modeling

Accelerated Markov Chain Monte Carlo Sampling in Electrical Capacitance Tomography

DDP ANOVA Model. May 9, ddpanova-package... 1 ddpanova... 2 ddpsurvival... 5 post.pred... 7 post.sim DDP ANOVA with univariate response

Analyzing Longitudinal Data Using Regression Splines

FITTING PIECEWISE LINEAR FUNCTIONS USING PARTICLE SWARM OPTIMIZATION

Supplementary Material for A Locally Linear Regression Model for Boundary Preserving Regularization in Stereo Matching

Regression on a Graph

Image analysis. Computer Vision and Classification Image Segmentation. 7 Image analysis

Learning the Structure of Deep, Sparse Graphical Models

A Comparison of Two MCMC Algorithms for Hierarchical Mixture Models

Curve fitting using linear models

NONPARAMETRIC REGRESSION SPLINES FOR GENERALIZED LINEAR MODELS IN THE PRESENCE OF MEASUREMENT ERROR

Statistics & Analysis. A Comparison of PDLREG and GAM Procedures in Measuring Dynamic Effects

Knowledge Discovery and Data Mining

DPP: Reference documentation. version Luis M. Avila, Mike R. May and Jeffrey Ross-Ibarra 17th November 2017

Transcription:

Modeling Criminal Careers as Departures From a Unimodal Population Curve: The Case of Marijuana Use Donatello Telesca, Elena A. Erosheva, Derek A. Kreader, & Ross Matsueda April 15, 2014

extends Telesca and Inoue (2008) by applying the methods of Telesca and Inoue (2008) to a study of individual age-crime curves in the context of marijuana use over the course of young adulthood

Representative of criminal career trajectories Used to study: of onset of offending Timing of peak offending Duration of criminal careers Types of criminal careers Other features of criminal careers When combined with individual characteristics, can also be used to describe relationship between individual characteristics and features of criminal careers

Relatively new field of statistics The same/similar methods referred to as curve registration, warping regression models, structural averaging... Useful when: Data is observed as a time series for many individuals Features of the time-outcome relationship (peaks, inflection points) are of interest Substantial variation in individual curves

What would we need the data to look like for these questions to be easy to answer?

Marijuana Use

Marijuana Use Marijuana Use

Marijuana Use Marijuana Use

Marijuana Use Marijuana Use

What does the data actually look like?

Marijuana Use Marijuana Use

Marijuana Use Marijuana Use

Marijuana Use Marijuana Use

Marijuana Use Marijuana Use

How does the functional data analysis framework allow us to characterize data of this kind?

Marijuana Use Stochastic

Marijuana Use Stochastic

Marijuana Use Stochastic

Marijuana Use Stochastic

Formulation used by introduced in Ramsay and Li (1998) Introduces model where All individuals share a stochastic time-response function Variation in observed time-response curves is generated by individual, strictly monotonic functions mapping time to stochastic time, called warping functions and noise Flexibility in modeling warping functions obtained by using B-spline basis functions Smoothness of estimated warping functions obtained by penalization of relative curvature

Telesca and Inoue (2008) developed a framework for functional data analysis based on Ramsay and Li (1998) Allows information to be shared across curves Introduces a temporal misalignment window Incoporates smoothing penalty directly into the model Gives a formal account of amplitude and phase variability Allows for exact inference

Telesca etal. is an extension of Telesca and Inoue (2008) Extends the curve registration models, developed for continuous data, to count data Allows scale and time-transformations to depend on individual covariates

Marijuana Use Model mean trajectory as a degree two polynomial in time/age Incorporated individual variation using random effects and/or latent classes Random Effects: Raudenbush and Chan (1993) Latent Classes: Nagin and Land (1993); Roeder et al. (1999) Random Effects & Latent Classes: Muthén and Shedden (1999)

Marijuana Use Model mean trajectory as a degree two polynomial in time/age Incorporate individual variation using random effects and/or latent classes Random Effects: Raudenbush and Chan (1993) Latent Classes: Nagin and Land (1993); Roeder et al. (1999) Random Effects & Latent Classes: Muthén and Shedden (1999)

Marijuana Use λ i (t j, X i ) = a i (X i ) }{{} Amplitude Y ij λ i (t j, X i ) P (λ i (t j, X i )) S (t j, β) }{{} Shape µ i (t j, φ i ; X i ) }{{} Phase Shift = a i (X i ) S (µ i (t j, φ i ; X i ), β)

a i (X i ) E [a i X i ] = exp {X ib a } Gamma conditional prior distribution of a i X i, b a Conjugate prior distribution for Poisson rate parameter Naturally accommodates over dispersion

S (t j, β) S (t j, β) = S B (t j ) β S B (t j ) set of K B-spline basis functions of order four evaluated at time t Positivity and unimodality of S (t j, β) imposed by restrictions on β Second order shrinkage prior on a reparametrization of β shrinks S (t j, β) towards a piecewise linear trajectory

µ i (t j, φ i ; X i ) µ i (t j, φ i ; X i ) = S µ (t j ) φ i (X i ) S µ (t j ) set of Q B-spline basis functions of order four evaluated at time t Monotonicity and boundedness of µ i (t j, φ i ; X i ) imposed by restrictions on φ i First order random walk shrinkage prior on a reparametrization of φ i shrinks φ i towards the B-spline coefficients that yield the identity function, µ i (t j, φ i ; X i ) = t j E [φ iq X i, b φ ] = Γ q + X ib φ

Use Markov Chain Monte Carlo to fit model by combination of Gibbs sampling and Metropolis-Hastings sampling algorithms Use posterior predictive loss to choose number of interior knots

Model allows us to explicitly test: Relationship between level of offending and individual characteristics Unimodality of age-crime curve Relationship between phase shift and individual characteristics

Better familiarity with relevant concepts in foundational papers: Ramsay and Li (1998) Lang and Brezger (2004) Telesca and Inoue (2008) Work through derivations of: Posterior predictive loss for Poisson case Conditional posterior distributions of parameters MCMC algorithm used Begin working through code

Lang, S. and A. Brezger (2004, March). P-Splines. Journal of Computational and Graphical Statistics 13 (1), 183 212. Muthén, B. and K. Shedden (1999, June). Finite Mixture Modeling with Mixture Outcomes Using the EM Algorithm. Biometrics 55 (June), 463 469. Nagin, D. S. and K. C. Land (1993)., Criminal Careers, and Population Heterogeneity: Specification and Estimation of a Nonparametric, Mixed Poisson Model. Criminology 31 (3), 327 362. Ramsay, J. O. and X. Li (1998, May). Curve Registration. Journal of the Royal Statistical Society: Series B 60 (2), 351 363. Raudenbush, S. W. and W.-S. Chan (1993, December). Application of a Hierarchical Linear Model to the Study of Adolescent Deviance in an Overlapping Cohort Design. Journal of Consulting and Clinical Psychology 61 (6), 941 951. Roeder, K., K. G. Lynch, and D. S. Nagin (1999). Modeling Uncertainty in Latent Class Membership: A Case Study in Criminology. Journal of the American Statistical Association 94 (447), 766 776. Telesca, D., E. A. Erosheva, D. A. Kreager, and R. L. Matsueda (2012, December). Modeling Criminal Careers as Departures From a Unimodal Population Crime Curve: The Case of Marijuana Use. Journal of the American Statistical Association 107 (500), 1427 1440. Telesca, D. and L. Y. T. Inoue (2008, March). Hierarchical Curve Registration. Journal of the American Statistical Association 103 (481), 328 339.