GLM II: Basic Modeling Strategy. CAS Ratemaking and Product Management Seminar, by Paul Bailey. March 10, 2015


Building predictive models is a multi-step process

Process overview: set project goals and review background; gather and prepare data; explore data; build component predictive models; validate component models; combine component models; incorporate constraints.

- Ernesto walked us through the first three components.
- We will now go through an example of the remaining steps:
  - Building component predictive models: we will illustrate how to build a frequency model.
  - Validating component models: we will illustrate how to validate your component model.
  - We will also briefly discuss combining models and incorporating implementation constraints.
- The goal should be to build the best predictive models now and incorporate constraints later.

Building component predictive models can be separated into two steps

- Initial modeling: select the error structure and link function, build a simple initial model, and test basic modeling assumptions and methodology.
- Iterative modeling: refine your initial models through a series of iterative steps, complicating the model, then simplifying the model, then repeating.

Initial modeling

Initial modeling is done to test basic modeling methodology:
- Is my link function appropriate?
- Is my error structure appropriate?
- Is my overall modeling methodology appropriate? (E.g., do I need to cap losses? Exclude expense-only claims? Model by peril?)

Examples of error structures

Error structures reflect the variability of the underlying process and can be any distribution within the exponential family, for example:
- Gamma: consistent with severity modeling; may also want to try inverse Gaussian
- Tweedie: consistent with pure premium modeling
- Poisson: consistent with frequency modeling
- Normal: useful for a variety of applications

Generally accepted error structures and link functions

Use generally accepted standards as a starting point for link functions and error structures:

Observed Response   Most Appropriate Link Function   Most Appropriate Error Structure   Variance Function
--                  --                               Normal                             µ^0
Claim Frequency     Log                              Poisson                            µ^1
Claim Severity      Log                              Gamma                              µ^2
Claim Severity      Log                              Inverse Gaussian                   µ^3
Pure Premium        Log                              Gamma or Tweedie                   µ^T
Retention Rate      Logit                            Binomial                           µ(1-µ)
Conversion Rate     Logit                            Binomial                           µ(1-µ)
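To make the log-link/Poisson frequency combination above concrete, here is a minimal pure-Python sketch of fitting a one-predictor Poisson GLM by Newton-Raphson. The single binary predictor and the data are hypothetical, and in practice you would use a GLM package rather than hand-rolled code.

```python
import math

def fit_poisson_loglink(x, y, n_iter=25):
    """Fit a log-link Poisson GLM, log(mu) = b0 + b1*x, by Newton-Raphson.

    The score is X'(y - mu); the Fisher information is X'WX with W = diag(mu).
    """
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        g0 = sum(yi - mi for yi, mi in zip(y, mu))                # score wrt b0
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))  # score wrt b1
        h00 = sum(mu)                                             # 2x2 information
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det                         # Newton step
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical data: x flags a rating group, y is the observed claim count
b0, b1 = fit_poisson_loglink([0.0, 0.0, 1.0, 1.0], [1.0, 1.0, 2.0, 4.0])
# Fitted group means equal observed group means: exp(b0) = 1, exp(b0 + b1) = 3
```

With a log link, exp(b1) is the multiplicative relativity of the second group, which is why the log link is the standard choice for rating-plan models.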

Build an initial model

Reasonable starting points for model structure:
- Prior model
- Stepwise regression
- General insurance knowledge
- CART (Classification and Regression Trees) or similar algorithms

Test model assumptions

A plot of all residuals tests the selected error structure and link function.

[Figure: three residual plots]
- Studentized standardized deviance residuals vs. fitted value: two concentrations suggest two perils; split the data or use joint modeling.
- Normal error structure/log link (studentized standardized deviance residuals vs. transformed fitted value): an asymmetrical appearance suggests the power of the variance function is too low.
- Crunched residuals (group size: 72) vs. fitted value: an elliptical pattern is ideal.

Use crunched residuals for frequency.

Example: initial frequency model

- Link function: Log
- Error structure: Poisson
- Initial variables selected based on industry knowledge: gender, driver age, vehicle value, area (territory)
- Variables NOT in the initial model: vehicle body, vehicle age

[Figures: modeled relativities for gender, driver age, vehicle value, and area]

Example: initial frequency model - residuals

Frequency residuals are hard to interpret without crunching. The raw plot shows two clusters:
- Data points with claims
- Data points without claims

Example: initial frequency model - residuals

- Order observations from smallest to largest predicted value.
- Group the residuals into 500 buckets.
- The graph plots the average residual in each bucket.

Crunched residuals look good!
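The crunching steps above can be sketched in a few lines. This is a simplified illustration (the bucket count and the inputs are whatever you pass in), not the exact routine behind the slide's graph.

```python
def crunch_residuals(predicted, residuals, n_buckets=500):
    """'Crunch' residuals: order observations by predicted value, split them
    into equal-size buckets, and average the residuals within each bucket."""
    pairs = sorted(zip(predicted, residuals))   # smallest to largest prediction
    size = max(1, len(pairs) // n_buckets)      # observations per bucket
    crunched = []
    for i in range(0, len(pairs), size):
        bucket = pairs[i:i + size]
        avg_pred = sum(p for p, _ in bucket) / len(bucket)
        avg_res = sum(r for _, r in bucket) / len(bucket)
        crunched.append((avg_pred, avg_res))
    return crunched
```

Plotting the averaged residual against the averaged prediction removes the with-claims/without-claims clustering, which is what makes the crunched plot interpretable for frequency models.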


Iterative modeling

Initial models are refined using an iterative modeling approach. Iterative modeling involves many decisions to complicate and simplify the models. Your modeling toolbox can help you make these decisions; we will discuss your tools shortly.

[Diagram: review model, then simplify (exclude variables, group levels, fit curves) or complicate (include variables, add interactions), and repeat]

Ideal model structure

Goal: to produce a sensible model that explains recent historical experience and is likely to be predictive of future experience.

[Diagram: predictive power vs. model complexity (number of parameters), ranging from the overall mean (underfit: poor explanatory power) to one parameter per observation (overfit: explains history but has poor predictive power), with the best model in between]

Your modeling toolbox

Model decisions include:
- Simplification: excluding variables, grouping levels, fitting curves
- Complication: including variables, adding interactions

Your modeling toolbox will help you make these decisions. Your tools include:
- Judgment (e.g., do the trends make sense?)
- Balance tests (i.e., actual vs. expected tests)
- Parameters/standard errors
- Consistency of patterns over time or across random data sets
- Type III statistical tests (e.g., chi-square tests, F-tests)

Modeling toolbox: judgment

[Figure: modeled frequency relativity by vehicle value]

The modeler should also ask: does this pattern make sense? Patterns may often be counterintuitive, but become reasonable after investigation.

Uses:
- Inclusion/exclusion
- Grouping
- Fitting curves
- Assessing interactions

Modeling toolbox: balance test

[Figure: actual vs. expected frequency by vehicle age]

The balance test is essentially an actual vs. expected comparison. It can identify variables not in the model across which the model is out of balance, indicating that the variable may be explaining something the model does not.

Uses:
- Inclusion

Modeling toolbox: parameters/standard errors

[Figure: modeled frequency relativities with standard errors, by vehicle body]

Parameters and standard errors provide confidence in the pattern exhibited by the data.

Uses:
- Horizontal line test for exclusion
- Plateaus for grouping
- A measure of credibility

Modeling toolbox: consistency of patterns

[Figure: modeled frequency relativity by age category]

Checking for consistency of patterns over time or across random parts of a data set is a good practical test.

Uses - validating modeling decisions:
- Including/excluding factors
- Grouping levels
- Fitting curves
- Adding interactions

Modeling toolbox: type III tests

The chi-square test and/or F-test is a good statistical test for comparing nested models:
- H0: the two models are essentially the same
- H1: the two models are not the same
- Principle of parsimony: if the two models are the same, choose the simpler model

Uses:
- Inclusion/exclusion

Chi-Square Percentage   Meaning     Action
<5%                     Reject H0   Use more complex model
5%-15%                  Grey area   ???
15%-30%                 Grey area   ???
>30%                    Accept H0   Use simpler model
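The nested-model comparison above is a likelihood ratio test: the drop in deviance between the simpler and the more complex model is approximately chi-square distributed with degrees of freedom equal to the difference in parameter counts. A minimal sketch (the chi-square survival function is implemented in closed form only for df = 1 and even df, which covers typical cases; the deviance values in the example are hypothetical):

```python
import math

def chi2_sf(x, df):
    """Chi-square survival function, in closed form for df = 1 and even df."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df % 2 == 0:
        # sf = exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2.0) / k
            total += term
        return math.exp(-x / 2.0) * total
    raise NotImplementedError("odd df > 1 not handled in this sketch")

def lr_test(deviance_simple, deviance_complex, df_diff):
    """Compare nested GLMs: under H0 (the simpler model is adequate),
    the deviance drop is approximately chi-square(df_diff)."""
    stat = deviance_simple - deviance_complex
    return stat, chi2_sf(stat, df_diff)

# Hypothetical deviances: complex model adds 2 parameters
stat, p = lr_test(110.0, 100.0, 2)
```

A small p-value (below roughly 5%) favors the more complex model; a large one (above roughly 30%) favors the simpler model, matching the decision table above.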

Example: frequency model iteration 1 - simplification

Modeling decision: grouping age category and area
Tools used: judgment, parameter estimates/standard errors, type III test

[Figures: age category relativity (chi-square p-value = 97.4%); area relativity (chi-square p-value = 99.9%)]

Example: frequency model iteration 1 - simplification

Modeling decision: fitting a curve to vehicle value
Tools used: judgment, type III test, consistency test

[Figures: vehicle value relativity, initial model vs. fitted curve (chi-square p-value = 100.0%)]

Example: frequency model iteration 2 - complication

Modeling decision: adding vehicle body type
Tools used: balance test, parameter estimates/standard errors, type III test

[Figures: balance test, actual vs. expected across vehicle body type with vehicle body type not in the model; vehicle body type relativities with vehicle body type included in the model (chi-square p-value = 1.3%)]

Example: iterative modeling continued

- Iteration 3 - simplification: group vehicle body type
- Iteration 4 - complication: add vehicle age
- Iteration 5 - simplification: group vehicle age levels

Example: frequency model iteration 6 - complication

Action: adding an age x gender interaction
Tools used: balance test, type III test, consistency test, judgment

[Figures: balance test, two-way actual vs. expected across age x gender with the interaction not in the model; modeled relativities by age for M and F with the interaction included (chi-square p-value = 47.5%)]

Predictive models must be validated to have confidence in the predictive power of the models

Model validation techniques include:
- Examining residuals
- Examining gains curves
- Examining hold-out samples:
  - Changes in parameter estimates
  - Actual vs. expected on the hold-out sample

Both the component models and the combined risk premium model should be validated.

Model validation: residual analysis

Recheck residuals to ensure an appropriate shape.

[Figure: studentized standardized deviance residuals by policyholder age, from under 17 to 81+]

- Crunched residuals are symmetric.
- For severity: does the box-whisker plot show symmetry across levels?

Model validation: residual analysis (cont'd)

Common issues with residual plots:

[Figure: three residual plots]
- Studentized standardized deviance residuals vs. fitted value: two concentrations suggest two perils; split the data or use joint modeling.
- Normal error structure/log link (studentized standardized deviance residuals vs. transformed fitted value): an asymmetrical appearance suggests the power of the variance function is too low.
- Crunched residuals (group size: 72) vs. fitted value: an elliptical pattern is ideal.

Use crunched residuals for frequency.

Model validation: gains curves

Gains curves are good for comparing the predictiveness of models:
- Order observations from largest to smallest predicted value on the X axis.
- Plot cumulative actual claim counts (or losses) on the Y axis.
- As you move from left to right, the better model should accumulate actual losses faster.
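The gains-curve construction above can be sketched as follows; the example inputs are hypothetical predictions and claim counts.

```python
def gains_curve(predicted, actual):
    """Cumulative share of actual claims, with observations ordered by
    descending predicted value; a better model accumulates actuals faster."""
    order = sorted(range(len(predicted)), key=lambda i: -predicted[i])
    total = sum(actual)
    cum, curve = 0.0, []
    for i in order:
        cum += actual[i]
        curve.append(cum / total)
    return curve

# Hypothetical: the highest-predicted policy carries 2 of the 3 actual claims
curve = gains_curve([0.9, 0.1, 0.5], [2.0, 0.0, 1.0])
```

Comparing two models, the one whose curve sits higher (reaches a given share of actual losses with fewer policies) is the more predictive.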

Model validation: hold-out samples

Holdout samples are effective at validating models:
- Determine estimates based on part of the data set.
- Use those estimates to predict the other part of the data set.

Full test/training for large data sets: split the data, build the model on the training data, and compare predictions to actuals on the test data.

Partial test/training for smaller data sets: build the model on all the data, split the data, refit the parameters on the training data, and compare predictions to actuals on the test data.

Predictions should be close to actuals for heavily populated cells.
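A minimal train/test split for the holdout approach can be sketched like this (the 30% test fraction and fixed seed are illustrative choices):

```python
import random

def split_data(rows, test_frac=0.3, seed=42):
    """Randomly partition rows into (train, test) holdout sets."""
    rng = random.Random(seed)      # fixed seed makes the split reproducible
    shuffled = rows[:]             # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_frac)
    return shuffled[n_test:], shuffled[:n_test]
```

The model (or, for smaller data sets, just its parameters) is then fit on the train rows, and predictions are compared to actuals on the test rows.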

Model validation: lift charts on hold-out data

Actual vs. expected on holdout data is an intuitive validation technique:
- Good for communicating model performance to non-technical audiences.
- Can also create actual vs. expected charts across predictor dimensions.

Component frequency and severity models can be combined to create pure premium models

Component models can be constructed in many different ways. The standard model:

Frequency (Poisson/negative binomial) x Severity (Gamma) = Pure Premium

Building a model on modeled pure premium

When using modeled pure premiums, select the gamma error structure with a log link (not the Tweedie).

[Figures: density of modeled pure premium vs. density of raw pure premium]
- Modeled pure premiums will not have a point mass at zero.
- Raw pure premiums are bimodal (i.e., have a point mass at zero) and require a distribution such as the Tweedie.
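The point mass at zero in raw pure premiums is easy to see by simulation. The sketch below draws hypothetical compound Poisson-gamma pure premiums (the frequency and severity parameters are purely illustrative) and measures the share of exact zeros:

```python
import random

def simulate_pure_premium(n, freq=0.1, sev_shape=2.0, sev_scale=500.0, seed=1):
    """Compound Poisson-gamma pure premiums: most policies have no claims,
    so the raw distribution has a point mass at zero."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        # Poisson(freq) draw: count unit-rate exponential arrivals before freq
        n_claims, t = 0, rng.expovariate(1.0)
        while t < freq:
            n_claims += 1
            t += rng.expovariate(1.0)
        # total loss = sum of gamma severities (0.0 if no claims)
        out.append(sum(rng.gammavariate(sev_shape, sev_scale)
                       for _ in range(n_claims)))
    return out

pp = simulate_pure_premium(10_000)
share_zero = sum(1 for v in pp if v == 0.0) / len(pp)  # close to exp(-0.1) ≈ 0.905
```

A gamma distribution assigns zero density to exact zeros, which is why raw pure premiums call for a Tweedie, while smooth modeled pure premiums (frequency times severity) do not.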

Various constraints often need to be applied to the modeled pure premiums

Goal: convert modeled pure premiums into indications after consideration of internal and external constraints.

It is not always possible or desirable to charge the fully indicated rates in the short run:
- Marketing decisions
- Regulatory constraints
- Systems constraints

The indications therefore need to be adjusted for known constraints.

Constraints to give desired subsidies

- Offsetting one predictor changes the parameters of other correlated predictors to make up for the restrictions.
- The stronger the exposure correlation, the more that can be made up through the other variables.
- Consequently, the modeler should not refit models when a desired subsidy is incorporated into the rating plan.

Example: senior management wants a subsidy to attract drivers 65+ (insurer-desired subsidy), or a regulatory constraint requires a subsidy of drivers 65+ (regulatory subsidy).
Result of refitting with the constraint: correlated factors will adjust to partially make up for the difference; for example, territories with retirement communities will increase.
Potential action: for an insurer-desired subsidy, do not refit models with the constraint; for a regulatory subsidy, consider the implications of refitting and make a business decision.