Individual Covariates

WILD 502 Lab 2: Ŝ from Known-fate Data with Individual Covariates

Today's lab presents material that will allow you to handle additional complexity in the analysis of survival data. The lab deals with estimation using models that include individual covariates. You'll gain experience working with the Design Matrix and with model output where covariates are used.

Individual Covariates

Below, you can see the header information and first 10 rows of data in lab2fawn.inp, a file with data from 1 time interval that treats all animals as belonging to 1 group but that also includes the individual covariates listed. Because we have individual covariates, we need a row of data for each individual rather than the collapsed format that we used last week.

The fate of each individual is entered using a live-dead format: the 1st number indicates whether the animal was alive and studied in a given time interval (1 = yes, it was alive and studied), and the 2nd number indicates whether the animal died in that interval (1 = yes, it died). Thus, 10 indicates that the animal was studied and survived, and 11 indicates that it was studied and died. Here, there is only 1 time interval, so it wouldn't make sense to have any animals entered as 00, i.e., not studied in the 1st interval. But if you had multiple intervals (say 3 years), you could enter new animals as the study goes along, e.g., 001011 (not studied the 1st year, studied and survived the 2nd year, and died in the 3rd year) or 000010 (not studied the 1st or 2nd years; studied and survived the 3rd year).

    /* Fawn survival for 1 interval. Comments provide (1) radio frequency and
       (2) days lived. Live-dead encounter history for 1 occasion comes next.
       Number of animals with this encounter history comes after encounter history.
       Individual Covariates are:
       1) Area: 0=Control 1=Treatment
       2) Sex: 0=Female 1=Male
       3) Mass (kg)
       4) Length (cm) */
    /* 191.62 10 */ 11 1 1 0 34 126 ;
    /* 192.6 11 */ 11 1 1 0 33.4 128.5 ;
    /* 191.64 12 */ 11 1 1 0 29.9 130 ;
    /* 192.79 16 */ 11 1 1 0 31 113 ;
    /* 192.22 16 */ 11 1 1 0 26.7 116 ;
    /* 192.53 13 */ 11 1 1 0 23.9 124 ;
    /* 192.66 28 */ 11 1 1 1 33.5 126 ;
    /* 192.38 27 */ 11 1 1 1 37 129.5 ;
    /* 192.68 26 */ 11 1 1 0 26.5 113 ;
    /* 192.3 31 */ 11 1 1 1 41 117.5 ;
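The live-dead coding described above can be checked with a few lines of R. This is only a sketch: decode_ld() is a hypothetical helper written for this illustration (it is not part of MARK or of any package), and it assumes a history is a string made of two-digit pairs, one pair per interval.

```r
# decode_ld() is a hypothetical helper for illustration only; it is not a MARK function.
# It splits a live-dead (LD) history into two-digit pairs, one pair per interval.
decode_ld <- function(history) {
  stopifnot(nchar(history) %% 2 == 0)          # LD histories come in pairs of digits
  n_intervals <- nchar(history) / 2
  starts <- seq(1, by = 2, length.out = n_intervals)
  pairs <- substring(history, starts, starts + 1)
  # Meaning of each pair, as described in the text above
  status <- c("00" = "not studied",
              "10" = "studied, survived",
              "11" = "studied, died")
  data.frame(interval = seq_len(n_intervals),
             code = pairs,
             status = unname(status[pairs]),
             stringsAsFactors = FALSE)
}

decode_ld("001011")   # the 3-year example from the text
decode_ld("000010")   # entered the study in year 3 and survived
```

Any pair other than 00, 10, or 11 (for example 01) comes back with a missing status, which is a quick way to spot a data-entry error before building the input file.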

Lab 2 Fawns & Individual Covariates

1) Open MARK, choose File, New, and use lab2fawn.inp. When you examine the input file, you should notice the following:
   a. There is 1 encounter occasion and 1 group.
   b. There are 4 individual covariates. So, for models with no interactions, there are 2^4 possible models, i.e., 2 x 2 x 2 x 2 = 16 models. In a real study, you would want to develop a set of a priori models. Here, you will run a subset of those 16 models plus several others that are of interest (see below) to gain experience with building models with individual covariates.

2) When specifying the features of this new analysis, be sure to
   a. select Known Fate as the data type,
   b. provide a meaningful title, perhaps "Lab 2 Fawns with covariates",
   c. choose the correct number of occasions, groups, and individual covariates,
   d. ENTER INDIVIDUAL COVARIATE NAMES (use the button for entering individual covariate names). This doesn't have to be done, but it's easier to keep track of things later if you do. Go ahead and use Area, Sex, Mass, & Length.
   e. Next, check out the Parameter Index Matrix (PIM) for this dataset. Because we entered data for a single time interval and a single group for a known-fate analysis, there's only a PIM for survival rate, and it has a single cell, which means you'll have only 1 row in the design matrix.

3) To run each model of interest, you'll click on Design, request as many columns as needed, and then fill in values in each column to create the model of interest.
   a. Run an S(Length) model, which should have a Design Matrix with 2 columns: a 1 in the first column (intercept) and the word Length in the 2nd column (if you named it length or used some abbreviation, type that version of the word).

      i. Accept the defaults for the numerical estimation run.
      ii. After you've run the model, go to the Results Browser and examine the beta estimates for this model. Is the point estimate for the effect of length positive, zero, or negative? What evidence supports your claim? Consider what the CI indicates about the effect of Length: all positive, positive and negative, or all negative.
      iii. Use MARK's Individual Covariate Plot tool to build a plot of Survival Rate vs. Length. Open the tool, click on 1:S as the parameter to plot, choose Length as the covariate to use in the plot, accept the default range of values (min and max in the dataset), output the data to Excel, and click on the OK button. Once you've done this, examine the plot and view the Excel output. Next, enter the following R code into a script in R, copy the data from Excel to the clipboard, and execute the R commands.

          # R is case-sensitive: the column names given here must match the
          # names used in the mutate() and ggplot() commands that follow
          s_length <- read.table("clipboard", header = FALSE,
                                 col.names = c("Length", "S", "se_s", "lcl_s", "ucl_s"))

      iv. Use R and the dplyr package to calculate the log-odds of survival and the associated 95% LCL and UCL values using these R commands.

          library(dplyr)
          # qlogis() is the logit function, log(p / (1 - p))
          s_length <- s_length %>%
            mutate(log_odds_s = qlogis(S),
                   lcl_log_odds_s = qlogis(lcl_s),
                   ucl_log_odds_s = qlogis(ucl_s))

      v. Finally, use the ggplot2 package and the following R commands to plot (a) the log-odds of survival vs. length and (b) survival vs. length.

          library(ggplot2)
          # (a) log-odds of survival vs. length
          ggplot(s_length, aes(x = Length, y = log_odds_s)) +
            geom_ribbon(aes(ymin = lcl_log_odds_s, ymax = ucl_log_odds_s), fill = "grey80") +
            geom_line()
          # (b) survival vs. length
          ggplot(s_length, aes(x = Length, y = S)) +
            geom_ribbon(aes(ymin = lcl_s, ymax = ucl_s), fill = "grey80") +
            geom_line() +
            ylim(0, 1)

   b. Run an S(Area) model.
      i. Go to the Results Browser and examine the beta estimates for this model. Is the point estimate for the effect of Area positive, zero, or negative?
      ii. What does the CI for the effect of Area indicate?
      iii. What are the estimates of survival for each area (include SEs)? One way to get MARK to provide these values along with proper SEs is to (a) select the S(Area) model, i.e., highlight it, and (b) use the Run menu's option to Regenerate Real and Derived Estimates. The first time you do this, set Area = 0 and record the real estimates that are produced. Repeat with Area = 1 and record the estimate again. (Note: values for other covariates don't matter because they don't appear in the S(Area) model.) If you use this method, you'll have 2 copies of the model in your Results Browser, which messes up your model-selection results. So, you can run them both, record the estimates, and then delete all but 1 of the copies of the S(Area) model. Another way of proceeding is to use the Individual Covariate Plot tool in MARK to output estimates to Excel for values of Area from 0 to 1; you'll then have to delete all but the 1st and last rows of output in Excel to get what you want.

   c. Run the S(Sex+Length) model.
      i. Next, go to the Results Browser and examine the beta estimates for this model. Are the point estimates of the 2 effects positive, zero, or negative? Be sure you can determine which beta estimate goes with each covariate. If you're stumped, look at the Design Matrix.
      ii. What do the CIs for the effects indicate? All positive, positive and negative, or all negative?

      iii. Use MARK's Individual Covariate Plot tool and look at plots for (a) females (Sex = 0) and (b) males (Sex = 1).
      iv. You can then go to each of the resulting Excel spreadsheets and add a column for sex, where you enter the appropriate value for each dataset. Next, you can combine the 2 spreadsheets into 1, copy the combined data to the clipboard, and use the R commands below to bring in the data and make useful plots.

          library(dplyr)
          library(ggplot2)
          # R is case-sensitive: the column names given here must match the
          # names used in the mutate() and ggplot() commands below
          s_sex.length <- read.table("clipboard", header = FALSE,
                                     col.names = c("Length", "S", "se_s", "lcl_s", "ucl_s", "Sex"))
          s_sex.length <- s_sex.length %>%
            mutate(Sex = factor(Sex, levels = c(0, 1), labels = c("female", "male")),
                   log_odds_s = qlogis(S),
                   lcl_log_odds_s = qlogis(lcl_s),
                   ucl_log_odds_s = qlogis(ucl_s))
          # use of 'alpha' on ribbons makes them semi-transparent,
          # so overlapping ribbons for the 2 sexes remain visible
          ggplot(s_sex.length, aes(x = Length, y = log_odds_s, group = Sex)) +
            geom_ribbon(aes(ymin = lcl_log_odds_s, ymax = ucl_log_odds_s), alpha = 0.2) +
            geom_line(aes(color = Sex))
          ggplot(s_sex.length, aes(x = Length, y = S, group = Sex)) +
            geom_ribbon(aes(ymin = lcl_s, ymax = ucl_s), alpha = 0.2) +
            geom_line(aes(color = Sex)) +
            ylim(0, 1)

   d. Run all of the models in the table below.

      Number   Model
      1        {S(.)}
      2        {S(Area)}
      3        {S(Mass)}
      4        {S(Length)}
      5        {S(Length + Length-squared)}
      6        {S(Sex)}
      7        {S(Area + Length)}
      8        {S(Length + Mass)}
      9        {S(Sex + Length)}
      10       {S(Length + Length x Sex)}
      11       {S(Sex + Length + Length x Sex)}

      Think carefully about what the differences are among models 9, 10, & 11, especially in terms of how many intercepts and slopes are present in each.

      You'll probably need a little help running models like the S(Length + Length-squared) model, which require you either to (i) have products of covariates (e.g., Length^2 or Length x Sex) in the list of covariates (we don't have that) or to (ii) learn about MARK's Design Matrix Functions. From the MARK help file: "Twelve special functions are allowed as entries in the design matrix: add, product, power, min, max, log, exp, eq (equal to), gt (greater than), ge (greater than or equal to), lt (less than), and le (less than or equal to)." We need to use the product function here. For the S(Length + Length-squared) model, your design matrix needs 3 columns: the intercept, Length, and the product of Length with itself. For the S(Sex + Length + Length x Sex) model, the design matrix needs 4 columns: the intercept, Sex, Length, and the product of Length and Sex.

   e. You can obtain useful graphics from any of the models using the strategies provided above for getting estimates to Excel and then over to R for plotting, using R code that you modify as needed for other models (e.g., change the name of the data frame to indicate which model you're working with; choose a data frame with or without the sex variable in it, depending on whether the model includes that covariate). Be sure you know how the beta estimates and the values of Length and Sex are used to generate the predicted values for all models, and especially for models 9, 10, and 11.
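The way beta estimates and covariate values combine into predicted survival can be sketched in R. The beta values below are made up purely for illustration (they are not estimates from the fawn data), the function names are hypothetical, and the ordering of the betas for model 11 is an assumption that should be checked against your own design matrix. Each model's log-odds is a sum of betas times design-matrix entries, and plogis() back-transforms the log-odds to a survival probability.

```r
# Linear predictors (log-odds of survival) for models 9, 10, and 11.
# Function names and beta orderings are assumptions for this illustration.
logit_m9  <- function(sex, len, b) b[1] + b[2] * sex + b[3] * len
logit_m10 <- function(sex, len, b) b[1] + b[2] * len + b[3] * len * sex
logit_m11 <- function(sex, len, b) b[1] + b[2] * sex + b[3] * len + b[4] * len * sex

# HYPOTHETICAL betas: replace with the estimates MARK reports for each model
b9  <- c(-3.0, 0.4, 0.03)
b10 <- c(-3.0, 0.03, 0.004)
b11 <- c(-3.0, 0.4, 0.03, -0.002)

# Every combination of a few lengths (cm) and both sexes (0 = female, 1 = male)
grid <- expand.grid(Length = seq(110, 130, by = 5), Sex = c(0, 1))

# plogis() is the inverse logit: exp(x) / (1 + exp(x))
grid$S_m9  <- plogis(logit_m9(grid$Sex, grid$Length, b9))
grid$S_m10 <- plogis(logit_m10(grid$Sex, grid$Length, b10))
grid$S_m11 <- plogis(logit_m11(grid$Sex, grid$Length, b11))

print(grid)
```

Comparing the female and male rows within each model column is one way to see where the models let the sexes differ: setting Sex to 0 drops every term that is multiplied by Sex.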

Formal assignment:

Note: feel free to make any of the plots requested in MARK (if that's possible), Excel, or R, depending on what works best for you.

1) A description of the data available for the analysis.
2) A table that contains 1 column with model names and another column that has the actual equation for the log-odds of survival.

      Model Name                Equation for ln(S/(1-S))
      S(.)                      B1
      S(Area)                   B1 + B2*Area
      S(Sex + Length)           B1 + B2*Sex + B3*Length
      S(Length + Length*Sex)    B1 + B2*Length + B3*Length*Sex

3) A table of model-selection results. Note: if you ran any of the models more than once to obtain estimates of survival rate for different covariate conditions, you'll need to delete the repeats in order to obtain a valid table of AIC results.
4) Statements about the best model and model-selection uncertainty.
5) Features of the best model in terms of beta estimates (with SEs).
6) Graphics of survival (along with 95% CIs) as a function of the covariates contained in the best model. (Hint: you can provide a separate graphic for each sex obtained directly from Program MARK if you are having trouble using the R commands.)
7) A graphic of the log-odds of survival (along with 95% CIs) as a function of length from the S(Length) model (you can make this in R or Excel, but not MARK).
8) A graphic of survival as a function of length (along with 95% CIs) and an explanation of why this plot is curvilinear whereas the plot of ln(S/(1-S)) versus length was linear.
9) For the S(Area) model, what do the estimates of the betas and real parameters (along with their uncertainty) indicate about possible differences in survival between the 2 areas?
10) For the S(Sex+Length) model, what can be inferred about the effects of sex and length on survival rate? (Hint: for the sex effect, it's important to know how sex was coded in the original data, and this is explained at the head of the input file.)
11) For the S(Length+Length-squared) model, please provide a plot of estimated survival rate (along with 95% confidence intervals) versus length. Also, please provide the accompanying equation for the line using the actual estimated beta values you obtained from MARK.
12) Your interpretation of what can be learned about survival of mule deer fawns given this dataset and this set of models. Use quantitative evidence to support your statements, and feel free to refer to tables, graphs, and results you provided in earlier answers.
13) Any comments and questions on the analysis that you would like to have discussed at the start of the next lab.
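For items 3 and 4, it can help to see how model-selection weights follow from the information-criterion values in the Results Browser. The sketch below uses made-up AICc values (they are not results from the fawn data) and assumes you have copied the AICc column out of MARK by hand; it applies the standard Akaike-weight formula, in which each model's weight is exp(-delta/2) divided by the sum of that quantity over all models.

```r
# HYPOTHETICAL AICc values for illustration only; substitute the values
# reported in your own Results Browser
aicc <- c("S(.)" = 160.2, "S(Length)" = 157.9, "S(Sex + Length)" = 159.1)

# Difference between each model and the best (lowest AICc) model
delta <- aicc - min(aicc)

# Akaike weights: relative support for each model, summing to 1
weight <- exp(-0.5 * delta) / sum(exp(-0.5 * delta))

tab <- data.frame(model = names(aicc),
                  AICc = unname(aicc),
                  delta = unname(delta),
                  weight = round(unname(weight), 3))

# Sort so that the best-supported model is listed first
tab[order(tab$delta), ]
```

If a model appears twice in the Results Browser (for example, after regenerating real estimates for Area = 0 and Area = 1), it would be counted twice in the sum, which is why the repeats need to be deleted before the table is reported.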