Package dyncorr


December 10, 2017

Title Dynamic Correlation Package
Version 1.1.0
Date 2017-12-10
Description Computes dynamical correlation estimates and percentile bootstrap confidence intervals for pairs of longitudinal responses, including consideration of lags and derivatives.
Depends lpridge, R (>= 2.10), stats
License GPL-2
Encoding UTF-8
LazyData true
RoxygenNote 6.0.1
Suggests knitr, rmarkdown
NeedsCompilation no
Author Joel Dubin [aut, cre], Mike Li [aut], Dandi Qiao [aut], Hans-Georg Müller [aut]
Maintainer Joel Dubin <jdubin@uwaterloo.ca>
Repository CRAN
Date/Publication 2017-12-10 21:54:38 UTC

R topics documented:
    bootstrapci
    dynamiccorrelation
    dyncorrdata
    Index

bootstrapci    Bootstrap Confidence Interval

Description

Computes percentile bootstrap (BS) confidence intervals for dynamical correlation for pairs of longitudinal responses, including consideration of lags and derivatives, following a local polynomial regression smoothing step.

Usage

bootstrapci(dataframe, depvar, indepvar, subjectvar, function.choice,
            width.range, width.place, min.obs, points.length, points.by,
            boundary.trunc, byorder, max.dyncorrlag, B, percentile,
            by.deriv.only, seed)

Arguments

dataframe
    The data frame that contains the dependent variables/responses, the independent variable (often time), and the subject/individual identification; there should be one row entry for each combination of subject/individual and indepvar (often time).

depvar
    Dependent variables/responses; at least two are necessary for purposes of calculating at least one dynamical correlation estimate of interest; there should be a unique column within depvar for each response.

indepvar
    Independent variable, typically the discrete recorded time points at which the dependent variables were collected; note that this is the independent variable for purposes of curve creation leading into estimating the dynamical correlations between pairs of dependent variables; must be contained in a single column.

subjectvar
    Column name of the individuals; there should be one row entry for each combination of subject/individual and indepvar.

function.choice
    A vector of length 3 to indicate which derivatives are desired for local polynomial curve smoothing; the 1st entry is for the 0th derivative (i.e., the original function), the 2nd entry is for the 1st derivative, and the 3rd is for the 2nd derivative; 1=yes, 0=no. For example, c(1,0,1) would be specified if interest is in looking at derivative 0 and derivative 2, c(1,0,0) for looking at the original function (0th derivative) only, etc.

width.range
    Bandwidth for local polynomial regression curve smoothing for each dependent variable/response; it can be a list that specifies a distinct two-element bandwidth vector for each response, or a single two-element vector that is used for each of the responses. The program is currently set up to allow linearly increasing or decreasing bandwidths, specified by this two-element vector, with the increase (or decrease) occurring from the first argument in width.place to its second argument. The lpepa function within the lpridge package is called, which uses Epanechnikov kernel weighting, and the bandwidth specifications in width.range are used there. The default bandwidth is the range of indepvar (usually time) divided by 4, i.e., a constant global bandwidth.

width.place
    Endpoints for the width change, assuming a non-constant bandwidth is requested. The program is currently set up to allow linearly increasing or decreasing bandwidths, specified in width.range, with the increase (or decrease) occurring from the first argument in width.place to its second argument. It can be a list that specifies different endpoints for each response, or a single vector of endpoints that is used for each of the responses. The default is no endpoints specified, which means a constant global bandwidth throughout the range of indepvar.

min.obs
    Minimum observation (follow-up period) required. If specified, individuals whose follow-up period is shorter than min.obs will be removed from the calculation. Default = NA (use the whole dataset provided).

points.length
    Number of indep (time) points for dynamic correlation calculation for each response of each individual. This is the number of points between time 0 and the minimum of the maximum individual follow-up periods (max_common_obs). Note that each individual's full follow-up time span is used in the local polynomial regression curve smoothing step, but only the first points.length time points (from time 0 to max_common_obs) are used in the subsequent dynamic correlation calculation. The default points.length value is 100; points.length takes precedence unless points.by is specified.

points.by
    Interval between indep (time) points for local polynomial regression curve smoothing for each response of each individual. Both integer and non-integer values can be specified, and the time grid will be computed accordingly. Note that points.length takes precedence (default = 100) unless points.by is specified.

boundary.trunc
    Indicates the boundary of indepvar that should be truncated after smoothing; this may be done in case of concerns about estimating dynamical correlation at the boundaries of indepvar. This is a two-element vector, where the first argument is how much to truncate from the right of the minimum value of indepvar and the second argument is how much to truncate from the left of the maximum value of indepvar, within an individual and specific response; default is no truncation, i.e., c(0,0).

byorder
    A vector that specifies the order of the variables/responses and derivatives (if any) to be the leading variable in the calculations; this will have an effect on how the lag is to be interpreted; default is to use the order as specified in the depvar argument.

max.dyncorrlag
    A specified lag value at which you would like to obtain a percentile BS dynamical correlation interval estimate; only single lag values are allowed (due to computational considerations), and the lag value considered might be that which is output from a lag analysis using the dynamiccorrelation function (see Example 2 on the dynamiccorrelation help page).

B
    The number of samples used for the bootstrap.

percentile
    The percentiles used to construct confidence intervals for dynamic correlations, e.g., c(0.025, 0.975).

by.deriv.only
    If TRUE, the inter-dynamical correlations between different derivatives are not computed, which can save computation time (e.g., when function.choice=c(1,0,1) is specified and by.deriv.only=TRUE, dynamical correlations will be calculated only within (and not across) the 0th and 2nd derivative estimates, respectively); default is TRUE.

seed
    The seed used in generating the BS samples.

Details

This function will provide smooth estimates (curves/functions or their derivatives) of responses of interest, then generate dynamical correlation estimates for each pair of responses. A lag of interest can be specified using the max.dyncorrlag argument. For smoothing, the function uses local polynomial smoothing with Epanechnikov kernel weighting (by calling the function lpepa within the lpridge package). The default global bandwidth is generated by taking the range of indepvar (usually time) and dividing by 4. This, by default, will be a constant global bandwidth, but proper specification of the width.range and width.place arguments can allow for a more flexible bandwidth choice, including a different specification for each response in depvar.

Whereas the dynamiccorrelation program, which produces only point estimates, is fast, the bootstrapci program is slow in its current form, as quite a bit of processing is required for each bootstrap sample. Details of the two-stage bootstrap algorithm can be found in Dubin and Müller (2005). We will attempt to boost computing speed in future versions.

In addition, as pointed out in Dubin and Müller (2005), a downward shift of the two-stage bootstrap CI toward 0 is intentional, in order to account for error in the curve creation step. However, it should be noted that greater than expected downward shifts may occur for dynamical correlations that are high in magnitude. A future version of this function will attempt to make this bootstrap approach more robust in this situation.

Author(s)

Joel A. Dubin, Mike Li, Dandi Qiao, Hans-Georg Müller

See Also

lpepa

Examples

## Example 1: using default smoothing parameters, obtain bootstrap CI
##            estimates for all three pairs of responses, for original
##            function only. Note that B=200 or greater should be
##            considered for real data analysis.
examp1.bs <- bootstrapci(dataframe = dyncorrdata,
                         function.choice = c(1,0,0),
                         B = 2,
                         percentile = c(0.025, 0.975),
                         seed = 5)
examp1.bs

## Example 2: using default smoothing parameters, obtain bootstrap CI
##            estimates for all three pairs of responses, for original
##            function only, at -10 days lag, at .01 and .99 percentiles.
##            Note that B=200 or greater should be considered for real
##            data analysis.
examp2.bs <- bootstrapci(dataframe = dyncorrdata,
                         function.choice = c(1,0,0),
                         max.dyncorrlag = -10,
                         B = 2,
                         percentile = c(0.01, 0.99),
                         seed = 7)
examp2.bs


dynamiccorrelation    Dynamic Correlation

Description

Computes dynamical correlation estimates for pairs of longitudinal responses, including consideration of lags and derivatives, following a local polynomial regression smoothing step.

Usage

dynamiccorrelation(dataframe, depvar, indepvar, subjectvar,
                   function.choice, width.range, width.place, min.obs,
                   points.length, points.by, boundary.trunc, lag.input,
                   byorder, by.deriv.only, full.lag.output)

Arguments

dataframe
    The data frame that contains the dependent variables/responses, the independent variable (often time), and the subject/individual identification; there should be one row entry for each combination of subject/individual and indepvar (often time).

depvar
    Dependent variables/responses; at least two are necessary for purposes of calculating at least one dynamical correlation estimate of interest; there should be a unique column within depvar for each response.

indepvar
    Independent variable, typically the discrete recorded time points at which the dependent variables were collected; note that this is the independent variable for purposes of curve creation leading into estimating the dynamical correlations between pairs of dependent variables; must be contained in a single column.

subjectvar
    Column name of the individuals; there should be one row entry for each combination of subject/individual and indepvar.

function.choice
    A vector of length 3 to indicate which derivatives are desired for local polynomial curve smoothing; the 1st entry is for the 0th derivative (i.e., the original function), the 2nd entry
    is for the 1st derivative, and the 3rd is for the 2nd derivative; 1=yes, 0=no. For example, c(1,0,1) would be specified if interest is in looking at derivative 0 and derivative 2, c(1,0,0) for looking at the original function (0th derivative) only, etc.

width.range
    Bandwidth for local polynomial regression curve smoothing for each dependent variable/response; it can be a list that specifies a distinct two-element bandwidth vector for each response, or a single two-element vector that is used for each of the responses. The program is currently set up to allow linearly increasing or decreasing bandwidths, specified by this two-element vector, with the increase (or decrease) occurring from the first argument in width.place to its second argument. The lpepa function within the lpridge package is called, which uses Epanechnikov kernel weighting, and the bandwidth specifications in width.range are used there. The default bandwidth is the range of indepvar (usually time) divided by 4, i.e., a constant global bandwidth.

width.place
    Endpoints for the width change, assuming a non-constant bandwidth is requested. The program is currently set up to allow linearly increasing or decreasing bandwidths, specified in width.range, with the increase (or decrease) occurring from the first argument in width.place to its second argument. It can be a list that specifies different endpoints for each response, or a single vector of endpoints that is used for each of the responses. The default is no endpoints specified, which means a constant global bandwidth throughout the range of indepvar.

min.obs
    Minimum observation (follow-up period) required. If specified, individuals whose follow-up period is shorter than min.obs will be removed from the calculation. Default = NA (use the whole dataset provided).

points.length
    Number of indep (time) points for dynamic correlation calculation for each response of each individual. This is the number of points between time 0 and the minimum of the maximum individual follow-up periods (max_common_obs). Note that each individual's full follow-up time span is used in the local polynomial regression curve smoothing step, but only the first points.length time points (from time 0 to max_common_obs) are used in the subsequent dynamic correlation calculation. The default points.length value is 100; points.length takes precedence unless points.by is specified.

points.by
    Interval between indep (time) points for local polynomial regression curve smoothing for each response of each individual. Both integer and non-integer values can be specified, and the time grid will be computed accordingly. Note that points.length takes precedence (default = 100) unless points.by is specified.

boundary.trunc
    Indicates the boundary of indepvar that should be truncated after smoothing; this may be done in case of concerns about estimating dynamical correlation at the boundaries of indepvar. This is a two-element vector, where the first argument is how much to truncate from the right of the minimum value of indepvar and the second argument is how much to truncate from the left of the maximum value of indepvar, within an individual and specific response; default is no truncation, i.e., c(0,0).

lag.input
    Values of lag to be considered; can be a vector of requested lags, for which a dynamical correlation estimate will be produced for each pair of responses at each lag. A positive value of lag.input means that the first entry for the dynamical correlation leads (occurs before) the second entry; conversely, a negative value
    means that the second entry for the correlation leads the first entry; default is no lag at all considered.

byorder
    A vector that specifies the order of the variables/responses and derivatives (if any) to be the leading variable in the calculations; this will have an effect on how lag.input is to be interpreted; default is to use the order as specified in the depvar argument.

by.deriv.only
    If TRUE, the inter-dynamical correlations between different derivatives are not computed, which can save computation time (e.g., when function.choice=c(1,0,1) is specified and by.deriv.only=TRUE, dynamical correlations will be calculated only within (and not across) the 0th and 2nd derivative estimates, respectively); default is TRUE.

full.lag.output
    If TRUE, the dynamical correlation values for each pair of responses and requested derivative will be stored in vectors corresponding to different lag values, which enables plotting the correlations as a function of lag values; all the vectors will be stored in the returned attribute resultmatrix; default is FALSE.

Details

This function will provide smooth estimates (curves/functions or their derivatives) of longitudinal responses of interest per individual, then generate dynamical correlation estimates for each pair of responses. Lags of interest can be specified using the lag.input argument. For smoothing, the function uses local polynomial smoothing with Epanechnikov kernel weighting (by calling the function lpepa within the lpridge package). The default global bandwidth is generated by taking the range of indepvar (usually time) and dividing by 4. This, by default, will be a constant global bandwidth, but proper specification of the width.range and width.place arguments can allow for a more flexible bandwidth choice, including a different specification for each response in depvar.
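As a rough illustration of the smoothing step described above, the following base-R sketch computes the default bandwidth (range of the time variable divided by 4) and a local linear fit with Epanechnikov kernel weights. It is not the package's implementation, which calls lpepa from lpridge; the helpers epan and local_linear and the toy data are assumptions made for this example, and the way the bandwidth scales the kernel here may differ from lpepa.

```r
# Epanechnikov kernel: 0.75 * (1 - u^2) for |u| <= 1, zero elsewhere
epan <- function(u) ifelse(abs(u) <= 1, 0.75 * (1 - u^2), 0)

# Local linear fit evaluated at each point of `grid` with bandwidth `h`
# (hypothetical helper for illustration only)
local_linear <- function(x, y, grid, h) {
  sapply(grid, function(x0) {
    w <- epan((x - x0) / h)                   # kernel weights centred at x0
    fit <- lm(y ~ I(x - x0), weights = w)     # weighted least squares line
    unname(coef(fit)[1])                      # intercept = fitted value at x0
  })
}

# Toy data: one subject measured every 10 days, response linear in time
time <- seq(0, 120, by = 10)
resp <- 2 + 0.05 * time

# Default-style bandwidth: range of the independent variable divided by 4
h_default <- diff(range(time)) / 4

grid <- seq(0, 120, length.out = 5)
smoothed <- local_linear(time, resp, grid, h_default)
```

Because the toy response is exactly linear in time, the local linear fit reproduces it at every grid point, which gives a quick sanity check of the sketch.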
Note that, by default, the correlation will only be calculated as far as the minimum of the maximum observation times (max_common_obs) across all individuals; hence caution should be taken if there is large heterogeneity among maximum observation times across individuals.

Details of the methodology for dynamical correlation can be found in Dubin and Müller (2005).

Author(s)

Joel A. Dubin, Mike Li, Dandi Qiao, Hans-Georg Müller

See Also

lpepa

Examples

## Example 1: using default smoothing parameters, obtain dynamical
##            correlation estimates for all three pairs of responses,
##            for both original function and the first derivative.
##            Note that in dyncorrdata, minimum of maximum obs. time
##            = 120, results should be the same if either points.by
##            = 1 or points.length = 120 is specified.
examp1_by <- dynamiccorrelation(dataframe = dyncorrdata,
                                points.by = 1,
                                function.choice = c(1,1,0))
examp1_len <- dynamiccorrelation(dataframe = dyncorrdata,
                                 points.length = 120,
                                 function.choice = c(1,1,0))
examp1_by
examp1_len

## Example 1a: re-run Example 1 using original dataset, but with min.obs
##             set to 150. Range in lengths of follow-up periods between
##             individuals is reduced.
examp1a <- dynamiccorrelation(dataframe = dyncorrdata,
                              min.obs = 150,
                              function.choice = c(1,0,0))
examp1a

## Example 2: using default smoothing parameters, obtain dynamical
##            correlation estimates for all three pairs of responses,
##            looking at range of lags between -20 and +20, for original
##            functions only
examp2 <- dynamiccorrelation(dataframe = dyncorrdata,
                             function.choice = c(1,0,0),
                             lag.input = seq(-20,20, by=1))
examp2
## note: output includes zero lag correlations, as well as maximum
##       correlation (in absolute value) in max.dyncorr and its
##       corresponding lag value in max.dyncorrlag

## Example 3: re-run Example 2, but set up for plotting of specified
##            lagged correlations
examp3 <- dynamiccorrelation(dataframe = dyncorrdata,
                             function.choice = c(1,0,0),
                             lag.input = seq(-20,10, by=1),
                             full.lag.output = TRUE)

# conduct plotting, with one panel for each pair of responses considered;
# the ylim adjustment is made here for the different magnitude of the
# correlations between the two pairs
par(mfrow=c(1,2))
plot(seq(-20,10, by=1),
     examp3$lagresultmatrix[[1]][1,],
     type='b',
     xlab = 'lag order (in days)',
     ylab = 'lagged correlations',
     ylim = c(-0.4, -0.2),
     main = 'dyncorr b/t resp1 and resp2 as a function of lag')
abline(v = examp3$max.dyncorrlag[[1]][1,2], lty = 2)
plot(seq(-20,10, by=1),
     examp3$lagresultmatrix[[1]][2,],
     type='b',
     xlab = 'lag order (in days)',
     ylab = 'lagged correlations',
     ylim = c(0.3, 0.5),
     main = 'dyncorr b/t resp1 and resp3 as a function of lag')
abline(v = examp3$max.dyncorrlag[[1]][1,3], lty = 2)

## Example 4: same as the original function piece of Example 1,
##            except now adjust the constant global bandwidth
##            from the default to 40
examp4 <- dynamiccorrelation(dataframe = dyncorrdata,
                             function.choice = c(1,0,0),
                             width.range = c(40, 40))
examp4


dyncorrdata    dyncorrdata

Description

An example dataset for use in the example calls in the help files for the dynamiccorrelation and bootstrapci functions.

This dataset contains three longitudinal responses for each subject, each measured every ten days over a varying length of follow-up depending upon subject (for each of 34 subjects). Each row contains all three responses for a given subject and a given follow-up time. The features are that the first and third responses have a moderate-sized positive dynamical correlation, while the first and second, and second and third, respectively, have lesser negative dynamical correlations. The dataset is called in the examples on the dynamiccorrelation and bootstrapci help pages.

Usage

data(dyncorrdata)

Format

A 648 by 5 data frame.
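The layout described above (one row per subject and follow-up time, with follow-up length varying by subject) can be checked on any data frame before it is passed to the functions. The sketch below uses a small simulated data frame rather than dyncorrdata itself; the column names subject, time, resp1, resp2 and resp3 are assumptions for illustration and are not taken from the dataset documentation. It also computes the minimum of the per-subject maximum follow-up times, the quantity the help pages call max_common_obs.

```r
# Toy long-format data: one row per subject and follow-up time.
# Column names are hypothetical, chosen only for this sketch.
set.seed(1)
toy <- data.frame(
  subject = rep(1:3, times = c(4, 6, 5)),
  time    = c(seq(0, 30, by = 10), seq(0, 50, by = 10), seq(0, 40, by = 10))
)
toy$resp1 <- rnorm(nrow(toy))
toy$resp2 <- rnorm(nrow(toy))
toy$resp3 <- rnorm(nrow(toy))

# Check that each subject-by-time combination appears in exactly one row
one_row_each <- !any(duplicated(toy[, c("subject", "time")]))

# Longest follow-up per subject, then the minimum of those maxima
max_follow_up  <- tapply(toy$time, toy$subject, max)
max_common_obs <- min(max_follow_up)
```

In this toy example the three subjects are followed for 30, 50 and 40 days, so the common window ends at day 30. A large gap between the shortest and longest follow-up is the situation in which the min.obs argument may be worth considering.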

Index

Topic datasets
    dyncorrdata

bootstrapci
dynamiccorrelation
dyncorrdata
lpepa