An introduction to MCSim: a MetaCommunity Simulation package for ecologists using the R statistical environment

By Eric R. Sokol (sokole@gmail.com)
Last updated: 01 Nov 2013

NOTE: MCSim is available at http://sites.google.com/site/metacommunitysimulation/ under the DOWNLOADS tab. Download MCSim_0.2-1.zip if you are using a Windows machine, or download MCSim_0.2-1.tar.gz for other platforms. To use MCSim, you must have the vegan (Oksanen et al. 2013), vegetarian (Charney and Record 2012), and plyr (Wickham 2011) packages installed; all are available from the Comprehensive R Archive Network (CRAN) at http://www.r-project.org/. The untb (Hankin 2007) and geosphere (Hijmans and Williams 2011) packages are also used in this tutorial. I also suggest using RStudio's user interface for R, which can be found at http://www.rstudio.com/ide/.

Introduction

MCSim is a metacommunity simulation package for the R statistical environment (R Development Core Team 2011). The package includes the function fn.metasim, which can be used to simulate niche-based and dispersal-based metacommunity dynamics, keep track of the species composition at each site in the metacommunity through time, and report diversity statistics at the end of the simulation. My overall objective in creating this package was to provide a tool with which the user can see how commonly used measures of diversity respond to imposed constraints on metacommunity dynamics. For example, this package can be used to compare beta-diversity between simulations with low dispersal and simulations with high dispersal in different contexts (i.e., niche vs. neutral community models).

What is a metacommunity?

In MCSim, a metacommunity is made up of a collection of assemblages that are connected by functions that model the influence of emigration and immigration.

What are metacommunity dynamics?

Metacommunity dynamics are the local and regional processes that determine the community composition at each site (Leibold et al. 2004, Holyoak et al. 2005). The MCSim package simulates metacommunity dynamics using a modified version of Hubbell's (2001) zero-sum lottery model. The balance of recruitment from local versus regional species pools is determined by m, where a higher m indicates that more recruits come from the regional pool. Local interactions between the environmental filter (Ef) and species trait values are modeled after Gravel et al. (2006) and Sokol et al. (2011).

What is the difference between spatially implicit and spatially explicit simulations?

MCSim can be used to model either spatially implicit or spatially explicit scenarios by changing the shape of the dispersal kernel. The dispersal kernel is modeled after Gravel et al. (2006), where an increased slope is associated with increased dispersal limitation. Under a high dispersal limitation scenario (e.g., SWM.slope = 200), dispersal will only occur between neighboring sites. If the slope is 0 (SWM.slope = 0), dispersal occurs equally among all sites in the simulation, which makes the scenario spatially implicit. In spatially explicit scenarios, the arrangement and patchiness of the environmental characteristics can affect diversity outcomes through complicated dynamics, whereas the spatial arrangement of environmental factors has no influence over metacommunity dynamics in a spatially implicit simulation.

How is diversity measured?

MCSim uses the vegetarian package (Charney and Record 2012) to calculate alpha, beta, and gamma diversity using Jost's multiplicative method (Jost 2006, 2007, Jost et al. 2011, Chao et al. 2012). Dividing total metacommunity diversity into alpha, beta, and gamma components is known as diversity partitioning. Alpha diversity is a measure of mean local richness and/or evenness, gamma diversity is a measure of regional richness and/or evenness, and beta diversity, calculated as gamma / alpha, represents the number of distinct communities represented in the metacommunity. Beta-diversity is also a measure of how much community composition tends to vary among sites in a metacommunity. The order q must be defined for these calculations.
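As a rough illustration of the multiplicative partition for richness (order q = 0), the sketch below uses only base R and a made-up site-by-species matrix. It is not the vegetarian or MCSim implementation, and the object names (comm, d.alpha, d.beta, d.gamma) are hypothetical.

```r
# Toy community matrix (illustrative values only): rows are sites, columns are species.
comm <- matrix(c(5, 0, 2, 0,
                 0, 3, 1, 0,
                 1, 1, 0, 4),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("site1", "site2", "site3"),
                               c("spA", "spB", "spC", "spD")))

# Order q = 0 ignores abundances and only counts presences.
local.richness <- rowSums(comm > 0)   # number of species present at each site

d.alpha <- mean(local.richness)       # mean local richness
d.gamma <- sum(colSums(comm) > 0)     # regional richness (species seen anywhere)
d.beta  <- d.gamma / d.alpha          # multiplicative beta: gamma / alpha

print(c(alpha = d.alpha, beta = d.beta, gamma = d.gamma))
```

With three sites, beta can range from 1 (all sites share the same species) to 3 (no species shared between sites), which is why it can be read as a number of distinct communities.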
Diversities calculated using order q = 0 use presence/absence measures of diversity, and thus alpha diversity is a measure of mean local richness (species count) and gamma diversity is a measure of regional richness. Diversities calculated with q = 0 are sensitive to the presence/absence of rare species. Diversities calculated with order q = 1 incorporate abundance data, and thus are also influenced by evenness. Diversities calculated with order q = 2 are more sensitive to changes in the relative abundances of the most common species in the metacommunity.

Beta-diversity can be further analyzed using variation partitioning (Borcard et al. 1992, Peres-Neto et al. 2006), which determines the proportion of beta-diversity that is explained by environmental variation [E] and spatial variation [S]. The current version of MCSim only creates one environmental gradient [E] that can interact with species traits to act as an environmental filter. To model [S], MCSim uses the pcnm function in the vegan package to create spatial variables that represent different scales of spatial heterogeneity, and thus spatial filters ranging from broad scale to fine scale. Variation partitioning calculates the proportion of beta-diversity that is associated with [a] only environmental variation, [b] spatially structured environmental variation, [c] only spatial variation, and [d] unexplained beta-diversity. MCSim uses the capscale function in vegan to create distance-based redundancy analysis (dbrda) models (Legendre and Anderson 1999) and partial dbrda models to use in variation partitioning. The function fn.variation.partition in MCSim is similar to the varpart function in vegan, but varpart uses rda and fn.variation.partition uses dbrda. The difference is that any distance metric can be used with fn.variation.partition, whereas varpart is restricted to Euclidean distances and/or Euclidean distances on transformed data.
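The four fractions follow from three explained proportions: the model with [E] alone, the model with [S] alone, and the model with both. The sketch below is plain arithmetic in base R, not the fn.variation.partition() source. The inputs are the E, S, and ES values reported for the Everglades example in the variation partitioning walkthrough later in this tutorial, and the variable names are illustrative.

```r
# Explained proportions for the three models (values copied from the
# Everglades example output shown later in this tutorial).
r2.E  <- 0.4691101   # community composition ~ [E]
r2.S  <- 0.3909944   # community composition ~ [S]
r2.ES <- 0.5692485   # community composition ~ [E] + [S]

fractions <- c(
  a = r2.ES - r2.S,          # environment only
  b = r2.E + r2.S - r2.ES,   # spatially structured environment (overlap of E and S)
  c = r2.ES - r2.E,          # space only
  d = 1 - r2.ES              # unexplained
)

print(round(fractions, 4))
```

The results agree with the a, b, c, and d columns of that example output to within rounding, and the four fractions sum to 1.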

Getting started with MCSim

Step 1. Create a directory for your project. I am using a Windows machine, and thus I created a working directory called MCSim_DEMO as a subdirectory in my Documents folder. Start your R session and set the working directory using setwd(). Alternatively, if you are using RStudio you can select Project > Create Project > Existing Directory > Browse (navigate to the folder you wish to use as a working directory, in my case C:\Users\Administrator\Documents\MCSim_DEMO). This step is important because the simulation will create output files that you will not want to have to search for later on.

Step 2. Make sure you have downloaded and installed the plyr, vegan, and vegetarian packages for R. If you are using RStudio, you can do this by selecting the Packages tab and then Install Packages. You will need an internet connection and you may need to choose a CRAN mirror (the server from which you download the packages).
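If you prefer the console to the RStudio menus, Steps 1 and 2 can be sketched as follows. This is only one way to do it; the object names required.pkgs and missing.pkgs are hypothetical, and the install is guarded so that nothing is downloaded in a non-interactive session.

```r
# Packages named in this tutorial: three required by MCSim, two used in the examples.
required.pkgs <- c("plyr", "vegan", "vegetarian", "untb", "geosphere")

# Which of them are not yet in the local library?
missing.pkgs <- setdiff(required.pkgs, rownames(installed.packages()))

# Installing needs an internet connection and possibly a CRAN mirror choice,
# so only attempt it when running interactively.
if (interactive() && length(missing.pkgs) > 0) {
  install.packages(missing.pkgs)
}

# Confirm where output files will be written; use setwd() to change it.
message("Working directory: ", getwd())
```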

Step 3. Install MCSim. Currently, MCSim is only available at http://sites.google.com/site/metacommunitysimulation/file-cabinet. If you are using a Windows machine, download and save the MCSim_0.2-1.zip file to your working directory. You can install the package using the install.packages() function.

> install.packages("MCSim_0.2-1.zip")
Warning in install.packages :
  package 'MCSim_0.2-1.zip' is not available (for R version 3.0.1)
Installing package into 'C:/Users/Administrator/Documents/R/win-library/3.0'
(as 'lib' is unspecified)
inferring 'repos = NULL' from the file name
package 'MCSim' successfully unpacked and MD5 sums checked

NOTE: I don't know why it warns that the package isn't available for R version 3.0.1. I built it using this version of R. It seems to work okay.

Diversity partitioning

Step 1. Start an R session and load the package. Package names are case sensitive in R.

> library(MCSim)
Loading required package: plyr
Loading required package: vegan
Loading required package: permute
This is vegan 2.0-7
Loading required package: vegetarian
Attaching package: 'MCSim'
The following object is masked from 'package:plyr':
    count

Step 2. Load the everglades.macroinverts data set. Note that this data set is stored in the MCSim package as a list. The list has three elements ($abund, $env, and $geo) that are each data frames.

> data(everglades.macroinverts)
> names(everglades.macroinverts)
[1] "abund" "env"   "geo"

Step 3. Explore the data in the everglades.macroinverts list. The $abund element in the everglades.macroinverts list represents abundances of macroinvertebrates in sweep samples at the survey sites reported in Sokol et al. (2013).
You can view the first few rows of the table using head():

> head(everglades.macroinverts$abund)
                 AMPHIP ANISOP BELSPP    BERSPP BRAGRA    CAENIS    CALLIB CELEPO    CELSPP
AJE_0m_wet     4.333333      0      0 0.6666667      0 0.3333333 0.3333333      0 0.0000000
AJE_200m_wet   1.666667      0      0 0.0000000      0 0.0000000 0.0000000      0 0.3333333
AJE_400m_wet   0.500000      0      0 0.0000000      0 0.0000000 0.0000000      0 0.0000000
CXTE_200m_wet  1.333333      0      0 0.3333333      0 0.0000000 0.0000000      0 0.0000000
CXTE_400m_wet  1.000000      0      0 0.3333333      0 0.0000000 0.0000000      0 0.0000000
L31WA_200m_wet 0.000000      0      0 0.0000000      0 0.0000000 0.5000000      0 0.0000000

Note that I did not display all of the output above. The column names are species codes for macroinvertebrates and the row names are sites (which are explained in the paper). The numbers in the data frame are the abundances of each of the taxonomic groups in each sample.

Step 4. Diversity partitioning can be used to calculate alpha, beta, and gamma diversity for the macroinvertebrates collected from the sites in the Everglades represented in the $abund element of the data set. Alpha diversity is a measure of the mean local diversity at a site, beta diversity is a measure of the number of distinct communities in the data set, and gamma diversity is a measure of the regional diversity (see Jost et al. 2011). The fn.diversity.partition() function uses the vegetarian package to calculate alpha, beta, and gamma diversities using the multiplicative formula. When using fn.diversity.partition(), you have to choose an order q for the diversity calculations. If q is set to 0, then these diversities are measured as richness (species counts), so alpha is the mean number of species observed at a site, and gamma diversity is a measure of the total number of species represented in the data set. If q = 2, then the diversity metric accounts for evenness, and it is more sensitive to changes in the relative abundance of dominant species than to the presence or absence of rare species. If q = 1, it is equally sensitive to dominant and rare species.

NOTE: fn.diversity.partition() weights the importance of each site by its total abundance (row total) when calculating alpha, beta, and gamma diversity with q > 0.

> fn.diversity.partition(
+   dat.comm = everglades.macroinverts$abund,
+   q.order = 0)
           d.alpha.q0 d.beta.q0 d.gamma.q0
AJE_0m_wet   17.56667  3.074004         54

Notice that gamma diversity is equal to the number of columns in the data set. This is because each taxon is observed at least once in the data set, thus regional richness (q = 0 diversity) is 54 taxonomic groups.

Variation partitioning

Step 1. Extract the geographic information from the everglades.macroinverts data set to a data frame with a shorter name (this makes writing code easier).

> dat.geo <- everglades.macroinverts$geo

Step 2. Load the geosphere package (you may need to install this package first). Use this package to calculate a geographic distance matrix using the distm() function.
The resulting geographic distance matrix (which I named dist.geo in the code below) is a square, symmetric site-by-site matrix where each element is the distance between two sites in meters.

> require(geosphere)
Loading required package: geosphere
Loading required package: sp
> obslist <- row.names(dat.geo) # save site names
> # create the distance matrix from longitude and latitude coordinates
> dist.geo <- as.data.frame(distm(dat.geo[, c("longitude", "latitude")],
+   fun = distVincentyEllipsoid))
> # paste the site names back into the row and column headers of the distance matrix
> row.names(dist.geo) <- obslist
> names(dist.geo) <- obslist

Step 3. Use the pcnm() function from the vegan package to calculate eigenvectors using Principal Coordinates of Neighbor Matrices (PCNM) analysis, and extract the positive eigenvectors. Each eigenvector represents a spatial filter of increasingly fine scale. Thus, the first eigenvector (PCNM1) represents the broadest scale spatial heterogeneity, and each successive eigenvector (PCNM2, PCNM3, ...) represents finer and finer scale spatial heterogeneity. Below, I saved the spatial variables from the pcnm analysis in the dat.pcnm data frame. In dat.pcnm, the columns are spatial variables (PCNM eigenvectors) and the rows are sites. Sites with similar PCNM scores are deemed similar with respect to that spatial filter.

> mod.pcnm <- pcnm(dist.geo, dist.ret = TRUE)
> dat.pcnm <- as.data.frame(mod.pcnm$vectors)
> dat.pcnm <- dat.pcnm[, mod.pcnm$values > 0]

Step 4. Use the fn.variation.partition() function from MCSim to calculate the different components of beta diversity (which are described above). The current version of the fn.variation.partition() function uses model selection based on adjusted R^2 values (see the vegan package) to select a subset of PCNM variables to represent [S] in the variation partitioning analysis. The selected PCNM variables are listed under S.vars in the output. The function will use all environmental variables that are supplied in the [E] component of the model.

> fn.variation.partition(
+   dat.comm = everglades.macroinverts$abund,
+   E = everglades.macroinverts$env,
+   S = dat.pcnm,
+   q.order = 2)
         a         b         c         d         E         S        ES S.vars
1 0.178254 0.2908561 0.1001383 0.4307515 0.4691101 0.3909944 0.5692485  PCNM1

According to the output here, 46.9% of the variation in dominant-biased macroinvertebrate community composition is explained by environmental variation [E], 39.1% is explained by spatial variables [S] (PCNM eigenvectors), and 29.1% is explained by spatially structured environmental variables [b] (which is the intersection of [E] and [S]).

How to run a simulation

Step 1. Make sure you have a working directory and MCSim is loaded (see the getting started section above). My working directory is MCSim_DEMO, located in my Documents folder on my Windows machine. This information is important because the simulation will create a subdirectory (folder) called SIM_OUTPUT in your working directory where it will write R objects and metadata.

Step 2. Setting simulation parameters

Read the help documentation for fn.metasim().
This is the function you will use to call the simulation. The simulation creates a square grid of sites, each with a local assemblage size of JL. The default is for each site to have a JL of 200 individuals, so in the example landscape below, each circle is a site that has a community of 200 individuals (JL = 200), and the total metacommunity size (JM) is 200 multiplied by the number of sites (5,000 individuals for the default 5 by 5 grid).
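To make the relationship between grid size, JL, and JM concrete, here is a minimal base R sketch. It assumes the default grid edge of 5 and default JL of 200 listed in the parameter descriptions, and it lays the sites out with expand.grid(), which is my own choice for illustration and not necessarily how fn.metasim() builds its landscape.

```r
# Default settings described in this tutorial (names mirror the fn.metasim arguments,
# but these are ordinary variables here, not a call to the package).
landscape.edge.size <- 5
ave.jl <- 200

# One row per site on a square grid; the x/y layout is an assumption for illustration.
sites <- expand.grid(x = seq_len(landscape.edge.size),
                     y = seq_len(landscape.edge.size))
sites$JL <- ave.jl            # every site gets the same local assemblage size

n.sites <- nrow(sites)        # number of sites in the landscape
JM <- sum(sites$JL)           # total metacommunity size

print(c(n.sites = n.sites, JM = JM))
```

Doubling the edge size to 10 gives 100 sites, so JM grows with the square of landscape.edge.size.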

The fn.metasim() function uses a lottery recruitment procedure based on Hubbell's (2001) neutral model, with modifications that incorporate the influence of niche-based species sorting by environmental filters and dispersal dynamics in a spatially explicit landscape, both described in Gravel et al. (2006). The default values will run a simulation for 10 generations. Here is a list of the model parameters and a description of what they do (more information is available in the documentation for fn.metasim):

scenario.id - name your simulation.
alpha.fisher - a number used to determine the initial regional diversity for your simulation. The default value is 2. Larger numbers create more regional diversity.
nu - the openness of the metacommunity, also known as Hubbell's speciation rate. This is the probability of a novel species (previously unobserved in the metacommunity) appearing. The default value is 10^-4, which means approximately 1 in every 10,000 recruitment events will select a novel species that is not currently occupying any location in the metacommunity.
speciation.limit - sets an absolute upper limit to the number of novel taxa. Default is max.
n.timesteps - the number of generations in the simulation. I have found that simulations tend to stabilize after 20 to 50 generations. Default is set to 10.
landscape.edge.size - sets the size of the grid. Default is 5, which will create a 5 by 5 grid of 25 sites.
ave.jl - sets the average number of individuals at a site in the metacommunity. Default is 200 individuals per site. Note that other parameter settings can be used to vary the assemblage sizes among the different sites throughout the landscape. You will set the minimum JL size, and the spatial scale at which JL will vary, using other parameters.
JL.min.proportion - sets the minimum observed local assemblage size (JL) in the simulation. It is a proportion of ave.jl, so if you set it to 1, all assemblages are size ave.jl. If you set it to 0.5, then the smallest observed JL in the simulation will be half of ave.jl. The default value is 1.
target.m - m is a measure of the proportion of recruits in the local recruitment pool that are from external (neighboring) sites. You set a target value because m can change from site to site: it is a function of both the influx of immigrants (IL) and the size of the existing local assemblage (JL). The default value is 0.25.
Ef.specificity - all sites are assigned an Ef value between 0 and 1, and all species are given affinities for an Ef value between 0 and 1. Ef.specificity sets the breadth of the trait set that local environmental filters select for. A value of 0 is the narrowest setting (Ef will select for a single trait value, or affinity, e.g., 0.5). Larger values will select for a broader range of affinities. For example, if you set this to 0.1, it will select for traits within a range of length 0.1 along the environmental gradient. The default value is 0.

Setting scaling variables - the scale variables are all based on PCNM eigenvectors calculated for the sites in the landscape. You can choose to link immigration rates (IL), assemblage sizes (JL), and environmental filters (Ef) each to a spatial filter (PCNM eigenvector) of a different scale. To do this you choose a value >= 0 and < 1. A value of 0 will link the parameter to the spatial filter (PCNM) with the finest spatial heterogeneity, and values approaching 1 will link the parameter to progressively broader spatial filters (large scale spatial heterogeneity).
There are also some special codes: a value < 0 will create a random landscape, a value >= 1 will create a landscape with two habitats, and a value of NA will create a completely homogeneous landscape (same value for all sites).
o IL.scale - default value is NA
o JL.scale - default value is NA
o Ef.scale - default value is NA
o Ef.specificity.scale - this is a special case where you can currently only select values of -1, NA, or 1. Default value is NA (same Ef specificity at all sites).

SWM.slope - sets the shape of the dispersal kernel. This is based on Gravel et al. (2006). A value of 0 will create a spatially implicit simulation (the probability of dispersal from one site to another is not affected by how far apart the sites are in the landscape). Steeper slopes create dispersal limitation, where adjacent sites are more likely to contribute to each other's recruitment pools than sites that are far apart. The default value is 0.

Setting species traits - in this simulation, species are randomly assigned a dispersal trait and an environmental affinity. These traits affect their probabilities of contributing to local and regional recruitment pools.
o Dispersal traits - species are randomly assigned a value between 0 and 1. These values determine their probability of contributing to the regional recruitment pools of neighboring sites. A value of 1 is the maximum dispersal; a value of 0 will give a species a 0 probability of contributing to the recruitment pool of neighboring sites. If all species are set to the same value, then dispersal dynamics will be neutral. If there is variation in dispersal abilities, this can create bias in the regional pools based on dispersal abilities.

o You do not set the dispersal traits for species, but you do set the median and range of possible values. With the default values (trait.dispersal.median = 1, trait.dispersal.range = 0), all species will have the same dispersal ability. If you want to add variation in dispersal abilities, set trait.dispersal.median = 0.75 and trait.dispersal.range = 0.25.

Environmental affinities and species sorting - Ef traits describe species affinities to a location on the environmental gradient between 0 and 1. If a species has an affinity of 0.5, then its optimum site will be a site with an Ef value of 0.5. However, we can also set species niche breadths using trait.ef.sd. A larger niche breadth will increase the range of Ef values over which a species can survive. Setting the niche breadth to a very large value (e.g., trait.ef.sd = 20) will essentially make all species ecologically equivalent (all species can survive equally well at all sites), thus making the simulation neutral. Because Ef affinities are randomly assigned, you will only set the niche width to be applied to all species. Start off with a neutral simulation. The default value creates relatively narrow niches, which can result in niche-based species sorting if there is sufficient environmental variability.
trait.ef.sd = 0.1 (default value, niche model)
trait.ef.sd = 20 (set to this value for a neutral model)

Step 3. Settings for saving simulations and metadata - the fn.metasim() function will calculate diversity partitions (alpha, beta, and gamma diversity) and variation partitions (how much of beta-diversity is related to environmental and spatial variation).

q.order - sets the order q used for calculating diversity outcomes for diversity and variation partitioning. Default is to calculate outcomes for both q = 0 and q = 2.
save.sim - do you want to save the simulation? The default value is TRUE (will save the simulation).
You can set it to FALSE if you are running many simulations (e.g., for a sensitivity analysis, when all you want is the metadata output) so that you don't fill up your hard drive.
output.dir.path - the name of the folder where simulations and metadata are saved. The default is SIM_OUTPUT.
keep.timesteps - which time steps do you want to calculate diversity statistics for? The default will keep timesteps 1 and 10. You will need to set this to the desired timestep(s); otherwise the simulation crashes.

NOTE: The code used below is available at http://sites.google.com/site/metacommunitysimulation/file-cabinet. Download demo_script.zip, which is a zipped directory with three .R files containing the code used in steps 4 through 7.

Step 4. Run a neutral simulation and save the output as sim.result in your R session. The script is available in the file RSCRIPT_neutral_simulation.R. Note that it creates a metadata file that displays the parameter settings and diversity outcomes for time step 20.

> sim.result <- fn.metasim(scenario.id = "neutral",
+   alpha.fisher = 0.75,
+   n.timestep = 20,
+   landscape.edge.size = 10,
+   ave.jl = 200,
+   target.m = 0.01,
+   Ef.scale = 0.9,
+   trait.ef.sd = 20,
+   q.order = 2,
+   keep.timesteps = 20)

Step 5. Plotting the simulation using fn.plot.sim() will create a 6 panel plot. The first 3 panels are representations of the landscape (a 10 by 10 grid in this case). In the first panel, the size of each site is a relative representation of the local assemblage size (JL). In the second panel, point sizes are relative to their Ef value. In the third panel, point sizes represent relative differences in m (the influence of immigrants from the regional pool). The fourth panel is a rank occurrence curve (the proportion of sites at which each species occurs), where the black bars represent initial occurrence rates and the red bars represent final occurrence rates. The fifth panel displays species dispersal (black) and Ef (white) trait scores. The last panel is a principal coordinate analysis ordination of sites based on their taxonomic compositions at the initial time step (black) and final time step (red). Note that for this neutral simulation the red points are more spread out because of ecological drift under the neutral scenario where the influence of the regional pool is limited (m is relatively low at a value of 0.125).

> fn.plot.sim(sim.result)

If you want to run a niche simulation, use the code available in RSCRIPT_niche_simulation.R
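The rank occurrence curve in the fourth panel can be reproduced by hand from any site-by-species table. The sketch below shows the calculation in base R on a made-up matrix; it is not the code inside fn.plot.sim(), and the object names are hypothetical.

```r
# Toy site-by-species matrix (illustrative values only): rows are sites, columns are species.
comm <- matrix(c(3, 0, 1,
                 2, 0, 0,
                 5, 1, 0,
                 1, 0, 2),
               nrow = 4, byrow = TRUE,
               dimnames = list(paste0("site", 1:4), c("spA", "spB", "spC")))

# Occurrence = proportion of sites at which each species is present.
occurrence <- colMeans(comm > 0)

# Rank species from most to least widespread.
rank.occurrence <- sort(occurrence, decreasing = TRUE)

print(rank.occurrence)
```

Passing rank.occurrence to barplot() draws the curve; computing it once for the initial time step and once for the final time step gives the two sets of bars described above.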

Step 6. Delete everything in the SIM_OUTPUT directory. Using a for loop, we will run 10 neutral and 10 niche simulations and compare the metadata. We are using a 5 by 5 landscape in these simulations. This code is available in RSCRIPT_simulation_comparison.R

> for (i in 1:10){
+   sim.result <- fn.metasim(scenario.id = "neutral",
+     alpha.fisher = 0.75,
+     n.timestep = 20,
+     landscape.edge.size = 5,
+     ave.jl = 200,
+     target.m = 0.01,
+     Ef.scale = 0.9,
+     trait.ef.sd = 20,
+     q.order = 2,
+     keep.timesteps = 20)
+   sim.result <- fn.metasim(scenario.id = "niche",
+     alpha.fisher = 0.75,
+     n.timestep = 20,
+     landscape.edge.size = 5,
+     ave.jl = 200,
+     target.m = 0.01,
+     Ef.scale = 0.9,
+     trait.ef.sd = 0.125,
+     q.order = 2,
+     keep.timesteps = 20)
+ }

Note that there is a metadata file for each simulation scenario type, and each has 10 rows (one for each rep). You can use the code in Step 7 to combine the two files.

Step 7. Collate metadata into one file.

> fn.combine.sim.csv.metadata("SIM_OUTPUT", n.sims = c("neutral", "niche"))

Now all metadata is in one file titled sim.metadata.csv in the SIM_OUTPUT directory. You can read the entire metadata file back into your R session using

> sim.metadata <- read.csv("SIM_OUTPUT/sim.metadata.csv")

Step 8. Compare the environmental component of beta diversity between niche and neutral simulations.

> boxplot(e.q2 ~ scenario.id, sim.metadata, ylab = "[E] component of beta")
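A numeric summary can complement the boxplot. The sketch below uses base R's aggregate() on a small stand-in data frame so that it runs without any simulation output. The values are placeholders I made up, NOT simulation results, and the column names scenario.id and e.q2 are simply taken from the boxplot call above.

```r
# Stand-in for sim.metadata: PLACEHOLDER values only, not real MCSim output.
sim.metadata.toy <- data.frame(
  scenario.id = rep(c("neutral", "niche"), each = 3),
  e.q2        = c(0.05, 0.10, 0.08, 0.40, 0.35, 0.45)
)

# Median [E] component for each scenario type.
medians <- aggregate(e.q2 ~ scenario.id, data = sim.metadata.toy, FUN = median)

print(medians)
```

To summarize your own runs, replace sim.metadata.toy with the sim.metadata data frame read in Step 7.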

Literature Cited

Borcard, D., P. Legendre, and P. Drapeau. 1992. Partialling out the spatial component of ecological variation. Ecology 73:1045-1055.
Chao, A., C.-H. Chiu, and T. C. Hsieh. 2012. Proposing a resolution to debates on diversity partitioning. Ecology 93:2037-2051.
Charney, N., and S. Record. 2012. vegetarian: Jost Diversity Measures for Community Data.
Gravel, D., C. D. Canham, M. Beaudet, and C. Messier. 2006. Reconciling niche and neutrality: the continuum hypothesis. Ecology Letters 9:399-409.
Hankin, R. K. 2007. Introducing untb, an R package for simulating ecological drift under the unified neutral theory of biodiversity. Journal of Statistical Software 22:1-15.
Hijmans, R. J., and E. Williams. 2011. Spherical Trigonometry.
Holyoak, M., M. A. Leibold, and R. D. Holt. 2005. Metacommunities: spatial dynamics and ecological communities. University of Chicago Press.
Hubbell, S. P. 2001. A unified theory of biodiversity and biogeography. Princeton University Press.
Jost, L. 2006. Entropy and diversity. Oikos 113:363-375.
Jost, L. 2007. Partitioning diversity into independent alpha and beta components. Ecology 88:2427-2439.
Jost, L., A. Chao, and R. L. Chazdon. 2011. Compositional similarity and beta diversity. Pages 68-84 in A. E. Magurran and B. J. McGill, editors. Biological Diversity: Frontiers in Measurement and Assessment. Oxford University Press, Oxford, UK.
Legendre, P., and M. J. Anderson. 1999. Distance-based redundancy analysis: Testing multispecies responses in multifactorial ecological experiments. Ecological Monographs 69:1-24.
Leibold, M. A., M. Holyoak, N. Mouquet, P. Amarasekare, J. M. Chase, M. F. Hoopes, R. D. Holt, J. B. Shurin, R. Law, D. Tilman, M. Loreau, and A. Gonzalez. 2004. The metacommunity concept: a framework for multi-scale community ecology. Ecology Letters 7:601-613.
Oksanen, J., F. G. Blanchet, R. Kindt, P. Legendre, P. R. Minchin, R. B. O'Hara, G. L. Simpson, P. Solymos, M. H. H. Stevens, and H. Wagner. 2013. vegan: Community Ecology Package.
Peres-Neto, P. R., P. Legendre, S. Dray, and D. Borcard. 2006. Variation partitioning of species data matrices - estimation and comparison of fractions. Ecology 87:2614-2625.
R Development Core Team. 2011. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.
Sokol, E. R., E. F. Benfield, L. K. Belden, and V. H. Maurice. 2011. The assembly of ecological communities inferred from taxonomic and functional composition. The American Naturalist 177:630-644.
Sokol, E. R., J. M. Hoch, E. Gaiser, and J. C. Trexler. 2013. Metacommunity Structure Along Resource and Disturbance Gradients in Everglades Wetlands. Wetlands:1-12.
Wickham, H. 2011. The Split-Apply-Combine Strategy for Data Analysis. Journal of Statistical Software 40:1-29.