Package 'ezsummary'
August 29, 2016

Type              Package
Title             Generate Data Summary in a Tidy Format
Version           0.2.1
Description       Functions that simplify the process of generating print-ready data summaries using 'dplyr' syntax.
License           MIT + file LICENSE
LazyData          TRUE
URL               https://github.com/haozhu233/ezsummary
BugReports        https://github.com/haozhu233/ezsummary/issues
Depends           R (>= 3.1.2)
Imports           dplyr (>= 0.4), tidyr (>= 0.5)
Suggests          testthat, knitr, rmarkdown
RoxygenNote       5.0.1
VignetteBuilder   knitr
NeedsCompilation  no
Author            Hao Zhu [aut, cre], Thomas Travison [ctb], Timothy Tsai [ctb], Akhmed Umyarov [ctb]
Maintainer        Hao Zhu <haozhu233@gmail.com>
Repository        CRAN
Date/Publication  2016-07-11 22:59:40

R topics documented:

    auto_var_types
    ezmarkup
    ezsummary
    ezsummary_categorical
    ezsummary_quantitative
    var_types

auto_var_types    Automatically assign var_types to the attributes

Description

    If the user did not provide var_types, the function assumes that every variable is quantitative. If a variable's type is character but either the user or the automatic step marked it as quantitative, the var_types attribute for that variable is overwritten as categorical, and a warning message is printed on the screen.

Usage

    auto_var_types()

Arguments

    (first argument)  The imported data.frame


ezmarkup    Easy way to "markup" a table before it is sent to be displayed

Description

    The final step of an analysis is to export the tables generated by analytical scripts to the desired platform, such as a PDF, an R Markdown document or even a Shiny page. Often, people want to reorganize the data into a more human-readable format. For example, people like to put the standard deviation inside a pair of parentheses after the mean. People also like to put the low and high ends of a confidence interval inside a pair of parentheses, separated by " ~ ", after the estimated average. However, as far as I know, there isn't a straightforward function that deals with this need. This function is built to address this issue.

Usage

    ezmarkup(, pattern)

Arguments

    (first argument)  The input table
    pattern           The grouping pattern. Each dot "." represents one column. If two or more columns need to be combined in a certain format, put them together inside a pair of brackets "[ ]". You can add any special characters, such as "(" and "~", inside the pair of brackets, but please don't leave those special characters outside the brackets. If you want to add a dot as a special character, use "^.^" for every single dot you would like to add.
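
As a rough illustration of what a pattern such as "..[. (.)]" asks for, the base R sketch below keeps two columns unchanged and merges the next two into a single "mean (sd)" column. It does not call ezmarkup() and is not the package's implementation. The table `tbl_demo`, its column names and its values are made up for this example.

```r
# Hypothetical summary table; names and values are illustrative only.
tbl_demo <- data.frame(
  cyl      = c(4, 6),
  variable = c("mpg", "mpg"),
  mean     = c(26.7, 19.7),
  sd       = c(4.5, 1.5),
  stringsAsFactors = FALSE
)

# Keep the first two columns as they are (the two leading dots) and
# combine the last two as "mean (sd)" (the bracketed "[. (.)]" part).
combined <- data.frame(
  cyl      = tbl_demo$cyl,
  variable = tbl_demo$variable,
  mean_sd  = paste0(tbl_demo$mean, " (", tbl_demo$sd, ")"),
  stringsAsFactors = FALSE
)

print(combined)
```

The point of the pattern string is that it describes this kind of reshaping in one argument, instead of one paste0() call per combined column.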

Examples

    library(dplyr)
    library(ezsummary)
    dt <- mtcars %>% group_by(cyl) %>% select(gear, carb) %>% ezsummary_categorical(n = TRUE)
    ezmarkup(dt, "...[.(.)]")
    ezmarkup(dt, "..[. (. ~.)]")


ezsummary    Quick and Easy summarise function

Description

    Quick and Easy summarise function

Usage

    ezsummary(, ..., mode = c("ez", "details"))

Arguments

    (first argument)  A vector, a data.frame or a dplyr.
    ...               Arguments that can be passed to ezsummary_q() and ezsummary_c()
    mode              A character value that can be either "ez" or "details". "ez" is the default mode, which tries to fit quantitative and categorical results into one table. If the two have a different number of analyses, or if set manually, mode "details" is enabled. In this mode, quantitative and categorical variables are displayed separately and the result is stored in a list.

Details

    For detailed options, please check the help documents for ezsummary_q and ezsummary_c. You may also check out the package vignette for details.


ezsummary_categorical    Easily summarize categorical data

Description

    ezsummary_categorical() summarizes categorical data. ezsummary_c() is a shorthand for ezsummary_categorical().

Usage

    ezsummary_categorical(, n = FALSE, count = TRUE, p = TRUE,
      p_type = c("decimal", "percent"), digits = 3,
      rounding_type = c("round", "signif", "ceiling", "floor"),
      P = FALSE, round.n = 3, flavor = c("long", "wide"),
      fill = 0, unit_markup = NULL)

    ezsummary_c(, n = FALSE, count = TRUE, p = TRUE,
      p_type = c("decimal", "percent"), digits = 3,
      rounding_type = c("round", "signif", "ceiling", "floor"),
      P = FALSE, round.n = 3, flavor = c("long", "wide"),
      fill = 0, unit_markup = NULL)

Arguments

    (first argument)  A vector, a data.frame or a dplyr.
    n                 A T/F value; total counts of records. Default is FALSE.
    count             A T/F value; count of records in each category. Default is TRUE.
    p                 A T/F value; proportion or percentage of records in each category. Default is TRUE.
    p_type            A character string determining the output format of p; possible values are decimal and percent. Default value is decimal.
    digits            A numeric value determining the rounding digits; replacement for round.n. Default setting is to read from getOption().
    rounding_type     A character string determining the rounding method; possible values are round, signif, ceiling and floor. When ceiling or floor is selected, digits won't have any effect.
    P                 Deprecated; will change the value of p_type if used in this version.
    round.n           Deprecated; will change the value of rounding_type if used in this version.
    flavor            A character string with two possible inputs: "long" and "wide". "long" is the default setting, which puts grouping information on the left side of the table. It is more machine readable and is good to pass into the next analytical stage if needed. "wide" is more print ready (except for column names, which you can fix in the next step, or fix in LaTeX or with packages like htmlTable). In the "wide" mode, the analyzed variable will be the only "ID" variable and all the stats values will be presented organized by the grouping variables (if any). If there is no grouping, the outputs of "wide" and "long" will be the same.
    fill              If set, missing values created by the "wide" flavor will be replaced with this value. Please check spread for details. Default value is 0.
    unit_markup       When unit_markup is not NULL, it will call the ezmarkup function and perform column combination here. To make everyone's life easier, I'm using the term "unit" here. Each unit means each group of statistical summary results. If you want to know the mean and standard deviation, these two values are your units, so you can put something like "[. (.)]" there.
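
To make the count, p and p_type options more concrete, here is a minimal base R sketch that computes per-category counts and proportions for one variable. It is only an approximation of the idea, not the package's code. In particular, the way the "percent" style is formatted below is an assumption.

```r
# Count of records in each category (what `count` describes).
counts <- table(mtcars$gear)

# Proportion of records in each category (what `p` describes).
props <- as.numeric(counts) / sum(counts)

# "decimal" style, rounded to 3 digits with round().
p_decimal <- round(props, 3)

# "percent" style; this formatting is assumed, the package output may differ.
p_percent <- paste0(round(props * 100, 1), "%")

summary_demo <- data.frame(
  gear    = names(counts),
  count   = as.integer(counts),
  p       = p_decimal,
  percent = p_percent,
  stringsAsFactors = FALSE
)

print(summary_demo)
```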

Examples

    library(dplyr)
    library(ezsummary)
    mtcars %>% group_by(am) %>% select(cyl, gear, carb) %>% ezsummary_categorical()
    mtcars %>% select(cyl, gear, carb) %>% ezsummary_categorical(n = TRUE, round.n = 2)


ezsummary_quantitative    Easily summarize quantitative data

Description

    ezsummary_quantitative() summarizes quantitative data.

Usage

    ezsummary_quantitative(, total = FALSE, n = FALSE, missing = FALSE,
      mean = TRUE, sd = TRUE, sem = FALSE, median = FALSE, quantile = FALSE,
      extra = NULL, digits = 3,
      rounding_type = c("round", "signif", "ceiling", "floor"),
      round.n = 3, flavor = c("long", "wide"), fill = 0, unit_markup = NULL)

    ezsummary_q(, total = FALSE, n = FALSE, missing = FALSE,
      mean = TRUE, sd = TRUE, sem = FALSE, median = FALSE, quantile = FALSE,
      extra = NULL, digits = 3,
      rounding_type = c("round", "signif", "ceiling", "floor"),
      round.n = 3, flavor = c("long", "wide"), fill = 0, unit_markup = NULL)

Arguments

    (first argument)  A vector, a data.frame or a dplyr.
    total             A T/F value; total counts of records, including both missing and non-missing records. Default is FALSE.
    n                 A T/F value; total counts of records that are not missing. Default is FALSE.
    missing           A T/F value; total counts of records that are missing (NA). Default is FALSE.
    mean              A T/F value; the average of a set of data. Default value is TRUE.
    sd                A T/F value; the standard deviation of a set of data. Default value is TRUE.
    sem               A T/F value; the standard error of the mean of a set of data. Default value is FALSE.
    median            A T/F value; the median of a set of data. Default value is FALSE.
    quantile          A T/F value controlling 5 outputs; the 0%, 25%, 50%, 75% and 100% percentiles of a set of data. Default value is FALSE.
    extra             A character vector offering extra customizability to this function. Please see Details for detail.
    digits            A numeric value determining the rounding digits; replacement for round.n. Default setting is to read from getOption().
    rounding_type     A character string determining the rounding method; possible values are round, signif, ceiling and floor. When ceiling or floor is selected, digits won't have any effect.
    round.n           Deprecated; will change the value of rounding_type if used in this version.
    flavor            A character string with two possible inputs: "long" and "wide". "long" is the default setting, which puts grouping information on the left side of the table. It is more machine readable and is good to pass into the next analytical stage if needed. "wide" is more print ready (except for column names, which you can fix in the next step, or fix in LaTeX or with packages like htmlTable). In the "wide" mode, the analyzed variable will be the only "ID" variable and all the stats values will be presented organized by the grouping variables (if any). If there is no grouping, the outputs of "wide" and "long" will be the same.
    fill              If set, missing values created by the "wide" flavor will be replaced with this value. Please check spread for details. Default value is 0.
    unit_markup       When unit_markup is not NULL, it will call the ezmarkup function and perform column combination here. To make everyone's life easier, I'm using the term "unit" here. Each unit means each group of statistical summary results. If you want to know the mean and standard deviation, these two values are your units, so you can put something like "[. (.)]" there.
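
The statistics behind these switches can be sketched in base R as follows. This is an illustrative computation on a single column, not the package's code. It assumes the usual definition of the standard error of the mean, sd / sqrt(n), with n counting non-missing records.

```r
x <- mtcars$mpg

n_total   <- length(x)        # `total`: all records, missing or not
n_valid   <- sum(!is.na(x))   # `n`: records that are not missing
n_missing <- sum(is.na(x))    # `missing`: records that are NA

avg <- mean(x, na.rm = TRUE)
s   <- sd(x, na.rm = TRUE)
sem <- s / sqrt(n_valid)      # assumed definition of the standard error of the mean

# The five values the `quantile` option refers to: 0%, 25%, 50%, 75%, 100%.
q <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1), na.rm = TRUE)

print(round(c(mean = avg, sd = s, sem = sem), 3))
print(q)
```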

Examples

    library(dplyr)
    library(ezsummary)
    mtcars %>% group_by(am) %>% select(mpg, wt, qsec) %>% ezsummary_quantitative()


var_types    Attach the variable type information with the dataset

Description

    To analyze variables in the most appropriate way with the ezsummary package, you should tell the computer what types of data (quantitative or categorical) you are asking it to compute. This function attaches the list of types you entered to the dataset, so that downstream functions can read this information and analyze accordingly. The information is stored in the attributes of the dataset.

Usage

    var_types(, types)

Arguments

    (first argument)  A data.frame
    types             Character vector of length equal to the number of variables in the dataset. Use "q" and "c" to denote quantitative and categorical variables.
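
The idea of storing types as an attribute, together with the character-to-categorical correction described under auto_var_types, can be sketched in base R. The helper `attach_types()` and the attribute name "var_types" are hypothetical choices for this example. The package may store the information differently.

```r
# Hypothetical helper; the attribute name "var_types" is an assumption.
attach_types <- function(df, types) {
  stopifnot(length(types) == ncol(df), all(types %in% c("q", "c")))

  # A character column cannot be summarized as quantitative, so switch it
  # to categorical and warn, as the auto_var_types description suggests.
  is_chr <- vapply(df, is.character, logical(1))
  clash  <- is_chr & types == "q"
  if (any(clash)) {
    warning("Character columns treated as categorical: ",
            paste(names(df)[clash], collapse = ", "))
    types[clash] <- "c"
  }

  attr(df, "var_types") <- types
  df
}

demo  <- data.frame(age = c(31, 45), sex = c("f", "m"), stringsAsFactors = FALSE)
typed <- suppressWarnings(attach_types(demo, c("q", "q")))

# Downstream code can read the types back from the attribute.
print(attr(typed, "var_types"))
```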

Index

    auto_var_types
    ezmarkup
    ezsummary
    ezsummary_c (ezsummary_categorical)
    ezsummary_categorical
    ezsummary_q (ezsummary_quantitative)
    ezsummary_quantitative
    spread
    var_types