Package tidyimpute. March 5, 2018

Title Imputation the Tidyverse Way
Version 0.1.0
Date 2018-02-01
URL https://github.com/decisionpatterns/tidyimpute
Description Functions and methods for imputing missing values (NA) in tables and lists, patterned after the tidyverse approach of 'dplyr' and 'rlang'; works with data.tables as well.
BugReports https://github.com/decisionpatterns/tidyimpute/issues
Depends R (>= 3.1.0)
Imports methods, dplyr (>= 0.7.2), rlang (>= 0.1.2), na.tools (>= 0.1.0)
Suggests testthat (>= 1.0.2), data.table (>= 1.10), magrittr
License GPL-3 | file LICENSE
LazyData true
RoxygenNote .0.1.9000
Repository CRAN
Encoding UTF-8
Collate 'data.r' 'drop_cols.r' 'drop_rows.r' 'impute.r' 'make_impute.r' 'utils.r' 'impute_funs.r' 'na_predict.r' 'zzz.r'
NeedsCompilation no
Author Christopher Brown [aut, cre], Decision Patterns [cph]
Maintainer Christopher Brown <chris.brown@decisionpatterns.com>
Date/Publication 2018-03-05 11:4:49 UTC

R topics documented:

    drop_cols_all_na
    drop_rows_all_na
    impute
    impute_functions
    nacars
    na_predict
    Index

drop_cols_all_na        Remove columns with missing values

Description

    Remove columns of a table whose values are all NA or that have any NA.

Usage

    drop_cols_all_na()
    drop_cols_any_na()
    drop_na_cols()

Arguments

    (first argument)    table-like object

Details

    drop_cols_all_na removes all columns whose only values are NA. drop_cols_any_na removes columns that have any NA. They work on all table-like objects.

Value

    An object of the same class as data with columns containing all NA values removed.

See Also

    dplyr::select()
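The difference between the "all NA" and "any NA" variants can be illustrated without the package. The following is a base-R sketch of the described behaviour only, not the package's implementation; the data frame `df` and the names `all_na_cols`, `any_na_cols`, `kept_all` and `kept_any` are made up for the illustration.

```r
# Toy table: `a` is complete, `b` has one NA, `c` is entirely NA.
df <- data.frame(a = 1:3, b = c(1, NA, 3), c = NA)

# Flag columns that consist only of NA, and columns that contain any NA.
all_na_cols <- vapply(df, function(col) all(is.na(col)), logical(1))
any_na_cols <- vapply(df, function(col) any(is.na(col)), logical(1))

# Behaviour described for drop_cols_all_na: only `c` is removed.
kept_all <- names(df[, !all_na_cols, drop = FALSE])

# Behaviour described for drop_cols_any_na: `b` and `c` are removed.
kept_any <- names(df[, !any_na_cols, drop = FALSE])

print(kept_all)  # "a" "b"
print(kept_any)  # "a"
```

`drop = FALSE` keeps the result a data frame even when a single column remains, which matches the documented promise that the class of the input is preserved.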

drop_rows_all_na        drop_rows_all_na, drop_rows_any_na

Description

    Drop rows of a table whose values are all NA.

Usage

    drop_rows_all_na()
    filter_all_na()
    drop_rows_any_na()
    filter_any_na()

Arguments

    (first argument)    data-like object

Details

    drop_rows_all_na removes all rows whose only values are NA. It works for all table-like objects.

Value

    An object of the same class as the input, with rows containing all NA values removed.

See Also

    dplyr::filter()

Examples

    data(iris)
    # `df` is a placeholder name for the example table
    df <- iris[1:5, ]
    df[1:2, ] <- NA
    df[3, 1] <- NA

    filter_all_na(df)
    filter_any_na(df)
    drop_rows_all_na(df)
    drop_rows_any_na(df)

impute        Replace missing values in tables and lists

Description

    Replace missing values (NA) in tables and lists.

Usage

    impute(, .na, ...)
    impute_at(, .na, .vars, ...)
    impute_all(, .na, ...)
    impute_if(, .na, .predicate, ...)

Arguments

    (first argument)    list-like or table-like structure.
    .na                 scalar, vector or function as described in na.tools::na.replace()
    ...                 additional args; either an unnamed list of columns (quoted or not) or name=function pairs. See Details.
    .vars               character; names of columns to be imputed
    .predicate          dplyr-type predicate functions

Details

    impute is similar to other dplyr verbs, especially dplyr::mutate(). Like dplyr::mutate(), it operates on columns. It changes only missing values (NA) to the value specified by .na.

    Behavior:

    Behavior depends on the values of .na and ... . impute can be used for three replacement operations:

    1. impute(, .na) : (missing ...) Replace missing values in ALL COLS by .na. This is analogous to impute_all.

    2. impute(, .na, ...) : (... is an unnamed list) Replace column(s) specified in ... by .na. Columns are specified as an unnamed list of quoted or unquoted column names. This is analogous to impute_at, where ... specifies .vars.

    3. impute(, col1=na.*, col2=na.*) : (missing .na) Replace by column-specific .na.

    In impute, additional arguments to .na are not used; use impute_at for this or create your own lambda functions.

    impute_all is like impute without specifying columns in ... ; there, ... is used for additional arguments to .na.

Value

    Returns an object of the same type as the input. Columns are mutated to replace missing values (NA) with the value specified by .na and ... .

Note

    ... is used to specify columns in impute but is used as additional arguments to .na in the other impute_* functions.

See Also

    The na.tools package.
    impute_functions

Examples

    data(nacars)

    ## Not run:
    nacars %>% impute(0, mpg, cyl)
    nacars %>% impute(1:, mpg, cyl)

    nacars %>% impute( na.mean )
    nacars %>% impute( mean )            # unsafe
    nacars %>% impute( length, mpg, disp )
    nacars %>% impute( mean, mpg, disp )
    nacars %>% impute( mpg=na.mean, cyl=na.max )
    nacars %>% impute( na.mean, c('mpg','disp') )
    ## End(Not run)

    ## Not run:
    nacars %>% impute_at( -99, .vars=1:3 )
    nacars %>% impute_at( .na=na.mean, .vars=1: )

    # Same, uses ... for additional args
    nacars %>% impute_at( .na=mean, .vars=1:, na.rm = TRUE )
    nacars %>% impute_at( .na=na.mean, .vars = c('mpg','cyl', 'disp') )
    ## End(Not run)

    ## Not run:
    nacars %>% impute_all( -99 )
    nacars %>% impute_all( na.min )
    ## End(Not run)
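The three replacement forms described for impute can be mimicked in base R to show what each one changes. This is a sketch of the described semantics under stated assumptions, not the package's code: `fill_na`, `safe_mean`, `safe_max`, `rules` and the toy table `df` are hypothetical names introduced here, with `safe_mean` and `safe_max` standing in for NA-safe functions in the style of na.mean and na.max.

```r
# Toy table with missing values in every column.
df <- data.frame(mpg = c(21, NA, 18), cyl = c(4, NA, 6), hp = c(NA, 90, 110))

# Hypothetical helper: replace the NAs of one vector by a scalar, or by
# the result of calling a function on that vector.
fill_na <- function(x, na) {
  value <- if (is.function(na)) na(x) else na
  x[is.na(x)] <- value
  x
}

# Hypothetical NA-safe summaries. Plain mean(x) returns NA when x has
# NAs, which is why the manual marks impute(mean) as unsafe.
safe_mean <- function(x) mean(x, na.rm = TRUE)
safe_max  <- function(x) max(x, na.rm = TRUE)

# Form 1: one replacement for all columns (analogous to impute_all).
form1 <- as.data.frame(lapply(df, fill_na, na = 0))

# Form 2: one replacement for selected columns (analogous to impute_at).
form2 <- df
form2[c("mpg", "cyl")] <- lapply(df[c("mpg", "cyl")], fill_na, na = safe_mean)

# Form 3: column-specific replacements given as name = function pairs.
rules <- list(mpg = safe_mean, cyl = safe_max)
form3 <- df
for (col in names(rules)) {
  form3[[col]] <- fill_na(df[[col]], rules[[col]])
}

print(form1)
print(form2)
print(form3)
```

In forms 2 and 3 the `hp` column is left untouched, so its NA survives; only form 1 touches every column.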

impute_functions        Table imputation methods

Description

    Replace missing values with a variety of methods.

Usage

    impute_functions(, .na, .vars, .predicate)

Arguments

    (first argument)    table-like or list-like structure
    .na                 value/function to be used for replacement
    .vars               list of columns generated by vars(), or a character vector of column names, or a numeric vector of column positions.
    .predicate          A predicate function to be applied to the columns or a logical vector.
    ...                 additional arguments passed to the imputation method

Details

    These methods are modelled closely after dplyr::mutate() and the select-style verbs. Most of the functions depend on the na.tools package.

    Function List:

    explicit:   impute_explicit, impute_explicit_at, impute_explicit_all, impute_explicit_if
    zero:       impute_zero, impute_zero_at, impute_zero_all, impute_zero_if
    inf:        impute_inf, impute_inf_at, impute_inf_all, impute_inf_if
    neginf:     impute_neginf, impute_neginf_at, impute_neginf_all, impute_neginf_if
    constant:   impute_constant, impute_constant_at, impute_constant_all, impute_constant_if
    max:        impute_max, impute_max_at, impute_max_all, impute_max_if
    min:        impute_min, impute_min_at, impute_min_all, impute_min_if
    median:     impute_median, impute_median_at, impute_median_all, impute_median_if
    mean:       impute_mean, impute_mean_at, impute_mean_all, impute_mean_if
    most_freq:  impute_most_freq, impute_most_freq_at, impute_most_freq_all, impute_most_freq_if
    quantile:   impute_quantile, impute_quantile_at, impute_quantile_all, impute_quantile_if
    sample:     impute_sample, impute_sample_at, impute_sample_all, impute_sample_if
    random:     impute_random, impute_random_at, impute_random_all, impute_random_if
    replace:    impute_replace, impute_replace_at, impute_replace_all, impute_replace_if

Examples

    ## Not run:
    nacars %>% impute_zero()
    nacars %>% impute_zero( mpg, cyl )
    nacars %>% impute_zero( "mpg", "cyl" )
    nacars %>% impute_zero( c("mpg","cyl") )
    nacars %>% impute_zero( 1:2 )
    ## End(Not run)

nacars        data with missing values

Description

    data with missing values

Usage

    nacars

Format

    An object of class data.frame with rows and 11 columns.

Details

    cars and iris data sets with missing data for demonstration purposes.

na_predict        na_predict

Description

    replace NA values by predictions of a model

Usage

    na_predict(x, object, data = x)

Arguments

    x         data
    object    object with predict method
    data      data object
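The manual gives no example for na_predict. The sketch below shows one plausible reading of "replace NA values by predictions of a model" in base R, using lm() and predict() from the stats package. It is not the package's implementation: `fill_by_prediction` is a hypothetical helper, and treating `x` as the vector containing the NAs and `data` as the table handed to predict() is an assumption, since the argument descriptions are terse.

```r
# Toy data: y is exactly 2 * x, with two values of y missing.
df <- data.frame(x = 1:6, y = c(2, 4, NA, 8, NA, 12))

# Fit a model on the complete rows (lm drops rows with NA by default).
fit <- lm(y ~ x, data = df)

# Hypothetical helper following the documented argument order:
# `x` holds the NAs, `object` has a predict method, `data` feeds predict().
fill_by_prediction <- function(x, object, data) {
  missing <- is.na(x)
  x[missing] <- predict(object, newdata = data[missing, , drop = FALSE])
  x
}

df$y <- fill_by_prediction(df$y, fit, df)
print(df)
```

Only the positions that were NA are overwritten; observed values are kept as they are, which is the same rule the impute functions follow.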

Index

Topic datasets
    nacars

dplyr::filter()
dplyr::mutate()
dplyr::select()
drop_cols_all_na
drop_cols_any_na (drop_cols_all_na)
drop_na_cols (drop_cols_all_na)
drop_rows_all_na
drop_rows_any_na (drop_rows_all_na)
filter_all_na (drop_rows_all_na)
filter_any_na (drop_rows_all_na)
impute
impute_all (impute)
impute_at (impute)
impute_constant (impute_functions)
impute_constant_all (impute_functions)
impute_constant_at (impute_functions)
impute_constant_if (impute_functions)
impute_explicit (impute_functions)
impute_explicit_all (impute_functions)
impute_explicit_at (impute_functions)
impute_explicit_if (impute_functions)
impute_functions
impute_if (impute)
impute_inf (impute_functions)
impute_inf_all (impute_functions)
impute_inf_at (impute_functions)
impute_inf_if (impute_functions)
impute_max (impute_functions)
impute_max_all (impute_functions)
impute_max_at (impute_functions)
impute_max_if (impute_functions)
impute_mean (impute_functions)
impute_mean_all (impute_functions)
impute_mean_at (impute_functions)
impute_mean_if (impute_functions)
impute_median (impute_functions)
impute_median_all (impute_functions)
impute_median_at (impute_functions)
impute_median_if (impute_functions)
impute_min (impute_functions)
impute_min_all (impute_functions)
impute_min_at (impute_functions)
impute_min_if (impute_functions)
impute_most_freq (impute_functions)
impute_most_freq_all (impute_functions)
impute_most_freq_at (impute_functions)
impute_most_freq_if (impute_functions)
impute_neginf (impute_functions)
impute_neginf_all (impute_functions)
impute_neginf_at (impute_functions)
impute_neginf_if (impute_functions)
impute_quantile (impute_functions)
impute_quantile_all (impute_functions)
impute_quantile_at (impute_functions)
impute_quantile_if (impute_functions)
impute_random (impute_functions)
impute_random_all (impute_functions)
impute_random_at (impute_functions)
impute_random_if (impute_functions)
impute_replace (impute_functions)
impute_replace_all (impute_functions)
impute_replace_at (impute_functions)
impute_replace_if (impute_functions)
impute_sample (impute_functions)
impute_sample_all (impute_functions)
impute_sample_at (impute_functions)
impute_sample_if (impute_functions)
impute_zero (impute_functions)
impute_zero_all (impute_functions)
impute_zero_at (impute_functions)
impute_zero_if (impute_functions)
na.tools::na.replace()
na_predict
nacars
nacars_dt (nacars)
nairis (nacars)
nairis_dt (nacars)