Package CatEncoders. March 8, 2017

Similar documents
Package datasets.load

Package fastdummies. January 8, 2018

Package cattonum. R topics documented: May 2, Type Package Version Title Encode Categorical Features

Package inca. February 13, 2018

Package kirby21.base

Package ECctmc. May 1, 2018

Package rmatio. July 28, 2017

Package quadprogxt. February 4, 2018

Package geojsonsf. R topics documented: January 11, Type Package Title GeoJSON to Simple Feature Converter Version 1.3.

Package bigreadr. R topics documented: August 13, Version Date Title Read Large Text Files

Package repec. August 31, 2018

Package spark. July 21, 2017

Package mdftracks. February 6, 2017

Package bisect. April 16, 2018

Package SEMrushR. November 3, 2018

Package hypercube. December 15, 2017

Package tidyimpute. March 5, 2018

Package Tushare. November 2, 2018

Package rngsetseed. February 20, 2015

Package crossword.r. January 19, 2018

Package glcm. August 29, 2016

Package projector. February 27, 2018

Package sfc. August 29, 2016

Package validara. October 19, 2017

Package deductive. June 2, 2017

Package WordR. September 7, 2017

Package IATScore. January 10, 2018

Package clipr. June 23, 2018

Package ecoseries. R topics documented: September 27, 2017

Package nngeo. September 29, 2018

Package VSE. March 21, 2016

Package dkdna. June 1, Description Compute diffusion kernels on DNA polymorphisms, including SNP and bi-allelic genotypes.

Package curry. September 28, 2016

Package getopt. February 16, 2018

Package zoomgrid. January 3, 2019

Package sankey. R topics documented: October 22, 2017

Package estprod. May 2, 2018

Package heims. January 25, 2018

Package ggimage. R topics documented: November 1, Title Use Image in 'ggplot2' Version 0.0.7

Package raker. October 10, 2017

Package Matrix.utils

Package comparedf. February 11, 2019

Package scraep. July 3, Index 6

Package gtrendsr. October 19, 2017

Package QCAtools. January 3, 2017

Package gifti. February 1, 2018

Package sessioninfo. June 21, 2017

Package rmumps. September 19, 2017

Package nmslibr. April 14, 2018

Package PCADSC. April 19, 2017

Package ompr. November 18, 2017

Package pwrab. R topics documented: June 6, Type Package Title Power Analysis for AB Testing Version 0.1.0

Package RandPro. January 10, 2018

Package nos. September 11, 2017

Package CorporaCoCo. R topics documented: November 23, 2017

Package humanize. R topics documented: April 4, Version Title Create Values for Human Consumption

Package ggimage. R topics documented: December 5, Title Use Image in 'ggplot2' Version 0.1.0

Package TANDEM. R topics documented: June 15, Type Package

Package embed. November 19, 2018

Package vinereg. August 10, 2018

Package rferns. November 15, 2017

Package assortnet. January 18, 2016

Package labelvector. July 28, 2018

Package woebinning. December 15, 2017

Package misclassglm. September 3, 2016

Package splithalf. March 17, 2018

Package rpst. June 6, 2017

Package condusco. November 8, 2017

Package missforest. February 15, 2013

Package messaging. May 27, 2018

Package RYoudaoTranslate

Package Rprofet. R topics documented: April 14, Type Package Title WOE Transformation and Scorecard Builder Version 2.2.0

Package wethepeople. March 3, 2013

Package strat. November 23, 2016

Package UNF. June 13, 2017

Package gtrendsr. August 4, 2018

Package SFtools. June 28, 2017

Package gcite. R topics documented: February 2, Type Package Title Google Citation Parser Version Date Author John Muschelli

Package AdhereR. June 15, 2018

Package AnDE. R topics documented: February 19, 2015

Package TrafficBDE. March 1, 2018

Package rvinecopulib

Package CompGLM. April 29, 2018

Package tropicalsparse

Package gsalib. R topics documented: February 20, Type Package. Title Utility Functions For GATK. Version 2.1.

Package tmpm. February 29, 2016

Package intccr. September 12, 2017

Package kerasformula

Package editdata. October 7, 2017

Package presens. July 29, 2016

Package dirmcmc. R topics documented: February 27, 2017

Package DPBBM. September 29, 2016

Package jstree. October 24, 2017

Package longclust. March 18, 2018

Package ClusterBootstrap

Package mmtsne. July 28, 2017

Package io. January 15, 2018

Package coda.base. August 6, 2018

Package densityclust

Package modmarg. R topics documented:

Transcription:

Type Package Title Encoders for Categorical Variables Version 0.1.1 Author nl zhang Package CatEncoders Maintainer nl zhang <setseed2016@gmail.com> March 8, 2017 Contains some commonly used categorical variable encoders, such as 'LabelEncoder' and 'One- HotEncoder'. Inspired by the encoders implemented in Python 'sklearn.preprocessing' package (see <http://scikit-learn.org/stable/modules/preprocessing.html>). License GPL-2 GPL-3 LazyData TRUE Imports Matrix (>= 1.2-6), data.table (>= 1.9.6), methods RoxygenNote 5.0.1 NeedsCompilation no Repository CRAN Date/Publication 2017-03-08 08:22:03 R topics documented: inverse.transform...................................... 2 LabelEncoder-class..................................... 3 LabelEncoder.Character-class................................ 3 LabelEncoder.Factor-class................................. 3 LabelEncoder.fit....................................... 4 LabelEncoder.Numeric-class................................ 5 OneHotEncoder-class.................................... 5 OneHotEncoder.fit..................................... 5 transform.......................................... 6 Index 8 1

2 inverse.transform inverse.transform inverse.transform transforms an integer vector back to the original vector inverse.transform transforms an integer vector back to the original vector Usage inverse.transform(enc, z) ## S4 method for signature 'LabelEncoder,numeric' inverse.transform(enc, z) Arguments enc z A fitted LabelEncoder A vector of integers Value A vector of characters, factors or numerics. Examples # character vector y y <- c('a','d','e',na) z <- transform(lenc,c('d','d',na,'f')) inverse.transform(lenc,z) # factor vector y y <- factor(c('a','d','e',na),exclude=null) z <- transform(lenc,factor(c('a','d',na,'f'))) inverse.transform(lenc,z) # numeric vector y set.seed(123) y <- c(1:10,na) newy <- sample(c(1:10,na),5) print(newy) z <-transform(lenc,newy) inverse.transform(lenc, z)

LabelEncoder-class 3 LabelEncoder-class An S4 class to represent a LabelEncoder. An S4 class to represent a LabelEncoder. type A character to denote the input type, either character, factor or numeric mapping A data.frame to store the mapping table LabelEncoder.Character-class An S4 class to represent a LabelEncoder with character input. An S4 class to represent a LabelEncoder with character input. classes A character vector to store the unique values of classes LabelEncoder.Factor-class An S4 class to represent a LabelEncoder with factor input. An S4 class to represent a LabelEncoder with factor input. classes A factor vector to store the unique values of classes

4 LabelEncoder.fit LabelEncoder.fit LabelEncoder.fit fits a LabelEncoder object LabelEncoder.fit fits a LabelEncoder object Usage LabelEncoder.fit(y) Arguments y A vector of characters, factors, or numerics, which can include NA as well Value Returns an object of S4 class LabelEncoder. Examples # factor y y <- factor(c('a','d','e',na),exclude=null) z <- transform(lenc,factor(c('d','d',na,'f'))) # character y y <- c('a','d','e',na) z <- transform(lenc,c('d','d',na,'f')) # numeric y set.seed(123) y <- sample(c(1:10,na),5) z <-transform(lenc,sample(c(1:10,na),5))

LabelEncoder.Numeric-class 5 LabelEncoder.Numeric-class An S4 class to represent a LabelEncoder with numeric input. An S4 class to represent a LabelEncoder with numeric input. classes A numeric vector to store the unique values of classes OneHotEncoder-class An S4 class to represent a OneHotEncoder An S4 class to represent a OneHotEncoder n_columns An integer value to store the number of columns of input data n_values A numeric vector to store the number of unique values in each column of input data column_encoders A list that stores the LabelEncoder for each column of input data OneHotEncoder.fit OneHotEncoder.fit fits an OneHotEncoder object OneHotEncoder.fit fits an OneHotEncoder object Usage OneHotEncoder.fit(X) Arguments X A matrix or data.frame, which can include NA Value Returns an object of S4 class OneHotEncoder

6 transform Examples # matrix input X1 <- matrix(c(0, 1, 0, 1, 0, 1, 2, 0, 3, 0, 1, 2),c(4,3),byrow=FALSE) oenc <- OneHotEncoder.fit(X1) z <- transform(oenc,x1,sparse=true) # return a sparse matrix # data.frame X2 <- cbind(data.frame(x1),x4=c('a','b','d',na),x5=factor(c(1,2,3,1))) oenc <- OneHotEncoder.fit(X2) z <- transform(oenc,x2,sparse=false) # return a dense matrix transform transform transforms a new data set using the fitted encoder Usage transform transforms a new data set using the fitted encoder transform(enc,...) ## S4 method for signature 'LabelEncoder.Numeric' transform(enc, y) ## S4 method for signature 'LabelEncoder.Character' transform(enc, y) ## S4 method for signature 'LabelEncoder.Factor' transform(enc, y) ## S4 method for signature 'OneHotEncoder' transform(enc, X, sparse = TRUE, new.feature.error = TRUE) Arguments enc A fitted encoder, i.e., LabelEncoder or OneHotEncoder... Additional argument list y X sparse A vector of character, factor or numeric values A data.frame or matrix If TRUE then return a sparse matrix, default = TRUE

transform 7 Value new.feature.error If TRUE then throw an error for new feature values; otherwise the new feature values are ignored, default = TRUE If enc is an OneHotEncoder, the returned value is a sparse or dense matrix. If enc is a LabelEncoder, the returned value is a vector. Examples # matrix X X1 <- matrix(c(0, 1, 0, 1, 0, 1, 2, 0, 3, 0, 1, 2),c(4,3),byrow=FALSE) oenc <- OneHotEncoder.fit(X1) z <- transform(oenc,x1,sparse=true) # return a sparse matrix # data.frame X X2 <- cbind(data.frame(x1),x4=c('a','b','d',na),x5=factor(c(1,2,3,1))) oenc <- OneHotEncoder.fit(X2) z <- transform(oenc,x2,sparse=false) # return a dense matrix # factor vector y y <- factor(c('a','d','e',na),exclude=null) z <- transform(lenc,factor(c('d','d',na,'f'))) # character vector y y <- c('a','d','e',na) z <- transform(lenc,c('d','d',na,'f')) # numeric vector y set.seed(123) y <- sample(c(1:10,na),5) z <-transform(lenc,sample(c(1:10,na),5))

Index inverse.transform, 2 inverse.transform,labelencoder,numeric-method (inverse.transform), 2 LabelEncoder-class, 3 LabelEncoder.Character,character-method LabelEncoder.Character-class, 3 LabelEncoder.Factor-class, 3 LabelEncoder.fit, 4 LabelEncoder.Numeric,numeric-method LabelEncoder.Numeric-class, 5 OneHotEncoder-class, 5 OneHotEncoder.fit, 5 transform, 6 transform, transform,labelencoder.character-method transform,labelencoder.factor,factor-method transform,labelencoder.factor-method transform,labelencoder.numeric-method transform,onehotencoder,any,logical-method transform,onehotencoder-method 8