Package gains. September 12, 2017

Similar documents
Package dyncorr. R topics documented: December 10, Title Dynamic Correlation Package Version Date

Package sure. September 19, 2017

Package intccr. September 12, 2017

Package areaplot. October 18, 2017

Package texteffect. November 27, 2017

Package sspline. R topics documented: February 20, 2015

Package FCGR. October 13, 2015

Package weco. May 4, 2018

Package rpst. June 6, 2017

Package OptimaRegion

Package nplr. August 1, 2015

Package fbroc. August 29, 2016

Package munfold. R topics documented: February 8, Type Package. Title Metric Unfolding. Version Date Author Martin Elff

Package PTE. October 10, 2017

Package sciplot. February 15, 2013

Package sizemat. October 19, 2018

Package cgh. R topics documented: February 19, 2015

Package fitplc. March 2, 2017

Package replicatedpp2w

Package Mondrian. R topics documented: March 4, Type Package

Package Rramas. November 25, 2017

Package EnQuireR. R topics documented: February 19, Type Package Title A package dedicated to questionnaires Version 0.

Package compeir. February 19, 2015

Package SmoothHazard

Package Daim. February 15, 2013

Package MIICD. May 27, 2017

Package PIPS. February 19, 2015

Package SCVA. June 1, 2017

Package CausalImpact

Package ClustGeo. R topics documented: July 14, Type Package

Package ModelGood. R topics documented: February 19, Title Validation of risk prediction models Version Author Thomas A.

Package basictrendline

Package complmrob. October 18, 2015

Package TVsMiss. April 5, 2018

Package qicharts. October 7, 2014

Package rknn. June 9, 2015

Package gtrendsr. October 19, 2017

Package ArCo. November 5, 2017

Package OLScurve. August 29, 2016

Package datasets.load

Package bdots. March 12, 2018

Package SCBmeanfd. December 27, 2016

Package ECctmc. May 1, 2018

Package kofnga. November 24, 2015

Package woebinning. December 15, 2017

Package sparsereg. R topics documented: March 10, Type Package

Package manet. September 19, 2017

Package epitab. July 4, 2018

Package iclick. R topics documented: February 24, Type Package

Package mlegp. April 15, 2018

Package fso. February 19, 2015

Package swissmrp. May 30, 2018

Package spcadjust. September 29, 2016

Package ExceedanceTools

Package QCAtools. January 3, 2017

Package qrfactor. February 20, 2015

Package SSRA. August 22, 2016

Package cosinor. February 19, 2015

Package SFtools. June 28, 2017

Package desplot. R topics documented: April 3, 2018

Package spikes. September 22, 2016

Package pwrrasch. R topics documented: September 28, Type Package

Package gtrendsr. August 4, 2018

Package cumseg. February 19, 2015

Package gems. March 26, 2017

Package quickreg. R topics documented:

Package smbinning. December 1, 2017

Package mixphm. July 23, 2015

Package catenary. May 4, 2018

Package activity. September 26, 2016

Package lvm4net. R topics documented: August 29, Title Latent Variable Models for Networks

Package DengueRT. May 13, 2016

Package svmpath. R topics documented: August 30, Title The SVM Path Algorithm Date Version Author Trevor Hastie

Logical operators: R provides an extensive list of logical operators. These include

Package condir. R topics documented: February 15, 2017

Package PCADSC. April 19, 2017

Package ParetoPosStable

Package DBfit. October 25, 2018

Package sankey. R topics documented: October 22, 2017

Package breakdown. June 14, 2018

Package ConvergenceClubs

Package GLDreg. February 28, 2017

Package lvplot. August 29, 2016

Package ScottKnottESD

Package RcmdrPlugin.survival

Package Grid2Polygons

Package ICEbox. July 13, 2017

Package spark. July 21, 2017

Package textrank. December 18, 2017

Package edear. March 22, 2018

Package optimus. March 24, 2017

Package r2d2. February 20, 2015

Package nprotreg. October 14, 2018

Package anidom. July 25, 2017

Package EFS. R topics documented:

Package RCA. R topics documented: February 29, 2016

Package SparseFactorAnalysis

Package flam. April 6, 2018

Package queuecomputer

Package DPBBM. September 29, 2016

Transcription:

Package gains September 12, 2017 Version 1.2 Date 2017-09-08 Title Lift (Gains) Tables and Charts Author Craig A. Rolling <craig.rolling@slu.edu> Maintainer Craig A. Rolling <craig.rolling@slu.edu> Depends R (>= 3.0.0) Constructs gains tables and lift charts for prediction algorithms. Gains tables and lift charts are commonly used in direct marketing applications. The method is described in Drozdenko and Drake (2002), ``Optimal Database Marketing'', Chapter 11. LazyData Yes License GPL-3 NeedsCompilation no Repository CRAN Date/Publication 2017-09-12 21:35:21 UTC R topics documented: ciascores.......................................... 2 gains............................................. 3 MineThatData........................................ 5 plot.gains.......................................... 5 print.gains.......................................... 7 Index 8 1

2 ciascores ciascores Cell Phones per Country with Predictions This data set gives the number of cell phones per person for 194 countries, courtesy of the CIA World Factbook. The data are mostly for 2008. It also gives predicted values of this variable from 5 different methods (OLS, Lasso, Regression Tree, Random Forest, and Additive Model). Finally, there is an indicator for each country indicating whether the country was used in the model development sample or not. cia.scores Format a data frame containing 194 rows and 8 columns. CellPhonesPP: Number of cell phones per person, from the CIA Factbook. PredOLS: Predicted response from an OLS regression. PredLasso: Predicted response from a LASSO regression. PredTree: Predicted response from a regression tree. PredRF: Predicted response from a Random Forest. PredSM: Predicted response from an additive model. PredGLM: Predicted probability (from a logistic regression) that the country has more cell phones than people. train: Indicator, =1 if the country was among the set used to make the predictions, =0 if the country was in the validation set (not used to make predictions). Source CIA - The World Factbook https://www.cia.gov/library/publications/the-world-factbook/ index.html

gains 3 gains Gains Table for a Vector of Predictions Takes a vector of actual responses and a vector of predictions and constructs a gains table to evaluate the predictions. gains(actual, predicted, groups=10, ties.method=c("max","min","first","average","random"), conf=c("none","normal","t","boot"), boot.reps=1000, conf.level=0.95, optimal=false,percents=false) Arguments Value actual predicted groups ties.method conf boot.reps a numeric vector of actual response values a numeric vector of predicted response values. This vector must have the same length as actual, and the ith value of this vector needs to be the model score for the subject with the ith value of the actual vector as its actual response. an integer containing the number of rows in the gains table. The default value is 10. method of breaking ties. See the ties.method argument of the rank procedure. method to construct confidence intervals for the mean response in each row of the table. If "none", then no confidence intervals are constructed. If "normal", then critical values from the normal distribution are used. If "t", then critical values from the t distribution are used. If "boot", then 1000 bootstrap samples are drawn from each row, and the upper and lower conf.level/2 values of the distribution are used as the confidence interval endpoints. the number of bootstrap replications to use for bootstrap confidence intervals. The default value is 1000. conf.level the 1 - alpha level of the confidence interval. The default value is 0.95. optimal percents a logical indicated whether the user wants optimal lift indices to be computed. Optimal lift indices represent the results that would be achieved from an optimal ranking of subjects. a logical that indicates whether to print the mean responses and predicted responses in percent form. gains returns an S3 object of class gains. The function print.gains can be used to print the results. The function plot.gains can be used to plot the mean response and cumulative mean response for each group. An object of class gains is a list containing the following components:

4 gains depth cumulative percentage of file covered by each row of the gains table (e.g. 10,20,30,...,100). obs cume.obs mean.resp number of observations in each row. cumulative number of observations in each row. mean response in each row. cume.mean.resp cumulative mean response in each row. cume.pct.of.total cumulative percent of total response. lift cume.lift lift index. The lift index is 100 times the mean.resp for the row divided by the cume.mean.resp for the last row. cumulative lift index. It is 100 times the cume.mean.resp for the row divided by the cume.mean.resp for the last row. mean.prediction mean predicted response in each row. min.prediction minimum predicted response in each row. min.prediction and max.prediction can be used to construct decision rules for applying the model. max.prediction maximum predicted response in each row. conf optimal num.groups percents conf.lower conf.upper opt.lift opt.cume.lift the argument given for conf. the argument given for optimal. the number of rows in the gains table. This will equal groups unless there are fewer distinct predicted values than groups. the argument given for percents. lower confidence limit for each row. Only included if confidence intervals are requested in the gains table. upper confidence limit for each row. Only included if confidence intervals are requested in the gains table. optimal lift index. The lift index achieved by an optimal ranking of subjects in the file. Only included if optimal lift is requested in the gains table. optimal cumulative lift index. The cumulative lift by an optimal ranking. Only included if optimal lift is requested in the gains table. See Also print.gains for printing the table in a nice way. plot.gains for drawing a graph representing the output. (This graph is sometimes called a lift chart.) Examples data(ciascores) with(subset(ciascores,train==0), gains(actual=cellphonespp, predicted=predols, optimal=true))

MineThatData 5 MineThatData MineThatData E-Mail Analytics Challenge Data With Predictions This data set contains information about purchases from an apparel company during a two-week response window. It is based on a dataset used for an analytics challenge on the MineThatData blog in 2008. Predictions from two different binary response models and two different spend models, conditional on response, are included. Finally, there is an indicator for each customer indicating whether the customer was used in the training sample for the models. MineThatData Format a data frame containing 64000 rows and 7 columns. conversion: 0/1 indicator of whether the customer purchased merchandise in the two-week response window. spend: Amount spent in dollars during the two-week response window. train: 0/1 indicator of whether the observation was used to construct the predictive models. logistic.score: Estimated response probability from a logistic regression. svm.score: Estimated response probability from a support vector machine. linear.score: Estimated revenue ("spend"), conditional on purchase, from the linear regression. rf.score: Estimated "spend", conditional on purchase, from the random forest. Source The MineThatData E-Mail Analytics and Data Mining Challenge http://blog.minethatdata. com/2008/03/minethatdata-e-mail-analytics-and-data.html plot.gains Plotting Gains Table Objects Plot method for objects of class gains. These plots are sometimes called lift charts.

6 plot.gains ## S3 method for class 'gains' plot(x, y=null, xlab="depth of File", ylab="mean Response", type="b", col=c("red3","bisque4","blue4"), pch=c(1,1,1), lty=c(1,1,1), legend=c( "Mean Response","Cumulative Mean Response","Mean Predicted Response"), ylim=c(min(c(x$mean.resp,x$mean.prediction)), max(c(x$mean.resp,x$mean.prediction))), main="gains Table Plot",...) Arguments x y xlab ylab type col pch lty legend ylim main an object of class gains. included for compatability with the plot generic but is not used. a title for the x axis. See title. a title for the y axis. See title. what type of plot should be drawn. The default is "b" for points and lines. vector of length 3 specifying the colors for the series of mean response rates, cumulative mean response rates, and mean predicted response rates, respectively. vector of length 3 specifying the plotting characters for the series of mean response rates, cumulative mean response rates, and mean predicted response rates, respectively. vector of length 3 specifying the line types for the series of mean response rates, cumulative mean response rates, and mean predicted response rates, respectively. character or expression vector of length 3 specifying the legend descriptions for the series of mean response rates, cumulative mean response rates, and mean predicted response rates, respectively. the y limits of the plot. See plot.window. an overall title for the plot. See title.... additional arguments to plot. See Also gains, plot. Examples data(ciascores) ## Not run: plot(with(subset(ciascores,train==0), gains(actual=cellphonespp, predicted=predols, optimal=true)), main="test Gains Table Plot") ## End(Not run)

print.gains 7 print.gains Printing Gains Table Objects Print method for objects of class gains. ## S3 method for class 'gains' print(x, digits=2,...) Arguments x digits See Also an object of class gains. minimum number of significant digits to print. See print.default.... additional arguments to print. gains, print. Examples data(ciascores) print(with(subset(ciascores,train==0), gains(actual=cellphonespp, predicted=predols, optimal=true)),digits=2)

Index Topic datasets ciascores, 2 MineThatData, 5 Topic misc gains, 3 plot.gains, 5 print.gains, 7 ciascores, 2 gains, 3 MineThatData, 5 plot.gains, 5 print.gains, 7 8