Package milr. June 8, 2017

Similar documents
Package ECctmc. May 1, 2018

Package TVsMiss. April 5, 2018

Package vinereg. August 10, 2018

Package PUlasso. April 7, 2018

Package TANDEM. R topics documented: June 15, Type Package

Package validara. October 19, 2017

Package fastdummies. January 8, 2018

Package clusternomics

Package mixsqp. November 14, 2018

Package MTLR. March 9, 2019

Package cattonum. R topics documented: May 2, Type Package Version Title Encode Categorical Features

Package semver. January 6, 2017

Package bisect. April 16, 2018

Package wrswor. R topics documented: February 2, Type Package

Package RAMP. May 25, 2017

Package fasttextr. May 12, 2017

Package GauPro. September 11, 2017

Package glmnetutils. August 1, 2017

Package nmslibr. April 14, 2018

Package ordinalnet. December 5, 2017

Package elmnnrcpp. July 21, 2018

Package NNLM. June 18, 2018

Package recosystem. September 2, 2017

Package skm. January 23, 2017

Package TrafficBDE. March 1, 2018

Package kdtools. April 26, 2018

Package flam. April 6, 2018

Package triebeard. August 29, 2016

Package geojsonsf. R topics documented: January 11, Type Package Title GeoJSON to Simple Feature Converter Version 1.3.

Package rvinecopulib

Package sbrl. R topics documented: July 29, 2016

Package BiDimRegression

Package logisticpca. R topics documented: August 29, 2016

Package vip. June 15, 2018

Package balance. October 12, 2018

Package inca. February 13, 2018

Package strat. November 23, 2016

Package kamila. August 29, 2016

Package colf. October 9, 2017

Package BigVAR. R topics documented: April 3, Type Package

Package ggimage. R topics documented: November 1, Title Use Image in 'ggplot2' Version 0.0.7

Package embed. November 19, 2018

Package glinternet. June 15, 2018

Package rpst. June 6, 2017

Package bife. May 7, 2017

Package EBglmnet. January 30, 2016

Package fastrtext. December 10, 2017

Package vennlasso. January 25, 2019

Package gppm. July 5, 2018

Package auctestr. November 13, 2017

Package ramcmc. May 29, 2018

Package rucrdtw. October 13, 2017

Package sparsenetgls

Package ggimage. R topics documented: December 5, Title Use Image in 'ggplot2' Version 0.1.0

Package queuecomputer

Package deductive. June 2, 2017

Package hiernet. March 18, 2018

Package farver. November 20, 2018

Package Cubist. December 2, 2017

Package robustreg. R topics documented: April 27, Version Date Title Robust Regression Functions

Package psda. September 5, 2017

Package sgmcmc. September 26, Type Package

Package jdx. R topics documented: January 9, Type Package Title 'Java' Data Exchange for 'R' and 'rjava'

Package CVR. March 22, 2017

Package projector. February 27, 2018

Package labelvector. July 28, 2018

Package kerasformula

Package spark. July 21, 2017

Package FWDselect. December 19, 2015

Package diffusr. May 17, 2018

Package mmpa. March 22, 2017

Package FSelectorRcpp

Package nngeo. September 29, 2018

Package pdfsearch. July 10, 2018

Package humanize. R topics documented: April 4, Version Title Create Values for Human Consumption

Package preprocomb. June 26, 2016

Package coga. May 8, 2018

Package bayescl. April 14, 2017

Package bigreadr. R topics documented: August 13, Version Date Title Read Large Text Files

Package treedater. R topics documented: May 4, Type Package

Package rtext. January 23, 2019

Package diagis. January 25, 2018

Package cosinor. February 19, 2015

Package SEMrushR. November 3, 2018

Package breakdown. June 14, 2018

Package ssa. July 24, 2016

Package widyr. August 14, 2017

Package SSLASSO. August 28, 2018

Package sigqc. June 13, 2018

Package SimilaR. June 21, 2018

Package modmarg. R topics documented:

Package geogrid. August 19, 2018

Package enpls. May 14, 2018

Package ECOSolveR. February 18, 2018

Package catenary. May 4, 2018

Package combiter. December 4, 2017

Package KernelKnn. January 16, 2018

Package misclassglm. September 3, 2016

Package RNAseqNet. May 16, 2017

Package rnn. R topics documented: June 21, Title Recurrent Neural Network Version 0.8.1

Transcription:

Type Package Package milr June 8, 2017 Title Multiple-Instance Logistic Regression with LASSO Penalty Version 0.3.0 Date 2017-06-05 The multiple instance data set consists of many independent subjects (called bags) and each subject is composed of several components (called instances). The outcomes of such data set are binary or categorical responses, and, we can only observe the subject-level outcomes. For example, in manufacturing processes, a subject is labeled as ``defective'' if at least one of its own components is defective, and otherwise, is labeled as ``non-defective''. The 'milr' package focuses on the predictive model for the multiple instance data set with binary outcomes and performs the maximum likelihood estimation with the Expectation-Maximization algorithm under the framework of logistic regression. Moreover, the LASSO penalty is attached to the likelihood function for simultaneous parameter estimation and variable selection. Author Ping-Yang Chen [aut, cre], ChingChuan Chen [aut], Chun-Hao Yang [aut], Sheng-Mao Chang [aut] URL https://github.com/pingyangchen/milr BugReports https://github.com/pingyangchen/milr/issues Maintainer Ping-Yang Chen <pychen.ping@gmail.com> Depends R (>= 3.2.3) Imports utils, piper (>= 0.5), numderiv, glmnet, Rcpp (>= 0.12.0), RcppParallel LinkingTo Rcpp, RcppArmadillo, RcppParallel Suggests testthat, knitr, knitcitations, rmarkdown, data.table, ggplot2, plyr LazyData yes SystemRequirements GNU make License MIT + file LICENSE 1

2 milr-package RoxygenNote 6.0.1 VignetteBuilder knitr NeedsCompilation yes Repository CRAN Date/Publication 2017-06-08 16:37:15 UTC R topics documented: milr-package........................................ 2 DGP............................................. 3 fitted.milr.......................................... 3 fitted.softmax........................................ 4 logit............................................. 4 milr............................................. 5 predict.milr......................................... 6 predict.softmax....................................... 7 softmax........................................... 8 Index 9 milr-package The milr package: multiple-instance logistic regression with lasso penalty The multiple instance data set consists of many independent subjects (called bags) and each subject is composed of several components (called instances). The outcomes of such data set are binary or multinomial, and, we can only observe the subject-level outcomes. For example, in manufactory processes, a subject is labeled as "defective" if at least one of its own components is defective, and otherwise, is labeled as "non-defective". The milr package focuses on the predictive model for the multiple instance data set with binary outcomes and performs the maximum likelihood estimation with the Expectation-Maximization algorithm under the framework of logistic regression. Moreover, the LASSO penalty is attached to the likelihood function for simultaneous parameter estimation and variable selection. References 1. Chen, R.-B., Cheng, K.-H., Chang, S.-M., Jeng, S.-L., Chen, P.-Y., Yang, C.-H., and Hsia, C.-C. (2016). Multiple-Instance Logistic Regression with LASSO Penalty. arxiv:1607.03615 [stat.ml].

DGP 3 DGP DGP: data generation Generating the multiple-instance data set. DGP(n, m, beta) n m beta an integer. The number of bags. an integer or vector of length n. If m is an integer, each bag has the identical number of instances, m. If m is a vector, the ith bag has m[i] instances. a vector. The true regression coefficients. Value a list including (1) bag-level labels, Z, (2) the design matrix, X, and (3) bag ID of each instance, ID. Examples data1 <- DGP(50, 3, runif(10, -5, 5)) data2 <- DGP(50, sample(3:5, 50, TRUE), runif(10, -5, 5)) fitted.milr Fitted Response of milr Fits Fitted Response of milr Fits ## S3 method for class 'milr' fitted(object, type = "bag",...) object type A fitted obejct of class inheriting from "milr". The type of fitted response required. Default is "bag", the fitted labels of bags. The "instance" option returns the fitted labels of instances.... further arguments passed to or from other methods.

4 logit fitted.softmax Fitted Response of softmax Fits Fitted Response of softmax Fits ## S3 method for class 'softmax' fitted(object, type = "bag",...) object type A fitted obejct of class inheriting from "softmax". The type of fitted response required. Default is "bag", the fitted labels of bags. The "instance" option returns the fitted labels of instances.... further arguments passed to or from other methods. logit logit link function calculate the values of logit link logit(x, beta) X beta A matrix, the design matrix. A vector, the coefficients. Value An vector of the values of logit link.

milr 5 milr Maximum likelihood estimation of multiple-instance logistic regression with LASSO penalty Please refer to milr-package. milr(y, x, bag, lambda = 0, numlambda = 20L, lambdacriterion = "BIC", nfold = 10L, maxit = 500L) y Value a vector. Bag-level binary labels. x the design matrix. The number of rows of x must be equal to the length of y. bag lambda a vector, bag id. the tuning parameter for LASSO-penalty. If lambda is a real value number, then the milr fits the model based on this lambda value. Second, if lambda is vector, then the optimal lambda value would be be chosen based on the optimality criterion, lambdacriterion. Finally, if lambda = -1, then the optimal lambda value would be chosen automatically. The default is 0. numlambda An integer, the maximum length of LASSO-penalty. in atuo-tunning mode (lambda = -1). The default is 20. lambdacriterion a string, the used optimality criterion for tuning the lambda value. It can be specified with lambdacriterion = "BIC" or lambdacriterion = "deviance". nfold an integer, the number of fold for cross-validation to choose the optimal lambda when lambdacriterion = "deviance". maxit an integer, the maximum iteration for the EM algorithm. The default is 500. An object with S3 class "milr". lambdaa vector of candidate lambda values. cva vector of predictive deviance via nfold-fold cross validation when lambdacriterion = "deviance". deviancea vector of deviance of candidate model for each candidate lambda value. BICa vector of BIC of candidate model for each candidate lambda value. best_indexan integer, indicates the index of the best model among candidate lambda values. best_modela list of the information for the best model including deviance (not cv deviance), BIC, chosen lambda, coefficients, fitted values, log-likelihood and variances of coefficients.

6 predict.milr Examples set.seed(100) beta <- runif(5, -5, 5) traindata <- DGP(70, 3, beta) testdata <- DGP(30, 3, beta) # default (not use LASSO) milr_result <- milr(traindata$z, traindata$x, traindata$id) coef(milr_result) # coefficients fitted(milr_result) fitted(milr_result, type = "instance") # fitted instance labels summary(milr_result) # summary milr predict(milr_result, testdata$x, testdata$id) predict(milr_result, testdata$x, testdata$id, type = "instance") # predicted instance labels # use BIC to choose penalty milr_result <- milr(traindata$z, traindata$x, traindata$id, exp(seq(log(0.01), log(50), length = 30))) coef(milr_result) # coefficients fitted(milr_result) fitted(milr_result, type = "instance") # fitted instance labels summary(milr_result) # summary milr predict(milr_result, testdata$x, testdata$id) predict(milr_result, testdata$x, testdata$id, type = "instance") # predicted instance labels # use auto-tuning milr_result <- milr(traindata$z, traindata$x, traindata$id, lambda = -1, numlambda = 20) coef(milr_result) # coefficients fitted(milr_result) fitted(milr_result, type = "instance") # fitted instance labels summary(milr_result) # summary milr predict(milr_result, testdata$x, testdata$id) predict(milr_result, testdata$x, testdata$id, type = "instance") # predicted instance labels # use cv in auto-tuning milr_result <- milr(traindata$z, traindata$x, traindata$id, lambda = -1, numlambda = 20, lambdacriterion = "deviance") coef(milr_result) # coefficients fitted(milr_result) fitted(milr_result, type = "instance") # fitted instance labels summary(milr_result) # summary milr predict(milr_result, testdata$x, testdata$id) predict(milr_result, testdata$x, testdata$id, type = "instance") # predicted instance labels predict.milr Predict Method for milr Fits Predict Method for milr Fits

predict.softmax 7 ## S3 method for class 'milr' predict(object, newdata = NULL, bag_newdata = NULL, type = "bag",...) object newdata bag_newdata type A fitted obejct of class inheriting from "milr". Default is NULL. A matrix with variables to predict. Default is NULL. A vector. The labels of instances to bags. If newdata and bag_newdata both are NULL, return the fitted result. The type of prediction required. Default is "bag", the predicted labels of bags. The "instance" option returns the predicted labels of instances.... further arguments passed to or from other methods. predict.softmax Predict Method for softmax Fits Predict Method for softmax Fits ## S3 method for class 'softmax' predict(object, newdata = NULL, bag_newdata = NULL, type = "bag",...) object newdata bag_newdata type A fitted obejct of class inheriting from "softmax". Default is NULL. A matrix with variables to predict. Default is NULL. A vector. The labels of instances to bags. If newdata and bag_newdata both are NULL, return the fitted result. The type of prediction required. Default is "bag", the predicted labels of bags. The "instance" option returns the predicted labels of instances.... further arguments passed to or from other methods.

8 softmax softmax Multiple-instance logistic regression via softmax function This function calculates the alternative maximum likelihood estimation for multiple-instance logistic regression through a softmax function (Xu and Frank, 2004; Ray and Craven, 2005). softmax(y, x, bag, alpha = 0,...) y Value a vector. Bag-level binary labels. x the design matrix. The number of rows of x must be equal to the length of y. bag alpha a vector, bag id. A non-negative realnumber, the softmax parameter.... arguments to be passed to the optim function. a list including coefficients and fitted values. References Examples 1. S. Ray, and M. Craven. (2005) Supervised versus multiple instance learning: An empirical comparsion. in Proceedings of the 22nd International Conference on Machine Learnings, ACM, 697 704. 2. X. Xu, and E. Frank. (2004) Logistic regression and boosting for labeled bags of instances. in Advances in Knowledge Discovery and Data Mining, Springer, 272 281. set.seed(100) beta <- runif(10, -5, 5) traindata <- DGP(70, 3, beta) testdata <- DGP(30, 3, beta) # Fit softmax-milr model S(0) softmax_result <- softmax(traindata$z, traindata$x, traindata$id, alpha = 0) coef(softmax_result) # coefficients fitted(softmax_result) fitted(softmax_result, type = "instance") # fitted instance labels predict(softmax_result, testdata$x, testdata$id) predict(softmax_result, testdata$x, testdata$id, type = "instance") # predicted instance labels # Fit softmax-milr model S(3) softmax_result <- softmax(traindata$z, traindata$x, traindata$id, alpha = 3)

Index DGP, 3 fitted.milr, 3 fitted.softmax, 4 logit, 4 milr, 5 milr-package, 2, 5 predict.milr, 6 predict.softmax, 7 softmax, 8 9