Package FisherEM. February 19, 2015


Type: Package
Title: The Fisher-EM algorithm
Version: 1.4
Date: 2013-06-21
Author: Charles Bouveyron and Camille Brunet
Maintainer: Camille Brunet <camille.brunet@gmail.com>
Depends: MASS, elasticnet
Description: The FisherEM package provides an efficient algorithm for the unsupervised classification of high-dimensional data. The Fisher-EM algorithm models and clusters the data in a discriminative and low-dimensional latent subspace. It also provides a low-dimensional representation of the clustered data. A sparse version of the Fisher-EM algorithm is also provided.
License: GPL-2
LazyLoad: yes
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2013-06-28 18:44:30

R topics documented:
FisherEM-package
fem
fem.ari
plot.fem
sfem
summary.fem

FisherEM-package        The FisherEM package

Description
The FisherEM package provides an efficient algorithm for the unsupervised classification of high-dimensional data. The Fisher-EM algorithm models and clusters the data in a discriminative and low-dimensional latent subspace. It also provides a low-dimensional representation of the clustered data.

Details
Package: FisherEM
Type: Package
Version: 1.2
Date: 2012-07-09
License: GPL-2
LazyLoad: yes

Author(s)
Charles Bouveyron and Camille Brunet
Maintainer: Camille Brunet <camille.brunet@gmail.com>

References
Charles Bouveyron and Camille Brunet (2012), "Simultaneous model-based clustering and visualization in the Fisher discriminative subspace", Statistics and Computing, 22(1), 301-324.
Charles Bouveyron and Camille Brunet (2012), "Discriminative variable selection for clustering with the sparse Fisher-EM algorithm", preprint HAL no. 00685183.
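A minimal end-to-end sketch of the workflow the package supports, assembled from the Usage and Examples sections below (the iris data set is used purely as an illustration):

library(FisherEM)
data(iris)
X <- iris[, -5]                        # numeric data matrix (no factors, no missing values)
res <- fem(X, K = 2:5, model = 'AkB')  # fit the DLM model for K = 2..5, select K by ICL
summary(res)                           # chosen DLM model (and loading matrix if p < 10)
plot(res)                              # model selection, log-likelihood, subspace view
fem.ari(res, as.numeric(iris[, 5]))    # agreement of the partition with the species labels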

fem             The Fisher-EM algorithm

Description
The Fisher-EM algorithm is a subspace clustering method for high-dimensional data. It is based on the Gaussian mixture model and on the idea that the data live in a common and low-dimensional subspace. An EM-like algorithm estimates both the discriminative subspace and the parameters of the mixture model.

Usage
fem(Y, K=2:6, model='AkjB', method='reg', crit='icl', maxit=50, eps=1e-6,
    init='kmeans', nstart=25, Tinit=c(), kernel='', disp=FALSE)

Arguments
Y: The data matrix. Categorical variables and missing values are not allowed.
K: An integer vector specifying the numbers of mixture components (clusters) among which the model selection criterion will choose the most appropriate number of groups. Default is 2:6.
model: A vector of discriminative latent mixture (DLM) models to fit. There are 12 different models: "DkBk", "DkB", "DBk", "DB", "AkjBk", "AkjB", "AkBk", "AkB", "AjBk", "AjB", "ABk", "AB". The option "all" runs the Fisher-EM algorithm on all 12 DLM models and selects the best one according to the maximum value of the model selection criterion.
method: The method used for fitting the projection matrix associated with the discriminative subspace. Three methods are available: 'svd', 'reg' and 'gs'. The 'reg' method is the default.
crit: The model selection criterion used for selecting the most appropriate model for the data. There are 3 possibilities: "bic", "aic" or "icl". Default is "icl".
maxit: The maximum number of iterations before the Fisher-EM algorithm stops.
eps: The threshold on the likelihood differences below which the Fisher-EM algorithm stops.
init: The initialization method for the Fisher-EM algorithm. There are 4 options: "random" for a randomized initialization, "kmeans" for an initialization by the k-means algorithm, "hclust" for a hierarchical clustering initialization, or "user" for a specific initialization through the parameter "Tinit". Default is "kmeans". Note that for "kmeans" and "random", several initializations are tried and the one associated with the highest likelihood is kept (see "nstart").
nstart: The number of restarts when the initialization is "kmeans" or "random". In such a case, the initialization associated with the highest likelihood is kept.
Tinit: An n x K matrix of posterior probabilities for initializing the algorithm (each row corresponds to an individual).
kernel: The kernel used by the algorithm; it makes it possible to deal with the n < p problem. By default, no kernel ('') is used, but the user can choose between 3 options: "linear", "sigmoid" or "rbf".
disp: If TRUE, some messages are printed during the clustering. Default is FALSE.

Value
A list is returned:
K: The number of groups.
cls: The group membership of each individual estimated by the Fisher-EM algorithm.
P: The posterior probabilities of each individual for each group.
U: The loading matrix which determines the orientation of the discriminative subspace.
mean: The estimated means in the subspace.
my: The estimated means in the observation space.
prop: The estimated mixture proportions.
D: The covariance matrices in the subspace.
aic: The value of the Akaike information criterion.
bic: The value of the Bayesian information criterion.
icl: The value of the integrated completed likelihood criterion.
loglik: The log-likelihood values computed at each iteration of the FEM algorithm.
ll: The log-likelihood value obtained at the last iteration of the FEM algorithm.
method: The method used.
call: The call of the function.
plot: Some information to pass to the plot.fem function.
crit: The model selection criterion used.

Author(s)
Charles Bouveyron and Camille Brunet

References
Charles Bouveyron and Camille Brunet (2012), "Simultaneous model-based clustering and visualization in the Fisher discriminative subspace", Statistics and Computing, 22(1), 301-324.
Charles Bouveyron and Camille Brunet (2013), "Discriminative variable selection for clustering with the sparse Fisher-EM algorithm", Computational Statistics, to appear.

See Also
sfem, plot.fem, fem.ari, summary.fem

Examples
data(iris)
res = fem(iris[,-5], K=2:5, model='AkB')
summary(res)
plot(res)
fem.ari(res, as.numeric(iris[,5]))
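Since fem() returns a plain list, the fitted quantities listed under Value can be inspected directly. A brief sketch using the documented field names (the cross-tabulation against the species labels is an illustrative check, not part of the package API):

res <- fem(iris[, -5], K = 2:5, model = 'AkB')
res$K                        # selected number of groups
res$crit; res$icl            # selection criterion used and its value
head(res$P)                  # posterior probabilities (one row per individual)
table(res$cls, iris[, 5])    # estimated partition vs. the true labels
res$U                        # loadings of the discriminative subspace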

fem.ari         Adjusted Rand index

Description
The function computes the adjusted Rand index (ARI), which allows two clustering partitions to be compared.

Usage
fem.ari(x, y)

Arguments
x: A fem object containing the first partition to compare.
y: The second partition to compare (as a vector).

Value
ari: The value of the ARI.

See Also
fem, sfem, plot.fem, summary.fem

Examples
data(iris)
res = fem(iris[,-5], K=2:5, model='AkB')
summary(res)
plot(res)
fem.ari(res, as.numeric(iris[,5]))

plot.fem        The plot function for fem objects

Description
This function plots different information about fem objects, such as the model selection, the log-likelihood evolution and a visualization of the clustered data in the discriminative subspace fitted by the Fisher-EM algorithm.

Usage
## S3 method for class 'fem'
plot(x, frame=0, crit=c(), ...)

Arguments
x: The fem object.
frame: 0: all plots; 1: selection of the number of groups; 2: log-likelihood; 3: projection of the data into the discriminative subspace.
crit: The model selection criterion to display. Default is the criterion used in the fem function ('icl' by default).
...: Additional options to pass to the plot function.

See Also
fem, sfem, fem.ari, summary.fem

Examples
data(iris)
res = fem(iris[,-5], K=2:5, model='AkB')
summary(res)
plot(res)
fem.ari(res, as.numeric(iris[,5]))
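The frame argument selects which diagnostic is drawn. A short sketch of requesting the panels one at a time, following the Arguments table above:

res <- fem(iris[, -5], K=2:5, model='AkB')
plot(res, frame=1, crit='bic')   # number-of-groups selection, displaying BIC instead of ICL
plot(res, frame=2)               # log-likelihood evolution across the EM iterations
plot(res, frame=3)               # clustered data projected into the discriminative subspace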

sfem            The sparse Fisher-EM algorithm

Description
The sparse Fisher-EM algorithm is a sparse version of the Fisher-EM algorithm. The sparsity is introduced within the F step, which estimates the discriminative subspace. The sparsity on U is obtained by adding an l1 penalty to the optimization problem of the F step.

Usage
sfem(Y, K=2:6, model='AkjB', method='reg', crit='icl', maxit=50, eps=1e-6,
     init='kmeans', nstart=25, Tinit=c(), kernel='', disp=FALSE, l1=0.1, l2=0, nbit=2)

Arguments
Y: The data matrix. Categorical variables and missing values are not allowed.
K: An integer vector specifying the numbers of mixture components (clusters) among which the model selection criterion will choose the most appropriate number of groups. Default is 2:6.
model: A vector of discriminative latent mixture (DLM) models to fit. There are 12 different models: "DkBk", "DkB", "DBk", "DB", "AkjBk", "AkjB", "AkBk", "AkB", "AjBk", "AjB", "ABk", "AB". The option "all" runs the Fisher-EM algorithm on all 12 DLM models and selects the best one according to the maximum value of the model selection criterion.
method: The method used for fitting the projection matrix associated with the discriminative subspace. Three methods are available: 'svd', 'reg' and 'gs'. The 'reg' method is the default.
crit: The model selection criterion used for selecting the most appropriate model for the data. There are 3 possibilities: "bic", "aic" or "icl". Default is "icl".
maxit: The maximum number of iterations before the Fisher-EM algorithm stops.
eps: The threshold on the likelihood differences below which the Fisher-EM algorithm stops.
init: The initialization method for the Fisher-EM algorithm. There are 4 options: "random" for a randomized initialization, "kmeans" for an initialization by the k-means algorithm, "hclust" for a hierarchical clustering initialization, or "user" for a specific initialization through the parameter "Tinit". Default is "kmeans". Note that for "kmeans" and "random", several initializations are tried and the one associated with the highest likelihood is kept (see "nstart").
nstart: The number of restarts when the initialization is "kmeans" or "random". In such a case, the initialization associated with the highest likelihood is kept.
Tinit: An n x K matrix of posterior probabilities for initializing the algorithm (each row corresponds to an individual).
kernel: The kernel used by the algorithm; it makes it possible to deal with the n < p problem. By default, no kernel ('') is used, but the user can choose between 3 options: "linear", "sigmoid" or "rbf".
disp: If TRUE, some messages are printed during the clustering. Default is FALSE.
l1: The l1 penalty value (lasso), which has to be in [0,1]. A small value (close to 0) leads to a very sparse loading matrix, whereas a value equal to 1 corresponds to no sparsity. Default is 0.1.
l2: The l2 penalty value (elastic net). Default is 0 (no regularization).
nbit: The number of iterations for the lasso procedure. Default is 2.

Value
A list is returned:
K: The number of groups.
cls: The group membership of each individual estimated by the Fisher-EM algorithm.
P: The posterior probabilities of each individual for each group.
U: The loading matrix which determines the orientation of the discriminative subspace.
mean: The estimated means in the subspace.
my: The estimated means in the observation space.
prop: The estimated mixture proportions.
D: The covariance matrices in the subspace.
aic: The value of the Akaike information criterion.
bic: The value of the Bayesian information criterion.
icl: The value of the integrated completed likelihood criterion.
loglik: The log-likelihood values computed at each iteration of the FEM algorithm.
ll: The log-likelihood value obtained at the last iteration of the FEM algorithm.
method: The method used.
call: The call of the function.
plot: Some information to pass to the plot.fem function.
crit: The model selection criterion used.
l1: The l1 value.
l2: The l2 value.

Author(s)
Charles Bouveyron and Camille Brunet

References
Charles Bouveyron and Camille Brunet (2012), "Simultaneous model-based clustering and visualization in the Fisher discriminative subspace", Statistics and Computing, 22(1), 301-324.
Charles Bouveyron and Camille Brunet (2013), "Discriminative variable selection for clustering with the sparse Fisher-EM algorithm", Computational Statistics, to appear.

See Also
fem, plot.fem, fem.ari, summary.fem

Examples
data(iris)
res = sfem(iris[,-5], K=3, model='AkB', l1=seq(.01,.3,.05))
summary(res)
plot(res)
fem.ari(res, as.numeric(iris[,5]))
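One way to see the effect of the l1 penalty is to compare the loading matrix U returned by fem() with the one returned by sfem(): per the l1 description above, the sparse version is expected to shrink the loadings of non-discriminative variables to exactly zero. A hedged sketch (the penalty value 0.2 is an arbitrary choice for illustration):

data(iris)
dense  <- fem(iris[, -5],  K=3, model='AkB')
sparse <- sfem(iris[, -5], K=3, model='AkB', l1=0.2)
round(dense$U, 2)    # dense loadings: every variable contributes
round(sparse$U, 2)   # sparse loadings: some entries expected to be exactly zero
sum(sparse$U == 0)   # count of zeroed loadings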

summary.fem     The summary function for fem objects

Description
This function summarizes fem objects. In particular, it indicates which DLM model has been chosen and displays the loading matrix U if the original dimension is smaller than 10.

Usage
## S3 method for class 'fem'
summary(object, ...)

Arguments
object: The fem object.
...: Additional options to pass to the summary function.

See Also
fem, sfem, fem.ari, plot.fem

Examples
data(iris)
res = fem(iris[,-5], K=2:5, model='AkB')
summary(res)
plot(res)
fem.ari(res, as.numeric(iris[,5]))

Index
fem
fem.ari
FisherEM (FisherEM-package)
FisherEM-package
plot.fem
sfem
summary.fem