Package SoftClustering

Similar documents
Package CuCubes. December 9, 2016

Package spark. July 21, 2017

Package Numero. November 24, 2018

Collaborative Rough Clustering

Package datasets.load

Package glassomix. May 30, 2013

Package subtype. February 20, 2015

Package clustering.sc.dp

Package ECctmc. May 1, 2018

Package RCA. R topics documented: February 29, 2016

Package StVAR. February 11, 2017

Package cgh. R topics documented: February 19, 2015

Package MeanShift. R topics documented: August 29, 2016

Package mdftracks. February 6, 2017

Package ArrayBin. February 19, 2015

Package CatEncoders. March 8, 2017

Package GFD. January 4, 2018

Package inca. February 13, 2018

Package bdots. March 12, 2018

Package anidom. July 25, 2017

Package PRISMA. May 27, 2018

Package nlnet. April 8, 2018

Package HMMCont. February 19, 2015

Package ssd. February 20, Index 5

Package GenSA. R topics documented: January 17, Type Package Title Generalized Simulated Annealing Version 1.1.

Package manet. September 19, 2017

Package RMCriteria. April 13, 2018

Package diffusr. May 17, 2018

Package lga. R topics documented: February 20, 2015

Package crossword.r. January 19, 2018

Package spatialtaildep

Package DRaWR. February 5, 2016

Package marqlevalg. February 20, 2015

Package braqca. February 23, 2018

Package bisect. April 16, 2018

Package cwm. R topics documented: February 19, 2015

Package ETC. February 19, 2015

Package lazydata. December 4, 2016

Package tmpm. February 29, 2016

Package weco. May 4, 2018

Package seg. February 15, 2013

Package hsiccca. February 20, 2015

Package CMPControl. February 19, 2015

Package geojsonsf. R topics documented: January 11, Type Package Title GeoJSON to Simple Feature Converter Version 1.3.

Package kamila. August 29, 2016

Package SSLASSO. August 28, 2018

Package frt. R topics documented: February 19, 2015

Package LVGP. November 14, 2018

Package binomlogit. February 19, 2015

Package PedCNV. February 19, 2015

Package genelistpie. February 19, 2015

Package arulesnbminer

Attribute Reduction using Forward Selection and Relative Reduct Algorithm

Package mmtsne. July 28, 2017

Package rucrdtw. October 13, 2017

Package BASiNET. October 2, 2018

Package RNAseqNet. May 16, 2017

Package FractalParameterEstimation

Package extweibquant

Package pomdp. January 3, 2019

Package Mondrian. R topics documented: March 4, Type Package

Package crossrun. October 8, 2018

Package fusionclust. September 19, 2017

Package FastKNN. February 19, 2015

Package RSKC. R topics documented: August 28, Type Package Title Robust Sparse K-Means Version Date Author Yumi Kondo

Package wskm. July 8, 2015

Package sspline. R topics documented: February 20, 2015

Package rferns. November 15, 2017

Package svapls. February 20, 2015

Package corclass. R topics documented: January 20, 2016

Package QCEWAS. R topics documented: February 1, Type Package

Package kirby21.base

Package esabcv. May 29, 2015

Package fractalrock. February 19, 2015

Package metaheur. June 30, 2016

Package geogrid. August 19, 2018

Package cooptrees. February 19, 2015

Package subspace. October 12, 2015

Package optimus. March 24, 2017

Package CONOR. August 29, 2013

Package glcm. August 29, 2016

Package r2d2. February 20, 2015

Package listdtr. November 29, 2016

Package SAM. March 12, 2019

Package qap. February 27, Index 5. Solve Quadratic Assignment Problems (QAP)

Package hypercube. December 15, 2017

Package MetaLonDA. R topics documented: January 22, Type Package

An Adaptive and Deterministic Method for Initializing the Lloyd-Max Algorithm

Package irrna. April 11, 2018

Package sparsereg. R topics documented: March 10, Type Package

Package ccapp. March 7, 2016

Package packcircles. April 28, 2018

Package freeknotsplines

Package JOP. February 19, 2015

Package rhor. July 9, 2018

Package arc. April 18, 2018

Package intccr. September 12, 2017

Package FPDclustering

Package flsa. February 19, 2015

Package QCAtools. January 3, 2017

Transcription:

Title Soft Clustering Algorithms Package SoftClustering February 19, 2015 It contains soft clustering algorithms, in particular approaches derived from rough set theory: Lingras & West original rough k-means, Peters' refined rough k-means, and PI rough k- means. It also contains classic k-means and a corresponding illustrative demo. Version 1.1502 Author G. Peters (Ed.) Maintainer G. Peters <peters.projects@outlook.com> Depends R (>= 3.0.2) License GPL-2 LazyData true NeedsCompilation no Repository CRAN Date/Publication 2015-02-10 13:36:07 R topics documented: createlowermshipmatrix.................................. 2 datatypeinteger....................................... 2 DemoDataC2D2a...................................... 3 HardKMeans........................................ 3 HardKMeansDemo..................................... 4 initializemeansmatrix.................................... 5 initmeansc2d2a...................................... 6 initmeansc3d2a...................................... 6 initmeansc4d2a...................................... 7 initmeansc5d2a...................................... 7 normalizematrix...................................... 8 plotroughkmeans..................................... 8 RoughKMeans_LW..................................... 9 RoughKMeans_PE..................................... 10 RoughKMeans_PI...................................... 12 RoughKMeans_SHELL................................... 13 Index 15 1

2 datatypeinteger createlowermshipmatrix Create Lower Approximation Creates a lower approximation out of an upper approximation. createlowermshipmatrix(uppermshipmatrix) uppermshipmatrix An upper approximation matrix. Returns the corresponding lower approximation. G. Peters. datatypeinteger Rough k-means Plotting Checks for integer. datatypeinteger(x) x As a replacement for is.integer(). is.integer() delivers FALSE when the variable is numeric (as superset for integer etc.) TRUE if x is integer otherwise FALSE. G. Peters.

DemoDataC2D2a 3 DemoDataC2D2a A Simple Data Set This is a simple two-dimensional data set with 200 objects in 2 clusters. It is provided as sample data set to easily test the algorithms. data(demodatac2d2a) Format A matrix [nclusters x nfeatures] = [2 x 2]. data(demodatac2d2a); plot(demodatac2d2a) HardKMeans Hard k-means HardKMeans performs classic (hard) k-means. HardKMeans(,, nclusters, maxiterations) nclusters maxiterations Matrix with the objects to be clustered. Dimension: [nobjects x nfeatures]. Select means derived from 1 = random (unity interval), 2 = maximum distances, matrix [nclusters x nfeatures] = self-defined means. Default: 2 = maximum distances. Number of clusters: Integer in [2, nobjects). Note, ncluster must be set even when is a matrix. For transparency, nclusters will not be overridden by the number of clusters derived from. Default: nclusters=2. Maximum number of iterations. Default: maxiterations=100.

4 HardKMeansDemo $upperapprox: Obtained upper approximations [nobjects x nclusters]. Note: Apply function createlowermshipmatrix() to obtain lower approximations; and for the boundary: boundary = upperapprox - lowerapp $clustermeans: Obtained means [nclusters x nfeatures]. $niterations: Number of iterations. M. Goetz, G. Peters, Y. Richter, D. Sacker, T. Wochinger. References Lloyd, S.P. (1982) Least squares quantization in PCM. IEEE Transactions on Information Theory 28, 128 137. Peters, G.; Crespo, F.; Lingras, P. and Weber, R. (2013) Soft clustering fuzzy and rough approaches and their extensions and derivatives. International Journal of Approximate Reasoning 54, 307 322. # An illustrative example clustering the sample data set DemoDataC2D2a.txt HardKMeans(DemoDataC2D2a, 2, 2, 100) HardKMeansDemo Hard k-means Demo HardKMeansDemo shows how hard k-means performs stepwise. The number of features is set to 2 and the maximum number of iterations is 100. HardKMeansDemo(,, nclusters) nclusters Matrix with the objects to be clustered. Dimension: [nobjects x nfeatures]. Default: no default set. Select means derived from 1 = random (unity interval), 2 = maximum distances, matrix [nclusters x nfeatures=2] = self-defined means. Default: =1 (random). Number of clusters: Integer in [2, min(5, nobjects-1)]. Note, ncluster must be set even when is a matrix. For transparency, nclusters will not be overridden by the number of clusters derived from. Default: nclusters=2.

initializemeansmatrix 5 None. G. Peters. References Lloyd, S.P. (1982) Least squares quantization in PCM. IEEE Transactions on Information Theory 28, 128 137. Peters, G.; Crespo, F.; Lingras, P. and Weber, R. (2013) Soft clustering fuzzy and rough approaches and their extensions and derivatives. International Journal of Approximate Reasoning 54, 307 322. # Clustering the data set DemoDataC2D2a.txt (nclusters=2, random initial means) HardKMeansDemo(DemoDataC2D2a,1,2) # Clustering the data set DemoDataC2D2a.txt (nclusters=2,3,4; initially set means) HardKMeansDemo(DemoDataC2D2a,initMeansC2D2a,3) HardKMeansDemo(DemoDataC2D2a,initMeansC3D2a,3) HardKMeansDemo(DemoDataC2D2a,initMeansC4D2a,4) # Clustering the data set DemoDataC2D2a.txt (nclusters=5, initially set means) # It leads to an empty cluster: a (rare) case of termination of k-means. HardKMeansDemo(DemoDataC2D2a,initMeansC5D2a,5) initializemeansmatrix Initialize Means Matrix initializemeansmatrix delivers an initial means matrix. initializemeansmatrix(, nclusters, ) nclusters Matrix with the objects as basis for the means matrix. Number of clusters. Select means derived from 1 = random (unity interval), 2 = maximum distances, matrix [nclusters x nfeatures] = self-defined means (will be returned unchanged). Default: 2 = maximum distances.

6 initmeansc3d2a Initial means matrix [nclusters x nfeatures]. M. Goetz, G. Peters, Y. Richter, D. Sacker, T. Wochinger. initmeansc2d2a Initial Means (2 Clusters) This file contains initial means for an example for the HardKMEANSDemo function. data(initmeansc2d2a) Format A data frame with 200 observations on 2 variables. data(initmeansc2d2a); plot(initmeansc2d2a) initmeansc3d2a Initial Means (3 Clusters) This file contains initial means for an example for the HardKMEANSDemo function. data(initmeansc3d2a) Format A matrix [nclusters x nfeatures] = [3 x 2]. data(initmeansc3d2a); plot(initmeansc3d2a)

initmeansc4d2a 7 initmeansc4d2a Initial Means (4 Clusters) This file contains initial means for an example for the HardKMEANSDemo function. data(initmeansc4d2a) Format A matrix [nclusters x nfeatures] = [4 x 2]. data(initmeansc4d2a); plot(initmeansc4d2a) initmeansc5d2a Initial Means (5 Clusters) This file contains initial means for an example for the HardKMEANSDemo function. data(initmeansc5d2a) Format A matrix [nclusters x nfeatures] = [5 x 2]. data(initmeansc5d2a); plot(initmeansc5d2a)

8 plotroughkmeans normalizematrix Matrix Normalization normalizematrix delivers a normalized matrix. normalizematrix(, normmethod, byrow) normmethod byrow Matrix with the objects to be normalized. 1 = unity interval, 2 = normal distribution (sample variance), 3 = normal distribution (population variance). Any other value returns the matrix unchanged. Default: = 1 (unity interval). TRUE = rows are normalized, FALSE = columns are normalized. Default: byrow = TRUE (rows are normalized). Normalized matrix. M. Goetz, G. Peters, Y. Richter, D. Sacker, T. Wochinger. plotroughkmeans Rough k-means Plotting plotroughkmeans plots the rough clustering results in 2D. Note: Plotting is limited to a maximum of 5 clusters. plotroughkmeans(, uppermshipmatrix,, plotdimensions, colouredplot)

RoughKMeans_LW 9 Matrix with the objects to be plotted. uppermshipmatrix Corresponding matrix with upper approximations. Corresponding means matrix. plotdimensions An integer vector of the length 2. Defines the to be plotted feature dimensions, i.e., max(plotdimensions = c(1:2)) <= nfeatures. Default: plotdimensions = c(1:2). colouredplot Select TRUE = colouredplot plot, FALSE = black/white plot. 2D-plot of clustering results. The boundary objects are represented by stars (*). G. Peters. RoughKMeans_LW Lingras & West s Rough k-means RoughKMeans_LW performs Lingras & West s k-means clustering algorithm. The commonly accepted relative threshold is applied. RoughKMeans_LW(,, nclusters, maxiterations, threshold, weightlower) nclusters maxiterations threshold weightlower Matrix with the objects to be clustered. Dimension: [nobjects x nfeatures]. Select means derived from 1 = random (unity interval), 2 = maximum distances, matrix [nclusters x nfeatures] = self-defined means. Default: 2 = maximum distances. Number of clusters: Integer in [2, nobjects). Note, ncluster must be set even when is a matrix. For transparency, nclusters will not be overridden by the number of clusters derived from. Default: nclusters=2. Maximum number of iterations. Default: maxiterations=100. Relative threshold in rough k-means algorithms (threshold >= 1.0). Default: threshold = 1.5. Weight of the lower approximation in rough k-means algorithms (0.0 <= weight- Lower <= 1.0). Default: weightlower = 0.7.

10 RoughKMeans_PE $upperapprox: Obtained upper approximations [nobjects x nclusters]. Note: Apply function createlowermshipmatrix() to obtain lower approximations; and for the boundary: boundary = upperapprox - lowerapp $clustermeans: Obtained means [nclusters x nfeatures]. $niterations: Number of iterations. M. Goetz, G. Peters, Y. Richter, D. Sacker, T. Wochinger. References Lingras, P. and West, C. (2004) Interval Set Clustering of web users with rough k-means. Journal of Intelligent Information Systems 23, 5 16. Lingras, P. and Peters, G. (2011) Rough Clustering. WIREs Data Mining and Knowledge Discovery 1, 64 72. Lingras, P. and Peters, G. (2012) Applying rough set concepts to clustering. In: Peters, G.; Lingras, P.; Slezak, D. and Yao, Y. Y. (Eds.) Rough Sets: Selected Methods and Applications in Management and Engineering, Springer, 23 37. Peters, G.; Crespo, F.; Lingras, P. and Weber, R. (2013) Soft clustering fuzzy and rough approaches and their extensions and derivatives. International Journal of Approximate Reasoning 54, 307 322. Peters, G. (2015) Is there any need for rough clustering? Pattern Recognition Letters 53, 31 37. # An illustrative example clustering the sample data set DemoDataC2D2a.txt RoughKMeans_LW(DemoDataC2D2a, 2, 2, 100, 1.5, 0.7) RoughKMeans_PE Peters Rough k-means RoughKMeans_PE performs Peters k-means clustering algorithm. RoughKMeans_PE(,, nclusters, maxiterations, threshold, weightlower)

RoughKMeans_PE 11 nclusters maxiterations threshold weightlower Matrix with the objects to be clustered. Dimension: [nobjects x nfeatures]. Select means derived from 1 = random (unity interval), 2 = maximum distances, matrix [nclusters x nfeatures] = self-defined means. Default: 2 = maximum distances. Number of clusters: Integer in [2, nobjects). Note, ncluster must be set even when is a matrix. For transparency, nclusters will not be overridden by the number of clusters derived from. Default: nclusters=2. Maximum number of iterations. Default: maxiterations=100. Relative threshold in rough k-means algorithms (threshold >= 1.0). Default: threshold = 1.5. Weight of the lower approximation in rough k-means algorithms (0.0 <= weight- Lower <= 1.0). Default: weightlower = 0.7. $upperapprox: Obtained upper approximations [nobjects x nclusters]. Note: Apply function createlowermshipmatrix() to obtain lower approximations; and for the boundary: boundary = upperapprox - lowerapp $clustermeans: Obtained means [nclusters x nfeatures]. $niterations: Number of iterations. M. Goetz, G. Peters, Y. Richter, D. Sacker, T. Wochinger. References Peters, G. (2006) Some refinements of rough k-means clustering. Pattern Recognition 39, 1481 1491. Peters, G.; Crespo, F.; Lingras, P. and Weber, R. (2013) Soft clustering fuzzy and rough approaches and their extensions and derivatives. International Journal of Approximate Reasoning 54, 307 322. Peters, G. (2015) Is there any need for rough clustering? Pattern Recognition Letters 53, 31 37. # An illustrative example clustering the sample data set DemoDataC2D2a.txt RoughKMeans_PE(DemoDataC2D2a, 2, 2, 100, 1.5, 0.7)

12 RoughKMeans_PI RoughKMeans_PI PI Rough k-means RoughKMeans_PI performs pi k-means clustering algorithm in its standard case. Therefore, weights are not required. RoughKMeans_PI(,, nclusters, maxiterations, threshold) nclusters maxiterations threshold Matrix with the objects to be clustered. Dimension: [nobjects x nfeatures]. Select means derived from 1 = random (unity interval), 2 = maximum distances, matrix [nclusters x nfeatures] = self-defined means. Default: 2 = maximum distances. Number of clusters: Integer in [2, nobjects). Note, ncluster must be set even when is a matrix. For transparency, nclusters will not be overridden by the number of clusters derived from. Default: nclusters=2. Maximum number of iterations. Default: maxiterations=100. Relative threshold in rough k-means algorithms (threshold >= 1.0). Default: threshold = 1.5. $upperapprox: Obtained upper approximations [nobjects x nclusters]. Note: Apply function createlowermshipmatrix() to obtain lower approximations; and for the boundary: boundary = upperapprox - lowerapp $clustermeans: Obtained means [nclusters x nfeatures]. $niterations: Number of iterations. M. Goetz, G. Peters, Y. Richter, D. Sacker, T. Wochinger. References Peters, G.; Crespo, F.; Lingras, P. and Weber, R. (2013) Soft clustering fuzzy and rough approaches and their extensions and derivatives. International Journal of Approximate Reasoning 54, 307 322. Peters, G. (2014) Rough clustering utilizing the principle of indifference. Information Sciences 277, 358 374. Peters, G. (2015) Is there any need for rough clustering? Pattern Recognition Letters 53, 31 37.

RoughKMeans_SHELL 13 # An illustrative example clustering the sample data set DemoDataC2D2a.txt RoughKMeans_PI(DemoDataC2D2a, 2, 2, 100, 1.5) RoughKMeans_SHELL Rough k-means Shell RoughKMeans_SHELL performs rough k-means algorithms with options for normalization and a 2D-plot of the results. RoughKMeans_SHELL(clusterAlgorithm,,, nclusters, normalizationmethod, maxiterations, plotdimensions, colouredplot, threshold, weightlower) clusteralgorithm Select 0 = classic k-means, 1 = Lingras & West s rough k-means, 2 = Peters rough k-means, 3 = π rough k-means. Default: clusteralgorithm = 3 (π rough k-means). Matrix with the objects to be clustered. Dimension: [nobjects x nfeatures]. Select means derived from 1 = random (unity interval), 2 = maximum distances, matrix [nclusters x nfeatures] = self-defined means. Default: 2 = maximum distances. nclusters Number of clusters: Integer in [2, nobjects). Note, ncluster must be set even when is a matrix. For transparency, nclusters will not be overridden by the number of clusters derived from. Default: nclusters=2. Note: Plotting is limited to a maximum of 5 clusters. normalizationmethod 1 = unity interval, 2 = normal distribution (sample variance), 3 = normal distribution (population variance). Any other value returns the matrix unchanged. Default: = 1 (unity interval). maxiterations Maximum number of iterations. Default: maxiterations=100. plotdimensions An integer vector of the length 2. Defines the to be plotted feature dimensions, i.e., max(plotdimensions = c(1:2)) <= nfeatures. Default: plotdimensions = c(1:2). colouredplot threshold weightlower Select TRUE = colouredplot plot, FALSE = black/white plot. Relative threshold in rough k-means algorithms (threshold >= 1.0). Default: threshold = 1.5. Note: It can be ignored for classic k-means. Weight of the lower approximation in rough k-means algorithms (0.0 <= weight- Lower <= 1.0). Default: weightlower = 0.7. Note: It can be ignored for classic k-means and π rough k-means

14 RoughKMeans_SHELL 2D-plot of clustering results. The boundary objects are represented by stars (*). $upperapprox: Obtained upper approximations [nobjects x nclusters]. Note: Apply function createlowermshipmatrix() to obtain lower approximations; and for the boundary: boundary = upperapprox - lowerapp $clustermeans: Obtained means [nclusters x nfeatures]. $niterations: Number of iterations. M. Goetz, G. Peters, Y. Richter, D. Sacker, T. Wochinger. # An illustrative example clustering the sample data set DemoDataC2D2a.txt RoughKMeans_SHELL(3, DemoDataC2D2a, 2, 2, 1, 100, c(1:2), TRUE, 1.5, 0.7)

Index Topic datasets DemoDataC2D2a, 3 initmeansc2d2a, 6 initmeansc3d2a, 6 initmeansc4d2a, 7 initmeansc5d2a, 7 createlowermshipmatrix, 2 datatypeinteger, 2 DemoDataC2D2a, 3 HardKMeans, 3 HardKMeansDemo, 4 initializemeansmatrix, 5 initmeansc2d2a, 6 initmeansc3d2a, 6 initmeansc4d2a, 7 initmeansc5d2a, 7 normalizematrix, 8 plotroughkmeans, 8 RoughKMeans_LW, 9 RoughKMeans_PE, 10 RoughKMeans_PI, 12 RoughKMeans_SHELL, 13 15