Package ICSShiny. April 1, 2018

Type Package
Title ICS via a Shiny Application
Version 0.5
Date 2018-04-01
Author Aurore Archimbaud, Joris May, Klaus Nordhausen, Anne Ruiz-Gazen
Maintainer Klaus Nordhausen <klaus.nordhausen@tuwien.ac.at>
Description Performs Invariant Coordinate Selection (ICS) (Tyler, Critchley, Duembgen and Oja (2009) <doi:10.1111/j.1467-9868.2009.00706.x>) and especially ICS for multivariate outlier detection with application to quality control (Archimbaud, Nordhausen, Ruiz-Gazen (2016) <arXiv:1612.06118>) using a shiny app.
License GPL (>= 2)
Depends ICS, ICSOutlier, shiny
Imports ICSNP, rrcov, simsalapar, DT
NeedsCompilation no
Repository CRAN
Date/Publication 2018-04-01 20:21:35 UTC

R topics documented:

HRMEST
ICSShiny
MCD
print.icsshiny
TM
Index

HRMEST                  Wrapper for Joint Affine Equivariant Estimation of Multivariate Median and Tyler's Shape Matrix

Description

The function returns, for some multivariate data, the joint affine equivariant estimation of multivariate median and Tyler's shape matrix obtained from HR.Mest.

Usage

    HRMEST(x, ...)

Arguments

x       numeric data matrix or dataframe.
...     further arguments passed on to HR.Mest.

Details

To be usable with ICSShiny, a function must return a list whose first two elements are the location vector and the scatter matrix. The HRMEST function is mainly for internal use in the ICSShiny application.

Value

location    the location vector obtained from the joint affine equivariant estimation of multivariate median and Tyler's shape matrix.
scatter     the scatter matrix obtained from the joint affine equivariant estimation of multivariate median and Tyler's shape matrix.

Author(s)

Klaus Nordhausen

References

HR.Mest

See Also

HR.Mest, ICSShiny

Examples

    res.hr.mest <- HRMEST(iris[, 1:4], maxiter = 1000)

ICSShiny                Invariant Coordinate Selection With a Shiny App

Description

Performs ICS via a shiny app where the user can change the scatter matrices, explore the output and download graphs and components. The ICS outlier detection framework from the ICSOutlier package is also available. It is inspired by the Factoshiny application of the FactoMineR package.

Usage

    ICSShiny(x, S1 = MeanCov, S2 = Mean3Cov4, S1args = list(), S2args = list(),
             seed = NULL, ncores = NULL, iseed = NULL, pkg = "ICSOutlier")

Arguments

x         data matrix or dataframe with at least two numeric variables. Please note that it can contain non-numeric variables, but ICS is only performed on numeric variables.
S1        name of the function which returns the first location vector T1 and scatter matrix S1. See details and ics2 for more information. Default is MeanCov.
S2        name of the function which returns the second location vector T2 and scatter matrix S2. See details and ics2 for more information. Default is Mean3Cov4.
S1args    list with optional additional arguments when calling function S1.
S2args    list with optional additional arguments when calling function S2.
seed      to fix a seed when needed in order to fix the thresholds. Default is NULL. See details for more information.
ncores    number of cores to be used in dist.simu.test and comp.simu.test. If NULL or 1, no parallel computing is used. Otherwise makeCluster with type = "PSOCK" is used.
iseed     if parallel computation is used, the seed passed on to clusterSetRNGStream. Default is NULL, which means no fixed seed is used.
pkg       when using parallel computing, a character vector listing all the packages which need to be loaded on the different cores via require. Must be at least "ICSOutlier" and must contain the packages needed to compute the scatter matrices.

Details

Choice of the parameters

The scatter matrices and their associated location estimators can be selected from the list of options: MeanCov, Mean3Cov4, MCD, TM.

It is also possible to run the application with your own functions, as long as they are passed as arguments in the call to ICSShiny. However, in this case it is not possible to run the simulation steps for now. ICS is only performed on numeric variables. Only non-numeric variables are proposed for labelling and/or categorizing the observations.
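As an illustration of the "your own functions" option above, here is a minimal sketch of a user-defined scatter function. The name TrimmedMeanCov and its keep argument are hypothetical and not part of the package. The only requirement taken from this manual is that the function returns a list whose first two elements are the location vector and the scatter matrix. The final call follows the documented argument pattern but has not been checked against the application.

```r
# Hypothetical user-defined estimator (not part of ICSShiny):
# mean and covariance of the 'keep' fraction of observations that are
# closest to the centre in Mahalanobis distance.
TrimmedMeanCov <- function(x, keep = 0.9, ...) {
  x <- as.matrix(x)
  # Squared Mahalanobis distances with respect to the classical estimates.
  d2 <- mahalanobis(x, colMeans(x), cov(x))
  inside <- d2 <= quantile(d2, keep)
  core <- x[inside, , drop = FALSE]
  # Location first, scatter second, as required for use with ICSShiny.
  list(location = colMeans(core), scatter = cov(core))
}

res.trim <- TrimmedMeanCov(iris[, 1:4], keep = 0.9)
str(res.trim)

# Assumed usage, following the S1/S1args pattern of the Usage section.
if (interactive() && requireNamespace("ICSShiny", quietly = TRUE)) {
  library(ICSShiny)
  ICSShiny(iris, S1 = TrimmedMeanCov, S1args = list(keep = 0.9))
}
```

As stated above, the simulation steps are not available when a custom function is supplied.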

Component selection

For computing the kernel densities in the second sub-tab, a Gaussian kernel is used and the bandwidth follows the rule of thumb of Silverman (1986). For the automatic selection of the Invariant Components (IC), the referenced normality tests are the same as in the comp.norm.test function: "jarque.test", "anscombe.test", "bonett.test", "agostino.test", "shapiro.test". All the decisions are corrected for multiple testing by adjusting the levels as in comp.norm.test. The number of components to keep can also be decided from Monte Carlo simulations through the comp.simu.test function. This parallel analysis method may take a very long time to compute, so it is run only if the user clicks on the "Launch the test" button.

Value

Returns several tabs in the browser:

Choice of the parameters
    The scatterplot matrix of an ICS object for the parameters chosen on the left part (variables included/excluded, the location vectors and scatter matrices).

Component selection
    Three different sub-tabs to help the user choose the interesting components. The first sub-tab is the screeplot of the eigenvalues of the ICS object followed by the summary of the analysis. The second sub-tab plots the kernel density of the ICS components. The third sub-tab suggests which components to select, starting from the highest and/or the lowest kurtosis, through different normality tests or simulations. The default values of the slider on the left are obtained from "agostino.test" at 5%.

Matrix scatterplot of invariant components
    The two sub-tabs aim at identifying groups or outliers by using pairwise plots of invariant coordinates. Two ways of plotting them are offered: only two invariant components, or a scatterplot matrix with up to six invariant components. The left panel allows the user to color the identified groups and label the observations.

Outlier identification
    This tab plots outlyingness values for each observation based on the selected components. These squared ICS distances are computed by the ics.distances function as the squared Euclidean distance of each observation to the origin, using the selected centered components. The identification of the outliers can be based on different cut-offs: from Monte Carlo simulations as in dist.simu.test, or by giving a percentage or a number of observations to identify.

Descriptive statistics
    This tab gives some descriptive statistics on different subsets of the data (all the observations, the observations from a given cluster, the outlying observations) and makes it possible to compare the sub-populations. The application includes a boxplot, a kernel density, a histogram and some basic statistics: Min, Q1, Mean, Median, Q3 and Max.

Data Table
    This tab contains the dataset with a nice display and the possibility to choose different sub-populations of the data: all the observations, the observations from a given cluster or the outlying observations.

Save
    This tab allows the user to display and save the data table of components and the summary of operations. The data frame contains the components kept in the analysis as well as the distance generated by these components. It also includes the cluster the observation belongs to, whether the observation is defined as an outlier, and the variables used for labelling and categorizing the data. The data are saved in csv format. The summary of operations lists all parameters that were used to obtain the current result; it may be useful for another user who wants to reproduce the result. It is saved in txt format. The "Close the session" button closes the application and saves the icsshiny object into the global environment.

Author(s)

Aurore Archimbaud and Joris May

References

Nordhausen, K., Oja, H. and Tyler, D.E. (2008), Tools for exploring multivariate data: The package ICS, Journal of Statistical Software, 28(6), 1-31. <doi:10.18637/jss.v028.i06>.

Archimbaud, A., Nordhausen, K. and Ruiz-Gazen, A. (2016), ICS for multivariate outlier detection with application to quality control, <https://arxiv.org/pdf/1612.06118.pdf>.

See Also

ics2, ics.outlier, shiny website

Examples

    if (interactive()) {
      # ICS with ICSShiny:
      res.shiny <- ICSShiny(iris)
      # Close the session by clicking on the button or closing the browser tab

      # ICS on a result of an icsshiny object
      ICSShiny(res.shiny)

      # ICS with ICSShiny and different parameters
      res.shiny <- ICSShiny(iris, S1 = MCD, S1args = list(alpha = 0.7), seed = 7587)

      # ICS with ICSShiny with parallelization of computations and seed
      res.shiny <- ICSShiny(iris, iseed = 1234, ncores = 2)
    }
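To make the squared ICS distances of the "Outlier identification" tab concrete, the following base-R sketch computes invariant components for the default pair of scatters (covariance and a fourth-moment scatter) and then the distances and a percentage-based cut-off. It is an illustration under assumptions, not the code used by ICS or ICSOutlier: the helper names fourth_moment_scatter and ics_sketch are hypothetical, and the choice of keeping only the first component is arbitrary.

```r
# Fourth-moment scatter: covariance-like matrix in which observation i is
# weighted by its squared Mahalanobis distance r_i^2, scaled by 1 / (p + 2).
fourth_moment_scatter <- function(X) {
  X <- as.matrix(X)
  p <- ncol(X)
  centered <- sweep(X, 2, colMeans(X))
  r2 <- mahalanobis(X, colMeans(X), cov(X))
  # Row i of 'centered' is multiplied by sqrt(r2[i]) (column-wise recycling).
  crossprod(centered * sqrt(r2)) / (nrow(X) * (p + 2))
}

# Hypothetical helper: finds B with B S1 B' = I and B S2 B' diagonal.
ics_sketch <- function(X) {
  X <- as.matrix(X)
  p <- ncol(X)
  S1 <- cov(X)
  S2 <- fourth_moment_scatter(X)
  # Symmetric inverse square root of S1.
  e1 <- eigen(S1, symmetric = TRUE)
  S1_inv_sqrt <- e1$vectors %*% diag(1 / sqrt(e1$values), nrow = p) %*% t(e1$vectors)
  # Eigen-decomposition of the whitened S2; values come in decreasing order.
  e2 <- eigen(S1_inv_sqrt %*% S2 %*% S1_inv_sqrt, symmetric = TRUE)
  B <- t(e2$vectors) %*% S1_inv_sqrt
  Z <- sweep(X, 2, colMeans(X)) %*% t(B)  # centered invariant components
  colnames(Z) <- paste0("IC.", seq_len(p))
  list(gen_kurtosis = e2$values, B = B, scores = Z)
}

fit <- ics_sketch(iris[, 1:4])

# Squared ICS distances: squared Euclidean norm of the selected components.
selected <- 1  # arbitrary choice for illustration
ics_dist <- rowSums(fit$scores[, selected, drop = FALSE]^2)

# Percentage-based cut-off: flag observations above the 95% quantile.
cutoff <- quantile(ics_dist, 0.95)
outliers <- which(ics_dist > cutoff)
```

When all components are selected, these distances coincide with the squared Mahalanobis distances with respect to the first scatter, which is why selecting a subset of components is what makes the approach informative.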

MCD                     Wrapper for MCD location and scatter estimates

Description

The function returns, for some multivariate data, the MCD location and scatter estimates obtained from CovMcd.

Usage

    MCD(x, ...)

Arguments

x       numeric data matrix or dataframe.
...     further arguments passed to or from other methods.

Details

To be usable with ICSShiny, a function must return a list whose first two elements are the location vector and the scatter matrix. The MCD function is offered as an option inside the ICSShiny application.

Value

location    MCD location vector.
scatter     MCD scatter estimate.

Author(s)

Aurore Archimbaud and Joris May

References

CovMcd

See Also

CovMcd, ICSShiny

Examples

    res.mcd <- MCD(iris[, 1:4], alpha = 0.75)
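For readers who want to wrap another estimator in the same way, the sketch below shows how the output of rrcov::CovMcd could be adapted to the list(location, scatter) shape described under Details. The name mcd_wrapper_sketch is hypothetical and this is not the package's source code; it only assumes the rrcov accessors getCenter and getCov.

```r
# Hypothetical wrapper (illustration only): adapt the S4 object returned by
# rrcov::CovMcd() to a plain list with location first and scatter second.
mcd_wrapper_sketch <- function(x, ...) {
  fit <- rrcov::CovMcd(x, ...)
  list(location = rrcov::getCenter(fit), scatter = rrcov::getCov(fit))
}

# Run only when rrcov is installed.
if (requireNamespace("rrcov", quietly = TRUE)) {
  res <- mcd_wrapper_sketch(iris[, 1:4], alpha = 0.75)
  str(res)
}
```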

print.icsshiny          Prints the ICSShiny Results

Description

Prints an object of class icsshiny, typically the result of a call to ICSShiny. The output corresponds to the summary of operations.

Usage

    ## S3 method for class 'icsshiny'
    print(x, ...)

Arguments

x       an object of class icsshiny.
...     further arguments to be passed to or from methods.

Author(s)

Aurore Archimbaud and Joris May

See Also

ICSShiny

Examples

    ## Not run:
    # ICS with ICSShiny:
    res.shiny <- ICSShiny(iris)
    # click on the "Close the session" button or close the tab
    print(res.shiny)
    ## End(Not run)

TM                      Wrapper for Joint M-estimation of location and scatter for a multivariate t-distribution

Description

The function returns, for some multivariate data, the joint M-estimation of location and scatter matrix for a multivariate t-distribution obtained from tM.

Usage

    TM(x, ...)

Arguments

x       numeric data matrix or dataframe.
...     further arguments passed to or from other methods.

Details

To be usable with ICSShiny, a function must return a list whose first two elements are the location vector and the scatter matrix. The TM function is offered as an option inside the ICSShiny application.

Value

location    the location vector obtained from the joint M-estimation of location and scatter for a multivariate t-distribution.
scatter     the scatter matrix obtained from the joint M-estimation of location and scatter for a multivariate t-distribution.

Author(s)

Aurore Archimbaud and Joris May

References

tM

See Also

tM, ICSShiny

Examples

    res.tm <- TM(iris[, 1:4], df = 3)
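To give an idea of what joint M-estimation for a multivariate t-distribution involves, here is a base-R sketch of the classical fixed-point (EM-type) iteration, in which each observation is weighted by (df + p) / (df + d^2), with d^2 its current squared Mahalanobis distance. This is a swapped-in textbook scheme for illustration, not necessarily the algorithm used by tM, and the function name t_mle_sketch and its arguments are hypothetical.

```r
# Hypothetical illustration: fixed-point iteration for the location and
# scatter of a multivariate t-distribution with known degrees of freedom.
t_mle_sketch <- function(x, df = 1, max_iter = 500, tol = 1e-8) {
  x <- as.matrix(x)
  n <- nrow(x)
  p <- ncol(x)
  mu <- colMeans(x)  # start from the classical estimates
  V <- cov(x)
  converged <- FALSE
  for (iter in seq_len(max_iter)) {
    d2 <- mahalanobis(x, mu, V)
    w <- (df + p) / (df + d2)          # distant points get small weights
    mu_new <- colSums(w * x) / sum(w)  # weighted mean
    centered <- sweep(x, 2, mu_new)
    V_new <- crossprod(centered * sqrt(w)) / n  # weighted scatter
    converged <- max(abs(mu_new - mu), abs(V_new - V)) < tol
    mu <- mu_new
    V <- V_new
    if (converged) break
  }
  # Same list shape as the wrappers in this manual: location, then scatter.
  list(location = mu, scatter = V, iterations = iter, converged = converged)
}

res.t <- t_mle_sketch(iris[, 1:4], df = 3)
str(res.t)
```

For very large df the weights tend to 1, so the estimates approach the sample mean and the covariance matrix with divisor n; small df downweights distant observations more strongly.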

Index

Topic methods
    print.icsshiny
Topic multivariate
    HRMEST, ICSShiny, MCD, TM
Topic print
    print.icsshiny

clusterSetRNGStream, comp.norm.test, comp.simu.test, CovMcd, dist.simu.test, HR.Mest, HRMEST, ics.distances, ics.outlier, ics2, ICSOutlier, ICSShiny, makeCluster, MCD, Mean3Cov4, MeanCov, print.icsshiny, require, TM, tM