Package MODA. January 8, 2019

Similar documents
Tutorial for the WGCNA package for R: III. Using simulated data to evaluate different module detection methods and gene screening approaches

Supplementary text S6 Comparison studies on simulated data

Package gpart. November 19, 2018

Package CoGAPS. January 24, 2019

Package omicplotr. March 15, 2019

Package PropClust. September 15, 2018

Package TMixClust. July 13, 2018

Package clusterseq. R topics documented: June 13, Type Package

Package EMDomics. November 11, 2018

Package QUBIC. September 1, 2018

Package SC3. September 29, 2018

Package methylgsa. March 7, 2019

Package ClusterSignificance

Package M3Drop. August 23, 2018

Package SC3. November 27, 2017

Social Data Management Communities

The Generalized Topological Overlap Matrix in Biological Network Analysis

Package enrichplot. September 29, 2018

EECS730: Introduction to Bioinformatics

Package scmap. March 29, 2019

Basics of Network Analysis

Package sigqc. June 13, 2018

Package dmrseq. September 14, 2018

Package cghmcr. March 6, 2019

Package superbiclust

Package macorrplot. R topics documented: October 2, Title Visualize artificial correlation in microarray data. Version 1.50.

Package BASiNET. October 2, 2018

(1) where, l. denotes the number of nodes to which both i and j are connected, and k is. the number of connections of a node, with.

Package BubbleTree. December 28, 2018

Package phantasus. March 8, 2019

Community Detection. Community

V4 Matrix algorithms and graph partitioning

Package dupradar. R topics documented: July 12, Type Package

Community detection for NetworkX Documentation

Package FitHiC. October 16, 2018

Package cycle. March 30, 2019

Dimension reduction : PCA and Clustering

Package SIMLR. December 1, 2017

Package MiRaGE. October 16, 2018

Package RNASeqR. January 8, 2019

Package methyvim. April 5, 2019

Package MaxContrastProjection

Cluster Analysis for Microarray Data

Package RTNduals. R topics documented: March 7, Type Package

Package LiquidAssociation

Package seqcat. March 25, 2019

Tutorial for the WGCNA package for R II. Consensus network analysis of liver expression data, female and male mice. 1. Data input and cleaning

Package ffpe. October 1, 2018

Package mimager. March 7, 2019

Simulation studies of module preservation: Simulation study of weak module preservation

Mining Social Network Graphs

Package DCD. September 12, 2017

Package BiocNeighbors

Big Data Analytics. Special Topics for Computer Science CSE CSE Feb 11

Package STROMA4. October 22, 2018

Package DRaWR. February 5, 2016

Package RcisTarget. July 17, 2018

Package pcagopromoter

Package canvasxpress

Package ERSSA. November 4, 2018

Package icnv. R topics documented: March 8, Title Integrated Copy Number Variation detection Version Author Zilu Zhou, Nancy Zhang

Package ccmap. June 17, 2018

Package EventPointer

Package stattarget. December 23, 2018

Package sscore. R topics documented: June 27, Version Date

Gene expression & Clustering (Chapter 10)

Package netbiov. R topics documented: October 14, Type Package

Package semisup. March 10, Version Title Semi-Supervised Mixture Model

Package Corbi. May 3, 2017

Package diffusr. May 17, 2018

Community detection. Leonid E. Zhukov

Package BDMMAcorrect

Spectral Clustering. Presented by Eldad Rubinstein Based on a Tutorial by Ulrike von Luxburg TAU Big Data Processing Seminar December 14, 2014

Package comphclust. May 4, 2017

Package ChIPseeker. July 12, 2018

Package ssizerna. January 9, 2017

Package BEARscc. May 2, 2018

Package SamSPECTRAL. January 23, 2019

Package EFS. R topics documented:

Analysis of Biological Networks. 1. Clustering 2. Random Walks 3. Finding paths

Package switchbox. August 31, 2018

Package SNPediaR. April 9, 2019

Clustering. Dick de Ridder 6/10/2018

ViTraM: VIsualization of TRAnscriptional Modules

Package DMCHMM. January 19, 2019

Package frma. R topics documented: March 8, Version Date Title Frozen RMA and Barcode

Package igc. February 10, 2018

Package GSRI. March 31, 2019

Clustering in Networks

Clustering. SC4/SM4 Data Mining and Machine Learning, Hilary Term 2017 Dino Sejdinovic

Package BiRewire. July 5, 2018

Package pandar. April 30, 2018

Package SteinerNet. August 19, 2018

Package Linnorm. October 12, 2016

Package svapls. February 20, 2015

Package mfa. R topics documented: July 11, 2018

Package CSFA. August 14, 2018

Study and Implementation of CHAMELEON algorithm for Gene Clustering

Spectral Clustering and Community Detection in Labeled Graphs

Tutorial for the WGCNA package for R II. Consensus network analysis of liver expression data, female and male mice

Transcription:

Type Package Package MODA January 8, 2019 Title MODA: MOdule Differential Analysis for weighted gene co-expression network Version 1.8.0 Date 2016-12-16 Author Dong Li, James B. Brown, Luisa Orsini, Zhisong Pan, Guyu Hu and Shan He Maintainer Dong Li <dxl466@cs.bham.ac.uk> MODA can be used to estimate and construct condition-specific gene co-expression networks, and identify differentially expressed subnetworks as conserved or condition specific modules which are potentially associated with relevant biological processes. License GPL (>= 2) Depends R (>= 3.3) Imports grdevices, graphics, stats, utils, WGCNA, dynamictreecut, igraph, cluster, AMOUNTAIN, RColorBrewer RoxygenNote 5.0.1 biocviews GeneExpression, Microarray, DifferentialExpression, Network Suggests BiocStyle, knitr, rmarkdown ignettebuilder knitr git_url https://git.bioconductor.org/packages/moda git_branch RELEASE_3_8 git_last_commit 61bf4fb git_last_commit_date 2018-10-30 Date/Publication 2019-01-07 R topics documented: CompareAllNets...................................... 2 comparemodulestwonets.................................. 3 datexpr1........................................... 4 datexpr2........................................... 4 getpartition......................................... 5 MIcondition......................................... 5 ModuleFrequency...................................... 6 1

2 CompareAllNets modulesrank........................................ 7 NMImatrix......................................... 7 PartitionDensity....................................... 8 PartitionModularity..................................... 9 recursiveigraph....................................... 10 WeightedModulePartitionAmoutain............................ 10 WeightedModulePartitionHierarchical........................... 11 WeightedModulePartitionLouvain............................. 12 WeightedModulePartitionSpectral............................. 13 Index 15 CompareAllNets Illustration of network comparison Compare the background network and a set of condition-specific network. Conserved or conditionspecific modules are indicated by the plain files, based on the statistics CompareAllNets(ResultFolder, intmodules, indicator, intconditionmodules, conditionnames, specifictheta, conservedtheta) ResultFolder intmodules where to store results how many modules in the background network indicator identifier of current profile, served as a tag in name intconditionmodules a numeric vector, each of them is the number of modules in each conditionspecific network. Or just single number conditionnames a character vector, each of them is the name of condition. Or just single name specifictheta the threshold to define min(s)+specifictheta, less than which is considered as condition specific module. s is the sums of rows in Jaccard index matrix. See supplementary file. conservedtheta The threshold to define max(s)-conservedtheta, greater than which is considered as condition conserved module. s is the sums of rows in Jaccard index matrix. See supplementary file. None See Also WeightedModulePartitionHierarchical, comparemodulestwonets

comparemodulestwonets 3 ResultFolder = 'ForSynthetic' # where middle files are stored CuttingCriterion = 'Density' # could be Density or Modularity indicator1 = 'X' # indicator for data profile 1 indicator2 = 'Y' # indicator for data profile 2 specifictheta = 0.1 #threshold to define condition specific modules conservedtheta = 0.1#threshold to define conserved modules intmodules1 <- WeightedModulePartitionHierarchical(datExpr1,ResultFolder, indicator1,cuttingcriterion) intmodules2 <- WeightedModulePartitionHierarchical(datExpr2,ResultFolder, indicator2,cuttingcriterion) CompareAllNets(ResultFolder,intModules1,indicator1,intModules2,indicator2, specifictheta,conservedtheta) comparemodulestwonets Illustration of two networks comparison Compare the background network and a condition-specific network. A Jaccard index is used to measure the similarity of two sets, which represents the similarity of each module pairs from two networks. comparemodulestwonets(sourcehead, nm1, nm2, ind1, ind2) sourcehead nm1 nm2 ind1 ind2 prefix of where to store results how many modules in the background network how many modules in the condition-specific network indicator of the background network indicator of the condition-specific network A matrix where each entry is the Jaccard index of corresponding modules from two networks

4 datexpr2 ResultFolder = 'ForSynthetic' # where middle files are stored CuttingCriterion = 'Density' # could be Density or Modularity indicator1 = 'X' # indicator for data profile 1 indicator2 = 'Y' # indicator for data profile 2 intmodules1 <- WeightedModulePartitionHierarchical(datExpr1,ResultFolder, indicator1,cuttingcriterion) intmodules2 <- WeightedModulePartitionHierarchical(datExpr2,ResultFolder, indicator2,cuttingcriterion) JaccardMatrix <- comparemodulestwonets(resultfolder,intmodules1,intmodules2, paste('/densemodulegene_',indicator1,sep=''), paste('/densemodulegene_',indicator2,sep='')) datexpr1 datexpr1 Format Synthetic gene expression profile with 20 samples and 500 genes. A matrix with 20 rows and 500 columns. ## plot the heatmap of the correlation matrix... ## Not run: heatmap(cor(as.matrix(datexpr1))) datexpr2 datexpr2 Synthetic gene expression profile with 25 samples and 500 genes. Format A matrix with 25 rows and 500 columns.

getpartition 5 ## plot the heatmap of the correlation matrix... ## Not run: heatmap(cor(as.matrix(datexpr2))) getpartition Get numeric partition from folder Get identified partitionassignment, only for synthetic data where gene names are numbers getpartition(resultfolder) ResultFolder folder used to save modules Number of partitions MIcondition Modules detection by each condition Module detection on each condition-specific network, which is constructed from all samples but samples belonging to that condition MIcondition(datExpr, conditionnames, ResultFolder, GeneNames, maxsize = 100, minsize = 30) datexpr gene expression profile, rows are samples and columns genes, rowname should contain condition specifier conditionnames character vector, each as the condition name ResultFolder GeneNames maxsize minsize where to store the clusters normally the gene official names to replace the colnames of datexpr the maximal nodes allowed in one module the minimal nodes allowed in one module

6 ModuleFrequency a numeric vector, each entry is the number of modules in condition-specific network ModuleFrequency Statistics of all conditions Statistics of all conditions. To highlight conserved or condition-specific by counting how frequent each module is lablelled as which, and then visualize the frequency by bar plot. ModuleFrequency(ResultFolder, intmodules, conditionnames, legendnames, indicator) ResultFolder intmodules where to store results how many modules in the background network conditionnames a character vector, each of them is the name legendnames indicator a character vector, each of them is the condition name showing up in the frequency barplot of condition. Or just single name identifier of current profile, served as a tag in name None See Also WeightedModulePartitionHierarchical, WeightedModulePartitionLouvain, WeightedModulePartitionSpectr WeightedModulePartitionAmoutain, CompareAllNets

modulesrank 7 modulesrank Modules rank from recursive communities detection Assign the module scores by weights, and rank them from highest to lowest modulesrank(foldername, indicator, GeneNames) foldername indicator GeneNames folder used to save modules normally a specific tag of condition Gene symbols, sometimes we need them instead of probe ids The numeber of modules See Also recursiveigraph NMImatrix Illustration of network comparison by NMI Compare the background network and a set of condition-specific network. returning a pair-wise matrix to show the normalized mutual information between each pair of networks in terms of partitioning NMImatrix(ResultFolder, intmodules, indicator, intconditionmodules, conditionnames, Nsize, legendnames = NULL, plt = FALSE)

8 PartitionDensity ResultFolder intmodules where to store results how many modules in the background network indicator identifier of current profile, served as a tag in name intconditionmodules a numeric vector, each of them is the number of modules in each conditionspecific network. Or just single number conditionnames a character vector, each of them is the name of condition. Or just single name Nsize legendnames plt The number of genes in total a character vector, each of them is the condition name showing up in the similarity matrix plot if applicable a boolean value to indicate whether plot the similarity matrix NMI matrix indicating the similarity between each two networks See Also CompareAllNets PartitionDensity Illustration of partition density Calculate the average density of all resulting modules from a partition. The density of each module is defined as the average adjacency of the module genes. PartitionDensity(ADJ, PartitionSet) ADJ PartitionSet gene similarity matrix vector indicates the partition label for genes partition density, defined as average density of all modules

PartitionModularity 9 References Langfelder, Peter, and Steve Horvath. "WGCNA: an R package for weighted correlation network analysis." BMC bioinformatics 9.1 (2008): 1. ADJ1=abs(cor(datExpr1,use="p"))^10 dissadj=1-adj1 hieradj=hclust(as.dist(dissadj), method="average" ) groups <- cutree(hieradj, h = 0.8) pdensity <- PartitionDensity(ADJ1,groups) PartitionModularity Illustration of modularity density Calculate the average modularity of a partition. The modularity of each module is defined from a natural generalization of unweighted case. PartitionModularity(ADJ, PartitionSet) ADJ PartitionSet gene similarity matrix vector indicates the partition label for genes partition modularity, defined as average modularity of all modules References Newman, Mark EJ. "Analysis of weighted networks." Physical review E 70.5 (2004): 056131. ADJ1=abs(cor(datExpr1,use="p"))^10 dissadj=1-adj1 hieradj=hclust(as.dist(dissadj), method="average" ) groups <- cutree(hieradj, h = 0.8) pdensity <- PartitionModularity(ADJ1,groups)

10 WeightedModulePartitionAmoutain recursiveigraph Modules identification by recursive community detection Modules detection using igraph s community detection algorithms, when the resulted module is larger than expected, it is further devided by the same program recursiveigraph(g, savefile, method = c("fastgreedy", "louvain"), maxsize = 200, minsize = 30) g savefile method maxsize minsize igraph object, the network to be partitioned plain text, used to store module, each line as a module specify the community detection algorithm maximal module size minimal module size None References Blondel, Vincent D., et al. "Fast unfolding of communities in large networks." Journal of statistical mechanics: theory and experiment 2008.10 (2008): P10008. WeightedModulePartitionAmoutain Modules detection by AMOUNTAIN algorithm Module detection based on the AMOUNTAIN algorithm, which tries to find the optimal module every time and use a modules extraction way WeightedModulePartitionAmoutain(datExpr, Nmodule, foldername, indicatename, GeneNames, maxsize = 200, minsize = 3, power = 6, tao = 0.2)

WeightedModulePartitionHierarchical 11 datexpr Nmodule foldername indicatename GeneNames maxsize minsize power tao gene expression profile, rows are samples and columns genes the number of clusters(modules) where to store the clusters normally a specific tag of condition normally the gene official names to replace the colnames of datexpr the maximal nodes allowed in one module the minimal nodes allowed in one module the power parameter of WGCNA, W_ij= cor(x_i,x_j) ^pwr the threshold to cut the adjacency matrix None References Blondel, Vincent D., et al. "Fast unfolding of communities in large networks." Journal of statistical mechanics: theory and experiment 2008.10 (2008): P10008. ResultFolder <- 'ForSynthetic' # where middle files are stored GeneNames <- colnames(datexpr1) intmodules1 <- WeightedModulePartitionAmoutain(datExpr1,5,ResultFolder,'X', GeneNames,maxsize=100,minsize=50) truemodule <- c(rep(1,100),rep(2,100),rep(3,100),rep(4,100),rep(5,100)) #mymodule <- getpartition(resultfolder) #randindex(table(mymodule,truemodule),adjust=f) WeightedModulePartitionHierarchical Modules detection by hierarchical clustering Module detection based on the optimal cutting height of dendrogram, which is selected to make the average density or modularity of resulting partition maximal. The clustering and visulization function are from WGCNA. WeightedModulePartitionHierarchical(datExpr, foldername, indicatename, cutmethod = c("density", "Modularity"), power = 10)

12 WeightedModulePartitionLouvain datexpr foldername indicatename cutmethod power gene expression profile, rows are samples and columns genes where to store the clusters normally a specific tag of condition cutting the dendrogram based on maximal average Density or Modularity the power parameter of WGCNA, W_ij= cor(x_i,x_j) ^power The number of clusters References Langfelder, Peter, and Steve Horvath. "WGCNA: an R package for weighted correlation network analysis." BMC bioinformatics 9.1 (2008): 1. See Also PartitionDensity PartitionModularity ResultFolder = 'ForSynthetic' # where middle files are stored CuttingCriterion = 'Density' # could be Density or Modularity indicator1 = 'X' # indicator for data profile 1 indicator2 = 'Y' # indicator for data profile 2 specifictheta = 0.1 #threshold to define condition specific modules conservedtheta = 0.1#threshold to define conserved modules intmodules1 <- WeightedModulePartitionHierarchical(datExpr1,ResultFolder, indicator1,cuttingcriterion) #mymodule <- getpartition(resultfolder) #randindex(table(mymodule,truemodule),adjust=f) WeightedModulePartitionLouvain Modules detection by Louvain algorithm Module detection based on the Louvain algorithm, which tries to maximize overall modularity of resulting partition. WeightedModulePartitionLouvain(datExpr, foldername, indicatename, GeneNames, maxsize = 200, minsize = 30, power = 6, tao = 0.2)

WeightedModulePartitionSpectral 13 datexpr foldername indicatename GeneNames maxsize minsize power tao gene expression profile, rows are samples and columns genes where to store the clusters normally a specific tag of condition normally the gene official names to replace the colnames of datexpr the maximal nodes allowed in one module the minimal nodes allowed in one module the power parameter of WGCNA, W_ij= cor(x_i,x_j) ^power the threshold to cut the adjacency matrix The number of clusters References Blondel, Vincent D., et al. "Fast unfolding of communities in large networks." Journal of statistical mechanics: theory and experiment 2008.10 (2008): P10008. ResultFolder <- 'ForSynthetic' # where middle files are stored indicator <- 'X' # indicator for data profile 1 GeneNames <- colnames(datexpr1) intmodules1 <- WeightedModulePartitionLouvain(datExpr1,ResultFolder,indicator,GeneNames) truemodule <- c(rep(1,100),rep(2,100),rep(3,100),rep(4,100),rep(5,100)) #mymodule <- getpartition(resultfolder) #randindex(table(mymodule,truemodule),adjust=f) WeightedModulePartitionSpectral Modules detection by spectral clustering Module detection based on the spectral clustering algorithm, which mainly solve the eigendecomposition on Laplacian matrix WeightedModulePartitionSpectral(datExpr, foldername, indicatename, GeneNames, power = 6, nn = 10, k = 2)

14 WeightedModulePartitionSpectral datexpr foldername indicatename GeneNames power nn k gene expression profile, rows are samples and columns genes where to store the clusters normally a specific tag of condition normally the gene official names to replace the colnames of datexpr the power parameter of WGCNA, W_ij= cor(x_i,x_j) ^power the number of nearest neighbor, used to construct the affinity matrix the number of clusters(modules) None References Von Luxburg, Ulrike. "A tutorial on spectral clustering." Statistics and computing 17.4 (2007): 395-416. ResultFolder <- 'ForSynthetic' # where middle files are stored indicator <- 'X' # indicator for data profile 1 GeneNames <- colnames(datexpr1) WeightedModulePartitionSpectral(datExpr1,ResultFolder,indicator, GeneNames,k=5) truemodule <- c(rep(1,100),rep(2,100),rep(3,100),rep(4,100),rep(5,100)) #mymodule <- getpartition(resultfolder) #randindex(table(mymodule,truemodule),adjust=f)

Index Topic NMI NMImatrix, 7 Topic Statistics ModuleFrequency, 6 Topic community recursiveigraph, 10 Topic comparison comparemodulestwonets, 3 Topic cutting WeightedModulePartitionHierarchical, 11 WeightedModulePartitionLouvain, 12 WeightedModulePartitionSpectral, 13 Topic data datexpr1, 4 datexpr2, 4 Topic dendrogram WeightedModulePartitionHierarchical, 11 WeightedModulePartitionLouvain, 12 WeightedModulePartitionSpectral, 13 Topic density PartitionDensity, 8 Topic detection recursiveigraph, 10 Topic differential CompareAllNets, 2 ModuleFrequency, 6 NMImatrix, 7 Topic modularity PartitionModularity, 9 Topic module CompareAllNets, 2 comparemodulestwonets, 3 ModuleFrequency, 6 NMImatrix, 7 Topic multiplecondition MIcondition, 5 Topic optimization WeightedModulePartitionAmoutain, 10 CompareAllNets, 2, 6, 8 comparemodulestwonets, 2, 3 datexpr1, 4 datexpr2, 4 getpartition, 5 MIcondition, 5 ModuleFrequency, 6 modulesrank, 7 NMImatrix, 7 PartitionDensity, 8, 12 PartitionModularity, 9, 12 recursiveigraph, 7, 10 WeightedModulePartitionAmoutain, 6, 10 WeightedModulePartitionHierarchical, 2, 6, 11 WeightedModulePartitionLouvain, 6, 12 WeightedModulePartitionSpectral, 6, 13 15