An introduction to network inference and mining - TP

Similar documents
An introduction to network inference and mining

Package RNAseqNet. May 16, 2017

Graph Visualization with Gephi

Using igraph for Visualisations

Package RCA. R topics documented: February 29, 2016

Machine Learning (BSMC-GA 4439) Wenke Liu

Package netbiov. R topics documented: October 14, Type Package

Package comphclust. February 15, 2013

Package comphclust. May 4, 2017

Package glassomix. May 30, 2013

Inferring Regulatory Networks by Combining Perturbation Screens and Steady State Gene Expression Profiles

Package XMRF. June 25, 2015

HomeWork 4 Hints {Your Name} deadline

Package nlnet. April 8, 2018

Simple Gephi Project from A to Z. Clément Levallois

Package multigraph. R topics documented: January 24, 2017

The fastclime Package for Linear Programming and Large-Scale Precision Matrix Estimation in R

Supplementary text S6 Comparison studies on simulated data

Social Network Analysis With igraph & R. Ofrit Lesser December 11 th, 2014

Introduction to R. UCLA Statistical Consulting Center R Bootcamp. Irina Kukuyeva September 20, 2010

GEPHI Introduction to network analysis and visualization

Preface to the Second Edition. Preface to the First Edition. 1 Introduction 1

Gene signature selection to predict survival benefits from adjuvant chemotherapy in NSCLC patients

ACHIEVEMENTS FROM TRAINING

Detecting and Analyzing Communities in Social Network Graphs for Targeted Marketing

Clustering analysis of gene expression data

Machine Learning (BSMC-GA 4439) Wenke Liu

Network Analysis. Dr. Scott A. Hale Oxford Internet Institute 16 March 2016

Package NetCluster. R topics documented: February 19, Type Package Version 0.2 Date Title Clustering for networks

Package multigraph. R topics documented: November 1, 2017

Tight Clustering: a method for extracting stable and tight patterns in expression profiles

Computing with large data sets

Package hbm. February 20, 2015

Machine Learning and Data Mining. Clustering (1): Basics. Kalev Kask

Package lvm4net. R topics documented: August 29, Title Latent Variable Models for Networks

Package MatchingFrontier

Data Science and Statistics in Research: unlocking the power of your data Session 3.4: Clustering

VIDAEXPERT: DATA ANALYSIS Here is the Statistics button.

Package pandar. April 30, 2018

Package CVR. March 22, 2017

Web Structure Mining Community Detection and Evaluation

Package DiffCorr. August 29, 2016

Community Detection. Community

Package ebdbnet. February 15, 2013

9/29/13. Outline Data mining tasks. Clustering algorithms. Applications of clustering in biology

July 7, 2015: New Version of textir

Comparison of Optimization Methods for L1-regularized Logistic Regression

Package QUBIC. September 1, 2018

Package ClustGeo. R topics documented: July 14, Type Package

Package pairsd3. R topics documented: August 29, Title D3 Scatterplot Matrices Version 0.1.0

Package sglasso. May 26, 2018

Package sigmanet. April 23, 2018

Chapter 11. Network Community Detection

ABSTRACT I. INTRODUCTION II. METHODS AND MATERIAL

Introduction for heatmap3 package

The supclust Package

Package arulesviz. April 24, 2018

Cluster Analysis for Microarray Data

10601 Machine Learning. Hierarchical clustering. Reading: Bishop: 9-9.2

Online Social Networks and Media. Community detection

MSA220 - Statistical Learning for Big Data

Package EBglmnet. January 30, 2016

Package bayescl. April 14, 2017

Tutorial for the WGCNA package for R II. Consensus network analysis of liver expression data, female and male mice

Graph Theory. Graph Theory. COURSE: Introduction to Biological Networks. Euler s Solution LECTURE 1: INTRODUCTION TO NETWORKS.

Graph Algorithms. Parallel and Distributed Computing. Department of Computer Science and Engineering (DEI) Instituto Superior Técnico.

CFinder The Community / Cluster Finding Program. Users' Guide

M2 in Statistics & Econometrics Graph mining Lesson 3 - Tests and random graphs

Package MODA. January 8, 2019

AA BB CC DD EE. Introduction to Graphics in R

Exploring cdna Data. Achim Tresch, Andreas Buness, Tim Beißbarth, Florian Hahne, Wolfgang Huber. June 17, 2005

Clustering Algorithms on Graphs Community Detection 6CCS3WSN-7CCSMWAL

Package smoothr. April 4, 2018

Package KEGGprofile. February 10, 2018

Package virtualarray

INTRODUCTION TO R. Basic Graphics

Supervised vs unsupervised clustering

##########################################################################

Package snowboot. R topics documented: October 19, Type Package Title Bootstrap Methods for Network Inference Version 1.0.

limma: A brief introduction to R

16 - Networks and PageRank

Lecture 4: Undirected Graphical Models

Mixed models in R using the lme4 package Part 2: Lattice graphics

Appendix I. TACTICS Toolbox v3.x. Interactive MATLAB Platform For Bioimaging informatics. User Guide TRACKING MODULE

Bioconductor exercises 1. Exploring cdna data. June Wolfgang Huber and Andreas Buness

Package fso. February 19, 2015

Bioconductor s sva package

Application of Hierarchical Clustering to Find Expression Modules in Cancer

Package camel. September 9, 2013

Package corclass. R topics documented: January 20, 2016

Instruction: Download and Install R and RStudio

Package LncPath. May 16, 2016

Package desplot. R topics documented: April 3, 2018

Package flare. August 29, 2013

Graphics - Part III: Basic Graphics Continued

Package ConvergenceClubs

Package ITNr. October 11, 2018

Picasso: A Sparse Learning Library for High Dimensional Data Analysis in R and Python

Basics of Network Analysis

Introduction to R. Biostatistics 615/815 Lecture 23

Transcription:

An introduction to network inference and mining - TP Nathalie Villa-Vialaneix - nathalie.villa@toulouse.inra.fr http://www.nathalievilla.org INRA, UR 0875 MIAT Formation INRA, Niveau 3 Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 1 / 40

Packages in R is provided with basic functions but more than 3, 000 packages are available on the CRAN (Comprehensive R Archive Network) for additionnal functions (see also the project Bioconductor). Installing new packages (has to be done only once) with the command line or the menu (Windows or Mac OS X or RStudio) install. packages (c(" huge "," igraph ", " mixomics ")) Loading a package (has to be done each time R is re-started) library ( igraph ) Be sure you properly set the working directory (directory in which you put the data files) before you start Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 2 / 40

Outline 1 Network inference Data description Inference with glasso Basic mining 2 Network mining Building the graph with igraph Global characteristics Visualization Individual characteristics Clustering 3 Use gephi Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 3 / 40

Network inference Outline 1 Network inference Data description Inference with glasso Basic mining 2 Network mining Building the graph with igraph Global characteristics Visualization Individual characteristics Clustering 3 Use gephi Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 4 / 40

Network inference Data description Use case description Data in the R package mixomics microarray data: expression of 120 selected genes potentially involved in nutritional problems on 40 mice. These data come from a nutrigenomic study [Martin et al., 2007]. data ( nutrimouse ) summary ( nutrimouse ) expr <- nutrimouse $ gene rownames ( expr ) <- paste (1: nrow ( expr ), nutrimouse $ genotype, nutrimouse $ diet, sep ="-") Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 5 / 40

Network inference Data description Data distribution boxplot (expr, names = NULL ) Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 6 / 40

Network inference Data description Gene correlations and clustering expr.c <- scale ( expr ) heatmap (as. matrix ( expr.c)) Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 7 / 40

Network inference Data description Hierarchical clustering from raw data hclust. tree <- hclust ( dist (t( expr ))) plot ( hclust. tree ) rect. hclust ( hclust.tree, k=7, border = rainbow (7)) Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 8 / 40

Network inference Data description Save gene clusters hclust. groups <- cutree ( hclust.tree, k =7) table ( hclust. groups ) hclust. groups # 1 2 3 4 5 6 7 # 18 4 20 52 17 6 3 Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 9 / 40

Network inference Inference with glasso Sparse linear regression by Maximum Likelihood Estimation: [Friedman et al., 2008] Gaussien framework allows us to use ML optimization with a sparse penalization n p L (S X ) +pen = log P(X j j i Xi, S j ) λ S 1 i=1 j=1 glasso. res <- huge (as. matrix ( expr ), method =" glasso " glasso. res # M o d e l : g r a p h i c a l l a s s o ( g l a s s o ) # I n p u t : The Data M a t r i x # Path l e n g t h : 10 # G r a p h d i m e n s i o n : 120 # S p a r s i t y l e v e l : 0 - - - - - > 0. 2 1 2 8 8 5 2 estimates of the concentration matrix S are in glasso.res$icov[[1]],..., glasso.res$icov[[10]], each one corresponding to a different λ Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 10 / 40

Network inference Inference with glasso Select λ for a targeted density with the StARS method [Liu et al., 2010] glasso. sel <- huge. select ( glasso.res, criterion =" stars ") plot ( glasso. sel ) Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 11 / 40

Network inference Inference with glasso Using igraph to create the graph From the binary adjacency matrix: bin. mat <- as. matrix ( glasso. sel $ opt. icov )!=0 colnames ( bin. mat ) <- colnames ( expr ) Create an undirected simple graph from the matrix: nutrimouse. net <- simplify ( graph. adjacency ( bin.mat, mode =" max ")) nutrimouse. net # I G R A P H UN - - 120 392 -- # + attr : name ( v/c ) Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 12 / 40

Network inference Basic mining Connected components The resulting network is not connected: is. connected ( nutrimouse. net ) # [1] F A L S E Connected components are extracted by: components. nutrimouse <- clusters ( nutrimouse. net ) components. nutrimouse # [1] 1 2 3 2 2 4 5 2 2 2 6 2 7 2 #... # # $ c s i z e # [1] 7 67 1 6 2 1 1 1 1 1 1 1 1 2 #... # # $ no # [1] 40 Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 13 / 40

Network inference Basic mining Working on a connected subgraph The largest connected component (with 99 nodes) can be extracted with: nutrimouse. lcc <- induced. subgraph ( nutrimouse. net, components. nutrimouse $ membership == which. max ( components. nutrimouse $ csize )) nutrimouse. lcc # I G R A P H UN - - 67 375 -- # + attr : name ( v/c ) and visualized with: nutrimouse. lcc $ layout <- layout. kamada. kawai ( nutrimouse. lcc ) plot ( nutrimouse.lcc, vertex. size =2, vertex. color =" lightyellow ", vertex. frame. color =" lightyellow ", vertex. label. color =" black ", vertex. label. cex =0.7) Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 14 / 40

Network inference Basic mining Resulting network Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 15 / 40

Network inference Basic mining Node clustering (compared to raw clustering) Using one of the clustering function available in igraph: set. seed (1219) clusters. nutrimouse <- spinglass. community ( nutrimouse. lcc ) clusters. nutrimouse table ( clusters. nutrimouse $ membership ) # 1 2 3 4 # 15 25 17 10 The hierarchical clustering for nodes in the largest connected component is obtained with: induced. hclust. groups <- hclust. groups [ components. nutrimouse $ membership == which. max ( components. nutrimouse $ csize )] table ( induced. hclust. groups ) # 1 3 4 5 6 # 6 17 42 1 1 Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 16 / 40

Network inference Basic mining Visual comparison par ( mfrow =c (1,2)) par ( mar = rep (1,4)) plot ( nutrimouse.lcc, vertex. size =5, vertex. color = rainbow (6)[ induced. hclust. groups ] vertex. frame. color = rainbow (6)[ induced. hclust. groups ], vertex. label =NA, main =" Hierarchical clustering ") plot ( nutrimouse.lcc, vertex. size =5, vertex. color = rainbow (4)[ clusters. nutrimouse $ membership ], vertex. frame. color = rainbow (4)[ clusters. nutrimouse $ membership ], vertex. label =NA, main =" Graph clustering ") Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 17 / 40

Network inference Basic mining Visual comparison Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 18 / 40

Network inference Basic mining Numeric comparison compare. communities ( induced. hclust. groups, clusters. nutrimouse $ membership, method =" nmi ") # [1] 0. 5 0 9 1 5 6 8 modularity ( nutrimouse.lcc, induced. hclust. groups ) # [1] 0. 2 3 6 8 4 6 2 Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 19 / 40

Network mining Outline 1 Network inference Data description Inference with glasso Basic mining 2 Network mining Building the graph with igraph Global characteristics Visualization Individual characteristics Clustering 3 Use gephi Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 20 / 40

Network mining Building the graph with igraph Use case description Data are Natty s facebook network 1 fbnet-el.txt is the edge list; fbnet-name.txt are the nodes initials. edgelist <- as. matrix ( read. table (" fbnet -el.txt ")) vnames <- read. table (" fbnet - name. txt ") vnames <- as. character ( vnames [,1]) The graph is built with: # with g r a p h. edgelist fbnet0 <- graph. edgelist ( edgelist, directed = FALSE ) fbnet0 # I G R A P H U - - - 152 551 -- See also help(graph.edgelist) for more graph constructors 1 Do it with yours: http://shiny.nathalievilla.org/fbs Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 21 / 40

Network mining Building the graph with igraph Vertexes, vertex attributes The graph s vertexes are accessed and counted with: V( fbnet0 ) vcount ( fbnet0 ) Vertexes can be described by attributes: # add an a t t r i b u t e for v e r t i c e s V( fbnet0 ) $ initials <- vnames fbnet0 # I G R A P H U - - - 152 551 -- # + attr : i n i t i a l s ( v/x ) Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 22 / 40

Network mining Building the graph with igraph Edges, edge attributes The graph s edges are accessed and counted with: E( fbnet0 ) # [1] 11 -- 1 # [2] 41 -- 1 # [3] 52 -- 1 # [4] 69 -- 1 # [5] 74 -- 1 # [6] 75 -- 1 #... ecount ( fbnet0 ) # 551 igraph can also handle edge attributes (and also graph attributes). Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 23 / 40

Network mining Global characteristics Connected components is. connected ( fbnet0 ) # [1] F A L S E \ end { Rcode } As this network is not connected, the connected com \ begin { Rcode } fb. components <- clusters ( fbnet0 ) names (fb. components ) # [1] " m e m b e r s h i p " " c s i z e " " no " head (fb. components $ membership, 10) # [1] 1 1 2 2 1 1 1 1 3 1 fb. components $ csize # [1] 122 5 1 1 2 1 1 1 2 1 # [11] 1 2 1 1 2 3 1 1 1 1 # [21] 1 fb. components $no # [1] 21 Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 24 / 40

Network mining Global characteristics Largest connected component fbnet. lcc <- induced. subgraph ( fbnet0, fb. components $ membership == which. max (fb. components $ csize )) # main c h a r a c t e r i s t i c s of the LCC fbnet. lcc # I G R A P H U - - - 122 535 -- # + attr : i n i t i a l s ( v/x ) is. connected ( fbnet. lcc ) # [1] TRUE and global characteristics graph. density ( fbnet. lcc ) # [1] 0. 0 7 2 4 8 3 4 transitivity ( fbnet. lcc ) # [1] 0. 5 6 0 4 5 2 4 Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 25 / 40

Network mining Visualization Network visualization Different layouts are implemented in igraph to visualize the graph: plot ( fbnet.lcc, layout = layout. random, main =" random layout ", vertex. size =3, vertex. color =" pink ", vertex. frame. color =" pink " vertex. label. color =" darkred ", edge. color =" grey ", vertex. label =V( fbnet. lcc )$ initials ) Try also layout.circle, layout.kamada.kawai, layout.fruchterman.reingold... Network are generated with some randomness. See also help(igraph.plotting) for more information on network visualization igraph integrates a pre-defined graph attribute layout: V( fbnet. lcc )$ label <- V( fbnet. lcc )$ initials plot ( fbnet. lcc ) Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 26 / 40

Network mining Visualization Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 27 / 40

Network mining Individual characteristics Degree and betweenness fbnet. degrees <- degree ( fbnet. lcc ) summary ( fbnet. degrees ) # Min. 1 st Qu. M e d i a n Mean 3 rd Qu. Max. # 1.00 2.00 6.00 8.77 1 5. 0 0 3 1. 0 0 fbnet. between <- betweenness ( fbnet. lcc ) summary ( fbnet. between ) # Min. 1 st Qu. M e d i a n Mean 3 rd Qu. Max. # 0.00 0.00 1 4. 0 3 3 0 1. 7 0 1 2 3. 1 0 3 4 3 9. 0 0 and their distributions: par ( mfrow =c (1,2)) plot ( density ( fbnet. degrees ), lwd =2, main =" Degree distribution ", xlab =" Degree ", ylab =" Density ") plot ( density ( fbnet. between ), lwd =2, main =" Betweenness distribution ", xlab =" Betweenness ", ylab =" Density ") Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 28 / 40

Network mining Individual characteristics Degree and betweenness distribution Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 29 / 40

Network mining Individual characteristics Combine visualization and individual characteristics par ( mar = rep (1,4)) # set node a t t r i b u t e size with d e g r e e V( fbnet. lcc )$ size <- 2* sqrt ( fbnet. degrees ) # set node a t t r i b u t e color with b e t w e e n n e s s bet. col <- cut ( log ( fbnet. between +1),10, labels = FALSE ) V( fbnet. lcc )$ color <- heat. colors (10)[11 - bet. col ] plot ( fbnet.lcc, main =" Degree and betweenness ", vertex. frame. color = heat. colors (10)[ bet. col ], vertex. label =NA, edge. color =" grey ") Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 30 / 40

Network mining Individual characteristics Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 31 / 40

Network mining Clustering Node clustering One of the function to perform node clustering is spinglass.community (that prossibly produces different results each time it is used since it is based on a stochastic process): fbnet. clusters <- spinglass. community ( fbnet. lcc ) fbnet. clusters # G r a p h c o m m u n i t y s t r u c t u r e c a l c u l a t e d with # the s p i n g l a s s a l g o r i t h m # N u m b e r of c o m m u n i t i e s : 9 # M o d u l a r i t y : 0. 5 6 5 4 1 3 6 # M e m b e r s h i p v e c t o r : # [ 1] 9 5 6 5 5 9 1 9 1 1 4 9 9 6 2 7 2 7 2 7 5 #... table ( fbnet. clusters $ membership ) # 1 2 3 4 5 6 7 8 9 # 8 7 8 32 14 7 18 2 26 See help(communities) for more information. Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 32 / 40

Network mining Clustering Combine clustering and visualization # c r e a t e a new a t t r i b u t e V( fbnet. lcc )$ community <- fbnet. clusters $ membership fbnet. lcc # I G R A P H U - - - 122 535 -- # + attr : l a y o u t ( g/n ), i n i t i a l s ( v/c ), # l a b e l ( v/c ), size ( v/n ), c o l o r ( v/c ), # c o m m u n i t y ( v/n ) Display the clustering: par ( mfrow =c (1,1)) par ( mar = rep (1,4)) plot ( fbnet.lcc, main =" Communities ", vertex. frame. color = rainbow (9)[ fbnet. clusters $ membership ], vertex. color = rainbow (9)[ fbnet. clusters $ membership ], vertex. label =NA, edge. color =" grey ") Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 33 / 40

Network mining Clustering Combine clustering and visualization Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 34 / 40

Network mining Clustering Export the graph The graphml format can be used to export the graph (it can be read by most of the graph visualization programs and includes information on node and edge attributes) write. graph ( fbnet.lcc, file =" fblcc. graphml ", format =" graphml ") see help(write.graph) for more information of graph exportation formats Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 35 / 40

Use gephi Outline 1 Network inference Data description Inference with glasso Basic mining 2 Network mining Building the graph with igraph Global characteristics Visualization Individual characteristics Clustering 3 Use gephi Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 36 / 40

Use gephi Import data Start and select new project. Create a graph from an edge list: file fbnet-el.txt File / Open Graph type: undirected Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 37 / 40

Use gephi Import data Start and select new project. Create a graph from an edge list: file fbnet-el.txt File / Open Graph type: undirected Create a graph (with nodes an edges attributes) from a GraphML file: file fblcc.graphml previously created with igraph File / Open Graph type: undirected On the right part of the screen, the number of nodes and edges are indicated. Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 37 / 40

Use gephi Import data Start and select new project. Create a graph from an edge list: file fbnet-el.txt File / Open Graph type: undirected Create a graph (with nodes an edges attributes) from a GraphML file: file fblcc.graphml previously created with igraph File / Open Graph type: undirected On the right part of the screen, the number of nodes and edges are indicated. Check nodes attributes and define nodes labels Data lab Copy data to a column: initials Copy to: Label Other nodes or edges attributes can be imported from a csv file with import file Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 37 / 40

Use gephi Visualization Visualize the graph with Fruchterman & Reingold algorithm Bottom left of panel Global view Spatialization: Choose Fruchterman and Reingold Area: 800 - Gravity: 1 Click on Stop when stabilized Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 38 / 40

Use gephi Visualization Visualize the graph with Fruchterman & Reingold algorithm Bottom left of panel Global view Spatialization: Choose Fruchterman and Reingold Area: 800 - Gravity: 1 Click on Stop when stabilized Customize the view: zoom or de-zoom with the mouse change link width (bottom toolbox) display labels and change labels size and color (bottom toolbox) with the select tool, visualize a node neighbors (top left toolbox) with the move tool, move a node (top left toolbox) Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 38 / 40

Use gephi Visualization Visualize the graph with Fruchterman & Reingold algorithm Bottom left of panel Global view Spatialization: Choose Fruchterman and Reingold Area: 800 - Gravity: 1 Click on Stop when stabilized Customize the view: zoom or de-zoom with the mouse change link width (bottom toolbox) display labels and change labels size and color (bottom toolbox) with the select tool, visualize a node neighbors (top left toolbox) with the move tool, move a node (top left toolbox) Use the view to understand the graph (delete labels before) Color a node s neighbors (middle left toolbox) Visualize the shortest path between two nodes (middle left toolbox) Visualize the distances from a given node to all the other nodes by nodes coloring (middle left toolbox) Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 38 / 40

Use gephi Graph mining Node characteristics Window / Statistics On the bottom right panel, Degree: run Check on Data Lab On the top left panel, Clustering / Nodes / Size: Choose a clustering parameter: Degree Size: Min size:10, Max size: 50, Apply On the bottom right panel, Shortest path: run Color: Choose a clustering parameter: Betweenness centrality Default: choose a palette, Apply Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 39 / 40

Use gephi Node clustering Node characteristics On the bottom right panel, Modularity: run Check on Data Lab On the top left panel, Clustering / Nodes / Label color: Choose a clustering parameter: Modularity class Color: Default: choose a palette, Apply Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 40 / 40

Use gephi Node clustering Node characteristics On the bottom right panel, Modularity: run Check on Data Lab On the top left panel, Clustering / Nodes / Label color: Choose a clustering parameter: Modularity class Export a view Color: Default: choose a palette, Apply Preview / Default with straight links / Refresh / Export (bottom left) Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 40 / 40

Use gephi Friedman, J., Hastie, T., and Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9(3):432 441. Liu, H., Roeber, K., and Wasserman, L. (2010). Stability approach to regularization selection for high dimensional graphical models. In Proceedings of Neural Information Processing Systems (NIPS 2010), pages 1432 1440, Vancouver, Canada. Martin, P., Guillou, H., Lasserre, F., Déjean, S., Lan, A., Pascussi, J., San Cristobal, M., Legrand, P., Besse, P., and Pineau, T. (2007). Novel aspects of PPARα-mediated regulation of lipid and xenobiotic metabolism revealed through a multrigenomic study. Hepatology, 54:767 777. Formation INRA (Niveau 3) Network (TP) Nathalie Villa-Vialaneix 40 / 40