Intelligent Network Management Using Graph Differential Anomaly Visualization
Qi Liao
Network Management
What is going on in the network?
[Figure: enterprise network diagram showing the Internet, DMZ, public servers, private servers, wired users, wireless users, applications, and data]
Central Michigan University
Security Management
Needs of the network manager
- Health check
- Situation awareness
- Accountability / forensics
- Troubleshooting
Challenges
- Huge amount of data
- Complexity
- Dynamics
- Gap between daily monitoring and operational interpretation
Network Anomaly
Network anomaly detection is useful in many areas of network management.
Some examples of easy anomalies
- Readings from a sensor network
- DoS attack
- Port scanning
- Packet headers that match a pattern
More general (harder) anomalies
- Stealthy
- Less traffic
Given only a time series of network graphs, can we detect abnormal changes and find the underlying causes?
Graph Diff. Anomaly Visualization
[Figure: my network at time i vs. my network at time j; labels: spatial anomalies, temporal anomalies]
How similar / different?
Differential Anomaly Visualization
Graph differential anomaly visualization (DAV) framework
- Whole graphs
- Nodes and edges
- Communities (subgraphs)
More tolerant of network dynamics.
Effectively visualizes the dynamics and abnormal changes among heterogeneous, time-series network graphs.
Monitoring Where, Who, and What
Need finer granularity than raw network connectivity
Two important enterprise network components
- Who (users) is responsible
- What (applications) is running on the network
CONTENT vs. CONTEXT
Context is associated with each network connection
- Users, applications, parameters, file accesses, etc.
Local Context
[Figure: a host with its local context: users and applications]
Bigger picture: what is happening on the network
Traditional view
[Figure: host-to-host connectivity graph. Hosts (H) include www.cmich.edu, name.cmich.edu, lab01.cps.cmich.edu, directory.cmich.edu, and R3208.orange.fr; edge labels include 80,tcp; 53,udp; 8745,udp; 4157,tcp; 2128,tcp; 9875,tcp; and 79,tcp]
Most existing tools show this view: web traffic in, web traffic out, DNS, Active Directory
Network flows: who and what?
[Figure: graph linking hosts (H), users (U), and applications (A). Hosts: www.cmich.edu, lab01.cps.cmich.edu, R3208.orange.fr. Users: qliao, rmcfall, admin, www. Applications: IIS, nessus, firefox, apache. Edge labels: 80,tcp; 4157,tcp; 9875,tcp]
Network Context Graphs
Data Collection Agent
Gathers context from local hosts: who (users), what (applications), when (time), where (hosts)
Built-in system tools (free and robust)
- netstat, ps, lsof, diff
- Together they supply the who, what, where, and when of the context
Easy to deploy (no change to existing systems)
Lightweight
- CPU: < 2%
- Bandwidth: 240 Kbps for 1000 hosts (about 0.2% of 100 Mbps)
- Disk: 1 GB per host per year
Visual Analysis for the Enterprise Network Management and Security
HUA Graph View
[Figure: screenshot of the HUA (host-user-application) graph view. Controls: graph controls, hops, node selection, sort by degrees, weights, or names. Node groups: monitored hosts, external domains, apps, users]
Bipartite graphs
The general HUA connectivity graphs can be separated into (multi-)bipartite graphs.
[Figure: bipartite graph of source hosts and destination hosts]
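The separation into bipartite graphs can be sketched as follows. This is a minimal illustration, not the tool's implementation: it assumes each node carries a type label ("host", "user", or "app"), and the function name `split_bipartite` and the sample edges in the usage note are hypothetical.

```python
from collections import defaultdict

def split_bipartite(edges, node_type):
    """Group edges by the pair of node types they connect.

    edges: iterable of (u, v) pairs.
    node_type: dict mapping each node to "host", "user", or "app" (assumed labels).
    Returns a dict keyed by a sorted type pair, e.g. ("app", "user").
    """
    parts = defaultdict(set)
    for u, v in edges:
        key = tuple(sorted((node_type[u], node_type[v])))
        parts[key].add((u, v))
    return dict(parts)
```

For example, an edge between user qliao and application firefox would land in the ("app", "user") graph, while a connection between two hosts would land in the ("host", "host") graph, which corresponds to the source host / destination host view.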
K-partite graphs
[Figure: quadripartite graph with layers Hosts, Users, Applications, Hosts; annotations: infogain, critical path]
Similarity Graphs (app)
[Figure: application similarity graph in which the number of users bridges applications; groups labeled local users (root) and enterprise users (condor)]
Visual Analysis for Network Management
Data mining / machine learning
- Automatic
- Algorithmic, analytic methods
Visualization
- Manual
- Interactive visual exploration
Bring in domain knowledge from experienced managers.
Differential Anomaly Visualization
- What are the changes?
- What varies and what stays invariant?
- How similar (or different) are day-to-day network activities?
- Which changes are normal and which are abnormal?
- How can we quantify and visualize the evolution of changes?
Dynamic and noisy data (hosts, users, applications) -> differential visualization -> insights (variants, invariants, abnormal behaviors, root causes)
Hierarchical DAV (overview + context)
- Whole graphs
- Nodes / edges
- Communities
Graph Properties
- Graph sizes
- Clustering coefficients
- Graph diameters
- Degree distributions
- Graph distances
- Graph variance scores
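Several of these whole-graph properties can be computed per time step and compared across the series. The sketch below is a minimal pure-Python illustration under stated assumptions: the graph is undirected and given as an adjacency dict, and the function name `graph_properties` is hypothetical, not part of the DAV tool.

```python
from collections import Counter, deque
from itertools import combinations

def graph_properties(adj):
    """adj: dict mapping node -> set of neighbors (undirected graph)."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    degree_dist = Counter(len(nbrs) for nbrs in adj.values())

    # Average local clustering coefficient (nodes with degree < 2 count as 0).
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    clustering = total / n if n else 0.0

    def eccentricity(src):
        # Breadth-first search: longest shortest path from src to a reachable node.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        return max(dist.values())

    diameter = max((eccentricity(v) for v in adj), default=0)
    return {"nodes": n, "edges": m, "degree_dist": degree_dist,
            "clustering": clustering, "diameter": diameter}
```

Running this once per snapshot gives a small vector of numbers per graph, and a sudden change in any of them is a coarse hint that the network has changed.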
Graph Similarity
General graph isomorphism
[Figure: example graphs with nodes netscale, Iss-node1, Iss-node2, Iss-node3, Iss-node4, cclsun1, cclsun3, and wizard; a further panel shows a more complex example]
Graph distance
Edit distance: the number of operations required to transform one graph into the other.
Graph edit distance (GED) [Bunke07] is used to measure graph similarity.
Maximum common subgraph (MCS) based:
$$d(g_1, g_2) = 1 - \frac{|mcs(g_1, g_2)|}{\max(|g_1|, |g_2|)}$$
Graph edit distance (GED) based:
$$d(g_1, g_2) = |g_1| + |g_2| - 2\,|mcs(g_1, g_2)|$$
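Both distances are easy to compute when every node has a unique label (hostname, user name, application name), because the maximum common subgraph then reduces to a set intersection. The sketch below rests on that assumption and on a second one: that the size |g| counts nodes plus edges. The function names are hypothetical.

```python
def graph_size(g):
    """g is a (nodes, edges) pair of sets; size is assumed to be nodes + edges."""
    nodes, edges = g
    return len(nodes) + len(edges)

def mcs(g1, g2):
    """Maximum common subgraph, assuming uniquely labeled nodes."""
    return (g1[0] & g2[0], g1[1] & g2[1])

def mcs_distance(g1, g2):
    """d = 1 - |mcs(g1, g2)| / max(|g1|, |g2|), in the range [0, 1]."""
    biggest = max(graph_size(g1), graph_size(g2))
    if biggest == 0:
        return 0.0  # two empty graphs are treated as identical
    return 1.0 - graph_size(mcs(g1, g2)) / biggest

def ged_distance(g1, g2):
    """d = |g1| + |g2| - 2 |mcs(g1, g2)|, a count of edit operations."""
    return graph_size(g1) + graph_size(g2) - 2 * graph_size(mcs(g1, g2))
```

The MCS-based distance is normalized, which makes graphs of different sizes comparable, while the GED-based distance grows with the absolute number of changed nodes and edges.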
Expected Graphs (EG)
- Minimum common supergraphs (MCP)
- Maximum common subgraphs (MCS) = invariance
- Median Graph (MG)
[Figure: graphs g1, g2, g3 with regions labeled MCS, MCP / MCPP, and variance]
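For a series of snapshots with uniquely labeled nodes, the maximum common subgraph and the minimum common supergraph can be approximated with plain set operations on the edge sets. This is a minimal sketch under that assumption; the function name `expected_graphs` is hypothetical and the median graph is not covered.

```python
def expected_graphs(series):
    """series: non-empty list of edge sets, one per time step.

    Returns (mcs, mcp, variance):
      mcs      - edges present at every time step (the invariant part)
      mcp      - edges present at any time step
      variance - edges that come and go (mcp minus mcs)
    """
    mcs = set(series[0]).intersection(*series[1:])
    mcp = set().union(*series)
    return mcs, mcp, mcp - mcs
```

A new snapshot can then be compared against the invariant part to see which long-standing links are missing, or against the supergraph to see which links have never been observed before.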
Differential visualization
[Figure: differential graph view with show / hide controls; legend: new (appear), old (disappear), invariance; caption: spatio-temporal dynamics]
Differential visualization
[Figure: panels labeled All, Old (disappear), New (appear), and Invariance]
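The three categories in the differential view (new, old, invariant) correspond to set differences between two consecutive snapshots. A minimal sketch, assuming each snapshot is a set of edges with uniquely labeled endpoints; the function name `diff_graphs` is hypothetical.

```python
def diff_graphs(prev_edges, curr_edges):
    """Classify edges between two snapshots taken at time i and time j."""
    return {
        "new": curr_edges - prev_edges,        # appeared at time j
        "old": prev_edges - curr_edges,        # disappeared by time j
        "invariant": prev_edges & curr_edges,  # present at both times
    }
```

A viewer could draw each category in its own color and let the user show or hide a category to focus on one kind of change.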
Link Anomalies
Not exactly a link prediction problem, which makes these assumptions:
- Common neighbors assumption
- Known nodes only assumption
- Non-dynamic assumption
Proof-of-concept: non-linear weighting frequency function
- P(L_i): probability that the i-th link appears, computed from the time-weighted appearances w(t) d_t over t = 1, ..., N
- d_t \in {0, 1}: whether the i-th link appears at time t
- w(t): non-linear time weighting function
Can take inputs from future link anomaly algorithms
Link Anomalies Visualization
- RED: Type-I anomaly: should appear but did not appear
- BLUE: Type-II anomaly: should not appear but appeared
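The two anomaly types can be illustrated with a weighted-frequency estimate of how likely a link is to appear. The sketch below is an assumption-laden illustration, not the exact formula from the talk: it assumes the probability is the weighted average of the 0/1 appearance flags, it assumes an exponential weight that favors recent time steps, and the thresholds `high` and `low` are hypothetical.

```python
import math

def time_weight(t, n):
    """Assumed non-linear weight: recent steps (t close to n) count more."""
    return math.exp(-(n - t) / n)

def link_probability(history):
    """history: list of 0/1 flags d_t for t = 1..N, oldest first."""
    n = len(history)
    if n == 0:
        return 0.0
    weights = [time_weight(t, n) for t in range(1, n + 1)]
    return sum(w * d for w, d in zip(weights, history)) / sum(weights)

def classify_link(history, appeared_now, high=0.8, low=0.2):
    """Return "type-I", "type-II", or "normal" (thresholds are assumptions)."""
    p = link_probability(history)
    if p >= high and not appeared_now:
        return "type-I"   # should appear but did not (red)
    if p <= low and appeared_now:
        return "type-II"  # should not appear but appeared (blue)
    return "normal"
```

A link that has been present at every past step and is missing now is flagged as Type-I, while a link that has never been seen and shows up now is flagged as Type-II.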
Link Anomalies Visualization
[Figure: graph highlighting links that should not appear and links that should appear]
Link Anomalies Visualization
[Figure: graph highlighting links that should appear]
Community-based DAV
Intermediate similarity metric
- COARSE: graph property changes
- Intermediate: community membership changes
- Susceptible to the dynamics of graphs
- FINE: node / edge changes
Balance of granularity and complexity
Intra-graph clusters visualization
Clusters found with Walktrap [Pons:2006]:
1) firefox
2) httpd web
3) desk apps
4) Condor research computing
Temporal Community Evolution
[Figure: clusters of users U1 to U9 on day i and day i+1; labels: Finance/HR cluster, Sales cluster, botnets, cc3.irc.ru]
Community-based DAV
Graph changes via community similarity
- Similar to the Rand Index [Rand71]
$$dist(C_1, C_2) = 1 - \frac{SS + DD}{SS + SD + DS + DD}$$
- Flexibility
- Suitability for highly dynamic networks
- If nodes consistently belong to the same (or different) communities, the changes are normal
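A Rand-index-style distance between two community assignments can be sketched as follows. It uses the standard Rand index reading of the pair counts, which is an assumption about the slide's notation: SS counts node pairs placed in the same community in both assignments, DD counts pairs placed in different communities in both, and SD / DS count pairs on which the two assignments disagree. Only nodes present in both assignments are compared, and the function name is hypothetical.

```python
from itertools import combinations

def community_distance(c1, c2):
    """c1, c2: dicts mapping node -> community label at two time steps."""
    shared = list(set(c1) & set(c2))
    ss = sd = ds = dd = 0
    for u, v in combinations(shared, 2):
        same1 = c1[u] == c1[v]
        same2 = c2[u] == c2[v]
        if same1 and same2:
            ss += 1
        elif same1:
            sd += 1
        elif same2:
            ds += 1
        else:
            dd += 1
    total = ss + sd + ds + dd
    if total == 0:
        return 0.0  # fewer than two shared nodes: nothing to compare
    return 1.0 - (ss + dd) / total
```

Because only pairwise co-membership matters, the distance ignores community labels: renaming a community does not register as a change, but a user moving between communities does.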
Community-based DAV (example)
[Figure: community changes over time, with communities detected by Walktrap]
Anomaly caused by a spike of community changes at times 8 and 9
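A spike like this can also be flagged automatically from the series of community distances. The sketch below uses a simple mean-plus-k-standard-deviations threshold, which is an assumed detection rule chosen for illustration and not a method stated in the talk; the function name, the value of `k`, and the sample series in the usage note are hypothetical.

```python
from statistics import mean, pstdev

def spike_times(distances, k=1.5):
    """Return the time indices whose distance exceeds mean + k * stdev.

    distances: list of community distances, one per time step.
    k: assumed sensitivity parameter; larger values flag fewer spikes.
    """
    if len(distances) < 2:
        return []
    threshold = mean(distances) + k * pstdev(distances)
    return [t for t, d in enumerate(distances) if d > threshold]
```

With a made-up series that stays at 0.1 except for 0.6 at time steps 8 and 9, the function returns those two time steps, which the analyst can then inspect in the detailed views.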
Community-based DAV (MDS view)
[Figure: MDS plot of graphs/communities; labeled points include C0, C8, C9, C10, and C11]
Nodes that are farther away indicate anomalous user behaviors
Communities of a User Similarity Graph
Time: 8
[Figure: user similarity graph with a Condor community and a grad students community]
Communities of a User Similarity Graph
Time: 9
[Figure: user similarity graph with a grad students community and a Condor community]
Users change community membership
Conclusion
Network (security) management is hard.
- Large scale, heterogeneity, dynamics, complexity
Anomaly detection and analysis are important yet challenging.
We developed a novel hierarchical graph differential anomaly visualization (DAV) framework.
- Combines automated graph data mining and manual exploration
- Works at different levels: graphs, nodes/edges, communities
- Completeness
  - Overview vs. details-on-demand
  - Exact changes vs. dynamic churn
  - Detection vs. root causes
DAV: an intelligent, time-efficient management alternative.
For more information, visit http://cps.cmich.edu/liao1q
Thank you!
Questions