Graph-based Detection of Anomalous Network Traffic

Graph-based Detection of Anomalous Network Traffic Do Quoc Le Supervisor: Prof. James Won-Ki Hong Distributed Processing & Network Management Lab Division of IT Convergence Engineering POSTECH, Korea lequocdo@postech.ac.kr 2012. 06. 22 POSTECH 1/26

Contents
- Introduction & Motivation
- Related Work
- Graph-based Network Traffic Modeling
- Graph Metrics
- Anomaly Detection & Attack Identification
- Validation
- Conclusion

Introduction & Motivation
- The Internet continues to grow in size and complexity, and security has become a critical issue.
- Traffic anomalies occur frequently: DDoS attacks, flash crowds, port scans, and worms.
- Challenges:
  - Attacks are increasingly sophisticated.
  - Attacks are often hidden in existing applications such as IRC, HTTP, or peer-to-peer, e.g. worm scans or botnet C&C traffic.
- Methods for detecting traffic anomalies:
  - Signature-based techniques cannot detect anomalies caused by unknown attacks.
  - Anomaly-based techniques (machine learning, data mining, statistical analysis, etc.) generate a huge number of false alarms, are time consuming, and cannot detect anomalies whose traffic resembles that of normal applications in traffic volume, number of packets, number of flows, and average packet size.

Introduction & Motivation
- Goal: improve the detection accuracy and capability of state-of-the-art anomaly detection techniques.
- Solution: use a graph-based method to monitor network traffic and analyze the structure of communication patterns in order to detect anomalies and identify attacks.
- Why study the structure of communication patterns in network traffic?
  - Each attack has its own structure.
  - The structure of communication patterns changes when attacks occur.
  - Structural analysis can reveal attacks that are difficult to detect by conventional means.

Contribution
- One of the first works to use Traffic Dispersion Graphs (TDGs) to detect anomalies:
  - Focuses on structural characteristics of networks.
  - Improves on the performance and capability of state-of-the-art techniques.
  - Supports intuitive visualization of traffic patterns.
- Introduces a new metric for analyzing network traffic communication patterns over time.
- Implements an online anomaly detection system in an enterprise network based on the proposed method.
- Evaluates the approach by analyzing real attack traces.

Related Work
- Zhou et al. [1] proposed a network traffic anomaly detection method based on graph mining:
  - Mining time-series graphs.
  - Mining edge weights.
  - Entropy of four attributes: source and destination IP address, source and destination port.
  - Drawback: enormous size and computational complexity.
  - In contrast, we analyze unlabeled graphs and concentrate only on their nodes.
- Godiyal et al. [2] used a graph matching method to identify attacks:
  - Applying an isomorphism algorithm to the whole traffic flow is very time consuming.
  - In contrast, we identify attacks in abnormal network traffic only.

Related Work (cont.)
- Iliofotou et al. [3] use TDGs to model network traffic as a series of related graphs over time:
  - Graph metrics used: degree and degree distribution, entropy of the degree distribution, graph edit distance.
  - Target problem: traffic classification, with possible application to anomaly detection.
- We model network traffic as TDGs over time using new metrics.

Network Traffic Modeling
- Traffic Dispersion Graph (TDG):
  - Each node is an IP address.
  - Each edge is an interaction (flow) between two nodes.
- [Figure: example flows among hosts A, B-1, B-2, D-1, D-2, and F-1, and the TDG generated from them.]
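The node and edge definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `build_tdg` and `degrees` are hypothetical, the flow records are assumed to be reduced to (source IP, destination IP) pairs, and the addresses come from the documentation range 192.0.2.0/24.

```python
from collections import defaultdict

def build_tdg(flows):
    """Build a directed TDG from (src_ip, dst_ip) flow pairs.

    Nodes are IP addresses; an edge (src, dst) exists if at least
    one flow went from src to dst (repeated flows collapse into one edge).
    """
    nodes, edges = set(), set()
    for src, dst in flows:
        nodes.update((src, dst))
        edges.add((src, dst))
    return nodes, edges

def degrees(edges):
    """Return (in_degree, out_degree) dictionaries for a directed edge set."""
    in_deg, out_deg = defaultdict(int), defaultdict(int)
    for src, dst in edges:
        out_deg[src] += 1
        in_deg[dst] += 1
    return in_deg, out_deg

# Hypothetical flows observed in one sampling interval.
flows = [
    ("192.0.2.1", "192.0.2.10"),
    ("192.0.2.1", "192.0.2.11"),
    ("192.0.2.1", "192.0.2.10"),  # repeated flow, same edge
    ("192.0.2.2", "192.0.2.10"),
]
nodes, edges = build_tdg(flows)
in_deg, out_deg = degrees(edges)
print(len(nodes), len(edges), out_deg["192.0.2.1"], in_deg["192.0.2.10"])
```

A scanning host would show up here as a node with a large out-degree, which is the kind of structure the later slides quantify.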

TDG Visualization (source: Iliofotou et al.)
- HTTP:
  - Many disconnected components.
  - Very few nodes with both in- and out-degree (web proxies?).
- Slammer worm (UDP, destination port 1434):
  - Many high out-degree nodes.
  - Many disconnected components.
  - The majority of nodes have only in-degree: these are the nodes being scanned.

Graph Metrics on TDGs
- What we have seen so far: visualization is useful by itself, but it requires a human operator.
- Next step: translate visual intuition into quantitative measures.
- How to quantitatively characterize the properties of TDGs?
  - Step 1: represent traffic as a sequence of graph snapshots over time: G_{t_0}, G_{t_1}, G_{t_2}, ..., G_{t_n}.
  - Step 2: use metrics that quantify differences between graphs.
- Question: what are the differences in communication structure between two snapshots G_x and G_y?

Graph Metrics on TDGs: Static metrics
- Node degree: in-degree and out-degree.
- Degree distribution: shows an approximate power law.
- Maximum degree (Kmax): one of the metrics used to detect DDoS attacks.
- Degree assortativity: measures the tendency of nodes to be connected to nodes of similar degree.
- Entropy of the degree distribution: quantifies the heterogeneity of the network:
  $$H(X) = -\sum_{k=1}^{k_{\max}} P(k) \log P(k)$$
  where P(k) is the probability that a node has degree k.
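As a rough illustration of two of the static metrics, the sketch below computes Kmax and the entropy of the degree distribution for a small graph. It rests on assumptions the slide does not settle: degree is taken as total degree (in plus out), the logarithm is natural, and the helper names `node_degrees`, `k_max`, and `degree_entropy` are hypothetical.

```python
import math
from collections import Counter

def node_degrees(edges):
    """Total degree (in + out) of every node in an edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def k_max(edges):
    """Maximum node degree, or 0 for an empty graph."""
    deg = node_degrees(edges)
    return max(deg.values()) if deg else 0

def degree_entropy(edges):
    """H = -sum_k P(k) log P(k), with P(k) the fraction of nodes of degree k."""
    deg = node_degrees(edges)
    n = len(deg)
    if n == 0:
        return 0.0
    histogram = Counter(deg.values())  # degree value -> number of nodes
    return -sum((c / n) * math.log(c / n) for c in histogram.values())

# A star: one hub contacting four leaves, as a scanner or DDoS target might.
star = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d")]
print(k_max(star), degree_entropy(star))
```

For the star, four of five nodes have degree 1 and one has degree 4, so P(1) = 0.8 and P(4) = 0.2.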

Graph Metrics on TDGs: Dynamic metrics
- Graph edit distance:
  $$d(G_i, G_j) = |V_i| + |V_j| - 2\,|V_i \cap V_j| + |E_i| + |E_j| - 2\,|E_i \cap E_j|$$
  where V_i, E_i and V_j, E_j are the node and edge sets of graphs G_i and G_j, respectively, and |·| denotes the number of elements.
- dk-2 distance metric:
  - Based on the dk-series concept (structure analysis with dk-n series, n = 1, 2, 3, ...).
  - Looks at inter-dependencies among topology characteristics.
  - dk-n series are degree correlations within simple connected graphs of size n.
  - dk-2 describes the joint node degree distribution.
  - dk-2 distance(G, G') = Euclidean distance between dk-2(G) and dk-2(G').
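The two dynamic metrics can be sketched as follows. This is an illustrative reading of the slide, not the authors' code: graphs are assumed to be given as Python sets of nodes and of edge tuples, dk-2 is represented as raw (unnormalized) counts of edges per degree pair using total degree, and the names `graph_edit_distance`, `dk2`, and `dk2_distance` are hypothetical.

```python
import math
from collections import Counter

def graph_edit_distance(nodes_i, edges_i, nodes_j, edges_j):
    """|Vi| + |Vj| - 2|Vi & Vj| + |Ei| + |Ej| - 2|Ei & Ej| on node/edge sets."""
    return (len(nodes_i) + len(nodes_j) - 2 * len(nodes_i & nodes_j)
            + len(edges_i) + len(edges_j) - 2 * len(edges_i & edges_j))

def dk2(edges):
    """Joint degree distribution: number of edges per (k1, k2) degree pair."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    joint = Counter()
    for u, v in edges:
        k1, k2 = sorted((deg[u], deg[v]))  # treat the pair as unordered
        joint[(k1, k2)] += 1
    return joint

def dk2_distance(edges_a, edges_b):
    """Euclidean distance between the dk-2 count vectors of two graphs."""
    a, b = dk2(edges_a), dk2(edges_b)
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in set(a) | set(b)))

# Two consecutive snapshots sharing nodes a, b and edge (a, b).
g1_nodes, g1_edges = {"a", "b", "c"}, {("a", "b"), ("b", "c")}
g2_nodes, g2_edges = {"a", "b", "d"}, {("a", "b"), ("b", "d")}
print(graph_edit_distance(g1_nodes, g1_edges, g2_nodes, g2_edges))

# A path versus a star: different joint degree structure.
path = {("a", "b"), ("b", "c")}
star = {("s", "x"), ("s", "y"), ("s", "z")}
print(dk2_distance(path, star))
```

Note that the two snapshots above have identical structure and so a dk-2 distance of zero, while their edit distance is nonzero because the participating hosts changed; the two metrics capture different kinds of change.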

Anomaly Detection & Attack Identification
- Graph metrics are used to detect abnormal network traffic.
- Anomalies: attacks that change the communication structure of the network (DDoS attacks, Internet worms, and scanning).
- The overall process consists of two parts: anomaly detection and attack identification.
- Figure 4. Overall detection process: Network Traffic Flow, then Anomaly Detection, then Attack Identification, then Alarm.

Anomaly Detection & Attack Identification: Anomaly Detection
- Step 1: Sample network traffic and generate network flows.
- Step 2: Create a TDG (Dot format) from the network flows in each sampling interval.
- Step 3: Calculate the adjacency matrices of the TDG and its graph metrics.
- Step 4: Compare the graph metric values of the TDG with their threshold values.
  - Graph metric value < threshold: normal TDG.
  - Graph metric value > threshold: abnormal TDG.
- Figure 5. Detailed anomaly detection process.
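Step 4 can be sketched as a simple comparison per snapshot. The sketch below makes assumptions beyond the slide: a snapshot is flagged as abnormal if any single metric exceeds its threshold, a value exactly equal to the threshold counts as normal, and the function name `classify_snapshot` and all numeric values are hypothetical.

```python
def classify_snapshot(metrics, thresholds):
    """Compare each graph metric of one TDG snapshot with its threshold.

    Returns ("abnormal", [names of exceeded metrics]) if at least one
    metric is above its threshold, otherwise ("normal", []).
    """
    exceeded = [name for name, value in metrics.items()
                if value > thresholds[name]]
    return ("abnormal" if exceeded else "normal"), exceeded

# Hypothetical thresholds and per-interval metric values.
thresholds = {"kmax": 100, "dk2_distance": 500}
quiet = {"kmax": 12, "dk2_distance": 40}
burst = {"kmax": 850, "dk2_distance": 120}

print(classify_snapshot(quiet, thresholds))
print(classify_snapshot(burst, thresholds))
```

Returning the list of exceeded metrics alongside the label makes it easier to pass only abnormal snapshots, with their evidence, on to the attack identification stage.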

Anomaly Detection & Attack Identification: Attack Identification
- Attack pattern: Figure 7. Attack pattern generation process.
- Attack identification: Figure 11. Attack identification process.
- Figure 8. DDoS attack pattern in DDoS CAIDA trace.
- Figure 9. Peacomm P2P botnet pattern.

Validation
- Off-line analysis traces:
  - DARPA 1999 Dataset:
    - Week 1 and week 3: no attacks (used as training data).
    - Week 2: 43 attacks belonging to 18 labeled attack types, used for system development.
    - Week 4 and week 5: 201 attacks belonging to 58 attack types (including 40 new attacks).
  - POSTECH trace in 2009.7.9: contains a famous DDoS attack on July 7, 2009 in South Korea.
  - CAIDA DDoS trace in 2007.
  - P2P botnet trace (Peacomm) from a honeynet.
- On-line analysis:
  - Real-time anomaly detection.
  - Testing with a port scanning attack.

Validation (DARPA dataset)
- DARPA 1999 Dataset
- Figure 12. Kmax per minute over one day (Monday, Week 5) with normal and attacking traffic.
- Figure 13. dk-2 distance value per minute over one day (Monday, Week 5) with normal and attacking traffic.

Validation (DARPA dataset)
DARPA 1999 Dataset

Table 2. Performance of the graph-based method using Kmax and dk-2 distance metric on Monday, Week 5 traffic.

Total instances   Attacking instances   DR      FPR      CR
1320              122                   100 %   1.25 %   98.86 %

Table 3. Number of attack instances detected for each attack type on Monday, Week 5 traffic.

Attack type        Attack instances   Detected attack instances
apache2-dos        30                 30
arppoison-probe    15                 15
dict-r2l           17                 17
guesstelnet-r2l    4                  4
ipsweep-probe      26                 26
ls-probe           2                  2
neptune-dos        5                  5
portsweep-probe    4                  4
smurf-dos          2                  2
udpstorm-dos       16                 16
crashiis-dos       1                  1

Validation (POSTECH July, 2009)
POSTECH traces on July, 2009

Date    DDoS attack   Trace size
03/31   No            30.7 GB
07/08   Yes           27.3 GB

Validation (POSTECH July, 2009)
- POSTECH traces on July, 2009
- Figure 15. Kmax value over time of POSTECH's trace on July 8th 2009.
- Figure 16. dk-2 distance value over time of POSTECH's trace on July 8th 2009.

Validation (POSTECH July, 2009)
- POSTECH traces on July, 2009
- [Figure panels: POSTECH normal trace in 2009; POSTECH DDoS trace in 2009.7.9.]

Validation (Honeynet dataset)
- Real P2P botnet traffic (Peacomm) trace: we executed Trojan Peacomm binary files in a honeynet consisting of 12 hosts.
- Synthesized traffic dataset: we injected the P2P botnet (Peacomm) trace into a normal POSTECH traffic trace.

Validation (Honeynet dataset)
- Results: dk-2 matrices.
- [Figure panels: Normal; Anomaly.]

Validation (Real-time anomaly detection)
- The real-time anomaly detection system
- Figure 22. Real-time Anomaly Detection System: Function diagram.
- Figure 23. Real-time Anomaly Detection System: User Interface.

Validation (Real-time anomaly detection)
- Real-time anomaly detection system testing:
  - We launched a port scanning attack from a host in the dormitory network of our campus against a host outside our campus network.
  - A TCP port scanning tool was used to generate 100 port scanning instances.
  - Result: DR = 100% and FP = 0.
- Figure 24. dk-2 distance and Kmax value during TCP port scanning attacks.

Conclusion & Future Work
- Conclusion:
  - Provides a new approach to anomaly detection.
  - Improves on the performance of state-of-the-art techniques.
  - Implements a real-time anomaly detection system based on the proposed method.
  - Offers a new way to analyze network traffic for anomaly detection with clear visualization.
- Future work:
  - Developing a classifier that determines the thresholds automatically and statistically.
  - Validating our approach with other traces.
  - Combining our metrics with other effective metrics to increase the accuracy of anomaly detection and attack identification.

References
1. Y. Zhou, G. Hu and W. He, Using graph to detect network traffic anomaly, Conference on Communications Circuits and Systems, 2009.
2. A. Godiyal, M. Garland and C.H. John, Enhancing network traffic visualization by graph pattern analysis, 2011.
3. M. Iliofotou, P. Pappu, M. Faloutsos, M. Mitzenmacher, G. Varghese, and H. Kim, Graption: Automated detection of P2P applications using traffic dispersion graphs (TDGs), Tech. Rep. UCR-CS-2008-06080, Department of Computer Science and Engineering, University of California, Riverside, June 2008.
4. S. Voss and J. Subhlok, Performance of general graph isomorphism algorithms, Technical Report UH-CS-09-07, University of Houston, 2010.
5. J.W. Hong, Internet traffic monitoring and analysis using NG-MON, POSTECH, Advanced Communication Technology, The 6th International Conference, vol. 1, pp. 100-120, 2004.
6. D. Whitney, Basic Network Metrics, Lecture note, 2008.
7. M. Iliofotou, M. Faloutsos and M. Mitzenmacher, Exploiting dynamicity in graph-based traffic analysis: techniques and applications, in Proceedings of the 5th International Conference on Emerging Networking Experiments and Technologies (CoNEXT '09), ACM, New York, NY, USA, 2009, pp. 241-252.
8. T.-F. Yen and M. K. Reiter, Are your hosts trading or plotting? Telling P2P file-sharing and bots apart, in 30th International Conference on Distributed Computing Systems, 2010.
9. D. Q. Le, T. Jeong, H. E. Roman, and J.W. Hong, Traffic Dispersion Graph Based Anomaly Detection, in Proc. of the Second Symposium on Information and Communication Technology (SoICT), Hanoi, Vietnam, Oct. 13-14, 2011, pp. 36-41.

Q & A
Thank you

Appendix

Comparison

Table 2. Performance of the graph-based method using Kmax and dk-2 distance metric on Monday, Week 5 traffic.

Method                 Total instances   Attacking instances   DR      FPR       CR
Proposed method        1320              122                   100 %   1.25 %    98.86 %
Wavelet-based method   1320              122                   99 %    56.97 %   53.30 %

Appendix (VF2 Algorithm)
Source: P. Foggia

VF2 Algorithm
- Given two graphs Q and G, a (sub)graph isomorphism from Q to G is expressed as a set of vertex pairs (n, m), with n in Q and m in G.
- [Figure: Q has vertices 1 (A), 2 (B), 3 (C); G has vertices 1 (A), 2 (B), 3 (C), 4 (A).]
- Two solutions in the example:
  - S1 = {(1, 1), (2, 2), (3, 3)}
  - S2 = {(1, 4), (2, 2), (3, 3)}

VF2 Algorithm
- Idea: finding the (sub)graph isomorphism between Q and G is a sequence of state transitions. The question is how to find the candidate pair sets for an intermediate state.
- Intermediate states in the example:
  - s1 = {(2, 2)}
  - s2 = {(2, 2), (1, 1)}
  - s3 = {(2, 2), (1, 1), (3, 3)}

VF2 Algorithm
- Let s be an intermediate state. s denotes a partial mapping from Q to G, namely a mapping from a subgraph of Q to a subgraph of G. These two subgraphs are denoted Q(s) and G(s).
- All vertices of Q neighboring Q(s) are denoted NQ(s), and all vertices of G neighboring G(s) are denoted NG(s).
- Candidate pair sets are a subset of NQ(s) × NG(s); assume a pair (n, m) ∈ NQ(s) × NG(s).
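The state-space search described above can be sketched as a plain backtracking procedure. This is a simplified illustration rather than the full VF2 algorithm: it draws candidate pairs from the neighbors of the current partial mapping but omits VF2's look-ahead pruning rules, it only checks that every edge of Q maps to an edge of G (a non-induced match), and it treats graphs as undirected. The function name `subgraph_matches` is hypothetical, and the edge lists are assumptions, since the slide's figure did not survive extraction; they were chosen so that the search reproduces the mappings S1 and S2 listed earlier.

```python
def subgraph_matches(q_labels, q_edges, g_labels, g_edges):
    """Enumerate label-preserving mappings of Q's vertices into G."""
    q_adj = {n: set() for n in q_labels}
    for u, v in q_edges:
        q_adj[u].add(v)
        q_adj[v].add(u)
    g_adj = {m: set() for m in g_labels}
    for u, v in g_edges:
        g_adj[u].add(v)
        g_adj[v].add(u)
    results = []

    def candidates(state):
        """Candidate pairs (n, m) drawn from N_Q(s) x N_G(s)."""
        used = set(state.values())
        qs = sorted({n for a in state for n in q_adj[a]} - set(state))
        gs = sorted({m for b in used for m in g_adj[b]} - used)
        if not qs:  # empty state or disconnected Q: fall back to all unmapped
            qs = sorted(set(q_labels) - set(state))
            gs = sorted(set(g_labels) - used)
        # Fix one query vertex per level so each mapping is found once.
        return [(qs[0], m) for m in gs] if qs else []

    def feasible(state, n, m):
        """Labels must agree and mapped neighbors of n must neighbor m."""
        if q_labels[n] != g_labels[m]:
            return False
        return all(state[a] in g_adj[m] for a in q_adj[n] if a in state)

    def extend(state):
        if len(state) == len(q_labels):
            results.append(dict(state))
            return
        for n, m in candidates(state):
            if feasible(state, n, m):
                state[n] = m      # state transition s -> s + (n, m)
                extend(state)
                del state[n]      # backtrack

    extend({})
    return results

# Vertex labels follow the slide; the edges are assumed for illustration.
q_labels = {1: "A", 2: "B", 3: "C"}
q_edges = [(1, 2), (2, 3)]
g_labels = {1: "A", 2: "B", 3: "C", 4: "A"}
g_edges = [(1, 2), (2, 3), (4, 2)]
matches = subgraph_matches(q_labels, q_edges, g_labels, g_edges)
print(matches)
```

Restricting candidates to neighbors of the partial mapping is what keeps the search from trying every vertex pair at each step, which matters when the pattern is matched against a large traffic graph.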

VF2 Algorithm
- Example (same graphs Q and G as before): for the state {(2, 2)}, the candidate pair set is (1, 1), (1, 4), (3, 3).

Drawing TDG
- Drawing a network traffic graph: generate, then visualize.

Figure 4: DDoS Attack Taxonomy
- DDoS Attack
  - Bandwidth Depletion
    - Flood Attack: UDP, ICMP
    - Amplification Attack: Smurf Attack, Fraggle Attack
  - Resource Depletion
    - Protocol Exploit Attack: TCP SYN Attack, PUSH + ACK Attack
    - Malformed Packet Attack: IP Address Attack, IP Packet Options Attack
- Lower levels of the figure: Random Port Attack, Same Port Attack, Direct Attack, Loop Attack.
- The decision "Spoof Source IP Address?" appears under seven branches of the figure.

Attack Templates
- Pattern specification and DDoS pattern.
- [Figures: attack templates, parts 1/3 to 3/3.]

Thresholds of POSTECH network

Protocol   Kmax    dk-2 distance
TCP        5525    11328
UDP        15327   23608
ICMP       1425    2996

NG-MON2

NAT

Validation (DARPA dataset)
- DARPA 1999 Dataset:
  - Week 1 and week 3: no attacks (used as training data).
  - Week 2: 43 attacks belonging to 18 labeled attack types, used for system development.
  - Week 4 and week 5: 201 attacks belonging to 58 attack types (including 40 new attacks).
- The traffic data on Monday, Week 5 of the DARPA dataset:
  - Includes 122 attack instances.
  - Attacks that change the communication structure of the network graph: smurf, apache2, udpstorm, portsweep, etc.

Validation
We use standard measurements, namely detection rate (DR), false positive rate (FPR), and overall classification rate (CR), to evaluate our approach.
- True Positive (TP): the number of anomalous instances that are correctly identified.
- True Negative (TN): the number of legitimate instances that are correctly classified.
- False Positive (FP): the number of instances incorrectly identified as anomalies although they are in fact legitimate activities.
- False Negative (FN): the number of instances incorrectly classified as legitimate activities although they are in fact anomalous.
- DR = TP / (TP + FN)
- FPR = FP / (TN + FP)
- CR = (TP + TN) / (TP + TN + FP + FN)
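The three formulas translate directly into code. In the sketch below the function name `detection_rates` is hypothetical, and the confusion counts in the example are an assumption: the slides report only 1320 total instances, 122 attacking instances, and the resulting percentages, so the split FP = 15 and TN = 1183 is inferred as one combination consistent with those figures, not taken from the source.

```python
def detection_rates(tp, tn, fp, fn):
    """Return (DR, FPR, CR) as fractions from the four confusion counts.

    No guard against empty classes: tp + fn and tn + fp must be nonzero.
    """
    dr = tp / (tp + fn)                    # detection rate
    fpr = fp / (tn + fp)                   # false positive rate
    cr = (tp + tn) / (tp + tn + fp + fn)   # overall classification rate
    return dr, fpr, cr

# Assumed counts: 122 attacks all detected, 15 of 1198 normal instances flagged.
dr, fpr, cr = detection_rates(tp=122, tn=1183, fp=15, fn=0)
print(f"DR={dr:.2%} FPR={fpr:.2%} CR={cr:.2%}")
```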

Peacomm
Infection steps of the Peacomm bot (source: http://static.usenix.org/event/hotbots07/tech/full_papers/grizzard/grizzard_html/):
1. Connect to Overnet: the bot publishes itself on the Overnet network and connects to peers. The initial list of peers is hard coded in the bot.
2. Download Secondary Injection URL: the bot uses hard coded keys to search for and download a value on the Overnet network. The value is an encrypted URL that points to the location of a secondary injection executable.
3. Decrypt Secondary Injection URL: the bot uses a hard coded key to decrypt the downloaded value, which is a URL.
4. Download Secondary Injection: the bot downloads the secondary injection from a web server using the decrypted URL.
5. Execute Secondary Injection: the bot executes the secondary injection, possibly scheduling future upgrades on the peer-to-peer network or scheduling bot stat tracking at some other resource.

Peacomm
- Figure 2: Number of Remote IPv4 Addresses Contacted Over Time for Duration of Infection

Graph Metrics on TDGs: dk-2 distance (source: Ben Zhao, June 22, 2011)
- Structure analysis with dk-n series, n = 1, 2, 3, ...
- Looks at inter-dependencies among topology characteristics.
- dk-n series are degree correlations within simple connected graphs of size n.

P2P (1st generation)

Gnutella (2nd generation)

KaZaA (3rd generation)

Distributed Hash Tables (4th generation)

dk-2 value matrix
- [Figure panels: Normal; Anomaly.]