Traffic Classification Using Visual Motifs: An Empirical Evaluation

Traffic Classification Using Visual Motifs: An Empirical Evaluation
Wilson Lian (1), Fabian Monrose (1), John McHugh (1, 2)
(1) University of North Carolina at Chapel Hill
(2) RedJack, LLC
VizSec 2010

Overview
- Background
- Visual Motifs
- Traffic Classification
- Empirical Evaluation

Background Motivation Internet Network Administrator

Background Motivation d3b07384d113e... GET /index.ht... 7d8ad5cb9c940... d41d8cd98f00b... MAIL FROM: foo@... d41d8cd98f00b...

Background Goals
Port 22 36fd6d8c3f5af4... Port 22 fc2394c1a922... Port 22 f4d6d8c3f5a36... Port 22 222394c1a9fc... Port 5ad6d8c3ff436... Port 22 a92394c122fc...
Port 25 MAIL FROM: foo@... Port 25 f98698466c3ef... Port 25 ef8698466c3f9... Port 25 DATA\r\nSubject: fo... Port 25 f98698466c3ef...
Port 80 b314caafaa3e... Port 80 b314caafaa3e... Port 80 POST /login.ph... Port 80 b314caafaa3e... Port 80 POST /AuthSv... Port 80 GET /index.ht...
Port 1214 113edec49eaa... Port 1214 006f7b3db8f4f... Port 1214 aa3edec49e11... Port 1214 4f006f7b3db8f0... Port 1214 9e3edec4aa11... Port 1214 b8006f7b3d4ff0...

Background Assumptions
- Reliable transport via TCP
- Stream cipher
- No access to payload
- Payload length preservation
- Encryption at or above Transport Layer

Visual Motifs Bigram Heatmaps C. Wright, F. Monrose, and G.M. Masson. Using visual motifs to classify encrypted traffic. 2006

Visual Motifs Heatmap Construction
Packet exchange over time (client-to-server sizes are written as positive, server-to-client sizes as negative):
1. Client: SYN, 48 bytes
2. Server: SYN-ACK, 48 bytes
3. Client: ACK, 40 bytes
4. Client: HTTP Request, 891 bytes
5. Server: 40 bytes
6. Server: 270 bytes
7. Server: 1500 bytes
8. Client: 40 bytes
9. Server: 1500 bytes
10. Client: 40 bytes
Signed size sequence: 48, -48, 40, 891, -40, -270, -1500, 40, -1500, 40
Bigrams: (48, -48) (-48, 40) (40, 891) (891, -40) (-40, -270) (-270, -1500) (-1500, 40) (40, -1500) (-1500, 40)


Visual Motifs Heatmap Construction
The nine bigrams are placed on the heatmap and grouped into four bins:
- (-48, 40), (-1500, 40), (-1500, 40): 3/9 = 33.3%
- (-40, -270), (-270, -1500): 2/9 = 22.2%
- (40, 891): 1/9 = 11.1%
- (48, -48), (891, -40), (40, -1500): 3/9 = 33.3%
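
The construction above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the names `bigrams` and `quadrant` are hypothetical, and it assumes the four bins in the slide example correspond to the sign quadrants of each bigram (the grouping on the slide is consistent with that reading). In the evaluation, bin size is a tunable parameter, so real heatmaps use finer bins.

```python
from collections import Counter

# Signed packet sizes from the slide example:
# positive = client to server, negative = server to client.
sizes = [48, -48, 40, 891, -40, -270, -1500, 40, -1500, 40]


def bigrams(seq):
    """Return consecutive (previous, next) pairs of a sequence."""
    return list(zip(seq, seq[1:]))


def quadrant(pair):
    """Assumed binning: label a bigram by the signs of its two sizes."""
    first, second = pair
    return ("+" if first > 0 else "-") + ("+" if second > 0 else "-")


pairs = bigrams(sizes)
counts = Counter(quadrant(p) for p in pairs)
total = len(pairs)

# Fraction of bigrams that falls into each bin.
proportions = {q: counts[q] / total for q in counts}

for q in ("-+", "--", "++", "+-"):
    print(q, f"{counts[q]}/{total}")
```

With the example sequence, the four bins receive 3, 2, 1, and 3 of the 9 bigrams, matching the fractions on the slide.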

Visual Motifs Heatmap Construction

Visual Motifs Bigram Heatmaps

Visual Motifs Modeling Protocol Behavior
- Bin 1: (-48, 40), (-1500, 40), (-1500, 40): 3/9 = 33.3%
- Bin 2: (40, 891): 1/9 = 11.1%
- Bin 3: (-40, -270), (-270, -1500): 2/9 = 22.2%
- Bin 4: (48, -48), (891, -40), (40, -1500): 3/9 = 33.3%

Visual Motifs Modeling Protocol Behavior
[Bar chart: Probability (0 to 1) by Bin. Bin 1: .333, Bin 2: .111, Bin 3: .222, Bin 4: .333]

Traffic Classification Comparing Protocol Models
[Bar chart: Probability by Bin for two models. First model, bins 1 to 4: .333, .111, .222, .333. Second model, bins 1 to 4: .100, .700, .150, .050]

Traffic Classification Comparing Protocol Models
Per-bin absolute differences:
- Bin 1: |.333 - .100| = .233
- Bin 2: |.111 - .700| = .589
- Bin 3: |.222 - .150| = .072
- Bin 4: |.333 - .050| = .283
Score = .233 + .589 + .072 + .283 = 1.177

Traffic Classification Comparing Protocol Models
[Bar chart: Probability by Bin for two models. First model, bins 1 to 4: .333, .111, .222, .333. Second model, bins 1 to 4: .400, .150, .150, .300]

Traffic Classification Comparing Protocol Models
Per-bin absolute differences:
- Bin 1: |.333 - .400| = .067
- Bin 2: |.111 - .150| = .039
- Bin 3: |.222 - .150| = .072
- Bin 4: |.333 - .300| = .033
Score = .067 + .039 + .072 + .033 = .211
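
The two scores on these slides can be reproduced directly. The sketch below simply sums the per-bin absolute differences; the function name `l1_distance` and the variable names are illustrative choices, and the probabilities are the ones printed on the slides.

```python
def l1_distance(p, q):
    """Sum of absolute per-bin differences between two distributions."""
    return sum(abs(a - b) for a, b in zip(p, q))


# Bin probabilities (bins 1 to 4) as shown on the slides.
example = [0.333, 0.111, 0.222, 0.333]
dissimilar = [0.100, 0.700, 0.150, 0.050]
similar = [0.400, 0.150, 0.150, 0.300]

print(round(l1_distance(example, dissimilar), 3))  # 1.177
print(round(l1_distance(example, similar), 3))     # 0.211
```

A lower score means the two models distribute their bigrams across bins in a more similar way.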

Traffic Classification Classifying Samples
1. Create training models for desired protocols
2. Build testing model for network trace to be classified
3. Find training model with lowest L_1 distance

$$L_1(A, B) = \sum_{i=1}^{n} \left| \frac{A_{\phi_i}}{A_\tau} - \frac{B_{\phi_i}}{B_\tau} \right|$$
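
The three steps can be sketched as a nearest-model search. This is an illustration under assumptions: each model is taken to be a list of per-bin bigram counts, with each count divided by the model's total before comparison, and the protocol names and counts below are made up for the example.

```python
def normalize(counts):
    """Convert per-bin counts into per-bin proportions."""
    total = sum(counts)
    return [c / total for c in counts]


def l1(a_counts, b_counts):
    """L1 distance between two models given as per-bin counts."""
    a, b = normalize(a_counts), normalize(b_counts)
    return sum(abs(x - y) for x, y in zip(a, b))


def classify(test_counts, training_models):
    """Return the name of the training model closest to the test model."""
    return min(training_models,
               key=lambda name: l1(test_counts, training_models[name]))


# Hypothetical per-bin counts for two trained protocols.
training_models = {
    "http": [3, 1, 2, 3],
    "smtp": [2, 14, 3, 1],
}

label = classify([4, 1, 2, 2], training_models)
print(label)  # http
```

Normalizing by the total lets a short test trace be compared against a training model built from many more bigrams.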

Traffic Classification Classification Confidence Threshold
Goal: Eliminate close calls
- Require 1st place candidate to lead 2nd place by a certain amount to make a decision
- Based on standard deviation of L_1 distances
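
The slide does not give the exact decision rule, so the following is only one plausible reading: a decision is made when the gap between the two smallest distances is at least k times the standard deviation of all candidate distances, and the classifier abstains otherwise. The function name, the use of the population standard deviation, and the sample distances are all assumptions.

```python
import statistics


def classify_with_threshold(distances, k):
    """Pick the closest model, or return None when the call is too close.

    distances: dict mapping model name to its L1 distance from the test model.
    k: confidence threshold multiplier (assumed rule, see lead-in).
    """
    ranked = sorted(distances.items(), key=lambda item: item[1])
    (best, d1), (_, d2) = ranked[0], ranked[1]
    spread = statistics.pstdev(list(distances.values()))
    return best if (d2 - d1) >= k * spread else None


# Hypothetical distances: a clear winner and a close call.
clear = {"http": 0.211, "smtp": 1.177, "pop3": 0.900}
close = {"http": 0.300, "smtp": 0.320, "pop3": 1.200}

print(classify_with_threshold(clear, k=0.75))  # http
print(classify_with_threshold(close, k=0.75))  # None
```

Under this reading, a larger k makes the classifier abstain more often, which trades recall for precision.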

Empirical Evaluation Evaluation
Parameter selection:
- Bin size
- Confidence threshold
- Training set size
Precision = true positives / (true positives + false positives)
Recall = true positives / (true positives + false negatives)
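
As a small worked example of the two metrics, the sketch below computes per-port precision and recall from (true port, predicted port) pairs. The data is invented for illustration, and counting an abstention (a prediction of `None`) as a false negative for the true port is an assumption, not something the slides state.

```python
def precision_recall(pairs, port):
    """Per-port precision and recall from (truth, prediction) pairs."""
    tp = sum(1 for truth, pred in pairs if truth == port and pred == port)
    fp = sum(1 for truth, pred in pairs if truth != port and pred == port)
    fn = sum(1 for truth, pred in pairs if truth == port and pred != port)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical results; None marks a sample the classifier declined to label.
results = [(80, 80), (80, 80), (80, 25), (25, 80), (25, 25), (25, None)]

p80, r80 = precision_recall(results, 80)
p25, r25 = precision_recall(results, 25)
print(f"port 80: precision={p80:.3f} recall={r80:.3f}")
print(f"port 25: precision={p25:.3f} recall={r25:.3f}")
```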

Empirical Evaluation Evaluation
Trial :=
1. Randomly sample some percentage of available data for each port and train classifier
2. Randomly sample some number (λ = 45,000) of the remaining data points for each port and create testing samples
3. Classify testing samples
50 trials
[Figure: per-port data (ports 80, 110, 445 shown) split into training data and testing data]
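
The sampling in one trial can be sketched as follows. This is a simplified illustration: `split_trial` is a hypothetical helper, each port's data is assumed to be a plain list of data points, and a port with fewer than λ remaining points simply contributes all of them.

```python
import random


def split_trial(data_by_port, train_fraction, lam, rng):
    """Split each port's data into disjoint training and testing samples."""
    training, testing = {}, {}
    for port, points in data_by_port.items():
        shuffled = list(points)
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * train_fraction)
        training[port] = shuffled[:cut]
        remaining = shuffled[cut:]
        # Take at most lam of the points not used for training.
        testing[port] = remaining[:lam]
    return training, testing


# Toy data standing in for per-port data points.
rng = random.Random(0)
data = {80: list(range(100)), 110: list(range(40))}

for trial in range(3):  # the evaluation uses 50 trials
    train, test = split_trial(data, train_fraction=0.15, lam=20, rng=rng)
    print(trial, {port: (len(train[port]), len(test[port])) for port in data})
```

Drawing the testing samples only from what remains after training keeps the two sets disjoint within a trial.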

Empirical Evaluation dartmouth Dataset
- Weekdays: January 19, 2004 to February 6, 2004
- Top 10 ports by number of sessions observed: 25 (SMTP), 80 (HTTP), 88 (Kerberos), 110 (POP3), 135 (DCE), 139 (NetBIOS), 443 (HTTPS), 445 (MDS), 902 (VMware), 1214 (Kazaa)
- No payload data
- Used for parameter selection
Total packets: 1.3 billion
Traffic volume: 707 GB
Observed ports: 64,214
Sessions: 5.2 million

Empirical Evaluation darpa Data
- 1999 DARPA Intrusion Detection Evaluation
- Weeks 1, 3, 4, 5
- Ports: 21 (FTP), 23 (Telnet), 25 (SMTP), 79 (finger), 80 (HTTP), 110 (POP3)
- Additional cross-validation using parameter values determined from dartmouth
Total packets: 60 million
Traffic volume: 12 GB
Observed ports: 10,274
Sessions: 1.6 million

Empirical Evaluation dartmouth Results
- Trials: 50
- Training set size: 15%
- Testing set size: 45,000 data points
- Confidence threshold: k = 0.75

Empirical Evaluation dartmouth Results

Empirical Evaluation dartmouth Results
Server Message Block (SMB) protocol
- Used for Windows resource sharing
- Windows NT: via NetBIOS (NBT) on TCP 139
- Windows 2000: directly on TCP 445
- Clients with NetBIOS enabled try connections on TCP 139 and TCP 445
Reference: http://www.ntsecurity.nu/papers/port445/

Empirical Evaluation darpa Results
- Trials: 50
- Training set size: 15%
- Testing set size: 45,000 data points
- Confidence threshold: k = 0.75

Empirical Evaluation Inter-dataset Analysis
- Trials: 50
- Training set size: 15% (dartmouth)
- Testing set size: 45,000 data points (darpa)
- Confidence threshold: k = 0.25

Empirical Evaluation Inter-dataset Analysis

Empirical Evaluation Inter-dataset Analysis Mailbomb attack?


Limitations
- Must be trained on the network to be tested
- Reliance on session de-multiplexing
- TCP only
- Deliberate attempts to disguise traffic (Fogla et al., Wright et al. 2009)

Future Work
- On-line classification
- Protocol subcategorization (e.g., SSH interactive vs. SCP file transfer)

Conclusion
- Modeling protocol behavior using only packet size, direction, and order
- Resistant to encryption
- High precision and recall
- Quick and reliable traffic inspection
- Useful for anomaly and misuse detection

Questions? Thanks for listening. Q & A wwlian@gmail.com