Flowzilla: A Methodology for Detecting Data Transfer Anomalies in Research Networks. Anna Giannakou, Daniel Gunter, Sean Peisert

Similar documents
Flowzilla: A Methodology for Detecting Data Transfer Anomalies in Research Networks

Introduction Challenges with using ML Guidelines for using ML Conclusions

CS Review. Prof. Clarkson Spring 2017

Basic Concepts in Intrusion Detection

Enhancing Byte-Level Network Intrusion Detection Signatures with Context

Alfonso Valdes Keith Skinner SRI International

Developing the Sensor Capability in Cyber Security

A Passive Approach to Wireless NIC Identification

Firewalls, Tunnels, and Network Intrusion Detection

Internet Traffic Classification using Machine Learning

McPAD and HMM-Web: two different approaches for the detection of attacks against Web applications

Automated Network Anomaly Detection with Learning and QoS Mitigation. PhD Dissertation Proposal by Dennis Ippoliti

The Reconnaissance Phase

Measuring Intrusion Detection Capability: An Information- Theoretic Approach

DESIGN AND DEVELOPMENT OF MAC LAYER BASED DEFENSE ARCHITECTURE FOR ROQ ATTACKS IN WLAN

Towards Traffic Anomaly Detection via Reinforcement Learning and Data Flow

EVALUATIONS OF THE EFFECTIVENESS OF ANOMALY BASED INTRUSION DETECTION SYSTEMS BASED ON AN ADAPTIVE KNN ALGORITHM

TDC DoS Protection Service Description and Special Terms

Detecting Botnets Using Cisco NetFlow Protocol

Distributed Denial of Service (DDoS)

Artificial Immune System against Viral Attack

Predictive malware response testing methodology. Contents. 1.0 Introduction. Methodology version 1.0; Created 17/01/2018

An Autonomic Framework for Integrating Security and Quality of Service Support in Databases

Configuring Anomaly Detection

Intrusion Detection Systems

Mapping Internet Sensors with Probe Response Attacks

Big Data Analytics for Host Misbehavior Detection

Configuring Anomaly Detection

Traffic Classification Using Visual Motifs: An Empirical Evaluation

Handling Web and Database Requests Using Fuzzy Rules for Anomaly Intrusion Detection

ANOMALY DETECTION USING HOLT-WINTERS FORECAST MODEL

An advanced data leakage detection system analyzing relations between data leak activity

ARAKIS An Early Warning and Attack Identification System

Lecture Notes on Critique of 1998 and 1999 DARPA IDS Evaluations

YouLighter: An Unsupervised Methodology to Unveil YouTube CDN Changes

Means for Intrusion Detection. Intrusion Detection. INFO404 - Lecture 13. Content

A Resource Contention Analysis Framework for Diagnosis of Application Performance Anomalies in Consolidated Cloud Environments

Network traffic classification: From theory to practice

Intrusion Detection and Containment in Database Systems. Abhijit Bhosale M.Tech (IT) School of Information Technology, IIT Kharagpur

A Firewall Architecture to Enhance Performance of Enterprise Network

Intrusion Detection System

ENTERPRISE ENDPOINT PROTECTION BUYER S GUIDE

Security analytics: From data to action Visual and analytical approaches to detecting modern adversaries

ADAPTIVE NETWORK ANOMALY DETECTION USING BANDWIDTH UTILISATION DATA

Anatomy of a Real-Time Intrusion Prevention System

Evolutionary Algorithm Approaches for Detecting Computer Network Intrusion (Extended Abstract)

Detecting Anomalies in Network Traffic Using Maximum Entropy Estimation

PREEMPTIVE PREventivE Methodology and Tools to protect utilities

Overview Intrusion Detection Systems and Practices

SSL Automated Signatures

Detecting Network Performance Anomalies with Contextual Anomaly Detection

Mapping Internet Sensors with Probe Response Attacks

Evading Network Anomaly Detection Sytems - Fogla,Lee. Divya Muthukumaran

Quadratic Route Factor Estimation Technique for Routing Attack Detection in Wireless Adhoc Networks

Quadratic Route Factor Estimation Technique for Routing Attack Detection in Wireless Adhoc Networks

Data Sources for Cyber Security Research

CS 161 Computer Security

CSE543 - Computer and Network Security Module: Intrusion Detection

CSE543 - Computer and Network Security Module: Intrusion Detection

Lecture 12. Application Layer. Application Layer 1

Mahalanobis Distance Map Approach for Anomaly Detection

Analysis of CPU Pinning and Storage Configuration in 100 Gbps Network Data Transfer

UMSSIA INTRUSION DETECTION

Correlative Analytic Methods in Large Scale Network Infrastructure Hariharan Krishnaswamy Senior Principal Engineer Dell EMC

Network Anomaly Detection Using Autonomous System Flow Aggregates

Intrusion Detection System using AI and Machine Learning Algorithm

Design and Development of Secure Data Cache Framework. Please purchase PDF Split-Merge on to remove this watermark.

Atypical Behavior Identification in Large Scale Network Traffic

Analysis of Black-Hole Attack in MANET using AODV Routing Protocol

Configuring Anomaly Detection

Introduction to Security

Cpk: What is its Capability? By: Rick Haynes, Master Black Belt Smarter Solutions, Inc.

COMPARISON OF THE ACCURACY OF BIVARIATE REGRESSION AND BOX PLOT ANALYSIS IN DETECTING DDOS ATTACKS

Network Security. Chapter 0. Attacks and Attack Detection

ANOMALY detection techniques are the last line of defense

Energy-Aware Routing in Wireless Ad-hoc Networks

Intrusion Detection by Combining and Clustering Diverse Monitor Data

Double Guard: Detecting intrusions in Multitier web applications with Security

A Two-Layered Anomaly Detection Technique based on Multi-modal Flow Behavior Models

Polymorphic Blending Attacks. Slides by Jelena Mirkovic

ANOMALY DETECTION IN COMMUNICTION NETWORKS

Model Based Prediction Technique for Denial of Service Attack Detection

Exploiting n-gram location for intrusion detection

CS419 Spring Computer Security. Vinod Ganapathy Lecture 13. Chapter 6: Intrusion Detection

Intrusion Detection and Malware Analysis

Network Security Terms. Based on slides from gursimrandhillon.files.wordpress.com

Operational Experiences With High-Volume Network Intrusion Detection

Automated Application Signature Generation Using LASER and Cosine Similarity

Failure Diagnosis and Cyber Intrusion Detection in Transmission Protection System Assets Using Synchrophasor Data

Anomaly detection using adaptive resonance theory

Empirical Study of Automatic Dataset Labelling

Behavior-based Authentication Systems. Multimedia Security

CSE 565 Computer Security Fall 2018

On the challenges of network traffic classification with NetFlow/IPFIX

Access Control Using Intelligent Application Bypass

Botnets Behavioral Patterns in the Network

"GET /cgi-bin/purchase?itemid=109agfe111;ypcat%20passwd mail 200

The Forensic Chain-of-Evidence Model: Improving the Process of Evidence Collection in Incident Handling Procedures

Intrusion Detection System

Density-Aware Routing in Highly Dynamic DTNs: The RollerNet Case

Transcription:

Flowzilla: A Methodology for Detecting Data Transfer Anomalies in Research Networks Anna Giannakou, Daniel Gunter, Sean Peisert

Research Networks Scientific applications that process large amounts of data Frequent large data transfers between endpoints Research networks get attacked, just like every other information system

Anomaly-based Intrusion Detection How do we detect attacks/intrusions? Signature-based intrusion detection (known attacks) Anomaly-based intrusion detection (novel attacks) Anomaly: Significant deviation from normal profile Original idea from Dorothy Denning in 1986 How do we perform anomaly intrusion detection?

Anomaly-based Intrusion Detection Limitations Bad news L Defining a normal profile is hard Too many individual sessions, unpredictable behavior Feature distributions are very dynamic *(e.g. packet sizes, IP addresses, session size, duration, volume, payload patterns, etc) Generic internet traffic exhibits high variability Lack of ground truth Too many false positives Not very operationally appealing Are the detected events real anomalies? A.K. Marnerides, D.P. Pezaros, D. Hutchison, Internet traffic characterisation: Third-order statistics & higher-order spectra for precise traffic modelling,in Computer Networks, Volume 134,2018 R. Sommer and V. Paxson, Outside the Closed World: On Using Machine Learning for Network Intrusion Detection, in Proceedings of the 31st IEEE Symposium on Security and Privacy May 2010

Anomaly-based Intrusion Detection Feasibility Is anomaly detection feasible? Our hypothesis is that anomaly-based intrusion detection is feasible if Requires network domain with lower feature variability Easier to establish reliable normal profile Easier to detect deviations Good candidate domain: Research Networks

Flowzilla: General Principles Technique for detecting anomalies in traffic volumes Significant changes in the size of scientific data transfers Use machine learning to establish normal profile Train on past data transfers How do we define significant? With an adaptive technique for establishing a threshold

Flowzilla: Architecture Detection DTN1 Network Flows Training DTN2 Network Flows Network Flows Network Flows DB 1 5 Feature Extraction Filter 6 2 Model 3 Threshold Calculator DTN3 Detector 4 Anomalous Flows DTNn

Flowzilla: Adaptive Threshold Threshold definition can be tricky Too high à False negatives Too low à False positives Constant value does not account for seasonal trends Data transfers are one week apart

Flowzilla: Architecture Detection DTN1 Network Flows Training DTN2 Network Flows Network Flows Network Flows DB 1 5 Feature Extraction Filter 6 2 Model 3 Threshold Calculator DTN3 Detector 4 Anomalous Flows DTNn

Flowzilla: Components Model Predict flow size based on: Network throughput Flow duration Source/Destination IP Threshold calculator! "#$% real traffic volume! &"#' predicted traffic volume ( = mean value of! "#$%! &"#' + = ( + - x such that 90% of the flows are legitimate

Flowzilla: Training Training Data Flows from 10 NERSC DTNs Flows between 10/01/2017-11/30/2017 More than 350,000 flows Collected through tstat Originally 52 features per flow Feature Extraction Filter to extract necessary fields for model training

Flowzilla: Evaluation Questions: 1. How well does Flowzilla detect volume anomalies? 2. Does it detect anomalies regardless of size/time of occurrence? 3. Does the quality of predictions degrade after a certain time? Lack of ground truth in training dataset (which flows are actually malicious?)

Flowzilla: Evaluation Insert artificial anomalies of different size Data transfers between Grid5000 nodes and NERSC DTNs Experiment # of Nodes # of Transfers per Node Transfer Size Time interval between transfers 1 8 5 1-5 GB 1-60 min 2 8 5 10 GB 60 min

Flowzilla: Detection Results Model trained on data transfers between 01/10/2017 30/11/2017 Experiment Total Anomalies Anomaly Size True Positives False Negatives Total # of Flows 1 40 1-5 GB 34 6 12810 2 40 10 GB 37 3 30595 Detection rate remains above 80% in both experiments

Flowzilla: Anomalous Flows Size Prediction

Flowzilla: Quality of Prediction Weeks after Training Accuracy remains above 80% even 10 weeks after training

Flowzilla: Conclusion We have developed a technique for detecting volume anomalies in network transfers on research networks using machine learning Adaptive thresholds are predictably helpful for reducing false positives Acceptable detection rate (up to 92.5%) Model is temporally stable in predicting scientific flow sizes

Flowzilla: Future Work Expand to other types of anomalies Detect anomalies that span across multiple flows Incorporate additional tstat metrics in our prediction Experiment with different retraining strategies (confidence intervals)

Questions?