Introduction
Challenges with using ML
Guidelines for using ML
Conclusions



Misuse detection: exact descriptions of known bad behavior
Anomaly detection: deviations from profiles of normal behavior
  First proposed in 1987 by Dorothy Denning (Stanford Research Institute)

Attack sophistication
  403M new variants of malware created in 2011
  100K unique malware samples daily in 2012 Q1
  Required attacker knowledge decreasing
  Highly motivated attackers

Median time between breach and awareness: 300-400+ days
Duration of zero-day attacks: up to 30 months, median 8 months
Attacks discovered by a third party: 61%
Businesses that share breach info: 2-3%

Product recommendations: Amazon, Netflix
Optical character recognition: Google
Natural language translation: Google, Microsoft
Spam detection: Google, Yahoo, Microsoft, Facebook, Twitter

Almost all NIDS used in operational environments are misuse-based
  Despite lots of research on anomaly detection
  Despite the appeal of anomaly detection for finding new attacks
  Despite the success of ML in other domains

Outlier detection
High cost of errors
Lack of appropriate training data
Interpretation of results
Variability in network traffic
Adaptive adversaries
Evaluation difficulties

                   Training samples           Required quality
Classification     Many from both classes     Enough to distinguish two classes
Outlier detection  Almost all from one class  Perfect model of normal

Premise: Anomaly detection can find novel attacks
Fact: ML is better at finding similar patterns than at finding outliers
  Example: recommending similar products, where similarity means products purchased together
Conclusion: ML is better suited to finding variants of known attacks

Underlying assumptions
  Malicious activity is anomalous
  Anomalies correspond to malicious activity
Do these assumptions hold?
  Former employee requests authorization code
    Account revocation bug? Insider threat? Username typo?
  User authentication fails 10K times
    Brute force attack? User changed password and forgot to update a script?

                        Cost of false negatives         Cost of false positives
Product recommendation  Low: potential missed sales     Low: continue shopping
Spam detection          Low: spam finding way to inbox  High: missed important email
Intrusion detection     High: arbitrary damage          High: wasted precious analyst time

Post-processing: spelling/grammar checkers to clean up results
Proofreading: much easier than verifying a network intrusion

Assume: A breathalyzer gets the answer right 90% of the time
It detects a driver as drunk
Question: What is the probability the driver is actually drunk?
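The breathalyzer question cannot be answered from the 90% figure alone; the answer also depends on how common drunk drivers are among those tested. The sketch below works this through with Bayes' rule. It assumes that "right 90% of the time" means both the true-positive and true-negative rates are 0.90, and the prior values in the loop are hypothetical, chosen only to show how strongly the base rate drives the result.

```python
def posterior_positive(sensitivity, specificity, prior):
    """P(condition | positive test), computed with Bayes' rule."""
    true_pos = sensitivity * prior                   # drunk and flagged
    false_pos = (1.0 - specificity) * (1.0 - prior)  # sober but flagged
    return true_pos / (true_pos + false_pos)


ACCURACY = 0.90  # from the slide; applied to both error types (assumption)

# Hypothetical base rates: fraction of tested drivers who are actually drunk.
for prior in (0.5, 0.1, 0.01, 0.001):
    p = posterior_positive(ACCURACY, ACCURACY, prior)
    print(f"prior={prior:<6} P(drunk | positive)={p:.3f}")
```

With a rare condition, most positives are false positives even for an accurate test. The same arithmetic applies to an intrusion detector, where attacks are a tiny fraction of all events.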

Attack-free data is hard to obtain
Labeled data is expensive to obtain

          Product recommendation  Spam detection  Intrusion detection
Training  Supervised              Supervised      Unsupervised

      Product recommendation  Spam detection  Intrusion detection
Goal  Classify                Classify        Classify and interpret

Network operator needs actionable reports
  What does the anomaly mean?
  Abnormal activity vs. attack
  Incorporation of site-specific security policies
  Relation between features of anomaly detection and semantics of the environment

Variability across all layers of the network
  Even the most basic characteristics: bandwidth, duration of connections, application mix
  Large bursts of activity

What is a stable notion of normality?
Anomalies ≠ attacks
One solution: reduced granularity
  Example: time-of-day, day-of-week
  Pro: more stable
  Con: reduced visibility
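One way to picture reduced granularity is to profile traffic per (day-of-week, hour-of-day) bucket instead of per event. The sketch below is a minimal illustration under stated assumptions: the function names, the choice of mean and standard deviation as the profile, and the hourly connection counts are all hypothetical and not taken from the slides.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, pstdev


def bucket(ts):
    """Coarse time bucket: (day of week, hour of day)."""
    return (ts.weekday(), ts.hour)


def build_profile(history):
    """history: iterable of (datetime, value) pairs, e.g. connections per hour."""
    groups = defaultdict(list)
    for ts, value in history:
        groups[bucket(ts)].append(value)
    return {key: (mean(vals), pstdev(vals)) for key, vals in groups.items()}


def deviation(profile, ts, value):
    """Standard deviations from the bucket mean; None if the bucket was never seen."""
    if bucket(ts) not in profile:
        return None
    mu, sigma = profile[bucket(ts)]
    if sigma == 0:
        return 0.0 if value == mu else float("inf")
    return abs(value - mu) / sigma


# Hypothetical hourly connection counts on three consecutive Mondays.
history = [
    (datetime(2024, 1, 1, 9), 100), (datetime(2024, 1, 8, 9), 110),
    (datetime(2024, 1, 15, 9), 90),
    (datetime(2024, 1, 1, 3), 5), (datetime(2024, 1, 8, 3), 7),
    (datetime(2024, 1, 15, 3), 6),
]
profile = build_profile(history)
print(deviation(profile, datetime(2024, 1, 22, 9), 100))  # same load, business hours
print(deviation(profile, datetime(2024, 1, 22, 3), 100))  # same load, 3 a.m.
```

The same count is unremarkable at 9 a.m. and far outside the profile at 3 a.m. The trade-off named on the slide also shows up here: anything that happens within a single hour is invisible to this profile.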

Adversaries adapt
  ML assumptions do not necessarily hold
    I.i.d., stationary distributions, linear separability, etc.
  ML algorithm itself can be an attack target
    Mistraining, evasion

Difficulties with data
  Data's sensitive nature
  Lack of appropriate public data
    Automated translation: European Union documents
  Simulation
    Capturing characteristics of real data
    Capturing novel attack detection
  Anonymization
    Fear of de-anonymization
    Removing features of interest to anomaly detection

Interpreting the results
  "HTTP traffic of host did not match profile"
  Contrast with spam detection: little room for interpretation
Adversarial environment
  Contrast with product recommendation: little incentive to mislead the recommendation system

Using tools borrowed from ML in inappropriate ways
Goal: effective adoption of ML for large-scale operational environments
  Not a black-box approach
  Crisp definition of context
  Understanding semantics of detection

Understand the threat model
Keep the scope narrow
Reduce the costs
Use secure ML
Evaluation
Gain insights into the problem space

What kind of target environment?
  Academic vs. enterprise; small vs. large/backbone
Cost of missed attacks
  Security demands, other deployed detectors
Attackers' skills and resources
  Targeted vs. background radiation
Risk posed by evasion

What are the specific attacks to detect?
Choose the right tool for the task
  ML is not a silver bullet
  Common pitfall: starting with the intention to use ML or, even worse, a particular ML tool
  No Free Lunch Theorem
Identify the appropriate features

Features: byte frequencies in packet payloads
Algorithm: detect packets with anomalous frequency patterns
Assumption: attack payloads have different payload byte frequencies
Question: Where does this assumption come from?
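To make the byte-frequency idea concrete, here is a minimal sketch of such a detector. It is an illustration, not the method of any specific system: the per-byte mean and standard deviation profile, the smoothing constant `alpha`, and the sample payloads are all assumptions introduced here.

```python
import math
from collections import Counter


def byte_freq(payload):
    """Relative frequency of each of the 256 byte values in a payload."""
    counts = Counter(payload)
    n = len(payload) or 1
    return [counts.get(b, 0) / n for b in range(256)]


def train(payloads):
    """Per-byte mean and standard deviation over the training payloads."""
    freqs = [byte_freq(p) for p in payloads]
    n = len(freqs)
    mean = [sum(f[b] for f in freqs) / n for b in range(256)]
    std = [math.sqrt(sum((f[b] - mean[b]) ** 2 for f in freqs) / n)
           for b in range(256)]
    return mean, std


def score(model, payload, alpha=0.001):
    """Sum of per-byte deviations, scaled by std; alpha avoids division by zero."""
    mean, std = model
    f = byte_freq(payload)
    return sum(abs(f[b] - mean[b]) / (std[b] + alpha) for b in range(256))


# Hypothetical training data: plain-text HTTP request lines.
normal = [
    b"GET /index.html HTTP/1.1",
    b"GET /about.html HTTP/1.1",
    b"GET /images/logo.png HTTP/1.1",
]
model = train(normal)
similar = b"GET /login.html HTTP/1.1"
binary = bytes(range(128, 256)) * 2  # stand-in for a binary payload

print(score(model, similar), score(model, binary))
```

The binary payload scores much higher than the text request, but only because it differs in byte statistics. The slide's question remains: nothing here guarantees that a real attack payload looks statistically different from benign traffic.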

Threat model: web-based attacks using input parameters to web applications
Why anomaly detection: attacks share conceptual similarities, yet differ enough in their specifics to be hard to capture with signatures
Data: successful GET requests to CGI apps, from web server access logs
Features: length of attribute value, character distribution of attribute value
Why are these features relevant?
  Length: a buffer overflow needs to send shellcode and padding
  Character distribution: directory traversal uses many "." and "/" characters
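The two features can be sketched in a few lines. This is a hedged illustration only: the length model uses a Chebyshev-style bound as one possible scoring choice, and `special_fraction` is a deliberately simplified stand-in for a full character distribution model. The function names and training values are hypothetical.

```python
from statistics import mean, pvariance


def train_length(values):
    """Mean and variance of attribute-value lengths seen in training."""
    lengths = [len(v) for v in values]
    return mean(lengths), pvariance(lengths)


def length_prob(model, value):
    """Chebyshev upper bound on seeing a length this far above the mean."""
    mu, var = model
    length = len(value)
    if length <= mu:
        return 1.0  # short values are not treated as suspicious here
    return min(1.0, var / (length - mu) ** 2)


def special_fraction(value, chars="./"):
    """Fraction of characters that are '.' or '/' (simplified feature)."""
    if not value:
        return 0.0
    return sum(value.count(c) for c in chars) / len(value)


# Hypothetical benign values for an "itemid" attribute.
model = train_length(["109", "42", "7781", "15"])

overflow = "A" * 200                  # long padding, as in a buffer overflow
traversal = "../../../etc/passwd"     # directory traversal attempt

print(length_prob(model, "123"), length_prob(model, overflow))
print(special_fraction("109"), special_fraction(traversal))
```

Both features are tied to a specific threat model, which is the point of the slide: the length feature is relevant because overflows need room for shellcode and padding, and the character feature is relevant because traversal strings are dominated by a few characters.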

Reduce the system's scope
Classification over outlier detection
Aggregate features over suitable intervals
Post-process the alerts
Provide meta-information to the analyst to speed up inspection

If you know the enemy and know yourself, you need not fear the result of a hundred battles. If you know yourself but not the enemy, for every victory gained you will also suffer a defeat. If you know neither the enemy nor yourself, you will succumb in every battle. Sun Tzu, The Art of War

Develop insight into the anomaly detection system's capabilities
  What can/can't it detect? Why?

ML as a means to identify important features
  Use those features to build non-ML detectors
ML as a means to an end

Outside the Closed World: On Using Machine Learning for Network Intrusion Detection, Sommer-Paxson, 2010
Challenging the Anomaly Detection Paradigm: A Provocative Discussion, Gates-Taylor, 2007
The Base-Rate Fallacy and Its Implications for the Difficulty of Intrusion Detection, Axelsson, 1999