Estimating Persistent Spread in High-speed Networks Qingjun Xiao, Yan Qiao, Zhen Mo, Shigang Chen


Estimating Persistent Spread in High-speed Networks. Qingjun Xiao, Yan Qiao, Zhen Mo, Shigang Chen. Southeast University (China) and University of Florida.

Motivation for Persistent Stealthy Spreaders. Imagine a scenario: a farm of servers is located in an intranet. The intranet is protected by a gateway router, which inspects the traffic flows passing through it.

Motivation for Persistent Stealthy Spreaders (cont.). Various malicious attacks may come from the Internet, for example network/port scanning and distributed denial-of-service (DDoS) attacks.

Traditional Defense Technique Deployed at the Gateway Router. Flow-based traffic monitoring. For DDoS: monitor the per-destination flow, the stream of packets sent to a common destination IP. For network scanners: monitor the per-source flow, the stream of packets sent from a common source IP. [Figure: a per-destination flow and a per-source flow passing through the gateway between sources and destinations.]

Traditional Defense Technique: Super-spreader Detection. The spread of a flow is the number of distinct elements in it. The spread of a per-destination flow is the number of distinct source addresses. The spread of a per-source flow is the number of distinct destination addresses. A super-spreader detector locates the elephant flows whose spread exceeds a predefined threshold [ref1, ref2]. [Figure: per-destination and per-source flows through the gateway.]
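The definition of spread and the threshold test can be illustrated with exact sets. This is only a minimal sketch for intuition: a real detector at line speed would use a compact estimator instead of per-flow sets, and the packet trace and function names below are hypothetical.

```python
from collections import defaultdict

def per_source_spread(packets):
    """Map each source to the number of distinct destinations it contacted."""
    seen = defaultdict(set)
    for src, dst in packets:
        seen[src].add(dst)
    return {src: len(dsts) for src, dsts in seen.items()}

def super_spreaders(spread, threshold):
    """Return the flows whose spread exceeds the threshold, sorted by label."""
    return sorted(flow for flow, s in spread.items() if s > threshold)

# Hypothetical packet trace: (source, destination) pairs.
packets = [
    ("10.0.0.1", "192.168.1.1"),
    ("10.0.0.1", "192.168.1.2"),
    ("10.0.0.1", "192.168.1.3"),
    ("10.0.0.1", "192.168.1.3"),  # repeated contact, not counted twice
    ("10.0.0.2", "192.168.1.1"),
]
spread = per_source_spread(packets)
print(spread, super_spreaders(spread, threshold=2))
```

Swapping the roles of `src` and `dst` in `per_source_spread` gives the per-destination spread used for DDoS monitoring.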

Why New Techniques? The super-spreader detector may fail to discover malicious activities. Example: stealthy degrade-of-quality attack. The attacker reduces the number of attacking machines to the scale of the number of legitimate users, which makes it difficult to differentiate "too many users" from "under attack".

Why New Techniques? (cont.) Another example: stealthy network scan. The attacker reduces the probing rate to avoid detection. It probes the intranet at a low rate and scans different network sections in different time periods, or it uses a botnet to perform a coordinated scan. [Figure: source, gateway, and destinations in Periods 1, 2, and 3.]

A Useful Traffic Feature to Detect Stealthy Attacks. The traffic of stealthy attackers persists for much longer than that of legitimate users. Case 1, stealthy degrade-of-quality attacks: legitimate users, when contacting web servers, typically stay for less than 20 minutes; in contrast, attackers send requests persistently to the web servers to degrade their performance. Case 2, stealthy network scan: attackers scan the protected network for a long duration in order to find vulnerabilities, and for better efficiency they avoid letting the network section scanned in one period overlap with another.

An Intuitive Explanation of Persistent Spread. [Figure: elements $e_1, \ldots, e_6$ observed across Periods 1, 2, and 3, marked as persistent elements or transient elements.] Persistent spread is the number of persistent elements, e.g., $|\{e_1, e_4, e_6\}| = 3$.

Problem Definition: Persistent Spread Estimation. Notations: let $t$ be the number of measurement periods. For a flow of interest, let $S_i$ be the set of elements that have been observed in the $i$-th period, $1 \le i \le t$. Problem: estimate the cardinality of the intersection of the $t$ sets $S_1, S_2, \ldots, S_t$, i.e., $|S^*| = |S_1 \cap S_2 \cap \cdots \cap S_t|$.
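As a reference point, the exact persistent spread is just the size of a set intersection. The sketch below computes that ground truth; the contents of the three periods are hypothetical and chosen only so that e1, e4, and e6 are the persistent elements.

```python
def persistent_spread(periods):
    """Exact persistent spread: size of the intersection of all period sets."""
    sets = [set(p) for p in periods]
    if not sets:
        return 0
    return len(set.intersection(*sets))

# Hypothetical observations of one flow over t = 3 periods.
periods = [
    {"e1", "e2", "e4", "e6"},  # period 1
    {"e1", "e3", "e4", "e6"},  # period 2
    {"e1", "e4", "e5", "e6"},  # period 3
]
print(persistent_spread(periods))
```

Keeping every set explicitly costs memory proportional to the number of elements, which is the cost that compact estimators try to avoid.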

Challenges. Constraint on memory usage: a good estimator design must use the on-chip SRAM of the NIC to support high packet processing speed. It must use only a small portion of the on-chip SRAM (e.g., 1 Mb), since on-chip SRAM is shared by many other functions (routing, security, ...). [Figure: router architecture, showing the line card or NIC (network interface card) with on-chip SRAM, the bus and switch fabric, the data plane and control plane, main memory, and CPU.]

Challenges (cont.) Fast online operation (encoding) to keep up with line speed. Scalability: simultaneously measuring a large number of flows. Wide operating range, to measure elephant flows effectively.

Baseline Solution: Hash Table with Partial Signatures. In the $i$-th period, set $S_i$ is recorded as a hash table $A_i$, maintained in on-chip SRAM. $A_i$ is an array of hash buckets indexed by h(element); each entry holds an 8-bit partial signature plus a 32-bit pointer. At the end of the $i$-th period, $A_i$ is downloaded to main memory for post-processing. Output: $|A_1 \cap A_2 \cap \cdots \cap A_t|$. Since $S_i$ is stored uncompressed, it has a high memory cost of 40 bits per element.

Another Solution: Flajolet-Martin (FM) Sketches. Set $S_i$ is compressed and stored as a continuous variant of FM sketches [ref 5]: an array of buckets $Y$ indexed by h(element), where each bucket is a floating-point number (e.g., 0.3) that follows an exponential distribution parameterized by the number of elements. But it is inaccurate, because it works by estimating the ratio $|S_1 \cap S_2 \cap \cdots \cap S_t| \,/\, |S_1 \cup S_2 \cup \cdots \cup S_t|$. When the number of periods $t$ grows, the ratio shrinks and becomes harder to estimate accurately.

3rd Solution Based on Union of Bitmaps. In the $i$-th period, set $S_i$ is stored as a bitmap $B$ in on-chip SRAM, e.g., B: 0 1 1 0 0 0 1 0. Each element sets B[h(element)] := 1. When the $i$-th period ends, $B_i$ is downloaded to main memory. Main memory then holds $t$ bitmaps $B_1, B_2, \ldots, B_t$, which correspond to the sets $S_1, S_2, \ldots, S_t$ of the $t$ periods.
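A single period's bitmap can be sketched as follows. The recording step is the one on the slide, B[h(element)] := 1; the estimate uses the standard linear-counting formula $\hat{n} = \ln Z / \ln(1 - 1/m)$, where $Z$ is the fraction of zero bits and $m$ the bitmap length. The slides do not spell out this single-bitmap estimator, so treat it as an assumption, as is the choice of BLAKE2b as the hash h.

```python
import hashlib
import math

def h(element: str, m: int) -> int:
    """Map an element to a bit index in [0, m). The hash choice is an assumption."""
    digest = hashlib.blake2b(element.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") % m

def record_period(elements, m: int) -> bytearray:
    """Build the bitmap of one period: B[h(element)] := 1 for every element."""
    B = bytearray(m)  # one byte per bit, for readability rather than compactness
    for e in elements:
        B[h(e, m)] = 1
    return B

def linear_count(B: bytearray) -> float:
    """Estimate the number of distinct elements from the zero-bit fraction."""
    z = B.count(0) / len(B)
    if z == 0.0:
        raise ValueError("bitmap is saturated; no estimate is possible")
    return math.log(z) / math.log(1.0 - 1.0 / len(B))

# 1000 distinct hypothetical addresses, each seen twice in the period.
addresses = [f"10.0.{i // 256}.{i % 256}" for i in range(1000)] * 2
B = record_period(addresses, m=8192)
print(round(linear_count(B)))
```

Duplicates set the same bit again, so repeated packets from one address do not inflate the estimate.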

3rd Solution Based on Union of Bitmaps (cont.) The inclusion-exclusion rule converts the intersection cardinality into a weighted sum of union cardinalities. The union cardinality $|S_1 \cup S_2 \cup \cdots \cup S_t|$ can be estimated from the bitwise OR $B_1 \lor B_2 \lor \cdots \lor B_t$. However, when the number of periods $t$ grows, $B_1 \lor B_2 \lor \cdots \lor B_t$ becomes too dense.
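For reference, the inclusion-exclusion rule mentioned here takes the following standard form for $t = 2$, $t = 3$, and general $t$ (the index set $J$ is notation introduced for this note). Each union term on the right-hand side would be estimated from the OR of the corresponding bitmaps.

```latex
\begin{align*}
|S_1 \cap S_2| &= |S_1| + |S_2| - |S_1 \cup S_2| \\
|S_1 \cap S_2 \cap S_3| &= |S_1| + |S_2| + |S_3|
  - |S_1 \cup S_2| - |S_1 \cup S_3| - |S_2 \cup S_3|
  + |S_1 \cup S_2 \cup S_3| \\
\Bigl|\bigcap_{i=1}^{t} S_i\Bigr| &=
  \sum_{\emptyset \neq J \subseteq \{1,\dots,t\}} (-1)^{|J|+1}
  \Bigl|\bigcup_{j \in J} S_j\Bigr|
\end{align*}
```

The general sum has $2^t - 1$ terms, and the terms with large $J$ are unions over many periods, which is where the dense OR bitmaps hurt.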

Our Solution Based on Intersection of Bitmaps. Our solution: use the intersection bitmap $B_1 \land B_2 \land \cdots \land B_t$. Intuition: a persistent element sets the same bit to one in $B_1, B_2, \ldots, B_t$, which distinguishes it from transient elements. [Figure: bitmaps $B_1$, $B_2$, $B_3$ with the bits set by transient element 1, a persistent element, and transient element 2.]

Our Solution Based on Intersection of Bitmaps (cont.) Notations: $Z_i$ is the fraction of bits in $B_i$ that are zeros. $Z^*$ is the fraction of zero bits in the bit array $B^* = B_1 \land B_2 \land \cdots \land B_t$. $n^*$ is the number of persistent elements to estimate. For $t = 2$ and for $t = 3$, closed-form estimators are given.
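For $t = 2$, one closed form can be derived under the usual uniform-hashing assumption; it is offered here as a sketch and may differ in presentation from the authors' estimator. With bitmap length $m$ (an assumed name), a bit is zero in both bitmaps with probability $Z_1 Z_2 / (1 - 1/m)^{n^*}$, and the fraction of bits that are zero in both equals $Z_1 + Z_2 - Z^*$. Solving gives $\hat{n}^* = [\ln Z_1 + \ln Z_2 - \ln(Z_1 + Z_2 - Z^*)] / \ln(1 - 1/m)$.

```python
import hashlib
import math

def bitmap(elements, m: int) -> bytearray:
    """Record a set of elements into an m-bit bitmap (one byte per bit)."""
    B = bytearray(m)
    for e in elements:
        d = hashlib.blake2b(e.encode(), digest_size=8).digest()  # assumed hash
        B[int.from_bytes(d, "big") % m] = 1
    return B

def estimate_t2(z1: float, z2: float, z_star: float, m: int) -> float:
    """Persistent-spread estimate for t = 2 from the three zero fractions."""
    both_zero = z1 + z2 - z_star  # fraction of bits that are zero in B1 and B2
    num = math.log(z1) + math.log(z2) - math.log(both_zero)
    return num / math.log(1.0 - 1.0 / m)

def persistent_estimate_t2(B1: bytearray, B2: bytearray) -> float:
    """Apply estimate_t2 to two recorded bitmaps of equal length."""
    m = len(B1)
    B_star = bytes(a & b for a, b in zip(B1, B2))  # intersection bitmap
    return estimate_t2(B1.count(0) / m, B2.count(0) / m, B_star.count(0) / m, m)

# Hypothetical flow: 500 persistent elements plus 500 transient ones per period.
persistent = [f"p{i}" for i in range(500)]
S1 = persistent + [f"a{i}" for i in range(500)]
S2 = persistent + [f"b{i}" for i in range(500)]
m = 16384
print(round(persistent_estimate_t2(bitmap(S1, m), bitmap(S2, m))))
```

The estimate is statistical, so the printed value should land near 500 rather than exactly on it.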

Our Solution Based on Intersection of Bitmaps (cont.) When $t > 3$, a numerical method is proposed, in which the estimate is calculated iteratively.

Next Question: How Big Should the Bitmaps Be? One size for all: if too big, memory is wasted; if too small, elephant flows are measured inaccurately. Flow spread distribution: [Figure: number of flows versus flow spread, a power-law distribution on a log-log plot; from CAIDA traces, measurement duration = 1 min.]

Virtual Bitmaps: One Physical Bitmap Shared by All Flows. Our design: all flows share a single physical bitmap. Each flow constructs a virtual bitmap by drawing bits pseudo-randomly from the shared physical bitmap. [Figure: the physical bitmap and the virtual bitmap for a flow x.] Source: Myungkeun Yoon, Tao Li, Shigang Chen, Jih-Kwon Peir, "Fit a Compact Spread Estimator in Small High-Speed Memory," TON, vol. 19, no. 5, 2011.

Advantages. Compactness: with sharing, elephant flows can borrow space from mice flows. Scalability: able to estimate many more flows simultaneously. Simple online operation: for each packet (src, dst), set M[i] := 1, where i = H((H(src) mod m) ⊕ dst) mod u.
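The online operation can be sketched as below. The reading of the symbols is an assumption: M is the shared physical bitmap of u bits, m is the virtual bitmap length, dst is the flow label, and src is the element. The hash H is a stand-in, since the slides do not name one. Under this reading, bit j of the virtual bitmap of flow dst is the physical bit M[H(j ⊕ dst) mod u].

```python
import hashlib

def H(x: int) -> int:
    """Stand-in 64-bit hash (an assumption; the slides do not name one)."""
    digest = hashlib.blake2b(x.to_bytes(8, "big"), digest_size=8).digest()
    return int.from_bytes(digest, "big")

def record(M: bytearray, src: int, dst: int, m: int) -> int:
    """Encode packet (src, dst): one hash chain and one memory write."""
    u = len(M)
    i = H((H(src) % m) ^ dst) % u
    M[i] = 1
    return i

def virtual_bitmap(M: bytearray, dst: int, m: int) -> list:
    """Read back the m bits that flow dst draws from the physical bitmap."""
    u = len(M)
    return [M[H(j ^ dst) % u] for j in range(m)]

u, m = 1 << 16, 256            # hypothetical sizes
M = bytearray(u)               # one byte per bit, for readability
src, dst = 0x0A000001, 0xC0A80101  # 10.0.0.1 -> 192.168.1.1 as integers
record(M, src, dst, m)
vb = virtual_bitmap(M, dst, m)
print(sum(M), sum(vb))
```

Because the virtual bitmap is reconstructed from the hash alone, no per-flow state is kept on chip; only the shared array M is written.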

Bias of Virtual Bitmaps. Positive bias due to bit sharing: two virtual bitmaps may share the same bits. For one flow, the elements coming from other flows are called noise. Noise causes a positive estimation bias. [Figure: a physical bitmap in which virtual bitmap 1 and virtual bitmap 2 share a bit that is set to 1.]

Consider Multiple Monitoring Periods. [Figure: the physical bitmap with virtual bitmaps 1 and 2 in Period 1, and again in Period 2.] The estimate for flow 1 uses the intersection of the virtual bitmaps of flow 1 in time periods 1, 2, ..., t.

Compensate Positive Bias for Virtual Bitmaps in Multiple Periods. Use $t = 2$ as an example; the equations for $t = 3, 4, \ldots$ can be derived similarly. (a) Estimate the number of persistent elements that have been mapped to the virtual bit vector. (b) Estimate the number of persistent elements for all flows in the physical bitmap. (c) Noise removal: estimate the number of persistent elements that belong to the flow of interest.
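Steps (a) to (c) can be sketched for $t = 2$ as below. This is one plausible realization under stated assumptions, not necessarily the authors' equations: each element of the flow hits a given virtual bit with probability 1/m, each noise element hits a given physical bit with probability 1/u, and the same $t = 2$ zero-fraction estimator is applied to the virtual bitmaps (length m) and to the physical bitmaps (length u). All function and variable names are hypothetical.

```python
import math

def persistent_t2(z1: float, z2: float, z_star: float, size: int) -> float:
    """t = 2 estimate from zero fractions of B1, B2, and B1 AND B2."""
    both_zero = z1 + z2 - z_star
    num = math.log(z1) + math.log(z2) - math.log(both_zero)
    return num / math.log(1.0 - 1.0 / size)

def remove_noise(n_virtual: float, n_physical: float, m: int, u: int) -> float:
    """Step (c): subtract the share of other flows' persistent elements.

    A noise element weighs r = ln(1 - 1/u) / ln(1 - 1/m) (about m/u) as much
    as one of the flow's own elements inside the virtual bitmap, so
    n_virtual = n_flow + r * (n_physical - n_flow).
    """
    r = math.log(1.0 - 1.0 / u) / math.log(1.0 - 1.0 / m)
    return (n_virtual - r * n_physical) / (1.0 - r)

# Worked check with expected zero fractions (hypothetical traffic mix).
m, u = 1024, 1 << 20
n_flow, a1, a2 = 200, 50, 80          # flow: persistent, transient per period
P, b1, b2 = 50_000, 100_000, 120_000  # noise: persistent, transient per period
qm, qu = 1.0 - 1.0 / m, 1.0 - 1.0 / u

zv1 = qm ** (n_flow + a1) * qu ** (P + b1)        # virtual bitmap, period 1
zv2 = qm ** (n_flow + a2) * qu ** (P + b2)        # virtual bitmap, period 2
zv_star = zv1 + zv2 - qm ** (n_flow + a1 + a2) * qu ** (P + b1 + b2)
zp1 = qu ** (n_flow + a1 + P + b1)                # physical bitmap, period 1
zp2 = qu ** (n_flow + a2 + P + b2)                # physical bitmap, period 2
zp_star = zp1 + zp2 - qu ** (n_flow + a1 + a2 + P + b1 + b2)

n_virtual = persistent_t2(zv1, zv2, zv_star, m)    # step (a)
n_physical = persistent_t2(zp1, zp2, zp_star, u)   # step (b)
n_hat = remove_noise(n_virtual, n_physical, m, u)  # step (c)
print(round(n_virtual, 1), round(n_physical, 1), round(n_hat, 1))
```

In this worked check the step (a) estimate overshoots the flow's 200 persistent elements because of the shared bits, and step (c) removes that excess.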

Simulation Settings. Persistent spread is in the range of 0 to $10^4$. Signal-to-noise ratio (SNR) ranges from 1 to 0.4, where $\mathrm{SNR} = \dfrac{|S_1 \cap S_2 \cap \cdots \cap S_t|}{|S_i| - |S_1 \cap S_2 \cap \cdots \cap S_t|}$. FM and our solution: less than 1 bit per element.

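Simulation Results. [Figure: three panels. (1) Hash table with partial signatures. (2) FM sketch method, based on $|S_1 \cap S_2 \cap \cdots \cap S_t| \,/\, |S_1 \cup S_2 \cup \cdots \cup S_t|$. (3) Our intersection-based virtual bitmap method, based on $B_1 \land B_2 \land \cdots \land B_t$.]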

Summary of Contributions. Propose a new primitive for network flow monitoring, named the persistent spread estimator, which can detect stealthy network activities over long periods. Describe a solution that accurately estimates the persistent spread, with accuracy that improves as the number of time periods $t$ increases. Provide an extensive analysis of the statistical properties of the proposed methods, including estimator bias and variance. Present a comparative evaluation of three algorithms: hash table with partial signatures, FM sketch, and virtual bitmap.

Thanks! Questions? Presented by: Yan Qiao Ph.D., University of Florida