New Directions in Traffic Measurement and Accounting. Need for traffic measurement. Relation to stream databases. Internet backbone monitoring

Similar documents
New Directions in Traffic Measurement and Accounting

.7 million (Fix-West) and 0.8 million (MCI). Even with aggregation, the numberofflows in hour in the Fix-West used by [8]was as large as 0.5 million.

KNOM Tutorial Internet Traffic Matrix Measurement and Analysis. Sue Bok Moon Dept. of Computer Science

CIS 551 / TCOM 401 Computer and Network Security. Spring 2007 Lecture 12

Software Defined Traffic Measurement with OpenSketch

Enhancing Byte-Level Network Intrusion Detection Signatures with Context

Summary Cache based Co-operative Proxies

Power of Slicing in Internet Flow Measurement. Ramana Rao Kompella Cristian Estan

On the Difficulty of Scalably Detecting Network Attacks

Scalable Enterprise Networks with Inexpensive Switches

Language-Directed Hardware Design for Network Performance Monitoring

Heavy-Hitter Detection Entirely in the Data Plane

Catching Accurate Profiles in Hardware

Practical Lazy Scheduling in Wireless Sensor Networks. Ramana Rao Kompella and Alex C. Snoeren

Bloom filters and their applications

Wei Wang, Mehul Motani and Vikram srinivasan Department of Electrical & Computer Engineering National University of Singapore, Singapore

MAD 12 Monitoring the Dynamics of Network Traffic by Recursive Multi-dimensional Aggregation. Midori Kato, Kenjiro Cho, Michio Honda, Hideyuki Tokuda

Week 7: Traffic Models and QoS

Language-Directed Hardware Design for Network Performance Monitoring

New Directions in Traffic Measurement and Accounting

Toward a Reliable Data Transport Architecture for Optical Burst-Switched Networks

Some Observations of Internet Stream Lifetimes

Counter Braids: A novel counter architecture

Chapter 3 Part 2 Switching and Bridging. Networking CS 3470, Section 1

Supporting Service Differentiation for Real-Time and Best-Effort Traffic in Stateless Wireless Ad-Hoc Networks (SWAN)

Summarizing and mining inverse distributions on data streams via dynamic inverse sampling

IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 14, NO. 5, OCTOBER

Introduction to Data Mining

Lesson n.11 Data Structures for P2P Systems: Bloom Filters, Merkle Trees

Estimating Persistent Spread in High-speed Networks Qingjun Xiao, Yan Qiao, Zhen Mo, Shigang Chen

SCREAM: Sketch Resource Allocation for Software-defined Measurement

Be Fast, Cheap and in Control with SwitchKV. Xiaozhou Li

Medium Access Control

Security: Worms. Presenter: AJ Fink Nov. 4, 2004

NetFlow-based bandwidth estimation in IP networks

Simply Top Talkers Jeroen Massar, Andreas Kind and Marc Ph. Stoecklin

DevoFlow: Scaling Flow Management for High Performance Networks

Module 1: Basics and Background Lecture 4: Memory and Disk Accesses. The Lecture Contains: Memory organisation. Memory hierarchy. Disks.

Making Session Stores More Intelligent KYLE J. DAVIS TECHNICAL MARKETING MANAGER REDIS LABS

Outline. Motivation. Our System. Conclusion

Project 0: Implementing a Hash Table

Switch and Router Design. Packet Processing Examples. Packet Processing Examples. Packet Processing Rate 12/14/2011

Cascade Mapping: Optimizing Memory Efficiency for Flash-based Key-value Caching

Big Data Analytics CSCI 4030

Lecture 2: Streaming Algorithms for Counting Distinct Elements

A Framework for Efficient Class-based Sampling

TVA: A DoS-limiting Network Architecture L

AutoFocus: A Tool for Automatic Traffic Analysis. Cristian Estan, University of California, San Diego

Joint Data Streaming and Sampling Techniques for Detection of Super Sources and Destinations

Counter Braids: A novel counter architecture

Buffered Count-Min Sketch on SSD: Theory and Experiments

Design and Performance Analysis of a DRAM-based Statistics Counter Array Architecture

Tracking Frequent Items Dynamically: What s Hot and What s Not To appear in PODS 2003

Software Defined Networking

Increase-Decrease Congestion Control for Real-time Streaming: Scalability

Robust validation of network designs under uncertain demands and failures

Application of SDN: Load Balancing & Traffic Engineering

Load Balancing with Minimal Flow Remapping for Network Processors

Towards High-performance Flow-level level Packet Processing on Multi-core Network Processors

IP Accounting C H A P T E R

A Hybrid Approach for Accurate Application Traffic Identification

Randomization. Randomization used in many protocols We ll study examples:

Randomization used in many protocols We ll study examples: Ethernet multiple access protocol Router (de)synchronization Switch scheduling

Broadcasting Techniques for Mobile Ad Hoc Networks

Forwarding and Routers : Computer Networking. Original IP Route Lookup. Outline

Safely Measuring Tor. Rob Jansen U.S. Naval Research Laboratory Center for High Assurance Computer Systems

Safely Measuring Tor. Rob Jansen U.S. Naval Research Laboratory Center for High Assurance Computer Systems

Chapter 2: Memory Hierarchy Design Part 2

measurement goals why traffic measurement of Internet is so hard? measurement needs combined skills diverse traffic massive volume of traffic

Dynamic Pipelining: Making IP- Lookup Truly Scalable

Counter Braids: A novel counter architecture

Configuring QoS CHAPTER

Memory Hierarchy. Memory Flavors Principle of Locality Program Traces Memory Hierarchies Associativity. (Study Chapter 5)

Cheap and Large CAMs for High Performance Data-Intensive Networked Systems- The Bufferhash KV Store

Data Stream Mining. Tore Risch Dept. of information technology Uppsala University Sweden

Authors: Mark Handley, Vern Paxson, Christian Kreibich

Bitmap Algorithms for Counting Active Flows on High Speed Links

Chapter 6 Memory 11/3/2015. Chapter 6 Objectives. 6.2 Types of Memory. 6.1 Introduction

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

Unit 6 Chapter 15 EXAMPLES OF COMPLEXITY CALCULATION

Inter-Data-Center Network Traffic Prediction with Elephant Flows

SSA: A Power and Memory Efficient Scheme to Multi-Match Packet Classification. Fang Yu, T.V. Lakshman, Martin Austin Motoyama, Randy H.

CS6303 Computer Architecture Regulation 2013 BE-Computer Science and Engineering III semester 2 MARKS

Implementation of a leaky bucket module for simulations in NS-3

Stadium. A Distributed Metadata-private Messaging System. Matei Zaharia Nickolai Zeldovich SOSP 2017

Caches II CSE 351 Spring #include <yoda.h> int is_try(int do_flag) { return!(do_flag!do_flag); }

Informatica Universiteit van Amsterdam. Distributed Load-Balancing of Network Flows using Multi-Path Routing. Kevin Ouwehand. September 20, 2015

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

LECTURE 11. Memory Hierarchy

Summary Cache: A Scalable Wide-Area Web Cache Sharing Protocol

Data Streaming Algorithms for Efficient and Accurate Estimation of Flow Size Distribution

Key aspects of cloud computing. Towards fuller utilization. Two main sources of resource demand. Cluster Scheduling

Network Data Streaming A Computer Scientist s Journey in Signal Processing

Chapter 14 Performance and Processor Design

One-Pass Streaming Algorithms

Scalable and Robust DDoS Detection via Universal Monitoring

PCnet-FAST Buffer Performance White Paper

A Comparison of Performance and Accuracy of Measurement Algorithms in Software

Deliverable 1.1.6: Finding Elephant Flows for Optical Networks

Memory-Constrained Aggregate Computation over Data Streams

Transcription:

New Directions in Traffic Measurement and Accounting C. Estan and G. Varghese Presented by Aaditeshwar Seth 1 Need for traffic measurement Internet backbone monitoring Short term Detect DoS attacks Long term Traffic rerouting and link upgrades Accounting for usage based pricing System tasks Develop reports on flow usage: Size of flows 2 Relation to stream databases Large number of incoming tuples Map operator to differentiate between flows Sum operator to find size of flows 3 1

Issues Scalability with growing link speeds and number of flows Database solutions Indexing and query optimization Networking solutions Same, but not stated 4 Uses sampling Inaccurate Tracks all flows Netflow Resource intensive Hence, expensive: Use DRAM slower Also, too much state needs to be reported back 5 [Estan and Verghese, 2002] Reduce the problem Only track large flows #1: Identify large flows Reduces resource requirements #2: For these flows, do accurate tracking Reduced resource requirements allow faster and expensive SRAM to be used 6 2

Overview Algorithms Sample and hold Multistage filters Optimizations Entry preservation Early removal Shielding Conservative update of counters Analytical comparisons Experimental comparisons Implementation issues 7 Sample and hold Verification done through traces Oversampling: Assumptions about traffic mix to avoid collisions Threshold: Application specific Implementation simplicity P = Probability of sampling a byte. Adjust this. How? P = O / T High P implies more false positives Low P implies more latency in tracking, false negatives with bursty traffic can creep in 8 Multistage filters Actually yes! Suppose, d hashtables of size m with different hash functions. Probability of collision = e / n, where e depends on the hash function, n is the size of the hash table N values to hash With single hashtable of size dm, Collision prob = Ne / dm Higher accuracy With multiple hashtables, Collision prob = (Ne / m) Having more hashtables to avoid collisions is better than d one large hashtable: General theorem? Reduces both false positives and false negatives Logarithmic scaling Measurement interval also depend on traffic mix. How to infer? 9 3

Optimizations Entry preservation Retain all flows > T and all flows added in the current interval Large flows are measured accurately Early removal For flows added in the current interval, retain only flows > R Prevents small flows from carrying forward Shielding Do not hash large flows. Use a different hashtable for computing large flows? Prevents piggy-backing of small flows on large flows Conservative update of counters Update smallest counter. Trim other counters Prevents false positives 10 Analytical comparisons Constraint: Only use M memory entries z: % of link capacity used by a flow r: implementation factor n: number of flows x: sampling rate (1 in x get sampled) Accuracy comes at the cost of processing 11 Analytical comparisons: Netflow memory not limited t: measurement interval O: oversampling u: zc/t = how much larger is a flow than the threshold 12 4

Experimental comparisons 3 traces 3 definitions of flows 5 tuple: Src IP, Src port, Dst IP, Dst port, Proto Dst IP Src AS, Dst AS Few flows dominate Underutilized links 13 Experimental comparisons: Relative error = Average error / Threshold Over estimation with general and Zipfian distribution is due to underutilized links Bounds calculated using the link capacity 14 Experimental comparisons: Percentage of false positives decreases exponentially with depth of filter Tighter bounds Bounds calculated using the maximum traffic 15 5

Experimental comparisons: Flow Ids are 5 tuples Error = Sum of errors / Sum of flow sizes Algorithms good for large and very large flows 16 Experimental comparisons: Flow Ids are (Src AS, Dst AS) Few large flows Algorithms still work well 17 Implementation issues Estimation of parameters ADAPTTHRESHOLD algorithm Utilize up to 90% of hash table Leads to graceful degradation in case of over use, greater number of flow statistics in case of under use Similar to dynamic adjustment of the window size: Control sampling rate of tuples to drop Do the magic values work only on traces, or are they generic? 18 6

Conclusions and discussions Complete solution: algorithms, analytical evaluation, experimental evaluation, implementation design Exploitation of specific facts about distribution can simplify the problem for certain applications Argument of cheap DRAM Vs expensive SRAM is not good Not considered Netflow optimizations to reduce resource usage Assume that per packet sampling can be done through SRAM, and it scales with line speeds. Verification? Examples in proofs do not tell anything! Too many magic values. Suggests the need for frequent fine tuning. Automation? 19 7