Analyzing Cacheable Traffic in ISP Access Networks for Micro CDN applications via Content-Centric Networking

Claudio Imbrenda, Luca Muscariello (Orange Labs), Dario Rossi (Telecom ParisTech)

Outline
- Motivation
- Monitoring
- Collected statistics
- Trace-driven simulation
- Traffic characterization
- Conclusions

Motivation
- Traffic is skyrocketing: Web traffic, video streaming
  - From outside of the ISP network
  - Unpredictable
- Increased investment in access = increased pressure on backhaul
- µCDN: caching at the very edge of the network
- Caching is a relatively new traffic engineering tool for ISPs
- Caching has multiple benefits
  - Better quality of service (latency)
  - Lower costs for operators
- Most ICN proposals include caching as a primitive
  - This study therefore helps to validate that approach
  - Traffic modeling is also useful

HTTP analysis
- A correct analysis of HTTP traffic for caching purposes requires an accurate analysis of the HTTP transactions
  - Timestamped requests and replies are not enough: each reply must be related to its request
  - Size of the objects (both advertised and effective), presence of ETag and cookies
- We need to identify the objects and their sizes to estimate the size of the caches
[Figure: timeline of one HTTP transaction. From client: "GET / HTTP/1.1, Host: www.orange.com ...". From server: "HTTP/1.1 200 OK, Content-Length: 10000 ...", then <DATA>, then <CONNECTION BROKEN>. The real length is marked against the 10000 bytes of advertised length.]
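
The pairing described on this slide can be sketched as follows. This is a minimal illustration, not HACkSAw's actual implementation: the event tuples, the `Transaction` record, and `pair_transactions` are hypothetical names, and the sketch assumes that replies on one TCP connection arrive in the same order as their requests (as HTTP/1.1 requires). The 4000-byte effective size in the usage note is an invented value for the broken-connection case.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Transaction:
    timestamp: float            # timestamp of the request
    url: str                    # Host header + request path
    advertised: Optional[int]   # Content-Length header, if present
    effective: int              # body bytes actually observed
    etag: Optional[str] = None

    @property
    def truncated(self) -> bool:
        # True when fewer bytes were seen than the server advertised
        return self.advertised is not None and self.effective < self.advertised


def pair_transactions(events):
    """Relate replies to requests on a single TCP connection.

    events (hypothetical format), in capture order:
      ("req", ts, host, path)
      ("rep", ts, content_length, etag, body_bytes)
    """
    pending = deque()   # requests still waiting for their reply
    paired = []
    for ev in events:
        if ev[0] == "req":
            _, ts, host, path = ev
            pending.append((ts, host + path))
        elif ev[0] == "rep" and pending:
            _, _, length, etag, body_bytes = ev
            req_ts, url = pending.popleft()
            paired.append(Transaction(req_ts, url, length, body_bytes, etag))
    return paired
```

For example, feeding `[("req", 0.0, "www.orange.com", "/"), ("rep", 0.1, 10000, None, 4000)]` yields one transaction whose `truncated` flag is set, which is the case the figure illustrates.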

Existing tools inadequate
- A tool has been developed to perform live analysis of HTTP traffic: HACkSAw
- Tstat: fast and lightweight TCP analyser
  - Doesn't correlate requests and replies, misses the size of too many objects
- Bro: intrusion detection
  - Accurate, but memory hungry and too slow

Benchmark of our tool (HACkSAw) compared to 2 other popular tools, on a 1-hour PCAP trace of mobile backhaul traffic (peak hour, several tens of thousands of users):

Tool        Detected requests   CPU [s]*   Memory usage [GB]   0-length replies
Tstat 2.4   2 531 210           445        0.3                 1 128 109
bro         2 559 056           8033       4.2                 424 355
HACkSAw     2 426 391           368        5.8                 328 465

* Cumulative user time as reported by the time UNIX utility

Dataset and vantage point
- Residential users (mostly fiber)
- April 18th to May 30th 2014: 42 days (~1000 hours)
- Few users
  - Investigating potential advantages of caching with low user aggregation

                    Total          Daily average
Distinct users      1478           1218
HTTP requests       369 238 000    8 586 000
HTTP data volume    37 TB          881 GB
Distinct objects    174 283 000    4 956 000

[Figure: access network topology. ONTs, optical splitter, local loop, OLT, edge routers towards the broadband access servers (BAS); the HACkSAw probe marks the vantage point.]

Collected statistics
- HTTP: URL, Content-Length, ETag
  - Content ID: URL + ETag
- TCP: connection length (real size)
- IP: IP addresses
- Ethernet: client ID (MAC address)
- Packet timestamps

Computed statistics
- Statistics refer to a specific time interval
  - The timestamp of the request is used for the whole transaction
- Request cacheability R(t)
  - Percentage of repeated requests for the same object
- Traffic reduction T(t)
  - Percentage of traffic that could be served by a perfect oracle cache with prefetching
- Virtual cache size V(t)
  - Sum of the sizes of all cacheable objects observed
  - In effect, the minimum size needed for a perfect oracle cache
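
One possible reading of these three statistics is sketched below. The exact definitions are assumptions made for illustration: a request counts as "repeated" when its content ID was already seen in the interval, the oracle saves the bytes of every repeated request, and an object is "cacheable" when it is requested more than once. The function name `oracle_stats` and the `(content_id, size_bytes)` input format are hypothetical.

```python
from collections import defaultdict


def oracle_stats(transactions):
    """Compute (R, T, V) for the transactions of one time interval.

    transactions: iterable of (content_id, size_bytes) pairs.
    Returns R and T as percentages and V in bytes.
    Assumed definitions, not taken from the slides:
      R = repeated requests / all requests
      T = bytes of repeated requests / all bytes
      V = total size of the objects requested more than once
    """
    counts = defaultdict(int)   # requests seen per object
    sizes = {}                  # largest size observed per object
    total_reqs = total_bytes = 0
    repeated_reqs = repeated_bytes = 0

    for content_id, size in transactions:
        total_reqs += 1
        total_bytes += size
        if counts[content_id] > 0:      # object already seen: oracle hit
            repeated_reqs += 1
            repeated_bytes += size
        counts[content_id] += 1
        sizes[content_id] = max(sizes.get(content_id, 0), size)

    if total_reqs == 0 or total_bytes == 0:
        return 0.0, 0.0, 0

    r = 100.0 * repeated_reqs / total_reqs
    t = 100.0 * repeated_bytes / total_bytes
    v = sum(sizes[c] for c, n in counts.items() if n > 1)
    return r, t, v
```

With five requests `a, b, a, c, a` of sizes 100, 50, 100, 10, 100 bytes, two of the five requests are repeats, 200 of the 360 bytes could be saved, and only object `a` (100 bytes) needs to be stored.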

First scenario: cache at OLT
- 3 different scenarios:
  1. Cache at OLT only (vantage point)
  2. Cache at ONT only (users)
  3. Cache both at ONT and OLT
- 2 timescales: 1 hour, 1 day
[Figure: access network topology. ONTs, optical splitter, local loop, OLT, edge routers towards the broadband access servers (BAS).]

Second scenario: cache at ONT
- Scenario 2 of 3: cache at ONT only (users)
- 2 timescales: 1 hour, 1 day
[Figure: access network topology. ONTs, optical splitter, local loop, OLT, edge routers towards the broadband access servers (BAS).]

Third scenario: cache at OLT and ONT
- Scenario 3 of 3: cache both at ONT and OLT
- 2 timescales: 1 hour, 1 day
[Figure: access network topology. ONTs, optical splitter, local loop, OLT, edge routers towards the broadband access servers (BAS).]

Results: cacheability and traffic reduction, caching at OLT
- Cacheability >40%; traffic reduction around 35%
- Virtual cache size around 100 GB
  - Feasible in a line card with current technology
[Figure: cacheability at OLT]

Results: cacheability and traffic reduction, caching at ONT
- Cacheability >25%; traffic reduction >20%
- Virtual cache size around 100 MB
  - Values are averages, with high variance
  - Feasible in a home router with current technology
[Figure: cacheability at ONT]

Results: cacheability and traffic reduction, caching at OLT and ONT
- Cacheability <20%; traffic reduction around 20%
- Virtual cache size around 100 GB
  - Feasible in a line card with current technology
[Figure: cacheability at OLT after ONT caches have filtered the traffic]

Summary of results

                     OLT only   ONT only   OLT after ONT
Cacheability         42%        28%        19%
Traffic reduction    34%        24%        20%
Virtual cache size   100 GB     100 MB     100 GB

LRU simulation
- Trace-driven simulation, only one scenario:
  1. Cache at OLT only (vantage point)
- Needed to measure the performance of a real cache
- Needed to assess the impact of temporal locality
- Chunk-level cache: only the requested part of the objects is cached
[Figure: access network topology. ONTs, optical splitter, local loop, OLT, edge routers towards the broadband access servers (BAS).]

Results: simulated LRU
- 4 cache sizes: 1 GB, 10 GB, 100 GB and 1 TB
- Standard LRU replacement
- As expected, the real performance is lower than the theoretical values shown before
  - Still acceptable
- Cacheability gives a rough estimate of the real values
  - Cacheability is easier to compute than an LRU simulation
[Figures: hit ratio; saved traffic]
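
A trace-driven LRU simulation of the kind reported here can be sketched in a few lines. This is a simplified illustration under stated assumptions: it caches whole objects rather than chunks (the slides describe a chunk-level cache), objects larger than the cache are never stored, and the name `simulate_lru` and the `(content_id, size_bytes)` trace format are hypothetical.

```python
from collections import OrderedDict


def simulate_lru(requests, capacity_bytes):
    """Replay a trace through a byte-bounded LRU cache.

    requests: iterable of (content_id, size_bytes) in arrival order.
    Returns (hit_ratio, traffic_reduction) as fractions in [0, 1].
    """
    cache = OrderedDict()   # content_id -> size, least recently used first
    used = 0
    reqs = hits = 0
    total_bytes = hit_bytes = 0

    for content_id, size in requests:
        reqs += 1
        total_bytes += size
        if content_id in cache:
            hits += 1
            hit_bytes += size
            cache.move_to_end(content_id)   # mark as most recently used
            continue
        if size > capacity_bytes:
            continue                        # too large to cache at all
        while used + size > capacity_bytes:
            _, evicted_size = cache.popitem(last=False)   # evict LRU entry
            used -= evicted_size
        cache[content_id] = size
        used += size

    hit_ratio = hits / reqs if reqs else 0.0
    traffic_reduction = hit_bytes / total_bytes if total_bytes else 0.0
    return hit_ratio, traffic_reduction
```

Running the same trace at several capacities (for instance the four sizes listed on the slide) gives one hit-ratio and one traffic-reduction value per cache size, which can then be compared with the oracle figures.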

LRU simulation: shuffling
- We want to assess the impact of temporal locality on the performance of the cache
- Shuffling: transactions are redistributed uniformly at random inside their timeslots
  - 1-hour and 1-day timeslots
- Smaller caches suffer, larger caches are unaffected
[Figure labels: with temporal correlation; effects of temporal locality; IRM]
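
The shuffling step can be sketched as below. The interpretation is an assumption: each transaction keeps its timeslot but receives a new timestamp drawn uniformly inside that slot, and the trace is then re-sorted by time. The function name `shuffle_within_slots`, the `(timestamp, content_id, size_bytes)` format, and the `seed` parameter are hypothetical.

```python
import random


def shuffle_within_slots(transactions, slot_seconds, seed=0):
    """Destroy temporal locality inside each timeslot.

    transactions: iterable of (timestamp, content_id, size_bytes).
    Each transaction stays in its original slot but gets a new
    timestamp drawn uniformly at random within that slot.
    """
    rng = random.Random(seed)   # seeded for reproducible experiments
    shuffled = []
    for ts, content_id, size in transactions:
        slot_start = (ts // slot_seconds) * slot_seconds
        new_ts = slot_start + rng.random() * slot_seconds
        shuffled.append((new_ts, content_id, size))
    shuffled.sort(key=lambda tx: tx[0])   # restore time order for replay
    return shuffled
```

Replaying the shuffled trace through the same cache simulation, with `slot_seconds=3600` and then `slot_seconds=86400`, isolates how much of the hit ratio comes from short-term temporal locality.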

LRU simulation: shuffling results

Hit ratio
Cache size   No shuffling   Shuffling: 1-hour slots   Shuffling: 1-day slots
1 TB         42%            42%                       42%
100 GB       35%            35%                       30%
10 GB        25%            22%                       17%
1 GB         19%            15%                       10%

Traffic reduction
Cache size   No shuffling   Shuffling: 1-hour slots   Shuffling: 1-day slots
1 TB         23%            23%                       23%
100 GB       17%            17%                       13%
10 GB        14%            11%                       5%
1 GB         11%            5%                        2%

IRM is too conservative!

Content popularity estimation
- Content popularity has a big impact on caching
- Zipf with α<1 has finite support
  - Confidence interval of active catalog size is unbounded
- Catalog size is needed for correctly sizing the cache
- With a Weibull we can estimate all the parameters
  - 24-hour time period
  - 95% confidence bands shown
  - Passes χ² goodness of fit
[Figure labels: cache size; hit ratio; active catalog]
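
The remark about Zipf with α<1 follows from the normalization of the law. As a short worked note (the notation N for the catalog size and k for the popularity rank is introduced here for illustration), the Zipf probability of the object of rank k is:

```latex
P(k) = \frac{k^{-\alpha}}{\sum_{j=1}^{N} j^{-\alpha}}, \qquad k = 1, \dots, N

\sum_{j=1}^{N} j^{-\alpha} \sim \frac{N^{1-\alpha}}{1-\alpha} \to \infty
\quad \text{as } N \to \infty, \text{ for } 0 < \alpha < 1
```

Because the normalizing sum diverges when α<1, the distribution only exists for a finite catalog size N, so N becomes a parameter that has to be estimated from the trace. This is why the catalog size matters for sizing the cache.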

Time correlation
- What is the refresh rate of the active catalog?
  - Considering 1-hour timeslots, when and how much of the content in each timeslot will be requested again?
- Average time needed to reach the percentiles of asymptotic cacheability and traffic reduction
- Jaccard autocorrelation of the observed active catalog
  - Jaccard coefficient: $J(A,B) = \frac{|A \cap B|}{|A \cup B|}$, hence $0 \le J(A,B) \le 1$
  - The autocorrelation function is hence defined over the catalogs $C_{i\Delta T}$, with $C_{i\Delta T}$ being the content catalog observed in the time window $i\Delta T$
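
A Jaccard autocorrelation of per-timeslot catalogs can be sketched as follows. The aggregation is an assumption: for each lag k the sketch averages J(C_i, C_{i+k}) over all slot pairs that are k slots apart, which may differ from the exact formula used in the study. The names `jaccard` and `jaccard_autocorrelation` are hypothetical.

```python
def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B|, defined as 0.0 for two empty sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def jaccard_autocorrelation(catalogs, max_lag):
    """Mean Jaccard coefficient between catalogs that are k slots apart.

    catalogs: list of sets of content IDs, one set per timeslot.
    Returns {k: mean of J(C_i, C_{i+k}) over all valid i} for k = 1..max_lag.
    Averaging over i is an assumption made for this sketch.
    """
    result = {}
    for k in range(1, max_lag + 1):
        values = [
            jaccard(catalogs[i], catalogs[i + k])
            for i in range(len(catalogs) - k)
        ]
        if values:   # skip lags longer than the trace
            result[k] = sum(values) / len(values)
    return result
```

With 1-hour slots and `max_lag=168`, the result covers one week of lags, so a daily pattern would show up as local maxima near multiples of 24.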

Results: Jaccard autocorrelation
- Content changes quickly
- Weak 24-hour periodicity
- On average, less than 4% of the active catalog is the same in two consecutive 1-hour timeslots
[Figure: Jaccard autocorrelation against lag in hours, for k in [1,168], ΔT = 1 hour, with standard deviation error bars]

Conclusions
- IRM Zipf models are inadequate, time correlation is more important
  - Problems estimating the size of the catalog
- Caching even in the access network yields real traffic savings
  - ICN solutions with native caching will provide real benefits
- Small caches (100 MB on ONT and 100 GB on OLT) can potentially provide a reduction of 35% of the traffic on the backhaul
- HTTPS = HTTP+TLS hinders all transparent caching efforts
  - So far only 15% of the traffic, but growing
  - ICN solutions ideal: security and caching are built-in primitives
  - New encryption-aware version of HTTP would be beneficial

Future work
- Further improvement of the tool
  - More accuracy, less memory consumption
  - Add support for more protocols
- Low cost probes: tens of measurement points in Europe
- Repeat the measurements at different vantage points
  - More users, more traffic
- Include Layer 3-4 statistics
  - Latency
  - Throughput

Questions?