Tracking the Evolution of Web Traffic: 1995-2003

The University of North Carolina at Chapel Hill, Department of Computer Science
11th ACM/IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS)
Orlando, October 13th, 2003

Tracking the Evolution of Web Traffic: 1995-2003
Félix Hernández-Campos, Kevin Jeffay, F. Donelson Smith
http://www.cs.unc.edu/research/dirt

Web Traffic Measurement and Analysis at UNC-Chapel Hill

In 1997, the need to populate web traffic generators for experimental networking research motivated a large-scale study of web traffic at UNC with three goals:
- Develop a lightweight methodology
  - Based on passive measurement
  - Easy to keep models up to date
- Replace smaller-scale, quickly aging models
  - Mah, 1995 data set
  - Crovella et al., 1995 data set (revised with 1998 data)
- Characterize the use of the HTTP protocol
  - E.g., use of persistent connections

Web Traffic Measurement and Analysis at UNC-Chapel Hill (continued)

- Our methodology and first results were published in SIGMETRICS/Performance '01: "What TCP/IP Protocol Headers Can Tell Us About the Web"
- The modeling aspect was explored in a series of papers
  - E.g., "Variable Heavy Tails in Internet Traffic" (with J.S. Marron); Part I, "Understanding Heavy Tails", was published in MASCOTS '02
- In this talk, I will describe our approach and our observations on the evolution of web traffic
  - Three data sets: 1999, 2001 and 2003
  - Comparisons to Mah and Crovella et al.

Methodology: Study of Web Content Consumers

[Diagram: web clients at the University of North Carolina at Chapel Hill send requests to, and receive responses from, web servers on the Internet]
- We studied a large collection of users (~35,000) as web content consumers
- The only source of data for our study was packet header traces
  - Anonymized IP addresses
  - No HTTP headers

Methodology: One-Way Packet Header Traces

[Diagram: a traffic monitor running tcpdump on the Gigabit Ethernet link between the UNC web clients and the web servers on the Internet]
- Only inbound TCP/IP headers are captured
  - Eliminates synchronization and buffering issues on the NIC
  - Reduces trace size
- Trace collection:
  - 2.7 TB of packet headers
  - ~40 billion packets
  - ~16 TB of data transfers

Methodology: Processing Sequence Overview

[Diagram: processing pipeline with stages labeled tcpdump; raw TCP/IP headers trace; filter and sort TCP connections (port 80); connection-level analysis; request/response exchanges; client-level analysis; client behavior; statistical analysis]

TCP/IP Headers and Request/Response Exchange

[Diagram: segments exchanged between a web client (UNC) sending a request and a web server (Internet) sending a response, beginning with SYN and SYN-ACK. Segment labels as transcribed:]
  seqno 305   ackno 1
  seqno 1     ackno 305
  seqno 1461  ackno 305
  seqno 2876  ackno 305
  seqno 305   ackno 2876

TCP/IP Headers and Server-to-Client Segments Only

[Diagram: the same exchange, showing only the segments sent by the web server (Internet) to the web client (UNC), beginning with the SYN-ACK. Segment labels as transcribed:]
  seqno 1     ackno 1
  seqno 1     ackno 305
  seqno 1461  ackno 305
  seqno 2876  ackno 305
Diagram annotations: Ackno, Seqno

Methodology: Request/Response Traces

- Unidirectional TCP/IP header traces are sufficient for capturing application-level behavior
[Diagram: between web client and web server, the request is computed and the response is directly observed]

Persistent Connections in Example TCP/IP Headers

[Diagram: server-to-client segments on one connection between a web client (UNC) and a web server (Internet), beginning with the SYN-ACK and carrying Request 1, Response 1, Request 2 (262 bytes) and Response 2 (3465 bytes). Segment labels as transcribed:]
  seqno 1     ackno 1
  seqno 1     ackno 305
  seqno 1461  ackno 305
  seqno 2876  ackno 305
  seqno 2876  ackno 567
  seqno 4336  ackno 567
  seqno 5796  ackno 567
  seqno 6341  ackno 567
Diagram annotations: Ackno, Seqno, Ackno

Sizes of Requests: Empirical CDFs and Empirical CCDFs

[Figures: empirical CDF and complementary CDF of request sizes, with curves for 1999 and 2003. X-axes: size in bytes; no. of bytes. Transcribed plot annotations: 0.82, 0.64, 0.3e-5, 1 MB, 0.0003%, 296 requests, 2.7% bytes]
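The persistent-connection example above can be worked through in code. The following is only a minimal sketch of the idea, not the authors' tool: it assumes relative sequence numbers, in-order delivery, no retransmissions, no pipelined requests, and that the last observed seqno marks the end of the final response. The function name `infer_exchanges` and its parameters are hypothetical. An advance in ackno is read as a new request (its size is the ackno delta), and the seqno range covered between two ackno advances is read as the response size.

```python
def infer_exchanges(segments, initial_seqno=1, initial_ackno=1):
    """Infer (request_bytes, response_bytes) pairs from server-to-client
    (seqno, ackno) pairs observed on a single TCP connection.

    Assumptions (simplifications for illustration): relative sequence
    numbers, in-order segments, no retransmissions, no pipelining.
    """
    exchanges = []
    resp_start = initial_seqno    # seqno where the current response begins
    last_ackno = initial_ackno    # highest ackno seen so far
    max_seqno = initial_seqno     # highest seqno seen so far
    current_request = None        # size of the request being answered

    for seqno, ackno in segments:
        max_seqno = max(max_seqno, seqno)
        if ackno > last_ackno:
            # The server acknowledged new client bytes: a new request arrived,
            # so the previous response (if any) ended at the current seqno.
            if current_request is not None:
                exchanges.append((current_request, max_seqno - resp_start))
            current_request = ackno - last_ackno
            resp_start = max_seqno
            last_ackno = ackno

    # Close the last exchange using the highest seqno observed.
    if current_request is not None:
        exchanges.append((current_request, max_seqno - resp_start))
    return exchanges


# Server-to-client segments transcribed from the persistent-connection slide.
slide_segments = [
    (1, 305), (1461, 305), (2876, 305),
    (2876, 567), (4336, 567), (5796, 567), (6341, 567),
]
result = infer_exchanges(slide_segments)
print(result)
```

Under these assumptions the second exchange comes out as 262 request bytes and 3465 response bytes, the two sizes printed on the slide. The sizes the sketch yields for the first exchange (304 and 2875 bytes) follow from the same arithmetic but are not stated on the slide.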

Response Sizes

[Figures: complementary distributions of response sizes. X-axes: size in bytes; no. of bytes. Transcribed plot annotations: LogNormal fits, systematic wobbles, Pareto fits]

Page Identification Heuristic: Two TCP Connections Example

[Diagram: client/server exchanges labeled top-level object and embedded objects, grouped into page 1 and page 2, with a quiet time of more than 1 second marked]

Objects Per Page

[Figure: x-axis: no. of objects (exchanges)]
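The page identification heuristic can be sketched as follows. The slide gives only the threshold (a quiet time of more than 1 second). How the quiet time is measured is an assumption here: the gap between the latest response end seen so far and the start of the next request, for one client. The function name `group_into_pages` and the `(request_start, response_end)` record layout are hypothetical.

```python
def group_into_pages(exchanges, quiet_time=1.0):
    """Group one client's request/response exchanges into pages.

    exchanges: iterable of (request_start, response_end) times in seconds,
               possibly drawn from several TCP connections of the same client.
    A new page starts when the client has been quiet for more than
    `quiet_time` seconds (assumed measurement: next request start minus
    the latest response end observed so far).
    """
    pages = []
    last_end = None
    for start, end in sorted(exchanges):
        if last_end is None or start - last_end > quiet_time:
            pages.append([])          # first object of a new page (top-level)
        pages[-1].append((start, end))
        last_end = end if last_end is None else max(last_end, end)
    return pages


# Hypothetical timings: three objects, a pause of about 2 s, then two more.
example = [(0.0, 0.2), (0.25, 0.4), (0.3, 0.9), (3.0, 3.1), (3.15, 3.5)]
pages = group_into_pages(example)
objects_per_page = [len(p) for p in pages]
print(objects_per_page)
```

The number of exchanges in each group is what the "Objects Per Page" distributions count.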

Objects Per Page; Page Requests Per IP Address

[Figures: two plots, including a complementary distribution. X-axes: no. of objects (exchanges); no. of page requests]

Impact of Tracing Interval Length

Questions:
- Can we obtain a sufficiently large sample with a small number of short traces?
- How does the length of the tracing interval affect the overall shapes of the empirical distributions?
- Should the empirical distributions include data from incomplete TCP connections?

Approach:
- Examine a wide range of trace lengths
  - 4 h, 2 h, 1 h, 30 min, 15 min, 5 min and 90 s
- Construct data sets by sub-sampling the 21 4-hour-long traces collected in 2001
  - E.g., remove the first and last hour of each trace to produce 21 2-hour-long traces

Impact of Tracing Interval Length

[Figure: x-axis: response size in bytes]
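The sub-sampling step can be illustrated with a short sketch. The slide gives only the 2-hour example (drop the first and last hour of a 4-hour trace). Taking a centered window for the other lengths is an assumption, as are the function names and the record layout, in which each record starts with a timestamp in seconds from the start of the trace.

```python
def center_window(trace_seconds, target_seconds):
    """Return (start, end) of a centered window of target_seconds
    inside a trace that lasts trace_seconds."""
    if not 0 < target_seconds <= trace_seconds:
        raise ValueError("target length must be in (0, trace length]")
    start = (trace_seconds - target_seconds) / 2
    return start, start + target_seconds


def subsample(records, trace_seconds, target_seconds):
    """Keep only records whose timestamp falls inside the centered window.

    records: iterable of tuples whose first field is a timestamp in
             seconds relative to the start of the trace (hypothetical layout).
    """
    lo, hi = center_window(trace_seconds, target_seconds)
    return [r for r in records if lo <= r[0] < hi]


HOUR = 3600
# A 4-hour trace cut to 2 hours: the first and last hour are removed.
window = center_window(4 * HOUR, 2 * HOUR)
sample = [(t,) for t in (0, 3599, 3600, 7200, 10799, 10800, 14399)]
kept = [r[0] for r in subsample(sample, 4 * HOUR, 2 * HOUR)]
print(window, kept)
```

A connection that straddles a window boundary is only partially captured by this filter, which is the situation examined in the slides on partially-captured objects.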

Impact of Tracing Interval Length

[Figures: two plots, including a complementary distribution. X-axes: response size in bytes; no. of pages per client IP address]

Impact of Partially-Captured Objects

[Figures: two plots, including a complementary distribution. X-axes: response size in bytes; no. of bytes]
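The complementary distributions in these figures are empirical CCDFs, that is, the fraction of samples strictly larger than each observed value. A minimal sketch of that computation follows; the function name `empirical_ccdf` and the sample values are hypothetical and are not taken from the traces.

```python
from collections import Counter


def empirical_ccdf(samples):
    """Return [(x, P[X > x])] for each distinct value x, in increasing order."""
    n = len(samples)
    if n == 0:
        return []
    counts = Counter(samples)
    remaining = n                  # samples not yet passed
    points = []
    for x in sorted(counts):
        remaining -= counts[x]     # samples strictly greater than x
        points.append((x, remaining / n))
    return points


# Hypothetical response sizes in bytes.
sizes = [100, 100, 200, 1000]
ccdf = empirical_ccdf(sizes)
print(ccdf)
```

Plotting these points on log-log axes is the usual way to compare tails. Truncating a large object shortens its recorded size, which moves probability mass out of the tail of such a curve.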

Summary and Conclusions

Web traffic characterization:
- New data to populate traffic generators
  - Request sizes
  - Response sizes
  - Use of persistent connections
  - ...
- 1-hour-long traces are sufficient to capture application-level behavior
  - Short traces cut off large objects, which skews the tails of the distributions
- Persistent connections:
  - ~15% of all the connections
  - 40-50% of all the transferred bytes