Modeling Web Proxy Cache Architectures


Christoph Lindemann and Andreas Reuys, Universität Dortmund, Informatik IV, Rechnersysteme und Leistungsbewertung, August-Schmidt-Str. 12, 44227 Dortmund, Germany, {cl, reuys}@cs.uni-dortmund.de, http://www4.cs.uni-dortmund.de/~lindemann

Martin Reiser, GMD Institut für Medienkommunikation, Schloß Birlinghoven, Postfach 1316, 53754 Sankt Augustin, Germany, martin.reiser@gmd.de, http://imk.gmd.de

Abstract

In this paper we evaluate the performance of design alternatives for Web proxy cache architectures. Our performance study employs deterministic and stochastic Petri nets (DSPN) as the stochastic modeling formalism because DSPN provide an illustrative graphical representation. The presented DSPN models address in particular the sensitivity of the performance of proxy cache architectures to the traffic intensity and burstiness of client requests as well as to the cache size and network transfer times. We employ DSPN for a comparative performance study of two Web proxy cache architectures: a shared proxy cache and a proxy cache partitioned for text and image documents. For comparison and validation purposes, we also present performance figures for a simple Web server without a proxy cache. For both cache architectures, the benefits of prefetching are also investigated.

1. Introduction

Due to the phenomenal growth of the World Wide Web, a considerable number of related performance problems are currently being investigated in research activities. The overall performance of the Web is determined by the performance of the individual components that build the Web, i.e., the traffic characteristics of client requests, the performance of servers and proxy servers, and the network latency. Since the majority of Web documents are static documents (i.e., text, image, audio and video files), caching at various network points appears very promising for reducing Web traffic and end user latency.
A common way of Web caching constitutes caching in the network itself by means of so-called Web proxy servers. Such proxy servers are intermediaries between the browsers of clients and the Web servers of the Internet. Previous performance studies of Web proxy server caching have mostly focused on replacement policies and have been conducted by trace-driven simulation (see, e.g., [2], [3], [13]). Performing such a simulation study yields accurate quantitative results for a particular workload, but requires substantial effort for collecting the traces and implementing the simulation model. While such simulation studies are surely needed in late system design cycles, a considerable number of performance problems of early design stages can effectively be answered with stochastic performance models. Such high-level stochastic models for evaluating design alternatives of Web proxy server architectures may well be used instead of or in conjunction with trace-driven simulation. Deterministic and stochastic Petri nets (DSPN) are a high-level modeling formalism for the specification of discrete-event systems with exponential and deterministic events. In the Internet, the offered traffic is typically stochastic (i.e., bursty and heavy-tailed [4]), whereas for a given document class (e.g., text or images) the transfer times are constant. Consequently, the combination of stochastic and deterministic timing in one model is of particular practical relevance for Web server performance analysis. Since DSPN constitute a graphical modeling formalism, the development and validation of a DSPN is considerably easier and less error-prone than writing and debugging code for a trace-driven simulation study. For the same reason, employing DSPN is superior to using low-level analytical models (e.g., Markov chains), for which mathematical equations have to be developed in order to determine their solution.
Furthermore, highly efficient numerical methods are available for cost-effective quantitative performance analysis [9]. Thus, DSPN are attractive in early system design stages. This paper presents a performance study of design alternatives for Web proxy cache architectures. The main goal of this work is the development of sufficiently accurate high-level stochastic models that can be utilized for cost-effective support of the quantitative design of Web server architectures. The DSPN take into account the burstiness and self-similarity of the arrival process of client requests observed in [13]. The presented performance study addresses in particular the sensitivity of the average latency for retrieving a document to the traffic intensity and burstiness of client requests as well as to the cache size and network bandwidth. We employ DSPN for a comparative performance study

of two Web proxy cache architectures: a shared cache versus a cache partitioned for text and image documents. For both cache architectures, the benefits of prefetching are also investigated. For comparison and validation purposes, we also present performance figures for a simple Web server without a proxy cache. For the comparative performance study of the two Web proxy server cache architectures without prefetching, we show that the DSPN modeling approach leads to the same results as the trace-driven simulation study [13]. However, the analysis of the presented DSPN requires only a few minutes of CPU time on a modern workstation. To the best of our knowledge, the performance of prefetching combined with a partitioned Web proxy server cache is studied for the first time in this paper. The remainder of this paper is organized as follows. Section 2 recalls the building blocks of DSPN and describes the software architecture and the graphical user interface of DSPNexpress 2.000. The Web proxy cache architectures and their DSPN models are introduced in Section 3. Section 4 presents performance curves for these proxy cache architectures with and without prefetching. The final section contains concluding remarks.

2 The DSPN Modeling Approach

2.1 The Building Blocks of DSPN

Formally, a Petri net is a directed bipartite graph with one set of vertices called places (drawn as circles) and the other called transitions (drawn as bars). Places may contain tokens, which are drawn as dots or numbers. Places and transitions are connected by directed arcs. Arcs are distinguished into ordinary arcs (drawn with arrow heads) and inhibitor arcs (drawn with empty circle heads). Arcs may also be labeled with integer numbers denoting their multiplicity. The default multiplicity of an arc is one. A transition is said to be enabled if all of its input places contain at least as many tokens as the multiplicity of the corresponding input arc and all of its inhibitor places contain fewer tokens than the multiplicity of the corresponding inhibitor arc. A transition fires by removing from each input place as many tokens as the multiplicity of the corresponding input arc, and by adding to each output place as many tokens as the multiplicity of the corresponding output arc. In DSPN, three types of transitions exist: immediate transitions, drawn as thin bars, which fire without delay; exponential transitions, drawn as empty bars, which fire after an exponentially distributed delay with given mean has elapsed; and deterministic transitions, drawn as black bars, which fire after a given constant delay. The firing of an immediate transition has priority over the firing of any timed transition. Firing weights are associated with immediate transitions in order to determine the probability of firing if in a marking several immediate transitions are in conflict. Firing delays of exponential and deterministic transitions as well as firing weights of immediate transitions may depend on the current marking of the DSPN; i.e., they may be marking-dependent. An important case of marking-dependency constitutes the infinite-server firing semantics [9]. That is, the firing semantics of some exponential transition with a single input place is such that, if several tokens reside in its input place, each token represents an activity that proceeds concurrently to the others. Since this kind of marking-dependency models delays without queueing, the exponential transition is said to have infinite-server firing semantics. For a comparative performance analysis of Web proxy server cache architectures, we present DSPN models for different architectures in the third section. The DSPN explicitly represent the traffic intensity and burstiness of client requests as well as the cache size, and illustrate the performance gain due to utilizing a partitioned cache for text and image documents and due to prefetching.

2.2 The Software Architecture of the Numerical Solvers

The core of the package DSPNexpress constitutes the solution engine for discrete-event stochastic systems with exponential and deterministic events. The software architecture of this solution engine and its software modules are shown in Figure 1. The solution engine is drawn as the big white rectangular box. The six software modules are drawn as rectangles. These software modules are invoked from the solution engine as UNIX processes. Interprocess communication with sockets, drawn as broken ellipses, is employed for passing intermediate results from one module to the next. Steady-state analysis of DSPNs without concurrent deterministic transitions relies on the analysis of an embedded Markov chain (EMC) underlying such DSPNs [1]. To efficiently derive the probability matrix of this EMC, the concept of a subordinated Markov chain (SMC) was introduced. Recall that the SMC associated with a state s_i is a CTMC whose states are given by the transitive closure of all states reachable from s_i via the occurrence of exponential events [9]. After generating the reachability graph

comprising the tangible markings (states) of the DSPN, for each state the generator matrix of its SMC is derived. These tasks are performed in the modules Derive Tangible Reachability Graph and Derive Subordinated Markov Chains, respectively. Entries of the probability matrix of the EMC are computed by transient analysis of the SMCs. Subsequently, a linear system corresponding to the stationary equations of the EMC is solved. These tasks are performed in the submodules Derive EMC and Solve Linear System. Transient analysis of DSPNs is based on the analysis of a general state space Markov chain (GSSMC) embedded at equidistant time points nd (n = 0, 1, 2, ...) of the continuous-time marking process. The Chapman-Kolmogorov equations of the GSSMC constitute a system of Volterra integral equations [11]. Steady-state analysis of DSPNs with concurrent deterministic transitions relies on the same approach [10]. The transition kernel of the GSSMC specifies the one-step jump probabilities from a given state at time instant nd to all reachable new states at time instant (n+1)d. Key drivers for the computational efficiency of the GSSMC approach constitute the separability and piece-wise continuity of the transition kernel [11]. Furthermore, the elements of the transition kernel can effectively be determined by an extension of the concept of subordinated Markov chains.

Figure 1. The solver architecture for DSPNs
Numerical computation of the kernel elements also relies on transient analysis of these CTMCs. This task is performed in the submodule Derive GSSMC. Subsequently, for transient analysis a number of iterations corresponding to the mission time are performed on the system of Volterra equations [11], whereas for steady-state analysis a linear system is solved for each mesh point [9], [10]. This task is performed in the submodule Solve Volterra Equations. We would like to point out that only the front end and the back end of the solution engine are tailored to DSPNs. That is, instead of a DSPN specification file provided by the graphical interface of DSPNexpress, a specification file of an arbitrary discrete-event stochastic system with exponential and deterministic events (e.g., finite state machines) could be quantitatively evaluated by the solution engine of DSPNexpress using an appropriate filter.

Figure 2. The graphical user interface
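The submodules Derive EMC and Solve Linear System reduce steady-state analysis to the stationary equations of the embedded Markov chain. A minimal numerical sketch of that final step; the 3-state transition matrix below is a made-up example, not taken from the paper:

```python
import numpy as np

# Hypothetical one-step transition matrix P of an embedded Markov chain
# (rows sum to 1). In DSPNexpress the real matrix would be obtained from
# transient analysis of the subordinated Markov chains.
P = np.array([
    [0.0, 0.7, 0.3],
    [0.5, 0.0, 0.5],
    [0.4, 0.6, 0.0],
])

def emc_steady_state(P):
    """Solve pi = pi P together with the normalization sum(pi) = 1."""
    n = P.shape[0]
    A = P.T - np.eye(n)   # stationary equations (P^T - I) pi = 0
    A[-1, :] = 1.0        # replace one equation by sum(pi) = 1
    b = np.zeros(n)
    b[-1] = 1.0
    return np.linalg.solve(A, b)

pi = emc_steady_state(P)
```

For an irreducible finite chain this linear system has a unique solution; the same pattern underlies the per-mesh-point linear solves mentioned above.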

2.3 The Graphical User Interface

Of course, the package DSPNexpress also provides a user-friendly graphical interface running under X11. To illustrate the features of this graphical interface, consider the snapshot shown in Figure 2. The first line displays the name of the package DSPNexpress and the actual version 2.000, the affiliation of the authors, University of Dortmund, Computer Systems and Performance Evaluation Group, and the year of release, 1999. A DSPN of one of the following proxy architectures is displayed. Recall that in DSPNs three types of transitions exist: immediate transitions, drawn as thin bars, fire without delay; exponential transitions, drawn as empty bars, fire after an exponentially distributed delay; whereas deterministic transitions, drawn as black bars, fire after a constant delay. At any time, DSPNexpress provides on-line help messages displayed in the third line of the interface. The command line and the object line are located on the left side of the interface. The buttons are located in a vertical line between the on-line help line and the working area. The working area constitutes the remaining big rectangle, which contains the graphical representation of the DSPN. This DSPN is displayed with the option tags on. Thus, each place and each transition of this DSPN is labeled (e.g., Client, Hit, Miss, Queue, etc.). A detailed description of the features of the graphical interface is given in [9].

3 The Considered Web Proxy Cache Architectures

3.1 Introduction to Web Proxy Caching

The purpose of Web proxy servers lies in resolving client requests for Web documents. Proxy servers are dedicated computers typically located at routers of a local area network or an intranet [12]. Requests from clients enter the Web proxy server, whose cache should hold the most-sought documents. If a copy of the requested document resides in the proxy cache, a cache hit occurs. Otherwise, we say a cache miss occurs.
In this case the proxy server establishes an HTTP connection to the originating Web server given in the URL of the requested document and retrieves the document. Subsequently, the requested document is stored in the proxy cache and returned to the requesting client. Figure 3 illustrates the main steps of a transaction of the hypertext transfer protocol (HTTP). For a detailed discussion of Web proxy servers and HTTP, we refer to the recent texts [12], [14]. Important quantities for minimizing the end user latency constitute the hit ratio in the proxy cache and the mean document transfer time.

Figure 3. An HTTP transaction via a Web proxy server
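The hit/miss handling of Figure 3 can be sketched as the following control flow. This is a minimal illustration, not the paper's model: the unbounded dictionary cache and the fetch_from_origin stub are assumptions, and a real proxy bounds the cache and applies a replacement policy.

```python
# Schematic Web proxy: a hit serves the cached copy, a miss retrieves the
# document from the originating server, stores it, and replies.
cache = {}

def fetch_from_origin(url):
    """Stand-in for the HTTP connection to the originating Web server."""
    return f"<document at {url}>"

def proxy_request(url):
    if url in cache:                      # cache hit
        return cache[url], "hit"
    document = fetch_from_origin(url)     # cache miss: retrieve ...
    cache[url] = document                 # ... store in the proxy cache ...
    return document, "miss"               # ... and return to the client
```

The first request for a URL is a miss; subsequent requests for the same URL are hits served from the cache.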

Figure 4. DSPN of a proxy server architecture with a shared cache

3.2 Shared Proxy Cache

Figure 4 shows the DSPN of a proxy server architecture with a shared cache for text and image documents. We assume that client requests are generated according to a Markov modulated Poisson process (MMPP) whose governing continuous-time Markov chain comprises N+1 states. In the DSPN, the firing delay of the exponential transition Generation of requests depends on the distribution of the N tokens among the places Bursty traffic and Normal traffic. When all N tokens reside in place Normal traffic, requests are generated according to a Poisson process whose parameter is given in Table 1. For other distributions of these N tokens, requests are generated according to a Poisson process with different parameters. Thus, the DSPN explicitly represents the burstiness and self-similarity in the arrival process for Web document requests as observed from measured data; see, e.g., [4], [13]. To take into account that clients generate requests independently from each other, the exponential transition Generation of requests has infinite-server firing policy.
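A two-state MMPP of this kind can be sketched as a small simulation. This is an illustrative sketch, not the paper's solver: the switching rates are our assumptions, and the arrival rates are loosely patterned on Table 1 (mean interarrival time 60 in normal mode, five times faster in bursty mode).

```python
import random

def mmpp_arrivals(t_end, rate_normal=1/60, rate_bursty=5/60,
                  rate_to_bursty=1/600, rate_to_normal=1/60, seed=1):
    """Two-state MMPP: a background Markov chain switches between a
    normal and a bursty mode; requests arrive as a Poisson process
    whose rate depends on the current mode."""
    rng = random.Random(seed)
    t, bursty, arrivals = 0.0, False, []
    while t < t_end:
        arrival_rate = rate_bursty if bursty else rate_normal
        switch_rate = rate_to_normal if bursty else rate_to_bursty
        dt_arrival = rng.expovariate(arrival_rate)
        dt_switch = rng.expovariate(switch_rate)
        if dt_arrival < dt_switch:        # next event: a client request
            t += dt_arrival
            if t < t_end:
                arrivals.append(t)
        else:                             # next event: mode switch
            t += dt_switch
            bursty = not bursty
    return arrivals
```

Racing the two exponential clocks (next arrival versus next mode switch) is the standard way to simulate the underlying continuous-time Markov chain.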
Number of clients, K: 50
Total size of proxy cache, M: 50
Size of a text document: 1
Size of an image document: 3
Percentage of requests for text documents: 33.3 %
Percentage of requests for image documents: 66.6 %
Mean time for generating requests in normal mode: 60
Mean time spent in normal mode: 60
Degree of burstiness: 5/1
Burst frequency: 1/10
Network transfer time for a text document: 5
Network transfer time for an image document: 15

Table 1. Base parameter setting of the DSPN

Figure 5. DSPN of a proxy server architecture with shared cache and prefetching

Probabilities for hits and misses in the proxy cache (i.e., the probabilities for resolving the conflict between the immediate transitions Hit in proxy cache and Miss in proxy cache) depend on the number of documents currently cached. In the DSPN for a shared proxy cache shown in Figure 4, that is the number of tokens in the places Text buffer and Image buffer. These marking-dependent probabilities are determined according to [2]. The deterministic transitions Delay for downloading a text and Delay for downloading an image have associated marking-dependent delays in order to approximately represent the concurrent transmission of multiple documents. Table 1 states the model parameters of the DSPN shown in Figures 4, 5 and 6, which underlie the performance experiments presented in Section 4. As the basic units for time and size, we choose 1 second and 10 KByte.

3.3 Shared Proxy Cache with Prefetching

To reduce latency in computer systems, prefetching of data is widely known and has been employed in different areas. Preliminary studies on how to utilize prefetching for Web documents have been presented in [7], [8].
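The marking-dependent download delays described above can be sketched as follows. The paper states only that the delays approximate concurrent transmission; the proportional-scaling rule and the helper below are our assumptions, using the base transfer times of Table 1.

```python
# Base network transfer times from Table 1 (units: seconds; document
# sizes in units of 10 KByte).
BASE_DELAY = {"text": 5.0, "image": 15.0}

def download_delay(doc_type, concurrent_downloads):
    """Marking-dependent delay: with n documents sharing the link, each
    transfer is assumed to take n times its base transfer time
    (processor-sharing style scaling; this rule is our assumption)."""
    n = max(1, concurrent_downloads)
    return BASE_DELAY[doc_type] * n
```

For example, with two concurrent downloads an image transfer would take 30 seconds instead of its base 15 seconds under this scaling.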
In this and the subsequent section, we model the combination of caching and prefetching in order to investigate its implications for shared and partitioned proxy caches. Figure 5 shows the DSPN of a shared proxy cache architecture in which documents are prefetched when the network traffic is smaller than a threshold value. The gain of prefetching is a higher probability for a hit in the proxy cache, whereas the cost for a cache miss is also greater. This is due to the fact that prefetching documents uses additional

network bandwidth. This is encoded in the DSPN of Figure 5 as follows: when the place Prefetched holds a token, the immediate transition Hit in proxy cache has a higher firing probability than the corresponding transition of the DSPN introduced in Figure 4. Furthermore, depending on the number of tokens in place Download documents, the load-dependent delays of the deterministic transitions Delay for downloading a text and Delay for downloading an image are also proportionally greater than in the case of no prefetching activities. In order to avoid network congestion, prefetching of documents is disabled when the number of queued requests (i.e., the number of tokens in place Queued requests) is greater than a threshold value. In the DSPN, this is modeled by an inhibitor arc with corresponding multiplicity from place Queued requests to Allocate bandwidth.

Figure 6. DSPN of a proxy server architecture with partitioned cache

3.4 Partitioned Proxy Cache with and without Prefetching

We consider a fixed partitioned proxy cache that may hold up to 2M/7 image and M/7 text documents. Since we consider a workload comprising 2/3 image and 1/3 text documents, we assume that this partition is optimal. A DSPN of this proxy server architecture with a partitioned cache is shown in Figure 6. As before, all places and transitions are labeled. Note that the DSPN is very similar to the DSPN of Figure 4.
The main difference constitutes the removal of the subnet comprising the places Text buffer and Image buffer together with the immediate transitions Replace text with image and Replace image with text. Consequently, the probabilities for resolving the conflict between the immediate transitions Hit in proxy cache and Miss in proxy cache are fixed for a given cache size and partition rather than depending on the current number of cached text and image documents. Extending the DSPN of Figure 6 by the subnet modeling prefetching introduced in Figure 5, we obtain a DSPN for a partitioned proxy cache architecture with prefetching.

4 Performance Results

For a comparative performance analysis of the Web proxy server cache architectures, we consider a system with 50 clients and 50 proxy cache units. Recall that text and image documents require 1 and 3 units, respectively. We assume that the overall workload comprises 33.33% requests for text and 66.66% requests for image documents. A similar traffic pattern has been measured, e.g., for the Web traffic in the Gigabit Testbed West operated by the Deutsche Forschungsnetz Verein. Since in this study we are interested in the sensitivity of proxy cache performance to workload parameters rather than absolute performance for specific workloads, the generation of client requests is represented by a

Figure 7. Average latency versus traffic intensity

Figure 8. Average latency versus degree of burstiness

Figure 9. Average latency versus burst frequency

Figure 10. Average latency versus cache size

Figure 11. Average latency versus transmission delay

Figure 12. Average latency versus number of text documents used in the partitioned proxy cache

Markov modulated Poisson process comprising only two states: bursty arrivals and normal arrivals. That is, we evaluate the DSPN for marking parameter N = 1. The performance curves presented below have been obtained by computing the steady-state solutions of the DSPN with the analysis tool DSPNexpress [9]. The solution of each DSPN requires only a few minutes of CPU time for a single parameter setting on a modern workstation. Thus, the presented stochastic modeling approach requires substantially less computational effort than conducting a trace-driven simulation study. As the performance measure, we consider the mean latency that an end user experiences when receiving a document, denoted by L. Note that the mean number of pending client requests as well as the effective mean arrival rate of client requests can be directly derived from the DSPN of Figures 4 to 6. Since we assume that the system is in steady state, we can apply Little's Law for deriving the average latency L. That is:

L = (K - E{# Clients}) / X{Generation of request}

In this formula, E{.} denotes the mean number of tokens in the corresponding place of a DSPN in steady state and X{.} denotes the throughput of the corresponding timed transition. Other performance measures, such as the number of requests that reach servers and the volume of network traffic resulting from document requests, can also be directly derived from the steady-state marking probabilities of the DSPN without additional computational cost. Figures 7 to 12 present performance results for the Web proxy server cache architectures described in Section 3. In each figure, a corresponding curve for a simple Web server without a proxy cache is included for comparison. In a first set of experiments, the cache size is kept fixed to M = 50 and the document transfer times are set to their base values. Figure 7 plots curves for the average end user latency versus the traffic intensity for a fixed burst frequency and degree of burstiness.
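The Little's Law computation of L can be illustrated numerically. The token count and throughput below are made-up figures for illustration, not results from the paper:

```python
def mean_latency(K, mean_idle_clients, throughput):
    """Little's Law: L = (K - E{# Clients}) / X, where E{# Clients} is
    the mean number of tokens in place Clients (idle clients, so the
    difference is the mean number of pending requests) and X is the
    throughput of transition 'Generation of request'."""
    return (K - mean_idle_clients) / throughput

# Hypothetical steady-state figures: 50 clients, on average 49 idle,
# one request generated every 2 seconds (throughput 0.5 requests/s).
L = mean_latency(50, 49.0, 0.5)   # 1 pending request / 0.5 req/s = 2 s
```

Both quantities are read directly off the steady-state solution of the DSPN, so the latency comes at no extra computational cost.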
Figures 8 and 9 plot corresponding curves in which the traffic intensity is kept fixed and the degree of burstiness and the burst frequency are varied, respectively. In a second experiment, we keep the workload fixed at its base values given in Table 1 and vary the cache size and the network bandwidth (i.e., the transmission times for text and image documents). Figure 10 plots curves of the average end user latency for increasing cache size. In this figure, the curve for no proxy server is obviously a straight line. Figure 11 plots curves showing the sensitivity of the average end user latency to the network bandwidth. Note that the x-axis of Figure 11 is labeled with the transmission delay of text documents. Since we assume that an image document is three times the size of a text document, the image transmission delay is varied accordingly.

Figures 7 to 11 show that the partitioned proxy cache, with and without prefetching, always performs better than the corresponding shared proxy cache. Furthermore, the inclusion of prefetching improves proxy cache performance if the traffic intensity is below the threshold value. Obviously, the environment without a proxy server cache performs considerably worse than a corresponding environment with a proxy cache. Recall that we considered a proxy server cache partition optimized for a given workload (i.e., for known percentages of image and text documents); thus, this observation itself is not very surprising. Subsequently, in a third experiment, we explore the impact of choosing the "right" partition of the proxy cache on performance. Figure 12 plots the average end user delay versus ten feasible partitions of the proxy cache. Recall that the proxy cache has a size of 50 units; a text document requires 1 unit and an image document 3 units.
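Candidate partitions for this third experiment can be enumerated with a short script. The sketch below assumes exact-fit partitions (text slots plus image slots exactly fill the 50-unit cache, at 1 unit per text and 3 units per image); the paper's ten plotted partitions are a subset of the candidates this produces, and the function name is our own.

```python
# Enumerate cache partitions (number of text documents, number of image
# documents) that exactly fill the proxy cache. Sizes follow the text:
# the cache holds 50 units; a text document occupies 1 unit and an image
# document occupies 3 units.

CACHE_SIZE = 50
TEXT_UNITS = 1
IMAGE_UNITS = 3

def exact_partitions(cache_size: int) -> list[tuple[int, int]]:
    """All (texts, images) pairs with texts*1 + images*3 == cache_size."""
    return [
        (cache_size - images * IMAGE_UNITS, images)
        for images in range(cache_size // IMAGE_UNITS + 1)
    ]

for texts, images in exact_partitions(CACHE_SIZE):
    print(f"{texts:2d} texts, {images:2d} images")
```

Among the pairs produced are (2, 16), (8, 14), and (14, 12), i.e., the three partitions singled out in the discussion of Figure 12 below.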
We observe that for only three of the ten partitions considered in Figure 12 (i.e., 2 texts and 16 images, 8 texts and 14 images, and 14 texts and 12 images) does the partitioned proxy server cache perform better than the shared proxy server cache. Obviously, the curves for the shared proxy cache with and without prefetching are straight lines in Figure 12.

To summarize, we observe that the performance difference between the shared proxy cache and the partitioned proxy cache is most sensitive to the network bandwidth. For the partitioned proxy cache architecture, the second most influential factor is the selection of the fixed partition best suited for a given workload. Furthermore, as expected, the traffic intensity (i.e., the rate at which clients generate requests) has a considerable impact on the end user delay. The performance difference between the shared proxy cache and the partitioned proxy cache remains almost constant for an increasing degree of burstiness and burst frequency. Moreover, increasing the cache size, while substantially decreasing the end user latency, yields only a small performance gain for the partitioned proxy cache over the shared proxy cache architecture.

Conclusions

This paper presented deterministic and stochastic Petri net (DSPN) models for two Web proxy cache architectures: a shared cache and a partitioned cache, each with and without prefetching of documents. The presented DSPN models address in particular the sensitivity of the end user latency to the workload (i.e., the traffic intensity, degree of burstiness, and burst frequency of client requests) as well as to the cache size and network bandwidth. The quantitative analysis of the presented DSPNs requires only a few minutes of CPU time on a modern workstation; i.e.,
substantially less computational effort than the trace-driven simulation study presented in [13], while the same qualitative results were derived for the shared versus the partitioned proxy cache without prefetching. Thus, the presented DSPN models are well suited for exploring the entire design space in the early design stages of Web proxy server cache architectures. Our performance study shows that for a given workload (i.e., known percentages of image and text documents), one can derive a fixed partition of the proxy cache that optimizes performance. Thus, the development of a proxy server cache architecture that can dynamically partition its cache depending on the actual workload would be highly desirable. One way to approach this task is to adapt the working set model, known from virtual memory in operating systems, to Web proxy server caching. In this study, we investigated the sensitivity of proxy cache performance to workload parameters rather than absolute performance for specific workloads. Once measurements currently under way for multimedia applications and the subsequent statistical analysis have been completed, the DSPN models will be utilized for a performance study of Web proxy server cache architectures running under particular workloads.

References

[1] M. Ajmone Marsan and G. Chiola, On Petri Nets with Deterministic and Exponentially Distributed Firing Times, in: G. Rozenberg (Ed.), Advances in Petri Nets 1987, Lecture Notes in Computer Science 266, 132-145, Springer, 1987.
[2] M. Arlitt, R. Friedrich, and T. Jin, Performance Evaluation of Web Proxy Cache Replacement Policies, Proc. 10th Int. Conf. on Modeling Techniques and Tools for Computer Performance Evaluation, Palma de Mallorca, Spain, 193-6, 1998.
[3] M. Arlitt and C. Williamson, Internet Web Servers: Workload Characterization and Performance Implications, IEEE/ACM Trans. on Networking, 5, 631-645, 1997.
[4] M. Crovella and A. Bestavros, Self-Similarity in World Wide Web Traffic: Evidence and Possible Causes, IEEE/ACM Trans. on Networking, 5, 835-846, 1997.
[5] B. Duska, D. Marwood, and M. Feeley, The Measured Access Characteristics of World Wide Web Client Proxy Caches, Proc. USENIX Symp. on Internet Technologies and Systems, Monterey, California, 23-35, 1997.
[6] A. Feldman, R. Caceres, F. Douglis, G. Glass, and M. Rabinovich, Performance of Web Proxy Caching in Heterogeneous Bandwidth Environments, Technical Report, AT&T, January 1999.
[7] H. Foxwell and D. A. Menasce, Prefetching Results of Web Searches, Proc. 1998 Computer Measurement Group Conference, Anaheim, California, 1998.
[8] Th. Kroeger, D. Long, and J. Mogul, Exploring the Bounds of Web Latency Reduction from Caching and Prefetching, Proc. USENIX Symp. on Internet Technologies and Systems, Monterey, California, 13-22, 1997.
[9] Ch. Lindemann, Performance Modelling with Deterministic and Stochastic Petri Nets, John Wiley & Sons, 1998.
[10] Ch. Lindemann and G. S. Shedler, Numerical Analysis of Deterministic and Stochastic Petri Nets with Concurrent Deterministic Transitions, Performance Evaluation, Special Issue Proc. of PERFORMANCE '96, 27&28, 565-582, 1996.
[11] Ch. Lindemann and A. Thümmler, Transient Analysis of Deterministic and Stochastic Petri Nets with Concurrent Deterministic Transitions, Performance Evaluation, Special Issue Proc. of PERFORMANCE '99, to appear.
[12] A. Luotonen, Web Proxy Servers, Prentice Hall, 1998.
[13] S. Williams, M. Abrams, C. Standridge, G. Abdulla, and E. Fox, Removal Policies in Network Caches for World Wide Web Documents, Proc. ACM SIGCOMM '96, Stanford, California, 293-305, 1996.
[14] N. Yeager and R. McGrath, Web Server Technology, Morgan Kaufmann, 1996.