Software and Systems Research Center, Bell Labs, Murray Hill July 26, 1996 Prefetching Techniques for the Web Azer Bestavros Computer Science Departme
|
|
- Samson Fox
- 6 years ago
- Views:
Transcription
1 Networking and Distributed Systems Research Center, Bell Labs, Murray Hill July 25, 1996 Taming the WWW From Web Measurements to Web Protocol Design Azer Bestavros Computer Science Department Boston University Thursday July 25 th,
2 Software and Systems Research Center, Bell Labs, Murray Hill July 26, 1996 Prefetching Techniques for the Web Azer Bestavros Computer Science Department Boston University Friday July 26 th,
3 Overview of Oceans Projects Client-based Studies (Client Caching Protocols) Prefetching Protocols (Dynamic Server Selection Protocols) Server-based Studies (Server Caching Protocols) Dissemination Protocols Speculative Service Protocols Performance Modeling Studies Web Trac Characteristics (Web Access Patterns Characteristics) 3
4 Locality of Reference in a Client-Server Environment Locality of Reference Flavors Temporal: A document accessed frequently in the past is likely to be accessed again in the future. Spatial: A document \neighboring" a recently accessed document is likely to be accessed in the future. Geographical: A document accessed by a client is likely to be accessed in the future by \neighboring" clients. 4
5 Locality of Reference in a Client-Server Environment How to Capitalize on it? On the client side, use \caching" and \prefetching" (e.g. Distributed le systems, Sun NFS, AFS, [Standberg 1985, Morriss 1986, Howard 1988], Proxy caching [Danzig 1993, Acharya 1993, Papadimitriou 1994], Cooperative client caching [Blaze 1993, Dahlin 1994]). On the server side, use \information dissemination" [Bestavros 1994], \geographical caching" [Braunh and Clayh 1994], \speculative service" [Bestavros 1995], \geographical push caching" [Gwertzman and Seltzer 1995]. Motive The scalability of Internet services hinges on ecient distribution and partitioning of system resources to reduce the amount of data that must be moved. 5
6 Information Caching versus Information Dissemination Information Caching Initiated by a client or a group of clients. Geared towards reducing service time. Relies on temporal locality of client reference patterns. Ensuring consistency is expensive. Information Dissemination Initiated by servers. Geared towards balancing load and reducing trac. Relies on temporal/geographical popularity of documents. Ensuring consistency is cheap. Requires collaboration of \server proxies". 6
7 Client-initiated Caching Study Experiment Description We instrumented Mosaic to log all user accesses on our site [BCC:95]. We studied cache performance at various levels: { Session Caching: One cache per session { Host Caching: One cache per host { LAN Caching: One cache per LAN We used the logs obtained from Mosaic to perform trace simulations for various protocols [BCCCHM:95]. Sessions 4,700 Users 591 URLs Requested 575,775 Files Transferred 130,140 Unique Files Requested 46,830 Bytes Requested 2713 MB Bytes Transferred 1849 MB Unique Bytes Requested 1088 MB Summary Statistics for Trace Data Used in This Study 7
8 Client-initiated Caching Eectiveness Experiment Results Poor Byte Hit Rate < 40% with innite cache. Sharing amongst multiple clients is limited too! Byte Hit Rate Cache Expansion Index for Constant Byte Hit Rate Slope of No Geographic Locality of Reference 6% 10% 8% 12% Slope of Perfect Geographic Locality of Reference Multiprogramming Level (# of clients) Cache Expansion Index for Remote documents 8
9 The Server's Perspective Server Log Analysis We collected the logs of our departmental HTTP server and those of the Rolling Stones Multimedia server. We used the logs to analyze the popularity of various documents and to drive trace simulations of various server-initiated protocols. cs- Period 56 days 110 days URL requests 172,635 4,068,432 Bytes transferred 1,447 MB 112,015 MB Average daily transfer 26 MB 1,018 MB Files on system 2,018 N/A Files accessed (remotely) 974 (656) N/A (1,151) Size of (accessed) le system 50 MB (37 MB) N/A (402 MB) Unique clients (10+ requests) 8,123 60,461 Summary Statistics for Log Data Used in This Study 9
10 Popular documents are VERY popular! Zipf's law governs the probability of document access 5 2 1e+03 5 Access Frequency 2 1e e e+00 1e+00 1e+01 1e+02 1e+03 Document Rank The popularity of a le is inversely proportional to the rank of that le 10
11 The Server's Perspective Log Analysis of Making 25MB ( 3%) of data available to clients at a proxy one-hop closer to them would save more than 900MB/dayofnetwork bandwidth. Server Load Reduction % Frequency/ File ID MBytes Access frequency Expected bandwidth reduction 11
12 Information Dissemination Protocol Underlying Model A set of service proxies act as information \outlets" on the Internet. These service proxies oer space/bandwidth \for-rent" to other servers or proxies that constitute its Cluster. A server may belong to several clusters, thus allowing some of its les to be dessiminated to multiple service proxies. Service proxies are themselves servers who may be members of other clusters. Server Proxy of Server Proxy of Server Proxy of (Proxy of Server) Underlying Model for Information Dissemination 12
13 Information Dissemination Protocol Questions to be answered Given the access pattern at a server, which clusters should the server choose to join? Given the access pattern at a server, which les should the server disseminate? and where? Given the popularity prole of all servers in a cluster, how should the resources (space/bandwidth) at the service proxy be allocated? Assumptions The dissemination protocol should not require any \special" features/capabilities from other protocols. File popularity is a \universal" phenomenon (i.e. the probabilty of accessing a le is independent of who is accessing it). This is a conservative simplifying assumption. File popularity does not change drastically in a short period of time. This assumption has been veried. 13
14 Optimal Allocation of Storage at the Proxy Notation C = S 0 S 1 S 2 ::: S n is the set of servers in a cluster. S 0 is the service proxy of C. R i is the total number of bytes per unit time serviced by server S i to clients outside C. H i (b) is the probability that a request to S i will be to the most popular b bytes disseminated to S 0. B i is the number of bytes that S 0 duplicates from S i. B 0 = B 1 + B 2 + :::+ B n is the total storage space available at S 0. Goal Choose B i to maximize the percentage of trac serviced at S 0. C = P n i=1 R i H i (B i ) P n i=1 R i 14
15 Which Proxies Should be Contracted? Characterizing the Client Tree and Choosing Proxies Using the record route option of TCP/IP, it is possible to build a complete tree originating at the server with clients at the leaves. For this tree consisted of 18,000 nodes. The most popular les are disseminated down the tree and stored at proxies closer to the clients. The location of such proxies depends on the demand from the various parts of the tree. Analysis of logs for a consecutive 26-week period suggests that the shape of the tree (especially internal nodes) and the distribution of load is quite static over time. 15
16 How Does the Internet Look to a Server? Server Clients 16
17 How Much Bandwidth is Saved? How far could we \push" information towards clients? At least 8-9 hops! Replicating the most popular 25 MB from on few proxies yields a whoping saving of > 8 GBHops of network bandwidth per day Frequency Hops from Server to Clients How far away are clients? 17
18 How Much Bandwidth is Saved? Bytes x Hops Saved % Proxies Expected reduction in bandwidth as a result of dissemination 18
19 How Much History Should be Used to Disseminate? Short histories are better for shallow dissemination. Long histories are better for deep dissemination Weeks Reduction in Bandwidth (%) Weeks 2 Weeks 1 Week Replicas Disseminated weekly based on last N weeks usage 19
20 Dissemination Depth Maximum Depth Average Depth Minimum Depth Hops Replicas How far down the tree does dessimination go? 20
21 How Much Reduction in Server Load? Reduction in Server Load (%) Replicas Expected reduction in server load as a result of dissemination 21
22 How Much Reduction in Server Load? (closeup) Reduction is Server Load (%) Replicas Expected reduction in server load as a result of dissemination 22
23 Spatial Locality of reference Zipf's law governs the probability of sequences of document access! The popularity of a sequence of le requests is inversely proportional to the rank of that sequence. 1e+03 5 Access Frequency 2 1e e+01 5 K=1 K=2 K=3 2 1e+00 1e e e e+03 Sequence Rank 23
24 Spatial Locality of reference There are far less sequences in a server trace than randomly possible! Frequency in Trace Random Actual Sequence Length Could one use the knowledge of \popular" sequences for prefetching? 24
25 The Notion of Speculative Service Could client requests be predicted by the server? In many cases, the answer is yes. Servers could \speculatively" service documents before they are requested (a.k.a. server-initiated prefetching). Traversal D i D j D k Embedding D i1 D i2 D i3 D j1 D j2 D j3 D k1 D k2 D k3 Genesis of spatial locality of reference 25
26 Document Access Interdependency Matrix Notation Let p[i j] denote the conditional probability that document D j will be requested, within T w units of time after the request for D i. Let P denote the square matrix p[i j], for all possible documents 0 i j N. Let P denote the transitive closure of P. Thus, p [i j] is the probability that there will be a sequence of requests (inter-request time < T w ) starting with D i and ending with D j. Server log analysis Using server logs, the P and P matrices could be easily constructed Frequency Frequency P[i,j] P*[i,j] Histogram for ranges of p[i j] Histogram for ranges of p [i j] 26
27 Speculative Service Experiments Simulation Model Successive requests separated by lessthanstridetimeout units of time belong to the same \stride". Clients maintain a session cache. The session cache is purged if the time between successive requests exceeds SessionTimeout. Four performance metrics are used: Bandwidth ratio, Server Load ratio, Service Time ratio, and Miss rate ratio. Parameter Meaning Base Value CommCost Cost of communicating 1 Byte 1 unit ServCost Setup cost for a service request 10,000 unit StrideTimeout Value of time window T w 5.0 secs SessionTimeout Cache invalidation timeout 1 secs MaxSize Maximum size to prefetch 1 (no limit) Policy Speculative service algorithm p [i j] T p HistoryLength Length of the logs used for P 60 days UpdateCycle Frequency of recomputing P 1 day 27
28 Speculative Service Experiments Baseline Results Signicant improvement in performance (above what is achievable by client caching) could be achieved for a miniscule increase in trac. 5% extra bandwidth results in a whopping 30% reduction in server load, a 23% reduction in service time, and a 18% reduction in client miss-rate. Beyond some point, speculation does not seem to pay o. Ratio Ratio Server load Service time Miss rate Bandwidth Server load Timeliness Miss rate Tp more speculative less speculative Level of Speculation Traffic increase Metrics vs Speculation Level Metrics vs Bandwidth 28
29 Speculative Service Experiments Stability of the P and P Relations We varied the UpdateCycle from 1 to 7 days, while keeping the HistoryLength at 60 days. This change resulted in a 3% degradation in all measured metrics, suggesting that P and P do change (albeit very slowly) with time. Also, we varied the HistoryLength from 60 to 30 days, while keeping the UpdateCycle at 1 day. This change resulted in a 5% improvement in all measured metrics, suggesting that an aging mechanism must be used to phase-out dependencies exhibited in on older server traces, in favor of dependencies exhibited in more recent ones. Ratio Ratio day cycle 7-day cycle Server load Timeliness Miss rate Server load Timeliness Miss rate Server load Timeliness Miss rate Traffic increase 0.45 Traffic increase Eect of slower UpdateCycle Eect of shorter HistoryLength 29
30 Speculative Service Experiments Eect of Client Caching We compared simulations with SessionTimeout equal to 3,600 seconds (large cache) and to 120 seconds (small cache). The presence of client caching (even if modest) is likely to further improve the performance of speculative service. In order to reap all the benet from speculative service, client must cache \prefetched" documents long enough Ratio Smaller Cache Server load Timeliness Miss rate Larger Cache Server load Timeliness Miss rate Ratio Extra traffic = 10% Server load Timeliness Miss rate Extra traffic = 50% Server load Timeliness Miss rate Traffic increase Session Timeout Seconds Eect of client caching Eeciency of client caching 30
31 Speculative Service Experiments Cooperative Clients Performance could be further improved if documents \already cached at the client" are not speculatively serviced! Our simulations showed that speculative service with cooperative clients results in better bandwidth utilization, especially when the client performs \some" caching. Ratio Ratio Cooperative Clients Server load Timeliness Miss rate Non-cooperative Clients Server load Timeliness Miss rate Cooperative Clients Server load Timeliness Miss rate Non-cooperative Clients Server load Timeliness Miss rate Traffic increase Traffic increase Cooperative client (small cache) Cooperative client (innite cache) 31
32 Speculative Service Experiments Eect of Document Size The benets of speculation are most pronounced when documents serviced speculatively are small. We studied this by varying the MaxSize parameter. Ratio Ratio Server load Timeliness Miss rate Server load Timeliness Miss rate Low level of speculation MaxSize in KBytes High level of speculation MaxSize in KBytes 32
33 Client Prefetching Could the next request be predicted by the client? Using client traces, it is possible for the client software to perform \prefetching" [Bestavros and Cunha: 1995]. Performance of client-initiated prefetching is highly dependent on user access patterns. URL Id URL Id Reference Reference A \Mostly Surng" User A \Mostly Working" User 33
34 Performance of Client Prefetching Trace simulations indicate that client-initiated prefetching is very eective for \frequently traversed documents" but ineective for \newly/rarely traversed documents" Miss Rate Ratio Extra Bandwidth (%) Performance improvement for a \creature of habit" ::^ 34
35 Flip-Flop Between Server and Browser Hints By monitoring the \surng index", we favor either the server hints or the browser hints for prefetching purposes. Reference ID G W Time Use Server Hints to Prefetch User is Surfing G W > h Use Browser Hints to Prefetch User is Working G W < h 35
36 Implementation of Prefetching Agents Server-assisted Prefetching We have modied the NCSA HTTP server to allow it to build rst and second Markov models for all documents on a site. Everytime a client requests a page, the server automatically appends PREFETCH hints to the end of the served document: <PREFETCH AGENT="agent" HREF="url" RANK="rank%" PROB="prob"> Also, we have modied the NCSA Mosaic browser to allow it to interpret the server hints and perform prefetching accordingly. Client-initiated Prefetching We have modied the NCSA Mosaic browser to allow it to build rst and second Markov models for all documents accessed by a user. Everytime a client requests a document, the browser (may) automatically prefetch a number of other documents and keep them in the disk cache. Integrating Server and Client agents The above two implementations need to be integrated. We are working on that! 36
37 Web Trac Characteristics Motive Trac attributed to the use of the Web makes up an increasing percentage of internet trac. Understanding the nature of web trac is important for scaling and capacity planning purposes. Conclusions Trac attributed to the use of the Web exhibits self-similar properties. Web trac self-similarity is due to the heavy-tailed nature of transmission time distribution, which is related to Web le size distributions. Related Work Leland, Taqqu, Willinger, and Wilson's paper in SIGCOMM'93 established the self-similarity of LAN trac and Willinger, Taqqu, Sherman, and Wilson's paper in SIGCOMM'95 expalined self-similarity for LAN using ON/OFF model. 37
38 Web Trac Characteristics: Data Set used in this study Sessions 4,700 Users 591 URLs Requested 575,775 Files Transferred 130,140 Unique Files Requested 46,830 Bytes Requested 2713 MB Bytes Transferred 1849 MB Unique Bytes Requested 1088 MB Summary Statistics for Trace Data Used in This Study 38
39 Web Trac Characteristics: Burstiness at varying scales Bytes 0 5*10^6 10^7 1.5*10^7 2*10^7 2.5*10^7 3*10^7 Bytes 0 2*10^6 4*10^6 6*10^6 8*10^ Chronological time (slots of 1000 sec) Chronological time (slots of 100 sec) Bytes Bytes Chronological time (slots of 10 sec) Chronological time (slots of 1 sec) Trac Bursts over Four Orders of Magnitude Upper Left: 1000, Upper Right: 100, Lower Left: 10, and Lower Right: 1 Second Aggegrations. 39
40 Web Trac Characteristics: Transmission Time Distribution Web trac self similarity may be the result of superimposing many ON/OFF processes, where ON and/or OFF times are heavy-tailed (Pareto). p(x) = k x ;;1 Log10(P[X>x]) Log10(Transmission Time in Seconds) The t to a straight line of the Log-Log Complementary Distribution (LLCD) suggests that transmission times are heavy-tailed (i.e. Pareto with parameter ). 40
41 Heavy-tailed Transmission Times: The CLT Test 0 Aggregated Samples from Pareto Distribution 0 Aggregated Samples from Lognormal Distribution Log10(P[X>x]) Aggregated 100-Aggregated 10-Aggregated All Points Log10(P[X>x]) Aggregated 100-Aggregated 10-Aggregated All Points Log10(x) Log10(x) 0 Aggregated Transmission Times of WWW Files -1 Log10(P[X>x]) Aggregated 100-Aggregated 10-Aggregated All Points Log10(x) Comparison of CLT Test for Pareto (upper left) and Lognormal (upper right) and actual trac distribution (bottom). 41
42 Heavy-tailed Transmission Times: Genesis Transmission times are heavy-tailed because the size distribution of transmitted les is heavy-tailed. The size distribution of transmitted les is heavy-tailed because the size distribution of unique requests is heavy-tailed. File Requests File Transfers Unique Files (46,830) (130,140) (575,775) 42
43 Distributions of Requested, Unique, and Transferred Files Distribution Requested Files 1.16 Transferred Files 1.06 Unique Files Log10(P[X>x]) Unique Files File Transfers File Requests Log10(File Size in Bytes) 43
44 Why is the Size Distribution of Unique Files Heavy-Tailed? The size distribution of unique requests is heavy-tailed because it reects the size distribution of available les on the Web, which we have shown to be heavy-tailed by surveying Web le systems at 32 dierent sites. Distribution Unique Files 1.05 Available Files Log10(P[X>x]) Unique Files Available Files Log10(File Size in Bytes) 44
45 Relationship Between Various File Sets Transmission Times File Transmission File Transfers Unique Files Available Files Caching File Requests One-to-One Relation Subset Relation 45
46 How Typical Are Web Files? 100 WWW Document Sizes Unix File Sizes Percent of All Files e+06 Size in Bytes 46
47 Relationship Between Heavy Tail and Media 0 Size Distribution by File Type -1 Log10(P[X>x]) All Files a=1.06 Image Files Audio Files Video Files Text Files a= Log10(File Size in Bytes) 47
48 Conclusion: Overview of Oceans Projects Client-based Studies Client Caching Protocols Prefetching Protocols Dynamic Server Selection Protocols Server-based Studies Server Caching Protocols Dissemination Protocols Speculative Service Protocols Performance Modeling Studies Web Trac Characteristics Web Access Patterns Characteristics 48
In Proceddings of ACM SIGMOD 96 Data Mining Workshop,Montreal, Canada, June 1996.
In Proceddings of ACM SIGMOD 96 Data Mining Workshop,Montreal, Canada, June 1996. Middleware Support for Data Mining and Knowledge Discovery in Large-scale Distributed Information Systems Azer Bestavros
More informationIn Proceedings of ICDE 96: The 1996 International Conference on Data Engineering, New Orleans, Louisiana, March 1996.
In Proceedings of ICDE 96: The 996 International Conference on Data Engineering, New Orleans, Louisiana, March 996. Speculative Data Dissemination and Service to Reduce Server Load, Network Trac and Service
More informationExplaining World Wide Web Traffic Self-Similarity
Explaining World Wide Web Traffic Self-Similarity Mark E. Crovella and Azer Bestavros Computer Science Department Boston University 111 Cummington St, Boston, MA 02215 fcrovella,bestg@cs.bu.edu October
More informationIn World Wide Web, Special Issue on Characterization and Performance Evaluation
In World Wide Web, Special Issue on Characterization and Performance Evaluation Changes in Web Client Access Patterns Characteristics and Caching Implications Paul Barford Azer Bestavros Adam Bradley Mark
More informationReplicate It! Scalable Content Delivery: Why? Scalable Content Delivery: How? Scalable Content Delivery: How? Scalable Content Delivery: What?
Accelerating Internet Streaming Media Delivery using Azer Bestavros and Shudong Jin Boston University http://www.cs.bu.edu/groups/wing Scalable Content Delivery: Why? Need to manage resource usage as demand
More informationIn Proceedings of the Second Intl. Workshop on Services in Distributed and Networked Environments (SDNE '95)
In Proceedings of the Second Intl. Workshop on Services in Distributed and Networked Environments (SDNE '95) Application-Level Document Caching in the Internet Azer Bestavros Robert L. Carter Mark E. Crovella
More informationRelative Reduced Hops
GreedyDual-Size: A Cost-Aware WWW Proxy Caching Algorithm Pei Cao Sandy Irani y 1 Introduction As the World Wide Web has grown in popularity in recent years, the percentage of network trac due to HTTP
More informationWeb Caching and Content Delivery
Web Caching and Content Delivery Caching for a Better Web Performance is a major concern in the Web Proxy caching is the most widely used method to improve Web performance Duplicate requests to the same
More informationIn Proceedings of IEEE Infocom 98, San Francisco, CA
In Proceedings of IEEE Infocom 98, San Francisco, CA The Network Eects of Prefetching Mark Crovella and Paul Barford Computer Science Department Boston University 111 Cummington St, Boston, MA 2215 fcrovella,barfordg@cs.bu.edu
More informationIn Proc. of the 1996 ACM SIGMETRICS Intl. Conference on Measurement and Modeling of Computer Systems, Philadelphia, PA, May 1996
In Proc. of the 1996 ACM SIGMETRICS Intl. Conference on Measurement and Modeling of Computer Systems, Philadelphia, PA, May 1996 Self-Similarity in World Wide Web Trac Evidence and Possible Causes Mark
More informationCache Memories. From Bryant and O Hallaron, Computer Systems. A Programmer s Perspective. Chapter 6.
Cache Memories From Bryant and O Hallaron, Computer Systems. A Programmer s Perspective. Chapter 6. Today Cache memory organization and operation Performance impact of caches The memory mountain Rearranging
More informationMeasurement and Analysis of Traffic in a Hybrid Satellite-Terrestrial Network
Measurement and Analysis of Traffic in a Hybrid Satellite-Terrestrial Network Qing (Kenny) Shao and Ljiljana Trajkovic {qshao, ljilja}@cs.sfu.ca Communication Networks Laboratory http://www.ensc.sfu.ca/cnl
More informationA Top-10 Approach to Prefetching on the Web. Evangelos P. Markatos and Catherine E. Chronaki. Institute of Computer Science (ICS)
A Top- Approach to Prefetching on the Web Evangelos P. Markatos and Catherine E. Chronaki Institute of Computer Science (ICS) Foundation for Research & Technology { Hellas () P.O.Box 1385 Heraklio, Crete,
More informationMultimedia Streaming. Mike Zink
Multimedia Streaming Mike Zink Technical Challenges Servers (and proxy caches) storage continuous media streams, e.g.: 4000 movies * 90 minutes * 10 Mbps (DVD) = 27.0 TB 15 Mbps = 40.5 TB 36 Mbps (BluRay)=
More informationMeasurement and Analysis of Traffic in a Hybrid Satellite-Terrestrial Network
Measurement and Analysis of Traffic in a Hybrid Satellite-Terrestrial Network Qing (Kenny) Shao and Ljiljana Trajkovic {qshao, ljilja}@cs.sfu.ca Communication Networks Laboratory http://www.ensc.sfu.ca/cnl
More informationarxiv: v3 [cs.ni] 3 May 2017
Modeling Request Patterns in VoD Services with Recommendation Systems Samarth Gupta and Sharayu Moharir arxiv:1609.02391v3 [cs.ni] 3 May 2017 Department of Electrical Engineering, Indian Institute of Technology
More informationToday: World Wide Web! Traditional Web-Based Systems!
Today: World Wide Web! WWW principles Case Study: web caching as an illustrative example Invalidate versus updates Push versus Pull Cooperation between replicas Lecture 22, page 1 Traditional Web-Based
More informationCharacterizing Web Usage Regularities with Information Foraging Agents
Characterizing Web Usage Regularities with Information Foraging Agents Jiming Liu 1, Shiwu Zhang 2 and Jie Yang 2 COMP-03-001 Released Date: February 4, 2003 1 (corresponding author) Department of Computer
More informationDistributed File Systems Part II. Distributed File System Implementation
s Part II Daniel A. Menascé Implementation File Usage Patterns File System Structure Caching Replication Example: NFS 1 Implementation: File Usage Patterns Static Measurements: - distribution of file size,
More informationDaniel A. Menascé, Ph. D. Dept. of Computer Science George Mason University
Daniel A. Menascé, Ph. D. Dept. of Computer Science George Mason University menasce@cs.gmu.edu www.cs.gmu.edu/faculty/menasce.html D. Menascé. All Rights Reserved. 1 Benchmark System Under Test (SUT) SPEC
More informationA Comparison of File. D. Roselli, J. R. Lorch, T. E. Anderson Proc USENIX Annual Technical Conference
A Comparison of File System Workloads D. Roselli, J. R. Lorch, T. E. Anderson Proc. 2000 USENIX Annual Technical Conference File System Performance Integral component of overall system performance Optimised
More informationCumulative Percentage of File Requests
An Analysis of Geographical Push-Caching James Gwertzman, Microsoft Corporation Margo Seltzer, Harvard University Abstract Most caching schemes in wide-area, distributed systems are client-initiated. Decisions
More informationNew article Data Producer. Logical data structure
Quality of Service and Electronic Newspaper: The Etel Solution Valerie Issarny, Michel Ban^atre, Boris Charpiot, Jean-Marc Menaud INRIA IRISA, Campus de Beaulieu, 35042 Rennes Cedex, France fissarny,banatre,jmenaudg@irisa.fr
More informationStudy of Piggyback Cache Validation for Proxy Caches in the World Wide Web
The following paper was originally published in the Proceedings of the USENIX Symposium on Internet Technologies and Systems Monterey, California, December 997 Study of Piggyback Cache Validation for Proxy
More informationINTERNET OVER DIGITAL VIDEO BROADCAST: PERFORMANCE ISSUES
INTERNET OVER DIGITAL VIDEO BROADCAST: PERFORMANCE ISSUES Hakan Yılmaz TÜBİTAK Marmara Research Center Information Technologies Research Institute Kocaeli, Turkey hy@btae.mam.gov.tr Bülent Sankur Boğaziçi
More informationCharacterizing Document Types to Evaluate Web Cache Replacement Policies
Characterizing Document Types to Evaluate Web Cache Replacement Policies F.J. Gonzalez-Cañete, E. Casilari, Alicia Triviño-Cabrera Dpto. Tecnología Electrónica, Universidad de Málaga, E.T.S.I. Telecomunicación,
More informationThe Total Network Volume chart shows the total traffic volume for the group of elements in the report.
Tjänst: Network Health Total Network Volume and Total Call Volume Charts Public The Total Network Volume chart shows the total traffic volume for the group of elements in the report. Chart Description
More informationDesign and Implementation of A P2P Cooperative Proxy Cache System
Design and Implementation of A PP Cooperative Proxy Cache System James Z. Wang Vipul Bhulawala Department of Computer Science Clemson University, Box 40974 Clemson, SC 94-0974, USA +1-84--778 {jzwang,
More informationUCLA Computer Science. Department. Off Campus Gateway. Department. UCLA FDDI Backbone. Servers. Traffic Measurement Connection
Contradictory Relationship between Hurst Parameter and Queueing Performance Ronn Ritke, Xiaoyan Hong and Mario Gerla UCLA { Computer Science Department, 45 Hilgard Ave., Los Angeles, CA 924 ritke@cs.ucla.edu,
More informationAgenda Cache memory organization and operation Chapter 6 Performance impact of caches Cache Memories
Agenda Chapter 6 Cache Memories Cache memory organization and operation Performance impact of caches The memory mountain Rearranging loops to improve spatial locality Using blocking to improve temporal
More informationHTML Images Audio Video Dynamic Formatted Other. HTML Images Audio Video Dynamic Formatted Other. Percentage. Percentage
Internet Web Servers: Workload Characterization and Performance Implications Martin F. Arlitt Carey L. Williamson Abstract This paper presents a workload characterization study for Internet Web servers.
More informationImplications of Proxy Caching for Provisioning Networks and Servers
Implications of Proxy Caching for Provisioning Networks and Servers Mohammad S. Raunak, Prashant Shenoy, Pawan GoyalÞand Krithi Ramamritham Ý Department of Computer Science, ÞEnsim Corporation, University
More informationAnalyzing Cacheable Traffic in ISP Access Networks for Micro CDN applications via Content-Centric Networking
Analyzing Cacheable Traffic in ISP Access Networks for Micro CDN applications via Content-Centric Networking Claudio Imbrenda Luca Muscariello Orange Labs Dario Rossi Telecom ParisTech Outline Motivation
More information!! What is virtual memory and when is it useful? !! What is demand paging? !! When should pages in memory be replaced?
Chapter 10: Virtual Memory Questions? CSCI [4 6] 730 Operating Systems Virtual Memory!! What is virtual memory and when is it useful?!! What is demand paging?!! When should pages in memory be replaced?!!
More informationCS 347 Parallel and Distributed Data Processing
CS 347 Parallel and Distributed Data Processing Spring 2016 Notes 12: Distributed Information Retrieval CS 347 Notes 12 2 CS 347 Notes 12 3 CS 347 Notes 12 4 CS 347 Notes 12 5 Web Search Engine Crawling
More informationCS 347 Parallel and Distributed Data Processing
CS 347 Parallel and Distributed Data Processing Spring 2016 Notes 12: Distributed Information Retrieval CS 347 Notes 12 2 CS 347 Notes 12 3 CS 347 Notes 12 4 Web Search Engine Crawling Indexing Computing
More informationLast Class: Consistency Models. Today: Implementation Issues
Last Class: Consistency Models Need for replication Data-centric consistency Strict, linearizable, sequential, causal, FIFO Lecture 15, page 1 Today: Implementation Issues Replica placement Use web caching
More informationInterme diate DNS. Local browse r. Authorit ative ... DNS
WPI-CS-TR-00-12 July 2000 The Contribution of DNS Lookup Costs to Web Object Retrieval by Craig E. Wills Hao Shang Computer Science Technical Report Series WORCESTER POLYTECHNIC INSTITUTE Computer Science
More informationWhite paper ETERNUS Extreme Cache Performance and Use
White paper ETERNUS Extreme Cache Performance and Use The Extreme Cache feature provides the ETERNUS DX500 S3 and DX600 S3 Storage Arrays with an effective flash based performance accelerator for regions
More informationChallenges in large-scale graph processing on HPC platforms and the Graph500 benchmark. by Nkemdirim Dockery
Challenges in large-scale graph processing on HPC platforms and the Graph500 benchmark by Nkemdirim Dockery High Performance Computing Workloads Core-memory sized Floating point intensive Well-structured
More informationStochastic Models of Pull-Based Data Replication in P2P Systems
Stochastic Models of Pull-Based Data Replication in P2P Systems Xiaoyong Li and Dmitri Loguinov Presented by Zhongmei Yao Internet Research Lab Department of Computer Science and Engineering Texas A&M
More informationInteractive Branched Video Streaming and Cloud Assisted Content Delivery
Interactive Branched Video Streaming and Cloud Assisted Content Delivery Niklas Carlsson Linköping University, Sweden @ Sigmetrics TPC workshop, Feb. 2016 The work here was in collaboration... Including
More informationInvestigating Forms of Simulating Web Traffic. Yixin Hua Eswin Anzueto Computer Science Department Worcester Polytechnic Institute Worcester, MA
Investigating Forms of Simulating Web Traffic Yixin Hua Eswin Anzueto Computer Science Department Worcester Polytechnic Institute Worcester, MA Outline Introduction Web Traffic Characteristics Web Traffic
More informationToday: Coda, xfs! Brief overview of other file systems. Distributed File System Requirements!
Today: Coda, xfs! Case Study: Coda File System Brief overview of other file systems xfs Log structured file systems Lecture 21, page 1 Distributed File System Requirements! Transparency Access, location,
More informationIntroduction This document presents a workload characterization study for Web proxy servers, as part of a larger project on the design and performance
Web Proxy Workload Characterization Anirban Mahanti Carey Williamson Department of Computer Science University of Saskatchewan February 7, 999 Abstract This document presents a workload characterization
More informationTowards Deadline Guaranteed Cloud Storage Services Guoxin Liu, Haiying Shen, and Lei Yu
Towards Deadline Guaranteed Cloud Storage Services Guoxin Liu, Haiying Shen, and Lei Yu Presenter: Guoxin Liu Ph.D. Department of Electrical and Computer Engineering, Clemson University, Clemson, USA Computer
More informationThe Potential Costs and Benefits of Long-term Prefetching for Content Distribution
The Potential Costs and Benefits of Long-term Prefetching for Content Distribution Arun Venkataramani Praveen Yalagandula Ravindranath Kokku Sadia Sharif Mike Dahlin Technical Report TR-1-13 Department
More informationCS 425 / ECE 428 Distributed Systems Fall 2015
CS 425 / ECE 428 Distributed Systems Fall 2015 Indranil Gupta (Indy) Measurement Studies Lecture 23 Nov 10, 2015 Reading: See links on website All Slides IG 1 Motivation We design algorithms, implement
More informationA Proxy Caching Scheme for Continuous Media Streams on the Internet
A Proxy Caching Scheme for Continuous Media Streams on the Internet Eun-Ji Lim, Seong-Ho park, Hyeon-Ok Hong, Ki-Dong Chung Department of Computer Science, Pusan National University Jang Jun Dong, San
More informationTime-Domain Analysis of Web Cache Filter Effects (Extended Version)
Time-Domain Analysis of Web Cache Filter Effects (Extended Version) Guangwei Bai, Carey Williamson Department of Computer Science, University of Calgary, 25 University Drive NW, Calgary, AB, Canada T2N
More informationObjective-Optimal Algorithms for Long-term Web Prefetching
Objective-Optimal Algorithms for Long-term Web Prefetching Bin Wu and Ajay D Kshemkalyani Dept of Computer Science University of Illinois at Chicago Chicago IL 667 bwu ajayk @csuicedu Abstract Web prefetching
More informationFinding a needle in Haystack: Facebook's photo storage
Finding a needle in Haystack: Facebook's photo storage The paper is written at facebook and describes a object storage system called Haystack. Since facebook processes a lot of photos (20 petabytes total,
More informationOutline. CSC 447: Parallel Programming for Multi- Core and Cluster Systems
CSC 447: Parallel Programming for Multi- Core and Cluster Systems Performance Analysis Instructor: Haidar M. Harmanani Spring 2018 Outline Performance scalability Analytical performance measures Amdahl
More informationDemand fetching is commonly employed to bring the data
Proceedings of 2nd Annual Conference on Theoretical and Applied Computer Science, November 2010, Stillwater, OK 14 Markov Prediction Scheme for Cache Prefetching Pranav Pathak, Mehedi Sarwar, Sohum Sohoni
More informationSummary Cache based Co-operative Proxies
Summary Cache based Co-operative Proxies Project No: 1 Group No: 21 Vijay Gabale (07305004) Sagar Bijwe (07305023) 12 th November, 2007 1 Abstract Summary Cache based proxies cooperate behind a bottleneck
More informationPerformance Evaluation. of Input and Virtual Output Queuing on. Self-Similar Traffic
Page 1 of 11 CS 678 Topics in Internet Research Progress Report Performance Evaluation of Input and Virtual Output Queuing on Self-Similar Traffic Submitted to Zartash Afzal Uzmi By : Group # 3 Muhammad
More informationTag-based Social Interest Discovery
Tag-based Social Interest Discovery Xin Li / Lei Guo / Yihong (Eric) Zhao Yahoo!Inc 2008 Presented by: Tuan Anh Le (aletuan@vub.ac.be) 1 Outline Introduction Data set collection & Pre-processing Architecture
More informationImproving Internet Performance through Traffic Managers
Improving Internet Performance through Traffic Managers Ibrahim Matta Computer Science Department Boston University Computer Science A Glimpse of Current Internet b b b b Alice c TCP b (Transmission Control
More informationToday: World Wide Web
Today: World Wide Web WWW principles Case Study: web caching as an illustrative example Invalidate versus updates Push versus Pull Cooperation between replicas Lecture 22, page 1 Traditional Web-Based
More informationBuffer Management for Self-Similar Network Traffic
Buffer Management for Self-Similar Network Traffic Faranz Amin Electrical Engineering and computer science Department Yazd University Yazd, Iran farnaz.amin@stu.yazd.ac.ir Kiarash Mizanian Electrical Engineering
More informationGroup-Based Management of Distributed File Caches
Group-Based Management of Distributed File Caches Darrell D. E. Long Ahmed Amer Storage Systems Research Center Jack Baskin School of Engineering University of California, Santa Cruz Randal Burns Hopkins
More informationAdvanced Computer Networks
Advanced Computer Networks QoS in IP networks Prof. Andrzej Duda duda@imag.fr Contents QoS principles Traffic shaping leaky bucket token bucket Scheduling FIFO Fair queueing RED IntServ DiffServ http://duda.imag.fr
More informationSearch Engines. Information Retrieval in Practice
Search Engines Information Retrieval in Practice All slides Addison Wesley, 2008 Web Crawler Finds and downloads web pages automatically provides the collection for searching Web is huge and constantly
More informationDifferential Compression and Optimal Caching Methods for Content-Based Image Search Systems
Differential Compression and Optimal Caching Methods for Content-Based Image Search Systems Di Zhong a, Shih-Fu Chang a, John R. Smith b a Department of Electrical Engineering, Columbia University, NY,
More informationThe Design and Implementation of a Next Generation Name Service for the Internet (CoDoNS) Presented By: Kamalakar Kambhatla
The Design and Implementation of a Next Generation Name Service for the Internet (CoDoNS) Venugopalan Ramasubramanian Emin Gün Sirer Presented By: Kamalakar Kambhatla * Slides adapted from the paper -
More informationVenugopal Ramasubramanian Emin Gün Sirer SIGCOMM 04
The Design and Implementation of a Next Generation Name Service for the Internet Venugopal Ramasubramanian Emin Gün Sirer SIGCOMM 04 Presenter: Saurabh Kadekodi Agenda DNS overview Current DNS Problems
More informationToday: World Wide Web. Traditional Web-Based Systems
Today: World Wide Web WWW principles Case Study: web caching as an illustrative example Invalidate versus updates Push versus Pull Cooperation between replicas Lecture 22, page 1 Traditional Web-Based
More informationDynamic Broadcast Scheduling in DDBMS
Dynamic Broadcast Scheduling in DDBMS Babu Santhalingam #1, C.Gunasekar #2, K.Jayakumar #3 #1 Asst. Professor, Computer Science and Applications Department, SCSVMV University, Kanchipuram, India, #2 Research
More informationTrace Driven Simulation of GDSF# and Existing Caching Algorithms for Web Proxy Servers
Proceeding of the 9th WSEAS Int. Conference on Data Networks, Communications, Computers, Trinidad and Tobago, November 5-7, 2007 378 Trace Driven Simulation of GDSF# and Existing Caching Algorithms for
More informationCache Design for Transcoding Proxy Caching
Cache Design for Transcoding Proxy Caching Keqiu Li, Hong Shen, and Keishi Tajima Graduate School of Information Science Japan Advanced Institute of Science and Technology 1-1 Tatsunokuchi, Ishikawa, 923-1292,
More informationToday. Cache Memories. General Cache Concept. General Cache Organization (S, E, B) Cache Memories. Example Memory Hierarchy Smaller, faster,
Today Cache Memories CSci 2021: Machine Architecture and Organization November 7th-9th, 2016 Your instructor: Stephen McCamant Cache memory organization and operation Performance impact of caches The memory
More informationCache memories are small, fast SRAM-based memories managed automatically in hardware. Hold frequently accessed blocks of main memory
Cache Memories Cache memories are small, fast SRAM-based memories managed automatically in hardware. Hold frequently accessed blocks of main memory CPU looks first for data in caches (e.g., L1, L2, and
More informationCaching video contents in IPTV systems with hierarchical architecture
Caching video contents in IPTV systems with hierarchical architecture Lydia Chen 1, Michela Meo 2 and Alessandra Scicchitano 1 1. IBM Zurich Research Lab email: {yic,als}@zurich.ibm.com 2. Politecnico
More informationWebTraff: A GUI for Web Proxy Cache Workload Modeling and Analysis
WebTraff: A GUI for Web Proxy Cache Workload Modeling and Analysis Nayden Markatchev Carey Williamson Department of Computer Science University of Calgary E-mail: {nayden,carey}@cpsc.ucalgary.ca Abstract
More informationTHE TCP specification that specifies the first original
1 Median Filtering Simulation of Bursty Traffic Auc Fai Chan, John Leis Faculty of Engineering and Surveying University of Southern Queensland Toowoomba Queensland 4350 Abstract The estimation of Retransmission
More informationApplication of SDN: Load Balancing & Traffic Engineering
Application of SDN: Load Balancing & Traffic Engineering Outline 1 OpenFlow-Based Server Load Balancing Gone Wild Introduction OpenFlow Solution Partitioning the Client Traffic Transitioning With Connection
More informationFederated Array of Bricks Y Saito et al HP Labs. CS 6464 Presented by Avinash Kulkarni
Federated Array of Bricks Y Saito et al HP Labs CS 6464 Presented by Avinash Kulkarni Agenda Motivation Current Approaches FAB Design Protocols, Implementation, Optimizations Evaluation SSDs in enterprise
More informationCh. 7: Benchmarks and Performance Tests
Ch. 7: Benchmarks and Performance Tests Kenneth Mitchell School of Computing & Engineering, University of Missouri-Kansas City, Kansas City, MO 64110 Kenneth Mitchell, CS & EE dept., SCE, UMKC p. 1/3 Introduction
More informationA Scalable Content- Addressable Network
A Scalable Content- Addressable Network In Proceedings of ACM SIGCOMM 2001 S. Ratnasamy, P. Francis, M. Handley, R. Karp, S. Shenker Presented by L.G. Alex Sung 9th March 2005 for CS856 1 Outline CAN basics
More information10/16/2017. Miss Rate: ABC. Classifying Misses: 3C Model (Hill) Reducing Conflict Misses: Victim Buffer. Overlapping Misses: Lockup Free Cache
Classifying Misses: 3C Model (Hill) Divide cache misses into three categories Compulsory (cold): never seen this address before Would miss even in infinite cache Capacity: miss caused because cache is
More informationCONTENT DELIVERY IN THE MOBILE INTERNET
CONTENT DELIVERY IN THE MOBILE INTERNET Edited by Editor s Name A Wiley-Interscience Publication JOHN WILEY & SONS New York ffl Chichester ffl Weinheim ffl Brisbane ffl Singapore ffl Toronto Contents
More informationLarge-Scale Web Applications
Large-Scale Web Applications Mendel Rosenblum Web Application Architecture Web Browser Web Server / Application server Storage System HTTP Internet CS142 Lecture Notes - Intro LAN 2 Large-Scale: Scale-Out
More informationIP Traffic Characterization for Planning and Control
IP Traffic Characterization for Planning and Control V. Bolotin, J. Coombs-Reyes 1, D. Heyman, Y. Levy and D. Liu AT&T Labs, 200 Laurel Avenue, Middletown, NJ 07748, U.S.A. IP traffic modeling and engineering
More informationConcepts Introduced in Chapter 6. Warehouse-Scale Computers. Programming Models for WSCs. Important Design Factors for WSCs
Concepts Introduced in Chapter 6 Warehouse-Scale Computers A cluster is a collection of desktop computers or servers connected together by a local area network to act as a single larger computer. introduction
More informationChapter 5A. Large and Fast: Exploiting Memory Hierarchy
Chapter 5A Large and Fast: Exploiting Memory Hierarchy Memory Technology Static RAM (SRAM) Fast, expensive Dynamic RAM (DRAM) In between Magnetic disk Slow, inexpensive Ideal memory Access time of SRAM
More informationChapter 6 Memory 11/3/2015. Chapter 6 Objectives. 6.2 Types of Memory. 6.1 Introduction
Chapter 6 Objectives Chapter 6 Memory Master the concepts of hierarchical memory organization. Understand how each level of memory contributes to system performance, and how the performance is measured.
More informationWeek 7: Traffic Models and QoS
Week 7: Traffic Models and QoS Acknowledgement: Some slides are adapted from Computer Networking: A Top Down Approach Featuring the Internet, 2 nd edition, J.F Kurose and K.W. Ross All Rights Reserved,
More informationInformation Filtering and user profiles
Information Filtering and user profiles Roope Raisamo (rr@cs.uta.fi) Department of Computer Sciences University of Tampere http://www.cs.uta.fi/sat/ Information Filtering Topics information filtering vs.
More informationGenerating Representative Web Workloads for Network and Server Performance Evaluation
Boston University OpenBU Computer Science http://open.bu.edu CAS: Computer Science: Technical Reports 1997-12-31 Generating Representative Web Workloads for Network and Server Performance Evaluation Barford,
More informationNetworking and Internetworking 1
Networking and Internetworking 1 Today l Networks and distributed systems l Internet architecture xkcd Networking issues for distributed systems Early networks were designed to meet relatively simple requirements
More informationImproving object cache performance through selective placement
University of Wollongong Research Online Faculty of Informatics - Papers (Archive) Faculty of Engineering and Information Sciences 2006 Improving object cache performance through selective placement Saied
More informationPerformance of relational database management
Building a 3-D DRAM Architecture for Optimum Cost/Performance By Gene Bowles and Duke Lambert As systems increase in performance and power, magnetic disk storage speeds have lagged behind. But using solidstate
More informationInternet Services and Search Engines. Amin Vahdat CSE 123b May 2, 2006
Internet Services and Search Engines Amin Vahdat CSE 123b May 2, 2006 Midterm: May 9 Annoucements Second assignment due May 15 Lessons from Giant-Scale Services Service Replication Service Partitioning
More informationGismo: A Generator of Internet Streaming Media Objects and Workloads
Boston University OpenBU Computer Science http://open.bu.edu CAS: Computer Science: Technical Reports 001-30 Gismo: A Generator of Internet Streaming Media Objects and Workloads Jin, Shudong Boston University
More informationVirtual Multi-homing: On the Feasibility of Combining Overlay Routing with BGP Routing
Virtual Multi-homing: On the Feasibility of Combining Overlay Routing with BGP Routing Zhi Li, Prasant Mohapatra, and Chen-Nee Chuah University of California, Davis, CA 95616, USA {lizhi, prasant}@cs.ucdavis.edu,
More informationRowena Cole and Luigi Barone. Department of Computer Science, The University of Western Australia, Western Australia, 6907
The Game of Clustering Rowena Cole and Luigi Barone Department of Computer Science, The University of Western Australia, Western Australia, 697 frowena, luigig@cs.uwa.edu.au Abstract Clustering is a technique
More informationCHARACTERIZATION OF BUS TRANSACTIONS FOR SPECWEB96 BENCHMARK
Chapter 1 CHARACTERIZATION OF BUS TRANSACTIONS FOR SPECWEB96 BENCHMARK Prasant Mohapatra Department of Computer Science and Engineering Michigan state University East Lansing, MI 48824 prasant@cse.msu.edu
More informationTuning RED for Web Traffic
Tuning RED for Web Traffic Mikkel Christiansen, Kevin Jeffay, David Ott, Donelson Smith UNC, Chapel Hill SIGCOMM 2000, Stockholm subsequently IEEE/ACM Transactions on Networking Vol. 9, No. 3 (June 2001)
More informationOptimizing Parallel Access to the BaBar Database System Using CORBA Servers
SLAC-PUB-9176 September 2001 Optimizing Parallel Access to the BaBar Database System Using CORBA Servers Jacek Becla 1, Igor Gaponenko 2 1 Stanford Linear Accelerator Center Stanford University, Stanford,
More informationThe War Between Mice and Elephants
The War Between Mice and Elephants Liang Guo and Ibrahim Matta Computer Science Department Boston University 9th IEEE International Conference on Network Protocols (ICNP),, Riverside, CA, November 2001.
More information