
Optimized Regional Caching for On-Demand Data Delivery

Derek L. Eager
University of Saskatchewan, Saskatoon, SK Canada S7N 5A9
eager@cs.usask.ca

Michael C. Ferris, Mary K. Vernon
University of Wisconsin-Madison, Madison, WI 53706
{ferris, vernon}@cs.wisc.edu

ABSTRACT

Systems for on-demand delivery of large, widely-shared data can use several techniques to improve cost/performance, including: multicast data delivery, segmented data delivery, and regional (or proxy) servers that cache some of the data close to the clients. This paper makes three contributions to the state-of-the-art design of such systems. First, we show how segmented multicast delivery techniques, in particular the recently proposed high-performance dynamic skyscraper scheme, can be modified to allow each object to be partially or fully cached at regional servers. The new partitioned delivery architecture supports shared delivery between the regional and remote servers and improves performance even if one server delivers the entire object. The second contribution is an analytic model that can be solved to determine the full/partial object caching strategy that minimizes delivery cost in the context of a system that has homogeneous regional servers. Finally, results in the paper illustrate the use of the model and provide insight into how the optimal caching strategy is influenced by key system and workload parameters, including client request rate, the relative severity of the disk bandwidth and storage capacity constraints at the regional servers, and the relative costs of regional and remote delivery. Two important conclusions from the results are: (1) it is often cost-effective to cache the initial segments of many data objects rather than the complete data for fewer objects, and (2) the partitioned delivery architecture and caching partial objects can each greatly reduce delivery cost.

Keywords: proxy caching, skyscraper delivery, multicast, optimization

1. INTRODUCTION

This paper considers optimized regional (or proxy) caching strategies for the web and other on-demand data delivery systems that have two key features.
First, the objects are sufficiently large and popular that it is cost-effective to use multicast or broadcast delivery methods. Such objects might include, for example, news clips, television shows, medical or recreational information services, popular product advertisements, or successful distance education content. Note that even if the network does not support true multicast, simulated multicast can still be used to conserve server resources such as disk bandwidth.

Second, the system uses a segmented delivery technique in which each object is divided into fixed increasing-sized segments that are multicast separately. A client, or an agent acting on behalf of the client, can receive a small number of segments simultaneously, and can buffer segments that are received ahead of need. This greatly increases the (cost-)sharing of the larger segment multicasts, which in turn greatly reduces the server and network bandwidth required to support a given client workload. 19,2,1,13,1,12 Segmented delivery techniques were originally proposed for real-time delivery of video and other continuous media objects, but could also be employed for delivery of other large popular objects that do not have real-time constraints. We use the term multicast in the remainder of this paper to denote either multicast (e.g., via the Internet) or broadcast (e.g., via satellite or regional cable network).

The recently proposed dynamic skyscraper technique can provide true "on-demand" data delivery and has other cost/performance advantages over previously proposed segmented (as well as non-segmented) delivery schemes. 1 The first question addressed in this paper is how the dynamic skyscraper technique can be modified such that regional servers can cache just the first few segments of an object. Prior work on web caching as well as distributed video-on-demand (VOD) architectures has focused on the problem of determining on which server(s) to cache each entire data object so as to optimize system cost/performance. 16,2,4,6,21,5 However, in segmented multicasts, the initial segments are smaller and are multicast more frequently; thus large savings in remote server bandwidth can be obtained at small storage cost in the regional server. Since there is also greater (cost-)sharing per multicast for the later segments, it may be advantageous to cache the initial segments of many objects and to rely on remote multicast delivery of the later more widely (cost-)shared segments, rather than fully caching fewer (highly popular) objects. In addition, caching the initial segments allows the regional server to provide quick response while hiding the latency of communication with the remote server. Parallel work 17 further shows that caching initial segments can facilitate workahead smoothing for variable bit rate continuous media delivery. We propose a new partitioned skyscraper architecture to enable these potential benefits. This partitioned architecture can also be employed to improve dynamic skyscraper performance in the case that the entire object is delivered by a given server. A similar partitioned architecture can also be developed for other segmented delivery techniques.

This research was partially supported by the Natural Sciences and Engineering Research Council of Canada under Grant OGP-264, by the Air Force Office of Scientific Research under Grant F4962-98-1-417, and by the National Science Foundation under Grant ACI-961919.

Given a partitioned delivery architecture in which regional servers can cache partial objects and full objects, the next question addressed in this paper is whether an analytical model can be devised and solved to determine the minimum-cost regional caching strategy for given delivery cost models. For a system with a remote server (i.e., the information source) and a set of regional servers, one would like to answer questions such as:

1. If the regional service provider is different from the remote service provider, which full/partial objects should the regional server cache to minimize the use of the remote server, for a given fixed cost of the regional server?

2. If the organization that owns a remote information source also pays for the regional caching and delivery, how should the objects be partitioned between the remote server and the regional servers to minimize the overall cost of delivery to the end user?
The model postulated in this paper can evaluate such questions for a system with one or more remote information sources and a set of homogeneous regional servers. The model is developed for systems that use the proposed partitioned dynamic skyscraper delivery technique, but it can be adapted to determine the optimal caching strategy in systems with heterogeneous regional servers or systems that use other segmented or non-segmented delivery techniques, including conventional first-come first-serve (FCFS) delivery.

In Section 5, we illustrate the use of the model and obtain insight into how the optimal cache content is influenced by object popularity, the relative severity of the constraints on disk bandwidth and storage capacity at the regional servers, and other key system parameters. Two important conclusions from the experiments are: (1) it is often most cost-effective to cache many partial objects at the regional servers, and (2) the new partitioned dynamic skyscraper architecture and the caching of partial objects can each greatly reduce the minimum delivery cost.

Section 2 describes the dynamic skyscraper multicast technique. Section 3 defines the new partitioned dynamic skyscraper architecture that allows regional servers to store partial objects. Section 4 develops the model for determining the optimal cache content for a given relative cost of regional and remote delivery. Section 5 provides some results and insights from applying the model, and Section 6 provides conclusions for this work.

2. BACKGROUND: DYNAMIC SKYSCRAPER MULTICASTS

The parameters of a skyscraper multicast delivery system are defined in Table 1. The term channel refers to the entity used for a multicast and to the collection of server and network resources required to support the (multicast or broadcast) transmission of an object segment at a given delivery rate.
Parameter   Definition
C           total number of channels devoted to skyscraper multicasts
K           number of segments per object, and channels per skyscraper multicast
n           number of objects
N           number of groups of skyscraper channels (N = C/K)
$s_j$       size of the j'th segment (relative to the size of the first segment)
$T_1$       duration of a unit-segment transmission
W           the largest segment size

Table 1. Parameters of a Skyscraper System.

To simplify the system description and to gain initial insights into optimal regional caching strategies, the remainder of this paper assumes that all objects have the same length (i.e., amount of data) and are delivered at the same rate. However, the system and the associated delivery cost model can be modified to handle heterogeneous object lengths and delivery rates.

A key feature of a dynamic skyscraper system 1 is that each object is divided into K segments with a particular progression of relative sizes. The (smallest) initial segment is multicast most frequently, to reduce the time that a client must wait to receive it. Each larger segment is multicast less frequently to reduce bandwidth usage.

[Figure 1: transmission schedule on channels 0 through 7, with gray and striped shading identifying transmission clusters and an X on channel 0 marking the start of each cluster.]
Figure 1. Example Skyscraper Transmission Schedule. (K = 8; W = 8)

Each segment is transmitted on a different channel according to a schedule such as the one illustrated in Figure 1. In this example, each object is divided into eight segments (i.e., K = 8), with relative sizes {1,2,2,4,4,8,8,8}, and the segments are each multicast on separate channels, numbered 0 through 7, respectively. For example, a client who receives the gray-shaded transmission on channel 0 will receive the gray-shaded transmission on each of the other channels. This client will have received the entire object once the segment on channel 7 has been delivered. The segment transmissions for a given object are each repeated on their respective channels a specific number of times, constituting a transmission cluster, as identified by the gray and striped shading in the figure. Each transmission period marked with an X on channel 0 begins a new identically-structured transmission cluster which can be scheduled to deliver a new object's segments, in response to client requests.
For the given sequence and alignment of relative segment sizes, a client that starts reception during any channel 0 period in a transmission cluster can receive each of the other seven segments of the object, one per channel, with no pauses between segments, requiring simultaneous reception of at most two transmissions by the client. For example, consider a client who requests the object just before the last striped unit-segment transmission on channel 0. This client will receive the last striped transmission on each of channels 0 through 4, and then the gray-shaded transmissions on channels 5 through 7. All such reception sequences, including the initial reception sequence for the cluster, share the same multicast transmission on channel K.†

Many different relative segment size progressions are possible. 1,12 Whichever progression is used, it is bounded by a parameter, W, and padded if necessary with W values up to length K, in order to limit the required client storage capacity. Note that a new transmission cluster begins on channel 0 after each W unit-segment transmissions (i.e., after a period of $W T_1$), as marked by the Xs in Figure 1. Note also that $W T_1$ is the duration of the transmission cluster on each of the K channels in the group. The client buffer space requirement can be derived from Figure 1 by observing that clients who begin in the last unit-segment transmission period for a cluster will need to buffer $W - 1$ units of data once they start receiving the transmission on channel K, which is shared with the clients who started $W - 1$ units earlier. This buffering capability is easily accommodated by a commodity disk. 1

The C channels that are provided for multicasting objects are organized into N groups of K channels each. The transmission clusters in the different groups of channels are persistently staggered such that a new transmission cluster starts on a different group at a fixed spacing of $W T_1 / N$.

† The delivery technique is named "skyscraper" due to the shape that is formed by stacking the segment sizes one above the other.
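The timing quantities described above can be collected in a few lines. The following is a minimal sketch, not the authors' implementation: the function name `cluster_layout` and its dictionary keys are hypothetical, and the per-channel repetition count `W / s_j` is an inference from the statement that a cluster lasts $W T_1$ on every channel (it assumes each segment size divides W).

```python
def cluster_layout(sizes, W, T1, N):
    """Summarize one transmission cluster (illustrative helper, not from the paper).

    sizes: relative segment sizes s_1..s_K, one channel per segment
    W:     largest segment size
    T1:    duration of a unit-segment transmission
    N:     number of groups of K channels
    """
    for s in sizes:
        if W % s != 0:
            raise ValueError("assumes every segment size divides W")
    return {
        # A cluster occupies W*T1 on each channel, so a segment of size s
        # fits W/s times on its channel (inferred, see lead-in).
        "repeats_per_channel": [W // s for s in sizes],
        "cluster_duration": W * T1,
        # Clusters in the N groups are staggered at a fixed spacing.
        "cluster_start_spacing": W * T1 / N,
        # Worst-case client buffering, in unit-segments.
        "client_buffer_units": W - 1,
    }


layout = cluster_layout([1, 2, 2, 4, 4, 8, 8, 8], W=8, T1=1.0, N=4)
```

With the Figure 1 sizes, the unit segment appears eight times per cluster and each size-8 segment once, which matches the description of a single shared transmission on the last channel.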

If a client requests an object and it is not possible to join an on-going or scheduled transmission cluster for the object, the object is scheduled for delivery on the next available transmission cluster that will be multicast in the future. Requests that require a new transmission cluster are scheduled in first-come first-serve (FCFS) order because recent results show that for fixed length objects, FCFS outperforms other proposed scheduling algorithms including the maximum factored queue length first (MFQ) discipline if both the mean and the variability in client waiting time are considered. 18 Furthermore, the cluster assignment can be done when the request arrives, and thus the system can immediately inform the client when the multicast will begin.

The cost/performance of the dynamic skyscraper delivery technique is further enhanced by temporarily reassigning unused transmission periods to requests that are waiting for a unit-segment multicast in another active transmission cluster. This optimization, called channel stealing, and another optimization called idle cluster catch-up, significantly reduce average client wait (effecting true on-demand delivery), as illustrated in Figure 2. The details of these optimizations, and the dramatic improvement in performance of dynamic skyscraper delivery over prior segmented and non-segmented delivery techniques, are described in prior work. 1

3. PARTITIONED SKYSCRAPER ARCHITECTURE

This section considers the application of dynamic skyscraper delivery in a system with regional (or proxy) servers that cache some of the data. For simplicity, we describe the system in terms of a single remote server (i.e., information source) and multiple regional servers. The extension to multiple remote servers is straightforward.

In one scenario, the information source might transmit via satellite and the regional servers might each transmit requested satellite content as well as locally-cached content via a cable network. In such a scenario, regional delivery of initial segments can be tightly synchronized with the remote delivery of later segments of the same object.
With appropriate multicast support, or perhaps with simulated multicasts, the proposed partitioned dynamic skyscraper architecture can also be employed when objects are delivered over the Internet or other commodity networks. For this context, the definition of an efficient mapping between channels and multicast groups, and the small modifications that are needed to accommodate the uncertainty in network delivery times in the case of real-time playback, are left for future work.

3.1. A Naive Dynamic Skyscraper System

It is relatively straightforward to devise a system with regional caching that uses the dynamic skyscraper delivery technique if each object is either delivered entirely by the remote server or is cached entirely at the regional server. In this case, the regional server might (1) allocate (or account for) regional bandwidth to be used for remote server multicasts requested by its clients, based on the (projected) transmission time at the remote server, and (2) use its own transmission clusters for multicasts of locally stored objects. The regional server might also need to delay client requests for remote server multicasts and/or dynamically adjust the rate at which new regional transmission clusters are allocated, depending on the dynamic client workload and the total available regional network bandwidth.

The situation becomes more complex when considering how the multicasts might be organized if the initial k segments of some objects are cached at the regional server and the remainder of those objects are delivered by the remote server.‡ With partial object caching, a shared implementation of dynamic skyscraper is required. In the most naive implementation, whenever delivery of a partially cached object is requested, remote and regional servers cooperate to provide a complete transmission cluster on each regional network that might (eventually) need the remote portion of the multicast. Clearly, this might be very wasteful of bandwidth, as it might be the case that the transmission is not used by the clients in a given region.
However, if this is not done, then the problem arises as to how to accommodate a new client request for a partially cached object at a regional server that is not carrying (or planning to carry) a current (or scheduled) remote server multicast of the later segments of the object. It may again be wasteful of regional network bandwidth to schedule the entire regional portion of a transmission cluster for the object, since much of the new cluster will not be usable in the case that the remaining time for joining the remote multicast is very short.

Our proposed solution to this problem employs the concept of decoupled mini-transmission clusters (or mini-clusters) for the initial segments cached at the regional servers. These mini-clusters are allocated on-demand, allowing new clients to join on-going (or scheduled) remote server multicasts, as discussed next.

‡ Note that, for a given object, it is always more cost-effective to cache an earlier segment than a later segment of the object, due to the relative size and multicast frequency of the segments.

3.2. A Cost-Effective Partitioned Dynamic Skyscraper Architecture

The definition of mini-transmission clusters is enabled by the recursive structure of dynamic skyscraper transmission clusters. For example, the portion of the transmission cluster in Figure 1 that is carried on channels 0 through 4 is composed of two consecutive smaller transmission clusters with K = 5 and W = 4, while the portion carried on just channels 0 through 2 is composed of four consecutive smaller transmission clusters with K = 3 and W = 2. A mini-transmission cluster is defined as one of these smaller clusters, with the K parameter (denoted by k) equal to the number of segments stored regionally and the W parameter (denoted by w) equal to the (relative) size of the longest segment stored regionally. The basic idea is to deliver just a single instance of the mini-cluster in response to a user request, rather than the multiple repetitions of which a complete transmission cluster would normally be composed. The multicasts of the remaining segments by the remote server retain the same delivery structure (for those segments) that is illustrated in Figure 1. In the following, we denote the segments delivered by mini-transmission clusters the leading segment set, and the remaining segments the trailing segment set.

As for full transmission clusters (with K segments), the channels for mini-cluster transmissions at each regional server are organized into groups of k channels each, and the channels for trailing-segment transmissions at the remote server are organized into groups of $K - k$ channels each. The clusters in the different groups at each server are persistently staggered. For example, at each regional server a new mini-cluster starts on a different group every $w T_1$ divided by the number of groups of mini-cluster channels.

A new client request for an object that is not currently being multicast (or scheduled to be multicast) at the regional server or at the remote server, queues at the regional server for a mini-cluster, and queues at the remote server for a trailing segment set transmission cluster.
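The recursive structure can be checked numerically for the Figure 1 progression. This is a small sketch under the assumption that the number of mini-clusters that fit inside one full cluster is W / w; the function name `mini_cluster_params` is hypothetical.

```python
def mini_cluster_params(sizes, k):
    """Parameters of the mini-cluster formed by the first k segments.

    sizes: relative segment sizes s_1..s_K (non-decreasing)
    k:     number of leading segments (1 <= k <= K)
    """
    W = max(sizes)       # largest segment size of the full cluster
    w = max(sizes[:k])   # largest segment size in the leading segment set
    return {"k": k, "w": w, "minis_per_full_cluster": W // w}


SIZES = [1, 2, 2, 4, 4, 8, 8, 8]
five = mini_cluster_params(SIZES, 5)   # channels 0 through 4
three = mini_cluster_params(SIZES, 3)  # channels 0 through 2
```

For the two cases named in the text, this gives w = 4 with two consecutive mini-clusters, and w = 2 with four.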
In this way, the transmissions of the leading and trailing sets of segments are decoupled. However, allocation is coordinated, so that the mini-cluster at the regional server is scheduled as early as possible (so as to minimize delay) and the transmission cluster at the remote server is scheduled as late as possible relative to the scheduling of the mini-cluster. This maximizes opportunities for other clients to join the trailing segment multicasts while avoiding any pause in the delivery for the first client.

A new client request for an object that is not currently being multicast (or scheduled for multicast) at the regional server, but is within the catch-up window of an in-progress (or scheduled) remote transmission for the trailing segment set, must be allocated (1) a new mini-cluster at the regional server, and (2) regional network bandwidth for the remainder of the remote server transmission cluster.§

One further optimization is made possible by the mini-clusters. Consider Figure 1, and suppose that the regional server caches the first five segments. In this case, a mini-cluster can be allocated to a new client request that arrives just before the multicast of the sixth segment by the remote server. This is six unit periods later than the last unit-segment multicast on channel 0 in the standard skyscraper transmission cluster as defined in Section 2. This policy maximizes the duration of the total catch-up window for the trailing segment set transmission cluster, which can be shown to be of length $(W - s_{k+1} + s)\,T_1$, where $s$ is the sum of the sizes of the segments cached regionally, and $s_{k+1}$ is the size of the first segment delivered by the remote server.

A new client request can be satisfied by an in-progress mini-cluster for the requested object at the regional site, as long as it arrives within the mini-catch-up window for that cluster. This catch-up window is at most of length $(w - 1)\,T_1$.
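The window lengths can be worked through for the five-segment example. The sketch below evaluates the total catch-up window $W - s_{k+1} + \sum_{j=1}^{k} s_j$ and the maximum mini-catch-up window $w - 1$, both in unit-segment periods; the function names are hypothetical and the code assumes $1 \le k < K$.

```python
def total_catchup_units(sizes, k):
    """Total catch-up window of a trailing segment set cluster, in units of T1."""
    W = max(sizes)
    s_next = sizes[k]        # s_{k+1}: first segment delivered remotely (0-indexed list)
    s_lead = sum(sizes[:k])  # sum of the sizes of the k segments cached regionally
    return W - s_next + s_lead


def max_mini_catchup_units(sizes, k):
    """Upper bound on the mini-catch-up window, in units of T1."""
    return max(sizes[:k]) - 1


SIZES = [1, 2, 2, 4, 4, 8, 8, 8]
window = total_catchup_units(SIZES, 5)    # 8 - 8 + 13
standard = max(SIZES) - 1                 # catch-up window without mini-clusters
extra = window - standard                 # how much later a request may arrive
mini_window = max_mini_catchup_units(SIZES, 5)
```

With five cached segments the window is 13 unit periods against 7 in the standard scheme, which reproduces the six-unit-period difference quoted in the text.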
However, if the mini-cluster begins closer than $(w - 1)\,T_1$ before the end of the catch-up window of the corresponding trailing segment set transmission cluster, the mini-catch-up window will be smaller, since it is only possible for a request to join an in-progress mini-cluster if it can also join in with the corresponding trailing segment set cluster. The variable size of the mini-catch-up windows is accounted for in the optimization model developed in the next section.

Increasing the size of the total catch-up window implies increases in the client storage requirement and in the number of segment multicasts a client must be able to receive simultaneously. Specifically, the maximum storage requirement (as measured in unit-segments) is $W - s_{k+1} + s$ rather than $W - 1$. For the {1,2,2,4,4,...} segment size progression, if $k = K - 1$ or if the first two segments delivered by the remote server are of the same size, clients must be able to receive up to three (rather than two) segments simultaneously; otherwise clients must be able to receive up to four simultaneous segment multicasts.

§ Note that as long as more than one segment is stored at the regional server, the sum of the sizes of the segments cached regionally is greater than the size of the first segment delivered by the remote server. In this case, the mini-cluster can be scheduled at any point within the catch-up window. In contrast, storing only the first unit-segment of an object regionally requires significantly tighter synchronization between remote and regional servers to avoid pauses in delivery. (Consider the case in which a regional client request arrives just after the beginning of a remote server multicast of the first two-unit segment.) Thus, in the remainder of the paper we assume $k \neq 1$.

An important observation is that mini-clusters can also be used when the entire objects are delivered by a given server. The increased size of the catch-up window and the more efficient use of bandwidth for less popular objects suggest that decoupled mini-clusters can lead to improved performance in this case as well as when object delivery is shared between remote and regional servers. The optimization model developed in the next section assumes the use of mini-clusters (of size k segments) for each object, regardless of whether or how it is cached.

4. THE OPTIMIZATION MODEL

This section develops an analytic model that permits calculation of the set of object segments that should be cached at the regional (or proxy) servers in order to minimize delivery cost for a given skyscraper configuration. The problem is more complex than the problems examined in previous papers that address optimal object allocation in distributed VOD systems, owing to (1) the need to capture the behavior that occurs with segmented multicast delivery, in which clients will join (and cost-share) dynamically-scheduled segment multicasts, as in the partitioned dynamic skyscraper system, and (2) the possibility that objects may be partially cached at the regional servers. Specifically, the model must express a suitable measure of system performance for any set of allowed segment allocation decisions, in the presence of the fixed-size total catch-up windows and variable-size mini-catch-up windows described in Section 3.

Nevertheless, we are able to develop a fairly simple optimization model by focusing on the remote and regional bandwidth required to support a given assignment of objects and client workload, rather than on more detailed performance measures such as average client delay or blocking probability.
Note that the required regional network bandwidth is the same regardless of whether object segments are cached regionally or whether they are delivered from the remote server. However, the regional server bandwidth, and the remote server and network bandwidth are a function of the caching decisions. Candidate object distributions can thus be compared with respect to the impact they have on the predicted bandwidth requirements.

We desire an estimate of required bandwidth that is easy to compute, and that corresponds to a good operating point. Applying concepts from asymptotic bounds analysis, 14 our estimate for the required number of channels to support a given client workload is the average number of channels that would be in use if an infinite number of channels were available.¶ As we discuss below, this estimate is readily computed with only minimal statistical assumptions. Furthermore, in a wide variety of contexts, this form of estimate has been found to yield an operating point close to the knee of the cost-performance curve, at which point the achieved performance per unit cost is maximized.

As we illustrate in Figure 2 for several dynamic skyscraper systems, the proposed bandwidth estimate is close to the knee of the curve of mean client wait versus the inverse of the number of channels provided in the dynamic skyscraper delivery system. Furthermore, the examples illustrate that mean client wait at the knee is typically not much larger than the minimum achievable for the given skyscraper parameters. (That is, the system operates much like a closed queueing system due to the fact that the queue for new transmission clusters can never be greater than the number of objects.) Since we will be computing optimal regional cache content for a fixed skyscraper configuration (i.e., K, W, and k), the content that minimizes the proposed bandwidth requirement should yield average client wait that is close to minimum.

4.1.
Optimal Channel Capacity for Non-Partitioned Systems

The estimate of required bandwidth is developed first for the simpler context of a dynamic skyscraper system that does not have regional servers and does not use mini-clusters to improve performance (i.e., k = 0). Letting $C = K N$ denote the estimate of the required number of channels, and $\lambda_i$ denote the rate of requests for object i, we obtain

$$C = K N = K\, W T_1 \sum_{i=1}^{n} \frac{1}{(W-1)\,T_1 + 1/\lambda_i}.$$

The factor $W T_1$ is the duration of a transmission cluster on each channel. The i'th term in the sum is the inverse of the average time between transmission clusters that deliver object i, assuming requests for a new transmission cluster have zero wait time (or an infinite number of channels are available), and assuming that the average arrival rate of requests for the object when a transmission cluster is not available for catch-up is equal to the overall average arrival rate. (Note that the latter assumption implies that $1/\lambda_i$ is the average time from the end of a transmission cluster catch-up window for object i until the next request for that object.) Thus, the i'th term gives the allocation rate for new transmission clusters for object i if an infinite number of channels is available. Summing over all objects that use skyscraper multicasts gives the total maximum allocation rate, and multiplying this maximum allocation rate by the duration of a transmission cluster gives the average number of groups of channels that would be in use if an infinite number of groups were available. Multiplication by the number of channels per group gives an estimate of the total number of channels that should be provided.‖

¶ Or, equivalently, the largest number of channels such that any fewer would be guaranteed to result in queueing of client requests for any client request arrival process with the given per-object average request rates.

[Figure 2: average waiting time (mins) versus inverse of number of channels, with predicted knee locations marked. Curves: W=8, K=6 and W=16, K=15, each with and without channel stealing and idle cluster catch-up.]
Figure 2. Average Client Waiting Time vs. Inverse of Number of Channels. (224 12-minute objects; $\lambda$ = 8.)

Figure 2 gives sample results illustrating how close C is to the knee of the curve of average client waiting time versus the inverse of the number of channels, for several skyscraper systems. Note that in these and other configurations we have examined (not shown), the C estimates are close approximations to the knees of the curves, for the most basic dynamic skyscraper scheme as well as the significantly enhanced scheme that employs channel stealing and catch-up with idle clusters.

4.2.
Optimal Object Assignments for Regional Servers

The specific optimization problem considered is that of determining the regional cache contents that minimize the overall cost of delivery, as represented by a weighted sum of the required number of channels at the remote server and at each regional server, subject to constraints on the bandwidth and storage capacity at each regional server.

The optimization model parameters are defined in Table 2. The key model outputs are the $\theta$ values that specify whether object i should be cached (fully or partially) at the regional servers. It is assumed that all objects have the same segmentation parameters (K, W, k, w), and that each object can have 0, k or K segments stored regionally. It is also assumed that the regional sites are homogeneous in storage and bandwidth capabilities, as well as in client request rates and object selection frequencies, and thus, that all of the regional servers will store the same segments/objects. These assumptions simplify the exploration of the system design space and are thus appropriate for gaining initial insights. The model can easily be modified to include more general formulations in the future.

‖ Note that the dependence of C on $T_1$ is removed if the arrival rates are expressed in terms of requests per unit-segment transmission period.

The waiting time curves in Figure 2 were derived using simulation in which the 95% confidence intervals are within 2% of the reported values. The total request arrival rate is eight requests per minute, arrivals are Poisson, and the object selection frequencies are given by the Zipf() distribution on 1 objects. The system contains the most popular 224 objects because, for the modeled object selection frequencies, (1) the popularities of the first 1 objects match reasonably well with a particular measurement of the 1 most popular objects in a video rental outlet, 7 and (2) 8% of the requests are for the 224 most popular objects, which is a popular definition of the hot set that might use skyscraper delivery.

Input              Parameter Definition
k                  number of segments in the leading segment set
K                  number of segments per object
n                  number of objects
$N_{channels}$     maximum number of channels at each regional server
$N_{segments}$     maximum storage capacity (measured in number of unit-segments) at each regional server
P                  number of regional servers
$s_j$              size of the j'th segment (relative to the size of the first)
$T_1$              duration of a unit-segment transmission
w                  the largest segment size in the leading segment set
W                  the largest segment size in the trailing segment set
$\beta$            the cost of a regional server channel, relative to that of a remote server channel
$\lambda_i$        total arrival rate of requests for object i

Output             Parameter Definition
$C_{remote}$       number of channels needed for remote server multicasts
$C_{regional}$     number of channels needed for regional server multicasts
$D_{regional}$     storage needed at each regional server, in number of unit-segments
$X_i^{l,R}, X_i^{l,Rr}, X_i^{l,r}$; $X_i^{t,R}, X_i^{t,Rr}, X_i^{t,r}$
                   maximum rate at which transmission clusters can be allocated for multicasts of the leading/trailing (l/t) segment set for object i, distinguished by whether the segments of the object are only stored at the remote server (R), are partially cached at the regional servers (Rr), or are fully cached at the regional servers (r)
$\theta_i^{R}$     equals 1 if object i is stored only at the remote server; otherwise 0
$\theta_i^{Rr}$    equals 1 if only the leading segment set of object i is cached regionally; otherwise 0
$\theta_i^{r}$     equals 1 if object i is entirely cached regionally; otherwise 0

Table 2. Parameters for the Regional Cache Optimization Model.
With the above assumptions, the optimization problem is formally described as follows:

$$\min_{\theta} \; C_{remote}(\theta) + \beta\, P\, C_{regional}(\theta)$$

subject to

$$C_{regional}(\theta) \le N_{channels}$$
$$D_{regional}(\theta) \le N_{segments}$$
$$\theta_i^{R} + \theta_i^{Rr} + \theta_i^{r} = 1, \quad i = 1, 2, \ldots, n$$
$$\theta_i^{R},\ \theta_i^{Rr},\ \theta_i^{r} \in \{0, 1\}, \quad i = 1, 2, \ldots, n$$

Here the notation $\theta$ represents the vector whose components are $\theta_i^{R}$, $\theta_i^{Rr}$, $\theta_i^{r}$, $i = 1, 2, \ldots, n$.

Solution of this model requires a method of computing the number of channels needed at the remote server as well as at each regional server, as a function of the client workload (i.e., request arrival rates for each object), the partitioned skyscraper configuration, and the object segments cached at the regional servers. For this purpose, we use the same basic approach that was used in Section 4.1, entailing finding the maximum allocation rates of new (mini- and trailing segment) transmission clusters, assuming an infinite number of channels is available.

The maximum allocation rate for new transmission clusters in a system with no regional caching and no mini-clusters is the inverse of the average time between requests for a new transmission cluster when there is no queueing; that average time is $(W-1)\,T_1 + 1/\lambda_i$. The maximum allocation rates for trailing segment set transmission clusters in the partitioned dynamic skyscraper architecture are given by similar expressions that depend on whether the transmission cluster is delivered by the remote or regional server. Recall from Section 3 that if the multicast uses mini-clusters of size k segments, the duration of the total catch-up window (for trailing segment set clusters), in units of $T_1$, is $W - s_{k+1} + \sum_{j=1}^{k} s_j$.
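As a point of reference for the no-caching case just recalled, the Section 4.1 estimate $K\,W T_1 \sum_i 1/((W-1)T_1 + 1/\lambda_i)$ can be evaluated directly. This is an illustrative sketch only; the function name `required_channels` and the sample request rates are assumptions, not values from the paper.

```python
def required_channels(rates, K, W, T1):
    """Estimated number of channels for a non-partitioned system (k = 0).

    rates: per-object request rates lambda_i (requests per unit time)
    K:     segments (and channels) per transmission cluster
    W:     largest segment size
    T1:    duration of a unit-segment transmission
    """
    # Each term is the cluster allocation rate for one object when no request
    # ever queues: one cluster per catch-up window plus mean idle gap.
    allocation_rate = sum(1.0 / ((W - 1) * T1 + 1.0 / lam) for lam in rates)
    # Rate * cluster duration = average busy groups; times K = channels.
    return K * W * T1 * allocation_rate


one_object = required_channels([1.0], K=8, W=8, T1=1.0)
two_objects = required_channels([1.0, 0.5], K=8, W=8, T1=1.0)
```

Because each term saturates at $1/((W-1)T_1)$ as $\lambda_i$ grows, the per-object channel demand is bounded by $K W/(W-1)$, which is one way to see why the queue for new clusters cannot grow without bound.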

Noting that $\lambda_i / P$ is the arrival rate of client requests for object $i$ at a regional server, the maximum allocation rates for trailing segment set clusters are as follows:

$$
X_i^{t,R} = X_i^{t,Rr} = \frac{1}{\left(W - s_{k+1} + \sum_{j=1}^{k} s_j\right) T_1 + \dfrac{1}{\lambda_i}}
$$

$$
X_i^{t,r} = \frac{1}{\left(W - s_{k+1} + \sum_{j=1}^{k} s_j\right) T_1 + \dfrac{P}{\lambda_i}}
$$

The maximum allocation rates for mini-clusters are further complicated by the fact that the mini-catch-up window for such a cluster has a variety of possible lengths, depending on when the mini-cluster begins in relationship to the end of the catch-up window of the corresponding trailing segment set transmission cluster. To compute the average catch-up window size for mini-clusters for object $i$, we make two assumptions. First, we assume that the average arrival rate of mini-cluster requests for object $i$ during the last $wT_1$ of the catch-up window of an object $i$ trailing segment set transmission cluster is equal to the overall average arrival rate of mini-cluster requests for that object. This implies that the fraction of mini-clusters for object $i$ that will begin during this time period is given by $wT_1$ times the allocation rate of object $i$ trailing segment set transmission clusters. Second, we assume that such arrivals may occur anywhere within this time period with equal probability. Thus, the average catch-up window size of the corresponding mini-clusters is given by $\frac{w-1}{2} T_1$. All other mini-clusters have a full catch-up window of size $(w-1)T_1$. The three mini-cluster allocation rates are therefore given by the following equations:

$$
X_i^{l,R} = \frac{1}{\left[ X_i^{t,R}\, wT_1 \dfrac{w-1}{2} + \left(1 - X_i^{t,R}\, wT_1\right)(w-1) \right] T_1 + \dfrac{1}{\lambda_i}}
$$

$$
X_i^{l,Rr} = \frac{1}{\left[ X_i^{t,Rr}\, wT_1 \dfrac{w-1}{2} + \left(1 - X_i^{t,Rr}\, wT_1\right)(w-1) \right] T_1 + \dfrac{P}{\lambda_i}}
$$

$$
X_i^{l,r} = \frac{1}{\left[ X_i^{t,r}\, wT_1 \dfrac{w-1}{2} + \left(1 - X_i^{t,r}\, wT_1\right)(w-1) \right] T_1 + \dfrac{P}{\lambda_i}}
$$

As in the simpler system model of Section 4.1, multiplying each of the above allocation rates by the duration of, and number of channels in, a transmission cluster of the corresponding type, yields an estimate for the number of channels required for that particular type of transmission cluster.
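For specific object segment allocations, as reflected through the $\theta$ values, the required number of channels and the required regional storage can thus be computed as follows:

$$
C_{remote}(\theta) = \sum_{i=1}^{n} \left[ \left( \theta_i^{R} X_i^{t,R} + \theta_i^{Rr} X_i^{t,Rr} \right)(K-k)\, W T_1 + \theta_i^{R} X_i^{l,R}\, k\, w T_1 \right]
$$

$$
C_{regional}(\theta) = \sum_{i=1}^{n} \left[ \theta_i^{r} X_i^{t,r} (K-k)\, W T_1 + \left( \theta_i^{r} X_i^{l,r} + \theta_i^{Rr} X_i^{l,Rr} \right) k\, w T_1 \right]
$$

$$
D_{regional}(\theta) = \sum_{i=1}^{n} \left( \left( \theta_i^{r} + \theta_i^{Rr} \right) \sum_{j=1}^{k} s_j + \theta_i^{r} \sum_{j=k+1}^{K} s_j \right)
$$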
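The channel and storage estimates can be evaluated numerically for any candidate placement. The following is a minimal sketch under stated assumptions, not the authors' implementation (the paper's model was formulated in GAMS): the names `SkyscraperConfig`, `trailing_rate`, `leading_rate`, and `evaluate` are hypothetical, each object's placement is given as one of the strings "R", "Rr", or "r", and the leading set is assumed non-empty (1 <= k < K).

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class SkyscraperConfig:
    """Hypothetical container for the model inputs of Table 2."""
    sizes: Sequence[int]   # s_1..s_K, relative segment sizes
    k: int                 # segments in the leading set (assumes 1 <= k < K)
    P: int                 # number of regional servers
    T1: float = 1.0        # unit-segment transmission time

    @property
    def w(self):
        return max(self.sizes[:self.k])    # largest leading segment size

    @property
    def W(self):
        return max(self.sizes[self.k:])    # largest trailing segment size


def trailing_rate(cfg, lam, regional):
    """X^{t,.}: catch-up window is W - s_{k+1} + sum_{j<=k} s_j unit-segments."""
    window = cfg.W - cfg.sizes[cfg.k] + sum(cfg.sizes[:cfg.k])
    gap = (cfg.P if regional else 1) / lam   # mean time between requests
    return 1.0 / (window * cfg.T1 + gap)


def leading_rate(cfg, lam, x_t, regional):
    """X^{l,.}: mini-cluster rate using the averaged mini catch-up window."""
    frac = x_t * cfg.w * cfg.T1              # share with a shortened window
    avg_window = frac * (cfg.w - 1) / 2 + (1 - frac) * (cfg.w - 1)
    gap = (cfg.P if regional else 1) / lam
    return 1.0 / (avg_window * cfg.T1 + gap)


def evaluate(cfg, lams, placement):
    """Return (C_remote, C_regional, D_regional) for one placement vector."""
    K = len(cfg.sizes)
    lead = cfg.k * cfg.w * cfg.T1            # channels x duration, leading cluster
    trail = (K - cfg.k) * cfg.W * cfg.T1     # channels x duration, trailing cluster
    c_remote = c_regional = 0.0
    storage = 0
    for lam, where in zip(lams, placement):
        if where == "R":                     # stored only at the remote server
            x_t = trailing_rate(cfg, lam, False)
            c_remote += x_t * trail + leading_rate(cfg, lam, x_t, False) * lead
        elif where == "Rr":                  # leading set cached regionally
            x_t = trailing_rate(cfg, lam, False)
            c_remote += x_t * trail
            c_regional += leading_rate(cfg, lam, x_t, True) * lead
            storage += sum(cfg.sizes[:cfg.k])
        elif where == "r":                   # entire object cached regionally
            x_t = trailing_rate(cfg, lam, True)
            c_regional += x_t * trail + leading_rate(cfg, lam, x_t, True) * lead
            storage += sum(cfg.sizes)
        else:
            raise ValueError(f"unknown placement {where!r}")
    return c_remote, c_regional, storage
```

For example, `evaluate(cfg, lams, ["Rr"] * len(lams))` gives the channel and storage requirements when only the leading segment set of every object is cached regionally. Because the allocation rates do not depend on the placement, the three returned quantities are linear in the 0/1 placement indicators.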

The above equations estimate the required number of channels (or disk I/O and network I/O bandwidth) at each type of server, assuming the network supports multicast or broadcast delivery. This is also the number of network channels required for the remote (or regional) multicasts if the respective server is operating over a broadcast network such as a satellite or cable network. The calculations are somewhat imprecise for the required network channels if the servers are operating over a switched network, such as the Internet, because the (average) bandwidth needed for the multicast depends on the (average) number of clients receiving the multicast, and this in turn may depend on the object that is being transmitted. Recall that the required regional network bandwidth is not affected by whether the objects are stored at the remote or regional server. Thus, the only extension that is needed to make the model precise for a switched network is to factor in the average remote network bandwidth required to multicast each object. In the interest of gaining initial insights, we defer this extension to future work.

There are several points worth noting about the model. First, the model is valid for multiple remote servers if each stores a distinct subset of the objects in the system and if the network used by each remote server has the same cost for the same required bandwidth. In this case $C_{remote}$ is the aggregate number of channels that must be provided at the collection of remote servers. Second, the model can be generalized for heterogeneous regional servers by developing separate calculations, similar to those given above, and summing over the appropriate objects, for each distinct regional server. Third, $C_{remote}$, $C_{regional}$ and $D_{regional}$ are linear functions of the binary variables. Thus, the optimization model is a mixed integer linear program (MIP) [15], for which reliable solution techniques exist.

5. RESULTS

In this section we provide results that illustrate the capabilities of the optimization model as a system design tool.
The results also yield insight into the form of the optimal caching strategy at the regional servers for various system parameters and cost considerations. We perform three types of experiments with the model:

1. In Section 5.1, we investigate the optimal strategy when the objective is to minimize the use of remote server channels (i.e., $\beta = 0$), under the constraints of fixed regional server capacity and bandwidth. These results will show that the form of the optimal assignments differs, depending on whether the active constraint is the bandwidth or the capacity of the regional server.
2. In Section 5.2, we examine how the optimal regional cache contents change as the relative cost of the regional channels, $\beta$, increases. These results will show that the regional servers should cache segments for less popular objects as $\beta$ increases.
3. In Section 5.3 we show how the allocations and system cost change as the capacity and bandwidth at the regional servers are varied.

All of the results will show a high tendency to cache partial objects. The results will also show that, compared with the minimum cost solution when the regional servers can only cache whole objects and the servers employ the traditional skyscraper architecture, the partitioned skyscraper architecture can result in substantial cost savings, and the capability to cache partial objects can result in further substantial cost savings. Together these two new improvements yield substantial cost savings except when client request rate is very high and regional server resources are severely limited.

The optimization model was formulated in an algebraic modeling language, GAMS [3,11], to enable solution using a variety of MIP solvers. The most effective algorithms currently implemented are branch and bound (branch and cut) methods with subproblems solved using a variant of the simplex method for linear programming [8]. After some experimentation with the model formulation and solvers, we used the XPRESS [9] code with default settings. The GAMS system is appropriate for the speed of model development and analysis required in this research.
However, it is possible that specialized algorithms for this problem could be more efficient. Design and implementation of such algorithms might be appropriate for routine use of the model.

Each experiment reported below solves the model to optimality under the assumption that there will be at most one contiguous set of objects that are fully cached at the regional servers. This assumption was enforced by adding the following constraint to the model:

$$
\sum_{i} \max\left( \theta_i^{r} - \theta_{i-1}^{r},\; 0 \right) \le 1.
$$
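The requirement of at most one contiguous set of fully cached objects can be checked on a candidate solution by counting runs of consecutive fully cached objects. The sketch below is one reading of that requirement, not the constraint as coded in the authors' GAMS model; the function name `full_cache_runs` is hypothetical, and the input is assumed to be a list of 0/1 flags, one per object in index order, with an implicit 0 before the first object.

```python
def full_cache_runs(theta_r):
    """Count maximal runs of consecutive fully cached objects.

    Each 0 -> 1 transition in the flag sequence starts a new run, so the
    count equals the sum over objects of max(theta_r[i] - theta_r[i-1], 0).
    """
    runs = 0
    prev = 0  # assumed flag "before" the first object
    for cur in theta_r:
        runs += max(cur - prev, 0)
        prev = cur
    return runs


def satisfies_contiguity(theta_r):
    """True if the fully cached objects form at most one contiguous set."""
    return full_cache_runs(theta_r) <= 1
```

A placement that fully caches objects 3 through 10 passes the check, while one that fully caches objects 0 to 2 and 50 to 60 does not.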

[Figure 3, panels (a) to (c): number of channels (remote and regional; panel (a) also shows the total) versus fraction of object cached regionally, for object 0 and object 99. The panel labels give the values of $\lambda$ and $P$.]

Figure 3. $C_{remote}$ and $C_{regional}$ for Objects 0 and 99 vs. Fraction Cached.

This inequality, which greatly reduces the model solution time, was only added after substantial experimentation with the model showed it was always satisfied by the optimal solutions.

All experiments in this paper assume a system with: (1) 100 objects having selection frequencies modeled by the Zipf(0) distribution, (2) $K = 12$, (3) relative segment size progression of {1,2,2,4,4,8,8,...}, and (4) $W = 64$. For this segment size progression and these values of $K$ and $W$, the total length of each object is 189 unit-segments. In order for the results to be independent of object size (in bytes) and delivery time, we define client request arrival rate, $\lambda$, in terms of requests per unit-segment multicast period††, and regional server capacity, $N_{segments}$, in terms of the number of unit-segments that can be cached. Note that the unit-segment transmission time, $T_1$, is equal to the total time to deliver the object divided by 189.

†† As noted in Section 4, expressing the arrival rate in requests per $T_1$ in the model equations removes the dependence of the cost on the value of $T_1$.

To aid in interpreting the results, each graph in Figure 3 shows the estimated number of remote server channels and regional server channels that should be provided for the most popular object (object 0) and least popular object (object 99) as a function of the fraction of the object that is cached at the regional server. The x-axis in each graph ranges from 0 to 66% because this represents the range of 0 to 11 segments of the object when $K = 12$ and $W = 64$. Key observations from these figures are:

• Caching the initial segments of each object leads to greater decrease in remote bandwidth per unit of storage than caching the later segments.
• The most popular object has the greatest decrease in remote server bandwidth and greatest increase in regional server bandwidth as a function of the fraction cached.
• As client request rate ($\lambda$) increases, the differential in remote server channels required for object 0 versus object 99 decreases. If $P$ is fixed and greater than one, the differential in required regional server channels also decreases, but more gradually.
• When $P = 1$, the total number of channels required is minimum for some $k > 0$. Thus, the new transmission mini-clusters proposed in this paper improve the cost/performance ratio even when used within a single server.

We have computed the minimum-cost regional cache contents for each experiment below for $k$ = 0, 3, 5, 7, and 9. The caching strategy as a function of the other system parameters is similar for each $k$, and since $k = 7$ achieves the lowest delivery cost in almost every case, we provide the results for $k = 7$. Note that the first seven segments represent 29/189 or about 15% of the total object length.

For each experiment, we also express the regional server storage capacity and bandwidth constraints in units of a "baseline server capability", where one baseline unit is $N_{channels} = 192$ and $N_{segments} = 2436$‡‡. Note that 2436 segments is about 13% of the total number of segments for the 100 objects in the system.

‡‡ These baseline constraints might represent, for example, that the server contains four commodity disks each with capacity to store 4.35 gigabytes of data, the objects have total size equal to 1.35 gigabytes (e.g., two-hour MPEG-1 encoded videos), and delivery rate is 1.5 million bits per second. The constraints are also valid, for example, if the server contains three times as many disks and the objects are two-hour MPEG-2 encoded videos.

[Figure 4, panels (a) to (c): percentage stored regionally (y-axis). Panel (a): baseline (cap), cost reductions 42%, 23%. Panel (b): 2 × baseline (cap), cost reductions 16%, 52%. Panel (c): 2 × baseline (b/w), cost reductions 0%, 26%. The panel labels also give the values of $\lambda$ and $P$.]

Figure 4. Regional Cache Content that Minimizes $C_{remote}$ ($\beta = 0$, $k = 7$). baseline: $N_{channels} = 192$, $N_{segments} = 2436$. (cap) indicates the regional storage capacity constraint is active at the solution; (b/w) indicates the regional bandwidth constraint is active at the solution.

5.1. Minimizing Use of the Remote Server

In this section we solve the model for $\beta = 0$ to compute the partial and full objects that should be cached at the regional servers to minimize the use of remote server bandwidth. The optimal cache contents for particular values of client request arrival rate, number of regional servers, and server capacity and bandwidth, are shown in Figure 4. Above each graph, we give the percent reduction in required remote server bandwidth for two improvements suggested in this paper: (1) the percent reduction when the regional server optimally caches only whole objects ($\theta_i^{Rr} = 0$) and each server uses transmission mini-clusters (of size $k = 7$) compared with optimally caching only whole objects and not using mini-clusters ($k = 0$), and (2) the additional cost savings when the regional servers are able to cache partial objects. The key insights from these results are:

• There is a high tendency to cache partial objects.
• The use of transmission mini-clusters greatly reduces delivery cost unless the client request rate is high enough that full transmission clusters for the leading segment set are always well utilized.
• The ability to cache partial objects can lead to a substantial (up to 52%) reduction in delivery cost.
• When server storage capacity is the active constraint, but there is sufficient storage to cache more than the initial $k$ segments of each object, then the optimal solution also fully stores as many of the most popular objects as will fit (for $\beta = 0$).
• When the regional server bandwidth becomes the active constraint, less popular object segments are cached; however the precise form of the optimal cache content is specific to the given set of parameters that characterize the system, and thus the optimal content can only be determined by solving the model.
• Increasing the number of regional servers reduces the regional client request rate, which can alleviate the regional server bandwidth constraint. For example, the server capacity is the active constraint for $\lambda$ = 1, $P$ = 1, and 2 × baseline, and the optimal solution in this case is identical to the solution in Figure 4(b).
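The parameter arithmetic behind the experimental setup (object length of 189 unit-segments, the 15% and 66% fractions, and the baseline storage of 2436 unit-segments) can be reproduced with a short sketch. The function names are hypothetical. The progression is assumed to double every two segments and to be capped at $W$, which is consistent with the stated total of 189, and the baseline figure is assumed to follow from the footnoted example of four 4.35-gigabyte disks and 1.35-gigabyte objects.

```python
def skyscraper_sizes(K, W):
    """Relative segment sizes 1,2,2,4,4,8,8,... capped at W (assumed form)."""
    return [min(2 ** (j // 2), W) for j in range(1, K + 1)]


def object_gigabytes(bits_per_second, seconds):
    """Object size in gigabytes (10^9 bytes) at a constant delivery rate."""
    return bits_per_second * seconds / 8 / 1e9


def baseline_segments(disks, disk_gb, object_gb, segments_per_object):
    """Unit-segments that fit on the regional server's disks."""
    return round(disks * disk_gb / object_gb * segments_per_object)


sizes = skyscraper_sizes(K=12, W=64)
total = sum(sizes)                      # unit-segments per object
leading = sum(sizes[:7])                # first seven segments (k = 7)
leading_pct = 100 * leading / total     # share of the object in the leading set
max_cached_pct = 100 * (total - sizes[-1]) / total   # first 11 of 12 segments
video_gb = object_gigabytes(1.5e6, 2 * 3600)         # two hours at 1.5 Mb/s
n_segments = baseline_segments(4, 4.35, video_gb, total)
cache_share_pct = 100 * n_segments / (100 * total)   # share of all 100 objects
```

Under these assumptions the sketch yields 189 unit-segments per object, 29 unit-segments (about 15%) in the first seven segments, about 66% for the first eleven segments, and 2436 unit-segments (about 13% of all segments) for the baseline server.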

[Figure 5, panels (a) to (c): percentage stored regionally (y-axis). Panel (a): $\beta = 0.3$ (cap), cost reductions 4%, 13%. Panel (b): $\beta = 0.5$ (cap), cost reductions 38%, 9%. Panel (c): $\beta = 0.8$, cost reductions 38%, 1%.]

Figure 5. Impact of Relative Regional Server Bandwidth Cost. $k = 7$, $N_{channels} = 192$, $N_{segments} = 2436$, with $\lambda$ and $P$ as in Figure 4(a). (cap) indicates the capacity constraint is active. The bandwidth constraint is never active.

5.2. Minimizing the Total Cost of Delivery

The experiments in this section investigate the regional cache content that minimizes total delivery cost over remote and regional channels, for a fixed regional server capacity and bandwidth. The results provide insight into how the form of the optimal assignment of full and partial objects changes as the relative cost of the regional channels, $\beta$, increases. To facilitate comparison with the previous set of experiments for $\beta = 0$, we use the same system parameters as in Figure 4(a), except that we increase the value of $\beta$. The results, given in Figure 5, include the same cost reductions as in the previous set of experiments, and lead to the following observations:

• As $\beta$ increases, lower-popularity objects are cached at the regional server. This is due to the earlier observation (from Figure 3(a)) that more popular objects have a greater increase in regional server bandwidth per unit of storage than the less popular objects.
• When $\beta$ is sufficiently high, as in Figure 5(c), the minimum cost solution does not fully utilize either the available regional storage capacity or the available regional bandwidth. This is principally due to lower cost sharing of the regional multicasts.
• As in the experiments for $\beta = 0$, the use of transmission mini-clusters greatly reduces minimum delivery cost (for the given client arrival rate).
• The ability to cache partial objects yields a smaller reduction in minimum delivery cost when fewer segments are stored.

5.3.
Impact of Regional Server Resource Constraints

In this section we investigate the impact of varying the regional server storage capacity and bandwidth (in units of the baseline server capability) on delivery cost and the form of the optimal object assignments.

Figure 6 gives the results for the system parameters that were used in Figure 4(b), except that $\beta = 0.1$. For this set of system parameters, the cache content is constrained by the regional server storage capacity. As storage capacity increases, an increasing number of objects are partially cached. Once capacity is sufficient to cache more than the first seven segments (15%) of each object, the optimal solution is to also fully cache some of the objects. In this case, the least popular objects are fully cached (for $\beta = 0.1$), since the last five segments of these objects have similar bandwidth requirement for remote multicasts, but lower bandwidth requirement for regional multicasts, as compared with the last five segments of the most popular objects (see Figure 3(b)). As in the previous set of experiments, $\beta > 0$ implies a tendency to store less popular object segments. The partitioned skyscraper architecture and the ability to cache partial objects each provide moderate reduction in delivery cost; altogether the reduction is substantial.

[Figure 6, panels (a) to (c): percentage stored regionally (y-axis). Panel (a): 1/2 baseline (cap), cost reductions 13%, 15%. Panel (b): baseline (cap), cost reductions 14%, 29%. Panel (c): 2 × baseline (cap), cost reductions 16%, 31%.]

Figure 6. Impact of Regional Server Storage and Bandwidth Constraints. $k = 7$, $\beta = 0.1$, with $\lambda$ and $P$ as in Figure 4(b).

Figure 7 provides results for the case that the client request arrival rate is increased ten-fold and the relative cost of the regional channels, $\beta$, is decreased by a factor of ten. These are the system parameters that were used in Figure 4(c), except that $\beta = 0.01$. In this case, the optimal assignment is constrained by the regional server bandwidth at up to four times the baseline resource constraints, and for those configurations it is optimal to store lower-popularity objects. On the other hand, when the regional server resources are increased to six times the baseline, the capacity constraint becomes active. In this case, $\beta$ is small enough that it is optimal to store the most popular objects completely at the regional server (in addition to storing the first seven segments of all other objects). Due to the high client request rate, the partitioned skyscraper architecture does not provide a cost savings in these cases. However, because the request rate is high, the ability to cache partial objects substantially reduces delivery cost when the regional server has reasonable capacity and bandwidth.

[Figure 7, panels (a) to (d): percentage stored regionally (y-axis). Panel (a): baseline (b/w), cost reductions 0%, 12%. Panel (b): 3 × baseline (b/w), cost reductions 0%, 1%. Panel (c): 4 × baseline (b/w), cost reductions 0%, 48%. Panel (d): 6 × baseline (cap), cost reductions 3%, 43%.]

Figure 7. Impact of Regional Server Storage and Bandwidth Constraints. $k = 7$, $\beta = 0.01$, with $\lambda$ and $P$ as in Figure 4(c).

Figure 8 gives the minimum delivery cost ($C_{remote} + P \beta\, C_{regional}$) as a function of the regional server resource constraints, with four disks being equal to the baseline constraints, for the system configurations in Figures 6 and 7. Note that greater reduction in cost can be achieved by increasing the regional server resources when $\beta$ is smaller, as would be expected.

[Figure 8, panels (a) and (b): cost (y-axis) versus number of disks (x-axis). Panel (a): $\lambda$, $P$, $\beta$, $k$ from Figure 6. Panel (b): $\lambda$, $P$, $\beta$, $k$ from Figure 7.]

Figure 8. Delivery Cost vs Regional Server Storage and Bandwidth.

6.
CONCLUSIONS

This paper has developed a partitioned dynamic skyscraper delivery architecture that (1) provides more efficient use of system bandwidth by increasing the size of the window in which clients can join in on-going multicasts, and (2) allows regional servers to cache partial objects. A similar partitioned architecture that uses decoupled transmission mini-clusters might be applied to other segmented multicast delivery techniques.

We have also developed a simple optimization model that can be solved to determine the form of the optimal regional caching strategy in a system with homogeneous regional caches. The model computes the cache content that minimizes delivery cost, constrained by the storage capacity and bandwidth available at the regional servers.

Initial results of the model show the impact of various system parameters on the form of the optimal allocation policies. Key parameters include the regional client request arrival rate, the relative constraints on disk bandwidth and storage capacity at the regional servers, and the cost of the regional server bandwidth relative to the cost of the remote server and network bandwidth. The results show a very strong tendency to store the initial segments of many objects, rather than the entire data for fewer objects, at the regional server. The results also show that the partitioned skyscraper architecture and the ability to cache partial objects can greatly reduce delivery cost.

The optimization model developed in this paper assumes, for simplicity, that each object is of equal size and is either completely stored at the remote server, completely stored at the regional server, or has exactly a fixed value