Reduction of Periodic Broadcast Resource Requirements with Proxy Caching

Ewa Kusmierek and David H.C. Du
Digital Technology Center and Department of Computer Science and Engineering
University of Minnesota
{kusmiere, du}@cs.umn.edu

Abstract—Video streaming on a large scale can be expensive and resource demanding. Periodic broadcast reduces server bandwidth usage for popular videos. While improving scalability, this mechanism considerably increases the WAN bandwidth and buffer space needed by a client. We propose to address the high resource demands on the client side with caching by a proxy server. We present periodic broadcast schemes designed to work with proxy caching that optimize server bandwidth, client bandwidth, and client buffer requirements. We show that prefix caching can lower WAN bandwidth usage and derive the lower limit on the bandwidth requirement. A similar result is obtained for server bandwidth usage with prefix caching. We also show that chunk caching can reduce the client buffer space requirement to a small value. We analyze the trade-offs involved in reducing the usage of each of the three resources. The results of the optimization are illustrated with a group of MPEG-4 encoded videos, for which optimal parameters are obtained through dynamic programming. We also formulate and solve a problem of efficient use of the proxy storage space under the assumption that such space is limited.

I. INTRODUCTION

Many multimedia applications, such as distance learning, digital libraries, video conferencing, and entertainment on demand, rely on video delivery over the Internet. The typical environment consists of a video server providing access to a number of videos to a potentially large number of clients over a Wide Area Network (WAN). Video streaming is challenging due to the amount of resources required, such as server I/O bandwidth and storage space at the client, and the cost of resources such as WAN bandwidth.
In a large-scale environment, server I/O bandwidth can easily be exhausted by a large number of requests for a popular video. WAN bandwidth, on the other hand, even though it may be plentiful in the backbone network, may also be expensive, with a limited amount available to a client in the access network. Periodic broadcast (PB) [ ] has been proposed to address the scalability issue of video streaming. PB schemes make server I/O bandwidth usage independent of the number of clients. However, the requirements that a client has to satisfy are typically much larger than for a unicast transmission. A video is divided into a number of segments, each broadcast continuously on a separate channel. Typically, the client receives multiple segments simultaneously and buffers them until their playback time. Such an approach significantly increases the WAN bandwidth usage for a client and the storage space required for buffering. In this paper we examine how the client resource requirements imposed by PB can be reduced without affecting the scalability of the server I/O bandwidth usage. We assume that a proxy server is placed in each client community to assist the central server by caching part of a video and delivering it directly to the client [6]-[9]. The path between a proxy and a client is contained in a local network characterized by abundant and low-cost bandwidth. Hence, a video stream coming from a proxy does not consume any WAN bandwidth and incurs minimal cost. The proxy I/O bandwidth and storage space are limited; thus, it may be possible to cache only part of a video. The main advantage offered by proxy caching is a reduction of the amount of data transferred over the WAN and, consequently, a reduction of server I/O and WAN bandwidth usage.

(This work is partially supported by NSF Grant EIA-444 and the DTC Intelligent Storage Consortium.)
We examine the influence of caching different video portions at a proxy on the resource requirements and describe periodic broadcast schemes designed to work with proxy caching. Hence, we answer two questions: what part of a video should be cached, and what PB scheme should be used to optimize bandwidth usage and client buffer utilization. We establish that prefix caching is an optimal way to reduce PB-related server I/O bandwidth usage. We also show that prefix caching is an optimal way to reduce the WAN bandwidth usage by a client and derive a lower bound on the bandwidth for a given prefix size. The client buffer space requirement is addressed with chunk caching, i.e., caching chunks of frames distributed throughout the whole video. We show that chunk caching by a proxy can reduce the client buffer space needed to a small value, and present a group of PB schemes designed to work with this type of caching. For each of the resources (server I/O bandwidth, client WAN bandwidth, and client buffer space) we examine how usage optimization of one resource affects the other requirements. All three optimal schemes are designed for CBR video but can also be applied to encoded VBR video. We illustrate the results obtained by applying the bandwidth- and buffer-optimal PB schemes to a group of MPEG-4 encoded videos. The parameters for the PB schemes are obtained through a dynamic programming approach. Since the proxy buffer space is limited, it is important to use it efficiently. Based on the knowledge of the interdependency between the different resources, we formulate an optimization problem for efficient use of the proxy storage space for each of the

three optimal PB schemes and propose heuristic algorithms for solving these problems. The paper is organized as follows. Related work is presented in Section II. In Sections III, IV, and V we cover server I/O bandwidth, client WAN bandwidth, and client buffer space optimization, respectively. A dynamic programming approach for optimizing client bandwidth and buffer space for encoded video is presented in Section VI. In Section VII we formulate the problem of efficient utilization of proxy storage space and propose a heuristic solution. We conclude the paper in Section VIII.

II. RELATED WORK

The main advantage offered by periodic broadcast schemes is the scalability of the server bandwidth usage. Different schemes result in different bandwidth requirements, as well as different requirements on the client side, namely client network and I/O bandwidth and storage space. The minimal server bandwidth achievable for a given service delay has been derived in [4, ] and is equal to ln(L/d + 1) times the playback rate, where L is the video length and d is the delay. The server bandwidth required by Greedy Disk-Conserving Broadcast (GDB) presented in [4] reaches 1.44 times the minimal server bandwidth, while Greedy Equal-Bandwidth Broadcast (GEBB) presented in [ ] can get arbitrarily close to the optimal value. In GDB the transmission rate for each channel is equal to the playback rate, and the goal is to minimize the number of server channels. GEBB also uses the same rate for all channels, but this rate can be different from the playback rate. Intuitively, there is a trade-off between server bandwidth and client resource requirements. This trade-off has been explored in [4]. Video segmentation in GDB is designed to minimize the number of server channels subject to client I/O bandwidth and storage availability. The client I/O bandwidth is expressed as a multiple of the playback rate and accounts for one segment read from disk and played out, and several segments received simultaneously and written to disk.
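The minimal-bandwidth result quoted above can be checked numerically. The sketch below (function name and sample values are mine, not the paper's) evaluates the ln(L/d + 1) bound and GDB's overhead factor log2(e) ≈ 1.44:

```python
import math

def min_pb_bandwidth(video_len, delay, playback_rate=1.0):
    """Minimal periodic-broadcast server bandwidth for a video of length
    `video_len` and service delay `delay`, in playback-rate units."""
    return playback_rate * math.log(video_len / delay + 1.0)

# Example: a 2-hour video with a 30-second acceptable service delay.
L, d = 7200.0, 30.0
print(min_pb_bandwidth(L, d))                        # the ln(L/d + 1) bound
print(min_pb_bandwidth(L, d) * math.log2(math.e))    # roughly what GDB needs
```

For these sample values the bound is only a handful of playback rates even though the delay is tiny relative to the video length, which is the appeal of periodic broadcast.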
It has been shown that as the client I/O bandwidth grows, the storage space and the number of server channels are reduced. Similarly, given sufficient client I/O bandwidth, increasing the storage space allows the number of channels to be reduced. Client bandwidth requirements have also been addressed in [ ]. The authors propose a way to limit the number of segments that the client has to receive simultaneously in Pagoda Broadcast [ ] and Fast Broadcast [3]. The reduction of client bandwidth comes at the price of an increased service delay: in order to use the same amount of server bandwidth, the modified schemes require a service delay larger than the original schemes. A reduction of the number of segments the client has to receive simultaneously is also proposed in [ ], with segmentation based on the Fibonacci number sequence. The number of segments received by the client at any time is limited to only two. The Fibonacci scheme has reception rules similar to GEBB; its server bandwidth usage is higher than required by GEBB, but the client storage requirement is lower. The problem of client bandwidth requirements has also been examined in the context of other video stream sharing techniques, such as stream merging [4]. The bandwidth skimming technique introduced in [ ] is based on the assumption that a video is encoded at a rate slightly lower than the bandwidth available to the client; this bandwidth surplus is used to perform hierarchical stream merging. We explore the possibility of reducing PB-related resource requirements with a proxy server providing a caching service. The goal of our research is to find the optimal server bandwidth, client bandwidth, and client storage space for a given amount of storage space available at the proxy server. Proxy caching allows PB-related service delay to be eliminated, in addition to reducing resource requirements. We analyze the trade-offs between the different resource requirements of proxy-assisted PB schemes.
Proxy-assisted periodic broadcast was introduced in [6], but the main purpose there was to reduce the server bandwidth requirement by caching a number of initial segments of a video. Our approach is to design PB schemes to work with proxy caching in order to reduce resource demands.

III. SERVER BANDWIDTH OPTIMIZATION

Periodic broadcast has been designed to address server resource requirements. A PB scheme that minimizes server bandwidth usage was introduced in [ ]. We examine how proxy support can further lower I/O and network bandwidth usage for the server. The reduction is mainly due to the fact that the part of a video cached by a proxy does not have to be transmitted by the server. The bandwidth-optimal transmission scheme is applied to the remaining part of the video.

A. Server Bandwidth Optimal Scheme

In the bandwidth-optimal scheme [ ] a video is partitioned into a number of segments whose sizes follow a geometric progression. Each segment is transmitted on a separate channel, and the transmission rate is the same for all channels. The client starts receiving all segments simultaneously, and the reception can start at any time, i.e., without waiting for the transmission of the beginning of the first segment. The playback starts as soon as the first segment is completely received, and from that point the total reception rate decreases as subsequent segments are received. Note that each segment is completely received just in time for its playback. Two parameters determine the server bandwidth requirements: the start-up delay, equal to the reception time of the first segment, and the number of segments. The server bandwidth decreases with an increase in the start-up delay, as the transmission is stretched further in time, and with an increase in the number of segments. Prefix caching by a proxy complements the optimal scheme in a natural way: the video prefix is delivered by the proxy to the client and played immediately.
During prefix playback the client also starts receiving the remaining part of the video (the suffix) from the server. By setting the start-up delay for the suffix to be equal to the prefix playback length, we eliminate the start-up delay for the whole video. We examine server bandwidth requirements with prefix caching and show that it is an optimal way to reduce server bandwidth usage. We assume that the size of the portion of

a video cached by the proxy is expressed by b relative to the length of the video, where 0 <= b <= 1. For ease of presentation, and without loss of generality, we also assume that the video length is 1 and that the playback rate is equal to 1.

Fig. 1. Server bandwidth optimal PB with prefix caching

B. Prefix Caching

We first examine the prefix caching scheme, i.e., we consider the case in which the proxy caches a video prefix of size b. The server bandwidth optimal scheme is applied to the suffix of length 1 - b. By choosing the length of the first segment such that its transmission time is equal to the prefix playback time (b), we eliminate the PB-related delay. The client receives the video prefix from the proxy and at the same time starts receiving each of the suffix segments, as illustrated in Figure 1. The playback is shown on the top, with the prefix playback marked with a dashed line, and the reception at the bottom. The shaded area marks the reception of each of the segments within the server transmission schedule. The suffix start-up delay is spent receiving and playing the prefix. The per-channel transmission rate is expressed as (following the derivation in [ ]):

r = (1/b)^(1/n) - 1    (1)

where n is the number of segments; the segment sizes are given by l_i = r*b*(1+r)^(i-1). The per-channel bandwidth decreases with an increase in the number of segments and with an increase in the prefix size. The server bandwidth usage is equal to n*r. The bandwidth lower bound is:

ln(1/b)    (2)

and is reached as the number of channels approaches infinity.

Fig. 2. Server I/O bandwidth with prefix caching (as a function of the acceptable delay, for several prefix sizes b)

Figure 2 presents the server bandwidth as a function of the start-up delay (relative to the video playback time) for various prefix sizes. Recall that the server bandwidth decreases with an increase in the start-up delay; now we also add the influence of the prefix size. The actual start-up delay experienced by the client is equal to the difference between the acceptable delay and the prefix playback time, and is zero if the difference is negative, i.e., if the prefix playback time is longer. We observe that the server bandwidth decreases with an increase in prefix size for a fixed delay. Note that for a delay smaller than the prefix playback time the bandwidth is affected only by the prefix size. Generally, prefix caching reduces server I/O bandwidth usage and start-up delay.

C. Server Bandwidth Optimal Caching

The next question we ask is whether prefix caching is optimal with respect to the server bandwidth. In the general case the cached portion of a video does not have to constitute the prefix, but may be distributed across the whole video in the form of small chunks of frames. A PB scheme that works with chunk caching is designed as follows. In order to eliminate the PB-related delay, we assume that the first chunk is placed as a small prefix. The reception time of the first segment is equal to the playback time of the first cached chunk. The reception time of the second segment is equal to the playback time of the first segment and the first two chunks. Generally, the reception time of the i-th segment is equal to the playback time of the previous i-1 segments and i chunks. Figure 3 illustrates the scheme. The number of segments in the portion of the video not cached by the proxy is equal to the number of cached chunks. Note that prefix caching can be considered a special case of chunk caching with the sizes of all chunks but the first one equal to 0.

Fig. 3. Server bandwidth optimal PB with chunk caching

1) Bandwidth Optimization: Given the PB scheme based on chunk caching, we modify the server bandwidth optimization problem formulation given in [ ] in the following way:

minimize   sum_{i=1}^{n} r_i
subject to l_i = r_i * (sum_{j=1}^{i-1} l_j + sum_{j=1}^{i} c_j), i = 1, ..., n
           sum_{i=1}^{n} c_i <= b
           sum_{i=1}^{n} l_i = 1 - b    (3)

where r_i is the transmission rate for the i-th segment, c_i is the size of the i-th chunk, and l_i is the size of the i-th segment. The goal is to minimize the server bandwidth by selecting the size of each of the cached chunks and the transmission rate for each of the segments of the portion not cached. The sum of the cached chunk sizes cannot exceed b, and the sum of the segment sizes must be equal to the size of the video portion not

cached by the proxy. We also require that the number of segments be equal to n, while the number of chunks can be smaller than or equal to n. We observe that the solution of the problem formulated in this way is obtained with c_1 = b, c_i = 0 for i = 2, ..., n, and the same transmission rate selected for each channel. The first chunk constitutes a prefix of size b. Thus, the optimal server bandwidth is achieved with prefix caching. More precisely, the same optimal value of server bandwidth cannot be reached with another type of caching with the same number of channels. In order to verify this result, we change the problem formulation in the next step to eliminate the solution which relies on one large prefix chunk. We set the size of each of the chunks to b/n and change the first constraint in the problem formulation (3) to l_i = r_i * (sum_{j=1}^{i-1} l_j + i*b/n). Now the set of variables contains only the transmission rates for each channel (and, automatically, the segment sizes). We also relax the requirement that the number of segments be strictly n. The solution obtained for the modified problem tends to combine a number of initial chunks together to form a prefix longer than b/n (the size of a single chunk) but smaller than b. Notice that combining two chunks eliminates one segment from the non-cached portion of the video. Both a longer prefix and a larger number of segments result in a lower server bandwidth; this trade-off between the size of the prefix and the number of segments is explored to find the values of both quantities which result in the minimal server bandwidth. Thus, the number of segments in the solution may be smaller than n, and the server bandwidth requirement is larger than the server bandwidth needed with prefix caching for the same number of segments. Figure 4 presents the server bandwidth as a function of the number of segments, obtained with prefix caching and with fixed-size chunk caching.
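Under the paper's normalization (video length 1, playback rate 1), the prefix-caching rate and its limit are easy to check numerically; this is a sketch with helper names of my choosing:

```python
import math

def per_channel_rate(b, n):
    """Per-channel rate r = (1/b)**(1/n) - 1 for a cached prefix of
    relative size b and n suffix segments."""
    return (1.0 / b) ** (1.0 / n) - 1.0

def server_bandwidth(b, n):
    """Total server bandwidth n*r; it approaches ln(1/b) as n grows."""
    return n * per_channel_rate(b, n)

b = 0.1
for n in (1, 5, 20, 100):
    print(n, round(server_bandwidth(b, n), 4))
print(round(math.log(1.0 / b), 4))   # the lower bound ln(1/b)
```

The geometric segment sizes l_i = r*b*(1+r)**(i-1) sum to exactly 1 - b for any n, so the scheme always covers the suffix while the total bandwidth falls toward the ln(1/b) bound.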
We also observe that different rates are selected for different segments (recall that the server-optimal scheme with prefix caching uses the same rate for all channels), and that the rates are larger for segments with larger indices.

2) Client bandwidth and buffer requirements: The server bandwidth optimization comes at the price of increased requirements at the client; we now examine these requirements. The bandwidth needed by the client is determined by the peak requirement at the beginning of the reception, when all segments are received simultaneously, and is equal to the server bandwidth. The client is required to buffer segments for later playback. The maximum buffer occupancy is reached when the total reception rate becomes equal to the playback rate; after that point, the buffer occupancy decreases, since the total reception rate decreases. Let l be such that after reception of the (l+1)-st segment the total reception rate becomes smaller than the playback rate. Then the client buffer size required for the server-optimal scheme is given by the buffer occupancy just after the playback of the l-th segment (4). We observe that the client buffer requirement decreases with an increase in the number of segments, due to the fact that the reception time from the server increases. The client buffer requirement also decreases with an increase in the prefix size. The minimum buffer size is required at the minimum server bandwidth and amounts to a fixed fraction of the video suffix, and hence of the whole video. The maximum buffer size is required with one segment and is equal to the suffix size, 1 - b.

IV. CLIENT BANDWIDTH OPTIMIZATION

We now examine how proxy caching can help reduce the PB-related client requirements, namely the WAN bandwidth.
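Returning to the client buffer requirement discussed in the previous section: rather than using the closed form, the peak occupancy can be measured with a small time-stepped simulation (entirely my own sketch; it assumes the unit-length, unit-rate normalization):

```python
def peak_buffer(b, n, steps=20000):
    """Peak client buffer occupancy for the server-optimal scheme with
    prefix size b and n suffix segments. Segment i is received at rate r
    during [0, t_i], where t_i is its playback start time; the suffix is
    consumed at rate 1 starting at time b."""
    r = (1.0 / b) ** (1.0 / n) - 1.0
    deadlines, t = [], b
    for _ in range(n):
        deadlines.append(t)
        t += r * t                      # segment length l_i = r * t_i
    peak = 0.0
    for k in range(steps + 1):
        now = k / steps                 # current time in [0, 1]
        received = sum(r * min(now, ti) for ti in deadlines)
        played = max(0.0, now - b)      # suffix data consumed so far
        peak = max(peak, received - played)
    return peak

print(round(peak_buffer(0.2, 1), 3))    # one segment: whole suffix buffered
print(round(peak_buffer(0.2, 10), 3))   # more segments: smaller peak
```

Consistent with the text, the peak equals the suffix size 1 - b when there is a single segment, and it shrinks as the number of segments or the prefix size grows.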
We ask two questions: 1) what part of the video should be cached, and 2) what PB scheme should be used in order to minimize the client WAN bandwidth usage. Given the size of the cached part of the video equal to b, the client bandwidth can intuitively be reduced to 1 - b by stretching the reception from the server in time as much as possible, without introducing any delay. In order to achieve this result, the cached part is received from the proxy and played without delay, i.e., simultaneously with the reception. The remaining part, not cached by the proxy, is received over the playback time of the entire video. We now construct a group of schemes that achieve such a bandwidth reduction. Similarly to the server bandwidth optimization, we first examine PB schemes designed based on proxy prefix caching.

Fig. 4. Server bandwidth comparison for prefix and chunk caching (due to the nonlinearity of the first constraint, the solution was obtained using numerical methods)

Fig. 5. One-channel client bandwidth optimal PB with prefix caching

A. Prefix Caching

In the client-centric PB scheme, one of the parameters controlling the design is the maximum number of segments, m, that the client has to receive simultaneously at any time; n denotes the total number of video segments transmitted by the server.

1) One-Channel Scheme: We start with the most straightforward design, in which the client receives only one segment at a time (m = 1). In order to minimize client bandwidth, we stretch the reception time of each suffix segment as much as possible. The first segment's reception is therefore stretched along the entire prefix playback, and its size is determined by the playback time of the prefix: l_1 = r*b. The second segment is received during the playback time of the first segment, and its size is l_2 = r*l_1. We assume that all segments have the same transmission rate, since equalizing the channel rates minimizes the maximum rate. In order to ensure uninterrupted playback, each segment must be received before the beginning of its playback, as illustrated in Figure 5. Thus, the segment sizes are defined as l_i = r^i * b. Note that the following condition has to be satisfied: sum_{i=1}^{n} l_i = 1 - b. Since b * sum_{i=1}^{n} r^i < b*r/(1-r) for r < 1, the lower bound on the client bandwidth is:

r >= 1 - b    (5)

Therefore, the one-channel scheme can achieve the optimal client bandwidth. Although the bound is reached only with an infinite number of channels, we observe that in practice a relatively small number of channels is sufficient to achieve bandwidth close to the optimal value. Based on the above result, we can also establish that without proxy caching the minimal client bandwidth is equal to 1/(1 + d), where d is the start-up delay relative to the video playback length.

a) Server bandwidth: We now examine how optimizing the client bandwidth requirement affects server bandwidth usage. We explore the server bandwidth dependence on the number of segments and on the client bandwidth. Note that the same number of segments can result from different values of client bandwidth.
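For a finite number of segments, the one-channel rate is the root of b * sum_{i=1}^{n} r^i = 1 - b; a bisection sketch (helper name mine) shows how quickly r approaches the 1 - b bound:

```python
def one_channel_rate(b, n, tol=1e-12):
    """Smallest per-channel rate r such that the n geometric segments
    l_i = b * r**i cover the suffix: b * sum(r**i for i=1..n) = 1 - b."""
    target = 1.0 - b
    lo, hi = 0.0, max(1.0, (1.0 - b) / b)   # coverage is increasing in r
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        cover = b * sum(mid ** i for i in range(1, n + 1))
        if cover < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

b = 0.2
for n in (1, 2, 5, 20):
    print(n, round(one_channel_rate(b, n), 4))
# the rates decrease toward the lower bound 1 - b = 0.8
```

With n = 1 the whole suffix must arrive during the prefix playback (r = (1-b)/b), while already at n = 20 the rate is within about one percent of the bound.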
We first assume that for a given number of segments n, the per-channel rate is selected in such a way that the reception time of the last, n-th, segment is equal to the playback time of the (n-1)-st segment. Hence, the per-channel rate is a solution to the following equation:

b * sum_{i=1}^{n} r^i = 1 - b    (6)

Note that setting r slightly higher than this value does not affect the number of segments, but the reception time is not used efficiently, i.e., the last segment is received earlier than needed.

Fig. 6. Server bandwidth dependence on the number of segments

Figure 6 presents the client bandwidth and the server bandwidth as functions of the number of segments for a fixed prefix size. We observe that the server bandwidth initially decreases and then starts increasing again. This behavior is fairly intuitive: as the number of segments increases linearly, the per-channel rate decreases much more slowly. There exists a number of segments for which the server bandwidth reaches its minimum, and this number is usually small. Given a value of the client bandwidth r, the corresponding server bandwidth is derived as follows. The sum of the segment sizes has to satisfy b * sum_{i=1}^{n} r^i >= 1 - b; the smallest number of segments n satisfying this condition yields the server bandwidth

n * r    (7)

Figure 7 presents the server bandwidth as a function of the client bandwidth. We observe that as the per-channel transmission rate (and client bandwidth) approaches 1 - b, the server bandwidth and the number of segments approach infinity. In practice, the server bandwidth required to bring the client bandwidth close to the optimal value is not excessively high. Figure 9 presents the server bandwidth required to bring the client bandwidth within a small margin of the optimal value (a difference from the optimal value no larger than a small fraction of the playback rate) for different values of the prefix size.
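Given a target client rate r, the corresponding server bandwidth from (7) can be sketched as follows (helper name mine): find the smallest n whose geometric segments cover the suffix; the server then transmits n channels at rate r.

```python
def server_bw_for_client_rate(b, r, max_segments=10**6):
    """Smallest number n of geometric segments l_i = b * r**i covering
    the suffix 1 - b, and the resulting server bandwidth n * r."""
    cover, term, n = 0.0, b, 0
    while cover < 1.0 - b:
        n += 1
        if n > max_segments:
            raise ValueError("r is too close to the 1 - b lower bound")
        term *= r                 # term == b * r**n == l_n
        cover += term
    return n, n * r

b = 0.2
for r in (2.0, 1.2, 1.0, 0.9):
    n, bw = server_bw_for_client_rate(b, r)
    print(r, n, round(bw, 3))
```

Running this reproduces the trend of Figure 7: the server bandwidth first decreases as r falls and then increases again, blowing up as r nears 1 - b, where the number of segments diverges.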
Note that even though the server bandwidth required may be higher than the optimal value, the scalability of PB is preserved, i.e., the server bandwidth usage is independent of the number of clients.

Fig. 7. Server bandwidth dependence on client bandwidth

b) Client buffer size: The buffer space required at the client is equal to the size of the largest suffix segment, since the largest buffer occupancy occurs at the beginning of the playback of that segment. If the client bandwidth is smaller than the playback rate, then the size of the first segment determines the buffer size. The lower bound on the buffer size is achieved with the minimum client bandwidth. The lower bound is largest for a particular client bandwidth, with the prefix equal to a corresponding fraction of the whole video. For smaller client bandwidths the prefix size is larger, reducing buffer requirements; for larger client bandwidths the prefix size is small, resulting in small segment sizes and consequently small buffer requirements. For a given prefix size, the client buffer requirement increases approximately linearly with an increase in client bandwidth and a decrease in the number of segments.

Fig. 8. Two-channel client bandwidth optimal PB with prefix caching

Multi-Channel Scheme: We now examine how increasing the number of channels that the client has to receive simultaneously affects bandwidth requirements. We start with the two-channel case and then formulate the general conclusion. Assume that the client can receive two channels at a time. The segment sizes then satisfy a homogeneous linear recurrence, whose solution is determined by a set of boundary equations. The minimum per-channel transmission rate is determined by equation (8); this minimum value is equal to half of the minimum value obtained in the one-channel case. However, the client has to receive two channels simultaneously, so the minimum client bandwidth ends up being the same as in the one-channel case. For a given value of prefix size, both the one-channel and the two-channel scheme can therefore achieve the same client bandwidth. We observe that the server bandwidth in the two-channel scheme is no larger, and in most cases smaller, than the server bandwidth in the one-channel scheme: for a given value of client bandwidth, the two-channel scheme has half the per-channel transmission rate of the one-channel scheme and in most cases less than double the number of segments.

Fig. 9. Server bandwidth for close to optimal client bandwidth

The results obtained for the two-channel scheme can be generalized to schemes in which the client receives several segments simultaneously. Each of these schemes has the same lower bound on the client bandwidth. However, for a given value of client bandwidth, the server bandwidth decreases, or at least does not increase, as the number of simultaneously received channels grows. Figure 9 presents the server bandwidth for various numbers of channels. We observe that for a small prefix size a considerable gain can be obtained by increasing the number of channels to 3 or 4; a further increase does not yield a significant decrease of server bandwidth. The minimum client buffer requirements are similar for all schemes. For client bandwidth smaller than the playback rate, the maximum buffer occupancy occurs at the beginning of the first suffix segment playback.

B. Client Bandwidth Optimal Caching Scheme

In order to examine what type of caching is optimal for client bandwidth, we formulate the following optimization problem. Similarly to the server bandwidth case, we assume that the cached portion of the video may be distributed in the form of chunks of frames throughout the whole video, and that the client receives one segment at a time. The first segment is received during the playback time of the first chunk. The second segment is received during the playback time of the first segment and the second chunk. Each subsequent segment is received during playback of the previous segment and one cached chunk. The design is presented in the figure below.

Fig. One-channel PB with chunk caching

Our goal is to minimize client bandwidth. In the problem formulation, in addition to the per-channel transmission rate,

also the sizes of all chunks are variables. We assume that the cached portion of the video is divided into chunks, and thus the portion not cached by the proxy is divided into the same number of segments. The problem is to minimize the per-channel transmission rate subject to the reception constraints on the segments and the bound on the total cached size (9).

We find that, similarly to the server bandwidth case, the minimum client bandwidth is obtained when all chunks are combined to form a prefix; the portion not cached by the proxy is still divided into segments. Hence, we conclude that prefix caching is optimal for client bandwidth for a given number of channels. In order to verify this conclusion, we next exclude the prefix solution from consideration by setting all chunks to equal size. We observe that any value of the client bandwidth obtained with prefix caching can also be achieved with equal-size chunk caching, but with a larger number of segments. The figure below presents client bandwidth as a function of the number of segments for both schemes for a fixed size of the cached portion. As the number of segments increases, the difference between the two schemes decreases.

Fig. Client bandwidth comparison for prefix and chunk caching

V. CLIENT BUFFER OPTIMIZATION

PB schemes increase the storage space required at the client significantly. Prefix caching makes it possible to minimize client network bandwidth and eliminate PB-related delay. We show that chunk caching, i.e., proxy caching of chunks of frames distributed throughout the video, can reduce the storage space requirements at the client to a small value. Recall that in a prefix-based PB scheme that minimizes client bandwidth usage, the largest amount of data is accumulated in the client's buffer at the end of the prefix playback (assuming that client bandwidth is lower than the playback rate).
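The comparison between prefix caching and equal-size chunk caching can be reproduced numerically. The sketch below assumes a video of length 1 and playback rate 1, with the per-channel rate r expressed as a fraction of the playback rate; the reception rules follow the scheme described above (each segment received during playback of the preceding segment, plus one cached chunk in the chunk-caching case, the prefix case being all chunks merged up front), and the rate is found by bisection. The function names are illustrative.

```python
def rate_prefix(p, n, tol=1e-9):
    """Per-channel rate for prefix caching: the first segment is received
    during prefix playback, each later segment during playback of the
    previous one, so segment i has size p * r**i.  Find r such that the
    n segments cover the suffix of size 1 - p."""
    def covered(r):
        total, seg = 0.0, p
        for _ in range(n):
            seg *= r
            total += seg
        return total
    lo, hi = 0.0, 10.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if covered(mid) < 1 - p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def rate_chunks(B, n, tol=1e-9):
    """Equal-size chunk caching: n chunks of size b = B/n; segment i is
    received during playback of segment i-1 plus one chunk, so
    l_i = r * (l_{i-1} + b) with l_0 = 0."""
    b = B / n
    def covered(r):
        total, seg = 0.0, 0.0
        for _ in range(n):
            seg = r * (seg + b)
            total += seg
        return total
    lo, hi = 0.0, 10.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if covered(mid) < 1 - B:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

With a cached portion of 0.25 and 10 segments, chunk caching needs a higher per-channel rate than prefix caching; as the number of segments grows, both rates approach the lower bound 1 - B = 0.75 and the gap shrinks, matching the trend in the figure.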
Prefix playback is sustained by data received from the proxy, while all data received from the server during that time is buffered. By distributing cached groups of frames throughout the whole video instead of concentrating them at the beginning, we give the client a chance to drain data from the buffer between the playback intervals of the cached chunks.

A. One-channel Chunk Caching

We now examine the chunk caching scheme in more detail. We start with a simple scheme that requires the client to receive only one segment at a time. The client receives the first segment from the server during the playback of the first chunk. Each subsequent segment is received during the playback of the previous segment and one cached chunk. For simplicity we assume that all chunks are of equal size; the segment sizes are then defined by the resulting recurrence. Summing, the first component of the total represents the sum of the sizes of all PB segments, while the second component represents the sum of the sizes of the cached chunks. From this we obtain the size of the cached portion as a function of the per-channel rate and the number of chunks. Note that, for a given rate, the size of the cached portion decreases with an increase in the number of chunks, and we therefore examine the limit of the size of the cached portion as the number of chunks approaches infinity. The result shows that, for a given per-channel transmission rate, the size of the cached portion of the video is bounded from below; in other words, for a given size of the cached portion there is a smallest achievable client bandwidth. This lower bound is the same as in the prefix caching case. Client bandwidth increases as the chunk size increases and the number of chunks decreases.

a) Client buffer size: Client buffer requirements are determined by the size of the last segment, since the segment sizes are increasing under the stated assumption on the transmission rate. Note that the larger the number of segments, the smaller the size of the largest segment.
Therefore, the client buffer requirements can be made arbitrarily small, independent of the cached portion size, by dividing the video into a sufficiently large number of chunks and segments.
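The chunk-caching relations above can be checked numerically under the same normalized assumptions (video length 1, playback rate 1, per-channel rate r < 1, equal chunks of size b). The closed form below is our own solution of the recurrence l_i = r(l_{i-1} + b), not a formula taken from the original, so treat it as a sketch.

```python
def chunk_segments(r, b, n):
    """Segment sizes from the recurrence l_i = r * (l_{i-1} + b), l_0 = 0."""
    sizes, seg = [], 0.0
    for _ in range(n):
        seg = r * (seg + b)
        sizes.append(seg)
    return sizes

def chunk_segment_closed(r, b, i):
    """Closed form of the same recurrence for r < 1 (derived here):
    l_i = (r * b / (1 - r)) * (1 - r**i)."""
    return (r * b / (1 - r)) * (1 - r ** i)
```

Two properties from the text fall out directly: with the cached portion B fixed, the largest (last) segment shrinks as the number of chunks n grows, since b = B/n and l_n < r b/(1 - r); and the total segment mass approaches r B/(1 - r) as n grows, so covering the suffix 1 - B forces B >= 1 - r, i.e., the smallest achievable client bandwidth for a cached portion B is the same 1 - B bound as for prefix caching.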

b) Server bandwidth: The server bandwidth required for the chunk caching scheme is the product of the number of segments and the per-channel rate, where the rate is the solution of the segment-size equation. For a given size of the cached portion, the server bandwidth generally increases with the number of chunks and segments. More precisely, we observe the same trend as for the prefix caching scheme, i.e., the server bandwidth initially decreases and then increases as the number of segments increases, and as the client bandwidth approaches its optimal value, the server bandwidth approaches infinity.

c) Comparison with prefix caching: In order to estimate the cost of lowering buffer requirements, we compare the resource requirements of the chunk caching scheme with those of the prefix caching scheme. Panel (a) of the comparison figure shows the client buffer requirement as a function of the cached portion size for both schemes, for client bandwidth close to the optimal value. Recall that the prefix caching scheme has the highest lower bound on the client buffer size at a particular prefix size. We observe that chunk caching reduces buffer requirements considerably. Results are shown for two different chunk sizes. Panel (b) presents the corresponding server bandwidth. We observe that the server bandwidth with chunk caching is generally larger for large sizes of the cached portion of the video, and that the smaller chunk size results in a larger server bandwidth than the larger chunk size. We also observe that, for a given chunk size, the largest server bandwidth is reached at a particular size of the cached portion. Note that there exists a size of the cached portion for which both schemes have similar server bandwidth requirements while chunk caching has smaller client buffer requirements. For example, the server bandwidth usage is similar for prefix caching and for chunk caching with the smaller chunk size at a certain cached portion size.
At the same time, the client buffer size required with chunk caching is considerably lower than with prefix caching, and the difference in client bandwidth is minimal.

B. Multi-channel Chunk Caching Scheme

Similarly to the prefix caching case, increasing the number of channels that the client has to read from simultaneously decreases the server bandwidth requirements. Also as in the prefix caching scheme, the reduction of server bandwidth diminishes as the number of channels increases.

C. Client Buffer Optimal Caching Scheme

In order to find the optimal sizes of chunks and their distribution throughout the video for a fixed number of segments, we formulate the client buffer optimization problem. The problem has the same constraints as the client bandwidth optimization problem (9), but the goal is to minimize the size of the largest segment, as the one determining the client buffer size.

Fig. Comparison between one-channel PB requirements with prefix and chunk caching: (a) client buffer, (b) server bandwidth

Intuitively, a way to minimize the maximum segment size is to equalize the segment sizes, and the solution obtained using numerical methods confirms this intuition. With equal segment sizes, the first segment is received during the playback of the first chunk, while each consecutive segment is received during the playback time of one chunk and the preceding segment. Thus, the first chunk is larger, while the subsequent chunks are of equal size. The corresponding per-channel rate is obtained by solving the resulting equation (6).

The rate has to satisfy a feasibility condition; hence, for a given size of the cached portion, the number of segments must be at least a minimum value (7). For a smaller number of segments, the smallest buffer requirement is obtained with prefix caching, i.e., with all chunks combined into a single prefix: an attempt to use more than one chunk decreases the size of the first segment, increases the subsequent segments, and increases the buffer requirement. For fewer segments than this minimum, equalizing the segment sizes requires transmission rates that differ from one channel to another; the first channel rate is then the largest and determines the client bandwidth requirement. The conclusion is as follows: for a number of segments smaller than the minimum, the smallest buffer is obtained with prefix caching and equal-size segments. For a number of segments larger than or equal to the minimum, it is possible to equalize segment sizes with chunk caching, which carries the additional advantage of equalizing the transmission rates of all channels and consequently minimizing the client bandwidth requirement. Note that equalizing the rates in the first case increases buffer requirements, as segment sizes and transmission rates cannot both be equalized at the same time.

We illustrate chunk caching with a fixed number of channels in Figure 3. The smallest number of segments for which the buffer optimal scheme uses more than one chunk depends on the cached portion size; below that value prefix caching is used. Figure 3(a) presents the client buffer space requirement as a function of the number of segments. We observe that the buffer requirement decreases with the number of segments faster for chunk caching than for prefix caching. On the other hand, the bandwidth requirement shows the opposite trend, as presented in Figure 3(b). Prefix caching with equal segment sizes has a higher client bandwidth than prefix caching with equal transmission rates.
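The equal-segment construction can be sketched in closed form under the normalized assumptions used above (video length 1, playback rate 1, cached portion B, n segments). The expressions for the rate and the chunk sizes below are our reconstruction from the reception rules described in the text, not the original equations (6) and (7); they are consistent with the constraint that the chunk sizes sum to B.

```python
import math

def buffer_optimal(B, n):
    """Equal segment sizes l = (1 - B)/n.  The first segment is received
    during playback of the first chunk (l = r * b_1); each later segment
    during playback of one chunk plus the preceding segment
    (l = r * (l + b_i)).  Solving sum(b_i) = B gives the common rate r
    and the chunk sizes; the first chunk comes out larger."""
    l = (1 - B) / n
    r = (1 - B) / (B + (n - 1) * l)
    b_first = l / r
    b_rest = l / r - l
    return r, [b_first] + [b_rest] * (n - 1)

def min_segments(B):
    """Chunk sizes stay non-negative only when r <= 1, which works out to
    n >= (1 - B)/B; below this, prefix caching is used instead."""
    return math.ceil((1 - B) / B)
```

For B = 0.2 this gives a minimum of 4 segments; at exactly the minimum the rate reaches 1 and all chunks except the first degenerate to zero, i.e., the scheme collapses to prefix caching, matching the discussion above.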
For a number of segments larger than or equal to the minimum, prefix caching with equal segment sizes has a higher client bandwidth requirement than prefix caching with equal rates, and a smaller one than chunk caching.

VI. PERIODIC BROADCAST FOR VBR VIDEO

Periodic broadcast with proxy caching as presented so far can also be applied to compressed video whose transmission rate is not CBR. Such an application is possible because each segment has to be received in full before its playback; hence, the transmission rate of a segment does not have to match the playback rate, and a video can be delivered through periodic broadcast at a constant rate. A dynamic programming method was constructed in [] to obtain a solution to the server bandwidth optimization for encoded video. The pseudocode is presented in Figure 4; its inputs are the size of each frame, the number of frames, the start-up delay, and the consumption rate (frames/s). The main table entry denotes the minimum server bandwidth for a given number of leading frames and a given number of segments, while a companion table records the starting position of each segment. The algorithm can be applied to the suffix of a video, with the start-up delay set to the playback time of the prefix. A similar method can be used to solve the client bandwidth optimization problem for the one-channel scheme; in that case the table entry records the largest per-segment rate for a given number of leading frames of the suffix and a given number of segments. Both algorithms have the same complexity. Dynamic programming can also be used to compute minimal client buffer requirements.

Fig. 3. Caching with a fixed number of channels: (a) client buffer, (b) client bandwidth
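The server bandwidth dynamic program can be sketched as follows. This is a plausible reconstruction rather than the original pseudocode of Figure 4: it assumes that a segment covering frames k+1..i can be received during the start-up delay d plus the playback time of the first k frames at consumption rate F, so that it needs rate S(k+1, i)/(d + k/F), and that the server bandwidth is the sum of the per-segment rates. The table name W and the interface are ours.

```python
def min_server_bandwidth(frames, max_segments, d, F):
    """DP sketch for server-bandwidth-optimal segmentation of a VBR video.
    frames[i] -- size of frame i (e.g. in bits)
    d         -- start-up delay in seconds (prefix playback time, d > 0)
    F         -- consumption rate in frames per second
    W[i][j] is the minimum total server bandwidth for the first i frames
    split into j segments; the last segment, frames k+1..i, must arrive
    within d + k/F seconds of broadcast, hence its rate below."""
    N = len(frames)
    pre = [0.0] * (N + 1)                      # prefix sums of frame sizes
    for i in range(N):
        pre[i + 1] = pre[i] + frames[i]
    INF = float("inf")
    W = [[INF] * (max_segments + 1) for _ in range(N + 1)]
    W[0][0] = 0.0
    for j in range(1, max_segments + 1):
        for i in range(1, N + 1):
            best = INF
            for k in range(j - 1, i):          # last segment is frames k+1..i
                if W[k][j - 1] == INF:
                    continue
                rate = (pre[i] - pre[k]) / (d + k / F)
                best = min(best, W[k][j - 1] + rate)
            W[i][j] = best
    return min(W[N][j] for j in range(1, max_segments + 1))
```

For unit-size frames with d = 1 s and F = 1 frame/s, a single segment needs rate 10; allowing two segments lets the DP trade a small early segment for a cheap large late one, lowering the total.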
We devise a dynamic programming algorithm under the assumption that all segments have the same size, equal to the size of the part of the video not cached by the proxy divided by the number of segments. The goal is to choose the chunk sizes so that the client bandwidth is minimized. The table entry denotes the minimum client bandwidth requirement with a given number of equal-size segments and a given size of the cached video

portion. A companion table records the starting position of each segment. We introduce two more variables, the starting position of each chunk and the ending position of each segment, together with the targeted segment size. The pseudocode for the client buffer optimization is presented in a separate figure. A helper function determines the number of frames whose cumulative size is as close to the targeted segment size as possible but no larger. The granularity chosen for the size of the cached portion can be selected to control the complexity of the algorithm.

Fig. 4. Server Bandwidth Optimization with Dynamic Programming

In order to illustrate the resource optimization results, we apply the dynamic programming methods to mpeg-4 encodings of several movies [7] at three different levels of quality. All movies have the same length and are encoded at the same frame rate. We assume that the proxy caches a video portion equal to a smaller fraction of the video length in the first case and a larger fraction in the second case. Due to the complexity of the dynamic programming algorithms, the computations are performed at the granularity of groups of frames. The part of the video not cached by the proxy is divided into segments. Table I shows the statistics for each movie and each quality level. Table II presents the results of the server bandwidth, client bandwidth and client buffer optimization. The rates are specified in Mbps and sizes in MB. The three movies differ in frame size variability: Silence of the Lambs shows the highest variability (Figure 6(a)), Jurassic Park I (Figure 6(b)) has medium variability, and Star Wars IV (Figure 6(c)) exhibits the lowest variability of the three. Generally, the client bandwidth optimal scheme equalizes rates across all segments, while the buffer optimal scheme equalizes segment sizes. The frame size variability affects the buffer requirements of the bandwidth optimal schemes.
We observe that with the smaller cached fraction the client has to buffer a substantial part of Silence of the Lambs and noticeably less of the other two videos. The buffer requirements of the client bandwidth optimal scheme are similar to those of the server bandwidth optimal scheme for Silence of the Lambs, but lower for the other two movies. For the smaller cached fraction, the bandwidth requirements are similar for the client bandwidth optimal scheme and the buffer optimal scheme; the difference increases with frame size variability. With the larger cached fraction, the differences between the bandwidth requirements of these two schemes are more pronounced, i.e., the buffer optimal scheme has higher client and server bandwidth requirements, but the reduction of the buffer requirement is larger. For one of the cached fractions the buffer optimal scheme chooses a prefix to cache, i.e., all of the ten chunks but the first one have size zero; for the other there are multiple chunks of frames, not necessarily consecutive, of non-zero size spread through the video. The buffer optimal scheme based on chunk caching offers a nice alternative to the two bandwidth optimal schemes: its client bandwidth requirement lies between the client bandwidths of the two bandwidth optimal schemes, the same holds for its server bandwidth usage, and the buffer space required is the smallest of all three schemes.

Fig. Client Buffer Optimization with Dynamic Programming
However, a proxy has limited storage space, which makes it impossible to cache a whole video and limits the size

TABLE I. MOVIE STATISTICS: for each movie (Silence of the Lambs, Jurassic Park I, Star Wars IV) and quality level (low, medium, high), the table lists size (MB), mean rate (Mbps), peak rate (Mbps), frame size standard deviation, and max(f)-min(f) (B).

TABLE II. BANDWIDTH AND BUFFER REQUIREMENTS FOR ENCODED VIDEOS: for each movie and cached portion size, the table lists bandwidth and buffer for the server bandwidth optimal scheme, and client bandwidth, server bandwidth and buffer for the client bandwidth optimal and client buffer optimal schemes.

Fig. 6. Per-frame playback rate for high quality mpeg-4 encoding: (a) Silence of the Lambs, (b) Jurassic Park I, (c) Star Wars IV

of the cached portion. Thus, it is important to use the proxy storage space efficiently. Given the size of the video portion cached by the proxy, the bandwidth and buffer space requirements still depend on the number of segments in the portion which is not cached. Hence, the problem is to choose, for each video, the size of the cached portion and the number of segments delivered by the server in such a way that resource usage is optimized with an efficient use of the proxy storage space.

Fig. 7. Server bandwidth optimal solution: (a) video sizes, (b) playback rate, (c) prefix sizes

The general strategy we use to reduce the complexity of the problem is to decouple the choice of the cached portion size from the selection of the number of channels. We first partition the proxy buffer space among all videos assuming an infinite number of channels for the video part delivered by the server; recall that such an assumption yields the minimum resource usage for a given size of the cached portion. Next, we choose the number of segments to be used. This number is chosen to minimize the usage of one of the three resources subject to the availability of the other two. The relation between server I/O bandwidth, client WAN bandwidth and client buffer depends on the PB scheme, and we use knowledge of this relation to determine all values needed for an optimal video transmission. We consider one scheme for each resource: server bandwidth optimal PB with prefix caching for server bandwidth, and one-channel PB with prefix caching and with chunk caching for client bandwidth and client buffer space optimization, respectively.

A. Server I/O Bandwidth

We first address the server I/O bandwidth optimization problem with proxy caching.
Given a set of videos, we want to select the prefix size for each video and the number of segments for each suffix so that the aggregate server I/O bandwidth usage is minimized. Recall that the lower bound on the bandwidth is reached in the server bandwidth optimal scheme with an infinite number of suffix segments; thus, one of the constraints of the problem must set a limit on this number. The client bandwidth is equal to the server bandwidth, and the client buffer space required decreases with a decrease in the bandwidth; thus, neither the client bandwidth nor the available client buffer space limits the number of segments. We assume instead that there is a certain overhead related to maintaining a transmission channel for each segment, and this overhead is used to set the upper limit on the number of channels. More formally, the problem is to minimize the aggregate server I/O bandwidth over the prefix sizes and the numbers of suffix segments, subject to three constraints: the first accounts for the limited proxy buffer size, the second for the total number of channels that the server can maintain, and the third ensures that each video has a non-zero prefix cached by the proxy. In order to simplify the problem, we first consider an asymptotic case in which the minimum server bandwidth is determined by the prefix size and obtained with an infinite number of segments; in this way we eliminate the variables representing the number of suffix segments of each video. These assumptions result in a simplified formulation in which the prefix sizes are the only variables. Intuitively, a longer prefix should be selected for a video with a higher playback rate than for one with a lower rate, and a longer video should also have a longer prefix.
However, the influence of the video length on the prefix choice is smaller, due to the logarithmic function applied to the video length. Thus, an approximate solution can be obtained heuristically by choosing for each video a prefix of length proportional to its playback rate. If a computed prefix size exceeds the video length, the difference is distributed among the remaining videos, resulting in a weighted max-min fair allocation of the proxy buffer space. The solution obtained for the above problem then has to be adjusted to find the number
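The heuristic just described can be sketched as a water-filling allocation. The function below is an illustrative implementation (the name and interface are ours): proxy space is assigned in proportion to playback rate, any share exceeding a video's full length is capped, and the surplus is redistributed among the remaining videos.

```python
def allocate_prefixes(rates, lengths, space):
    """Weighted max-min sharing of proxy space: each video gets prefix
    space in proportion to its playback rate; a video whose share would
    exceed its full length is capped at that length, and the surplus is
    redistributed among the still-active videos."""
    prefix = [0.0] * len(rates)
    active = set(range(len(rates)))
    remaining = space
    while active and remaining > 1e-12:
        wsum = sum(rates[i] for i in active)
        share = {i: remaining * rates[i] / wsum for i in active}
        capped = [i for i in active if prefix[i] + share[i] >= lengths[i]]
        if not capped:
            # No video hits its length: hand out the proportional shares.
            for i in active:
                prefix[i] += share[i]
            remaining = 0.0
        else:
            # Cap the saturated videos and redistribute the leftover.
            for i in capped:
                remaining -= lengths[i] - prefix[i]
                prefix[i] = lengths[i]
                active.remove(i)
    return prefix
```

For example, with playback-rate weights [1, 2, 1], video lengths [10, 10, 0.5] and 4 units of proxy space, the short third video is capped at its full length and the rest of the space is split 1:2 between the other two, as in weighted max-min fair sharing.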