Stretch-Optimal Scheduling for On-Demand Data Broadcasts

Similar documents
Stretch-Optimal Scheduling for On-Demand Data Broadcasts

Volume 3, Issue 9, September 2013 International Journal of Advanced Research in Computer Science and Software Engineering

Evaluation of a Broadcast Scheduling Algorithm

Scheduling On-Demand Broadcast Items

Periodic Scheduling in On-Demand Broadcast System

Pull vs Push: A Quantitative Comparison for Data Broadcast

Data Dissemination Techniques in Mobile Computing Environment

Competitive Analysis of On-line Algorithms for On-demand Data Broadcast Scheduling

Pull vs. Hybrid: Comparing Scheduling Algorithms for Asymmetric Time-Constrained Environments

Dynamic Broadcast Scheduling in DDBMS

On Improving the Performance of Cache Invalidation in Mobile Environments

SLA-Aware Adaptive Data Broadcasting in Wireless Environments. Adrian Daniel Popescu

A Generalized Target-Driven Cache Replacement Policy for Mobile Environments

A Cost-Efficient Scheduling Algorithm of On-Demand Broadcasts

An Optimal Cache Replacement Policy for Wireless Data Dissemination under Cache Consistency

A Simulation-Based Analysis of Scheduling Policies for Multimedia Servers

Project Report, CS 862 Quasi-Consistency and Caching with Broadcast Disks

Efficient Remote Data Access in a Mobile Computing Environment

THE falling cost of both communication and mobile

Hybrid Cooperative Caching in a Mobile Environment

Data Indexing for Heterogeneous Multiple Broadcast Channel

An Adaptive Query Processing Method according to System Environments in Database Broadcasting Systems

On Scheduling Vehicle-Roadside Data Access

Energy-Conserving Access Protocols for Transmitting Data in Unicast and Broadcast Mode

Balancing the Tradeoffs between Data Accessibility and Query Delay in Ad Hoc Networks Λ

Energy-Efficient Mobile Cache Invalidation

CPET 565/CPET 499 Mobile Computing Systems. Lecture 8. Data Dissemination and Management. 2 of 3

Process Scheduling. Copyright : University of Illinois CS 241 Staff

Improving VoD System Efficiency with Multicast and Caching

Analysis of a Cyclic Multicast Proxy Server Architecture

A DYNAMIC RESOURCE ALLOCATION STRATEGY FOR SATELLITE COMMUNICATIONS. Eytan Modiano MIT LIDS Cambridge, MA

Lecture Topics. Announcements. Today: Uniprocessor Scheduling (Stallings, chapter ) Next: Advanced Scheduling (Stallings, chapter

Performance of Multihop Communications Using Logical Topologies on Optical Torus Networks

SLA-Aware Adaptive On-Demand Data Broadcasting in Wireless Environments

Process- Concept &Process Scheduling OPERATING SYSTEMS

Scheduling of processes

Pervasive data access in wireless and mobile computing environments

Performance Evaluation of a Wireless Hierarchical Data Dissemination System

Chapter 6: CPU Scheduling. Operating System Concepts 9 th Edition

CPU THREAD PRIORITIZATION USING A DYNAMIC QUANTUM TIME ROUND-ROBIN ALGORITHM

6 Distributed data management I Hashing

Predicting connection quality in peer-to-peer real-time video streaming systems

Web-based Energy-efficient Cache Invalidation in Wireless Mobile Environment

Coding and Scheduling for Efficient Loss-Resilient Data Broadcasting

A Self-Adaptive Scheduling Algorithm of On-Demand Broadcasts

A Hybrid Data Delivery Method of Data Broadcasting and On-demand Wireless Communication

AODV-PA: AODV with Path Accumulation

Journal of Electronics and Communication Engineering & Technology (JECET)

Chapter 6: CPU Scheduling. Operating System Concepts 9 th Edition

Enhancement of the CBT Multicast Routing Protocol

Uniprocessor Scheduling. Aim of Scheduling

Uniprocessor Scheduling. Aim of Scheduling. Types of Scheduling. Long-Term Scheduling. Chapter 9. Response time Throughput Processor efficiency

Example: CPU-bound process that would run for 100 quanta continuously 1, 2, 4, 8, 16, 32, 64 (only 37 required for last run) Needs only 7 swaps

CPU Scheduling. CSE 2431: Introduction to Operating Systems Reading: Chapter 6, [OSC] (except Sections )

Properties of Processes

Delay-minimal Transmission for Energy Constrained Wireless Communications

Reduction of Periodic Broadcast Resource Requirements with Proxy Caching

Communication using Multiple Wireless Interfaces

Cache-Miss-Initiated Prefetch in Mobile Environments

A Modified DRR-Based Non-real-time Service Scheduling Scheme in Wireless Metropolitan Networks

THE CACHE REPLACEMENT POLICY AND ITS SIMULATION RESULTS

CS370 Operating Systems

CHAPTER 5 PROPAGATION DELAY

Start of Lecture: February 10, Chapter 6: Scheduling

CPU Scheduling. Daniel Mosse. (Most slides are from Sherif Khattab and Silberschatz, Galvin and Gagne 2013)

Disk Scheduling COMPSCI 386

Episode 5. Scheduling and Traffic Management

Application Layer Multicast Algorithm

Performance Analysis of Cell Switching Management Scheme in Wireless Packet Communications

Autonomous Transaction Processing Using Data Dependency in Mobile Environments Λ

Course Syllabus. Operating Systems

Fig. 1. Superframe structure in IEEE

Lightweight caching strategy for wireless content delivery networks

Low Latency via Redundancy

Uniprocessor Scheduling. Chapter 9

Ricardo Rocha. Department of Computer Science Faculty of Sciences University of Porto

8th Slide Set Operating Systems

Scheduling of Multiple Applications in Wireless Sensor Networks Using Knowledge of Applications and Network

Operating Systems. Scheduling

Data Access on Wireless Broadcast Channels using Keywords

Heuristic Algorithms for Multiconstrained Quality-of-Service Routing

Data Storage in Sensor Networks for Multi-dimensional Range Queries

Scheduling. The Basics

Using Hybrid Algorithm in Wireless Ad-Hoc Networks: Reducing the Number of Transmissions

Genetic-Algorithm-Based Construction of Load-Balanced CDSs in Wireless Sensor Networks

Module 17: "Interconnection Networks" Lecture 37: "Introduction to Routers" Interconnection Networks. Fundamentals. Latency and bandwidth

COMP 3361: Operating Systems 1 Final Exam Winter 2009

MODELING OF SMART GRID TRAFFICS USING NON- PREEMPTIVE PRIORITY QUEUES

ERR Broadcast Scheduling Algorithm

Chapter 7 CONCLUSION

IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, VOL. 54, NO. 2, MARCH

Shaking Service Requests in Peer-to-Peer Video Systems

Input/Output Management

CPU Scheduling. The scheduling problem: When do we make decision? - Have K jobs ready to run - Have N 1 CPUs - Which jobs to assign to which CPU(s)

Improving Channel Scanning Procedures for WLAN Handoffs 1

CPU Scheduling. Operating Systems (Fall/Winter 2018) Yajin Zhou ( Zhejiang University

Chapter 9. Uniprocessor Scheduling

Volume 2, Issue 4, April 2014 International Journal of Advance Research in Computer Science and Management Studies

OPERATING SYSTEMS CS3502 Spring Processor Scheduling. Chapter 5

A Routing Protocol for Utilizing Multiple Channels in Multi-Hop Wireless Networks with a Single Transceiver

Transcription:

COMBINATORIAL OPTIMIZATION IN COMMUNICATION NETWORKS Maggie Cheng, Yingshu Li and Ding-Zhu Du (Eds.) pp. 622-637 c 2005 Springer Science + Business Media Stretch-Optimal Scheduling for On-Demand Data Broadcasts Yiqiong Wu, Jing Zhao, Min Shao, and Guohong Cao Department of Computer Science & Engineering The Pennsylvania State University, PA 16802 E-mail: ywu,jizhao,mshao,gcao @cse.psu.edu 1 Introduction For dissemination-based applications, unicast data delivery as used by the current web servers may not be scalable. With unicast, a data item must be transmitted for each client that have made the request, and hence the load on the server and the network increases with the number of clients. Broadcasting techniques have good scalability since a single broadcast response can potentially satisfy many clients. Recent technology advances in satellite networks, cable networks, wireless LANs and cellular networks make it possible to disseminate information through broadcasting. A key consideration in designing broadcast systems is the algorithm used to schedule the broadcast. Scheduling algorithms have been extensively studied in the context of operating systems [11]. Traditional scheduling algorithms such as first come first served (FCFS), round robin, shortest job first, are used in processor scheduling and disk scheduling. These scheduling algorithms are used to optimize the response time, throughput, or fairness. Since most of these scheduling algorithms are designed for the point-to-point communication environments, they may not be applicable to the broadcasting environments. A number of researchers addressed issues on broadcast schedule. For example, Wong [14] studied several scheduling algorithms such as first-come-first served (FCFS), longest wait time (LWT), most requests first (MRF) in broadcasting environments. Researches in [5, 8] investigated techniques to index the

Stretch-Optimal Scheduling for On-Demand Data Broadcasts 623 broadcast data to save power in mobile environments. Su and Tassiulas [12] formulated the broadcast scheduling as a dynamic optimization problem and proposed efficient suboptimal solutions which can achieve a mean access latency close to the lower bound. Hameed and Vaidya [7] applied existing fair queueing algorithms to broadcast scheduling. Most of the research [2, 6, 12] on broadcast scheduling focus on push-based broadcasting; that is, the server delivers data using a periodic broadcast program based on precompiled access profiles and typically, without any active user intervention. The application of such schemes are limited since they are not capable of adapting to the dynamic user requirements. There is another kind of broadcasting: on-demand (pull-based) broadcasting, where a large and dynamic client population request data from an information server which broadcasts data to the clients based on these requests. Aksoy and Franklin [3] initiated the study of on-demand broadcast scheduling. They proposed a scheduling algorithm which provides balanced treatment of both hot and cold data resulting in a good overall performance. The algorithm combines MRF and FCFS, and uses a novel pruning technique to reduce the computation overhead. However, the algorithm assumes that each data item has the same data size. Hence, it is not suitable for requests with variable data size, because response time alone is not a fair measurement for requests with different data size. Acharya and Muthukrishnan [1] addressed the broadcast scheduling problem in heterogeneous environments, where the data have different sizes. The solution is based on a new metric called, defined as the ratio of the response time of a request to its service time. Based on stretch, they proposed a scheduling algorithm, called longest total stretch first () to optimize the stretch and achieve a performance balance between worst cases and average cases. A straightforward implementation of is not practical for a large system, because the server has to recalculate the total stretch of each data item with pending requests at each broadcast interval to decide which data to broadcast next. In this chapter, we propose a stretch optimal scheduling algorithm for ondemand broadcast systems based on the observation that is not optimal in terms of the overall stretch. One nice property of the proposed algorithm is that it is extremely simple and the computation overhead is significantly lower compared to. Analytical studies as well as detailed simulation experiments are used to verify these claims. Simulation results show that our algorithm can significantly reduce the overall stretch and it still has a comparative worst-case system response time compared to existing scheduling algorithms. The rest of the chapter is organized as follows. Section 2 develops necessary background. In section 3, we present the stretch optimal scheduling

624 Y. Wu, J. Zhao, M. Shao and G. Cao algorithm, develop analytical models to demonstrate its optimality in terms of stretch, and address some implementation issues. Section 4 evaluates the performance of the proposed algorithm. Section 5 concludes the chapter. 2 Preliminaries In this section, we describe our system model, the performance metrics and the related scheduling algorithms. 2.1 The System Model Broadcast Channels Server Database Broadcast Scheduler Pull Manager client 1 client 2... client n On-Demand Channels Figure 1: An on-demand broadcast system Figure 1 illustrates a broadcast environment. which consists of a number of clients. These clients use uplink channels to send requests to the server, which uses a downlink (broadcast) channel to broadcast the responses. In the context of Hughes DirecPC architecture [13], the uplink channel would be phone lines from the clients to the DirecPC office and the downlink would be the satellite channel. In cellular networks that provide general packet radio service (GPRS) [9] service, the uplink would be the packet random access channel, and the downlink would be the packet broadcast channel. Similar to previous work on broadcast scheduling [3, 12, 14], we make several assumptions. There is a single broadcast channel monitored by all clients and the channel is fully dedicated to data broadcast. When a client needs a data item that it cannot find locally, it sends a request for the data to the server. These requests are queued up at the server upon arrival. Based on the scheduling algorithm, the server repeatedly chooses an item based on these requests,

Stretch-Optimal Scheduling for On-Demand Data Broadcasts 625 broadcasts it over the satellite link, and removes the associated requests form the queue. Clients continuously monitor the broadcast channel after having made the requests. We do not consider the effects of transmission errors. After an item is broadcast, it is received by all clients that are waiting for it. Unlike some previous work [3, 12, 14], we do not assume that the data items have equal sizes. 2.2 Performance Metrics One popular performance metric is the response time of a request, i.e., the time between sending a request and receiving the reply by a client. In heterogeneous settings, the response time is not a fair measure given that individual requests significantly differ from one another in their service time. In this chapter, we adopt an alternate performance measure, namely the stretch [1, 16, 15] of a request, defined to be the ratio of the response time of a request to its service time. Note that the service time is the time to complete the request if it were the only job in the system. The rationale for this choice is based on our intuition; i.e., clients with larger jobs should be expected to be in the system longer than those with smaller requests. The drawback of minimizing response time for heterogeneous workloads is that it tends to improve the system performance of large jobs (since they contribute the most to the response time). Minimizing stretch, on the other hand, is more fair for all job sizes. We use the average stretch to measure the overall system performance. Since it is important to ensure that the scheduling algorithm does not starve the requests of unpopular data, we also measure the worst-case wait time, which is the maximum amount of time that a client request waits in the service queue before being served. To make a scheduling decision, there is a decision overhead. When the database size is very large, the decision overhead may be significant. As a result, some broadcast bandwidth may be wasted since the server spends a large amount of time on making scheduling decisions. Thus, we also measure the decision overhead. 2.3 Related Scheduling Algorithms Some of the scheduling algorithms related to our work are listed as follows and are used for comparison. First Come First Served (FCFS): Broadcasts the data items in the order they are requested. To avoid redundant broadcasts, requests for data items that already have entries in the queue are ignored.

9 < 9 626 Y. Wu, J. Zhao, M. Shao and G. Cao Longest Waiting Time (LWT): Selects the data for which the total waiting time of pending requests is the largest, i.e., sum of all request s waiting time. Most Request First (MRF): The data items are broadcast in the order of the number of pending requests. The item with the most number of pending requests is chosen for broadcast. Shortest Service Time First (SSTF): The data items are broadcast in the order of the data service time. The data service time is relative to the data size. The data item with the smallest size is chosen for broadcast. Requests times Wait ( ): The data items are broadcast in the order of the value of function, where represents the number of pending requests and represents the first unserved request waiting time. The item with the largest is chosen for broadcast. The algorithm is a combination of MRF and FCFS in a way that ensures good scalability and low decision overhead. Largest Total Stretch First (): The data items are broadcast in the order of the total current stretch, i.e., the sum of the current stretches of all pending requests for the data. The data item with the longest total current stretch is chosen for broadcast. 3 The Stretch Optimal Scheduling Algorithm In this section, we first use an example to show that the algorithm is not optimal in terms of the overall stretch. Then, we propose a stretch optimal scheduling algorithm and discuss some implementation issues. 3.1 An Example In, the server maintains a queue for each data item. Since there may be multiple requests for the same item, let! #" denote the arrival time of the $&%(' request of item. ) denote the data size of item, and * denote the broadcast bandwidth. Suppose the current time is. Broadcasting item at time has the following stretch. +,.-/02143 6587:9 :;=< >@? 6" > (1)

G G 9 < 9 C ; C ; G G 9 < o o Stretch-Optimal Scheduling for On-Demand Data Broadcasts 627 The algorithm selects the data item with the maximum stretch (based on Equation 1) to broadcast. Although it is intuitively correct in a point-point communication environment, it may not be optimal in an on-demand broadcasting environment. For example, suppose a database has two data items such that )AB1DC, )6EF1HG. Assume that the request queues of these items are: AI1J,KA " AL1MN, IE1J,KA " EO1QP&RSE " EO1QTN. For simplicity, let *U1VC. At time W1XCY, the server needs to decide which item to broadcast first. If the algorithm is used: CY\;]C? M Z[-C#021 1]^ CY\;bG? P CY\;bG? T _+.-`G0a1 As a result, the second item is broadcast first, and then the first item is broadcast at \1JC,G (broadcasting the second item needs 2 time units). Then, the overall stretch is: Pc; C,Gc;]C? M However, if the first item is broadcast first, and the second item is broadcast at e1dcfc. Then, the overall stretch is: ^g; CfCa;G? P 1dCY CfCa;G? T Thus, the algorithm cannot minimize the overall stretch. 3.2 The Stretch Optimal Scheduling Algorithm Before presenting our stretch optimal scheduling algorithm, we first introduce a broadcast scheduling function for item : h-/0i1 C 3 6587j9 ) E (2) The scheduling algorithm, which selects the data item with maximum g-/0 (Equation 2) to broadcast, can minimize the overall stretch. Proof. We give the sketch of the proof. Based on Equation 1, for any two data items and k, the scheduling algorithm should determine to broadcast which one first. If the server broadcasts item first, the overall Stretch is: +,l(mgn 1 3 6587:9 :;=< >? 6" > ; 3 65f7po 1]T 1]P j;=< > ; < >? 6" n >

E o < > o o < > 9 9 A 628 Y. Wu, J. Zhao, M. Shao and G. Cao Similarly, if item k is broadcast first, the overall Stretch is: f, n!mq 1 3 6587ro :; < >? 6" n ; 3 #5f7:9 j; < > ; < >s? 6" We should minimize the overall stretch. Let _+ (mgn? f, n!mq 1 3 6587po ), ) n? 3 6587:9 To minimize the overall stretch, if Z (mgn? _+ ntmqvu Y, item should be broadcast first; otherwise, item k should be broadcast first. Thus, in order to broadcast item first, f, wmcn? ntmq[u YL1.x 3 6587po ) )!n uy3 #5f7:9 ) n )# ) n ) 1jx 3 65f7po C ) E n uz3 6587:9 (3) < 9 has the An example: Suppose a database has three data items such To minimize the overall stretch, item is broadcast when {6587:9 maximum value. that )Aa1dC, )6Ec1]G, and )6}c1]^. Assume that the request queues of these items are: A 1=, A " A 1MN, E 1~, A " E 1T&RS E " E 1 T&RS } " E 1T&RS " E 1 T&RS " E 1TN, I}L1,KA " }L1=G&RSE " }1=TN. At time h1jcy, the server needs to decide which item to broadcast first. If Theorem 3.2 is followed, since g-c#0a1~c, h-`g0i1, and h-`^0ƒ1, the data broadcasting order is 2-1-3, which has an average stretch of 1.81. The scheduling order of other algorithms and their average stretch costs are listed in Table 1. C ) E Table 1: An example Scheduling algorithm Broadcasting Order Average Stretch SSTF 1-2-3 1.875 1-3-2 2.62 Our 2-1-3 1.81 MRF 2-3-1 2.1 FCFS 3-1-2 2.94 LWT, RxW, 3-2-1 2.88

Stretch-Optimal Scheduling for On-Demand Data Broadcasts 629 The algorithm: Based on Equation 2, the overall stretch can be optimized. However, due to the removal of the time parameter in Equation 2, some cold data items may never be served, and the algorithm may suffer from the starvation problem. To address this problem, we add a scheduling deadline rule. When the first request for a data item arrives, a scheduling deadline is assigned to this pending request. When the deadline passes, the data item has the highest priority to be broadcast. The stretch optimal algorithm is as follows. 1. Scan the request queue of each data item to check if any request has reached the scheduling deadline. If the request for item has reached the scheduling deadline, let 1d and go to Step 3 directly. If multiple entries have passed their scheduling deadline, use their h-/0 to break the tie. 2. Scan the request queue of each data item to find the data item with maximum g-/0, and set B1. If multiple entries have the same h-/0, use their request arrival time to break the tie. 3. Broadcast item. Even with the tie breaking rules, it is possible that multiple candidates have passed their scheduling deadline, and they have the same value. In this case, the server just randomly picks one. We observe some interesting properties of the stretch optimal scheduling algorithm. For two requested data items with the same number of pending requests, the one with smaller data size should be chosen for broadcast. Hence, if there are similar number of pending requests for each data item, the proposed algorithm has similar performance to SSTF. When the requested data items have the same size, the one with more pending requests should be chosen for broadcast. Hence, when the data sizes are similar, the proposed algorithm has similar performance to MRF. 3.3 Implementation Issues To implement the stretch optimal scheduling algorithm, the server maintains two service queue data structures: a S-Heap, and a D-Link, which are shown in Figure 2. The service queue structure contains a single entry for each data item that has outstanding requests. In addition to an ˆŠ, each entry contains, g-/0,

9 630 Y. Wu, J. Zhao, M. Shao and G. Cao and C,)# Œ L which is the arrival time of the oldest outstanding request for the data item. When a request arrives at the server, a hash lookup is performed on the ˆŠ of the requested data item. If an entry already exists, the value of that entry is simply increased by A 1. The server also updates the S-Heap, which is < constructed based on the value of function h-/0. If no entry is found (i.e., there is currently no outstanding request for the data item), a new entry is created with the S value initialized as A < 9 and C,)# Œ L initialized to the current time. Then, the server inserts the new S value to S-Heap and adds the new C,), \ L value to the end of D-Link, which is sorted based on C,)# Œ L of each data item. To make a broadcast scheduling decision, the server first checks the head of the D-Link to see whether the scheduling deadline has been passed. If the deadline has been passed, it deletes the head of the D-Link, removes the entry from the service queue, and removes the corresponding node from the S-Heap. Otherwise, it removes the root of the S-Heap, removes the corresponding entry from the service queue, and removes the corresponding node from the D-Link. The S-Heap or D-link operation (remove or insert) takes Ž - w f a 0, where is number of nodes in the heap (at most equal to the number of data items in the database). Thus, the scheduling decision overhead is Ž - w f i 0. Compared to the decision overhead of, which is Ž - 0, our scheduling algorithm can significantly reduce the decision overhead. Service queue data structures ID 1stARV S 1 4 5 clock = 10 S-Heap S(i) value 5 Root 2 3 4 7 1 3 0.5 4 1 Head of D-Link End 4 3 1 2 1 0.5 5 6 7 4.5 9 9.3 3 1 2 Root of S - Heap ( 1st Arv. ) 1 3 D-Link 4 4.5 7 9 9.3 prev next left right Head End Figure 2: The service queue data structure 1 To reduce the computation overhead, a variable can be used to represent needs to be calculated once., and only

A Stretch-Optimal Scheduling for On-Demand Data Broadcasts 631 4 Performance Evaluation 4.1 Simulation Model We developed a simulation model written in CSIM [10]. The model represents an environment similar to that described in Section 2.1. Each client is simulated by a process which generates a stream of queries. Clients send requests to the server whenever the query is generated. The aggregated stream of requests, generated by the clients, follows a Poisson process with mean š. The density function for the inter-arrival times of accessing item is: -/S021]š 8œr where š 1Ÿž š, ž denotes the probability of clients requesting for data item. The access pattern follows cwž distribution (similar assumptions are made by other researchers as well [4, 7, 17]). In the g(ž distribution, the access probability of the +%(' data item is represented as follows. j1 C { nt A where Y«ª ª~C, is the database size. When 1 C, it is the strict Zipf distribution. When W1]Y, it becomes the uniform distribution. Large results in more skewed access distribution. The data item size varies from )Š to )#ā K±, and has the following two types of distributions. Random: The distribution of data sizes falls randomly between ) and ), ² ±. Zipf distribution favors data items with small sequence numbers. Since the data size is random, the data size and the access distribution do not have correlation. Increase: The size () ) of the data item ( ) grows linearly as increases; i.e. ) 1]), ; -/? C#0 <+³µŚ œ < ³ 9. In order words, the data item with œ smaller size will be accessed more frequently than bigger ones. 9 % A n 2. Most of the system parameters and their default values are listed in Table 4.2 The Effects of the Access Pattern Figure 3 shows the average stretch as a function of the access skew coefficient. As can be seen, our algorithm has the lowest average stretch. Since

A 632 Y. Wu, J. Zhao, M. Shao and G. Cao Table 2: Default System Parameters Database items 2000 items Number of clients 200 Broadcast bandwidth (* ) 115Kbps The minimum data item size (j ) 1KB The data item size ratio 1V¹ ³µŚ ¹ ³ 9 100 Mean query generate rate š A ºSº Zipf distribution parameter 0.6 < Average Stretch 140 120 100 80 60 40 20 0 0 0.2 0.4 0.6 0.8 1 Theta(Access pattern) FCFS LWT MRF SSTF RxW our Average Stretch 50 45 40 35 our (with deadline=2000s) 30 our (with deadline=3000s) our (with deadline=4000s) our (with deadline=5000s) our (without deadline) 25 0 0.2 0.4 0.6 0.8 1 Theta (Access Pattern) Figure 3: The average stretch under different access patterns (random distribution) also considers stretch when making scheduling decisions, it has similar performance to ours. Since FCFS, LWT, MRF, and Q do not consider stretch in broadcast scheduling, their average stretches are much higher than our algorithm and the algorithm. Note that SSTF has similar average stretch to our algorithm, since it broadcasts data in the order of data service time, which reflects the essence of the stretch-based algorithms. Generally speaking, in order to reduce the average stretch, the server should try to schedule data items with small data size and high popularity. When 1Y, the access pattern is uniformly distributed, and data items have similar popularity. As a result, the data size is the major factor. Since FCFS, LWT, MRF, and B Œ do not consider data size, they have much worse performance compared to our algorithm. The SSTF, however, considers the data size as major factor when making scheduling decisions, and it has similar average stretch to our algorithm when O1Y. When 1QC, the access pattern is much

Stretch-Optimal Scheduling for On-Demand Data Broadcasts 633 Average Stretch 160 140 120 100 80 60 40 20 0 0 0.2 0.4 0.6 0.8 1 Theta(Access pattern) FCFS LWT MRF SSTF RxW our Average Stretch 50 45 40 35 30 25 our (with deadline=2000) our (with deadline=3000) our (with deadline=4000) 20 our (with deadline=5000) our (without deadline) 15 0 0.2 0.4 0.6 0.8 1 Theta (Access Pattern) Figure 4: The average stretch under different access patterns (increase distribution) more skewed, and the data popularity becomes a major performance factor. Since FCFS, LWT, MRF, and \ consider the data popularity when making scheduling decisions, the performance difference between these algorithms and our algorithm becomes much smaller as L1dC. The SSTF algorithm, however, does not consider the popularity when making scheduling decisions, and its average stretch becomes much worse than our algorithm when L1dC. The right graph of Figure 3 also shows the performance of our algorithm with different deadlines. It is easy to see that the average stretch reduces as the scheduling deadline increases. As expected, the worst-case waiting time decreases as the scheduling deadline increases (as shown in Figure 5). As shown in Figure 5, our algorithm has a comparative worst-case waiting time even without scheduling deadline. When the scheduling deadline is 2000, it has the best worst-case waiting time compared to other scheduling algorithms. From Figure 3 and Figure 4, we can see that the difference between random distribution and the increase distribution is not that significant. Due to space limit, we only show results of the random distribution in the following sections. 4.3 The Effects of the System Work Load Figure 6 shows the average stretch as a function of the mean query generate rate, the database size, and the number of clients. Three figures have similar trends; that is, as the work load (in terms of query generate rate, database size, or number of clients) increases, the average stretch also increases. From the figure, we can see that our algorithm outperforms other scheduling algorithms in various scenarios.

634 Y. Wu, J. Zhao, M. Shao and G. Cao 5500 5500 Worst Case Wait Time (Seconds) 5000 4500 4000 3500 3000 FCFS LWT our 2500 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 Theta (Access Pattern) Worst Case Wait Time (Seconds) 5000 4500 4000 3500 3000 our (with deadline=2000s) our (with deadline=3000s) our (with deadline=4000s) 2500 our (with deadline=5000s) our (without deadline) 2000 0 0.2 0.4 0.6 0.8 1 Theta (Access Pattern) Figure 5: The worst waiting time under different access patterns Average Stretch 160 140 120 100 80 60 40 FCFS LWT MRF SSTF RxW our Average Stretch 160 140 120 100 80 60 FCFS LWT MRF SSTF RxW our Average Stretch 110 100 90 80 70 60 50 FCFS LWT MRF SSTF RxW our 20 40 40 0 0.008 0.01 0.012 0.014 0.016 0.018 0.02 Mean query generate rate 20 1000 10000 Database Size 30 100 150 200 250 300 The number of clients Figure 6: The average stretch under different system work load 4.4 The Scheduling Overhead Figure 7 shows the scheduling overhead of the algorithm and our algorithm. In, the server has to recalculate the total stretch for every data item with pending requests in order to decide which data item to broadcast next, and hence the scheduling algorithm becomes a bottleneck due to its high computation overhead. As explained in Section 3.3, by using heaps, the computation complexity of our algorithm can be reduced to Ž - w f i 0. Results from Figure 7 also verifies that our algorithm can significantly reduce the computation overhead compared to the algorithm. Moreover, in the algorithm, the computation overhead increases linearly when the database size or the number of clients increases, whereas the computation overhead in our algorithm increases much slower, and then our algorithm is more scalable compared to the algorithm.

Stretch-Optimal Scheduling for On-Demand Data Broadcasts 635 1000 1000 1000 Average Number of Computation 100 10 our (without deadline) our (with deadline=2000s) our (with deadline=5000s) Average Number of Computation 100 10 our (without deadline) our (with deadline=2000s) our (with deadline=5000s) Average Number of Computation 100 10 our (without deadline) our (with deadline=2000s) our (with deadline=5000s) 1 0.008 0.01 0.012 0.014 0.016 0.018 0.02 Mean query generate rate (1/Seconds) 1 1000 10000 Database Size 1 100 150 200 250 300 The number of clients Figure 7: The scheduling overhead under different system work load. 5 Conclusions This chapter has addressed the challenges of large-scale on-demand data broadcasts introduced by broadcast media such as satellite networks, cable networks, wireless LANs and cellular networks. In such environments, the scheduling problem is different from that in the point-to-point communication environment or the push-based broadcast environment. Moreover, when variable-sized heterogeneous requests are considered, most of the previous scheduling algorithm fail to perform well. As stretch is widely adopted as a performance metric for variable-size data requests, we proposed a broadcast scheduling algorithm to optimize the system performance in terms of stretch. One nice property of the proposed algorithm is that it is extremely simple and the computation overhead is very low. Analytical results described the intrinsic behavior of the algorithm. Simulation results demonstrated that our algorithm significantly outperforms existing scheduling algorithms under various scenarios. References [1] S. Acharya and S. Muthukrishnan. Scheduling On-Demand Broadcasts: New Metrics and Algorithms. ACM MobiCom 98, pages 43 54, Oct. 1998. [2] S. Acharya, R. Alonso, M. Franklin, and S. Zdonik. Broadcast disks: Data Management for Asymmetric Communication Environments. Proc. ACM SIGMOD, pages 199 210, May 1995. [3] D. Aksoy and M. Franklin. RxW: A Scheduling Approach for Large- Scale On-Demand Data Broadcast. IEEE/ACM Transactions on Networking, 7(6), Dec. 1999.

636 Y. Wu, J. Zhao, M. Shao and G. Cao [4] M. Ammar and J. Wong. On the Optimality of Cyclic Transmission in Teletext Systems. IEEE Transactions on Communications, pages 8 87, Jan. 1987. [5] A. Datta, D. Vandermeer, A. Celik, and V. Kumar. Broadcast Protocols to Support Efficient Retrieval from Databases by Mobile Users. ACM Transactions on Database Systems, 24(1):1 79, March 1999. [6] H. Dykeman, M. H. Ammar, and J. Wong. Scheduling Algorithms for Videotext Systems under Broadcast Delivery. In Proc. International Conference of Communications, pages 1847 1851, 1996. [7] S. Hameed and N. Vaidya. Efficient Algorithms for Scheduling Data Broadcast. ACM/Baltzer Wireless Networks (WINET), pages 183 193, May 1999. [8] T. Imielinski, S. Viswanathan, and B. Badrinath. Data on Air: Organization and Access. IEEE Transactions on Knowledge and Data Engineering, 9(3):353 372, May/June 1997. [9] R. Kalden, I. Meirick, and M. Meyer. Wireless Internet Access Based on GPRS. IEEE Personal Communications, pages 8 18, Apr. 2000. [10] H. D. Schwetman. DSIM: A C-Based Process Oriented Simulation Language. Proc. Winter Simulation Conf., pages 387 396, 1986. [11] A. Silberschatz and P. Galvin. Operating System Concepts. Addison Wesley, 1997. [12] C. Su and L. Tassiulas. Broadcast Scheduling for Information Distribution. IEEE INFOCOM, pages 107 117, 1997. [13] Hughs Network Systems. DirecPC Home Page. http://www.direcpc.com, 2000. [14] J.W. Wong. Broadcast Delivery. Proceedings of the IEEE, 76(12), Dec. 1999. [15] X. Wu and V. Lee. Preemptive Maximum Stretch Optimization Scheduling for Wireless On-Demand Data Broadcast. Proc. of Int l Database Engineering and Applications Symposium (IDEAS), 2004. [16] Y. Wu and G. Cao. Stretch-Optimal Scheduling for On-Demand Data Broadcasts. IEEE International Conference on Computer Communications and Networks, pages 500 504, Oct. 2001.

Stretch-Optimal Scheduling for On-Demand Data Broadcasts 637 [17] L. Yin and G. Cao. Adaptive Power-Aware Prefetch in Wireless Networks. IEEE Transactions on Wireless Communication, 3(5):1648 1658, September 2004.