Dynamic Load Balancing for I/O- and Memory-Intensive Workloads in Clusters Using a Feedback Control Mechanism


Xiao Qin, Hong Jiang, Yifeng Zhu, David R. Swanson
Department of Computer Science and Engineering
University of Nebraska-Lincoln, Lincoln, NE, USA
Email: {xqin, jiang, yzhu, dswanson}@cse.unl.edu

Abstract. One common assumption of existing load-balancing models is that the weights of resources and the I/O buffer size are statically configured. Although a static configuration of these parameters performs well in a cluster whose workload can be predicted, it performs poorly in dynamic systems where the workload is unknown. In this paper, a new feedback control mechanism is proposed to improve the overall performance of a cluster under I/O-intensive and memory-intensive workloads. The mechanism dynamically adjusts the resource weights as well as the I/O buffer size. Results from a trace-driven simulation show that this mechanism is effective in enhancing the performance of a number of existing load-balancing schemes.

(This paper appears in the proceedings of the 9th International Euro-Par Conference on Parallel Processing (Euro-Par 2003), Klagenfurt, Austria, Aug. 26-29, 2003.)

1 Introduction

Dynamic load-balancing schemes in a cluster can improve system performance by attempting to assign work, at run time, to machines with idle or under-utilized resources. Several distributed load-balancing schemes have been presented in the literature, primarily considering CPU [1], memory [2], or a combination of CPU and memory [3]. Although these load-balancing policies have been very effective in increasing the utilization of resources in distributed systems, they have ignored one type of resource, namely disk I/O. The impact of disk I/O on overall system performance is becoming significant as more and more jobs with high I/O demand run on clusters. We have therefore developed two load-balancing algorithms to improve the read and write performance of a parallel file system [4][5]. Very recently, Surdeanu et al. [6] developed a load-balancing model that considers I/O, CPU, and memory resources simultaneously. In this model, three resource weights were introduced to reflect the significance of the resources in a cluster and, as an assumption, the weights of system resources are statically configured.

To relax this assumption, which does not hold for clusters with dynamic workloads, this paper proposes a feedback control mechanism that judiciously configures the resource weights in accordance with the dynamic workload. Moreover, the new mechanism is able to improve the buffer utilization of each node in the cluster when its workload is I/O- and/or memory-intensive.

2 Adaptive Load Balancing Scheme

2.1 Weighted Average Load-Balancing Scheme

We consider feedback control in a cluster M = {M_1, ..., M_n} connected by a high-speed network. Since jobs may be delayed by sharing resources with other jobs or by being migrated to remote nodes, the slowdown imposed on a job j is defined as:

    slowdown(j) = time_WALL(j) / (time_CPU(j) + time_IO(j)),    (1)

where time_WALL(j) is the total time the job spends running, accessing I/O, waiting, or migrating, and time_CPU(j) and time_IO(j) are the times spent by j on CPU and I/O, respectively, without any resource sharing.

For a newly arrived job j at a node i, load-balancing schemes attempt to ship it to the remote node with the lightest load if node i is heavily loaded; otherwise job j is admitted into node i and executed locally. We propose a weighted-average load-balancing scheme, WAL-FC, in which a feedback control mechanism is incorporated. Each job is described by its requirements for CPU, memory, and I/O, measured in seconds, MBytes, and number of I/O accesses per ms, respectively. For a newly arrived job j at a node i, WAL-FC balances the system load in five steps. First, the load of node i is updated by adding job j's load. Second, a migration is initiated if node i is overloaded. Third, the candidate node k with the lowest load is chosen; to avoid useless migrations, the load discrepancy between nodes i and k has to be greater than the load induced by job j, and if no such candidate is available, no migration is carried out. Fourth, WAL-FC determines whether migrating job j can potentially reduce the job's slowdown. Finally, job j is migrated to the remote node k, and the loads of nodes i and k are updated.

WAL-FC calculates the weighted average load index of each node i, defined as the weighted average of CPU and I/O load:

    load_WAL(i) = W_CPU · load_CPU(i) + W_IO · load_IO(i),    (2)

where load_CPU(i) is the CPU load, defined as the number of running jobs, and load_IO(i) is the I/O load, defined as the sum of the implicit and explicit I/O loads contributed by the jobs assigned to node i. Note that the memory load is expressed by the implicit I/O load imposed by page faults.
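To make the five-step procedure concrete, the following is a minimal Python sketch of WAL-FC's dispatch decision under equation (2). The node data structures, the overload threshold, the per-job load estimate, and the step-four predicate are all illustrative assumptions; the paper specifies the steps, not an implementation.

    # Minimal sketch of WAL-FC's dispatch decision (Section 2.1, Eq. 2).
    # Node dicts, threshold, and the step-4 predicate are assumptions.

    def load_wal(node, w_cpu, w_io):
        """Weighted average load index: W_CPU*load_CPU + W_IO*load_IO."""
        return w_cpu * node["load_cpu"] + w_io * node["load_io"]

    def migration_reduces_slowdown(job, src, dst):
        # Placeholder for step 4: a real estimate would compare job j's
        # expected slowdown on src against dst plus the migration cost.
        return True

    def wal_fc_dispatch(job, local, remotes, w_cpu, w_io, threshold):
        # Step 1: add the new job's load to the local node.
        local["load_cpu"] += 1
        local["load_io"] += job["load_io"]
        # Step 2: consider migration only if the local node is overloaded.
        if load_wal(local, w_cpu, w_io) <= threshold:
            return local
        # Step 3: pick the candidate with the lowest load index; require
        # the load gap to exceed the load job j itself induces.
        k = min(remotes, key=lambda n: load_wal(n, w_cpu, w_io))
        gap = load_wal(local, w_cpu, w_io) - load_wal(k, w_cpu, w_io)
        job_load = w_cpu * 1 + w_io * job["load_io"]   # assumed estimate
        if gap <= job_load:
            return local
        # Step 4: migrate only if it can reduce job j's slowdown.
        if not migration_reduces_slowdown(job, local, k):
            return local
        # Step 5: ship the job and update the loads of both nodes.
        local["load_cpu"] -= 1
        local["load_io"] -= job["load_io"]
        k["load_cpu"] += 1
        k["load_io"] += job["load_io"]
        return k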

Let l_page(i, j) and l_IO(i, j) denote the implicit and explicit I/O loads, respectively, of job j assigned to node i. Then load_IO(i) is defined as:

    load_IO(i) = Σ_{j ∈ M_i} l_page(i, j) + Σ_{j ∈ M_i} l_IO(i, j).    (3)

When a node's available memory space is larger than or equal to the memory demand, no implicit I/O load is imposed on the disk. Conversely, when the memory space of a node cannot meet the memory requirements, page faults may lead to a high implicit I/O load.

2.2 A Feedback Control Mechanism

To retain high performance, a feedback control mechanism is employed to automatically adjust the weights of resources. A high-level view of the architecture of the mechanism is presented in Fig. 1. The architecture comprises a load-balancing scheme, a resource-sharing controller, and a feedback controller. The slowdown of a newly completed job and the history of slowdowns are fed back to the feedback controller, which then determines the required control action ΔW_IO (ΔW_IO > 0 means the I/O weight needs to be increased; otherwise it should be decreased). Since the sum of W_CPU and W_IO is 1, the control action ΔW_CPU is obtained accordingly.

[Fig. 1. Architecture of the feedback control mechanism: newly arrived jobs enter the load-balancing scheme; running jobs share CPU, memory, and I/O under the resource-sharing controller; the slowdown of each completed job, together with the slowdown history, drives the feedback controller, which issues ΔW_IO and ΔW_CPU.]

The feedback controller manipulates W_CPU and W_IO in three steps. First, the controller calculates the slowdown s_j of the newly completed job j. Second, s_j is stored in the history table, which reflects the pattern of recent slowdowns, and the average value s_avg of the slowdowns in the table is computed. Finally, the performance is considered improved if s_avg > s_j, so W_IO is increased if it was increased by the previous control action and decreased otherwise. Similarly, s_avg < s_j means that the performance has worsened, so W_IO has to be increased if the previous control action reduced W_IO, and vice versa.
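A minimal sketch of this adjustment rule follows. The fixed step size, the history length, and the clamping of W_IO to [0, 1] are assumed details that the paper does not specify.

    from collections import deque

    class WeightController:
        """Hill-climbing controller for W_IO (Section 2.2).

        Keeps moving W_IO in the same direction while a completed job's
        slowdown beats the recent average, and reverses direction when
        performance worsens. Step size and history length are assumptions.
        """

        def __init__(self, w_io=0.5, step=0.05, history_len=20):
            self.w_io = w_io
            self.step = step
            self.history = deque(maxlen=history_len)
            self.last_direction = +1          # +1: last action raised W_IO

        @property
        def w_cpu(self):
            return 1.0 - self.w_io            # W_CPU + W_IO = 1

        def on_job_completion(self, s_j):
            if self.history:
                s_avg = sum(self.history) / len(self.history)
                if s_avg > s_j:               # improved: keep direction
                    direction = self.last_direction
                elif s_avg < s_j:             # worsened: reverse direction
                    direction = -self.last_direction
                else:
                    direction = 0
                if direction != 0:
                    self.w_io = min(1.0, max(0.0, self.w_io + direction * self.step))
                    self.last_direction = direction
            self.history.append(s_j)

The same rule, applied to a buffer-size variable instead of W_IO, yields the Δbuf_size action described next.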

Besides configuring the weights, the feedback control mechanism is used to dynamically manipulate the buffer size of each node. The main goals are (1) to improve the buffer utilization and the buffer hit rate, and (2) to reduce the number of page faults. In Fig. 1, the feedback controller generates the action Δbuf_size in addition to ΔW_CPU and ΔW_IO. Specifically, s_avg > s_j means the performance was improved by the previous control action, so the buffer size is increased if it was increased by the previous control action and reduced otherwise. Likewise, s_avg < s_j indicates that the latest buffer control action led to worse performance, implying that the buffer size has to be increased if the previous control action reduced it, and vice versa.

3 Experiments and Results

To evaluate the proposed scheme, we compare the performance of the following load-balancing policies:

(1) CM: the CPU-memory-based policy [3] without the buffer feedback controller (BFC).
(2) IO: the I/O-based policy without BFC; its load index represents only the I/O load.
(3) WAL: the weighted average load-balancing scheme without BFC [6].
(4) CM-FC: the CPU-memory-based policy with BFC incorporated.
(5) IO-FC: the I/O-based policy with BFC applied.
(6) WAL-FC: the weighted average load-balancing scheme with BFC.

In addition, the above schemes are compared with two baselines: the non-load-balancing policy without BFC (NLB) and the non-load-balancing policy that employs BFC (NLB-FC).

3.1 Simulation Model

To study dynamic load balancing, Harchol-Balter and Downey [1] implemented a simulator for a distributed system with six nodes. Zhang et al. [3] extended this simulator by incorporating memory resources. We have modified these simulators further, implementing a simple disk model and an I/O buffer model. The traces used in the simulation are derived from [1][3], and the I/O access rate of each job is chosen randomly from a uniform distribution. The simulated system is configured with the parameters listed in Fig. 2. Disk accesses of each job are modeled as a Poisson process. Data sizes of the I/O requests in each job are randomly generated from a Gamma distribution with a mean of 250 KByte and a standard deviation of 50 KByte. Since the buffer can be used to reduce the disk I/O access frequency, we approximate the buffer hit probability for job j running on node i as follows:

    hit_rate(i, j) = r_j / (r_j + 1)                                  if d_buf(i, j) ≥ d_data(j),
    hit_rate(i, j) = (r_j / (r_j + 1)) · (d_buf(i, j) / d_data(j))    otherwise,    (4)

where r_j is the data re-access rate, d_buf(i, j) is the buffer size allocated to job j, and d_data(j) is the amount of data job j retrieves from or stores to the disk, given a buffer of infinite size.
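Equation (4) transcribes directly into code; the sketch below uses the equation's own parameter names, and the numbers in the example call are made up.

    def hit_rate(r_j, d_buf, d_data):
        """Buffer hit probability of Eq. (4).

        r_j    -- data re-access rate of job j
        d_buf  -- buffer size allocated to job j on node i
        d_data -- data job j would move to/from disk given an infinite buffer
        """
        base = r_j / (r_j + 1)
        if d_buf >= d_data:
            return base
        # Buffer too small: the hit rate scales with the fraction of the
        # job's data set that fits in its buffer share.
        return base * (d_buf / d_data)

    # Made-up example: re-access rate 3, 100 MByte buffer share,
    # 250 MByte data set -> 0.75 * 0.4 = 0.3.
    print(hit_rate(3, 100, 250))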

    Parameter             Value        Parameter                  Value
    CPU speed             800 MIPS     Page fault service time    8.1 ms
    RAM size              640 MByte    Seek and rotation time     8.0 ms
    Initial buffer size   160 MByte    Disk transfer rate         40 MB/s
    Context switch time   0.1 ms       Network bandwidth          1 Gbps

Fig. 2. Data characteristics of the simulated system

The buffer is a resource shared by multiple jobs in a node, and the buffer size a job can obtain in node i at run time depends on the jobs' I/O access rates and the mean data size of their I/O accesses. Figure 3 shows the effect of buffer size on the buffer hit probabilities of the NLB, CM, and IO policies. When the buffer size is smaller than 150 MByte, the buffer hit probability increases almost linearly with the buffer size. The rate of increase drops once the buffer size exceeds 150 MByte, suggesting that further enlarging the buffer cannot significantly improve the hit probability when the buffer approaches a size at which a large portion of the I/O data can be accommodated in it.

[Fig. 3. Buffer hit probability as a function of buffer size (10-450 MByte) for the NLB, CM, and IO policies, at data re-access rates R = 3 and R = 5; page-fault rate 4.0 No./ms, I/O access rate 2.2 No./ms.]

[Fig. 4. Mean slowdowns of NLB, NLB-FC, CM, CM-FC, IO, IO-FC, WAL, and WAL-FC as a function of the page-fault rate (7.2 to 8.4 No./ms), at an I/O access rate of 1.5 No./ms.]

3.2 Memory- and I/O-Intensive Workload

This section examines an interesting case in which the workload is both memory- and I/O-intensive. The I/O access rate is set to 1.5 No./ms, and the page-fault rate ranges from 7.2 No./ms to 8.4 No./ms.
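For illustration, this sketch samples one job's disk-access trace under the workload model of Section 3.1 at the I/O access rate used here. The Gamma shape 25 and scale 10 KByte follow from the stated mean (shape · scale = 250 KByte) and standard deviation (√shape · scale = 50 KByte); the seed and duration are arbitrary.

    import numpy as np

    rng = np.random.default_rng(seed=42)

    def generate_io_trace(duration_ms, io_rate_per_ms):
        """Sample one job's disk accesses: Poisson arrivals (exponential
        inter-arrival times with mean 1/rate) and Gamma-distributed
        request sizes with mean 250 KByte and std 50 KByte."""
        times, t = [], 0.0
        while True:
            t += rng.exponential(1.0 / io_rate_per_ms)
            if t >= duration_ms:
                break
            times.append(t)
        sizes_kb = rng.gamma(shape=25.0, scale=10.0, size=len(times))
        return np.array(times), sizes_kb

    # One second of a job at the I/O access rate used in Section 3.2.
    times, sizes = generate_io_trace(duration_ms=1000.0, io_rate_per_ms=1.5)
    print(len(times), round(sizes.mean()))   # ~1500 accesses, mean ~250 KByte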

Not surprisingly, Figure 4 shows that the load-balancing schemes achieve significant improvements. Moreover, the performances of CM, IO, and WAL are close to one another, and the performance of CM-FC is similar to those of IO-FC and WAL-FC. This is because the trace comprises both memory-intensive and I/O-intensive jobs: while CM and CM-FC take advantage of balancing CPU-memory load, IO and IO-FC enjoy the benefits of balancing I/O load.

A second observation is that, under the memory- and I/O-intensive workload, the CM-based policies achieve a higher level of improvement over NLB than the IO-based schemes. The reason is that when memory and I/O demands are high, the buffer sizes in the cluster are unlikely to change, as there is memory contention between memory-intensive and I/O-intensive jobs. Thus, the buffer sizes eventually converge to a value that minimizes the mean slowdown.

Third, the results show that the feedback control mechanism can further improve the performance of the load-balancing schemes. For example, WAL-FC reduces the slowdown of WAL by up to 27.3%, and IO-FC reduces the slowdown of IO by up to 24.7%. This result suggests that, to sustain high performance in clusters, combining a feedback controller with an appropriate load-balancing policy is desirable and strongly recommended.

4 Conclusion

In this paper, we have proposed a feedback control mechanism to dynamically adjust the weights of resources and the buffer sizes in a cluster. The proposed algorithm minimizes the number of page faults for memory-intensive jobs while improving the buffer utilization of I/O-intensive jobs. A trace-driven simulation demonstrates that our approach is effective in enhancing the performance of existing dynamic load-balancing algorithms. Future work in this area is twofold. First, we plan to study the stability of the proposed feedback controller. Second, we will study how quickly the controller converges to the optimal value.

5 Acknowledgments

This work was partially supported by an NSF grant (EPS-0091900), a Nebraska University Foundation grant (26-0511-0019), and a UNL Academic Program Priorities Grant. Work was completed using the Research Computing Facility at the University of Nebraska-Lincoln.

References

1. Harchol-Balter, M., Downey, A.: Exploiting process lifetime distributions for load balancing. ACM Transactions on Computer Systems 15 (1997) 253-285

2. Acharya, A., Setia, S.: Availability and utility of idle memory in workstation clusters. In: Proceedings of the ACM SIGMETRICS Conf. on Measuring and Modeling of Computer Systems (1999)
3. Zhang, X., Qu, Y., Xiao, L.: Improving distributed workload performance by sharing both CPU and memory resources. In: Proceedings of the 20th Int'l Conf. on Distributed Computing Systems (2000)
4. Zhu, Y., Jiang, H., Qin, X., Feng, D., Swanson, D.: Improved read performance in a cost-effective, fault-tolerant parallel virtual file system (CEFT-PVFS). In: Proc. of the 3rd IEEE/ACM Int'l Symp. on Cluster Computing and the Grid (2003) 730-735
5. Zhu, Y., Jiang, H., Qin, X., Feng, D., Swanson, D.: Scheduling for improved write performance in a cost-effective, fault-tolerant parallel virtual file system (CEFT-PVFS). In: The Fourth LCI International Conference on Linux Clusters (2003)
6. Surdeanu, M., Moldovan, D., Harabagiu, S.: Performance analysis of a distributed question/answering system. IEEE Trans. on Parallel and Distributed Systems 13 (2002) 579-596