Performance Analysis of Applying Replica Selection Technology for Data Grid Environments*


Chao-Tung Yang 1, Chun-Hsiang Chen 1, Kuan-Ching Li 2, and Ching-Hsien Hsu 3

1 High-Performance Computing Laboratory, Department of Computer Science and Information Engineering, Tunghai University, Taichung 40704, Taiwan
ctyang@mail.thu.edu.tw
2 Parallel and Distributed Processing Center, Department of Computer Science and Information Management, Providence University, Taichung 43301, Taiwan
kuancli@pu.edu.tw
3 Department of Computer Science and Information Engineering, Chung Hua University, Hsinchu 300, Taiwan
chh@chu.edu.tw

Abstract. The Data Grid enables the sharing, selection, and connection of a wide variety of geographically distributed computational and storage resources for solving large-scale, data-intensive scientific applications. Such technology efficiently manages and transfers terabytes or even petabytes of data for data-intensive, high-performance computing applications in wide-area, distributed computing environments. The replica selection process allows an application to choose a replica from the replica catalog, based on its performance and data access features. In this paper, we build a Grid environment based on three existing PC cluster environments and analyze the performance of data transfers using the GridFTP protocol over these systems. In addition, based on the experimental results, we propose a cost model that picks the best replica in real, dynamic network situations.

Keywords: Grid computing, Data Grid, Replica selection, Globus, GridFTP.

1 Introduction

Grid computing is the application of the resources of many computers in a network to a single problem at the same time, usually a scientific or technical problem that requires a great number of computer processing cycles or access to large amounts of data. A Grid computing environment provides a platform for scientific applications and physical experiments.
* This paper is supported in part by NSC Taiwan (National Science Council), under grants no. NSC E , NSC M , NSC M and NSC E
The corresponding author.
V. Malyshkin (Ed.): PaCT 2005, LNCS 3606, pp , Springer-Verlag Berlin Heidelberg 2005

A Grid is a large-scale virtual organization in which resources are shared in order to solve problems [4, 7, 9, 10, 11, 12]. Grid computing is distributed computing taken to the next evolutionary level. The goal is to create the vision of a large and powerful self-managing virtual computer, which is a huge collection of connected heterogeneous systems. The emerging mechanism is resource sharing through the availability of high-bandwidth networks. A computational Grid aims to provide users with better performance, especially in terms of speed and throughput. A Data Grid aggregates distributed resources to produce results for large problems. Most Data Grid applications execute simultaneously and access a large number of shared data files in the Grid. In certain data-intensive scientific applications, such as high-energy physics, bioinformatics, and astrophysical virtual observatories, we are confronted with huge amounts of data.

A Data Grid provides two essential basic services: a secure, reliable, efficient data transport protocol, and replica management [2]. The high-speed transport protocol, GridFTP, extends the popular FTP protocol with new features required for Data Grid applications, such as partial file transfer and third-party transfer [5]. The replica management service uses the replica catalog together with GridFTP transfers to provide for the creation, registration, location, and management of data replicas [1].

In this paper, we build a Grid environment based on three existing PC cluster environments and analyze the performance of data transfers using the GridFTP protocol over these systems. In addition, based on the experimental results, we propose a cost model that picks the best replica in real, dynamic network situations. The cost model is built on three significant parameters: network bandwidth, CPU load, and I/O state. Although network conditions change constantly and storage equipment alternates between busy and idle, the cost model lets us determine the best replica immediately.
The replica selection can be conducted accurately because our cost model is based on system monitoring information that is updated continuously.

2 Background Review

2.1 Globus Toolkit

The Globus Project [10, 11, 12] provides software tools that make it easier to build computational Grids and Grid-based applications. These tools are collectively called the Globus Toolkit. The Globus Toolkit is used by many organizations to build computational Grids that can support their applications. The composition of the Globus Toolkit can be pictured as three pillars: Resource Management, Information Services, and Data Management. Each pillar represents a primary component of the Globus Toolkit and makes use of a common foundation of security. GRAM implements a resource management protocol, MDS implements an information services protocol, and GridFTP implements a data transfer protocol. They all use the GSI security protocol at the connection layer [8, 11, 12, 13].

2.2 NWS

The Network Weather Service (NWS) [16] is a generalized and distributed monitoring system for producing short-term performance forecasts based on historical performance measurements. The goal of the system is to dynamically characterize and forecast the performance deliverable at the application level from a set of network and computational resources. It is composed of three component processes:

- nws_nameserver: implements a naming and discovery service used to manage a system of nws_sensor and nws_memory processes,
- nws_memory: provides persistent storage for the measurement data collected by the NWS deployment,
- nws_sensor: gathers performance measurements from a specified resource and communicates them to a set of nws_memory processes specified on the command line.

A typical installation involves one nws_nameserver, one or more nws_memory processes (which may reside on different machines), and an nws_sensor running on each machine whose resources are to be monitored. The system includes sensors for end-to-end TCP/IP performance (bandwidth and latency), available CPU percentage, and available non-paged memory.

2.3 Sysstat Utilities

The Sysstat [15] utilities are a collection of performance monitoring tools for Linux. The sysstat package contains the sar, mpstat, and iostat commands. The sar command collects and reports system activity information, which can also be saved in a system activity file for later inspection. The iostat command reports CPU statistics and I/O statistics for tty devices and disks. The statistics reported by sar cover I/O transfer rates, paging activity, process-related activities, interrupts, network activity, memory and swap space utilization, CPU utilization, kernel activities, and tty statistics, among others. Both uniprocessor (UP) and symmetric multiprocessor (SMP) machines are fully supported.

3 Replica Selection

3.1 Replica Selection Scenario

The system established in this research uses the following architecture. Figure 1 shows our proposed replica selection model and how a client identifies the best location for a desired replica transfer.
First, the client logs in at the local site and executes a parallel application on the Data Grid platform. The application checks whether the required files are located at the local site. If they are present at the local site, the application accesses them immediately. Otherwise, the application passes the logical file names to the replica catalog server, which returns a list of physical locations for all registered copies. The application passes this list of replica locations to a replica selection server, which identifies the storage system destination locations for all candidate data transfer operations. The replica selection server sends the possible destination locations to the information server, which provides performance measurements and predictions for the three system factors described in the next section. According to these estimates, the replica selection server chooses the best replica location and returns the location information to the parallel application, which receives the replica through GridFTP. Once the application's computation has finished, the application returns the results to the user.
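The scenario above can be sketched in Python as a single decision function. This is only a minimal illustration under stated assumptions: the `Replica` type, the `select_source` function, the catalog dictionary, the site names, the URLs, and the score values are all hypothetical and stand in for the replica catalog server, the replica selection server, and the information server.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass(frozen=True)
class Replica:
    site: str  # host that stores this physical copy (hypothetical name)
    url: str   # physical location returned by the catalog (hypothetical)


def select_source(
    logical_name: str,
    local_files: Set[str],
    catalog: Dict[str, List[Replica]],
    score_of: Callable[[str], float],
) -> str:
    """Return "local" or the URL of the highest-scoring remote replica."""
    # Step 1: a copy at the local site is used immediately.
    if logical_name in local_files:
        return "local"
    # Step 2: the replica catalog maps the logical name to physical copies.
    candidates = catalog.get(logical_name, [])
    if not candidates:
        raise LookupError(f"no replica registered for {logical_name!r}")
    # Step 3: the selection server ranks the candidates using scores
    # derived from the information server's measurements.
    best = max(candidates, key=lambda replica: score_of(replica.site))
    # Step 4: the caller would now fetch best.url through GridFTP.
    return best.url


# Hypothetical catalog contents and scores, for illustration only.
catalog = {
    "file-x": [
        Replica("site-a", "gsiftp://site-a.example.org/data/file-x"),
        Replica("site-b", "gsiftp://site-b.example.org/data/file-x"),
    ]
}
scores = {"site-a": 35.0, "site-b": 57.0}

chosen = select_source("file-x", set(), catalog, scores.__getitem__)
print(chosen)
```

Passing the scoring function as a parameter keeps the lookup logic separate from whatever monitoring source supplies the measurements.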


Fig. 1. Replica selection scenario

3.2 System Factors

We propose a replica selection model for Data Grid environments. In this environment, we can treat a biological database as a replica in the Data Grid. When we execute large-scale, data-intensive applications in these environments, a site has both data stores and computational capabilities. Determining the best database among many identical replicas is a significant problem. In our model, we consider three system factors that affect replica selection:

- Network bandwidth: Network bandwidth is one of the most significant factors in a Data Grid, since data files in a Data Grid environment are usually very large. In other words, the data file transfer time depends tightly on the network bandwidth situation. Because network bandwidth is an unstable and dynamic factor, we should measure it often and predict it as accurately as possible. NWS (Network Weather Service) is a powerful toolkit for this purpose,
- CPU load: A Grid platform consists of a number of heterogeneous systems built with different system architectures, e.g., cluster platforms, supercomputers, and PCs. CPU load is a dynamic system factor, and a heavy CPU load on a system will affect file downloads from that site. CPU status is measured through the Globus Toolkit / MDS,
- I/O state: Data Grid nodes consist of different heterogeneous storage systems, and the data in a Data Grid is huge. If the I/O subsystem of the site from which we would like to download a file is very busy, it will directly affect the data transfer performance. We measure the I/O state using the sysstat utilities.

3.3 Replica Selection Cost Model

The target function of a cost model for distributed and replicated data storage is a score computed from the information supplied by the information service. We listed the influencing factors for our cost model in the previous section. However, we have to express these factors in mathematical notation for further analysis. We assume that node i is the local site at which the user or application is logged in, while node j possesses the replica that the user or application wants. The seven system parameters in our replica selection cost model are:

- $Score_{i\text{-}j}$: the score for acquiring the replica held at node j from node i; a higher score indicates that the user or application can acquire the replica more efficiently,
- $P^{BW}_{i\text{-}j}$: the percentage of bandwidth available from node i to node j, in other words, the current bandwidth divided by the highest theoretical bandwidth,
- $W^{BW}$: the weight of the network bandwidth, defined by the administrator of the Data Grid,
- $P^{CPU}_{j}$: the percentage of CPU idle time of node j,
- $W^{CPU}$: the weight of the CPU load, defined by the administrator of the Data Grid,
- $P^{I/O}_{j}$: the percentage of I/O idle time of node j,
- $W^{I/O}$: the weight of the I/O state, defined by the administrator of the Data Grid.

According to the three system factors, we define the following general formula:

\[ Score_{i\text{-}j} = P^{BW}_{i\text{-}j} \times W^{BW} + P^{CPU}_{j} \times W^{CPU} + P^{I/O}_{j} \times W^{I/O} \qquad (1) \]

In this formula, the three weights $W^{BW}$, $W^{CPU}$, and $W^{I/O}$ apply to network bandwidth, CPU load, and I/O state, respectively. These weights can be determined by the administrator of the Data Grid organization. Depending on the attributes of the storage systems in the Data Grid nodes, the administrator can choose different weights, because some storage equipment does not affect CPU load. After several experimental measurements, we consider network bandwidth to be the most significant factor, since it directly influences the data transfer time. When we transfer data using the GridFTP protocol, we find that the CPU and I/O statuses affect transfer performance only slightly. In our Data Grid environment, we set the weights to 80%, 10%, and 10%, respectively.
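As a rough illustration of formula (1), the following Python sketch computes the score from the three percentages and ranks candidate sites by it. The 80%, 10%, and 10% weights are the ones stated above; the function names, the site names, and every measurement value are hypothetical and are not taken from the experiments.

```python
from typing import Dict, List, NamedTuple, Tuple


class Weights(NamedTuple):
    bw: float = 0.8   # W^BW, the testbed value stated in the text
    cpu: float = 0.1  # W^CPU
    io: float = 0.1   # W^I/O


def bandwidth_percentage(current_mbps: float, theoretical_mbps: float) -> float:
    """P^BW: current bandwidth divided by the highest theoretical bandwidth."""
    if theoretical_mbps <= 0:
        raise ValueError("theoretical bandwidth must be positive")
    return 100.0 * current_mbps / theoretical_mbps


def replica_score(p_bw: float, p_cpu: float, p_io: float,
                  w: Weights = Weights()) -> float:
    """Formula (1): weighted sum of the three percentages (0 to 100 each)."""
    return p_bw * w.bw + p_cpu * w.cpu + p_io * w.io


def rank_replicas(measurements: Dict[str, Tuple[float, float, float]],
                  w: Weights = Weights()) -> List[Tuple[str, float]]:
    """Sort sites from the highest score to the lowest."""
    scored = [(site, replica_score(*m, w)) for site, m in measurements.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical measurements: (P^BW, CPU idle %, I/O idle %) per site.
measurements = {
    "site-a": (bandwidth_percentage(500.0, 1000.0), 90.0, 80.0),
    "site-b": (bandwidth_percentage(6.0, 30.0), 95.0, 99.0),
}
ranking = rank_replicas(measurements)
for site, score in ranking:
    print(f"{site}: {score:.1f}")
```

With these weights, a site with idle CPU and disks can still rank below a busier site that has more available bandwidth, which matches the observation that bandwidth dominates transfer time.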
4 Experimental Environments and Results

In this section, we present experimental results obtained with the GridFTP protocol. First, we measure and compare the file transfer times of FTP and GridFTP. Second, we focus on parallel data transfer, measuring and comparing GridFTP file transfer times with 1, 2, 4, 8, and 16 TCP streams. The Data Grid testbed consists of three Linux PC clusters:

- THU site: four PCs with dual AMD AthlonMP 2.0 GHz processors, 1 GB DDR memory, 60 GB HD, 1 Gbps network bandwidth,
- Li-Zen site: four PCs with an Intel Celeron 900 MHz processor, 256 MB DDR memory, 10 GB HD, 30 Mbps network bandwidth,
- HIT site: four PCs with Intel P4 2.8 GHz processors, 512 MB DDR memory, 80 GB HD, 1 Gbps network bandwidth.

Figure 2 shows the hardware and network configuration of our Data Grid testbed. The THU site is located at Tunghai University, Taichung City; the Li-Zen site is located at Li-Zen High School, Taichung County; and the HIT site is located at Hsiuping Institute of Technology, Taichung County, all in Taiwan.

Fig. 2. Our Data Grid testbed

4.1 FTP Versus GridFTP

The Globus Project surveyed available protocols and technologies, implemented some prototypes, and settled on using FTP and its existing extensions as a base, extending it again to add the missing required functionality. The Globus Alliance proposes a common data transfer and access protocol named GridFTP that provides secure, efficient data movement in Grid environments. This protocol, which extends the standard FTP protocol, provides a superset of the features offered by the various Grid storage systems currently in use. In Grid environments, access to distributed data is typically as important as access to distributed computational resources. Distributed scientific and engineering applications require transfers of large amounts of data between storage systems, and access to large amounts of data by many geographically distributed applications and users for analysis and visualization. GridFTP extends the FTP protocol and is suitable for Grid environments. Figure 3 shows the performance of FTP and GridFTP when transferring four different file sizes. In our first experiment, we transferred these files (256, 512, 1024, and 2048 megabytes) from THU site alpha01 to HIT site gridhit3.

4.2 GridFTP with Parallel Data Transfer

Using multiple TCP streams can improve aggregate bandwidth over using a single TCP stream in WAN environments. We apply this feature of the GridFTP protocol to transfer files of different sizes in Data Grid environments. GridFTP (as well as normal FTP) defines multiple wire protocols, or MODES, for the data channel.
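The extended block mode (MODE E) framing that this section describes, in which each block carries 8 bits of flags, a 64-bit offset, and a 64-bit length ahead of its payload, can be sketched in Python as follows. This is an illustrative sketch only, not the Globus implementation: the field order follows the description in this paper, big-endian byte order is an assumption, and the function names are hypothetical. The GridFTP specification should be consulted for the exact wire layout.

```python
import struct
from typing import Iterable, Tuple

# Header as described in the text: 8-bit flags, 64-bit offset, 64-bit length.
# Big-endian (network byte order) is an assumption of this sketch.
HEADER = struct.Struct(">BQQ")


def pack_block(offset: int, payload: bytes, flags: int = 0) -> bytes:
    """Prefix a payload with its flags, offset, and length."""
    return HEADER.pack(flags, offset, len(payload)) + payload


def unpack_block(block: bytes) -> Tuple[int, int, bytes]:
    """Split a block into (flags, offset, payload)."""
    flags, offset, length = HEADER.unpack_from(block)
    payload = block[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated block")
    return flags, offset, payload


def reassemble(blocks: Iterable[bytes], total_size: int) -> bytes:
    """Rebuild the data from blocks that may arrive in any order."""
    buffer = bytearray(total_size)
    for block in blocks:
        _, offset, payload = unpack_block(block)
        if offset + len(payload) > total_size:
            raise ValueError("block exceeds declared transfer size")
        # The explicit offset says where the payload belongs, so arrival
        # order across parallel streams does not matter.
        buffer[offset:offset + len(payload)] = payload
    return bytes(buffer)


# Blocks delivered in reverse order still reassemble correctly.
data = bytes(range(40))
blocks = [pack_block(off, data[off:off + 16]) for off in range(0, len(data), 16)]
restored = reassemble(reversed(blocks), len(data))
print(restored == data)
```

Because each block is self-describing, a receiver can write payloads from several TCP connections into the same buffer without coordinating their order.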
Most normal FTP servers only implement stream mode, i.e., the bytes flow in order over a single TCP connection. GridFTP defaults to this mode so that it is compatible with normal FTP servers.

Fig. 3. FTP versus GridFTP (file transfer time in seconds versus file size in MB; series: FTP, GridFTP)

However, GridFTP has another mode, called Extended Block Mode, or MODE E. This mode sends the data over the data channel in blocks. Each block consists of 8 bits of flags, a 64-bit integer indicating the offset from the start of the transfer, and a 64-bit integer indicating the length of the block in bytes, followed by a payload of that length. Because the offset and length are provided, out-of-order arrival is acceptable, i.e., the 10th block could arrive before the 9th, because the receiver knows explicitly where each block belongs. This allows us to use multiple TCP channels. If the parallelism option is used, globus-url-copy automatically puts the servers into MODE E. Note that parallel data transfer with one TCP stream is not the same as no parallel data transfer at all. Both use a single stream, but the default uses stream mode, whereas parallel data transfer with one TCP stream uses MODE E [12].

Fig. 4. GridFTP with parallel data transfer (file transfer time in seconds versus file size in MB; series: GridFTP with no parallel data transfer, and GridFTP with 1, 2, 4, 8, and 16 TCP streams)

The parallelism option is used by the source data node to control how many parallel data connections may be established to each destination data node. Figure 4 shows the performance of GridFTP when transferring files of 256, 512, 1024, and 2048 megabytes with 1, 2, 4, 8, and 16 TCP streams from THU site alpha02 to Li-Zen site lz04. According to the experimental results, the parallel data transfer technique performs better for larger file sizes. Parallel data transfer improves aggregate bandwidth by establishing multiple data channels.

4.3 Replica Selection Cost Model

According to the replica selection scenario in 3.1, a user logs in at the local site, THU site alpha1, specifies the characteristics of the desired data, and passes this attribute description to the replica catalog server. The replica catalog server queries its database and produces a list of logical files that contain data with the specified characteristics. The replica catalog server then returns the physical locations of all registered replicas of the desired logical files. In this experiment, only one logical file, file-a, conforms to the user's request, and the size of file-a is 1024 megabytes.

Table 1. The value of replica selection cost model and file transfer time (column headings: alpha1, alpha4, hit0, lz02; rows: $P^{BW}_{i\text{-}j}$, $P^{CPU}_{j}$, $P^{I/O}_{j}$, replica selection cost model, practical data transfer time)

Fig. 5. GUI of replica selection cost model program, panels (a) and (b)

Next, the user passes this list of replica locations to the replica selection server, which identifies the destination storage system locations for all candidate data transfer operations. Three replicas map to the logical file file-a. These three replicas are located at different sites: alpha4, hit0, and lz02. The replica selection server sends the candidate destination locations to the information server [17], which provides the three system factors described in 3.2. Based on the replica cost model described in 3.3, the replica selection server chooses the best replica and transfers it to the local site alpha1 by GridFTP. Table 1 shows the values of the system factors, the scores of the replica selection cost model, and the physical file transfer times.

Following the discussion in 3.3, we implemented the replica selection cost model as a computer program and executed it on our Data Grid testbed. Because the program is developed in the Java programming language, it can run on any computing platform with a JVM. Fig. 5(a) shows the costs calculated from the three system factors (the percentage of CPU idle, the percentage of I/O idle, and the bandwidth from the other sites) to alpha1. Figure 5(b) displays the average value over the selected time scale, which is adjustable with the top scroll bar. A sorted list of the costs can also be obtained by clicking the Cost button.

5 Conclusions and Future Work

In this paper, we have presented the design and implementation of two fundamental services. The GridFTP protocol extends the FTP protocol and provides beneficial features. In this research, we focused on parallel data transfer. After measuring the performance of GridFTP with the parallel data transfer feature, we confirm that this technology improves data transfer.
After measuring the performance of FTP and GridFTP with four different file sizes, we observed that the data transfer times of the two protocols are similar, even when the file size is 2 gigabytes. However, after measuring the performance of GridFTP with 1, 2, 4, 8, and 16 TCP streams, we are confident that parallel data transfer saves data transfer time. After calculating the scores of the replica selection cost model, we can sort the list of replicas from the most efficient to the least efficient. Therefore, our cost model provides users and applications with a mechanism for choosing the best replica.

As future work, three investigations will be carried out from this research. First, although we have employed the parallel data transfer feature to improve data transfer performance, the striped data transfer feature can also improve aggregate bandwidth. Second, we will consider how to determine the weights of the system factors and will take more system factors into account in the replica selection cost model. Third, we will extend our Data Grid testbed to analyze the performance of replica selection in a dynamic environment with a larger number of sites.

References

1. B. Allcock, J. Bester, J. Bresnahan, A. Chervenak, I. Foster, C. Kesselman, S. Meder, V. Nefedova, D. Quesnel, S. Tuecke, Data Management and Transfer in High Performance Computational Grid Environments, Parallel Computing, Vol. 28 (5), May 2002.
2. B. Allcock, J. Bester, J. Bresnahan, A. Chervenak, I. Foster, C. Kesselman, S. Meder, V. Nefedova, D. Quesnel, S. Tuecke, Secure, Efficient Data Transport and Replica Management for High-Performance Data-Intensive Computing, IEEE Mass Storage Conference.
3. B. Allcock, S. Tuecke, I. Foster, A. Chervenak, and C. Kesselman, Protocols and Services for Distributed Data-Intensive Science, ACAT2000 Proceedings.
4. K. Czajkowski, S. Fitzgerald, I. Foster and C. Kesselman, Grid Information Services for Distributed Resource Sharing, Proceedings of the Tenth IEEE International Symposium on High-Performance Distributed Computing (HPDC-10), IEEE CS Press, August.
5. K. Czajkowski, I. Foster, N. Karonis, C. Kesselman, S. Martin, W. Smith and S. Tuecke, A Resource Management Architecture for Metacomputing Systems, Proc. IPPS/SPDP 98 Workshop on Job Scheduling Strategies for Parallel Processing.
6. R. L. De, C. Costa and S. Lifschitz, Database Allocation Strategies for Parallel BLAST Evaluation on Clusters, Proceedings of the Distributed and Parallel Databases, Vol. 13, Issue 1, Hingham, MA, USA, January.
7. I. Foster, The Grid: A New Infrastructure for 21st Century Science, Physics Today, 55(2):42-47.
8. I. Foster, C. Kesselman, Globus: A Metacomputing Infrastructure Toolkit, Intl J. Supercomputer Applications, 11(2).
9. I. Foster and C. Kesselman, The Grid: Blueprint for a New Computing Infrastructure, Morgan-Kaufmann.
10. I. Foster, C. Kesselman and S. Tuecke, The Anatomy of the Grid: Enabling Scalable Virtual Organizations, Intl J. Supercomputer Applications, 15(3).
11. Global Grid Forum.
12. The Globus Project.
13. Introduction to Grid Computing with Globus.
14. SETI@home: Search for Extraterrestrial Intelligence at home, berkeley. edu/
15. SYSSTAT utilities home page.
16. R. Wolski, N. Spring and J. Hayes, The Network Weather Service: A Distributed Resource Performance Forecasting Service for Metacomputing, Journal of Future Generation Computing Systems, Vol. 15, No. 5-6, October.
17. X. Zhang, J. Freschl, and J. Schopf, A Performance Study of Monitoring and Information Services for Distributed Systems, Proceedings of HPDC, August 2003.
