Performance Analysis of Applying Replica Selection Technology for Data Grid Environments*


Chao-Tung Yang 1,†, Chun-Hsiang Chen 1, Kuan-Ching Li 2, and Ching-Hsien Hsu 3

1 High-Performance Computing Laboratory, Department of Computer Science and Information Engineering, Tunghai University, Taichung 40704, Taiwan
ctyang@mail.thu.edu.tw
2 Parallel and Distributed Processing Center, Department of Computer Science and Information Management, Providence University, Taichung 43301, Taiwan
kuancli@pu.edu.tw
3 Department of Computer Science and Information Engineering, Chung Hua University, Hsinchu 300, Taiwan
chh@chu.edu.tw

Abstract. The Data Grid enables the sharing, selection, and connection of a wide variety of geographically distributed computational and storage resources for solving large-scale data-intensive scientific applications. Such technology efficiently manages and transfers terabytes or even petabytes of data for data-intensive, high-performance computing applications in wide-area, distributed computing environments. The replica selection process allows an application to choose a replica from the replica catalog based on its performance and data access features. In this paper, we build a Grid environment from three existing PC cluster environments and analyze the performance of data transfers using the GridFTP protocol over these systems. In addition, based on the experimental results, we propose a cost model for picking the best replica under real, dynamic network conditions.

Keywords: Grid computing, Data Grid, Replica selection, Globus, GridFTP.

1 Introduction

Grid computing is the application of the resources of many networked computers to a single problem at the same time, usually a scientific or technical problem that requires a great number of processing cycles or access to large amounts of data. A Grid computing environment provides a platform for scientific applications and physical experiments.
A Grid is a large-scale virtual organization whose resources are shared in order to solve problems [4, 7, 9, 10, 11, 12]. Grid computing is distributed computing taken to the next evolutionary level. The goal is to create the vision of a large and powerful self-managing virtual computer, which is a huge collection of connected heterogeneous systems. The emerging mechanism is resource sharing through the availability of high-bandwidth networks. The term computational Grid describes a Grid that provides users with better performance, especially in terms of speed and throughput. The term Data Grid describes a Grid that aggregates distributed resources to produce results for large problems. Most Data Grid applications execute simultaneously and access a large number of shared data files in the Grid. In certain data-intensive scientific applications, such as high-energy physics, bioinformatics, and astrophysical virtual observatories, we are confronted with huge amounts of data.

A Data Grid provides two essential basic services: a secure, reliable, efficient data transport protocol, and replica management [2]. The high-speed transport protocol, GridFTP, extends the popular FTP protocol with new features required by Data Grid applications, such as partial file transfer and third-party transfer [5]. The replica management service uses the replica catalog together with GridFTP transfers to provide for the creation, registration, location, and management of data replicas [1].

In this paper, we build a Grid environment from three existing PC cluster environments and analyze the performance of data transfers using the GridFTP protocol over these systems. In addition, based on the experimental results, we propose a cost model for picking the best replica under real, dynamic network conditions. The cost model uses three significant parameters: network bandwidth, CPU load, and I/O state. Although the network situation changes constantly and the storage equipment may be busy or idle at any moment, our cost model can determine the best replica immediately. Replica selection can be conducted accurately because the cost model is based on system monitoring information that is updated continuously.

* This paper is supported in part by NSC Taiwan (National Science Council), under grants no. NSC E , NSC M , NSC M and NSC E
† The corresponding author.
V. Malyshkin (Ed.): PaCT 2005, LNCS 3606, pp , Springer-Verlag Berlin Heidelberg 2005

2 Background Review

2.1 Globus Toolkit

The Globus Project [10, 11, 12] provides software tools that make it easier to build computational Grids and Grid-based applications. These tools are collectively called the Globus Toolkit. The Globus Toolkit is used by many organizations to build computational Grids that can support their applications. The composition of the Globus Toolkit can be pictured as three pillars: Resource Management, Information Services, and Data Management. Each pillar represents a primary component of the Globus Toolkit and makes use of a common foundation of security. GRAM implements a resource management protocol, MDS implements an information services protocol, and GridFTP implements a data transfer protocol. They all use the GSI security protocol at the connection layer [8, 11, 12, 13].

2.2 NWS

The Network Weather Service (NWS) [16] is a generalized and distributed monitoring system for producing short-term performance forecasts based on historical performance measurements. The goal of the system is to dynamically characterize and forecast the performance deliverable at the application level from a set of network and computational resources. It is composed of three component processes:

nws_nameserver: implements a naming and discovery service used to manage a system of nws_sensor and nws_memory processes,
nws_memory: provides persistent storage for the measurement data collected by the NWS deployment,
nws_sensor: gathers performance measurements from a specified resource and communicates them to the set of nws_memory processes specified on the command line.

A typical installation involves one nws_nameserver, one or more nws_memory processes (which may reside on different machines), and an nws_sensor running on each machine whose resources are to be monitored. The system includes sensors for end-to-end TCP/IP performance (bandwidth and latency), available CPU percentage, and available non-paged memory.

2.3 Sysstat Utilities

The Sysstat [15] utilities are a collection of performance monitoring tools for Linux; the sysstat package contains the sar, mpstat, and iostat commands. The sar command collects and reports system activity information, which can also be saved in a system activity file for future inspection. The iostat command reports CPU statistics and I/O statistics for tty devices and disks. The statistics reported by sar cover I/O transfer rates, paging activity, process-related activities, interrupts, network activity, memory and swap space utilization, CPU utilization, kernel activities, and tty statistics, among others. Both uniprocessor (UP) and symmetric multiprocessor (SMP) machines are fully supported.

3 Replica Selection

3.1 Replica Selection Scenario

The system established in this research uses the following architecture. Figure 1 shows our proposed replica selection model and how a client identifies the best location for a desired replica transfer.
First, the client logs in at the local site and executes a parallel application on the Data Grid platform. The application checks whether the files it needs are located at the local site. If they are present at the local site, the application accesses them immediately. Otherwise, the application passes the logical file names to the replica catalog server, which returns a list of physical locations for all registered copies. The application passes this list of replica locations to a replica selection server, which identifies the storage system destination locations for all candidate data transfer operations. The replica selection server sends the possible destination locations to the information server, which provides performance measurements and predictions for the three system factors described in the next section. According to these estimates, the replica selection server chooses the best replica location and returns the location information to the parallel application, which receives the replica through GridFTP. Once the application's computation finishes, the application returns the results to the user.
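The selection flow above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the in-memory catalog, the `score_of` callback, the site names, the paths, and the score values are all hypothetical stand-ins for the replica catalog server, the information server, and real storage locations.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical in-memory stand-in for the replica catalog server:
# logical file name -> physical replica locations.
ReplicaCatalog = Dict[str, List[str]]


def select_replica(
    logical_name: str,
    local_files: Dict[str, str],
    catalog: ReplicaCatalog,
    score_of: Callable[[str], float],
) -> Optional[str]:
    """Return the location the application should read from, or None."""
    # Step 1: use the local copy when one exists.
    if logical_name in local_files:
        return local_files[logical_name]
    # Step 2: ask the catalog for all registered physical copies.
    candidates = catalog.get(logical_name, [])
    if not candidates:
        return None
    # Step 3: rank candidates with the information server's estimates
    # (higher score = better replica) and pick the best one.
    return max(candidates, key=score_of)


# Made-up locations and scores, for illustration only.
catalog = {"file-a": ["gsiftp://site-b/data/file-a", "gsiftp://site-c/data/file-a"]}
scores = {"gsiftp://site-b/data/file-a": 0.42, "gsiftp://site-c/data/file-a": 0.77}
best = select_replica("file-a", {}, catalog, scores.__getitem__)
```

Passing the scoring function as a parameter keeps the selection logic independent of how the three system factors are measured, which mirrors the separation between the replica selection server and the information server in the scenario.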

Fig. 1. Replica selection scenario

3.2 System Factors

We propose a replica selection model for Data Grid environments. In such an environment, we can treat a biological database as a replica in the Data Grid. When we execute large-scale data-intensive applications in these environments, a site has both data stores and computational capabilities. Determining the best database among many identical replicas is a significant problem. In our model, we consider three system factors that affect replica selection:

Network bandwidth: Network bandwidth is one of the most significant factors in a Data Grid, since data files in a Data Grid environment are usually very large. In other words, the data file transfer time depends tightly on the network bandwidth situation. Because network bandwidth is an unstable and dynamic factor, we should measure it often and predict it as accurately as possible. NWS (Network Weather Service) is a powerful toolkit for this purpose.

CPU load: A Grid platform consists of a number of heterogeneous systems built with different system architectures, e.g., cluster platforms, supercomputers, and PCs. CPU load is a dynamic system factor, and a heavy CPU load on a system will certainly affect the process of downloading data files from that site. CPU status is measured through the Globus Toolkit / MDS.

I/O state: Data Grid nodes consist of heterogeneous storage systems, and the amount of data in a Data Grid is huge. If the I/O state of the site from which we would like to download a file is very busy, it will directly affect data transfer performance. We measure the I/O state using the sysstat utilities.

3.3 Replica Selection Cost Model

The target function of a cost model for distributed and replicated data storage is the score computed from the information provided by the information service. We listed the influencing factors for our cost model in the previous section. However, we have to express these factors in mathematical notation for further analysis. We assume that node i is the local site at which the user or application is logged in, while node j possesses the replica that the user or application wants. The seven system parameters in our replica selection cost model are:

$\mathrm{Score}_{i\text{-}j}$: the score for acquiring the replica between node i and node j; a higher score means the user or application can acquire the replica more efficiently,
$P^{BW}_{i\text{-}j}$: the percentage of bandwidth available from node i to node j, that is, the current bandwidth divided by the highest theoretical bandwidth,
$W^{BW}$: the weight of the network bandwidth, defined by the administrator of the Data Grid,
$P^{CPU}_{j}$: the percentage of CPU idle time of node j,
$W^{CPU}$: the weight of the CPU load, defined by the administrator of the Data Grid,
$P^{I/O}_{j}$: the percentage of I/O idle time of node j,
$W^{I/O}$: the weight of the I/O state, defined by the administrator of the Data Grid.

According to the three system factors, we define the following general formula:

$$\mathrm{Score}_{i\text{-}j} = P^{BW}_{i\text{-}j} \cdot W^{BW} + P^{CPU}_{j} \cdot W^{CPU} + P^{I/O}_{j} \cdot W^{I/O} \qquad (1)$$

In this formula, $W^{BW}$, $W^{CPU}$, and $W^{I/O}$ are the weights of network bandwidth, CPU, and I/O, respectively. These weights can be determined by the administrator of the Data Grid organization. The administrator can choose different weights according to the attributes of the storage systems in the Data Grid nodes, because some storage equipment does not affect CPU load. After several experimental measurements, we consider network bandwidth the most significant factor, since it directly influences the data transfer time. When performing data transfers using the GridFTP protocol, we found that CPU and I/O status affect data transfer performance only slightly. In our Data Grid environment, we set the three weights to 80%, 10%, and 10%, respectively.
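Equation (1) can be checked with a short Python sketch. The weights are the 80%, 10%, and 10% values stated above; the node names and the input percentages are invented for illustration and are not measurements from this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Weights:
    bandwidth: float = 0.8  # W^BW, value stated in the text
    cpu: float = 0.1        # W^CPU
    io: float = 0.1         # W^I/O


def replica_score(bw_pct: float, cpu_idle_pct: float, io_idle_pct: float,
                  w: Weights = Weights()) -> float:
    """Equation (1): weighted sum of the three percentages (each 0-100)."""
    for value in (bw_pct, cpu_idle_pct, io_idle_pct):
        if not 0.0 <= value <= 100.0:
            raise ValueError("percentages must lie in [0, 100]")
    return bw_pct * w.bandwidth + cpu_idle_pct * w.cpu + io_idle_pct * w.io


# Illustrative inputs only (hypothetical nodes, not values from Table 1):
# (bandwidth %, CPU idle %, I/O idle %)
candidates = {
    "node-a": (90.0, 50.0, 50.0),
    "node-b": (20.0, 100.0, 100.0),
}
ranked = sorted(candidates, key=lambda n: replica_score(*candidates[n]), reverse=True)
```

With these inputs node-a scores about 82 and node-b about 36, so a completely idle node with poor bandwidth still ranks below a half-loaded node with good bandwidth. That is the intended effect of giving bandwidth the dominant weight.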
4 Experimental Environments and Results

This section presents experimental results obtained with the GridFTP protocol. First, we measure and compare the file transfer times of FTP and GridFTP. Second, we focus on parallel data transfer, measuring and comparing the file transfer times of GridFTP with 1, 2, 4, 8, and 16 TCP streams. The Data Grid testbed consists of three Linux PC clusters:

THU site: four PCs with dual AMD AthlonMP 2.0GHz processors, 1GB DDR memory, 60GB HD, 1Gbps network bandwidth,
Li-Zen site: four PCs with Intel Celeron 900MHz processor, 256MB DDR memory, 10GB HD, 30 Mbps network bandwidth,
HIT site: four PCs with Intel P4 2.8GHz processors, 512MB DDR memory, 80GB HD, 1Gbps network bandwidth.
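The nominal link speeds listed above give a rough lower bound on transfer time, which helps put measured times in context. The Python sketch below rests on several assumptions that are not from the paper: 1 MB is taken as 10^6 bytes, protocol overhead is ignored, the 1024 MB file size is only an example input, and each site's nominal bandwidth is treated as the capacity of the whole path.

```python
def ideal_transfer_seconds(size_mb: float, src_mbps: float, dst_mbps: float) -> float:
    """Lower bound on transfer time; the slower endpoint is the bottleneck."""
    if size_mb < 0 or src_mbps <= 0 or dst_mbps <= 0:
        raise ValueError("size must be >= 0 and bandwidths > 0")
    bottleneck_mbps = min(src_mbps, dst_mbps)
    return size_mb * 8 / bottleneck_mbps  # assumes 1 MB = 10**6 bytes


# Nominal link speeds from the testbed list, in Mbps.
SITES = {"THU": 1000.0, "Li-Zen": 30.0, "HIT": 1000.0}

# Example: a 1024 MB file between pairs of sites.
thu_to_lizen = ideal_transfer_seconds(1024, SITES["THU"], SITES["Li-Zen"])
thu_to_hit = ideal_transfer_seconds(1024, SITES["THU"], SITES["HIT"])
```

Under these assumptions the bound is roughly 273 seconds toward the Li-Zen site and roughly 8 seconds between the two 1 Gbps sites. These are floor values only; real transfers should be expected to take longer.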

Figure 2 shows the hardware and network configuration of our Data Grid testbed. The THU site is located at Tunghai University, Taichung City; the Li-Zen site is located at Li-Zen High School, Taichung County; and the HIT site is located at Hsiuping Institute of Technology, Taichung County, all in Taiwan.

Fig. 2. Our Data Grid testbed

4.1 FTP Versus GridFTP

The Globus Project surveyed available protocols and technologies, implemented some prototypes, and settled on using FTP and its existing extensions as a base, extending it again to add missing required functionality. The Globus Alliance proposed a common data transfer and access protocol named GridFTP that provides secure, efficient data movement in Grid environments. This protocol, which extends the standard FTP protocol, provides a superset of the features offered by the various Grid storage systems currently in use. In Grid environments, access to distributed data is typically as important as access to distributed computational resources. Distributed scientific and engineering applications require transfers of large amounts of data between storage systems, and access to large amounts of data by many geographically distributed applications and users for analysis and visualization. GridFTP is thus an extension of FTP that is suitable for Grid environments. Figure 3 shows the performance of FTP and GridFTP when transferring files of four different sizes. In our first experiment, we transferred these files (256, 512, 1024, and 2048 megabytes) from THU site alpha01 to HIT site gridhit3.

4.2 GridFTP with Parallel Data Transfer

Using multiple TCP streams can improve aggregate bandwidth over using a single TCP stream in WAN environments. We apply this feature of the GridFTP protocol to transfer files of different sizes in Data Grid environments. GridFTP (as well as normal FTP) defines multiple wire protocols, or MODES, for the data channel.
Most normal FTP servers only implement stream mode, i.e., the bytes flow in order over a single TCP connection. GridFTP defaults to this mode so that it is compatible with normal FTP servers.

Fig. 3. FTP versus GridFTP (file transfer time in seconds against file size in MB)

However, GridFTP has another mode, called Extended Block Mode, or MODE E. This mode sends the data over the data channel in blocks. Each block consists of 8 bits of flags, a 64-bit integer indicating the offset from the start of the transfer, and a 64-bit integer indicating the length of the block in bytes, followed by a payload of that many bytes. Because the offset and length are provided, out-of-order arrival is acceptable, i.e., the 10th block could arrive before the 9th because the receiver knows explicitly where each block belongs. This allows us to use multiple TCP channels. When the parallelism option is used, globus-url-copy automatically puts the servers into MODE E. Note that parallel data transfer with one TCP stream is not the same as no parallel data transfer at all. Both use a single stream, but the default uses stream mode, while parallel data transfer with one TCP stream uses MODE E [12].

Fig. 4. GridFTP with parallel data transfer (file transfer time in seconds against file size in MB, for GridFTP with no parallel data transfer and with 1, 2, 4, 8, and 16 TCP streams)

The parallelism option is used by the source data node to control how many parallel data connections may be established to each destination data node. Figure 4 shows the performance of GridFTP when transferring files of 256, 512, 1024, and 2048 megabytes with 1, 2, 4, 8, and 16 TCP streams from THU site alpha02 to Li-Zen site lz04. According to the experimental results, the parallel data transfer technique performs better for larger file sizes. By establishing multiple data channels, parallel data transfer improves aggregate bandwidth.

4.3 Replica Selection Cost Model

Following the replica selection scenario in 3.1, a user logs in at the local site, THU site alpha1, specifies the characteristics of the desired data, and passes this attribute description to the replica catalog server. The replica catalog server queries its database and produces a list of logical files that contain data with the specified characteristics. The replica catalog server then returns the physical locations of all registered replicas of the desired logical files. In this experiment, only one logical file, file-a, conforms to the user's request, and the size of file-a is 1024 megabytes.

Table 1. The value of replica selection cost model and file transfer time. Columns: alpha1, alpha4, hit0, lz02. Rows: $P^{BW}_{i\text{-}j}$, $P^{CPU}_{j}$, $P^{I/O}_{j}$, replica selection cost model, practical data transfer time.

Fig. 5. GUI of replica selection cost model program, panels (a) and (b)
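As a side note on the Extended Block Mode framing described in Section 4.2, the following Python sketch shows why per-block offsets make out-of-order arrival harmless. It is an illustration under stated assumptions, not a wire-compatible GridFTP implementation: the field order follows the prose description in this paper, big-endian byte order is an assumption, and the helper names are hypothetical. The GridFTP specification should be consulted for the exact wire layout.

```python
import struct
from typing import Iterable, Tuple

# Header as described in the text: 8 bits of flags, 64-bit offset,
# 64-bit length. Big-endian byte order is an assumption of this sketch.
HEADER = struct.Struct(">BQQ")


def pack_block(offset: int, payload: bytes, flags: int = 0) -> bytes:
    """Frame one payload with its header."""
    return HEADER.pack(flags, offset, len(payload)) + payload


def unpack_block(frame: bytes) -> Tuple[int, int, bytes]:
    """Split a frame into (flags, offset, payload)."""
    flags, offset, length = HEADER.unpack_from(frame)
    payload = frame[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated block")
    return flags, offset, payload


def reassemble(frames: Iterable[bytes], total_size: int) -> bytes:
    """Write each block at its offset, so arrival order does not matter."""
    buffer = bytearray(total_size)
    for frame in frames:
        _, offset, payload = unpack_block(frame)
        buffer[offset:offset + len(payload)] = payload
    return bytes(buffer)


data = b"0123456789abcdef"
blocks = [pack_block(i, data[i:i + 4]) for i in range(0, len(data), 4)]
# Deliver the blocks in reverse order to mimic out-of-order arrival.
restored = reassemble(reversed(blocks), len(data))
```

Because every block carries its own position, blocks received over different TCP streams can be written as they arrive, with no need to buffer them until earlier blocks show up.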

Next, the user passes this list of replica locations to the replica selection server, which identifies the destination storage system locations for all candidate data transfer operations. Three replicas map to the logical file file-a. They are located at three different sites: alpha4, hit0, and lz02. The replica selection server sends the candidate destination locations to the information server [17], which provides the three system factors described in 3.2. Based on the replica cost model presented in 3.3, the replica selection server chooses the best replica and transfers it to the local site alpha1 by GridFTP. Table 1 shows the values of the system factors, the scores of the replica selection cost model, and the physical file transfer times.

Following the discussion in 3.3, we implemented the replica selection cost model as a computer program and executed it on our Data Grid testbed. Because the program is written in the Java programming language, it can run on any computing platform with a JVM. Figure 5(a) shows the costs calculated from the three system factors (the percentage of CPU idle, I/O idle, and bandwidth) from the other sites to alpha1. Figure 5(b) displays the average value over the selected time scale, which is adjustable with the top scroll bar. A sorted list of the costs is available by clicking the Cost button.

5 Conclusions and Future Work

In this paper, we have presented the design and implementation of two fundamental services. The GridFTP protocol is an extension of the FTP protocol and provides beneficial features. In this research, we focused on parallel data transfer issues. After measuring the performance of GridFTP with the parallel data transfer feature, we confirm that this technology improves data transfer.
After measuring the performance of FTP and GridFTP with four different file sizes, we observed that their data transfer times are similar, even for a file size of 2 gigabytes. However, having measured the performance of GridFTP with 1, 2, 4, 8, and 16 TCP streams, we are confident that parallel data transfer reduces data transfer time. After calculating the scores of the replica selection cost model, we can sort a list of replicas from the most efficient replica to the worst one. Therefore, our cost model provides users and applications with a mechanism for choosing the best replica.

As future work, three investigations will be carried out from this research. First, although we have employed the parallel data transfer feature to improve data transfer performance, the striped data transfer feature can also improve aggregate bandwidth. Second, we will consider how to determine the weights of the system factors and take more system factors into account in the replica selection cost model. Third, we will extend our Data Grid testbed to analyze the performance of replica selection in a dynamic environment with a larger number of sites.

References

1. B. Allcock, J. Bester, J. Bresnahan, A. Chervenak, I. Foster, C. Kesselman, S. Meder, V. Nefedova, D. Quesnel, S. Tuecke, Data Management and Transfer in High Performance Computational Grid Environments, Parallel Computing, Vol. 28 (5), pp , May 2002.
2. B. Allcock, J. Bester, J. Bresnahan, A. Chervenak, I. Foster, C. Kesselman, S. Meder, V. Nefedova, D. Quesnel, S. Tuecke, Secure, Efficient Data Transport and Replica Management for High-Performance Data-Intensive Computing, IEEE Mass Storage Conference,
3. B. Allcock, S. Tuecke, I. Foster, A. Chervenak, and C. Kesselman, Protocols and Services for Distributed Data-Intensive Science, ACAT2000 Proceedings, pp ,
4. K. Czajkowski, S. Fitzgerald, I. Foster and C. Kesselman, Grid Information Services for Distributed Resource Sharing, Proceedings of the Tenth IEEE International Symposium on High-Performance Distributed Computing (HPDC-10), IEEE CS Press, August
5. K. Czajkowski, I. Foster, N. Karonis, C. Kesselman, S. Martin, W. Smith and S. Tuecke, A Resource Management Architecture for Metacomputing Systems, Proc. IPPS/SPDP 98 Workshop on Job Scheduling Strategies for Parallel Processing, pp ,
6. R. L. De, C. Costa and S. Lifschitz, Database Allocation Strategies for Parallel BLAST Evaluation on Clusters, Proceedings of the Distributed and Parallel Databases, Vol. 13, Issue 1, pp , Hingham, MA, USA, January
7. I. Foster, The Grid: A New Infrastructure for 21st Century Science, Physics Today, 55(2):42-47,
8. I. Foster, C. Kesselman, Globus: A Metacomputing Infrastructure Toolkit, Intl J. Supercomputer Applications, 11(2): ,
9. I. Foster and C. Kesselman, The Grid: Blueprint for a New Computing Infrastructure, Morgan-Kaufmann,
10. I. Foster, C. Kesselman and S. Tuecke, The Anatomy of the Grid: Enabling Scalable Virtual Organizations, Intl J. Supercomputer Applications, 15(3),
11. Global Grid Forum,
12. The Globus Project,
13. Introduction to Grid Computing with Globus,
14. SETI@home: Search for Extraterrestrial Intelligence at home, berkeley. edu/
15. SYSSTAT utilities home page,
16. R. Wolski, N. Spring and J. Hayes, The Network Weather Service: A Distributed Resource Performance Forecasting Service for Metacomputing, Journal of Future Generation Computing Systems, Vol. 15, No. 5-6, pp , October
17. X. Zhang, J. Freschl, and J. Schopf, A Performance Study of Monitoring and Information Services for Distributed Systems, Proceedings of HPDC, August 2003.


A Simulation Model for Large Scale Distributed Systems A Simulation Model for Large Scale Distributed Systems Ciprian M. Dobre and Valentin Cristea Politechnica University ofbucharest, Romania, e-mail. **Politechnica University ofbucharest, Romania, e-mail.

More information

A Finite State Mobile Agent Computation Model

A Finite State Mobile Agent Computation Model A Finite State Mobile Agent Computation Model Yong Liu, Congfu Xu, Zhaohui Wu, Weidong Chen, and Yunhe Pan College of Computer Science, Zhejiang University Hangzhou 310027, PR China Abstract In this paper,

More information

A Performance Evaluation of WS-MDS in the Globus Toolkit

A Performance Evaluation of WS-MDS in the Globus Toolkit A Performance Evaluation of WS-MDS in the Globus Toolkit Ioan Raicu * Catalin Dumitrescu * Ian Foster +* * Computer Science Department The University of Chicago {iraicu,cldumitr}@cs.uchicago.edu Abstract

More information

TERAGRID 2007 CONFERENCE, MADISON, WI 1. GridFTP Pipelining

TERAGRID 2007 CONFERENCE, MADISON, WI 1. GridFTP Pipelining TERAGRID 2007 CONFERENCE, MADISON, WI 1 GridFTP Pipelining John Bresnahan, 1,2,3 Michael Link, 1,2 Rajkumar Kettimuthu, 1,2 Dan Fraser, 1,2 Ian Foster 1,2,3 1 Mathematics and Computer Science Division

More information

High Performance Computing Course Notes Grid Computing I

High Performance Computing Course Notes Grid Computing I High Performance Computing Course Notes 2008-2009 2009 Grid Computing I Resource Demands Even as computer power, data storage, and communication continue to improve exponentially, resource capacities are

More information

Design of Distributed Data Mining Applications on the KNOWLEDGE GRID

Design of Distributed Data Mining Applications on the KNOWLEDGE GRID Design of Distributed Data Mining Applications on the KNOWLEDGE GRID Mario Cannataro ICAR-CNR cannataro@acm.org Domenico Talia DEIS University of Calabria talia@deis.unical.it Paolo Trunfio DEIS University

More information

Dynamic Data Grid Replication Strategy Based on Internet Hierarchy

Dynamic Data Grid Replication Strategy Based on Internet Hierarchy Dynamic Data Grid Replication Strategy Based on Internet Hierarchy Sang-Min Park 1, Jai-Hoon Kim 1, Young-Bae Ko 2, and Won-Sik Yoon 2 1 Graduate School of Information and Communication Ajou University,

More information

GFS: A Distributed File System with Multi-source Data Access and Replication for Grid Computing

GFS: A Distributed File System with Multi-source Data Access and Replication for Grid Computing GFS: A Distributed File System with Multi-source Data Access and Replication for Grid Computing Chun-Ting Chen 1, Chun-Chen Hsu 1, 2, Jan-Jan Wu 2, and Pangfeng Liu 1, 3 1 Department of Computer Science

More information

Two-Level Dynamic Load Balancing Algorithm Using Load Thresholds and Pairwise Immigration

Two-Level Dynamic Load Balancing Algorithm Using Load Thresholds and Pairwise Immigration Two-Level Dynamic Load Balancing Algorithm Using Load Thresholds and Pairwise Immigration Hojiev Sardor Qurbonboyevich Department of IT Convergence Engineering Kumoh National Institute of Technology, Daehak-ro

More information

High Throughput WAN Data Transfer with Hadoop-based Storage

High Throughput WAN Data Transfer with Hadoop-based Storage High Throughput WAN Data Transfer with Hadoop-based Storage A Amin 2, B Bockelman 4, J Letts 1, T Levshina 3, T Martin 1, H Pi 1, I Sfiligoi 1, M Thomas 2, F Wuerthwein 1 1 University of California, San

More information
