arxiv:cs.dc/ v2 14 May 2002

Similar documents
The NorduGrid Architecture and Middleware for Scientific Applications

Atlas Data-Challenge 1 on NorduGrid

Usage statistics and usage patterns on the NorduGrid: Analyzing the logging information collected on one of the largest production Grids of the world

The NorduGrid production Grid infrastructure, status and plans

Data transfer over the wide area network with a large round trip time

Analysis of internal network requirements for the distributed Nordic Tier-1

High bandwidth, Long distance. Where is my throughput? Robin Tasker CCLRC, Daresbury Laboratory, UK

XenaCompact TM Series

Network Design Considerations for Grid Computing

USING ISCSI AND VERITAS BACKUP EXEC 9.0 FOR WINDOWS SERVERS BENEFITS AND TEST CONFIGURATION

TCP and BBR. Geoff Huston APNIC

Intel PRO/1000 PT and PF Quad Port Bypass Server Adapters for In-line Server Appliances

TCP and BBR. Geoff Huston APNIC

The Future of High-Performance Networking (The 5?, 10?, 15? Year Outlook)

Securing Grid Data Transfer Services with Active Network Portals

Solace Message Routers and Cisco Ethernet Switches: Unified Infrastructure for Financial Services Middleware

10 Gbit/s Challenge inside the Openlab framework

TCP and BBR. Geoff Huston APNIC

IEEE 1588 PTP clock synchronization over a WAN backbone

Setting-up WAN Emulation using WAN-Bridge Live-CD v1.10

Why Your Application only Uses 10Mbps Even the Link is 1Gbps?

NIC TEAMING IEEE 802.3ad

TCP and BBR. Geoff Huston APNIC. #apricot

Maximizing NFS Scalability

Operating the Distributed NDGF Tier-1

Topics. TCP sliding window protocol TCP PUSH flag TCP slow start Bulk data throughput

Azure database performance Azure performance measurements February 2017

iscsi Technology Brief Storage Area Network using Gbit Ethernet The iscsi Standard

CSE3213 Computer Network I

USER MANUAL. VIA IT Deployment Guide for Firmware 2.3 MODEL: P/N: Rev 7.

Server Specifications

Optimizing Performance: Intel Network Adapters User Guide

Benchmarking third-party-transfer protocols with the FTS

SLE experience over unreliable network links

THE GLOBUS PROJECT. White Paper. GridFTP. Universal Data Transfer for the Grid

Lotus Sametime 3.x for iseries. Performance and Scaling

Layer 4 TCP Performance in carrier transport networks Throughput is not always equal to bandwidth

Future trends in distributed infrastructures the Nordic Tier-1 example

Computer Networks. ENGG st Semester, 2010 Hayden Kwok-Hay So

Chapter 4. Routers with Tiny Buffers: Experiments. 4.1 Testbed experiments Setup

Internet data transfer record between CERN and California. Sylvain Ravot (Caltech) Paolo Moroni (CERN)

Chapter 1. Computer Networks and the Internet

Performance of Multicast Traffic Coordinator Framework for Bandwidth Management of Real-Time Multimedia over Intranets

PORTrockIT. IBM Spectrum Protect : faster WAN replication and backups with PORTrockIT

The Network Layer and Routers

PERFORMANCE TUNING TECHNIQUES FOR VERITAS VOLUME REPLICATOR

Networks: Communicating and Sharing Resources

Guide to Networking Essentials, 6 th Edition. Chapter 7: Network Hardware in Depth

Profiling Grid Data Transfer Protocols and Servers. George Kola, Tevfik Kosar and Miron Livny University of Wisconsin-Madison USA

LANCOM Techpaper Routing Performance

Vess A2000 Series. NVR Storage Appliance. Milestone Surveillance Solution. Version PROMISE Technology, Inc. All Rights Reserved.

SLE experience over unreliable network links

CS 162 Operating Systems and Systems Programming Professor: Anthony D. Joseph Spring Lecture 21: Network Protocols (and 2 Phase Commit)

Cisco Wide Area Application Services (WAAS) Mobile

Performance Analysis of Applying Replica Selection Technology for Data Grid Environments*

Networks & protocols research in Grid5000 DAS3

HyperIP : SRDF Application Note

ASN Configuration Best Practices

ClearStream. Prototyping 40 Gbps Transparent End-to-End Connectivity. Cosmin Dumitru! Ralph Koning! Cees de Laat! and many others (see posters)!

Video over LAN Gigabit Ethernet Switch Setup

This tutorial shows how to use ACE to Identify the true causes of poor response time Document the problems that are found

Solving the Data Transfer Bottleneck in Digitizers

KEMP 360 Vision. KEMP 360 Vision. Product Overview

COMMVAULT. Enabling high-speed WAN backups with PORTrockIT

Appendix B. Standards-Track TCP Evaluation

Internet Architecture & Performance. What s the Internet: nuts and bolts view

Cisco XR Series Service Separation Architecture Tests

Chapter 4 NETWORK HARDWARE

File Transfer: Basics and Best Practices. Joon Kim. Ph.D. PICSciE. Research Computing 09/07/2018

A data Grid testbed environment in Gigabit WAN with HPSS

Reduces latency and buffer overhead. Messaging occurs at a speed close to the processors being directly connected. Less error detection

ENSC 427: COMMUNICATION NETWORKS (Spring 2011) Final Report

LECTURE WK4 NETWORKING

The Total Network Volume chart shows the total traffic volume for the group of elements in the report.

2. Traffic. Contents. Offered vs. carried traffic. Characterisation of carried traffic

Why You Should Consider a Hardware Based Protocol Analyzer?

Performance of the NorduGrid ARC and the Dulcinea Executor in ATLAS Data Challenge 2

BlueGene/L. Computer Science, University of Warwick. Source: IBM

Send documentation comments to You must enable FCIP before attempting to configure it on the switch.

SAM at CCIN2P3 configuration issues

DISTRIBUTED HIGH-SPEED COMPUTING OF MULTIMEDIA DATA

PeerApp Case Study. November University of California, Santa Barbara, Boosts Internet Video Quality and Reduces Bandwidth Costs

Two-Tier Oracle Application

CNPE Communications and Networks Lab Book: Data Transmission Over Digital Networks

2. Traffic lect02.ppt S Introduction to Teletraffic Theory Spring

Impact of TCP Window Size on a File Transfer

Optimize and Accelerate Your Mission- Critical Applications across the WAN

Introduction to TCP/IP Offload Engine (TOE)

EMC Celerra Replicator V2 with Silver Peak WAN Optimization

Server Specifications

Computer Communication & Networks / Data Communication & Computer Networks Week # 03

NCP Computing Infrastructure & T2-PK-NCP Site Update. Saqib Haleem National Centre for Physics (NCP), Pakistan

Multimedia Environment for Mobiles (MEMO) - Interactive Multimedia Services to Portable and Mobile Terminals

CHARON-VAX application note

Host Solutions Group Technical Bulletin August 30, 2007

Radyne s SkyWire Gateway Quality of Service

Tortoise vs. hare: a case for slow and steady retrieval of large files

Data Management for the World s Largest Machine

Bridging and Switching Basics

Configuring Modular QoS Congestion Avoidance

Transcription:

Performance evaluation of the GridFTP within the NorduGrid project M. Ellert a, A. Konstantinov b, B. Kónya c, O. Smirnova c, A. Wäänänen d arxiv:cs.dc/2523 v2 14 May 22 a Department of Radiation Sciences, Uppsala University, Box 535, 751 21 Uppsala, Sweden b University of Oslo, Department of Physics, P. O. Box 48, Blindern, 316 Oslo, Norway c Particle Physics, Institute of Physics, Lund University, Box 118, 22 Lund, Sweden d Niels Bohr Institutet for Astronomi, Fysik og Geofysik, Blegdamsvej 17, DK-2, Copenhagen Ø, Denmark Abstract This report presents results of the tests measuring the performance of multi-threaded file transfers, using the GridFTP implementation of the Globus project over the NorduGrid network resources. Point to point WAN tests, carried out between the sites of Copenhagen, Lund, Oslo and Uppsala, are described. It was found that multiple threaded download via the high performance GridFTP protocol can significantly improve file transfer performance, and can serve as a reliable data transfer engine for future Data Grids. 1 Introduction Development of the Grid technologies is the goal of many emerging projects in distributed, dataintensive computing. The idea of merging the capacity of computers worldwide is particularly appealing for tasks, which demand extensive processing of big amounts of data, located at geographically distributed databases. The Grid provides innovative solutions, by introducing specific toolkits, which merge the computers involved into uniform networks. NorduGrid [1] is the project with the aim to create the Grid computing infrastructure in Nordic countries, making use of the available middleware. Project participants include universities and research centers in Denmark, Sweden, Finland and Norway. A secure, reliable, efficient and high performance data transfer mechanism over high-bandwidth wide area networks is a key component of any kind of Grid infrastructure. The Globus metacomputing project [2] has proposed the GridFTP protocol [3] which contains extensions to the standard highly popular FTP protocol, in order to meet the requirements of high performance wide area data movement. Their extended file transfer protocol supports the following features: Grid Security Infrastructure (GSI) [4] partial file transfer reliable data transfer third-party (from server to server) transfer automatic negotiation of TCP buffer sizes parallel (multi-threaded) data transfer data channel encryption 1

The Globus team has provided software implementation of their protocol in terms of production libraries and tools. The GridFTP code used was the alpha-4 release, checked out from the Globus CVS [5]. All the new features except for the TCP buffer negotiation, has been implemented, and the code became a part of the Globus Toolkit TM 2 release [6]. The purpose of this investigation was to examine the alpha release of the GridFTP code, test and compare its performance to standard FTP file transfer within the framework of the NorduGrid project. Because the GridFTP will be the underlying data transfer engine of many Grid projects, it is very important to have a first-hand experience of its performance and capabilities over the NorduGrid hardware and network resources. Particularly interesting issue is the expected performance improvements due to the new parallel transfer mechanism of the GridFTP protocol. It is well-known that the utilisation of parallel streams in data transfer over high speed WAN connections is a feasible solution to overcome the TCP buffer size limitations without the necessity of the modifications of any sensitive TCP system parameters [7], therefore parallel streams can lead to a significant improvement in data throughput. Although several multi-threaded file transfer clients exist based on the single-threaded FTP protocol, the Globus Project s GridFTP implementation is one of a very few, which support parallel streams on the protocol level. In the tests presented here, the parallel GridFTP performance over the NorduGrid network have been measured. September 21 NORWAY 2.5 Gbps 1Gbps 2.5 Gbps Lyngby Copenhagen Malmo Lund 155 Mbps 622 Mbps DENMARK Uppsala Oslo Stockholm Copenhagen SWEDEN Lund WAN lines: 2.5 Gbps, NorduNet 622 Mbps, SUNET 155 Mbps, SUNET km Figure 1: Connectivity map of the NorduGrid participating sites. 2 Test environment Four NorduGrid sites, Copenhagen, Lund, Oslo and Uppsala, have been participating in the tests. The sites are connected via the high speed NORDUnet network [8], the actual network configuration is given in Figure 1. 2

Table 1: Hardware specifications of the GridFTP servers. GridFTP CPU Memory NIC LAN server connectivity Lund P-III 1 GHz 512 MB Intel Pro VM Mbit/s Uppsala P-III 866 MHz 512 MB Intel Pro VE Mbit/s Copenhagen 2 P-III 933 MHz 512 MB Alteon Ace NIC 1 Gbit/s Gigabit Ethernet Oslo P-III 866 MHz 128 MB EtherExpress Pro Mbit/s Table 2: Installed software versions. GridFTP Platform Globus GridFTP server Toolkit TM software Lund Mandrake 8. 1.1.3b14 gsi-wuftpd-.5 Uppsala RedHat 7.1 1.1.3b14 gsi-wuftpd-.5 Copenhagen RedHat 6.2 1.1.3b14 gsi-wuftpd-.5 Oslo RedHat 7.1 1.1.3b14 gsi-wuftpd-.5 At each site a dedicated Linux server (the Globus gatekeeper of the local Grid cluster) with a GridFTP server installed was used in the investigation. The servers of Lund and Uppsala are connected through a Mbit/s link to the LAN, the Copenhagen server has a Gigabit connection, while the Oslo server at the time of the investigation had only a Mbit/s LAN connection. The hardware configurations of the GridFTP servers are listed in Table 1; more detailed description can be found on the NorduGrid Web-site [9]. The Globus Toolkit TM version 1.1.3b14, shipped with the globus-url-copy GridFTP client, was installed and configured on the machines. Each site ran the GSI-enabled Globus-modified version of the WU-FTPD server []. The particular software configuration of the servers is listed in Table 2. 3 Method of measurement The globus-url-copy tool and the gsi-wuftpd server from the Globus alpha release 4, were used as the client and server respectively. A test file of a size of Mbytes was transferred among the sites, since it was found that this was large enough to average out most of the network fluctuations. A single test consisted of repeated number of measurements of the transfer time of the same file over the same link within a short time interva;. During the tests, the default TCP settings were not modified, since the purpose was to study the performance without tuning system parameters. Throughput measurements with respect to different number of parallel threads were performed over three different network connections: Lund-Copenhagen (15 router hops) Lund-Uppsala (9 router hops) Lund-Oslo (11 router hops) An example of the command issued to transfer a file in 6 parallel threads from a local node to Uppsala is: globus-url-copy -p 6 \ file:/tmp/mb.tmp gsiftp://grid.tsl.uu.se/tmp/mb.tmp During the investigation, it was found that the background load of the network link had a significant influence on the download performance: for example, a single-threaded download time of the test file from Uppsala to Lund could change from 64s in the morning to 284s in the afternoon. However, 3

Transfer rate, Mbit/s 35 3 25 2 15 5 FTP clients Figure 2: Download performance of conventional FTP clients. for a reasonable period of time (approximately half an hour) the network conditions could be considered being stable. During the measurements, it was found that most of the cases over any given link the downloading time did not vary more than 5%, provided the downloads were made within 3 minutes. Therefore, in order to eliminate the effect of changing network load, download times were compared only if the downloads were performed over the same network link and taken within the same short time interval; furthermore, all the unreconsctructable and outstanding values were discarded. For a specific collection of data it was required that the difference in the download time of the test file measured at the beginning and at the end of the test session did not exceed 5%: in this way one could assume that the test downloads were carried out over a relatively constant network load of the particular network link. 4 Results 4.1 FTP clients Before investigating the parallel performance of the GridFTP protocol and the globus-url-copy client, tests comparing different Linux and MS Windows based FTP clients were carried out. Downloads were performed from the Uppsala server using different FTP clients installed on a dual booting (Mandrake 8. Linux and W2K) P-III 1 GHz machine in Lund. Under Linux, the FTP, NcFTP, lftp, SSH copy and the Prozilla [11] clients were tested. The Prozilla application is a multi-threaded Linux FTP client. It is capable of opening multiple connections to a server, where each of the connections downloads a part of a file, and upon completion of downloads, the partial files are merged. After rebooting the machine to W2K, the tests were repeated with FTP, Windows Commander built-in client, and CuteFTP Pro. The latter client allows up to 4 parallel connections per transferred file. The download performance of the clients were compared to single and multiple threaded GridFTP downloads. The results are shown in Figure 2 and Figure 3. It can be seen that multi- 4

Transfer rate, Mbit/s 6 5 4 GridFTP (Linux) Prozilla (Linux) CuteFTP Pro (W2K) 3 2 1 2 3 4 6 8 16 Figure 3: Download performance of multi-threaded FTP clients. threaded clients generally improve the transfer time, while Linux clients routinely outperform those of W2K. The Linux multi-threaded downloader Prozilla, although providing higher transfer rates with increasing thread number, generally consumes too much time during the reconstruction phase, slowing down the performance. GridFTP provides the fastest downloads already with 4 threads, and can improve further, providing LAN performance and the network load allow for it. 4.2 Lund-Uppsala transfer The main GridFTP tests were carried out between Lund and Uppsala. At both sites, the GridFTP servers are connected to their LAN with a fast Mbit/s Ethernet link. Lund has a 155 Mbit/s connection to the Malmö backbone, which is connected via a 622 Mbit/s line to Stockholm, and Stockholm in turn has a 155 Mbit/s line to Uppsala (see Figure 1). The network load of the data paths on the day of the measurements is shown in Figures 4 to 6 [12]. Most of the day (9: thru 1: CET), the Lund-Malmö link is usually overloaded (running at 9-% of its total capacity), while the other two links have a moderate load of 4%. Two sets of tests were performed: the first series of transfer measurements were taken in the morning at a relatively low network load, while the second was performed over a congested network at peak time. The throughput performance of the GridFTP with respect to the number of parallel threads via transferring the Mbyte test file between Lund and Uppsala was measured (see Figure 7). Usage of parallel threads radically increased the transfer rate both over the congested and uncongested network, regardless of the load. The performance was steadily increasing for up to 64 threads; over 96 threads instabilities and lower transfer rates were experienced. Over a congested network, the throughput with 64 threads increased 78% and 98% respectively in the two directions, compared to the normal single-threaded transfer. In case of the unloaded network, the transfer rate increased from 2 Mbit/s to 6 Mbit/s (Lund to Uppsala), which is actually around the maximum throughput of a Mbit/s LAN. This means that over the Lund-Uppsala data path already the LAN performance represents the bottleneck in the multi-threaded GridFTP 5

Figure 4: Daily network load of the Malmo-Lund link (percentage of total capacity, in- and outgoing traffic). Figure 5: Daily network load of the Stockholm-Malmo link (percentage of total capacity, in- and outgoing traffic). 6

Figure 6: Daily network load of the Stockholm-Uppsala link (percentage of total capacity, in- and outgoing traffic). Transfer rate, Mbit/s 8 7 6 5 4 Uppsala to Lund 7:3-8:5 17: - 17:4 Transfer rate, Mbit/s 8 7 6 5 4 Lund to Uppsala 8: - 8:15 16:5-17:5 3 3 2 2 8 16 32 64 96 128 256 1 2 3 4 6 8 16 32 64 96 128 256 1 2 3 4 6 Figure 7: Transfer rates for GridFTP downloads between Uppsala and Lund, measured at different network load. 7

Transfer rate, Mbit/s 8 7.8 7.6 7.4 7.2 7 6.8 6.6 6.4 6.2 6 1 2 3 4 6 8 16 32 64 96 128 256 Figure 8: GridFTP transfer rates for download from Oslo to Lund. downloads. 4.3 Lund-Oslo transfer Lund is connected to Oslo through the 155 Mbit/s Lund-Malmo, 2.5 Gbit/s Malmo-Stockholm and 2.5 Gbit/s Stockholm-Oslo links. However, the Mbit/s LAN connection of the Oslo server was the real bottleneck on the Lund-Oslo data path. The result of the multi-threaded test downloads from Oslo to Lund is shown in Figure 8. The 7 Mbit/s single-threaded transfer rate is already close to the Mbit/s theoretical maximum, thus the Mbit/s networking bottleneck seriously limits the functionality of the parallel downloads. 4.4 Lund-Copenhagen transfer The Lund-Copenhagen datalink, despite the geographical neighbourhood of the two sites, represents the longest network connection in these GridFTP tests (15 router hops). The data travel from Lund to Stockholm, then from Stockholm through a 2.5 Gbit/s link to Lyngby, the Danish NORDUnet gateway, and finally arrives to Copenhagen (see Figure 9). Performing parallel downloads from Lund to Copenhagen, a throughput gain from 7 Mbit/s to 3 Mbit/s (over a congested network) was experienced, obtained with 32 threads compared to the single download rate. 4.5 Third party transfers Finally, third party transfer tests from Copenhagen (controller) between the servers of Lund and Uppsala were made (see Figure ). A third party transfer implies two FTP control channels to servers from the controller, and one data channel between the two servers. The parallel threads resulted in a performance gain similar to one of the case of the direct transfer (compare Figure 7 and Figure ). The transfer rate increased from the single-threaded 15-2 M bit/s up to 5-55 Mbit/s, achieved with 32 threads. 8

Transfer rate, Mbit/s 35 3 25 2 15 5 1 2 3 4 6 8 16 32 64 96 128 256 Figure 9: GridFTP transfer rates for download from Lund to Copenhagen. Transfer rate, Mbit/s 6 5 Lund to Uppsala Uppsala to Lund 4 3 2 1 2 3 4 6 8 16 32 64 96 128 256 Figure : GridFTP transfer rates for the third-party (Copenhagen) transfer between Lund and Uppsala. 9

4.6 Stability issues In spite of the alpha status of the GridFTP software implementation, it was found that the globusurl-copy client and the gsi-wuftpd server are rather stable. However, using more than hundred parallel streams can sometimes cause instabilities. On several occasions, with large number of used threads, the start-up phase of the download froze for several seconds. Restarting the transfer usually solved the problem. More common unstable behaviour of the above--threads downloads was the extremely slow completion of the last few hundred kilobytes. On few occasions, the download process in its finishing state (retrieving the last few hundred bytes) suddenly restarted from the beginning. A general conclusion is that the software implementation for above ca. threads became unreliable. We expect that these instabilities will be resolved with the final Globus Toolkit TM 2 release. 5 Summary In this report the high performance GridFTP data transfer mechanism was evaluated over the NorduGrid resources. The tests were mainly focused on the performance gain due to the usage of parallel streams. Another purpose was to compare the performance of the Globus GridFTP implementation to several ordinary FTP clients. It was found that the performance of the conventional FTP clients differ not more than 2 percent and that the single threaded GridFTP could deliver performance in the same range. The multi-threaded GridFTP transfers resulted in a remarkable performance increase of about 6-8 percent, compared to a single threaded GridFTP (or conventional FTP) downloads. Parallel threads lead to increased throughput over both congested and unloaded networks. It s worth to point out that in some of the cases the LAN or the actual hardware configuration became the bottleneck in the GridFTP tests. Parallel third party (server to server) transfers were tested and similar performance enhancement to direct transfers was achieved. It is clearly unfair towards other users to execute a multi-threaded transfer over a congested network, as it consumes bandwith proportional to the number of threads. However, if a big bandwith is available, like in the case of a fat long dedicated line, a multi-threaded transfer is the only way to use all the available capacity without changing the system parameters. The tests were by no means exhaustive, many other cases could have been considered. The NorduGrid network resources are going to be considerably upgraded (the Lund-Malmö and the Copenhagen-Lyngby links) and the tests are planned to be extended using the upgraded network; moreover a dedicated 1GBit/s CERN-Copenhagen link is about to be set up for tests purposes in the near future. The alpha status software is proved to be relatively stable, since irregularities occurred only for downloads involving exaggerated amount of threads (above ). After these tests were done, the Globus Toolkit TM 2 was released, which includes an upgraded version of GridFTP. The conclusion is that the multi-threaded GridFTP significantly boosted the data throughput over the NorduGrid network and it is certainly a very promising solution for high performance data transfer. 6 Acknowledgements The NorduGrid project is supported by the Nordunet2 programme, financed by the Nordic Council of Ministers and by the Nordic Governments. References [1] NorduGrid project, www.nordugrid.org [2] The Globus project, www.globus.org

[3] Definition of the GridFTP protocol, features and implementation, www.globus.org/datagrid/deliverables/gridftp-overview-221.pdf, www-fp.mcs.anl.gov/dsl/gridftp-protocol-rfc-draft.pdf [4] Globus Security Infrastructure, www.globus.org/security [5] The GridFTP alpha 4 release, www.globus.org/gsiftp-alpha/release-alpha4.html [6] Globus toolkit version 2, www.globus.org/gt2 [7] TCP Tuning Guide for Distributed Application on Wide Area Networks, www-didc.lbl.gov/tcpwan.html [8] NORDUnet, www.nordu.net [9] NorduGrid hardware resources, www.nordugrid.org/hardware.html [] WU-FTPD project, www.wu-ftpd.org [11] Multi-threaded Linux FTP client, prozilla.delrom.ro [12] NORDUnet network statistics, www.nordu.net/stats, stats.noc.kth.se/stats 11