On enhancing GridFTP and GPFS performances


7th International Conference on Computing in High Energy and Nuclear Physics (CHEP09)
IOP Publishing, Journal of Physics: Conference Series 219 (2010) 052024
doi:10.1088/1742-6596/219/5/052024

On Enhancing GridFTP and GPFS performances

A. Cavalli 1, C. Ciocca 1, L. dell'Agnello 1, T. Ferrari 1, D. Gregori 1, B. Martelli 1, A. Prosperini 1, P. Ricci 1, E. Ronchieri 1, V. Sapunenko 1, A. Sartirana 2, D. Vitlacil 1 and S. Zani 1

1 INFN CNAF, Bologna, Italy
2 LLR, École Polytechnique, Paris, France

E-mail: elisabetta.ronchieri@cnaf.infn.it, vladimir.sapunenko@cnaf.infn.it
E-mail: sartiran@llr.in2p3.fr

Abstract. One of the most demanding tasks which computing in High Energy Physics has to deal with is the reliable, high-throughput transfer of large data volumes. Maximization and optimization of the data throughput are therefore key issues, which have to be addressed by detailed investigations of the involved infrastructures and services. In this note, we present some transfer performance tests carried out at the INFN-CNAF Tier-1 center, using SLC4 64-bit Grid File Transfer Protocol (GridFTP) servers and a disk storage system based on the General Parallel File System (GPFS) from IBM. We describe the testbed setup and report the measurements of throughput performances as a function of some fundamental variables, such as the number of parallel files and the number of streams per transfer, concurrent read and write activity, and the size of the data blocks transferred. During this activity, we have verified that a significant improvement in the performances of the GridFTP server can be obtained by using the 64-bit version of the operating system and of IBM GPFS.

1. Introduction
The optimization of Wide Area Network (WAN) data distribution in the framework of Grid computing is one of the key issues in the deployment, commissioning and operation of the computing models of the current generation of High Energy Physics (HEP) experiments. The computing infrastructures of these experiments are required to handle huge amounts of data that have to be stored and analysed. Moreover, such computing systems are built from resources which are distributed worldwide and connected by the Grid middleware services. The Grid infrastructure manages data transfers through a set of layered services: the File Transfer Service [1], the Storage Resource Managers [2], and the Grid File Transfer Protocol (GridFTP) servers [3]. At the lowest level, the GridFTP protocol is used to efficiently manage high-performance, reliable and secure point-to-point data transfers. GridFTP servers are standard components of the Globus middleware suite [4] and directly interface with the different file system technologies at the fabric level. In this paper, we present the results of some tests focused on the interaction and performance issues in a storage setup which combines GridFTP servers with the IBM General Parallel File System (GPFS) [5]. The aim of such tests is to measure the evolution of performances as a function of some fundamental variables: the number of parallel transfers, the number of streams per transfer, concurrent read and write activity, and the size of the data blocks transferred.
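As a concrete illustration of the lowest layer mentioned above, the following is a minimal sketch of a single GridFTP transfer driven by the globus-url-copy client, of the kind whose performance is studied in this note. The host name and paths are placeholders, and the option values (four parallel streams, a 2 MB TCP buffer) are illustrative rather than the exact settings used in the tests.

    # Copy one file from a GridFTP server to the local file system.
    # -vb prints performance markers during the transfer, -p sets the number of
    # parallel TCP streams, -tcp-bs the TCP buffer size, -fast reuses data channels.
    globus-url-copy -vb -fast -p 4 -tcp-bs 2097152 \
        gsiftp://gridftp01.example.org/gpfs/testdir/testfile \
        file:///tmp/testfile.copy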

2. Experimental Deployment of GridFTP and GPFS
In this section we briefly describe the facilities deployed to perform the tests. Figure 1 shows the schema of the SAN connecting the storage, the IBM GPFS servers and the GridFTP servers.

[Figure 1 schematic: the GridFTP data flow and the GPFS data flow go through a Gigabit Ethernet switch on the LAN; the GPFS I/O servers and the GridFTP servers are connected by 4 Gb/s FC links to the FC switches and, through 4x4 Gb/s FC, to the EMC CX4-960 storage (4 LUNs x 8 TB).]
Figure 1. Experimental deployment of GridFTP and GPFS.

Storage - We have used four Logical Unit Numbers (LUNs), each of 8 TB. The LUNs are served by an EMC CLARiiON CX4-960 [6] subsystem.

Blade servers - There are six M610 blades from a Dell PowerEdge M1000e Modular Blade Enclosure [7]. Each blade server has been equipped with: 2 quad-core Intel(R) Xeon(R) E5410 CPUs at 2.33 GHz; 16 GB of RAM; 2 SAS hard drives of 300 GB each, configured in RAID using the on-board LSI Logic / Symbios Logic SAS1068E controller and used as local disk.
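For a deployment of this kind, the state of the GPFS cluster, of the NSDs built on the four LUNs and of the file system can be inspected with the standard GPFS administration commands, as in the sketch below; the file system device name gpfs_test is a placeholder.

    # GPFS administration commands live in /usr/lpp/mmfs/bin
    export PATH=$PATH:/usr/lpp/mmfs/bin
    mmgetstate -a        # GPFS daemon state on every node of the cluster
    mmlscluster          # cluster layout, including the NSD server nodes
    mmlsnsd              # NSD-to-LUN mapping and the serving nodes
    mmlsfs gpfs_test     # file system attributes (block size, replication, ...)
    mmdf gpfs_test       # used and free space per NSD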

The six blade servers have been divided into two groups: four blades run the GPFS Network Shared Disk (NSD) servers, while the other two are used as GridFTP servers with direct access to the GPFS file system through Fibre Channel (FC) connections to the external storage. This configuration allows for a complete separation between the GridFTP and GPFS data flows, as shown in Figure 1.

Network - Each blade has a dual-channel QLogic Corporation QLA2432 4 Gb/s FC adapter and two on-board Gigabit Ethernet ports (only one has been used). The interconnection with the external storage is provided by 4 Gb/s optical FC links connecting each blade to two Brocade M4424 switches [8] installed inside the Dell M1000e blade enclosure, while for the Ethernet interconnection we have used a 48-port Extreme Networks Summit X450e [9] switch with a 10 Gb/s uplink.

3. Case Studies
The tests aim to independently evaluate the performances of the different layers of a typical GPFS-based storage system. In order to do so we have defined three case studies, each one testing a different layer of the storage system. In this section we give a brief description of the different case studies.

Case I - GPFS bare performance on a SAN node. For the preliminary tests we have used the simple copy (i.e., cp) command to check transfer rates between memory, local disks and SAN-attached disks.

Figure 2. GPFS bare performance on a SAN node - CP from-to GPFS.
Figure 3. GPFS bare performance on a SAN node - CP from GPFS to /dev/null.
Figure 4. GPFS bare performance on a SAN node - CP from GPFS to /tmp.
Figure 5. GPFS bare performance on a SAN node - CP from /tmp to GPFS.
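A minimal sketch of how such a bare copy test can be timed is shown below; the mount point /gpfs, the test file name and its size are placeholders.

    #!/bin/sh
    # Time a plain cp between GPFS, /dev/null and /tmp and print the resulting rate.
    SRC=/gpfs/testdir/testfile                     # a few-GB file on the GPFS file system
    SIZE_MB=$(( $(stat -c %s "$SRC") / 1048576 ))
    for DEST in /dev/null /tmp/testfile.copy /gpfs/testdir/testfile.copy; do
        sync                                       # flush dirty pages before each run
        START=$(date +%s)
        cp "$SRC" "$DEST"
        ELAPSED=$(( $(date +%s) - START ))
        [ "$ELAPSED" -eq 0 ] && ELAPSED=1
        echo "$SRC -> $DEST : $(( SIZE_MB / ELAPSED )) MB/s"
    done
    rm -f /tmp/testfile.copy /gpfs/testdir/testfile.copy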

This test aims to measure the performances of a simple file copy on a SAN node. Files can be contemporaneously read from and written to the GPFS file system, as described in Figure 2. They can as well be read from the GPFS file system and written either to the null device (i.e., /dev/null) or to a temporary directory (i.e., /tmp, where the local disk is used), as shown in Figures 3 and 4 respectively. Finally, files can also be read from /tmp and written to the GPFS file system, as described in Figure 5. In UNIX, the null device is a special file which discards all data written to it, reporting that the write operation succeeded, and provides no data to any process that reads from it, returning end-of-file immediately [10].

It is important to evaluate which disk I/O devices may be considered a bottleneck, having the FC disks (i.e., the GPFS partition) as the input stream and the local disk /tmp partition as the output data stream. The intention was also to evaluate which factors introduce a performance degradation when reading from GPFS and writing to /dev/null, compared with reading from GPFS and writing to /tmp: if the performances were the same, reading from GPFS would be the bottleneck, considering that local disk access should be slower than FC disk access.

Case II - GPFS performances on a SAN enabled GridFTP server. This test aims to measure the performances of GridFTP transfers over the SAN infrastructure. Files can be read from the GPFS file system and written to /tmp, and vice versa, as shown in Figures 6 and 7. They can also be contemporaneously read from and written to the GPFS file system, as described in Figure 8. In each case a GridFTP server is used. In Case I a given file is copied from a block device to another block device or to memory, while in this case the file is transferred by reading from a block device and then sending packets to one or more TCP sockets; GridFTP therefore introduces overheads that may affect the final performances. However, having configured the network parameters so that the loopback interface is used when the GridFTP server is addressed by its IP address or as localhost, no physical network layer is introduced.

Case III - GPFS performances with GridFTP transfers between two SAN enabled GridFTP servers. This last test measures the performances of GridFTP transfers between two SAN enabled servers connected by a 1 Gb plus 1 Gb Ethernet link (i.e., up to 1 Gb/s in each direction). Both unidirectional and bidirectional transfers have been tested. Files can be contemporaneously read from and written to the GPFS file system by using the two different GridFTP servers, as described in Figures 9 and 10. Figure 9 shows the case in which each GridFTP server can both read from and write to GPFS, while Figure 10 details the case in which one GridFTP server is used to read from GPFS and the other one to write to GPFS. Here a given file is read from a block device by one GridFTP server, sent over the 1 Gb link, and written to the same block device by the other GridFTP server. In Case III, therefore, the 1 Gb LAN connection (at most 0.125 GB/s per direction) limits the performances with respect to those observed in Case II during transfers from-to GPFS.

4. Testbed Setup
In this section we briefly describe the testbed used for running the different tests and measuring the metrics. The setup was very simple and relied on basic UNIX and Grid middleware commands wrapped in a Perl script.
The latter just allowed for the constant submission of transfers, keeping a stable load, and managed the scheduling of the various tests in sequence. In particular, we submitted GridFTP transfers using the globus-url-copy GridFTP client available with the Globus Toolkit middleware distribution, version 4. The measurement of the metrics on the servers was provided by the dstat monitoring tool, a Python-based command-line tool which allows for the measurement of CPU usage, network activity, load average and other quantities. Through a plugin mechanism, dstat can interface with various devices and measure their performance. External plugins are enabled by using a single -M option followed by one or more plugin names; in particular, a plugin for GPFS storage is available.
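A minimal shell sketch of a submission wrapper of this kind, here called run_transfers.sh (a placeholder name; the original tool was a Perl script), is the following. The host name, directory, file names and default parameter values are assumptions for illustration; with the default localhost endpoint it reproduces the Case II setup, while passing the address of the second server reproduces Case III.

    #!/bin/sh
    # Keep M concurrent GridFTP transfers running, each with N parallel streams,
    # for a fixed measurement window, then stop them.
    M=${1:-5}                  # number of parallel files
    N=${2:-4}                  # number of streams per transfer
    SERVER=${3:-localhost}     # Case II: loopback; Case III: the other GridFTP server
    WINDOW=${4:-1800}          # length of the measurement window, in seconds
    SRCDIR=/gpfs/testdir       # assumed location of the test files
    PIDS=""
    for i in $(seq 1 "$M"); do
        (
            # resubmit continuously so that the offered load stays constant
            while :; do
                globus-url-copy -p "$N" -fast \
                    "gsiftp://${SERVER}${SRCDIR}/testfile_${i}" \
                    "file://${SRCDIR}/out_${i}" >/dev/null 2>&1
            done
        ) &
        PIDS="$PIDS $!"
    done
    sleep "$WINDOW"
    kill $PIDS                 # transfers already in flight simply run to completion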

Figure 6. GPFS performances on a SAN enabled GridFTP server - Transfer from GPFS to /tmp.
Figure 7. GPFS performances on a SAN enabled GridFTP server - Transfer from /tmp to GPFS.
Figure 8. GPFS performances on a SAN enabled GridFTP server - Transfer from-to GPFS.
Figure 9. GPFS performances with GridFTP transfers between two SAN enabled GridFTP servers - Transfer from-to GPFS (each server both reads and writes).
Figure 10. GPFS performances with GridFTP transfers between two SAN enabled GridFTP servers - Transfer from-to GPFS (one server reads, the other writes).

The command line used,

    dstat -M time,epoch,cpu,mem,load,gpfs,net 30        (1)

provides logging of the following information: time stamp, both in hour and epoch format; CPU usage; memory usage; overall load; GPFS input and output rates; inbound and outbound network rates. All values are averaged over thirty-second intervals. Table 1 summarises all the statistics provided by the dstat command. The details of CPU usage are averaged over all the processors in the system.

Table 1. List of statistics.

  Statistic  Event    Description
  CPU        usr      percentage of time taken by user processes
             sys      processor usage by the kernel
             idl      percentage of time the CPU is sitting idle
             wai      time tasks spend waiting for I/O to complete
             hiq      percentage of time spent processing hardware interrupts
             siq      percentage of time spent processing software interrupts
  Load       1 min    average number of runnable processes waiting for the CPU over the last minute
             5 min    average number of runnable processes waiting for the CPU over the last five minutes
             15 min   average number of runnable processes waiting for the CPU over the last fifteen minutes
  Mem        used     used memory in bytes
             buff     buffered memory in bytes
             cach     cached memory in bytes
             free     free physical memory in bytes
  GPFS       read     number of bytes read
             write    number of bytes written
  Network    receive  number of bytes received
             send     number of bytes sent
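As an illustration of how these metrics can be collected and summarised, the sketch below logs the same dstat metrics to a CSV file during a test window and then averages the GPFS read and write columns. The --output option is available in recent dstat releases, and the field numbers used in the awk command are placeholders to be checked against the header of the generated file.

    # log every 30 s to a CSV file while the transfer test runs
    dstat -M time,epoch,cpu,mem,load,gpfs,net --output /tmp/gpfs_test.csv 30 &
    DSTAT_PID=$!
    # ... run the transfer test here ...
    kill "$DSTAT_PID"
    # average the GPFS read/write columns (fields 17 and 18 are an assumption)
    awk -F, 'NR > 7 && $17 ~ /^[0-9]/ { r += $17; w += $18; n++ }
             END { if (n) printf "avg GPFS read %.1f MB/s, write %.1f MB/s\n",
                                  r / n / 1048576, w / n / 1048576 }' /tmp/gpfs_test.csv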

5. Simulation Results
In this section we summarize the test results. Tables 2 and 3 show, respectively, the testbed software details and the set up characteristics used for performing the experiments. All tests have been executed varying the values of the relevant parameters, namely the number of parallel files transferred and the number of streams per file, expressed by M and N respectively. The tests have allowed us to collect a lot of useful information on the behaviour and the performances in accessing typical-size files on a GPFS storage system, either by direct POSIX access or by GridFTP. We have verified a low CPU load in all the case simulations. Plots of some of the most relevant tests are reported in this section together with summary results.

Table 2. Testbed software details.

  OS               SLC4 64-bit
  dstat            0.6.6
  GPFS             3.2.1-9
  globus gridftp   2.3
  globus-url-copy  3.2
  cp               5.2.1

Table 3. Set up characteristics.

  GPFS page pool       8 GB
  default GPFS cache   2.4 GB
  default TCP buffer   262 KB
  max TCP buffer       2 GB

Case I - GPFS bare performances on a SAN node. Up to five parallel file copies have been tested. In this case the network does not play any role. The GPFS file system has shown unidirectional read (i.e., GPFS to /dev/null, where the disk is not used) and write (i.e., /tmp to GPFS) performances up to 0.5 GB/s, as shown in Figures 11 and 12.

Figure 11. Bare reading rates from GPFS (to /dev/null and to /tmp) as a function of M.
Figure 12. Bare writing rates on GPFS (from /tmp) as a function of M.

The maximum sustained rate of contemporaneous read and write operations reaches 0.3 GB/s, as shown in Figure 13. The performance seems to smoothly decrease to 0.15 GB/s as the number of parallel files increases up to five. Figure 14 highlights this behaviour up to twenty parallel file copies.

Case II - GPFS performances on a SAN enabled GridFTP server. Up to twenty parallel files and ten streams per file have been tested. In this case the GridFTP server is directly SAN-attached, therefore the network throughput is almost zero. Performances vary from 0.2-0.3 GB/s read and write rate with 1-2 parallel transfers down to 0.1-0.15 GB/s at the highest numbers of parallel transfers tested. This seems to be fairly independent of the number of streams used in a single GridFTP transfer. Figures 15 and 16 show the GPFS throughput in read and write respectively, for i streams and j parallel files with 1 ≤ i ≤ N and 1 ≤ j ≤ M.
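Curves of this kind are obtained by sweeping the two parameters. A minimal sketch of such a sweep, reusing the hypothetical run_transfers.sh wrapper outlined in Section 4 and an illustrative grid of values, is the following:

    #!/bin/sh
    # Sweep the number of streams N and of parallel files M, as for Figures 15 and 16.
    # run_transfers.sh is the hypothetical wrapper sketched in Section 4;
    # the grids of values below are illustrative.
    for N in 1 2 4 8 10; do
        for M in 1 2 5 10 20; do
            echo "=== N=${N}, M=${M} ==="
            ./run_transfers.sh "$M" "$N" localhost
        done
    done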

Figure 13. Bare GPFS reading and writing rates as a function of M, up to five parallel file copies.
Figure 14. Bare GPFS reading and writing rates as a function of M, up to twenty parallel file copies.
Figure 15. Reading rates from GPFS as a function of N and M.
Figure 16. Writing rates on GPFS as a function of N and M.

Table 4 reports the average GPFS throughput in read and write, transferring files from GPFS to /tmp and vice versa, for the different numbers of streams N and of parallel files M.

Case III - GPFS performances with GridFTP transfers between two SAN enabled GridFTP servers. Up to twenty parallel files and ten streams per file have been tested. In this case the GridFTP servers are directly SAN-attached. Unidirectional transfers between the two GridFTP servers can be sustained saturating the 1 Gb Ethernet link. This is independent of the number of parallel transfers and of the number of streams per file.
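A bidirectional run of this kind can be sketched as follows: one transfer in each direction between the two GridFTP servers, each reading from and writing to the shared GPFS file system. Host names, paths and the number of streams are placeholders.

    #!/bin/sh
    SRV1=gridftp01.example.org
    SRV2=gridftp02.example.org
    # server 1 -> server 2
    globus-url-copy -p 4 -fast \
        "gsiftp://${SRV1}/gpfs/testdir/fileA" \
        "gsiftp://${SRV2}/gpfs/testdir/fileA.copy" &
    # server 2 -> server 1, at the same time
    globus-url-copy -p 4 -fast \
        "gsiftp://${SRV2}/gpfs/testdir/fileB" \
        "gsiftp://${SRV1}/gpfs/testdir/fileB.copy" &
    wait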

Table 4. Average of GPFS throughput (GB/s) in write and read, for transfers between /tmp and GPFS, at different combinations of the number of streams N and of parallel files M.

  /tmp-GPFS   .34   .323   2   .37
  /tmp-GPFS   .49   .489   2   .477
  /tmp-GPFS   .48   .463   2   .469
  GPFS-/tmp   .7    .      2   .48
  GPFS-/tmp   .8    .93    2   .86
  GPFS-/tmp   .7    .72    2   .74

Figures 17 and 18 highlight the GPFS performances when the two GridFTP servers are used to read from GPFS and to write to GPFS, respectively.

Figure 17. Reading rates from GPFS as a function of N and M by the 1st GridFTP server.
Figure 18. Writing rates on GPFS as a function of N and M by the 2nd GridFTP server.

Bidirectional transfers between the two servers have also shown to be able to saturate the 1 Gb/s Ethernet link, with a 0.124 GB/s read (write) performance (1 Gbit/s = 0.125 GB/s). Figure 19 shows the GPFS throughput with reads performed by the 1st GridFTP server and writes by the 2nd GridFTP server (the GPFS throughput in the other direction is almost the same). The saturation actually takes place for five or more parallel transfers. Figure 19 also shows that with a single transfer (i.e., M=1) the overall read/write rate is 0.08 GB/s.

Figure 19. GPFS rates as a function of N and M, reading by the 1st GridFTP server and writing by the 2nd GridFTP server: (a) read, (b) write.

Figures 17, 18, 19(a) and 19(b) show the GPFS throughput in read and write respectively, for i streams and j parallel files with 1 ≤ i ≤ N and 1 ≤ j ≤ M. The performance dependency on the number of parallel files can be explained by the usage of the operating system buffer cache: this needs further investigation.

6. Conclusions
In this paper we presented some tests which have been performed at the INFN-CNAF Tier-1 with GridFTP servers on a GPFS storage system.

In order to independently evaluate the performances of the different layers involved in a typical data transfer task, we have identified three case studies: GPFS bare performance on a SAN node, GPFS performances on a SAN enabled GridFTP server, and GPFS performances with GridFTP transfers between two SAN enabled GridFTP servers. The testbed setup was described. We reported the measurements of throughput performances as a function of the number of parallel transferred files and of the number of streams per transfer. The results show a strong dependence of the transfer rate on the number of parallel files; further investigations are needed in order to fully understand such dependence. In addition, it would be interesting to verify this layout in interaction with the SRM layer, in order to optimize both the number of streams per transfer and the number of parallel files, and to see how the system scales when the rate of transfer requests increases. This activity is part of the planned future tests.

References
[1] E. Laure, S. M. Fisher, A. Frohner, C. Grandi, P. Kunszt, A. Krenek, O. Mulmo, F. Pacini, F. Prelz, J. White, M. Barroso, P. Buncic, F. Hemmer, A. Di Meglio and A. Edlund, Programming the Grid with gLite, Journal of Computational Methods in Science and Technology, Vol. 12, No. 1, pp. 33-45, 2006.
[2] A. Shoshani, A. Sim and J. Gu, Storage Resource Managers: Middleware Components for Grid Storage, in Proceedings of the NASA Conference Publication series, 2002, http://romulus.gsfc.nasa.gov/msst/conf22/papers/d2ap-ash.pdf.
[3] W. Allcock, GridFTP: Protocol Extensions to FTP for the Grid, Global Grid Forum GFD-R-P.020, 2003.
[4] Globus middleware, http://www.globus.org/.
[5] F. Schmuck and R. Haskin, GPFS: A shared disk file system for large computing clusters, in Proceedings of the 2002 Conference on File and Storage Technologies (FAST), 2002.
[6] EMC CLARiiON CX4 Model 960, http://www.emc.com/products/detail/hardware/clariion-cx4-model-96.htm.
[7] Dell PowerEdge M1000e Modular Blade Enclosure, http://www.dell.com/content/products/productdetails.aspx/pedge_me?c=us&cs=&l=en&s=biz.
[8] Brocade M4424 4Gb Fibre Channel Switch for M1000e-Series Blade Enclosures, http://www.dell.com/content/products/productdetails.aspx/switch-brocade-m4424?c=us&cs=&l=en&s=biz.
[9] Extreme Networks Summit X450e, http://www.extremenetworks.com/products/summit-x4e.aspx.
[10] The Open Group Base Specifications, Issue 6, IEEE Std 1003.1, 2004 Edition, The IEEE and The Open Group, http://www.opengroup.org/onlinepubs/969399/.