Dimensioning storage and computing clusters for efficient high throughput computing


Journal of Physics: Conference Series. To cite this article: E Accion et al 2012 J. Phys.: Conf. Ser.

Dimensioning storage and computing clusters for efficient high throughput computing

E. Accion 1,3, A. Bria 1,2, G. Bernabeu 1,3, M. Caubet 1,3, M. Delfino 1,4, X. Espinal 1,2, G. Merino 1,3, F. Lopez 1,3, F. Martinez 1,3, E. Planas 1,2

1 Port d'Informació Científica (PIC), Universitat Autònoma de Barcelona, Edifici D, ES Bellaterra (Barcelona), Spain
2 Also at Institut de Física d'Altes Energies (IFAE), Universitat Autònoma de Barcelona, Edifici Cn, ES Bellaterra (Barcelona), Spain
3 Also at Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas (CIEMAT), Madrid, Spain
4 Also at Universitat Autònoma de Barcelona, Department of Physics, ES Bellaterra (Barcelona), Spain

Abstract. Scientific experiments are producing huge amounts of data, and the size of their datasets and the total volume of data continue to increase. These data are then processed by researchers belonging to large scientific collaborations, the Large Hadron Collider being a good example. The focal point of scientific data centers has shifted from efficiently coping with PetaByte-scale storage to delivering quality data processing throughput. The dimensioning of the internal components of High Throughput Computing (HTC) data centers is of crucial importance to cope with all the activities demanded by the experiments, both online (data acceptance) and offline (data processing, simulation and user analysis). This requires a precise setup involving disk and tape storage services, a computing cluster and the internal networking, to prevent bottlenecks, overloads and undesired slowness that lead to lost CPU cycles and batch job failures. In this paper we point out relevant features for running a successful data storage and processing service in an intensive HTC environment.

1. Introduction

Scientific experiments are experiencing an explosion of digital data production, both in quantity and size. Detectors and electronic devices in general are continuously increasing their intrinsic resolutions, delivering huge amounts of digital data. The Large Hadron Collider (LHC) at CERN envisaged this and started the WLCG [1] (Worldwide LHC Computing Grid) project, a computing infrastructure involving more than 140 sites around the world and serving a potential community of six thousand users. The Port d'Informació Científica (PIC) is one of the eleven first-level centers (known as Tier1 centers) and supports three of the four experiments at the LHC: ATLAS, CMS and LHCb. PIC also provides computing services for research groups in astrophysics, cosmology, neuroimaging and genomics, but their requirements are usually smaller than those of the LHC experiments.

The LHC delivers 15 PB of data every year, which is eventually stored, analyzed, archived and reprocessed. This means the same data is not processed only once but several times, using disk storage for online data and a different concept for long-term storage: nearline data.

The interaction among the processing nodes and the storage (online or nearline) has to be correctly synchronized to minimize CPU cycles lost to I/O waits and to maximize efficiency. This means that online data has to flow quickly and reliably, and nearline data has to be intelligently pre-staged onto buffer disk areas in front of the tape robots before the jobs start to run. One of the keys to reaching the required performance is the correct dimensioning of the network among all the parties: processing nodes, disk servers and tape servers. PIC runs a computing farm of 4000 cores, and the experiments estimate the average data I/O rate per job at about 5 MB/s. This would translate into a constant data flux of 20 GB/s between the nodes and the disk servers for a 100% occupied farm. Measurements show a yearly mean usage of 2 GB/s, which translates into approximately 60 PB of internal data exchange per year.

The acceptance and replication of data among the WLCG computing centers proceeds simultaneously with the internal data processing flows. Incoming data is steered automatically into nearline or online storage depending on its nature. The rate of exported and imported data on the WAN averaged around 2 Gbps during the last year, with peak values where all the available bandwidth (12 Gbps) was saturated for short periods of time. To illustrate this, Fig. 1 shows the global WAN traffic in and out of PIC during five weeks.

Figure 1. WAN traffic: data reception and exportation during one month.

The storage setup at PIC is not implemented using commercial solutions but innovative disk system managers. Disk storage is managed by dcache [2] and tape storage is managed by Enstore [5] (developed at FNAL, Chicago). The details are given in section 2. Achieving high throughput performance in a heterogeneous hardware environment requires a proper study and fine tuning of most of the nodes, such that a homogeneous network performance per TB can be achieved. This is discussed in section 3. The interaction between disk and tape should be transparent from the user and experiment point of view. The tape storage is not the classical backup mainframe system but an automated robotic library used as a high-latency disk; the details of the implementation of the tape system and its interaction with the disk storage are covered in section 4.
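The dimensioning figures quoted above can be cross-checked with a short calculation. The sketch below simply reproduces the arithmetic of the text (4000 job slots, about 5 MB/s per job, an observed yearly mean of 2 GB/s) and is not part of the PIC tooling.

# Back-of-the-envelope check of the farm-to-disk bandwidth figures quoted above.
# Inputs taken from the text: 4000 job slots, ~5 MB/s average I/O per job,
# and an observed yearly mean of 2 GB/s of internal traffic.

CORES = 4000                 # job slots in the computing farm
IO_PER_JOB_MB_S = 5          # experiments' estimate of average I/O per job
OBSERVED_MEAN_GB_S = 2       # yearly mean measured between nodes and disk servers

peak_gb_s = CORES * IO_PER_JOB_MB_S / 1000.0              # 20 GB/s for a fully busy farm
seconds_per_year = 365 * 24 * 3600
yearly_pb = OBSERVED_MEAN_GB_S * seconds_per_year / 1e6   # GB -> PB

print(f"Peak LAN demand (farm fully busy): {peak_gb_s:.0f} GB/s")
print(f"Internal data exchanged per year at the observed mean: ~{yearly_pb:.0f} PB")
# -> roughly the 60 PB per year quoted in the text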

2. Handling online and nearline storage

Two different kinds of storage are in place: disk and tape. Disk is normally used for short- and mid-term storage, or for storage that is planned to be accessed frequently. Tape is used for long-term storage which is planned to be accessed in a controlled manner and not too frequently. dcache handles the disk storage and Enstore handles the tape storage.

dcache is a software system for storing and retrieving huge amounts of data, distributed among a large number of heterogeneous disk nodes and presented under a single name space with different access protocols. It features an interface to a tertiary storage system, space management, pool attraction, dataset replication, hot spot determination and recovery from disk or node failures. The dcache installation at PIC is roughly separated into three components: pools, head nodes and doors.

Pools are servers which provide the raw disk storage capacity, normally composed of disks with some degree of redundancy. Redundancy at the level of the disk server is achieved with hardware and software RAID technologies: via hardware RAID controllers in some cases, or with software RAID systems like ZFS or mdraid. The configuration of the pools is as homogeneous as possible, though still heterogeneous to cater for the different hardware deployed and the different usage needs.

Doors are software components responsible for translating requests in any protocol to the internal dcache brokers. PIC currently has doors in production for the gridftp, dcap, http/webdav and xrootd protocols.

The dcache nodes that run the auxiliary services are known as head nodes. They are responsible for brokering requests across the pools, publishing information and other administrative tasks. In the case of PIC, servers run either as doors, pool nodes or head nodes.

Levels of redundancy beyond the one given by RAID can be achieved through dcache's support for keeping copies of the same file on different disk pools. This makes it possible to keep critical files available even if one disk server fails, and it also improves throughput: files accessed by many different clients can be replicated onto different disk servers so that the request load can be balanced. To do this, dcache uses a thermodynamic approach, dynamically copying files to different disk servers when it detects that a given file is hot (accessed by many clients), and removing the copies once that space is needed for another purpose. One of the critical aspects of dcache operation is its ability to handle disk server failures smoothly; a critical failure in a pool very rarely affects the overall system beyond losing availability of those files which were not duplicated.

Even though dcache is not able to manage tape directly, it has features to interact with a tertiary storage system, normally implemented by tape. When a given set of conditions is met, it launches a script that triggers the migration to, or recall from, the tertiary storage system. Enstore was chosen for this purpose.

Enstore is a software system that provides distributed access to data stored on tape and its management. It provides a generic interface so that users can access data in a similar way as they access native file systems. It features tape and robot management, scheduling of requests and handling of quotas. Enstore uses a single name space that can be shared with other storage systems to provide an interface for users.
The Enstore installation at PIC can be roughly separated into tape servers, head servers and client machines. Tape servers are machines with HBA (Host Bus Adapter) controllers directly attached to tape drives: LTO3, LTO4, LTO5 and T10KC drives are currently in production. The current setup at PIC uses 8 servers, each with 2 HBA controllers with 2 ports each, hence connected to 4 tape drives.

Head servers run the brain of the system, providing the brokering between the clients and the tape servers and running the services responsible for keeping quotas, grouping tape drives into different libraries and managing queues. The scheduling of the queue is done in a non-FIFO way to allow for optimizations, trying to keep a given tape mounted in order to alleviate mount/dismount latency penalties.

PIC's HSM solution is therefore the integration of the dcache and Enstore systems. dcache handles the disk servers and uses the tertiary storage interface to call a script that handles the translations needed from the dcache file to Enstore, which ends up calling a client command: encp, the main Enstore interface that copies data from disk to tape and vice versa. In this way, dcache can migrate files to tape or recall them from tape on request of the user. The script runs on the pool node and, by using the encp command, it generates an Enstore request handled by the Enstore servers. When a suitable tape drive is available, the tape is mounted there, the file is staged onto the pool node and from there it is served to the worker node. All meta-information of the file is stored in a shared name space system named PNFS [3]. A new version of the namespace called Chimera [4] is available in dcache, and PIC plans to migrate from PNFS to it before the end of the year. This allows Enstore and dcache to collaborate without having a big dependence on each other.
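A minimal sketch of the kind of glue script described above is given below. The real PIC script and the exact arguments dcache passes to it are not described in the paper, so the command-line convention here is purely hypothetical; the only assumption made about Enstore is the one stated in the text, namely that encp copies a file between a local path and a path in the shared namespace.

#!/usr/bin/env python
# Hypothetical sketch of a tertiary-storage glue script run on a pool node.
# The argument convention and function names are invented for illustration;
# the real dcache/Enstore integration at PIC is not reproduced here.

import subprocess
import sys


def flush_to_tape(pool_file, namespace_path):
    """Migrate a file from the disk pool to tape via Enstore."""
    # encp <local file> <namespace path>: Enstore schedules a tape mount and writes the file
    return subprocess.call(["encp", pool_file, namespace_path])


def stage_from_tape(namespace_path, pool_file):
    """Recall a file from tape into the disk pool."""
    # encp <namespace path> <local file>: Enstore mounts the tape and streams the file back
    return subprocess.call(["encp", namespace_path, pool_file])


if __name__ == "__main__":
    action, src, dst = sys.argv[1:4]
    rc = flush_to_tape(src, dst) if action == "put" else stage_from_tape(src, dst)
    sys.exit(rc)   # a non-zero exit code signals a failed migration/recall to the caller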
3. Disk storage

There is a large number of solutions for disk storage on the market. They can be categorized into two different branches: big, monolithic solutions, where one or multiple controllers manage a big set of disks in an opaque manner, normally achieving high resilience via hardware RAID controllers; and cheap, highly decoupled disks that are presented independently, usually referred to as JBOD systems (Just a Bunch Of Disks). On the latter, resilience is provided at a higher, software level, for instance with multiple independent copies on disk instead of relying on each copy on disk always being there.

The implementation chosen, with dcache, is in between these two extremes, showing characteristics of both systems. For example, a double level of resilience is implemented by providing redundancy at the device level (ZFS, RAID6) and at the application level, where dcache handles widely accessed files by triggering multiple independent copies in case one of them becomes inaccessible or the pool is under high load. The first kind of redundancy keeps the operational cost very low, since disk failures have very little interference with the rest of the system. The second kind of redundancy is also used to achieve faster read speeds when a file is hot (being accessed by many clients), by allowing clients to read from different sources rather than only one.

Three different approaches to hardware solutions are in place:

DAS with s/w RAID: this first type of disk server hardware is represented by SunFire x4500 servers. These are basically servers with 48 hard drives which are independently presented to the OS. The volume manager and redundancy capabilities normally offered by controllers are implemented with Solaris ZFS. Essentially, the main CPU of the machine acts as the RAID controller, as everything is done at OS level. That means tuning is also done at OS level, which is convenient as it is centralized. Data is served through the network with a 4x1GE interface aggregated through LACP (Link Aggregation Control Protocol).

SAN: a second hardware solution is a Data Direct Networks (DDN) S2A9900. This is a Storage Area Network (SAN) type system: 600 disks per system are served through a pair of controllers providing high availability. The controller arranges the disks in RAID6-like groups of 8+2 disks, which are then served through fibre channel, using SCSI protocols, to a set of blade servers, all of them running dcache. In this case the tuning is done at two levels: at the level of the controller and at the level of the fibre channel communication, controlled by the OS of the blade servers. For legacy reasons the chosen operating system is Solaris 10, with ZFS just adding striping. Tests have been done with Linux together with XFS, grouping disks with LVM (Logical Volume Manager) to get similar environments, with good performance results.

DAS with h/w RAID: these are SGI or Supermicro servers providing a high density of disks (36 disks in a 4U server) presented as a single virtual device through an internal controller, with redundancy. Every server is currently organized into three different devices, each composed of 10+2 disks arranged in RAID 6. Multiple performance tests have been performed to decide the exact configuration of the disks. With a RAID 60 (grouping 12 disks as one RAID 6 and then aggregating 3 of those groups as one block device) there was mainly one problem related to the controller: scalability with multiple streams suffered high degradation because of saturation of the CPU of the RAID controller. With three different RAID 6 groups aggregated using LVM, an improvement in throughput was observed, basically because a bigger block size (4 MB) could be set up. In this setup multiple-stream writes were relatively slow (of the order of 400 MB/s) while reads were fast (1 GB/s). This is acceptable, as it is relatively easy to parallelize writes onto different disk servers to achieve the desired throughput, whereas parallelizing reads is not as easy, since it requires a copy from one pool to another that can generate more problems than it solves.

Incoming and outgoing WAN traffic shows roughly a 1:1 read:write ratio, while LAN traffic shows a clear tendency towards reading, with a 2:1 read:write ratio. It can also be observed that traffic to and from tape is not high, but scalability to absorb eventual high throughput should be guaranteed, as tape drives are the most scarce resource. To ensure this scalability, the dcache system is configured such that all of the pools are eligible to be used for all purposes: WAN transfers, LAN transfers and tape recalls or migrations. This configuration choice has proven to deliver optimum performance, since it potentially gives access to the maximum number of available spindles at any given time for any requested action. It also enables efficient use of resources, since for instance all of the available disk space at a given time can be used as cache for files on tape.

The pool costs feature in dcache allows algorithms to be selected to balance the loads, so the system can adapt to the different kinds of load each server has. Pool costs are basically a way of assigning weights to pools dynamically. The load of the server (number of concurrent transfers to the disk server) and other variables such as available space can be taken into account to calculate the weight. For write requests, the weight is then used to select the least loaded server to write the data to.
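The pool-cost idea can be illustrated with a toy weight function. dcache's real cost module is configurable and more elaborate than this, so the formula below (a load term plus a free-space term with invented factors) is purely a hypothetical example of cost-based write-pool selection.

# Toy illustration of cost-based pool selection for writes. The weights and formula
# are invented for illustration only; dcache's actual cost module is configurable
# and combines performance and space costs in its own way.

def pool_cost(active_transfers, max_transfers, free_bytes, total_bytes,
              load_factor=1.0, space_factor=1.0):
    """Lower cost = more attractive pool for a new write."""
    load_term = active_transfers / float(max_transfers)    # how busy the server is
    space_term = 1.0 - free_bytes / float(total_bytes)     # how full it is
    return load_factor * load_term + space_factor * space_term


def select_write_pool(pools):
    """Pick the least loaded, least full pool from a list of pool descriptions."""
    return min(pools, key=lambda p: pool_cost(p["active"], p["max"],
                                              p["free"], p["total"]))


pools = [
    {"name": "pool01", "active": 40, "max": 100, "free": 20e12, "total": 100e12},
    {"name": "pool02", "active": 10, "max": 100, "free": 60e12, "total": 100e12},
]
print(select_write_pool(pools)["name"])   # -> pool02: fewer transfers and more free space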
While the disk system is deployed and tuned for optimal performance, it is important to monitor the usage that the LHC experiments make of the service and their access patterns. Fig. 2 shows the fraction of the data stored on disk which was actually uniquely read every month. This data has been obtained from the dcache Billing DB, where an exhaustive accounting of every data transfer in or out of the disk system is recorded.
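The metric plotted in Fig. 2 can be expressed compactly. The snippet below is only a conceptual reconstruction: the record layout is an assumption made here for illustration, while the real numbers come from the dcache billing database.

# Conceptual reconstruction of the Fig. 2 metric: the fraction of the data held on
# disk that is read at least once in a given month. The record format is assumed.

def fraction_read(read_records, bytes_on_disk):
    """read_records: iterable of (file_id, size_in_bytes) for one month's read transfers."""
    unique_bytes = {fid: size for fid, size in read_records}   # count each file only once
    return sum(unique_bytes.values()) / float(bytes_on_disk)

month_reads = [("f1", 2e9), ("f2", 5e9), ("f1", 2e9)]   # f1 read twice, counted once
print(f"{fraction_read(month_reads, 30e9):.0%}")         # -> 23% of the stored data read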

The results show that only around 20-30% of the data stored on disk is read every month by the large experiments. This seems to indicate that the system is running far from its full capacity, which suggests there is room for improvement in the overall efficiency of delivering the large disk storage service for the Tier1. These are preliminary results, which motivate a deeper analysis of such access patterns; this is considered future work.

Figure 2. Percentage of the data stored on disk which is accessed by each experiment every month at the PIC Tier1.

4. Tape storage

Tape storage is managed by Enstore. The setup can be categorized into three components: head servers, tape servers and tape libraries.

Head servers

There are four head servers deployed in each of the two instances (production and test), both running in separate environments.

Configuration server: contains the Enstore central configuration. It is responsible for maintaining and distributing all the information about the system configuration across all the components of the tape system. This server also hosts a centralized log file service for all the components. A web server also runs on the configuration server, acting as an interface to the tape system monitoring: accounting, rates, system status, etc. It also runs a real-time application showing the state of the system, drive rates, drive buffer occupation, etc.

Library manager (LM): runs two different groups of processes. The first group is formed by the virtual libraries, defined by the combination of a physical library and a media type; these processes are responsible for managing the queues of requests. The second group of processes running on the LM are the media changers, which need to be defined for each physical library; these processes are responsible for launching the actual commands that operate the robot, mounting and dismounting tapes from the drives.

Backup: runs backups of the data and provides the storage space needed for migrations.

Database: runs the PostgreSQL databases needed by the other services. There used to be one instance of PostgreSQL serving three different databases: file and volume catalog, drive status

and accounting, but it has recently been split into three different instances of PostgreSQL serving one database each, to avoid interference among the databases. A lot of the information regarding files is not stored exclusively in this database; it is also exported to PNFS as a way to share information with dcache. Every tape file has an internal and an external ID: the PNFS ID is used by dcache and the BFID is meaningful to Enstore, so the storage system can handle both disk and tape requests.

Tape servers

Tape servers are currently Dell R710 machines with two Fibre Channel HBA controllers having two 4 Gbps fibre-channel ports each, thus being able to connect to 4 different tape drives. Each tape server controls drives of different tape technologies, so that the failure of a few tape servers does not completely affect one technology. As the network requirements of modern tape drives are around 150 MB/s, 10GE connections are needed for each tape server in order to prevent bottlenecks. Aggregation of 1GE links is not optimal, as individual streams can exceed 120 MB/s and, with the currently implemented networking technology, it is not possible to efficiently distribute one stream among different network interfaces. Tape servers also need well-dimensioned memory to scale with the number of tape drives they control. Production tape servers have 32 GB of RAM, split into 12 GB for the system and 5 GB per tape-drive-controlling process. Each tape drive is managed by a single Python process called a mover; the current configuration of the tape servers involves four of these processes running concurrently, each of them responsible for handling a different tape drive. For load balancing and high availability, each tape server runs different drive media (LTO3, LTO4, LTO5 and T10KC).
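The dimensioning rules quoted above are easy to parameterise. The helper below simply restates them (12 GB of RAM for the system plus 5 GB per mover, and roughly 150 MB/s of network traffic per modern drive); it is a sketch and not PIC tooling.

# Dimensioning rules for a tape server as quoted in the text.

def tape_server_ram_gb(n_drives, system_gb=12, per_mover_gb=5):
    """RAM needed: a fixed share for the OS plus one mover process per drive."""
    return system_gb + n_drives * per_mover_gb

def tape_server_net_gbps(n_drives, per_drive_mb_s=150):
    """Aggregate network demand of the attached drives, converted from MB/s to Gbps."""
    return n_drives * per_drive_mb_s * 8 / 1000.0

print(tape_server_ram_gb(4))      # -> 32 GB, matching the production servers
print(tape_server_net_gbps(4))    # -> 4.8 Gbps: beyond 1GE aggregation, hence 10GE uplinks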

Tape libraries

The system handles two different tape libraries: an IBM TS3500 and an Oracle/STK SL8500. Table 1 shows a breakdown of the different tape libraries and technologies in use.

Table 1. Number of tapes and tape drives per library. The two libraries host the LTO3, LTO4, LTO5 and T10K drive and tape generations in production; among the figures reported are 8 LTO3 drives, 1490 LTO3 tapes, 4 and 16 LTO4 drives, 4 LTO5 drives and 2494 T10K tapes.

There are tunable key features provided by Enstore for optimizing performance. Some of the problems already addressed are enumerated below.

Data distribution on tapes: data placement is relevant to exploit locality. Given the physical restriction that only one tape can be mounted in a given tape drive at a given time, if all relevant data is placed on a single tape (not so strange considering that tape sizes nowadays reach 5 TB), then retrieving that data must be done sequentially and with only one tape drive. This can lead to suboptimal performance, especially when there are not enough concurrent requests to make use of a high number of tapes, as the tape drive utilization will drop substantially. To solve this, a parameter (file family width) can be configured that indicates the maximum number of streams a given set of data will use when migrating to tape. Thus, if 5 TB of data are migrated to tape with file family width = 5, then five tapes will be mounted and five streams written, storing 1 TB of data on each tape. One wants to limit that number because, by using the maximum number of available tape drives, one could easily run into starvation problems (a single migration process monopolizing the resources). The number is also tuned taking into account the resources assigned to each project. One should also be careful that a higher number of streams can lead to an unwanted excess of data fragmentation, with data scattered among too many different tapes.

Saturation of disk servers on disk-to-tape copies: it can happen that only one of the disk servers holds most of the data that is going to be recalled from or migrated to tape. One could then run into disk contention, stalling tapes. This is clearly unwanted: tape drives are the slowest and scarcest resource, so one wants to maximize their utilization and make them the bottleneck. To control this, the Enstore discipline feature is used to limit the number of concurrent accesses to tape drives from a given disk server. In this way the disk servers' throughput capability is throttled to prevent them from using the tape drives in a sub-optimal manner.

Avoiding tape dismounts: Enstore's library manager has optimizations to avoid paying the dismount penalty. It implements a HAVE_BOUND state that is configured to last for about two minutes after a tape is no longer requested. That means that after a file has been served from a tape, if a new request appears before the two-minute timeout, the tape will still be mounted, saving the dismount/mount time. Enstore also handles a queue of requests which is auto-shuffled to group requests accessing the same tape, hence saving some extra time in mount/dismount operations. In addition, at the level of dcache, the minimum amount of data needed to trigger a migration can be tuned, preventing a constant leakage of small quantities of data that would force a tape to be mounted and dismounted all the time.
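The mount-saving optimisation can be pictured with a minimal request-grouping step. Enstore's library manager performs this kind of reordering internally (together with the HAVE_BOUND timeout), so the code below is only a conceptual sketch with an invented request format.

# Conceptual sketch of reordering a recall queue so that requests for the same tape are
# served back to back, saving mount/dismount cycles. The request format is invented;
# Enstore's real queue optimisation is internal to its library manager.

from collections import OrderedDict

def group_by_tape(requests):
    """requests: list of (tape_label, file_path). Returns the queue reordered so that all
    requests touching one tape are processed before moving on to the next tape."""
    by_tape = OrderedDict()
    for tape, path in requests:
        by_tape.setdefault(tape, []).append(path)
    return [(tape, path) for tape, paths in by_tape.items() for path in paths]

queue = [("VOL001", "/a"), ("VOL002", "/b"), ("VOL001", "/c"), ("VOL002", "/d")]
print(group_by_tape(queue))   # VOL001 requests first, then VOL002: two mounts instead of four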
5. Network

Networking is the layer that holds everything together, and careful dimensioning of the network flows is important to avoid making the network the bottleneck. The topology of choice is star-like, in order to follow a simple structured cabling, where the central component is a Cisco 6509-E. There is an ongoing migration of this component to a Cisco Nexus 7009, providing the required scalability in terms of 10GE ports. There are basically three data flows with significant bandwidth requirements: WAN connections, worker nodes reading/writing from/to the disk system, and worker nodes reading/writing from/to the tape servers. Roughly, the bandwidth is dominated by the worker nodes interacting with the storage system, and it is estimated to be close to 5-10 MB/s per job. Due to a 40 Gb bandwidth limitation between modules, the Cisco 6509-E is not able to cope with these requirements.

Getting a new switch with more bandwidth was too expensive an option at that time and, given that most of the traffic comes from the interaction of the worker nodes with the storage system, it was decided to buy a set of two switches (Arista 7148SX) that operate as one virtual switch with 92 effective 10 Gbps ports, wire-speed and with latencies in the nanosecond range. All L3 traffic among worker nodes, the storage subsystem and the tape servers is handled at the level of this switch, off-loading the most intensive network bandwidth from the main switch/router.

Intensive data movement makes most Ethernet frames as large as the MTU (Maximum Transmission Unit). The usual default MTU value of 1500 bytes can have an impact on the CPU of the intervening parties and can significantly drop the performance of transfers.

Thus, the possibility of using Jumbo Frames was investigated and adopted after seeing improvements in both CPU utilization and data throughput with an MTU value of 9000 bytes. From the worker node side the improvement in combined throughput was 30%, and the improvement in disk server throughput was 40% (DDN case).

It has also been observed that the default values found in most common kernels do not match HTC environment requirements. As an example, the TCP max buffer size, a kernel parameter that sets the size of the buffer used as the TCP window, tends to default to tiny values (of the order of 16 KB) that do not provide enough buffering to sustain high-throughput transfers over the WAN. As seen on the ESnet web pages [6], the recommended value is 16 MB, but a lower value of 8 MB was chosen due to the high number of connections needed to the disk servers and the memory restrictions on the disk server machines. Another interesting value to tune is the max backlog, the maximum number of packets that can be left unprocessed before the kernel starts discarding new ones. By default this value is 1000 packets, which is too low for HTC environments; the recommendation found on the ESnet web pages [6] is to set this value much higher. Similarly, the transmit queue length of the network device should be increased, as recommended for HTC environments. An HTC environment handles large quantities of data, and that can lead to congestion. The congestion control algorithm used up to now has been BIC (Binary Increase Congestion), the default in older Linux kernels. There is an ongoing evaluation of CUBIC [7], an alternative algorithm available in more recent Linux kernels that is an enhanced version of BIC: it simplifies the BIC window control and improves its TCP-friendliness and RTT-fairness.

Overall, it should be emphasized that, to correctly dimension the network in an HTC environment like PIC, one needs to take into account all the available tuning at different levels, together with an analysis of the network flows by throughput, in order to optimize the economical component of the solution.
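The kernel parameters discussed in this section map onto standard Linux sysctls; that mapping, and the check below, are an assumption of this sketch rather than part of the PIC setup. The 8 MB buffer size and the BIC/CUBIC comparison come from the text, while the backlog target used here is only a placeholder, since the exact figure is not reproduced above.

# Quick check of a node's network tuning against the targets discussed in this section.
# The 8 MB TCP buffer comes from the text; the netdev_max_backlog target is a placeholder.

TARGETS = {
    "/proc/sys/net/core/rmem_max": 8 * 1024 * 1024,      # TCP receive buffer ceiling
    "/proc/sys/net/core/wmem_max": 8 * 1024 * 1024,      # TCP send buffer ceiling
    "/proc/sys/net/core/netdev_max_backlog": 30000,      # placeholder large value, not PIC's
}

def check():
    for path, target in TARGETS.items():
        current = int(open(path).read().split()[0])
        status = "OK" if current >= target else "LOW"
        print(f"{status:3} {path}: {current} (target >= {target})")
    algo = open("/proc/sys/net/ipv4/tcp_congestion_control").read().strip()
    print(f"congestion control: {algo} (BIC used so far, CUBIC under evaluation)")

if __name__ == "__main__":
    check()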
6. Conclusions

An HTC service in production has been described. It has been pointed out that growing in capacity is not a brute-force game of adding disks but a subtle scaling strategy involving many layers of the data center. Public centers usually have a mix of hardware, due in part to public tender procurement procedures; this adds a level of complexity, and a certain amount of R&D is needed when new hardware is being deployed. For this reason it is of great importance to keep the storage management layers as decoupled as possible from the peculiarities of the low-level settings, such as the OS. This allows everything to be wrapped up in a single framework that co-operates transparently at the application layer. It has also been shown that, once the disk management application is defined, the interconnection with the tape system has to be as simple as possible, treating the tape libraries as a high-latency disk.

As discussed in section 5, the main characteristic of an HTC environment is an intensive use of data on disk and tape. Successful processing can only be achieved if the network among the tape system, the disk system and the worker nodes is optimized and correctly dimensioned to be capable of handling I/O bursts. For instance, when a heavy data processing campaign starts, full usage of the batch system is usually required, with a huge number of jobs starting almost at the same time. For this workload profile, the handling of hot files and an intelligent way of replicating them as they are accessed is crucial to prevent inefficiencies or an eventual unintended denial of service, as pointed out in section 2.

Science is entering a high-resolution phase in the current digital era: the amount of data collected is growing exponentially and the size of the data is also in constant expansion. For this reason, data centers supporting scientific experiments with huge computing demands naturally tend towards an HTC environment with the best possible performance. This results in a data center that runs many tasks where all the performance metrics matter (CPU usage, jobs/h, MB/s processed, I/O rates).

7. Acknowledgments

The Port d'Informació Científica (PIC) is maintained through a collaboration between the Generalitat de Catalunya, CIEMAT, IFAE and the Universitat Autònoma de Barcelona. This work was supported in part by grants FPA C02-01/02 and FPA C02-01/02 from the Ministerio de Educación y Ciencia, Spain. We would especially like to thank the dcache teams at DESY, FNAL and NDGF, and the Enstore team at FNAL, for their hard work, support and co-operation.

8. References

[1] WLCG Computing Technical Design Report.
[2] The dcache project web page.
[3] dcache PNFS.
[4] dcache Chimera.
[5] Fermilab Enstore group web page.
[6] ESnet, a high-speed network serving thousands of Department of Energy scientists and collaborators worldwide.
[7] Injong Rhee and Lisong Xu, "CUBIC: A New TCP-Friendly High-Speed TCP Variant", in Proceedings of the Third PFLDNet Workshop (France, February 2005).


More information

Data center requirements

Data center requirements Prerequisites, page 1 Data center workflow, page 2 Determine data center requirements, page 2 Gather data for initial data center planning, page 2 Determine the data center deployment model, page 3 Determine

More information

designed. engineered. results. Parallel DMF

designed. engineered. results. Parallel DMF designed. engineered. results. Parallel DMF Agenda Monolithic DMF Parallel DMF Parallel configuration considerations Monolithic DMF Monolithic DMF DMF Databases DMF Central Server DMF Data File server

More information

Design a Remote-Office or Branch-Office Data Center with Cisco UCS Mini

Design a Remote-Office or Branch-Office Data Center with Cisco UCS Mini White Paper Design a Remote-Office or Branch-Office Data Center with Cisco UCS Mini February 2015 2015 Cisco and/or its affiliates. All rights reserved. This document is Cisco Public. Page 1 of 9 Contents

More information

DIRAC pilot framework and the DIRAC Workload Management System

DIRAC pilot framework and the DIRAC Workload Management System Journal of Physics: Conference Series DIRAC pilot framework and the DIRAC Workload Management System To cite this article: Adrian Casajus et al 2010 J. Phys.: Conf. Ser. 219 062049 View the article online

More information

The JINR Tier1 Site Simulation for Research and Development Purposes

The JINR Tier1 Site Simulation for Research and Development Purposes EPJ Web of Conferences 108, 02033 (2016) DOI: 10.1051/ epjconf/ 201610802033 C Owned by the authors, published by EDP Sciences, 2016 The JINR Tier1 Site Simulation for Research and Development Purposes

More information

Operating Systems. Lecture File system implementation. Master of Computer Science PUF - Hồ Chí Minh 2016/2017

Operating Systems. Lecture File system implementation. Master of Computer Science PUF - Hồ Chí Minh 2016/2017 Operating Systems Lecture 7.2 - File system implementation Adrien Krähenbühl Master of Computer Science PUF - Hồ Chí Minh 2016/2017 Design FAT or indexed allocation? UFS, FFS & Ext2 Journaling with Ext3

More information

IBM Tivoli Storage Manager for Windows Version Installation Guide IBM

IBM Tivoli Storage Manager for Windows Version Installation Guide IBM IBM Tivoli Storage Manager for Windows Version 7.1.8 Installation Guide IBM IBM Tivoli Storage Manager for Windows Version 7.1.8 Installation Guide IBM Note: Before you use this information and the product

More information

GFS: The Google File System

GFS: The Google File System GFS: The Google File System Brad Karp UCL Computer Science CS GZ03 / M030 24 th October 2014 Motivating Application: Google Crawl the whole web Store it all on one big disk Process users searches on one

More information

Technology Insight Series

Technology Insight Series IBM ProtecTIER Deduplication for z/os John Webster March 04, 2010 Technology Insight Series Evaluator Group Copyright 2010 Evaluator Group, Inc. All rights reserved. Announcement Summary The many data

More information

File Access Optimization with the Lustre Filesystem at Florida CMS T2

File Access Optimization with the Lustre Filesystem at Florida CMS T2 Journal of Physics: Conference Series PAPER OPEN ACCESS File Access Optimization with the Lustre Filesystem at Florida CMS T2 To cite this article: P. Avery et al 215 J. Phys.: Conf. Ser. 664 4228 View

More information

Copyright 2012, Oracle and/or its affiliates. All rights reserved.

Copyright 2012, Oracle and/or its affiliates. All rights reserved. 1 Storage Innovation at the Core of the Enterprise Robert Klusman Sr. Director Storage North America 2 The following is intended to outline our general product direction. It is intended for information

More information

Red Hat Gluster Storage performance. Manoj Pillai and Ben England Performance Engineering June 25, 2015

Red Hat Gluster Storage performance. Manoj Pillai and Ben England Performance Engineering June 25, 2015 Red Hat Gluster Storage performance Manoj Pillai and Ben England Performance Engineering June 25, 2015 RDMA Erasure Coding NFS-Ganesha New or improved features (in last year) Snapshots SSD support Erasure

More information

Chapter 12: Mass-Storage Systems. Operating System Concepts 8 th Edition,

Chapter 12: Mass-Storage Systems. Operating System Concepts 8 th Edition, Chapter 12: Mass-Storage Systems, Silberschatz, Galvin and Gagne 2009 Chapter 12: Mass-Storage Systems Overview of Mass Storage Structure Disk Structure Disk Attachment Disk Scheduling Disk Management

More information

iscsi Technology Brief Storage Area Network using Gbit Ethernet The iscsi Standard

iscsi Technology Brief Storage Area Network using Gbit Ethernet The iscsi Standard iscsi Technology Brief Storage Area Network using Gbit Ethernet The iscsi Standard On February 11 th 2003, the Internet Engineering Task Force (IETF) ratified the iscsi standard. The IETF was made up of

More information

Best Practices for Deploying a Mixed 1Gb/10Gb Ethernet SAN using Dell EqualLogic Storage Arrays

Best Practices for Deploying a Mixed 1Gb/10Gb Ethernet SAN using Dell EqualLogic Storage Arrays Dell EqualLogic Best Practices Series Best Practices for Deploying a Mixed 1Gb/10Gb Ethernet SAN using Dell EqualLogic Storage Arrays A Dell Technical Whitepaper Jerry Daugherty Storage Infrastructure

More information

Improving Performance using the LINUX IO Scheduler Shaun de Witt STFC ISGC2016

Improving Performance using the LINUX IO Scheduler Shaun de Witt STFC ISGC2016 Improving Performance using the LINUX IO Scheduler Shaun de Witt STFC ISGC2016 Role of the Scheduler Optimise Access to Storage CPU operations have a few processor cycles (each cycle is < 1ns) Seek operations

More information

MASS-STORAGE STRUCTURE

MASS-STORAGE STRUCTURE UNIT IV MASS-STORAGE STRUCTURE Mass-Storage Systems ndescribe the physical structure of secondary and tertiary storage devices and the resulting effects on the uses of the devicesnexplain the performance

More information

Backup and archiving need not to create headaches new pain relievers are around

Backup and archiving need not to create headaches new pain relievers are around Backup and archiving need not to create headaches new pain relievers are around Frank Reichart Senior Director Product Marketing Storage Copyright 2012 FUJITSU Hot Spots in Data Protection 1 Copyright

More information

IBM ProtecTIER and Netbackup OpenStorage (OST)

IBM ProtecTIER and Netbackup OpenStorage (OST) IBM ProtecTIER and Netbackup OpenStorage (OST) Samuel Krikler Program Director, ProtecTIER Development SS B11 1 The pressures on backup administrators are growing More new data coming Backup takes longer

More information

The Oracle Database Appliance I/O and Performance Architecture

The Oracle Database Appliance I/O and Performance Architecture Simple Reliable Affordable The Oracle Database Appliance I/O and Performance Architecture Tammy Bednar, Sr. Principal Product Manager, ODA 1 Copyright 2012, Oracle and/or its affiliates. All rights reserved.

More information