Parallel Storage Systems for Large-Scale Machines

Parallel Storage Systems for Large-Scale Machines. Doctoral Showcase. Christos FILIPPIDIS (cfjs@outlook.com), Department of Informatics and Telecommunications, National and Kapodistrian University of Athens. We acknowledge the support of the Special Account for Research Grants of the National and Kapodistrian University of Athens.

Research Challenges. Large-scale scientific computations tend to stretch the limits of computational power, and parallel computing is generally recognized as the only viable approach to high-performance computing problems. As processor speeds skyrocket, I/O has become the bottleneck in application performance, leaving storage hardware and software struggling to keep up.

Factors affecting I/O performance. The most important factors affecting I/O performance are: 1. the number of parallel processes participating in the transfers, 2. the size of the individual transfers, 3. the I/O access patterns, and 4. the storage architecture being used.
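
These four factors reappear later as the "input parameters" of the proposed architecture. A minimal sketch of how they could be captured as a single parameter record, assuming a hypothetical IOProfile type (the names are illustrative, not part of IKAROS):

```python
from dataclasses import dataclass

@dataclass
class IOProfile:
    """Hypothetical record of the four factors listed above."""
    parallel_processes: int   # 1. number of processes taking part in the transfers
    transfer_size_bytes: int  # 2. size of each individual transfer
    access_pattern: str       # 3. I/O access pattern, e.g. "sequential" or "random"
    architecture: str         # 4. storage architecture, e.g. "shared-fs" or "dedicated-hdds"

# Example: 96 clients writing 64 MiB chunks sequentially to a shared file system
profile = IOProfile(96, 64 * 2**20, "sequential", "shared-fs")
print(profile)
```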

I/O performance limitations. 1. Globally shared file systems built on current storage architectures have several I/O performance limitations when used with large-scale systems: bandwidth does not scale economically to large-scale systems, and I/O traffic on the high-speed network and on each storage server can be affected by other, unrelated jobs. 2. There is a lack of coordination in the overall data flow (remote-local access).

Doctoral Contribution. This study: 1. proposes a dynamically coordinated I/O architecture driven by input parameters (the topology/profile of the infrastructure and the load metrics); 2. creates, on the fly, dedicated or semi-dedicated clusters of HDDs per job; and 3. provides coordinated parallel data transfers over the overall data flow.
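
A minimal sketch of what such a coordinator could look like, assuming a hypothetical plan_io() entry point that takes the job's clients, the topology/profile (candidate I/O nodes), and current load metrics, and returns a per-job chunk mapping; this is illustrative only, not the IKAROS implementation:

```python
def plan_io(job_clients, io_nodes, load, chunk_size=64 * 2**20):
    """Return a chunk-to-I/O-node mapping for one job.

    job_clients -- identifiers of the job's client processes
    io_nodes    -- candidate I/O nodes (topology/profile input parameter)
    load        -- dict mapping I/O node -> current load metric (lower is better)
    """
    # Prefer the least-loaded I/O nodes so the job gets a (semi-)dedicated set.
    ranked = sorted(io_nodes, key=lambda n: load.get(n, 0.0))
    chosen = ranked[:max(1, len(job_clients))]
    # Assign each client to one of the chosen nodes, round-robin.
    mapping = {c: chosen[i % len(chosen)] for i, c in enumerate(job_clients)}
    return {"chunk_size": chunk_size, "client_to_node": mapping}

plan = plan_io(["c0", "c1", "c2"],
               ["io0", "io1", "io2", "io3"],
               {"io0": 0.9, "io1": 0.1, "io2": 0.2, "io3": 0.5})
print(plan)
```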

IKAROS Framework. IKAROS is a write-optimized system. It manages storage resources (I/O nodes, network, storage media) in all Tiers, based on input parameters. (* Each Tier is made up of several computing centers and provides a specific set of services.)

IKAROS Features (compared with HDFS, PVFS2, and GPFS)

Deployment model:
- HDFS: co-locates compute and storage on the same node.
- PVFS2: separate compute and storage nodes.
- GPFS: separate compute and storage nodes.
- IKAROS: the user/application can choose either model, on the fly.

Data layout:
- HDFS: exposes the mapping of chunks to datanodes to Hadoop applications.
- PVFS2: maintains stripe layout information as extended attributes, but it is not exposed to applications.
- GPFS: not exposed to applications.
- IKAROS: decides the chunk mapping on demand, based on input parameters, and exposes the mapping of chunks to applications and users.

Compatibility:
- HDFS: custom API and semantics for specific users.
- PVFS2: UNIX.
- GPFS: UNIX.
- IKAROS: UNIX, Windows, Mac.

WAN capabilities:
- HDFS: can be exported through WebDAV.
- PVFS2: can be exported through pNFS.
- GPFS: can be exported through pNFS.
- IKAROS: built-in remote access capabilities; supports parallel-channel WAN data transfers, striping servers, and third-party data transfers.
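
One common way to realize the "parallel channels WAN data transfers" listed for IKAROS is to split a file into byte ranges and pull each range over its own HTTP connection. A minimal sketch under that assumption (the URL, chunk size, and channel count are placeholders; this is not the IKAROS client):

```python
import concurrent.futures
import urllib.request

URL = "https://example.org/dataset.bin"  # placeholder endpoint
CHUNK = 64 * 2**20                       # 64 MiB per request
CHANNELS = 8                             # number of parallel HTTP channels

def fetch_range(start, end):
    """Fetch one byte range on its own HTTP connection."""
    req = urllib.request.Request(URL, headers={"Range": f"bytes={start}-{end}"})
    with urllib.request.urlopen(req) as resp:
        return start, resp.read()

def parallel_download(total_size):
    ranges = [(o, min(o + CHUNK, total_size) - 1) for o in range(0, total_size, CHUNK)]
    with concurrent.futures.ThreadPoolExecutor(max_workers=CHANNELS) as pool:
        parts = dict(pool.map(lambda r: fetch_range(*r), ranges))
    # Reassemble the ranges in offset order.
    return b"".join(parts[o] for o in sorted(parts))
```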

IKAROS Architecture. Three node types; all nodes are peers. The latest version is implemented in Node.js.

Remote-Local Access Overview. Reverse-read and reverse-HTTP approaches mainly route data, creating an I/O bottleneck and several read-write operations. IKAROS: direct access to each I/O node, regardless of the Tier.

IKAROS vs PVFS2+GridFTP. PVFS2+GridFTP: we must manually synchronize the stripe size and the stripe mapping between them, and we have to initiate many independent transfers, incurring much overhead to set up and release connections. IKAROS: we apply only coordinated parallel data transfers, which minimizes disk and network contention.

HPC Environment. We compare IKAROS with GPFS in an HPC environment (N clients). We create, on the fly, dedicated or semi-dedicated clusters of HDDs per job. Goal: isolate the I/O functions of a process from other, unrelated jobs.

Testbed (Cytera Machine)

- Compute nodes: 96; each with 12 Intel Xeon CPU cores, 48 GB of RAM, and a 15K rpm local HDD.
- Storage nodes: 4; 360 TB of raw disk space in 18 RAID 6 arrays, each with 10 7,200 rpm HDDs.
- GPFS metadata system: 4, hosted at the storage nodes; RAID 10 arrays (one associated with each server).
- Network connectivity: QDR (40 Gbit/s) InfiniBand.

Testbed Features, Profiling (Determining the Input Parameter). Network: QDR (40 Gbit/s) InfiniBand. Storage media (compute and storage nodes): ~140 MB/s. Optimal file chunk distribution, input parameter: client/HDD ratio = 1/4. This result is due to the storage media queuing mechanisms.
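
A short back-of-the-envelope helper that turns this profile into an allocation, assuming the ~140 MB/s figure is a per-disk streaming rate and ignoring RAID and network overheads:

```python
PER_DISK_MBPS = 140     # profiled streaming rate of one local HDD (MB/s)
HDDS_PER_CLIENT = 4     # the client/HDD ratio of 1/4 from the profiling step

def plan(n_clients):
    hdds = n_clients * HDDS_PER_CLIENT
    ideal_mbps = hdds * PER_DISK_MBPS  # optimistic aggregate bandwidth
    return hdds, ideal_mbps

for clients in (5, 15, 45):
    hdds, mbps = plan(clients)
    print(f"{clients:3d} clients -> {hdds:3d} HDDs, ~{mbps / 1000:.1f} GB/s ideal aggregate")
```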

GPFS performance @ Cytera. GPFS @ Cytera: clients/storage-server ratio of 5/1; GPFS @ LLNL (2000): clients/storage-server ratio of 4/1 (38 servers, 152 clients). 80 GB file size (does not fit in memory). 180 HDDs in RAID 6, 4 storage servers. Max I/O performance (write): ~1600 MB/s. The available storage resources (I/O and network) are underutilized.
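
The "underutilized" claim follows from simple arithmetic; a hedged check, assuming each of the 180 HDDs can stream roughly the ~140 MB/s measured during profiling and ignoring RAID-6 and network overheads:

```python
N_HDDS = 180                 # disks behind the 18 RAID 6 arrays
PER_DISK_MBPS = 140          # optimistic per-disk streaming rate (profiling slide)
GPFS_PEAK_WRITE_MBPS = 1600  # measured peak write on Cytera

raw_aggregate = N_HDDS * PER_DISK_MBPS              # ~25,000 MB/s of raw disk bandwidth
per_disk_delivered = GPFS_PEAK_WRITE_MBPS / N_HDDS  # ~9 MB/s actually delivered per disk

print(f"raw aggregate      ~{raw_aggregate / 1000:.1f} GB/s")
print(f"delivered per disk ~{per_disk_delivered:.1f} MB/s of ~{PER_DISK_MBPS} MB/s")
```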

Measurements @ Cytera. We create, on the fly, dedicated or semi-dedicated clusters of HDDs per job (input parameter: client/HDD ratio = 1/4) and improve performance by 33% while using only 1/3 of the available hard disks (80 GB file size). We are able to fully utilize the available storage resources (I/O and network).
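
For scale, and assuming the 33% improvement is measured against the ~1600 MB/s GPFS peak and that 1/3 of the disks means 60 of the 180 HDDs, the effective per-disk throughput roughly quadruples:

```python
GPFS_PEAK_MBPS = 1600                # GPFS peak write from the previous slide
ikaros_mbps = GPFS_PEAK_MBPS * 1.33  # ~2130 MB/s with the per-job HDD clusters
ikaros_disks = 180 // 3              # 1/3 of the 180 HDDs

print(f"IKAROS: ~{ikaros_mbps:.0f} MB/s on {ikaros_disks} disks "
      f"(~{ikaros_mbps / ikaros_disks:.0f} MB/s per disk)")
print(f"GPFS:   ~{GPFS_PEAK_MBPS} MB/s on 180 disks "
      f"(~{GPFS_PEAK_MBPS / 180:.0f} MB/s per disk)")
```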

IKAROS - KM3NeT.org. IKAROS is part of the KM3NeT Computing Model. KM3NeT is a future European deep-sea research infrastructure hosting a new generation of neutrino detectors. It is an ESFRI infrastructure and a CERN-recognized experiment. The collaboration includes about 45 institutes and universities from 13 different countries.

KM3NeT Computing Model Overview

European Grid Infrastructure - KM3NeT Workflow (iRODS, IKAROS, DPM, dCache). Default procedure: the data output is transferred from the Grid Worker Node to a local Grid storage element (SE), from the local SE to CC-Lyon, and then to the UI, using the GridFTP and SSH protocols (several read-write operations). IKAROS: sends the output directly to the destination (laptop, local computer cluster, CC-Lyon) in one read-write operation.
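
A tiny sketch that makes the difference concrete by counting the read-write operations along each route (the hop lists below simply restate the description above):

```python
def count_rw_ops(route):
    """Each hop between two storage endpoints costs one read and one write."""
    hops = len(route) - 1
    return {"hops": hops, "reads": hops, "writes": hops}

default_route = ["Grid Worker Node", "local SE", "CC-Lyon", "UI"]
ikaros_route = ["Grid Worker Node", "destination (laptop / cluster / CC-Lyon)"]

print("default:", count_rw_ops(default_route))
print("IKAROS :", count_rw_ops(ikaros_route))
```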

Conclusions. This study: proposes a dynamically coordinated I/O architecture based on input parameters; creates, on the fly, dedicated or semi-dedicated clusters of HDDs per job; provides coordinated parallel data transfers over the overall data flow; minimizes disk and network contention; and improves I/O performance by 33% while using only 1/3 of the available hard disks.