ARCHER/RDF Overview. How do they fit together? Andy Turner, EPCC, a.turner@epcc.ed.ac.uk

www.epcc.ed.ac.uk www.archer.ac.uk

Outline
- ARCHER/RDF layout
- Available file systems
- Compute resources
  - ARCHER Compute Nodes
  - ARCHER Pre-/Post-Processing (PP) Nodes
  - RDF Data Analytic Cluster (DAC)
- Data transfer resources
  - ARCHER Login Nodes
  - ARCHER PP Nodes
  - RDF Data Transfer Nodes (DTNs)

ARCHER and RDF

ARCHER
- UK National Supercomputer: a large parallel compute resource
- Cray XC30 system: 118,080 Intel Xeon cores, high-performance interconnect
- Designed for large parallel calculations
- Two file systems:
  - /home: source code, key project data, etc.
  - /work: input and output from calculations; not long-term storage

RDF
- Large-scale data storage (~20 PiB)
- For data under active use, i.e. not an archive
- Multiple file systems available, depending on project
- Modest data-analysis compute resource:
  - Standard Linux cluster
  - High-bandwidth connection to the disks
- Data transfer resources

Terminology
ARCHER:
- Login: login nodes
- PP: serial pre-/post-processing nodes
- MOM: PBS job launcher nodes
- /home: standard NFS file system
- /work: Lustre parallel file system; the ARCHER installation is a Sonexion Lustre file system
RDF:
- DAC: Data Analytic Cluster
- DTN: Data Transfer Node
- GPFS: General Parallel File System, the RDF parallel file system technology from IBM; multiple GPFS file systems are available on the RDF

Overview [Diagram: the RDF (DTN and DAC nodes attached to the parallel GPFS file systems) alongside ARCHER (Login, PP, MOM and Compute Nodes attached to the /home NFS file system and the /work parallel Lustre file system).]

Available File Systems

ARCHER
- /home: standard NFS file system
  - Backed up daily
  - Low performance, limited space
  - Mounted on: Login, PP, MOM (not Compute Nodes)
- /work: parallel Lustre file system
  - No backup
  - High-performance read/write (not open/stat), large space (>4 PiB)
  - Mounted on: Login, PP, MOM, Compute Nodes
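Since /work is not backed up and /home is not visible to the compute nodes, a common pattern is to stage inputs from /home to /work before a run and copy essential results back afterwards. The Python sketch below illustrates that pattern only; the project paths and file patterns are hypothetical placeholders, not values taken from this talk.

```python
import shutil
from pathlib import Path

# Hypothetical project/user paths, used purely for illustration.
HOME = Path("/home/x01/x01/auser/myproject")   # backed up, low performance
WORK = Path("/work/x01/x01/auser/myproject")   # not backed up, high performance

def stage_in():
    """Copy input files from /home to /work before a calculation."""
    (WORK / "inputs").mkdir(parents=True, exist_ok=True)
    for f in (HOME / "inputs").glob("*.in"):
        shutil.copy2(f, WORK / "inputs" / f.name)

def stage_out():
    """Copy essential results back to the backed-up /home afterwards."""
    (HOME / "results").mkdir(parents=True, exist_ok=True)
    for f in (WORK / "outputs").glob("*.dat"):
        shutil.copy2(f, HOME / "results" / f.name)

if __name__ == "__main__":
    stage_in()
    # ... the calculation itself runs on the compute nodes, reading and
    #     writing under /work ...
    stage_out()
```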

RDF
- /epsrc, /nerc, /general: parallel GPFS file systems
  - Backed up for disaster recovery
  - High performance (read/write/open/stat), very large space (>20 PiB)
  - Mounted on: DTN, DAC, Login, PP

Compute Resources

ARCHER
- Compute Nodes: 4920 nodes with 24 cores each (118,080 cores in total)
  - 64 or 128 GB memory per node
  - Designed for parallel jobs (serial work is not well supported)
  - /work file system only
  - Accessed via the batch system only
- PP Nodes: 2 nodes with 64 cores each (256 hyperthreads in total)
  - 1 TB memory per node
  - Designed for serial and shared-memory jobs
  - RDF file systems available
  - Accessed directly or via the batch system
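For orientation only, the sketch below shows what submitting a parallel job to the compute nodes might look like, written in Python for consistency with the other sketches. The PBS directives, budget code, node count and application name are illustrative assumptions; the authoritative job-script templates are in the ARCHER documentation.

```python
import subprocess
import textwrap

# A minimal, illustrative PBS Pro script for a parallel job run from /work.
# Job name, node count, walltime, budget code ("x01") and application name
# are placeholders, not values from this talk.
job_script = textwrap.dedent("""\
    #!/bin/bash
    #PBS -N example_job
    #PBS -l select=4
    #PBS -l walltime=01:00:00
    #PBS -A x01

    cd $PBS_O_WORKDIR               # a directory under /work
    aprun -n 96 ./my_parallel_app   # e.g. 4 nodes x 24 cores per node
""")

with open("example.pbs", "w") as f:
    f.write(job_script)

# Submit from a login node; the job itself runs on the compute nodes,
# which only see the /work file system.
subprocess.run(["qsub", "example.pbs"], check=True)
```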

RDF Data Analytic Cluster (DAC)
- 12 standard compute nodes: 40 hyperthreads, 128 GB memory each
- 2 large compute nodes: 64 hyperthreads, 2 TB memory each
- Direct InfiniBand connections to the RDF file systems
- Access via the batch system
- Designed for data-intensive workloads, parallel or serial

Data Transfer Resources
- ARCHER to/from RDF:
  - Primary resource is the PP nodes, which mount both the ARCHER and RDF file systems
  - Interactive data transfer can also use the ARCHER Login nodes (both file systems mounted); small amounts of data only
- To the outside world:
  - RDF Data Transfer Nodes (DTNs) for large files
  - ARCHER Login Nodes for small amounts of data only
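To make these routes concrete, the sketch below (Python driving rsync via subprocess) illustrates the two patterns: an RDF-to-/work copy run on a PP node where both file systems are mounted, and a large off-site transfer that should be run from an RDF DTN. All paths, user names and host names are illustrative placeholders.

```python
import subprocess

# Illustrative placeholders only; real project paths, remote hosts and
# DTN details come from the ARCHER/RDF documentation.
RDF_PATH  = "/epsrc/x01/x01/auser/dataset/"
WORK_PATH = "/work/x01/x01/auser/dataset/"
REMOTE    = "user@remote-site.example.org:/data/incoming/"

# On an ARCHER PP node both the RDF and /work are mounted,
# so moving data between them is a local copy.
subprocess.run(["rsync", "-a", "--progress", RDF_PATH, WORK_PATH], check=True)

# Large transfers off site should be pushed from an RDF Data Transfer Node
# (small transfers can go via the ARCHER login nodes instead).
subprocess.run(["rsync", "-a", "-e", "ssh", RDF_PATH, REMOTE], check=True)
```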

Summary

ARCHER/RDF
- ARCHER and the RDF are separate systems
- Some RDF file systems are mounted on the ARCHER Login and PP nodes to enable easy data transfer (e.g. for analysis or for transfer off site)
- A variety of file systems are available, each with its own use case
  - A data management plan should consider which is best suited to each stage of the data lifecycle
- A variety of compute resources are available