Seagate ExaScale HPC Storage


Seagate ExaScale HPC Storage. Miro Lehocky, System Engineer, Seagate Systems Group, HPC

- 100+ PB Lustre file system
- 130+ GB/s Lustre file system
- 140+ GB/s Lustre file system
- 55 PB Lustre file system
- 1.6 TB/s Lustre file system
- 500+ GB/s Lustre file system
- 1 TB/s Lustre file system

Market leadership. File systems behind the TOP500 leaders (November 2015 list); for each system: computer, site, total cores, Rmax (TFLOPS), Rpeak (TFLOPS), power (kW), file system, size, and performance.

1. Tianhe-2 (National Super Computer Center in Guangzhou): TH-IVB-FEP Cluster, Xeon E5-2692 12C 2.2 GHz, TH Express-2, Intel Xeon Phi. 3,120,000 cores; Rmax 33,862.7; Rpeak 54,902.4; 17,808 kW. Lustre/H2FS, 12.4 PB, ~750 GB/s.
2. Titan (DOE/SC/Oak Ridge National Laboratory): Cray XK7, Opteron 6274 16C 2.2 GHz, Cray Gemini interconnect, NVIDIA K20x. 560,640 cores; Rmax 17,590.0; Rpeak 27,112.5; 8,209 kW. Lustre, 10.5 PB, 240 GB/s.
3. Sequoia (DOE/NNSA/LLNL): BlueGene/Q, Power BQC 16C 1.60 GHz, custom interconnect. 1,572,864 cores; Rmax 17,173.2; Rpeak 20,132.7; 7,890 kW. Lustre, 55 PB, 850 GB/s.
4. K computer (RIKEN AICS): Fujitsu, SPARC64 VIIIfx 2.0 GHz, Tofu interconnect. 705,024 cores; Rmax 10,510.0; Rpeak 11,280.4; 12,659 kW. Lustre, 40 PB, 965 GB/s.
5. Mira (DOE/SC/Argonne National Laboratory): BlueGene/Q, Power BQC 16C 1.60 GHz, custom interconnect. 786,432 cores; Rmax 8,586.6; Rpeak 10,066.3; 3,945 kW. GPFS, 28.8 PB, 240 GB/s.
6. Trinity (DOE/NNSA/LANL/SNL): Cray XC40, Xeon E5-2698v3 16C 2.3 GHz, Aries interconnect. 301,056 cores; Rmax 8,100.9; Rpeak 11,078.9. Lustre, 76 PB, 1,600 GB/s.
7. Piz Daint (Swiss National Supercomputing Centre, CSCS): Cray XC30, Xeon E5-2670 8C 2.6 GHz, Aries interconnect, NVIDIA K20x. 115,984 cores; Rmax 6,271.0; Rpeak 7,788.9; 2,325 kW. Lustre, 2.5 PB, 138 GB/s.
8. Hazel Hen (HLRS, Stuttgart): Cray XC40, Xeon E5-2680v3 12C 2.5 GHz, Aries interconnect. 185,088 cores; Rmax 5,640.2; Rpeak 7,403.5. Lustre, 7 PB, ~100 GB/s.
9. Shaheen II (KAUST, Saudi Arabia): Cray XC40, Xeon E5-2698v3 16C 2.3 GHz, Aries interconnect. 196,608 cores; Rmax 5,537.0; Rpeak 7,235.0; 2,834 kW. Lustre, 17 PB, 500 GB/s.
10. Stampede (TACC / Univ. of Texas): PowerEdge C8220, Xeon E5-2680 8C 2.7 GHz, IB FDR, Intel Xeon Phi. 462,462 cores; Rmax 5,168.1; Rpeak 8,520.1; 4,510 kW. Lustre, 14 PB, 150 GB/s.

N.B. NCSA Blue Waters: 24 PB, 1,100 GB/s (Lustre 2.1.3).

Still the same concept: fully integrated, fully balanced, no bottlenecks.

ClusterStor Scalable Storage Unit:
- Intel Ivy Bridge or Haswell CPUs
- FDR/EDR InfiniBand, 100 GbE and 2x 40 GbE; all-SAS infrastructure
- SBB v3 form factor, PCIe Gen 3
- Embedded RAID and Lustre support

Software and management stack:
- ClusterStor Manager
- Lustre file system (2.x)
- Data protection layer (PD-RAID/GridRAID)
- Linux OS
- Unified System Management (GEM-USM)
- Embedded server modules

So what's new? Seagate's next-generation Lustre appliance.

CS9000 (current generation):
- ClusterStor Lustre software: Lustre v2.5 on Linux v6.5
- Management switches: 1 GbE switches for component communication
- Top-of-rack (ToR) switches: Mellanox FDR InfiniBand or 40 GbE (high-availability connectivity in the rack)
- ClusterStor Management Unit (CMU) hardware: 4 servers in a 2U chassis, FDR/40 GbE; servers 1 and 2 handle file system management and boot, servers 3 and 4 handle metadata (the user-data location map); a 2U24 JBOD provides HDD storage for management and metadata
- ClusterStor Scalable Storage Unit (SSU): 5U84 enclosure, 6 Gbit SAS, Object Storage Servers (OSS), Mellanox FDR/40 GbE network I/O, 7.2K RPM HDDs

CS L300 (next generation):
- ClusterStor Lustre software: Lustre v2.5 (phase 1) and Lustre 2.7 (phase 2), on Linux v6.5 (phase 1) and Linux 7.2 (phase 2)
- Management switches: 1 GbE switches for component communication
- Top-of-rack switches: Mellanox EDR InfiniBand or 100/40 GbE
- ClusterStor System Management Unit (SMU): 2U24 with embedded servers in an HA configuration; file system management, boot, storage
- ClusterStor Metadata Management Unit (MMU): 2U24 with embedded servers in an HA configuration; metadata (the user-data location map)
- ClusterStor Scalable Storage Unit (SSU): 5U84 enclosure, Object Storage Servers (OSS), Mellanox EDR/100/40 GbE network I/O, 10K and 7.2K RPM HDDs
- 2U24 enclosure with Seagate Koho Flash SSDs

ClusterStor GridRAID.

- Feature: de-clustered RAID 6, up to 400% faster to repair; rebuilding a 6 TB drive takes ~33.3 hours with MD RAID vs. ~9.5 hours with GridRAID. Benefit: recover from a disk failure and return to full data protection faster.
- Feature: counters Amdahl's-law behavior, where the speed of a parallel system is gated by its slowest component. Benefit: minimizes the impact of a rebuild on widely striped file performance.
- Feature: minimized file system fragmentation. Benefit: improved allocation and layout maximize sequential data placement.
- Feature: 4-to-1 reduction in OSTs. Benefit: simplifies scalability challenges.
- Feature: ClusterStor integrated management. Benefit: CLI- and GUI-based configuration, monitoring, and management reduce opex.

(Diagram: a traditional RAID OSS server with separate parity rebuild disk pools vs. a GridRAID OSS server with a single de-clustered pool.)
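
The rebuild figures are easy to sanity-check. A minimal sketch (Python) using only the numbers quoted on the slide; note the quoted 6 TB example works out to roughly 3.5x, with "up to 400%" presumably reflecting a best case:

    # Rebuild-time comparison using the slide's own figures for a 6 TB drive.
    md_raid_hours = 33.3    # traditional MD RAID rebuild time
    gridraid_hours = 9.5    # GridRAID rebuild time

    speedup = md_raid_hours / gridraid_hours
    print(f"GridRAID rebuild speedup: {speedup:.1f}x "
          f"({(speedup - 1) * 100:.0f}% faster)")   # ~3.5x, i.e. ~250% faster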

ClusterStor L300 HPC Disk Drive

HPC-optimized performance 3.5-inch HDD: high-level product description

ClusterStor L300 HPC 4 TB SAS HDD: an HPC industry first, and the best mixed-application-workload value.

- Performance leader: world-beating performance over other 3.5-inch HDDs, speeding data ingest, extraction, and access
- Capacity strong: 4 TB of storage for big-data applications
- Reliable workhorse: 2M-hour MTBF and 750 TB/year workload ratings for reliability under the toughest workloads your users throw at it
- Power efficient: Seagate's PowerBalance feature provides significant power benefits for minimal performance tradeoffs

(Chart: CS HPC HDD vs. nearline 7.2K RPM HDD on random writes (4K IOPS, WCD), random reads (4K Q16 IOPS), and sequential data rate (MB/s).)
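
For context, a 2M-hour MTBF corresponds to an annualized failure rate well under 1%. A minimal sketch (Python); the exponential failure model is the standard convention for this conversion, not something the slide states:

    # Convert the quoted 2,000,000-hour MTBF into an annualized failure
    # rate (AFR), assuming the usual exponential failure model.
    import math

    mtbf_hours = 2_000_000      # from the slide
    hours_per_year = 8760
    afr = 1 - math.exp(-hours_per_year / mtbf_hours)
    print(f"AFR ~ {afr * 100:.2f}% per drive-year")   # ~0.44%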

IBM Spectrum Scale Based Solutions 2

Best-in-class features, designed for HPC, big data, and cloud.

Connectivity:
- IB FDR, QDR, and 40 GbE (EDR, 100 GbE, and Omni-Path on the roadmap)
- Exportable via CNFS, CIFS, object storage, and HDFS connectors
- Linux and Windows clients

Robust feature set:
- Global shared access with a single namespace across clusters/file systems
- Building-block approach with scalable performance and capacity
- Distributed file system data caching and coherency management
- RAID 6 (8+2) with de-clustered RAID
- Snapshot and rollback
- Integrated lifecycle management; backup to tape
- Non-disruptive scaling, restriping, and rebalancing
- Synchronously replicated data and metadata

Management and support:
- ClusterStor CLI-based single point of management
- RAS/phone home; SNMP integration with business operation systems
- Low-level hardware monitoring and diagnostics
- Embedded monitoring, proactive alerts

Hardware platform:
- Industry's fastest converged scale-out storage platform
- Latest Intel processors
- Embedded high-availability servers integrated into the data storage backplane
- Extremely dense storage enclosures with 84 drives in 5U
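
The RAID 6 (8+2) geometry fixes usable capacity directly: 8 data strips out of every 10 written, i.e. 80% efficiency before file system overhead. A minimal sketch (Python; the 8 TB drive size is an illustrative assumption, not from this slide):

    # Usable fraction of an 8+2 de-clustered RAID 6 layout: 8 data
    # strips for every 10 strips written.
    data_strips, parity_strips = 8, 2
    efficiency = data_strips / (data_strips + parity_strips)

    # Illustrative only: one 5U84 enclosure of hypothetical 8 TB drives.
    drives, drive_tb = 84, 8
    print(f"efficiency = {efficiency:.0%}")                       # 80%
    print(f"raw = {drives * drive_tb} TB, "
          f"usable ~ {drives * drive_tb * efficiency:.0f} TB")    # 672 -> ~538 TB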

ClusterStor Spectrum Scale performance-density rack configuration.

Key components:
- ClusterStor Manager node (2U enclosure) with 2 HA management servers and 10 drives
- 2 management switches
- 5U84 enclosures configured as servers + disk: 2 HA embedded servers, 76 to 80 7.2K RPM HDDs, and 4 to 8 SSDs each
- 42U reinforced rack with custom cable harness
- Up to 7 enclosures in each rack (base and expansion)

Performance: up to 56 GB/s per rack.

(Rack diagram: base and expansion racks with redundant Ethernet and high-speed networks, the ClusterStor Manager node, and 5U84 enclosures.)

ClusterStor Spectrum Scale capacity-optimized rack configuration.

Key components:
- ClusterStor Manager node (2U enclosure) with 2 HA management servers and 10 drives
- 2 management switches
- 5U84 enclosures configured as servers + disk: 2 HA embedded servers, 76 to 80 7.2K RPM HDDs, and 4 to 8 SSDs each
- 5U84 enclosures configured as JBODs: 84 7.2K RPM HDDs, SAS-connected to the servers in a 1-to-1 ratio
- 42U reinforced rack

Performance: up to 32 GB/s per rack.

(Rack diagram: base rack with the ClusterStor Manager node, server enclosures, and JBODs; expansion rack with additional server enclosures and JBODs.)

ClusterStor Spectrum Scale standard configuration.

- 2U24 management server
- 5U84 disk enclosure with 2 embedded metadata (MD) servers: >8 GB/s per 5U84 (clustered), ~20K file creates per second, ~2 billion files
- Per MD server: a metadata SSD pool (~10K file creates per second, ~1 billion files, 2x 800 GB SSDs) and a user data pool (~4 GB/s across 40 HDDs)
- ClusterStor ToR and management switches, rack, cables, and PDUs
- Factory integration and test
- Up to 7 5U84 enclosures in the base rack
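
Those per-enclosure figures compose roughly linearly across a base rack, which lines up with the 56 GB/s quoted for the performance-density configuration. A minimal sketch (Python; linear scaling across seven enclosures is our assumption, not a guarantee):

    # Rough base-rack sizing from the per-5U84 figures on the slide,
    # assuming (optimistically) linear scaling across enclosures.
    per_enclosure_gbps = 8          # >8 GB/s per 5U84, clustered
    per_enclosure_creates = 20_000  # ~20K file creates/s
    enclosures = 7                  # up to 7 per base rack

    print(f"~{per_enclosure_gbps * enclosures} GB/s per base rack")        # ~56 GB/s
    print(f"~{per_enclosure_creates * enclosures:,} file creates/s per base rack")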

Object Storage based archiving solutions 3

ClusterStor A200 object store features:
- High-density storage: up to 3.6 PB usable per rack
- Can achieve four-nines (99.99%) system availability
- Effectively unlimited object count (2^128 address space)
- 8+2 network erasure coding for cost-effective protection
- Global object namespace
- Rapid drive rebuild (<1 hour for an 8 TB drive in a large system)
- Object API and a portfolio of network-based interfaces
- Integrated management and consensus-based HA
- Performance scales with capacity (up to 10 GB/s per rack)
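
The 3.6 PB usable figure is consistent with 8+2 erasure coding over a rack of 8 TB SMR drives. A minimal sketch (Python; the 82x 8 TB drives per 5U84 come from the resiliency slide below, while the seven-enclosure rack count is our assumption to make the numbers line up):

    # Sanity check: usable capacity per rack under 8+2 erasure coding.
    enclosures, drives_per_enclosure, drive_tb = 7, 82, 8   # enclosure count assumed
    ec_efficiency = 8 / (8 + 2)                             # 8 data + 2 parity shards

    raw_tb = enclosures * drives_per_enclosure * drive_tb
    print(f"raw ~ {raw_tb / 1000:.1f} PB, "
          f"usable ~ {raw_tb * ec_efficiency / 1000:.1f} PB")
    # ~4.6 PB raw -> ~3.7 PB usable, in line with "up to 3.6 PB usable"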

ClusterStor A200: resiliency built in.

- Redundant ToR switches: data and management traffic combined, segregated by VLANs; 10 GbE with 40 GbE ToR uplinks
- Management unit: one 2U24 enclosure with 2 embedded controllers
- Storage units: Titan v2 5U84 enclosures (6 is the minimum configuration), each with 82x 8 TB SMR SATA HDDs, a single embedded storage controller, and dual 10 GbE network connections
- 42U rack with wiring loom and power cables, dual PDUs, blanking plates as required, and 2U of spare space reserved for future configuration options
- Resilient to 2 SSU failures (with a minimum of 12 SSUs)
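
With 8+2 erasure coding, surviving two SSU failures only requires that the ten shards of each object land on ten different SSUs. A minimal sketch (Python; the one-shard-per-SSU placement is illustrative, not Seagate's actual layout algorithm):

    # Illustrative check: place 10 erasure-coded shards (8 data + 2 parity)
    # on distinct SSUs out of 12, then verify that losing any two SSUs
    # destroys at most 2 shards, which 8+2 tolerates.
    from itertools import combinations

    n_ssus, shards, parity = 12, 10, 2
    placement = list(range(shards))        # shards on 10 distinct SSUs

    worst = max(sum(1 for s in placement if s in dead)
                for dead in combinations(range(n_ssus), 2))
    assert worst <= parity
    print(f"worst-case shards lost to 2 SSU failures: {worst}")   # 2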

Economic benefits of SMR drives.

- Shingled recording increases platter capacity by 30-40%: write tracks are overlapped by up to 50% of the write width, and the read head is much smaller, so it can reliably read the narrower tracks
- SMR drives are optimal for object stores, since most data is static/WORM
- Updates require special intelligence and can be expensive in performance terms, because an update destroys a portion of the next track; wide tracks in each band are often reserved for updates
- The CS A200 manages SMR drives directly to optimize workflow and caching

(Diagram: read and write heads over shingled tracks; an update destroys a portion of the next track.)
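
The capacity gain follows from track geometry: overlapping write tracks shrinks the effective track pitch. A minimal sketch (Python; the band/guard overhead factor is an illustrative assumption we introduce to reconcile the theoretical gain with the quoted 30-40%):

    # Effective track-density gain from shingling: overlapping each write
    # track by a fraction `overlap` shrinks the track pitch to
    # (1 - overlap) of the full write width.
    def smr_capacity_gain(overlap: float, band_overhead: float = 0.0) -> float:
        """Relative capacity vs. conventional recording.

        overlap: fraction of the write width covered by the next track.
        band_overhead: fraction of capacity lost to band guard gaps and
                       update reserves (illustrative; vendor-specific).
        """
        return (1.0 / (1.0 - overlap)) * (1.0 - band_overhead)

    # 50% overlap would double density in theory; ~30% guard/reserve
    # overhead brings the net gain into the quoted 30-40% range.
    print(f"{smr_capacity_gain(0.5):.2f}x ideal")             # 2.00x
    print(f"{smr_capacity_gain(0.5, 0.30):.2f}x with bands")  # 1.40x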

Current ClusterStor Lustre solutions line-up: CS-1500, CS-9000, and L300, each offered as a base rack plus expansion racks.

Current ClusterStor Spectrum Scale / Active Archive line-up: G200 and A200, each offered as a base rack plus expansion racks.