Fujitsu Petascale Supercomputer PRIMEHPC FX10. 4x2 racks (768 compute nodes) configuration. Copyright 2011 FUJITSU LIMITED

PRIMEHPC FX10 Highlights
- Scales up to 23.2 PFLOPS.
- Builds on Fujitsu's supercomputer technology employed in the FX1 and the K computer, the world's fastest supercomputer.
- Key technologies carried forward: hybrid parallelism (VISIMPACT), collective communication software, the Tofu interconnect, and the HPC-ACE ISA extension.
- Roadmap:
  - FX1: SPARC64 VII, 4 cores / 40 GFLOPS per CPU, 121 TFLOPS, CY2008~
  - K computer*: SPARC64 VIIIfx, 8 cores / 128 GFLOPS per CPU, 11 PFLOPS, CY2010~
  - PRIMEHPC FX10: SPARC64 IXfx, 16 cores / 236.5 GFLOPS per CPU, up to ~23 PFLOPS, CY2012~

  * "K computer" is the name RIKEN gave to its next-generation supercomputer in July 2010.

PRIMEHPC FX10 Features
- High-speed, ultra-large-scale computing environment: up to 23.2 PFLOPS (98,304 nodes, 1,024 racks, 6 petabytes of memory).
- SPARC64 IXfx: high performance with low power consumption, 236.5 GFLOPS and over 2 GFLOPS/W per CPU.
- High execution performance for massively parallel applications through the Tofu interconnect, the Technical Computing Suite, and VISIMPACT (see the hybrid-programming sketch below).
- High reliability and high operability for a large-scale system.
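On the FX10, VISIMPACT combines process-level parallelism between nodes with automatic thread parallelization across the 16 cores inside each node. Below is a minimal hand-written sketch of that hybrid pattern using generic MPI + OpenMP; it is not VISIMPACT itself, and the vector length and variable names are illustrative.

```c
/* Hybrid sketch: one MPI rank per node, OpenMP threads across the node's
 * cores. Illustrates the pattern that VISIMPACT automates on the FX10;
 * this is NOT Fujitsu's VISIMPACT implementation. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

#define N 1048576                     /* local vector length (illustrative) */

static double x[N], y[N];

int main(int argc, char **argv)
{
    double local_sum = 0.0, global_sum = 0.0;
    int rank, nprocs;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* Thread-level parallelism inside the node (e.g. the 16 IXfx cores). */
    #pragma omp parallel for reduction(+:local_sum)
    for (int i = 0; i < N; i++) {
        x[i] = 1.0;
        y[i] = 2.0;
        local_sum += x[i] * y[i];
    }

    /* Process-level combination across nodes over the interconnect. */
    MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0,
               MPI_COMM_WORLD);

    if (rank == 0)
        printf("dot product across %d ranks = %e\n", nprocs, global_sum);

    MPI_Finalize();
    return 0;
}
```

Built with an MPI compiler wrapper and OpenMP enabled (for example, mpicc -fopenmp), this runs one rank per node with threads filling the node's cores, which matches the node/core split the FX10 hardware is organized around.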

PRIMEHPC FX10 Configuration
- Compute nodes: diskless job-execution nodes.
- IO nodes: disk and external-communication nodes.
- Tofu: 6D mesh/torus interconnect linking them.

System (maximum configuration):
- Number of nodes: 98,304 (= 32 x 32 x 8 x 2 x 3 x 2)
- Peak performance: 23.2 PFLOPS
- Total memory bandwidth: 8.3 PB/s
- Total interconnect bandwidth: 3.1 PB/s
- Total IO bandwidth: 49.1 TB/s

Node:
- CPUs: 1
- Peak performance: 236.5 GFLOPS
- Memory bandwidth: 85 GB/s
- Interconnect link bandwidth: 5 GB/s per direction (bidirectional)
- IO bandwidth: 8 GB/s per IO node
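Most of the system-level figures are straight multiples of the per-node figures: the node count is the product of the six Tofu axis lengths, and the bandwidth totals follow from that count (and from the IO-node count given on the packaging slide). A small check of that arithmetic, using only values quoted on the slides; note the slide truncates 8.36 PB/s to 8.3 PB/s and 49.15 TB/s to 49.1 TB/s.

```c
/* Sanity check of the maximum-configuration totals from the per-node
 * figures quoted on the slides. */
#include <stdio.h>

int main(void)
{
    /* Tofu 6D mesh/torus axis lengths in the maximum configuration */
    const long axes[6] = {32, 32, 8, 2, 3, 2};
    long nodes = 1;
    for (int i = 0; i < 6; i++)
        nodes *= axes[i];                          /* 98,304 nodes */

    const double node_mem_bw_gbs = 85.0;           /* GB/s per node        */
    const long   io_nodes        = 6144;           /* from packaging slide */
    const double io_bw_gbs       = 8.0;            /* GB/s per IO node     */

    printf("compute nodes      : %ld\n", nodes);
    printf("total memory BW    : %.2f PB/s\n", nodes * node_mem_bw_gbs / 1e6);
    printf("total IO bandwidth : %.2f TB/s\n", io_nodes * io_bw_gbs / 1e3);
    return 0;
}
```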

Implementation of PRIMEHPC FX10
- CPU: SPARC64 IXfx, 236.5 GFLOPS (16 cores sharing an L2 cache).
- Node: 1 CPU plus ICC (interconnect controller), SX controller, memory controller (MC), and DDR3 DIMMs; 236.5 GFLOPS, memory of up to 64 GB.
- System board (SB): 4 nodes (4 CPUs); 946 GFLOPS, memory of up to 256 GB.
- Rack: 24 SBs (96 compute nodes) plus 6 IO system boards (6 IO nodes); 22.7 TFLOPS, memory of up to 6 TB.
- Maximum configuration: 1,024 racks, 98,304 compute nodes, 6,144 IO nodes; 23.2 PFLOPS, memory of up to 6 PB.
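Each level of this packaging hierarchy is a straight multiple of the single-CPU figures. A minimal roll-up under the counts given on the slide (the slide rounds some memory totals, e.g. 6,144 GB appears as "up to 6 TB" and 6.29 PB as "up to 6 PB"):

```c
/* Roll-up of the FX10 packaging hierarchy from the per-CPU figures:
 * CPU -> system board -> rack -> maximum configuration. */
#include <stdio.h>

int main(void)
{
    const double cpu_gflops = 236.5;   /* SPARC64 IXfx, 16 cores         */
    const double cpu_mem_gb = 64.0;    /* up to 64 GB of memory per node */

    const int cpus_per_sb  = 4;        /* nodes (CPUs) per system board  */
    const int sbs_per_rack = 24;       /* 96 compute nodes per rack      */
    const int racks_max    = 1024;     /* maximum configuration          */

    double sb_gflops   = cpus_per_sb * cpu_gflops;        /*   946 GFLOPS */
    double rack_tflops = sbs_per_rack * sb_gflops / 1e3;  /*  22.7 TFLOPS */
    double sys_pflops  = racks_max * rack_tflops / 1e3;   /*  23.2 PFLOPS */

    printf("system board : %.0f GFLOPS, up to %.0f GB of memory\n",
           sb_gflops, cpus_per_sb * cpu_mem_gb);
    printf("rack         : %.1f TFLOPS, up to %.1f TB of memory\n",
           rack_tflops, sbs_per_rack * cpus_per_sb * cpu_mem_gb / 1e3);
    printf("system       : %.1f PFLOPS, up to %.2f PB of memory\n",
           sys_pflops, (double)racks_max * sbs_per_rack * cpus_per_sb
                       * cpu_mem_gb / 1e6);
    return 0;
}
```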

PRIMEHPC FX10 Packaging (rack)
- 2 groups of 12 system boards (24 SBs in total)
- 6 IO system boards
- 12 PSUs
- Coolant inlet / outlet

Hybrid Cooling (Direct Water Cooling)
- Water-cooled parts: cooling plates for the CPUs and the ICCs; each SB is fitted with water cold plates.
- Coolant path: coolant inlet / outlet, coolant pipes (piping), coolant couplers (liquid couplers), and intake hoses for the IOSBs.
- Fittings and monitoring: air vent valve, motor valve, check valve, filter, and temperature and dew-point sensors.

Hybrid Cooling (Slanted Mount System)
- Diagram labels: air vent valve, intake hose, exhaust air, apertures, DIMMs, intake hoses for the IOSBs, check valve, motor valve, temperature and dew-point sensors, filter, cabinet frame, coolant inlet / outlet, and coolant pipes.
- [Figure: air velocity map of an SB]

K Computer and PRIMEHPC FX10
PRIMEHPC FX10 is a Fujitsu supercomputer with enhanced versions of the technology introduced for the K computer:
- CPU: SPARC64 VIIIfx (K computer) vs. SPARC64 IXfx (PRIMEHPC FX10); both SPARC V9 + HPC-ACE.
- Peak performance per CPU: 128 GFLOPS vs. 236.5 GFLOPS.
- Cores per CPU: 8 vs. 16.
- Memory per node: 16 GiB vs. 32 GiB / 64 GiB (2 GB per core or more).
- Memory bandwidth: 64 GB/s vs. 85 GB/s.
- Interconnect: Tofu 6D mesh/torus on both; link bandwidth 5 GB/s per direction.
- System size: X x Y x 17 vs. X x Y x 9 (Z=0 is the I/O node).

PRIMEHPC FX10 supports water cooling from an ordinary chilling facility and eases layout restrictions.
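One derived way to read the per-CPU rows above is the memory bytes-per-flop ratio, which the slide does not quote: the absolute bandwidth rises from the K computer to the FX10, but the ratio falls slightly. A short calculation from the table values:

```c
/* Derived bytes-per-flop ratios for the K computer and PRIMEHPC FX10
 * CPUs, computed from the peak-performance and memory-bandwidth rows
 * above (the ratios themselves are not quoted on the slide). */
#include <stdio.h>

int main(void)
{
    const double k_gflops  = 128.0, k_bw_gbs  = 64.0;   /* SPARC64 VIIIfx */
    const double fx_gflops = 236.5, fx_bw_gbs = 85.0;   /* SPARC64 IXfx   */

    printf("K computer    : %.2f bytes/flop\n", k_bw_gbs  / k_gflops);   /* 0.50 */
    printf("PRIMEHPC FX10 : %.2f bytes/flop\n", fx_bw_gbs / fx_gflops);  /* 0.36 */
    return 0;
}
```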

Information Technology Center, The University of Tokyo: Next-Generation Supercomputer System Overview

Compute nodes and interactive nodes: PRIMEHPC FX10 x 50 racks (4,800 compute nodes, 300 I/O nodes)
- Peak performance: 1.13 PFLOPS
- Memory capacity: 150 TB (aggregate memory bandwidth: 398 TB/s)
- Interconnect: 6D mesh/torus Tofu; node-to-node 5 GB/s in each direction; network bisection bandwidth 6 TB/s

Local file system:
- PRIMERGY RX300 S6 x 2 (MDS), ETERNUS DX80 S2 x 150 (OST)
- Storage capacity: 1.1 PB (RAID-5)

Shared file system:
- PRIMERGY RX300 S6 x 8 (MDS), PRIMERGY RX300 S6 x 40 (OSS)
- ETERNUS DX80 S2 x 4 (MDT), ETERNUS DX410 S2 x 80 (OST)
- Storage capacity: 2.1 PB (RAID-6)

Other equipment (connected over InfiniBand, Ethernet, and FibreChannel networks):
- Log-in nodes
- Management servers (job management, operation management, authentication): PRIMERGY RX200 S6 x 16
- External file system: PRIMERGY RX300 S6 x 8
- External connection router to LAN/WAN for end users

PRIMEHPC FX10 Summary
- Improves on Fujitsu's supercomputer technology employed in the K computer, the world's fastest supercomputer.
- Newly developed SPARC64 IXfx processor (236.5 GFLOPS).
- Tofu interconnect (scales up to 98,304 nodes).
- Fujitsu provides all of the hardware and software: the processor and interconnect, the Technical Computing Suite HPC middleware, and the FEFS high-performance distributed file system.
