Basic Specification of Oakforest-PACS

Similar documents
Introduction of Oakforest-PACS

Oakforest-PACS (OFP) Taisuke Boku

Towards Efficient Communication and I/O on Oakforest-PACS: Large-scale KNL+OPA Cluster

Oakforest-PACS and PACS-X: Present and Future of CCS Supercomputers

Overview of Reedbush-U How to Login

HOKUSAI System. Figure 0-1 System diagram

Overview of Supercomputer Systems. Supercomputing Division Information Technology Center The University of Tokyo

2018/9/25 (1), (I) 1 (1), (I)

Overview of Supercomputer Systems. Supercomputing Division Information Technology Center The University of Tokyo

Supercomputers in ITC/U.Tokyo 2 big systems, 6 yr. cycle

Results from TSUBAME3.0 A 47 AI- PFLOPS System for HPC & AI Convergence

User Training Cray XC40 IITM, Pune

IHK/McKernel: A Lightweight Multi-kernel Operating System for Extreme-Scale Supercomputing

Update of Post-K Development Yutaka Ishikawa RIKEN AICS

Japan s post K Computer Yutaka Ishikawa Project Leader RIKEN AICS

Introduction to High Performance Computing. Shaohao Chen Research Computing Services (RCS) Boston University

Overview of Supercomputer Systems. Supercomputing Division Information Technology Center The University of Tokyo

Current Status of the Next- Generation Supercomputer in Japan. YOKOKAWA, Mitsuo Next-Generation Supercomputer R&D Center RIKEN

Accelerated Computing Activities at ITC/University of Tokyo

Introduction of Fujitsu s next-generation supercomputer

High Performance Computing with Fujitsu

Overview of Tianhe-2

How to run applications on Aziz supercomputer. Mohammad Rafi System Administrator Fujitsu Technology Solutions

Arm in HPC. Toshinori Kujiraoka Sales Manager, APAC HPC Tools Arm Arm Limited

Fujitsu s Approach to Application Centric Petascale Computing

Technical Computing Suite supporting the hybrid system

Fujitsu Petascale Supercomputer PRIMEHPC FX10. 4x2 racks (768 compute nodes) configuration. Copyright 2011 FUJITSU LIMITED

Post-K: Building the Arm HPC Ecosystem

Oakforest-PACS (OFP) : Japan s Fastest Supercomputer

Site Update for Oakforest-PACS at JCAHPC

Designed for Maximum Accelerator Performance

FUJITSU PHI Turnkey Solution

Brand-New Vector Supercomputer

Rechenzentrum HIGH PERFORMANCE SCIENTIFIC COMPUTING

HPC Architectures. Types of resource currently in use

in Action Fujitsu High Performance Computing Ecosystem Human Centric Innovation Innovation Flexibility Simplicity

Pedraforca: a First ARM + GPU Cluster for HPC

ARM High Performance Computing

DDN and Flash GRIDScaler, Flashscale Infinite Memory Engine

The Arm Technology Ecosystem: Current Products and Future Outlook

Fujitsu HPC Roadmap Beyond Petascale Computing. Toshiyuki Shimizu Fujitsu Limited

IT4Innovations national supercomputing center. Branislav Jansík

An Introduction to the Intel Xeon Phi. Si Liu Feb 6, 2015

Cray XC Scalability and the Aries Network Tony Ford

Joint Usage/Research Center for Interdisciplinary Large-scale Information Infrastructures 2018 Call for Proposal of Joint Research Projects

Preparing GPU-Accelerated Applications for the Summit Supercomputer

Intel Many Integrated Core (MIC) Architecture

Post-K Supercomputer Overview. Copyright 2016 FUJITSU LIMITED

Intel Xeon Phi архитектура, модели программирования, оптимизация.

Resources Current and Future Systems. Timothy H. Kaiser, Ph.D.

Building supercomputers from commodity embedded chips

High Performance Computing and Data Resources at SDSC

The Architecture and the Application Performance of the Earth Simulator

Bei Wang, Dmitry Prohorov and Carlos Rosales

CS500 SMARTER CLUSTER SUPERCOMPUTERS

Tianhe-2, the world s fastest supercomputer. Shaohua Wu Senior HPC application development engineer

NERSC Site Update. National Energy Research Scientific Computing Center Lawrence Berkeley National Laboratory. Richard Gerber

IBM Power Systems HPC Cluster

HIGH PERFORMANCE COMPUTING FROM SUN

Communication-Computation Overlapping with Dynamic Loop Scheduling for Preconditioned Parallel Iterative Solvers on Multicore/Manycore Clusters

Overview of Reedbush-U How to Login

IBM CORAL HPC System Solution

Smarter Clusters from the Supercomputer Experts

Intel Xeon Phi архитектура, модели программирования, оптимизация.

Cray RS Programming Environment

Cheyenne NCAR s Next-Generation Data-Centric Supercomputing Environment

Toward Building up ARM HPC Ecosystem

Kengo Nakajima Information Technology Center, The University of Tokyo. SC15, November 16-20, 2015 Austin, Texas, USA

Blue Waters System Overview. Greg Bauer

Resources Current and Future Systems. Timothy H. Kaiser, Ph.D.

Oak Ridge National Laboratory Computing and Computational Sciences

InfiniBand Strengthens Leadership as the Interconnect Of Choice By Providing Best Return on Investment. TOP500 Supercomputers, June 2014

Erkenntnisse aus aktuellen Performance- Messungen mit LS-DYNA

TECHNICAL GUIDELINES FOR APPLICANTS TO PRACE 13 th CALL (T ier-0)

Scaling to Petaflop. Ola Torudbakken Distinguished Engineer. Sun Microsystems, Inc

Experiences in Optimizations of Preconditioned Iterative Solvers for FEM/FVM Applications & Matrix Assembly of FEM using Intel Xeon Phi

Umeå University

Umeå University

HPC Middle East. KFUPM HPC Workshop April Mohamed Mekias HPC Solutions Consultant. Agenda

An Overview of Fujitsu s Lustre Based File System

Trends in HPC (hardware complexity and software challenges)

Application Performance on IME

INTRODUCTION TO THE ARCHER KNIGHTS LANDING CLUSTER. Adrian

Post-K Development and Introducing DLU. Copyright 2017 FUJITSU LIMITED

Intel Xeon Phi Coprocessor

Fujitsu s Technologies Leading to Practical Petascale Computing: K computer, PRIMEHPC FX10 and the Future

Parallel Programming on Ranger and Stampede

Fujitsu s Technologies to the K Computer

Programming for the Intel Many Integrated Core Architecture By James Reinders. The Architecture for Discovery. PowerPoint Title

Design and Evaluation of a 2048 Core Cluster System

Toward Building up Arm HPC Ecosystem --Fujitsu s Activities--

Managing HPC Active Archive Storage with HPSS RAIT at Oak Ridge National Laboratory

Lustre2.5 Performance Evaluation: Performance Improvements with Large I/O Patches, Metadata Improvements, and Metadata Scaling with DNE

High Performance Computing. What is it used for and why?

I/O Monitoring at JSC, SIONlib & Resiliency

Fujitsu and the HPC Pyramid

University at Buffalo Center for Computational Research

OpenFOAM Performance Testing and Profiling. October 2017

Fujitsu and the HPC Pyramid

Programming for Fujitsu Supercomputers

Transcription:

Basic Specification of Oakforest-PACS Joint Center for Advanced HPC (JCAHPC) by Information Technology Center, the University of Tokyo and Center for Computational Sciences, University of Tsukuba

Oakforest-PACS in JCAHPC Very tight relationship and collaboration with two universities For primary supercomputer resources, uniform specification to single shared system Each university is financially responsible to introduce the machine and its operation First attempt in Japan Unified procurement toward single system with largest scale in Japan Advanced Computational Science Support Technology Division Kashiwa Campus, The University of Tokyo Budget Information Technology Center The University of Tokyo Administrative Council Chairman: Professor ahiroshi Nakamura Public Relations and Planning Division Advanced HPC Facility (Supercomputer) Operation Operation Budget Operation Technology Division Center for Computational Sciences University of Tsukuba 2017/03/10 OFP specification, JCAHPC 2

Specification of Oakforest-PACS system Total peak performance 25 PFLOPS Total number of compute nodes 8,208 Compute node Product Fujitsu PRIMERGY CX600 M1 (2U) + CX1640 M1 x 8node Processor Intel Xeon Phi 7250 (Code name: Knights Landing), 68 cores, 1.4 GHz Memory High BW 16 GB, 490 GB/sec (MCDRAM, effective rate) Low BW 96 GB, 115.2 GB/sec (peak rate) Interconnect Product Intel Omni-Path Architecture Link speed Topology 100 Gbps Fat-tree with (completely) full-bisection bandwidth 2017/03/10 OFP specification, JCAHPC 3

Full bisection bandwidth Fat-tree by Intel Omni-Path Architecture 12 of 768 port Director Switch (Source by Intel) Uplink: 24 2 Downlink: 24......... 1 24 25 48 49 72 362 of 48 port Edge Switch 2 Firstly, to reduce switches&cables, we considered : All the nodes into subgroups are connected with FBB Fat-tree Subgroups are connected with each other with >20% of FBB But, HW quantity is not so different from globally FBB, and globally FBB is preferred for flexible job management. 2017/03/10 OFP specification, JCAHPC 4

Specification of Oakforest-PACS system (Cont d) Parallel File System File Cache System Type Total Capacity Product Aggregate BW Type Total capacity Product Aggregate BW Power consumption Lustre File System 26.2 PB # of racks 102 DataDirect Networks SFA14KE 500 GB/sec Burst Buffer, Infinite Memory Engine (by DDN) 940 TB (NVMe SSD, including parity data by erasure coding) DataDirect Networks IME14K 1,560 GB/sec 4.2 MW (including cooling) 2017/03/10 OFP specification, JCAHPC 5

Software of Oakforest-PACS system Compute node Login node OS CentOS 7, McKernel Red Hat Enterprise Linux 7 Compiler MPI Library gcc, Intel compiler (C, C++, Fortran) Intel MPI, MVAPICH2 Intel MKL LAPACK, FFTW, SuperLU, PETSc, METIS, Scotch, ScaLAPACK, GNU Scientific Library, NetCDF, Parallel netcdf, Xabclib, ppopen-hpc, ppopen-at, MassiveThreads Application mpijava, XcalableMP, OpenFOAM, ABINIT-MP, PHASE system, FrontFlow/blue, FrontISTR, REVOCAP, OpenMX, xtapp, AkaiKKR, MODYLAS, ALPS, feram, GROMACS, BLAST, R packages, Bioconductor, BioPerl, BioRuby Distributed FS Job Scheduler Debugger Profiler Globus Toolkit, Gfarm Fujitsu Technical Computing Suite Allinea DDT Intel VTune Amplifier, Trace Analyzer & Collector 2017/03/10 OFP specification, JCAHPC 6

Software of Oakforest-PACS OS: Red Hat Enterprise Linux (Login nodes), CentOS or McKernel (Compute nodes, dynamically switchable) McKernel: OS for many-core CPU developed by RIKEN AICS Ultra-lightweight OS compared with Linux, no background noise to user program Will be deployed to post-k computer Compiler: GCC, Intel Compiler, XcalableMP XcalableMP: Parallel programming language developed by RIKEN AICS and University of Tsukuba Easy to develop high-performance parallel application by adding directives to original code written by C or Fortran Application: Open-source software OpenFOAM, ABINIT-MP, PHASE system, FrontFlow/blue, and so on 2017/03/10 OFP specification, JCAHPC 7

Photo of computation node Computation node (Fujitsu PRIMERGY CX1640 M1) with single chip Intel Xeon Phi (Knights Landing, 3+TFLOPS) and Intel Omni-Path Architecture card (100Gbps) Chassis with 8 nodes, 2U size (Fujitsu PRIMERGY CX600 M1) 2017/03/10 OFP specification, JCAHPC 8

Machine location: Kashiwa Campus of U. Tokyo U. Tsukuba Kashiwa Campus of U. Tokyo Hongo Campus of U. Tokyo 2017/03/10 OFP specification, JCAHPC 9