IHK/McKernel: A Lightweight Multi-kernel Operating System for Extreme-Scale Supercomputing

1 IHK/McKernel: A Lightweight Multi-kernel Operating System for Extreme-Scale Supercomputing
Balazs Gerofi, Exascale System Software Team, RIKEN Center for Computational Science
2018/Nov/15, SC18 Intel Extreme Computing Users Group (IXPUG) BoF

2 Motivation
System software/OS challenges for high-end HPC
- Node architecture: increasing complexity
  - Large number of (possibly heterogeneous) processing cores, deep memory hierarchy, complex cache/NUMA topology
- Applications: increasing diversity
  - Traditional/regular HPC + in-situ data analytics + Big Data processing + AI / Machine Learning + workflows, etc.
What do we need from the system software/OS?
- Performance and scalability for large-scale parallel apps
- Support for APIs: tools, productivity, monitoring, etc.
- Full control over HW resources
- Ability to adapt to HW changes
  - Emerging memory technologies, parallelism, power constraints
- Performance isolation and dynamic reconfiguration
  - According to workload characteristics, support for co-location
We need performance and compatibility at the same time!

4 IHK/McKernel: Lightweight Multi-kernel Architecture
Interface for Heterogeneous Kernels (IHK):
- Allows dynamic partitioning of node resources (e.g., cores, physical memory)
- Enables management of multi-kernels (assign resources, load, boot, destroy, etc.)
- Provides inter-kernel communication (IKC), messaging, and notification
- No Linux kernel modifications! No node reboot during reconfiguration and LWK initialization.
McKernel:
- A lightweight kernel (LWK) developed from scratch, boots from IHK
- Designed for HPC: noiseless and simple. It implements only performance-sensitive system calls (roughly process and memory management); the rest are offloaded to Linux
- OS jitter is contained in Linux; the LWK is isolated
[Figure: architecture diagram. Labels: System daemon, Kernel daemon, Proxy process, Delegator module, IHK, HPC Application, IHK co-kernel, McKernel, System call, Interrupt, Memory, Partition]
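The split between locally handled and offloaded system calls can be pictured with a small toy model. This is only a sketch under stated assumptions, not McKernel code: the class name `LightweightKernel`, the `handle` method, and the contents of `LOCAL_SYSCALLS` are hypothetical, chosen to match the slide's "roughly process and memory management" description.

```python
# Toy model of LWK system call routing; hypothetical names, not the real kernel.
from dataclasses import dataclass, field

# Assumption: an illustrative set of "performance-sensitive" calls kept in the LWK.
# The slide only says "roughly process and memory management".
LOCAL_SYSCALLS = {"mmap", "munmap", "brk", "clone", "exit"}


@dataclass
class LightweightKernel:
    # Names of calls that were forwarded to the Linux side, in order.
    offloaded: list = field(default_factory=list)

    def handle(self, name: str) -> str:
        """Return which kernel services the call: 'lwk' or 'linux'."""
        if name in LOCAL_SYSCALLS:
            return "lwk"
        # Everything else is delegated rather than implemented locally.
        self.offloaded.append(name)
        return "linux"


lwk = LightweightKernel()
routes = {name: lwk.handle(name) for name in ["mmap", "open", "brk", "write"]}
print(routes)
```

In this model the memory-management calls stay on the lightweight kernel while the file I/O calls are recorded as offloaded, which mirrors the division of labour the slide describes.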

5 Linux vs. McKernel cores on Xeon Phi KNL
- LWK runs on the majority of the chip
- A few cores are reserved for Linux
- Mechanism to map inter-core communication to MPI process layout
[Figure: KNL chip layout showing NUMA 0 to NUMA 3 and the cores assigned to McKernel]

6 Oakforest-PACS Configuration
- 8k Intel Xeon Phi (Knights Landing) compute nodes
- Intel OmniPath v1 interconnect
- Peak performance: ~25 PF
- Intel Xeon Phi 7250 model:
  - 68 cores @ 1.4 GHz
  - 4 HW threads / core
  - 272 logical OS CPUs altogether
  - 64 cores used for McKernel, 4 for Linux
- 16 GB MCDRAM high-bandwidth memory
  - Hot-pluggable in BIOS
- 96 GB DRAM
- Quadrant flat mode
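The per-node numbers on this slide are consistent with each other, which a few lines of arithmetic make explicit. The variable names below are illustrative only; the inputs are the figures quoted on the slide (272 logical CPUs, 4 HW threads per core, 64 cores for McKernel, 16 GB MCDRAM, 96 GB DRAM).

```python
# Per-node resource arithmetic using the figures quoted on the slide.
LOGICAL_CPUS = 272          # logical OS CPUs per node
HW_THREADS_PER_CORE = 4     # hardware threads per core
MCKERNEL_CORES = 64         # cores given to the lightweight kernel
MCDRAM_GB = 16              # high-bandwidth memory
DRAM_GB = 96                # regular DRAM

cores = LOGICAL_CPUS // HW_THREADS_PER_CORE            # physical cores per node
remaining_cores = cores - MCKERNEL_CORES               # cores left for the host OS
mckernel_cpus = MCKERNEL_CORES * HW_THREADS_PER_CORE   # logical CPUs in the LWK partition
remaining_cpus = remaining_cores * HW_THREADS_PER_CORE
total_memory_gb = MCDRAM_GB + DRAM_GB                  # flat mode exposes both as memory

print(cores, remaining_cores, mckernel_cpus, remaining_cpus, total_memory_gb)
# 68 4 256 16 112
```

So the lightweight kernel sees 256 of the 272 logical CPUs, and the 4 remaining cores account for the other 16.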

7 Mini-applications on full-scale OFP
[Figure: four bar-chart panels for AMG2013, MiniFE, Lulesh, and MILC, with series labelled by "corespec" variants. Annotations in extracted order: AMG2013 19%, MiniFE 2.8X, Lulesh ~2X, MILC 21%. Axis values are not recoverable.]

8 Mini-applications on full-scale OFP
[Figure: bar-chart panels for LAMMPS, GeoFEM, HPCG, and GAMERA, with series labelled by "corespec" variants. One axis is labelled "Analysis runtime (seconds)"; one annotation reads 27%.]

9 Thank you for your attention! Questions?
