IHK/McKernel: A Lightweight Multi-kernel Operating System for Extreme-Scale Supercomputing
Transcription
1 IHK/McKernel: A Lightweight Multi-kernel Operating System for Extreme-Scale Supercomputing
Balazs Gerofi, Exascale System Software Team, RIKEN Center for Computational Science
2018/Nov/15, SC18 Intel Extreme Computing Users Group (IXPUG) BoF
2 Motivation
System software/OS challenges for high-end HPC
- Node architecture: increasing complexity
  - Large number of (possibly heterogeneous) processing cores, deep memory hierarchy, complex cache/NUMA topology
- Applications: increasing diversity
  - Traditional/regular HPC + in-situ data analytics + Big Data processing + AI / Machine Learning + workflows, etc.
What do we need from the system software/OS?
- Performance and scalability for large-scale parallel apps
- Support for APIs: tools, productivity, monitoring, etc.
- Full control over HW resources
- Ability to adapt to HW changes
  - Emerging memory technologies, parallelism, power constraints
- Performance isolation and dynamic reconfiguration
  - According to workload characteristics, support for co-location
We need performance and compatibility at the same time!
3, 4 IHK/McKernel: Lightweight Multi-kernel Architecture
Interface for Heterogeneous Kernels (IHK):
- Allows dynamic partitioning of node resources (e.g., cores, physical memory)
- Enables management of multi-kernels (assign resources, load, boot, destroy, etc.)
- Provides inter-kernel communication (IKC), messaging and notification
McKernel:
- A lightweight kernel (LWK) developed from scratch, boots from IHK
- Designed for HPC: noiseless and simple. It implements only performance-sensitive system calls (roughly process and memory management); the rest are offloaded to Linux
No Linux kernel modifications! No node reboot during reconfiguration and LWK initialization.
OS jitter is contained in Linux; the LWK is isolated.
[Figure: architecture diagram. Labels: system daemon, kernel daemon, proxy process, delegator module, IHK, HPC application, IHK co-kernel, McKernel, partition, memory partition, system call, interrupt]
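The split between locally handled and offloaded system calls can be pictured with a small toy model. This is only an illustrative sketch, not McKernel code: the class names, the `LOCAL_SYSCALLS` set and the returned strings are hypothetical, and the real system forwards requests over IKC to a proxy process on the Linux side rather than calling a Python method.

```python
from dataclasses import dataclass, field

# Hypothetical set of performance-sensitive calls kept inside the LWK
# (roughly process and memory management, as the slide describes).
LOCAL_SYSCALLS = {"mmap", "munmap", "brk", "clone", "exit"}


@dataclass
class ProxyProcess:
    """Stands in for the Linux-side proxy process that runs offloaded calls."""
    handled: list = field(default_factory=list)

    def execute(self, name, args):
        # Record the call so the offload path is observable in this toy model.
        self.handled.append(name)
        return f"linux:{name}"


@dataclass
class LightweightKernel:
    """Toy LWK: serves local calls itself and forwards everything else."""
    proxy: ProxyProcess

    def syscall(self, name, *args):
        if name in LOCAL_SYSCALLS:
            return f"lwk:{name}"               # handled on the LWK cores
        return self.proxy.execute(name, args)  # offloaded to Linux


proxy = ProxyProcess()
lwk = LightweightKernel(proxy)
print(lwk.syscall("mmap", 0, 4096))   # served by the LWK
print(lwk.syscall("open", "/tmp/x"))  # forwarded to the proxy process
```

Keeping the local set small is the point of the design: the fewer calls the LWK implements, the simpler and quieter it stays, while compatibility comes from the offload path.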
5 Linux vs. McKernel cores on Xeon Phi KNL
- LWK runs on the majority of the chip
- A few cores are reserved for Linux
- Mechanism to map inter-core communication to MPI process layout
[Figure: core map across NUMA 0, NUMA 1, NUMA 2 and NUMA 3, with the McKernel cores marked]
6 Oakforest-PACS Configuration
- 8k Intel Xeon Phi (Knights Landing) compute nodes
- Intel Omni-Path v1 interconnect
- Peak performance: ~25 PF
- Intel Xeon Phi 7250 model:
  - 68 cores, … GHz
  - 4 HW threads / core
  - 272 logical OS CPUs altogether
  - 64 cores used for McKernel, 4 for Linux
- 16 GB MCDRAM high-bandwidth memory
- Hot-pluggable in BIOS
- 96 GB DRAM
- Quadrant flat mode
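The core counts on this slide are consistent with each other, which a few lines of arithmetic make explicit. The sketch below is illustrative only: the function name and the dictionary keys are made up, and the 68-core total is taken as the sum of the 64 McKernel cores and 4 Linux cores quoted above.

```python
def partition_logical_cpus(physical_cores, threads_per_core, lwk_cores):
    """Split a node's logical CPUs between the LWK and Linux.

    Returns the logical CPU counts for the whole node, the LWK
    partition and the Linux partition.
    """
    if not 0 <= lwk_cores <= physical_cores:
        raise ValueError("lwk_cores must be between 0 and physical_cores")
    linux_cores = physical_cores - lwk_cores
    return {
        "total": physical_cores * threads_per_core,
        "lwk": lwk_cores * threads_per_core,
        "linux": linux_cores * threads_per_core,
    }


# Oakforest-PACS node as described on the slide: 64 + 4 cores, 4 HW threads each.
layout = partition_logical_cpus(physical_cores=68, threads_per_core=4, lwk_cores=64)
print(layout)
```

With these inputs the total is 272 logical CPUs, matching the slide, of which 256 belong to the McKernel partition and 16 remain with Linux.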
7 Mini-applications on full-scale OFP
[Figure: four performance plots, one each for AMG2013, MiniFE, Lulesh and MILC, with series labels including "+ corespec". Annotations, in order of appearance: 19%, 2.8X, ~2X, 21%]
8 Mini-applications on full-scale OFP
[Figure: four performance plots, one each for LAMMPS, GeoFEM, HPCG and GAMERA, with series labels including "+ corespec". Axis label: Analysis runtime (seconds). Annotation: 27%]
9 Thank you for your attention! Questions?