Barcelona Supercomputing Center

Similar documents
Butterfly effect of porting scientific applications to ARM-based platforms

The Mont-Blanc project Updates from the Barcelona Supercomputing Center

Pedraforca: a First ARM + GPU Cluster for HPC

Building supercomputers from commodity embedded chips

The Mont-Blanc Project

Building supercomputers from embedded technologies

EU Research Infra Integration: a vision from the BSC. Josep M. Martorell, PhD Associate Director

The Mont-Blanc approach towards Exascale

Exploiting CUDA Dynamic Parallelism for low power ARM based prototypes

European energy efficient supercomputer project

Task-based programming models to support hierarchical algorithms

Supercomputing with Commodity CPUs: Are Mobile SoCs Ready for HPC?

Interactive HPC: Large Scale In-Situ Visualization Using NVIDIA Index in ALYA MultiPhysics

HPC state of play. HPC Ecosystem. HPC system supply. HPC use 24% EU 4,3% EU. Application software & tools. Academia 23% Bio-sciences 22% CAE 21%

GOING ARM A CODE PERSPECTIVE

HPC projects. Grischa Bolls

ARM High Performance Computing

E4-ARKA: ARM64+GPU+IB is Now Here Piero Altoè. ARM64 and GPGPU

Trends in HPC (hardware complexity and software challenges)

HPC Resources & Training

BSC and integrating persistent data and parallel programming models

Optimizing an Earth Science Atmospheric Application with the OmpSs Programming Model

CS500 SMARTER CLUSTER SUPERCOMPUTERS

GPGPU on ARM. Tom Gall, Gil Pitney, 30 th Oct 2013

From the latency to the throughput age. Prof. Jesús Labarta Director Computer Science Dept (BSC) UPC

The Arm Technology Ecosystem: Current Products and Future Outlook

Leveraging Parallelware in MAESTRO and EPEEC

Supercomputing resources in BSC-CNS, RES & PRACE. Sergi Girona BSC Operations Director & PRACE Director

CUDA on ARM Update. Developing Accelerated Applications on ARM. Bas Aarts and Donald Becker

Enabling the ARM high performance computing (HPC) software ecosystem

Beyond Hardware IP An overview of Arm development solutions

GPUs and Emerging Architectures

High Performance Computing The Essential Tool for a Knowledge Economy

Sharing High-Performance Devices Across Multiple Virtual Machines

Introduction to High Performance Computing. Shaohao Chen Research Computing Services (RCS) Boston University

Innovative Alternate Architecture for Exascale Computing. Surya Hotha Director, Product Marketing

User Training Cray XC40 IITM, Pune

D.A.V.I.D.E. (Development of an Added-Value Infrastructure Designed in Europe) IWOPH 17 E4. WHEN PERFORMANCE MATTERS

OmpSs + OpenACC Multi-target Task-Based Programming Model Exploiting OpenACC GPU Kernel

Designed for Maximum Accelerator Performance

Digital Earth Routine on Tegra K1

Illinois Proposal Considerations Greg Bauer

High Performance Computing Software Development Kit For Mac OS X In Depth Product Information

CAS 2K13 Sept Jean-Pierre Panziera Chief Technology Director

SPANISH SUPERCOMPUTING NETWORK

IBM Power Systems HPC Cluster

HPC with GPU and its applications from Inspur. Haibo Xie, Ph.D

Arm Processor Technology Update and Roadmap

ALYA Multi-Physics System on GPUs: Offloading Large-Scale Computational Mechanics Problems

Intel Many Integrated Core (MIC) Architecture

IBM CORAL HPC System Solution

Hybrid KAUST Many Cores and OpenACC. Alain Clo - KAUST Research Computing Saber Feki KAUST Supercomputing Lab Florent Lebeau - CAPS

Exploiting Task-Parallelism on GPU Clusters via OmpSs and rcuda Virtualization

A Breakthrough in Non-Volatile Memory Technology FUJITSU LIMITED

Understanding the Role of GPGPU-accelerated SoC-based ARM Clusters

7 DAYS AND 8 NIGHTS WITH THE CARMA DEV KIT

ECE 574 Cluster Computing Lecture 18

The Cray Rainier System: Integrated Scalar/Vector Computing

InfiniBand Strengthens Leadership as the Interconnect Of Choice By Providing Best Return on Investment. TOP500 Supercomputers, June 2014

The Rise of Open Programming Frameworks. JC BARATAULT IWOCL May 2015

Performance POP up. EU H2020 Center of Excellence (CoE)

Design Decisions for a Source-2-Source Compiler

ACCELERATED COMPUTING: THE PATH FORWARD. Jen-Hsun Huang, Co-Founder and CEO, NVIDIA SC15 Nov. 16, 2015

The challenges of new, efficient computer architectures, and how they can be met with a scalable software development strategy.! Thomas C.

Introduction to High-Performance Computing

High Performance Computing from an EU perspective

Exascale: challenges and opportunities in a power constrained world

On the Path towards Exascale Computing in the Czech Republic and in Europe

Bright Cluster Manager Advanced HPC cluster management made easy. Martijn de Vries CTO Bright Computing

Current Status of the Next- Generation Supercomputer in Japan. YOKOKAWA, Mitsuo Next-Generation Supercomputer R&D Center RIKEN

Introduction to National Supercomputing Centre in Guangzhou and Opportunities for International Collaboration

OpenMPSuperscalar: Task-Parallel Simulation and Visualization of Crowds with Several CPUs and GPUs

HPC future trends from a science perspective

Overview of research activities Toward portability of performance

NEMO Performance Benchmark and Profiling. May 2011

Computing on Low Power SoC Architecture

2013 AWS Worldwide Public Sector Summit Washington, D.C.

Stan Posey, NVIDIA, Santa Clara, CA, USA

An Extension of the StarSs Programming Model for Platforms with Multiple GPUs

Software stack deployment for Earth System Modelling using Spack

HETEROGENEOUS SYSTEM ARCHITECTURE: PLATFORM FOR THE FUTURE

Programming Support for Heterogeneous Parallel Systems

Lecture 1: Introduction and Computational Thinking

HOW TO BUILD A MODERN AI

EuroHPC Bologna 23 Marzo Gabriella Scipione

Energy Efficient Computing Systems (EECS) Magnus Jahre Coordinator, EECS

Post-K: Building the Arm HPC Ecosystem

Implementation of the K-means Algorithm on Heterogeneous Devices: a Use Case Based on an Industrial Dataset

The Stampede is Coming Welcome to Stampede Introductory Training. Dan Stanzione Texas Advanced Computing Center

BlueDBM: An Appliance for Big Data Analytics*

HETEROGENEOUS HPC, ARCHITECTURAL OPTIMIZATION, AND NVLINK STEVE OBERLIN CTO, TESLA ACCELERATED COMPUTING NVIDIA

Please cite this article using the following BibTex entry:

ICON Performance Benchmark and Profiling. March 2012

Managing complex cluster architectures with Bright Cluster Manager

GPU Clouds IGT Cloud Computing Summit Mordechai Butrashvily, CEO 2009 (c) All rights reserved

Productive Performance on the Cray XK System Using OpenACC Compilers and Tools

Users and utilization of CERIT-SC infrastructure

CUDA on ARM Update. Developing Accelerated Applications on ARM. Bas Aarts and Donald Becker

System Design of Kepler Based HPC Solutions. Saeed Iqbal, Shawn Gao and Kevin Tubbs HPC Global Solutions Engineering.

Jay Kruemcke Sr. Product Manager, HPC, Arm,

Transcription:

www.bsc.es Barcelona Supercomputing Center Centro Nacional de Supercomputación EMIT 2016. Barcelona June 2 nd, 2016

Barcelona Supercomputing Center Centro Nacional de Supercomputación BSC-CNS objectives: R&D in Computer, Life, Earth and Engineering Sciences Supercomputing services and support to Spanish and European researchers BSC-CNS is a consortium that includes: Spanish Government 60% Catalonian Government 30% Universitat Politècnica de Catalunya (UPC) 10% 425 people, 41 countries 2

The MareNostrum 3 Supercomputer Over 10 15 Floating Point Operations per second Nearly 50,000 cores 100.8 TB of main memory 2 PB of disk storage 70% distributed through PRACE 24% distributed through RES 6% for BSC-CNS use 3

Mission of BSC Scientific Departments COMPUTER SCIENCES To influence the way machines are built, programmed and used: programming models, performance tools, Big Data, computer architecture, energy efficiency EARTH SCIENCES To develop and implement global and regional stateof-the-art models for short-term air quality forecast and long-term climate applications LIFE SCIENCES To understand living organisms by means of theoretical and computational methods (molecular modeling, genomics, proteomics) CASE To develop scientific and engineering software to efficiently exploit supercomputing capabilities (biomedical, geophysics, atmospheric, energy, social and economic simulations) 4

The BSC-CS project Influence the way machines are built programmed and used Through ideas demonstration Cooperation with manufacturers & products Our strength Vision Holistic/vertical background Technology Co-design environment 5

BSC technologies Programming model The StarSs concept (*Superscalar) : Task based, dataflow out of order execution Criticality and locality aware scheduling The OmpSs implementation OpenMP Standard Dynamic resource management Performance tools Trace visualization and analysis: extreme flexibility and detail Performance analytics Architecture background Leverage other markets investments Runtime aware architecture Big data Integration of Computational workflow models and runtimes Storage architecture and management Dynamic resource management 6

POP CoE Performance Optimization and Productivity A Horizontal Center of Excellence Across application areas, platforms, scales Providing Performance Optimization and Productivity services Precise understanding of application and system behavior Suggestion on how to refactor code in the most productive way For academic AND industrial codes Oct 2015 March 2018 Using both Open Source and Commercial toolsets? Application Performance Audit! Application Performance Plan Proof-of-Concept www.pop-coe.eu Funded by EC s H2020 programme 7

Mont-Blanc projects in a glance Vision: to leverage the fast growing market of mobile technology for scientific computation, HPC and non-hpc workload. 2012 2013 2014 2015 2016 2017 2018 Mont-Blanc Mont-Blanc 2 Mont-Blanc 3 8

Leveraging a fast-growing market HPC (+16%) Jun 2015: 25 M cores Nov 2015: 29 M cores Server today tomorrow Mont-Blanc 3 Server (+3%) 2013: 9.0 M 2014: 9.3 M Cost Desktop PC (-1 %) 2013: 316 M 2014: 314 M Mobile Mont-Blanc 1 and 2 Performance Smartphone (+30%) 2013: 1000 M 2014: 1300 M 9

The Mont-Blanc prototype ecosystem Prototypes are critical to accelerate software development System software stack + applications PRACE prototypes Tibidabo Carma Pedraforca Mini-clusters Arndale Odroid XU Odroid XU-3 NVIDIA Jetson Mont-Blanc prototype 1080 compute cards Fine grained power monitoring system Installed between Jan and May 2015 Operational since May 2015 @ BSC ARM 64-bit mini-clusters APM X-GENE2 Cavium ThunderX NVIDIA TX1 Mont-Blanc 3 demonstrator Based on newgeneration ARM 64- bit processors Targeting HPC market 2012 2013 2014 2015 2016 2017 10

The Mont-Blanc prototype Exynos 5 compute card 2 x Cortex-A15 @ 1.7GHz 1 x Mali T604 GPU 6.8 + 25.5 GFLOPS 15 Watts 2.1 GFLOPS/W Carrier blade 15 x Compute cards 485 GFLOPS 1 GbE to 10 GbE 300 Watts 1.6 GFLOPS/W Rack 8 BullX chassis 72 Compute blades 1080 Compute cards 2160 CPUs 1080 GPUs 4.3 TB of DRAM 17.2 TB of Flash 35 TFLOPS 24 kwatt Blade chassis 7U 9 x Carrier blade 135 x Compute cards 4.3 TFLOPS 2.7 kwatts 1.6 GFLOPS/W 11

System software stack and applications Source files (C, C++, FORTRAN, Python, ) Compilers GNU JDK Mercurium Scientific libraries ATLAS FFTW HDF5 clblas LAPACK Boost PETSc clfft Weak scaling Developer tools Extrae Perf DDT Scalasca Nagios Ganglia Cluster management Puppet SLURM OpenLDAP NTP Strong scaling Runtime libraries Nanos++ OpenCL CUDA MPI Power monitor Hardware support / Storage DVFS NFS Lustre CPU CPU CPU Linux OS / Ubuntu OpenCL driver GPU Network driver Network 12

Final remarks BSC highly active in emerging technologies Montblanc 1 and 2 Important role in the move of ARM based system towards HPC Demonstrating the potential to scale with low end components Cost effectiveness: showing the possibility to leverage components and technologies developed by other markets Challenging established thinking in the HPC sector A practical approach: design freezes in a very fast moving market but still possible to demonstrate ideas Potential To revert to other markets To improve the traditional HPC sector Montblanc 3 The real revolution is in the mind of programmers: latency throughput mindset Takes time but unavoidable 13