Path to Exascale? Intel in Research and HPC 2012

Intel's Investment in Manufacturing
New capacity for 14nm and beyond: D1X Oregon (development fab); Fab 42 Arizona (high-volume fab)
22nm fab upgrades: D1D Oregon; D1C Oregon; Fab 32 Arizona; Fab 28 Israel; Fab 12 Arizona

Intel Labs: Delivering Breakthrough Technologies to Fuel Intel's Growth
Strong research partnerships: universities, government, industry
World-class research: processing and programming; energy and sustainability; security and virtualization; Si photonics and wireless; user experience and interaction; and much more

Intel European Exascale Labs
Strong commitment to advancing the leading edge of computing: Intel is collaborating with the HPC community and European researchers.
3 labs in Europe, with Exascale computing as the central topic:
    ExaScale Computing Research Center, Paris
    Exascale Cluster Lab, Jülich
    Exascience Lab, Leuven
Research topics: performance and scalability of Exascale applications; Exascale cluster scalability and reliability; space weather prediction; architectural simulation and visualization; numerical kernels
At the core of Exascale

What's Intel Doing in 2012 in HPC
Intel Xeon processor: E5-2600/4600 product families
Fabric technology: Cray's Aries interconnect; QLogic's TrueScale product family
Intel Many Integrated Core Architecture

Next Front of System Innovation: Fabrics
HPC expertise; intellectual property; world-class interconnects
HPC expertise; fabric management & software; highest-performance, scalable IB products
Low-latency Ethernet switching; data center Ethernet expertise; high-radix & low-radix switch products
Intel's comprehensive connectivity and fabric portfolio: market-leading compute & Ethernet products; platform expertise
Unprecedented rate of innovation in HPC fabric
Other brands and names are the property of their respective owners.

Many Core and Multi-Core
Many Integrated Cores at 1-1.5 GHz*
Multi-core Intel Xeon processor at 2.0-3.5 GHz
(Die size not to scale)
Many-core relies on a high degree of parallelism to compensate for the lower speed of each individual core.
Relatively few specialized applications today are highly parallel, but those applications will benefit from Intel MIC.
*This is an estimated frequency range for purposes of comparison to multi-core; it is not tied to a particular product.
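The trade-off on this slide can be sketched with a simple Amdahl-style model. This is only an illustration: the function names are hypothetical, per-core speed is assumed to scale with clock frequency alone (ignoring vector width, memory bandwidth and microarchitecture), and the core counts in the usage note are made-up values chosen from within the slide's frequency ranges.

```c
/* Relative run time of one unit of work under a simple Amdahl-style model.
 * parallel_fraction: share of the work that parallelizes perfectly (0..1).
 * cores, ghz: hypothetical core count and clock frequency.
 * Assumption: per-core speed is proportional to clock frequency only. */
double model_time(double parallel_fraction, int cores, double ghz)
{
    double serial = (1.0 - parallel_fraction) / ghz;
    double parallel = parallel_fraction / (cores * ghz);
    return serial + parallel;
}

/* Smallest parallel fraction (searched in steps of 0.001) at which the
 * many-core configuration is no slower than the multi-core one.
 * Returns -1.0 if the many-core configuration never catches up. */
double break_even_fraction(int multi_cores, double multi_ghz,
                           int many_cores, double many_ghz)
{
    for (int i = 0; i <= 1000; i++) {
        double p = i / 1000.0;
        if (model_time(p, many_cores, many_ghz) <=
            model_time(p, multi_cores, multi_ghz)) {
            return p;
        }
    }
    return -1.0;
}
```

For example, calling `break_even_fraction(8, 3.0, 50, 1.0)` compares an assumed 8-core 3.0 GHz part with an assumed 50-core 1.0 GHz part; under this model the many-core side only wins once about 97% of the work is parallel, which is consistent with the slide's point that highly parallel codes are the ones that benefit.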

Intel Xeon Phi Product Family, based on Intel Many Integrated Core Architecture
Knights Ferry: software development platform
Knights Corner: 1st Intel MIC product; 22 nm process; >50 Intel Architecture cores; in production in 2012
Future Knights products

Spectrum of Execution Models
From CPU-centric (Intel Xeon processor) to Intel MIC-centric (Intel Many Integrated Core):
    Multi-core Hosted: general-purpose serial and parallel computing
    Offload: codes with highly parallel phases
    Symmetric: codes with balanced needs
    Reverse Offload: codes with serial phases
    Many-Core Hosted: highly parallel codes
(Diagram: for each model, where Main() and MPI_*() run on the multi-core and many-core sides, connected over PCIe.)
Productive Programming Models Across the Spectrum
Legend: Supported with Intel Tools; Not Currently Supported by Language Ext. Offload for Intel Tools

Intel MIC Architecture Code Examples

1. Offloading a function call

    #pragma offload target(mic)
    foo();

    foo() {... }  // Compiled for mic

2. Calculating Pi with automatic offload

    #pragma offload target(mic)
    #pragma omp parallel for reduction(+:pi)
    for (i = 0; i < count; i++) {
        float t = (float)((i + 0.5) / count);
        pi += 4.0 / (1.0 + t * t);
    }
    pi /= count;

3. Using MKL with offload

    void your_hook() {
        float *A, *B, *C; /* Matrices */
        #pragma offload target(mic) in(transa, transb, N, alpha, beta) \
            in(A:length(matrix_elements)) \
            in(B:length(matrix_elements)) \
            in(C:length(matrix_elements)) \
            out(C:length(matrix_elements) alloc_if(0))
        sgemm(&transa, &transb, &N, &N, &N, &alpha, A, &N, B, &N, &beta, C, &N);
    }
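For readers without a coprocessor or the Intel compiler, the Pi example can be tried on the host alone. The sketch below is not the slide's code: it drops the offload pragma, wraps the loop in a hypothetical function named estimate_pi, and accumulates in double rather than float. The sum is a midpoint-rule approximation of the integral of 4/(1+x^2) over [0,1].

```c
/* Midpoint-rule estimate of pi, i.e. the integral of 4/(1+x^2) on [0,1].
 * Host-only variant of the slide's loop; the name estimate_pi and the
 * use of double precision are choices made for this sketch. */
double estimate_pi(long count)
{
    double sum = 0.0;
    long i;

    /* With OpenMP enabled (e.g. -fopenmp) the iterations are shared among
     * threads and the partial sums are combined by the reduction clause.
     * Without OpenMP the pragma is ignored and the loop runs serially. */
    #pragma omp parallel for reduction(+:sum)
    for (i = 0; i < count; i++) {
        double t = (i + 0.5) / count;   /* midpoint of the i-th interval */
        sum += 4.0 / (1.0 + t * t);
    }
    return sum / count;                 /* multiply by interval width 1/count */
}
```

Calling `estimate_pi(1000000)` gives a value that agrees with pi to well within 1e-8; the reduction clause matters because without it the threads would race on the shared accumulator.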

Joint venture between CUT & UWA
MWA, ASKAP, SKA

SKA ICT Challenges
SKA will generate more data in one day than the whole Internet produces in a year (~1.5 EB per day).
Tim Cornwell, CSIRO
eResearch 2012
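To put the ~1.5 EB per day figure in terms of a sustained rate, the conversion can be sketched as below. The function name is hypothetical, and decimal units are assumed (1 EB = 10^18 bytes, 1 TB = 10^12 bytes); the slide does not say which convention it uses.

```c
/* Average sustained data rate, in TB/s, implied by a daily volume in EB.
 * Assumes decimal units: 1 EB = 1e18 bytes, 1 TB = 1e12 bytes. */
double exabytes_per_day_to_terabytes_per_second(double eb_per_day)
{
    const double bytes_per_eb = 1e18;
    const double bytes_per_tb = 1e12;
    const double seconds_per_day = 24.0 * 60.0 * 60.0;

    return eb_per_day * bytes_per_eb / seconds_per_day / bytes_per_tb;
}
```

With an input of 1.5, this works out to roughly 17.4 TB/s averaged over the day.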

Xeon Phi (Intel MIC) Early Adoption at ICRAR
Started in 2011: KNF A0 -> KNC A0 -> KNC B0
Compression of extremely large spectral-imaging data cubes (400 TB; SkuareView, TBB, threads)
Interferometry radio telescope data processing (software cross-correlation; DiFX, MPI, Pthread)
N-body simulations for astrophysics (massive star-forming regions; Gadget2, MPI)
This technology leads us to believe that we are indeed on the path to the Exascale computing required for SKA.