OpenMP 4.0: A Significant Paradigm Shift in Parallelism

Similar documents
OpenMP TR3 aka future 4.1 Introducing OpenMPCon 2015 OpenMP Code Challenge

S Comparing OpenACC 2.5 and OpenMP 4.5

State of OpenMP & Outlook on OpenMP 4.1

OpenACC/CUDA/OpenMP... 1 Languages and Libraries... 3 Multi-GPU support... 4 How OpenACC Works... 4

Comparing OpenACC 2.5 and OpenMP 4.1 James C Beyer PhD, Sept 29 th 2015

SC18 OpenMP BoF Report (BoF 109) Jim Cownie, Michael Klemm 28 November 2018

OpenACC2 vs.openmp4. James Lin 1,2 and Satoshi Matsuoka 2

The Heterogeneous Programming Jungle. Service d Expérimentation et de développement Centre Inria Bordeaux Sud-Ouest

Progress on OpenMP Specifications

Cray XE6 Performance Workshop

OpenMP 3.0. OpenMP ARB. Mark Bull SC07, November, 2007, Reno

AutoTune Workshop. Michael Gerndt Technische Universität München

Portability of OpenMP Offload Directives Jeff Larkin, OpenMP Booth Talk SC17

Understanding Dynamic Parallelism

SC16 OpenMP* BoF Report (BoF 113) Jim Cownie, Michael Klemm 28 November 2016

A Uniform Programming Model for Petascale Computing

Steve Scott, Tesla CTO SC 11 November 15, 2011

OpenACC Standard. Credits 19/07/ OpenACC, Directives for Accelerators, Nvidia Slideware

COMP Parallel Computing. Programming Accelerators using Directives

compunity: The OpenMP Community

INTRODUCTION TO OPENACC. Analyzing and Parallelizing with OpenACC, Feb 22, 2017

Modernizing OpenMP for an Accelerated World

Overview: The OpenMP Programming Model

Advanced OpenACC. Steve Abbott November 17, 2017

OpenACC Course. Office Hour #2 Q&A

OpenACC 2.6 Proposed Features

OpenMP 4.0/4.5. Mark Bull, EPCC

Evaluation of Asynchronous Offloading Capabilities of Accelerator Programming Models for Multiple Devices

An Introduction to the SPEC High Performance Group and their Benchmark Suites

Serial and Parallel Sobel Filtering for multimedia applications

trisycl Open Source C++17 & OpenMP-based OpenCL SYCL prototype Ronan Keryell 05/12/2015 IWOCL 2015 SYCL Tutorial Khronos OpenCL SYCL committee

Directive-based Programming for Highly-scalable Nodes

A General Discussion on! Parallelism!

OpenMP 4.0. Mark Bull, EPCC

Productive Performance on the Cray XK System Using OpenACC Compilers and Tools

ELP. Effektive Laufzeitunterstützung für zukünftige Programmierstandards. Speaker: Tim Cramer, RWTH Aachen University

OpenMP 4.5 target. Presenters: Tom Scogland Oscar Hernandez. Wednesday, June 28 th, Credits for some of the material

Addressing the Increasing Challenges of Debugging on Accelerated HPC Systems. Ed Hinkel Senior Sales Engineer

Hybrid KAUST Many Cores and OpenACC. Alain Clo - KAUST Research Computing Saber Feki KAUST Supercomputing Lab Florent Lebeau - CAPS

A Study of High Performance Computing and the Cray SV1 Supercomputer. Michael Sullivan TJHSST Class of 2004

Addressing Heterogeneity in Manycore Applications

Early Experiences Writing Performance Portable OpenMP 4 Codes

OPENACC DIRECTIVES FOR ACCELERATORS NVIDIA

An Introduction to OpenMP

Achieving Portable Performance for GTC-P with OpenACC on GPU, multi-core CPU, and Sunway Many-core Processor

OpenMP I. Diego Fabregat-Traver and Prof. Paolo Bientinesi WS16/17. HPAC, RWTH Aachen

OpenMP for Heterogeneous Multicore Embedded Systems using MCA API standard interface

OpenMP 4.5 Validation and

OpenMP Technical Report 1 on Directives for Attached Accelerators

Hybrid Parallelization: Performance from SMP Building Blocks

Hybrid multicore supercomputing and GPU acceleration with next-generation Cray systems

OpenACC. Introduction and Evolutions Sebastien Deldon, GPU Compiler engineer

Expressing Heterogeneous Parallelism in C++ with Intel Threading Building Blocks A full-day tutorial proposal for SC17

OpenCL: History & Future. November 20, 2017

OPENMP GPU OFFLOAD IN FLANG AND LLVM. Guray Ozen, Simone Atzeni, Michael Wolfe Annemarie Southwell, Gary Klimowicz

OpenMP for Accelerators

HSA Foundation! Advanced Topics on Heterogeneous System Architectures. Politecnico di Milano! Seminar Room (Bld 20)! 15 December, 2017!

SIGGRAPH Briefing August 2014

Parallel Programming. Libraries and Implementations

Advanced OpenMP. Lecture 11: OpenMP 4.0

Open Standards for Building Virtual and Augmented Realities. Neil Trevett Khronos President NVIDIA VP Developer Ecosystems

The Role of Standards in Heterogeneous Programming

OpenMP 4.0: a Significant Paradigm Shift in High-level Parallel Computing (Not just Shared Memory anymore) Michael Wong OpenMP CEO

HSA foundation! Advanced Topics on Heterogeneous System Architectures. Politecnico di Milano! Seminar Room A. Alario! 23 November, 2015!

Next Generation OpenGL Neil Trevett Khronos President NVIDIA VP Mobile Copyright Khronos Group Page 1

OPENACC ONLINE COURSE 2018

C6000 Compiler Roadmap

Scientific Programming in C XIV. Parallel programming

Parallel and High Performance Computing CSE 745

GPU Debugging Made Easy. David Lecomber CTO, Allinea Software

The Eclipse Parallel Tools Platform

An Introduction to OpenAcc

Early Experiences with the OpenMP Accelerator Model

A brief introduction to OpenMP

Ian Buck, GM GPU Computing Software

Shared memory programming model OpenMP TMA4280 Introduction to Supercomputing

NVIDIA Think about Computing as Heterogeneous One Leo Liao, 1/29/2106, NTU

Experiences with Achieving Portability across Heterogeneous Architectures

GPU Architecture. Alan Gray EPCC The University of Edinburgh

OpenMP 4.0 implementation in GCC. Jakub Jelínek Consulting Engineer, Platform Tools Engineering, Red Hat

Performance Tools for Technical Computing

Lattice Simulations using OpenACC compilers. Pushan Majumdar (Indian Association for the Cultivation of Science, Kolkata)

A case study of performance portability with OpenMP 4.5

Accelerator programming with OpenACC

Introduction to Parallel and Distributed Computing. Linh B. Ngo CPSC 3620

Introduction to OpenMP. Lecture 2: OpenMP fundamentals

Introduction to OpenMP

Portable and Productive Performance with OpenACC Compilers and Tools. Luiz DeRose Sr. Principal Engineer Programming Environments Director Cray Inc.

Omni Compiler and XcodeML: An Infrastructure for Source-to- Source Transformation

Shared Memory Programming with OpenMP

MPI 1. CSCI 4850/5850 High-Performance Computing Spring 2018

PERFORMANCE PORTABILITY WITH OPENACC. Jeff Larkin, NVIDIA, November 2015

TOOLS FOR IMPROVING CROSS-PLATFORM SOFTWARE DEVELOPMENT

IBM. IBM XL C/C++ and XL Fortran compilers on Power architectures overview

Is OpenMP 4.5 Target Off-load Ready for Real Life? A Case Study of Three Benchmark Kernels

The Arm Technology Ecosystem: Current Products and Future Outlook

Session 4: Parallel Programming with OpenMP

OpenMP tasking model for Ada: safety and correctness

EE/CSCI 451 Introduction to Parallel and Distributed Computation. Discussion #4 2/3/2017 University of Southern California

SDN and NFV as expressions of a systemic trend «integrating» Cloud, Networks and Terminals

Transcription:

OpenMP 4.0: A Significant Paradigm Shift in Parallelism Michael Wong OpenMP CEO michaelw@ca.ibm.com http://bit.ly/sc13-eval SC13

OpenMP 4.0 released 2

Agenda The OpenMP ARB History of OpenMP OpenMP 4.0 Parallelism OpenMP Future Parallelism OpenMP new Mission Statement 3

OpenMP ARB Current Organization OpenMP ARB (Administrative) OpenMP Board of Directors Josh Simons, Vmware Sanjiv Shah, Intel Koh Hotta, Fujitsu Andy Fritsch, TI Partha Tirumalai, Oracle One representative per member OpenMP Officers Michael Wong, CEO David Poulsen, CFO Josh Simons, Secretary OpenMP Committees (Actual Work) One representative per member Language, Bronis de Supinski Marketing, Matthijs van Waveren Tools, Martin Schutlz, Robert Dietrich 4

Agenda The OpenMP ARB History of OpenMP OpenMP 4.0 Parallelism OpenMP Future Parallelism OpenMP new Mission Statement 5

A brief history of OpenMP API Fortran V1.0 Fortran V1.1 Fortran V2.0 Fortran, C & C++ V2.5 Fortran, C & C++ V4.0 Fortran, C & C++ V3.1 Fortran, C & C++ V3.0 1998 2002 1997 C & C++ V1.0 1999 2000 2001 C & C++ V2.0 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 onwards, more agile Next OpenMP revision cycle: faster, more predictable Less monolithic: Delivering concurrent TRs & language extensions OpenMP is a living language

Today OpenMP internal Organization

OpenMP Members growth 26 members and growing From Dieter An Mey, RWTH Aachen 2012 8

Members added since 2012 Red Hat/GCC Barcelona SuperComputing Centre University of Houston 9

Agenda The OpenMP ARB History of OpenMP OpenMP 4.0 Parallelism OpenMP Future Parallelism OpenMP new Mission Statement 10

OpenMP 4.0 released 11

OpenMP 4.0 July: Released OpenMP 4.0 after 5 years This represents collaborative work by many of the brightest in industry, research, and academia, building on the consensus of 26 members. We strive to deliver high-level parallelism that is portable across 3 widelyimplemented common General Purpose languages, productive for HPC and consumers, and delivers highly competitive performance. I want to congratulate all the members for coming together to create such a momentous advancement in parallel programming, under such tight constraints and industry challenges. With this release, the OpenMP API will move immediately forward to the next release to bring even more usable parallelism to everyone. Michael Wong 12

Compilers coming GCC close to complete, expected in mid 2014 Intel 13.1 compiler supports Accelerators Clang support for OpenMP completed and released in Clang 3.4 Cray, TI, IBM, Oracle coming online 13

Significant Paradigm Shift OpenMP feels like a new language. The most pervasive parallelism Supporting HPC and consumers OpenMP adoption scorecard. How are we doing? Learning, compilers, and books. What s next for OpenMP?

More usable parallelism With OpenMP 4.0, modern OpenMP code breaks from traditional Shared Memory Parallelism, and works on heterogeneous devices across different memory architecture Provide high level way of programming parallelism And non-proprietary Still supports HPC, but now also consumer parallelism Error Model, tasks dependency, accelerators

We re All Learning OpenMP 4.0 The world is still in the early stages of gaining field experience using the new language features, individually and together. Includes everyone: OpenMP members, authors, developers. Example now published separately from spec

Compilers, Tools, and Books Using OpenMP: Implementations. Compilers: Several available now or soon (2014). GNU/GCC mid 2014 Intel Clang Tools: Rice University HPCTools Kit, IBM POMP support Learning OpenMP: Books, Tutorials, Videos, webinars. Tutorials: day long and in booth Books and articles coming More videos posted based on SC and other talks Webinars in the plan

OpenMP 4.0 Features Support for accelerators SIMD constructs to vectorize both serial as well as parallelized loops Error handling (cancel construct) Thread affinity Task groups and dependency Support for Fortran 2003 User-defined reductions Sequentially consistent atomics OMP_DISPLAY_ENV environment variable array section syntax for C/C++ Slide 18

OpenACC1 compared to OpenMP 4.0 (by Dr. James Beyer) OpenACC1 Parallel (offload) Parallel (multiple threads ) Kernels Data Loop Host data Cache Update Wait Declare OpenMP 4.0 Target Team/Parallel Target Data Distribute/Do/for/Simd Target Update Declare Target Slide 19

Agenda The OpenMP ARB History of OpenMP OpenMP 4.0 Parallelism OpenMP Future Parallelism OpenMP new Mission Statement 20

OpenMP future features OpenMP Tools: Profilers and Debuggers Consumer style parallelism: event/async/futures Enhance Accelerator support Additional Looping constructs Transactional Memory, Speculative Execution Task Model refinements CPU Affinity Common Array Shaping Full Error Model Interoperability Rebase to new C/C++/Fortran Standards Slide 21

Future OpenACC vs future OpenMP (by Dr. James Beyer) OpenACC2 enter data exit data data api routine async wait parallel in parallel tile Linkable Device_type OpenMP future Unstructured data environment declare target Parallel in parallel or team tile Linkable Device_type Slide 22

Agenda The OpenMP ARB History of OpenMP OpenMP 4.0 Parallelism OpenMP Future Parallelism OpenMP new Mission Statement 23

The Future Mission of OpenMP OpenMP s new mission statement Standardize directive-based multi-language high-level parallelism that is performant, productive and portable Updated from "Standardize and unify shared memory, thread-level parallelism for HPC 24

Marketing Larger booth with bigger tables, 10 stools Booth mini-talks, giveaways, cake, beer More talks and tutorials, speaker gifts 25

Brochures/Guides OMP WelcomeGuide(for serious prospects) OMP Meetings and Email Lists(for new members) OMP Membership Brochure(given out at conferences) OMP FAQ(given out at conferences) OMPSCheduleFlier(SC related talks and schedules) OpenMPbrochure-SC13(given out at SC 26

OpenACC (Statement in 2011) A spinoff from 4 OpenMP Members NVIDIA, PGI, Cray, CAPS To address immediate customer needs To hold the IP And to Beta test the OpenMP Accelerator implementation In time, will be folded back to OpenMP in some form We continue to aim to merge in future 27

External linkage Multicore association For closely distributed systems, have API They carry our brochures OpenFPGA C-code->(impulse)->verilog C-code->(impulse)->verilog->(VPR)->FPGA-image C-code-with-OpenMP->(impulse-exploiting-openMP)- >verilog->(vpr)->fpga-image Standard C++ Foundation I am serving as Director and Vice-President 28

Major Website update Prepare for multiple devices any screen size, windows to phones touch-friendly/-enabled designed for offline reading Links to Stack Overflow, Reddit on OpenMP related questions 29

Future meetings http://openmp.org/calendar.html Winter 2014: Jan 27-31, Intel Santa Clara Spring 2014: April, London, UK July/August: possible 3 rd meeting Fall 2013:Sept/Oct, IWOMP 2014: Brazil Continue to drive to ratify 4.1 and 5.0 30

OpenMP future More agile More rapid releases More technical Reports More consumer-style parallelism Deliver faster then ISO Deliver experimental and proprietary parallelism Allows you to be productive on several languages Supported by many vendors 31

OpenMP internal Organization Today Future

Feedback http://bit.ly/sc13-eval 33