High Performance Fortran. James Curry

1 High Performance Fortran James Curry

2 Wikipedia! New Fortran statements, such as FORALL, and the ability to create PURE (side-effect-free) procedures; compiler directives for recommended distributions of array data; an extrinsic procedure interface for interfacing to non-HPF parallel procedures, such as those using message passing; additional library routines, including environmental inquiry, parallel prefix/suffix (e.g., 'scan'), data scattering, and sorting operations.

3 A History of HPF v1.0: HPF Version 1.0 was defined at HPFF meetings; built on top of Fortran 90; Rice University! GOALS: data parallel programming; top performance on SIMD and MIMD; code tuning.

4 Achieving Parallelism: FORALL statement/construct; PURE procedures; INDEPENDENT directives

5 FORALL STATEMENT: Element Array Assignment Statement
    FORALL ( i = 1:n, j = 1:m ) a(i,j) = i + j
instead of...
    a = SPREAD( (/ (i, i=1,n) /), DIM=2, NCOPIES=m ) + &
        SPREAD( (/ (i, i=1,m) /), DIM=1, NCOPIES=n )
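The equivalence claimed on this slide can be checked with a small standard Fortran program. This is only a sketch: the program name, the second array `b`, and the sizes `n = 3`, `m = 4` are illustrative choices, not part of the slides.

```fortran
program forall_vs_spread
  implicit none
  integer, parameter :: n = 3, m = 4   ! illustrative sizes (assumption)
  integer :: a(n, m), b(n, m)
  integer :: i, j

  ! Element-wise form: a(i,j) = i + j for every index pair.
  forall (i = 1:n, j = 1:m) a(i, j) = i + j

  ! Whole-array form: the first SPREAD copies the row index across columns,
  ! the second copies the column index down the rows.
  b = spread((/ (i, i = 1, n) /), dim=2, ncopies=m) + &
      spread((/ (i, i = 1, m) /), dim=1, ncopies=n)

  if (all(a == b)) then
    print *, 'FORALL and SPREAD forms agree'
  else
    print *, 'forms differ'
  end if
end program forall_vs_spread
```

The FORALL version states the intent directly, whereas the SPREAD version needs the reader to work out which dimension each vector is replicated along.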

6 FORALL CONSTRUCT
    FORALL ( i = 2:n-1, j = 2:n-1 )
      a(i,j) = a(i,j-1) + a(i,j+1) + a(i-1,j) + a(i+1,j)
      b(i,j) = a(i,j)
    END FORALL
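What makes the construct different from a nested DO loop is that each assignment evaluates all of its right-hand sides before storing anything. A one-dimensional sketch of that difference follows; the arrays `a` and `c`, their size, and their initial values are assumptions made for illustration.

```fortran
program forall_semantics
  implicit none
  integer, parameter :: n = 5
  integer :: a(n), c(n), i

  a = (/ 1, 2, 3, 4, 5 /)
  c = a

  ! FORALL: every right-hand side is computed from the old values of a
  ! before any element is stored.
  forall (i = 2:n-1)
    a(i) = a(i-1) + a(i+1)
  end forall

  ! Sequential DO: c(i-1) has already been overwritten when c(i) is computed.
  do i = 2, n-1
    c(i) = c(i-1) + c(i+1)
  end do

  print *, 'forall:', a   ! 1 4 6 8 5
  print *, 'do    :', c   ! 1 4 8 13 5
end program forall_semantics
```

Because no iteration sees another iteration's result, the FORALL body can be evaluated in any order, which is what lets a compiler spread it across processors.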

7 PURE Procedures: PURE = no side effects. A procedure is required to be PURE if it is used in the mask or body of a FORALL statement; within the body of a PURE procedure; or as an actual argument in a PURE procedure reference.
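A minimal sketch of the first case, a user-defined function called from the body of a FORALL. The module name `pure_demo` and the function `scaled_square` are hypothetical names chosen for this example.

```fortran
module pure_demo
  implicit none
contains
  ! PURE: reads only its INTENT(IN) argument and changes no global state.
  pure function scaled_square(x) result(y)
    real, intent(in) :: x
    real :: y
    y = 2.0 * x * x
  end function scaled_square
end module pure_demo

program use_pure
  use pure_demo
  implicit none
  integer, parameter :: n = 4
  real :: v(n)
  integer :: i

  v = (/ 1.0, 2.0, 3.0, 4.0 /)

  ! Allowed only because scaled_square is declared PURE.
  forall (i = 1:n) v(i) = scaled_square(v(i))

  print *, v   ! 2, 8, 18, 32
end program use_pure
```

Removing the `pure` prefix should make a conforming compiler reject the FORALL, since it could no longer assume the calls are free of side effects.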

8 INDEPENDENT Directive: Precedes a DO loop or a FORALL statement or construct. Asserts to the compiler that the iterations of the loop can be executed in any order without affecting the semantics.
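A typical use is a loop with indirect addressing, where the compiler cannot prove on its own that no two iterations write the same element. The sketch below assumes an index array `idx` that is a permutation; the array names and sizes are illustrative. To a plain Fortran compiler the `!HPF$` line is just a comment.

```fortran
program independent_demo
  implicit none
  integer, parameter :: n = 8
  integer :: idx(n), i
  real :: x(n), y(n)

  idx = (/ (n + 1 - i, i = 1, n) /)   ! a permutation: no index repeats
  y   = (/ (real(i), i = 1, n) /)

  ! The programmer asserts that no iteration interferes with another.
!HPF$ INDEPENDENT
  do i = 1, n
    x(idx(i)) = 2.0 * y(i)
  end do

  print *, x
end program independent_demo
```

The directive is an assertion, not a request: if `idx` did contain repeated values, the program would no longer conform and the result could depend on execution order.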

9 V1.1: Corrections, clarifications, and interpretations!

10 HPF v2.0 GOALS: Support for data parallel programming (single-threaded, global name space, and loosely synchronous parallel computation); portability across different architectures; high performance on parallel computers with nonuniform memory access costs, while not impeding performance on other machines; use of Standard Fortran (currently Fortran 95) as a base; open interfaces and interoperability with other languages (e.g., C) and other programming paradigms (e.g., message passing using MPI).

11 2.0 Features: Data Distribution; Data Parallel Execution Features; Extrinsic Program Units; Intrinsic Functions and Standard Library

12 Data Distribution: Conflict between Fortran standards and architecture demands; ALIGN; DISTRIBUTE

13 Data Parallel Execution Features: INDEPENDENT; REDUCTION
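The two features combine when a loop accumulates into a single variable: the iterations are independent except for the accumulation, which the REDUCTION clause names. The following is a sketch assuming the HPF 2.0 clause form `INDEPENDENT, REDUCTION(variable)`; the variable `total` and the array `x` are illustrative. A plain Fortran compiler treats the directive as a comment and runs the loop sequentially.

```fortran
program reduction_demo
  implicit none
  integer, parameter :: n = 10
  real :: x(n), total
  integer :: i

  x = (/ (real(i), i = 1, n) /)
  total = 0.0

  ! Every iteration updates total, so the loop is not independent by itself;
  ! the REDUCTION clause marks total as an accumulator.
!HPF$ INDEPENDENT, REDUCTION(total)
  do i = 1, n
    total = total + x(i)
  end do

  print *, total   ! 55
end program reduction_demo
```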

14 Extrinsic Program Units: HPF programs using non-HPF procedures; HPF programs using a different programming model (single logical thread, multiple threads of control). A program unit's language and model taken together make up its extrinsic kind: EXTRINSIC(LANGUAGE = 'HPF', MODEL = 'GLOBAL')

15 Intrinsic Functions: NUMBER_OF_PROCESSORS; PROCESSORS_SHAPE
      INTEGER, DIMENSION(SIZE(PROCESSORS_SHAPE())) :: PSHAPE
!HPF$ TEMPLATE T(100, 3 * NUMBER_OF_PROCESSORS())

16 2.0 Extensions: Data Mapping; Data and Task Parallelism; Asynchronous I/O; HPF Extrinsics

17 Data Mapping: REALIGN, REDISTRIBUTE, DYNAMIC; GEN_BLOCK, INDIRECT; RANGE; SHADOW

18 Data and Task Parallelism: ON; RESIDENT; TASK_REGION

19 Asynchronous I/O: Extension of READ/WRITE to be asynchronous for direct, unformatted data. This is done by adding control parameters that specify nonblocking execution, plus a new statement (WAIT).

20 HPF Extrinsics: Interfaces for different models of parallelism (LOCAL for SPMD; SERIAL for single-process sequential); interfaces with C and Fortran 77

21 Changes from v1.1: Repartitioning of the language; features now in standard Fortran; features removed and restricted; elimination of the HPF Subset

22 Changes from v1.1: Features moved to Approved Extensions; new features of 2.0; new Approved Extensions; HPF Extrinsics

23 HPF Code Example
    FORALL ( v1 = x1:u1:s1, mask1 )
      FORALL ( v2 = x2:u2:s2, mask2 )
        a(e1) = rhs1
        b(e2) = rhs2
      END FORALL
    END FORALL
Equivalent Fortran 90 code is 67 lines

24 HPF Code Example
      REAL a(1000), b(1000), c(1000), x(500), y(0:501)
      INTEGER inx(1000)
!HPF$ PROCESSORS procs(10)
!HPF$ DISTRIBUTE (BLOCK) ONTO procs :: a, b, inx
!HPF$ DISTRIBUTE (CYCLIC) ONTO procs :: c
!HPF$ ALIGN x(i) WITH y(i+1)
      ...
      a(i) = b(i)                      ! Assignment 1
      x(i) = y(i+1)                    ! Assignment 2
      a(i) = c(i)                      ! Assignment 3
      a(i) = a(i-1) + a(i) + a(i+1)    ! Assignment 4
      a(i) = c(i-1) + c(i) + c(i+1)    ! Assignment 5
      x(i) = y(i)                      ! Assignment 6
      a(i) = a(inx(i)) + b(inx(i))     ! Assignment 7
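To see why these assignments differ in communication cost, it helps to compute which processor owns each element. The sketch below assumes the usual definitions: BLOCK gives each processor a contiguous chunk of ceiling(n/np) elements, and CYCLIC deals elements out round-robin. The helper functions `block_owner` and `cyclic_owner` are hypothetical names written for this illustration, not HPF library routines.

```fortran
program owners
  implicit none
  integer, parameter :: n = 1000, np = 10   ! sizes from the slide
  integer :: i, same

  ! Count the elements for which a(i) (BLOCK) and c(i) (CYCLIC)
  ! live on the same processor, as needed by Assignment 3.
  same = 0
  do i = 1, n
    if (block_owner(i, n, np) == cyclic_owner(i, np)) same = same + 1
  end do
  print *, 'a(i) and c(i) on the same processor:', same, 'of', n   ! 100 of 1000

contains

  ! BLOCK: contiguous chunks of ceiling(nelem/nproc) elements per processor.
  pure integer function block_owner(i, nelem, nproc)
    integer, intent(in) :: i, nelem, nproc
    block_owner = (i - 1) / ((nelem + nproc - 1) / nproc) + 1
  end function block_owner

  ! CYCLIC: element i goes to processor mod(i-1, nproc) + 1.
  pure integer function cyclic_owner(i, nproc)
    integer, intent(in) :: i, nproc
    cyclic_owner = mod(i - 1, nproc) + 1
  end function cyclic_owner

end program owners
```

Under these assumptions, Assignment 1 needs no communication because `a` and `b` share a distribution, while Assignment 3 finds its operand on a different processor for 900 of the 1000 elements.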

25 Compiling: ADAPTOR, from the Institut für Algorithmen und Wissenschaftliches Rechnen

26 What does it do? ADAPTOR is an HPF compiler that generates parallel Fortran programs using MPI and/or Pthreads; an OpenMP compiler that generates parallel Fortran programs using Pthreads; and a source-to-source translation system for optimizing Fortran codes for cache architectures.

27 Where is it used? There are currently 35 listed HPF applications and around 20 projects.

28 Applications: 3D Magnetohydrodynamics; AEROLOG; ARC3D; Princeton Ocean Model; Simmux

29 Projects: Adaptor; Aurora; Fx; PHAROS

30 Benchmarks: HPFBench reports FLOP count, memory usage, communication pattern, local memory access, and array allocation

31 Questions?

32 Sources

High Performance Fortran Language Specification. High Performance Fortran Forum. January 31, Version 2.0
