Solvers, Programming Models and Proto Apps TOM VANDER AA APPLICATION WORKSHOP OCTOBER 2016, MANCHESTER

1 Solvers, Programming Models and Proto Apps TOM VANDER AA APPLICATION WORKSHOP OCTOBER 2016, MANCHESTER

2 Strong Exa-Scaling is Hard
CFD application today: 50M mesh points. In ten years: 500M.
ExaScale computers: 10M cores. Hence 50 mesh points per core.
CFD Proxy Application: proto application of EXA2CT.
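The "50 mesh points per core" figure follows directly from the numbers on the slide. A minimal sketch of that arithmetic, with variable names chosen here for illustration only:

```python
# Figures taken from the slide; the variable names are illustrative assumptions.
mesh_points_in_ten_years = 500_000_000   # 500M mesh points
exascale_cores = 10_000_000              # 10M cores

# Strong scaling keeps the problem size fixed and spreads it over all cores.
points_per_core = mesh_points_in_ten_years // exascale_cores
print(points_per_core)
```

With so little work per core, communication cost rather than computation is likely to dominate, which is the motivation for the asynchronous approaches on the following slides.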

3 CFD-Proxy on >1 Xeon-Phi
[Plot: speedup versus number of cores]

4 CFD-Proxy on >1 Xeon-Phi
[Plot: speedup versus number of cores. Series: comm_free, gaspi_bulk_sync, gaspi_async, mpi_bulk_sync, mpi_early_recv, mpi_async, mpi_fence_bulk_sync, mpi_fence_async, mpi_pscw_bulk_sync, mpi_pscw_async]

5 Strong Exa-Scaling is Possible
DON'T: bulk synchronous. DO: asynchronous (GASPI write+notify, MPI ISend/IRecv).
DON'T: single-threaded communication. DO: thread-to-thread communication.
DON'T: MPI data types. DO: multi-threaded packing.

6 EXA2CT
Solvers that scale to ExaScale.
Programming models that scale to ExaScale: TBB, CILK, PATUS, GASPI, SHARK.
Using relevant real-life proto applications.

7 Solvers that scale to ExaScale. Programming models that scale to ExaScale. Using relevant real-life proto applications.

9 Overlap communication and computation in pipelined solvers
Pipelined GMRES overlaps the global communication latency of the dot products with the sparse matrix-vector product (SpMV).
Available in PETSc.
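The overlap idea can be sketched in a few lines. This is a toy illustration under stated assumptions, not PETSc's pipelined GMRES: the "global reduction" is simulated by a local dot product with an artificial delay running in a background thread, and the names spmv and slow_global_dot are hypothetical. In an MPI code, the background task would typically be a non-blocking collective such as MPI_Iallreduce.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def spmv(rows, x):
    """Sparse matrix-vector product; rows[i] is a list of (column, value) pairs."""
    return [sum(v * x[j] for j, v in row) for row in rows]

def slow_global_dot(u, v, latency=0.05):
    """Stand-in for a global reduction: a local dot product plus simulated latency."""
    time.sleep(latency)
    return sum(a * b for a, b in zip(u, v))

# Small tridiagonal test matrix: 2 on the diagonal, -1 next to it.
n = 4
rows = [[(j, 2.0 if j == i else -1.0) for j in (i - 1, i, i + 1) if 0 <= j < n]
        for i in range(n)]
x = [1.0, 2.0, 3.0, 4.0]

with ThreadPoolExecutor(max_workers=1) as pool:
    pending = pool.submit(slow_global_dot, x, x)  # start the "reduction"
    y = spmv(rows, x)                             # useful work while it is in flight
    xx = pending.result()                         # wait only when the value is needed
```

The point of the design is the ordering: the reduction is started first, the SpMV does not depend on its result, and the wait is deferred until the scalar is actually required.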

10 Counter Rounding Errors due to more Local Computations

12 GASPI in a nutshell
PGAS API, designed to be: simple, multithreaded, global asynchronous dataflow, interoperable with MPI.
Calls shown: gaspi_write, gaspi_notify.
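The write-then-notify pattern behind gaspi_write and gaspi_notify can be mimicked in a single process. The sketch below is a toy model built on Python threads, not the GASPI API: the Segment class and its methods write_notify and wait_and_reset are hypothetical names introduced for illustration.

```python
import threading

class Segment:
    """Toy stand-in for a remotely writable memory segment with notification slots."""

    def __init__(self, size, num_notifications):
        self.data = [0.0] * size
        self._flags = [0] * num_notifications
        self._cond = threading.Condition()

    def write_notify(self, offset, values, notification_id, notification_value=1):
        """Copy values into the segment, then raise the notification flag."""
        with self._cond:
            self.data[offset:offset + len(values)] = values
            self._flags[notification_id] = notification_value
            self._cond.notify_all()

    def wait_and_reset(self, notification_id):
        """Block until the flag is non-zero, then return its value and clear it."""
        with self._cond:
            self._cond.wait_for(lambda: self._flags[notification_id] != 0)
            value = self._flags[notification_id]
            self._flags[notification_id] = 0
            return value

segment = Segment(size=4, num_notifications=2)
# The "remote" side writes two values at offset 1 and raises notification 0.
producer = threading.Thread(target=segment.write_notify, args=(1, [7.0, 8.0], 0, 5))
producer.start()
received = segment.wait_and_reset(0)  # the data is guaranteed to be there after this
halo = segment.data[1:3]
producer.join()
```

The receiver never posts a matching receive; it only waits on the notification that tells it the data has landed, which is what makes the dataflow one-sided and asynchronous.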

13 GASPI: Key in EXA2CT
In proto-applications: aviation, machine learning, nano-electronics.
In libraries: for task-based programming, for distributed work stealing, for resilience.

14 Example: TITUS
Distributed work stealing using GASPI, using the small-world principle.
Hide latency with work.
Very high efficiency even for ill-balanced problems.
[Plot: median iteration time versus process count. Series: orig 014M, orig 110M, TITUS 014M, TITUS 110M]
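Work stealing over a small-world victim list can be illustrated with a round-based, single-process simulation. This is a sketch under assumptions, not the TITUS implementation: the victim layout (two ring neighbours plus one long-range shortcut), the steal-half policy, and the function names small_world_victims and run are all chosen here for illustration.

```python
from collections import deque

def small_world_victims(rank, num_workers):
    """Ring neighbours plus one long-range shortcut (an assumed small-world layout)."""
    candidates = [(rank - 1) % num_workers,
                  (rank + 1) % num_workers,
                  (rank + num_workers // 2) % num_workers]
    victims = []
    for c in candidates:
        if c != rank and c not in victims:
            victims.append(c)
    return victims

def run(queues):
    """Each worker runs one task per round; an idle worker first tries to steal."""
    num_workers = len(queues)
    done = [0] * num_workers
    rounds = 0
    while any(queues):
        rounds += 1
        for rank in range(num_workers):
            if not queues[rank]:
                # Idle: take half of the fullest victim's queue, from its tail.
                victim = max(small_world_victims(rank, num_workers),
                             key=lambda v: len(queues[v]))
                for _ in range(len(queues[victim]) // 2):
                    queues[rank].append(queues[victim].pop())
            if queues[rank]:
                queues[rank].popleft()
                done[rank] += 1
    return rounds, done

# Ill-balanced start: all 16 tasks sit on worker 0, the other three are idle.
queues = [deque(range(16))] + [deque() for _ in range(3)]
rounds, done = run(queues)
print(rounds, done)
```

Without stealing, worker 0 would need 16 rounds on its own; with stealing, the other workers pick up part of the load and the run finishes in fewer rounds.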

16 Proto Applications: MUPHY
Proto Applications
[Figure: millions of compounds against 1000s of targets, scale 10µM to 1nM, quarterly updated]
~1% can be filled up with experimental dose-response data.
Why? Experimental cost: >5$ x M compounds x 000 targets.

18 EXA2CT open source for you! (www.exa2ct.eu)
Solvers: in PETSc.
Programming libraries: GASPI, dynamic programming.
Proto-applications: FEM/CFD, but also machine learning and multi-physics.

21 Partners
