EZTrace upcoming features

1 EZTrace upcoming features
François Trahay
francois.trahay@telecom-sudparis.eu

2 Context
Hardware is more and more complex
- NUMA, hierarchical caches, GPU, ...
Software is more and more complex
- Hybrid MPI+OpenMP, MPI+CUDA
Achieving good performance is hard
Understanding the performance of an application is difficult
Need for performance analysis tools

3 EZTrace

4 EZTrace
Framework for performance analysis
- Generate execution traces
  - Standard OTF or Pajé trace files
- Provides trace analysis facilities
Developed at TSP + INRIA Bordeaux
CeCILL-B licence (~BSD)
C / Fortran / C++ programs
Mostly tested on x86_64 / ARM

5 EZTrace plugins
A set of pre-defined plugins
- Major programming models (MPI, OpenMP, CUDA)
  - Possibility to combine plugins (e.g. MPI+OpenMP)
- General-purpose plugins (memory, pthread)
- Performance counters (PAPI)
User-defined plugins
- Created for the user application/library
    $ eztrace_plugin_generator ./application
    $ eztrace_create_plugin example.tpl
Third-party plugins
- Shipped with external libraries (e.g. PLASMA)

6 EZTrace 2-step analysis
Run the application
    $ mpiexec -np 2 eztrace -t "mpi cuda" ./my_application
- Automatic instrumentation of the program
- Record events at key points
- Generate a (compact) raw trace file per process
Post-mortem analysis
    $ eztrace_convert /tmp/trahay_eztrace_log_rank_*
- Generate a trace file for visualisation (Pajé / OTF)
  - Using the GTG library
- Extract statistics

7 Generating traces
Performance analysis tools

8 Instrumentation with LD_PRELOAD
- Install a wrapper for key functions
- Only works for shared libraries
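The wrapper idea can be illustrated outside of C. LD_PRELOAD itself interposes on symbols of shared libraries at load time; the sketch below only mirrors the principle in Python, by replacing a function with a wrapper that records an event on entry and on exit before and after calling the original. The names `trace`, `instrument`, and `send` are hypothetical and are not part of EZTrace.

```python
import functools
import time

# Hypothetical in-memory event list: (timestamp_ns, kind, function name).
trace = []


def instrument(func):
    """Return a wrapper that records entry/exit events around `func`."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        trace.append((time.perf_counter_ns(), "enter", func.__name__))
        try:
            return func(*args, **kwargs)  # call the original function
        finally:
            trace.append((time.perf_counter_ns(), "exit", func.__name__))
    return wrapper


def send(dest, payload):
    """Stand-in for a library function such as a message send."""
    return len(payload)


# Install the wrapper in place of the original, as a preloaded library
# would do for a symbol exported by a shared library.
send = instrument(send)
```

Callers keep using `send` unchanged; every call now leaves two events in `trace`.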

9 Binary instrumentation
- Modify the entry of key functions in the binary
- Lightweight instrumentation
- Little architecture-specific code (~100s of lines of code)
  - For now: x86_64 + ARMv7

10 LiTL: Lightweight Trace Library
- Events recorded in a binary format (timestamp, event_code, arg1, arg2, ...)
- One buffer per thread
- Flush buffers at the end of the application
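The record layout and buffering strategy listed on this slide can be sketched as follows. This is a minimal illustration, not LiTL's actual on-disk format: the field widths (`<QIB` header followed by 64-bit arguments), the `ThreadBuffer` class, and the `decode` helper are assumptions made for the example.

```python
import struct
import time

# Hypothetical record layout (NOT LiTL's real format): 8-byte timestamp,
# 4-byte event code, 1-byte argument count, then one 8-byte value per argument.
HEADER = struct.Struct("<QIB")


class ThreadBuffer:
    """In-memory event buffer owned by a single thread."""

    def __init__(self):
        self.data = bytearray()

    def record(self, event_code, *args, timestamp=None):
        # Append one binary record; nothing is written to disk here.
        if timestamp is None:
            timestamp = time.perf_counter_ns()
        self.data += HEADER.pack(timestamp, event_code, len(args))
        self.data += struct.pack(f"<{len(args)}Q", *args)

    def flush(self, path):
        # Meant to be called once, at the end of the application.
        with open(path, "wb") as f:
            f.write(self.data)


def decode(raw):
    """Yield (timestamp, event_code, args) tuples from a raw buffer."""
    offset = 0
    while offset < len(raw):
        timestamp, code, nargs = HEADER.unpack_from(raw, offset)
        offset += HEADER.size
        args = struct.unpack_from(f"<{nargs}Q", raw, offset)
        offset += 8 * nargs
        yield timestamp, code, args
```

Keeping one buffer per thread means `record` needs no lock, and deferring `flush` to the end keeps file I/O out of the measured run.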

11 Performance
- Cluster Edel
  - 2 x 4 cores per node
  - Infiniband
- NAS Parallel Benchmarks
  - Class=B, NProcs=64
- Overhead: ~100 ns per event
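To get a feel for what ~100 ns per event means in practice, the following back-of-the-envelope helper estimates the total time spent recording events. It assumes a constant cost per event, which is a simplification; the function name is illustrative.

```python
def tracing_overhead_seconds(n_events, cost_per_event_ns=100):
    """Estimate total recording time, assuming a constant cost per event."""
    return n_events * cost_per_event_ns / 1e9
```

Under this assumption, a process that records one million events spends about 0.1 s recording them.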

12 Trace analysis

13 Trace analysis
Post-mortem analysis
- Read the LiTL trace files
- Interpret events
Several possible outputs
- Generate a viewable trace file (Pajé / OTF)
- Extract statistics
- In-browser trace analysis*
* under development

14 Generating viewable trace files
eztrace_convert generates a trace file
- Pajé / OTF file formats
- Viewable with tools like Vampir or ViTE

15 Extracting statistics
eztrace_stats prints various statistics
- Time spent on locks
- Memory consumption
- MPI messages
- Duration of OpenMP parallel regions
- ...
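The kind of aggregation involved can be sketched as below. This is not how `eztrace_stats` is implemented: the event tuples and the event names `LOCK_ENTER`, `LOCK_EXIT`, and `MPI_SEND` are hypothetical stand-ins for decoded trace events.

```python
from collections import defaultdict


def compute_stats(events):
    """Return (ns spent in locks per thread, number of sends, bytes sent).

    `events` is a list of hypothetical decoded events:
    (timestamp_ns, thread_id, name, message_length_or_None).
    """
    lock_time = defaultdict(int)
    lock_enter = {}  # thread_id -> timestamp of the pending LOCK_ENTER
    n_sends = 0
    bytes_sent = 0
    for ts, tid, name, length in sorted(events, key=lambda e: e[0]):
        if name == "LOCK_ENTER":
            lock_enter[tid] = ts
        elif name == "LOCK_EXIT" and tid in lock_enter:
            lock_time[tid] += ts - lock_enter.pop(tid)
        elif name == "MPI_SEND":
            n_sends += 1
            bytes_sent += length
    return dict(lock_time), n_sends, bytes_sent
```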

16 In-browser analysis*
Web-based trace analysis
- Communication matrix
- Gantt chart
* under development
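A communication matrix simply accumulates, for each (sender, receiver) pair, how much data was exchanged. A minimal sketch, assuming messages have already been extracted from the trace as hypothetical `(src, dest, length)` tuples:

```python
def communication_matrix(messages, n_ranks):
    """Return matrix[src][dest] = total bytes sent from rank src to rank dest."""
    matrix = [[0] * n_ranks for _ in range(n_ranks)]
    for src, dest, length in messages:
        matrix[src][dest] += length
    return matrix
```

A viewer can then render the matrix as a heat map to show which pairs of ranks communicate most.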

17 Conclusion
EZTrace 1.0 is available
- Open source
- Contributions are welcome!
- Extensible
Future work / upcoming features
- In-browser analysis
- Trace analysis in parallel
- Detection of patterns / anomalies

18 Questions?
François Trahay
francois.trahay@telecom-sudparis.eu

19 Bonus: Pattern detection in EZTrace
Visualizing a large trace is difficult
- Millions of events
Detect patterns in a trace
- Application phases that repeat

    100 x {
        MPI_SEND (src=0 dest=1 len=16 tag=0)
        MPI_RECV (src=1 dest=0 len=16 tag=0)
    }
    MPI_Barrier
    10000 x {
        MPI_SEND (src=0 dest=1 len=16 tag=0)
        MPI_RECV (src=1 dest=0 len=16 tag=0)
    }
    MPI_Barrier

[Figure: NPB CG class A, 16 MPI processes]
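One simple way to obtain a folded view like `100 x { MPI_SEND MPI_RECV }` is to look for blocks of events that repeat back to back. The sketch below is a naive greedy version of that idea and is not EZTrace's actual detection algorithm; `fold_repeats` and its `max_len` parameter are hypothetical.

```python
def fold_repeats(events, max_len=8):
    """Greedily fold immediately repeated blocks into (count, block) pairs."""
    out = []
    i = 0
    n = len(events)
    while i < n:
        best_len, best_count = 1, 1
        for length in range(1, max_len + 1):
            block = events[i:i + length]
            if len(block) < length:
                break  # not enough events left for a block of this size
            count = 1
            while events[i + count * length:i + (count + 1) * length] == block:
                count += 1
            # Prefer the candidate covering the most events; on a tie,
            # the shorter block found first is kept.
            if count > 1 and count * length > best_len * best_count:
                best_len, best_count = length, count
        out.append((best_count, tuple(events[i:i + best_len])))
        i += best_len * best_count
    return out
```

Applied to a sequence of event names, each `(count, block)` pair reads as `count x { block }`. Only event names are compared here; a real detector would also have to decide whether arguments such as `src`, `dest`, and `len` must match.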

20 Bonus: Detecting anomalies
Select representative occurrences
- Instead of examining 1000 occurrences
- Select 1 occurrence per class
#327 #549 #
HP2 seminar, August 2014
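The slides do not say how occurrences are grouped into classes. As one possible illustration, the sketch below assumes that occurrences of a pattern are classified by duration: an occurrence joins the first class whose reference duration is within a relative `tolerance`, otherwise it starts a new class. Both the criterion and the names are assumptions.

```python
def representatives(durations, tolerance=0.25):
    """Return one (occurrence index, class size) pair per duration class."""
    classes = []  # list of (reference_duration, [occurrence indices])
    for idx, duration in enumerate(durations):
        for ref, members in classes:
            if abs(duration - ref) <= tolerance * ref:
                members.append(idx)
                break
        else:
            # No existing class is close enough: start a new one.
            classes.append((duration, [idx]))
    # The first occurrence seen in each class serves as its representative.
    return [(members[0], len(members)) for _, members in classes]
```

With this grouping, a user would inspect one occurrence per class; small classes are natural candidates for anomalies.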
