Altix Usage and Application Programming


1 Center for Information Services and High Performance Computing (ZIH) Altix Usage and Application Programming Discussion And Important Information For Users Zellescher Weg 12 Willers-Bau A113 Tel Matthias S. Mueller (matthias.mueller@tu-dresden.de)

2 Outline Timeline Support and Collaboration for Computational Science on HPC Access to the Systems and Current Configuration First Experiences Some final remarks

3 Timeline [Gantt chart spanning Jul through Sep of the following year]: Machine Room Upgrade, Installation Stage 1a (test operation), Installation Stage 1b, Installation Stage 2.

4 Overall Infrastructure - Details

5 Performance of computers at ZIH [log-scale chart, 100 Mflop/s to 1 Pflop/s, over time]: T3E (59.7 GF/s), Origin 2800 (Rapunzel), Origin 3800 (Romulus, Remus), Altix 3700 (merkur, venus), and Altix + PC Farm, plotted against the TOP500 SUM, N=1, and N=500 trend lines.

6 Evolution of a parallel application [workflow diagram]: parallelization, correctness checking (on the debug server), performance tuning, postprocessing.

7 HPC Consulting [decision diagram]: starting from a serial program, which programming model (MPI or OpenMP)? Then, which platform?

8 Parallel Debugging - DDT [annotated screenshot]: MPI groups; file browser and source pane; output, breakpoints, and watch pane; thread, stack, and local/global variables pane; evaluation window.

9 Vampir Performance Analysis of Applications

10 Vampir Next Generation [architecture diagram]: trace files 1 to N are processed by a parallel server (one master, workers 1 to m) and viewed with the tools. Components: 1. trace generator, 2. Vampir viewer and analyzer, 3. VNG viewer, 4. parallel VNG analysis engine, 5. conversion and analysis tools.

11 Visualization of experimental data of a low-speed axial compressor: flow field and compressor geometry; animation to show the time evolution.

12 Third Party Applications Zellescher Weg 12 Willers-Bau A113 Tel Matthias S. Mueller (matthias.mueller@tu-dresden.de)

13 Third Party Applications [flattened availability table; columns: Name, O2K, O3K, Altix, Cluster]: LS-Dyna (MPI, installed), CPMD (installed), Maple (installed), Mathematica (installed), Matlab (installed), Abaqus (installed), Ansys (installed), Marc (installed), Nastran/Patran (installed), Fluent (installed), AMBER (SMP), Gaussian03; unresolved entries (???) are with maletti.

14 Numerical Libraries [availability table; columns: Name, O2K, O3K, Altix, Cluster]: IMSL (installed; MPI version), NAG (installed; MPI version), BLAS (installed; ?), Lapack (installed; ?), ScaLapack (??).

15 Current Configuration Zellescher Weg 12 Willers-Bau A113 Tel Matthias S. Mueller (matthias.mueller@tu-dresden.de)

16 General configuration: Currently the system is split into two partitions: Merkur with 64 CPUs and Venus with 128 CPUs. Merkur is the login partition. The debugger DDT is currently available only on Merkur. Merkur has slower MPI communication and no one-sided communication because the xpmem module has been removed. Cross-partition MPI jobs are currently not possible.

17 LSF queues:
Queue         CPU count   Time limit
Interactive               h
Small         1-8         8 h
Intermediate              h
Large                     h
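
Jobs are submitted through LSF. A minimal submission sketch, assuming the queue names from the table above and a generic MPI binary (these are standard LSF options; the exact site syntax may differ):

    # 8-CPU MPI job in the small queue, 8 h runtime limit
    bsub -q small -n 8 -W 8:00 mpirun -np 8 ./my_app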

18 Access Zellescher Weg 12 Willers-Bau A113 Tel Matthias S. Mueller (matthias.mueller@tu-dresden.de)

19 Access - Technical: The only available method of access is SSH. Hostname: merkur.hrsk.tu-dresden.de
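
For example (username is a placeholder for your HRSK login):

    ssh username@merkur.hrsk.tu-dresden.de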

20 Access - administrative: Access to the machine is granted by an external committee after evaluation. Proposals can be submitted online. Initially, access will be granted immediately after proposal submission. Test operation ("user-friendly mode") during December; production starts in January 2006.

21 Electronic Proposal Submission (I)

22 Electronic Proposal Submission (II)

23 First Experiences on Altix Zellescher Weg 12 Willers-Bau A113 Tel Matthias S. Mueller (matthias.mueller@tu-dresden.de)

24 Stress tests: Memory: >18 tests, >68000 different patterns, >500 TB memory throughput, ~20 h test time. MPI: >28 tests, >14000 different patterns, >100 TB message throughput, ~24 h test time. Disk: >260 tests, >11400 files, 8.5 h, 157 TB disk throughput.

25 MPI latency [measurement chart]

26 MPI bandwidth [measurement chart]
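
Both curves come from ping-pong measurements between pairs of processes (cf. slide 33). A minimal sketch of such a benchmark in C, with message size and repetition count chosen arbitrarily; run it with at least two MPI processes:

    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        const int size = 1 << 20;   /* 1 MiB message (arbitrary) */
        const int reps = 100;
        char *buf = malloc(size);

        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        for (int i = 0; i < reps; i++) {
            if (rank == 0) {        /* rank 0 sends, waits for the echo */
                MPI_Send(buf, size, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, size, MPI_BYTE, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            } else if (rank == 1) { /* rank 1 echoes the message back */
                MPI_Recv(buf, size, MPI_BYTE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                MPI_Send(buf, size, MPI_BYTE, 0, 0, MPI_COMM_WORLD);
            }
        }
        double t = MPI_Wtime() - t0;
        if (rank == 0)
            printf("half round trip: %.2f us, bandwidth: %.2f MB/s\n",
                   t / reps / 2.0 * 1e6, 2.0 * size * reps / t / 1e6);

        free(buf);
        MPI_Finalize();
        return 0;
    }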

27 I/O performance during acceptance [bar chart; y-axis 0 to 3, presumably GB/s]:
        Accept   Removed Disk   Rebuild
Read    2.89     2.73           2.73
Write   2.79     2.76           2.63

28 Scalability of the /fastfs file system [chart: bandwidth (GB/s) vs. number of CPUs; I/O benchmark, 3928 MB per CPU, 8 chunks]: read (venus) 1.67 GB/s max., read (merkur) 1.73 GB/s max., write (venus) 1.51 GB/s max., write (merkur) 1.18 GB/s max.
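
The benchmark writes a fixed volume per process (3928 MB in 8 chunks on the slide). A single-process sketch of the same idea in C; the chunk size and the /fastfs file name are placeholders, not the benchmark actually used:

    #define _POSIX_C_SOURCE 199309L
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    int main(void) {
        const size_t chunk = 491UL << 20;         /* 491 MiB per chunk (placeholder) */
        const int nchunks = 8;                    /* 8 chunks = 3928 MiB, as on the slide */
        const char *path = "/fastfs/iotest.tmp";  /* placeholder file name */
        char *buf = calloc(1, chunk);

        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        FILE *f = fopen(path, "w");
        if (!f) { perror("fopen"); return 1; }
        for (int i = 0; i < nchunks; i++)
            fwrite(buf, 1, chunk, f);
        fclose(f);                                /* keep the final flush inside the timed region */
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double s = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
        printf("write bandwidth: %.2f MB/s\n", chunk * (double)nchunks / 1e6 / s);
        remove(path);
        free(buf);
        return 0;
    }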

29 Code tuning: runtime for different compiler flags [chart: time in seconds per flag combination]
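
The slide does not record which flag combinations were measured; typical candidates with the Intel compilers on the Altix would be (illustrative only):

    ifort -O2 app.f90                   # baseline optimization
    ifort -O3 app.f90                   # aggressive loop optimization
    ifort -O3 -ipo app.f90              # plus interprocedural optimization
    ifort -O3 -ipo -prof-use app.f90    # plus profile feedback (after a -prof-gen training run)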

30 Short Comparison Origin - Altix Zellescher Weg 12 Willers-Bau A113 Tel Matthias S. Mueller (matthias.mueller@tu-dresden.de)

31 Matrix multiplication (numerical.matmul.f, double precision, jki loop order) [chart: GFLOPS vs. matrix size, Intel Itanium 2 vs. MIPS R12000]

32 DGEMM [charts: GFLOPS vs. matrix size]: numerical.matmul.c with SCSL (double precision) on 1, 2, 4, 8, 16, and 32 threads, and numerical.matmul.c with Intel MKL using auto-parallelism (OpenMP) on 2, 4, 8, 16, and 32 CPUs.
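
Such a benchmark reduces to timing a single BLAS call. A minimal sketch in C using the CBLAS interface (link against MKL or SCSL as appropriate; matrix size chosen arbitrarily):

    #include <stdio.h>
    #include <stdlib.h>
    #include <cblas.h>   /* MKL ships the same interface in mkl_cblas.h */

    int main(void) {
        const int n = 1000;
        double *a = malloc(n * n * sizeof(double));
        double *b = malloc(n * n * sizeof(double));
        double *c = malloc(n * n * sizeof(double));
        for (int i = 0; i < n * n; i++) { a[i] = 1.0; b[i] = 2.0; c[i] = 0.0; }

        /* C = 1.0 * A * B + 0.0 * C, all matrices n x n, row-major */
        cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                    n, n, n, 1.0, a, n, b, n, 0.0, c, n);

        printf("c[0] = %f (expect %f)\n", c[0], 2.0 * n);
        free(a); free(b); free(c);
        return 0;
    }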

33 MPI bandwidth [chart: bandwidth (GiB/s) vs. message size (MiB); ping-pong with 8 pairs, Altix vs. O3K]

34 MPI latencies [chart: latency (us) per pair, Altix vs. O3K]

35 Single CPU Results for CFD kernels

36 Single CPU Results for CFD kernels

37 Performance of Lautrec: O3K vs. Altix [chart: relative speed vs. number of CPUs for runs O3K-00/Altix-00 through O3K-04/Altix-04]

38 Performance Ratio Altix3700/Origin3800 (preliminary)

39 Your results may be different. Feedback is very welcome.

40 ZIH Application Performance Competition: Prizes are awarded for the best performance ratio between the SGI Origin 3800 and the SGI Altix 3700. Two categories: single-CPU performance and 32-CPU performance. Criteria: a real application; performance demonstrated with a Vampir tracefile; cheating is not allowed!! Deadline: Winners will be selected by the ZIH award committee. ZIH staff is not eligible.
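
To produce the required tracefile, the usual route is to rebuild the application with tracing support; a sketch assuming VampirTrace-style compiler wrappers are available (names and mechanisms differ between tool versions, so check the locally installed tooling):

    vtcc -o myapp myapp.c     # build with trace instrumentation (assumed wrapper)
    mpirun -np 32 ./myapp     # the run writes a trace file that Vampir can load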

41 ZIH Application Performance Competition: Prizes: a good bottle of wine and one ZIH shirt for each category. Good luck!!!!
