Technology evaluation at CSCS including BeeGFS parallel filesystem. Hussein N. Harake CSCS-ETHZ

1 Technology evaluation at CSCS including BeeGFS parallel filesystem Hussein N. Harake CSCS-ETHZ

2 Agenda
- CSCS
- About the Systems Integration (SI) Unit
- Technology Overview: DDN IME, DDN WOS, OpenStack
- BeeGFS Case Study: What is BeeGFS?, Test System Layout, Tuning, Monitoring, Benchmark tools, Results, Next Steps
- Monitoring and Profiling
- Q&A

3 CSCS (Swiss National Supercomputing Centre)
- Founded in 1991
- Enables world-class research with a scientific user lab
- Available to domestic and international researchers through a transparent, peer-reviewed allocation process
- Open to academia, and also available to users from industry and the business sector
- Operated by ETH Zurich and located in Lugano

4 24 years of supercomputers at CSCS
- 1991: NEC SX3, 5.5 GF, Adula
- 1996: NEC SX4, 10 GF, Gottardo
- 1999: NEC SX5, 64 GF, Prometeo
- 2002: IBM SP4, 1.3 TF, Venus
- 2005: Cray XT3, 5.8 TF, Palu
- 2006: IBM P5, 4.5 TF, Blanc
- Cray XE6, 402 TF, Monte Rosa
- Cray XC, PF, Piz Daint
- 2014: XC, PF, Piz Daint extension

5 Data Centre
- sq.m Machine Room
- 20 MW of power and cooling capacity
- Lake water cooling, Liters/s

6 Overview of Systems Integration (SI) Unit
Unit missions:
- Managing projects
- Relations with vendors
- Evaluating technologies
- Software deployments

7 Greina Cluster

8 Technology Overview: DDN IME (image courtesy of DDN)

9 Technology Overview: DDN WOS (1) (image courtesy of DDN)

10 Technology Overview: DDN WOS (2)

11 Technology Overview: DDN WOS (3)

12 Technology Overview - OpenStack Image source: CSCS

13 Eidos Layout

14 BeeGFS Case Study

15 What is BeeGFS?
- Parallel filesystem, HPC oriented
- Formerly called FhGFS
- Alternative to Lustre and GPFS
- Developed by Fraunhofer
- Open-source
- Support delivered by ThinkParQ
(Image courtesy of BeeGFS)

16 Basic Features of BeeGFS
- Supports failover for data and metadata using applications such as Pacemaker and Heartbeat
- Replication failover mechanism
- Supports multiple data and metadata on both servers and targets
- Supports quota
- Uses Robinhood to scan the entire filesystem
- BeeGFS on-demand filesystem (BeeOND)
- Easy to deploy and manage
- Supports x86 and OpenPOWER platforms

17 Easy to deploy

18 BeeOND
- Creates a filesystem on demand
- Uses the hard drives / SSDs on every compute node
- The filesystem is created by submitting a job to the scheduler; we are working on confirming SLURM support
- Memory could be used instead of SSDs
- We used 20 SSDs on 20 nodes for our tests
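The job-driven creation described on this slide could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the procedure used at CSCS: the helper names `write_nodefile` and `beeond_start_cmd`, the host names, and the paths `/data/beeond` and `/mnt/beeond` are all hypothetical. The script only prints the `beeond start` command line (with its `-n` node file, `-d` storage directory, and `-c` client mount point options) rather than running it.

```shell
#!/bin/sh
# Sketch only: helper names, host names, and paths are illustrative assumptions.

# Write one host name per line into the node file given as first argument.
write_nodefile() {
    nf_path="$1"
    shift
    : > "$nf_path"
    for nf_host in "$@"; do
        echo "$nf_host" >> "$nf_path"
    done
}

# Print (do not run) the beeond command for a node file, a per-node
# storage directory, and a client mount point.
beeond_start_cmd() {
    echo "beeond start -n $1 -d $2 -c $3"
}

# Inside a SLURM job the host list would typically come from:
#   scontrol show hostnames "$SLURM_JOB_NODELIST"
# Hypothetical host names stand in for it here.
NODEFILE="${TMPDIR:-/tmp}/beeond-nodefile.$$"
write_nodefile "$NODEFILE" node01 node02 node03
beeond_start_cmd "$NODEFILE" /data/beeond /mnt/beeond
rm -f "$NODEFILE"
```

In a real job script the printed command would be executed in the prolog and a matching `beeond stop` issued in the epilog.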

19 Benefits of BeeOND
- Makes use of otherwise unused space
- No impact on the parallel filesystem
- Real utilization of the high-speed network
- Filesystem scales with the compute nodes
- Open point: what is the overhead on the compute nodes?

20 Test System Layout
- One couplet (two controllers)
- 4 * FDR links
- Two x86 servers
- One enclosure, 60 drives
- DDN SSDs, one RAID volume
- 6 * 9 RAID 5 volumes
- 2 * FDR links
- Dual-socket SB, 128 GB memory
- Fabric
- 1 * FDR links

21 Tuning the servers

    echo 5 > /proc/sys/vm/dirty_background_ratio
    echo 20 > /proc/sys/vm/dirty_ratio
    echo 50 > /proc/sys/vm/vfs_cache_pressure
    echo > /proc/sys/vm/min_free_kbytes    # value missing
    echo always > /sys/kernel/mm/transparent_hugepage/enabled
    echo always > /sys/kernel/mm/transparent_hugepage/defrag
    for dev in dm-0 dm-1 dm-2 dm-3 dm-4 dm-5 dm-6
    do
        echo deadline > /sys/block/$dev/queue/scheduler
        echo 4096 > /sys/block/$dev/queue/nr_requests
        echo > /sys/block/$dev/queue/read_ahead_kb    # value missing
        echo > /sys/block/$dev/queue/max_sectors_kb   # value missing
    done
    echo performance | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
    echo 1 > /proc/sys/vm/zone_reclaim_mode

Documentation for the tuned parameters:
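When trying such settings on a different machine, it can help to wrap each write in a small guard. The sketch below is one possible approach, not part of the original deck: the helper `set_knob` and the `DRY_RUN` variable are hypothetical names introduced here. It skips files that do not exist on the host and, by default, only prints what it would do. Only knobs whose values survive on the slide are included.

```shell
#!/bin/sh
# Sketch: write a tuning value only if the target file exists.
# set_knob and DRY_RUN are illustrative names, not from the slides.
# DRY_RUN=1 (the default here) prints the action instead of performing it.
DRY_RUN="${DRY_RUN:-1}"

set_knob() {
    knob_value="$1"
    knob_path="$2"
    if [ ! -e "$knob_path" ]; then
        echo "skip $knob_path (not present)"
        return 0
    fi
    if [ "$DRY_RUN" = "1" ]; then
        echo "would write $knob_value to $knob_path"
    elif echo "$knob_value" > "$knob_path"; then
        echo "wrote $knob_value to $knob_path"
    else
        echo "failed to write $knob_path"
        return 1
    fi
}

# Values below are the ones that survive on the slide.
set_knob 5  /proc/sys/vm/dirty_background_ratio
set_knob 20 /proc/sys/vm/dirty_ratio
set_knob 50 /proc/sys/vm/vfs_cache_pressure

for dev in dm-0 dm-1 dm-2 dm-3 dm-4 dm-5 dm-6; do
    set_knob deadline "/sys/block/$dev/queue/scheduler"
    set_knob 4096     "/sys/block/$dev/queue/nr_requests"
done

# If the glob matches nothing, the literal pattern is passed and skipped.
for gov in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
    set_knob performance "$gov"
done
```

Running the script as is changes nothing; setting `DRY_RUN=0` as root would apply the values.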

22 Monitoring client activity (1)

23 Monitoring server activity (2)

24 Benchmark tools
- Mdtest: measures metadata performance
- IOzone: measures read and write throughput

25 IOzone results on /beegfs
Throughput tests with 64 processes: initial writers, rewriters, readers, and re-readers. For each test, IOzone reports the throughput seen by the children, the min, max, and avg throughput per process (kB/sec), and the min xfer (kB). (Numeric values not preserved.)
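Output in this form can be condensed to one line per test with a short filter. The following is a sketch under assumptions: the function name `iozone_summary` is hypothetical, the line layout is taken from the labels shown on this slide, and the numbers in the demonstration input are invented placeholders, not results from the talk.

```shell
#!/bin/sh
# Sketch: keep only the "Children see throughput" lines and print
# "<test name>: <value> kB/sec". iozone_summary is an illustrative name.
iozone_summary() {
    awk -F'=' '/Children see throughput for/ {
        name = $1
        # Drop the fixed prefix and the process count.
        sub(/^.*Children see throughput for +[0-9]+ +/, "", name)
        # Drop trailing blanks and tabs before the "=" sign.
        gsub(/[ \t]+$/, "", name)
        # First whitespace-separated token after "=" is the number.
        split($2, parts, " ")
        printf "%s: %s kB/sec\n", name, parts[1]
    }'
}

# Demonstration with placeholder numbers (NOT measured values).
{
    printf '\tChildren see throughput for 64 initial writers \t= 1000.00 kB/sec\n'
    printf '\tMin throughput per process \t= 10.00 kB/sec\n'
    printf '\tChildren see throughput for 64 readers \t= 2000.00 kB/sec\n'
} | iozone_summary
```

In practice the function would be fed a saved IOzone log, for example `iozone_summary < run.log`, where `run.log` is a hypothetical file name.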

26 Mdtest results on BeeOND: plots of directory creation, directory stat, and directory removal (directories per second vs. number of MDSs)

27 Mdtest results on BeeOND: plots of file creation, file stat, and file removal (files per second vs. number of MDSs)

28 Next steps
- Scale on a bigger cluster
- Verify the failover procedures
- Verify the BeeOND overhead on compute nodes
- Use NVMe instead of SSDs
- Use tmpfs
- Create BeeOND through SLURM jobs
- Use Robinhood to scan millions of files

29 Check-MK Monitoring and Profiling

30 CPU Utilization

31 Q&A
