Carlos Reaño, Universitat Politècnica de València (Spain). HPC Advisory Council Switzerland Conference, April 3, Lugano (Switzerland)
2 What is rcuda? Installing and using rcuda. rcuda over HPC networks: InfiniBand. How to benefit from rcuda: sample scenarios. Questions & Answers.
3 What is rcuda? Installing and using rcuda. rcuda over HPC networks: InfiniBand. How to benefit from rcuda: sample scenarios. Questions & Answers.
4 Current HPC clusters are heterogeneous platforms (CPUs + GPUs). GPUs reduce the execution time of parallel applications, but they also increase costs (acquisition, maintenance, space...), raise energy consumption, and often show a low utilization rate. Solution: remote GPU virtualization, with physical GPUs installed only in some nodes of the cluster and transparently shared among all the nodes. The price to pay: virtualization framework overhead and network overhead.
5 GPGPU means OpenCL or CUDA. General architecture of remote GPU solutions. Existing solutions: GVirtuS, DS-CUDA, rcuda... rcuda (remote CUDA) is freely available and supports CUDA 5.5.
6 rcuda communication architecture.
7 rcuda communication architecture: the communication layer supports several networks (Eth, IB, ...).
8 CUDA vs. rcuda (remote CUDA): with CUDA, an application must run in the node that hosts the GPU (Node 1); with rcuda, an application in Node 2 can use the GPU of Node 1 across the network. With rcuda, Node 2 can use Node 1's GPU!
9 What is rcuda? Installing and using rcuda. rcuda over HPC networks: InfiniBand. How to benefit from rcuda: sample scenarios. Questions & Answers.
10 Where to obtain rcuda? Fill in the software request form. Package contents, with two important folders: doc, holding the rcuda user guide, and framework, holding the rcuda server and library (rCUDAd, the rcuda server daemon, and rCUDAl, the rcuda library). Installing rcuda: just untar the tarball on both the server and the client node(s).
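The install step above can be sketched as a shell snippet; the tarball filename below is an assumed placeholder (the actual name depends on the requested package), and the same unpacking is repeated on the server and on every client node:

```shell
# Hypothetical install sketch: unpack the rcuda tarball into $HOME.
# The tarball name is an assumption, not taken from the slides.
TARBALL=rCUDA.tgz
if [ -f "$TARBALL" ]; then
    tar xzf "$TARBALL" -C "$HOME"   # leaves $HOME/rCUDA/doc and $HOME/rCUDA/framework
fi
ls "$HOME/rCUDA" 2>/dev/null || echo "rCUDA package not unpacked on this node"
```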
11 Starting the rcuda server. Set the environment variables as if you were going to run a CUDA program (path to the CUDA binaries and path to the CUDA libraries):
export PATH=$PATH:/usr/local/cuda/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64
Then start the rcuda server; rCUDAd runs as a daemon in the background:
cd $HOME/rCUDA/framework/rCUDAd
./rCUDAd
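The server-side steps above fold naturally into one launcher script. This is a sketch under the assumptions from the slides (CUDA in /usr/local/cuda, the rcuda tarball untarred in $HOME/rCUDA); it degrades gracefully on nodes where rCUDAd is not installed:

```shell
# start-rcudad sketch: environment setup plus daemon launch, as on the slides.
CUDA_HOME=${CUDA_HOME:-/usr/local/cuda}       # assumed CUDA install location
RCUDA_HOME=${RCUDA_HOME:-$HOME/rCUDA}         # assumed rcuda unpack location
export PATH=$PATH:$CUDA_HOME/bin                           # CUDA binaries
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CUDA_HOME/lib64   # CUDA libraries
if [ -x "$RCUDA_HOME/framework/rCUDAd/rCUDAd" ]; then
    # rCUDAd daemonizes itself, so this returns immediately
    (cd "$RCUDA_HOME/framework/rCUDAd" && ./rCUDAd)
else
    echo "rCUDAd not found under $RCUDA_HOME/framework/rCUDAd" >&2
fi
```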
16 Running a CUDA program with rcuda. Set the environment variables as follows:
export PATH=$PATH:/usr/local/cuda/bin (path to the CUDA binaries)
export LD_LIBRARY_PATH=$HOME/rCUDA/framework/rCUDAl:$LD_LIBRARY_PATH (path to the rcuda library)
export RCUDA_DEVICE_COUNT=1 (number of remote GPUs: 1, 2, 3...)
export RCUDA_DEVICE_0=<server_name_or_ip_address>:0 (name/IP of the rcuda server, and which GPU of the remote server to use)
Compile the CUDA program using dynamic libraries; the --cudart=shared flag is very important!
cd $HOME/NVIDIA_CUDA_Samples/5.5/1_Utilities/deviceQuery
make EXTRA_NVCCFLAGS=--cudart=shared
Run the CUDA program as usual:
./deviceQuery
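As a sketch of how the RCUDA_DEVICE_* variables compose when more than one remote GPU is used (the server names gpunode1 and gpunode2 are made-up examples, not from the slides):

```shell
# Client-side sketch: expose two remote GPUs to the application.
# Each RCUDA_DEVICE_i follows the <server>:<gpu_index> form shown on the slides.
export RCUDA_DEVICE_COUNT=2          # the application will see two CUDA devices
export RCUDA_DEVICE_0=gpunode1:0     # device 0 -> GPU 0 of (assumed) host gpunode1
export RCUDA_DEVICE_1=gpunode2:0     # device 1 -> GPU 0 of (assumed) host gpunode2
env | grep '^RCUDA_' | sort          # show the resulting rcuda configuration
```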
23 Live demonstration: deviceQuery using all the GPUs of the cluster, and bandwidthTest. Problem: bandwidth with rcuda is too low! Why? We are using TCP. Solution: HPC networks, i.e., InfiniBand (IB).
24 What is rcuda? Installing and using rcuda. rcuda over HPC networks: InfiniBand. How to benefit from rcuda: sample scenarios. Questions & Answers.
25 Is my IB network ready? Check that links are active and up in all nodes:
$ ibstatus
Infiniband device 'mlx4_0' port 1 status:
    default gid:  fe80:0000:0000:0000:0001:c700:0000:e403
    base lid:     0x1
    sm lid:       0x1
    state:        4: ACTIVE
    phys state:   5: LinkUp
    rate:         56 Gb/sec (4X FDR)
    link_layer:   InfiniBand
26 Is my IB network ready? Test links with, for instance, an IB bandwidth test.
1. Start the bandwidth server in Node1:
Node1-$ ib_write_bw
*********************************
* Waiting for client to connect *
*********************************
2. Start the bandwidth client in Node2:
Node2-$ ib_write_bw Node1
RDMA_Write BW Test
  Dual-port: OFF          Device: mlx4_0
  Number of qps: 1        Transport type: IB
  Connection type: RC     Using SRQ: OFF
  TX depth: 128           CQ Moderation: 100
  Mtu: 2048[B]            Link type: IB
  Max inline data: 0[B]   rdma_cm QPs: OFF
  Data ex. method: Ethernet
  local address: ...      remote address: ...
  #bytes  #iterations  BW peak[MB/sec]  BW average[MB/sec]  MsgRate[Mpps]
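The two checks above can be wrapped in a small guarded helper; this sketch assumes the OFED ibstatus tool shown on the slide and degrades gracefully on nodes without InfiniBand:

```shell
# ib_ready: succeed only if ibstatus exists and reports at least one ACTIVE port.
ib_ready() {
    command -v ibstatus >/dev/null 2>&1 || return 1
    ibstatus 2>/dev/null | grep -q 'state:.*ACTIVE'
}
if ib_ready; then
    echo "IB link active: OK to run rcuda over IB"
else
    echo "no active IB link found (or ibstatus missing)" >&2
fi
```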
27 Starting the rcuda server using IB (tell rcuda we want to use IB):
export RCUDAPROTO=IB
cd $HOME/rCUDA/framework/rCUDAd
./rCUDAd
Run a CUDA program using rcuda over IB (RCUDAPROTO must be set in the client too!):
export RCUDAPROTO=IB
cd $HOME/NVIDIA_CUDA_Samples/5.5/1_Utilities/bandwidthTest
./bandwidthTest
Live demonstration: bandwidthTest using IB. Bandwidth is no longer a problem!
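A small helper sketch for picking the transport on both sides. RCUDA_NET is a hypothetical convenience variable introduced here for illustration; the real switch, per the slides, is RCUDAPROTO=IB, and when it is unset rcuda uses TCP:

```shell
# Select the rcuda transport. RCUDA_NET is a made-up knob for this sketch;
# per the slides, RCUDAPROTO=IB must be exported on BOTH client and server.
case "${RCUDA_NET:-IB}" in
    IB) export RCUDAPROTO=IB ;;
    *)  unset RCUDAPROTO ;;          # unset -> default TCP transport
esac
echo "rcuda transport: ${RCUDAPROTO:-TCP (default)}"
```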
30 What is rcuda? Installing and using rcuda. rcuda over HPC networks: InfiniBand. How to benefit from rcuda: sample scenarios. Questions & Answers.
31 Sample scenarios. Typical behavior of CUDA applications: move data to the GPU and perform a lot of computation there to compensate for the overhead of having moved the data. This benefits rcuda: more computation means relatively less rcuda overhead. Live demonstration. Scalable applications: more GPUs, less execution time. rcuda can use all the GPUs of the cluster, while CUDA can only use the ones directly attached to one node: for some applications, rcuda can therefore obtain better results than CUDA. Live demonstration.
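The compute-versus-transfer argument above can be made concrete with a back-of-the-envelope calculation; all numbers below are illustrative assumptions, not measurements from the talk:

```shell
# Rough model: rcuda's extra cost is roughly the time spent moving data over
# the network, so a high compute/transfer ratio means low relative overhead.
data_mb=1024       # assumed data moved to the remote GPU, in MB
net_mb_s=6000      # assumed effective FDR InfiniBand bandwidth, MB/s
compute_s=30       # assumed GPU compute time, seconds
overhead_pct=$(awk -v d="$data_mb" -v b="$net_mb_s" -v c="$compute_s" \
    'BEGIN { t = d / b; printf "%.1f", 100 * t / (c + t) }')
echo "relative rcuda overhead: ${overhead_pct}%"   # small when compute dominates
```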
32 What is rcuda? Installing and using rcuda. rcuda over HPC networks: InfiniBand. How to benefit from rcuda: sample scenarios. Questions & Answers.
33
34 rcuda Team: Antonio Peña (1), Carlos Reaño, Federico Silla, José Duato, Adrián Castelló, Sergio Iserte, Rafael Mayo, Enrique S. Quintana-Ortí. (1) Now at Argonne National Lab. (USA)
More informationLab MIC Offload Experiments 7/22/13 MIC Advanced Experiments TACC
Lab MIC Offload Experiments 7/22/13 MIC Advanced Experiments TACC # pg. Subject Purpose directory 1 3 5 Offload, Begin (C) (F90) Compile and Run (CPU, MIC, Offload) offload_hello 2 7 Offload, Data Optimize
More informationShadowfax: Scaling in Heterogeneous Cluster Systems via GPGPU Assemblies
Shadowfax: Scaling in Heterogeneous Cluster Systems via GPGPU Assemblies Alexander Merritt, Vishakha Gupta, Abhishek Verma, Ada Gavrilovska, Karsten Schwan {merritt.alex,abhishek.verma}@gatech.edu {vishakha,ada,schwan}@cc.gtaech.edu
More informationLustre File System on ARM
1 Lustre File System on ARM Architecture Evaluation v1.1 September 2017 Carlos Thomaz 2 Motivations ARM momentum 64bit evolution Recent debuts on HPC Traction in new areas such as Machine Learning and
More informationWorld s most advanced data center accelerator for PCIe-based servers
NVIDIA TESLA P100 GPU ACCELERATOR World s most advanced data center accelerator for PCIe-based servers HPC data centers need to support the ever-growing demands of scientists and researchers while staying
More informationVoltaire. Fast I/O for XEN using RDMA Technologies. The Grid Interconnect Company. April 2005 Yaron Haviv, Voltaire, CTO
Voltaire The Grid Interconnect Company Fast I/O for XEN using RDMA Technologies April 2005 Yaron Haviv, Voltaire, CTO yaronh@voltaire.com The Enterprise Grid Model and ization VMs need to interact efficiently
More informationLS-DYNA Performance Benchmark and Profiling. April 2015
LS-DYNA Performance Benchmark and Profiling April 2015 2 Note The following research was performed under the HPC Advisory Council activities Participating vendors: Intel, Dell, Mellanox Compute resource
More informationOceanStor 9000 InfiniBand Technical White Paper. Issue V1.01 Date HUAWEI TECHNOLOGIES CO., LTD.
OceanStor 9000 Issue V1.01 Date 2014-03-29 HUAWEI TECHNOLOGIES CO., LTD. Copyright Huawei Technologies Co., Ltd. 2014. All rights reserved. No part of this document may be reproduced or transmitted in
More informationInterconnect Your Future
Interconnect Your Future Gilad Shainer 2nd Annual MVAPICH User Group (MUG) Meeting, August 2014 Complete High-Performance Scalable Interconnect Infrastructure Comprehensive End-to-End Software Accelerators
More informationExploiting GPUDirect RDMA in Designing High Performance OpenSHMEM for NVIDIA GPU Clusters
2015 IEEE International Conference on Cluster Computing Exploiting GPUDirect RDMA in Designing High Performance OpenSHMEM for NVIDIA GPU Clusters Khaled Hamidouche, Akshay Venkatesh, Ammar Ahmad Awan,
More informationOncilla: A GAS Runtime for Efficient Resource Allocation and Data Movement in Accelerated Clusters
Oncilla: A GAS Runtime for Efficient Resource Allocation and Data Movement in Accelerated Clusters Jeff Young, Se Hoon Shon, Sudhakar Yalamanchili, Alex Merritt, Karsten Schwan School of Electrical and
More information2008 International ANSYS Conference
2008 International ANSYS Conference Maximizing Productivity With InfiniBand-Based Clusters Gilad Shainer Director of Technical Marketing Mellanox Technologies 2008 ANSYS, Inc. All rights reserved. 1 ANSYS,
More informationOPEN MPI WITH RDMA SUPPORT AND CUDA. Rolf vandevaart, NVIDIA
OPEN MPI WITH RDMA SUPPORT AND CUDA Rolf vandevaart, NVIDIA OVERVIEW What is CUDA-aware History of CUDA-aware support in Open MPI GPU Direct RDMA support Tuning parameters Application example Future work
More informationMellanox OFED for FreeBSD for ConnectX-4/ConnectX-4 Lx/ ConnectX-5 Release Note. Rev 3.5.0
Mellanox OFED for FreeBSD for ConnectX-4/ConnectX-4 Lx/ ConnectX-5 Release Note Rev 3.5.0 www.mellanox.com Mellanox Technologies NOTE: THIS HARDWARE, SOFTWARE OR TEST SUITE PRODUCT ( PRODUCT(S) ) AND ITS
More informationModern GPU Programming With CUDA and Thrust. Gilles Civario (ICHEC)
Modern GPU Programming With CUDA and Thrust Gilles Civario (ICHEC) gilles.civario@ichec.ie Let me introduce myself ID: Gilles Civario Gender: Male Age: 40 Affiliation: ICHEC Specialty: HPC Mission: CUDA
More informationSR-IOV Support for Virtualization on InfiniBand Clusters: Early Experience
SR-IOV Support for Virtualization on InfiniBand Clusters: Early Experience Jithin Jose, Mingzhe Li, Xiaoyi Lu, Krishna Kandalla, Mark Arnold and Dhabaleswar K. (DK) Panda Network-Based Computing Laboratory
More information