Carlos Reaño, Universitat Politècnica de València (Spain). HPC Advisory Council Switzerland Conference, April 3, Lugano (Switzerland)


1 Carlos Reaño, Universitat Politècnica de València (Spain). HPC Advisory Council Switzerland Conference, April 3, Lugano (Switzerland)

2 What is rCUDA? Installing and using rCUDA. rCUDA over HPC networks: InfiniBand. How to benefit from rCUDA: sample scenarios. Questions & Answers.

3 What is rCUDA? Installing and using rCUDA. rCUDA over HPC networks: InfiniBand. How to benefit from rCUDA: sample scenarios. Questions & Answers.

4 Current HPC clusters are heterogeneous platforms (CPUs + GPUs). GPUs reduce the execution time of parallel applications, but they increase costs (acquisition, maintenance, space...), raise energy consumption, and often show a low utilization rate. Solution: remote GPU virtualization. Physical GPUs are installed only in some nodes of the cluster and are transparently shared among all the nodes, at the cost of some virtualization framework overhead and network overhead.

5 GPGPU: OpenCL or CUDA. General architecture of remote GPU solutions. Existing solutions: GVirtuS, DS-CUDA, rCUDA... rCUDA (remote CUDA) is freely available and supports CUDA 5.5.

6 rCUDA communication architecture (diagram).

7 rCUDA communication architecture (diagram): Eth, IB, ...

8 CUDA: the GPU is directly attached to Node 1. rCUDA (remote CUDA): Node 2 reaches the GPU of Node 1 across the network. With rCUDA, Node 2 can use Node 1's GPU!

9 What is rCUDA? Installing and using rCUDA. rCUDA over HPC networks: InfiniBand. How to benefit from rCUDA: sample scenarios. Questions & Answers.

10 Where to obtain rCUDA? Through the Software Request Form. Package contents, two important folders: doc (the rCUDA user guide) and framework (the rCUDA server and library): rCUDAd is the rCUDA server daemon, rCUDAl is the rCUDA library. Installing rCUDA: just untar the tarball in both the server and the client node(s).
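As a rough sketch of that installation step (the tarball name and the client host name are assumptions; adapt them to the package received through the request form and to your cluster):
tar xzf rCUDA.tgz -C $HOME                        # on the GPU server node
scp rCUDA.tgz client-node:                        # copy the same tarball to each client node
ssh client-node 'tar xzf ~/rCUDA.tgz -C $HOME'    # untar it there as well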

11 Starting the rCUDA server. Set env. vars as if you were going to run a CUDA program:
export PATH=$PATH:/usr/local/cuda/bin                            # path to the CUDA binaries
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64    # path to the CUDA libraries
Start the rCUDA server (it runs in the background):
cd $HOME/rCUDA/framework/rCUDAd                                  # path to the rCUDA server
./rCUDAd
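The steps above can be collected into a small start script; a minimal sketch, assuming the default CUDA and rCUDA install paths used in this tutorial (the script name is hypothetical):
#!/bin/bash
# start_rcuda_server.sh: run on the node that owns the physical GPU(s)
export PATH=$PATH:/usr/local/cuda/bin                            # CUDA binaries
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64    # CUDA libraries
cd $HOME/rCUDA/framework/rCUDAd
./rCUDAd                                                         # the daemon keeps running in the background
pgrep -l rCUDAd                                                  # quick check that the daemon is up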

16 Running a CUDA program with rCUDA. Set env. vars as follows:
export PATH=$PATH:/usr/local/cuda/bin                                   # path to the CUDA binaries
export LD_LIBRARY_PATH=$HOME/rCUDA/framework/rCUDAl:$LD_LIBRARY_PATH    # path to the rCUDA library
export RCUDA_DEVICE_COUNT=1                                             # number of remote GPUs: 1, 2, 3...
export RCUDA_DEVICE_0=<server_name_or_ip_address>:0                     # name/IP of the rCUDA server, and GPU of the remote server to use
Compile the CUDA program using dynamic libraries (very important!):
cd $HOME/NVIDIA_CUDA_Samples/5.5/1_Utilities/deviceQuery
make EXTRA_NVCCFLAGS=--cudart=shared
Run the CUDA program as usual:
./deviceQuery
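Similarly, a minimal client-side sketch that glues these steps together (the server name gpu-server and the script name are assumptions):
#!/bin/bash
# run_devicequery_rcuda.sh: run on a client node, no local GPU needed
export PATH=$PATH:/usr/local/cuda/bin                                   # CUDA binaries (nvcc)
export LD_LIBRARY_PATH=$HOME/rCUDA/framework/rCUDAl:$LD_LIBRARY_PATH    # rCUDA library intercepts the CUDA calls
export RCUDA_DEVICE_COUNT=1                                             # one remote GPU
export RCUDA_DEVICE_0=gpu-server:0                                      # GPU 0 of the rCUDA server (hypothetical host name)
cd $HOME/NVIDIA_CUDA_Samples/5.5/1_Utilities/deviceQuery
make EXTRA_NVCCFLAGS=--cudart=shared                                    # link against the shared CUDA runtime
./deviceQuery                                                           # the remote GPU shows up as if it were local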

23 Live demonstration: deviceQuery using all the GPUs of the cluster; bandwidthTest. Problem: bandwidth with rCUDA is too low! Why? Because we are using TCP. Solution: HPC networks, InfiniBand (IB).

24 What is rCUDA? Installing and using rCUDA. rCUDA over HPC networks: InfiniBand. How to benefit from rCUDA: sample scenarios. Questions & Answers.

25 Is my IB network ready? Check that links are active and up in all nodes:
$> ibstatus
Infiniband device 'mlx4_0' port 1 status:
    default gid:     fe80:0000:0000:0000:0001:c700:0000:e403
    base lid:        0x1
    sm lid:          0x1
    state:           4: ACTIVE
    phys state:      5: LinkUp
    rate:            56 Gb/sec (4X FDR)
    link_layer:      InfiniBand
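A minimal sketch for checking every node of the cluster at once (the node names are assumptions):
for node in node1 node2 node3; do
    echo "== $node =="
    ssh $node "ibstatus | grep -E 'state|rate'"   # every node should report ACTIVE / LinkUp
done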

26 Is my IB network ready? Test links with, for instance, an IB bandwidth test:
1. Start the bandwidth server in Node1:
Node1-$ ib_write_bw
*********************************
* Waiting for client to connect *
*********************************
2. Start the bandwidth client in Node2:
Node2-$ ib_write_bw Node1
RDMA_Write BW Test
Dual-port: OFF    Device: mlx4_0    Number of qps: 1    Transport type: IB
Connection type: RC    Using SRQ: OFF    TX depth: 128    CQ Moderation: 100
Mtu: 2048[B]    Link type: IB    Max inline data: 0[B]    rdma_cm QPs: OFF
Data ex. method: Ethernet
local address: ...    remote address: ...
#bytes  #iterations  BW peak[MB/sec]  BW average[MB/sec]  MsgRate[Mpps]
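The same two-node test can be driven from a single shell; a minimal sketch, assuming the host names Node1 and Node2 used above:
ssh Node1 ib_write_bw &        # server side: waits for the client to connect
sleep 2                        # give the server a moment to start listening
ssh Node2 ib_write_bw Node1    # client side: connects to Node1 and prints the bandwidth table
wait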

27 Starting the rCUDA server using IB:
export RCUDAPROTO=IB                 # tell rCUDA we want to use IB
cd $HOME/rCUDA/framework/rCUDAd
./rCUDAd
Run the CUDA program using rCUDA over IB (RCUDAPROTO must be set in the client as well!):
export RCUDAPROTO=IB
cd $HOME/NVIDIA_CUDA_Samples/5.5/1_Utilities/bandwidthTest
./bandwidthTest
Live demonstration: bandwidthTest using IB. Bandwidth is no longer a problem!
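To see the effect of the transport, the same benchmark can be run with and without the IB setting; a rough sketch (assuming the rCUDA server is restarted with the matching RCUDAPROTO value in each case):
cd $HOME/NVIDIA_CUDA_Samples/5.5/1_Utilities/bandwidthTest
unset RCUDAPROTO               # default transport (TCP), as in the earlier demo
./bandwidthTest
export RCUDAPROTO=IB           # InfiniBand transport, as in this section
./bandwidthTest                # transfer bandwidth should now approach the IB link bandwidth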

30 What is rCUDA? Installing and using rCUDA. rCUDA over HPC networks: InfiniBand. How to benefit from rCUDA: sample scenarios. Questions & Answers.

31 Sample scenarios. Typical behavior of CUDA applications: data is moved to the GPU and a lot of computation is performed there, to compensate for the overhead of having moved the data. This benefits rCUDA: the more computation, the smaller the relative rCUDA overhead. Live demonstration. Scalable applications: more GPUs, less execution time. rCUDA can use all the GPUs of the cluster, while CUDA can only use the ones directly attached to a single node: for some applications, rCUDA can therefore obtain better results than CUDA. Live demonstration.
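A sketch of how a single client can aggregate GPUs from several nodes, following the RCUDA_DEVICE_* pattern shown earlier (the node names are assumptions, and RCUDA_DEVICE_1/RCUDA_DEVICE_2 extend the naming of RCUDA_DEVICE_0):
export RCUDA_DEVICE_COUNT=3      # the application will see 3 GPUs
export RCUDA_DEVICE_0=node1:0    # GPU 0 of node1 (hypothetical host name)
export RCUDA_DEVICE_1=node2:0    # GPU 0 of node2
export RCUDA_DEVICE_2=node3:0    # GPU 0 of node3
./deviceQuery                    # a multi-GPU application can now use all three remote GPUs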

32 What is rCUDA? Installing and using rCUDA. rCUDA over HPC networks: InfiniBand. How to benefit from rCUDA: sample scenarios. Questions & Answers.

33

34 rCUDA Team: Antonio Peña (1), Carlos Reaño, Federico Silla, José Duato, Adrián Castelló, Sergio Iserte, Rafael Mayo, Enrique S. Quintana-Ortí. (1) Now at Argonne National Lab. (USA)
