Fermi Cluster for Real-Time Hyperspectral Scene Generation

1 Fermi Cluster for Real-Time Hyperspectral Scene Generation
Gary McMillian, Ph.D.
Crossfield Technology LLC, 9390 Research Blvd, Suite I200, Austin, TX, (512) x151
AF SBIR Program, Donald Snyder III, Program Manager
Funding provided by Frank Carlen, Multi-Spectral Test
7/20/11

2 System Architecture & Approach
Scenes are generated by heterogeneous processors, then transported over InfiniBand to the projector(s) using the RDMA protocol for high throughput and low latency
Network interfaces aggregate data from multiple heterogeneous processors in high-speed frame buffers
Contents of the frame buffers are output to the projector through an FPGA Mezzanine Card (FMC) interface
IEEE 1588 Precision Time Protocol (PTP) provides global time synchronization
Heterogeneous processors and projector network interfaces scale independently

3 Scalable System Architecture
[Block diagram: processor nodes (CPU/GPU) connect through an InfiniBand switch to network interface adapters, which drive the projector/HWIL hardware over LVDS, DVI, and fiber links]

4 HWIL Simulation System
Link bandwidths annotated on the block diagram:
QuickPath Interconnect (QPI): ~100 Gbps
PCI Express x8: ~32 Gbps (x16: ~64 Gbps)
DDR3 SDRAM: ~85 Gbps/ch x 3 ch
GDDR5 SDRAM: ~192 Gbps/ch x 6 ch
QDR InfiniBand: ~32 Gbps
VITA 57.1 / FMC: ~100 Gbps SERDES, LVDS I/O
[Block diagram: a 1U-4U heterogeneous processor (CPUs with DDR3 SDRAM linked by QPI, GPUs with GDDR5 SDRAM behind PCIe bridges, SSD, and network adapters on PCIe x8) connects through the InfiniBand switch, alongside an IEEE 1588 PTP server on Ethernet, to the 1U Crossfield network interface (FPGA with DDR3 SDRAM and an FMC carrying a user-definable PHY and frame synch/request signals) driving the projector / HWIL]
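
For scale against these link rates, a back-of-envelope check helps: the sketch below computes the raw data rate of a hyperspectral frame stream and how many ~32 Gbps QDR InfiniBand links it would occupy. The resolution, band count, bit depth, and frame rate are illustrative assumptions, not figures from the presentation.

```c
/* Back-of-envelope link budget for streaming synthesized hyperspectral
 * frames. All projector parameters below are illustrative assumptions,
 * not values from the presentation. */
#include <stdio.h>

int main(void)
{
    const double width  = 1024.0;  /* pixels (assumed)              */
    const double height = 1024.0;  /* pixels (assumed)              */
    const double bands  = 64.0;    /* spectral bands (assumed)      */
    const double bits   = 16.0;    /* bits per sample (assumed)     */
    const double fps    = 100.0;   /* HWIL frame rate, Hz (assumed) */
    const double qdr    = 32.0;    /* usable QDR IB x4 rate, Gbps   */

    double bits_per_frame = width * height * bands * bits;
    double gbps = bits_per_frame * fps / 1e9;

    printf("Per-frame payload   : %.1f MB\n", bits_per_frame / 8.0 / 1e6);
    printf("Required throughput : %.1f Gbps\n", gbps);
    printf("QDR IB links needed : %.1f\n", gbps / qdr);
    return 0;
}
```

With those assumed parameters a single QDR link is not enough, which is consistent with the deck's use of multiple host channel adapters per node and aggregation at the network interface.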

5 REAL-TIME HIGH PERFORMANCE COMPUTER (HPC)

6 Real-Time HPC Requirements
Deterministic & synchronous: synthesized images complete and ready at the HWIL frame rate (a minimal frame loop is sketched below)
High floating-point performance: implement physics-based algorithms
High bandwidth: inter-processor communications for data exchange; stream high-resolution images to the projector at high frame rates
High memory capacity & performance: processor memory for code, model parameters, and data; non-volatile storage for code, model parameters, data, and logging
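
The "deterministic & synchronous" requirement boils down to finishing each synthesized frame before an absolute deadline tied to the HWIL frame clock. A minimal frame-loop sketch follows, assuming a 100 Hz frame rate (an assumption, not a figure from the slides) and a hypothetical render_frame() placeholder for the synthesis and transfer work:

```c
/* Deadline-driven frame loop: sleep to an absolute release time each
 * period so that jitter does not accumulate. render_frame() is a
 * hypothetical placeholder; 100 Hz is an assumed frame rate. */
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <time.h>

#define FRAME_PERIOD_NS 10000000L          /* 10 ms -> 100 Hz */

static void render_frame(long n) { (void)n; /* synthesize and ship frame n */ }

int main(void)
{
    struct timespec next;
    clock_gettime(CLOCK_MONOTONIC, &next);

    for (long frame = 0; frame < 1000; frame++) {
        render_frame(frame);

        /* advance the absolute release time by exactly one period */
        next.tv_nsec += FRAME_PERIOD_NS;
        if (next.tv_nsec >= 1000000000L) {
            next.tv_nsec -= 1000000000L;
            next.tv_sec  += 1;
        }

        /* if we are already past the release time, the frame was late */
        struct timespec now;
        clock_gettime(CLOCK_MONOTONIC, &now);
        if (now.tv_sec > next.tv_sec ||
            (now.tv_sec == next.tv_sec && now.tv_nsec > next.tv_nsec))
            fprintf(stderr, "frame %ld overran its period\n", frame);

        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
    }
    return 0;
}
```

Sleeping to an absolute time (TIMER_ABSTIME) rather than for a relative interval keeps the loop locked to the frame clock even when individual frames finish early or late.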

7 Intel Xeon Processor Roadmap
Westmere microarchitecture: 32 nm process, 6 cores, 40 lanes PCI Express Gen 2, 3 channels DDR3
Sandy Bridge microarchitecture: 32 nm process, 4-8 cores, 40 lanes PCI Express Gen 3, 4 channels DDR3

8 Nvidia CUDA GPU Roadmap (21 SEP 2010)
Kepler: to be released sometime in 2011, 28 nm process; estimated performance of 4-6 DP GFLOPS/W
Maxwell: to be released sometime in 2013, 22 nm process; estimated performance of DP GFLOPS/W

9 Nvidia Tesla (Fermi Architecture)
CUDA programming environment: C/C++, Fortran, OpenCL, Java, Python, or DirectX Compute
GigaThread engine
515 GFLOPS double precision, 1030 GFLOPS single precision (C2050/C2070)
Parallel DataCache technology
3-6 GB GDDR5 memory, 384-bit bus, ECC option
GPUDirect with InfiniBand (M2050/M2070)
PCI Express 2.0 (16 lanes), two DMA engines for bi-directional data transfer (see the two-stream copy sketch below)
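
The two DMA engines called out above are what let a Fermi-class Tesla upload the next frame's task data while the previous frame is still being copied back; the two directions just have to be issued on separate CUDA streams from pinned host memory. A host-side sketch using the CUDA runtime API (buffer sizes are arbitrary):

```c
/* Overlap host->device and device->host copies on separate streams so
 * both DMA engines of a Fermi-class Tesla stay busy. Sizes are arbitrary. */
#include <cuda_runtime_api.h>
#include <stdlib.h>

int main(void)
{
    const size_t n = 64 << 20;               /* 64 MB per direction */
    float *h_in, *h_out, *d_in, *d_out;
    cudaStream_t up, down;

    cudaStreamCreate(&up);
    cudaStreamCreate(&down);
    cudaMallocHost((void **)&h_in,  n);       /* pinned host memory is     */
    cudaMallocHost((void **)&h_out, n);       /* required for async copies */
    cudaMalloc((void **)&d_in,  n);
    cudaMalloc((void **)&d_out, n);

    /* upload the next frame's task data while downloading the previous frame */
    cudaMemcpyAsync(d_in,  h_in,  n, cudaMemcpyHostToDevice, up);
    cudaMemcpyAsync(h_out, d_out, n, cudaMemcpyDeviceToHost, down);

    cudaStreamSynchronize(up);
    cudaStreamSynchronize(down);

    cudaFreeHost(h_in); cudaFreeHost(h_out);
    cudaFree(d_in); cudaFree(d_out);
    cudaStreamDestroy(up); cudaStreamDestroy(down);
    return 0;
}
```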

10 Nvidia Tesla Comparison
                                          Tesla C2070    Tesla M2070    Tesla M2090
Peak double precision FP performance      515 GFLOPS     515 GFLOPS     665 GFLOPS
Peak single precision FP performance      1030 GFLOPS    1030 GFLOPS    1331 GFLOPS
CUDA cores                                448            448            512
Memory size (GDDR5)                       6 GB           6 GB           6 GB
Memory bandwidth (ECC off)                144 GB/s       150 GB/s       177 GB/s
Thermal Design Power (TDP)                247 W          225 W          250 W
Retail price                              $2300          ~$2300         ~$3500

11 InfiniBand Roadmap
SDR - Single Data Rate
DDR - Double Data Rate
QDR - Quad Data Rate
FDR - Fourteen Data Rate
EDR - Enhanced Data Rate
HDR - High Data Rate
NDR - Next Data Rate

12 Mellanox ConnectX-2 Network Adapters
Nvidia GPUDirect: InfiniBand adapter and Nvidia GPU share a CPU memory region (see the shared-buffer sketch below)
Open Fabrics Enterprise Distribution (OFED) software
Bandwidth: 10G Ethernet; 10/20/40G InfiniBand
Protocol support: Remote Direct Memory Access (RDMA); OpenMPI, OSU MVAPICH, HP-MPI, Intel MPI, MS MPI, Scali MPI; TCP/UDP, IPoIB, SDP, RDS; SRP, iSER, NFS RDMA, FCoIB, FCoE
PCIe 2.0 (8 lanes)
Performance: 1 µs ping latency, 50M MPI messages/s
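
The GPUDirect bullet (first-generation GPUDirect) means the pinned host buffer the GPU DMAs into can be the same region the ConnectX-2 registers for RDMA, removing a staging copy. A sketch of that shared registration, assuming a protection domain pd has already been obtained elsewhere from ibv_alloc_pd():

```c
/* One pinned host buffer shared by the GPU (via cudaMallocHost) and the
 * InfiniBand HCA (via ibv_reg_mr), so a synthesized frame can be copied
 * out of GPU memory and then RDMA-written without a further staging copy.
 * Assumes a struct ibv_pd *pd obtained elsewhere from ibv_alloc_pd(). */
#include <cuda_runtime_api.h>
#include <infiniband/verbs.h>

struct ibv_mr *register_frame_buffer(struct ibv_pd *pd, size_t bytes,
                                     void **host_buf)
{
    if (cudaMallocHost(host_buf, bytes) != cudaSuccess)
        return NULL;

    /* register the same pages with the HCA for local and remote access */
    return ibv_reg_mr(pd, *host_buf, bytes,
                      IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE);
}
```

GPUDirect's kernel support is what allows both drivers to pin and use the same pages; without it, a separate bounce buffer and an extra host-side copy are needed.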

13 Mellanox IS5200 InfiniBand Switch
Non-blocking, full bisection bandwidth; ns-scale latency
Up to 216 QSFP ports
Multi-Tb/s aggregate throughput
9U chassis: 6 spine modules, 12 leaf modules
1 kW power

14 Remote Direct Memory Access (RDMA)
Remote Direct Memory Access enables data to be transferred from one processor's memory to another processor's memory across a network, without significantly involving either operating system
RDMA supports zero-copy data transfers by enabling the network adapter to transfer data directly to or from application memory, eliminating the need to copy data between application memory and data buffers in the operating system kernel
RDMA defines READ, WRITE, and SEND/RECEIVE operations
RDMA adapters support thousands of concurrent transactions using work queues (a minimal RDMA WRITE sketch follows below)
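
These operations and work queues map directly onto the libibverbs API. Below is a minimal sketch of posting a one-sided RDMA WRITE on an already-connected queue pair; connection setup, memory registration, and the exchange of the peer's buffer address and rkey are assumed to have happened elsewhere.

```c
/* Post a one-sided RDMA WRITE and wait for its completion.
 * qp/cq/mr and the peer's (remote_addr, rkey) come from connection
 * setup that is not shown here. */
#include <infiniband/verbs.h>
#include <stdint.h>
#include <string.h>

int rdma_write_frame(struct ibv_qp *qp, struct ibv_cq *cq,
                     struct ibv_mr *mr, void *frame, size_t len,
                     uint64_t remote_addr, uint32_t rkey)
{
    struct ibv_sge sge = {
        .addr   = (uintptr_t)frame,
        .length = (uint32_t)len,
        .lkey   = mr->lkey,
    };
    struct ibv_send_wr wr, *bad_wr = NULL;

    memset(&wr, 0, sizeof(wr));
    wr.opcode              = IBV_WR_RDMA_WRITE;
    wr.sg_list             = &sge;
    wr.num_sge             = 1;
    wr.send_flags          = IBV_SEND_SIGNALED;   /* generate a completion */
    wr.wr.rdma.remote_addr = remote_addr;         /* target frame buffer   */
    wr.wr.rdma.rkey        = rkey;                /* peer's steering key   */

    if (ibv_post_send(qp, &wr, &bad_wr))
        return -1;

    /* poll the completion queue: no kernel involvement on the data path */
    struct ibv_wc wc;
    while (ibv_poll_cq(cq, 1, &wc) == 0)
        ;
    return wc.status == IBV_WC_SUCCESS ? 0 : -1;
}
```

The one-sided WRITE is the operation the frame transport described on slide 2 relies on: the receiving side's CPU and operating system are not on the data path at all.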

15 OpenFabrics Alliance (OFA) Open Source
[OFED software stack diagram: application-level access methods (IP-based, sockets-based via SDP, various MPIs, block storage, clustered DB, and file-system access) sit on user-space components (user-level verbs/API for InfiniBand HCAs and iWARP R-NICs, uDAPL, SDP library, OpenSM, user-level MAD API, diagnostic tools) and kernel-space components (IPoIB, SDP, SRP and iSER initiators, RDS, NFS-RDMA RPC, cluster file systems, connection manager abstraction (CMA), SA client, MAD/SMA/PMA management agents), with kernel-bypass paths to the verbs layer, over hardware-specific drivers for InfiniBand HCAs and iWARP R-NICs]

16 GPU Server Options
1U server: dual Xeon 5600 processors & 5520 chipsets; three 16-lane + one 8-lane PCIe slots; supports 1-3 M-series GPUs + IB HCA
2U server: dual Xeon 5600 processors & 5520 chipsets; four 16-lane + two 8-lane PCIe slots (PLX 8647 switch); supports 1-4 M-series GPUs + IB HCA
4U server: dual Xeon 5600 processors & 5520 chipsets; eight 16-lane PCIe slots (4 PLX 8647 switches); supports 4-7 C-series GPUs + IB HCA

17 HPC System Configuration
4U servers (64 + 1):
Dual 6-core, 2.66 GHz Intel Xeon 5650 (Westmere) CPUs
Dual Intel 5520 (Tylersburg-36D) IOH with 6.4 GT/s QPI
Four 16-lane PCI Express Gen 2 slots
Six 8 GB DDR3 DIMMs (48 GB)
Four Nvidia Tesla C2070 (Fermi) GPUs
One Mellanox 40G InfiniBand Host Channel Adapter
One 300 GB, 10K RPM disk drive
Mellanox 40G InfiniBand switch (216 ports max)
Symmetricom IEEE 1588 PTP master clock
APC Smart-UPS RT 6000VA (18), 76 kW*
42U racks (9)
*65 nodes x 1.4 kW/node = 91 kW

18 Advanced HPC System Configuration
2U servers (64 + 1):
Dual 6-core, 2.66 GHz Intel Xeon 5650 (Westmere) CPUs
Dual Intel 5520 (Tylersburg-36D) IOH with 6.4 GT/s QPI
Four 16-lane + two 8-lane PCI Express Gen 2 slots (with switch)
Six 8 GB DDR3 DIMMs (48 GB)
Three Nvidia Tesla M2090 (Fermi) GPUs
Two Mellanox 40G InfiniBand Host Channel Adapters
One 250 GB SSD (solid-state disk)
Mellanox 40G InfiniBand switch (216 ports max)
Symmetricom IEEE 1588 PTP master clock
APC Symmetra PX SY100K100F UPS (100 kW)
42U racks (4+1)

19 Future HPC System Configuration
2U servers (64 + 1):
Dual 8-core, 2.3 GHz Intel Xeon E5 (Sandy Bridge) CPUs
Four 16-lane + two 8-lane PCI Express Gen 3 slots (with switch)
Eight 8 GB DDR3 DIMMs (64 GB)
Three Nvidia Tesla M2090 (Fermi) GPUs
Two Mellanox 56G InfiniBand Host Channel Adapters
One 250 GB SSD (solid-state disk)
Mellanox 56G InfiniBand switch (648 ports max)
Symmetricom IEEE 1588 PTP master clock
APC Symmetra PX SY100K100F UPS (100 kW)
42U racks (4+1)

20 IEEE 1588 Precision Time Protocol
IEEE 1588 Precision Time Protocol (PTP) Version 2 overcomes network and application latency and jitter through hardware time stamping at the physical layer of the network.
IEEE 1588 provides time transfer accuracy at the tens-of-nanoseconds level, a significant improvement in time synchronization accuracy over Network Time Protocol (NTP).
The Symmetricom XLi Grandmaster is IEEE 1588 PTP V2 compliant and time stamps PTP packets with a time stamp accuracy of 50 ns to UTC. Measured synchronization accuracy at a PTP client has been shown to be as good as a 17 ns offset from the XLi Grandmaster.
Operating at 100BaseT line speed with deep time stamp packet buffers, the XLi Grandmaster can support thousands of 1588 clients. (A sketch for spot-checking a client's hardware clock against the system clock follows below.)
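
On a Linux client, the NIC clock that the 1588 daemon disciplines is exposed as a PTP hardware clock (PHC) character device, so a crude spot check of client synchronization is to read it back to back with the system clock. /dev/ptp0 is an assumption about which device backs the 1588-capable interface, and the FD_TO_CLOCKID conversion follows the convention used by the linuxptp tools:

```c
/* Read the NIC's PTP hardware clock (PHC) and the system clock back to
 * back and print the raw difference. /dev/ptp0 is assumed to be the PHC
 * behind the 1588-capable interface; FD_TO_CLOCKID follows linuxptp. */
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

#define CLOCKFD 3
#define FD_TO_CLOCKID(fd) ((clockid_t)((((unsigned int)~(fd)) << 3) | CLOCKFD))

int main(void)
{
    int fd = open("/dev/ptp0", O_RDWR);
    if (fd < 0) { perror("open /dev/ptp0"); return 1; }

    struct timespec phc, sys;
    clock_gettime(FD_TO_CLOCKID(fd), &phc);   /* hardware clock */
    clock_gettime(CLOCK_REALTIME, &sys);      /* system clock   */

    double off = (sys.tv_sec - phc.tv_sec) + (sys.tv_nsec - phc.tv_nsec) / 1e9;
    printf("system - PHC = %.9f s (includes back-to-back read latency)\n", off);

    close(fd);
    return 0;
}
```

This raw difference includes the latency of the two reads themselves; the 1588 daemon's filtered offset statistics are the authoritative number, but a sanity check like this quickly exposes an undisciplined clock.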

21 Uninterruptible Power Supply (UPS)
APC Symmetra PX 100 kW: scalable to 100 kW/100 kVA; 208 V 3-phase, 332 A service

22 APC Symmetra PX Performance

23 HPC Performance
                                Node            System
Cores (CPU/GPU)                 12/1536         768/98304
CPU SP FP performance           128 GFLOPS      8 TFLOPS
CPU DP FP performance           64 GFLOPS       4 TFLOPS
GPU SP FP performance           3990 GFLOPS     255 TFLOPS
GPU DP FP performance           1995 GFLOPS     128 TFLOPS
Main memory size                48 GB           3 TB
Main memory BW                  64 GB/s         4 TB/s
Disk size                       250 GB          16 TB
Disk IOPS (4 KB)                20K             1.28M
Disk R/W BW                     500/315 MB/s    32/20 GB/s
Network BW                      50 Gb/s         3.2 Tb/s
Power                           1.5 kW          100 kW

24 HPC Procurement Schedule
Breadboard performance evaluation: 15 JUL
Finalize HPC configuration: 15 JUL
  # Fermi processors (4 -> 3); # IB adapters (1 -> 2); UPS (100 kW); server (4U -> 2U); SSD
Request final vendor quotes: 1 AUG
HPC vendor selection
Issue HPC system purchase order: OCT 31
HPC system integration & test by vendor: 6-12 week delivery ARO
Installation: DEC 31
Prepare electrical supply for UPS

25 REAL-TIME LINUX

26 Real-Time Operating System (RTOS) Requirements
No dropped frames during a simulation run
Support Nvidia's CUDA
Support the InfiniBand adapter with GPUDirect
Support IEEE 1588 Precision Time Protocol (PTP)
Candidate RTOS: Concurrent Computer RedHawk; Red Hat MRG (Messaging, Real-Time, Grid)

27 Interrupt Dispatch Latency*
*Ravi Malhotra, Real-Time Performance on Linux-based Systems, 2011 Freescale Technology Forum

28 Real-Time Support on Linux*
Traditionally, Linux is not a real-time operating system
Designed for server throughput performance rather than embedded-systems latency
Scheduling latencies can be unbounded
The big kernel lock and other mechanisms (softirq) typically end up blocking real-time-critical tasks
Processes cannot be pre-empted while executing system calls
(A minimal real-time process setup sketch follows below.)
*Ravi Malhotra, Real-Time Performance on Linux-based Systems, 2011 Freescale Technology Forum
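
Whichever candidate RTOS is chosen (RedHawk, or MRG with the PREEMPT_RT patch), the frame-generation process still has to request real-time treatment explicitly. A minimal setup sketch: lock all pages to avoid paging stalls and move the thread into the SCHED_FIFO class (priority 80 is an arbitrary example value):

```c
/* Typical real-time process setup on a PREEMPT_RT-style kernel: lock all
 * current and future pages to prevent paging stalls, then switch the
 * calling thread to a fixed-priority FIFO scheduling class.
 * Priority 80 is an arbitrary example value. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int enter_realtime(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        perror("mlockall");
        return -1;
    }

    struct sched_param sp;
    memset(&sp, 0, sizeof(sp));
    sp.sched_priority = 80;                      /* example RT priority */

    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0) {
        perror("sched_setscheduler(SCHED_FIFO)");
        return -1;
    }
    return 0;
}
```

On a stock (non-PREEMPT_RT) kernel the same calls succeed, but the scheduling-latency problems listed above still apply; the RT kernel is what makes the SCHED_FIFO guarantee meaningful.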

29 Sources of Latency & How RT Patch Helps*
*Ravi Malhotra, Real-Time Performance on Linux-based Systems, 2011 Freescale Technology Forum

30 HPC PERFORMANCE MODEL

31 Hyperformix Workbench Performance Model

32 Workbench Model Steps
The application consists of 9 steps that comprise the generation and transfer of a frame (a host-side sketch of steps 4-9 follows below):
1. Projector requests a frame (provides state data)
2. CPU sets up the frame generation process
3. CPU writes task data to CPU memory (DDR3 SDRAM)
4. CPU tasks the GPU to synthesize the frame
5. GPU reads the task data from CPU memory
6. GPU synthesizes the frame
7. GPU transfers the frame data to CPU memory
8. CPU tasks the InfiniBand network adapters to transfer the frame to the Crossfield network interface via the InfiniBand switch
9. Network adapters transfer the frame to FPGA memory using the RDMA protocol
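
A host-side sketch of how steps 4 through 9 might be ordered for one frame is shown below. launch_synthesis_kernel() and rdma_write_frame_to_projector() are hypothetical helpers (the latter corresponding to the verbs RDMA WRITE sketched earlier), not functions from the presentation, and error handling is omitted:

```c
/* Host-side ordering of steps 4-9 for a single frame (sketch only).
 * Both helpers declared below are hypothetical: launch_synthesis_kernel()
 * stands in for the CUDA scene-synthesis kernel launch, and
 * rdma_write_frame_to_projector() for the verbs RDMA WRITE shown earlier. */
#include <cuda_runtime_api.h>
#include <stddef.h>

void launch_synthesis_kernel(const float *d_task, float *d_frame,
                             cudaStream_t s);                    /* hypothetical */
int  rdma_write_frame_to_projector(const void *frame, size_t n); /* hypothetical */

void generate_one_frame(float *h_task, float *d_task, size_t task_bytes,
                        float *h_frame, float *d_frame, size_t frame_bytes,
                        cudaStream_t s)
{
    /* steps 4-5: CPU tasks the GPU; GPU pulls task data from (pinned) CPU memory */
    cudaMemcpyAsync(d_task, h_task, task_bytes, cudaMemcpyHostToDevice, s);

    /* step 6: GPU synthesizes the frame */
    launch_synthesis_kernel(d_task, d_frame, s);

    /* step 7: finished frame returns to CPU memory */
    cudaMemcpyAsync(h_frame, d_frame, frame_bytes, cudaMemcpyDeviceToHost, s);
    cudaStreamSynchronize(s);

    /* steps 8-9: CPU tasks the HCA, which RDMA-writes the frame into the
       network-interface FPGA memory via the InfiniBand switch */
    rdma_write_frame_to_projector(h_frame, frame_bytes);
}
```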

33 Hyperformix Workbench Performance Model

34 Workbench Model Results
Application steps and modeled response times in µs (most numeric values did not survive the transcription):
Application.Step_1_Frame_Request_from_Projector.response
Application.Step_2_and_3_Setup_Process_and_write_data_to_memory.response
Application.Step_4_CPU_tasks_GPU.response
Application.Step_5_GPU_reads_data_from_CPU_Memory.response
Application.Step_6_GPU_synthesizes_Frame_first_transfer.response 1000
Application.Step_7_GPU_xfers_Frame_to_CPU_memory.response
Application.Step_8_CPU_tasks_Network_Adapter_to_transfer_Frame_to_NI.response
Application.Step_9_Network_Adapter_xfer_frame_to_NI_FPGA_Memory.response 2259
Application.Main_RT_App.All_Steps_transfer_RT_

35 PROJECTOR INTERFACE

36 Projector Interfaces
FPGA Mezzanine Cards (FMC):
1. Two dual DVI
2. Parallel fiber-optic ports (8-10)
3. Digital Micromirror Device (DMD) interface
All modules provide 2 user-definable I/Os, e.g., an HWIL synchronization signal (output next frame)
