Accelerated Verbs. Liran Liss Mellanox Technologies


Agenda

- Introduction
- Possible approaches
- Querying and releasing interfaces
- Interface family examples
- Application usage example
- Conclusion

Introduction

- High-end apps require extremely optimized data-path APIs
  - HPC
  - Packet processing
- Goals
  - Provide the best possible performance (*)
  - Focus on the fast path
    - Dedicated functions for initiating IO
    - Dedicated functions for polling completions
  - Maintain 100% functional compatibility with existing Verbs
- Non-goals
  - Optimize all Verbs
  - Change the object model or semantics

Possible Approaches

- Verbs Extensions
- Object Oriented
- Interface Type Oriented
- Interface Object Oriented

[Diagram: for each approach, how App contexts, Interface1/Interface2/Interface3, QPs, ibv_context, and provider methods relate.]

Interface Object Oriented API

- Obtain accelerated function tables through a parameterized Query Interface Verb
- Benefit from all Object Oriented goodies
  - Type checking
  - Specialization
- Separate Interface acquisition from invocation
  - All versioning and capability checks done at acquisition time
  - Eliminate all overheads in fast path
- Gradual addition, extension, and deprecation of interfaces
- Obtained interface valid for requested object only
  - Provider may choose to return the same interface for multiple objects

Query Interface

    enum ibv_query_intf_flags {
        IBV_QUERY_INTF_FLAG_ENABLE_CHECKS = (1 << 0),
    };

    enum ibv_intf_scope {
        IBV_INTF_GLOBAL,
        IBV_INTF_EXPERIMENTAL,
        IBV_INTF_VENDOR,
    };

    struct ibv_query_intf_params {
        uint32_t flags;
        enum ibv_intf_scope intf_scope;
        uint64_t vendor_guid;   /* valid only for Vendor scope */
        uint32_t intf;
        uint32_t intf_version;
        void *obj;
        void *family_params;
        uint32_t family_flags;
        uint32_t comp_mask;     /* for future extensions */
    };

    static inline void *ibv_query_intf(struct ibv_context *context,
                                       struct ibv_query_intf_params *params,
                                       enum ibv_query_intf_status *status);

    int ibv_release_intf(struct ibv_context *context, void *intf);

Query Interface (cont.)

    enum ibv_query_intf_status {
        IBV_INTF_STAT_OK,
        IBV_INTF_STAT_VENDOR_NOT_SUPPORTED,
        IBV_INTF_STAT_INTF_NOT_SUPPORTED,
        IBV_INTF_STAT_VERSION_NOT_SUPPORTED,
        IBV_INTF_STAT_INVAL_PARARM,
        IBV_INTF_STAT_INVAL_OBJ,
    };

Notes

- Interfaces do not introduce new functionality to Verbs
- Each family is used only for a specific object type (QP, CQ, etc.)
- If no object is passed, the call serves as a capability query
  - Checks provider support for a given family and version
- Version usage
  - Each increment must support all function calls of previous versions
  - Interface changes require a new family
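The notes above (capability query when no object is passed, per-object-type families, and versions that stay backward compatible) can be sketched with a small mock provider. This is only an illustration under stated assumptions: every `demo_` name is hypothetical and stands in for the proposed API, nothing here calls a real Verbs library, and returning NULL with an OK status for a capability query is one possible convention, not something the slides specify.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the proposed status codes. */
enum demo_status {
    DEMO_STAT_OK,
    DEMO_STAT_INTF_NOT_SUPPORTED,
    DEMO_STAT_VERSION_NOT_SUPPORTED,
    DEMO_STAT_INVAL_OBJ,
};

enum demo_family { DEMO_FAMILY_QP_BURST, DEMO_FAMILY_CQ_POLL, DEMO_FAMILY_COUNT };
enum demo_obj_type { DEMO_OBJ_QP, DEMO_OBJ_CQ };

struct demo_obj { enum demo_obj_type type; };

struct demo_params {
    uint32_t intf;          /* requested family */
    uint32_t intf_version;  /* requested version, starting at 1 */
    struct demo_obj *obj;   /* NULL: capability query only (assumed convention) */
};

/* A real function table would hold function pointers; omitted here. */
struct demo_intf { uint32_t family; uint32_t version; };

/* Highest version the mock provider implements for each family. */
static const uint32_t demo_max_version[DEMO_FAMILY_COUNT] = { 2, 1 };
/* Object type each family applies to. */
static const enum demo_obj_type demo_family_obj[DEMO_FAMILY_COUNT] = {
    DEMO_OBJ_QP, DEMO_OBJ_CQ
};

static const struct demo_intf *demo_query_intf(const struct demo_params *p,
                                               enum demo_status *status)
{
    static struct demo_intf tables[DEMO_FAMILY_COUNT];

    if (p->intf >= DEMO_FAMILY_COUNT) {
        *status = DEMO_STAT_INTF_NOT_SUPPORTED;
        return NULL;
    }
    /* Any version up to the provider's maximum is accepted, because each
     * increment must support all calls of the previous versions. */
    if (p->intf_version == 0 || p->intf_version > demo_max_version[p->intf]) {
        *status = DEMO_STAT_VERSION_NOT_SUPPORTED;
        return NULL;
    }
    if (p->obj == NULL) {           /* capability query: nothing to return */
        *status = DEMO_STAT_OK;
        return NULL;
    }
    if (p->obj->type != demo_family_obj[p->intf]) {
        *status = DEMO_STAT_INVAL_OBJ;
        return NULL;
    }
    tables[p->intf].family = p->intf;
    tables[p->intf].version = demo_max_version[p->intf];
    *status = DEMO_STAT_OK;
    return &tables[p->intf];
}
```

All checks run once, at acquisition time; a caller that holds the returned table would invoke its functions afterwards without any further validation.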

Burst Family

- Targets Ethernet packet processing apps
- Optimizes
  - Accumulating packets to be sent
  - Bursts of single-SGE packets

    struct ibv_qp_burst_family {
        int (*send_pending)(struct ibv_qp *qp, uint64_t addr, uint32_t length,
                            uint32_t lkey, uint32_t flags);
        int (*send_pending_inline)(struct ibv_qp *qp, void *addr, uint32_t length,
                                   uint32_t flags);
        int (*send_pending_sg_list)(struct ibv_qp *qp, struct ibv_sge *sg_list,
                                    uint32_t num, uint32_t flags);
        int (*send_flush)(struct ibv_qp *qp);

        /* Send/receive burst of single-SGE packets; num indicates number of packets */
        int (*send_burst)(struct ibv_qp *qp, struct ibv_sge *sg_list, uint32_t num,
                          uint32_t flags);
        int (*recv_burst)(struct ibv_qp *qp, struct ibv_sge *sg_list, uint32_t num);
    };
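A minimal sketch of the accumulate-then-flush pattern that `send_pending` and `send_flush` suggest. The slides do not define the exact semantics, so this mock assumes that `send_pending` only queues a packet and that `send_flush` hands all queued packets to the hardware with a single doorbell. All `demo_` names, the queue limit, and the counters are hypothetical.

```c
#include <stdint.h>

#define DEMO_MAX_PENDING 64   /* hypothetical queue depth */

/* Mock QP that only counts what happens to it. */
struct demo_qp {
    uint32_t pending;    /* packets queued but not yet handed to hardware */
    uint32_t posted;     /* packets handed to hardware */
    uint32_t doorbells;  /* simulated doorbell rings */
};

/* Cut-down function table modeled on the burst family above. */
struct demo_qp_burst_family {
    int (*send_pending)(struct demo_qp *qp, uint64_t addr, uint32_t length,
                        uint32_t lkey, uint32_t flags);
    int (*send_flush)(struct demo_qp *qp);
};

static int demo_send_pending(struct demo_qp *qp, uint64_t addr, uint32_t length,
                             uint32_t lkey, uint32_t flags)
{
    (void)addr; (void)length; (void)lkey; (void)flags;
    if (qp->pending >= DEMO_MAX_PENDING)
        return -1;               /* queue full */
    qp->pending++;
    return 0;
}

static int demo_send_flush(struct demo_qp *qp)
{
    if (qp->pending == 0)
        return 0;                /* nothing to flush, no doorbell */
    qp->posted += qp->pending;
    qp->pending = 0;
    qp->doorbells++;             /* one doorbell for the whole batch */
    return 0;
}

static const struct demo_qp_burst_family demo_burst = {
    demo_send_pending, demo_send_flush
};

/* Queue n packets, then flush once. Returns 0 on success, -1 if the
 * queue fills up before all packets are accepted. */
static int demo_send_n(const struct demo_qp_burst_family *f,
                       struct demo_qp *qp, uint32_t n)
{
    for (uint32_t i = 0; i < n; i++)
        if (f->send_pending(qp, 0, 64, 0, 0) != 0)
            return -1;
    return f->send_flush(qp);
}
```

Under these assumptions, sending eight packets through `demo_send_n` costs one doorbell rather than eight, which is the kind of saving a batching data-path API is meant to expose.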

Poll Family

- Targets minimal polling information
- Optimizes
  - Send completion counts
  - Receive completions for which only the length is of interest
  - Completions that contain the payload in the CQE

    struct ibv_cq_poll_family {
        int (*poll_cnt)(struct ibv_cq *cq, uint32_t max);
        int (*poll_cq_length)(void *buf, uint32_t *inl);
    };
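To illustrate why a count-only poll is enough for send completions, here is a small mock. It assumes `poll_cnt` returns the number of completions consumed (at most `max`) and a negative value on error; the slides do not state the return convention, and all `demo_` names are hypothetical.

```c
#include <stdint.h>

/* Mock CQ: just a count of completions waiting to be consumed. */
struct demo_cq {
    uint32_t ready;
};

/* Cut-down function table modeled on the poll family above. */
struct demo_cq_poll_family {
    int (*poll_cnt)(struct demo_cq *cq, uint32_t max);
};

/* Consume up to max completions and report how many were consumed. */
static int demo_poll_cnt(struct demo_cq *cq, uint32_t max)
{
    uint32_t n = cq->ready < max ? cq->ready : max;
    cq->ready -= n;
    return (int)n;
}

static const struct demo_cq_poll_family demo_poll = { demo_poll_cnt };

/* Reclaim send credits: poll until all outstanding sends have completed
 * or the CQ runs dry. Returns the number of sends still outstanding. */
static uint32_t demo_reclaim(const struct demo_cq_poll_family *f,
                             struct demo_cq *cq, uint32_t outstanding)
{
    while (outstanding > 0) {
        int n = f->poll_cnt(cq, outstanding);
        if (n <= 0)
            break;               /* nothing ready (or error): stop polling */
        outstanding -= (uint32_t)n;
    }
    return outstanding;
}
```

The caller never inspects a per-completion structure; it only needs to know how many send slots became free.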

Application Code

    struct context {
        struct ibv_context *context;
        struct ibv_qp *qp;
        struct ibv_qp_ops_msg *msg;
        struct ibv_cq *cq;
        struct ibv_cq_formatted_ops *formatted;
    };

    int create_ctx(struct context *ctx)
    {
        struct ibv_cq_attr cq_attr;
        ...
        ctx->qp = ibv_create_qp(...);
        ctx->msg = ibv_query_intf(ctx->context, ... /* QP channel family */);

        ctx->cq = ibv_create_cq(...);
        cq_attr.format_mask = IBV_WC_BASE | IBV_WC_TS;
        ibv_modify_cq(ctx->cq, &cq_attr, IBV_CQ_FORMAT_MASK);
        ctx->formatted = ibv_query_intf(ctx->context, ... /* Formatted CQ family */);
        ...
    }

Application Code (cont.)

    int send_message(struct context *ctx)
    {
        char my_msg[] = "blah";
        struct {
            struct ibv_wc_base base;
            struct ibv_wc_ts ts;
        } my_comp;
        int ret;

        ctx->msg->send_inline(ctx->qp, my_msg, sizeof my_msg, SEND_WR_ID);

        /* Busy-poll until a completion arrives (ret > 0) or an error occurs (ret < 0) */
        do {
            ret = ctx->formatted->poll_formatted(ctx->cq, &my_comp, sizeof(my_comp));
        } while (!ret);

        if (ret < 0) {
            /* Poll for error and bail out */
            ...
        }

        printf("wr_id:%d timestamp:%lld\n", my_comp.base.wr_id, my_comp.ts.timestamp);
        return ret;
    }

Conclusion

- Verbs is a robust and efficient interface
  - Solid object model
  - Efficient data path
- Verbs extensions allow adding new APIs in a forward- and backward-compatible manner
  - The standard way to add new Verbs
- Accelerated Verbs provides highly optimized, domain-specific, data-path APIs
  - Always a strict subset of Verbs

Thank You