Messaging Overview. Introduction. Gen-Z Messaging

Similar documents
DRAM and Storage-Class Memory (SCM) Overview

This presentation provides an overview of Gen Z architecture and its application in multiple use cases.

Gen-Z Overview. 1. Introduction. 2. Background. 3. A better way to access data. 4. Why a memory-semantic fabric

GEN-Z AN OVERVIEW AND USE CASES

Gen-Z Identifiers September Copyright 2016 by Gen-Z. All rights reserved.

Introduction to High-Speed InfiniBand Interconnect

Maximizing heterogeneous system performance with ARM interconnect and CCIX

Scaling to Petaflop. Ola Torudbakken Distinguished Engineer. Sun Microsystems, Inc

Intel Cluster Ready Allowed Hardware Variances

How Might Recently Formed System Interconnect Consortia Affect PM? Doug Voigt, SNIA TC

Gen-Z Communication at the Speed of Memory Kurtis Bowman President, Gen-Z Consortium

This presentation covers Gen Z Operation Classes, also referred to as OpClasses.

A Low Latency Solution Stack for High Frequency Trading. High-Frequency Trading. Solution. White Paper

NVM PCIe Networked Flash Storage

Gen-Z Memory-Driven Computing

Lossless 10 Gigabit Ethernet: The Unifying Infrastructure for SAN and LAN Consolidation

FC-NVMe. NVMe over Fabrics. Fibre Channel the most trusted fabric can transport NVMe natively. White Paper

Juniper Virtual Chassis Technology: A Short Tutorial

SOFTWARE-DEFINED BLOCK STORAGE FOR HYPERSCALE APPLICATIONS

1 Copyright 2011, Oracle and/or its affiliates. All rights reserved.

Interconnect Your Future

Mellanox Virtual Modular Switch

NVMe Direct. Next-Generation Offload Technology. White Paper

IsoStack Highly Efficient Network Processing on Dedicated Cores

Chelsio Communications. Meeting Today s Datacenter Challenges. Produced by Tabor Custom Publishing in conjunction with: CUSTOM PUBLISHING

Dell EMC. VxBlock Systems for VMware NSX 6.3 Architecture Overview

This presentation covers Gen Z coherency operations and semantics.

Paving the Road to Exascale

NVMe over Universal RDMA Fabrics

6.9. Communicating to the Outside World: Cluster Networking

This presentation covers Gen Z Buffer operations. Buffer operations are used to move large quantities of data between components.

Storage System. David Southwell, Ph.D President & CEO Obsidian Strategics Inc. BB:(+1)

Why NVMe/TCP is the better choice for your Data Center

Application Acceleration Beyond Flash Storage

Introduction to Infiniband

Highest Levels of Scalability Simplified Network Manageability Maximum System Productivity

Benefits of Offloading I/O Processing to the Adapter

SAN Virtuosity Fibre Channel over Ethernet

Intel Cloud Builder Guide: Cloud Design and Deployment on Intel Platforms

OCP Engineering Workshop - Telco

Communication has significant impact on application performance. Interconnection networks therefore have a vital role in cluster systems.

Introduction. Network Architecture Requirements of Data Centers in the Cloud Computing Era

Cisco Nexus 4000 Series Switches for IBM BladeCenter

Accelerating Real-Time Big Data. Breaking the limitations of captive NVMe storage

Multiservice Optical Switching System CoreDirector FS. Offering new services and enhancing service velocity

ALCATEL-LUCENT ENTERPRISE DATA CENTER SWITCHING SOLUTION Automation for the next-generation data center

MELLANOX EDR UPDATE & GPUDIRECT MELLANOX SR. SE 정연구

RapidIO.org Update. Mar RapidIO.org 1

Unify Virtual and Physical Networking with Cisco Virtual Interface Card

GUIDE. Optimal Network Designs with Cohesity

SAS Technical Update Connectivity Roadmap and MultiLink SAS Initiative Jay Neer Molex Corporation Marty Czekalski Seagate Technology LLC

Density Optimized System Enabling Next-Gen Performance

The Transition to PCI Express* for Client SSDs

Flex System EN port 10Gb Ethernet Adapter Product Guide

AcuSolve Performance Benchmark and Profiling. October 2011

2017 Storage Developer Conference. Mellanox Technologies. All Rights Reserved.

PEX8764, PCI Express Gen3 Switch, 64 Lanes, 16 Ports

The NE010 iwarp Adapter

Cisco Nexus 5000 and Emulex LP21000

This presentation cover Gen Z Access Control.

W H I T E P A P E R : O P E N. V P N C L O U D. Implementing A Secure OpenVPN Cloud

Cisco Unified Computing System Delivering on Cisco's Unified Computing Vision

This presentation covers Gen Z Memory Management Unit (ZMMU) and memory interleave capabilities.

OFA Developer Workshop 2013

10 ways to securely optimize your network. Integrate WAN acceleration with next-gen firewalls to enhance performance, security and control

What s New in VMware vsphere 4.1 Performance. VMware vsphere 4.1

RoCE vs. iwarp Competitive Analysis

Dell EMC. VxRack System FLEX Architecture Overview

Flashmatrix Technology

mtera SONET/SDH Migration

2008 International ANSYS Conference

An Introduction to GPFS

Innovating and Integrating for Communications and Storage

14th ANNUAL WORKSHOP 2018 A NEW APPROACH TO SWITCHING NETWORK IMPLEMENTATION. Harold E. Cook. Director of Software Engineering Lightfleet Corporation

RapidIO.org Update.

Future Routing Schemes in Petascale clusters

Ravindra Babu Ganapathi

N V M e o v e r F a b r i c s -

Enabling High Performance Data Centre Solutions and Cloud Services Through Novel Optical DC Architectures. Dimitra Simeonidou

The Virtual Machine Aware SAN

Configuring SR-IOV. Table of contents. with HP Virtual Connect and Microsoft Hyper-V. Technical white paper

Birds of a Feather Presentation

Sub-microsecond interconnects for processor connectivity The opportunity

Performance Optimizations via Connect-IB and Dynamically Connected Transport Service for Maximum Performance on LS-DYNA

Super-convergence for Real Time Analytics at the Edge Flashmatrix 1000 Series. storage.toshiba.com/flashmatrix/

All Roads Lead to Convergence

Fast and Easy Persistent Storage for Docker* Containers with Storidge and Intel

OFA Developer Workshop 2014

Plexxi Theory of Operations White Paper

Intel Rack Scale Architecture. using Intel Ethernet Multi-host Controller FM10000 Family

HETEROGENEOUS SYSTEM ARCHITECTURE: PLATFORM FOR THE FUTURE

1/5/2012. Overview of Interconnects. Presentation Outline. Myrinet and Quadrics. Interconnects. Switch-Based Interconnects

Solace Message Routers and Cisco Ethernet Switches: Unified Infrastructure for Financial Services Middleware

PERFORMANCE ACCELERATED Mellanox InfiniBand Adapters Provide Advanced Levels of Data Center IT Performance, Productivity and Efficiency

Performance Considerations of Network Functions Virtualization using Containers

VPI / InfiniBand. Performance Accelerated Mellanox InfiniBand Adapters Provide Advanced Data Center Performance, Efficiency and Scalability

PCI Express High Speed Fabric. A complete solution for high speed system connectivity

Interconnect Your Future

How to abstract hardware acceleration device in cloud environment. Maciej Grochowski Intel DCG Ireland

Dell EMC Storage with Panasonic Video Insight

Transcription:

Page 1 of 6 Messaging Overview Introduction Gen-Z is a new data access technology that not only enhances memory and data storage solutions, but also provides a framework for both optimized and traditional messaging solutions. The following sections unveil the features and capabilities of the Gen-Z architecture by first describing the foundational features and building on these to illustrate the next generation components and system solutions possible. Gen-Z Messaging The Gen-Z DRAM and Storage-Class Memory (SCM) Overview and Gen-Z Block Storage Overview documents describe the enhanced features and capabilities of Gen-Z as a data access interface and interconnect. Figure 1 illustrates an example infrastructure solution built on using Gen-Z for fabric-attached memory data access, but using an existing network solution for messaging between a pair of servers. The Media Controller in the memory module and the two CPUs have integrated Gen-Z logic and interfaces to support low-latency, high-bandwidth, resilient, and scalable byte-addressable data access to volatile or storage class (persistent) memory. But this configuration has deployment challenges. First, messaging and data access have their own independent fabrics and components including NICs (for networking), switches, cabling, management software, and mechanical packaging. Second, the message latency between the CPUs will be higher than the data access latencies because of the additional protocol translations at the PCIe and network interfaces. Finally, this configuration incurs higher cost, power consumption, and operational management overhead. Figure 1: Gen-Z with Existing Networking Figure 2: Gen-Z for Messaging & Data Access Figure 2 demonstrates an alternative that consolidates messaging with data access onto Gen-Z. The Gen-Z architecture includes protocol and packet handling mechanisms to support both reliable and unreliable messaging seamlessly with byte-addressable and block data access operations. This configuration offers many benefits including: Sub-250 ns one-way, application-to-application latency Bandwidth scaling Simple, optimized, system designs Reduced power consumption

Messaging Overview Page 2 of 6 Integrated management framework Reduced capital and operating costs Gen-Z messaging can be used for efficient, memory semantic communication between a variety of Gen-Z components, including compute components (e.g. CPUs, SoCs, GPUs, FPGAs, etc.), or communication-specific components (e.g. gateways to other network technologies like Ethernet, InfiniBand, etc.). Gen-Z enables legacy protocols and/or messaging APIs to be tunneled/mapped over Gen-Z infrastructure, as well as enabling next generation workloads that have native Gen-Z library support. Gen-Z supports messaging in deployments as small as a single enclosure (rack mount server, blade enclosure, etc.), to rack-scale, and even data center scale deployments if the components implement the optional Gen-Z features that support this capability. Factors such as component/solution features, cost, complexity, power, performance (latency/bandwidth), management models, etc. all affect the particular workloads, target markets, and deployment models that can be deployed using Gen-Z technology. Flexible, High Bandwidth Serial Interfaces Gen-Z utilizes high-speed serial physical layer technology supporting 16, 25, 28, 56, and 112 GT/s rates on each serial signal. Gen-Z supports link interfaces that support 1-256 serial signals for transmit and receive (the most common link widths will be 1, 2, 4, 8, or 16 serial signals). Gen-Z interfaces support both electrical and optical connectivity between components. Gen-Z supports symmetric link interfaces where there are an equal number of transmit and receive serial signals, as well as asymmetric link interfaces where the number of transmit and receive serial signals are not the same. Although symmetric links are most common, asymmetric links enable solutions to tailor transmit or receive bandwidths to application-specific needs (e.g., at link initialization, configuring six receive and two transmit lanes to increase memory read and/or receive message bandwidth by 50%). Figure 3 illustrates aggregate unidirectional transmit or receive bandwidths of 4, 8, or 16 links. Each link is independently configured and can be dedicated to a specific workload or shared by multiple workloads. Enhanced Multipath for Messaging Figure 3: Examples of Gen-Z Compute Component Link Bandwidth Multi-link Gen-Z components increase aggregate bandwidth, system resiliency, and topology flexibility. This enables Gen-Z solutions to deliver optimal performance and availability without sacrificing performance. Figure 4 illustrates a pair of dualprocessor systems with four Gen-Z links per processor connected to two Gen-Z switches. Each system can isolate and loadbalance traffic to meet application performance needs. In the event of switch or link failure, Gen-Z transparently rebalances traffic to the remaining links and switches to minimize application performance impacts. In this example, the loss of a single link represents a 25% performance loss (the more links provisioned, the lower the performance loss due to link failure).

Messaging Overview Page 3 of 6 Figure 4: Gen-Z Enhanced Multipath Capabilities Flexible Topologies and Advanced Switching Capabilities Gen-Z architecture enables solutions to support a wide range of topologies from simple point-to-point to single-enclosure fanout, to scale-out multi-rack solutions. Gen-Z also enables switch logic to be integrated into any type of component (e.g. CPUs, GPUs, memory, network gateways, etc.) to reduce costs, improve peer-to-peer latencies and support fabric topologies like 2D/3D Torus, Mesh, and Daisy-Chains. Using standalone switches enables more scalable topologies, such as Clos, Fat Trees, Hypercube, Dragonfly, etc. Gen-Z components and switches (internal or standalone) incorporate novel, compact relay table configurations that enable nearly any routing algorithm to be supported without sacrificing performance. Further, Gen-Z supports adaptive routing to dynamically bypass congested or failed paths to deliver reduce jitter and ensure application availability. To reduce message latency jitter, Gen-Z supports up to 32 virtual channels (VCs) and up to 256-byte data payloads. VCs enables Gen-Z to separate traffic to target independent buffer resources. When combined with VC remapping, packets from limited VC components can be transparently distributed across multiple switch paths to maximize fabric utilization and efficiency. A 256-byte data payload enables Gen-Z to achieve ~90% protocol efficiency while minimizing jitter due to head-lo-line blocking (a modest data payload size ensures different priority packets can be efficiently interleaved with minimal latency and complexity). Message Models Any processor, GPU, FPGA, DSP, etc. component can utilize Gen-Z messaging to improve solution performance, capability, and availability. Such components can support software-based or hardware-assisted messaging. Figure 5 illustrates two processors using software-based messaging by exchanging Gen-Z read, write, and atomic operations (these operations can be transparently generated by components with an integrated memory management unit used to translate load / store / atomic instructions and operations into Gen-Z read, write, interrupt, and atomic operations). Software-based messaging can be directly used by applications to deliver optimal latency and power efficiency. For example, a thread in Processor B can access another thread s data in Processor A using load-store (read-write) semantics. Figure 5: Gen-Z Software-based Messaging

Messaging Overview Page 4 of 6 Hardware-assisted messaging uses lightweight, component-integrated data movers to move data with minimum software involvement. Data movers can use Gen-Z write message operations to send and receive up to 64 KB of message data that targets destination-managed memory. For messaging that requires large data movement, data movers support either Gen-Z put and get operations to move up to 4 GB of data movement, or large read operations to move up to 1 MB of data, both of which require little software involvement. In all of these modes, data movers can support queue-based programming models. All of these operations and models can be used directly by applications or integrated within existing network and storage stacks or application middleware. This enables Gen-Z solutions to reduce the cost, complexity, and overheads to deliver highly-efficient and high-performance messaging solutions. Optimized but Flexible Software Architecture One of the most important benefits of Gen-Z is that it delivers messaging services with software that is a fraction of the size of modern protocol stacks. Gen-Z enables a new generation of applications and protocols that can directly access Gen-Z hardware interfaces to facilitate high-performance data path messaging to other components. Yet, Gen-Z software modules can be created that fit into exiting protocol stacks to enable today s applications to maintain backward compatibility and still benefit from many of the capabilities of Gen-Z. Figure 6: Gen-Z Messaging Software Architecture Figure 6 illustrates a high-level view of the Gen-Z software architecture in a typical compute component. As indicated in the far right-hand side of Figure 6, next generation applications can be bound to a Gen-Z Library to have direct access to Gen-Z hardware for sending and receiving messages. This model has the thinnest software layer yielding high-bandwidth, low latency messaging. But the Gen-Z software model supports the creation of a Gen-Z Provider for the Open Fabric Interface (OFI) Libfabric environment to support existing High Performance Computing (HPC) middleware and workloads. Furthermore, Gen-Z supports the creation of either a native Gen-Z socket level transport protocol, or an emulated Ethernet driver that fits under the industry standard TCP/IP stack, both of which support existing applications written to the industry standard Sockets API. All of these

Messaging Overview Page 5 of 6 solutions can operate under unmodified operating systems (requiring no kernel changes) by simply installing the Gen-Z kernel drivers and libraries to support these models. Therefore, Gen-Z enables infrastructure solutions that gracefully transition to optimized Gen-Z messaging solutions while allowing existing applications to take advantage of Gen-Z performance, resiliency, and security features. Security Gen-Z offers a combination of hardware-enforced isolation techniques and full packet authentication to prevent errant or malicious component software or hardware from communicating with unauthorized components or accessing unauthorized resources, including Gen-Z message queue memory. In addition, for protection and network isolation, Gen-Z Fabric Management configures communications between SoCs on both a component and logical message sender and receiver basis. Gen-Z multicast mechanisms similarly restrict which components belong to various multicast groups. Summary Gen-Z technology is designed to supplement platforms and infrastructure with capabilities for a new generation of workloads that require messaging services, while delivering many of these capabilities even to existing applications and workloads via industry standard APIs and messaging stacks. These capabilities include: Next generation convergence with memory/storage data access and messaging on common infrastructure o Lower capital and operating costs with unified management High Performance o Very low latencies and high bandwidth o Robust, serial electrical/optical interface roadmap with selectable modes and link widths Hardware driven multipath for higher bandwidth and resiliency o Minimal degradation of performance in the event of failures o No redundant drivers or multipath software required Advanced switching technology facilitates messaging for enclosure and rack scale solutions o Reduces latency jitter and enables congestion avoidance using 32 virtual channels (VCs) Thin, light-weight software architecture that supports unmodified operating systems o Graceful transition from existing to next generation messaging APIs and protocol stacks o Optimized messaging for east-west traffic flows within clusters of Gen-Z infrastructure Security with authentication and features that protect and isolate an infrastructure s messaging resources

Messaging Overview Page 6 of 6 DISCLAIMER This document is provided as is with no warranties whatsoever, including any warranty of merchantability, noninfringement, fitness for any particular purpose, or any warranty otherwise arising out of any proposal, specification, or sample. Gen-Z Consortium disclaims all liability for infringement of proprietary rights, relating to use of information in this document. No license, express or implied, by estoppel or otherwise, to any intellectual property rights is granted herein. Gen-Z is a trademark or registered trademark of the Gen-Z Consortium. All other product names are trademarks, registered trademarks, or servicemarks of their respective owners.