MANAGING NODE CONFIGURATION WITH 1000S OF NODES

13th ANNUAL WORKSHOP 2017
MANAGING NODE CONFIGURATION WITH 1000S OF NODES
Ira Weiny
Intel Corp
2017

PROBLEM
    Clusters are built around individual servers.
    Linux configuration is often designed around a single desktop or a single server.
    Single-server configuration can present problems when running clusters at scale.
    Various RDMA tools now exist to aid in managing nodes.

What is it?
From the man page:
    "The IB ACM implements and provides a framework for name, address, and route (path) resolution services over InfiniBand." (and Omni-Path Architecture)
    "The IB ACM package is comprised of three components: the ibacm core service, the default provider ibacmp shared library, and a test/configuration utility - ib_acme. All three are userspace components and are available for Linux. Additional details are given below."

What is it, and why use it?
    ibacm and its companion librdmacm are essential to generic fabric operation.
    Doesn't ibacm have scalability issues? NO! Not when used properly!

Local SA daemon: now with plugins!
[Figure: ibacm plugin architecture on a node. Labels: userspace clients (verbs via rdmacm, libfabric verbs provider, PSM2, other) reaching ibacm over a socket; Node SA Proxy; ibacm included providers (core): ibacmp, ibssa; external providers: DSAP, ...; kernel users going through ib_sa; physical ports 0 to 4, including a 2-port device; one SM/SA per subnet, with prefixes 0xFE80000000000001, 0xFE80000000000002 and 0xFE80000000000003.]

Kernel Support
Local SA daemon: now with kernel support!
    Currently only supports IB Path Records
    Planned support for:
        OPA Path Records
        Multicast Records
[Figure: kernel query flow. A kernel ULP calls the SA client code (send_mad()); if the local service is enabled, the query goes up to ibacm and its provider in user space; if not, it is sent to the SA.]

Installation
How do I get it?
    Your friendly neighborhood distro: RHEL, SLES, Ubuntu
    rdma-core: https://github.com/linux-rdma/rdma-core
What version do I need?
    v1.1.0 or greater
    NOTE: in rdma-core, the ibacm package version jumps to match the rdma-core version
    Kernel support was added in 4.3
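
The two minimums named on this slide (ibacm v1.1.0, kernel 4.3) can be checked with a small helper. This is a minimal sketch, assuming GNU `sort -V` is available; `version_ge` and `kernel_has_acm_support` are hypothetical helper names, not part of rdma-core.

```shell
# version_ge A B: succeed when dotted version A is >= version B.
# Assumption: GNU coreutils `sort -V` (version sort) is available.
version_ge() {
    # The smaller version sorts first, so A >= B exactly when B comes out first.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# kernel_has_acm_support: compare the running kernel against 4.3,
# the release the slide names for kernel support.
kernel_has_acm_support() {
    # Drop everything after the numeric part, e.g. "4.18.0-305.el8" -> "4.18.0".
    kver=$(uname -r | sed 's/[^0-9.].*$//')
    version_ge "$kver" 4.3
}
```

For example, `version_ge 4.10 4.3` succeeds while `version_ge 4.2 4.3` fails; a plain string comparison would get the first case wrong.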

IBACM Plugins
IB
    acmp (ibacmp): default provider, included in the ibacm (rdma-core) package
    ibssa: external provider for opensm
OPA
    DSAP: Distributed SA Provider
        Part of the opa-ff package: https://github.com/01org/opa-ff
        RPM name: opa-address-resolution

DSAP provider (plugin)
    Provides standard Path Record lookups to librdmacm users
        Typically verbs users
    Also provides shared memory access to its internal database through libopasadb
    File provider (extension of dsap)
[Figure: DSAP on a node. Process A uses librdmacm to reach the ibacm daemon, which loads dsap; Process B (PSM) and Process C (libfabric) use the standard API of libopasadb to read the OPA SA DB in shared memory; dsap fills that database from the FM (SA), from the topology, and from the FM config (Virt Fabric).]

Ensure IPoIB is configured and running on the ports you want to control

    $ ifconfig ib0
    ib0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 65520
            inet 10.228.217.70 netmask 255.255.252.0 broadcast 10.228.219.255
            inet6 fe80::211:7501:165:abf8 prefixlen 64 scopeid 0x20<link>
    Infiniband hardware address can be incorrect! Please read BUGS section in ifconfig(8).
            infiniband 80:00:00:0E:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00 txqueuelen 256 (InfiniBand)
            RX packets 8 bytes 504 (504.0 B)
            RX errors 0 dropped 0 overruns 0 frame 0
            TX packets 19 bytes 3122 (3.0 KiB)
            TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
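
On a large cluster the IPoIB check above is easier to script than to read by eye. The following is a minimal sketch, assuming the net-tools 2.x `ifconfig` output format shown on the slide (an `inet <address>` line); `ipoib_ipv4` and `ipoib_ready` are hypothetical helper names.

```shell
# ipoib_ipv4: read `ifconfig <iface>` output on stdin and print the first
# IPv4 address, or nothing if the interface has none.
# Assumption: the "inet <addr> netmask ..." layout shown on the slide.
ipoib_ipv4() {
    awk '$1 == "inet" { print $2; exit }'
}

# ipoib_ready IFACE: succeed when IFACE exists and has an IPv4 address.
ipoib_ready() {
    [ -n "$(ifconfig "$1" 2>/dev/null | ipoib_ipv4)" ]
}
```

A node-side check could then be as short as `ipoib_ready ib0 || echo "ib0 has no IPv4 address"`.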

Configure your providers
<edit /etc/rdma/ibacm_opts.cfg to assign providers to specific subnets>

    # provider:
    # Specifies the provider to assign to each subnet
    # ACM providers may override the address and route resolution
    # protocols with provider specific protocols.
    # provider name (prefix | default)
    # Example:
    provider dsap 0xFE80000000000000
    provider ibacmp default

If the config file does not yet exist, use ib_acme to generate one:

    $ ib_acme -O /etc/rdma/ibacm_opts.cfg
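
When the same provider assignment has to land on thousands of nodes, an idempotent edit is safer than appending blindly. The following is a minimal sketch; `add_provider` is a hypothetical helper, and it only appends lines in the `provider name prefix` form shown above. It does not remove or reconcile conflicting entries.

```shell
# add_provider FILE NAME PREFIX: append "provider NAME PREFIX" to FILE
# unless an identical line is already present.
add_provider() {
    file=$1
    line="provider $2 $3"
    # -x matches the whole line, -F treats the pattern as a fixed string.
    if ! grep -qxF "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}
```

Used as `add_provider /etc/rdma/ibacm_opts.cfg dsap 0xFE80000000000000`, it can be run repeatedly from a provisioning script without duplicating the entry. The daemon still has to be restarted for the change to take effect.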

Ensure ibacm is configured to run on startup and restart the daemon

    $ systemctl enable ibacm
    $ systemctl restart ibacm

Test a query

    $ ib_acme -f lid -s 2 -d 1
    Service: localhost
    Destination: 1
    Source: 2
    Path information
      dgid: fe80::11:7501:165:ae8b
      sgid: fe80::11:7501:165:abf8
      dlid: 1
      slid: 2
      flow label: 0x0
      hop limit: 0
      tclass: 0
      reversible: 1
      pkey: 0x8001
      sl: 0
      mtu: 7
      rate: 16
      packet lifetime: 15
    return status 0x0
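
To use the test query in an automated health check, the interesting fields can be pulled out of the `ib_acme` output. This is a minimal sketch, assuming the `name: value` layout shown on the slide; `path_field` is a hypothetical helper, and the field name is inserted into a sed pattern as-is, so it should not contain regular-expression metacharacters or `/`.

```shell
# path_field NAME: read ib_acme output on stdin and print the value of the
# first "NAME: value" line, for example dlid, pkey or "flow label".
# Assumption: one field per line, optionally indented, as on the slide.
path_field() {
    sed -n "s/^[[:space:]]*$1:[[:space:]]*//p" | head -n 1
}
```

For example, `ib_acme -f lid -s 2 -d 1 | path_field pkey` would print the partition key of the resolved path.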

Verbs Access
    Use librdmacm for connection management.
    ibacm is automatically contacted from the library, and cached path records are utilized!

PSM2 access
Configure PSM2 to use Path Record queries
<add the following to mpi_run scripts or otherwise add the environment variable setting>

    export PSM2_PATH_REC=OPP
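
Where exporting the variable globally is not wanted, it can be scoped to a single command. This is a minimal sketch; `run_with_opp` is a hypothetical wrapper name, and whether the launcher forwards the variable to remote ranks depends on the MPI launcher in use.

```shell
# run_with_opp CMD [ARGS...]: run CMD with PSM2_PATH_REC=OPP set only for
# that command, leaving the caller's environment untouched.
run_with_opp() {
    env PSM2_PATH_REC=OPP "$@"
}
```

A job script would then call, for example, `run_with_opp ./my_mpi_app`, where `my_mpi_app` stands in for the real binary.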

What happens if ibacm and/or dsap is not running/configured?

    phcppriv12.ph.intel.com.75727ips_opp_init: PSM path record queries using OFED Plus Plus (libopasadb.so.1.0.0) from /lib64/libopasadb.so.1.0.0
    2017-03-12 23:46:23 ERROR: Unable to open shared memory table.
    phcppriv12.ph.intel.com.75727unable to initialize OFED Plus library successfully. (err=51)
    phcppriv12.ph.intel.com.75727mpi_stress: OPP: Unable to obtain OPP context. Disabling OPP interface for path record queries.
    phcppriv12.ph.intel.com.75727mpi_stress: Make sure SM is running...
    phcppriv12.ph.intel.com.75727mpi_stress: Make sure service ibacm is running...
    phcppriv12.ph.intel.com.75727mpi_stress: to start ibacm: service ibacm start
    phcppriv12.ph.intel.com.75727mpi_stress: or enable it at boot time: opaconfig -E ibacm
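
Because the job keeps running after this fallback, the messages are easy to miss in a large job's output. The following is a minimal sketch of a check that scans captured job output for the fallback message quoted above; `opp_disabled` is a hypothetical helper name, and the match string is taken verbatim from the log on this slide, so other PSM2 versions may word it differently.

```shell
# opp_disabled: read job output on stdin; succeed if PSM2 reported that it
# disabled the OPP interface for path record queries.
# Assumption: the message text matches the log shown on the slide.
opp_disabled() {
    grep -q 'Disabling OPP interface for path record queries'
}
```

For example, `opp_disabled < job.out && echo "check ibacm/dsap on the job nodes"`, where `job.out` is a placeholder for the captured output file.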

Other Details
Log files
    /var/log/ibacm.log
Config files
    /etc/rdma/ibacm_opts.cfg
    /etc/rdma/ibacm_addr.cfg

Further work
Kernel support
    Add support for OPA Path Records
    Add support for Multicast Records
    Configure start up to ensure ibacm can run prior to kernel ULPs
ibacm/providers
    Multicast group operations (join/leave)
    Pre-joined groups
    Multicast Records
    Preconfigured topologies
    Remove IPoIB dependency

RDMA-NDD

RDMA-NDD
Keeping nodes named so you don't have to
Setting node descriptions has a long history
    First there were scripts to be run at start up: set_nodedesc.sh (2007)
    RHEL udev rule (RHEL 6 time?)
    Truescale IFS: hard coded in driver
    rdma-ndd (2014)
rdma-ndd was built to be flexible and robust
    Watches for hostname changes
    Watches for device adds
    Responds to these events by setting the node description

RDMA-NDD
Where do I get it?
    Your friendly neighborhood distro!
        At least some distros have integrated it with their udev rules
        NOTE: it is separated into its own rpm: rdma-ndd-*.rpm
    Upstream
        infiniband-diags > v1.6.5 (soon to be removed)
        rdma-core-12 or greater

RDMA-NDD Configuration
Configuration differs depending on the version
    The infiniband-diags version uses the ibdiag.conf file
    The rdma-core version is configured within systemd
The format can take 2 wild cards
    %h => replace with the hostname
    %d => replace with the device name
    Default is %h %d
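
The effect of the two wild cards can be previewed before rolling a format out to a cluster. This is a minimal sketch of the substitution only, not rdma-ndd's actual code; `expand_nodedesc` is a hypothetical helper, and it assumes the hostname and device name contain no `/` or `&` characters, which sed would treat specially.

```shell
# expand_nodedesc FORMAT HOSTNAME DEVICE: print FORMAT with every %h replaced
# by HOSTNAME and every %d replaced by DEVICE.
# Assumption: HOSTNAME and DEVICE contain no "/" or "&".
expand_nodedesc() {
    printf '%s\n' "$1" | sed -e "s/%h/$2/g" -e "s/%d/$3/g"
}
```

With the default format, `expand_nodedesc '%h %d' node001 hfi1_0` yields a description made of the hostname followed by the device name; `node001` and `hfi1_0` are example values.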

SUMMARY
Two daemons are now standard as part of rdma-core:
    ibacm
    rdma-ndd
These daemons make it easier for clusters of nodes to be managed together.

DISCLAIMERS
No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.

This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps.

The products and services described may contain defects or errors known as errata which may cause deviations from published specifications. Current characterized errata are available on request.

You may not use or facilitate the use of this document in connection with any infringement or other legal analysis concerning Intel products described herein. You agree to grant Intel a non-exclusive, royalty-free license to any patent claim thereafter drafted which includes subject matter disclosed herein.

Forecasts: Any forecasts of requirements for goods and services are provided for discussion purposes only. Intel will have no liability to make any purchase pursuant to forecasts. Any cost or expense you incur to respond to requests for information or in reliance on any forecast will be at your own risk and expense.

Business Forecast: Statements in this document that refer to Intel's plans and expectations for the quarter, the year, and the future, are forward-looking statements that involve a number of risks and uncertainties. A detailed discussion of the factors that could affect Intel's results and plans is included in Intel's SEC filings, including the annual report on Form 10-K.

Copies of documents which have an order number and are referenced in this document may be obtained by calling 1-800-548-4725 or by visiting www.intel.com/design/literature.htm.

Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries.
*Other names and brands may be claimed as the property of others.
Intel Corporation.