Inspur GPU Server
Inspur AI Computing Platform
- 3-GPU server: NF5280M4 (2 CPU + 3 GPU)
- 4-GPU server: NF5280M5 (2 CPU + 4 GPU)
- GPU node (2U, 4 GPU only)
- 8-GPU server: NF5288M5 (2 CPU + 8 GPU)
- 16-GPU server: SR GPU BOX (16x P40 only)
GPU server target markets

Target market: CSP
- Purpose: deep learning, real-time transcoding, VOD server
- H/W requirements: multi-GPU, P2P and RDMA support; high-capacity local storage; high performance/price ratio

Target market: HPC
- Purpose: heterogeneous computing, HPC cluster
- H/W requirements: multi-GPU/MIC, P2P and RDMA support; 100G Ethernet, InfiniBand, etc.
NVLink: high-speed interconnect
NVLink is a high-speed interconnect that replaces PCI Express to provide up to 12x faster data sharing between GPUs, with better support for GPUDirect 2.0 and a unified memory pool. Introduced with the NVIDIA P100 (Pascal).
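Whether a given GPU pair is actually linked this way can be checked from software. The following is a minimal illustrative sketch, not part of the Inspur material, using only standard CUDA runtime calls (CUDA 8.0+) to query peer-to-peer support and the relative link performance rank for every device pair:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int n = 0;
    cudaGetDeviceCount(&n);
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            if (i == j) continue;
            int access = 0, rank = 0;
            // Can device i map and access device j's memory directly?
            cudaDeviceCanAccessPeer(&access, i, j);
            // Relative performance of the link between the two devices;
            // NVLink-connected pairs rank differently from plain PCIe pairs.
            cudaDeviceGetP2PAttribute(&rank, cudaDevP2PAttrPerformanceRank, i, j);
            printf("GPU%d -> GPU%d: peer access=%s, perf rank=%d\n",
                   i, j, access ? "yes" : "no", rank);
        }
    }
    return 0;
}
```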
NVIDIA GPUDirect and Peer-to-Peer (P2P) Communication
GPUDirect advantages:
- Accelerated communication with network and storage devices
- Peer-to-peer transfers between GPUs
- Peer-to-peer memory access
- RDMA
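As a concrete illustration of the P2P path (a hedged sketch, not taken from the Inspur deck), the code below enables GPUDirect peer access between two devices and issues a direct device-to-device copy; with peer access enabled the transfer moves over NVLink or PCIe without staging through host memory:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int access = 0;
    cudaDeviceCanAccessPeer(&access, 0, 1);
    if (!access) { printf("no P2P path between GPU0 and GPU1\n"); return 1; }

    const size_t bytes = 64 << 20;        // 64 MiB test buffer
    void *src = nullptr, *dst = nullptr;

    cudaSetDevice(0);
    cudaDeviceEnablePeerAccess(1, 0);     // GPU0 may access GPU1's memory
    cudaMalloc(&src, bytes);

    cudaSetDevice(1);
    cudaDeviceEnablePeerAccess(0, 0);     // GPU1 may access GPU0's memory
    cudaMalloc(&dst, bytes);

    // Direct GPU0 -> GPU1 transfer, no host-memory staging.
    cudaMemcpyPeer(dst, 1, src, 0, bytes);
    cudaDeviceSynchronize();
    printf("peer copy of %zu bytes done\n", bytes);
    return 0;
}
```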
NF5288M4 vs NF5288M5

CPU
- NF5288M4: 2x Intel Xeon E5-2600 v4 series processors, TDP up to 145W
- NF5288M5: 2x SKL-EP CPUs, TDP up to 165W

Memory
- NF5288M4: 16x DDR4 DIMMs per node, supporting DDR4-2400
- NF5288M5: 16x DDR4 DIMMs per node, 4x Apache Pass supported

PCIe I/O
- NF5288M4: 8x PCIe 3.0 x16 (x8 link) or 4x PCIe 3.0 x16 (x16 link), plus 1x PCIe 3.0 x24; 1x PCIe 3.0 x8 mezzanine RAID
- NF5288M5: 2x PCIe 3.0 x16 HHHL front slots

Storage
- NF5288M4: 8x 3.5"/2.5" SAS/SATA/SSD
- NF5288M5: storage controller; 8x 2.5" U.2 (SFF-8639); 2x M.2 (PCIe & SATA) on board

GPU support
- NF5288M4: up to 4x GPU and MIC accelerator cards
- NF5288M5: up to 8x 300W GPU/SXM2

System fans
- NF5288M4: redundant hot-swap system fans, air cooling
- NF5288M5: redundant hot-swap system fans, air or hybrid cooling

PSU
- NF5288M4: 2x 1620/2000W PSU, 80 PLUS Platinum
- NF5288M5: 2x 3000W PSU, 80 PLUS Titanium
NF5288M5: GPU server for Purley
- 2U server for HPC and machine learning
- 2x SKL-EP processors, TDP up to 165W; SKL-F SKUs supported
- Supports 8x Xeon Phi/GPU in a 2U chassis; both PCIe AIC and SXM2 form factors are supported
- 8x 2.5" U.2 storage bays
- GPU TDP up to 300W
- 3000W 1+1 PSU, 80 PLUS Titanium
- Sample: 2017.04; mass production: 2017.08
- Offered in SXM2 and PCIe AIC configurations
NF5288M5 chassis views
- Front view / front I/O: 2x PCIe x16 HHHL slots, 8x 2.5" U.2 bays, 2x 3000W PSUs
- Rear view / rear I/O: 4x PCIe x16 HHHL slots (SXM2 configuration only), 2x C20 power connectors, 4x 10G Ethernet ports
SXM2 configuration
- 8x NVIDIA SXM2 GPUs
- 4x PCIe x16 HHHL slots
- 2x front PCIe x16 expansion slots
- Liquid cooling connector (optional)
- 5x redundant dual-rotor fans
- 2x Skylake CPUs, 165W TDP
- 16x DDR4-2400 DIMMs
- 899.5mm chassis depth
NF5288M5 SXM2 liquid cooling solution (diagram: water-in/water-out loop through the LQ2 and LQ4 connectors)
8x SXM2 GPU topology on NF5288M5 (diagram): CPU0 and CPU1 are linked by UPI, each feeding a front PCIe x16 slot, with the RAID mezzanine and 8x U.2 bays attached to the CPUs; 96-lane PCIe switches fan out to four rear PCIe x16 slots and the eight SXM2 GPUs (0/2/4/6 and 1/3/5/7), which are interconnected over NVLink.
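The practical consequence of this layout is that NVLink-reachable pairs sustain much higher transfer rates than pairs that must cross the PCIe switches. The sketch below is illustrative only (which device indices share NVLink depends on the actual board wiring); it times one peer copy between two GPUs and reports the achieved GB/s:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Time a single peer-to-peer copy between two GPUs and return GB/s.
// If peer access cannot be enabled, the copy silently falls back to
// staging through the host, which shows up as a much lower figure.
static float p2pGBs(int src, int dst, size_t bytes) {
    void *s = nullptr, *d = nullptr;
    cudaSetDevice(src);
    cudaDeviceEnablePeerAccess(dst, 0);
    cudaMalloc(&s, bytes);
    cudaSetDevice(dst);
    cudaDeviceEnablePeerAccess(src, 0);
    cudaMalloc(&d, bytes);

    cudaSetDevice(src);
    cudaEvent_t t0, t1;
    cudaEventCreate(&t0);
    cudaEventCreate(&t1);
    cudaEventRecord(t0);
    cudaMemcpyPeerAsync(d, dst, s, src, bytes);
    cudaEventRecord(t1);
    cudaEventSynchronize(t1);
    float ms = 0.f;
    cudaEventElapsedTime(&ms, t0, t1);

    cudaSetDevice(src); cudaFree(s);
    cudaSetDevice(dst); cudaFree(d);
    return ((float)bytes / 1.0e9f) / (ms / 1.0e3f);
}

int main() {
    // Pairings are assumptions for illustration: e.g. GPU0/GPU1 on NVLink
    // versus GPU0/GPU7 across the PCIe switch fabric.
    printf("GPU0 -> GPU1: %.1f GB/s\n", p2pGBs(0, 1, 256 << 20));
    printf("GPU0 -> GPU7: %.1f GB/s\n", p2pGBs(0, 7, 256 << 20));
    return 0;
}
```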
PCIe AIC configuration
- 8x dual-slot PCIe x16 for Xeon Phi/GPU coprocessors
- GPU maintenance handle; 4-cards-per-group design
- 2x front PCIe x16 expansion slots
- 5x redundant dual-rotor fans
- 2x Skylake CPUs, 165W TDP
- 16x DDR4-2400 DIMMs
- 899.5mm chassis depth
Flexible topology in the 8x PCIe GPU configuration (diagrams: CPU0 and CPU1 linked by UPI, with PCIe switches between the slimline x16 ports and the GPU slots)
- Proposal A: all GPUs in one PCIe domain; RAID mezzanine; 2x HHHL PCIe x16 in front; 8x U.2
- Proposal B: high Xeon Phi/GPU-to-CPU ratio; RAID mezzanine; 2x HHHL PCIe x16 in front, or 1x HHHL PCIe x16 + 4x U.2, or 8x U.2
- Proposal C: more expandability and higher CPU-to-GPU bandwidth; RAID mezzanine; 2x HHHL PCIe x16 in front; 8x U.2
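Which proposal a shipped system implements is visible from software: GPUs that hang off the same PCIe switch show up under adjacent bus numbers. A small hedged sketch (standard CUDA runtime call only) that dumps each GPU's PCIe address:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int n = 0;
    cudaGetDeviceCount(&n);
    char busId[32];
    for (int d = 0; d < n; ++d) {
        // PCIe identifier in domain:bus:device.function form; cards behind
        // the same switch cluster into neighbouring bus numbers.
        cudaDeviceGetPCIBusId(busId, (int)sizeof(busId), d);
        printf("GPU%d @ %s\n", d, busId);
    }
    return 0;
}
```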
AGX-2 supports different GPU cards
- 8x NVIDIA Tesla P100 cards (built-in NVIDIA NVLink)
- 8x NVIDIA Tesla P100/P40/P4 cards (PCIe interface)
GX4 GPU BOX
- 4x GPUs
- PCIe expansion motherboard
- 2x 1600W power supplies
- NVMe SSD expansion
- Efficient thermal fans
- PCIe switch chip
- PCIe x16 expansion
GX4: GPU resource decoupling and pooling
- Partitioned design: CPU server plus GPU box
- Flexible topology and high scalability: scale up from 8x to 16x GPUs, then scale out
- Efficient data communication and high TCO benefit
GX4 flexible GPU topology (diagrams: CPU server with CPU0 and CPU1 linked by UPI, two PCIe switches, GPUs 0-3)
- Balanced: public cloud service, small-scale model training
- Common: deep learning model training
- Cascaded: deep learning model training with enhanced GPU P2P function
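In the deep-learning training cases above, a collective-communication library routes traffic along whatever P2P topology the box exposes. A minimal sketch (assuming the NCCL library is installed; one process driving four GPUs, buffer size chosen arbitrarily) of a gradient-style all-reduce across a GX4's GPUs:

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <nccl.h>

int main() {
    const int nDev = 4;                 // one GX4 box worth of GPUs
    const size_t count = 1 << 20;       // floats per device (illustrative)
    int devs[nDev] = {0, 1, 2, 3};
    ncclComm_t comms[nDev];
    float* buf[nDev];
    cudaStream_t streams[nDev];

    // One communicator per device, all within a single process.
    ncclCommInitAll(comms, nDev, devs);
    for (int i = 0; i < nDev; ++i) {
        cudaSetDevice(i);
        cudaMalloc(&buf[i], count * sizeof(float));
        cudaMemset(buf[i], 0, count * sizeof(float));
        cudaStreamCreate(&streams[i]);
    }

    // Sum the buffers across all GPUs in place; NCCL picks P2P paths
    // over the PCIe switches where they exist.
    ncclGroupStart();
    for (int i = 0; i < nDev; ++i)
        ncclAllReduce(buf[i], buf[i], count, ncclFloat, ncclSum,
                      comms[i], streams[i]);
    ncclGroupEnd();

    for (int i = 0; i < nDev; ++i) {
        cudaSetDevice(i);
        cudaStreamSynchronize(streams[i]);
        ncclCommDestroy(comms[i]);
    }
    printf("all-reduce across %d GPUs complete\n", nDev);
    return 0;
}
```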
TCO benefit: traditional GPU cluster framework (diagram: four dual-CPU servers connected through an IB switch)
Purchase cost for 16 GPUs, with large-scale I/O redundancy:
- 4 sets of CPU + memory + storage
- 4x IB cards
- 1x IB switch
High TCO benefit: GPU BOX, 16 cards in one system (diagram: a single CPU server, CPU0 and CPU1 linked by UPI, four PCIe switches each carrying GPUs 0-3)
- GPU communication needs no network protocol conversion, cutting I/O redundancy by 50%+
- Compared with the traditional framework, purchase cost drops by $15,000+
Purchase cost for 16 GPUs, with lower I/O redundancy:
- 1 set of CPU + memory + storage
- 0x IB cards
- 0x IB switch
GX4 supports a full range of PCIe accelerators
Supports various GPU, FPGA, KNL, and other PCIe cards, and reserves an NVMe pooling function:
- Fast data swapping in memory for large amounts of training data
- Higher efficiency for DL inference
- Higher price/performance ratio for HPC
- Better TCO for inference
Supported cards: NVIDIA Tesla P100, NVIDIA Tesla P40, NVIDIA Tesla P4, Intel KNL, FPGA
GX4 Specifications

Head node (NF5280M5)
- CPU: supports 2x Intel next-generation processor platform (Skylake)
- Memory: 24x DDR4 DIMMs and 12x Apache Pass; supports RDIMM, LRDIMM, NVDIMM; supports 2400 and 2666 MT/s
- Storage: up to 12x 3.5" + 4x 2.5", including 3 front NVMe; or up to 24x 2.5" + 4x 2.5" + 4x 3.5", including 6 front NVMe
- PCIe: supports up to 4 GPU BOXes

GPU BOX (SF0204P1)
- GPU: 4x PCIe cards (P100/P40/P4/KNL/FPGA)
- Size: 435mm x 87.5mm x 740mm
- Management chip: AST2500, BCM58522
- U.2: 16x direct-connect U.2 (without GPUs)
- PCIe I/O: 1x standard PCIe x16 slot; 4x mini PCIe x4 cable ports
- Management: RJ45 management port, serial port
- Power supply: 1600W 1+1 redundant
- Outlet: rear-end outlet