Meltdown and Spectre Interconnect Evaluation
January 2018
Meltdown and Spectre - Background

Most modern processors perform speculative execution. The effects of this speculation can be measured through side channels, disclosing information from protected data regions. The Meltdown and Spectre attacks exploit the speculative execution process to gain access to restricted or confidential information.

The Meltdown and Spectre fixes cause processor performance degradation that depends on the workload:

- Large: 8% to 19%+ - Highly cached random memory access with buffered I/O, OLTP database workloads, and benchmarks with a high rate of kernel-to-user space transitions. Examples include OLTP workloads and random I/O to NVMe.
- Modest: 3% to 7% - Database analytics and Java VMs. These applications may have significant sequential disk or network traffic, but the kernel and device drivers are able to aggregate requests, keeping kernel-to-user transitions at a moderate rate.
- Small: 2% to 5% - Workloads that spend little time in the kernel: jobs that run mostly in user space and are scheduled using CPU pinning or NUMA control. Examples include Linpack NxN on x86 and SPECcpu2006.
- Minimal to none - Accelerator technologies that bypass the kernel in favor of direct user access. Examples include DPDK, RDMA, and other kernel-bypass offloads.

The following slides provide the measured performance impact on various interconnect technologies.
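The large and modest impact categories above are driven by the cost of kernel-to-user transitions, which the Meltdown/Spectre mitigations make more expensive. The sketch below is a minimal, illustrative microbenchmark (not part of the original evaluation) that times a cheap syscall in a tight loop; running it on a patched and an unpatched kernel gives a rough feel for the added per-transition overhead.

```c
/* Minimal syscall-overhead sketch (illustrative only, not from the original
 * evaluation). Times repeated getpid() calls, which spend almost all of their
 * cost in the user/kernel transition itself. Build: gcc -O2 syscall_bench.c */
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

int main(void)
{
    const long iters = 10 * 1000 * 1000;
    struct timespec start, end;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < iters; i++)
        syscall(SYS_getpid);          /* force a real kernel entry each time */
    clock_gettime(CLOCK_MONOTONIC, &end);

    double ns = (end.tv_sec - start.tv_sec) * 1e9 +
                (end.tv_nsec - start.tv_nsec);
    printf("%.1f ns per syscall\n", ns / iters);
    return 0;
}
```

syscall(SYS_getpid) is used instead of getpid() because some glibc versions cache the PID in user space, which would hide the transition cost being measured.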
Offload Interconnect Ensures Highest Performance
RDMA and Kernel Bypass are Critical to Ensure the Highest Performance and ROI

- Offload-based interconnect impact: 0%
- Onload-based interconnect impact: up to -47%

Offload-based interconnect technologies bypass the kernel and therefore maintain the best performance. Examples: RDMA (InfiniBand and Ethernet), RDMA-based NVMe-over-Fabrics, and other interconnect offloads.

Onload-based interconnect performance is negatively impacted by the Meltdown and Spectre fixes. This includes TCP/IP over Ethernet, OmniPath, and other onload-based products.
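The offload path referred to above uses the libibverbs user-space API rather than kernel sockets, so the data path (posting sends/receives and polling completions) runs without a kernel transition per message. The sketch below is a small, illustrative check, not part of the original evaluation, that simply enumerates the RDMA-capable devices on a node through that API; it assumes the rdma-core (libibverbs) package is installed.

```c
/* Illustrative sketch only: enumerate RDMA-capable (offload) devices through
 * the libibverbs user-space API. The data-path verbs then run entirely in
 * user space, with no kernel transition per message.
 * Build: gcc rdma_list.c -libverbs */
#include <stdio.h>
#include <infiniband/verbs.h>

int main(void)
{
    int num = 0;
    struct ibv_device **devs = ibv_get_device_list(&num);
    if (!devs || num == 0) {
        fprintf(stderr, "no RDMA devices found\n");
        return 1;
    }
    for (int i = 0; i < num; i++)
        printf("RDMA device: %s\n", ibv_get_device_name(devs[i]));
    ibv_free_device_list(devs);
    return 0;
}
```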
RoCE vs. TCP Results

- RoCE impact: 0%
- TCP impact: -47%

Before = before applying software patches; After = after applying software patches.
InfiniBand vs. OmniPath Results

- InfiniBand impact: 0%
- OmniPath impact: -26%

Before = before applying software patches; After = after applying software patches.
NVMe-over-Fabrics Results - RDMA Guarantees the Highest NVMe-oF Performance

- RDMA-based NVMe-oF impact: 0%

Before = before applying software patches; After = after applying software patches.
Data Streaming Results - Video Streaming (64 Streams = 96 Gb/s)

Number of CPU cores needed for the same throughput (lower is better):
- Kernel streaming: 21 cores before the patches, 31 cores after (-44% impact)
- Mellanox Interconnect Streaming (VMA): less than 1 core both before and after the patches

With VMA, 30 fewer CPU cores are needed for the same throughput.

Before = before applying software patches; After = after applying software patches.
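The VMA results above come from Mellanox's Messaging Accelerator, an LD_PRELOAD library that intercepts standard socket calls in user space. The sketch below is a hypothetical minimal UDP sender (not the benchmark used in the slides, and the address and port are placeholders) showing that the application needs no code changes: the same binary runs through the kernel stack, or through the kernel-bypass path when launched with LD_PRELOAD=libvma.so.

```c
/* Minimal UDP sender sketch (hypothetical example, not the benchmark from the
 * slides). Uses only standard socket calls, so the same binary can run through
 * the kernel stack or, with LD_PRELOAD=libvma.so, through VMA's kernel-bypass
 * path. Build: gcc -O2 udp_send.c */
#include <arpa/inet.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) { perror("socket"); return 1; }

    struct sockaddr_in dst = { .sin_family = AF_INET,
                               .sin_port   = htons(5001) };      /* placeholder port */
    inet_pton(AF_INET, "192.168.1.10", &dst.sin_addr);           /* placeholder address */

    char payload[1400];
    memset(payload, 0, sizeof(payload));
    for (int i = 0; i < 1000000; i++)
        sendto(fd, payload, sizeof(payload), 0,
               (struct sockaddr *)&dst, sizeof(dst));

    close(fd);
    return 0;
}
```

Run as ./udp_send for the kernel path, or LD_PRELOAD=libvma.so ./udp_send for the VMA path.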
Setup Information

RoCE and TCP:
- CPU: Intel Xeon CPU E5-2697A v4 x86_64 @ 2.60GHz
- Operating System: Red Hat Enterprise Linux Server 7.4
- Kernel Versions: 3.10.0-693.11.6.el7.x86_64 (patched), 3.10.0-693.el7.x86_64 (unpatched)
- Description: gen-l-vrt-149_gen-l-vrt-159 kernel-3.10.0-693.11.6.el7.x86_64 b2b
- Adapter: ConnectX-5, Firmware 16.22.0170, Driver MLNX_OFED_LINUX-4.3-0.0.5.0

InfiniBand and OmniPath:
- CPU: Intel Xeon Gold 6138 CPU @ 2.00GHz
- Operating System: Red Hat Enterprise Linux Server 7.4
- Kernel Versions: 3.10.0-693.el7.x86_64 (unpatched), 3.10.0-693.11.6.el7.x86_64 (patched)
- OPA driver: IntelOPA-IFS.RHEL74-x86_64.10.6.1.0.2
- InfiniBand EDR driver: MLNX_OFED 4.2

NVMe-over-Fabrics:
- Adapter: ConnectX-5
- CPU: Intel Xeon CPU E5-2690 v3, RAM 64G
- Operating System: Red Hat Enterprise Linux Server 7.4
- OFED: 4.3-0.1.0.0
- Kernel: 3.10.0-693.11.6.el7.x86_64
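When reproducing a before/after comparison like the one above, it helps to confirm which kernel build a node is actually running and whether the mitigations are active. The sketch below is illustrative only (not from the slides): it prints the running kernel release and, where the kernel exposes them, the Meltdown/Spectre status files under /sys/devices/system/cpu/vulnerabilities. Those sysfs entries were introduced by the fix kernels, so they may simply be absent on an unpatched build.

```c
/* Sanity-check sketch for a test node (illustrative, not from the slides):
 * prints the running kernel release and, where available, the Meltdown/Spectre
 * mitigation status files exposed by patched kernels. Build: gcc node_check.c */
#include <stdio.h>
#include <sys/utsname.h>

static void show(const char *path)
{
    char line[256];
    FILE *f = fopen(path, "r");
    if (!f) { printf("%s: not present\n", path); return; }
    if (fgets(line, sizeof(line), f))
        printf("%s: %s", path, line);   /* line already ends with '\n' */
    fclose(f);
}

int main(void)
{
    struct utsname u;
    if (uname(&u) == 0)
        printf("kernel: %s\n", u.release);   /* e.g. 3.10.0-693.11.6.el7.x86_64 */

    show("/sys/devices/system/cpu/vulnerabilities/meltdown");
    show("/sys/devices/system/cpu/vulnerabilities/spectre_v1");
    show("/sys/devices/system/cpu/vulnerabilities/spectre_v2");
    return 0;
}
```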
Thank You