NVMe SSD Performance Evaluation Guide for Windows Server 2016 and Red Hat Enterprise Linux 7.4

Similar documents
NVMe Performance Testing and Optimization Application Note

NUMA Topology for AMD EPYC Naples Family Processors

Microsoft Windows 2016 Mellanox 100GbE NIC Tuning Guide

Memory Population Guidelines for AMD EPYC Processors

Linux Network Tuning Guide for AMD EPYC Processor Based Servers

EPYC VIDEO CUG 2018 MAY 2018

Intelligent Tiered Storage Acceleration Software for Windows 10

Fan Control in AMD Radeon Pro Settings. User Guide. This document is a quick user guide on how to configure GPU fan speed in AMD Radeon Pro Settings.

Changing your Driver Options with Radeon Pro Settings. Quick Start User Guide v2.1

AMD Radeon ProRender plug-in for Unreal Engine. Installation Guide

Linux Network Tuning Guide for AMD EPYC Processor Based Servers

Driver Options in AMD Radeon Pro Settings. User Guide

Performance Tuning Guidelines for Low Latency Response on AMD EPYC -Based Servers Application Note

Java Application Performance Tuning for AMD EPYC Processors

AMD NVMe/SATA RAID Quick Start Guide for Windows Operating Systems

Changing your Driver Options with Radeon Pro Settings. Quick Start User Guide v3.0

White Paper AMD64 TECHNOLOGY SPECULATIVE STORE BYPASS DISABLE

AMD EPYC and NAMD Powering the Future of HPC February, 2019

Thermal Design Guide for Socket SP3 Processors

AMD EPYC Processors Showcase High Performance for Network Function Virtualization (NFV)

Solid State Graphics (SSG) SDK Setup and Raw Video Player Guide

FOR ENTERPRISE 18.Q3. August 8 th, 2018

Forza Horizon 4 Benchmark Guide

Enhance your Cloud Security with AMD EPYC Hardware Memory Encryption

Intel Server Board S2600CW2S

Your Challenges, Our Solutions. The NAS and/or laptop storage are not enough; how can I upgrade once and expand all?

* ENDNOTES: RVM-26 AND RZG-01.

VMware vsphere 6.5. Radeon Pro V340 MxGPU Deployment Guide for. Version 1.0

Hyper-converged infrastructure with Proxmox VE virtualization platform and integrated Ceph Storage.

amdgpu Graphics Stack Documentation

Lenovo XClarity Provisioning Manager User Guide

Implementing IBM Easy Tier with IBM Real-time Compression IBM Redbooks Solution Guide

AMD Radeon ProRender plug-in for Universal Scene Description. Installation Guide

Samsung PM1725a NVMe SSD

INTRODUCTION TO OPENCL TM A Beginner s Tutorial. Udeepta Bordoloi AMD


SAMSUNG ELECTRONICS RESERVES THE RIGHT TO CHANGE PRODUCTS, INFORMATION AND SPECIFICATIONS WITHOUT NOTICE.

AMD IOMMU VERSION 2 How KVM will use it. Jörg Rödel August 16th, 2011

Intel Server Board S5520HC

FAQs HP Z Turbo Drive Quad Pro

Changpeng Liu, Cloud Software Engineer. Piotr Pelpliński, Cloud Software Engineer

AMD EPYC Delivers Linear Scalability for Docker with Bare-Metal Performance

Getting Started with InfoSphere Streams Quick Start Edition (VMware)

ADVANCED RENDERING EFFECTS USING OPENCL TM AND APU Session Olivier Zegdoun AMD Sr. Software Engineer

The devices can be set up with RAID for additional performance and redundancy using software RAID. Models HP Z Turbo Drive Quad Pro 2x512GB PCIe SSD

CSCS HPC storage. Hussein N. Harake

IBM MQ Appliance HA and DR Performance Report Model: M2001 Version 3.0 September 2018

IBM MQ Appliance Performance Report Version June 2015

OpenMPDK and unvme User Space Device Driver for Server and Data Center

Maximizing Six-Core AMD Opteron Processor Performance with RHEL

CAUTIONARY STATEMENT 1 EPYC PROCESSOR ONE YEAR ANNIVERSARY JUNE 2018

Optimizing SDS for the Age of Flash. Krutika Dhananjay, Raghavendra Gowdappa, Manoj Hat

Accelerate Finger Printing in Data Deduplication Xiaodong Liu & Qihua Dai Intel Corporation

Intel Server Board S1200BTS

User Guide. Storage Executive. Introduction. Storage Executive User Guide. Introduction

QuickSpecs. PCIe Solid State Drives for HP Workstations

Samsung V-NAND SSD 970 EVO Plus

SSDs Going Mainstream Do you believe? Tom Rampone Intel Vice President/General Manager Intel NAND Solutions Group

Intel Atom Processor Based Platform Technologies. Intelligent Systems Group Intel Corporation

Crusoe Processor Model TM5800

IBM MQ Appliance HA and DR Performance Report Version July 2016

Falcon: Scaling IO Performance in Multi-SSD Volumes. The George Washington University

MxGPU Setup Guide with VMware

Clear CMOS after Hardware Configuration Changes


StarWind Virtual SAN Configuring HA Shared Storage for Scale-Out File Servers in Windows Server 2012R2

21 TB Data Warehouse Fast Track for Microsoft SQL Server 2014 Using the PowerEdge R730xd Server Deployment Guide

Changpeng Liu. Senior Storage Software Engineer. Intel Data Center Group

SCALING DGEMM TO MULTIPLE CAYMAN GPUS AND INTERLAGOS MANY-CORE CPUS FOR HPL

Radeon Pro Software: Radeon Pro ReLive. User Guide v3.0

Building AMD64 Applications with the Microsoft Platform SDK. Developer Application Note

Family 15h Models 00h-0Fh AMD FX -Series Processor Product Data Sheet

Setting up the DR Series System on Acronis Backup & Recovery v11.5. Technical White Paper

Samsung SSD Data Migration v.3.1. User Manual

Optimizing Fusion iomemory on Red Hat Enterprise Linux 6 for Database Performance Acceleration. Sanjay Rao, Principal Software Engineer

High speed transfer on PCs

Deep Learning Performance and Cost Evaluation

SUPPORT MATRIX. HYCU OMi Management Pack for Citrix

FuzeDrive. User Guide. for Microsoft Windows 10 x64. Version Date: June 20, 2018

Identifying Performance Bottlenecks with Real- World Applications and Flash-Based Storage

Intel Solid State Drive 660p Series

Lenovo PM963 NVMe Enterprise Value PCIe SSDs Product Guide

AMD Processor Recognition. Application Note

Intel Server Board S1200V3RPO Intel Server System R1208RPOSHORSPP

Intel Server Board S2400SC

Family 15h Models 00h-0Fh AMD Opteron Processor Product Data Sheet

Samsung V-NAND SSD 860 PRO

Kingston SSD Manager. User Guide (V )

Oracle MaxRep for SAN. Configuration Sizing Guide. Part Number E release November

Seagate Enterprise SATA SSD with DuraWrite Technology Competitive Evaluation

User Guide. Storage Executive Command Line Interface. Introduction. Storage Executive Command Line Interface User Guide Introduction

CAUTIONARY STATEMENT This presentation contains forward-looking statements concerning Advanced Micro Devices, Inc. (AMD) including, but not limited to

Deep Learning Performance and Cost Evaluation

HPE MSA 2042 Storage. Data sheet

AVR42789: Writing to Flash on the New tinyavr Platform Using Assembly

Samsung V-NAND SSD 860 EVO M.2

INTERFERENCE FROM GPU SYSTEM SERVICE REQUESTS

SSD Utility. Installation Guide. Software Version 3.n

DatasheetDirect.com. Visit to get your free datasheets. This datasheet has been downloaded by

Run Anywhere. The Hardware Platform Perspective. Ben Pollan, AMD Java Labs October 28, 2008

Transcription:

NVMe SSD Performance Evaluation Guide for Windows Server 2016 and Red Hat Enterprise Linux 7.4 Publication # 56367 Revision: 0.70 Issue Date: August 2018 Advanced Micro Devices

2018 Advanced Micro Devices, Inc. All rights reserved. The information contained herein is for informational purposes only, and is subject to change without notice. While every precaution has been taken in the preparation of this document, it may contain technical inaccuracies, omissions and typographical errors, and AMD is under no obligation to update or otherwise correct this information. Advanced Micro Devices, Inc. makes no representations or warranties with respect to the accuracy or completeness of the contents of this document, and assumes no liability of any kind, including the implied warranties of noninfringement, merchantability or fitness for particular purposes, with respect to the operation or use of AMD hardware, software or other products described herein. No license, including implied or arising by estoppel, to any intellectual property rights is granted by this document. Terms and limitations applicable to the purchase or use of AMD s products are as set forth in a signed agreement between the parties or in AMD's Standard Terms and Conditions of Sale. Trademarks AMD, the AMD Arrow logo, AMD EPYC, and combinations thereof, are trademarks of Advanced Micro Devices, Inc. Other product names used in this publication are for identification purposes only and may be trademarks of their respective companies. Windows and Windows Server are registered trademarks of Microsoft Corporation. Linux is a registered trademark of Linus Torvalds.

56367 Rev. 0.70 August 2018 NVMe SSD Performance Evaluation Guide for Windows Contents Evaluating NVMe SSD Performance using Windows Server 2016... 6 Prerequisites... 6 SSD Preconditioning... 6 Disable Write-Caching... 6 Power Options... 6 Performance Options... 7 Using Diskspd for Performance Testing... 7 Using fio for Performance Testing... 9 Evaluating NVMe Performance using Red Hat Enterprise Linux 7.4... 10 Prerequisites... 10 SSD Preconditioning... 11 Using fio for Performance Testing... 11 Performance Results... 13 Sequential Throughput Performance... 13 Random IOPS Performance... 14 Appendix A Test System Configuration... 15 Contents 3

NVMe SSD Performance Evaluation Guide for Windows 56367 Rev. 0.70 August 2018 List of Tables Table 1. Diskspd Options Used in Performance Testing... 7 Table 2. Samsung PM1725a 1.6TB (NUMA node 3, Physical Drive #1)... 8 Table 3. Micron 9200 3.2TB (NUMA node 6, Physical Drive #2)... 9 Table 4. fio Options Used in Windows... 10 Table 5. fio Options Used in Linux... 12 Table 6. Configuration of the Systems Under Test (SUTs)... 15 Table 7. Required BIOS Settings for Testing... 15 List of Figures Figure 1. Sequential Throughput Performance on Samsung PM1725a 1.6TB NVMe SSD... 13 Figure 2. Sequential Throughput Performance on Micron 9200 3.2TB NVMe SSD... 13 Figure 3. Random IOPS Performance on Samsung PM1725a 1.6TB NVMe SSD... 14 Figure 4. Random IOPS Performance on Micron 9200 3.2TB NVMe SSD... 14 4 List of Tables

56367 Rev. 0.70 August 2018 NVMe SSD Performance Evaluation Guide for Windows Revision History Date Revision Description August 2018 0.70 Initial public release. Revision History 5

NVMe SSD Performance Evaluation Guide for Windows 56367 Rev. 0.70 August 2018 Evaluating NVMe SSD Performance using Windows Server 2016 Prerequisites Evaluating NVMe performance on Windows Server 2016 requires a few modifications before benchmarks can be run. Below is a list of required prerequisites to ensure that benchmark results are accurate and repeatable. For more information on system configuration consult Appendix A. SSD Preconditioning Before running any benchmarks, it is crucial that you prepare the drive by preconditioning the drive. Preconditioning a drive is recommended to achieve sustained performance on a fresh drive. To better understand why SSD preconditioning is important, you can visit the following site: http://www.snia.org/sites/default/education/tutorials/2011/fall/solidstate/estherspanjer_the_w hy_how_ssd_performance_benchmarking.pdf. The preconditioning process utilizes three steps to ensure that benchmarking results are accurate and repeatable. It is recommended to run the following workloads with twice the advertised capacity of the SSD to guarantee that all available memory is filled with data including the factory provisioned area. Secure erase the SSD Fill SSD with 128k sequential data twice Fill the drive with 4k random data If you re running a sequential workload to estimate the read or write throughput, you may skip the last step, although it is not recommended. Disable Write-Caching To measure the true performance of the NVMe SSDs being used, it is recommended that you disable write-caching in your Windows installation. This setting can be found by: 1. Go to Device Manager 2. Under Disk Drives, right click on the device name and click on Policies 3. Under Removal Policy, choose Quick Removal 4. Click OK to save the setting. Power Options For the highest performance possible, it is important to ensure that the power plan is set to achieve the highest performance. This setting can be verified by navigating to: 6 Test System Configuration

56367 Rev. 0.70 August 2018 NVMe SSD Performance Evaluation Guide for Windows Control Panel Hardware Power Options Set to High Performance Performance Options Similarly, it is recommended that any additional visual effects be turned off. This setting is accessible by navigating to: Control Panel System and Security Advanced System Settings Performance Visual Effects and select Adjust for best performance Using Diskspd for Performance Testing Diskspd is a command line tool for storage benchmarking on Microsoft Windows that generates a variety of requests against computer files, partitions, or storage devices. Here is an example for diskspd usage and description of options used in our testing: > diskspd.exe -Suw -b4k -t16 -ag1,16,17,18,19,20,21,22,23,24,25, 26,27,28,29,30,31 -o32 -w100 -W600 -d300 #1 Table 1. Diskspd Options Used in Performance Testing Option -Suw -b4k Description Controls caching behavior. u: disable software caching w: enable writethrough (no hardware write caching) Size of block -t16 Number of threads per target -ag1,16,17,18,19,20,21,22,23,24,25,26,27,28, 29,30,31 -o32 Advanced CPU affinity Assigns threads round-robin to the CPUs provided. g1 specifies processor group 1, and 16-31 are the core numbers within that Number of outstanding I/O requests per target per thread -w100 Percentage of write requests -W60 Warm up time -d300 Test duration #1 Target (physical drive number) Revision History 7

NVMe SSD Performance Evaluation Guide for Windows 56367 Rev. 0.70 August 2018 To determine the physical drive number of the desired target SSD in Windows: 1. Open a command prompt window 2. Run diskpart. Running this command switches to diskpart s command line interface 3. Type list disk to see the disk numbers and information To determine the assigned NUMA node for each physical device in Windows: 1. Go to Control Panel 2. Open Device Manager 3. Under Disk Drives, right click on the device name and click on Properties. 4. Go to the Details tab and click on the Property field. 5. Scroll down the list and click on NUMA. The value provided is the NUMA node the device is assigned to. Use the following table to interpret the color coding used in the commands below: Color Red Green Blue Process group number Description 8 CPUs on the same NUMA node the NVMe SSD is located on 8 CPUs located on an adjacent NUMA node to the NUMA node NVMe SSD is located on, in the same process group Table 2. Samsung PM1725a 1.6TB (NUMA node 3, Physical Drive #1) Operation Sequential Read Sequential Write Random Read Random Write Command Line diskspd.exe -Suw -b128k -si128k -t16 - ag0,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31 -o32 -w0 -W600 -d300 #1 diskspd.exe -Suw -b128k -si128k -t16 - ag0,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31 -o32 -w100 -W600 -d300 #1 diskspd.exe -Suw -b4k -t16 - ag0,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31 -o32 -w0 -W600 -d300 #1 diskspd.exe -Suw -b4k -t16 - ag0,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31 -o32 -w100 -W600 -d300 #1 8 Test System Configuration

56367 Rev. 0.70 August 2018 NVMe SSD Performance Evaluation Guide for Windows Table 3. Micron 9200 3.2TB (NUMA node 6, Physical Drive #2) Operation Sequential Read Sequential Write Random Read Random Write Command Line diskspd.exe -Suw -b128k -si128k -t16 - ag1,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31 -o32 -w0 -W600 -d300 #2 diskspd.exe -Suw -b128k -si128k -t16 - ag1,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31 -o32 -w100 -W600 -d300 #2 diskspd.exe -Suw -b4k -t16 - ag1,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31 -o32 -w0 -W600 -d300 #2 diskspd.exe -Suw -b4k -t16 - ag1,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31 -o32 -w100 -W600 -d300 #2 The affinity is set based on the locality of the cores to the NUMA node the physical drive is located on. Always run your tests long enough to counter the effect of any caching (use -W for warmup time). Using fio for Performance Testing fio is a tool that can spawn threads or processes doing a particular type of I/O operation as specified by the user. One way this tool can be leverage is to use a jobfile that invokes fio as follows under the command line: > fio <jobfile> The intended parameters for the job could be included in the jobfile. Below is an example of a jobfile used for random write operation: [global] ioengine=windowsaio direct=1 iodepth=32 group_reporting=1 numjobs=16 ramp_time=600 runtime=300 [4k-ramdwr] bs=4k rw=randwrite filename=\\.\physicaldrive1 Revision History 9

NVMe SSD Performance Evaluation Guide for Windows 56367 Rev. 0.70 August 2018 Table 4. fio Options Used in Windows Option ioengine=windowsaio direct=1 iodepth=32 group_reporting=1 Numjobs=16 ramp_time=600 runtime=300 bs=4k rw=randwrite filename=\\.\physicaldrive1 Description Defines how the job issues I/O. Use Windows native asynchronous I/O Use non-buffered I/O Number of I/O units to keep in flight against the file Display per-group reports instead of per-job when numjobs is specified Number of clones (processes/threads performing the same workload) of this job fio will run the specified workload for this amount of time (in seconds) before logging any performance numbers. Useful for letting performance settle before logging results, thus minimizing the runtime required for stable results Terminate processing after the specified number of seconds Block size for I/O units Type of I/O pattern This is how you define raw devices in Windows Evaluating NVMe Performance using Red Hat Enterprise Linux 7.4 Prerequisites Evaluating NVMe performance on Red Hat Enterprise Linux 7.4 requires a few modifications before benchmarks can be run. Below is a list of required prerequisites to ensure that benchmark results are accurate and repeatable. For more information on system configuration consult Appendix A. 10 Test System Configuration

56367 Rev. 0.70 August 2018 NVMe SSD Performance Evaluation Guide for Windows SSD Preconditioning Before running any benchmarks, it is crucial that you prepare the drive by preconditioning the drive. Preconditioning a drive is recommended to achieve sustained performance on a fresh drive. To better understand why SSD preconditioning is important, you can visit the following site: http://www.snia.org/sites/default/education/tutorials/2011/fall/solidstate/estherspanjer_the_w hy_how_ssd_performance_benchmarking.pdf. The preconditioning process utilizes three steps to ensure that benchmarking results are accurate and repeatable. It is recommended to run that the following workloads be run with twice the advertised capacity of the SSD to guarantee that all available memory is filled with data including the factory provisioned area. Secure erase the SSD Fill SSD with sequential data twice Fill the drive with 4k random data twice If you re running a sequential workload to estimate the read or write throughput, you may skip the last step, although it is not recommended. Using fio for Performance Testing fio is a tool that can spawn threads or processes doing a particular type of I/O operation as specified by the user. One way this tool can be leverage is to use a jobfile that invokes fio as follows under the command line: > fio <jobfile> The intended parameters for the job could be included in the jobfile. Below is an example of a jobfile used for random write operation: [global] ioengine=libaio direct=1 iodepth=32 group_reporting=1 numjobs=16 ramp_time=600 runtime=300 [4k-ramdwr] bs=4k rw=randwrite filename=/dev/nvme[0]n1 #0 is the device identifier here Revision History 11

NVMe SSD Performance Evaluation Guide for Windows 56367 Rev. 0.70 August 2018 Table 5. fio Options Used in Linux Option ioengine=libaio direct=1 iodepth=32 group_reporting=1 numjobs=16 ramp_time=600 runtime=300 bs=4k rw=randwrite filename=/dev/nvme[device identifier] n1 Defines how the job issues I/O. Use non-buffered I/O Description Number of I/O units to keep in flight against the file Display per-group reports instead of per-job when numjobs is specified Number of clones (processes/threads performing the same workload) of this job fio will run the specified workload for this amount of time before logging any performance numbers. Useful for letting performance settle before logging results, thus minimizing the runtime required for stable results Terminate processing after the specified number of seconds Block size for I/O units Type of I/O pattern This is how Linux defines raw devices Please note that to achieve higher IOPS performance in Linux, the numjobs parameter as well as the iodepth can be increased. You can also specify the NUMA nodes to be used by parameter numa_cpu_nodes. Always start with the CPUs that are located on the same NUMA node as the NVMe SSD device. 12 Test System Configuration

56367 Rev. 0.70 August 2018 NVMe SSD Performance Evaluation Guide for Windows Performance Results Sequential Throughput Performance Figure 1. Sequential Throughput Performance on Samsung PM1725a 1.6TB NVMe SSD Figure 2. Sequential Throughput Performance on Micron 9200 3.2TB NVMe SSD Revision History 13

NVMe SSD Performance Evaluation Guide for Windows 56367 Rev. 0.70 August 2018 Random IOPS Performance Figure 3. Random IOPS Performance on Samsung PM1725a 1.6TB NVMe SSD Figure 4. Random IOPS Performance on Micron 9200 3.2TB NVMe SSD 14 Test System Configuration

56367 Rev. 0.70 August 2018 NVMe SSD Performance Evaluation Guide for Windows Appendix A Test System Configuration Table 6. Configuration of the Systems Under Test (SUTs) Platform AMD AMD Processor 2x AMD EPYC 7601 2x AMD EPYC 7601 RAM OS NVMe SSD #1 NVMe SSD #2 16x 16GB DRAM @ 2666MHz Windows Server 2016 DataCenter 64-bit version 10.0.14393 Samsung PM1725a 1.6TB (NUMA node 3) Micron 9200 MAX 3.2TB (NUMA node 6) 16x 16GB DRAM @ 2666MHz Red Hat Enterprise Linux Server release 7.4 (Maipo) Samsung PM1725a 1.6TB (NUMA node 3) Micron 9200 MAX 3.2TB (NUMA node 2) Diskspd version 2.0.20a N/A fio version 3.6 3.1 Table 7. Required BIOS Settings for Testing BIOS Settings Simultaneous Multithreading (SMT) Desired Value Disabled Revision History 15