Extreme Storage Performance with eXFlash DIMM and AMPS


© 2014 by 60East Technologies, Inc. and Lenovo Corporation. All trademarks or registered trademarks mentioned here are the property of their respective holders. For more information on AMPS, visit the 60East web site at http://www.crankuptheamps.com or contact 60East at info@crankuptheamps.com. For more information on eXFlash DIMMs, see the IBM/Lenovo product page at http://www-03.ibm.com/systems/x/options/storage/solidstate/exflashdimm/

AMPS, the Advanced Message Processing System from 60East Technologies, is the software behind some of the most demanding financial applications in the world. AMPS applications require predictable latency at high messaging volumes, while maintaining a persistent, queryable flight recorder and a "state of the world," or current value cache. This paper introduces AMPS, describes the demands that AMPS places on storage systems, and shows how adding flash storage on the memory channel improves performance and helps to meet those needs.

AMPS is engineered for high performance in real-world conditions. Our customers' applications regularly publish millions of transactional messages a second. AMPS provides a current value cache, the State of the World, that holds the current state of those messages, and it allows applications to run ad hoc queries on the State of the World and the transaction log, as well as play back messages from any point in time.

High Performance Storage

Storage matters. To provide this kind of performance, AMPS needs storage that runs at consistently high throughput with predictable low latency. AMPS records the state of each message, along with the metadata required to retrieve the message (such as a timestamp), in near real time for millions of messages a second. This requires the ability to sustainably write multiple gigabytes of data a second at predictable latencies.

What happens when the storage system isn't up to the demands of AMPS? AMPS publishers save messages locally until AMPS acknowledges that a message has been persisted. When the rate of message publication exceeds the throughput of the storage device, publishers need to retain messages longer, which requires more local storage. If transaction logging can't keep up with the rate of publication, the growing local storage requirements can cause publishers to run out of storage, or force AMPS to push back on the publishers.
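The publisher-side buffering described above can be sketched as follows. This is an illustrative model only, not the AMPS client API: the class and the names `publish` and `on_persisted_ack` are invented for the example.

```python
from collections import deque

class BufferedPublisher:
    """Illustrative sketch (not the AMPS client API): a publisher keeps
    each message in a local buffer until the server acknowledges that
    the message has been persisted to the transaction log."""

    def __init__(self, max_buffered):
        self.max_buffered = max_buffered  # local storage limit
        self.unacked = deque()            # messages awaiting persistence acks
        self.next_seq = 0

    def publish(self, payload):
        # If storage can't keep up, acks lag and the local buffer fills.
        if len(self.unacked) >= self.max_buffered:
            raise RuntimeError("local buffer exhausted: storage is too slow")
        seq = self.next_seq
        self.next_seq += 1
        self.unacked.append((seq, payload))
        return seq

    def on_persisted_ack(self, seq):
        # Acks arrive in sequence order; release everything up to `seq`.
        while self.unacked and self.unacked[0][0] <= seq:
            self.unacked.popleft()
```

The sketch makes the failure mode concrete: when persistence acks lag behind publication, the only safe options are to buffer more or to slow the publisher down. Dropping messages before the ack avoids the buffer limit but trades it for the loss and duplication risks discussed below.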
One way to work around this is to allow publishers to discard messages before the messages have been acknowledged. This adds the risk of lost or duplicate messages if the publisher or AMPS fails over, and it masks the problem of slow storage rather than solving it. As an example, a moderately sized AMPS system logging a million 1KB messages per second requires more than a gigabyte per second of sustained write throughput, in addition to the throughput required to read the State of the World and the transaction log for queries. AMPS workloads mix heavy sequential writes with multiple sequential reads starting at arbitrary points in the transaction log. For most storage systems, this is a nightmare scenario, and the performance numbers show it.
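The sizing arithmetic above is easy to check: at one million 1KB messages per second, the payload alone demands roughly a gigabyte per second of sustained writes, before any per-message metadata. A minimal sketch, where the metadata overhead parameter is an assumption of the example rather than a figure from the paper:

```python
def required_write_bandwidth(msgs_per_sec, msg_bytes, metadata_bytes=0):
    """Sustained write bandwidth (bytes/sec) needed to log every message.

    metadata_bytes is an illustrative per-message overhead (timestamp,
    offsets, and so on); the paper does not give an exact figure."""
    return msgs_per_sec * (msg_bytes + metadata_bytes)

# The paper's example: one million 1KB messages per second.
bw = required_write_bandwidth(1_000_000, 1024)
print(bw / 2**30)  # ~0.95 GiB/s of payload alone; with metadata, "more
                   # than a gigabyte per second" of sustained writes
```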

Putting Storage to the Test

To evaluate storage performance, 60East engineers created a test using the AMPS high-performance I/O engine. The test creates the same read/write pattern as the transaction logging that an AMPS instance performs under heavy load, using the same code path. However, the test allows our engineers to precisely adjust the read/write mix and time this single component. By eliminating other variables from the test, we can more accurately measure storage performance under AMPS workloads.

AMPS transaction logging is characterized by sequential reads and writes. For this test, we simulate 512-byte messages being written to storage. Our storage engine batches the messages into 128-message chunks, submitting 64K bytes per operation. When we add reads to the mix, the reads requested are also sequential accesses to the file: the same access pattern AMPS uses to replay transaction logs.

Hardware Specifications

We ran the tests on hardware with the following specifications:

  Server          IBM x3650 M4
  Processor       2 x Intel Xeon E5-2690 @ 2.9GHz
  Memory          64GB or 384GB
  PCIe Storage    Leading vendor, 785GB capacity
  eXFlash DIMMs   8 DIMMs, software RAID, 3TB total capacity
  Disk            IBM 90Y8878 at 10KRPM, 300GB capacity

These specifications are similar to those that 60East customers use for production deployments of AMPS.
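The access pattern described above can be sketched as a small harness: sequential 64KiB appends built from 128 simulated 512-byte messages, optionally mixed with sequential reads that begin at an arbitrary point in the log. This is a simplified stand-in for 60East's actual test engine, with all names invented for the example.

```python
import os
import random
import time

MSG_SIZE = 512             # simulated message size, as in the test
BATCH = 128                # messages per I/O operation
CHUNK = MSG_SIZE * BATCH   # 64 KiB submitted per operation

def run_mix(path, ops, read_fraction=0.0):
    """Sketch of the test's access pattern (not 60East's harness):
    sequential chunk-sized appends, mixed with sequential reads that
    start at an arbitrary point in what has already been written.
    Returns per-operation latencies in seconds."""
    payload = b"\x00" * CHUNK
    latencies = []
    read_pos = 0
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC)
    try:
        os.write(fd, payload)     # pre-fill so reads have data
        write_pos = CHUNK
        for _ in range(ops):
            start = time.perf_counter()
            if random.random() < read_fraction:
                # sequential read, wrapping within the written region
                os.pread(fd, CHUNK, read_pos % write_pos)
                read_pos += CHUNK
            else:
                os.pwrite(fd, payload, write_pos)
                write_pos += CHUNK
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.unlink(path)
    return latencies
```

Setting `read_fraction=0.0` reproduces the write-only runs below, and `read_fraction=0.8` the 80% read, 20% write mix. A real harness would also use direct I/O and pin threads to avoid measuring the page cache rather than the device.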

Spinning Disk

To get a baseline for storage performance, we ran our test using the spinning disk drive in one of our performance lab machines. The access pattern of the AMPS transaction log, sequential writes and reads, matches the optimum access pattern for spinning disk. Most performance-critical applications don't use spinning disk any more, but it provides a good baseline for comparison. Here are the performance characteristics of the AMPS I/O engine with spinning disk, using writes only:

[Charts: latency and throughput for spinning disk, write-only workload]

Notice that the latency chart uses a logarithmic scale: this is necessary because of the wide variance in the numbers, which range from roughly 56μs to over 5ms. We use the same scale throughout this paper to make it easy to compare the charts. Likewise, for the throughput charts, we use a consistent scale. The spinning disk can handle over 600,000 512B messages per second.
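A latency spread from tens of microseconds to several milliseconds spans roughly two orders of magnitude, which is why a logarithmic axis is needed. A minimal sketch of the kind of summary behind such a chart; the sample values in the test are illustrative, not the paper's data:

```python
def latency_summary(latencies_us):
    """Return (min, median, p99, max) for a list of per-op latencies
    in microseconds. A wide min-to-max spread, as described above, is
    what forces the charts onto a logarithmic latency axis."""
    xs = sorted(latencies_us)
    n = len(xs)
    median = xs[n // 2]
    p99 = xs[min(n - 1, int(n * 0.99))]
    return xs[0], median, p99, xs[-1]
```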

Few applications use only sequential writes. To make the test reflect an active, real-world workload, we added an 80% read mix, so that the AMPS engine was requesting 80% reads and 20% writes. The charts below show the results.

[Charts: latency and throughput for spinning disk, write-only versus 80% read mix]

The latency drops somewhat when we add a heavy load of reads to the mix. However, this also reduces throughput to just under 500,000 messages per second. There's still a large variation in latency, though somewhat less than in the write-only case. Those are respectable numbers, and there's a good reason why systems still include spinning disk.

PCIe Flash

High-performance systems no longer rely on spinning disk for performance-critical storage. Instead, most current systems use PCIe-based flash storage. This storage doesn't rely on mechanical parts, and it uses the PCIe bus for fast access to memory and CPU. Here are the results for PCIe, write only:

[Charts: latency and throughput for PCIe flash, write-only workload]

The overall throughput numbers are much better, and the latency is lower compared to the spinning disk: this is clearly a better solution for high-performance applications. Where the spinning disk had a wide variation in latency, PCIe is more consistent, although it still suffers from regular latency spikes. With 100% writes, the PCIe solution can handle nearly 1.8 million messages a second, or three times the number of messages handled by the spinning disk.

As with the spinning disk, though, when we introduce reads, both latency and throughput suffer. We ran the test again, changing the mix of operations from 100% write to 80% read and 20% write.

[Charts: latency and throughput for PCIe flash, write-only versus 80% read mix]

The performance of the PCIe flash storage system suffers dramatically when we add 80% reads to the workload. Latency not only becomes higher overall, but begins to vary drastically. Likewise, throughput degrades to the point where the PCIe storage system is processing fewer messages per second than the spinning disk. In our tests, at an 80% read mix, the PCIe storage system was only able to handle 365,000 512B messages per second. PCIe offers good performance for write-heavy applications, but it is not well suited to applications with a heavy read mix in their I/O access patterns.

eXFlash DIMM

Finally, we ran the test on the new eXFlash DIMM from Lenovo. This brand-new technology adds flash storage to the memory channel of the system, and the memory channel is designed for predictable, high-speed data movement. As before, we first ran the test with a 100% write mix.

[Charts: latency and throughput for eXFlash DIMMs, write-only workload]

The overall latency and throughput numbers improve substantially compared to PCIe storage. While the PCIe system has spikes of higher latency at regular intervals, the eXFlash DIMMs maintain consistent performance. The DIMMs can handle approximately 2.15 million messages a second, as compared with 1.84 million for the PCIe system. So far, so good. As we've seen throughout this paper, though, the question is whether the eXFlash DIMMs can handle a workload that is heavily weighted toward reads.

The eXFlash DIMMs perform amazingly well at an 80% read and 20% write mix. Overall latency is only slightly higher, with none of the wide variations in performance seen with the spinning disk and PCIe solutions. Throughput drops only slightly, to 1.87 million 512B messages per second: slightly more throughput than the PCIe card showed at a 100% write mix.

[Charts: latency and throughput for eXFlash DIMMs, write-only versus 80% read mix]

Summary

The numbers speak for themselves. In our testing, the eXFlash DIMMs outperform both PCIe and spinning disk on every dimension: higher throughput, lower latency, and rock-solid predictability. The advantages are clear even when doing only sequential writes. When we add reads to the mix, the eXFlash DIMM technology leaves the other technologies behind.

The expanded charts below show all of the tests plotted together, on the same scale as the individual charts above. In write-only workloads, the eXFlash DIMMs show lower latency and less variation.

[Chart: latency for spinning disk, PCIe flash, and eXFlash DIMMs, write-only workloads]

As you can see, latency for the eXFlash DIMMs is dramatically different overall: lower and more consistent than either other solution in the write-only case.

Throughput numbers on a pure write workload also favor the eXFlash DIMM, as shown in the following chart.

[Chart: throughput for spinning disk, PCIe flash, and eXFlash DIMMs, write-only workloads]

In pure write workloads, the eXFlash DIMMs provide higher throughput at a lower and more consistent latency than either the spinning disk or the leading PCIe flash storage solution.

Charting the 80% read, 20% write mix side by side, the results are similar.

[Chart: latency for spinning disk, PCIe flash, and eXFlash DIMMs, 80% read mix]

At the 80% read, 20% write mix, the PCIe solution has much wider variability in results than the spinning disk, while the eXFlash DIMM continues to show both lower and more consistent latency.

The throughput comparisons across technologies are even more impressive.

[Chart: throughput for spinning disk, PCIe flash, and eXFlash DIMMs, 80% read mix]

Both spinning disk and PCIe flash storage suffer a dramatic drop in throughput at the 80% read mix. The eXFlash DIMMs, on the other hand, show only a slight drop.

Bottom Line

Our testing shows no other technology on the market that reaches the level of performance and throughput achieved with eXFlash DIMMs. eXFlash DIMMs are the only technology on the market that can keep up with AMPS I/O and deliver the uncompromising performance our customers demand. At 60East, we specialize in high-performance messaging systems, and we worked closely with the eXFlash DIMM team to test the technology while it was in development. We tell our customers that if they want to get the most out of AMPS, memory-channel storage and eXFlash DIMMs need to be part of their planning.