Optimizing Apache Spark with Memory1
- Cameron Linette Hill
July 2016
Abstract

The prevalence of Big Data is driving increasing demand for real-time analysis and insight. Big data processing platforms, like Apache Spark, leverage application memory to provide the required performance. Unfortunately, due to their heavy reliance on memory, the potential of these platforms is subverted by the cost and capacity limitations of DRAM. This paper demonstrates that by leveraging Diablo Technologies Memory1 to maximize the available memory, Inspur servers can do more work (75% efficiency improvement), unleashing the full potential of real-time, big data processing.

Introduction

Bigger, Faster Data

Today, data is being generated at unprecedented rates and from a growing variety of sources. As a result, the term "big data" has become ingrained into our standard lexicon. But just how big is Big Data? To put the speed and size of Big Data into perspective, it has been estimated that over 90% of the world's data was generated in the past two years. Current estimates put our daily data generation rate at 2.5 exabytes [1]. That's 2.5 billion gigabytes of new data... every single day.

Accordingly, modern businesses are faced with both a unique opportunity and a significant challenge. The opportunity lies in transforming Big Data into actionable knowledge. The adage "knowledge is power" has never been more true than in today's data-saturated society. The ability to effectively leverage data impacts our lives in numerous ways. It helps us with everything from stopping disease by analyzing medical statistics to identify key patterns and indicators, to finding the perfect restaurant/car/outfit/mate by turning data-driven insights into targeted, personalized recommendations.

However, performing these data-to-knowledge transformations is no easy task. In many areas, our ability to analyze data has lagged behind the increasing size and speed of the data itself. The sheer density and velocity of the incoming information, coupled with the need for accurate, real-time analysis, can make it very difficult to process and manage.

[1] ibm.com/software/data/bigdata/what-is-big-data.html
Apache Spark™ and In-Memory Computing

To handle increasing data rates and demanding user expectations, big data processing platforms like Apache Spark have emerged and quickly gained popularity. Spark provides a general-purpose clustered computing framework that can rapidly ingest and process real-time streams of data, enabling instantaneous analytics and decision-making.

How Spark Works

Creation, transformation, and manipulation of in-memory data is significantly faster than alternative, storage-centric approaches. Consequently, to provide its uniquely high performance, Apache Spark relies on in-memory data management. In Spark, the main data abstraction is the Resilient Distributed Dataset (RDD). RDDs are simply collections of data that can be partitioned across the nodes of a cluster and operated upon in parallel. These RDD-based operations are central to Spark functionality and performance. Keeping Spark RDDs in memory (as opposed to on disk) keeps data closest to the CPUs, enabling the fastest access and the most optimized system-level performance.

Spark's Memory Capacity Problem

As one might expect, Spark operations are extremely dependent on memory capacity. To facilitate rapid data retrieval, objects in Spark need to be quickly created, cached, sorted, grouped, and/or joined, thus creating a need for massive amounts of application memory. Unfortunately, the memory available in a single server is insufficient to handle most Spark jobs. In large part, this is due to the cost and capacity constraints imposed by DRAM. Those constraints are among the key reasons that Spark deployments are often distributed across many servers. Deploying many servers enables system designers to create memory footprints that are much larger than a single server could provide.

Though its distributed architecture enables larger pools of memory, there are still issues that subvert the full potential of Spark's disaggregated approach.
Key issues include:

- Cost of DRAM: The high cost of DRAM deters system designers from providing the larger, single-server memory footprints that would minimize cluster sizes and fully optimize performance.
- Cost of additional nodes: Creating large clusters requires additional expense due to the cost of the servers, associated networking, and increased operational expenses (e.g. due to added power consumption).
- Networking overhead: Too many network hops can bottleneck Spark performance. Splitting Spark jobs across a large cluster requires data transfer and coordination between cluster nodes. When clusters grow too large, this overhead can negatively impact performance.

In this paper, we will demonstrate how expanded application memory capacity, enabled by Memory1™ from Diablo Technologies™, addresses key issues faced by Spark in traditional, DRAM-only deployments.

Introducing Memory1

What Is Memory1?

Diablo Technologies Memory1 is the first memory DIMM to expose NAND flash as standard application memory. This solution provides the industry's most economical and highest-capacity byte-addressable memory modules. Memory1 provides up to 4X more memory capacity than other DIMMs, enabling dramatic increases in application memory per server. This enables significant performance advantages, due to increased data locality and reduced access times. Memory1 also minimizes Total Cost of Ownership (TCO) by reducing the number of servers required to support memory-constrained applications (e.g. Apache Spark).

Memory1 DIMMs interface seamlessly with existing hardware and software. They are JEDEC-compatible DDR4 DIMMs and are deployed into standard DDR4 DIMM slots. Processors, motherboards, operating systems, and applications do not need to change. Target applications simply leverage the expanded memory capacity as they see fit.

Figure 1: 128GB Memory1 DIMM

By leveraging flash's massive cost, power, and capacity advantages over DRAM, Memory1 DIMMs drastically change the economics of server memory, allowing applications to use huge pools of local memory that were previously infeasible to provide.
Solving Spark with Memory1

Benchmarking SORT Performance

Because most big data applications must perform a Sort on the entire dataset, Sort performance is crucial. The speed at which data can be organized into easily searchable configurations is often Spark's primary performance bottleneck. The more rapidly SORT operations can be performed on large datasets, the more quickly data can be ingested, retrieved, manipulated, and analyzed. Therefore, to simulate the critical demands of a typical Spark workload, we utilized the industry-standard spark-perf benchmark to perform SORT operations on a typically sized 500GB dataset. For comparison purposes, the SORT jobs were performed on both DRAM-only and Memory1 hardware configurations.

Hardware Setup

To efficiently sort datasets of a given magnitude, Spark requires significantly more memory than the dataset size. This is due to the additional memory needed to support RDD creation, object manipulation, and Spark management processes. Clustered Spark servers also require additional memory capacity to support the overhead created by inter-server coordination and synchronization activities.

To facilitate the effective sorting of a 500GB total dataset, the DRAM-only configuration was sized to provide 1.5TB of application memory, using a cluster of three 2-socket servers, each with 512GB of DRAM. This 3-to-1 ratio of available memory to dataset size enables a 500GB SORT job to complete in an acceptable timeframe (i.e. under 1 hour). Each cluster node represents a typical 2-socket, 16 DIMM-slot server, fully populated with 32GB DRAM modules (16 DIMM slots × 32GB = 512GB).

To demonstrate the improved efficiency and economics provided by Memory1, the Memory1 setup included only a single Inspur NF5180M4 2-socket server. The Memory1 server was populated with eight 16GB DRAM DIMMs and eight 128GB Memory1 DIMMs, providing a total of 1TB of application memory.
Note that, in the single-server Memory1 configuration, there is no additional overhead due to inter-server coordination/synchronization. Therefore, in this case, we expected a 2-to-1 ratio of available memory to dataset size to provide acceptable performance when sorting a 500GB dataset. The CPU and memory configuration for both setups is summarized in Table 1 below.
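The sizing arithmetic for the two configurations can be checked directly. The following is a minimal sketch using only the DIMM counts and sizes stated above; the function name `cluster_memory_gb` is made up for this illustration. It counts only the eight Memory1 DIMMs toward the single server's application memory, which is an assumption made here because that is the only reading under which the stated 1TB total matches the DIMM sizes (8 × 128GB = 1024GB).

```python
def cluster_memory_gb(servers: int, dimms_per_server: int, gb_per_dimm: int) -> int:
    """Total memory in GB for a cluster of identically populated servers."""
    return servers * dimms_per_server * gb_per_dimm


DATASET_GB = 500

# DRAM-only cluster: three servers, 16 slots each, 32GB DRAM per slot.
dram_only_gb = cluster_memory_gb(servers=3, dimms_per_server=16, gb_per_dimm=32)

# Memory1 server: eight 128GB Memory1 DIMMs.
# Assumption: the eight 16GB DRAM DIMMs are not counted as application memory.
memory1_gb = cluster_memory_gb(servers=1, dimms_per_server=8, gb_per_dimm=128)

# Ratio of available memory to dataset size for each configuration.
dram_ratio = dram_only_gb / DATASET_GB
memory1_ratio = memory1_gb / DATASET_GB

print(f"DRAM-only: {dram_only_gb} GB, ratio {dram_ratio:.2f} to 1")
print(f"Memory1:   {memory1_gb} GB, ratio {memory1_ratio:.2f} to 1")
```

The results (1536GB and 1024GB) line up with the rounded 3-to-1 and 2-to-1 ratios quoted in the text.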
Table 1: Server configuration details

Testing With DRAM-Only

DRAM-only: Test Setup

To simulate a typical customer configuration in today's DRAM-only deployments, our 3-server Spark cluster was deployed using the aforementioned setup and presented with a 500GB dataset generated by the spark-perf benchmark.

DRAM-only: Results

Using the DRAM-only cluster, sorting a 500GB dataset took 27.5 minutes, as shown in Figure 2.

Figure 2: DRAM-Only SORT time for 500GB dataset
DRAM-only: Total Cost of Ownership (TCO)

The total CAPEX for the three-server DRAM-only setup (based on typical server and memory costs) was $47,400. This represents the cost of servers, processors, memory, and other associated hardware. Of course, operational costs are also important to consider, so we calculated a simple OPEX based purely on the electrical costs associated with the deployment. For the DRAM-only cluster, the 3-year OPEX totaled nearly $3,500, as shown in Table 2 below. Note that, in a real-world deployment, OPEX costs would be even higher when considering additional expenses associated with server management, physical space required, cooling costs, etc.

Table 2: OPEX For DRAM-Only Configuration (1.5TB Total Application Memory)

Adding the $47,400 CAPEX and the $3,434 3-year OPEX yields a 3-year TCO of $50,834. So, in summary, a 3-server cluster based solely on DRAM can sort 500GB of Spark data in 27.5 minutes at a 3-year cost of $50,834. To complete the analysis, we also calculated several efficiency metrics, as shown in Table 3 below.
Table 3: Efficiency Metrics For DRAM-Only Configuration (1.5TB Total Application Memory)

Testing With Memory1

Memory1 Setup

To test Memory1, we again sorted a 500GB dataset generated by the spark-perf benchmark. This time, however, the entire sort job was handled within the expanded memory of a single Memory1 server.

Memory1 Results

When using the Inspur NF5180M4 with Memory1, sorting a 500GB dataset took just 19.5 minutes. This represents more than a 29% reduction in SORT time versus the DRAM-only configuration.

Figure 3: DRAM-Only and Memory1 SORT times
Memory1 TCO

Total CAPEX for this setup (based on typical server and memory costs) was $16,496, including the server, processors, eight DRAM DIMMs, and eight Memory1 DIMMs. Again, operational costs are also critical, so we calculated OPEX based on the electrical costs associated with the Memory1 deployment. For the Memory1 server, the 3-year OPEX totaled just $1,144, as shown in Table 5 below.

Table 5: OPEX For Memory1 Configuration (1TB Total Application Memory)

Adding the $16,496 CAPEX and the $1,144 3-year OPEX yields a 3-year TCO of $17,640. So, in summary, a single 1-terabyte Memory1 server can sort 500GB of Spark data in 19.5 minutes at a 3-year cost of $17,640. To complete the picture, we also calculated several efficiency metrics, as shown in Table 6 below.
Table 6: Efficiency Metrics For Memory1 Configuration (1TB Total Application Memory)

Compare DRAM-Only vs. Memory1

As shown in Table 7 below, a side-by-side comparison of performance, cost, and efficiency is very telling. When compared to the 3-server DRAM-only cluster, the single Memory1 server was able to sort data faster and with significantly reduced TCO. The improvement in both cost and power efficiency is dramatic and demonstrates clear superiority over the DRAM-only configuration.

Table 7: TCO comparison between DRAM-Only Configuration and Memory1 Configuration
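The headline figures for both configurations can be cross-checked with a few lines of arithmetic. The sketch below is illustrative rather than the paper's own method. It uses only numbers stated in the text (CAPEX, 3-year OPEX, and sort time) and takes the DRAM-only OPEX as $3,434, i.e. the stated TCO minus the stated CAPEX. The names `Config` and `pct_reduction` are made up for this example. The last metric, the reduction in the product of TCO and sort time, is an assumption about how a figure near 75% could arise; the efficiency tables are not reproduced here, so the paper's exact definition is not known.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """Cost and performance figures for one hardware configuration."""
    name: str
    capex_usd: int
    opex_3yr_usd: int
    sort_minutes: float

    @property
    def tco_usd(self) -> int:
        # 3-year TCO as defined in the text: CAPEX plus 3-year OPEX.
        return self.capex_usd + self.opex_3yr_usd


def pct_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is lower than `before`."""
    return 100.0 * (before - after) / before


# DRAM-only OPEX of 3,434 is derived here as stated TCO (50,834) minus CAPEX (47,400).
dram = Config("DRAM-only (3 servers)", 47_400, 3_434, 27.5)
mem1 = Config("Memory1 (1 server)", 16_496, 1_144, 19.5)

time_reduction = pct_reduction(dram.sort_minutes, mem1.sort_minutes)
tco_reduction = pct_reduction(dram.tco_usd, mem1.tco_usd)

# Assumed efficiency metric: dollars of TCO multiplied by minutes per 500GB sort.
cost_time_reduction = pct_reduction(
    dram.tco_usd * dram.sort_minutes,
    mem1.tco_usd * mem1.sort_minutes,
)

for cfg in (dram, mem1):
    print(f"{cfg.name}: TCO ${cfg.tco_usd:,}, sort {cfg.sort_minutes} min")
print(f"Sort time reduction:      {time_reduction:.1f}%")       # 29.1%
print(f"TCO reduction:            {tco_reduction:.1f}%")        # 65.3%
print(f"TCO x time reduction:     {cost_time_reduction:.1f}%")  # 75.4%
```

The first two results agree with the 29% and 65% figures quoted in the paper. The third is offered only as one plausible reading of the 75% efficiency claim.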
As evidenced by the test results, Memory1-enabled servers provide compelling advantages in all facets of an Apache Spark deployment. Solution cost, power consumption, and SORT efficiency are all significantly improved by the Memory1 configuration.

Spark Shuffle Architecture: Memory1 Advantage

Apache Spark allows for large datasets to be acted upon in memory, making it a faster alternative to Hadoop or other data sources alone. Spark follows a similar process to the MapReduce paradigm implemented in Hadoop, though it allows for greater flexibility by giving architects the ability to persist, or cache, Resilient Distributed Datasets (RDDs) to memory ("persist memory") or storage ("persist disk"). The results discussed thus far have been in persist memory mode. However, many Spark architects will necessarily persist to storage instead, potentially shifting the performance bottleneck to storage.

Using a Shuffle-Sort model, the data is first mapped, a procedure in which the job is divided among multiple nodes in the cluster. Each node then assigns a key to each piece of data and writes it to a separate file (or bucket) for each key, causing a shuffle write. Once the data has been mapped to these files, the data is sorted by key, combining data from similar buckets in the Reduce stage, causing a shuffle read. In persist disk mode, shuffle data is written to and read from storage, causing huge slowdowns in the processing of data.

Memory1 allows for the shuffle data to be accelerated as well, even when persisting to disk. Because Memory1 is application memory, a RAMDisk can be created and used for the shuffle data in Spark, removing the storage bottleneck and making the full performance of Memory1 available for shuffle data.

To illustrate the use of Memory1 in persist disk mode, we performed the Sort test on the Inspur NF5180M4 in persist disk mode using Memory1 as a RAMDisk.
First, we performed a sort on a 200GB dataset in order to establish a baseline. We then reran the test using 500GB, 1TB, and 1.5TB datasets. Figure 4 shows the results from each of these tests.

Figure 4: Results of Memory1 in persist disk mode

As can be seen in Figure 4, completion times scale close to linearly as the dataset increases in size. By writing shuffle data to a Memory1 RAMDisk, Spark takes advantage of the latency and bandwidth benefits of the processors' memory controllers. Because Memory1 is connected directly to the processors of the server, latency is significantly reduced and bandwidth is considerably higher. Apache Spark architects now have an option available to increase the performance of their operations by mapping Memory1 using a RAMDisk.
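The shuffle write and shuffle read steps described in this section can be sketched in plain Python. This is a toy, single-process illustration and not Spark code: the function name `shuffle_sort` and the file naming scheme are invented for the example, and keys are grouped into buckets by integer range rather than one file per key. The bucket files go to a scratch directory, which is the role a RAMDisk would play in the setup described above. In a real deployment Spark's scratch location is controlled by the `spark.local.dir` property; the paper does not say how its RAMDisk was wired in, so pointing that property at a RAM-backed mount is an assumption.

```python
import os
import tempfile
from collections import defaultdict


def shuffle_sort(partitions, num_buckets, scratch_dir):
    """Toy shuffle-sort: map tasks write bucket files, reduce tasks read and sort them."""
    lo = min(min(p) for p in partitions if p)
    hi = max(max(p) for p in partitions if p)
    width = (hi - lo) // num_buckets + 1  # key range covered by each bucket

    # Map stage (shuffle write): each map task writes one file per bucket it touches.
    for task_id, part in enumerate(partitions):
        buckets = defaultdict(list)
        for key in part:
            buckets[(key - lo) // width].append(key)
        for bucket_id, keys in buckets.items():
            path = os.path.join(scratch_dir, f"map{task_id}_bucket{bucket_id}.txt")
            with open(path, "w") as f:
                f.write("\n".join(str(k) for k in keys))

    # Reduce stage (shuffle read): each reduce task gathers its bucket from
    # every map task's output, then sorts it. Buckets are visited in key order,
    # so concatenating them yields a globally sorted result.
    result = []
    for bucket_id in range(num_buckets):
        keys = []
        for task_id in range(len(partitions)):
            path = os.path.join(scratch_dir, f"map{task_id}_bucket{bucket_id}.txt")
            if os.path.exists(path):
                with open(path) as f:
                    keys.extend(int(tok) for tok in f.read().split())
        result.extend(sorted(keys))
    return result


def run_demo():
    # Three input partitions, as if the job were split across three map tasks.
    partitions = [[42, 7, 19], [3, 88, 51], [64, 1, 30]]
    # A temporary directory stands in for the scratch location (e.g. a RAMDisk mount).
    with tempfile.TemporaryDirectory() as scratch_dir:
        return shuffle_sort(partitions, num_buckets=3, scratch_dir=scratch_dir)


print(run_demo())  # [1, 3, 7, 19, 30, 42, 51, 64, 88]
```

Every record passes through the scratch directory once on write and once on read, which is why the speed of that location matters so much in persist disk mode.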
What We've Shown

Diablo Technologies Memory1 economically enables a dramatic expansion of the application memory available in each server. By enabling a 65% decrease in TCO and a 75% increase in SORT efficiency-per-dollar, Memory1 eliminates the hardware cost concerns traditionally faced in multi-server Spark deployments. In addition, having more memory per server enables each Spark server to perform more work, which mitigates the impact of networking overhead by reducing the number of servers required.

In summary, Memory1 allows Spark system designers to:

- Avoid the high cost of DRAM-only implementations
- Reduce the number of Spark servers required for a given job, thus minimizing the cost of additional servers
- Minimize the number of network hops, thereby minimizing the bandwidth and latency impact of networking overhead

These benefits are only possible with the improved capacity and economics provided by Memory1. By massively increasing application memory per server, Memory1 improves Spark's Return on Investment (ROI) by both maximizing performance and minimizing Total Cost of Ownership (TCO).
About Inspur

Inspur Systems Inc., located in Fremont, CA, is part of Inspur Group, a leading cloud computing and global IT solutions provider. Inspur was founded in 1945 and has since provided IT products and services for over 85 countries. Inspur is ranked by Gartner as one of the top 5 largest server manufacturers in the world and #1 in China. Inspur provides global customers with data center servers and storage solutions that offer Tier 1 quality and performance, are energy efficient and cost effective, and are built specifically for actual workloads and data center environments. As a leading total solutions and services provider, Inspur is capable of providing total solutions at the IaaS, PaaS, and SaaS levels with high-end servers, mass storage systems, cloud operating systems, and information security technology. For more information, visit

About Diablo Technologies

Diablo Technologies is a leading developer of high-performance memory products that solve urgent business problems by wringing more performance out of fewer servers. Diablo's Memory1 combines the highest-capacity memory modules with Diablo's Software Defined Memory platform. Memory1 enables a dramatic reduction in datacenter expenses with significant increases in server and application capability. Diablo's products and technology are included in solutions from leading server vendors such as Inspur. Diablo is best known for its innovative Memory Channel Storage (MCS) architecture. Memory Channel Storage decreased storage access times by more than 80% by attaching flash storage directly to the CPU's memory controller.

All Rights Reserved. The dt logo, Diablo Technologies, and Memory1 are trademarks or registered trademarks of Diablo Technologies, Incorporated. The Inspur logo is a trademark of Inspur Group. All other trademarks are property of their respective owners.
Data Footprint Reduction with Quality of Service (QoS) for Data Protection By Greg Schulz Founder and Senior Analyst, the StorageIO Group Author The Green and Virtual Data Center (Auerbach) October 28th,
More informationIntroduction to Big-Data
Introduction to Big-Data Ms.N.D.Sonwane 1, Mr.S.P.Taley 2 1 Assistant Professor, Computer Science & Engineering, DBACER, Maharashtra, India 2 Assistant Professor, Information Technology, DBACER, Maharashtra,
More informationUNLEASH YOUR APPLICATIONS
UNLEASH YOUR APPLICATIONS Meet the 100% Flash Scale-Out Enterprise Storage Array from XtremIO Opportunities to truly innovate are rare. Yet today, flash technology has created the opportunity to not only
More informationDataON and Intel Select Hyper-Converged Infrastructure (HCI) Maximizes IOPS Performance for Windows Server Software-Defined Storage
Solution Brief DataON and Intel Select Hyper-Converged Infrastructure (HCI) Maximizes IOPS Performance for Windows Server Software-Defined Storage DataON Next-Generation All NVMe SSD Flash-Based Hyper-Converged
More informationAccelerating Microsoft SQL Server 2016 Performance With Dell EMC PowerEdge R740
Accelerating Microsoft SQL Server 2016 Performance With Dell EMC PowerEdge R740 A performance study of 14 th generation Dell EMC PowerEdge servers for Microsoft SQL Server Dell EMC Engineering September
More informationTPC-E testing of Microsoft SQL Server 2016 on Dell EMC PowerEdge R830 Server and Dell EMC SC9000 Storage
TPC-E testing of Microsoft SQL Server 2016 on Dell EMC PowerEdge R830 Server and Dell EMC SC9000 Storage Performance Study of Microsoft SQL Server 2016 Dell Engineering February 2017 Table of contents
More informationPRESERVE DATABASE PERFORMANCE WHEN RUNNING MIXED WORKLOADS
PRESERVE DATABASE PERFORMANCE WHEN RUNNING MIXED WORKLOADS Testing shows that a Pure Storage FlashArray//m storage array used for Microsoft SQL Server 2016 helps eliminate latency and preserve productivity.
More informationDell EMC ScaleIO Ready Node
Essentials Pre-validated, tested and optimized servers to provide the best performance possible Single vendor for the purchase and support of your SDS software and hardware All-Flash configurations provide
More informationAccelerating Real-Time Big Data. Breaking the limitations of captive NVMe storage
Accelerating Real-Time Big Data Breaking the limitations of captive NVMe storage 18M IOPs in 2u Agenda Everything related to storage is changing! The 3rd Platform NVM Express architected for solid state
More informationW H I T E P A P E R U n l o c k i n g t h e P o w e r o f F l a s h w i t h t h e M C x - E n a b l e d N e x t - G e n e r a t i o n V N X
Global Headquarters: 5 Speen Street Framingham, MA 01701 USA P.508.872.8200 F.508.935.4015 www.idc.com W H I T E P A P E R U n l o c k i n g t h e P o w e r o f F l a s h w i t h t h e M C x - E n a b