Understanding Rack-Scale Disaggregated Storage


Sergey Legtchenko, Hugh Williams, Kaveh Razavi, Austin Donnelly, Richard Black, Andrew Douglas, Nathanael Cheriere, Daniel Fryer, Kai Mast, Angela Demke Brown, Ana Klimovic, Andy Slowey and Antony Rowstron
(Some authors were marked on the original slide as intern or visiting researcher.)

Storage Disaggregation
Common in the cloud:
- Compute racks and storage racks, connected by the network
Improves performance/cost:
- Independent resource scaling
- Rack hardware specialization
Does not happen in HDD storage racks:
- Shared-nothing servers
- Direct-attached storage (DAS)
- Strict HDD ownership principle: an HDD is always managed by the server to which it is physically attached
Do we need rack-scale storage disaggregation?

Rack-Scale HDD Storage Disaggregation
Relaxing the HDD ownership principle:
- At any given time, an HDD is managed by one server, but which server that is can be reconfigured
- The HDDs and servers are connected through a storage fabric
Enables 4 types of disaggregation:
- No reconfiguration during normal operation: configuration disaggregation, failure disaggregation
- Reconfiguration part of normal operation: dynamic elastic disaggregation, complete disaggregation
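
The relaxed ownership principle can be pictured as a mutable map from HDD to managing server. The sketch below is only an illustration of that invariant (one manager per HDD at any time, with the manager changeable); the class name `OwnershipMap`, its methods, and the HDD and server identifiers are all hypothetical and do not come from the talk.

```python
class OwnershipMap:
    """Hypothetical model: which server currently manages each HDD."""

    def __init__(self, initial):
        # initial: dict mapping HDD id -> server id.
        # A dict gives each HDD exactly one manager at any time.
        self._owner = dict(initial)

    def owner(self, hdd):
        """Return the server that currently manages the given HDD."""
        return self._owner[hdd]

    def reassign(self, hdd, new_server):
        """Hand one HDD to another server and return the previous manager."""
        if hdd not in self._owner:
            raise KeyError(f"unknown HDD: {hdd}")
        previous = self._owner[hdd]
        self._owner[hdd] = new_server
        return previous

    def hdds_of(self, server):
        """Return the sorted list of HDDs managed by one server."""
        return sorted(h for h, s in self._owner.items() if s == server)


if __name__ == "__main__":
    rack = OwnershipMap({"hdd0": "s0", "hdd1": "s0", "hdd2": "s1"})
    rack.reassign("hdd1", "s1")
    print(rack.hdds_of("s0"), rack.hdds_of("s1"))
```

Under the strict principle the map would be fixed at deployment; the four scenarios differ mainly in how often `reassign` would be called.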

No Reconfiguration during Normal Operation
Configuration disaggregation:
- One rack for all workloads
- Configure once at deployment
- Optimized offline for workload
- Diagram labels: general-purpose storage rack, customized rack layout, deploy as regular storage
Failure disaggregation:
- Reconfigure on server failure
- Move HDDs, not data
- Diagram labels: failed server, failover
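
"Move HDDs, not data" can be sketched as a remapping step that runs only when a server fails. The function below is a minimal illustration, not the authors' controller: the name `fail_over`, the identifiers, and the least-loaded placement policy are assumptions made for the example.

```python
def fail_over(owner, failed_server, servers):
    """Return a new HDD -> server map with the failed server's HDDs moved.

    owner: dict mapping HDD id -> server id (not modified).
    The placement policy (give each orphaned HDD to the survivor that
    currently manages the fewest HDDs) is an assumption of this sketch.
    """
    survivors = [s for s in servers if s != failed_server]
    if not survivors:
        raise ValueError("no surviving server to take over the HDDs")

    # Count how many HDDs each survivor already manages.
    load = {s: 0 for s in survivors}
    for server in owner.values():
        if server in load:
            load[server] += 1

    new_owner = dict(owner)
    orphaned = sorted(h for h, s in owner.items() if s == failed_server)
    for hdd in orphaned:
        # Ties are broken by server id so the result is deterministic.
        target = min(survivors, key=lambda s: (load[s], s))
        new_owner[hdd] = target
        load[target] += 1
    return new_owner


if __name__ == "__main__":
    before = {"hdd0": "s0", "hdd1": "s0", "hdd2": "s1", "hdd3": "s2"}
    print(fail_over(before, "s0", ["s0", "s1", "s2"]))
```

Only the mapping changes; no bytes are copied between drives, which is why no rebuild traffic is needed in this scenario.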

Reconfiguration Is Part of Normal Operation
Dynamic elastic disaggregation:
- Dynamically adapt HDD-to-server ratio
- High load: each server gets its fair share of HDDs
- Low load: most HDDs attached to few servers
- [Figure: HDD attachment under high, moderate, and low load]
Complete disaggregation:
- Reconfigure at IO granularity
- Any server can IO to any file on any HDD

Summary of Disaggregation Scenarios
                            Configuration      Failure                      Dynamic Elastic   Complete
Storage stack redesign      No                 Small                        Moderate          High
Online controller           No                 Not necessarily              Yes               Yes (on IO path)
Reconfiguration frequency   O(rack lifetime)   O(server failures)           O(hours-days)     O(IO rate)
Reconfiguration overhead    None               Not under normal operation   Low               High
Flexibility increases from low (configuration disaggregation) to high (complete disaggregation).

A Fabric to Explore Rack-Scale Disaggregation
Storage switch:
- Custom hardware design
- Circuit switch abstraction
- 160 ports @ 6 Gbps/port
Benefit of the design: extreme flexibility
- Devices shown attached to the fabric: SATA HBA, SATA HDDs, SAS HBA, SAS enclosure, SATA SSDs
Microcontroller:
- Circuit management
Crosspoint ASIC:
- Circuits between any port pairs
- Electrical signal forwarding
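
The circuit switch abstraction can be modelled in a few lines. The sketch below is a toy model, not the fabric's firmware or API: the class `CircuitSwitch` and its methods are hypothetical, and the rule that a port takes part in at most one circuit at a time is an assumption that follows the usual meaning of a circuit switch. Only the 160-port count comes from the slide.

```python
class CircuitSwitch:
    """Toy model of a crosspoint circuit switch (hypothetical interface)."""

    def __init__(self, num_ports=160):
        self.num_ports = num_ports
        self._peer = {}  # port -> port at the other end of its circuit

    def _check(self, port):
        if not 0 <= port < self.num_ports:
            raise ValueError(f"port {port} out of range")

    def connect(self, a, b):
        """Set up a circuit between two idle ports."""
        self._check(a)
        self._check(b)
        if a == b:
            raise ValueError("cannot connect a port to itself")
        if a in self._peer or b in self._peer:
            raise RuntimeError("port already in a circuit")
        self._peer[a] = b
        self._peer[b] = a

    def disconnect(self, port):
        """Tear down the circuit on this port; return the former peer or None."""
        self._check(port)
        peer = self._peer.pop(port, None)
        if peer is not None:
            del self._peer[peer]
        return peer

    def peer(self, port):
        """Return the port connected to this one, or None if it is idle."""
        return self._peer.get(port)


if __name__ == "__main__":
    fabric = CircuitSwitch()
    fabric.connect(0, 100)     # e.g. an HBA port to a drive port
    fabric.disconnect(0)
    fabric.connect(1, 100)     # the same drive now reaches another HBA port
    print(fabric.peer(100))
```

Reassigning a drive then amounts to one `disconnect` followed by one `connect`, which is the operation the disaggregation scenarios vary in frequency.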

Experience with Configuration Disaggregation
Easy to enable:
- No controller
- No reconfiguration overhead
- Unmodified software on servers
Simplifies management & operation:
- One storage rack for all workloads
Also very useful for development:
- We use it on a daily basis!
- Fast instantiation of test configurations
[Photo: our test setup for configuration disaggregation]

Experience with Failure Disaggregation
Hardware trends impact data availability:
- HDD and SSD capacities grow
- Servers can have a LOT of direct-attached storage, e.g. petabytes of data per Pelican (cold storage) server
- On failure, the amount of data and the time to recover increase
Failure disaggregation improves availability:
- Reduces data unavailability to tens of seconds or less
- No resources used to rebuild data
- No reconfiguration overhead for normal operation
Pelican prototype has:
- 1152 HDDs/rack
- 2 servers

Exploring Dynamic Elastic Disaggregation (ongoing work)
Storage workloads are bursty:
- Average server utilization is low
- Load skew across servers
Online controller:
- Monitors storage traffic in the rack
- Adapts HDD-to-server ratio
- Not on the data path
Better server utilization:
- Allows storage tiering within the rack
- Some servers can host background jobs, spot VM instances
[Figure: dynamic elastic disaggregation setup with SAS enclosures and the storage fabric]
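
One way to picture how a controller might adapt the HDD-to-server ratio is sketched below. The talk does not describe the controller's policy, so everything here is an assumption: the function name `plan_attachment`, the single normalized `load` input, and the rule that the number of storage-serving servers grows in proportion to load. It only reproduces the two end points stated in the talk (fair share under high load, most HDDs on few servers under low load).

```python
import math


def plan_attachment(num_hdds, servers, load):
    """Return a dict server -> number of HDDs to attach.

    load is a rack-wide utilization estimate in [0, 1] (an assumed input).
    Assumed policy: the number of servers that serve storage scales with
    load, and the HDDs are split as evenly as possible among them.
    """
    if not servers:
        raise ValueError("at least one server is required")
    if not 0.0 <= load <= 1.0:
        raise ValueError("load must be between 0 and 1")

    # Always keep at least one server attached to the HDDs.
    active = max(1, math.ceil(load * len(servers)))
    base, extra = divmod(num_hdds, active)

    plan = {s: 0 for s in servers}
    for i, server in enumerate(servers[:active]):
        # The first `extra` servers take one more HDD to absorb the remainder.
        plan[server] = base + (1 if i < extra else 0)
    return plan


if __name__ == "__main__":
    rack = ["s0", "s1", "s2", "s3"]
    for load in (1.0, 0.5, 0.25):
        print(load, plan_attachment(12, rack, load))
```

Servers left with zero HDDs in the plan are the ones that could be used for background jobs or spot VM instances; applying the plan would be done through the fabric, off the data path.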

Complete Disaggregation, seriously?
Can we reconfigure per IO?
[Figure: time to switch and mount a SATA SSD, in milliseconds, broken down into Driver and NTFS + Mount]
[Figure: impact on throughput of switching after every IO (no file system mount); throughput in MB/s versus Log Record Size (KB), for Not Switched and Switched, with the max SSD bandwidth and the cost of reconfiguration marked]
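
Why switching after every IO hurts small IOs most can be seen from a simple amortization model: each IO pays a fixed switching delay on top of its transfer time. The sketch below is a back-of-the-envelope model, not the authors' measurement; the function name and the bandwidth and switch-time values in the usage example are hypothetical placeholders, not numbers read from the figures.

```python
def effective_throughput_mb_s(io_size_kb, bandwidth_mb_s, switch_time_ms):
    """Throughput when a fixed switching delay is paid before every IO.

    Model: time per IO = transfer time + switch time, so
    throughput = size / (size / bandwidth + switch_time).
    """
    io_mb = io_size_kb / 1024.0
    transfer_s = io_mb / bandwidth_mb_s
    switch_s = switch_time_ms / 1000.0
    return io_mb / (transfer_s + switch_s)


if __name__ == "__main__":
    # Hypothetical device parameters, chosen only to show the trend.
    bandwidth = 500.0   # MB/s
    switch_ms = 10.0    # ms per reconfiguration
    for size_kb in (4, 256, 16384):
        mb_s = effective_throughput_mb_s(size_kb, bandwidth, switch_ms)
        print(f"{size_kb:>6} KB: {mb_s:8.2f} MB/s")
```

In this model the switched throughput approaches the device bandwidth only when the transfer time is much larger than the switch time, which matches the qualitative point that complete disaggregation has a high overhead for small IO.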

Experience with Complete Disaggregation
A lot of pain:
- Ecosystem challenges
- Redesign of the storage stack
- High overhead for small IO
- Metadata service on the IO path
- Hard to implement/debug
Benefits:
- Fine-grain load balancing
- Server failure tolerance by design
Complete disaggregation setup:
- 120 SATA SSDs
- 4 servers, 3 SATA ports/server

Conclusion
In the cloud today: no disaggregation in storage racks
- Fixed drive-to-server mapping
We designed a storage fabric to explore in-rack disaggregation
Rack-scale storage disaggregation can be useful and affordable:
- Configuration disaggregation, failure disaggregation, dynamic elastic disaggregation
- Substantial benefits
- No/small reconfiguration overheads
- Little or no software/hardware changes
Can become a challenge:
- Complete disaggregation
- High reconfiguration overhead
- Hard to implement and maintain