Modern Erasure Codes for Distributed Storage Systems

Similar documents
Modern Erasure Codes for Distributed Storage Systems

Your Data is in the Cloud: Who Exactly is Looking After It?

Locally repairable codes

Fast Erasure Coding for Data Storage: A Comprehensive Study of the Acceleration Techniques. Tianli Zhou & Chao Tian Texas A&M University

Network Coding for Distributed Storage Systems* Presented by Jayant Apte ASPITRG 7/9/13 & 7/11/13

All About Erasure Codes: - Reed-Solomon Coding - LDPC Coding. James S. Plank. ICL - August 20, 2004

Performance Models of Access Latency in Cloud Storage Systems

Screaming Fast Galois Field Arithmetic Using Intel SIMD Instructions. James S. Plank USENIX FAST. University of Tennessee

Cloud Storage Reliability for Big Data Applications: A State of the Art Survey

Enabling Node Repair in Any Erasure Code for Distributed Storage

Codes for Modern Applications

A survey on regenerating codes

Network Codes for Next-Generation Distributed Storage:! Opportunities & Challenges. Kannan Ramchandran

HADOOP 3.0 is here! Dr. Sandeep Deshmukh Sadepach Labs Pvt. Ltd. - Let us grow together!

Erasure Codes for Heterogeneous Networked Storage Systems

Repair Pipelining for Erasure-Coded Storage

EECS 121: Coding for Digital Communication & Beyond Fall Lecture 22 December 3. Introduction

Data Integrity Protection scheme for Minimum Storage Regenerating Codes in Multiple Cloud Environment

EC-Bench: Benchmarking Onload and Offload Erasure Coders on Modern Hardware Architectures

On Coding Techniques for Networked Distributed Storage Systems

Capacity Assurance in Hostile Networks

Privacy, Security, and Repair in Distributed Storage Systems. Siddhartha Kumar

Screaming Fast Galois Field Arithmetic Using Intel SIMD Instructions

Finding the most fault-tolerant flat XOR-based erasure codes for storage systems Jay J. Wylie

A Comparative Study on Distributed Storage and Erasure Coding Techniques Using Apache Hadoop Over NorNet Core

Clay Codes: Moulding MDS Codes to Yield an MSR Code

Software-defined Storage: Fast, Safe and Efficient

Optimal Exact-Regenerating Codes for Distributed Storage at the MSR and MBR Points via a Product-Matrix Construction

Erasure coding and AONT algorithm selection for Secure Distributed Storage. Alem Abreha Sowmya Shetty

SCALING UP OF E-MSR CODES BASED DISTRIBUTED STORAGE SYSTEMS WITH FIXED NUMBER OF REDUNDANCY NODES

A Survey on Erasure Coding Techniques for Cloud Storage System

Improving Cloud Storage Cost and Data Resiliency with Erasure Codes Michael Penick

Coding Techniques for Distributed Storage Systems

A RANDOMLY EXPANDABLE METHOD FOR DATA LAYOUT OF RAID STORAGE SYSTEMS. Received October 2017; revised February 2018

Everything You Wanted To Know About Storage (But Were Too Proud To Ask) The Basics

CSC310 Information Theory. Lecture 21: Erasure (Deletion) Channels & Digital Fountain Codes. November 22, 2006 see

On the Speedup of Recovery in Large-Scale Erasure-Coded Storage Systems (Supplementary File)

PANASAS TIERED PARITY ARCHITECTURE

CSC310 Information Theory. Lecture 22: Erasure (Deletion) Channels & Digital Fountain Codes. November 30, 2005 see

Rethinking Erasure Codes for Cloud File Systems: Minimizing I/O for Recovery and Degraded Reads

Optimize Storage Efficiency & Performance with Erasure Coding Hardware Offload. Dror Goldenberg VP Software Architecture Mellanox Technologies

A Performance Evaluation of Open Source Erasure Codes for Storage Applications

ECS High Availability Design

Fountain Codes Based on Zigzag Decodable Coding

ActiveScale Erasure Coding and Self Protecting Technologies

DiskReduce: Making Room for More Data on DISCs. Wittawat Tantisiriroj

Regenerating Codes for Errors and Erasures in Distributed Storage

Storage systems have grown to the point where failures are inevitable,

On the Speedup of Single-Disk Failure Recovery in XOR-Coded Storage Systems: Theory and Practice

On Data Parallelism of Erasure Coding in Distributed Storage Systems

Coding theory for scalable media delivery

Cooperative Pipelined Regeneration in Distributed Storage Systems

CORE: Augmenting Regenerating-Coding-Based Recovery for Single and Concurrent Failures in Distributed Storage Systems

Information-theoretically Secure Regenerating Codes for Distributed Storage

Raptor Codes for P2P Streaming

RAID (Redundant Array of Inexpensive Disks)

Distributed File System Based on Erasure Coding for I/O-Intensive Applications

Decentralized Distributed Storage System for Big Data

ActiveScale Erasure Coding and Self Protecting Technologies

Building High Speed Erasure Coding Libraries for ARM and x86 Processors. Per Simonsen, CEO, MemoScale May 2017

Encoding-Aware Data Placement for Efficient Degraded Reads in XOR-Coded Storage Systems

Interference Alignment in Regenerating Codes for Distributed Storage: Necessity and Code Constructions

Screaming Fast Galois Field Arithmetic Using Intel SIMD Instructions

Online data storage service strategy for the CERN computer Centre G. Cancio, D. Duellmann, M. Lamanna, A. Pace CERN, Geneva, Switzerland

BCStore: Bandwidth-Efficient In-memory KV-Store with Batch Coding. Shenglong Li, Quanlu Zhang, Zhi Yang and Yafei Dai Peking University

Resume Maintaining System for Referral Using Cloud Computing

Short Code: An Efficient RAID-6 MDS Code for Optimizing Degraded Reads and Partial Stripe Writes

Parity Logging with Reserved Space: Towards Efficient Updates and Recovery in Erasure-coded Clustered Storage

Lecture 4 September 17

FPGA Implementation of Erasure Codes in NVMe based JBOFs

DiskReduce: Making Room for More Data on DISCs. Wittawat Tantisiriroj

HP AutoRAID (Lecture 5, cs262a)

Apache Hadoop 3. Balazs Gaspar Sales Engineer CEE & CIS Cloudera, Inc. All rights reserved.

Always-On Erasure Coding

An Improvement of Quasi-cyclic Minimum Storage Regenerating Codes for Distributed Storage

So you think you can FEC?

Disk Arrays Nov. 7, 2011!

Exploration of Erasure-Coded Storage Systems for High Performance, Reliability, and Inter-operability

EC-Cache: Load-balanced, Low-latency Cluster Caching with Online Erasure Coding

Dell Technologies IoT Solution Surveillance with Genetec Security Center

WHITE PAPER SINGLE & MULTI CORE PERFORMANCE OF AN ERASURE CODING WORKLOAD ON AMD EPYC

CSEP 561 Error detection & correction. David Wetherall

Efficient Load Balancing and Disk Failure Avoidance Approach Using Restful Web Services

Data Protection for Cisco HyperFlex with Veeam Availability Suite. Solution Overview Cisco Public

A Simulation Analysis of Reliability in Erasure-Coded Data Centers

Graph based codes for distributed storage systems

Error Control Coding for MLC Flash Memories

MODERN storage systems, such as GFS [1], Windows

ECFS: A decentralized, distributed and faulttolerant FUSE filesystem for the LHCb online farm

COS 318: Operating Systems. Storage Devices. Vivek Pai Computer Science Department Princeton University

CSCI-1680 Link Layer Reliability Rodrigo Fonseca

Networked Systems and Services, Fall 2018 Chapter 2. Jussi Kangasharju Markku Kojo Lea Kutvonen

GFS Overview. Design goals/priorities Design for big-data workloads Huge files, mostly appends, concurrency, huge bandwidth Design for failures

Packet-Level Forward Error Correction in Video Transmission

PCM: A Parity-check Matrix Based Approach to Improve Decoding Performance of XOR-based Erasure Codes

Understanding System Characteristics of Online Erasure Coding on Scalable, Distributed and Large-Scale SSD Array Systems

Scale-out Data Deduplication Architecture

Distributed File System based on Erasure Coding for I/O Intensive Applications

GridBlocks DISK - Distributed Inexpensive Storage with K-availability

Applied Erasure Coding in Networks and Distributed Storage

Transcription:

Modern Erasure Codes for Distributed Storage Systems Storage Developer Conference, SNIA, Bangalore Srinivasan Narayanamurthy Advanced Technology Group, NetApp May 27 th 2016 1

Everything around us is changing! The Data Deluge Disk capacities and densities are increasing faster than the disk transfer rates Increased delay to recover using classical techniques lead to availability exposure Changing Storage Technologies Architectures: Scale-out, Distributed Storage, Cloud, Converged Media: Flash, NVM, SMR, Tape, et al. Features: Geo-distribution, Security, Use of commodity hardware (Failure is a norm!) Newer Dimensions of Erasure Codes Optimality tradeoffs redefined More about this inside 2 Erasure coding usage is growing, and is now available in an increasing number of newer object, file and block storage arrays, but not in traditional general purpose disk arrays. Gartner

Organization Background Erasure Codes Timeline Classical Codes - (n, k) code Modern Codes Codes on Codes Network Codes Technical Analysis Optimality Tradeoff and Reliability Analysis System Requirements and Codes Literature & Key Players 3

Background Timeline Classical (n, k) codes 4

Timeline Overview Classical Codes Fountain Codes Codes on Codes Network Codes Tradeoff against Storage Overhead Reliability Performance Repair Degree Repair Bandwidth 5

Erasure Codes Timeline 1950s 1960 1998 2002 2010 2011 2013 Hamming Reed- Solomon Fountain Luby, LT Tornado Network Codes for Storage On the Locality Local Regeneration THEORY SYSTEMS RAID RAID-6 RapTor Pyramid Hierarchical Azure (mlrc) XORBAS (flrc) Double Replica I/O, BW Storage (FAST 15) 1988 2002 2006 2007 2008 2012 2013 2014 2015 6

Classical Codes (n, k) code; Illustration (6, 4) code Chunk Encode Retrieve Decode Re-encode Replace Object / File k data blocks n encoded blocks Surviving k blocks Reconstructed data Lost blocks n encoded blocks Encode Decode Repair Think distributed systems; repairs are expensive! 7

Modern Codes Codes on Codes Network Codes 8

Codes On Codes (n 1, k 1 ) + (n 2, k 2 ) & (k, l, r) codes Hierarchical & Pyramid Codes Azure (mlrc) Locality/ Max. Recoverability Repair Degree k=6 data fragments, l=2 local parities and r=2 global parities Decoding 3 and 4 failures in mlrc RS (14,10) Hierarchical Bottom Up Pyramid Top Down Facebook (XORBAS) Single block failures Locality/ Min. Distance 9

Regenerating Codes Inspired by Network Codes A B P1 = A + 2B A B C D C D A + B + C + D A+ 2B + C + 2D P2 = 2C + D P3 = 4A + 5B + 4C + 5D An Information Flow Graph & Min-Cut Bound A + 2B + 3C + D 3A + 2B + 2C + 3D Functional Repair 5A + 7B + 8C + 7D 6A + 9B + 6C + 6D 10

Regenerating Codes Repair By Transfer (RBT), MBR Code Local Regenerating Code 1 1 2 3 4 4 1 5 7 9 9 2 7 4 6 7 P MBR 5 2 5 6 8 6 8 3 3 8 9 P P Storage No Codes Exist! Storage / Repair BW 11 Pentagon Code Repair Bandwidth MSR

Technical Analysis Optimality Tradeoffs Reliability Analysis System Requirements 12

Summary of Codes and their Tradeoff Code/Family Tradeoff MDS Storage overhead Reliability Replication & Parity (RAID) Storage overhead Reliability Reed-Solomon Storage overhead Reliability Near-Optimal Correction capabilities Computational Complexity Fountain Rate Probability of Correction Codes on Codes Storage overhead Repair Degree (Fan-in) Azure (mlrc) MDS Maximum Recoverability XORBAS (flrc) Locality Minimum Distance Regenerating Storage overhead Repair Bandwidth Local Regenerating Storage overhead Reconstruction Cost 13

Reliability Analysis MTTDL is in the order of E+12 (for LRC) and E+09 (for Regenerating) Locally Repairable Codes Regenerating Codes 2.5 20 2.5 30 2 1.5 16 12 2 1.5 1 24 18 12 1 8 0.5 6 0.5 4 0 0 0 3-replica RS (14,10) flrc mlrc 3-replica RS (14,10) flrc (10,6,5) mlrc (6,2,2) (6,2,2) 0 Storage Overhead MTTDL Repair Traffic Storage Overhead MTTDL Code Length 14

System Requirements & Example Codes Architecture System Properties of the System Requirements for a Code General-purpose storage array Geo-distributed storage Secure Storage Distributed Systems Most Important Least Important Most Important Least Important Reliability & Performance Repair over WAN is expensive Security Parallelism & Availability Cost Reliability Complexity Storage overhead across DR sites Storage overhead Storage overhead Local repair Faster degraded reads Systematic Storage overhead Repair time Storage overhead Example family/ code MSR, SD/STAIR Codes LRC Non-systematic codes; MBR Replication Workload Big Data (say, Hadoop) Large volumes of data Write latency Storage overhead Repair bandwidth Regenerating (MSR/MBR), systematic 15

Literature & Key Players Theory & Systems 16

Literature & Key Players The Researchers, The Big Companies & The Startups! MSR/MBR Points (2013) UT Austin Alex Dimakis XORBAS (2013) Greenan Network Codes for Storage (2010) UC Berkeley Kannan R Piggyback (2013) Hitchhiker (2014) GF Intel SIMD (2013) Flat XOR (2010) Jerasure (2014) U Tennessee James Plank PM (MSR) RBT (2011-2015) PM (RBT) (2015) SD-Codes (2013) GF-Complete (2013) PMDS (2013) IISc Vijay Kumar STAIR (2014) Pyramid (2007) Double Rep (2014) Chinese U HongKong Patrick PC Lee Parikshit Gopalan mlrc (2012) Locality (2011) MIT Muriel Medard Network Flow & Linear Coding RLNC RS, Fountain Fountain NTU Oggier Self-repairing (2011) Non-systematic RS THEORY SYSTEMS RS 17

Other Relevant Areas Cross-object Coding Sector & Disk failures PMDS, SD, STAIR Codes Other media: Flash: LDPC, WOM, Multi-write codes; NVM Security Dispersal, AONT-RS Cloud NC-Cloud Transformational Codes: Transform encoded data to different parameters as they become hot/cold without decoding and re-encoding 18

Thank you. 19