Erasure Codes for Heterogeneous Networked Storage Systems

Lluís Pàmies i Juárez, lpjuarez@ntu.edu.sg

Outline
1. Introduction
2. Distributed Storage Allocation Problem
3. Homogeneous Distributed Systems
4. Heterogeneous Distributed Systems:
   1. Orchestrated Systems
   2. P2P Systems
5. Other Open Problems in Distributed Storage Systems

Data Centers

Peer-to-Peer / Friend-to-Friend Networks

Networked Storage Systems
How can a file be stored so as to maximize its data reliability?

Replication: Networked Storage Systems

Networked Storage Systems: Erasure Codes (MDS)
(5,3)-erasure code: split the file into 3 chunks and generate 5 linear combinations of them (e.g. Reed-Solomon).

Networked Storage Systems: Decoding
(5,3)-erasure code: data is decoded by contacting any 3 surviving nodes.
Same reliability as 3-way replication (both tolerate 2 node failures) with a smaller storage footprint (5/3 of the file size instead of 3 times the file size).
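The (5,3) example can be made concrete with a small sketch. It is not the code used in the talk: it assumes a toy Reed-Solomon-style construction over the prime field GF(257) (production codes usually work in GF(2^8)), and the names `P`, `lagrange_eval`, `encode` and `decode` are hypothetical. It needs Python 3.8 or later for the modular inverse `pow(den, -1, P)`.

```python
from itertools import combinations

P = 257  # assumed prime field; input symbols are bytes, encoded symbols lie in 0..256


def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) through `points`, mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(chunks, n):
    """Systematic (n, k) encoding: block i is the polynomial through the k chunks, evaluated at i."""
    points = list(enumerate(chunks))
    return [lagrange_eval(points, x) for x in range(n)]


def decode(survivors, k):
    """Recover the k original chunks from any k (node_index, block) pairs."""
    points = survivors[:k]
    return [lagrange_eval(points, x) for x in range(k)]


chunks = [72, 105, 33]                    # the file, split into k = 3 symbols
blocks = encode(chunks, n=5)              # 5 blocks, one per storage node
for alive in combinations(range(5), 3):   # every choice of 3 surviving nodes
    assert decode([(i, blocks[i]) for i in alive], k=3) == chunks
```

Each block is a linear combination of the 3 chunks, and the loop checks the MDS property stated on the slide: any 3 of the 5 blocks are enough to decode.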

Storage Allocation Problem How to assign n redundant blocks to a given set of storage nodes?

De-facto Premises
Node failures/unavailabilities follow a uniform distribution.
The assignment of the n encoded fragments has no impact on data reliability.

Traditional Coding
Coding: the encoded data stored on each node always has the same size and always has the same importance.
Data assignment: one block per node.

Exploiting Heterogeneities: Alternative Coding Schemes
Encoded blocks of different size.
Encoded blocks of different importance [1].
[1] Hierarchical Codes. Duminuco, Biersack (P2P'2008)

Storage Allocation Problem

Solutions for Different Regimes (probabilistic access)
Max Symmetrical Spreading: Spread the budget T uniformly into the n storage nodes.
Min Symmetrical Spreading: Spread the budget T uniformly into T out of the n storage nodes (replication).
[1] Erasure code replication revisited. Lin, Chiu, Lee. P2P'2004
[2] Distributed Storage Allocation Problems. Leong, Dimakis, Ho. NetCod'200
[3] Distributed Storage Allocation for High Reliability. Leong, Dimakis, Ho. ICC'200
[4] Symmetric Allocations for Distributed Storage. Leong, Dimakis, Ho. GLOBECOM'200
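A small calculation shows why the best allocation depends on the regime. The sketch below assumes the usual probabilistic-access model: the budget T is an integer number of file sizes, the data is MDS-coded so that any set of reachable blocks totalling one file size is enough, and each node is reachable independently with probability p. The function name `recovery_probability` and the values n = 4, T = 2 are illustrative assumptions.

```python
from math import comb


def recovery_probability(m, T, p):
    """P[file recoverable] when an integer budget of T file sizes is spread
    evenly over m nodes, each reachable independently with probability p."""
    need = -(-m // T)  # ceil(m / T): nodes needed, since each holds T/m of a file
    return sum(comb(m, r) * p**r * (1 - p)**(m - r) for r in range(need, m + 1))


n, T = 4, 2  # hypothetical system: 4 nodes, budget of 2 file sizes
for p in (0.5, 0.9):
    minimal = recovery_probability(T, T, p)  # min spreading: T full replicas
    maximal = recovery_probability(n, T, p)  # max spreading: all n nodes used
    print(f"p={p}: min spreading {minimal:.4f}, max spreading {maximal:.4f}")
```

Under these assumptions, min spreading wins at p = 0.5 (0.75 against 0.6875), while max spreading wins at p = 0.9 (0.9963 against 0.99).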

Problem in Heterogeneous Environments
How are redundant blocks assigned to storage nodes? Uniformly? Randomly? Proportionally?
[Figure labels: uniformly N=4; randomly N=2; proportionally N=2; node availabilities p = 0.2, 0.75, 0.9]

Assignments in Different Scenarios
Orchestrated Storage Systems:
All storage nodes belong to the same organization.
The objective is to maximize overall storage capacity.
P2P Storage Systems:
Each node is a user that exchanges data reciprocally with other users.
Users need to provide more resources to obtain more capacity.
Users aim to minimize the resources they have to exchange to store a given amount of data.

Assignment in Orchestrated Systems
We generate a large number of redundant blocks and we try different assignment policies.
The optimization process is based on a PSO (particle swarm optimization) algorithm.
We run the PSO for different redundancies and different heterogeneities.
Fitness function: data availability.
[Figure: Data Assignment Policies; node availabilities 0.2, 0.75, 0.9; labels "Number of nodes 4", "Number of blocks 2"]
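The fitness function can be sketched as follows. This is an illustration rather than the implementation used with the PSO: it assumes an MDS code in which any k blocks decode the file and nodes that are online independently of each other. The name `data_availability`, the value k = 3 and the two block assignments are hypothetical; only the availabilities 0.2, 0.75 and 0.9 come from the slide.

```python
def data_availability(avail, blocks, k):
    """Probability that the online nodes jointly hold at least k blocks.

    avail[i]  -- probability that node i is online (independent nodes assumed)
    blocks[i] -- number of redundant blocks assigned to node i
    """
    dist = [1.0]  # dist[b] = probability that exactly b blocks are reachable
    for p, b in zip(avail, blocks):
        new = [0.0] * (len(dist) + b)
        for have, prob in enumerate(dist):
            new[have] += prob * (1 - p)  # node offline: its blocks are unreachable
            new[have + b] += prob * p    # node online: b more blocks reachable
        dist = new
    return sum(dist[k:])


avail = [0.2, 0.75, 0.9]
k = 3  # hypothetical: any 3 blocks decode the file
print(data_availability(avail, [2, 2, 2], k))  # uniform assignment of 6 blocks
print(data_availability(avail, [1, 2, 3], k))  # more blocks on more available nodes
```

With these numbers the uniform assignment yields 0.735 and the skewed one 0.915, which illustrates why the assignment of blocks to heterogeneous nodes matters.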

Assignment in Orchestrated Systems (cont'd)
Proportional assignment achieves the maximum data availability.
[Figure: Data Assignment Policies; N=000, nodes=00]

Assignment in Orchestrated Systems (cont'd)
Possible problems with proportional assignment:
1) The most available nodes are over-utilized.
2) Part of the capacity of the least available nodes is never used, which reduces the overall storage capacity.
3) Unfair assignments in P2P.
It does not happen!
[Figure: Data Assignment Policies; node availabilities 0.2, 0.75, 0.9]

Guaranteeing Fairness Between Peers
Common decentralized solution to guarantee fairness among users: reciprocal data exchanges between users.
Advantages:
No third parties involved: users find their own storage partners and send data directly to the node that will store it.
Fairness: for each data block stored remotely, a peer needs to give back the same amount of local disk resources. No user consumes more resources than it provides.

Reciprocal Exchanges
[Figure sequence: reciprocal exchanges; one panel labelled "Symmetric Exchanges"]

Reciprocal Exchanges
Problem: they do not incentivize users to improve their online availability.
Solution: selfish partner selection.

Selfish Partner Selection
Gradient topology: users exchange data with users of similar online availability.
Highly available users require less redundancy.

Problem With Selfish Partner Selection
Let's compare two different scenarios: selfish selection and random selection.
Selfish selection and gradient topologies are suboptimal in terms of the overall storage resources consumed in the storage system: they require more redundancy.
[Figure: redundancy versus online node availability; 0.65 marked on the availability axis]
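The claim can be checked numerically on a toy population. The sketch below is an assumption-laden illustration, not the analysis behind the plotted curves: half of the peers have availability 0.4 and half 0.9 (hypothetical values with mean 0.65), data is protected by an (n, k) MDS code with k = 2 and a target availability of 0.99, and a randomly chosen partner set is approximated by its mean availability. The names `availability` and `min_blocks` are hypothetical.

```python
from math import comb


def availability(n, k, p):
    """Probability that at least k of n blocks are on online nodes."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


def min_blocks(p, k, target):
    """Smallest n such that an (n, k) code reaches the target availability."""
    n = k
    while availability(n, k, p) < target:
        n += 1
    return n


k, target = 2, 0.99       # hypothetical code dimension and availability target
low, high = 0.4, 0.9      # hypothetical peer availabilities
mean = (low + high) / 2   # partner availability under random selection

selfish_low = min_blocks(low, k, target)    # low peers paired with low peers
selfish_high = min_blocks(high, k, target)  # high peers paired with high peers
random_any = min_blocks(mean, k, target)    # any peer paired at random

print("selfish:", selfish_low, selfish_high, "random:", random_any)
print("average blocks per peer:", (selfish_low + selfish_high) / 2, "vs", random_any)
```

In this toy setting selfish selection needs 14 blocks for low-availability peers and 4 for high-availability peers (9 on average), while random selection needs 7 for everyone: low-availability peers save 7 blocks, high-availability peers lose 3, and the system as a whole stores less.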

Problem With Selfish Partner Selection
[Figures: "Two nodes, all availabilities"; "00 nodes, clustered by availability"]

Problem Statement
Random selection of storage partners reduces the overall storage resources required in the system:
Low-availability peers benefit by switching from selfish to random partner selection.
But high-availability peers have no incentive to switch from the selfish to the random partner selection policy.
Can we make the random partner selection policy attractive for all peers?

Interesting Observation
Redundancy savings for low-available nodes. Redundancy losses for high-available nodes. Savings > Losses.
[Figure: redundancy versus online node availability; 0.65 marked on the availability axis; label "Always Positive"]
Can low-available nodes compensate for the losses of high-available nodes, and globally reduce the amount of resources all users contribute?

Asymmetric Reciprocal Exchanges
[Figure: asymmetric exchanges between users]
Given the availabilities of two users, what is the optimal asymmetric exchange ratio?

Our Implementation
Solve a system of linear equations, defined by two proportionality rules:
Global savings are distributed in proportion to the online availability of each peer.
Each peer compensates the partners that are more available than itself, in proportion to their online availability.

Other Open Problems
Repair problem: Large datacenters register 3-6% of hard drive failures every year, which leads to high repair communication. Repair redundant blocks without reconstructing the original file [1],[2].
Allocation problems: Consider datacenter network topologies and different correlated failure patterns.
Data access: Replication guarantees efficient accesses (no decoding) and allows moving computation to where the data is stored (less communication). Improve data assignments in coding to minimize these inefficiencies.
Data insertion: In erasure codes, data is inserted from a single node that has to generate and store n redundant blocks, which gives low insertion throughput. Use in-network coding to improve the data insertion throughput [3].
[1] Network Coding for Distributed Storage Systems. Dimakis et al. IEEE Transactions on Information Theory.
[2] Self-repairing Homomorphic Codes for Distributed Storage Systems. Oggier and Datta. Infocom 200.
[3] In-Network Redundancy Generation for Opportunistic Speedup of Backup. Pamies, Datta and Oggier. 20
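To give a sense of the repair problem, the sketch below compares the traffic needed to rebuild one lost block. The classical figure follows from decoding the file out of k blocks. The second function uses the minimum-storage regenerating (MSR) repair bandwidth as it is usually stated for the codes of [1], quoted here from memory as an assumption rather than taken from the slides. Function names and the unit file size are hypothetical.

```python
def classical_repair_traffic(file_size, k):
    """Rebuild one block by decoding: download k blocks of size file_size / k."""
    return k * (file_size / k)


def msr_repair_traffic(file_size, k, d):
    """Assumed MSR point: each of d helper nodes sends file_size / (k * (d - k + 1))."""
    return file_size * d / (k * (d - k + 1))


file_size = 3.0  # hypothetical file of 3 units, stored with the (5,3) code
print(classical_repair_traffic(file_size, k=3))  # whole file crosses the network
print(msr_repair_traffic(file_size, k=3, d=4))   # all 4 surviving nodes help
```

For the (5,3) code and a 3-unit file, classical repair moves 3 units to rebuild a single 1-unit block, while the assumed MSR bound with d = 4 helpers moves 2 units. With d = k the two figures coincide.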

Q&A Thanks