Improving Bandwidth Efficiency of Peer-to-Peer Storage

Similar documents
The OceanStore Write Path

Staggeringly Large File Systems. Presented by Haoyan Geng

Opportunistic Use of Content Addressable Storage for Distributed File Systems

Reducing Replication Bandwidth for Distributed Document Databases

Consistency-preserving Caching of Dynamic Database Content

Improving Duplicate Elimination in Storage Systems

The Google File System

A DEDUPLICATION-INSPIRED FAST DELTA COMPRESSION APPROACH W EN XIA, HONG JIANG, DA N FENG, LEI T I A N, M I N FU, YUKUN Z HOU

A HIGH-PERFORMANCE OBLIVIOUS RAM CONTROLLER ON THE CONVEY HC-2EX HETEROGENEOUS COMPUTING PLATFORM

CS 152 Computer Architecture and Engineering. Lecture 9 - Address Translation

Midterm II December 4 th, 2006 CS162: Operating Systems and Systems Programming

File Systems. CS170 Fall 2018

An Approach for Hybrid-Memory Scaling Columnar In-Memory Databases

CS 152 Computer Architecture and Engineering. Lecture 9 - Address Translation

Store, Forget & Check: Using Algebraic Signatures to Check Remotely Administered Storage

The Google File System

Brocade: Landmark Routing on Peer to Peer Networks. Ling Huang, Ben Y. Zhao, Yitao Duan, Anthony Joseph, John Kubiatowicz

Authenticated Storage Using Small Trusted Hardware Hsin-Jung Yang, Victor Costan, Nickolai Zeldovich, and Srini Devadas

Global Data Plane. The Cloud is not enough: Saving IoT from the Cloud & Toward a Global Data Infrastructure PRESENTED BY MEGHNA BAIJAL

Two-Party Fine-Grained Assured Deletion of Outsourced Data in Cloud Systems

Designing a global repository using OceanStore

15-740/ Computer Architecture Lecture 19: Main Memory. Prof. Onur Mutlu Carnegie Mellon University

Dynamo: Amazon s Highly Available Key-value Store

ChunkStash: Speeding Up Storage Deduplication using Flash Memory

A Low-bandwidth Network File System

Third Midterm Exam April 24, 2017 CS162 Operating Systems

EC-Bench: Benchmarking Onload and Offload Erasure Coders on Modern Hardware Architectures

CS 152 Computer Architecture and Engineering. Lecture 11 - Virtual Memory and Caches

Dynamic Source Routing in ad hoc wireless networks

WAN Optimized Replication of Backup Datasets Using Stream-Informed Delta Compression

REMOTE PERSISTENT MEMORY ACCESS WORKLOAD SCENARIOS AND RDMA SEMANTICS

Stanford University Computer Science Department CS 240 Sample Quiz 2 Questions Winter February 25, 2005

Shared Memory Multiprocessors. Symmetric Shared Memory Architecture (SMP) Cache Coherence. Cache Coherence Mechanism. Interconnection Network

The Google File System

ECE 2300 Digital Logic & Computer Organization. More Caches

Introduction to OpenMP. Lecture 10: Caches

Deduplication Storage System

EFFICIENTLY ENABLING CONVENTIONAL BLOCK SIZES FOR VERY LARGE DIE- STACKED DRAM CACHES

EE/CSCI 451: Parallel and Distributed Computation

IBLT and weak block propagation performance. Kalle Rosenbaum (Popeller) & Rusty Russell (BlockStream)

Lecture S3: File system data layout, naming

Stanford University Computer Science Department CS 240 Quiz 2 with Answers Spring May 24, total

COSC 6385 Computer Architecture - Memory Hierarchy Design (III)

Peer-to-peer Sender Authentication for . Vivek Pathak and Liviu Iftode Rutgers University

The Google File System

Purity: building fast, highly-available enterprise flash storage from commodity components

Evaluating the Impact of RDMA on Storage I/O over InfiniBand

Advanced cache optimizations. ECE 154B Dmitri Strukov

CA485 Ray Walshe Google File System

JANUARY 20, 2016, SAN JOSE, CA. Microsoft. Microsoft SQL Hekaton Towards Large Scale Use of PM for In-memory Databases

Lecture 7 - Memory Hierarchy-II

Hadoop Beyond Batch: Real-time Workloads, SQL-on- Hadoop, and thevirtual EDW Headline Goes Here

Midterm II December 3 rd, 2007 CS162: Operating Systems and Systems Programming

Crossing the Chasm: Sneaking a parallel file system into Hadoop

Virtualization Technique For Replica Synchronization

Name: Instructions. Problem 1 : Short answer. [48 points] CMU / Storage Systems 23 Feb 2011 Spring 2012 Exam 1

CS 152 Computer Architecture and Engineering. Lecture 8 - Address Translation

The Fusion Distributed File System

I/O Buffering and Streaming

Cray XE6 Performance Workshop

Crossing the Chasm: Sneaking a parallel file system into Hadoop

The Google File System

An Efficient PGP Keyserver without Prior Context

On the Effects of Bandwidth Reduction Techniques in Distributed Applications

Introduction to Software Security Hash Functions (Chapter 5)

Certified Apache Cassandra Professional VS-1046

Staggeringly Large Filesystems

Enhancing Redundant Network Traffic Elimination

NDN-RTC. Peter Gusev UCLA REMAP 9/5/2014

CS 152 Computer Architecture and Engineering. Lecture 8 - Address Translation

Remote Data Checking for Network Codingbased. Distributed Storage Systems

Advantages of P2P systems. P2P Caching and Archiving. Squirrel. Papers to Discuss. Why bother? Implementation

Datacenter Wide- area Enterprise

CS433 Final Exam. Prof Josep Torrellas. December 12, Time: 2 hours

FADE: A Secure Overlay Cloud Storage System with Access Control and Assured Deletion. Patrick P. C. Lee

Online Version Only. Book made by this file is ILLEGAL. Design and Implementation of Binary File Similarity Evaluation System. 1.

Analyzing Memory Access Patterns and Optimizing Through Spatial Memory Streaming. Ogün HEPER CmpE 511 Computer Architecture December 24th, 2009

File System Implementations

Contents Part I Traditional Deduplication Techniques and Solutions Introduction Existing Deduplication Techniques

A Comparison of File. D. Roselli, J. R. Lorch, T. E. Anderson Proc USENIX Annual Technical Conference

A Framework for Memory Hierarchies

ECE7995 (6) Improving Cache Performance. [Adapted from Mary Jane Irwin s slides (PSU)]

The Google File System

Peer Content Caching and Retrieval: Content Identification

Lecture: Large Caches, Virtual Memory. Topics: cache innovations (Sections 2.4, B.4, B.5)

S3FS in the wide area Alvaro Llanos E M.Eng Candidate Cornell University Spring/2009

Moneta: A High-Performance Storage Architecture for Next-generation, Non-volatile Memories

Albis: High-Performance File Format for Big Data Systems

Portland State University ECE 587/687. Caches and Memory-Level Parallelism

Key Point. What are Cache lines

Main Memory (Part II)

An Evaluation of Checkpoint Recovery For Massively Multiplayer Online Games

HPDedup: A Hybrid Prioritized Data Deduplication Mechanism for Primary Storage in the Cloud

WORKLOAD CHARACTERIZATION OF INTERACTIVE CLOUD SERVICES BIG AND SMALL SERVER PLATFORMS

Delta Compressed and Deduplicated Storage Using Stream-Informed Locality

Early Measurements of a Cluster-based Architecture for P2P Systems

What's new in Jewel for RADOS? SAMUEL JUST 2015 VAULT

Secure Distributed Storage in Peer-to-peer networks

RAS$Directed,Instruc1on,Prefetching, (RDIP),

IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 8, NO. 2, APRIL Segment-Based Streaming Media Proxy: Modeling and Optimization

Transcription:

Improving Bandwidth Efficiency of Peer-to-Peer Storage Patrick Eaton, Emil Ong, John Kubiatowicz University of California, Berkeley http://oceanstore.cs.berkeley.edu/

P2P Storage: Promise vs.. Reality

P2P Storage: Promise vs.. Reality Promise: P2P storage is Globally accessible Highly available Extremely durable

P2P Storage: Promise vs.. Reality Promise: P2P storage is Globally accessible Highly available Extremely durable

P2P Storage: Promise vs.. Reality Promise: P2P storage is Globally accessible Highly available Extremely durable Reality: Many challenges

Reality: Non-Uniform Network Connectivity Well-connected network core Weakly-connected clients Challenge: Reasonable performance for weakly-connected clients

Reality: Non-Uniform Network Connectivity Well-connected network core Weakly-connected clients Challenge: Reasonable performance for weakly-connected clients

Infrastructure Assumptions Infrastructure helps manage objects Like FARSITE or OceanStore Provides update application, serialization, durability, etc. Partially trusted servers in network core Trusted to execute protocol, not trusted with user data

Infrastructure Assumptions Infrastructure helps manage objects Like FARSITE or OceanStore Provides update application, serialization, durability, etc. Partially trusted servers in network core Trusted to execute protocol, not trusted with user data Update Result

Infrastructure Assumptions Infrastructure helps manage objects Like FARSITE or OceanStore Provides update application, serialization, durability, etc. Partially trusted servers in network core Trusted to execute protocol, not trusted with user data Update Read Result Data

Challenges Caching, Prefetching, Consistency Efficient update encoding, Link scheduling Verification, Privacy Security Prefetching Link scheduling

Challenges Caching, Prefetching, Consistency Efficient update encoding, Link scheduling Verification, Privacy Security Prefetching Cache Link scheduling

Challenges Caching, Prefetching, Consistency Efficient update encoding, Link scheduling Verification, Privacy Security Prefetching Cache Link scheduling

Challenges Caching, Prefetching, Consistency Efficient update encoding, Link scheduling Verification, Privacy Security Prefetching Cache Link scheduling

Challenges Caching, Prefetching, Consistency Efficient update encoding, Link scheduling Verification, Privacy Security Prefetching Link scheduling Cache?

Bandwidth-Efficient Peer-to-Peer Storage Goal Minimize bandwidth to store data Maintain data privacy and verifiability?

Outline Introduction and Background Minimize bandwidth to write data Delta-encoded Updates Efficiency Results Maintain privacy and verifiability Data Representation Conclusion

Ideal Updates Ideally, application provides changes Alternatively, extract changes

Common Approaches Whole - replace all data on change Block - write blocks that have changed Fix-sized blocks

Alternative: Rabin Fingerprints

Computing Changes Diff F = H(C1) H(C2) H(C3) H(C4) H(C5), F = H(C1) H(C6) H(C4) H(C5) Compare list of hashes to find changes Remove C2 and C3, Insert C6

Computing Changes Diff F = H(C1) H(C2) H(C3) H(C4) H(C5), F = H(C1) H(C6) H(C4) H(C5) Compare list of hashes to find changes Remove C2 and C3, Insert C6

Creating Updates Combine list of changes and File Summaries Remove C2 and C3, Insert C6 1. H(C1), 0, 500 2. H(C2), 500, 500 3. H(C3), 1000, 300 4. H(C4), 1300, 600 5. H(C5), 1900, 700 1. H(C1), 0, 500 2. H(C6), 500, 1000 3. H(C4), 1500, 600 4. H(C5), 2100, 700

Creating Updates Combine list of changes and File Summaries Remove C2 and C3, Insert C6 1. H(C1), 0, 500 2. H(C2), 500, 500 3. H(C3), 1000, 300 4. H(C4), 1300, 600 5. H(C5), 1900, 700 1. H(C1), 0, 500 2. H(C6), 500, 1000 3. H(C4), 1500, 600 4. H(C5), 2100, 700 Update = (Remove offset=500, length=800), (Insert offset=500, length=1000, E(C6))

Creating Updates Combine list of changes and File Summaries Remove C2 and C3, Insert C6 1. H(C1), 0, 500 2. H(C2), 500, 500 3. H(C3), 1000, 300 4. H(C4), 1300, 600 5. H(C5), 1900, 700 1. H(C1), 0, 500 2. H(C6), 500, 1000 3. H(C4), 1500, 600 4. H(C5), 2100, 700 Update = (Remove offset=500, length=800), (Insert offset=500, length=1000, E(C6)) Data =

Three Micro-Workloads MS Word 11 page (700 KB) document Perform global search-and-replace Java development Source code files from consecutive CVS check-ins 26 files (435 KB) change Email User receives 5 new messages Mailbox size increases 80 KB -> 100 KB

Overheads Computation of File Summaries 5 MB of data/second Storage overhead of File Summaries Word: 4.4 MB of state/gb of user data Update creation Word: 120 ms 50 ms to encrypt data

Update Size Server stores initial version of file Measure update size to write new version Update Size (bytes) 800,000 700,000 600,000 500,000 400,000 300,000 200,000 100,000 0 Whole Block Rabin Ideal Update Approach

Update Latency Measure time to save new version Compute update, network transmission, execute Results shown for 56 kb network connection 120 seconds 100 80 60 40 20 0 Whole Block Rabin Rsync/Ideal Update Approach

Outline Introduction and Background Minimize bandwidth to write data Delta-encoded Updates Efficiency Results Maintain privacy and verifiability Data Representation Conclusion

Data Representation Requirements Implement operations used in updates Append, Truncate, Insert, Delete Verifiable Index blocks of variable size Updates cause proportional changes Data treated as opaque

Data Structure Build tree over blocks of data Reference children by secure hash for verifiability Insight: Use relative offsets

Modifying the Data Structure

Conclusion P2P storage systems should support weakly-connected clients Delta-encoded updates reduce update bandwidth Verifiable data structure to execute updates of encrypted data

Related Work Low Bandwidth File System (LBFS) Muthitacharoen, Chen, and Mazieres CASPER file system Tolia, et. al. EXODUS database system Carey, DeWitt, Richardson, and Shekita

Future Work Evaluate on larger, real-world workloads Study other challenges Efficient sharing between users/devices Write buffering and consistency Support disconnected operation

Improving Bandwidth Efficiency of Peer-to-Peer Storage Patrick Eaton, Emil Ong, and John Kubiatowicz University of California, Berkeley http://oceanstore.cs.berkeley.edu/