Improving Cloud Storage Cost and Data Resiliency with Erasure Codes Michael Penick
1 Improving Cloud Storage Cost and Data Resiliency with Erasure Codes Michael Penick
2 Commodity Storage. Uses: hosting storage, FTP backup. Goals: inexpensive (uses commodity hardware), resilient to failures, highly available, customizable.
3 MogileFS: open source distributed filesystem written by Brad Fitzpatrick; no single point of failure; automatic/asynchronous file replication; shared-nothing design (disks); local-filesystem agnostic; flat namespace. Components: tracker, storage node, metadata DB.
4 MogileFS [architecture diagram: clients, tracker, metadata DB, storage node]
5 NebulaFS: large file support; offsite replication; self-healing; data retention; C++ client (PHP and Perl SWIG wrappers); metadata sharding; Range GETs.
6 NebulaFS [architecture diagram: clients; two tracker/storage nodes, each with MySQL; two storage nodes]
7 FTP Backup [stack diagram: FTP presentation (Net::FTPServer), VFS DB, NebulaFSAPI, super nodes, NebulaFS storage nodes, metadata DB]
8 Widely Applicable: storage service (REST) layer. New product integrations: Online File Folder (videos and images), Website Builder/Photo Album, Go Daddy Cloud Servers (snapshots).
9 Object Storage [stack diagram: RESTful presentation (S3, GDCS), VFS, NebulaFSAPI, VFS DB, User DB, super nodes, NebulaFS storage nodes, metadata DB]
10 Why?
11 Why? [Two pie charts, one labeled ~3.25 PB and one labeled ~10.8 PB, each broken down by Aries, FTP, WST/PA, OFF, VDC, and Other.]
12 The Problem. NebulaFS = inexpensive, resilient, highly available storage. Problem: disk drives fail... a lot. If F is the mean time to failure of one device, then in a system of n devices the mean time to failure is F/n. Solution: replicate the data.
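The F/n relationship on this slide can be made concrete with a short calculation. This is a minimal sketch, assuming independent drives with a constant failure rate. The per-drive figure is an illustrative placeholder, not a number from the talk; the 180-drive case matches the test cluster described on a later slide.

```python
def system_mttf_hours(device_mttf_hours: float, n_devices: int) -> float:
    """Mean time until the first failure among n independent devices: F / n."""
    if n_devices < 1:
        raise ValueError("n_devices must be at least 1")
    return device_mttf_hours / n_devices


# Assumed, illustrative per-drive MTTF (not a figure from the slides).
ASSUMED_DRIVE_MTTF_HOURS = 1_000_000.0

for n in (1, 180, 10_000):
    hours = system_mttf_hours(ASSUMED_DRIVE_MTTF_HOURS, n)
    print(f"{n:>6} drives: one failure roughly every {hours:,.0f} hours "
          f"({hours / 24:,.1f} days)")
```

The point of the slide survives any choice of F: the more drives in the system, the more often some drive is failing, so redundancy is not optional.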
13 Replication [diagram: duplicate (copy 1, copy 2), then replicate (copy 3, copy 4), across Disk 1, Disk 2, ..., Disk n. Success!]
14 Replication: simple and effective. Durability: % over 1 year (or 0.1 of 1 million objects); 99.99% over 3 years (or 100 of 1 million objects). Problem: 100% overhead per copy; +300% overhead for 3 onsite copies and 1 offsite copy. There has to be a better way.
15 Erasure Codes: a forward error correction code; adds redundant data (codes) to a message so that it can be recovered. Where's EC used? Optical media, media streaming, file systems (RAID-6, several distributed FS, ...).
16 Erasure code (write) [diagram: Copy, Divide, Encode; k and m pieces written to Disk 1, Disk 2, ..., Disk n; label "Copy .75"]
17 Erasure code (read) [diagram: Disk ... Disk k ... Disk n; Verify; Decode]
18 Erasure codes. What? k = number of original pieces; m = number of redundant pieces (codes). How? k = 4, m = 3: only 75% overhead (tolerates 3 failures); k = 10, m = 6: only 60% overhead (6 failures); k = 9, m = 3: only 33% overhead (3 failures). AKA: k = 10, m = 6 is written "10 of 16".
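The overhead figures on this slide follow from m / k, and the replication figure from the earlier slide follows from counting each extra copy as 100%. A minimal sketch of that arithmetic; the function names are illustrative, not part of NebulaFS:

```python
from fractions import Fraction


def ec_overhead(k: int, m: int) -> Fraction:
    """Extra storage for a k-of-(k+m) code, as a fraction of the original size."""
    if k < 1 or m < 0:
        raise ValueError("need k >= 1 and m >= 0")
    return Fraction(m, k)


def replication_overhead(copies: int) -> Fraction:
    """Extra storage for full copies: every copy beyond the first adds 100%."""
    if copies < 1:
        raise ValueError("need at least one copy")
    return Fraction(copies - 1)


for k, m in ((4, 3), (10, 6), (9, 3)):
    pct = float(ec_overhead(k, m)) * 100
    print(f"{k} of {k + m}: {pct:.0f}% overhead, tolerates {m} lost pieces")

print(f"4 full copies: {float(replication_overhead(4)) * 100:.0f}% overhead")
```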
19 Trade-offs (positive). Better resilience to failure. Durability for 10 of 16: % over 1 year (or of 1 million objects); % over 3 years (or 0.1 of 1 million objects). Durability for 9 of 12: % over 1 year (or 0.1 of 1 million objects); 99.99% over 3 years (or 100 of 1 million objects). Significant savings (includes a full offsite copy): 10 of 16: (4 - 2.60) / 4 = 35% savings (60% w/o offsite); 9 of 12: (4 - 2.33) / 4 = 42% savings (67% w/o offsite).
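The savings figures compare an erasure-coded layout against four full copies. The sketch below reproduces them under that reading: the erasure-coded object costs (k + m) / k times its size, plus 1.0 for a full offsite copy when one is kept. The function names and the default of four replicas are assumptions made for illustration.

```python
def storage_factor(k: int, m: int, offsite_copies: int = 0) -> float:
    """Bytes stored per byte of user data: erasure-coded onsite plus full offsite copies."""
    return (k + m) / k + offsite_copies


def savings_vs_replication(k: int, m: int, replicas: int = 4,
                           offsite_copies: int = 1) -> float:
    """Fraction of raw capacity saved compared with `replicas` full copies."""
    return (replicas - storage_factor(k, m, offsite_copies)) / replicas


for k, m in ((10, 6), (9, 3)):
    with_offsite = savings_vs_replication(k, m, offsite_copies=1)
    without_offsite = savings_vs_replication(k, m, offsite_copies=0)
    print(f"{k} of {k + m}: {with_offsite:.0%} savings "
          f"({without_offsite:.0%} w/o offsite)")
```

For 9 of 12 the slide rounds the storage factor to 2.33 before dividing; using the exact 12/9 gives the same 42% after rounding.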
20 Trade-offs (negative): computationally expensive; increased number of IOPS; complexity (additional metadata); more nodes and connections.
21 Erasure Codes. Optimal erasure code: any k pieces of the message can recover the message. Reed-Solomon (and Cauchy Reed-Solomon). Libraries (Jerasure, Zfec, Luby, librs, ...). Stability/performance evaluation paper: "A Performance Comparison of Open-Source Erasure Coding Libraries for Storage Applications".
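The "any k pieces recover the message" property is easiest to see in the smallest possible case. The sketch below is not Reed-Solomon and is not taken from any of the libraries listed; it uses a single XOR parity piece (m = 1), which is the simplest code with that property, and checks every way of losing one piece.

```python
from functools import reduce
from itertools import combinations


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length pieces."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(pieces):
    """Return the k data pieces followed by one XOR parity piece (m = 1)."""
    return list(pieces) + [reduce(xor_bytes, pieces)]


def decode(available: dict, k: int) -> bytes:
    """Rebuild the message from any k of the k + 1 pieces.

    `available` maps piece index (0..k, where index k is the parity) to bytes.
    """
    if len(available) < k:
        raise ValueError("need at least k pieces")
    missing = [i for i in range(k) if i not in available]
    pieces = dict(available)
    if missing:
        # One data piece is gone: XOR of the k surviving pieces restores it.
        pieces[missing[0]] = reduce(xor_bytes, available.values())
    return b"".join(pieces[i] for i in range(k))


message = b"erasure-code"                      # 12 bytes, split into k = 4 pieces
k = 4
size = len(message) // k
stored = encode([message[i * size:(i + 1) * size] for i in range(k)])

# Every choice of k surviving pieces out of k + 1 recovers the message.
for keep in combinations(range(k + 1), k):
    assert decode({i: stored[i] for i in keep}, k) == message
```

Reed-Solomon generalizes this to m > 1 by replacing XOR with arithmetic over a Galois field, which is where the computational cost mentioned under the negative trade-offs comes from.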
22 EC Libraries - zfec: Reed-Solomon; written in C (Python and Haskell bindings). Download: [URL not preserved]. Documentation is the source code.
23 EC Libraries - zfec: Encoding
24 EC Libraries - zfec: Decoding
25 EC Libraries - zfec: Decoding contd.
26 EC Libraries - zfec: Decoding contd.
27 EC Libraries - zfec: Decoding contd. Example with k = 6, m = 2, erasures = { 0, 2, -1 }: inpkts = { coding 0, data 1, coding 1, data 3, data 4, data 5 }; index = { 6, 1, 7, 3, 4, 5 }; outpkts = { data 0, data 2 }.
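The arrays on this slide suggest a simple rule: each surviving data block stays in its own slot, and each erased data slot is filled with the next surviving coding block, whose index is k plus its coding number. The sketch below only builds that index array in plain Python. It is inferred from the slide's example, does not call zfec, and the function name is hypothetical.

```python
def decode_index(k: int, m: int, erasures):
    """Build the per-slot block index for decoding.

    `erasures` lists the ids of lost blocks (data 0..k-1, coding k..k+m-1)
    and is terminated by -1, as on the slide.
    """
    erased = set()
    for block_id in erasures:
        if block_id < 0:          # -1 marks the end of the list
            break
        erased.add(block_id)

    # Surviving coding blocks, in order, are the substitutes for lost data.
    spares = iter(c for c in range(k, k + m) if c not in erased)

    index = []
    for slot in range(k):
        if slot in erased:
            try:
                index.append(next(spares))
            except StopIteration:
                raise ValueError("more erasures than surviving coding blocks")
        else:
            index.append(slot)
    return index


def label(block_id: int, k: int) -> str:
    """Human-readable name matching the slide's notation."""
    return f"data {block_id}" if block_id < k else f"coding {block_id - k}"


index = decode_index(6, 2, [0, 2, -1])
print(index)                          # [6, 1, 7, 3, 4, 5]
print([label(i, 6) for i in index])
```

The decoder's output packets are then exactly the erased data blocks, here data 0 and data 2.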
28 EC Libraries - Jerasure: Reed-Solomon, Cauchy Reed-Solomon, and minimal density codes; written in C (no bindings). Download: [URL not preserved]. Good documentation and examples.
29 EC Libraries - Jerasure: Encoding
30 EC Libraries - Jerasure: Decoding
31 EC Libraries: Performance [chart: encoding throughput (MB/s) for w = 8, w = 16, and w = 32; series: Jerasure RS, Jerasure CRS, zfec]
32 EC Libraries: Performance [chart: decoding throughput (MB/s) for w = 8, w = 16, and w = 32; series: Jerasure RS, Jerasure CRS, zfec]
33 Integration: EC Library. EC library (Phase I): read/copy, write, repair.
34 Integration: EC Library. Inputs/outputs abstracted; boost::asio (HTTP); PHP/Perl bindings; random access reads (i.e. Range GET); data validated/corrected on the fly.
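Random access reads over erasure-coded data require mapping a byte range onto the data strips that hold it. The slides do not describe NebulaFS's on-disk layout, so the sketch below assumes one purely for illustration: the object is cut into fixed-size blocks, and consecutive blocks are dealt round-robin across the k data strips. All names are hypothetical.

```python
def locate(offset: int, k: int, block_size: int):
    """Map a byte offset to (stripe number, data strip, offset within block)."""
    block_no, within = divmod(offset, block_size)
    stripe, strip = divmod(block_no, k)
    return stripe, strip, within


def strips_for_range(start: int, end: int, k: int, block_size: int):
    """Data strips touched by the inclusive byte range [start, end]."""
    if start < 0 or end < start:
        raise ValueError("invalid byte range")
    first_block = start // block_size
    last_block = end // block_size
    if last_block - first_block + 1 >= k:
        return list(range(k))     # the range spans a full stripe or more
    return sorted({b % k for b in range(first_block, last_block + 1)})


# A Range GET for bytes 1000-2100 with k = 4 and 1 KiB blocks.
print(locate(1000, k=4, block_size=1024))
print(strips_for_range(1000, 2100, k=4, block_size=1024))
```

Under this assumed layout a small Range GET touches only a few strips, and the coding strips are needed only when a required data strip is unavailable or fails validation.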
35 Integration: EC Library, writes [diagram: k and m pieces written to Disk 1, Disk 2, ..., Disk n]
36
37 Integration: EC Library, failures
38 Integration: EC Library, reads [diagram: Disk ... Disk k ... Disk n]
39 Integration: EC Library
40 Integration: EC Library, copy [diagram: Disk ... Disk k copied to Disk 1, Disk 2, ..., Disk n]
41 Integration: EC Library
42 Integration: EC Library, repair [diagram: Disk 1, Disk 2, ..., Disk n; k]
43 Integration: EC Library
44 Integration: Reads/Writes
45 Integration: Reads/Writes, DB. Increased number of file_device entries; decreased number of file entries; change the meaning of class.
46 Integration: Reads/Writes, DB
47 Integration: Write
48 Integration: Read
49 Integration: Recovery
50 Lessons Learned: CRC32 can be slow (Intel's Slicing-by-8 algorithm); block size can limit your smallest file size; Lighttpd doesn't support Transfer-Encoding: chunked.
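Slicing-by-8 speeds up CRC32 by consuming eight input bytes per step with eight lookup tables, instead of one byte per step with one table. The sketch below shows the structure of the algorithm in Python and checks it against `zlib.crc32`. It illustrates the technique only: it is not the talk's C++ code, and pure Python will not show the speed benefit.

```python
import zlib

POLY = 0xEDB88320  # reflected CRC-32 polynomial, the variant zlib uses


def _build_tables():
    """tables[0] is the classic byte table; tables[n] adds n trailing zero bytes."""
    tables = [[0] * 256 for _ in range(8)]
    for i in range(256):
        c = i
        for _ in range(8):
            c = (c >> 1) ^ POLY if c & 1 else c >> 1
        tables[0][i] = c
    for i in range(256):
        c = tables[0][i]
        for n in range(1, 8):
            c = tables[0][c & 0xFF] ^ (c >> 8)
            tables[n][i] = c
    return tables


T = _build_tables()


def crc32_slicing_by_8(data: bytes, crc: int = 0) -> int:
    crc ^= 0xFFFFFFFF
    i, n = 0, len(data)
    while n - i >= 8:
        # Fold the running CRC into the first four bytes, then do 8 lookups.
        lo = int.from_bytes(data[i:i + 4], "little") ^ crc
        hi = int.from_bytes(data[i + 4:i + 8], "little")
        crc = (T[7][lo & 0xFF] ^ T[6][(lo >> 8) & 0xFF] ^
               T[5][(lo >> 16) & 0xFF] ^ T[4][lo >> 24] ^
               T[3][hi & 0xFF] ^ T[2][(hi >> 8) & 0xFF] ^
               T[1][(hi >> 16) & 0xFF] ^ T[0][hi >> 24])
        i += 8
    while i < n:                      # remaining tail, one byte at a time
        crc = T[0][(crc ^ data[i]) & 0xFF] ^ (crc >> 8)
        i += 1
    return crc ^ 0xFFFFFFFF


sample = bytes(range(256)) * 5
assert crc32_slicing_by_8(sample) == zlib.crc32(sample)
```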
51 Performance Test Setup: 6 super nodes (tracker and storage node); 180 drives; drives not evenly distributed (i.e. not 30 drives per node); EC strips maximally distributed.
52 Performance Test Results [chart: write throughput (MB/s) by file size (1KB, 1MB, 16MB, 32MB, 64MB, 128MB, 256MB); series: ec_1_of_2, ec_6_of_9, ec_9_of_12, ec_10_of_16, replication]
53 Performance Test Results [chart: read throughput (MB/s) by file size (1KB, 1MB, 16MB, 32MB, 64MB, 128MB, 256MB); series: ec_1_of_2, ec_6_of_9, ec_9_of_12, ec_10_of_16, replication]
54 Migrations [stack diagram: RESTful presentation (S3, GDCS), VFS, NebulaFSAPI, VFS DB, User DB, migration script, super nodes, NebulaFS storage nodes, metadata DB]
55 Future: finish Phase III; repairs; offsite copy; net new growth; optimizations; open source.
56 Questions. Thank you!