WAN Optimized Replication of Backup Datasets Using Stream-Informed Delta Compression
Transcription
1 WAN Optimized Replication of Backup Datasets Using Stream-Informed Delta Compression. Philip Shilane, Mark Huang, Grant Wallace, and Windsor Hsu. Backup Recovery Systems Division, EMC Corporation.
2 Introduction. 80% of our customers replicate most of their data off-site for disaster recovery. WAN bandwidth often limits throughput. Data reduction techniques increase effective throughput: deduplication and local compression are effective; delta compression with stream-informed caching adds 2X additional compression. [Diagram: Remote Office Backup System and Central Office Backup System.]
3 Example Of Deduplication And Delta Compression. [Diagram: a chunk and its fingerprint (fp).] Sketches based on Broder [97 & 00].
4 Example Of Deduplication And Delta Compression. [Diagram: a chunk with its fingerprint (fp) and sketch (sk); Maximal Value 1, Maximal Value 2, Maximal Value 3, Maximal Value 4.] Sketches based on Broder [97 & 00]. super_feature = Rabin_fp(feature_1 ... feature_4). A sketch is one or more super_features.
5 Example Of Deduplication And Delta Compression. [Diagram: a chunk with fp and sk; a second chunk whose fp is a duplicate of an earlier chunk.] Sketches based on Broder [97 & 00]. super_feature = Rabin_fp(feature_1 ... feature_4). A sketch is one or more super_features.
6 Example Of Deduplication And Delta Compression. [Diagram: a chunk with fp and sk; a second chunk with fp and sk that is similar to the earlier chunk; regions of difference; Maximal Value 1 through Maximal Value 4.] Sketches based on Broder [97 & 00]. super_feature = Rabin_fp(feature_1 ... feature_4). A sketch is one or more super_features.
7 Example Of Deduplication And Delta Compression. [Diagram: a chunk with fp and sk; a second chunk with fp and sk that is similar to the earlier chunk; regions of difference.] Transmit fp and differences. Sketches based on Broder [97 & 00].
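The sketch construction on slides 4 to 6 can be illustrated with a minimal Python sketch. It is an assumption-laden illustration, not the presenters' implementation: the window width, the hash constants, and the helper names (`_h64`, `sketch`) are hypothetical, and blake2b is swapped in for both the rolling hash and the Rabin fingerprint named on the slide.

```python
import hashlib
from typing import List

MASK = (1 << 64) - 1
WINDOW = 32           # sliding-window width in bytes (assumed value)
FEATURES_PER_SF = 4   # the slide combines four features per super-feature


def _h64(data: bytes) -> int:
    """64-bit hash; blake2b stands in for a Rabin fingerprint (assumption)."""
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def sketch(chunk: bytes, num_super_features: int = 1) -> List[int]:
    """Return a list of super-features describing the chunk."""
    n = num_super_features * FEATURES_PER_SF
    # One (multiplier, offset) pair per feature; the constants are arbitrary.
    params = [(((2 * i + 1) * 0x9E3779B97F4A7C15) & MASK,
               (i * 0xD6E8FEB86659FD93) & MASK) for i in range(n)]
    features = [0] * n
    for start in range(max(1, len(chunk) - WINDOW + 1)):
        h = _h64(chunk[start:start + WINDOW])
        for i, (a, b) in enumerate(params):
            v = (a * h + b) & MASK
            if v > features[i]:
                features[i] = v  # keep the maximal value seen for feature i
    super_features = []
    for s in range(num_super_features):
        group = features[s * FEATURES_PER_SF:(s + 1) * FEATURES_PER_SF]
        packed = b"".join(f.to_bytes(8, "big") for f in group)
        super_features.append(_h64(packed))  # hash the group of features
    return super_features
```

Two chunks produce the same super-feature when the windows that yield all four maximal values survive unchanged in both, which is why a small edit usually leaves at least one super-feature intact.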
8 Sketch Index Options. Full index (simple idea): requires IO; difficult to update; finds all similarity matches. Sizing: 256 TB capacity, 8 KB chunks, 16 byte record, 0.5 TB index per super-feature.
9 Sketch Index Options. Full index (simple idea): requires IO; difficult to update; finds all similarity matches. Partial index (slightly better?): load and evict with LRU policy; not persistent; must be as large as full backup. Sizing: 256 TB capacity, 8 KB chunks, 16 byte record, 0.5 TB index per super-feature.
10 Sketch Index Options. Full index (simple idea): requires IO; difficult to update; finds all similarity matches. Partial index (slightly better?): load and evict with LRU policy; not persistent; must be as large as full backup. Callout: partial index would have to hold a full backup to be effective. Stream-informed cache (our contribution): experimentally demonstrate that delta locality closely matches deduplication locality for backup datasets; updates handled by fingerprint system; little extra memory; finds most similarity matches. Sizing: 256 TB capacity, 8 KB chunks, 16 byte record, 0.5 TB index per super-feature.
11 Sketch Index Options. Full index (simple idea): requires IO; difficult to update; finds all similarity matches. Partial index (slightly better?): load and evict with LRU policy; not persistent; must be as large as full backup. Callout: partial index has to be large enough to index entire primary storage system. Sizing: 256 TB capacity, 8 KB chunks, 16 byte record, 0.5 TB index per super-feature.
12 Sketch Index Options. Full index (simple idea): requires IO; difficult to update; finds all similarity matches. Partial index (slightly better?): load and evict with LRU policy; not persistent; must be as large as full backup. Callout: partial index has to be large enough to index entire primary storage system. Stream-informed cache (our contribution): experimentally demonstrate that delta locality closely matches deduplication locality for backup datasets; updates handled by fingerprint system; little extra memory; finds most similarity matches. Sizing: 256 TB capacity, 8 KB chunks, 16 byte record, 0.5 TB index per super-feature.
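The full-index sizing quoted on these slides (256 TB capacity, 8 KB chunks, 16 byte record) can be checked with a few lines of Python. The function name is hypothetical and binary units (1 TB = 2^40 bytes, 1 KB = 2^10 bytes) are assumed.

```python
TB = 2 ** 40  # binary terabyte (assumption)
KB = 2 ** 10  # binary kilobyte (assumption)


def sketch_index_bytes(capacity_bytes: int, chunk_bytes: int,
                       record_bytes: int, super_features: int = 1) -> int:
    """Size of a full sketch index: one record per chunk per super-feature."""
    chunks = capacity_bytes // chunk_bytes
    return chunks * record_bytes * super_features


size = sketch_index_bytes(256 * TB, 8 * KB, 16)
print(size / TB)  # 0.5
```

Under the same assumptions, indexing four super-features would take 2 TB, which is why an on-disk index costs IO on every lookup.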
13 Stream-Informed Locality. Similarity search can leverage deduplication locality. Sketch cache loaded based on fingerprint cache. [Diagram: fingerprint and sketch cache.]
14 Stream-Informed Locality. [Diagram: backup Full 1 writes chunks A, B, C; fp A, fp B, fp C and sk A, sk B, sk C appear in the fingerprint and sketch cache.]
15 Stream-Informed Locality. [Diagram: Full 1 has written A, B, C; fp and sk for A, B, C are grouped into Container 1.]
16 Stream-Informed Locality. [Diagram: Full 1 has written A, B, C, D, E, F; Container 1 holds fp and sk for A, B, C; fp and sk for D, E, F are added.]
17 Stream-Informed Locality. [Diagram: Container 1 holds fp and sk for A, B, C; Container 2 holds fp and sk for D, E, F.]
18 Stream-Informed Locality. Store containers to disk. [Diagram: Container 1 and Container 2.]
19 Stream-Informed Locality. [Diagram: Full 1 contains A, B, C, D, E, F; backup Full 2 begins with chunk A; the fingerprint and sketch cache is shown without containers.]
20 Stream-Informed Locality. Load container from disk. [Diagram: Full 2 has written chunk A.]
21 Stream-Informed Locality. Load container from disk. [Diagram: Full 2 has written chunk A; Container 1 (fp and sk for A, B, C) is in the cache.]
22 Stream-Informed Locality. [Diagram: Full 2 has written A, B, C; Container 1 is in the cache.]
23 Stream-Informed Locality. Load container from disk. [Diagram: Full 2 has written A, B, C, D; Container 1 is in the cache.]
24 Stream-Informed Locality. E is similar to previous chunk E and is a sketch match. [Diagram: Full 2 has written A, B, C, D, E, F; Container 1 and Container 2 are in the cache.]
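The walkthrough on slides 13 to 24 can be replayed with a small simulation. This is a minimal sketch under stated assumptions, not the product's design: the class name, the container size of three chunks, the LRU eviction, and the in-memory dictionaries standing in for on-disk containers and the on-disk fingerprint index are all hypothetical.

```python
from collections import OrderedDict


class StreamInformedCache:
    """Toy model: sketches are cached only as a side effect of fingerprint hits."""

    def __init__(self, container_size=3, max_cached=2):
        self.container_size = container_size
        self.max_cached = max_cached
        self.containers = []        # "disk": each container is a list of (fp, sk)
        self.fp_index = {}          # "disk": fingerprint -> container id
        self.cache = OrderedDict()  # container id -> container, in LRU order
        self.disk_loads = 0

    def _load(self, cid):
        if cid in self.cache:
            self.cache.move_to_end(cid)     # refresh LRU position
            return
        self.disk_loads += 1                # container read from "disk"
        self.cache[cid] = self.containers[cid]
        if len(self.cache) > self.max_cached:
            self.cache.popitem(last=False)  # evict least recently used

    def _similar(self, sk):
        for container in self.cache.values():
            for fp, cached_sk in container:
                if cached_sk == sk:
                    return fp
        return None

    def _store(self, fp, sk):
        if not self.containers or len(self.containers[-1]) >= self.container_size:
            self.containers.append([])
        self.containers[-1].append((fp, sk))
        self.fp_index[fp] = len(self.containers) - 1

    def write(self, fp, sk):
        """Return ('dup', fp), ('delta', base_fp) or ('new', None)."""
        if fp in self.fp_index:             # duplicate: load its container
            self._load(self.fp_index[fp])
            return ("dup", fp)
        base = self._similar(sk)            # sketch lookup in the cache only
        self._store(fp, sk)
        return ("delta", base) if base is not None else ("new", None)


system = StreamInformedCache()
full1 = [("fp" + c, "sk" + c) for c in "ABCDEF"]
for fp, sk in full1:
    system.write(fp, sk)

full2 = list(full1)
full2[4] = ("fpE2", "skE")                  # modified E: new fp, same sketch
results = [system.write(fp, sk) for fp, sk in full2]
print(results[4], system.disk_loads)        # ('delta', 'fpE') 2
```

The modified chunk E finds its base without any sketch index because the duplicate D had already pulled Container 2, and with it sk E, into the cache.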
25 Replication With Deduplication. [Diagram: Remote Office and Central Office.]
26 Replication With Deduplication. Send chunk fingerprints. [Diagram: Remote Office sends fp() for each chunk to Central Office.]
27 Replication With Deduplication. Send chunk fingerprints. Reply with list of missing fingerprints. [Diagram: Central Office returns the fp() values it does not have.]
28-29 Replication With Deduplication. Send chunk fingerprints. Reply with list of missing fingerprints. Send missing chunks.
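The three message rounds of deduplicated replication can be sketched in Python as follows. The class and function names are hypothetical, SHA-1 is an assumed choice of fingerprint function, and the network is replaced by direct method calls.

```python
import hashlib
from typing import Dict, Iterable, List


def fp(chunk: bytes) -> str:
    """Chunk fingerprint (SHA-1 is an assumption, not taken from the slides)."""
    return hashlib.sha1(chunk).hexdigest()


class CentralOffice:
    def __init__(self) -> None:
        self.store: Dict[str, bytes] = {}

    def missing(self, fps: List[str]) -> List[str]:
        # Round 2: reply with the fingerprints not already stored.
        return [f for f in fps if f not in self.store]

    def receive(self, chunks: Iterable[bytes]) -> None:
        for c in chunks:
            self.store[fp(c)] = c


def replicate(chunks: List[bytes], dest: CentralOffice) -> int:
    """Replicate chunks and return the number of chunk bytes actually sent."""
    by_fp = {fp(c): c for c in chunks}   # Round 1: send chunk fingerprints
    wanted = dest.missing(list(by_fp))
    payload = [by_fp[f] for f in wanted]
    dest.receive(payload)                # Round 3: send missing chunks only
    return sum(len(c) for c in payload)


central = CentralOffice()
central.receive([b"alpha"])
sent = replicate([b"alpha", b"beta", b"gamma"], central)
print(sent)  # 9
```

Only `beta` and `gamma` cross the link; a second replication of the same chunks would send no chunk data at all.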
30 Replication With Deduplication And Delta. [Diagram: Remote Office and Central Office; the Central Office holds chunk m.]
31 Replication With Deduplication And Delta. Send chunk fingerprints. [Diagram: Remote Office sends fp() for each chunk; the Central Office holds chunk m.]
32 Replication With Deduplication And Delta. Reply with list of missing fingerprints.
33-34 Replication With Deduplication And Delta. Send sketches. [Diagram: Remote Office sends sk() for each missing chunk; the Central Office holds chunk m.]
35 Replication With Deduplication And Delta. Reply with list of fingerprints for similar chunks. [Diagram: replies Ø and m.]
36-43 Replication With Deduplication And Delta. Send missing chunks and delta encodings. [Diagram: delta encodings are expressed against chunk m at the Central Office.]
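The five message rounds with delta compression can be sketched as below. Everything here is illustrative and assumed: `toy_sketch` is a deliberately simple stand-in (the lexicographically largest 4-byte window) for the super-feature sketch, `difflib.SequenceMatcher` is swapped in as the delta encoder, and the class and function names are hypothetical.

```python
import hashlib
from difflib import SequenceMatcher


def fp(chunk):
    return hashlib.sha1(chunk).hexdigest()   # assumed fingerprint function


def toy_sketch(chunk, w=4):
    """Toy resemblance sketch: the largest w-byte window (stand-in only)."""
    return max(chunk[i:i + w] for i in range(max(1, len(chunk) - w + 1)))


def delta_encode(base, target):
    ops = []
    matcher = SequenceMatcher(None, base, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))           # reuse bytes of the base
        elif tag in ("replace", "insert"):
            ops.append(("insert", target[j1:j2]))  # literal new bytes
    return ops


def delta_decode(base, ops):
    out = bytearray()
    for op in ops:
        out += base[op[1]:op[2]] if op[0] == "copy" else op[1]
    return bytes(out)


class CentralOffice:
    def __init__(self):
        self.chunks = {}      # fp -> chunk
        self.by_sketch = {}   # sketch -> fp of a chunk with that sketch

    def add(self, chunk):
        self.chunks[fp(chunk)] = chunk
        self.by_sketch[toy_sketch(chunk)] = fp(chunk)

    def missing(self, fps):
        return [f for f in fps if f not in self.chunks]

    def similar(self, sketches):
        return {s: self.by_sketch.get(s) for s in sketches}  # None: no match

    def receive_delta(self, base_fp, ops):
        self.add(delta_decode(self.chunks[base_fp], ops))


def replicate(chunks, source_store, dest):
    by_fp = {fp(c): c for c in chunks}
    missing = dest.missing(list(by_fp))                    # rounds 1 and 2
    sketches = {f: toy_sketch(by_fp[f]) for f in missing}  # round 3
    bases = dest.similar(list(sketches.values()))          # round 4
    sent = []
    for f in missing:                                      # round 5
        base_fp = bases[sketches[f]]
        if base_fp is not None and base_fp in source_store:
            ops = delta_encode(source_store[base_fp], by_fp[f])
            dest.receive_delta(base_fp, ops)
            sent.append(("delta", f))
        else:
            dest.add(by_fp[f])
            sent.append(("full", f))
    return sent


m = b"header-zzzz-body-0123456789"
m2 = b"header-zzzz-body-0123x56789"     # similar to m
other = b"completely different~~~~"
central = CentralOffice()
central.add(m)
sent = replicate([m, m2, other], {fp(m): m}, central)
```

The source can only encode a delta if it still holds the base chunk itself, which matches the slide's note that delta processing uses read IO and CPU on both source and destination.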
44 Properties Of Stream-Informed Delta Compression. Pros: eliminates on-disk sketch index; improves performance (fewer disk reads than using a sketch index); small memory footprint; improves compression. Cons: dependent on stream locality and caching to find similar chunks; requires read IO and CPU to process delta chunks.
45 Datasets
Dataset           Type                         Backup Policy                    TB   Months
Source Code       Version control repository   Weekly full, daily incremental
Workstations      16 desktops                  Weekly full, daily incremental
Email             MS Exchange server           Daily full
System Logs       Server's /var directory      Weekly full, daily incremental
Home Directories  Engineers' home directories  Weekly full, daily incremental
46 Index VS Stream-Informed Cache. Cache sized at 12 MB per stream. [Chart series: deduplication, 1 super-feature, 2 super-features, 3 super-features, 4 super-features.]
47 Index VS Stream-Informed Cache. Cache sized at 12 MB per stream. For one super-feature, compression is within 14% of using an index.
48-51 Index VS Stream-Informed Cache. Cache sized at 12 MB per stream. For one super-feature, compression is within 14% of using an index. Two super-features in a cache is better than an index with one.
52 Index VS Stream-Informed Cache. Cache sized at 12 MB per stream. For one super-feature, compression is within 14% of using an index. Two super-features in a cache is better than an index with one. Notes: 1. Deduplication is slightly different because of different implementations. 2. Delta compression in a cache was higher than a full index because of better delta encoding with a candidate in the cache.
53 Delta Compression. Typical delta improvement is 2X beyond deduplication and GZ compression. Choose GZ or Delta with GZ.
Dataset           Deduplication  GZ    Delta w/ GZ  Delta Improv.
Source Code       24.9X          7.2X  14.9X        2.1X
Workstations      5.7X           2.8X  8.8X         3.1X
Email             6.9X           3.1X  5.8X         1.9X
System Logs       57.9X          4.6X  10.2X        2.2X
Home Directories  31.7X          3.1X  5.5X         1.8X
Compression factors are presented after first week of seeding.
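The Delta Improv. column appears to be the ratio of the Delta w/ GZ factor to the GZ factor. That reading is an inference from the numbers rather than something the slide states, and the short check below (with hypothetical names) shows that it reproduces every row to one decimal place. The third row is the MS Exchange dataset.

```python
# (dataset, deduplication, GZ, delta with GZ, reported delta improvement)
ROWS = [
    ("Source Code", 24.9, 7.2, 14.9, 2.1),
    ("Workstations", 5.7, 2.8, 8.8, 3.1),
    ("MS Exchange", 6.9, 3.1, 5.8, 1.9),
    ("System Logs", 57.9, 4.6, 10.2, 2.2),
    ("Home Directories", 31.7, 3.1, 5.5, 1.8),
]


def delta_improvement(gz: float, delta_with_gz: float) -> float:
    """Extra compression from delta encoding beyond GZ alone."""
    return delta_with_gz / gz


for name, _dedup, gz, delta_gz, reported in ROWS:
    computed = round(delta_improvement(gz, delta_gz), 1)
    print(f"{name}: computed {computed}X, reported {reported}X")
```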
54 Network Throughput. Effective throughput is 1-2 orders of magnitude faster.
55 Overheads And Limitations. Sketches take up about 20 bytes per non-duplicate chunk. Uses read IO and CPU on source and destination. Sketching is a 20% slowdown on writes, but only for non-duplicates. Scales linearly at destination with number of streams.
56 Overheads And Limitations. Sketches take up about 20 bytes per non-duplicate chunk. Uses read IO and CPU on source and destination. Sketching is a 20% slowdown on writes, but only for non-duplicates. Scales linearly at destination with number of streams. Shared sketch cache affects compression: system sized to handle 20 streams; with 25 streams, compression loss of 0-12%; with 50 streams, compression loss of 0-27%. Callouts: resource utilization during replication; number of systems replicating to a single destination.
57 Overheads And Limitations. Sketches take up about 20 bytes per non-duplicate chunk. Uses read IO and CPU on source and destination. Sketching is a 20% slowdown on writes, but only for non-duplicates. Scales linearly at destination with number of streams. Shared sketch cache affects compression: system sized to handle 20 streams; with 25 streams, compression loss of 0-12%; with 50 streams, compression loss of 0-27%.
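A rough sense of scale for these overheads, as a hedged back-of-the-envelope calculation: it assumes the 8 KB chunk size from the index sizing slide and the 12 MB per-stream cache from the index comparison slides, and it assumes that cache memory is simply the per-stream size multiplied by the number of streams. The function names are hypothetical.

```python
def sketch_overhead_fraction(sketch_bytes: int = 20,
                             chunk_bytes: int = 8 * 1024) -> float:
    """Sketch storage as a fraction of an (uncompressed) non-duplicate chunk."""
    return sketch_bytes / chunk_bytes


def cache_memory_mb(streams: int, mb_per_stream: int = 12) -> int:
    """Cache memory if every stream gets its own fixed-size cache (assumption)."""
    return streams * mb_per_stream


print(f"{sketch_overhead_fraction():.2%}")  # 0.24%
print(cache_memory_mb(20))                  # 240
```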
58 Customer Results. Median customer has 2X delta compression beyond deduplication.
59 Related Work. Optimized network transfer: Spring00, Muthitacharoen01, Eshghi07, and Park07. File synchronization: Tridgell00, Suel04. Delta compression: Burns97, Mogul97, Hunt98, Chan99, MacDonald00, Suel02, Trendafilov02, and Chen04. Similarity detection: Brin94, Manber94, Broder [97 & 00], Douglis03, Kulkarni04, You04, Jain05, and Aronovich09. Deduplicated storage: Policroniades04 and Bobbarjung06. Stream-informed deduplication: Zhu08, Bhagwat09, Lillibridge09, Min10, Guo11, and Xia11.
60 Conclusion. Delta locality closely matches deduplication locality for backup datasets. Good scalability. Stream-informed delta compression is effective with a small cache. CPU and IO utilization is low. Product allows customers to replicate and protect twice as much data across a WAN.
61 Questions?
VMware vsphere Data Protection 5.8 TECHNICAL OVERVIEW REVISED AUGUST 2014 Table of Contents Introduction.... 3 Features and Benefits of vsphere Data Protection... 3 Additional Features and Benefits of
More informationHP Designing and Implementing HP Enterprise Backup Solutions. Download Full Version :
HP HP0-771 Designing and Implementing HP Enterprise Backup Solutions Download Full Version : http://killexams.com/pass4sure/exam-detail/hp0-771 A. copy backup B. normal backup C. differential backup D.
More informationGFS: The Google File System. Dr. Yingwu Zhu
GFS: The Google File System Dr. Yingwu Zhu Motivating Application: Google Crawl the whole web Store it all on one big disk Process users searches on one big CPU More storage, CPU required than one PC can
More informationECE 2300 Digital Logic & Computer Organization. More Caches
ECE 23 Digital Logic & Computer Organization Spring 217 More Caches 1 Prelim 2 stats High: 9 (out of 9) Mean: 7.2, Median: 73 Announcements Prelab 5(C) due tomorrow 2 Example: Direct Mapped (DM) Cache
More informationDell DR4100. Disk based Data Protection and Disaster Recovery. April 3, Birger Ferber, Enterprise Technologist Storage EMEA
Dell DR4100 Disk based Data Protection and Disaster Recovery April 3, 2013 Birger Ferber, Enterprise Technologist Storage EMEA Agenda DR4100 Technical Review DR4100 New Features DR4100 Use Cases DR4100
More informationConstruct a High Efficiency VM Disaster Recovery Solution. Best choice for protecting virtual environments
Construct a High Efficiency VM Disaster Recovery Solution Best choice for protecting virtual environments About NAKIVO Established in the USA since 2012 Provides data protection solutions for VMware, Hyper-V
More informationParallelizing Inline Data Reduction Operations for Primary Storage Systems
Parallelizing Inline Data Reduction Operations for Primary Storage Systems Jeonghyeon Ma ( ) and Chanik Park Department of Computer Science and Engineering, POSTECH, Pohang, South Korea {doitnow0415,cipark}@postech.ac.kr
More informationGlobal Headquarters: 5 Speen Street Framingham, MA USA P F
B U Y E R C A S E S T U D Y V M w a r e I m p r o v e s N e t w o r k U t i l i z a t i o n a n d B a c k u p P e r f o r m a n c e U s i n g A v a m a r ' s C l i e n t - S i d e D e d u p l i c a t i
More informationHP StoreOnce: reinventing data deduplication
HP : reinventing data deduplication Reduce the impact of explosive data growth with HP StorageWorks D2D Backup Systems Technical white paper Table of contents Executive summary... 2 Introduction to data
More informationBackup for VMware ESX Environments with Veritas NetBackup PureDisk from Symantec
Backup for VMware ESX Environments with Veritas NetBackup PureDisk from Symantec A comparison of three different methods of backing up ESX guests using deduplication Mark Salmon, Sr. Principal Systems
More informationLazy Exact Deduplication
Lazy Exact Deduplication JINGWEI MA, REBECCA J. STONES, YUXIANG MA, JINGUI WANG, JUNJIE REN, GANG WANG, and XIAOGUANG LIU, College of Computer and Control Engineering, Nankai University 11 Deduplication
More informationHYDRAstor: a Scalable Secondary Storage
HYDRAstor: a Scalable Secondary Storage 7th TF-Storage Meeting September 9 th 00 Łukasz Heldt Largest Japanese IT company $4 Billion in annual revenue 4,000 staff www.nec.com Polish R&D company 50 engineers
More informationData Reduction Meets Reality What to Expect From Data Reduction
Data Reduction Meets Reality What to Expect From Data Reduction Doug Barbian and Martin Murrey Oracle Corporation Thursday August 11, 2011 9961: Data Reduction Meets Reality Introduction Data deduplication
More informationDELL EMC DATA DOMAIN OPERATING SYSTEM
DATA SHEET DD OS Essentials High-speed, scalable deduplication Up to 68 TB/hr performance Reduces protection storage requirements by 10 to 30x CPU-centric scalability Data invulnerability architecture
More informationThe 5 Keys to Virtual Backup Excellence
The 5 Keys to Virtual Backup Excellence The Challenges 51 87 Percent of IT pros say virtualized server backups do not meet or only partially meet business goals for restores and recovery Percent of CIOs
More informationHyper-converged Secondary Storage for Backup with Deduplication Q & A. The impact of data deduplication on the backup process
Hyper-converged Secondary Storage for Backup with Deduplication Q & A The impact of data deduplication on the backup process Table of Contents Introduction... 3 What is data deduplication?... 3 Is all
More informationWHITE PAPER. DATA DEDUPLICATION BACKGROUND: A Technical White Paper
WHITE PAPER DATA DEDUPLICATION BACKGROUND: A Technical White Paper CONTENTS Data Deduplication Multiple Data Sets from a Common Storage Pool.......................3 Fixed-Length Blocks vs. Variable-Length
More informationTransferSoft Launches HyperTransfer for OEMs Most Comprehensive and Efficient Enterprisee Data Transfer Software Suite in the Industry
Broadcast Beat Magazine NAB 2015 Edition Author: Curtis Chan, Senior Editor TransferSoft Launches HyperTransfer for OEMs Most Comprehensive and Efficient Enterprisee Data Transfer Software Suite in the
More informationHEAD HardwarE Accelerated Deduplication
HEAD HardwarE Accelerated Deduplication Final Report CS710 Computing Acceleration with FPGA December 9, 2016 Insu Jang Seikwon Kim Seonyoung Lee Executive Summary A-Z development of deduplication SW version
More informationGFS: The Google File System
GFS: The Google File System Brad Karp UCL Computer Science CS GZ03 / M030 24 th October 2014 Motivating Application: Google Crawl the whole web Store it all on one big disk Process users searches on one
More informationThe Google File System
The Google File System By Ghemawat, Gobioff and Leung Outline Overview Assumption Design of GFS System Interactions Master Operations Fault Tolerance Measurements Overview GFS: Scalable distributed file
More informationSetting Up Quest QoreStor with Veeam Backup & Replication. Technical White Paper
Setting Up Quest QoreStor with Veeam Backup & Replication Technical White Paper Quest Engineering August 2018 2018 Quest Software Inc. ALL RIGHTS RESERVED. THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES
More informationGoogle File System. Arun Sundaram Operating Systems
Arun Sundaram Operating Systems 1 Assumptions GFS built with commodity hardware GFS stores a modest number of large files A few million files, each typically 100MB or larger (Multi-GB files are common)
More informationThe Google File System (GFS)
1 The Google File System (GFS) CS60002: Distributed Systems Antonio Bruto da Costa Ph.D. Student, Formal Methods Lab, Dept. of Computer Sc. & Engg., Indian Institute of Technology Kharagpur 2 Design constraints
More informationDELL EMC DATA DOMAIN OPERATING SYSTEM
DATA SHEET DD OS Essentials High-speed, scalable deduplication Reduces protection storage requirements by up to 55x Up to 3x restore performance CPU-centric scalability Data invulnerability architecture
More informationVMware vsphere Data Protection Evaluation Guide REVISED APRIL 2015
VMware vsphere Data Protection REVISED APRIL 2015 Table of Contents Introduction.... 3 Features and Benefits of vsphere Data Protection... 3 Requirements.... 4 Evaluation Workflow... 5 Overview.... 5 Evaluation
More informationTIBX NEXT-GENERATION ARCHIVE FORMAT IN ACRONIS BACKUP CLOUD
TIBX NEXT-GENERATION ARCHIVE FORMAT IN ACRONIS BACKUP CLOUD 1 Backup Speed and Reliability Are the Top Data Protection Mandates What are the top data protection mandates from your organization s IT leadership?
More informationThe Google File System
The Google File System Sanjay Ghemawat, Howard Gobioff and Shun Tak Leung Google* Shivesh Kumar Sharma fl4164@wayne.edu Fall 2015 004395771 Overview Google file system is a scalable distributed file system
More informationOpendedupe & Veritas NetBackup ARCHITECTURE OVERVIEW AND USE CASES
Opendedupe & Veritas NetBackup ARCHITECTURE OVERVIEW AND USE CASES May, 2017 Contents Introduction... 2 Overview... 2 Architecture... 2 SDFS File System Service... 3 Data Writes... 3 Data Reads... 3 De-duplication
More informationYour World is Hybrid: Protecting your VMs with Veeam and HPE Storage. Federico Venier HPE Storage Technical marketing
Your World is Hybrid: Protecting your s with and HPE Storage Federico Venier HPE Storage Technical marketing federico.venier@hpe.com HPE & : Integration summary SQL Applications HPE + License Reselling
More information