Deduplication Storage System


1 Deduplication Storage System
  Kai Li
  Charles Fitzmorris Professor, Princeton University
  Chief Scientist and Co-Founder, Data Domain, Inc.
  03/11/09

2 The World Is Becoming Data-Centric
  Business and personal life are becoming digital
    Make better decisions
  Science: e-science is causing a data explosion
    Accelerate scientific discovery
  [Image: CERN Tier 0]

3 How Much Information? (IDC Estimates in 2007)
  Data size is on the way to zettabytes (10^9 TB)
    161 exabytes (10^8 TB) of information was created or replicated worldwide in 2006
    IDC estimates 6X growth by 2010, to 988 exabytes (about a zettabyte) per year
    New technical information doubles every 2 years
  Examples
    Wal-Mart (US): 500 TB, 10^7 transactions/day in 2004
    Google's BigTable (US): 1-2 petabytes
    AT&T: 11 exabytes (10^7 TB) of wireline, wireless, and Internet data

4 Challenges in a Digital World
  Find information
    Search engines do well for text documents
    Audio, images, video, and sensor data are much more difficult
    Open problems in analysis, visualization, summarization
  Access data
    Cloud computing and P2P address location and management issues
    But $/bandwidth varies and improves slowly
  Store and protect data
    Protect all versions of data frequently
    Recover any version of data quickly

5 A Traditional Data Center
  [Diagram: clients, server, primary storage, and mirrored storage, with cost markers $$$ and $$$$]

6 A Data Center w/ Networked Disk Storage?
  [Diagram: clients, server, primary storage, and mirrored storage over dedicated fibre; onsite storage at 20x primary (3-month retention); WAN; remote storage at 20x primary]
  Costs
    Purchase: $$$$
    Bandwidth: $$$$$
    Power: $$$

7 Trends for Bytes/$
  Storage increases exponentially
    Gap between adjacent classes is about 3-5X
  Bandwidth becomes flat
    Low-end WAN: slow improvements
    High-end WAN: almost no improvement
  [Chart: Mbytes per dollar for tape, ATA, FC, and flash storage and for WAN bandwidth]

8 Replication with WAN Is Expensive
  A T3 line example
    T3 is 45 Mbits/sec, or ~486 GB/day
    T3 cost is about $72k/year
  Policies                                    Dataset size   Cost
    Daily full backups                        ~500 GB        $300/GB/2 years
    Daily incremental & weekly full backups   ~3 TB          $48/GB/2 years
  The problem
    Bandwidth costs 4x-20x data center primary storage
    WAN Mbytes/$ increases slowly (<< data growth rate)
    Situation will get worse
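The figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 GB = 10^9 bytes) and assumes the per-GB cost is the two-year line cost divided by the dataset size; the function and variable names are illustrative, not from the talk.

```python
# Back-of-the-envelope check of the T3 replication figures.
# Assumption: decimal units (1 GB = 1e9 bytes).

T3_MBITS_PER_SEC = 45
T3_COST_PER_YEAR = 72_000   # dollars, from the slide
YEARS = 2


def gb_per_day(mbits_per_sec: float) -> float:
    """Data a link can move in 24 hours, in GB."""
    bytes_per_sec = mbits_per_sec * 1e6 / 8
    return bytes_per_sec * 86_400 / 1e9


def cost_per_gb(dataset_gb: float) -> float:
    """Line cost over YEARS divided by the protected dataset size."""
    return T3_COST_PER_YEAR * YEARS / dataset_gb


print(gb_per_day(T3_MBITS_PER_SEC))   # 486.0
print(cost_per_gb(500))               # 288.0, which the slide rounds to about $300
print(cost_per_gb(3000))              # 48.0
```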

9 A Data Center with Deduplication Storage
  [Diagram: clients, server, primary storage, and mirrored storage; onsite and remote sites connected by a WAN]
  Promises
    Purchase: ~tape libraries
    Space: >10X reduction
    WAN BW: >10X reduction
    Power: >10X reduction

10 High-Level Idea of Deduplication
  Traditional local compression
    Encode in a sliding window of bytes (e.g. 100K)
    ~2x compression
  Deduplication
    Large window -> more redundant data
    ~10-50x compression

11 Backup Data Example
  View from backup software (tar or similar format)
    Backups in the stream: first full backup, incr 1, incr 2, second full backup
    Data stream: A B C D A E F G A B H A E I B J C D E F G H
  Deduplicated storage: redundancies pooled, compressed
    A B C D E F G H I J
  [Legend: unique variable segments, redundant data segments, compressed unique segments]
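The pooling on this slide can be reproduced directly from the segment labels. This is a minimal sketch that treats each letter as one segment and keeps the first occurrence of each; the `deduplicate` helper is a hypothetical name, and local compression of the unique segments is not modeled.

```python
def deduplicate(stream):
    """Return the unique segments in first-seen order."""
    seen = set()
    stored = []
    for segment in stream:
        if segment not in seen:
            seen.add(segment)
            stored.append(segment)
    return stored


stream = "A B C D A E F G A B H A E I B J C D E F G H".split()
stored = deduplicate(stream)

print(stored)                      # ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J']
print(len(stream) / len(stored))   # 2.2
```

Here 22 segments in the stream reduce to 10 stored segments, before any local compression is applied.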

12 Two Deduplication Approaches
  Deltas
    Compute a sketch for each segment [Broder97]
    Find a similar segment by comparing sketches
    Compute deltas against the most similar segment
    But need to read segments from disks
  Fingerprinting
    Compute a fingerprint as the ID of each segment
    Use an index to look up whether the segment is a duplicate
    Efficiency depends on index lookups
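The fingerprinting approach can be sketched as a small in-memory store. The choice of SHA-1 as the fingerprint function and the `FingerprintStore` class are assumptions made for illustration; the slide does not name a hash function, and a real system keeps the index on disk.

```python
import hashlib


class FingerprintStore:
    """Toy segment store: one copy per distinct fingerprint."""

    def __init__(self):
        self.index = {}      # fingerprint -> position in self.segments
        self.segments = []   # unique segment payloads

    def write(self, segment: bytes) -> bytes:
        """Store the segment if it is new; return its fingerprint either way."""
        fp = hashlib.sha1(segment).digest()   # assumed fingerprint function
        if fp not in self.index:              # the index lookup the slide refers to
            self.index[fp] = len(self.segments)
            self.segments.append(segment)
        return fp

    def read(self, fp: bytes) -> bytes:
        return self.segments[self.index[fp]]


store = FingerprintStore()
first = store.write(b"segment one")
again = store.write(b"segment one")   # duplicate: nothing new is stored
print(first == again, len(store.segments))   # True 1
```

Every write costs one index lookup, which is why the slide ties efficiency to index lookups.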

13 Segmentation
  Two approaches
    Fixed: simple, but cannot handle shifts well
    Content-based variable: independent of shifts
  [Diagram: segment sequences A X C D, A Y C D, A B C D, A B C D]
  Segment size
    Rolling fingerprinting until fp = xxxx0000
    Smaller -> higher compression
    Smaller -> more memory, less throughput
    Example: 100 GB index per 20 TB for 4K segment size
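The difference between the two segmentation approaches shows up when a single byte is inserted at the front of the data. The sketch below is an illustration under stated assumptions: it uses a simple polynomial rolling hash (not necessarily the fingerprint function used in the product), a 16-byte window, and a cut wherever the low 6 bits of the hash are zero, which plays the role of "fp = xxxx0000". All names and parameters are hypothetical.

```python
import random


def fixed_chunks(data: bytes, size: int = 64) -> list:
    """Cut every `size` bytes, regardless of content."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def content_defined_chunks(data: bytes, window: int = 16, mask: int = 0x3F) -> list:
    """Cut wherever the rolling hash of the last `window` bytes has its low bits zero."""
    base, mod = 257, (1 << 31) - 1
    top = pow(base, window - 1, mod)   # weight of the byte leaving the window
    chunks, start, h = [], 0, 0
    for i, b in enumerate(data):
        if i >= window:
            h = (h - data[i - window] * top) % mod   # drop the oldest byte
        h = (h * base + b) % mod                     # add the newest byte
        if i + 1 >= window and (h & mask) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])                  # trailing partial segment
    return chunks


rng = random.Random(0)
data = bytes(rng.randrange(256) for _ in range(20_000))
shifted = b"X" + data   # one inserted byte shifts everything after it

for name, chunker in [("fixed", fixed_chunks), ("content-defined", content_defined_chunks)]:
    before, after = chunker(data), chunker(shifted)
    shared = len(set(before) & set(after))
    print(f"{name}: {shared} of {len(before)} segments unchanged")
```

Because each cut point depends only on the bytes inside the window, every segment after the first one survives the shift. On the sizing example, 20 TB divided into 4 KB segments is roughly 5 billion segments, so a 100 GB index works out to about 20 bytes per segment.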

14 Main Components
  Interfaces (NFS, CIFS, VTL, ...)
  Object-oriented file system
  Deduplication
  GC & verification
  Data layout
  Replication
  RAID (e.g. RAID-6)
  Disk shelves

15 Design Challenges
  Very reliable and self-healing
    Backup data is the last stop
    Multi-dimensional verifications and self-healing
  High throughput + high compression at low cost
    Why high throughput: a day has 24 hours
    Why high compression: make disks cost like tapes
    Controller cost should be small

16 Typical Alternatives
  Caching the index?
    File buffer cache has low hit rate (<80%)
    Fingerprints are random
  Parallelize the index?
    7,200 RPM Seagate 1TB drive: <120 seeks/sec
    120 x 8KB segments: 0.96 MB/sec/disk
    Need 200 disks to achieve 200 MB/s! [Venti02]
  Buffer the data?
    Need a very large disk buffer
    Long delay to move data offsite
    A day still has 24 hours
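The disk-count estimate follows from one index seek per incoming segment. A quick check, assuming 1 MB = 1000 KB and that each 8KB segment costs exactly one seek (the variable names are illustrative):

```python
SEEKS_PER_SEC = 120        # upper bound for a 7,200 RPM drive, from the slide
SEGMENT_KB = 8
TARGET_MB_PER_SEC = 200

# One random index lookup per segment caps the data rate each disk can support.
per_disk_mb = SEEKS_PER_SEC * SEGMENT_KB / 1000
disks_needed = TARGET_MB_PER_SEC / per_disk_mb

print(per_disk_mb)           # 0.96
print(round(disks_needed))   # 208
```

The result, about 208 disks, matches the slide's round figure of 200.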

17 High Throughput, High Compression at Low Cost
  Main ideas
    Lay out data on disk with duplicate locality
    Use a sophisticated cache for the fingerprint index
      Use a summary data structure for new data
      Use locality-preserved caching for old data
  See a FAST paper for details
    Benjamin Zhu, Kai Li and Hugo Patterson. Avoiding the Disk Bottleneck in a Deduplication Storage System. In Proceedings of the 6th USENIX Conference on File and Storage Technologies (FAST '08). February 2008.

18 Summary Vector
  Goal: use minimal memory to test for new data
    Summarize which segments have been stored, with a Bloom filter [Bloom70] in RAM
    If the Summary Vector says no, it's a new segment
  [Diagram: inserting a segment sets the bits selected by h1, h2, h3 of fp(s_i) in the Summary Vector; a lookup reads the same bits. The Summary Vector approximates the index data structure.]
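A Bloom filter of this kind fits in a few lines. The sketch below is illustrative only: deriving the k bit positions from salted SHA-256 digests is an assumption made here for self-containment, and `SummaryVector`, `add`, and `might_contain` are hypothetical names.

```python
import hashlib


class SummaryVector:
    """Bloom filter over segment fingerprints (a sketch, not the product's code)."""

    def __init__(self, m_bits: int, k: int):
        self.m = m_bits
        self.k = k
        self.bits = bytearray((m_bits + 7) // 8)

    def _positions(self, fingerprint: bytes):
        # Assumed scheme: k bit positions from k differently salted digests.
        for i in range(self.k):
            digest = hashlib.sha256(bytes([i]) + fingerprint).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, fingerprint: bytes) -> None:
        """Set the k bits for a stored segment."""
        for pos in self._positions(fingerprint):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, fingerprint: bytes) -> bool:
        """False means definitely new; True means 'maybe stored'."""
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(fingerprint))


sv = SummaryVector(m_bits=1024, k=4)
sv.add(b"fingerprint-of-stored-segment")
print(sv.might_contain(b"fingerprint-of-stored-segment"))   # True
```

A "no" answer is always correct, so new segments can skip the on-disk index entirely; only a "maybe" needs a further lookup.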

19 Known Analysis Results
  Bloom filter with m bits and k independent hash functions
  After inserting n keys, the probability of a false positive is:
    $$p = \left(1 - \left(1 - \frac{1}{m}\right)^{kn}\right)^{k} \approx \left(1 - e^{-kn/m}\right)^{k}$$
  Examples:
    m/n = 6, k = 4: p ≈ 0.0561
    m/n = 8, k = 6: p ≈ 0.0216
  Experimental data validate the analysis results
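The standard approximation for a Bloom filter's false-positive probability, (1 - e^(-kn/m))^k, can be evaluated for the two parameter choices on this slide. The function name is illustrative.

```python
import math


def false_positive_rate(bits_per_key: float, k: int) -> float:
    """Approximate Bloom filter false-positive rate for m/n bits per key and k hashes."""
    return (1 - math.exp(-k / bits_per_key)) ** k


print(round(false_positive_rate(6, 4), 4))   # 0.0561
print(round(false_positive_rate(8, 6), 4))   # 0.0216
```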

20 Stream Informed Segment Layout
  Goal: capture duplicate locality on disk
    Segments from the same stream are stored in the same containers
    Metadata (index data) are also in the containers
  [Diagram: Streams 1, 2, and 3 each fill their own container, which has a metadata section and a data section]
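One way to picture the layout is a writer that keeps one open container per stream, so interleaved streams never share a container. This is a sketch under assumptions: the container capacity, the `Container` and `ContainerWriter` names, and the seal-when-full policy are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    container_id: int
    metadata: list = field(default_factory=list)   # fingerprints, in stream order
    data: list = field(default_factory=list)       # segment payloads, same order


class ContainerWriter:
    def __init__(self, capacity: int = 4):
        self.capacity = capacity   # segments per container (hypothetical value)
        self.open = {}             # stream id -> container currently being filled
        self.sealed = []           # full containers, as they would be written to disk
        self.next_id = 0

    def append(self, stream_id, fingerprint, segment):
        container = self.open.get(stream_id)
        if container is None:
            container = Container(self.next_id)
            self.next_id += 1
            self.open[stream_id] = container
        container.metadata.append(fingerprint)
        container.data.append(segment)
        if len(container.data) == self.capacity:
            self.sealed.append(self.open.pop(stream_id))


writer = ContainerWriter(capacity=2)
for i in range(4):                       # two streams arriving interleaved
    writer.append("stream1", f"fp-s1-{i}", b"...")
    writer.append("stream2", f"fp-s2-{i}", b"...")

for c in writer.sealed:
    print(c.container_id, c.metadata)
```

Each sealed container holds fingerprints from a single stream in arrival order, which is the locality the later caching step relies on.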

21 Locality Preserved Caching (LPC)
  Goal: maintain duplicate locality in the cache
    Disk Index has all <fingerprint, containerID> pairs
    Index Cache caches a subset of such pairs
    On a miss, look up the Disk Index to find the containerID
    Load the metadata of a container into the Index Cache, replacing if needed
  [Diagram: a miss goes to the Disk Index, which returns a ContainerID; that container's metadata is loaded into the Index Cache, with replacement at container granularity]
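The caching policy can be sketched with containers as the unit of insertion and eviction. The sketch assumes least-recently-used replacement (the slide only says "replace if needed") and uses plain dictionaries as stand-ins for the Disk Index and the on-disk container metadata; all names are hypothetical.

```python
from collections import OrderedDict


class LocalityPreservedCache:
    def __init__(self, disk_index, container_metadata, max_containers=2):
        self.disk_index = disk_index                   # fingerprint -> container id
        self.container_metadata = container_metadata   # container id -> fingerprints
        self.max_containers = max_containers
        self.cached = OrderedDict()                    # container id -> fingerprints, LRU order
        self.disk_lookups = 0

    def _find_cached(self, fp):
        for cid, fps in self.cached.items():
            if fp in fps:
                return cid
        return None

    def lookup(self, fp):
        """Return the container id holding fp, or None if fp is not stored."""
        cid = self._find_cached(fp)
        if cid is not None:
            self.cached.move_to_end(cid)               # mark container as recently used
            return cid
        self.disk_lookups += 1                         # miss: consult the Disk Index
        cid = self.disk_index.get(fp)
        if cid is None:
            return None
        if len(self.cached) >= self.max_containers:
            self.cached.popitem(last=False)            # evict a whole container (assumed LRU)
        self.cached[cid] = set(self.container_metadata[cid])
        return cid


metadata = {0: ["a", "b", "c", "d"], 1: ["e", "f", "g", "h"]}
index = {fp: cid for cid, fps in metadata.items() for fp in fps}
cache = LocalityPreservedCache(index, metadata)

for fp in ["a", "b", "c", "d"]:
    cache.lookup(fp)
print(cache.disk_lookups)   # 1
```

One disk lookup for "a" brings in the fingerprints of its neighbours, so the following three lookups are served from memory.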

22 Putting Them Together
  [Diagram: lookup path for a fingerprint through the Index Cache, the Summary Vector (No / Maybe), and the Disk Index, with container metadata loaded into the Index Cache by replacement; outcomes are Duplicate or New]
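One reading of the diagram labels is the decision order sketched below: Index Cache first, then Summary Vector, then Disk Index. This ordering is an interpretation of the flattened diagram. For brevity the sketch uses Python sets and dicts as stand-ins: the summary vector is a plain set (with one extra entry to imitate a Bloom filter false positive), and cache replacement is omitted. All names are hypothetical.

```python
def classify(fp, index_cache, summary_vector, disk_index, container_metadata):
    """Return (verdict, path taken) for one incoming fingerprint."""
    if fp in index_cache:
        return "duplicate", "index cache hit"
    if fp not in summary_vector:
        return "new", "summary vector said no"
    container_id = disk_index.get(fp)            # the only step that touches disk
    if container_id is None:
        return "new", "summary vector false positive"
    # Preserve locality: cache every fingerprint from the same container.
    index_cache.update(container_metadata[container_id])
    return "duplicate", "disk index hit"


container_metadata = {0: ["A", "B"], 1: ["C"]}
disk_index = {"A": 0, "B": 0, "C": 1}
summary_vector = {"A", "B", "C", "Z"}   # "Z" imitates a false positive
index_cache = set()

for fp in ["A", "B", "Q", "Z"]:
    print(fp, classify(fp, index_cache, summary_vector, disk_index, container_metadata))
```

In this run only "A" and "Z" reach the Disk Index: "B" is answered by the cache after "A" loads its container, and "Q" is rejected by the summary vector.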

23 Evaluation
  What to evaluate
    Disk I/O reduction results
      Drive the evaluation with two real datasets
      Observe results at different parts of the system
    Write and read throughput
      A synthetic benchmark with multiple streams
      Mimic multiple backups
    Deduplication results
      Report results from two data centers
  Platform
    2 x quad-core 2.8GHz Intel CPUs, 8GB RAM, 10GE NIC, 1GB NVRAM, 15 x 7,200 RPM ATA disks

24 Disk I/O Reduction Results
                             Exchange data (2.56TB)       Engineering data (2.39TB)
                             135 daily full backups       100 days of daily inc., weekly full
                             # disk I/Os    % of total    # disk I/Os    % of total
  No summary, no SISL/LPC    328,613,       %             318,236,       %
  Summary only               274,364,       %             259,135,       %
  SISL/LPC only              57,725,        %             60,358,        %
  Summary & SISL/LPC         3,477,         %             1,257,         %
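The I/O counts in this table are truncated and the percentage column did not survive transcription. As a rough guide only, the relative reductions can be estimated from the leading digits that remain. The sketch below assumes those digits are counts in thousands of I/Os; its output is an approximation derived here, not a figure reported in the talk.

```python
# Leading digits that survive in the transcription, read as thousands of I/Os.
# Assumption: the missing trailing digits change the ratios only marginally.
disk_ios_thousands = {
    "Exchange": {"none": 328_613, "summary": 274_364, "sisl_lpc": 57_725, "both": 3_477},
    "Engineering": {"none": 318_236, "summary": 259_135, "sisl_lpc": 60_358, "both": 1_257},
}


def percent_of_baseline(row):
    """Express each configuration as a percentage of the no-optimization baseline."""
    base = row["none"]
    return {name: round(100 * count / base, 2) for name, count in row.items()}


for dataset, row in disk_ios_thousands.items():
    print(dataset, percent_of_baseline(row))
```

Under these assumptions the combined configuration needs on the order of 1% of the baseline disk I/Os for both datasets, while either technique alone leaves far more.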

25 Write Throughput
  [Chart: throughput (MB/s) versus generations]
  Platform: 2 x quad-core Xeon + 15 disks + 10GbE

26 Read Throughput
  [Chart: throughput (MB/s) versus generations]
  Platform: 2 x quad-core Xeon + 15 disks + 10GbE

27 Real World Example at Datacenter A

28 Real World Compression at Datacenter A

29 Real World Example at Datacenter B

30 Real World Compression at Datacenter B

31 Related Work
  Local compression
    Relatively small windows of bytes [Ziv&Lempel77, ...]
    Larger windows [Bentley&McIlroy99]
  File-level deduplication system
    Use file hashes to detect duplicate files
    CAS systems, not addressing throughput issues
  Fixed-size block deduplication storage prototype [Venti02]
    Fixed-size segments, not addressing throughput issues
  Segment-level deduplication
    Content-based segmentation methods [Manber93, Brin94]
    Variable-size segment deduplication for network traffic [Spring00, LBFS01, TAPER05, ...]
    Variable-size segment vs. delta deduplication [Kulkarni04]
  Use of Bloom filters
    Summary data structure [Bloom70, Fan98, Broder02]
    Detect duplicates [TAPER05]

32 Summary
  Deduplication storage replaces tape library
    Purchase cost: < tape library solution
    Space reduction: 10-30x
    Bandwidth reduction: x
    Power reduction: >10x (~compression ratio)
  Scalable deduplication NFS throughput
    ~110MB/s on 2 x 2.6GHz CPUs w/ 15 disks
    ~210MB/s on 2 x 2-core 3GHz CPUs w/ 15 disks
    ~360MB/s on 2 x 4-core 2.8GHz CPUs w/ 15 disks (~750MB/s for OST interface)
  Impact
    Over 10,000 systems deployed to many data centers
    >70% replicate data over WAN for disaster recovery
    See white papers at

33 Beyond the Backup Use Case
  Nearline storage (widely in use already)
    Besides throughput, optimize for file I/Os per second
    4-10X compression, depending on data
  Archiving (beginning)
    Lock, shredding, encryption, ...
    4-10X compression, depending on data
  Future issues
    Flash + dedup storage
    New storage ecosystem for data centers
    Storage infrastructure for cloud computing

Deduplication File System & Course Review

Deduplication File System & Course Review Deduplication File System & Course Review Kai Li 12/13/13 Topics u Deduplication File System u Review 12/13/13 2 Storage Tiers of A Tradi/onal Data Center $$$$ Mirrored storage $$$ Dedicated Fibre Clients

More information

WAN Optimized Replication of Backup Datasets Using Stream-Informed Delta Compression

WAN Optimized Replication of Backup Datasets Using Stream-Informed Delta Compression WAN Optimized Replication of Backup Datasets Using Stream-Informed Delta Compression Philip Shilane, Mark Huang, Grant Wallace, & Windsor Hsu Backup Recovery Systems Division EMC Corporation Introduction

More information

EMC DATA DOMAIN PRODUCT OvERvIEW

EMC DATA DOMAIN PRODUCT OvERvIEW EMC DATA DOMAIN PRODUCT OvERvIEW Deduplication storage for next-generation backup and archive Essentials Scalable Deduplication Fast, inline deduplication Provides up to 65 PBs of logical storage for long-term

More information

See what s new: Data Domain Global Deduplication Array, DD Boost and more. Copyright 2010 EMC Corporation. All rights reserved.

See what s new: Data Domain Global Deduplication Array, DD Boost and more. Copyright 2010 EMC Corporation. All rights reserved. See what s new: Data Domain Global Deduplication Array, DD Boost and more 2010 1 EMC Backup Recovery Systems (BRS) Division EMC Competitor Competitor Competitor Competitor Competitor Competitor Competitor

More information

EMC DATA DOMAIN OPERATING SYSTEM

EMC DATA DOMAIN OPERATING SYSTEM EMC DATA DOMAIN OPERATING SYSTEM Powering EMC Protection Storage ESSENTIALS High-Speed, Scalable Deduplication Up to 31 TB/hr performance Reduces requirements for backup storage by 10 to 30x and archive

More information

DELL EMC DATA DOMAIN SISL SCALING ARCHITECTURE

DELL EMC DATA DOMAIN SISL SCALING ARCHITECTURE WHITEPAPER DELL EMC DATA DOMAIN SISL SCALING ARCHITECTURE A Detailed Review ABSTRACT While tape has been the dominant storage medium for data protection for decades because of its low cost, it is steadily

More information

Copyright 2010 EMC Corporation. Do not Copy - All Rights Reserved.

Copyright 2010 EMC Corporation. Do not Copy - All Rights Reserved. 1 Using patented high-speed inline deduplication technology, Data Domain systems identify redundant data as they are being stored, creating a storage foot print that is 10X 30X smaller on average than

More information

Get More Out of Storage with Data Domain Deduplication Storage Systems

Get More Out of Storage with Data Domain Deduplication Storage Systems 1 Get More Out of Storage with Data Domain Deduplication Storage Systems David M. Auslander Sales Director, New England / Eastern Canada 2 EMC Data Domain Dedupe everything without changing anything Simplify

More information

DELL EMC DATA DOMAIN OPERATING SYSTEM

DELL EMC DATA DOMAIN OPERATING SYSTEM DATA SHEET DD OS Essentials High-speed, scalable deduplication Up to 68 TB/hr performance Reduces protection storage requirements by 10 to 30x CPU-centric scalability Data invulnerability architecture

More information

Delta Compressed and Deduplicated Storage Using Stream-Informed Locality

Delta Compressed and Deduplicated Storage Using Stream-Informed Locality Delta Compressed and Deduplicated Storage Using Stream-Informed Locality Philip Shilane, Grant Wallace, Mark Huang, and Windsor Hsu Backup Recovery Systems Division EMC Corporation Abstract For backup

More information

COS 318: Operating Systems. NSF, Snapshot, Dedup and Review

COS 318: Operating Systems. NSF, Snapshot, Dedup and Review COS 318: Operating Systems NSF, Snapshot, Dedup and Review Topics! NFS! Case Study: NetApp File System! Deduplication storage system! Course review 2 Network File System! Sun introduced NFS v2 in early

More information

Balakrishnan Nair. Senior Technology Consultant Back Up & Recovery Systems South Gulf. Copyright 2011 EMC Corporation. All rights reserved.

Balakrishnan Nair. Senior Technology Consultant Back Up & Recovery Systems South Gulf. Copyright 2011 EMC Corporation. All rights reserved. Balakrishnan Nair Senior Technology Consultant Back Up & Recovery Systems South Gulf 1 Thinking Fast: The World s Fastest Backup Now Does Archive Too Introducing the New EMC Backup and Recovery Solutions

More information

DELL EMC DATA DOMAIN OPERATING SYSTEM

DELL EMC DATA DOMAIN OPERATING SYSTEM DATA SHEET DD OS Essentials High-speed, scalable deduplication Reduces protection storage requirements by up to 55x Up to 3x restore performance CPU-centric scalability Data invulnerability architecture

More information

Sparse Indexing: Large-Scale, Inline Deduplication Using Sampling and Locality

Sparse Indexing: Large-Scale, Inline Deduplication Using Sampling and Locality Sparse Indexing: Large-Scale, Inline Deduplication Using Sampling and Locality Mark Lillibridge, Kave Eshghi, Deepavali Bhagwat, Vinay Deolalikar, Greg Trezise, and Peter Camble Work done at Hewlett-Packard

More information

Scale-out Object Store for PB/hr Backups and Long Term Archive April 24, 2014

Scale-out Object Store for PB/hr Backups and Long Term Archive April 24, 2014 Scale-out Object Store for PB/hr Backups and Long Term Archive April 24, 2014 Gideon Senderov Director, Advanced Storage Products NEC Corporation of America Long-Term Data in the Data Center (EB) 140 120

More information

ChunkStash: Speeding Up Storage Deduplication using Flash Memory

ChunkStash: Speeding Up Storage Deduplication using Flash Memory ChunkStash: Speeding Up Storage Deduplication using Flash Memory Biplob Debnath +, Sudipta Sengupta *, Jin Li * * Microsoft Research, Redmond (USA) + Univ. of Minnesota, Twin Cities (USA) Deduplication

More information

EMC Data Domain for Archiving Are You Kidding?

EMC Data Domain for Archiving Are You Kidding? EMC Data Domain for Archiving Are You Kidding? Bill Roth / Bob Spurzem EMC EMC 1 Agenda EMC Introduction Data Domain Enterprise Vault Integration Data Domain NetBackup Integration Q & A EMC 2 EMC Introduction

More information

Data Deduplication Methods for Achieving Data Efficiency

Data Deduplication Methods for Achieving Data Efficiency Data Deduplication Methods for Achieving Data Efficiency Matthew Brisse, Quantum Gideon Senderov, NEC... SNIA Legal Notice The material contained in this tutorial is copyrighted by the SNIA. Member companies

More information

White paper ETERNUS CS800 Data Deduplication Background

White paper ETERNUS CS800 Data Deduplication Background White paper ETERNUS CS800 - Data Deduplication Background This paper describes the process of Data Deduplication inside of ETERNUS CS800 in detail. The target group consists of presales, administrators,

More information

IBM řešení pro větší efektivitu ve správě dat - Store more with less

IBM řešení pro větší efektivitu ve správě dat - Store more with less IBM řešení pro větší efektivitu ve správě dat - Store more with less IDG StorageWorld 2012 Rudolf Hruška Information Infrastructure Leader IBM Systems & Technology Group rudolf_hruska@cz.ibm.com IBM Agenda

More information

Accelerate the Journey to 100% Virtualization with EMC Backup and Recovery. Copyright 2010 EMC Corporation. All rights reserved.

Accelerate the Journey to 100% Virtualization with EMC Backup and Recovery. Copyright 2010 EMC Corporation. All rights reserved. Accelerate the Journey to 100% Virtualization with EMC Backup and Recovery 1 Exabytes Today s Data Protection Challenges Unabated Data Growth Backup typically represents a factor of 4 to 30 times production

More information

ASN Configuration Best Practices

ASN Configuration Best Practices ASN Configuration Best Practices Managed machine Generally used CPUs and RAM amounts are enough for the managed machine: CPU still allows us to read and write data faster than real IO subsystem allows.

More information

SPECIFICATION FOR NETWORK ATTACHED STORAGE (NAS) TO BE FILLED BY BIDDER. NAS Controller Should be rack mounted with a form factor of not more than 2U

SPECIFICATION FOR NETWORK ATTACHED STORAGE (NAS) TO BE FILLED BY BIDDER. NAS Controller Should be rack mounted with a form factor of not more than 2U SPECIFICATION FOR NETWORK ATTACHED STORAGE (NAS) TO BE FILLED BY BIDDER S.No. Features Qualifying Minimum Requirements No. of Storage 1 Units 2 Make Offered 3 Model Offered 4 Rack mount 5 Processor 6 Memory

More information

The World s Fastest Backup Systems

The World s Fastest Backup Systems 3 The World s Fastest Backup Systems Erwin Freisleben BRS Presales Austria 4 EMC Data Domain: Leadership and Innovation A history of industry firsts 2003 2004 2005 2006 2007 2008 2009 2010 2011 First deduplication

More information

Backup and archiving need not to create headaches new pain relievers are around

Backup and archiving need not to create headaches new pain relievers are around Backup and archiving need not to create headaches new pain relievers are around Frank Reichart Senior Director Product Marketing Storage Copyright 2012 FUJITSU Hot Spots in Data Protection 1 Copyright

More information

DELL EMC DATA DOMAIN DEDUPLICATION STORAGE SYSTEMS

DELL EMC DATA DOMAIN DEDUPLICATION STORAGE SYSTEMS SPEC SHEET DELL EMC DATA DOMAIN DEDUPLICATION STORAGE SYSTEMS Data Domain Systems Dell EMC Data Domain deduplication storage systems continue to revolutionize disk backup, archiving, and disaster recovery

More information

HP D2D & STOREONCE OVERVIEW

HP D2D & STOREONCE OVERVIEW HP D2D & STOREONCE OVERVIEW Robert Clifford Pre-Sales Storage Consultant 1 Data loss (Recovery Point Objective) Let s put it in context.. Last Transaction Increased availability Mission critical support

More information

A. Deduplication rate is less than expected, accounting for the remaining GSAN capacity

A. Deduplication rate is less than expected, accounting for the remaining GSAN capacity Volume: 326 Questions Question No: 1 An EMC Avamar customer s Gen-1 system with 4 TB of GSAN capacity has reached read-only threshold. The customer indicates that the deduplicated backup data accounts

More information

DELL EMC DATA DOMAIN DEDUPLICATION STORAGE SYSTEMS

DELL EMC DATA DOMAIN DEDUPLICATION STORAGE SYSTEMS SPEC SHEET DELL EMC DATA DOMAIN DEDUPLICATION STORAGE SYSTEMS Data Domain Systems Dell EMC Data Domain deduplication storage systems continue to revolutionize disk backup, archiving, and disaster recovery

More information

Understanding Primary Storage Optimization Options Jered Floyd Permabit Technology Corp.

Understanding Primary Storage Optimization Options Jered Floyd Permabit Technology Corp. Understanding Primary Storage Optimization Options Jered Floyd Permabit Technology Corp. Primary Storage Optimization Technologies that let you store more data on the same storage Thin provisioning Copy-on-write

More information

DELL EMC DATA DOMAIN DEDUPLICATION STORAGE SYSTEMS

DELL EMC DATA DOMAIN DEDUPLICATION STORAGE SYSTEMS Data Domain Systems Table 1. DELL EMC DATA DOMAIN DEDUPLICATION STORAGE SYSTEMS Dell EMC Data Domain deduplication storage systems continue to revolutionize disk backup, archiving, and disaster recovery

More information

Opendedupe & Veritas NetBackup ARCHITECTURE OVERVIEW AND USE CASES

Opendedupe & Veritas NetBackup ARCHITECTURE OVERVIEW AND USE CASES Opendedupe & Veritas NetBackup ARCHITECTURE OVERVIEW AND USE CASES May, 2017 Contents Introduction... 2 Overview... 2 Architecture... 2 SDFS File System Service... 3 Data Writes... 3 Data Reads... 3 De-duplication

More information

Hyper-converged Secondary Storage for Backup with Deduplication Q & A. The impact of data deduplication on the backup process

Hyper-converged Secondary Storage for Backup with Deduplication Q & A. The impact of data deduplication on the backup process Hyper-converged Secondary Storage for Backup with Deduplication Q & A The impact of data deduplication on the backup process Table of Contents Introduction... 3 What is data deduplication?... 3 Is all

More information

Dell DR4100. Disk based Data Protection and Disaster Recovery. April 3, Birger Ferber, Enterprise Technologist Storage EMEA

Dell DR4100. Disk based Data Protection and Disaster Recovery. April 3, Birger Ferber, Enterprise Technologist Storage EMEA Dell DR4100 Disk based Data Protection and Disaster Recovery April 3, 2013 Birger Ferber, Enterprise Technologist Storage EMEA Agenda DR4100 Technical Review DR4100 New Features DR4100 Use Cases DR4100

More information

HPE SimpliVity. The new powerhouse in hyperconvergence. Boštjan Dolinar HPE. Maribor Lancom

HPE SimpliVity. The new powerhouse in hyperconvergence. Boštjan Dolinar HPE. Maribor Lancom HPE SimpliVity The new powerhouse in hyperconvergence Boštjan Dolinar HPE Maribor Lancom 2.2.2018 Changing requirements drive the need for Hybrid IT Application explosion Hybrid growth 2014 5,500 2015

More information

Rethinking Deduplication Scalability

Rethinking Deduplication Scalability Rethinking Deduplication Scalability Petros Efstathopoulos Petros Efstathopoulos@symantec.com Fanglu Guo Fanglu Guo@symantec.com Symantec Research Labs Symantec Corporation, Culver City, CA, USA 1 ABSTRACT

More information

Technology Insight Series

Technology Insight Series EMC Avamar for NAS - Accelerating NDMP Backup Performance John Webster June, 2011 Technology Insight Series Evaluator Group Copyright 2011 Evaluator Group, Inc. All rights reserved. Page 1 of 7 Introduction/Executive

More information

50 TB. Traditional Storage + Data Protection Architecture. StorSimple Cloud-integrated Storage. Traditional CapEx: $375K Support: $75K per Year

50 TB. Traditional Storage + Data Protection Architecture. StorSimple Cloud-integrated Storage. Traditional CapEx: $375K Support: $75K per Year Compelling Economics: Traditional Storage vs. StorSimple Traditional Storage + Data Protection Architecture StorSimple Cloud-integrated Storage Servers Servers Primary Volume Disk Array ($100K; Double

More information

A DEDUPLICATION-INSPIRED FAST DELTA COMPRESSION APPROACH W EN XIA, HONG JIANG, DA N FENG, LEI T I A N, M I N FU, YUKUN Z HOU

A DEDUPLICATION-INSPIRED FAST DELTA COMPRESSION APPROACH W EN XIA, HONG JIANG, DA N FENG, LEI T I A N, M I N FU, YUKUN Z HOU A DEDUPLICATION-INSPIRED FAST DELTA COMPRESSION APPROACH W EN XIA, HONG JIANG, DA N FENG, LEI T I A N, M I N FU, YUKUN Z HOU PRESENTED BY ROMAN SHOR Overview Technics of data reduction in storage systems:

More information

Speeding Up Cloud/Server Applications Using Flash Memory

Speeding Up Cloud/Server Applications Using Flash Memory Speeding Up Cloud/Server Applications Using Flash Memory Sudipta Sengupta and Jin Li Microsoft Research, Redmond, WA, USA Contains work that is joint with Biplob Debnath (Univ. of Minnesota) Flash Memory

More information

DELL EMC DATA DOMAIN DEDUPLICATION STORAGE SYSTEMS

DELL EMC DATA DOMAIN DEDUPLICATION STORAGE SYSTEMS DELL EMC DATA DOMAIN DEDUPLICATION STORAGE SYSTEMS Dell EMC Data Domain deduplication storage systems continue to revolutionize disk backup, archiving, and disaster recovery with high-speed, inline deduplication.

More information

Veeam and HP: Meet your backup data protection goals

Veeam and HP: Meet your backup data protection goals Sponsored by Veeam and HP: Meet your backup data protection goals Eric Machabert Сonsultant and virtualization expert Introduction With virtualization systems becoming mainstream in recent years, backups

More information

TCO REPORT. NAS File Tiering. Economic advantages of enterprise file management

TCO REPORT. NAS File Tiering. Economic advantages of enterprise file management TCO REPORT NAS File Tiering Economic advantages of enterprise file management Executive Summary Every organization is under pressure to meet the exponential growth in demand for file storage capacity.

More information

Protect enterprise data, achieve long-term data retention

Protect enterprise data, achieve long-term data retention Technical white paper Protect enterprise data, achieve long-term data retention HP StoreOnce Catalyst and Symantec NetBackup OpenStorage Table of contents Introduction 2 Technology overview 3 HP StoreOnce

More information

Redesign your Backup with Deduplication

Redesign your Backup with Deduplication Redesign your Backup with Deduplication June, 2010 1 Agenda Registration Welcome (5 minutes) Introduction to Deduplication for Backup (15 minutes) Lunch EMC Data Domain and Avamar (60 minutes) Discussion

More information

DASH COPY GUIDE. Published On: 11/19/2013 V10 Service Pack 4A Page 1 of 31

DASH COPY GUIDE. Published On: 11/19/2013 V10 Service Pack 4A Page 1 of 31 DASH COPY GUIDE Published On: 11/19/2013 V10 Service Pack 4A Page 1 of 31 DASH Copy Guide TABLE OF CONTENTS OVERVIEW GETTING STARTED ADVANCED BEST PRACTICES FAQ TROUBLESHOOTING DASH COPY PERFORMANCE TUNING

More information

WHITE PAPER. DATA DEDUPLICATION BACKGROUND: A Technical White Paper

WHITE PAPER. DATA DEDUPLICATION BACKGROUND: A Technical White Paper WHITE PAPER DATA DEDUPLICATION BACKGROUND: A Technical White Paper CONTENTS Data Deduplication Multiple Data Sets from a Common Storage Pool.......................3 Fixed-Length Blocks vs. Variable-Length

More information

Building Backup-to-Disk and Disaster Recovery Solutions with the ReadyDATA 5200

Building Backup-to-Disk and Disaster Recovery Solutions with the ReadyDATA 5200 Building Backup-to-Disk and Disaster Recovery Solutions with the ReadyDATA 5200 WHITE PAPER Explosive data growth is a challenging reality for IT and data center managers. IDC reports that digital content

More information

Increasing Performance of Existing Oracle RAC up to 10X

Increasing Performance of Existing Oracle RAC up to 10X Increasing Performance of Existing Oracle RAC up to 10X Prasad Pammidimukkala www.gridironsystems.com 1 The Problem Data can be both Big and Fast Processing large datasets creates high bandwidth demand

More information

SMORE: A Cold Data Object Store for SMR Drives

SMORE: A Cold Data Object Store for SMR Drives SMORE: A Cold Data Object Store for SMR Drives Peter Macko, Xiongzi Ge, John Haskins Jr.*, James Kelley, David Slik, Keith A. Smith, and Maxim G. Smith Advanced Technology Group NetApp, Inc. * Qualcomm

More information

Quest DR Series Disk Backup Appliances

Quest DR Series Disk Backup Appliances Quest DR Series Disk Backup Appliances Back up more. Store less. Perform better. Keeping up with the volume of data to protect can be complex and time consuming, but managing the storage of that data doesn

More information

DELL EMC DATA DOMAIN EXTENDED RETENTION SOFTWARE

DELL EMC DATA DOMAIN EXTENDED RETENTION SOFTWARE WHITEPAPER DELL EMC DATA DOMAIN EXTENDED RETENTION SOFTWARE A Detailed Review ABSTRACT This white paper introduces Dell EMC Data Domain Extended Retention software that increases the storage scalability

More information

dedupv1: Improving Deduplication Throughput using Solid State Drives (SSD)

dedupv1: Improving Deduplication Throughput using Solid State Drives (SSD) University Paderborn Paderborn Center for Parallel Computing Technical Report dedupv1: Improving Deduplication Throughput using Solid State Drives (SSD) Dirk Meister Paderborn Center for Parallel Computing

More information

STOREONCE OVERVIEW. Neil Fleming Mid-Market Storage Development Manager. Copyright 2010 Hewlett-Packard Development Company, L.P.

STOREONCE OVERVIEW. Neil Fleming Mid-Market Storage Development Manager. Copyright 2010 Hewlett-Packard Development Company, L.P. STOREONCE OVERVIEW Neil Fleming Neil.Fleming@HP.com Mid-Market Development Manager 1 DETERMINE YOUR RECOVERY NEEDS Wks Days Hrs Mins Secs Secs Mins Hrs Days Wks TAPE D2D Recovery Point SNAP SNAP Recovery

More information

EMC Solutions for Backup to Disk EMC Celerra LAN Backup to Disk with IBM Tivoli Storage Manager Best Practices Planning

EMC Solutions for Backup to Disk EMC Celerra LAN Backup to Disk with IBM Tivoli Storage Manager Best Practices Planning EMC Solutions for Backup to Disk EMC Celerra LAN Backup to Disk with IBM Tivoli Storage Manager Best Practices Planning Abstract This white paper describes how to configure the Celerra IP storage system

More information

HYDRAstor: a Scalable Secondary Storage

HYDRAstor: a Scalable Secondary Storage HYDRAstor: a Scalable Secondary Storage 7th TF-Storage Meeting September 9 th 00 Łukasz Heldt Largest Japanese IT company $4 Billion in annual revenue 4,000 staff www.nec.com Polish R&D company 50 engineers

More information

The Business Case to deploy NetBackup Appliances and Deduplication

The Business Case to deploy NetBackup Appliances and Deduplication The Business Case to deploy NetBackup Appliances and Deduplication Noel Winter Appliance Product Lead EMEA Alex Lowe Regional Product Manager EMEA 1 Industry Leading Appliances Powered by Intel Intel Xeon

More information

Zero Data Loss Recovery Appliance DOAG Konferenz 2014, Nürnberg

Zero Data Loss Recovery Appliance DOAG Konferenz 2014, Nürnberg Zero Data Loss Recovery Appliance Frank Schneede, Sebastian Solbach Systemberater, BU Database, Oracle Deutschland B.V. & Co. KG Safe Harbor Statement The following is intended to outline our general product

More information

Trends in Data Protection and Restoration Technologies. Mike Fishman, EMC 2 Corporation

Trends in Data Protection and Restoration Technologies. Mike Fishman, EMC 2 Corporation Trends in Data Protection and Restoration Technologies Mike Fishman, EMC 2 Corporation SNIA Legal Notice The material contained in this tutorial is copyrighted by the SNIA unless otherwise noted. Member

More information

Top Trends in DBMS & DW

Top Trends in DBMS & DW Oracle Top Trends in DBMS & DW Noel Yuhanna Principal Analyst Forrester Research Trend #1: Proliferation of data Data doubles every 18-24 months for critical Apps, for some its every 6 months Terabyte

More information

NetVault Backup Client and Server Sizing Guide 2.1

NetVault Backup Client and Server Sizing Guide 2.1 NetVault Backup Client and Server Sizing Guide 2.1 Recommended hardware and storage configurations for NetVault Backup 10.x and 11.x September, 2017 Page 1 Table of Contents 1. Abstract... 3 2. Introduction...

More information
