ASN Configuration Best Practices


Managed machine

The CPU and RAM amounts typically found on a managed machine are sufficient: the CPU can still read and write data faster than the real I/O subsystem allows, and because buffering and other techniques are used, the amount of committed RAM is generally low. I/O speed is what directly affects backup speed. Whether or not RAID is used, the backup speed is the same. Disabling SSL on a machine may increase backup speed when that machine is the only one backing up to ASN (Acronis Storage Node); if multiple machines back up simultaneously, disabling SSL has no effect. SSL encryption is currently turned on by default.

Network

Network speed tends to be the main bottleneck during simultaneous backups without deduplication. Backing up a client machine with a 7200-10000 RPM hard disk generates about 300-400 Mbit/s of network traffic. With deduplication, or with 1 Gbit/s of network bandwidth, the bottleneck can shift to the ASN storage. Use fast network devices if simultaneous backups to ASN are expected. Deduplication may have little effect if the amount of unique data is large, so network congestion is still possible even with deduplication.

ASN (with deduplication enabled)

General

It is better to use a dedicated machine for ASN. The requirements for such a machine are similar to those for a file server. One vault per ASN is the best practice, for the following reasons:
1. Better performance, since fewer simultaneous processes (such as indexing or compacting) are running.
2. Better deduplication, because data cannot be deduplicated across different vaults.

CPU and RAM

Having multiple CPU cores is a plus: the more cores, the better (indexing uses 5-6 threads, and each client backup uses 2-4 threads). Meanwhile, 2 GB of free memory is enough for ASN, so there is no need to grow it further; add to that the amount of memory the operating system requires.

Storage devices

Deduplication basics

When creating a vault on a storage node, you specify paths to the vault folder and the vault database.
The vault folder stores backup files reduced as a result of deduplication, along with one or two large files (datastores) that contain the data items unique to the vault. The vault database stores the links that are necessary to assemble the deduplicated data. Vault data and the vault database should be stored on different disks; this makes for better performance, because both the vault and the database are intensively accessed during any ASN operation. The following scheme illustrates how deduplicated data is stored and why access time to the database is critical.

[Figure: the vault database disk holds the database mapping each hash to a link; the vault disk holds the archives (TIB backup files containing hashes "h" and original, non-deduplicated items "o") and the datastore of unique data items.]

Instead of the original data items, backups (TIB files) store hashes of those items (h). For disk-level backups, a data item corresponds to a disk block; for file-level backups, a data item corresponds to a file. Data that is not yet deduplicated, or can never be deduplicated (for example, files less than 4 KB in size, or password-protected backups), is stored in the TIB files in its original state (o). The database stores both hashes and links to the datastore, where the original data corresponding to these hashes is found. If the vault contains both disk-level and file-level backups, there are two separate datastores for them.

When a recovery process is started, the software takes the first hash from the backup file and searches the database for the link to the corresponding data item. Using this link, the software finds the original

data item in the datastore and puts the data item into the output data stream. Data that can never be deduplicated is put directly into the output data stream.

Database disk

Minimal access time is extremely important for the device hosting the vault database. Example: RAID0 consisting of 10,000 RPM SAS drives. A solid-state drive (SSD) offers the fastest access. No-compromise solutions are a RAID0 of two SSDs, or RAID 0+1, which adds both speed and redundancy. When the database size exceeds 20 GB, which corresponds to approximately 700 GB of unique data stored in the vault, backup performance may degrade. If backup speed is critical for your business processes, do one of the following:
1) Delete some of the archives.
2) Export some of the archives, or the entire vault, to a different storage, and then delete the original archives from the vault. (After operation 1 or 2, the compacting operation is required.)
3) Set up a new storage node and redirect your backups there.

Vault disk

The device that hosts the vault can be relatively slow. Example: a 7200 RPM SATA drive. There is no preference between local disks and a SAN. The device that hosts the vault must have plenty of free space for indexing and compacting operations.

Indexing

The indexing operation can be briefly described as deduplication at the target. It disassembles the backed-up data and writes the unique data items to the datastore file(s). Indexing of an archive starts every time a new backup is added to the archive. As indexing proceeds, it creates a clone of the archive being indexed on the same drive. Once indexing is complete, the new clone copy is kept and the old instance of the archive is deleted. If indexing is interrupted (for example, by stopping the ASN service), the new, incomplete clone copy is deleted. Hence, the old archive size plus 10% is the minimum amount of free space required for a normal indexing operation. The old archive may not be deduplicated yet, if deduplication at the source is not configured.
For this reason, the required free space can be estimated as 1.1 times the size of the largest expected backup. If there is not enough free space in the vault, the indexing task will fail and start again after 5-10 minutes, on the assumption that some space has been freed up. The more free space there is in the vault, the faster your archives will shrink to the minimum possible size.
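As a quick sanity check of the sizing rules above, the following Python sketch (a hypothetical helper, not part of any ASN tooling) estimates the vault database size and the free vault space needed for indexing. Linear growth of the database is an assumption for estimation only.

```python
def vault_db_size_gb(unique_data_gb):
    # The text observes the database reaches ~20 GB at ~700 GB of
    # unique data; linear growth is assumed here for estimation only.
    return unique_data_gb * 20.0 / 700.0

def indexing_free_space_gb(largest_backup_gb):
    # Indexing clones the archive on the same drive, so reserve the
    # old archive size plus ~10% (the 1.1 factor from the text).
    return round(1.1 * largest_backup_gb, 1)

print(vault_db_size_gb(350))        # -> 10.0
print(indexing_free_space_gb(200))  # -> 220.0
```

For example, a vault holding 350 GB of unique data implies a database of roughly 10 GB, and indexing a 200 GB archive needs at least 220 GB of free vault space.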

Compacting

The compacting operation compacts the datastore files; that is, it removes the data items that have become unnecessary after some of the backups were deleted. Compacting creates a clone of the datastore, but it does not copy the entire datastore at once. The software checks every data item in the datastore; if an existing archive still links to the item, the item is moved to the new datastore. At the end of the operation, the old datastore contains only the unnecessary items, and it is safely deleted. The worst situation from a disk-space standpoint is when the datastore contains an item corresponding to a large file: the software needs space to copy this item to the new datastore before deleting it from the old one. Thus, the free disk space required for compacting is equal to the size of the largest file expected in a file-level backup. For disk-level backups, a data item is the size of a disk block, so the required space is negligible.

Number of connections

The default settings for ASN connections and the backup queue length are:
- 10 machines backing up simultaneously
- 50 machines waiting in the queue
These numbers can now be safely changed to 50/50, thanks to major enhancements in how ASN handles parallel backups. This can reduce the average backup time.

ASN client machine number estimation

One ASN can process a limited number of client machines. To estimate this limit, take into account how much data is going to be backed up, the general backup-to-ASN speed, and the indexing speed. Backup-to-ASN speed depends on the ASN machine configuration, and it is hard to estimate in advance which configuration will be used. Here are example test results that can be used for rough estimation*:
- Backup data size: 200 GB
- Backup time: 200 min
- Indexing time: 570 min
Backup and indexing run almost independently, so the indexing time can be used to estimate how much data a single ASN can process.
In the example configuration, ASN can reindex about 500 GB of backup data per day. The amount of data to be backed up from each machine can be taken from internal company statistics. A general assumption is that each client machine has about 70 GB of data to back up, with a 2% daily change rate. For initial full backups, ASN will be able to process 500/70 ≈ 7 machines per day (if network bandwidth is not a bottleneck; see above). For incremental data processing, the number of machines is

quite large. So the number of machines that one ASN can serve depends mainly on the time available for the initial full backups.

* Configuration of the test server: 2x Intel Xeon E5420 2.5 GHz, 8 GB RAM, RAID0 of six 7200 RPM SATA disks for both storage and DB.
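The estimate above can be reproduced with a short Python sketch. All figures (200 GB indexed in 570 minutes, 70 GB per machine, 2% daily change rate) come from the example test results and are illustrative assumptions only, not guaranteed throughput.

```python
DAY_MIN = 24 * 60  # minutes in a day

def daily_index_capacity_gb(data_gb=200, index_min=570):
    # Throughput implied by the example test: 200 GB indexed in 570 min.
    return DAY_MIN / index_min * data_gb

def machines_per_day(per_machine_gb, capacity_gb):
    # How many machines' worth of data fits into one day of indexing.
    return int(capacity_gb // per_machine_gb)

print(round(daily_index_capacity_gb()))  # -> 505 (the text rounds to 500)
print(machines_per_day(70, 500))         # -> 7 (initial full backups)
print(machines_per_day(70 * 0.02, 500))  # -> 357 (2% daily incrementals)
```

This makes the "quite large" incremental figure concrete: at a 2% change rate (about 1.4 GB per machine per day), the same indexing capacity covers hundreds of machines, so initial full backups are the real constraint.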