BACKUP AND RECOVERY OF THE PIVOTAL DATA COMPUTING APPLIANCE UAP EDITION


White Paper

BACKUP AND RECOVERY OF THE PIVOTAL DATA COMPUTING APPLIANCE UAP EDITION

EMC Data Domain Systems, EMC Data Domain Boost, Pivotal Greenplum Database, Pivotal HD, Pivotal Chorus, Network File System

- Data warehouse protection
- Major storage savings with data deduplication
- Incremental backup of append-only tables

EMC Solutions Group

Abstract

This white paper provides insight into how EMC Data Domain deduplication storage systems deal with the data growth, retention requirements, and recovery service levels that are essential to businesses. It explores the practices and considerations for backing up the Pivotal Data Computing Appliance Unified Analytics Platform Edition to Data Domain systems and how to effectively exploit Data Domain's leading-edge technology.

July 2013

Copyright 2013 EMC Corporation. All Rights Reserved.

EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

The information in this publication is provided as is. EMC Corporation makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose.

Use, copying, and distribution of any EMC software described in this publication requires an applicable software license. For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com.

Pivotal and the Pivotal logo are registered trademarks or trademarks of GoPivotal Corporation in the United States and other countries. All other trademarks used herein are the property of their respective owners.

Copyright 2013 GoPivotal Corporation. All Rights Reserved. All trademarks used herein are the property of their respective owners.

Part Number H

Table of contents

Executive summary
  Business case
  Solution overview
  Key results
Introduction
  Purpose
  Scope
  Audience
  Terminology
Overview of components
  Introduction to the components
  Pivotal DCA UAP Edition
  EMC Data Domain deduplication storage system
Solution architecture
  Overview of the DCA UAP Edition and Data Domain solution
  Physical environment
  Hardware resources
  Software resources
Data Domain system features, integration, and administration
  Introduction to Data Domain systems
  Faster, more efficient backup
  Network-efficient replication
  Seamless integration
  Ultra-safe storage for fast and reliable recovery
  Scalable deduplication storage
  Easy integration
  Integration into an existing Pivotal DCA UAP Edition
  Operational simplicity
  Data integrity
  Data compression
  SISL
  Multipath and load-balancing configuration
  DD Boost
  Design considerations
  Data Domain Enterprise Manager
  Data Domain file system
  Creating and working with shares on the Data Domain system
  Pivotal Chorus
Installing and configuring Data Domain with the DCA UAP Edition
  Overview
  Installing Data Domain
  Configuring Data Domain
  Connecting Data Domain to the DCA UAP Edition
  Pivotal GPDB, NFS: Configuring the Data Domain system and the DCA UAP Edition
  Pivotal GPDB, NFS: Backing up to Data Domain from the DCA UAP Edition
  Pivotal GPDB, NFS: Enabling Direct I/O
  Pivotal GPDB, NFS: Restoring data to the DCA UAP Edition from Data Domain
  Pivotal GPDB, DD Boost: Configuring the Data Domain system and the DCA UAP Edition
  Pivotal GPDB, DD Boost: Backing up to Data Domain from the DCA UAP Edition
  Pivotal GPDB, DD Boost: Restoring data to the DCA UAP Edition from Data Domain
  Pivotal HD, NFS: Configuring the Data Domain system and the DCA UAP Edition
  Pivotal HD, NFS: Backing up to Data Domain from the DCA UAP Edition
  Pivotal HD, NFS: Restoring data to the DCA UAP Edition using Data Domain
  Backing up Pivotal Chorus using Data Domain
  Restoring Pivotal Chorus using Data Domain
Pivotal GPDB, NFS test results
  Test objectives
  Test scenarios
  Test 1 results
  Test 2 results
Pivotal GPDB, DD Boost test results
  Test objectives
  Test scenarios
  Test 1 results
  Test 2 results
Pivotal HD, NFS test results
  Test objectives
  Test scenarios
  Test 1 results
  Test 2 results
Conclusion
  Summary
  Findings
References
  Product documentation

Executive summary

Business case

The Pivotal Data Computing Appliance (DCA) Unified Analytics Platform (UAP) Edition combines the power of its classic data warehouse, Pivotal Greenplum Database (GPDB), for structured data management with Pivotal Hadoop (HD) for unstructured data management. Another critical UAP component, Pivotal Chorus, provides collaborative analytics.

In today's business environments, data analytical systems are:

- Growing exponentially over time, often leading to many terabytes of storage requiring protection
- Frequently becoming unmanageable, lacking the ability to back up or provide effective disaster recovery (DR)

To ensure that an organization can make the right decisions in the shortest possible time, its data warehouse, including structured and unstructured data, must be reliable, secure, high performing, and extremely flexible to support the growing data volume, concurrency of many users, and increasing complexity of analysis. Therefore, an organization's data needs to be loaded into the system so that it can be queried intelligently and quickly. The Pivotal DCA UAP Edition enables organizations to make strategic business and operational decisions based on advanced analytics, which in turn contribute to the overall success of an organization, both functionally and economically.

Data protection for a data warehouse is a critical IT discipline. The data needs to be backed up, recovered, and restored efficiently and cost-effectively. Businesses have historically chosen simple approaches, such as periodically creating a full backup to tape. With the rapid growth in the volume of data stored, a simple, periodic full backup to tape or to a non-deduplicating disk-based storage system is no longer economical. EMC Data Domain deduplication storage systems provide the recoverability and reliability of a next-generation backup solution that meets business needs for efficient, cost-effective backup.

EMC recommends that you read this paper in conjunction with the Pivotal Data Computing Appliance Getting Started Guide.

Solution overview

Data warehouse environments demand a comprehensive strategy for data protection, security, and high availability. Data recovery options must align with application and business requirements to yield the highest availability. Organizations face many challenges and require:

- Efficient use of both infrastructure and people to support the business by:
  - Improving IT efficiency by saving hours of staff time and boosting user productivity
  - Reducing operational and infrastructure costs by eliminating the requirements of tape and reducing the data center footprint requirements

  - Reducing the need for massive backup storage
  - Matching infrastructure costs with changing information value through efficient, cost-effective storage
- Simplicity through ease of integration and management
- Faster backups and restores to:
  - Minimize backup windows, reducing the impact on application and system performance
  - Meet more aggressive recovery time objective (RTO) service-level agreements (SLAs) set by business owners by maximizing end-to-end recovery performance

The DCA UAP Edition provides an end-to-end data warehousing/business intelligence (DW/BI) solution packaged within a manageable, consolidated, self-contained data warehouse appliance that can be easily integrated into an existing data center.

This white paper describes a backup and recovery environment for the DCA UAP Edition using EMC Data Domain deduplication storage systems. The Data Domain architecture provides a robust backup and recovery solution that protects companies' increasingly critical business intelligence (BI) and data warehouse systems. This protection matters because most businesses cannot survive without their BI function; a backup solution that provides an easy and fast method of data recovery is a necessity rather than an option. Pivotal has ensured that leading-edge technology is available to support the backup and recovery of its DCA UAP Edition. This white paper demonstrates how the Data Domain system achieves this in the most simple, efficient, and cost-effective way to provide information protection and greatly increased storage savings.

Key results

With data deduplication, organizations can reduce the amount of data that needs to be stored over an extended period of time. This offers cost savings in terms of the number of disks or tapes required for backup. In addition, organizations can fundamentally change the way they protect their backup and archive data. With Data Domain's deduplication technology, only new data is written to disk, saving vast amounts of backup storage and backup time. This eliminates the restore penalties of incremental backups.

This solution describes a backup and recovery environment for the DCA UAP Edition using the Data Domain DD990. The test results show that this solution meets the business challenges faced by many organizations today through:

- Operational efficiency: Operational ease and efficiency through the use of the massively parallel processing (MPP) architecture, which streams Pivotal GPDB backups across the network to the Data Domain system in the most efficient way.
- Storage reduction: Major space savings using Data Domain inline deduplication. In this case, test results with Pivotal GPDB Data Domain Boost (DD Boost) demonstrate:

  - Savings of 34.4x on repetitive nightly backups
  - An aggregate 11.3x cumulative storage reduction over a week of uncompressed backups to the Data Domain system
- Support for incremental backups: Dramatically reduced backup times with support for incremental backups of append-only (AO) tables.
- Faster backup and recovery: Fast restore times for returning uncompressed and deduplicated data to the DCA UAP Edition. The test results with Pivotal GPDB DD Boost demonstrate that the combination of gpcrondump backups and Data Domain deduplication technology provides an average backup rate of approximately 11.1 TB/hour and a restore rate of 6.19 TB/hour. This enables the backup and recovery of two DCA UAP Edition Pivotal GPDB Compute Modules (an 18 TB uncompressed database) in 1.62 and 2.92 hours respectively.
- Backup of Pivotal HD: Efficient backup and recovery of the Pivotal HD data. The test results for Pivotal HD NFS backups demonstrate that the combination of DistCp and Data Domain deduplication technology provides an average backup rate of 5.21 TB/hour and a restore rate of 3.86 TB/hour.

The Data Domain DD990 in this solution provides an effective, disk-based backup target that significantly minimizes storage requirements.

Introduction

Purpose

The purpose of this white paper is to illustrate how Data Domain's industry-leading technology provides a powerful backup, archiving, and disaster recovery solution that can scale with the most demanding data center requirements, such as data warehousing. This white paper also explores the various practices and considerations for backing up DCA UAP Edition data to the Data Domain system and how to effectively exploit this technology.

Scope

The scope of this white paper is to document:

- DCA UAP Edition features and benefits (at a high level)
- Data Domain system features, as used in this solution
- Installing and configuring the Data Domain system for the DCA UAP Edition
- The solution test objectives, scenarios, and results

Audience

This white paper is intended for:

- Field personnel who are tasked with implementing a backup and recovery solution for the DCA UAP Edition or a data warehouse
- Customers, including IT planners, storage architects, and database administrators involved in evaluating, acquiring, managing, operating, or designing a backup and recovery solution for the DCA UAP Edition or a data warehouse
- EMC staff and partners, for guidance and for the development of proposals

Terminology

Table 1 defines key terms used in this document.

Table 1. Terminology

Addressable capacity: The amount of physical space available on a Data Domain system to store deduplicated and compressed backup images.

Apache Hadoop: Allows for the distributed processing of large data sets across clusters of computers using a simple programming model. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than relying on hardware to deliver high availability, the library itself is designed to detect and handle failures at the application layer, delivering a highly available service on top of a cluster of computers, each of which may be prone to failure. For more information, see the Hadoop website.

Append-only (AO) table: Tables that allow you to add records to the database, but do not allow updating or deleting existing records within the tables.

Business intelligence (BI): The effective use of information assets to improve the profitability, productivity, or efficiency of a business. IT professionals use this term to refer to the business applications and tools that enable such information usage. The source of information is frequently the data warehouse.

Cumulative compression factor: The ratio of the logical storage size to the physically stored space.

Data Domain: EMC deduplication storage system.

Data Domain Boost (DD Boost): Extends the backup optimization benefits of Data Domain deduplication storage solutions by distributing parts of the deduplication process to the backup server or application client. DD Boost dramatically increases throughput speeds, minimizes backup LAN load, and improves backup server utilization.

Data warehousing (DW): The process of organizing and managing information assets of an enterprise. IT professionals often refer to the physically stored data content in databases managed by database management software as the data warehouse, and to applications that manipulate the data stored in such databases as DW applications.

Deduplication: Deduplication looks for redundancy of very large sequences of bytes across large comparison windows. Long sequences are compared to the history of other such sequences, and the first uniquely stored version of a sequence is referenced rather than stored again.

Disaster recovery (DR): Processes, policies, and procedures to prepare for recovery or continuation of technology infrastructure that is vital to an organization after a disaster.

GPDB: Greenplum Database.

Pivotal HD: A 100 percent open-source certified and supported version of the Apache Hadoop stack. It includes HDFS, MapReduce, Hive, Pig, HBase, and ZooKeeper. Pivotal HD's packaged Hadoop distribution removes the complexity associated with building out a Hadoop cluster from scratch, which is required with other distributions.

HDFS: Hadoop Distributed File System, the primary storage system used by Hadoop applications. HDFS creates multiple replicas of data blocks and distributes them on compute nodes throughout a cluster to enable reliable, extremely rapid computations.

Local compression: Standard lossless compression algorithms. The algorithms available on a Data Domain system include lz (Lempel-Ziv), gz, and gzfast:
- lz: The default algorithm, which provides the best throughput. EMC recommends using the lz option.
- gz: A zip-style compression that uses the least amount of space for data storage (10 to 20 percent less than lz on average, although some datasets compress more). It also uses the most CPU cycles (up to five times as many as lz). The gz compression type is commonly used for near-line storage applications in which performance requirements are low.
- gzfast: A zip-style compression that uses less space for compressed data but more CPU cycles (twice as many as lz). gzfast is the recommended alternative for sites that want more compression at the cost of lower performance.

Logical storage size: The total size of all backup images in all pools on a Data Domain system. This total size includes all pools mapped to a Data Domain system instance, which can include primary disk pools and clone storage pools.

Massively parallel processing (MPP): A type of distributed computing architecture where tens to hundreds of processors work concurrently to solve large computational problems.

Scale out: A technique that increases the total processing power of a system by adding additional independent computational nodes, as opposed to augmenting a single, large computer with incremental disk, processor, or memory resources.

Shared-nothing architecture: A distributed computing architecture made up of a collection of independent, self-sufficient nodes. This is in contrast to a traditional central computer that hosts all information and processing in a single location.

Storage unit: The Data Domain system exposes pre-made disk volumes, called storage units, to a DD Boost-enabled media server. Multiple media servers, each with the Data Domain Boost plug-in, can use the same storage unit on a Data Domain system as a storage server.

UAP: Unified Analytics Platform, which includes Pivotal GPDB, Pivotal HD, Pivotal Chorus, and the Pivotal DCA.

VLAN overlay: Used to separate network traffic from the DCA UAP Edition internal network.
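To make the cumulative compression factor concrete, consider a hypothetical example (the figures are illustrative only, not taken from the tests in this paper): if a week of backups amounts to 60 TB of logical storage size, but deduplication and local compression leave only 5 TB physically stored, the cumulative compression factor is 60 / 5 = 12x. Actual factors depend on the redundancy in the data, the retention period, and the rate of change, as discussed later in this paper.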

Overview of components

Introduction to the components

This section identifies and briefly describes the components deployed in the solution environment. The components used are:

- Pivotal DCA UAP Edition
- EMC Data Domain deduplication storage system

Pivotal DCA UAP Edition

The DCA UAP Edition is a purpose-built, highly scalable, parallel DW appliance that architecturally integrates database compute, storage, and networking into an enterprise-class, easy-to-implement system. The DCA UAP Edition brings the power of MPP architecture, delivers the fastest data loading capacity, and provides the best price/performance ratio in the industry without the complexity and constraints of proprietary hardware. It is capable of managing, storing, and analyzing large volumes of structured and unstructured data, and it includes Pivotal GPDB, Pivotal HD, and Pivotal Chorus.

The DCA UAP Edition is offered in multiple-rack appliance configurations to achieve the maximum flexibility and scalability for organizations faced with terabyte- to petabyte-scale data opportunities. Many DCA UAP Edition product options are available. This solution uses the following configuration:

- Two DCA UAP Edition Pivotal GPDB Compute Modules
- Four DCA UAP Edition Pivotal HD Modules

The backup and recovery solution for the Pivotal GPDB modules uses the Pivotal gpcrondump backup utility. We deployed the utility using the following two methods:

- Over the Network File System (NFS) protocol
- Using DD Boost

Both methods back up the data to a Data Domain deduplication storage system. The gpcrondump backup utility automates the parallel backup of the distributed Pivotal GPDB across multiple Segment Servers. For NFS, each of the Segment Servers has a directory mapped to a Data Domain system. For DD Boost, data is backed up to a dedicated storage unit on the Data Domain system.

The backup and recovery solution for the Pivotal HD modules uses the DistCp (distributed copy) utility, which we deployed over NFS only. The DistCp tool is used for large inter-/intra-cluster copying. It uses MapReduce to achieve its distribution, error handling and recovery, and reporting. It expands a list of files and directories into input to map tasks, each of which copies a partition of the files specified in the source list; a minimal invocation is sketched below. For more information about DistCp, see the Hadoop website.
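As a sketch of the mechanics only, a DistCp copy from HDFS to an NFS-mounted Data Domain share might look like the following. The source path, the mount point, and the map count here are hypothetical; the exact invocation used in this solution appears in the Pivotal HD backup section.

# -m sets the number of concurrent map tasks (parallel copy streams);
# the file:// target resolves on each Worker Server, so the NFS share
# must be mounted at the same path on every node in the cluster.
[gpadmin@hdm1 ~]$ hadoop distcp -m 24 hdfs:///user/gpadmin/dataset file:///backup/ddr-990/DCA-01/HD

Because DistCp distributes the copy work across the map tasks, throughput scales with the number of Worker Servers participating in the job.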

Figure 1 illustrates the architectural layout of the DCA UAP Edition configuration used.

Figure 1. DCA UAP Edition configuration

Table 2 and Table 3 briefly describe the main components of the DCA UAP Edition.

Table 2. DCA UAP Edition hardware components

Pivotal GPDB system: An associated set of Segment Instances and a Master Instance running on an array, which can be composed of one or more hosts.
Pivotal GPDB Master Servers: The servers responsible for the automatic parallelization of queries.
Pivotal GPDB Segment Servers: The servers that process and analyze the data.
Pivotal HD Master Servers: The servers that perform administration of the Pivotal HD cluster.
Pivotal HD Worker Servers: The servers that process, store, and analyze the data in the Pivotal HD cluster.

Table 3. DCA UAP Edition software components

Pivotal GPDB: An MPP database server, based on PostgreSQL open-source technology. It is explicitly designed to support BI applications and large, multi-terabyte data warehouses.
Pivotal Chorus: Agile analytics productivity platform.

For more information about the DCA UAP Edition, see the EMC Pivotal Data Computing Appliance Getting Started Guide.

EMC Data Domain deduplication storage system

EMC Data Domain deduplication storage systems dramatically reduce the amount of disk storage needed to retain and protect enterprise data. By identifying redundant data as it is being stored, Data Domain systems reduce backup storage requirements by 10x to 30x. Backup data can then be efficiently replicated and retrieved over existing networks for streamlined disaster recovery and consolidated backup operations. This allows Data Domain appliances to integrate seamlessly into database architectures, maintaining existing backup strategies with no changes to scripts, backup processes, or system architecture.

The Data Domain appliance is the industry's fastest, most cost-effective, and scalable single-controller deduplication storage solution for disk-based backup and network-efficient DR. The Data Domain Stream-Informed Segment Layout (SISL) scaling architecture enables the fast inline deduplication throughput of the Data Domain system. A CPU-centric approach to deduplication delivers high throughput while minimizing the number of disk spindles required.

Solution architecture

Overview of the DCA UAP Edition and Data Domain solution

This section illustrates the architectural layout of the DCA UAP Edition and Data Domain solution. It also provides details of the hardware and software resources that we used in the solution.

Physical environment

Figure 2 illustrates the architectural layout of the solution used for both DCA UAP Edition and Data Domain configurations.

Figure 2. Solution architecture layout

Hardware resources

Table 4 lists the hardware we used to validate this solution.

Table 4. Data Domain DD990 system

Equipment | Specification | Quantity
DD990 | 128 GB memory; 4 x internal SATA drives; 4 x 1000/100/10 copper Ethernet ports; 2 x dual-port 10 GbE optical network interface cards | 1
Expansion shelves (32 TB) | ES20 disk shelves with 16 x 2 TB SATA disks | 6

Table 5 lists the specifications of the Pivotal GPDB Master Servers used in this solution.

Table 5. Pivotal GPDB Master Server specifications

Hardware | Specification | Quantity
Processor | Intel X GHz (8 core) | 2
Memory | DDR MHz | 64 GB
Dual-port converged network adapter | 2 x 10 Gb/s | 1
Quad-port network adapter | 4 x 1 Gb/s | 1
RAID controller | Dual-channel 6 Gb/s SAS | 1
Hard disks | 300 GB 10K RPM SAS (one RAID 5 volume of 4+1 with one hot spare) | 6

Note: Master Servers use the same type and number of drives in the Pivotal GPDB Compute and Pivotal GPDB Standard systems.

Table 6 and Table 7 list the specifications of the Pivotal GPDB Segment Servers and Pivotal HD Servers we used in this solution.

Table 6. Pivotal GPDB Segment Server specifications

Hardware | Specification | Quantity
Processor | Intel X GHz (8 core) | 2
Memory | DDR MHz | 64 GB
Dual-port converged network adapter | 2 x 10 Gb/s | 1
Quad-port network adapter | 2 x 1 Gb/s | 1
RAID controller | Dual-channel 6 Gb/s SAS | 1
Hard disks | 300 GB 10K RPM SAS | 24

Table 7. Pivotal HD Server specifications

Hardware | Specification | Quantity
Processor | Intel X GHz (8 core) | 2
Memory | DDR MHz | 64 GB
Dual-port converged network adapter | 2 x 10 Gb/s | 1
Quad-port network adapter | 2 x 1 Gb/s | 1
RAID controller | Dual-channel 6 Gb/s SAS | 1
Hard disks | 3 TB 7.5K RPM SATA | 12

Table 8 details the backup network specification.

Table 8. Backup network

Equipment | Ports | Specification
DCA UAP Edition Interconnect Bus (*) | 4 | 10 GbE

(*) The existing Pivotal DCA UAP Edition Interconnect Bus is designed to accommodate the backup network port requirements.

Software resources

Table 9 lists the software we used to validate this solution.

Table 9. Software resources

Software | Version | Comment
Data Domain OS | | Data Domain operating system
Red Hat Enterprise Linux | 6.2 | DCA UAP Edition servers operating system
Pivotal GPDB | | This version or above required for gpcrondump incremental backups
Pivotal HD | 1.2 | Pivotal HD Modules
Pivotal Chorus | 2.4 | Pivotal Chorus for collaborative analytics
Fabric OS | | CEE

Data Domain system features, integration, and administration

Introduction to Data Domain systems

EMC Data Domain deduplication storage systems provide a next-generation backup and recovery solution for big data that allows users to enjoy the retention and recovery benefits of inline deduplication, as well as the offsite disaster recovery protection of replication over the wide area network (WAN). Data Domain systems reduce the amount of disk storage needed to retain and protect data by 10x to 30x. Data on disk is available online and onsite for longer retention periods, and restores become fast and reliable. Storing only unique data on disk also means that data can be cost-effectively replicated over existing networks to remote sites for DR.

EMC further extends these benefits through DD Boost, which enables advanced integration between Data Domain systems and Pivotal GPDB for faster, more efficient backup and recovery. This section also provides details on Data Domain system integration and administration.

Faster, more efficient backup
- Distributed deduplication process dramatically increases throughput
- Reduced network bandwidth utilization

Network-efficient replication
- Cost-efficient disaster recovery
- Encrypted replication
- Up to 99 percent bandwidth reduction
- Faster time-to-DR readiness

Seamless integration
- Configured using native Pivotal GPDB backup and restore utilities

Ultra-safe storage for fast and reliable recovery
- Data Invulnerability Architecture
- Continuous recovery verification, fault detection, and healing
- End-to-end data integrity

Scalable deduplication storage

EMC Data Domain is the industry's fastest deduplication storage system for enterprise backup and archiving workloads. With a throughput of up to 31 TB/hour, Data Domain systems can protect up to 28.5 petabytes of logical capacity, enabling more backups to complete sooner while putting less pressure on limited backup windows.

Easy integration

Data Domain is qualified with all leading enterprise backup software and archiving applications and easily integrates into existing software infrastructures without changing data center or distributed-office data protection processes. Data Domain systems integrate easily into existing data centers. All Data Domain systems can be configured as storage destinations for leading backup and archiving applications using NFS, Common Internet File System (CIFS), DD Boost, or virtual tape library (VTL) protocols. Consult the compatibility matrices for information about the applications that work with the different configurations. Multiple backup servers can share one Data Domain system.

Integration into an existing Pivotal DCA UAP Edition

The DCA UAP Edition architecture is designed to integrate Data Domain systems in a nondisruptive, seamless manner. When connecting to a single rack, use ports on the Interconnect switches. When connecting to a multiple-rack configuration, use ports on the aggregation switches to ensure an even network load. Once the VLAN overlay and the Interconnect or aggregation switch ports are configured appropriately, connect the Data Domain system directly into the DCA UAP Edition environment and start the configuration steps for NFS or DD Boost. For more information on this integration, see the Connecting Data Domain to the DCA UAP Edition section.

Operational simplicity

Data Domain systems are simple to install and manage. Connect an appliance to the backup server either as a file server via Ethernet or as a VTL via Fibre Channel. You can use all three interfaces simultaneously. DD Boost is also compatible with many other backup applications. For more information, see the EMC Data Domain Boost Compatibility Matrix at the Data Domain support portal.

Data integrity

The Data Domain Data Invulnerability Architecture provides ultra-safe storage for reliable recovery and continuous protection, and it provides the industry's best defense against data integrity issues. Continuous recovery verification, along with extra levels of data protection, continuously detects and protects businesses against data integrity issues during the initial backup and throughout the data life cycle. Unlike any other enterprise array or file system, each appliance verifies recoverability and then continuously re-verifies it.

The Data Domain operating system (DD OS) includes extra levels of data protection against storage-related faults that threaten data recoverability. Dual disk parity RAID 6 is part of the foundation for continuous fault detection and healing in DD OS. RAID 6 provides protection against two simultaneous disk faults, can rebuild a failed disk even if there are read errors on other sectors, and can detect and correct errors during reads. This added protection ensures the highest levels of data availability.

When determining global uniqueness, DD OS uses strong cryptographic hashing for speed and security. A universal hash protects against random and malicious hash collisions. An append-only write policy prevents overwriting valid data.

After a backup completes, Data Domain's data validation process examines what was written to disk. The data validation process checks that all file segments are logically correct within the file system and that the data on disk is the same as it was when written. In the background, the online verify operation continuously checks that the data on the disks is correct and unchanged since the earlier validation process.

The back-end storage is set up in a double-parity RAID 6 configuration (two parity drives). Additionally, hot spares are configured within the system. Each parity stripe has block checksums to ensure that the data is correct. The checksums are constantly used during the online verification operation and when data is read from the Data Domain system. With double parity, the system can protect against up to two simultaneous disk failures.

To keep data synchronized in the event of a hardware or power failure, the Data Domain system uses non-volatile RAM (NVRAM) to track outstanding I/O operations. An NVRAM card with fully charged batteries (the typical state) can retain data for a minimum of 48 hours. When reading during a restore operation, DD OS uses multiple consistency checks to verify that restored data is correct.

Data compression

Data Domain systems store only unique data. Through global compression, a Data Domain system pools redundant data from each backup image. The storage of unique data is invisible to the backup software, which sees the entire virtual file system. DD OS data compression is independent of data format. Data can be structured, such as databases, or unstructured, such as text files. Data can be from file systems or raw volumes.

Typical compression ratios are 20:1 on average over many weeks, assuming weekly full and daily incremental backups. A backup that includes many duplicate or similar files (files copied several times with minor changes) benefits the most from compression. Depending on backup volume, size, retention period, and rate of change, the amount of compression can vary. The best compression happens with backup volume sizes of at least 10 mebibytes (MiB, a unit of data storage that is exactly 1,048,576 bytes, the base-2 equivalent of MB).

To take full advantage of multiple Data Domain systems, a site that has more than one Data Domain system should consistently back up the same client system or set of data to the same Data Domain system. For example, if a full backup of all sales data goes to Data Domain system A, the incremental backups and future full backups for sales data should also go to Data Domain system A.

A Data Domain system compresses data at two levels:

- Global compression: Compares received data to data already stored on disk. Duplicate data does not need to be stored again, while new data is locally compressed before being written to disk.

- Local compression: A Data Domain system uses a local compression algorithm developed specifically to maximize throughput as data is written to disk. The default algorithm (lz) allows shorter backup windows for backup jobs but uses more space. Local compression options provide a trade-off between performance and space usage. For more information on how to change compression types, see the Data Domain Operating System (DD OS) Administration Guide.

SISL

Data Domain SISL enables high-throughput, inline deduplication. SISL identifies 99 percent of the duplicate segments in RAM, inline, before storing them on disk. It also stores related segments and fingerprints together so that large groups can be read at once. With these patented techniques, Data Domain can use the full capacity of large SATA disks for data protection and minimize the number of disks needed to deliver high throughput. In the long term, SISL allows Data Domain system performance to keep improving as CPU speeds increase.

Multipath and load-balancing configuration

Data Domain systems have at least two 10 GbE ports that support multipath configuration and load balancing. In a multipath configuration on the Data Domain system, each of the two 10 GbE ports is connected to a separate port on the backup server.

DD Boost

DD Boost significantly increases performance by distributing parts of the deduplication process to the backup server. DD Boost also gives backup applications control and visibility into the Data Domain replication process. These capabilities simplify disaster recovery procedures and serve as a solid foundation for additional integration between backup applications and Data Domain systems.

Design considerations

Retention of data, frequency, rate of change, and backup policies influence the decision when determining the amount of storage required in the Data Domain system.

Data Domain Enterprise Manager

All Data Domain systems run DD OS, which includes Data Domain Enterprise Manager, a simple web-based rich Internet application for managing Data Domain systems. DD System Manager provides both a GUI and a command-line interface (CLI) for configuration management and monitoring of all system operations. The web-based GUI, available through Ethernet connections, can manage up to 20 Data Domain systems (depending on the model) at any location. DD System Manager provides a single, consolidated management interface that allows for the configuration and operation of many system features and settings. It also provides real-time graphs and tables that enable users to monitor the status of system hardware components and configured features. Additionally, a command set that performs all system functions is available to users through the CLI. Commands configure system settings and display system hardware status, feature configuration, and operations.

The CLI is available through a serial console when a keyboard and monitor are directly attached to the Data Domain system, or remotely through an Ethernet connection using SSH or Telnet. For more information on Data Domain Enterprise Manager, see the Data Domain Operating System (DD OS) Administration Guide.

Data Domain file system

Data Domain systems are designed to be a highly reliable storage of last resort, providing longer-term onsite retention of backups. As new backups are added to the system, old backups age out. Such removals are normally done under the control of the backup software (on the backup server) based on the configured retention period. This process is similar to configuring tape retention policies, in which older backups are retired and the tapes are reused for new backups.

When the backup software removes an old backup from a Data Domain system, the space on the Data Domain system becomes available only after the Data Domain system cleans the retired disk space. A good way to manage space on a Data Domain system is to retain as many online backups as possible, with some empty space (about 20 percent of the total space available) to comfortably accommodate backups until the next scheduled cleaning run. Space utilization on a Data Domain system is primarily affected by:

- The backup policy and redundancy in the data
- The size, redundancy, and rate of change of the backup data
- The retention period specified in the backup software

High levels of compression result when you back up datasets with many duplicates and retain them for long periods of time.

The Data Domain file system supports the following interfaces:

- NFS
- CIFS
- DD Boost
- VTL

For more information on the file system, see the Data Domain Operating System (DD OS) Administration Guide.

Creating and working with shares on the Data Domain system

When creating shares, assign client access to each directory separately and remove access from each directory separately.

Note: If you are implementing replication, a single destination Data Domain system can receive backups from both CIFS clients and NFS clients as long as you use separate directories for each client type. Do not mix CIFS and NFS data in the same directory.
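To maintain the roughly 20 percent of free space recommended above, it helps to check utilization and the cleaning schedule from the CLI. The following commands are a sketch based on the standard DD OS command set, not a transcript from this solution; verify the exact syntax for your DD OS release in the DD OS Administration Guide.

# Show logical and physical space usage, including the compression achieved
sysadmin@dd_990_1# filesys show space

# Display when the next scheduled cleaning run will reclaim retired space
sysadmin@dd_990_1# filesys clean show schedule

# Start a cleaning run manually if free space is needed sooner
sysadmin@dd_990_1# filesys clean start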

Pivotal Chorus

Pivotal Chorus is a collaboration tool that data science teams use to share information about the data they work on. It provides an agile analytic infrastructure for data scientists to collaborate on data sets, methods, and workflows, providing valuable insights to each other.

Pivotal Chorus maintains its own database. To protect this valuable data, this solution includes instructions for backing up Pivotal Chorus to Data Domain. For this solution, we installed Pivotal Chorus on the DCA UAP Edition Standby Master Server (smdw). For more information, see the Pivotal Chorus Installation Guide.

Installing and configuring Data Domain with the DCA UAP Edition

Overview

Data Domain storage systems integrate seamlessly into existing DCA UAP Edition deployments. You can export Data Domain storage as NFS shares, easily mount those shares on the DCA UAP Edition Pivotal GPDB and Pivotal HD Servers, and use them as a backup target. With Pivotal GPDB support for DD Boost, the Data Domain system also integrates easily in this scenario.

Depending on the DCA UAP Edition configuration, other Data Domain system models are also supported, for example, the DD670, DD860, and DD990. To determine the correct Data Domain model, your local backup and recovery specialist should complete a sizing exercise. For more information on compatibility, see the Data Domain Support Portal. For this solution, we used the DD990 to back up the DCA UAP Edition via NFS and DD Boost.

Installing Data Domain

Install the Data Domain hardware, including a system console, as described in the Data Domain Installation and Setup Guide, which ships with the Data Domain system. The Installation and Setup Guide provides instructions for installing the Data Domain system, connecting it to an administrative console, and powering it on. For more information after completing the installation and powering on the system, see the Data Domain Operating System (DD OS) Initial Configuration Guide.

Configuring Data Domain

When the installation is complete, the Data Domain Configuration Wizard starts automatically. The Configuration Wizard performs an initial configuration of the Data Domain system, configuring only what is needed for the most basic system setup, including licenses, network, file system, CIFS, and NFS. For more information about using the Configuration Wizard and changing or updating the configuration, see the Data Domain Operating System (DD OS) Initial Configuration Guide.

Connecting Data Domain to the DCA UAP Edition

A high-speed 10 GbE connection is available when connecting the Data Domain system to the DCA UAP Edition. When connecting to a single rack, use ports on the Interconnect switches. When connecting to a multiple-rack configuration, use ports on the aggregation switches to ensure an even network load.

Connectivity is established by creating a VLAN (virtual local area network) overlay. A VLAN overlay separates internal and external traffic by creating a separate virtual network over the same physical network with a different IP address range. To create a VLAN overlay on the DCA UAP Edition, modify the switch configuration to allow traffic over the new VLAN ID and create a VNIC (virtual network interface card) on each DCA UAP Edition server to receive traffic over the different IP address range. The DCA Setup utility provides an automated method for configuring switches to allow VLAN traffic and for creating VNICs on the DCA UAP Edition servers.

Ports 41 through 44 are available on the Interconnect switches for external connectivity. Aggregation switches reserve ports 45 through 48 for external connectivity. When you use the DCA Setup utility to add a VLAN for external connectivity, the VLAN is added to these ports.

For detailed steps on setting up a VLAN overlay, see the Connect through the Interconnect or Aggregation Switch section of the Pivotal Data Computing Appliance Administration Guide.

Note: DCA Setup is still under active development and currently allows only limited Interconnect and aggregation switch configuration options. You may need to manually adjust Interconnect and aggregation switch configurations to suit your environment. The DCA Setup utility currently sets up a single-switch link aggregation (LAG) configuration. For this solution, we connected the Data Domain system to both aggregation switches using a VLAN overlay, so further configuration changes to the Interconnect and aggregation switches are required to set up the LAG across both aggregation switches (one port on each aggregation switch) for network redundancy.

Note: Ensure that the documented process of establishing the VLAN overlay in the Data Computing Appliance Administration Guide has been followed before making the following configuration changes.

The additional configuration steps are as follows:

1. Stop the Pivotal GPDB and, as the admin user, run the following commands on each aggregation switch:

[root@mdw ~]# ssh admin@aggr-sw-1
enable
config t
interface eth47
no channel-group 200
desc not used
shutdown
interface eth48
no channel-group 200
desc not used
shutdown
interface eth46
no channel-group 200
desc DD990 eth10a
channel-group 201 mode active
interface eth45
desc DD990 eth11a
interface port-channel 200
desc mlag to DD990 - ports 45
switchport mode access
switchport access vlan [Overlay VLAN ID]
spanning-tree portfast
mlag 200
interface port-channel 201
desc mlag to DD990 - ports 46
switchport mode access
switchport access vlan [Overlay VLAN ID]
spanning-tree portfast
mlag 201
copy run start

[root@mdw ~]# ssh admin@aggr-sw-2
enable
config t
interface eth46
no channel-group 200
desc DD990 eth10b
channel-group 201 mode active
interface eth45
desc DD990 eth11b
interface port-channel 200
desc mlag to DD990 - ports 45
switchport mode access
switchport access vlan [Overlay VLAN ID]
spanning-tree portfast
mlag 200
interface port-channel 201
desc mlag to DD990 - ports 46
switchport mode access
switchport access vlan [Overlay VLAN ID]
spanning-tree portfast
mlag 201
copy run start

2. Configure the port channel and VLAN ID to match those previously set up using the DCA Setup utility (the default port channel is 200). With the Pivotal GPDB stopped, connect as the admin user and run the following commands on each Interconnect switch, in this case i-sw-1, i-sw-2, i-sw-3, and i-sw-4. For example:

[root@mdw ~]# ssh admin@i-sw-1
enable
vlan [Overlay VLAN ID]
desc overlay vlan
copy run start

The VLAN overlay setup is now complete.

Depending on distance, you can connect Data Domain to the DCA UAP Edition using either Multi-Mode Fibre (MMF) cables with optical Small Form-factor Pluggable (SFP) devices, or twin-ax cables. For this solution, we used four MMF cables through optical SFP devices, as shown in Table 10.

Table 10. DCA and Data Domain connectivity

DCA UAP Edition Interconnect Bus | Data Domain system interface | Purpose
aggr-sw-1, Port 45 | Slot 11, Port 1 (eth11a) | Pivotal GPDB backup and restore
aggr-sw-2, Port 45 | Slot 11, Port 2 (eth11b) | Pivotal GPDB backup and restore
aggr-sw-1, Port 46 | Slot 10, Port 1 (eth10a) | Pivotal HD backup and restore
aggr-sw-2, Port 46 | Slot 10, Port 2 (eth10b) | Pivotal HD backup and restore

Table 11 lists the hardware we used for connectivity.

Table 11. Hardware used for DCA UAP Edition and Data Domain connectivity

Hardware | Part number | Quantity
Data Domain: 10 GbE card | C-10G-L2PO | 2
10 GbE SFP | SFP-10G-SRL | 4
OM3 multimode optical cable | CTX-OM3-10M | 4

A connectivity kit that consists of one C-10G-L2PO, two SFP-10G-SRL, and two CTX-OM3-10M is available under part number DCA2-10GBEXT.

Pivotal GPDB, NFS: Configuring the Data Domain system and the DCA UAP Edition

For NFS backups, the Data Domain system must be configured to enable the DCA UAP Edition to access the NFS directories that are used as a target for the gpcrondump utility to store the backups.

Figure 3 shows how we configured Data Domain using two 10 GbE interfaces, eth11a and eth11b, to create a virtual interface (veth1) with a VLAN overlay IP address:

sysadmin@dd_990_1# net show settings
port    enabled  DHCP  IP address  netmask/prefix length  type  additional setting
eth0a   yes      no
eth0b   yes      no
eth0c   no       n/a   n/a         n/a                    n/a
eth0d   no       n/a   n/a         n/a                    n/a
eth10a  yes      n/a   n/a         n/a                    n/a   bonded to veth2
eth10b  yes      n/a   n/a         n/a                    n/a   bonded to veth2
eth11a  yes      n/a   n/a         n/a                    n/a   bonded to veth1
eth11b  yes      n/a   n/a         n/a                    n/a   bonded to veth1
veth1   yes      no                                             lacp hash xor-l3l4: eth11a,eth11b
veth2   yes      no                                             lacp hash xor-l3l4: eth10a,eth10b

Figure 3. Pivotal GPDB, NFS: Data Domain bonded interfaces
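The bonded virtual interface shown in Figure 3 can also be created from the DD OS CLI. The following is a sketch based on the standard net aggregate command set, not the procedure used in this solution (we used the GUI, as described next); confirm the exact syntax in the DD OS Administration Guide for your release.

# Bond the two 10 GbE ports into an LACP virtual interface with an L3/L4 hash
sysadmin@dd_990_1# net aggregate add veth1 interfaces eth11a eth11b mode lacp hash xor-L3L4

# Verify the bonding took effect
sysadmin@dd_990_1# net show settings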

We set up the virtual interface using the Enterprise Manager GUI, as shown in Figure 4.

Figure 4. Setting up a virtual interface on Data Domain

We used the following options:

- Bonding Information: Type: Aggregate; Mode: LACP; Hash: XOR-L3L4
- Speed/Duplex: Speed: 10Gb; Duplex: Full
- MTU: 1500

For more information, see the Data Domain Operating System (DD OS) Administration Guide.

It is very important to configure the host files on both the DCA UAP Edition and the Data Domain system. The following example shows the Data Domain host entries on the DCA UAP Edition:

[gpadmin@mdw ~]$ grep "dd_990_1" /etc/hosts
### Data Domain - Backup
dd_990_1   # Data Domain virtual interface veth1

After the host file is configured on mdw, you can copy /etc/hosts to smdw and the Segment Servers using gpscp.

For example, on the DCA UAP Edition:

[root@mdw ~]# gpscp -f <hostfile> /etc/hosts =:/etc/hosts

Where <hostfile> contains smdw and the Segment Servers.

To add the hosts to the Data Domain system, type:

net hosts add <ipaddr> <host-list>

The following example shows how to add the 10 GbE bonded interface for mdw on the Data Domain system:

sysadmin@dd_990_1# net hosts add mdw
Added "mdw" -> " " mapping to hosts list.

Repeat this step using the 10 GbE VLAN overlay IP address assigned to each Master and Segment Server. For this solution, we assigned the IP addresses in Table 12 to the VNIC on each DCA UAP Edition server using the DCA Setup utility. For complete instructions on setting up a VLAN overlay, see the Connect through the Interconnect or Aggregation Switch section in the Pivotal Data Computing Appliance Administration Guide.

Table 12. VLAN overlay IP addresses

DCA UAP Edition server | Host name | VLAN overlay IP address
Master Server 01 | mdw |
Master Server 02 | smdw |
Segment Server 01 | sdw1 |
Segment Server 02 | sdw2 |
Segment Server 03 | sdw3 |
Segment Server 04 | sdw4 |
Segment Server 05 | sdw5 |
Segment Server 06 | sdw6 |
Segment Server 07 | sdw7 |
Segment Server 08 | sdw8 |

To configure NFS, perform the following steps:

1. On the Data Domain system, set the access level to the VLAN overlay network for the NFS share /backup. The VLAN overlay network used in this case is a /24:

sysadmin@dd_990_1# nfs add /backup [VLAN overlay network]/24

2. Ensure that the VLAN overlay network was added correctly on the Data Domain system, as shown in the following example:

nfs show clients
path     client                      options
/backup  [VLAN overlay network]/24   (rw,no_root_squash,no_all_squash,secure)

3. Enable NFS on the Data Domain system using the following command:

nfs enable
NFS server is enabled

4. Temporarily mount the Data Domain NFS share /backup on the DCA UAP Edition Master Server as follows:

a. Create a temporary mount point for /backup by issuing the following command:

[root@mdw ~]# mkdir /backup_tmp

b. Mount the /backup directory on the Master Server by issuing the following command (all on one line):

[root@mdw ~]# mount -t nfs -o hard,intr,nfsvers=3,tcp,rsize=<value>,wsize=<value> dd_990_1:/backup /backup_tmp/

5. Create the DCA directory, and the subdirectory GPDB, which will become the backup target on Data Domain. In the following example, we created the directory DCA-01/GPDB using the mkdir command on the DCA UAP Edition.

Note: In this case, log in as the user gpadmin to avoid any issues with permissions while running the gpcrondump utility.

[gpadmin@mdw ~]$ mkdir /backup_tmp/dca-01; mkdir /backup_tmp/dca-01/gpdb

6. Unmount the /backup_tmp directory using the following command on the DCA UAP Edition:

[root@mdw ~]# umount /backup_tmp

We created one NFS share on Data Domain and a corresponding mount point on each server. Table 13 provides an overview of the directory mappings that we used in this solution.

Table 13. NFS shares and mount points

DCA UAP Edition server | Host name | NFS share on Data Domain | Mount point on DCA
Master Server 01 | mdw | /backup/dca-01/gpdb | /backup/ddr-990
Master Server 02 | smdw | /backup/dca-01/gpdb | /backup/ddr-990
Segment Server 01 | sdw1 | /backup/dca-01/gpdb | /backup/ddr-990
Segment Server 02 | sdw2 | /backup/dca-01/gpdb | /backup/ddr-990
Segment Server 03 | sdw3 | /backup/dca-01/gpdb | /backup/ddr-990
Segment Server 04 | sdw4 | /backup/dca-01/gpdb | /backup/ddr-990
Segment Server 05 | sdw5 | /backup/dca-01/gpdb | /backup/ddr-990
Segment Server 06 | sdw6 | /backup/dca-01/gpdb | /backup/ddr-990
Segment Server 07 | sdw7 | /backup/dca-01/gpdb | /backup/ddr-990
Segment Server 08 | sdw8 | /backup/dca-01/gpdb | /backup/ddr-990

Note: You can create the mount point on each DCA server using the gpssh utility. You must create the /backup directory as the root user, but /backup/ddr-990 must be owned by the gpadmin user. The following example describes one way of creating the DCA UAP Edition mount points.

7. Issue the following commands on the DCA UAP Edition:

[root@mdw ~]# cd ~gpadmin/gpconfigs
[root@mdw gpconfigs]# gpssh -f hostfile_gpdb
=> mkdir /backup
=> mkdir /backup/ddr-990
=> chown gpadmin:gpadmin /backup/ddr-990
=> exit

8. Mount the Data Domain NFS share on all Master and Segment Servers according to Table 13. The following command on the DCA UAP Edition provides an example:

[root@mdw ~]# gpssh -h sdw4 -v -e mount -t nfs -o rw,hard,intr,tcp,rsize=<value>,wsize=<value> dd_990_1:/backup/dca-01/gpdb /backup/ddr-990

9. Check that the NFS share is mounted correctly on each Master and Segment Server using the following command on the DCA UAP Edition:

[root@mdw gpconfigs]# for NODE in `cat hostfile_gpdb`; do gpssh -h $NODE -v -e 'df -h | grep /backup/'; done;
[mdw] dd_990_1:/backup/dca-01/gpdb
[mdw] 130T 68M 130T 1% /backup/ddr-990
[smdw] dd_990_1:/backup/dca-01/gpdb
[smdw] 130T 68M 130T 1% /backup/ddr-990
[sdw1] dd_990_1:/backup/dca-01/gpdb
[sdw1] 130T 68M 130T 1% /backup/ddr-990
[sdw2] dd_990_1:/backup/dca-01/gpdb
[sdw2] 130T 68M 130T 1% /backup/ddr-990
[sdw3] dd_990_1:/backup/dca-01/gpdb
[sdw3] 130T 68M 130T 1% /backup/ddr-990
[sdw4] dd_990_1:/backup/dca-01/gpdb
[sdw4] 130T 68M 130T 1% /backup/ddr-990
[sdw5] dd_990_1:/backup/dca-01/gpdb
[sdw5] 130T 68M 130T 1% /backup/ddr-990
[sdw6] dd_990_1:/backup/dca-01/gpdb
[sdw6] 130T 68M 130T 1% /backup/ddr-990
[sdw7] dd_990_1:/backup/dca-01/gpdb
[sdw7] 130T 68M 130T 1% /backup/ddr-990
[sdw8] dd_990_1:/backup/dca-01/gpdb
[sdw8] 130T 68M 130T 1% /backup/ddr-990

Figure 5. Checking that the NFS share is mounted

If all the NFS shares are mounted successfully, the Data Domain system is ready to receive backups run by the gpcrondump utility. The gpcrondump utility creates a directory named db_dumps when the backup is started for the first time.

Pivotal GPDB, NFS: Backing up to Data Domain from the DCA UAP Edition

To automate routine NFS backups, Pivotal provides the gpcrondump utility, which can be called directly from the command line or from a crontab entry. The gpcrondump utility is a wrapper for the gp_dump command. You can also use gpcrondump to back up objects other than databases and data, such as database roles and server configuration files.

Pivotal GPDB supports incremental backups of append-only tables, including column-oriented tables. The difference between an incremental backup and a full backup is that append-only tables are backed up only if one of the following operations was performed on the table since the last backup: ALTER TABLE, INSERT, TRUNCATE, or DROP and re-creation of the table. For a partitioned append-only table, only the changed table partitions are backed up. (A brief example of creating such a table appears below.)

By default, gpcrondump creates data dump files in the Master and each Segment instance's data directory in <data_directory>/db_dumps/YYYYMMDD. By default, the segment data dump files are compressed using gzip.

For NFS backups, the gpcrondump utility with the -u parameter is required. Without the -u parameter, the backups are stored locally in the DCA UAP Edition array.
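For context, the following psql sketch shows what an append-only, column-oriented table looks like in Pivotal GPDB. The table name and columns are hypothetical and are not part of the tested tpch schema.

tpch=# -- Rows can be inserted but not updated or deleted, which is what
tpch=# -- makes incremental backups with gpcrondump --incremental possible.
tpch=# CREATE TABLE sales_history (
tpch(#     sale_id   bigint,
tpch(#     sale_date date,
tpch(#     amount    numeric(12,2)
tpch(# )
tpch-# WITH (appendonly=true, orientation=column)
tpch-# DISTRIBUTED BY (sale_id);
CREATE TABLE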

For the NFS solution, the intention was to send the backups to the Data Domain system and use the Data Domain NFS share created in /backup/dca-01/gpdb, as follows on the DCA UAP Edition:

[gpadmin@mdw db_dumps]$ pwd
/backup/dca-01/gpdb/db_dumps
[gpadmin@mdw db_dumps]$ ls -la
total 12
drwxrwxr-x 7 gpadmin gpadmin  382 Mar 20 15:01 .
drwxrwxr-x 4 gpadmin gpadmin  158 Mar 16 17:10 ..
drwx------ 2 gpadmin gpadmin 1342 Mar 16 19:   20130316
drwx------ 2 gpadmin gpadmin 2608 Mar 17 19:   20130317
drwx------ 2 gpadmin gpadmin 1563 Mar 19 23:   20130319
drwx------ 2 gpadmin gpadmin 3328 Mar 20 15:   20130320

We generated all the NFS full backups by running the following command as the gpadmin user:

gpcrondump -x tpch -g -G -u /backup/dca-01 -b -a -z

We generated all the NFS incremental backups by running the following command as the gpadmin user:

gpcrondump -x tpch -g -G -u /backup/dca-01 -b -a -z --incremental

Log files for the Pivotal GPDB management utilities are written to ~/gpadminlogs by default. The naming convention for management log files is <script_name>_<date>.log, for example: gpcrondump_<yyyymmdd>.log

Table 14 lists the gpcrondump options used in NFS.

Table 14. NFS gpcrondump options

Parameter | Description
-x | Database name
-g | Copy config files
-G | Dump global objects
-u | Backup target directory
-a | Do not prompt
-b | Bypass disk space check
-z | Do not use compression
--incremental | Incremental backup of append-only tables, including column-oriented tables

For more options, run gpcrondump --help or see the Pivotal Greenplum Database Administrator Guide.
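To schedule these backups, a crontab entry on the Master Server can call gpcrondump nightly. The entry below is a hypothetical sketch (the schedule, the greenplum_path.sh location, and the log redirection are assumptions), not the configuration used in the tested solution:

# gpadmin crontab entry: nightly uncompressed full backup of tpch at 01:00
0 1 * * * . /usr/local/greenplum-db/greenplum_path.sh && gpcrondump -x tpch -g -G -u /backup/dca-01 -b -a -z >> $HOME/gpadminlogs/gpcrondump_cron.log 2>&1

The same entry with --incremental appended would schedule the nightly incremental run; the -a option matters here because cron jobs cannot answer interactive prompts.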

Note: The gpcrondump compressed option --rsyncable is also available, but it may not be suitable for environments with a high rate of random data change. To gain the maximum benefit from EMC Data Domain deduplication technology, we recommend using the gpcrondump uncompressed option -z.

You can automate parallel backups with gpcrondump by adding an entry to the crontab file on the Master Server, as sketched above. For more information, see the Pivotal Greenplum Database Administrator Guide.

Pivotal GPDB, NFS: Enabling Direct I/O

In some cases, Pivotal GPDB backups over NFS can fail when the backup target becomes overwhelmed by the rate at which the DCA UAP Edition sends data over the network. Pivotal GPDB (specific versions and above) enables more control over the data size and writing mechanism for the NFS backup target using Direct I/O.

In standard operating conditions, file systems use memory for file cache and databases use memory for database cache. With Direct I/O enabled, the file system cache is bypassed and data is transferred directly from disk to the application buffer. This effectively avoids the double-caching effect and reduces CPU utilization.

To enable Direct I/O, type:

$ gpconfig -c gp_backup_directio -v on

To activate the configuration setting, type:

$ gpstop -u

Note: Despite the name, gpstop -u does not stop the database. It merely causes the database to re-read and update its configuration settings.

To verify that Direct I/O is enabled, type:

$ gpconfig -s gp_backup_directio

To set the chunk size to 10 MB, type:

$ gpconfig -c gp_backup_directio_read_chunk_mb -v 10

To activate the configuration setting, type:

$ gpstop -u

To verify the data chunk size, type:

$ gpconfig -s gp_backup_directio_read_chunk_mb

The default chunk size is 20 MB, which is optimal for most environments.

Pivotal GPDB, NFS: Restoring data to the DCA UAP Edition from Data Domain

For restoring a database, Pivotal provides the gpdbrestore utility, which is a wrapper for the gp_restore command. Restoring data using gpdbrestore requires that:

    Backup files were created by gpcrondump.
    The Pivotal GPDB system is up and running.
    The Pivotal GPDB system has the same number of primary segment instances as the system that was backed up.
    The database being restored to has already been created in the system.

The NFS shares on the Data Domain system must be available and mounted on both Master Servers and on each Segment Server for the restore to proceed.

Note: If an incremental backup is specified when restoring data, gpdbrestore ensures that the correct incremental backups and full backup are used to restore the data.

For example, to restore the tpch database backup generated by gpcrondump on 23/04/2013, perform the following steps on the DCA UAP Edition.

1. Ensure that the Data Domain system can be reached from the Master Server:

    [gpadmin@mdw ~]$ ping dd_990_1
    PING dd_990_1 56(84) bytes of data.
    64 bytes from dd_990_1: icmp_seq=1 ttl=64 time=0.105 ms
    64 bytes from dd_990_1: icmp_seq=2 ttl=64 time=0.103 ms

2. Ensure that the database being restored to has already been created in the system:

    [gpadmin@mdw ~]$ createdb tpch

This command fails if the database already exists.

3. A full database restore can only be performed into an empty database, so ensure that the database contains no objects:

    [gpadmin@mdw ~]$ psql -d tpch
    psql (8.2.14)
    Type "help" for help.
    tpch=# \d
    No relations found.

4. Run the gpdbrestore utility, supplying the timestamp key of the backup set:

    [gpadmin@mdw ~]$ gpdbrestore -u /backup/ddr-990 -G -t <timestamp_key>
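The 14-digit timestamp key that -t expects can be read from the dated dump directory on the share. A minimal sketch follows; the detail that each dump file name carries the key is our assumption, since the source does not show the dump file names:

    [gpadmin@mdw ~]$ ls /backup/ddr-990/db_dumps/20130423/
    # each dump file name ends with the 14-digit key (YYYYMMDDhhmmss) to pass to -t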

The log files for the Pivotal GPDB management utilities are written to ~/gpadminlogs by default. The naming convention for the log files is <script_name>_<date>.log, for example gpdbrestore_<date>.log.

Table 15 lists the gpdbrestore options we used.

Table 15. NFS gpdbrestore options used
    Parameter  Description
    -u         Path to the backup files
    -G         Restore global objects
    -t         The 14-digit timestamp key of the backup set to be restored

For more options, run gpdbrestore --help or see the Pivotal Greenplum Database Administrator Guide.

Pivotal GPDB, DD Boost: Configuring the Data Domain system and the DCA UAP Edition

For DD Boost backups, the Data Domain system must be configured to enable the DCA UAP Edition to access the Data Domain storage unit that is used as the target for the gpcrondump utility to store the backups.

Figure 6 shows how we configured the Data Domain system, using two 10 GbE interfaces (eth11a and eth11b) to create a virtual interface (veth1) with a VLAN overlay IP address.

    sysadmin@dd_990_1# net show settings
    port    enabled  DHCP  IP address  netmask/prefix length  type  additional setting
    eth0a   yes      no
    eth0b   yes      no
    eth0c   no       n/a   n/a         n/a                    n/a
    eth0d   no       n/a   n/a         n/a                    n/a
    eth10a  yes      n/a   n/a         n/a                    n/a   bonded to veth2
    eth10b  yes      n/a   n/a         n/a                    n/a   bonded to veth2
    eth11a  yes      n/a   n/a         n/a                    n/a   bonded to veth1
    eth11b  yes      n/a   n/a         n/a                    n/a   bonded to veth1
    veth1   yes      no                                             lacp hash xor-l3l4: eth11a,eth11b
    veth2   yes      no                                             lacp hash xor-l3l4: eth10a,eth10b

Figure 6. Pivotal GPDB, DD Boost: Data Domain bonded interfaces

We set up the virtual interface using the Enterprise Manager GUI, as shown in Figure 7.

Figure 7. Setting up the virtual interface on Data Domain

We used the following options:

Bonding Information
    Type: Aggregate
    Mode: LACP
    Hash: XOR-L3L4

Speed/Duplex
    Speed: 10Gb
    Duplex: Full
    MTU: 1500

For more information, see the Data Domain Operating System (DD OS) Administration Guide.

It is very important to configure the host files on both the DCA UAP Edition and the Data Domain system. The following example shows how the Data Domain host entries look on the DCA UAP Edition:

    [gpadmin@mdw ~]$ grep "dd_990_1" /etc/hosts
    ### Data Domain - Backup
    <ipaddr>   dd_990_1   # Data Domain virtual interface veth1

Once the host file is configured on mdw, you can copy /etc/hosts to smdw and the Segment Servers using gpscp.

For example, on the DCA UAP Edition:

    [gpadmin@mdw ~]$ gpscp -f <hostfile> /etc/hosts =:/etc/hosts

Where <hostfile> contains smdw and the Segment Servers.

To add the hosts to the Data Domain system, type:

    net hosts add <ipaddr> <host-list>

The following example adds the 10 GbE bonded interface for mdw on the Data Domain:

    sysadmin@dd_990_1# net hosts add <ipaddr> mdw
    Added "mdw" -> "<ipaddr>" mapping to hosts list.

Repeat this step using the 10 GbE VLAN overlay IP address assigned to each Master and Segment Server. For this solution, we assigned the IP addresses in Table 16 to the VNIC on each DCA UAP Edition server using the DCA Setup utility. For complete instructions on setting up a VLAN overlay, see the "Connect through the Interconnect or Aggregation Switch" section in the Pivotal Data Computing Appliance Administration Guide.

Table 16. VLAN overlay IP addresses
    DCA UAP Edition server  Host name
    Master Server 01        mdw
    Master Server 02        smdw
    Segment Server 01       sdw1
    Segment Server 02       sdw2
    Segment Server 03       sdw3
    Segment Server 04       sdw4
    Segment Server 05       sdw5
    Segment Server 06       sdw6
    Segment Server 07       sdw7
    Segment Server 08       sdw8
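Before moving on, it is worth confirming that every server resolves the Data Domain name and reaches it over the overlay network. A minimal sketch using the same gpssh pattern seen elsewhere in this paper; the host file name is the one used earlier:

    # Ping the Data Domain virtual interface from every Master and Segment Server
    [gpadmin@mdw gpconfigs]$ gpssh -f hostfile_gpdb -v -e "ping -c 2 dd_990_1"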

To configure DD Boost, perform the following steps:

1. Check that you have a DD Boost license by issuing the following command on the Data Domain:

    sysadmin@dd_990_1# license show
    Feature licenses:
    ##  License Key           Feature
    XXXX-YYYYY-XXXX-YYYY  DDBOOST

If necessary, use the following command to add the license on the Data Domain:

    sysadmin@dd_990_1# license add <license-code>

Where <license-code> is the DD Boost license key you received from Data Domain.

2. Enable NFS using the following command on the Data Domain:

Note: The access list for the DD Boost clients is an NFS export, so NFS must be enabled.

    sysadmin@dd_990_1# nfs enable
    NFS server is enabled

3. Enable DD Boost access for the DCA UAP Edition Master and Segment Servers using the following command on the Data Domain:

    sysadmin@dd_990_1# ddboost access add clients mdw
    mdw : Added

This example enables DD Boost access for mdw; repeat this step for all Master and Segment Servers.

4. Ensure that all Master and Segment Servers were added correctly by issuing the following command on the Data Domain:

    sysadmin@dd_990_1# ddboost access show
    DD Boost access allowed from the following clients:
    mdw
    sdw1
    sdw2
    sdw3
    sdw4
    sdw5
    sdw6
    sdw7
    sdw8
    smdw

5. Create a DD Boost user. By default, the DD Boost user is sysadmin. If an individual user is required for DD Boost, refer to the Data Domain Operating System (DD OS) Administration Guide. This solution uses sysadmin as the user.

6. Set up the DD Boost login credentials by using the following syntax. This process creates the encrypted login credentials on the Master and Segment Servers to allow access to the Data Domain system.

    gpcrondump --ddboost-host <ddboost_hostname> --ddboost-user <ddboost_user> --ddboost-backupdir <backup_directory>

For example, on the DCA UAP Edition:

    [gpadmin@mdw ~]$ gpcrondump --ddboost-host dd_990_1 --ddboost-user sysadmin --ddboost-backupdir GPDB/DCA-01
    Password:

Enter the password for user sysadmin. Figure 8 shows an example of the output from one server.

    gpcrondump:mdw:gpadmin-[INFO]:-Starting gpcrondump with args: --ddboost-host dd_990_1 --ddboost-user sysadmin --ddboost-backupdir GPDB/DCA-01
    Password:
    [sdw4] ddboost-[DEBUG]:-Libraries were loaded successfully
    [sdw4] ddboost-[INFO]:-Creating LB on /home/gpadmin/DDBOOST_CONFIG
    gpcrondump:mdw:gpadmin-[INFO]:-Config delivered successfully, host: sdw4

Figure 8. DD Boost login credentials setup

7. Determine whether distributed segment processing is enabled on the Data Domain system by issuing the following command:

    sysadmin@dd_990_1# ddboost option show distributed-segment-processing
    DD Boost option "distributed-segment-processing" is enabled

If disabled, enable it by issuing the following command:

    sysadmin@dd_990_1# ddboost option set distributed-segment-processing enabled
    DD Boost option "distributed-segment-processing" set to enabled.

8. Ensure that the interface groups on the Data Domain system are disabled, or at least not used for the DCA UAP Edition interfaces, by issuing the following commands:

    sysadmin@dd_990_1# ddboost ifgroup show config interfaces
    No interfaces in ifgroup
    sysadmin@dd_990_1# ddboost ifgroup status
    Status of ifgroup is "disabled".

9. Enable DD Boost on the Data Domain by issuing the following command:

    sysadmin@dd_990_1# ddboost enable
    DD Boost enabled.

Alternatively, as with all other DD Boost options, you can enable DD Boost using the Enterprise Manager GUI under Data Management > DD Boost.

The Data Domain system is now ready to receive DD Boost backups run by the gpcrondump utility.

Pivotal GPDB, DD Boost: Backing up to Data Domain from the DCA UAP Edition

To automate routine DD Boost backups, Pivotal provides the gpcrondump utility, which can be called directly or from a crontab entry. The gpcrondump utility is a wrapper for the gp_dump command. gpcrondump also enables the backup of objects other than databases and data, such as database roles and server configuration files. Unlike NFS backups, DD Boost does not support incremental backups at this time.

By default, gpcrondump creates data dump files in the Master and each Segment instance's data directory in <data_directory>/db_dumps/YYYYMMDD, and the Segment data dump files are compressed using gzip. For DD Boost backups, gpcrondump with the --ddboost parameter is required. Without the --ddboost parameter, the backups are stored locally in the DCA UAP Edition array.

For the DD Boost solution, deduplication is performed on the Segment Servers before the data is sent to the storage unit on the Data Domain system. This reduces network traffic and speeds up the backup.

Note: A storage unit called GPDB is automatically created on the Data Domain system during the initial DD Boost backup.

We generated all the DD Boost backups by running the following command as the gpadmin user. For gpcrondump uncompressed backups with DD Boost, issue the following command:

    gpcrondump -x tpch --ddboost -z -a

Log files for the Pivotal GPDB management utilities are written to ~/gpadminlogs by default. The naming convention for management log files is <script_name>_<date>.log, for example: gpcrondump_<date>.log.

Table 17 lists the gpcrondump options used in DD Boost.

Table 17. DD Boost gpcrondump options used
    Parameter  Description
    -x         Database name
    --ddboost  Use Data Domain DD Boost for this backup
    -z         Do not use compression
    -a         Do not prompt

For more options, run gpcrondump --help or see the Pivotal Greenplum Database Administrator Guide.
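After the first backup completes, you can confirm on the Data Domain system that the GPDB storage unit was created. A minimal check; the ddboost storage-unit show verb is our assumption about the DD OS CLI of this vintage, as the source does not show this step:

    # List DD Boost storage units; GPDB should appear after the initial backup
    sysadmin@dd_990_1# ddboost storage-unit show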

Note: The gpcrondump compressed option --rsyncable is also available, but may not be suitable for environments with a high rate of random data changes. To gain the maximum benefit from Data Domain deduplication technology, we recommend the gpcrondump uncompressed option -z.

It is also possible to automate parallel backups with gpcrondump by adding an entry to the crontab file on the Master Server, as in the NFS example shown earlier. For more information, see the Pivotal Greenplum Database Administrator Guide.

Pivotal GPDB, DD Boost: Restoring data to the DCA UAP Edition from Data Domain

For restoring a database using DD Boost, Pivotal provides the gpdbrestore utility, which is a wrapper for the gp_restore command. Restoring data with DD Boost using gpdbrestore requires that:

    Backup files were created by a gpcrondump --ddboost operation.
    The Pivotal database system is up and running.
    The Pivotal database system has the same number of primary segment instances as the system that was backed up.
    The database to which the data is being restored has already been created in the system.

For example, to restore the tpch database backup generated by gpcrondump on 14/03/2013 to the DCA UAP Edition, perform the following steps:

1. Ensure that the Data Domain system can be reached from the Master Server by issuing the ping command on the DCA UAP Edition, as shown in Figure 9:

    [gpadmin@mdw ~]$ ping dd_990_1_1
    PING dd_990_1_1 56(84) bytes of data.
    64 bytes from dd_990_1_1: icmp_seq=1 ttl=64 time=0.105 ms
    64 bytes from dd_990_1_1: icmp_seq=2 ttl=64 time=0.103 ms
    [gpadmin@mdw ~]$ ping dd_990_1_2
    PING dd_990_1_2 56(84) bytes of data.
    64 bytes from dd_990_1_2: icmp_seq=1 ttl=64 time=0.755 ms
    64 bytes from dd_990_1_2: icmp_seq=2 ttl=64 time=0.109 ms

Figure 9. Issuing the ping command

2. Ensure that the database to which the data is being restored has already been created in the system:

    [gpadmin@mdw ~]$ createdb tpch

Note: This command fails if the database already exists.

3. A full database restore can only be performed into an empty database, so ensure that the database contains no objects:

    [gpadmin@mdw ~]$ psql -d tpch
    psql (8.2.14)
    Type "help" for help.
    tpch=# \d
    No relations found.

4. Run the gpdbrestore utility, supplying the timestamp key of the backup set:

    [gpadmin@mdw ~]$ gpdbrestore -t <timestamp_key> --ddboost -a

Log files for the Pivotal GPDB management utilities are written to ~/gpadminlogs by default. The naming convention for management log files is <script_name>_<date>.log, for example: gpdbrestore_<date>.log.

Table 18 lists the gpdbrestore options used in DD Boost.

Table 18. DD Boost gpdbrestore options used
    Parameter  Description
    -t         Timestamp key
    --ddboost  Use DD Boost for this restore
    -a         Do not prompt

For more options, run gpdbrestore --help or refer to the Pivotal Greenplum Database Administrator Guide.

Note: With Data Domain systems, it is also possible to run a hot backup while query and ingest loads are running.

Pivotal HD, NFS: Configuring the Data Domain system and the DCA UAP Edition

For backups of the DCA UAP Edition Pivotal HD modules, the Data Domain system must be configured to enable the Pivotal HD servers to access the NFS directories that are used as a target for the DistCp utility to store the backups.

Figure 10 shows how we configured the Data Domain system, using two 10 GbE interfaces (eth10a and eth10b) to create a virtual interface (veth2) with a VLAN overlay IP address.

    sysadmin@dd_990_1# net show settings
    port    enabled  DHCP  IP address  netmask/prefix length  type  additional setting
    eth0a   yes      no
    eth0b   yes      no
    eth0c   no       n/a   n/a         n/a                    n/a
    eth0d   no       n/a   n/a         n/a                    n/a
    eth10a  yes      n/a   n/a         n/a                    n/a   bonded to veth2
    eth10b  yes      n/a   n/a         n/a                    n/a   bonded to veth2
    eth11a  yes      n/a   n/a         n/a                    n/a   bonded to veth1
    eth11b  yes      n/a   n/a         n/a                    n/a   bonded to veth1
    veth1   yes      no                                             lacp hash xor-l3l4: eth11a,eth11b
    veth2   yes      no                                             lacp hash xor-l3l4: eth10a,eth10b

Figure 10. Pivotal HD, NFS: Data Domain bonded interfaces

We set up the virtual interface using the Enterprise Manager GUI on the Data Domain, as shown in Figure 11.

Figure 11. Setting up the virtual interface on Data Domain

We used the following options:

Bonding Information
    Type: Aggregate
    Mode: LACP
    Hash: XOR-L3L4

Speed/Duplex
    Speed: 10Gb
    Duplex: Full
    MTU: 1500

For more information, see the Data Domain Operating System (DD OS) Administration Guide.

It is important to configure the host files on both the DCA UAP Edition Pivotal HD servers and the Data Domain system. The following example shows the Data Domain host entries on the DCA UAP Edition:

    [root@hdm1 ~]$ grep "dd_990_2" /etc/hosts
    ### Data Domain - Backup
    <ipaddr>   dd_990_2   # Data Domain interface veth2

Once the host file is configured on hdm1, you can copy /etc/hosts to all the other Pivotal HD servers using gpscp. For example:

    [root@hdm1 ~]$ gpscp -f <hostfile> /etc/hosts =:/etc/hosts

Where <hostfile> contains all the other Pivotal HD servers.

To add the hosts to the Data Domain system, type the following command:

    net hosts add <ipaddr> <host-list>

The following example adds the 10 GbE bonded interfaces for hdm1 and hdw1 on the Data Domain:

    sysadmin@dd_990_1# net hosts add <ipaddr> hdm1
    Added "hdm1" -> "<ipaddr>" mapping to hosts list.
    sysadmin@dd_990_1# net hosts add <ipaddr> hdw1
    Added "hdw1" -> "<ipaddr>" mapping to hosts list.

Repeat this command using the 10 GbE VLAN overlay IP address assigned to each Pivotal HD server. For this solution, we assigned the IP addresses in Table 19 to the VNIC on each Pivotal HD server using the DCA Setup utility. For complete instructions on setting up a VLAN overlay, see the "Connect through the Interconnect or Aggregation Switch" section in the Pivotal Data Computing Appliance Administration Guide.

Table 19. VLAN overlay IP addresses
    DCA UAP Edition server  Host name
    Pivotal HD Master 01    hdm1
    Pivotal HD Master 02    hdm2
    Pivotal HD Master 03    hdm3
    Pivotal HD Master 04    hdm4
    Pivotal HD Worker 01    hdw1
    Pivotal HD Worker 02    hdw2
    Pivotal HD Worker 03    hdw3
    Pivotal HD Worker 04    hdw4
    Pivotal HD Worker 05    hdw5
    Pivotal HD Worker 06    hdw6
    Pivotal HD Worker 07    hdw7
    Pivotal HD Worker 08    hdw8
    Pivotal HD Worker 09    hdw9
    Pivotal HD Worker 10    hdw10
    Pivotal HD Worker 11    hdw11
    Pivotal HD Worker 12    hdw12

To configure NFS, perform the following steps:

1. On the Data Domain, set the access level to the VLAN overlay network for the NFS share /backup, where <overlay-network> is the /24 VLAN overlay network address used in this case:

    nfs add /backup <overlay-network>/24
    NFS export for /backup is added

2. Ensure that the VLAN overlay network was added correctly on the Data Domain, as shown in Figure 12:

    nfs show clients
    path     client                options
    /backup  <overlay-network>/24  (rw,no_root_squash,no_all_squash,secure)
    /backup  <gpdb-network>/24     (rw,no_root_squash,no_all_squash,secure)

Figure 12. VLAN overlay network access

Note: The <gpdb-network>/24 export was added previously for the Pivotal GPDB.

3. Enable NFS on the Data Domain system by issuing the following command:

    nfs enable
    The NFS server is enabled.

4. Temporarily mount the Data Domain NFS share /backup on the DCA UAP Edition Pivotal HD Master, as shown in the following example:

a. Create a temporary mount point for the /backup directory by issuing the following command:

    [root@hdm1 ~]# mkdir /backup_tmp

b. Mount the /backup directory on the Master Server (all on one line) by issuing the following command:

    [root@hdm1 ~]# mount -t nfs -o hard,intr,nfsvers=3,tcp,rsize=<rsize>,wsize=<wsize> dd_990_2:/backup /backup_tmp/

5. Create the dca-01 directory and the hd subdirectory, which becomes the backup target on the Data Domain.

Note: In this case, use the gpadmin user to avoid any permissions issues when running the DistCp utility.

In the following example, we created the directory dca-01 and the subdirectory hd using the mkdir command. On the DCA UAP Edition, issue the following command:

    [gpadmin@hdm1 ~]$ mkdir /backup_tmp/dca-01; mkdir /backup_tmp/dca-01/hd

6. After all the directories are created, unmount the /backup_tmp directory by issuing the following command on the DCA UAP Edition:

    [root@hdm1 ~]# umount /backup_tmp

We created a single NFS share on the Data Domain system, and the mount point created on each server was /backup/ddr-990. Table 20 outlines the directory mapping we used in this solution.

Table 20. NFS share and mount points
    DCA UAP Edition server  Host name  NFS share on Data Domain  Mount point on DCA UAP Edition
    Pivotal HD Master 01    hdm1       /backup/dca-01/hd         /backup/ddr-990
    Pivotal HD Master 02    hdm2       /backup/dca-01/hd         /backup/ddr-990
    Pivotal HD Master 03    hdm3       /backup/dca-01/hd         /backup/ddr-990
    Pivotal HD Master 04    hdm4       /backup/dca-01/hd         /backup/ddr-990
    Pivotal HD Worker 01    hdw1       /backup/dca-01/hd         /backup/ddr-990
    Pivotal HD Worker 02    hdw2       /backup/dca-01/hd         /backup/ddr-990
    Pivotal HD Worker 03    hdw3       /backup/dca-01/hd         /backup/ddr-990
    Pivotal HD Worker 04    hdw4       /backup/dca-01/hd         /backup/ddr-990
    Pivotal HD Worker 05    hdw5       /backup/dca-01/hd         /backup/ddr-990
    Pivotal HD Worker 06    hdw6       /backup/dca-01/hd         /backup/ddr-990
    Pivotal HD Worker 07    hdw7       /backup/dca-01/hd         /backup/ddr-990
    Pivotal HD Worker 08    hdw8       /backup/dca-01/hd         /backup/ddr-990
    Pivotal HD Worker 09    hdw9       /backup/dca-01/hd         /backup/ddr-990
    Pivotal HD Worker 10    hdw10      /backup/dca-01/hd         /backup/ddr-990
    Pivotal HD Worker 11    hdw11      /backup/dca-01/hd         /backup/ddr-990
    Pivotal HD Worker 12    hdw12      /backup/dca-01/hd         /backup/ddr-990

Note: You can create the mount point on each DCA UAP Edition server using the gpssh utility. The owner of the /backup directory must be root; the owner of the /backup/ddr-990 directory must be gpadmin.

The following steps describe how to create the mount points on the DCA UAP Edition.

7. On the DCA UAP Edition Pivotal HD Master (hdm1), issue the following commands:

    [root@hdm1 ~]# cd ~gpadmin/gpconfigs
    [root@hdm1 gpconfigs]# gpssh -f hostfile_hadoop
    => mkdir /backup
    => mkdir /backup/ddr-990
    => chown gpadmin:gpadmin /backup/ddr-990
    => exit

8. After the mount points are created, mount the Data Domain NFS share on each Pivotal HD server according to Table 20 (a looped variant is sketched at the end of this subsection). The following command on the DCA UAP Edition provides an example:

    [root@hdm1 gpconfigs]# gpssh -h hdm1 -v -e "mount -t nfs -o rw,hard,intr,tcp,rsize=<rsize>,wsize=<wsize> dd_990_2:/backup/dca-01/hd /backup/ddr-990"

9. Check that the directories are mounted correctly by issuing the following command on the DCA UAP Edition:

    [root@hdm1 gpconfigs]# for NODE in `cat /home/gpadmin/gpconfigs/hostfile_hadoop`; do gpssh -h $NODE -v -e "df -h | grep /backup/"; done;
    [hdm1]  dd_990_2:/backup/dca-01/hd  130T  1.2T  129T  1%  /backup/ddr-990
    [hdm2]  dd_990_2:/backup/dca-01/hd  130T  1.2T  129T  1%  /backup/ddr-990
    [hdm3]  dd_990_2:/backup/dca-01/hd  130T  1.2T  129T  1%  /backup/ddr-990
    [hdm4]  dd_990_2:/backup/dca-01/hd  130T  1.2T  129T  1%  /backup/ddr-990
    [hdw1]  dd_990_2:/backup/dca-01/hd  130T  1.2T  129T  1%  /backup/ddr-990
    [hdw2]  dd_990_2:/backup/dca-01/hd  130T  1.2T  129T  1%  /backup/ddr-990
    [hdw3]  dd_990_2:/backup/dca-01/hd  130T  1.2T  129T  1%  /backup/ddr-990
    [hdw4]  dd_990_2:/backup/dca-01/hd  130T  1.2T  129T  1%  /backup/ddr-990
    [hdw5]  dd_990_2:/backup/dca-01/hd  130T  1.2T  129T  1%  /backup/ddr-990
    [hdw6]  dd_990_2:/backup/dca-01/hd  130T  1.2T  129T  1%  /backup/ddr-990
    [hdw7]  dd_990_2:/backup/dca-01/hd  130T  1.2T  129T  1%  /backup/ddr-990
    [hdw8]  dd_990_2:/backup/dca-01/hd  130T  1.2T  129T  1%  /backup/ddr-990
    [hdw9]  dd_990_2:/backup/dca-01/hd  130T  1.2T  129T  1%  /backup/ddr-990
    [hdw10] dd_990_2:/backup/dca-01/hd  130T  1.2T  129T  1%  /backup/ddr-990
    [hdw11] dd_990_2:/backup/dca-01/hd  130T  1.2T  129T  1%  /backup/ddr-990
    [hdw12] dd_990_2:/backup/dca-01/hd  130T  1.2T  129T  1%  /backup/ddr-990

If all the NFS shares are mounted successfully, the Data Domain system is ready to receive any backups run by the DistCp utility.
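As noted in step 8, rather than issuing the mount command once per server, the same for loop used in step 9 can drive it across the host file. A sketch; the elided rsize and wsize values are left as placeholders:

    # Mount the Data Domain share on every Pivotal HD server in one pass
    [root@hdm1 gpconfigs]# for NODE in `cat /home/gpadmin/gpconfigs/hostfile_hadoop`; do gpssh -h $NODE -v -e "mount -t nfs -o rw,hard,intr,tcp,rsize=<rsize>,wsize=<wsize> dd_990_2:/backup/dca-01/hd /backup/ddr-990"; done;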

Pivotal HD, NFS: Backing up to Data Domain from the DCA UAP Edition

DistCp, which was designed for Hadoop inter- and intra-cluster copying, can be used for Pivotal HD backups over NFS. The DistCp utility can be called directly or from a crontab entry. DistCp performs a distributed copy from the source Pivotal HD cluster to the destination.

For the Pivotal HD backup solution, the intention was to send the backups to the Data Domain system, so the directory we used was the Data Domain NFS share created at /backup/dca-01/hd. We generated all the Pivotal HD NFS backups by running the following command as the gpadmin user on the Pivotal HD Master server (hdm1):

    [gpadmin@hdm1 ~]$ hadoop distcp hdfs://hdm1:9000/user/gpadmin file:///backup/ddr-990/<YYYYMMDDhhmm>

To take advantage of Data Domain's deduplication technology, we used a script to create a new target directory, <YYYYMMDDhhmm>, on the Data Domain NFS share for each individual backup; a sketch of such a wrapper follows the option summary below. This way, Data Domain takes care of any duplication in subsequent backups.

Note: Attempting to send subsequent backups to the same target directory may result in DistCp skipping previously copied files even if they have changed since the previous backup.

Table 21 lists some of the DistCp options.

Table 21. Some DistCp options
    Parameter      Description
    -p             Preserve status
    -i             Ignore failures
    -log <logdir>  Write logs to <logdir>
    -m <num_maps>  Maximum number of simultaneous copies
    -overwrite     Overwrite destination
    -update        Overwrite if the source size is different from the destination size
    -skipcrccheck  Do not use a CRC check to determine whether the source differs from the destination; relevant only if -update is specified

The -m parameter limits the maximum number of simultaneous copies. This ensures that the backup target is not overwhelmed while all of the data is written to Data Domain during the initial backup. If you do not specify -m, DistCp schedules the lesser of (total_bytes / bytes.per.map) and (20 * num_task_trackers) map tasks, where bytes.per.map defaults to 256 MB.
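The dated-target wrapper mentioned above is not reproduced in the source; the following is a minimal bash sketch under those assumptions, reusing the paths and the -m value from the examples in this section:

    #!/bin/bash
    # Create a fresh dated target on the Data Domain NFS share so that
    # deduplication, rather than DistCp's skip logic, handles unchanged data.
    STAMP=$(date +%Y%m%d%H%M)
    TARGET=/backup/ddr-990/${STAMP}
    mkdir -p "${TARGET}"
    # Distributed copy of the HDFS tree into the new dated directory,
    # capped at 40 simultaneous copies.
    hadoop distcp -m 40 hdfs://hdm1:9000/user/gpadmin "file://${TARGET}"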

For example, to limit a backup to 40 simultaneous copies:

    [gpadmin@hdm1 ~]$ hadoop distcp -m 40 hdfs://hdm1:9000/user/gpadmin file:///backup/ddr-990/<YYYYMMDDhhmm>

Log files are written to /var/log/gphd/hadoop by default, for example: hadoop-hadoop-namenode-hdm1.log. For more information, see the Hadoop website.

Pivotal HD, NFS: Restoring data to the DCA UAP Edition using Data Domain

This solution also uses DistCp to restore the data. We reversed the source and destination fields so that the source became the NFS share provided by the Data Domain system and the Pivotal HD cluster became the target. We generated all the Pivotal HD NFS restores by running the following command as the gpadmin user:

    hadoop distcp file:///backup/ddr-990/<source-folder> hdfs://hdm1:9000/<destination-folder>

Backing up Pivotal Chorus using Data Domain

Perform the following steps to back up Pivotal Chorus to Data Domain:

1. Create a backup target for Chorus on the Data Domain system as follows:

a. Create a temporary mount point for /backup by issuing the following command on the DCA UAP Edition:

    [root@smdw ~]# mkdir /backup_tmp

Mount the Data Domain /backup directory on the Standby Master Server (all on one line) by issuing the following command:

    [root@smdw ~]# mount -t nfs -o hard,intr,nfsvers=3,tcp,rsize=32768,wsize=32768 dd_990_1_1:/backup /backup_tmp/

b. Create a backup directory on the Data Domain for Chorus. In the following example, the directory dca01-chorus was created by the chorus user using mkdir on the DCA UAP Edition:

    [chorus@smdw ~]$ mkdir /backup_tmp/dca01-chorus

After the Chorus backup directory is created, unmount the /backup_tmp directory as root by issuing the following command on the DCA UAP Edition:

    [root@smdw ~]# umount /backup_tmp

c. Create a mount point for the Chorus backup share on the Standby Master Server by issuing the following commands on the DCA UAP Edition:

    [root@smdw ~]# mkdir /backup
    [root@smdw ~]# mkdir /backup/dca01-chorus
    [root@smdw ~]# chown chorus:chorus /backup/dca01-chorus

d. Mount the /backup/dca01-chorus share on the Standby Master Server by issuing the following command (all on one line):

    [root@smdw ~]# mount -t nfs -o hard,intr,nfsvers=3,tcp,rsize=32768,wsize=32768 dd_990_1_1:/backup/dca01-chorus /backup/dca01-chorus

2. On the DCA UAP Edition Standby Master Server, change to the chorus user using the following command:

    [root@smdw ~]# su - chorus

3. Change to the Chorus installation directory using the following command:

    [chorus@smdw ~]$ cd /usr/local/greenplum-chorus

4. Source chorus_path.sh using the following command:

    [chorus@smdw greenplum-chorus]$ source chorus_path.sh

5. Back up Pivotal Chorus to Data Domain using the following syntax:

    chorus_control.sh backup [-d backup_dir] [-r rolling_days]

Notes:
    Chorus should be running when you run the backup.
    The Chorus binary backup file greenplum_chorus_backup_YYYYMMDD_HHMMSS.tar is dumped to the specified backup directory.
    You can control the expiry time by setting rolling_days.
    If no backup directory is specified, a default backup directory, /data/greenplum-chorus/bak, is created and used.

The example in Figure 13 backs up Chorus to /backup/dca01-chorus (the Data Domain share) and automatically removes all Chorus backup files more than 30 days old:

    [chorus@smdw greenplum-chorus]$ chorus_control.sh backup -d /backup/dca01-chorus -r 30
    Backing up chorus data...
    /usr/local/greenplum-chorus/releases/e2c3529e/vendor/bundle/jruby/1.9/gems/jdbc-postgres/lib/jdbc/postgres.rb:4 warning: already initialized constant VERSION
    Dumping database contents...
    Compressing assets...
    Created backup archive file: /backup/dca01-chorus/greenplum_chorus_backup_<YYYYMMDD>_<HHMMSS>.tar
    Removing backups more than 30 days old...

Figure 13. Backing up Chorus
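Routine Chorus backups can also be driven from cron on the Standby Master Server. A minimal sketch of a crontab entry for the chorus user follows; the 01:00 schedule is our own choice, and the environment setup mirrors steps 3 and 4 above:

    # Hypothetical nightly Chorus backup to the Data Domain share,
    # keeping a 30-day rolling window of backup files
    0 1 * * * cd /usr/local/greenplum-chorus && . chorus_path.sh && chorus_control.sh backup -d /backup/dca01-chorus -r 30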

Restoring Pivotal Chorus using Data Domain

To restore Pivotal Chorus, complete the following steps:

1. Change to the chorus user using the following command:

    [root@smdw ~]# su - chorus

2. Change to the Chorus installation directory and stop Chorus using the following commands:

    [chorus@smdw ~]$ cd /usr/local/greenplum-chorus
    [chorus@smdw greenplum-chorus]$ source chorus_path.sh
    [chorus@smdw greenplum-chorus]$ chorus_control.sh stop

3. Restore Pivotal Chorus from Data Domain using the following syntax:

    chorus_control.sh restore backup_filename

The example in Figure 14 restores Chorus from /backup/dca01-chorus (the Data Domain share) using the Chorus backup archive taken at time 150141:

    [chorus@smdw greenplum-chorus]$ chorus_control.sh restore /backup/dca01-chorus/greenplum_chorus_backup_<YYYYMMDD>_150141.tar
    Restoring chorus data...
    Setting maximum database connections to 75
    [production] postgres started as pid
    /usr/local/greenplum-chorus/releases/e2c3529e/vendor/bundle/jruby/1.9/gems/jdbc-postgres/lib/jdbc/postgres.rb:4 warning: already initialized constant VERSION
    ** Invoke backup:restore (first_time)
    ** Execute backup:restore
    Continuing will overwrite existing assets and data. It is strongly advised that you have a recent backup available before performing a restore.
    Are you sure you want to continue? (Y/N): y
    Deleting existing assets...
    Restoring backed up assets...
    Restoring database...
    [production] stopping postgres (Stopped)
    Restore of /backup/dca01-chorus/greenplum_chorus_backup_<YYYYMMDD>_150141.tar completed.
    To start Chorus, run the following commands:
    source /usr/local/greenplum-chorus/chorus_path.sh
    chorus_control.sh start
    [chorus@smdw greenplum-chorus]$ chorus_control.sh start
    Setting maximum database connections to 75
    [production] postgres started as pid
    [production] Worker started as pid
    [production] Scheduler started as pid
    [production] Solr started as pid on port 8983
    [production] Writing nginx config...
    [production] nginx started as pid
    [production] updating jetty config...
    [production] starting jetty...

Figure 14. Restoring Chorus

4. Start Chorus with the recovered instance using the following commands:

    [root@smdw ~]# su - chorus
    [chorus@smdw ~]$ cd /usr/local/greenplum-chorus
    [chorus@smdw greenplum-chorus]$ source chorus_path.sh
    [chorus@smdw greenplum-chorus]$ chorus_control.sh start

Pivotal Chorus is now restored, and you can log in to Chorus as usual. For more information, see the Pivotal Chorus Installation Guide.

Pivotal GPDB, NFS test results

Test objectives

The test objectives were to validate the success of data backup and restore activity over NFS by:

    Quantifying NFS backups, full and incremental, to Data Domain using increasing data loads.
    Quantifying the restoration of data over NFS and the time to complete the restores.

Test scenarios

Table 22 outlines the data backup and restore test scenarios in this section.

Table 22. Pivotal GPDB, NFS backup and restore test scenarios
    Test  Description
    1     Perform full backups to Data Domain over NFS with Pivotal backup compression off and Data Domain lz compression on, over a simulated week of full backups. Using the same dataset, perform full and incremental backups to Data Domain over NFS with Pivotal backup compression off and Data Domain lz compression on, over a simulated week of backups, and compare the results with the full-backup results.
    2     Perform a restore of the Pivotal database over NFS from a Pivotal uncompressed backup.

Test 1 results

In Tests 1a and 1b, we performed full and incremental backups to Data Domain with Pivotal backup compression off and Data Domain lz compression on.

Note: We did not use the Direct I/O feature. For more information, see the section "Pivotal GPDB, NFS: Enabling Direct I/O".

Figure 15 illustrates the five percent daily increase in the source database over seven days on the DCA UAP Edition with a gpcrondump uncompressed backup. Although the database consisted of heap and AO tables, the uncompressed database size is quoted. To simulate the real-life data growth of a data warehouse, we applied an incremental data load of five percent between backups.

Figure 15. Pivotal GPDB, NFS: Five percent daily increase over a simulated seven days (chart: uncompressed DB size in TB versus backup cycle in days)

Figure 16 illustrates the backup duration in minutes on the DCA UAP Edition for gpcrondump uncompressed backups with a five percent daily increase. The first backup takes longer to complete because it is the first time the data is written to the Data Domain system. The value of the incremental backups is also clear here, as only the heap tables and the changed AO tables are backed up.

Figure 16. Pivotal GPDB, NFS: gpcrondump uncompressed full and incremental backup duration (chart: minutes per backup-cycle day, full versus incremental)

Figure 17 illustrates the backup rate in TB/hour with an uncompressed gpcrondump backup. As seen previously in the backup duration test, throughput is slower on the first backup. Using gpcrondump uncompressed backups with Data Domain's deduplication technology results in an average ingest speed of 7.31 TB/hour for full backups two through seven. The average ingest speed of 5.38 TB/hour for incremental backups two through seven is somewhat lower because a higher percentage of the data in each incremental backup set is written to Data Domain for the first time.

Figure 17. Pivotal GPDB, NFS: Ingest rates with gpcrondump uncompressed full and incremental backups (chart: TB/hr per backup-cycle day, full versus incremental)

Due to the strength of the Data Domain deduplication technology, there was an average 34.2x storage saving for each nightly full backup. Shortly after the nightly database backups began, the savings were significant. For example, on the second day, a full backup of 1,971 gibibytes (GiB) used just 60 GiB, resulting in a 33.1x storage reduction.

Figure 18 illustrates the substantial savings that can be achieved with full backups on a daily basis. Over time, the savings are even greater.

Figure 18. Pivotal GPDB, NFS: Full backup storage savings after seven days (chart: daily increase in data stored on Data Domain in GiB and nightly dedupe rate, versus the daily full backup, over the seven-day full backup cycle)

As expected, the average storage saving for nightly incremental backups is lower, at 11.2x. In this case, on the second day, an incremental backup used just 53 GiB of storage, resulting in a 7x storage reduction.

Figure 19 illustrates the corresponding savings achieved with incremental backups on a daily basis. Over time, the savings are even greater.

Figure 19. Pivotal GPDB, NFS: Incremental backup storage savings after seven days (chart: daily increase in data stored on Data Domain in GiB and nightly dedupe rate, over the seven-day incremental backup cycle)

Note: During all the backup tests, the DCA UAP Edition was idle.

Figure 20 illustrates the cumulative effect of the storage savings over a seven-day full backup cycle. After seven days of running the backups, 15,352 GiB of data had been backed up; however, only 1,362 GiB of storage was needed on the Data Domain deduplication storage system, resulting in an 11.3x storage saving. This 91.1 percent storage saving represents a significant decrease in backup infrastructure requirements and facility costs. As a result, regular backups of large data warehouses are more sustainable and much less costly than before.

Figure 20. Pivotal GPDB, NFS: Cumulative storage savings over seven days (chart: cumulative data ingested and cumulative data stored on Data Domain in GiB, with the cumulative dedupe rate, over the seven-day full backup cycle)
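As a quick cross-check of these figures (simple arithmetic on the numbers quoted above, not additional measurements): 15,352 GiB ingested against 1,362 GiB stored gives 15,352 / 1,362, or approximately 11.3x, and the space saved is 1 - (1,362 / 15,352), or approximately 0.911, the 91.1 percent quoted above.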

Figure 21 illustrates the cumulative effect of the storage savings over a seven-day incremental backup cycle. In this case, after seven days of running the backups, only 5,745 GiB of data had been backed up; however, as expected, a very similar amount of storage, 1,327 GiB, was still needed on the Data Domain deduplication storage system. The resulting storage saving in this case is lower, at 4.3x. This 76.9 percent storage saving with incremental backups still represents significant savings in backup infrastructure and facility costs. As a result, regular backups of large data warehouses are far more sustainable and much less costly than before.

Figure 21. Pivotal GPDB, NFS: Cumulative storage savings over seven days, incremental backups (chart: cumulative data ingested and cumulative data stored on Data Domain in GiB, with the cumulative dedupe rate, over the seven-day incremental backup cycle)

Note: There is roughly a 5.5 percent difference between the daily database sizes reported by the Pivotal utilities in Figure 16 and the actual data saved to the Data Domain system, as noted in Figure 20. The reason for this difference is that gpcrondump extracts the actual data from the Pivotal database and backs it up to the Data Domain system; the 5.5 percent is additional Pivotal overhead that is not required for a successful backup or restore.

Test 2 results

In Test 2, we performed a restore of the Pivotal database using the Pivotal uncompressed backup set from the previous tests. The DCA UAP Edition restore with uncompressed data produced the following results:

    The restore of a 2.72 TB database from an uncompressed backup was achieved in minutes.
    The average restore throughput of a Pivotal uncompressed backup was 5.71 TB/hour.
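As a consistency check on the restore results (an inference from the two figures above, not a value from the source): at 5.71 TB/hour, a 2.72 TB restore takes about 2.72 / 5.71, or approximately 0.48 hours, which is roughly 29 minutes of wall-clock time.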

Pivotal GPDB, DD Boost test results

Test objectives

The test objectives were to validate the success of DD Boost data backup and restore by:

    Quantifying DD Boost full backups to Data Domain using increasing data loads.
    Quantifying the restoration of data via DD Boost and the time to complete the restores.

Test scenarios

Table 23 outlines the data backup and restore test scenarios in this section.

Table 23. Pivotal GPDB, DD Boost backup and restore test scenarios
    Test  Description
    1     Perform a full backup to Data Domain via DD Boost with Pivotal backup compression off, using Data Domain lz compression, over a simulated week of full backups.
    2     Restore the Pivotal database using DD Boost from a Pivotal uncompressed backup.

Test 1 results

Test 1 performs a full backup to Data Domain with Pivotal compression off, using Data Domain lz compression.

Figure 22 illustrates the five percent daily increase in the source database size over seven days on the DCA UAP Edition with an uncompressed backup of an uncompressed database. To simulate the real-life data growth of a data warehouse, we applied an incremental data load of five percent between backups.

Figure 22. Pivotal GPDB, DD Boost: Five percent daily increase over a simulated seven days with an uncompressed database (chart: DB size in TB versus backup cycle in days)

Figure 23 illustrates the backup duration in minutes on the DCA UAP Edition with an uncompressed backup of an uncompressed database, based on the five percent daily increase. The first backup takes longer to complete because it is the first time the data is written to the Data Domain system.

Figure 23. Pivotal GPDB, DD Boost: Backup duration with gpcrondump uncompressed backup (chart: minutes per backup-cycle day)


More information

EMC CLARiiON CX3-80. Enterprise Solutions for Microsoft SQL Server 2005

EMC CLARiiON CX3-80. Enterprise Solutions for Microsoft SQL Server 2005 Enterprise Solutions for Microsoft SQL Server 2005 EMC CLARiiON CX3-80 EMC Long Distance Recovery for SQL Server 2005 Enabled by Replication Manager and RecoverPoint CRR Reference Architecture EMC Global

More information

EMC VSPEX SERVER VIRTUALIZATION SOLUTION

EMC VSPEX SERVER VIRTUALIZATION SOLUTION Reference Architecture EMC VSPEX SERVER VIRTUALIZATION SOLUTION VMware vsphere 5 for 100 Virtual Machines Enabled by VMware vsphere 5, EMC VNXe3300, and EMC Next-Generation Backup EMC VSPEX April 2012

More information

Optimizing and Managing File Storage in Windows Environments

Optimizing and Managing File Storage in Windows Environments Optimizing and Managing File Storage in Windows Environments A Powerful Solution Based on Microsoft DFS and Virtual File Manager September 2006 TR-3511 Abstract The Microsoft Distributed File System (DFS)

More information

Setting Up the Dell DR Series System on Veeam

Setting Up the Dell DR Series System on Veeam Setting Up the Dell DR Series System on Veeam Dell Engineering April 2016 A Dell Technical White Paper Revisions Date January 2014 May 2014 July 2014 April 2015 June 2015 November 2015 April 2016 Description

More information

EMC DL3D Best Practices Planning

EMC DL3D Best Practices Planning Best Practices Planning Abstract This white paper is a compilation of specific configuration and best practices information for the EMC DL3D 4000 for its use in SAN environments as well as the use of its

More information

NetVault Backup Client and Server Sizing Guide 2.1

NetVault Backup Client and Server Sizing Guide 2.1 NetVault Backup Client and Server Sizing Guide 2.1 Recommended hardware and storage configurations for NetVault Backup 10.x and 11.x September, 2017 Page 1 Table of Contents 1. Abstract... 3 2. Introduction...

More information

EMC GREENPLUM MANAGEMENT ENABLED BY AGINITY WORKBENCH

EMC GREENPLUM MANAGEMENT ENABLED BY AGINITY WORKBENCH White Paper EMC GREENPLUM MANAGEMENT ENABLED BY AGINITY WORKBENCH A Detailed Review EMC SOLUTIONS GROUP Abstract This white paper discusses the features, benefits, and use of Aginity Workbench for EMC

More information

DELL EMC DATA DOMAIN DEDUPLICATION STORAGE SYSTEMS

DELL EMC DATA DOMAIN DEDUPLICATION STORAGE SYSTEMS SPEC SHEET DELL EMC DATA DOMAIN DEDUPLICATION STORAGE SYSTEMS Data Domain Systems Dell EMC Data Domain deduplication storage systems continue to revolutionize disk backup, archiving, and disaster recovery

More information

Virtualizing SQL Server 2008 Using EMC VNX Series and VMware vsphere 4.1. Reference Architecture

Virtualizing SQL Server 2008 Using EMC VNX Series and VMware vsphere 4.1. Reference Architecture Virtualizing SQL Server 2008 Using EMC VNX Series and VMware vsphere 4.1 Copyright 2011, 2012 EMC Corporation. All rights reserved. Published March, 2012 EMC believes the information in this publication

More information

Technology Insight Series

Technology Insight Series EMC Avamar for NAS - Accelerating NDMP Backup Performance John Webster June, 2011 Technology Insight Series Evaluator Group Copyright 2011 Evaluator Group, Inc. All rights reserved. Page 1 of 7 Introduction/Executive

More information

Video Surveillance EMC Storage with Honeywell Digital Video Manager

Video Surveillance EMC Storage with Honeywell Digital Video Manager Video Surveillance EMC Storage with Honeywell Digital Video Manager Sizing Guide H14748 01 Copyright 2016 EMC Corporation. All rights reserved. Published in the USA. Published February, 2016 EMC believes

More information

Setting Up the DR Series System as an NFS Target on Amanda Enterprise 3.3.5

Setting Up the DR Series System as an NFS Target on Amanda Enterprise 3.3.5 Setting Up the DR Series System as an NFS Target on Amanda Enterprise 3.3.5 Dell Engineering November 2016 A Quest Technical White Paper Revisions Date June 2015 November 2016 Description Initial release

More information

Table 1 The Elastic Stack use cases Use case Industry or vertical market Operational log analytics: Gain real-time operational insight, reduce Mean Ti

Table 1 The Elastic Stack use cases Use case Industry or vertical market Operational log analytics: Gain real-time operational insight, reduce Mean Ti Solution Overview Cisco UCS Integrated Infrastructure for Big Data with the Elastic Stack Cisco and Elastic deliver a powerful, scalable, and programmable IT operations and security analytics platform

More information

Storwize/IBM Technical Validation Report Performance Verification

Storwize/IBM Technical Validation Report Performance Verification Storwize/IBM Technical Validation Report Performance Verification Storwize appliances, deployed on IBM hardware, compress data in real-time as it is passed to the storage system. Storwize has placed special

More information

Hitachi Adaptable Modular Storage and Workgroup Modular Storage

Hitachi Adaptable Modular Storage and Workgroup Modular Storage O V E R V I E W Hitachi Adaptable Modular Storage and Workgroup Modular Storage Modular Hitachi Storage Delivers Enterprise-level Benefits Hitachi Data Systems Hitachi Adaptable Modular Storage and Workgroup

More information

EMC Celerra CNS with CLARiiON Storage

EMC Celerra CNS with CLARiiON Storage DATA SHEET EMC Celerra CNS with CLARiiON Storage Reach new heights of availability and scalability with EMC Celerra Clustered Network Server (CNS) and CLARiiON storage Consolidating and sharing information

More information

StorageCraft OneXafe and Veeam 9.5

StorageCraft OneXafe and Veeam 9.5 TECHNICAL DEPLOYMENT GUIDE NOV 2018 StorageCraft OneXafe and Veeam 9.5 Expert Deployment Guide Overview StorageCraft, with its scale-out storage solution OneXafe, compliments Veeam to create a differentiated

More information

Arcserve UDP 7000 Appliance Series Revolution, Not Just Evolution

Arcserve UDP 7000 Appliance Series Revolution, Not Just Evolution Arcserve 7000 Appliance Series Revolution, Not Just Evolution For customers who require a set and forget backup and recovery solution, Arcserve Unified Data Protection () 7000 is the first complete and

More information

Surveillance Dell EMC Storage with IndigoVision Control Center

Surveillance Dell EMC Storage with IndigoVision Control Center Surveillance Dell EMC Storage with IndigoVision Control Center Sizing Guide H14832 REV 1.1 Copyright 2016-2017 Dell Inc. or its subsidiaries. All rights reserved. Published May 2016 Dell believes the information

More information

Dell DR4000 Replication Overview

Dell DR4000 Replication Overview Dell DR4000 Replication Overview Contents Introduction... 1 Challenges with Data Disaster Recovery... 1 The Dell DR4000 Solution A Replication Overview... 2 Advantages of using DR4000 replication for disaster

More information

Preserving the World s Most Important Data. Yours. SYSTEMS AT-A-GLANCE: KEY FEATURES AND BENEFITS

Preserving the World s Most Important Data. Yours. SYSTEMS AT-A-GLANCE: KEY FEATURES AND BENEFITS Preserving the World s Most Important Data. Yours. SYSTEMS AT-A-GLANCE: KEY FEATURES AND BENEFITS We are the only company to integrate disk, tape, and replication in a single solution set for better near-term

More information

Dell PowerEdge R720xd with PERC H710P: A Balanced Configuration for Microsoft Exchange 2010 Solutions

Dell PowerEdge R720xd with PERC H710P: A Balanced Configuration for Microsoft Exchange 2010 Solutions Dell PowerEdge R720xd with PERC H710P: A Balanced Configuration for Microsoft Exchange 2010 Solutions A comparative analysis with PowerEdge R510 and PERC H700 Global Solutions Engineering Dell Product

More information

EMC CLARiiON CX3 Series FCP

EMC CLARiiON CX3 Series FCP EMC Solutions for Microsoft SQL Server 2005 on Windows 2008 EMC CLARiiON CX3 Series FCP EMC Global Solutions 42 South Street Hopkinton, MA 01748-9103 1-508-435-1000 www.emc.com www.emc.com Copyright 2008

More information

DEDUPLICATION BASICS

DEDUPLICATION BASICS DEDUPLICATION BASICS 4 DEDUPE BASICS 6 WHAT IS DEDUPLICATION 8 METHODS OF DEDUPLICATION 10 DEDUPLICATION EXAMPLE 12 HOW DO DISASTER RECOVERY & ARCHIVING FIT IN? 14 DEDUPLICATION FOR EVERY BUDGET QUANTUM

More information

EMC Virtual Infrastructure for Microsoft Exchange 2010 Enabled by EMC Symmetrix VMAX, VMware vsphere 4, and Replication Manager

EMC Virtual Infrastructure for Microsoft Exchange 2010 Enabled by EMC Symmetrix VMAX, VMware vsphere 4, and Replication Manager EMC Virtual Infrastructure for Microsoft Exchange 2010 Enabled by EMC Symmetrix VMAX, VMware vsphere 4, and Replication Manager Reference Architecture Copyright 2010 EMC Corporation. All rights reserved.

More information

EMC BACKUP AND RECOVERY PRODUCT OVERVIEW

EMC BACKUP AND RECOVERY PRODUCT OVERVIEW EMC BACKUP AND RECOVERY PRODUCT OVERVIEW Next-generation data protection Essentials Next-generation Data Protection Backup redesign is an imperative to keep pace with data growth and virtualization Disk-

More information

Technology Insight Series

Technology Insight Series IBM ProtecTIER Deduplication for z/os John Webster March 04, 2010 Technology Insight Series Evaluator Group Copyright 2010 Evaluator Group, Inc. All rights reserved. Announcement Summary The many data

More information

50 TB. Traditional Storage + Data Protection Architecture. StorSimple Cloud-integrated Storage. Traditional CapEx: $375K Support: $75K per Year

50 TB. Traditional Storage + Data Protection Architecture. StorSimple Cloud-integrated Storage. Traditional CapEx: $375K Support: $75K per Year Compelling Economics: Traditional Storage vs. StorSimple Traditional Storage + Data Protection Architecture StorSimple Cloud-integrated Storage Servers Servers Primary Volume Disk Array ($100K; Double

More information

vstart 50 VMware vsphere Solution Specification

vstart 50 VMware vsphere Solution Specification vstart 50 VMware vsphere Solution Specification Release 1.3 for 12 th Generation Servers Dell Virtualization Solutions Engineering Revision: A00 March 2012 THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES

More information

Reducing Costs in the Data Center Comparing Costs and Benefits of Leading Data Protection Technologies

Reducing Costs in the Data Center Comparing Costs and Benefits of Leading Data Protection Technologies Reducing Costs in the Data Center Comparing Costs and Benefits of Leading Data Protection Technologies November 2007 Reducing Costs in the Data Center Table of Contents The Increasingly Costly Data Center...1

More information

ECONOMICAL, STORAGE PURPOSE-BUILT FOR THE EMERGING DATA CENTERS. By George Crump

ECONOMICAL, STORAGE PURPOSE-BUILT FOR THE EMERGING DATA CENTERS. By George Crump ECONOMICAL, STORAGE PURPOSE-BUILT FOR THE EMERGING DATA CENTERS By George Crump Economical, Storage Purpose-Built for the Emerging Data Centers Most small, growing businesses start as a collection of laptops

More information

VxRack System SDDC Enabling External Services

VxRack System SDDC Enabling External Services VxRack System SDDC Enabling External Services May 2018 H17144 Abstract This document describes how to enable external services for a VxRack System SDDC. Use cases included are Dell EMC Avamar-based backup

More information

IBM Spectrum NAS. Easy-to-manage software-defined file storage for the enterprise. Overview. Highlights

IBM Spectrum NAS. Easy-to-manage software-defined file storage for the enterprise. Overview. Highlights IBM Spectrum NAS Easy-to-manage software-defined file storage for the enterprise Highlights Reduce capital expenditures with storage software on commodity servers Improve efficiency by consolidating all

More information

EMC XTREMCACHE ACCELERATES VIRTUALIZED ORACLE

EMC XTREMCACHE ACCELERATES VIRTUALIZED ORACLE White Paper EMC XTREMCACHE ACCELERATES VIRTUALIZED ORACLE EMC XtremSF, EMC XtremCache, EMC Symmetrix VMAX and Symmetrix VMAX 10K, XtremSF and XtremCache dramatically improve Oracle performance Symmetrix

More information

Data Sheet FUJITSU Storage ETERNUS CS200c S3

Data Sheet FUJITSU Storage ETERNUS CS200c S3 Data Sheet FUJITSU Storage ETERNUS CS200c S3 Data Sheet FUJITSU Storage ETERNUS CS200c S3 All-in-one Backup Appliance Powered by Commvault ETERNUS CS Data Protection Appliances ETERNUS CS data protection

More information

STOREONCE OVERVIEW. Neil Fleming Mid-Market Storage Development Manager. Copyright 2010 Hewlett-Packard Development Company, L.P.

STOREONCE OVERVIEW. Neil Fleming Mid-Market Storage Development Manager. Copyright 2010 Hewlett-Packard Development Company, L.P. STOREONCE OVERVIEW Neil Fleming Neil.Fleming@HP.com Mid-Market Development Manager 1 DETERMINE YOUR RECOVERY NEEDS Wks Days Hrs Mins Secs Secs Mins Hrs Days Wks TAPE D2D Recovery Point SNAP SNAP Recovery

More information

EMC Virtual Infrastructure for Microsoft Exchange 2007

EMC Virtual Infrastructure for Microsoft Exchange 2007 EMC Virtual Infrastructure for Microsoft Exchange 2007 Enabled by EMC Replication Manager, EMC CLARiiON AX4-5, and iscsi Reference Architecture EMC Global Solutions 42 South Street Hopkinton, MA 01748-9103

More information

Data Sheet FUJITSU Storage ETERNUS CS200c S4

Data Sheet FUJITSU Storage ETERNUS CS200c S4 Data Sheet FUJITSU Storage ETERNUS CS200c S4 Data Sheet FUJITSU Storage ETERNUS CS200c S4 Integrated Backup Appliance for the Digital World ETERNUS CS Data Protection Appliances ETERNUS CS data protection

More information

DATA DOMAIN INVULNERABILITY ARCHITECTURE: ENHANCING DATA INTEGRITY AND RECOVERABILITY

DATA DOMAIN INVULNERABILITY ARCHITECTURE: ENHANCING DATA INTEGRITY AND RECOVERABILITY WHITEPAPER DATA DOMAIN INVULNERABILITY ARCHITECTURE: ENHANCING DATA INTEGRITY AND RECOVERABILITY A Detailed Review ABSTRACT No single mechanism is sufficient to ensure data integrity in a storage system.

More information

Configuring a Single Oracle ZFS Storage Appliance into an InfiniBand Fabric with Multiple Oracle Exadata Machines

Configuring a Single Oracle ZFS Storage Appliance into an InfiniBand Fabric with Multiple Oracle Exadata Machines An Oracle Technical White Paper December 2013 Configuring a Single Oracle ZFS Storage Appliance into an InfiniBand Fabric with Multiple Oracle Exadata Machines A configuration best practice guide for implementing

More information

Dell DR4100. Disk based Data Protection and Disaster Recovery. April 3, Birger Ferber, Enterprise Technologist Storage EMEA

Dell DR4100. Disk based Data Protection and Disaster Recovery. April 3, Birger Ferber, Enterprise Technologist Storage EMEA Dell DR4100 Disk based Data Protection and Disaster Recovery April 3, 2013 Birger Ferber, Enterprise Technologist Storage EMEA Agenda DR4100 Technical Review DR4100 New Features DR4100 Use Cases DR4100

More information