PRODUCT IN DEPTH
Evaluating Grid Storage for Enterprise Backup, DR and Archiving: NEC HYDRAstor


September 2008

It's hard to believe that just fifteen years ago, most enterprises were challenged to manage hundreds of gigabytes, and possibly several terabytes, of data. Today, most large enterprises are already managing at least one hundred terabytes of data, if not more, and are concerned that the monolithic storage architectures that were good at managing the scale of data required in the past have run out of gas. A number of new market drivers, including evolving regulatory mandates, increasingly stringent e-discovery requirements, new data sources such as Web 2.0, and data growth rates in the 50% to 60% range for at least the next five years, mean that most large enterprises will be called upon to manage multiple petabytes of data in the foreseeable future, and to do so cost-effectively. Decisions about replacement storage platforms need to take this scale issue into account.

Monolithic storage architectures have predominated in enterprise environments, but the new scale requirements are not well served by this architecture. When a monolithic array is outgrown, it requires a disruptive, fork-lift upgrade to move to next-generation technologies. Ratios between processing performance and storage capacity cannot be flexibly configured, routine storage management tasks such as provisioning are not cost-effectively scalable into the petabyte range, and growing workloads cannot be seamlessly spread across physical array boundaries. Burgeoning requirements are calling for a new, dense storage architecture with very different performance, capacity, configuration flexibility, and $/GB characteristics than those offered by monolithic storage architectures.

In this Product Profile, we'll take a look at the requirements for secondary storage platforms designed to handle the data explosion cost-effectively, and then show how NEC's HYDRAstor, a scale-out NAS solution that embodies a number of compelling technologies, meets those requirements. HYDRAstor is already enjoying strong traction among large enterprises because of its massive scalability, rich feature set, and aggressive pricing (usable capacity under $1/GB at scale).

Re-Evaluating the $/GB Costs of Conventional Storage Environments

To manage overall storage capacity more cost-effectively, Taneja Group has been encouraging end users to consider their data access and usage patterns, understand how those translate into storage performance (particularly access latency) requirements, know the costs associated with different storage tiers, and then use that information to tier storage appropriately to achieve the lowest overall $/GB that meets specific requirements. Armed with this data, large enterprises can make strategic decisions about where to deploy disk in their secondary storage environments, allowing them to actually improve their ability to meet service level agreements while managing to a lower overall $/GB cost for storage capacity (across both primary and secondary storage).

In our experience, most large enterprises are keeping 70% to 90% more data on primary storage than is necessary to meet their actual requirements, and as a result are paying a significantly higher $/GB cost for their overall storage capacity than they have to. This fact, combined with the very aggressive price points for disk-based secondary storage offered by scale-out NAS platforms built around grid architectures, will be a flashpoint for large enterprises looking to take the plunge.

It's clear that conventional storage architectures do not cost-effectively meet the scalability, performance, and manageability requirements of this new era of exploding data growth, but it may not be clear what the right successor architecture is. In our opinion, the intelligent application of storage tiering concepts will become increasingly important, with grid-based storage as the heir apparent in the secondary storage arena, offering compelling economic advantages over both tape and existing monolithic storage architectures.

An Ideal Storage Tiering Model

In an ideal world, Taneja Group posits a four-tier storage architecture. Legacy environments often had only two tiers: disk for primary storage, and tape for everything else. Disk offered low latency access, high availability and reliability, and could support high duty cycles with enterprise-class Fibre Channel (FC) and SCSI disk drives, but historically cost two orders of magnitude or more than tape. Tape offered nearly limitless capacity and very low acquisition costs on a $/GB basis, but did not meet even medium duty cycle requirements very reliably. Accordingly, disk was used for primary storage and tape for backup and, if they were being done at all, DR and archiving.

Primary storage requirements include low latency, high availability, high data reliability, and a high duty cycle (since data residing in primary storage environments is expected to be accessed and modified frequently). As data ages, frequency of access tends to decline, although it declines at different rates for different data.

At some point, data no longer requires low latency access and can be moved to lower cost secondary storage. Within secondary storage environments, there can be multiple tiers, each with different requirements. Secondary storage generally hosts applications like backup, disaster recovery (DR), and archiving. It's interesting to note that, for most organizations, a given amount of primary storage generates at least five to ten times as much secondary storage, if not more. Based on studies done by the Taneja Group in 1H08, enterprise-class primary storage generally costs five to eight times as much as secondary storage, so it is important to ensure that the only data maintained on primary storage actually has to be there. But with a given amount of primary storage generating as much as 10x or more secondary storage capacity, it is also important to ensure that the secondary tier(s) meet performance, availability, and reliability requirements at an aggressive $/GB.

Figure 1. A tiered storage architecture in an ideal world, showing the key requirements for each tier.

Secondary storage requirements are no longer well met using only tape. With the rapid growth of primary data, backup windows have become a problem. 7x24 requirements for many applications leave little time to back up data sets that are now larger than ever, and growing faster than ever before. Stringent recovery point objective (RPO) and recovery time objective (RTO) requirements are not well met with tape, particularly for the most frequent types of restore requests (object-level restores). When tape is used to regularly handle the duty cycles demanded by most backup and restore activities, its reliability suffers; from our discussions with end users, it is not uncommon for them to see tape reliability issues affecting as many as 20% or more of attempted restores.

Disk offered a solution for these problems, but its high cost historically limited its use in secondary storage applications. The emergence of new disk technologies, such as SATA, and in particular storage capacity optimization (data reduction technologies like data de-duplication), has now dropped the cost of usable, disk-based secondary storage capacity. Grid architectures provide a storage platform that can leverage disk effectively, can scale to petabytes of capacity, can leverage inherent redundancies to support high levels of availability and resiliency, and, when used in conjunction with storage capacity optimization technologies, can offer $/GB costs under $1.

DR requires low-cost, long-term storage at an off-site location, and may carry its own set of RPO/RTO requirements. If disk is used in DR environments, replication can be used to send data to a remote location, avoiding the security and other issues associated with physical tape transport. The use of storage capacity optimization technology can significantly reduce the amount of data that has to be sent across a wide area network (WAN), making replication a more viable and cost-effective option than ever before. The use of replication to create and maintain one or more DR sites generally lets a company significantly improve the RPO of its DR capability, since getting the data off of backup clients and to remote locations usually takes only hours instead of days.

As e-discovery grows in importance, archiving platform requirements are changing. Tapes are not searchable on-line, and when used as a medium for archiving they result in very high discovery costs. Given the prevalence of lawsuits (most large companies are involved in at least one lawsuit at all times), an archive that is searchable on-line can now provide significant cost savings. Fortune 1000 companies frequently assume a minimum of $500,000 in discovery costs against a tape-based archive per lawsuit, whereas e-discovery costs against a disk-based archive for the same lawsuit could easily be one tenth of that. These issues together mean that it is time for large enterprises to re-evaluate the use of disk for secondary storage applications like backup, DR, and archiving.

Criteria For The New Secondary Storage Platform

Offerings in this space should be platform-based products that offer a set of specific capabilities that can be leveraged across a variety of secondary applications. The assumption is that they will be used in conjunction with existing secondary storage application software like enterprise backup or archiving, and it is the combination of the two (e.g. an enterprise backup application and a new secondary storage platform) that provides the overall solution.

Massive scalability and performance. In the new environment, secondary storage requires a scalable architecture that can take a customer from entry-level configurations in the terabyte range to tens of petabytes (PBs) of data and beyond, providing the necessary throughput as configurations grow.

Modular configurability. Business and regulatory mandates require that the secondary storage platform be independently scalable and configurable along two key metrics: capacity and performance. This configuration flexibility should allow the creation of extremely throughput-intensive or extremely capacity-intensive configurations by adding the appropriate resources as needed, without incurring disruptive fork-lift upgrades.

High availability. To provide the service necessary in the 24x7 world of IT operations, the secondary storage platform should have no single point of failure and allow for on-line maintenance, upgrades, and data migration. This must include a transparent ability to rebalance workloads as resources are added to or subtracted from a system, as well as an ability to easily establish replicated configurations for DR purposes.

Platform data management tools. Data must be stored securely and redundantly in a manner that can sustain multiple concurrent failures; data must be both reliable and persistent. The platform must include storage capacity optimization technologies designed to minimize the amount of raw storage capacity required to store a given amount of information. Capacity optimization ratios will vary based on data type, application, and company-specific policies. Despite all of disk's advantages in secondary storage applications, the administrative overhead associated with disk provisioning has been a significant disadvantage, one which becomes more onerous as storage capacities increase. Look for self-managing capabilities that can significantly increase the administrative span of control by automating provisioning tasks as much as possible.

Ability to integrate with existing processes. Most large enterprises already have a set of processes established around secondary storage applications like backup and archive that likely leverage some combination of disk and tape. Millions of dollars have often been invested in these processes, and any changes or additions to secondary storage infrastructure can only realistically be made in an evolutionary way that requires minimal or no disruption to existing processes. While storage targets may offer different characteristics, it is important that they support the different access mechanisms and management paradigms of widely deployed backup and archive management software.

Industry standards-based. The platform must be built to outlast vendor-specific technology, such as underlying hardware and APIs, which will undoubtedly change over the long time periods likely to be required for most secondary storage applications. Use of commodity hardware, standardized access protocols, and support for a wide variety of popular secondary storage applications will provide a future-proof solution.

Affordability. Tape has been the medium of choice for most secondary storage applications. Tape acquisition and associated power consumption costs are low, but given evolving data protection and archiving requirements, tape's disadvantages are becoming more pronounced and more costly.

Tape fares poorly against disk for discovery purposes, and tape also does not offer the option to winnow down the amount of data retained on primary storage that no longer requires low latency access but still requires on-line access. When looking at a secondary storage platform to accommodate future growth, we strongly urge end users to take issues such as these into account, not just acquisition costs. In certain application environments, disk-based platforms may offer total costs of ownership that are very close to tape's.

Note that these criteria define a set of features that can be used across multiple secondary storage applications simultaneously. Platforms in this class in fact offer this as a specific cost-savings advantage, providing an ability to safely co-locate certain secondary storage applications on a single, massively scalable storage platform. Virtual platforms, dedicated to backup, archive, or other secondary storage applications, can be established within the larger platform, and configured and managed accordingly. This allows the storage for multiple applications to be centrally managed, while any application-specific management requirements are met by the secondary storage application software.

Enter NEC HYDRAstor

Figure 2. HYDRAstor layers into an existing environment, presenting a NAS-based storage target for existing secondary storage applications like backup and archive.

NEC Corporation is a $41B international company that has been selling storage products for over 50 years. NEC Corporation of America, the North American subsidiary of NEC Corporation headquartered in Irving, Texas, unveiled its HYDRAstor unified disk storage platform in 2007, basing it around a grid storage architecture that is scalable into the petabyte range today.

HYDRAstor supports two types of nodes: Accelerator and Storage. Accelerator Nodes scale performance, while Storage Nodes scale disk capacity (and the performance necessary to address the added capacity). Both node types can be added non-disruptively. NEC initially targeted HYDRAstor as a backup and archival storage platform for large enterprise environments, but as it continues to gain traction and prove itself in the market, it's clear that NEC could broaden its positioning to include use as a primary storage platform for consolidated file services. In September 2008, NEC unveiled the HYDRAstor HS, which includes more powerful nodes based on newer technologies that can be easily integrated into existing HYDRAstor configurations built around the older HS nodes. NEC has also enhanced RepliGrid, HYDRAstor's replication software, to include support for N-to-1 replicated configurations, supporting more flexibility in setting up DR solutions.

As the first grid storage solution introduced by a Fortune 200 company, HYDRAstor caused quite a stir. Since its introduction less than a year ago, NEC has deployed HYDRAstor in production in over 50 large enterprise environments across a number of different verticals, including financial services, health care, telecommunications, entertainment, manufacturing, and education. Common business drivers for HYDRAstor acquisition cited by NEC customers include its ability to support a disk-based secondary storage platform that is massively scalable, cost-effective, and exhibits very high data reliability. Most customers were already sold on the concept of using disk for backup, but were concerned about disk's affordability and manageability. HYDRAstor's aggressive $/GB costs for usable storage capacity, strongly aided by its support for storage capacity optimization technology, combine with its self-managing, self-healing capabilities to address these concerns.

How Well Does HYDRAstor Meet The Criteria?

The detailed discussion of how HYDRAstor meets our established criteria will rely heavily on an understanding of what the Accelerator and Storage Nodes do and how they are configured, so we'll address that issue first. Accelerator Nodes are based around an industry standards-based 2U server with Intel-compatible CPUs, main memory, and two internal drives. In the HS Accelerator Node models just announced this month, NEC is using quad-core, 3GHz CPUs, 8GB of RAM, and 147GB SAS drives. Each Accelerator Node supports 300MB/sec of throughput and offers multiple Ethernet (1Gb or 10Gb) connection options. Accelerator Nodes manage the NFS or CIFS interface, perform the preliminary chunking and fingerprinting of data as it comes into the system, and handle the routing for HYDRAstor's unique distributed hash table. Chunking is performed at the sub-file level and uses variable-length windows, two factors supporting higher data reduction ratios. All Accelerator Nodes run DynamicStor, which can be thought of as an operating environment that includes a coherent, distributed file system that leverages the inherent redundancy of grid architectures, plus an automatic failover capability to ensure that no node is a single point of failure. The distributed hash table is one of HYDRAstor's standout technologies, as will be explained later.
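
To make the chunking and fingerprinting step concrete, here is a minimal Python sketch of sub-file, variable-length (content-defined) chunking. It is an illustration under our own assumptions: the window size, boundary mask, simple additive checksum, and SHA-256 fingerprint are stand-ins, since NEC does not publish DataRedux's internals.

```python
import hashlib

# Illustrative parameters only; NEC does not disclose DataRedux's actual
# window sizes, boundary rules, or hash function.
MIN_CHUNK, MAX_CHUNK = 2 * 1024, 64 * 1024
WINDOW = 48            # rolling-window width in bytes
MASK = 0x7FF           # boundary when the window checksum's low 11 bits are zero

def chunks(data: bytes):
    """Content-defined (variable-length) chunking: cut wherever a checksum
    over the trailing WINDOW bytes matches a bit pattern. Boundaries track
    the content itself, so inserting bytes early in a file shifts chunk
    edges locally instead of re-aligning every chunk that follows."""
    start, total = 0, 0
    window = bytearray()
    for i, byte in enumerate(data):
        window.append(byte)
        total += byte
        if len(window) > WINDOW:
            total -= window.pop(0)            # slide the window forward
        size = i + 1 - start
        if (size >= MIN_CHUNK and (total & MASK) == 0) or size >= MAX_CHUNK:
            yield data[start:i + 1]
            start, total = i + 1, 0
            window.clear()
    if start < len(data):
        yield data[start:]                    # trailing partial chunk

def fingerprint(chunk: bytes) -> str:
    """A collision-resistant hash of the chunk identifies duplicates."""
    return hashlib.sha256(chunk).hexdigest()
```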

Storage Nodes are also based around an industry standards-based 2U server, in this case supporting quad-core, 3GHz CPUs, 24GB of RAM, and twelve 1TB SATA disk drives. DynamicStor virtualizes storage within and across all Storage Nodes in a grid to create one logical pool of capacity accessible by all Accelerator Nodes, and handles all the storage provisioning tasks automatically as resources are added, subtracted, or moved to different locations. For performance and scalability reasons, each Storage Node supports only a part of the distributed hash table. After an Accelerator Node performs the preliminary storage capacity optimization work, it sends the data directly to the Storage Node responsible for that part of the hash table. Because Storage Nodes do not have to process the entire hash table, they support much higher performance than conventional methods, where the entire hash table must be processed for each request. This distributed hashing approach also overcomes other issues common with conventional hashing designs, such as an inability to grow beyond a hash table of a certain size (due to main memory or other limitations) and poor performance due to paging (because the hash table is too big to fit in main memory).

Figure 3. HYDRAstor systems can be created from any combination of Accelerator and Storage Nodes, supporting tens of thousands of MB/sec of bandwidth and tens of petabytes, to meet a wide variety of performance and capacity requirements.
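
The performance advantage of partitioned hash routing is easiest to see in code. The sketch below is a generic consistent-hashing illustration, not HYDRAstor's actual routing logic; the node names and virtual-point count are invented. Each Storage Node owns slices of the fingerprint space, so an Accelerator Node can route a chunk directly to its owner with a purely local lookup rather than consulting one monolithic table.

```python
import hashlib
from bisect import bisect_right

class FingerprintRouter:
    """Hypothetical sketch of hash-space partitioning: each storage node
    owns slices of the fingerprint space, so a lookup touches exactly one
    node instead of one giant, centralized hash table."""
    def __init__(self, node_names):
        # Place each node at many points around a hash ring, so adding or
        # removing a node remaps only a fraction of the space.
        self.ring = sorted(
            (int(hashlib.md5(f"{name}#{r}".encode()).hexdigest(), 16), name)
            for name in node_names for r in range(64)   # 64 virtual points/node
        )
        self.keys = [k for k, _ in self.ring]

    def node_for(self, fingerprint: str) -> str:
        h = int(fingerprint, 16)
        i = bisect_right(self.keys, h) % len(self.ring)
        return self.ring[i][1]

router = FingerprintRouter(["sn01", "sn02", "sn03", "sn04"])
# An Accelerator Node hashes a chunk, then ships it straight to the owner.
print(router.node_for(hashlib.sha256(b"some chunk").hexdigest()))
```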

HYDRAstor systems can be very flexibly configured by adding Accelerator and Storage Nodes independently as needed. Backup applications are generally deployed in ratios of 1:2 (Accelerator Nodes to Storage Nodes), while archive applications, which generally require less bandwidth due to usage patterns, are deployed in ratios of 1:4. HYDRAstor can, however, support any combination of Accelerator and Storage Nodes.

Massive scalability and performance. Entry-level HYDRAstor configurations start with what NEC calls a 1x2 configuration, which includes 1 Accelerator Node and 2 Storage Nodes, providing 300MB/sec of throughput and 24TB of raw capacity (12TB per Storage Node). Assuming a 20:1 data de-duplication ratio and the 25% data protection overhead inherent in HYDRAstor's Distributed Resilient Data (DRD) logical data redundancy default setting, this entry-level configuration would provide a customer with 360TB of usable capacity. The largest pre-configured HYDRAstor model available from NEC has 55 Accelerator Nodes and 110 Storage Nodes, supports 16.5GB/sec of bandwidth, and offers 1.3PB of raw storage capacity. This system is actually being built in NEC's Redmond Technology Center in Washington state, and will be used for internal testing and demo purposes. HYDRAstor includes DataRedux, NEC's patent-pending storage capacity optimization technology, which brings usable capacity easily up into the tens of petabytes for large configurations, depending upon the level of data redundancy configured and the data reduction ratios achievable across various workloads. Existing customers using HYDRAstor as a backup platform are achieving data reduction ratios of 20:1 or more over time.

HYDRAstor's architecture puts it into an elite category of platforms that are able to provide well-balanced, high performance even as the system is scaled out to its maximum capacity. To understand how HYDRAstor achieves this linear scalability, we need at least a high-level understanding of how HYDRAstor handles the data. Each Accelerator Node includes at least one share (i.e. file system). When a file is written to this share, it is broken into chunks. A chunk is then broken into multiple fragments and stored across as many Storage Nodes as possible. This approach spreads the processing associated with any disk read or write across many disk spindles, each of which has dedicated processing power within its own Storage Node. As Storage Nodes are added, so is the additional processing power necessary to keep their disks performing optimally. As Accelerator Nodes are added, they also provide additional processing power, as well as an additional 300MB/sec of throughput each, whose usage is spread in a balanced manner across all existing Storage Nodes by DynamicStor. It is this balanced addition of resources, the usage of which is spread evenly across all nodes in the system, that supports HYDRAstor's linear scalability.
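
The usable-capacity arithmetic in the entry-level example above is simple to verify. A minimal sketch, assuming the 25% DRD overhead of the default setting and a data reduction ratio that real workloads may or may not achieve:

```python
def usable_tb(raw_tb: float, drd_overhead: float = 0.25,
              reduction: float = 20.0) -> float:
    """Usable capacity = raw capacity, minus the DRD parity overhead,
    multiplied by the data reduction (de-duplication plus compression) ratio."""
    base_tb = raw_tb * (1.0 - drd_overhead)   # capacity left after parity fragments
    return base_tb * reduction

print(usable_tb(24))                   # entry-level 1x2: 24TB raw -> 360.0TB usable
print(usable_tb(24, reduction=10.0))   # a conservative 10:1 ratio -> 180.0TB
```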

HYDRAstor's scale-out architecture allows resources to be added over time, even across technology generations (e.g. Accelerator Nodes incorporating newer processor technology and higher density memory, or Storage Nodes incorporating larger capacity and/or different disk types), without incurring a fork-lift upgrade. This extends the usable life of a platform like HYDRAstor from the standard 3-4 years associated with monolithic architectures to one that could be decades in length, significantly changing the economics associated with total cost of ownership (TCO) calculations.

Modular configurability. With HYDRAstor, Accelerator and Storage Nodes can be added as needed, with no constraints. As shown earlier in Figure 3, if more throughput is required, more Accelerator Nodes can be added, up to a maximum configuration supporting 16.5GB/sec. If more storage capacity is needed, more Storage Nodes can be added, supporting tens of petabytes of usable capacity. Unlike monolithic storage architectures, performance does not degrade as capacity is added (on the contrary, it increases), and there is no constraint limiting the maximum amount of throughput associated with any particular amount of storage capacity.

Note also that, by defining share names and associating them with particular secondary storage applications, users can effectively create multiple virtual systems which have the same scalability characteristics as the overall HYDRAstor configuration. A particular share or set of shares can be designated as a backup platform, while a separate set of shares is dedicated for use as an archive platform. These virtual systems are all managed centrally through HYDRAstor's web-based management GUI.

High availability. DynamicStor takes advantage of the inherent redundancy in grid architectures to maintain active-active configurations for all Accelerator and Storage Nodes. This is one of the reasons that NEC recommends that HYDRAstor configurations include multiple nodes of each type. If an Accelerator Node fails, another Accelerator Node can take over and serve the shares that the failed node was providing. Overall throughput, of course, drops when an Accelerator Node fails, but there is no service disruption or loss of data. In larger HYDRAstor configurations, the load of the failed Accelerator Node can be spread across all remaining Accelerator Nodes to minimize any perceived performance impact.

In its default setting, HYDRAstor's patent-pending DRD technology ensures that the failure of three disks is tolerated without any loss of either data or access to that data. In 2x4 configurations (2 Accelerator Nodes, 4 Storage Nodes) or larger, even an entire Accelerator or Storage Node can be lost without any impact to the data or access to it. DRD can be simplistically described as an enhanced RAID capability that allows the user to dial in the level of protection desired (e.g. the number of simultaneous device failures that can occur without impacting data availability), although it really goes far beyond that, because it can provide much greater resiliency with less overhead than conventional RAID implementations.

It is NEC's position that RAID 5 and 6 implementations are not sufficient to address data availability and reliability issues in storage environments where storage capacity optimization technologies are in use.

RAID 5 was designed to ride through any single drive failure, but incurred a substantial write penalty and a 20% capacity overhead. Data on a failed disk would be transparently recovered in the background (a process called a rebuild) while data continued to be available to end users. When disks were relatively small in capacity, rebuild times were not much of a complicating factor for RAID 5 configurations, but as disk capacities grew, so did the amount of data that had to be transferred to return drive configurations to normal operation. With today's 500GB and larger drives, rebuild times have become so long that the bit error rate starts to pose significant risks. The typical bit error rate of a 500GB SATA drive is 1 in 10^14, which means that a read error can be expected for every 12.5TB of data read. In a standard RAID 5 configuration with 5 disks, the likelihood of encountering an uncorrectable read error during a rebuild is fairly high. This is in fact what led to the development of RAID 6, a RAID configuration which protects against dual concurrent drive failures by introducing a second parity drive. While RAID 6 was an improvement over RAID 5 in terms of resilience, the increased capacity overhead of RAID 6 increases the likelihood of read errors during rebuilds, increases rebuild times, and introduces additional performance overhead. As disk drives get denser, these problems with RAID 6 will only get worse.

To address these issues, NEC developed DRD. In its default configuration of resiliency level 3, DRD protects data from any three concurrent disk failures without the RAID write penalty, without incurring performance degradation during rebuilds, without long rebuild times, and with less storage overhead than RAID 6.

Briefly, here's how DRD works. DRD takes each chunk that a write has been broken down into and splits it into multiple data fragments, with the number of data fragments depending upon the resiliency level defined by the administrator (it is always 12 minus the resiliency level). DRD then computes the required number of parity fragments (a resiliency level of 3 creates 3 parity fragments), with all parity fragment calculations based solely on the data fragments. This is all done in main memory; no data ever has to be read from disk a second time for DRD to complete any operation, including rebuilds. A total of 12 fragments is always written: at the default setting of 3, DRD writes 9 data fragments and 3 parity fragments, while at a setting of 4 DRD would write 8 data fragments and 4 parity fragments. These 12 fragments are then written across as many Storage Nodes as possible in a configuration. If there are 12 or more Storage Nodes in a system, the fragments are written across 12 nodes on each write; if there are fewer than 12 Storage Nodes, DRD distributes the fragments across different disks within a node, in every case distributing fragments to achieve maximum data resiliency given the available hardware. At resiliency level 3, a chunk of data can be re-created from any 9 of the 12 fragments, allowing the system to simultaneously lose as many as 3 fragments without impacting data availability. At resiliency level 3, DRD incurs 25% overhead (compared to RAID 5's 20% and RAID 6's 33%) but offers 200% more resiliency than RAID 5 and 50% more than RAID 6.
DRD calculations are done by the Storage Nodes, with each node performing the DRD calculations only for the data that it writes.
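
The resiliency arithmetic in the last few paragraphs can be checked with a few lines of Python. The first function reproduces the unrecoverable-read-error risk that motivates moving beyond RAID 5; the second reproduces DRD's fragment counts and overhead at a given resiliency level. Both are models of the published numbers, not NEC's implementation.

```python
import math

def p_read_error_during_rebuild(drives_read: int, drive_tb: float,
                                ber: float = 1e-14) -> float:
    """Probability of at least one unrecoverable read error while reading
    every surviving drive during a rebuild, at the quoted SATA bit error
    rate of 1 in 10^14 bits (about one expected error per 12.5TB read)."""
    bits_read = drives_read * drive_tb * 1e12 * 8
    return 1.0 - math.exp(-bits_read * ber)       # Poisson approximation

# 5-disk RAID 5 with 500GB drives: a rebuild must read the 4 surviving drives.
print(f"{p_read_error_during_rebuild(4, 0.5):.1%}")   # ~14.8%, i.e. "fairly high"

def drd_layout(resiliency: int = 3, total: int = 12):
    """DRD always writes 12 fragments per chunk: (12 - R) data + R parity.
    Any (12 - R) of the 12 are enough to re-create the chunk."""
    data = total - resiliency
    overhead = resiliency / total
    return data, resiliency, overhead

print(drd_layout(3))   # (9, 3, 0.25): 9 data + 3 parity, 25% overhead
print(drd_layout(4))   # (8, 4, 0.33...): survives any 4 concurrent failures
```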

NEC's claim that DRD does not suffer performance degradation during the rebuild process is based on the way data is retrieved. When an Accelerator Node requests a chunk, it requests it from the many Storage Nodes on which the chunk's associated fragments reside. With the default DRD setting, the first nine fragments provided are used to recreate the chunk; if any of these fragments are parity fragments, the actual data is reconstructed in RAM before being passed back to the Accelerator Node. This avoids the many re-reads necessary to reconstruct data in conventional RAID implementations, and is how HYDRAstor is able to maintain I/O performance even when one or more disks or Storage Nodes are being rebuilt.

HYDRAstor's distributed hash tables also support high availability. Centralized hash tables can represent a single point of failure which can make data unrecoverable. HYDRAstor distributes its hash tables across all Storage Nodes, even though each node is only responsible for actively handling a small portion of the hash table during regular operation, so that even the failure of multiple nodes does not result in the loss of any hash information.

HYDRAstor also supports its own replication facility, called RepliGrid. RepliGrid is licensed on particular Accelerator Nodes, with those nodes handling all replication tasks for the entire HYDRAstor. RepliGrid uses asynchronous, IP-based replication and minimizes bandwidth requirements by only sending unique, capacity-optimized data across the WAN. RepliGrid can be configured to replicate at either the Accelerator Node level or the file system level. Support for N-to-1 configurations allows replication to occur either from multiple Accelerator Nodes to one Accelerator Node or from multiple grids to one grid.

Combined, these features support a solid high availability story. Nodes of either kind can be added, removed, replaced, or shut down for maintenance without impacting HYDRAstor's ability to service applications and keep data available. Because of the way it lays data down across independent components, DRD maintains the same high level of performance for data access even during a drive rebuild. HYDRAstor itself is built to sustain system-level availability in the fault tolerant class (99.999%).
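
The read path described above (reconstruct a chunk from whichever nine fragments are available first) is also the pattern that keeps reads fast during rebuilds, and it is easy to sketch. In the hypothetical Python below, the `fetch` callables and the `decode` stub stand in for fragment retrieval and erasure decoding; only the any-k-of-n control flow is the point.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def decode(fragments):
    """Stand-in for erasure decoding: in a real system this rebuilds the
    original chunk from any k data/parity fragments (e.g. Reed-Solomon)."""
    raise NotImplementedError("illustrative stub")

def read_chunk(fragment_fetchers, k: int = 9) -> bytes:
    """Request all 12 fragments in parallel and decode from whichever k
    arrive first, so a slow, failed, or rebuilding Storage Node never
    stalls the read and no extra disk re-reads are needed."""
    with ThreadPoolExecutor(max_workers=len(fragment_fetchers)) as pool:
        futures = [pool.submit(fetch) for fetch in fragment_fetchers]
        arrived = []
        for future in as_completed(futures):
            try:
                arrived.append(future.result())
            except IOError:
                continue                 # a missing fragment is simply skipped
            if len(arrived) == k:
                return decode(arrived)   # any k of the 12 fragments suffice
    raise IOError("fewer than k fragments readable: resiliency level exceeded")
```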

Platform data management tools. HYDRAstor offers tools to manage data for both persistence and reliability, offers storage capacity optimization technologies to minimize the amount of raw capacity required to store a given amount of data, and employs a self-managing approach which obviates the need for administrators to perform storage provisioning tasks as system configurations change. We've already talked about DRD and how it provides high availability, and it should be clear how DRD's capabilities support data persistence in the face of multiple failures. DynamicStor includes built-in data verification capabilities that also support data reliability. As chunks are created by the Accelerator Nodes, a hash is calculated for each chunk. HYDRAstor uses a very powerful hashing algorithm, so the odds of two different chunks producing the same hash value are vanishingly small.

DataRedux performs data de-duplication at the sub-file level, using variable-length windows, and then compresses the data using modified Lempel-Ziv algorithms prior to storing it. Both the Accelerator and Storage Nodes participate in this process. As data comes into the Accelerator Nodes, it is broken into chunks and unique hash values are created. If a hash value is one that HYDRAstor has already seen, the new data chunk is discarded and a reference pointer is stored instead. If the chunk turns out to be one that HYDRAstor has not seen before, it is compressed and written directly to disk (using the DRD algorithm). DataRedux effectively establishes a global de-duplication repository that is used for any data coming in from any of the Accelerator Nodes, regardless of whether they have been configured into one or more virtual systems. With this single de-duplication repository scaling up to 16.5GB/sec of throughput and potentially tens of petabytes of usable capacity, it is arguably the most scalable storage capacity optimization engine in the industry.
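
A toy version of this write path shows the moving parts: hash, look up, and either store a pointer or compress and write. This is a single-process sketch of the idea only; a Python dict stands in for the distributed hash table, zlib (an LZ77/Lempel-Ziv derivative) stands in for DataRedux's modified Lempel-Ziv compression, and the DRD fragment layer is omitted entirely.

```python
import hashlib
import zlib

class DedupStore:
    """Minimal sketch of a de-duplicating write path: store only
    first-seen chunks (compressed) and keep a pointer for duplicates."""
    def __init__(self):
        self.chunks = {}      # fingerprint -> compressed chunk
        self.refs = 0         # duplicate chunks replaced by pointers

    def write(self, chunk: bytes) -> str:
        fp = hashlib.sha256(chunk).hexdigest()
        if fp in self.chunks:
            self.refs += 1                    # duplicate: keep a reference only
        else:
            self.chunks[fp] = zlib.compress(chunk)   # new data: compress, store
        return fp                             # callers keep fingerprints, not data

    def read(self, fp: str) -> bytes:
        return zlib.decompress(self.chunks[fp])

store = DedupStore()
fp = store.write(b"backup block" * 512)
store.write(b"backup block" * 512)            # second copy stores nothing new
assert store.read(fp) == b"backup block" * 512 and store.refs == 1
```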

One of the most onerous tasks associated with storage system administration is provisioning. In conventional systems, there are RAID controllers, volume managers, and RAID groups to configure. LUN, volume, and file system sizes must be pre-determined and then monitored to ensure that sufficient capacity has been allocated to meet storage requirements. Although it may be hard to believe, to initialize a HYDRAstor system you just need to connect the Accelerator Nodes to your Ethernet network (rack-mounted models of the HYDRAstor are already pre-connected into the HYDRAstor grid), power them up, set up an administrator account, and create at least one share on each of the Accelerator Nodes; it's not necessary to associate the shares with any volumes, real or otherwise. All other tasks are handled by DynamicStor. The Storage Nodes automatically discover each other, detect the DRD resiliency level, and set up what needs to be created for a new storage system. As resources are added, DynamicStor discovers them, determines their type, and adds them to the resource pool. Applications interact with specific Accelerator Nodes based on where shares reside; in this way, specific application workloads can be associated with specific Accelerator Nodes, although all shares on all Accelerator Nodes run data de-duplication against a global repository. DynamicStor automatically recovers any shares on a failed Accelerator Node elsewhere in the system, without any knowledge on the part of the application or end users that a recovery has transparently occurred in the background. The total capacity of all the Storage Nodes is combined into a single large pool of virtual storage that is equally accessible by any and all Accelerator Nodes. This storage pool can be increased or decreased in size on-line, without any impact to applications and without administrators having to do anything other than physically plug in the new resources. It's clear that the provisioning models historically associated with monolithic storage architectures were not going to be cost-effective in environments approaching 100TB and beyond; the type of intelligent self-management demonstrated by HYDRAstor is what's needed.

Ability to integrate with existing processes. Backup is probably the area in which this is most an issue, although archive and DR need to be considered, as well as other common secondary storage applications. Existing backup processes are likely to be built around tape-based architectures, using enterprise backup software like Symantec NetBackup, IBM Tivoli Storage Manager, EMC NetWorker, or any of a number of other enterprise backup software products. HYDRAstor supports NFS and CIFS interfaces, allowing it to be configured as a NAS-based disk-as-disk target for all major backup software products. To set HYDRAstor up as a disk-based backup target, you would enable and configure the disk target capability of your existing backup software, then define one or more shares as backup targets. The big disadvantage of working with disk-as-disk backup targets has been the additional provisioning work that is not required with tape, but as pointed out earlier, HYDRAstor's unique self-managing capabilities do not require any of this provisioning either. Once the backup targets have been defined, all of the other backup policies, including schedules, can be leveraged as-is, without change.

This change to a disk-based backup target brings with it a number of key advantages over tape-based backup in the areas of backup window, RPO, RTO, and recovery reliability. And given HYDRAstor's replication capability, it is now also much easier to establish and maintain a DR facility, since backups can be easily replicated to remote locations. HYDRAstor's DataRedux minimizes the amount of bandwidth necessary to replicate data, helping to keep any additional network investments required to enable replication to a minimum. Using replication to distribute data to remote locations offers significant advantages over physical tape transport, including faster distribution to maintain more aggressive RPOs at the remote site, better security, and less administrative overhead (i.e. loading, labeling, packing, and shipping of tapes). Data can be migrated from HYDRAstor to physical tape at any point in the process where a customer may want to do so, allowing HYDRAstor to co-exist with existing tape infrastructure or entirely replace it, depending on customer requirements.

Many enterprises are still using tape for archiving purposes, but evolving e-discovery requirements are making active (disk-based) archiving more and more compelling. Grid architecture platforms like HYDRAstor that leverage SATA-based disk and storage capacity optimization technologies like DataRedux can realistically offer usable capacity at under $1/GB, and data protection technologies like DRD provide the kind of data reliability required for active archiving. Within a single physical HYDRAstor platform, virtual systems can be defined, although they will all perform data de-duplication against the same global repository. Separate backup and archive systems could be established within a single HYDRAstor grid, and replication could then be used to easily create and maintain a remote-site DR strategy for both systems.

Industry standards-based. Accelerator and Storage Nodes are based on commodity hardware platforms, running Intel-compatible processors, standard SAS or SATA disks, and the Linux operating system.

NEC based HYDRAstor nodes on industry standards like Intel, Linux, NFS, and CIFS as a conscious choice to keep costs down, make it easy to ride the performance and capacity curves for CPU and disk, and minimize integration costs. While the grid architecture is innovative and very germane to addressing enterprise backup, archiving, and DR issues, the crown jewel of the solution is clearly the DynamicStor software environment. NEC's product strategy will allow it to focus development resources on DynamicStor, which is the key to maintaining differentiation, while leveraging the huge development resources of Intel, Seagate, and other commodity hardware providers to stay competitive on the hardware side.

Basing the platform around industry standards has another positive implication for end users as well. The potential longevity of scale-out NAS platforms like HYDRAstor forms a strong part of the value proposition, since enterprise backup, archive, and DR require long-term solutions. HYDRAstor's ability to accommodate multiple technology generations without fork-lift upgrades provides a form of future proofing that will keep expansion costs low over long time horizons. Since customers can increase both performance and capacity independently, there is little risk of failing to achieve the right mix to meet a variety of storage platform requirements cost-effectively, a claim that cannot be made for monolithic architectures. Providing access to the HYDRAstor platform through industry-standard interfaces like NFS and CIFS, instead of the proprietary interfaces used by some vendors (particularly in the archive space), requires no custom coding for integration and entails less risk for customers, since these interfaces enjoy broad industry support.

Affordability. The HYDRAstor features that support affordability have already been discussed, but a key metric to evaluate here is the effective $/GB for usable capacity. Enterprise-class tape libraries today support a $/GB of between $.20 and $.30 (for configurations in the 100TB+ range and assuming 2:1 compression), while at list price HYDRAstor comes in at under $.50/GB (assuming the entry-level 1x2 with 24TB of raw capacity, the default DRD setting of 3, and a 20:1 storage capacity optimization ratio). While a detailed cost comparison is beyond the scope of this paper, we can provide some guidance on how to evaluate costs in your particular situation:

TAPE COSTS
- Tape acquisition costs (including tape drives and media replacement costs over time)
- Savings due to compression (assume 2:1, although it may be lower in your environment)
- 7x24 support costs (for most products these are in the 18-22% of list cost range per year)
- Energy costs (cost per kilowatt hour varies across the nation between roughly $.05 and $.15)
- Administrative overhead (time spent loading, labeling, packing, shipping, and retrieving tapes)
- Archive (may or may not apply depending on whether an archive is in use)

HYDRAstor COSTS
- Storage platform acquisition costs for base capacity (raw capacity plus DRD data protection)
- Savings due to storage capacity optimization (ratios vary between 10:1 and 20:1)
- 7x24 support costs (12% of list price)
- Energy costs (cost per kilowatt hour varies depending on location)
- Administrative overhead (ongoing management)
- Archive (may or may not apply depending on whether an archive is in use)

In general, tape will have a lower acquisition cost for raw storage capacity than disk, and may have energy costs that are 5% to 15% of the costs of disk; administrative costs will be greater than disk's, but may not represent more than 10% of the overall TCO over a 5-year period. For archiving, we're seeing Fortune 1000 companies frequently assume a minimum of $500K in discovery costs against tape for a single large lawsuit, and $1,500 every time a tape has to be searched.

For HYDRAstor, initial platform costs for the HS (assuming the entry-level configuration) provide a $/GB for raw capacity of $7.50 at list ($180K for 24TB). We also have to factor in the 25% overhead imposed by the default level of logical data redundancy for DRD, putting base capacity (raw capacity after logical data redundancy has been accounted for) at $9.38/GB. Storage capacity optimization ratios vary, but our research shows that for backup, most enterprises where storage capacity optimization is in use are achieving data reduction ratios between 10:1 and 20:1. A worst-case assumption of 10:1 would put HYDRAstor usable capacity at $.94/GB, while a 20:1 assumption would put it at $.47/GB. Energy costs for HYDRAstor may be 20 times higher than for tape, but note that over a 5-year period this may be a difference of only one hundred thousand dollars or so for configurations supporting 100TB to 200TB of data, an amount that pales in comparison to acquisition costs for these types of configurations.

It's difficult to put a monetary value on backup window, RPO, RTO, and recovery reliability advantages that would apply across the board, so we'll leave that to the end user. If archive is being used as well, though, economics can shift heavily in favor of disk. Using the $500K guideline estimate for tape-based discovery costs for a single large lawsuit, we know that e-discovery costs against an active archive are roughly 1/10 of that. This means that the savings associated with a single lawsuit for a large enterprise can easily outweigh tape's energy savings over a 5-year period for archives in the multi-hundred-terabyte range. How many lawsuits is your company involved with in one year? You may argue that you don't intend to use disk for archive, so these savings may not apply, but we would counter: with savings like these achievable, why wouldn't you consider an active archive?
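
The $/GB arithmetic above is easy to reproduce and re-run with your own assumptions. Note that this follows the paper's convention of grossing the raw $/GB up by the 25% DRD overhead before applying the data reduction ratio:

```python
def hydrastor_dollars_per_gb(list_price: float, raw_tb: float,
                             drd_overhead: float = 0.25,
                             reduction: float = 20.0):
    """Reproduces the arithmetic above: list price over raw capacity,
    grossed up for DRD data protection, then divided by the reduction ratio."""
    per_raw_gb = list_price / (raw_tb * 1000)       # $180K / 24,000GB = $7.50
    per_base_gb = per_raw_gb * (1 + drd_overhead)   # plus DRD overhead = $9.38
    return per_raw_gb, per_base_gb, per_base_gb / reduction

for ratio in (10.0, 20.0):
    raw, base, usable = hydrastor_dollars_per_gb(180_000, 24, reduction=ratio)
    print(f"{ratio:>4.0f}:1  raw ${raw:.2f}  base ${base:.2f}  usable ${usable:.2f} per GB")
# 10:1 -> usable $0.94/GB; 20:1 -> usable $0.47/GB
```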

There's one other key point to consider with active archives. We made the point earlier that 70% to 90% of the data that most customers keep on primary storage does not require primary storage's high performance. But the question is: if you are not going to keep this data on primary storage, where would you put it? It's not true archive data in the sense that it would very rarely be accessed, so you clearly would not want to put it on tape. Much of this data still needs to be on-line, but does not require the low latency access provided by primary storage. If an archive is disk-based, however, it is easy to make a case to move this data. Archiving software can ensure that the data is still referenceable by on-line applications, and an active archive will make it easily retrievable, albeit at rates slightly slower than primary storage. Certain data, like medical images, retained legal documents, and expense reports, may be born archive and can be stored immediately on an active archive.

If you could migrate even 50% of the data sitting on primary storage right now to an active archive, how much could you save? You would lower not only primary storage costs, but backup costs as well, since you'd be backing up less data on a regular basis. Any primary storage cost savings would be calculated at primary storage's much higher $/GB rates. Realistically, this could mean that in the first year or two after you create an active archive, you could need to buy very little or possibly no primary storage to accommodate growth, and then you would buy primary storage in successive years at a much slower rate.
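
To put rough numbers on that question, the sketch below compares the cost of the migrated capacity at primary versus active-archive $/GB. The inputs are purely illustrative; the $15/GB primary figure is our hypothetical, not a quoted price.

```python
def migration_savings(primary_tb: float, migrate_fraction: float,
                      primary_cost_gb: float, archive_cost_gb: float) -> float:
    """Capacity cost avoided by holding the migrated data on an active
    archive instead of primary storage (backup savings not included)."""
    moved_gb = primary_tb * 1000 * migrate_fraction
    return moved_gb * (primary_cost_gb - archive_cost_gb)

# 100TB of primary storage, half migrated, $15/GB primary vs $0.50/GB archive:
print(f"${migration_savings(100, 0.5, 15.0, 0.50):,.0f}")   # $725,000
```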

Key HYDRAstor Differentiators

Data de-duplication appliances targeted at secondary storage environments are one way to address storage capacity optimization needs, but they can suffer from limited scalability and performance relative to HYDRAstor. When dealing with large data stores, requirements like high single-stream performance that scales across multiple nodes, a global de-duplication repository, the ability to support tens of petabytes of usable capacity, and a graceful upgrade path that can accommodate new generations of technology without downtime are much better met by scale-out solutions like HYDRAstor than by standalone appliances.

Scale-out NAS platforms based on grid architectures and targeted specifically at secondary storage are available from several vendors, and they all support a highly available storage platform scalable into the petabyte range, but HYDRAstor is differentiated from these competitors in ways that are meaningful to end users. Key HYDRAstor differentiators include:

- A grid architecture platform that not only supports high-end performance and scalability but includes an integrated storage capacity optimization capability (DataRedux) that relies on sub-file, variable-length window data de-duplication to bring costs down well under $1/GB for usable capacity
- Rated at 16.5GB/sec, DataRedux is the industry's highest-throughput storage capacity optimization engine: the fastest in-line data de-duplication vendors today can offer roughly 1GB/sec of single-stream throughput, while the fastest post-processing vendors claim up to 9.6GB/sec; and because HYDRAstor uses an in-line approach, it minimizes capacity requirements and gets the data into capacity-optimized form, ready for replication to remote sites, faster than post-processing approaches can
- A data protection scheme (DRD) that is unique in the industry in allowing customers to dial in the level of data resiliency desired, and that in its default configuration provides 50% better protection than RAID 6 with less overhead for solid, cost-effective data reliability; this advantage can be much greater once bit error rates and drive rebuilds are taken into account, and one of DRD's standout features is its ability to maintain high-performance data access even during disk rebuilds, a capability unique to the way DRD lays data out on disk
- A unique distributed hash table that supports linear scalability even as HYDRAstor approaches its maximum configurations and, unlike competitive products, actually increases performance as more storage capacity is added

Taneja Group Opinion

Taneja Group is a proponent of leveraging disk for secondary storage applications where it makes sense. In backup, disk is a clear win over tape for near-term backup requirements, addressing operational issues such as backup window, RPO, RTO, and recovery reliability in ways that tape just cannot. For large enterprises, evolving e-discovery requirements are quickly turning economics in active archiving's favor. In many environments, tape will continue to play a role for many years to come, but it is increasingly being relegated to specific roles such as a second backup tier (for older data that is not accessed very frequently yet is still too recent to archive) or a final DR repository (export to tape after data has been replicated to a remote site). For large enterprises that are dealing with lawsuits on a regular basis, tape is becoming a less and less viable medium for archiving purposes on a pure economic basis, not to mention a speed-of-response one.

Massively scalable storage platforms based on grid architectures provide a compelling disk-based alternative for secondary storage applications. For these platforms to meet the functional and cost requirements, though, there are clearly certain features they must provide:

- A high performance platform scalable into the petabyte range
- Linear scalability combined with modular configuration flexibility that will allow the right mix of performance and capacity to be achieved for any secondary storage application
- High availability that rivals that of fault tolerant platforms (99.999%)
- A data protection scheme that overcomes the deficiencies of conventional RAID and meets data reliability requirements with high performance
- Storage capacity optimization technologies that significantly reduce the amount of raw storage capacity required to store data


WHY DO I NEED FALCONSTOR OPTIMIZED BACKUP & DEDUPLICATION? WHAT IS FALCONSTOR? FAQS FalconStor Optimized Backup and Deduplication is the industry s market-leading virtual tape and LAN-based deduplication solution, unmatched in performance and scalability. With

More information

Hitachi Adaptable Modular Storage and Workgroup Modular Storage

Hitachi Adaptable Modular Storage and Workgroup Modular Storage O V E R V I E W Hitachi Adaptable Modular Storage and Workgroup Modular Storage Modular Hitachi Storage Delivers Enterprise-level Benefits Hitachi Data Systems Hitachi Adaptable Modular Storage and Workgroup

More information

Next Generation Backup: Better ways to deal with rapid data growth and aging tape infrastructures

Next Generation Backup: Better ways to deal with rapid data growth and aging tape infrastructures Next Generation Backup: Better ways to deal with rapid data growth and aging tape infrastructures Next 1 What we see happening today. The amount of data businesses must cope with on a daily basis is getting

More information

Atlantis Computing Adds the Ability to Address Classic Server Workloads

Atlantis Computing Adds the Ability to Address Classic Server Workloads FLASH Atlantis Computing Adds the Ability to Address Classic Server Workloads Eric Burgener Brett Waldman IN THIS FLASH This IDC Flash discusses Atlantis Computing's In-Memory Storage technology and how

More information

Hyper-converged Secondary Storage for Backup with Deduplication Q & A. The impact of data deduplication on the backup process

Hyper-converged Secondary Storage for Backup with Deduplication Q & A. The impact of data deduplication on the backup process Hyper-converged Secondary Storage for Backup with Deduplication Q & A The impact of data deduplication on the backup process Table of Contents Introduction... 3 What is data deduplication?... 3 Is all

More information

DELL EMC DATA DOMAIN SISL SCALING ARCHITECTURE

DELL EMC DATA DOMAIN SISL SCALING ARCHITECTURE WHITEPAPER DELL EMC DATA DOMAIN SISL SCALING ARCHITECTURE A Detailed Review ABSTRACT While tape has been the dominant storage medium for data protection for decades because of its low cost, it is steadily

More information

Nutanix Tech Note. Virtualizing Microsoft Applications on Web-Scale Infrastructure

Nutanix Tech Note. Virtualizing Microsoft Applications on Web-Scale Infrastructure Nutanix Tech Note Virtualizing Microsoft Applications on Web-Scale Infrastructure The increase in virtualization of critical applications has brought significant attention to compute and storage infrastructure.

More information

Eight Tips for Better Archives. Eight Ways Cloudian Object Storage Benefits Archiving with Veritas Enterprise Vault

Eight Tips for Better  Archives. Eight Ways Cloudian Object Storage Benefits  Archiving with Veritas Enterprise Vault Eight Tips for Better Email Archives Eight Ways Cloudian Object Storage Benefits Email Archiving with Veritas Enterprise Vault Most organizations now manage terabytes, if not petabytes, of corporate and

More information

Cybernetics Virtual Tape Libraries Media Migration Manager Streamlines Flow of D2D2T Backup. April 2009

Cybernetics Virtual Tape Libraries Media Migration Manager Streamlines Flow of D2D2T Backup. April 2009 Cybernetics Virtual Tape Libraries Media Migration Manager Streamlines Flow of D2D2T Backup April 2009 Cybernetics has been in the business of data protection for over thirty years. Our data storage and

More information

The UnAppliance provides Higher Performance, Lower Cost File Serving

The UnAppliance provides Higher Performance, Lower Cost File Serving The UnAppliance provides Higher Performance, Lower Cost File Serving The UnAppliance is an alternative to traditional NAS solutions using industry standard servers and storage for a more efficient and

More information

The storage challenges of virtualized environments

The storage challenges of virtualized environments The storage challenges of virtualized environments The virtualization challenge: Ageing and Inflexible storage architectures Mixing of platforms causes management complexity Unable to meet the requirements

More information

Veeam Availability Solution for Cisco UCS: Designed for Virtualized Environments. Solution Overview Cisco Public

Veeam Availability Solution for Cisco UCS: Designed for Virtualized Environments. Solution Overview Cisco Public Veeam Availability Solution for Cisco UCS: Designed for Virtualized Environments Veeam Availability Solution for Cisco UCS: Designed for Virtualized Environments 1 2017 2017 Cisco Cisco and/or and/or its

More information

Automated Storage Tiering on Infortrend s ESVA Storage Systems

Automated Storage Tiering on Infortrend s ESVA Storage Systems Automated Storage Tiering on Infortrend s ESVA Storage Systems White paper Abstract This white paper introduces automated storage tiering on Infortrend s ESVA storage arrays. Storage tiering can generate

More information

A High-Performance Storage and Ultra- High-Speed File Transfer Solution for Collaborative Life Sciences Research

A High-Performance Storage and Ultra- High-Speed File Transfer Solution for Collaborative Life Sciences Research A High-Performance Storage and Ultra- High-Speed File Transfer Solution for Collaborative Life Sciences Research Storage Platforms with Aspera Overview A growing number of organizations with data-intensive

More information

Dell PowerVault MD Family. Modular storage. The Dell PowerVault MD storage family

Dell PowerVault MD Family. Modular storage. The Dell PowerVault MD storage family Dell MD Family Modular storage The Dell MD storage family Dell MD Family Simplifying IT The Dell MD Family simplifies IT by optimizing your data storage architecture and ensuring the availability of your

More information

IBM řešení pro větší efektivitu ve správě dat - Store more with less

IBM řešení pro větší efektivitu ve správě dat - Store more with less IBM řešení pro větší efektivitu ve správě dat - Store more with less IDG StorageWorld 2012 Rudolf Hruška Information Infrastructure Leader IBM Systems & Technology Group rudolf_hruska@cz.ibm.com IBM Agenda

More information

IBM Real-time Compression and ProtecTIER Deduplication

IBM Real-time Compression and ProtecTIER Deduplication Compression and ProtecTIER Deduplication Two technologies that work together to increase storage efficiency Highlights Reduce primary storage capacity requirements with Compression Decrease backup data

More information

NEC HYDRAstor Date: September, 2009 Author: Terri McClure, Senior Analyst, and Lauren Whitehouse, Senior Analyst

NEC HYDRAstor Date: September, 2009 Author: Terri McClure, Senior Analyst, and Lauren Whitehouse, Senior Analyst Product Brief NEC HYDRAstor Date: September, 2009 Author: Terri McClure, Senior Analyst, and Lauren Whitehouse, Senior Analyst Abstract: With its latest HYDRAstor release, NEC has rounded out the platform

More information

FLASHARRAY//M Business and IT Transformation in 3U

FLASHARRAY//M Business and IT Transformation in 3U FLASHARRAY//M Business and IT Transformation in 3U TRANSFORM IT Who knew that moving to all-flash storage could help reduce the cost of IT? FlashArray//m makes server and workload investments more productive,

More information

How to Protect Your Small or Midsized Business with Proven, Simple, and Affordable VMware Virtualization

How to Protect Your Small or Midsized Business with Proven, Simple, and Affordable VMware Virtualization How to Protect Your Small or Midsized Business with Proven, Simple, and Affordable VMware Virtualization January 2011 Business continuity and disaster recovery (BC/DR) planning is becoming a critical mandate

More information

HOW DATA DEDUPLICATION WORKS A WHITE PAPER

HOW DATA DEDUPLICATION WORKS A WHITE PAPER HOW DATA DEDUPLICATION WORKS A WHITE PAPER HOW DATA DEDUPLICATION WORKS ABSTRACT IT departments face explosive data growth, driving up costs of storage for backup and disaster recovery (DR). For this reason,

More information

Hyper-Converged Infrastructure: Providing New Opportunities for Improved Availability

Hyper-Converged Infrastructure: Providing New Opportunities for Improved Availability Hyper-Converged Infrastructure: Providing New Opportunities for Improved Availability IT teams in companies of all sizes face constant pressure to meet the Availability requirements of today s Always-On

More information

RAID SEMINAR REPORT /09/2004 Asha.P.M NO: 612 S7 ECE

RAID SEMINAR REPORT /09/2004 Asha.P.M NO: 612 S7 ECE RAID SEMINAR REPORT 2004 Submitted on: Submitted by: 24/09/2004 Asha.P.M NO: 612 S7 ECE CONTENTS 1. Introduction 1 2. The array and RAID controller concept 2 2.1. Mirroring 3 2.2. Parity 5 2.3. Error correcting

More information

DELL EMC DATA DOMAIN EXTENDED RETENTION SOFTWARE

DELL EMC DATA DOMAIN EXTENDED RETENTION SOFTWARE WHITEPAPER DELL EMC DATA DOMAIN EXTENDED RETENTION SOFTWARE A Detailed Review ABSTRACT This white paper introduces Dell EMC Data Domain Extended Retention software that increases the storage scalability

More information

A Practical Guide to Cost-Effective Disaster Recovery Planning

A Practical Guide to Cost-Effective Disaster Recovery Planning White Paper PlateSpin A Practical Guide to Cost-Effective Disaster Recovery Planning Organizations across the globe are finding disaster recovery increasingly important for a number of reasons. With the

More information

Disk-Based Data Protection Architecture Comparisons

Disk-Based Data Protection Architecture Comparisons Disk-Based Data Protection Architecture Comparisons Abstract The dramatic drop in the price of hard disk storage combined with its performance characteristics has given rise to a number of data protection

More information

If you knew then...what you know now. The Why, What and Who of scale-out storage

If you knew then...what you know now. The Why, What and Who of scale-out storage If you knew then......what you know now The Why, What and Who of scale-out storage The Why? Calculating your storage needs has never been easy and now it is becoming more complicated. With Big Data (pssst

More information

iscsi Technology Brief Storage Area Network using Gbit Ethernet The iscsi Standard

iscsi Technology Brief Storage Area Network using Gbit Ethernet The iscsi Standard iscsi Technology Brief Storage Area Network using Gbit Ethernet The iscsi Standard On February 11 th 2003, the Internet Engineering Task Force (IETF) ratified the iscsi standard. The IETF was made up of

More information

SolidFire and Pure Storage Architectural Comparison

SolidFire and Pure Storage Architectural Comparison The All-Flash Array Built for the Next Generation Data Center SolidFire and Pure Storage Architectural Comparison June 2014 This document includes general information about Pure Storage architecture as

More information

EMC DATA DOMAIN PRODUCT OvERvIEW

EMC DATA DOMAIN PRODUCT OvERvIEW EMC DATA DOMAIN PRODUCT OvERvIEW Deduplication storage for next-generation backup and archive Essentials Scalable Deduplication Fast, inline deduplication Provides up to 65 PBs of logical storage for long-term

More information

NetApp SolidFire and Pure Storage Architectural Comparison A SOLIDFIRE COMPETITIVE COMPARISON

NetApp SolidFire and Pure Storage Architectural Comparison A SOLIDFIRE COMPETITIVE COMPARISON A SOLIDFIRE COMPETITIVE COMPARISON NetApp SolidFire and Pure Storage Architectural Comparison This document includes general information about Pure Storage architecture as it compares to NetApp SolidFire.

More information

Zero Data Loss Recovery Appliance DOAG Konferenz 2014, Nürnberg

Zero Data Loss Recovery Appliance DOAG Konferenz 2014, Nürnberg Zero Data Loss Recovery Appliance Frank Schneede, Sebastian Solbach Systemberater, BU Database, Oracle Deutschland B.V. & Co. KG Safe Harbor Statement The following is intended to outline our general product

More information

New Approach to Unstructured Data

New Approach to Unstructured Data Innovations in All-Flash Storage Deliver a New Approach to Unstructured Data Table of Contents Developing a new approach to unstructured data...2 Designing a new storage architecture...2 Understanding

More information

LEVERAGING A PERSISTENT HARDWARE ARCHITECTURE

LEVERAGING A PERSISTENT HARDWARE ARCHITECTURE WHITE PAPER I JUNE 2010 LEVERAGING A PERSISTENT HARDWARE ARCHITECTURE How an Open, Modular Storage Platform Gives Enterprises the Agility to Scale On Demand and Adapt to Constant Change. LEVERAGING A PERSISTENT

More information

DEDUPLICATION BASICS

DEDUPLICATION BASICS DEDUPLICATION BASICS 4 DEDUPE BASICS 6 WHAT IS DEDUPLICATION 8 METHODS OF DEDUPLICATION 10 DEDUPLICATION EXAMPLE 12 HOW DO DISASTER RECOVERY & ARCHIVING FIT IN? 14 DEDUPLICATION FOR EVERY BUDGET QUANTUM

More information

HYDRAstor: a Scalable Secondary Storage

HYDRAstor: a Scalable Secondary Storage HYDRAstor: a Scalable Secondary Storage 7th TF-Storage Meeting September 9 th 00 Łukasz Heldt Largest Japanese IT company $4 Billion in annual revenue 4,000 staff www.nec.com Polish R&D company 50 engineers

More information

DELL POWERVAULT MD FAMILY MODULAR STORAGE THE DELL POWERVAULT MD STORAGE FAMILY

DELL POWERVAULT MD FAMILY MODULAR STORAGE THE DELL POWERVAULT MD STORAGE FAMILY DELL MD FAMILY MODULAR STORAGE THE DELL MD STORAGE FAMILY Simplifying IT The Dell PowerVault MD family can simplify IT by optimizing your data storage architecture and ensuring the availability of your

More information

Storageflex HA3969 High-Density Storage: Key Design Features and Hybrid Connectivity Benefits. White Paper

Storageflex HA3969 High-Density Storage: Key Design Features and Hybrid Connectivity Benefits. White Paper Storageflex HA3969 High-Density Storage: Key Design Features and Hybrid Connectivity Benefits White Paper Abstract This white paper introduces the key design features and hybrid FC/iSCSI connectivity benefits

More information

Veritas NetBackup on Cisco UCS S3260 Storage Server

Veritas NetBackup on Cisco UCS S3260 Storage Server Veritas NetBackup on Cisco UCS S3260 Storage Server This document provides an introduction to the process for deploying the Veritas NetBackup master server and media server on the Cisco UCS S3260 Storage

More information

EMC Virtual Infrastructure for Microsoft Applications Data Center Solution

EMC Virtual Infrastructure for Microsoft Applications Data Center Solution EMC Virtual Infrastructure for Microsoft Applications Data Center Solution Enabled by EMC Symmetrix V-Max and Reference Architecture EMC Global Solutions Copyright and Trademark Information Copyright 2009

More information

Global Headquarters: 5 Speen Street Framingham, MA USA P F

Global Headquarters: 5 Speen Street Framingham, MA USA P F Global Headquarters: 5 Speen Street Framingham, MA 01701 USA P.508.872.8200 F.508.935.4015 www.idc.com W H I T E P A P E R T h e B u s i n e s s C a s e f o r M i d r a n g e T a p e L i b r a r y S o

More information

EMC Virtual Infrastructure for Microsoft Exchange 2010 Enabled by EMC Symmetrix VMAX, VMware vsphere 4, and Replication Manager

EMC Virtual Infrastructure for Microsoft Exchange 2010 Enabled by EMC Symmetrix VMAX, VMware vsphere 4, and Replication Manager EMC Virtual Infrastructure for Microsoft Exchange 2010 Enabled by EMC Symmetrix VMAX, VMware vsphere 4, and Replication Manager Reference Architecture Copyright 2010 EMC Corporation. All rights reserved.

More information

HP s VLS9000 and D2D4112 deduplication systems

HP s VLS9000 and D2D4112 deduplication systems Silverton Consulting StorInt Briefing Introduction Particularly in today s economy, costs and return on investment (ROI) often dominate product selection decisions. However, gathering the appropriate information

More information

Discover the all-flash storage company for the on-demand world

Discover the all-flash storage company for the on-demand world Discover the all-flash storage company for the on-demand world STORAGE FOR WHAT S NEXT The applications we use in our personal lives have raised the level of expectations for the user experience in enterprise

More information

All-Flash Storage Solution for SAP HANA:

All-Flash Storage Solution for SAP HANA: All-Flash Storage Solution for SAP HANA: Storage Considerations using SanDisk Solid State Devices WHITE PAPER Western Digital Technologies, Inc. 951 SanDisk Drive, Milpitas, CA 95035 www.sandisk.com Table

More information

Virtualization of the MS Exchange Server Environment

Virtualization of the MS Exchange Server Environment MS Exchange Server Acceleration Maximizing Users in a Virtualized Environment with Flash-Powered Consolidation Allon Cohen, PhD OCZ Technology Group Introduction Microsoft (MS) Exchange Server is one of

More information

White Paper. EonStor GS Family Best Practices Guide. Version: 1.1 Updated: Apr., 2018

White Paper. EonStor GS Family Best Practices Guide. Version: 1.1 Updated: Apr., 2018 EonStor GS Family Best Practices Guide White Paper Version: 1.1 Updated: Apr., 2018 Abstract: This guide provides recommendations of best practices for installation and configuration to meet customer performance

More information

Disk-to-Disk-to-Tape (D2D2T)

Disk-to-Disk-to-Tape (D2D2T) Disk-to-Disk-to-Tape (D2D2T) Where Disk Fits Into Backup Tape originated in the 1950 s as the primary storage device for computers. It was one of the fi rst ways to store data beyond the memory of a computer,

More information

Get More Out of Storage with Data Domain Deduplication Storage Systems

Get More Out of Storage with Data Domain Deduplication Storage Systems 1 Get More Out of Storage with Data Domain Deduplication Storage Systems David M. Auslander Sales Director, New England / Eastern Canada 2 EMC Data Domain Dedupe everything without changing anything Simplify

More information

Tegile Enters the All-Flash Array Market with Super Density Offering

Tegile Enters the All-Flash Array Market with Super Density Offering FLASH Tegile Enters the All-Flash Array Market with Super Density Offering Eric Burgener IN THIS FLASH This IDC Flash discusses the recent Tegile announcement, just prior to VMworld 2015 in San Francisco,

More information

WHITE PAPER. How Deduplication Benefits Companies of All Sizes An Acronis White Paper

WHITE PAPER. How Deduplication Benefits Companies of All Sizes An Acronis White Paper How Deduplication Benefits Companies of All Sizes An Acronis White Paper Copyright Acronis, Inc., 2000 2009 Table of contents Executive Summary... 3 What is deduplication?... 4 File-level deduplication

More information

Evaluation Report: Improving SQL Server Database Performance with Dot Hill AssuredSAN 4824 Flash Upgrades

Evaluation Report: Improving SQL Server Database Performance with Dot Hill AssuredSAN 4824 Flash Upgrades Evaluation Report: Improving SQL Server Database Performance with Dot Hill AssuredSAN 4824 Flash Upgrades Evaluation report prepared under contract with Dot Hill August 2015 Executive Summary Solid state

More information

Storage Designed to Support an Oracle Database. White Paper

Storage Designed to Support an Oracle Database. White Paper Storage Designed to Support an Oracle Database White Paper Abstract Databases represent the backbone of most organizations. And Oracle databases in particular have become the mainstream data repository

More information

Lenovo RAID Introduction Reference Information

Lenovo RAID Introduction Reference Information Lenovo RAID Introduction Reference Information Using a Redundant Array of Independent Disks (RAID) to store data remains one of the most common and cost-efficient methods to increase server's storage performance,

More information

Symantec NetBackup 7 for VMware

Symantec NetBackup 7 for VMware V-Ray visibility into virtual machine protection Overview There s little question that server virtualization is the single biggest game-changing trend in IT today. Budget-strapped IT departments are racing

More information

Virtual Security Server

Virtual Security Server Data Sheet VSS Virtual Security Server Security clients anytime, anywhere, any device CENTRALIZED CLIENT MANAGEMENT UP TO 50% LESS BANDWIDTH UP TO 80 VIDEO STREAMS MOBILE ACCESS INTEGRATED SECURITY SYSTEMS

More information

Cisco UCS Mini Software-Defined Storage with StorMagic SvSAN for Remote Offices

Cisco UCS Mini Software-Defined Storage with StorMagic SvSAN for Remote Offices Solution Overview Cisco UCS Mini Software-Defined Storage with StorMagic SvSAN for Remote Offices BENEFITS Cisco UCS and StorMagic SvSAN deliver a solution to the edge: Single addressable storage pool

More information

Preserving the World s Most Important Data. Yours. SYSTEMS AT-A-GLANCE: KEY FEATURES AND BENEFITS

Preserving the World s Most Important Data. Yours. SYSTEMS AT-A-GLANCE: KEY FEATURES AND BENEFITS Preserving the World s Most Important Data. Yours. SYSTEMS AT-A-GLANCE: KEY FEATURES AND BENEFITS We are the only company to integrate disk, tape, and replication in a single solution set for better near-term

More information

Global Headquarters: 5 Speen Street Framingham, MA USA P F

Global Headquarters: 5 Speen Street Framingham, MA USA P F Global Headquarters: 5 Speen Street Framingham, MA 01701 USA P.508.872.8200 F.508.935.4015 www.idc.com W H I T E P A P E R T h e R e a l i t y o f D a t a P r o t e c t i o n a n d R e c o v e r y a n

More information

White Paper. Low Cost High Availability Clustering for the Enterprise. Jointly published by Winchester Systems Inc. and Red Hat Inc.

White Paper. Low Cost High Availability Clustering for the Enterprise. Jointly published by Winchester Systems Inc. and Red Hat Inc. White Paper Low Cost High Availability Clustering for the Enterprise Jointly published by Winchester Systems Inc. and Red Hat Inc. Linux Clustering Moves Into the Enterprise Mention clustering and Linux

More information

Chapter 1. Storage Concepts. CommVault Concepts & Design Strategies: https://www.createspace.com/

Chapter 1. Storage Concepts. CommVault Concepts & Design Strategies: https://www.createspace.com/ Chapter 1 Storage Concepts 4 - Storage Concepts In order to understand CommVault concepts regarding storage management we need to understand how and why we protect data, traditional backup methods, and

More information

Executive Summary. The Need for Shared Storage. The Shared Storage Dilemma for the SMB. The SMB Answer - DroboElite. Enhancing your VMware Environment

Executive Summary. The Need for Shared Storage. The Shared Storage Dilemma for the SMB. The SMB Answer - DroboElite. Enhancing your VMware Environment Executive Summary The Need for Shared Storage The Shared Storage Dilemma for the SMB The SMB Answer - DroboElite Enhancing your VMware Environment Ideal for Virtualized SMB Conclusion Executive Summary

More information

Archiving, Backup, and Recovery for Complete the Promise of Virtualisation Unified information management for enterprise Windows environments

Archiving, Backup, and Recovery for Complete the Promise of Virtualisation Unified information management for enterprise Windows environments Archiving, Backup, and Recovery for Complete the Promise of Virtualisation Unified information management for enterprise Windows environments The explosion of unstructured information It is estimated that

More information

IBM TS7700 grid solutions for business continuity

IBM TS7700 grid solutions for business continuity IBM grid solutions for business continuity Enhance data protection and business continuity for mainframe environments in the cloud era Highlights Help ensure business continuity with advanced features

More information

DAHA AKILLI BĐR DÜNYA ĐÇĐN BĐLGĐ ALTYAPILARIMIZI DEĞĐŞTĐRECEĞĐZ

DAHA AKILLI BĐR DÜNYA ĐÇĐN BĐLGĐ ALTYAPILARIMIZI DEĞĐŞTĐRECEĞĐZ Information Infrastructure Forum, Istanbul DAHA AKILLI BĐR DÜNYA ĐÇĐN BĐLGĐ ALTYAPILARIMIZI DEĞĐŞTĐRECEĞĐZ 2010 IBM Corporation Information Infrastructure Forum, Istanbul IBM XIV Veri Depolama Sistemleri

More information

SYSTEM UPGRADE, INC Making Good Computers Better. System Upgrade Teaches RAID

SYSTEM UPGRADE, INC Making Good Computers Better. System Upgrade Teaches RAID System Upgrade Teaches RAID In the growing computer industry we often find it difficult to keep track of the everyday changes in technology. At System Upgrade, Inc it is our goal and mission to provide

More information

DATA PROTECTION IN A ROBO ENVIRONMENT

DATA PROTECTION IN A ROBO ENVIRONMENT Reference Architecture DATA PROTECTION IN A ROBO ENVIRONMENT EMC VNX Series EMC VNXe Series EMC Solutions Group April 2012 Copyright 2012 EMC Corporation. All Rights Reserved. EMC believes the information

More information

Quest DR Series Disk Backup Appliances

Quest DR Series Disk Backup Appliances Quest DR Series Disk Backup Appliances Back up more. Store less. Perform better. Keeping up with the volume of data to protect can be complex and time consuming, but managing the storage of that data doesn

More information

Midsize Enterprise Solutions Selling Guide. Sell NetApp s midsize enterprise solutions and take your business and your customers further, faster

Midsize Enterprise Solutions Selling Guide. Sell NetApp s midsize enterprise solutions and take your business and your customers further, faster Midsize Enterprise Solutions Selling Guide Sell NetApp s midsize enterprise solutions and take your business and your customers further, faster Many of your midsize customers might have tried to reduce

More information

De-dupe: It s not a question of if, rather where and when! What to Look for and What to Avoid

De-dupe: It s not a question of if, rather where and when! What to Look for and What to Avoid De-dupe: It s not a question of if, rather where and when! What to Look for and What to Avoid By Greg Schulz Founder and Senior Analyst, the StorageIO Group Author The Green and Virtual Data Center (CRC)

More information

First Financial Bank. Highly available, centralized, tiered storage brings simplicity, reliability, and significant cost advantages to operations

First Financial Bank. Highly available, centralized, tiered storage brings simplicity, reliability, and significant cost advantages to operations Customer Profile First Financial Bank Highly available, centralized, tiered storage brings simplicity, reliability, and significant cost advantages to operations A midsize community bank with a service

More information

Management Update: Storage Management TCO Considerations

Management Update: Storage Management TCO Considerations IGG-09172003-01 C. Stanley Article 17 September 2003 Management Update: Storage Management TCO Considerations CIOs, asset managers, data center managers and business managers should be aware of the total

More information

CONFIGURATION GUIDE WHITE PAPER JULY ActiveScale. Family Configuration Guide

CONFIGURATION GUIDE WHITE PAPER JULY ActiveScale. Family Configuration Guide WHITE PAPER JULY 2018 ActiveScale Family Configuration Guide Introduction The world is awash in a sea of data. Unstructured data from our mobile devices, emails, social media, clickstreams, log files,

More information

IBM XIV Storage System

IBM XIV Storage System IBM XIV Storage System Technical Description IBM XIV Storage System Storage Reinvented Performance The IBM XIV Storage System offers a new level of high-end disk system performance and reliability. It

More information

HCI: Hyper-Converged Infrastructure

HCI: Hyper-Converged Infrastructure Key Benefits: Innovative IT solution for high performance, simplicity and low cost Complete solution for IT workloads: compute, storage and networking in a single appliance High performance enabled by

More information

The World s Fastest Backup Systems

The World s Fastest Backup Systems 3 The World s Fastest Backup Systems Erwin Freisleben BRS Presales Austria 4 EMC Data Domain: Leadership and Innovation A history of industry firsts 2003 2004 2005 2006 2007 2008 2009 2010 2011 First deduplication

More information