A High-Performance Storage and Ultra-High-Speed File Transfer Solution for Collaborative Life Sciences Research

BlueArc Storage Platforms with Aspera

Overview

A growing number of organizations with data-intensive application environments in research, life sciences, high-performance computing, financial services, and media and entertainment have adopted file-based workflows for the creation and delivery of data and media. As a result, location independence is a driving force in the way business and research are conducted. Files of virtually every imaginable size and format need to be stored and transported between facilities, delivered to distribution hubs, backed up and archived over national and global distances. But moving many gigabytes or terabytes of data over public or private IP networks can introduce great inefficiencies in a digital workflow. Data throughput and delivery times are susceptible to TCP-based bottlenecks that get worse as distances increase. These bottlenecks can seriously impact productivity and make it harder to meet aggressive schedules, time-to-market and revenue goals.

Figure 1. Collaboration and sharing in genome sequencing environments

Collaborative research is common in Life Sciences, where geographically dispersed sites within a single organization or a group of organizations share data, results and resources. The data often originates from genome sequencing instruments and goes through multiple heavy computational analysis stages to extract the desired information, resulting in a wide range of file sizes and types. Data and results can be shared at any step in the process, depending on how the workflow is set up and how the research facilities are collaborating and sharing resources. Science and methodologies change rapidly and require the infrastructure to be flexible and to support changing workload and communication needs.

Aspera's fasp transport technology and BlueArc high-performance storage platforms create a formidable combination: lightning-fast data access that matches storage performance.
A Fully Optimized 6 Gbps Distance-independent WAN Transfer Solution

Aspera's patented fasp protocol eliminates the devastating effects of packet loss and latency that plague standard TCP transport solutions over long distances. fasp is capable of exceeding the WAN transfer performance and scalability limits of traditional TCP-based protocols, specifically NFS/CIFS in network-attached storage environments, and FTP, HTTP, SCP and RSYNC in application environments. Aspera software runs natively on all major operating systems (Windows, Linux, Mac) and leverages existing client/server hardware architectures and networking technologies.

Aspera solutions integrate fully with BlueArc's Mercury and Titan storage platforms and suite of optimization tools for data management. With BlueArc's patented Silicon Accelerated File System architecture and its efficient metadata handling, exceptional performance can be achieved from even the slowest disk type or tier of storage, while Aspera fasp transport enables distance-neutral data access that scales linearly with WAN bandwidth.
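To give a rough sense of why loss and latency hurt standard TCP so badly, the sketch below evaluates the widely cited Mathis et al. steady-state approximation for a single TCP stream, rate ≈ (MSS / RTT) × (C / √p). This model is not part of the white paper or its test method. The 1460-byte MSS, the constant C = √1.5 and the sample loss and round-trip values are assumptions chosen for illustration, and real protocol stacks, tuning and parallel streams will behave differently.

```python
import math

MSS_BYTES = 1460            # assumption: typical Ethernet-sized TCP payload
MATHIS_C = math.sqrt(1.5)   # constant used in the Mathis et al. approximation


def tcp_ceiling_bps(rtt_s: float, loss: float, mss_bytes: int = MSS_BYTES) -> float:
    """Approximate upper bound on one TCP stream's throughput, in bits per second.

    rtt_s is the round-trip time in seconds and loss is the packet loss
    probability (0.01 means 1%). This is a textbook model, not a measurement.
    """
    if rtt_s <= 0 or not 0 < loss <= 1:
        raise ValueError("rtt_s must be positive and loss must be in (0, 1]")
    return (mss_bytes * 8 / rtt_s) * (MATHIS_C / math.sqrt(loss))


if __name__ == "__main__":
    # Illustrative (RTT in ms, loss in %) pairs, from a short clean path to a long lossy one.
    for rtt_ms, loss_pct in ((10, 0.1), (100, 1.0), (300, 5.0)):
        bps = tcp_ceiling_bps(rtt_ms / 1000, loss_pct / 100)
        print(f"RTT {rtt_ms:>3} ms, loss {loss_pct:>3}%: ~{bps / 1e6:.2f} Mbps per stream")
```

The point of the model is the shape of the curve: the ceiling falls in proportion to round-trip time and to the square root of the loss rate, independent of how much bandwidth the link offers.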

Testing High-Performance Data Storage and Transport

To evaluate BlueArc storage platforms and Aspera fasp transport technology, a test environment was used to simulate real-world conditions. The primary goal of this testing was to measure the performance of the Aspera and BlueArc solution, and to determine how Gbps-level transfer rates could be achieved under varying WAN conditions (packet loss and latency combinations). To stress both the storage and transfer protocols, 200+ GB files and a great number of small files averaging 5MB in size were transferred between BlueArc storage nodes. This reflects the typical mix of files found in Life Sciences research environments, where data originating from Next Generation Sequencing instruments is managed and analyzed, generating derivative data sets and results of different types.

Joint lab testing demonstrated WAN transfer speeds of up to 6.1Gbps across the wide range of file sizes, structures and types. This throughput was observed over a variety of adverse network conditions, including worst-case global Internet scenarios with very high latency and packet loss (5% packet loss and 300 ms latency). The combined BlueArc/Aspera storage and transfer system is capable of transferring 1TB of data in about 20 minutes, regardless of the WAN distance.

The 10Gbps test topology is shown below: real-world WAN conditions (latency and packet loss) were achieved using the Apposite Linktropy 10Gb Ethernet WAN emulator. Commodity network and server systems were used to route data and run Aspera server software (responsible for initiating and managing the file transmission from source to destination). BlueArc's high-performance network storage platforms served as the send/receive storage nodes to accommodate the high-speed data transfer applications.
The Test Environment

Figure 2 shows the end-to-end WAN data flow from Research Facility A to Research Facility B. The components labeled in the diagram are:
- Research Facility A: Titan 3210 (read from storage), LAN switch (10GbE) Brocade TurboIron 24x, 4 hosts (sending side) running Aspera software
- WAN simulator (10GbE): Apposite Netropy 10G WAN Emulator (transfer across WAN)
- Research Facility B: LAN switch (10GbE) Brocade TurboIron 24x, Titan 3210 (write to storage), 6 hosts (receiving side) running Aspera software
- Research data: Illumina sequencers
- Storage (RC12): 60 x 450G SAS HDDs = 27TB total
- Storage (RC12): 120 x 300G SAS HDDs = 36TB total

Figure 2: 10Gbps network topology, server, instrument and storage configuration

Testing Detail

All tests were performed over a 10Gbps Ethernet WAN environment. The Apposite Linktropy WAN emulator introduced 0.1%, 1.0% and 5% packet loss, with 10ms, 100ms and 300ms round-trip time (RTT). All packet loss and latency combinations were tested, simulating metro, national and global WAN distances with best-case private WAN performance and worst-case Internet loss. Output performance metrics were used to calculate transfer times, receive/send efficiency and data throughput speeds.
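The full set of emulated conditions can be enumerated as a simple cross product of the three loss rates and three round-trip times given above. The sketch below is only an illustration of that test matrix. The function and field names are hypothetical, and the mapping of 10, 100 and 300 ms to metro, national and global distances is an assumption inferred from the order in which the text lists them.

```python
from itertools import product

LOSS_PERCENT = (0.1, 1.0, 5.0)   # packet loss rates named in the text
RTT_MS = (10, 100, 300)          # round-trip times named in the text

# Assumption: the text lists distances in the same order as the RTT values.
DISTANCE_LABEL = {10: "metro", 100: "national", 300: "global"}


def wan_test_matrix():
    """Return every RTT and packet loss combination as a list of dicts."""
    return [
        {"rtt_ms": rtt, "loss_percent": loss, "distance": DISTANCE_LABEL[rtt]}
        for rtt, loss in product(RTT_MS, LOSS_PERCENT)
    ]


if __name__ == "__main__":
    for case in wan_test_matrix():
        print(f"{case['distance']:<8} RTT {case['rtt_ms']:>3} ms, "
              f"loss {case['loss_percent']}%")
```

Three loss rates crossed with three round-trip times give nine WAN conditions, from the best case (10 ms, 0.1% loss) to the worst case (300 ms, 5% loss).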

Transfer times and send/receive efficiency. Because standard TCP-based protocols are greatly affected by network latency and packet loss in long-distance data transfers, WAN emulation hardware is essential for objectively measuring Aspera's fasp performance against that of TCP-based protocols such as CIFS, NFS, FTP and HTTP.

Throughput. BlueArc storage platform throughput was measured by transferring a data set designed to be representative of multiple next generation sequencing data sets, including the changing characteristics due to changes in instruments and applications:
- multiple data files averaging 5MB in size and totaling 203GB
- multiple data files averaging 5GB in size and totaling 209GB
- multiple data files averaging 250MB in size and totaling 1TB
- multiple data files averaging 5MB in size and totaling 1TB
- individual 201GB data files

File types and sizes ranged from large image files through intermediate analysis results to smaller text files. In each case, the files were segmented and transferred using 96 concurrent sessions, demonstrating the BlueArc storage platform's ability to handle IOPS-intensive data sourcing.

The Results: An average of 5Gbps or more for all WAN combinations

BlueArc storage transfer throughput over 10GbE WAN under moderate packet loss and latency conditions reached a sustained 5.5Gbps. Under worst-case Internet packet loss and latency conditions (5% loss / 300 ms latency), throughput was sustained at 4.3Gbps. At these throughput rates, transferring 1TB over any metro, national or global WAN would take a little more than 20 minutes. Moving the same amount of data using NFS, CIFS, FTP, HTTP, RSYNC and SCP would take 24 hours under the best-case WAN condition. Under worst-case network conditions, it is likely that the same transfers would never successfully complete. The diagram below shows the Aspera/BlueArc performance compared to TCP-based protocols over the different WAN latency and packet loss combinations.
The performance of Aspera's fasp is nearly linear over all WAN conditions, yielding a near 3000x improvement over TCP.

Figure 3: 10Gbps storage WAN throughput: Aspera fasp compared to TCP-based transport solutions.
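The transfer times quoted above follow from simple arithmetic, which the sketch below reproduces for the three reported throughput figures (6.1, 5.5 and 4.3 Gbps). It also back-computes the average rate implied by moving 1TB in 24 hours. The sketch assumes decimal units (1TB = 10^12 bytes, 1Gbps = 10^9 bits per second) and ignores protocol overhead; the function names are hypothetical.

```python
TB = 10**12  # assumption: decimal terabyte, in bytes


def transfer_seconds(size_bytes: float, rate_bps: float) -> float:
    """Time to move size_bytes at a sustained rate of rate_bps (bits/s)."""
    return size_bytes * 8 / rate_bps


def implied_rate_bps(size_bytes: float, seconds: float) -> float:
    """Average rate (bits/s) implied by moving size_bytes in the given time."""
    return size_bytes * 8 / seconds


if __name__ == "__main__":
    for gbps in (6.1, 5.5, 4.3):
        minutes = transfer_seconds(TB, gbps * 1e9) / 60
        print(f"1TB at {gbps} Gbps: {minutes:.1f} minutes")

    tcp_rate = implied_rate_bps(TB, 24 * 3600)
    print(f"1TB in 24 hours implies about {tcp_rate / 1e6:.1f} Mbps")
```

Under these assumptions, 1TB takes roughly 22 minutes at the 6.1Gbps peak and about 31 minutes at the 4.3Gbps worst-case rate, while the 24-hour best case for the TCP-based protocols corresponds to an average of a little under 100 Mbps.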

Summary

In a variety of enterprises and research organizations, ranging from life sciences, financial services and high-performance computing to media and entertainment, incredible amounts of data need to be stored, exchanged and distributed quickly, predictably and reliably. Legacy file transport and storage technologies cannot keep up with growing file sizes, high-speed WAN deployments and the need for global, dynamic data access.

BlueArc storage platforms allow organizations to incorporate their existing storage infrastructure in a dynamic, tiered, high-performance storage environment that can be scaled to accommodate future growth. Aspera's fasp ultra-high-speed transfer technology scales to take full advantage of any WAN bandwidth, independent of distance or packet loss. Using existing infrastructures, this technology combination enables massive volumes of information to be accessed, managed and moved on a global basis, at speeds that meet the mission-critical nature of today's data access. Deployed in conjunction with Aspera software, BlueArc storage delivers real solutions that allow organizations to meet the demand for data access and ensure that business, time-to-market and revenue goals are met.
TEST SYSTEM CONFIGURATION

BlueArc storage node
- File system consisted of 120 LSI 300GB, 15,000-rpm SAS write drives, RAID 5+1
- One file system and one storage pool consisting of 120 LSI RC12 300GB, 15,000-rpm SAS drives using RAID5 5+1@64K volume groups, 20 in total
- 16 mount points per client (16 independent TCP connections between each storage client and the storage server)
- Linux servers running CentOS release 5.5 (Final) and Red Hat Enterprise Linux Server release 5.5 (Tikanga):
  - 1 Dell R710 with 2 Quad-Core Xeon 2.5 GHz processors, 24 GB RAM
  - 2 SuperMicro Intel x5680 @ 3.33 GHz, 2 x 6 core, 48 GB RAM
  - 1 SuperMicro Intel x5420 @ 2.5 GHz, 2 x 4 core, 16 GB RAM
- Dual-port 10GbE NIC per host server
- Apposite Linktropy 10G WAN Emulator
- Brocade TurboIron 10GbE switch with VLAN configuration
- FC connectivity to the LSI storage array
- 1 file system and one storage pool consisting of 60 LSI RC12 450GB, 15,000-rpm SAS drives using RAID5 5+1@64K volume groups, 10 in total
- 1 10TB file system, 24 mount points per client (24 independent TCP connections between each storage client and the storage server)

CLIENT SYSTEM CONFIGURATION

BlueArc storage node
- 6 Linux servers running CentOS release 5.5 (Final) and Red Hat Enterprise Linux Server release 5.5 (Tikanga):
  - 2 Dell R710 with 2 Quad-Core Xeon 2.5 GHz processors, 24 GB RAM
  - 2 SuperMicro E5640 @ 2.67 GHz
  - 2 SuperMicro E5540 @ 2.53 GHz
- Dual-port 10GbE NIC per host server
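As a sanity check on the storage figures, the sketch below derives the number of RAID 5 (5+1) volume groups and the raw and data capacity of the two drive pools listed above. It assumes decimal gigabytes and terabytes, assumes every drive belongs to a 5+1 group, and ignores hot spares and file system overhead, so the usable figures are upper bounds rather than values reported in the paper. The function name is hypothetical.

```python
def raid5_pool(drives: int, drive_gb: int, data_disks: int = 5, parity_disks: int = 1):
    """Return (groups, raw_tb, data_tb) for a pool built from equal RAID 5 groups."""
    group_size = data_disks + parity_disks
    if drives % group_size != 0:
        raise ValueError("drive count must be a multiple of the group size")
    groups = drives // group_size
    raw_tb = drives * drive_gb / 1000                 # all drives, decimal TB
    data_tb = groups * data_disks * drive_gb / 1000   # excludes one parity drive per group
    return groups, raw_tb, data_tb


if __name__ == "__main__":
    for drives, drive_gb in ((120, 300), (60, 450)):
        groups, raw_tb, data_tb = raid5_pool(drives, drive_gb)
        print(f"{drives} x {drive_gb}GB: {groups} groups, "
              f"{raw_tb:g}TB raw, up to {data_tb:g}TB for data")
```

The results agree with the configuration text: 120 drives form 20 groups with 36TB raw, and 60 drives form 10 groups with 27TB raw. One drive in six holds parity, so at most 30TB and 22.5TB respectively are available for data.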

WAN EMULATION CONFIGURATION
- Brocade TurboIron 10GbE switch with VLAN configuration
- Apposite Linktropy 10G WAN emulator

ASPERA SERVER SOFTWARE
- Aspera Enterprise Server 2.6

About Aspera

Aspera is the creator of next-generation transport technologies that move the world's data at maximum speed regardless of file size, transfer distance and network conditions. Based on its patented fasp protocol, Aspera software fully utilizes existing infrastructures to deliver the fastest, most predictable file transfer experience. Aspera's core technology delivers unprecedented control over bandwidth, complete security and uncompromising reliability. Over one thousand organizations across a variety of industries on six continents rely on Aspera software for the business-critical transport of their digital assets. Please visit http://www.asperasoft.com for more information.

About BlueArc

At BlueArc, we know that lightning-fast storage and large capacities in a standards-based file storage environment enable new research and collaboration. In Life Sciences and Research, we deliver the performance and scalability needed to produce results quickly and handle an ever-increasing flow of data and analysis results. With methodologies and applications changing much more quickly than IT infrastructures, we deliver storage solutions that are architected from the start to thrive in those situations. BlueArc provides industry-leading file storage at scale, with the simplicity for the end user of a single, global namespace and the performance to handle the demands of researchers, while also providing the control, flexible architecture and scalability needed to optimize for cost and efficiency.

BlueArc is a leading provider of high-performance unified network storage systems to enterprise markets, as well as data-intensive markets such as electronic discovery, entertainment, federal government, higher education, Internet services, oil and gas, and life sciences. Our products support both network attached storage (NAS) and storage area network (SAN) services on a converged network storage platform. We enable companies to expand the ways they explore, discover, research, create, process and innovate in data-intensive environments. Our products replace complex and performance-limited products with high-performance, scalable and easy-to-use systems capable of handling the most data-intensive applications and environments. Further, we believe that our energy-efficient design and our products' ability to consolidate legacy storage infrastructures dramatically increase storage utilization rates and reduce our customers' total cost of ownership.

BlueArc Corporation, Corporate Headquarters
50 Rio Robles Drive, San Jose, CA 95134
t 408 576 6600, f 408 576 6601
www.bluearc.com

BlueArc UK Ltd., European Headquarters
Queensgate House, Cookham Road, Bracknell RG12 1RB, United Kingdom
t +44 (0) 1344 408 200, f +44 (0) 1344 408 202

© 2011 BlueArc Corporation. All rights reserved. The BlueArc logo is a registered trademark of BlueArc Corporation. 05/11 WP-ASPRLIFE-00