RAIDIX Data Storage Solution Highly Scalable Data Storage System and Broadcasting Corporations 2017
Contents
Synopsis
Introduction
Challenges and the Solution
Solution Architecture
Technical Characteristics
Business Impact
About RAIDIX

Synopsis

Storing and accessing large files in data-rich industries, such as medical diagnostics or resource-intensive research, can be a challenging task for any IT infrastructure. Media & Entertainment (M&E) works with files that are truly gargantuan. At 2K or 4K resolution (over four times the standard resolution), content volumes grow sharply: an average feature film can take up to 2 TB in Full HD and up to 15 TB in 8K. Postproduction professionals operate with terabytes of data and perform video editing and grading in resource-intensive applications such as Blackmagic Design's DaVinci Resolve.

One of the key objectives of multimedia studios is to sustain the high throughput required for uninterrupted video processing, with no frames dropped, from multiple workstations. In addition, given growing data volumes and limited storage capacity, postproduction companies and TV channels require highly scalable solutions.

As a rule, petabyte volumes and clustered data architectures imply Scale-Out data storage. When building a storage cluster from multiple nodes, a system administrator runs into the limitations of traditional file systems. These restrictions include metadata and data stored in the same volumes; low scalability in terms of capacity, performance, file count and directory depth; and lack of cross-platform compatibility. In this scenario, a distributed clustered file system is the optimal solution.

In this document, we'll share details of typical M&E tasks, performance metrics and Scale-Out solutions based on RAIDIX and the HyperFS file system for major IT infrastructures.

All rights reserved. RAIDIX, 2017
Introduction

Key video production and broadcasting requirements include:
- High throughput and sustainable performance even in case of drive failure
- Hot spare capability with no downtime or performance degradation
- Workload prioritization on the application level
- Flexible support for Fibre Channel, iSCSI, NFS, SMB, FTP, AFP and other access protocols
- High Scale-Up and Scale-Out potential.

When operating with high workloads, it's crucial to balance storage performance, density and TCO. System performance depends on the number of drives and the individual performance of each drive. As a rule, high-speed software-defined technology utilizing HDDs fully complies with sequential workload requirements. Combined with high-density JBOD enclosures, it lets storage professionals create effective configurations that cater to a large number of parallel threads or high-definition streams.

Greater scalability requires cluster solutions, such as the system based on the RAIDIX management software and HyperFS. Aside from high performance and low latencies, the key features expected from a fully functional Scale-Out solution include:
- Single namespace for multiple storage clusters
- Concurrent access via versatile protocols
- File and block access to the same data.
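The dependence of system performance on drive count and per-drive speed can be illustrated with a rough sizing sketch. It is not a RAIDIX sizing tool: the per-drive throughput, the efficiency factor and the function names are assumptions made for illustration, and only the uncompressed-stream formula (width x height x bits per pixel x frame rate) is standard arithmetic.

```python
import math


def stream_rate_mb_s(width, height, bits_per_pixel, fps):
    """Data rate of one uncompressed video stream in MB/s (1 MB = 10**6 bytes)."""
    bytes_per_frame = width * height * bits_per_pixel / 8
    return bytes_per_frame * fps / 1e6


def supported_streams(data_drives, drive_mb_s, stream_mb_s, efficiency=0.7):
    """Estimate how many parallel streams an array can sustain.

    data_drives: drives holding data (parity drives excluded).
    drive_mb_s:  assumed sequential throughput of a single HDD.
    efficiency:  assumed derating factor for RAID, protocol and
                 concurrency overhead (hypothetical value).
    """
    aggregate_mb_s = data_drives * drive_mb_s * efficiency
    return math.floor(aggregate_mb_s / stream_mb_s)


if __name__ == "__main__":
    # 4K (4096x2160), 10-bit RGB (30 bits per pixel), 24 frames per second.
    rate_4k = stream_rate_mb_s(4096, 2160, 30, 24)
    print(f"One 4K stream: {rate_4k:.1f} MB/s")
    # Hypothetical array: 21 data drives at 150 MB/s each.
    print("Parallel 4K streams:", supported_streams(21, 150, rate_4k))
```

Real configurations should be sized from measured throughput under the target workload, including the degraded (failed-drive) state named in the requirements above.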
Challenges and the Solution

Video data editing implies real-time reading, writing and processing of uncompressed video streams. Data storage latencies may lead to dropped frames, in which case the process has to start over.

A key task for film companies is video production within minimum timeframes. In technical terms, this requirement calls for high throughput and fault tolerance throughout the production lifecycle. Media holdings running tens of parallel projects need new data storage systems or regular updates to existing configurations to keep up with ever-growing content volumes.

A suitable data storage system should deliver high performance and reliable storage of multi-petabyte data volumes with minimal investment. Other critical factors are maximum data processing speed and reliable failover all the way from ingestion to broadcasting.

In this paper, we'll focus on a high-capacity storage system (from 2U/48TB up to 4U/108TB) that is easily expandable with JBOD enclosures, high-performing and capable of handling numerous 2K/4K video streams. Data integrity with no frames dropped is ensured by patented RAIDIX algorithms, including RAID 6 with double parity and RAID 7.3 with triple parity. RAID 7.3 keeps the system running without interruption even if up to three drives fail.

As data storage management software, RAIDIX runs on commodity x86-64 components (casing, drives, interface controllers, memory, processors, etc.) and allows the end customer to customize RAID arrays for specific M&E tasks and reduce overall implementation and maintenance costs. RAIDIX supports professional equipment from Apple, AJA and Blackmagic Design, as well as Xsan, metaSAN, StorNext and FalconStor environments and video editing software (Adobe Premiere, Final Cut Pro, Avid, Smoke, DaVinci Resolve, SGO Mistika, etc.). Moreover, RAIDIX allows the administrator to install professional editing and grading software right on the storage node, thus minimizing hardware overheads.
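To make the RAID 6 and RAID 7.3 parity trade-off mentioned above concrete, the sketch below compares usable capacity and tolerated drive failures for a single array. It reflects only the parity counts stated in the text (two and three drives' worth of parity); the function name, the drive count and the drive size are hypothetical, and hot spares and file system overhead are ignored.

```python
# Parity drives per RAID level, as described in the text:
# RAID 6 uses double parity, RAID 7.3 uses triple parity.
PARITY_DRIVES = {"RAID 6": 2, "RAID 7.3": 3}


def raid_summary(level, drives, drive_tb):
    """Return usable capacity and fault tolerance for one array.

    Assumes every drive has the same size; spares and file system
    overhead are not modelled.
    """
    parity = PARITY_DRIVES[level]
    if drives <= parity:
        raise ValueError(f"{level} needs more than {parity} drives")
    return {
        "usable_tb": (drives - parity) * drive_tb,
        "tolerated_failures": parity,
        "parity_overhead": parity / drives,
    }


if __name__ == "__main__":
    # Hypothetical array: 12 drives of 4 TB each.
    for level in PARITY_DRIVES:
        info = raid_summary(level, drives=12, drive_tb=4)
        print(level, info)
```

With these illustrative numbers, moving from RAID 6 to RAID 7.3 costs one additional drive of capacity in exchange for surviving a third simultaneous drive failure.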
What are the RAIDIX capabilities in terms of building multi-node storage clusters? As mentioned before, this task goes beyond the functionality of traditional file systems (FS). Classic FS impose the following limitations:
- Metadata and data are stored on the same partitions
- Files are scattered across the partition, causing access latencies
- No protection against fragmentation
- Low scalability by capacity, performance, file number, directory depth, etc.
- Lack of native cross-platform support.

These issues can be resolved with cluster file systems. The HyperFS system from Scale Logic, for instance, ensures high scalability with full process transparency for the customer and shared access to data from various operating systems (in particular, through a dedicated NAS gateway). Integrated with HyperFS, the RAIDIX software provides a single namespace for SAN and NAS.

Technical benefits of the combined solution include:
- Up to 4 billion files in a single directory
- Up to 4096 partitions that can be consolidated within a single FS
- No single point of failure (SPOF)
- Dynamic FS scalability by volume and performance
- Support for the latest versions of popular operating systems: Mac, Windows and Linux.

The HyperFS SAN file system provides the required redundancy, high data availability, and mirroring of paths and data. HyperFS for SAN helps to transform multiple Fibre Channel or iSCSI drive arrays into a unified storage cluster. This cluster enables parallel editing and playback from several client machines, as well as high performance and shared data access within a single namespace.

The system includes an optional metadata controller (MDC) with a redundancy structure as well as a fully redundant SAN structure with metadata mirroring. HyperFS SAN also supports multi-path configurations in Fibre Channel and iSCSI environments. The system has no SPOF and ensures high storage reliability.

Picture 1. HyperFS SAN Infrastructure.

The capabilities of Scale-Out NAS systems employed in major M&E infrastructures include consolidation of up to 64 nodes in a single cluster, concurrent access via versatile protocols (SMB v2/v3, NFS v3/v4, FTP/FTPS, HTTP/HTTPS/WebDAV), workload balancing across the nodes (Round-Robin, Connection Count, Load node), and Active Directory support.

Enhanced RAIDIX and HyperFS features include:
- System optimization for large and small files
- Support for user and folder quotas
- SNMP monitoring for SONG and MDC
- LDAP/Active Directory: use a local user database or integrate with Active Directory
- ACL support: use ACLs on all supported operating systems.

Bottom line: the system based on RAIDIX and HyperFS provides film companies, postproduction studios and TV channels with a high-performance solution featuring a single namespace, concurrent access via various protocols, low latencies, high scalability, and file and block access to the same data.

Solution Architecture

The data storage architecture (see pic. 2) based on RAIDIX and the HyperFS cluster system comprises three key components:
- Storage nodes (SharedDisk/data storage system). HDD-based systems aimed at fault-tolerant data storage.
- Directory services (MDS). Intended for storing data references, resource arbitration and access management. Directory services per se do not store data or metadata.
- Clients. RAIDIX clients may be servers or computers with preinstalled client software for shared access. When a large number of clients must connect without specific client software installed, the solution architecture allows for setting up NAS gateways that enable data operations.
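When many clients connect through NAS gateways, incoming connections have to be spread across the gateway nodes. The sketch below illustrates two of the balancing policies named earlier, Round-Robin and Connection Count. It is not HyperFS code: the class names and node names are hypothetical, and treating Connection Count as "pick the node with the fewest open connections" is an assumption about what that policy means.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out nodes in a fixed rotating order."""

    def __init__(self, nodes):
        nodes = list(nodes)
        if not nodes:
            raise ValueError("at least one node is required")
        self._cycle = cycle(nodes)

    def pick(self):
        return next(self._cycle)


class ConnectionCountBalancer:
    """Hands out the node that currently has the fewest open connections."""

    def __init__(self, nodes):
        self.connections = {node: 0 for node in nodes}
        if not self.connections:
            raise ValueError("at least one node is required")

    def pick(self):
        # Ties are resolved in the order the nodes were registered.
        node = min(self.connections, key=self.connections.get)
        self.connections[node] += 1
        return node

    def release(self, node):
        # Called when a client disconnects from the node.
        if self.connections[node] > 0:
            self.connections[node] -= 1


if __name__ == "__main__":
    gateways = ["gw1", "gw2", "gw3"]  # hypothetical gateway names
    rr = RoundRobinBalancer(gateways)
    print([rr.pick() for _ in range(4)])
    cc = ConnectionCountBalancer(gateways)
    print([cc.pick() for _ in range(4)])
```

Round-Robin needs no state about the nodes, while a connection-count policy adapts when some clients hold long-lived sessions, at the cost of tracking connects and disconnects.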
Pic. 2. Data storage architecture based on the RAIDIX software and cluster file system.

Technical Characteristics

Parameter                               Value
System capacity (theoretical limit)     64 ZB
Max. number of files/objects/folders    Up to 4,000,000,000 when using a 4 TB metadata volume
File size (theoretical limit)           64 ZB
File name length                        Windows: 255 ASCII characters; Linux/Mac: 255 ASCII characters
Directory depth                         Windows: 244 characters; Linux: 4096 bytes
Max. number of LUNs                     4093
Exported paths                          512
No. of metadata controllers (MDC)       Up to 2, can be configured in HA mode
Number of concurrent file systems       16
Full redundancy configuration           Supported: no single point of failure
Dynamic file system expansion           Yes, LUNs can be added with no downtime
Supported SAN client OSs                Windows 7 32/x86_64, Win 8, Win 10
                                        Windows Server 2008/2008_R2/2012/2012_R2 32/x86_64, 2016 x86_64
                                        RHEL 5 (Update 3 to Update 10) 32/x86_64
                                        RHEL 6 (Update 0 to Update 8) 32/x86_64
                                        RHEL 7 (Update 0 to Update 2)
                                        SUSE 11 SP1 to SP3
                                        OS X 10.7 to 10.12
SSD support                             Yes
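The limits listed above can be used as a quick sanity check when planning a deployment. The sketch below is only an illustration: the dictionary keys, the function name and the sample plan are hypothetical, and the numeric limits are copied from the table rather than queried from any RAIDIX or HyperFS interface.

```python
# Limits copied from the Technical Characteristics table.
LIMITS = {
    "luns": 4093,
    "exported_paths": 512,
    "metadata_controllers": 2,
    "file_systems": 16,
}


def check_plan(plan):
    """Return a list of human-readable violations for a planned setup.

    plan maps the same keys as LIMITS to planned quantities;
    missing keys are treated as zero.
    """
    violations = []
    for key, limit in LIMITS.items():
        value = plan.get(key, 0)
        if value > limit:
            violations.append(f"{key}: {value} exceeds limit {limit}")
    return violations


if __name__ == "__main__":
    # Hypothetical plan with too many LUNs.
    plan = {"luns": 5000, "exported_paths": 128, "file_systems": 4}
    for problem in check_plan(plan):
        print(problem)
```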
Business Impact

RAIDIX allows the administrator to employ multiple storage nodes, distributing data dynamically and balancing workloads across the nodes. The solution architecture makes it possible to add new system nodes on request without data migration or system reconfiguration.

The main benefit of RAIDIX is concurrent, high-performance block-level data processing across one or more data storage systems and numerous workstations, which is not possible within a classic SAN architecture.

The RAIDIX technology in combination with the HyperFS file system meets the high performance and fault-tolerance requirements and ensures shared access to video content from multiple workstations. RAIDIX allows the user to minimize hardware overheads when building a storage cluster by scaling out existing infrastructures effectively, with no downtime or drop in performance.

About RAIDIX

RAIDIX (www.raidix.com) is a leading solution provider and developer of high-performance data storage systems. The company's strategic value builds on patented erasure coding methods and innovative technology designed by its in-house research laboratory. The RAIDIX Global Partner Network encompasses system integrators, storage vendors and IT solution providers offering RAIDIX-powered products for professional and enterprise use.