IBM V7000 Unified R1.4.2 Asynchronous Replication Performance Reference Guide
- Barry Arnold
IBM V7000 Unified R1.4.2 Asynchronous Replication Performance Reference Guide
Document Version 1.

SONAS / V7000 Unified Asynchronous Replication Development Team: Hiroyuki Miyoshi, Satoshi Takai, Hiroshi Araki
SONAS / V7000 Unified Replication Architect: Thomas Bish
Development Manager: Norie Iwasaki

Page 1 of 39
TABLE OF CONTENTS

1 Document Information
  1.1 Summary of Changes
  1.2 Related Documents
2 Introduction
3 Async Replication Operational Overview
  3.1 Async Replication Phases
  3.2 Async Replication Mode
4 Replication Frequency and RPO Considerations
5 Async Replication Performance Benchmark
  5.1 Test Environment
  5.2 Performance Characteristics for Various Factors
      Async Replication Configuration Options / Fullsync / Network / Host Workload / Number of Files / File Size (Fixed Size) / Filesystem Configuration
  5.3 Typical Cases
      Case Definition / Results
1 Document Information

1.1 Summary of Changes

Version | Date       | Short Description
0.1     | 2013/7/3   | initial draft version
0.2     | 2013/9/2   | added test machines, modified test cases
0.3     | 2013/10/11 | performance results added, revised chapters
0.4     | 2013/10/16 | updated chapters 3 and 4 per review comments, corrected typo
0.5     | 2013/10/18 | corrected wording per review comments

1.2 Related Documents

IBM V7000 Unified Information Center

SONAS Copy Services Asynchronous Replication - Best Practices: the document can be found via ibm.com/support/entry/portal/overview/hardware/system_storage/network_attached_storage_%28nas%29/sonas/scale_out_network_attached_storage
Version R1.4.1 is available as of today (2013/10/18) and gives enough information for this performance reference guide. The direct link to Version R1.4.1 is 1.ibm.com/support/docview.wss?uid=ssg1S74448
Version R1.4.2 is planned to be uploaded.

IBM Storwize V7000 Unified Performance Best Practice Version 1.2 is planned to be uploaded. Once uploaded, it should be available in the same portal: ibm.com/support/entry/portal/overview/hardware/system_storage/network_attached_storage_%28nas%29/sonas/scale_out_network_attached_storage
2 Introduction

V7000 Unified asynchronous replication (async replication) provides a function to copy data from a primary V7000 Unified (the source) to a remote V7000 Unified (the target). As the name implies, the copy is done asynchronously to the host I/O. The disaster recovery solution is built around this function. Every time async replication is invoked, the incremental data is scanned and transferred to the target. Async replication can be scheduled for periodic execution. In order to keep the target up to date and meet the RPO (Recovery Point Objective) requirement, the performance of the async replication is important. Also, since it runs as a storage background task, it is necessary to understand its resource consumption (CPU, memory, and disk) so that the frequency and timing can be chosen so as not to impact the host I/O response time or other V7000 Unified storage applications such as NDMP.

A separate best practice guide (SONAS Copy Services Asynchronous Replication - Best Practices) already describes the overall and basic usage, considerations, and problem determination of async replication. Its location is given in 1.2 Related Documents. This document focuses on the performance investigation and the resource consumption of async replication under various configurations. The purpose is to provide a reference to help configure a V7000 Unified that uses async replication.
3 Async Replication Operational Overview

3.1 Async Replication Phases

Async replication is performed on a per-filesystem basis, and each replication goes through the following phases:

- Snapshot of the source file system, to create a write-consistent point-in-time image of the file space to replicate
- Scan of the source file system snapshot for files and directories created/modified/deleted since the last successful completion of async replication
- Replication of the changed contents to the target system
- Snapshot of the target file system
- Removal of the source file system snapshot

The details of each phase are described in the best practice guide and the information center. Here are some notes from the performance perspective.

Scan phase
This is executed on the source V7000 Unified. If both nodes are configured as replication participating nodes, both will be used. The scan reads the metadata of all the files in the filesystem, so it is recommended to have the metadata spread across multiple NSDs. By default, when creating a filesystem via the GUI or the CLI mkfs command, multiple NSDs (vdisks) of type dataAndMetadata are automatically created, which is fine. Use of SSDs for the metadata is also recommended to improve scan performance. The scan involves text sorting; depending on the total number of files and the length and complexity of the directory structure and file names, this may consume noticeable CPU on the nodes. The memory usage is limited to 5% of the physical memory on the nodes per async operation. The time the scan phase takes depends on the total number of files in the source filesystem and the length and complexity of the directory structure, not on file sizes.

Replication phase
Multiple rsync processes will be invoked in parallel (if configured) from both nodes (if configured) to replicate the changed files to the target.
The async operation segments the changed files into groups based on their size and location within the source file system. Large files (over 1MB) are processed first; as the replication processes complete the large files, the smaller files are processed. Rsync determines the transmission mode (whole or delta) for each file depending on whether the file exists on the target side and on the size of the file on the source and the target. In whole-transfer mode, rsync sends the entire data of the file to the target. In delta-transfer mode, rsync finds the delta blocks between the source and the target and sends only the delta over the network.
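The transmission-mode selection described above can be sketched as a small decision function. This is an illustrative model only, not the actual product code; the size thresholds follow the R1.4.2 rule stated in section 4 (whole-file transfer for files smaller than 1 MB or larger than 1 GB).

```python
# Hypothetical sketch of the rsync transmission-mode selection described
# in this guide. Sizes are in bytes; thresholds are the R1.4.2 values
# given in section 4 of this document.
MB = 1024 * 1024
GB = 1024 * MB

def transfer_mode(size_bytes, exists_on_target):
    # A file absent from the target has no baseline to diff against,
    # so the whole file must be sent.
    if not exists_on_target:
        return "whole"
    # Tiny files: delta bookkeeping costs more than just resending.
    # Very large files: computing the delta is slower than streaming.
    if size_bytes < 1 * MB or size_bytes > 1 * GB:
        return "whole"
    return "delta"

print(transfer_mode(512 * 1024, True))   # small file
print(transfer_mode(100 * MB, True))     # mid-size file
print(transfer_mode(2 * GB, True))       # very large file
```

The mid-size band is where the delta algorithm pays off: only changed blocks cross the network, at the cost of reading the file on both sides.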
3.2 Async Replication Mode

There are two modes in async replication:
- incremental
- fullsync

Incremental replication
When the startrepl CLI command is invoked without any option, incremental replication is executed by default. This scans for the files that have been modified since the last successful replication and replicates only the modified files to the target side.

Fullsync replication
This can be executed by adding the --fullsync option to startrepl. This lists all the files/directories in the filesystem and lets rsync process the full list. Rsync compares all the files identified in the source file system and only updates the modified ones. Typically, a customer should not need to specify "--fullsync": if startrepl is invoked for the first time for a filesystem, or if the code detects a severe error condition such that it is safer to audit all the files, it automatically switches to fullsync mode. The --fullsync option is primarily intended to audit the target against the source if an event may have caused the target file system to diverge from the source under the normal incremental replications. The --fullsync option should also be used in the initial failback from the DR site to the original primary, to ensure the primary source system is recovered correctly from its state at the disaster to the current state of the DR site.
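The two invocations can be sketched as follows. The command name startrepl and the --fullsync option are taken from this guide; the exact argument layout (device/filesystem name, other flags) is an assumption for illustration.

```python
# Minimal sketch of composing the two startrepl invocations described
# above. "gpfs0" is a hypothetical filesystem name; consult the CLI
# reference for the real argument syntax.
def build_startrepl(filesystem, fullsync=False):
    cmd = ["startrepl", filesystem]
    if fullsync:
        # Audit every file on the source against the target.
        cmd.append("--fullsync")
    return cmd

print(build_startrepl("gpfs0"))                  # incremental (default)
print(build_startrepl("gpfs0", fullsync=True))   # full audit / failback
```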
4 Replication Frequency and RPO Considerations

The recovery point objective (RPO) describes the acceptable amount of data loss, in time, should a disaster occur. The RPO for async replication is based on the frequency of the startrepl command and is limited by the duration of the async replication operation. The first step in the async process is to create a point-in-time snapshot of the source file system, which creates a data-consistent point. This snapshot is used as a base to perform the changed-file (delta) scan and as the source of the file replication. Only one async increment may be running at any time per replication relationship. Therefore the minimum RPO is defined by the time from the completion of one increment to the completion of the next increment. The duration of the replication depends on a number of factors, including:

Number of files within the source file system - The number of files contained in the source file system affects the duration of the scan. Async replication uses a high-speed file system scan to quickly identify the files that have changed since the last successful async replication. While the scan is optimized, the number of files contained in the file system adds to the time it takes to perform the scan. The scan process is distributed across the number of nodes configured to participate in the async process.

Number of disks and how the source/target filesystems are created from them - The disk bandwidth plays a major role in the performance of async replication, and tends to be the bottleneck in many cases. Generally, if more mdisks (V7000 RAID volumes) with a sufficient number of disks are used for the source/target filesystems, faster replication is likely to be obtained.
When multiple filesystems are created from the same mdisk group, async replication on one filesystem and host I/O to the other filesystems share the same disks and impact each other. In particular, when multiple replications of those filesystems are executed at the same time, they will all be limited by the capability of the shared disks. It is strongly recommended to stagger the replications in such a case.

Amount of changed data requiring replication - The time it takes to transfer the contents from the source V7000 Unified to the target is a direct result of the amount of data modified since the last async operation. Basically, the async operation only moves the changed contents of files between source and target to minimize the amount of data to be sent over the network. However, depending on the file size, it can be faster to send the whole data instead of finding and sending only the delta. In R1.4.2, if the file size is larger than 1GB or less than 1MB, the transmission mode is changed to whole-file transfer.

File size profile of the changed files - When talking about the throughput of a replication in terms of MB/sec, if there are many large files (over a GB), the data transfer will be dominant and the MB/sec value will be high. However, if there are many small files (a few KB each), the file attribute transfers and the inode delete/create operations will be dominant. In that case the MB/sec value will be very low, and we recommend evaluating file/sec as the throughput metric.

Bandwidth and capabilities of the network between V7000 Unified systems - The network capabilities play a factor in the time it takes the replication process to complete. Enough network bandwidth must be available to handle all of the updates that have occurred since the start of the last increment before the start of the next scheduled increment. Otherwise, the next startrepl will fail, which will effectively double the RPO for that interval.
Number of source and target nodes participating in async replication - It is strongly recommended to use both nodes for replication. It is possible to configure only one node pair between the source and the target V7000 Unified, but the scan/rsync throughput will be roughly halved unless the network or disks are extremely slow. Also, redundancy is lost in case something happens to a node during replication.

Configuration parameters for replication - The number of replication processes configured per node, the strong vs. fast encryption cipher, and software compression help to tune the replication performance. See below for more information.

Other workloads running concurrently with replication:

HSM managed file systems at the source or target - HSM managed file systems can greatly affect the time it takes for async replication to complete if the changed files in the source file system have been moved to the secondary media before async replication has replicated them to the target.

NDMP - NDMP uses the same scan that async replication uses. The scan is an I/O intensive operation and may consume significant CPU time. When scheduling tasks, it is recommended to stagger the times for NDMP and async replication.

Number of replication processes per node - This configuration parameter allows more internal rsync processes to be used on each node to replicate data to the other side. This provides more parallelism and increases the potential replication bandwidth and rate.

Encryption cipher - The data transferred between V7000 Unified systems is encrypted via SSH as part of the replication. The default cipher is strong (AES), but it limits the maximum per-process network transfer rate to approximately 35-40 MB/s. The fast cipher (arcfour) is not as strong, but increases the per-process network transfer rate to approximately 95 MB/s. On trusted networks, it is advised to use the fast cipher to increase bandwidth utilization.

Software compression - This compresses the data to be transferred over the network before it is encrypted.
The compression is performed using the CPU of the node transferring the data. For data which is compressible, this can reduce the amount of data to be sent over the network and increase the effective bandwidth. If the data is not compressible, the calculation overhead is added and the overall throughput may degrade.
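The factors above can be combined into a back-of-the-envelope estimate of one increment's duration (and hence the achievable RPO). All rates in this sketch are assumptions for illustration, not measurements from this guide; real values depend on the disks, the network, and the file size profile.

```python
# Rough duration model for one async replication increment:
# scan time (driven by file count) plus transfer time (driven by the
# smaller of the aggregate rsync-process rate and the network pipe).
def increment_duration_sec(num_files, changed_gb,
                           nodes=2, procs_per_node=10,
                           scan_files_per_sec=50_000,   # assumed per-node scan rate
                           per_proc_mb_per_sec=40,      # ~strong-cipher per-process cap
                           network_mb_per_sec=1250):    # ~10 Gbps
    scan = num_files / (scan_files_per_sec * nodes)
    # Aggregate transfer rate is limited by either the rsync processes
    # or the network, whichever is smaller.
    aggregate = min(nodes * procs_per_node * per_proc_mb_per_sec,
                    network_mb_per_sec)
    transfer = changed_gb * 1024 / aggregate
    return scan + transfer

# 10 million files with 50 GB changed since the last increment:
print(increment_duration_sec(10_000_000, 50))
```

A schedule whose interval is shorter than this estimate risks the next startrepl failing, which, as noted above, effectively doubles the RPO for that interval.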
5 Async Replication Performance Benchmark

First, the performance characteristics are investigated for each factor; the obtained data and trends are shown in section 5.2 Performance Characteristics for Various Factors. Afterwards, a few typical use cases are defined and the performance benchmark for each is shown in section 5.3 Typical Cases.

5.1 Test Environment

Replication Source V7000 Unified
- 2 Filer Nodes and 1 V7000
- Filer Node (2073-700): x3650 M3 server, CPU Intel(R) Xeon(R) x 8, Memory 72GB (GPFS cache 36GB)
- V7000: 300GB 10K RPM 6Gbps SAS HDD x 21 (6TB), 8Gbps FC
- V7000 Unified version:

Replication Target V7000 Unified
- 2 Filer Nodes and 1 V7000
- Filer Node (2073-720): x3650 M4 server, CPU Intel(R) Xeon(R) x 4, Memory 72GB (GPFS cache 36GB)
- V7000: 900GB 10K RPM 6Gbps SAS HDD x 23 (20TB), 8Gbps FC
- V7000 Unified version:

Network
- 1 Gbps, bonding mode: balance-alb (6)
- 10 Gbps, bonding mode: active-backup (1) (1 node pair only)
5.2 Performance Characteristics for Various Factors

The following sections illustrate how changes in the various parameters impact the replication performance and time.

Async Replication Configuration Options

Objective: To illustrate the effects on replication performance, duration, and resource consumption when various async replication configuration options are varied.

Test variations:
- Number of nodes (1, 2)
- Number of processes (1, 5, 10)
- Encryption cipher (strong, fast)
- Compression (enabled, disabled)

For the compression test, 2 sets of test data are used:
1. Completely random data, for which the compression ratio is very low (almost 1:1).
2. Zero-filled data, for which the compression ratio is high. In the case of a 1 GB file, the ratio is almost 1000:1.

Test configurations:
- Test files: 500k files, about 630 GB total, file size varying from a few KB to a few GB (same as #7 in "Typical Cases" in section 5.3); test files containing random data
- Filesystem: GUI default (Basic-RAID5 capacity template); GPFS blocksize: 256 KB; 3 NSDs (vdisks) per file system, all dataAndMetadata; V7000 mdisk group extent size: 256 KB; 3 RAID5 mdisks with physical disks (7, 7, 6) for the source cluster and (7, 7, 7) on the target cluster; mdisk stripe size: 128 KB
- Network: 1 Gbps, no latency
- Async replication configuration: source & target snapshot
- Host workload: none
- Replication: initial replication

Measured values:
- Elapsed time for scan/replication
- Throughput (MB/sec, file/sec) for the replication phase
- CPU/memory/storage utilization for scan/replication
Results:

Number of nodes
Fig.) Node number impact on rsync throughput (1 node vs. 2 nodes)
As for the rsync time, the 2-node configuration is almost 2 times faster. The scan time in these cases didn't show much difference, since only 500k test files were used and both cases completed in about 10 seconds. However, scan throughput is also expected to increase with the number of nodes.

Number of processes per node
Fig.) Process number impact on rsync throughput (1, 5, 10 processes)
The rsync throughput also increases as the number of processes is increased from 1 to 10. The resource consumption is shown below.
Source V7000 Unified active management node CPU usage:
Fig.) CPU usage of the source V7000 Unified active management node for various replication process numbers
As shown, the 10-process case consumes about 15% more CPU than the 1-process case.

Source V7000 Unified active management node NSD (vdisk) usage:
Fig.) Disk utilization of the source V7000 Unified active management node for various replication process numbers
In this environment, when 5 or 10 processes are used, the source disk utilization is almost 100%, indicating the bottleneck.
Target V7000 Unified node CPU usage:
Fig.) CPU usage of the target V7000 Unified node for various replication process numbers

Target V7000 Unified node NSD (vdisk) usage:
Fig.) Disk utilization of the target V7000 Unified node for various replication process numbers

Async replication copies large files (>1MB) first. Looking at the log of the 10-process replication run, the small files (a few KB) started to get replicated after 25 minutes (1500 sec) and continued to the end (around 3000 sec). This matches the disk utilization peaks shown. As for the memory usage, the free memory didn't change much throughout the replication (400-500 MB usage maximum out of 15-20GB free memory). GPFS pre-allocates memory as its page pool, so the memory consumption will not increase with the I/O. The amount of memory used by the scan is limited to 5%, and each rsync handles files one at a time, so it does not use an excessive amount of memory.
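The large-files-first ordering that explains the utilization peaks above can be sketched as a simple partitioning step. This is an illustrative model of the behavior described in this guide, not the product's actual scheduler; the 1 MB threshold is the figure the guide quotes.

```python
# Sketch of large-files-first ordering: changed files are segmented by
# size and the large group (over 1 MB, per this guide) is handed to the
# rsync processes before the small group.
def order_for_replication(files, large_threshold=1024 * 1024):
    """files: list of (path, size_bytes) tuples."""
    large = [f for f in files if f[1] >= large_threshold]
    small = [f for f in files if f[1] < large_threshold]
    # Biggest first within each group keeps the processes busy longest.
    return sorted(large, key=lambda f: -f[1]) + sorted(small, key=lambda f: -f[1])

changed = [("a", 2_000), ("b", 5_000_000), ("c", 300), ("d", 80_000_000)]
print(order_for_replication(changed))
```

This ordering is why the disk utilization stays high early in the run (bulk data movement) and becomes bursty toward the end (metadata-heavy small-file work).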
Encryption type
Fig.) Encryption type impact on rsync throughput (strong (aes128-cbc) vs. fast (arcfour))
No difference was observed in this test case. 20 rsync processes were invoked in total, and the average was about 11 MB/sec for each. This is due to the disk access limitation shown above. The strong encryption cipher (aes128-cbc) is known to max out at 35-40 MB/sec, so if the disk and the network allowed over 40 MB/sec per process, the encryption type would start to show a difference.

Compression
When this option is enabled, rsync compresses the data before sending it to the target V7000 Unified; the target decompresses the data upon receiving it.
Fig.) Compression option impact on rsync throughput (random data vs. zero-filled data)
The compression option severely degraded the throughput for the random-data test files, but improved it when the zero-filled test files were used. For the random test data, the compression cannot reduce the data and the performance dropped due to the overhead. The next graph shows how the compression keeps consuming the CPU.
Fig.) Random test files: compression option impact on CPU usage of the source node
For the zero-filled data, the CPU is still consumed, but the data is compressed and the overall replication time is shorter.
Fig.) Zero-filled data: compression option impact on CPU usage of the source node

Sparse option effect
One other interesting aspect is the sparse option. In R1.4.x, the sparse option is always enabled and it will always create sparse files on the target. Due to this option, the amount of write I/O and the consumed space on the target side are significantly decreased for the zero-filled test data case executed for the compression test above. The sparse option takes effect regardless of the compression option.
The following table shows the df output of the filesystem on the target V7000 Unified after the replication of the 2 data patterns (compression disabled). The source V7000 Unified holds about 631GB, but due to the sparse option, very little data is written to the target side in the zero-filled case.

test case: random data test files
df output: Filesystem 1K-blocks Used Available Use% Mounted on /dev/gpfs % /ibm/gpfs

test case: zero-filled test files
df output: Filesystem 1K-blocks Used Available Use% Mounted on /dev/gpfs % /ibm/gpfs

The disk utilization on the target side also shows the significant decrease of write I/O for the zero-filled test case.
Fig.) Target node disk utilization for random data and zero-filled test data
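The space saving above is the general sparse-file effect: writing past a hole leaves the logical size large while the allocated blocks stay small. A minimal demonstration on a local filesystem that supports sparse files (GPFS on the target behaves analogously when the sparse option is in effect):

```python
import os
import tempfile

# Create a file with a 100 MB hole and 3 bytes of real data, then
# compare the logical size to the space actually allocated on disk.
path = os.path.join(tempfile.mkdtemp(), "sparse.dat")
with open(path, "wb") as f:
    f.seek(100 * 1024 * 1024)   # skip 100 MB without writing it
    f.write(b"end")             # only 3 bytes of real data

st = os.stat(path)
print("logical size:", st.st_size)        # a little over 100 MB
print("allocated:", st.st_blocks * 512)   # only a few KB
```

This is why the df "Used" column on the target stays near zero for the zero-filled dataset even though the source reports about 631GB.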
Fullsync

Objective: To show the replication performance of fullsync vs. incremental

Test variations: async replication mode

Test configurations:
- Test files: 500k files, about 630 GB total, file size varying from a few KB to a few GB (same as #7 in "Typical Cases" in section 5.3); test files containing random data
- Filesystem: GUI default (Basic-RAID5 capacity template); GPFS blocksize: 256 KB; 3 NSDs (vdisks) per file system, all dataAndMetadata; V7000 mdisk group extent size: 256 KB; 3 RAID5 mdisks with physical disks (7, 7, 6) for the source cluster and (7, 7, 7) on the target cluster; mdisk stripe size: 128 KB
- Network: 1 Gbps, no latency
- Async replication configuration: 2 nodes, 10 processes each; fast encryption cipher; no compression; source & target snapshot
- Host workload: none
- Replication: incremental replication (10% delta); fullsync replication (10% delta)

Measured values:
- Elapsed time for scan/replication
- Throughput (MB/sec, file/sec) for the replication phase
- CPU/memory/storage utilization for scan/replication
Results:
Fig.) Total replication time difference between fullsync and incremental (initial / incremental / fullsync)
When the --fullsync option is supplied, all the files on the source side are audited, so there is some overhead compared to a pure incremental replication. However, data is not sent over the network as long as the attributes of the files are the same, so the required time is much less than the initial full-data-send replication.
Network

Objective: To show the performance impact of the network bandwidth and latency

Test variations:
- 1 Gbps vs. 10 Gbps
- 1 Gbps with network latency inserted by netem

Test configurations:
- Test files: 500k files, about 630 GB total, file size varying from a few KB to a few GB (same as #7 in "Typical Cases" in section 5.3); test files containing random data
- Filesystem: GUI default (Basic-RAID5 capacity template); GPFS blocksize: 256 KB; 3 NSDs (vdisks) per file system, all dataAndMetadata; V7000 mdisk group extent size: 256 KB; 3 RAID5 mdisks with physical disks (7, 7, 6) for the source cluster and (7, 7, 7) on the target cluster; mdisk stripe size: 128 KB
- Async replication configuration: 1 node or 2 nodes, 10 processes each; fast encryption cipher; source & target snapshot
- Host workload: none
- Replication: initial replication

Measured values:
- Elapsed time for scan/replication
- Throughput (MB/sec, file/sec) for the replication phase
- Network transferred bytes
Results:

1 Gbps vs. 10 Gbps
Fig.) Rsync throughput difference between 1 Gbps and 10 Gbps network (1 node, 1 process configuration)

RTT (Round-Trip Time) delay impact
Fig.) RTT delay impact on rsync throughput (2 node, 10 process configuration)
Host Workload

Objective: To show the replication performance while host I/O is running

Test variations: host (NFS client) I/O load

Test configurations:
- Test files: 500k files, about 630 GB total, file size varying from a few KB to a few GB (same as #7 in "Typical Cases" in section 5.3); test files containing random data
- Filesystem: GUI default (Basic-RAID5 capacity template); GPFS blocksize: 256 KB; 3 NSDs (vdisks) per file system, all dataAndMetadata; V7000 mdisk group extent size: 256 KB; 3 RAID5 mdisks with physical disks (7, 7, 6) for the source cluster and (7, 7, 7) on the target cluster; mdisk stripe size: 128 KB
- Network: 1 Gbps, no latency
- Async replication configuration: 2 nodes, 10 processes each; fast encryption cipher; no compression; source & target snapshot
- Replication: initial replication

Measured values:
- Elapsed time for scan/replication
- Throughput (MB/sec, file/sec) for the replication phase
- CPU/memory/storage utilization for scan/replication
- Host I/O (MB/sec, IOPS)
Results:

Stress
First, a stress tool which executes dd writes is run from an NFS client. The number of dd threads within the tool is varied to control the workload. The following figure shows the filesystem performance impact (measured by fio), in READ and WRITE KB/s from the NFS client, for no stress and for 4, 8, and 12 threads.
Fig.) Filesystem aggregate bandwidth under the stress tool

Replication under the workload
Async replication was executed while the tool was running on the source V7000 Unified: 0 threads (0% load), 4 threads (22.3% load), 8 threads (25.7% load), 12 threads (59% load).
Fig.) Replication rsync throughput under various stress levels
The following graph shows the disk utilization of the source.
Fig.) Disk utilization of the source V7000 Unified during replication with host I/O (no stress, 4 threads, 8 threads)
Number of Files

Objective: To show the replication performance characteristics (mainly scan) when the number of files is varied

Test variations: number of files (1 million, 5 million, 10 million)

Test configurations:
- Test files: 1KB files, varying the number; test files containing random data
- Filesystem: GUI default (Basic-RAID5 capacity template); GPFS blocksize: 256 KB; 3 NSDs (vdisks) per file system, all dataAndMetadata; V7000 mdisk group extent size: 256 KB; 3 RAID5 mdisks with physical disks (7, 7, 6) for the source cluster and (7, 7, 7) on the target cluster; mdisk stripe size: 128 KB
- Network: 1 Gbps, no latency
- Async replication configuration: 2 nodes, 10 processes each; fast encryption cipher; no compression; source & target snapshot
- Host workload: none
- Replication: initial replication

Measured values:
- Elapsed time for scan/replication
- Throughput (MB/sec, file/sec) for the replication phase
- CPU/memory/storage utilization for scan/replication
Results:

Scan time
As shown below, the scan time increases as the total number of files in the filesystem increases (1 mil, 5 mil, 10 mil).
Fig.) Scan time difference related to the total number of files in the source V7000 Unified

Rsync throughput
The rsync throughput (file/sec) is expected to be consistent, since the file size is the same among the test cases (the 10 million case is omitted because that case had over-GB files included and its characteristics would be different).
Fig.) Rsync throughput related to the total number of files in the source V7000 Unified

NOTE: In this test case, the number of files within a subdirectory was limited to 1. However, if there are more files (especially small files of a few KB) and a replication is invoked, the 2 nodes may try to replicate files under the same subdirectory. On the target side, 2 different nodes may then try to create files under the same directory at the same time and may hit a GPFS directory lock collision, which may degrade performance.
File Size (Fixed Size)

Objective: To show the replication performance characteristics for a given file size

Test variations: file size (1KB only, 1MB only, 1GB only)

Test configurations:
- Test files: file number varying depending on the size (1KB -> 1mil; 1MB -> 1mil; 1GB -> 1k); test files containing random data
- Filesystem: GUI default (Basic-RAID5 capacity template); GPFS blocksize: 256 KB; 3 NSDs (vdisks) per file system, all dataAndMetadata; V7000 mdisk group extent size: 256 KB; 3 RAID5 mdisks with physical disks (7, 7, 6) for the source cluster and (7, 7, 7) on the target cluster; mdisk stripe size: 128 KB
- Network: 1 Gbps, no latency
- Async replication configuration: 2 nodes, 10 processes each; fast encryption cipher; no compression; source & target snapshot
- Host workload: none
- Replication: initial replication

Measured values:
- Elapsed time for scan/replication
- Throughput (MB/sec, file/sec) for the replication phase
- CPU/memory/storage utilization for scan/replication
Results:

For small files (1 KB), the throughput in terms of MB/sec becomes extremely low. This is because the overhead of processing the file (inode and attribute data) dominates the performance, yet that work is not counted in the MB/sec value.
Fig.) Rsync throughput (MB/sec) for various test file sizes (1KB x 1mil, 1MB x 1mil, 1GB x 1k)
If the same data is viewed as file/sec, the small-file value is high while the large-file (1 GB) value becomes extremely low.
Fig.) Rsync throughput (file/sec) for various test file sizes
Typically, when various file sizes are mixed, the throughput tends to be measured in MB/sec of the file contents data. However, if the source side has a huge number of small files (a few KB each), the file attribute copies and the inode creation/deletion operations dominate the performance and the MB/sec value will be low. File/sec needs to be evaluated in that case.
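The two metrics above can be computed from the same run, which makes the contrast concrete. The numbers below are illustrative only, not measurements from this guide:

```python
# Score the same replication run two ways: MB/sec (data moved) and
# file/sec (files processed). Small-file workloads look terrible in
# MB/sec; large-file workloads look terrible in file/sec.
def throughput(num_files, file_size_mb, elapsed_sec):
    return (num_files * file_size_mb / elapsed_sec,   # MB/sec
            num_files / elapsed_sec)                  # file/sec

# Hypothetical runs, each taking 2000 seconds:
small = throughput(1_000_000, 0.001, 2_000)   # a million 1 KB files
large = throughput(1_000, 1024, 2_000)        # a thousand 1 GB files
print(small)   # low MB/sec, high file/sec
print(large)   # high MB/sec, low file/sec
```

Neither metric is wrong; they simply measure different bottlenecks (inode/attribute work vs. bulk data movement), which is why the guide recommends picking the metric that matches the file size profile.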
28 V7 Unified Asynchronous Replication Performance Reference Guide Filesystem Configuration Objective: To show the replication performance impact of the filesystem storage configuration Test variations: Number of NSDs in a filesystem on source/target V7 Unified from 1 to 12 all NSDs metadataanddata Mdisk configuration from GUI menu on both source/target V7 Unified RAID5-Basic capacity, RAID5-Basic performance, RAID-performance Test configurations: Test files 5k files, about 63 GB total, file size varying from a few KB to a few GB (same as #7 in "Typical Cases" in section 5.3) test files containing random data Network 1 Gbps no latency Async replication configuration 2 nodes 1 processes each Fast encryption cipher No compression Source & target snapshot Host workload none Replication: Initial replication Measured values: Elapsed time for scan/replication, Throughput (MB/sec, file/sec) for replication phase CPU/memory/storage utilization for scan/replication Results: V7 Unified provides several options when configuring storage. Details are described in the performance white paper but the basic steps are to create mdisks (RAID volumes), vdisks (NSDs), and gpfs filesystems. From operation point of view, the mdisks and the mdisk groups need to be defined first and then creating a gpfs filesystem by CLI/GUI will create the vdisks and the filesystem. mdisk creation GUI provides several menus. Basic-RAID5 capacity --- This is the recommended default template in GUI. In this test configuration, 3 RAID-5 mdisks will be created. The numbers of physical drives are (6, 7, 7) on the source cluster and (7, 7, 7) on the target cluster. All the test cases except for this test case are done using this template. Basic-RAID5 performance --- In this test configuration, 2 RAID-5 mdisks will be created. The number of physical drives are (9, 9) RAID performance -- In this test configuration, 2 RAID- mdisks will be created. 
The numbers of physical drives are (9, 9). RAID-6 is also available.
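A rough way to compare these presets is the number of data-bearing spindles each one leaves for the filesystem. The sketch below is a simplification (it ignores spares, rebuild behavior, and stripe geometry) using the drive counts listed above:

```python
# Drives consumed by parity per mdisk for each RAID level (simplified view).
PARITY = {"raid0": 0, "raid5": 1, "raid6": 2}

def data_spindles(raid_level, mdisk_drive_counts):
    """Sum the drives that actually hold data across all mdisks."""
    overhead = PARITY[raid_level]
    return sum(n - overhead for n in mdisk_drive_counts)

print(data_spindles("raid5", (6, 7, 7)))  # Basic-RAID5 capacity (source): 17
print(data_spindles("raid5", (9, 9)))     # Basic-RAID5 performance: 16
print(data_spindles("raid0", (9, 9)))     # RAID0 performance: 18
```

By this crude count the three presets differ by only one or two data spindles, which is consistent with the observation below that a single-enclosure system shows little throughput difference between them.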
filesystem/vdisk creation

When a filesystem is created from the GUI, 3 vdisks (NSDs) are automatically created per filesystem. All the test configurations except this one use 3 vdisks. The number of vdisks (NSDs) per filesystem can be configured with the mkfs CLI command.

The async replication throughput was investigated for these storage configurations (1, 3, 6, and 12 NSDs; Basic-RAID5 Capacity, Basic-RAID5 Performance, and RAID0 Performance from the GUI "Configure Internal Storage" menu).

Fig ) mdisk configuration and number of vdisks: impact on rsync throughput

Fig ) mdisk configuration and number of vdisks: impact on scan

With only 1 V7000 enclosure, these configurations did not show much difference.
5.3 Typical Cases

Case Definition

Based on our experiences with customers, a few "typical" cases are defined, mainly varying:

Number of files to replicate in the filesystem
File size profile of the filesystem to replicate
Number of filesystems to replicate

Case | Approx num of files | Approx total capacity | Num of filesystems | File size profile (1KB, 10KB, 1MB, 10MB, 1GB, 10GB)
1    | 1mil     | 315 GB     | 1 | (1mil, 0, 0, 0, 2, 2)
2    | 100mil   | 2.2 TB     | 1 | (98mil, 0, 2mil, 0, 2, 2)
3    | 1mil     | 0.95 GB    | 1 | (1mil, 0, 0, 0, 0, 0)
4    | 1mil     | 121 GB     | 1 | (1mil, 0, 0, 0, 20, 10)
5    | 1mil     | 2.2 TB     | 1 | (8k, 0, 18k, 2k, 2, 1)
6    | 5k       | 4.1 TB     | 1 | (4k, 12k, 3k, 4k, 0, 0)
7    | 5k       | 631 GB     | 1 | (3k, 1k, 95k, 5k, 2, 0) + 2 1GB
8    | 2k x 3   | 52 GB x 3  | 3 | (1k, 8k, 15k, 5k, 1, 0) x 3
9    | 5mil x 3 | 4.8 GB x 3 | 3 | (5mil, 0, 0, 0, 0, 0) x 3

The rest of the configuration/conditions:

Filesystem configuration: GUI default (Basic-RAID5 capacity template)
  GPFS blocksize: 256 KB
  3 NSDs (vdisks) per filesystem, all dataAndMetadata
  V7000 mdisk group extent size: 256 KB
  3 RAID-5 mdisks with physical drives (7, 7, 6) on the source cluster and (7, 7, 7) on the target cluster
  Mdisk stripe size: 128 KB
Network: 1 Gbps, no latency
Async replication configuration: 2 nodes, 10 processes each; fast encryption cipher; no compression; source & target snapshot
Host workload: None
Increment ratio: 10% (5% new, 5% updated)
Replication:
  Initial replication
  Incremental replication (10% delta)

Measured values:
Elapsed time for scan/replication
Throughput (MB/sec, file/sec) for replication phase
CPU/memory/storage utilization for scan/replication

Note) As demonstrated in the Network section, throughput is always affected by the latency of the network connection. Also, the network efficiency (packet loss, packet sequencing, and bit error rates) and network switch capabilities should be configured properly.
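The capacity column of the case table can be cross-checked from the file size profiles. The sketch below assumes the six profile buckets are 1 KB, 10 KB, 1 MB, 10 MB, 1 GB, and 10 GB files, and uses binary (GiB) units:

```python
# File-count buckets correspond to (1KB, 10KB, 1MB, 10MB, 1GB, 10GB) files.
BUCKET_KB = [1, 10, 1024, 10 * 1024, 1024 ** 2, 10 * 1024 ** 2]

def profile_capacity_gb(profile):
    """Total data volume of a test-file profile, in GiB."""
    total_kb = sum(count * size for count, size in zip(profile, BUCKET_KB))
    return total_kb / 1024 ** 2

# Case 3: one million 1 KB files -> roughly the 0.95 GB in the table
print(round(profile_capacity_gb((1_000_000, 0, 0, 0, 0, 0)), 2))

# Case 4: 1mil 1 KB files + 20 x 1 GB + 10 x 10 GB -> roughly 121 GB
print(round(profile_capacity_gb((1_000_000, 0, 0, 0, 20, 10)), 1))
```

Running the same check against the other cases shows the "approx" capacities are order-of-magnitude summaries of the profiles rather than exact sums.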
Results

The detailed replication results (time and throughput) are shown for each case, along with the system resource consumption (CPU and disk) for some cases.

Case 1: 1mil files, 315 GB, 1 filesystem, file size profile (1KB, 10KB, 1MB, 10MB, 1GB, 10GB) = (1mil, 0, 0, 0, 2, 2)

Elapsed time and rsync phase throughput:
  Initial Replication:     Total, Scan, Rsync, Other, MB/sec, file/sec
  Incremental Replication: Total, Scan, Rsync, Other, MB/sec, file/sec

System resource usage:

Fig ) CPU usage on source/target V7000 Unified nodes
Fig ) Disk utilization on source/target V7000 Unified nodes

As for the memory usage, the free memory didn't change much throughout the replication. On the source side, the scan consumed around 3 GB and released the memory after the scan. The rsync phase used a maximum of around 2 GB. On the target side, rsync also used a maximum of around 2 GB.

The disk utilization graph indicates that the disk access on the source side is the bottleneck in this case.

Case 2: 100mil files, 2.2 TB, 1 filesystem, file size profile (1KB, 10KB, 1MB, 10MB, 1GB, 10GB) = (98mil, 0, 2mil, 0, 2, 2)

Elapsed time and rsync phase throughput:
  Initial Replication:     Total, Scan, Rsync, Other, MB/sec, file/sec
  Incremental Replication: Total, Scan, Rsync, Other, MB/sec, file/sec

Case 3: 1mil files, 0.95 GB, 1 filesystem, file size profile (1KB, 10KB, 1MB, 10MB, 1GB, 10GB) = (1mil, 0, 0, 0, 0, 0)

Elapsed time and rsync phase throughput:
  Initial Replication:     Total, Scan, Rsync, Other, MB/sec, file/sec
  Incremental Replication: Total, Scan, Rsync, Other, MB/sec, file/sec

Case 4: 1mil files, 121 GB, 1 filesystem, file size profile (1KB, 10KB, 1MB, 10MB, 1GB, 10GB) = (1mil, 0, 0, 0, 20, 10)

Elapsed time and rsync phase throughput:
  Initial Replication:     Total, Scan, Rsync, Other, MB/sec, file/sec
  Incremental Replication: Total, Scan, Rsync, Other, MB/sec, file/sec

Case 5: 1mil files, 2.2 TB, 1 filesystem, file size profile (1KB, 10KB, 1MB, 10MB, 1GB, 10GB) = (8k, 0, 18k, 2k, 2, 1)

Elapsed time and rsync phase throughput:
  Initial Replication:     Total, Scan, Rsync, Other, MB/sec, file/sec
  Incremental Replication: Total, Scan, Rsync, Other, MB/sec, file/sec

System resource usage:

Fig ) CPU usage on source/target V7000 Unified nodes
Fig ) Disk utilization on source/target V7000 Unified nodes

Case 6: 5k files, 4.1 TB, 1 filesystem, file size profile (1KB, 10KB, 1MB, 10MB, 1GB, 10GB) = (4k, 12k, 3k, 4k, 0, 0)

Elapsed time and rsync phase throughput:
  Initial Replication:     Total, Scan, Rsync, Other, MB/sec, file/sec
  Incremental Replication: Total, Scan, Rsync, Other, MB/sec, file/sec

Case 7: 5k files, 631 GB, 1 filesystem, file size profile (1KB, 10KB, 1MB, 10MB, 1GB, 10GB) = (3k, 1k, 95k, 5k, 2, 0) + 2 1GB

Elapsed time and rsync phase throughput:
  Initial Replication:     Total, Scan, Rsync, Other, MB/sec, file/sec
  Incremental Replication: Total, Scan, Rsync, Other, MB/sec, file/sec
System resource usage:

Fig ) CPU usage on source/target V7000 Unified nodes

Fig ) Disk utilization on source/target V7000 Unified nodes
Case 8: 2k x 3 files, 52 GB x 3, 3 filesystems, file size profile (1KB, 10KB, 1MB, 10MB, 1GB, 10GB) = (1k, 8k, 15k, 5k, 1, 0) x 3

3 filesystems (gpfs0, gpfs1, gpfs2) scheduled at the same time:

gpfs0 elapsed time and rsync phase throughput:
  Initial Replication:     Total, Scan, Rsync, Other, MB/sec, file/sec
  Incremental Replication: Total, Scan, Rsync, Other, MB/sec, file/sec

The results of the other 2 filesystems were almost identical.

System resource usage:

Fig ) CPU usage on source/target V7000 Unified nodes
Fig ) Disk utilization on source/target V7000 Unified nodes (gpfs0: dm-0 to dm-2, gpfs1: dm-3 to dm-5, gpfs2: dm-6 to dm-8)

As shown, when the replications of 3 filesystems are executed concurrently, the CPU usage reaches close to 90% on the source node. This may result in CPU warning (75%) / CPU high threshold (90%) messages (INFO level) during the rsync phase.

Only 1 filesystem (gpfs0) with the same test file profile:

gpfs0 elapsed time and rsync phase throughput:
  Initial Replication:     Total, Scan, Rsync, Other, MB/sec, file/sec
  Incremental Replication: Total, Scan, Rsync, Other, MB/sec, file/sec

The throughput is 3 times faster than when 3 filesystems are executed at the same time.

System resource usage:

Fig ) CPU usage on source/target V7000 Unified nodes
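The 3x relationship is what a simple fair-sharing model predicts: when n replications contend for the same node CPU and disks, each filesystem gets roughly 1/n of the single-filesystem throughput while the aggregate stays about flat. This is a hypothetical model for reasoning about scheduling, not a measured law:

```python
def per_fs_throughput(single_fs_mbps, n_concurrent):
    """Per-filesystem throughput when n replications share one node's
    CPU and disk bandwidth (ideal fair sharing, no contention overhead)."""
    return single_fs_mbps / n_concurrent

def aggregate_throughput(single_fs_mbps, n_concurrent):
    """Aggregate across all concurrent replications under the same model."""
    return n_concurrent * per_fs_throughput(single_fs_mbps, n_concurrent)

# With 3 filesystems replicating at once, each one runs about 3x slower
# (300.0 MB/sec is an illustrative single-filesystem rate, not a result):
print(per_fs_throughput(300.0, 3))
# ...but the node moves roughly the same total amount of data per second.
print(aggregate_throughput(300.0, 3))
```

In practice the aggregate dips slightly below flat because of contention overhead, which is one reason to consider staggering replication schedules rather than starting all filesystems together.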
Fig ) Disk utilization on source/target V7000 Unified nodes

Case 9: 5mil x 3 files, 4.8 GB x 3, 3 filesystems, file size profile (1KB, 10KB, 1MB, 10MB, 1GB, 10GB) = (5mil, 0, 0, 0, 0, 0) x 3

3 filesystems (gpfs0, gpfs1, gpfs2) scheduled at the same time:

gpfs0 elapsed time and rsync phase throughput:
  Initial Replication:     Total, Scan, Rsync, Other, MB/sec, file/sec
  Incremental Replication: Total, Scan, Rsync, Other, MB/sec, file/sec

gpfs1 elapsed time and rsync phase throughput:
  Initial Replication:     Total, Scan, Rsync, Other, MB/sec, file/sec
  Incremental Replication: Total, Scan, Rsync, Other, MB/sec, file/sec

gpfs2 elapsed time and rsync phase throughput:
  Initial Replication:     Total, Scan, Rsync, Other, MB/sec, file/sec
  Incremental Replication: Total, Scan, Rsync, Other, MB/sec, file/sec
Only 1 filesystem (gpfs0) with the same test file profile:

gpfs0 elapsed time and rsync phase throughput:
  Initial Replication:     Total, Scan, Rsync, Other, MB/sec, file/sec
  Incremental Replication: Total, Scan, Rsync, Other, MB/sec, file/sec
More informationV. Mass Storage Systems
TDIU25: Operating Systems V. Mass Storage Systems SGG9: chapter 12 o Mass storage: Hard disks, structure, scheduling, RAID Copyright Notice: The lecture notes are mainly based on modifications of the slides
More informationExam Name: Midrange Storage Technical Support V2
Vendor: IBM Exam Code: 000-118 Exam Name: Midrange Storage Technical Support V2 Version: 12.39 QUESTION 1 A customer has an IBM System Storage DS5000 and needs to add more disk drives to the unit. There
More informationCatalogic DPX TM 4.3. ECX 2.0 Best Practices for Deployment and Cataloging
Catalogic DPX TM 4.3 ECX 2.0 Best Practices for Deployment and Cataloging 1 Catalogic Software, Inc TM, 2015. All rights reserved. This publication contains proprietary and confidential material, and is
More informationThe Google File System
The Google File System Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung SOSP 2003 presented by Kun Suo Outline GFS Background, Concepts and Key words Example of GFS Operations Some optimizations in
More informationSurFS Product Description
SurFS Product Description 1. ABSTRACT SurFS An innovative technology is evolving the distributed storage ecosystem. SurFS is designed for cloud storage with extreme performance at a price that is significantly
More informationCostefficient Storage with Dataprotection
Costefficient Storage with Dataprotection for the Cloud Era Karoly Vegh Principal Systems Consultant / Central and Eastern Europe March 2017 Safe Harbor Statement The following is intended to outline our
More informationTHE SUMMARY. CLUSTER SERIES - pg. 3. ULTRA SERIES - pg. 5. EXTREME SERIES - pg. 9
PRODUCT CATALOG THE SUMMARY CLUSTER SERIES - pg. 3 ULTRA SERIES - pg. 5 EXTREME SERIES - pg. 9 CLUSTER SERIES THE HIGH DENSITY STORAGE FOR ARCHIVE AND BACKUP When downtime is not an option Downtime is
More informationFile System Implementation
File System Implementation Last modified: 16.05.2017 1 File-System Structure Virtual File System and FUSE Directory Implementation Allocation Methods Free-Space Management Efficiency and Performance. Buffering
More informationIBM System Storage SAN Volume Controller IBM Easy Tier enhancements in release
IBM System Storage SAN Volume Controller IBM Easy Tier enhancements in 7.5.0 release Kushal S. Patel, Shrikant V. Karve, Sarvesh S. Patel IBM Systems, ISV Enablement July 2015 Copyright IBM Corporation,
More informationCHAPTER 11: IMPLEMENTING FILE SYSTEMS (COMPACT) By I-Chen Lin Textbook: Operating System Concepts 9th Ed.
CHAPTER 11: IMPLEMENTING FILE SYSTEMS (COMPACT) By I-Chen Lin Textbook: Operating System Concepts 9th Ed. File-System Structure File structure Logical storage unit Collection of related information File
More informationWebscale, All Flash, Distributed File Systems. Avraham Meir Elastifile
Webscale, All Flash, Distributed File Systems Avraham Meir Elastifile 1 Outline The way to all FLASH The way to distributed storage Scale-out storage management Conclusion 2 Storage Technology Trend NAND
More information