Implementation of LBGM on IBM Storwize V7000 & Performance Monitoring

A paper by Suman Debnath
System and Technology Group Lab Services
India Software Lab, IBM
Feb 2012
Table of contents

1. Introduction to Global Mirror with Change Volume
   1.1. Change Volume
   1.2. Cycle Mode and Options
   1.3. Cycling Period
2. Global Mirror with Change Volume Configuration
   2.1. IBM Storwize V7000 Partnership Creation GUI
   2.2. Implementation of Global Mirror with Change volume via GUI
   2.3. IBM Storwize V7000 Partnership Creation CLI
   2.4. Implementation of Global Mirror with Change volume via CLI
3. Performance Monitoring
   3.1. Performance Monitoring via Web Tool
   3.2. Performance Monitoring via CLI
4. Summary
Resources
Trademarks and special notices
1. Introduction to Global Mirror with Change Volume

Global Mirror provides an asynchronous copy, which means that the secondary volume is not an exact match of the primary volume at every point in time. It provides a long-distance asynchronous remote mirroring function over distances of up to approximately 8000 km (5000 miles) between sites. In Global Mirror, write operations are completed on the primary site and the write acknowledgement is sent to the host before the write is received at the secondary site. The update is sent to the secondary site at a later stage, which makes it possible to perform remote copy over distances that exceed the limitations of synchronous remote copy.

Regular Global Mirror is designed to achieve a recovery point objective (RPO) as low as possible, so that data is as up-to-date as possible. This places strict requirements on your infrastructure, and in certain situations, such as low network link quality or congested or overloaded hosts, you may be impacted by multiple 1920 errors (congestion errors).

Congestion errors happen in three primary situations:
- Congestion at the source site via host or network.
- Congestion on the network link or network path.
- Congestion at the target site via host or network.

Original Global Mirror (GM):
- Non-configurable RPO of generally only a few seconds
- All writes are sent in sequence number order, ensuring consistency at the DR site
- Requires communication bandwidth to handle the peak data change rate

With 6.3.0, Global Mirror receives new functionality designed to address a few conditions that negatively impact some Global Mirror implementations:
- Estimation of bandwidth requirements tends to be complex.
- It is often difficult to guarantee that the latency and bandwidth requirements can be met.
- Congested hosts on either the source or target site can cause disruption.
- Congested network links can cause disruption with only intermittent peaks.

New GM using Change Volumes (LBGM, Low Bandwidth Global Mirror):
- Configurable RPO
- Uses the FlashCopy function under the covers
- RPO can be configured from a few minutes to 24 hours
  - Will fluctuate depending on the bandwidth available and the data change rate

In order to address these issues, Change Volumes have been added as an option for Global Mirror relationships. Change Volumes leverage the FlashCopy functionality, but they are special purpose only and cannot be manipulated as FlashCopy volumes. Change Volumes provide the ability to replicate point-in-time images on a cycling period (default 300 seconds). This means that the replicated change rate only needs to cover the state of the data at the point in time the image was taken, instead of all the updates made during the period. This can significantly reduce the replication volume.
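The reduction can be illustrated with a rough model in Python. This is only a sketch under stated assumptions: the function names and the workload are hypothetical, the 256 KB grain size is the figure quoted in this paper for change volumes, and metadata and protocol overhead are ignored. Without change volumes every write is sent; with change volumes each grain touched during the cycle is sent once, however many times it was rewritten.

```python
GRAIN_SIZE = 256 * 1024  # bytes; grain size quoted in this paper (assumed fixed here)


def bytes_without_change_volumes(writes):
    """Every write is replicated, so the total is the sum of all write lengths."""
    return sum(length for _offset, length in writes)


def bytes_with_change_volumes(writes, grain_size=GRAIN_SIZE):
    """Each grain touched during the cycle is replicated once, as a whole grain."""
    touched = set()
    for offset, length in writes:
        if length <= 0:
            continue
        first = offset // grain_size
        last = (offset + length - 1) // grain_size
        touched.update(range(first, last + 1))
    return len(touched) * grain_size


# Hypothetical workload for one cycle: (offset, length) pairs in bytes.
# 1,000 rewrites of the same 4 KiB block, plus one 1 MiB write at offset 10 MiB.
writes = [(0, 4096)] * 1000 + [(10 * 1024 * 1024, 1024 * 1024)]

print(bytes_without_change_volumes(writes))  # 5144576
print(bytes_with_change_volumes(writes))     # 1310720
```

In this model the saving comes entirely from repeated writes to the same grains. A workload of scattered one-off small writes would not benefit in the same way, because each touched grain is counted at its full size.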
Figure: Global Mirror without Change Volume
- Dependent writes to the primary are sent with a sequence number to the secondary site to ensure they are applied in the same order
- 80 ms round-trip latency maximum
- Bandwidth sized for peak change rates
- RPO of seconds, non-configurable

Figure: Global Mirror with Change Volume
- Initially, all data is copied from the primary volume to the secondary volume as of the point in time when the GM relationship started
- Change volumes hold point-in-time copies of the pieces of data, called grains (256 KB), that change during cycling mode
- 80 ms round-trip latency maximum
- Bandwidth sized based on the RPO desired
- RPO of minutes to hours

1.1. Change Volume

With Change Volumes, a FlashCopy mapping exists between the primary volume and the primary Change Volume. The mapping is updated on the cycling period (60 seconds to 1 day). The primary Change Volume is then replicated to the secondary Global Mirror volume at the target site, which is then captured in another Change Volume on the target site. This provides an always-consistent image at the target site and protects your data from being inconsistent during resynchronization.

- Change volumes are used to record changes to the Primary and Secondary volumes of a remote copy relationship
- A FlashCopy mapping exists between a Primary and its Change volume, and between a Secondary and its Change volume

Change volume requirements:
- Must be the same virtual size and in the same I/O group as the Primary or Secondary volume of the remote copy relationship
- The Change volume will be Thin Provisioned (TP)
- Cannot be used for host I/O and cannot be mapped to a host
- Cannot be used by remote copy or FlashCopy

The FlashCopy mapping is for internal use:
- The user cannot manipulate it like a normal FlashCopy mapping
- Most CLI commands will fail
- Uses the nocopy option, so minimal real capacity is used on the TP change volumes

1.2. Cycle Mode and Options

Starting with SVC 6.3, Global Mirror can operate with or without cycling. When operating without cycling, write operations are applied to the secondary volume as soon as possible after they are applied to the primary volume. The secondary volume is generally less than 1 second behind the primary volume, which minimizes the amount of data that must be recovered in the event of a failover. However, this requires that a high-bandwidth link be provisioned between the two sites.

When Global Mirror operates in cycling mode, changes are tracked and, where needed, copied to intermediate change volumes. Changes are transmitted to the secondary site periodically. The secondary volumes are much further behind the primary volume, and more data must be recovered in the event of a failover. However, because the data transfer can be smoothed over a longer time period, lower bandwidth is required to provide an effective solution.

- Initially sends all data from the Primary to the Secondary for a given point in time; the cycle then starts again, sending the deltas since the last point in time
- Change volumes at both the primary and secondary are used as follows:
  - Change volume for the Primary: stores copy-on-write changes from the Primary that may need to be sent to the Secondary to preserve the point in time when the cycle last started
  - Change volume for the Secondary: used to ensure a consistent point-in-time image for recovery at DR
- Every X seconds the cycle is automatically started; X is called the cycling period and is configurable
- The Primary and its Change volume are read and the data is copied to the remote system to establish a point in time at the DR site
- The Secondary and its Change volume automatically preserve that point in time when the cycle starts again
- This ensures there is always a consistent copy at the DR site

The options for -cyclingmode are none, single, and multi. This parameter specifies the behavior of Global Mirror for the relationship:
- none (the default) gives identical behavior to Global Mirror in previous versions of SVC and Storwize V7000
- single uses the cycling protocol but stops after one complete cycle of data has been replicated to the remote system, giving similar behavior to a FlashCopy across sites
- multi uses the cycling protocol

To start a relationship with the cycling mode set to single or multi, Change volumes must be defined for the relationship. The administrator can create the Change volumes, or let the system create them via the GUI when configuring GM with the Change Volumes option.

Note that the cycling mode can only be changed when the relationship is stopped and in the consistent_stopped or inconsistent_stopped state.

1.3. Cycling Period

Change Volumes provide the ability to replicate point-in-time images on a cycling period (default 300 seconds). This means that the replicated change rate only needs to cover the state of the data at the point in time the image was taken, instead of all the updates made during the period. The cycling period can be adjusted from the CLI with the chrcrelationship -cycleperiodseconds <60-86400> command. If a copy does not complete within the cycle period, the next cycle will not start until the prior one has completed. For this reason, using change volumes gives you two possibilities for RPO:

1. If your replication completes within the cycling period, then your RPO is twice the cycling period.
2. If your replication does not complete within the cycling period, then your RPO is twice the completion time. The next cycle will start immediately after the prior one has finished.

The factor of two reflects the worst case: a write made just after a cycle starts is not captured until the next cycle starts, and that cycle's copy must then complete before the write is safe at the secondary.

- The user configures the cycling period
  - This is the minimum time between cycle starts
  - Default is 300 seconds
- If the link is slow or I/O throughput is high, it may take longer than this for the new data to be transferred to the secondary
  - If so, the next copy will start as soon as the last one has finished
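The two RPO cases and the cycle-start rule can be sketched in a few lines of Python. This is a simplified model for illustration only: the function names are hypothetical, times are in seconds, and each cycle's copy duration is assumed to be known in advance.

```python
def cycle_start_times(period_s, copy_durations_s):
    """Start time of each cycle: at least one period after the previous start,
    and never before the previous copy has finished."""
    starts = []
    t = 0
    for duration in copy_durations_s:
        starts.append(t)
        t = max(t + period_s, t + duration)
    return starts


def expected_rpo_s(period_s, copy_duration_s):
    """Rule of thumb from the text: twice the cycling period, or twice the
    completion time when the copy overruns the period."""
    return 2 * max(period_s, copy_duration_s)


# Default 300-second period; the second copy overruns and delays the third cycle.
print(cycle_start_times(300, [120, 450, 200]))  # [0, 300, 750]
print(expected_rpo_s(300, 120))                 # 600
print(expected_rpo_s(300, 450))                 # 900
```

The same model shows why shortening the cycling period stops helping once copies regularly take longer than the period: the RPO is then governed by the completion time, which depends on link bandwidth and change rate.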
2. Global Mirror with Change Volume Configuration

2.1. IBM Storwize V7000 Partnership Creation GUI

A partnership is created between the two IBM Storwize V7000 storage systems that are to be used for the Disaster Recovery solution. In the IBM Storwize V7000 GUI, you manage the partnership between the two storage systems on the Partnership panel. To access the Partnership panel, select the Copy Services function and click Partnership, as shown in the figure.

Click the New Partnership button to create a new partnership with another cluster, as shown in the figure.
Enter a bandwidth (MBps) that is used by the background copy process between the clusters in the partnership. Set this value so that it is less than or equal to the bandwidth that can be sustained by the communication link between the clusters. The link must be able to sustain any host requests as well as the rate of background copy.

The bandwidth setting is a global setting: it applies to both Metro Mirror (MM) and Global Mirror (GM), and only to the initial sync and resync between the primary and secondary volumes of a relationship. The setting is in megabytes per second (MBytes), not megabits (Mbits), and should initially be set low relative to the available bandwidth of the communication link. It is a dynamic setting, so you can adjust it as needed. Later versions of SVC and Storwize V7000 also have a system-wide setting that limits the copy rate for initial sync and resync on a per-relationship basis. The default is 25 MBps, which is usually fine because you will probably have many relationships trying to sync or resync at the same time.

After the partnership definition is completed on the first IBM Storwize V7000, the Partnership panel on that system displays Partially Configured: Local. Perform the partnership creation on the remote storage system, at which point the Partnership panel displays Fully Configured.

2.2. Implementation of Global Mirror with Change volume via GUI

A Remote Copy relationship can be defined between two volumes, where one is the master (source) volume and the other is the auxiliary (target) volume of exactly the same size. With Global Mirror with Change Volume, a Change Volume is needed for both the master and the auxiliary volume. The change volume (a thin-provisioned volume) can be created beforehand, with the same size as the master and auxiliary volumes.
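Because the partnership bandwidth described in section 2.1 is entered in megabytes per second while WAN links are usually quoted in megabits per second, a quick conversion helps when choosing the value. The Python sketch below is illustrative only: the function names, the 155 Mbit/s link, and the 500 GB volume are hypothetical, 1 GB is taken as 1024 MB, and compression and protocol overhead are ignored.

```python
def mbytes_to_mbits(mbytes_per_s):
    """Convert a rate in MBytes/s to Mbits/s (8 bits per byte)."""
    return mbytes_per_s * 8


def background_copy_fits(copy_rate_mbytes_per_s, link_mbits_per_s):
    """True if the background copy rate alone does not exceed the link rate.
    Host write traffic needs headroom on top of this."""
    return mbytes_to_mbits(copy_rate_mbytes_per_s) <= link_mbits_per_s


def initial_sync_hours(volume_gb, copy_rate_mbytes_per_s):
    """Rough time to copy a full volume at a constant rate (1 GB = 1024 MB)."""
    return volume_gb * 1024 / copy_rate_mbytes_per_s / 3600


print(mbytes_to_mbits(25))                    # 200
print(background_copy_fits(25, 155))          # False
print(round(initial_sync_hours(500, 25), 2))  # 5.69
```

In this example a 25 MBps setting would ask for 200 Mbit/s, more than the hypothetical 155 Mbit/s link can carry, so a lower value would be the safer starting point.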
To create a new Remote Copy relationship, go to Copy Services > Remote Copy.

Click New Relationship to start the Global Mirror with Change Volume wizard.

Click Global Mirror with Change Volumes.

Select the Remote Copy auxiliary (target) storage system: either the local system or the already defined second storage system as the Remote Copy partner.
Specify the Remote Copy master and auxiliary volumes. Both volumes must have the same size.

Next, select the Change Volume for the master (primary) volume: select Yes, add a master volume and click Next.

Because we are creating a new Global Mirror with Change Volume relationship, select Create a new master volume and click Finish.
Recheck the master and auxiliary volumes and click Next.

Because we are creating a new Global Mirror with Change Volume relationship in which the auxiliary (target) volume is not yet synchronized, select No, the volumes are not synchronized and click Next.

Because the Change Volume for the auxiliary volume has not yet been created and set, select No, do not start copying and click Finish.
Now log in to the DR V7000 storage system and create the Change Volume for the auxiliary volume, with the same size (thin-provisioned volume).

On the DR V7000 itself, go to Copy Services > Remote Copy and select the relationship that was just created.
Right-click the relationship and go to Global Mirror Change Volumes > Add Existing. Here we take the Change Volume that was created in the previous step and associate it with the auxiliary volume.

Select the Change Volume and click Add.

Now go back to the DC V7000, go to Copy Services > Remote Copy, right-click the relationship that was created, and click Edit Properties.
Change the Cycling Mode to Multiple, which makes the Global Mirror relationship behave as Global Mirror with Change Volume, and set the Cycle Period based on the required RPO and the available bandwidth.

Click OK to apply the change.
We are now all set to start the Global Mirror with Change Volume relationship. Right-click the relationship and click Start.

Now it is time to monitor the progress.

2.3. IBM Storwize V7000 Partnership Creation CLI

To verify that the clustered systems can communicate with each other:

    V7K_1> lspartnershipcandidate
    V7K_2> lspartnershipcandidate

Pre-verification of the system configuration:

    V7K_1> lspartnership
    V7K_2> lspartnership

Creating the partnership between the DC V7000 (V7K_1) and the DR V7000 (V7K_2):

    V7K_1> mkpartnership -bandwidth <bandwidth value in MBps> V7K_2
    V7K_1> lspartnership
    V7K_2> mkpartnership -bandwidth <bandwidth value in MBps> V7K_1
    V7K_2> lspartnership
2.4. Implementation of Global Mirror with Change volume via CLI

Now we can create the Global Mirror relationship. Make sure you have already created the auxiliary volume with exactly the same size as the master volume.

    V7K_1> mkrcrelationship -master VOL1_DC -aux VOL1_DR -cluster V7K_2 -name GM_REL1 -global
    V7K_1> lsrcrelationship

Create thin-provisioned change volumes at both the DC and DR V7000:

    V7K_1> mkvdisk -iogrp 0 -mdiskgrp 0 -size 10 -unit gb -rsize 20% -autoexpand -grainsize 32 -name VOL1_DC_CHANGE_VOL
    V7K_2> mkvdisk -iogrp 0 -mdiskgrp 0 -size 10 -unit gb -rsize 20% -autoexpand -grainsize 32 -name VOL1_DR_CHANGE_VOL

Set the cycling mode on the GM relationship:

    V7K_1> chrcrelationship -cyclingmode multi GM_REL1
    V7K_1> chrcrelationship -cycleperiodseconds <60-86400> GM_REL1

Set the Change Volume for the master volume (the final argument is the relationship name):

    V7K_1> chrcrelationship -masterchange VOL1_DC_CHANGE_VOL GM_REL1
    V7K_1> lsrcrelationship GM_REL1

Set the Change Volume for the auxiliary volume:

    V7K_2> chrcrelationship -auxchange VOL1_DR_CHANGE_VOL GM_REL1
    V7K_2> lsrcrelationship GM_REL1

Start the standalone relationship with cycling mode:

    V7K_1> startrcrelationship GM_REL1
    V7K_1> lsrcrelationship GM_REL1
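When the same steps have to be repeated for many relationships, the command strings can be generated rather than typed. The Python sketch below is a hypothetical helper, not an IBM tool: it only formats text using the commands and flags that appear in this section, checks the 60 to 86400 second range for the cycle period, and assumes that chrcrelationship takes the relationship name as its final argument. It does not connect to any system.

```python
def gmcv_commands(rel, master_change, aux_change, cycle_period_s=300):
    """Return (system, command) pairs that enable cycling on an existing
    Global Mirror relationship. V7K_1 is the DC system, V7K_2 the DR system."""
    if not 60 <= cycle_period_s <= 86400:
        raise ValueError("cycle period must be between 60 and 86400 seconds")
    return [
        ("V7K_1", f"chrcrelationship -cyclingmode multi {rel}"),
        ("V7K_1", f"chrcrelationship -cycleperiodseconds {cycle_period_s} {rel}"),
        ("V7K_1", f"chrcrelationship -masterchange {master_change} {rel}"),
        ("V7K_2", f"chrcrelationship -auxchange {aux_change} {rel}"),
        ("V7K_1", f"startrcrelationship {rel}"),
    ]


for system, command in gmcv_commands(
    "GM_REL1", "VOL1_DC_CHANGE_VOL", "VOL1_DR_CHANGE_VOL", cycle_period_s=600
):
    print(f"{system}> {command}")
```

The relationship and the change volumes themselves still have to be created first with mkrcrelationship and mkvdisk as shown above.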
3. Performance Monitoring

Assume that a Brocade 7600/7800 FCIP router or an IBM SAN06B-R router is installed at both the DC and DR sites and that the FCIP tunnel has already been created. After starting the Global Mirror with Change Volume replication between DC and DR, we can monitor its progress in terms of performance (throughput, compression if enabled, and so on) in the following ways.

3.1. Performance Monitoring via Web Tool

Log in to the DC and DR FCIP routers via the web tool.
Go to Monitor > Performance Monitor.

Go to Performance Graphs > Tunnel and TCP Graph.
Select Throughput, Effective Throughput, and Compression Ratio to populate the graphs.

After some time, the graphs become populated with more data.
3.2. Performance Monitoring via CLI

Telnet to the FCIP routers located at the DC site (on both fabrics):

    DC_FCIP1> switchshow
    DC_FCIP1> portshow fciptunnel <VE_Port>
    DC_FCIP1> portshow fciptunnel <VE_Port> -perf
    DC_FCIP1> portshow fciptunnel <VE_Port> -c -tcp

We can also monitor the VE_Port throughput (in kbytes/sec) with the following command:

    DC_FCIP1> portperfshow
4. Summary

In this paper, we considered two IBM Storwize V7000 systems, located at the DC and DR sites, running microcode v6.3, with a Brocade 7800 FCIP router at both sites. With microcode v6.3, we can now leverage the Global Mirror with Change Volume feature to asynchronously replicate data from DC to DR even with less available bandwidth. This paper provides a baseline for setting up a GM with Change Volume relationship and for monitoring its progress in terms of performance once it has started.
Resources

These Web sites provide useful references to supplement the information contained in this document:

- IBM Storwize V7000 Info Center
  http://publib.boulder.ibm.com/infocenter/storwize/ic/index.jsp
- IBM System Storage b-type Multiprotocol Routing
  http://www.redbooks.ibm.com/abstracts/sg247544.html?open
- Implementing the IBM Storwize V7000 V6.3
  http://www.redbooks.ibm.com/abstracts/sg247938.html?open
- Using the SVC for Business Continuity
  http://www.redbooks.ibm.com/abstracts/sg247371.html?open
Trademarks and special notices

Copyright IBM Corporation 2012. All rights Reserved.

References in this document to IBM products or services do not imply that IBM intends to make them available in every country.

IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at www.ibm.com/legal/copytrade.shtml.