APPLICATION NOTE

Simple Chassis Cluster Upgrade
SRX Series Services Gateways for the Branch

Upgrade Junos OS with Minimal Traffic Disruption and a Single Command

Copyright 2013, Juniper Networks, Inc.
Table of Contents

Introduction
Scope
Design Considerations
    Hardware Requirements
    Software Requirements
Description and Deployment Scenario
Simple Cluster Upgrade Process
CLI Command and Syntax
Simple Cluster Upgrade Example
    Before Chassis Cluster Upgrade
    Simple Cluster Upgrade Process
    After Completion of Simple Cluster Upgrade
Error Handling and Recovery from Failure During SCU
Limitations in Simple Cluster Upgrade (SCU)
Conclusion
About Juniper Networks
Introduction

Juniper Networks SRX Series Services Gateways for the branch integrate carrier-class routing, comprehensive security, and feature-rich Ethernet switching in a single device. These platforms provide several high availability (HA) options for deployment in mission-critical networks. SRX Series gateways also support a chassis cluster feature, which provides HA for security features such as firewall, IPsec VPN, intrusion prevention system (IPS), and unified threat management (UTM). With chassis clustering, the SRX Series can provide active/backup or active/active redundancy through session and configuration synchronization between the two cluster nodes.

Upgrading software on a branch SRX Series chassis cluster can be tedious, however. Without SCU, one must break up the cluster, upgrade each node separately, and then re-create the chassis cluster. This process requires considerable downtime, and no single command completes the entire process.

In Juniper Networks Junos operating system (Junos OS) 11.2R2, branch SRX Series gateways introduce a feature called Simple Cluster Upgrade (SCU), which reduces the chassis cluster upgrade to a single command with minimal traffic disruption. As a significant added benefit, the process does not require breaking up the cluster.

Scope

As of Junos OS 11.2R2, only branch SRX Series platforms support this feature. All information in this document is limited to SRX Series for the branch platforms. SRX Series for the high end supports unified in-service software upgrade (unified ISSU), which is more comprehensive than SCU and does not require any traffic disruption.

Design Considerations

Hardware Requirements

Juniper Networks SRX210, SRX210E, SRX220, SRX240, SRX550, and SRX650 Services Gateways

Software Requirements

Junos OS 11.2R2 or greater

Description and Deployment Scenario

Simple Cluster Upgrade (SCU) simplifies the software upgrade process for chassis cluster nodes. The feature introduces a single command-line interface (CLI) command (or management interface) that upgrades or downgrades both cluster nodes with minimal traffic disruption (around 30 seconds). The process can be initiated remotely through the CLI, Junos Web, or Juniper Networks Network and Security Manager (NSM) without breaking up the chassis cluster; note, however, that the Limitations section lists Junos Web and NSM as not supported as of Junos OS 11.2R2.

During the process, the SRX Series device experiences redundancy group failovers. Before upgrading, the system validates the package and checks version compatibility. If the new package is not compatible with the currently installed version, the device refuses the upgrade and asks the user to take corrective action.
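The software requirement above (Junos OS 11.2R2 or greater) can be checked before an upgrade is attempted. The following Python sketch is only an illustration and is not part of Junos OS: the function names and the regular expression are assumptions, and the parser deliberately handles only release strings of the form 11.2R2 or 11.2R3.3, rejecting anything else rather than guessing.

```python
import re
from typing import Tuple

# Minimum release that supports SCU, taken from the Software Requirements.
MIN_SCU_RELEASE = (11, 2, 2)

# Hypothetical pattern: matches strings such as "11.2R2" or "11.2R3.3".
# Other naming schemes (for example daily builds) are intentionally rejected.
_RELEASE_RE = re.compile(r"^(\d+)\.(\d+)R(\d+)(?:\.\d+)?$")


def parse_release(release: str) -> Tuple[int, int, int]:
    """Return (major, minor, maintenance) for a release string."""
    match = _RELEASE_RE.match(release.strip())
    if match is None:
        raise ValueError(f"unrecognized release string: {release!r}")
    return (int(match.group(1)), int(match.group(2)), int(match.group(3)))


def supports_scu(release: str) -> bool:
    """True if the release is at or above the documented SCU minimum."""
    return parse_release(release) >= MIN_SCU_RELEASE
```

A script could call supports_scu() on both the installed and the target release and refuse to proceed if either call returns False or raises ValueError.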
Simple Cluster Upgrade Process

The following steps, taken from the flow diagram, describe the states in the SCU process. Node labels that appeared only in the diagram graphics are given here by role (primary, secondary).

0. Initial state: primary and secondary both run Junos OS 11.2R2. Node 0 is primary and the other node is secondary.
1. The command request system software in-service-upgrade <11.2R3.tar.gz> is executed to upgrade the cluster to 11.2R3.
2. The MGD daemon copies the image to /var/tmp on the primary.
3. The primary copies the image to /var/tmp on the secondary.
4. The secondary node is upgraded with the unlink and verify options. If an error occurs, SCU is aborted and the image is deleted.
5. The primary node is upgraded with the options supplied by the user. The verify option is skipped because verification was already done on the secondary.
6. The SCU bit is set in the EEPROM of the secondary, and all redundancy groups (RGs) fail over to the primary.
7. The secondary node reboots, and the primary enters the SCU window state.
8. After reboot, the secondary reads the EEPROM, enters the SCU window state, and blocks data traffic; the primary continues forwarding.
9. Both nodes exchange SCU heartbeats. Although both nodes are primary, only one node (the old primary) is forwarding traffic.
10. The older primary reboots.
11. The new primary stops receiving SCU heartbeats. Once the heartbeats stop, it leaves the SCU window, clears the EEPROM, and starts forwarding data traffic.
12. After the old primary reboots, it becomes secondary because the other node is already primary.
13. Once the process is complete, both nodes run the new image (11.2R3) and have swapped RG0 states.
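The sequence above can be traced with a small simulation. This Python sketch is a simplified model under stated assumptions, not Juniper's implementation: the Node class, its field names, and the single forwarding flag are hypothetical, and the node names node0 and node1 are borrowed from the example output later in this document.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Node:
    """Hypothetical model of one cluster node; not a Junos data structure."""
    name: str
    version: str
    rg0_role: str               # "primary" or "secondary"
    in_scu_window: bool = False
    forwarding: bool = False


def simulate_scu(old: str, new: str) -> Tuple[Node, Node, List[str]]:
    """Walk two nodes through the SCU steps and return their final state."""
    first = Node("node0", old, "primary", forwarding=True)
    second = Node("node1", old, "secondary")
    trace: List[str] = []

    # The image is staged and installed on the secondary first, then on the
    # primary. The new software only takes effect when each node reboots.
    trace.append("image installed on secondary, then primary")

    # All redundancy groups fail over to the primary, the secondary reboots
    # into the new image, and the primary enters the SCU window state.
    second.version = new
    first.in_scu_window = True
    trace.append("secondary rebooted; primary in SCU window")

    # The rebooted node enters the SCU window and blocks data traffic.
    # Both nodes now act as primary, but only the old primary forwards.
    second.in_scu_window = True
    second.rg0_role = "primary"
    trace.append("both primary; only old primary forwarding")

    # The old primary reboots. The new primary stops receiving SCU
    # heartbeats, leaves the SCU window, and starts forwarding.
    first.forwarding = False
    first.version = new
    second.in_scu_window = False
    second.forwarding = True
    trace.append("old primary rebooted; new primary forwarding")

    # The old primary comes back up and takes the secondary role.
    first.in_scu_window = False
    first.rg0_role = "secondary"
    trace.append("old primary rejoined as secondary")
    return first, second, trace
```

Calling simulate_scu("11.2R2", "11.2R3") ends with both nodes on the new version and the RG0 roles swapped, which matches the final step of the flow.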
CLI Command and Syntax

With a single CLI command, SCU upgrades or downgrades both nodes of a cluster with minimal impact:

    request system software in-service-upgrade <path of software image> no-sync [unlink]

The SCU process can be stopped by issuing the command shown below:

    request system software abort in-service-upgrade

Simple Cluster Upgrade Example

Before Chassis Cluster Upgrade

    root@srx650-1> show version
    node0:
    JUNOS Software Release [11.4-20111021.0]
    node1:
    JUNOS Software Release [11.4-20111021.0]
    {primary:node0}

Simple Cluster Upgrade Process

    root@srx650-1> request system software in-service-upgrade /b/junos-srxsme-11.2R3.3-domestic.tgz no-sync no-validate
    ISSU: Validating package
    Saving state for rollback...
    ISSU: finished upgrading on secondary node node1
    ISSU: start upgrading software package on primary node
    ISSU: failover all redundancy-groups 1...n to primary node
    Successfully reset all redundancy-groups priority back to configured ones.
    Redundancy-groups-0 will not be reset and the primaryship remains unchanged.
    Successfully reset all redundancy-groups priority back to configured ones.
    ISSU: rebooting Secondary Node
    Shutdown NOW!
    [pid 2114]
    ISSU: Waiting for secondary node node1 to reboot.
    ISSU: went down
    ISSU: Waiting for to come up
    ISSU: came up
    ISSU: secondary node node1 booted up.
    Shutdown NOW!
    [pid 1857]
    *** FINAL System shutdown message from root@srx650-1 ***
    System going down IMMEDIATELY
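For automation, the command strings in this section can be assembled programmatically before being sent to the device. The Python sketch below is a minimal illustration: the helper name build_scu_command and its parameters are hypothetical, and it only composes text from the keywords that appear in this document (no-sync, unlink, no-validate). It does not connect to or execute anything on a device.

```python
# Abort command exactly as given in this document.
ABORT_COMMAND = "request system software abort in-service-upgrade"


def build_scu_command(image_path: str,
                      unlink: bool = False,
                      no_validate: bool = False) -> str:
    """Compose the SCU command line for a given image path.

    Hypothetical helper: it only builds the string and leaves package
    validation and execution to the device.
    """
    if not image_path or any(ch.isspace() for ch in image_path):
        raise ValueError("image path must be non-empty and contain no whitespace")
    # no-sync is always included, as in the documented syntax.
    parts = ["request system software in-service-upgrade", image_path, "no-sync"]
    if unlink:
        parts.append("unlink")
    if no_validate:
        parts.append("no-validate")
    return " ".join(parts)
```

With the image path from the example, build_scu_command("/b/junos-srxsme-11.2R3.3-domestic.tgz", no_validate=True) reproduces the command line used in the upgrade example above.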
After Completion of Simple Cluster Upgrade

    root@srx650-1> show version
    node0:
    JUNOS Software Release [11.2R3.3]
    node1:
    JUNOS Software Release [11.2R3.3]
    {primary:node1}
    root@srx650-1>

Error Handling and Recovery from Failure During SCU

1. Whether SCU fails or completes successfully, the software image is deleted from the secondary node so that no disk space is wasted on the secondary.
2. The abort command clears the SCU state on both nodes so that they resume normal operation.
3. If the secondary boots with a backup image, the old primary detects this and aborts the SCU process. The old primary continues forwarding data traffic, but console access is needed to roll back the software image on the secondary.
4. If the secondary fails to boot, the primary times out and aborts the SCU process. The primary continues forwarding data traffic, and the failed secondary can be recovered through the console.
5. If the upgrade of the primary fails, the secondary image is rolled back and the SCU process is aborted.
6. If the primary fails to reboot with the new image or comes up with a backup image, the secondary (new primary) starts forwarding packets; however, console access is needed to recover the old primary.
7. Finally, an error occurs if SCU is started from a Junos OS version that supports SCU toward a version that does not.

Limitations in Simple Cluster Upgrade (SCU)

1. SCU is not ISSU; about 30 seconds of downtime is expected during the SCU process. In addition, sessions and route tables need to be re-created on the new primary node.
2. As of Junos OS 11.2R2, SCU is not supported through Junos Web and NSM.
3. Only one SCU session can be active at a time; an error occurs if multiple SCU processes are started simultaneously.

Conclusion

Simple Cluster Upgrade (SCU) on the SRX Series Services Gateways greatly simplifies the software upgrade process for chassis cluster nodes. With a single command, both nodes of a chassis cluster can be upgraded or downgraded with minimal traffic disruption and without breaking up the cluster.
About Juniper Networks

Juniper Networks is in the business of network innovation. From devices to data centers, from consumers to cloud providers, Juniper Networks delivers the software, silicon and systems that transform the experience and economics of networking. The company serves customers and partners worldwide. Additional information can be found at www.juniper.net.

Corporate and Sales Headquarters
Juniper Networks, Inc.
1194 North Mathilda Avenue
Sunnyvale, CA 94089 USA
Phone: 888.JUNIPER (888.586.4737) or 408.745.2000
Fax: 408.745.2100
www.juniper.net

APAC and EMEA Headquarters
Juniper Networks International B.V.
Boeing Avenue 240
1119 PZ Schiphol-Rijk
Amsterdam, The Netherlands
Phone: 31.0.207.125.700
Fax: 31.0.207.125.701

To purchase Juniper Networks solutions, please contact your Juniper Networks representative at 1-866-298-6428 or authorized reseller.

Copyright 2013 Juniper Networks, Inc. All rights reserved. Juniper Networks, the Juniper Networks logo, Junos, NetScreen, and ScreenOS are registered trademarks of Juniper Networks, Inc. in the United States and other countries. All other trademarks, service marks, registered marks, or registered service marks are the property of their respective owners. Juniper Networks assumes no responsibility for any inaccuracies in this document. Juniper Networks reserves the right to change, modify, transfer, or otherwise revise this publication without notice.

3500211-001-EN Jan 2013