2016 3 rd International Conference on Materials Science and Mechanical Engineering (ICMSME 2016) ISBN: 978-1-60595-391-5 Dual-System Warm Standby of Remote Sensing Satellite Control System Technology Fei Lei, Lin Zhu, Qian Chai Beijing University of Technology, Beijing, China Dali Gong Kunlun Energy Company Limited, Beijing, China ABSTRACT: Standby is a technique that has been widely applied to improve system reliability and availability in the stage of system design. For satellite control system, it requires features of high reliability, security and sustainability. In this paper, we design a dual-system warm standby technology. The system consists of double control servers, one is operating control server and another one is standby control server. Dual-system warm standby log files use differential ghost method that can t only strengthen log real-time but also reduce network usage. Erasing log files after the operation is completed that can reduce reliance on storage memory. When switching to backup server, according to the dual-system warm backup log files, it can recover the current mission progress and continue to run the mission. Through experimental verification, the design has a good reliability and stability. It can be widely used in aerospace field test. 1 INTRODUCTION With the extensive application of high-tech in the field of aviation, the development of remote sensing satellites achieve a breakthrough and the applications are also expanding. For example, remote sense of satellites have been widely used in agriculture, forestry, territory, irrigation, water, etc. (Tang et al. 2011 & Steve et al. 2015 & Russell et al. 2015). The control system as a brain monitor remote sensing satellites in real time to ensure its normal operation. Due to the complex and huge remote sensing satellites control system, the system contains a large number of the basic units and information exchange between the amount of each unit is growing. So the remote sensing satellite control system needs to guarantee high reliability. The one of key issue is that how to protect the remote sensing satellite control system capable of reliable and durable operation to be solved. It requires control system with redundant capability (Zhang et al. 2007). For needs of privacy and network security in the most remote sensing satellite monitoring mission, the remote sensing satellite control system access to the Internet. So we don t apply for network cloud redundancy technology. In this paper, for the security needs of remote sensing satellite and control system, we design dual-system warm standby technology. The rest of the paper is organized as follows. Section 2 analyzes the three common dual-system 137 standby technology. Section 3 provides the design of dual-system warm standby: principles of technology and the composition of technology. Section 4 uses the test data to prove high reliability by Markov chain theory. Conclusion is summarized in the end. 2 THE CONVENTIONAL DUAL-SYSTEM BACKUP TECHNOLOGY Redundancy is a form of backup. When the system failure occurred due to some problems, for the secure communication of important information, to avoid heavy losses, the system must switch to backup components continue to work through effective means (Wang et al. 2008). Briefly, it is to repeat some configurations of the system components. When the system failure occurred, components of redundancy configurations started to work and undertook the work of damaged components of the system. Components of redundancy configurations can reduce the duration of the fault, make the system restore normal operation as soon as possible and enhance stability of the system significantly. Conventional dual-system redundancy technologies include: dual-system cold standby technology, dual-system warm standby technology and dual-system hot standby technology. Dual-system cold standby technology, it means that active and standby servers both work at same
time. Active server timing data automatically synchronizes to the standby server in normal operation. When the active server failure occurs and can t continue to run, operator needs to switch to backup server manually. Backup server continue to provide the services for the system and support the system continuing operations. Dual-system cold standby technology implies that the inactive component has a zero failure rate and cannot fail while in standby state (Yun et al. 2010). Dual-system warm standby technology, it generally refers to the warm standby server technology based on active / standby mode. It requires two sets of identical servers. In normal operation, the active server is running and another server is in standby mode. When the active server failure occurred and can t continue to run, standby server would detect the active server is error by warm redundancy software. Then restart standby server. According to the backup log files, the standby server continue to provide the service for the system. Dual-system hot standby technology, its hardware structure is similar to dual-system hot standby technology. In normal operation, the two servers both are running. And they accept and handle the data at same time. Two servers exchange the data in real time, the only active server provide service for the system. When the active server failure occurred and can t continue to run, the standby server would detect the situation. Then the standby server would start without interference and provide to service for the system. Dual-system hot standby implies that the inactive component runs the same operational environment as backup for the active component. It means that the lifetime of the inactive component is same with the active component and the state of the inactive is synchronize with the active. 3 DUAL-SYSTEM WARM STANDBY TECHNOLOGY DESIGN The remote sensing satellite control system needs to guarantee high reliability for server. Dual-system cold standby technology has some fault: when the active server failure occurred and can t continue to run, it has to be switched to standby server manually; the operator has to intervene and switching time is long. Although dual-system hot standby technology switching time is short, it has some fault too. For example, two server both are running, they waste many performance. The dual-system warm standby technology compared to cold and hot can compromise the reliability, timeliness and performance well (Jia et al. 2010). Meanwhile, for reducing dual-system warm standby system hardware dependent, RAID (Redundant Arrays of Independent Disks) is applied for NAS (Network attached Storage) technology that can share RAID between active and standby server. So we chose dual-system warm standby technology in the remote sensing satellite control system. 3.1 The basic design principles 1. Dual-system warm standby technology can automatically switch. When the active control server failure occurred and can t continue to run, the standby server can automatically switch to work. 2. Dual-system warm standby technology can detect the error by itself. It can detect the error quickly. 3. Dual-system warm standby technology has good persistence. When the active control server failure occurred, the standby server can continue to perform remote sensing satellite missions, recover the data and instructions. 4. Dual-system warm standby technology requires lower the hardware demands. It won t enhance the hardware requirements for the system. 5. Dual-system warm standby technology has simple network configuration structure. It is convenient that the operators maintain the system. 3.2 The composition of dual-system warm standby technology Dual-system warm standby system consists of terminal device, two server (active and standby), interchanger and RAID. Two servers operating system run on Linux redhat-server 6.5. They install dual-system warm standby software and have double network adapters. Each server connects another server and interchanger. The RAID is 100TB. Communications are applied for protocol. (Terminal device and server, active server and standby server, server and RAID. Figure 1 illustrates dual-system warm standby network structure. Sever(active) Terminal Device Sever(standby) Interchanger RAID Figure 1. Dual-system warm standby network structure. 138
3.3 The principle of dual-system warm standby technology Start 3.3.1 The configuration of network and heartbeat line There are heartbeats lines between the active server and standby server. Heartbeat line timing sends signal to detect the status between active and standby server. Detect the status of two servers whether other server is in normal operation. The way can enhance the effective of detection. Each server has difference network port to connect interchanger and another server by heartbeat line. Every network port has different IP address. 3.3.2 Shared RAID NAS (Network attached Storage), it is a kind of distributed, independent data integration for large, centralized management of data center. It is the technology that provides service for different hosts and servers (Li et al. 2003). NAS configures own node in LAN, so NAS server can handle data within network. NAS system installed in RAID that can make two servers (active and standby server) have access to same files simultaneously. It achieves the purpose of sharing RAID. NAS can work without application server, reduce the burden of CPU and enhance the performance obviously. 3.3.3 Processing method of warm standby log files The key is that utilize log files to restore history in dual-system warm standby system. In the system recovery process, active server fails place resulting in inconsistent data changes over then standby server restore consistent status with active server before the fault occurred (Zhou et al. 2011). So we specially designed log files for remote sensing satellite control mission to recover. Figure 2 illustrates process of server handles Log files and we will detailed explain the framework in the following. When remote sensing satellite carries on the control mission, the active server sends data or instructions to the terminal device and the standby server is placed on standby. The Log plug-in of the active servers records the action information (send data or instructions) in Log files on RAID. When the action is finished, the terminal device would feedback the action completion messages for the active server. Then the active server delete the log files of the action on RAID. Figure 2 illustrates server handles Log files. Control servers Log files use the differential ghost method. In the processing of modifying the Log files on the server, Log files ghost based on the file system in Bytes, send or modify the new data (Liu et al. 2015). 139 Send instruction or data Record the action to Log files Receive the feedback Yes Delete the action from Log files End No Figure 2. Framework of server handle Log files. Log files use differential ghost method that can enhance the access efficiency. And it also reduces use of the network, CPU, memory and other hardware resources. Record the progress of remote sensing satellite control task is more accurate because Log files has fast written and read speed. Provide convenience for the intelligent recovery of dual-system warm standby technology. There is much action information in the remote sensing satellite control mission. Delete Log files after the action completed that can reduce the Log files storage in RAID. 3.3.4 Processing method of warm standby log files Figure 3 illustrates process of dual-system switching and we will detailed explain the framework in the following. Dual-system warm standby software detects the status of the active server. When the active server failure occurred and can t continue to run, software can detect it by heartbeat line. The software would automatically switch to the standby server immediately. The terminal device would detect the status of the active server by the fault of network or feedback then switch to the standby server. The standby server start to run. Dual-system warm standby software would restart the active server after the switch is completed. If restart is failed, the software would restart the active server by power off.
Auto switch Heartline detect normal operation of the active sever No Warm software launch the inactive sever Res plus-in read the Log files Read the current progress of mission Continue the mission Yes Figure 3. Framework of dual-system switching. The Res plug-in of the standby server would read the Log files in RAID. The standby server would read the progress of the remote sensing satellite mission. And the standby servers would read the data and instructions on mission in RAID then continue to send them to the terminal device. The switching of dual-system warm standby completed. 4 TESTING AND ANALYZING The design of dual-system warm standby technology completed. It has been test in a laboratory test environment. The main test method is fault injection. The ways to fault injection: 1. Hardware fault: turn off the power and disconnect network; 2. Software fault: Memory crash, process stop and software exception. 150 tests were carried out for each failure mode. Table 1 illustrates the results of fault injection. Table 1. The results of fault injection. Fault way Detection Switching Restoring Turn off the power 150 150 150 Disconnect network 150 149 149 Process stop 147 145 142 Memory crash 146 145 144 Software exception 149 149 146 From the test data, we can know that the dualsystem warm standby technology achieved a 100% detection in hardware fault and there are only 1 experiment failed automatically switch. There are many tests can t detect the faults and recovery in software fault but it almost automatically switches after detects the faults. In total 750 tests, the standby servers automatically switches within 3 second. According to method of Markov chain reliability evaluations of fault-tolerant systems (Liu et al. 2011), the method is calculating the reliability of dual-system is proposed (Guo et al. 2008). The reliability of the dual-system standby technology is calculated as high as 99.46%, the technology can satisfy the high reliability and real time requirements of remote sensing satellite control system. 5 CONCLUSION In this paper, we introduced the dual-system warm standby technology for the remote sensing satellite control mission. We handled the Log files by the differential ghost method that can reduce the requirements of hardware and enhance the real time of Log files. The server s Res plug-in can know the progress of the mission by Log files and continue to implement the mission. It can effective judges the fault and automatically switches that also can ensure continuous operation of the mission. Compared to other redundant technology, its network structure is simple and can reduce the development time. By testing in a laboratory test environment, we can indicate that the technology is reasonable, reliable and has good operation results for the mission. It has practical significance for ensuring remote sensing satellite control mission. REFERENCES [1] Guo. H. & Yang, X. 2008. Quantitative reliability assessment for safety related systems using Markov models. Journal of Tsinghua University 48(1):149-152. [2] Jia, W. & Zhang C. 2010. Design and Implementation of a High -Reliable Dual Warm-Redundant System of On-Board Computer. Journal of Computer Research and Development 47(suppl):127-132. [3] Liu, B. & Wu, Z. 2011. Markov chain reliability evaluations of fault-tolerant systems. Journal of Tsinghua University 51(S1):1414-1417. [4] Liu, C. & Peng, W. 2015. Research and Design of Hot Backup Redundancy of PLC. Computer Engineering and Applications (2):46-49. [5] Li, S. & Yang, H. 2003. The Application and Research of Network Attached Storage. Computer Engineering and Applications 47(13):194-196. [6] Russell, A. 2015. Damage evaluation for aircraft CFRP components using piezoelectric sensing. Structural Health Monitoring-An International Jounal 2015(V2):1517-1523. [7] Steve, G. & Nik, R. 2015. SHM certification requirements for military aircraft-an Australian perspective. Structural 140
Health Monitoring-An International Journal 2015(V1):126-133. [8] Tang, X. & Cong, N. 2011. Present Situation and Development of Surveying and Mapping Satellites in China. Geomatics World04 (2):40-44. [9] Wang, JH. 2008. Highly reliable hot standby redundant system of computer. Research On Development 27(4):42-44. [10] Yun, WY. & Cha JH. 2010. Optimal design of a general warm standby system. Reliability Engineering and System Safety 95(2010):880-886. [11] Zhang, S. 2007. Research on Structure Designing for Redundancy-based Complex System of Systems with Dynamic Adaptation. Journal Of National University Of Defense Technology 29(5):132-138. [12] Zhou, X. & Qin, X. 2011. Research on highly available database Hot Standby system and its performance. Journal of China University of Mining & Technoogy40 (2):315-320. 141