FlexFrame for SAP. Version 4.0. FA Agents - Installation and Administration. Edition March 2007 Document Version 2.0



Fujitsu Siemens Computers GmbH
Copyright Fujitsu Siemens Computers GmbH 2007

FlexFrame, PRIMECLUSTER, PRIMEPOWER and PRIMERGY are trademarks of Fujitsu Siemens Computers
SPARC64 is a registered trademark of Fujitsu Ltd.
SAP and NetWeaver are trademarks or registered trademarks of SAP AG in Germany and in several other countries
Linux is a registered trademark of Linus Torvalds
SUSE Linux is a registered trademark of Novell, Inc., in the United States and other countries
Java and Solaris are trademarks of Sun Microsystems, Inc. in the United States and other countries
Intel and PXE are registered trademarks of Intel Corporation in the United States and other countries
MaxDB is a registered trademark of MySQL AB, Sweden
MySQL is a registered trademark of MySQL AB, Sweden
NetApp, Network Appliance, Open Network Technology for Appliance Products, Write Anywhere File Layout and WAFL are trademarks or registered trademarks of Network Appliance, Inc. in the United States and other countries
Oracle is a registered trademark of ORACLE Corporation
EMC, CLARiiON, Symmetrix, PowerPath, Celerra and SnapSure are trademarks or registered trademarks of EMC Corporation in the United States and other countries
SPARC is a trademark of SPARC International, Inc. in the United States and other countries
Ethernet is a registered trademark of XEROX, Inc., Digital Equipment Corporation and Intel Corporation
Windows, Excel and Word are registered trademarks of Microsoft Corporation
All other hardware and software names used are trademarks of their respective companies.

All rights, including rights of translation, reproduction by printing, copying or similar methods, in part or in whole, are reserved. Offenders will be liable for damages. All rights, including rights created by patent grant or registration of a utility model or design, are reserved. Delivery subject to availability. Right of technical modification reserved.

Contents

1 Introduction FlexFrame-Autonomy Additional Documentation Target Group Notational Conventions Document History Related Documents First Steps Installation and Startup Installation Requirements The FlexFrame Solution Installation Installation Packages Standard Installation Configuration Starting and Stopping FA WebInterface Function Installation Configuration Starting and Stopping DomainManager FlexFrame Autonomy Architecture Pool Creation and Grouping Virtual FlexFrame Autonomy Pools Grouping Manual Group Creation Configuration in the LDAP Directory Automatic (generic) Group Creation Service Classes Service Priority Service Power Value Class Creation Rules Testament Types Service-specific Testaments Enqueue Service in Case of Replicated Enqueue FA Configuration, Work and Log Files Systems Service Types Replicated Enqueue Service

Live Cache Generic Services Service State Model Service Detection Model Service Reaction Model FlexFrame Performance and Accounting Option Performance Option Accounting Option Billing FlexFrame Autonomy FlexFrame Autonomy Reactions Restart Reboot Switchover General Rules Internal Switchover External Switchover Maintenance Self-Repair Strategies Self-Repair in the Event of a Service Failure Self-Repair in the Event of a Node Failure Takeover by a Spare Node (Switchover) Multi Node Failure Case 1: ShortTime Failure Case 2: LongTime Failure Takeover Rules Overview TakeOver strategy Overview FirstFit LowPrioFit TakeOver Rule Overview Static Takeover Rule Dynamic TakeOver Rules Operating Mode Event Mode Local Reaction Mode Central Reaction Mode Autonomous Operation of a FlexFrame Infrastructure FlexFrame Autonomy and the Adaptive Computing Controller (ACC) FlexFrame Autonomy and FSC FlexFrame Scripts FlexFrame Autonomy and User Interactions myamc.fa Agents: Starting/Stopping/Status

Starting the myamc.fa Agents Manually Stopping the myamc.fa Agents Manually Status of the myamc.fa Agents Starting/Stopping an SAP Instance Starting an SAP Instance Stopping an SAP Instance Possible Applications General Semi-autonomous Operation Monitoring of Application Instances Autonomy for Application Instances Restart Reboot Switchover FA Work and Log Files General Overview, Principal Directories, Files Collecting Diagnostic Information for Support Assistance Selected Files Livelist Services List Services Log Reboot Switchover XML Repository BlackBoard Migration of FA Agent Versions on Pool Level The FA Migration Tool Pool Mode File Mode Usage of Help Command Line Interface Command Execution at All Nodes of a Pool WebInterface Installation / Configuration Prerequisites Installation Configuration Web Server Login IDs Link to the myamc.messenger Database LDAP Options GUI Options Other Settings

5.2 Visualization Starting the WebInterface / Access via Web Browser Login Overview of Elements Pool / Group Tree Overview Status Selecting an Element Different Tree Presentations Status Display Node Panel System Panel Instance Panel Assigning States to Colors Message Display Fields Navigation Viewsets Sorting Configuration of FlexFrame Autonomy with the Webinterface Interaction Commands Activating the Context Menus Confirming a Command Pools Groups Nodes System Instance Updates Update Interval Manual Update Reinitialization Pause Mode (No Update) Info and Help FlexFrame Performance and Accounting Plug-in FlexFrame Reporting Plug-in FlexFrame Autonomy Power Shutdown Concept General Power Shutdown Architecture Basics Power Shutdown for Blade Systems Power Shutdown for PRIMERGY Systems Power Shutdown for PRIMEPOWER Systems

6.4 Power Shutdown Configuration Switchover Control Parameters User, Password and Community Management Blades Application Nodes Default Shutdown Mode Parameter Reference FA Agents FA Agent Configuration Files SNMP Traps General Structure Default Parameter File Pooling and Grouping Pooling Grouping LDAP Grouping Manual Group Assignment Generic Grouping Default Parameter File Service Classes Service Priority Service Power Value Class Creation Rules Example FlexFrame Autonomy General Parameters Parameters for the Performance and Accounting Option Node-Related Parameters Service-Related Parameters Parameters for the Definition of a Generic Service Parametering of the Service Detection Path Configuration Shutdown Configuration Default Parameter File BlackBoard General Implementation Generating BlackBoard Commands WebInterface Interactive Commands

9 FlexFrame Autonomous Agent Traps Format of the FlexFrame Autonomy SNMP Traps Severities Overview of the FlexFrame Autonomy SNMP Traps Troubleshooting Abbreviations Glossary Index

1 Introduction

For many companies, applications such as SAP today provide the basis for handling all important business processes. Failure of these components therefore results in considerable costs. Companies must also be able to react very rapidly to changing market and organizational demands, which means that it must be possible to adapt the capacity of existing IT resources very quickly to changing requirements.

The myamc components, which monitor the availability and utilization of IT systems and respond to system failures with an intelligent automated facility, are the answer to these demands. FlexFrame Autonomy complements the powerful monitoring and management functions of myamc with functions which permit the autonomous operation of a distributed applications environment. These functions reduce the number of manual interventions and make the operation of your business-critical applications more efficient.

FlexFrame offers a flexible hardware architecture which can be adapted to altered requirements and, together with management components, permits highly available operation of this infrastructure. Partial outages are automatically repaired or compensated for. FlexFrame Autonomy is an integral component of every FlexFrame solution. Its built-in autonomy functions allow operation with considerably fewer operator interventions, right up to a high-availability solution.

This manual describes the functional concepts and the application scenarios for FlexFrame Autonomy.

- FlexFrame Autonomy for distributed database instances, SAP central instances and SAP application instances
- FlexFrame Autonomy supports SAP, SAPDB and Oracle instances
- Status monitoring, restart, reboot or switchover of instances

1.1 FlexFrame-Autonomy

The Application Management Center myamc is a solution for monitoring and managing IT infrastructures. myamc can monitor the resources required for a business process, from a printer and the network to the server and the applications which run on it. The FlexFrame Autonomy component myamc.fa substantially extends this range of functions. In addition to monitoring, this component can restore failed services automatically and autonomously. These self-repair mechanisms are not only effective locally for one system; they also permit a failed service to be moved automatically to another resource which, in line with a defined rule for operation, is suitable for operating the service.

This function permits a considerable reduction in the number of manual interventions by an administrator. Availability is increased, and the costs for operating a complex applications environment are reduced. For this functionality, myamc.fa uses its agents and management components to detect, collect and analyze the information. Autonomous functions can be configured for various tasks and requirements by combining different detectors and manager components and by defining and selecting the reaction and decision rules. In conjunction with the powerful myamc GUI, the entire infrastructure can be presented in a straightforward manner in an IT cockpit.

1.2 Additional Documentation

Further application options for other myamc management components are described in the document myamc.overview. Use of the Messenger for editing and forwarding myamc.fa messages is described in the documentation myamc.messenger.

1.3 Target Group

This documentation is intended to support both users of FlexFrame Autonomy and administrators who wish to integrate this solution in an enterprise IT management solution.

1.4 Notational Conventions

The following conventions are used in this manual:

Additional information that should be observed.
Warning that must be observed.
fixed font      Names of paths, files, commands, and system output.
<fixed font>    Names of variables.
fixed font      User input in command examples (if applicable using <> with variables)

1.5 Document History

Document Version    Changes                                        Date
1.0                 First Edition
                    Adding some new features, FA Agents Version

Related Documents

FlexFrame for SAP Planning Tool
FlexFrame for SAP Installation of a FlexFrame Environment
FlexFrame for SAP Installation Guide for SAP Solutions
FlexFrame for SAP Administration and Operation
FlexFrame for SAP Network Design and Configuration Guide
FlexFrame for SAP Installation ACC 1.0 SP13
FlexFrame for SAP myamc.fa_logagent - Concept and Usage
FlexFrame for SAP Upgrading FlexFrame 3.1 or 3.2 to 4.0
FlexFrame for SAP White Paper
PRIMECLUSTER Documentation
ServerView Documentation
SUSE Linux Enterprise Server Documentation
Solaris Documentation


2 First Steps

2.1 Installation and Startup

This chapter describes how you start and stop the FlexFrame Autonomy components. It also describes the installation and basic configuration of FlexFrame Autonomy.

FlexFrame Autonomy provides a comprehensive, flexible and scalable solution for setting up semi-autonomous IT processes. Its functionality falls into three subareas:

- FA_AppAgents: FlexFrame Autonomy Application Agents for monitoring, checking and controlling instances
- FA_CtrlAgent: FlexFrame Autonomy Control Agent for monitoring, checking and controlling Application Nodes with a separate Control Node
- FA WebInterface: A component for displaying the active services on a web frontend

To monitor instances, the FA_AppAgent cyclically supplies information on the availability of an instance at a configurable interval. The FA_AppAgent must therefore be active on every node. myamc.messenger is used to forward information on faults and autonomous reactions to the outside. This messaging component of the myamc family should be operated on the Control Node.

2.2 Installation Requirements

The FlexFrame Solution

The FlexFrame Autonomy solution was conceived and developed especially for the FlexFrame for SAP solution from Fujitsu Siemens Computers. Consequently, the FlexFrame solution with the components Shared OS, Virtualized SAP Application and NetApp Storage on the target computers is a prerequisite for the procedure described in the following. Further details on FlexFrame configurations can be found in the FlexFrame manual Installation of a FlexFrame Environment. Use of FlexFrame Autonomy on other Linux architectures (e.g. standalone systems or for monitoring processes which do not belong to SAP R/3) is not described in this manual and is not supported.

The following prerequisites are thus particularly important:

- Server architecture with IP storage (NetApp Filer) and Client, Server and Storage LANs
- Paths for read-only and read/write Root Images
- SAP start scripts from Fujitsu Siemens Computers
- Operating system SUSE Linux Enterprise Server (SLES)

FA Agents are installed in a directory on the storage system which is reachable and available to all nodes in accordance with the FlexFrame rules for jointly used programs. Programs are always accessed and installed via a Control Node. The FA Agents are installed using two RPM packages. Normally the agents are stored in the directories /opt/myamc/fa_appagent and /opt/myamc/fa_ctrlagent. In a FlexFrame environment the /opt/myamc directory is located on the Filer and is available from every Application Node and Control Node.

Multiple FlexFrame Autonomy versions can be installed simultaneously. Installation, configuration and activation of a version are three separate activities. Installation, parameterization and configuration of a new version can thus be performed during ongoing operation. Only when all preparations have been completed is the active version deactivated and the new version activated. Deactivation and activation of a version always take place on a pool-specific basis. In this way new agent versions can, for example, first be activated in a pool with test systems.

Installation

In the case of a FlexFrame standard installation, new software components are installed via one of the Control Nodes. The FlexFrame Autonomy software is contained in the /opt/myamc directory. Ensure that all servers (Control and Application Nodes) use the same directories. FlexFrame Autonomy is thus also installed in a tree on a Filer (NFS share). The NFS file systems used have to support NFS file locking.
control1:/opt/myamc # mount
filer1_qa:/vol/volff/flexframe/myamc on /FlexFrame/myAMC type nfs (rw,nfsvers=3,intr,noac,wsize=32768,rsize=32768,addr= )
filer1_qa:/vol/volff/flexframe/scripts on /FlexFrame/scripts type nfs (rw,nfsvers=3,intr,nolock,noac,wsize=32768,rsize=32768,addr= )
control1:/opt/myamc # ls -al /opt/myamc
lrwxrwxrwx 1 root root 16 Dec 2 18:35 /opt/myamc -> /FlexFrame/myAMC
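The locking requirement can be checked against mount output such as the listing above. The following is a minimal sketch, assuming that a mount carrying the nolock option does not provide NFS locking; the function names are hypothetical and are not part of the FA Agents.

```python
import re

# Sample lines copied from the mount output above (the addr= value is elided there too).
MYAMC_MOUNT = ("filer1_qa:/vol/volff/flexframe/myamc on /FlexFrame/myAMC type nfs "
               "(rw,nfsvers=3,intr,noac,wsize=32768,rsize=32768,addr= )")
SCRIPTS_MOUNT = ("filer1_qa:/vol/volff/flexframe/scripts on /FlexFrame/scripts type nfs "
                 "(rw,nfsvers=3,intr,nolock,noac,wsize=32768,rsize=32768,addr= )")

# "<source> on <mount point> type nfs (<comma-separated options>)"
NFS_LINE = re.compile(r"^\S+ on (\S+) type nfs \(([^)]*)\)")


def nfs_mount_options(mount_line):
    """Return (mount_point, options) for an NFS line of mount output, else None."""
    match = NFS_LINE.match(mount_line)
    if match is None:
        return None
    options = [opt.strip() for opt in match.group(2).split(",")]
    return match.group(1), options


def locking_disabled(mount_line):
    """True if the line describes an NFS mount that carries the nolock option."""
    parsed = nfs_mount_options(mount_line)
    return parsed is not None and "nolock" in parsed[1]


for line in (MYAMC_MOUNT, SCRIPTS_MOUNT):
    mount_point, _ = nfs_mount_options(line)
    print(mount_point, "nolock set" if locking_disabled(line) else "locking not disabled")
```

In the sample output only the scripts share is mounted with nolock; the myAMC share, which holds the FlexFrame Autonomy tree, is not.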

Installation Packages

The following packages must be installed:

- myamc.flexframe Autonomy Application Agent; the installation package for this is called myamc.fa_appagent-<x.y-z>.i386.rpm
- myamc.flexframe Autonomy Control Agent; the installation package for this is called myamc.fa_ctrlagent-<x.y-z>.i386.rpm
- myamc.flexframe Autonomy WebInterface; the installation package for this is called myamc.fa_webgui-<x.y-z>.i386.rpm
- myamc.domainmanager (optional, e.g. for the Performance and Accounting option); the installation package for this is called myamc.fa_domainmanager-<x.y-z>.i386.rpm

where <x.y-z> stands for the version number.

Standard Installation

Standard installation is performed from a directory tree that is completely writable by the user root.

1. Log onto the target computer as root and copy the rpm packages to a temporary directory.
2. Install the required package with
   rpm -ihv --nodeps myamc.fa_appagent-<x.y-z>.i386.rpm

After all the required packages have been installed, the start scripts may need to be copied to the Root Images of the various node types (Application / Control). Only the myamc.fa_ctrlagent may run on the Control Node, and only the myamc.fa_appagent may run on the Application Nodes.
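The package naming scheme and the rule about which agent may run on which node type can be expressed as a short sketch. It assumes that x, y and z are plain integers; the version 2.0-1 used in the example and all function names are hypothetical.

```python
import re

# Pattern for names such as myamc.fa_appagent-<x.y-z>.i386.rpm (x, y, z assumed numeric).
PACKAGE_RE = re.compile(r"^(myamc\.fa_[a-z]+)-(\d+)\.(\d+)-(\d+)\.i386\.rpm$")

# Which agent may run on which node type, as stated in the text above.
AGENT_FOR_NODE_TYPE = {
    "control": "myamc.fa_ctrlagent",
    "application": "myamc.fa_appagent",
}


def parse_package(filename):
    """Split a package file name into (package name, (x, y, z))."""
    match = PACKAGE_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected package file name: {filename}")
    version = tuple(int(part) for part in match.groups()[1:])
    return match.group(1), version


def may_run_on(package_name, node_type):
    """True if the agent package is the one permitted on this node type."""
    return AGENT_FOR_NODE_TYPE[node_type] == package_name


name, version = parse_package("myamc.fa_appagent-2.0-1.i386.rpm")  # hypothetical version
print(name, version, may_run_on(name, "application"), may_run_on(name, "control"))
```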

Configuration

The FlexFrame Autonomy Agents do not require any additional configuration for use in productive operation. The myamc_fa.xml file is stored when installation takes place. This file already contains a complete parameter set for the operation of the FA_AppAgents and FA_CtrlAgents. The services to be monitored and the reaction scenarios which run in the event of problems are parameterized in this file. The parameters and their default values are described in section 7.5. The mode in which the agents are to operate is also configured here.

During startup, check in particular the start and stop times, the function of the MonitorAlerts, and the times for a reboot and switchover. The MonitorAlerts are a component part of the FlexFrame basic installation. The MonitorAlertTime must always be at least three times as great as the parameterized CheckCycleTime. In the startup scenarios, the real start, stop, restart and reboot times must be determined individually for each service type. If the times specified for start, restart, reboot or switchover are not sufficient, this can result in unwanted reaction escalations. Changes in the parameter file become effective only after the agents have been restarted.

The FA migration tool automatically transfers the data from the configuration file of an existing installation to a new configuration file. Parameters which, for example, were not present in an older version of the configuration file are then initially set to their default values.

Starting and Stopping

During installation, links to the FA Agents start/stop scripts were set in /etc/init.d/. Run one of these scripts without any options to display all available options, e.g. start or stop.
Example: Starting the FA Application Agent:

/etc/init.d/myamc.fa_appagent start
/opt/myamc/fa_appagent/myamc.fa_appagent start

Example: Starting the FA Control Agent:

/etc/init.d/myamc.fa_ctrlagent start
/opt/myamc/fa_ctrlagent/myamc.fa_ctrlagent start
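Before the agents are restarted with a changed parameter file, the timing rule from the Configuration section (MonitorAlertTime at least three times the CheckCycleTime) can be checked with a few lines. This is a minimal sketch: the function name is hypothetical, the example values are invented, and both values are assumed to use the same time unit.

```python
def monitor_alert_time_ok(check_cycle_time, monitor_alert_time):
    """Rule from the manual: MonitorAlertTime >= 3 * CheckCycleTime (same unit assumed)."""
    return monitor_alert_time >= 3 * check_cycle_time


# Hypothetical values, not defaults taken from myamc_fa.xml.
print(monitor_alert_time_ok(check_cycle_time=10, monitor_alert_time=30))
print(monitor_alert_time_ok(check_cycle_time=10, monitor_alert_time=25))
```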

2.3 FA WebInterface

Function

The FA WebInterface visualizes all nodes and services which exist in a FlexFrame system insofar as these are monitored by an FA_AppAgent. The status, availability and messages of the Application and Control Agents are displayed.

Installation

The installation package is called myamc.fa_webgui-<x.y-z>.i386.rpm. A prerequisite is that an Apache Tomcat servlet container is installed. Currently Tomcat >= 5.0.x is supported.

Configuration

Provided no paths have been changed in the FA configuration, the configuration of the WebInterface is restricted to linking it into the Tomcat configuration file (e.g. /opt/jakarta-tomcat-<version>/conf/server.xml). The following line has to be added at the end of the configuration file, in front of </Host>:

<Context path="/fawebgui" reloadable="true" docBase="/opt/myamc/fawebgui" workDir="/opt/myamc/fawebgui/work" />

Changes to the web server require Tomcat to be restarted or reloaded. Further settings can be made in the files /opt/myamc/config/fa_webgui.conf (general settings, paths, cycle times, database settings) and /opt/myamc/config/amc-users.xml (user administration). The various settings are described in the chapter on the WebInterface (Installation / Configuration). Changes require the FA WebInterface to be restarted or reloaded (e.g. via the Tomcat Service Manager) or Tomcat to be restarted or reloaded.
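The server.xml change can also be scripted. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the sample server.xml skeleton and the function name are hypothetical, and the attribute spellings docBase and workDir follow Tomcat's documented Context attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily reduced server.xml; a real file contains more elements.
SAMPLE_SERVER_XML = (
    '<Server><Service name="Catalina"><Engine name="Catalina" defaultHost="localhost">'
    '<Host name="localhost" appBase="webapps"></Host>'
    '</Engine></Service></Server>'
)

FAWEBGUI_CONTEXT = {
    "path": "/fawebgui",
    "reloadable": "true",
    "docBase": "/opt/myamc/fawebgui",
    "workDir": "/opt/myamc/fawebgui/work",
}


def add_fawebgui_context(server_xml_text):
    """Append the fawebgui Context as the last child of the first Host element."""
    root = ET.fromstring(server_xml_text)
    host = root.find(".//Host")
    if host is None:
        raise ValueError("no <Host> element found")
    already_present = any(
        child.tag == "Context" and child.get("path") == "/fawebgui" for child in host
    )
    if not already_present:
        ET.SubElement(host, "Context", FAWEBGUI_CONTEXT)
    return ET.tostring(root, encoding="unicode")


print(add_fawebgui_context(SAMPLE_SERVER_XML))
```

Note that this parser discards XML comments, so for a production server.xml a manual edit, as described above, keeps the file closer to its original form.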

Starting and Stopping

The WebInterface can always be reached if Apache Tomcat is running. Tomcat can generally be started using the script /etc/init.d/jakarta-tomcat start. The WebInterface can then be reached at the following address:

The specified port can be changed in the Tomcat configuration file server.xml. Prerequisites here are Mozilla >= or Internet Explorer >= 6.0 and the Java plugin for Sun >=

DomainManager

The DomainManager is installed on the Control Node. It can be integrated in PRIMECLUSTER, but this is not the default; the integration has to be carried out individually in each project. The accounting and performance data collected by the FA Application Agents is automatically adopted by the ITDW and can be visualized and evaluated with the help of the FA WebGUI with the Accounting and Performance management plugin.

The DomainManager is configured via the file /opt/myamc/domainmanager/config/DomainManager.xml. Pool-specific configuration is also possible. Changes to parameters in the DomainManager configuration are dynamically recognized and adopted. As an alternative to processing through the DomainManager, the files can also be accessed by an external DomainManager which runs outside of FlexFrame. In addition, the Tomcat server can be extended by means of the myamc.fileretriever module. This is optional and not part of the standard delivery.

3 FlexFrame Autonomy Architecture

FlexFrame Autonomy is a powerful component for high-availability operation of systems with distributed instances. A FlexFrame solution consists of Storage, Application Servers and redundant Control Nodes. FlexFrame Autonomy has been implemented for this solution comprising storage, servers and connectivity. It enables fast and flexible setup of solutions which offer autonomous functions to simplify the operation of applications and make it more flexible.

The figure below shows an overview of the FlexFrame architecture and the associated FlexFrame Autonomy components:

The benefit of the FlexFrame Autonomy solution lies in the flexibility for integrating new nodes and instances without changing the configuration.

Components of FlexFrame-Autonomy:

- FlexFrame Autonomy Application Agents (FA_AppAgents)
- FlexFrame Autonomy Control Agents (FA_CtrlAgents)

The FlexFrame Autonomy components permit highly available, semi-autonomous operation of distributed applications. In principle the instances can be distributed to any number of nodes within a FlexFrame solution. The individual services are monitored via FlexFrame Autonomy Agents. By default, the Application Agents currently support SAP central instances and SAP application instances, as well as SAPDB and Oracle database instances.

3.1 Pool Creation and Grouping

FlexFrame Autonomy Version 2.0 permits pool creation and grouping functions to be implemented.

Virtual FlexFrame Autonomy Pools

A pool is the assignment of hardware resources to a virtual FlexFrame Autonomy pool. From the viewpoint of autonomy and of the high-availability functions, an Autonomy pool is an independent structure. In a standard installation of Version 1, all resources of a FlexFrame solution are managed in a single pool.

Configuration of the pools takes place directly with the configuration of FlexFrame in the LDAP. The FA_AppAgents ascertain the pool affiliation at startup. Configuration of the FlexFrame Agents always relates to one pool, i.e. there is one directory tree with the parameters and configuration data for each pool.

In a pool, the FA Agents provide the autonomous functions restart, reboot and switchover of services and nodes. These reactions no longer relate to all nodes of a FlexFrame solution, but only to the set of nodes which belong to the same pool. Pool creation results in virtual FlexFrame Autonomy pools being created, each of which performs autonomous functions independently of other pools which exist in the same FlexFrame solution.

A FlexFrame Autonomy pool always consists of one Control Agent and n Application Agents. Each Control Agent is responsible only for the Application Nodes which belong to its pool and shares a joint config and data directory with its Application Nodes. For each pool it is thus possible to parameterize autonomous behavior which is independent of other pools.

The flexibility and security of a virtual FlexFrame pool is based on two major new features which the FlexFrame Autonomy Agents provide:

- A Control Agent for each virtual FlexFrame Autonomy pool
- Each Application Agent is provided with a flexible assignment to the pool and thus to the Control Agent with which it interworks

Combining these two new features with the FlexFrame basic functionality offers a large number of new options to enhance the flexibility in server farms. The virtual FlexFrame Autonomy pools provide the option of simple and secure operation of multiple Autonomy clusters which run in parallel and simultaneously in a distributed IT infrastructure.
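The pool boundary for autonomous reactions can be illustrated with a small sketch: a reaction for one node only ever considers nodes of the same pool. The node names, pool names and the function name are hypothetical; the real pool affiliation is read from the LDAP at agent startup.

```python
def nodes_in_same_pool(node, pool_of_node):
    """Return the other nodes that belong to the same pool as `node`."""
    pool = pool_of_node[node]
    return sorted(
        other for other, other_pool in pool_of_node.items()
        if other_pool == pool and other != node
    )


# Hypothetical pool affiliation of five Application Nodes.
pool_of_node = {
    "an1": "pool_prod",
    "an2": "pool_prod",
    "an3": "pool_prod",
    "an4": "pool_test",
    "an5": "pool_test",
}

# A switchover for an1 may only involve an2 and an3, never the test pool.
print(nodes_in_same_pool("an1", pool_of_node))
```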

[Figure: virtual FlexFrame Autonomy pools with separate config data, running FA_Version 1.0 and FA_Version 1.x]

A virtual FlexFrame Autonomy pool offers the advantage of complete separation of all reactions and the associated parameter sets for the start and stop times. FlexFrame Autonomy can also be completely disabled for a virtual pool (e.g. for service and maintenance) without affecting any other virtual FlexFrame Autonomy pool which is running in parallel. A virtual FlexFrame does not share its FlexFrame Autonomous Agents with any other virtual FlexFrame. In this way, depending on the configuration, the virtual FlexFrame Autonomy pools could use different binary versions.

Grouping

For flexible server farming, FlexFrame offers grouping functions. Unlike pools, these enable nodes and services within a pool to be assigned to different groups. A group is thus always a part of a virtual FlexFrame pool.

Grouping can also be implemented according to generic rules. Group schemas can be defined for this purpose. In the parameter file you select the schema which is to be used for group creation. The configuration information for the groups is stored in the myamc_fa_group.xml file. The entries in this file can be made manually or by taking them over from the LDAP directory. Configuration can take place through concrete assignment or through generic assignment.
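The difference between concrete and generic assignment can be sketched as follows. The sketch assumes that a manually assigned group takes precedence and that the generic group name is a fixed part followed by the node's platform; the node names, the prefix group_ and the function names are hypothetical and do not reflect the format of myamc_fa_group.xml.

```python
def generic_group_name(fixed_part, variable_part):
    """Naming rule: a fixed part combined with a variable part."""
    return f"{fixed_part}{variable_part}"


def assign_groups(platform_of_node, manual_groups, fixed_part="group_"):
    """Map group name -> list of nodes, preferring manual assignments."""
    groups = {}
    for node, platform in platform_of_node.items():
        group = manual_groups.get(node) or generic_group_name(fixed_part, platform)
        groups.setdefault(group, []).append(node)
    return groups


# Hypothetical nodes and platforms.
platform_of_node = {"an1": "linux", "an2": "linux", "an3": "solaris"}

print(assign_groups(platform_of_node, manual_groups={}))
print(assign_groups(platform_of_node, manual_groups={"an2": "db_nodes"}))
```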

Manual Group Creation

The group assignment is entered in the configuration file manually. With manual group creation, each node name is unambiguously assigned a group name.

Configuration in the LDAP Directory

As of FlexFrame V3.1 the group information can be stored in the LDAP directory. When the Agents are started, the group information is read directly from the LDAP directory.

Automatic (generic) Group Creation

Automatic group creation is performed on the basis of generic information which the Application Agents can ascertain automatically. For generic group creation it makes sense to use the host names, the IP addresses or the operating system employed. With generic group creation, the myamc_fa_group.xml file does not contain the concrete host name but a creation element which enables the algorithm for generic group creation to find a group assignment.

Example: On the basis of the platform information, i.e. if no manual configuration of groups takes place, two groups are created: the group of the Linux nodes within a FlexFrame installation and the group of the Solaris nodes within a FlexFrame installation. In this case the group name is also created generically. For this purpose each schema is assigned a group naming rule which combines a fixed part with a variable part.

Automatic group creation is not currently used by myamc FA Agents in a FlexFrame environment, as the groups are usually configured statically by the FlexFrame configuration tool.

3.2 Service Classes

The service classes are required for the prioritized operation of individual services or systems. A service is defined as an application instance which must be identified unambiguously and which can be started and stopped individually, e.g. central instance, application instance or database instance (CI, AP, DB). A service class defines the minimum requirements which must be provided when services are taken over in the event of a switchover.
When multiple nodes fail simultaneously, the spare nodes in the group take into account the priorities of the services which have failed. First all priority 1 services are taken over, and only then all services with a lower priority. It will be possible to extend the attributes of a service in the future, as already shown in the examples (e.g. operating system).
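The ordering rule can be sketched together with the two service class attributes described in the subsections Service Priority and Service Power Value: priority 1 is the highest, priority 0 disables the autonomous functions for a service, and a node can take over a service only if it still has at least the service's power value free. The service names, values and function names below are hypothetical; this is an illustration, not the agents' actual decision logic.

```python
def takeover_order(failed_services):
    """Order failed services for takeover: lowest priority number first.

    Services with priority 0 are skipped, because priority 0 disables
    the autonomous functions for a service.
    """
    eligible = [svc for svc in failed_services if svc["priority"] != 0]
    return sorted(eligible, key=lambda svc: svc["priority"])


def node_can_take_over(service_power_value, node_free_power):
    """A node qualifies if its free performance covers the service power value."""
    return node_free_power >= service_power_value


# Hypothetical failed services after a multi-node failure.
failed = [
    {"name": "test_app", "priority": 5},
    {"name": "prod_db", "priority": 1},
    {"name": "sandbox", "priority": 0},
    {"name": "prod_ci", "priority": 1},
]

print([svc["name"] for svc in takeover_order(failed)])  # ['prod_db', 'prod_ci', 'test_app']
print(node_can_take_over(service_power_value=50, node_free_power=50))  # True
```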

A system is a logical unit consisting of multiple service instances which together define a system. In an SAP system these comprise the database instance, central instance and application instances.

The following attributes are defined in the service classes:

- Service priority
- Service power value

In the future it will be possible to enhance such a service class by further attributes which, for example, define the operating system required by a service, the number of CPUs or the performance requirement of the service.

Service Priority

The highest service priority is 1. Every service is assigned this priority by default, i.e. if no service classes are defined, all services have the priority 1. The higher the number, the lower the priority of a service. Priority 0 has a special status: setting priority 0 for a service class disables the autonomous functions for a service. The service priority is evaluated for all autonomous reactions. If, for example, a service of a productive system and a service of a test system are running on the same node and the test system's service is assigned priority 5, a reaction for this service is not executed, because the productive system's service, which is functioning without error, has the higher priority.

Service Power Value

The service power value specifies a performance number for a service which defines the maximum performance (SAPS) required by this service. This value is used for takeover scenarios; the add rule requires the service power value. A failed service with a performance value of 50 can, for example, also be taken over by a node which still has at least 50 of its maximum performance number free.

Class Creation Rules

A service belongs either to the default class, which always exists, or it can be assigned unambiguously to another class by evaluating the aforementioned variables.

Testament Types

The switchover scenarios use testaments to transport the service information to other nodes.
The creation of the testaments can be node-based or service-based. With node-based testaments, all services of a node always move together to the takeover node. With service-based testaments, the services can be taken over by different nodes. The parameter for the testament type and the takeover rules therefore strongly influence the possible takeover scenarios.

Service-specific Testaments

Service-specific testaments are used for services which require individual takeover scenarios. The following services require service-specific testaments.

Enqueue Service in Case of Replicated Enqueue

The enqueue service with replicated enqueue service has its own service type. This special testament is built dynamically if a replicated enqueue service exists. For this service-specific testament, the service-based takeover rule applies. Only nodes with a replicated enqueue service can apply.

3.3 FA Configuration, Work and Log Files

The figure below provides an overview of the configuration and log files which are generated by FA components and stored on the common file system. These files also form the permanent memory which is required, for example, to restore the services needed when a system is rebooted.

Directory structure /opt/myamc/:

    ./fawebgui                             FA web interface
    ./vff/common/.vff_template.<version>   Template for pool directories
    ./vff/vff_<pool_name>                  Pool directory
    config                                 Pool-specific configuration data
    log                                    Log files
    log/appagt                             Log files of Application Nodes
    log/ctrlagt                            Log files of Control Nodes
    log.common                             Common log files
    data/fa/                               FA data directory
    data/fa/blackboard                     Blackboard directory
    data/fa/livelist                       Live list
    data/fa/servicelists                   Service files of all nodes
    data/fa/servicelogs                    Service files of all nodes (history)
    data/fa/xmlrepository                  XML files for the web interface
    data/fa/reboot                         Reboot files for all nodes
    data/fa/switchover                     Switchover files for all nodes
    data/fa/performance                    Performance and accounting files

In a standard FlexFrame installation the directory tree /opt/myamc exists for myamc.fa. All the directories and files required for the myamc.fa software are located here.

3.4 Systems

A system is based on several services which belong to a logical group. SAP systems are an example of logical systems. The services of such a system can be distributed across several Application Nodes in one pool. The FA-AppAgents autonomously identify the services and the system they belong to, and they identify standard SAP services automatically.

3.5 Service Types

The FA_AppAgents are able to identify standard SAP services and the hierarchy in a logical SAP system. For these services the FA_AppAgents perform the autonomous reactions restart, reboot and switchover:

DB, CI, APP, J, JC, SCS, ASCS, ERS, LC

Version 3.0 of the FA Agents can monitor the service types SCS and ASCS with replicated enqueue service (ERS). SCS/ASCS with or without replicated enqueue service is detected automatically.

Replicated Enqueue Service

For an SCS or ASCS service there is a replicated enqueue service on which the enqueue table is replicated. If the SCS or ASCS service fails, it must be restarted where an associated replicated enqueue service is running. The SCS or ASCS service takes over the enqueue table present there and stops the replicated enqueue service. Once the replicated enqueue service is stopped, the testament is published and, through the autonomy scenarios for internal switchover, the replicated enqueue service gets a new node and starts up. If the SCS or ASCS service fails again, there is therefore another replicated enqueue service for a new takeover scenario. This scenario works with one or more replicated enqueue services in one system.
The rules for the switchover of the replicated enqueue are the same as those configured for the other services. The switchover to the replicated enqueue service is based on a service-based testament. The rule to apply for this service is based on the generally configured takeover rule as well as on service priority and the takeover type for this pool or the dynamic takeover table.

Live Cache

With version 3.0 of the FA Agents it is possible to integrate the live cache into the standard autonomy scenarios. The FA Agents offer the standard autonomy functions restart, reboot and switchover for the live cache. A special feature of the live cache is that it can be stopped from the SAP GUI. For this restart scenario you have to check the restart times of the live cache; otherwise this scenario cannot be diagnosed (recognized) as a fault of the live cache.

3.6 Generic Services

Generic services are services which are not integrated into the FA-AppAgents autonomy rules. With generic services it is now possible to integrate other virtualized services into the autonomy monitoring and reaction scenarios. A generic service is a logical application suite consisting of one or more subservices. A generic service is defined through a set of parameters which are used for its identification and which generate the service states. The description and definition of a service is arranged in several models:

- Service state model
- Service detection model
- Service reaction model

Service State Model

The autonomy scenarios are based on a defined state model. The standard service state model uses the following states:

- Starting
- Running
- Stopping
- Error

The state changes are initiated through events from an event script or through detection. For a generic service, implementation and integration in the standard start/stop procedure of the service is necessary. The standard state model knows the following events:

- Start
- Stop
- Restart

- Error
- Watch
- Nowatch

Service Detection Model

The service detection model provides the basis for identifying the service and building the state model. A service detection model needs the parameters for identifying the service components: the subservices and the processes of each subservice. For this there are parameters for hierarchy and process count. There are also a process filter and exception rules to avoid ambiguities.

Service Reaction Model

The service reaction model defines the reaction and the connection to the start, stop and restart scripts. The reaction API has the parameters Script and Parameter:

    Script      The call reference for the script
    Parameter   Set of parameters for the script

The FA-AppAgents reaction API provides a set of placeholders (of the form @{SRVDISP}@) which can be used as call parameters. They cover:

- SID (in upper case)
- SID (in lower case)
- Service name (in upper case)
- Service name
- Display service name (in upper case)
- Display service name
- Instance number (two digits)

3.7 FlexFrame Performance and Accounting Option

The FA-Agents provide optional performance and accounting data. The agents collect node-, service- and group-based information. The FlexFrame performance and accounting option requires the activation of an additional service on the Control Node. This service performs a performance and accounting calculation on the raw data. The FA Agents produce performance and accounting collets in the data directory of the pool. There are 3 types of collet data:

- Collets per node with the name pattern Perf_Node~<node_name>.prf.<number>.col

- Collets per service group with the name pattern Perf_Group~<node_name>.prf.<number>.col
- Collets per service with the name pattern Perf_Service~<SrvType>~<SID>~<ID>.prf.<number>.col

The number and size of the collets produced by the FA Agents can be adjusted. By default, 10 collets are kept per service or node. This results in a ring buffer of data which is automatically reorganized by the agents. The required storage size can be calculated from the number of nodes and the length of the report cycle. The parameters of the DomainManager and of the backup routine have to be configured in such a way that the raw data can be safely processed before being overwritten by the FA Agents.

The following graphic shows the architecture of the performance and accounting option.

[Figure: architecture of the performance and accounting option. Labels: Network; FA-Application Agents; FA-Performance and Accounting Service; mysap.com application server and database server (DB, CI, APP); ITDW; Storage; Accounting and Performance Collets]

Performance Option

The performance option measures several performance values. For all measured values there is a minimum, average, maximum and total value. This data is supplied in absolute as well as relative form. The performance option enables monitoring and evaluation of the servers and services over a longer period of time. For every node the following data are available as minimum, average and maximum values:

- Load of SAP, database or generic services
- Other services
- Machine idle
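The three collet name patterns given for the performance and accounting option can be matched mechanically, for example when a script needs to sort the files in the pool's data directory by type. The following is a minimal sketch, not part of the FA Agents: the function name `parse_collet_name` is hypothetical, and it assumes that `<number>` is a decimal index and that the other fields contain no `~` character.

```python
import re
from typing import Optional

# Name patterns as given in the text:
#   Perf_Node~<node_name>.prf.<number>.col
#   Perf_Group~<node_name>.prf.<number>.col
#   Perf_Service~<SrvType>~<SID>~<ID>.prf.<number>.col
# Assumption: <number> is a decimal index; other fields contain no "~".
_COLLET_RE = re.compile(
    r"^Perf_(?P<kind>Node|Group|Service)~(?P<key>.+)\.prf\.(?P<number>\d+)\.col$"
)


def parse_collet_name(filename: str) -> Optional[dict]:
    """Split a collet file name into its parts; return None if it does not match."""
    match = _COLLET_RE.match(filename)
    if match is None:
        return None
    kind = match.group("kind")
    fields = match.group("key").split("~")
    info = {"kind": kind, "number": int(match.group("number"))}
    if kind == "Service":
        # Service collets carry three fields: service type, SID and ID.
        if len(fields) != 3:
            return None
        info["service_type"], info["sid"], info["id"] = fields
    else:
        # Node and group collets carry the node name only.
        if len(fields) != 1:
            return None
        info["node_name"] = fields[0]
    return info
```

A call such as `parse_collet_name("Perf_Service~APP~P22~01.prf.0.col")` yields the service type, SID and ID as separate fields, while a file name that follows none of the three patterns yields `None`.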

By using the generic services, the granularity of the performance values is increased. The data of the performance and accounting option can be visualized directly with the FlexFrame FA Web GUI with performance and accounting management plugin. The granularity of the view and the timespan can be freely defined.

Service Groups

Services are combined to form groups according to specific criteria. This enables group-aggregated evaluation of the recorded data. The collected data is aggregated per report cycle and is created for every node. By default the following groups exist:

    SAP     SAP services
    DB      Database services
    IDLE    Share (proportion) of the free CPU capacity
    OTHER   Sum of all processes not belonging to a defined group

It is possible to define further services and assign them to existing or new groups.

Accounting Option

The accounting option is, like the performance option, an optional part of the FA Agents that can be activated. The production of the accounting data is a multistage process which determines accounting data through aggregation and analysis of the recorded raw data.

    System SID  Service  Hostname  Timestamp  CPU ms  CPU %  Mem Kb  Mem %  SAPS abs  SAPS %
    P22         DB       Host 1
    P22         CI       Host 2
    P22         APP      Host 3
    P22         J        Host 4
    P22         JC       Host 5
    P22         SCS      Host 6
    P22         ASCS     Host 7
                Backup   Host 1
                xy       Host 3

    Values per report cycle: Min, Max, Avg, Total

The accounting data is determined on the basis of SAPS values. SAPS is the unit of measure used for sizing a server for SAP operation. SAPS values are only available within the scope of a defined benchmark with defined SAP transactions. Therefore only SAPS equivalents can be produced and calculated during operation. For this purpose

the agents dynamically evaluate information on the SAP version and hardware workload data and use this to calculate the SAPS equivalent values.

Important parameters for the accounting are the detection and report cycles. The detection cycle defines the number of measurements within a report cycle. The minimum, maximum and average values are calculated on the basis of the individual measurements for a report cycle. The detection cycle always corresponds to the detection cycle of the FA Agents, which is also configured for the autonomy function.

The following figure shows the ascertainment and calculation of values with regard to the detection cycle and report cycle.

[Figure: SAPS over time t. Labels: server capacity; max workload; min workload; total work; detection cycle (default 10 sec); report cycle (1 min)]

Automatic Calculation of SAPS Values

The SAPS calculation is based on the automatically and dynamically determined workload ability of a node. Because of the variety of technical features of modern servers, such as cache, CPU, hyperthreading etc., and the possibilities of the operating system to use these, the automatic determination can come to wrong assumptions concerning the workload ability of a node. In these cases the automatic valuation can result in incorrect workload calculations. If the internal automatic determination of the SAPS value produces incorrect values, the manual SAPS calculation can be used.

Manual Calculation of SAPS Values

If the maximum workload number of a server cannot be correctly determined by the FA application agent, the workload number can be defined individually for each node. The workload values are then calculated using the prepared workload data. In this way the individual particularities of the workload abilities of a node can be taken into consideration. For this purpose, however, the workload values for each node have to be entered manually.

Billing

Using another calculation stage, chargeable workload units can be calculated from the SAPS-based accounting data. For the calculation, a range of parameters enabling differentiated pricing of the workload used can be set. In the default mode, all systems and services are charged at the same value every time. With the help of the FlexFrame ControlCenterAccounting plug-ins, the pricing can be determined through additional configuration settings. The following specifications are necessary for this:

- Service contract no.
- System ID
- Service ID
- Date range
- Day type, e.g. weekday, holiday, weekend
- Time of day, e.g. daytime, nighttime operation

The billing table enables very differentiated billing of accounting data down to the service contract level. By way of system, service, time of day and day type, different service contract items with various workload prices can be used.

    Service contract  System ID  Service type  fromdate  todate  daytype  from Time  to Time  Service level rule, Accounting rule, Accounting Unit  Price
    SC_12345          P22        all                             workday  00:00      24:00    Sapsrule Standard                                     0.25
    SC_12345          P22        all                             weekend  00:00      24:00    Sapsrule Standard                                     0.15
    SC_12345          P23        DB                              daily    00:00      24:00    Sapsrule Standard                                     0.30
    SC_12345          P23        other                           daily    00:00      24:00    Sapsrule Standard                                     0.20
    SC_12345          Q22        all                             daily    00:00      24:00    Sapsrule Standard                                     0.15

[Figure labels: CPU/SAPS values; aggregation cycle; accounting cycle; accounting report]
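To make the role of the billing table more concrete, the following minimal sketch looks up a price for an accounting record and multiplies it by the number of workload units. The rows are copied from the example table above; everything else is an assumption made for illustration and is not taken from the manual: the names `BillingRow`, `find_price` and `charge`, the first-match-wins order, the treatment of "all" and "other" as wildcards, "daily" matching every day type, and the charge being simply units multiplied by price.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BillingRow:
    contract: str
    system_id: str
    service_type: str   # a concrete type such as "DB", or "all" / "other"
    day_type: str       # "workday", "weekend" or "daily"
    price: float        # price per accounting unit


# Rows copied from the example billing table; all of them cover 00:00 to 24:00.
BILLING_TABLE = [
    BillingRow("SC_12345", "P22", "all", "workday", 0.25),
    BillingRow("SC_12345", "P22", "all", "weekend", 0.15),
    BillingRow("SC_12345", "P23", "DB", "daily", 0.30),
    BillingRow("SC_12345", "P23", "other", "daily", 0.20),
    BillingRow("SC_12345", "Q22", "all", "daily", 0.15),
]


def find_price(system_id: str, service_type: str, day_type: str) -> Optional[float]:
    """Return the price of the first matching row, or None if no row matches.

    Assumed semantics (hypothetical): rows are checked in table order,
    "all" and "other" match any service type that reaches them, and
    "daily" matches every day type.
    """
    for row in BILLING_TABLE:
        if row.system_id != system_id:
            continue
        if row.service_type not in ("all", "other", service_type):
            continue
        if row.day_type not in ("daily", day_type):
            continue
        return row.price
    return None


def charge(system_id: str, service_type: str, day_type: str, units: float) -> Optional[float]:
    """Charge for a number of workload units, assuming charge = units * price."""
    price = find_price(system_id, service_type, day_type)
    return None if price is None else units * price
```

Under these assumptions the database service of P23 is priced by its own row, while every other P23 service falls through to the "other" row.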


4 FlexFrame Autonomy

The operation of SAP systems is becoming increasingly complex, and the number of components required is constantly rising. Installation, configuration and operation of a distributed SAP installation consequently involve considerable administrative effort. The demands on the systems change rapidly, and it must be possible to expand an existing configuration or replace failed components both quickly and flexibly. Through the use of Autonomy Agents, FlexFrame enables the number of operator interventions to be reduced and availability to be increased.

This chapter describes the application scenarios for the FlexFrame Autonomy functions. Installation and startup of the agents is described in chapter 2.

To permit active operation of a FlexFrame Autonomy installation, the FlexFrame Autonomy Agents must run on the Application Nodes and a Control Agent on the active Control Node. Use of the Messenger component is optional and is required only for displaying and forwarding events of the FA Agents and for integration into Enterprise Event Management Systems.

The FlexFrame Autonomy Application Agent is used to monitor SAP central instances, SAP application instances and database instances. In the event of a problem, so-called self-repair mechanisms are used for these services. Execution of these self-repair mechanisms can be triggered locally or centrally. For each service/node these mechanisms can be divided into the following categories:

- Monitoring of a service
- Restarting of a service if it is down
- Rebooting of a node if a service could not be started again after one or more restarts
- Switchover (automatic change) to another node if the reboot could not be performed or was not successful
- Detection of START, STOP and maintenance situations
- Control functions for displaying activities and statuses, sending mails and SMSs, configurable in conjunction with time, contact and problem situation

4.1 FlexFrame Autonomy Reactions

FlexFrame Autonomy detects problems and decides autonomously on the reactions to be implemented after evaluating rules which can be controlled via parameters. FlexFrame Autonomy knows the following basic reactions:

- Restart
- Reboot
- Switchover (internal / external)

These basic reactions, combined with pool creation, grouping, the service classes and the service priorities, result in a large number of reaction scenarios.

Restart

The FA_AppAgent restarts a service if a required subservice is down or no longer available. It then checks, on the basis of a configurable time interval, whether the service is available again after the restart. The restart is not performed if any service which runs on the node has already triggered a reboot. Furthermore, failure of multiple subservices of a service leads to a restart within the configured time interval only until service availability has been restored.

The number of restart attempts for restoring service availability can be configured. If the configured number of restart attempts is 0, failure of the service results directly in a reboot attempt. If the number of reboots permitted for the node is 0, a switchover is initiated.

The restart reactions are not affected by pool creation, grouping or service classification.

Reboot

A node is rebooted if a monitored service has failed and could not be made available again after the configured number of restart attempts, or if no restarts are allowed. The autonomous reaction reboot also evaluates the service class and the service priority of the service which causes the reboot. If multiple services are running simultaneously on the node, the reboot rule is used to check whether services with the same or higher priority are still running. If this is the case, the reboot is not performed; only a corresponding alarm is generated which informs the administrator of this problem.
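The escalation from restart to reboot to switchover, including the reboot rule, can be summarized as a small decision function. This is a sketch for illustration only and not the agents' implementation: the names `ServiceState` and `next_reaction`, the string return values and the parameter names are hypothetical. It also assumes that services with priority 0 (autonomous functions disabled) neither trigger reactions nor block a reboot.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ServiceState:
    priority: int            # 1 is highest; larger numbers mean lower priority
    running: bool = True
    restarts_done: int = 0   # restart attempts already made for this failure


def next_reaction(failed: ServiceState,
                  others_on_node: List[ServiceState],
                  max_restarts: int,
                  max_reboots: int,
                  reboots_done: int,
                  reboot_pending: bool = False) -> str:
    """Return "none", "restart", "reboot", "alarm" or "switchover"."""
    if failed.priority == 0:
        return "none"        # autonomous functions are disabled for this service
    if reboot_pending:
        return "none"        # another service on the node already triggered a reboot
    if failed.restarts_done < max_restarts:
        return "restart"
    if reboots_done < max_reboots:
        # Reboot rule: do not reboot while a service with the same or a
        # higher priority (same or smaller number) is still running.
        blocked = any(
            other.running and 0 < other.priority <= failed.priority
            for other in others_on_node
        )
        return "alarm" if blocked else "reboot"
    return "switchover"
```

With `max_restarts=0` the function goes straight to the reboot decision, and with `max_reboots=0` as well it returns "switchover", which mirrors the two special cases described above.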

Switchover

A switchover always leads to all the monitored services of a node being moved to another Server Node. The decision to move to another node can be taken locally by the FlexFrame Autonomy Agent (internal switchover), or by the Control Agent on one of the Control Nodes (external switchover).

General Rules

Takeover in the event of a node failure is implemented using what is termed an applicant rule. The applicant rule states that each spare node may apply to take over the services of a failed node. Pool creation, grouping and service classes permit new switchover scenarios which can satisfy different availability requirements depending on the parameterization. This results in the following scenarios:

- Pool-dependent switchover
- Group-dependent switchover
- Service-prioritized switchover

The failure of a node is only reacted to within a virtual FlexFrame pool.

Groups can be defined within a virtual FlexFrame pool. The applicant rule states that a node only issues an application when a node in its own group fails.

The granularity of the reaction to a system failure can be further refined by prioritizing individual services. The applicant rule states that in the event of simultaneous failure of multiple services, the application is first issued for the switchover file (testament) of the service with the higher priority. Only when all higher-priority services have been taken over by another node, and free spare nodes still exist, do these apply for the switchover files of lower-priority services which still need to be taken over.

When services with priority 0 fail, no applications are made by spare nodes. This prevents spare nodes being used up by the failure of unimportant test systems.

The parameter file also contains a minimum priority parameter. This parameter provides a very simple way to define, for example, that spare nodes only apply to take over the services of a node if none of the failed services has a lower priority than that entered there.
In conjunction with the basic rule that all services have priority 1 by default, a lower priority can be configured for individual services, thus providing a simple way to prevent valuable spare nodes being used up by the failure of test systems.
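The general applicant rules (same group only, no application for priority 0, minimum priority, higher priority first) can be combined in one function. The sketch below is an illustration under stated assumptions, not the agents' code: the names `FailedService` and `applications` are hypothetical, the failed services are assumed to come from a single failed node, and priority-0 services are assumed to be excluded before the minimum priority check.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FailedService:
    name: str
    priority: int   # 1 is highest; 0 disables the autonomous functions


def applications(spare_group: str,
                 failed_group: str,
                 failed_services: List[FailedService],
                 min_priority: int) -> List[FailedService]:
    """Return the services a spare node applies for, highest priority first."""
    # A node only applies when a node in its own group fails.
    if spare_group != failed_group:
        return []
    # No applications are made for services with priority 0.
    candidates = [s for s in failed_services if s.priority != 0]
    # Minimum priority: no application if any failed service has a lower
    # priority (a larger number) than the configured minimum.
    if any(s.priority > min_priority for s in candidates):
        return []
    # Higher priority (smaller number) is applied for first.
    return sorted(candidates, key=lambda s: s.priority)
```

For a failed node with a priority 1 database service, a priority 3 central instance and a priority 0 test service, a spare node in the same group with `min_priority=5` applies for the database service first and the central instance second, and never for the test service.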

Internal Switchover

In the case of an internal switchover the Application Agent recognizes that a service is down and cannot (or, depending on the configuration, may not) be restored using a restart or reboot. The FlexFrame Autonomy Agent then initiates an internal switchover. The actual takeover by another node begins with the transfer message. Only spare nodes can apply and take over these services. The node which takes over control starts the required services. If, after the maximum switchover time, the FA_AppAgent on the system that is to take over control is not able to start the services, it reports this by means of an SNMP trap. The switchover is aborted and must be processed further by the administrator.

External Switchover

In contrast to the internal switchover, the external switchover is detected and initiated by the Control Agent on a Control Node. This is required if the system is showing no sign of life or can no longer be reached in the network. Reachability is tested using Ping or SSH tests. The user decides whether to perform only Ping, only SSH or both kinds of test. Additionally, the Ping requests may be configured for client LAN, server LAN and/or storage LAN interfaces. The takeover is performed in the same way as for the internal switchover.

In order to enable user-specific actions before or after a node is powered down, the CtrlAgent calls hook scripts, which may be customized by the user. The scripts are provided with the return code of the previously executed action.

- Pre-PowerOff hook script: called with return code 0 as argument, as there was no previously executed action.
- Post-PowerOff hook script: called with the return code of the Pre-PowerOff hook script (if it failed, i.e. the return code was != 0) or with the return code of the power off script.

If the configuration value IgnorePoffHookResult is set to true, the return codes of the hook scripts are ignored.
If set to false, they are used as hints on how to proceed in case of errors:

- If the Pre-PowerOff hook script returns a value != 0, power off will not be performed.
- If the Post-PowerOff hook script returns a value != 0, the SwitchOver will not proceed.

This enables the user to customize the external switchover and power off processes based on additional information or rules, or to perform additional actions, e.g. mounting SAN devices.

Maintenance

The autonomous functions and reactions of the FA Agents can be disabled for individual services by calling a maintenance script. This is always required when application instances are to be started and stopped without autonomous reactions.

A service is set to nowatch using the following scripts on the relevant Application Node:

    sapdb <SID> nowatch
    sapci <SID> nowatch
    sapapp <ID> <SID> nowatch

A service is reincluded in monitoring using the following scripts on the relevant Application Node:

    sapdb <SID> watch
    sapci <SID> watch
    sapapp <ID> <SID> watch

4.2 Self-Repair Strategies

In terms of the strategy for restoring a failed service, a distinction must be made between the following failures:

- Service failure
- Node failure

A detailed description of the procedure for the subsequent autonomous reaction was provided in the preceding section.

Self-Repair in the Event of a Service Failure

If a service failure occurs, this is detected by the myamc.fa_appagent and an attempt is made to make the service available again using the following autonomous reactions and their escalations:

- Restart of the service
- Reboot of the node
- Switchover (internal)

Whether and how often the above-mentioned autonomous reactions and their escalations are performed can be influenced by the configuration.

Self-Repair in the Event of a Node Failure

If a node failure occurs, this is detected by the myamc.fa_ctrlagent and an attempt is made to make the service available again using the following autonomous reaction:

- Switchover (external)

Takeover by a Spare Node (Switchover)

The standard rule in the FlexFrame concept for taking over the services of a failed node is to have them taken over by a spare node.

Every Application Node in a standard FlexFrame installation on which a FA_AppAgent is running and on which none of the monitored services exists is automatically a spare node.

If a switchover is started as a result of a node failure or escalation of a service failure, all spare nodes apply to take over the services. The quickest node in the application procedure is chosen and takes over the tasks.

Multi Node Failure

The simultaneous failure of multiple systems or nodes is called Multi Node Failure. This indicates a different kind of failure than a single node or system failure; the cause is usually more complex.

From version V30A10 on, the FA Agents offer support for the automatic detection of Multi Node Failures with different reactions and additional alarms. This allows the recognition of failure states which require the attention and decision of an administrator. Several new parameters allow the usual behaviour to be modified, for example by delaying or skipping reactions. Additionally, a set of new alarms triggered by user-configurable indicators informs the administrator in case of a multi node failure, so that appropriate action can be taken. The configuration of these indicators can be performed per pool.

There are two different Multi Node Failure scenarios:

1. Simultaneous failure of multiple Nodes, Systems or Services, e.g. due to a power outage in a blade cabinet, which shows an effect within a short period of time (e.g. one minute)
2. Failure of several Nodes, Systems or Services within a specific time range which is bigger than the one specified above (e.g. one hour)

These scenarios are called ShortTime Failure and LongTime Failure. The CtrlAgent keeps a list of all failures, with each entry containing node name and timestamp.
If the number of entries within a scenario-specific time range exceeds the limit, the CtrlAgent assumes a Multi Node Failure.

Case 1: ShortTime Failure

Simultaneous failure of multiple Nodes, Systems or Services, e.g. due to a power outage in a blade cabinet, which shows an effect within a short period of time (e.g. one minute).

MultiNodeFailure_ShortTime_FailureCount
    Specifies the number of failures within a certain time range which leads to a Multi Node Failure state.

MultiNodeFailure_ShortTime_FailureTime
    Specifies the time range (in seconds) to be used for failure aggregation.

MultiNodeFailure_ShortTime_ReactionDelay
    Specifies a delay time (in seconds) before the CtrlAgent reacts to failures.

MultiNodeFailure_ShortTime_ReactionAction (for future use)
    Specifies a reaction different from the normal mode of operation.

In case of a Short Time Multi Node Failure the CtrlAgent sends an emergency alarm. Additionally, the usual autonomous reactions can be delayed or skipped (by setting MultiNodeFailure_ShortTime_ReactionDelay to a very large value).

Case 2: LongTime Failure

Failure of several Nodes, Systems or Services within a specific time range which is bigger than the one specified above (e.g. one hour).

MultiNodeFailure_LongTime_FailureCount
    Specifies the number of failures within a certain time range which leads to a Multi Node Failure state.

MultiNodeFailure_LongTime_FailureTime
    Specifies the time range (in seconds) to be used for failure aggregation.

MultiNodeFailure_LongTime_ReactionDelay
    Specifies a delay time (in seconds) before the CtrlAgent reacts to failures.

MultiNodeFailure_LongTime_ReactionAction (for future use)
    Specifies a reaction different from the normal mode of operation.

In case of a Long Time Multi Node Failure the CtrlAgent sends an emergency alarm. Additionally, the usual autonomous reactions can be delayed or skipped (by setting MultiNodeFailure_LongTime_ReactionDelay to a very large value).
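The detection described for both cases amounts to counting the entries of the failure list that fall into a time window. The following sketch illustrates this; it is not the CtrlAgent's implementation. The names `Scenario` and `is_multi_node_failure` and the example values are hypothetical, and the sketch assumes that the Multi Node Failure state is reached as soon as the number of failures in the window equals the configured FailureCount.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Scenario:
    failure_count: int   # e.g. MultiNodeFailure_ShortTime_FailureCount
    failure_time: float  # e.g. MultiNodeFailure_ShortTime_FailureTime, in seconds


def is_multi_node_failure(failures: List[Tuple[str, float]],
                          now: float,
                          scenario: Scenario) -> bool:
    """Check the failure list (node name, timestamp) against one scenario.

    Assumption: reaching failure_count within the window is enough to
    assume a Multi Node Failure.
    """
    recent = [
        node for node, timestamp in failures
        if 0 <= now - timestamp <= scenario.failure_time
    ]
    return len(recent) >= scenario.failure_count


# Hypothetical example values: 3 failures within one minute,
# or 4 failures within one hour.
SHORT_TIME = Scenario(failure_count=3, failure_time=60)
LONG_TIME = Scenario(failure_count=4, failure_time=3600)
```

Because both scenarios are evaluated against the same failure list, a burst of failures can satisfy the ShortTime scenario while a slow trickle over an hour satisfies only the LongTime scenario.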

4.3 Takeover Rules

Overview

Rule-based high availability for nodes and services is performed by evaluating rule sets which control the takeover of services from a failed node. They consist of qualification rules, a takeover strategy and takeover rules.

The qualification rules specify which nodes may apply for the services of a failed node.

The takeover strategy defines the conflict resolution mode to be used when more than one node applies for a node testament.

The takeover rules control the actual takeover, i.e. service start order and possibly service displacement or replacement.

TakeOver Strategy

Overview

The qualification rules specify which nodes may apply for the services of a failed node. When performing a SwitchOver, all nodes may apply for takeover of the failed node's services by taking part in an auction. As long as the auction lasts, all nodes which match the requirements specified in the failed node's testament may apply. When it is finished, the takeover strategy is used to decide which node has won the auction.

FirstFit

FirstFit specifies that the first node which applied for a testament is the winner. This is the default strategy.

LowPrioFit

From version V30A10 on, the FA Agents provide a new strategy: LowPrioFit. The Application Node containing the services with the lowest priority wins the auction. It therefore has the best chance to replace or displace some running services in order to take over the failed services. By definition a spare node is considered to have the lowest priority, so it will win an auction over a node with running services. A node with only services of priority 0 will win over a node with services of priority 1 and higher, and so on.

This strategy can be used as an alternative to FirstFit. It changes only the behaviour of the new takeover rules: add rule, replace rule and substitution rule. If only the spare

node rule is used, the behaviour is the same as with the FirstFit strategy, because all spare nodes have the same priority and the first one wins the auction.

TakeOver Rule

Overview

In version 3 and higher, the FA Agents offer the option of configuring various takeover rules. It is now possible to replace or supplement the previously available spare node rule with further alternatives. Generally it is possible to differentiate between a static takeover rule and a dynamic takeover rule. These takeover rules can be applied not only for nodes but also for service-based testaments. With node-based testaments, the evaluation is based in each case on the highest priority in the testament or the highest priority of a current service on an Application Node.

Static Takeover Rule

The takeover rule is referred to as static if one of the possible takeover rules is configured. The spare node rule available until version 3 is a static takeover rule. With version 3.0 of the agents the following static, configurable takeover rules are available:

- Spare node rule
- Add rule
- Replace rule
- Substitution rule

Dynamic

These rules allow very granular reaction to the breakdown of Application Nodes and services. It is no longer necessary to always keep spare nodes on hand for high availability in case of a breakdown of services. This function can now also be performed by Application Nodes which already run services.

Takeover through Spare Nodes (TakeOver)

The takeover by way of a spare node is the standard rule in the FlexFrame concept for taking over the services of the defective node.

Every Application Node in a standard FlexFrame installation on which the FA_AppAgent runs and on which none of the controlled services exist is automatically a spare node. If a switchover occurs through the breakdown of a node or the escalation of a service disturbance, all spare nodes apply for the takeover of the services. The quickest node in the application procedure receives the job and takes over those services. The application takes place at group level. In a default configuration, the spare node rule always applies.

Add Rule

With this rule all spare nodes of a group can apply, but so can all nodes which still have sufficient workload reserves. The add rule only uses a node if the priority of the services in the testament is equal to or higher than the priority of the services already running on the node. This prevents a high-priority node from taking over lower-priority services. With the add rule, only the nodes (from pool, group) which possess services with lower or equal priority apply.

If ( OWN_prio_max >= SWO_prio_max ) apply

The services that have been taken over are started in addition to the services already running. Thus the add rule is employed if running services are not supposed to be stopped by the takeover. This normally gives rise to a performance disadvantage after the takeover, both for the services taken over and for the services already running on the Application Node. The add rule expects a configured maximum SAPS value reserved for each service. The node only applies if its SAPS capacity suffices for the operation of the already running services and the services taken over.

Replace Rule

This rule enables all spare nodes of a group to apply, but also all other nodes on which services with lower priority run. The replace rule only uses a node if the priority of the services in the testament is higher than the priority of the services already running.

If ( OWN_prio_max > SWO_prio_max ) apply

Attention: Prio 1 is the highest priority; numbers > 1 mean decreasing priority.

Services which are already running are displaced. The adopted services are started. A displaced service is transferred via the internal SwitchOver.
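The apply conditions quoted for the add rule and the replace rule can be sketched as follows. This is an illustrative sketch only, not the agents' implementation: the function names, the use of None to represent a spare node, and the SAPS arguments are assumptions made for this example. Priorities follow the convention stated above (1 is the highest priority), so a numerically larger value means a lower priority.

```python
from typing import Optional


def may_apply_add(own_prio_max: Optional[int], swo_prio_max: int,
                  free_saps: int, needed_saps: int) -> bool:
    """Add rule: apply if OWN_prio_max >= SWO_prio_max and SAPS reserves suffice.

    own_prio_max is None for a spare node (no services running); this
    representation is an assumption of the sketch.
    """
    if free_saps < needed_saps:
        return False  # not enough workload reserve for the additional services
    if own_prio_max is None:
        return True   # spare nodes of the group may apply
    return own_prio_max >= swo_prio_max


def may_apply_replace(own_prio_max: Optional[int], swo_prio_max: int) -> bool:
    """Replace rule: apply only if OWN_prio_max > SWO_prio_max."""
    if own_prio_max is None:
        return True   # spare nodes of the group may apply
    return own_prio_max > swo_prio_max


if __name__ == "__main__":
    # Node running priority-3 services, testament with highest priority 2.
    print(may_apply_add(3, 2, free_saps=500, needed_saps=300))  # True
    # Equal priority is enough for the add rule but not for the replace rule.
    print(may_apply_replace(2, 2))  # False
```

The only difference between the two checks is the strictness of the comparison: with equal priorities a node may still add services, but it may not displace its own.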

The other nodes of the group apply according to the application procedure, and the services can again be made available on other nodes, provided that lower-priority nodes still exist within the group. This rule defines which services may be stopped within the context of a takeover scenario so that higher-priority services can be adopted and possibly made available on another node.

Substitution Rule

This rule enables all spare nodes of a group to apply, but also all other nodes on which lower-priority services run. With the substitution rule, only nodes (from pool, group) possessing services with a lower priority than the highest priority in the switchover file apply.

If ( OWN_prio_max > SWO_prio_max ) apply

Attention: Prio 1 is the highest priority; numbers > 1 mean decreasing priority.

Services which are already running are stopped according to the rules of the stop hierarchy. The adopted services are started.

Dynamic TakeOver Rules

Overview

In static mode only one of the possible takeover rules can apply. With the parameter value dynamic as the static takeover rule, the decision as to which rule is used is made dynamically, depending on the priorities of the services on the Application Nodes and the highest priority of the defective services. For this purpose the parameter file contains the priorities table, which is only used when the static rule is set to dynamic.

The dynamic takeover rule allows disjunct as well as overlapping priority domains between the spare node rule and one of the other static takeover rules. For the takeover rules add rule, replace rule and substitution rule, each priority domain must be assigned unambiguously to one rule. If a priority domain is defective, the first fitting rule is used.

The dynamic takeover rules are best understood from the view of a service on one of the Application Nodes. The add rule, the replace rule and the substitution rule can be seen from the view of a running service on an Application Node as escalation stages which, within the context of a takeover scenario, increasingly handicap it unless it can avoid impairment through a higher priority.

As soon as a testament is published for takeover, all Application Nodes of the group check whether they can improve their status (services with higher priority). An Application Node

applies if, on successful application, an improvement to its service priority status is possible. The successful applicant determines the takeover dynamically according to the parameterized takeover rule. The rule results from the evaluation of the dynamic takeover table and the highest priority of the running services.

From this view, the spare node naturally has a special role, as it has no services it could handicap. It has the function of protecting its resources for special challenges. The parameters registered for the spare node are therefore evaluated differently in principle. They determine the priority domains in which the spare node may apply. The priority domain for which a spare node is responsible can overlap with the priority domains of the other rules. A spare node now no longer applies for all defective services, as was the case with the simple spare node rule, but only for services in its priority domain. This prevents spare nodes from being wasted on operating defective lower-priority services.

The rules are called dynamic takeover rules because the rule according to which an Application Node takes part in the application procedure on breakdown of other Application Nodes or services is decided dynamically, depending on the highest service priority that the Application Node already has. The decision as to which rule is applied thus depends on the services running on an Application Node at that time.

The dynamic takeover rules are supposed to ensure that a node is always available for the highest-priority services. Lower-priority services can be operated in parallel. On breakdown of higher-priority systems, the lower-priority services are displaced or replaced. The dynamic rules also have the function of reserving valuable spare nodes for high-priority services. This allows the use of different workloads and priorities in a FlexFrame pool service.

In the case of hardware breakdowns with very different disturbance scenarios, it is now possible to use the remaining hardware optimally. The application procedure itself does not change. Of all the Application Nodes that applied, the one which registered first wins. This is determined by the configuration parameter qualification rule first fit.

The rules to be applied are parameterized through a table with the following structure:

Spare node rule      Prio >= 1  < 2
Add rule             Prio >= 2  < 4
Replace rule         Prio >= 4  < 6
Substitution rule    Prio >= 6

The interpretation of this table depends on the dynamic evaluation implemented by an Application Node for itself. An Application Node without services is a spare node. A spare node applies only for node or service testaments with a service belonging to a priority domain defined for spare nodes. This rule enables the reservation of spare nodes for high-priority services, without them being lost to lower-priority services. This is

shown by the example with the following parameterization for the spare node rule in the dynamic takeover table.

Spare node rule      Prio >= 1  <= 2

The spare nodes only apply for defective services which have a priority of 1 or 2. This rule allows the use of spare nodes to be restricted to high-priority services.

Spare node rule      Prio >= 1  <= 2  exclusive

The additional attribute exclusive prevents Application Nodes which already possess services from applying for services with the priority 1 or 2.

For Application Nodes which are not spare nodes, the rule for application is a bit more complex. Here, two different dynamic influencing factors are at work: the highest priority of a service in a testament for which it is possible to apply, and the highest priority of a service already running on the Application Node. For the application rule, the principle applies that an Application Node does not apply for new services if this would cause already-running services with equal or higher priority to be handicapped.

Example of a dynamic takeover rule:

[Figure: Pool n with Group 1, Group 2 and one spare node; a testament with Prio and SAPS values; individual dynamic switchover rules per pool; group-specific application process; candidates with add rule, replace rule and substitution rule. Rule table shown in the figure: Spare node >= 1 < 2, Add rule >= 2 < 3, Replace rule >= 3 < 4, Substitute rule >= 4.]

Dynamic Example

In the dynamic rule the highest priority of the services in the SwitchOver file and the highest priority of the services on the own node are used as further decision criteria. For this there is a further configuration:

TakeOver rule    Testament High Prio
                 (min)    (max)
SpareNode        1        4

TakeOver rule    Own High Prio
                 (min)    (max)
Add              3        4
Replace          5        6
Substitute       8        20

Additionally there is the setting Dyn_Spare_exclusive, which means that the priority range for SpareNode is reserved exclusively for the SpareNodes.

The table above means the following:

High Prio in testament == 1 to 4:
    Dyn_Spare_exclusive = true: Only SpareNodes may apply.
    Dyn_Spare_exclusive = false: All nodes may apply.
High Prio in testament > 4:
    SpareNodes may not apply.
All nodes with Own High Prio 3-4, 5-6 and 8-20 may apply.
All nodes with Own High Prio 1, 2, 7 and > 20 may not apply.
Own High Prio == 3 or 4: The TakeOver rule Add is used.
Own High Prio == 5 or 6: The TakeOver rule Replace is used.
Own High Prio == 8 to 20: The TakeOver rule Substitute is used.

4.4 Operating Mode

FlexFrame Autonomy has various operating modes. The operating modes enable the FA Agents to be used for anything from simple monitoring tasks with automatic alarms through to semi-autonomous or fully autonomous operation of a service. Particularly in the startup and learning phases, this flexible configurability permits the successive replacement of manual interventions by autonomous reactions.

The following operating modes are provided to implement the reactions:

Event mode

Local reaction mode
Central reaction mode

The operating modes can be defined at service level. This enables the degree of autonomous functions to be configured on an individual basis in accordance with the priority and importance of a service. Parameterization of these modes takes place using the parameters

Service_MaxRestartNumber
Node_MaxRebootNumber
Node_MaxSwitchOverNumber

Event Mode

No autonomous functions are performed in event mode. The events are just reported to an event console in the form of messages. The administrator can then decide whether the reactions proposed by the Autonomy Agents are appropriate and execute them manually. This mode is particularly recommended in the introductory phase. For this purpose the parameter variables for the number of permissible restarts, reboots and switchovers are set to 0.

Local Reaction Mode

Local reaction mode is the standard operating mode for a FlexFrame installation. The reactions for restart, reboot and switchover are activated for all services in the myamc_fa.xml parameter file, and at the same time the number of reaction attempts for each reaction type is defined.

Central Reaction Mode

In central reaction mode the reactions are not initiated by the FA_AppAgents. Here each of the local parameter variables for restart, reboot and switchover is set to 0. The reactions are initiated from a central position. This central position can be the FA_CtrlAgent in a FlexFrame installation. At the moment this method can only be used for external switchover scenarios. A further option is that the reactions are created externally on the basis of the traps sent and are forwarded to the FA_AppAgents via the BlackBoard.

4.5 Autonomous Operation of a FlexFrame Infrastructure

In every application environment the user needs an option for starting and stopping new application instances. This can be implemented via the SAP Adaptive Computing Controller (ACC) or using the FSC FlexFrame scripts. FlexFrame Autonomy enables one of these two options to be used on a pool-specific basis. The selection is made in the myamc_fa.xml file. The agents' autonomous reactions can likewise take place either directly by calling the FlexFrame scripts or by transferring a job to the ACC. The ACC then executes the required reactions.

FlexFrame Autonomy and the Adaptive Computing Controller (ACC)

The Adaptive Computing Controller is the SAP component which can be used to start and stop the application instances. To permit the ACC to be utilized for user interactions and for the autonomous reactions, the SAP system containing the ACC functionality must be configured in the configuration file myamc_fa_acc.xml. Information on installing and configuring the ACC must be taken from the current SAP documentation.

FlexFrame Autonomy and FSC FlexFrame Scripts

The reactions and user interactions make direct use of the FSC FlexFrame scripts, either as an alternative to the ACC or, in installations earlier than FlexFrame V 3.1, always. The FSC FlexFrame scripts are responsible for starting and stopping the SAP instances. They permit the SAP instances to be visualized and also supply the information required for the Autonomous Agents to detect the user interactions.

4.6 FlexFrame Autonomy and User Interactions

The autonomous reactions and the user interactions influence the status of FlexFrame Autonomy. Status changes and user interactions are logged in the log files and, if so configured, also sent as traps. Status creation begins with the startup of the agents and the evaluation of the parameterization information. When the FA_AppAgents are started up they read the parameterization file and subsequently know their job and operating mode. The FA Agents send a startup trap when they start up.

Service instances that are already running are automatically recognized and managed. The most important commands, reactions and their events in the various startup situations of the FA Agent are described in the following.

Restart, no services running:
    The FA_AppAgent sends only a node startup alert but no alerts for services which do not exist.
Restart of the FA_AppAgent when services are running:
    The agent reports its own restart and availability and the various SAP services which are running.
After reboot:
    The agent reports its own startup and availability and the successful startup of each individual service.
After switchover:
    The agent that takes over control starts the services described in the testament and sends events corresponding to these.

myamc.fa Agents: Starting/Stopping/Status

Starting the myamc.fa Agents Manually

An FA_AppAgent can be started manually using one of the following commands:

/etc/init.d/myamc.fa_appagent start
/opt/myamc/fa_appagent/myamc.fa_appagent start

An FA_CtrlAgent can be started manually using one of the following commands:

/etc/init.d/myamc.fa_ctrlagent start
/opt/myamc/fa_ctrlagent/myamc.fa_ctrlagent start

or, with a pool name:

/opt/myamc/fa_ctrlagent/myamc.fa_ctrlagent start <pool_name>
/opt/myamc/fa_ctrlagent/myamc.fa_ctrlagent start Cust_

Stopping the myamc.fa Agents Manually

An FA_AppAgent can be stopped manually using one of the following commands:

/etc/init.d/myamc.fa_appagent stop
/opt/myamc/fa_appagent/myamc.fa_appagent stop

An FA_CtrlAgent can be stopped manually using one of the following commands:

/etc/init.d/myamc.fa_ctrlagent stop
/opt/myamc/fa_ctrlagent/myamc.fa_ctrlagent stop

or, with a pool name:

/opt/myamc/fa_ctrlagent/myamc.fa_ctrlagent stop <pool_name>
/opt/myamc/fa_ctrlagent/myamc.fa_ctrlagent stop Cust_1

When the myamc.fa Agents are stopped, a ShutDown trap is sent.

Status of the myamc.fa Agents

The status of the FA_AppAgents can be queried using one of the following commands:

/etc/init.d/myamc.fa_appagent status
/opt/myamc/fa_appagent/myamc.fa_appagent status

The status of the FA_CtrlAgents can be queried using one of the following commands:

/etc/init.d/myamc.fa_ctrlagent status
/opt/myamc/fa_ctrlagent/myamc.fa_ctrlagent status

or, with a pool name:

/opt/myamc/fa_ctrlagent/myamc.fa_ctrlagent status <pool_name>
/opt/myamc/fa_ctrlagent/myamc.fa_ctrlagent status Cust_

Starting/Stopping an SAP Instance

Starting an SAP Instance

A further SAP instance of the type DB, CI, APP, J, JC, SCS, ASCS, LC or ERS can be started at any time. Use of the FlexFrame start scripts in a version which has been released for the FA Agents is mandatory. An instance which is to be monitored via the Autonomy Agents is started using the following script calls on the Application Node:

sapdb <SID> start
sapci <SID> start
sapapp <ID> <SID> start

The startup of an instance is documented by the agents using the following traps:

ServiceIsStarting trap, ServiceIsStarted trap, or ErrorStartingService trap

Stopping an SAP Instance

An active SAP instance of the type DB, CI, APP, J, JC, SCS, ASCS, LC or ERS can be stopped at any time. Use of the FlexFrame stop scripts in a version which has been released for the myamc.fa Agents is mandatory. Service instances which are stopped while the myamc.fa_appagent is running are detected by the agent and acknowledged with corresponding SNMP traps. In this case the FA Agent can use mechanisms integrated into the FlexFrame solution to distinguish between an instance being intentionally stopped and a service or instance crashing. The following traps are sent:

ServiceIsStopping trap, ServiceHasStopped trap, or ErrorStoppingService trap

4.7 Possible Applications

General

The FlexFrame Autonomy solution offers various configuration and application options. Application and Control Agents share the tasks for implementing autonomy in a FlexFrame solution. The tasks of the Control and Application Nodes vary according to their configuration. Some typical autonomy scenarios for the most important applications are presented in the following:

Passive monitoring of your instances (notification mode); no reactions take place (event mode).

Control of the instances' availability using active FlexFrame Autonomy Application Agents (local reaction mode); the reactions are triggered by the FA_AppAgent.

Control of the instances' availability using passive FlexFrame Autonomy Application Agents and active High-Autonomy Control Agents (central reaction mode); the reactions are triggered by the FA_CtrlAgent.

The settings for notifications and reactions can be configured independently.

Semi-autonomous Operation

In certain situations it may make sense to use only the notification functions of myamc.fa and initially to dispense with active intervention in the system. This scenario is practical, for example, when a central position is used to monitor various systems and analyze failure frequencies.

Monitoring of Application Instances

In order to monitor application instances, an FA_AppAgent which is equipped with detectors for the application instance to be monitored and has been configured for this must run on every node. The monitoring and reaction can be configured individually for each application instance.

The service-specific parameters are set in the configuration section of the services. A configuration for event mode can be achieved by the following settings in the Configsection Services Default section:

Service_SendTraps            : true
Service_EnableMonitoring     : true
Node_MaxRebootNumber         : 0
Node_MaxSwitchOverNumber     : 0
Service_MaxRestartNumber     : 0
Service_TrapSendDelayTime    : 0 or greater

With this configuration no reactions take place for any service type. However, event messages are sent if services are not available or have failed.

The parameters can, for example, be set on a service-specific basis for the semi-autonomous operation of services. In the Configsection Services APP section:

Service_SendTraps            : true
Service_EnableMonitoring     : true
Node_MaxRebootNumber         : 0
Node_MaxSwitchOverNumber     : 0
Service_MaxRestartNumber     : 3
Service_MaxRestartTime       : 120

With these settings for the APP service, up to three restart attempts are made to render an application service available again after it has failed. No reactions are implemented for the other service types CI and DB. However, event messages are sent if services are not available or have failed.

Autonomy for Application Instances

To permit autonomous operation of applications, FlexFrame Autonomy provides the option of monitoring instances and reacting actively to the failure of a service. The type of reaction depends on the configuration set. The following parameters can be used for this purpose in the parameter file:

Service_SendTraps            : true
Service_EnableMonitoring     : true
Node_MaxRebootNumber         : 2
Node_MaxSwitchOverNumber     : 1
Service_MaxRestartNumber     : 3
Service_TrapSendDelayTime    : 0 or greater
Service_ReactionDelayTime    : 0 or greater
MaxRebootTime                : 120
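How the three limit parameters switch the individual reactions on and off can be illustrated with a small sketch. It is not part of the FA Agents: the function name enabled_reactions and the dictionary representation of a configuration section are assumptions made for this example, and a missing parameter is treated as 0 purely for convenience.

```python
# Reaction types in escalation order, each with the parameter that limits it.
ESCALATION_ORDER = [
    ("restart", "Service_MaxRestartNumber"),
    ("reboot", "Node_MaxRebootNumber"),
    ("switchover", "Node_MaxSwitchOverNumber"),
]


def enabled_reactions(params: dict) -> list:
    """Return the reactions whose maximum number of attempts is greater than 0."""
    return [reaction for reaction, key in ESCALATION_ORDER
            if int(params.get(key, 0)) > 0]


# The three example configurations from the text, reduced to the limit parameters.
event_mode = {"Service_MaxRestartNumber": 0,
              "Node_MaxRebootNumber": 0,
              "Node_MaxSwitchOverNumber": 0}
semi_autonomous_app = {"Service_MaxRestartNumber": 3,
                       "Node_MaxRebootNumber": 0,
                       "Node_MaxSwitchOverNumber": 0}
autonomous = {"Service_MaxRestartNumber": 3,
              "Node_MaxRebootNumber": 2,
              "Node_MaxSwitchOverNumber": 1}

if __name__ == "__main__":
    print(enabled_reactions(event_mode))           # []
    print(enabled_reactions(semi_autonomous_app))  # ['restart']
    print(enabled_reactions(autonomous))           # ['restart', 'reboot', 'switchover']
```

An empty result corresponds to a configuration in which only event messages are sent and no reactions take place.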

Restart

Restarting a service is the first option Autonomy Agents have of reacting to a service failure. For this purpose a Service_MaxRestartNumber greater than 0 is specified and also the Service_MaxRestartTime. The Service_MaxRestartTime is evaluated by the FA_AppAgent.

Service_MaxRestartNumber     : 10
Service_MaxRestartTime       : 240

The time by which the service must be available again is the Service_MaxRestartTime of this service. A maximum of Service_MaxRestartNumber attempts are made to restore availability through this reaction; escalation to the next escalation level then follows.

Reboot

Rebooting a node is a further option for an Autonomy Agent to react and the next escalation level after a restart. For this purpose a Node_MaxRebootNumber greater than 0 is specified and also the Node_MaxRebootTime. The Node_MaxRebootTime here is evaluated by both the FA_AppAgent and the FA_CtrlAgent. After the Node_MaxRebootTime has elapsed the Control Agent uses the Node_CheckAvailabiltyCommand to check the availability of the system. If the RebootNumber is set to two or three, this value is increased by the corresponding factor.

Node_MaxRebootNumber         : 2
MaxRebootTime                : 120

The time by which a server and all its services must be available again is calculated by adding the Node_MaxRebootTime and the greatest Service_MaxRestartTime of all services to be started on this server. A maximum of Node_MaxRebootNumber attempts are made to restore availability through this reaction; escalation to the next escalation level then follows.

Switchover

A distinction is made between internal and external switchovers. The internal switchover of all of a node's services is a further reaction option for the Autonomy Agent and the next escalation level after a reboot. An internal switchover is initiated by a failed node which actively wishes to transfer its services.
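The escalation order described above (restart, then reboot, then switchover) and the availability time after a reboot can be sketched as follows. The function names and the plain integer time values are assumptions of this example, and the plan is a simplified upper bound on the reaction attempts, not the agents' actual scheduling logic.

```python
def escalation_plan(max_restarts: int, max_reboots: int,
                    max_switchovers: int) -> list:
    """List the reaction attempts in escalation order, up to each configured limit."""
    return (["restart"] * max_restarts
            + ["reboot"] * max_reboots
            + ["switchover"] * max_switchovers)


def reboot_availability_time(node_max_reboot_time: int,
                             service_max_restart_times: list) -> int:
    """Node_MaxRebootTime plus the greatest Service_MaxRestartTime on the node.

    With no services to start, only the reboot time counts (an assumption).
    """
    return node_max_reboot_time + max(service_max_restart_times, default=0)


if __name__ == "__main__":
    # Limits taken from the autonomy example: 3 restarts, 2 reboots, 1 switchover.
    print(escalation_plan(3, 2, 1))
    # Reboot time 120, two services with restart times 240 and 120.
    print(reboot_availability_time(120, [240, 120]))  # 360
```

Setting a limit to 0 simply removes that stage from the plan, which matches the way the event and semi-autonomous configurations disable individual reactions.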

For this purpose a Node_MaxSwitchOverNumber greater than 0 is specified.

Node_MaxSwitchOverNumber     : 1

The time by which a server and all its services must be available again is calculated by adding the Service_MaxRestartTime of all services to be started on this server.

An external switchover is initiated by a Control Agent which determines that an Application Node is no longer working.

4.8 FA Work and Log Files

General

The functions of the FA Agents are documented in various files. These files must not be changed manually, as this can impair error-free operation of the FA Agents or result in erroneous reactions. These files are created dynamically during ongoing operation. Deleting these files leads to a status in which the Autonomous Agents reorganize themselves, and from this point they reevaluate the situation from the current viewpoint without any previous knowledge.

Overview, Principal Directories, Files

Base directory: /opt/myamc/

Subdirectories                   Content
./scripts                        Scripts for various tasks
./scripts/sap                    Link to the FlexFrame scripts
./scripts/acc                    Scripts for the SAP ACC interface
./scripts/PowerMng               Scripts for the power management of blades
./scripts/ShutDown_Node          Scripts to shut down a node
./scripts/fa_list_services.sh    Script to list all FlexFrame service states
./scripts/allnodes               Script to execute a command on all nodes of a pool
./config                         General configuration data
./config/FA_WebGui.conf          General settings for the WebGUI (directories, cycle times, database settings)

Subdirectories                                   Content
./config/amc-users.xml                           User management

./fa_appagent                                    Installation path of myamc.FA_AppAgent and of diverse scripts
./fa_appagent/myamc.fa_appagent                  Start/stop script AppAgent
./fa_appagent/pgtool_pool.sh                     Determination of pool membership
./fa_appagent/pgtool_version.sh                  Determination of pool version
./fa_appagent/pvget.sh                           Determination of the SAPS number of a node
./fa_AppAgent/BBTool.sh                          BlackBoard control
./FA_AppAgent/BBT_dialog.sh                      BlackBoard dialog mode control
./FA_AppAgent/bin_Solaris_<VxxKxx>               Binaries and libraries for Solaris and Linux for each version
./FA_AppAgent/bin_Linux_<VxxKxx>
./FA_AppAgent/bin_Linux_SLES9_<VxxKyy>
./FA_AppAgent/lib_Solaris_<VxxKxx>
./FA_AppAgent/lib_Linux_<VxxKxx>
./FA_AppAgent/lib_Linux_SLES9_<VxxKyy>
./FA_AppAgent/config                             myamc.fa_appagent-specific configuration data
./FA_AppAgent/log                                empty

./fa_ctrlagent                                   Installation path of myamc.FA_CtrlAgent and of scripts
./fa_ctrlagent/myamc.fa_ctrlagent                Start/stop script CtrlAgent
./fa_ctrlagent/pgtool_pool.sh                    Determination of pool membership
./fa_ctrlagent/pgtool_version.sh                 Determination of pool version
./fa_ctrlagent/pvget.sh                          Determination of the SAPS number of a node
./fa_ctrlagent/bbtool.sh                         BlackBoard control
./fa_ctrlagent/bbt_dialog.sh                     BlackBoard dialog mode control

Subdirectories                                                          Content
./FA_CtrlAgent/bin_Solaris_<VxxKxx>                                     Binaries and libraries for Solaris and Linux for each version
./FA_CtrlAgent/bin_Linux_<VxxKxx>
./FA_CtrlAgent/bin_Linux_SLES9_<VxxKyy>
./FA_CtrlAgent/lib_Solaris_<VxxKxx>
./FA_CtrlAgent/lib_Linux_<VxxKxx>
./FA_CtrlAgent/lib_Linux_SLES9_<VxxKyy>
./FA_CtrlAgent/config                                                   myamc.fa_ctrlagent-specific configuration data
./FA_CtrlAgent/log                                                      empty

./vff                                                                   Pool-specific (vff) data
./vff/log                                                               Pool-specific log files
./vff/log/myamc_fa_pools.xml                                            Pools configuration file and its default version (this is used as LDAP cache)
./vff/log/myamc_fa_pools-default.xml
./vff/common/.vff_template.<vxxkxx>                                     Template of pool-specific data for each version
./vff/common/.vff_template.<vxxkxx>/config                              Configuration
./vff/common/.vff_template.<vxxkxx>/config/traptargets.xml              Trap targets
./vff/common/.vff_template.<vxxkxx>/config/myamc_fa.xml                 myamc.fa configuration and default
./vff/common/.vff_template.<vxxkxx>/config/myamc_fa-default.xml
./vff/common/.vff_template.<vxxkxx>/config/myamc_fa_acc.xml             myamc.fa ACC configuration and default
./vff/common/.vff_template.<vxxkxx>/config/myamc_fa_acc-default.xml
./vff/common/.vff_template.<vxxkxx>/config/myamc_fa_rules.xml           myamc.fa service rules and delivery status (default)
./vff/common/.vff_template.<vxxkxx>/config/myamc_fa_Rules-default.xml

Subdirectories                                                          Content
./vFF/Common/.vFF_template.<VxxKxx>/config/myamc_fa_gui.xml             myamc.fa GUI configuration and default
./vff/common/.vff_template.<vxxkxx>/config/myamc_fa_gui-default.xml
./vff/common/.vff_template.<vxxkxx>/config/myamc_fa_groups.xml          myamc.fa groups configuration and default. This is used as LDAP cache
./vff/common/.vff_template.<vxxkxx>/config/myamc_fa_groups-default.xml
./vff/common/.vff_template.<vxxkxx>/config/myamc_fa_sd_sec.xml          myamc.fa shutdown security configuration and default
./vff/common/.vff_template.<vxxkxx>/config/myamc_fa_sd_sec-default.xml
./vff/common/.vff_template.<vxxkxx>/log                                 Log files of myamc.fa_appagent and myamc.fa_ctrlagent for each pool
./vff/common/.vff_template.<vxxkxx>/log/AppAgt
./vFF/Common/.vFF_template.<VxxKxx>/log/CtlrAgt
./vFF/Common/.vFF_template.<VxxKxx>/data                                Work files
./vff/common/.vff_template.<vxxkxx>/data/fa
./vff/common/.vff_template.<vxxkxx>/data/fa/livelist                    Live list: livelist.log
./vff/common/.vff_template.<vxxkxx>/data/fa/xmlrepository               XML repository for the web interface: livelist.xml, services_<nodename>.xml
./vff/common/.vff_template.<vxxkxx>/data/fa/servicelists                Service lists: Services_<nodename>.lst
./vff/common/.vff_template.<vxxkxx>/data/fa/servicelogs                 Service logs (history): Services_<nodename>.log
./vff/common/.vff_template.<vxxkxx>/data/fa/reboot                      Reboot files: Reboot_<nodename>.lst
./vff/common/.vff_template.<vxxkxx>/data/fa/switchover                  SwitchOver files: SwitchOver_<nodename>.lst
./vff/common/.vff_template.<vxxkxx>/data/fa/blackboard                  BlackBoard: blackboard.txt

Subdirectories                                                  Content
./vFF/Common/.vFF_template.<VxxKxx>/data/fa/performance         Measured performance data

./vff/vff_cust_1                                                Pool-specific data for pool Cust_1 (example). See above for the description of the subdirectories and files.
./vff/vff_cust_1/config
./vff/vff_cust_1/log/
./vff/vff_cust_1/data
./vff/vff_cust_1/data/fa/.

./vff/vff_cust_2                                                Pool-specific data for pool Cust_2 (example). See above for the description of the subdirectories and files.
./vff/vff_cust_2/config
./vff/vff_cust_2/log/
./vff/vff_cust_2/data
./vff/vff_cust_2/data/fa/.

Collecting Diagnostic Information for Support Assistance

If support is needed, FlexFrame Support requires specific data. This information is needed to analyze problems with FlexFrame and the Autonomous Agents.

Error description, as precise as possible
    What is the problem or error? On which nodes does it occur?

Version of the FA Agents installed
    Run rpm -qa | grep myamc on the Control Node.

Configuration, work and log files of the FA Agents
    The following script creates an archive with the desired information:

    /opt/myamc/fa_ctrlagent/save_fa_files_for_diag.sh

    This script must be invoked on the Control Node:

    cd /opt/myamc/fa_ctrlagent
    ./save_fa_files_for_diag.sh

Selected Files

The write cycle for the entries (with the exception of reboot, switchover and BlackBoard) and the storage location of the files described in the following are defined using a parameter in the configuration file myamc_fa.xml.

Livelist

Each FA_AppAgent regularly enters itself in this list. Through these entries the myamc.fa_ctrlagent recognizes whether the various myamc.fa Application Agents are available and functioning without error.

Services List

This file (testament) exists for each FA_AppAgent on a node-specific basis. In it the agent logs the services which it has detected using its detectors, plus their current status. A service-related status is logged in this file. The contents are updated after each detector cycle.

Services Log

The contents of this file are identical to those of the Services List file, with the difference that this file also contains the history. This enables status changes and reaction decisions to be detected and reconstructed.

Reboot

The contents of this file are identical to those of the Services List file. The file serves as information storage when a reboot takes place. It is written only for the autonomous reaction reboot and is deleted again after the reboot has been completed and the services have been started up.

Switchover

The contents of this file are identical to those of the Services List file. The file serves as information storage (testament) when a switchover takes place. It is written only for the autonomous reaction switchover and is deleted again after the services have been taken over.

XML Repository

In terms of contents the files in the XML Repository are the same as those in the Livelist and Services List. By contrast, the contents are written in XML notation and can thus be visualized directly with the associated FA WebInterface. The write cycle for the entries and the storage location of the file are defined using a parameter in the configuration file myamc_fa.xml.

BlackBoard

The BlackBoard is an input interface for the FA Agents. Commands can be entered here which are executed by the FA_AppAgents. The commands have a specific validity period and are secured against manipulation. The file is written manually using a tool which guarantees, among other things, protection against manipulation.

4.9 Migration of FA Agent Versions on Pool Level

The FlexFrame Autonomy Agents offer a range of strategies and functions for installing and activating patches and new release versions in a variety of security, test and release scenarios. The administrator can use the update and activation functionality provided by the agents in line with his/her requirements. The following basic functions are available:

A  Reading and observing update, patch and release notes
B  Installation of a new FA Agent version parallel to an operating FA Agent version. All data and configuration information for the operating FA Agent version are retained.
C  Taking over of the configuration data for the new FA Agent version using the FA migration tool
D  Pool-by-pool configuration/parameterization and activation of the new FA Agent version
E  Testing of a new FA Agent version, e.g. in a separate test pool, if required by deactivating the autonomous reactions for test operation

The following activities are required to install or update a FlexFrame Autonomy Agent patch or a newer release version.

1. Reading of the update, patch and release notes and observation of any required modifications and special features, in particular in the event of simultaneous updating of FlexFrame and operating system versions and patches.

2. Installation of a patch or a new release version (FA CtrlAgent and FA AppAgent).

3. Parameterization/configuration, possibly using the FA migration tool

   Copying the parameters from the active agent version to the migration configuration directory using the FA migration tool. Normally the following call is sufficient to do this:

       /opt/myamc/fa_ctrlagent/mgrtool.sh --target-release=<release> --migrate-pool=<pool> --backup

   The following call migrates the configuration of the pool pool1 to version V20K23 for the associated FA Agents:

       /opt/myamc/fa_ctrlagent/mgrtool.sh --target-release=V20K23 --migrate-pool=pool1 --backup
       Starting pool migration
       pool: pool1
       source release: V20K22
       target release: V20K23
       migration dir: /opt/myamc//vff/vff_pool1/migration.v20k
       migration succeeded
       see file /opt/myamc/vff//vff_pool1/migration.v20k / MIGRATION-INSTRUCTIONS.txt for details and installation instructions.

   The modified files are written into a subdirectory; no current files are modified. In addition, the migration directory contains a backup of all current files. All new/modified parameters are listed in the MIGRATION-INSTRUCTIONS.txt file.

   Testing, parameterizing and configuring the parameters taken over in the TrapTargets.xml, myamc_fa_acc.xml, myamc_fa_groups.xml, myamc_fa_rules.xml, myamc_fa.xml, myamc_fa_gui.xml and myamc_fa_sd_sec.xml files.

   Testing any new parameters and, if necessary, modifying the default values entered.

   Parameterizing and configuring FlexFrame/operating system version dependencies if the FlexFrame basis is updated at the same time.

   Checking the modifications made by the migration tool against the MIGRATION-INSTRUCTIONS.txt file in the result directory of the migration.

4. Pool-specific deactivation of the active FA Agent

   4.1. Stopping the FA CtrlAgent for the pool whose agents are to be updated.

       /etc/init.d/myamc.fa_ctrlagent stop <pool>

   4.2. Stopping the FA AppAgents on all nodes of the pool whose agents are to be updated.

       /etc/init.d/myamc.fa_appagent stop

5. Pool-specific activation of the new FA Agents

   5.1. Modifying the active agent version in the .info file in the associated pool directory. To do this the pool.release.current entry must be adapted accordingly. The syntax is VxxKyy. This syntax is mandatory.

       control1:/ # cat /opt/myamc/vff/vff_pool1/.info
       # Version V20K17
       pool.release.base=V20K17
       pool.release.current=V20K23

   This pool will use the FA Agents of version V20K23. Alternatively this file can be transferred from the migrated configuration to the pool directory:

       cd /opt/myamc/vff/vff_pool1
       cp ./migration.<version>.<date>/.info .

   5.2. Transferring the migrated configuration to the configuration directory.

       cd /opt/myamc/vff/vff_pool1
       cp ./migration.<version>.<date>/config/* ./config

   5.3. Starting the FA AppAgents on all Application Nodes of the updated pool.

       /etc/init.d/myamc.fa_appagent start

   5.4. Starting the FA CtrlAgent.

       /etc/init.d/myamc.fa_ctrlagent start pool1

6. Checking the new active FA Agent version

   6.1. Checking the agent processes. Output for the example of the pool pool1:

       control1:/ # /etc/init.d/myamc.fa_ctrlagent status pool1
       Status of myamc.fa_ctrlagent ( myamc_fa_ctrlagent ) in vff='pool1' at host 'control1'...
       root :41 pts/5 00:00:00 ./myAMC_FA_CtrlAgent vff=pool1 - lf=/opt/myamc/vff/vff_pool1/log/ctrlagt/
       root :41 pts/5 00:00:00 ./myAMC_FA_CtrlAgent vff=pool1 - lf=/opt/myamc/vff/vff_pool1/log/ctrlagt/
       root :42 pts/5 00:00:00 ./myAMC_FA_CtrlAgent vff=pool1 - lf=/opt/myamc/vff/vff_pool1/log/ctrlagt/
       root :42 pts/5 00:00:00 ./myAMC_FA_CtrlAgent vff=pool1 - lf=/opt/myamc/vff/vff_pool1/log/ctrlagt/
       root :42 pts/5 00:00:00 ./myAMC_FA_CtrlAgent vff=pool1 - lf=/opt/myamc/vff/vff_pool1/log/ctrlagt/
       control1:/ # echo $?

   6.2. Checking agent messages at startup.

       control1:/ # /etc/init.d/myamc.fa_ctrlagent start pool1
       Found vff='pool1'.
       Checking the files for vff='pool1'...
       Starting myamc.fa_ctrlagent ( myamc_fa_ctrlagent ) Version ( V20K23 ) in vff='pool1' at host 'control1'

   6.3. Diagnosis and checking that the data shown is correct.

   6.4. Performing FlexFrame Autonomy tests (restart, reboot, etc.).

Steps 1, 2 and 3 can take place while FlexFrame Autonomy is active. The FA Autonomy functions are unavailable only for the brief period between deactivation of the active agent version and activation of the new agent version.

Note that only version-compatible FA CtrlAgents, FA AppAgents and FlexFrame versions can be used. Compatibility of the agent versions with various FlexFrame versions results in dependencies which must be taken into account.
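Step 5 above requires the pool.release.current entry to follow the mandatory VxxKyy syntax. The following sketch shows one way to check the syntax before rewriting the entry. The helper names is_valid_release and set_pool_release are hypothetical, and the check assumes that xx and yy are exactly two digits each, as in the examples V20K17 and V20K23.

```shell
#!/bin/sh
# Sketch only: is_valid_release and set_pool_release are hypothetical helpers.

is_valid_release() {
    # Accept only the VxxKyy form with two digits each (assumption), e.g. V20K23.
    case "$1" in
        V[0-9][0-9]K[0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

set_pool_release() {
    info_file=$1    # e.g. /opt/myamc/vff/vff_pool1/.info
    release=$2      # e.g. V20K23

    is_valid_release "$release" || { echo "invalid release: $release" >&2; return 1; }
    [ -f "$info_file" ] || { echo "missing file: $info_file" >&2; return 1; }

    tmp_file="$info_file.tmp.$$"
    # Rewrite only the pool.release.current line; all other lines are kept.
    sed "s/^pool\.release\.current=.*/pool.release.current=$release/" \
        "$info_file" > "$tmp_file" && mv "$tmp_file" "$info_file"
}
```

For example, set_pool_release /opt/myamc/vff/vff_pool1/.info V20K23 would be run after the agents of the pool have been stopped (step 4) and before they are started again (steps 5.3 and 5.4).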

4.10 The FA Migration Tool

The FA migration tool is used to migrate configurations of a selected pool to and from a particular FA Agent version. The FA migration tool also enables you to merge configuration files.

Pool Mode

Pool mode generates a migrated configuration in the Migration.<version>_<timestamp> subdirectory, including the backup of the current files. To enable the migrated configuration to be used it must be copied into the relevant configuration directory of the pool concerned.

Required / useful parameters:

    -p/--migrate-pool=<pool>
    -r/--target-release=<release>
    -b/--backup
    [-V/--verbose] (optional)
    [-d/--pools-basedir=<dir>] (optional)
    [-c/--clean] (optional)
    [-s/--source-release=<release>] (optional)

See section for a description of the various parameters.

Example:

    MGRTool.sh --migrate-pool=<pool> --target-release=<release> --backup

File Mode

File mode merges two files which are in myamc config format. The two files are defined with the parameters merge-file and template. File mode can only be used on files which are in myamc config format (e.g. myamc_fa.xml). The myamc_pools.xml and myamc_fa_groups.xml files are not in this format. These files can therefore not be migrated using file mode, but only in pool mode.

Required / useful parameters:

    -m/--merge-file=<file>
    -t/--template=<template>
    -o/--out-file=<file>

    [-V/--verbose] (optional)
    [-c/--clean] (optional)

Example:

    MGRTool.sh --merge-file=file1.xml --template=file1-default.xml --out-file=file-out.xml

Usage of Help

The usage of the FA migration tool can be output using the following command:

    /opt/myamc/fa_ctrlagent/mgrtool.sh --help

Explanation of the Parameters of the FA migration tool

-m/--merge-file=<file>
    Merges the specified file with the template.
-t/--template=<template>
    Specifies the template to be used for the merge.
-o/--out-file=<file>
    Writes the merged results into the specified file (use '-' for standard output).
-p/--migrate-pool=<pool>
    Specifies the pool to be migrated.
-s/--source-release=<release>
    Migrates from the specified release (parameter is optional). If the parameter is not specified, the version of the pool to be converted is used.
-r/--target-release=<release>
    Migrates to the specified release.
-b/--backup
    Generates a backup of all files.
-d/--pools-basedir=<dir>
    Basic directory of the pools (default: /opt/myamc/vff).
-l/--list-releases
    Lists all available (installed) releases.
-c/--clean
    Removes unnecessary files and configuration settings.
-V/--verbose
    Detailed output during migration.
-lf/--logfile <file>
    Writes log messages to the specified log file.
-lp/--logpath <path>
    Generates the log file in the specified directory.
-h/--help
    Prints usage, as shown above.
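The two modes differ only in the parameters passed to the tool. The sketch below assembles both call forms and prints them instead of running them, so the command line can be reviewed first. The helper names migrate_pool_cmd and merge_file_cmd are hypothetical. The default tool path is the one printed in this manual and can be overridden through the MGRTOOL variable if the installation differs.

```shell
#!/bin/sh
# Sketch only: migrate_pool_cmd and merge_file_cmd are hypothetical helpers
# that print a command line; they do not execute the migration tool.

MGRTOOL=${MGRTOOL:-/opt/myamc/fa_ctrlagent/mgrtool.sh}

# Pool mode: migrate the configuration of <pool> to <release>, with backup.
migrate_pool_cmd() {
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "usage: migrate_pool_cmd <pool> <release>" >&2
        return 1
    fi
    echo "$MGRTOOL --migrate-pool=$1 --target-release=$2 --backup"
}

# File mode: merge <file> with <template> and write the result to <out-file>.
merge_file_cmd() {
    if [ -z "$1" ] || [ -z "$2" ] || [ -z "$3" ]; then
        echo "usage: merge_file_cmd <file> <template> <out-file>" >&2
        return 1
    fi
    echo "$MGRTOOL --merge-file=$1 --template=$2 --out-file=$3"
}
```

Printing the command first is a deliberate choice here: pool mode writes a new migration directory on every run, so reviewing the call before executing it avoids stray directories.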

4.11 Command Line Interface

The command line interface is used to list the states of all FlexFrame services.

Usage:

    /opt/myamc/scripts/fa_list_services.sh -h
    usage: ./fa_list_services.sh [-ivcCHh] [<pool>...]
      -i: show inactive services
      -v: verbose mode
      -c: force use of colors for state information
      -C: suppress colors for state information
      -H: suppress headers
      -h: show usage
      -?: show usage
    if no pools are specified, services of all pools will be shown.

Example:

    /opt/myamc/scripts # ./fa_list_services.sh -C p1
    Pool Group Hostname SID Type Id State
    p1 Trauben pw250a OEC db RUNNING
    p1 g1 rx600a OEP db RUNNING

4.12 Command Execution at All Nodes of a Pool

To execute a command on all nodes of a pool, use the script allnodes:

    cn2:/opt/myamc/scripts # ./allnodes <pool> <command>

Example:

    cn2:/opt/myamc/scripts # ./allnodes FCK date
    Pool: FCK
    node pw250c: Tue Jun 27 10:28:21 CEST 2006
    node pw250d: Tue Jun 27 10:28:21 CEST 2006
    node pw650a: Tue Jun 27 10:28:21 CEST 2006
    node blade4: Tue Jun 27 10:28:21 CEST 2006
    node rx300a: Tue Jun 27 10:28:21 CEST 2006
    node rx300b: Tue Jun 27 10:28:22 CEST 2006
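The output of fa_list_services.sh can be filtered with standard tools, for example to show only services that are not running. The following is a minimal sketch under the assumption that the state is always the last column of each line, as in the example output above. The function name not_running is hypothetical.

```shell
#!/bin/sh
# Sketch only: not_running is a hypothetical helper.
# It reads fa_list_services.sh output on standard input and prints every
# service line whose last column (assumed to be the state) is not RUNNING.
# Empty lines and the header line (first column "Pool") are skipped.

not_running() {
    awk 'NF > 0 && $1 != "Pool" && $NF != "RUNNING"'
}
```

A call could look like this: /opt/myamc/scripts/fa_list_services.sh -C -i p1 | not_running. The -C option keeps color codes out of the output, and -i includes inactive services; both options are taken from the usage text above.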


5 WebInterface

The FA WebInterface visualizes all nodes and services present in a FlexFrame system insofar as they are monitored by an FA_AppAgent. The display shows the status, availability and messages of the Application and Control Agents.

5.1 Installation / Configuration

Prerequisites

On the Control Node an Apache Tomcat servlet container must be installed. Currently Tomcat >= 5.0.x is supported. Additionally a Sun Java JRE or SDK >= must be installed. Prerequisites for the clients are Mozilla >= or Internet Explorer >= 6.0 and the Java plugin for Sun >=

Installation

The installation package is called myamc.fa_webgui-<x.y-z>.i386.rpm. Install the relevant package on the Control Node with

    rpm -i myamc.fa_webgui-<x.y-z>.i386.rpm

The files are installed under /opt/myamc/fawebgui.

Configuration

Web Server

Provided no paths have been changed in the FA configuration, the configuration runs out of the box. Changes to the web server require Tomcat to be restarted or reloaded.

Login IDs

Login IDs are stored in the file /opt/myamc/config/amc-users.xml. User names and passwords are entered directly in this file. No administration of rights is provided in the current version; each user has the same rights and sees all pools, groups and systems.

The file's as-supplied state is as follows:

    <?xml version="1.0"?>
    <amc:element name="userlist" id="userlist" type="role" xmlns:amc="myamc/elements/1.0">
      <amc:elementref name="admin" id="adminpwd" type="user" src="vflex/livelist.xml"/>
      <amc:elementref name="flexframe" id="flexpwd" type="user" src="vflex/livelist.xml"/>
    </amc:element>

Each user is assigned an entry in the following format:

    <amc:elementref name="flexframe" id="flexpwd" type="user" src="vflex/livelist.xml"/>

The name field contains the user name and the id field the password. The remaining fields must be taken over as they have been defined. The default values for user and password are:

    User: myamc
    PW:   FlexFrame

The configuration file is an XML file whose format must be valid, otherwise no user can log in. You can check this by opening the file in Internet Explorer, for example. If the file is displayed without an error message, it is also accepted by the WebInterface. If changes are made, the FA WebInterface must be restarted or reloaded (e.g. using the Tomcat Service Manager) or Tomcat must be restarted or reloaded.

Link to the myamc.messenger Database

The FA WebInterface enables messages of the Application and Control Agents to be displayed. These are sent from the Agents to the myamc.messenger, which filters them, automatically triggers reactions as required, and writes them into a database. To allow the messages to be displayed, the database access parameters must be specified in the configuration file /opt/myamc/config/fa_webgui.conf. If changes are made, the FA WebInterface must be restarted or reloaded (e.g. using the Tomcat Service Manager) or Tomcat must be restarted or reloaded.

The table below includes all database-specific configuration settings, their default values and a description.

Messenger database: default value and description

messengerdb.jdbc.url
    Default: jdbc:mysql://localhost:3306/messenger
    Specifies the name of the DB server (here localhost), the TCP port (3306) and the name of the database. Generally only the host name of the database server needs to be adapted here if the database is not running on the same computer as the WebInterface.

messengerdb.jdbc.username
    Default: myamc
    User name for database access. The specified user must have read permission (SELECT) and must be permitted to connect from the computer on which the WebInterface is running.

messengerdb.jdbc.password
    Default: FlexFrame
    Password for the above-mentioned user.

messengerdb.jdbc.log.file
    Default: /opt/myamc/vff/log/webgui_messengerdb.log
    File to which the error messages of the database access are written.

messengerdb.jdbc.log.append
    Default: false
    Specifies whether error messages are appended to the file after a restart (true) or whether the file is rewritten and old messages are deleted (false).

messengerdb.jdbc.log.debuglevel
    Default: 2
    Log level for writing error messages. Higher values mean more detailed error messages.

messengerdb.jdbc.drivers
    Default: com.mysql.jdbc.Driver
    Name of the Java class which implements the database driver. No modifications are required for MySQL.

Change of user and password: see the MySQL manual.

Access to the database functions only if the access rights are set accordingly. This can be done interactively, for example, using the SQL command line tool mysql. The line below permits read access to all tables of the messenger database for a user with the name messenger and the password messenger who comes from the network with the address /24. The syntax, with all possible options, is described in the MySQL manual.

    mysql> GRANT SELECT ON messenger.* TO messenger@ % IDENTIFIED BY messenger ;

LDAP Options

The graphical user interface provides structural information on pools, nodes and services even if they are not running. This information is obtained from the system-wide LDAP database. LDAP access is preconfigured for standard installations and works out of the box. In order to allow fine-grained control, the following options can be specified in the configuration file /opt/myamc/config/fa_webgui.conf. If changes are made, the FA WebInterface must be restarted or reloaded (e.g. using the Tomcat Service Manager) or Tomcat must be restarted or reloaded.

The table below includes all LDAP-specific configuration settings, their default values and a description.

LDAP database: default value and description

ldap.config.file
    Default: /etc/openldap/ldap.conf
    Specifies the name of the system-wide LDAP config file, which contains parameters for all services requiring LDAP access, e.g. automount, etc. This file is read in order to get the necessary settings.

ldap.username
    Default: <empty>
    User name for LDAP access. This parameter may contain the BindDN (equivalent of user name).

ldap.password
    Default: <empty>
    Password for LDAP access.

ldap.update-cycletime
    Default: 3600
    Specifies how often (in seconds) structural information is updated.

ldap.enabled
    Default: true
    Specifies whether LDAP access is enabled (true, false).

GUI Options

The graphical user interface provides some options which can be set using applet parameters. These permit the window size to be changed, automatic login, and the activation of the context menus. Settings are changed using applet parameters in the /opt/myamc/fawebgui/index.html file. This file has the following format (extract):

    <applet name="flexwebclientapplet" code="de.myamc.fa.gui.fclient.class" archive="flex.jar,jdom.jar" width="100%" height="100%">
    </applet>

Autologin and activation of the context menus are implemented using applet parameters, which are set as follows:

    <applet name="flexwebclientapplet" code="de.myamc.fa.gui.fclient.class" archive="flex.jar,jdom.jar" width="100%" height="100%">
      <!-- set to "yes" to enable context menues -->
      <param name="allowcontextmenues" value="yes">
      <!-- specify user and passwort for auto login -->
      <param name="user" value="admin">
      <param name="password" value="adminpwd">
    </applet>

Parameter: possible values and description

allowcontextmenues
    yes or true and no or false (no = default). Specifies whether context menus (and thus actions) are permitted.

user
    User name for Autologin

password
    Password for Autologin

Other Settings

Further settings can be made in the file /opt/myamc/config/fa_webgui.conf. If changes are made, the FA WebInterface must be restarted or reloaded (e.g. using the Tomcat Service Manager) or Tomcat must be restarted or reloaded.

BlackBoard settings: default value and description

The BlackBoard is an interface for transferring commands to the Application and Control Agents (see chapter 8).

blackboard.toolpath
    Default: /opt/myamc/fa_ctrlagent
    Path containing the BlackBoard command line tool.

blackboard.toolfile
    Default: BBTool.sh
    Name of the BlackBoard command line tool. This is normally a shell script which starts the correct program in accordance with the operating system (Solaris or Linux).

blackboard.path
    Default: /opt/myamc/vff/
    Path for the BlackBoard file (without pool name!)

blackboard.file
    Default: data/fa/blackboard/blackboard.txt
    Name of the BlackBoard file (without pool name!)

blackboard.md5key
    Default: myamc.fa_bbtool
    MD5 key which is used for saving.

Paths and file names: default value and description

path.flexframe.data
    Default: /opt/myamc/vff
    Location of the data directories (pools)

path.flexframe.config
    Default: /opt/myamc/config
    Configuration directory

file.amc-roles
    Default: amc-users.xml
    Name of the file (in the above-mentioned configuration directory) which contains the user definitions.

Reading out the FA data: default value and description

pools.update-cycletime
    Default: 3
    Specifies how often the data of all pools is to be read out (in seconds).

Logging: default value and description

log.level
    Default: 1
    Log level for the server part of the WebInterface (error=0, warning=1, info=2, debug=5). The log data is written to the log file of the servlet container (Tomcat), i.e. generally under /opt/jakarta-tomcat/logs/.

log.historysize
    Default: 300
    The messages can also be queried via the front-end. This setting determines how many messages are displayed. Older messages are overwritten.

5.2 Visualization

Starting the WebInterface / Access via Web Browser

The WebInterface is reachable whenever Apache Tomcat is running. Apache Tomcat is started by PRIMECLUSTER. The WebInterface can be reached at the following URL:

The port specified can be changed in the Tomcat configuration file server.xml.

Login

The user name and password must be entered in the login mask to permit authentication. You can use the WebInterface only with a valid combination of user name and password (for details on configuring users see section ).

Overview of Elements

The WebInterface provides a clearly structured display of all the elements in a FlexFrame system. The left-hand side shows pools, groups and nodes in a tree structure. The main panel provides an overview of the status of all instances and displays messages from the various agents. The panels at the lower edge show all the nodes and systems and their current statuses.

Pool / Group Tree

Overview

The pool and group tree shows all elements of a FlexFrame system in a hierarchical structure. Each pool has, as child elements, groups, which in turn are used as containers for systems, nodes or instances.

Status

The status of an element is indicated by its color. The colors have the following meanings:

    Red     Critical
    Yellow  Warning
    Green   Normal, everything OK.
    White   Not active or no further information
    Black   Not active or shut down

Color inheritance takes place according to the worst-case principle. Here an element displays the worst value for its own status or for the status of its child elements.

Selecting an Element

When an element is selected, the associated instances, nodes and systems are displayed in the panels on the right-hand side:

When a pool is selected, the instance view shows all instances which are running on nodes of this pool, the node panel shows all nodes belonging to the selected pool, and the system panel shows all systems which have instances on nodes of this pool.

When a group is selected, the instance view shows all instances which are running on nodes of this group, the node panel shows all nodes belonging to the selected group, and the system panel shows all systems which have instances on nodes of this group.

When an Application Node is selected, the instance view shows all instances which are running on this node, the node panel shows the selected node and the system panel shows all systems which have instances on this node.

When a system is selected, the instance view shows all instances belonging to this system, the node panel shows the nodes on which the associated instances are running, and the system panel shows the selected system.

When an instance is selected, the instance view shows the selected instance, the node panel shows the node on which the instance is running, and the system panel shows the system to which the instance belongs.

78 WebInterface Different Tree Presentations The tree view offers different presentations for displaying the various elements. The Nodes tree displays the pool, group and node elements. The nodes displayed in black are not currently running, the ones in white are running, but are not currently hosting services, i.e. they are spare nodes. The Systems tree displays the pool, system and service elements. 70 FA Agents - Installation and Administration

79 WebInterface The All tree displays all elements, from pool to service Status Display Node Panel The node panel displays all nodes on which instances belonging to the currently selected element are running. Each node is shown with its name, icon and status, the node status corresponding to the worst instance status. FA Agents - Installation and Administration 71

If you leave the mouse cursor over a node icon, a tool tip is displayed specifying the name, group and pool. In addition, all currently displayed nodes which belong to the same group are marked with a black border.

System Panel

The system panel displays all systems of the instances belonging to the currently selected element. Each system is shown with its name, icon and status, the system status corresponding to the worst instance status.

Instance Panel

The instance panel displays all instances belonging to the currently selected element in a list view containing the following fields:

    Severity  Current instance status: red = critical, yellow = warning, green = everything OK, white = no information or not active.
    Pool      Name of the pool incorporating the node on which the instance runs.
    Group     Name of the group incorporating the node on which the instance runs.
    SID       Name of the system to which the instance belongs.
    ID        Instance number.
    Name      Generic instance name.

    Type      Type of instance: database (DB), central instance (CI), application instance (APP)
    Priority  Priority of the instance (see section 3.2.1)
    Node      Node on which the instance is running
    State     Current status of the instance

The fields in the list can be sorted by left-clicking on the column header. Clicking on it again reverses the sort order, and a third click cancels the sort again. An arrow appears next to the header which specifies the current sort order and sort direction.

If sorting is to take place according to several columns, you click on the first column (twice if required, depending on whether sorting is to take place in ascending or descending order). You select all other columns with CTRL + mouse click. The lower the sort priority of a column becomes, the smaller the arrows next to the column header become.

In order, for example, to sort by pool and within a pool by group, you first click on Pool, hold down the CTRL key, and then click on Group.

Assigning States to Colors

As described above, color inheritance takes place according to the worst-case principle. Here an element displays the worst value for its own status or for the status of its child elements. The states are assigned to the color display as follows:

    white   inactive or no further information
    green   normal, everything ok
    yellow  warning
    red     critical
    black   critical

Node states: RUNNING SWITCH_INT SWITCH_EXT PowerOff DOWN

Service states: SHUTDOWN DOWN WATCH NOWATCH UNKNOWN NULL RUNNING STOPPING STARTING REBOOT RESTART RESTARTING REBOOT REBOOTING RBGET SWITCH SWITCHOVER SWGET ERROR

The keywords for the states are accepted in either upper or lower case.
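The worst-case principle amounts to taking the maximum over a severity ranking of an element's own color and the colors of its children. The sketch below illustrates this for the four severity colors of the instance panel. The ranking red > yellow > green > white is an assumption made for illustration; the manual does not state the numeric order, and black is left out because its rank relative to the other colors is not given. The function names color_rank and worst_color are hypothetical.

```shell
#!/bin/sh
# Sketch only: color_rank and worst_color are hypothetical helpers.
# Assumed ranking (not taken from the manual): red > yellow > green > white.

color_rank() {
    case "$1" in
        red)    echo 3 ;;
        yellow) echo 2 ;;
        green)  echo 1 ;;
        *)      echo 0 ;;   # white and any unknown value: lowest rank (assumption)
    esac
}

# Prints the worst color among all arguments (own status plus child statuses).
worst_color() {
    worst=white
    for color in "$@"; do
        if [ "$(color_rank "$color")" -gt "$(color_rank "$worst")" ]; then
            worst=$color
        fi
    done
    echo "$worst"
}
```

Under this assumed ranking, a group containing one node with a warning and several nodes in normal state would be displayed in yellow: worst_color green yellow green.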

Message Display

The message display provides an overview of all messages which come from the Application and Control Agents of the FlexFrame system. The messages are prefiltered in accordance with the element currently selected in the tree. If the highest element with the name FlexFrame is selected, all messages are displayed. If a specific node is selected, only messages of this node are displayed.

Fields

The following fields are available in the message panel:

    MsgId     Unique message ID
    Date      Message timestamp
    Severity  Message severity: red = critical, yellow = warning, green = everything OK, white = no information or not active
    Pool      Pool name
    Group     Group name
    Host      Node name

    System        System name
    InstNum       Instance number
    Category      Category
    ShortMessage  Abbreviated message
    LongMessage   Detailed message
    Type          Type of service affected (DB, APP, CI)
    SubType       Subtype of the service affected
    State         Current status
    Vname         Virtual (generic) service name
    ServiceId     Service ID (generally the same as the instance number)
    Sender        Application which sent the message
    Info 6-9      Fields for additional information

Navigation

The navigation bar enables you to navigate through all messages:

The number in parentheses in the title indicates the number of messages found within the selected time range. The example above shows that there are 12 messages in the specified time range.

auto refresh automatically looks for new messages during each update cycle.

The buttons and can be used to scroll within the messages. Each time one of the buttons is selected, the time range is adjusted by the time span specified below. The button fast-forwards toward the most recent messages. The button updates the messages in the current time range.

The fields From and to specify the currently selected time range. They can also be changed by manually entering a valid date. Start and end time are synchronized, i.e. the time range will always be the one specified below.

The buttons and show or hide additional options:

The time span specifies the amount of time by which the displayed time range is moved into the past or future when the buttons and or next to the start and end date are used.

A manual refresh is necessary after the message view has been opened for the first time, and whenever the start date, end date or time span has been changed.

Predefined viewsets can be selected using the drop-down box and their settings adjusted using the options menu. See section for details on viewsets.

Viewsets

Viewsets are named groups of message columns. The web interface ships with the viewsets FlexFrame (default) and all, which shows all available columns. Viewsets can be managed using the options button:

Set as default viewset specifies which viewset is used by default when the WebGui is started.

Save viewset as creates a copy of the current viewset.

Save viewset stores changes made to the currently selected viewset.

Select columns shows a dialog with all available columns:

The columns to be displayed for a viewset can be selected here. Apply changes the currently displayed viewset. The changed settings can be permanently stored using Save viewset from the options menu.

The columns can be reordered in the message view by dragging them to the appropriate position. The positions are remembered when the viewset is saved using Save viewset.

Sorting

The message window allows sorting of the currently displayed messages. The fields in the list can be sorted by left-clicking on the column header. Clicking on it again reverses the sort order, and a third click cancels sorting. An arrow next to the header specifies the current sort order and direction.

If sorting is to take place according to several columns, click on the first column (twice if required, depending on whether sorting is to take place in ascending or descending order). Additional columns can be selected with CTRL + mouse click. The size of the arrow next to the column header increases and decreases with sort priority.

In order, for example, to sort by group and within a group by nodes, click on Group, hold down the CTRL key, and then click on Node.

By default messages are displayed in descending chronological order, i.e. new messages are shown at the top.

Configuration of FlexFrame Autonomy with the WebInterface

The FlexFrame Autonomy Agents can be configured using the web interface. In order to use the configuration editor, the context menus must be enabled. This can be achieved by setting the option allowcontextmenues in the file /opt/myamc/fawebgui/index.html to yes (see section ).

The configuration editor can be started by right-clicking on a pool element and selecting Edit configuration from the context menu.

Each parameter offers a detailed description when the mouse pointer is positioned over the parameter name:

The values can be modified using the input fields; changes are indicated by an icon:

All parameters are validated whenever a change occurs. Validation failures are indicated by an icon, and a detailed error description is shown as a tooltip.

Parameter lists can be extended using the add button; unnecessary entries can be removed using the remove button. When a new entry is created, an input dialog asks for a name which uniquely identifies this entry. The suggested name is composed of a template name and the current date.

Some additional elements control the configuration process:

Save: Stores the modified values in the config files.
Revert: Reverts all changes.
Close: Closes the configuration dialog.

5.3 Interaction Commands

Users can carry out interactions, e.g. starting or stopping an instance, via the ACC interface or directly via the FA WebInterface. The FA WebInterface has the advantage that it also visualizes the configured pools and groups. As with the ACC interface, execution takes place via the ACC service.

The WebInterface provides the option of actively intervening in the control of the FlexFrame system. It is thus possible, for example, to reboot a node, to restart a service, or to relocate all of one node's services to another node. The actions available for an element are offered in a context menu, which is opened by right-clicking on the required element.

Activating the Context Menus

Context menus (and thus actions) are not active in the WebInterface in the as-supplied status. To activate them, the applet parameter allowcontextmenues must be set to yes (see section ).

The use and activation of the WebInterface for direct transfer and execution of commands via the BlackBoard without using the ACC component is not included in the FlexFrame license.

Confirming a Command

Actions are forwarded to the ACC service or BlackBoard in the form of commands (see chapter 8). To prevent commands being issued inadvertently, each action must be confirmed individually. All the parameters are then displayed again in a confirmation dialog. Execute executes the command, Cancel aborts it.

The checkboxes Ignore power value and get log are only relevant for actions using SAP ACC:

Ignore power value
Each service has a certain CPU power requirement, which must be satisfied by the target node. If this option is checked, this value is ignored.

get log
Specifies whether the results of the command are returned.

For some commands, you can switch to editing mode using Modify in order to change individual parameters. To execute the modified command, click on OK and then Execute.

Pools

The following commands have been defined for a pool element:

Groups

The following commands have been defined for a group element:

Nodes

The following commands have been defined for a node element:

Switchover using drag and drop: When you drag a node with the left mouse button pressed and drop it on another node, a menu opens which offers the option of moving all the services of the first node to the second one.

System

The following commands have been defined for a system element:

Instance

The following commands have been defined for an instance element:

Updates

The WebInterface regularly updates all the displayed elements, states and messages.

Update Interval

The interval between two updates can be set using the context menu of the button, which is opened with the right mouse button. An update interval of between three seconds and two hours can be selected.

Manual Update

You can initiate a manual update outside the update cycle using the button.

Reinitialization

To reinitialize and update the entire interface, use the Reinitialize option in the context menu of the button.

Pause Mode (No Update)

To disable updating for a while (pause mode), use the Enable Update option in the context menu of the button.

5.4 Info and Help

The button enables you to access the information and help dialogs.

Message History
The Message History displays the last messages of the WebInterface.

Status
Results of commands such as Start Service etc.

5.5 FlexFrame Performance and Accounting Plug-in

The FlexFrame Autonomous Management Center contains, as optional components, plug-ins for viewing performance and accounting data. Each plug-in consists of a graphical display in which the performance or accounting values are shown as graphs. The time range displayed can be chosen freely and is limited only by the timespans of the data stored in the Repository.

The GUI uses a cache to optimize the display. The cache reduces resource consumption and shortens the response time for data requests. It is organized by page size and number of pages, so the default parameters can be altered temporarily to adjust performance to the capabilities of the front-end system and to individual demands.

Pagesize
Pagecount

These two parameters influence the size of the display cache and the quantity of data transferred from the server when the GUI requests it. The cache algorithm erases old pages only after the page count has been reached. The pages that have not been used for the longest time are then erased.

The performance plug-in shows the CPU and memory values for a chosen service, each in an individual graph. Every graph shows minimum, average and maximum values. The displayed timespan can be shifted using the two arrows at the top of the plug-in.
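The eviction behavior described for the display cache (pages are dropped only once the page count is exceeded, least recently used first) can be illustrated with a minimal sketch. This is not the plug-in's implementation; the class name PageCache, its methods and the load_page callback are hypothetical names chosen for illustration.

```python
from collections import OrderedDict


class PageCache:
    """Illustrative page cache: holds at most page_count pages of page_size values."""

    def __init__(self, page_size, page_count):
        self.page_size = page_size
        self.page_count = page_count
        self._pages = OrderedDict()  # page number -> page data, oldest use first

    def get(self, page_no, load_page):
        """Return a page, loading it via load_page(start, size) on a cache miss."""
        if page_no in self._pages:
            # Cache hit: mark the page as most recently used.
            self._pages.move_to_end(page_no)
            return self._pages[page_no]
        page = load_page(page_no * self.page_size, self.page_size)
        self._pages[page_no] = page
        if len(self._pages) > self.page_count:
            # Page count exceeded: erase the page unused for the longest time.
            self._pages.popitem(last=False)
        return page

    def cached_pages(self):
        """Page numbers currently held, least recently used first."""
        return list(self._pages)
```

A larger page size means fewer but bigger requests to the server, while a larger page count keeps more of the timeline available without reloading.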

FlexFrame Reporting Plug-in

The FlexFrame Reporting plug-in is used to create, export and print reports from the FlexFrame Autonomous Management Center. This plug-in is only available in the standalone variant of the management center; display within a web browser is not possible.

The plug-in shows a selection of available reports. New predefined or user-specified reports are simply filed in the report register and can be used immediately. All reports have a standardized parameter interface. A new report can easily be derived from one of the existing reports.

6 FlexFrame Autonomy Power Shutdown Concept

6.1 General

The FlexFrame Autonomy power shutdown function is designed to provide a simple, easy-to-configure method for implementing secure shutdown of various hardware platforms. For this purpose the FlexFrame Autonomy Control Agents make direct use of the Shutdown Agents from the PRIMECLUSTER shutdown facility. These are part of the high-availability solution PRIMECLUSTER, which is used on all Control Nodes of a FlexFrame system.

For external switchover of application instances, FlexFrame Autonomy requires a simple and secure way of shutting down nodes.

Normally the Shutdown Agents are provided with their configuration information in the course of the PCL configuration. However, in a FlexFrame solution only the Control Node PCLs are configured; PRIMECLUSTER holds no configuration information on any Application Nodes. The FlexFrame Autonomy Agents detect this lack of configuration information at runtime and then generate the configuration information required for the agents.

Different configuration information is generated depending on the type of system for which the power shutdown is performed. The FlexFrame Autonomy Agents consequently create an agent-specific configuration file:

Blade: Config for SA-Blade
PRIMERGY: Config for SA_IPMI
PRIMERGY: Config for SA_RSB
PRIMEPOWER: Config for SA_XSCF
PRIMEPOWER Midrange: Config for SA_RPS
PRIMEPOWER Enterprise: Config for SA_SCON

To call the Power Shutdown Agents, a user and a password must be supplied. The user and password are stored in a template file on a pool-specific basis. The FlexFrame Autonomy Control Agents use these entries to automatically generate the actual configuration files which are used by the Shutdown Agents.

6.2 Power Shutdown Architecture

This section gives an overview of the components involved and how they interact in a FlexFrame environment.

The PRIMECLUSTER software runs on the two Control Nodes in a FlexFrame solution. The FlexFrame Autonomy Control Agent runs on the active Control Node. The FlexFrame Autonomy Application Agents provide the Control Agent with the computer type and further information required for generic creation of the configuration files, insofar as this is technically possible and the information is unambiguous and can be ascertained reliably. Information which cannot be ascertained generically must be entered manually in the Powershutdown section of the myamc_fa.xml file.

6.3 Basics

Power Shutdown for Blade Systems

Power shutdown for the Blade systems is implemented via the SA_blade Agents of the PCL-SF facility. The IP addresses of the Management Blade must be configured.

Power Shutdown for PRIMERGY Systems

By default, power shutdown for the PRIMERGY systems in FlexFrame takes place via the IPMI interface or, alternatively, via an RSB board.

Power Shutdown for PRIMEPOWER Systems

The PRIMEPOWER systems are divided into three groups in accordance with the power shutdown function available:

PRIMEPOWER Midrange Systems with XSCF (e.g. 250, 450)
Each PRIMEPOWER system which is to be addressed via SA_xscf Agents must be configured in a configuration file. The user name and the password are taken from a pool-specific password file.

PRIMEPOWER Midrange Systems without XSCF (e.g. 650, 850)
In the PRIMEPOWER Midrange Systems without XSCF, power shutdown is controlled via the SA_rps Agent.

PRIMEPOWER Enterprise Systems with SCON
In the PRIMEPOWER Enterprise Systems, power shutdown is performed via the SA_scon Agent. The shutdown parameters are configured manually in the Powershutdown section of the FA configuration.

Each power shutdown technology requires an individual configuration file from which the associated Shutdown Agent obtains the specific information which enables it to perform the shutdown.

PRIMEPOWER Midrange with 2 RPS Boards

In the PRIMEPOWER Midrange systems without XSCF, power shutdown is controlled via the SA_rps Agent. The PRIMEPOWER 650 or 850, however, can be equipped with 2 RPS boxes. The configuration data required for this is generated automatically by the FA CtrlAgent.
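The assignment of shutdown technologies to Shutdown Agents can be summarized as a simple lookup. The sketch below is illustrative only: the dictionary keys and the function agent_for are hypothetical, and the agent names are spelled as they appear in sections 6.1 and 6.3 of this manual.

```python
# Shutdown type -> Shutdown Agent, as named in sections 6.1 and 6.3.
# The key spelling is an assumption made for this sketch.
SHUTDOWN_AGENTS = {
    "BLADE": "SA_blade",  # Blade systems, via the Management Blade
    "IPMI": "SA_IPMI",    # PRIMERGY, default
    "RSB": "SA_RSB",      # PRIMERGY, alternative
    "XSCF": "SA_xscf",    # PRIMEPOWER Midrange with XSCF
    "RPS": "SA_rps",      # PRIMEPOWER Midrange without XSCF
    "SCON": "SA_scon",    # PRIMEPOWER Enterprise
}


def agent_for(shutdown_type):
    """Return the agent name for a shutdown type, or raise ValueError."""
    try:
        return SHUTDOWN_AGENTS[shutdown_type.upper()]
    except KeyError:
        raise ValueError(
            f"no Shutdown Agent known for type {shutdown_type!r}"
        ) from None
```

An unknown type corresponds to the case in which the information cannot be determined generically and has to be entered manually.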

6.4 Power Shutdown Configuration

The following sections describe the configuration of the power shutdown. This can also be done in the myamc Webinterface. For further information on configuring with the myamc Webinterface, see section .

Switchover Control Parameters

When a node fails, the normal procedure requires this node to be shut down in order to avoid disturbances caused by duplicate services. If the node cannot be shut down, the switchover is aborted with an error message. The services concerned are not restarted.

The parameter IgnoreShutdownFailure changes this behavior: if a shutdown error occurs, for example because no shutdown parameters were configured for a node, the relevant node is separated from the network instead. If a node is separated from the network, the network must be reactivated manually. To do this, the following command must be executed on the Control Node before the defective node is restarted:

    ff_sw_ports.pl --op up --name <hostname>

If the network interfaces of the node are not reactivated, the node cannot start.

User, Password and Community

To use agent power shutdown, user, password and community must be defined in the configuration of the FA Agent. This configuration is specified in the pool-specific configuration file. The entries for user, password and community must be the same as those configured in the Application Nodes. Further information on the configuration of user, password and community in the Application Nodes can be found in the manual FlexFrame for SAP - Installation of a FlexFrame Environment, section Power-Shutdown Configuration.

The default user, password and community are configured in the file myamc_fa_sd_sec.xml in the config section Security_default. In addition, individual Application Nodes can be defined with other values for user, password and community. To permit this, a corresponding configuration section must be defined for the Application Node. Parameters that are not required should be left empty. However, user, password and community must not all be left empty. Nor is it permissible to leave the user empty while specifying a password.
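The two rules for user, password and community stated above can be expressed as a small check. This is a sketch for illustration, not part of the FA Agents; the function name validate_security and the wording of the messages are hypothetical.

```python
def validate_security(user, password, community):
    """Return a list of rule violations for one security section.

    An empty list means the combination is permitted by the two rules
    stated in the text. Empty strings stand for parameters left empty.
    """
    errors = []
    if not (user or password or community):
        errors.append("user, password and community must not all be empty")
    if password and not user:
        errors.append("a password must not be specified without a user")
    return errors
```

For example, a section that sets only the SNMP community passes both checks, whereas a section with a password but no user does not.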

The following example illustrates configuration of the Application Node blade2. Here ABCDE is defined as the user, secret as the password and private as the SNMP community. The name of the configuration section (Security_1) is arbitrary. The only condition is that this configuration section must be contained within the Security configuration section.

    <!-- first entry -->
    <configsection name="security_1">
      <configentry name="sec_hostname">
        <value type="string">blade2</value>
      </configentry>
      <configentry name="sec_user">
        <value type="string">abcde</value>
      </configentry>
      <configentry name="sec_passwort">
        <value type="string">secret</value>
      </configentry>
      <configentry name="sec_community">
        <value type="string">private</value>
      </configentry>
    </configsection>

Management Blades

As the Management Blades cannot be detected automatically by the FA CtrlAgent, they must be configured. This is done in the Managementblades configuration section of the myamc_fa.xml configuration file.

The example below illustrates the configuration of the Management Blade BX300-control. The name of the configuration section (Mgmt_Blade_1) is arbitrary. The only condition is that this configuration section must be contained within the Managementblades configuration section. Further Management Blades can be entered using additional configuration sections (see the othermgmtblade entry).

    <configsection name="managementblades">
      <!-- Here the 'Hostname's of the management-blades must be configured. -->
      <!-- first entry -->
      <configsection name="mgmt_blade_1">
        <configentry name="hostname">
          <value type="string">bx300-control</value>
        </configentry>
      </configsection>
      <!-- second entry -->
      <configsection name="mgmt_blade_2">
        <configentry name="hostname">
          <value type="string">othermgmtblade</value>
        </configentry>
      </configsection>
    </configsection>

Application Nodes

Normally the shutdown procedures for all Application Nodes are determined automatically. However, it may be necessary to configure individual systems for power management yourself. To permit this, the required values must be entered in the pool-specific configuration file. The table below shows which values must be specified for which power shutdown type.

Yes       Must be set.
No        Can be left empty. The config entry has to exist, though.
Default   Is read from config section Security_default or Default_ShutdownMode, if left empty. This default value is ignored if a value is found in the current section.

Shutdown types: Linux: BLADE, IPMI, RSB; Solaris: XSCF[2], RPS, ALOM, SCON

Server type (as of 11/2007):
    BLADE     BX300, BX600
    IPMI      RX300, RX300S2/S3, RX600S2/S3, RX800S2/S3
    RSB       RX600, RX800
    XSCF[2]   PW250, PW450, M4000, M5000
    RPS       PW650, PW850
    ALOM      T1000, T2000
    SCON      PW900, PW1500, PW

Value                  Setting
Host name              Yes
Shutdown type          No
MAC address            No
Management Blade IP    No
Hardware               No
Shutdown mode          Yes/Default

Shutdown types: Linux: BLADE, IPMI, RSB; Solaris: XSCF[2], RPS, ALOM, SCON

Value                      Setting
IP address (Control LAN)
Console                    No, Yes
Machine, Port              No, Yes
SNMP community             Yes/Default, No, Yes/Default
User                       No, Yes/Default, No
Password                   No, Yes/Default, No
                           No, No, No, No

Yes       Must be specified.
No        Can be left empty. The configentry is mandatory, however.
Default   Is read from the Security_default or Default_ShutdownMode configuration section if it is empty. If a value is entered, it overrides the default value.

Example:

    <!-- 2. entry -->
    <configsection name="rx300-01_ipmi">
      <configentry name="hostname">
        <value type="string">rx300-01</value>
      </configentry>
      <configentry name="shutdowntyp">
        <value type="string"></value>
      </configentry>
      <configentry name="macaddress">
        <value type="string"></value>
      </configentry>
      <configentry name="hardware">
        <value type="string"></value>
      </configentry>
      <configentry name="shutdownmode">
        <value type="string">cycle</value>
      </configentry>
      <configentry name="ip_address">
        <value type="string"></value>
      </configentry>
      <configentry name="slot">
        <value type="string"></value>
      </configentry>
      <configentry name="mgmtbladeip">
        <value type="string"></value>
      </configentry>
      <configentry name="console">
        <value type="string"></value>
      </configentry>
      <configentry name="machine">
        <value type="string"></value>
      </configentry>
      <configentry name="port">
        <value type="string"></value>
      </configentry>
    </configsection>

Default Shutdown Mode

A default shutdown mode is defined for the power shutdown. This can be changed by the configuration entry ShutdownMode. The possible values are cycle or leave-off. You are advised not to change the default shutdown mode, so as to guarantee that Application Nodes are shut down securely.

    <configentry name="default_shutdownmode">
      <value type="string">leave-off</value>
    </configentry>
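The rule that an empty entry falls back to the default, while an explicitly set entry overrides it, can be sketched as follows. The dictionaries stand in for parsed configuration sections; the function effective_value and the dictionary layout are assumptions made for illustration and do not reflect the agents' internal data structures.

```python
def effective_value(node_section, defaults, key):
    """Return the node-specific value if set, otherwise the default value."""
    value = node_section.get(key, "")
    return value if value else defaults.get(key, "")


# Hypothetical parsed sections: defaults (as in Security_default and
# Default_ShutdownMode) and one node-specific section.
defaults = {"shutdownmode": "leave-off", "sec_user": "abcde"}
node = {"hostname": "rx300-01", "shutdownmode": "cycle", "sec_user": ""}

mode = effective_value(node, defaults, "shutdownmode")  # node value is used
user = effective_value(node, defaults, "sec_user")      # default is used
```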


7 Parameter Reference

7.1 FA Agents

Operation of the FA Agents does not necessarily require individual parameterization. After installation, ready-to-use parameter files are available. For productive use, the values must be tested and, if necessary, adjusted to the requirements of the customer system, in particular to the start, stop, ping and restart times of the services monitored in myamc.fa. The monitoring of additional services which are not included in the standard rules is configured in the rule file myamc_fa_rules.xml.

The default parameters of the FA Agents are set such that no changes to the parameter settings are necessary when the performance and accounting option is activated.

FA Agent Configuration Files

The configuration of FlexFrame Autonomy is stored in several configuration files that can be found in /opt/myamc/vff/vff_<poolname>/config. The files are in XML format.

TrapTargets.xml       Trap targets (pool-specific). The recipients of SNMP trap messages can be defined.
myamc_fa_groups.xml   Groups (pool-specific). The group memberships are configured. It is used as LDAP cache.
myamc_fa.xml          FA Autonomy (pool-specific). Settings for autonomous reactions.
myamc_fa_acc.xml      ACC Connector (pool-specific). Settings for the SAP-ACC interface.
myamc_fa_gui.xml      FA GUI (pool-specific). Settings for the FlexFrame WebInterface.
myamc_fa_sd_sec.xml   FA Shutdown Security (pool-specific). Settings for the Power-Shutdown.
myamc_fa_rules.xml    Generic service configuration.

For most configuration files, an additional default file named <filename>-default.xml, which contains the as-supplied status, is available after installation. The myamc_fa_default.xml file is an exact copy of myamc_fa.xml in its original state.

7.2 SNMP Traps

General

The TrapTargets.xml file contains all the information needed to send SNMP traps. Two parameters are required for each target:

Host name or IP address
SNMP community

The community corresponds roughly to a password. In most cases, however, the default value public suffices. For FlexFrame, at least the two Control Nodes have to be set as trap targets.

Structure

The file is in XML format and has the following structure:

    <?xml version="1.0" encoding="iso "?>
    <configuration>
      <configsection name="snmpconnector">
        <configsection name="trapsender">
          <configsection name="traptargets">
            <!-- List of trap targets. For each host/community combination to send traps to a config section containing the two config entries Host and Community has to exist -->
            <!-- first trap target -->
            <configsection name="control1">
              <!-- host name of ip address or name of host to send SNMP trap to -->
              <configentry name="host">
                <value type="string">control1</value>
              </configentry>
              <!-- community string to use for a SNMP trap -->
              <configentry name="community">
                <value type="string">public</value>
              </configentry>
            </configsection>
          </configsection>
        </configsection>
      </configsection>
    </configuration>

The name or IP address of the node which is to receive the traps is specified as the value of the parameter Host (<configentry name="host">), the SNMP community as the value of the parameter Community (<configentry name="community">).

A trap can be sent to any number of targets. If further targets are to be defined, the enclosing ConfigSection must be copied, renamed and adapted accordingly.

Example:

    [header, see above]
    <!-- first trap target -->
    <configsection name="control1">
      <!-- host name of ip address or name of host to send SNMP trap to -->
      <configentry name="host">
        <value type="string">control1</value>
      </configentry>
      <!-- community string to use for a SNMP trap -->
      <configentry name="community">
        <value type="string">public</value>
      </configentry>
    </configsection>
    <!-- second trap target -->
    <configsection name="control2">
      <!-- host name of ip address or name of host to send SNMP trap to -->
      <configentry name="host">
        <value type="string">control2</value>
      </configentry>
      <!-- community string to use for a SNMP trap -->
      <configentry name="community">
        <value type="string">public</value>
      </configentry>
    </configsection>
    <!-- next trap target goes here -->
    [footer, see above]

Default Parameter File

    <?xml version="1.0" encoding="iso "?>
    <configuration>
      <configsection name="snmpconnector">
        <configsection name="trapsender">
          <configsection name="traptargets">
            <!-- List of trap targets. For each host/community combination to send traps to a config section containing the two config entries Host and Community has to exist -->
            <!-- first trap target -->
            <configsection name="control1">
              <!-- host name of ip address or name of host to send SNMP trap to -->
              <configentry name="host">
                <value type="string">control1</value>
              </configentry>
              <!-- community string to use for a SNMP trap -->
              <configentry name="community">
                <value type="string">public</value>
              </configentry>
            </configsection>
            <!-- second trap target -->
            <configsection name="control2">
              <!-- host name of ip address or name of host to send SNMP trap to -->
              <configentry name="host">
                <value type="string">control2</value>
              </configentry>

              <!-- community string to use for a SNMP trap -->
              <configentry name="community">
                <value type="string">public</value>
              </configentry>
            </configsection>
            <!-- next trap target goes here -->
          </configsection>
        </configsection>
      </configsection>
    </configuration>

7.3 Pooling and Grouping

Pooling

Pool creation means the flexible assignment of Application Nodes to a pool. The FlexFrame Autonomy Agents take over the pool information from the FlexFrame configuration in the LDAP. When an Application Node is restarted, a start script for the FA Agents is called. This start script determines the pool to which this Application Node belongs and starts the FA_AppAgent accordingly.

Grouping

FlexFrame offers grouping functions for flexible server farming. Grouping enables nodes and services within a pool to be assigned to different groups. A group is therefore always a part of a virtual FlexFrame pool. In FlexFrame V3.0, in contrast to FlexFrame V3.1 and higher, there is only one pool.

As of FlexFrame V3.1, FlexFrame Autonomy grouping is configured via LDAP. Individual parameterization is then no longer required for the Autonomy Agents.

Groups are configured in the myamc_fa_group.xml file in all FlexFrame installations which only want to configure grouping on the FlexFrame Autonomy level (e.g. when the FlexFrame V2.0 Agents are used in FlexFrame V3.0). An absolute group configuration can be implemented in this file, as in the LDAP configuration. Alternatively, generic parameterization is also possible. This can also be used in FlexFrame V3.1 and higher if absolute grouping via the LDAP is not to be used.

LDAP Grouping

LDAP grouping is performed in conjunction with the general configuration of the nodes in the LDAP directory. The FlexFrame Autonomy Agents take over the group information from the LDAP directory at startup, as they do the pool information. Individual group configuration in the myamc_fa_group.xml file is thus no longer required.

Manual Group Assignment

The group assignment is entered manually in the configuration file.

Example:

    <gr:group schema="default" name="gr_all">
      <gr:description></gr:description>
      <match category="${node-hostname}">
        <value>vade*</value>
        <value>yod*</value>
        <value>blade_a*</value>
        <value>blade_b*</value>
        <value>server*</value>
      </match>
    </gr:group>
    <gr:group schema="default_os" name="gr_solaris">
      <gr:description></gr:description>
      <match category="${node-hostname}">
        <value>vade*</value>
        <value>blade_sol*</value>
        <value>server*</value>
      </match>
    </gr:group>

Generic Grouping

Generic group creation is implemented on the basis of generic information which the Application Agents can obtain automatically. For generic group creation it makes sense to use the host names, the IP addresses or the operating system employed.

The group name is also created generically. For this purpose each schema is assigned a group naming rule which combines a fixed component with a variable component.

The following base values are available for creating the generic rules:

Operating system (os)
Network (network)
Number of CPUs (cpu)

The following generic group creation rules result:

Group definition by operating system, network and number of CPUs
    schema="os_network_cpu" name="autogroup_${os-typ}_${cpu-cnt}cpu_${ip-adr:netmask=/24}"

Group definition by operating system and number of CPUs
    schema="os_cpu" name="autogroup_${os-typ}_${cpu-cnt}cpu"

Group definition by operating system and network
    schema="os_network" name="autogroup_${os-typ}_${ip-adr:netmask=/24}"

Group definition by network and number of CPUs
    schema="network_cpu" name="autogroup_${cpu-cnt}cpu_${ip-adr:netmask=/24}"

Group definition by operating system
    schema="os" name="autogroup_${os-typ}"

Group definition by network
    schema="network" name="autogroup_${ip-adr:netmask=/24}"

Group definition by number of CPUs
    schema="cpu" name="autogroup_${cpu-cnt}cpu"

The parameters are combined via a group schema. The parameters for multiple group schemas can thus be stored in one FA_Group.xml file. A schema is activated in FA_Config.xml.
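How a naming rule combines its fixed and variable components can be illustrated with a short sketch. It is not the agents' implementation: the function expand_group_name and the node dictionary are hypothetical, and rendering ${ip-adr:netmask=/24} as the /24 network address is an assumption about how the masked address appears in the group name.

```python
import ipaddress


def expand_group_name(template, node):
    """Replace the placeholders of a group naming rule with node values."""
    # Assumption: the masked IP address is rendered as its /24 network address.
    network = ipaddress.ip_network(node["ip-adr"] + "/24", strict=False)
    values = {
        "${os-typ}": node["os-typ"],
        "${cpu-cnt}": str(node["cpu-cnt"]),
        "${ip-adr:netmask=/24}": str(network.network_address),
    }
    for placeholder, value in values.items():
        template = template.replace(placeholder, value)
    return template


# Hypothetical node data for illustration.
node = {"os-typ": "Linux", "cpu-cnt": 4, "ip-adr": "192.168.10.37"}
name = expand_group_name(
    "autogroup_${os-typ}_${cpu-cnt}cpu_${ip-adr:netmask=/24}", node
)
# name == "autogroup_Linux_4cpu_192.168.10.0"
```

Nodes that share the same operating system, CPU count and /24 network therefore end up with the same generated name and thus in the same group.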

Logical operation

The following logical operations are possible:

OR (through multiple value lines within a match condition)

    <match category="${node-hostname}">
      <value>vade*</value>
      <value>yod*</value>
      <value>blade_a*</value>
      <value>blade_b*</value>
      <value>server*</value>
    </match>

AND (through multiple match conditions within a group definition)

    <match category="${node-powervalue}">
      <value>1000</value>
    </match>
    <match category="${node-hostname}">
      <value>blade_a*</value>
      <value>blade_b*</value>
      <value>server*</value>
    </match>

Wildcards

The following wildcards can be used:

*   is equivalent to any number of arbitrary characters.
?   is equivalent to one arbitrary character.

Min / Max values

For numeric variables, the following syntax can be used to query value ranges:

syntax="min" (in this case a variable is used as a minimum)

    <match category="${cpu-cnt}" syntax="min">
      <value>4</value>
    </match>

syntax="max" (in this case a variable is used as a maximum)

    <match category="${ip-adr}" syntax="max">
      <value> </value>
    </match>
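The combination of OR, AND, wildcards and min/max conditions described above can be sketched in a few lines. This is an illustration under stated assumptions, not the agents' matching code: the functions match_condition and in_group and the tuple layout of a condition are hypothetical, and syntax="min" is interpreted as "the node value must be at least the configured value" (correspondingly "at most" for syntax="max"). Note that fnmatchcase also interprets bracket expressions such as [abc], which the manual does not mention.

```python
from fnmatch import fnmatchcase


def match_condition(actual, values, syntax=None):
    """One match condition: true if any of its value lines matches (OR)."""
    if syntax == "min":
        return any(float(actual) >= float(v) for v in values)
    if syntax == "max":
        return any(float(actual) <= float(v) for v in values)
    # Default: wildcard comparison with * and ?
    return any(fnmatchcase(str(actual), v) for v in values)


def in_group(node, conditions):
    """A group definition: true only if all match conditions hold (AND)."""
    return all(
        match_condition(node[category], values, syntax)
        for category, values, syntax in conditions
    )


# Hypothetical node and the AND example from the text.
node = {"node-hostname": "blade_a07", "node-powervalue": "1000", "cpu-cnt": 8}
conditions = [
    ("node-powervalue", ["1000"], None),
    ("node-hostname", ["blade_a*", "blade_b*", "server*"], None),
]
member = in_group(node, conditions)
```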

Masking with IP addresses

With IP addresses, a type of masking similar to the netmask can be used in order to select network segments:

${ip-adr:netmask=/24} (in this case only the first 24 bits of the IP address are taken into account)

    <match category="${ip-adr:netmask=/24}">
      <value> </value>
    </match>

Example:

    <!-- auto groups -->
    <!-- DO NOT EDIT!!! If you need your own group definition, please use the section before!!! -->
    <gr:group schema="os_network_cpu" name="autogroup_${os-typ}_${cpu-cnt}cpu_${ip-adr:netmask=/24}">
      <gr:description>group defined by number of CPUs, network and operating system</gr:description>
    </gr:group>
    <gr:group schema="os_cpu" name="autogroup_${os-typ}_${cpu-cnt}cpu">
      <gr:description>group defined by number of CPUs and operating system</gr:description>
    </gr:group>
    <gr:group schema="os_network" name="autogroup_${os-typ}_${ip-adr:netmask=/24}">
      <gr:description>group defined by number of CPUs and network</gr:description>
    </gr:group>
    <gr:group schema="network_cpu" name="autogroup_${cpu-cnt}cpu_${ip-adr:netmask=/24}">
      <gr:description>group defined by number of CPUs</gr:description>
    </gr:group>
    <gr:group schema="os" name="autogroup_${os-typ}">
      <gr:description>group defined by operating system</gr:description>
    </gr:group>

    <gr:group schema="network" name="autogroup_${ip-adr:netmask=/24}">
      <gr:description>group defined by network</gr:description>
    </gr:group>
    <gr:group schema="cpu" name="autogroup_${cpu-cnt}cpu">
      <gr:description>group defined by number of CPUs</gr:description>
    </gr:group>

Default Parameter File

    <?xml version="1.0" encoding="iso "?>
    <defs:definitions xmlns:defs="myamc/definitions/1.0" xmlns:gr="myamc/groups/1.0" xmlns:attr="myamc/attribute/1.0">
    <gr:services>
      <!-- "service schema" is specified in myamc_fa.xml
           allowed categories:
             "system-id" ("P46", "O20",...)
             "service-type" ("db", "app", "ci", "scs", "jc", "j",...)
             "service-id" ("00",...)
             ...
      -->
      <gr:service schema="default_trivial" name="default">
        <!-- attributes for selected services -->
        <attr:attribute name="service-priority" value="1"/>
        <attr:attribute name="service-powervalue" value="2201"/>
      </gr:service>
      <gr:service schema="default" name="default">
        <match category="${service-type}">
          <value>srv_dbora</value>
        </match>
        <!-- attributes for selected services -->
        <attr:attribute name="service-priority" value="2"/>
        <attr:attribute name="service-powervalue" value="2202"/>
      </gr:service>

  <gr:service schema="default" name="default">
    <match category="${service-type}">
      <value>srv_dbsap</value>
    </match>
    <!-- attributes for selected services -->
    <attr:attribute name="service-priority" value="2"/>
    <attr:attribute name="service-powervalue" value="2202"/>
  </gr:service>
  <gr:service schema="default" name="default">
    <match category="${service-type}">
      <value>srv_scs</value>
    </match>
    <!-- attributes for selected services -->
    <attr:attribute name="service-priority" value="3"/>
    <attr:attribute name="service-powervalue" value="2203"/>
  </gr:service>
  <gr:service schema="default" name="default">
    <match category="${service-type}">
      <value>srv_ascs</value>
    </match>
    <!-- attributes for selected services -->
    <attr:attribute name="service-priority" value="3"/>
    <attr:attribute name="service-powervalue" value="2203"/>
  </gr:service>
  <gr:service schema="default" name="default">
    <match category="${service-type}">
      <value>srv_ci</value>
    </match>
    <!-- attributes for selected services -->
    <attr:attribute name="service-priority" value="4"/>
    <attr:attribute name="service-powervalue" value="2204"/>
  </gr:service>
  <gr:service schema="default" name="default">
    <match category="${service-type}">
      <value>srv_jc</value>
    </match>
    <!-- attributes for selected services -->
    <attr:attribute name="service-priority" value="5"/>
    <attr:attribute name="service-powervalue" value="2205"/>
  </gr:service>

  <gr:service schema="default" name="default">
    <match category="${service-type}">
      <value>srv_j</value>
    </match>
    <!-- attributes for selected services -->
    <attr:attribute name="service-priority" value="6"/>
    <attr:attribute name="service-powervalue" value="2206"/>
  </gr:service>
  <gr:service schema="default" name="default">
    <match category="${service-type}">
      <value>srv_app</value>
    </match>
    <!-- attributes for selected services -->
    <attr:attribute name="service-priority" value="7"/>
    <attr:attribute name="service-powervalue" value="2207"/>
  </gr:service>
  <gr:service schema="service_type" name="service_${service-type}">
    <match category="${service-type}">
      <value>srv_dbora</value>
    </match>
    <!-- attributes for selected services -->
    <attr:attribute name="service-priority" value="1"/>
    <attr:attribute name="service-powervalue" value="1111"/>
  </gr:service>
  <gr:service schema="service_type" name="service_${service-type}">
    <match category="${service-type}">
      <value>srv_dbsap</value>
    </match>
    <!-- attributes for selected services -->
    <attr:attribute name="service-priority" value="1"/>
    <attr:attribute name="service-powervalue" value="1111"/>
  </gr:service>
  <gr:service schema="service_type" name="service_${service-type}">
    <match category="${service-type}">
      <value>srv_scs</value>
    </match>
    <!-- attributes for selected services -->
    <attr:attribute name="service-priority" value="2"/>
    <attr:attribute name="service-powervalue" value="2222"/>
  </gr:service>

  <gr:service schema="service_type" name="service_${service-type}">
    <match category="${service-type}">
      <value>srv_ascs</value>
    </match>
    <!-- attributes for selected services -->
    <attr:attribute name="service-priority" value="2"/>
    <attr:attribute name="service-powervalue" value="2222"/>
  </gr:service>
  <gr:service schema="service_type" name="service_${service-type}">
    <match category="${service-type}">
      <value>srv_ci</value>
    </match>
    <!-- attributes for selected services -->
    <attr:attribute name="service-priority" value="3"/>
    <attr:attribute name="service-powervalue" value="3333"/>
  </gr:service>
  <gr:service schema="service_type" name="service_${service-type}">
    <match category="${service-type}">
      <value>srv_jc</value>
    </match>
    <!-- attributes for selected services -->
    <attr:attribute name="service-priority" value="4"/>
    <attr:attribute name="service-powervalue" value="4444"/>
  </gr:service>
  <gr:service schema="service_type" name="service_${service-type}">
    <match category="${service-type}">
      <value>srv_j</value>
    </match>
    <!-- attributes for selected services -->
    <attr:attribute name="service-priority" value="5"/>
    <attr:attribute name="service-powervalue" value="5555"/>
  </gr:service>
  <gr:service schema="service_type" name="service_${service-type}">
    <match category="${service-type}">
      <value>srv_app</value>
    </match>
    <!-- attributes for selected services -->
    <attr:attribute name="service-priority" value="6"/>
    <attr:attribute name="service-powervalue" value="6666"/>
  </gr:service>

  <gr:service schema="static" name="proddb">
    <gr:description></gr:description>
    <match category="${system-id}">
      <value>p*</value>
    </match>
    <match category="${service-type}">
      <value>db</value>
      <value>db</value>
    </match>
    <!-- attributes for selected services -->
    <attr:attribute name="service-priority" value="1"/>
    <attr:attribute name="service-powervalue" value="1000"/>
  </gr:service>
</gr:services>
<!-- ********************************************************** -->
<!-- ********************************************************** -->
<!-- ********************************************************** -->
<gr:groups>
  <!-- "group schema" is specified in myamc_fa.xml...)
       allowed categories:
         "hostname" ("vader",...)
         "ip-adr" (" ",...)
         "OS-typ" ("SunOS",...)
         "OS-version" ("5.8", "SUSE SLES-8 (i386); VERSION = 8.1",
         "OS-bits" (for future use)
         "CPU-arch" ("sun4u",...)
         "CPU-cnt" ("2") (min-, max-)
         "CPU-frequency(MHz)" ("333") (min-, max-)
         "cache-size(kb)" ("2048") (min-, max-)
         "mem-total(mb)" ("1024") (min-, max-)
         "node-hostname" ("vader",...)
         "node-product-name" ("SUNW,Ultra-5_10",...)
         "node-vendor" ("Sun_Microsystems",...)
         "node-powervalue" ("1000") (min-, max-)
  -->
  <!-- ********************************************************** -->
  <!-- ********************************************************** -->

  <!-- -->
  <gr:group schema="default" name="gr_all">
    <gr:description></gr:description>
    <match category="${node-hostname}">
      <value>vade*</value>
      <value>yod*</value>
      <value>blade_a*</value>
      <value>blade_b*</value>
      <value>server*</value>
    </match>
  </gr:group>
  <gr:group schema="default_ldap" name="ld_solaris_8">
    <gr:description></gr:description>
    <match category="${os-version}">
      <value>5.8</value>
    </match>
  </gr:group>
  <gr:group schema="default_ldap" name="ld_sles_8">
    <gr:description></gr:description>
    <match category="${os-version}">
      <value>suse SLES-8*</value>
    </match>
  </gr:group>
  <gr:group schema="default_ldap" name="ld_solaris_9">
    <gr:description></gr:description>
    <match category="${os-version}">
      <value>5.9</value>
    </match>
  </gr:group>
  <gr:group schema="default_ldap" name="ld_sles_9">
    <gr:description></gr:description>
    <match category="${os-version}">
      <value>*suse LINUX Enterprise Server 9*</value>
    </match>
  </gr:group>
  <gr:group schema="default_os" name="solaris_8">
    <gr:description></gr:description>
    <match category="${os-version}">

      <value>5.8</value>
    </match>
  </gr:group>
  <gr:group schema="default_os" name="sles_8">
    <gr:description></gr:description>
    <match category="${os-version}">
      <value>suse SLES-8*</value>
    </match>
  </gr:group>
  <gr:group schema="default_os" name="solaris_9">
    <gr:description></gr:description>
    <match category="${os-version}">
      <value>5.9</value>
    </match>
  </gr:group>
  <gr:group schema="default_os" name="sles_9">
    <gr:description></gr:description>
    <match category="${os-version}">
      <value>*suse LINUX Enterprise Server 9*</value>
    </match>
  </gr:group>
  <gr:group schema="bsp_ostyp_1" name="bsp_gr_default_${os-typ}">
    <gr:description></gr:description>
    <match category="${os-typ}">
      <value>${os-typ}</value>
    </match>
    <match category="${cpu-cnt}" syntax="min">
      <value>0</value>
    </match>
    <match category="${mem-total(mb)}" syntax="min">
      <value>20</value>
    </match>
    <match category="${ip-adr:netmask=/24}">
      <value> </value>
    </match>
    <match category="${node-powervalue}" syntax="min">
      <value>0</value>
    </match>
    <match category="${node-hostname}">
      <value>vade*</value>
      <value>blade_a*</value>

      <value>blade_b*</value>
      <value>server*</value>
    </match>
  </gr:group>
  <gr:group schema="bsp_static_1" name="bsp_mycompany_${os-typ}">
    <gr:description></gr:description>
    <match category="${os-typ}">
      <value>${os-typ}</value>
    </match>
    <match category="${cpu-cnt}" syntax="min">
      <value>4</value>
    </match>
    <match category="${mem-total(mb)}" syntax="min">
      <value>4000</value>
    </match>
    <match category="${ip-adr:netmask=/24}">
      <value> </value>
    </match>
    <match category="${node-powervalue}">
      <value>1000</value>
    </match>
    <match category="${node-hostname}">
      <value>blade_a*</value>
      <value>blade_b*</value>
      <value>server*</value>
    </match>
  </gr:group>
  <gr:group schema="bsp_static_2" name="bsp_mycompany">
    <gr:description></gr:description>
    <match category="${cpu-cnt}" syntax="max">
      <value>2</value>
    </match>
    <match category="${ip-adr:netmask=/24}">
      <value> </value>
    </match>
    <match category="${node-powervalue}">
      <value>1000</value>
    </match>
    <match category="${node-hostname}">
      <value>blade_a*</value>
      <value>blade_b*</value>
      <value>server*</value>
    </match>

  </gr:group>
  <gr:group schema="bsp_static_3" name="bsp_3">
    <gr:description></gr:description>
    <match category="${ip-adr}" syntax="min">
      <value> </value>
    </match>
    <match category="${ip-adr}" syntax="max">
      <value> </value>
    </match>
    <match category="${node-powervalue}">
      <value>1000</value>
    </match>
    <match category="${node-hostname}">
      <value>vad*</value>
      <value>blade_a*</value>
      <value>blade_b*</value>
      <value>server*</value>
    </match>
  </gr:group>
  <!-- ********************************************************** -->
  <!-- ********************************************************** -->
  <!-- auto groups -->
  <!-- DO NOT EDIT!!! If you need your own group definition, please use the section before!!! -->
  <gr:group schema="os_network_cpu" name="autogroup_${os-typ}_${cpu-cnt}cpu_${ip-adr:netmask=/24}">
    <gr:description>group defined by number of CPUs, network and operating system</gr:description>
  </gr:group>
  <gr:group schema="os_cpu" name="autogroup_${os-typ}_${cpu-cnt}cpu">
    <gr:description>group defined by number of CPUs and operating system</gr:description>
  </gr:group>
  <gr:group schema="os_network" name="autogroup_${os-typ}_${ip-adr:netmask=/24}">

    <gr:description>group defined by number of CPUs and network</gr:description>
  </gr:group>
  <gr:group schema="network_cpu" name="autogroup_${cpu-cnt}cpu_${ip-adr:netmask=/24}">
    <gr:description>group defined by number of CPUs</gr:description>
  </gr:group>
  <gr:group schema="os" name="autogroup_${os-typ}">
    <gr:description>group defined by operating system</gr:description>
  </gr:group>
  <gr:group schema="network" name="autogroup_${ip-adr:netmask=/24}">
    <gr:description>group defined by network</gr:description>
  </gr:group>
  <gr:group schema="cpu" name="autogroup_${cpu-cnt}cpu">
    <gr:description>group defined by number of CPUs</gr:description>
  </gr:group>
  <!-- ********************************************************** -->
</gr:groups>
</defs:definitions>

7.4 Service Classes

The service classes are defined and parameterized in the group configuration file of a virtual FlexFrame pool. A service class is defined by the following variables:

"system-id" ("P46", "O20",...)
"service-type" ("db", "app", "ci",...)
"service-id" ("00",...)

The attributes service-priority and service-powervalue are defined in accordance with these variables. In the future it will be possible to extend a service class with further attributes which define, for example, the operating system required by a service or the number of CPUs or the performance the service requires.

Service Priority

The highest service priority is 1. Every service is assigned this priority by default, i.e. if no service classes are defined, all services have priority 1. The higher the number, the lower the priority of a service.

Priority 0 has a special status: setting priority 0 for a service class disables the autonomous functions for the services of that class.

The service priority is evaluated for all autonomous reactions. If, for example, a service of a productive system and a service of a test system are running on the same node and the test system's service is assigned priority 5, this service is not executed because the productive system's service, which is functioning without error, has the higher priority.

Service Power Value

The service power value specifies a performance number for a service which defines the maximum performance (SAPS) required by this service. It is provided for future enhancements in the field of load distribution and load transfer. A failed service with a performance value of 50 can, for example, also be taken over by a node which still has at least 50 of its maximum performance number free.

Class Creation Rules

A service either belongs to the default class, which always exists, or it can be assigned unambiguously to another class by evaluating the aforementioned variables.

Example

<gr:services>
  <!-- "service schema" is specified in myamc_fa.xml
       allowed categories:
         "system-id" ("P46", "O20",...)
         "service-type" ("db", "app", "ci",...)
         "service-id" ("00",...)
         ...
  -->
  <gr:service schema="default" name="default">
    <match category="${service-type}">
      <value>db</value>
      <value>db</value>

    </match>
    <!-- attributes for selected services -->
    <attr:attribute name="service-priority" value="2"/>
    <attr:attribute name="service-powervalue" value="2202"/>
  </gr:service>
  <gr:service schema="default" name="default">
    <match category="${service-type}">
      <value>ci</value>
      <value>ci</value>
    </match>
    <!-- attributes for selected services -->
    <attr:attribute name="service-priority" value="3"/>
    <attr:attribute name="service-powervalue" value="2203"/>
  </gr:service>
  <gr:service schema="default" name="default">
    <match category="${service-type}">
      <value>app</value>
      <value>app</value>
    </match>
    <!-- attributes for selected services -->
    <attr:attribute name="service-priority" value="4"/>
    <attr:attribute name="service-powervalue" value="2204"/>
  </gr:service>

FlexFrame Autonomy

The base parameterization is always required. The parameterization is organized hierarchically, i.e. information can be configured identically for all services or specifically for individual services.

myamc_fa.xml

The information to be configured relates to the following components:

General parameters
Parameters for the Performance and Accounting Option
Node-related parameters
Service-related parameters
Path configurations

General Parameters

CheckCycleTime
Defines the cycle in which the detector modules supply results and the rule modules evaluate the status derived from them. The parameter value must not be less than the minimum processing time which the detector modules, rule modules and reaction modules require to process a cycle. The default value as supplied is 10 seconds. The parameter value must also always be at least 1/3 of the lifetime of the MonitorAlerts. In the FlexFrame standard installation the lifetime of the MonitorAlerts is 30 seconds.

LivelistWriterTime
Defines the interval, in seconds, at which the FA Agents must generate a Livelist.

ControlAgentTime
Specifies how often the Control Agent checks the Livelists of the Application Agents. The parameter should thus be about the same as the LivelistWriterTime.

MaxHeartbeatTime
Specifies the maximum time which may elapse between two Livelist entries of an Application Agent before the Control Agent intervenes. The MaxHeartbeatTime must therefore always be greater than the ControlAgentTime and the LivelistWriterTime. In practice a factor of 3 between LivelistWriterTime and MaxHeartbeatTime has proved practical.

MaxRebootTime
Specifies the maximum time which may elapse between two Livelist entries of an Application Agent before the Control Agent intervenes if the Application Agent's node is rebooting.

MaxFailedReachNumber
Specifies how often the Control Agent attempts to reach a node after the MaxHeartbeatTime has been exceeded before an external switchover is initiated.

MaxAgeSwitchOverFile
Specifies the maximum age (in seconds) of a SwitchOver file. If the age of a SwitchOver file exceeds this value, the file is ignored.

Node_SwitchOverTyp
Specifies the mode according to which the testaments are created: node-based or service-based.
The following keywords are currently valid for this parameter:
service
node

TakeOverStrategy
Defines how nodes apply to take over the services of defective nodes and how the winner is determined.
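
The timing relationships stated for CheckCycleTime, LivelistWriterTime, ControlAgentTime and MaxHeartbeatTime can be checked mechanically before a configuration is rolled out. The following Python sketch is only an illustration and not part of the FA Agents: the class and function names are hypothetical, and the values for LivelistWriterTime, ControlAgentTime and MaxHeartbeatTime are assumed for the example (only the CheckCycleTime of 10 seconds and the MonitorAlert lifetime of 30 seconds come from the text).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TimingParameters:
    """Hypothetical container for the timing values, all in seconds."""
    check_cycle_time: int        # CheckCycleTime
    monitor_alert_lifetime: int  # lifetime of the MonitorAlerts
    livelist_writer_time: int    # LivelistWriterTime
    control_agent_time: int      # ControlAgentTime
    max_heartbeat_time: int      # MaxHeartbeatTime


def timing_problems(p: TimingParameters) -> List[str]:
    """Return a list of violated relationships (empty if none)."""
    problems = []
    # CheckCycleTime must be at least 1/3 of the MonitorAlert lifetime.
    if 3 * p.check_cycle_time < p.monitor_alert_lifetime:
        problems.append("CheckCycleTime is below 1/3 of the MonitorAlert lifetime")
    # MaxHeartbeatTime must be greater than ControlAgentTime and LivelistWriterTime.
    if p.max_heartbeat_time <= p.control_agent_time:
        problems.append("MaxHeartbeatTime is not greater than ControlAgentTime")
    if p.max_heartbeat_time <= p.livelist_writer_time:
        problems.append("MaxHeartbeatTime is not greater than LivelistWriterTime")
    # Rule of thumb from the text: factor 3 between the two values.
    if p.max_heartbeat_time < 3 * p.livelist_writer_time:
        problems.append("MaxHeartbeatTime is below the recommended 3 x LivelistWriterTime")
    return problems


# Assumed example values; only 10 s and 30 s (first two fields) are from the manual.
consistent = TimingParameters(10, 30, 10, 10, 30)
too_tight = TimingParameters(5, 30, 10, 10, 10)

for message in timing_problems(too_tight):
    print(message)
```

The check is deliberately limited to the relationships named in the parameter descriptions; it says nothing about whether the detector, rule and reaction modules can actually complete a cycle within CheckCycleTime.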

Node_TakeOverRule
Defines which takeover rule is to be used. The following values are possible:
SpareNode
Substitution (add rule)
Displacement (replace rule)
Supplementation (supplementation)
Dynamic

DynamicTakeOverRule
Spare node rule: Prio >= 1 < 2
Add rule: Prio >= x1
Replace rule: Prio >= x2 <= x3
Substitution: Prio >0 x4 <= x

Parameters for the Performance and Accounting Option

FA Agent:

PerfdataReportCycleTime (myamc_fa.xml)
Specifies the cycle in which a performance and accounting value is created and written to the collet.

max_colletcount_performance_files (myamc_fa_appagent_spec.xml)
Number of collet generations that are stored before being rewritten.

max_filesize_performance_files (myamc_fa_appagent_spec.xml)
Maximum size of the performance files. This parameter serves to limit the size of the files.

collet_switch_start_performance_files (myamc_fa_appagent_spec.xml)
Defines the date and time when new collet files are written. Example: :00:00

collet_switch_cycle_performance_files (myamc_fa_appagent_spec.xml)
Cycle time in which new collets are written. Example: 3600

By combining collet_switch_start_performance_files and collet_switch_cycle_performance_files, it is possible to specify that, starting at 0:00, new collets are written every 3600 seconds, i.e. every hour. It is also possible to implement different cycle times.

DomainManagerCycle
Cycle in which the DomainManager collects the data collets and stores them in the database.
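
How collet_switch_start_performance_files and collet_switch_cycle_performance_files combine can be illustrated with a little date arithmetic. The Python sketch below is an assumption about the intended behavior (a new collet at the start time and then every cycle), not a description of the FA Agents' implementation; the function name and the calendar date are hypothetical, since the manual's example date is not legible.

```python
from datetime import datetime, timedelta


def next_collet_switch(start: datetime, cycle_seconds: int, now: datetime) -> datetime:
    """Return the first switch time strictly after `now`.

    Assumes switches happen at start, start + cycle, start + 2*cycle, ...
    """
    if now < start:
        return start
    elapsed = (now - start).total_seconds()
    completed_cycles = int(elapsed // cycle_seconds)
    return start + timedelta(seconds=(completed_cycles + 1) * cycle_seconds)


# Hypothetical start date; the start time 0:00 and the cycle 3600 are from the text.
start = datetime(2007, 3, 1, 0, 0, 0)
now = datetime(2007, 3, 1, 10, 15, 0)
print(next_collet_switch(start, 3600, now))
```

With a cycle of 3600 seconds the switch times fall on the full hour; other cycle values simply shift the grid.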

Node-Related Parameters

Node_MaxRebootNumber
Specifies how many consecutive reboots may be performed to restore a service. If 3 is specified, the Application Agent attempts to make the system available again with up to three reboots. Bear in mind that a reboot also counts as unsuccessful if the system could not be restored within the MaxRebootTime set. In the event of reboot problems, the MaxRebootTime parameter must therefore always be checked as well and compared with the reboot time actually needed.

Node_MaxSwitchOverNumber
Specifies how many consecutive switchovers may be performed to restore a service.

Node_SwitchOverServiceStartDelayTime
Defines how long to wait after an internal switchover before the services to be taken over are started on the node that is taking over control. This value is needed since, depending on the switches used, the virtual IP addresses cannot be released immediately owing to the devices' internal caching. This delay time must therefore be greater than the devices' caching time; otherwise the service startup on the node which is taking over control will fail.

Node_SendTrapsAllowed
Releases or blocks the sending of node traps.

Node_RebootCommand
Specifies which command is executed when the Application Agent initiates a reboot. Normally this is a shutdown with a subsequent reboot.

Node_ShutdownCommand
Specifies which command is executed when the Application Agent initiates a switchover. Normally this is a shutdown without a subsequent reboot.

Node_PowerDownCommand
Specifies which command is executed by the Control Agent before an external switchover is initiated. This ensures that the services on the node being switched over are really stopped and can be taken over by other nodes without any problem. The Control Agent waits at most for the period specified with the ShutdownFac_Shut_Exec_Timeout parameter before it continues with the switchover.

Node_CheckAvailabilityCommand
Specifies which command is executed by the Control Agent to check the availability of a node. A return value of 0 is interpreted as a positive result, every other return value as negative. The Control Agent waits at most 5 seconds. If the command has not completed by then, the test is assumed to be negative, i.e. the node is considered no longer available, which results in an external switchover.
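
The evaluation described for Node_CheckAvailabilityCommand (return value 0 is positive, anything else or a timeout is negative) can be sketched as follows. This is an illustrative Python sketch, not the Control Agent's code: the function name is hypothetical, and treating a command that cannot be started at all as a negative result is an assumption.

```python
import subprocess
import sys
from typing import List


def node_available(command: List[str], timeout_seconds: float = 5.0) -> bool:
    """Run an availability check command and interpret its result.

    Return value 0 means available; any other return value means not available.
    If the command does not finish within the timeout, the result is negative.
    """
    try:
        result = subprocess.run(
            command,
            timeout=timeout_seconds,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return False
    except OSError:
        # Assumption: a command that cannot be started counts as a negative result.
        return False
    return result.returncode == 0


# Stand-in commands so the sketch runs anywhere; a real check would be
# whatever is configured in Node_CheckAvailabilityCommand.
ok_command = [sys.executable, "-c", "import sys; sys.exit(0)"]
failing_command = [sys.executable, "-c", "import sys; sys.exit(3)"]
hanging_command = [sys.executable, "-c", "import time; time.sleep(5)"]

print(node_available(ok_command))
print(node_available(failing_command))
print(node_available(hanging_command, timeout_seconds=0.5))
```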

Node_RemoteExecutionCommand
Specifies which command the Control Agent puts ahead of a command to be executed on another node. This is used, for example, to start or stop a service remotely on an Application Node. Usually ssh is used here.

SwitchOver_ext_Unavailability_check
Specifies which unavailability check is performed before an external SwitchOver. Possible values are:
'PING_and_SSH': 'ping' and 'ssh' must fail for Poff (default)
'PING_or_SSH': 'ping' or 'ssh' must fail for Poff
'PING_only': only 'ping' must fail for Poff
'SSH_only': only 'ssh' must fail for Poff

SwitchOver_ext_Unavailability_check_PING
Specifies which PING unavailability check is performed before an external SwitchOver. Possible values are:
'1': 'ping' the normal hostname ('blade2') (default)
'2': 'ping' the server-lan hostname ('blade2-se')
'4': 'ping' the storage-lan hostname ('blade2-st')
'3': 'ping' normal and server-lan
'5': 'ping' normal and storage-lan
'6': 'ping' server-lan and storage-lan
'7': 'ping' normal and server-lan and storage-lan

Service-Related Parameters

The following parameters can be set individually for each service type or for multiple services simultaneously. This is a result of the hierarchical structure of the parameter file. The parameter file also offers an option for configuring the values for the DB, CI and APP services individually. The value of the default service is used for any value which is not set service-specifically.

Service_EnableMonitoring
Service_SendTraps
Service_MaxRestartNumber
Service_TrapSendDelayTime
Service_ReactionDelayTime
Service_MaxStartTime
Service_MaxStopTime
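
The fallback to the default service for values that are not set service-specifically can be pictured as a two-level lookup. The Python sketch below only illustrates that idea: the dictionaries, the function name and all numeric values are hypothetical and do not reflect the actual layout or defaults of the parameter file.

```python
from typing import Dict, Union

Value = Union[int, bool]

# Hypothetical values for the default service.
default_service: Dict[str, Value] = {
    "Service_EnableMonitoring": True,
    "Service_MaxRestartNumber": 3,
    "Service_ReactionDelayTime": 30,
}

# Hypothetical service-specific overrides, keyed by service type.
service_specific: Dict[str, Dict[str, Value]] = {
    "DB": {"Service_MaxRestartNumber": 1},
}


def effective_value(service_type: str, parameter: str) -> Value:
    """Return the service-specific value if set, otherwise the default service's value."""
    specific = service_specific.get(service_type, {})
    if parameter in specific:
        return specific[parameter]
    return default_service[parameter]


print(effective_value("DB", "Service_MaxRestartNumber"))   # overridden for DB
print(effective_value("CI", "Service_MaxRestartNumber"))   # falls back to default
```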

The dynamic behavior of the FA Application and Control Agents depends very much on the values in the configuration file and on the physical conditions. You must therefore check very carefully that the relationships between certain values are safe and suit the application.

Service_EnableMonitoring
Defines whether monitoring is enabled or disabled for the service type in question.

Service_SendTraps
Releases or blocks the sending of service traps.

Service_MaxRestartNumber
Defines how many attempts are made to restart a failed service. This value can be configured individually for each service type. The value is typically in the range 1 to 10. The value 0 means that no attempt is made to restart a failed service. If reboots are permitted on the node, failure of a service then leads directly to a reboot.

Service_TrapSendDelayTime
Defines the send delay time for the service traps.

Service_ReactionDelayTime
Interworks directly with CheckCycleTime. It can be set individually for each service type. This time defines how long the triggering of a reaction is delayed after a failure has been detected.

Examples:

CheckCycleTime = 10 sec; Service_ReactionDelayTime = 30 sec
In this example a failed service is detected in a cycle. However, the reaction only takes place after 30 seconds. The failure must therefore have been identified as a failure over at least three detection cycles. This prevents a detection error from resulting in an incorrect reaction.

CheckCycleTime = 10 sec; Service_ReactionDelayTime = 0 sec
In this example the required reaction takes place immediately in the cycle in which the problem was detected.

Service_MaxRestartTime
Defines the maximum time which a service type may require for a restart. If this time is exceeded, a second or nth attempt is made in accordance with Service_MaxRestartNumber. If the time selected is too short for the service to be monitored and the hardware used, i.e. the service requires longer to restart than permitted by Service_MaxRestartTime, a problem situation is triggered incorrectly.

Service_MaxStartTime
Defines how long a service may take to start up. If this time is exceeded, the Agent interprets the service as not started and initiates further reactions.

Service_MaxStopTime
Defines how long a service may take to stop. If this time is exceeded, the Agent interprets the service as not stopped and initiates appropriate reactions.

Service_PingVirtualServiceInterface
Defines whether the associated virtual FlexFrame service interface is pinged to determine the availability of a service. If it is set to 0, the virtual LAN interfaces of the client and server network are not queried. Interface availability then has no influence on the status change of a service. If this parameter is switched off, the time defined by Node_SwitchOverServiceStartDelayTime is waited before services are started in a takeover (switchover).

Parameters for the Definition of a Generic Service

By default, the FA Agents can supervise a defined quantity of services, depending on the version.
For these services, the rules for the detection and the rules for the autonomous reaction are fixed components of the FA Agents. Generic services are services which are not part of the standard scope of the FA Agents but which are also to be included in supervision and in the autonomous reaction. This section describes the parameters that are used to enable monitoring by the FA Agents as well as the autonomous reactions.

The information necessary for detecting the service is registered in the parameter file myamc_fa_rules.xml. Most of the parameters are optional, meaning they are not normally needed for monitoring. In the following, the mandatory parameters are indicated by an asterisk (*). Optional parameters that are not used are set to default values.

The parameter file for the definition of generic services is organized in two main parts: the parametering of the detection and the parametering of the reactions which are required for a service. To configure a generic service, the service must be described in the detection section. The reaction is then defined in the second section. Ideally, one of the templates in the detection block and one in the reaction block is copied in each case and then individualized with the specific parameters in the appropriate places. The detection and reaction blocks reference each other via the defined service name.

Attention: The templates are set to inactive. When a template is individualized, the parameter Active must therefore be changed to 1.

Furthermore, the following standard service parameters, which are also valid for built-in services, must be defined. If service-specific parameters are not set, the default values are used.

Service_EnableMonitoring
Service_SendTraps
Service_MaxRestartNumber
Service_TrapSendDelayTime
Service_ReactionDelayTime
Service_MaxStartTime
Service_MaxStopTime

Parametering of the Service Detection

The parameter set for the detection consists of a header which defines the service as a whole. A service itself can consist of one or more subservices. For every subservice a separate subservice parameter block must be created.

Parameters of the service header

*Name
Unique symbolic service name inside the rule file. The maximum string length is 50 characters. It does not have to be identical with the technical service name.
The service name is the reference to the reaction.

Description
Used for documentation purposes only.

*Displayname
The display name is used by the FlexFrame ControlCenter. If it is not defined, the symbolic name is used, but it should be defined. It should be as short as possible because this service name is displayed in the FlexFrame ControlCenter, where only limited space is available.

Active
Activates (1) or deactivates (0) the parameter block.

MonitorParam
Defines the required parameters that produce a state-altering event during starting or stopping of a service. The event script is called MonitorFlag and does nothing. It remains in the process list for a defined time period (normally 30 seconds). To attach the MonitorFlag to a service, the following call syntax is prescribed:

MonitorFlag <RefSrv> <State> <InstNo> <SID>

RefSrv: a reference to a service, defined under MonitorParam.
State: start, stop, restart, watch or nowatch.
InstNo: a positive instance number. This information is optional.
SID: the system id. This information is optional.

Orderprio
When there are several services, the order priority defines the order in which they are started within a system. Stopping is done in the reverse order.

PowerValue
Workload demands of a service in SAPS.

ServicePrio
Priority of the service to be defined.

*Group
Accounting group name for the performance and accounting management. A new group can be defined for a service, or its values can appear in an existing group. For example, SAP: if the backup server is to be monitored and its workload consumption is to appear in the group SAP, it should be configured to the group SAP.
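
The MonitorFlag call syntax given under MonitorParam can be validated before a start or stop script relies on it. The Python sketch below is only an illustration with hypothetical names; it assumes that the optional arguments are positional (a third argument is always InstNo, a fourth always SID) and that InstNo is written as digits such as "00", neither of which the text states explicitly. The service name "backup" in the usage line is invented.

```python
import shlex
from typing import NamedTuple, Optional

VALID_STATES = {"start", "stop", "restart", "watch", "nowatch"}


class MonitorFlagCall(NamedTuple):
    ref_srv: str
    state: str
    inst_no: Optional[str]
    sid: Optional[str]


def parse_monitor_flag(command_line: str) -> MonitorFlagCall:
    """Parse 'MonitorFlag <RefSrv> <State> [<InstNo> [<SID>]]' (positional, assumed)."""
    parts = shlex.split(command_line)
    if not 3 <= len(parts) <= 5 or parts[0] != "MonitorFlag":
        raise ValueError("expected: MonitorFlag <RefSrv> <State> [<InstNo> [<SID>]]")
    ref_srv, state = parts[1], parts[2]
    if state not in VALID_STATES:
        raise ValueError(f"invalid state: {state!r}")
    inst_no = parts[3] if len(parts) >= 4 else None
    if inst_no is not None and not inst_no.isdigit():
        # Assumption: the instance number consists of digits only.
        raise ValueError(f"invalid instance number: {inst_no!r}")
    sid = parts[4] if len(parts) == 5 else None
    return MonitorFlagCall(ref_srv, state, inst_no, sid)


print(parse_monitor_flag("MonitorFlag backup start 00 P46"))
```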

Subservice parametering

A service consists of one or more subservices. For every subservice the following parameters must be defined:

Subservice

Subservice name
Symbolic name for the subservice. This parameter is optional.

Display name
This parameter is used, for example, as the service name in failure traps or error messages.

Active
Active (1) or inactive (0). The default is active.

Subservice detector parametering

For every subservice one or more detectors can be activated. In version 3.0 of the FA Agents, only detectors of the type process exist. In future versions more detectors will be added. A detector requires the following parameters:

Detector

Detector type
Process

Active
Switches the rule on or off.

ProcessName
The process name to be detected.

CountMin
Minimum number of required processes of this type.

CountMax
Maximum number of required processes of this type.

HierachyMin
Minimum process hierarchy of the subservices.

Severity
Severity determined (warning or critical) if the affiliated subservice has a fault.

Parametering of the service reactions

The second large block to be defined is the reactions required for a service. For this, the commands for starting, stopping and restarting the services must be defined.

The reaction parameter block for a service carries the same service name as the detection block. A reaction determines which program or script is called for each start, stop or restart of the service. The first command specifies the start call, the second the stop call and the third the restart call. Each command is composed of the attributes script and parameter. script defines the program or script that is called. If parameter is omitted, no parameter is used.

Path Configuration

The path configuration is used to define the directories in which the myamc.fa components store their various work files. A FlexFrame Autonomy solution stores a range of information in various files, such as display information for the WebInterface and logging information to be used for support when required. To ensure that performance and clarity are retained even in larger configurations, we recommend that you do not modify these settings! If the suggested path configuration is changed nevertheless, make sure that clarity is still retained and that no problems arise with regard to performance and accessibility.

LiveListLogFilePath
This parameter specifies the directory in which the Livelist is stored.

LiveListXmlFilePath
This parameter specifies the directory in which the XML representation of the Livelist is stored. This file is required by the FA WebInterface. The parameter should contain the same path as ServicesXmlFilePath.

ServicesXmlFilePath
This parameter specifies the directory in which the XML representation of the services list is stored. These files are required by the FA WebInterface. The parameter should contain the same path as LiveListXmlFilePath.

ServicesListFilePath
This parameter specifies the directory in which the services list files are stored.

ServicesLogFilePath
This parameter specifies the directory in which the services log files are stored.
RebootListFilePath This parameter specifies the directory in which the reboot files are stored. These files contain a list of all services which must be restored after a reboot. SwitchOverListFilePath This parameter specifies the directory in which the switchover files are stored. These files contain a list of all services which must be restored on another node after a switchover. 134 FA Agents - Installation and Administration

PerformanceFilePath
  This parameter specifies the directory in which the performance files are stored. These files contain measured values for performance data.

SAPScriptFilePath
  This parameter specifies the directory in which the start and stop scripts for the SAP services (sapdb, sapci, sapapp, etc.) can be found. The default path (/opt/myamc/scripts/sap) is normally a symbolic link to the actual script directory.

ControlFilePath
  This parameter specifies the directory in which the control files (<service type><service id><service sid>_host) generated by the start/stop scripts are stored.

BlackboardFilePath
  This parameter specifies the directory in which the BlackBoard file can be found. Commands entered in this file are executed by the FA_AppAgents.

GroupConfigFile
  This parameter specifies the file in which the group affiliation is configured.

PrePoffHookPath
  This parameter specifies the script which is executed before a node is powered off. If IgnorePoffHookResult = true, the return code is ignored; otherwise the node is powered off only if this script returns 0.

PostPoffHookPath
  This parameter specifies the script which is executed after a node has been powered off and before a SwitchOver is performed. If IgnorePoffHookResult = true, the return code is ignored; otherwise the SwitchOver is performed only if this script returns 0.

Shutdown Configuration

The shutdown feature is described in detail in section 6.4.

IgnoreShutdownFailure
  The parameter IgnoreShutdownFailure defines whether, after a failed shutdown/powerdown, the network interfaces of the relevant node are to be deactivated. Deactivating them would ensure that all services running on this node are shut down and that the services can be switched over to another Application Node.
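The handling of the hook return code described for PrePoffHookPath and PostPoffHookPath can be sketched as follows. This is only an illustration under stated assumptions: the function names `hook_allows_continue` and `run_hook` are hypothetical, and the sketch does not show how the FA Agents actually invoke the hooks.

```python
import subprocess
from typing import Sequence


def hook_allows_continue(return_code: int,
                         ignore_poff_hook_result: bool) -> bool:
    """Decide whether the next step (power off or SwitchOver) may proceed.

    With IgnorePoffHookResult = true the return code is ignored;
    otherwise the hook must have returned 0.
    """
    return ignore_poff_hook_result or return_code == 0


def run_hook(argv: Sequence[str], ignore_poff_hook_result: bool) -> bool:
    """Run a hook script (hypothetical invocation) and apply the rule above."""
    result = subprocess.run(list(argv), check=False)
    return hook_allows_continue(result.returncode, ignore_poff_hook_result)
```

Keeping the decision in a separate pure function makes the rule easy to check without executing any script.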

Default Parameter File

<?xml version="1.0" encoding="iso "?>
<configuration>
  <configsection name="myamc.fa">

    <!-- *** timing parameters *** -->

    <!-- specifies how often (in seconds) myamc.fa checks process states.
         This value should not exceed a third of the time interval
         specified in the script monitor-alert (which is part of the
         FlexFrame installation). The default value for this parameter
         is 10 seconds (Note: 'monitor_alert' is 30). -->
    <configentry name="checkcycletime">
      <value type="unsignedinteger">10</value>
    </configentry>

    <!-- specifies how often (in seconds) myamc.fa writes live list entries
         This value should not exceed a third of the time interval
         specified by the parameter MaxHeartbeatTime. -->
    <configentry name="livelistwritertime">
      <value type="unsignedinteger">10</value>
    </configentry>

    <!-- specifies how often (in seconds) myamc.fa looks for new live list
         entries. This value should not exceed a third of the time interval
         specified by the parameter MaxHeartbeatTime. -->
    <configentry name="controlagenttime">
      <value type="unsignedinteger">10</value>
    </configentry>

    <!-- Maximum time (in seconds) between two application agent heartbeat
         messages in live list -->
    <configentry name="maxheartbeattime">
      <value type="unsignedinteger">30</value>
    </configentry>

    <!-- Maximum time (in seconds) a machine takes to reboot -->
    <configentry name="maxreboottime">
      <value type="unsignedinteger">300</value>
    </configentry>

    <!-- Maximum number of failed reach attempts -->
    <configentry name="maxfailedreachnumber">
      <value type="unsignedinteger">1</value>
    </configentry>

    <!-- specifies how often (in seconds) myamc.fa writes performance data.
         This value should be a multiple of 'CheckCycleTime'.
         The default value for this parameter is 60 seconds. -->
    <configentry name="perfdatareportcycletime">
      <value type="unsignedinteger">600</value>
    </configentry>

    <!-- specifies the max age (in seconds) of SwitchOver-File.
         If the age of a SwitchOver-file exceeds this value, the
         SwitchOver-file is ignored. -->
    <configentry name="maxageswitchoverfile">
      <value type="unsignedinteger">120</value>
    </configentry>

    <!-- *** node parameters *** -->
    <configsection name="node">

      <!-- specifies the minimal service prio for take over a
           SwitchOver-file -->
      <configentry name="node_minserviceprio">
        <value type="unsignedinteger">1</value>
      </configentry>

      <!-- specifies the 'service schema' which controls 'priority' and 'load' -->
      <!-- See file 'myamc_groups.xml' -->
      <configentry name="node_serviceschema">
        <value type="string">default</value>
      </configentry>

      <!-- specifies the 'group schema' which controls grouping. -->
      <!-- See file 'myamc_groups.xml' -->
      <configentry name="node_groupschema">
        <value type="string">default_ldap</value>
      </configentry>

      <!-- specifies the interface used for start/stop/restart SAP-services -->
      <configentry name="node_sap_interface">
        <value type="string">sapscripts</value>
        <!-- <value type="string">acc</value> -->
        <!-- <value type="string">sapscripts, ACC</value> -->
      </configentry>

      <!-- specifies whether SwitchOver is service based or node based -->
      <!-- allowed values are 'service' and 'node' -->
      <configentry name="node_switchovertyp">
        <!-- <value type="string">service</value> -->
        <value type="string">node</value>
      </configentry>

      <!-- specifies the rule to take over a SwitchOver-file -->
      <!-- allowed values are:
           = 'SpareNode':
               Only a spare node will start the services from
               the taken SwitchOver-File.
           = 'add' (Hinzufuegen, Ergaenzung):
               Additionally start the services from the taken
               SwitchOver-File.
           = 'replace' (Verschieben):
               Running services will be stopped and a SwitchOver-File
               will be written for them. Start the services from the
               taken SwitchOver-File.
           = 'substitute' (Ersetzung):
               Running services will be stopped. Start the services
               from the taken SwitchOver-File.
           = 'dynamic':
               Depending on the priorities, one of the TakeOver-rules
               'SpareNode', 'add', 'replace', 'substitute', or none
               is applied.
      -->
      <configentry name="node_takeoverrule">
        <!-- <value type="string">add</value>
             <value type="string">replace</value>
             <value type="string">substitute</value>
             <value type="string">dynamic</value> -->
        <value type="string">sparenode</value>
      </configentry>

      <!-- specifies the ranges for take over rule 'dynamic'.
           For 'Dyn_Spare_*' it is the high prio of SwitchOverfile.
           For others it is the high prio of own node. -->
      <!-- 'Spare': -->
      <configentry name="dyn_spare_min">
        <value type="unsignedinteger">1</value>
      </configentry>
      <configentry name="dyn_spare_max">
        <value type="unsignedinteger">4</value>
      </configentry>
      <!-- 'Add' (Hinzufuegen, Ergaenzen): -->
      <configentry name="dyn_add_min">
        <value type="unsignedinteger">3</value>
      </configentry>
      <configentry name="dyn_add_max">
        <value type="unsignedinteger">4</value>
      </configentry>

      <!-- 'Replace' (Verschieben): -->
      <configentry name="dyn_replace_min">
        <value type="unsignedinteger">5</value>
      </configentry>
      <configentry name="dyn_replace_max">
        <value type="unsignedinteger">6</value>
      </configentry>
      <!-- 'Substitute' (Ersetzen): -->
      <configentry name="dyn_substitute_min">
        <value type="unsignedinteger">7</value>
      </configentry>
      <configentry name="dyn_substitute_max">
        <value type="unsignedinteger">20</value>
      </configentry>

      <!-- specifies whether the min- max- range of spare node is
           exclusive for spare nodes or not. -->
      <configentry name="dyn_spare_exclusive">
        <value type="boolean">true</value>
      </configentry>

      <!-- specifies the escalation type -->
      <!-- allowed values are:
           = 'Node':
               Node-escalation: "Restart" => "Reboot" => "SwitchOver"
               The escalation will be done for all services on a node.
           = 'Service':
               Service-escalation: "Restart" => "SwitchOver" (single service)
               The escalation will be done for a single service.
      -->
      <configentry name="escalationtype">
        <!-- <value type="string">service</value> -->
        <value type="string">node</value>
      </configentry>

      <!-- specifies the TakeOver strategy -->
      <!-- allowed values are:
           = 'FirstFit':
               The first node, who applies for TakeOver, wins and gets
               the SwitchOver-File.
      -->
      <configentry name="takeoverstrategy">
        <value type="string">firstfit</value>
      </configentry>

      <!-- specifies the maximum number of reboots -->
      <configentry name="node_maxrebootnumber">
        <value type="unsignedinteger">3</value>
      </configentry>

      <!-- specifies the maximum number of switch overs -->
      <configentry name="node_maxswitchovernumber">
        <value type="unsignedinteger">3</value>
      </configentry>

      <!-- if Service_PingVirtualServiceInterface is set to true,
           this value specifies the time window, in which a node tries
           to take over a service.
           if Service_PingVirtualServiceInterface is set to false,
           the service will be started after this time interval to
           ensure, that it is actually down. -->
      <configentry name="node_switchoverservicestartdelaytime">
        <value type="unsignedinteger">120</value>
      </configentry>

      <!-- specifies whether traps are allowed -->
      <configentry name="node_sendtrapsallowed">
        <value type="boolean">true</value>
      </configentry>

      <!-- specifies the command to be executed upon reboot -->
      <configentry name="node_rebootcommand">
        <value type="string">/opt/myamc/scripts/shutdown_node/shutdown_node.sh reboot</value>
      </configentry>

      <!-- specifies the command to be executed upon shutdown -->
      <configentry name="node_shutdowncommand">
        <value type="string">/opt/myamc/scripts/shutdown_node/shutdown_node.sh shutdown</value>
      </configentry>

      <!-- specifies the command to be executed to power down a node
           before another node is allowed to take over its services.
           The variable ${node-name} can be used to specify the name of
           the node the actual command should be executed on. The actual
           command will be appended to this command -->
      <configentry name="node_powerdowncommand">
        <!-- <value type="string">/bin/true</value> -->
        <value type="string">/bin/su - root -c "/opt/myamc/scripts/powermng/poweronoff.sh ${node-name} down"</value>
      </configentry>

      <!-- specifies the command to be executed in order to determine
           whether a node is still available -->
      <configentry name="node_checkavailabilitycommand">
        <value type="string">/usr/bin/ssh ${node-name} /bin/uname -a</value>
      </configentry>

      <!-- specifies the command to be used when a command is to be
           executed on another node. The variable ${node-name} can be
           used to specify the name of the node the actual command
           should be executed on. The actual command will be appended
           to this command -->
      <configentry name="node_remoteexecutioncommand">
        <value type="string">/usr/bin/ssh root@${node-name} </value>
      </configentry>

151 Parameter Reference </configsection> <!-- *** service parameters *** --> <configsection name="services"> <configsection name="default"> <!-- specifies whether to monitor a service --> <configentry name="service_enablemonitoring"> <value type="boolean">true</value> <!-- specifies whether traps are allowed --> <configentry name="service_sendtraps"> <value type="boolean">true</value> <!-- specifies the maximum number of restarts --> <configentry name="service_maxrestartnumber"> <value type="unsignedinteger">10</value> <!-- specifies how long (in seconds) to delay a trap in case of an error --> <configentry name="service_trapsenddelaytime"> <value type="unsignedinteger">15</value> <!-- specifies how long (in seconds) to delay a reaction in case of an error This value should be at least three times as high as the time interval specified in CheckCycleTime. --> <configentry name="service_reactiondelaytime"> <value type="unsignedinteger">45</value> <!-- specifies maximum service restart time (in seconds) --> <configentry name="service_maxrestarttime"> <value type="unsignedinteger">600</value> <!-- specifies maximum service start time (in seconds) --> <configentry name="service_maxstarttime"> <value type="unsignedinteger">300</value> FA Agents - Installation and Administration 143

152 Parameter Reference <!-- specifies maximum service stop time (in seconds) --> <configentry name="service_maxstoptime"> <value type="unsignedinteger">300</value> <!-- specifies whether to ping virtual service interface in order to detect whether a service is up and running. --> <configentry name="service_pingvirtualserviceinterface"> <value type="boolean">true</value> </configsection> <!-- *** timing parameters for DB services *** --> <configsection name="srv_dbora"> <!-- specifies maximum service restart time --> <configentry name="service_maxrestarttime"> <value type="unsignedinteger">600</value> <!-- specifies maximum service start time --> <configentry name="service_maxstarttime"> <value type="unsignedinteger">300</value> <!-- specifies maximum service stop time --> <configentry name="service_maxstoptime"> <value type="unsignedinteger">300</value> </configsection> <!-- *** timing parameters for DB services *** --> <configsection name="srv_dbsap"> <!-- specifies maximum service restart time --> <configentry name="service_maxrestarttime"> <value type="unsignedinteger">600</value> <!-- specifies maximum service start time --> <configentry name="service_maxstarttime"> <value type="unsignedinteger">300</value> <!-- specifies maximum service stop time --> <configentry name="service_maxstoptime"> <value type="unsignedinteger">300</value> 144 FA Agents - Installation and Administration

153 Parameter Reference </configsection> <!-- *** timing parameters for CI services *** --> <configsection name="srv_ci"> <!-- specifies maximum service restart time --> <configentry name="service_maxrestarttime"> <value type="unsignedinteger">600</value> <!-- specifies maximum service start time --> <configentry name="service_maxstarttime"> <value type="unsignedinteger">300</value> <!-- specifies maximum service stop time --> <configentry name="service_maxstoptime"> <value type="unsignedinteger">300</value> </configsection> <!-- *** timing parameters for APP services *** --> <configsection name="srv_app"> <!-- specifies maximum service restart time --> <configentry name="service_maxrestarttime"> <value type="unsignedinteger">600</value> <!-- specifies maximum service start time --> <configentry name="service_maxstarttime"> <value type="unsignedinteger">300</value> <!-- specifies maximum service stop time --> <configentry name="service_maxstoptime"> <value type="unsignedinteger">300</value> </configsection> <!-- *** timing parameters for SCS services *** --> <configsection name="srv_scs"> <!-- specifies maximum service restart time --> <configentry name="service_maxrestarttime"> <value type="unsignedinteger">600</value> <!-- specifies maximum service start time --> <configentry name="service_maxstarttime"> <value type="unsignedinteger">300</value> FA Agents - Installation and Administration 145

154 Parameter Reference <!-- specifies maximum service stop time --> <configentry name="service_maxstoptime"> <value type="unsignedinteger">300</value> </configsection> <!-- *** timing parameters for ASCS services *** --> <configsection name="srv_ascs"> <!-- specifies maximum service restart time --> <configentry name="service_maxrestarttime"> <value type="unsignedinteger">600</value> <!-- specifies maximum service start time --> <configentry name="service_maxstarttime"> <value type="unsignedinteger">300</value> <!-- specifies maximum service stop time --> <configentry name="service_maxstoptime"> <value type="unsignedinteger">300</value> </configsection> <!-- *** timing parameters for JC services *** --> <configsection name="srv_jc"> <!-- specifies maximum service restart time --> <configentry name="service_maxrestarttime"> <value type="unsignedinteger">600</value> <!-- specifies maximum service start time --> <configentry name="service_maxstarttime"> <value type="unsignedinteger">300</value> <!-- specifies maximum service stop time --> <configentry name="service_maxstoptime"> <value type="unsignedinteger">300</value> </configsection> <!-- *** timing parameters for J services *** --> <configsection name="srv_j"> <!-- specifies maximum service restart time --> <configentry name="service_maxrestarttime"> 146 FA Agents - Installation and Administration

155 Parameter Reference <value type="unsignedinteger">600</value> <!-- specifies maximum service start time --> <configentry name="service_maxstarttime"> <value type="unsignedinteger">300</value> <!-- specifies maximum service stop time --> <configentry name="service_maxstoptime"> <value type="unsignedinteger">300</value> </configsection> <!-- *** timing parameters for livecache (LC) services *** --> <configsection name="srv_lc"> <!-- specifies maximum service restart time --> <configentry name="service_maxrestarttime"> <value type="unsignedinteger">600</value> <!-- specifies maximum service start time --> <configentry name="service_maxstarttime"> <value type="unsignedinteger">300</value> <!-- specifies maximum service stop time --> <configentry name="service_maxstoptime"> <value type="unsignedinteger">300</value> </configsection> <!-- *** timing parameters for enqueue replication (ERS) services *** --> <configsection name="srv_ers"> <!-- specifies maximum service restart time --> <configentry name="service_maxrestarttime"> <value type="unsignedinteger">600</value> <!-- specifies maximum service start time --> <configentry name="service_maxstarttime"> <value type="unsignedinteger">300</value> <!-- specifies maximum service stop time --> <configentry name="service_maxstoptime"> FA Agents - Installation and Administration 147

156 Parameter Reference <value type="unsignedinteger">300</value> </configsection> </configsection> <!-- *** path parameters *** --> <!-- specifies path to FA scripts --> <configentry name="fascriptfilepath"> <value type="string">/opt/myamc/scripts</value> <!-- specifies path to live list file --> <configentry name="livelistlogfilepath"> <value type="string">/opt/myamc/vff/vff_${vff}/data/fa/livelist</value> <!-- specifies path to live list file In order to use the FlexWeb web interface this path must be the same as specified in <install-path>/web/myamc-flexweb.conf and in ServicesXmlFilePath --> <configentry name="livelistxmlfilepath"> <value type="string">/opt/myamc/vff/vff_${vff}/data/fa/xmlrepository</value> <!-- specifies path to service xml file In order to use the FlexWeb web interface this path must be the same as specified in <install-path>/web/myamc-flexweb.conf and in LiveListXmlFilePath --> <configentry name="servicesxmlfilepath"> <value type="string">/opt/myamc/vff/vff_${vff}/data/fa/xmlrepository</value> <!-- specifies path to services list file --> <configentry name="serviceslistfilepath"> <value type="string">/opt/myamc/vff/vff_${vff}/data/fa/servicelists</value> 148 FA Agents - Installation and Administration

157 Parameter Reference <!-- specifies path to services log file --> <configentry name="serviceslogfilepath"> <value type="string">/opt/myamc/vff/vff_${vff}/data/fa/servicelogs</value> <!-- specifies path to reboot file --> <configentry name="rebootlistfilepath"> <value type="string">/opt/myamc/vff/vff_${vff}/data/fa/reboot</value> <!-- specifies path to switch over file --> <configentry name="switchoverlistfilepath"> <value type="string">/opt/myamc/vff/vff_${vff}/data/fa/switchover</value> <!-- specifies path to performance data --> <configentry name="performancefilepath"> <value type="string">/opt/myamc/vff/vff_${vff}/data/fa/performance</value> <!-- specifies path to ACC script files <install-path>/scripts/acc should either contain the acc scripts or must be a link to the correct directory. --> <configentry name="accscriptfilepath"> <value type="string">/opt/myamc/scripts/acc</value> <!-- specifies path to SAP script files <install-path>/scripts/sap should either contain the sap scripts or must be a link to the correct directory. --> <configentry name="sapscriptfilepath"> <value type="string">/opt/myamc/scripts/sap</value> <!-- specifies path to control file --> FA Agents - Installation and Administration 149

158 Parameter Reference <configentry name="controlfilepath"> <value type="string">/opt/myamc/scripts/sap/log</value> <!-- specifies path to blackboard file --> <configentry name="blackboardfilepath"> <value type="string">/opt/myamc/vff/vff_${vff}/data/fa/blackboard</value> <!-- specifies name of group config file --> <configentry name="groupconfigfile"> <value type="string">/opt/myamc/vff/vff_${vff}/config/myamc_fa_groups.xml</value> <!-- specifies name of rule config file. If the file does not exist, it will be ignored. --> <configentry name="ruleconfigfile"> <value type="string">/opt/myamc/vff/vff_${vff}/config/myamc_fa_rules.xml</value> <!-- *** misc parameters *** --> <!-- specifies the postfix of server-lan --> <configentry name="lanpostfixserver"> <value type="string">-se</value> <!-- specifies the postfix of client-lan --> <configentry name="lanpostfixclient"> <value type="string"></value> <!-- specifies the postfix of storage-lan --> <configentry name="lanpostfixstorage"> <value type="string">-st</value> <!-- specifies the postfix of control-lan --> <configentry name="lanpostfixcontrol"> <value type="string">-co</value> 150 FA Agents - Installation and Administration

159 Parameter Reference --> <!-- specifies whether or not to respect the service dependencies <configentry name="respectservicedependencies"> <value type="boolean">true</value> <configsection name="shutdown_facility"> <configsection name="executable"> <configentry name="shut_ex_blade"> <value type="string">/opt/smaw/smawsf/bin/sa_blade</value> <configentry name="shut_ex_ipmi"> <value type="string">/opt/smaw/smawsf/bin/sa_ipmi</value> <configentry name="shut_ex_ipmipower"> <value type="string">/usr/bin/ipmipower</value> <configentry name="shut_ex_rsb"> <value type="string">/opt/smaw/smawsf/bin/sa_rsb</value> <configentry name="shut_ex_xscf"> <value type="string">/opt/smaw/smawsf/bin/sa_satxscf</value> <configentry name="shut_ex_rps"> <value type="string">/opt/smaw/smawsf/bin/sa_rps</value> <configentry name="shut_ex_scon"> <value type="string">/opt/smaw/smawsf/bin/sa_scon</value> </configsection> <configsection name="configuration"> <configentry name="shut_cnf_blade"> <value type="string">/etc/opt/smaw/smawsf/sa_blade.cfg</value> FA Agents - Installation and Administration 151

160 Parameter Reference <configentry name="shut_cnf_ipmi"> <value type="string">/etc/opt/smaw/smawsf/sa_ipmi.cfg</value> <configentry name="shut_cnf_rsb"> <value type="string">/etc/opt/smaw/smawsf/sa_rsb.cfg</value> <configentry name="shut_cnf_xscf"> <value type="string">/etc/opt/smaw/smawsf/sa_satxscf.cfg</value> <configentry name="shut_cnf_rps"> <value type="string">/etc/opt/smaw/smawsf/sa_rps.cfg</value> <configentry name="shut_cnf_scon"> <value type="string">/etc/opt/smaw/smawsf/sa_scon.cfg</value> </configsection> configured. --> <configsection name="managementblades"> <!-- Here the 'Hostname's of the management-blades must be <!-- first entry --> <!-- this is in comment, because it is only an example!!! <configsection name="mgmt_blade_1"> <configentry name="hostname"> <value type="string">vader</value> </configsection> --> <!-- 2. entry --> <!-- this is in comment, because it is only an example!!! <configsection name="mgmt_blade_2"> <configentry name="hostname"> <value type="string">yoda</value> </configsection> 152 FA Agents - Installation and Administration

161 Parameter Reference --> </configsection> -> <!-- Here the default 'ShutdownMode' must be configured. - <!-- Allowed values for: 'ShutdownMode': cycle, leave-off --> <configentry name="default_shutdownmode"> <value type="string">leave-off</value> <!-- Here the file for the Poff-synchronisation may be configured. --> <configentry name="shut_file_for_poff_sync"> <value type="string">/opt/myamc/vff/log/poff_sa_agt_cfg_files.log</value> <!-- Here the values for timing may be configured. --> <!-- Shut_Cycle [ms]: Cycletime of Shutdownfunctionality. Must be > 10 Sec. Shut_SNMP_Timeout [ms]: SNMP-timeout Shut_SNMP_Tries []: SNMP-tries Shut_Exec_Timeout [ms]: Timeout for execute the SA-agents Shut_ReNew_validity_1 [ms]: renew the SA-config if hostdata are valid Shut_ReNew_validity_0 [ms]: renew the SA-config if hostdata are invalid --> <configentry name="shut_cycle"> <value type="unsignedinteger">600000</value> <configentry name="shut_snmp_timeout"> <value type="unsignedinteger">3000</value> <configentry name="shut_snmp_tries"> <value type="unsignedinteger">3</value> FA Agents - Installation and Administration 153

162 Parameter Reference <configentry name="shut_exec_timeout"> <value type="unsignedinteger">15000</value> <configentry name="shut_renew_validity_1"> <value type="unsignedinteger"> </value> <configentry name="shut_renew_validity_0"> <value type="unsignedinteger">600000</value> <configentry name="ignoreshutdownfailure"> <value type="boolean">true</value> be configured. --> RPS, SCON <configsection name="brutforceshutdown"> <!-- Here all parameters for a 'brut force shutdown' must <!-- --> Allowed values for: 'ShutdownTyp': UNKNOWN, BLADE, IPMI, RSB, XSCF, 'Hardware': UNKNOWN, LINUX, SOLARIS 'ShutdownMode': cycle, leave-off <!-- first entry --> <!-- this is in comment, because it is only an example!!! <configsection name="bfsd_blade_1"> <configentry name="hostname"> <value type="string">vader</value> <configentry name="shutdowntyp"> <value type="string">xscf</value> <configentry name="macadress"> <value type="string">00c00d0032f7</value> <configentry name="hardware"> <value type="string">solaris</value> 154 FA Agents - Installation and Administration

163 Parameter Reference type="string">cycle</value> type="string"> </value> type="string">console_1</value>!!! --> type="string">blade</value> type="string">00c00d0032f7</value> type="string">linux</value> type="string">cycle</value> <configentry name="shutdownmode"> <value <configentry name="ip_adress"> <value <configentry name="console"> <value <configentry name="machine"> <value type="string">i686</value> <configentry name="port"> <value type="string">2</value> </configsection> <!-- 2. entry --> <!-- this is in comment, because it is only an example <configsection name="bfsd_blade_2"> <configentry name="hostname"> <value type="string">yoda</value> <configentry name="shutdowntyp"> <value <configentry name="macadress"> <value <configentry name="hardware"> <value <configentry name="shutdownmode"> <value <configentry name="ip_adress"> FA Agents - Installation and Administration 155

164 Parameter Reference type="string"> </value> type="string">console_1</value> --> <value <configentry name="console"> <value <configentry name="machine"> <value type="string">i686</value> <configentry name="port"> <value type="string">2</value> </configsection> </configsection> </configsection> <configsection name="additional_checks"> additional checks. <!-- specifies how often (in seconds) myamc.fa make the Addition Checks: 'Mount', 'Filers', 'Files', 'Lock'. The default value for this parameter is 60 seconds. --> <configentry name="addcheckcycletime"> <value type="unsignedinteger">60</value> <!-- specifies the filename for 'lock'-checks checks. "<hostname>.log.lock" will be added by program. --> <configentry name="lock_file"> <value type="string">/opt/myamc/vff/vff_${vff}/log/appagt/chk_lock_</value> 156 FA Agents - Installation and Administration

165 Parameter Reference <configsection name="file_check"> <!-- For the FileCheck there must be configured 'Filename' and 'permissions'. --> <!-- 'Filename' must be full specified or accassable from 'bin_xxx'-directory. --> <!-- 'permissions': Allowed are 'F'=exists, 'R'=readable, 'W'=writable, 'X'=executable --> <!-- first entry --> <configsection name="file_1"> <configentry name="filename"> <value type="string">/opt/myamc/scripts/sap/sapdb</value> <configentry name="permissions"> <value type="string">frx</value> </configsection> <!-- next entry --> <configsection name="file_2"> <configentry name="filename"> <value type="string">/opt/myamc/scripts/sap/sapci</value> <configentry name="permissions"> <value type="string">frx</value> </configsection> <!-- next entry --> <configsection name="file_3"> <configentry name="filename"> <value type="string">/opt/myamc/scripts/sap/sapapp</value> <configentry name="permissions"> <value type="string">frx</value> </configsection> <!-- next entry --> <configsection name="file_4"> FA Agents - Installation and Administration 157

166 Parameter Reference <configentry name="filename"> <value type="string">/opt/myamc/scripts/sap/sapscs</value> <configentry name="permissions"> <value type="string">frx</value> </configsection> <!-- next entry --> <configsection name="file_sapscs"> <configentry name="filename"> <value type="string">/opt/myamc/scripts/sap/sapscs</value> <configentry name="permissions"> <value type="string">frx</value> </configsection> <!-- next entry --> <configsection name="file_5"> <configentry name="filename"> <value type="string">/opt/myamc/scripts/sap/sapjc</value> <configentry name="permissions"> <value type="string">frx</value> </configsection> <!-- next entry --> <configsection name="file_6"> <configentry name="filename"> <value type="string">/opt/myamc/scripts/sap/sapj</value> <configentry name="permissions"> <value type="string">frx</value> </configsection> <!-- next entry --> <configsection name="file_saplc"> <configentry name="filename"> 158 FA Agents - Installation and Administration

167 Parameter Reference <value type="string">/opt/myamc/scripts/sap/saplc</value> <configentry name="permissions"> <value type="string">frx</value> </configsection> <!-- next entry --> <configsection name="file_sapers"> <configentry name="filename"> <value type="string">/opt/myamc/scripts/sap/sapers</value> <configentry name="permissions"> <value type="string">frx</value> </configsection> <!-- next entry --> <configsection name="file_7"> <configentry name="filename"> <value type="string">/opt/myamc/scripts/sap/monitor_alert</value> <configentry name="permissions"> <value type="string">frx</value> </configsection> <!-- next entry --> <configsection name="file_8"> <configentry name="filename"> <value type="string">/opt/myamc/scripts/acc/accdb</value> <configentry name="permissions"> <value type="string">frx</value> </configsection> <!-- next entry --> <configsection name="file_9"> <configentry name="filename"> FA Agents - Installation and Administration 159

168 Parameter Reference <value type="string">/opt/myamc/scripts/acc/accci</value> <configentry name="permissions"> <value type="string">frx</value> </configsection> <!-- next entry --> <configsection name="file_10"> <configentry name="filename"> <value type="string">/opt/myamc/scripts/acc/accapp</value> <configentry name="permissions"> <value type="string">frx</value> </configsection> <!-- next entry --> <configsection name="file_11"> <configentry name="filename"> <value type="string">/opt/myamc/scripts/acc/accscs</value> <configentry name="permissions"> <value type="string">frx</value> </configsection> <!-- next entry --> <configsection name="file_accascs"> <configentry name="filename"> <value type="string">/opt/myamc/scripts/acc/accascs</value> <configentry name="permissions"> <value type="string">frx</value> </configsection> <!-- next entry --> <configsection name="file_12"> <configentry name="filename"> 160 FA Agents - Installation and Administration

169 Parameter Reference <value type="string">/opt/myamc/scripts/acc/accjc</value> <configentry name="permissions"> <value type="string">frx</value> </configsection> <!-- next entry --> <configsection name="file_13"> <configentry name="filename"> <value type="string">/opt/myamc/scripts/acc/accj</value> <configentry name="permissions"> <value type="string">frx</value> </configsection> <!-- next entry --> <configsection name="file_accers"> <configentry name="filename"> <value type="string">/opt/myamc/scripts/acc/accers</value> <configentry name="permissions"> <value type="string">frx</value> </configsection> <!-- next entry --> <configsection name="file_acclc"> <configentry name="filename"> <value type="string">/opt/myamc/scripts/acc/acclc</value> <configentry name="permissions"> <value type="string">frx</value> </configsection> </configsection> FA Agents - Installation and Administration 161

      <configsection name="mount_check">
        <!-- For the Mount_check there must be configured 'Mountpoint' and 'permissions'. -->
        <!-- 'Mountpoint' must be full specified. -->
        <!-- 'permissions': Allowed are 'R'=readable, 'W'=writable -->
        <!-- first entry -->
        <configsection name="mount_1">
          <configentry name="mountpoint">
            <value type="string">/flexframe/myamc</value>
          </configentry>
          <configentry name="permissions">
            <value type="string">rw</value>
          </configentry>
        </configsection>
      </configsection>
      <configsection name="filer_check">
        <!-- For the Filer_check there must be configured 'IP_addr' and 'permissions'. -->
        <!-- 'IP_Hostname': IP-address of filer, or hostname. -->
        <!-- 'ffu': parmaeter for future use -->
        <!-- first entry -->
        <configsection name="filer_1">
          <configentry name="ip_hostname">
            <value type="string">filer</value>
          </configentry>
          <configentry name="ffu">
            <value type="string">xxx</value>
          </configentry>
        </configsection>
        <!-- 2. entry -->
        <!-- this is in comment, because it is only an example

        <configsection name="filer_2">
          <configentry name="ip_hostname">
            <value type="string">filer2</value>
          </configentry>
          <configentry name="ffu">
            <value type="string">xxx</value>
          </configentry>
        </configsection>
        -->
      </configsection>
    </configsection>
  </configsection>
</configuration>
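The check sections in the listing above share one shape: a named configsection that holds one configsection per entry, each with configentry elements carrying a single string value. As a rough illustration only (this is not part of the FA Agents), the following Python sketch reads such a section into a dictionary. The function name `read_check_entries` is hypothetical, and the fragment assumes that every configentry is closed with a `</configentry>` tag.

```python
import xml.etree.ElementTree as ET

# Fragment modelled on the mount_check section of the listing.
# Assumption: each configentry is closed by </configentry>.
FRAGMENT = """\
<configsection name="mount_check">
  <configsection name="mount_1">
    <configentry name="mountpoint">
      <value type="string">/flexframe/myamc</value>
    </configentry>
    <configentry name="permissions">
      <value type="string">rw</value>
    </configentry>
  </configsection>
</configsection>
"""


def read_check_entries(section_xml: str) -> dict:
    """Map each entry section name to its {configentry name: value} pairs."""
    root = ET.fromstring(section_xml)
    entries = {}
    for section in root.findall("configsection"):
        entries[section.get("name")] = {
            entry.get("name"): (entry.findtext("value", default="") or "").strip()
            for entry in section.findall("configentry")
        }
    return entries


if __name__ == "__main__":
    print(read_check_entries(FRAGMENT))
```

Reading the values this way makes it easy to compare the configured mount points and permissions against what is actually present on a node before the agents are started.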


8 BlackBoard

8.1 General

myamc.fa offers a command interface through which a node, an instance or a complete system can be started and stopped. It can also place a system in the nowatch status, which disables FA monitoring for as long as this status is set. The command interface is used both for manual intervention and for operation via the WebInterface.

8.2 Implementation

The BlackBoard is implemented as an ASCII file which is secured against manipulation. This file is also used as a log for the commands triggered via the BlackBoard. Each BlackBoard command is valid for a particular period, after which it is discarded. Each BlackBoard command has a mechanism which secures it against being modified with an editor.

Command syntax: Each command is represented by a line in the ASCII file blackboard.txt. This line is formed of Variable=Value tuples. The following variables are available:

TimeStamp       Time the entry was made, in the format DD-MM-YY HH:MM:SS
TimeLong        Timestamp of the entry (seconds since 1970)
SRC-ID          Source identifier (sender ID: "AppAgent", "CtrlAgent", "WebGui", "Extern" etc.)
SRC-Hostname    Host name of the sender
CMD-ID          Command identifier: string which identifies the command
vff             Name of the virtual FlexFrame to which this command applies
Group           Name of the group to which this command applies

Service         Name of the service to which this command applies
SID             SID of the SAP system to which this command applies
Inst-Nr         Instance number to which this command applies
Node            Host name of the node to which this command applies
Value           Value for specific commands
Validity        Validity period for this command in seconds (examples: "0", "180", etc.)
Key             String which protects the command against manipulation and entries made with an editor

Example:

TimeStamp= :00:10; TimeLong= ; SRC-ID=myAMC.FA_WebInterface; SRC-Hostname=vader; CMD-ID=Service_Start; vff=bayer1; Group=Produktiv; Service=SRV_APP; SID=P*; Inst-Nr=; Node=; Value=; Validity=240; Key=65h kjjhjkh

Individual command syntax fields can be empty. These are then not entered in the BlackBoard file. Individual command syntax fields can be filled with keywords (e.g. "*"). Wildcards can also be used.

An empty field does not mean *! If, for example, a command is to apply to all nodes, Node=* must be specified.
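The line format described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the parser used by the FA Agents: the function names are hypothetical, tuples are assumed to be separated by semicolons as in the example, shell-style matching via `fnmatchcase` stands in for whatever wildcard rules the agents actually apply, and the TimeStamp, TimeLong and Key values in the sample line are invented placeholders. An empty field is deliberately not treated as "*"; in this sketch it simply matches nothing.

```python
from fnmatch import fnmatchcase

# Sample line modelled on the documented example. TimeStamp, TimeLong and
# Key are invented placeholder values, not values from the manual.
EXAMPLE = (
    "TimeStamp=01-03-07 12:00:10; TimeLong=1172750410; "
    "SRC-ID=myAMC.FA_WebInterface; SRC-Hostname=vader; "
    "CMD-ID=Service_Start; vff=bayer1; Group=Produktiv; "
    "Service=SRV_APP; SID=P*; Inst-Nr=; Node=; Value=; "
    "Validity=240; Key=placeholder"
)


def parse_blackboard_line(line: str) -> dict:
    """Split one BlackBoard line into a {variable: value} dictionary."""
    fields = {}
    for part in line.split(";"):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"not a Variable=Value tuple: {part!r}")
        fields[name.strip()] = value.strip()
    return fields


def field_matches(pattern: str, value: str) -> bool:
    """Return True if a command field selects the given value.

    An empty pattern is not a wildcard; only an explicit "*" (or another
    wildcard pattern) selects more than one value.
    """
    if pattern == "":
        return False
    return fnmatchcase(value, pattern)


if __name__ == "__main__":
    cmd = parse_blackboard_line(EXAMPLE)
    print(cmd["CMD-ID"], field_matches(cmd["SID"], "P01"))
```

Keeping the empty-field case explicit in `field_matches` mirrors the rule that Node=* has to be written out when a command is meant for all nodes.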

The following SRC-IDs (source identifiers) are currently permitted:

myamc.fa_appagent
myamc.fa_ctrlagent
myamc.fa_webinterface
myamc.fa_bbtool
myamc.fa_bbtool_1
myamc.fa_bbtool_2
myamc.fa_bbtool_3
myamc.fa_bbtool_4
myamc.fa_bbtool_5

No other SRC-IDs are accepted.

8.3 Generating BlackBoard Commands

WebInterface

Commands can be issued to the BlackBoard from the myamc.fa WebInterface if interaction is enabled and the user's rights permit this. The WebInterface requests any variables required from the user and then writes the command to the BlackBoard. Only the commands permitted for the relevant element (pool, group, node, service) are offered in the WebInterface. In future enhancements it will also be possible to store security prompts and password queries here to provide the greatest degree of security against user errors. The actual entry in the BlackBoard file is made using the BBTool.sh script.

Interactive

BlackBoard commands can be entered manually in the BlackBoard file using the BBTool.sh and BBT_dialog.sh scripts. The scripts and stored programs only provide a limited plausibility check. You should therefore preferably use the myamc.fa WebInterface.

8.4 Commands

Command | Parameters (wildcards permitted) | Parameters (no wildcards)
Service_Start | Node, vff | Group, Service, SID, Inst-Nr (with APP)
Service_Start_as_SwGet (for internal use only) | Node, vff | Group, Service, SID, Inst-Nr (with APP), Value
Service_Stop | Node, vff, Group, Service, SID, Inst-Nr |
Service_Restart | Node, vff, Group, Service, SID, Inst-Nr |
Service_SetPrio | Node, vff, Group, Service, SID, Inst-Nr | Value: Priority
Service_Watch | Node, vff, Group, Service, SID, Inst-Nr |
Service_Nowatch | Node, vff, Group, Service, SID, Inst-Nr |
Service_ReactionON | Node, vff, Group, Service, SID, Inst-Nr |
Service_ReactionOFF | Node, vff, Group, Service, SID, Inst-Nr |
Service_TrapsendON | Node, vff, Group, Service, SID, Inst-Nr |
Service_TrapsendOFF | Node, vff, Group, Service, SID, Inst-Nr |
Node_Reboot | Node, vff |
Node_Shutdown | Node, vff |
Node_Switchover | Node, vff |
Free command (only autonomous test and support) | Node, vff | Value: Command

9 FlexFrame Autonomous Agent Traps

FlexFrame Autonomous Agents can be easily linked into Enterprise Management scenarios. They supply an SNMP trap for all major status changes and for all reactions which are implemented.

9.1 Format of the FlexFrame Autonomy SNMP Traps

Each trap supplies all the important information: the time, the physical and virtual identification of the trap sender, the severity, and a meaningful message text in short or long form. A trap can therefore be used directly for display in Enterprise Event Management Systems or as a brief info, mail or SMS.

The contents of the traps of the FlexFrame Agents V1.0 and V2.0 are in principle identical. Trap attributes which did not exist in Version 1.0, e.g. pool name and group name, are not sent.

The format of the FlexFrame Autonomy traps is such that they can be analyzed and further processed in Enterprise Management Systems using filter modules. All important attributes which concern FlexFrame Autonomy can be found in the variable bindings of a FlexFrame Autonomy trap.

All traps are sent with the Enterprise OID , plus the Major Trap ID 6 and the Minor Trap ID 2.

General trap format (all variable bindings have the prefix ).

178 FlexFrame Autonomous Agent Traps Example VB Description Comment 1 10 Trap version Application ID (One of the two fields is always configurable) myamc.fa 21 Application name Timestamp (unixtime) Date %Y-%m-%d 12:15:00 32 Time %H:%M:%S 110 Symbolic group name (if configured) 111 Group name (if configured) o System ID belana 131 Physical device name Physical IP address Type: Alarm (1), Event (2), Log (3) SRV: ci:message server: Log: 152 Category SRV: <service_type>:<subservice_type>:log: Instance number cio11_o11_ Instance name <virtual_servername>_<sid>_<inst_no> Severity (see table) 170 FA Agents - Installation and Administration

179 FlexFrame Autonomous Agent Traps Example VB Description Comment Cust_1 190 Pool name (vff) Name of pool to which the host belongs BSP_GR_1 191 Group name Group name of the service Priority Priority of the service SRV_DBORA 200 Service type name Unique service type name message server 201 Subservice type Down 202 Status cio Virtual host name (server LAN) The following names are possible: SRV_DBORA, SRV_DBSAP, SRV_CI, SRV_APP, SRV_SCS, SRV_ASCS, SRV_JC, SRV_J Service ID In the first version, it corresponds to the instance number (VB 161) myamc_fa_appagent 205 Sender Process name : myamc_fa_{app Ctrl}Agent db 206 Service type display name Display name of service type Message ID Unique trap no. (internal ID) message server is down 500 Short message Foramatted short message text message server of service ci (O11 00) on node belana is down 501 Long message Foramatted long message text FA Agents - Installation and Administration 171

9.2 Severities

All messages contain a severity code. The severity levels are as follows:

Normal
    Messages which indicate the transition to a normal state are sent with severity normal.
Warning
    Messages regarding problems which are detected by the FA Agents are sent with severity warning.
Critical
    Messages regarding problems which lead to a reaction (restart, reboot, switchover) are sent with severity critical.
Emergency
    Messages regarding severe problems or situations are sent with severity emergency. This includes messages regarding MultiNodeFailures, which is a new feature as of Version 3.0A10 of the FA Agents.

9.3 Overview of the FlexFrame Autonomy SNMP Traps

The following tables provide an overview of all defined traps of the FlexFrame Agents V1 and V2. These lists can be modified and extended at any time as a result of change requests. A minus sign in a field means that this VB does not exist in the trap.

Table 1

VB 210 (as of V2.0) | Internal ID, see VB 210 (as of V2.0) | Short Message VB 500 | Long Message VB 501
1 | CtrlAgtUp | has started | <sender> on node <phys. server> has started
2 | CtrlAgtDown | has shut down | <sender> on node <phys. server> has shut down
3 | CtrlAgtSwitchOver | switch over | switching over service <service> <SID> <ID> node <phys. server>
4 | CtrlAgtSwitchOverFailed | switch over failed | switching over of service <service> <SID> <ID> from node <phys. server> failed
5 | CtrlAgtPoffHost_ok | power off ok | <hostname> power off ok: <specific message>
6 | CtrlAgtPoffHost_failed | power off failed | <hostname> power off failed: <specific message>
7 | CtrlAgtPoffHost_IF_ERR | power off not done | <hostname> power off failed: <specific message>
8 | CtrlAgtPoffSwitchOffNetInterfacesOk | switched off network interfaces | switched off network interfaces
9 | CtrlAgtPoffSwitchOffNetInterfacesFailed | failed to switch off network interfaces | failed to switch off network interfaces
10 | CtrlAgentPoffNotDone | Power off not done '<hostname>' | Power off not done '<hostname>', because node may be available. Administrator, check availability of node.
11 | CtrlAgentPoffHhk_PING_ok | ext. SwitchOver-Check '<hostname>': ping ok | ext. SwitchOver-Check '<hostname>': ping ok <specific message>

12 | CtrlAgentPoffHhk_PING_fail | ext. SwitchOver-Check '<hostname>': ping failed | ext. SwitchOver-Check '<hostname>': ping failed <specific message>
13 | CtrlAgentPoffHhk_SSH_ok | ext. SwitchOver-Check '<hostname>': ssh ok | ext. SwitchOver-Check '<hostname>': ssh ok <specific message>
14 | CtrlAgentPoffHhk_SSH_fail | ext. SwitchOver-Check '<hostname>': ssh failed | ext. SwitchOver-Check '<hostname>': ssh failed <specific message>
15 | CtrlAgentMultiFailure_ShortTime | MultiNodeFailure: ShortTime. | MultiNodeFailure: ShortTime. <specific message>
16 | CtrlAgentMultiFailure_LongTime | MultiNodeFailure: ShortTime. | MultiNodeFailure: ShortTime. <specific message>
18 | CtrlAgentNodeLiveMessageFailed | Node live message failed <hostname>. | Node live message failed <hostname>. <specific message>
19 | CtrlAgentNodeNotRechable_SwitchOver_decided | Node not reachable, SwitchOver decided <hostname>. | Node not reachable, SwitchOver decided <hostname>. <specific message>
100 | AgentUp | has <state> | <sender> on node <phys. server> has <state>
101 | AgentDown | has <state> | <sender> on node <phys. server> has <state>
110 | NodeShutDown (as of V2.0) | node <state> | node <phys. server> has <state>
111 | NodeRebooting (as of V2.0) | node <state> | node <phys. server> has <state>

112 | NodeRebootStart (as of V2.0) | node <state> | node <phys. server> has <state>
113 | NodeSwitchOver (as of V2.0) | node <state> | node <phys. server> has <state>
115 | NodeTakeOverFailed (as of V2.0) | node <state> | node <phys. server> has <state>
200 | ServiceStarting | is <state> | service <service> on node <phys. server> is <state>
201 | ServiceUp | has <state> | service <service> on node <phys. server> has <state>
202 | ServiceStartFailed | has <state> | service <service> on node <phys. server> has <state>
203 | ServiceStopping | is <state> | service <service> on node <phys. server> is <state>
204 | ServiceDown | has <state> | service <service> on node <phys. server> has <state>
205 | ServiceStopFailed | has <state> | service <service> on node <phys. server> has <state>
206 | ServiceFailed | <state> | service <service> on node <phys. server> <state>
207 | ServiceRestart (as of V2.0) | is <state> | service <service> on node <phys. server> is <state>
208 | ServiceRestartFailed | <state> | service <service> on node <phys. server> <state>
209 | ServiceWatch (as of V2.0) | <state> | watching service <service> on node <phys. server>

210 | ServiceNowatch (as of V2.0) | <state> | No longer watching service <service> on node <phys. server>
211 | ServiceReboot (as of V2.0) | Is <state> | service <service> on node <phys. server> is <state>
212 | ServiceRebootStart (as of V2.0) | Is <state> | service <service> on node <phys. server> is <state>
213 | ServiceSwitchOver (as of V2.0) | Is <state> | service <service> on node <phys. server> is <state>
214 | ServiceSwitchOverStart (as of V2.0) | Is <state> | service <service> on node <phys. server> is <state>
300 (1) | SubServiceDown | is <state> | saposcol on node <phys. server> is <state>
300 (2) | SubServiceDown | is <state> | <ServiceSubType> of service <service> on node <phys. server> is <state>
301 | SubServiceFailed | <state> | <ServiceSubType> of service <service> on node <phys. server> <state>
302 | SubServiceDeleted | <state> | <state> <ServiceSubType> for service <service> on node <phys. server>
400 | CHK_Filers | Filer <state> | <specific message>
401 | CHK_Files | File <state> | <specific message>
402 | CHK_Lock | Lock <state> | <specific message>

185 FlexFrame Autonomous Agent Traps Table 1 VB 210 (as of V2.0) Internal ID see VB 210 (as of V2.0) Short Message VB 500 Long Message VB CHK_Mounts Mount <state> <specific message> 500 BB_get node <state> <specific message> 501 BB_DoIt node <state> <specific message> 502 BB_Error node <state> <specific message> 503 BB_NoMatch node <state> <specific message> Table 2 VB 210 (as of V2.0) Service VB 200 Service Subtype VB 201 ID VB 204 Logical Server Name in ServerLAN (SL) VB 203 SID VB 121 Long Instance No. VB SRV_DBORASRV_DBSAP, SRV_LC - SL <SID> - 3 SRV_CI, SRV_JC, SRV_SCS, SRV_ASCS, SRV_APP, SRV_J - <ID> SL <SID> <Inst No> SRV_ERS <ID> - <SID> <Inst No> FA Agents - Installation and Administration 177

186 FlexFrame Autonomous Agent Traps Table 2 VB 210 (as of V2.0) Service VB 200 Service Subtype VB 201 ID VB 204 Logical Server Name in ServerLAN (SL) VB 203 SID VB 121 Long Instance No. VB 161 SRV_DBORASRV_DBSAP, SRC_LC - SL <SID> - 4 SRV_CI, SRV_JC, SRV_SCS, RV_ASCS, SRV_APP, SRV_J, SRV_ERS - <ID> SL <SID> <Inst No> SRV_ERS <ID> - <SID> <Inst No> 5 14, , , , 115 <hostname> SRV_DBORASRV_DBSAP, SRV_LC - SL <SID> SRV_CI, SRV_JC, SRV_SCS, SRV_ASCS, SRV_APP, SRV_J - <ID> SL <SID> <Inst No> SRV_ERS <ID> - <SID> <Inst No> 300 (1) - saposcol - SL FA Agents - Installation and Administration

187 FlexFrame Autonomous Agent Traps Table 2 VB 210 (as of V2.0) Service VB 200 Service Subtype VB 201 ID VB 204 Logical Server Name in ServerLAN (SL) VB 203 SID VB 121 Long Instance No. VB 161 SRV_DBSAP, SRV_LC vserver, kernel SRV_DBORA ora_dbw, ora_lgwr, ora_ckpt, ora_smon, ora_pmon, tnslsnr, ora_mman, ora_arc - SL (2) SRV_ERS ers.sap <ID> - <SID> <Inst No> SRV_CI ms.sap, dw.sap SRV_JC SRV_SCS, SRV_ASCS SRV_APP SRV_J jc.sap ms.sap, en.sap dw.sap jc.sap <ID> SL <SID> <Inst No> 301 SRV_DBORASRV_DBSAP, SRV_LC - SL <SID> - SRV_CI, SRV_JC, SRV_SCS, SRV_ASCS, SRV_APP, SRV_J server-lan ping <ID> SL <SID> <Inst No> SRV_CI, SRV_JC, SRV_SCS, SRV_ASCS, SRV_APP, SRV_J client-lan ping <ID> SL <SID> <Inst No> FA Agents - Installation and Administration 179

188 FlexFrame Autonomous Agent Traps Table 2 VB 210 (as of V2.0) Service VB 200 Service Subtype VB 201 ID VB 204 Logical Server Name in ServerLAN (SL) VB 203 SID VB 121 Long Instance No. VB 161 SRV_DBORASRV_DBSAP, SRV_LC - SL <SID> SRV_CI, SRV_JC, SRV_SCS, SRV_ASCS, SRV_APP, SRV_J control file <ID> SL <SID> <Inst No> SRV_ERS <ID> - <SID> <Inst No> , SRV_DBORASRV_DBSAP, SRV_CI, SRV_JC, SRV_SCS, SRV_ASCS, SRV_APP, SRV_J, SRV_ERS, SRV_LC - <ID> SL <SID> <Inst No> Table 3 VB 210 (as of V2.0) State VB 202 Sender VB 205 Long Severity VB started myamc_fa_[app] [Ctrl]Agent 50 2 shut down myamc_fa_[app] [Ctrl]Agent myamc_fa_[app] [Ctrl]Agent failed myamc_fa_[app] [Ctrl]Agent FA Agents - Installation and Administration

189 FlexFrame Autonomous Agent Traps Table 3 VB 210 (as of V2.0) State VB 202 Sender VB 205 Long Severity VB poff ok myamc_fa_[app] [Ctrl]Agent poff failed myamc_fa_[app] [Ctrl]Agent poff not done myamc_fa_[app] [Ctrl]Agent netoff ok myamc_fa_[app] [Ctrl]Agent netoff failed myamc_fa_[app] [Ctrl]Agent poff not done myamc_fa_[app] [Ctrl]Agent ping ok myamc_fa_[app] [Ctrl]Agent ping failed myamc_fa_[app] [Ctrl]Agent ssh ok myamc_fa_[app] [Ctrl]Agent ssh failed myamc_fa_[app] [Ctrl]Agent MultiNodeFailure: ShortTime myamc_fa_[app] [Ctrl]Agent MultiNodeFailure: ShortTime myamc_fa_[app] [Ctrl]Agent Node live message failed myamc_fa_[app] [Ctrl]Agent Node not reachable myamc_fa_[app] [Ctrl]Agent started myamc_fa_[app] [Ctrl]Agent 50 FA Agents - Installation and Administration 181

190 FlexFrame Autonomous Agent Traps Table 3 VB 210 (as of V2.0) State VB 202 Sender VB 205 Long Severity VB shut down myamc_fa_[app] [Ctrl]Agent shut down myamc_fa_[app] [Ctrl]Agent reboot (down) myamc_fa_[app] [Ctrl]Agent reboot (up) myamc_fa_[app] [Ctrl]Agent switch over myamc_fa_[app] [Ctrl]Agent take over failed myamc_fa_[app] [Ctrl]Agent starting myamc_fa_[app] [Ctrl]Agent started myamc_fa_[app] [Ctrl]Agent failed to start myamc_fa_[app] [Ctrl]Agent stopping myamc_fa_[app] [Ctrl]Agent stopped myamc_fa_[app] [Ctrl]Agent failed to stop myamc_fa_[app] [Ctrl]Agent failed myamc_fa_[app] [Ctrl]Agent restart myamc_fa_[app] [Ctrl]Agent failed to restart myamc_fa_[app] [Ctrl]Agent FA Agents - Installation and Administration

191 FlexFrame Autonomous Agent Traps Table 3 VB 210 (as of V2.0) State VB 202 Sender VB 205 Long Severity VB watch myamc_fa_[app] [Ctrl]Agent no watch myamc_fa_[app] [Ctrl]Agent reboot myamc_fa_[app] [Ctrl]Agent reboot (start) myamc_fa_[app] [Ctrl]Agent swich over myamc_fa_[app] [Ctrl]Agent switch over (start) myamc_fa_[app] [Ctrl]Agent (1) down myamc_fa_[app] [Ctrl]Agent (2) down myamc_fa_[app] [Ctrl]Agent failed (<nr>) myamc_fa_[app] [Ctrl]Agent delete myamc_fa_[app] [Ctrl]Agent failed myamc_fa_[app] [Ctrl]Agent failed myamc_fa_[app] [Ctrl]Agent failed myamc_fa_[app] [Ctrl]Agent failed myamc_fa_[app] [Ctrl]Agent BB get cmd myamc_fa_[app] [Ctrl]Agent 50 FA Agents - Installation and Administration 183

192 FlexFrame Autonomous Agent Traps Table 3 VB 210 (as of V2.0) State VB 202 Sender VB 205 Long Severity VB BB do cmd myamc_fa_[app] [Ctrl]Agent BB cmd error myamc_fa_[app] [Ctrl]Agent 50, 150, BB cmd no match myamc_fa_[app] [Ctrl]Agent 50, 150, FA Agents - Installation and Administration

10 Troubleshooting

The FA Agents offer a large number of diagnostic options for detecting and diagnosing problems on the FA Agents themselves or on other components. Problems concerning FA Agents can be assigned to one of the following categories:

- FlexFrame installation and configuration errors
- Parameter errors
- Configuration errors
- Detection, reaction errors, start, stop, maintenance errors
- Power-shutdown errors

Typical consequences of installation and configuration errors are:

- FA Agents fail to start
- Error messages during startup of FA Agents

Error: Mount points missing
Diagnosis: In the case of mount points monitored by FA Autonomy, traps are sent to the central trap consoles. If other mount points which are absolutely essential for the operation of the node concerned are missing, the agents may fail to start because the directories required are not available.
Response: Provide the mount points required with the appropriate mount options.

Error: Mount points without File Locking
Diagnosis: The FA Agents log this situation both in the operating system's Syslog and in special files (/opt/myamc/vff/log/log_syslog*).
Response: Provide the mount points required with the appropriate mount options (lock).

Error: Rights for the directories/files are not sufficient
Diagnosis: In the case of files monitored by FA Autonomy, traps are sent to the central trap consoles. If other directories/files which are absolutely essential for the operation of the node concerned are affected, the agents may fail to start because the directories/files required are not available.

Response: Provide the directories/files with the required rights.

Error: Agents do not have the authorization to write to the directories assigned
Diagnosis: The FA Autonomy work and log files are not written.
Response: Provide the directories/files with the required rights.

Error: Version incompatibility
FlexFrame installation and FlexFrame Autonomy installation are not directly compatible. This can always be the case when older FlexFrame installations are updated with new FlexFrame Autonomy Agents.
Diagnosis: For diagnosis and troubleshooting, the mount points, the directory structure and the access rights to the directories used by the agents must be checked.
Response: Using the command line tool, check that the parameters used in the FA config files are compatible with the version and syntactically correct.

Error: Pool assignment not found
A node is assigned to the wrong pool or to the default pool.
Diagnosis: Display on the FA_WebGUI or in the agent's start trap and display on an event console.
Response: Check the LDAP configuration parameters, call the PGTool Pool.sh and check the pool name returned. Check the pool membership for each node.

Error: Group assignment is not correct
Diagnosis: Display on the FA_WebGUI or in the agent's start trap and display on an event console.
Response: Check the group configuration in the group configuration file. Check the group membership for each node.

Error: Service priority not recognized
Diagnosis: Display on the FA_WebGUI or in the agent's start trap and display on an event console.
Response: Check the configuration of the service class and service priority in the group configuration file. Check the values for each node.

Error: Availability problem not rectified by autonomous reaction
Diagnosis: Services are discontinued (possibly due to a hardware fault) and are not made available again by FlexFrame Autonomy.
Response: Check whether nodes are available for taking over the services (Spare Nodes). Check whether the FA Agents on the nodes involved have been started.

Error: Services do not start / Constant reboot / Permanent switchover
Diagnosis: SAP services which are started do not enter run mode but are repeatedly restarted or, if the problem escalates, the node is rebooted or an internal switchover takes place.
Possible causes:
- The MaxRestart time for the service is too short. This parameter can be adjusted in the FA configuration.
- The virtual interfaces cannot be reached.
- There is a permanent problem which prevents a service being started (e.g. necessary database recovery).
Response: Stop the FA Agents to interrupt escalation of the reaction and check whether the service can be started manually. If the service cannot be started manually, this problem must be corrected by the administrator. If the service can be started manually, compare the time this takes with the MaxStart time and MaxRestart time in the configuration and adjust the configuration, if necessary. If the virtual interfaces cannot be reached from the Application Agent, the network configuration must be checked.

Error: Service cannot be stopped
Diagnosis: An active SAP service is repeatedly restarted after a manual stop command.
Response: If the FlexFrame SAP scripts were not used for the manual stop command, this is the cause. The FlexFrame SAP scripts must be used. If the FlexFrame SAP scripts were used, the Monitor Alert Script might not be available or might not have the required rights. However, it is also possible that the Monitor Alert Time and CycleTime are configured incorrectly: the agents' CycleTime is too long in relation to the Monitor Alert Time. The Monitor Alert Time must be at least 3 * the CycleTime.

Error: Maintenance activities are interrupted by autonomous reactions
Diagnosis: Unwanted autonomous reactions during maintenance.
Response: Set NoWatch for the service concerned or stop the Application Agents for the node concerned and restart them after maintenance has been completed.

Error: Incorrect display on the FA_WebGUI
Diagnosis: The state checked manually does not match the display.
Response: Check whether the Application Agents concerned, the Control Agents concerned and the web server are running properly for the WebInterface.

The log files of the FlexFrame Autonomy Agents can be used for the diagnosis. The FlexFrame Autonomy Agents write detailed log files. The functions of the FA Agents are documented in their own files. These files are created dynamically during ongoing operation and must not be modified manually, as this can impair fault-free operation of the FA Agents or lead to erroneous reactions. Deleting these files results in a state in which the Autonomous Agents reorganize themselves and, from this point on, analyze the situation from their current viewpoint without any previous information.
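The timing rule given under "Service cannot be stopped" (Monitor Alert Time at least 3 * CycleTime) is simple arithmetic and can be checked before a configuration is rolled out. The Python sketch below is illustrative only: the function and parameter names are hypothetical, and it assumes that both values are expressed in the same unit (seconds here).

```python
def min_monitor_alert_time(cycle_time_s: float, factor: int = 3) -> float:
    """Smallest Monitor Alert Time allowed for a given CycleTime.

    The factor of 3 comes from the rule in the text; the unit (seconds)
    is an assumption of this sketch.
    """
    return factor * cycle_time_s


def monitor_alert_time_ok(monitor_alert_time_s: float, cycle_time_s: float) -> bool:
    """Return True if the Monitor Alert Time satisfies the 3 * CycleTime rule."""
    return monitor_alert_time_s >= min_monitor_alert_time(cycle_time_s)


if __name__ == "__main__":
    # With a CycleTime of 10 the Monitor Alert Time must be at least 30.
    print(min_monitor_alert_time(10))
    print(monitor_alert_time_ok(25, 10))
```

For example, with a CycleTime of 10 a Monitor Alert Time of 25 violates the rule, while 30 or more satisfies it.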

11 Abbreviations

ABAP      Advanced Business Application Programming
ACC       Adaptive Computing Controller
ACI       Adaptive Computing Infrastructure
ACPI      Advanced Configuration and Power Interface
APM       Advanced Power Management
APOLC     Advanced Planner & Optimizer Life Cache
CCU       Console Connection Unit
CIFS      Common Internet File System
DART      Data Access in Real Time
DHCP      Dynamic Host Configuration Protocol
DIT       Domain Information Tree
ERP       Enterprise Resource Planning
ESF       Enhanced System Facility
EULA      End User License Agreement
FAA       FlexFrame Autonomous Agent
FC        Fiber Channel
FTP       File Transfer Protocol
IP        Internet Protocol
IPMP      IP Multipathing
LAN       Local Area Network
LDAP      Lightweight Directory Access Protocol
LUN       Logical Unit Number
MAC       Media Access Control
MINRA     Minimal Read Ahead
NAS       Network Attached Storage
NDMP      Network Data Management Protocol
NFS       Network File System

NIC       Network Interface Card
NVRAM     Non-Volatile Random Access Memory
OBP       Open Boot Prom
OLTP      On-Line Transaction Processing
ONTAP     Open Network Technology for Appliance Products
OSS       Open Source Software
POST      Power-On Self Test
PCL       PRIMECLUSTER
PFS       Production File System (on Celerra)
PW        PRIMEPOWER
PXE       Preboot Execution Environment
PY        PRIMERGY
QA        Quality Assurance
QS        Quality of Service
RAID      Redundant Array of Independent (or Inexpensive) Disks
RARP      Reverse Address Resolution Protocol
RDBMS     Relational Database Management System
RHEL      Red Hat Enterprise Linux
RSB       Remote Service Board
SCS       System Console Software
SAP BW    SAP Business Warehouse
SAPGUI    SAP Graphical User Interface
SAPOSS    SAP Online System Service
SID       System Identifier
SLD       System Landscape Directory
SLES      SUSE Linux Enterprise Server
SMB       Server Message Block
SMC       System Management Console
SNMP      Simple Network Management Protocol

SPOC      Single Point Of Control
TELNET    Telecommunications Network
TFTP      Trivial File Transfer Protocol
UDP       User Datagram Protocol
UPS       Uninterruptible Power Supply
VLAN      Virtual Local Area Network
VTOC      Virtual Table Of Contents
WAN       Wide Area Network
WAS       Web Application Server
WAFL      Write Anywhere File Layout
XSCF      Extended System Control Facility


12 Glossary

Adaptive Computing Controller
    SAP system for monitoring and controlling SAP environments.
Advanced Business Application Programming
    Proprietary programming language of SAP.
Advanced Power Management
    Advanced Power Management defines a layer between the hardware and the operating system that effectively shields the programmer from hardware details.
Application Agent
    A software program for monitoring and managing applications.
Application Node
    A host for applications (e.g. SAP instances db, ci, agate, wgate, app etc.). This definition includes Application Servers as well as Database Servers.
Automounter
    The automounter is an NFS utility that automatically mounts directories on an NFS client as they are needed, and unmounts them when they are no longer needed.
Autonomous Agent
    Central system management and high availability software component of FlexFrame.
Blade
    A special form factor for computer nodes.
BladeRunner
    The working title for the solution part of SAP for FlexFrame.
BOOTPARAM
    Boot time parameters of the kernel.
BRBACKUP
    SAP backup and restore tools.
Celerra
    NAS system of EMC.
Checkpoint Restore
    On EMC Celerra, a SnapSure feature that restores a PFS to a point in time using checkpoint information. As a precaution, SnapSure automatically creates a new checkpoint of the PFS before it performs the restore operation.
Client LAN
    Virtual network segment within FlexFrame, used for client-server traffic.

Common Internet File System
    A protocol for the sharing of file systems (same as SMB).
Computing Node
    From the SAP ACI perspective: A host that is used for applications.
Control Agent
    A software program for monitoring and managing nodes within FlexFrame.
Control LAN
    Virtual network segment within FlexFrame, used for system management traffic.
Control Node
    A physical computer system, controlling and monitoring the entire FlexFrame landscape and running shared services in the rack (dhcp, tftp, ldap etc.).
Control Station
    A Control Node in an SAP ACI environment.
DART
    Operating system of Celerra data movers (Data Access in Real Time).
Dynamic Host Configuration Protocol
    DHCP is a protocol for assigning dynamic IP addresses to devices on a network.
Dynamic Host Configuration Protocol server
    A DHCP server provides configuration parameters specific to the DHCP client host, required by the host to participate on the Internet.
EMC NAS
    Network attached storage for file systems of EMC.
Enterprise Resource Planning
    Enterprise Resource Planning systems are management information systems that integrate and automate many of the business practices associated with the operations or production aspects of a company.
Ethernet
    A Local Area Network which supports data transfer rates of 10 megabits per second.
Fiber Channel
    Fiber Channel is a serial computer bus intended for connecting high-speed storage devices to computers.
Filer
    Network attached storage for file systems of NetApp.
FlexFrame
    A joint project in which the main partners are SAP, Network Appliance, Intel and Fujitsu Siemens Computers.

FlexFrame™ for SAP
FlexFrame™ for SAP is a radically new architecture for SAP environments. It exploits the latest business-critical computing technology to deliver major cost savings for SAP customers.

FlexFrame internal LAN Switch
Cisco network switches which are an integral part of the FlexFrame for SAP hardware configuration and which are automatically configured by the FlexFrame for SAP software.

Gigabit Ethernet
A Local Area Network which supports data transfer rates of 1 gigabit (1,000 megabits) per second.

Host name
The name of a node (assigned to an interface) that is resolved to a unique IP address. One node can have multiple host names (cf. node name). In SAP environments host names are currently limited to 13 alphanumeric characters including the hyphen (-). The first character must be a letter. In the SAP environment host names are case-sensitive.

Image
In the FlexFrame documentation, Image is used as a synonym for Hard Disk Image.

Internet Protocol Address
A unique number used by computers to refer to each other when sending information through networks using the Internet Protocol.

Lightweight Directory Access Protocol
Protocol for accessing on-line directory services.

Local Area Network
A computer network that spans a relatively small area. Most LANs are confined to a single building or group of buildings. However, one LAN can be connected to other LANs over any distance via telephone lines and radio waves. A system of LANs connected in this way is called a Wide Area Network (WAN).

Local host name
The name of the node (physical computer); it can be displayed and set using the command /bin/hostname.

Logical Unit Number
An address for a single (SCSI) disk drive.

MAC address
Device identifier number of a Network Interface Card. In full: "media access control address".

MaxDB
A relational database system from MySQL (formerly ADABAS and SAPDB).

Media Access Control address
An identifier for network devices, usually unique. The MAC address is stored physically on the device.

NAS system
Network Attached Storage of any vendor (in our context: EMC NAS or NetApp Filer).

NDMPcopy
NDMPcopy transfers data between Filers using the Network Data Management Protocol (NDMP).

Netboot
A boot procedure for computers where the operating system is provided via a network instead of local disks.

NetWeaver
SAP NetWeaver is the technical foundation of SAP solutions.

Network Appliance Filer
See Filer.

Network Attached Storage
A data storage device that is connected via a network to one or multiple computers.

Network File System
A network protocol for network-based storage access.

Network Interface Card
A hardware device that allows computer communication via networks.

Node
A physical computer system controlled by an OS.

Node name
The name of a physical node as returned by the command uname -n. Each node name within a FlexFrame environment must be unique.

Non-Volatile Random Access Memory
A type of memory that retains its contents when the power is turned off.

On-Line Transaction Processing
Transaction processing via computer networks.

OpenLDAP
An Open Source LDAP Service Implementation.

Open Network Technology for Appliance Products
The operating system of Network Appliance Filers.

Open Source Software
Software that is distributed free of charge under an open source license, such as the GNU General Public License.

Oracle RAC
A cluster database by Oracle Corporation.

Physical host
Name of a physical computer system (node).

Power-On Self Test
Part of a computer's boot process; automatic testing of diverse hardware components.

Preboot Execution Environment
An environment that allows a computer to boot from a network resource without having a local operating system installed.

PRIMECLUSTER
Fujitsu Siemens Computers' high-availability and clustering software.

PRIMEPOWER
Fujitsu Siemens Computers' SPARC-based server product line.

PRIMERGY
Fujitsu Siemens Computers' i386-based server product line.

Red Hat Enterprise Linux
Linux distribution by Red Hat, Inc., targeting business customers.

Reverse Address Resolution Protocol
A protocol allowing resolution of an IP address corresponding to a MAC address.

SAP Service
In FlexFrame: SAP Service and DB Services.

SAP service script
An administration script for starting and stopping an SAP application on a virtual host.

SAP Solution Manager
Service portal for the implementation, operation and optimization of an SAP solution.

SAPLogon
Front-end software for SAPGUI.

SAPRouter
Router for SAP services like SAPGUI or SAPTELNET.

SavVol
A Celerra volume to which SnapSure copies original point-in-time data blocks from the PFS before the blocks are altered by a PFS transaction.

Server
A physical host (hardware), same as node.

Service
A software program providing functions to clients.

Service type
The type of an application or service (db, ci, app, agate, wgate etc.).

Single Point of Control
In FlexFrame: One user interface to control a whole FlexFrame environment.

Storage LAN
A virtual LAN segment within a FlexFrame environment, carrying the traffic to NAS systems.

SUSE Linux Enterprise Server
A Linux distribution by Novell, specializing in server installations.

Telecommunications Network
A terminal emulation program for TCP/IP networks such as the Internet.

Trivial File Transfer Protocol
A simple form of the File Transfer Protocol (FTP). TFTP uses the User Datagram Protocol (UDP) and provides no security features. It is often used by servers to boot diskless workstations, X-terminals, and routers.

TFTP server
A simple FTP implementation.

Virtual host
The name of the virtual host on which an application runs; it is assigned to a physical node when an application is started.

Virtual Local Area Network
A VLAN is a logically segmented network mapped over physical hardware according to the IEEE 802.1q standard.

Virtualization
Virtualization means the separation of hardware and processes. In a virtualized environment (FlexFrame), a process can be moved between hardware nodes while staying transparent to the user and application.
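The host name rule given in the glossary entry "Host name" (at most 13 characters, letters, digits and the hyphen only, first character a letter) can be checked mechanically before a name is entered into a configuration. The following Python sketch is only an illustration of that rule as worded in the glossary; the function name is_valid_sap_hostname and the constant names are hypothetical and are not part of FlexFrame or SAP software. It assumes that "alphanumeric" means ASCII letters and digits.

```python
import re

# Assumption: the limit of 13 characters is taken from the glossary entry
# "Host name"; "alphanumeric" is read as ASCII letters and digits.
MAX_SAP_HOSTNAME_LENGTH = 13

# First character must be a letter; the rest may be letters, digits or hyphens.
_SAP_HOSTNAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9-]*")


def is_valid_sap_hostname(name: str) -> bool:
    """Return True if name satisfies the glossary's host name rule (hypothetical helper)."""
    if len(name) > MAX_SAP_HOSTNAME_LENGTH:
        return False
    # fullmatch rejects the empty string and any character outside the set.
    return _SAP_HOSTNAME_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("sapci-01", "1node", "db_node", "a" * 14):
        print(candidate, is_valid_sap_hostname(candidate))
```

Because host names are case-sensitive in the SAP environment, the sketch deliberately does not normalize case; two names that differ only in case should be treated as different names by any caller.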


Information on this document

On April 1, 2009, Fujitsu became the sole owner of Fujitsu Siemens Computers. This new subsidiary of Fujitsu has been renamed Fujitsu Technology Solutions. This document from the document archive refers to a product version which was released a considerable time ago or which is no longer marketed. Please note that all company references and copyrights in this document have been legally transferred to Fujitsu Technology Solutions. Contact and support addresses will now be offered by Fujitsu Technology Solutions.

Copyright Fujitsu Technology Solutions, 2009


More information

This guide consists of the following two chapters and an appendix. Chapter 1 Installing ETERNUSmgr This chapter describes how to install ETERNUSmgr.

This guide consists of the following two chapters and an appendix. Chapter 1 Installing ETERNUSmgr This chapter describes how to install ETERNUSmgr. Preface This installation guide explains how to install the "ETERNUSmgr for Linux" storage system management software on an ETERNUS DX400 series, ETERNUS DX8000 series, ETERNUS2000, ETERNUS4000, ETERNUS8000,

More information

ServerView Resource Orchestrator V User's Guide. Windows/Linux

ServerView Resource Orchestrator V User's Guide. Windows/Linux ServerView Resource Orchestrator V2.2.1 User's Guide Windows/Linux J2X1-7526-01ENZ0(01) November 2010 Preface Purpose This manual provides an outline of ServerView Resource Orchestrator (hereinafter Resource

More information

Adding Application Protection in Virtualized SAP Environments in vsphere. Thomas Jorczik

Adding Application Protection in Virtualized SAP Environments in vsphere. Thomas Jorczik Adding Application Protection in Virtualized SAP Environments in vsphere Thomas Jorczik SIOS Technology Corporation Founded in November 1999 (SteelEye Technology Inc.) to follow-on all LifeKeeper activities

More information

ETERNUS SF Express V15.3/ Storage Cruiser V15.3/ AdvancedCopy Manager V15.3. Migration Guide

ETERNUS SF Express V15.3/ Storage Cruiser V15.3/ AdvancedCopy Manager V15.3. Migration Guide ETERNUS SF Express V15.3/ Storage Cruiser V15.3/ AdvancedCopy Manager V15.3 Migration Guide B1FW-5958-06ENZ0(00) June 2013 Preface Purpose This manual describes how to upgrade to this version from the

More information

Storage Monitoring Made Easy for DBAs: Diagnosing Performance Problems. Senior Product Manager Consulting Member of Technical Staff

Storage Monitoring Made Easy for DBAs: Diagnosing Performance Problems. Senior Product Manager Consulting Member of Technical Staff Storage Monitoring Made Easy for DBAs: Diagnosing Performance Problems Anirban Chatterjee Sriram Palapudi Senior Product Manager Consulting Member of Technical Staff The following is intended to outline

More information

FUJITSU Software ServerView Resource Orchestrator Cloud Edition V Quick Start Guide. Windows/Linux

FUJITSU Software ServerView Resource Orchestrator Cloud Edition V Quick Start Guide. Windows/Linux FUJITSU Software ServerView Resource Orchestrator Cloud Edition V3.1.2 Quick Start Guide Windows/Linux J2X1-7622-06ENZ0(01) June 2014 Preface Purpose of This Document This manual explains the flow of installation

More information

How To...Use a Debugging Script to Easily Create a Test Environment for a SQL-Script Planning Function in PAK

How To...Use a Debugging Script to Easily Create a Test Environment for a SQL-Script Planning Function in PAK SAP NetWeaver SAP How-To NetWeaver Guide How-To Guide How To...Use a Debugging Script to Easily Create a Test Environment for a SQL-Script Planning Function in PAK Applicable Releases: SAP NetWeaver BW

More information

Fujitsu Technology Solutions. StorMan Version 2.0 May Release Notice

Fujitsu Technology Solutions. StorMan Version 2.0 May Release Notice Fujitsu Technology Solutions StorMan Version 2.0 May 2009 Release Notice All rights reserved, especially industrial property rights. Modifications to technical data and delivery subject to availability.

More information

Managing Serviceguard Extension for SAP on Linux (IA64 Integrity and x86_64)

Managing Serviceguard Extension for SAP on Linux (IA64 Integrity and x86_64) Managing Serviceguard Extension for SAP on Linux (IA64 Integrity and x86_64) *T2392-90015* Printed in the US HP Part Number: T2392-90015 Published: March 2009 Legal Notices Copyright (R) 2000-2009 Hewlett-Packard

More information

Whitepaper: Back Up SAP HANA and SUSE Linux Enterprise Server with SEP sesam. Copyright 2014 SEP

Whitepaper: Back Up SAP HANA and SUSE Linux Enterprise Server with SEP sesam.  Copyright 2014 SEP Whitepaper: Back Up SAP HANA and SUSE Linux Enterprise Server with SEP sesam info@sepusa.com www.sepusa.com Table of Contents INTRODUCTION AND OVERVIEW... 3 SOLUTION COMPONENTS... 4-5 SAP HANA... 6 SEP

More information

ETERNUS SF Express V15.1/ Storage Cruiser V15.1/ AdvancedCopy Manager V15.1. Migration Guide

ETERNUS SF Express V15.1/ Storage Cruiser V15.1/ AdvancedCopy Manager V15.1. Migration Guide ETERNUS SF Express V15.1/ Storage Cruiser V15.1/ AdvancedCopy Manager V15.1 Migration Guide B1FW-5958-03ENZ0(00) August 2012 Preface Purpose This manual describes how to upgrade to this version from the

More information
