SUSE Linux Enterprise Server 11: Certified Linux Engineer Manual


SUSE Linux Enterprise Server 11: Certified Linux Engineer Manual
Course 3107
Novell Training Services AUTHORIZED COURSEWARE
Novell Training Services (en) 15 April 2009
Part # REV A

Legal Notices

Novell, Inc., makes no representations or warranties with respect to the contents or use of this documentation, and specifically disclaims any express or implied warranties of merchantability or fitness for any particular purpose. Further, Novell, Inc., reserves the right to revise this publication and to make changes to its content, at any time, without obligation to notify any person or entity of such revisions or changes. Further, Novell, Inc., makes no representations or warranties with respect to any software, and specifically disclaims any express or implied warranties of merchantability or fitness for any particular purpose. Further, Novell, Inc., reserves the right to make changes to any and all parts of Novell software, at any time, without any obligation to notify any person or entity of such changes.

Any products or technical information provided under this Agreement may be subject to U.S. export controls and the trade laws of other countries. You agree to comply with all export control regulations and to obtain any required licenses or classification to export, re-export or import deliverables. You agree not to export or re-export to entities on the current U.S. export exclusion lists or to any embargoed or terrorist countries as specified in the U.S. export laws. You agree to not use deliverables for prohibited nuclear, missile, or chemical biological weaponry end uses. See the Novell International Trade Services Web page for more information on exporting Novell software. Novell assumes no responsibility for your failure to obtain any necessary export approvals.

Copyright 2008 Novell, Inc. All rights reserved. No part of this publication may be reproduced, photocopied, stored on a retrieval system, or transmitted without the express written consent of the publisher.
Novell, Inc., has intellectual property rights relating to technology embodied in the product that is described in this document. In particular, and without limitation, these intellectual property rights may include one or more of the U.S. patents listed on the Novell Legal Patents Web page ( company/legal/patents/) and one or more additional patents or pending patent applications in the U.S. and in other countries.

Novell, Inc. 404 Wyman Street, Suite 500 Waltham, MA U.S.A.

Online Documentation: To access the latest online documentation for this and other Novell products, see the Novell Documentation Web page.

Novell Trademarks: For Novell trademarks, see the Novell Trademark and Service Mark list.

Third-Party Materials: All third-party trademarks are the property of their respective owners.

Contents

Introduction 7
    Student Kit Deliverables 7
    Course Design 8
        Course Objectives 8
        Audience 8
        Certification and Prerequisites 8
        Agenda 9
    Exercise Conventions 10
    SUSE Linux Enterprise Server 11 Information 11
        SUSE Linux Enterprise Server 11 Support and Maintenance 11
        Novell Customer Center 11
        SUSE Linux Enterprise Server 11 Online Resources 12

SECTION 1  Manage Networking 13
    Objective 1  Configure Networking on SLES 11 14
        Manage the Network Configuration Information from YaST
        Manage the Network Configuration with Command Line Tools
    Objective 2  Understand Linux Network Bridges 39
        Understand How Bridging Works
        Configure Network Bridges
        Exercise 1-1  Configure a Network Bridge
    Objective 3  Bond Network Adapters 46
        Understand Network Bonding
        Configure Bonding
        Understand How /sbin/ifup Deals with Bonding Devices
        Exercise 1-2  Bonding of Network Interfaces
    Objective 4  Configure Virtual Local Area Networks 54
        How VLANs Work
        How Tagging Works
        Configure SLES 11 with YaST to Support VLANs
        Exercise 1-3  Configure VLAN Tagging

    Summary 62

SECTION 2  Manage Server Storage 65
    Objective 1  Manage SCSI Devices on Linux 66
        SCSI Concepts
        How SCSI Devices Work on Linux
        Common SCSI Commands
    Objective 2  Describe How Fibre Channel SANs Work 91
        How SANs Work
        SAN Hardware
        Providing Clustering and Redundancy with SANs
    Objective 3  Implement a SAN with iSCSI 109
        The Benefits of Using iSCSI
        How iSCSI Works
        iSCSI Terminology
        Implement an iSCSI SAN
        Implementing an iSNS Server
        Exercise 2-1  Configure a SAN with iSCSI
    Summary 145

SECTION 3  Work with Xen Virtualization 149
    Objective 1  Understand Virtualization Technology 150
        Virtualization Background
        Virtualization Products
        Virtualization Hardware
        Management of Virtual Machines in the Enterprise
    Objective 2  Implement SLES 11 as a Xen Host Server 162
        Install Xen During Installation of SLES 11
        Install Xen on an Installed SLES 11
    Objective 3  Implement SLES 11 as a Xen Guest 166
        Install a Xen Virtual Machine Locally
        Install a Xen Virtual Machine Using Remote Storage
        Install a Xen Virtual Machine Non-Interactively
        Manage Xen Domains with Virt-Manager
        Manage Xen Domains from the Command Line
        Migrate Xen Virtual Machines between Hosts
        Exercise 3-1  Install and Use Xen Virtualization

    Summary 192

SECTION 4  Harden Servers 193
    Objective 1  Describe Server Hardening 194
        What Is Server Hardening?
        Are Linux Systems Vulnerable?
    Objective 2  Harden a SLES 11 Server 198
        Checking File Permissions
        Securing Software and Services
        Managing User Access
        Closing Unnecessary Ports
        Exercise 4-1  Harden a SLES 11 Server
    Objective 3  Harden Services with AppArmor 232
        Improve Application Security with AppArmor
        Create and Manage AppArmor Profiles
        Control AppArmor
        Monitor AppArmor
        Exercise 4-2  Protect Services with AppArmor
    Objective 4  Implement an IDS 260
        How AIDE Works
        Configuring AIDE Rules
        Using AIDE to Check for Altered Files
        Exercise 4-3  Configure AIDE
    Summary 271

SECTION 5  Update Servers 273
    Objective 1  Describe How SMT Works 274
        Registering with the Novell Customer Center
        How SMT Works
    Objective 2  Install and Configure an SMT Server 280
        Generate Your Mirror Credentials
        Install the SMT Server
        Set the SMT Job Schedule
        Manage Software Repositories
    Objective 3  Configure SMT Client Systems 301
        Configure SMT Client Systems
        Manage SMT Client Systems
    Objective 4  Stage Repositories 313
        Staging Repositories from the Command Line
        Staging Repositories in YaST
        Exercise 5-1  Implement an SMT Server

    Summary 320

SECTION 6  Prepare Servers for Disasters 323
    Objective 1  Design a Backup Strategy 324
        Choosing a Backup Method
        Choosing a Backup Media
        Defining a Backup Schedule
        Determining What to Back Up
    Objective 2  Use Linux Tools to Create Backups 328
        Create Backups with tar
        Mirror Directories with rsync
        Use LVM Snapshots with Backups
        Exercise 6-1  Use an LVM Snapshot Volume to Create a Consistent Backup
    Objective 3  Implement Multipath I/O 343
        Understand Multipathing
        Configure and Administer Multipathing
        Exercise 6-2  Access Remote Storage Using Multipath I/O
    Summary 353

SECTION 7  Monitor Server Health 355
    Objective 1  Document the Server 356
        Creating a Server Deployment Plan
        Maintaining a Server Configuration Log
        Documenting System Changes and Maintenance Events
        Creating Server Baselines
    Objective 2  Monitor Log Files with logwatch 376
        How logwatch Works
        Installing and Configuring logwatch
        Using logwatch to Monitor Your Server Log Files
    Objective 3  Monitor Network Hosts with Nagios 384
        How Nagios Works
        Installing Nagios
        Configuring the Nagios Server
        Configuring Monitored Hosts
        Exercise 7-1  Use Nagios to Monitor Network Hosts
    Summary 437

Introduction

The SUSE Linux Enterprise Server: Certified Linux Expert 11 course (3107) is designed to enable professional administrators to perform specialized tasks on SUSE Linux Enterprise Server 11 concerning networking, storage, virtualization, and disaster preparedness and recovery.

The target audience for this course is administrators at the Novell Certified Linux Professional 11 level with a solid Linux background. Therefore, the practical tasks are given as scenarios that need to be solved with the information from the theory part. An outline of steps to take to solve the task and sample solutions are provided, but detailed step-by-step instructions are not included.

This course prepares candidates for the Novell Certified Linux Engineer 11 Practicum exam. This certification is designed to prove advanced SLES 11 server administration skills. These skills are core to various areas of specialization: whether a candidate focuses on security, application servers, or directory management, the skills in this certification are relevant.

Student Kit Deliverables

Your student kit includes the following:

SUSE Linux Enterprise Server: Certified Linux Expert 11 manual
SUSE Linux Enterprise Server: Certified Linux Expert 11 workbook
SUSE Linux Enterprise Server: Certified Linux Expert 11 Course DVDs 1, 2, 3
SUSE Linux Enterprise Server 11 product DVD
SUSE Linux Enterprise Desktop 11 product DVD

The SUSE Linux Enterprise Server: Certified Linux Expert 11 Course DVDs contain setup instructions, the manuals in PDF files, exercise files, the SLE 11 Software Development Kit ISO, and Xen virtual machine images that are used in some of the scenarios.

The scenarios are based on a set of three computers installed with SUSE Linux Enterprise Server 11 that are shared by two students. One of these computers (da1) provides services, such as iSCSI, to the other two hosts (da2 and da3).
Most of the exercises are performed on these two SUSE Linux Enterprise Server 11 hosts. In addition to these three machines installed on physical hardware, SUSE Linux Enterprise Server 11 virtual machines are set up during exercises in the Workbook.

NOTE: Instructions for setting up a self-study environment are in the 3107_setup.pdf on the Course DVD.

Course Design

The course design is explained under the following headings:

Course Objectives on page 8
Audience on page 8
Certification and Prerequisites on page 8
Agenda on page 9

Course Objectives

This course teaches you how to perform the following SUSE Linux Enterprise Server 11 administrative tasks:

Manage Networking
Manage Server Storage
Design a Server Virtualization Strategy
Harden Servers
Update Servers
Prepare Servers for Disasters
Monitor Server Health

These are common tasks for an advanced SUSE Linux Enterprise Server 11 administrator in an enterprise environment.

Audience

The audience for this course is experienced SLES server administrators who are, or intend to be, managing SLES servers at a level that goes beyond the administration skills covered in CLA 11 and CLP 11. The course also prepares them for the Novell Certified Linux Engineer 11 Practicum examination.

Certification and Prerequisites

A solid Linux background is a prerequisite for this course. Participants should have passed the CLP 11 Practicum exam or have comparable knowledge as taught in the CLP 11 courses SUSE Linux Enterprise 11 Fundamentals (3101), SUSE Linux Enterprise 11 Administration (3102), and SUSE Linux Enterprise Server 11 Administration (3103).

This course can be used as preparation towards the following certifications:

Novell Certified Linux Engineer 11 on page 8

Novell Certified Linux Engineer 11

This course can be used as the first course towards the Novell Certified Linux Engineer 11 (Novell CLE 11) certification. Unlike a multiple-choice test, the Novell

CLE 11 exam is a practical exam, called Practicum, requiring the examinee to perform actual system administration tasks. The Practicum is a hands-on, scenario-based exam where you apply the knowledge you have learned to solve real-life problems, demonstrating that you know what to do and how to do it.

The Practicum tests you on objectives in the skills outlined in the following four Novell CLP 11 and CLE 11 courses:

SUSE Linux Enterprise 11 Fundamentals - Course 3101
SUSE Linux Enterprise 11 Administration - Course 3102
SUSE Linux Enterprise Server 11 Administration - Course 3103
SUSE Linux Enterprise Server 11: Certified Linux Engineer - Course 3107

As with all Novell certifications, course work is recommended. To achieve the certification, you are required to pass the Novell CLE 11 Practicum. The following illustrates the training/testing path for Novell CLE 11:

Figure Intro-1  Certification Path for Novell CLE 11

NOTE: For more information about Novell certification programs and taking the Novell CLE 11 Practicum, see (

Agenda

The following is the agenda for this five-day course:

Table Intro-1  Agenda

Section                                            Duration
Day 1
  Introduction                                     00:30
  Section 1: Manage Networking                     04:00
  Section 2: Manage Server Storage                 01:00
Day 2
  Section 2: Manage Server Storage (contd.)        04:00
  Section 3: Work with Xen Virtualization          02:30
Day 3
  Section 3: Work with Xen Virtualization (contd.) 03:00
  Section 4: Harden Servers                        04:00
Day 4
  Section 5: Update Servers                        04:30
  Section 6: Prepare Servers for Disasters         02:00
Day 5
  Section 6: Prepare Servers for Disasters (contd.) 01:00
  Section 7: Monitor Server Health                 05:00

Exercise Conventions

When working through an exercise, you will see conventions that indicate information you need to enter that is specific to your server. The following describes the most common conventions:

italicized text: This refers to your unique situation, such as the hostname of your server. For example, supposing the hostname of your server is da50 and you see the following:

hostname.digitalairlines.com

You would enter da50.digitalairlines.com

xx: This is the IP address that is assigned to your SUSE Linux Enterprise Server 11. For example, supposing your IP address is and you see the following:

xx

You would enter

Select: The word select is used in exercise steps with reference to menus where you can choose between different entries, such as drop-down menus.

Enter and Type: The words enter and type have distinct meanings.

The word enter means to type text in a field or at a command line and press the Enter key when necessary.

The word type means to type text without pressing the Enter key. If you are directed to type a value, make sure you do not press the Enter key or you might activate a process that you are not ready to start.

Key combinations: Ctrl+Alt+F1 indicates that all three keys should be pressed at the same time. Ctrl, Alt, F1 indicates that the three keys should be pressed and released one after the other.

SUSE Linux Enterprise Server 11 Information

The copy of SUSE Linux Enterprise Server 11 you receive in your student kit is a fully functioning copy of the SUSE Linux Enterprise Server 11 product. The following information will help you to get the most out of SUSE Linux Enterprise Server 11:

SUSE Linux Enterprise Server 11 Support and Maintenance on page 11
Novell Customer Center on page 11
SUSE Linux Enterprise Server 11 Online Resources on page 12

SUSE Linux Enterprise Server 11 Support and Maintenance

To receive official support and maintenance updates, you need to do one of the following:

Register for a free registration/serial code that provides you with 60 days of support and maintenance.
Purchase a copy of SUSE Linux Enterprise Server 11 from Novell (or an authorized dealer).

You can obtain your free 60-day support and maintenance code at products/server/eval.html (

NOTE: You will need to have a Novell login account to access the 60-day evaluation.

Novell Customer Center

Novell Customer Center is an intuitive, Web-based interface that helps you to manage your business and technical interactions with Novell.
Novell Customer Center consolidates access to information, tools, and services such as the following:

Automated registration for new SUSE Linux Enterprise products
Patches and updates for all shipping Linux products from Novell
Order history for all Novell products, subscriptions, and services

Entitlement visibility for new SUSE Linux Enterprise products
Linux subscription renewal status
Subscription renewals via partners or Novell

For example, a company might have an administrator who needs to download SUSE Linux Enterprise software updates, a purchaser who wants to review the order history, and an IT manager who has to reconcile licensing. With Novell Customer Center, the company can meet all these needs in one location and can give users access rights appropriate to their roles.

You can access the Novell Customer Center at (

SUSE Linux Enterprise Server 11 Online Resources

Novell provides a variety of online resources to help you configure and implement SUSE Linux Enterprise Server 11:

( This is the Novell home page for SUSE Linux Enterprise Server.
( sles11/) This is the Novell Documentation Web site for SUSE Linux Enterprise Server.
( This is the home page for all Novell Linux support, and it includes links to support options such as Knowledgebase, downloads, and FAQs.
( This Web site provides the latest implementation guidelines and suggestions from Novell on a variety of products, including SUSE Linux Enterprise.

SECTION 1  Manage Networking

In this section, you first review how the network is configured in SUSE Linux Enterprise Server 11 using YaST and command line tools. The subsequent objectives cover advanced networking topics, including bridging, the bonding of network devices, and VLANs.

Objectives

1. Configure Networking on SLES 11 on page 14
2. Understand Linux Network Bridges on page 39
3. Bond Network Adapters on page 46
4. Configure Virtual Local Area Networks on page 54

Objective 1  Configure Networking on SLES 11

To configure networking on SLES 11, you can use either YaST or command line tools.

Manage the Network Configuration Information from YaST on page 14
Manage the Network Configuration with Command Line Tools on page 22

Manage the Network Configuration Information from YaST

The YaST module for configuring network cards and the network connection can be accessed from the YaST Control Center. To activate the network configuration module, select Network Devices > Network Settings. The following appears:

Figure 1-1  YaST Network Settings: Overview

You can set up and modify your configuration information using the following:

Global Options Tab on page 15
Overview Tab on page 15
Hostname/DNS Tab on page 18
Routing Tab on page 19
General Tab on page 19

Global Options Tab

When you select the Global Options tab, the following appears:

Figure 1-2  YaST Network Settings: Global Options

Select one of the following network setup methods:

User Controlled with NetworkManager. Uses a desktop applet to manage the connections for all network interfaces.
Traditional Method with ifup. Uses the ifup command. This is the recommended setup method for SLES 11.

You can also enable IPv6 and set your DHCP Client options in this tab.

Overview Tab

Using the traditional method, select the Overview tab to view the detected network cards, as shown in the following:

Figure 1-3  YaST Network Settings: Overview

Select the card you want to configure; then click Edit.

Usually the cards are auto-detected by YaST, and the correct kernel module is used. If the card is not recognized by YaST, the required module must be entered manually in YaST. Do this by clicking Add in the Overview tab. The following dialog appears:

Figure 1-4  YaST Network Settings: Hardware Dialog

In this dialog, you specify the details of the interface to configure, such as Network Device Type (Ethernet) and Configuration Name (0). Under Kernel Module, specify the name of the module to load. You can select the card model from a list of network cards. Some kernel modules can be configured more precisely by adding options or parameters for the kernel. Details about parameters for specific modules can be found in the kernel documentation.

After clicking Next, the following dialog appears:

Figure 1-5  YaST Network Card Setup: Address

In this dialog, specify the following information to integrate the network device into an existing network:

Dynamic Address (via DHCP). Select this option if the network card should receive an IP address from a DHCP server.
Statically assigned IP Address. Select this option if you want to statically assign an IP address to the network card.
Subnet Mask. Specify the subnet mask for your network.
Hostname. Specify a unique name for this system.
Additional Addresses. Select Add to assign an additional IP address to the interface. A dialog appears where you can enter an alias, the IP address, and the netmask. The alias appears in the output of ifconfig as a virtual interface ethx:alias with the IP address assigned in YaST.
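A mistyped dotted-quad value in the static address dialog is a common source of trouble. The following is a minimal shell sketch of how such an address can be sanity-checked before you enter it; the function name and logic are illustrative only and are not part of YaST or SLES:

```shell
# Illustrative helper (not a SLES tool): returns success only for a
# value of the form a.b.c.d where each octet is a number from 0-255.
valid_ipv4() {
    oldIFS=$IFS
    IFS=.
    set -- $1          # split the argument on dots into $1..$n
    IFS=$oldIFS
    [ $# -eq 4 ] || return 1
    for octet in "$@"; do
        case $octet in
            ''|*[!0-9]*) return 1 ;;   # empty or non-numeric field
        esac
        [ "$octet" -le 255 ] || return 1
    done
    return 0
}

valid_ipv4 192.168.1.10 && echo valid   # prints "valid"
```

The check is deliberately rough (it accepts, for example, leading zeros), but it catches the typical slips: a missing octet, a stray character, or an octet above 255.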

Hostname/DNS Tab

Select the Hostname/DNS tab. The following appears:

Figure 1-6  YaST Network Settings: Hostname/DNS

This dialog lets you enter the following:

Hostname. Enter a name for the computer. This name should be unique within the network.
Domain Name. Enter the DNS domain the computer belongs to. A computer can be addressed uniquely using its FQDN (Fully Qualified Domain Name). This consists of the host name and the name of the domain. For example: da51.digitalairlines.com.
List of name servers. Enter the IP address of your organization's DNS server(s). You can specify a maximum of three name servers.
Domain search list. Enter your DNS domain. In the local network, it is usually more appropriate to address other hosts with their host names, not with their FQDNs. The domain search list specifies the domains that the system can append to the host name to create the FQDN. For example, da51 is expanded with the search list digitalairlines.com to the FQDN da51.digitalairlines.com. This name is then passed to the name server to be resolved. If the search list contains several domains, the completion takes place one after the other, and the resulting FQDN is passed to the name server until an entry returns an associated IP address. Separate the domains with commas or white space.
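The search-list completion described above can be sketched in shell. The hostname da51 and the first domain match the course example; the second domain is a made-up placeholder, and the loop only prints the candidate FQDNs a real resolver would try in order:

```shell
# Sketch of domain-search-list completion for a short hostname.
# A real resolver queries the name server for each candidate in turn
# and stops at the first name that resolves to an IP address.
host="da51"
search_list="digitalairlines.com example.com"

for domain in $search_list; do
    echo "candidate FQDN: $host.$domain"
done
```

With the values above, the first candidate printed is da51.digitalairlines.com, which is exactly the name the resolver passes to the name server first.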

Routing Tab

To modify routing, select the Routing tab. The following appears:

Figure 1-7  YaST Network Settings: Routing

On the Routing tab, you can define the following:

Default Gateway. If the network has a gateway (a computer that forwards information from the local network to other networks), its address can be specified in the network configuration. All data not addressed to the local network is forwarded directly to the gateway.
Routing Table. You can create entries in the routing table of the system by selecting Expert Configuration.
Enable IP Forwarding. If you select this option, IP packets that are not destined for your computer are forwarded from one NIC to the other according to the routing table.

All the necessary information is now available to activate the network card.

General Tab

On the General tab of the Network Card Setup dialog, you can set up additional network card options, as shown in the following:

Figure 1-8  YaST Network Card Setup: General Tab

You can configure the following:

Device Activation. Specify when the interface should be set up. Possible values include the following:

At Boot Time. During system start.
On Cable Connection. If there is a physical network connection.
On Hotplug. When the hardware is plugged in.
Manually. The interface must be manually started.
Never. The interface is never started.
On NFSroot. The interface is automatically started, but can't be shut down using the rcnetwork stop command. The ifdown command, however, can still be used to bring down the interface. This is useful when the root file system resides on an NFS server.

Firewall Zone. Use to activate/deactivate the firewall for the interface. If activated, you can specify which firewall zone to put the interface in:

Firewall Disabled
Internal Zone (Unprotected)
Demilitarized Zone
External Zone

Device Control. Normally only root is allowed to activate and deactivate a network interface. To allow normal users to do this, activate the Enable Device Control for Non-root User via KInternet option.

Maximum Transfer Unit (MTU). Specify the maximum size of an IP packet. The size depends on the hardware. For an Ethernet interface, the maximum size is 1500 bytes.

After you save the configuration in YaST, the Ethernet card should be activated and connected to the network. You can verify this with the ip command, as shown in the following:

geeko@da1:~> ip address show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet /8 brd scope host lo
    inet /8 brd scope host secondary lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
    link/ether 00:0c:29:7f:82:69 brd ff:ff:ff:ff:ff:ff
    inet /16 brd scope global eth0
    inet6 fe80::20c:29ff:fe7f:8269/64 scope link
       valid_lft forever preferred_lft forever

In this example, the eth0 interface was configured. Additionally, the loopback device (lo) is always set up. On some Linux distributions, you also see a sit0 (Simple Internet Transition) device; it is needed to connect to IPv6 networks.

YaST writes the network configuration information for eth0 to the /etc/sysconfig/network/ifcfg-eth0 file. The content of this file looks similar to the following:

BOOTPROTO='static'
STARTMODE='auto'
NAME='nForce2 Ethernet Controller'
BROADCAST=''
ETHTOOL_OPTIONS=''
IPADDR=' /16'
MTU=''
NETWORK=''
REMOTE_IPADDR=''
USERCONTROL=''no'

Additional IP addresses configured in YaST appear in the file as follows:

IPADDR_ALIAS='u.v.w.x/y'
LABEL_ALIAS='ALIAS'

If LABEL_ALIAS is set to '' (empty), the additional IP addresses can only be seen in the output of the ip command; they are not visible in the output of ifconfig.
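Note that the ifcfg file stores the address in CIDR notation (the /16 suffix), while the YaST dialog asks for a separate subnet mask. The conversion between the two forms can be sketched with a small shell function; the helper below is written for illustration and is not part of SLES:

```shell
# Illustrative helper: convert a CIDR prefix length (e.g. 16) into a
# dotted netmask (e.g. 255.255.0.0). Pure arithmetic, no network access.
prefix_to_netmask() {
    prefix=$1
    mask=""
    for i in 1 2 3 4; do
        if [ "$prefix" -ge 8 ]; then
            octet=255                         # a fully-masked octet
            prefix=$((prefix - 8))
        else
            octet=$(( (255 << (8 - prefix)) & 255 ))   # partial octet
            prefix=0
        fi
        mask="$mask.$octet"
    done
    echo "${mask#.}"                          # strip the leading dot
}

prefix_to_netmask 16   # prints 255.255.0.0
```

So an IPADDR value ending in /16 corresponds to the subnet mask 255.255.0.0 that you would type into the YaST Address dialog.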

Manage the Network Configuration with Command Line Tools

The advantage of YaST is that it not only changes the network configuration, but also takes care of all network configuration files in a consistent manner. To change the current network configuration, YaST uses commands in the background that you should be familiar with. This will help you to better understand what YaST actually does, and also help you to make quick changes, for instance, to test a certain setup without making permanent configuration changes.

To work effectively with command line tools to configure network settings, you need to be able to do the following:

Use the ip Tool to Configure Network Settings on page 22
Use Additional Tools to Configure Network Settings on page 33
Configure the Hostname and Name Resolution on page 36

Use the ip Tool to Configure Network Settings

You can use the ip command to change the network interface configuration quickly from the command line. You can use the ip command as a normal user to display the current network setup. To change the network setup, you have to be logged in as root.

IMPORTANT: Changes made with the ip tool are not persistent. If you reboot the system, all changes will be lost. To make them persistent, you must edit the appropriate configuration files.

Changing the network interface configuration at the command line is especially useful for testing purposes, but if you want a configuration to be permanent, you must save it in a configuration file. These configuration files are generated automatically when you set up a network card with YaST, but you can also create and edit them with a text editor.
When using the ip command, you need to be able to do the following:

Display the Current Network Configuration on page 22
Change the Current Network Configuration on page 26
Monitor the State of Devices on page 27
Set Up Routing with the ip Tool on page 28
Save Device Settings to a Configuration File on page 31

Display the Current Network Configuration

With the ip tool, you can display the following information:

IP Address Setup on page 23
Device Attributes on page 24
Device Statistics on page 25
Neighbor Entries on page 26

IP Address Setup

To display the IP address setup of all interfaces, enter ip address show at the shell prompt. Depending on your network setup, you see information similar to the following:

da1:~ # ip address show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet /8 brd scope host lo
    inet /8 brd scope host secondary lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
    link/ether 00:0c:29:7f:82:69 brd ff:ff:ff:ff:ff:ff
    inet /16 brd scope global eth0
    inet6 fe80::20c:29ff:fe7f:8269/64 scope link
       valid_lft forever preferred_lft forever

The information is grouped by network interfaces. Every interface entry starts with a digit, called the interface index; the interface name is displayed after the interface index. In the above example, there are two interfaces:

lo: The loopback device, which is available on every Linux system, even when no network adapter is installed. (As stated above, device and interface are often used synonymously in the context of network configuration.) Using this virtual device, applications on the same machine can use the network to communicate with each other. For example, you can use the IP address of the loopback device to access a locally installed Web server by typing 127.0.0.1 in the address bar of your Web browser.

eth0: The first Ethernet adapter of the computer in this example. Ethernet devices are normally called eth0, eth1, eth2, and so on.

You always have the entries for the loopback device, and on some systems there is also, by default, a sit0 device (sit stands for Simple Internet Transition). The sit0 device is a special virtual device which can be used to encapsulate IPv6 packets into IPv4 packets. It is not used in a normal IPv4 network.
Depending on your hardware setup, you might have more Ethernet devices in the ip output. Several lines of information are displayed for every network interface, such as eth0 in the preceding example:

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000

The most important information in this line is the interface index (2) and the interface name (eth0).

The other information shows additional attributes set for this device, such as the hardware address of the Ethernet adapter (00:0c:29:7f:82:69):

link/ether 00:0c:29:7f:82:69 brd ff:ff:ff:ff:ff:ff

In the following line, the IPv4 setup of the device is displayed:

inet /16 brd scope global eth0

inet is followed by the IP address ( ), and brd is followed by the broadcast address ( ). The length of the network mask in bits (16) is displayed after the IP address and separated from it by a /.

The following lines show the IPv6 configuration of the device:

inet6 fe80::20c:29ff:fe7f:8269/64 scope link
   valid_lft forever preferred_lft forever

The IPv6 address shown here is the automatically assigned link local address. The address is generated from the hardware address of the device. The link local IPv6 address allows you to contact other computers in the same network segment only and has limited functionality. IPv6 is not covered in this course.

Depending on the device type, the information can differ. However, the most important information (such as assigned IP addresses) is always shown.

NOTE: IPv6 is covered in the SUSE Linux Enterprise Server Administration (3103) course.

ip options can be abbreviated: ip a s produces the same output as ip address show. If two possible options start with the same letters, more than one letter is required to make the options unambiguous.
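The fields described above are easy to pull out of the ip output with standard text tools. The sketch below works on a hard-coded sample shaped like the output shown earlier, so it runs without a real interface; the 172.17.8.1/16 address and other values are placeholders for illustration:

```shell
# Sketch: extract the IPv4 address/prefix of eth0 from `ip address show`
# style output. The sample text below is a placeholder so the snippet
# does not depend on a live network interface.
sample='2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
    link/ether 00:0c:29:7f:82:69 brd ff:ff:ff:ff:ff:ff
    inet 172.17.8.1/16 brd 172.17.255.255 scope global eth0
    inet6 fe80::20c:29ff:fe7f:8269/64 scope link'

# The inet line holds the IPv4 setup; field 2 is address/prefix.
# Matching on a first field of exactly "inet" skips the inet6 line.
addr=$(printf '%s\n' "$sample" | awk '$1 == "inet" {print $2; exit}')
echo "IPv4 address: $addr"   # prints IPv4 address: 172.17.8.1/16
```

On a live system, the same extraction would read from ip address show eth0 directly instead of the sample variable.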
Device Attributes

If you are interested only in the device attributes and not in the IP address setup, you can enter ip link show:

da2:~ # ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
    link/ether 00:0c:29:7f:82:69 brd ff:ff:ff:ff:ff:ff

The information is similar to what you see when you enter ip address show, but the information about the address setup is missing. The device attributes are displayed in angle brackets right after the device name. The following is a list of some possible attributes and their meanings:

Table 1-1 Device Attributes

UP: The device is turned on. It is ready to transmit packets to and receive packets from the network.
LOOPBACK: The device is a loopback device.
BROADCAST: The device can send packets to all hosts sharing the same network.
POINTOPOINT: The device is connected to only one other device. All packets are sent to and received from the other device.
MULTICAST: The device can send packets to a group of other systems at the same time.
PROMISC: The device listens to all packets on the network, not only to those sent to the device's hardware address. This is usually used for network monitoring.

Device Statistics

You can use the -s option with the ip command to display additional statistics information about the devices. The command looks like the following:

ip -s link show eth0

By giving the device name at the end of the command line, the output is limited to one specific device. This can also be used to display the address setup or the device attributes. The following is an example of the information displayed for the device eth0. As with other options of ip, such as address, the link and show options can be abbreviated:

da2:~ # ip -s li sh dev eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
    link/ether 00:0c:29:7f:82:69 brd ff:ff:ff:ff:ff:ff
    RX: bytes  packets  errors  dropped  overrun  mcast
    TX: bytes  packets  errors  dropped  carrier  collsns

Two additional sections with information are displayed for every device. Each of the sections has a headline with a description of the information displayed. The section starting with RX displays information about received packets, and the section starting with TX displays information about sent packets. The sections display the following information:

Bytes. The total number of bytes received or transmitted by the device.
Packets. The total number of packets received or transmitted by the device.

Errors. The total number of receiver or transmitter errors.
Dropped. The total number of packets dropped due to a lack of resources.
Overrun. The total number of receiver overruns resulting in dropped packets. As a rule, if a device is overrun, it means that there are serious problems in the Linux kernel or that your computer is too slow for the device.
Mcast. The total number of received multicast packets. This option is supported by only a few devices.
Carrier. The total number of link media failures due to a lost carrier.
Collsns. The total number of collision events on Ethernet media.
Compressed. The total number of compressed packets.

Neighbor Entries

The ip neighbor show command lists the content of the ARP cache:

da2:~ # ip neigh ls
fe80::248:54ff:fe51:460d dev eth0 lladdr 00:48:54:51:46:0d router DELAY
2a01:123:456:1::1 dev eth0 lladdr 00:48:54:51:46:0d router REACHABLE
 dev eth0 lladdr 00:48:54:51:46:0d REACHABLE

In the above example, the neighbor with the IP address can be reached via eth0 and has the MAC address 00:48:54:51:46:0d. This neighbor also has an IPv6 address (2a01:123:...) and is an IPv6 router (flag router).

Change the Current Network Configuration

You can also use the ip tool to change the network configuration by performing the following tasks:

Assign an IP Address to a Device on page 26
Delete the IP Address from a Device on page 27
Change Device Attributes on page 27

Assign an IP Address to a Device

To assign an address to a device, use a command similar to the following:

da2:~ # ip address add /24 brd + dev eth0

In this example, the command assigns the IP address to the device eth0. The network mask is 24 bits long, as determined by the /24 after the IP address. The brd + option sets the broadcast address automatically as determined by the network mask.
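The computation behind brd + can be sketched in a few lines of shell: the broadcast address is the IP address with all host bits (those not covered by the prefix length) set to 1. This is only an illustration of the arithmetic ip performs internally; the function name and sample addresses are invented:

```shell
#!/bin/sh
# Compute the broadcast address that `brd +` derives from ADDRESS PREFIXLEN.
broadcast_of() {
    IFS=. read -r a b c d <<EOF
$1
EOF
    addr=$(( (a << 24) | (b << 16) | (c << 8) | d ))    # dotted quad -> integer
    mask=$(( (0xffffffff << (32 - $2)) & 0xffffffff ))  # network mask from prefix
    bc=$(( addr | (~mask & 0xffffffff) ))               # set all host bits to 1
    printf '%d.%d.%d.%d\n' $(( (bc >> 24) & 255 )) $(( (bc >> 16) & 255 )) \
        $(( (bc >> 8) & 255 )) $(( bc & 255 ))
}

broadcast_of 192.168.5.17 24   # prints 192.168.5.255
```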
You can enter ip address show dev eth0 to verify the assigned IP address. The assigned address is displayed in the output of the command.

IMPORTANT: You can assign more than one IP address to a device. Additional addresses assigned with the ip command are not visible in the output of ifconfig.

Delete the IP Address from a Device

To delete the IP address from a device, use a command similar to the following:

da2:~ # ip address del /24 dev eth0

In this example, the command deletes the IP address from the device eth0. Use ip a s eth0 to verify that the address was deleted.

Change Device Attributes

You can also change device attributes with the ip tool. The following is the basic command to set device attributes:

ip link set device attribute

The possible attributes are described in Device Attributes on page 24. The most important attributes are up and down. By setting these attributes, you can enable or disable a network device.

To enable a network device (such as eth0), enter the following command:

da2:~ # ip link set eth0 up

To disable a network device (such as eth0), enter the following command:

da2:~ # ip link set eth0 down

To change the MAC address of a network device (such as eth0), enter the following command:

da2:~ # ip link set address 00:22:33:44:55:66 dev eth0

Monitor the State of Devices

You can use the ip command to monitor the state of devices, addresses, and routes continuously. The following example shows the output of ip monitor all when the network cable was removed:

da2:~ # ip monitor all
4: eth2: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc pfifo_fast master bond0 state DOWN
    link/ether 00:80:c8:f6:86:14 brd ff:ff:ff:ff:ff:ff
4: eth2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 state UP
    link/ether 00:80:c8:f6:86:14 brd ff:ff:ff:ff:ff:ff

Using the rtmon command, it is possible to write the information to a file that can later be viewed with the ip monitor command:

da2:~ # rtmon file /var/log/rtmon.log
^C
da2:~ # ip monitor file /var/log/rtmon.log
Timestamp: Tue Feb 2 13:25: us
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP
    link/ether 00:19:d1:9f:17:87 brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master if7 state UP
    link/ether 00:80:c8:f6:88:9f brd ff:ff:ff:ff:ff:ff
4: eth2: <BROADCAST,MULTICAST,PROMISC,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master if7 state UP
    link/ether 00:80:c8:f6:88:9f brd ff:ff:ff:ff:ff:ff
7: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether 00:80:c8:f6:88:9f brd ff:ff:ff:ff:ff:ff
8: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether 00:19:d1:9f:17:87 brd ff:ff:ff:ff:ff:ff
Timestamp: Tue Feb 2 13:25: us
 dev br0 lladdr 00:11:11:c2:35:f4 REACHABLE
Timestamp: ...

The output shows a state dump at the moment rtmon was started, followed by Timestamp entries with the information about changes that happened at that moment.

Set Up Routing with the ip Tool

You can use the ip tool to configure the routing table of the Linux kernel. The routing table determines the path IP packets use to reach the destination system.
When using the ip command to set up routing, you need to be able to

View the Routing Table on page 29
Add Routes to the Routing Table on page 29
Delete Routes from the Routing Table on page 30
Save Routing Settings to a Configuration File on page 30

NOTE: Routing is a complex topic; this objective covers only the most common routing scenarios.

View the Routing Table

To view the current routing table, enter ip route show. For most systems, the output looks similar to the following:

da1:~ # ip route show
/16 dev eth0 proto kernel scope link src
169.254.0.0/16 dev eth0 scope link
127.0.0.0/8 dev lo scope link
default via dev eth0

Every line represents an entry in the routing table. Each line in the example is explained below:

/16 dev eth0 proto kernel scope link src

This line represents the route for the local network. All network packets to a system in the same network are sent directly through the device eth0.

169.254.0.0/16 dev eth0 scope link

This line shows a network route for the 169.254.0.0/16 network. Hosts can use this network for address autoconfiguration. SLES 11 automatically assigns a free IP address from this network when no other device configuration is present. The route to this network is always set, even when the system itself has no assigned IP address from that network.

127.0.0.0/8 dev lo scope link

This is the route for the loopback device.

default via dev eth0

This line is the entry for the default route. All network packets that cannot be sent according to the previous entries of the routing table are sent through the gateway defined in this entry.

Depending on the setup of your machine, the content of the routing table varies. In most cases, you have at least two entries in the routing table:

One route to the local network to which the system is connected
One route to the default gateway for all other packets

Add Routes to the Routing Table

The following are the most common tasks you complete when adding a route:

Set a Route to the Locally Connected Network on page 30
Set a Route to a Different Network on page 30

Set a Default Route on page 30
Delete Routes from the Routing Table on page 30
Save Routing Settings to a Configuration File on page 30

Set a Route to the Locally Connected Network

The following command sets a route to the locally connected network:

da2:~ # ip route add /16 dev eth0

The system in this example is in the network. The network mask is 16 bits long (255.255.0.0). All packets to the local network are sent directly through the device eth0.

Set a Route to a Different Network

The following command sets a route to a different network:

da2:~ # ip route add /24 via

All packets for the network are sent through the gateway.

Set a Default Route

The following command sets a default route:

da2:~ # ip route add default via

Packets that cannot be sent according to previous entries in the routing table are sent through the gateway specified in this entry.

Delete Routes from the Routing Table

To delete an entry from the routing table, use a command similar to the following:

da2:~ # ip route delete /24 dev eth0

This command deletes the route to the network assigned to the device eth0.

Save Routing Settings to a Configuration File

Routing settings made with the ip tool are lost when you reboot your system. Settings have to be written to configuration files to be restored at boot time.

Routes to the directly connected network are automatically set up when a device is started. All other routes are saved in the /etc/sysconfig/network/routes configuration file. The following shows the content of a typical configuration file, with one route via a gateway and a default route:

 eth0
default

Each line of the configuration file represents an entry in the routing table. Each line is explained below:

eth0: All packets sent to the network with the network mask are sent to the gateway through the eth0 device.

default: This entry represents a default route. All packets that are not affected by the previous entries of the routing table are sent to the gateway. It is not necessary to fill out the last two columns of the line for a default route.

To apply changes to the routing configuration file, you need to restart the affected network device with the ifdown and ifup commands.

Save Device Settings to a Configuration File

All device configuration changes you make with ip are lost when the system is rebooted. To restore the device configuration automatically when the system is started, the settings need to be saved in configuration files.

The configuration files for network devices are located in the /etc/sysconfig/network/ directory. If the network devices are set up with YaST, one configuration file is created for every device. For Ethernet devices, the filenames consist of ifcfg- and the name of the device, such as ifcfg-eth0.

We recommend that you set up a device with YaST first and then change the configuration file as needed. Setting up a device from scratch is a complex task, because the hardware driver would also need to be configured manually. The content of the configuration files depends on the configuration of the device.
To change the configuration file, you need to know how to do the following:

Configure a Device Statically on page 31
Configure a Device Dynamically with DHCP on page 33
Start and Stop Configured Interfaces on page 33

Configure a Device Statically

The content of a configuration file of a statically configured device, such as ifcfg-eth0, is similar to the following:

BOOTPROTO='static'
STARTMODE='auto'
NAME='nForce2 Ethernet Controller'
ETHTOOL_OPTIONS=''
BROADCAST=''
IPADDR=' /16'
MTU=''
NETWORK=''
REMOTE_IPADDR=''
USERCONTROL='no'

The configuration file includes several lines. Each line has an option and a value assigned to that option, as explained below:

BOOTPROTO='static'. Determines the way the device is configured. There are two possible values:
static. The device is configured with a static IP address.
dhcp. The device is configured automatically by a DHCP server.

STARTMODE='auto'. Determines how the device is started. This option can use the following values:
auto. The device is started at boot time or when initialized at runtime.
manual. The device must be started manually with ifup.
ifplugd. The interface is controlled by ifplugd.

ETHTOOL_OPTIONS=''. The ethtool utility is used for querying the settings of an Ethernet device and changing them (for instance, setting the speed or half/full duplex mode). The manual page for ethtool lists the available options. If you want ethtool to modify any settings, list the options here. If no options are listed, ethtool is not called.

BROADCAST=''. Broadcast address of the network. If empty, the broadcast address is derived from the IP address and the netmask, according to the configuration in /etc/sysconfig/network/config.

IPADDR=' /16'. IP address of the device.

NETWORK=''. Address of the network itself.

REMOTE_IPADDR=''. Required only if you are setting up a point-to-point connection.

MTU=''. Specifies a value for the MTU (Maximum Transmission Unit). If you don't specify a value, the default value is used. For an Ethernet device, the default is 1500 bytes.

The /etc/sysconfig/network/ifcfg.template file contains a template that you can use as a base for device configuration files. It also has comments explaining the various options.
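Because ifcfg files consist of plain shell variable assignments, a script can simply source one to inspect its settings, much as the ifup scripts read the real files. The following sketch writes a throwaway example file (the values are invented, not a real device configuration) and reads it back:

```shell
#!/bin/sh
# ifcfg files are shell variable assignments; sourcing one makes its
# options available as variables. Demonstrated here with a temporary file.
f=$(mktemp)
cat > "$f" <<'EOF'
BOOTPROTO='static'
STARTMODE='auto'
IPADDR='192.168.1.10/24'
EOF

. "$f"    # source the file to pick up its settings
echo "BOOTPROTO=$BOOTPROTO IPADDR=$IPADDR"
rm -f "$f"
```

Running the sketch prints BOOTPROTO=static IPADDR=192.168.1.10/24.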

Configure a Device Dynamically with DHCP

If you want to configure a device by using a DHCP server, you set the BOOTPROTO option to dhcp. When the device is configured by DHCP, you don't need to set any options for the network address configuration in the file. If there are any IP address settings, they are set in addition to the settings from the DHCP server.

Start and Stop Configured Interfaces

To apply changes to a configuration file, you need to stop the corresponding interface and start it again. You can do this with the ifdown and ifup commands. For example, entering ifdown eth0 disables the device eth0. Entering ifup eth0 enables eth0 again. When the device is restarted, the new configuration is read from the configuration file.

NOTE: Configuring the interfaces with IP addresses, routes, etc., using the ip tool requires an existing device setup, including a correctly loaded kernel module. This is usually done at boot time by udev. If you want more information on this topic, read /etc/sysconfig/hardware/README.hwcfg_and_device_initialisation and the references listed in that file.

Use Additional Tools to Configure Network Settings

In addition to the ip command, a Linux system has several other command line tools that can be used to configure the network setup. You need to be able to

Use the ifconfig Tool to Configure Network Settings on page 33
Use the iwconfig Tool to Configure Wireless Network Settings on page 34
Use the route Tool to Configure the Routing Table on page 35

Use the ifconfig Tool to Configure Network Settings

In older Linux versions, the ifconfig tool was used instead of the ip tool to configure network settings, but it has since been replaced by the ip tool in the scripts that set up the network configuration. Because the ifconfig command is still available on SLES 11 and other current distributions, the following paragraphs explain its use.
To use the ifconfig command, you need to be able to

Use the ifconfig Command to View the Network Configuration on page 33
Use the ifconfig Command to Change the Network Configuration on page 34

Use the ifconfig Command to View the Network Configuration

The ifconfig command without any options displays the current interface configuration:

da1:~ # ifconfig
eth0    Link encap:Ethernet  HWaddr 00:0C:29:7F:82:69
        inet addr:  Bcast:  Mask:
        inet6 addr: fe80::20c:29ff:fe7f:8269/64 Scope:Link
        UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
        RX packets:1142 errors:0 dropped:0 overruns:0 frame:0
        TX packets:665 errors:0 dropped:0 overruns:0 carrier:0
        collisions:0 txqueuelen:1000
        RX bytes: (111.6 Kb)  TX bytes: (126.2 Kb)
        Interrupt:19 Base address:0x2024

lo      Link encap:Local Loopback
        inet addr:127.0.0.1  Mask:255.0.0.0
        inet6 addr: ::1/128 Scope:Host
        UP LOOPBACK RUNNING  MTU:16436  Metric:1
        RX packets:10517 errors:0 dropped:0 overruns:0 frame:0
        TX packets:10517 errors:0 dropped:0 overruns:0 carrier:0
        collisions:0 txqueuelen:0
        RX bytes: (5.2 Mb)  TX bytes: (5.2 Mb)

To view the configuration of a specific device only, add the device name to the command, such as ifconfig eth0. The ifconfig -a command shows all interfaces, including those that are down.

Use the ifconfig Command to Change the Network Configuration

The basic syntax of the ifconfig command looks as follows:

ifconfig interface address options

To configure the IP address of an interface, you would use ifconfig in the following manner:

ifconfig eth0  broadcast  netmask

If eth0 already had an IP address assigned, it is replaced by the new one. To assign a second IP address to the interface, you have to add a designation to the device, as shown in the following example:

ifconfig eth0:1  broadcast  netmask

You can remove the above IP address with the following command:

ifconfig eth0:1 del

Use the iwconfig Tool to Configure Wireless Network Settings

The iwconfig command is similar to the ifconfig command, but dedicated to wireless interfaces.

iwconfig without options beyond the interface name displays the current settings, as in the following:

da-laptop:~ # iwconfig eth0
eth0    unassociated  ESSID:off/any  Nickname:"ipw2100"
        Mode:Managed  Channel=0  Access Point: Not-Associated
        Bit Rate:0 kb/s  Tx-Power:16 dBm
        Retry short limit:7  RTS thr:off  Fragment thr:off
        Encryption key:off
        Power Management:off
        Link Quality:0  Signal level:0  Noise level:0
        Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
        Tx excessive retries:0  Invalid misc:0  Missed beacon:0

In the above example, the interface is not associated with a wireless network. With iwconfig you can set parameters specific to wireless devices, including the ESSID, mode, encryption key, and transmit power.

Use the route Tool to Configure the Routing Table

The route command is used to display and change the kernel routing table. Without options, it displays the current routing table. Use the -n option to display numerical addresses, as shown in the following:

da10:~ # route -n
Kernel IP routing table
Destination  Gateway  Genmask  Flags  Metric  Ref  Use  Iface
  U   eth0
  UG  eth0
  U   eth0
  U   lo
  UG  eth0

The following command adds a route to the /16 network via the specified router:

route add -net netmask gw

You can delete that route again with the following command:

route del -net netmask gw

The default route is set with the following command:

route add default gw
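Note that route expects netmasks in dotted notation, while ip uses prefix lengths. The conversion between the two is mechanical, as this sketch shows (the function name is invented for the example):

```shell
#!/bin/sh
# Convert a prefix length (as used by ip) to the dotted netmask
# notation expected by the legacy route command.
prefix_to_netmask() {
    # Set the top PREFIXLEN bits of a 32-bit value, then print the
    # result as a dotted quad.
    mask=$(( (0xffffffff << (32 - $1)) & 0xffffffff ))
    printf '%d.%d.%d.%d\n' $(( (mask >> 24) & 255 )) $(( (mask >> 16) & 255 )) \
        $(( (mask >> 8) & 255 )) $(( mask & 255 ))
}

prefix_to_netmask 16   # prints 255.255.0.0
prefix_to_netmask 24   # prints 255.255.255.0
```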

Configure the Hostname and Name Resolution

The system hostname and your network's name resolver can be set manually. In this objective, you learn how to do the following:

Set the Host and Domain Name on page 36
Configure Name Resolution on page 36

Set the Host and Domain Name

The hostname is configured in the /etc/hostname file. The content of the file is similar to the following:

da1.digitalairlines.com

The file contains the fully qualified domain name of the system, in this case da1.digitalairlines.com.

Configure Name Resolution

Name resolution is needed to translate host and domain names into IP addresses and vice versa. The components involved on a Linux machine are

The /etc/hosts File on page 36
The /etc/resolv.conf File on page 37
The /etc/nsswitch.conf File on page 37

The /etc/hosts File

The /etc/hosts file contains IP addresses and corresponding host names. Usually it contains only the IP addresses and host name of the machine it resides on, but you can add other entries to it as needed. A typical hosts file found on SLES 11 looks similar to the following:

#
# hosts         This file describes a number of hostname-to-address
#               mappings for the TCP/IP subsystem. It is mostly
#               used at boot time, when no name servers are running.
#               On small systems, this file can be used instead of a
#               "named" name server.
# Syntax:
#
# IP-Address   Full-Qualified-Hostname   Short-Hostname
#
127.0.0.1      localhost
               da1.digitalairlines.com   da1

# special IPv6 addresses
::1            localhost ipv6-localhost ipv6-loopback
fe00::0        ipv6-localnet

ff00::0        ipv6-mcastprefix
ff02::1        ipv6-allnodes
ff02::2        ipv6-allrouters
ff02::3        ipv6-allhosts

The /etc/resolv.conf File

The /etc/resolv.conf file defines the search prefix and the name servers to use. The content of the file is similar to the following:

search digitalairlines.com
nameserver
nameserver
nameserver

The file contains two types of entries:

search. The domain names listed after search are used to complete incomplete host names. For example, if you look up the host name da3, the name is automatically completed to the fully qualified domain name da3.digitalairlines.com.

nameserver. Every entry starting with nameserver is followed by the IP address of a name server. You can configure up to three name servers. If the first name server fails, the next one is used.

The /etc/nsswitch.conf File

The /etc/nsswitch.conf (name service switch) file defines the sequence in which the various databases are queried. The file looks similar to the following:

#
# /etc/nsswitch.conf
#
# An example Name Service Switch config file. This file should be
# sorted with the most-used services at the beginning.
#
# The entry '[NOTFOUND=return]' means that the search for an
# entry should stop if the search in the previous entry turned
# up nothing. Note that if the search failed due to some other reason
# (like no NIS server responding) then the search continues with the
# next entry.
#
# Legal entries are:
#
# compat            Use compatibility setup
# nisplus           Use NIS+ (NIS version 3)
# nis               Use NIS (NIS version 2), also called YP
# dns               Use DNS (Domain Name Service)
# files             Use the local files
# [NOTFOUND=return] Stop searching if not found so far
#
# For more information, please read the nsswitch.conf.5 manual page.

#
# passwd: files nis
# shadow: files nis
# group:  files nis

passwd:     compat
group:      compat

hosts:      files dns
networks:   files dns
...

The hosts: entry is the one relevant for host name resolution. It specifies that to resolve a host name, the /etc/hosts file is consulted first (files), and if there is no applicable entry in the /etc/hosts file, the query is sent to a DNS server (dns).
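The completion performed for the search entry described above can be sketched as follows. This is a simplification: real resolvers also honor options such as ndots and eventually try the literal name as well. The function name is invented for the example:

```shell
#!/bin/sh
# Sketch of search-list completion: a name without a dot is completed
# with each search domain in turn; a qualified name is used as-is.
resolve_candidates() {   # usage: resolve_candidates NAME DOMAIN...
    name=$1; shift
    case $name in
        *.*) printf '%s\n' "$name" ;;   # already contains a dot
        *)   for dom in "$@"; do printf '%s.%s\n' "$name" "$dom"; done ;;
    esac
}

resolve_candidates da3 digitalairlines.com   # prints da3.digitalairlines.com
```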

Objective 2  Understand Linux Network Bridges

A network bridge configured on a Linux host connects network segments at the data link layer (Layer 2 of the OSI model). It provides very similar functionality to a common (Layer 2) switch, with the added benefit of being able to manage traffic. The disadvantages include higher latencies and higher energy consumption in comparison to dedicated switches. Because of these disadvantages, a dedicated switch would most likely be used in a physical network to provide most of the functionality a Linux network bridge provides.

In Linux, bridging is most frequently used to connect the network interfaces of physical and virtual machines, allowing virtual machines to connect to the physical network. However, a good understanding of bridging in physical networks will make it easier for you to understand how virtual machines are connected to physical networks.

To administer Linux network bridges in a physical and a virtual setting involving Xen, you need to

Understand How Bridging Works on page 39
Configure Network Bridges on page 39
Understand the Xen Networking Concept on page 156

Understand How Bridging Works

A bridge connects two or more networks and forwards Ethernet frames arriving on one network interface that are destined for a machine in a network segment attached to one of the other interfaces. The interfaces connected to a bridge run in promiscuous mode to catch frames that are addressed to NICs on the other side of the bridge.

The first time an ARP broadcast for a MAC address associated with a certain IP address is received on a NIC of the bridge, the broadcast is sent out on all other NICs. If an answer is received, this answer is passed back to the NIC where the request originated. The bridge keeps track of the MAC addresses in each network segment. The MAC addresses are kept in a Source Address Table (SAT) to avoid having to send ARP broadcasts to all interfaces.
Instead, Ethernet frames are sent only to the segment where the destination MAC address resides.

The bridge itself does not need an IP address of its own, unless you want to access it over the network. In this case, the bridge (not the eth0 or eth1 interfaces) needs an IP address. This can be a static or a dynamic IP address.

Configure Network Bridges

To configure a network bridge in SLES 11, you need to

Use YaST to Set Up a Bridge on page 40

Administer a Bridge with Command Line Tools on page 42
Make a Bridge Configuration Permanent on page 43

Use YaST to Set Up a Bridge

A bridge can be configured using YaST's Network Settings module. Select YaST > Network Devices > Network Settings, or open a terminal window and enter yast2 lan as the root user. In the Network Settings dialog, select Add. The Hardware Dialog appears:

Figure 1-9  YaST2 Hardware Dialog

From the Device Type dialog, select Bridge, then click Next. The Network Card Setup dialog appears:

Figure 1-10  Bridge: Network Card Setup

In this dialog, you can select Dynamic Address (DHCP), or you can select Statically assigned IP Address and enter the static IP address information for the bridge. In the Bridged Devices frame, select the interfaces you want to include in the bridge. Select Next to return to the Network Settings dialog. It now includes the bridge:

Figure 1-11  Bridge: Network Settings Dialog

Click OK to finish the configuration. YaST writes the configuration to the /etc/sysconfig/network/ifcfg-bridgename file. Its syntax is explained in Make a Bridge Configuration Permanent on page 43.

Administer a Bridge with Command Line Tools

The tool used to administer a Linux bridge is the brctl command. The brctl command is part of the bridge-utils package, which is not included in the default installation. You can install the bridge-utils package using the yast -i bridge-utils command.

To connect the networks attached to the NICs via a bridge, you first have to create the bridge. Then you need to add the interfaces (also referred to as ports) to the bridge, using brctl with the following syntax:

brctl addbr bridge-name
brctl addif bridge-name interface

You can delete interfaces from the bridge with the following command:

brctl delif bridge-name interface

The brctl command also allows you to view the current bridge configuration using the show parameter, as shown in the following:

da10:~ # brctl addbr br0
da10:~ # brctl addif br0 eth1
da10:~ # brctl addif br0 eth0
da10:~ # brctl show
bridge name     bridge id       STP enabled     interfaces
br0             d19f17f4        no              eth0
                                                eth1

To activate the bridge, it must be brought up using the ip command:

da10:~ # ip link set br0 up

If you want to assign an IP address to the bridge, use the ip command in the same way as you would to assign an IP address to an Ethernet interface, using the name of the bridge as the device:

ip add add IP_address/netmask brd + dev bridgename

To delete the br0 bridge, first bring it down with the ip link set br0 down command and then delete it with brctl delbr br0.

Make a Bridge Configuration Permanent

A bridge created with the above commands is not persistent across reboots. To create a bridge automatically when the system boots, a configuration file, such as ifcfg-br0, is needed in the /etc/sysconfig/network/ directory; it replaces any existing configuration files for the Ethernet cards. Its content is similar to the following:

BOOTPROTO='none'
BRIDGE='yes'
BRIDGE_FORWARDDELAY='0'
BRIDGE_PORTS='eth0 eth1'
BRIDGE_STP='off'
STARTMODE='onboot'
USERCONTROL='no'

The above configuration does not assign an IP address to the bridge. To use a DHCP server to get an address, make sure the BOOTPROTO variable is set to dhcp (instead of none). To use a static IP address, edit the file so it looks similar to the following:

BOOTPROTO='static'
IPADDR=' /16'
BRIDGE='yes'
BRIDGE_FORWARDDELAY='0'
BRIDGE_PORTS='eth0 eth1'
BRIDGE_STP='off'
STARTMODE='onboot'
USERCONTROL='no'

NOTE: If you followed the above steps, but the bridge is not working as expected, use the SuSEfirewall2 status command to determine if any packet filter rules are set. If there are, delete them by entering SuSEfirewall2 stop and try again.

Understand the Xen Networking Concept on page 156 explains how bridges are used with Xen.
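The learning behavior of the Source Address Table described in Understand How Bridging Works can be pictured with a small toy model. This is not how the kernel implements it (the real SAT lives in the kernel and also ages entries out); the function names and MAC addresses here are invented for the illustration:

```shell
#!/bin/sh
# Toy model of a bridge SAT: remember the port each source MAC was last
# seen on; frames for a MAC not yet learned must be flooded to all ports.
SAT=""
sat_learn() {   # sat_learn MAC PORT: record that MAC was seen on PORT
    SAT=$(printf '%s\n%s %s' "$SAT" "$1" "$2")
}
sat_lookup() {  # sat_lookup MAC: print the learned port, or "flood"
    # The last matching entry wins, so re-learning updates the table.
    port=$(printf '%s\n' "$SAT" | awk -v m="$1" '$1 == m { p = $2 } END { print p }')
    echo "${port:-flood}"
}

sat_lookup 00:11:22:33:44:55            # prints flood (not learned yet)
sat_learn  00:11:22:33:44:55 eth0
sat_lookup 00:11:22:33:44:55            # prints eth0
```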

Exercise 1-1  Configure a Network Bridge

In this lab, configure a network bridge. You can find this lab in the workbook.

(End of Exercise)

Objective 3 Bond Network Adapters

Bonding of network adapters means that two (or more) physical network links are combined to act as a single logical link. The purpose is to improve the availability or the bandwidth of a connection. Bonding has to be configured on both ends of the link. Therefore, it depends not only on the features Linux offers, but also on what the switch (or router or computer) on the other end of the link supports.

To administer bonded network adapters, you need to

Understand Network Bonding on page 46
Configure Bonding on page 46
Understand How /sbin/ifup Deals with Bonding Devices on page 51
Bonding of Network Interfaces on page 53

Understand Network Bonding

The Linux kernel includes a bonding kernel module that allows you to combine two or more network links into one single logical link. On SLES 11, YaST allows you to easily configure bonding. There are two main purposes for bonding network links:

Improved availability. Bonding can be used to avoid a single point of failure by combining two distinct physical links. These links can take different routes within a building, include separate network components, such as switches, and get their electrical supply from different sources. If one of the physical components of one link fails, the logical connection is unaffected because the other physical link takes over the traffic.

Improved bandwidth. Bonding allows you to add additional bandwidth to an existing link by adding together the bandwidths of the physical links involved.

NOTE: Not all network components, such as switches, support all bonding modes.

Configure Bonding

Bonding can be configured using YaST's Network Settings module. Select YaST > Network Devices > Network Settings or open a terminal window and, as the root user, enter yast2 lan. In the Network Settings dialog, select Add. The Hardware dialog appears:

Figure 1-12 YaST2 Hardware Dialog

From the Device Type dialog, select Bond, then click Next. The Network Card Setup dialog appears:

Figure 1-13 Bonding: Network Card Setup

In this dialog, you can select No IP Address (for Bonding Devices), Dynamic Address (DHCP), or you can select Statically assigned IP Address and enter the static IP address information for the bonding device. The Bond Slaves frame lists the network devices available for bonding. Select the interfaces you want to add.

The Bond Driver Options drop-down menu offers the following choices (the explanations are taken from /usr/src/linux/Documentation/networking/bonding.txt):

mode=active-backup miimon=100. Only one slave in the bond is active. A different slave becomes active if, and only if, the active slave fails. The bond's MAC address is externally visible on only one port (network adapter) to avoid confusing the switch. The miimon=100 parameter specifies the MII (Media Independent Interface, an Ethernet industry standard) link monitoring frequency in milliseconds. This determines how often the link state of each slave is inspected for link failures. A value of zero disables MII link monitoring. A value of 100 is a good starting point. For other parameters that can be used instead of miimon, enter modinfo bonding. Instead of monitoring the link state with miimon, you could, for instance, use arp_ip_target=IP_address arp_interval=milliseconds to monitor if a certain IP address can be

reached. If it can't be reached anymore on the active link, the link switches to the other one. This mode provides fault tolerance.

mode=balance-rr. Round-robin policy: Transmit packets in sequential order from the first available slave through the last. This mode provides load balancing and fault tolerance.

mode=balance-xor. XOR policy: Transmit based on the selected transmit hash policy. The default policy is simple [(source MAC address XOR'd with destination MAC address) modulo slave count]. Alternate transmit policies may be selected via the xmit_hash_policy option, described below. This mode provides load balancing and fault tolerance.

mode=broadcast. Broadcast policy: Transmits everything on all slave interfaces. This mode provides fault tolerance.

mode=802.3ad. IEEE 802.3ad Dynamic link aggregation: Creates aggregation groups that share the same speed and duplex settings. Utilizes all slaves in the active aggregator according to the 802.3ad specification. Prerequisites: 1. Ethtool support in the base drivers for retrieving the speed and duplex of each slave. 2. A switch that supports IEEE 802.3ad Dynamic link aggregation. Most switches will require some type of configuration to enable 802.3ad mode.

mode=balance-tlb. Adaptive transmit load balancing: Channel bonding that does not require any special switch support. The outgoing traffic is distributed according to the current load (computed relative to the speed) on each slave. Incoming traffic is received by the current slave. If the receiving slave fails, another slave takes over the MAC address of the failed receiving slave. Prerequisite: Ethtool support in the base drivers for retrieving the speed of each slave.

mode=balance-alb. Adaptive load balancing: Includes balance-tlb plus receive load balancing (rlb) for IPv4 traffic, and does not require any special switch support. The receive load balancing is achieved by ARP negotiation.
The bonding driver intercepts the ARP Replies sent by the local system on their way out and overwrites the source hardware address with the unique hardware address of one of the slaves in the bond such that different peers use different hardware addresses for the server.

Select Next to return to the Network Settings dialog. It now includes the bond device:

Figure 1-14 Bonding: Network Settings Overview

The resulting ifcfg-bond0 configuration file looks similar to the following:

BONDING_MASTER='yes'
BONDING_MODULE_OPTS='mode=active-backup miimon=100'
BONDING_SLAVE0='eth0'
BONDING_SLAVE1='eth2'
BOOTPROTO='static'
BROADCAST=''
ETHTOOL_OPTIONS=''
IPADDR=' /16'
MTU=''
NAME=''
NETWORK=''
REMOTE_IPADDR=''
STARTMODE='auto'
USERCONTROL='no'

Bonding works without files for the individual interfaces. If there is a file for an interface, it should contain the BOOTPROTO='none' and STARTMODE='off' values, as in the following example of an ifcfg-ethX file:

BOOTPROTO='none'
BROADCAST=''
ETHTOOL_OPTIONS=''
IPADDR=''
MTU=''
NAME='RTL-8139/8139C/8139C+'

NETMASK=''
NETWORK=''
REMOTE_IPADDR=''
STARTMODE='off'
USERCONTROL='no'

To create a bonding configuration without YaST, you have to create an ifcfg-bondX file as above and delete the ifcfg-ethX files of the bonded interfaces.

You can get information on the current bonding configuration, such as which of the devices is active, by viewing files in the /sys/class/net/bondX/bonding/ directory, as in the following:

da10:~ # cat /sys/class/net/bond0/bonding/active_slave
eth1
da10:~ # cat /sys/class/net/bond0/bonding/slaves
eth1 eth2
da10:~ # cat /sys/class/net/bond0/bonding/mode
active-backup 1
da10:~ # cat /sys/class/net/bond0/bonding/miimon
100

Understand How /sbin/ifup Deals with Bonding Devices

The /sbin/ifup command is a shell script that sets up or (if called as /sbin/ifdown) shuts down network interfaces. A peculiarity of a bond device in combination with DHCP is that the device has to be assembled first before a DHCP request can succeed. The ifup script takes care of that by delaying the DHCP request until this has happened, as can be seen in the following output of ifup:

da10:~ # ifup bond0
bond0
bond0 enslaved interface: eth0
bond0 enslaved interface: eth2
Starting DHCP4 client on bond0
bond0 DHCP4 continues in background
da10:~ # ip a s dev bond0
5: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether 00:19:d1:9f:17:87 brd ff:ff:ff:ff:ff:ff
    inet /24 brd scope global bond0
    inet6 fe80::219:d1ff:fe9f:1787/64 scope link
       valid_lft forever preferred_lft forever
da10:~ #

Before a bond device can be shut down, its enslaved devices have to be taken down first. This is also taken care of by ifup (ifdown is a link to ifup):

da10:~ # ifdown bond0
bond0 is still used from interfaces eth0 eth2
eth0 device: Intel Corporation 82566DM Gigabit Network Connection (rev 02)
No configuration found for eth0
Nevertheless the interface will be shut down.
could not find configuration file ifcfg-eth0
eth2 device: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
No configuration found for eth2
Nevertheless the interface will be shut down.
could not find configuration file ifcfg-eth2
bond0 now going down itself
bond0
da10:~ #

For details on how this is done, view the contents of the /sbin/ifup script itself.
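The sysfs files shown above can also be read from small helper functions when scripting failover checks. The following is a sketch, not from the original course material; the sysfs root is a parameter only so the functions can be exercised without a real bond device (it defaults to /sys/class/net):

```shell
#!/bin/sh
# Query bond state via the sysfs files described above.
# Usage: bond_active_slave bond0 [sysfs_root]
bond_active_slave() {
    root=${2:-/sys/class/net}
    cat "$root/$1/bonding/active_slave"   # name of the currently active slave
}

# Usage: bond_slaves bond0 [sysfs_root]
bond_slaves() {
    root=${2:-/sys/class/net}
    cat "$root/$1/bonding/slaves"         # all slaves enslaved to the bond
}
```

On the da10 system shown above, bond_active_slave bond0 would print eth1, matching the cat output in the example.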

Exercise 1-2 Bonding of Network Interfaces

In this lab, combine two physical network interfaces into one logical interface. This lab requires two computers that are connected via two network interfaces. You can find this lab in the workbook.

(End of Exercise)

Objective 4 Configure Virtual Local Area Networks

Switches supporting Virtual Local Area Networks (VLANs) allow you to divide a LAN into smaller sections. The same could be achieved with routers and separate networks, but the approach with VLANs and switches is often easier to implement with the existing physical network infrastructure. This objective explains how Linux supports VLANs based on the IEEE 802.1q standard. You need to understand

How VLANs Work on page 54
How Tagging Works on page 56
Configure SLES 11 with YaST to Support VLANs on page 56

How VLANs Work

To understand VLANs, a clear definition of a LAN is required. A LAN in this context is everything included in a broadcast domain. Hubs and switches are part of the LAN, and the border of the LAN is marked by one or more routers. If you connect a switch to a LAN, the machines connected to that switch become part of the LAN.

The core component of a VLAN is a VLAN-capable switch. A single VLAN switch can be part of multiple LANs. Such switches are sometimes also referred to as Layer 3 switches. They actually do route packets but are more flexible than dedicated routers because they allow, for instance, a host that moved to a different physical network to remain in the same VLAN.

Imagine that you are the network administrator in a small company that has three departments whose computers need to be separated from each other. Without VLANs, you would need a router and a simple switch for each network. With VLANs, all you would need would be one device. At the VLAN-capable switch, you would assign each department a VLAN of its own and add the department's computers to that VLAN.

NOTE: How this is done and how rules are created to allow or deny access from one VLAN to the other depends on the brand of the switch and is beyond the scope of this course.
To connect two VLAN-capable switches, a so-called trunk is used. A trunk is a connection that can carry multiple VLANs. Trunks following the IEEE 802.1q standard mark (tag) an Ethernet frame as belonging to a specific VLAN by a 4-byte tag added to the Ethernet frame, which carries a 12-bit VLAN identifier. Trunks allow VLANs to spread across several switches; the tags are used on either side to identify the VLAN the frame belongs to. A VLAN setup comprising two VLANs and two VLAN switches is illustrated in the following figure:

Figure 1-15 VLANs with Two Switches

The Linux VLAN capabilities allow a SLES 11 server to be configured to be part of both VLANs, as illustrated in the following figure:

Figure 1-16 SLES 11 Serving Two VLANs

The SLES 11 server is connected to the VLAN switch with a trunk connection and configured as covered in Configure SLES 11 with YaST to Support VLANs on

56 Novell Training Services (en) 15 April 2009 SUSE Linux Enterprise Server 11: Certified Linux Engineer 11 / Manual page 56. To be able to address the services provided by SLES 11 (or the virtual machines in the above figure), vlan1 and vlan2 should use different IP networks. NOTE: Some of the information in this section was obtained from ( How Tagging Works Within a single switch, a VLAN can be created by defining which port belongs to which VLAN. On a VLAN spanning several switches using this technique, you would need to add a Link (cable) for each VLAN. To connect switches with a single link (a trunk), tags are added to Ethernet frames to identify the VLAN they belong to. VLAN Tags extend the Ethernet header included with every Ethernet frame by 4 bytes (32 bits): PID - Tag Protocol Identifier (16 bits): Fixed value of 0x8100. Frame carries 802.1q tag information. Priority (3 bits): User priority information. CFI - Canonical Format Indicator (1 bit): Indicates whether the MAC address format is canonical (LSB first, value 0) or non-canonical (LSB last, value 1). VID - VLAN Identifier (12 bits): VLAN identifier of the frame; 12 bits allow you to differentiate 4094 VLANs (values 0 and 4095 may not be used). Only hosts that share the same VLAN ID can communicate with each other. NOTE: How a switch deals with a tagged frame depends on the features of the switch. Simple switches will ignore it and deal with the frame according to its regular Ethernet header. More sophisticated switches that do not have a management interface deal with existing tags within their learning capabilities, but cannot set or remove VLAN tags. Manageable switches understand the VLAN tags and can be configured to add or remove tags on ports, for example, to accommodate hosts that cannot deal with tagged frames. Configure SLES 11 with YaST to Support VLANs To include a SLES 11 server in a VLAN, a VLAN interface needs to be created. 
If the server is supposed to be integrated into several VLANs, an interface has to be created for each of them. VLAN interfaces are connected to a real interface, such as eth0, but this interface does not need to be configured (it shouldn't have an IP address; only the vlanX interfaces should have one). The YaST Network Settings module allows you to add VLAN interfaces. In the Network Settings dialog, select Add. The Hardware dialog appears:

Figure 1-17 VLAN: YaST Hardware Dialog

Select VLAN as Device Type, choose your VLAN number as the Configuration Name, and click Next. The Network Card Setup dialog appears:

Figure 1-18 VLAN: Network Card Setup

Under Real Interface for VLAN, select the network card that is connected to your VLAN. Select Dynamic Address (DHCP) or select Statically assigned IP Address and enter the IP address information. Click Next to return to the Network Settings dialog, which now lists the new vlan interface:

Figure 1-19 VLAN: Network Settings

YaST writes the configuration to the /etc/sysconfig/network/ifcfg-vlanX file, which looks similar to the following:

BOOTPROTO='static'
BROADCAST=''
ETHERDEVICE='eth2'
ETHTOOL_OPTIONS=''
IPADDR=' /24'
MTU=''
NAME=''
NETWORK=''
REMOTE_IPADDR=''
STARTMODE='onboot'
USERCONTROL='no'

The ETHERDEVICE entry defines this as a VLAN configuration using eth2. The command line tool to manage VLAN interfaces is vconfig. To add a VLAN device, use the vconfig add interface VLAN-ID command. Once the VLAN device is created, you can configure it with the ip command.

The following example illustrates this:

da-host:/etc/sysconfig/network # vconfig add eth2 1
Added VLAN with VID == 1 to IF -:eth2:-
WARNING: VLAN 1 does not work with many switches, consider another number if you have problems.
da-host:/etc/sysconfig/network # ip a s vlan1
18: vlan1@eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
    link/ether 00:48:54:51:45:8c brd ff:ff:ff:ff:ff:ff
da-host:/etc/sysconfig/network # ip add add /24 brd + dev vlan1
da-host:/etc/sysconfig/network # ip a s vlan1
18: vlan1@eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
    link/ether 00:48:54:51:45:8c brd ff:ff:ff:ff:ff:ff
    inet /24 brd scope global vlan1

A VLAN interface can be deleted with the vconfig rem vlan_device command.

The /sbin/ifup script deals with VLAN interfaces and their underlying network interfaces much the same way as with bonding interfaces, as covered in Understand How /sbin/ifup Deals with Bonding Devices on page 51. For details, view the contents of the script itself.
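The 802.1q tag layout described under How Tagging Works can be illustrated with shell arithmetic. This is a sketch for understanding the bit layout only (the field values are illustrative); the kernel and vconfig handle the actual tagging:

```shell
#!/bin/sh
# Pack the four 802.1q tag fields into one 32-bit value.
# Layout, high to low: TPID (16 bits, fixed 0x8100), priority (3 bits),
# CFI (1 bit), VID (12 bits).
make_8021q_tag() {   # usage: make_8021q_tag priority cfi vid
    printf '0x%08x\n' $(( (0x8100 << 16) | ($1 << 13) | ($2 << 12) | $3 ))
}

make_8021q_tag 0 0 100   # VID 100, default priority -> 0x81000064
```

The VID occupies the low 12 bits, which is why only values 1 through 4094 identify usable VLANs.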

Exercise 1-3 Configure VLAN Tagging

In this lab, you work with VLAN tags such that only computers that belong to the same VLAN speak to each other. You can find this lab in the workbook.

(End of Exercise)

Summary

Objective: Configure Networking on SLES 11

The YaST module for configuring the network card and the network connection can be found at Network Devices > Network Settings. The following details are then needed to integrate the network device into an existing network:

Method of network setup
IP address and network mask
Hostname
Name server
Routing (gateway)

After you save the configuration with YaST, the Ethernet card should be available in the computer. You can verify this with the ip address show command.

You can perform the following tasks with the ip tool:

Display the IP address setup: ip address show
Display device attributes: ip link show
Display device statistics: ip -s link show
Assign an IP address: ip address add IP_address/netmask brd + dev device_name
Delete an IP address: ip address del IP_address dev device_name
View the routing table: ip route show
Add routes to the routing table: ip route add network/netmask dev device_name
Delete routes from the routing table: ip route del network/netmask dev device_name

The configuration files for network devices are located in /etc/sysconfig/network/. Configured devices can be enabled with ifup device_name and disabled with ifdown device_name. The configuration for the routing table is located in the /etc/sysconfig/network/routes file. The hostname is configured in the /etc/hostname file. Name resolution is configured in the /etc/resolv.conf file.

Objective: Understand Linux Network Bridges

A network bridge provides similar functionality as a switch, with the added benefit of allowing you to manage traffic. Bridges are an essential element of the network configuration used with Xen. A bridge can be set up with YaST. YaST creates the bridge, adds the interfaces to it, and writes the needed configuration files. The brctl command allows a manual configuration, but this configuration does not persist over reboots. It is necessary to create and edit the configuration files, such as ifcfg-br0, in /etc/sysconfig/network/ for the bridge to be created automatically when the system starts.

Objective: Bond Network Adapters

Several physical links can be combined into one logical link to increase the bandwidth or availability of the network connection. YaST allows you to create bond devices and writes the needed configuration files. The /sbin/ifup script makes sure that the bonding device is assembled properly when the system starts. When called as /sbin/ifdown, the script properly shuts down the bonding device.

Objective: Configure Virtual Local Area Networks

VLANs allow you to create smaller sections within a LAN without requiring the additional hardware and network connections for a router. The core component of a VLAN is a VLAN-capable switch. Several such switches are connected by trunks. A trunk is a connection that can carry multiple VLANs. Trunks following the IEEE 802.1q standard mark (tag) an Ethernet frame as belonging to a specific VLAN by a 4-byte tag added to the Ethernet frame, which carries a 12-bit VLAN identifier. Trunks allow VLANs to spread across several switches; the tags are used on either side to identify the VLAN the frame belongs to. VLANs can be configured graphically using YaST or manually using the vconfig command and editing the ifcfg-vlanX file in the /etc/sysconfig/network/ directory to make the configuration permanent across reboots.


SECTION 2 Manage Server Storage

In this section you learn how to implement and manage server-based storage solutions on a SUSE Linux Enterprise Server 11 system.

Objectives

1. Manage SCSI Devices on Linux on page 66
2. Describe How Fibre Channel SANs Work on page
3. Implement a SAN with iSCSI on page 109

Objective 1 Manage SCSI Devices on Linux

On desktop systems, you will usually find either Parallel ATA (PATA) or Serial ATA (SATA) storage systems implemented. In fact, many desktop systems include both storage buses built into the motherboard. While PATA and SATA storage systems work very well in desktop systems, they are inadequate for heavily used server systems that need to be highly available. In servers, you will usually use one of the following for storage:

Internal (or external) storage using RAID arrays based on Small Computer System Interface (SCSI) hard disk drives
External storage via a Storage Area Network (SAN) using RAID arrays based on SCSI hard disk drives

In this objective, you learn how to manage SCSI devices installed in a SUSE Linux Enterprise Server (SLES) 11 system. The following topics will be addressed:

SCSI Concepts on page 66
How SCSI Devices Work on Linux on page 77
Common SCSI Commands on page 83

SCSI Concepts

Internal storage in a server system is usually implemented using a SCSI controller and storage devices. While SCSI can also be used to connect other types of devices, such as scanners and printers, the discussion in this section will be limited to storage devices such as hard disks, tape drives, and optical drives. The following topics will be addressed:

The SCSI Bus on page 66
SCSI IDs and LUNs on page 69
SCSI Bus Termination on page 73
SCSI Standards on page 74

The SCSI Bus

The SCSI standard defines the commands, protocols, and interfaces needed to create a storage I/O bus. This SCSI bus is composed of a chain of devices electrically connected together. A simple bus composed of a SCSI controller and two SCSI hard disks is shown in the figure below:

Figure 2-1 The SCSI Bus

Notice that all devices on the bus are connected by a single cable. The devices on the SCSI bus are managed by the SCSI controller. The SCSI controller also provides an interface between the motherboard and the I/O bus. It has its own BIOS that is loaded when the system is initially powered on. The SCSI controller may be integrated into the motherboard of the server itself or it may be implemented as an expansion board. In the SCSI bus in the figure above, the SCSI controller is implemented as a PCI expansion board that is installed in a PCI expansion slot in your server's motherboard.

Most SCSI controllers have two connectors to which a SCSI cable can be connected:

Internal
External

The internal and external connectors for the SCSI controller shown in Figure 2-1 on page 67 are shown in the figure below:

Figure 2-2 Internal and External SCSI Connectors

The internal connector is used to connect internal SCSI devices, such as hard disk drives and optical drives, to the controller. Internal devices are connected using a single 50- or 68-pin ribbon cable. The external connector on the controller, on the other hand, is used to connect external SCSI devices, such as external drive arrays, to the bus. External SCSI devices usually have two SCSI ports, allowing you to connect multiple external devices together in a chain just as is done with internal SCSI devices. An example of an external SCSI hard disk enclosure with two SCSI ports is shown in the figure below:

Figure 2-3 External SCSI Connectors

The SCSI controller is commonly implemented at one end of the SCSI bus. This configuration is shown in Figure 2-1 on page 67. However, because most SCSI controllers include both internal and external connectors, it's also possible for the SCSI controller to reside in the middle of the bus. In this configuration, internal SCSI devices are connected to the internal connector on the controller while one or more external devices are connected to the external connector.

Depending upon the type of SCSI system, you can connect 8, 16, or 32 devices to the SCSI bus. However, remember that the SCSI controller itself always counts as a device. Therefore, you can effectively connect 7, 15, or 31 devices to the SCSI bus, depending upon which SCSI standard you are using.

SCSI IDs and LUNs

Devices on the SCSI bus are identified by a unique identifier called the SCSI ID. The ID number is used by the SCSI controller to route data on the bus to or from the appropriate device. Depending upon the SCSI standard you are using, you can use the following values for a device's SCSI ID:

By default, the SCSI controller is usually assigned a SCSI ID of 7 by most manufacturers. However, this value is usually configurable if you want to assign it a value other than 7. It's important to note that the SCSI ID does not identify a device's location on the SCSI bus. You can assign any device any SCSI ID as long as no other device is using that ID number. For example, in the example below, the first hard disk on the SCSI bus is assigned an ID of 1, the second hard disk is assigned a value of 0, and the controller is assigned a value of 7:

Figure 2-4 Assigning SCSI ID Numbers

While the ID does not identify a device's location, it does specify the device's priority on the SCSI bus. For the first 8 IDs, higher SCSI ID numbers have higher priority on the bus. Therefore, ID 7 has the highest priority on the bus while 0 has the lowest priority. If the SCSI standard you are using allows 16 devices on the bus, then the next range of IDs from 8 to 15 also uses the highest ID in the range (15) as the highest priority. However, the entire range of IDs has a lower priority than the range of IDs from 0 to 7. In other words, the overall sequence of SCSI ID priorities for a bus that supports up to 16 devices is 7, 6, 5, 4, 3, 2, 1, 0, 15, 14, 13, 12, 11, 10, 9, 8.

SCSI IDs and their associated priority levels are used to determine which device has control of the SCSI bus. This process is called arbitration. If multiple devices try to access the SCSI bus at the same time, the device with the highest priority takes precedence. The device with a lower priority must then wait until the higher priority device releases control of the SCSI bus.

You can assign any device on the SCSI bus any available ID number. However, there are two general strategies used for associating specific types of devices with certain

SCSI ID numbers. The strategy you choose will depend upon the role the system plays in your network. One strategy is to assign slower devices to higher-priority SCSI IDs to prevent faster devices from monopolizing the bus. Using this strategy, the SCSI controller is usually assigned the highest-priority ID number (7). Then, faster devices such as hard disk drives are assigned to IDs 0-2. Optical drives are assigned to IDs 3-4. Finally, slower devices such as tape drives, scanners, and printers are assigned IDs 5-6.

The second strategy tends to be the opposite of the first. Instead of assigning slow devices to high-priority SCSI IDs, you assign your highest-priority devices to the highest-priority SCSI IDs. For example, if you have SCSI devices on the bus that can't tolerate interruptions when transferring data (such as a DVD burner), they could be assigned a higher priority number. Likewise, you may choose to assign the hard disk drives in a file server system to a higher priority ID number.

In addition to a SCSI ID number, each device on the SCSI bus is assigned at least one Logical Unit Number (LUN). Some SCSI devices have a single LUN assigned while more complex devices may have multiple LUNs assigned. LUNs allow multiple SCSI devices to be addressed via a single ID. Each SCSI ID number can be further subdivided into LUNs 0 through 15. LUNs allow you to connect multi-component devices such as a drive array to your SCSI bus. The array itself is assigned only a single SCSI ID. However, each disk within the array is assigned a separate LUN, allowing each disk to be addressed individually.

The SCSI ID assigned to a particular device on the bus can be configured in a variety of ways. External devices are commonly configured using a rotating switch.
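The arbitration order described above (7 down to 0, then 15 down to 8) can be expressed as a small helper function. This is an illustrative sketch only, not part of the original course material or any real administration tool:

```shell
#!/bin/sh
# Position of a SCSI ID in the 16-device arbitration sequence
# 7,6,5,4,3,2,1,0,15,14,13,12,11,10,9,8 (rank 0 = highest priority).
scsi_priority_rank() {   # usage: scsi_priority_rank id
    if [ "$1" -le 7 ]; then
        echo $(( 7 - $1 ))        # IDs 0-7: higher ID, higher priority
    else
        echo $(( 8 + 15 - $1 ))   # IDs 8-15: always below the 0-7 range
    fi
}

scsi_priority_rank 7   # -> 0: the controller's default ID wins arbitration
scsi_priority_rank 8   # -> 15: the lowest-priority ID on a 16-device bus
```

This makes the asymmetry visible: ID 8 loses arbitration to every other ID on the bus, even ID 0.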
In the figure below, an external drive array has been configured to use SCSI ID number 3:

Figure 2-5 Setting the SCSI ID on an External Drive Array

Some internal SCSI devices have their SCSI ID number set using three jumpers. Each jumper has a corresponding value assigned to it:

Jumper 0 = 1
Jumper 1 = 2
Jumper 2 = 4

A jumper that is shunted contributes its corresponding value to the SCSI ID number. A jumper that is not shunted does not contribute a value to the ID number. Consider the internal SCSI hard drive shown in the figure below:

Figure 2-6 Setting the SCSI ID on an Internal Hard Drive

The legend indicates that the configuration pins are numbered from 1-12 from right to left. It also indicates that pins 2, 3, and 4 are used to configure the SCSI ID number. Pin 4 sets ID Bit 0, pin 3 sets ID Bit 1, and pin 2 sets ID Bit 2. Looking closely at the pins in the figure, you can see that pin 3 is shunted, but pin 2 and pin 4 are not. The SCSI ID number is therefore calculated as follows:

Jumper 0 = 0 (because it isn't shunted)
Jumper 1 = 2 (because it is shunted)
Jumper 2 = 0 (because it isn't shunted)

Adding together all the values created by the jumpers, the internal hard drive in Figure 2-6 on page 72 has a SCSI ID of 2 assigned.

Some other types of SCSI devices are software configured. For example, most newer SCSI controllers have a small setup program included in their BIOS. The setup program is accessed by pressing a specified key sequence during system boot. You

can then use the menu-driven interface provided by the program to configure the SCSI ID number that the controller board will use.

SCSI Bus Termination

Because all SCSI devices on the SCSI bus connect to a shared medium, steps must be taken to capture and absorb electrical signals when they reach either end of the bus cabling to prevent them from reflecting back down the bus and corrupting data. This is called termination. It is implemented by configuring a terminating resistor at each end of the SCSI bus (but not in between).

NOTE: Any devices connected to the SCSI bus after the terminator will not be visible to the controller. Mis-configured SCSI IDs or mis-configured termination account for a very large proportion of SCSI implementation problems.

Termination on the SCSI bus has been implemented in a variety of ways over the years, including the following:

Off the SCSI Device. This type of termination involves connecting a terminator to the end of the SCSI bus. The way it is done depends upon whether you are terminating an internal or external SCSI bus:

Internal: To terminate an internal SCSI bus, connect a terminator to the end of the internal SCSI ribbon cable.
External: Insert a terminator plug in the open port of the last device on the external SCSI bus.

On the SCSI Device. Termination can also be configured on a SCSI device itself. This is done on the last physical device on the bus and can be accomplished using the following methods:

Resistor Packs. Older SCSI devices may be terminated by inserting resistor packs in a specified socket on the device. Two types of commonly used resistor packs are shown below:

Figure 2-7 SCSI Terminators

Jumpers. Some SCSI devices provide a single jumper that, when shunted, enables termination on the device. In the figure below, pin 6 is used to enable or disable termination:

Figure 2-8 Using Jumpers to Enable Termination

In the figure above, you can see that pin 6 (numbered from right to left) does not have a shunt installed. Therefore, termination is disabled on this hard disk drive.

Software Termination. Some SCSI devices, especially SCSI controllers, use software to turn termination off and on. For SCSI controllers, you can run the setup program from the controller's BIOS and use the interface presented to enable or disable termination.

NOTE: Termination is usually enabled by default on most SCSI controllers. If you connect both internal and external devices to the controller, you must disable termination on the controller because it would reside in the middle of the SCSI bus in this situation.

SCSI Standards

The SCSI standard has been around for many years. As such, it has evolved and improved as time has passed. You need to be familiar with the following SCSI standards:

SCSI-I on page 75

SCSI-II on page 75
SCSI-III on page 76
Serial Attached SCSI (SAS) on page 77

SCSI-I

One of the earliest implementations of SCSI was the SCSI-I standard, adopted in The SCSI-I standard used an 8-bit bus that ran at 5 MHz and supported transfer speeds up to 5 MBps. SCSI-I supported a maximum of 8 devices (including the controller board). The maximum total cable length for the entire bus was 6 meters.

One of the problems with SCSI-I was the fact that it didn't include standardized device commands. As a result, SCSI controllers and SCSI devices frequently had to be made by the same manufacturer for them to work together correctly.

SCSI-II

To address the problems encountered with SCSI-I, the SCSI-II standard was released in the early 1990s. It specified a common command set and standard connectors to rectify the lack of standardization experienced with SCSI-I. It also defined several different bus widths:

8-bit (Narrow)
16-bit (Wide)
32-bit (Rarely implemented)

It also defined two different bus speeds:

5 MHz
10 MHz (Fast)

Because the SCSI-II standard offers three different bus widths and two different bus speeds, there are actually multiple sub-definitions within the standard based on the combination of the two. These are shown in the table below:

Table 2-1 SCSI-II Standards

Name                 Bus Width   Bus Speed   Data Transfer Rate
SCSI-II              8-bit       5 MHz       5 MBps
Fast SCSI-II         8-bit       10 MHz      10 MBps
Wide SCSI-II         16-bit      5 MHz       10 MBps
Fast Wide SCSI-II    16-bit      10 MHz      20 MBps
Double-Wide SCSI-II  32-bit      10 MHz      40 MBps

SCSI-II also introduced a new signaling technology for the SCSI bus. SCSI-I and SCSI-II both support a data transfer scheme called Single-Ended (SE). SE signaling uses one wire per data bit and supports a maximum overall bus length of 6 meters.
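The rates in Table 2-1 follow directly from the bus geometry: each clock cycle transfers one bus-width of data, so MBps = (width in bits / 8) x MHz. A quick sketch of that arithmetic follows; the transfer_rate function is a hypothetical helper for checking the table, not a standard utility.

```shell
# Data transfer rate in MBps = (bus width in bits / 8) * bus speed in MHz.
# transfer_rate is a hypothetical helper used only to verify Table 2-1.
transfer_rate() {
    echo $(( $1 / 8 * $2 ))
}

transfer_rate 8 5     # SCSI-II:             5 MBps
transfer_rate 16 10   # Fast Wide SCSI-II:   20 MBps
transfer_rate 32 10   # Double-Wide SCSI-II: 40 MBps
```

The same calculation reproduces every row of Table 2-2 as well.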

Because only one wire is used per data bit, bus lengths longer than 6 meters tend to experience increasing amounts of electromagnetic interference, and data throughput slows down.

To address this, SCSI-II introduced a second data transfer scheme called High-Voltage Differential (HVD), which is sometimes referred to as simply differential SCSI. Unlike SE, HVD uses two wires per data bit implemented in a manner that reduces electromagnetic interference and cross-talk. As a result, the length of the SCSI bus can be significantly increased, up to 25 meters.

WARNING: SE and HVD SCSI device connectors appear similar. However, they are completely incompatible. If you mix SE and HVD devices on the same SCSI bus, you will destroy the devices.

The SCSI-II standard allows up to 16 devices on the SCSI bus (including the controller).

SCSI-III

The SCSI-III standard, adopted in the late 1990s, is really an umbrella standard composed of many smaller SCSI standards. Like SCSI-II, SCSI-III specifies two bus widths:

8-bit
16-bit (Wide)

However, SCSI-III identifies three new bus speeds:

20 MHz (Ultra SCSI)
40 MHz (Ultra-2 SCSI)
80 MHz (Ultra-3 SCSI)

As with SCSI-II, SCSI-III implementations are created by combining a particular bus width with a specific bus speed. These are listed below:

Table 2-2 SCSI-III Standards

Name                 Bus Width   Bus Speed   Data Transfer Rate
Ultra SCSI-III       8-bit       20 MHz      20 MBps
Wide Ultra SCSI-III  16-bit      20 MHz      40 MBps
Ultra-2 SCSI         8-bit       40 MHz      40 MBps
Wide Ultra-2 SCSI    16-bit      40 MHz      80 MBps
Ultra-3 SCSI         8-bit       80 MHz      80 MBps
Wide Ultra-3 SCSI    16-bit      80 MHz      160 MBps

Like SCSI-II, SCSI-III supports SE or HVD signaling. These data transmission schemes are implemented in Ultra SCSI-III and Wide Ultra SCSI-III. However, SCSI-III also introduces a new signaling mechanism called Low-Voltage Differential

(LVD). LVD works in much the same manner as HVD, but uses less power and allows higher data transfer rates. LVD or HVD can be used with Ultra-2 SCSI and Wide Ultra-2 SCSI. Only LVD is used with Ultra-3 SCSI and Wide Ultra-3 SCSI.

As such, Ultra SCSI-III, Wide Ultra SCSI-III, Ultra-2 SCSI, and Wide Ultra-2 SCSI all support a maximum bus length up to 25 meters if differential signaling is used. However, bus lengths of only meters are supported if SE signaling is used. Ultra-3 SCSI and Wide Ultra-3 SCSI support a maximum bus length up to 12 meters.

Like SCSI-II, most SCSI-III standards support a maximum of 16 devices on the SCSI bus (including the controller). However, Ultra-3 SCSI and Wide Ultra-3 SCSI both support up to 32 devices on the bus.

Serial Attached SCSI (SAS)

SAS is the latest (and fastest) version of SCSI currently available, offering data transfer rates of 3 Gbps and 6 Gbps. SAS uses a serial communication bus instead of parallel (as used in prior SCSI standards) and works in much the same manner as Serial ATA (SATA). Even though it uses a different bus, SAS still uses the standard SCSI command set. However, SAS does differ from traditional parallel SCSI in several key ways:

Like SATA, SAS uses point-to-point connections instead of multi-drop connections on a ribbon cable. Each SAS device has a dedicated connection to the SAS controller.

Because SAS uses point-to-point connections, there is no need for termination.

Because SAS uses point-to-point connections, each device has full access to all of the available bandwidth on the bus. SAS devices don't have to share the media as parallel SCSI devices do.

Using expanders, a SAS domain can contain up to 16,256 devices.

SAS devices don't use ID numbers. Each SAS device in a SAS domain has a globally unique identifier that is assigned by the device manufacturer (much like an Ethernet MAC address).
This identifier is called the World Wide Name (WWN); it is sometimes called the SAS Address. The WWN uniquely identifies the device in the SAS domain just as the SCSI ID uniquely identifies a device on a parallel SCSI bus.

NOTE: SAS devices are not compatible with parallel SCSI devices. However, you can connect SATA drives to a SAS controller board.

How SCSI Devices Work on Linux

The Linux kernel uses a three-layer architecture to manage the SCSI subsystem, shown in the figure below:

Figure 2-9 Linux SCSI Architecture

Each SCSI operation, such as reading or writing to or from a storage device, uses a driver from each layer in the architecture.

Layer 3 provides drivers specific to the type of SCSI device being addressed. The sd (sd_mod.o) and sr (sr_mod.o) modules provide block device interfaces, while the st (st.o) and sg (sg.o) modules provide character device interfaces. The sd interface is used to access SCSI hard drives. The sr interface is used to access optical devices. The st interface is used to access tape drives. The sg module provides a generic SCSI interface that uses a char device interface. It is used to communicate with SCSI devices such as scanners, optical disc burners, and audio CDs. It can also be used to address many other types of non-SCSI devices, including:

USB storage devices
FireWire (IEEE 1394) devices
ATAPI optical drives

The sg interface uses pseudo SCSI device drivers to bridge between the native bus protocol and the SCSI subsystem. This allows upper-level SCSI device drivers to access these non-SCSI devices.

Layer 2 of the SCSI subsystem architecture is used by all SCSI operations. It provides internal interfaces and services common to all upper and lower layer drivers.

Layer 1 drivers provide the actual host bus adapter interfaces used by the upper-layer drivers and are specific to the actual hardware installed in the system.

Accordingly, SCSI hard disk drives in a Linux system are accessed through a device file in /dev that begins with sd. For example, the first SCSI hard disk in the system is accessed through /dev/sda, the second through /dev/sdb, and so on. The Linux kernel currently supports up to 128 SCSI hard disks in a given system. The partitions on each SCSI hard disk are accessed by a device file in /dev that uses the hard disk name concatenated with the partition number.
For example, the first partition on the first SCSI disk in the system is accessed through /dev/sda1. The second partition on that same disk is accessed through /dev/sda2, and so on. The Linux kernel currently supports up to 15 partitions per disk.
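The lettering continues past /dev/sdz with two-letter names (sdaa, sdab, and so on), which is how the kernel accommodates up to 128 disks. A sketch of that naming scheme follows; sd_name is a hypothetical helper written for illustration, not part of the SCSI subsystem.

```shell
# Map a zero-based disk index to its sd device name: sda..sdz, then sdaa, sdab, ...
# sd_name is a hypothetical helper for illustration only.
sd_name() {
    letters=abcdefghijklmnopqrstuvwxyz
    i=$1 suffix=""
    while [ "$i" -ge 0 ]; do
        c=$(printf '%s' "$letters" | cut -c $(( i % 26 + 1 )))
        suffix="$c$suffix"
        i=$(( i / 26 - 1 ))
    done
    echo "sd$suffix"
}

sd_name 0    # sda  (first SCSI disk)
sd_name 25   # sdz
sd_name 26   # sdaa (naming continues with two letters)
```

Appending a partition number to the result (for example, sda1) gives the partition device name described above.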

While IDE hard disks on past Linux distributions had a naming scheme of /dev/hda, /dev/hdb, etc., they now follow the same SCSI naming scheme as well. This is also true for SATA hard disks.

Optical SCSI storage devices in a Linux system are also accessed through a device file in /dev/. These device filenames begin with scd.

NOTE: In earlier Linux distributions, the device file name for optical devices began with sr.

The first SCSI optical storage drive is accessed through /dev/scd0, the second through /dev/scd1, and so on.

SCSI tape drives in a Linux system are also accessed through a device file in /dev/ using filenames that begin with st. For example, the first tape drive is accessed through /dev/st0.

Generic SCSI devices, such as scanners, are mapped to a device file in /dev/ that begins with sg; for example, /dev/sg0. The sg driver is capable of recognizing 256 SCSI devices.

NOTE: All SCSI hard disk drives (but not the partitions they contain), SCSI optical drives, and SCSI tape drives are also mapped to an sg device file in /dev/.

Non-SCSI devices that are accessed through the SCSI subsystem using a pseudo driver (such as USB storage devices) are accessed using the appropriate device file name in /dev/. For example, an external USB hard drive may be mapped to /dev/sdb (assuming there is one other hard drive installed in the system). The device file name for this type of device begins with sd because it's managed by the sd module in the SCSI subsystem. An external FireWire hard drive would be mapped in the same manner. An ATAPI optical drive would be mapped to /dev/scd0 (assuming no other SCSI optical drives are installed). Again, because the device is an optical drive and is managed by the sr driver, its device filename in /dev/ begins with scd.
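The prefix-per-device-class rule described above can be summarized in a small lookup; scsi_prefix is a hypothetical helper written for illustration only.

```shell
# Device-file prefix used by the SCSI subsystem for each device class,
# as described above. scsi_prefix is a hypothetical lookup helper.
scsi_prefix() {
    case $1 in
        disk)    echo sd  ;;   # /dev/sda, /dev/sdb, ...
        optical) echo scd ;;   # /dev/scd0, /dev/scd1, ...
        tape)    echo st  ;;   # /dev/st0, /dev/st1, ...
        generic) echo sg  ;;   # /dev/sg0, /dev/sg1, ...
    esac
}

scsi_prefix optical   # scd
scsi_prefix tape      # st
```

The same prefixes apply to non-SCSI devices routed through the subsystem: a USB disk gets an sd name, an ATAPI optical drive an scd name.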
To effectively manage SCSI devices on a Linux system, you also need to be familiar with how they are identified by the kernel and the drivers in the SCSI subsystem. Four identifiers are used to uniquely identify each device in the system:

Figure 2-10 Identifying SCSI Devices

The Host Bus Adapter (HBA) number is assigned by the kernel in ascending order starting with 0. The first SCSI HBA in the system is assigned to scsi0, the second is assigned to scsi1, and so on. Each HBA controls one or more SCSI buses. Each bus connected to a given SCSI controller is also assigned a number, again beginning with 0.

Accordingly, each SCSI bus has one or more SCSI devices connected to it. The HBA itself is called an initiator and is assigned a SCSI ID number (7 by default). The initiator manages the various SCSI devices connected to its bus, which are referred to as targets. The SCSI ID assigned to each target is also included in the identifier.

Each device on the bus that is assigned a SCSI ID can also be assigned multiple LUNs. These are commonly used by devices such as tape drives and drive arrays that contain either multiple media (such as tapes) or multiple storage devices. The LUN is also included in the identifier.

The Linux kernel uses these four numbers to identify a specific device on a SCSI bus in the system using the convention shown below:

Host, Bus, Target, LUN

For example:

scsi0,00,00,00

This identifier refers to the first hard drive (assigned SCSI ID 0 with a default LUN of 0) on the first SCSI controller installed in the system.
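The same four numbers appear on the Host: line of each device entry in /proc/scsi/scsi, so they can be pulled apart with ordinary shell word splitting. A minimal sketch, using a sample line in that format:

```shell
# Extract the four identifier fields (Host, Bus/Channel, Target Id, LUN)
# from a /proc/scsi/scsi-style "Host:" line. The sample line below is
# illustrative data, not read from a live system.
line='Host: scsi0 Channel: 00 Id: 00 Lun: 00'

set -- $line                 # split the line on whitespace
host=$2 bus=$4 target=$6 lun=$8

echo "$host,$bus,$target,$lun"   # scsi0,00,00,00
```

The echoed value matches the Host, Bus, Target, LUN convention shown above.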

You can view very useful information about the SCSI subsystem on Linux by viewing the contents of the appropriate files in the /proc pseudo file system. These files are located in the scsi subdirectory of /proc/. One of the most frequently used files is the /proc/scsi/scsi file. It contains a list of SCSI devices currently recognized by the SCSI subsystem. To view its contents, you can use the cat /proc/scsi/scsi command at the shell prompt. An example is shown below:

DA1:~ # cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: VMware,  Model: VMware Virtual S Rev: 1.0
  Type:   Direct-Access                    ANSI SCSI revision: 02
Host: scsi2 Channel: 00 Id: 00 Lun: 00
  Vendor: NECVMWar Model: VMware IDE CDR10 Rev: 1.00
  Type:   CD-ROM                           ANSI SCSI revision: 05

As you can see in the example above, the first line in the output for each device contains the identifier discussed previously. The second and third lines contain device-specific information the kernel queried from the device when it came online.

Information about your HBA driver can also be found in the /proc/scsi/ directory. Each driver has its own subdirectory within /proc/scsi/ named after itself (for example: mptspi or BusLogic). Within this subdirectory is a file named after the host number assigned to the HBA (such as 0, 1, and so on) that contains information about the driver and HBA.

NOTE: The actual information contained in this file depends on the individual driver.

As with the /proc/scsi/scsi file, you can view the contents of this file using the cat command. An example is shown below:

da1:~ # cat /proc/scsi/mptspi/0
ioc0: LSI53C1030 B0, FwRev= h, Ports=1, MaxQ=128
da1:~ #

Information about SCSI optical devices is contained in the /proc/sys/dev/cdrom/ directory. This directory contains the following files:

autoclose
autoeject
check_media
debug
info
lock
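In the /proc/scsi/scsi listing shown earlier, each recognized device contributes exactly one "Host:" line, so counting those lines counts attached devices. A small sketch, using a saved copy of such a listing (/tmp/scsi_sample is a hypothetical file created here for illustration):

```shell
# Each device recognized by the SCSI subsystem contributes one "Host:" line
# to /proc/scsi/scsi, so counting those lines counts attached devices.
# /tmp/scsi_sample is a hypothetical saved copy of the listing.
cat > /tmp/scsi_sample <<'EOF'
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: VMware,  Model: VMware Virtual S Rev: 1.0
Host: scsi2 Channel: 00 Id: 00 Lun: 00
  Vendor: NECVMWar Model: VMware IDE CDR10 Rev: 1.00
EOF

grep -c '^Host:' /tmp/scsi_sample   # 2
```

On a live system you would run grep -c '^Host:' against /proc/scsi/scsi itself.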

If you want to view information about your optical device, you can use the cat command to view the contents of the info file in this directory. An example is shown below:

da1:~ # cat /proc/sys/dev/cdrom/info
CD-ROM information, Id: cdrom.c /12/17
drive name:             sr0
drive speed:            1
drive # of slots:       1
Can close tray:         1
Can open tray:          1
Can lock tray:          1
Can change speed:       1
Can select disk:        0
Can read multisession:  1
Can read MCN:           1
Reports media changed:  1
Can play audio:         1
Can write CD-R:         0
Can write CD-RW:        0
Can read DVD:           0
Can write DVD-R:        0
Can write DVD-RAM:      0
Can read MRW:           1
Can write MRW:          1
Can write RAM:          1
da1:~ #

The info file shown above is not writable by the user. However, the other files in this directory can be used to configure various aspects of the optical disc system. For example, suppose the autoclose file contains a value of 1, as shown below:

da1:~ # cat /proc/sys/dev/cdrom/autoclose
1
da1:~ #

A value of 1 in this file indicates that the autoclose feature is enabled for your optical device. If you want to disable this feature, you can use the echo command to write a value of 0 to the file. To do this, you would enter the following command at the shell prompt (as root):

echo "0" > /proc/sys/dev/cdrom/autoclose

The sg module also writes information to files in the /proc pseudo file system in the /proc/scsi/sg/ directory, which contains the following files:

/proc/scsi/sg/debug. Displays the state of the sg driver.

/proc/scsi/sg/def_reserved_size. Contains sg module load parameters.

/proc/scsi/sg/devices. Contains a table of numeric device data. The first row of the output contains information about /dev/sg0, the second row contains information about /dev/sg1, and so on.

/proc/scsi/sg/device_hdr. Contains column headers for the data in the devices file above.

/proc/scsi/sg/device_strs. Contains a list of SCSI devices managed by the sg driver obtained from the SCSI INQUIRY command.

/proc/scsi/sg/version. Contains the sg module version number.

Common SCSI Commands

You can use the following utilities from the shell prompt to manage SCSI devices on a Linux system:

dd. The dd command is normally used to convert and copy files in the Linux file system. However, it can also be used to evaluate the performance of your SCSI hard disks and optical drives. For example, if you wanted to test to see how long it takes to read 1 GB of information from the first SCSI hard disk drive (/dev/sda) in your system, you could enter the following command at the shell prompt (as root):

da1:/ # time dd if=/dev/sda of=/dev/null bs=512 count=
records in
records out
bytes (1.0 GB) copied, s, 14.6 MB/s

real    1m8.336s
user    0m5.536s
sys     0m50.531s

The dd command reads one 512-byte sector at a time in sequence, starting from block 0, and writes it to the /dev/null device until blocks have been read (1 GB). The output from the time command displays timing statistics for the program being run (in this case, dd). These statistics consist of:

The elapsed real time between invocation and termination
The user CPU time
The system CPU time

NOTE: You can also perform this test using the sg_dd command instead of dd.

lsscsi. The lsscsi command lists SCSI devices and their attributes. It uses information from sysfs to display a list of SCSI devices currently recognized by the system. Sample output from this command is displayed below:

da1:/ # lsscsi
[0:0:0:0]  disk    VMware,  VMware Virtual S  1.0  /dev/sda
[2:0:0:0]  cd/dvd  NECVMWar VMware IDE CDR         /dev/sr0

sg_scan. The sg_scan command scans for sg devices and outputs a line of information for each sg device that is currently bound to a SCSI bus. An example is shown below:

da1:/ # sg_scan
/dev/sg0: scsi0 channel=0 id=0 lun=0
/dev/sg1: scsi2 channel=0 id=0 lun=0 [em]
da1:/ #

You can use the -i option with the sg_scan command to send an INQUIRY command and output results in a second (indented) line. You can also use the -x option with sg_scan to display queue depth information in the output. Examples of these two options are shown below:

da1:/ # sg_scan -i
/dev/sg0: scsi0 channel=0 id=0 lun=0
    VMware,  VMware Virtual S  1.0 [rmb=0 cmdq=1 pqual=0 pdev=0x0]
/dev/sg1: scsi2 channel=0 id=0 lun=0 [em]
    NECVMWar VMware IDE CDR [rmb=1 cmdq=0 pqual=0 pdev=0x5]
da1:/ # sg_scan -x
/dev/sg0: scsi0 channel=0 id=0 lun=0 cmd_per_lun=7 queue_depth=32
/dev/sg1: scsi2 channel=0 id=0 lun=0 [em] cmd_per_lun=1 queue_depth=1
da1:/ #

sg_map. The sg_map command displays mappings between sg and other SCSI devices. It can be difficult to determine which actual SCSI device an sg device file, such as /dev/sg0, refers to. The sg_map command loops through all of the sg devices and finds the corresponding SCSI hard disk, optical drive, or tape drive. However, be aware that some SCSI devices, such as scanners, have no mapped SCSI device name other than their sg device name. An example of using the sg_map command is shown below:

da1:/ # sg_map
/dev/sg0  /dev/sda
/dev/sg1  /dev/scd0

You can use the following options with the sg_map command:

-i. Causes sg_map to send an INQUIRY command and report each device's vendor, product, and revision strings.

-x. Causes sg_map to display each device's identifier after each active sg device name using the following syntax: host_number bus_number scsi_id lun

Examples of using these options with sg_map are shown below:

da1:/ # sg_map -i
/dev/sg0  /dev/sda   VMware,  VMware Virtual S  1.0
/dev/sg1  /dev/scd0  NECVMWar VMware IDE CDR

da1:/ # sg_map -x
/dev/sg  /dev/sda
/dev/sg  /dev/scd0
da1:/ #

sg_rbuf. The sg_rbuf command can be used to measure the throughput of your SCSI bus. It reads data with the READ BUFFER command and then discards it. It is assumed that the data is being read from the disk's memory cache, which is faster than reading data from the media itself. For example, to test the device that is mapped to /dev/sg0, you could enter the following command at the shell prompt (as root):

time sg_rbuf /dev/sg0

sg_test_rwbuf. The sg_test_rwbuf command tests the SCSI HBA by sending write and read operations to a device's buffer and then calculating checksums. A random pattern is written to the data buffer on the device and then read back. If the same pattern is found, Success is reported. If they do not match, an error is reported and up to 24 bytes from the first point of mismatch are reported. The first line shows the data that was written while the second line displays the data that was received. The syntax for using sg_test_rwbuf is as follows:

sg_test_rwbuf -s size -q scsi_device

The -s option specifies the size of the buffer (in bytes) to be written and then read. This parameter must be less than or equal to the size of the device's data buffer. Alternatively, you can just use the -q option, which issues a READ BUFFER command to determine the available data buffer length and offset.
It prints this information on the screen and then exits without running the write/read tests.

sg_turs. The sg_turs command sends a specified number of TEST UNIT READY commands to a specific SCSI device. This can be useful for testing command overhead. The TEST UNIT READY command sends a 6-byte command and then receives a SCSI status value from the device it was sent to.

The syntax for using sg_turs is as follows:

sg_turs -n number -t -v scsi_device

The -n option specifies the number of commands to send to the device. The -p option shows a progress indicator. The -t option displays the total duration and the average number of commands executed per second after completing the requested number of commands.

sg_inq. The sg_inq command sends the SCSI INQUIRY command to a SCSI device and displays the data returned. The syntax for using sg_inq is as follows:

sg_inq scsi_device

An example of using the sg_inq command is shown below:

da1:/ # sg_inq /dev/sg0
standard INQUIRY:
  PQual=0  Device_type=0  RMB=0  version=0x02  [SCSI-2]
  [AERC=0]  [TrmTsk=0]  NormACA=0  HiSUP=0  Resp_data_format=2
  SCCS=0  ACC=0  TPGS=0  3PC=0  Protect=0  BQue=0
  EncServ=0  MultiP=0  [MChngr=0]  [ACKREQQ=0]  Addr16=0
  [RelAdr=0]  WBus16=1  Sync=1  Linked=0  [TranDis=0]  CmdQue=1
  length=36 (0x24)  Peripheral device type: disk
Vendor identification: VMware,
Product identification: VMware Virtual S
Product revision level: 1.0
da1:/ #

sginfo. The sginfo command is a very powerful and useful SCSI tool. It is used to access mode page information for a SCSI device in your system. You can use the following options with sginfo:

-a. Displays INQUIRY data and the unit serial number followed by all mode pages reported by the device.

-A. Displays INQUIRY data and the unit serial number followed by all mode pages and all mode subpages reported by the device.

-c. Displays information in the Caching mode page.

-C. Displays information in the Control mode page.

-d. Displays defect lists.

-D. Displays information in the Disconnect-Reconnect mode page.

-e. Displays information in the Error Recovery mode page.

-E. Displays information in the Control Extension mode page.

-f. Displays information in the Format Device mode page.

-g. Displays information in the Rigid Disk Drive Geometry mode page.
An example of using sginfo with the -g option to display the drive geometry of a SCSI hard disk is shown below:

da1:/ # sginfo -g /dev/sg0
Rigid Disk Geometry mode page (0x4)
Number of cylinders             2088
Number of heads                 255
Starting cyl. write precomp     2088
Starting cyl. reduced current   0
Device step rate                0
Landing Zone Cylinder           2088
RPL                             0
Rotational Offset               0
Rotational Rate                 7200

-G. Displays the grown defect list.

-i. Displays the response to a standard INQUIRY command. An example is shown below:

da1:/ # sginfo -i /dev/sg0
INQUIRY response (cmd: 0x12)
Device Type            0
Peripheral Qualifier   0
Removable              0
Version                2
NormACA                0
HiSup                  0
Response Data Format   2
SCCS                   0
ACC                    0
ALUA                   0
3PC                    0
Protect                0
BQue                   0
EncServ                0
MultiP                 0
MChngr                 0
Addr16                 0
Relative Address       0
Wide bus 16            1
Synchronous neg.       1
Linked Commands        0
Command Queueing       1
Vendor:         VMware,
Product:        VMware Virtual S
Revision level: 1.0

-I. Displays the Informational Exceptions mode page.

-l. Lists known SCSI devices on the system. An example is shown below:

da1:/ # sginfo -l
/dev/scd0 /dev/sr0 /dev/sda
/dev/sg0 [=/dev/sda scsi0 ch=0 id=0 lun=0]
/dev/sg1 [=/dev/scd0 scsi2 ch=0 id=0 lun=0]

-n. Accesses information in the Notch and Partition mode page.

-P. Accesses information in the Power Condition mode page.

-r. Displays all SCSI device names in the /dev directory. However, it does not list sg device names. Devices which only have an sg device name (such as a scanner) are not reported.

-s. Displays information in the unit serial number page.

-V. Accesses information in the Verify Error Recovery mode page.

NOTE: Similar information can be gathered with the sdparm command. See the sdparm man page for more information.

sg_format. The sg_format command can be used to format or resize a SCSI disk. SCSI hard drives are typically formatted by the manufacturer with a block size of 512 bytes and the largest number of blocks recommended. Most manufacturers reserve a number of tracks and sectors for reassignment of logical block addresses at some point in the life of the disk. The sg_format utility can format SCSI hard disks, change their block size (if necessary), and change the block count (which changes the number of accessible blocks on the media, thereby resizing the disk). When the sg_format command is issued without options, it simply lists the existing block size and block count derived from two sources. An example is shown below:

da1:/ # sg_format /dev/sg0
VMware,  VMware Virtual S  1.0  peripheral_type: disk [0x0]
Mode Sense (block descriptor) data, prior to changes:
  No block descriptors present
Read Capacity (10) results: Number of blocks=  Block size=512 bytes
No changes made. To format use '--format'. To resize use '--resize'

Some of the more useful options you can use with sg_format are listed below:

-c n. Specifies the number of blocks to be formatted or media to be resized to.
This option can be used with either the -F (format) or -r (resize) options.

-F. Issues a SCSI FORMAT command.

WARNING: This will destroy all the data on the disk!

This option is required if you want to change the block size of a disk.

-r. Resizes the disk. This option changes the number of blocks on the device reported by the READ CAPACITY command. This option should be used with the -c n option. The contents of all logical blocks on the media will remain unchanged, which allows any resize operation to be reversed.

NOTE: This option cannot be used with the -F or -s options.

-s n. Specifies the block size for the device. The default value is whatever is currently reported by the block descriptor. This option must be used in conjunction with the -F option. If the block size specified is different than the current block size, a MODE SELECT command is issued to change it prior to sending the FORMAT command.

NOTE: See the sg_format man page for more information.

rescan-scsi-bus.sh: The rescan-scsi-bus.sh command is used to rescan the SCSI bus, allowing you to add and remove SCSI devices without rebooting the system. You can use the following options with this command:

-l: Activates scanning for LUNs.

-L n: Activates scanning for LUNs 0-n.

-w: Scans for target device IDs.

-c: Enables scanning of channels 0, 1.

-r: Enables the removal of SCSI devices.

--forcerescan: Rescans existing devices.

--forceremove: Removes and re-adds every device.

--channels=list: Scans only channel(s) provided in list.

NOTE: The list parameter is a comma-separated list of values. No spaces are allowed.

--ids=list: Scans only target ID(s) provided in list.

--luns=list: Scans only LUN(s) provided in list.

In the example below, a new SCSI hard disk has been found and added using the rescan-scsi-bus.sh command:

DA1:~ # rescan-scsi-bus.sh
Host adapter 0 (mptspi) found.
Host adapter 1 (ata_piix) found.
Host adapter 2 (ata_piix) found.
Scanning SCSI subsystem for new devices
Scanning host 0 for SCSI target IDs , all LUNs
Scanning for device
OLD: Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: VMware,  Model: VMware Virtual S Rev: 1.0
  Type:   Direct-Access                    ANSI SCSI revision: 02
Scanning for device
OLD: Host: scsi0 Channel: 00 Id: 01 Lun: 00
  Vendor: VMware,  Model: VMware Virtual S Rev: 1.0
  Type:   Direct-Access                    ANSI SCSI revision: 02
Scanning for device
OLD: Host: scsi0 Channel: 00 Id: 02 Lun: 00
  Vendor: VMware,  Model: VMware Virtual S Rev: 1.0
  Type:   Direct-Access                    ANSI SCSI revision: 02
Scanning for device
NEW: Host: scsi0 Channel: 00 Id: 03 Lun: 00
  Vendor: VMware,  Model: VMware Virtual S Rev: 1.0
  Type:   Direct-Access                    ANSI SCSI revision:
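When working with sg_format (or sizing a newly discovered disk), it helps to sanity-check the numbers: usable capacity is simply the block count multiplied by the block size, the two values sg_format reports. A quick sketch of that arithmetic; capacity_bytes is a hypothetical helper, and the sample values are illustrative, not taken from a specific device.

```shell
# Usable disk capacity in bytes = block count * block size, the two values
# reported by sg_format (and changed by its resize and format options).
# capacity_bytes is a hypothetical helper for illustration only.
capacity_bytes() {
    echo $(( $1 * $2 ))
}

capacity_bytes 2097152 512   # 1073741824 bytes (1 GiB)
```

Resizing with -r -c n changes only the block count in this calculation; reformatting with -F -s n changes the block size.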

Objective 2  Describe How Fibre Channel SANs Work

While you can use SCSI devices as directly attached storage in a server system, you can also implement them as external storage using a Storage Area Network (SAN). In this objective, you learn how hardware SANs work. The following topics are addressed:

How SANs Work on page 91
SAN Hardware on page 94
Providing Clustering and Redundancy with SANs on page 104

How SANs Work

In the early days of computer networking, each server in the network had its own dedicated, directly attached storage media. For example, a server might have three SCSI hard disks installed within the server chassis configured into a RAID 5 array using a hardware RAID controller board. The storage media in a particular server could only be used by that server. All other servers in the network had to have their own dedicated storage. This created a situation where islands of information developed in the network, as shown in the figure below:

Figure 2-11 Islands of Information

In this example, the application running on App Server 1 will become unavailable should the server experience a hardware or power failure. Because App Server 2 and 3 have no access to the storage media on App Server 1, they are unable to take over for a failed App Server 1. Likewise, if App Server 2 becomes overloaded with excessive requests, it can't offload some of its work to App Server 1 or 3.

In addition, backing up data from these servers is problematic. Each one has to be backed up individually using either a directly attached tape drive or over the network to a backup server where a tape drive is connected. This is problematic for the following reasons:

Each server has to fill two roles: application server and backup server. Basically, each server must function as an application server in the daytime and as a backup server at night. This configuration may have worked in the past, but in the modern networking world, the window where backups may occur without impacting mission-critical applications is getting smaller and smaller. Many organizations need 24/7/365 availability for their servers, so they really have no backup window at all.

Managing separate backups for each server system requires additional personnel resources.

Backing up over the network consumes excessive bandwidth and can be very slow. Again, if your servers need to be accessible 24/7/365, then you have no practical window available for an extended network backup.

To avoid these issues, you can implement a SAN instead of directly attached storage. A SAN uses externally connected storage instead of directly attached or internal storage. Servers in your network access the external storage media over a network connection that is dedicated to the SAN. Your everyday network traffic, such as file, print, database, and Web traffic, remains where it is on your standard Ethernet network. However, your storage-related I/O network traffic runs on the SAN. This configuration is shown in the figure below:

Figure 2-12 SANs

This creates two separate networks and corresponding management domains. The first is the Fibre Channel SAN dedicated to storage. The second is the traditional

LAN that carries standard transmissions that network clients use to request and use services provided by network servers.

A key difference is the type of hardware used. The standard network uses commodity-grade Ethernet hardware, while the SAN uses Fibre Channel networking components, which allow very fast data transfers (measured in Gbps).

The storage media connected to the SAN appears to the operating system running on the servers as a locally attached device. The OS doesn't know that the storage device actually exists somewhere else on the SAN network. You can use standard operating system utilities to create partitions and file systems on the remote storage and mount them in the local server file system.

Using SAN storage has many benefits over using locally attached storage devices:

- Simplified management: Instead of managing multiple storage devices located on each individual server, SANs provide centralized management of your network storage.
- Portability: Because all of the servers connected to the SAN share the same storage, it's very easy to shift storage from one server to another. For example, if one server needs to go offline for a period of time, another server in the SAN can take over its responsibilities simply by shifting the storage to the new server.
- SAN booting: It's also possible to boot your server directly from the SAN. This makes it very easy to replace a malfunctioning server: you simply boot a new server from the SAN and it immediately takes on the identity of the failed system.
- Reliability: SANs can also increase the reliability of your storage solutions. Using SANs, you can implement redundant I/O paths as well as run-time data replication.
- Load balancing and high availability: A key benefit of SANs is the ability for multiple servers to share the same storage media at the same time. This allows you to configure a variety of clustering configurations.
  You can configure load balancing by dividing the network load among multiple servers. You can also configure one server to immediately fail over to another server should the first server go down for some reason.
- Scalability: It's relatively easy to add storage capacity to a SAN. Most SAN hardware supports hot-plugging and hot-swapping, allowing you to dynamically add additional storage space to the SAN without bringing your servers down.
- Simplified backups: Because your network storage is condensed into a single location, it's much easier and more efficient to run your backups. In fact, many SANs include backup functionality in the SAN firmware. This provides two key benefits:
  - Servers can be servers: Your network servers can focus on running the network services they have been configured to provide without spending part

of their time functioning as backup servers. The firmware in the SAN takes care of all backup processes.
  - Efficient use of network bandwidth: Instead of running backup operations on your standard Ethernet network, your backups run on the SAN. This preserves your Ethernet network bandwidth for standard client/server operations instead of backup traffic.
- Increased performance: The I/O performance of a SAN can be much faster than that of traditional locally attached storage due to the very high data transfer rates provided by the SAN infrastructure.

The key disadvantage of implementing a SAN is cost. The initial outlay required to purchase and install SAN hardware is considerably more than that required for locally attached storage. However, many studies show SANs actually save money in the long run due to their reliability, centralized management, and scalability.

SAN Hardware

As discussed previously, a SAN is composed of the following components:

- Servers
- SAN networking infrastructure
- SAN-enabled storage devices, such as hard disks and tape libraries

The SAN infrastructure is not composed of traditional commodity-grade Ethernet hardware (with the exception of iSCSI, which will be discussed later in this section). Instead, it is built using unique, dedicated hardware and software.

WARNING: At the time of this writing, many of the SAN standards are still in an evolutionary state and haven't been fully implemented by all manufacturers. You need to be very careful when purchasing SAN hardware. If you mix and match components from different manufacturers, it may be difficult to get them to work together correctly.
To understand how SANs work, you need to be familiar with the following:

- Protocols and Topologies
- Ports
- Fibre Channel Protocol Layers
- Login
- Addressing
- SCSI Bridges

Protocols and Topologies

Currently, the infrastructure of choice for implementing a SAN is Fibre Channel. Fibre Channel is a high-speed serial I/O protocol used almost exclusively for storage networking.

Despite its name, Fibre Channel is actually media independent. It can be implemented on either shielded twisted-pair copper wire (using a DB-9 connector) or fiber-optic cabling. Twisted-pair copper wiring is considerably less expensive to implement than fiber-optic cabling. However, using twisted-pair also dramatically reduces the maximum cable length that can be used. In addition, transmission speeds over 200 MBps require fiber-optic cabling. As a result, Fibre Channel is most frequently implemented with single-mode or multimode fiber-optic cable. The following types of cable are supported:

- 50µm Multi-Mode
- 62.5µm Multi-Mode
- Single-Mode

Both long wave and short wave lasers can be used with all types of fiber-optic cabling. In addition, the SC fiber-optic connector is used with Fibre Channel.

Several different versions of Fibre Channel have been introduced over the last 15 years. The key versions you should be familiar with include the following:

- 1GFC: This early version of Fibre Channel became available in 1997 and transferred data at a rate of around 200 MBps.
- 2GFC: This version of Fibre Channel was introduced in 2001. It doubled the throughput of 1GFC, providing transfer speeds up to 400 MBps.
- 4GFC: This version of Fibre Channel was introduced in 2005 and again doubled the speed of its predecessor, transferring data at a rate of 800 MBps.
- 8GFC: This later version of Fibre Channel provides throughput eight times faster than 1GFC, transferring data at 1600 MBps.

A Fibre Channel SAN can be wired using one of three topologies, listed below:

- FC-P2P: In this topology, two Fibre Channel devices are connected directly to each other in a point-to-point configuration. The transmit fiber of one device goes to the receive fiber of the other device and vice versa. An example is shown below:

Figure 2-13: FC-P2P Topology

With FC-P2P, there is no sharing of the media.
This allows the devices to utilize the total bandwidth of the link. A simple link initialization of the two devices is required before communications can begin.
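As a quick sanity check on the GFC generations listed earlier, each version's quoted throughput is simply the 1GFC figure multiplied by the generation number. This is a sketch using the full-duplex MBps figures quoted above, not a measurement:

```shell
# Each GFC generation's quoted throughput equals the 1GFC rate (200 MBps)
# multiplied by the generation number (1, 2, 4, 8).
for gen in 1 2 4 8; do
  echo "${gen}GFC: $((200 * gen)) MBps"
done
```

Running the loop prints the same four data rates given in the version list above.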

- FC-AL (Arbitrated Loop): Allows up to 127 devices to be connected in a single network without implementing a Fabric switch. Because no switch is involved, it is less expensive, but the topology is more complex. In this design, all devices are connected in a ring, similar to that used in a Token Ring network. An example is shown below:

Figure 2-14: Arbitrated Loop Topology

Unlike other Fibre Channel topologies, the media is shared among all devices in the Arbitrated Loop, reducing the amount of bandwidth available to an individual device.

At first glance, the Arbitrated Loop topology appears very similar to that used by Token Ring networks. However, it does not use token-passing media access control. Instead, when a device is ready to transmit data, it must first arbitrate and gain control of the loop. It does this by transmitting an Arbitrate Primitive (ARB) signal along with the Arbitrated Loop Physical Address (AL_PA) of the device. When the device receives its own ARB signal from around the ring, it gains control of the loop and can communicate with another device by transmitting an Open Primitive (OPN) signal to the destination device. All other devices in the loop simply repeat the data as it traverses the ring. This creates a logical point-to-point connection between the two devices that are communicating.

There is no limit on how long a device may retain control of the loop. However, the device that has control may not arbitrate again until all other devices in the loop have had a chance to arbitrate first.

A key weakness of this design is the fact that it functions as a true physical ring: a failed device in the loop breaks communications. However, as with other ring-based network topologies, you can implement a central hub. The hub acts as a wiring concentrator. All devices in the ring connect to the central hub instead of to each other, forming a physical star topology (although it still operates logically as a ring). An example is shown below:

Figure 2-15: An Arbitrated Loop with a Central Hub

In this configuration, the hub can determine when a device in the ring is up or down and bypass it if necessary. As with other network topologies, Fibre Channel hubs can be cascaded to provide additional ports.

- FC-SW (Switched Fabric): In an FC-SW topology, a central Fibre Channel switch is used, much like that used in a twisted-pair Ethernet network. Individual devices or entire Arbitrated Loops can be connected to the switch, using a physical star topology. An example is shown in the figure below:

Figure 2-16: Fabric Topology

Much like a switched Ethernet network, the Fabric Fibre Channel topology allows many devices to communicate at the same time without sharing the SAN media bandwidth. The traffic between two devices flows only through the switch; it is not transmitted to any other port. In addition, if a particular device fails, it does not affect communications between the remaining devices on the SAN. The key disadvantage of this topology is that it costs more than an Arbitrated Loop to implement because of the added cost of the Fibre Channel switch.

Ports

In a Fibre Channel SAN, a port is any device that is connected to the SAN network. It does not refer to a hardware port, as is common in traditional LAN terminology. A port can include devices such as:

- A shared SAN storage device
- A Fibre Channel interface card installed in a server
- A Fibre Channel switch

The Fibre Channel standard defines several different types of ports, including the following:

- Node Port (N_port): A port on a node that is used with either the FC-P2P or FC-SW topology, as shown below:

Figure 2-17: N_ports in a Fabric

- Node Loop Port (NL_port): A port on a node used with the FC-AL topology, as shown below:

Figure 2-18: NL_ports in an Arbitrated Loop

- Fabric Port (F_port): A port on a Fibre Channel switch that connects to an N_port using a point-to-point connection. An F_port is not compatible with an NL_port in an FC-AL loop.
- Fabric Loop Port (FL_port): A port on a Fibre Channel switch that connects to an NL_port in an FC-AL loop.
- Fx_port: A port on a Fibre Channel switch that can function as an F_port when connected to an N_port or as an FL_port when connected to an NL_port.
- Expansion Port (E_port): A port on a Fibre Channel switch that can be used to link to an E_port on another Fibre Channel switch, establishing an inter-switch link (ISL).
- Generic Port (G_port): A port on a Fibre Channel switch that can function as either an E_port or an F_port.
- EX_port: A port on a Fibre Channel router that connects to the E_port on a Fibre Channel switch.
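The switch-side counterpart of each node-side port type boils down to a small lookup. The following shell sketch (an illustrative helper, not part of any Fibre Channel tooling) encodes the pairings described in the list above:

```shell
# Return the switch port type that a given port connects to, per the
# definitions above. Illustrative lookup only.
peer_port() {
  case "$1" in
    N_port)  echo "F_port"  ;;  # node attached point-to-point to the fabric
    NL_port) echo "FL_port" ;;  # node in a loop attached to the fabric
    E_port)  echo "E_port"  ;;  # switch-to-switch inter-switch link (ISL)
    *)       echo "unknown" ;;
  esac
}

peer_port N_port
peer_port NL_port
```

The Fx_port and G_port definitions above describe ports that adapt to whichever peer they detect, which is why they don't appear as fixed entries in such a table.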

Fibre Channel Protocol Layers

Like the OSI model used with traditional LAN protocols, the Fibre Channel protocol is stratified into five layers:

Figure 2-19: Fibre Channel Layers

- FC0 defines the Physical layer. It specifies Fibre Channel cables, connectors, and optical and electrical parameters.
- FC1 defines the Data Link layer, which specifies the transmission protocol used by Fibre Channel. This includes serial encoding and decoding rules, special characters, and error control. The information transmitted over a Fibre Channel SAN is encoded 8 bits at a time into a 10-bit Transmission Character.
- FC2 defines the Network layer, providing communications between ports. FC2 defines the following:
  - A signaling protocol used as a transport mechanism
  - The framing rules for the data being transferred between ports
  - Several mechanisms for controlling the class of service
  - A means for managing the data transfer sequence
- FC3 defines the Common Services layer. It provides common services required for advanced Fibre Channel features, including the following:
  - Striping: Uses multiple N_ports in parallel to transmit a single information unit across multiple links.
  - Hunt Groups: Allows multiple ports to respond to the same address. This can improve efficiency by decreasing the probability of reaching a busy N_port.
  - Multicast: Sends a single transmission to multiple destination ports, including sending to all N_ports (broadcast) or to only a subset of N_ports on a fabric.
- FC4 defines the Protocol Mapping layer, in which other protocols are encapsulated into an information unit for delivery. It also specifies the mapping rules for upper-layer protocols using the FC levels below FC4. The following protocols are supported by FC4:
  - SCSI
  - Intelligent Peripheral Interface (IPI)
  - IP
  - ATM
  - IEEE 802.2

Fibre Channel allows the fabric to be partitioned into zones. Zoning allows you to control access to, and simplify the management of, a fabric. For example, you can use zoning to specify which devices on the SAN each server is allowed to access.

NOTE: Zoning can only be implemented in an FC-SW SAN. It functions at Layer 2 of the Fibre Channel protocol.

Zoning can be implemented on a fabric using four different strategies:

- Soft Zoning: With soft zoning, each server is configured with a list of SAN devices that it is allowed to see on the fabric. Soft zoning doesn't actually restrict access to SAN devices. Instead, it simply hides devices from a server that isn't allowed to access them. A key drawback of soft zoning is that if a server already knows the network address of a SAN device, the server can access it directly, circumventing the access control rules defined for it.
- Hard Zoning: With hard zoning, each SAN server is configured with a list of SAN devices that it is allowed to access on the fabric. Unlike soft zoning, which simply hides unauthorized devices, hard zoning uses frame filtering to actually prevent communication between a server and the SAN devices it isn't allowed to access.

NOTE: Hard zoning is more secure than soft zoning.

- Port Zoning: Port zoning controls access by preventing specified ports on the Fibre Channel switch from accessing specified ports on the fabric. By doing this, you control which SAN devices each server may access. A key drawback of port zoning is that it doesn't take into account the name of the device connected to a port. For example, if you were to disconnect a SAN device and reconnect it to a different port on the SAN switch, your access control rules would no longer work.
- Name Zoning: Name zoning is also called WWN zoning.
Instead of restricting access by port, name zoning controls access using the WWN of each SAN device. Name zoning allows you to move devices in the fabric without invalidating your access control rules.

You can combine zoning strategies to partition your fabric. For example, hard zoning is frequently implemented in conjunction with name zoning.

In addition to zoning, you can also manage access to SAN devices on the fabric using LUN masking. LUN masking is implemented at Layer 4 of the Fibre Channel protocol. LUN masking controls access using the LUNs assigned to SAN devices.

Essentially, LUN masking hides LUNs on the fabric from servers that aren't allowed to access the associated devices. This can be very useful in situations where Windows servers are connected to the same SAN as Linux or NetWare servers. By default, Windows servers try to apply volume labels to all LUNs they can see on the SAN, including Linux or NetWare volumes. This can corrupt those volumes, rendering them inaccessible to the Linux or NetWare servers. Using LUN masking, you can hide the LUNs assigned to Linux or NetWare volumes from your Windows servers, preventing them from writing labels to these volumes.

Login

Fibre Channel defines two types of login procedures:

- Fabric Login
- N_port Login

All node ports must attempt to log in with the fabric right after the link or the loop has been initialized. To do this, the node port transmits a Fabric Login (FLOGI) frame to the well-known fabric address FFFFFE (hex). If it exists, the fabric responds by sending an Accept (ACC) frame back to the node port. Fabric Login accomplishes several important tasks:

- It determines whether a fabric exists.
- If a fabric is present, it does the following:
  - It provides the node with a set of operating parameters specific to the fabric, such as which classes of service are supported.
  - It assigns an N_port identifier to the node.
- If a fabric is not present, an ACC is sent from the remote N_port informing the source N_port that it is connected in a point-to-point topology (such as FC-P2P).

After Fabric Login, a node port must perform an N_port Login with a remote node port before it can communicate with it. An N_port Login is similar to a Fabric Login: a Port Login (PLOGI) frame is transmitted from the source node port to the destination node port, and an ACC frame is transmitted back in response.
It also provides the source N_port with a set of operating parameters specific to the destination N_port, such as the classes of service that are supported. Once the Fabric and Port Login processes are complete, the device can stay logged in indefinitely.

Addressing

Traditional Ethernet LANs use a static 6-byte MAC address, burned into a ROM chip on each connected network interface, to uniquely identify each host on the network.

Fibre Channel, on the other hand, uses a 3-byte hexadecimal address identifier to uniquely identify each port on the SAN (for example, 0BADBE). As discussed previously, this identifier is dynamically assigned during Fabric Login. Once an identifier has been assigned, an N_port can use its Source_ID (S_ID) to transmit frames to a destination N_port using the Destination_ID (D_ID).

NOTE: Addresses between FFFFF0 and FFFFFE are defined as well-known addresses and are reserved for specific devices. For example, FFFFFE is reserved for the fabric. In addition, FFFFFF is reserved for broadcasts.

Before Fabric Login occurs, the N_port's S_ID is said to be undefined. If your Fibre Channel SAN uses the FC-P2P topology, the Fabric Login process will fail because no fabric is present. In this situation, the two ports in the point-to-point connection simply assign themselves two unique identifiers.

NL_ports in an Arbitrated Loop topology also use a 3-byte S_ID. However, they must also have an Arbitrated Loop Physical Address (AL_PA) assigned. The AL_PA is a 1-byte value that is dynamically assigned every time the loop is initialized. Once this is done, the NL_ports in the loop also attempt Fabric Login. If successful, the fabric sets the upper two bytes of the NL_port's identifier to the appropriate values and uses the low byte of the port's identifier as the AL_PA. If the Fabric Login fails (for example, if no fabric exists), then the upper two bytes of the port's identifier remain set to 0000 and the lower byte is simply the port's AL_PA.

Fibre Channel also uses Name_identifiers in conjunction with the S_ID to uniquely identify devices on the SAN.
This is a static 64-bit value that is used to uniquely identify:

- Nodes, using the Node_Name Name_identifier
- Ports, using the Port_Name Name_identifier
- The fabric, using the Fabric_Name Name_identifier

SCSI Bridges

Fibre Channel to SCSI bridges are frequently used to convert data between the Fibre Channel and SCSI standards. This allows you to use your existing legacy SCSI storage and backup devices with a Fibre Channel SAN.

Providing Clustering and Redundancy with SANs

A key benefit of SANs is that they allow you to implement clustering with your mission-critical servers. A server cluster is a group of redundantly configured servers that work together to provide highly available access for clients to important applications, services, and data while reducing unscheduled outages.

The applications, services, and data are configured as cluster resources that can be failed over or migrated between servers in the cluster. This is possible because the servers in the cluster use shared storage provided by a SAN. For example, if a failure occurs on one node of the cluster, the clustering software gracefully relocates its resources and current sessions to another server in the cluster. Clients connect to the cluster instead of to an individual server, so users are not aware of which server is currently providing the service or data. In fact, users are usually able to continue their sessions without interruption when a fail-over event occurs.

Each server in the cluster runs the same operating system and applications that are needed to provide the application, service, or data resources to clients. Clustering software monitors the health of each of the member servers by listening for its heartbeat, a simple message that lets the other cluster members know it is alive.

The cluster's virtual server provides a single point for accessing, configuring, and managing the cluster servers and resources. The virtual identity is bound to the cluster's master node and stays with the master role regardless of which member server currently acts as the master node. The master server also keeps information about each of the member servers and the resources they are running. If the master server fails, the control duties are passed to another server in the cluster.

You can configure many Linux servers in a high-availability cluster. You can manage a cluster from a single point of control to adjust resources to meet changing workload requirements, and thus manually balance the load across the cluster. Resources can also be migrated manually within the cluster to allow you to troubleshoot hardware. For example, you can move applications, Web sites, and so on to other servers in your cluster without waiting for a server to fail.
This helps you to reduce unplanned service outages as well as planned outages for software and hardware maintenance and upgrades.

Using server clusters provides the following benefits over standalone servers:

- Increased availability of applications, services, and data
- Improved performance
- Lower cost of operation
- Scalability
- Disaster recovery
- Data protection
- Server consolidation
- Storage consolidation

NOTE: The implementation and management of server clusters is an extensive topic that requires its own course to teach properly. Because the main focus of this section is SAN technology, clustering is covered at only an introductory level here.

The benefits that server clustering provides can be better understood through the following scenario. Suppose you have configured a three-server cluster, with a Web server installed on each of the three servers in the cluster. Each of the servers in the cluster hosts two Web sites. All the data, graphics, and Web page content for each Web site is stored on a shared disk system connected to each of the servers in the cluster. The figure below depicts how this setup might look:

Figure 2-20: A Three-Server Cluster

During normal cluster operation, each server is in constant communication with the other servers in the cluster and performs periodic polling of all registered resources to detect failure.

Suppose Web Server 1 experiences hardware or software problems and the users who depend on Web Server 1 lose their connections. The figure below shows how resources are moved when Web Server 1 fails:

Figure 2-21: Fail Over in a Clustered Server Environment

As you can see in the figure above, Web Site A moves to Web Server 2 and Web Site B moves to Web Server 3. IP addresses and certificates also move to Web Server 2 and Web Server 3. When you configure the cluster, you decide where the services and applications hosted on each server go if a failure occurs. In this example, you configured Web Site A to move to Web Server 2 and Web Site B to move to Web Server 3. This way, the workload once handled by Web Server 1 is evenly distributed.

When Web Server 1 failed, the clustering software did the following:

- Detected the failure.
- Remounted the shared data directories that were formerly mounted on Web Server 1 on Web Server 2 and Web Server 3, as specified.
- Restarted the applications that were running on Web Server 1 on Web Server 2 and Web Server 3, as specified.
- Transferred IP addresses to Web Server 2 and Web Server 3, as specified.

In this example, the fail-over process happened quickly, and users regained access to Web site information within seconds, in most cases without logging in again.

Now suppose the problems with Web Server 1 are resolved and Web Server 1 is returned to a normal operating state. Web Site A and Web Site B will automatically be moved back to Web Server 1, and Web server operation will return to the way it was before Web Server 1 failed.

Most clustering software also provides resource migration capabilities. You can move applications, data, and services to other servers in your cluster without waiting for a server to fail.

For example, you could have manually moved Web Site A or Web Site B from Web Server 1 to either of the other servers in the cluster. You might do this to upgrade or perform scheduled maintenance on Web Server 1, or simply to increase the performance or accessibility of the Web sites.
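The fail-over placement decisions in the preceding scenario amount to a small resource-to-server mapping that you define when you configure the cluster. A toy shell sketch of that mapping (the function name is invented for illustration; real clustering software stores these rules in its own resource configuration):

```shell
# Where each of Web Server 1's resources goes on failure, mirroring the
# scenario above. Illustrative lookup only, not clustering software.
failover_target() {
  case "$1" in
    "Web Site A") echo "Web Server 2" ;;
    "Web Site B") echo "Web Server 3" ;;
    *)            echo "no rule"      ;;
  esac
}

failover_target "Web Site A"
failover_target "Web Site B"
```

Splitting the two sites across the two surviving servers is what keeps the redistributed workload even, as described above.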

Objective 3: Implement a SAN with iSCSI

As discussed in the previous objective, one option for implementing a SAN is to install and configure a dedicated high-speed Fibre Channel network to use for storage. However, a key drawback associated with this option is the fact that Fibre Channel hardware can be very expensive. An alternative is to implement a SAN using commodity-grade Ethernet hardware and the iSCSI protocol. In this objective, you learn how to do this. The following topics are addressed:

- The Benefits of Using iSCSI
- How iSCSI Works
- iSCSI Terminology
- Implement an iSCSI SAN
- Implementing an iSNS Server
- Configure a SAN with iSCSI

The Benefits of Using iSCSI

As discussed earlier, one of the key drawbacks associated with Fibre Channel SANs is their cost. Fibre Channel SANs can be very expensive to implement and manage. You would need:

- Fibre Channel (or SCSI) external storage devices, which can cost $50,000 or more
- A Fibre Channel switch and associated cabling, which can cost $8,000 or more
- A Fibre Channel HBA for each server, which can cost $800 or more each
- Specialized training (cost varies by manufacturer)

iSCSI offers a much lower-cost alternative to Fibre Channel. iSCSI allows you to:

- Use inexpensive commodity-grade Ethernet networking hardware
- Leverage your existing local area networking knowledge and expertise
- Use familiar LAN networking protocols
- Reuse your existing directly attached storage hardware

iSCSI can use your existing directly attached storage devices, networking infrastructure, and network adapters. In addition, no additional training is required. This makes iSCSI a considerably less expensive SAN solution than Fibre Channel. As such, iSCSI really isn't a replacement for or a competitor to Fibre Channel. Both are SAN solutions, but they are aimed at two different markets.
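To put the figures quoted above in perspective, a rough minimum outlay for a small Fibre Channel SAN can be tallied from the list (a three-server example; training costs vary by manufacturer and are excluded here):

```shell
# Minimum Fibre Channel outlay for a 3-server SAN, using the low-end
# figures quoted above: storage + switch/cabling + one HBA per server.
servers=3
total=$((50000 + 8000 + 800 * servers))
echo "\$${total} or more"
```

The total comes to over sixty thousand dollars before training, which is the gap iSCSI's commodity Ethernet approach is meant to close.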
Essentially, iSCSI makes SAN technology available to organizations that previously could not implement Fibre Channel infrastructure due to cost. iSCSI can also function as a

supplement for a Fibre Channel SAN, providing a bridge into an existing Fibre Channel fabric.

How iSCSI Works

Your server operating system uses SCSI commands to communicate with I/O storage devices. These commands are usually issued through your I/O bus, but they can also be sent over a network connection using the IP protocol. With iSCSI, the operating system (which functions as an iSCSI initiator) does not send the SCSI commands directly to an attached storage device. Instead, it sends them to another server system on the network (which functions as an iSCSI target). The iSCSI target system redirects the SCSI commands to a directly attached storage device, which can be a partition, logical volume, or file. This is shown in the figure below:

Figure 2-22: A Simple iSCSI SAN

NOTE: The iSCSI protocol is standardized in RFC 3720.

Essentially, iSCSI uses a client/server connection between the iSCSI initiator (client) and the iSCSI target (server). iSCSI uses port 3260 by default. You connect to a remote storage device on a different system. The remote storage appears to the local operating system as a locally attached hard disk, effectively creating a SAN. The remote devices are available on Linux as regular SCSI devices (/dev/sdn) and should be mounted with the _netdev option. To control access, you can configure the

iSCSI targets to require a username and password upon connection from an iSCSI initiator.

Because of the amount of data transferred over the network by iSCSI, it is important to ensure that enough network bandwidth is available to the iSCSI SAN. We strongly recommend that you use Gigabit Ethernet switches, wiring, and HBAs when implementing an iSCSI SAN. If the demand on the SAN is high enough, you may want to consider implementing a dedicated network for the iSCSI communications, in much the same way that Fibre Channel uses a dedicated network infrastructure.

iSCSI Terminology

To implement and manage an iSCSI SAN, you need to be familiar with the following terms, components, and concepts:

- Network entity: A device connected to an IP network that hosts one or more iSCSI nodes (such as a physical or virtual server).
- Network portal: The IP address and port through which an iSCSI node is accessed on a network entity.
- Protocol Data Unit (PDU): A data structure that carries the messages communicated between the iSCSI initiator and target.
- iSCSI node: An iSCSI node can be either an iSCSI target or an iSCSI initiator. A network entity can host one or more iSCSI nodes that are accessible through one or more network portals.
- iSCSI name: The unique name of an iSCSI node, typically an iSCSI Qualified Name (IQN). Assigning unique names to iSCSI nodes allows multiple nodes to share the same network portal on a network entity.
- iSCSI Qualified Name (IQN): The IQN is designed to be globally unique using Internet naming standards. It is comprised of the unique identifier assigned to an iSCSI node using the following syntax:

  iqn.year-month.tld.Internet_domain:string

  The year-month portion of the IQN identifies the year and month the Internet domain was registered. The tld portion of the IQN specifies the top-level domain of the Internet domain (such as com, net, org, and so on).
The Internet_domain portion of the IQN identifies the registered Internet domain name. An IQN may look similar to the following:

iqn.yyyy-mm.com.mydomain:target1

iSCSI target: An iSCSI target is equivalent to a traditional SCSI device.

iSCSI initiator: An iSCSI initiator is equivalent to a traditional SCSI host bus adapter.

LUN: In terms of iSCSI, a LUN is equivalent to a traditional SCSI disk. An iSCSI target hosts one or more LUNs.
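The IQN structure described above can be assembled mechanically. The following sketch builds one from its parts; the year-month, domain, and identifier values are hypothetical placeholders, not from a real registration:

```shell
# Assemble an IQN from its parts, per the syntax described above.
# All three values are hypothetical placeholders.
year_month="2009-04"      # year and month the domain was registered
domain="mydomain.com"     # registered Internet domain
identifier="target1"      # string naming this particular node

# Reverse the domain components: mydomain.com -> com.mydomain
reversed=$(echo "$domain" | awk -F. '{ for (i = NF; i > 1; i--) printf "%s.", $i; print $1 }')

iqn="iqn.${year_month}.${reversed}:${identifier}"
echo "$iqn"
```

Note that the domain components are reversed (com.mydomain, not mydomain.com), mirroring the Java-style reverse-DNS convention the IQN format uses to guarantee global uniqueness.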

The relationship between the preceding components and concepts is depicted in the figure below:

Figure 2-23 iSCSI SAN Components and Architecture

Implement an iSCSI SAN

With this background in mind, you are ready to learn how to implement an iSCSI SAN. In this part of this objective, the following topics will be addressed:

Implement an iSCSI Target on page 112
Implementing an iSCSI Initiator on page 122

Implement an iSCSI Target

The first thing you need to do is implement an iSCSI target. To do this, you need to be familiar with the following topics:

The ietd Daemon on page 112
Install and Configure an iSCSI Target on page 114

The ietd Daemon

The iSCSI target daemon (ietd) enables a server to host iSCSI targets. It allows block devices to be accessed over the network as if they were locally attached SCSI devices. Supported targets can be any block device or image file. An unrestricted number of targets can be configured on each server, with multiple LUNs per target.

The ietd daemon supports CHAP authentication, providing a global as well as a per-target authentication mechanism. It provides incoming authentication, which requires the initiator to authenticate to the target. It also provides outgoing authentication, which requires the target to authenticate to the initiator.
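The two authentication directions described here map onto paired directives in the ietd configuration file, one pair per scope. A minimal sketch; the target name, usernames, and passwords below are hypothetical placeholders:

```
# Before the first Target line: discovery-level (global) authentication
IncomingUser discoveryuser secret1

Target iqn.yyyy-mm.com.example:target1
        Lun 0 Path=/dev/sdb1
        # Incoming: the initiator must present these credentials to connect
        IncomingUser initiatoruser secret2
        # Outgoing: the target presents these credentials back to the initiator
        OutgoingUser targetuser secret3
```

Keep the two directions straight: incoming credentials are checked by the target, while outgoing credentials are presented by it.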

The RPM packages required to install the iSCSI target daemon include the following:

iscsitarget
yast-iscsi-server

The binary file for the ietd daemon is /usr/sbin/ietd. Its init script is /etc/init.d/iscsitarget. The ietd daemon can be managed using the following utilities:

/usr/sbin/ietadm (command line)
The YaST iscsi-server module

The ietd daemon's configuration file is /etc/ietd.conf. A portion of a sample ietd configuration file is shown below:

# Example iscsi target configuration
#iSNSServer
#iSNSAccessControl No

Target iqn.yyyy-mm.com.mydomain:target1
        Lun 0 Path=/dev/sdb,Type=fileio,ScsiId=0
        #IncomingUser uname password
        #OutgoingUser uname password
        #Alias tgt1

Target iqn.yyyy-mm.com.mydomain:target2
        Lun 0 Path=/dev/mapper/vg1-lv0,Type=fileio,ScsiId=tgt2-0
        Lun 1 Path=/dev/mapper/vg1-lv1,Type=fileio,ScsiId=tgt2-1
        #IncomingUser uname password
        #OutgoingUser uname password
        #Alias tgt2

The parameters used in this configuration file are described in the table below:

Table 2-3 ietd.conf Configuration File Options

iSNSServer: Specifies the IP address of the iSNS server (if iSNS has been enabled). Default = disabled.

iSNSAccessControl: Enables or disables iSNS name resolution. Default = no (disabled).

Target: Defines the name of the target (iqn:target_name). This is the first line of a Target definition and is required. The name must be unique on a given target server.

Lun: Defines the type of device or the path to a device being exported as a LUN. This option is required. Multiple Lun entries can be defined for a single target section. Each Lun must have a unique number assigned. If not, one will be automatically assigned.

IncomingUser: Defines a target-level username and password that will be required to connect to the target. This option is not required.

OutgoingUser: Defines a target-level username and password that is used to authenticate back to the initiator. This option is not required.

Alias: Assigns an alias name to the target. This option is not required. The alias name can be used instead of the IQN to refer to the target.

Install and Configure an iSCSI Target

SLES 11 includes the software required to set up an iSCSI target and initiator, but the packages are not installed by default. To install an iSCSI target on SLES 11, you need to complete the following tasks on the server where you want to create iSCSI target devices:

1. If necessary, insert your SLES 11 installation media in your server's optical drive and mount it.
2. Start YaST on the server and enter your root user's password when prompted.
3. In YaST, select Network Services > iSCSI Target.
4. When prompted to install the iscsitarget package, select Install.

Wait while the packages are installed. When the installation is complete, YaST displays the iSCSI Target Overview page with the Service tab selected, as shown below:

Figure 2-24 The iSCSI Target Overview in YaST

5. Under Service Start, select When Booting to ensure the target service loads each time the server is started.
6. Select Open Port in Firewall.
7. Select Finish.
8. When prompted to restart the iSCSI target service, select Yes.

With the target software installed, you next need to configure the service itself as well as any targets you want it to provide. You need to prepare the storage space you want to use as a target device. You can use an entire hard disk, a logical volume, or a file. If you choose to use a partition, you need to first create it, but don't format it and don't mount it in the file system. An example of doing this with the YaST Partitioner module is shown below:

Figure 2-25 Creating an iSCSI Partition

If you want to use a file on the server as the target device (for instance, a 1 GB file, as in the following example), you could create it using the following command:

dd if=/dev/zero of=/iscsi/first-device bs=1M count=1024

This creates an empty 1 GB file in the /iscsi directory named first-device.

Once done, you need to configure the file or partition you created as an iSCSI target. To do this in YaST, complete the following:

1. Start YaST on the server and provide your root user's password when prompted.

2. Select Network Services > iSCSI Target. YaST opens to the iSCSI Target Overview page with the Service tab selected.
3. If authentication is required to connect to the target devices you are configuring on this server, do the following:

   a. Select the Global tab. The following is displayed:

Figure 2-26 The Global Tab in the iSCSI Target Configuration YaST Module

   b. Deselect No Authentication to enable authentication.

There are two authentication options that you can configure. One option is to require that an initiator prove that it has permission to run a discovery on the iSCSI target. This is done using Incoming Authentication. The other option is to require the iSCSI target to prove to the initiator that it is the expected target. In this case, the iSCSI target can also provide a username and password. This is done using Outgoing Authentication.

   c. If you want to use incoming authentication, verify that Incoming Authentication is selected.
   d. If Incoming Authentication is selected, select Add; then enter a username and password for authentication and select OK.

NOTE: You can repeat this step to use multiple credentials for incoming authentication.

   e. If you want to use outgoing authentication, verify that Outgoing Authentication is marked and then enter a username and password.

4. Configure your iSCSI target devices by doing the following:

   a. Select the Targets tab. The following is displayed:

Figure 2-27 The Targets Tab in the iSCSI Target Overview YaST Module

   b. If you have not already done so, delete the example iSCSI target from the list and confirm the deletion by selecting Continue.
   c. Select Add to add a new iSCSI target. The Target and Identifier fields are automatically populated for you, as shown below:

Figure 2-28 Adding a New iSCSI Target

   d. Select Add. The following is displayed:

Figure 2-29 Configuring an iSCSI Target Device

   e. Enter the path and filename of the disk, partition, or file you want to use as the target device in the Path field.

For example, if you wanted to use the first partition on the second SCSI hard disk in the system, you would enter /dev/sdb1 in the Path field. If you wanted to use a file as the target device, you would mark Type=fileio, and then enter the file's name and path in the Path field. For example: /iscsi/first-device.

NOTE: You can also manually specify the sectors to allocate to a LUN. If you need additional options, select Expert Settings.

   f. Select OK.
   g. Repeat this process for each target you want to create.
   h. When done, select Next. The following is displayed:

Figure 2-30 Configuring Authentication for an iSCSI Target

   i. Configure your authentication requirements for the target; then select Next.

5. Select Finish.

When you're done, you should have a new entry added to your /etc/ietd.conf file that appears similar to the following:

Target iqn.yyyy-mm.com.digitalairlines:0ec16387-ce25-447e-8bbb-9cdaff388e6b
        Lun 0 Path=/iscsi/first-device,Type=fileio
        Lun 1 Path=/dev/sdb1
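The /iscsi/first-device file referenced in this entry is the backing file created earlier with dd. A scaled-down sketch of that step (16 MB instead of the 1 GB used in the text, and a /tmp path so it can run anywhere):

```shell
# Create a small zero-filled backing file for a fileio LUN.
backing=/tmp/first-device     # the text uses /iscsi/first-device
dd if=/dev/zero of="$backing" bs=1M count=16 2>/dev/null

# Confirm the size in bytes before exporting it as a LUN.
stat -c '%s' "$backing"
```

Verifying the size before export is worthwhile: the initiator will see exactly this many bytes as the LUN's capacity.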

If desired, you can also use a text editor to edit the ietd.conf file and manually define your iSCSI targets. Define your iSCSI targets and LUNs with the Target and Lun parameters, as shown above. The target declaration always starts with a Target identifier followed by LUN definitions. The syntax to be used is shown below:

Target iqn.yyyy-mm.<reversed domain name>[:identifier]
        Lun 0 Path=path
        Lun 1 Path=path
        Lun 2 Path=path

You can create a Lun definition in the following ways:

Logical volume. Maps the LUN to an LVM volume. For example:
Lun 0 Path=/dev/mapper/system-v3

Disk or disk partition. Maps the LUN to a hard disk or disk partition. For example:
Lun 1 Path=/dev/sda4

File. Maps the LUN to an image file in the file system. For example:
Lun 2 Path=/iscsi/first-device,Type=fileio

TIP: All parameters before the first Target declaration in the file are global. Parameters within a Target declaration apply only to that target definition.

If you want to configure authentication, you need to edit the IncomingUser and OutgoingUser parameters within the Target definitions in the file. Both parameters use the same syntax:

IncomingUser username password
OutgoingUser username password

TIP: If you place the IncomingUser and OutgoingUser parameters before the first Target declaration, they will be used to configure authentication for the discovery of the iSCSI target.

TIP: When configuring incoming authentication for the target, you have to use outgoing authentication on the initiator, and vice versa.

When complete, save your changes to the file and exit the text editor. Then restart the iscsitarget daemon by entering rciscsitarget restart at the shell prompt (as root).

Once your target definitions have been configured, you can view your iSCSI configuration in the /proc/net/iet file system.
The volume file in this directory lists your target definitions and LUNs. An example is shown below:

DA1:/proc/net/iet # cat volume
tid:1 name:iqn.yyyy-mm.com.digitalairlines:0ec16387-ce25-447e-8bbb-9cdaff388e6b
        lun:0 state:0 iotype:fileio iomode:wt path:/iscsi/first-device
        lun:1 state:0 iotype:fileio iomode:wt path:/dev/sdb1
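Scripts can pull the LUN-to-path mapping out of this file. The sketch below runs against an inline sample in the same key:value format, so it does not need a live target; the target name and paths are hypothetical:

```shell
# Map each LUN's backing path to its target IQN, given
# /proc/net/iet/volume-style input (sample data, not a live system).
sample='tid:1 name:iqn.yyyy-mm.com.example:target1
  lun:0 state:0 iotype:fileio iomode:wt path:/iscsi/first-device
  lun:1 state:0 iotype:fileio iomode:wt path:/dev/sdb1'

mapping=$(echo "$sample" | awk '
/^tid:/ { sub(/^name:/, "", $2); target = $2 }          # remember current target
/lun:/  { p = $NF; sub(/^path:/, "", p); print target, p }')

echo "$mapping"
```

On a live system you would replace the sample with `cat /proc/net/iet/volume`; the parsing is identical.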

The session file in this directory displays active sessions. An entry is added to the file for each connected initiator. An example is shown below:

DA1:/proc/net/iet # cat session
tid:1 name:iqn.yyyy-mm.com.digitalairlines:0ec16387-ce25-447e-8bbb-9cdaff388e6b
        sid: initiator:iqn.1996-04.de.suse:cn=da2.digitalairlines.com,01.9ff842f5645
                cid:0 ip: state:active hd:none dd:none
        sid: initiator:iqn.1996-04.de.suse:01.6f7259c88b70
                cid:0 ip: state:active hd:none dd:none

When you make changes to the iSCSI target configuration file, you must restart the target daemon to activate them. However, be aware that if you do this, all active sessions will be interrupted. To make changes without interrupting active sessions, you can use the ietadm administration utility. Do the following:

1. Modify your /etc/ietd.conf configuration file with the changes you want to make, but don't restart the target daemon.
2. Use the ietadm utility from the shell prompt to implement those same changes to the running target daemon.

This utility uses the following identifiers:

target ID (tid)
session ID (sid)
connection ID (cid)

For example, if you wanted to create a new target, you would enter the ietadm command using the following syntax:

ietadm --op new --tid=target_number --params Name=target_name

If you wanted to create a new LUN for an existing target, you would enter the ietadm command using the following syntax:

ietadm --op new --tid=target_number --lun=n --params Path=path_to_device

If you wanted to configure authentication, you would enter the ietadm command using the following syntax:

ietadm --op new --tid=target_number --user --params=IncomingUser=username,Password=password

You could also use the OutgoingUser parameter with the above command to configure outgoing authentication.

You can also delete active connections with ietadm. First, check all active connections by viewing the contents of /proc/net/iet/session.
Then enter the following command at the shell prompt:

ietadm --op delete --tid=target_number --sid=sid_of_connection --cid=connection_id
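Finding the right --tid, --sid, and --cid values can be scripted against the session listing. The sketch below parses an inline sample in the /proc/net/iet/session format and assembles the delete command; all identifier values (sid, ip, target name) are hypothetical:

```shell
# Extract tid, sid, and cid from session-format input and build the
# matching ietadm delete command (sample data, not a live system).
session='tid:1 name:iqn.yyyy-mm.com.example:target1
  sid:281474980511744 initiator:iqn.1996-04.de.suse:01:9c83a3e15f64
    cid:0 ip:10.0.0.52 state:active hd:none dd:none'

eval "$(echo "$session" | awk '
$1 ~ /^tid:/ { sub(/^tid:/, "", $1); print "tid=" $1 }
$1 ~ /^sid:/ { sub(/^sid:/, "", $1); print "sid=" $1 }
$1 ~ /^cid:/ { sub(/^cid:/, "", $1); print "cid=" $1 }')"

echo "ietadm --op delete --tid=$tid --sid=$sid --cid=$cid"
```

On a live system you would read from /proc/net/iet/session instead of the sample, and pick the entry for the initiator whose connection you intend to drop.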

WARNING: Be aware that this will cause the device to become inaccessible on the client system, and any processes accessing the device are likely to hang.

Remember, the changes that you make with the ietadm utility are not persistent. They will be lost at the next reboot if they are not added to the /etc/ietd.conf configuration file.

Implementing an iSCSI Initiator

With the iSCSI target configured, you next need to implement iSCSI initiators. The iSCSI initiator is used to connect to any iSCSI target. The configuration of the iSCSI initiator involves two major steps:

The discovery of available iSCSI targets
The setup of an iSCSI session

The term discovery refers to the process of requesting the targets available on an iSCSI target server. The terms login and logout refer to the process of connecting to or disconnecting from an iSCSI target. A session is created when an initiator logs in to a target.

Before an iSCSI initiator can connect to a target, it must learn about the target through the discovery process. There are three discovery methods defined:

SendTargets: This is the native method of discovery, in which a query is sent directly to the iSCSI target server.

iSNS: In this method, a query is sent to an iSNS server requesting a list of targets that have been registered and are in the initiator's domain.

SLP: In this method, the initiator uses SLP to discover iSCSI targets. This option is not currently supported.

Whichever discovery method is used, authentication to the target server may be required to see the targets hosted on that server. Just because an initiator knows a target is available doesn't mean it can actually access the LUNs hosted by that target.

All information that was discovered by the iSCSI initiator is stored in two database files that reside in /etc/iscsi/iqn/ip:port/default/.
There is one database for the discovery of targets and one for discovered nodes. To connect to the LUNs associated with a target, the initiator must first log in to the target. After a successful login, all LUNs hosted by the target will appear as SCSI disks in /dev/ on the initiator system.

The iSCSI initiator in SLES 11 is provided by a daemon called open-iscsi, which creates a virtual SCSI host bus adapter in the system. To install this daemon on a SLES 11 system, you must install the following RPM packages:

open-iscsi
yast-iscsi-client

The binary file for the open-iscsi daemon is /sbin/iscsid. The init scripts for this daemon are /etc/init.d/open-iscsi and /etc/init.d/boot.open-iscsi. Its configuration file is /etc/iscsid.conf, which is actually just a symbolic link to the /etc/iscsi/iscsid.conf file. You can manage the initiator daemon using the iscsiadm command-line utility and the YaST iscsi-client module.

To install and configure the open-iscsi daemon on a SLES 11 system, complete the following:

1. Start YaST and enter your root user's password when prompted.
2. In YaST, select Network Services > iSCSI Initiator.
3. When prompted to install the open-iscsi package, select Install.

Wait while the packages are installed. When complete, YaST displays the iSCSI Initiator Overview page with the Service tab selected, as shown below:

Figure 2-31 The YaST iSCSI Initiator Module

4. Under Service Start, select When Booting to automatically start the initiator service when the server reboots.
5. Verify the initiator name that was automatically generated for you. You can edit it, if desired. The initiator name must be globally unique on your network. The IQN uses the following syntax:

iqn.yyyy-mm.top_level_domain.domain:n1:n2

The n1 and n2 parameters are alphanumeric characters. For example:

iqn.1996-04.de.suse:01:9c83a3e15f64

The Initiator Name is automatically generated using the corresponding value from the /etc/iscsi/initiatorname.iscsi configuration file on the server. If the server has iSCSI Boot Firmware Table (iBFT) support, the Initiator Name is created using the corresponding value in the iBFT. The iBFT is a block of information containing various parameters required to boot from a remote iSCSI device, including iSCSI target and initiator descriptions for the server. If this is the case, you will not be able to change the initiator name in the YaST interface. You must use your server's BIOS Setup program instead to modify it.

6. Specify the method that should be used to discover iSCSI targets on the network. You can choose from the following:

iSNS
Discovered Targets

To use iSNS, specify the IP address of the iSNS server and the iSNS service port number on the Service tab.

NOTE: iSNS server installation and configuration is discussed later in this objective.

7. If you want to use the Discovered Targets option instead of iSNS, complete the following:

   a. Select the Discovered Targets tab.
   b. Select Discovery to open the iSCSI Initiator Discovery dialog, shown below:

Figure 2-32 Configuring Discovered Targets

   c. Enter the IP address of the target server in the IP Address field.
   d. If necessary, deselect No Authentication; then enter the authentication credentials required to access the target server.
   e. Select Next to connect to the iSCSI target server and start the discovery process. When complete, you should see a list of iSCSI targets on the target server, as shown below:

Figure 2-33 Discovering iSCSI Targets

NOTE: If you see a time-out error, verify that port 3260 has been opened in the host-based firewall on both the target and initiator systems.

   f. Select Log In to activate the target. You will be prompted to enter authentication credentials to access the selected iSCSI target.
   g. If required, enter your authentication credentials; then select Next. The connected target should now appear in the Connected Targets tab, and the virtual iSCSI device should be available for mounting in the file system. This is shown below:

Figure 2-34 Connected to an iSCSI Target From an iSCSI Initiator

   h. If you want to change the startup preference for the target, select the target name in the list provided; then select Toggle Startup and select one of the following:

Automatic. This option is used for iSCSI targets that are to be connected when the iSCSI service itself starts up.

Onboot. This option is used for iSCSI targets that are to be connected during boot. If selected, this option causes the iSCSI target device to be connected on server boot.

Manual. This option requires you to manually connect to the target.

   i. Select Finish.

8. Open a terminal session on the initiator, switch to your root user using the su - command, and then enter lsscsi at the shell prompt. When you do, you should see a list of iSCSI LUNs available to you on the target server with their associated device files in /dev/. These iSCSI devices are identified using the IET label. An example of two LUNs on a connected iSCSI target is shown below:

DA2:~ # lsscsi
[0:0:0:0]  disk    VMware,  VMware Virtual S 1.0  /dev/sda
[2:0:0:0]  cd/dvd  NECVMWar VMware IDE CDR        /dev/sr0
[3:0:0:0]  disk    IET      VIRTUAL-DISK     0    /dev/sdb
[3:0:0:1]  disk    IET      VIRTUAL-DISK     0    /dev/sdc
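Picking the IET-backed devices out of this listing can be automated. The sketch below filters a captured sample of the output above (the sample text is hard-coded so it runs without a live session; device names are as in the example):

```shell
# Select device nodes whose SCSI vendor field is IET (iSCSI LUNs),
# working from captured lsscsi output rather than a live system.
lsscsi_out='[0:0:0:0] disk   VMware,  VMware Virtual S 1.0 /dev/sda
[2:0:0:0] cd/dvd NECVMWar VMware IDE CDR       /dev/sr0
[3:0:0:0] disk   IET      VIRTUAL-DISK     0   /dev/sdb
[3:0:0:1] disk   IET      VIRTUAL-DISK     0   /dev/sdc'

iet_devices=$(echo "$lsscsi_out" | awk '$3 == "IET" { print $NF }')
echo "$iet_devices"
```

On a live initiator you would pipe the real `lsscsi` output into the same awk filter; matching on the vendor column is more robust than guessing device names, since /dev/sdN assignments can change between boots.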

At this point, your local operating system can manage the iSCSI target device as if it were a locally attached hard disk drive. You can use the YaST Partitioner module or the fdisk command-line utility to create partitions on the remote device and create a file system on it. You can also use the space provided by the device in an LVM configuration. Once done, of course, you must mount it in the local server's file system.

In the example below, the remote iSCSI target mapped to /dev/sdb has been partitioned (/dev/sdb1) and has had an ext3 file system created on it. It has also been mounted in the /iscsi/ directory on the initiator.

da2:~ # mount
/dev/sda2 on / type ext3 (rw,acl,user_xattr)
...
/dev/sr0 on /media/suse_sles type iso9660 (ro,nosuid,nodev,uid=1000)
/dev/sdb1 on /iscsi type ext3 (rw)

You can also connect to a remote iSCSI target using the iscsiadm command-line utility. It can be used to interface with the open-iscsi daemon and the database and can perform all required iSCSI initiator functions. The iscsiadm utility can be used in three different modes:

Discovery. This mode is used to discover targets on an iSCSI target server. The syntax is as follows:

iscsiadm -m discovery options

You can use the following options in discovery mode:

-t. Specifies the discovery type. You can use a value of sendtargets (st), isns, or slp (not currently supported) with this option.
-p. Specifies the portal (IP_address:port). The default TCP port is 3260.
-l. Initiates login using the target's global username and password.

Several examples of using the iscsiadm utility are listed below:

iscsiadm -m discovery -t st -p IP_address
This example discovers targets on an iSCSI target server using the sendtargets option.

iscsiadm -m discovery -t isns
This example discovers targets using an iSNS server.

Node. This mode is used to interact with discovered targets on a node.
The syntax is as follows:

iscsiadm -m node options

You can use the following options in node mode:

-T. Specifies the target name (IQN).
-l. Initiates login using the target's username and password.

-u. Initiates logout.
-p. If a portal is provided, this option causes all nodes listening on that portal to be displayed.

Several examples of using the iscsiadm utility in node mode are shown below:

iscsiadm -m node -T iqn.yyyy-mm.com.da:target1 -l
This example logs in to the target.

iscsiadm -m node -T iqn.yyyy-mm.com.da:target1 -u
This example logs out of the target.

iscsiadm -m node
This example lists all discovered targets.

iscsiadm -m node -o delete
This example deletes all discovered targets.

iscsiadm -m node -o delete -T iqn.yyyy-mm.com.da:target1
This example deletes the specified discovered target.

Session. This mode is used to view information about current iSCSI sessions. The syntax is as follows:

iscsiadm -m session options

You can use the following options in session mode:

-r. Displays statistics about a specific session, identified by its session ID (sid).
-s. Displays statistics about all sessions.

Several examples of using the iscsiadm utility in session mode are shown below:

iscsiadm -m session
This example displays a list of all current sessions.

iscsiadm -m session -r 3
This example displays statistics about a specific session with a session ID of 3.

Implementing an iSNS Server

For iSCSI initiators to access an iSCSI target, the initiator must first discover the target. This can be problematic because a SAN may be composed of many storage devices that are dispersed across complex networks. iSCSI initiators must be able to identify storage resources in the SAN and determine whether they have access to them. Currently there are three defined modes of target discovery:

Traditional iSCSI. Initiators manually initiate discovery.

Internet Storage Name Service (iSNS). Initiators discover targets from an iSNS server.
SLP. Initiators discover targets with the SLP protocol (not currently supported).

iSNS provides automated discovery, management, and configuration of iSCSI devices on an IP network in a manner comparable to that found in traditional Fibre Channel SANs.

WARNING: iSNS should be used only in a secure internal network.

In this part of this objective, you learn how to implement iSNS. The following topics are addressed:

How iSNS Works on page 130
Installing and Configuring iSNS on page 132

How iSNS Works

iSNS removes the need for initiators to know the actual IP addresses of the iSCSI target servers. If iSNS has been implemented, the initiator can send a query to the iSNS server, which returns a list of iSCSI targets, along with their associated IP addresses, that the initiator has permission to access.

iSNS also allows iSCSI initiators and targets to be organized into zones in a manner similar to a Fibre Channel fabric. iSNS allows you to create iSNS discovery domains and discovery domain sets. A discovery domain is a group of iSCSI nodes. An iSCSI initiator node can only discover targets that reside within the same discovery domain it resides in. Discovery domains should be used to group together targets that need to be discovered by a common set of initiators.

A discovery domain set is a group of discovery domains. You group and organize your iSCSI targets and initiators into discovery domains and then group the discovery domains into discovery domain sets. This is shown below:

Figure 2-35 iSNS Discovery Domains and Discovery Domain Sets

By dividing storage nodes into domains, you can limit the discovery process of each host to the most appropriate subset of targets registered with iSNS. This allows the storage network to scale by reducing the number of unnecessary discoveries and by limiting the amount of time each host spends establishing discovery relationships.

Consider the following example. Suppose you have 100 iSCSI initiators and 100 iSCSI targets. Depending upon your configuration, all iSCSI initiators could potentially try to discover and connect to any or all of the 100 iSCSI targets. By grouping initiators and targets into discovery domains, you can prevent iSCSI initiators in one department from discovering the iSCSI targets in another department. The result is that the iSCSI initiators in a specific department only discover those iSCSI targets that are part of the department's discovery domain.

Both iSCSI targets and iSCSI initiators can use iSNS clients to communicate with iSNS servers using the iSNS protocol. Using the iSNS client, they can:

Register device attribute information in a common discovery domain
Download information about other registered clients
Receive asynchronous notification of events that occur in their discovery domain

iSNS servers perform the following tasks:

Respond to iSNS queries and requests sent from iSNS clients using the iSNS protocol.

Initiate iSNS protocol state change notifications.
Store properly authenticated information submitted by a registration request in an iSNS database.

Some of the benefits provided by iSNS include the following:

Provides a central location for registration, discovery, and management of networked storage assets
Integrates with the DNS infrastructure
Improves scalability compared to other discovery methods

At this point, there are also a few things that iSNS cannot do yet. These include the following:

iSNS does not support multipathing.
iSNS currently only supports a single server, which is a master with no slave. No replication or distribution of the service is available.
iSNS currently does not support dynamic discovery or rediscovery of targets. There is no notification when a target is added to or removed from a discovery domain.

Before proceeding, there are several iSNS-related terms that you need to be familiar with:

Network entity: A network entity is a device connected to an IP network that hosts one or more iSCSI nodes (such as a physical or virtual server).

Network portal: The network portal is the IP address and port through which an iSCSI node is accessed in a network entity.

iSCSI node: An iSCSI node is an iSCSI entity, either a target or initiator. A network entity can host one or more iSCSI nodes, which are accessible through one or more network portals.

Installing and Configuring iSNS

iSNS Server for Linux is included with SLES 10 SP2 and later, but it is not installed by default. To install iSNS, you must first install the iSNS packages (isns and yast2-isns) and then configure the iSNS service. iSNS can be installed on a dedicated server or on the same server where an iSCSI target or initiator has been configured. To install iSNS, complete the following tasks:

1.
If necessary, insert your SLES 11 installation media into your server's optical drive and mount it.
2. Start YaST on the server.
3. When prompted, enter your root user's password.
4. Select Network Services > iSNS Server.
5. When prompted to install the iSNS packages, select Install.

Wait while the necessary packages are installed. When complete, the following is displayed:

Figure 2-36 Configuring the iSNS Service

6. In the Address of iSNS Server field, enter the DNS name or IP address of the iSNS Server.
7. Under Service Start, select one of the following:

When Booting. If you select this option, the iSNS service starts automatically on server startup.

Manually (Default). If you select this option, the iSNS service must be started manually by entering one of the following commands at the shell prompt of the server where it is installed (as root):

rcisns start
/etc/init.d/isns start

8. Select Open Port in Firewall.
9. Select Finish.

WARNING: At the time this course was written, the YaST iSNS module does not create the /etc/isns/isns.conf file correctly after selecting Finish. You must manually create this file and add the following lines with a text editor:

isns.address = your_isns_server_ip_address
isns.port =

After creating this file and adding these lines, restart the iSNS service by entering rcisns restart at the shell prompt.

At this point, the iSNS service is installed and running on your system. You can verify this by entering (as root) rcisns status at the shell prompt. You should see output similar to the following:

da1:~ # rcisns status
Checking for iSNS Internet Storage Naming Service:     running
da1:~ #

However, it hasn't been configured yet. To configure iSNS, you need to complete the following series of tasks:
- Create iSNS Discovery Domains on page 134
- Create iSNS Discovery Domain Sets on page 136
- Add iSCSI Nodes to a Discovery Domain on page 139

Create iSNS Discovery Domains

The first thing you need to do is create your iSNS discovery domains. An iSNS server must have at least one discovery domain and one discovery domain set configured. Accordingly, a default discovery domain named default DD is automatically created when you install the iSNS service. Any existing iSCSI targets and initiators that have been configured to use iSNS are automatically added to this default discovery domain. More than likely, you will want to create your own custom discovery domains. To create a new discovery domain, complete the following:

1. Start YaST and select Network Services > iSNS Server.
2. Select the Discovery Domains tab. When you do, the following is displayed:

Figure 2-37 The Discovery Domains Tab

The Discovery Domains field lists all configured discovery domains. You can create new discovery domains or delete existing domains. Deleting a domain removes the members from the domain, but it does not delete the iSCSI node members themselves.

The Discovery Domain Members field lists all iSCSI nodes assigned to the selected discovery domain. Selecting a different discovery domain refreshes the list with members from that discovery domain. You can add iSCSI nodes to or delete nodes from the selected discovery domain. When an iSCSI initiator performs a discovery request, the iSNS service returns all iSCSI node targets that are members of the same discovery domain.

3. Select Create Discovery Domain. The following is displayed:

Figure 2-38 Creating a New Discovery Domain

4. Enter a name for the new discovery domain you are creating; then select OK.
5. Repeat the above steps to create any additional discovery domains you want to add.

Create iSNS Discovery Domain Sets

Once you've created your discovery domains, you next need to create iSNS discovery domain sets. All discovery domains must belong to a discovery domain set. After you create a new discovery domain and add nodes to it, you must add it to a discovery domain set for it to be active. A default discovery domain set named default DDS is automatically created when you install iSNS. The default discovery domain (default DD) is added to this domain set automatically. To create a new discovery domain set, you need to do the following:

1. If necessary, start YaST and select Network Services > iSNS Server.
2. Select the Discovery Domain Sets tab. The following is displayed:

Figure 2-39 The Discovery Domain Sets Tab

The Discovery Domain Sets field lists all configured discovery domain sets. In the iSNS database, a discovery domain set contains discovery domains, which in turn contain iSCSI node members.

NOTE: Remember, a discovery domain must be a member of a discovery domain set in order to be active.

The Discovery Domain Set Members field lists all discovery domains that are assigned to the selected discovery domain set. You can add discovery domains to or delete discovery domains from the currently selected discovery domain set. Deleting a discovery domain removes it from the domain set, but it does not delete the discovery domain itself.

3. Select Create Discovery Domain Set. The following is displayed:

Figure 2-40 Creating a New Discovery Domain Set

4. Enter a name for the discovery domain set you are creating; then select OK.
5. Add discovery domains to the new discovery domain set by selecting the new discovery domain set and then selecting Add Discovery Domain. When you do, a screen similar to the following is displayed:

Figure 2-41 Adding Discovery Domains to a Discovery Domain Set

6. Select the discovery domain you want to add to the discovery domain set; then select Add Discovery Domain.
7. Repeat this process to add additional discovery domains to the discovery domain set.
8. When complete, select Done. You should now see that the discovery domains you selected are members of the discovery domain set. An example is shown below:

Figure 2-42 Viewing Discovery Domain Set Members

Add iSCSI Nodes to a Discovery Domain

At this point, your discovery domain and discovery domain set architecture is complete. The next thing you need to do is add iSCSI nodes to your discovery domains. If configured to do so, an iSCSI target server registers its targets with the iSNS server when the iSCSI target daemon starts. Likewise, iSCSI initiators also register themselves with the iSNS server when they start.

To configure an iSCSI target to register with the iSNS server, do the following:

1. On the target server, start YaST and select Network Services > iSCSI Target.
2. On the Service tab, select iSNS Access Control.
3. In the iSNS Server field, enter the IP address of your iSNS server. An example is shown below:

Figure 2-43 Configuring an iSCSI Target to Use iSNS

4. Select Finish.

Once done, the targets on the server should be displayed on the iSCSI Nodes tab in the YaST iSNS Server module, as shown below:

Figure 2-44 Viewing iSCSI Nodes Registered with the iSNS Server

You can also configure the iSNS server settings for the target manually by editing the /etc/ietd.conf file and adding the following parameters before the first target declaration:

isnsserver isns_server_ip_address
isnsaccesscontrol Yes

An example is shown below (the server's IP address follows the isnsserver keyword):

isnsserver
isnsaccesscontrol Yes
Target iqn com.digitalairlines:0ec16387-ce25-447e-8bbb-9cdaff388e6b
    Lun 0 Path=/iscsi/first-device,Type=fileio
    Lun 1 Path=/dev/sdb1

After making the change, you will need to restart the iscsitarget daemon on the target server using its init script in /etc/init.d or its rc script (rciscsitarget).

To configure an iSCSI initiator to use an iSNS server, do the following:

1. Start YaST on the initiator server and select Network Services > iSCSI Initiator.
2. In the iSNS Address field on the Service tab, enter the IP address of your iSNS server.
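Because the iSNS parameters must appear before the first target declaration, the manual edit to /etc/ietd.conf can be sketched as a simple text transformation. The helper below is hypothetical (it is not part of the courseware or of any iSCSI tool); it only illustrates the placement rule.

```python
def add_isns_params(ietd_conf_text, isns_ip):
    """Insert isnsserver/isnsaccesscontrol lines before the first Target declaration."""
    params = ["isnsserver {0}".format(isns_ip), "isnsaccesscontrol Yes"]
    lines = ietd_conf_text.splitlines()
    for i, line in enumerate(lines):
        if line.lstrip().startswith("Target "):
            # Place the iSNS parameters just ahead of the first target block.
            return "\n".join(lines[:i] + params + lines[i:]) + "\n"
    # No target declared yet: append the parameters at the end.
    return "\n".join(lines + params) + "\n"

sample = "Target iqn.example:disk1\n    Lun 0 Path=/dev/sdb1\n"
print(add_isns_params(sample, "10.0.0.1"))
```

After writing the modified file, the iscsitarget daemon would still need to be restarted (rciscsitarget restart) as described above.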

3. In the iSNS Port field, enter the port used by your iSNS server. The default is port 3205. An example is shown below:

Figure 2-45 Configuring an iSCSI Initiator to Use iSNS

4. Select Finish.

You can also manually edit the /etc/iscsid.conf file and add the following options to the file:

isns.address = isns_server_ip_address
isns.port = 3205

An example is shown below:

################
# isns settings
################
# Address of isns server
isns.address =
isns.port = 3205

WARNING: As of the time this course was written, the YaST initiator module doesn't correctly modify the /etc/iscsid.conf file with the IP address of the iSNS server after selecting Finish. To fix this problem, open the file in a text editor and uncomment the following lines in the file:

#isns.address =
#isns.port = 3205

Then set the isns.address line to the IP address of your iSNS server. Save your changes to the file and exit your text editor. After making changes to the file, you need to restart the open-iscsi daemon using its init script in /etc/init.d or its rc script, for example by entering rcopen-iscsi restart at the shell prompt.
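The manual fix described in the warning amounts to uncommenting two lines and substituting an address. A hypothetical Python sketch of that edit follows; the function name and the sample text are illustrative and not part of open-iscsi.

```python
def enable_isns(iscsid_conf_text, isns_ip, port=3205):
    """Uncomment the isns.address/isns.port lines and set the iSNS server address."""
    out = []
    for line in iscsid_conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#isns.address") or stripped.startswith("isns.address"):
            out.append("isns.address = {0}".format(isns_ip))
        elif stripped.startswith("#isns.port") or stripped.startswith("isns.port"):
            out.append("isns.port = {0}".format(port))
        else:
            out.append(line)  # leave all other settings untouched
    return "\n".join(out) + "\n"

sample = "# Address of isns server\n#isns.address = 192.168.1.1\n#isns.port = 3205\n"
print(enable_isns(sample, "10.0.0.1"))
```

As noted above, the open-iscsi daemon would still need to be restarted (rcopen-iscsi restart) for the change to take effect.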

Once done, these targets and initiators must be grouped into discovery domains. You should place the iSCSI initiators and the targets you want them to be able to discover into the same discovery domain. To do this, complete the following:

1. If necessary, start YaST and select Network Services > iSNS Server.
2. Select the iSCSI Nodes tab.
3. Review the list of nodes displayed to make sure that the iSCSI targets and initiators that you want to iSNS-enable are listed.

If the desired iSCSI target or initiator is not listed, you might need to restart the iSCSI service on the node. You can do this by running one of the following commands at the shell prompt of the appropriate system (as root):
- Restart an iSCSI initiator: rcopen-iscsi restart
- Restart an iSCSI target: rciscsitarget restart

The iSCSI node should be automatically added to the iSNS database when you restart the iSCSI service or reboot the server, unless you comment out the iSNS portion of the iSCSI configuration file.

You can also select an iSCSI node and then select Delete to remove that node from the iSNS database. This is useful if you are no longer using an iSCSI node or have renamed it.

4. Select the Discovery Domains tab.
5. Select the desired discovery domain.
6. Select Add Existing iSCSI Node; then select the node you want to add to the domain and select Add Node.
7. Repeat the above step to add each node you want to include in the discovery domain.

NOTE: Every iSCSI node must be in at least one discovery domain. A node can belong to multiple discovery domains at the same time.

8. Select Done. You should see that the node is now a discovery domain member, as shown below:

Figure 2-46 Adding iSCSI Nodes as Discovery Domain Members

NOTE: Remember, for an initiator to be allowed access to a target, they must both be members of the same discovery domain.

9. Select Finish.

At this point, the configuration is complete. You can now perform discovery from iSCSI initiators using information from the iSNS server. An example is shown in the figure below:

Figure 2-47 Performing Discovery Using an iSNS Server

Exercise 2-1 Configure a SAN with iSCSI

In this lab, you configure a dedicated SAN using iSCSI. You will find this lab in the workbook.

(End of Exercise)

Summary

Objective: Manage SCSI Devices on Linux

The SCSI standard defines the commands, protocols, and interfaces needed to create a storage I/O bus. The devices on the SCSI bus are managed by the SCSI controller, which provides an interface between the motherboard and the I/O bus. Depending upon the type of SCSI system, you can connect 8, 16, or 32 devices to the SCSI bus.

Devices on the SCSI bus are identified by a unique identifier called the SCSI ID, which is used by the SCSI controller to route data to or from the appropriate device. The ID also specifies the device's priority on the bus. The overall sequence of SCSI ID priorities for a bus that supports 16 devices is 7, 6, 5, 4, 3, 2, 1, 0, 15, 14, 13, 12, 11, 10, 9, 8.

Termination is used on the SCSI bus to capture and absorb electrical signals when they reach either end of the bus cabling, to prevent them from reflecting back down the bus and corrupting data.

Commonly used SCSI standards include:
- SCSI-I
- SCSI-II
- SCSI-III
- Serial Attached SCSI (SAS)

The Linux kernel uses a three-layer architecture to manage the SCSI subsystem. Each SCSI operation uses a driver from each layer in the architecture. Layer 3 provides drivers specific to the type of SCSI device being addressed:
- sd and sr: Block device interfaces (hard disks and optical drives)
- st and sg: Character device interfaces (scanners, optical disc burners, audio CDs, USB, and FireWire devices)

Common SCSI commands include lsscsi, sg_scan, sg_map, sg_rbuf, sg_test_rwbuf, sg_turs, sg_inq, sginfo, and sg_format.
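The priority sequence quoted above follows a simple rule: IDs 7 down to 0 outrank IDs 15 down to 8. A small sketch, covering only the 8- and 16-device cases described in this summary (the function name is illustrative):

```python
def scsi_priority_order(bus_width=16):
    """Return SCSI IDs from highest to lowest priority, as listed in the summary."""
    order = list(range(7, -1, -1))       # IDs 7..0: the highest-priority block
    if bus_width >= 16:
        order += list(range(15, 7, -1))  # IDs 15..8 follow, in descending order
    return order

print(scsi_priority_order())
# [7, 6, 5, 4, 3, 2, 1, 0, 15, 14, 13, 12, 11, 10, 9, 8]
```
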

Objective: Describe How Fibre Channel SANs Work

A SAN uses externally connected storage instead of directly attached or internal storage. Servers in your network access the external storage media over a network connection that is dedicated to the SAN. A SAN uses Fibre Channel networking components, which allow very fast data transfers measured in Gbps.

The storage media connected to the SAN appear to be locally attached devices to the local server operating system. The OS doesn't know that the storage device actually exists somewhere else on the SAN network. Currently, the infrastructure of choice for implementing a SAN is Fibre Channel, which is a high-speed serial I/O protocol used almost exclusively for storage networking.

A Fibre Channel SAN can be wired using one of three topologies:
- FC-P2P (Point-to-Point)
- FC-AL (Arbitrated Loop)
- FC-SW (Switched Fabric)

In a Fibre Channel SAN, a port is any device that is connected to the SAN network and is used for communications. The Fibre Channel standard defines several different types of ports, including the following:
- Node Port (N_port)
- Node Loop Port (NL_port)
- Fabric Port (F_port)
- Fabric Loop Port (FL_port)
- Fx_port
- Expansion Port (E_port)
- Generic Port (G_port)
- EX_port

A key benefit of SANs is that they allow you to implement clustering with your mission-critical servers. A server cluster is a group of redundantly configured servers that work together to provide clients with highly available access to important applications, services, and data while reducing unscheduled outages.

Objective: Implement a SAN with iSCSI

iSCSI offers a much lower cost alternative to using Fibre Channel. iSCSI allows you to use inexpensive commodity-grade Ethernet networking hardware; leverage your existing local area networking knowledge and expertise; use existing, familiar networking paradigms; and reuse your existing directly attached storage hardware.

With iSCSI, the operating system (which functions as an iSCSI initiator) sends SCSI commands to another server system on the network (which functions as an iSCSI target). The iSCSI target system redirects the SCSI commands to a directly attached storage device, which can be a partition, logical volume, or file.

iSCSI uses a client/server connection between the iSCSI initiator (client) and iSCSI target (server). iSCSI uses port 3260 by default. Remote devices are available on Linux as regular SCSI devices (/dev/sdn) and should be mounted with the _netdev option.

To control access, you can configure the iSCSI targets to require a username and password upon connection from an iSCSI initiator.

Because of the amount of data transferred over the network by iSCSI, it is important that you ensure enough network bandwidth is available to the iSCSI SAN. We strongly recommend that you use Gigabit Ethernet switches, wiring, and HBAs when implementing an iSCSI SAN. If the demand on the SAN is high, you may want to consider implementing a dedicated network for the iSCSI communications, in much the same way that Fibre Channel uses a dedicated network infrastructure.

The iSCSI target daemon (ietd) enables a server to host iSCSI targets. It allows block devices to be accessed over the network as if they were locally attached SCSI devices. Supported targets can be any block device or image file. An unrestricted number of targets can be configured on each server, with multiple LUNs per target. The iSCSI initiator is used to connect to any iSCSI target.
The configuration of an iSCSI initiator involves two major steps:
- The discovery of available iSCSI targets
- The setup of an iSCSI session

The term discovery refers to the process of requesting the targets available on an iSCSI target server. The terms login and logout refer to the process of connecting to or disconnecting from an iSCSI target. A session is created when an initiator logs in to a target.


SECTION 3 Work with Xen Virtualization

This section gives an overview of the existing virtualization technologies. It explains how to set up SUSE Linux Enterprise Server 11 (SLES 11) as a host for Xen virtual machines and how to set up SLES 11 within a Xen virtual machine. It also covers using iSCSI-based storage for Xen guests and the migration of Xen virtual machines between hosts.

Objectives

1. Understand Virtualization Technology
2. Implement SLES 11 as a Xen Host Server
3. Implement SLES 11 as a Xen Guest

Objective 1: Understand Virtualization Technology

Virtualization technology separates a running instance of an operating system from the physical hardware. Instead of running on a physical machine, the operating system runs in a so-called virtual machine. Multiple virtual machines share the resources of the underlying hardware.

The idea of virtualization is not new. Hardware platforms like IBM's pSeries or zSeries have supported virtualization for a long time, and software like VMware Workstation for x86-based systems has been available for many years. However, Intel- and AMD-based x86 systems have only recently begun to provide enough resources to run several virtual machines at the same time. In addition, the x86 instruction set has been expanded to allow virtualization in a way that was not possible earlier.

Due to these developments, virtualization is no longer limited to expensive hardware and mainframe computers. It is an option every system administrator can consider to
- Consolidate workloads from several physical machines that are under-utilized, thus reducing energy consumption
- Move workloads from one physical machine to another to allow maintenance without service downtime
- Move workloads to new physical hardware to meet increased demand without service downtime

A system administrator deciding which operating systems and services to run in a virtualized environment has many different options to choose from. They have to decide on
- Hardware best suited for the system
- Software for the virtualization itself
- Operating system and software running within virtual machines

In 2006, Microsoft and Novell announced a collaboration agreement that included improving interoperability as one of its main purposes, with virtualization named as one of the key areas.
As a result of this collaboration, the performance of SLES 11 running as a guest on a Windows Server 2008/Hyper-V host, as well as that of Windows Server 2008 running as a guest on a SLES 11/Xen server, was markedly improved. Other players in the virtualization arena include VMware, with various virtualization products, and Sun, with VirtualBox.

To make an informed decision and to work effectively with virtualization technologies, you need to understand the following:
- Virtualization Background on page 151
- Virtualization Products on page 158
- Virtualization Hardware on page 159
- Management of Virtual Machines in the Enterprise

Virtualization Background

The virtualization technology included with SLES 11 is called Xen. The following explanations refer to Xen as it is used on SLES 11. However, the basic concepts apply to other solutions as well.

Virtualization allows you to run multiple virtual systems on one physical machine. In comparison with non-virtualized physical hardware, virtualization provides the following advantages:
- Efficient hardware utilization: Often systems are not using the full potential of their hardware. When multiple virtual machines run on the same hardware, the resources are used more efficiently.
- Reduced downtime: Virtual machines can be migrated to a new physical host system. This reduces downtime in case of a hardware failure.
- Flexible resource allocation: Hardware resources can be allocated on demand. When the resource requirements of a virtual machine change, resource allocation can be adjusted or the virtual machine can be migrated to a different physical host.

Xen allows you to run multiple virtual machines on a single piece of x86-based Intel or AMD hardware. To understand how Xen works, you need to do the following:
- Understand Virtualization Methods on page 151
- Understand CPU Virtualization on page 153
- Understand the Xen Architecture on page 155
- Understand the Xen Networking Concept on page 156

Understand Virtualization Methods

You should understand the following virtualization methods:

Full Virtualization. The virtualization software emulates a full virtual machine, including all hardware resources. The operating system running in the virtual machine (guest OS) communicates with these resources as if they were physical hardware. VMware Workstation is a popular full virtualization software.

Figure 3-1 Full Virtualization

Xen supports full virtualization on specialized x86 hardware developed by Intel and AMD (Virtual Technology [VT] hardware). Intel and AMD extended the x86 standard to support virtualization. Full virtualization works with unmodified guest operating systems, including Microsoft Windows, but generates more overhead, resulting in weaker performance.

Para-Virtualization. Instead of emulating a full virtual machine, para-virtualization software provides an Application Programming Interface (API) which is used by the guest OS to access hardware resources. The guest OS must be aware that it runs in a virtual machine and must know how to access the API.

Figure 3-2 Para-Virtualization

Para-virtualization provides better performance because it does not emulate all hardware details. However, the guest OS needs to be modified to run with para-virtualization; therefore, only open source operating systems like Linux or BSD

can be installed. One exception is NetWare 6.5 SP7, which has been adjusted by Novell to run para-virtualized in a Xen virtual machine.

Another advantage of para-virtualization is the flexible resource allocation. Because the guest OS is aware of the virtual environment, Xen can, for example, change the memory allocation of a virtual machine on the fly without requiring a reboot of the virtual machine.

Progressive Paravirtual Mode. This is a hybrid of full and para-virtualization. By adding drivers, such as network and storage drivers, that can use para-virtual system calls, performance is improved, even if other parts of the operating system continue to access emulated hardware.

Figure 3-3 Progressive Para-Virtual Mode

The hardware has to be VT-enabled for Xen to be able to use progressive paravirtual mode.

Understand CPU Virtualization

Most modern CPU architectures support different levels of privilege, called rings. Ring 0 has the highest privilege; processes running in this ring are referred to as running in supervisor or kernel mode. Processes running in higher rings are referred to as running in user mode. The higher the number of the ring, the less privilege a process running in this ring has.

An x86-based CPU supports four levels of privilege, but in practice only rings 0 and 3 are used. In Linux and Windows, the operating system kernel and hardware drivers run in Ring 0, and user processes run in Ring 3. Only processes in Ring 0 can access hardware. If a process in a higher ring needs to access hardware, such as the hard disk, it has to use the APIs of the kernel.

Figure 3-4 CPU Rings with Native Operating System

When using Xen, the Xen hypervisor runs in Ring 0, and the operating system kernel is moved to Ring 1. This is the reason why the operating system has to be modified to run para-virtualized: the kernel has to know that it is running in Ring 1 instead of Ring 0.

Figure 3-5 CPU Rings with Xen Hypervisor

VT-enabled hardware allows the hypervisor to emulate a Ring 0 for the operating system kernel, allowing it to run unmodified:

Figure 3-6 CPU Rings and VT-Enabled Hardware

Understand the Xen Architecture

Xen consists of the following three major components:
- Virtual machine monitor: The virtual machine monitor forms a layer between physical hardware and virtual machines. In general, this kind of software is called a hypervisor.
- Xen kernel: The modified Linux kernel for Xen para-virtualization. It can be used for Domain 0 as well as for Domain U (see below).
- Xen tools: The Xen tools are a set of command line and graphical applications that are used to administer virtual machines.

The virtual machine monitor must be loaded before any of the virtual machines are started. When working with Xen, virtual machines are called domains. The Xen virtual machine monitor includes neither drivers to access the physical hardware of the host machine nor an interface to communicate directly with an administrator. These tasks are performed by an operating system running in the privileged Domain 0 (Dom0). The following is an illustration of a Xen system with three domains:

Figure 3-7 Xen Domains

Xen plus the privileged Domain 0 can also be referred to as a virtual machine server. An unprivileged domain is called Domain U (DomU) in the Xen terminology and is also known as a virtual machine. A process called xend runs in the Dom0 Linux installation. This process is used to manage all Xen domains running on a system and to provide access to their consoles. SUSE Linux Enterprise Server 11 can be used for privileged (Dom0) and unprivileged (DomU) Xen domains.

Understand the Xen Networking Concept

In a Xen setup, the xend management process in Dom0 controls the physical network interfaces of a host system. When a DomU starts up, the /etc/xen/scripts/network-bridge script takes care of the virtual interface needed to connect the new DomU to the physical network via the bridge. When a new unprivileged domain is created, the following changes to the network configuration are made (simplified):

1. Xen provides a virtual network device to the new domain. Within that domain, that device will appear as ethX.
2. xend creates a new virtual interface in Dom0.
3. The virtual interface in Dom0 and the virtual network device in the unprivileged domain are connected through a virtual point-to-point connection.
4. The virtual interface in Dom0 is added to the bridge with the physical interface.

These steps affect only the general network connectivity. The IP configuration inside the unprivileged domain is configured separately with DHCP or a static network configuration. The following graphic illustrates the relationship of the various interfaces involved:

Figure 3-8 Xen Networking

The output of ip a s shows the new interface:

da10:~ # ip address show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu qdisc noqueue state UNKNOWN
    ...
2: eth1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN qlen
3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen
4: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether 00:19:d1:9f:17:87 brd ff:ff:ff:ff:ff:ff
    inet /16 brd scope global br0
    inet6 fe80::219:d1ff:fe9f:1787/64 scope link
       valid_lft forever preferred_lft forever
5: vif1.0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 32
    link/ether fe:ff:ff:ff:ff:ff brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fcff:ffff:feff:ffff/64 scope link
       valid_lft forever preferred_lft forever

The new interface is added to the existing bridge, as shown in the output of brctl:

da10:~ # brctl show
bridge name     bridge id               STP enabled     interfaces
br0             8000.0019d19f1787       no              eth0
                                                        vif1.0

The naming scheme is vifdomain_number.interface_number. For example, the counterpart for eth0 in domain number 2 is vif2.0.

The /etc/xen/scripts/ directory contains additional scripts that can be used to set up NAT or routing instead of the default bridge setup. In the /etc/xen/xend-config.sxp file, you can configure which network scripts are used by xend.
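The vif naming scheme can be expressed as a one-line formatting rule. The helper below is a hypothetical illustration, not a Xen tool:

```python
def vif_name(domain_number, interface_number):
    """Dom0 name of the virtual interface backing ethN in the given DomU."""
    return "vif{0}.{1}".format(domain_number, interface_number)

# The counterpart for eth0 in domain number 2, as in the text:
print(vif_name(2, 0))  # vif2.0
```
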

Virtualization Products

In addition to Xen on SLES 11, there are various virtualization solutions on the market. These include the following:

Hyper-V. Hyper-V, previously named Viridian, is a hypervisor-based technology that is a key feature of Windows Server 2008. It provides a scalable, reliable, and highly available virtualization platform. A core component of Hyper-V, the Windows hypervisor is a thin layer of software between the hardware and the OS that allows multiple operating systems to run, unmodified, on a host computer at the same time. It provides simple partitioning functionality and is responsible for maintaining strong isolation between partitions. (Source: Microsoft Hyper-V FAQ.)

Kernel-Based Virtual Machine (KVM). KVM is a full virtualization solution for Linux on x86 hardware with virtualization extensions (Intel VT or AMD-V). It consists of a loadable kernel module, kvm.ko, that provides the core virtualization infrastructure, and a processor-specific module, kvm-intel.ko or kvm-amd.ko. KVM also requires a modified QEMU, although work is underway to move the required changes upstream. Using KVM, one can run multiple virtual machines running unmodified Linux or Windows images. Each virtual machine has private virtualized hardware: a network card, disk, graphics adapter, etc. The kernel component of KVM is included in mainline Linux. (Source: the KVM project.)

VirtualBox. VirtualBox is an x86 and AMD64/Intel64 virtualization product for enterprise as well as home use. Not only is VirtualBox an extremely feature-rich, high-performance product for enterprise customers, it is also the only professional solution that is freely available as Open Source Software under the terms of the GNU General Public License (GPL).
Presently, VirtualBox runs on Windows, Linux, Macintosh, and OpenSolaris hosts, and it supports a large number of guest operating systems including, but not limited to, Windows (NT 4.0, 2000, XP, Server 2003, Vista, and Windows 7), DOS/Windows 3.x, Linux (2.4 and 2.6), Solaris and OpenSolaris, and OpenBSD.

VirtualBox is being actively developed with frequent releases and has an ever-growing list of features, supported guest operating systems, and platforms it runs on. VirtualBox is a community effort backed by a dedicated company: everyone is encouraged to contribute, while Sun ensures the product always meets professional quality criteria. (Source: VirtualBox.)

VMware. VMware offers a variety of virtualization products, from desktop products such as VMware Workstation to data center oriented products.

More information is available on the VMware website.

Work with Xen Virtualization

Xen. The Xen hypervisor, the open source industry standard for virtualization, offers a powerful, efficient, and secure feature set for virtualization of x86, x86_64, IA64, ARM, and other CPU architectures. It supports a wide range of guest operating systems including Windows, Linux, Solaris, and various versions of the BSD operating systems. The University of Cambridge Computer Laboratory developed the first versions of Xen. The Xen hypervisor is a layer of software running directly on computer hardware, replacing the operating system and, thereby, allowing the computer hardware to run multiple guest operating systems concurrently. Support for x86, x86-64, Itanium, Power PC, and ARM processors allows the Xen hypervisor to run on a wide variety of computing devices. Linux, NetBSD, FreeBSD, Solaris, Windows, and other common operating systems are supported as guests running on the hypervisor. The Xen.org community develops and maintains the Xen hypervisor as a free solution licensed under the GPL. Source: Xen.org.

Virtualization Hardware

Both Intel and AMD provide CPUs with hardware support for CPU virtualization as described in Understand CPU Virtualization on page 153. Intel calls this Intel VT-x; AMD calls it AMD-V.

NOTE: Intel provides a very technical description of the issues encountered with virtualization and Intel's solution to these problems in Intel Technology Journal, Vol 10, Issue 3. AMD provides information on its virtualization solution on its Web site as well, such as Putting Server Virtualization to Work.

Developments since the introduction of CPU virtualization support include the following:

Intel VT-d (Intel Virtualization Technology for Directed I/O).
It addresses I/O performance issues as well as protected access to I/O resources from a given virtual machine (Intel Technology Journal, Vol 10, Issue 3).

AMD Direct Connect Architecture and AMD HyperTransport technology. These are designed to help virtualization software more efficiently run applications in separate, isolated environments.

Intel Virtualization Technology for Connectivity (Intel VT-c). It is a collection of I/O virtualization technologies that enables lower CPU utilization, reduced system latency, and improved networking and I/O throughput (Intel Virtualization Technology for Connectivity, connectivity/solutions/virtualization.htm).

Management of Virtual Machines in the Enterprise

Individual Xen virtual machines can be managed with tools such as xm, virsh, and virt-manager that come with SLES 11. However, when the system

administrator is dealing with several physical and many virtual machines, these tools have their limitations; the system administrator would have to monitor the CPU, memory, and I/O load and then manually add, move, or shut down virtual or physical machines accordingly.

PlateSpin Workload Management from Novell is a portfolio of enterprise-class products that simplify the management of server workloads across today's mixed IT environments. The component that specifically deals with the management of virtual machines is PlateSpin Orchestrate. Other components include PlateSpin Recon, PlateSpin Migrate, PlateSpin Protect, and PlateSpin Forge.

PlateSpin Orchestrate dramatically simplifies the management of heterogeneous virtual assets, controlling the entire lifecycle of each virtual machine. Via built-in automation, resource usage can be kept aligned with business requirements. PlateSpin Orchestrate supports VMware*, Xen*, and Microsoft* Hyper-V* based virtual machines.

As virtualization becomes the standard platform for service deployment in the enterprise, control has to be enforced on how new workloads are provisioned. At the same time, the workload creation and provisioning process needs to be as simple as possible. PlateSpin Orchestrate has built-in support for template-based provisioning, allowing data center administrators to create predefined templates for all workloads that are to be deployed, and to keep track of cloned instances. Furthermore, with PlateSpin Orchestrate, moving virtual machine images and running virtual machines is accomplished easily and quickly.

Today's data centers increasingly employ a mix of different hardware, operating systems, and hypervisor technologies that must work together to meet business needs.
PlateSpin Orchestrate offers a simplified console to operate your heterogeneous virtual environment and start, stop, clone, or move virtual machines, allowing you to confidently deploy and manage VMware, Microsoft, and Xen virtualization throughout your data center.

Installing PlateSpin Orchestrate on SLES is straightforward and fast, even for high-availability setups. The PlateSpin Orchestrate VM Client is cross-platform and installs on Linux* and Windows*. PlateSpin Orchestrate Server features a Web interface where installers for PlateSpin Orchestrate Agents can be downloaded.

PlateSpin Orchestrate includes the following components:

Table 3-1 PlateSpin Orchestrate Components

PlateSpin Orchestrate Server: Detects physical and virtual data center resources and schedules tasks on those resources via the PlateSpin Orchestrate Agent. Provides automated policy-based VM lifecycle management capabilities.

PlateSpin Orchestrate Agent: Executes tasks on behalf of the PlateSpin Orchestrate Server.

PlateSpin Orchestrate VM Client: A user-friendly interface for manually managing virtual machine lifecycle operations. The VM Client enables you to easily manage VMware, Xen, and Hyper-V virtual machines.

PlateSpin Orchestrate Development Client: A graphical studio for creating, testing, and troubleshooting the building blocks of data center automation: policies, constraints, tasks, and events.

PlateSpin Orchestrate User Portal: Provides end users with the ability to administer and schedule tasks (assigned by the administrator) on the managed resources.

More information on virtualization and workload management solutions from Novell can be found on the Novell Web site.

Objective 2 Implement SLES 11 as a Xen Host Server

This objective covers Xen virtualization from the viewpoint of the virtualization host (also sometimes referred to as a virtualization server): SLES 11 is installed on the physical hardware and hosts Xen virtual machines (Xen guests). To set up a Xen server, you need to install the Xen kernel and additional Xen packages on top of a SLES 11 installation. You need to understand how to

Install Xen During Installation of SLES 11 on page 162
Install Xen on an Installed SLES 11 on page 165

Install Xen During Installation of SLES 11

This installation on the physical hardware will be your future Domain 0 (Dom0). The other Xen domains (DomUs) are installed later on local physical partitions, file system images, or a remote storage medium. If you plan to use local physical partitions, make sure that the initial SUSE Linux Enterprise Server 11 installation is not using all of the available disk space. For maximum flexibility, use the logical volume manager (LVM) for a Xen system.

To install Xen as part of the SUSE Linux Enterprise Server 11 installation, in the Server Base Scenario dialog presented during the first stage of the installation, select the Xen Virtualization Host scenario:

Figure 3-9 Xen Host Server: Installation Scenario

This scenario includes only the Minimal System (Appliances) and the Xen Virtual Machine Host Server software patterns. Using this scenario, the graphical X Window interface is not configured, but you can still start graphical applications, such as virt-manager, when logged in to the server with SSH.

If you prefer to have a full SLES 11 installation, you can choose the Physical Machine Scenario and add the Xen Virtual Machine Host Server software pattern as part of the software selection:

Figure 3-10 Xen Host Server: Software Selection

As a general rule, you should run services (such as a Web server, a database, or Novell services like iFolder) in a DomU, not in Dom0. Therefore, it is not necessary to select the respective patterns during the installation of Dom0.

As part of the installation, the network configuration is changed to include a bridge to connect the physical network interface with the virtual network interfaces of the virtual machines.

For a Xen host server, the following packages have to be installed in the SLES 11 installation:

xen: Contains the Xen virtual machine monitor (hypervisor).
xen-libs: Contains the libraries used to interact with the Xen virtual machine monitor.
xen-tools: Contains xend and a collection of command line tools to administer a Xen system.

vm-install: Contains Python scripts used to define a Xen virtual machine and to cause an operating system to begin installing within that virtual machine.
xen-doc-*: (Optional) Contains Xen documentation in various formats.
virt-manager: Provides a graphical interface to manage virtual machines.
virt-viewer: Provides a graphical console client for connecting to virtual machines.
bridge-utils: Contains utilities to configure Linux Ethernet bridges, which are used to connect the domains to each other and to the physical network interface.
kernel-xen: Contains a modified Linux kernel that runs in a Xen domain, both Dom0 and DomU.

These are all part of the Xen pattern.

The installation of the kernel-xen package automatically adds an entry like the following to the /boot/grub/menu.lst bootloader configuration file:

###Don't change this comment - YaST2 identifier: Original name: xen###
title Xen -- SUSE Linux Enterprise Server
    root (hd0,1)
    kernel /boot/xen.gz
    module /boot/vmlinuz-xen root=/dev/disk/by-id/ata-ST380815AS_6QZ2FW3T-part2 insmod=e100 resume=/dev/disk/by-id/ata-ST380815AS_6QZ2FW3T-part1 splash=silent crashkernel= showopts vga=0x317
    module /boot/initrd-xen

The entry in menu.lst adds a new option to the boot menu of your system. When you select this entry, the Xen virtual machine monitor is loaded (kernel /boot/xen.gz), which starts SUSE Linux Enterprise Server 11 in Dom0 (see the lines starting with module).

Before rebooting your system with the Xen option, you should determine whether the automatically generated entry is correct. Make sure that

The line root (hd0,1) points to the partition which contains the Xen virtual machine monitor and the kernel of the Linux installation for Dom0. For example, hd0,1 designates the second partition on the first hard drive in the system.
Also check if the parameter root= in the first module line points to the root partition of the Dom0 installation.

The Xen version of the Linux kernel and the initrd are loaded in the module lines. The names of the image files should end in -xen.

After checking the bootloader configuration file, you can reboot your system and select the Xen option from the bootloader menu. In the early stages of the boot process, you will see some messages of the Xen virtual machine monitor on the screen. Then the Dom0 Linux operating system is started. If the system does not boot properly, you can switch back to a non-virtualized system by selecting the regular SLES 11 boot option.
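The checks above can be scripted with standard text tools. The following sketch extracts just the Xen stanza from a menu.lst file so its root and module lines can be reviewed at a glance. The sample file and its contents are illustrative; on a real system you would point awk at /boot/grub/menu.lst instead.

```shell
# Write a sample menu.lst to a temporary file. The entries below are
# illustrative only -- a real file comes from /boot/grub/menu.lst.
menu=$(mktemp)
cat > "$menu" <<'EOF'
title SUSE Linux Enterprise Server 11
    root (hd0,1)
    kernel /boot/vmlinuz root=/dev/sda2

title Xen -- SUSE Linux Enterprise Server
    root (hd0,1)
    kernel /boot/xen.gz
    module /boot/vmlinuz-xen root=/dev/sda2
    module /boot/initrd-xen
EOF

# Print only the stanza whose title mentions Xen: on every title line,
# awk records whether it matched, then prints lines while the flag is set.
awk '/^title/ {p = /Xen/} p' "$menu"

# Confirm that the Xen stanza actually loads the hypervisor image.
awk '/^title/ {p = /Xen/} p' "$menu" | grep -q 'kernel /boot/xen.gz' \
    && echo "Xen hypervisor entry found"

rm -f "$menu"
```

The same awk filter works for any stanza name, so it can also be used to inspect the regular SLES 11 entry before switching back.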

Install Xen on an Installed SLES 11

You can easily add Xen to an existing installation of SLES 11 using the YaST module created for this purpose. In YaST, select Virtualization > Install Hypervisor and Tools. The required Xen packages are installed, the necessary changes are made to /boot/grub/menu.lst as described in Install Xen During Installation of SLES 11 on page 162, and a default network bridge is configured. Reboot the machine and select the Xen kernel from the boot menu.

To boot the Xen kernel by default, edit the default entry in /boot/grub/menu.lst:

# Modified by YaST2. Last modification on Thu Apr 2 17:27:29 CEST 2009
default 0
timeout 8
gfxmenu (hd0,1)/boot/message
##YaST - activate

###Don't change this comment - YaST2 identifier: Original name: xen###
title Xen -- SUSE Linux Enterprise Server

With default 0, the first entry is booted by default; with default 1, the second; and so on.

If you want to find out which kernel is currently in use, enter uname -a in a terminal window:

da10:~ # uname -a
Linux da10 ...-xen #1 SMP ... i686 i686 i386 GNU/Linux

The kernel release string of the Xen kernel ends in -xen.
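Because default counts title entries from zero, it is easy to point it at the wrong stanza. The sketch below resolves which title a given default index actually selects; the sample file contents are illustrative, and on a real system you would read /boot/grub/menu.lst.

```shell
# Sample menu.lst fragment (illustrative values only).
menu=$(mktemp)
cat > "$menu" <<'EOF'
default 1
timeout 8

title SUSE Linux Enterprise Server 11
    kernel /boot/vmlinuz

title Xen -- SUSE Linux Enterprise Server
    kernel /boot/xen.gz
EOF

# Read the default index, then print the (index+1)-th title line:
# GRUB numbers entries from 0, so "default 1" selects the second title.
idx=$(awk '/^default/ {print $2; exit}' "$menu")
grep '^title' "$menu" | sed -n "$((idx + 1))p"
# prints: title Xen -- SUSE Linux Enterprise Server

rm -f "$menu"
```

Running this before a reboot confirms that the edited default value really points at the Xen entry.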

Objective 3 Implement SLES 11 as a Xen Guest

SLES 11 is prepared to run as a paravirtualized guest on a SLES 11 Xen host server. Unlike the installation of a Xen virtual machine locally, the installation of a Xen virtual machine using remote storage requires some preparation. However, the advantage is that you can easily migrate the virtual machine from one physical machine to another. To work effectively with Xen, you need to be able to

Install a Xen Virtual Machine Locally on page 166
Install a Xen Virtual Machine Using Remote Storage on page 173
Install a Xen Virtual Machine Non-Interactively on page 176
Manage Xen Domains with Virt-Manager on page 178
Manage Xen Domains from the Command Line on page 183
Migrate Xen Virtual Machines between Hosts on page 189
Install and Use Xen Virtualization on page 191

Install a Xen Virtual Machine Locally

After you have installed Xen and the Xen tools on the Xen host server, you can use vm-install to create unprivileged Xen domains. vm-install can be started directly from the command line or by starting YaST and selecting Virtualization > Create Virtual Machines. This tool guides you step by step through the creation of a Xen domain on your system. The first dialog looks like the following:

Figure 3-11 Virtual Machine Installation

This first page gives some information on the creation of a virtual machine. Selecting Forward opens a dialog where you have a choice between a new installation of an operating system and the use of an existing image. If you decide to install an operating system, the following dialog appears:

Figure 3-12 Virtual Machine Installation: OS Type

Your choice of the type of operating system determines the suggested values in the next dialog:

Figure 3-13 Virtual Machine Installation: Summary

It is necessary to specify the installation source. Other values, such as the size of the virtual hard disk, can be changed as needed. To change a setting, select the blue headline.

We recommend switching to a fixed MAC address for Linux virtual machines. Select Network Adapter on the Summary page to edit the suggested values or to add another virtual network adapter. Select Edit on the Network Adapters page to open the following dialog:

Figure 3-14 Virtual Machine Installation: Network Adapter

Selecting Randomly generated MAC address causes a new MAC address to be created each time the virtual machine is started. With this setting and SLES 11 as the operating system within the virtual machine, the interface name within the virtual machine changes each time the virtual machine is started. To avoid this, select Specified MAC address. The vendor string for XenSource is 00:16:3e. Enter hex values in the spaces provided, making sure they are unique within your network. Click Apply to return to the previous dialog.

In the Summary dialog, select Disks to change hard disk parameters or to add a hard disk or a CDROM drive. The following dialog appears:

Figure 3-15 Virtual Machine Installation: Disks

Select Edit to change the highlighted entry. The following dialog appears:

Figure 3-16 Virtual Machine Installation: Virtual Disk

You can specify a different image file and change its size. When you select Create Sparse Image File, the image file does not immediately use the specified amount of disk space on the storage medium; it grows as space is actually used within the virtual machine. It is also possible to specify a block device like /dev/sda5 instead of a file. Select OK to return to the Disks dialog. Select Apply in the Disks dialog to return to the Summary page. The dialog for the CDROM drive is almost identical.

To specify an installation medium, in the Summary dialog select Operating System Installation. The following dialog appears:

Figure 3-17 Virtual Machine Installation: OS Installation

In the Network URL text box, you can specify an installation source located in the network, such as nfs:// /data/install/sles11. Select Apply to return to the Summary dialog. To start the installation, select OK in the Summary dialog. A VNC window appears, allowing you to control and configure the operating system installation.

When you install SLES 11 in a virtual machine, the device name for the first hard disk within the virtual machine is /dev/xvda, the device name for the second disk is /dev/xvdb, and so on. Apart from this detail, a virtual installation is almost identical to an installation on real hardware.

Install a Xen Virtual Machine Using Remote Storage

Having the virtual machine image files stored remotely is a prerequisite to being able to migrate a Xen virtual machine from one host to another. The virtual machine needs to be able to access the underlying storage in exactly the same way no matter which host it is running on. iSCSI is a convenient way to provide remote storage for Xen virtual machines. However, the remote storage must appear in the same way on any host involved, and the remote file system must never be accessed by two virtual machine instances at the same time, as that would damage the file system.

The iSCSI target on the storage server is prepared as covered in Implement an iSCSI Target on page 112. On any Xen host server involved, the iSCSI initiator is prepared as covered in Implementing an iSCSI Initiator on page 122.

Because the iSCSI disk could appear as /dev/sdb on one host and as /dev/sdc on another, it is advisable to use the links contained in the /dev/disk/by-path/ directory as pointers to the storage space instead of the devices /dev/sdX:

Figure 3-18 Xen: Virtual Disk

Because these IDs include the IP address of the iSCSI target and the IQN of the resource, they remain the same on all hosts involved.

An alternative to selecting a device is to select the iSCSI protocol in the Virtual Disk dialog and enter the IQN of the iSCSI resource. First establish the available targets with the iscsiadm command, as shown in the following:

da3:~ # iscsiadm -m discovery -t sendtargets -p ...
:3260,1 iqn com.digitalairlines:2549b473-7e46-4c91-975f-48e4c3c2b505
:3260,1 iqn com.digitalairlines:34161b91-cbc a2fb-b
:3260,1 iqn com.digitalairlines:9dddbae6-b98a-42e1-bc1d-5debbaeed9a4

These match the available disks, as shown in the following:

da3:~ # ls -l /dev/disk/by-path/
total 0
lrwxrwxrwx 1 root root  9 Jan 29 13:05 ip :3260-iscsi-iqn com.digitalairlines:2549b473-7e46-4c91-975f-48e4c3c2b505-lun-0 -> ../../sdd
lrwxrwxrwx 1 root root 10 Jan 29 13:05 ip :3260-iscsi-iqn com.digitalairlines:2549b473-7e46-4c91-975f-48e4c3c2b505-lun-0-part1 -> ../../sdd1
lrwxrwxrwx 1 root root 10 Jan 29 13:05 ip :3260-iscsi-iqn com.digitalairlines:2549b473-7e46-4c91-975f-48e4c3c2b505-lun-0-part2 -> ../../sdd2
lrwxrwxrwx 1 root root  9 Jan 29 13:05 ip :3260-iscsi-iqn com.digitalairlines:34161b91-cbc a2fb-b lun-0 -> ../../sdc
...

In the Virtual Disk dialog, enter the IQN without the IP address, as shown in the following:

Figure 3-19 Xen: Virtual Disk with iSCSI

The remaining steps for the installation remain the same as with an installation on local hard disks.

Another way to integrate remote storage is NFS. Export the directory where you want the files to be stored and import that directory on the Xen host. If you want to migrate the virtual machines, make sure the path to the disk files is the same on all Xen hosts involved.

Install a Xen Virtual Machine Non-Interactively

It is possible to pass command line arguments to the vm-install command. In combination with an AutoYaST control file, completely unattended installations of SLES 11 Xen virtual machines are possible.

Options passed to vm-install influence the pre-selected settings presented in the graphical interface. The vm-install manual page includes the following:

-d DISK, --disk DISK. Defines an additional virtual disk. Repeat for multiple disks. DISK is of the form PDEV,VDEV[,TYPE[,MODE[,MB[,OPTIONS...]]]. PDEV describes the physical storage. In its simplest form, PDEV can be a path to a file or block device. More complex forms require prefixing a protocol. Valid protocols are file, iscsi, nbd, npiv, phy, tap:aio, tap:qcow, and tap:vmdk. Examples: /dev/sdb, phy:/dev/sdb, tap:qcow:/disks/disk0.qcow, iscsi:iqn de.suse@0ac47ee2-216e-452a-a341-a12624cd0225. VDEV is the name of the virtual device. VDEV may be named using Linux fully virtualized names (hda, hdb, hdc, ...), Linux para-virtualized names (xvda, xvdb, xvdc, ...), or a number (0, 1, 2, ...). TYPE may be disk (default) or cdrom. MODE is r for read-only or w for writable. If the mode is not specified, a reasonable default is chosen. MB is a number specifying how large to create the disk (only meaningful if PDEV does not already exist). Currently, only file, tap:aio, and tap:qcow can be created from scratch. OPTIONS are any number of protocol-specific options. For example, file might be passed the sparse=0 option.

-n NAME, --name NAME. Name of the VM. This must be unique among all VMs on the physical machine. If not specified, a unique name will be chosen based on the OS type.

-v, --para-virt. This VM should be para-virtualized. The OS must support paravirtualization.

-V, --full-virt. This VM should be fully virtualized. The hardware must support full virtualization.

-o TYPE, --os-type TYPE. Type of guest OS. This defines many defaults and helps decide how to bootstrap para-virtualized OSs.

--os-settings FILE. A file or directory to be given to the OS at install time, via a temporary virtual disk; used to automate the installation of the OS.
The format of this file (or layout of the directory) depends on the OS. Not all OSs support automated installations. For SLES and openSUSE guests, this would be an AutoYaST XML file.

-p, --pxe-boot. Specify PXE booting for the VM installation.

-x TEXT, --extra-args TEXT. Additional arguments to pass to the paravirtualized OS. Note that the tool will automatically generate the necessary OS-specific arguments to bootstrap the installation.

-s URL, --source URL. Installation source of the operating system (e.g., nfs:host:/path, ftp://host/path). To

install from an existing virtual disk, use the syntax dev:/vdev, for example: dev:/xvda. The types of installation sources supported vary based on the OS type.

--background. Run in the background. Do not interactively prompt for settings; use defaults if necessary. Implies --no-autoconsole. Backgrounded VM creation jobs can be managed with the vm-install-jobs command.

--no-autoconsole. Don't automatically try to connect to the VM console. This tool will pause (waiting for the installation to finish) until the VM is caused to shut down by some other means.

A vm-install command line that installs a SLES 11 Xen virtual machine without any user interaction, using an NFS installation server, could look like the following (in one line):

vm-install --os-type sles11 --source nfs:// /data/install/sles11 --os-settings nfs:// /data/install/autoyast/crs/3107/da4-xen.xml

If needed, you can connect to the installation running in the background using the xm vncviewer domain_name command. It opens a VNC viewer window displaying the installation of the domain_name domain.

Manage Xen Domains with Virt-Manager

Virt-Manager is a graphical tool used to manage virtual domains. It can be started by entering the virt-manager command or by selecting Virtualization > Virtual Machine Manager in YaST.

Figure 3-20 Virt-Manager

Double-click a virtual machine entry to open a VNC window:

Figure 3-21 DomU

In the screenshot above, the virtual machine is running. You could pause the machine or shut it down using the respective buttons. Closing the VNC window itself does not affect the state of the machine. It continues to run, and you can attach to the VNC session again by double-clicking the respective entry in Virt-Manager. If you double-click an entry of a virtual machine that is not currently running, the window appears empty, and you can start the machine by clicking the Run button. To release the mouse cursor from the VNC window, press Ctrl+Alt.

When you right-click an entry in the Virtual Machine Manager window and then select Details, another dialog appears:

Figure 3-22 DomU: Utilization

The Overview tab shows a graph of CPU and memory usage. The Hardware tab allows you to view and change certain hardware parameters:

Figure 3-23 DomU: Hardware Details

You can add or remove virtual processors, change the memory currently used, or add and remove hard disks and CDROM/DVD drives. Removing and adding the CDROM drive is necessary when changing a CDROM in the drive. Currently, CDROM drives appear as hard disks within the virtual machines, and media changes are not detected automatically.

Due to a bug at the time of this writing, adding and removing CDROM drives in Virt-Manager is not possible. You have to use the xm command to access the content of a CDROM/DVD or to change it. (The xm command will be covered in more detail in Use the xm Tool on page 184.)

To change a DVD or CDROM in a virtual machine, do the following:

1. Insert the CDROM or DVD into the DVD drive. It will be mounted automatically in Dom0.

2. Open a terminal window, su - to root, and then add the drive with the command

   xm block-attach domainid dev_in_dom0 dev_in_domu r

   For instance:

   xm block-attach sles11 phy:/dev/sr0 /dev/xvdb r

3. Within DomU, mount the device (/dev/xvdb in the example above). When you want to change the CDROM/DVD, unmount the device in DomU.

4. In Dom0, determine the ID for the CDROM entry and then remove this entry from the virtual machine with the xm commands as shown below:

   da10:~ # xm block-list sles11
   Vdev  BE handle state evt-ch ring-ref BE-path
   ...                                   /local/domain/0/backend/vbd/1/...
   ...                                   /local/domain/0/backend/vbd/1/51728
   da10:~ # xm block-detach sles11 51728
   da10:~ #

5. Change the CDROM/DVD in the drive and attach the device again as explained in Step 2.

Manage Xen Domains from the Command Line

In this objective, you learn how to manage Xen domains at the command line. To do this, you need to

Understand Managed and Unmanaged Domains on page 183
Understand a Domain Configuration File on page 184
Use the xm Tool on page 184
Use the virsh Tool on page 187
Automate Domain Startup and Shutdown on page 188

Understand Managed and Unmanaged Domains

In Xen version 2, all DomUs were configured by a configuration file. You can still use configuration files with Xen version 3. Virtual domains that are configured by configuration files only are referred to as unmanaged domains. Unmanaged domains appear in Virt-Manager or in the output of the xm list command (covered later in this objective) only when they are running.

With Xen version 3, configuration details can be stored in the Xenstore database located in /var/lib/xenstored/tdb. One advantage is that the virtual machines always appear in virt-manager, even when not running, and can be started as described in the previous objective. Virtual machines that have their configuration in the Xenstore database are referred to as managed domains.

You can use the xm new configfile command to move configuration information from a configuration file into the Xenstore database. Currently it is not possible with the xm command to export a configuration from the Xenstore database to a configuration file. To remove configuration information from the Xenstore database, use the xm delete vm_name command.
This command removes only the configuration information from the database; the disk image files remain unchanged.
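Taken together, xm new and xm delete form a repeatable cycle for reconfiguring a managed domain. The sketch below only prints the commands it would run (a dry run, safe on any machine); the domain name, config path, and editor are illustrative placeholders:

```shell
#!/bin/sh
# Dry-run sketch of the reconfiguration cycle for a managed domain.
# run() only echoes each command; on a real Xen host you would replace
# its body with "$@" to execute the commands for real.
run() { echo "would run: $*"; }

vm_name="sles11"            # illustrative domain name
cfg="/etc/xen/vm/$vm_name"  # config file as written by vm-install

run xm delete "$vm_name"    # drop the stale entry from the Xenstore database
run "${EDITOR:-vi}" "$cfg"  # edit the configuration file
run xm new "$cfg"           # re-import the edited file into the database
```

Because the Xenstore entry is deleted before the edit, a crash mid-cycle leaves the domain unmanaged but intact; re-running xm new on the config file restores it.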

When a virtual machine is created with vm-install, the configuration is written to /etc/xen/vm/vm_name and to the Xenstore database simultaneously. Later changes to the configuration file have no effect on the information in the Xenstore database. To change the configuration in the Xenstore database, delete the configuration from the database with xm delete vm_name, edit the configuration file in /etc/xen/vm/, and integrate the new configuration in the database with xm new configfile.

Understand a Domain Configuration File

The configuration files for domains created with vm-install are located in /etc/xen/vm/. A configuration file contains several keywords which configure different aspects of a Xen domain. A configuration file created by vm-install during the installation of a virtual machine could look like the following:

name="sles11"
uuid="3eb65cbd-ae8e-2a79-cf1e d085"
memory=512
maxmem=512
vcpus=2
on_poweroff="destroy"
on_reboot="restart"
on_crash="destroy"
localtime=0
keymap="en-us"
builder="linux"
bootloader="/usr/bin/pygrub"
bootargs=""
extra=" "
disk=[ 'file:/var/lib/xen/images/sles11/disk0,xvda,w', 'phy:/dev/sr0,xvdb:cdrom,r', ]
vif=[ 'mac=00:16:3e:31:24:13,bridge=br0', ]
vfb=[ 'type=vnc,vncunused=1' ]

Under /etc/xen/examples/, you find example files which can be used to create a configuration from scratch. The comments in these files (lines starting with a # sign) give more information on the available options and the required syntax.

NOTE: A good source for detailed documentation and HOWTOs about Xen and the domain configuration files is the Xen wiki.

Use the xm Tool

The xm command line uses the following format:

xm subcommand [options] [arguments] [variables]

xm is the administration command line tool for Xen domains. xm communicates with the xend management process running on the Dom0 Linux installation. You can get a complete list of the xm subcommands by entering xm help. The xm manual page contains information on the available options for each of the subcommands. This manual covers only the more frequently used subcommands.

You can use the create subcommand to start an unmanaged virtual machine:

xm create -c -f /data/xen/sles11-webserver.conf

The -c option lets xm connect to the terminal of the started domain, so that you can interact with the system. To disconnect from the terminal and return to the original command line, enter the key combination Ctrl-]. The -f option specifies the configuration file of the domain that should be started.

The list command displays information about all managed Xen domains and the currently running unmanaged Xen domains:

da10:~ # xm list
Name      ID  Mem  VCPUs  State  Time(s)
Domain-0  ...               r
sles11    ...               b

The output of the list command contains the following fields:

Name: Name of the domain as specified in the configuration file.
ID: Numeric, consecutive domain ID, which is automatically assigned when the domain starts.
Mem: Amount of memory assigned to the domain.
VCPUs: Number of virtual CPUs utilized by this domain.
State: Current state of the domain. This could be
  r: Domain is running.
  b: Domain has been created but is currently blocked. This can happen when a domain is waiting for I/O or when there is nothing for a domain to do.
  p: Domain is paused. The state of the domain is saved and can be restored.
  s: Domain is in the process of being shut down.
  c: Domain has crashed due to an error or misconfiguration.
Time: Total run time of the domain as accounted for by Xen.

An alternative to list is the command top, which displays domain information updated in real time.
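The State column lends itself to scripting with standard text tools. The sketch below parses sample xm list output (all numeric values are invented for illustration) and prints each domain with the letter set in its state field; on a real host you would pipe the output of xm list instead:

```shell
# Sample `xm list` output; the IDs, memory sizes, and times below are
# invented for illustration only.
list_output='Name       ID  Mem  VCPUs  State   Time(s)
Domain-0   0   512   2     r-----  100.0
sles11     3   512   2     -b----   12.3'

# Skip the header line, then reduce each state field to the letter that
# is set by dropping the placeholder dashes.
echo "$list_output" | awk 'NR > 1 {
    state = $5
    gsub(/-/, "", state)
    printf "%s: %s\n", $1, state
}'
# prints:
# Domain-0: r
# sles11: b
```

The same one-liner can feed a monitoring script, for example to alert when any domain reports the crashed state c.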
To start a managed domain, use the following command:

    xm start vm_name

The console command connects you with the terminal of a running domain:

    xm console domain_id

The command takes the domain ID as a parameter, which can be determined with the list command (field: ID). The name (field: Name) works as well. As mentioned before, use the key combination Ctrl-] to disconnect from a terminal.

With the pause command, you can interrupt the execution of a domain temporarily:

    xm pause domain_id

A paused domain is not completely shut down. The current state is saved, and the execution of the domain can be continued with the unpause command:

    xm unpause domain_id

To shut down a domain, use the shutdown command:

    xm shutdown domain_id

This is equivalent to using the appropriate command within the virtual machine (shutdown -h now in Linux). If the domain is not responding anymore, you can force the shutdown of the domain with the destroy command:

    xm destroy domain_id

This is equivalent to pulling the plug on a physical machine.

To save the state of a domain for a longer time (for example, over a reboot of Dom0), you can use the save command:

    xm save domain_id filename

The domain can be restored from the resulting file with the restore command:

    xm restore filename

Another commonly used command is mem-set, which allows you to change the memory allocation of a domain:

    xm mem-set domain_id amount_of_memory

The amount of memory is specified in megabytes.

Block devices can be added to DomUs with the xm block-attach command:

    xm block-attach domainid dev_in_dom0 dev_in_domu r/w

To remove the device again, first use xm block-list to find out which DeviceID to use in the xm block-detach command:

    xm block-list domainid
    xm block-detach domainid DeviceID

Use the virsh Tool

The virsh command is similar to the xm command. The basic structure of the virsh command is as follows:

    virsh subcommand <domainid> [options]

virsh can be used to administer Xen domains. The options are similar to those of the xm command; however, some options are different. You can get a complete list of the virsh subcommands by entering virsh help. The virsh manual page contains information on the available options for each of the subcommands. This manual covers only the more frequently used subcommands.

You can use the create subcommand to start an unmanaged virtual machine using a configuration file in XML format:

    virsh create /data/xen/da-xen.xml

The configuration in XML format can be viewed using the virsh dumpxml domain_name command.

The console subcommand connects you with the terminal of a running domain:

    virsh console domain_id

The console command takes the domain ID as a parameter, which can be determined with the virsh list command (field: ID). The name (field: Name) works as well. Use the key combination Ctrl-] to disconnect from a terminal.

The virsh list command displays information about running Xen domains only; the xm list command, by contrast, also lists managed domains that are not currently running.

To start a managed domain, use the following command:

    virsh start vm_name

With the suspend subcommand, you can interrupt the execution of a domain temporarily:

    virsh suspend domain_id

A suspended domain is not completely shut down. The current state is saved, and the execution of the domain can be continued with the resume subcommand:

    virsh resume domain_id

To shut down a domain, use the shutdown subcommand:

    virsh shutdown domain_id

This is equivalent to using the appropriate command within the virtual machine (shutdown -h now in Linux). If the domain is not responding anymore, you can force the shutdown of the domain with the destroy command:

    virsh destroy domain_id

This is equivalent to pulling the plug on a physical machine.

To save the state of a domain for a longer time (for example, over a reboot of Dom0), you can use the save subcommand:

    virsh save domain_id filename

The domain can be restored from the resulting file with the restore subcommand:

    virsh restore filename

Another commonly used subcommand is setmem, which allows you to change the memory allocation of a domain:

    virsh setmem domain_id amount_of_memory

The amount of memory is specified in kilobytes.

Block devices can be added to DomUs with the attach-disk subcommand:

    virsh attach-disk domainid dev_in_dom0 dev_in_domu

To remove the device again, use the detach-disk subcommand:

    virsh detach-disk domainid dev_in_domu

Automate Domain Startup and Shutdown

When you start, shut down, or reboot the Dom0 of a Xen system, other running Xen domains are also affected: the other Xen domains cannot operate without a running Dom0. SLES 11 comes with a start script called xendomains, which is included in the xen-tools package. The script, which should be installed on Dom0, does the following:

- When Dom0 is booted, all domains with configuration files located under /etc/xen/auto/ are started. It is recommended to create a symbolic link in this directory pointing to the actual configuration file in /etc/xen/vm/.
- When Dom0 is shut down or rebooted, running Xen domains are shut down automatically.

NOTE: If you have a configuration file for a domain that is also in the Xenstore database, the automatic start uses the information in the configuration file and ignores the information in Xenstore, which may differ from that in the configuration file.
To start and stop managed domains automatically, you can create a start script based on the /etc/init.d/skeleton file, using the applicable xm commands, such as xm start vm_name and xm shutdown vm_name.
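The auto-start convention for unmanaged domains can be sketched as follows. To stay safe to run on any system, the demo builds the /etc/xen/vm and /etc/xen/auto layout inside a throwaway directory; on a real SLES 11 Dom0 you would create the symbolic link under the real /etc/xen/auto/:

```shell
# Sketch of the xendomains auto-start convention, using a throwaway
# directory instead of the real /etc/xen so it can be run anywhere.
root=$(mktemp -d)
mkdir -p "$root/etc/xen/vm" "$root/etc/xen/auto"

# The actual domain configuration lives in /etc/xen/vm/ ...
printf 'name="sles11"\n' > "$root/etc/xen/vm/sles11"

# ... and a symbolic link in /etc/xen/auto/ marks the domain for
# automatic start when Dom0 boots.
ln -s "$root/etc/xen/vm/sles11" "$root/etc/xen/auto/sles11"

ls -l "$root/etc/xen/auto/"
```

Keeping only links in /etc/xen/auto/ means the configuration itself stays in one place, and removing a domain from auto-start is just a matter of deleting its link.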

The xendomains script has configuration options that can be adjusted in the /etc/sysconfig/xendomains file. The configuration variables in this file are explained in accompanying comments. One interesting option is to migrate domains automatically to a different host when a Dom0 is shut down. This can be configured in the variable XENDOMAINS_MIGRATE, which has to be set to the IP address of the target machine. When the variable is empty, no migration is performed.

Migrate Xen Virtual Machines between Hosts

For a virtual machine to be migrated, it has to be able to access its physical storage no matter which host it is running on, and the xend daemon has to permit migrations. With iSCSI, this is achieved by performing the same iSCSI initiator setup on all hosts involved, which ensures that the migrated machine can access the same storage before and after the migration.

To allow the migration from one host to the other, the following entries in the /etc/xen/xend-config.sxp configuration file have to be modified on the involved machines:

    (xend-relocation-server yes)
    ...
    (xend-relocation-port 8002)
    ...
    (xend-relocation-address '')
    ...
    (xend-relocation-hosts-allow '')
    ...

In a production environment, it is recommended to set values for the relocation address and the allowed hosts; in a lab environment this is not really important. After making changes to the configuration file, shut down any running virtual machines and restart the xend daemon with the rcxend restart command.

The migration is triggered with the following command on the host where the virtual machine currently resides:

    xm migrate domain_name target_host

This command shuts down the running domain on the current host and starts it again on the target host. The configuration is moved together with the machine; it is not necessary to have a pre-existing configuration on the target host.
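For a production setup, the relocation entries might be restricted along the following lines. The address and the hosts-allow pattern are example values for illustration, not defaults:

```
(xend-relocation-server yes)
(xend-relocation-port 8002)
; Listen only on the interface facing the other Xen host
; (example address):
(xend-relocation-address '192.168.1.10')
; Accept relocation requests only from that peer host
; (example pattern):
(xend-relocation-hosts-allow '^192\.168\.1\.11$')
```

Restricting both the listening address and the allowed hosts keeps the relocation port from being reachable by arbitrary machines on the network.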
Any services running on the virtual machine are unavailable during the migration. To reduce the service downtime to about a second or even less, depending on the speed of the network, it is possible to perform a live migration:

    xm migrate --live domain_name target_host

During the live migration, the content of the random access memory of the virtual machine is copied to the target host; then, several times, the changes to memory since the previous copy are copied over while the virtual machine is still up and running. Because the content of the memory keeps changing, there is a point where this copying becomes ineffective. At that point the virtual machine is stopped on the source host, the remaining difference in memory between source and target is copied to the target host, and the machine is started again on the target host. Only a fraction of the content of the RAM needs to be copied during this final step, so the downtime is very short and, depending on the application, almost unnoticeable.

You can watch the progress of the migration using the xm top command:

Figure 3-24: Xen Migration in xm top
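The iterative pre-copy described above can be illustrated with a toy calculation. The page counts and the re-dirty ratio below are made-up numbers for illustration, not Xen internals:

```shell
# Toy model of live migration's iterative pre-copy: each round copies
# the pages dirtied during the previous round; the machine pauses only
# for the final remainder (the "stop-and-copy" phase).
dirty=4096   # pages still to copy (hypothetical starting point)
rounds=0
while [ "$dirty" -gt 64 ] && [ "$rounds" -lt 10 ]; do
    dirty=$((dirty / 4))   # assume 1/4 of copied pages get re-dirtied
    rounds=$((rounds + 1))
done
echo "stop-and-copy after $rounds rounds, $dirty pages remaining"
```

The round limit mirrors the real behavior: if memory keeps changing too fast for the copying to converge, the hypervisor gives up iterating and performs the final stop-and-copy anyway.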

Exercise 3-1: Install and Use Xen Virtualization

In this lab, you install Xen on your physical machine and create Xen virtual machines. You will find this lab in the workbook.

(End of Exercise)

Summary

Objective: Understand Virtualization Technology

Virtualization technology separates a running instance of an operating system from the physical hardware. Instead of running on a physical machine, the operating system runs in a so-called virtual machine. Multiple virtual machines share the resources of the underlying hardware. The hardware is controlled by the hypervisor (virtual machine monitor).

Xen virtualization methods include para-virtualization, which requires the operating system in the virtual machine to be modified to be aware of the virtualization, and full virtualization, which requires no modification of the operating system but does require specialized hardware.

Advantages of virtualization include better utilization of the available hardware and the flexibility to move workloads from one physical machine to another, depending on business or maintenance needs.

Objective: Implement SLES 11 as a Xen Host Server

For SLES 11 to be able to host virtual machines, the Xen kernel has to be installed and loaded. Xen can be installed at the time of the initial SLES 11 installation by selecting the Xen pattern, or on an installed system using the YaST Install Hypervisor and Tools module. When Xen is installed, the network configuration is changed to include a bridge that connects the virtual machines as well as the physical machine to the network.

Objective: Implement SLES 11 as a Xen Guest

The vm-install command offers a GUI that allows you to configure the virtual machine settings and to install an operating system into it. The disk files can reside on a local hard drive or on remote storage. Use of remote storage is required to be able to migrate the virtual machine from one physical host to another while it is running. It is also possible to copy the virtual machine drive images to another host and use vm-install to set up the virtual machine on the new host.
The command-line mode of vm-install, combined with an installation server and an AutoYaST XML control file, allows completely unattended installations of SLES 11 Xen virtual machines.

SECTION 4: Harden Servers

In this section, you learn how to harden a newly installed SLES 11 server system.

Objectives

1. Describe Server Hardening
2. Harden a SLES 11 Server
3. Harden Services with AppArmor
4. Implement an IDS

Objective 1: Describe Server Hardening

In this objective, you learn about the rationale for hardening a Linux server system. The following topics are addressed:

- What Is Server Hardening?
- Are Linux Systems Vulnerable?

What Is Server Hardening?

It has often been said that the only way to make a computer system truly secure is to unplug it from the network, remove the network board, disconnect the keyboard and mouse, and remove the monitor. Unfortunately, such a system, while secure, would also be completely useless. The process of implementing server security, therefore, should be one of risk management, not risk avoidance. Making a server available on the network inherently increases its vulnerability. The key is to reduce its vulnerabilities as much as possible without impacting its availability, functionality, or performance. This process, called server hardening, involves identifying possible avenues of attack into a computer system and then taking steps to eliminate them or reduce their vulnerability.

The manner in which you harden a computer system varies depending on whether the system is a server or a desktop. Desktop systems are designed to provide the end user with easy access to a wide variety of computing tools (which they may or may not immediately need). The end user is assumed to have little or no computer expertise. Therefore, a desktop system must require minimal configuration tasks to get up and running each day. Because a desktop system will likely be used by only one person, it is arguable that the assets it contains are of limited value to an attacker.
Such assets could include:

- Usernames and passwords
- Browsing history
- Credit card numbers
- Personal information such as Social Security numbers, birth dates, and so on

The key avenues of attack to a desktop system would be through its user interface, web browsers, and client software. As such, most desktop hardening strategies would include the following:

- Using a strong password policy
- Promoting social engineering awareness
- Protecting against malware
- Configuring a host-based firewall
- Configuring automatic updates

Essentially, a desktop system needs to have some degree of openness so that less computer-literate end users can still use it to accomplish meaningful work.

Hardening a server system, on the other hand, needs to be approached in a much different manner. Unlike a desktop system, the assets contained on a server system are probably far more valuable, making server systems a much more inviting target to intruders. As such, server hardening should abide by the Principle of Least Privilege, the main concepts of which are listed below:

- Users should have only the degree of access to the server necessary for them to complete their work and no more.
- The server should have only the services and software required for it to fulfill its function on the network and no more.

To accomplish this, you should consider using the 10 Immutable Laws of Security to guide your server hardening procedures. The 10 Immutable Laws of Security were written several years ago in a Microsoft TechNet article. Although they were written as a response to issues encountered with Windows systems, these 10 laws are also quite applicable to Linux security.

NOTE: The full text of the article is available on Microsoft TechNet.

The 10 Immutable Laws of Security:

Law #1: If a bad guy can persuade you to run his program on your computer, it's not your computer anymore.

Your computer doesn't know the difference between malware and a legitimate program. It will run either one equally well.

Law #2: If a bad guy can alter the operating system on your computer, it's not your computer anymore.

Operating system files usually run with system-level privileges. If they are compromised, an intruder can gain unlimited access to your system.

Law #3: If a bad guy has unrestricted physical access to your computer, it's not your computer anymore.
Server systems should be kept in a locked server room where only authorized persons have keys or access codes. Servers should not reside in an empty cubicle in a remote corner of the office. (Yes, this practice has been observed in the past.)

User accounts on desktop systems should have strong passwords assigned. Screen savers should have a short timeout period and should require authentication credentials to be supplied to resume the session. Users should not write down their passwords on sticky notes and stick them to their desk, monitor, or keyboard (a very common practice). Users should also be educated as to how to handle social engineering strategies.

Law #4: If you allow a bad guy to upload programs to your Web site, it's not your Web site any more.

Law #5: Weak passwords trump strong security.

As mentioned previously, user accounts should be assigned strong passwords. A strong password is one that:

- Is at least 6 characters long
- Is not based on a word found in a dictionary
- Contains upper- and lower-case characters
- Contains numbers
- Does not contain words that can be associated with you personally, such as your maiden name, your birth date, your anniversary, your spouse's name, your pet's name, or the town where you grew up
- Is changed frequently

Law #6: A computer is only as secure as the administrator is trustworthy.

System admins occupy a position of trust. An untrustworthy admin completely negates all other security measures you may have put in place. You need to ensure that the people you hire to manage your systems are of impeccable character. They should not be allowed access to your servers until they have been fully vetted. Many employers now require admins to pass a background check as well as a credit check.

In accordance with the Principle of Least Privilege, admins should have access to only those aspects of the server and network they are responsible for and nothing beyond that. In other words, don't give all admins the root user's password on the server. Create admin user accounts and delegate the necessary administrative access to those accounts using sudo.

Law #7: Encrypted data is only as secure as the decryption key.

The author of the TechNet article points out that the biggest, strongest door lock in the world is of little value if you leave the key under your doormat. Encryption keys in your network operate under the same principle. If you have to leave your keys on your server, be sure to hide them in a secure location protected by file system security.
Law #8: An out-of-date virus scanner is only marginally better than no virus scanner at all.

Malware does its greatest damage when it first emerges. Scanning with a virus scanner that hasn't been updated in months offers no protection.

Law #9: Absolute anonymity isn't practical, in real life or on the Web.

Law #10: Technology is not a panacea.

Many admins rely too heavily on technology to secure their servers and network. While technology can provide you with numerous security tools, it can't replace

good, old-fashioned observation, monitoring, awareness, critical skepticism, and diligence.

In this section of the course, we implement these principles to teach you how to harden your servers. A later section discusses how to monitor your systems.

Are Linux Systems Vulnerable?

Unfortunately, many Linux administrators assume that their systems are immune to attacks and therefore more secure than Microsoft Windows systems. This is a mistaken assumption. Historically, Windows systems have been in the mainstream for a longer period of time, providing intruders with the time and access necessary to develop sophisticated attacks. This creates the illusion that Windows systems are more vulnerable than Linux systems. However, in the last 10 years, Linux has become much more widely deployed than in the past. In particular, Linux servers have become mainstays in the server room and are entrusted with providing mission-critical services as well as storing valuable data. As such, more and more Linux exploits are being observed each year.
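The character-class criteria for strong passwords from Law #5 can be approximated in a short shell function. This is only a rough sketch: it checks length and character classes, while the dictionary and personal-information criteria would require word lists and are omitted:

```shell
# Rough strong-password check: length plus character classes only
# (dictionary and personal-information checks are omitted).
is_strong() {
    pw=$1
    [ ${#pw} -ge 6 ] || return 1                     # at least 6 characters
    case $pw in *[A-Z]*) : ;; *) return 1 ;; esac    # an upper-case letter
    case $pw in *[a-z]*) : ;; *) return 1 ;; esac    # a lower-case letter
    case $pw in *[0-9]*) : ;; *) return 1 ;; esac    # a digit
    return 0
}

is_strong 'Xk7pQz2m' && echo strong
is_strong 'password' || echo weak
```

In practice, PAM modules enforce such policies at password-change time; a function like this is only useful for illustrating what the policy checks.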

Objective 2: Harden a SLES 11 Server

As discussed in the previous objective, your job is to minimize and manage the risk exploits pose to your Linux servers by hardening them. In this objective, you learn how to do this. The following topics are addressed:

- Checking File Permissions
- Securing Software and Services
- Managing User Access
- Closing Unnecessary Ports
- Harden a SLES 11 Server

NOTE: Portions of this objective are based on a Novell Cool Solutions article entitled SLES 10 Hardening.

WARNING: Be aware that the list of tasks presented here is not all-inclusive for all server environments. You must evaluate your computing environment and create your own server hardening plan that accounts for threats specific to your organization.

Checking File Permissions

Permissions on system files need to be set correctly to ensure that an unauthorized user or process cannot make unauthorized changes to these files. This is done for you during the installation of the operating system itself. You need to be aware of the following:

- Setting the System-Wide Permission Level
- Checking SUID root Permissions

Setting the System-Wide Permission Level

Assuming you have not manually changed permissions on your system files, all should be well. However, SLES 11 systems include various levels of preset permissions, which are defined in the following files:

- /etc/permissions
- /etc/permissions.easy
- /etc/permissions.local
- /etc/permissions.secure
- /etc/permissions.paranoid

There are also limited permissions settings defined for specific programs in the files in the /etc/permissions.d/ directory.

Which set of permissions is best? On multi-user systems, the strictest possible set of permissions will prevent normal non-root users from doing certain things that they commonly need to do, such as accessing an optical disc burner, shutting down the system, and so on. As discussed in the previous objective, you need to strike a balance when selecting a set of permissions: between locking the system down so tightly that users can't do the work they need to do and leaving too many security holes open.

These preset system-wide permissions settings can be set in two ways:

- You can use the YaST Local Security module (yast2 security).
- You can set them manually in the /etc/sysconfig/security file, a sample of which is shown below:

    ## Path: System/Security/Permissions
    ## Description: Configuration of permissions on the system
    ## Type: list(set,warn,no)
    ## Default: set
    ## Config: permissions
    #
    # SuSEconfig can call chkstat to check permissions and ownerships
    # for files and directories (using /etc/permissions).
    # Setting to "set" will correct it, "warn" produces warnings, if
    # something strange is found. Disable this feature with "no".
    #
    CHECK_PERMISSIONS="set"

    ## Type: string
    ## Default: "easy local"
    #
    # SuSE Linux contains two different configurations for
    # chkstat. The differences can be found in /etc/permissions.secure
    # and /etc/permissions.easy. If you create your own configuration
    # (e.g. permissions.foo), you can enter the extension here as well.
    #
    # (easy/secure local foo whateveryouwant).
    #
    PERMISSION_SECURITY="easy local"

    ## Path: System/Security/PolicyKit
    ## Description: Configuration of default PolicyKit privileges
    ## Type: list(set,warn,no)
    ## Default: set
    ## Config: polkit_default_privs
    #
    # SuSEconfig can check PolicyKit default privileges.
    # Setting this variable to "set" will change privileges that don't
    # match the default.
    # Setting to "warn" only prints a warning and
    # "no" will disable this feature.
    #
    # Defaults to "set" if not specified
    #
    CHECK_POLKIT_PRIVS=""

    ## Type: string
    ## Default: "standard"
    #
    # SUSE ships with two sets of default privilege settings. These are
    # "standard" and "restrictive".
    #
    # Examples: "standard", "restrictive foo bar"
    #
    # If not set the value depends on the setting of
    # PERMISSION_SECURITY. If PERMISSION_SECURITY contains 'secure' or
    # 'paranoid' the value will be 'restrictive', otherwise 'standard'.
    #
    # The 'local' file is always evaluated and takes precedence over
    # all other files.
    #
    POLKIT_DEFAULT_PRIVS="restrictive"

    ## Type: list(yes,yast,no)
    ## Default: yes
    #
    # When working with packages and installation sources, check keys
    # and signatures: yes = in YaST and ZENworks, yast = in YaST, no =
    # no checking.
    #
    CHECK_SIGNATURES="yes"

After making any changes in this file, you need to run SuSEconfig at the shell prompt to apply the changes to the system. If you use the YaST Local Security module instead, this is done automatically for you on exit.

Checking SUID root Permissions

In addition to setting system-wide permission levels (which include SUID permission settings), you can also check for SUID root permissions yourself. SUID stands for Set User ID. When a SUID file is executed, the process that runs it is granted access to system resources based on the user who owns the executable file, not the user who ran the command. When the root user owns a SUID file, the process created by the file can perform actions that the user who started it is not allowed to do. Only a small number of files on your SLES system need to have SUID root permissions set. However, SUID files can represent a vulnerability in your system, as many exploits are facilitated by files with this permission set.
Permissions for SUID-enabled files appear as follows:

    -rwsr-xr-x

You can search for files on your Linux system that have SUID permissions set using the following command (as root):

    find / -type f -perm -u=s -ls

An example is shown below:

    da1:~ # find / -type f -perm -u=s -ls
    ... -rwsr-xr-x 1 root root  ... Feb ... /bin/umount
    ... -rwsr-xr-x 1 root root  ... Feb ... /bin/ping
    ... -rwsr-xr-x 1 root root  ... Feb ... /bin/su
    ... -rwsr-xr-x 1 root root  ... Feb ... /bin/mount
    ... -rwsr-xr-x 1 root root  ... Feb ... /bin/ping
    ... -rwsr-xr-x 1 root audio ... Feb ... /bin/eject

You need to verify that only the required system files have this permission set. Any other files with this permission set could represent a security breach.

You should also search for files that are both executable and writable by Others. You can do this using the following command:

    find / -type f -perm -o=w,u=x -ls

You should find no matching files.

You can run these two commands manually; however, we recommend that you create a script that runs them for you and writes the output to a secure log file. Use cron to run the script on a regular basis.

You can also install the seccheck package on your system. The seccheck program runs these two commands automatically for you on a schedule, as well as performing several other security checks. The following checks are run each day:

- /etc/passwd. Checks the length, number, and content of fields in this file. It also checks for user accounts with the same UID, and looks for accounts (other than root or bin) that have a UID or GID of 0 or 1.
- /etc/shadow. Checks the length, number, and content of fields in this file. It also looks for accounts with no password assigned.
- /etc/group. Checks the length, number, and content of fields in this file.
- root user checks. Verifies that the value of umask and the PATH environment variable are secure.
- /etc/ftpusers. Checks to see if system users are in this file.
- /etc/aliases. Checks for mail aliases which execute programs.
- .rhosts. Checks to see if users' .rhosts files contain the + character.
- Home directories. Checks to see if users' home directories are writable or if they are owned by someone else.
- Dot-files.
Checks the various hidden (dot) files in users' home directories to see if they are writable or if they are owned by someone else.
- Mailbox check. Checks to see if users' mailboxes are owned by the right user and are only readable by the owner.
- NFS export check. Verifies that NFS exports are not exported globally.
- NFS import check. Verifies that NFS mounts have the nosuid option set.

- Promisc check. Checks to see if your network cards are running in promiscuous mode.
- List modules. Lists loaded modules.
- List sockets. Lists open ports.

In addition, the following checks are run once a week:

- Password check. Runs john to crack your password file. If successful, the user receives a notice directing them to change their password immediately.

  NOTE: This check is not available on SLES 11 because the password cracker John the Ripper is not included with SLES 11.

- rpm md5 check. Checks for changed files using the rpm utility's md5 checksum feature.
- suid/sgid check. Lists all SUID and SGID files.
- Exec group write. Lists all executables that are writable by Group or Others.
- Writable check. Lists all files that are writable by Others.
- Device check. Lists all devices in the system.

After you install the seccheck package on your system, the seccheck file is added to the /etc/cron.d/ directory. This file runs the /usr/lib/secchk/security-control.sh script on a daily, weekly, and monthly basis, as shown below:

Figure 4-1: The seccheck cron File

After running, the seccheck program creates the following reports in the /var/lib/seccheck/ directory:

- security-report-daily
- security-report-weekly
- security-report-monthly

Securing Software and Services

The next issue you should be aware of is hardening the server by securing its software and services. You need to consider the following:

- Removing Unnecessary Software and Services
- Using chroot Jails
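The two find-based permission checks covered under Checking File Permissions can be combined into a small audit script along these lines. For a safe demonstration, the sketch scans a throwaway directory seeded with deliberately insecure files; pointing it at / would perform a real (and much slower) audit:

```shell
# Sketch of a permission audit script. AUDIT_ROOT is a throwaway
# directory for demonstration; set it to / for a real audit.
AUDIT_ROOT=$(mktemp -d)

# Plant one SUID file and one world-writable executable to detect:
touch "$AUDIT_ROOT/suid-demo" "$AUDIT_ROOT/ww-demo"
chmod 4755 "$AUDIT_ROOT/suid-demo"   # SUID bit set
chmod 0757 "$AUDIT_ROOT/ww-demo"     # executable and world-writable

echo "--- SUID files under $AUDIT_ROOT ---"
find "$AUDIT_ROOT" -type f -perm -u=s
echo "--- world-writable executables under $AUDIT_ROOT ---"
find "$AUDIT_ROOT" -type f -perm -o=w -perm -u=x
```

In a cron job, the two find commands would write to a root-only log file, as recommended above; seccheck packages the same idea with many more checks.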

Removing Unnecessary Software and Services

Recall from the previous section that a hardened server should follow the Principle of Least Privilege: it should provide the software and services required for its role and no more. One way to do this is to review the software and services installed on the server and remove anything that violates this principle. Services potentially provide attackers with vulnerabilities they can exploit by targeting the associated open port(s) on the system. If you are not going to use a particular service, it is better not to have it installed.

The key to minimizing software and services is to define the server's role in your server deployment plan before the system is actually installed. Using this role definition, you can identify the software and services that will need to be included during the installation process. We recommend that you start with a minimal installation pattern and progressively add the packages that you know you will need. Be aware that using the default patterns provided by YaST during the installation process will probably install a lot of packages that may not be necessary. We recommend that you view the details of each pattern during the server installation and remove any packages that aren't required.

NOTE: Be careful not to break package and pattern dependencies in an attempt to trim down your installation. This could cause unexpected problems and may render your system unsupported. Using the YaST Software Management module can help you avoid this situation, as it warns you if the removal of a package will break a dependency.

YaST provides you with a handy tool for viewing what is installed on the system. Start YaST, then select Software > Software Management. In the Filter drop-down list, select Patterns. When you do, a list of patterns installed on the system is displayed.
To view all of the packages associated with a given pattern, select the desired pattern on the left. A list of packages is displayed on the right, as shown below:

Figure 4-2 Viewing Installed Packages

You can accomplish the same thing from the shell prompt using the rpm -qa command. A list of all installed packages is displayed, as shown below:

da1:~ # rpm -q --all
yast2-trans-stats
sles-installquick_en-pdf
man-pages
gnome-icon-theme
util-linux-lang
redbook
libgweather-lang
libgnome-lang
gpg2-lang
gnome-main-menu-lang
gconf2-lang
atk-lang
gnome-menus
xorg-x11-libice

The output can be quite long, so you may want to consider piping the output to the more command. You can also pipe it to the grep command and filter the output to display only the packages you want to see.

A key decision you need to make in this process is whether or not you want to include a graphical user interface on the server. Many Linux server admins prefer to run their

systems with a command-line interface only. This strategy reduces server overhead, and it also eliminates the additional avenue of attack presented by having the GUI software installed on the system. Other Linux server administrators go ahead and install the GUI environment on their servers. They feel the increased ease and speed of management the GUI provides is worth the security risk and overhead associated with the graphical environment.

One suggested solution is a compromise between both positions. You can install the GUI environment on the server, but configure the system to boot to runlevel 3 by default. Then, when the GUI is needed, it can be started using the startx command at the shell prompt. When the GUI is no longer needed, it can be exited and the system returned to a command-line-only environment.

Using chroot Jails

Certain services, such as the FTP, NTP, Postfix, and dhcpd daemons, can be run in a change root (chroot) jail. A chroot jail changes the root directory in the file system for the running process and all of its child processes. The process then cannot access files outside that false root directory in the file system, hence the name chroot jail.

Using chroot jails helps to some extent to defend your system against buffer overflow attacks. Code containing a buffer overflow error can corrupt the value that tells the CPU which memory address it should return to after it is done running the offending function. An attacker can use this error to tell the CPU to return to an address where his malicious code resides, causing it to be run without permission. Essentially, a buffer overflow attack tricks the CPU into running the intruder's code. This can be a serious issue if the process is running as a privileged Linux user, such as root. It could potentially allow the intruder to spawn a shell session and gain full access to the entire system.
Network daemons are a prime avenue intruders can use to execute a buffer overflow attack. Using chroot jails can mitigate the impact of such an attack. Running services chrooted ensures that any access to the server from outside can only affect the part of the file system where that chroot system resides. If you configure a process to chroot to a particular directory in the file system, any future system calls issued by that process will see the directory you specified as the file system root. It is then impossible for that process to access files outside that false root directory.

If you want to experiment with chroot jails, do the following:

1. Create a temporary directory somewhere in your Linux file system (for example, /jail). Then create a subdirectory within that directory named bin (for example, /jail/bin).
2. Copy the following files to the bin directory in your temporary directory:
/bin/bash
/bin/sh
/bin/ls

3. Switch to the /bin directory and enter ldd bash. Note the files listed and copy them into your temporary directory, preserving the paths that ldd reports (for example, /jail/lib64), so that the dynamic loader can find them inside the jail. Repeat this process for the sh and ls binaries.
4. At the shell prompt, enter chroot temporary_directory_name.

At this point, you are in a chroot jail. If you were to enter the ls / command, it would see the top of the file system / as the temporary directory. The shell is unaware that there are any directories in the file system above this new root directory. An example is shown below:

da1:/ # ls /
bin   dev  home   jail  lost+found  mnt  proc  sbin  sys  usr
boot  etc  iscsi  lib   media       opt  root  srv   tmp  var
da1:/ # chroot /jail
da1:/ # ls /
bin

When you end the /bin/bash process within the jail by entering exit, you return to your previous shell.

WARNING: Root can escape the jail. A process running as a non-root user, however, would first have to exploit some vulnerability within the jail to gain root privileges before it could do so.

On SLES, most services that can run in a chroot jail are configured to do so by default when they are installed. For example, to run the BIND daemon (named) chrooted, the /etc/sysconfig/named file is edited and the NAMED_RUN_CHROOTED parameter is set to yes. An example is shown below:

## Path: Network/DNS/Name Server
## Description: Name server settings
## Type: yesno
## Default: yes
## ServiceRestart: lwresd,named
#
# Shall the DNS server 'named' or the LightWeight RESolver Daemon,
# lwresd, run in the chroot jail /var/lib/named/?
#
# Each time you start one of the daemons with the init script,
# /etc/named.conf, /etc/named.conf.include, /etc/rndc.key, and all
# files listed in NAMED_CONF_INCLUDE_FILES will be copied relative
# to /var/lib/named/.
#
# The pid file will be in /var/lib/named/var/run/named/ and named
# named.pid or lwresd.pid.
#
NAMED_RUN_CHROOTED="yes"

As you can see in the example above, when this parameter is enabled, the /var/lib/named directory becomes the new root directory for the named daemon. Because the daemon will not be able to access any files outside of this directory tree

structure, the various configuration and data files used by the daemon are copied from /etc/ to this directory.

Managing User Access

The next thing you need to do to harden your SLES 11 server is to manage how users access the system. The following topics are addressed in this part of this objective:

Removing or Disabling Unnecessary User Accounts on page 207
Using SUDO to Delegate Administration Privileges on page 207
Implement ACLs on page 211
Hardening SSH Access on page 216

Removing or Disabling Unnecessary User Accounts

All guest, unused, and otherwise unnecessary user accounts must be disabled or removed from your Linux system. Again, this process should be included in your initial server deployment plan. You should identify which regular user accounts are needed and the level of access each one requires. You should also identify which system (non-login) user accounts are required on the server.

WARNING: Be sure you don't accidentally delete or disable an account that is actually needed, especially system accounts.

It's also critical that you keep track of turnover in your organization. Accounts for employees who are leaving must be disabled or deleted when they leave your organization (either temporarily or permanently).

Using SUDO to Delegate Administration Privileges

You should carefully manage how you use the root user account on your system. Remember that root has full access to the entire system. When doing day-to-day work, you should log in as a normal user and switch to root only to perform tasks that require root permissions. When done, you should switch back to your normal user account immediately. This strategy works well enough if you employ a single administrator for your Linux systems (assuming you abide by it). However, if your organization uses a distributed administration model, this strategy is lacking.
Giving a distributed admin your root user's password effectively makes that admin a full administrator with full access to the entire system. This situation violates the Principle of Least Privilege discussed in the previous objective. Distributed administrators should have only the level of access required to fill their job role and no more.

It would be preferable to provide root-level access to distributed admins for only the commands you want them to be able to run, without giving them the root password. This can be done using sudo.

The default configuration of sudo in SLES 11 requires that you know the root user's password. Obviously, if you actually know the root password, you don't really need to use sudo to complete administrative tasks. Using sudo instead of su to complete administrative tasks, however, has an advantage in that the commands you execute are logged to /var/log/messages. In addition, you do not need to retype the password for each command (as must be done with the su -c command) because it is cached for several minutes by sudo. An example of using sudo to shut down the system while logged in as geeko is shown below:

geeko@da1:~> sudo /sbin/shutdown -h now

We trust you have received the usual lecture from the local System
Administrator. It usually boils down to these three things:

    #1) Respect the privacy of others.
    #2) Think before you type.
    #3) With great power comes great responsibility.

root's password:

However, you can change the configuration of sudo such that it asks for the user's password instead of the root password. This is the preferable configuration when delegating administrative responsibilities to lower-level admins. To do this, put a pound sign (#) in front of the following two lines in /etc/sudoers using the visudo command:

# In the default (unconfigured) configuration, sudo asks for the root
# password. This allows use of an ordinary user account for
# administration of a freshly installed system. When configuring sudo,
# delete the two following lines:
Defaults targetpw   # ask for the password of the target user i.e. root
ALL ALL=(ALL) ALL   # WARNING! Only use this together with 'Defaults targetpw'!
# Runas alias specification

# User privilege specification
root ALL=(ALL) ALL
...

Using visudo, you can specify which commands a user can or cannot enter by configuring the /etc/sudoers file. The following is the general syntax of an entry in the configuration file:

user/group host = (user) command1, command2 ...

For example:

geeko ALL = /sbin/shutdown

In this example, the user geeko is able to run the /sbin/shutdown command with the permissions of root on all computers (ALL). Being able to specify the computer in /etc/sudoers allows you to copy the same file to different computers without having to grant the same permissions on all computers involved. Adding NOPASSWD: to the line allows the specified user to execute the command without entering a password:

geeko ALL = NOPASSWD: /sbin/shutdown

The /etc/sudoers file can also be configured with aliases to define who can do what as root. The following aliases are used:

User_Alias. Users who are allowed to run commands.
Cmnd_Alias. Commands that users are allowed to run.
Host_Alias. Hosts that users are allowed to run the commands on.
Runas_Alias. Usernames that commands may be run as.

You use the User_Alias directive to define an alias containing the user accounts (separated by commas) you want to allow to run commands:

User_Alias ALIAS = users

For example, to create an alias named POWERUSERS that contains the tux and geeko user accounts, you would enter the following in the /etc/sudoers file:

User_Alias POWERUSERS = tux, geeko

All alias names must start with a capital letter.

You next need to use Cmnd_Alias to define an alias that contains the commands (using the full path) that you want users to be able to run. You can separate multiple commands with commas. For example, if the users in question are developers who need to be able to kill hung processes from time to time, you could define an alias named KPROCS that contains the kill and killall commands, as shown below:

Cmnd_Alias KPROCS = /bin/kill, /usr/bin/killall

Next, you use Host_Alias to specify which systems the users can run the commands on.
For example, to let them run the commands on a system named da1, you would use the following:

Host_Alias HSTS = da1

Finally, you need to assemble all of these aliases together to define exactly what will happen. The syntax is:

User_Alias Host_Alias = (user) Cmnd_Alias

You can also combine aliases and users and commands, as in the following:

User_Alias host = (user) Cmnd_Alias, command

Using the aliases defined above, you could allow the specified users to run the specified commands on the specified hosts as root by entering the following:

POWERUSERS HSTS = (root) KPROCS

This sample configuration is shown below:

User_Alias POWERUSERS = tux, geeko
Cmnd_Alias KPROCS = /bin/kill, /usr/bin/killall
Host_Alias HSTS = da1
POWERUSERS HSTS = (root) KPROCS

To exit the editor, press Esc and then enter :exit. The visudo utility checks your syntax and informs you if you have made any errors.

At this point, the users you defined can now execute the commands you specified as root by entering sudo command at the shell prompt. For example, the geeko user could kill a process named top owned by root by entering sudo killall top at the shell prompt, as shown below:

geeko@da1:~> sudo killall top
geeko's password:
geeko@da1:~>

After supplying the geeko user's password, the process is killed. If you run the sudo command again within a few minutes from within the same terminal session, you won't be prompted for the user's password again.

YaST includes the Sudo module that you can also use to configure the sudoers file. Start YaST, then select Security and Users > Sudo. By default, a list of your sudo rules is displayed, as shown below:

Figure 4-3 Using YaST to Configure sudo Rules

Using the Sudo module in YaST, you configure your User_Aliases using the User Alias link, your Host_Aliases using the Host Alias link, and your Cmnd_Aliases using the Command Alias link. Then you use the Rules for sudo link to construct your sudo rules.

WARNING: Several commands, such as vi or less, allow the user to open a shell from within the program. If you allow a user to execute such a program as root, you have granted him access to a root shell that allows him to completely bypass the limitations you intended to impose by using sudo. Consult the sudo and sudoers manual pages for further caveats.

Implement ACLs

To further harden your server, you should consider implementing access control lists (ACLs) to grant users the appropriate level of file system access to complete their jobs. Again, the Principle of Least Privilege should be observed. Users should be given access to the files and directories in the file system that they need to complete their job and nothing more.

ACLs allow you to use more granular permissions to control access to files and directories in the Linux file system than that allowed by traditional Read, Write, and Execute POSIX permissions. ACLs allow you to configure file system permissions similar to those used by other server operating systems, such as NetWare or Windows.

To use ACLs for advanced file system access control, you need to be familiar with the following concepts:

How ACLs Work on page 212
Basic ACL Commands on page 212
ACL Terminology on page 213
ACL Types on page 215
How Applications Handle ACLs on page 216

How ACLs Work

Traditionally, three sets of permissions are defined for each file object on a Linux system. These sets include the read (r), write (w), and execute (x) permissions for each of three types of users:

User (file owner)
Group
Other authenticated users

This concept is adequate for most practical cases. In the past, however, for more complex scenarios or advanced applications, system administrators had to use a number of tricks to circumvent the limitations of the traditional permission concept.

ACLs provide an extension of the traditional file permission concept. They allow you to assign permissions to individual users or groups even if these do not correspond to the original owner or the owning group. ACLs are a feature of the Linux kernel and are supported by the ReiserFS, Ext2, Ext3, JFS, and XFS file systems. Using ACLs, you can create complex scenarios without implementing complex permission models on the application level.

The advantages of ACLs are clearly evident in situations like replacing a Windows server with a Linux server providing file and print services with Samba. Since Samba supports ACLs, user permissions can be configured both on the Linux server and in Windows.

Basic ACL Commands

There are two basic commands used to manage ACLs:

setfacl. Sets file ACLs.
getfacl. Displays the ACLs of a file or directory.

A simple scenario where ACLs come in handy would be a situation where you want to grant write access to a file to one user besides the owning user.
Using the conventional approach, you would have to create a new group, make the two users involved members of the group, change the owning group of the file to the new group, and then grant write access to the file for the group. root access would be required to create the group and to make the two users members of that group.
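The conventional approach described above can be sketched as follows. The group name project is hypothetical and used only for illustration; these commands require root:

```shell
# Create a shared group and add both users to it
# ('project' is a hypothetical group name)
groupadd project
usermod -a -G project geeko
usermod -a -G project tux

# Hand the file over to the new group and grant the group write access
chgrp project file
chmod g+w file
```

Note that geeko and tux must log in again before the new group membership takes effect, which is another inconvenience of this approach compared to ACLs.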

With ACLs, you can achieve the same results by making the file writable for the owner plus the named user:

geeko@da1:~> touch file
geeko@da1:~> ls -l file
-rw-r--r-- 1 geeko users :08 file
geeko@da1:~> setfacl -m u:tux:rw file
geeko@da1:~> ls -l file
-rw-rw-r--+ 1 geeko users :08 file
geeko@da1:~> getfacl file
# file: file
# owner: geeko
# group: users
user::rw-
user:tux:rw-
group::r--
mask::rw-
other::r--

Another advantage of this approach is that the user can decide on his own to whom he grants access to his files. The system administrator doesn't have to create a group.

Note that the output of ls changes when ACLs are used (see the second output of ls above). A + is added to alert you to the fact that ACLs are defined for this file, and the permissions displayed for the group have a different significance. They now display the value of the ACL mask, and no longer the permissions granted to the owning group.

ACL Terminology

The following list defines terms commonly used when discussing ACLs:

user class. The conventional POSIX permission concept uses three classes of users for assigning permissions in the file system: the owning user, the owning group, and other users. Three permission bits can be set for each user class, giving permission to read (r), write (w), and execute (x).

access ACL. Determines access permissions for users and groups for all kinds of file system objects (files and directories).

default ACL. Can be applied only to directories. It determines the permissions a file system object inherits from its parent directory when it is created. The following shows how default ACLs are set using the -d option and their effect:

da1:~ # mkdir testdir
da1:~ # setfacl -d -m u:tux:rwx testdir
da1:~ # getfacl testdir
# file: testdir
# owner: root
# group: root
user::rwx
group::r-x
other::r-x
default:user::rwx
default:user:tux:rwx
default:group::r-x
default:mask::rwx
default:other::r-x
da1:~ # mkdir testdir/subdir
da1:~ # getfacl testdir/subdir
# file: testdir/subdir
# owner: root
# group: root
user::rwx
user:tux:rwx
group::r-x
mask::rwx
other::r-x
default:user::rwx
default:user:tux:rwx
default:group::r-x
default:mask::rwx
default:other::r-x
da1:~ # touch testdir/file
da1:~ # getfacl testdir/file
# file: testdir/file
# owner: root
# group: root
user::rw-
user:tux:rw-    #effective:rw-
group::r-x      #effective:r--
mask::rw-
other::r--

ACL entry. Each ACL consists of a set of ACL entries. An ACL entry contains a type, a qualifier for the user or group the entry refers to, and a set of permissions. For some entry types, the qualifier for the group or users is undefined.
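ACL entries can also be removed again with setfacl. A short sketch, assuming a file and directory like those created above on an ACL-capable file system such as Ext3:

```shell
# Remove a single named-user entry from a file's access ACL
setfacl -x u:tux file

# Remove the default ACL from a directory (new objects no longer inherit it)
setfacl -k testdir

# Strip all extended ACL entries at once, leaving only the minimum ACL
setfacl -b file
```

After setfacl -b, the + marker disappears from the ls -l output again, and the group column once more shows the owning group's permissions rather than the mask.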

ACL Types

There are two basic classes of ACLs:

Minimum ACL. Includes the entries for the types owning user, owning group, and other. These correspond to the conventional permission bits for files and directories.

Extended ACL. Contains a mask entry and can contain several entries of the named user and named group types.

ACLs extend the classic Linux file permissions by the following permission types:

named user. Lets you assign permissions to individual users.
named group. Lets you assign permissions to individual groups.
mask. Lets you limit the permissions of named users or groups.

The following is an overview of all possible ACL types:

Table 4-1 ACL Types

Type            Text Form
owner           user::rwx
named user      user:name:rwx
owning group    group::rwx
named group     group:name:rwx
mask            mask::rwx
other           other::rwx

The permissions defined in the entries owner and other are always effective. Except for the mask entry, all other entries (named user, owning group, and named group) can be either effective or masked. If permissions exist in the named user, owning group, or named group entries as well as in the mask, they are effective (logical AND). Permissions contained only in the mask or only in the actual entry are not effective.

The following example determines the effective permissions for the user jane:

Table 4-2 ACL Example

Entry Type      Text Form        Permissions
named user      user:jane:r-x    r-x
mask            mask::rw-        rw-

Effective permissions: r--
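The logical-AND rule from the example above can be illustrated with a tiny bash helper that intersects an ACL entry with the mask, character by character. This helper is purely illustrative and is not part of the acl package:

```shell
#!/bin/bash
# eff ENTRY MASK -- print the effective permissions (ENTRY AND MASK).
# A permission bit survives only if it is present in both strings.
eff() {
  local entry=$1 mask=$2 out="" i c
  for i in 0 1 2; do
    c=${entry:$i:1}
    if [ "$c" != "-" ] && [ "$c" = "${mask:$i:1}" ]; then
      out="$out$c"
    else
      out="$out-"
    fi
  done
  echo "$out"
}

eff r-x rw-   # jane's entry ANDed with the mask; prints r--
```

This reproduces Table 4-2: r-x combined with the mask rw- yields the effective permissions r--.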

The ACL contains two entries, one for the named user jane and one mask entry. Jane has permissions to read and execute the corresponding file, but the mask only contains permissions for reading and writing. Because of the AND combination, the effective rights allow jane only to read the file.

How Applications Handle ACLs

You can use ACLs to implement very complex permission scenarios that meet the requirements of applications. However, some important applications still lack ACL support. Except for the Star Archiver, there are currently no backup applications included with SLES 11 that guarantee the full preservation of ACLs.

Basic file commands (including cp, mv, and ls) support ACLs, but many editors and file managers (such as Konqueror or Nautilus) do not. If you copy files with Konqueror or Nautilus, the ACLs of these files will be lost. If you modify files with an editor, the ACLs of files are sometimes preserved and sometimes not, depending on how the editor handles files. If the editor writes changes directly to the original file, ACLs are preserved. If the editor saves the changes to a new file that is then renamed using the old filename, the ACLs will probably be lost unless the editor supports ACLs.

Hardening SSH Access

The SSH daemon is commonly used by system administrators to remotely access and manage Linux server systems. Although SSH is much more secure than older remote access services, such as Telnet or RSH, it still represents an avenue that an attacker can try to use to potentially gain unauthorized access to your server system. As we have discussed elsewhere in this section, you must balance the ease of management provided by SSH against the security vulnerabilities it introduces. You could, of course, simply disable the daemon and close the door to possible attacks.
However, most administrators rely heavily on the remote management features SSH provides. Instead of shutting SSH off completely, we recommend that you manage the risk by implementing measures to make the service as secure as possible.

A very basic strategy is to change the port number that the SSH server daemon runs on. This strategy isn't foolproof. An experienced attacker can still run a port scan on your server and identify the open port the SSH daemon is using. However, changing the port number can hide the daemon from less-experienced attackers. By default, SSH runs on port 22. You can change the daemon's configuration to use a different port by editing the /etc/ssh/sshd_config file and editing the Port directive. An example is shown below:

Port

You can also use the YaST SSHD Server Configuration module to change the port number, as shown below:

Figure 4-4 Setting the SSH Daemon Port

We recommend that you use a high port number that will be easy to remember. However, we recommend that you don't use a port number that makes it obvious which service is using it. For example, we recommend that you don't use ports such as 2222. Instead, specify a port number that isn't obvious.

Another strategy you can use to harden the SSH server daemon is to disable root access to the system through the SSH daemon.

WARNING: Doing this prevents you from logging in as root from an SSH client. If you still need root-level access from a remote SSH session, you can use the su - command to switch to root, or you can configure sudo such that your regular user account can run commands with root-level access.

To disable root access, open your /etc/ssh/sshd_config file in a text editor (or access the Login Settings tab in the YaST SSHD Server Configuration module) and change the value of the PermitRootLogin directive to no. An example is shown below:

...
AllowTcpForwarding yes
Compression yes
MaxAuthTries 6
PermitRootLogin no
PrintMotd yes
...

In addition to restricting root access via SSH, you can also restrict SSH access to one or more specific user or group accounts. If a user tries to establish an SSH session with the server and their account is not specified, they are denied access. To do this, open your /etc/ssh/sshd_config file in a text editor and add one of the following directives to the file:

AllowUsers usera userb userc userd. Allows you to restrict SSH access to the user(s) specified in the directive.
AllowGroups groupa groupb groupc. Allows you to restrict SSH access to the group(s) specified in the directive. Any user who is a member of one of the specified groups is allowed to establish an SSH session with the server.

An example is shown below:

...
AllowTcpForwarding yes
Compression yes
MaxAuthTries 6
PermitRootLogin no
PrintMotd yes
PubkeyAuthentication yes
RSAAuthentication no
AllowUsers geeko
...

In this example, only the geeko user is allowed to establish an SSH session with the server. If another user tries to establish an SSH session with the server, the user receives an Access denied message. This is shown below:

login as: tux
Using keyboard-interactive authentication.
Password:
Access denied

You also need to be concerned with automated brute force attacks, which are commonly executed against SSH servers. A brute-force attack uses a program that attempts to discover passwords by systematically trying every possible combination of letters, numbers, and symbols until it discovers a correct combination that works. To guard against this type of attack, you can apply firewall rules that detect and prevent connection attempts that exceed a specified number from a particular host within a defined time period. Use the following iptables rules:

iptables -A INPUT -i eth0 -p tcp --dport 22 -m state --state NEW -m recent --set --name SSH
iptables -A INPUT -i eth0 -p tcp --dport 22 -m state --state NEW -m recent --update --seconds 600 --hitcount 4 --rttl --name SSH -j LOG --log-prefix "SSH BRUTE FORCE PROTECTION "
iptables -A INPUT -i eth0 -p tcp --dport 22 -m state --state NEW -m recent --update --seconds 600 --hitcount 4 --rttl --name SSH -j DROP

These rules allow four connection attempts within a 10 minute (600 second) time period from a single IP address. Any connections beyond this are blocked. To completely negate automated brute force attacks, you should consider switching off username/password authentication to the SSH server.
This can be done by adding the following line to your /etc/ssh/sshd_config file:

ChallengeResponseAuthentication no

After making the change, save the file and then restart your SSH daemon. Once done, only public/private key-pair authentication is allowed, rendering brute force attacks useless.
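Depending on how PAM and password authentication are set up on a given system, you may also need to set PasswordAuthentication to no before password logins are ruled out entirely. A sketch of a key-only authentication policy in /etc/ssh/sshd_config (these lines are an assumption about a sensible combination, not a verbatim SLES file):

```
# Key-only authentication policy (sketch)
PubkeyAuthentication yes
PasswordAuthentication no
ChallengeResponseAuthentication no
```

On SLES 11, the daemon can then be restarted with the rcsshd restart command. Test key-based login from a second session before closing your current one, so a configuration mistake cannot lock you out.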

Closing Unnecessary Ports

Next, you need to scan your SLES 11 server and identify open IP ports that need to be closed. In this objective, you learn how to do this. The following topics are addressed:

Identifying Open Ports on page 219
Closing Ports on page 224

Identifying Open Ports

There are several options available for identifying open ports on your server system. One of the easiest ways (although not the most comprehensive) is to use the YaST System Services (Runlevel) module. Do the following:

1. Start YaST; then select System > System Services (Runlevel).
2. Select the Expert Mode option. When you do, a list of all installed services on your system is displayed along with their status. A sample is shown below:

Figure 4-5 Viewing Running Services

3. Review the list of services and identify any that shouldn't be running.
4. To stop a service that shouldn't be running, do the following:
a. From the list displayed, select the service to be stopped.

b. Stop the service by selecting Start/Stop/Refresh > Stop Now.
c. Disable the service by selecting Set/Reset > Disable the Service.
5. When done, select OK.
6. (Optional) To prevent the service from being started again, use the YaST Software Management module or the rpm -e command to uninstall it from the system.

In addition, you should also run a port scan of your system to see which IP ports are currently open. Many programs are available to do this. One of the more commonly used utilities is the nmap port scanner. To install nmap and use it to scan your server for open IP ports, do the following:

1. Install nmap on a system in your network. On a SLES 11 system, this can be done by completing the following:
a. Start YaST, then select Software > Software Management.
b. In the Search field, enter nmap.
c. Mark the nmap package on the right, as shown below:

Figure 4-6 Installing nmap

d. Select Accept.
e. When prompted to install dependent packages, select Continue.
f. When the installation is complete, close YaST.
2. To run an exhaustive scan of your server, enter the following command from the system where nmap is installed:

nmap -v -A server_dns_name_or_ip_address
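In addition to scanning from a remote system, you can enumerate listening sockets locally on the server itself, which also shows which process owns each port:

```shell
# List all listening TCP and UDP sockets together with the owning process
# (run as root so the PID/program column is populated)
netstat -tulpn

# On systems where netstat has been replaced by iproute2's ss utility,
# the equivalent is:
ss -tulpn
```

Cross-checking the local socket list against the results of a remote nmap scan is a quick way to confirm that your firewall is filtering the ports you expect it to.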

The -A option enables OS detection and version detection, enables script scanning, and runs a traceroute. The -v option increases the verbosity level of the output. You can use multiple v's to further increase the verbosity of the output.

NOTE: The nmap utility is a very powerful tool. A full discussion of all the options you can use with the nmap command is beyond the scope of this objective. See docs.html for full documentation.

Sample output from a scan of a SLES 11 system that has its host-based firewall disabled is shown below:

DA2:~ # nmap -v -A da1.digitalairlines.com

Starting Nmap 4.75 ( ) at :39 MST
Initiating ARP Ping Scan at 14:39
Scanning [1 port]
Completed ARP Ping Scan at 14:39, 0.02s elapsed (1 total hosts)
Initiating Parallel DNS resolution of 1 host. at 14:39
Completed Parallel DNS resolution of 1 host. at 14:39, 0.00s elapsed
Initiating SYN Stealth Scan at 14:39
Scanning [1000 ports]
Discovered open port 53/tcp on
Discovered open port 443/tcp on
Discovered open port 22/tcp on
Discovered open port 80/tcp on
Discovered open port 3306/tcp on
Discovered open port 3260/tcp on
Discovered open port 111/tcp on
Completed SYN Stealth Scan at 14:39, 0.23s elapsed (1000 total ports)
Initiating Service scan at 14:39
Scanning 7 services on
Completed Service scan at 14:40, 83.57s elapsed (7 services on 1 host)
Initiating OS detection (try #1) against
SCRIPT ENGINE: Initiating script scanning.
Initiating SCRIPT ENGINE at 14:40
Completed SCRIPT ENGINE at 14:41, 31.08s elapsed
Host appears to be up... good.
Interesting ports on :
Not shown: 993 closed ports

222 Novell Training Services (en) 15 April 2009 SUSE Linux Enterprise Server 11: Certified Linux Engineer 11 / Manual PORT STATE SERVICE VERSION 22/tcp open ssh OpenSSH 5.1 (protocol 2.0) 53/tcp open domain ISC BIND P2 zone-transfer: digitalairlines.com. SOA DA1.digitalairlines.com. root.da1.digitalairlines.com. digitalairlines.com. NS da1.digitalairlines.com. da1.digitalairlines.com. A da2.digitalairlines.com. A da3.digitalairlines.com. A _ digitalairlines.com. SOA DA1.digitalairlines.com. root.da1.digitalairlines.com. 80/tcp open http Apache httpd ((Linux/SUSE)) _ HTML title: Access forbidden! 111/tcp open rpcbind rpcinfo: ,3,4 111/udp rpcbind _ ,3,4 111/tcp rpcbind 443/tcp open ssl/http Apache httpd ((Linux/SUSE)) _ SSLv2: server still supports SSLv2 _ HTML title: Access forbidden! 3260/tcp open unknown? 3306/tcp open mysql MySQL (unauthorized) MAC Address: 00:50:56:00:00:01 (VMWare) Device type: general purpose Running: Linux 2.6.X OS details: Linux Uptime guess: days (since Thu Jan 21 18:35: ) Network Distance: 1 hop TCP Sequence Prediction: Difficulty=202 (Good luck!) IP ID Sequence Generation: All zeros Read data files from: /usr/share/nmap OS and Service detection performed. Please report any incorrect results at Nmap done: 1 IP address (1 host up) scanned in seconds Raw packets sent: 1020 (45.640KB) Rcvd: 1017 (41.440KB) DA2:~ # As you can see in the output above, this system has seven network ports open: 53: DNS server 80 and 443: Apache Web server 22: SSH server 3306: MySQL server 3260: SMT server 111: SUN remote procedure call Running the scan a second time with the targets system s firewall enabled reveals that only the DNS and SSH services are exposed to the network. This is shown in the example below: 222

223 Harden Servers da2:~ # nmap -v -A da1.digitalairlines.com Starting Nmap 4.75 ( ) at :58 MST Initiating ARP Ping Scan at 14:58 Scanning [1 port] Completed ARP Ping Scan at 14:58, 0.04s elapsed (1 total hosts) Initiating Parallel DNS resolution of 1 host. at 14:58 Completed Parallel DNS resolution of 1 host. at 14:58, 0.00s elapsed Initiating SYN Stealth Scan at 14:58 Scanning [1000 ports] Discovered open port 53/tcp on Discovered open port 22/tcp on Completed SYN Stealth Scan at 14:58, 4.05s elapsed (1000 total ports) Initiating Service scan at 14:58 Scanning 2 services on Completed Service scan at 14:58, 6.01s elapsed (2 services on 1 host) Initiating OS detection (try #1) against SCRIPT ENGINE: Initiating script scanning. Initiating SCRIPT ENGINE at 14:58 Completed SCRIPT ENGINE at 14:58, 4.04s elapsed Host appears to be up... good. Interesting ports on : Not shown: 998 filtered ports PORT STATE SERVICE VERSION 22/tcp open ssh OpenSSH 5.1 (protocol 2.0) 53/tcp open domain ISC BIND P2 zone-transfer: digitalairlines.com. SOA DA1.digitalairlines.com. root.da1.digitalairlines.com. digitalairlines.com. NS da1.digitalairlines.com. da1.digitalairlines.com. A da2.digitalairlines.com. A da3.digitalairlines.com. A _ digitalairlines.com. SOA DA1.digitalairlines.com. root.da1.digitalairlines.com. MAC Address: 00:50:56:00:00:01 (VMWare) Warning: OSScan results may be unreliable because we could not find at least 1 open and 1 closed port Device type: general purpose Running: Linux 2.6.X OS details: Linux Uptime guess: days (since Thu Jan 21 18:30: ) Network Distance: 1 hop TCP Sequence Prediction: Difficulty=202 (Good luck!) IP ID Sequence Generation: All zeros Novell Training Services (en) 15 April 2009 Read data files from: /usr/share/nmap OS and Service detection performed. Please report any incorrect results at Nmap done: 1 IP address (1 host up) scanned in seconds Raw packets sent: 2033 (91.208KB) Rcvd: 17 (1096B) 223
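When comparing two scans such as the ones above, it helps to reduce each report to just its open ports. The following sketch does this with awk against a saved copy of the scan output; the file path and the abbreviated sample report are illustrative, not taken from a live scan:

```shell
# Save an abbreviated nmap report (stand-in for real scan output)
cat > /tmp/nmap-scan.txt <<'EOF'
PORT     STATE SERVICE VERSION
22/tcp   open  ssh     OpenSSH 5.1 (protocol 2.0)
53/tcp   open  domain  ISC BIND
80/tcp   open  http    Apache httpd
EOF

# Print only the numbers of open TCP ports, one per line
awk -F'/' '/\/tcp +open/ { print $1 }' /tmp/nmap-scan.txt
```

Run against a full report, this prints one open port number per line, which is convenient for comparing scans taken at different times.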

224 Novell Training Services (en) 15 April 2009 SUSE Linux Enterprise Server 11: Certified Linux Engineer 11 / Manual Closing Ports If the results of your port scan identified network ports on your system that need to be closed, you have several options. If the service listening on the port isn t necessary, you can simply disable the service or even uninstall the software from the system. The procedure for doing this was discussed previously in Identifying Open Ports on page 219. If the service is necessary and needs to remain running but should be inaccessible externally, you can modify the service s configuration to listen only on specific interfaces. You can also modify your host-based firewall rules on the server. An easy way to do this is to use the YaST Firewall module. Do the following: 1. Start YaST on your server; then select Security and Users > Firewall. The following is displayed: Figure 4-7 Enabling the Host Firewall in YaST The configuration is divided into seven sections that can be accessed directly from the tree structure in the left frame of the window: Start-Up. Configures the start-up behavior of the firewall. The firewall is started automatically by default. You can start and stop the firewall here as well. Interfaces. Lists all known network interfaces. You can add an interface to or remove an interface from a zone here. 224

225 The firewall defines three security zones: Harden Servers External Zone. In most cases, the external network is the Internet, but it could be another insecure network, such as a WLAN. Internal Zone. The internal zone is a private network, in most cases the LAN. If the hosts on this network use IP addresses from the private range, you can use SuSEfirewall to enable network address translation (NAT) to hide your internal hosts from the external network. Demilitarized Zone (DMZ). DMZ systems are isolated from the internal network. Hosts located in the DMZ can be reached both from the external and the internal network, but they are not allowed to access the internal network themselves. This setup can be used to put an additional line of defense in front of the internal network. Allowed Services. Specifies the services that should be made available to external hosts. Select the desired zone under Allowed Services for Selected Zone; then specify the services to allow from the list displayed. Masquerading. Enables masquerading, which is a Linux form of network address translation (NAT). Masquerading hides your internal network from external networks, such as the Internet, while enabling hosts in the internal network to access the external network transparently. Requests from the external network are blocked while requests from the internal network appear to originate from the masquerading server. Broadcast. Configures the zones and networks that will allow broadcasts. IPsec Support. Specifies whether IPsec should be enabled or not. IPsec allows encrypted communications between trusted hosts or networks through an untrusted network, such as the Internet. You can use the Details option in this screen to configure how successfully decrypted IPsec packets are to be handled. For example, they can be handled as if they came from the internal zone. Logging Level. Configures logging for accepted and not accepted packets. 
You can select from Log All, Log Only Critical, or Do Not Log Any for both types of packets. Custom Rules. Configures special firewall rules that allow connections by matching specified criteria such as source network, protocol, destination port, and source port. You can configure rules for the external, internal, and demilitarized zones. 2. Verify that the firewall is enabled at system boot and that it is currently running. 3. Select Allowed Services. A screen similar to the following is displayed: Novell Training Services (en) 15 April

226 Novell Training Services (en) 15 April 2009 SUSE Linux Enterprise Server 11: Certified Linux Engineer 11 / Manual Figure 4-8 Viewing Open Ports in the Host Firewall 4. If the service that needs its port to be closed in the firewall is listed in the Service to Allow field, select it and then select Delete. 5. If the service that needs its port to be closed in the firewall is not listed in the Service to Allow field, it probably has an advanced configuration that needs to be modified. Do the following: a. Select Advanced. The following is displayed: Figure 4-9 Configuring Allowed Firewall Ports 226

227 Harden Servers b. In the TCP Ports and/or UDP Ports fields, locate the port that needs to be closed and delete it. c. Select OK. 6. Select Next. 7. Review your changes in the Firewall Configuration Summary screen; then select Finish. 8. Run a port scan using nmap from a different system to verify that the port was closed in the firewall. You can also accomplish the same task from the command line by editing the SuSEfirewall2 configuration files located in /etc/sysconfig. SuSEfirewall2 is a front-end to the iptables package that provides your host-based firewall. Novell Training Services (en) 15 April 2009 NOTE: If you are comfortable working with the iptables command at the shell prompt, you can use it to configure your host firewall instead of SuSEfirewall. SuSEfirewall2 is a script that reads the variables set in the /etc/sysconfig/ SuSEfirewall2 configuration file to generate a set of iptables rules. Any kind of network traffic not explicitly allowed by the filtering rule set is suppressed by iptables. Therefore, each of the interfaces with incoming traffic must be placed into one of the three zones. For each of the zones, you can define the services or protocols allowed. The rule set is only applied to packets originating from remote hosts. Locally generated packets are not captured by the firewall. The configuration of SuSEfirewall2 can be performed with YaST, as discussed previously. It can also be done manually using the /etc/sysconfig/ SuSEfirewall2 file (which is well commented). NOTE: Several configuration examples are available in /usr/share/doc/packages/ SuSEfirewall2/EXAMPLES. The following configuration directives are contained in this file: FW_DEV_EXT. Specifies that the device is directly connected to the Internet. For a modem connection, enter ppp0. For an ISDN link, use ippp0. DSL connections use dsl0. Specify auto to use the interface that corresponds to the default route. FW_DEV_INT. 
Specifies that the device is linked to the internal private network (such as eth0). Leave this blank if there is no internal network and the firewall protects only the host on which it runs. FW_ROUTE: Enables routing. If you want to use the masquerading function in the firewall, set this parameter to yes. NOTE: Masquerading is a Linux form of Network Address Translation (NAT). 227

228 Novell Training Services (en) 15 April 2009 SUSE Linux Enterprise Server 11: Certified Linux Engineer 11 / Manual With masquerading enabled, your internal hosts will not be visible to the outside, because their private network addresses are translated into this host s public network address. WARNING: If your firewall does not use masquerading, you should only set this parameter to yes if you want to allow access to the internal network from the external network. In this case, your internal hosts will need to use registered (public) IP addresses. FW_MASQUERADE. Enables the masquerading function. FW_MASQ_NETS. Specifies the hosts or networks to masquerade. You can specify multiple networks by leaving a space between the individual entries. For example: FW_MASQ_NETS=" / " FW_PROTECT_FROM_INT. If set to yes, protects your host from attacks originating from your internal network. Services are only available to the internal network if explicitly enabled. NOTE: See FW_SERVICES_INT_TCP and FW_SERVICES_INT_UDP. FW_SERVICES_EXT_TCP. Specifies the TCP ports that should be opened in the firewall and made available to the external network. FW_SERVICES_EXT_UDP. Specifies the UDP ports that should be opened in the firewall and made available to the external network. FW_SERVICES_ACCEPT_EXT. Specifies services to allow from the Internet. This is a more generic form of the FW_SERVICES_EXT_TCP and FW_SERVICES_EXT_UDP settings, and more specific than FW_TRUSTED_NETS. The syntax is as follows: network,protocol[,dport][,sport] For example: 0/0,tcp,22 FW_SERVICES_INT_TCP. Defines the TCP services that are available to the internal network. The syntax is the same as that used for FW_SERVICES_EXT_TCP, but the settings are applied to the internal network. This parameter only needs to be configured if FW_PROTECT_FROM_INT is set to yes. FW_SERVICES_INT_UDP. Defines the UDP services that are available to the internal network. 
The syntax is the same as that used for FW_SERVICES_EXT_UDP, but the settings are applied to the internal network. 228

229 Harden Servers This parameter only needs to be configured if FW_PROTECT_FROM_INT is set to yes. FW_SERVICES_ACCEPT_INT. Specifies services to allow through the firewall to internal hosts. See FW_SERVICES_ACCEPT_EXT. If you need to close a port in the firewall, look for the FW_SERVICES_EXT_TCP="port" and FW_SERVICES_EXT_UDP="port" directives. If the port in question is listed in one or both of these directives, remove it and save your changes to the file. NOTE: The FW_SERVICES_EXT_TCP and FW_SERVICES_EXT_UDP parameters correlate to the TCP Port and UDP Port fields in the Additional Allowed Ports screen in the YaST Firewall module, as shown in Figure 4-9 on page 226. Novell Training Services (en) 15 April 2009 In addition to the FW_SERVICES_EXT_TCP and FW_SERVICES_EXT_UDP parameters, you also need to look at the FW_CONFIGURATIONS_EXT="list_of_services" parameter. Within the /etc/ sysconfig directory is the SuSEfirewall2.d/services subdirectory. This subdirectory contains separate firewall configuration files for some of the more commonly implemented services on SLES 11. These are shown below: DA1:/etc/sysconfig/SuSEfirewall2.d/services # ls TEMPLATE cups netbios-server samba-client vnc-httpd ypbind apache2 iscsitarget nfs-client samba-server vnc-server apache2-ssl isns ntp squid xdmcp bind mysql postfix sshd xorg-x11-server An example of the sshd firewall configuration file in this directory is shown below: DA1:/etc/sysconfig/SuSEfirewall2.d/services # cat./sshd ## Name: Secure Shell Server ## Description: Open ports for Secure Shell Server # space separated list of allowed TCP ports TCP="ssh" The FW_CONFIGURATIONS_EXT parameter in /etc/sysconfig/ SuSEFirewall2 identifies which of these service-specific configuration files in / etc/sysconfig/susefirewall2.d/services are active, and hence which ports are open in the firewall. 
For example, to open the host firewall to allow SSH and DNS traffic through, you would configure the following parameter in /etc/ sysconfig/susefirewall2: FW_CONFIGURATIONS_EXT="bind sshd" NOTE: The FW_CONFIGURATIONS_EXT parameter correlates with the Allowed Service field in the YaST Firewall module, as shown in Figure 4-8 on page
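The relationship between FW_CONFIGURATIONS_EXT and the per-service files can be demonstrated with a small mock-up of the services directory. The /tmp paths below are stand-ins for /etc/sysconfig/SuSEfirewall2.d/services, used here only for illustration:

```shell
# Mock of the services directory layout described above
mkdir -p /tmp/services
printf 'TCP="ssh"\n' > /tmp/services/sshd
printf 'TCP="domain"\nUDP="domain"\n' > /tmp/services/bind

# For each name in FW_CONFIGURATIONS_EXT, show the TCP ports it opens
FW_CONFIGURATIONS_EXT="bind sshd"
for svc in $FW_CONFIGURATIONS_EXT; do
  echo "$svc opens $(grep '^TCP=' /tmp/services/$svc)"
done
```

Each name listed in FW_CONFIGURATIONS_EXT selects the file of the same name, and the TCP/UDP variables inside that file determine which ports the firewall opens.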

230 Novell Training Services (en) 15 April 2009 SUSE Linux Enterprise Server 11: Certified Linux Engineer 11 / Manual After configuring the firewall, you need to activate the new rules and then test your setup. The firewall rule sets are implemented by entering SuSEfirewall2 start as root at the shell prompt. Then use nmap to run a new port scan of the server system and verify that the previously open ports have been closed. NOTE: You can also use the nessus package to scan your system for vulnerabilities. You can get more information about nessus at ( 230
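One way to verify the result is to compare the open-port lists extracted from scans taken before and after activating the firewall. In the sketch below, the two lists are sample data written with printf; in practice you would derive them from saved nmap reports, and the file names are illustrative:

```shell
# Port lists from a before/after pair of scans (sample data)
printf '22\n53\n80\n111\n443\n3260\n3306\n' > /tmp/ports-before.txt
printf '22\n53\n' > /tmp/ports-after.txt

# Ports that were open before but are no longer reachable afterwards
grep -vx -f /tmp/ports-after.txt /tmp/ports-before.txt
```

The ports printed are exactly those the new firewall rules closed; an empty result would mean the rule change had no effect.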

Exercise 4-1 Harden a SLES 11 Server

In this lab, you harden an existing SLES 11 installation. You can find this lab in the workbook.

(End of Exercise)

232 Novell Training Services (en) 15 April 2009 SUSE Linux Enterprise Server 11: Certified Linux Engineer 11 / Manual Objective 3 Harden Services with AppArmor Regardless of how hard you try, it is impossible to guarantee that your server system will never be compromised. When administering critical servers where the integrity and security of your data is paramount, we recommend that you take additional measures to ensure that the system is still under control of the administrator. To do this, you should consider hardening your SLES 11 system using Novell AppArmor. AppArmor is a mandatory access control scheme that lets you specify which files the program may read, write, and execute on a per program basis. AppArmor secures applications by enforcing good application behavior without relying on attack signatures. This allows it to prevent attacks even if they are exploiting previously unknown vulnerabilities. To work with AppArmor, you need to understand how to do the following: Improve Application Security with AppArmor on page 232 Create and Manage AppArmor Profiles on page 233 Control AppArmor on page 250 Monitor AppArmor on page 253 Protect Services with AppArmor on page 259 Improve Application Security with AppArmor Even if you keep your software up-to-date, you can still be hit with an attack that exploits a vulnerability that is not yet known or for which there is no fix available yet. The idea of AppArmor is to have the kernel limit what a software program can do. In addition to the limitations set by the usual user and group permissions, limitations are imposed based on a set of rules for a specific program. Often an intruder does not directly gain root privileges. By exploiting a vulnerability in a web server, for example, he might be able to start a shell as the web server user account. With that unprivileged access, he exploits yet another vulnerability in some other software program, eventually gaining root-level access. 
With AppArmor, even if an intruder manages to find a vulnerability in some server software, the damage is limited due to the fact that the intruder may not access any files or execute any programs beyond what the application is allowed by AppArmor s rule set. This concept is not limited to normal accounts; it can also limit (to a certain extent) what an application running with root privileges may do. For example, a confined process cannot call certain system calls, even if running as root. Thus even if an attacker gains root privileges, he would still be limited in what he might be able to do. AppArmor hooks into the Linux Security Modules Framework of the kernel. Profiles in /etc/apparmor.d/ are used to configure which application may access and execute which files. 232

AppArmor must be activated before the applications it controls are started. Applications already running when AppArmor is activated are not controlled by AppArmor, even if a profile has been created for them. Therefore, AppArmor is activated early in the boot process by the /etc/init.d/boot.apparmor script.

In a default installation of SLES 11, AppArmor is actively protecting several services using a set of profiles provided by Novell. Using the provided YaST modules or command line tools, you can easily adapt these to your needs and create new profiles for additional applications you want to protect.

As a general rule, you should use AppArmor to confine programs that grant privileges to resources that the user running the program does not have, including:

Network daemons (programs that use open network ports)

Cron jobs

Web applications

NOTE: The AppArmor documentation sometimes refers to the product by its former name, Subdomain.

Create and Manage AppArmor Profiles

SLES 11 comes with AppArmor profiles for various applications, such as syslog-ng, ntpd, nscd, and others. The profiles are contained in /etc/apparmor.d/. The filename of the profile represents the filename of the application, including the path, with each / being replaced by a period (.). Therefore, the profile for /usr/sbin/squid would be contained in /etc/apparmor.d/usr.sbin.squid.

A profile can include other files using an #include statement. The /etc/apparmor/abstractions/ directory contains several include files that are designed to be included in AppArmor profiles, depending on the kind of program to be protected. There are abstractions for files that should be readable or writable by all programs (base), for nameservice-related files like /etc/passwd (nameservice), for files related to console operations (console), and others.

The profiles are plain text files, and it is therefore possible to create and manage them using any text editor.
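The naming convention described above can be expressed as a simple text transformation. The helper function below is purely illustrative (it is not part of AppArmor): it drops the leading / and replaces each remaining / with a period:

```shell
# Hypothetical helper mirroring the profile naming convention
path_to_profile() {
  echo "$1" | sed -e 's|^/||' -e 's|/|.|g'
}

path_to_profile /usr/sbin/squid   # prints usr.sbin.squid
```

So the profile for /usr/sbin/squid lives in /etc/apparmor.d/usr.sbin.squid, and the profile for /sbin/klogd in /etc/apparmor.d/sbin.klogd.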
However, using command line tools or YaST greatly simplifies the profile creation process. In addition to the active profiles in /etc/apparmor.d/, there are several sample profiles available in /etc/apparmor/profiles/extras/ that you can customize to meet your needs. You simply make the appropriate changes and then copy them to /etc/apparmor.d/ to activate them. To successfully administer AppArmor, you need to be familiar with the following concepts and tasks: AppArmor Profiles and Rules on page

AppArmor ChangeHat on page 237
Administering AppArmor Profiles with YaST on page 238
Administer AppArmor Profiles with Command Line Tools on page 247

AppArmor Profiles and Rules

Novell AppArmor profiles contain two main types of AppArmor rules:

Path entries. Specify what a process can access in the file system.

Capability entries. Specify specific POSIX capabilities a process is granted, overriding the default limitation.

NOTE: For more information, see the apparmor man page and the capabilities man page.

Other files containing AppArmor rules can be pulled in with #include statements. For example, the profile (/etc/apparmor.d/sbin.klogd) for /sbin/klogd, the kernel log daemon, is shown below:

# Profile for /sbin/klogd

#include <tunables/global>

/sbin/klogd {
   #include <abstractions/base>

   capability sys_admin,

   network inet stream,

   /boot/system.map* r,
   /dev/tty rw,
   /sbin/klogd rmix,
   /var/log/boot.msg rwl,
   /var/run/klogd.pid krwl,
   /var/run/klogd/klogd.pid krwl,
   /var/run/klogd/kmsg r,
}

Comments start with a # sign, as shown in the first line of the file. However, #include statements are not interpreted as comments, but are used to include rules from other files, as discussed earlier. The path as given above is relative to the /etc/apparmor.d/ directory. The /etc/apparmor.d/tunables/global directive in line 3 above is used to include definitions that should be available in every profile. It in turn includes the /etc/apparmor.d/tunables/home and /etc/apparmor.d/

tunables/proc files, which define variables used in the profiles.

The /etc/apparmor.d/abstractions/ directory contains files with general rules grouped by common application tasks. These include access to files all applications need (base), access to authentication mechanisms (authentication), graphics environments (kde, gnome), name resolution (nameservice), and others. Instead of having these redundantly specified in several profiles, they are defined once and included in all profiles that need them.

Line 5 in the example above specifies the absolute path to the program confined by AppArmor. The corresponding rules, as well as any include files, follow within the curly braces {}.

Line 8 enables the sys_admin capability for this program. Any other capabilities needed would be listed in separate lines starting with capability.

Line 10 allows an IPv4 socket of type stream. The network rule restricts all socket-based operations. The mediation checks whether a socket of a given type and family can be created, read, or written to. There is no mediation based on port number or protocol beyond tcp, udp, and raw.

The remaining lines list files and directories with the corresponding access permission granted. For lines listing files and directories, the following wildcards can be used:

*. Substitutes any number of characters, except /.

**. Substitutes any number of characters, including /. Use ** to include subdirectories.

?. Substitutes any single character, except /.

[abc]. Substitutes a, b, or c.

[a-d]. Substitutes a, b, c, or d.

{ab,cd}. Substitutes either ab or cd.

The permissions that can be granted include the following:

r. Read mode allows the program to have read access to the resource. Read access is required for scripts. A running process needs this permission to allow it to dump core or to be attached to with ptrace.

w. Write mode allows the program to have write access to the resource.
Files must have this permission if they are to be unlinked (removed). NOTE: The mode conflicts with Append mode. a. Append mode allows the program to have limited, appending-only write access to the file. Append mode will prevent an application from opening the file for write access unless it passes the O_APPEND parameter flag on open. 235

NOTE: The mode conflicts with Write mode.

l. Link mode mediates access to symbolic links and hard links. It grants the privilege to unlink (remove) files.

k. Lock mode allows the program to lock a file with the specified name. This permission covers both advisory and mandatory locking.

m. Allow Executable Mapping mode allows a file to be mapped into memory using mmap's PROT_EXEC flag, which marks the pages as executable. It is used on some architectures to provide non-executable data pages, which can complicate exploit attempts. AppArmor uses this mode to limit which files a well-behaved program may use as libraries.

ix. Inherit Execute mode allows the executed resource to inherit the current profile.

px. Discrete Profile Execute mode requires that a profile be defined for the resource executed. If there is no profile defined, access is denied.

cx. Similar to px, but the profile for the resource executed is included as part of the current local profile, not as a separate file. The following example shows how a local profile for /bin/grep is included in the profile for /usr/lib/firefox/firefox.sh:

# Last Modified: Fri Jan 15 04:29:
#include <tunables/global>

/usr/lib/firefox/firefox.sh {
   #include <abstractions/base>
   ...
   profile /bin/grep {
      #include <abstractions/base>
      /bin/grep mr,
      ...
   }
}

If no local profile exists, execution is denied.

cix: Same as cx, but if there is no local profile provided, the execution will fall back to ix.

Px: Discrete Profile Execute mode -- scrub the environment allows the named program to run in px mode, but AppArmor will invoke the Linux kernel's unsafe_exec routines to scrub the environment, similar to setuid programs.

NOTE: See man 8 ld.so for information about setuid/setgid environment scrubbing.

Cx, Cix: cx and cix with scrubbed environment as explained under Px.

237 Harden Servers ux: Unconstrained Execute mode allows the program to execute the resource without a Novell AppArmor profile being applied to the executed resource. This should be used only in rare circumstances. Ux: Unconstrained execute -- scrub the environment. As with ux, this mode should only be used in rare exceptions. The modes with an x in the name, such as ix, px, Px, ux, and Ux, cannot be combined. NOTE: See the apparmor.d man page for more information. Novell Training Services (en) 15 April 2009 AppArmor ChangeHat An AppArmor profile represents the security policy for an individual program instance or process. However, if a portion of the program needs different access permissions than other portions, the program can change hats to use a security context that is different from the main program. This is known as a hat or subprofile. ChangeHat enables programs to change to or from a hat within an AppArmor profile. It enables you to define security at a finer level than process. This feature requires that each application be made ChangeHat aware, meaning that it is modified to make a request to the AppArmor module to switch security domains at arbitrary times during execution. Two examples of ChangeHat-aware applications are the Apache Web server and Tomcat. A profile can have an arbitrary number of subprofiles, but there are only two levels: a subprofile cannot have further sub-subprofiles. A subprofile is written as a separate profile and named as the containing profile followed by the subprofile name, separated by a ^. An example is shown below: /usr/sbin/httpd2-prefork { #include <abstractions/base> #include <abstractions/nameservice> #include <abstractions/php5>... ^/subdir/index.php { #include <abstractions/nameservice>... Subprofiles must be stored in the same file as the parent profile. NOTE: Novell AppArmor provides the mod_apparmor module (from the apache2-mod_apparmor package) for the Apache program. 
This module makes the Apache Web server ChangeHat aware. Install it along with Apache. 237

238 Novell Training Services (en) 15 April 2009 SUSE Linux Enterprise Server 11: Certified Linux Engineer 11 / Manual When Apache is ChangeHat aware, it checks for the following customized AppArmor security profiles in the order specified for every URI request that it receives. URI-specific hat ^app/templates/images/left.gif {... } DEFAULT_URI If no URI-specific hat is found, mod_apparmor will fall back to attempting to use the hat DEFAULT_URI ^DEFAULT_URI { #include <abstractions/nameservice> #include <abstractions/base>... } Most static web pages can simply make use of the DEFAULT_URI hat. If the DEFAULT_URI subprofile does not exist, mod_apparmor will fall back to using the global apache profile. HANDLING_UNTRUSTED_INPUT Before any requests come in to apache, mod_apparmor will attempt to change hat into the HANDLING_UNTRUSTED_INPUT hat. mod_apparmor will attempt to use this hat while apache is doing the initial parsing of a given http request, before it is given to a specific handler (like mod_php) for processing: ^HANDLING_UNTRUSTED_INPUT { #include <abstractions/nameservice> /var/log/apache2/* w, /**.htaccess r, } NOTE: For more details on hats, see Section 5 of the AppArmor Administration Guide in /usr/ share/doc/manual/apparmor-admin_en/manual/index.html. Administering AppArmor Profiles with YaST The profiles presented thus far in this part of this objective have been rather simple and short. If you browse through the profiles in /etc/apparmor.d/ or /etc/ apparmor/profiles/extras/, you will see that some profiles can be much more complex. AppArmor comes with tools that help you create and maintain AppArmor profiles. YaST also provides modules that provide a graphical interface for these tools. You can use the Add Profile Wizard in YaST to create a new profile or modify an existing one. Before running the Wizard to profile an application, you need to first stop the application you want to create a profile for. 238

239 Figure 4-10 Harden Servers To access the Add Profile Wizard, start YaST and select the Novell AppArmor group; then select Add Profile Wizard. The first step is to enter the application you want to profile, as shown below: Add Profile Wizard Novell Training Services (en) 15 April 2009 If no path is given, the Wizard looks for the binary in the search path contained in the $PATH environment variable. Select Create to continue. When you start the Add Profile Wizard the first time, the Repository dialog is displayed. Here you can specify whether you want to enable access to a repository of prepared profiles at apparmor.opensuse.org ( The next dialog asks you to start and use the application you want to profile, as shown below: 239

240 Novell Training Services (en) 15 April 2009 SUSE Linux Enterprise Server 11: Certified Linux Engineer 11 / Manual Figure 4-11 AppArmor Profile Wizard Start the application and use it in the same way that you expect it to be used in your production environment. For instance, if you are profiling a Web server, access it in a way that you expect it to be accessed during normal operation. For a Web browser, use it in a way you expect the users to access web content. During this learning phase, any access to files or capabilities needed by the application is granted as well as logged in the /var/log/audit/audit.log log file. Because full access is granted, you have to make sure that the system is absolutely secure such that no attack can happen during this phase of profile creation. AppArmor is not yet protecting your application. Once you feel you have gone through all expected uses of the application, select Scan system log for AppArmor events in the YaST AppArmor Profile Wizard dialog. For each event you are presented with a dialog where you can decide what should happen when this event occurs in the future. The dialog offers different options, depending on the event. For process access, a screen similar to the following is displayed: 240

Figure 4-12  External Program Configuration

You can set the following options:

Inherit. The executed resource inherits the current (parent's) profile.

Profile. Requires that a specific profile exist for the executed program.

Child. The profile for the executed program is integrated into the profile of the program being configured. These rules apply only when the program is a child process of the program for which the profile is configured.

Name. In addition to the cx transition mode, AppArmor adds the ability to specify exactly which profile will be transitioned to. This can be useful if multiple binaries should share a single profile, or if they should use a different profile than their name would specify. After clicking Name, you are asked if you want to transition to a local profile. If you select Yes, the profile is created within the current profile. If you select No, a new profile is created and you are prompted for its name. Named profile transitions use -> to indicate the name of the profile that should be transitioned to. With cx it points to a local profile; with px it points to a discrete profile. An example is shown below:

# pointer to discrete profile, /etc/apparmor.d/profile_pdf_viewer
/usr/bin/evince px -> pdf_viewer,

# pointer to local profile
/bin/basename cx -> basename,
...
profile /usr/bin/file {
  #include <abstractions/base>
  ...
}

Unconfined. Executes the program without a security profile. Do not run unconfined unless absolutely necessary.

ix fallback on/off. Used to toggle Profile, Child, and Name to Profile ix, Child ix, and Name ix.

Deny. The execution of the program will be denied.

In the case of file access, the dialog displayed offers different options:

Figure 4-13  File Access Configuration

The Add Profile Wizard suggests an access mode (r, w, l, k, or a combination thereof). If more than one item appears in the list of files, directories, or #includes, select the appropriate option; then select one of the following:

Allow. Grants the program access to the specified file or directory.

Deny. Prevents the program from accessing the specified file or directory.

Sometimes the suggested files or directories do not fit your needs. In this case, you can modify them:

Glob. Selecting Glob once replaces the filename with an asterisk, including all files in the directory. Selecting Glob twice replaces the file and the directory it resides in by **, including all directories and files at that level in the directory tree. Selecting Glob again goes up one level in the path.

Glob w/ext. Selecting Glob w/ext once replaces the filename with an *, but retains the file name extension. For example, text.txt becomes *.txt. Selecting Glob w/ext twice replaces the file and the directory it resides in with **, retaining the file name extension. For example, /a/b/c/text.txt becomes /a/b/**.txt.

Edit. Enables editing of the highlighted line. The new edited line appears at the bottom of the list.

Opts. Selecting Opts allows you to toggle Owner Permissions and Audit on and off. An ownership test of a file is done by checking whether the user associated with the confined process matches the owner of the file. Specifically, the fsuid (see man 2 setuid) of the process is compared to the file's uid. For example:

owner /** l,   # restrict hardlinks to user owned files

AppArmor provides the ability to audit rules so that when they are matched, an audit message is added to the audit log. To enable audit messages for a given rule, the audit keyword is prepended to the rule. For example:

audit /etc/shadow w,

If you need to audit only a given permission on a rule, the rule can be split into two rules.
For example, suppose audit messages are only desired for write accesses to /etc/shadow. The rule can be split as follows:

audit /etc/shadow w,
/etc/shadow r,

This results in audit messages being recorded when /etc/shadow is opened for writing, but not when it is opened just for reading. It is important to note that audit messages are not generated for every read or write access to a file, only when a file is actually opened for reading or writing. Audit control can be combined with owner conditional file rules to provide auditing when users access files they own, but at this time it cannot be used to audit only files they don't own. Example:

audit owner /home/*/.ssh/** rw,
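The Glob w/ext transformations described earlier are plain path rewrites. The following sketch mimics the wizard's first two Glob w/ext steps; the glob_ext_once and glob_ext_twice helper names are illustrative, not AppArmor tools:

```shell
# Hypothetical helpers mimicking the wizard's Glob w/ext steps
glob_ext_once() {            # /a/b/c/text.txt -> /a/b/c/*.txt
  printf '%s/*.%s\n' "$(dirname "$1")" "${1##*.}"
}
glob_ext_twice() {           # /a/b/c/text.txt -> /a/b/**.txt
  printf '%s/**.%s\n' "$(dirname "$(dirname "$1")")" "${1##*.}"
}
glob_ext_once  /a/b/c/text.txt   # -> /a/b/c/*.txt
glob_ext_twice /a/b/c/text.txt   # -> /a/b/**.txt
```

The resulting patterns match the examples given above for text.txt and /a/b/c/text.txt.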

After you have modified the line, select Allow or Deny. Go through each learning mode entry in this same way. Once all entries have been processed, you are returned to the AppArmor Profile Wizard dialog where you were asked to run the application. If necessary, you can run the application again and configure any additional entries generated. Before the changes are saved to disk, you can review them, as shown in the following example:

Figure 4-14  AppArmor Profile Changes

When you're satisfied, confirm the changes by selecting OK; then select Finish. The profile is written and activated.

You can also manually create a new profile. To do this, start YaST and select Novell AppArmor > Manually Add Profile. You will be prompted to select a file for which you want to create the profile. Once done, the AppArmor Profile dialog is displayed. In this screen, you can add, edit, or delete entries in the profile. An example is shown below:

Figure 4-15  AppArmor Profile Dialog

The advantage of using YaST to create your AppArmor profiles is its syntax checking. If you feel comfortable with the AppArmor syntax, you can also just use a text editor, such as vi, to create and edit your profiles.

Profiles containing hats can be created and maintained just like any other profile using YaST or the AppArmor command line tools. The command line utilities provide you with several additional options that are useful when dealing with hats. An example of using the genprof utility to manage hats in a profile is shown below:

Reading log entries from /var/log/audit/audit.log.
Updating AppArmor profiles in /etc/apparmor.d.

Profile:        /usr/sbin/httpd2-prefork
Default Hat:    DEFAULT_URI
Requested Hat:  /index.php

(A)dd Requested Hat / (U)se Default Hat / [(D)eny] / Abo(r)t / (F)inish

NOTE: For more information, see the change_hat man page.

If you need to update an existing profile, you can run the Add Profile Wizard a second time, or you can run the Update Profile Wizard in YaST. When you run the Add Profile Wizard on a program for which a profile already exists, the wizard uses the existing profile as a starting point to work from. You then go through the same process you used earlier to create a new profile.

If you want to update several profiles at once, or update profiles for applications that run over a long period of time, the Update Profile Wizard is usually a better choice than the Add Profile Wizard. However, you may need to complete several preparatory steps using command line tools. The first step is to decide which application profiles you want to update and to put AppArmor into complain mode (also called learning mode) for these applications.

NOTE: If no profile exists yet for an application you want to profile, you'll have to create one first using the autodep program.

The complain command is used to activate complain mode in AppArmor. You can use either the program or the profile as the argument. For instance, both of the following commands can be used to switch AppArmor into complain mode for Firefox:

complain firefox
complain /etc/apparmor.d/usr.lib.firefox.firefox.sh

If you want to enable complain mode for all applications confined by AppArmor, use the complain /etc/apparmor.d/* command. In profiles that are in complain mode, the path to the application being confined is followed by flags=(complain):

# Profile for /sbin/klogd
#include <tunables/global>

/sbin/klogd flags=(complain) {
...
}

Once done, you then need to use your application to create events in the log file. The next step is to start the Update Profile Wizard by starting YaST and selecting Novell AppArmor > Update Profile Wizard.
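Because complain mode is recorded as flags=(complain) in the profile text itself, you can list which profiles are currently in learning mode with a simple grep over the profile directory. A sketch against a throwaway stand-in directory (the /tmp path and profile contents are illustrative):

```shell
# Illustrative stand-in for /etc/apparmor.d
mkdir -p /tmp/apparmor.d
cat > /tmp/apparmor.d/sbin.klogd <<'EOF'
#include <tunables/global>
/sbin/klogd flags=(complain) {
}
EOF
cat > /tmp/apparmor.d/bin.ping <<'EOF'
#include <tunables/global>
/bin/ping {
}
EOF
# List the profile files still in complain (learning) mode
grep -l 'flags=(complain)' /tmp/apparmor.d/*
```

Against the real /etc/apparmor.d, the same grep shows which applications are still learning rather than being enforced.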
The interface is almost identical to the Add Profile Wizard interface, and the choices you are presented with do not differ. However, because you are updating several different profiles, you have to pay special attention to the profile named in the first line, to be sure that your decision to allow or deny fits the respective profile. Once the log file has been processed, select Finish. The profiles are reloaded, but AppArmor is still running in complain mode.

To have AppArmor enforce the rules again, use the enforce command at the shell prompt. This command uses the same syntax as complain. For example:

enforce /etc/apparmor.d/*

This command puts all profiles in enforce mode.

If, for some reason, you no longer want to confine an application with AppArmor, you need to delete its profile. To delete a profile, start YaST and select Novell AppArmor > Delete Profile. Select the profile to delete; then select Next. After you select Yes in the confirmation dialog, the profile is deleted and the application is no longer confined by AppArmor.

Administer AppArmor Profiles with Command Line Tools

In addition to YaST, there are several tools that you can use to create and maintain AppArmor profiles from the shell prompt. These include the following:

autodep
genprof
logprof
vim

autodep generates a profile skeleton for a program and loads it into the AppArmor module in complain mode. The syntax is autodep program1 program2 ...

genprof (Generate Profile) is used to create a profile for an application. Stop the application you want to create a profile for before running genprof. genprof runs autodep on the specified program if no profile exists for it yet, puts the new or already existing profile in complain mode, marks the log file, and then prompts the user to start the program to be profiled and to exercise its functionality. An example of using genprof is shown below:

da10:~ # genprof firefox
Please start the application to be profiled in
another window and exercise its functionality now.

Once completed, select the "Scan" button below in
order to scan the system logs for AppArmor events.

For each AppArmor event, you will be given the
opportunity to choose whether the access should be
allowed or denied.

Profiling: /usr/lib/firefox/firefox.sh

[(S)can system log for SubDomain events] / (F)inish
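The "marks the log file" step genprof performs simply remembers where the log stood when profiling began, so a later scan only considers events generated after that point. The same idea in miniature, on a sample file rather than the real audit log:

```shell
# Sample log file standing in for /var/log/audit/audit.log
log=/tmp/mark-demo.log
printf 'old event 1\nold event 2\n' > "$log"

# Remember where the log stood when profiling started (the "mark")
mark=$(wc -l < "$log")

# Events generated while the application is exercised
printf 'new event A\nnew event B\n' >> "$log"

# Scan only the entries written after the mark
tail -n +"$((mark + 1))" "$log"
```

Only the two "new event" lines are printed; everything before the mark is ignored, just as logprof ignores events from before genprof started.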

Once the user has completed these tasks and has pressed s in the terminal window where genprof is running, genprof calls logprof to run against the system log from the point where it was marked when genprof was started. In the case of access to a program, the output looks similar to the following:

Reading log entries from /var/log/audit/audit.log.
Updating AppArmor profiles in /etc/apparmor.d.

Profile:  /usr/lib/firefox/firefox.sh
Execute:  /bin/basename
Severity: unknown

(I)nherit / (P)rofile / (C)hild / (N)ame / (U)nconfined / (X)ix / (D)eny / Abo(r)t / (F)inish

In the case of access to a file or directory, the output looks similar to the following:

Profile:  /usr/lib/firefox/firefox.sh
Path:     /dev/tty
Mode:     rw
Severity:

   1 - #include <abstractions/consoles>
  [2 - /dev/tty]

[(A)llow] / (D)eny / (G)lob / Glob w/(e)xt / (N)ew / Abo(r)t / (F)inish / (O)pts

Type the appropriate number to switch lines as applicable; then type the letter in parentheses corresponding to what you want to do. The options offered here are the same as those offered within YaST.

NOTE: New in this interface corresponds, to a certain extent, to the Edit option in YaST.

Once all log entries have been processed, you are asked if you would like to save the changes, as shown below:

= Changed Local Profiles =

The following local profiles were changed. Would you like to save them?

[1 - /usr/lib/firefox/firefox.sh]

(S)ave Changes / [(V)iew Changes] / Abo(r)t

Writing updated profile for /usr/bin/evince.
Profiling: /usr/bin/evince

[(S)can system log for SubDomain events] / (F)inish

After typing s, you are returned to genprof, where you can start a new scan or finish the profile generation by typing f, as shown below:

Writing updated profile for /usr/lib/firefox/firefox.sh.
Profiling: /usr/lib/firefox/firefox.sh

[(S)can system log for SubDomain events] / (F)inish

Reloaded SubDomain profiles in enforce mode.
Finished generating profile for /usr/lib/firefox/firefox.sh.

When profiling an application that is ChangeHat-aware, options that are specific to hats are presented by genprof or logprof. An example is shown below:

Reading log entries from /var/log/audit/audit.log.
Updating AppArmor profiles in /etc/apparmor.d.

Profile:        /usr/sbin/httpd2-prefork
Default Hat:    DEFAULT_URI
Requested Hat:  /index.php

(A)dd Requested Hat / (U)se Default Hat / [(D)eny] / Abo(r)t / (F)inish

The Add Requested Hat option creates an entry in the profile that is specific to that URI. The Use Default Hat option adds rules within the DEFAULT_URI subprofile.

The logprof tool is used to scan the /var/log/audit/audit.log file for entries created by AppArmor for profiles in learning mode, and to interactively create new profile entries. The choices you have are the same as those described under genprof. If you want logprof to start scanning from a certain point in the log file, you can pass a string that describes that point. The following is an example of an entry in the log file:

type=apparmor_allowed msg=audit( :117186): operation="inode_permission" requested_mask="::r" denied_mask="::r" fsuid=1000 name="/usr/share/doc/manual/sles-admin_en-pdf/" pid=8108 parent=7055 profile="/usr/bin/evince"

Using the -m option with logprof (for example, logprof -m " :117186"), you can start the scan of the log file from that point, ignoring earlier entries.

To confine processes for which the profiles have been changed, you need to run enforce with the profile as its argument and restart the respective process, if it is running.

NOTE: The profiles can be manually edited using any text editor. We recommend using vim because AppArmor includes a syntax-highlighting description that enables vim to highlight syntax elements in profiles.

Control AppArmor

AppArmor can be controlled using the /etc/init.d/boot.apparmor script or the /sbin/rcapparmor link to this script. This init script uses the usual parameters start, stop, and so on. However, because AppArmor is not a daemon, their function is slightly different. To control AppArmor, you have to know how to

Start and Stop AppArmor on page 250
View AppArmor's Status on page 251
Reload Profiles on page 253

Start and Stop AppArmor

To confine an application, AppArmor has to be active before the application starts. Therefore, AppArmor is usually activated early in the boot process. The rcapparmor start command is used to activate AppArmor. However, only applications started after the activation of AppArmor can be confined. Even if, for instance, a profile for Squid exists, Squid will not be confined if it was already running before you started AppArmor. To include Squid in AppArmor's protection, you need to restart Squid after activating AppArmor. If you do not want AppArmor to confine your applications any longer, you can use the rcapparmor stop command.
NOTE: In SLES 10 you can unload the AppArmor kernel modules (apparmor and aamatch_pcre) using the rcapparmor kill command. In SLES 11, however, AppArmor is not loaded as a module; it is compiled into the kernel. Therefore, the rcapparmor kill command results in an error message on this platform.
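The audit record format shown earlier is a flat sequence of key="value" pairs, so the interesting fields can be pulled out with standard tools. A sketch run against a pasted sample line (the timestamp and values here are made up for illustration, not taken from a live log):

```shell
# Sample AppArmor audit record, following the key="value" format shown earlier
line='type=apparmor_allowed msg=audit(1239791130.005:117186): operation="inode_permission" requested_mask="::r" denied_mask="::r" fsuid=1000 name="/usr/share/doc/manual/sles-admin_en-pdf/" pid=8108 profile="/usr/bin/evince"'

# Extract selected fields with sed capture groups
profile=$(printf '%s\n' "$line" | sed 's/.*profile="\([^"]*\)".*/\1/')
name=$(printf '%s\n' "$line" | sed 's/.*name="\([^"]*\)".*/\1/')
echo "profile=$profile accessed $name"
```

The same extraction works line by line over a real /var/log/audit/audit.log when you only need a quick overview rather than a full logprof run.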

View AppArmor's Status

The rcapparmor status command provides you with a general overview of profiles and processes. An example is shown below:

da10:~ # rcapparmor status
apparmor module is loaded.
14 profiles are loaded.
14 profiles are in enforce mode.
   /usr/lib/firefox/firefox.sh//basename
   /usr/sbin/ntpd
   /usr/sbin/identd
   /usr/lib/firefox/firefox.sh///usr/bin/file
   /sbin/klogd
   /sbin/syslogd
   /sbin/syslog-ng
   /usr/sbin/traceroute
   /usr/sbin/nscd
   /usr/lib/firefox/firefox.sh
   /usr/bin/evince
   /usr/sbin/mdnsd
   /bin/ping
   /usr/sbin/avahi-daemon
0 profiles are in complain mode.
5 processes have profiles defined.
3 processes are in enforce mode :
   /sbin/klogd (8468)
   /sbin/syslog-ng (8465)
   /usr/sbin/nscd (8496)
2 processes are in complain mode.
   null-complain-profile (7337)
   null-complain-profile (7336)
0 processes are unconfined but have a profile defined.

If you restart AppArmor, you must restart all of your confined applications. In the following example, AppArmor was restarted, but the confined applications were not:

da10:~ # rcapparmor stop
Unloading AppArmor profiles    done
da10:~ # rcapparmor start
Loading AppArmor profiles      done
da10:~ # rcapparmor status
...
3 processes are unconfined but have a profile defined.
   /sbin/klogd (8468)
   /sbin/syslog-ng (8465)
   /usr/sbin/nscd (8496)

Restarting one of the processes for which there is a profile changes the output of rcapparmor status as follows:

da10:~ # rcnscd restart
Shutting down Name Service Cache Daemon    done
Starting Name Service Cache Daemon         done
da10:~ # rcapparmor status
...
1 processes are in enforce mode :
   /usr/sbin/nscd (8609)
...
2 processes are unconfined but have a profile defined.
   /sbin/klogd (8468)
   /sbin/syslog-ng (8465)

The output of AppArmor does not contain specific data regarding the profiles or the processes being confined. A list of the loaded profiles is kept in /sys/kernel/security/apparmor/profiles. It looks similar to the following:

da10:~ # cat /sys/kernel/security/apparmor/profiles
/usr/sbin/traceroute (enforce)
/usr/sbin/ntpd (enforce)
/usr/sbin/nscd (enforce)
/usr/sbin/mdnsd (enforce)
...

The unconfined command lists processes that have bound sockets but have no profiles loaded:

da10:~ # unconfined
3554 /sbin/rpcbind not confined
3554 /sbin/rpcbind not confined
3554 /sbin/rpcbind not confined
3554 /sbin/rpcbind not confined
3554 /sbin/rpcbind not confined
3554 /sbin/rpcbind not confined
4105 /usr/sbin/ietd not confined
4105 /usr/sbin/ietd not confined
4172 /usr/sbin/cupsd not confined
4172 /usr/sbin/cupsd not confined
4269 /usr/lib/postfix/master not confined
4346 /usr/sbin/sshd not confined
4346 /usr/sbin/sshd not confined

This does not give information about processes with profiles that are not confined because they were already running when AppArmor was activated. To spot those, run ps -Z and compare the output with the content of /sys/kernel/security/apparmor/profiles. Restart any processes that should be confined.
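The comparison just described, loaded profiles versus the security context of running processes, is a set difference between two lists, which comm computes directly. A sketch over sample data (the two file contents stand in for the kernel's profile list and the ps -Z output; the /tmp paths are illustrative):

```shell
# Programs that have a loaded profile (stand-in for the kernel's profile list)
printf '%s\n' /sbin/klogd /sbin/syslog-ng /usr/sbin/nscd | sort > /tmp/profiled

# Programs currently running confined (stand-in for ps -Z output)
printf '%s\n' /usr/sbin/nscd | sort > /tmp/confined

# Profiled but not confined: candidates for a restart
comm -23 /tmp/profiled /tmp/confined
```

Here comm -23 suppresses lines unique to the second file and lines common to both, leaving only the profiled-but-unconfined programs (/sbin/klogd and /sbin/syslog-ng in this sample).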

Reload Profiles

If you have changed profiles in /etc/apparmor.d/ manually with a text editor (not using AppArmor tools such as logprof), you need to reload the profiles concerned. The command to do this is rcapparmor reload. rcapparmor restart is equivalent to reload: it does not stop and then start AppArmor, but it does reload the profiles. Processes that were confined before rcapparmor reload was issued remain confined (unless you deleted their profile or changed their status from enforce to complain). The enforce and complain commands toggle the status from enforce to complain and vice versa, and reload the profiles concerned.

Monitor AppArmor

There are two ways to monitor AppArmor:

Security Event Report on page 253
Security Event Notification on page 256

Security Event Report

To configure and view AppArmor security event reports, start YaST and select Novell AppArmor > AppArmor Reports. The module can also be launched directly from a console window as root by entering yast2 SD_Report at the shell prompt. The dialog that opens shows when security event reports are generated. An example is shown below:

Figure 4-16  AppArmor Security Event Report

By default, these reports are created once a day at midnight. You can schedule new security incident reports and edit existing reports. For example, you can set the address that should receive the report. You can also delete event reports. Selecting a report and then selecting Run Now either shows the result directly or, in the case of the Security Incident Report, first opens a dialog where you can fine-tune the content of the resulting report. An example is shown below:

Figure 4-17  AppArmor Report Configuration

Select Help to view an explanation of the available options. Once you have configured what you want to have included in your report, select Next. The report is displayed, showing the security events, as shown below:

Figure 4-18  AppArmor On-Demand Report

Depending on your configuration in the previous dialog, the report is also saved to disk, by default in the /var/log/apparmor/reports-exported/ directory.

Security Event Notification

To configure security event notification, start YaST and select Novell AppArmor > AppArmor Control Panel. You can also start the module directly from a console window as root by entering yast2 subdomain at the shell prompt. The following dialog is displayed:

Figure 4-19  AppArmor Event Configuration

Select Configure in the Enable Security Event Notification field. A dialog is displayed where you can configure the frequency of the notifications, the addresses to notify, and the severity levels the reports should cover. This is shown below:

Figure 4-20  AppArmor: Security Event Notification

Select OK to save your configuration. Close the AppArmor configuration window by selecting Done.

Exercise 4-2  Protect Services with AppArmor

In this lab, you configure AppArmor to protect services. You can find this lab in the workbook.

(End of Exercise)

Objective 4  Implement an IDS

The last topic addressed in this objective is that of implementing an Intrusion Detection System (IDS). An IDS works in a manner similar to that used by the anti-virus software you are probably already familiar with. Anti-virus software works by looking for patterns in files known to indicate the presence of a virus. Similarly, an IDS operates by monitoring certain facets of your server system, looking for patterns that indicate suspicious activity is under way. The aspect of your system that is monitored depends upon which IDS package you use. Some of the more commonly used IDS packages include the following:

Advanced Intrusion Detection Environment (AIDE). AIDE is a file integrity checker. It checks for unauthorized alterations to files in your server's file system.

arpwatch. arpwatch watches ARP traffic to detect new hosts when they connect to the network.

Argus. Argus monitors connections between network hosts.

Snort. Snort monitors network traffic, looking for patterns in the transmissions that indicate suspicious activity is under way.

A review of each of these packages is beyond the scope of this objective. Instead, we will focus on using AIDE to check for altered files. The following topics will be addressed:

How AIDE Works on page 260
Configuring AIDE Rules on page 262
Using AIDE to Check for Altered Files on page 266
Configure AIDE on page 270

How AIDE Works

The AIDE IDS works by creating a database that contains information about the various files on your system, using rules you specify in the /etc/aide.conf configuration file. The database stores information about various file attributes, including

Permissions
Inode number
User
Group
File size
mtime
ctime
atime

Growing size
Number of links
Link name

Most importantly, AIDE also creates a cryptographic checksum of each file. While it is relatively easy for an intruder to modify a file and then hide his or her tracks by manipulating the file's modification date and size, it is very difficult to manipulate a file's cryptographic checksum. AIDE can use one (or more) of the following message digest algorithms to generate checksums for each monitored file:

sha1
sha256
sha512
md5
rmd160
tiger

We recommend that you implement AIDE on your system and create its database right after the system is first installed and before it has been connected to the network. By doing this, AIDE is able to create pristine checksums for its monitored files. This first AIDE database will function as a baseline or snapshot against which subsequent updates and changes will be evaluated.

While you could theoretically configure your AIDE database to monitor all files in your server's file system, we strongly recommend that you don't do this. It would generate a huge number of false-positive reports any time normal file-related activity occurred, such as a user saving changes to a word-processing file. Instead, the database should be constrained to key system files that should not change over time, including

Binary executable files
Libraries
Header files

You should not include files in the database that change frequently by design, including

Log files in /var/log
Mail spool files in /var/spool/mail
The proc file system /proc
Users' home directories in /home
Temporary directories such as /tmp

WARNING: Be aware that an intruder may actually place files in these directories specifically because they are commonly excluded by IDS systems. You need to determine the balance between

security and manageability for your organization when determining which files and directories to monitor with AIDE.

After creating your initial baseline, you can use AIDE to examine system files that are very likely to be altered. This is especially true if you suspect that an intruder has already tried to break into your system. Files that are commonly altered to carry a trojan, hide an intruder's tracks, and so on include the following:

ls
ps
netstat
who
Any other system utility that must be altered in order to hide a security breach

According to the AIDE documentation, the ls and ps binary files are common targets for intruders. The ps command can be altered to not display processes run by the attacker, such as a keystroke logger. The ls command can be altered to not show files created by the attacker or the attacker's processes, such as a log file created by the keystroke logger mentioned previously.

If you suspect that an attack has occurred and that the intruder may have successfully broken in, we recommend that you run AIDE immediately. When you do, the initial AIDE database is used as a baseline to check for altered files. Any modifications not permitted by the AIDE configuration file are reported.

WARNING: Be aware that AIDE is not foolproof. Remember, technology is not a substitute for vigilance.

Configuring AIDE Rules

AIDE is included on your SLES 11 installation media. Install it using the YaST Software Management module or the yast -i aide command. Once you have installed AIDE, you next need to configure rules in the AIDE configuration file to specify which files should be monitored. The /etc/aide.conf file contains sample configurations that you can use as a template for creating your own rules, as shown below:
...
#
# Configuration parameters
#
database=file:/var/lib/aide/aide.db
database_out=file:/var/lib/aide/aide.db.new
verbose=1
report_url=stdout
warn_dead_symlinks=yes

#
# Custom rules
#

Binlib    = p+i+n+u+g+s+b+m+c+md5+sha1
ConfFiles = p+i+n+u+g+s+b+m+c+md5+sha1
Logs      = p+i+n+u+g+s
Devices   = p+i+n+u+g+s+b+c+md5+sha1
Databases = p+n+u+g
StaticDir = p+i+n+u+g
ManPages  = p+i+n+u+g+s+b+m+c+md5+sha1

#
# Directories and files
#

# Kernel, system map, etc.
/boot      Binlib

# watch config files, but exclude what changes at boot time, ...
!/etc/mtab
!/etc/lvm*
/etc       ConfFiles

# Binaries
/bin       Binlib
/sbin      Binlib

# Libraries
/lib       Binlib

# Complete /usr and /opt
/usr       Binlib
/opt       Binlib

# Log files
/var/log$  StaticDir
#/var/log/aide/aide.log(.[0-9])?(.gz)?    Databases
#/var/log/aide/error.log(.[0-9])?(.gz)?   Databases
#/var/log/setuid.changes(.[0-9])?(.gz)?   Databases
/var/log   Logs

# Devices
!/dev/pts
/dev       Devices

# Other miscellaneous files
/var/run$  StaticDir
!/var/run
/var/lib   Databases

# Test only the directory when dealing with /proc
/proc$     StaticDir
!/proc

# manpages can be trojaned, especially depending on *roff implementation
#/usr/man        ManPages
#/usr/share/man  ManPages
#/usr/local/man  ManPages

There are three types of lines in aide.conf:

Configuration lines. Used to set configuration parameters and define variables.

Selection lines. Used to indicate which files will be added to the database.

Macro lines. Used to define variables within the config file.

The first section of the file is the Configuration Parameters section. It is used to configure general parameters, such as the location of the AIDE database file. You configure your AIDE rules using the Custom Rules and Directories and Files sections. Instead of defining a rule for each directory or file that you want AIDE to monitor, you can define a custom rule once in the Custom Rules section and give it a name. Then, when configuring monitoring for a specific directory, you can tell AIDE to use the custom rule defined previously. Consider the following rule defined under Custom Rules in the configuration file above:

Binlib = p+i+n+u+g+s+b+m+c+md5+sha1

The name of the rule is Binlib. The following file options are then defined for that rule:

Table 4-3  AIDE Rule Options

p     Check the file permissions of the selected files or directories.
i     Check the inode number. Every filename has a unique inode number that should not change.
n     Check the number of links pointing to the file.
u     Check to see if the owner (user) of the file has changed.
g     Check to see if the group assigned to the file has changed.
s     Check to see if the file size has changed.
S     Check for growing size.
b     Check to see if the block count used by the file has changed.
m     Check to see if the modification time of the file has changed.
c     Check to see if the inode change time (ctime) of the file has changed.
md5   Check to see if the md5 checksum of the file has changed.
sha1  Check to see if the sha1 (160-bit) checksum of the file has changed.
NOTE: For a complete list of the available checking options, see /usr/share/doc/packages/aide/manual.html.
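The value of the md5 and sha1 options is easy to demonstrate with sha1sum, which computes the same sha1 digest AIDE can store: record a baseline, alter the file, and the mismatch is reported even if size and timestamps are forged. A minimal sketch on a throwaway file (the /tmp paths are illustrative, and the plain sha1sum file is a stand-in for AIDE's own database format):

```shell
# Record a baseline digest, as an AIDE database would
echo 'original contents' > /tmp/aide-demo.bin
sha1sum /tmp/aide-demo.bin > /tmp/aide-demo.db

# Simulate an intruder altering the monitored file
echo 'trojaned contents!' > /tmp/aide-demo.bin

# Re-check against the baseline: the digest mismatch is reported
sha1sum -c /tmp/aide-demo.db 2>/dev/null || echo "integrity violation detected"
```

This is the initialize-then-compare cycle in miniature: the baseline must be captured while the file is known-good, exactly as the text recommends doing right after installation.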

Later in the file, you can specify that the files in the /boot directory be monitored using the Binlib rule. This is shown below:

# Kernel, system map, etc.
/boot      Binlib

Likewise, you can also define several other rules in the Custom Rules section, as shown below:

ConfFiles = p+i+n+u+g+s+b+m+c+md5+sha1
Logs      = p+i+n+u+g+s
Devices   = p+i+n+u+g+s+b+c+md5+sha1
Databases = p+n+u+g
StaticDir = p+i+n+u+g
ManPages  = p+i+n+u+g+s+b+m+c+md5+sha1

You can then assign these rules to specific files and directories in the file system just as you did above for the /boot directory. Some examples are listed below:

/lib       Binlib
/var/log   Logs
/dev       Devices
/var/lib   Databases

NOTE: You don't have to configure your rules in this way if you don't want to. You can just list the directory or file to be monitored, followed by the rule to be applied. For example:

/etc p+i+u+g

This rule checks permissions, inode, user, and group for the files in the /etc directory.

You may have noticed in the example presented earlier that regular expression matching is used in the AIDE configuration file. An example is shown below:

!/var/run
/var/lib   Databases

AIDE can use three different types of selection lines in the aide.conf configuration file. They begin with the following characters:

/   Regular selection lines
=   Equals selection lines
!   Negative selection lines

For example, the expression !/var/run in the configuration above tells AIDE to exclude the /var/run directory. Conversely, the expression /var/lib tells AIDE to include all files in the /var/lib directory. If you need to refer to a single file or directory in an expression, you should add a $ character to the end of the file or directory name. This matches the exact name only; it will not include any other files or directories whose names begin with the same characters. Consider the following example:

/proc$     StaticDir
!/proc

This configuration causes AIDE to include only the /proc directory itself. It excludes the files within /proc.

Once you are done configuring your /etc/aide.conf file, you can check your configuration and make sure your logic and syntax are correct using the aide --config-check command at the shell prompt. If your configuration is valid, the command produces no output. If an error is found, the command returns the line number and nature of the error, as shown below:

da1:/ # aide --config-check
35:syntax error:!
35:Error while reading configuration:!
Configuration error

As you can see above, the error is located in line 35 of /etc/aide.conf.

Using AIDE to Check for Altered Files

Once your AIDE configuration is complete and the syntax checked, you are ready to initialize the AIDE database. This is done using either the aide -i or the aide --init command.

NOTE: The process of initializing the AIDE database can take some time to complete.

This command creates a new database containing all of the files you specified in your AIDE configuration file, at the location specified by the database_out parameter. By default, this is /var/lib/aide/aide.db.new. Rename this file to /var/lib/aide/aide.db.

You can view the contents of the new database file using the cat, less, or more commands at the shell prompt. An example is shown below:

DA1:/ # less /var/lib/aide/aide.db.new
# This file was generated by Aide, version ...
# Time of generation was ...
name lname attr perm bcount uid gid size mtime ctime inode lcount md5 sha1
/dev  ... MTI2NDU0NTk1MQ== ...
/bin  ... MTI2MjExNzAzMg== MTI2MjExNzAzMg== ...
/lib  ... MTI2MjExNjg2Mw== MTI2MjExNjg2Mw== ...
/etc  ... MTI2NDU0NTk1MA== MTI2NDU0NTk1MA== ...
/usr  ... MTI2MzQwNDgwNg== MTI2MzQwNDgwNg== ...
/opt  ... MTI2MjExNjY3Nw== MTI2MjExNjY3Nw== ...
/sbin ... MTI2Mjk5MTAzMg== MTI2Mjk5MTAzMg== ...
/boot ... MTI2MjA5MjY4OA== MTI2MjA5MjY4OA== ...
/var/lib
/var/log
/var/lib/sfcb
/var/lib/postfix
/var/lib/aide
/var/lib/syslog-ng
/var/lib/susehelp
/var/lib/xdm
/var/lib/named
/var/lib/suseregister
(listing truncated; one entry appears for each monitored file and directory, with most numeric fields omitted here)

The newly created database file should be moved to secure, read-only media, such as a CD-R disc. You should also include the following AIDE program components on the read-only media:

  /etc/aide.conf
  /usr/bin/aide
  The AIDE man page files and documentation

You should not maintain your AIDE configuration file on the server's hard disk (read-write) file system. An intruder could potentially access the file and alter it, or identify an unmonitored directory where he or she could place malicious files.

WARNING: Only the configuration file on the read-only media should be used when running AIDE checks.

To run an AIDE check, mount your AIDE read-only media and run the aide executable from it using the following syntax:

aide --check

This check can be run daily, weekly, or monthly, depending upon your level of concern. You can increase the verbosity of the output using the -V option in the command. Sample output from an aide check operation is shown below:

AIDE found differences between database and filesystem!!
Start timestamp: ...:18:49

Summary:
  Total number of files: ...
  Added files: 1
  Removed files: 1
  Changed files: 3

Added files:

added: /var/lib/aide/aide.db

Removed files:

removed: /var/lib/aide/aide.db.new

Changed files:

changed: /etc/mime.types
changed: /dev/xconsole
changed: /dev/shm/pulse-shm

Detailed information about changes:

File: /etc/mime.types
Permissions: -rw-r--r-- , -rwxrwxrwx
Ctime : ...:56:58 , ...:17:57

File: /dev/xconsole
Ctime : ...:55:48 , ...:18:48

File: /dev/shm/pulse-shm
Bcount : 152 , ...

At this point, you need to evaluate the report and decide which items are normal and which items are of concern. In the example above, the /var/lib/aide/aide.db.new file was renamed to /var/lib/aide/aide.db. This is the result of the mv operation the administrator used to rename the file and isn't cause for concern.

WARNING: In a production environment, you should not store the /var/lib/aide/aide.db file in the local file system. This was done here for demonstration purposes only. Instead, it should be maintained on a secure read-only file system, such as a CD-R.

In addition, the /dev/xconsole file had its ctime attribute changed. Likewise, the /dev/shm/pulse-shm file had its bcount attribute changed. These are also the results of normal operations and aren't cause for concern.

However, also notice that the permissions assigned to the /etc/mime.types file were changed from 644 to 777, making the file writable and executable by anyone on the system. This is not normal and probably indicates that the file was modified by an attacker, perhaps turning it into some form of trojan.

If, at some point after the initial database has been created, you need to update your AIDE configuration file, you must update your database after the changes have been saved. This is done using the aide --update command at the shell prompt. The --update option performs the same operation as the --check option; however, it also creates a new database file. This database file should be placed on your read-only media along with the new configuration file.
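The detection logic described above boils down to recording a baseline of file checksums and later comparing the live file system against it. The following is a simplified stand-in for that cycle (this is NOT AIDE itself, just an illustration of the idea using sha1sum from coreutils against a scratch directory):

```shell
# Simplified illustration of AIDE's baseline-then-compare cycle.
# Uses only coreutils; the scratch directory stands in for the
# monitored file system.
set -e
dir=$(mktemp -d)
mkdir -p "$dir/etc"
printf 'alias ll="ls -l"\n' > "$dir/etc/profile.local"

# Step 1 (like "aide --init"): record a SHA-1 baseline of every file
(cd "$dir" && find etc -type f | xargs sha1sum > baseline.db)

# Step 2 (like "aide --check") with nothing changed: verifies OK
(cd "$dir" && sha1sum -c baseline.db)

# Step 3: a file is altered after the baseline was taken...
printf 'PATH=/tmp:$PATH\n' >> "$dir/etc/profile.local"

# Step 4: ...and the next comparison flags the mismatch
if (cd "$dir" && sha1sum -c baseline.db >/dev/null 2>&1); then
    result="unchanged"
else
    result="changed files detected"
fi
echo "$result"
rm -rf "$dir"
```

AIDE does the same comparison for many attributes at once (permissions, inode, mtime, and so on, per your rules), which is why the report above can distinguish a ctime change from a permissions change.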

Exercise 4-3  Configure AIDE

In this lab, you configure AIDE and use it to detect changes in the file system. You can find this lab in the workbook.

(End of Exercise)

Summary

Objective: Describe Server Hardening

The process of server hardening involves identifying possible avenues of attack into a computer system and then taking steps to eliminate them or reduce their vulnerability. Unlike a desktop system, the assets contained on a server system are usually far more valuable, making server systems a much more inviting target for intruders.

As such, server hardening should abide by the Principle of Least Privilege, the main concepts of which are listed below:

  Users should have only the degree of access to the server necessary for them to complete their work, and no more.
  The server should have only the services and software required for it to fulfill its function on the network, and no more.

In the last ten years, Linux has become much more widely deployed than in the past. In particular, Linux servers have become mainstays in the server room and are entrusted with providing mission-critical services as well as storing valuable data. As a result, more and more Linux exploits are being developed each year.

Objective: Harden a SLES 11 Server

A key administrative job is to minimize and manage the risk intruders pose to your Linux servers by hardening them. At a minimum, you should:

  Check file permissions
  Secure software and services
  Manage user access
  Close unnecessary ports

Permissions on system files need to be set correctly to ensure that an unauthorized user or process cannot make unauthorized changes to these files. You may need to set the system-wide permission level and check SUID root permissions.

The next issue you should be aware of is hardening the server by securing its software and services. You need to consider removing unnecessary software and services, as well as using chroot jails.

Then you need to manage how users access the system. You should consider doing the following:

  Remove or disable unnecessary user accounts.
  Use sudo to delegate administration privileges.
  Implement ACLs.
  Harden SSH access.

Next, you need to scan your server and identify open IP ports that need to be closed.

Objective: Harden Services with AppArmor

Novell AppArmor is a mandatory access control scheme that lets you specify which files a process may read, write, and execute, on a per-program basis. AppArmor secures applications by enforcing good application behavior without relying on attack signatures. This allows it to prevent attacks even if they exploit previously unknown vulnerabilities. AppArmor can be configured using the YaST AppArmor modules or the autodep, genprof, and logprof commands.

Objective: Implement an IDS

An IDS monitors certain facets of your server system, looking for patterns that indicate suspicious activity is under way. The AIDE IDS creates a database that contains information about the various files on your system, using rules you specify. AIDE also creates a cryptographic checksum of each file.

SECTION 5  Update Servers

In this section, you learn how to maintain a local update server for SLE systems using the Subscription Management Tool (SMT).

Objectives

1. Describe How SMT Works
2. Install and Configure an SMT Server
3. Configure SMT Client Systems
4. Stage Repositories

Objective 1  Describe How SMT Works

The management of server and desktop subscriptions and support entitlements in SLE 11 is done using the online Novell Customer Center (NCC) and the Subscription Management Tool (SMT). In this objective, you learn how the NCC and SMT work together to accomplish this. The following topics are addressed:

  Registering with the Novell Customer Center
  How SMT Works

Registering with the Novell Customer Center

The NCC is a Web-based portal that provides a single location for your organization to obtain updates and support for your SUSE Linux Enterprise-based products. You can manage your subscription information, including status, renewal, utilization, and forecasting. The NCC notifies you when new software updates become available, allowing you to ensure licensing compliance, increase productivity, and reduce systems management costs.

To be able to receive updates from Novell, you must first register with the NCC. You can configure the NCC during installation, or you can configure it later using YaST. To configure the NCC after installation, complete the following:

1. Start YaST and select Other > Novell Customer Center Configuration. The following dialog appears:

Figure 5-1  Novell Customer Center Configuration

2. Configure the following parameters:
   a. Select Configure Now.
   b. To simplify the registration process, include information about your system by selecting Optional Information and Hardware Profile.

      NOTE: No information is passed to anyone outside Novell. The transmitted data is used for statistical purposes and to enhance driver support. You can view the transmitted information in the ~/.suse_register.log file.

   c. Select Registration Code. This causes you to be prompted for your product code. It also registers you for the installation support included with your product.
   d. Select Regularly Synchronize with the Customer Center. This option verifies that your update sources are still valid and adds any new sources that may be available. It also sends any modifications, such as hardware changes, to Novell.
3. Select Next. You are prompted that the Novell Web site will be contacted.
4. When prompted that a Web browser will be launched, select Continue to access the registration dialog, shown below:

Figure 5-2  Registering your Activation Codes for SLES 11

5. Enter your e-mail address in the fields provided.

6. Enter your activation code in the field provided. This grants you access to online updates during your subscription period. Registering an instance of SLES, SLED, or OES 2 requires a valid activation code. Evaluation codes can be obtained from the SLES/SLED/OES product pages. If a system is registered without a code, a provisional code is assigned by the NCC and no update repositories are assigned.
7. Select Submit to submit the registration.
8. In the Novell Customer Center System Registration page, select Continue. After a few minutes, you should be prompted that the configuration was successful, as shown below:

Figure 5-3  Successful Registration with the NCC

9. Select OK.

At this point, if you access the Software Repositories module in YaST, you should see that the SLES11-Updates repository has been added, as shown below:

Figure 5-4  Viewing the Online Update Software Repository

Once the installation is complete, you can visit the NCC to administer your Novell products and subscriptions.

NOTE: A Novell login is required. If you do not have a Novell account yet, a link is provided on the login page that allows you to create one.

An example is shown below:

Figure 5-5  Managing Subscriptions in the NCC

How SMT Works

SMT is a package proxy system that is integrated with the NCC. It provides key NCC capabilities locally at your site. SMT provides a repository and registration target that is synchronized with the NCC, thus maintaining all the capabilities of the NCC while allowing a more secure, centralized deployment.

SMT allows you to provision updates for SLE devices (server or desktop). By downloading these updates only once to the update server and then distributing them throughout the enterprise, you can set more restrictive firewall policies and avoid the bandwidth issues stemming from repeated downloads of the same updates by each device.

Using SMT, you can easily determine how many entitlements are pending renewal at the end of your billing cycle without having to physically walk through the data center. The same entitlement information is also available through your NCC account, completely streamlining the process.

SMT provides an infrastructure that can support several hundred SLE devices per SMT server (depending on the specific utilization profile), ensuring accurate and efficient tracking. SMT also supports fully disconnected operation for highly secure sites, such as installations in military, government, or other highly classified deployments.

Using the Subscription Management Tool improves the interaction among SLE devices within your network and simplifies how they receive their system updates. The SMT server informs your SLE devices of any available software updates. Each device then obtains the required software updates from the SMT server instead of downloading them individually from nu.novell.com.
SMT can be utilized with any SLE software deployment from Novell that is updated through nu.novell.com, beginning with the SLE 10 SP2 family of products and including the SLE 11 releases. The Subscription Management Tool for SLE provides:

  Assurance of firewall and regulatory compliance
  Reduced bandwidth usage during software updates
  Full support under an active subscription from Novell
  Maintenance of the existing customer interface with the NCC
  Accurate server entitlement tracking and effective measurement of subscription usage
  An automated process to easily tally entitlement totals
  A simple installation process that automatically synchronizes server entitlements with the NCC

SMT is available as a download to customers with an active SUSE Linux Enterprise Server (SLES) subscription and is fully supported. It is packaged as an add-on product that can be installed on a SLES 10 SP2 or later server. Any SLE 11 product updated from nu.novell.com can be locally supported via an SMT server.

The Subscription Management Tool for SLES 11 includes several new systems management capabilities, including the following:

  You can stage patches in an internal managed area under the full control of the system administrator. This gives you the option of carrying out integration testing before fully enabling new patches on the SLE systems in your organization.
  You can centrally push packages to managed devices.
  It provides improved setup and operation of fully disconnected configurations.
  It supports System z as a server hosting architecture, in addition to x86 and x86_64.
  It provides full integration with the new supportability infrastructure delivered with SLE 11, including Novell Support Link and the Novell Support Advisor from Novell Technical Services (NTS). This feature helps facilitate problem reporting and troubleshooting.

The SMT utility can be obtained from download.novell.com.

Objective 2  Install and Configure an SMT Server

SMT is distributed as an add-on product for SLES 11. You can choose to install SMT as an add-on during the initial installation process, or you can install SMT as an add-on to an already installed base system.

To install and configure the SMT server, you need to do the following:

  Generate Your Mirror Credentials
  Install the SMT Server
  Set the SMT Job Schedule
  Manage Software Repositories

Generate Your Mirror Credentials

Before creating local mirrors of the update repositories, you need to have the proper mirror credentials. You can get these credentials from the NCC by doing the following:

1. Open a Web browser and access the NCC.
2. Log in to the NCC.
3. Select My Products > Products. A list of product families is shown:

Figure 5-6  Viewing Product Registrations in the Novell Customer Center

4. Expand any product family by selecting its name. You can also expand all product families by selecting the Expand All Product Families icon. Products in the expanded families are shown.
5. Double-click a specific product in the list to show detailed information about the product. A screen similar to the following is displayed:

Figure 5-7  Product Subscription Information

6. In the Downloads section, select the Mirror Credentials link.
7. If prompted, select Generate. The credentials and mirror channels will be listed. These values are the same for all users and subscriptions of a specific company. A sample is shown below:

Figure 5-8  Generating Mirror Credentials

The credentials listed can be set in the YaST SMT module or manually entered in the /etc/smt.conf file.

Install the SMT Server

Once you have generated your mirror credentials, you need to install the SMT service on the server you have designated to fill this role. To install SMT, do the following:

1. Open a Web browser and access download.novell.com.
2. In the Product or Technology drop-down list, select Subscription Management Tool; then select Search. You should see a screen similar to the following:

Figure 5-9  Downloading the Subscription Management Tool ISOs

3. Select the SMT download for SLES 11.
4. Log in using your Novell user account and then download the SMT ISO files.
5. Copy the SMT ISO files to the SLES 11 server that will function as your SMT server.
6. Install the MySQL database service on the SMT server by doing the following:
   a. Open a terminal session on your SMT server and switch to root using the su - command.
   b. Check to see if MySQL has already been installed by entering rpm -q mysql at the shell prompt.
   c. If the MySQL database service has not been installed, start YaST and install the following packages:
      mysql
      mysql-client
      perl-DBD-mysql
   d. After installing the MySQL packages, enter rcmysql start at the shell prompt (as root) to start the database service on the server. You should see output similar to the following:

DA1:~ # rcmysql start
Creating MySQL privilege database...
Installing MySQL system tables...
OK
Filling help tables...
OK

PLEASE REMEMBER TO SET A PASSWORD FOR THE MySQL root USER!
To do so, start the server, then issue the following commands:
/usr/bin/mysqladmin -u root password 'new-password'
/usr/bin/mysqladmin -u root -h DA1.digitalairlines.com password 'new-password'

Alternatively you can run:
/usr/bin/mysql_secure_installation

which will also give you the option of removing the test
databases and anonymous user created by default. This is
strongly recommended for production servers.

See the manual for more instructions.

You can start the MySQL daemon with:
cd /usr ; /usr/bin/mysqld_safe &

You can test the MySQL daemon with mysql-test-run.pl
cd mysql-test ; perl mysql-test-run.pl

Please report any problems with the /usr/bin/mysqlbug script!

The latest information about MySQL is available on the web at http://www.mysql.com/

   e. Set the MySQL root user's password by entering mysqladmin -u root password new_password at the shell prompt.

      NOTE: The MySQL root user is not the same as the Linux system root user!

   f. Verify that MySQL is running by entering rcmysql status at the shell prompt. You should see output similar to the following:

DA1:~ # rcmysql status
Checking for service MySQL: running

7. Start YaST. If MySQL is not running, enter rcmysql start at the shell prompt to start it.
8. Select Software > Add-On Products.

9. Select Add.
10. Select Local ISO Image; then select Next.
11. Browse to and select the SLE-11-SMT-GM-Media1.iso file you copied earlier to the server; then select Next.
12. In the License Agreement screen, accept the license agreement; then select Next.
13. If necessary, select Patterns from the Filter drop-down list in the Software Installation screen.
14. Verify that Subscription Management Tool is marked, as shown below:

Figure 5-10  Installing the SMT Pattern on SLES 11

15. Select Accept.
16. In the Automatic Changes screen, select Continue. Wait while the packages are installed. When complete, the SMT Configuration Wizard is displayed, as shown below:

Figure 5-11  SMT Configuration Wizard

17. Configure the following:
   a. Verify that Enable Subscription Management Tool Service (SMT) is selected. When activating SMT, the following important operations are performed by YaST:

      The Apache configuration is changed by creating symbolic links in the /etc/apache2/conf.d/ directory. Links to the /etc/smt.d/nu_server.conf and /etc/smt.d/smt_mod_perl.conf files are created there.
      The Apache Web server is started (or reloaded if already running).
      The MySQL server is started (or reloaded if already running).
      If they do not exist, the smt MySQL user account and all the tables in the database necessary to support the SMT service are created.
      The schema of the SMT database is checked. If the database schema is obsolete, the SMT database is upgraded to conform to the current schema.
      The cron daemon is configured to run SMT jobs by creating a symbolic link to the /etc/smt.d/novell.com-smt file in the /etc/cron.d/ directory.

      If you later deactivate SMT, the following operations are performed by YaST:

      Symbolic links created upon SMT activation in the /etc/apache2/conf.d/ and /etc/cron.d/ directories are deleted.
      The cron daemon, Apache Web server, and MySQL server are reloaded. Neither Apache nor MySQL is stopped, because they are probably required for server functions other than SMT.

   b. If the firewall is enabled, select Open Port in Firewall to allow access to the SMT service from remote computers.
   c. Enter your NCC mirroring credentials in the NU User and NU Password fields.
   d. Verify the credentials by selecting Test. SMT will connect to the Customer Center server using the credentials provided and download test data. If successful, you should see a screen similar to the following:

Figure 5-12  Testing Mirror Credentials

   e. Select OK.
   f. Enter the e-mail address you used for NCC registration in the NCC E-mail Used for Registration field.
   g. The SMT Server URL field should be automatically populated with the URL of your SLES 11 server.
   h. Select Next. The following screen is displayed:

Figure 5-13  Configuring SMT Database Parameters

18. In the second configuration screen, configure the following:
   a. In the Database Password for smt User fields, enter a password of your choosing.

      NOTE: The password fields should not be left empty. The SMT service will create its own user account (named smt) in the MySQL database and assign it the password you specify here.

   b. Enter all e-mail addresses SMT should send reports to by selecting Add and then entering the appropriate address. The comma-separated list of addresses for SMT reports is added to the reportEmail option in the /etc/smt.conf configuration file.
   c. Select Next. You are prompted to supply the MySQL root user's password, as shown below:

Figure 5-14  Providing SMT with the MySQL root User's Password

   d. When prompted, enter your MySQL database root user's password; then select OK. If the current MySQL root password is null, you will be asked to enter a new MySQL root password.
   e. After a few minutes, the Novell Customer Center Configuration screen is displayed, as shown below:

Figure 5-15  Registering with the NCC After Installing SMT

   f. Make sure Configure Now is marked; then select Next.

      Wait while the SMT server is registered.

   g. When prompted that the NCC configuration was successful, select Details. You should see that the SLE11-SMT-Updates catalog has been enabled, as shown below:

Figure 5-16  Enabling the SMT Updates Catalog

   h. Select OK > OK. When the installation is complete, you should see that the Subscription Management Tool has been installed as an add-on product, as shown below:

Figure 5-17  SMT Added as an Add-On Product

NOTE: To reconfigure the SMT service after the installation is complete, you can use the SMT Server Configuration and SMT Server Management modules in YaST.

   i. Select OK.
   j. Close YaST.

After installation, a series of directories is created in the /srv/www/htdocs directory (by default) on the SMT server. These directories and their functions are as follows:

  repo. Contains all files shared by SMT, such as updates, repo signing keys, and so on. It also contains several subdirectories that contain other important SMT files:
    repo/$rce. Contains published updates.
    repo/full/. Contains all mirrored updates when staging is enabled.
    repo/testing/. Provides a staging area for updates before they are published.
    repo/keys/. Contains the repository signing keys.
    repo/tools/. Contains tools used by SMT clients.

The following files are used to configure and manage the SMT service on the server:

  /etc/smt.conf. This is the main SMT configuration file.
  /etc/smt.d/. This directory contains other SMT configuration files.
  /etc/apache2/conf.d/. Contains Apache Web server configuration files that define SMT perl:
    smt_mod_perl.conf
    smt_support.conf
    nu_server.conf
  It also contains libraries and aliases, as well as symbolic links to the files in /etc/smt.d/.
  /etc/cron.d/novell.com-smt. Defines SMT cron jobs.

Set the SMT Job Schedule

The next task you need to complete is to configure the SMT job schedule. The YaST SMT Server Configuration module allows you to schedule periodic SMT jobs. YaST uses cron to schedule the configured jobs. Five types of jobs can be set:

  Synchronization of Updates. Synchronizes with the NCC, updates repositories, and downloads new updates.
  Report Generation. Generates SMT reports and sends them to the e-mail addresses defined during the configuration of SMT.
  NCC Registration. Registers with the NCC all clients that are not already registered or that have changed their data since the last registration.
  Job Queue Cleanup. Cleans up queued jobs.
  Upload Support Configs. Uploads support configs (available only when the SMT-Support package is installed).
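As noted above, these scheduled jobs end up as cron entries in /etc/smt.d/novell.com-smt, linked from /etc/cron.d/. The fragment below is only a sketch of what such a file might contain; the real file is generated by YaST, and the script paths and times shown here are illustrative assumptions, not the actual generated contents:

```
# /etc/cron.d/novell.com-smt -- sketch only; generated by YaST in practice.
# Field order: minute hour day-of-month month day-of-week user command
#
# Nightly synchronization with the NCC (illustrative path and time)
15 1 * * *   root   /usr/sbin/smt-ncc-sync
# Mirror the enabled repositories after the synchronization completes
45 2 * * *   root   /usr/sbin/smt-mirror
```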

To configure the SMT server job schedule, do the following:

1. Start YaST.
2. Select Network Services > SMT Server Configuration > Schedule SMT Jobs. The following is displayed:

Figure 5-18  Scheduled SMT Jobs

The table contains a list of all scheduled jobs along with their type, frequency, date, and time to run. You can add, delete, or edit these scheduled events.

3. If you want to add a scheduled SMT job, select Add. The Adding New SMT Scheduled Job dialog is displayed:

Figure 5-19  Adding a New SMT Scheduled Job

4. Choose the synchronization job to schedule. You can select one of the following jobs:
   Synchronization of Updates
   Generation of Reports
   NCC Registration
   Job Queue Cleanup
   Uploading Support Configs
5. Specify the frequency of the newly scheduled SMT job. Jobs can be performed Daily, Weekly, Monthly, or Periodically (every n-th hour or every n-th minute).
6. Set the job start time by entering the appropriate start day of the month, day of the week, hour, and/or minute. For periodic jobs, enter the respective periods. For weekly and monthly jobs, specify the day of the week or day of the month.
7. Select Add.
8. If you want to change the frequency, time, or date of a scheduled SMT job, select the job in the table and select Edit. Then change any parameters as if you were creating a new schedule and select OK.

294 Novell Training Services (en) 15 April 2009 SUSE Linux Enterprise Server 11: Certified Linux Engineer 11 / Manual 9. If you want to cancel a scheduled job and delete it from the table, select the job in the table and select Delete. 10. Select Finish to apply the settings and exit the SMT Server Configuration module. Manage Software Repositories Next, you need to configure your mirrored software repositories. To do this, you need to be familiar with the following: The tools and procedures for viewing information about software repositories available through SMT The procedure for configuring the mirrored repositories The procedure for setting up custom repositories The local SMT database needs to be updated periodically with any information downloaded from the NCC. These periodic updates can be configured using the YaST SMT Server Configuration module, as described in the previous topic. Once done, you can manage your software repositories in the following ways: Manage Repositories from the Shell Prompt on page 294 Managing Repositories in YaST on page 297 Manage Repositories from the Shell Prompt To update the SMT database manually, you can use the smt-ncc-sync command at the shell prompt (as root). The smt-ncc-sync command gets data from the NCC and updates the local SMT database. An example is shown below: DA1:~ # smt-ncc-sync Downloading Product information Downloading Target information Downloading repository information Downloading Product/Repository relations Downloading Subscription information Downloading Registration information Flagged repositories which can be mirrored DA1:~ # This utility can also be used to save NCC data to a directory in the file system instead of the SMT database. Accordingly, it can also read NCC data from a directory in the file system instead of downloading it from the NCC itself. The smt-ncc-sync command can be run with the following options: --fromdir directory. Reads NCC data from a directory instead of downloading it. 
--todir directory. Writes NCC data to the specified directory without updating the SMT database.

-L file or --logfile file. Specifies the path to a log file.

--createdbreplacementfile. Creates a database replacement file for using smt-mirror without a database.

The database installed with SMT contains information about all software repositories available on the NCC. However, the credentials you use to authenticate to the NCC determine which repositories can actually be mirrored. The mirrorability of repositories is determined by fetching data from the NCC using your mirror credentials. Repositories that can be mirrored have the MIRRORABLE flag set in the respective table in the SMT database.

The fact that a repository can be mirrored does not mean that it has to be mirrored. Only repositories with the DOMIRROR flag set in the SMT database will actually be mirrored.

You can use the smt-repos command at the shell prompt (as root) to list available software repositories and additional information. Using this command without any options lists all available repositories, including repositories that cannot be mirrored. In the first column, the enabled repositories (repositories that are set to be mirrored) are marked with Yes. Disabled repositories are marked with No. The other columns show the ID, type, name, target, and description of the listed repositories. The last column shows whether the repositories can be mirrored.

You can use the --verbose option with this command to get additional information, such as the source URL of the repository and the path it will be mirrored to.

The repository listing can be limited to only repositories that can be mirrored or to enabled repositories. To list only repositories that can be mirrored, use the -m or --only-mirrorable option. To list only enabled repositories, use the -o or --only-enabled option. An example is shown below:

Figure 5-20 Using the smt-repos Command
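The --todir and --fromdir options are particularly useful for SMT servers without direct Internet access: export the NCC data on a connected machine, transfer the directory, then import it. The sequence below is a dry-run sketch (the run helper only prints each command instead of executing it; the directory and log paths are hypothetical examples):

```shell
# Dry-run helper: prints each command instead of executing it.
# On a real SMT server, drop the helper and run the commands directly (as root).
run() { echo "+ $*"; }

# On a machine with Internet access: save NCC data to a directory
# instead of updating the SMT database.
run smt-ncc-sync --todir /tmp/ncc-data

# After transferring /tmp/ncc-data to the disconnected SMT server:
# import the data into the local SMT database without contacting the NCC.
run smt-ncc-sync --fromdir /tmp/ncc-data

# Optionally, record the synchronization in a log file.
run smt-ncc-sync --logfile /var/log/smt-ncc-sync.log
```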

You can also list only repositories with a particular name or show information about a repository with a particular name and target. To list repositories with a particular name, use the smt-repos repository_name command. To show information about a repository with a particular name and target, use the smt-repos repository_name target command.

Only enabled repositories can be mirrored. In the database, the enabled repositories have the DOMIRROR flag set. Repositories can be enabled or disabled using the smt-repos command.

To enable one or more repositories, do the following:

1. If you want to enable all repositories that can be mirrored or just choose one repository from the list of all repositories, run the smt-repos -e command. To limit the list to only repositories that can be mirrored, use the -m option. To limit the list to only repositories with a particular name, use the smt-repos -e repository_name command. To list only a repository with a particular name and target, use the smt-repos -e repository_name target command.

If you want to enable all repositories belonging to a certain product, use the --enable-by-prod or -p option followed by the name of the product and (optionally) its version, architecture, and release number. For example, to enable all catalogs belonging to SLES 10 SP2 for the PowerPC architecture, you would use the smt-repos -p SUSE-Linux-Enterprise-Server-SP2,10,ppc command.

NOTE: You can display a list of known products using the smt-list-products command.

2. If more than one catalog is listed, specify the one you want to enable by entering its ID (as listed in the repository table). If you want to enable all the listed catalogs, enter a.

To disable one or more repositories, do the following:

1. If you want to disable all enabled repositories or just choose one repository from the list of all repositories, run the smt-repos -d command. If you want to choose the repository to be disabled from a shorter list (or if you want to disable all catalogs from a limited group), you can use any of the options discussed previously to limit the list of repositories. To limit the list to only enabled repositories, use the -o option. To limit the list to only repositories with a particular name, use the smt-repos -d repository_name command. To list only a repository with a particular name and target, use the smt-repos -d repository_name target command.

2. If more than one repository is listed, choose which one you want to disable by entering its ID. If you want to disable all the listed repositories, enter a.
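Taken together, a typical enable/disable session might look like the following dry run (the repository name and target are hypothetical examples; the run helper only prints each command instead of executing it):

```shell
# Dry-run helper: prints each command instead of executing it.
run() { echo "+ $*"; }

# List only the repositories that can be mirrored:
run smt-repos -m

# Enable mirroring (sets the DOMIRROR flag) for one named repository:
run smt-repos -e SLES11-Updates sle-11-i586

# Enable every repository belonging to a product (SLES 10 SP2 on PowerPC):
run smt-repos -p SUSE-Linux-Enterprise-Server-SP2,10,ppc

# Later, disable the same repository again:
run smt-repos -d SLES11-Updates sle-11-i586
```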

You can also mirror repositories that are not available from the NCC (custom repositories) using the smt-setup-custom-repos command at the shell prompt.

NOTE: Custom repositories can also be deleted.

To set up a custom repository to be available through SMT, do the following:

1. If you don't know the ID of the product the new repository should belong to, use the smt-list-products command to get the ID.

2. Run the smt-setup-custom-repos --productid product_id --name repository_name --exturl repository_url command at the shell prompt. The product_id is the ID of the product the repository belongs to. The repository_name is the name of the repository. The repository_url is the URL the repository is available from.

If the repository should be available for more than one product, specify the IDs of all products that should use the repository. For example, to make My Repo available to products with IDs 423, 424, and 425, you would use the following command:

smt-setup-custom-repos --productid 423 --productid 424 --productid 425 --name 'My_Repo' --exturl 'http://example.com/My_Repo'

In its default configuration, SLE does not allow the use of unsigned repositories.

To remove a configured custom repository from the SMT database, enter smt-setup-custom-repos --delete ID at the shell prompt, where ID is the ID of the repository to be removed.

Managing Repositories in YaST

You can also use the SMT Server Management module in YaST to manage your repositories. To start the module, start YaST and select Network Services > SMT Server Management.

NOTE: You can also enter yast2 smt at the command line (as root) to start the SMT Server Management module.

The Repositories tab is displayed by default, as shown below:
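The custom-repository workflow above can be sketched as the following dry run (the product ID, repository name, and URL are hypothetical examples; the run helper only prints each command instead of executing it):

```shell
# Dry-run helper: prints each command instead of executing it.
run() { echo "+ $*"; }

# 1. Find the ID of the product the custom repository should belong to:
run smt-list-products

# 2. Register the custom repository for that product ID:
run smt-setup-custom-repos --productid 431 --name 'My_Repo' --exturl 'http://example.com/My_Repo'

# Remove the custom repository again by its ID when no longer needed:
run smt-setup-custom-repos --delete REPOSITORY_ID
```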

Figure 5-21 Using SMT Management to View a List of Repositories

A list of all available package repositories for SMT is displayed on the Repositories tab. The list displays each repository's name, target product, architecture, mirroring flag, staging flag, date of last mirroring, and a short description. You can sort the information in the list by clicking on the appropriate column heading.

You can also filter the repositories displayed using the Filter drop-down list in the upper left of the screen. The filter's list items are collected and assembled dynamically using the first word in the repository names. For example, if you wanted to view only SLE 11 repositories, you could select SLE 11 from the Filter drop-down list, as shown below:

Figure 5-22 Filtering the List of Displayed Repositories

Before you can offer package repositories to client systems, you need to create a local mirror of the appropriate packages. To do this, complete the following:

1. From the list displayed on the Repositories tab, select the repository you want to mirror.
2. Select Toggle Mirroring. A check mark will be displayed in the Mirroring column of the selected repository.
3. Select Mirror Now. The repository is mirrored immediately. A pop-up window appears with information about mirroring status and results. An example is shown below:

Figure 5-23 Mirroring a Repository

4. When the mirroring process is complete, select OK. The list of repositories will be refreshed. You should now see that the repository you flagged for mirroring is listed as mirrored, showing the date and time it was mirrored, as shown below:

Figure 5-24 A Mirrored Repository

5. When complete, select OK to close the SMT Server Management module.

Objective 3 Configure SMT Client Systems

Any machine running SLE 10 SP2 or later can be configured to register with your SMT server and download software updates from it instead of communicating directly with the NCC servers. To do this, you need to equip your SMT client systems with the SMT server's URL.

Because the client and server communicate using the HTTPS protocol during registration, you need to make sure the client trusts the server's certificate. If you set up your SMT server to use the default server certificate, the CA certificate is available for download from the SMT server, and the registration process will download it from there automatically, unless configured otherwise. On the other hand, if the SMT server certificate was issued by an external CA, you must manually enter the path to the server certificate.

In this objective, you learn how to do this. The following topics are addressed:

Configure SMT Client Systems on page 301
Manage SMT Client Systems on page 306

Configure SMT Client Systems

There are several ways to configure the client machine to use SMT:

Configure Clients Using Kernel Parameters at Boot on page 301
Configuring Clients Using an AutoYaST Profile on page 302
Configure Clients Using the clientsetup4smt.sh Script on page 303
Configure SMT Clients in YaST on page 304

Configure Clients Using Kernel Parameters at Boot

Any client can be configured to use SMT by providing the regurl and regcert kernel parameters as boot options at system startup. The first parameter is mandatory; the second is optional:

regurl. The URL of the SMT server's registration service (ending in center/regsvc/). The domain name you specify must be identical to the FQDN of the server certificate used on the SMT server. The following is an example of using regurl: regurl=https://da1.digitalairlines.com/center/regsvc/

regcert. The location of the SMT server's CA certificate. Specify one of the following:

URL. Specifies a remote server (http, https, or ftp) from which the certificate can be downloaded. For example: regcert=http://da1.digitalairlines.com/smt.crt

floppy. Specifies that the certificate is to be loaded from a floppy diskette. The floppy has to be inserted at boot time. You will not be prompted to insert it if it is missing. The value has to start with the string floppy, followed by the path to the certificate on the diskette. For example: regcert=floppy/smt/smt-ca.crt

Local Path. Specifies an absolute path to the certificate in the file system of the local machine. For example: regcert=/data/inst/smt/smt-ca.cert

Interactive. Opens a pop-up menu during installation that allows you to manually specify the path to the certificate. For example: regcert=ask

Deactivate Certificate Installation. Used if the certificate will be installed by an add-on product. For example: regcert=done

Make sure the values you enter are correct. If regurl has not been specified correctly, the registration of the update source will fail. If the wrong value for regcert has been entered, you will be prompted for the local path to the certificate. If regcert is not specified, it defaults to http://domain_name/smt.crt, with domain_name being the name of the SMT server.

If the SMT server gets a new certificate from a new and untrusted CA, the clients will need to fetch the new CA certificate file. This is done automatically during the registration process, but only if a URL was used at installation time to fetch the certificate or if the regcert parameter was omitted and the default URL was used. If the certificate was loaded using any other method, such as floppy or local path, the CA certificate will not be updated.

Configuring Clients Using an AutoYaST Profile

Clients can also be configured to register with the SMT server using an AutoYaST profile.

NOTE: A full discussion of AutoYaST is beyond the scope of this section. Only SMT-specific configuration parameters are described here.
To configure SMT-specific data using AutoYaST, do the following:

1. As root, start YaST and select Miscellaneous > Autoinstallation to start the graphical AutoYaST front-end.

2. Open an existing profile by selecting File > Open, create a profile based on the current system's configuration by selecting Tools > Create Reference Profile, or just work with an empty profile.
3. Select Software > Novell Customer Center Configuration. An overview of the current configuration is shown.
4. Select Configure.
5. Set the URL of the SMT Server and, optionally, the location of the SMT Certificate. The possible values are the same as for the regurl and regcert kernel parameters.

NOTE: The ask value for regcert does not work with AutoYaST because it requires user interaction.

6. Set all other configuration parameters needed for the systems to be deployed.
7. Select File > Save As and enter a filename, such as autoinst.xml, for the profile.

When you're done, you should have an AutoYaST profile that contains the <suse_register> element, which is used to configure SMT settings. An example is shown below (the URLs are placeholders for your SMT server):

<suse_register>
  <do_registration config:type="boolean">true</do_registration>
  <reg_server>https://da1.digitalairlines.com/center/regsvc</reg_server>
  <reg_server_cert>http://da1.digitalairlines.com/smt.crt</reg_server_cert>
</suse_register>

Configure Clients Using the clientsetup4smt.sh Script

The /usr/share/doc/packages/smt/clientsetup4smt.sh script file is provided with SMT. This script allows you to configure a client machine to use an SMT server or to reconfigure it to use a different SMT server.

NOTE: The clientsetup4smt.sh script works with SLE 10 SP2 and SLE 11 systems.

To configure a client machine to use SMT with the clientsetup4smt.sh script, do the following:

1. Copy the /usr/share/doc/packages/smt/clientsetup4smt.sh script file from your SMT server to the client machine. The script is also available for download from the SMT server, so you can fetch it with wget at the shell prompt of the client system.

2. Run the script on the client machine from the shell prompt (as root).

NOTE: You will need to make the file executable with the chmod command before you can run it.

The script can be executed in two ways. In the first case, the script name is followed by the registration URL. For example:

./clientsetup4smt.sh https://da1.digitalairlines.com/center/regsvc

You can also run the script name followed by the --host option and the hostname of the SMT server. For example:

./clientsetup4smt.sh --host da1.digitalairlines.com

3. You are prompted to download and accept the server's CA certificate. Accept it by typing y.

NOTE: The script performs all necessary modifications on the client. However, the registration process itself is not performed by the script.

The script then downloads and asks you to accept additional keys used to sign repositories.

4. Perform a registration by executing suse_register or running yast2 inst_suse_register on the client.

The apache2-example-pages package includes a file named robots.txt. The file is installed into the Apache2 document root directory and controls how clients can access files from the web server. If this package is installed on the server, the clientsetup4smt.sh script will fail to download the keys stored under /repo/keys. You can solve this problem by either editing robots.txt or uninstalling the apache2-example-pages package. If you choose to edit the robots.txt file, open it in a text editor and add the following line before the Disallow: / line:

Allow: /repo/keys

Then save the changes and quit the editor.

Configure SMT Clients in YaST

For SLE 11 systems, you can set the SMT server URL in YaST when registering with the NCC. Do the following:

1. On the SMT client system, start YaST, then select Other > Novell Customer Center Configuration.
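The whole client-side sequence can be summarized as the following dry run (the hostname is the course's example server, and the download path follows SMT's usual /repo/tools/ layout, which may differ on your server; the run helper only prints each command instead of executing it):

```shell
# Dry-run helper: prints each command instead of executing it.
run() { echo "+ $*"; }

# Fetch the setup script from the SMT server and make it executable:
run wget http://da1.digitalairlines.com/repo/tools/clientsetup4smt.sh
run chmod +x clientsetup4smt.sh

# Point the client at the SMT server (--host form):
run ./clientsetup4smt.sh --host da1.digitalairlines.com

# Finally, perform the actual registration (not done by the script itself):
run suse_register
```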

2. In the Customer Center Configuration module, select Advanced > Local Registration Server. The following is displayed:

Figure 5-25 Registering with a Local Registration Server in YaST

3. In the Registration Server field, enter the URL for your SMT server.
4. If necessary, specify the path and filename for your server certificate in the Server CA Certificate Location field.
5. Select OK.
6. You may be prompted to accept the certificate from the SMT server, as shown below:

Figure 5-26 Accepting the SMT Server Certificate

If this is the case, select Trust to accept the certificate.

7. Verify that Configure Now is marked; then select Next.

Manage SMT Client Systems

SMT allows you to register client machines with the NCC. Client machines must first be configured to use your SMT server. In this part of this objective, the following topics are addressed:

Manage Registrations on page 306
Apply Updates to SMT Clients on page 308

Manage Registrations

To list client machines registered with the SMT server, use the smt-list-registrations command at the shell prompt. The following information is listed for each client:

Unique ID
Hostname
Date and time of last contact with the SMT server
The software products the client uses

To delete a registration from SMT and the NCC, use the smt-delete-registrations -g Client_ID command at the shell prompt. To delete

multiple registrations, the -g option can be specified multiple times. The ID of the client machine to be deleted can be determined from the output of the smt-list-registrations command.

The smt-register command registers clients with the NCC. All clients that are currently not registered, or whose data has changed since the last registration, are registered. To register clients whose registration has failed for some reason, use the --reseterror option with this command. This option resets the NCC registration error flag and tries to submit failed registrations again.

You can use the YaST SMT Server Configuration module to set up client registration scheduling. In the default configuration, registrations repeat every 15 minutes. To create or modify a registration schedule, do the following:

1. Start YaST and launch the SMT Server Configuration module.
2. Go to the Scheduled SMT Jobs tab.
3. If you want to modify the existing schedule, select the NCC Registration job, select Edit, and then modify the schedule to meet your requirements.
4. If you want to create a new registration schedule, do the following:
a. Select Add.
b. Under Job to Run, select NCC Registration.
c. Specify the frequency for the scheduled SMT job. You can perform jobs Daily, Weekly, Monthly, or Periodically (every n-th hour or every n-th minute).
d. Set the job start time. Do not set the frequency to less than 10 minutes.
e. Select Add.

NOTE: To disable automatic registration, change the forwardRegistration value to false in the [LOCAL] section of the /etc/smt.conf configuration file.

After registration, you can check the status of your SMT clients in the YaST SMT Server Management module. Start YaST and select Network Services > SMT Server Management, then select the Client Status tab. A screen similar to the following is displayed:
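A registration-maintenance session from the shell might look like the following dry run (the client ID is a hypothetical value; the run helper only prints each command instead of executing it):

```shell
# Dry-run helper: prints each command instead of executing it.
run() { echo "+ $*"; }

# List registered clients; the output includes the unique ID used below:
run smt-list-registrations

# Delete one registration from SMT and the NCC; repeat -g for several IDs:
run smt-delete-registrations -g 0a1b2c3d4e5f

# Re-register clients, retrying any whose registration previously failed:
run smt-register --reseterror
```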

Figure 5-27 Managing SMT Clients in YaST

The Client Status tab contains information about all the clients that use the repositories on your SMT server. It is divided into two main parts:

A list of registered clients
Detailed information about each client

You can read each client's host name, the date and time of its last contact with the SMT server, and the client's update status. The update status is shown as one of the following states:

Up-to-Date. Indicates the client's packages have been updated to the latest version available in the production repository.
Updates Available. Indicates there are optional or recommended updates available for the client.
Critical. Indicates there are security patches or package manager patches available for the client.

In the lower part of the window, additional detailed information about the highlighted client is available. It consists of extended status information as well as detailed information about the number and types of available updates.

Apply Updates to SMT Clients

During the registration process, software repositories for each appropriate mirrored repository on the SMT server are automatically created for you on SMT client systems. An example is shown below:

Figure 5-28 Configured SMT Software Repositories

If, for some reason, the repositories on the SMT client were not automatically created, you can configure them manually using the YaST Software Repositories module. Do the following:

1. On the SMT client system, start YaST and select Software > Software Repositories > Add.
2. Mark Specify URL; then select Next.
3. In the URL field, enter the URL for a mirrored repository on your SMT server. For example, if the mirrored repository resides in /srv/www/htdocs/repo/$RCE/SLE11-SDK-Updates/sle-11-i586 on an SMT server named DA-SMT, you would enter http://DA-SMT/repo/$RCE/SLE11-SDK-Updates/sle-11-i586/. This is shown in the figure below:

Figure 5-29 Configuring an SMT Server Repository on the Client

4. Select Next.
5. When prompted to authenticate, enter a username and password for a user on the SMT server; then select Continue.

At this point, you should see a repository added for a mirrored repository located on your SMT server. An example is shown below:

Figure 5-30 SMT Server Repository Added

6. Select OK.

Once the SMT server repository is in place, you can apply updates on the SMT client by running the Online Update module in YaST. When you do, a list of available patches is displayed, as shown below:

Figure 5-31 Installing Updates from the SMT Server

Mark the updates you want to install, then select Accept.

You can also apply updates on the SMT client from the command line using the zypper utility. The simplest way to use zypper is to enter the zypper command at the shell prompt followed by a command. For example, to apply all needed patches on the system, you would enter zypper patch.

NOTE: In addition to patches, the zypper utility can also be used to install, remove, or update regular packages, much like the rpm utility. See the zypper man page for more information.

By default, zypper asks for confirmation before installing a package, as well as when a problem is encountered. You can override this behavior using the --non-interactive option. For example: zypper --non-interactive patch. Likewise, if the patches being installed require you to accept a license agreement, you can use the --auto-agree-with-licenses option to automatically accept all license agreements without user intervention. For example: zypper patch --auto-agree-with-licenses.

When you run any of the above commands, all patches in the repository are checked for relevancy and installed, if appropriate. To just list all relevant patches in the repository without installing them, enter zypper list-patches at the shell prompt.

The zypper utility can also be used to manage repositories on the SMT client. To view a list of configured repositories, enter zypper repos at the shell prompt. An example is shown below:

da3:~ # zypper repos
# | Alias       | Name        | Enabled | Refresh
--+-------------+-------------+---------+--------
1 | sle-11-i586 | sle-11-i586 | Yes     | Yes
...

If you want to remove a repository, you can enter zypper removerepo along with the alias or number of the repository you want to delete. For example, to remove the first repository listed in the example above, you would enter zypper removerepo 1 at the shell prompt.

You can also use the zypper command to add a new repository from the shell prompt. To do this, you would enter zypper addrepo URI alias. The alias parameter defines a unique identifier for the repository. An example is shown below:

da3:~ # zypper addrepo http://da1.digitalairlines.com/repo/$RCE/SLE11-SDK-Updates/sle-11-i586/ sle-11-i586
Adding repository 'sle-11-i586' [done]
Repository 'sle-11-i586' successfully added
Enabled: Yes
Autorefresh: No
URI: http://da1.digitalairlines.com/repo/$RCE/SLE11-SDK-Updates/sle-11-i586/
da3:~ #

Zypper will warn you if you specify an alias that is already used by another repository.

Zypper can be used to rename a repository as well. To do this, enter zypper renamerepo repository_number new_alias at the shell prompt. For example, the zypper renamerepo 1 MyUpdates command would rename the first repository shown in the example above from sle-11-i586 to MyUpdates.
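The repository-management commands above fit together as the following dry run (the URI and alias are hypothetical examples; the run helper only prints each command instead of executing it):

```shell
# Dry-run helper: prints each command instead of executing it.
run() { echo "+ $*"; }

# List configured repositories, add one served by the SMT server,
# rename it, and finally remove it again:
run zypper repos
run zypper addrepo 'http://da1.digitalairlines.com/repo/$RCE/SLE11-SDK-Updates/sle-11-i586/' sle-11-i586
run zypper renamerepo sle-11-i586 MyUpdates
run zypper removerepo MyUpdates
```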

Objective 4 Stage Repositories

After the mirroring process is complete on the SMT server, you have the option of staging the mirrored repositories. The staging process allows you to create testing and production repositories based on the mirrored repositories. The testing repository helps you examine repository packages and test them on a limited number of SMT client systems before you make them available to your production environment.

You have two options for staging repositories:

Staging Repositories from the Command Line on page 313
Staging Repositories in YaST on page 315
Implement an SMT Server on page 319

Staging Repositories from the Command Line

The first option is to stage repositories using the smt-staging command at the shell prompt. The directory where mirrored repositories are stored is set by the MirrorTo option in the /etc/smt.conf configuration file. By default, this is set to /srv/www/htdocs, as shown in the example below:

DA1:~ # cat /etc/smt.conf | grep MirrorTo
MirrorTo=/srv/www/htdocs

The production environment is maintained in the repo/ subdirectory. By default, this is /srv/www/htdocs/repo/. The testing environment is maintained in the same directory structure as the production environment, but it is located in the repo/testing/ subdirectory. By default, this is /srv/www/htdocs/repo/testing/.

Repositories with staging enabled are mirrored to the repo/full subdirectory. This subdirectory is not accessible, by default, to SMT clients. Therefore, with staging enabled, new updates are not automatically made available to clients, giving you a chance to test them with a limited number of client systems. You can select patches, create a snapshot, and put it into the repo/testing/ directory. After tests are complete, you can put the contents of repo/testing into the production environment in repo/.
You can enable or disable staging with the smt-repos command using the following syntax: smt-repos --enable-staging or smt-repos -s

You will be prompted to specify which repository you want to enable staging for, as shown below:

Select repository number (or all) to change, (1-367,a) :

You can enter a specific repository ID number or enter a to enable staging for all repositories.

NOTE: You can disable staging for a repository using one of the following commands at the shell prompt: smt-repos --disable-staging or smt-repos -S

You can list updates available for staging using the following command at the shell prompt:

smt-staging listupdates repository_id target

The output from this command is defined in the following figure:

Figure 5-32 Viewing Output from the smt-staging listupdates Command

You can also list detailed information about a specific update using the following command at the shell prompt:

smt-staging listupdates repository_id target --patch patch_name-version

The output from this command is defined in the following figure:

Figure 5-33 Viewing Details About a Specific Update

Once you've identified the patches you want to stage, you need to create the testing repository. To do this, enter the following command at the shell prompt:

smt-staging createrepo repository_id --testing

Once done, you can test the patches on a limited number of test SMT client systems. If no problems are discovered during testing, you can then create your production repository by entering the following command at the shell prompt:

smt-staging createrepo repository_id --production

If necessary, you can manually allow or disallow patches using the allow or forbid options with the smt-staging command, followed by the appropriate patch name and version or category. The syntax for doing this is shown below:

smt-staging forbid --patch patch_name
smt-staging forbid --category category_name

To register an SMT client to use the testing environment, you need to modify the /etc/suseregister.conf file on the client system and make the following change to the register directive:

register = command=register&namespace=testing

Staging Repositories in YaST

In addition to the command-line tools discussed above, you can also stage repositories in the YaST SMT Server Management module. Do the following:

1. Start YaST and select Network Services > SMT Server Management.
2. From the repository list, select the repository you want to enable staging for.
3. Select Toggle Staging. A check mark appears in the Staging column of the selected repository, as shown below:
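The complete command-line staging cycle can be sketched as the following dry run (the repository ID, target, and patch name are hypothetical examples; the run helper only prints each command instead of executing it):

```shell
# Dry-run helper: prints each command instead of executing it.
run() { echo "+ $*"; }

# Enable staging for a repository, then review the pending updates:
run smt-repos --enable-staging
run smt-staging listupdates 842 sle-11-i586

# Exclude a problematic patch from the snapshot:
run smt-staging forbid --patch slessp0-kernel-1234

# Build the testing snapshot, test it on a few clients, then promote
# the tested content to production:
run smt-staging createrepo 842 --testing
run smt-staging createrepo 842 --production
```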

Figure 5-34 Toggling Staging for a Repository

4. Repeat this process for each repository you want to stage.

NOTE: You can only stage repositories that are currently mirrored. The Toggle Staging button will not be active for unmirrored repositories.

5. Select the Staging tab.
6. In the Repository Name drop-down list, select the repository you want to stage. A list of patches in the repository is displayed. Information about each patch, including its name, version, category, testing and production flags, and summary, is displayed, as shown below:

Figure 5-35 Staging Individual Patches

Next to the Repository Name drop-down list is the Patch Category filter. You can use this feature to configure a filter that will list only the patches that belong to a particular category. If the selected repository allows patch filtering, you can toggle the status flag for individual patches by selecting Toggle Patch Status.

7. Create the testing repository by selecting Create Snapshot > From Full Mirror to Testing. Wait while the staging process completes. During this time, the updates from the repository are staged to the /srv/www/htdocs/repo/testing directory. This may take several minutes.
8. Select OK.
9. Test the updates with your SMT client test systems. To register an SMT client to use the testing environment, you need to modify the /etc/suseregister.conf file on the client system and make the following change to the register directive:

register = command=register&namespace=testing

10. When you are satisfied that the updates are working properly in your test environment, you are ready to move those updates to your production environment. Do the following:

a. Start YaST on the SMT server and select Network Services > SMT Server Management.

b. Select the Staging tab; then select the repository being tested from the Repository Name drop-down list.
c. Select Create Snapshot > From Testing to Production. Wait while the updates are staged to production. This may take a few minutes. After the production snapshot is created, a green check mark is displayed in the Repository Name drop-down list, as shown below:

Figure 5-36 Production Snapshot Created

d. Select OK.

At this point, the updates are distributed to your production SMT client systems according to the schedule you configured earlier.

NOTE: If the client systems have not been activated in the NCC, you must do so before updates will be delivered.
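The command-line side of the staging cycle described in this section can be summarized in one transcript-style sketch. The repository ID and patch name below are placeholders, and the commands must run as root on a configured SMT server, so this is illustrative rather than something to copy verbatim:

```shell
# Hypothetical repository ID and patch name -- substitute your own values.
# Requires a configured SMT server and root privileges; not runnable elsewhere.

# 1. Build the testing repository from the full mirror
smt-staging createrepo SLES11-Updates --testing

# 2. Optionally exclude a problematic patch before snapshotting
smt-staging forbid --patch example-patch-name

# 3. After client-side testing succeeds, promote to production
smt-staging createrepo SLES11-Updates --production
```

Steps 1 and 3 correspond to the From Full Mirror to Testing and From Testing to Production snapshots in YaST, respectively.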

Exercise 5-1 Implement an SMT Server

In this lab, you configure an SMT server. You will find this lab in the workbook.

(End of Exercise)

Summary

Objective: Describe How SMT Works

The management of server and desktop subscriptions and support entitlements in SLE 11 is done using the online Novell Customer Center (NCC) and the Subscription Management Tool (SMT).

The NCC is a Web-based portal that provides a single location for your organization to obtain updates and support for your SLE-based products. You can manage your subscription information, including status, renewal, utilization, and forecasting. The NCC notifies you when new software updates become available, allowing you to ensure licensing compliance, increase productivity, and reduce systems management costs. To receive updates from Novell, you must first register with the NCC. You can configure the NCC during installation, or you can configure it later using YaST.

SMT is a package proxy system that is integrated with the NCC. It provides key NCC capabilities locally at your site. SMT provides a repository and registration target that is synchronized with the NCC, thus maintaining all the capabilities of the NCC while allowing a more secure, centralized deployment.

SMT allows you to provision updates for SLE devices (server or desktop). By downloading these updates only once to the update server and then distributing them throughout the enterprise, you can set more restrictive firewall policies and avoid the bandwidth issues stemming from repeated downloads of the same updates by each device.

Using SMT, you can easily determine how many entitlements are pending renewal at the end of your billing cycle without having to physically walk through the data center. The same entitlement information is also available through your NCC account, completely streamlining the process.
SMT provides an infrastructure that can support several hundred SLE devices for each SMT installation server, ensuring accurate and efficient tracking. SMT also supports fully disconnected operations for highly secure sites, such as installations in military, government, or other highly classified deployments.

Objective: Install and Configure an SMT Server

SMT is distributed as an add-on product for SLES 11. You can choose to install SMT as an add-on during the initial installation process, or you can install it as an add-on to an already installed base system.

To install and configure the SMT server, you need to do the following:

Generate mirror credentials.
Install the SMT server.
Set the SMT job schedule.
Manage software repositories.

Objective: Configure SMT Client Systems

Any machine running SUSE Linux Enterprise 10 SP2 or later can be configured to register with your SMT server and download software updates from it instead of communicating directly with the NCC servers. To do this, you need to equip your SMT client systems with the SMT server's URL.

Because the client and server communicate using the HTTPS protocol during registration, you need to make sure the client trusts the server's certificate. If you set up your SMT server to use the default server certificate, the CA certificate can be downloaded directly from the SMT server. In this situation, the registration process downloads the CA certificate automatically, unless configured otherwise. If the SMT server certificate was issued by an external CA, you must manually enter the path to the server certificate file.

There are several ways to configure a client machine to use SMT:

Using kernel parameters at boot
Using an AutoYaST profile
Using the clientsetup4smt.sh script
Using YaST

Objective: Stage Repositories

After the mirroring process is complete on the SMT server, you have the option of staging the mirrored repositories. The staging process allows you to create testing and production repositories based on the mirrored repositories. The testing repository helps you examine repository packages and test them on a limited number of SMT client systems before you make them available to your production environment.

The production environment is maintained below /srv/. The testing environment is maintained in the same directory structure as the production environment, but it is located in the /srv/www/htdocs/repo/testing/ subdirectory.

Repositories that have staging enabled are mirrored to the /repo/full subdirectory. This subdirectory is not accessible to SMT clients by default. Therefore, with staging enabled, new updates are not automatically made available to clients, giving you a chance to test them with a limited number of client systems. You can select patches, create a snapshot, and put it into the repo/testing/ directory. After tests are complete, you can put the contents of repo/testing into the production environment in repo/.

SECTION 6 Prepare Servers for Disasters

Business continuity planning and disaster recovery are very broad subjects that include topics such as physical security of the data center, cold or hot standby facilities at a different location, backup strategies, redundant power supplies and network connections, uninterruptible power supplies, RAID systems, and clusters, to name a few. What you implement depends to a large degree on your business needs and your budget.

However, some of the above measures to mitigate the effects of a disaster can be implemented using tools that are included with SUSE Linux Enterprise Server (SLES) 11. In this section, you learn how to develop a backup strategy and use the backup tools shipped with SLES 11. You also learn to use multipath I/O and a redundant network connection to connect to a SAN, eliminating the network as a possible single point of failure.

Objectives

1. Design a Backup Strategy
2. Use Linux Tools to Create Backups
3. Implement Multipath I/O

Objective 1 Design a Backup Strategy

One of the key tasks that you must perform as a SLES 11 administrator is to ensure that the data on the systems you are responsible for is protected. One of the best ways to do this is to back up the data on a regular basis. A backup creates a redundant copy of important system data so that if a disaster occurs, the information can be restored.

Remember that the data on your Linux system is usually stored on hard drives, which are mechanical devices. Hard drives use electrical motors, spinning platters, and other moving parts that gradually wear out over time. All hard drives have a Mean Time Before Failure (MTBF) value assigned to them by the manufacturer. This value provides an estimate of how long a given drive will last before it fails. With hard drives, it's not a matter of if a drive will fail, but a matter of when.

In addition to hard drive failures, there is always the possibility that one or more of the following will occur:

Users delete files by accident.
A virus deletes or damages important files.
A notebook system gets lost or destroyed.
An attacker deletes data on a server.
Natural disasters, such as thunderstorms, generate electrical spikes that destroy storage systems.

Because of these factors, it is very important that you regularly back up important data. In this section, you learn how to do this. Before you can actually back up data, you first need to develop a backup strategy by doing the following:

Choosing a Backup Method on page 324
Choosing a Backup Media on page 326
Defining a Backup Schedule on page 326
Determining What to Back Up on page 327

Choosing a Backup Method

The first step in developing a backup strategy is to select the type of backups you will use.
The following options are available:

Full Backup on page 325
Incremental Backup on page 325
Differential Backup

Full Backup

The first option is to run a full backup. In a full backup, all specified files are backed up to your backup media, regardless of whether they've been modified since the last backup. After being backed up, each file is flagged as having been backed up.

This strategy is thorough and exhaustive. It's also the fastest option when you need to restore data from a backup. The disadvantage, however, is that full backups can take a very long time to complete because every single file is backed up, whether it's been changed or not.

Incremental Backup

Because of the amount of time required to complete full backups, many administrators mix full backups with incremental backups. During an incremental backup, only the files that have been modified since the last backup (full or incremental) are backed up. After being backed up, each file is flagged as having been backed up.

If you use a full/incremental strategy, you normally run a full backup only once a week. This is usually done when the system load is lightest, such as Friday night. Then you run incremental backups on each of the other six days of the week. Using this strategy, you end up with one full backup and six incremental backups for each week.

The advantage of this strategy is primarily speed. Because incrementals back up only files that have changed since the last full or incremental backup, they generally run much faster than full backups. However, incremental backups do have a drawback. If you need to restore data from the backup set, you must restore the backups in exactly the correct order: the full backup first, followed by the first incremental, then the second incremental, and so on. This can be a relatively slow process.

Differential Backup

As an alternative to incremental backups, you can also combine differential backups with your full backup.
During a differential backup, only the files that have been modified since the last full backup are backed up. Even if they have been backed up during a previous differential backup, the files involved are not flagged as having been backed up.

You must use differential backups in conjunction with full backups. Again, you usually run a full backup once a week when the system load is lightest. Then you run a differential backup on each of the other nights of the week. Remember that a differential backup backs up only files that have changed since the last full backup, not since the last differential. Therefore, each day's backup gets progressively bigger.

The main advantage of this strategy is that restores are really fast. Instead of the up to seven backups required to restore from a full/incremental backup set, you have to restore only two backups when using full/differential backups: the last full backup followed by the last differential backup.

The disadvantage of this method is that the differential backups start out running very fast, but can take almost as long as a full backup by the time you reach the last day in the cycle.

NOTE: Do not mix incremental and differential backups together! Your backups will lose data.

The following illustrates the difference between incremental and differential backups:

Figure 6-1 Weekly schedules (Mon-Fri): a full backup followed by daily incremental backups (top) versus a full backup followed by daily differential backups (bottom)

Choosing a Backup Media

Once you have selected your backup strategy, you need to select your backup media type. You must choose an appropriate backup media for the amount of data to be backed up. Tape drives are commonly used by Linux administrators because they have the best price-to-capacity ratio. Most tape drives are SCSI devices. This allows multiple types of tape drives (such as DAT, EXABYTE, and DLT) to be accessed in the same way. In addition, tapes can be easily rotated and reused.

Other options for data backup include writable DVDs, removable hard drives, and magneto-optical (MO) drives. Another option is a Storage Area Network (SAN). With a SAN, a storage network is set up exclusively to back up data from different computers on a central backup server. But even a SAN often uses magnetic tapes to store the data.

Backup media should always be stored separately from the backed-up systems. This prevents the backups from being lost in case of a fire or other disaster in the server room. We recommend that you keep a copy of your sensitive backup media stored safely offsite.

Defining a Backup Schedule

Next, you need to define when you will run your backups. You can select whatever backup schedule works best for your organization. However, many Linux admins work on a weekly rotation, as discussed previously. Identify one day for your full

backup and then designate the remaining days of the week for your incremental or differential backups.

As stated earlier, you should schedule your backups to occur when the load on the system is at its lightest. Late at night or early in the morning is usually best, depending on your organization's work schedule.

You should also be sure to keep a rotation of backups. We recommend that you rotate your backup media such that you have three to four weeks of past backups on hand. That way, if a file that was deleted two weeks ago is suddenly needed again, you can restore it from one of your rotated media sets.

Determining What to Back Up

Finally, you need to determine what data you will include in your backups. One option is to back up the entire system. This is a safe, thorough option. However, it's also somewhat slow due to the sheer amount of data involved.

Another option is to back up only critical data on the system, such as users' files and the system configuration information. In the event of a disaster, you can simply reinstall a new system and then restore the critical data from your backups. If you choose this strategy, you should consider backing up the following directories in the Linux file system:

/etc
/root
/home
/var
/opt
/srv
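A backup of a directory list like this can be driven by a short script. The following sketch uses throwaway temporary directories so it can run without root privileges; in real use you would point SRC at / (and run as root), and the archive name and paths here are only examples:

```shell
#!/bin/sh
# Illustrative backup of a set of "critical" directories with tar.
# SRC stands in for the system root and DEST for the backup target,
# so nothing on the real system is touched.
set -e

SRC=$(mktemp -d)
DEST=$(mktemp -d)

# Fake two of the directories a real run would include
mkdir -p "$SRC/etc" "$SRC/home/tux"
echo "hosts"  > "$SRC/etc/hosts"
echo "notes"  > "$SRC/home/tux/notes"

STAMP=$(date +%Y%m%d)
ARCHIVE="$DEST/critical-$STAMP.tar.gz"

# -C changes into the source root so archive members stay relative
tar -czf "$ARCHIVE" -C "$SRC" etc home

# Show what ended up in the archive
tar -tzf "$ARCHIVE"
```

Such a script is typically scheduled with cron so it runs unattended during off-peak hours, as discussed under "Defining a Backup Schedule."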

Objective 2 Use Linux Tools to Create Backups

Various commercial applications exist to back up data. However, the tools included with SLES 11, such as tar and rsync, especially when used within scripts scheduled with cron to run at specific times, might already be sufficient for your backup needs.

To back up data with Linux tools, you need to be able to

Create Backups with tar on page 328
Mirror Directories with rsync on page 332
Use LVM Snapshots with Backups on page 334
Use an LVM Snapshot Volume to Create a Consistent Backup on page 342

Create Backups with tar

The tar (tape archiver) tool is the most commonly used application for data backup on Linux systems. It archives files in a special format, either directly on a backup medium (such as magnetic tape) or to an archive file in the file system.

To use tar, you need to be familiar with the following tasks:

Create tar Archives on page 328
Unpack tar Archives on page 329
Exclude Files from Backup on page 329
Perform Incremental and Differential Backups on page 330
Use tar Command Line Options on page 331

Create tar Archives

The tar format is a container format for files and directory structures. By convention, the extension of the archive files is .tar. tar archives can be saved to a file and stored in a file system. They can also be written directly to a backup tape. Normally the data in the archive files is not compressed, but you can enable compression with additional compression commands. If archive files are compressed (usually with the gzip command), the extension of the filename is either .tar.gz or .tgz.

The syntax for using tar is as follows:

tar options archive_file_name directory_to_be_backed_up

You can also use

tar options tape_device_file_name directory_to_be_backed_up

All directories and files under the specified directory are included in the archive.
For example:

tar -cvf /backup/etc.tar /etc

In this example, the tar command backs up the complete contents of the /etc directory to the /backup/etc.tar file. The -c (create) option creates the archive. The -v (verbose) option displays a more detailed output of the backup process. The name of the archive to be created is entered after the -f (file) option. This can be either a normal file or a device file (such as a tape drive), as in the following:

tar -cvf /dev/st0 /home

In this example, the /home directory is backed up to the tape drive /dev/st0.

When an archive is created, absolute paths are made relative by default. This means that the leading / is removed, as shown in the following output:

tar: Removing leading / from member names

You can view the contents of an archive by entering the following:

tar -tvf /backup/etc.tar

Unpack tar Archives

Once you have created your archives, you can use tar to extract (unpack) files from an archive. To do this, use the following syntax:

tar -xvf device_or_file_name

For example:

tar -xvf /dev/st0

This writes all files in the archive to the current directory. Due to the relative path specifications in the tar archive, the directory structure of the archive is recreated there. If you want to extract to another directory, use the -C option followed by the directory name. If you want to extract just one file, add the filename with its path as contained in the archive, as in the following:

tar -xvf /test1/backup.tar -C / home/user1/.bashrc

The path will be created if it does not exist. An existing file with the same name will be overwritten.

Exclude Files from Backup

If you want to exclude certain files from the backup, you can create a list of these files in an exclude file. Each excluded file is listed on its own line, as shown in the following:

/home/user1/.bashrc
/home/user2/text*

In this example, the /home/user1/.bashrc file of user1 and all files that begin with text in the home directory of user2 will be excluded from the backup. This list is then passed to tar with the -X option, as in the following:

tar -cv -X exclude.files -f /dev/st0 /home

Perform Incremental and Differential Backups

With tar, you can approximate an incremental or differential backup by backing up only files that have been changed or newly created since a specific date. You need to be able to

Use a Snapshot File for Incremental Backups on page 330
Use find to Create a Differential Backup on page 330

Use a Snapshot File for Incremental Backups

tar lets you use a snapshot file that contains information about the last backup process. This file is specified with the -g option. First, you make a full backup with a tar command, as in the following:

tar -cz -g /backup/snapshot_file -f /backup/backup_full.tar.gz /home

In this example, the /home directory is backed up to the /backup/backup_full.tar.gz file. The snapshot file /backup/snapshot_file does not exist yet and is created. You can then perform an incremental backup the next day using the following command:

tar -cz -g /backup/snapshot_file -f /backup/backup_mon.tar.gz /home

In this example, tar uses the snapshot file to determine which files or directories have changed since the last backup. Only changed files are included in the new backup /backup/backup_mon.tar.gz.

Use find to Create a Differential Backup

You can also use the find command to identify files that need to be backed up in a differential backup. First, you use the following command to make a full backup:

tar -czf /backup/backup_full.tar.gz /home

In this example, the /home directory is backed up into the /backup/backup_full.tar.gz file.
Then you can use the following command (all on one line) to back up all files that are newer than the full backup:

find /home -type f -newer /backup/backup_full.tar.gz -print0 | tar --null -czf /backup/backup_mon.tar.gz -T -

In this example, all files (-type f) in the /home directory that are newer than the /backup/backup_full.tar.gz file are archived. The -print0 and --null options ensure that files with spaces in their names are also archived. The -T option determines that the file names piped to stdin are included in the archive.

One problem with the previous command line is its potentially long execution time when you have to back up a lot of data. If a file is created or changed after the backup command is started but before the backup is completed, this file is older than the reference backup archive but at the same time is not included in this archive. This could lead to a situation where the file is not backed up in the next differential backup, because only files that are newer than the reference archive are included. Instead of the previous backup archive, you can also create a file with the touch command and use this file as the reference in the find/tar command line.

Use tar Command Line Options

The following are several useful tar command line options:

Table 6-1 Options Used with the tar Command

-c  Creates an archive.
-C  Changes to the specified directory.
-d  Compares files in the archive with those in the file system.
-f  Uses the specified archive file or device.
-j  Directly compresses or decompresses the tar archive using bzip2, a modern, efficient compression program.
-r  Appends files to an archive.
-u  Includes only those files in an archive that are newer than the version in the archive (update).
-v  Displays the files which are being processed (verbose mode).
-x  Extracts files from an archive.
-X  Excludes files listed in a file.
-z  Directly compresses or decompresses the tar archive using gzip.

NOTE: For more information about tar, enter man tar at the shell prompt.
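The snapshot-file mechanism described above can be exercised end to end in a scratch directory. All paths below are temporary stand-ins for /home and /backup, so the sketch runs unprivileged; it assumes GNU tar, which provides the -g option:

```shell
# Demonstrate tar's snapshot-file (listed-incremental) backups.
set -e
WORK=$(mktemp -d)
mkdir -p "$WORK/data"
echo one > "$WORK/data/a.txt"
echo two > "$WORK/data/b.txt"

# Full backup; the snapshot file is created as a side effect
tar -cz -g "$WORK/snapshot" -f "$WORK/full.tar.gz" -C "$WORK" data

sleep 1                       # ensure a clearly newer mtime on changes

# Change one file and add another; b.txt stays untouched
echo changed > "$WORK/data/a.txt"
echo three   > "$WORK/data/c.txt"

# Incremental backup: only entries recorded as changed are stored
tar -cz -g "$WORK/snapshot" -f "$WORK/incr.tar.gz" -C "$WORK" data

tar -tzf "$WORK/incr.tar.gz"
```

Listing the incremental archive shows the directory entry plus only the changed and new files; the unchanged b.txt is not stored again.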

Mirror Directories with rsync

The rsync utility is designed to create copies of entire directories across a network to a different computer. As such, rsync is an ideal tool to back up data across the network to the file system of a remote computer or to a locally connected USB drive.

It's important to note that rsync works in a very different manner than other backup utilities. Instead of creating an archive file, rsync creates a mirror copy of the data being backed up in the file system of the destination device. A key benefit of using rsync is that when copying data, rsync compares the source and the target directory and transfers only data that has been changed or created. Therefore, the first time rsync is run, all of the data is copied. Thereafter, only files that have been changed or newly created in the source directory are copied to the target directory.

In this objective, you learn how to use rsync in two different ways:

Use rsync to Create a Local Backup on page 332
Use rsync to Create a Remote Backup on page 333

Use rsync to Create a Local Backup

The rsync utility can be used to create a local backup. The mirrored target directory could reside in the same file system as the source directory, or it could reside on a removable device such as a USB or FireWire hard drive. For example, you could mirror all home directories by entering the following at the shell prompt:

rsync -a /home /shadow

In this example, the /home directory is mirrored to the /shadow directory. The /home directory is first created in the /shadow directory, and then the actual home directories of the users are created under /shadow/home. If you want to mirror the content of a directory and not the directory itself, you can use a command such as the following:

rsync -a /home/ /shadow

By adding a / to the end of the source directory, only the data under /home is copied.
If you run the same command again, only files that have changed or are new since the last time rsync was run will be transferred.

The -a option used in the examples above puts rsync into archive mode. Archive mode is a combination of various other options (namely -rlptgoD) and ensures that the characteristics of the copied files are identical to the originals. The -a option ensures the following are preserved in the mirrored copy of the directory:

Symbolic links (l option)

Access permissions (p option)
Owners (o option)
Group membership (g option)
Time stamp (t option)

In addition, the -a option incorporates the -r option, which ensures that subdirectories are copied recursively.

The following are some other useful rsync options:

Table 6-2 Options Used with the rsync Command

-a  Puts rsync into archive mode.
-x  Stays on one file system; does not include file systems residing on other partitions.
-v  Enables verbose mode. Use this mode to output information about the transferred files and the progress of the copying process.
-z  Compresses the data during the transfer. This is especially useful for remote synchronization.
--delete  Deletes files from the mirrored directory that no longer exist in the original directory.
--exclude-from  Does not back up files listed in an exclude file.
-W, --whole-file  Copies files whole (without the delta-xfer algorithm). This is the default when both the source and destination are specified as local paths. You can change this, and many other rsync options, by prefixing the option with --no: --no-W

The --exclude-from option can be used as follows:

rsync -a --exclude-from=/home/exclude /home/ /shadow/home

In this example, none of the files listed in the /home/exclude file are backed up. Empty lines or lines beginning with ; or # are ignored.

Use rsync to Create a Remote Backup

Using rsync and SSH, you can log in to other systems over the network and perform data synchronization remotely. For example, the following command copies the home directory of the tux user to a backup server:

rsync -av root@da1:/home/tux /backup/home/

You can specify the remote shell to use with the -e option, but as ssh is the default, it is not necessary to include this option for ssh. The source directory is specified by the expression root@da1:/home/tux.
This means that rsync should log in to da1 as root and transfer the /home/tux directory.

Of course, this also works in the other direction. In the following example, the backup of the home directory is copied back to the da1 system:

rsync -av /backup/home/tux root@da1:/home/

NOTE: rsync must be installed on both the source and the target computer for this to work.

Another way to perform remote synchronization with rsync is to employ an rsync server. This allows you to use remote synchronization without having to allow an SSH login.

NOTE: For more information, consult the rsync documentation.

Use LVM Snapshots with Backups

A general problem with backups using the above tools is that files might change while the backup is in progress, making the backup inconsistent. Some files may have changed shortly after the backup began and others might have changed shortly before it finished. You cannot be sure which version is included in the backup.

For data residing on logical volumes, using a Logical Volume Manager (LVM) snapshot volume and backing up data from the snapshot, not the original volume, eliminates such inconsistencies. Data visible in the mounted snapshot does not change, even if the original data changes. The snapshot freezes the state of the original volume at the time of its creation, while the data on the original volume can continue to change.

To use LVM snapshots, you have to be able to

Configure Logical Volumes with YaST on page 334
Configure LVM with Command Line Tools on page 338
Configure and Use an LVM Snapshot Volume on page 340

Configure Logical Volumes with YaST

The following are the basic steps for configuring logical volumes (LVM) with YaST:

Define LVM Partitions (Physical Volumes) on the Hard Drive on page 334
Create a Volume Group and Logical Volumes on page 335

Define LVM Partitions (Physical Volumes) on the Hard Drive

During (or after) the installation of SLES 11, you need to configure the LVM partition on the hard disk. You can use YaST or fdisk to perform this task. When configuring the LVM partition, choose the following options in YaST:

Formatting options: Do not format partition
File system ID: 0x8E Linux LVM
To use LVM snapshots, you have to be able to Configure Logical Volumes with YaST on page 334 Configure LVM with Command Line Tools on page 338 Configure and Use an LVM Snapshot Volume on page 340 Configure Logical Volumes with YaST The following are the basic steps for configuring logical volumes (LVM) with YaST: Define LVM Partitions (Physical Volumes) on the Hard Drive on page 334 Create a Volume Group and Logical Volumes on page 335 Define LVM Partitions (Physical Volumes) on the Hard Drive During (or after) the installation of SLES 11, you need to configure the LVM partition on the hard disk. You can use YaST or fdisk to perform this task. When configuring the LVM partition, choose the following options in YaST: Formatting options: Do not format partition File system ID: 0x8E Linux LVM 334

Create a Volume Group and Logical Volumes

In the YaST Expert Partitioner, select Volume Management. The following is displayed:

Figure 6-2 LVM: Expert Partitioner

Then click Add Volume Group. The following appears:

Figure 6-3 LVM: Add Volume Group

Use this dialog to create a new logical volume group by specifying the following:

Volume Group Name. Name of your volume group.
Physical Extent Size. Smallest allocation unit of a volume group. With LVM version 1, this also defined the maximum size of a logical volume: a value of 4 MB allowed logical volumes of up to 256 GB. With LVM2, this limitation no longer exists. If you are not sure which value to specify, use the default setting.
Available Physical Volumes. Physical volumes that can be added to the volume group.
Selected Physical Volumes. Physical volumes that will be added to the volume group.

After you click Finish, you return to Volume Management, where you can select the newly created volume group. Next, you need to create a logical volume by clicking Add. Follow the prompts to create the logical volume. You will specify the following options:

Logical Volume Name. A descriptive name for the volume, such as data, mail, or accounting.
Size. The maximum available space can be used, or you can specify a specific size.
Stripes. The number of physical volumes the logical volume will be striped over (software RAID 0). A value of 1 means no striping. You can specify up to 8. The size of the stripe can also be selected if you select a value greater than 1. Striping is useful only if you have two or more disks. It can increase performance by allowing parallel file system reads and writes, but it also increases the risk of data loss. One failed disk can lead to data corruption in the whole volume group.

Click Finish to return to the Volume Group dialog. The following appears:

Figure 6-4    LVM: Expert Partitioner

To make the changes permanent, select Next > Finish. Otherwise, you can use the Add, Edit, Resize, or Delete buttons to manage the logical volumes in the LVM volume group.

NOTE: LVM configuration is done through Volume Management in YaST. The yast2 lvm_config module from SLES 10 is no longer used in SLES 11.

The following options are available to configure LVM groups:

Add. Adds a new logical volume to the volume group.

Edit. Lets you change the formatting and mounting options for the selected volume.

Resize. Lets you resize a logical volume by dragging the slider or manually entering a size, as shown in the following figure.

Figure 6-5    LVM: Resize Logical Volume

A graphical view shows how much space is used and how much is free (available) for both the logical volume (LV) and the volume group (VG).

Remove. Removes the selected volume. To delete a volume group, select the Overview tab and then click Delete. Before you can delete a volume group, you must first delete all logical volumes from the group.

Physical volumes are shown on the Physical Volumes tab. However, management of physical volumes must be done through the Hard Disks view.

NOTE: For additional information on configuring LVM, see the LVM HOWTO ( HOWTO/LVM-HOWTO/).

Configure LVM with Command Line Tools

Setting up LVM consists of several steps, with a dedicated tool for each:

Tools to Administer Physical Volumes
Tools to Administer Volume Groups
Tools to Administer Logical Volumes

This objective presents only a brief overview; not all available LVM tools are covered. To view the tools that come with LVM, enter rpm -ql lvm2 | less at the shell prompt and review the corresponding manual pages for details on each tool.
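The sections that follow cover each tool individually. As a preview, the whole command-line workflow can be sketched as follows. This is a minimal sketch rather than course material: the device /dev/sda9 and the names system and data are taken from the examples in this section, and the script only echoes the commands unless you set RUN_LVM=1 on a disposable test system, because the real commands require root privileges and a spare partition.

```shell
#!/bin/sh
# Sketch of the full LVM command-line workflow covered in this section.
# Assumed names from the examples: PV /dev/sda9, VG "system", LV "data".
# Dry run by default; set RUN_LVM=1 on a disposable test system to execute.
set -e
run() {
    if [ "${RUN_LVM:-0}" = "1" ]; then "$@"; else echo "would run: $*"; fi
}
run pvcreate /dev/sda9                  # initialize the physical volume
run vgcreate system /dev/sda9           # create the volume group
run lvcreate -L 100M -n data system     # create a 100 MB logical volume
run mkfs.ext3 /dev/system/data          # put a file system on it
run mount /dev/system/data /data        # and mount it
```

Run as-is, it prints five "would run:" lines, one per step, without touching any device.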

Tools to Administer Physical Volumes

Partitions or entire disks can serve as physical volumes for LVM. The ID of a partition used as part of LVM should be Linux LVM (0x8e); however, 0x83 (Linux) works as well. You cannot use an entire disk as a physical volume if it contains a partition table. Overwrite any existing partition table using dd:

da10:~ # dd if=/dev/zero of=/dev/sdd bs=512 count=1

The next step is to initialize the partition for LVM using pvcreate:

da10:~ # pvcreate /dev/sda9
  Physical volume "/dev/sda9" successfully created

pvscan shows the physical volumes and their use:

da10:~ # pvscan
  PV /dev/sda9    lvm2 [242,95 MB]
  Total: 1 [242,95 MB] / in use: 0 [0 ] / in no VG: 1 [242,95 MB]

Use the pvmove tool to move data from one physical volume to another (provided there is enough space), for example in order to remove a physical volume from LVM.

Tools to Administer Volume Groups

The vgcreate tool is used to create a new volume group. To create the volume group system and add the physical volume /dev/sda9 to it, enter the following:

da10:~ # vgcreate system /dev/sda9
  Volume group "system" successfully created
da10:~ # pvscan
  PV /dev/sda9   VG system   lvm2 [240,00 MB / 240,00 MB free]
  Total: 1 [240,00 MB] / in use: 1 [240,00 MB] / in no VG: 0 [0 ]

pvscan displays the new configuration. To add further physical volumes to the group, use vgextend. Removing unused physical volumes is done with vgreduce, after shifting the data from the physical volume scheduled for removal to other physical volumes using pvmove. vgremove removes a volume group, provided there are no logical volumes in the group.

Tools to Administer Logical Volumes

To create a logical volume, use lvcreate, specifying the size, the name for the logical volume, and the volume group:

da10:~ # lvcreate -L 100M -n data system
  Logical volume "data" created

The next step is to create a file system within the logical volume and mount it:

da10:~ # lvscan
  ACTIVE   '/dev/system/data' [100,00 MB] inherit
da10:~ # mkfs.ext3 /dev/system/data
mke2fs (01-Sep-2008)
Filesystem label=
...
This filesystem will be automatically checked every 33 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
da10:~ # mount /dev/system/data /data

As shown above, lvscan is used to view the logical volumes. It shows the device name to use for formatting and mounting.

lvextend is used to increase the size of a logical volume. After that, you can increase the size of the file system on that logical volume to make use of the additional space. Before you use lvreduce to reduce the size of a logical volume, you must first reduce the size of the file system. If you cut off part of the file system by simply reducing the size of the logical volume without shrinking the file system first, you will lose data.

Configure and Use an LVM Snapshot Volume

Complete the following steps to create and later remove an LVM snapshot:

1. Decide on the size of the snapshot. The snapshot does not have to be the same size as the original LVM volume you are snapshotting. When data changes on the original volume, the old data is copied to the snapshot before the new data is written to the original volume. The snapshot therefore only needs to be large enough to hold the changes that happen during its lifetime. The snapshot volume is dropped if it fills up, so be sure to allocate enough space for it (if the snapshot has the same size as the original volume, it cannot get dropped).

2. Create the snapshot with the following command:

lvcreate -L size -s -n snapshot_name /dev/vgname/original_volume

3. Mount the snapshot and back up its data:

mount -o ro /dev/vgname/snapshot_name /mountpoint

4. Back up the data from the snapshot, for example with tar.

5. When done, unmount the snapshot and delete it:

umount /dev/vgname/snapshot_name
lvremove /dev/vgname/snapshot_name
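The steps above can be put together into a single script. The following is a hedged sketch rather than course material: the volume group system, the volume data, the snapshot name data-snap, the 500M snapshot size, and the paths /mnt/snap and /backup are all assumptions. It only echoes the commands unless RUN_LVM=1 is set, since the real commands need root privileges and an existing LVM volume.

```shell
#!/bin/sh
# Sketch: consistent backup via an LVM snapshot (all names are assumptions).
# Dry run by default; set RUN_LVM=1 on a disposable test system to execute.
set -e
run() {
    if [ "${RUN_LVM:-0}" = "1" ]; then "$@"; else echo "would run: $*"; fi
}
# 1. Create the snapshot; 500M must be enough to hold the changes that
#    happen on the original volume during the backup.
run lvcreate -L 500M -s -n data-snap /dev/system/data
# 2. Mount it read-only and back it up.
run mount -o ro /dev/system/data-snap /mnt/snap
run tar -czf /backup/data.tar.gz -C /mnt/snap .
# 3. Unmount and remove the snapshot when done.
run umount /mnt/snap
run lvremove -f /dev/system/data-snap
```

Keeping the snapshot's lifetime short, as this script does, minimizes the space it needs and the copy-on-write overhead on the original volume.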

Exercise 6-1    Use an LVM Snapshot Volume to Create a Consistent Backup

In this lab, you create an LVM volume, create some files on this LVM volume, create a snapshot of the volume, and back up the snapshot.

You can find this lab in the Workbook.

(End of Exercise)

Objective 3    Implement Multipath I/O

The purpose of multipath I/O is to have independent access paths from the CPU to a mass storage device. This can include redundant bus systems or network devices. Multipathing is used to eliminate single points of failure, such as a single network connection. With access to the mass storage device running over two or more connections, access to the data is still possible if one link fails.

To implement multipath I/O, you need to

Understand Multipathing
Configure and Administer Multipathing
Access Remote Storage Using Multipath I/O

Understand Multipathing

Linux has natively supported access to storage systems over redundant physical paths since mid-2005, when the Device Mapper was extended by a multipath target. Before that, only commercial software was available for this purpose.

With multipathing, the storage system is accessed via two (or more) paths. This can be implemented, for instance, with two iSCSI initiators pointing at the same target using two IP addresses on the storage system. Multipathing is managed at the device level: the Device Mapper on the host combines the two SCSI disk devices created by the iSCSI initiators (for instance, /dev/sdc and /dev/sdd) into one virtual /dev/dm-0 device:

Figure 6-6    Device Mapper Multipath

If one path fails, access to the storage device is routed through the remaining path. There may be a short delay, but otherwise applications should remain unaffected.

NOTE: Some storage arrays require specific hardware handlers. Consult the hardware vendor's documentation to determine whether its hardware handler must be installed for Device Mapper Multipath.

Configure and Administer Multipathing

Device Mapper Multipathing consists of several components, including the following:

Device Mapper Multipath Module. This is the dm-multipath.ko kernel module and other device mapper modules that extend its functionality, such as dm-round-robin.ko. The modules are loaded by the /etc/init.d/boot.multipath and /etc/init.d/multipathd scripts.

Multipath I/O Management Tools. These tools are contained in the multipath-tools RPM package and include multipath, multipathd, libraries in /lib/multipath, and the /etc/init.d/boot.multipath and /etc/init.d/multipathd init scripts. multipath is used to detect multiple paths to devices for fail-over or performance reasons and coalesces them. multipathd is the daemon in charge of checking for failed paths. When a path fails, it reconfigures the multipath map the path belongs to, so that this map regains its maximum performance and redundancy. This daemon executes the external multipath tool when events occur. In turn, the multipath tool signals the multipathd daemon when it is done with devmap reconfiguration, so that it can refresh its failed path list.

mdadm. Needed when using LVM or Linux software RAID on top of multipathing devices.

Before running /etc/init.d/boot.multipath start, the output of fdisk -l shows only the devices created by the iSCSI initiators (/dev/sda is the local hard disk, and /dev/sdb and /dev/sdc are iSCSI devices):

da-host:~ # fdisk -l

Disk /dev/sda: 80.0 GB
255 heads, 63 sectors/track, 9729 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x000a1de1

   Device Boot      Start      End      Blocks   Id  System
/dev/sda1                                            Linux swap / Solaris
/dev/sda2   *                                        Linux

Disk /dev/sdb: 10.7 GB
64 heads, 32 sectors/track
Units = cylinders of 2048 * 512 = 1048576 bytes
Disk identifier: 0x

Disk /dev/sdb doesn't contain a valid partition table

Disk /dev/sdc: 10.7 GB
64 heads, 32 sectors/track
Units = cylinders of 2048 * 512 = 1048576 bytes
Disk identifier: 0x

Disk /dev/sdc doesn't contain a valid partition table

After running /etc/init.d/boot.multipath start, an additional /dev/dm-0 appears:

da-host:~ # fdisk -l

Disk /dev/sda: 80.0 GB
...

Disk /dev/sdb doesn't contain a valid partition table

Disk /dev/sdc: 10.7 GB
64 heads, 32 sectors/track
Units = cylinders of 2048 * 512 = 1048576 bytes
Disk identifier: 0x

Disk /dev/sdc doesn't contain a valid partition table

Disk /dev/dm-0: 10.7 GB
255 heads, 63 sectors/track, 1305 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x

Disk /dev/dm-0 doesn't contain a valid partition table
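To confirm from such listings which block devices are paths to the same LUN, you can group the fdisk -l output by its Disk identifier line. The following sketch does this with a small awk script; the sample input and the identifier values in it are illustrative, modeled on the listings above rather than copied from them.

```shell
#!/bin/sh
# Group block devices by disk identifier to spot multiple paths to one LUN.
# The sample input mimics fdisk -l output; the identifiers are illustrative.
sample='Disk /dev/sda: 80.0 GB
Disk identifier: 0x000a1de1
Disk /dev/sdb: 10.7 GB
Disk identifier: 0x00000000
Disk /dev/sdc: 10.7 GB
Disk identifier: 0x00000000'

group_by_id() {
    awk '/^Disk \//            { dev = $2; sub(":$", "", dev) }
         /^Disk identifier:/   { ids[$3] = ids[$3] " " dev }
         END { for (i in ids) print i ":" ids[i] }'
}

echo "$sample" | group_by_id | sort
```

On a live system you would feed it real output instead, e.g. fdisk -l | group_by_id. For the sample above it prints "0x00000000: /dev/sdb /dev/sdc" on one line and "0x000a1de1: /dev/sda" on the other, showing that sdb and sdc share an identifier.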

The disk identifier is the same for sdb, sdc, and dm-0. You can create a file system on the /dev/dm-0 device, as shown in the following:

da-host:~ # mke2fs -j /dev/dm-0
mke2fs (01-Sep-2008)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
inodes, blocks
blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=
block groups
blocks per group, fragments per group
8096 inodes per group
Superblock backups stored on blocks:
        32768, 98304, ...

Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 39 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
da-host:~ # mount /dev/dm-0 /mnt
da-host:~ # ls /mnt
lost+found
da-host:~ #

It is also possible to partition the /dev/dm-0 device. This is supported, but not recommended; instead, prepare the SAN devices according to your needs using the vendor's tools. In the following example, two primary partitions are created:

da-host:~ # fdisk /dev/dm-0
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel with disk identifier 0x0b8348cd.
Changes will remain in memory only, until you decide to write them.
After that, of course, the previous content won't be recoverable.

The number of cylinders for this disk is set to 1305.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
   (e.g., DOS FDISK, OS/2 FDISK)
Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help): n
...                        # creating two primary partitions
Command (m for help): p

Disk /dev/dm-0: 10.7 GB
255 heads, 63 sectors/track, 1305 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x0b8348cd

     Device Boot      Start      End      Blocks   Id  System
/dev/dm-0p1                                            Linux
/dev/dm-0p2                                            Linux

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.

WARNING: Re-reading the partition table failed with error 22: Invalid argument.
The kernel still uses the old table.
The new table will be used at the next reboot.
Syncing disks.
da-host:~ # partprobe

The output of fdisk -l now looks like this:

da-host:~ # fdisk -l
...

Disk /dev/sdb: 10.7 GB
255 heads, 63 sectors/track, 1305 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x0b8348cd

     Device Boot      Start      End      Blocks   Id  System
/dev/sdb1                                              Linux
/dev/sdb2                                              Linux

Disk /dev/sdc: 10.7 GB
255 heads, 63 sectors/track, 1305 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x0b8348cd

     Device Boot      Start      End      Blocks   Id  System
/dev/sdc1                                              Linux
/dev/sdc2                                              Linux

Disk /dev/dm-0: 10.7 GB
255 heads, 63 sectors/track, 1305 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x0b8348cd

     Device Boot      Start      End      Blocks   Id  System
/dev/dm-0p1                                            Linux
/dev/dm-0p2                                            Linux

Disk /dev/dm-1: 1069 MB
255 heads, 63 sectors/track, 129 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x

Disk /dev/dm-1 doesn't contain a valid partition table

Disk /dev/dm-2: 1069 MB
255 heads, 63 sectors/track, 130 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0xe

Disk /dev/dm-2 doesn't contain a valid partition table

The information for /dev/sdb, /dev/sdc, and /dev/dm-0 shows the new partitions. Two new devices, /dev/dm-1 and /dev/dm-2, have appeared that represent the new partitions. These devices would be used with the mkfs command to create file systems.

The multipath tools detect most storage arrays automatically, and no additional configuration is required. However, the system defaults can be overridden with an /etc/multipath.conf file. It does not exist until you create and configure it. It has the following structure:

# /etc/multipath.conf
defaults {
        user_friendly_names     yes
        #path_grouping_policy   failover
        path_grouping_policy    multibus
        failback                immediate
        ...
}
devnode_blacklist {
        devnode "sd[a-b]$"
        devnode ...
}
blacklist_exceptions {
        ...
}
multipaths {
        multipath {
                wwid    ...
                alias   my_storage
        }
}
devices {
        device {
                vendor  "NETAPP "
                product "LUN"
                ...
        }
}

Entries in the defaults section can be overridden by entries in subsequent sections. The path_grouping_policy in the defaults section determines how the paths are used:

failover. One path is assigned per priority group so that only one path at a time is used.

multibus. (Default) All valid paths are in one priority group. Traffic is load-balanced across all active paths in the group.

group_by_prio. One priority group exists for each path priority value. Paths with the same priority are in the same priority group. Priorities are assigned by an external program.

group_by_serial. Paths are grouped by the SCSI target serial number (controller node World Wide Name [WWN]).

group_by_node_name. One priority group is assigned per target node name. Target node names are fetched from /sys/class/fc_transport/target*/node_name.

The possible entries in each section are described in the multipath.conf manual page and also with comments in the /usr/share/doc/packages/multipath-tools/multipath.conf.annotated file. You can also copy the /usr/share/doc/packages/multipath-tools/multipath.conf.synthetic file to /etc/multipath.conf and edit it according to your needs.

The following is an actual configuration for two SANs from two different vendors:

# /etc/multipath.conf
defaults {
        polling_interval        10
        path_grouping_policy    failover
        rr_weight               priorities
        failback                30
        no_path_retry           4
}
blacklist {
        devnode cciss
        devnode fd
        devnode hd
        devnode md
        devnode sr
        devnode st
        devnode ram
        devnode loop
}
devices {
        device {
                vendor                  "HITACHI"
                product                 "DF600F"
                path_grouping_policy    failover
                prio_callout            "/sbin/pp_hds_modular %d"
        }
        device {
                vendor                  "DGC"
                product                 "*"
                product_blacklist       "LUNZ"
                path_grouping_policy    group_by_prio
                hardware_handler        "1 emc"
                prio_callout            "/sbin/mpath_prio_emc /dev/%n"
                path_checker            emc_clariion
        }
}

To activate any changes made in the configuration file, stop the multipathd service (/etc/init.d/multipathd stop) and clear the existing multipath bindings with the multipath -F command. Then create new multipath bindings with the multipath -v2 -l command and start multipathd again (/etc/init.d/multipathd start).

After having cleared the multipath bindings with the multipath -F command, you can test the configuration with the multipath -v3 -d (verbosity level 3, dry run) command, as shown in the following:

da-host:~ # multipath -v3 -d
...
create: n/a IET,VIRTUAL-DISK
[size=10g][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=2][undef]
 \_ 6:0:0:0 sdb 8:16 [undef][ready]
 \_ 7:0:0:0 sdc 8:32 [undef][ready]
Jan 05 09:01:00 unloading const prioritizer
Jan 05 09:01:00 unloading directio checker

After starting multipathing with /etc/init.d/multipathd start, multipath -ll produces similar output:

da-host:~ # /etc/init.d/multipathd start
Starting multipathd                                    done
da-host:~ # multipath -ll
dm-0 IET,VIRTUAL-DISK
[size=10g][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=2][active]
 \_ 6:0:0:0 sdb 8:16 [active][ready]
 \_ 7:0:0:0 sdc 8:32 [active][ready]

To start the multipath services at system boot, enter chkconfig multipathd on and chkconfig boot.multipath on.

The SLES 11 multipathing documentation ( sles11/stor_admin/?page=/documentation/sles11/stor_admin/data/bookinfo.html) contains detailed information on all aspects of Device Mapper Multipathing on SLES 11.

Exercise 6-2    Access Remote Storage Using Multipath I/O

In this lab, you access remote storage space using iSCSI and Multipath I/O.

You can find this lab in the Workbook.

(End of Exercise)

Summary

Objective: Design a Backup Strategy

To develop a backup strategy, you need to complete the following:

Choose a backup method.
Choose backup media.

There are three basic backup strategies:

Full backup: All data is backed up.
Incremental backup: Only the data that has changed since the last incremental or full backup is saved.
Differential backup: Only the data that has changed since the last full backup is saved.

Objective: Use Linux Tools to Create Backups

tar is a commonly used tool to perform data backups under Linux. It can write data directly to a backup medium or to an archive file. Archive files normally end in .tar. If they are compressed, they end in .tar.gz or .tgz.

The following is the basic syntax to create a tar archive:

tar -cvf archive_file directory_to_be_archived

To unpack a tar archive, use the following command:

tar -xvf archive_file

If you want to use tar with gzip for compression, you need to add the z option to the tar command. Archives can also be written directly to tape drives. In this case, the device name of the tape drive must be used instead of a filename. tar can also be used for incremental or differential backups.

The rsync command is used to synchronize the content of directories, locally or remotely over the network. rsync uses special algorithms to ensure that only those files that are new or have changed since the last synchronization are copied.

The basic command to synchronize the content of two local directories is the following:

rsync -a source_dir target_dir

To perform a remote synchronization, use the following:

rsync -ave ssh user@remotehost:path target_dir
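The summary states that tar can also be used for incremental backups. With GNU tar this is done through a snapshot file and the -g (--listed-incremental) option; the following runnable sketch demonstrates the idea on a temporary directory that the script creates itself (all paths and file names are illustrative).

```shell
#!/bin/sh
# Demonstrate incremental backups with GNU tar's -g (--listed-incremental).
set -e
work=$(mktemp -d)
mkdir "$work/data"
echo one > "$work/data/a.txt"

# Level 0 (full) backup: records the file state in the snapshot file.
tar -czf "$work/full.tgz" -g "$work/snapshot" -C "$work" data

# Change the data, then take a level 1 (incremental) backup against a
# copy of the snapshot file (tar -g updates the file in place).
echo two > "$work/data/b.txt"
cp "$work/snapshot" "$work/snapshot.1"
tar -czf "$work/incr.tgz" -g "$work/snapshot.1" -C "$work" data

# List what the incremental archive actually contains.
listing=$(tar -tzf "$work/incr.tgz")
echo "$listing"
rm -rf "$work"
```

Listing the incremental archive shows only data/ and data/b.txt; the unchanged a.txt is skipped because its state was already recorded in the snapshot file during the full backup.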

Objective: Implement Multipath I/O

The purpose of multipath I/O is to have independent access paths from the CPU to a mass storage device. The main components of Linux multipathing include the device mapper multipathing kernel modules and the multipathing tools: multipath, used to detect multiple paths to devices, and multipathd, the daemon in charge of checking for failed paths. The multipathing configuration is contained in the /etc/multipath.conf file.

SECTION 7    Monitor Server Health

In this section, you learn how to monitor the health of your SLES 11 systems.

Objectives

1. Document the Server
2. Monitor Log Files with logwatch
3. Monitor Network Hosts with Nagios

Objective 1    Document the Server

In this section, you learn how to monitor your SLES 11 systems. The first task we will address is documenting your server.

At first glance, documenting systems may not sound like a task related to server monitoring. However, it's key to making your monitoring tasks productive. By establishing baselines with your documentation, you can easily identify what has changed when something goes wrong with your server system. Without the baselines that server documentation provides, you're left with guesses and intuition, which may not yield accurate answers.

Even though maintaining server documentation is one of the most important tasks you need to complete as a system administrator, it's also probably the task that administrators most frequently neglect. In this objective, you learn how to maintain a complete set of documentation for your SLES 11 systems. The following topics will be addressed:

Creating a Server Deployment Plan
Maintaining a Server Configuration Log
Documenting System Changes and Maintenance Events
Creating Server Baselines

Creating a Server Deployment Plan

The first task you should complete before installing a new server is to create a server deployment plan.

Most Linux system administrators are technically minded. They enjoy working with computers and aren't intimidated by new technologies. Unfortunately, they are also notorious for not documenting their work. They are content with the fact that all of the information they need to complete their job is in their heads.

If you're working on a test system in a lab environment, you can probably get away with implementing Linux without documenting your work. There's usually little risk if you make mistakes. In fact, it can be a great learning experience.
However, when working with a system that will be used in a production environment, this approach is unacceptable. Mistakes on your part can lead to system outages, which can cost your organization time and money. Instead of deploying a Linux system in a haphazard, unstructured manner, you should develop a server deployment plan before you start the installation process. Doing so can help you prevent costly errors.

In this part of the objective, we'll discuss how to plan a SLES 11 installation. The following topics will be addressed:

Conducting a Needs Assessment
Developing the Project Scope and Schedule

Verifying System Requirements and Hardware Compatibility
Planning the File System
Selecting Software Packages
Specifying User Accounts
Gathering Network Information
Selecting an Installation Source

Conducting a Needs Assessment

The first step in your server deployment plan is to conduct a needs assessment. This step is one of the most important aspects of creating a server deployment plan. Unfortunately, it's also the most frequently skipped step, and even when it is done, it's frequently done poorly.

Completing a needs assessment requires you to assume the role of a project manager. You will need to meet with a variety of individuals to gather data about the server deployment. Your findings should be documented, distributed, and reviewed by key project stakeholders.

The needs assessment portion of your server deployment plan should answer the following questions:

What are the goals of the project? Determine why the implementation is being proposed in the first place. You should ask questions such as: What problems will this deployment fix? What are the expectations for the deployment? What objectives will be met by the implementation? When you document the goals of the project, be sure to use language that is clear and measurable. Be sure to talk to everyone involved. If you don't, you won't get a clear picture of what is expected and will probably fail to meet a particular goal.

Who are the stakeholders in this project? Identify all individuals who will be impacted by the project in any way. You should ask the following questions: Who requested the new system? Who will use the system after it's installed? Who has the authority to fund the project? Who has authority to allocate employee resources to the project? Who must give final approval before you can begin?
Who will maintain and support the system after it is implemented?

These are critical questions that must be answered before you begin any deployment. You'll probably find that there are individuals in every organization who will try to circumvent established policies to get you to do something for them without first obtaining the proper approvals.

Don't make the mistake of assuming that a new system has been approved and funded simply because someone asked for it. If you first identify all the stakeholders in the deployment, you can be sure that the project has been approved and the necessary funds have been allocated before you begin.

When is the system needed? Determine when the project should be completed. Before you can create a schedule for your project, you need to know when the stakeholders expect it to be done.

Developing the Project Scope and Schedule

By gathering this data in your needs assessment, you have effectively defined the project scope, which is one of the most important components in your deployment plan. The project scope specifies exactly what will be done, when it will be done, and who will do it.

Most deployment projects are a three-way balancing act between the following:

Schedule
Resources
Scale

This relationship is depicted below:

Figure 7-1    Balancing Schedule, Resources, and Scale in a Deployment Project

To successfully manage any project, you must keep these three elements in balance. For example, if the schedule is excessively short, you will need to either increase the number of resources assigned to the project or decrease its scale. If your schedule, scale, and resources aren't in balance, the project will probably fail in some way.

Using project management software can help you calculate how long a project will take using a given number of resources. You can delegate specific tasks to specific

resources and assign task durations. You can also define task dependencies, and thus identify which tasks must be completed before other tasks begin. Using this information, project management software can calculate your schedule, allowing you to see how long the project is going to take. You can also adjust various aspects of the project to see what the effect will be. For example, you can add an additional resource to the project and see what effect it will have on your overall schedule.

Once your project schedule is complete, be sure all of the project's stakeholders have a chance to review and approve the schedule. Defining a project scope and a project schedule before beginning a deployment may take some time to complete. However, the benefits will almost always outweigh the cost of the time spent.

Verifying System Requirements and Hardware Compatibility

Next, you should verify your system requirements and hardware compatibility. Before you start installing SLES 11, you need to make sure it is compatible with your hardware and that your software and services will run on it.

In the early days of Linux, hardware compatibility could be a significant issue. There just weren't enough developers writing Linux drivers. If you were installing Linux on a brand-name system using common hardware components, it was relatively easy to get Linux installed and working correctly. However, if your system used uncommon or proprietary hardware, getting Linux running correctly could be a challenge.

Fortunately, this is much less of an issue today. Most server hardware vendors offer Linux drivers for their hardware. In fact, many drivers for common server hardware are included in the SLES 11 installation media. To be safe, however, it's a good idea to check the SLES 11 hardware compatibility list (HCL) on the Novell Web site and verify that your system hardware is listed.
In addition to checking the HCL, you also need to verify that your hardware meets the SLES 11 system requirements. The latest version can be found on the Novell Web site ( server/techspecs.html). An example is shown in the figure below:

Figure 7-2    Viewing SLES 11 System Requirements

A key aspect you need to consider is your server's CPU architecture. Be sure you select the correct architecture for your system's CPU. As you can see in the figure above, SLES 11 is currently supported on the following architectures:

x86
x86_64 (AMD64 and Intel EM64T)
IA64 (Itanium 2)
IBM POWER
IBM System z (64-bit)

In addition, you need to create a list of software and services that will be installed on the server and verify that they are compatible with SLES 11.

Planning the File System

Next, you need to specify how and where the server file system will be created. First, you need to decide if your server will use directly attached storage devices or if it will connect to a Fibre Channel or iSCSI SAN, as discussed earlier in this course. Next,

you need to define how your storage device will be partitioned and what file system will be used. Consider the following:

Using Traditional Partitions or LVM
Choosing a File System
Defining System Partitions

Using Traditional Partitions or LVM

One of the first choices you need to make is whether to use traditional partitions on the storage devices in your server system or the Logical Volume Manager (LVM).

LVM enables the flexible distribution of file systems over several partitions or hard disks. It allows you to reallocate hard disk space after the initial partitioning has already been done during installation. To do this, LVM uses a virtual pool of storage space (called a volume group, or VG) from which logical volumes (LVs) can be created as needed. The operating system then accesses these LVs instead of the physical partitions. Volume groups can span more than one disk; several disks can constitute one single VG. In this way, LVM provides a layer of abstraction from the physical disk space that allows its segmentation to be changed in a much easier and safer way than physical re-partitioning. This is represented in the figure below:

Figure 7-3    LVM Compared to Traditional Partitioning

In this figure, physical partitioning (left) is compared with LVM segmentation (right). On the left, one single disk has been divided into three physical partitions (PART), each with a mount point (MP) assigned so that the operating system can access them. On the right side, two disks have been divided into two and three physical partitions each. Two LVM volume groups (VG 1 and VG 2) have been defined, with the physical partitions included as physical volumes (PV). VG 1 contains two partitions from DISK 1 and one from DISK 2. VG 2 contains the remaining two partitions from DISK 2. Within the volume groups, four logical volumes (LV 1 through LV 4) have

been defined, which can be accessed by the operating system using the associated mount points.

Some advantages of LVM over physical partitions include the following:

- Space from several hard disks or partitions can be combined into one large logical volume.
- A logical volume can easily be enlarged when its free space is exhausted by adding more space to the volume group.
- You can add hard disk space to a running system, assuming your server allows hot-swappable storage.
- You can implement striping to distribute the data of a logical volume over several physical volumes. If these physical volumes reside on different disks, this can improve read and write performance, much like RAID 0.
- The snapshot feature of LVM enables consistent backups of a running system.

With these features, LVM is a good choice for heavily used server systems. If you have a growing data stock, as in the case of databases or user directories, LVM is especially useful. LVM allows up to 256 LVs per server. LVM 2 is now available, starting with version 2.6 of the Linux kernel. LVM 2 is backwards-compatible with the previous version of LVM and allows the continued management of older volume groups. LVM 2 does not require kernel patches; it makes use of the device mapper integrated into version 2.6 of the kernel.

Choosing a File System

The job of the file system is to reliably store data on the hard drive and organize it in such a way that it is easily accessible. SLES 11 offers a wide variety of file systems that you can choose from:

- ext2
- ext3
- ReiserFS
- FAT (not recommended)
- XFS

Defining System Partitions

Next, you need to plan what partitions will be created on which disks in your system, how many partitions will be created, how large they will be, and where they will be mounted in the file system.
If you are using LVM, you also need to determine what volume groups will be created, what logical volumes will be created, and where they will be mounted in the file system.
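On a running system, the LVM layout described above is built with the pvcreate, vgcreate, and lvcreate commands. The sequence below is a sketch only: the device names (/dev/sdb1, /dev/sdc1), the volume group and volume names, and the sizes are hypothetical examples, and the plan is printed for review rather than executed, because these commands destroy data on real disks.

```shell
# Dry-run plan only: print the command sequence for review.
# Device names, VG/LV names, and sizes are hypothetical examples.
cat <<'EOF'
pvcreate /dev/sdb1 /dev/sdc1           # mark partitions as physical volumes
vgcreate vg_data /dev/sdb1 /dev/sdc1   # combine them into one volume group
lvcreate -n lv_home -L 20G vg_data     # carve out a 20 GB logical volume
mkfs.ext3 /dev/vg_data/lv_home         # create a file system on the LV
mount /dev/vg_data/lv_home /home       # mount it like any other device
lvextend -L +10G /dev/vg_data/lv_home  # later: grow the LV from free VG space
EOF
```

Because the operating system addresses /dev/vg_data/lv_home rather than a physical partition, the volume can later be enlarged with lvextend while the data stays in place, which is the flexibility advantage discussed above.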

You should plan to create several partitions on your hard drives to add a degree of fault tolerance to your server system. Problems encountered in one partition are isolated from the other partitions in the system. For example, suppose you used the default partitioning proposal when installing SLES 11 and had your entire file system mounted at the root directory (/). If a user were to consume all of the available space on the partition by copying huge files to his home directory in /home, it could cause the entire system to crash. If, on the other hand, you were to create a separate partition for /home and the user were again to consume all the available disk space on the /home partition by copying very large files to his home directory, the system would remain running. The partitions containing your system files, log files, and application files are not affected because the issue is isolated to a single partition. Depending on the purpose of your server, you should consider creating separate partitions, in addition to your server's swap partition(s), for the directories listed in the table below:

Table 7-1 Server Partitions

/: A partition for the root directory is always required. This partition should be 4 GB or larger in size, depending on how you allocate directories, such as /var, to separate partitions.

/boot: You can create a partition for the /boot directory, which contains your Linux boot files. This partition should be about MB in size.

/opt: You can create a partition for application files installed in /opt. You should allocate as much space as necessary to accommodate the applications that use this directory.

/tmp: You can create a partition for your system's temporary files stored in /tmp. You should allocate at least 1 GB to this partition.

/usr: You can create a partition for system utilities stored in /usr. You should allocate at least 5 GB to this partition. You may need to allocate more depending on the packages you choose to install.

/var: You can create a partition for the log, mail, and spool files stored in /var. Because log files can become quite large, it's a good idea to isolate them in their own partition. You should allocate at least 3 GB of space to this partition.

The above sizes are suggested minimums. Considering today's hard disk sizes, you will probably use much larger sizes for the listed partitions.

Selecting Software Packages

Next, you need to specify which software packages you want to include on your server. As discussed in Section 4 of this course, you should install only the software that is necessary for the server to fulfill its role in your organization and nothing more. Installing software that isn't necessary creates a security risk in your system that you must actively manage.
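With a multi-partition layout like the one in Table 7-1, the resulting /etc/fstab might look like the following sketch. The device names and mount options are hypothetical examples; your devices and file systems will differ:

```
/dev/sda1   swap    swap   defaults         0 0
/dev/sda2   /       ext3   acl,user_xattr   1 1
/dev/sda3   /boot   ext3   acl,user_xattr   1 2
/dev/sdb1   /var    ext3   acl,user_xattr   1 2
/dev/sdb2   /home   ext3   acl,user_xattr   1 2
```

With this layout, a runaway process or user filling /home or /var cannot exhaust the space on the root partition.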

Installing only the software that is absolutely necessary eliminates this risk and makes your life as an administrator easier.

Specifying User Accounts

When planning your SLES 11 deployment, you should list the user accounts that will be needed on the system. As discussed in Section 4, you need to ensure that your user accounts have strong password policies enforced and that they have access only to the aspects of the server they need to do their jobs, and no more.

NOTE: This is called the Principle of Least Privilege.

If you need certain user accounts to have more than the default level of access, use sudo and ACLs to grant them the level of access they need.

Gathering Network Information

Next, you need to gather network information for the server. You need to answer several questions:

- Will IPv4 and/or IPv6 be installed?
- Will the system have its networking configuration dynamically assigned, or will it need to be manually configured?
- What hostname will be assigned to the system?
- What is the name of the DNS domain the system will reside in?
- What services will the server provide?
- What ports need to be opened in the host firewall?

Selecting an Installation Source

Like most Linux distributions, SLES 11 provides you with several different options for installing the server system. These options include the following:

Install from DVD: This option is frequently used when installing a limited number of SLES 11 systems, usually ten or fewer.

Install from a network server: Another option for installing a SLES 11 system is to install over the network from an existing network server. You can install from a server on the network that has been set up as an installation source using the SMB/CIFS, FTP, NFS, or HTTP protocol.
The key advantage of performing a network installation is that you can install a large number of servers at once without having to burn multiple DVDs. Depending upon your network bandwidth and utilization, you may find that this option is actually faster than performing an installation from a local DVD.
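When you boot the installation media against such a server, the installation source is typically passed to the installer as a boot option. The server address and path below are hypothetical examples; the exact URL scheme depends on the protocol your installation server exports:

```
install=nfs://192.168.1.10/install/sles11
install=http://192.168.1.10/install/sles11
```

The installer then retrieves the packages from the network source instead of the local DVD.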

To perform a network installation, you first need to set up an installation server on an existing Linux server in your network and copy the SLES 11 installation files to it. This can be done using the Installation Server module in YaST on another SLES server.

Perform an automated installation: You can increase the speed of the installation process using an automated installation. In an automated installation, you use configuration information stored in a control file (in XML format) to specify how the server is to be configured, making manual intervention during the installation process unnecessary. This can be exceptionally advantageous when installing multiple systems that will be configured in the same manner.

You should not confuse automated installation with cloning or imaging. An automated installation uses the standard installation process; answers to the questions asked during the installation are contained in and read from the control file. In addition, the hardware detection routine runs during an automated installation. This allows you to use the same control file to install systems that may use diverse hardware. The control file can be made available on a floppy disk, on a USB device, or from a server on the network. While the control file can be created manually, it's much easier to use the AutoYaST tool to create it. Another way to make a control file is to manually install your first SLES 11 server and, in the last installation screen, mark Clone This System. The settings used in the current installation will be saved to an XML file that you can use to install subsequent systems.

Specify how your server will be installed in your deployment plan.

Maintaining a Server Configuration Log

Once your initial deployment is complete, you should next develop and maintain documentation for your new server.
As with server deployment plans, many network administrators fail to do this task, preferring instead to focus on other tasks that seem more urgent at the time. However, maintaining current documentation for your SLES 11 system provides many crucial benefits and should not be neglected:

- All of your system configuration information is located in a central location.
- Resolutions to problems you encounter are documented. The information can be referenced and reused when the problems recur.
- Other network administrators who are assisting you, or who may be hired in your place after you leave your organization, have a complete reference describing the system and how it is configured. It also describes problems you may have encountered and how they were resolved. This shortens their learning curve, helping them become productive in a shorter amount of time.

How you store this information is up to you. Some system administrators record system information by hand in a spiral-bound notebook. This option is inexpensive, portable, and doesn't require electricity; unlike an electronic method, it works in a power outage. The disadvantage of using a notebook is that it can be tedious to record large amounts of information manually, and the information tends to become dated quickly, requiring constant updates. In addition, your penmanship needs to be such that others can read your handwriting.

Many administrators choose instead to store their system documentation electronically. There are a number of templates available on the Internet that you can download and edit with a word processor or spreadsheet application, such as OpenOffice.org. Some administrators use a network management system to store all of their system information centrally on a database server. There are a variety of agents that can be installed on networked Linux systems that automatically update a central database with information about the system they are running on.

If you choose an electronic method, consider printing a hard copy of the information and storing it in a 3-ring binder as a backup. This preserves your system documentation in the event of a hard drive failure and keeps it available in the event of a power outage.

After you initially deploy a new SLES 11 system, you should carefully document your system configuration.
You should consider including the following information:

Table 7-2 Server Configuration Log Data

Hardware:
- System manufacturer and model number
- Vendor purchased from and date purchased
- Warranty information
- Motherboard make and model number
- BIOS make and version number
- CPU make and model number
- RAM manufacturer, size, type, and installed amount
- Hard drive make, model, size, and geometry
- Power supply output and date of installation
- CD/DVD drive make and model number
- List of installed expansion boards
- Video board and monitor information, including the video card make and model number, installed video memory, chipset, maximum screen resolution supported by the monitor, and monitor refresh rates
- Network board make and model number

Server Operating System:
- Linux distribution installed, including support pack number
- Date installed
- Disk partitions and mount points
- Additional kernel modules loaded
- Operating system parameters, such as keyboard type and language, mouse type, and time zone
- Packages installed

Networking:
- Static IP address information, including the IP address, subnet mask, default gateway router address, and DNS server address
- Network services installed
- Network service configuration parameters from the configuration files in the /etc directory

As you can see from the table above, much of this information is very sensitive. You should use ACL assignments to control access to the electronic version of your documentation so that only authorized users can access it. In addition, you should store hard copies of your documentation in a locked cabinet and strictly control who has keys.

Documenting System Changes and Maintenance Events

In any networking environment, change is the only constant. Your systems will change over time. You will install new versions of your operating system. You will install new hardware. Your networking parameters will change. To keep track of these changes, you should maintain a change log and a maintenance log along with your system documentation.

You should make an entry in your change log every time you make a significant change to the system. For example, if you were to replace a network board in one of your systems, you should record the information in your change log. A sample change log entry is shown below:

Table 7-3 Sample Change Log Entry

Component: RTL8169S Gigabit Ethernet network board
Reason for Change: Old board malfunctioned and stopped working.
Date Installed: 05/04/2009
New Component Information:
- Make: Intel
- Model: Pro1000
- Specs: 1000 Mbps Ethernet
- MAC Address: C-7F-1D
- Vendor: CompuStuff

In addition to documenting changes, you should also document routine maintenance activities in a maintenance log. Keeping a maintenance log can help you organize your maintenance tasks. A maintenance log could include information such as that listed in the table below:

Table 7-4 Sample Maintenance Log Entry

Backups:
- Date the last backup was run
- Type of backup (full, differential, or incremental)
- Tape number or label
- Media name
- Tape rotation information

Baselines:
- Date the last baseline was run
- Baseline statistics
- Comparison and analysis of old baselines against the latest baseline

Updates:
- Type of update (such as BIOS, RAID controller BIOS, video BIOS, kernel, driver, or rpm update)
- Name of the update installed
- Date the update was installed

Creating Server Baselines

Server baselines are invaluable resources when monitoring, maintaining, and troubleshooting a SLES 11 server. Baselining involves taking a snapshot of your server's performance when it's in a pristine state, such as right after it is initially deployed. This is your initial baseline. As time passes, you should take additional baselines at regular intervals and compare the results to your initial baseline. Doing this provides you with a picture of what's happening with your server. For example, you may see that memory usage has risen dramatically after an update was applied. Or, you may see that CPU utilization has gradually increased and disk I/O operations have slowed down as additional users have been added to the system. Using this information, you can take action before these issues cause a catastrophic system failure.

To create a server baseline, you need to monitor and document a variety of system parameters. The actual parameters monitored will vary from organization to organization. Some suggested parameters to monitor include the following:

- CPU utilization
- RAM utilization
- Swap partition utilization
- Free disk space
- Disk I/O throughput
- Network throughput

You can use a variety of tools to gather this information on a SLES system. One of the most useful is top. To use this utility, enter top at the shell prompt. Data similar to the following is displayed:

Figure 7-4 Using top

The top utility runs until you close the program by pressing q. As it runs, it continually updates the statistics displayed on the screen at regular intervals. You can view the following:

- System uptime
- Number of users logged in
- Load average
- Total number of tasks loaded
- The number of processes in the running, sleeping, stopped, and zombied states

NOTE: A zombied process is one that has finished executing and has exited, but the process's parent process didn't get notified that it was finished and hasn't released the child process's PID. A zombied process may eventually clear up on its own. If it doesn't, you may need to manually kill its parent process.

- CPU utilization
- Physical memory statistics, including total, used, free, and buffers
- Swap memory statistics, including total, used, free, and cached
- A partial list of running processes, one on each line

The following columns are used to display information about each process:

PID: The process ID of the process.
USER: The name of the user that owns the process.
PR: The priority assigned to the process.
NI: The nice value of the process.
VIRT: The amount of virtual memory used by the process.
RES: The amount of physical RAM the process is using, in kilobytes.
SHR: The amount of shared memory used by the process.
S: The status of the process. Possible values include the following:
- D: Uninterruptibly sleeping
- R: Running
- S: Sleeping
- T: Traced or stopped
- Z: Zombied
%CPU: The percentage of CPU time used by the process.
%MEM: The percentage of available physical RAM used by the process.
TIME+: The total amount of CPU time the process has consumed since being started.
COMMAND: The name of the command that was entered to start the process.

You can also use the sar utility to gather system information. Unlike top, sar doesn't run continuously. Instead, it takes several snapshots over a specified period of time to generate statistics.

NOTE: To use sar, you must first install the sysstat and sysstat-isag packages.

Once the utility is installed, you can run the sar command at the shell prompt using the following syntax:

sar option interval count

You can use the following options with sar:

-b: Reports I/O and transfer rate statistics.
-B: Reports paging statistics.
-I { irq | SUM | ALL | XALL }: Reports statistics for a given interrupt number. SUM displays the total number of interrupts received per second. ALL displays statistics from the first 16 interrupts. XALL displays statistics from all interrupts, including potential APIC interrupt sources.
-n { DEV | EDEV | SOCK | ALL }: Reports network statistics. DEV displays statistics for network devices. EDEV displays statistics for failures (errors) from network devices. SOCK displays statistics for sockets in use. ALL displays statistics for all of the network activities.
-P { cpu | ALL }: Reports per-processor statistics for the specified processor(s). Using ALL reports statistics for each individual processor and globally for all processors.
-r: Reports memory and swap space utilization statistics.
-R: Reports memory statistics.
-u: Reports CPU utilization.
-W: Displays swap partition statistics.

For example, to view CPU utilization statistics using 5 measurements taken 3 seconds apart, you would enter sar -u 3 5 at the shell prompt. Information similar to the following would be displayed:

DA1:~ # sar -u 3 5
Linux pae (DA1)  02/01/10  _i686_

13:27:06  CPU  %user  %nice  %system  %iowait  %steal  %idle
13:27:09  all
13:27:12  all
13:27:15  all
13:27:18  all
13:27:21  all
Average:  all

If you want to view this information graphically, add the -o option to the sar command. This option saves the output from sar as the /var/log/sa/sax binary file. You can then use the System Activity Grapher within the GNOME graphical environment to view the data in the form of a line chart. An example is shown below:

Figure 7-5 Viewing sar Data in the System Activity Grapher Utility

Another option for graphically viewing system resource usage is the GNOME System Monitor, shown below:

Figure 7-6 Using GNOME System Monitor to Manage Resource Usage

You can view information on the following tabs:

System: Displays system information.
Processes: Displays information about running processes.
Resources: Displays CPU history, memory usage history, and network usage history.
File Systems: Displays file system usage statistics.
Hardware: Displays a list of hardware currently in use in the system.

You can also use the df command to view disk space usage in the file system of the server. An example is shown below:

DA1:~ # df -h
Filesystem   Size  Used  Avail  Use%  Mounted on
/dev/sda2    7.9G  3.5G   4.1G   46%  /
udev         467M   65M   403M   14%  /dev
/dev/sdc1     20G  6.2G    13G   34%  /srv/www
/dev/sr0     2.7G  2.7G      0  100%  /media/suse_sles
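While df reports usage per file system, it does not tell you which directories within a file system are consuming the space. For that, you can combine du with sort. The directory chosen below is just an example:

```shell
# Show the five largest directories under /var, largest last.
# Permission errors are discarded so the list stays readable.
du -sh /var/* 2>/dev/null | sort -h | tail -5
```

The -h flags make both du and sort work with human-readable sizes (K, M, G), so the output is sorted by actual size rather than alphabetically.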

Likewise, you can also use the GNOME Disk Usage Analyzer to monitor disk space usage in the file system graphically. An example is shown below:

Figure 7-7 Using Disk Usage Analyzer to Monitor Disk Space

Using these utilities, you can gather extensive information about your SLES 11 system that can be used to establish server baselines. The interval for taking baselines will vary from organization to organization and from system to system. For example, a server that contains mission-critical information and must remain available 24/7 should have baselines taken very frequently (weekly or bi-weekly).

NOTE: You can also use Nagios to create server baselines. Installation and configuration of Nagios is covered in Monitor Network Hosts with Nagios later in this course.
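The baseline parameters listed in this objective can also be captured non-interactively, which makes it easy to save dated snapshots for later comparison. The following script is a minimal sketch; the choice of metrics is an example you would adapt to your own baseline checklist:

```shell
#!/bin/sh
# Capture a simple baseline snapshot of basic system metrics.
date                                  # timestamp for this baseline
cat /proc/loadavg                     # load averages and run queue
grep -E 'MemTotal|MemFree|SwapTotal|SwapFree' /proc/meminfo   # RAM and swap
df -h                                 # free disk space per file system
```

Run it from cron at your chosen baseline interval and redirect the output to a dated file (for example, /var/log/baseline-$(date +%F).txt); comparing the files over time reveals the trends described above.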

Objective 2  Monitor Log Files with logwatch

With your server documentation complete, the next task you should complete is to monitor your server log files. These files contain invaluable information that can warn you of impending problems or even a possible attack by an intruder. You can, of course, manually analyze your log files using the cat, less, or more commands at the shell prompt. You can also read your log files in a text editor such as vi or gedit. However, your log monitoring activities can be made more efficient and more effective by using software that analyzes and summarizes the information in your log files for you. In this objective, you learn how to monitor your log files using the logwatch utility on SLES 11. The following topics are addressed:

- How logwatch Works on page 376
- Installing and Configuring logwatch on page 376
- Using logwatch to Monitor Your Server Log Files on page 381

How logwatch Works

Logwatch is a customizable log monitoring system that can be configured to analyze your server log files for a specified period of time. Using the information it finds, it can generate reports for the areas you configure. Among other things, logwatch can be useful for reporting failed login attempts. However, it can be configured to report any activity that gets logged in the various server system logs.

Installing and Configuring logwatch

You can install logwatch on SLES 11 either by using the YaST Software Management module or by using the yast -i logwatch command. With the installation complete, a directory structure is created in /usr/share/logwatch for use by the logwatch program. This directory structure contains executables and configuration files that configure the default behavior of logwatch.

NOTE: The default configuration allows logwatch to run on most systems without completing additional configuration tasks.
The default.conf subdirectory resides in /usr/share/logwatch along with the lib, scripts, and dist.conf subdirectories. The lib directory contains Perl library files. The scripts directory contains Perl executables. The dist.conf directory contains configuration files specific to your Linux distribution. The default.conf directory is very important. It contains the default logwatch configuration files, which you need to be familiar with, as shown below:

Figure 7-8 The Contents of the /usr/share/logwatch/default.conf Directory

The first is the services subdirectory, which contains configuration files specific to each service. Logwatch determines which services are available using the contents of this directory. Each configuration file is named using its associated service name along with the .conf suffix, as shown below:

DA1:/usr/share/logwatch/default.conf/services # ls
afpd.conf                imapd.conf                  raid.conf
amavis.conf              in.qpopper.conf             resolver.conf
arpwatch.conf            init.conf                   rt314.conf
audit.conf               ipop3d.conf                 samba.conf
automount.conf           iptables.conf               saslauthd.conf
autorpm.conf             kernel.conf                 scsi.conf
bfd.conf                 mailscanner.conf            secure.conf
cisco.conf               modprobe.conf               sendmail-largeboxes.conf
clam-update.conf         mountd.conf                 sendmail.conf
clamav-milter.conf       named.conf                  shaperd.conf
clamav.conf              netopia.conf                slon.conf
courier.conf             netscreen.conf              smartd.conf
cron.conf                oidentd.conf                sonicwall.conf
denyhosts.conf           openvpn.conf                sshd.conf
dhcpd.conf               pam.conf                    sshd2.conf
dmeventd.conf            pam_pwdb.conf               stunnel.conf
dnssec.conf              pam_unix.conf               sudo.conf
dovecot.conf             php.conf                    syslogd.conf
dpkg.conf                pix.conf                    tac_acc.conf
emerge.conf              pluto.conf                  up2date.conf
evtapplication.conf      pop3.conf                   vpopmail.conf
evtsecurity.conf         portsentry.conf             vsftpd.conf
evtsystem.conf           postfix.conf                windows.conf
exim.conf                pound.conf                  xntpd.conf
eximstats.conf           proftpd-messages.conf       yum.conf
extreme-networks.conf    pureftpd.conf               zz-disk_space.conf
fail2ban.conf            qmail-pop3d.conf            zz-fortune.conf
ftpd-messages.conf       qmail-pop3ds.conf           zz-network.conf
ftpd-xferlog.conf        qmail-send.conf             zz-runtime.conf
http.conf                qmail-smtpd.conf            zz-sys.conf
identd.conf              qmail.conf

These .conf files contain settings that configure how logwatch will gather information from the associated service's log file. For example, the xntpd.conf file in this directory is shown below:

# /usr/share/logwatch/default.conf/services/xntpd.conf
...
Title = "XNTPD"

# Which logfile group...
#LogFile = secure
LogFile = messages

# Only give lines pertaining to the ntpd service...
*MultiService = ntpd,xntpd,ntpdate
*RemoveHeaders
...

As you can see in the example above, this file directs logwatch to query the /var/log/messages file for any lines containing the text strings ntpd, xntpd, or ntpdate.

The logfiles subdirectory in /usr/share/logwatch/default.conf contains the logfile group configuration files. These files contain information about one or more log files. Several services may use the same logfile group configuration file. A listing of the files in this directory is shown below:

DA1:/usr/share/logwatch/default.conf/logfiles # ls
autorpm.conf             http.conf                      resolver.conf
bfd.conf                 iptables.conf                  rt314.conf
cisco.conf               kernel.conf                    samba.conf
clam-update.conf         maillog.conf                   secure.conf
cron.conf                messages.conf                  sonicwall.conf
daemon.conf              netopia.conf                   syslog.conf
denyhosts.conf           netscreen.conf                 tac_acc.conf
dnssec.conf              php.conf                       up2date.conf
dpkg.conf                pix.conf                       vsftpd.conf
emerge.conf              pureftp.conf                   windows.conf
eventlog.conf            qmail-pop3d-current.conf       xferlog.conf
exim.conf                qmail-pop3ds-current.conf      yum.conf
extreme-networks.conf    qmail-send-current.conf
fail2ban.conf            qmail-smtpd-current.conf

The logwatch.conf file in /usr/share/logwatch/default.conf is the logwatch configuration file.

NOTE: Many of the parameters in this file can be overridden using command-line options when running logwatch manually from the shell prompt.

The default logwatch.conf file is shown below. The key directives are shown in bold:

Monitor Server Health

# /usr/share/logwatch/default.conf/logwatch.conf
...
# Default Log Directory
# All log-files are assumed to be given relative to this directory.
LogDir = /var/log

# You can override the default temp directory (/tmp) here
TmpDir = /var/cache/logwatch

# Default person to mail reports to. Can be a local account or a
# complete email address. Variable Print should be set to No to
# enable mail feature.
MailTo = root
# When using option --multi, it is possible to specify a different
# recipient per host processed. For example, to send the report
# for hostname host1 to user@example.com, use:
#Mailto_host1 = user@example.com
# Multiple recipients can be specified by separating them with a space.

# Default person to mail reports from. Can be a local account or a
# complete email address.
MailFrom = Logwatch

# If set to 'Yes', the report will be sent to stdout instead of being
# mailed to above person.
Print = Yes

# if set, the results will be saved in <filename> instead of mailed
# or displayed.
#Save = /tmp/logwatch

# Use archives? If set to 'Yes', the archives of logfiles
# (i.e. /var/log/messages.1 or /var/log/messages.1.gz) will
# be searched in addition to the /var/log/messages file.
# This usually will not do much if your range is set to just
# 'Yesterday' or 'Today'... it is probably best used with
# Range = All
# By default this is now set to Yes. To turn off Archives uncomment this.
#Archives = No

# The default time range for the report...
# The current choices are All, Today, Yesterday
Range = yesterday

# The default detail level for the report.
# This can either be Low, Med, High or a number.
# Low = 0
# Med = 5
# High = 10
Detail = Low

# The 'Service' option expects either the name of a filter
# (in /usr/share/logwatch/scripts/services/*) or 'All'.
# The default service(s) to report on. This should be left as All for
# most people.
Service = All
# You can also disable certain services (when specifying all)
Service = "-zz-network"    # Prevents execution of zz-network service, which
                           # prints useful network configuration info.
Service = "-zz-sys"        # Prevents execution of zz-sys service, which
                           # prints useful system configuration info.
Service = "-eximstats"     # Prevents execution of eximstats service, which
                           # is a wrapper for the eximstats program.

# If you only cared about FTP messages, you could use these 2 lines
# instead of the above:
#Service = ftpd-messages   # Processes ftpd messages in /var/log/messages
#Service = ftpd-xferlog    # Processes ftpd messages in /var/log/xferlog

# Maybe you only wanted reports on PAM messages, then you would use:
#Service = pam_pwdb   # PAM_pwdb messages - usually quite a bit
#Service = pam        # General PAM messages... usually not many

# You can also choose to use the 'LogFile' option. This will cause
# logwatch to only analyze that one logfile.. for example:
#LogFile = messages
# will process /var/log/messages. This will run all the filters that
# process that logfile. This option is probably not too useful to
# most people. Setting 'Service' to 'All' above analyzes all LogFiles
# anyways...

#
# By default we assume that all Unix systems have sendmail or a
# sendmail-like system. The mailer code prints a header with
# To: From: and Subject:. At this point you can change the mailer to
# anything else that can handle that output stream. TODO test variables
# in the mailer string to see if the To/From/Subject can be set from
# here without breaking anything. This would allow mail/mailx/nail etc... -mgt
mailer = "/usr/sbin/sendmail -t"

#
# With this option set to 'Yes', only log entries for this particular host
# (as returned by 'hostname' command) will be processed. The hostname
# can also be overridden on the commandline (with --hostname option). This
# can allow a log host to process only its own logs, or Logwatch can be
# run once per host included in the logfiles.
#
# The default is to report on all log entries, regardless of its source host.
# Note that some logfiles do not include host information and will not be
# influenced by this setting.
#
#HostLimit = Yes
...

The ignore.conf file in /usr/share/logwatch/default.conf specifies regular expressions that, when matched by a line in the output of logwatch, will suppress that line.

The /etc/logwatch directory is also created. Within this directory are two subdirectories:

conf. Contains the configuration files specific to your system. The structure of this directory is the same as the /usr/share/logwatch/default.conf directory discussed previously:

Figure 7-9 Contents of /etc/logwatch/conf

The files in this directory are used to customize logwatch and override the default values located in /usr/share/logwatch/default.conf. The /etc/logwatch/conf directory is first searched for files with the same name and relative location as those in the /usr/share/logwatch/default.conf directory. Variables declared in the files in /etc/logwatch/conf override the defaults.

scripts. Contains executable scripts specific to your system. Again, these files are used to customize logwatch and override the defaults found in /usr/share/logwatch/scripts.

The structures of these directories are the same as those found in /usr/share/logwatch.

Using logwatch to Monitor Your Server Log Files

Once logwatch has been installed and configured, you can use it to monitor the log files on your SLES 11 server. This is done using the logwatch command at the shell prompt (as root). You can use the following options with this command:
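The override mechanism described above can be sketched with a short example. The variable values shown here are illustrative assumptions, not shipped defaults; only the file location and precedence behavior come from the text above:

```shell
# Site-specific overrides; any variable set here takes precedence over
# the same variable in /usr/share/logwatch/default.conf/logwatch.conf
# (values below are examples only)
cat > /etc/logwatch/conf/logwatch.conf <<'EOF'
Range = Today
Detail = Med
MailTo = admin@example.com
Print = No
EOF
```

Because only the variables you declare are overridden, the default file remains the reference for everything you leave out.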

NOTE: These options can be used to override parameters in the logwatch.conf file.

--detail level: Sets the detail level of the report. The level parameter can be a value between 0 and 10. You can also use a value of low, med, or high, which correspond to 0, 5, and 10.

--logfile log_file_group: Forces logwatch to process only the set of log files defined by the log_file_group parameter. Logwatch will process all services that use those log files. You can specify multiple log file groups with this option.

--service service_name: Forces logwatch to process only the log files for the specified service. Logwatch will also process any log file groups necessary for the service. This option can be specified multiple times to process multiple services at once. You can also specify All to process all services and logfile-groups configured in your configuration files.

--print: Prints the output of the logwatch command to stdout (the screen).

--mailto address: Sends the output of the logwatch command to the email address or user specified.

NOTE: The --mailto option overrides the --print option.

--range range: Limits logwatch to the date range specified. You can use values such as Yesterday, Today, and All.

--archives: Forces logwatch to process log file archives in addition to the current log file.

--save file_name: Saves the output of logwatch to a file instead of displaying or e-mailing it.

--logdir directory: Forces logwatch to process log files in the specified directory instead of the default directory.
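Several of these options can be combined in a single invocation; a sketch (the service name and output path are illustrative):

```shell
# Report today's sshd activity at full detail and save it to a file
# instead of mailing or printing it
logwatch --service sshd --range today --detail 10 --save /tmp/ssh-report.txt
```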
For example, if you want to process log entries for the SSH server daemon running on your server, you could enter the following command at the shell prompt:

logwatch --service sshd --detail high

Sample output from this command is shown below:

################### Logwatch (05/19/07) ####################
    Processing Initiated: Mon Feb 1 10:34:
    Date Range Processed: all
    Detail Level of Output: 10
    Type of Output: unformatted
    Logfiles for Host: da1
##################################################################

SSHD Begin

SSHD Killed: 9 Time(s)

SSHD Started: 22 Time(s)
Didn't receive an ident from these IPs:
    : 4 Time(s)
Illegal users from:
    : 11 times
    rtracy/keyboard-interactive/pam: 9 times
    tux/keyboard-interactive/pam: 2 times
Login attempted when not in AllowUsers list:
    tux : 2 Time(s)
Users logging in through sshd:
    geeko: : 23 times
Error in PAM authentication:
    Authentication failure for geeko from : 2 Time(s)
    Authentication failure for illegal user tux from : 2 Time(s)
    Authentication failure for root from : 1 Time(s)
    User not known to the underlying authentication module for illegal user rtracy from : 9 Time(s)
**Unmatched Entries**
    Protocol major versions differ for : SSH-2.0-OpenSSH_5.1 vs. SSH-1.5-NmapNSE_1.0 : 4 time(s)

SSHD End

###################### Logwatch End #########################

As you can see, logwatch reports very useful information from the sshd log file. For example, you can see that two unauthorized users (rtracy and tux) tried to log in to the server from the same IP address. This is a fairly strong indicator that an intruder may have been attempting to log in. You can also see the number of authentication errors by the various users accessing the server.
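Rather than running logwatch by hand, you can have cron generate a report like the one above every night. The script location and option values below are a sketch, not a shipped default:

```shell
#!/bin/sh
# Hypothetical /etc/cron.daily/0logwatch: mail yesterday's report to root
# each night, overriding whatever Range/Detail are set in logwatch.conf
logwatch --range yesterday --detail med --mailto root
```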

Objective 3 Monitor Network Hosts with Nagios

To this point in this course, we have focused on monitoring a single server system. Monitoring multiple network hosts and devices in your network, on the other hand, can be a cumbersome, inefficient task unless you implement a centralized network monitoring system. One option is to purchase an expensive proprietary solution. However, this can also be done by implementing the Open Source product Nagios.

Nagios is a powerful monitoring system that helps you identify and resolve problems with your network hosts and infrastructure. If an issue is detected, Nagios can be configured to send alerts, allowing you to fix it before it becomes a serious problem.

NOTE: Nagios is an extensive product that can be challenging to implement. A full discussion of all the facets and features of Nagios is beyond the scope of this objective. The goal of this course is to help you complete a basic installation and configuration of Nagios so that you have the base skills and knowledge needed to explore the product in greater depth on your own.

In this objective, you learn how to complete a basic implementation of Nagios. The following topics are addressed:

How Nagios Works on page 384
Installing Nagios on page 390
Configuring the Nagios Server on page 397
Configuring Monitored Hosts on page 423
Use Nagios to Monitor Network Hosts on page 436

How Nagios Works

Nagios is an enterprise monitoring application that can be used to monitor hosts and devices in your network. Nagios is an Open Source project developed by Ethan Galstad.

NOTE: More information is available on the Nagios home page. Nagios stands for Nagios Ain't Gonna Insist On Sainthood. The program was originally released under the name NetSaint and was later renamed Nagios.
To implement and manage Nagios, you should be familiar with the following: Nagios Functionality on page 385 Nagios Add-Ons on page 385 Nagios Notifications on page 388 Nagios Plugins on page 388 Nagios Checks on page 389 Nagios Web Console on page

Nagios Functionality

Nagios can monitor just about any network host that can be accessed using the IP protocol, including:

Servers
Desktops
Network devices
Applications

Nagios can monitor network hosts running the following operating systems:

Microsoft Windows
UNIX
Linux
NetWare

With the proper configuration, Nagios can monitor network devices through firewalls, VPN tunnels, and SSH tunnels. It can also monitor devices securely over the Internet if your organization has remote sites. Accordingly, Nagios can be configured to use several different protocols to monitor hosts, including:

HTTP
SNMP
SSH

Because it is SNMP-enabled, Nagios can be configured to generate alerts using information from existing SNMP management applications, such as HP OpenView or OpenNMS, that you may have already deployed in your network.

Nagios is extensive and very flexible. It includes a large number of plugins that can run monitoring checks on a wide variety of network assets. In addition, Nagios also allows you to develop your own custom monitoring checks using C, Perl, or shell scripts.

Nagios can be deployed in a distributed implementation using multiple servers. For example, you can configure Nagios servers at remote sites to feed information to a central Nagios server at your organization's headquarters.

Nagios Add-Ons

Nagios functionality can be extended using several key add-ons that you need to be familiar with. These are listed below:

Nagios Remote Program Execution (NRPE): The NRPE add-on allows you to execute Nagios plugins on remote network systems and feed the information it

gathers back to your Nagios server, allowing you to monitor local resources on remote network hosts. The information can be securely transferred from the remote monitored system to the Nagios server using SSL.

The NRPE add-on is composed of three components:

The check_nrpe plugin that resides on the Nagios server
The nrpe daemon that runs on the remote system that is being monitored
The Nagios plugins that are installed on the remote system that is being monitored

The relationship between these components is shown in the figure below:

Figure 7-10 NRPE Components

The Nagios server uses the check_nrpe plugin to send a request to the nrpe daemon running on the remote system. The nrpe daemon then does the following:

Receives the request from the Nagios server
Runs the Nagios plugin specified by the server
Returns the data it gathers from the monitored system to the Nagios server

Nagios Service Check Acceptor (NSCA): The NSCA add-on allows you to submit passive service check results to another server running Nagios, effectively allowing Nagios to run in a distributed monitoring environment. The NSCA add-on is composed of daemon (server) and client software, as discussed below:

nsca. The nsca daemon runs on the central Nagios server. It is configured using the nsca.cfg configuration file.
send_nsca. The nsca client program runs on remote monitored hosts and sends data to the nsca daemon running on your central Nagios server. It is configured using the send_nsca.cfg configuration file.

The relationship between these components is shown in the figure below:
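The flow of both add-ons can be sketched with hypothetical commands. The host names, the check_disk thresholds, and the command name mapping are illustrative assumptions, not defaults:

```shell
# NRPE: on the monitored host, /etc/nrpe.cfg maps a command name to a
# local plugin invocation, for example:
#     command[check_disk]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10%
# On the Nagios server, check_nrpe asks the remote nrpe daemon to run it:
/usr/lib/nagios/plugins/check_nrpe -H da2.example.com -c check_disk

# NSCA: on a remote Nagios server, a passive check result
# (host, service, return code, plugin output - tab-separated)
# is piped to send_nsca, which forwards it to the central nsca daemon:
printf 'da2\tPING\t0\tPING OK\n' | send_nsca -H da1.example.com -c /etc/send_nsca.cfg
```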

Figure 7-11 nsca Components

NDOUtils. The NDOUtils add-on allows you to store Nagios data in a MySQL database. Starting with Nagios version 2, the ability to store data such as status, state history, and notification history in a database was removed. The NDOUtils add-on restores this functionality using a MySQL database. It also allows data from multiple Nagios servers to be stored in the same database. This is shown below:

Figure 7-12 Using NDOUtils

To use the NDOUtils add-on, you must be running Nagios server version 2.7 or later.

WARNING: This add-on was considered experimental as of the time this course was published.

Nagios Notifications

Once installed, the Nagios server watches the hosts you have configured it to monitor. If an issue of concern is found, it notifies you about the problem. Nagios allows you to customize which issues are important to you and which are not.

The Nagios web console displays the status of monitored hosts. It uses three colors to identify and categorize the severity of the information displayed:

Green: Normal
Yellow: Caution
Red: Critical

Nagios can be configured to send notifications when a problem is encountered. Notifications consist of a message containing the name of the affected host or service, its state, and any other pertinent information. They can be sent using a variety of external programs, for example via email, instant messaging, or BlackBerry. You can configure contact groups that should receive notifications about specific conditions.

Nagios also provides escalation management functionality. For example, you could configure Nagios to notify a front-line support technician when a problem occurs. If the problem isn't resolved within a specified period of time, the message could be escalated to a back-line support engineer.

Nagios Plugins

Nagios is modular. The Nagios server itself does not include any testing functionality. Instead, it runs external programs called plugins, which are scripts or binaries that are used to perform service and host checks.

Nagios includes several standard plugins that can be used to monitor and test some of the more commonly used hosts and services. They are located in the /usr/lib/nagios/plugins directory. You need to identify which tests you want to implement and then enable the associated plugins in Nagios.
If you have the necessary programming skills, you can access the source code for these plugins and modify it to fit your own specific needs. You can even develop your own plugins if you need to implement custom tests.

NOTE: You can download a wide variety of Nagios plugins that other administrators have developed from the Nagios Web site.

Nagios Checks

Nagios categorizes checks into two types:

Host Checks: Host checks verify that a remote system is reachable. This is usually done by pinging the host's IP address. Host checks are not the preferred type of test used by Nagios. They are performed only when necessary; for example, if none of the services on a particular host can be reached. Nagios can be configured with the network topology, allowing you to test the various routers and switches between the Nagios server and the remote host if it becomes unreachable.

When Nagios runs a host check, three possible results can be returned:

OK: Indicates the host is available and reachable.
DOWN: Indicates the host is unavailable.
UNREACHABLE: Indicates the host cannot be contacted, typically because a device between it and the Nagios server is down.

If a host test returns a result of DOWN, the result itself can be in one of two states:

Soft. Indicates the result is currently pending. This happens when the initial host test returns a result of DOWN. However, Nagios will continue to retry contacting the remote host several times before giving up. Until these retries are complete, the result will remain in the soft state. If one of the retry tests eventually returns an OK result, then a soft recovery is said to have occurred.

Hard. A hard state indicates that the initial host test has failed and that all of the subsequent retry tests have also failed. This indicates the host is definitely DOWN and notifications should be sent to the appropriate individuals.

Service Checks: Service checks are the preferred means for testing network hosts. A service check tests a specific network service and gathers information about it, such as running processes, associated CPU load, and so on. Basic service checks test whether a service's network port is open on the remote host and whether the service is actually listening on it. More extensive tests can be used to verify that the service is actually the correct service and to see how fast the service responds to queries.
Nagios can perform three different types of service checks:

Active Checks: Active checks are directly initiated by the Nagios server.

Indirect Checks: Indirect checks require the use of an intermediate agent, such as nrpe or nsca. Indirect checks can be used to monitor such things as:

Disk usage and processor load on remote systems

Services that reside behind a firewall
Time-sensitive services between remote hosts

Passive Checks: Passive checks are performed by an external application. The results of the test are passed to Nagios using the external commands file. Passive checks can be used in situations where Nagios can't access the remote system over the network, such as when monitored hosts are located behind a firewall. They also work in distributed monitoring implementations where you don't want to maintain a live connection at all times between the Nagios server and the monitored systems. Passive checks are also useful for monitoring asynchronous services that can't be actively checked reliably.

Nagios Web Console

Nagios uses a browser-based interface to display the information it has gathered. It can be used to view current events as well as to review past events discovered by Nagios. A sample is shown below:

Figure 7-13 The Nagios Web Console

Installing Nagios

Now that you understand how Nagios works, you are ready to install it. In this part of this objective, you learn how to do this. The following topics are addressed:

Planning a Nagios Implementation on page 390
Installing Nagios on page 392

Planning a Nagios Implementation

Before installing Nagios, you need to plan how it will be implemented in your network. The larger your site, the more critical this planning becomes. There are several key issues that you need to keep in mind.

First, you need to deploy your Nagios servers in a network location where they can access the hosts to be monitored using the IP protocol. If this isn't possible due to firewall or other security restrictions, you may need to deploy multiple Nagios servers in a distributed deployment. In this situation, you configure your remote Nagios servers to monitor the hosts they have access to and then send their information back to a central Nagios server. This is shown below:

Figure 7-14 Distributed Nagios Deployment

Implementing Nagios in this manner has several advantages. Key among these is the fact that the data from multiple Nagios servers is aggregated in one location, allowing you to monitor all your systems from a single web console. In addition, you have only one set of notifications to manage.

The disadvantage of this configuration is that your monitoring traffic is exposed on the wire as it is transferred to your central Nagios server. You should use encryption to secure the data transmissions if you choose this type of deployment.

In addition to manageability and security, you also need to take performance into consideration. Performance issues become most obvious when the hosts being monitored are connected over a slow or unreliable WAN link. Again, using a distributed configuration as discussed above can dramatically increase performance. Instead of running tests directly over the WAN link, a remote Nagios server can run tests on monitored hosts locally and then just upload the results to your central Nagios server over the WAN link.

Using a distributed deployment also dramatically reduces overhead. Each remote Nagios server is responsible for its own domain, while the centralized Nagios server simply aggregates data from the remote servers. This can dramatically increase the overall performance of Nagios.

Nagios can run on Intel, SPARC, Alpha, PowerPC, and Itanium platforms, but Intel is the most commonly used hardware and is the most supported platform. The system requirements depend on the number of hosts that will be monitored. For Nagios servers that will monitor fewer than 100 hosts, the minimum requirements are as follows:

A single 800 MHz CPU
512 MB RAM
5 GB of free disk space

If your Nagios server will monitor more hosts, you need to increase your server hardware, depending on the number of hosts you will monitor. For 1000 hosts, you should consider the following minimums:

Multiple 3 GHz CPUs
2 GB RAM
40 GB of free disk space

Regardless of the size of your deployment, you need to have a C compiler installed on the Nagios server. You will also need Apache installed to use the Web console.

Installing Nagios

Now that you have your deployment plan in place for Nagios, you're ready to start the installation of the product. You have two options for doing this:

Download and install Nagios from the Nagios Web site.
Install Nagios from your SLES 11 installation media.

The choice you make is significant because the version of Nagios included on the SLES 11 media has been customized to run on this Linux distribution, and there are some differences. Key among these are the locations where the Nagios system and configuration files are stored. The Novell version of Nagios has changed the Nagios directories and files to make them more compliant with the Filesystem Hierarchy Standard (FHS). In this course, we will focus on using the version of Nagios that is included with SLES 11.
To install Nagios, use YaST to install the following packages:

nagios: This package installs the Nagios server daemon.
nagios-nsca (optional): This package installs the NSCA daemon.
nagios-nsca-client (optional): This package installs the NSCA client software.
nagios-plugins: This package installs the Nagios plugins that are used to run checks.
nagios-plugins-extras (optional): This package installs additional, less commonly used Nagios plugins.
nagios-www: Provides the Web console interface used by Nagios.

Once the Nagios installation is complete, restart the Apache web server to enable any Web server configuration changes that were made when Nagios was installed by entering rcapache2 restart at the shell prompt (as root). Make sure the Apache Web server starts automatically by entering chkconfig apache2 on.

During the installation of Nagios, a new system user and a new system group, both named nagios, are created on your server. These new accounts are shown below:

Figure 7-15 The nagios User and Group

The Nagios daemon will run as this user on the system.
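If you prefer the command line to YaST, the same installation can be sketched with zypper (assuming the packages are available in your configured SLES 11 repositories):

```shell
# Install the core packages plus the Web console, then activate Apache
# (run as root; add the optional nsca packages if you need them)
zypper install nagios nagios-plugins nagios-www
rcapache2 restart
chkconfig apache2 on
```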

During the installation process, the following directories and files are created on your Nagios server:

/etc/nagios. Contains the following Nagios server configuration files:

DA1:/etc/nagios # ls -R
.:
cgi.cfg  command.cfg  nagios.cfg  objects  resource.cfg

./objects:
commands.cfg  localhost.cfg  switch.cfg     timeperiods.cfg
contacts.cfg  printer.cfg    templates.cfg  windows.cfg

/usr/lib/nagios/cgi. Contains the following Nagios CGI files:

DA1:/usr/lib/nagios/cgi # ls
avail.cgi      history.cgi        statusmap.cgi  traceroute.cgi
cmd.cgi        notifications.cgi  statuswml.cgi  trends.cgi
config.cgi     outages.cgi        statuswrl.cgi
extinfo.cgi    showlog.cgi        summary.cgi
histogram.cgi  status.cgi         tac.cgi

/usr/lib/nagios/plugins. Contains the following Nagios plugin files:

DA1:/usr/lib/nagios/plugins # ls
check_breeze    check_ide_smart      check_ntp       check_tcp
check_by_ssh    check_ifoperstatus   check_ntp_peer  check_time
check_clamd     check_ifstatus       check_ntp_time  check_udp
check_cluster   check_imap           check_nwstat    check_ups
check_dhcp      check_ircd           check_oracle    check_users
check_dig       check_linux_raid.pl  check_overcr    check_wave
check_disk      check_load           check_ping      check_xenvm
check_disk_smb  check_log            check_pop       eventhandlers
check_dns       check_mailq          check_procs     negate
check_dummy     check_mrtg           check_real      urlize
check_file_age  check_mrtgtraf       check_rpc       utils.pm
check_flexlm    check_nagios         check_sensors   utils.sh
check_ftp       check_netapp.pl      check_smtp
check_http      check_nntp           check_ssh
check_icmp      check_nt             check_swap

After the initial installation of Nagios, it's a good idea to verify that your plugins were installed correctly. This can be done by running the plugin executable from the shell prompt against the local system. Complete the following:

1. Open a terminal session on the Nagios server.
2. Switch to root using the su - command followed by your root user's password.
3. At the shell prompt, enter /usr/lib/nagios/plugins/plugin_name -H localhost.

For example, if the SSH daemon is configured and running on your server, you could run the check_ssh plugin against your local system. An example is shown below:

DA1:~ # /usr/lib/nagios/plugins/check_ssh -H localhost
SSH OK - OpenSSH_5.1 (protocol 2.0)

The output from this command tells you two things. First, the SSH daemon is running on the server and responding to queries. Second, it tells you that the Nagios plugin that discovered the SSH daemon is working correctly.

/usr/sbin/nagios. The Nagios server daemon binary file.
/usr/sbin/nagiostats. The Nagios statistics utility binary file.
/usr/share/nagios/docs. Contains the Nagios HTML documentation files.
/var/log/nagios. Contains the Nagios log files.
/var/log/nagios/archives. Contains the archived Nagios log files.
/var/spool/nagios. Contains external commands.

After installing Nagios or after making changes to any of its configuration files, it's a good idea to run the Nagios binary in verification mode to ensure there are no errors in its configuration. Enter the following command at the shell prompt (as root):

/usr/sbin/nagios -v name_of_config_file

For example:

/usr/sbin/nagios -v /etc/nagios/nagios.cfg

Nagios checks its configuration file and reports any errors found. In the example below, no errors were encountered when this command was run:

DA1:~ # /usr/sbin/nagios -v /etc/nagios/nagios.cfg

Nagios
Copyright (c) Ethan Galstad (
Last Modified:
License: GPL

Reading configuration data...

Running pre-flight check on configuration data...

Checking services...
        Checked 8 services.
Checking hosts...
        Checked 1 hosts.
Checking host groups...
        Checked 1 host groups.
Checking service groups...
        Checked 0 service groups.
Checking contacts...
        Checked 1 contacts.
Checking contact groups...
        Checked 1 contact groups.
Checking service escalations...
        Checked 0 service escalations.
Checking service dependencies...
        Checked 0 service dependencies.
Checking host escalations...
        Checked 0 host escalations.
Checking host dependencies...
        Checked 0 host dependencies.
Checking commands...
        Checked 24 commands.
Checking time periods...
        Checked 5 time periods.
Checking for circular paths between hosts...
Checking for circular host and service dependencies...
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 0
Total Errors: 0

Things look okay - No serious problems were detected during the pre-flight check

Nagios can be run either as a user process or as a daemon. To run Nagios as a user process, enter the following command at the shell prompt:

/usr/sbin/nagios name_of_config_file

For example:

/usr/sbin/nagios /etc/nagios/nagios.cfg

To run Nagios as a daemon, you first need to register the Nagios daemon with the init system and configure it to run at its default runlevels by entering insserv nagios at the shell prompt (as root). The Nagios daemon is configured to start in runlevels 3 and 5 by default.

Once done, you can manage the daemon using the /etc/init.d/nagios init script (or its accompanying rc script, rcnagios):

/etc/init.d/nagios start. Starts the Nagios daemon.
/etc/init.d/nagios stop. Stops the Nagios daemon.
/etc/init.d/nagios restart. Restarts the Nagios daemon.
/etc/init.d/nagios reload. Reloads the Nagios configuration without restarting the daemon.
/etc/init.d/nagios status. Displays the status of the Nagios daemon.
/etc/init.d/nagios check. Checks the configuration for syntax errors.

Configuring the Nagios Server

After Nagios has been installed on your SLES 11 server, you need to configure it. In this part of this objective, you learn how to do this.

Be aware that Nagios is a very powerful and very flexible product. Because of this, configuring Nagios is quite complex and can be daunting to new users. We recommend that you begin by designing and implementing a basic configuration first. Once you have it working correctly and performing the way you want, you can then build on this configuration by adding custom monitoring parameters.

The first thing you need to do is configure the Nagios server daemon. You need to be familiar with the following:

Configuring Apache to Support Nagios on page 397
Testing the Nagios Server on page 399
Nagios Terminology on page 404
Configuring Nagios Monitoring on the Server on page 404

Configuring Apache to Support Nagios

The first thing you need to do is configure the Apache Web server to support Nagios. Do the following:

1. As root, open the /etc/apache2/conf.d/nagios.conf file in a text editor on the Nagios server.
2.
Verify that the directives listed in bold below have been added to the file:

<Directory "/usr/lib/nagios/cgi">
   # SSLRequireSSL
   Options ExecCGI
   AllowOverride None
   Order allow,deny
   Allow from all

   # Order deny,allow
   # Deny from all
   # Allow from
   AuthName "Nagios Access"
   AuthType Basic
   AuthUserFile /etc/nagios/htpasswd.users
   Require valid-user
</Directory>

Alias /nagios "/usr/share/nagios"

<Directory "/usr/share/nagios">
   # SSLRequireSSL
   Options None
   AllowOverride None
   Order allow,deny
   Allow from all
   # Order deny,allow
   # Deny from all
   # Allow from
   AuthName "Nagios Access"
   AuthType Basic
   AuthUserFile /etc/nagios/htpasswd.users
   Require valid-user
</Directory>

WARNING: Notice in the example above that Apache uses Basic authentication by default. Basic authentication sends your username and password in clear text with every HTTP request. For obvious security reasons, you should consider implementing a more secure method of authentication, such as Digest, which transmits an MD5 hash of your authentication credentials instead. For more information, see the Nagios documentation.

3. Close the file and exit the text editor.

4. Create the /etc/nagios/htpasswd.users file and add the nagiosadmin user account to it by entering the following command at the shell prompt (as root):

htpasswd2 -c /etc/nagios/htpasswd.users nagiosadmin

Enter a password of your choosing for the user when prompted. An example is shown below:

DA1:~ # htpasswd2 -c /etc/nagios/htpasswd.users nagiosadmin
New password:
Re-type new password:
Adding password for user nagiosadmin

5. Restart Apache to apply the changes by entering rcapache2 restart at the shell prompt (as root).
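Switching to Digest authentication can be sketched as follows. This assumes Apache's mod_auth_digest module is enabled, that an htdigest2 utility is available (following the SUSE naming convention of htpasswd2; it is htdigest on some distributions), and that the realm string matches the AuthName used above; the password file path is an example:

```shell
# Create a digest password file for the "Nagios Access" realm
htdigest2 -c /etc/nagios/htdigest.users "Nagios Access" nagiosadmin
# Then, in /etc/apache2/conf.d/nagios.conf, replace the Basic
# authentication directives in each Directory block with:
#   AuthType Digest
#   AuthDigestProvider file
#   AuthUserFile /etc/nagios/htdigest.users
rcapache2 restart
```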

Testing the Nagios Server

At this point, you can perform a quick test to verify that Nagios is running properly by opening a Web browser on the server desktop and accessing http://localhost/nagios. You should be prompted for a username and password. Enter the following:

Username: nagiosadmin
Password: your_nagios_admin_password

The Nagios monitoring home page should be displayed, as shown below:

Figure 7-16 The Nagios Monitoring Home Page

At this point, Nagios is monitoring default parameters on the local server where it is running. If you select Host Detail on the left, you should see localhost listed as the only monitored host, as shown below:

Figure 7-17 Monitoring the Local Server

As you can see in the example above, Nagios uses a host check to verify that the host is up and reachable. If you select the localhost link, you can view detailed host information, as shown below:

Figure 7-18 Viewing Extended Host Information

If you select Service Detail on the left, you can see a list of services that Nagios has been configured to monitor by default on the localhost. These are shown below:

Figure 7-19 Viewing Service Details

As you can see in the figure above, the following parameters are monitored by default:

Current system load
Current number of authenticated users
Apache Web server (HTTP)
PING performance
Free space on the root partition
SSH daemon
Swap partition usage
Total number of processes

NOTE: The /etc/nagios/objects/localhost.cfg file is used to define what is monitored on the local system where the Nagios daemon is running.

You can view detailed information about a particular parameter by selecting the appropriate link. For example, if you select HTTP, you will see information similar to the following:

402 Novell Training Services (en) 15 April 2009 SUSE Linux Enterprise Server 11: Certified Linux Engineer 11 / Manual Figure 7-20 Viewing Service Details You can view a list of issues by selecting Service Problems on the left. When you do, a list of current problems is displayed. In the figure below, no problems have been cataloged: Figure 7-21 Listing Current Service Problems You can view information about the Nagios service itself by selecting Process Info on the left. A page similar to that shown below is displayed: 402

403 Figure 7-22 Viewing Nagios Process Information Monitor Server Health Novell Training Services (en) 15 April 2009 You can view Nagios server performance information by selecting Performance Info on the left. When you do, a screen similar to the following is displayed: Figure 7-23 Viewing Nagios Server Performance Information As you can see in the figure above, statistics for active and passive host and service tests are displayed. 403

Nagios Terminology

There are several terms used in the Nagios Web console pages shown above that you need to be familiar with before going any farther. These are listed below:

Flapping: Flapping occurs when a monitored host or service on a host rapidly changes states (such as coming up and going down repeatedly). Generally speaking, this indicates that the host or service is experiencing a problem. However, even though flapping can indicate that something is wrong with the host or service, you probably don't want to receive a flurry of notifications each time the state changes. Fortunately, Nagios can be configured to detect flapping, which prevents Nagios from sending out an excessive number of notifications. Nagios first logs a flapping message; then it adds a comment for the host or service. Subsequent notifications are then suppressed.

State Stalking: State stalking allows you to record Nagios check results even if no state change occurred. You can configure state stalking for all states or just a specified selection of states. State stalking has many advantages. Key among these is that it provides enhanced logging that can be very useful when troubleshooting problems and tracking issues.

Obsession: Obsession is used when you want to run commands for every check of a host or service and return all results, whether good or bad. Obsession is usually only used in a distributed monitoring deployment to send the results of every check to a central aggregating server.

Configuring Nagios Monitoring on the Server

At this point, your Nagios server is up and running, but it's only monitoring itself. The real power of Nagios is its ability to monitor multiple hosts and devices on your network. To do this, you first need to configure the Nagios server daemon to monitor hosts other than itself.
To accomplish this, you need to be familiar with the following concepts and tasks: Nagios Configuration Components on page 405 Setting Up the Nagios Configuration Files on page 407 Configuring Hosts on page 408 Configuring Services on page 411 Configuring Contacts on page 415 Configuring Time Periods on page 417 Configuring Commands on page
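The flapping, state stalking, and obsession behaviors described in the terminology above are enabled per object with a handful of directives. The following is a hedged sketch of a service definition fragment; the directive names come from the standard Nagios 3 object reference, but the threshold percentages are purely illustrative:

```cfg
define service{
    ; (normal service directives go here)
    flap_detection_enabled 1      ; detect rapid state changes
    low_flap_threshold     5.0    ; % state change below which flapping ends
    high_flap_threshold    20.0   ; % state change above which flapping starts
    stalking_options       w,c    ; log WARNING/CRITICAL results even without a state change
    obsess_over_service    1      ; run the obsession command after every check
    }
```

Flap detection and obsession must also be enabled globally in /etc/nagios/nagios.cfg for these per-object directives to take effect.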

405 Nagios Configuration Components Before configuring Nagios, you need to be familiar with several key Nagios configuration concepts. Consider the following: Monitor Server Health Objects. Nagios is configured by defining objects, which represent the services and hosts in your environment that you want to monitor. You begin by defining your hosts. Then you define the services associated with each host. These services are the attributes and functions of each host that you want to monitor, including applications, such as the SSH daemon, or server attributes, such as disk space usage. You can define the following object types: host. Used to define physical devices on the network such as servers, routers, firewalls, and so on. hostgroup. Used to group similar hosts together. For example, you could group hosts based on host type (such as all servers), or you could group hosts together based on their physical location (such as Building A, 1st Floor). service. Used to define the service running on a host that you want to monitor. Service objects can represent actual services, such as NTP or SSH, or server attributes, such as free disk space or swap partition utilization. servicegroup. Used to group similar services together. contact. Used to define individuals who should receive notifications. It is also used to define escalation contacts. contactgroup. Used to define a group of contacts. Novell Training Services (en) 15 April 2009 NOTE: All contacts must be a member of a contactgroup! timeperiod. Used to define time periods used by your organization, such as a specific work shift, weekends or general business hours. command. Used to specify the plugin that will be run by the check process to monitor a host. servicedependency. Used to define the services a specific service is dependent upon. serviceescalation. Used to define a service notification escalation process. hostdependency. Used to specify that a particular host is dependent upon other hosts. hostescalation. 
Used to define a host notification escalation process. hostextinfo. Used to customize the way hosts are displayed in the Web console. serviceextinfo. Used to customize the way services are displayed in the web console. 405

Contacts. Contacts are the individuals in your organization who manage the objects defined above. A contact includes information about the method to be used to contact these individuals. In addition, contacts define what users can see in the Nagios Web console. In order for your user account to see information about a particular object, you must be defined as a contact for it.

Groups. Groups allow similar objects to be grouped together for easier management. You can define the following three types of groups:

Host Group. Allows multiple hosts to be displayed together in the Web console.
Service Group. Allows multiple services to be displayed together in the Web console.
Contact Group. Allows multiple contacts to be grouped together for sending notifications.

Checks. Checks define mechanisms for monitoring hosts and services. Check commands are generally executable plugin files that check a service or host and return the status to the Nagios server. The server analyzes the status, and if the results of the check vary from the parameters you've configured, then a status change has occurred. For example, if the free disk space on the root partition drops below a threshold you configure, the state changes to Warning. You can use the following standard Nagios plugins to check various attributes of a monitored host:

check_disk. Monitors the disk space available on a storage device.
check_file_age. Checks the age and size of a file.
check_load. Monitors host load statistics.
check_log. Checks a log file for a specified entry.
check_mailq. Checks the number of items in mail queues for MTAs.
check_procs. Monitors the number of processes or checks the status of a specified process.
check_swap. Monitors swap partition utilization.

In addition to server parameters, you can also run checks on network services on monitored hosts.
Nagios plugins have been created for commonly used network services such as NTP, LDAP, SSH, FTP, ICMP, and SMTP. These checks connect to the network service on the host and return a status of either OK or CRITICAL. Most of these plugins can also return other monitoring data, including:

A banner from the service
Performance statistics
Response times
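The status a plugin reports is conveyed through its exit code: 0 (OK), 1 (WARNING), 2 (CRITICAL), or 3 (UNKNOWN), plus a single line of status text on standard output. The following minimal shell sketch illustrates that contract; the function name, metric, and thresholds are invented for illustration and this is not one of the shipped plugins:

```shell
#!/bin/sh
# Illustrative sketch of the plugin contract Nagios relies on:
# print one status line, then exit 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN.
# The name "check_fake_metric" and the thresholds are invented.
check_fake_metric() {
  value=$1 warn=$2 crit=$3
  if [ "$value" -ge "$crit" ]; then
    echo "FAKE CRITICAL - value=$value"; return 2
  elif [ "$value" -ge "$warn" ]; then
    echo "FAKE WARNING - value=$value"; return 1
  else
    echo "FAKE OK - value=$value"; return 0
  fi
}

check_fake_metric 3 5 10   # prints "FAKE OK - value=3" and returns 0
```

Nagios only looks at the exit code to determine the state; the text on standard output is what you see in the Web console's status column.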

You can use the following Nagios plugins to monitor network services:

check_ssh. Monitors the SSH daemon.
check_dhcp. Checks to see if the DHCP daemon is up and offering address leases.
check_dns. Checks to see if the DNS daemon is up and resolving host names into IP addresses.
check_http. Checks the status of an HTTP server daemon.
check_imap. Checks the status of an IMAP server daemon.
check_ldap. Checks the status of an LDAP server daemon.
check_ntp. Checks the status of the NTP server daemon.
check_pop. Checks the status of a POP3 server daemon.
check_rpc. Monitors RPC services.
check_smtp. Checks the status of an SMTP server daemon.

Notifications. A state change usually causes a notification to be sent to one or more specified contact objects.

Setting Up the Nagios Configuration Files

One of the first tasks you must complete is to define where your Nagios configuration files are located. This is done by editing the /etc/nagios/nagios.cfg file and entering the full path to each configuration file in a cfg_file directive. The default configuration is shown in bold below:

# OBJECT CONFIGURATION FILE(S)
# These are the object configuration files in which you define hosts,
# host groups, contacts, contact groups, services, etc.
# You can split your object definitions across several config files
# if you wish (as shown below), or keep them all in a single config file.

# You can specify individual object config files as shown below:
cfg_file=/etc/nagios/objects/commands.cfg
cfg_file=/etc/nagios/objects/contacts.cfg
cfg_file=/etc/nagios/objects/timeperiods.cfg
cfg_file=/etc/nagios/objects/templates.cfg

# Definitions for monitoring the local (Linux) host
cfg_file=/etc/nagios/objects/localhost.cfg

A second option is to use the cfg_dir directive in the /etc/nagios/nagios.cfg file to specify that all files with the .cfg extension in a specified directory be used. Examples are shown in bold below:

408 Novell Training Services (en) 15 April 2009 SUSE Linux Enterprise Server 11: Certified Linux Engineer 11 / Manual # You can also tell Nagios to process all config files (with a.cfg # extension) in a particular directory by using the cfg_dir # directive as shown below: cfg_dir=/etc/nagios/servers cfg_dir=/etc/nagios/printers cfg_dir=/etc/nagios/switches cfg_dir=/etc/nagios/routers How you organize your configuration object definitions depends on your environment s specific requirements. One option is to create separate configuration files for each object type. For small deployments or test systems, you can simply define all of your objects in a single file, such as /etc/nagios/objects.cfg. For larger deployments, you could create separate files for each type of object. For example, you could create a file named /etc/nagios/hosts.cfg for your host objects and a file named /etc/nagios/services.cfg for your service objects. For geographically dispersed deployments, you could create a separate file for each location that contains all object definitions for that site. For example, you could create a file named /etc/nagios/provo.cfg for the Provo, UT campus and / etc/nagios/waltham.cfg for the Waltham, MA campus. Whichever strategy you choose to use, be sure to configure the appropriate directives in the /etc/nagios/nagios.cfg file. Objects are defined in these configuration files by specifying a set of directives. These directives could be unique to a specific type of object or could be generically applied to many objects. Some directives are mandatory and will cause the Nagios daemon to fail to start if they are missing or not constructed correctly. Directives in your configuration files are defined using the define directive. The syntax for this directive is shown below: define object_type{ directive_1 parameters directive_2 parameters } Configuring Hosts The next step is to configure your host objects in the configuration file(s) you defined above. 
As discussed earlier, host objects are used to represent physical devices on your network, such as servers, desktops, routers, firewalls, and switches. Host objects are associated with their corresponding physical devices using one of two parameters: IP address (most commonly used) MAC address The syntax for defining a host object is shown below: 408

409 define host{ directive_1 parameters directive_2 parameters } The following directives are mandatory when defining a host object: host_name. Defines a short name for the host. alias. Defines a longer, more descriptive name for the host. address. Specifies the IP address (or MAC address) of the host. Monitor Server Health check_period. Specifies the time period during which active checks can be run against this host. max_check_attempts. Specifies the number of times Nagios will retry the host check command if it returns any state other than OK. contact_groups. Defines who will be contacted if there is a state change. notification_interval. Specifies the notification interval in minutes. notification_period. Specifies the time period when notifications can be sent. notification_options. Defines state changes that will trigger a notification to be sent. You can specify the following values: d: Send notifications on a DOWN state. u: Send notifications on an UNREACHABLE state. r: Send notifications on recoveries (OK state). f: Send notifications when the host starts and stops flapping. s: Send notifications when scheduled downtime starts and ends. n: Do not send any notifications. If you do not specify any notification options, Nagios will assume that you want notifications sent on all states. You can specify multiple values by separating them with a comma. Novell Training Services (en) 15 April 2009 NOTE: Additional directives can also be used to further customize your Nagios object definitions. You can learn more about these directives in the Nagios documentation available at ( A sample host object definition is shown below: define host{ host_name da1.digitalairlines.com alias Course 3107 DNS server address check_period 24x7 max_check_attempts 1 409

410 Novell Training Services (en) 15 April 2009 SUSE Linux Enterprise Server 11: Certified Linux Engineer 11 / Manual } contact_groups admins notification_interval 30 notification_period 24x7 notification_options d,u,r Within a host definition, you can also specify that an existing template be used to reduce the amount of configuration work you must do manually. This is done using the use directive. An example is shown below: use linux-server ; Name of host template to use ; This host definition will inherit all variables that are defined ; in (or inherited by) the linux-server host template definition. As you can see in the example above, this directive causes the current host definition to inherit all variables that are defined in the linux-server host template definition. By default, Nagios provides several predefined templates that you can use in the / etc/nagios/objects/templates.cfg file, which is where the linux-server template is defined. This is shown below: # Linux host definition template - This is NOT a real host, just a template! define host{ name linux-server use generic-host check_period 24x7 check_interval 5 retry_interval 1 max_check_attempts 10 check_command check-host-alive notification_period workhours notification_interval 120 notification_options d,u,r contact_groups admins register 0 } As you can see above, the linux-server template itself inherits directives from the generic-host template, which is also defined in the /etc/nagios/objects/ templates.cfg file. NOTE: The /etc/nagios/objects/templates.cfg file also contains additional server templates (such as for Windows servers) as well as templates for other types of objects, including contacts and services. You can also define hostgroup objects. A hostgroup can be used to group one or more host objects together for simplifying configuration and management of hosts. The syntax for defining a hostgroup object is shown below: define hostgroup{ directive_1 parameters directive_2 parameters } 410

411 You can use the following directives in a hostgroup definition: Monitor Server Health hostgroup_name (required). Defines a short name that is used to identify the hostgroup. alias (required). Defines a description used to identify the hostgroup. members. Specifies a list of hosts that are members of this hostgroup, using host_name values from the host object definitions. You can specify multiple hosts by separating their names with commas. hostgroup_members: Defines a list of other hostgroups whose members should be included in this group. A sample hostgroup object definition is shown below: Novell Training Services (en) 15 April 2009 define hostgroup{ hostgroup_name alias members } 3107Servers Servers used in the 3107 course da1.digitalairlines.com Configuring Services Next, you need to define your service objects in your Nagios configuration file(s). Service objects can represent actual services, such as an NTP or SSH daemon, or a server metric, such as free disk space or swap partition utilization. They can also represent databases, applications, or log files. These services can reside on the local server or on remote hosts, as defined by your host objects. Service definitions are designed to check the associated service on a regular schedule. The following process occurs: 1. A service check is scheduled. 2. The check is run on the service. 3. The results are analyzed. 4. The state of the service is determined. If the state is OK, then the service state is updated and the next check cycle is scheduled. If it is something other than OK, the service is considered to be in a soft non-ok state. In this situation, the service is rechecked a specified number of times. If it recovers, then Nagios returns to normal checking. If the service does not recover, the state changes to a hard non-ok state and a notification is sent, if configured to do so. Nagios assigns services to a series of states that are different than those used for hosts. 
Services use one normal state and three error states: OK. Normal status state. WARNING. Set when a service reaches a configured threshold, for example when disk space utilization reaches 80%. 411

CRITICAL. Set when a service exceeds an increased threshold, for example when the disk space utilization exceeds 90%.
UNKNOWN. Set when the state cannot be determined.

The syntax for defining a service object is shown below:

define service{
directive_1 parameters
directive_2 parameters
}

The following directives are mandatory when defining a service object:

service_description. Specifies a description of the service. The value of this directive can contain spaces, dashes, and colons, but semicolons, apostrophes, and quotation marks should not be used. Be aware that services are uniquely identified by the combination of the host_name and service_description directives. Therefore, the value of this directive must be unique.
host_name. Specifies the short name(s) of the host(s) that the service is associated with. Use the value defined in the host_name directive in the host object definitions. You can specify multiple hosts by separating their names with commas.
check_command. Specifies the command to be run to check the status of the service.
max_check_attempts. Defines the number of times that Nagios will retry the check command if it returns any state other than OK.
check_interval. Specifies how long to wait before scheduling the next normal check of the service.
retry_interval. Defines how long to wait before rechecking the service. Retry checks happen when the service changes to a state other than the OK state.
check_period. Specifies when active checks of this service can be made.
notification_interval. Defines how long to wait (in minutes) before re-notifying a contact that this service remains in a state other than the OK state.
notification_period. Specifies the time period during which notifications for this service can be sent.
notification_options. Defines the service states for which notifications should be sent.
You can use one or more of the following values (separated by commas): w: Send notifications on a WARNING state. u: Send notifications on an UNKNOWN state. c: Send notifications on a CRITICAL state. r: Send notifications on recoveries (OK state). f: Send notifications when the service starts or stops flapping. 412

413 s: Send notifications when scheduled downtime starts and ends. n: No service notifications will be sent. Monitor Server Health contact_groups. Lists the contact groups to whom notifications should be sent whenever issues occur with this service. Multiple contact groups can be specified separated by commas. NOTE: You must specify at least one contact or contact group in each service object definition. You need to define one service object for each service or metric you want to monitor. For example, if you want to check disk utilization on the first partition on a host s SCSI hard drive, you could define the following service object: Novell Training Services (en) 15 April 2009 define service{ host_name da1.digitalairlines.com service_description Check root partition check_command check_disk!/dev/sda1 max_check_attempts 5 check_interval 5 retry_interval 3 check_period 24x7 notification_interval 30 notification_period 24x7 notification_options w,c,r contact_groups admins } Likewise, a service object that monitors the DNS service on a host could be defined as shown below: define service{ host_name da1.digitalairlines.com service_description DNS check_command check_dns max_check_attempts 3 check_interval 5 retry_interval 1 check_period 24x7 notification_interval 60 notification_period 24x7 notification_options w,c,r contact_groups DNS-Admins } Nagios allows you to include event handlers in your service object definitions. Event handlers run every time a service changes states. To do this, you need to add the following directives: event_handler_enabled. Determines whether event handling for the service is enabled. Specify a value of 0 to disable service event handler. Specify a value of 1 to enable service event handler. 413

414 Novell Training Services (en) 15 April 2009 SUSE Linux Enterprise Server 11: Certified Linux Engineer 11 / Manual event_handler. Specifies the command that should run whenever the service s state changes. event_handler_timeout (optional). Specifies the maximum amount of time that the event handler command can run. An example of using an event handler to automatically restart the DNS service if its state changes is shown below: define service{ host_name da1.digitalairlines.com service_description DNS check_command check_dns max_check_attempts 3 check_interval 5 retry_interval 1 check_period 24x7 notification_interval 60 notification_period 24x7 notification_options w,c,r contact_groups DNS-Admins event_handler_enabled 1 event_handler server-restart } NOTE: See the Nagios online documentation at ( for more information on event handlers and how to write scripts for them. In addition to service objects, you can also define servicegroup objects. A servicegroup definition is used to group one or more services together to simplify configuration and administration tasks. The syntax for defining a servicegroup object is shown below: define servicegroup{ directive_1 parameters directive_2 parameters } Some of the most commonly used directives for defining a servicegroup are listed below: servicegroup_name (required). Defines a name for the service group. alias (required). Defines a description that is used to identify the service group. members: Lists the names of the services and their corresponding hosts that are members of the group. The syntax for this directive requires you to specify a host name before the service name, as shown below: members=host_name,service_name 414

415 Monitor Server Health You can specify multiple host and service names by separating them with commas. servicegroup_members. Defines a list of other servicegroups whose members should be included in this group. A sample servicegroup object definition is shown below: define servicegroup{ servicegroup_name alias members } MailServers Mail servers used in the 3107 course da1.digitalairlines.com Novell Training Services (en) 15 April 2009 Configuring Contacts Next you need to define your contact objects. As we ve discussed previously, contacts are the individuals in your organization who manage the host and service objects defined previously. They are the persons to whom notifications are sent. A contact includes information about the method to be used to contact these individuals. Contacts also control what information users can view in the Nagios Web console. The syntax for defining a contact object is shown below: define contact{ directive_1 parameters directive_2 parameters } NOTE: By default, your contacts are defined in the /etc/nagios/objects/contacts.cfg file. The following directives are mandatory when defining a contact object: contact_name. Defines a name for the contact which will be referenced in your contact group definitions. host_notifications_enabled. Determines whether the contact will receive notifications about host-related events. Specify 0 to disable host notifications or 1 to enable host notifications. service_notifications_enabled. Determines whether the contact will receive notifications about service-related events. Specify 0 to disable service notifications or 1 to enable service notifications. host_notification_period. Specifies the time period during which the contact can be notified about host-related events. service_notification_period. Specifies the time period during which the contact can be notified about service-related events. 415

host_notification_options. Defines the host-related states for which notifications should be sent to the contact. You can specify one or more of the following values (separated by commas):

d: Send on DOWN host states.
u: Send on UNREACHABLE host states.
r: Send on UP host recovery states.
f: Send when the host starts or stops flapping.
s: Send when scheduled downtime starts and ends.
n: Do not send any host notifications.

service_notification_options. Defines the service-related states for which notifications should be sent to the contact. You can specify one or more of the following values (separated by commas):

w: Send on WARNING service states.
u: Send on UNKNOWN service states.
c: Send on CRITICAL service states.
r: Send on OK service recovery states.
f: Send when the service starts or stops flapping.
n: Do not send any service notifications.

host_notification_commands. Lists the command(s) used to notify the contact of a host-related event. You can specify multiple commands by separating them with commas.
service_notification_commands. Lists the command(s) used to notify the contact of a service-related event. You can specify multiple commands by separating them with commas.

NOTE: Notification commands are defined for you by default in the /etc/nagios/commands.cfg file.

A sample contact object is shown below, where the rtracy user is defined and configured to be notified via email:

define contact{
contact_name rtracy
alias Robb Tracy
host_notifications_enabled 1
service_notifications_enabled 1
service_notification_period 24x7
host_notification_period 24x7
service_notification_options w,u,c,r
host_notification_options d,u,r

service_notification_commands notify-by-email
host_notification_commands host-notify-by-email
email address
}

The alias directive in the example above is used to define a longer name or description for the contact. The email directive is used to specify an email address for the contact. You could also add the pager directive to define a pager number for the contact. The addressx directive is used to define additional addresses for the contact, such as phone numbers, IM addresses, and so on. You can define up to six additional addresses using this directive for each contact.

You can also insert the contactgroups directive into your contact definitions, which allows you to specify the name of the contactgroup(s) the contact is a member of. A contactgroup is used to group one or more contacts together to simplify configuration and administration. The syntax for defining a contactgroup object is shown below:

define contactgroup{
directive_1 parameters
directive_2 parameters
}

Some of the most commonly used directives for defining a contactgroup are listed below:

contactgroup_name (required). Defines a name for the contactgroup.
alias (required). Defines a description that is used to identify the contactgroup.
members. Lists the names of the contacts that are members of the group. You can specify multiple contact names by separating them with commas.
contactgroup_members. Defines a list of other contactgroups whose members should be included in this group.

A sample contactgroup object definition is shown below:

define contactgroup{
contactgroup_name DNS-Admins
alias DNS Server Administrators
members rtracy
}

Configuring Time Periods

A time period is a list of days and times that are considered to be valid for sending notifications and conducting service checks. Nagios comes preconfigured with several commonly used time periods already defined for you.
These are located in the /etc/nagios/objects/timeperiods.cfg file by default. 417

418 Novell Training Services (en) 15 April 2009 SUSE Linux Enterprise Server 11: Certified Linux Engineer 11 / Manual The syntax for defining time periods is a little different from other Nagios configuration objects: define timeperiod{ directive_1 parameters directive_2 parameters day_of_the_week timeranges } Some of the default time periods defined in the /etc/nagios/objects/ timeperiods.cfg file are shown below: # This defines a timeperiod where all times are valid for checks, # notifications, etc. The classic "24x7" support nightmare. :-) define timeperiod{ timeperiod_name 24x7 alias 24 Hours A Day, 7 Days A Week sunday 00:00-24:00 monday 00:00-24:00 tuesday 00:00-24:00 wednesday 00:00-24:00 thursday 00:00-24:00 friday 00:00-24:00 saturday 00:00-24:00 } # 'workhours' timeperiod definition define timeperiod{ timeperiod_name workhours alias Normal Work Hours monday 09:00-17:00 tuesday 09:00-17:00 wednesday 09:00-17:00 thursday 09:00-17:00 friday 09:00-17:00 } # 'none' timeperiod definition define timeperiod{ timeperiod_name none alias No Time Is A Good Time } # Some U.S. holidays # Note: The timeranges for each holiday are meant to *exclude* the #holidays from being treated as a valid time for notifications, etc. #You probably don't want your pager going off on New Year's. Although #you're employer might... :-) define timeperiod{ name us-holidays timeperiod_name us-holidays alias U.S. Holidays 418

419 Monitor Server Health january 1 00:00-00:00 ; New Years monday -1 may 00:00-00:00 ; Memorial Day july 4 00:00-00:00 ; Independence Day monday 1 september 00:00-00:00 ; Labor Day thursday -1 november 00:00-00:00 ; Thanksgiving december 25 00:00-00:00 ; Christmas } # This defines a modified "24x7" timeperiod that covers every day of #the year, except for U.S. holidays (defined in the timeperiod above). define timeperiod{ timeperiod_name 24x7_sans_holidays alias 24x7 Sans Holidays Novell Training Services (en) 15 April 2009 use us-holidays ; Get holiday exceptions sunday 00:00-24:00 monday 00:00-24:00 tuesday 00:00-24:00 wednesday 00:00-24:00 thursday 00:00-24:00 friday 00:00-24:00 saturday 00:00-24:00 } If necessary, you can also define your own time periods. The following directives are used to define time periods: timeperiod_name. Defines a name for the time period. alias. Defines a description used to identify the time period. weekday. Defines time ranges that are valid for a specific day of the week (Sunday through Saturday). Specify the time range using the following syntax: hh:mm-hh:mm Remember, hours are specified in military time (using a 24-hour clock). For example, the following directive specifies a time range from 11:15 AM until 2:00 PM on Saturday: saturday 11:15-14:00 NOTE: To exclude an entire day from the time period, omit it from the timeperiod definition. exclude. Specifies the name of one or more timeperiod definitions (separated by commas) that should be excluded from this timeperiod. Configuring Commands The last thing you need to do is define your command objects. By default, your command definitions are located in the /etc/nagios/objects/ commands.cfg file. The syntax for defining a command object is as follows: 419

define command{
   command_name command_name
   command_line command_line
   }

You use the following directives in a command definition:

command_name. Defines a name for the command. This name is very important because it is referenced in other definitions, such as the check_command directive in service objects.

command_line. Specifies the actual executable file (such as a plugin in the /usr/lib/nagios/plugins directory) that is run by Nagios.

A sample command definition that runs a check against the DNS service hosting the digitalairlines.com domain is shown below:

define command{
   command_name check_dns
   command_line /usr/lib/nagios/plugins/check_dns -H da1.digitalairlines.com
   }

Notice in the example above that the name of the host to test with the check_dns plugin is hard-coded in the command definition. If you want to make your commands more flexible, you can use macros. For example, to make the check_dns plugin run its test against whatever host is currently being checked, you could substitute the $HOSTADDRESS$ macro for the hard-coded host name. An example is shown below:

define command{
   command_name check_dns
   command_line /usr/lib/nagios/plugins/check_dns -H $HOSTADDRESS$
   }

In addition, you can also pass arguments to a command. To do this, you use $ARGn$ macros in the command_line directive of the command definition. If you do, you must specify these arguments along with the command object to be run in the check_command directive of your service object. For example, the following check_disk command definition expects one argument, $ARG1$:

define command{
   command_name check_disk
   command_line /usr/lib/nagios/plugins/check_disk -w 85% -c 95% -p $ARG1$
   }

In this case, the argument is the name of the device or path in the file system to check, such as /dev/sda2.
Therefore, when constructing a check_command directive in a service definition that uses this command, you must also specify the appropriate argument.

NOTE: If you're not sure which parameter an argument correlates to for a given plugin, navigate to the /usr/lib/nagios/plugins directory and run that plugin directly from the shell prompt with the -h option. Help information for the plugin is then displayed. In the example above, you would see that the -p option is used by the check_disk plugin to specify the path to test.
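The same macro technique can go further. As an illustrative sketch only (the check_disk_args command name and the thresholds shown are hypothetical, not part of the default commands.cfg), the warning and critical thresholds can be supplied by the caller as well:

```
# Hypothetical command: thresholds and path are all passed in as arguments
define command{
   command_name check_disk_args
   command_line /usr/lib/nagios/plugins/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
   }
```

A service definition using this command would then supply all three values in its check_command directive, for example check_disk_args!85%!95%!/dev/sda2.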

The key thing to remember here is that you separate the argument from the command (and from any additional arguments, if more than one is required) with an exclamation point (!). An example is shown below:

check_command check_disk!/dev/sda2

NOTE: For more information on macros, see the Macros section of the Nagios documentation.

For Nagios monitoring to work, you need to link all of these components together. For example, let's suppose you want to monitor the DNS service running on the da1.digitalairlines.com server and notify the rtracy user when something is wrong. For our purposes here, we'll assume that the DNS service resides on the same server where Nagios is running.

NOTE: We'll discuss how to monitor remote services later in this objective under Configuring Monitored Hosts on page 423.

The host definition for this server could be similar to the following:

define host{
   host_name             da1.digitalairlines.com
   alias                 Course 3107 DNS server
   address
   check_period          24x7
   max_check_attempts    1
   contact_groups        admins
   notification_interval 30
   notification_period   24x7
   notification_options  d,u,r
   }

The service definition for the DNS service running on this host would be similar to the following:

define service{
   host_name             da1.digitalairlines.com
   service_description   DNS
   check_command         check_dns
   max_check_attempts    3
   check_interval        5
   retry_interval        1
   check_period          24x7
   notification_interval 60
   notification_period   24x7
   notification_options  w,c,r
   contact_groups        DNS-Admins
   }

The contact group associated with this service definition (DNS-Admins) could be defined as follows:

define contactgroup{
   contactgroup_name DNS-Admins
   alias             DNS Server Administrators
   members           rtracy
   }

The contact that is a member of this group (rtracy) could be defined as follows:

define contact{
   contact_name                  rtracy
   alias                         Robb Tracy
   host_notifications_enabled    1
   service_notifications_enabled 1
   service_notification_period   24x7
   host_notification_period      24x7
   service_notification_options  w,u,c,r
   host_notification_options     d,u,r
   service_notification_commands notify-by-email
   host_notification_commands    host-notify-by-email
   email                         rtracy@localhost
   }

The time period referenced in these definitions (24x7) could be defined as follows:

define timeperiod{
   timeperiod_name 24x7
   alias           24 Hours A Day, 7 Days A Week
   sunday          00:00-24:00
   monday          00:00-24:00
   tuesday         00:00-24:00
   wednesday       00:00-24:00
   thursday        00:00-24:00
   friday          00:00-24:00
   saturday        00:00-24:00
   }

Finally, the command referenced in the service definition could be defined as follows:

define command{
   command_name check_dns
   command_line /usr/lib/nagios/plugins/check_dns -H da1.digitalairlines.com
   }

By putting all of these pieces together, you configure the Nagios daemon to monitor the DNS service on da1.digitalairlines.com.

NOTE: You need to ensure that each of the files containing these definitions is specified in the /etc/nagios/nagios.cfg file using a cfg_file= directive.

Sample results for these definitions in the Web console are shown in the figure below:

Figure 7-24 Putting It All Together

Configuring Monitored Hosts

Up to this point, we have discussed how to set up and configure the Nagios server system. We now need to discuss how to configure your monitored hosts. Recall that there are two categories of checks that can be run using Nagios:

Monitoring Remote Services on page 423
Monitoring Remote Server Metrics on page 427

Monitoring Remote Services

Testing remote network services with Nagios is relatively straightforward. You need to do the following:

1. Create a host configuration file for the remote host, or add the host to your existing host configuration file.

NOTE: An easy way to get started is to simply copy the localhost.cfg file that comes with Nagios and modify it for the remote host. If you choose this option, you will need to add

the new configuration file to the /etc/nagios/nagios.cfg file using the cfg_file= directive. An example is shown below:

# You can specify individual object config files as shown below:
cfg_file=/etc/nagios/objects/commands.cfg
...
cfg_file=/etc/nagios/objects/localhost.cfg
cfg_file=/etc/nagios/objects/da2.cfg

A sample host definition for the DA2 server is shown below:

define host{
   use       linux-server
   host_name DA2.digitalairlines.com
   alias     DA2 SLES 11 Server
   address
   }

NOTE: Notice in the above example that the host definition uses the linux-server template, which is defined, by default, in the templates.cfg file. It contains several default values that are applied to the host definition, as shown below:

define host{
   name                  linux-server
   use                   generic-host
   check_period          24x7
   check_interval        5
   retry_interval        1
   max_check_attempts    10
   check_command         check-host-alive
   notification_period   workhours
   notification_interval 120
   notification_options  d,u,r
   contact_groups        admins
   register              0
   }

2. Configure the network services you want to monitor on the remote host in the appropriate configuration file. For example, if you wanted to monitor the SSH daemon and the Apache Web server on the remote host, you would use the following service definitions:

define service{
   use                   local-service
   host_name             DA2.digitalairlines.com
   service_description   SSH
   check_command         check_ssh
   notifications_enabled 0
   }

define service{
   use                   local-service

   host_name             DA2.digitalairlines.com
   service_description   HTTP
   check_command         check_http
   notifications_enabled 0
   }

NOTE: Notice in the above example that the service definitions use the local-service template. This template is defined, by default, in the templates.cfg file and contains several default values that are applied to the service definition, as shown below:

define service{
   name                  local-service
   use                   generic-service
   max_check_attempts    4
   normal_check_interval 5
   retry_check_interval  1
   register              0
   }

3. If necessary, customize your contacts configuration file with the appropriate individuals for the remote host.

4. If necessary, customize your time period definitions.

5. Configure the appropriate commands in your command definition file to support the remote host.

6. Save all of your changes and reload the Nagios configuration.

When you're done, you should see the new remote host added to your list of monitored hosts in the Web console. In the example below, the DA2.digitalairlines.com server has been added to the list of hosts monitored by Nagios:

Figure 7-25 Monitoring Services on a Remote Server

In this example, the HTTP and SSH daemons on DA2 are being monitored. You can view detailed information about each service by selecting the appropriate links under DA2.digitalairlines.com in the Web console. For example, information about the Apache Web server running on this server is shown below:

Figure 7-26 Viewing Details About a Remote Service

Monitoring Remote Server Metrics

Monitoring remote server metrics with Nagios is a little more difficult than monitoring network services. As discussed in Nagios Add-Ons on page 385, you can use one of the following Nagios add-ons to run checks on the remote server and send the results back to the Nagios server:

Nagios Remote Program Execution (NRPE). The NRPE add-on allows you to execute Nagios plugins on remote network systems and feed the information they gather back to your Nagios server, allowing you to monitor local resources on remote hosts.

Nagios Service Check Acceptor (NSCA). The NSCA add-on allows you to submit passive service check results to another server running Nagios, effectively allowing Nagios to run in a distributed monitoring environment.

Both of these options work well and are frequently implemented. However, both require you to install the Nagios plugins and a special Nagios daemon on your remote systems.

NOTE: If you want to implement distributed monitoring, as discussed in Planning a Nagios Implementation on page 390, you must use NRPE or NSCA. Due to time and space constraints, these two options are not addressed in this objective. If you want to learn more, see the Nagios documentation.
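For reference only, the general shape of an NRPE-based check can be sketched as follows. This is purely illustrative and is not part of this course's configuration; it assumes the NRPE add-on (with its check_nrpe plugin) is installed on both ends, and the command name shown is hypothetical:

```
# Illustrative only: delegate a check to the NRPE daemon on the remote
# host, which runs the 'check_load' command defined in its nrpe.cfg
define command{
   command_name check_nrpe_load
   command_line /usr/lib/nagios/plugins/check_nrpe -H $HOSTADDRESS$ -c check_load
   }
```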

In addition to NRPE and NSCA, Nagios offers a third remote monitoring option that doesn't require you to install a special daemon and, more importantly, increases the security of the information being transferred between the remote host and the Nagios server. This is done using the check_by_ssh plugin, which establishes a secure SSH connection between the Nagios server and the remote system and then runs the desired check command on it. This is depicted in the figure below:

Figure 7-27 Using the check_by_ssh Plugin to Run Checks on Remote Systems

NOTE: While using check_by_ssh increases the security of data being transferred between hosts, it also increases server overhead because it must repeatedly set up and then tear down SSH connections between the monitored systems and the Nagios host.

To configure remote monitoring using the check_by_ssh plugin, you first need to configure SSH so that a session can be securely established between the Nagios server and the monitored system without requiring a password. To do this, complete the following:

1. Verify that the SSH daemon is running and accessible on the Nagios server as well as on all monitored systems.

2. Install the Nagios plugins on the monitored system. For SLES 11 systems, this can be accomplished by installing the nagios-plugins package. At this point, the Nagios plugins are installed in the /usr/lib/nagios/plugins directory on the monitored system, as shown below:

DA2:/usr/lib/nagios/plugins # ls
check_breeze    check_icmp           check_nntp      check_smtp
check_by_ssh    check_ide_smart      check_nt        check_ssh
check_clamd     check_ifoperstatus   check_ntp       check_swap
check_cluster   check_ifstatus       check_ntp_peer  check_tcp
check_dhcp      check_imap           check_ntp_time  check_time
check_dig       check_ircd           check_nwstat    check_udp
check_disk      check_linux_raid.pl  check_oracle    check_ups
check_disk_smb  check_load           check_overcr    check_users
check_dns       check_log            check_ping      check_wave
check_dummy     check_mailq          check_pop       check_xenvm
check_file_age  check_mrtg           check_procs     negate
check_flexlm    check_mrtgtraf       check_real      urlize
check_ftp       check_nagios         check_rpc       utils.pm
check_http      check_netapp.pl      check_sensors   utils.sh

3. On the monitored system, create a user and a group, each named nagios. Specify the /var/lib/nagios directory as the home directory of the nagios user. The nagios user should be a member of only the nagios group. This user account will be used to accept SSH connections from the Nagios server and run the local Nagios plugin executables. Do the following:

a. On the monitored system, start YaST and select Security and Users > User and Group Management.
b. Select the Groups tab.
c. View system groups by selecting Set Filter > System Groups.
d. Select Add; then complete the following parameters for the new group:
   Group Name: nagios
   Group ID: Use the default GID provided.
e. Select OK.
f. Select the Users tab.
g. View system users by selecting Set Filter > System Users.
h. Select Add.
i. Enter the following parameters for the new user:
   User's Full Name: Used for Nagios
   Username: nagios
   Password: Enter a password of your choosing.
j. Select the Details tab.
k. In the Home Directory field, enter /var/lib/nagios.
l. In the Default Group drop-down list, select nagios.
m. In the Additional Groups field, deselect all marked groups.
n. Select OK.

You should now see a new user named nagios added to your list of system users on the monitored system, as shown below:

Figure 7-28 Creating the nagios User Account

o. Select OK to write the new user and group accounts to disk.
p. Close YaST.
q. Open a terminal session.
r. Switch to the nagios user account by entering su - nagios at the shell prompt.
s. When prompted, enter the password you configured for the nagios user account.
t. At the shell prompt, enter cd ~ to switch to the nagios user's home directory.
u. At the shell prompt, enter mkdir .ssh.
v. Restrict access to the .ssh directory to only the nagios user by entering chmod 700 /var/lib/nagios/.ssh at the shell prompt.
w. At the shell prompt, enter exit.

4. Create a public/private key pair for the nagios user on your Nagios server and transfer the public key file to the monitored system. This allows the Nagios server to establish a secure SSH connection to the monitored system as nagios without having to supply a password. On SLES 11, this is done by completing the following:

a. Switch to your Nagios server.
b. Stop Nagios by opening a terminal session, switching to root with the su - command, and entering rcnagios stop at the shell prompt.
c. Assign a default shell and password to your nagios user by doing the following:
   i. Start YaST.
   ii. Select Security and Users > User and Group Management.
   iii. Select Set Filter > System Users.
   iv. Select the nagios user (which was created for you by default when you installed Nagios) and then select Edit.
   v. Enter a password of your choosing in the Password fields.

   vi. Select the Details tab.
   vii. In the Login Shell drop-down list, select /bin/bash.
   viii. Select OK > OK.
   ix. Close YaST.
d. At the shell prompt, enter su - nagios to switch to your nagios user account. Because you are logged in as root, you are not prompted for a password.
e. At the shell prompt, enter ssh-keygen -t dsa.
f. When prompted to specify the file name and path where the key file should be saved, accept the default of /var/lib/nagios/.ssh/id_dsa by pressing Enter.
g. When prompted to enter a passphrase, leave the entry blank and press Enter.
h. Press Enter again when prompted to confirm the passphrase. Sample output from this command is shown below:

nagios@da1:~> ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/var/lib/nagios/.ssh/id_dsa):
Created directory '/var/lib/nagios/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /var/lib/nagios/.ssh/id_dsa.
Your public key has been saved in /var/lib/nagios/.ssh/id_dsa.pub.
The key fingerprint is:
28:8c:b3:bc:f7:23:d4:4e:d3:51:40:40:b3:26:65:89 nagios@da1

As you can see above, the key files are created for you in /var/lib/nagios/.ssh:

nagios@da1:~> ls .ssh/
id_dsa  id_dsa.pub

The file we will use is id_dsa.pub, the public key for the nagios user.

i. Copy the public key file from the Nagios server to the home directory of the nagios user on the monitored system by entering the following command (all on one line):

scp .ssh/id_dsa.pub nagios@remote_host_name:~/.ssh/authorized_keys

j. If prompted to accept a security key from the remote host, enter yes.
k. When prompted, enter the password for the nagios user on the remote host. You should see that the file was copied to authorized_keys in the .ssh directory of the nagios user's home directory on the remote system. An example is shown below:

nagios@da1:~> scp .ssh/id_dsa.pub nagios@da2.digitalairlines.com:~/.ssh/authorized_keys
The authenticity of host 'da2.digitalairlines.com ( )' can't be established.
RSA key fingerprint is 9f:68:0c:29:20:b4:21:8f:50:74:72:99:b2:84:e1:ed.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'da2.digitalairlines.com' (RSA) to the list of known hosts.
Password:
id_dsa.pub    100%    KB/s    00:00

l. Test the configuration by establishing an SSH connection from the Nagios server to the remote system as the nagios user. Enter the following command:

ssh nagios@remote_host_name

An SSH connection should be established without prompting you for a password.

m. Enter exit to close the connection.
n. Enter exit again to exit the nagios user account at the shell prompt.
o. Start YaST again on the Nagios server and use the User and Group Management module to set the default shell for the nagios user back to /bin/false.

Once this is done, you next need to configure the commands you want to run on the remote system. This is done by creating a new command definition that runs the check_by_ssh plugin and passes the plugin you actually want to run on the remote system as an argument.
For example, suppose you wanted to monitor disk space utilization on the remote server using the check_disk plugin. You could define a command similar to the following:

# 'check_remote_disk' command definition
define command{
   command_name check_remote_disk
   command_line /usr/lib/nagios/plugins/check_by_ssh -H $HOSTADDRESS$ -C "/usr/lib/nagios/plugins/check_disk -w 75% -c 90% -p $ARG1$"
   }

NOTE: Remember, command objects are defined in the /etc/nagios/objects/commands.cfg file by default.

The $HOSTADDRESS$ macro shown above allows Nagios to dynamically insert the IP address of the host being checked into the command.

Notice that the entire remote command is enclosed in quotation marks. This ensures that the correct command and parameters are passed to the check_by_ssh plugin. In this example, the check_by_ssh plugin establishes an SSH connection as the nagios user to the remote system and then runs the check_disk plugin on that system. Using the path or device you specify as an argument, the check_disk plugin issues a warning if utilization exceeds 75% and a critical alert if utilization exceeds 90%.

Once the command definition is complete, you need to define a service object that uses the command in the appropriate configuration file. For example:

define service{
   use                 local-service
   host_name           DA2.digitalairlines.com
   service_description Root Partition Utilization
   check_command       check_remote_disk!/dev/sda2
   }

This example assumes the root partition resides on /dev/sda2 on the remote system.

Once the command and service definitions for checking remote systems have been added to your configuration files, you can start the Nagios daemon (or reload the configuration if it is already running). If you make a mistake in your configuration, you will see an error message in the output of the rcnagios start or rcnagios reload command directing you to view the /var/log/nagios/config.err file.
An example is shown below:

DA1:/etc/nagios/objects # rcnagios start
Starting nagios - Error in configuration files - please read /var/log/nagios/config.err    failed

This file can be extremely helpful in identifying the nature of the error and where it occurred in your configuration files.

If Nagios starts correctly, you can access the Web console and select the Service Detail link. When you do, you should see that the service definition has been added to the host. You may need to wait a few minutes for the check to execute.

NOTE: If the check hasn't been run yet, the next scheduled time the check will run should be displayed.

After the remote check runs, you should see the results displayed in the Service Detail page. An example using the configuration presented earlier is shown below:

Figure 7-29 Viewing the Results of a Remote Check

As with the checks configured previously, you can view more details by selecting the service in the list provided. For the Root Partition Utilization check configured earlier, the following detailed information is displayed:

Figure 7-30 Viewing Remote Service Check Details


1 A product that should be in a device s inventory is not showing up in the inventory ZENworks 11 SP3 Troubleshooting Inventory January 2015 This document provides troubleshooting guidelines for common problems related to ZENworks 11 SP3 Inventory. If, after completing the troubleshooting

More information

Novell GroupWise. GROUPWISE CLIENT FREQUENTLY ASKED QUESTIONS (FAQ) August 15, 2005

Novell GroupWise.  GROUPWISE CLIENT FREQUENTLY ASKED QUESTIONS (FAQ) August 15, 2005 Novell 7 August 15, 2005 GROUPWISE CLIENT FREQUENTLY ASKED QUESTIONS (FAQ) www.novell.com Legal Notices Novell, Inc. makes no representations or warranties with respect to the contents or use of this documentation,

More information

Multi-System Administration Guide

Multi-System Administration Guide www.novell.com/documentation Multi-System Administration Guide GroupWise 8 August 31, 2009 Legal Notices Novell, Inc. makes no representations or warranties with respect to the contents or use of this

More information

Docker Networking: From One to Many. Don Mills

Docker Networking: From One to Many. Don Mills Docker Networking: From One to Many Don Mills What we are going to talk about Overview of traditional Docker networking Some demonstrations Questions New Docker features Some more demonstrations Questions

More information

Using ZENworks with Novell Service Desk

Using ZENworks with Novell Service Desk www.novell.com/documentation Using ZENworks with Novell Service Desk Novell Service Desk 7.1 April 2015 Legal Notices Novell, Inc. makes no representations or warranties with respect to the contents or

More information

Novell SUSE Linux Enterprise Server

Novell SUSE Linux Enterprise Server SLES 10 Storage Administration Guide for EVMS Novell SUSE Linux Enterprise Server 10 February 1, 2007 STORAGE ADMINISTRATION GUIDE FOR EVMS www.novell.com Legal Notices Novell, Inc., makes no representations

More information

Pre-Installation ZENworks Mobile Management 2.7.x August 2013

Pre-Installation ZENworks Mobile Management 2.7.x August 2013 www.novell.com/documentation Pre-Installation ZENworks Mobile Management 2.7.x August 2013 Legal Notices Novell, Inc., makes no representations or warranties with respect to the contents or use of this

More information

GroupWise Messenger 2 Support Pack 3

GroupWise Messenger 2 Support Pack 3 GroupWise Messenger 2 Support Pack 3 November 20, 2007 1 Overview The information in this Readme file pertains to Novell GroupWise Messenger 2 Support Pack 3. This Support Pack contains updates for all

More information

Server Installation ZENworks Mobile Management 2.6.x January 2013

Server Installation ZENworks Mobile Management 2.6.x January 2013 www.novell.com/documentation Server Installation ZENworks Mobile Management 2.6.x January 2013 Legal Notices Novell, Inc., makes no representations or warranties with respect to the contents or use of

More information

The Novell Client for SUSE Linux Enterprise 11 Service Pack1(SLE 11 SP1) requires workstations / servers running one of the following:

The Novell Client for SUSE Linux Enterprise 11 Service Pack1(SLE 11 SP1) requires workstations / servers running one of the following: Novell Client for SUSE Linux Enterprise 11 SP1 Readme Novell June 2010 Readme has the following sections: Section 1, System Requirements, on page 1 Section 2, Login Issues, on page 1 Section 3, File and

More information

Novell ZENworks Endpoint Security Management 4.1 Interim Release 1. 1 Issues Resolved in IR1. Novell. April 16, 2010

Novell ZENworks Endpoint Security Management 4.1 Interim Release 1. 1 Issues Resolved in IR1. Novell. April 16, 2010 Novell ZENworks Endpoint Security Management 4.1 Interim Release 1 Novell April 16, 2010 Interim Release 1 (IR1) is the current release of ZENworks Endpoint Security Management 4.1. You can download IR1

More information

ZENworks Reporting Migration Guide

ZENworks Reporting Migration Guide www.novell.com/documentation ZENworks Reporting Migration Guide ZENworks Reporting 5 January 2014 Legal Notices Novell, Inc. makes no representations or warranties with respect to the contents or use of

More information

Novell PlateSpin Orchestrate

Novell PlateSpin Orchestrate Virtual Machine Management Guide AUTHORIZED DOCUMENTATION Novell PlateSpin Orchestrate 2.0.2 November 17, 2009 www.novell.com PlateSpin Orchestrate 2.0 Virtual Machine Management Guide Legal Notices Novell,

More information

Update Process and Recommendations

Update Process and Recommendations www.novell.com/documentation Update Process and Recommendations ZENworks Mobile Management 2.7.x August 2013 Legal Notices Novell, Inc., makes no representations or warranties with respect to the contents

More information

Fundamentals of ZENworks Configuration Management Imaging Lecture

Fundamentals of ZENworks Configuration Management Imaging Lecture Fundamentals of ZENworks Configuration Management Imaging Lecture ZEN01 Novell Training Services ATT LIVE 2012 LAS VEGAS www.novell.com Legal Notices Novell, Inc., makes no representations or warranties

More information

Novell Business Continuity Clustering

Novell Business Continuity Clustering AUTHORIZED DOCUMENTATION Administration Guide for Novell Open Enterprise Server 1 SP2 Linux Novell Business Continuity Clustering 1.1 SP1 September 21, 2010 www.novell.com Legal Notices Novell, Inc., makes

More information

Novell GroupWise Migration Utility for Microsoft * Exchange

Novell GroupWise Migration Utility for Microsoft * Exchange Novell GroupWise Migration Utility for Microsoft * Exchange 2.1 September 6, 2005 INSTALLATION AND MIGRATION GUIDE www.novell.com Legal Notices Novell, Inc. makes no representations or warranties with

More information

Novell Messenger. Installation Guide 2.0. novdocx (en) 17 September January 15, Messenger 2.0 Installation Guide

Novell Messenger. Installation Guide 2.0. novdocx (en) 17 September January 15, Messenger 2.0 Installation Guide Installation Guide AUTHORIZED DOCUMENTATION Novell Messenger 2.0 January 15, 2010 www.novell.com Messenger 2.0 Installation Guide Legal Notices Novell, Inc., makes no representations or warranties with

More information

AUTHORIZED DOCUMENTATION

AUTHORIZED DOCUMENTATION Administration Guide AUTHORIZED DOCUMENTATION Novell SecureLogin 6.1 SP1 June, 2009 www.novell.com Novell SecureLogin 6.1 SP1 Administration Guide Legal Notices Novell, Inc. makes no representations or

More information

Best Practices Guide Simplifying Filr Deployments with File Reporter and Storage Manager October 5, 2015

Best Practices Guide Simplifying Filr Deployments with File Reporter and Storage Manager October 5, 2015 www.novell.com/documentation Best Practices Guide Simplifying Filr Deployments with File Reporter and Storage Manager October 5, 2015 Legal Notices Condrey Corporation makes no representations or warranties

More information

User Guide SecureLogin 7.0 SP3 April, 2012

User Guide SecureLogin 7.0 SP3 April, 2012 www.novell.com/documentation User Guide SecureLogin 7.0 SP3 April, 2012 Legal Notices Novell, Inc., makes no representations or warranties with respect to the contents or use of this documentation, and

More information

Novell Client for Windows Vista User Guide. novdocx (en) 6 April NovellTM Client. for Windows Vista * USER GUIDE.

Novell Client for Windows Vista User Guide. novdocx (en) 6 April NovellTM Client. for Windows Vista * USER GUIDE. Novell Client for Windows Vista User Guide NovellTM Client for Windows Vista * 1.0 August 2007 USER GUIDE www.novell.com Legal Notices Novell, Inc. makes no representations or warranties with respect to

More information

Novell Kerberos KDC 1.5 Quickstart. novdocx (en) 11 December Novell Kerberos KDC QUICK START.

Novell Kerberos KDC 1.5 Quickstart. novdocx (en) 11 December Novell Kerberos KDC QUICK START. Novell Kerberos KDC 1.5 Quickstart Novell Kerberos KDC 1.5 April 8, 2008 QUICK START www.novell.com Legal Notices Novell, Inc., makes no representations or warranties with respect to the contents or use

More information

Advanced IP Routing. Policy Routing QoS RVSP

Advanced IP Routing. Policy Routing QoS RVSP Advanced IP Routing Policy Routing QoS RVSP Traditional Routing What is traditional routing? Best effort. All routing is a destination driven process. Router cares only about the destination address when

More information

For personnal use only

For personnal use only Network Namespaces in RHEL7 Finnbarr P. Murphy (fpm@fpmurphy.com) Linux namespaces are somewhat like Solaris zones in many ways from a user perspective but have significant differences under the hood.

More information

System Performance: Sizing and Tuning

System Performance: Sizing and Tuning www.novell.com/documentation System Performance: Sizing and Tuning ZENworks Mobile Management 2.6.x November 2012 Legal Notices Novell, Inc., makes no representations or warranties with respect to the

More information

Configuring Google Cloud Messaging Service for Android Devices

Configuring Google Cloud Messaging Service for Android Devices www.novell.com/documentation Configuring Google Cloud Messaging Service for Android Devices ZENworks Mobile Management 2.8.x November 2013 Legal Notices Novell, Inc., makes no representations or warranties

More information

Overview GroupWise Software Developer Kit May 2013

Overview GroupWise Software Developer Kit May 2013 www.novell.com/documentation Overview GroupWise Software Developer Kit May 2013 Legal Notices Novell, Inc. makes no representations or warranties with respect to the contents or use of this documentation,

More information

Style Guide GroupWise Product Documentation August 2013

Style Guide GroupWise Product Documentation August 2013 www.novell.com/documentation Style Guide GroupWise Product Documentation August 2013 Legal Notices Novell, Inc., makes no representations or warranties with respect to the contents or use of this documentation,

More information

ZENworks Endpoint Security Management. Version 3.2. Installation and Quick-Start Guide

ZENworks Endpoint Security Management. Version 3.2. Installation and Quick-Start Guide ZENworks Endpoint Security Management Version 3.2 Installation and Quick-Start Guide June 14, 2007 2007 Novell, Inc. All Rights Reserved The software described in this book is furnished under a license

More information

Novell Open Workgroup Suite Small Business Edition

Novell Open Workgroup Suite Small Business Edition Installation and Administration Guide AUTHORIZED DOCUMENTATION Novell Open Workgroup Suite Small Business Edition 2.5 October 1, 2009 www.novell.com Novell Open Workgroup Suite Small Business Edition 2.5

More information

Personality Migration Reference

Personality Migration Reference www.novell.com/documentation Personality Migration Reference ZENworks 11 Support Pack 3 July 2014 Legal Notices Novell, Inc., makes no representations or warranties with respect to the contents or use

More information

Novell Identity Manager

Novell Identity Manager Driver for SAP * Business Logic Implementation Guide AUTHORIZED DOCUMENTATION Novell Identity Manager 3.6.1 August 28, 2009 www.novell.com Identity Manager 3.6.1 Driver for SAP Business Logic Implementation

More information

Full Disk Encryption Pre-Boot Authentication Reference

Full Disk Encryption Pre-Boot Authentication Reference www.novell.com/documentation Full Disk Encryption Pre-Boot Authentication Reference ZENworks 11 Support Pack 2 November 08, 2012 Legal Notices Novell, Inc., makes no representations or warranties with

More information

Online documentation: Novell Documentation Web site. (http://www.novell.com/ documentation/securelogin70/index.html)

Online documentation: Novell Documentation Web site. (http://www.novell.com/ documentation/securelogin70/index.html) Novell SecureLogin 7.0 Readme September 18, 2009 Novell 1 Documentation The following sources provide information about Novell SecureLogin 7.0: Online documentation: Novell Documentation Web site. (http://www.novell.com/

More information

Novell ZENworks 10 Patch Management SP3

Novell ZENworks 10 Patch Management SP3 Reference AUTHORIZED DOCUMENTATION Novell ZENworks 10 Patch Management SP3 10.3 August 26, 2010 www.novell.com ZENworks 10 Patch Management Reference Legal Notices Novell, Inc. makes no representations

More information

System Performance: Sizing and Tuning

System Performance: Sizing and Tuning www.novell.com/documentation System Performance: Sizing and Tuning ZENworks Mobile Management 3.0.x September 2014 Legal Notices Novell, Inc., makes no representations or warranties with respect to the

More information

Novell PlateSpin Protect

Novell PlateSpin Protect AUTHORIZED DOCUMENTATION Installation and Upgrade Guide Novell PlateSpin Protect 10.0.2 January 13, 2010 www.novell.com Legal Notices Novell, Inc., makes no representations or warranties with respect to

More information

Experimenting Internetworking using Linux Virtual Machines Part I

Experimenting Internetworking using Linux Virtual Machines Part I Experimenting Internetworking using Linux Virtual Machines Part I Hui Chen Previous Release on October 27, 2014 Lastly revised on November 4, 2015 Revision: Copyright c 2016. Hui Chen

More information

GroupWise Connector for Outlook

GroupWise Connector for Outlook GroupWise Connector for Outlook June 2006 1 Overview The GroupWise Connector for Outlook* allows you to access GroupWise while maintaining your current Outlook behaviors. Instead of connecting to a Microsoft*

More information

System Performance: Sizing and Tuning

System Performance: Sizing and Tuning www.novell.com/documentation System Performance: Sizing and Tuning ZENworks Mobile Management 3.2.x September 2015 Legal Notices Novell, Inc., makes no representations or warranties with respect to the

More information

Configuration Guide Data Synchronizer Mobility Pack Connector for Mobility January 28, 2013

Configuration Guide Data Synchronizer Mobility Pack Connector for Mobility January 28, 2013 www.novell.com/documentation Configuration Guide Data Synchronizer Mobility Pack 1.2.5 Connector for Mobility January 28, 2013 Legal Notices Novell, Inc., makes no representations or warranties with respect

More information

Novell ZENworks 10 Configuration Management SP3

Novell ZENworks 10 Configuration Management SP3 AUTHORIZED DOCUMENTATION System Reporting Reference Novell ZENworks 10 Configuration Management SP3 10.3 November 17, 2011 www.novell.com Legal Notices Novell, Inc., makes no representations or warranties

More information

Novell TM. Client. for Linux 1.2. Novell Client for Linux 1.2 Administration Guide. novdocx (ENU) 01 February

Novell TM. Client. for Linux 1.2. Novell Client for Linux 1.2 Administration Guide. novdocx (ENU) 01 February Novell Client for Linux 1.2 Administration Guide Novell TM Client for Linux 1.2 July 26, 2006 ADMINISTRATION GUIDE www.novell.com Legal Notices Novell, Inc. makes no representations or warranties with

More information

Novell Access Manager

Novell Access Manager SSL VPN Server Guide AUTHORIZED DOCUMENTATION Novell Access Manager 3.1 SP3 February 02, 2011 www.novell.com Novell Access Manager 3.1 SP3 SSL VPN Server Guide Legal Notices Novell, Inc., makes no representations

More information

Novell ZENworks 10 Personality Migration

Novell ZENworks 10 Personality Migration AUTHORIZED DOCUMENTATION Personality Migration Reference Novell ZENworks 10 Personality Migration 10.3 January 17, 2011 www.novell.com Legal Notices Novell, Inc., makes no representations or warranties

More information

Configuration Guide Data Synchronizer Mobility Pack Connector for GroupWise January 28, 2013

Configuration Guide Data Synchronizer Mobility Pack Connector for GroupWise January 28, 2013 www.novell.com/documentation Configuration Guide Data Synchronizer Mobility Pack 1.2.5 Connector for GroupWise January 28, 2013 Legal Notices Novell, Inc., makes no representations or warranties with respect

More information

Adding Users and Enrolling Devices

Adding Users and Enrolling Devices www.novell.com/documentation Adding Users and Enrolling Devices ZENworks Mobile Management 3.2.x September 2015 Legal Notices Novell, Inc., makes no representations or warranties with respect to the contents

More information

Endpoint Security Policies Reference

Endpoint Security Policies Reference www.novell.com/documentation Endpoint Security Policies Reference ZENworks 11 Support Pack 3 February 2014 Legal Notices Novell, Inc., makes no representations or warranties with respect to the contents

More information

Novell ClientTM for Linux

Novell ClientTM for Linux Novell Client 2.0 for Linux Administration Guide Novell ClientTM for Linux 2.0 September 18, 2007 ADMINISTRATION GUIDE www.novell.com Legal Notices Novell, Inc. makes no representations or warranties with

More information

ZENworks for Desktops Preboot Services

ZENworks for Desktops Preboot Services 3.2 Novell ZENworks for Desktops Preboot Services DEPLOYMENT www.novell.com Legal Notices Novell, Inc. makes no representations or warranties with respect to the contents or use of this documentation,

More information

How to Configure ClusterXL for L2 Link Aggregation

How to Configure ClusterXL for L2 Link Aggregation How to Configure ClusterXL for L2 Link Aggregation User Guide 15 January 2013 Classification: [Protected] 2013 Check Point Software Technologies Ltd. All rights reserved. This product and related documentation

More information

Survey of inconsistencies in Linux kernel IPv4/IPv6 UAPI Roopa Prabhu

Survey of inconsistencies in Linux kernel IPv4/IPv6 UAPI Roopa Prabhu Survey of inconsistencies in Linux kernel IPv4/IPv6 UAPI Roopa Prabhu Agenda Goals Introduction to Kernel Netlink UAPI for IPv4/IPv6 Introduction to userspace apps relying on the UAPI Survey areas of inconsistencies

More information

Novell Identity Manager Driver for Linux* and UNIX* Settings

Novell Identity Manager Driver for Linux* and UNIX* Settings AUTHORIZED DOCUMENTATION Implementation Guide Novell Identity Manager Driver for Linux* and UNIX* Settings 4.0.1 April 15, 2011 www.novell.com Legal Notices Novell, Inc. and Omnibond Systems, LLC. make

More information

TROUBLESHOOTING GUIDE. HYCU Data Protection for Nutanix

TROUBLESHOOTING GUIDE. HYCU Data Protection for Nutanix TROUBLESHOOTING GUIDE HYCU Data Protection for Nutanix Version: 3.0.0 Product release date: April 2018 Document release date: April 2018 Legal notices Copyright notice 2017 2018 HYCU. All rights reserved.

More information

The Essentials of Linux Network Administration

The Essentials of Linux Network Administration White Paper by David Davis, ActualTech Media The Essentials of Linux Network Administration In this Paper Understanding Linux Network Interfaces... 2 MAC Addresses... 3 IP Addressing... 3 DHCP... 5 DNS...

More information

Installation Guide ZENworks Linux Management 7.3 IR4 January 31, 2011

Installation Guide ZENworks Linux Management 7.3 IR4 January 31, 2011 www.novell.com/documentation Installation Guide ZENworks Linux Management 7.3 IR4 January 31, 2011 Legal Notices Novell, Inc. makes no representations or warranties with respect to the contents or use

More information

SUSE Cloud Admin Appliance Walk Through. You may download the SUSE Cloud Admin Appliance the following ways.

SUSE Cloud Admin Appliance Walk Through. You may download the SUSE Cloud Admin Appliance the following ways. SUSE Cloud Admin Appliance Walk Through First before you proceed with deploying the Admin Appliance you must go through and answer the questionnaire to ensure you have an idea of the scope of the project

More information

Installing VMware vsphere 5.1 Components

Installing VMware vsphere 5.1 Components Installing VMware vsphere 5.1 Components Module 14 You Are Here Course Introduction Introduction to Virtualization Creating Virtual Machines VMware vcenter Server Configuring and Managing Virtual Networks

More information

Static and source based routing

Static and source based routing Static and source based routing Lab setup For this lab students have to work in teams of two. Two team of two students (that is overall four students) should form a group and perform lab tasks together.

More information

TROUBLESHOOTING GUIDE. Backup and Recovery for Nutanix

TROUBLESHOOTING GUIDE. Backup and Recovery for Nutanix TROUBLESHOOTING GUIDE Backup and Recovery for Nutanix Version: 1.5.2 Product release date: October 2017 Document release date: October 2017 Legal notices Copyright notice 2017 Comtrade Software. All rights

More information

OpenFlow Configuration Lab

OpenFlow Configuration Lab APNIC SDN Workshop Lab OpenFlow Configuration Lab Objective: As part of this hands-on module, you will be installing the Mininet network emulator on your PC. You will then configure a standalone OpenFlow

More information

Novell GroupWise. WEBACCESS CLIENT USER GUIDE. August 15, 2005

Novell GroupWise.  WEBACCESS CLIENT USER GUIDE. August 15, 2005 Novell GroupWise 7 August 15, 2005 WEBACCESS CLIENT USER GUIDE www.novell.com Legal Notices Novell, Inc. makes no representations warranties with respect to the contents use of this documentation, and

More information

Identity Tracking. 6.1r1 SENTINEL SOLUTION OVERVIEW. Aug 2008

Identity Tracking. 6.1r1  SENTINEL SOLUTION OVERVIEW. Aug 2008 Identity Tracking 6.1r1 www.novell.com Aug 2008 SENTINEL SOLUTION OVERVIEW Legal Notices Novell, Inc. makes no representations or warranties with respect to the contents or use of this documentation, and

More information

Novell Identity Manager

Novell Identity Manager WorkOrder Driver Implementation Guide AUTHORIZED DOCUMENTATION Novell Identity Manager 3.6.1 June 05, 2009 www.novell.com Identity Manager 3.6.1 WorkOrder Driver Implementation Guide. Legal Notices Novell,

More information