Herding Clones. Mike Kershaw August 17, urmk/

Similar documents
Supercharged VM Startup: A System V-Style INIT Process for VM and Guests

COMP 3400 Mainframe Administration 1

At course completion. Overview. Audience profile. Course Outline. : 55187B: Linux System Administration. Course Outline :: 55187B::

"Charting the Course... MOC B: Linux System Administration. Course Summary

z/vm 6.3 Installation or Migration or Upgrade Hands-on Lab Sessions

Course 55187B Linux System Administration

z/vm Introduction 3/10/2014

Exam LFCS/Course 55187B Linux System Administration

Customer Experiences:

Welcome to getting started with Ubuntu Server. This System Administrator Manual. guide to be simple to follow, with step by step instructions

1. Logging in to VM - Regular Login - Disconnected Login - Stealing the session - Logging off - Disconnected log off

How to Use This Lab Manual

This material is based on work supported by the National Science Foundation under Grant No

IBM Systems and Technology Group. Abstract

Oracle VM Template for MySQL Enterprise Edition =========================================================================== ===

Avaya Aura TM System Platform R6.0.1 Service Pack Release Notes Issue 1.4

Chapter Two. Lesson A. Objectives. Exploring the UNIX File System and File Security. Understanding Files and Directories

Thousands of Linux Installations (and only one administrator)

Critical Analysis and last hour guide for RHCSA/RHCE Enterprise 7

ECE 550D Fundamentals of Computer Systems and Engineering. Fall 2017

SUSE Linux Enterprise Server Starter System for System z Installation Guide

The Linux IPL Procedure

Opengear Bulk Provisioning Guide

Unit 2: Manage Files Graphically with Nautilus Objective: Manage files graphically and access remote systems with Nautilus

CA MIA Tape Sharing for z/vm

"Charting the Course... RHCE Rapid Track Course. Course Summary

Linux Essentials Objectives Topics:

Linux Systems Administration Getting Started with Linux

Contents in Detail. Acknowledgments

Chapter 11: File-System Interface

Disks, Filesystems 1

Zend Core TM. Installation and Maintenance Guide. Zend Core for Oracle. By Zend Technologies, Inc. w w w. z e n d. c o m

HP-UX System Administration

Chapter 7. Getting Started with Boot from Volume

SA2 v6 Linux System Administration II Net Configuration, Software, Troubleshooting

NETWORK CONFIGURATION AND SERVICES. route add default gw /etc/init.d/apache restart

Virtualization. Michael Tsai 2018/4/16

This course is for those wanting to learn basic to intermediate topics in Solaris 10 system administration.

CCNA 1 Chapter 2 v5.0 Exam Answers %

Certification. System Initialization and Services

Certified Ubuntu Professional VS-1140

CompTIA Linux Course Overview. Prerequisites/Audience. Course Outline. Exam Code: XK0-002 Course Length: 5 Days

Control Center Planning Guide

Chapter 10: File System. Operating System Concepts 9 th Edition

FastTrack to Red Hat Linux System Administrator Course Overview

How to Restrict a Login Shell Using Linux Namespaces

Troubleshooting the Installation

CIS 191A Final Exam. Fall CIS 191 Final Exam

NetIQ Privileged Account Manager 3.2 Patch Update 4 Release Notes

GNU/Linux: An Essential Guide for Students Undertaking BLOSSOM

Juniper Secure Analytics Patch Release Notes

Scratchbox Remote Shell

GL-280: Red Hat Linux 7 Update. Course Description. Course Outline

Initialization: runlevels

Installing MediaWiki using VirtualBox

Using z/vm DirMaint in an SSI Cluster

1Z Oracle Certified Professional Oracle Solaris 10 System Administrator Exam Summary Syllabus Questions

High Availability. Neale Ferguson Sine Nomine Associates Tuesday 13 August,

Automatically Logging on a User at Linux System Boot time for Console Management

Saving Real Storage with xip2fs and DCSS. Ihno Krumreich Project Manager for SLES on System z

Zenoss Resource Manager Upgrade Guide

Designing high-availability solutions using HP Integrity Virtual Machines as HP Serviceguard packages

Aware IM Version 8.1 Installation Guide

AppGate 11.0 RELEASE NOTES

SysadminSG RHCSA Study Guide

Lenovo ThinkSystem NE Release Notes. For Lenovo Cloud Network Operating System 10.6

File System Hierarchy Standard (FHS)

Hardening servers for the modern internet

Users Manual. OP5 System 2.4. OP5 AB. Page 1 of 6

Linux Systems Security. Logging and Network Monitoring NETS1028 Fall 2016

IBM Cloud Manager with OpenStack: z/vm Integration Considerations

INSTALLATION. Security of Information and Communication Systems

Virtual Data Center (vdc) Manual

WLM1200-RMTS User s Guide

CCNA 1 Chapter 2 v5.0 Exam Answers 2013

DUAL OS INSTALLATION

vsphere Upgrade Update 2 Modified on 4 OCT 2017 VMware vsphere 6.0 VMware ESXi 6.0 vcenter Server 6.0

1 LINUX KERNEL & DEVICES

Getting Started with Linux

"Charting the Course... Enterprise Linux System Administration Course Summary

System z Virtualization and Linux Workshop Bootcamp System z Hardware / Architecture

Using the Offline Diagnostic Monitor Menu

SE Linux Implementation LINUX20

What this Guide Covers. Additional Info. 1. Linux based Servers. 2. Windows Servers. 3. GoldLite and Virtual Servers. 4. Other servers. 5.

CHAPTER III PLANNING

RedHat Certified Engineer

Migration and Building of Data Centers in IBM SoftLayer

OPERATING SYSTEMS. Božo Krstajić, PhD, University of Montenegro Podgorica.

Working with Ubuntu Linux. Track 2 Workshop June 2010 Pago Pago, American Samoa

Contents. Deployment: Automated Installation of Cygwin

z/vm Security Essentials

XenServer Release Notes

MCSA Windows Server A Success Guide to Prepare- Microsoft Upgrading Your Skills to MCSA Windows Server edusum.

TRACK TRACK for VM What What s happening in your Virtual Machine? irtual Review & Lab

Introduction to UNIX/LINUX Security. Hu Weiwei

Unix System Architecture, File System, and Shell Commands

NCSA Security R&D Last Updated: 11/03/2005

How to install the software of ZNS8022

CHAPTER 2 ACTIVITY

Linux for zseries and VSE

Transcription:

Herding Clones Mike Kershaw Michael.Kershaw@marist.edu August 17, 2004 1

Why? Computer Science department wanted to offer students their own servers for classwork which would be available for the entire duration of their academic career IBM Joint Study to use virtual servers in an academic setting 2

Requirements for Clones Very large number of Linux guests on z/vm (currently over 500) Too many to create manually Long life-cycle before guests are decommissioned - Up to 5 years life cycle for each server Difficult to maintain kernel patches and services for large numbers of systems Must be flexible and modifiable by users Used by students and faculty for purposes not foreseen in the original design Must be able to push security updates to the clones during 3

Requirements for Clones use Should be user-friendly Not all users will be experts Basic functionality and configuration should be accessible by all without compromising the ability of advanced users 3

Finding out what doesn t work First created as fully stand-alone servers for specific courses This design had many drawbacks Could not maintain servers or fix problems No mechanism for security updates No mechanism for network changes No method to recover forgotten passwords - Not linked to central LDAP system Could not give users new software Required Linux/Unix skills to configure or modify services 4

Finding out what doesn t work Required a large amount of disk space for each server - entire root FS duplicated needlessly Prior to guest LANs, point-to-point IUCV links were needed. Requires 2 IPs per clone No method to automatically allocate IPs and modify the VM TCP/IP config Limited number of IUCV links per VM stack Poor error recovery on IUCV - rebooting stack meant rebooting all attached linux systems 4

Designing the next phase New generation of servers designed with help from Sine Nomine consultant, Scott Courtney Shared root filesystem Greatly reduced impact on drive space - 2.5GB shared between all systems Single point of maintenance and upgrades LDAP access control Known Good configuration rollback point no matter what users change z/vm Guest LAN connection allows automatic creation and 5

Designing the next phase hundreds of guests per LAN. User accounts and passwords controlled by LDAP (PAM- LDAP and NSS-LDAP) Users able to install their own software to writeable partition Central management server Web-based management of user servers Central database of allocated systems and IPs Service monitoring and network services for clone machines 5

Dividing up the filesystem Start with a standard install with a writeable root partition Identify components a user would likely need to modify, and relocate the files (and optionally recompile the binaries to point to a different configuration location) /usr/local - Contains the entire writeable filesystem, in accordance with the general guideline of installing nonsystem software in the local path. /home and /root - Obvious first candidate, users need to be able to write to their own directories /etc - Some configuration files need to be written to, but the bulk of the directory should remain readonly. Notable exceptions are hosts, ld.so.conf and ld.so.cache, which 6

Dividing up the filesystem are relocated to /usr/local/etc, and mtab which is symlinked to /proc/mounts /tmp - Relocated for obvious reasons. Could have been made into a shmfs/tmpfs filesystem, but memory is at a premium. /var - Let the users have syslogs /dev - A special case. Device info is needed during the initial startup, so it cannot be symlinked, but it must also be writeable. During the boot procedure, a tiny loopback filesystem from the user partition is mounted over /dev. This is a legacy from the 2.2.x kernel, and could be replaced with a mount-at-boot devfs 2.4.x or 2.6.x kernel. 6

Organizing the read-only components Configure PAM-LDAP and NSS-LDAP so that user maintenance is remote Relocate shared services to /usr/share for ease of maintenance Reconfigure or recompile shared services to refer to /usr/local for configuration files Easiest method for making maintainable services on the user segment of the filesystem is to use symlinks to replicate the directory from the shared filesystem. (See next slide) 7

An example service: Apache The complete Apache directory is created in the user segment with symlinks to the shared read-only components: /usr/local/apache/bin -> /usr/share/apache/bin/ /usr/local/apache/cgi-bin /usr/local/apache/conf /usr/local/apache/htdocs /usr/local/apache/icons /usr/local/apache/include -> /usr/share/apache/include/ /usr/local/apache/libexec -> /usr/share/apache/libexec/ /usr/local/apache/logs -> /var/log/apache/logs/ 8

User segment configuration Classes may require students to reconfigure services or run different versions than those provided All user services configs are stored in the writeable portion /usr/local/etc/rc.d/rc.local allows users to add their own services or override existing services Per-host unique SSH keys generated on the first boot and stored in /usr/local/etc/ Networking information is stored in virtual TAGs and extracted with hcp TAG QUERY 9

CMS user configuration Users are not allowed to log into CMS for their virtual machines All servers share a common 191 minidisk with the configuration files and profile exec profile exec queries user name on boot, and loads the config file defining owner, any custom init parameters, and network status 10

Managing virtual servers Customized login shell accepts service, shutdown, and restart commands sent to the console Able to control what pre-defined services boot. For example, apache is configured to load on all new servers, but jakarta and mysql are not. Able to restart a service without connecting to the server Able to open pre-defined ports in the firewall to their server only In the future, users will be able to request servers via the website 11

Managing virtual servers Web interface on the main server lets users control their servers 11

12 http://reason.marist.edu/ urmk/

Network access control Central server acts as router and DNS server for all user machines OSPF routing and IPtables firewalling to restrict ports to be consistent with our site policies for network access 13

User-Controlled services Users must be able to control services without being experts at Linux. Service control is through a seemingly complicated series of scripts and service machines, but has always worked reliably: 1. PHP modifies the service state database to reflect the request if server is being changed to always up or always down 2. PHP writes the request to a transaction file as USER SERVICENAME UP DOWN RESTART 3. Privileged daemon uses hcp to forward the request as a message to a PROP service machine 14

User-Controlled services 4. PROP validates the destination, and uses CP SEND to write the command to the users console 5. Modified login shell accepts the command, queries the central MySQL database for the location of the service scripts, and runs them 14

User-Controlled services This convoluted path is necessary to preserve the security of the system: 1. Users cannot directly issue hcp commands 2. Only the central control server can issue a service command 3. The central control server should not have VM privileges high enough to write to arbitrary consoles Latency of the complete system from web-click to service being changed is less than a second despite the many layers 15

Timed firewalls Security vs. Useability Users will need to open ports for special services, ie, classes writing java servers Ports can be opened in the firewall in hour-and-a-half durations Timed firewall ports fulfill two requirements: 1. Increased security by limiting exposure time 2. Increased user awareness of security by forcing users to manually chose to open a port Timed firewall entries are placed in mysql with user, end 16

Timed firewalls time, and port. MySQL is overkill, but it s already running 16

Creating machines This procedure will be replaced in the future with an automated system based off LDAP and the web interface. 1. Perl code on the controller accepts the user account and number of total systems requested 2. If the user already has as many machines as were requested, generation is aborted 3. Allocation script generates guest machine account names and creates database entries with the network information, ownership, etc 4. Created machines are written to transaction file 5. Transaction file is FTPd to VM 17

Creating machines 6. Set of REXX scripts are run against the transaction file to perform DIRMAINT, disk copy, guest LAN grants, and IPL 17

Starting and stopping machines Controller can request autolog and shutdown via messages to service machine Servers all run custom login shell which accepts a HALT command via the console no root password needed for service machine to start a shutdown On IPL, controller begins a staged IPL of all servers in the database flagged to boot automatically. Controller IPLs 15 at a time with a 30 second delay to minimize contention for the 191 shared minidisk When the controller is issued a HALT command by the operators, it begins a staged shutdown of all the virtual machines, again spaced to prevent massive contention for swap IO: 500 18

Starting and stopping machines servers with 64 meg of storage each can cause VM to stop responding for 10-15 minutes if they are all logged off at once Predates signal shutdown, but custom login shell provides a number of additional commands which are not available with shutdown signals 18

Monitoring user servers The central database tracks all the services which should be running on the student servers. A perl script running on the central server monitors and responds to: Unresponsive servers. After a server has been unresponsive for 5 minutes, a halt is sent to the console, and if it continues to be unresponsive, a CP FORCE logoff is sent Down servers. If a server is down, it will be restarted Down services. If a service is down, the service will be restarted via the same mechanism the web page administration system uses 19

Creating servers on demand In order to create servers quickly on demand in the future: 1. A pool of unassigned servers is kept in nolog status 2. Assign is carried out as normal to modify the controller database 3. A message is sent to a service machine to assign an empty server and IPL it at this point, the server is available and running for the user, with no more delay than a DIRM CHNGID 4. In the background, the service machine repopulates the pool of empty servers via FlashCopy or DDR 20

Making secure web interfaces Not directly related to the clone systems, but an important facet: How do you securely issue commands that require root privs? (ie, hcp). Running apache as root is BAD Giving apache access to run root-commands like hcp is also very bad. Directly handing user input, even sanitized user input, is usually bad. So what can be done? Transaction files PHP writes the command to a transaction file, ie: SERVICE 21

Making secure web interfaces LSURMK apache START A privileged daemon (which could be as simple as a bash script) detects the change either real-time via tail or at regular intervals The daemon parses the command - once removed from user input, plus a fixed language for commands is easy to validated. The daemon can then execute the command as root, with minimal delay, securely removed from an unprivileged webserver 21

Contact Email: Michael.Kershaw@marist.edu Download: http://reason.marist.edu/~urmk/ 22