Saving Your Bacon Recovering From Common Linux Startup Failures

Similar documents
Best practices with SUSE Linux Enterprise Server Starter System and extentions Ihno Krumreich

Saving Real Storage with xip2fs and DCSS. Ihno Krumreich Project Manager for SLES on System z

Essentials. Johannes Meixner. about Disaster Recovery (abbreviated DR) with Relax-and-Recover (abbreviated ReaR)

Linux and z Systems in the Datacenter Berthold Gunreben

Exploring History with Hawk

How To Make Databases on SUSE Linux Enterprise Server Highly Available Mike Friesenegger

Protect your server with SELinux on SUSE Linux Enterprise Server 11 SP Sander van Vugt

Docker Networking In OpenStack What you need to know now. Fawad Khaliq

The Linux IPL Procedure

SUSE Manager in Large Scale 17220

Using Linux Containers as a Virtualization Option

Build with SUSE Studio, Deploy with SUSE Linux Enterprise Point of Service and Manage with SUSE Manager Case Study

Novell SLES 10/Xen. Roadmap Presentation. Clyde R. Griffin Manager, Xen Virtualization Novell, Inc. cgriffin at novell.com.

Cloud in a box. Fully automated installation of SUSE Openstack Cloud 5 on Dell VRTX. Lars Everbrand. Software Developer

SUSE OpenStack Cloud. Enabling your SoftwareDefined Data Center. SUSE Expert Days. Nyers Gábor Trainer &

Chapter Two. Lesson A. Objectives. Exploring the UNIX File System and File Security. Understanding Files and Directories

SUSE Linux Enterprise Kernel Back to the Future

SUSE Linux Enterprise High Availability Extension

SUSE Manager Roadmap OS Lifecycle Management from the Datacenter to the Cloud

Open Enterprise & Open Community

Filesystem Hierarchy Operating systems I800 Edmund Laugasson

RocketRAID 2310/2300 Controller Fedora Linux Installation Guide

Managing Linux Servers Comparing SUSE Manager and ZENworks Configuration Management

Welcome to getting started with Ubuntu Server. This System Administrator Manual. guide to be simple to follow, with step by step instructions

SaltStack and SUSE Systems and Configuration Management that Scales and is Easy to Extend

Unix System Architecture, File System, and Shell Commands

Provisioning with SUSE Enterprise Storage. Nyers Gábor Trainer &

Samba HA Cluster on SLES 9

BOV89296 SUSE Best Practices Sharing Expertise, Experience and Knowledge. Christoph Wickert Technical Writer SUSE /

Chapter 6. Boot time configuration. Chapter 6 Boot time configuration

List of Linux Commands in an IPm

Linux Installation Planning

Manage Directories and Files in Linux. Objectives. Understand the Filesystem Hierarchy Standard (FHS)

Hands-on with Native Linux Containers (LXC) Workbook

Linux Essentials. Programming and Data Structures Lab M Tech CS First Year, First Semester

Define Your Future with SUSE

Linux High Availability on IBM z Systems

Secure Authentication

SUSE Manager and Salt

Welcome to SUSE Expert Days 2017 Service Delivery with DevOps

Course 55187B Linux System Administration

Chapter 6. Linux File System

Linux Systems Administration Getting Started with Linux

SUSE An introduction...

Introduction to Linux. Roman Cheplyaka

CREATION OF A MINIMAL STAND ALONE RTAI SYSTEM ================================================

SICOOB. The Second Largest Linux on IBM System z Implementation in the World. Thiago Sobral. Claudio Kitayama

Novell Infiniband and XEN

INTRODUCTION TO LINUX

If you don't care about how it works but you just would like that it works read here. Other wise jump to the next chapter.

LiLo Crash Recovery. 1.0 Preparation Tips. 2.0 Quick Steps to recovery

Expert Days SUSE Enterprise Storage

Disks, Filesystems 1

Building a Secure and Compliant Cloud Infrastructure. Ben Goodman Principal Strategist, Identity, Compliance and Security Novell, Inc.

Linux Files and the File System

Introduction to Software Defined Infrastructure SUSE Linux Enterprise 15

At course completion. Overview. Audience profile. Course Outline. : 55187B: Linux System Administration. Course Outline :: 55187B::

"Charting the Course... MOC B: Linux System Administration. Course Summary

Outline. Cgroup hierarchies

THE HONG KONG POLYTECHNIC UNIVERSITY Department of Electronic and Information Engineering

Exploring the High Availability Storage Infrastructure. Tutorial 323 Brainshare Jo De Baer Technology Specialist Novell -

Filesystem Hierarchy and Permissions

Chapter-3. Introduction to Unix: Fundamental Commands

Introduction to Unix: Fundamental Commands

TestOut Linux Pro - English 4.0.x OBJECTIVE MAPPING: CompTIA Linux+ LX0-103

CSE 303 Lecture 2. Introduction to bash shell. read Linux Pocket Guide pp , 58-59, 60, 65-70, 71-72, 77-80

Getting Started with Linux

Getting your department account

Outline. Cgroup hierarchies

Chapter 7. Getting Started with Boot from Volume

File System Hierarchy Standard (FHS)

Building Customized Linux Kernels A live demonstration. Mark Post August 17, 2004 Session # 9280

Filesystem Hierarchy and Permissions

Exam LFCS/Course 55187B Linux System Administration

Disks, Filesystems Todd Kelley CST8177 Todd Kelley 1

Presented by Bill Genske Gary Jackson

Server Consolidation with Xen Farming

FC-FCoE Adapter Inbox Driver Update for Linux Kernel 2.6.x. Table of Contents

The kernel constitutes the core part of the Linux operating system. Kernel duties:

Introduction to Linux Part I: The Filesystem Luca Heltai

SUSE Advanced Troubleshooting: The Boot Process Lecture

LPIC-1 System Administrator

QLogic QLA4010/QLA4010C/QLA4050/QLA4050C/ QLA4052C/QMC4052/QLE4060C/QLE4062C iscsi Driver for Linux Kernel 2.6.x.

Using Crowbar to Deploy Your OpenStack Cloud. Adam Spiers Vincent Untz John H Terpstra

Installation of the OS

Too Many Metas A high level look at building a metadata desktop. Joe Shaw

Question No: 1 In capacity planning exercises, which tools assist in listing and identifying processes of interest? (Choose TWO correct answers.

GNU/Linux 101. Casey McLaughlin. Research Computing Center Spring Workshop Series 2018

Overview LEARN. History of Linux Linux Architecture Linux File System Linux Access Linux Commands File Permission Editors Conclusion and Questions

PowerVM Lx86 for x86 Linux Applications Administration Guide

Basic Shell Commands. Bok, Jong Soon

Accessing LINUX file systems from CMS. Metropolitan VM Users Association January 24, 2005

Insight Control Server Provisioning Capturing and Installing SUSE Enterprise Linux 12 System Images

Systems Programming/ C and UNIX

Embedded System Design

Unix Filesystem. January 26 th, 2004 Class Meeting 2

UNIX System Programming Lecture 3: BASH Programming

Introduction Into Linux Lecture 1 Johannes Werner WS 2017

Embedded Linux Systems. Bin Li Assistant Professor Dept. of Electrical, Computer and Biomedical Engineering University of Rhode Island

G54ADM Sample Exam Questions and Answers

Transcription:

Saving Your Bacon Recovering From Common Linux Startup Failures Mark Post Novell, Inc. Friday, August 12, 2011 Session Number 10105

Agenda How the boot process is supposed to work What things can go wrong How you can get things running quickly How to fix things so the next IPL will work How to prevent similar issues in the first place Questions I will take questions during the presentation, unless time gets short. 2

How the boot process is supposed to work

How the boot process is supposed to work Hardware (or z/vm) reads the bootstrap record from disk The bootstrap record has entries in it pointing to The kernel plus the kernel parameters The initial ram disk (initrd) The kernel is read into memory and control is passed to its entry point Soon, like other architectures, it will be compressed on disk, so the kernel will decompress itself first. The kernel parameters are passed to the kernel by the boot loader. They get stuffed into a pre-determined address within the kernel itself. 4

How the boot process is supposed to work (2) The kernel goes through it's initialization process, which includes parsing the kernel parameters Parameters the kernel doesn't recognize as being for itself are ignored. All are put in /proc/cmdline. The initrd is read in, decompressed and unpacked The file system from the initrd is mounted as the (temporary) root file system The kernel finds the userspace program in the initrd to handle the rest of the boot process By default this is init but kernel parameters can specify something else, e.g., /bin/bash 5

How the boot process is supposed to work (3) The init program begins execution. In SLES, this is a bash script. The init program does all sorts of things. The ones we're interested here are: Parsing parameters of interest in /proc/cmdline > Some of these are for kernel modules to be loaded, others are not directly related to the kernel or its modules. Loading kernel modules for various hardware with appropriate parms Finding and mounting the permanent root file system Starting execution of the real init program 6

What things can go wrong

What could possibly go wrong? Ooh, where do we start? The boot loader can't find both the kernel and the initrd When the kernel tries to uncompress and mount the initrd, it's not valid The initrd doesn't contain all the kernel modules you need to activate all your DASD and your NIC Your root file system is on an LVM logical volume, but the initrd doesn't have that support built in Your root file system is not on an LVM logical volume, but the initrd does have that support built in 8

What could possibly go wrong? (2) The scripts in the initrd don't activate all the DASD volumes you need Your root file system is on a network server, but the initrd can't initialize your NIC You've edited /etc/fstab and now there's a typo in it You don't remember the root password You get a message that the init program can't be found None of your kernel modules can be found You've told the kernel to ignore the address of the console. 9

How you can get things running quickly

We're Baaaaack If you're going to edit a file, make a backup first. If you don't like that idea, you're going to hate learning ed or sed Have a small rescue system available that you can use to access the DASD or SCSI devices of a system with problems. Use the kernel and initrd from the installation media to boot from the z/vm card reader, or the HMC in the case of an LPAR. Activate the DASD needed for the root file system and mount it for fs in dev sys proc; do mount bind /$fs /mnt/$fs; done mount bind /proc/mounts /mnt/etc/mtab chroot /mnt 11

Understand at what point in the boot process various resources get activated E.g., Network interface initialization happens after the initrd is replaced by the real root file system DASD or SCSI device(s) needed by the root file system are brought online in the initrd Know how to dynamically activate and bring online DASD LVM volume groups and logical volumes QETH devices (NICs for the most part, but can be FCP adapters) SAN WWPNs and LUNs 12

Set up the Terminal Server guest, and configure all the target systems to be able to receive those connections. Requires a current enough SLES10 or SLES11 SP1 system Keep in mind that if your root file system can't get mounted, the target system won't be able to receive connections because the code for that lives in the root file system. 13

How I normally set up my file systems # df -h Filesystem Size Used Avail Use% Mounted /dev/dasda1 388M 246M 123M 67% / /dev/mapper/vg1-home 97M 4.2M 88M 5% /home /dev/mapper/vg1-opt 74M 4.1M 66M 6% /opt /dev/mapper/vg1-tmp 291M 43M 234M 16% /tmp /dev/mapper/vg1-usr 2.6G 1.8G 678M 73% /usr /dev/mapper/vg1-var 392M 298M 78M 80% /var 14

The mkinitrd is supposed to figure out what device type the root file system is on (including network-based), and include the necessary pieces for that in the initrd. Sometimes there are bugs that prevent this from working right. On my SLES11 SP1 system, the only kernel modules that in the initrd are: dasd_eckd_mod.ko dasd_mod.ko ext3.ko jbd.ko mbcache.ko 15

Commands available in the initrd bin: awk echo ln mv sleep usleep bash grep logger rm test warpclock cat ipconfig ls run-init touch chmod ipconfig.sh mkdir sed true cp kill mknod sh umount date linuxrc mount showconsole uname sbin: blogd fsck insmod reboot udevd dasd_configure fsck.ext3 killall5 resume dasdinfo halt modprobe showconsole dasdview ifup pidof udevadm 16

boot: 01-devfunctions.sh 02-start.sh 03-storage.sh 04-udev.sh 05-blogd.sh 05-clock.sh 11-block.sh 11-dasd.sh 21-devinit_done.sh 81-resume.userspace.sh 82-resume.kernel.sh 83-mount.sh 84-remount.sh 91-createfb.sh 91-killblogd.sh 91-killudev.sh 91-shell.sh 92-killblogd2.sh 93-boot.sh 17

How to fix things so the next IPL will work

Use YaST to define the resource, or Use the command line functions provided dasd_configure qeth_configure zfcp_host_configure zfcp_disk_configure Either way, the proper udev rules will be written into the /etc/udev/rules.d/ files. If in doubt, re-run mkinitrd and zipl. Check to make sure the modules necessary to get the root file system mounted are included. 19

How to prevent similar issues in the first place

Prevention, Worth a Pound... Use YaST Yeah, I know, that can be a pain at times. But it usually causes fewer problems than humans. > It's also partly why a lot of YaST functions can be called from scripts. After updating /etc/fstab, do a mount -a command and look for any errors. Really know and understand what is in your initrd, and what is supposed to be there. Unpack the initrd into a temporary directory and poke around mkdir tempdir cd tempdir zcat /boot/initrd cpio -ivmd 21

Prevention (2) Understand that what goes into the initrd is only those things needed to get the root file system mounted. Anything beyond that is unnecessary, and in some cases can cause startup failures. Don't assume that the way things were done in a prior release will be done the same way now. Read the release notes. For every service pack. Yes, I mean it. No, I'm not kidding. zipl.conf updates not needed for DASD adds/deletes /etc/sysconfig/hardware versus /etc/udev/rules.d 22

Questions 23

Unpublished Work of Novell, Inc. All Rights Reserved. This work is an unpublished work and contains confidential, proprietary, and trade secret information of Novell, Inc. Access to this work is restricted to Novell employees who have a need to know to perform tasks within the scope of their assignments. No part of this work may be practiced, performed, copied, distributed, revised, modified, translated, abridged, condensed, expanded, collected, or adapted without the prior written consent of Novell, Inc. Any use or exploitation of this work without authorization could subject the perpetrator to criminal and civil liability. General Disclaimer This document is not to be construed as a promise by any participating company to develop, deliver, or market a product. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. Novell, Inc. makes no representations or warranties with respect to the contents of this document, and specifically disclaims any express or implied warranties of merchantability or fitness for any particular purpose. The development, release, and timing of features or functionality described for Novell products remains at the sole discretion of Novell. Further, Novell, Inc. reserves the right to revise this document and to make changes to its content, at any time, without obligation to notify any person or entity of such revisions or changes. All Novell marks referenced in this presentation are trademarks or registered trademarks of Novell, Inc. in the United States and other countries. All third-party trademarks are the property of their respective owners.