First Five Minutes on a System. What to do and why

Similar documents
3.5 Inch TFT Display

Linux Manually Mount External Hard Drive Ntfs 3g Could Not

Linux Manually Mount External Hard Drive Ntfs-3g Could Not

Getting Started with Linux Development

Getting Started with Linux

To understand this, let's build a layered model from the bottom up. Layers include: device driver filesystem file

1. Set up the storage to allow access to the LD(s) by the server following the NEC storage user guides.

Manual File System Check Linux Command Line

Recover Deleted Files With Scalpel

PiCloud. Building owncloud on a Raspberry PI

rsync link-dest Local, rotated, quick and useful backups!

Practical 5. Linux Commands: Working with Files

Linux Files and the File System

How To Resize ext3 Partitions Without Losing Data

Windows Method Using Linux Live CD and Gparted

Adafruit's Raspberry Pi Lesson 6. Using SSH

Exploring the system, investigating hardware & system resources

CloudFleet Documentation

Unix System Architecture, File System, and Shell Commands

Linux Hardware Management. Linux System Administration COMP2018 Summer 2017

CS370 Operating Systems

CS197U: A Hands on Introduction to Unix

BASH SCRIPT TIPS. function jumpto { label=$1 cmd=$(sed -n "/$label:/{:a;n;p;ba};" $0 grep -v ':$') eval "$cmd" exit }

UNIVERSITY OF BOLTON CREATIVE TECHNOLOGIES COMPUTING WITH WEBSITE DEVELOPMENT SEMESTER ONE EXAMINATIONS 2018/2019 UNIX MODULE NO: CPU5003

1 Installation (briefly)

Full file at

Linux Development Getting Started

High-Availability Storage with GlusterFS on CentOS 7 - Mirror across two storage servers

COS 318: Operating Systems. File Systems. Topics. Evolved Data Center Storage Hierarchy. Traditional Data Center Storage Hierarchy

Linux+ Guide to Linux Certification, Third Edition

How To Reinstall Grub In Windows 7 Without Losing Data And Programs

Course 55187B Linux System Administration

Lab E2: bypassing authentication and resetting passwords

LPIC-1 Exam 101, Part 1 of 2, version 4.0

The Ultimate Linux/Windows System

CST8177 Linux II. Linux Boot Process

Exam LFCS/Course 55187B Linux System Administration

UNIX File Hierarchy: Structure and Commands

CPSC 457 OPERATING SYSTEMS FINAL EXAM

CompTIA Linux+/LPIC-1 COPYRIGHTED MATERIAL

PL-I Assignment Broup B-Ass 5 BIOS & UEFI

Fedora Core: Made Simple

CIS 191A Final Exam. Fall CIS 191 Final Exam

This is Worksheet and Assignment 12. Disks, Partitions, and File Systems

#Uncomment the second line to enable any form of FTP write command. #write_enable=yes

Resizing Virtual Appliances (Debian) in VirtualBox for Windows Shutdown your Virtual Box Machin before you start the following steps

Question No: 1 In capacity planning exercises, which tools assist in listing and identifying processes of interest? (Choose TWO correct answers.

*nix Crash Course. Presented by: Virginia Tech Linux / Unix Users Group VTLUUG

Why You Should Not Use Arch

Read-only rootfs. Theory and practice. Chris Simmonds. Embedded Linux Conference Europe Read-only rootfs 1 Copyright , 2net Ltd

At course completion. Overview. Audience profile. Course Outline. : 55187B: Linux System Administration. Course Outline :: 55187B::

Ubuntu installation alongside windows 8/8.1 and 10

iscsi storage is used as shared storage in Redhat cluster, VMware vsphere, Redhat Enterprise Virtualization Manager, Ovirt, etc.

Linux Systems Administration Getting Started with Linux

Lecture 2: The file system

Installing caos with Cinch on Floppy Disk

Deploying VNFs Using AutoVNF

strace Practical Application Troubleshooting Tuesday, February 19, 13

This lab exercise is to be submitted at the end of the lab session! passwd [That is the command to change your current password to a new one]

Using a Raspberry Pi to Remote Access a Windows Computer

PASS4TEST IT 인증시험덤프전문사이트

Likes in Linux. (Yes, open source licenses exist, but I am not talking about those since they are generally userfriendly.)

Oracle Tablespaces, etc.: Managing the Disk Resource

Ubuntu Manual Disk Partitioning Guide

Linux Essentials Objectives Topics:

Alpine Linux Documentation

Amahi Instruction Manual

Filesystem Hierarchy and Permissions

Virtual File System. Don Porter CSE 306

Davide Cavaliere 18 th February 2017

MINI-HOWTO backup and/or restore device or partition using zsplit/unzsplit

ECE590 Enterprise Storage Architecture Homework #2: Drives and RAID

Linux Howtos. Fedora 9 Install (114) CIS Fall Fedora 9 Install (114) Fedora 9 installation with custom partitions.

"Charting the Course... MOC B: Linux System Administration. Course Summary

The Wonderful World of Services VINCE

More Raspian. An editor Configuration files Shell scripts Shell variables System admin

Connection Broker Where Virtual Desktops Meet Real Business. Installing Leostream Connect on HP Thin Clients

Do A Manual System Restore On Windows 8 Hp

Disks, Filesystems 1

Adding a block devices and extending file systems in Linux environments

Use the Reiser4 file system under Linux rev.2

File system implementation

File system implementation

Quick Start Guide for BeagleBone. Table of Contents. by Brian Fraser Last update: Sept 24, 2017

Tuesday 6th October Agenda

Linux Performance Tuning

CompTIA Linux Course Overview. Prerequisites/Audience. Course Outline. Exam Code: XK0-002 Course Length: 5 Days

NASA Lab. Partition/Filesystem/Bootloader. TinRay, Yu-Chuan

DEDICATED SERVERS WITH EBS

Linux Kung-Fu. James Droste UBNetDef Fall 2016

CompTIA Exam LX0-103 CompTIA Linux+ [Powered by LPI] Exam 1 Version: 6.0 [ Total Questions: 120 ]

CST8207 GNU/Linux O/S I Disks and Partitions

Installation of the OS

Format Hard Drive Using Windows 7 Recovery Disk

Linux Filesystems Ext2, Ext3. Nafisa Kazi

A guide to correct PostgreSQL setup and management

Linux Essentials. Smith, Roderick W. Table of Contents ISBN-13: Introduction xvii. Chapter 1 Selecting an Operating System 1

Linux OS Fundamentals for the SQL Admin. Anthony E. Nocentino

Disks, Filesystems, Booting Todd Kelley CST8177 Todd Kelley 1

Week 10 Project 3: An Introduction to File Systems. Classes COP4610 / CGS5765 Florida State University

Transcription:

First Five Minutes on a System What to do and why

Guidelines and Ideas No hard rules Started writing at 4:21pm Jan 21

Please feel free to POLITELY comment

** means a real life example with names removed for reasons

Server vs Desktop vs Router vs Device vs Other Server might have lots of services Desktop might have overly tweaked configs Router might be out of logging space Device might have corrupt storage medium Different actions depending on scope. There is not generic item that will instantly report what issues might exist. Understand the environment and business need for the device. More than one issue could be present and distract you from seeing the whole picture. ** Raspberry PI - Over logging and partition full, SD card is over write limit.

Motivation We all get escalations Maybe it is a family computer, maybe it is a production service. For whatever reason we want to fix the issue and often don t have time or resources to toss more hardware at the problem. Using Linux we can do a great deal of validation that is non-destructive. Some Operating Systems have very cumbersome tools that make troubleshooting an issue. Example might be a corrupt registry on a Microsoft Windows OS. ** 20 years of accounting is on here and this is the only backup.

The real world You will never be asked to look at something configured well. The desktops or servers that we are asked to review are often barely standing or insecure. If you want to enable an architecture that requires no trouble shooting we shall have another talk. Great systems are a lot of work. http://sts.ono.at/blog/2012/02/01/a-systemspolicy/ ** It must be a hacker our systems are perfect.

What to do first after smelling for smoke? Regardless of what is broken there are some simple things to check. df date dmesg ** I think it is the cat-warmer2000 service that is causing all the trouble, oh, wait out of disk space.

df (-h means human readable) What mounts are mounted? lathama@lappy7:~$ df -h Are any full? Filesystem Size Used Avail Use% Mounted on Are any surprisingly empty? Are any missing? /dev/sda1 113G 11G 96G 11% / udev 10M 0 10M 0% /dev tmpfs 1.6G 18M 1.5G 2% /run Any storage space left to work with? tmpfs 3.8G 124M 3.7G 4% /dev/shm tmpfs 5.0M 4.0K 5.0M 1% /run/lock tmpfs 3.8G 0 3.8G 0% /sys/fs/cgroup tmpfs 769M 8.0K 769M 1% /run/user/1000

df Part 2 If unsure, compare to mount Does it look sane? ** mount causes a kernel crash, run, just run. lathama@lappy7:~$ mount sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime) proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) udev on /dev type devtmpfs (rw,relatime,size=10240k,nr_inodes=981721,mode=755) devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000) tmpfs on /run type tmpfs (rw,nosuid,relatime,size=1574484k,mode=755) /dev/sda1 on / type ext4 (rw,relatime,errors=remount-ro,data=ordered) <snip>

date Having an incorrect date is a huge issue. This could make things fail like DHCP, NFS, Web Browsing, TLS/SSL and many many more. lathama@lappy7:~$ date Thu Jan 21 16:33:40 CST 2016 lathama@lappy7:~$ ntpq -p remote refid st t when poll reach delay offset jitter ============================================================================== bisesa.kakaopor 74.117.214.2 2 u 1 64 1 58.894-0.407 0.537 resolver2.level 10.67.8.18 3 u - 64 1 36.579-2.468 0.078 ntp.newfxlabs.c 128.9.176.30 2 u 4 64 0 0.000 0.000 0.000

** We set it to that date to avoid the license fee

dmesg or kernel ring buffer An in memory log that survives in memory in the case that the system logging functions are not working, started or other. Output Way to noisey to paste here. Everyone read the manual page as new features to this old tool make it much better ** The HDD is fine, that kernel error is a mistake

Solid ground, now we can get to work uptime free -h top You want to see 0-60 days, anything else is just scary Updates happen, or they should RAM NOM NOM NOM << Cookie Monsters cousin Data Dragon Load average is not a percentage - can be 4000.00 swap usage shown cd / Look for cores and maybe /var/cores on some systems ** Our 1000 days of uptime means we know what we are doing (system does not survive a reboot)

Logs are your friends Do a ls -lha /var/log and see if there are recent logs, and validate they are current with date. If the file descriptors are not up to today there is a larger issue. ** Custom log configure to implement magical unicorn fart features disabling all logs.

Validating Packages - First need of root or sudo!!! Manual changes, security concerns or other, simple solution apt-get install --reinstall $(dpkg --get-selections grep -v deinstall) or yum reinstall $(yum list installed awk '{print $1}') ** All our packages are up-to-date and we trust bob who repackages all the Distro packages to fix their stupid mistakes.

Tools to not use.. Some tools are not commonly installed on all systems. So it can be frustrating to rely on them. It is best to learn to use the lowest common denominator to troubleshoot systems with. Tools like sar, mpstat, iostat, and pidstat from Netflix s article are not installed on my laptop for example. ** We use an inhouse developed validation tool written in COBOL and we paid a lot of money for it so use it...

Defense of the stupid user. Who is accessing the system? Who has accessed the system recently? Who should not be accessing the system right now? passwd, w, last, who, wall and many many more. Resetting all users passwords is ALWAYS an option and should never be ignored as an option (unless directory service). ** Live tracking a user who was fired deleting files on a system in the 90s

Wall wall System is going down for reboot in 2 minutes. ^^^^^^^^ This will find the user 95% of the time you will hear a WTF or grunt...

5 minutes has been up for a while... You should have Determined if the system is on fire. Documented problem Acknowledge that the system owner is not willing to do the right thing