Kernel Services CIS 657
System Processes in Traditional Unix
Three processes are created at startup time:
- init (process 1)
    - user-mode
    - administrative tasks (keeps getty running; shutdown)
    - ancestor of all of your processes
    - still in FreeBSD
- swapper (process 0)
    - kernel process
    - moves entire processes between main memory and secondary storage
- pagedaemon (process 2)
    - kernel process
    - moves parts of processes in and out
Kernel Processes in FreeBSD 5.x
- idle: runs when there are no other ready processes
- swapper: moves processes from secondary storage to main memory
- vmdaemon: moves processes from main memory to secondary storage
- pagedaemon: writes portions of a process's address space to storage
- pagezero: supplies zero-filled pages
- bufdaemon: supplies clean buffers (by writing out dirty buffers)
Kernel Processes in FreeBSD 5.x (II)
- syncer: ensures dirty file data is written within 30 seconds
- ktrace: logs system call trace records to a log file
- vnlru: maintains a supply of free vnodes (LRU)
- random: seeds kernel random numbers and /dev/random
- g_event: handles dynamic devices
- g_up: moves data from device drivers to processes
- g_down: moves data from processes to device drivers
Note on Init and Logging In
- init keeps a getty process running on each terminal port (tty)
- getty initializes the port and waits for a login name (the login: prompt)
- getty reads a string and execs login
- login prompts for the password, performs one-way encryption, and compares values
- if successful, login sets the user id and execs a shell
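The password comparison that login performs can be sketched as follows. This is a minimal illustration, not the real login code: salted SHA-256 stands in for the one-way function (historical Unix used crypt(3)), and `PASSWD_DB`, `one_way`, and `login` are hypothetical names.

```python
import hashlib

def one_way(password: str, salt: str) -> str:
    """Stand-in one-way function (assumption: real systems use crypt(3))."""
    return hashlib.sha256((salt + password).encode()).hexdigest()

# Hypothetical password database: name -> (salt, stored hash, uid, shell)
PASSWD_DB = {"alice": ("x7", one_way("s3cret", "x7"), 1001, "/bin/sh")}

def login(name: str, password: str):
    """Return (uid, shell) on success, None on failure."""
    entry = PASSWD_DB.get(name)
    if entry is None:
        return None
    salt, stored, uid, shell = entry
    # Compare the hashed values; the cleartext password is never stored.
    if one_way(password, salt) != stored:
        return None
    # A real login would now set the user id and exec the shell.
    return uid, shell
```

Only the hashed value is stored, so reading the database does not directly reveal the password.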
Run-Time Organization
Top Half
- per-process stack
- library of shared code
- maintains process structure (always resident)
- maintains user structure (can be swapped out)
- division between process/user dependent on memory
- never preempted for another process (but can yield the processor)
- can block interrupts by setting processor priority level (see discussion on bottom half)
Run-Time Organization II
Bottom Half
- handles hardware interrupts
- asynchronous activities (with respect to the Top Half)
- special kernel stack (there might not be a process running)
- Top and bottom half coordinate around work queues using mutexes
- top half starts I/O requests, waits for bottom half to finish
Entry Into the Kernel
- Hardware interrupt
    - I/O device (disks, network cards, etc.)
    - clock (used for scheduling, time of day)
- Hardware trap
    - divide by 0, illegal memory reference
- Software-initiated trap
    - system call
Entry Into the Kernel II
- First, kernel must save machine state. Why?
- Example sequence
    - hardware switches to kernel mode
    - hardware pushes the PC, PSW, and trap info onto the per-process kernel stack
    - an additional asm routine saves all other state that the hardware doesn't
    - kernel calls a C routine: the handler
Entry Into the Kernel III
- Handlers for each kind of entry:
    - syscall() for a system call
    - trap() for hardware traps
    - interrupt handlers for devices
- Each kind of handler takes specific parameters (e.g., syscall number, exception frame; or the unit number for an interrupt).
Return From Kernel
- asm routine restores registers and user stack pointer (it undoes what the companion asm routine did)
- hardware restores the stored PC, PSW, etc. (undoes what it did on the way in)
- execution resumes at the next instruction in the user process
Software Interrupts
- Used as a low-priority processing mechanism in the kernel
- Hardware interrupts have high priority
    - Can put work in work queues (cf. network)
    - When high-priority work is done, low-priority software interrupt does the rest
- might be a real interrupt, might be a flag checked in the kernel (architecture-dependent)
- can be preempted by another hardware interrupt
Priority Levels in FreeBSD 5.2
- High-priority hardware interrupt creates work for lower-priority software interrupt
    - Work queues
- Software interrupt routines run at lower priority than device drivers, but higher than user processes
- Hardware interrupt > software interrupt > user
Example: Network Packets
- Device driver (hardware interrupt) takes packets from network, puts them in a work queue; controller re-enabled
- Software interrupt handler moves packets from work queues to destination processes
- Thus, delivery to processes doesn't block packets coming in from the network
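The split between the two handlers can be sketched as a producer and a consumer around one queue. This is a user-level simulation only; `work_queue`, `delivered`, and both function names are hypothetical, and no real interrupt priorities are involved.

```python
from collections import deque

work_queue = deque()   # shared between the two handlers
delivered = {}         # port -> payloads handed to the destination process

def hardware_interrupt(packet):
    """Driver side: do the minimum, queue the packet, return quickly."""
    work_queue.append(packet)

def software_interrupt():
    """Lower-priority side: drain the queue and deliver each packet."""
    while work_queue:
        port, payload = work_queue.popleft()
        delivered.setdefault(port, []).append(payload)

# Three packets arrive back to back before any delivery work is done.
hardware_interrupt((80, "GET /"))
hardware_interrupt((22, "ssh hello"))
hardware_interrupt((80, "GET /img"))
software_interrupt()
```

Because the driver only appends to the queue, a burst of arrivals is absorbed cheaply and the slower delivery work runs afterwards.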
What FreeBSD 4.4 Did for x86
- The cpl variable holds the current priority level
- Various macros are defined to set the cpl to a new level (e.g. spl0(), splx(), spltty())
- Interrupts at lower levels are masked (but not really: a pending bit is set and the majority of the work handling the interrupt is deferred)
- See /usr/src/sys/i386/isa/ipl_funcs.c
- Homework: find out (and submit) what FreeBSD 5.2.1 does (you may work with your lab partner).
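The "masked, but not really" behaviour can be sketched with a pending set. This is a simplified model under stated assumptions, not the code in ipl_funcs.c: priority levels are plain integers, `splraise` and `interrupt` are hypothetical helpers, and only the name splx() comes from the slide.

```python
cpl = 0            # current priority level (0 = nothing masked)
pending = set()    # levels whose interrupt arrived while masked
log = []           # order in which handlers actually ran

def interrupt(level):
    """An interrupt at `level` arrives."""
    if level <= cpl:
        pending.add(level)   # masked: set the pending bit and return
    else:
        log.append(level)    # not masked: handle it immediately

def splraise(level):
    """Raise cpl and return the previous value (hypothetical helper)."""
    global cpl
    old, cpl = cpl, max(cpl, level)
    return old

def splx(level):
    """Restore cpl, then run deferred interrupts that are now unmasked."""
    global cpl
    cpl = level
    for lvl in sorted(pending, reverse=True):   # highest level first
        if lvl > cpl:
            pending.discard(lvl)
            log.append(lvl)

old = splraise(5)   # enter a critical section
interrupt(3)        # at or below cpl: deferred
interrupt(7)        # above cpl: runs right away
splx(old)           # leaving the section runs the deferred level-3 work
```

In this run the level-7 handler executes before the level-3 one, even though the level-3 interrupt arrived first.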
Clock Interrupts
- The system clock interrupts, or ticks, at regular intervals (usually 100 Hz).
- Interrupt handler calls the hardclock() routine, which must run quickly
    - running for more than one tick will miss the next interrupt, causing the time-of-day clock to skew
    - lower-priority devices (network devices, disk controllers) cannot be serviced while hardclock() is running
- Non-critical clock functions are handled by softclock()
The Four Clocks
- Hardclock: hardware timer, 100 Hz.
- Softclock: handles non-critical timing work
- Profclock: profiling clock (collects process performance information), 1024 Hz.
- Statclock: collects system statistics, 128 Hz.
Clock Interrupts: What hardclock() does
- check for an interval timer on the currently running process
- increment the time of day
- do the job of profclock() if there is no separate profiling clock
- do the job of statclock() if there is no separate clock for statistics gathering
- call softclock() directly if the cpl is low (saves the overhead of a software interrupt that would just do that when hardclock() returns)
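The per-tick steps listed above can be sketched as one function that records what it did. This is an illustrative model only: the `state` dictionary, the action strings, and the "post a software interrupt otherwise" branch are assumptions, not the kernel's actual data structures.

```python
def hardclock(state, has_profclock=True, has_statclock=True):
    """Simulate one clock tick; return the actions taken, in order."""
    actions = []
    proc = state["curproc"]
    # 1. Interval timer on the currently running process, if any.
    if proc is not None and proc["itimer"] > 0:
        proc["itimer"] -= 1
        if proc["itimer"] == 0:
            actions.append("timer expired: signal process")
    # 2. Advance the time of day by one tick.
    state["time_ticks"] += 1
    actions.append("time advanced")
    # 3. Stand in for clocks the hardware does not provide separately.
    if not has_profclock:
        actions.append("did profclock work")
    if not has_statclock:
        actions.append("did statclock work")
    # 4. Run softclock now if nothing is masked, else defer it
    #    (the deferred branch is an assumption of this sketch).
    if state["cpl"] == 0:
        actions.append("called softclock directly")
    else:
        actions.append("posted softclock software interrupt")
    return actions

state = {"curproc": {"itimer": 1}, "time_ticks": 0, "cpl": 0}
result = hardclock(state, has_profclock=False)
```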
Statistics
- Historically, hardclock() collected resource utilization statistics and forced context switches
- Problems (see McCanne & Torek)
    - potential for inaccurate measurement of CPU utilization
    - inaccurate profiling
- Use semi-randomized sampling with a second clock (the stat clock, see statclock())
    - charge the current process with a tick; if it has four, recalculate priority
    - record what the system was doing at time of tick
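The two statclock() duties can be sketched as below. This is a toy model: "if it has four" is read here as "on every fourth tick charged", and `recalc_priority` uses a placeholder formula that is an assumption, not the FreeBSD scheduler's.

```python
from collections import Counter

system_states = Counter()   # what the system was doing at each sample

def recalc_priority(proc):
    # Placeholder policy (assumption): more CPU use gives a numerically
    # larger, i.e. worse, priority.
    proc["priority"] = 50 + proc["cpu_ticks"] // 4

def statclock(proc, mode):
    """One statistics tick: record the system state and charge the process."""
    system_states[mode] += 1
    if proc is None:            # idle: nobody to charge
        return
    proc["cpu_ticks"] += 1
    if proc["cpu_ticks"] % 4 == 0:
        recalc_priority(proc)

p = {"cpu_ticks": 0, "priority": 50}
for mode in ["user", "user", "sys", "user", "idle"]:
    statclock(p if mode != "idle" else None, mode)
```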
Softclock()
- Handles events in the callout queue, such as:
    - timeouts (real-time timer)
    - retransmission of dropped network packets
    - monitoring of some peripherals that require polling
    - process scheduling
- Scheduler would/should be called every second in a perfect world; uses an interval timer to run 1 second after it last finished
- Discussion: why not do scheduling in hardclock()?
Callout Queue
- A circular list of n queues; sorted by time of event (in ticks)
- Each queue sorted in time order (soonest first)
- Pointer to current queue (now) moves around circular list
- hardclock() advances pointer, and softclock() runs when the lead item in the new queue has time 0.
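Example Callout Queue (200 queues in example)
[Figure: circular list of queues labeled now, now + 1, now + 2, now + 3, ..., now + 199; queued events shown include f(x), g(y), f(z), and h(x), with event times such as now, now + 2, and now + 200.]
Question: does f(z) happen this cycle, or next?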
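A circular callout list of this kind can be sketched as follows. This is a minimal model, not the kernel's implementation: it stores each event's absolute expiry tick instead of a countdown that reaches 0, and `CalloutWheel`, `schedule`, and `tick` are hypothetical names.

```python
class CalloutWheel:
    def __init__(self, nqueues):
        self.queues = [[] for _ in range(nqueues)]
        self.now = 0   # absolute tick count; now % nqueues is the current queue

    def schedule(self, delay, name):
        """Queue `name` to fire `delay` ticks from now."""
        when = self.now + delay
        queue = self.queues[when % len(self.queues)]
        queue.append((when, name))
        queue.sort()   # keep each queue in time order, soonest first

    def tick(self):
        """Advance the pointer (hardclock), then fire due events (softclock)."""
        self.now += 1
        queue = self.queues[self.now % len(self.queues)]
        fired = []
        while queue and queue[0][0] <= self.now:
            fired.append(queue.pop(0)[1])
        return fired

wheel = CalloutWheel(4)
wheel.schedule(2, "h")   # due at tick 2, lands in queue 2
wheel.schedule(6, "f")   # due at tick 6, also lands in queue 2
history = [wheel.tick() for _ in range(6)]
```

In this sketch, two events that share a queue but differ by a full revolution fire on different passes, because only the lead items whose time has arrived are removed.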
Memory Management
- Two kinds of executable files in BSD Unix
    - interpreted
    - compiled (directly executed)
- First 16 bits in a file contain a magic number telling what kind of file it is.
    - #! indicates an interpreted file; interpreter must be directly executable (#!/bin/sh is the most common)
    - Other magic numbers indicate whether the file can be paged and whether the text is sharable.
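Checking the leading bytes of a file can be sketched as below. The function names are hypothetical; the `#!` test uses the first 16 bits as described above, while the ELF check uses the 4-byte ELF magic, which is longer than the 16-bit magic numbers of older formats.

```python
def classify(header: bytes) -> str:
    """Classify an executable from the first bytes of the file."""
    if header[:2] == b"#!":
        return "interpreted"
    if header[:4] == b"\x7fELF":
        return "compiled (ELF)"
    return "unknown"

def interpreter(first_line: bytes) -> str:
    """Extract the interpreter path from a #! line."""
    return first_line[2:].split()[0].decode()
```

For example, `classify(b"#!/bin/sh\n")` reports an interpreted file, and `interpreter(b"#!/bin/sh\n")` yields the path the kernel would then have to execute directly.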
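FreeBSD Process Layout
Process address space, from high addresses (0xfff00000) down to low (0x00000000):
- Special stuff...
- User stack
- shared libs
- heap
- bss
- initialized data
- text
Executable file, from the end of the file back to the start:
- Symbol table
- Initialized data
- text
- elf header
- elf magic number
Notes:
- bss and stack are 0-filled
- most of the process is demand paged into memory (Ch. 5)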
What Was That Special Stuff?
- Per-process kernel stack
- Red zone
- User area
- ps_strings struct
- Signal code
- Env strings
- argv strings
- Env pointers
- argc
- argv pointers
Notes:
- argv, argc, envp contain arguments and environment
- signal code is used by the kernel to deliver signals
- ps_strings is used by ps to locate the argv of a process
Timing Services
- Real time: gettimeofday() returns the time since 1 Jan 1970 in UTC (the Epoch)
- adjtime() allows one to tweak the clock
    - keeps multiple machines close enough
    - response to normal clock skew
    - give a delta argument; speed up or slow down the counted microseconds per clock tick by 10% until delta is reached
- Time is reported in microseconds
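A short worked calculation shows how long an adjtime() correction takes under the 10% rule. This is an arithmetic sketch only, assuming a 100 Hz clock (10,000 microseconds per tick); `ticks_to_slew` is a hypothetical helper, not a system call.

```python
def ticks_to_slew(delta_us, hz=100, rate_percent=10):
    """Ticks needed to absorb delta_us when each tick is stretched or
    shrunk by rate_percent of its nominal length."""
    tick_us = 1_000_000 // hz                  # nominal microseconds per tick
    per_tick = tick_us * rate_percent // 100   # correction applied per tick
    return -(-abs(delta_us) // per_tick)       # ceiling division

ticks = ticks_to_slew(50_000)   # correct a 50 ms error
seconds = ticks / 100           # wall-clock time at 100 Hz
```

With these assumptions each tick absorbs 1,000 microseconds, so a 50 ms error takes 50 ticks, or half a second, and the clock never jumps.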
Interval Timers
Each process gets three interval timers:
- real: decrements in real time; SIGALRM; run from the timeout queue maintained by softclock()
- profiling: decrements only when the process runs, but tracks both user and kernel-mode execution; SIGPROF; checked by profclock()
- process virtual: decrements only while the process is executing in user mode; SIGVTALRM; checked by profclock()
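The difference between the three timers can be sketched by replaying a timeline of ticks. This is a simulation, not a use of setitimer(); it assumes the virtual timer counts only user-mode execution, and `run` and the timeline labels are hypothetical.

```python
SIGNALS = {"real": "SIGALRM", "virtual": "SIGVTALRM", "prof": "SIGPROF"}

def run(timers, timeline):
    """timeline has one entry per tick: 'user', 'kernel', or 'off'
    (process not running). Returns [(tick, signal)] in delivery order."""
    timers = dict(timers)   # work on a copy
    delivered = []
    for t, state in enumerate(timeline, start=1):
        counts = {
            "real": True,                         # always counts down
            "prof": state in ("user", "kernel"),  # whenever the process runs
            "virtual": state == "user",           # user mode only
        }
        for name in ("real", "virtual", "prof"):
            if counts[name] and timers[name] > 0:
                timers[name] -= 1
                if timers[name] == 0:
                    delivered.append((t, SIGNALS[name]))
    return delivered

signals = run({"real": 3, "virtual": 3, "prof": 3},
              ["user", "off", "kernel", "user", "user"])
```

All three timers start at 3 ticks, yet they expire at different times because each counts a different subset of the timeline.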
User, Group, Other Identifiers
- User ID (uid): 32-bit identifier for all processes of each user, set by administrator
- Group ID (gid): 32-bit identifier. Many users in one group; many groups for each user
- Root: uid 0, gid 0.
- These identifiers are checked on file access
Permission Checks
Checked in order (F = file, P = process):
- If UID_F == UID_P, use the owner permission bits
- If UID_F != UID_P, but GID_F ∈ GID_P, use the group permission bits
- If UID_F != UID_P, and GID_F ∉ GID_P, use the other permission bits
Recall the discussion last time of how uid and gid are set on login.
Can I own a file that I can't read?
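The ordered check can be sketched as two small functions. This is an illustrative model with hypothetical names; it treats the process's groups as a set and ignores the special handling of root.

```python
def permission_class(file_uid, file_gid, proc_uid, proc_gids):
    """Pick which permission bits apply; only the first match is used."""
    if file_uid == proc_uid:
        return "owner"
    if file_gid in proc_gids:
        return "group"
    return "other"

def may_access(mode, cls, want):
    """Test one of 'r', 'w', 'x' in the bit triple selected by cls."""
    shift = {"owner": 6, "group": 3, "other": 0}[cls]
    bit = {"r": 4, "w": 2, "x": 1}[want]
    return bool((mode >> shift) & bit)

# A file with mode 044: group and other may read, the owner may not.
cls = permission_class(file_uid=1001, file_gid=20, proc_uid=1001, proc_gids={20})
owner_can_read = may_access(0o044, cls, "r")
```

Because the check stops at the first matching class, the owner here is judged by the owner bits alone, even though the group bits would have allowed the read.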
Rights Amplification
- Users may need temporary write access on files (e.g. passwd)
- setuid() does this
    - changes effective user id
    - real user id stays the same
    - effective uid also saved
- seteuid() changes only the effective user id
- setgid() used to work like setuid()
    - now just puts effective gid into 0th element of array
Effects of Syscalls on UIDs

Action        Real  Effective  Saved
Exec-normal   R     R          R
Exec-setuid   R     S          S
Seteuid(R)    R     R          S
Seteuid(S)    R     S          S
Seteuid(R)    R     R          S
Exec-normal   R     R          R

R = Real UID, S = Special-privilege UID
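The transitions in the table can be replayed with a small state machine. This is a sketch under stated assumptions: `Creds` and its methods are hypothetical names, exec is modeled as "a setuid file sets the effective uid, then the saved uid is copied from the effective uid", and root's extra privileges are ignored.

```python
class Creds:
    def __init__(self, real):
        self.real = self.effective = self.saved = real

    def do_exec(self, setuid_owner=None):
        """Exec a program; setuid_owner is the file owner if it is setuid."""
        if setuid_owner is not None:
            self.effective = setuid_owner
        self.saved = self.effective

    def seteuid(self, uid):
        """Switch the effective uid, but only to the real or saved uid."""
        if uid not in (self.real, self.saved):
            raise PermissionError("uid is neither the real nor the saved uid")
        self.effective = uid

    def ids(self):
        return (self.real, self.effective, self.saved)

c = Creds("R")
trace = []
for step in [lambda: c.do_exec(), lambda: c.do_exec("S"),
             lambda: c.seteuid("R"), lambda: c.seteuid("S"),
             lambda: c.seteuid("R"), lambda: c.do_exec()]:
    step()
    trace.append(c.ids())
```

The saved uid is what lets a setuid program drop its privilege with seteuid(R) and later regain it with seteuid(S); after the final normal exec that ability is gone.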