Chapter 4: Threads

Motivation
- Most modern applications are multithreaded
- Threads run within an application
- Multiple tasks within the application can be implemented by separate threads:
  - Update display
  - Fetch data
  - Spell checking
  - Answer a network request
- Process creation is heavy-weight while thread creation is light-weight
- Threads can simplify code and increase efficiency
- Kernels are generally multithreaded

Thread of execution
- Thread = a single sequence of instructions
  - Pointed to by the program counter (PC)
  - Executed by the processor
- Conventional programming model and OS structure: single-threaded
  - One process == one thread
  - The OS scheduler decides which process gets to run

Multithreaded Server Architecture
[Figure: (1) a client sends a request to the server; (2) the server creates a new thread to service the request; (3) the server resumes listening for additional client requests]
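The light-weight creation claim above can be sketched with the Pthreads calls covered later in this chapter. This is a minimal illustration, not part of the original slides: `worker` and `spawn_and_join` are hypothetical names.

```c
#include <pthread.h>

/* Each thread runs this start routine; arg carries a small integer id.
 * Unlike fork(), pthread_create copies no address space, file table,
 * or page tables: it only sets up a new stack and register context. */
static void *worker(void *arg) {
    long id = (long)arg;
    return (void *)(id * 2);          /* value handed back to the joiner */
}

/* Spawn n threads (capped at 16), join them all, and add up their
 * return values. */
long spawn_and_join(int n) {
    pthread_t tids[16];
    long total = 0;
    if (n > 16)
        n = 16;
    for (long i = 0; i < n; i++)
        pthread_create(&tids[i], NULL, worker, (void *)i);
    for (long i = 0; i < n; i++) {
        void *ret;
        pthread_join(tids[i], &ret);  /* blocks until thread i terminates */
        total += (long)ret;
    }
    return total;
}
```

Compile with `-lpthread`; for example, `spawn_and_join(4)` collects 0+2+4+6 from the four workers.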
Benefits of Threads
- Example: a file server on a LAN
  - It needs to handle several file requests over a short period
  - Hence it is more efficient to create (and destroy) a single thread for each request
  - On an SMP machine, multiple threads can execute simultaneously on different processors
- Example 2: one thread displays a menu and reads user input while another thread executes user commands
- Benefits:
  - Responsiveness: may allow continued execution if part of the process is blocked; especially important for user interfaces
  - Resource sharing: threads share the resources of their process, which is easier than shared memory or message passing
  - Economy: cheaper than process creation; thread switching has lower overhead than context switching
  - Scalability: a process can take advantage of multiprocessor architectures
- Since threads within the same process share memory and files, they can communicate with each other without invoking the kernel
- It is therefore necessary to synchronize the activities of the various threads so that they do not obtain inconsistent views of the data (Chapter 5)

Multicore Programming
- Multicore and multiprocessor systems put pressure on programmers; challenges include:
  - Dividing activities
  - Balance
  - Data splitting
  - Data dependency
  - Testing and debugging
- Parallelism implies a system can perform more than one task simultaneously
- Concurrency supports more than one task making progress
  - On a single processor/core, the scheduler provides concurrency
Multicore Programming (Cont.)
- Types of parallelism:
  - Data parallelism: distributes subsets of the same data across multiple cores, with the same operation performed on each subset
  - Task parallelism: distributes threads across cores, each thread performing a unique operation
- As the number of threads grows, so does architectural support for threading
  - CPUs have cores as well as hardware threads
  - Consider the Oracle SPARC T4 with 8 cores and 8 hardware threads per core

Concurrency vs. Parallelism
[Figure: on a single-core system, concurrent execution interleaves T1, T2, T3, T4 on the one core over time; on a multi-core system, core 1 runs T1 and T3 while core 2 runs T2 and T4 in parallel]

Single and Multithreaded Processes
[Figure: a single-threaded process has its code, data, and files plus one set of registers and one stack; in a multithreaded process, the threads share code, data, and files, but each thread has its own registers and stack]
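The data-parallelism idea above can be sketched in Pthreads: every thread runs the same operation (summing) on its own slice of an array. This is an illustrative sketch, not from the slides; `slice`, `sum_slice`, and `parallel_sum` are names chosen here.

```c
#include <pthread.h>

#define NTHREADS 4

struct slice { const int *a; int lo, hi; long sum; };

/* Data parallelism: every thread performs the same operation on its
 * own subset of the data. */
static void *sum_slice(void *arg) {
    struct slice *s = arg;
    s->sum = 0;
    for (int i = s->lo; i < s->hi; i++)
        s->sum += s->a[i];
    return NULL;
}

long parallel_sum(const int *a, int n) {
    pthread_t tids[NTHREADS];
    struct slice sl[NTHREADS];
    int chunk = n / NTHREADS;

    for (int t = 0; t < NTHREADS; t++) {
        sl[t].a = a;
        sl[t].lo = t * chunk;
        /* the last slice also takes any remainder of the division */
        sl[t].hi = (t == NTHREADS - 1) ? n : (t + 1) * chunk;
        pthread_create(&tids[t], NULL, sum_slice, &sl[t]);
    }
    long total = 0;
    for (int t = 0; t < NTHREADS; t++) {
        pthread_join(tids[t], NULL);
        total += sl[t].sum;       /* combine the per-thread partial sums */
    }
    return total;
}
```

Task parallelism would instead give each thread a different start routine (e.g., one reading input while another executes commands, as in the earlier example).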
Multi-threaded model
- A thread is a subset of a process: a process contains one or more threads
- Threads share memory and open files
- BUT: each thread has a separate program counter, registers, and stack
- Shared memory includes the heap and global/static data
- There is no memory protection among the threads
- Preemptive multitasking: the operating system preempts and schedules threads, not processes
- A process is just a container for one or more threads
[Figure: process address space with text, data+bss, and heap segments shared by Thread 0 and Thread 1, plus one private stack per thread (stack 1, stack 2)]

Why is this good?
- Threads are more efficient
  - Much less overhead to create: no need to create a new copy of the memory space, file descriptors, etc.
- Sharing memory is easy (automatic)
  - No need to figure out inter-process communication mechanisms
- Threads take advantage of multiple CPUs just like processes
  - The program scales with an increasing number of CPUs
  - Takes advantage of multiple cores

Multi-threaded programming patterns
- Single-task thread: do a specific job and then release the thread
- Worker threads: a specific task for each worker thread; dispatch each task to the thread that handles it
- Thread pools: create a pool of threads a priori; use an existing thread to perform a task, and wait if no thread is available
  - Common model for servers

Sharing
- Threads share:
  - Text segment (instructions, including shared libraries)
  - Data segment (static and global data)
  - BSS segment (uninitialized data)
  - Inter-process communication mechanisms: shared memory, message queues, etc.
  - Open file descriptors
  - Signals
  - Current working directory
  - User and group IDs
- Threads do not share:
  - Thread ID
  - Saved registers, stack pointer, instruction pointer
  - Stack (local variables, temporary variables, return addresses)
  - Signal mask (allows a signal to be restricted to one or more threads)
  - Scheduling information (e.g., priority)
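The sharing rules above can be made concrete: a global variable lives in the shared data segment and is visible to every thread, while locals live on each thread's private stack. This is a sketch added for illustration; `shared_counter`, `bump`, and `demo_sharing` are names invented here.

```c
#include <pthread.h>

int shared_counter = 0;        /* data segment: one copy, seen by every thread */

static void *bump(void *arg) {
    int local = *(int *)arg;   /* 'local' lives on this thread's private stack */
    shared_counter += local;   /* no race here: single writer, joined before read */
    return NULL;
}

int demo_sharing(void) {
    pthread_t tid;
    int amount = 5;
    pthread_create(&tid, NULL, bump, &amount);
    pthread_join(tid, NULL);   /* join orders the write before our read */
    return shared_counter;     /* update is visible: no pipes, no shm setup */
}
```

Note that the ease of sharing is exactly why synchronization (Chapter 5) matters: with two concurrent writers, the unguarded `+=` above would be a data race.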
Implementation
- Process info (the Process Control Block) contains one or more Thread Control Blocks (TCBs):
  - Thread ID
  - Saved registers
  - Other per-thread info (signal mask, scheduling parameters)

Thread States
- Three key states: running, ready, blocked
- Threads have no suspend state because all threads within the same process share the same address space
  - Indeed, suspending (i.e., swapping out) a single thread would involve suspending all threads of the same process
- Termination of a process terminates all threads within the process

Scheduling
- A thread-aware operating system scheduler schedules threads, not processes
- A process is just a container for one or more threads

Scheduling Challenges
- A context switch among threads of different processes is more expensive:
  - Flush cache memory (or have memory with process tags)
  - Flush the virtual memory TLB (or have a tagged TLB)
  - Replace the page table pointer in the memory management unit
- CPU affinity: rescheduling a thread onto a different CPU is more expensive
  - The CPU's cache may hold memory used by the thread
  - Try to reschedule the thread onto the same processor on which it last ran
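On Linux, an application can even request affinity explicitly rather than relying on the scheduler's heuristic. The following is a hedged sketch using the Linux-specific `pthread_setaffinity_np` call (the `_np` suffix means "non-portable"); `pin_self_to_cpu0` is a name chosen for this example.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

/* Pin the calling thread to CPU 0 so the scheduler keeps placing it on
 * the core whose cache it has already warmed up.
 * Returns 0 on success, an error number otherwise. */
int pin_self_to_cpu0(void) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(0, &set);    /* allow only CPU 0 in the affinity mask */
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}
```

In practice most programs leave affinity to the kernel, which already tries to honor the "same processor it last ran on" preference described above.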
Kernel-level vs. User-level Threads
- Kernel-level: threads supported by the operating system
  - The OS handles scheduling, creation, and synchronization
- User-level: a library with code for thread creation, termination, and scheduling
  - The kernel sees one execution context: one process
  - Cannot take advantage of multiple cores
  - May or may not be preemptive
- Advantages of user-level threads:
  - Low cost: user-level thread operations do not require switching to the kernel
  - Scheduling algorithms can be replaced easily and customized to the application
  - Greater portability
- Disadvantages:
  - If a thread is blocked, all threads of the process are blocked
  - Every system call needs an asynchronous counterpart
  - Cannot take advantage of multiprocessing

User-Level Threads (ULT)
- The kernel is not aware of the existence of threads
- All thread management is done by the application using a thread library
- Thread switching does not require kernel-mode privileges (no mode switch)
- Scheduling is application specific

Threads library
- Contains code for:
  - creating and destroying threads
  - passing messages and data between threads
  - scheduling thread execution
  - saving and restoring thread contexts
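The "saving and restoring thread contexts" that a ULT library performs can be sketched with the (obsolescent but still available on Linux/glibc) `ucontext` API: switching is just a register save/restore in user space, with no kernel involvement. This is an illustrative sketch, not how any particular library is implemented; `ult_body` and `run_ult` are names invented here.

```c
#include <ucontext.h>

static ucontext_t main_ctx, thr_ctx;
static char thr_stack[64 * 1024];   /* private stack for the user-level thread */
static int steps = 0;

/* A user-level "thread": swapcontext saves our registers and restores
 * the scheduler's, entirely in user mode (no mode switch). */
static void ult_body(void) {
    steps++;                           /* first "time slice" */
    swapcontext(&thr_ctx, &main_ctx);  /* yield back to the "scheduler" */
    steps++;                           /* second "time slice" */
}

int run_ult(void) {
    steps = 0;
    getcontext(&thr_ctx);                  /* initialize the context */
    thr_ctx.uc_stack.ss_sp = thr_stack;
    thr_ctx.uc_stack.ss_size = sizeof(thr_stack);
    thr_ctx.uc_link = &main_ctx;           /* where to go when the body returns */
    makecontext(&thr_ctx, ult_body, 0);

    swapcontext(&main_ctx, &thr_ctx);      /* dispatch: run until the first yield */
    swapcontext(&main_ctx, &thr_ctx);      /* resume: run to completion */
    return steps;
}
```

Because all of this happens inside one kernel-visible process, the kernel cannot run `ult_body` and `run_ult` on different cores, which is exactly the ULT limitation noted above.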
Kernel activity for ULTs
- The kernel is not aware of thread activity, but it still manages process activity
- When a thread makes a blocking system call, the whole process is blocked, but for the thread library that thread is still in the running state
- So thread states are independent of process states

Advantages and inconveniences of ULT
- Advantages:
  - Thread switching does not involve the kernel: no mode switching
  - Scheduling can be application specific: choose the best algorithm
  - ULTs can run on any OS; only a thread library is needed
- Inconveniences:
  - Most system calls are blocking and the kernel blocks processes, so all threads within the process will be blocked
  - The kernel can only assign processes to processors; two threads within the same process cannot run simultaneously on two processors

Kernel-Level Threads (KLT)
- All thread management is done by the kernel
- No thread library, but an API to the kernel's thread facility
- The kernel maintains context information for the process and its threads
- Switching between threads requires the kernel
- Scheduling is done on a per-thread basis
- Examples: Windows NT and OS/2

Advantages and inconveniences of KLT
- Advantages:
  - The kernel can simultaneously schedule many threads of the same process on many processors
  - Blocking is done at the thread level
  - Kernel routines can themselves be multithreaded
- Inconveniences:
  - Thread switching within the same process involves the kernel: there are 2 mode switches per thread switch
  - This results in a significant slowdown
You can have both
- User-level thread library on top of multiple kernel threads:
  - 1:1 kernel threads only (1 user thread = 1 kernel thread)
  - N:1 user threads only (N user threads on 1 kernel thread per process)
  - N:M hybrid threading (N user threads on M kernel threads)

Combined ULT/KLT Approaches
- Thread creation is done in user space
- The bulk of scheduling and synchronization of threads is done in user space
- The programmer may adjust the number of KLTs
- May combine the best of both approaches
- Example: Solaris

Pthreads: POSIX Threads
- POSIX.1c, Threads extensions (IEEE Std 1003.1c-1995)
- Defines an API for managing threads
- Linux: Native POSIX Thread Library
- Also available on Solaris, Mac OS X, NetBSD, FreeBSD
- API library on top of Win32
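A typical use of the Pthreads API combines thread management with the synchronization this chapter defers to Chapter 5. The sketch below (names `add` and `run_counters` are invented here) uses a mutex so that concurrent increments of a shared counter are not lost.

```c
#include <pthread.h>

#define NTHREADS 8
#define ITERS    10000

static long counter = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Without the mutex, the read-modify-write of the shared counter could
 * interleave across threads and lose updates; the lock serializes it. */
static void *add(void *arg) {
    (void)arg;
    for (int i = 0; i < ITERS; i++) {
        pthread_mutex_lock(&lock);
        counter++;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

long run_counters(void) {
    pthread_t tids[NTHREADS];
    counter = 0;
    for (int t = 0; t < NTHREADS; t++)
        pthread_create(&tids[t], NULL, add, NULL);
    for (int t = 0; t < NTHREADS; t++)
        pthread_join(tids[t], NULL);
    return counter;            /* exactly NTHREADS * ITERS with the lock held */
}
```

On an N:M or 1:1 implementation these eight threads really can run on different cores at once, which is precisely when the mutex becomes essential.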