TDDI04 Concurrent Programming, Operating Systems, and Real-time Operating Systems

Process Synchronization
[SGG7] Chapter 6

Copyright Notice: The lecture notes are mainly based on Silberschatz's, Galvin's and Gagne's book ("Operating System Concepts", 7th ed., Wiley, 2005). No part of the lecture notes may be reproduced in any form, due to the copyrights reserved by Wiley. These lecture notes should only be used for internal teaching purposes at Linköping University.

Acknowledgment: The lecture notes are originally compiled by C. Kessler, IDA.

Klas Arvidsson, IDA, Linköpings universitet.

Contents
- The Critical-Section Problem
    - Disable interrupts for small critical sections and 1 CPU
    - Peterson's Solution for 2 processes
- Hardware Support for Synchronization
    - Atomic TestAndSet, Atomic Swap
- Semaphores and Locks
- Why and How to Avoid Busy Waiting
- Classic Problems for Studying Synchronization
    - Bounded Buffer, Readers/Writers, Dining Philosophers
- Monitors
    - Condition variables

Concerns both processes sharing memory and threads (in the following we only write "processes", for brevity).

TDDI04, K. Arvidsson, IDA, Linköpings universitet.

Background
- Concurrent access to shared data may result in data inconsistency
- ... and nondeterminism: the result may depend on the relative speed of the processes involved ("race conditions")
- Maintaining data consistency requires mechanisms to ensure the orderly execution of cooperating processes
- Consider a process producing data to be read by another process (the producer-consumer problem).
    - This can be realized with a shared memory buffer and some variables keeping track of total size, used size (count), read position and write position.
    - The count variable is shared by both processes:
        - Initially, count is set to 0.
        - The producer increments count after it produces a new item.
        - The consumer decrements count after it consumes an item.
    - The remaining variables can be read-only or modified by only one process.
Producer/Consumer
- Shared integer variable count, initialized to 0, keeps track of the number of items in the buffer
- [Figure: the producer writes into the buffer at position in; the consumer reads from the buffer at position out]

Producer:
    while (true) {
        // produce an item and put it in nextproduced
        while (count == BUFFER_SIZE)
            ;   // do nothing
        buffer[in] = nextproduced;
        in = (in + 1) % BUFFER_SIZE;
        count++;
    }

Consumer:
    while (true) {
        while (count == 0)
            ;   // do nothing
        nextconsumed = buffer[out];
        out = (out + 1) % BUFFER_SIZE;
        count--;
        // consume the item in nextconsumed
    }

Race Conditions lead to Nondeterminism
- count++ could be implemented in machine code as
    39: register1 = count          : load  r1, _count
    40: register1 = register1 + 1  : add   r1, r1, #1
    41: count = register1          : store _count, r1
- count-- could be implemented as
    22: register2 = count
    23: register2 = register2 - 1
    24: count = register2
- NOT ATOMIC!
- Consider this execution interleaving, with count = 5 initially:
    39: producer executes register1 = count          { register1 = 5 }
    40: producer executes register1 = register1 + 1  { register1 = 6 }
        (context switch)
    22: consumer executes register2 = count          { register2 = 5 }
    23: consumer executes register2 = register2 - 1  { register2 = 4 }
    41: producer executes count = register1          { count = 6 }
    24: consumer executes count = register2          { count = 4 }
- The final value count = 4 is wrong: one increment and one decrement should leave count = 5.
- Compare to a different interleaving, e.g., 39, 40, 41, switch, 22, 23, 24, which yields count = 5.

Critical Section
- Critical Section: A set of instructions, operating on shared data or resources, that should be executed by a single process without interruption
    - Atomicity of execution
    - Mutual exclusion: At most one process should be allowed to operate inside at any time
    - Consistency: inconsistent intermediate states of shared data are not visible to other processes outside
- May consist of different program parts for different processes
    - Ex: the producer and consumer code accessing count together form 1 critical section
- General structure, with structured control flow:
    ...
    Entry of critical section C
        critical section C: operation on shared data
    Exit of critical section C
    ...
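The entry/exit structure above can be sketched for the shared count variable with a POSIX mutex. This is a minimal sketch, not the course's reference solution: the names count_lock, worker and run_counter_demo are hypothetical, and the two threads stand in for the producer (delta +1) and the consumer (delta -1).

```c
#include <pthread.h>
#include <stddef.h>

static long count = 0;                          /* shared data */
static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;

struct job { long iterations; int delta; };     /* hypothetical helper type */

static void *worker(void *arg) {
    const struct job *j = arg;
    for (long k = 0; k < j->iterations; k++) {
        pthread_mutex_lock(&count_lock);        /* entry of critical section */
        count += j->delta;                      /* load, add, store as one unit */
        pthread_mutex_unlock(&count_lock);      /* exit of critical section */
    }
    return NULL;
}

/* Runs one incrementing and one decrementing thread and returns count. */
long run_counter_demo(long iterations) {
    struct job inc = { iterations, +1 };
    struct job dec = { iterations, -1 };
    pthread_t producer, consumer;

    count = 0;
    pthread_create(&producer, NULL, worker, &inc);
    pthread_create(&consumer, NULL, worker, &dec);
    pthread_join(producer, NULL);
    pthread_join(consumer, NULL);
    return count;
}
```

With the lock in place the result is 0 for any interleaving. Removing the lock and unlock calls reintroduces the race between the load, add and store steps.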
Requirements on a Solution to the Critical-Section Problem
1. Mutual Exclusion: If process P_i is executing in its critical section C, then no other process can be executing in its critical section C (accessing the same shared data/resource).
2. Progress: If no process is executing in its critical section C and there exist some processes that wish to enter their critical section C, then the selection of the process that will enter C next cannot be postponed indefinitely.
3. Bounded Waiting: A bound must exist on the number of times that other processes are allowed to enter their critical sections after a process has made a request to enter its critical section and before that request is granted.
- Assume that each process executes at a nonzero speed
- No assumption concerning the relative speed of the N processes

A First Attempt
- For 2 processes P_0, P_1
- Assumes that Load and Store execute atomically
- Shared variables: boolean flag[2] = { false, false };
    - flag[i] == TRUE means that P_i is ready to enter its critical section, for i = 0, 1
- Code for process P_i (notation: j = 1 - i):
    do {
        flag[i] = TRUE;
        while (flag[j])
            ;
        // critical section
        flag[i] = FALSE;
        // remainder section
    } while (TRUE);
- Satisfies mutual exclusion, but not the progress requirement:
    - P0 sets flag[0] = true; context switch; P1 sets flag[1] = true; both processes now loop forever

Peterson's Solution
- For 2 processes P_0, P_1
- Assumes that Load and Store instructions execute atomically
- Shared variables: int turn; bool ready[2] = { false, false };
    - The variable turn indicates whose turn it is to enter the critical section.
    - ready[i] == true means that process P_i is ready to enter, for i = 0, 1.
- Code for process P_i (j = 1 - i):
    do {
        ready[i] = TRUE;
        turn = j;
        while (ready[j] && turn == j)
            ;   /* busy wait */
        // CRITICAL SECTION
        ready[i] = FALSE;
        // REMAINDER SECTION
    } while (TRUE);

Hardware Support for Synchronization
- Most systems provide hardware support for protecting critical sections
- Uniprocessors could disable interrupts
    - Currently running code would execute without preemption
    - Generally too inefficient on multiprocessor systems
    - Operating systems using this are not broadly scalable
- Modern machines provide special atomic instructions
    - TestAndSet: test memory word and set value atomically
    - Swap: swap contents of two memory words atomically
    - Atomic = non-interruptible
    - If multiple TestAndSet or Swap instructions are executed simultaneously (each on a different CPU in a multiprocessor), then they take effect sequentially in some arbitrary order.

TestAndSet Instruction
- Definition in pseudocode (the whole body executes atomically):
    boolean TestAndSet (boolean *target)
    {
        boolean rv = *target;
        *target = TRUE;
        return rv;   // return the OLD value
    }

Swap Instruction
- Definition in pseudocode (the whole body executes atomically):
    void Swap (boolean *a, boolean *b)
    {
        boolean temp = *a;
        *a = *b;
        *b = temp;
    }
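Peterson's solution assumes atomic loads and stores that take effect in program order. On current compilers and multiprocessors plain variables do not guarantee this, so the following sketch uses C11 atomics with their default sequentially consistent ordering. The names peterson_enter, peterson_exit and run_peterson_demo are hypothetical, and the sched_yield() call inside the busy-wait loop is an addition that only keeps the demo fast when both threads share one CPU.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

static atomic_bool ready[2];     /* ready[i]: P_i wants to enter */
static atomic_int turn;          /* whose turn it is to enter */
static long shared_counter;      /* plain data protected by the protocol */
static long iterations;          /* demo parameter (assumption) */

static void peterson_enter(int i) {
    int j = 1 - i;
    atomic_store(&ready[i], true);
    atomic_store(&turn, j);
    while (atomic_load(&ready[j]) && atomic_load(&turn) == j)
        sched_yield();           /* busy wait; yielding is an addition */
}

static void peterson_exit(int i) {
    atomic_store(&ready[i], false);
}

static void *process(void *arg) {
    int i = *(int *)arg;
    for (long k = 0; k < iterations; k++) {
        peterson_enter(i);
        shared_counter++;        /* critical section */
        peterson_exit(i);
        /* remainder section */
    }
    return NULL;
}

/* Runs P_0 and P_1 for n rounds each and returns the shared counter. */
long run_peterson_demo(long n) {
    pthread_t t[2];
    int id[2] = { 0, 1 };

    iterations = n;
    shared_counter = 0;
    atomic_store(&ready[0], false);
    atomic_store(&ready[1], false);
    for (int i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, process, &id[i]);
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);
    return shared_counter;
}
```

If mutual exclusion holds, no increment is lost and the function returns 2 * n.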
TestAndSet using atomic swap
- Definition using atomic swap:
    boolean TestAndSet (boolean *target)
    {
        boolean rv = TRUE;
        Swap(&rv, target);   // atomic swap, equivalent to:
                             //   boolean tmp = *target; *target = rv; rv = tmp;
        return rv;           // return the OLD value
    }

Solution using TestAndSet
- Shared boolean variable lock, initialized to FALSE (= unlocked)
    do {
        while (TestAndSet(&lock))
            ;   // do nothing but spinning on the lock (busy waiting)
        // critical section
        lock = FALSE;
        // remainder section
    } while (TRUE);

Solution using Swap
- Shared Boolean variable lock, initialized to FALSE
- Each process has a local Boolean variable tmp
    do {
        tmp = TRUE;
        while (tmp == TRUE)
            Swap(&lock, &tmp);   // busy waiting
        // critical section
        lock = FALSE;
        // remainder section
    } while (TRUE);

Semaphore
- Semaphore S: shared integer variable
- Two standard atomic operations to modify S: wait() and signal(), originally called P() and V()
- Provides mutual exclusion
- Both operations must be atomic:
    wait (S) {
        while (S <= 0)
            ;   // busy waiting
        S--;
    }

    signal (S) {
        S++;
    }

Semaphores and Locks
- Counting semaphore: integer value, increment/decrement
- Binary semaphore: value in {0, 1}, resp. {FALSE, TRUE}; can be simpler to implement (e.g., using TestAndSet)
    - Also known as (mutex) locks
    - Called spinlocks if using busy waiting
    - Can implement a counting semaphore S using a lock
- Provides mutual exclusion:
    Semaphore S;   // initialized to 1
    wait (S);
        // critical section code
    signal (S);

Semaphore Implementation using a Lock
- Must guarantee that no two processes can execute wait(S) and signal(S) on the same semaphore S at the same time
- Thus, the implementation becomes the critical-section problem where the wait and signal code are placed in the critical section.
    - Protect this critical section by a mutex lock L.
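As a sketch of the two ideas above, the spinlock below uses C11's atomic_flag_test_and_set as a stand-in for the TestAndSet instruction, and a busy-waiting counting semaphore protects its wait and signal code with that lock. The type and function names (spinlock, spin_semaphore, spin_sem_wait, ...) are hypothetical and not taken from the lecture or from any library.

```c
#include <stdatomic.h>

/* Spinlock: the flag plays the role of the shared boolean "lock". */
typedef struct { atomic_flag locked; } spinlock;

void spin_init(spinlock *l)   { atomic_flag_clear(&l->locked); }

void spin_lock(spinlock *l) {
    while (atomic_flag_test_and_set(&l->locked))
        ;                               /* spin: old value was TRUE */
}

void spin_unlock(spinlock *l) { atomic_flag_clear(&l->locked); }

/* Counting semaphore whose wait/signal bodies form a critical section
   guarded by the lock L (here: guard). Still uses busy waiting. */
typedef struct {
    spinlock guard;
    int value;
} spin_semaphore;

void spin_sem_init(spin_semaphore *s, int initial) {
    spin_init(&s->guard);
    s->value = initial;
}

void spin_sem_wait(spin_semaphore *s) {
    for (;;) {
        spin_lock(&s->guard);
        if (s->value > 0) {             /* a unit is available: take it */
            s->value--;
            spin_unlock(&s->guard);
            return;
        }
        spin_unlock(&s->guard);         /* let a signaller in, then retry */
    }
}

void spin_sem_signal(spin_semaphore *s) {
    spin_lock(&s->guard);
    s->value++;
    spin_unlock(&s->guard);
}
```

The guard is released before retrying in spin_sem_wait; holding it while waiting for value to become positive would lock out every signaller.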
Busy Waiting?
- Up to now, we used busy waiting in critical section implementations (spinning on semaphore / lock)
    - Mostly OK on multiprocessors (disabling interrupts is costly)
    - OK if the critical section is very short and rarely occupied
    - But accesses shared memory frequently
- For longer critical sections or high congestion: waste of CPU time
    - As applications may spend lots of time in critical sections, this is not a good solution.

Eliminate Busy Waiting
- With each semaphore there is an associated waiting queue.
- Each entry in a waiting queue contains:
    - Process table index, e.g. pid
    - Pointer to next entry
- Two operations:
    - block: place the process invoking the operation on the appropriate waiting queue.
    - wakeup: remove one of the processes in the waiting queue and place it in the ready queue.

Semaphore Implementation w/o Busy Waiting
    typedef struct {
        int value;
        struct process *wqueue;
    } semaphore;

    void wait ( semaphore *S ) {
        S->value--;
        if (S->value < 0) {
            add this process to S->wqueue;
            block();   // I release the lock on the critical section for S
                       // and release the CPU
        }
    }

    void signal ( semaphore *S ) {
        S->value++;
        if (S->value <= 0) {
            remove a process P from S->wqueue;
            wakeup (P);   // append P to ready queue
        }
    }

Problems: Deadlock and Starvation
- Deadlock: two or more processes are waiting indefinitely for an event that can be caused only by one of the waiting processes
- Typical example: nested critical sections, guarded by semaphores S and Q, initialized to 1 (time runs downward):
        P0               P1
        wait (S);        wait (Q);
        wait (Q);        wait (S);
        ...              ...
        signal (S);      signal (Q);
        signal (Q);      signal (S);
- Starvation: indefinite blocking. A process may never be removed from the semaphore queue in which it is suspended.
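A blocking semaphore can be sketched in user space with a Pthreads mutex and condition variable. This is an illustration under stated assumptions, not the kernel-level implementation from the slide: the condition variable's internal queue plays the role of wqueue, pthread_cond_wait corresponds to block() and pthread_cond_signal to wakeup(), and value is kept non-negative instead of counting waiters with negative numbers. The names blocking_sem, bsem_wait, bsem_signal and bsem_handoff_demo are hypothetical.

```c
#include <pthread.h>
#include <stddef.h>

typedef struct {
    int value;                      /* never negative in this variant */
    pthread_mutex_t lock;           /* protects value */
    pthread_cond_t nonzero;         /* waiting queue of blocked threads */
} blocking_sem;

void bsem_init(blocking_sem *s, int initial) {
    s->value = initial;
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->nonzero, NULL);
}

void bsem_wait(blocking_sem *s) {
    pthread_mutex_lock(&s->lock);
    while (s->value == 0)           /* sleep instead of spinning */
        pthread_cond_wait(&s->nonzero, &s->lock);  /* releases lock while blocked */
    s->value--;
    pthread_mutex_unlock(&s->lock);
}

void bsem_signal(blocking_sem *s) {
    pthread_mutex_lock(&s->lock);
    s->value++;
    pthread_cond_signal(&s->nonzero);   /* wake one waiter, if any */
    pthread_mutex_unlock(&s->lock);
}

static void *signaller(void *arg) {
    bsem_signal(arg);
    return NULL;
}

/* The caller blocks on an empty semaphore until a second thread signals.
   Returns the semaphore value afterwards. */
int bsem_handoff_demo(void) {
    blocking_sem s;
    pthread_t t;

    bsem_init(&s, 0);
    pthread_create(&t, NULL, signaller, &s);
    bsem_wait(&s);                  /* returns only after the signal */
    pthread_join(t, NULL);
    return s.value;
}
```

The while loop around pthread_cond_wait re-checks the condition after every wakeup, which is required because Pthreads permits spurious wakeups.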
Classical Problems of Synchronization
- Bounded-Buffer Problem
- Readers and Writers Problem
- Dining-Philosophers Problem

Bounded-Buffer Problem
- [Figure: the producer writes into the buffer at position in; the consumer reads from the buffer at position out]
- Buffer with space for N items
- Semaphore mutex, initialized to the value 1
    - Protects accesses to out, in, buffer
- Semaphore full, initialized to the value 0
    - Counts the number of filled buffer places
- Semaphore empty, initialized to the value N
    - Counts the number of free buffer places
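The three semaphores can be tried out with POSIX unnamed semaphores, where sem_wait and sem_post correspond to wait and signal. The sketch below is one possible arrangement, assuming a Linux-like system where sem_init is available; BUFFER_SIZE = 8, the item values 1..n and the function run_bounded_buffer are assumptions made for the demo.

```c
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

#define BUFFER_SIZE 8               /* N, chosen arbitrarily for the demo */

static int buffer[BUFFER_SIZE];
static int in, out;                 /* write and read positions */
static sem_t mutex, full, empty;    /* as described on the slide */
static int n_items;                 /* demo parameter (assumption) */

static void *producer(void *arg) {
    (void)arg;
    for (int item = 1; item <= n_items; item++) {   /* produce an item */
        sem_wait(&empty);           /* block if no free place */
        sem_wait(&mutex);
        buffer[in] = item;          /* add item to buffer */
        in = (in + 1) % BUFFER_SIZE;
        sem_post(&mutex);
        sem_post(&full);            /* one more filled place */
    }
    return NULL;
}

static void *consumer(void *arg) {
    long *sum = arg;
    for (int k = 0; k < n_items; k++) {
        sem_wait(&full);            /* block if no place is filled */
        sem_wait(&mutex);
        int item = buffer[out];     /* remove an item from buffer */
        out = (out + 1) % BUFFER_SIZE;
        sem_post(&mutex);
        sem_post(&empty);           /* one more free place */
        *sum += item;               /* consume the removed item */
    }
    return NULL;
}

/* Sends the items 1..items through the buffer and returns their sum. */
long run_bounded_buffer(int items) {
    pthread_t p, c;
    long sum = 0;

    n_items = items;
    in = out = 0;
    sem_init(&mutex, 0, 1);
    sem_init(&full, 0, 0);
    sem_init(&empty, 0, BUFFER_SIZE);
    pthread_create(&p, NULL, producer, NULL);
    pthread_create(&c, NULL, consumer, &sum);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    sem_destroy(&mutex);
    sem_destroy(&full);
    sem_destroy(&empty);
    return sum;
}
```

Note the order of the wait calls: taking mutex before empty (or before full) could block a process while it holds mutex and deadlock the other one.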
Bounded Buffer

Producer:
    do {
        // produce an item
        wait (empty);
        wait (mutex);
        // add item to buffer
        signal (mutex);
        signal (full);
    } while (true);

Consumer:
    do {
        wait (full);      // atomic decrement; block if no place filled (all empty)
        wait (mutex);
        // remove an item from buffer
        signal (mutex);
        signal (empty);   // atomic incr.
        // consume the removed item
    } while (true);

Readers-Writers Problem
- 2 types of processes accessing a shared data set:
    - Readers: only read accesses, no updates
    - Writers: can both read and write
- Relaxed Mutual Exclusion Problem: allow either multiple readers or one single writer to access the shared data at the same time.
- Solution: use
    - Semaphore mutex, initialized to 1     // to protect readcount
    - Semaphore wrt, initialized to 1       // 0: someone inside
    - Integer readcount, initialized to 0   // count readers inside

Readers-Writers Problem (Cont.)
- The structure of a writer process:
    do {
        wait(wrt);      // writers queue on wrt
        // writing is performed
        signal(wrt);
    } while (true);

Readers-Writers Problem (Cont.)
- The structure of a reader process:
    do {
        wait(mutex);
        readcount++;
        if (readcount == 1)
            wait(wrt);      // first reader queues on wrt,
        signal(mutex);      // and the others on mutex.
        // reading is performed
        wait(mutex);
        readcount--;
        if (readcount == 0)
            signal(wrt);
        signal(mutex);
    } while (true);
- This solution favors the readers. Writers may starve.
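The reader and writer structures can be exercised with POSIX semaphores as follows. This is a sketch under assumptions: the shared data set is reduced to two counters a and b that writers always update together, so a reader that sees a != b has observed a half-finished write. The names run_readers_writers, rounds and the thread counts are hypothetical.

```c
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

static sem_t mutex;         /* protects readcount */
static sem_t wrt;           /* held by one writer or by the group of readers */
static int readcount;       /* number of readers inside */
static long a, b;           /* shared data; writers keep a == b */
static int rounds;          /* demo parameter (assumption) */

static void *writer(void *arg) {
    (void)arg;
    for (int k = 0; k < rounds; k++) {
        sem_wait(&wrt);                     /* writers queue on wrt */
        a++;                                /* writing is performed */
        b++;
        sem_post(&wrt);
    }
    return NULL;
}

static void *reader(void *arg) {
    long *violations = arg;                 /* private result slot */
    for (int k = 0; k < rounds; k++) {
        sem_wait(&mutex);
        if (++readcount == 1)
            sem_wait(&wrt);                 /* first reader locks out writers */
        sem_post(&mutex);

        if (a != b)                         /* reading is performed */
            (*violations)++;

        sem_wait(&mutex);
        if (--readcount == 0)
            sem_post(&wrt);                 /* last reader lets writers in */
        sem_post(&mutex);
    }
    return NULL;
}

/* Runs 2 writers and 2 readers; returns the final value of a, or -1 if
   any reader saw an inconsistent state. */
long run_readers_writers(int n) {
    pthread_t w[2], r[2];
    long violations[2] = { 0, 0 };

    rounds = n;
    readcount = 0;
    a = b = 0;
    sem_init(&mutex, 0, 1);
    sem_init(&wrt, 0, 1);
    for (int i = 0; i < 2; i++) {
        pthread_create(&w[i], NULL, writer, NULL);
        pthread_create(&r[i], NULL, reader, &violations[i]);
    }
    for (int i = 0; i < 2; i++) {
        pthread_join(w[i], NULL);
        pthread_join(r[i], NULL);
    }
    sem_destroy(&mutex);
    sem_destroy(&wrt);
    return (violations[0] + violations[1] == 0 && a == b) ? a : -1;
}
```

Because the readers in this demo run a finite number of rounds, the writers always finish; with a steady stream of readers they could wait forever, which is the starvation noted above.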
Dining-Philosophers Problem
- Shared data
    - Bowl of rice (data set)
    - Array of 5 semaphores chopstick[5], all initialized to 1
- The structure of Philosopher i:
    do {
        wait(chopstick[i]);
        wait(chopstick[(i + 1) % 5]);
        // eat
        signal(chopstick[i]);
        signal(chopstick[(i + 1) % 5]);
        // think
    } while (true);

3+1 Deadlock-Free Solutions of the Dining-Philosophers Problem
- Allow at most four philosophers to be sitting simultaneously at the table
- Allow a philosopher to pick up her/his chopsticks only if both chopsticks are available (both in the same critical section)
- Odd philosophers pick up the left chopstick first; even philosophers pick up the right chopstick first (asymmetric solution)
- In Fork: Represent chopsticks as bits in a bitvector (integer) and use bit-masked atomic operations to grab both chopsticks simultaneously.
    - Works for up to as many philosophers as there are bits in an int (8 * sizeof(int) with 8-bit bytes).
    - [Keller, K., Träff: Practical PRAM Programming, Wiley 2001]
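The asymmetric solution can be sketched with one Pthreads mutex per chopstick. The assumptions here are that chopstick[i] is philosopher i's left chopstick and chopstick[(i + 1) % 5] the right one, and that each philosopher eats a fixed number of meals so that the demo terminates; run_dinner and meals_eaten are hypothetical names.

```c
#include <pthread.h>
#include <stddef.h>

#define N 5

static pthread_mutex_t chopstick[N];
static int meals_wanted;            /* demo parameter (assumption) */
static int meals_eaten[N];          /* slot i is written by philosopher i only */

static void *philosopher(void *arg) {
    int i = *(int *)arg;
    int left = i;
    int right = (i + 1) % N;
    /* Odd philosophers take the left chopstick first, even ones the right. */
    int first  = (i % 2 == 1) ? left : right;
    int second = (i % 2 == 1) ? right : left;

    for (int m = 0; m < meals_wanted; m++) {
        pthread_mutex_lock(&chopstick[first]);
        pthread_mutex_lock(&chopstick[second]);
        meals_eaten[i]++;                       /* eat */
        pthread_mutex_unlock(&chopstick[second]);
        pthread_mutex_unlock(&chopstick[first]);
        /* think */
    }
    return NULL;
}

/* Lets all five philosophers eat `meals` times; returns the total count. */
int run_dinner(int meals) {
    pthread_t t[N];
    int id[N];
    int total = 0;

    meals_wanted = meals;
    for (int i = 0; i < N; i++) {
        pthread_mutex_init(&chopstick[i], NULL);
        meals_eaten[i] = 0;
        id[i] = i;
    }
    for (int i = 0; i < N; i++)
        pthread_create(&t[i], NULL, philosopher, &id[i]);
    for (int i = 0; i < N; i++)
        pthread_join(t[i], NULL);
    for (int i = 0; i < N; i++) {
        total += meals_eaten[i];
        pthread_mutex_destroy(&chopstick[i]);
    }
    return total;
}
```

With this ordering, chopsticks 2 and 4 are only ever taken as a second chopstick, so whoever holds one of them already holds both and can finish eating; this breaks the circular wait of the symmetric version.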
Pitfalls with Semaphores
- Correct use of semaphore operations:
    - Protect all possible entries/exits of control flow into/from the critical section:
      wait (mutex) ... signal (mutex)
- Possible sources of synchronization errors:
    - Omitting wait (mutex) or signal (mutex) (or both)
    - wait (mutex) ... wait (mutex)
    - wait (mutex1) ... signal (mutex2)
    - Multiple semaphores with different orders of wait() calls
        - Example: Each philosopher first grabs the chopstick to its left -> risk for deadlock!

Monitors
- Monitor: high-level abstraction à la OOP that provides a convenient and effective mechanism for process synchronization
    - Encapsulates shared data, the protecting semaphore, and all operations in a single abstract data type
    - Only one process may be active within the monitor at a time
    monitor monitor-name
    {
        // shared variable declarations
        initialization (...) { ... }
        procedure P1 (...) { ... }
        ...
        procedure Pn (...) { ... }
    }
- In Java realized as classes with synchronized methods

Condition Variables
- Monitor with condition variables: condition x;
- A waiting queue is associated with each condition variable
- Two operations on a condition variable:
    - wait(x) or x.wait(): the process that invokes the operation is suspended.
    - signal(x) or x.signal(): resumes one of the processes (if any) that invoked wait(x)
- An awoken thread must re-apply for entry into the monitor. Two scenarios:
    - Signal and wait: the signaling thread steps aside for the resumed thread
    - Signal and continue: the signaling thread continues while the resumed thread waits for the signaler
Example: Monitor with condition variable
    monitor
    {   // Simple resource allocation monitor in pseudocode
        boolean inuse = false;   // simple shared state variable
        condition available;     // condition variable

        initialization { inuse = false; }

        procedure getresource ( )
        {
            if (inuse)
                wait( available );   // wait until available is signaled
            inuse = true;            // resource is now in use
        }

        procedure returnresource ( )
        {
            inuse = false;
            signal( available );     // signal a waiting thread to proceed
        }
    }

Monitor Solution to Dining Philosophers
    monitor DiningPhilosopher
    {
        enum { THINKING, HUNGRY, EATING } state[5];
        condition self[5];

        void pickup ( int i ) {
            state[i] = HUNGRY;
            test ( i );
            if (state[i] != EATING)
                self[i].wait();
        }

        void putdown ( int i ) {
            state[i] = THINKING;
            test((i+4)%5);   // left neighbor
            test((i+1)%5);   // right neighbor
        }

        void test ( int i ) {
            if ( (state[(i+4)%5] != EATING) &&
                 (state[i] == HUNGRY) &&
                 (state[(i+1)%5] != EATING) ) {
                state[i] = EATING;
                self[i].signal ();
            }
        }

        initialization_code() {
            for (int i = 0; i < 5; i++)
                state[i] = THINKING;
        }
    }   // end monitor
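C has no monitor construct, but the resource allocation monitor can be emulated with a Pthreads mutex (the monitor lock) and a condition variable. The sketch below is an approximation under stated assumptions: Pthreads condition variables behave like "signal and continue" and may wake up spuriously, so the if from the pseudocode becomes a while. The names monitor_lock, get_resource, return_resource and run_monitor_demo are hypothetical.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Monitor state: the mutex ensures only one thread is active inside. */
static pthread_mutex_t monitor_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t available = PTHREAD_COND_INITIALIZER;
static bool inuse = false;

void get_resource(void) {
    pthread_mutex_lock(&monitor_lock);              /* enter monitor */
    while (inuse)                                   /* re-check after wakeup */
        pthread_cond_wait(&available, &monitor_lock);
    inuse = true;                                   /* resource is now in use */
    pthread_mutex_unlock(&monitor_lock);            /* leave monitor */
}

void return_resource(void) {
    pthread_mutex_lock(&monitor_lock);
    inuse = false;
    pthread_cond_signal(&available);                /* wake one waiting thread */
    pthread_mutex_unlock(&monitor_lock);
}

/* Demo bookkeeping, touched only while holding the resource. */
static int holders, max_holders, rounds_per_thread;

static void *user(void *arg) {
    (void)arg;
    for (int k = 0; k < rounds_per_thread; k++) {
        get_resource();
        holders++;
        if (holders > max_holders)
            max_holders = holders;
        holders--;
        return_resource();
    }
    return NULL;
}

/* Runs 4 threads competing for the resource and returns the largest
   number of simultaneous holders that was observed. */
int run_monitor_demo(int rounds) {
    pthread_t t[4];

    rounds_per_thread = rounds;
    holders = 0;
    max_holders = 0;
    for (int i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, user, NULL);
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    return max_holders;
}
```

If the monitor works as intended, at most one thread holds the resource at any time, so the function returns 1.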
Synchronization Examples
- Solaris
- Windows XP
- Linux
- Pthreads
[SGG7] Chapter 6.8