Filip Jagodzinski
Announcements
Reading Task: should take 30 minutes-ish maximum
Homework 2: book questions, including two custom ones, will be posted to the course website today; programming tasks will be posted after Friday's lab
Lab 2: this Friday (prelude to Homework 2)
Review Two processes are independent if the write set of each is disjoint from both the read and write sets of the other
Review: Two processes are independent if the write set of each is disjoint from both the read and write sets of the other.
[diagram: shared variables a, b, c, d, e, f accessed by processes P_a and P_b]
P_a: write set {d}, read set {d, a}
P_b: write set {b}
Q: Are P_a and P_b independent?
Q: Is P_a's write set disjoint from the read and write sets of P_b? Yes
Q: Is P_b's write set disjoint from the read and write sets of P_a? Yes
So yes: P_a and P_b are independent.
Review: History: a possible ordering of instruction invocations among multiple processes executing on a multiprocessor computer. One way to specify a history is using <, e.g., i1 < i2 < i3. Unless otherwise stated, instructions are assumed to be executed atomically, meaning each is allowed to proceed to completion. A more technical definition of atomic: a guarantee that an operation will complete without interruption or signal calls.
Review: Single-threaded process vs multi-threaded process
[diagram: a single-threaded process has one set of registers and one stack; a multi-threaded process shares memory among its threads, but each thread has its own registers and stack]
Advantages: responsiveness, resource sharing, economy, scalability
Disadvantages/challenges: identifying tasks, balance, data splitting, data dependency
If 4 CPUs are recruited to sum the entries of the matrix, and each CPU is responsible for a unique color region, then no two CPUs contend for the same shared piece of information.
If 4 CPUs are recruited to calculate the product of two matrices, [2 3; 4 5] x [6 7; 8 9], is there a data dependency?
Q: Which data does the red CPU, computing the top-left entry 2x6 + 3x8 = 36, rely on?
Q: Which data does the blue CPU, computing the top-right entry 2x7 + 3x9 = 41, rely on?
Both the red & blue CPUs need access to the 2 and the 3: the first row (2, 3) is a data dependency, so one CPU must wait.
Review: m threads executing n atomic actions each: how many possible histories? Start with m = 3, n = 1, i.e., three threads whose single actions are A1, B1, and C1.
For m = 3, n = 1 there are 6 possible histories:
A1 B1 C1
A1 C1 B1
B1 C1 A1
B1 A1 C1
C1 A1 B1
C1 B1 A1
Next, m = 2, n = 2: thread A executes A1 then A2, and thread B executes B1 then B2. How many possible histories?
First, identify the instructions: A1, A2, B1, B2.
Next, enumerate all orderings, regardless of whether the A1 < A2 and B1 < B2 conditions are met:
A1 A2 B1 B2, A1 A2 B2 B1, A1 B1 B2 A2, A1 B1 A2 B2, A1 B2 A2 B1, A1 B2 B1 A2,
A2 A1 B1 B2, A2 A1 B2 B1, A2 B1 B2 A1, A2 B1 A1 B2, A2 B2 A1 B1, A2 B2 B1 A1,
B1 A2 A1 B2, B1 A2 B2 A1, B1 A1 B2 A2, B1 A1 A2 B2, B1 B2 A2 A1, B1 B2 A1 A2,
B2 A2 B1 A1, B2 A2 A1 B1, B2 B1 A1 A2, B2 B1 A2 A1, B2 A1 A2 B1, B2 A1 B1 A2
Then prune based on whether the A1 < A2 and B1 < B2 conditions are met; 6 of the 24 orderings survive:
A1 A2 B1 B2, A1 B1 A2 B2, A1 B1 B2 A2, B1 A1 A2 B2, B1 A1 B2 A2, B1 B2 A1 A2
We finished with: is there a formula for calculating the count of possible histories among m threads with n instructions each?
In-class exercise
We finished with: is there a formula for calculating the count of possible histories among m threads with n instructions each? Answer: (mn)! / (n!)^m
Review: User threads. Be sure you both know the terminology (many-to-one, one-to-one, and many-to-many) and can list the drawbacks/limitations and advantages of each approach.
Today: concurrency versus parallelism; finish up threads; Chapter 5, Synchronization; Peterson's solution; mutexes
Concurrency versus parallel Q: Are concurrent and parallel synonyms? Q: Is it possible for a system to be executing something concurrently, but without parallelism?
The two terms are often used interchangeably, even in technical manuscripts, but there are subtle differences. Concurrent means that two tasks are being executed (from the perspective of the user) at the same time, regardless of whether one or more CPUs are being used, and the ORDER of execution is indeterminate (many possible histories): the OS selects the scheduling, and one process may be waiting on another. Parallel means that at least two processes are actively working (nobody is waiting) at the same time.
[diagram: one CPU alternating among threads t1, t2, t3 over time] On a single-CPU machine (that is NOT superscalar), concurrency is possible, but parallelism is not.
[diagram: cpu0 alternating between t1 and t2 while cpu1 alternates between t3 and t4] On a multiple-CPU machine, concurrency AND parallelism are both possible.
Threads: When designing multithreaded programs, special circumstances arise. fork(): create a separate, duplicate process. exec(): replace the running program with a new one. In a single-threaded program, it is clear what happens when either of these system calls is made. But when there are multiple threads in a single process: if a thread calls fork(), does the new process duplicate all threads, or does the new process run with a single thread? Some systems have multiple fork() routines: one that duplicates all threads, and another that duplicates only the thread that issued the fork().
Q: What are the mechanisms by which a thread is killed (cancelled), perhaps mid-execution? Task: describe a multi-threaded programming scenario in which a thread might need to be terminated prior to completion. The thread to be terminated is referred to as the target thread.
Two cancellation mechanisms. Asynchronous cancellation: a thread immediately terminates the target thread. Deferred cancellation: the target thread periodically checks in to find out if it should be terminated; if so, it terminates in an orderly fashion. Q: Why might asynchronous thread cancellation be problematic?
Threads take on distinct roles in a computation/process, and information might need to be shared among threads. [diagram: thread 1 and thread 2 communicating through a shared buffer]
Terminating a thread prematurely might impact the thread that is killed, AND the thread that was in the process of communicating with the killed thread.
In deferred cancellation, a target thread must first check whether the conditions for termination have been satisfied, for example via the use of a flag. If the flag is set to yes, specifying "still writing to the shared resource", and the thread is targeted for termination, the thread holds off terminating until the flag is cleared. [diagram: thread 1 and thread 2 sharing a buffer, with a flag iswriting: 0/1]
Synchronization: With the use of threads, and possibly large counts of histories, what can go wrong?
Let's take a closer look at the producer/consumer problem we've already briefly seen. Q: What is the use of the in and out variables?

  in = out = 0;

  /* producer */
  while (true) {
      item = produce_item();
      while ((in + 1) % BUFFER_SIZE == out) {} /* do nothing */
      buffer[in] = item;
      in = (in + 1) % BUFFER_SIZE;
  }

  /* consumer */
  while (true) {
      while (in == out) {} /* do nothing */
      item = buffer[out];
      out = (out + 1) % BUFFER_SIZE;
      consume_item(item);
  }
The producer checks the value of in (in combination with the modulus and the SIZE of the buffer) to see if something has already been produced which hasn't yet been consumed; if the buffer is not full, an item is placed into the buffer.
After which the producer updates the value of in, so that on subsequent checks of the variable in, the producer will spin (do nothing) while the buffer remains full.
The consumer, in the meantime, repeatedly checks whether the value of in is equal to out. If yes, that means that as many items have been produced as have been consumed, in which case the consumer waits.
If in != out, that means that more items have been produced than consumed, in which case the consumer retrieves the value in the buffer and updates the value of out.
A shortcoming of this approach is that the buffer can hold at most BUFFER_SIZE - 1 elements; the buffer is not fully utilized, a consequence of using the modulus in the full test.
This can be fixed by introducing a variable, counter, which is incremented each time something is produced and decremented each time something is consumed:

  in = out = 0;

  /* producer */
  while (true) {
      item = produce_item();
      while (counter == BUFFER_SIZE) {} /* do nothing */
      buffer[in] = item;
      in = (in + 1) % BUFFER_SIZE;
      counter++;
  }

  /* consumer */
  while (true) {
      while (counter == 0) {} /* do nothing */
      item = buffer[out];
      out = (out + 1) % BUFFER_SIZE;
      counter--;
      consume_item(item);
  }

The counter variable now mediates adding to / removing from the buffer, which can be fully utilized.
The issue now is the increment and decrement of counter. They seem harmless, but remember: these two threads are running concurrently (at the same time). Q: What could go wrong?
Synchronization:
  Thread A    i1: count = count + 1
  Thread B    i2: count = count - 1

Register/ALU view:
  a1: load count        b1: load count
  a2: add 1             b2: subtract 1
  a3: store count       b3: store count

Remember that executing an instruction involves multiple architecture-level steps, including loading registers, loading ALUs, executing ALUs, fetching results from the ALU, etc.
Assume an initial value of count = 4. What are a few of the possible instruction histories for Threads A and B?
Notice that within each thread the instructions execute sequentially: a1 < a2 < a3, and b1 < b2 < b3. What are the final values of count for these 4 histories?
  a1 < b1 < a2 < a3 < b2 < b3
  a1 < a2 < b1 < b2 < b3 < a3
  a1 < a2 < a3 < b1 < b2 < b3
  b1 < b2 < a1 < b3 < a2 < a3
Make sure you understand why this happened. Task: explain in your own words what caused this to happen.
  a1 < b1 < a2 < a3 < b2 < b3   count = 3
  a1 < a2 < a3 < b1 < b2 < b3   count = 4
  a1 < a2 < b1 < b2 < b3 < a3   count = 5
  b1 < b2 < a1 < b3 < a2 < a3   count = 5
In most of these histories, both a1 and b1 fetched the value of count BEFORE the other thread wrote back to it, so one thread's store overwrites the other's update.
Q: Which is the desired final value of count? (One increment and one decrement should leave count = 4.)
Q: What is the one property of the desired history (count = 4) that is different from the non-desirable histories?
Synchronization: Going back to the code, the problem areas are the increment and decrement sections. Intuitively, we want to permit only one of these to be executing at any one time; we want to enforce atomic execution.
Critical section: a region of code for which access is controlled and/or coordinated among multiple processes. Here, the producer's counter++ and the consumer's counter-- are the critical sections.
Entry section: we want to write code to control access into the critical section. This goes right BEFORE the critical section.
Exit section: we want to write code which in some way informs the other processes that access is now allowed. This goes right AFTER the critical section.
Synchronization: A solution to the critical-section problem must satisfy 3 criteria. Mutual exclusion: if a process is executing its critical section, no other process can be executing its critical section. This is the most basic requirement.
Progress: if no process is executing its critical section, AND some process wants to enter its critical section, then only those processes NOT executing code in their remainder sections can participate in deciding who enters. This helps to ensure that one of the processes actively waiting to enter its critical section is allowed to enter.
Bounded waiting: there must be a limit on the number of times that ANOTHER process is allowed to enter its critical section after a process has made a request to enter its own. This ensures that everybody who is waiting gets a turn at entering their critical section; no process can repeatedly enter indefinitely while others wait indefinitely.
Synchronization: Peterson's Solution. Works for 2 processes/threads, with j = 1 - i. Requires additional (shared) data: int turn; boolean flag[2];

  do {
      flag[i] = true;
      turn = j;
      while (flag[j] && turn == j) {}

      /* critical section */

      flag[i] = false;

      /* other stuff (non-critical) */
  } while (true);
Two processes/threads are both running this code. The j refers to the other process; turn indicates whose turn it is to enter their critical section; flag[] specifies whether i or j is ready to enter its critical section. Q: What is the entry condition/section? Q: What is the exit condition/section?
Process i sets flag[i] to true (to specify "hey, I'm ready to enter my critical section"), then sets turn to j (the other process). Q: If both processes try to enter their critical sections, what happens?
If both processes try to enter their critical sections, flag might be [1, 1], but because turn is a single shared value, only one of the two assignments to turn survives; the process that set turn last defers, and the other enters its critical section.
Synchronization: Peterson's is a software solution to the critical-section problem, but it is not guaranteed to work, because retrieval of the shared variable turn by two processes can be plagued by the same problem as an increment (value = value + 1, for example) performed by multiple threads. Other solutions include: kernel-level instructions such as test_and_set and test_and_test_and_set, which act directly on hardware; kernel-level compare_and_swap instructions. In the case of multiple processes/threads (not just two), test_and_set is used in combination with an array which holds information about ALL of the processes that are waiting to enter, along with a shared lock variable that specifies whether ANY process is in its critical section. These kernel-level operations are usually not accessible to users via system calls.
Synchronization: Most operating systems allow, via a system call, the use of a mutex; this allows the application programmer to solve a critical-section problem. Mutex: Mutual Exclusion. The high-level idea is the following: have the OS provide system calls for a lock that can be used to control access to a critical section.
  do {
      /* acquire lock */

      /* critical section */

      /* release lock */

      /* other stuff (remainder) */
  } while (true);
Multiple processes are running the code shown above. What is needed are system calls for acquiring a lock (which grants access to the critical section) and releasing the lock (to specify that a process is done with its critical section).
  acquire() {
      while (!available) {}
      available = false;
  }

  release() {
      available = true;
  }

Q: What is the variable that is used to specify whether a process is in its critical section?
Up Next: Lab 2, implementation of a mutex using blitz