
The University of Texas at Arlington
Lecture 6: Threading and Parallel Programming Constraints
CSE 5343/4342 Embedded Systems II
Based heavily on slides by Dr. Roger Walker

More Task Decomposition: Dependence Graph
A dependence graph = {vertices, (directed) edges}.
A vertex (node) for each:
- Variable assignment (except index variables)
- Constant
- Operator or function call
Directed edges (arrows) indicate use of variables and constants for:
- Data flow
- Control flow

Dependence Graph Example #1

for (i = 0; i < 3; i++)
    a[i] = b[i] / 2.0;

[Graph: each a[i] is produced by one division node taking b[i] and the constant 2.0; the three divisions are independent.]
Domain decomposition possible.

Dependence Graph Example #2

for (i = 1; i < 4; i++)
    a[i] = a[i-1] * b[i];

[Graph: each multiplication takes a[i-1] and b[i], so a[1], a[2], a[3] form a dependence chain.]
No domain decomposition.

Dependence Graph Example #3

a = f(x, y, z);
b = g(w, x);
t = a + b;
c = h(z);
s = t / c;

[Graph: a, b, and c depend only on the inputs w, x, y, z; t depends on a and b; s depends on t and c.]
Task decomposition with 3 CPUs possible.
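The task decomposition in Example #3 can be sketched with one thread per independent node. The functions f, g, and h are placeholders here (the slide leaves them abstract), so their definitions below are assumptions chosen only to make the sketch runnable:

```cpp
#include <thread>

// Hypothetical stand-ins for the slide's abstract f, g, h.
static int f(int x, int y, int z) { return x + y + z; }
static int g(int w, int x) { return w * x; }
static int h(int z) { return z + 1; }

// Example #3 as a 3-thread task decomposition: a, b, and c have no
// mutual dependences, so each runs on its own thread; t and s must
// wait for their inputs, which the joins enforce.
int example3(int w, int x, int y, int z) {
    int a, b, c;
    std::thread t1([&] { a = f(x, y, z); });
    std::thread t2([&] { b = g(w, x); });
    std::thread t3([&] { c = h(z); });
    t1.join(); t2.join(); t3.join();   // all inputs of t and s are ready
    int t = a + b;
    return t / c;                      // s = t / c
}
```

The joins are the only synchronization needed, because the dependence graph has no cycles and each thread writes a distinct variable.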

Multi-thread Concepts
Multi-threading concepts are needed in order to obtain maximum performance from multi-core microprocessors. These concepts include:
- Creating, terminating, suspending, and resuming threads
- Thread synchronization methods: semaphores, mutexes, locks, and critical sections

Using Threads
Benefits of using threads include:
- Increased performance
- Better resource utilization
- Efficient data sharing
However, there are risks to using threads:
- Data race conditions
- Deadlocks
- Code complexity
- Portability issues
- Testing and debugging difficulty

Waiting for Threads
Blocking versus non-blocking:
- Looping on a condition is expensive: the thread is scheduled even when there is no work, stealing CPU time from threads that are performing work.
- It is hard to find the right balance: locking is probably too much or not enough, and Thread.Sleep is inflexible.
- Better option: just wait for it!

Synchronization
Synchronization controls the relative order of thread execution and resolves conflicts among threads. Threads sometimes need to wait for other threads to be in a known state before continuing. In shared-memory systems, constraints have to be imposed to enforce the proper order of execution and to avoid corrupted or locked data.
Two basic types of synchronization:
1. Mutual exclusion
2. Condition synchronization

Mutual Exclusion
Mutual exclusion is program logic used to ensure single-thread access to a critical region. One thread locks a critical section of code containing shared data; other threads wanting access are blocked from entering the critical section until the first thread is done. Proper synchronization techniques ensure that only one thread is allowed access to a critical section at any one instant. The major challenge of threaded programming is to implement critical sections in such a way that multiple threads perform mutually exclusive operations on them and never use them simultaneously.
[Figure: mutual exclusion done by a critical section.]
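A minimal sketch of a critical section, using C++11's std::mutex (the slide does not name a specific API, so the choice of std::lock_guard is an assumption):

```cpp
#include <mutex>
#include <thread>

std::mutex m;       // guards the shared counter
long counter = 0;   // shared data accessed inside the critical section

// Each thread increments the counter 100000 times. The lock_guard
// acquires m on entry and releases it when it goes out of scope, so
// at most one thread is inside the critical section at any instant.
void worker() {
    for (int i = 0; i < 100000; ++i) {
        std::lock_guard<std::mutex> lock(m);  // enter critical section
        ++counter;                            // protected operation
    }                                         // leave critical section
}

long run() {
    std::thread t1(worker), t2(worker);
    t1.join();
    t2.join();
    return counter;   // always 200000 with the lock in place
}
```

Without the lock_guard the two increments can interleave and updates are lost; with it, the result is deterministic.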

Condition Synchronization
Condition synchronization allows a thread to wait until a specific condition is reached (e.g., via semaphores).

Deadlocks
A deadlocked thread waits for a resource that will never become available.
- Self-deadlock (recursive deadlock): a thread tries to acquire a resource that it already holds.
- Lock-ordering deadlock (more common): thread A locks resource R1 and then tries to lock R2; meanwhile thread B locks R2 and tries to lock R1. In this scenario A holds R1 and waits for R2 while B holds R2 and waits for R1.
As implied by the name, deadlocks are not good; they need to be avoided at all costs.
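One standard way to avoid the lock-ordering deadlock described above is to acquire both locks atomically; this is not from the slides, but a sketch of the technique using C++11's std::lock, which acquires multiple mutexes without risking circular wait:

```cpp
#include <mutex>
#include <thread>

std::mutex r1, r2;   // the two resources from the slide's scenario
int shared = 0;

// Thread A's and B's bodies name the locks in opposite textual order,
// exactly the pattern that deadlocks with naive acquire calls, but
// std::lock acquires both as one deadlock-free operation.
void threadA() {
    std::lock(r1, r2);
    std::lock_guard<std::mutex> g1(r1, std::adopt_lock);
    std::lock_guard<std::mutex> g2(r2, std::adopt_lock);
    ++shared;   // both resources held here
}

void threadB() {
    std::lock(r2, r1);   // opposite order, still deadlock-free
    std::lock_guard<std::mutex> g1(r1, std::adopt_lock);
    std::lock_guard<std::mutex> g2(r2, std::adopt_lock);
    ++shared;
}

int run() {
    std::thread a(threadA), b(threadB);
    a.join();
    b.join();
    return shared;   // both threads complete; no circular wait
}
```

The same effect can be had by agreeing on one global lock order (always R1 before R2) in every thread.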

Deadlocks (cont'd)
Deadlocks:
- Occur when a thread waits for a condition that never occurs.
- Commonly result from competition between threads for system resources held by other threads.
The four necessary conditions for a deadlock are:
1. Mutual exclusion
2. Hold and wait
3. No preemption
4. Circular wait

Deadlock.cpp
This program illustrates the potential for deadlock in a bad locking hierarchy. It is possible for one thread to lock both critical sections and avoid deadlock. However, concurrent programs that rely on a particular order of execution without enforcing that order will eventually fail.

Race Conditions
Race conditions:
- Are the most common errors in concurrent programs.
- Occur because the programmer assumes a particular order of execution but does not guarantee that order through synchronization.
A data race:
- Refers to a storage conflict situation.
- Occurs when two or more threads simultaneously access the same memory location while at least one thread is updating that location.
- Results in two possible conflicts: read/write conflicts and write/write conflicts.
Race conditions are usually not obvious; the errors occur unexpectedly and unpredictably. Locks are the key to avoidance.

Using Synchronization
Synchronization is about making sure that threads take turns when they need to, typically to access some shared object. Depending on your specific application needs, you will find that different options make more sense than others. Operating systems have to provide some support for atomic operations. Windows simplifies this process since it has built-in support for suspending a thread at the scheduler level when necessary. In this manner, one thread can be put to sleep until a certain condition occurs in another thread. By letting one thread sleep instead of repeatedly checking whether another thread is done, performance is dramatically improved.
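The write/write conflict above can be made concrete. A real race is nondeterministic, so this sketch replays one bad interleaving by hand: `counter++` is really read, add, write, and if two threads both read before either writes, one update is lost:

```cpp
// Deterministic replay of a lost update. Each "register" variable
// models a thread's private copy of the shared counter.
int lost_update_demo() {
    int counter = 0;
    int t1_reg = counter;   // thread 1 reads 0
    int t2_reg = counter;   // thread 2 reads 0 (before thread 1 writes)
    counter = t1_reg + 1;   // thread 1 writes 1
    counter = t2_reg + 1;   // thread 2 writes 1: thread 1's update is lost
    return counter;         // 1, not the intended 2
}
```

With two truly concurrent threads the same three-step pattern happens unpredictably, which is why such bugs surface only under load.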

Synchronization Primitives
Synchronization is typically performed by three types of primitives:
- Semaphores
- Locks
- Condition variables
These primitives are implemented as atomic operations by use of a memory fence (or barrier), a processor-dependent operation that ensures threads see other threads' memory operations in a consistent order.

Semaphores
Introduced by Edsger Dijkstra (1968). A semaphore is a form of counter that grants multiple threads access to a resource by incrementing or decrementing the semaphore. The typical use is protecting a shared resource of which at most n instances are allowed to exist simultaneously. Use P to acquire a resource and V to release it. Because a semaphore embodies a concept of capacity, it can be represented by an integer. Semaphores are created with a specified capacity; once that number of threads have locked it (P, from Dutch proberen), subsequent access is blocked until a slot opens up (V, from verhogen).

Semaphores (cont'd)
P and V need to be atomic to protect the semaphore variable. The P operation busy-waits (or perhaps sleeps) until a resource is available, whereupon it immediately claims one. A semaphore with a capacity of one is a binary semaphore; it is also essentially a mutex, with the exception that any thread can release it, not just the thread that acquired it. Semaphores can be used across processes as well. Semaphores are not as frequently used anymore.

Semaphores (example)
Producer/consumer threads:

void producer() {
    while (1) {
        produce_data();
        p_sem->release();   // V operation
    }
}

void consumer() {
    while (1) {
        p_sem->wait();      // P operation
        consume_data();
    }
}
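The slide's P/V description can be realized with a mutex and a condition variable. This is a sketch of the mechanism, not the deck's own implementation (C++20 also ships std::counting_semaphore ready-made):

```cpp
#include <condition_variable>
#include <mutex>

// A counting semaphore: P blocks while the count is zero, V frees a
// slot. The mutex makes the count updates atomic; the condition
// variable lets P sleep instead of busy-waiting.
class Semaphore {
    std::mutex m;
    std::condition_variable cv;
    int count;
public:
    explicit Semaphore(int capacity) : count(capacity) {}
    void P() {                                    // acquire (proberen)
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [this] { return count > 0; });
        --count;
    }
    void V() {                                    // release (verhogen)
        std::lock_guard<std::mutex> lk(m);
        ++count;
        cv.notify_one();
    }
    int value() {                                 // current free slots
        std::lock_guard<std::mutex> lk(m);
        return count;
    }
};

int demo() {
    Semaphore s(2);   // at most 2 threads may hold the resource
    s.P();
    s.P();            // both slots now taken
    s.V();            // one slot returned
    return s.value(); // 1 free slot remains
}
```

Note that, as the slide says, nothing stops a thread that never called P from calling V; that permissiveness is what distinguishes a binary semaphore from a mutex.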

Locks
Locks ensure that only a single thread can have access to a resource. Coarse-grained locks have higher lock contention than finer-grained ones. Locks can be realized with binary semaphores (plus an initialization entity):
- Acquire(): waits for the lock state to become unlocked, then sets the state to locked.
- Release(): changes the lock state from locked to unlocked.

Critical Section Implementation
To avoid deadlocks, locks should mostly be used inside critical sections that have a single entry point and a single exit point:

<critical section start>
    <acquire lock A>
    (operate on shared memory protected by lock)
    <release lock A>
<critical section end>

Locks
- Locking restricts access to an object to one thread.
- Minimize locking/synchronization whenever possible.
- Make objects thread-safe when appropriate.
- Acquire locks late, release them early: the shorter the duration, the better. Lock only when necessary.

Locking Example

private object padlock = new object();

public void CoordinateWork() {
    (new Thread(PerformWork)).Start();
    (new Thread(PerformWork)).Start();
}

private void PerformWork() {
    while (true) {
        lock (padlock) {
            /* GET NEXT ITEM */
            /* DO WORK HERE */
        }   // the lock statement releases padlock automatically here
    }
}

Lock Types
- Mutex: the simplest lock; can include a timer attribute for release, or use a try-finally block to guarantee release.
- Recursive lock: can be repeatedly acquired by the owning thread (used in recursive functions), which avoids recursive self-deadlocks.
- Read-write locks: allow simultaneous read access for multiple threads but limit write access to only one thread. Use them when multiple threads need to read shared data but do not need to write it. Granularity (how much is locked) matters.
- Spin locks: waiting threads spin, polling the state of the lock, rather than blocking. Used mostly on multiprocessor systems, since a spinning thread essentially occupies its processor. Use them when lock hold times are short (i.e., shorter than blocking and waking a thread).

Condition Variables
Usually, condition variables are user-mode objects that cannot be shared across processes. In general, a condition variable is a mechanism by which a thread that holds a lock on a specific resource can wait for a message about a specific condition. To prevent deadlocks, the following atomic operations on a condition variable can be used: Wait(L), Signal(L), and Broadcast(L). Condition variables enable threads to atomically release a lock and enter the sleeping state. They can be used with critical sections or slim reader/writer (SRW) locks. Condition variables support operations that "wake one" or "wake all" waiting threads. After a thread is woken, it re-acquires the lock it released when it entered the sleeping state.

Condition Variables
Suppose a thread holds a lock on a specific resource but cannot proceed until a particular condition occurs. The thread can release the lock, but it will need the lock returned when the condition occurs. The wait() call releases the lock and lets the next thread waiting on this resource use it. The condition the original thread is waiting on is communicated via the condition variable. When the new thread is finished with the resource, it signals the condition variable and returns the resource to the original holder by use of signal() or broadcast(); broadcast() enables all threads waiting on that resource to run.

Example: Condition Variable

Condition C;
Lock L;
bool LC = false;

void producer() {
    while (1) {
        L->acquire();          // start critical section
        while (LC == true)
            C->wait(L);
        // produce the next data
        LC = true;
        C->signal(L);
        L->release();          // end critical section
    }
}

void consumer() {
    while (1) {
        L->acquire();          // start critical section
        while (LC == false)
            C->wait(L);
        // consume the next data
        LC = false;
        C->signal(L);
        L->release();          // end critical section
    }
}
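The pseudocode above maps directly onto C++11's std::condition_variable. This is a runnable sketch, not the slide's own code; it is bounded to N items so it terminates, and the wait-with-predicate form plays the role of the while loops around wait():

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// One-slot producer/consumer. LC is the condition ("an unconsumed
// datum exists"); C and L keep the slide's names.
std::condition_variable C;
std::mutex L;
bool LC = false;
int produced = 0, consumed = 0;
const int N = 1000;

void producer() {
    for (int i = 0; i < N; ++i) {
        std::unique_lock<std::mutex> lk(L);   // start critical section
        C.wait(lk, [] { return !LC; });       // wait for an empty slot
        ++produced;                           // produce the next datum
        LC = true;
        C.notify_one();
    }                                         // lock released here
}

void consumer() {
    for (int i = 0; i < N; ++i) {
        std::unique_lock<std::mutex> lk(L);
        C.wait(lk, [] { return LC; });        // wait for a datum
        ++consumed;                           // consume it
        LC = false;
        C.notify_one();
    }
}

int run() {
    std::thread p(producer), c(consumer);
    p.join();
    c.join();
    return produced + consumed;   // every item produced is consumed
}
```

Each wait() atomically releases L and sleeps, then re-acquires L on wakeup, exactly as described in the prose above.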

Message Passing
A message is a special method of communication used to transfer information or a signal from one domain to another. In multi-threading environments the domain is the boundary of a thread. Message passing (as in MPI, the Message Passing Interface used in distributed computing and parallel processing) is a method of communicating between threads or processes.

Messages
Thread communication within a process is known as intra-process communication; messages between different processes use inter-process communication. To synchronize the operation of threads, semaphores, locks, and condition variables are used: synchronization primitives convey status and access information. To communicate data, thread messaging is done.
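For the intra-process case, message passing between threads is commonly built on a thread-safe queue. The Mailbox class below is an illustrative sketch (the name and interface are assumptions, not from the slides); the sender copies the message into shared state under a lock, and the receiver blocks until one arrives:

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <thread>

// A minimal blocking message queue for intra-process communication.
class Mailbox {
    std::queue<std::string> q;
    std::mutex m;
    std::condition_variable cv;
public:
    void send(const std::string& msg) {
        std::lock_guard<std::mutex> lk(m);
        q.push(msg);          // message crosses the thread boundary here
        cv.notify_one();
    }
    std::string receive() {   // blocks until a message is available
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [this] { return !q.empty(); });
        std::string msg = q.front();
        q.pop();
        return msg;
    }
};

std::string demo() {
    Mailbox box;
    std::thread sender([&] { box.send("done"); });
    std::string msg = box.receive();   // waits for the sender's message
    sender.join();
    return msg;
}
```

Note how the primitives from the earlier slides (a lock plus a condition variable) convey the status information, while the queue carries the data itself.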

Summary
- For synchronization, an understanding of atomic operations will help avoid deadlock and eliminate race conditions.
- Use a proper synchronization-construct-based framework for threaded applications.
- Prefer higher-level synchronization constructs over primitive types (more OS support).
- An application must not contain any possibility of a deadlock scenario.
- Threads can perform message passing using different approaches: intra-process and inter-process.
- It is important to understand how the threading features of third-party libraries are implemented; different implementations may cause applications to fail in unexpected ways.