ALEWIFE SYSTEMS MEMO #12

A Synchronization Library for ASIM

Beng-Hong Lim (bhlim@masala.lcs.mit.edu)
Laboratory for Computer Science
Room NE
January 9, 1992

Abstract

This memo describes the functions in the synchronization library provided for programs written for ASIM and acts as a user's manual. Mul-T provides futures and binary semaphores as primitive synchronization mechanisms. For experimenting with other synchronization constructs, we have extended the language to include J-structures, L-structures, mutual-exclusion locks, counting semaphores and barriers. An extension of futures to allow thread placement directives is also provided.

1 Introduction

A synchronization library is provided in ASIM for users to experiment with various synchronization mechanisms. This memo assumes knowledge of programming in the ASIM environment. See Alewife Memo 13 for a description of the ASIM environment.

The library contains implementations of mutual-exclusion locks, counting semaphores, J-structures, L-structures, and barriers. These supplement the synchronization mechanisms already present in Mul-T on ASIM, viz., futures and binary semaphores.

This memo briefly describes the synchronization mechanisms and the user-callable functions associated with each mechanism. These functions are automatically linked with the user program by the ASIM compilation process.

The Alewife project is funded in part by NSF Experimental Systems grant # MIP , in part by DARPA contract # N K-0825, in part by an NSF Presidential Young Investigator Award, and in part by LSI Logic and IBM.

2 Waiting for synchronization

Some of the functions described below include a <cost-limit> parameter. This parameter controls the method of waiting for failed synchronizations, and is used as follows.

Traditionally, when a synchronization attempt fails, the thread executing the synchronization either spins or blocks. The synchronization mechanisms implemented in the library give the user control over the waiting method. Besides always spinning and always blocking, the user can make the thread spin for some amount of time before blocking if the synchronization condition has not yet been satisfied. This is specified via a "spin-cost" threshold. The thread will spin until the cost of spinning is above the "spin-cost" threshold, and then block.

Specifying a threshold of 0 yields an "always block" algorithm, while a threshold of most-positive-fixnum will effectively yield an "always spin" algorithm (unless you can afford to wait for ASIM to simulate most-positive-fixnum cycles). Specifying a threshold equal to the cost of blocking yields a strongly 2-competitive algorithm, which means that the cost of waiting is guaranteed to be no more than 2 times the cost of an optimal off-line waiting algorithm. The variable *blocking-ovh* is set to 1000 by default and is used as the default "spin-cost" threshold.

One point to be aware of is that it is possible for the cost of an "always block" algorithm or an "always spin" algorithm to be less than twice optimal, so the 2-competitive algorithm is not guaranteed to be the best alternative. A more detailed description of two-phase waiting strategies can be found in [4].

3 Mutual Exclusion Locks

Mutual exclusion locks can be atomically acquired and released. They can be used to protect access to critical sections of code.

(make-lock) - creates and returns a lock object.
(lock? l) - returns #t if l is a lock object.
(lock-failed? l) - tries to acquire lock l. Returns #t if it failed to acquire the lock, #f if successful.
(spin-lock l) - tries to acquire lock l, using a 2-competitive waiting algorithm with spinning. Context switching is disabled while in the spin phase.
(%spin-lock l cost-limit) - tries to acquire lock l, spinning until the spinning cost exceeds cost-limit. Context switching is disabled while in the spin phase.
(sspin-lock l) - tries to acquire lock l, using a 2-competitive waiting algorithm with switch-spinning.
(%sspin-lock l cost-limit) - tries to acquire lock l, switch-spinning until the spin cost exceeds cost-limit.

(unlock l) - unlocks l and releases all waiters.

4 FIFO Mutual Exclusion Locks

FIFO locks work like mutual exclusion locks except that there is a value associated with the lock. Successful lock attempts lock and return the value of the lock. Failed lock attempts immediately block and queue the thread on a first-in-first-out queue. Releasing the lock writes a new value into the lock and also releases the waiter at the head of the FIFO queue, if any.

(make-fifo-lock) - creates and returns a FIFO lock with an initial value of 0.
(lock-fifo-lock l) - tries to acquire lock l. Returns the FIFO lock value when the lock is successfully acquired.
(release-fifo-lock l value) - writes value into the lock and releases the first waiter in the FIFO queue, if any exist.

5 Binary Semaphores

(make-semaphore) - creates and returns a semaphore object.
(semaphore? x) - returns #t if x is a semaphore object, #f otherwise.
(semaphore-p sem) - waits by switch-spinning if the semaphore value is 0; sets the value to 0 and returns if the semaphore value is 1.
(semaphore-v sem) - sets the semaphore value to 1.
(semaphore-conditional-p sem) - returns #t if the semaphore value is 0; otherwise sets the value to 0 and returns #f.

6 Counting semaphores

Counting semaphores are semaphores that can take on values that are nonnegative integers. Although binary semaphores can be used to implement counting semaphores, the implementation provided here takes advantage of the hardware full/empty bits for a more efficient implementation.

(make-counting-semaphore initval) - creates and returns a semaphore with value set to initval.
(c-semaphore-p c-sem) - decrements the semaphore value and returns if the value is > 0; otherwise waits until the value is positive, then decrements.
(%c-semaphore-p c-sem cost-limit) - like c-semaphore-p, but the spin cost-limit can be specified.

(c-semaphore-v c-sem) - increments the semaphore value.
(%c-semaphore-v c-sem cost-limit) - like c-semaphore-v, but the spin cost-limit can be specified.
(fetch-and-add c-sem value) - increments the semaphore value by <value>, and returns the pre-incremented value.
(%fetch-and-add c-sem value cost-limit) - like fetch-and-add, but the spin cost-limit can be specified.

7 Barriers

There are 2 implementations of barriers: a simple barrier and a software combining tree barrier. In a simple barrier, all threads arriving at the barrier increment the barrier count and, except for the last arrival, wait at a single release flag. In a tree barrier, the count and release flag are distributed throughout a tree. A description of a combining tree barrier can be found in [6]. The simple barrier implementation is not as scalable as the tree barrier due to potential contention. For barriers with more than 4 participants, the tree barrier performs better if all the participants arrive at approximately the same time.

7.1 Simple Barrier

(create-barrier nthreads) - create and return a barrier for <nthreads> threads.
(barrier b) - wait at barrier b for all processes to reach it.
(%barrier b cost-limit) - like barrier, but cost-limit can be specified.

7.2 Tree Barrier

(make-tree-barrier <nthreads> <bf>) - create and return a barrier for <nthreads> threads organized in a combining tree with a maximum branching factor of <bf>.
(make-dist-tree-barrier <nthreads> <tpp> <bf>) - create and return a barrier for <nthreads> threads organized in a combining tree with a maximum branching factor of <bf*tpp>. The nodes of the tree are distributed more or less evenly among the processors.
(tree-barrier <b> <thread-id> cost-limit) - wait at barrier <b> using <thread-id>. This implies that the user will need to have some scheme for uniquely numbering the threads participating in the tree barrier from 0 to <nthreads>-1.

8 J-structures

J-structures are one-dimensional vectors with presence bits associated with each element.
A newly allocated J-structure has the presence bit turned off for each element. A write to the J-structure using iset turns the presence bit on. A reference to an element with the presence bit turned off suspends the referencing task, while a reference with the presence bit turned on acts like a normal vector reference. This provides for synchronization between producers of J-structure values and consumers of J-structure values. A description of J-structures can be found in [4].

J-structures can be used to implement I-structures [1]. One major difference between I-structures and J-structures is that J-structures can be reset and reused. reset-jstruct unsets all the presence bits in the J-structure.

(make-jstruct <len>) - Create and return a J-structure of length <len>.
(jref js i) - Reference element i in J-structure js. Wait for the value if the presence bit is unset, using a competitive waiting algorithm.
(set (jref js i) val) - Set the value of element i in J-structure js to value val. Also set the presence bit for that element and release any waiters.
(reset-jstruct js) - Reset the presence bits for each element in J-structure js. Signal an error if there are any waiters waiting on any of the J-structure elements. It is the programmer's responsibility to ensure that there are no longer any waiters on any J-structure element before resetting the J-structure.

When waiting for a J-structure element, the maximum spinning cost can be set by using set-max-jref-sspin.

(set-max-jref-sspin cycles) - Switch spin until cost is greater than cycles before blocking on the J-structure element.

9 L-structures

L-structures are also one-dimensional vectors with presence bits associated with each element. However, the read and write operations on L-structures are different. L-structures support 3 operations: a locking read, an unlocking write, and a non-locking read. A locking read waits until a slot is full before emptying the slot and returning the value. An unlocking write writes a value to an empty slot, and sets it to full, releasing any waiters.
A non-locking read returns the value found in a slot if full; otherwise it returns an invalid value. An L-structure therefore allows mutually exclusive access to each of its slots. The synchronizing L-structure reads and writes can be used to implement M-structures [2]. However, L-structures are different from M-structures in that they allow multiple non-locking readers.

(make-lstruct <len>) - Create and return an L-structure of length <len>.
(lref ls i) - Read and lock element i in L-structure ls. Wait if the presence bit is unset, using a competitive waiting algorithm.
(lpeek ls i) - Read element i in L-structure ls. If the presence bit is set, return the value read; otherwise return an invalid lpeek value.

(invalid-lpeek? value) - Returns #t if value is an invalid value returned by lpeek, #f otherwise.
(lset ls i val) - Set the value of element i in L-structure ls to value val. Also set the presence bit for that element and release any waiters.

When waiting for an L-structure element, the maximum spinning cost can be set by using set-max-lref-sspin.

(set-max-lref-sspin cycles) - Switch spin until cost is greater than cycles before blocking on the L-structure element.

10 Futures

A description of futures can be found in [3]. Although Mul-T already provides futures as a thread spawning and synchronizing primitive, we extended the Alewife environment to provide explicit placement directives for futures.

(future-on <pnum> <body>) - Enqueue a new task on processor <pnum> to be executed by that processor. Using a value of nil for <pnum> causes the future to be spawned locally.

When waiting for an unresolved future, the maximum spinning cost can be set by using set-max-future-sspin.

(set-max-future-sspin cycles) - Switch spin until cost is greater than cycles before blocking on the future.

References

[1] Arvind, R. S. Nikhil, and K. K. Pingali. I-Structures: Data Structures for Parallel Computing. In Proceedings of the Workshop on Graph Reduction (Springer-Verlag Lecture Notes in Computer Science 279), September/October

[2] Paul S. Barth, Rishiyur S. Nikhil, and Arvind. M-structures: Extending a parallel, non-strict, functional language with state. In Proceedings of the 5th ACM Conference on Functional Programming Languages and Computer Architecture, August

[3] David A. Kranz, R. Halstead, and E. Mohr. Mul-T: A High-Performance Parallel Lisp. In Proceedings of SIGPLAN '89, Symposium on Programming Languages Design and Implementation, June

[4] Beng-Hong Lim and Anant Agarwal. Waiting Algorithms for Synchronization in Large-Scale Multiprocessors. Technical report, MIT VLSI Memo , February

[5] Eric Mohr, David A. Kranz, and Robert H. Halstead. Lazy task creation: A technique for increasing the granularity of parallel programs. In Proceedings of Symposium on Lisp and Functional Programming, June

[6] Pen-Chung Yew, Nian-Feng Tzeng, and Duncan H. Lawrie. Distributing hot-spot addressing in large-scale multiprocessors. IEEE Transactions on Computers, C-36(4):388-395, April
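
As a worked illustration of the cost-limit discussion in Section 2, the following Python sketch models the cost of waiting under a "spin-cost" threshold. It is a simplified model and not part of the library: the names waiting_cost and optimal_cost are hypothetical, spinning is assumed to cost one unit per cycle waited, and blocking is assumed to cost a fixed overhead equal to the default *blocking-ovh* value of 1000.

```python
# Simplified cost model for two-phase waiting (assumptions, not library code):
#   - spinning costs 1 unit per cycle waited
#   - blocking costs a fixed overhead, paid once
BLOCKING_OVH = 1000  # mirrors the default value of *blocking-ovh*


def waiting_cost(wait_time, threshold, blocking_ovh=BLOCKING_OVH):
    """Cost of spinning up to `threshold` cycles, then blocking if still waiting."""
    if wait_time <= threshold:
        return wait_time               # condition satisfied during the spin phase
    return threshold + blocking_ovh    # spun for `threshold`, then paid to block


def optimal_cost(wait_time, blocking_ovh=BLOCKING_OVH):
    """Cost of an off-line algorithm that knows the wait time in advance."""
    return min(wait_time, blocking_ovh)


ALWAYS_BLOCK = 0             # threshold of 0: block immediately
ALWAYS_SPIN = float("inf")   # stands in for most-positive-fixnum

# Columns: wait time, always-block, always-spin, two-phase, optimal off-line.
for t in (10, 999, 1000, 1001, 50000):
    two_phase = waiting_cost(t, BLOCKING_OVH)
    print(t, waiting_cost(t, ALWAYS_BLOCK), waiting_cost(t, ALWAYS_SPIN),
          two_phase, optimal_cost(t))
    # With threshold == blocking cost, the cost is at most twice optimal.
    assert two_phase <= 2 * optimal_cost(t)
```

In this model, a wait of 10 cycles costs 1000 under "always block" but only 10 under two-phase waiting, while a wait of 50000 cycles costs 50000 under "always spin" but 2000 under two-phase waiting. The worst case for two-phase waiting is a wait just past the threshold (1001 cycles), where it costs 2000 against an optimal cost of 1000.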

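The presence-bit behaviour described in Section 8 can also be sketched in Python. This is a toy model under stated assumptions, not the ASIM implementation: the class JStruct and its methods jref, jset and reset are hypothetical stand-ins for jref, (set (jref js i) val) and reset-jstruct, and waiting is modelled by blocking on a condition variable rather than by a competitive spin-then-block algorithm.

```python
import threading


class JStruct:
    """Toy model of a J-structure: a vector with one presence bit per element."""

    def __init__(self, length):
        self._values = [None] * length
        self._present = [False] * length   # presence bits start turned off
        self._cond = threading.Condition()
        self._waiters = 0

    def jref(self, i):
        """Return element i, waiting until its presence bit is set."""
        with self._cond:
            self._waiters += 1
            try:
                self._cond.wait_for(lambda: self._present[i])
            finally:
                self._waiters -= 1
            return self._values[i]

    def jset(self, i, val):
        """Write element i, set its presence bit, and release any waiters."""
        with self._cond:
            self._values[i] = val
            self._present[i] = True
            self._cond.notify_all()

    def reset(self):
        """Clear all presence bits; it is an error to reset while tasks wait."""
        with self._cond:
            if self._waiters:
                raise RuntimeError("reset with waiters present")
            self._present = [False] * len(self._present)


js = JStruct(4)
results = []
consumer = threading.Thread(target=lambda: results.append(js.jref(2)))
consumer.start()        # the consumer waits in jref if element 2 is not yet written
js.jset(2, "ready")     # the producer writes the value and releases the consumer
consumer.join()
print(results)
```

The consumer receives the value regardless of whether it reaches jref before or after the producer's write, which is the producer/consumer synchronization the presence bits provide. As in the memo, the reset step is only safe once no task is still waiting on an element.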

More information

CIS Operating Systems Application of Semaphores. Professor Qiang Zeng Spring 2018

CIS Operating Systems Application of Semaphores. Professor Qiang Zeng Spring 2018 CIS 3207 - Operating Systems Application of Semaphores Professor Qiang Zeng Spring 2018 Big picture of synchronization primitives Busy-waiting Software solutions (Dekker, Bakery, etc.) Hardware-assisted

More information

Real-Time Systems. Lecture #4. Professor Jan Jonsson. Department of Computer Science and Engineering Chalmers University of Technology

Real-Time Systems. Lecture #4. Professor Jan Jonsson. Department of Computer Science and Engineering Chalmers University of Technology Real-Time Systems Lecture #4 Professor Jan Jonsson Department of Computer Science and Engineering Chalmers University of Technology Real-Time Systems Specification Resource management Mutual exclusion

More information

Concurrency: Deadlock and Starvation. Chapter 6

Concurrency: Deadlock and Starvation. Chapter 6 Concurrency: Deadlock and Starvation Chapter 6 Deadlock Permanent blocking of a set of processes that either compete for system resources or communicate with each other Involve conflicting needs for resources

More information

Note: in this document we use process and thread interchangeably.

Note: in this document we use process and thread interchangeably. Summary on Monitor Implementation techniques Note: in this document we use process and thread interchangeably. Monitor is neither a process (thread) nor an active entity. It is just an abstract data type

More information

Partial Marking GC. A traditional parallel mark and sweep GC algorithm has, however,

Partial Marking GC. A traditional parallel mark and sweep GC algorithm has, however, Partial Marking GC Yoshio Tanaka 1 Shogo Matsui 2 Atsushi Maeda 1 and Masakazu Nakanishi 1 1 Keio University, Yokohama 223, Japan 2 Kanagawa University, Hiratsuka 259-12, Japan Abstract Garbage collection

More information

Operating Systems, Assignment 2 Threads and Synchronization

Operating Systems, Assignment 2 Threads and Synchronization Operating Systems, Assignment 2 Threads and Synchronization Responsible TA's: Zohar and Matan Assignment overview The assignment consists of the following parts: 1) Kernel-level threads package 2) Synchronization

More information

EECS 482 Introduction to Operating Systems

EECS 482 Introduction to Operating Systems EECS 482 Introduction to Operating Systems Winter 2018 Harsha V. Madhyastha Monitors vs. Semaphores Monitors: Custom user-defined conditions Developer must control access to variables Semaphores: Access

More information

Process Synchronization

Process Synchronization CSC 4103 - Operating Systems Spring 2007 Lecture - VI Process Synchronization Tevfik Koşar Louisiana State University February 6 th, 2007 1 Roadmap Process Synchronization The Critical-Section Problem

More information

Synchronization. CSE 2431: Introduction to Operating Systems Reading: Chapter 5, [OSC] (except Section 5.10)

Synchronization. CSE 2431: Introduction to Operating Systems Reading: Chapter 5, [OSC] (except Section 5.10) Synchronization CSE 2431: Introduction to Operating Systems Reading: Chapter 5, [OSC] (except Section 5.10) 1 Outline Critical region and mutual exclusion Mutual exclusion using busy waiting Sleep and

More information

CS 550 Operating Systems Spring Concurrency Semaphores, Condition Variables, Producer Consumer Problem

CS 550 Operating Systems Spring Concurrency Semaphores, Condition Variables, Producer Consumer Problem 1 CS 550 Operating Systems Spring 2018 Concurrency Semaphores, Condition Variables, Producer Consumer Problem Semaphore Semaphore is a fundamental synchronization primitive used for Locking around critical

More information

Synchronization. Before We Begin. Synchronization. Example of a Race Condition. CSE 120: Principles of Operating Systems. Lecture 4.

Synchronization. Before We Begin. Synchronization. Example of a Race Condition. CSE 120: Principles of Operating Systems. Lecture 4. CSE 120: Principles of Operating Systems Lecture 4 Synchronization October 7, 2003 Before We Begin Read Chapter 7 (Process Synchronization) Programming Assignment #1 Due Sunday, October 19, midnight Prof.

More information

CSE Traditional Operating Systems deal with typical system software designed to be:

CSE Traditional Operating Systems deal with typical system software designed to be: CSE 6431 Traditional Operating Systems deal with typical system software designed to be: general purpose running on single processor machines Advanced Operating Systems are designed for either a special

More information

Introduction to Real-Time Operating Systems

Introduction to Real-Time Operating Systems Introduction to Real-Time Operating Systems GPOS vs RTOS General purpose operating systems Real-time operating systems GPOS vs RTOS: Similarities Multitasking Resource management OS services to applications

More information

Remaining Contemplation Questions

Remaining Contemplation Questions Process Synchronisation Remaining Contemplation Questions 1. The first known correct software solution to the critical-section problem for two processes was developed by Dekker. The two processes, P0 and

More information

Semaphores. Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University

Semaphores. Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University Semaphores Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu EEE3052: Introduction to Operating Systems, Fall 2017, Jinkyu Jeong (jinkyu@skku.edu) Synchronization

More information

Dr. D. M. Akbar Hussain DE5 Department of Electronic Systems

Dr. D. M. Akbar Hussain DE5 Department of Electronic Systems Concurrency 1 Concurrency Execution of multiple processes. Multi-programming: Management of multiple processes within a uni- processor system, every system has this support, whether big, small or complex.

More information

Department of. Computer Science. Uniqueness Analysis of Array. Omega Test. October 21, Colorado State University

Department of. Computer Science. Uniqueness Analysis of Array. Omega Test. October 21, Colorado State University Department of Computer Science Uniqueness Analysis of Array Comprehensions Using the Omega Test David Garza and Wim Bohm Technical Report CS-93-127 October 21, 1993 Colorado State University Uniqueness

More information

Reactive Synchronization Algorithms for Multiprocessors

Reactive Synchronization Algorithms for Multiprocessors Synchronization Algorithms for Multiprocessors Beng-Hong Lim and Anant Agarwal Laboratory for Computer Science Massachusetts Institute of Technology Cambridge, MA 039 Abstract Synchronization algorithms

More information

PROCESS SYNCHRONIZATION

PROCESS SYNCHRONIZATION PROCESS SYNCHRONIZATION Process Synchronization Background The Critical-Section Problem Peterson s Solution Synchronization Hardware Semaphores Classic Problems of Synchronization Monitors Synchronization

More information

Implementing Mutual Exclusion. Sarah Diesburg Operating Systems CS 3430

Implementing Mutual Exclusion. Sarah Diesburg Operating Systems CS 3430 Implementing Mutual Exclusion Sarah Diesburg Operating Systems CS 3430 From the Previous Lecture The too much milk example shows that writing concurrent programs directly with load and store instructions

More information

A Unified Formalization of Four Shared-Memory Models

A Unified Formalization of Four Shared-Memory Models Computer Sciences Technical Rert #1051, September 1991, Revised September 1992 A Unified Formalization of Four Shared-Memory Models Sarita V. Adve Mark D. Hill Department of Computer Sciences University

More information

Lecture 6 Consistency and Replication

Lecture 6 Consistency and Replication Lecture 6 Consistency and Replication Prof. Wilson Rivera University of Puerto Rico at Mayaguez Electrical and Computer Engineering Department Outline Data-centric consistency Client-centric consistency

More information

Multiprocessors II: CC-NUMA DSM. CC-NUMA for Large Systems

Multiprocessors II: CC-NUMA DSM. CC-NUMA for Large Systems Multiprocessors II: CC-NUMA DSM DSM cache coherence the hardware stuff Today s topics: what happens when we lose snooping new issues: global vs. local cache line state enter the directory issues of increasing

More information

CSCI 8530 Advanced Operating Systems. Part 5 Process Coordination and Synchronization

CSCI 8530 Advanced Operating Systems. Part 5 Process Coordination and Synchronization CSCI 8530 Advanced Operating Systems Part 5 Process Coordination and Synchronization Updated: September 13, 2016 Location of Process Coordination in the Hierarchy Coordination of Processes Necessary in

More information

CSE 120: Principles of Operating Systems. Lecture 4. Synchronization. October 7, Prof. Joe Pasquale

CSE 120: Principles of Operating Systems. Lecture 4. Synchronization. October 7, Prof. Joe Pasquale CSE 120: Principles of Operating Systems Lecture 4 Synchronization October 7, 2003 Prof. Joe Pasquale Department of Computer Science and Engineering University of California, San Diego 2003 by Joseph Pasquale

More information

Ownership of a queue for practical lock-free scheduling

Ownership of a queue for practical lock-free scheduling Ownership of a queue for practical lock-free scheduling Lincoln Quirk May 4, 2008 Abstract We consider the problem of scheduling tasks in a multiprocessor. Tasks cannot always be scheduled independently

More information

CPSC/ECE 3220 Summer 2017 Exam 2

CPSC/ECE 3220 Summer 2017 Exam 2 CPSC/ECE 3220 Summer 2017 Exam 2 Name: Part 1: Word Bank Write one of the words or terms from the following list into the blank appearing to the left of the appropriate definition. Note that there are

More information

Java: Pitfalls and Strategies

Java: Pitfalls and Strategies Java: Pitfalls and Strategies Pao-Ann Hsiung National Chung Cheng University Chiayi, Taiwan Adapted from Bo Sandén, Copying with Java Threads, IEEE Computer, Vol. 37, No. 4, pp. 20-27, April 2004. Contents

More information

EE458 - Embedded Systems Lecture 8 Semaphores

EE458 - Embedded Systems Lecture 8 Semaphores EE458 - Embedded Systems Lecture 8 Semaphores Outline Introduction to Semaphores Binary and Counting Semaphores Mutexes Typical Applications RTEMS Semaphores References RTC: Chapter 6 CUG: Chapter 9 1

More information

Algorithms for Scalable Lock Synchronization on Shared-memory Multiprocessors

Algorithms for Scalable Lock Synchronization on Shared-memory Multiprocessors Algorithms for Scalable Lock Synchronization on Shared-memory Multiprocessors John Mellor-Crummey Department of Computer Science Rice University johnmc@rice.edu COMP 422/534 Lecture 21 30 March 2017 Summary

More information

What is the Race Condition? And what is its solution? What is a critical section? And what is the critical section problem?

What is the Race Condition? And what is its solution? What is a critical section? And what is the critical section problem? What is the Race Condition? And what is its solution? Race Condition: Where several processes access and manipulate the same data concurrently and the outcome of the execution depends on the particular

More information

Reminder from last <me

Reminder from last <me Concurrent systems Lecture 2: Mutual exclusion and process synchronisa

More information

Parallel Computer Architecture Spring Distributed Shared Memory Architectures & Directory-Based Memory Coherence

Parallel Computer Architecture Spring Distributed Shared Memory Architectures & Directory-Based Memory Coherence Parallel Computer Architecture Spring 2018 Distributed Shared Memory Architectures & Directory-Based Memory Coherence Nikos Bellas Computer and Communications Engineering Department University of Thessaly

More information

More Types of Synchronization 11/29/16

More Types of Synchronization 11/29/16 More Types of Synchronization 11/29/16 Today s Agenda Classic thread patterns Other parallel programming patterns More synchronization primitives: RW locks Condition variables Semaphores Message passing

More information

Page 1. Goals for Today. Atomic Read-Modify-Write instructions. Examples of Read-Modify-Write

Page 1. Goals for Today. Atomic Read-Modify-Write instructions. Examples of Read-Modify-Write Goals for Today CS162 Operating Systems and Systems Programming Lecture 5 Atomic instruction sequence Continue with Synchronization Abstractions Semaphores, Monitors and condition variables Semaphores,

More information

Taking a Virtual Machine Towards Many-Cores. Rickard Green - Patrik Nyblom -

Taking a Virtual Machine Towards Many-Cores. Rickard Green - Patrik Nyblom - Taking a Virtual Machine Towards Many-Cores Rickard Green - rickard@erlang.org Patrik Nyblom - pan@erlang.org What we all know by now Number of cores per processor AMD Opteron Intel Xeon Sparc Niagara

More information

SHARED-MEMORY COMMUNICATION

SHARED-MEMORY COMMUNICATION SHARED-MEMORY COMMUNICATION IMPLICITELY VIA MEMORY PROCESSORS SHARE SOME MEMORY COMMUNICATION IS IMPLICIT THROUGH LOADS AND STORES NEED TO SYNCHRONIZE NEED TO KNOW HOW THE HARDWARE INTERLEAVES ACCESSES

More information

Lecture 18: Coherence and Synchronization. Topics: directory-based coherence protocols, synchronization primitives (Sections

Lecture 18: Coherence and Synchronization. Topics: directory-based coherence protocols, synchronization primitives (Sections Lecture 18: Coherence and Synchronization Topics: directory-based coherence protocols, synchronization primitives (Sections 5.1-5.5) 1 Cache Coherence Protocols Directory-based: A single location (directory)

More information

Symmetric Multiprocessors: Synchronization and Sequential Consistency

Symmetric Multiprocessors: Synchronization and Sequential Consistency Constructive Computer Architecture Symmetric Multiprocessors: Synchronization and Sequential Consistency Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology November

More information

Lecture 9: Midterm Review

Lecture 9: Midterm Review Project 1 Due at Midnight Lecture 9: Midterm Review CSE 120: Principles of Operating Systems Alex C. Snoeren Midterm Everything we ve covered is fair game Readings, lectures, homework, and Nachos Yes,

More information

Generalized Iteration Space and the. Parallelization of Symbolic Programs. (Extended Abstract) Luddy Harrison. October 15, 1991.

Generalized Iteration Space and the. Parallelization of Symbolic Programs. (Extended Abstract) Luddy Harrison. October 15, 1991. Generalized Iteration Space and the Parallelization of Symbolic Programs (Extended Abstract) Luddy Harrison October 15, 1991 Abstract A large body of literature has developed concerning the automatic parallelization

More information

Using the Holey Brick Tree for Spatial Data. in General Purpose DBMSs. Northeastern University

Using the Holey Brick Tree for Spatial Data. in General Purpose DBMSs. Northeastern University Using the Holey Brick Tree for Spatial Data in General Purpose DBMSs Georgios Evangelidis Betty Salzberg College of Computer Science Northeastern University Boston, MA 02115-5096 1 Introduction There is

More information

CMSC421: Principles of Operating Systems

CMSC421: Principles of Operating Systems CMSC421: Principles of Operating Systems Nilanjan Banerjee Assistant Professor, University of Maryland Baltimore County nilanb@umbc.edu http://www.csee.umbc.edu/~nilanb/teaching/421/ Principles of Operating

More information