Lock free Algorithm for Multi-core architecture. SDY Corporation Hiromasa Kanda SDY

Size: px
Start display at page:

Download "Lock free Algorithm for Multi-core architecture. SDY Corporation Hiromasa Kanda SDY"

Transcription

1 Lock free Algorithm for Multi-core architecture Corporation Hiromasa Kanda

2 1.Introduction Background needed Multi-Thread Problem of Multi-Thread program What's Lock free? Lock-free algorithm performance 2.Lock free Algorithm Atomic operation Lock-free Algorithm basic concept Lock-free queue (using arrangement ) Lock-free queue (liner ) Queue performance Lock-free hash-map Hash-map performance 3.Summary Contents

3 1.Introduction Background needed Multi-Thread Multi-core and SMT(HT) Limited CMOS scaling Manage memory access and CPU clock What is needed in application? Micro Parallelization Multi-Process Multi-Thread Amdahl's law The speedup of a program using multiple processors in parallel computing is limited Problem of Multi-thread Shared resource synchronization

4 1.Introduction Problem of Multi-Thread program 1/2 There was resources problem that share it when concurrent access multi-thread. Thread 1 queue Thread 2 dequeue dequeue dequeue(); dequeue; 30 competition

5 1.Introduction Problem of Multi-Thread program 2/2 The traditional approach to multi-thread programming is using locks synchronize access to shared resources. Mutex,Semaphore lock wait Thread 1 lock q; q.dequeue(); unlock q 10 lock 1 2 dequeue 3 unlock queue q lock 2' dequeue 4 Thread 2 lock q; q.dequeue(); 20 unlock 5 unlock q

6 What's Lock free? 1.Introduction Lock-free is "non-blocking" algorithm that doesn't break value when access each thread at the same time. Thread 1 queue q Thread 2 dequeue 1 10 dequeue q.dequeue(); enqueue 20 q.dequeue(); q.enqueue(40); 3 Lock-free algorithm Thread Safe!

7 1.Introduction Lock-free algorithm performance Lock-free algorithm performance is better than that of mutex lock algorithm, 100 times from 10 times. The difference extends further for more CPUs.

8 2.Lock free Algorithm Queue performance Measurement 4core CPU 3GHz X86-64 clock frequency(rdtsc) /thread num std queue(use mutex) lock-free queue(liner) lock-free queue(vector) SLOW FAST Thread num

9 2.Lock free Algorithm Lock-free hash-map performance Measurement 4core CPU 3GHz X86-64 clock frequency(rdtsc) /thread num std map(use mutex) lock-free hash-map SLOW FAST Thread num

10 2.Lock free Algorithm Lock-free algorithm elements Lock-free algorithm use 2 elements. 1) atomic operation. Not use memory barrier. Blocked small memory address. This new feature permits applications to use atomically update memory without using other synchronization primitives. gcc4.1 support this operation on user space. 2) key-value transaction Atomic operation is uses one key value. Small memory address transaction composes lockfree algorithm data statement.

11 2.Lock free Algorithm Lock-free Algorithm basic concept 1/3 Basic concept while(true){ 1.Data(pointer) collect 2.Some operation 3.Data(pointer) compare & update use CAS If (CAS = true) break; Exclusive Loop end case of CAS true

12 2.Lock free Algorithm Lock-free Algorithm basic concept 2/3 Thread 1 int i Thread 2 while(true){ int j = i; int k = j + 10; collect 10 update i = 20; if(cas(&i, j, k){ break; false retry 20

13 2.Lock free Algorithm Lock-free Algorithm basic concept 3/3 Thread 1 while(true){ int j = i; int k = j + 10; retry collect int i 10 if(cas(&i, j, k){ break; true update 20 30

14 2.Lock free Algorithm Lock-free queue ( using arrangement ) 1/3 Vector lock-free queue (using arrangement) head number 0 tail number M enqueue head number 0 tail number 2 0 A 1 B M

15 2.Lock free Algorithm Lock-free queue ( using arrangement ) 2/3 enqueue(value) while(true){ tail = tail_number; Data collect (tail pointer) if ( CAS(&tail_number, tail, tail + 1 ) ) break; Loop end case of tail pointer value is equal to tail node[tail].value = value; Set value

16 2.Lock free Algorithm Lock-free queue ( using arrangement ) 3/3 dequeue(value) while(true){ head = head_number ; if(!(node[head].value) ){ if( head_number == tail pointer ) return false; else { if( CAS(&head_number, head, head + 1 ) ){ value = node[head].value; Break; Data collect (head pointer) Queue is empty? Loop end case of head pointer value is equal to head node[head].value = ; return true; Erase value

17 2.Lock free Algorithm Lock-free queue ( liner ) 1/5 Liner Lock-free queue algorithm head pointer A tail pointer X node A next B value data A node B next C value data B node C next D value data C node X next X value data X Initial state head pointer dummy tail pointer dummy dummy node next dummy value

18 2.Lock free Algorithm Lock-free queue ( liner ) 2/5 enqueue head pointer dummy tail pointer dummy 2 head pointer dummy tail pointer A dummy node next dummy value 1 node A next dummy value data A dummy node next A value node A next dummy value data A head pointer dummy tail pointer A 2 head pointer dummy tail pointer B dummy node next A value node A next dummy value data A 1 node B next dummy value data B dummy node next A value node A next B value data A node B next dummy value data B

19 Lock-free queue ( liner ) 3/5 enqueue(value) new_node = new node_type(); new_node->value = value; new_node->next = dummy; 2.Lock free Algorithm Create node. while( true ){ tail = tail_pointer; if( CAS(&tail_pointer->next, dummy, new_node) ){ TAS(&tail_pointer, new_node); break; Data collect (tail pointer). Loop end case of tail pointer's next value is equal to dummy. Update tail pointer. return true;

20 2.Lock free Algorithm Lock-free queue ( liner ) 4/5 Dequeue pattan1 Node A next dummy value data A Head pointer dummy dummy node next A value 1 tail pointer A node A next dummy value 3 Update dummy case tail = head->next Head pointer dummy dummy node next dummy value tail pointer dummy Dequeue pattan2 Update dummy 2 2 Update dummy case tail = head Node A next B value data A Head pointer B dummy node next dummy value 1 tail pointer C node B next C value data B node C next dummy value data C head pointer C dummy node next dummy tail pointer C node C next dummy value data C

21 2.Lock free Algorithm Lock-free queue ( liner ) 5/5 dequeue(value) while( true ){ head = head_pointer; next = head_pointer->next; if( head == dummy ){ if( next == dummy ) return false; if( CAS(&head_pointer, head, next->next) ){ TAS(&dummy->next,dummy); CAS(&tail_pointer, next, dummy ); value = next->value; delete next; break; else { if( CAS(&head_pointer, head, next) ){ CAS(&tail_pointer, head, dummy); value = head->value; delete head; break; return true; Data collect (head pointer,next pointer). Queue empty? Case Pattan1 Loop end case of head pointer value is equal to collect head. Case Pattan2 Loop end case of head pointer value is equal to collect head.

22 Lock-free hash-map 1/4 Lock-free hash-map algorithm hash-map 2.Lock free Algorithm key value M Hash function Key A Value A hash-map key value A M A

23 Lock-free hash-map 2/4 2.Lock free Algorithm Insert( key, value ) hashvalue = get_hashvalue( key ); pre_key = do{ for(;;){ if(!hashmap[hashvalue].key ) break; if( hashmap[hashvalue].key == key ){ pre_key = hashmap[hashvalue].key; break; else{ hashmap[hashvalue].rehash = true; hashvalue = get_rehashvalue( hashvalue ); while(!cas( &hashmap[hashvalue].key, pre_key, key ) ); Get hash value Break arrangement key is null Break arrangement key equal to key Retry re-hash Loop end CAS true TAS(&hashmap[hashvalue].value,value); Set value

24 Lock-free hash-map 3/4 2.Lock free Algorithm return value find( key ) hashvalue = get_hashvalue( key ); for(;;){ if( hashmap[hashvalue].key == key ){ return hashmap[hashvalue].value; else{ if(!hashmap[hashvalue].rehash ) return ; Get hash value Return value case arrangement key is equal to key Return null(false) Case key is not equal to key hashvalue = get_rehashvalue( hashvalue ); Retry after get rehash value

25 Lock-free hash-map 4/4 erase( key ) 2.Lock free Algorithm hashvalue = get_hashvalue( key ); do{ for(;;){ if( hashmap[hashvalue].key == key ){ pre_key = hashmap[hashvalue].key; break; else{ if(!hashmap[hashvalue].rehash ) return false; hashvalue = get_rehashvalue( hashvalue ); while(!cas( &hashmap[hashvalue].key, pre_key, ) ); TAS(&hashmap[hashvalue].value,); return true; Get hash value Data collect Not find key Retry after get rehash value Erase key and value

26 4.Summary Possible simple coding on userspace application Possible access faster than use mutex-lock I plan to create lock-free list. Present,considering some method that is find(),delete().. base on liner queue. See source forge website:

CS510 Concurrent Systems. Jonathan Walpole

CS510 Concurrent Systems. Jonathan Walpole CS510 Concurrent Systems Jonathan Walpole Simple, Fast, and Practical Non-Blocking and Blocking Concurrent Queue Algorithms utline Background Non-Blocking Queue Algorithm Two Lock Concurrent Queue Algorithm

More information

Cache-Aware Lock-Free Queues for Multiple Producers/Consumers and Weak Memory Consistency

Cache-Aware Lock-Free Queues for Multiple Producers/Consumers and Weak Memory Consistency Cache-Aware Lock-Free Queues for Multiple Producers/Consumers and Weak Memory Consistency Anders Gidenstam Håkan Sundell Philippas Tsigas School of business and informatics University of Borås Distributed

More information

Synchronising Threads

Synchronising Threads Synchronising Threads David Chisnall March 1, 2011 First Rule for Maintainable Concurrent Code No data may be both mutable and aliased Harder Problems Data is shared and mutable Access to it must be protected

More information

Concurrency - II. Recitation 3/24 Nisarg Raval Slides by Prof. Landon Cox and Vamsi Thummala

Concurrency - II. Recitation 3/24 Nisarg Raval Slides by Prof. Landon Cox and Vamsi Thummala Concurrency - II Recitation 3/24 Nisarg Raval Slides by Prof. Landon Cox and Vamsi Thummala So far lock () if (nonote && nomilk){ leave note at store unlock () buy milk lock () remove note unlock () else

More information

Hazard Pointers. Number of threads unbounded time to check hazard pointers also unbounded! difficult dynamic bookkeeping! thread B - hp1 - hp2

Hazard Pointers. Number of threads unbounded time to check hazard pointers also unbounded! difficult dynamic bookkeeping! thread B - hp1 - hp2 Hazard Pointers Store pointers of memory references about to be accessed by a thread Memory allocation checks all hazard pointers to avoid the ABA problem thread A - hp1 - hp2 thread B - hp1 - hp2 thread

More information

Lecture 7: Transactional Memory Intro. Topics: introduction to transactional memory, lazy implementation

Lecture 7: Transactional Memory Intro. Topics: introduction to transactional memory, lazy implementation Lecture 7: Transactional Memory Intro Topics: introduction to transactional memory, lazy implementation 1 Transactions New paradigm to simplify programming instead of lock-unlock, use transaction begin-end

More information

A Non-Blocking Concurrent Queue Algorithm

A Non-Blocking Concurrent Queue Algorithm A Non-Blocking Concurrent Queue Algorithm Bruno Didot bruno.didot@epfl.ch June 2012 Abstract This report presents a new non-blocking concurrent FIFO queue backed by an unrolled linked list. Enqueue and

More information

Non-blocking Array-based Algorithms for Stacks and Queues!

Non-blocking Array-based Algorithms for Stacks and Queues! Non-blocking Array-based Algorithms for Stacks and Queues! Niloufar Shafiei! Department of Computer Science and Engineering York University ICDCN 09 Outline! Introduction! Stack algorithm! Queue algorithm!

More information

Lecture 21: Transactional Memory. Topics: consistency model recap, introduction to transactional memory

Lecture 21: Transactional Memory. Topics: consistency model recap, introduction to transactional memory Lecture 21: Transactional Memory Topics: consistency model recap, introduction to transactional memory 1 Example Programs Initially, A = B = 0 P1 P2 A = 1 B = 1 if (B == 0) if (A == 0) critical section

More information

Lecture 4: Directory Protocols and TM. Topics: corner cases in directory protocols, lazy TM

Lecture 4: Directory Protocols and TM. Topics: corner cases in directory protocols, lazy TM Lecture 4: Directory Protocols and TM Topics: corner cases in directory protocols, lazy TM 1 Handling Reads When the home receives a read request, it looks up memory (speculative read) and directory in

More information

Fast and Scalable Queue-Based Resource Allocation Lock on Shared-Memory Multiprocessors

Fast and Scalable Queue-Based Resource Allocation Lock on Shared-Memory Multiprocessors Fast and Scalable Queue-Based Resource Allocation Lock on Shared-Memory Multiprocessors Deli Zhang, Brendan Lynch, and Damian Dechev University of Central Florida, Orlando, USA April 27, 2016 Mutual Exclusion

More information

Fast and Scalable Queue-Based Resource Allocation Lock on Shared-Memory Multiprocessors

Fast and Scalable Queue-Based Resource Allocation Lock on Shared-Memory Multiprocessors Background Fast and Scalable Queue-Based Resource Allocation Lock on Shared-Memory Multiprocessors Deli Zhang, Brendan Lynch, and Damian Dechev University of Central Florida, Orlando, USA December 18,

More information

Implementing Mutual Exclusion. Sarah Diesburg Operating Systems CS 3430

Implementing Mutual Exclusion. Sarah Diesburg Operating Systems CS 3430 Implementing Mutual Exclusion Sarah Diesburg Operating Systems CS 3430 From the Previous Lecture The too much milk example shows that writing concurrent programs directly with load and store instructions

More information

Synchronization. Before We Begin. Synchronization. Credit/Debit Problem: Race Condition. CSE 120: Principles of Operating Systems.

Synchronization. Before We Begin. Synchronization. Credit/Debit Problem: Race Condition. CSE 120: Principles of Operating Systems. CSE 120: Principles of Operating Systems Lecture 4 Synchronization January 23, 2006 Prof. Joe Pasquale Department of Computer Science and Engineering University of California, San Diego Before We Begin

More information

Monitors; Software Transactional Memory

Monitors; Software Transactional Memory Monitors; Software Transactional Memory Parallel and Distributed Computing Department of Computer Science and Engineering (DEI) Instituto Superior Técnico October 18, 2012 CPD (DEI / IST) Parallel and

More information

Distributed Systems Synchronization. Marcus Völp 2007

Distributed Systems Synchronization. Marcus Völp 2007 Distributed Systems Synchronization Marcus Völp 2007 1 Purpose of this Lecture Synchronization Locking Analysis / Comparison Distributed Operating Systems 2007 Marcus Völp 2 Overview Introduction Hardware

More information

Scalable Concurrent Hash Tables via Relativistic Programming

Scalable Concurrent Hash Tables via Relativistic Programming Scalable Concurrent Hash Tables via Relativistic Programming Josh Triplett September 24, 2009 Speed of data < Speed of light Speed of light: 3e8 meters/second Processor speed: 3 GHz, 3e9 cycles/second

More information

Lecture 6: Lazy Transactional Memory. Topics: TM semantics and implementation details of lazy TM

Lecture 6: Lazy Transactional Memory. Topics: TM semantics and implementation details of lazy TM Lecture 6: Lazy Transactional Memory Topics: TM semantics and implementation details of lazy TM 1 Transactions Access to shared variables is encapsulated within transactions the system gives the illusion

More information

CSC 1052 Algorithms & Data Structures II: Linked Queues

CSC 1052 Algorithms & Data Structures II: Linked Queues CSC 1052 Algorithms & Data Structures II: Linked Queues Professor Henry Carter Spring 2018 Recap A queue simulates a waiting line, where objects are removed in the same order they are added The primary

More information

CS 31: Introduction to Computer Systems : Threads & Synchronization April 16-18, 2019

CS 31: Introduction to Computer Systems : Threads & Synchronization April 16-18, 2019 CS 31: Introduction to Computer Systems 22-23: Threads & Synchronization April 16-18, 2019 Making Programs Run Faster We all like how fast computers are In the old days (1980 s - 2005): Algorithm too slow?

More information

C09: Process Synchronization

C09: Process Synchronization CISC 7310X C09: Process Synchronization Hui Chen Department of Computer & Information Science CUNY Brooklyn College 3/29/2018 CUNY Brooklyn College 1 Outline Race condition and critical regions The bounded

More information

Lindsay Groves, Simon Doherty. Mark Moir, Victor Luchangco

Lindsay Groves, Simon Doherty. Mark Moir, Victor Luchangco Lindsay Groves, Simon Doherty Victoria University of Wellington Mark Moir, Victor Luchangco Sun Microsystems, Boston (FORTE, Madrid, September, 2004) Lock-based concurrency doesn t scale Lock-free/non-blocking

More information

PROCESS SYNCHRONIZATION

PROCESS SYNCHRONIZATION PROCESS SYNCHRONIZATION Process Synchronization Background The Critical-Section Problem Peterson s Solution Synchronization Hardware Semaphores Classic Problems of Synchronization Monitors Synchronization

More information

Distributed Operating Systems

Distributed Operating Systems Distributed Operating Systems Synchronization in Parallel Systems Marcus Völp 2009 1 Topics Synchronization Locking Analysis / Comparison Distributed Operating Systems 2009 Marcus Völp 2 Overview Introduction

More information

Grand Central Dispatch. Sri Teja Basava CSCI 5528: Foundations of Software Engineering Spring 10

Grand Central Dispatch. Sri Teja Basava CSCI 5528: Foundations of Software Engineering Spring 10 Grand Central Dispatch Sri Teja Basava CSCI 5528: Foundations of Software Engineering Spring 10 1 New Technologies in Snow Leopard 2 Grand Central Dispatch An Apple technology to optimize application support

More information

Lecture 21: Transactional Memory. Topics: Hardware TM basics, different implementations

Lecture 21: Transactional Memory. Topics: Hardware TM basics, different implementations Lecture 21: Transactional Memory Topics: Hardware TM basics, different implementations 1 Transactions New paradigm to simplify programming instead of lock-unlock, use transaction begin-end locks are blocking,

More information

CS 241 Honors Concurrent Data Structures

CS 241 Honors Concurrent Data Structures CS 241 Honors Concurrent Data Structures Bhuvan Venkatesh University of Illinois Urbana Champaign March 27, 2018 CS 241 Course Staff (UIUC) Lock Free Data Structures March 27, 2018 1 / 43 What to go over

More information

Synchronization. Before We Begin. Synchronization. Example of a Race Condition. CSE 120: Principles of Operating Systems. Lecture 4.

Synchronization. Before We Begin. Synchronization. Example of a Race Condition. CSE 120: Principles of Operating Systems. Lecture 4. CSE 120: Principles of Operating Systems Lecture 4 Synchronization October 7, 2003 Before We Begin Read Chapter 7 (Process Synchronization) Programming Assignment #1 Due Sunday, October 19, midnight Prof.

More information

Midterm Exam Amy Murphy 6 March 2002

Midterm Exam Amy Murphy 6 March 2002 University of Rochester Midterm Exam Amy Murphy 6 March 2002 Computer Systems (CSC2/456) Read before beginning: Please write clearly. Illegible answers cannot be graded. Be sure to identify all of your

More information

CSE Traditional Operating Systems deal with typical system software designed to be:

CSE Traditional Operating Systems deal with typical system software designed to be: CSE 6431 Traditional Operating Systems deal with typical system software designed to be: general purpose running on single processor machines Advanced Operating Systems are designed for either a special

More information

Brushing the Locks out of the Fur: A Lock-Free Work Stealing Library Based on Wool

Brushing the Locks out of the Fur: A Lock-Free Work Stealing Library Based on Wool Brushing the Locks out of the Fur: A Lock-Free Work Stealing Library Based on Wool Håkan Sundell School of Business and Informatics University of Borås, 50 90 Borås E-mail: Hakan.Sundell@hb.se Philippas

More information

Monitors; Software Transactional Memory

Monitors; Software Transactional Memory Monitors; Software Transactional Memory Parallel and Distributed Computing Department of Computer Science and Engineering (DEI) Instituto Superior Técnico March 17, 2016 CPD (DEI / IST) Parallel and Distributed

More information

Mutex Implementation

Mutex Implementation COS 318: Operating Systems Mutex Implementation Jaswinder Pal Singh Computer Science Department Princeton University (http://www.cs.princeton.edu/courses/cos318/) Revisit Mutual Exclusion (Mutex) u Critical

More information

== isn t always equal?

== isn t always equal? == isn t always equal? In Java, == does the expected for primitives. int a = 26; int b = 26; // a == b is true int a = 13; int b = 26; // a == b is false Comparing two references checks if they are pointing

More information

Order Is A Lie. Are you sure you know how your code runs?

Order Is A Lie. Are you sure you know how your code runs? Order Is A Lie Are you sure you know how your code runs? Order in code is not respected by Compilers Processors (out-of-order execution) SMP Cache Management Understanding execution order in a multithreaded

More information

Martin Kruliš, v

Martin Kruliš, v Martin Kruliš 1 Optimizations in General Code And Compilation Memory Considerations Parallelism Profiling And Optimization Examples 2 Premature optimization is the root of all evil. -- D. Knuth Our goal

More information

CSE 120: Principles of Operating Systems. Lecture 4. Synchronization. October 7, Prof. Joe Pasquale

CSE 120: Principles of Operating Systems. Lecture 4. Synchronization. October 7, Prof. Joe Pasquale CSE 120: Principles of Operating Systems Lecture 4 Synchronization October 7, 2003 Prof. Joe Pasquale Department of Computer Science and Engineering University of California, San Diego 2003 by Joseph Pasquale

More information

Parallelization and Synchronization. CS165 Section 8

Parallelization and Synchronization. CS165 Section 8 Parallelization and Synchronization CS165 Section 8 Multiprocessing & the Multicore Era Single-core performance stagnates (breakdown of Dennard scaling) Moore s law continues use additional transistors

More information

Concurrency and High Performance Reloaded

Concurrency and High Performance Reloaded Concurrency and High Performance Reloaded Disclaimer Any performance tuning advice provided in this presentation... will be wrong! 2 www.kodewerk.com Me Work as independent (a.k.a. freelancer) performance

More information

Concept of a process

Concept of a process Concept of a process In the context of this course a process is a program whose execution is in progress States of a process: running, ready, blocked Submit Ready Running Completion Blocked Concurrent

More information

Parallelism and Data Synchronization

Parallelism and Data Synchronization Project 5 Parallelism and Data Synchronization Out: In class on Tuesday, 10 Nov 2009 Due: 10am on Friday, 20 Nov 2009 In this project, you will be focusing on the theoretical side of parallelism and data

More information

Fast and Lock-Free Concurrent Priority Queues for Multi-Thread Systems

Fast and Lock-Free Concurrent Priority Queues for Multi-Thread Systems Fast and Lock-Free Concurrent Priority Queues for Multi-Thread Systems Håkan Sundell Philippas Tsigas Outline Synchronization Methods Priority Queues Concurrent Priority Queues Lock-Free Algorithm: Problems

More information

Atomicity CS 2110 Fall 2017

Atomicity CS 2110 Fall 2017 Atomicity CS 2110 Fall 2017 Parallel Programming Thus Far Parallel programs can be faster and more efficient Problem: race conditions Solution: synchronization Are there more efficient ways to ensure the

More information

Whatever can go wrong will go wrong. attributed to Edward A. Murphy. Murphy was an optimist. authors of lock-free programs LOCK FREE KERNEL

Whatever can go wrong will go wrong. attributed to Edward A. Murphy. Murphy was an optimist. authors of lock-free programs LOCK FREE KERNEL Whatever can go wrong will go wrong. attributed to Edward A. Murphy Murphy was an optimist. authors of lock-free programs LOCK FREE KERNEL 251 Literature Maurice Herlihy and Nir Shavit. The Art of Multiprocessor

More information

That means circular linked list is similar to the single linked list except that the last node points to the first node in the list.

That means circular linked list is similar to the single linked list except that the last node points to the first node in the list. Leaning Objective: In this Module you will be learning the following: Circular Linked Lists and it operations Introduction: Circular linked list is a sequence of elements in which every element has link

More information

CSE 486/586 Distributed Systems

CSE 486/586 Distributed Systems CSE 486/586 Distributed Systems Mutual Exclusion Steve Ko Computer Sciences and Engineering University at Buffalo CSE 486/586 Recap: Consensus On a synchronous system There s an algorithm that works. On

More information

CPSC 261 Midterm 2 Thursday March 17 th, 2016

CPSC 261 Midterm 2 Thursday March 17 th, 2016 CPSC 261 Midterm 2 Thursday March 17 th, 2016 [9] 1. Multiple choices [5] (a) Among the following terms, circle all of those that refer to a responsibility of a thread scheduler: Solution : Avoiding deadlocks

More information

Proving linearizability & lock-freedom

Proving linearizability & lock-freedom Proving linearizability & lock-freedom Viktor Vafeiadis MPI-SWS Michael & Scott non-blocking queue head tail X 1 3 2 null CAS compare & swap CAS (address, expectedvalue, newvalue) { atomic { if ( *address

More information

CS 153 Lab4 and 5. Kishore Kumar Pusukuri. Kishore Kumar Pusukuri CS 153 Lab4 and 5

CS 153 Lab4 and 5. Kishore Kumar Pusukuri. Kishore Kumar Pusukuri CS 153 Lab4 and 5 CS 153 Lab4 and 5 Kishore Kumar Pusukuri Outline Introduction A thread is a straightforward concept : a single sequential flow of control. In traditional operating systems, each process has an address

More information

Other consistency models

Other consistency models Last time: Symmetric multiprocessing (SMP) Lecture 25: Synchronization primitives Computer Architecture and Systems Programming (252-0061-00) CPU 0 CPU 1 CPU 2 CPU 3 Timothy Roscoe Herbstsemester 2012

More information

Queue ADT. January 31, 2018 Cinda Heeren / Geoffrey Tien 1

Queue ADT. January 31, 2018 Cinda Heeren / Geoffrey Tien 1 Queue ADT Cinda Heeren / Geoffrey Tien 1 PA1 testing Your code must compile with our private test file Any non-compiling submissions will receive zero Note that only functions that are called will be compiled

More information

Today: Synchronization. Recap: Synchronization

Today: Synchronization. Recap: Synchronization Today: Synchronization Synchronization Mutual exclusion Critical sections Example: Too Much Milk Locks Synchronization primitives are required to ensure that only one thread executes in a critical section

More information

SYNCHRONIZATION M O D E R N O P E R A T I N G S Y S T E M S R E A D 2. 3 E X C E P T A N D S P R I N G 2018

SYNCHRONIZATION M O D E R N O P E R A T I N G S Y S T E M S R E A D 2. 3 E X C E P T A N D S P R I N G 2018 SYNCHRONIZATION M O D E R N O P E R A T I N G S Y S T E M S R E A D 2. 3 E X C E P T 2. 3. 8 A N D 2. 3. 1 0 S P R I N G 2018 INTER-PROCESS COMMUNICATION 1. How a process pass information to another process

More information

Queues. October 20, 2017 Hassan Khosravi / Geoffrey Tien 1

Queues. October 20, 2017 Hassan Khosravi / Geoffrey Tien 1 Queues October 20, 2017 Hassan Khosravi / Geoffrey Tien 1 Queue ADT Queue ADT should support at least the first two operations: enqueue insert an item to the back of the queue dequeue remove an item from

More information

Last Class: Synchronization

Last Class: Synchronization Last Class: Synchronization Synchronization primitives are required to ensure that only one thread executes in a critical section at a time. Concurrent programs Low-level atomic operations (hardware) load/store

More information

Multithreading and Interactive Programs

Multithreading and Interactive Programs Multithreading and Interactive Programs CS160: User Interfaces John Canny. Last time Model-View-Controller Break up a component into Model of the data supporting the App View determining the look of the

More information

Lecture 9: Multiprocessor OSs & Synchronization. CSC 469H1F Fall 2006 Angela Demke Brown

Lecture 9: Multiprocessor OSs & Synchronization. CSC 469H1F Fall 2006 Angela Demke Brown Lecture 9: Multiprocessor OSs & Synchronization CSC 469H1F Fall 2006 Angela Demke Brown The Problem Coordinated management of shared resources Resources may be accessed by multiple threads Need to control

More information

CSE 153 Design of Operating Systems

CSE 153 Design of Operating Systems CSE 153 Design of Operating Systems Winter 19 Lecture 7/8: Synchronization (1) Administrivia How is Lab going? Be prepared with questions for this weeks Lab My impression from TAs is that you are on track

More information

CS 25200: Systems Programming. Lecture 26: Classic Synchronization Problems

CS 25200: Systems Programming. Lecture 26: Classic Synchronization Problems CS 25200: Systems Programming Lecture 26: Classic Synchronization Problems Dr. Jef Turkstra 2018 Dr. Jeffrey A. Turkstra 1 Announcements Lab 5 posted this evening Web server Password protection SSL cgi-bin

More information

Discussion 2C Notes (Week 3, January 21) TA: Brian Choi Section Webpage:

Discussion 2C Notes (Week 3, January 21) TA: Brian Choi Section Webpage: Discussion 2C Notes (Week 3, January 21) TA: Brian Choi (schoi@cs.ucla.edu) Section Webpage: http://www.cs.ucla.edu/~schoi/cs32 Abstraction In Homework 1, you were asked to build a class called Bag. Let

More information

8. Fundamental Data Structures

8. Fundamental Data Structures 172 8. Fundamental Data Structures Abstract data types stack, queue, implementation variants for linked lists, [Ottman/Widmayer, Kap. 1.5.1-1.5.2, Cormen et al, Kap. 10.1.-10.2] Abstract Data Types 173

More information

Parallel Computing. Lecture 16: OpenMP - IV

Parallel Computing. Lecture 16: OpenMP - IV CSCI-UA.0480-003 Parallel Computing Lecture 16: OpenMP - IV Mohamed Zahran (aka Z) mzahran@cs.nyu.edu http://www.mzahran.com PRODUCERS AND CONSUMERS Queues A natural data structure to use in many multithreaded

More information

Introduction to the Stack. Stacks and Queues. Stack Operations. Stack illustrated. elements of the same type. Week 9. Gaddis: Chapter 18

Introduction to the Stack. Stacks and Queues. Stack Operations. Stack illustrated. elements of the same type. Week 9. Gaddis: Chapter 18 Stacks and Queues Week 9 Gaddis: Chapter 18 CS 5301 Fall 2015 Jill Seaman Introduction to the Stack Stack: a data structure that holds a collection of elements of the same type. - The elements are accessed

More information

Chapter 20: Binary Trees

Chapter 20: Binary Trees Chapter 20: Binary Trees 20.1 Definition and Application of Binary Trees Definition and Application of Binary Trees Binary tree: a nonlinear linked list in which each node may point to 0, 1, or two other

More information

An Introduction to Parallel Systems

An Introduction to Parallel Systems Lecture 4 - Shared Resource Parallelism University of Bath December 6, 2007 When Week 1 Introduction Who, What, Why, Where, When? Week 2 Data Parallelism and Vector Processors Week 3 Message Passing Systems

More information

Queues. ADT description Implementations. October 03, 2017 Cinda Heeren / Geoffrey Tien 1

Queues. ADT description Implementations. October 03, 2017 Cinda Heeren / Geoffrey Tien 1 Queues ADT description Implementations Cinda Heeren / Geoffrey Tien 1 Queues Assume that we want to store data for a print queue for a student printer Student ID Time File name The printer is to be assigned

More information

RCU in the Linux Kernel: One Decade Later

RCU in the Linux Kernel: One Decade Later RCU in the Linux Kernel: One Decade Later by: Paul E. Mckenney, Silas Boyd-Wickizer, Jonathan Walpole Slides by David Kennedy (and sources) RCU Usage in Linux During this same time period, the usage of

More information

Lock-Free Techniques for Concurrent Access to Shared Objects

Lock-Free Techniques for Concurrent Access to Shared Objects This is a revised version of the previously published paper. It includes a contribution from Shahar Frank who raised a problem with the fifo-pop algorithm. Revised version date: sept. 30 2003. Lock-Free

More information

EECS 482 Introduction to Operating Systems

EECS 482 Introduction to Operating Systems EECS 482 Introduction to Operating Systems Winter 2018 Baris Kasikci Slides by: Harsha V. Madhyastha http://knowyourmeme.com/memes/mind-blown 2 Recap: Processes Hardware interface: app1+app2+app3 CPU +

More information

CS 333 Introduction to Operating Systems. Class 3 Threads & Concurrency. Jonathan Walpole Computer Science Portland State University

CS 333 Introduction to Operating Systems. Class 3 Threads & Concurrency. Jonathan Walpole Computer Science Portland State University CS 333 Introduction to Operating Systems Class 3 Threads & Concurrency Jonathan Walpole Computer Science Portland State University 1 Process creation in UNIX All processes have a unique process id getpid(),

More information

COMP 3430 Robert Guderian

COMP 3430 Robert Guderian Operating Systems COMP 3430 Robert Guderian file:///users/robg/dropbox/teaching/3430-2018/slides/06_concurrency/index.html?print-pdf#/ 1/76 1 Concurrency file:///users/robg/dropbox/teaching/3430-2018/slides/06_concurrency/index.html?print-pdf#/

More information

Synchronization 1. Synchronization

Synchronization 1. Synchronization Synchronization 1 Synchronization key concepts critical sections, mutual exclusion, test-and-set, spinlocks, blocking and blocking locks, semaphores, condition variables, deadlocks reading Three Easy Pieces:

More information

Threads and Locks, Part 2. CSCI 5828: Foundations of Software Engineering Lecture 08 09/18/2014

Threads and Locks, Part 2. CSCI 5828: Foundations of Software Engineering Lecture 08 09/18/2014 Threads and Locks, Part 2 CSCI 5828: Foundations of Software Engineering Lecture 08 09/18/2014 1 Goals Cover the material presented in Chapter 2 of our concurrency textbook In particular, selected material

More information

Semaphore. Originally called P() and V() wait (S) { while S <= 0 ; // no-op S--; } signal (S) { S++; }

Semaphore. Originally called P() and V() wait (S) { while S <= 0 ; // no-op S--; } signal (S) { S++; } Semaphore Semaphore S integer variable Two standard operations modify S: wait() and signal() Originally called P() and V() Can only be accessed via two indivisible (atomic) operations wait (S) { while

More information

JAVA CONCURRENCY FRAMEWORK. Kaushik Kanetkar

JAVA CONCURRENCY FRAMEWORK. Kaushik Kanetkar JAVA CONCURRENCY FRAMEWORK Kaushik Kanetkar Old days One CPU, executing one single program at a time No overlap of work/processes Lots of slack time CPU not completely utilized What is Concurrency Concurrency

More information

Threads Cannot Be Implemented As a Library

Threads Cannot Be Implemented As a Library Threads Cannot Be Implemented As a Library Authored by Hans J. Boehm Presented by Sarah Sharp February 18, 2008 Outline POSIX Thread Library Operation Vocab Problems with pthreads POSIX Thread Library

More information

CS 450 Exam 2 Mon. 4/11/2016

CS 450 Exam 2 Mon. 4/11/2016 CS 450 Exam 2 Mon. 4/11/2016 Name: Rules and Hints You may use one handwritten 8.5 11 cheat sheet (front and back). This is the only additional resource you may consult during this exam. No calculators.

More information

G52CON: Concepts of Concurrency

G52CON: Concepts of Concurrency G52CON: Concepts of Concurrency Lecture 4: Atomic Actions Natasha Alechina School of Computer Science nza@cs.nott.ac.uk Outline of the lecture process execution fine-grained atomic actions using fine-grained

More information

Lock Oscillation: Boosting the Performance of Concurrent Data Structures

Lock Oscillation: Boosting the Performance of Concurrent Data Structures Lock Oscillation: Boosting the Performance of Concurrent Data Structures Panagiota Fatourou FORTH ICS & University of Crete Nikolaos D. Kallimanis FORTH ICS The Multicore Era The dominance of Multicore

More information

.:: UNIT 4 ::. STACK AND QUEUE

.:: UNIT 4 ::. STACK AND QUEUE .:: UNIT 4 ::. STACK AND QUEUE 4.1 A stack is a data structure that supports: Push(x) Insert x to the top element in stack Pop Remove the top item from stack A stack is collection of data item arrange

More information

Whatever can go wrong will go wrong. attributed to Edward A. Murphy. Murphy was an optimist. authors of lock-free programs 3.

Whatever can go wrong will go wrong. attributed to Edward A. Murphy. Murphy was an optimist. authors of lock-free programs 3. Whatever can go wrong will go wrong. attributed to Edward A. Murphy Murphy was an optimist. authors of lock-free programs 3. LOCK FREE KERNEL 309 Literature Maurice Herlihy and Nir Shavit. The Art of Multiprocessor

More information

Programming Paradigms for Concurrency Lecture 3 Concurrent Objects

Programming Paradigms for Concurrency Lecture 3 Concurrent Objects Programming Paradigms for Concurrency Lecture 3 Concurrent Objects Based on companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit Modified by Thomas Wies New York University

More information

! A data type for which: ! An ADT may be implemented using various. ! Examples:

! A data type for which: ! An ADT may be implemented using various. ! Examples: Stacks and Queues Unit 6 Chapter 19.1-2,4-5 CS 2308 Fall 2018 Jill Seaman 1 Abstract Data Type A data type for which: - only the properties of the data and the operations to be performed on the data are

More information

CS4961 Parallel Programming. Lecture 12: Advanced Synchronization (Pthreads) 10/4/11. Administrative. Mary Hall October 4, 2011

CS4961 Parallel Programming. Lecture 12: Advanced Synchronization (Pthreads) 10/4/11. Administrative. Mary Hall October 4, 2011 CS4961 Parallel Programming Lecture 12: Advanced Synchronization (Pthreads) Mary Hall October 4, 2011 Administrative Thursday s class Meet in WEB L130 to go over programming assignment Midterm on Thursday

More information

Research on the Implementation of MPI on Multicore Architectures

Research on the Implementation of MPI on Multicore Architectures Research on the Implementation of MPI on Multicore Architectures Pengqi Cheng Department of Computer Science & Technology, Tshinghua University, Beijing, China chengpq@gmail.com Yan Gu Department of Computer

More information

Synchronization. Disclaimer: some slides are adopted from the book authors slides with permission 1

Synchronization. Disclaimer: some slides are adopted from the book authors slides with permission 1 Synchronization Disclaimer: some slides are adopted from the book authors slides with permission 1 What is it? Recap: Thread Independent flow of control What does it need (thread private)? Stack What for?

More information

Synchronization II: EventBarrier, Monitor, and a Semaphore. COMPSCI210 Recitation 4th Mar 2013 Vamsi Thummala

Synchronization II: EventBarrier, Monitor, and a Semaphore. COMPSCI210 Recitation 4th Mar 2013 Vamsi Thummala Synchronization II: EventBarrier, Monitor, and a Semaphore COMPSCI210 Recitation 4th Mar 2013 Vamsi Thummala Check point: Mission in progress Master synchronization techniques Develop best practices for

More information

Assignment 1. Check your mailbox on Thursday! Grade and feedback published by tomorrow.

Assignment 1. Check your mailbox on Thursday! Grade and feedback published by tomorrow. Assignment 1 Check your mailbox on Thursday! Grade and feedback published by tomorrow. COMP250: Queues, deques, and doubly-linked lists Lecture 20 Jérôme Waldispühl School of Computer Science McGill University

More information

Operating Systems. Lecture 4 - Concurrency and Synchronization. Master of Computer Science PUF - Hồ Chí Minh 2016/2017

Operating Systems. Lecture 4 - Concurrency and Synchronization. Master of Computer Science PUF - Hồ Chí Minh 2016/2017 Operating Systems Lecture 4 - Concurrency and Synchronization Adrien Krähenbühl Master of Computer Science PUF - Hồ Chí Minh 2016/2017 Mutual exclusion Hardware solutions Semaphores IPC: Message passing

More information

Programming in Parallel COMP755

Programming in Parallel COMP755 Programming in Parallel COMP755 All games have morals; and the game of Snakes and Ladders captures, as no other activity can hope to do, the eternal truth that for every ladder you hope to climb, a snake

More information

Reminder from last time

Reminder from last time Concurrent systems Lecture 2: More mutual exclusion, semaphores, and producer-consumer relationships DrRobert N. M. Watson 1 Reminder from last time Definition of a concurrent system Origins of concurrency

More information

Characterizing the Performance and Energy Efficiency of Lock-Free Data Structures

Characterizing the Performance and Energy Efficiency of Lock-Free Data Structures Characterizing the Performance and Energy Efficiency of Lock-Free Data Structures Nicholas Hunt Paramjit Singh Sandhu Luis Ceze University of Washington {nhunt,paramsan,luisceze}@cs.washington.edu Abstract

More information

Advance Operating Systems (CS202) Locks Discussion

Advance Operating Systems (CS202) Locks Discussion Advance Operating Systems (CS202) Locks Discussion Threads Locks Spin Locks Array-based Locks MCS Locks Sequential Locks Road Map Threads Global variables and static objects are shared Stored in the static

More information

EECS 482 Introduction to Operating Systems

EECS 482 Introduction to Operating Systems EECS 482 Introduction to Operating Systems Fall 2017 Manos Kapritsos Slides by: Harsha V. Madhyastha Implementing reader-writer locks with monitors Shared data needed to implement readerstart, readerfinish,

More information

Condition Variables CS 241. Prof. Brighten Godfrey. March 16, University of Illinois

Condition Variables CS 241. Prof. Brighten Godfrey. March 16, University of Illinois Condition Variables CS 241 Prof. Brighten Godfrey March 16, 2012 University of Illinois 1 Synchronization primitives Mutex locks Used for exclusive access to a shared resource (critical section) Operations:

More information

BQ: A Lock-Free Queue with Batching

BQ: A Lock-Free Queue with Batching BQ: A Lock-Free Queue with Batching Gal Milman Technion, Israel galy@cs.technion.ac.il Alex Kogan Oracle Labs, USA alex.kogan@oracle.com Yossi Lev Oracle Labs, USA levyossi@icloud.com ABSTRACT Victor Luchangco

More information

Introduction to the Stack. Stacks and Queues. Stack Operations. Stack illustrated. Week 9. elements of the same type.

Introduction to the Stack. Stacks and Queues. Stack Operations. Stack illustrated. Week 9. elements of the same type. Stacks and Queues Week 9 Gaddis: Chapter 18 (8th ed.) Gaddis: Chapter 19 (9th ed.) CS 5301 Fall 2018 Jill Seaman Introduction to the Stack Stack: a data structure that holds a collection of elements of

More information

CS510 Advanced Topics in Concurrency. Jonathan Walpole

CS510 Advanced Topics in Concurrency. Jonathan Walpole CS510 Advanced Topics in Concurrency Jonathan Walpole Threads Cannot Be Implemented as a Library Reasoning About Programs What are the valid outcomes for this program? Is it valid for both r1 and r2 to

More information

CS 470 Spring Mike Lam, Professor. Semaphores and Conditions

CS 470 Spring Mike Lam, Professor. Semaphores and Conditions CS 470 Spring 2018 Mike Lam, Professor Semaphores and Conditions Synchronization mechanisms Busy-waiting (wasteful!) Atomic instructions (e.g., LOCK prefix in x86) Pthreads Mutex: simple mutual exclusion

More information

Section I Section Real Time Systems. Processes. 1.4 Inter-Processes Communication (part 1)

Section I Section Real Time Systems. Processes. 1.4 Inter-Processes Communication (part 1) EE206: Software Engineering IV 1.4 Inter-Processes Communication 1 page 1 of 16 Section I Section Real Time Systems. Processes 1.4 Inter-Processes Communication (part 1) Process interaction Processes often

More information