Threads Cannot Be Implemented As a Library

Similar documents
CS510 Advanced Topics in Concurrency. Jonathan Walpole

Foundations of the C++ Concurrency Memory Model

Synchronising Threads

CS 153 Lab4 and 5. Kishore Kumar Pusukuri. Kishore Kumar Pusukuri CS 153 Lab4 and 5

Shared Memory Parallel Programming with Pthreads An overview

Programming with Shared Memory. Nguyễn Quang Hùng

CSE 374 Programming Concepts & Tools

Symmetric Multiprocessors: Synchronization and Sequential Consistency

Synchronization I. Jo, Heeseung

Operating systems. Lecture 3 Interprocess communication. Narcis ILISEI, 2018

Pre-lab #2 tutorial. ECE 254 Operating Systems and Systems Programming. May 24, 2012

Processor speed. Concurrency Structure and Interpretation of Computer Programs. Multiple processors. Processor speed. Mike Phillips <mpp>

CS4961 Parallel Programming. Lecture 12: Advanced Synchronization (Pthreads) 10/4/11. Administrative. Mary Hall October 4, 2011

CMSC421: Principles of Operating Systems

Chapter 2 Processes and Threads

CS 220: Introduction to Parallel Computing. Introduction to CUDA. Lecture 28

POSIX Threads: a first step toward parallel programming. George Bosilca

C++ Memory Model Tutorial

So far, we know: Wednesday, October 4, Thread_Programming Page 1

Sharing Objects Ch. 3

Warm-up question (CS 261 review) What is the primary difference between processes and threads from a developer s perspective?

Synchronization. CS61, Lecture 18. Prof. Stephen Chong November 3, 2011

CSci 4061 Introduction to Operating Systems. Synchronization Basics: Locks

Mid-term Roll no: Scheduler Activations: Effective Kernel Support for the User-Level Management of Parallelism

Threads and Synchronization. Kevin Webb Swarthmore College February 15, 2018

CSE 451 Midterm 1. Name:

Computer Systems Laboratory Sungkyunkwan University

Java Memory Model. Jian Cao. Department of Electrical and Computer Engineering Rice University. Sep 22, 2016

CSE 153 Design of Operating Systems Fall 2018

ENCM 501 Winter 2019 Assignment 9

C09: Process Synchronization

Advanced Threads & Monitor-Style Programming 1/24

M4 Parallelism. Implementation of Locks Cache Coherence

The New Java Technology Memory Model

Final Exam. 12 December 2018, 120 minutes, 26 questions, 100 points

CS 153 Lab6. Kishore Kumar Pusukuri

Module 6: Process Synchronization. Operating System Concepts with Java 8 th Edition

THREADS. Jo, Heeseung

Memory model for multithreaded C++: August 2005 status update

Multithreading Programming II

Programming in Parallel COMP755

Operating Systems Structure

Systems Programming/ C and UNIX

Programming Language Memory Models: What do Shared Variables Mean?

Chapter 4 Concurrent Programming

Introduction to Parallel Programming Part 4 Confronting Race Conditions

The Java Memory Model

Chapter 6: Process Synchronization. Module 6: Process Synchronization

POSIX Threads and OpenMP tasks

MULTITHREADING AND SYNCHRONIZATION. CS124 Operating Systems Fall , Lecture 10

HPCSE - I. «Introduction to multithreading» Panos Hadjidoukas

Paralleland Distributed Programming. Concurrency

Concurrency: a crash course

Discussion CSE 224. Week 4

Other consistency models

Final Exam. 11 May 2018, 120 minutes, 26 questions, 100 points

CS 3305 Intro to Threads. Lecture 6

Implementing Mutual Exclusion. Sarah Diesburg Operating Systems CS 3430

COMP6771 Advanced C++ Programming

The mutual-exclusion problem involves making certain that two things don t happen at once. A non-computer example arose in the fighter aircraft of

Using Weakly Ordered C++ Atomics Correctly. Hans-J. Boehm

Thread-Local. Lecture 27: Concurrency 3. Dealing with the Rest. Immutable. Whenever possible, don t share resources

Synchronization. Disclaimer: some slides are adopted from the book authors slides with permission 1

CS 5523 Operating Systems: Midterm II - reivew Instructor: Dr. Tongping Liu Department Computer Science The University of Texas at San Antonio

High Performance Computing Lecture 21. Matthew Jacob Indian Institute of Science

Solving the Producer Consumer Problem with PThreads

Synchronization I. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

CS Lecture 3! Threads! George Mason University! Spring 2010!

Threads. Thread Concept Multithreading Models User & Kernel Threads Pthreads Threads in Solaris, Linux, Windows. 2/13/11 CSE325 - Threads 1

Systèmes d Exploitation Avancés

Questions from last time

CS370 Operating Systems

Introduction to PThreads and Basic Synchronization

Concurrent Programming Lecture 10

Operating Systems (2INC0) 2017/18

Week 3. Locks & Semaphores

Audience. Revising the Java Thread/Memory Model. Java Thread Specification. Revising the Thread Spec. Proposed Changes. When s the JSR?

ECE 574 Cluster Computing Lecture 8

CS 450 Exam 2 Mon. 4/11/2016

More Shared Memory Programming

CS 333 Introduction to Operating Systems. Class 3 Threads & Concurrency. Jonathan Walpole Computer Science Portland State University

Introduction to OS Synchronization MOS 2.3

Chapter 5: Process Synchronization. Operating System Concepts 9 th Edition

Overview of Lecture 4. Memory Models, Atomicity & Performance. Ben-Ari Concurrency Model. Dekker s Algorithm 4

CS516 Programming Languages and Compilers II

CS 31: Introduction to Computer Systems : Threads & Synchronization April 16-18, 2019

Problem. Context. Hash table

Process Synchronization

OPERATING SYSTEMS DESIGN AND IMPLEMENTATION Third Edition ANDREW S. TANENBAUM ALBERT S. WOODHULL. Chap. 2.2 Interprocess Communication

CS 333 Introduction to Operating Systems. Class 3 Threads & Concurrency. Jonathan Walpole Computer Science Portland State University

Outline. CS4254 Computer Network Architecture and Programming. Introduction 2/4. Introduction 1/4. Dr. Ayman A. Abdel-Hamid.

High-level Synchronization

CSE 451: Operating Systems Winter Lecture 7 Synchronization. Steve Gribble. Synchronization. Threads cooperate in multithreaded programs

CS 470 Spring Mike Lam, Professor. Advanced OpenMP

Warm-up question (CS 261 review) What is the primary difference between processes and threads from a developer s perspective?

CS377P Programming for Performance Multicore Performance Multithreading

Module 6: Process Synchronization

CS 470 Spring Mike Lam, Professor. Advanced OpenMP

COMP 3430 Robert Guderian

CSE 153 Design of Operating Systems

Transcription:

Threads Cannot Be Implemented As a Library Authored by Hans J. Boehm Presented by Sarah Sharp February 18, 2008

Outline POSIX Thread Library Operation Vocab Problems with pthreads

POSIX Thread Library Many languages were not originally specified with thread support. For C and C++, the POSIX Thread Library (pthreads) was created for thread support. The authors argue that issues with thread support lie almost exclusively with the compiler and the language specification itself, not with the thread library or its specification. Hence they cannot be addressed purely within the thread library.

What does the POSIX Thread Library guarantee? Formal definitions of the memory model were rejected as unreadable by the vast majority of programmers. No thread can read or modify a memory location while another thread may be modifying it. Functions to synchronize memory with respect to other threads. pthread mutex lock() pthread mutex unlock() These calls include memory barriers to prevent hardware reordering of memory operations around these calls The compiler must assume these functions can read or write any global variable, so memory references can t be moved across the call to this function.

Vocab Sequential Sequential Consistency Sequential Correctness if the program were executed in a sequential manner (i.e. one thread on one processor) with no other programs being executed, it would produce sequentially correct results. compiler or hardware transformations of multiprocess algorithms may preserve sequential correctness of individual processes but not preserve sequential consistency of the whole algorithm.

Compiler Optimizations Program 1 (x and y initialized to 0): Thread 1: if (x == 1) ++y; Thread 2: if (y == 1) ++x; Can we ever get x == 1 and y == 1?

Compiler Optimizations Program 1 (x and y initialized to 0): Thread 1: if (x == 1) ++y; Thread 2: if (y == 1) ++x; Can we ever get x == 1 and y == 1? What if the compiler does this? Thread 1: ++y; if (x!= 1) y; Thread 2: ++x; if (y!= 1) x; This is a sequentially correct result, but a multithreaded system running this code will not be sequentially consistent. Solution: The compiler must know that this program will run in a multithreaded environment. The library itself does not ensure the optimizer produces correct code.

Memory Operations on Adjacent Addresses Processors cannot write or read memory of arbitrary widths. A write of x.a = 42 in a structure like this struct { int a:17; // a is a bitfield, 17 bits wide int b:15; // b is a bitfield, 15 bits wide } x; may involve reading and writing more than just the variable a: int temp = x; temp &= ~0x1ffff; // mask off old a temp = 42; x = temp; // overwrite all of x What if we have one thread that updates a in structure x, and another thread that updates b in structure x?

Memory Operations on Adjacent Addresses The POSIX threads specification prohibits concurrent writes to the same memory location, so it endorses this behavior. May cause problems if different fields were supposed to be protected by different locks. The pthreads specification also allows this behavior for global variables adjacent to a structure. The linker may reorder global variables in memory, so no variable is safe! Solution: Make the language specify when adjacent data can be implicitly overwritten.

Register promotion Several compiler optimizations lead to this problem: Instead of reading a variable repeatedly inside a loop, store it in a register. The conditional branch on an If is rarely taken, so optimize for the case where it is not taken.

Register promotion The code in question (where x is a global variable): for (...) {... if (mt) pthread_mutex_lock(...); x =... x... if (mt) prhread_mutex_unlock(...);

The optimized code: Register promotion r = x; for (...) {... // might change r in here if (mt) { x = r; // update x with current value of the register pthread_mutex_lock(...); r = x; // update r in case something changed x } r =... r... if (mt) { x = r; prhread_mutex_unlock(...); r = x; }

Register promotion Why do this? But the pthread standard says that memory must be synchronized with the logical program state around the pthread mutex lock() and pthread mutex unlock() calls. If x is protected by a lock, we are reading and writing it outside the lock. Solution: Make sure the compiler is aware this could be a threaded program. It won t do register promotion around unknown function calls.

What about lockless algorithms? The only parallel programming technique the POSIX thread standard allows is mutual exclusion. This excludes lockless algorithms and algorithms that use atomic operations (but not locks). This is part of the reason pthreads (almost) work. However, this is very costly...

pthreads vs. lockless algorithms