This is a talk given by me to the Northwest C++ Users Group on May 19, 2010.


I like this picture because it clearly shows what is private and what is shared. In the message-passing system, threads operate in their separate spaces except for the message queues (or channels) that are shared between them. This contract may be broken if the messages themselves contain pointers or references. Passing pointers is okay in some cases:
1. When using move semantics, for instance, passing unique_ptr
2. When passing pointers to monitors: objects that provide their own locking on every entry
3. When the sharing is intentional and well understood
Notice that you can't pass references in a distributed setting, e.g., between network nodes. This is the main difference between inter-thread message passing and inter-process or over-the-network message passing. I don't like the idea of a universal message queue, as provided by MPI or Boost.
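The first case can be illustrated with a minimal sketch of a unique_ptr travelling through a typed queue. TypedQueue and SenderGaveUpOwnership are hypothetical names invented for this illustration and are not part of the mini-library; the standard names are the C++11 spellings of what the talk still calls C++0x.

```cpp
#include <memory>
#include <mutex>
#include <queue>
#include <string>
#include <utility>

// Hypothetical queue for a single message type (illustration only).
template <class T>
class TypedQueue {
public:
    void Send(T msg) {
        std::lock_guard<std::mutex> lock(_mtx);
        _q.push(std::move(msg));
    }
    // Non-blocking: returns false and leaves `out` alone if the queue is empty.
    bool TryReceive(T& out) {
        std::lock_guard<std::mutex> lock(_mtx);
        if (_q.empty()) return false;
        out = std::move(_q.front());
        _q.pop();
        return true;
    }
private:
    std::mutex _mtx;
    std::queue<T> _q;
};

// Sends a heap-allocated string and reports whether the sender lost access.
inline bool SenderGaveUpOwnership(TypedQueue<std::unique_ptr<std::string>>& q) {
    std::unique_ptr<std::string> msg(new std::string("hello"));
    q.Send(std::move(msg));   // ownership moves into the queue
    return msg == nullptr;    // a moved-from unique_ptr is guaranteed to be null
}
```

Because the sender's pointer is null after the send, the pointee cannot be touched from two threads at once, even though only a pointer crossed the queue.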

John Reppy's Ph.D. thesis introduced a very elegant and flexible message-passing system to Caml. I took a lot of ideas from him when designing my C++ message-passing mini-library.

This is a true atom of message passing. Very simple yet universal. I must be able to implement an MVar in terms of my C++ primitives to prove that they are as expressive.

Here are some of the things you can build from MVars. I'm not suggesting that we should follow these implementations in C++ (far from that!). But it shows you the expressive power of MVars.

The two primitives introduced by John Reppy are channel and event: roughly, storage and synchronization. The important part is that channel operations are asynchronous but they return events, which provide synchronization. The send event may be ignored if you want asynchronous communication or, as in this example, sync'd if you want communication to be synchronous (rendezvous-based). When you call receive, you are only expressing the intention of receiving a message. You have to actually sync on the resulting event to retrieve it (similar to waiting for a future). The sync may block waiting for the message.

The first composability challenge is: Can I wait for more than one channel at a time, provided they all return the same type of message? ([e1, e2] is a two-element list in Caml.)

The second challenge is: Can I wait on multiple channels that return different types of messages? And can I do it in a type-safe manner? Caml lets you pair events with handler functions using wrap. The syntax "fun ls -> length ls" introduces a lambda (an anonymous function literal). Here we have a function that takes a list and returns its length, and an identity function for the integer channel. In general, handlers may be elaborate functions or closures, so you have the full expressive power of Caml at your disposal to process messages. This mechanism is reminiscent of continuations; Scala, for instance, plays with continuations in its message-passing scheme.

My primitives must be simple but not too simple. Composability is my main goal, but it's also important not to impose any restrictions on message types. Many MP libraries require messages to be shallow values (no references) or void pointers with a count (type safety out of the window!). Granted, passing arbitrary objects may introduce inadvertent sharing and data races, so the programmer should be careful. But this is C++ after all! Keep in mind that there are many non-trivial message types that may be passed safely. They include messages with move semantics (unique_ptrs in particular) and monitors, which provide their own locking (like Java objects with synchronized methods).

There is a misguided idea around that a good message queue must work the same between threads and between processes or even machines (it's called "location transparency"). This immediately prevents the programmer from passing references, and that not only kills performance but also makes it impossible to have first-class channel objects. I should be able to pass a channel through a channel to go beyond Hoare's CSP (Communicating Sequential Processes). Notice that it's possible to build a location-transparent message queue on top of a first-class channel, but not vice versa.

I'm introducing the primitives provided by the C++0x Standard Library. In general, you shouldn't use Mutex directly; it's supposed to be wrapped in a scoped lock object (next slide). Otherwise you'll have problems with exception safety if you're not careful.

You create these scoped locks on the stack and they guarantee that the underlying mutex will be released no matter what, even in the face of exceptions.
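A minimal sketch of that guarantee, using std::lock_guard as the scoped lock. The globals and function names are hypothetical and exist only for this example.

```cpp
#include <mutex>
#include <stdexcept>

std::mutex g_mtx;     // hypothetical mutex guarding g_counter
int g_counter = 0;    // hypothetical shared state

// Increments the counter under the lock, or throws while holding it.
void IncrementOrThrow(bool fail) {
    std::lock_guard<std::mutex> lock(g_mtx);   // mutex acquired here
    if (fail)
        throw std::runtime_error("failure while holding the lock");
    ++g_counter;
}   // mutex released here, on normal return and during stack unwinding

// If the failing call leaked the lock, the second call would try to lock a
// mutex this thread already owns (undefined behavior, typically a deadlock).
int CounterAfterFailureThenSuccess() {
    try {
        IncrementOrThrow(true);
    } catch (const std::runtime_error&) {
        // the exception has already run the scoped lock's destructor
    }
    IncrementOrThrow(false);
    return g_counter;
}
```

With a bare lock()/unlock() pair instead of the scoped lock, the throw would skip the unlock and the mutex would stay held.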

I use this diagram to explain the problem of missed notifications and why the condition variable must know about locking. The setting and the checking of the condition must be performed under a lock because it involves shared state. (The re-testing is necessary because of the ever-present possibility of spurious wakeups.)
1. You can't hold the lock while blocking because no client would be able to set the condition (they have to take the same lock!) and you'd wait forever.
2. But if you release the lock before calling Wait, you'll leave open a gap between unlock and Wait. A client in another thread might set the condition and call Notify during this gap, and you will miss the notification. (This is unlike Windows auto-reset events, which have state: once notified they stay signaled, and a subsequent Wait will not block.)
The solution is to pass the lock to Wait, which is supposed to atomically unlock it and enter the wait state. On wakeup, Wait re-acquires the lock before returning.

Here you see Wait taking a lock as an argument, as I explained before. This also ensures that the client holds the lock before and after the wait (if this is a scoped lock). It's trivial to implement condition_variable on Vista and Win7: there is a special API. On earlier versions of Windows, the implementation is possible but much trickier (search the Internet if you're interested). POSIX has had condition variables since time immemorial. The difference between notify_one and notify_all is how many threads are awakened if there's more than one waiter.
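The protocol can be sketched with the standard condition variable as follows. Slot, Produce, Consume and RunOnce are hypothetical names; only std::mutex, std::condition_variable, std::unique_lock and std::thread come from the library.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

struct Slot {
    std::mutex mtx;
    std::condition_variable cond;
    bool ready = false;   // the condition itself; shared, so guarded by mtx
    int value = 0;
};

void Produce(Slot& s, int v) {
    {
        std::lock_guard<std::mutex> lock(s.mtx);
        s.value = v;
        s.ready = true;          // set the condition under the lock
    }
    s.cond.notify_one();         // wake one waiter, if there is one
}

int Consume(Slot& s) {
    std::unique_lock<std::mutex> lock(s.mtx);
    while (!s.ready)             // re-test: wakeups may be spurious
        s.cond.wait(lock);       // unlocks and blocks atomically; relocks before returning
    s.ready = false;
    return s.value;
}

// Runs the producer on a second thread and returns what the consumer saw.
int RunOnce() {
    Slot s;
    std::thread producer(Produce, std::ref(s), 42);
    int v = Consume(s);
    producer.join();
    return v;
}
```

Because the flag is tested under the same lock that wait() releases, it does not matter whether the producer runs before or after the consumer reaches the wait: there is no gap in which the notification can be lost.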

This is my proposal. It's heavily influenced by, but not directly mappable to, John Reppy's Caml system. All locking and signaling is done using Events. The Channel takes care of message storage. Channel is not a class or an interface; it's just a concept. I list the steps required when sending or receiving a message. Compare these steps with the code on the next slide.

This is the proof of concept: implementing an MVar using my primitives. Notice that the storage part, or the channel, is just one member variable, _msg. There are two more primitives, SndLock and RcvLock, that together with Event form the basis of my mini-library. The notification step is hidden inside the destructor of SndLock, which is cute but hard to make exception safe. I will probably need to provide a separate Notify method on SndLock or just require the client to call Event::Notify directly. Here the code under SndLock is exception safe, so it doesn't matter. The details of checking the condition and looping are hidden inside Event::Wait (see next slide).

Besides a mutex and a condition variable, I need the condition itself. It is provided by a Boolean variable _isnotified.

SndLock is a scoped lock. In addition, it does the notification in its destructor. As I explained earlier, this is not exception safe, so in the future I'll probably remove it. RcvLock is just a thin wrapper over a scoped lock, which takes an Event rather than a Mutex in its constructor.
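The slide code is not reproduced in these notes, so the following is only a sketch of how Event, SndLock, RcvLock and an MVar could fit together under stated assumptions: the lock()/unlock() method names, the Put/Take names, the _isNotified spelling and the use of std::lock_guard for RcvLock are all guesses made for illustration. Unlike a full MVar, Put here does not block when the slot is already full.

```cpp
#include <condition_variable>
#include <mutex>
#include <string>
#include <utility>

class Event {
public:
    void lock()   { _mtx.lock(); }
    void unlock() { _mtx.unlock(); }

    // Precondition: the calling thread holds the lock.
    void Wait() {
        std::unique_lock<std::mutex> lk(_mtx, std::adopt_lock);
        while (!_isNotified)      // loop guards against spurious wakeups
            _cond.wait(lk);
        _isNotified = false;      // consume the notification
        lk.release();             // the caller's scoped lock keeps ownership
    }

    // Precondition: the calling thread holds the lock.
    void Notify() {
        _isNotified = true;       // the state survives even if nobody waits yet
        _cond.notify_one();
    }

private:
    std::mutex _mtx;
    std::condition_variable _cond;
    bool _isNotified = false;
};

// Assumed shape: a plain scoped lock constructed from an Event.
using RcvLock = std::lock_guard<Event>;

// Scoped lock that also notifies when it goes out of scope.
class SndLock {
public:
    explicit SndLock(Event& e) : _event(e) { _event.lock(); }
    ~SndLock() { _event.Notify(); _event.unlock(); }   // also runs during unwinding
    SndLock(const SndLock&) = delete;
    SndLock& operator=(const SndLock&) = delete;
private:
    Event& _event;
};

template <class T>
class MVar {
public:
    void Put(T msg) {
        SndLock lock(_event);
        _msg = std::move(msg);    // the "channel" is this single member
    }                             // ~SndLock: notify, then unlock
    T Take() {
        RcvLock lock(_event);
        _event.Wait();            // blocks until a Put has happened
        return std::move(_msg);
    }
private:
    Event _event;
    T _msg{};
};
```

The destructor comment marks the exception-safety concern: if the code under a SndLock threw halfway through storing a message, the destructor would still notify the receiver. Because the notification is recorded in a flag, a Put that completes before Take is called is not lost.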

Here's an example of a ready-made channel template. A lot of times all you need is to pass a single message by value. ValueChannel is your friend. The importance of the _valid flag and the method IsEmpty will become obvious when I talk about composability. In a nutshell, if you wait for multiple channels, you have to have a way to tell which channel has new data.
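A sketch of what such a channel might look like. The names ValueChannel, _valid and IsEmpty come from the talk; the Send and Receive method names, and the decision to leave all locking to the caller, are assumptions.

```cpp
#include <string>

// One-slot, pass-by-value channel. It does no locking of its own: every
// method is meant to be called while holding the lock of the associated Event.
template <class T>
class ValueChannel {
public:
    ValueChannel() : _value(), _valid(false) {}
    void Send(const T& v) { _value = v; _valid = true; }
    bool IsEmpty() const { return !_valid; }
    // Precondition: !IsEmpty().
    T Receive() { _valid = false; return _value; }
private:
    T _value;
    bool _valid;   // true while an unread message sits in _value
};

// Usage: of two channels, only the one that was written to reports data.
inline bool OnlyFirstHasData() {
    ValueChannel<std::string> a, b;
    a.Send("hi");
    return !a.IsEmpty() && b.IsEmpty();
}
```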

This is the first composability test: composing monomorphic channels. Very easy: just use the same Event for multiple channels. I use an ad-hoc communication object that is passed by reference to another thread to establish the connection.

This is the code in the client thread. Notice how easy it is to create a thread in C++0x. All you need is a thread function (next slide) and, optionally, an argument to the thread function. Type safety is guaranteed (type mismatch is detected if the function parameter type doesn't match the argument type). Actually you can have a thread function of several arguments, and pass them all to the thread constructor. That's the power of the new variadic templates. I could have passed the event and the two channels as three separate arguments but, unfortunately, Visual Studio 2010 doesn't support variadic templates. (Nor does it have thread support, which I had to implement myself.) The main thread sends two messages on two channels but it uses the same event to create both SndLocks. This allows the receiver to block on just one event.

Here's the thread function. It takes a pointer to the communication object, as I mentioned before. It goes into a loop, takes the lock and waits on the event (which, as you remember, is shared between the two channels). When it wakes up it checks the channels and retrieves the string messages, if available. Here you can see the IsEmpty method in action.
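The two-channels-one-event arrangement might be sketched like this. The compact Event, SndLock, RcvLock and ValueChannel below are stand-ins whose shapes are assumptions, and Comm, Receiver and RunDemo are hypothetical names; the receiver stops after two messages only so that the example terminates.

```cpp
#include <condition_variable>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

class Event {   // mutex + condition variable + Boolean condition
public:
    void lock()   { _mtx.lock(); }
    void unlock() { _mtx.unlock(); }
    void Wait() {                                    // caller holds the lock
        std::unique_lock<std::mutex> lk(_mtx, std::adopt_lock);
        while (!_isNotified) _cond.wait(lk);
        _isNotified = false;
        lk.release();                                // caller keeps the lock
    }
    void Notify() { _isNotified = true; _cond.notify_one(); }  // caller holds the lock
private:
    std::mutex _mtx;
    std::condition_variable _cond;
    bool _isNotified = false;
};
using RcvLock = std::lock_guard<Event>;
struct SndLock {                                     // notifies on destruction
    explicit SndLock(Event& e) : _event(e) { _event.lock(); }
    ~SndLock() { _event.Notify(); _event.unlock(); }
    SndLock(const SndLock&) = delete;
    SndLock& operator=(const SndLock&) = delete;
    Event& _event;
};

template <class T>
class ValueChannel {                                 // one slot, no locking of its own
public:
    void Send(T v) { _value = std::move(v); _valid = true; }
    bool IsEmpty() const { return !_valid; }
    T Receive() { _valid = false; return std::move(_value); }
private:
    T _value{};
    bool _valid = false;
};

struct Comm {                                        // ad-hoc communication object
    Event event;                                     // shared by both channels
    ValueChannel<std::string> ch1, ch2;
    std::vector<std::string> received;               // touched only by the receiver until join
};

void Receiver(Comm* comm) {
    while (comm->received.size() < 2) {
        RcvLock lock(comm->event);
        comm->event.Wait();                          // one wait covers both channels
        if (!comm->ch1.IsEmpty()) comm->received.push_back(comm->ch1.Receive());
        if (!comm->ch2.IsEmpty()) comm->received.push_back(comm->ch2.Receive());
    }
}

std::vector<std::string> RunDemo() {
    Comm comm;
    std::thread t(Receiver, &comm);                  // function plus its argument
    { SndLock lock(comm.event); comm.ch1.Send("one"); }
    { SndLock lock(comm.event); comm.ch2.Send("two"); }
    t.join();
    return comm.received;
}
```

Whether the receiver wakes once and finds both messages or wakes twice and finds one each time depends on scheduling; either way the IsEmpty checks tell it which channels to drain.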

This is the second test of composability: combining polymorphic channels. The solution is taken directly from Caml: wrap together a channel with its handler in a type-safe manner. I wrote a helper template function CreateValueWrapper that takes a channel (not an event, like in Caml) and a handler function. Here the function is a C++0x lambda. To make it even more exciting, this lambda is really a closure: it captures the local variable, count, by reference. Of course, you can accomplish the same by creating a function object (an object with an overloaded operator()).

I chose to make all my wrapped channels implement the same interface, WrappedChannel. This is because I want to use my library class MultiChannel to store and process these wrappers. The methods Peek and Receive must be called under a RcvLock. A particular wrapper might decide to run the handler under the lock inside Receive, if it doesn't take too long. If the handler is more elaborate or may block, it would be unwise to hold the lock while executing it. In such a case you implement the wrapper to simply copy the message into local storage inside Receive, and call the handler on that copy in Execute. The rule is that, if you call the handler in Receive, you should return a null. Otherwise return the this pointer and the client will call its Execute method outside of the lock. I'll show you how this is done when discussing the MultiChannel class next.
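A sketch of the wrapping idea, with locking left out so that the type-safe pairing of channel and handler stands alone. WrappedChannel, CreateValueWrapper, Peek, Receive and Execute are names from the talk, but their signatures here are assumptions (in particular, Peek is assumed to report whether a message is waiting); ValueWrapper and the demo function are hypothetical.

```cpp
#include <memory>
#include <string>
#include <utility>

template <class T>
class ValueChannel {                          // one slot, no locking of its own
public:
    void Send(T v) { _value = std::move(v); _valid = true; }
    bool IsEmpty() const { return !_valid; }
    T Receive() { _valid = false; return std::move(_value); }
private:
    T _value{};
    bool _valid = false;
};

class WrappedChannel {
public:
    virtual ~WrappedChannel() {}
    virtual bool Peek() const = 0;            // under the lock: is a message waiting?
    virtual WrappedChannel* Receive() = 0;    // under the lock: null if the handler already ran
    virtual void Execute() = 0;               // outside the lock
};

// Deferred-style wrapper: Receive copies the message, Execute runs the handler.
template <class T, class F>
class ValueWrapper : public WrappedChannel {
public:
    ValueWrapper(ValueChannel<T>& chan, F handler)
        : _chan(chan), _handler(std::move(handler)) {}
    bool Peek() const override { return !_chan.IsEmpty(); }
    WrappedChannel* Receive() override {
        _copy = _chan.Receive();              // local copy, safe to use after unlocking
        return this;                          // ask the caller to Execute later
    }
    void Execute() override { _handler(_copy); }
private:
    ValueChannel<T>& _chan;
    F _handler;
    T _copy{};
};

// The handler's parameter type is checked against T when Execute is compiled.
template <class T, class F>
std::unique_ptr<WrappedChannel> CreateValueWrapper(ValueChannel<T>& chan, F handler) {
    return std::unique_ptr<WrappedChannel>(
        new ValueWrapper<T, F>(chan, std::move(handler)));
}

// Two channels of different types feed closures that share `count`.
inline int CountAfterTwoKindsOfMessages() {
    ValueChannel<std::string> strChan;
    ValueChannel<int> intChan;
    int count = 0;
    auto w1 = CreateValueWrapper(strChan,
        [&count](const std::string& s) { count += static_cast<int>(s.size()); });
    auto w2 = CreateValueWrapper(intChan, [&count](int n) { count += n; });
    strChan.Send("abc");
    intChan.Send(4);
    WrappedChannel* all[] = { w1.get(), w2.get() };
    for (WrappedChannel* w : all)
        if (w->Peek())
            if (WrappedChannel* exec = w->Receive())
                exec->Execute();
    return count;                             // 3 (string length) + 4
}
```

Once wrapped, the string channel and the int channel have the same static type, which is what lets a single container hold both.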

This is how you use the MultiChannel. You create the wrappers (see previous slide) and register them with the multi-channel. The method Receive does all the waiting, receiving, and processing for you. See the next slide.

Here you can see how WrappedChannels are used. Under the lock, I call Receive on each wrapper. Some of the wrappers return null: they have already called their handlers with the message. Others return the execs, which I store in a vector for later perusal. After the lock is released, I iterate over the execs and call the Execute method, which executes the handlers without holding the lock. This is safe because, in that case, Receive made a copy of the message, which can now be accessed without locking. This solution offers great flexibility.
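The two-phase loop could be sketched as below. This is a reconstruction from the description, not the slide code: the Register method, the Event shape, and the IntWrapper test double (which can behave either as an inline or as a deferred wrapper) are all assumptions. The demo sends from the same thread before receiving, which works only because the Event remembers its notification.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

class Event {
public:
    void lock()   { _mtx.lock(); }
    void unlock() { _mtx.unlock(); }
    void Wait() {                                    // caller holds the lock
        std::unique_lock<std::mutex> lk(_mtx, std::adopt_lock);
        while (!_isNotified) _cond.wait(lk);
        _isNotified = false;
        lk.release();                                // caller keeps the lock
    }
    void Notify() { _isNotified = true; _cond.notify_one(); }  // caller holds the lock
private:
    std::mutex _mtx;
    std::condition_variable _cond;
    bool _isNotified = false;
};
using RcvLock = std::lock_guard<Event>;

class WrappedChannel {
public:
    virtual ~WrappedChannel() {}
    virtual bool Peek() const = 0;                   // under the lock
    virtual WrappedChannel* Receive() = 0;           // under the lock
    virtual void Execute() = 0;                      // outside the lock
};

// Hypothetical one-slot int wrapper; `deferred` picks where the handler runs.
class IntWrapper : public WrappedChannel {
public:
    IntWrapper(bool deferred, std::function<void(int)> h)
        : _deferred(deferred), _handler(std::move(h)) {}
    void Send(int v) { _value = v; _valid = true; }  // call under the lock
    bool Peek() const override { return _valid; }
    WrappedChannel* Receive() override {
        _valid = false;
        if (_deferred) { _copy = _value; return this; }
        _handler(_value);                            // short handler: run it now
        return nullptr;
    }
    void Execute() override { _handler(_copy); }
private:
    bool _deferred;
    std::function<void(int)> _handler;
    int _value = 0, _copy = 0;
    bool _valid = false;
};

class MultiChannel {
public:
    explicit MultiChannel(Event& e) : _event(e) {}
    void Register(WrappedChannel* w) { _wrappers.push_back(w); }
    void Receive() {
        std::vector<WrappedChannel*> execs;
        {
            RcvLock lock(_event);
            _event.Wait();
            for (WrappedChannel* w : _wrappers)
                if (w->Peek())
                    if (WrappedChannel* exec = w->Receive())
                        execs.push_back(exec);       // handler postponed
        }                                            // lock released here
        for (WrappedChannel* exec : execs)
            exec->Execute();                         // runs without the lock
    }
private:
    Event& _event;
    std::vector<WrappedChannel*> _wrappers;
};

std::vector<int> RunDemo() {
    Event event;
    std::vector<int> log;
    auto record = [&log](int v) { log.push_back(v); };
    IntWrapper deferred(true, record), inlined(false, record);
    MultiChannel multi(event);
    multi.Register(&deferred);
    multi.Register(&inlined);
    {
        std::lock_guard<Event> lock(event);          // sender side
        deferred.Send(1);
        inlined.Send(2);
        event.Notify();
    }
    multi.Receive();
    return log;                                      // inline handler ran first: {2, 1}
}
```

Although the deferred wrapper is registered first, its handler runs last, because its work is postponed until the lock has been released.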

What's worth stressing is that the implementation of the storage part, the channel, is left wide open. Sure, helper classes and functions will cover most of the usage but, in a pinch, the programmer is free to implement their own solutions. There's no more one-queue-fits-all uniformity, and that's important if you care about performance. In particular, you can get a lot of mileage by using move semantics, which is a perfect way of passing large messages between threads. Finally, you can easily pass a pair of references to an event and a channel through a channel to create very dynamic communication systems. Have I fulfilled all the requirements I put forward for my mini-library? I think so.