Simon Peyton Jones (Microsoft Research) Tokyo Haskell Users Group April 2010


[Charts: language adoption over time, "Geeks" vs "Practitioners"; y-axis 1 to 1,000,000 users (log scale), x-axis 1yr to 15yr. Panel titles: "The quick death", "The slow death", "Threshold of immortality / The complete absence of death", "The committee language".]

[Chart: "The second life?" — Haskell usage 1990-2010, annotated with two blog quotes: "I'm already looking at coding problems and my mental perspective is now shifting back and forth between purely OO and more FP styled solutions" (blog, Mar 2007); "Learning Haskell is a great way of training yourself to think functionally so you are ready to take full advantage of C# 3.0 when it comes out" (blog, Apr 2007).]

[Charts: language popularity rankings from langpop.com, Oct 2008.]

Hackage: 1,976 packages from 533 developers; 256 new packages in Jan-Mar 2010; 11.5 uploads/day; 4k downloads/day.

Parallelism is a big opportunity for Haskell. The language is naturally parallel (the opposite of Java). Everyone is worried about how to program parallel machines.

Three ways to do parallelism in Haskell:

Explicit threads: non-deterministic by design; monadic (forkIO and STM). Today's focus.

    main :: IO ()
    main = do { ch <- newChan
              ; forkIO (ioManager ch)
              ; forkIO (worker 1 ch)
              ... etc... }

Semi-implicit: deterministic; pure (par and seq).

    f :: Int -> Int
    f x = a `par` b `seq` a + b
      where a = f (x-1)
            b = f (x-2)

Data parallel: deterministic; pure (parallel arrays).

Shared memory initially; distributed memory eventually; possibly even GPUs. General attitude: using some of the parallel processors you already have, relatively easily.

After 30 years of research, the most widely-used co-ordination mechanism for shared-memory task-level concurrency is... locks and condition variables (invented 30 years ago).

A 10-second review of locks:
Races: due to forgotten locks.
Deadlock: locks acquired in the wrong order.
Lost wakeups: a forgotten notify to a condition variable.
Diabolical error recovery: the need to restore invariants and release locks in exception handlers.
These are serious problems. But even worse...

Scalable double-ended queue: one lock per cell. No interference if the ends are far enough apart, but watch out when the queue is 0, 1, or 2 elements long!

Coding style                     Difficulty of concurrent queue
-----------                      ------------------------------
Sequential code                  Undergraduate
Locks and condition variables    Publishable result at international conference
Atomic blocks                    Undergraduate

atomic { ... sequential get code ... }

To a first approximation, just write the sequential code and wrap atomic around it.
All-or-nothing semantics: atomic commit.
The atomic block executes in isolation.
Cannot deadlock (there are no locks!).
Atomicity makes error recovery easy (e.g. an exception thrown inside the get code).
In short: ACID.

Optimistic concurrency: one possible implementation of atomic { ... <code> ... }:
Execute <code> without taking any locks.
Each read and write in <code> is logged to a thread-local transaction log.
Writes go to the log only, not to memory.
At the end, the transaction tries to commit the log to memory.
The commit may fail; the transaction is then re-run.
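
To make the scheme concrete, here is a deliberately tiny sketch of it in Haskell. This is emphatically not GHC's algorithm: variables hold only Int, a single global lock guards commit (GHC validates and commits with fine-grained per-TVar work), there is no retry or orElse, and every name (ToyVar, atomicToy, ...) is invented for illustration.

    import Control.Concurrent.MVar (MVar, withMVar)
    import Data.IORef
    import qualified Data.Map.Strict as Map

    -- A toy transactional variable: a unique id, plus a (version, value) cell.
    data ToyVar = ToyVar { tvId :: Int, tvRef :: IORef (Int, Int) }

    -- The thread-local log: for each variable touched, the variable itself,
    -- the version seen when first read, and the transaction's local value.
    type Log = Map.Map Int (ToyVar, Int, Int)

    -- Reads consult the log first; on a miss, read memory and record the version.
    readT :: IORef Log -> ToyVar -> IO Int
    readT logRef v = do
      lg <- readIORef logRef
      case Map.lookup (tvId v) lg of
        Just (_, _, cur) -> return cur
        Nothing -> do
          (ver, val) <- readIORef (tvRef v)
          writeIORef logRef (Map.insert (tvId v) (v, ver, val) lg)
          return val

    -- Writes go to the log only, never directly to memory.
    writeT :: IORef Log -> ToyVar -> Int -> IO ()
    writeT logRef v x = do
      _ <- readT logRef v       -- make sure a version is recorded for v
      modifyIORef' logRef (Map.adjust (\(tv, ver, _) -> (tv, ver, x)) (tvId v))

    -- Run the body lock-free, then validate and commit under one global lock.
    -- If any variable we read has since changed version, re-run from scratch.
    atomicToy :: MVar () -> (IORef Log -> IO a) -> IO a
    atomicToy lock body = loop
      where
        loop = do
          logRef  <- newIORef Map.empty
          result  <- body logRef
          entries <- Map.elems <$> readIORef logRef
          ok <- withMVar lock $ \() -> do
            valid <- and <$> mapM (\(tv, ver0, _) -> do
                       (ver, _) <- readIORef (tvRef tv)
                       return (ver == ver0)) entries
            if valid
              then do -- commit: write back every logged entry, bumping versions
                      -- (wasteful for read-only entries, but safe in a toy)
                      mapM_ (\(tv, ver0, new) ->
                              writeIORef (tvRef tv) (ver0 + 1, new)) entries
                      return True
              else return False
          if ok then return result else loop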

Realising STM in Haskell

main = do { putStr (reverse "yes")
          ; putStr "no" }

Effects are explicit in the type system:

    (reverse "yes") :: String   -- No effects
    (putStr "no")   :: IO ()    -- Can have effects

The main program is an effect-ful computation: main :: IO ()

    newRef   :: a -> IO (Ref a)
    readRef  :: Ref a -> IO a
    writeRef :: Ref a -> a -> IO ()

    main = do { r <- newRef 0
              ; incr r
              ; s <- readRef r
              ; print s }

    incr :: Ref Int -> IO ()
    incr r = do { v <- readRef r
                ; writeRef r (v+1) }

Reads and writes are 100% explicit! You can't say (r + 6), because r :: Ref Int.
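
In GHC this interface exists (with slightly different names) as Data.IORef; here is a runnable rendering of the slide's program:

    import Data.IORef

    incr :: IORef Int -> IO ()
    incr r = do { v <- readIORef r; writeIORef r (v+1) }

    main :: IO ()
    main = do { r <- newIORef 0
              ; incr r
              ; s <- readIORef r
              ; print s }      -- prints 1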

fork :: IO a -> IO ThreadId

fork spawns a thread; it takes an action as its argument.

    main = do { r <- newRef 0
              ; fork (incr r)
              ; incr r
              ; ... }

    incr :: Ref Int -> IO ()
    incr r = do { v <- readRef r
                ; writeRef r (v+1) }

A race!
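
The race is easy to reproduce with GHC's forkIO and Data.IORef. A runnable sketch (the thread counts are arbitrary; compile with -threaded and run with +RTS -N to see lost updates reliably):

    import Control.Concurrent
    import Control.Monad (forM_, replicateM_)
    import Data.IORef

    -- Unsynchronised read-modify-write: two threads can read the same value.
    incr :: IORef Int -> IO ()
    incr r = do { v <- readIORef r; writeIORef r (v+1) }

    main :: IO ()
    main = do
      r    <- newIORef 0
      done <- newEmptyMVar
      forM_ [1 .. 100 :: Int] $ \_ -> forkIO $ do
        replicateM_ 10000 (incr r)
        putMVar done ()
      replicateM_ 100 (takeMVar done)   -- wait for all 100 threads
      readIORef r >>= print             -- often well below 1000000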

atomic :: IO a -> IO a

    main = do { r <- newRef 0
              ; fork (atomic (incr r))
              ; atomic (incr r)
              ; ... }

atomic is a function, not a syntactic construct. A worry: what stops you doing incr outside atomic?

Better idea:

    atomic    :: STM a -> IO a
    newTVar   :: a -> STM (TVar a)
    readTVar  :: TVar a -> STM a
    writeTVar :: TVar a -> a -> STM ()

    incT :: TVar Int -> STM ()
    incT r = do { v <- readTVar r; writeTVar r (v+1) }

    main = do { r <- atomic (newTVar 0)
              ; fork (atomic (incT r))
              ; atomic (incT r)
              ; ... }
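
In GHC the atomic function is actually called atomically, and the whole interface lives in Control.Concurrent.STM (the stm package). A runnable counterpart of the slide (the threadDelay is crude invented scaffolding, just to let the forked thread finish):

    import Control.Concurrent (forkIO, threadDelay)
    import Control.Concurrent.STM

    incT :: TVar Int -> STM ()
    incT r = do { v <- readTVar r; writeTVar r (v+1) }

    main :: IO ()
    main = do
      r <- atomically (newTVar 0)
      _ <- forkIO (atomically (incT r))
      atomically (incT r)
      threadDelay 100000                  -- let the forked thread run
      atomically (readTVar r) >>= print   -- always 2: no lost updates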

Notice that:
You can't fiddle with TVars outside an atomic block [good].
You can't do IO inside an atomic block [sad, but also good].
No changes to the compiler (whatsoever); only the runtime system and primops.
...and, best of all...

    atomic    :: STM a -> IO a
    newTVar   :: a -> STM (TVar a)
    readTVar  :: TVar a -> STM a
    writeTVar :: TVar a -> a -> STM ()

    incT :: TVar Int -> STM ()
    incT r = do { v <- readTVar r; writeTVar r (v+1) }

    incT2 :: TVar Int -> STM ()
    incT2 r = do { incT r; incT r }

    foo :: IO ()
    foo = ...atomic (incT2 r)...

Composition is THE way we build big programs that work. An STM computation is always executed atomically (e.g. incT2); the type tells you. Simply glue STMs together arbitrarily, then wrap the result with atomic. No nested atomic. (What would it mean?)

The STM monad supports exceptions:

    throw :: Exception -> STM a
    catch :: STM a -> (Exception -> STM a) -> STM a

In the call (atomic s), if s throws an exception, the transaction is aborted with no effect, and the exception is propagated into the IO monad. No need to restore invariants or release locks! See the paper for the question of the exception value itself.
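
GHC's stm library exposes these as throwSTM and catchSTM. A runnable sketch (the account and the use of the Underflow exception are invented for the example) showing that an exception discards the transaction's tentative writes:

    import Control.Concurrent.STM
    import Control.Exception (ArithException (Underflow), SomeException, try)

    -- Tentatively write the new balance, then throw if it went negative.
    withdrawOrThrow :: TVar Int -> Int -> STM ()
    withdrawOrThrow acc n = do
      bal <- readTVar acc
      writeTVar acc (bal - n)            -- lives only in the transaction log
      if bal < n then throwSTM Underflow else return ()

    main :: IO ()
    main = do
      acc <- newTVarIO 10
      r <- try (atomically (withdrawOrThrow acc 50))
             :: IO (Either SomeException ())
      print r                              -- Left arithmetic underflow
      atomically (readTVar acc) >>= print  -- 10: the write was rolled back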

Three new ideas: retry, orElse, always.

    retry :: STM ()

    withdraw :: TVar Int -> Int -> STM ()
    withdraw acc n
      = do { bal <- readTVar acc
           ; if bal < n then retry else return ()
           ; writeTVar acc (bal-n) }

retry means "abort the current transaction and re-execute it from the beginning". The implementation avoids a busy wait by using the reads in the transaction log (i.e. acc) to wait simultaneously on all the variables the transaction has read.

No condition variables! The retrying thread is woken up automatically when acc is written, so there are no lost wake-ups. And there is no danger of forgetting to test everything again when woken up, because the transaction runs again from the beginning. e.g.

    atomic (do { withdraw a1 3
               ; withdraw a2 7 })

This works because retry can appear anywhere inside an atomic block, including nested deep within a call. e.g.

    atomic (do { withdraw a1 3
               ; withdraw a2 7 })

waits for a1>3 AND a2>7, without changing withdraw. Contrast:

    atomic (a1 > 3 && a2 > 7) { ...stuff... }

which breaks the abstraction inside ...stuff...
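
A runnable version of this blocking behaviour, using the stm package (atomically for the talk's atomic; the depositing thread and the delay are invented scaffolding):

    import Control.Concurrent
    import Control.Concurrent.STM

    withdraw :: TVar Int -> Int -> STM ()
    withdraw acc n = do
      bal <- readTVar acc
      if bal < n then retry else writeTVar acc (bal - n)

    main :: IO ()
    main = do
      a1 <- newTVarIO 0
      a2 <- newTVarIO 10
      _  <- forkIO $ do
              threadDelay 100000
              atomically (modifyTVar' a1 (+ 5))   -- a deposit arrives later
      -- Blocks until BOTH withdrawals can succeed, then commits them together.
      atomically (do { withdraw a1 3; withdraw a2 7 })
      bals <- atomically ((,) <$> readTVar a1 <*> readTVar a2)
      print bals   -- (2,3)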

    orElse :: STM a -> STM a -> STM a

    atomic (do { withdraw a1 3 `orElse` withdraw a2 3
               ; deposit b 3 })

Try this (withdraw a1 3); if it retries, try this instead (withdraw a2 3); and then do this (deposit b 3).

    transfer :: TVar Int -> TVar Int -> TVar Int -> STM ()
    transfer a1 a2 b
      = do { withdraw a1 3 `orElse` withdraw a2 3
           ; deposit b 3 }

transfer has an orElse inside it, but calls to transfer can still be composed with orElse:

    atomic (transfer a1 a2 b `orElse` transfer a3 a4 b)
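
And a runnable composition sketch (deposit is defined here with modifyTVar'; the slides leave it implicit):

    import Control.Concurrent.STM

    deposit :: TVar Int -> Int -> STM ()
    deposit acc n = modifyTVar' acc (+ n)

    withdraw :: TVar Int -> Int -> STM ()
    withdraw acc n = do
      bal <- readTVar acc
      if bal < n then retry else writeTVar acc (bal - n)

    transfer :: TVar Int -> TVar Int -> TVar Int -> STM ()
    transfer a1 a2 b = do
      withdraw a1 3 `orElse` withdraw a2 3
      deposit b 3

    main :: IO ()
    main = do
      a1 <- newTVarIO 0                 -- empty: the first branch will retry
      a2 <- newTVarIO 5
      b  <- newTVarIO 0
      atomically (transfer a1 a2 b)     -- falls through to withdraw a2 3
      rs <- mapM (atomically . readTVar) [a1, a2, b]
      print rs   -- [0,2,3]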

A transaction is a value of type (STM t). Transactions are first-class values: build a big transaction by composing little transactions, in sequence, using choice, inside procedures... Finally, seal up the transaction with atomic :: STM a -> IO a. No nested atomic! (But orElse is like a nested transaction.) No concurrency within a transaction!

Nice equations:
orElse is associative (but not commutative)
retry `orElse` s  =  s
s `orElse` retry  =  s
(STM is an instance of MonadPlus.)

The route to sanity is to establish invariants that are assumed on entry, and guaranteed on exit, by every atomic block. We want to check these guarantees, but we don't want to test every invariant after every atomic block. Hmm... only test when something read by the invariant has changed... rather like retry!

    always :: STM Bool -> STM ()

    newAccount :: STM (TVar Int)
    newAccount = do { v <- newTVar 0
                    ; always (do { cts <- readTVar v
                                 ; return (cts >= 0) })
                    ; return v }

The argument to always is an arbitrary boolean-valued STM computation. Any transaction that modifies the account will check the invariant (no forgotten checks).

always adds a new invariant to a global pool of invariants. Conceptually, every invariant is checked after every transaction, but the implementation checks only those invariants that read TVars that have been written by the transaction... and garbage-collects invariants that are checking dead TVars.
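
always was implemented in GHC's runtime at the time of this talk, but it is not part of today's stm package (the invariant mechanism was later removed from GHC). A user-level approximation, with invented names (Checked, atomicallyChecked), checking just one invariant explicitly at commit time:

    import Control.Concurrent.STM
    import Control.Monad (unless)

    -- A variable bundled with its invariant.
    data Checked a = Checked { cVar :: TVar a, cInv :: STM Bool }

    newAccount :: IO (Checked Int)
    newAccount = do
      v <- newTVarIO 0
      return (Checked v (fmap (>= 0) (readTVar v)))

    -- Run a transaction against the checked variable, re-verifying the
    -- invariant before committing; a violation aborts with no effects.
    -- (The real 'always' did this in the runtime, for a whole pool of
    -- invariants, re-checking only those whose TVars were written.)
    atomicallyChecked :: Checked a -> STM b -> IO b
    atomicallyChecked c act = atomically $ do
      r  <- act
      ok <- cInv c
      unless ok (throwSTM (userError "invariant violated"))
      return r

    main :: IO ()
    main = do
      acc <- newAccount
      atomicallyChecked acc (modifyTVar' (cVar acc) (+ 10))        -- fine
      atomicallyChecked acc (modifyTVar' (cVar acc) (subtract 50)) -- throws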

Everything so far is intuitive and arm-wavey. But what happens if it's raining, and you are inside an orElse, and you throw an exception that contains a value that mentions...? We need a precise specification!

We have one: the paper gives a precise operational semantics. (Recall that with locks there is no way to wait for complex conditions; retry gives us exactly that.)

Atomic blocks (atomic, retry, orElse) are a real step forward. It's like using a high-level language instead of assembly code: whole classes of low-level errors are eliminated. They are not a silver bullet: you can still write buggy programs; concurrent programs are still harder to write than sequential ones; and the approach is aimed at shared memory. But the improvement is very substantial.