Concurrency on x86-64; Threads Programming Tips

Similar documents
Synchronization: Advanced

Carnegie Mellon. Bryant and O Hallaron, Computer Systems: A Programmer s Perspective, Third Edition

Synchronization: Basics

Synchronization: Basics

Concurrent Programming

CS 5523 Operating Systems: Midterm II - reivew Instructor: Dr. Tongping Liu Department Computer Science The University of Texas at San Antonio

Carnegie Mellon Concurrency and Synchronization

Shared Variables in Threaded C Programs. Systemprogrammering Föreläsning 11 Programming with Threads

Programming with Threads Dec 7, 2009"

for (i = 0; i < 2; i++)

! What is the memory model for threads? ! How are variables mapped to memory instances? ! How many threads reference each of these instances?

Carnegie Mellon Synchronization: Advanced

Today. Threads review Sharing Mutual exclusion Semaphores

Synchronization April 29, 2008

CS3733: Operating Systems

Pthreads (2) Dong-kun Shin Embedded Software Laboratory Sungkyunkwan University Embedded Software Lab.

Pthreads. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Carnegie Mellon. Synchroniza+on : Introduc+on to Computer Systems Recita+on 14: November 25, Pra+k Shah (pcshah) Sec+on C

CS 3723 Operating Systems: Final Review

Lecture #24 Synchronization

CS 5523 Operating Systems: Thread and Implementation

CS 5523: Operating Systems

Proxy: Concurrency & Caches

CS Operating Systems: Threads (SGG 4)

Concurrent Programming

Computer Systems Laboratory Sungkyunkwan University

Example. Example. Threading Language and Support. CS528 Multithreading: Programming with Threads. The boss/worker model. Thread Programming models

CS 3305 Intro to Threads. Lecture 6

Total Score is updated. The score of PA4 will be changed! Please check it. Will be changed

CS 333 Introduction to Operating Systems. Class 3 Threads & Concurrency. Jonathan Walpole Computer Science Portland State University

Threads (SGG 4) Outline. Traditional Process: Single Activity. Example: A Text Editor with Multi-Activity. Instructor: Dr.

Programming with Threads

Synchroniza+on: Advanced

CS 333 Introduction to Operating Systems. Class 3 Threads & Concurrency. Jonathan Walpole Computer Science Portland State University

Threads. What is a thread? Motivation. Single and Multithreaded Processes. Benefits

Carnegie Mellon Performance of Mul.threaded Code

CS533 Concepts of Operating Systems. Jonathan Walpole

Concurrent Programming

CS510 Operating System Foundations. Jonathan Walpole

Concurrency. Johan Montelius KTH

POSIX threads CS 241. February 17, Copyright University of Illinois CS 241 Staff

CS333 Intro to Operating Systems. Jonathan Walpole

Process Synchroniza/on

What is concurrency? Concurrency. What is parallelism? concurrency vs parallelism. Concurrency: (the illusion of) happening at the same time.

pthreads CS449 Fall 2017

The course that gives CMU its Zip! Concurrency I: Threads April 10, 2001

CS 475. Process = Address space + one thread of control Concurrent program = multiple threads of control

Multithreading Programming II

HPCSE - I. «Introduction to multithreading» Panos Hadjidoukas

Carnegie Mellon. Bryant and O Hallaron, Computer Systems: A Programmer s Perspective, Third Edition

Lecture #23 Concurrent Programming

Threads. lightweight processes

Processes. Johan Montelius KTH

A process. the stack

Concurrent Programming is Hard! Concurrent Programming. Reminder: Iterative Echo Server. The human mind tends to be sequential

CS510 Operating System Foundations. Jonathan Walpole

Concurrent Programming

Concurrent Programming. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Last class: Today: Thread Background. Thread Systems

Thread. Disclaimer: some slides are adopted from the book authors slides with permission 1

G Programming Languages - Fall 2012

Concurrent Programming

Introduction to C. Robert Escriva. Cornell CS 4411, August 30, Geared toward programmers

Threads. Threads (continued)

THREADS: (abstract CPUs)

CS 3723 Operating Systems: Midterm II - Review

Concurrent Programming

Recitation 14: Proxy Lab Part 2

CS 333 Introduction to Operating Systems Class 4 Concurrent Programming and Synchronization Primitives

CS 153 Lab4 and 5. Kishore Kumar Pusukuri. Kishore Kumar Pusukuri CS 153 Lab4 and 5

CISC2200 Threads Spring 2015

Introduction to Threads

Introduction to C. Ayush Dubey. Cornell CS 4411, August 31, Geared toward programmers

Concurrent Programming

Lecture 4. Threads vs. Processes. fork() Threads. Pthreads. Threads in C. Thread Programming January 21, 2005

Cache Coherence and Atomic Operations in Hardware

TCSS 422: OPERATING SYSTEMS

Outline. CS4254 Computer Network Architecture and Programming. Introduction 2/4. Introduction 1/4. Dr. Ayman A. Abdel-Hamid.

Introduction to C. Zhiyuan Teo. Cornell CS 4411, August 26, Geared toward programmers

Concurrent Servers Dec 2, 2009"

From Java to C. Thanks to Randal E. Bryant and David R. O'Hallaron (Carnegie-Mellon University) for providing the basis for these slides

Introduction to C. Sean Ogden. Cornell CS 4411, August 30, Geared toward programmers

Threading Language and Support. CS528 Multithreading: Programming with Threads. Programming with Threads

CSC 1600: Chapter 6. Synchronizing Threads. Semaphores " Review: Multi-Threaded Processes"

CS 261 Fall Mike Lam, Professor. Threads

CS-345 Operating Systems. Tutorial 2: Grocer-Client Threads, Shared Memory, Synchronization

Concurrency and Synchronization. ECE 650 Systems Programming & Engineering Duke University, Spring 2018

Remote Procedure Call (RPC) and Transparency

Lecture 15a Persistent Memory & Shared Pointers

Approaches to Concurrency

Operating Systems. Spring Dongkun Shin, SKKU

Final Exam. 11 May 2018, 120 minutes, 26 questions, 100 points

call connect call read call connect ret connect call fgets Client 2 blocks waiting to complete its connection request until after lunch!

CSE 160 Lecture 7. C++11 threads C++11 memory model

OS lpr. www. nfsd gcc emacs ls 1/27/09. Process Management. CS 537 Lecture 3: Processes. Example OS in operation. Why Processes? Simplicity + Speed

Overview. CMSC 330: Organization of Programming Languages. Concurrency. Multiprocessors. Processes vs. Threads. Computation Abstractions

Recitation 14: Proxy Lab Part 2

CSE 306/506 Operating Systems Threads. YoungMin Kwon

Lec 26: Parallel Processing. Announcements

POSIX Threads. HUJI Spring 2011

Transcription:

Concurrency on x86-64; Threads Programming Tips Brad Karp UCL Computer Science CS 3007 22 nd March 2018 (lecture notes derived from material from Eddie Kohler, David Mazières, Phil Gibbons, Dave O Hallaron, and Randy Bryant) 1

Outline Implementing synchronization primitives on x86-64 Compare-and-swap (lock cmpxchg) instruction x86-64 low-level behavior: what s atomic, what isn t? Gotchas to avoid when programming with pthreads Further Reading 2

How Do We Implement Synchronization in Assembly? Traditional assembly-level building block: compare-andswap (CAS) instruction Atomic Can be used to build all synchronization primitives we ve discussed CAS intuition: Thread reads shared state from memory; this is expected value if no other thread modifies this memory underneath our thread Thread computes desired updated value for shared state (e.g., for counter, increment of counter) Thread executes CAS instruction which atomically checks if memory still has expected value (no other thread has changed it) iff yes, updates memory to updated value, returns success if no, returns failure 3

How Do We Implement Synchronization in Assembly? (II) CAS definition (pseudocode): // hardware does entirety of below ATOMICALLY! int CAS_instr(uint64_t *value, uint64_t expected, uint64_t desired) if (*value == expected) *value = desired; return 1; else return 0; Typically call CAS in while loop: upon failure, reread *value; recompute desired, try again CAS is flexible: allows arbitrary computation on a memory value; each thread can do different computation on that value 4

Abstract CAS Example: Add 17 volatile uint64_t theval = 0; void threadfunc(void *arg) uint64_t *pv = (uint64_t *) arg; uint64_t expected, desired; while (1) expected = *pv; desired = expected + 17; if (CAS_instr(pv, expected, desired)) break; // [create N pthreads each running threadfunc(, &theval) ] Upon return, theval will have been atomically increased by 17 Loop retries if theval has changed underneath thread 5

x86-64 s CAS: lock cmpxchg Same functionality as CAS; slightly different interface Returns prior value of shared memory location Caller determines success/failure based on whether returned value equals expected As usual, different variants for different types (b, l, q) Pseudocode: // The entirety of the below is executed ATOMICALLY TYPE cmpxchg(type* object, TYPE expected, TYPE desired) TYPE actual = *object; if (actual == expected) *object = desired; return actual; C11 offers atomic_compare_exchange_strong() as portable API for CPU s CAS-like instruction 6

x86-64 Assembly vs. Concurrency and Shared Memory Carnegie Mellon Instructions with atomic effect include: Loading memory à register an aligned 1-, 2-, 4-, 8-byte value Storing register à memory an aligned 1-, 2-, 4-, 8-byte value Note, e.g., that if many threads concurrently store 8-byte aligned values to same address, no partial/ blended results But beware: only true for aligned accesses; if memory spans cache-line boundary, no atomicity! (Why?) Instructions without atomic effect include: incq (%rax); increments 64-bit int stored at address in %rax instruction in fact must load, increment, store if target address aligned, load and store each have atomic effect, but set of three ops as a whole does not! x86-64 instructions that include both reads and writes are nonatomic by default 7

x86-64 and Concurrency (cont d) But building locks requires atomic instructions! x86-64 supports atomic instructions with optional lock prefix Locks the bus during instruction s execution (coordinates among cores and their caches; effectively does mutual exclusion in hardware for operations on target address) e.g., lock incq (%rax) has atomic effect On x86-64, lock cmpxchg is universal building block for atomic functions But N.B. that there are other instructions with atomic effect, and lock cmpxchg isn t always the fastest alternative for all function implementations! Achieving highest performance and desired app semantics for concurrent access to shared memory is tricky Can even trade-off strength of semantics and performance (beyond scope of this lecture ) 8

Outline Implementing synchronization primitives on x86-64 Gotchas to avoid when programming with pthreads Further Reading 9

Shared Variables in Threaded C Programs Question: Which variables in a threaded C program are shared? The answer is not as simple as global variables are shared and stack variables are private Def: A variable x is shared if and only if multiple threads reference some instance of x. Requires answers to the following questions: What is the memory model for threads? How are instances of variables mapped to memory? How many threads might reference each of these instances? 10

Threads Memory Model: Conceptual Multiple threads run within the context of a single process Each thread has its own separate thread context Thread ID, stack, stack pointer, PC, condition codes, and GP registers All threads share the remaining process context Code, data, heap, and shared library segments of the process virtual address space Open files and installed handlers Thread 1 (private) stack 1 Thread 1 context: Data registers Condition codes SP 1 PC 1 Thread 2 (private) stack 2 Thread 2 context: Data registers Condition codes SP 2 PC 2 Shared code and data shared libraries run-time heap read/write data read-only code/data 11

Threads Memory Model: Actual stack 1 Thread 1 (private) Thread 1 context: Data registers Condition codes SP 1 PC 1 stack 2 Thread 2 (private) Thread 2 context: Data registers Condition codes SP 2 PC 2 Virtual Address Space Shared code and data shared libraries run-time heap read/write data read-only code/data 12

Threads Memory Model: Actual stack 1 Thread 1 (private) Thread 1 context: Data registers Condition codes SP 1 PC 1 stack 2 Thread 2 (private) Thread 2 context: Data registers Condition codes SP 2 PC 2 Virtual Address Space Shared code and data shared libraries run-time heap read/write data read-only code/data 13

Threads Memory Model: Actual Separation of data is not strictly enforced: Register values are truly separate and protected, but Any thread can read and write the stack of any other thread stack 1 Thread 1 (private) Thread 1 context: Data registers Condition codes SP 1 PC 1 stack 2 Thread 2 (private) Thread 2 context: Data registers Condition codes SP 2 PC 2 Virtual Address Space Shared code and data shared libraries run-time heap read/write data read-only code/data Mismatch between conceptual and operational models is source of confusion and errors! 14

Example Program to Illustrate Sharing int main(int argc, char *argv[]) long i; pthread_t tid; char *msgs[2] = "Hello from foo", "Hello from bar" ; Pthread_create(&tid, NULL, thread, (void *)i); Pthread_exit(NULL); sharing.c 15

Example Program to Illustrate Sharing int main(int argc, char *argv[]) long i; pthread_t tid; char *msgs[2] = "Hello from foo", "Hello from bar" ; Pthread_create(&tid, NULL, thread, (void *)i); Pthread_exit(NULL); sharing.c A common but inelegant way to pass a single argument to a thread routine 16

Example Program to Illustrate Sharing int main(int argc, char *argv[]) long i; pthread_t tid; char *msgs[2] = "Hello from foo", "Hello from bar" ; Pthread_create(&tid, NULL, thread, (void *)i); Pthread_exit(NULL); sharing.c Peer threads reference main thread s stack indirectly through global ptr variable A common but inelegant way to pass a single argument to a thread routine 17

Mapping Variable Instances to Memory Global variables Def: Variable declared outside of a function Virtual memory contains exactly one instance of any global variable Local variables Def: Variable declared inside function without static attribute Each thread stack contains one instance of each local variable Local static variables Def: Variable declared inside function with the static attribute Virtual memory contains exactly one instance of any local static variable 18

Mapping Variable Instances to Memory Carnegie Mellon int main(int main, char *argv[]) long i; pthread_t tid; char *msgs[2] = "Hello from foo", "Hello from bar" ; Pthread_create(&tid, NULL, thread, (void *)i); Pthread_exit(NULL); sharing.c 19

Mapping Variable Instances to Memory Global var: 1 instance (ptr [data]) Carnegie Mellon int main(int main, char *argv[]) long i; pthread_t tid; char *msgs[2] = "Hello from foo", "Hello from bar" ; Pthread_create(&tid, NULL, thread, (void *)i); Pthread_exit(NULL); sharing.c 20

Mapping Variable Instances to Memory Carnegie Mellon Global var: 1 instance (ptr [data]) Local vars: 1 instance (i.m, msgs.m) int main(int main, char *argv[]) long i; pthread_t tid; char *msgs[2] = "Hello from foo", "Hello from bar" ; Pthread_create(&tid, NULL, thread, (void *)i); Pthread_exit(NULL); sharing.c 21

Mapping Variable Instances to Memory Carnegie Mellon Global var: 1 instance (ptr [data]) Local vars: 1 instance (i.m, msgs.m) int main(int main, char *argv[]) long i; pthread_t tid; char *msgs[2] = "Hello from foo", "Hello from bar" ; Pthread_create(&tid, NULL, thread, (void *)i); Pthread_exit(NULL); sharing.c Local var: 2 instances ( myid.p0 [peer thread 0 s stack], myid.p1 [peer thread 1 s stack] ) 22

Mapping Variable Instances to Memory Carnegie Mellon Global var: 1 instance (ptr [data]) Local vars: 1 instance (i.m, msgs.m) int main(int main, char *argv[]) long i; pthread_t tid; char *msgs[2] = "Hello from foo", "Hello from bar" ; Pthread_create(&tid, NULL, thread, (void *)i); Pthread_exit(NULL); sharing.c Local var: 2 instances ( myid.p0 [peer thread 0 s stack], myid.p1 [peer thread 1 s stack] ) Local static var: 1 instance (cnt [data]) 23

Shared Variable Analysis Which variables are shared? Variable Referenced by Referenced by Referenced by instance main thread? peer thread 0? peer thread 1? ptr cnt i.m msgs.m myid.p0 myid.p1 int main(int main, char *argv[]) long Answer: i; pthread_t A variable tid; x is shared iff multiple threads char *msgs[2] = "Hello from foo", reference at least one instance of x. Thus: "Hello from bar" ; n ipthread_create(&tid, and myid are not shared NULL, thread,(void *)i); Pthread_exit(NULL); n ptr, cnt, and msgs are shared 24

Shared Variable Analysis Which variables are shared? Variable Referenced by Referenced by Referenced by instance main thread? peer thread 0? peer thread 1? ptr cnt i.m msgs.m myid.p0 myid.p1 yes int main(int main, char *argv[]) long Answer: i; pthread_t A variable tid; x is shared iff multiple threads char *msgs[2] = "Hello from foo", reference at least one instance of x. Thus: "Hello from bar" ; n ipthread_create(&tid, and myid are not shared NULL, thread,(void *)i); Pthread_exit(NULL); n ptr, cnt, and msgs are shared 25

Shared Variable Analysis Which variables are shared? Variable Referenced by Referenced by Referenced by instance main thread? peer thread 0? peer thread 1? ptr cnt i.m msgs.m myid.p0 myid.p1 yes long Answer: i; pthread_t A variable tid; x is shared iff multiple threads char *msgs[2] = "Hello from foo", reference at least one instance of x. Thus: yes int main(int main, char *argv[]) "Hello from bar" ; n ipthread_create(&tid, and myid are not shared NULL, thread,(void *)i); Pthread_exit(NULL); n ptr, cnt, and msgs are shared 26

Shared Variable Analysis Which variables are shared? Variable Referenced by Referenced by Referenced by instance main thread? peer thread 0? peer thread 1? ptr cnt i.m msgs.m myid.p0 myid.p1 yes yes yes int main(int main, char *argv[]) long Answer: i; pthread_t A variable tid; x is shared iff multiple threads char *msgs[2] = "Hello from foo", reference at least one instance of x. Thus: "Hello from bar" ; n ipthread_create(&tid, and myid are not shared NULL, thread,(void *)i); Pthread_exit(NULL); n ptr, cnt, and msgs are shared 27

Shared Variable Analysis Which variables are shared? Variable Referenced by Referenced by Referenced by instance main thread? peer thread 0? peer thread 1? ptr cnt i.m msgs.m myid.p0 myid.p1 yes yes yes no int main(int main, char *argv[]) long Answer: i; pthread_t A variable tid; x is shared iff multiple threads char *msgs[2] = "Hello from foo", reference at least one instance of x. Thus: "Hello from bar" ; n ipthread_create(&tid, and myid are not shared NULL, thread,(void *)i); Pthread_exit(NULL); n ptr, cnt, and msgs are shared 28

Shared Variable Analysis Which variables are shared? Variable Referenced by Referenced by Referenced by instance main thread? peer thread 0? peer thread 1? ptr cnt i.m msgs.m myid.p0 myid.p1 yes yes yes no yes int main(int main, char *argv[]) long Answer: i; pthread_t A variable tid; x is shared iff multiple threads char *msgs[2] = "Hello from foo", reference at least one instance of x. Thus: "Hello from bar" ; n ipthread_create(&tid, and myid are not shared NULL, thread,(void *)i); Pthread_exit(NULL); n ptr, cnt, and msgs are shared 29

Shared Variable Analysis Which variables are shared? Variable Referenced by Referenced by Referenced by instance main thread? peer thread 0? peer thread 1? ptr cnt i.m msgs.m myid.p0 myid.p1 int main(int main, char *argv[]) yes yes yes no yes yes long Answer: i; pthread_t A variable tid; x is shared iff multiple threads char *msgs[2] = "Hello from foo", reference at least one instance of x. Thus: "Hello from bar" ; n ipthread_create(&tid, and myid are not shared NULL, thread,(void *)i); Pthread_exit(NULL); n ptr, cnt, and msgs are shared 30

Shared Variable Analysis Which variables are shared? Variable Referenced by Referenced by Referenced by instance main thread? peer thread 0? peer thread 1? ptr cnt i.m msgs.m myid.p0 myid.p1 yes yes yes no yes yes yes int main(int main, char *argv[]) long Answer: i; pthread_t A variable tid; x is shared iff multiple threads char *msgs[2] = "Hello from foo", reference at least one instance of x. Thus: "Hello from bar" ; n ipthread_create(&tid, and myid are not shared NULL, thread,(void *)i); Pthread_exit(NULL); n ptr, cnt, and msgs are shared 31

Shared Variable Analysis Which variables are shared? Variable Referenced by Referenced by Referenced by instance main thread? peer thread 0? peer thread 1? ptr cnt i.m msgs.m myid.p0 myid.p1 yes yes yes no yes yes yes no int main(int main, char *argv[]) long Answer: i; pthread_t A variable tid; x is shared iff multiple threads char *msgs[2] = "Hello from foo", reference at least one instance of x. Thus: "Hello from bar" ; n ipthread_create(&tid, and myid are not shared NULL, thread,(void *)i); Pthread_exit(NULL); n ptr, cnt, and msgs are shared 32

Shared Variable Analysis Which variables are shared? Variable Referenced by Referenced by Referenced by instance main thread? peer thread 0? peer thread 1? ptr cnt i.m msgs.m myid.p0 myid.p1 yes yes yes no yes yes yes no no int main(int main, char *argv[]) long Answer: i; pthread_t A variable tid; x is shared iff multiple threads char *msgs[2] = "Hello from foo", reference at least one instance of x. Thus: "Hello from bar" ; n ipthread_create(&tid, and myid are not shared NULL, thread,(void *)i); Pthread_exit(NULL); n ptr, cnt, and msgs are shared 33

Shared Variable Analysis Which variables are shared? Variable Referenced by Referenced by Referenced by instance main thread? peer thread 0? peer thread 1? ptr cnt i.m msgs.m myid.p0 myid.p1 yes yes yes no yes yes yes no no yes int main(int main, char *argv[]) long Answer: i; pthread_t A variable tid; x is shared iff multiple threads char *msgs[2] = "Hello from foo", reference at least one instance of x. Thus: "Hello from bar" ; n ipthread_create(&tid, and myid are not shared NULL, thread,(void *)i); Pthread_exit(NULL); n ptr, cnt, and msgs are shared 34

Shared Variable Analysis Which variables are shared? Variable Referenced by Referenced by Referenced by instance main thread? peer thread 0? peer thread 1? ptr cnt i.m msgs.m myid.p0 myid.p1 yes yes yes no yes yes yes no no yes yes int main(int main, char *argv[]) long Answer: i; pthread_t A variable tid; x is shared iff multiple threads char *msgs[2] = "Hello from foo", reference at least one instance of x. Thus: "Hello from bar" ; n ipthread_create(&tid, and myid are not shared NULL, thread,(void *)i); Pthread_exit(NULL); n ptr, cnt, and msgs are shared 35

Shared Variable Analysis Which variables are shared? Variable Referenced by Referenced by Referenced by instance main thread? peer thread 0? peer thread 1? ptr cnt i.m msgs.m myid.p0 myid.p1 yes yes yes no yes yes yes no no yes yes yes int main(int main, char *argv[]) long Answer: i; pthread_t A variable tid; x is shared iff multiple threads char *msgs[2] = "Hello from foo", reference at least one instance of x. Thus: "Hello from bar" ; n ipthread_create(&tid, and myid are not shared NULL, thread,(void *)i); Pthread_exit(NULL); n ptr, cnt, and msgs are shared 36

Shared Variable Analysis Which variables are shared? Variable Referenced by Referenced by Referenced by instance main thread? peer thread 0? peer thread 1? ptr cnt i.m msgs.m myid.p0 myid.p1 yes yes yes no yes yes yes no no yes yes yes no int main(int main, char *argv[]) long Answer: i; pthread_t A variable tid; x is shared iff multiple threads char *msgs[2] = "Hello from foo", reference at least one instance of x. Thus: "Hello from bar" ; n ipthread_create(&tid, and myid are not shared NULL, thread,(void *)i); Pthread_exit(NULL); n ptr, cnt, and msgs are shared 37

Shared Variable Analysis Which variables are shared? Variable Referenced by Referenced by Referenced by instance main thread? peer thread 0? peer thread 1? ptr cnt i.m msgs.m myid.p0 myid.p1 yes yes yes no yes yes yes no no yes yes yes no yes int main(int main, char *argv[]) long Answer: i; pthread_t A variable tid; x is shared iff multiple threads char *msgs[2] = "Hello from foo", reference at least one instance of x. Thus: "Hello from bar" ; n ipthread_create(&tid, and myid are not shared NULL, thread,(void *)i); Pthread_exit(NULL); n ptr, cnt, and msgs are shared 38

Shared Variable Analysis Which variables are shared? Variable Referenced by Referenced by Referenced by instance main thread? peer thread 0? peer thread 1? ptr cnt i.m msgs.m myid.p0 myid.p1 yes yes yes no yes yes yes no no yes yes yes no yes no int main(int main, char *argv[]) long Answer: i; pthread_t A variable tid; x is shared iff multiple threads char *msgs[2] = "Hello from foo", reference at least one instance of x. Thus: "Hello from bar" ; n ipthread_create(&tid, and myid are not shared NULL, thread,(void *)i); Pthread_exit(NULL); n ptr, cnt, and msgs are shared 39

Shared Variable Analysis Which variables are shared? Variable Referenced by Referenced by Referenced by instance main thread? peer thread 0? peer thread 1? ptr cnt i.m msgs.m myid.p0 myid.p1 yes yes yes no yes yes yes no no yes yes yes no yes no no int main(int main, char *argv[]) long Answer: i; pthread_t A variable tid; x is shared iff multiple threads char *msgs[2] = "Hello from foo", reference at least one instance of x. Thus: "Hello from bar" ; n ipthread_create(&tid, and myid are not shared NULL, thread,(void *)i); Pthread_exit(NULL); n ptr, cnt, and msgs are shared 40

Shared Variable Analysis Which variables are shared? Variable Referenced by Referenced by Referenced by instance main thread? peer thread 0? peer thread 1? ptr cnt i.m msgs.m myid.p0 myid.p1 yes yes yes no yes yes yes no no yes yes yes no yes no no no int main(int main, char *argv[]) long Answer: i; pthread_t A variable tid; x is shared iff multiple threads char *msgs[2] = "Hello from foo", reference at least one instance of x. Thus: "Hello from bar" ; n ipthread_create(&tid, and myid are not shared NULL, thread,(void *)i); Pthread_exit(NULL); n ptr, cnt, and msgs are shared 41

Shared Variable Analysis Which variables are shared? Variable Referenced by Referenced by Referenced by instance main thread? peer thread 0? peer thread 1? ptr cnt i.m msgs.m myid.p0 myid.p1 yes yes yes no yes yes yes no no yes yes yes no yes no no no yes int main(int main, char *argv[]) long Answer: i; pthread_t A variable tid; x is shared iff multiple threads char *msgs[2] = "Hello from foo", reference at least one instance of x. Thus: "Hello from bar" ; n ipthread_create(&tid, and myid are not shared NULL, thread,(void *)i); Pthread_exit(NULL); n ptr, cnt, and msgs are shared 42

Shared Variable Analysis Which variables are shared? Variable Referenced by Referenced by Referenced by instance main thread? peer thread 0? peer thread 1? ptr cnt i.m msgs.m myid.p0 myid.p1 yes yes yes no yes yes yes no no yes yes yes no yes no no no yes Answer: A variable x is shared iff multiple threads reference at least one instance of x. Thus: n ptr, cnt, and msgs are shared n i and myid are not shared Moral: beware unintended sharing; take care to synchronize all shared accesses! 43

Crucial Concept: Thread Safety Functions called from a thread must be thread-safe Def: A function is thread-safe iff it will always produce correct results when called repeatedly from multiple concurrent threads Classes of thread-unsafe functions: Class 1: Functions that do not protect shared variables Class 2: Functions that keep state across multiple invocations Class 3: Functions that return a pointer to a static variable Class 4: Functions that call thread-unsafe functions 44

Thread-Unsafe Functions (Class 1) Failing to protect shared variables Fix: use synchronization primitives, e.g., spinlock, mutex, condition variable Pitfall: synchronization operations invariably slow down code Carnegie Mellon 45

Thread-Unsafe Functions (Class 2) Relying on persistent state across multiple function invocations Example: Random number generator that relies on static state static unsigned int next = 1; /* rand: return pseudo-random integer on 0..32767 */ int rand(void) next = next*1103515245 + 12345; return (unsigned int)(next/65536) % 32768; /* srand: set seed for rand() */ void srand(unsigned int seed) next = seed; 46

Thread-Safe Random Number Generator Fix: pass state as argument and thereby eliminate static state /* rand_r - return pseudo-random integer on 0..32767 */ int rand_r(int *nextp) *nextp = *nextp*1103515245 + 12345; return (unsigned int)(*nextp/65536) % 32768; Consequence: programmer using rand_r() must maintain seed 47

Thread-Unsafe Functions (Class 3) Returning pointer to static variable Fix 1: Rewrite function so caller passes address of variable to store result Changes in caller and callee Fix 2: Lock-and-copy Simple changes in caller (and none in callee) However, caller must free memory /* Convert integer to string */ char *itoa(int x) static char buf[11]; sprintf(buf, "%d", x); return buf; 48

Thread-Unsafe Functions (Class 3) Returning pointer to static variable Fix 1: Rewrite function so caller passes address of variable to store result Changes in caller and callee Fix 2: Lock-and-copy Simple changes in caller (and none in callee) However, caller must free memory /* Convert integer to string */ char *itoa(int x) static char buf[11]; sprintf(buf, "%d", x); return buf; char *lc_itoa(int x, char *dest) pthread_lock_mutex(&mutex); strcpy(dest, itoa(x)); pthread_unlock_mutex(&mutex); return dest; Warning: Some functions like gethostbyname() require a deep copy. Use reentrant gethostbyname_r() version instead. 49

Thread-Unsafe Functions (Class 4) Carnegie Mellon Calling thread-unsafe functions Calling one thread-unsafe function makes the entire calling function thread-unsafe Fix: Modify function so it calls only thread-safe functions J 50

Reentrant Functions Def: A function is reentrant iff it accesses no shared variables when called by multiple threads Important subset of thread-safe functions Require no synchronization operations Only way to make a Class 2 function thread-safe is to make it reetnrant (e.g., rand_r) All functions Thread-safe functions Reentrant functions Thread-unsafe functions 51

Thread-Safe Library Functions All functions in the Standard C Library (at the back of K&R text) are thread-safe Examples: malloc(), free(), printf(), scanf() Most UNIX-specific C library calls are thread-safe, with a few exceptions: Thread-unsafe function Class Reentrant version asctime() 3 asctime_r() ctime() 3 ctime_r() gethostbyaddr() 3 gethostbyaddr_r() gethostbyname() 3 gethostbyname_r() inet_ntoa() 3 (none) localtime() 3 localtime_r() rand() 2 rand_r() 52

Further Reading (unexaminable) Intel s weak memory semantics are subtle, and known as Total Store Ordering (TSO); full details explored in Sewell s: http://www.cl.cam.ac.uk/~pes20/weakmemory/cacm.pdf The scalability of lock implementations, especially in kernel code, is of great import to performance; a classic exploration from MIT s PDOS group, showing performance howlers in Linux: https://people.csail.mit.edu/nickolai/papers/boyd-wickizer-locks.pdf C11 supports a vast array of options for specifying the memory semantics you want Atomics to use; an informative exploration (for C++11) by Preshing: http://preshing.com/20120913/acquire-and-release-semantics/ Designing high-performance applications that run on many cores and share memory is hard, and an ongoing systems research topic. Kohler et al. s Masstree is an impressively fast (key, value) store for multi-core machines: http://read.seas.harvard.edu/~kohler/pubs/mao12cache.pdf 53