
CS 537 Lecture 10: Threads
Michael Swift, 10/9/17
(c) 2004-2007 Ed Lazowska, Hank Levy, Andrea and Remzi Arpaci-Dusseau, Michael Swift

Review: Easy Piece 1
Virtualization of the CPU and memory: context switches, schedulers, allocation, segmentation, paging, TLBs, multilevel page tables, swapping.

Motivation for Concurrency
(see http://cacm.acm.org/magazines/2012/4/147359-cpu-db-recording-microprocessor-history/fulltext)

CPU trend: same speed, but multiple cores.
Goal: write applications that fully utilize many cores.

Option 1: Build apps from many communicating processes
  Example: Chrome (one process per tab); communicate via pipe() or similar
  Pros? Don't need new abstractions; good for security
  Cons? Cumbersome programming, high communication overheads, expensive context switching (why expensive?)

Concurrency: Option 2
New abstraction: the thread. Threads are like processes, except that multiple threads of the same process share an address space. Divide a large task across several cooperative threads that communicate through the shared address space.

Common Programming Models
Multi-threaded programs tend to be structured as:
  - Producer/consumer: multiple producer threads create data (or work) that is handled by one of multiple consumer threads
  - Pipeline: the task is divided into a series of subtasks, each handled in series by a different thread
  - Defer work with a background thread: one thread performs non-critical work in the background (when the CPU is idle)

What state do threads share?
[Diagrams: two CPUs, each running one thread, sharing RAM]
  - Do threads share page directories? Threads of the same process use the same page directory; threads of different processes do not.
  - Do threads share an instruction pointer? No: threads share code, but each thread may be executing different code at the same time, so each thread has its own instruction pointer.
  - Do threads share a stack pointer? No: threads executing different functions need different stacks, so each thread has its own stack and stack pointer.

(old) Process address space: from 0xFFFFFFFF down to 0x00000000, a single stack (one SP, dynamically allocated), then the heap (dynamically allocated), static data (data segment), and code (text segment, one PC).

(new) Address space with threads: thread 1's stack, thread 2's stack, and thread 3's stack at the top (each with its own SP), then the heap (dynamically allocated), static data (data segment), and code (text segment), with a PC per thread.

Thread vs. Process
Multiple threads within a single process share:
  - Process ID (PID)
  - Address space: code (instructions) and most data (heap)
  - Open file descriptors
  - Current working directory
  - User and group id
Each thread has its own:
  - Thread ID (TID)
  - Set of registers, including program counter and stack pointer
  - Stack for local variables and return addresses (in the same address space)

Thread context switch
Like a process context switch:
  - trap to kernel
  - save context of the currently running thread (push machine state onto that thread's stack)
  - restore context of the next thread (pop machine state from the next thread's stack)
  - return as the new thread (execution resumes at the PC of the next thread)
What's not done: changing the address space.

Thread API
A variety of thread systems exist; POSIX Pthreads is the common one.
Common thread operations: create, exit, join (instead of wait() for processes).

OS Support: Approach 1
User-level threads: many-to-one thread mapping
  - Implemented by user-level runtime libraries: create, schedule, and synchronize threads at user level
  - The OS is not aware of user-level threads; it thinks each process contains only a single thread of control
Advantages:
  - Does not require OS support; portable
  - Can tune the scheduling policy to meet application demands
  - Lower overhead thread operations, since no system call is needed
Disadvantages:
  - Cannot leverage multiprocessors
  - The entire process blocks when one thread blocks

OS Support: Approach 2
Kernel-level threads: one-to-one thread mapping
  - The OS provides each user-level thread with a kernel thread
  - Each kernel thread is scheduled independently
  - Thread operations (creation, scheduling, synchronization) are performed by the OS
Advantages:
  - Each kernel-level thread can run in parallel on a multiprocessor
  - When one thread blocks, other threads from the same process can be scheduled
Disadvantages:
  - Higher overhead for thread operations
  - The OS must scale well with an increasing number of threads

Thread Schedule #1
Example: balance = balance + 1; where balance lives at address 0x9cd4 and starts at 100. The statement compiles to three instructions:
  0x195: mov 0x9cd4, %eax    (load balance)
  0x19a: add $0x1, %eax      (increment)
  0x19d: mov %eax, 0x9cd4    (store balance)

[Slide sequence: memory at 0x9cd4 plus each thread's %rip and %eax, step by step]
Thread 1 runs all three instructions (balance: 100 -> 101), then a thread context switch occurs and Thread 2 runs all three instructions (balance: 101 -> 102).
Desired result! The final value of balance is 102.

Another schedule: Thread Schedule #2
[Slide sequence: same state, different interleaving]
Thread 1 loads balance (100) into %eax and increments it, but is context-switched out before the store. Thread 2 then runs all three instructions, loading 100, incrementing, and storing 101. A context switch back to Thread 1 lets it execute its store, writing its stale %eax value of 101 again.
WRONG result! The final value of balance is 101; one increment is lost.

Timeline Views
Two threads update a shared variable at address 0x123. Thread A runs:
  mov 0x123, %eax; add $0x1, %eax; mov %eax, 0x123
and thread B runs:
  mov 0x123, %eax; add $0x2, %eax; mov %eax, 0x123
How much is added to the shared variable? It depends on the interleaving:
  - If one thread's three instructions all complete before the other's begin: 3 (correct)
  - If one thread loads the variable before the other has stored its update, an update is lost: 2 or even 1 (incorrect)

Non-Determinism
Concurrency leads to non-deterministic results: different results even with the same inputs (race conditions).
Whether the bug manifests depends on the CPU schedule, so passing tests means little.
How to program: imagine the scheduler is malicious; assume it will pick a bad ordering at some point.

What do we want?
We want the 3 instructions to execute as an uninterruptible group; that is, we want them to be atomic:
  mov 0x123, %eax
  add $0x1, %eax      <- critical section
  mov %eax, 0x123
More generally: we need mutual exclusion for critical sections. If A is in critical section C, B can't enter C (it is okay if other processes do unrelated work).

Synchronization
Build higher-level synchronization primitives in the OS: operations that ensure correct ordering of instructions across threads.
Motivation: build them once and get them right.
Higher-level primitives (monitors, locks, semaphores, condition variables) are built from lower-level ones (loads, stores, test&set, disabling interrupts).

Locks
Goal: provide mutual exclusion (mutex). Three common operations:
  - Allocate and initialize:
      pthread_mutex_t mylock = PTHREAD_MUTEX_INITIALIZER;
  - Acquire: acquire exclusive access to the lock; wait if the lock is not available (some other thread is in the critical section); spin or block (relinquish the CPU) while waiting:
      pthread_mutex_lock(&mylock);
  - Release: release exclusive access to the lock; let another thread enter the critical section:
      pthread_mutex_unlock(&mylock);

Timeline View with locks
  Thread A: mutex_lock(); mov 0x123, %eax; add $0x1, %eax; mov %eax, 0x123; mutex_unlock()
  Thread B: mutex_lock() (waits until A unlocks); mov 0x123, %eax; add $0x2, %eax; mov %eax, 0x123; mutex_unlock()
How much is added to the shared variable? 3: correct!

Conclusions
  - Concurrency is needed to obtain high performance by utilizing multiple cores
  - Threads are multiple execution streams within a single process or address space (they share the PID and address space; each owns its registers and stack)
  - Context switches within a critical section can lead to non-deterministic bugs (race conditions)
  - Use locks to provide mutual exclusion

Implementing Synchronization
To implement synchronization, we need atomic operations.
Atomic operation: no other instructions can be interleaved.
Examples of atomic operations:
  - Code between interrupts on uniprocessors (disable timer interrupts, don't do any I/O)
  - Loads and stores of words (load r1, B; store r1, A)
  - Special hardware instructions: test&set, compare&swap

Implementing Locks: Attempt #1
Turn off interrupts for critical sections: prevent the dispatcher from running another thread, so the code executes atomically.

  void acquire(lock_t *l) {
      disable_interrupts();
  }
  void release(lock_t *l) {
      enable_interrupts();
  }

Disadvantages??

Implementing Locks: Attempt #2
Code uses a single shared lock variable:

  boolean lock = false;   // shared variable
  void acquire() {
      while (lock)
          ;  /* wait */
      lock = true;
  }
  void release() {
      lock = false;
  }

Why doesn't this work?