Misc. Notes. In BVT, switch from i j only if E j E i C/w i. In BVT, Google is example of W better then L. For lab, recall PDE and PTE bits get ANDed

Similar documents
Intro to Threads. Two approaches to concurrency. Threaded multifinger: - Asynchronous I/O (lab) - Threads (today s lecture)

Administrivia. p. 1/20

Review: Thread package API

Administrivia. Review: Thread package API. Program B. Program A. Program C. Correct answers. Please ask questions on Google group

Systèmes d Exploitation Avancés

Review: Thread package API

Review: Thread package API

Readers-Writers Problem. Implementing shared locks. shared locks (continued) Review: Test-and-set spinlock. Review: Test-and-set on alpha

Administrivia. Lab 1 will be up by tomorrow, Due Oct. 11

Review: Test-and-set spinlock

Program A. Review: Thread package API. Program B. Program C. Correct answers. Correct answers. Q: Can both critical sections run?

CS 318 Principles of Operating Systems

Processes (Intro) Yannis Smaragdakis, U. Athens

CS 318 Principles of Operating Systems

Review: Thread package API

Administrivia. Section scheduled at 2:15pm in Gates B03 (next door)

ECE 391 Exam 1 Review Session - Spring Brought to you by HKN

CS 31: Introduction to Computer Systems : Threads & Synchronization April 16-18, 2019

ICS143A: Principles of Operating Systems. Midterm recap, sample questions. Anton Burtsev February, 2017

Administrivia. No credit for late assignments w/o extension Ask cs140-staff for extension if you can t finish

Administrivia. Processes. Speed. Processes in the real world. A process s view of the world. Inter-Process Communication

Register Allocation, iii. Bringing in functions & using spilling & coalescing

Concurrency, Thread. Dongkun Shin, SKKU

CS 537 Lecture 2 - Processes

Synchronization I. Jo, Heeseung

Process & Thread Management II. Queues. Sleep() and Sleep Queues CIS 657

Process & Thread Management II CIS 657

Section 4: Threads CS162. September 15, Warmup Hello World Vocabulary 2

Implementing Threads. Operating Systems In Depth II 1 Copyright 2018 Thomas W. Doeppner. All rights reserved.

CMSC 412 Project #3 Threads & Synchronization Due March 17 th, 2017, at 5:00pm

Recap: Thread. What is it? What does it need (thread private)? What for? How to implement? Independent flow of control. Stack

Problem Set 2. CS347: Operating Systems

CS3210: Coordination. Tim Andersen

Operating Systems. Lecture 4 - Concurrency and Synchronization. Master of Computer Science PUF - Hồ Chí Minh 2016/2017

[537] Locks. Tyler Harter

Synchronization. Disclaimer: some slides are adopted from the book authors slides with permission 1

1 RCU. 2 Improving spinlock performance. 3 Kernel interface for sleeping locks. 4 Deadlock. 5 Transactions. 6 Scalable interface design

Concurrency: Mutual Exclusion (Locks)

CS533 Concepts of Operating Systems. Jonathan Walpole

Assembly Language: Function Calls" Goals of this Lecture"

CSE 451: Operating Systems Winter Lecture 7 Synchronization. Steve Gribble. Synchronization. Threads cooperate in multithreaded programs

Review: Program Execution. Memory program code program data program stack containing procedure activation records

Assembly Language: Function Calls" Goals of this Lecture"

CS 537 Lecture 11 Locks

SYNCHRONIZATION M O D E R N O P E R A T I N G S Y S T E M S R E A D 2. 3 E X C E P T A N D S P R I N G 2018

Lecture 5: Synchronization w/locks

CS 31: Intro to Systems Functions and the Stack. Martin Gagne Swarthmore College February 23, 2016

Assembly Language: Function Calls

MULTITHREADING AND SYNCHRONIZATION. CS124 Operating Systems Fall , Lecture 10

Assembly I: Basic Operations. Jo, Heeseung

CMSC421: Principles of Operating Systems

Lecture 9: Multiprocessor OSs & Synchronization. CSC 469H1F Fall 2006 Angela Demke Brown

ASSEMBLY I: BASIC OPERATIONS. Jo, Heeseung

Protection and System Calls. Otto J. Anshus

IV. Process Synchronisation

CSE 451: Operating Systems Winter Lecture 7 Synchronization. Hank Levy 412 Sieg Hall

CS5460: Operating Systems

CSC369 Lecture 2. Larry Zhang

CSC369 Lecture 2. Larry Zhang, September 21, 2015

4. The Abstraction: The Process

Synchronising Threads

Homework / Exam. Return and Review Exam #1 Reading. Machine Projects. Labs. S&S Extracts , PIC Data Sheet. Start on mp3 (Due Class 19)

Synchronization. CS61, Lecture 18. Prof. Stephen Chong November 3, 2011

THREADS: (abstract CPUs)

Locks. Dongkun Shin, SKKU

Assembly I: Basic Operations. Computer Systems Laboratory Sungkyunkwan University

Threads and Synchronization. Kevin Webb Swarthmore College February 15, 2018

Operating Systems (1DT020 & 1TT802)

Dealing with Issues for Interprocess Communication

Concurrency: Locks. Announcements

Implementing Mutual Exclusion. Sarah Diesburg Operating Systems CS 3430

Synchronization I. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Chapter 6: Process Synchronization

Synchronization Principles

Review: Processes. A process is an instance of a running program. POSIX Thread APIs:

Homework. In-line Assembly Code Machine Language Program Efficiency Tricks Reading PAL, pp 3-6, Practice Exam 1

Administrivia. Section scheduled at 12:30pm in Skilling. Lab 1 is ready. Ask cs140-staff for extension if you can t finish

Lecture 9: Midterm Review

Assembly Language: Function Calls. Goals of this Lecture. Function Call Problems

CS420: Operating Systems. Process Synchronization

Atomic Instructions! Hakim Weatherspoon CS 3410, Spring 2011 Computer Science Cornell University. P&H Chapter 2.11

CS 111. Operating Systems Peter Reiher

SYSTEM CALL IMPLEMENTATION. CS124 Operating Systems Fall , Lecture 14

CS 537 Lecture 2 Computer Architecture and Operating Systems. OS Tasks

10/17/ Gribble, Lazowska, Levy, Zahorjan 2. 10/17/ Gribble, Lazowska, Levy, Zahorjan 4

Introduction to OS Synchronization MOS 2.3

Synchronization 1. Synchronization

Review: Easy Piece 1

3. Process Management in xv6

Section 4: Threads and Context Switching

administrivia today start assembly probably won t finish all these slides Assignment 4 due tomorrow any questions?

- gcc file A.c compiler running on file A - gcc file B.c compiler running on file B - emacs text editor - firefox web browser

Review: Program Execution. Memory program code program data program stack containing procedure activation records

Review: Program Execution. Memory program code program data program stack containing procedure activiation records

Lesson 6: Process Synchronization

CPEG421/621 Tutorial

Cache Coherence and Atomic Operations in Hardware

Programming in Parallel COMP755

x86 architecture et similia

Anne Bracy CS 3410 Computer Science Cornell University

Transcription:

Misc. Notes In BVT, switch from i j only if E j E i C/w i - A j A i C/w i doesn t make sense - Might want to switch to j when A j > A i if j warped In BVT, Google is example of W better then L - Otherwise, have to set U to something (e.g., 3 L) - End up with windows where some threads completely unschedulable - Long running slow thread might suddenly get warped again For lab, recall PDE and PTE bits get ANDed - PG_P, PG_W, PG_U - if set in PTE, probably want in PDE p.1/23

Intro to Threads Threads: most popular abstraction for concurrency - Lighter-weight abstraction than processes - All threads in one process have same memory, file descriptors, etc. - Allows one process to use multiple CPUs Example: threaded web server: - Service many clients simultaneously for (;;) { fd = accept_client (); thread_create (service_client, &fd); } p.2/23

How to share CPU amongst threads Each thread has execution state: - Stack, program counter, registers, condition codes, etc. Switch the CPU amongst the threads - Save away execution state of one, load up that of next When to switch? - Current thread can no longer use the CPU (waiting for I/O) - Current thread has had CPU for too long (preemption) - Scheduler maintains lists of runnable/running/waiting threads p.3/23

Thread package API Ø Ö Ø ÚÓ Òµ ÚÓ µ ÚÓ Ö µ - Create a new thread, run Ò with Ö ÚÓ Ü Ø µ - Destroy current thread ÚÓ Ó Ò Ø Ø Ö µ - Wait for thread Ø Ö to exit p.4/23

Synchronization primitives ÚÓ ÐÓ ÑÙØ Ü Ø Ñµ ÙÒÐÓ ÑÙØ Ü Ø Ñµ ÚÓ - Only one thread acuires Ñ at a time, others wait - All global data must be protected by a mutex! ÚÓ Û Ø ÑÙØ Ü Ø Ñ ÓÒ Ø µ - Atomically unlock Ñ and sleep until signaled ÚÓ Ò Ð ÓÒ Ø µ ÖÓ Ø ÓÒ Ø µ ÚÓ - Wake one/all users waiting on p.5/23

Example: Taking job from work queue job *job_queue; mutex_t job_mutex; cond_t job_cond; void workthread (void *) { job *j; for (;;) { lock (job_mutex); while (!(j = job_queue)) wait (job_mutex, job_cond); job_queue = j->next; unlock (job_mutex); do (j); } } p.6/23

Example: Adding job to work queue void addjob (job *j) { lock (job_mutex); j->next = job_queue; job_queue = j; signal (job_cond); unlock (job_mutex); } Atomic release/wait necessary in ÛÓÖ Ø Ö, otherwise: - ÛÓÖ Ø Ö checks queue, releases lock - Ó adds job to queue, signals Ó ÑÙØ Ü - ÛÓÖ Ø Ö waits for signal that was already delivered p.7/23

Other thread package features Alerts cause exception in a thread Trylock don t block if can t acquire mutex Timedwait timeout on condition variable Shared locks concurrent read accesses to data Thread priorities control scheduling policy Thread-specific global data p.8/23

Implementing shared locks struct sharedlk { int i; mutex_t m; cond_t c; }; void AcquireExclusive (sharedlk *sl) { lock (sl->m); while (sl->i) { wait (sl->m, sl->c); } sl->i = -1; unlock (sl->m); } void AcquireShared (sharedlk *sl) { lock (sl->m); while (sl->i < 0) { wait (sl->m, sl->c); } sl->i++; unlock (sl->m); } p.9/23

shared locks (continued) void ReleaseShared (sharedlk *sl) { lock (sl->m); if (!--sl->i) signal (sl->c); unlock (sl->m); } void ReleaseExclusive (sharedlk *sl) { lock (sl->m); sl->i = 0; broadcast (sl->c); unlock (sl->m); } Must deal with starvation p.10/23

Mutex ordering: Deadlock - A locks m1, B locks m2, A locks m2, B locks m1 - How to avoid? Similar deadlock with condition variables - Suppose resource 1 managed by c 1, resource 2 by c 2 - A has 1, waits on c2, B has 2, waits on c1 Mutex/condition variable deadlock: - lock (a); lock (b); while (!ready) wait (b, c); unlock (b); unlock (a); - lock (a); lock (b); ready = true; signal (c); unlock (b); unlock (a); Moral: Bad to hold locks when crossing abstraction barriers! p.11/23

Detecting deadlock Static approaches (hard) Threads package can keep track of locks held Program grinds to a halt - Examine with debugger, find lock order problem Threads package can deduce partial order - For each lock acquired, order with other locks held - If cycle occurs, abort with error - Detects potential deadlocks even if they do not occur p.12/23

Data races Example: modify global Ü without mutex - Might compile to: load, add 1, store - Bad interleaving changes result: load, load,... Even single instructions can have races - E.g., Ð ½ Ü - Not atomic on MP without ÐÓ prefix! Even reads dangerous on some architectures But sometimes cheating buys efficiency if (!initialized) { lock (m); if (!initialized) { initialize (); initialized = 1; } unlock (m); } p.13/23

Detecting data races Static methods (hard) Debugging painful race might occur rarely Instrumentation modify program to trap memory accesses Lockset algorithm (eraser) particularly effective: - For each global memory location, keep a lockset - On each access, remove any locks not currently held - If lockset becomes empty, abort: No mutex protects data - Catches potential races even if they don t occur p.14/23

Implementing user-level threads Allocate a new stack for reach Ö Ø thread Keep a queue of runnable threads Replace networking system calls (read/write/etc.) - If operation would block, switch and run different thread Schedule periodic timer signal ( Ø Ø Ñ Ö) - Switch to another thread on timer signals (preemption) p.15/23

Example Per-thread state in thread control block structure typedef struct tcb { unsigned long md_esp; /* Stack pointer of thread */ char *t_stack; /* Bottom of thread s stack */ /*... */ }; Machine-dependent thread-switch function: - ÚÓ Ø Ö Ñ Û Ø Ø ÙÖÖ ÒØ Ø Ò Üص Machine-dependent thread initialization function: - ÚÓ Ø Ö Ñ Ò Ø Ø Ø ÚÓ Òµ ÚÓ µ ÚÓ Ö µ p.16/23

i386 thread_md_switch pushl %ebp; movl %esp,%ebp # Save frame pointer pushl %ebx; pushl %esi; pushl %edi # Save callee-saved r movl 8(%ebp),%edx movl 12(%ebp),%eax movl %esp,(%edx) movl (%eax),%esp # %edx = thread_current # %eax = thread_next # %edx->md_esp = %esp # %esp = %eax->md_esp popl %edi; popl %esi; popl %ebx popl %ebp ret # Restore callee save # Restore frame point # Resume execution p.17/23

i386 thread_md_init void thread_md_init (tcb *t, void (*fn) (void *), void *arg) { u_long *sp = (u_long *) (t->t_stack + thread_stack_size); /* Set up a callframe to thread_begin */ *--sp = (u_long) arg; *--sp = (u_long) fn; *--sp = (u_long) t; *--sp = 0; /* No return address */ /* Now set up saved registers for switch.s */ *--sp = (u_long) thread_begin; /* return address */ *--sp = 0; /* ebp */ *--sp = 0; /* ebx */ *--sp = 0; /* esi */ *--sp = 0; /* edi */ } t->t_md.md_esp = (mdreg_t) sp; Swich will call Ø Ö Ò Ò Ö µ p.18/23

Implementing synchronization Can cheat on uniprocessor - Set do not interrupt bit - If timer interrupt arrives, set interrupted bit - Manipulate mutex data structure - Clear DNI bit - If interrupted bit set, yield Note: Only works if one kernel thread for all user threads p.19/23

For MP, need hardware support Need atomic read-write or read-modify-write: ÒØ Ø Ø Ò Ø ÒØ ÐÓ Ôµ Example: - ÐÓ Ô ½ Sets and returns old value Now can implement spinlocks: #define lock(lockp) while (test_and_set (lockp)) #define unlock(lockp) *lockp = 0 When more threads than processors, don t just spin - Wastes CPU when other runnable work exists Especially if thread holding lock doesn t have a CPU But gratuitous context switch has cost - Good plan: spin for a bit, then yield p.20/23

Synchronization on i386 Ü instruction, exchanges reg with mem _test_and_set: movl movl xchg ret 8(%esp), %edx $1, %eax %eax, (%edx) CPU locks memory system around read and write - Ü Ð I.e., always acts like it ÐÓ has prefix - Prevents other uses of the bus (e.g., DMA) Operates at memory bus speed, not CPU speed - Much slower than cached read/buffered write p.21/23

Synchronization on alpha Ð Ð Ð load locked ØÐ store conditional _test_and_set: ldq_l v0, 0(a0) bne v0, 1f addq zero, 1, v0 stq_c v0, 0(a0) beq v0, _test_and_set mb addq zero, zero, v0 1: ret zero, (ra), 1 p.22/23

Implementing kernel level threads Plan9 gave a good example of how to do this Start with process abstraction in kernel Strip out unnecessary features - Same address space - Same file table - Ö ÓÖ (Plan9 s actually allowed individual control) Faster than a process, but still very heavy weight p.23/23