Engineering Robust Server Software
- Melinda Briggs
1 Engineering Robust Server Software Scalability
2 Lock Free Data Structures
Atomic operations work great when they do what you need, e.g., incrementing an int. What about more complicated things? E.g., there is no hardware support for atomically adding to a BST.
Lock free data structures are good when there are few write conflicts, and are generally based on atomic CAS. Freeing memory makes things hard; it is much easier if we don't need to free things (GCed languages).
3 Lock Free LinkedList

class LinkedList {
  class Node {
  public:
    const int data;
    std::atomic<Node*> next;
    Node(int d) : data(d), next(nullptr) {}
    Node(int d, Node * n) : data(d), next(n) {}
    ~Node() {}
  };
  std::atomic<Node*> head;
  ...
};
4 Lock Free LinkedList
head -> [data: 4] -> [data: 8] -> [data: 12]

5 Lock Free LinkedList
addfront(3)
head -> [data: 4] -> [data: 8] -> [data: 12], new node [data: 3]
6 Lock Free LinkedList
head -> [data: 4] -> [data: 8] -> [data: 12]

void addfront(int x) {
  Node * next = head.load(std::memory_order_???);
  Node * temp = new Node(x, next);
  while (!head.compare_exchange_weak(next, temp, std::memory_order_???)) {
    temp->next.store(next, std::memory_order_???);
  }
}

7 Lock Free LinkedList
(Same code.) Local next is loaded from head: next -> [data: 4] -> [data: 8] -> [data: 12]

8 Lock Free LinkedList
(Same code.) New node is created: temp -> [data: 3] -> [data: 4] -> [data: 8] -> [data: 12]

9 Lock Free LinkedList
Another thread just did a racing update! head -> [data: 42] -> [data: 4] -> [data: 8] -> [data: 12], while temp -> [data: 3] still points at [data: 4].

10 Lock Free LinkedList
head no longer equals next, so the CAS fails.

11 Lock Free LinkedList
The loop body runs: temp->next.store(next) relinks temp -> [data: 42] -> [data: 4] -> [data: 8] -> [data: 12].

12 Lock Free LinkedList
Suppose no other racing writes, so the CAS succeeds.

13 Lock Free LinkedList
After the successful CAS: head -> [data: 3] -> [data: 42] -> [data: 4] -> [data: 8] -> [data: 12]

14 Lock Free LinkedList
head -> [data: 3] -> [data: 42] -> [data: 4] -> [data: 8] -> [data: 12]

15 Lock Free LinkedList
So what memory order do these ???s need?
16 Lock Free LinkedList
For the CASes, what can you say about happens-before?
Thread 0: temp = allocate memory; store temp.data = x; store temp.next = next; CAS.
Thread 1: CAS.
What memory ordering does that sound like?
17 Acquire/Release Semantics
What we want is acquire/release semantics: load = acquire, store = release.
store A; store B; store C; store D (release) ... load D (acquire)
When a load (acquire) receives the value from a store (release), all prior stores in the releasing thread become visible side effects in the acquiring thread (only).
Effectively: this establishes happens-before for all those stores.
18 Lock Free LinkedList

void addfront(int x) {
  Node * next = head.load(std::memory_order_???);
  Node * temp = new Node(x, next);
  while (!head.compare_exchange_weak(next, temp, std::memory_order_acq_rel)) {
    temp->next.store(next, std::memory_order_???);
  }
}

19 Lock Free LinkedList
What about this store (temp->next.store)? What ordering do we need for it?

20 Lock Free LinkedList
What about this store? What ordering do we need for it?
- temp is still private to this thread
- the next thing we are going to do is the CAS w/ acq_rel [which will ensure this update is visible to any thread reading the new value]
So we can use relaxed: temp->next.store(next, std::memory_order_relaxed);

21 Lock Free LinkedList
What about this load (head.load)? What ordering do we need for it?

22 Lock Free LinkedList
What about this load? What ordering do we need for it?
- We don't care about its relationship to other values until after the CAS
- The CAS will perform the acquire [and fail if head has changed]
So we can use relaxed again:

void addfront(int x) {
  Node * next = head.load(std::memory_order_relaxed);
  Node * temp = new Node(x, next);
  while (!head.compare_exchange_weak(next, temp, std::memory_order_acq_rel)) {
    temp->next.store(next, std::memory_order_relaxed);
  }
}
23 Lock Free LinkedList: addsorted

void addsorted(int x) {
  std::atomic<Node*> * ptr = &head;
  Node * newnode = new Node(x);
  Node * nextnode;
  do {
    while (true) {
      nextnode = ptr->load(std::memory_order_acquire);
      if (nextnode == nullptr || nextnode->data > x) {
        break;
      }
      ptr = &nextnode->next;
    }
    newnode->next.store(nextnode, std::memory_order_relaxed);
  } while (!ptr->compare_exchange_weak(nextnode, newnode, std::memory_order_acq_rel));
}

24 Lock Free LinkedList: addsorted
(Same code.) Why acquire on the load?
25 Why Not Just SC?
What if we make it all SC?

void addsorted(int x) {
  std::atomic<Node*> * ptr = &head;
  Node * newnode = new Node(x);
  Node * nextnode;
  do {
    while (true) {
      nextnode = ptr->load(std::memory_order_seq_cst);
      if (nextnode == nullptr || nextnode->data > x) {
        break;
      }
      ptr = &nextnode->next;
    }
    newnode->next.store(nextnode, std::memory_order_seq_cst);
  } while (!ptr->compare_exchange_weak(nextnode, newnode, std::memory_order_seq_cst));
}
26 Why Not Just Do SC?
If we use SC, it will be right. (Too weak a memory ordering -> bugs on some hardware.)
Cost of SC?

27 Why Not Just Do SC?
Cost of SC? Slower. How much slower?

28 Why Not Just Do SC?
x86: not too much (x86 already has very strong memory consistency). Power8: 2x to 4x (depending on data size).
29 Lock Free LinkedList: removefront

bool removefront(int & outdata) {
  Node * temp = head.load(std::memory_order_acquire);
  if (temp == nullptr) {
    return false;
  }
  Node * next = temp->next.load(std::memory_order_relaxed);
  while (!head.compare_exchange_weak(temp, next, std::memory_order_acq_rel)) {
    if (temp == nullptr) {
      return false;
    }
    next = temp->next.load(std::memory_order_relaxed);
  }
  if (temp != nullptr) {
    outdata = temp->data;
    return true;
  }
  return false;
}

What is wrong with this code?

30 Lock Free LinkedList: removefront
(Same code.) We leak memory at the successful return: temp is the last reference to the node... or is it?
31 Difficulty with freeing memory
List: [data: 4] -> [data: 8] -> [data: 12]
Thread 0 does temp = head.load (Thread 0's temp -> [data: 4]).

32 Difficulty with freeing memory
Thread 0 does next = temp->next.load (Thread 0's next -> [data: 8]).

33 Difficulty with freeing memory
Thread 1 does temp = head.load (Thread 1's temp -> [data: 4] as well).

34 Difficulty with freeing memory
Thread 0 does a (successful) CAS on head (head -> [data: 8]).

35 Difficulty with freeing memory
Note: if Thread 0 does not delete temp, Thread 1 will get next, then fail its CAS, then correctly re-get the head.

36 Difficulty with freeing memory
...but if Thread 0 deletes temp, the node [data: 4] is gone...

37 Difficulty with freeing memory
...and Thread 1 has temp dangling when it does temp->next.load.

38 Difficulty with freeing memory
Instead of freeing, we could move the node to another list of trash [note that Thread 1's CAS will still fail].
39 Difficulty Cleaning Up Memory
How can we clean up our trash list? Delete "later"? How long is later? We must ensure no other thread still references the node.
Could we just recycle the nodes, i.e., allocate nodes out of the trash for this same list? ONLY if we can ensure no other threads still reference them. Why? Otherwise a thread might have an old pointer to address X; we reallocate a node at X; its CAS succeeds (X == X) even though we've updated the list.
40 Freeing Data in LF DS
Option 1: count threads operating in the DS. If count == 1, only this thread -> free nodes. Difficulty: may never have only 1 thread in the DS.
Option 2: require all threads to finish one operation after an update. Stale data can only be seen by an operation concurrent with the update. Difficulty: a thread may not be doing any operations.
Option 3: track the set of threads operating in the DS. Difficulty: complicated.
Option 4: highly scalable R/W locks. R = normal operations, W = exclusive operations [freeing]. Difficulty: need high read scalability.
41 Freeing Data in Lock Free DSes
GC makes it easy: stop the world, consider root sets from all threads. So much simpler in Java.
42 More Complex Deletions?
What about deleting from an arbitrary position? Delete a specific value, index, ...

43 More Complex Deletions?
Thread 0: todel = 4; Thread 1: todel = 8; each has a curr pointer.
[data: 1] -> [data: 4] -> [data: 8] -> [data: 12]
Turns out to be more complex..

44 More Complex Deletions?
What would thread 0 do by itself?

45 More Complex Deletions?
What would thread 0 do by itself? Atomic CAS on 1's next field.

46 More Complex Deletions?
What would thread 1 do by itself?

47 More Complex Deletions?
What would thread 1 do by itself? Atomic CAS on 4's next field.

48 More Complex Deletions?
What happens if we do both at the same time? Do any CASes fail?

49 More Complex Deletions?
What happens if we do both at the same time? Do any CASes fail? No: both succeed, and we have undone 8's delete (the list is now 1 -> 8 -> 12).
A similar problem arises with racing adds + deletes.
Note: our removefront only works if we only have addfront (what race is there if we also have addsorted?)
50 Lock Free BST: Add 17

51 Lock Free BST: Add 17
Option 1: CAS 24's left pointer.

52 Lock Free BST: Add 17
Option 1: CAS 24's left pointer. But what if we want to rebalance? How would deletes work? (We saw deletes are complex even for linked lists.)

53 Lock Free BST: Add 17
Option 1: CAS 24's left pointer. Rebalanced: how would we do that lock free?

54 Another Option
Option 2: Do a functional update, and only CAS the root. For any operation (add or delete).

55 Lock Free BST: Add 17
Option 2: Do functional update (build a temp tree that shares unchanged subtrees).

56 Lock Free BST: Add 17
Option 2: Do functional update. Atomic CAS root/temp.

57 Lock Free BST: Add 17
Option 2: Do functional update. Atomic CAS root/temp. Free unused nodes.

58 Lock Free BST: Add 17
Option 2: Do functional update. Atomic CAS root/temp. Free unused nodes. Done.

59 Lock Free BST: Add 17
Option 2: Do functional update. Atomic CAS root/temp. What if the CAS fails?

60 Lock Free BST: Add 17
What if the CAS fails?
- Head has changed: conflicting writes
- Discard the temp tree
- Do the add of 17 again

61 Lock Free BST: Add 17
Option 2: Free unused nodes. For nodes seen by other threads, this is just as hard as in LLs..
62 BST: State Re-creation What about all those extra nodes we make? Isn't that inefficient? How much state do we have to recreate for an N node tree? 62
63 BST: State Duplication What about all those extra nodes we make? Isn't that inefficient? How much state do we have to recreate for an N node tree? O(lg(N)) Note that a LinkedList would have O(N) state duplication Could use this idea there, just less efficient.. 63
64 BST: O(lg(N)) State Duplication
Triangles = arbitrarily large subtrees (imagine thousands or millions of nodes).

65 BST: O(lg(N)) State Duplication
Only state along the actual add path is duplicated; the remainder is shared.
66 Rebalancing?
Rebalancing also only affects the addition path. Can use the same idea: share unaffected state.
[Figure: a rotation in which only the nodes along the addition path are recreated; the large unaffected subtrees are shared.]
67 Lock Free Data Structures: Wrap Up
Upside: scalability. Downside: complexity.
Think carefully about races. Think carefully about memory ordering. Deletions are generally hardest. Freeing memory is tricky.
More informationThreads, Concurrency, and Parallelism
Threads, Concurrency, and Parallelism Lecture 24 CS2110 Spring 2017 Concurrency & Parallelism So far, our programs have been sequential: they do one thing after another, one thing at a time. Let s start
More informationSometimes incorrect Always incorrect Slower than serial Faster than serial. Sometimes incorrect Always incorrect Slower than serial Faster than serial
CS 61C Summer 2016 Discussion 10 SIMD and OpenMP SIMD PRACTICE A.SIMDize the following code: void count (int n, float *c) { for (int i= 0 ; i
More informationAgenda. The main body and cout. Fundamental data types. Declarations and definitions. Control structures
The main body and cout Agenda 1 Fundamental data types Declarations and definitions Control structures References, pass-by-value vs pass-by-references The main body and cout 2 C++ IS AN OO EXTENSION OF
More informationMASSACHUSETTS INSTITUTE OF TECHNOLOGY Computer Systems Engineering: Spring Quiz I Solutions
Department of Electrical Engineering and Computer Science MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.033 Computer Systems Engineering: Spring 2011 Quiz I Solutions There are 10 questions and 12 pages in this
More informationGarbage Collection Algorithms. Ganesh Bikshandi
Garbage Collection Algorithms Ganesh Bikshandi Announcement MP4 posted Term paper posted Introduction Garbage : discarded or useless material Collection : the act or process of collecting Garbage collection
More informationRecap: Thread. What is it? What does it need (thread private)? What for? How to implement? Independent flow of control. Stack
What is it? Recap: Thread Independent flow of control What does it need (thread private)? Stack What for? Lightweight programming construct for concurrent activities How to implement? Kernel thread vs.
More informationSummer Final Exam Review Session August 5, 2009
15-111 Summer 2 2009 Final Exam Review Session August 5, 2009 Exam Notes The exam is from 10:30 to 1:30 PM in Wean Hall 5419A. The exam will be primarily conceptual. The major emphasis is on understanding
More informationIt turns out that races can be eliminated without sacrificing much in terms of performance or expressive power.
The biggest two problems in multi-threaded programming are races and deadlocks. Races reached new levels with the introduction of relaxed memory processors. It turns out that races can be eliminated without
More informationCompiler Construction D7011E
Compiler Construction D7011E Lecture 14: Memory Management Viktor Leijon Slides largely by Johan Nordlander with material generously provided by Mark P. Jones. 1 First: Run-time Systems 2 The Final Component:
More information// Pointer to the first thing in the list
Linked Lists Dynamic variables combined with structs or classes can be linked together to form dynamic lists or other structures. We define a record (called a node) that has at least two members: next
More informationDynamic Data Structures
Dynamic Data Structures We have seen that the STL containers vector, deque, list, set and map can grow and shrink dynamically. We now examine how some of these containers can be implemented in C++. To
More informationDealing with Issues for Interprocess Communication
Dealing with Issues for Interprocess Communication Ref Section 2.3 Tanenbaum 7.1 Overview Processes frequently need to communicate with other processes. In a shell pipe the o/p of one process is passed
More informationDynamic Memory Management
Dynamic Memory Management 1 Goals of this Lecture Help you learn about: Dynamic memory management techniques Garbage collection by the run-time system (Java) Manual deallocation by the programmer (C, C++)
More informationG Programming Languages Spring 2010 Lecture 13. Robert Grimm, New York University
G22.2110-001 Programming Languages Spring 2010 Lecture 13 Robert Grimm, New York University 1 Review Last week Exceptions 2 Outline Concurrency Discussion of Final Sources for today s lecture: PLP, 12
More informationBinary Trees and Binary Search Trees
Binary Trees and Binary Search Trees Learning Goals After this unit, you should be able to... Determine if a given tree is an instance of a particular type (e.g. binary, and later heap, etc.) Describe
More informationLinked Lists: The Role of Locking. Erez Petrank Technion
Linked Lists: The Role of Locking Erez Petrank Technion Why Data Structures? Concurrent Data Structures are building blocks Used as libraries Construction principles apply broadly This Lecture Designing
More informationCS 5523 Operating Systems: Midterm II - reivew Instructor: Dr. Tongping Liu Department Computer Science The University of Texas at San Antonio
CS 5523 Operating Systems: Midterm II - reivew Instructor: Dr. Tongping Liu Department Computer Science The University of Texas at San Antonio Fall 2017 1 Outline Inter-Process Communication (20) Threads
More informationCS5460: Operating Systems
CS5460: Operating Systems Lecture 9: Implementing Synchronization (Chapter 6) Multiprocessor Memory Models Uniprocessor memory is simple Every load from a location retrieves the last value stored to that
More informationProcess Synchronization
Process Synchronization Chapter 6 2015 Prof. Amr El-Kadi Background Concurrent access to shared data may result in data inconsistency Maintaining data consistency requires mechanisms to ensure the orderly
More informationSmart Pointers. Some slides from Internet
Smart Pointers Some slides from Internet 1 Part I: Concept Reference: Using C++11 s Smart Pointers, David Kieras, EECS Department, University of Michigan C++ Primer, Stanley B. Lippman, Jesee Lajoie, Barbara
More informationCMSC 714 Lecture 14 Lamport Clocks and Eraser
Notes CMSC 714 Lecture 14 Lamport Clocks and Eraser Midterm exam on April 16 sample exam questions posted Research project questions? Alan Sussman (with thanks to Chris Ackermann) 2 Lamport Clocks Distributed
More informationModels of concurrency & synchronization algorithms
Models of concurrency & synchronization algorithms Lecture 3 of TDA383/DIT390 (Concurrent Programming) Carlo A. Furia Chalmers University of Technology University of Gothenburg SP3 2016/2017 Today s menu
More informationLOCKLESS ALGORITHMS. Lockless Algorithms. CAS based algorithms stack order linked list
Lockless Algorithms CAS based algorithms stack order linked list CS4021/4521 2017 jones@scss.tcd.ie School of Computer Science and Statistics, Trinity College Dublin 2-Jan-18 1 Obstruction, Lock and Wait
More informationCS 134: Operating Systems
CS 134: Operating Systems Locks and Low-Level Synchronization CS 134: Operating Systems Locks and Low-Level Synchronization 1 / 25 2 / 25 Overview Overview Overview Low-Level Synchronization Non-Blocking
More informationCSE 120 Principles of Operating Systems
CSE 120 Principles of Operating Systems Spring 2018 Lecture 15: Multicore Geoffrey M. Voelker Multicore Operating Systems We have generally discussed operating systems concepts independent of the number
More informationLargest Online Community of VU Students
WWW.VUPages.com http://forum.vupages.com WWW.VUTUBE.EDU.PK Largest Online Community of VU Students MIDTERM EXAMINATION SEMESTER FALL 2003 CS301-DATA STRUCTURE Total Marks:86 Duration: 60min Instructions
More informationOrder Is A Lie. Are you sure you know how your code runs?
Order Is A Lie Are you sure you know how your code runs? Order in code is not respected by Compilers Processors (out-of-order execution) SMP Cache Management Understanding execution order in a multithreaded
More informationAdvanced Java Concepts Unit 5: Trees. Notes and Exercises
Advanced Java Concepts Unit 5: Trees. Notes and Exercises A Tree is a data structure like the figure shown below. We don t usually care about unordered trees but that s where we ll start. Later we will
More informationC++ Programming Lecture 4 Software Engineering Group
C++ Programming Lecture 4 Software Engineering Group Philipp D. Schubert VKrit Date: 24.11.2017 Time: 15:45 h Your opinion is important! Please use the free text comments Contents 1. Operator overloading
More informationChapter 5. Multiprocessors and Thread-Level Parallelism
Computer Architecture A Quantitative Approach, Fifth Edition Chapter 5 Multiprocessors and Thread-Level Parallelism 1 Introduction Thread-Level parallelism Have multiple program counters Uses MIMD model
More informationCSE 153 Design of Operating Systems Fall 2018
CSE 153 Design of Operating Systems Fall 2018 Lecture 5: Threads/Synchronization Implementing threads l Kernel Level Threads l u u All thread operations are implemented in the kernel The OS schedules all
More informationChapter 18 Vectors and Arrays [and more on pointers (nmm) ] Bjarne Stroustrup
Chapter 18 Vectors and Arrays [and more on pointers (nmm) ] Bjarne Stroustrup www.stroustrup.com/programming Abstract arrays, pointers, copy semantics, elements access, references Next lecture: parameterization
More informationProgramming in C. Lecture 5: Aliasing, Graphs, and Deallocation. Dr Neel Krishnaswami. Michaelmas Term University of Cambridge
Programming in C Lecture 5: Aliasing, Graphs, and Deallocation Dr Neel Krishnaswami University of Cambridge Michaelmas Term 2017-2018 1 / 20 The C API for Dynamic Memory Allocation void *malloc(size_t
More informationCSE 332: Data Structures & Parallelism Lecture 17: Shared-Memory Concurrency & Mutual Exclusion. Ruth Anderson Winter 2019
CSE 332: Data Structures & Parallelism Lecture 17: Shared-Memory Concurrency & Mutual Exclusion Ruth Anderson Winter 2019 Toward sharing resources (memory) So far, we have been studying parallel algorithms
More information