Concurrent programming: From theory to practice. Concurrent Algorithms 2016 Tudor David

Size: px
Start display at page:

Download "Concurrent programming: From theory to practice. Concurrent Algorithms 2016 Tudor David"

Transcription

1 oncurrent programming: From theory to practice oncurrent Agorithms 2016 Tudor David

2 From theory to practice Theoretica (design) Practica (design) Practica (impementation) 2

3 From theory to practice Theoretica (design) Practica (design) Practica (impementation) Impossibiities Upper/Lower bounds Techniques System modes orrectness proofs orrectness Design (pseudo-code) 3

4 From theory to practice Theoretica (design) Practica (design) Practica (impementation) Impossibiities Upper/Lower bounds Techniques System modes orrectness proofs orrectness System modes shared memory message passing Finite memory Practicaity issues re-usabe objects Performance Design (pseudo-code) Design (pseudo-code, prototype) 4

5 From theory to practice Theoretica (design) Practica (design) Practica (impementation) Impossibiities Upper/Lower bounds Techniques System modes orrectness proofs orrectness System modes shared memory message passing Finite memory Practicaity issues re-usabe objects Performance Hardware Which atomic ops Memory consistency ache coherence Locaity Performance Scaabiity Design (pseudo-code) Design (pseudo-code, prototype) Impementation (code) 5

6 Exampe: inked ist impementations Throughput (Mop/s) # ores pessimistic "bad" inked ist "good" inked ist 6

7 Outine PU caches ache coherence Pacement of data Hardware synchronization instructions orrectness: Memory mode & compier Performance: Programming techniques 7

8 Outine PU caches ache coherence Pacement of data Hardware synchronization instructions orrectness: Memory mode & compier Performance: Programming techniques 8

9 Why do we use caching? ore ore freq: 2GHz = 0.5 ns / instr ore Disk = ~ms Disk 9

10 Why do we use caching? ore ore freq: 2GHz = 0.5 ns / instr ore Disk = ~ms ore Memory = ~100ns Memory Disk 10

11 Why do we use caching? ore ache Memory ore freq: 2GHz = 0.5 ns / instr ore Disk = ~ms ore Memory = ~100ns ache - Large = sow - Medium = medium - Sma = fast Disk 11

12 Why do we use caching? ore L3 Memory ore freq: 2GHz = 0.5 ns / instr ore Disk = ~ms ore Memory = ~100ns ache - ore L3 = ~20ns - ore = ~7ns - ore = ~1ns Disk 12

13 Typica server configurations Inte Xeon AMD Opteron GHz GHz - : 32KB - : 64KB - : 256KB - : 512KB - L3: 24MB - L3: 12MB - Memory: 256GB - Memory: 256GB 13

14 Experiment Throughput of accessing some memory, depending on the memory size 14

15 Outine PU caches ache coherence Pacement of data Hardware synchronization instructions orrectness: Memory mode & compier Performance: Programming techniques 15

16 Unti ~2004: Singe-cores ore Memory ore freq: 3+GHz ore Disk ore Memory ache - ore L3 - ore - ore Disk 16

17 After ~2004: Muti-cores ore 0 ore 1 ore freq: ~2GHz ore Disk ore Memory ache L3 - ore shared L3 - ore Memory - ore Disk 17

18 Muti-cores with private caches ore 0 ore 1 Private = mutipe copies L3 Memory Disk 18

19 ache coherence for consistency ore 0 X ore 1 ore 0 has X and ore 1 - wants to write on X - wants to read X - did ore 0 write or read X? L3 Memory Disk 19

20 ache-coherence principes ore 0 ore 1 To perform a write X - invaidate a readers, or - previous writer To perform a read - find the atest copy L3 Memory Disk 20

21 ache coherence with MESI A state diagram State (per cache ine) - Modified: the ony dirty copy - Excusive: the ony cean copy - Shared: a cean copy - Invaid: useess data 21

22 The utimate goa for scaabiity Possibe states - Modified: the ony dirty copy - Excusive: the ony cean copy - Shared: a cean copy - Invaid: useess data Which state is our favorite? 22

23 The utimate goa for scaabiity Possibe states - Modified: the ony dirty copy - Excusive: the ony cean copy -Shared: a cean copy - Invaid: useess data = threads can keep the data cose ( cache) = faster 23

24 Experiment The effects of fase sharing 24

25 Outine PU caches ache coherence Pacement of data Hardware synchronization instructions orrectness: Memory mode & compier Performance: Programming techniques 25

26 Uniformity vs. non-uniformity Typica desktop machine Memory aches = Uniform Typica server machine Memory aches aches Memory = non-uniform (NUMA) 26

27 Latency (ns) to access data Memory Memory L3 L3 27

28 Latency (ns) to access data Memory 1 Memory L3 L3 28

29 Latency (ns) to access data Memory 1 Memory 7 L3 L3 29

30 Latency (ns) to access data Memory 1 Memory 7 20 L3 L3 30

31 Latency (ns) to access data Memory 1 40 Memory 7 20 L3 L3 31

32 Latency (ns) to access data Memory Memory 7 20 L3 L3 32

33 Latency (ns) to access data Memory Memory 7 20 L3 L3 33

34 Latency (ns) to access data Memory Memory 20 L3 L3 34

35 Latency (ns) to access data Memory Memory 20 L3 oncusion: we need to take care of ocaity L3 35

36 Experiment The effects of ocaity 36

37 Outine PU caches ache coherence Pacement of data Hardware synchronization instructions orrectness: Memory mode & compier Performance: Programming techniques 37

38 The Programmer s Toobox: Hardware synchronization instructions Depends on the processor AS generay provided J TAS and atomic increment not aways provided x86 processors (Inte, AMD): Atomic exchange, increment, decrement provided Memory barrier aso avaiabe Inte as of 2014 provides transactiona memory 38

39 Exampe: Atomic ops in G type sync_fetch_and_op(type *ptr, type vaue); type sync_op_and_fetch(type *ptr, type vaue); // OP in {add,sub,or,and,xor,nand} type sync_va_compare_and_swap(type *ptr, type odva, type newva); boo sync_boo_compare_and_swap(type *ptr, type odva, type newva); sync_synchronize(); // memory barrier 39

40 Inte s transactiona synchronization extensions (TSX) 1. Hardware ock eision (HLE) Instruction prefixes: XAQUIRE XRELEASE Exampe (G): he_{acquire,reease}_compare_exchange_n{1,2,4,8} Try to execute critica sections without acquiring/reeasing the ock If confict detected, abort and acquire the ock before re-doing the work 40

41 Inte s transactiona synchronization extensions (TSX) 2. Restricted Transactiona Memory (RTM) _xbegin(); _xabort(); _xtest(); _xend(); Limitations: Not starvation free Transactions can be aborted various reasons Shoud have a non-transactiona back-up Limited transaction size 41

42 Inte s transactiona synchronization extensions (TSX) 2. Restricted Transactiona Memory (RTM) Exampe: if (_xbegin() == _XBEGIN_STARTED){ counter = counter + 1; _xend(); } ese { sync_fetch_and_add(&counter,1); } 42

43 Outine PU caches ache coherence Pacement of data Hardware synchronization instructions orrectness: Memory mode & compier Performance: Programming techniques 43

44 oncurrent agorithm correctness Designing correct concurrent agorithms: 1. Theoretica part 2. Practica part à invoves impementation The processor and the compier optimize assuming no concurrency! L 44

45 The memory consistency mode //A, B shared variabes, initiay 0; //r1, r2 oca variabes; P1 P2 A = 1; B = 1; r1 = B; r2 = A; What vaues can r1 and r2 take? (assume x86 processor) Answer: (0,1), (1,0), (1,1) and (0,0) 45

46 The memory consistency mode à The order in which memory instructions appear to execute What woud the programmer ike to see? Sequentia consistency A operations executed in some sequentia order; Memory operations of each thread in program order; Intuitive, but imits performance; 46

47 The memory consistency mode How can the processor reorder instructions to different memory addresses? x86 (Inte, AMD): TSO variant Reads not reordered w.r.t. reads Writes not reordered w.r.t writes Writes not reordered w.r.t. reads Reads may be reordered w.r.t. writes to different memory addresses //A,B, //gobas int x,y,z; x = A; y = B; B = 3; A = 2; y = A; = 4; z = B; 47

48 The memory consistency mode Singe thread reorderings transparent; Avoid reorderings: memory barriers x86 impicit in atomic ops; voatie in Java; Expensive - use ony when reay necessary; Different processors different memory modes e.g., ARM reaxed memory mode (anything goes!); VMs (e.g. JVM, LR) have their own memory modes; 48

49 Beware of the compier void ock(int * some_ock) { whie (AS(some_ock,0,1)!= 0) {} asm voatie( ::: memory ); //compier barrier } void unock(int * some_ock) { asm voatie( ::: memory ); //compier barrier *some_ock = 0; } voatie int the_ock=0; ock(&the_ock); unock(&the_ock); voatie!= Java voatie The compier can: reorder instructions remove instructions not write vaues to memory 49

50 Outine PU caches ache coherence Pacement of data Hardware synchronization instructions orrectness: Memory mode & compier Performance: Programming techniques 50

51 oncurrent Programming Techniques What techniques can we use to speed up our concurrent appication? Main idea: Minimize contention on cache ines Use case: Locks acquire() = ock() reease() = unock() 51

52 TAS The simpest ock Test-and-Set Lock typedef voatie uint ock_t; void acquire(ock_t * some_ock) { whie (TAS(some_ock)!= 0) {} asm voatie( ::: memory ); } void reease(ock_t * some_ock) { asm voatie( ::: memory ); *some_ock = 0; } 52

53 How good is this ock? A simpe benchmark Have 48 threads continuousy acquire a ock, update some shared data, and unock Measure how many operations we can do in a second Test-and-Set ock: 190K operations/second 53

54 How can we improve things? Avoid cache-ine ping-pong: Test-and-Test-and-Set Lock void acquire(ock_t * some_ock) { whie(1) { whie (*some_ock!= 0) {} if (TAS(some_ock) == 0) { return; } } asm voatie( ::: memory ); } void reease(ock_t * some_ock) { asm voatie( ::: memory ); *some_ock = 0; } 54

55 Performance comparison Ops/second (thousands) Test-and-Set Test-and-Test-and-Set 55

56 But we can do even better Avoid thundering herd: Test-and-Test-and-Set with Back-off void acquire(ock_t * some_ock) { uint backoff = INITIAL_BAKOFF; whie(1) { whie (*some_ock!= 0) {} if (TAS(some_ock) == 0) { return; } ese { ock_seep(backoff); backoff=min(backoff*2,maximum_bakoff); } } asm voatie( ::: memory ); } void reease(ock_t * some_ock) { asm voatie( ::: memory ); *some_ock = 0; } 56

57 Performance comparison Ops/second (thousands) Test-and-Set Test-and-Test-and-Set Test-and-Test-and-Set w. backoff 57

58 Are these ocks fair? Processed requests per thread, Test-and-Set ock Number of processed requests Thread number 58

59 What if we want fairness? Use a FIFO mechanism: Ticket Locks typedef ticket_ock_t { voatie uint head; voatie uint tai; } ticket_ock_t; void acquire(ticket_ock_t * a_ock) { uint my_ticket = fetch_and_inc(&(a_ock->tai)); whie (a_ock->head!= my_ticket) {} asm voatie( ::: memory ); } void reease(ticket_ock_t * a_ock) { asm voatie( ::: memory ); a_ock->head++; } 59

60 What if we want fairness? Processed requests per thread, Ticket Locks Number of processed requests Thread number 60

61 Performance comparison Ops/second (thousands)

62 an we back-off here as we? Yes, we can: Proportiona back-off void acquire(ticket_ock_t * a_ock) { uint my_ticket = fetch_and_inc(&(a_ock->tai)); uint distance, current_ticket; whie (1) { current_ticket = a_ock->head; if (current_ticket == my_ticket) break; distance = my_ticket current_ticket; if (distance > 1) ock_seep(distance * BASE_SLEEP); } asm voatie( ::: memory ); } void reease(ticket_ock_t * a_ock) { asm voatie( ::: memory ); a_ock->head++; } 62

63 Performance comparison Ops/second (thousands)

64 Sti, everyone is spinning on the same variabe. Use a different address for each thread: Queue Locks 1 run eaving 2 run spin 3 spin 4 arriving spin Use with care: 1. storage overheads 2. compexity 64

65 Performance comparison Ops/second (thousands)

66 To summarize on ocks 1. Reading before trying to write 2. Pausing when it s not our turn 3. Ensuring fairness (does not aways bring ++) 4. Accessing disjoint addresses (cache ines) More than 10x performance gain! 66

67 oncusion oncurrent agorithm design Theoretica design Practica design (may be just as important) Impementation You need to know your hardware For correctness For performance 67

Concurrent programming: From theory to practice. Concurrent Algorithms 2015 Vasileios Trigonakis

Concurrent programming: From theory to practice. Concurrent Algorithms 2015 Vasileios Trigonakis oncurrent programming: From theory to practice oncurrent Algorithms 2015 Vasileios Trigonakis From theory to practice Theoretical (design) Practical (design) Practical (implementation) 2 From theory to

More information

CSE120 Principles of Operating Systems. Prof Yuanyuan (YY) Zhou Synchronization: Semaphore

CSE120 Principles of Operating Systems. Prof Yuanyuan (YY) Zhou Synchronization: Semaphore CSE120 Principes of Operating Systems Prof Yuanyuan (YY) Zhou Synchronization: Synchronization Needs Two synchronization needs Mutua excusion Whenever mutipe threads access a shared data, you need to worry

More information

CS5460/6460: Operating Systems. Lecture 14: Scalability techniques. Anton Burtsev February, 2014

CS5460/6460: Operating Systems. Lecture 14: Scalability techniques. Anton Burtsev February, 2014 CS5460/6460: Operating Systems Lecture 14: Scalability techniques Anton Burtsev February, 2014 Recap: read and write barriers void foo(void) { a = 1; smp_wmb(); b = 1; } void bar(void) { while (b == 0)

More information

CSE120 Principles of Operating Systems. Prof Yuanyuan (YY) Zhou Scheduling

CSE120 Principles of Operating Systems. Prof Yuanyuan (YY) Zhou Scheduling CSE120 Principes of Operating Systems Prof Yuanyuan (YY) Zhou Scheduing Announcement Homework 2 due on October 25th Project 1 due on October 26th 2 CSE 120 Scheduing and Deadock Scheduing Overview In discussing

More information

CSE120 Principles of Operating Systems. Prof Yuanyuan (YY) Zhou Midterm Review

CSE120 Principles of Operating Systems. Prof Yuanyuan (YY) Zhou Midterm Review CSE120 Principes of Operating Systems Prof Yuanyuan (YY) Zhou Midterm Review Overview The midterm Architectura support for OSes OS modues, interfaces, and structures Processes Threads Synchronization Scheduing

More information

Hardware Transactional Memory on Haswell

Hardware Transactional Memory on Haswell Hardware Transactional Memory on Haswell Viktor Leis Technische Universität München 1 / 15 Introduction transactional memory is a very elegant programming model transaction { transaction { a = a 10; c

More information

CSE120 Principles of Operating Systems. Prof Yuanyuan (YY) Zhou Lecture 4: Threads

CSE120 Principles of Operating Systems. Prof Yuanyuan (YY) Zhou Lecture 4: Threads CSE120 Principes of Operating Systems Prof Yuanyuan (YY) Zhou Lecture 4: Threads Announcement Project 0 Due Project 1 out Homework 1 due on Thursday Submit it to Gradescope onine 2 Processes Reca that

More information

The need for atomicity This code sequence illustrates the need for atomicity. Explain.

The need for atomicity This code sequence illustrates the need for atomicity. Explain. Lock Implementations [ 8.1] Recall the three kinds of synchronization from Lecture 6: Point-to-point Lock Performance metrics for lock implementations Uncontended latency Traffic o Time to acquire a lock

More information

COS 318: Operating Systems. Virtual Memory Design Issues: Paging and Caching. Jaswinder Pal Singh Computer Science Department Princeton University

COS 318: Operating Systems. Virtual Memory Design Issues: Paging and Caching. Jaswinder Pal Singh Computer Science Department Princeton University COS 318: Operating Systems Virtua Memory Design Issues: Paging and Caching Jaswinder Pa Singh Computer Science Department Princeton University (http://www.cs.princeton.edu/courses/cos318/) Virtua Memory:

More information

CSL373: Lecture 5 Deadlocks (no process runnable) + Scheduling (> 1 process runnable)

CSL373: Lecture 5 Deadlocks (no process runnable) + Scheduling (> 1 process runnable) CSL373: Lecture 5 Deadlocks (no process runnable) + Scheduling (> 1 process runnable) Past & Present Have looked at two constraints: Mutual exclusion constraint between two events is a requirement that

More information

l Tree: set of nodes and directed edges l Parent: source node of directed edge l Child: terminal node of directed edge

l Tree: set of nodes and directed edges l Parent: source node of directed edge l Child: terminal node of directed edge Trees & Heaps Week 12 Gaddis: 20 Weiss: 21.1-3 CS 5301 Fa 2016 Ji Seaman 1 Tree: non-recursive definition Tree: set of nodes and directed edges - root: one node is distinguished as the root - Every node

More information

Understanding Hardware Transactional Memory

Understanding Hardware Transactional Memory Understanding Hardware Transactional Memory Gil Tene, CTO & co-founder, Azul Systems @giltene 2015 Azul Systems, Inc. Agenda Brief introduction What is Hardware Transactional Memory (HTM)? Cache coherence

More information

Intro to Programming & C Why Program? 1.2 Computer Systems: Hardware and Software. Why Learn to Program?

Intro to Programming & C Why Program? 1.2 Computer Systems: Hardware and Software. Why Learn to Program? Intro to Programming & C++ Unit 1 Sections 1.1-3 and 2.1-10, 2.12-13, 2.15-17 CS 1428 Spring 2018 Ji Seaman 1.1 Why Program? Computer programmabe machine designed to foow instructions Program a set of

More information

Lecture outline Graphics and Interaction Scan Converting Polygons and Lines. Inside or outside a polygon? Scan conversion.

Lecture outline Graphics and Interaction Scan Converting Polygons and Lines. Inside or outside a polygon? Scan conversion. Lecture outine 433-324 Graphics and Interaction Scan Converting Poygons and Lines Department of Computer Science and Software Engineering The Introduction Scan conversion Scan-ine agorithm Edge coherence

More information

Intro to Programming & C Why Program? 1.2 Computer Systems: Hardware and Software. Hardware Components Illustrated

Intro to Programming & C Why Program? 1.2 Computer Systems: Hardware and Software. Hardware Components Illustrated Intro to Programming & C++ Unit 1 Sections 1.1-3 and 2.1-10, 2.12-13, 2.15-17 CS 1428 Fa 2017 Ji Seaman 1.1 Why Program? Computer programmabe machine designed to foow instructions Program instructions

More information

Intel Transactional Synchronization Extensions (Intel TSX) Linux update. Andi Kleen Intel OTC. Linux Plumbers Sep 2013

Intel Transactional Synchronization Extensions (Intel TSX) Linux update. Andi Kleen Intel OTC. Linux Plumbers Sep 2013 Intel Transactional Synchronization Extensions (Intel TSX) Linux update Andi Kleen Intel OTC Linux Plumbers Sep 2013 Elision Elision : the act or an instance of omitting something : omission On blocking

More information

Scalable Locking. Adam Belay

Scalable Locking. Adam Belay Scalable Locking Adam Belay Problem: Locks can ruin performance 12 finds/sec 9 6 Locking overhead dominates 3 0 0 6 12 18 24 30 36 42 48 Cores Problem: Locks can ruin performance the locks

More information

Functions. 6.1 Modular Programming. 6.2 Defining and Calling Functions. Gaddis: 6.1-5,7-10,13,15-16 and 7.7

Functions. 6.1 Modular Programming. 6.2 Defining and Calling Functions. Gaddis: 6.1-5,7-10,13,15-16 and 7.7 Functions Unit 6 Gaddis: 6.1-5,7-10,13,15-16 and 7.7 CS 1428 Spring 2018 Ji Seaman 6.1 Moduar Programming Moduar programming: breaking a program up into smaer, manageabe components (modues) Function: a

More information

A Petrel Plugin for Surface Modeling

A Petrel Plugin for Surface Modeling A Petre Pugin for Surface Modeing R. M. Hassanpour, S. H. Derakhshan and C. V. Deutsch Structure and thickness uncertainty are important components of any uncertainty study. The exact ocations of the geoogica

More information

No connection establishment Do not perform Flow control Error control Retransmission Suitable for small request/response scenario E.g.

No connection establishment Do not perform Flow control Error control Retransmission Suitable for small request/response scenario E.g. UDP & TCP 2018/3/26 UDP Header Characteristics of UDP No connection estabishment Do not perform Fow contro Error contro Retransmission Suitabe for sma request/response scenario E.g., DNS Remote Procedure

More information

CS5460: Operating Systems

CS5460: Operating Systems CS5460: Operating Systems Lecture 9: Implementing Synchronization (Chapter 6) Multiprocessor Memory Models Uniprocessor memory is simple Every load from a location retrieves the last value stored to that

More information

Lecture Notes for Chapter 4 Part III. Introduction to Data Mining

Lecture Notes for Chapter 4 Part III. Introduction to Data Mining Data Mining Cassification: Basic Concepts, Decision Trees, and Mode Evauation Lecture Notes for Chapter 4 Part III Introduction to Data Mining by Tan, Steinbach, Kumar Adapted by Qiang Yang (2010) Tan,Steinbach,

More information

A METHOD FOR GRIDLESS ROUTING OF PRINTED CIRCUIT BOARDS. A. C. Finch, K. J. Mackenzie, G. J. Balsdon, G. Symonds

A METHOD FOR GRIDLESS ROUTING OF PRINTED CIRCUIT BOARDS. A. C. Finch, K. J. Mackenzie, G. J. Balsdon, G. Symonds A METHOD FOR GRIDLESS ROUTING OF PRINTED CIRCUIT BOARDS A C Finch K J Mackenzie G J Basdon G Symonds Raca-Redac Ltd Newtown Tewkesbury Gos Engand ABSTRACT The introduction of fine-ine technoogies to printed

More information

Chapter 5: Transactions in Federated Databases

Chapter 5: Transactions in Federated Databases Federated Databases Chapter 5: in Federated Databases Saes R&D Human Resources Kemens Böhm Distributed Data Management: in Federated Databases 1 Kemens Böhm Distributed Data Management: in Federated Databases

More information

Shape Analysis with Structural Invariant Checkers

Shape Analysis with Structural Invariant Checkers Shape Anaysis with Structura Invariant Checkers Bor-Yuh Evan Chang Xavier Riva George C. Necua University of Caifornia, Berkeey SAS 2007 Exampe: Typestate with shape anaysis Concrete Exampe Abstraction

More information

Agent architectures. Francesco Amigoni

Agent architectures. Francesco Amigoni Francesco Amigoni Designing inteigent agents An agent is defined by its agent function f() that maps a sequence of perceptions to an action p(0), p(1),..., p(t) f() a(t) AGENT perceptions actions ENVIRONMENT

More information

Advance Operating Systems (CS202) Locks Discussion

Advance Operating Systems (CS202) Locks Discussion Advance Operating Systems (CS202) Locks Discussion Threads Locks Spin Locks Array-based Locks MCS Locks Sequential Locks Road Map Threads Global variables and static objects are shared Stored in the static

More information

Transactional Memory. How to do multiple things at once. Benjamin Engel Transactional Memory 1 / 28

Transactional Memory. How to do multiple things at once. Benjamin Engel Transactional Memory 1 / 28 Transactional Memory or How to do multiple things at once Benjamin Engel Transactional Memory 1 / 28 Transactional Memory: Architectural Support for Lock-Free Data Structures M. Herlihy, J. Eliot, and

More information

index.pdf March 17,

index.pdf March 17, index.pdf March 17, 2013 1 ITI 1121. Introduction to omputing II Marce Turcotte Schoo of Eectrica Engineering and omputer Science Linked List (Part 2) Tai pointer ouby inked ist ummy node Version of March

More information

Nearest Neighbor Learning

Nearest Neighbor Learning Nearest Neighbor Learning Cassify based on oca simiarity Ranges from simpe nearest neighbor to case-based and anaogica reasoning Use oca information near the current query instance to decide the cassification

More information

ECE 172 Digital Systems. Chapter 5 Uniprocessor Data Cache. Herbert G. Mayer, PSU Status 6/10/2018

ECE 172 Digital Systems. Chapter 5 Uniprocessor Data Cache. Herbert G. Mayer, PSU Status 6/10/2018 ECE 172 Digita Systems Chapter 5 Uniprocessor Data Cache Herbert G. Mayer, PSU Status 6/10/2018 1 Syabus UP Caches Cache Design Parameters Effective Time t eff Cache Performance Parameters Repacement Poicies

More information

Synchronization. Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University

Synchronization. Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University Synchronization Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Types of Synchronization Mutual Exclusion Locks Event Synchronization Global or group-based

More information

Synchronization. Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University

Synchronization. Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University Synchronization Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu SSE3054: Multicore Systems, Spring 2017, Jinkyu Jeong (jinkyu@skku.edu) Types of Synchronization

More information

CSE 120 Principles of Operating Systems

CSE 120 Principles of Operating Systems CSE 120 Principles of Operating Systems Spring 2018 Lecture 15: Multicore Geoffrey M. Voelker Multicore Operating Systems We have generally discussed operating systems concepts independent of the number

More information

RDF Objects 1. Alex Barnell Information Infrastructure Laboratory HP Laboratories Bristol HPL November 27 th, 2002*

RDF Objects 1. Alex Barnell Information Infrastructure Laboratory HP Laboratories Bristol HPL November 27 th, 2002* RDF Objects 1 Aex Barne Information Infrastructure Laboratory HP Laboratories Bristo HPL-2002-315 November 27 th, 2002* E-mai: Andy_Seaborne@hp.hp.com RDF, semantic web, ontoogy, object-oriented datastructures

More information

A Local Optimal Method on DSA Guiding Template Assignment with Redundant/Dummy Via Insertion

A Local Optimal Method on DSA Guiding Template Assignment with Redundant/Dummy Via Insertion A Loca Optima Method on DSA Guiding Tempate Assignment with Redundant/Dummy Via Insertion Xingquan Li 1, Bei Yu 2, Jiani Chen 1, Wenxing Zhu 1, 24th Asia and South Pacific Design T h e p i c Automation

More information

Introduction to OpenMP

Introduction to OpenMP MPSoC Architectures OpenMP Aberto Bosio, Associate Professor UM Microeectronic Departement bosio@irmm.fr Introduction to OpenMP What is OpenMP? Open specification for Muti-Processing Standard API for defining

More information

Multiprocessors II: CC-NUMA DSM. CC-NUMA for Large Systems

Multiprocessors II: CC-NUMA DSM. CC-NUMA for Large Systems Multiprocessors II: CC-NUMA DSM DSM cache coherence the hardware stuff Today s topics: what happens when we lose snooping new issues: global vs. local cache line state enter the directory issues of increasing

More information

Computer Networks. College of Computing. Copyleft 2003~2018

Computer Networks. College of Computing.   Copyleft 2003~2018 Computer Networks Computer Networks Prof. Lin Weiguo Coege of Computing Copyeft 2003~2018 inwei@cuc.edu.cn http://icourse.cuc.edu.cn/computernetworks/ http://tc.cuc.edu.cn Attention The materias beow are

More information

RWLocks in Erlang/OTP

RWLocks in Erlang/OTP RWLocks in Erlang/OTP Rickard Green rickard@erlang.org What is an RWLock? Read/Write Lock Write locked Exclusive access for one thread Read locked Multiple reader threads No writer threads RWLocks Used

More information

A Concurrent Skip List Implementation with RTM and HLE

A Concurrent Skip List Implementation with RTM and HLE A Concurrent Skip List Implementation with RTM and HLE Fan Gao May 14, 2014 1 Background Semester Performed: Spring, 2014 Instructor: Maurice Herlihy The main idea of my project is to implement a skip

More information

Lecture 19: Synchronization. CMU : Parallel Computer Architecture and Programming (Spring 2012)

Lecture 19: Synchronization. CMU : Parallel Computer Architecture and Programming (Spring 2012) Lecture 19: Synchronization CMU 15-418: Parallel Computer Architecture and Programming (Spring 2012) Announcements Assignment 4 due tonight at 11:59 PM Synchronization primitives (that we have or will

More information

Invyswell: A HyTM for Haswell RTM. Irina Calciu, Justin Gottschlich, Tatiana Shpeisman, Gilles Pokam, Maurice Herlihy

Invyswell: A HyTM for Haswell RTM. Irina Calciu, Justin Gottschlich, Tatiana Shpeisman, Gilles Pokam, Maurice Herlihy Invyswell: A HyTM for Haswell RTM Irina Calciu, Justin Gottschlich, Tatiana Shpeisman, Gilles Pokam, Maurice Herlihy Multicore Performance Scaling u Problem: Locking u Solution: HTM? u IBM BG/Q, zec12,

More information

MultiJav: A Distributed Shared Memory System Based on Multiple Java Virtual Machines. MultiJav: Introduction

MultiJav: A Distributed Shared Memory System Based on Multiple Java Virtual Machines. MultiJav: Introduction : A Distributed Shared Memory System Based on Multiple Java Virtual Machines X. Chen and V.H. Allan Computer Science Department, Utah State University 1998 : Introduction Built on concurrency supported

More information

Lecture 19: Coherence and Synchronization. Topics: synchronization primitives (Sections )

Lecture 19: Coherence and Synchronization. Topics: synchronization primitives (Sections ) Lecture 19: Coherence and Synchronization Topics: synchronization primitives (Sections 5.4-5.5) 1 Caching Locks Spin lock: to acquire a lock, a process may enter an infinite loop that keeps attempting

More information

250P: Computer Systems Architecture. Lecture 14: Synchronization. Anton Burtsev March, 2019

250P: Computer Systems Architecture. Lecture 14: Synchronization. Anton Burtsev March, 2019 250P: Computer Systems Architecture Lecture 14: Synchronization Anton Burtsev March, 2019 Coherence and Synchronization Topics: synchronization primitives (Sections 5.4-5.5) 2 Constructing Locks Applications

More information

NUMA-aware Reader-Writer Locks. Tom Herold, Marco Lamina NUMA Seminar

NUMA-aware Reader-Writer Locks. Tom Herold, Marco Lamina NUMA Seminar 04.02.2015 NUMA Seminar Agenda 1. Recap: Locking 2. in NUMA Systems 3. RW 4. Implementations 5. Hands On Why Locking? Parallel tasks access shared resources : Synchronization mechanism in concurrent environments

More information

Introduction. Coherency vs Consistency. Lec-11. Multi-Threading Concepts: Coherency, Consistency, and Synchronization

Introduction. Coherency vs Consistency. Lec-11. Multi-Threading Concepts: Coherency, Consistency, and Synchronization Lec-11 Multi-Threading Concepts: Coherency, Consistency, and Synchronization Coherency vs Consistency Memory coherency and consistency are major concerns in the design of shared-memory systems. Consistency

More information

Everything You Always Wanted to Know About Synchronization but Were Afraid to Ask

Everything You Always Wanted to Know About Synchronization but Were Afraid to Ask Everything You Always Wanted to Know About Synchronization but Were Afraid to Ask Tudor David, Rachid Guerraoui and Vasileios Trigonakis Ecole Polytechnique Federale de Lausanne(EPFL) Haksu Lim, Luis,

More information

As Michi Henning and Steve Vinoski showed 1, calling a remote

As Michi Henning and Steve Vinoski showed 1, calling a remote Reducing CORBA Ca Latency by Caching and Prefetching Bernd Brügge and Christoph Vismeier Technische Universität München Method ca atency is a major probem in approaches based on object-oriented middeware

More information

Lecture 9: Multiprocessor OSs & Synchronization. CSC 469H1F Fall 2006 Angela Demke Brown

Lecture 9: Multiprocessor OSs & Synchronization. CSC 469H1F Fall 2006 Angela Demke Brown Lecture 9: Multiprocessor OSs & Synchronization CSC 469H1F Fall 2006 Angela Demke Brown The Problem Coordinated management of shared resources Resources may be accessed by multiple threads Need to control

More information

Other consistency models

Other consistency models Last time: Symmetric multiprocessing (SMP) Lecture 25: Synchronization primitives Computer Architecture and Systems Programming (252-0061-00) CPU 0 CPU 1 CPU 2 CPU 3 Timothy Roscoe Herbstsemester 2012

More information

Multiprocessors and Locking

Multiprocessors and Locking Types of Multiprocessors (MPs) Uniform memory-access (UMA) MP Access to all memory occurs at the same speed for all processors. Multiprocessors and Locking COMP9242 2008/S2 Week 12 Part 1 Non-uniform memory-access

More information

Straight-line code (or IPO: Input-Process-Output) If/else & switch. Relational Expressions. Decisions. Sections 4.1-6, , 4.

Straight-line code (or IPO: Input-Process-Output) If/else & switch. Relational Expressions. Decisions. Sections 4.1-6, , 4. If/ese & switch Unit 3 Sections 4.1-6, 4.8-12, 4.14-15 CS 1428 Spring 2018 Ji Seaman Straight-ine code (or IPO: Input-Process-Output) So far a of our programs have foowed this basic format: Input some

More information

Searching, Sorting & Analysis

Searching, Sorting & Analysis Searching, Sorting & Anaysis Unit 2 Chapter 8 CS 2308 Fa 2018 Ji Seaman 1 Definitions of Search and Sort Search: find a given item in an array, return the index of the item, or -1 if not found. Sort: rearrange

More information

Taking a Virtual Machine Towards Many-Cores. Rickard Green - Patrik Nyblom -

Taking a Virtual Machine Towards Many-Cores. Rickard Green - Patrik Nyblom - Taking a Virtual Machine Towards Many-Cores Rickard Green - rickard@erlang.org Patrik Nyblom - pan@erlang.org What we all know by now Number of cores per processor AMD Opteron Intel Xeon Sparc Niagara

More information

Lock-Free, Wait-Free and Multi-core Programming. Roger Deran boilerbay.com Fast, Efficient Concurrent Maps AirMap

Lock-Free, Wait-Free and Multi-core Programming. Roger Deran boilerbay.com Fast, Efficient Concurrent Maps AirMap Lock-Free, Wait-Free and Multi-core Programming Roger Deran boilerbay.com Fast, Efficient Concurrent Maps AirMap Lock-Free and Wait-Free Data Structures Overview The Java Maps that use Lock-Free techniques

More information

Distributed Operating Systems

Distributed Operating Systems Distributed Operating Systems Synchronization in Parallel Systems Marcus Völp 203 Topics Synchronization Locks Performance Distributed Operating Systems 203 Marcus Völp 2 Overview Introduction Hardware

More information

Module 7: Synchronization Lecture 13: Introduction to Atomic Primitives. The Lecture Contains: Synchronization. Waiting Algorithms.

Module 7: Synchronization Lecture 13: Introduction to Atomic Primitives. The Lecture Contains: Synchronization. Waiting Algorithms. The Lecture Contains: Synchronization Waiting Algorithms Implementation Hardwired Locks Software Locks Hardware Support Atomic Exchange Test & Set Fetch & op Compare & Swap Traffic of Test & Set Backoff

More information

Synchronization. Coherency protocols guarantee that a reading processor (thread) sees the most current update to shared data.

Synchronization. Coherency protocols guarantee that a reading processor (thread) sees the most current update to shared data. Synchronization Coherency protocols guarantee that a reading processor (thread) sees the most current update to shared data. Coherency protocols do not: make sure that only one thread accesses shared data

More information

NUMA-Aware Reader-Writer Locks PPoPP 2013

NUMA-Aware Reader-Writer Locks PPoPP 2013 NUMA-Aware Reader-Writer Locks PPoPP 2013 Irina Calciu Brown University Authors Irina Calciu @ Brown University Dave Dice Yossi Lev Victor Luchangco Virendra J. Marathe Nir Shavit @ MIT 2 Cores Chip (node)

More information

PL/SQL, Embedded SQL. Lecture #14 Autumn, Fall, 2001, LRX

PL/SQL, Embedded SQL. Lecture #14 Autumn, Fall, 2001, LRX PL/SQL, Embedded SQL Lecture #14 Autumn, 2001 Fa, 2001, LRX #14 PL/SQL,Embedded SQL HUST,Wuhan,China 402 PL/SQL Found ony in the Orace SQL processor (sqpus). A compromise between competey procedura programming

More information

ffwd: delegation is (much) faster than you think Sepideh Roghanchi, Jakob Eriksson, Nilanjana Basu

ffwd: delegation is (much) faster than you think Sepideh Roghanchi, Jakob Eriksson, Nilanjana Basu ffwd: delegation is (much) faster than you think Sepideh Roghanchi, Jakob Eriksson, Nilanjana Basu int get_seqno() { } return ++seqno; // ~1 Billion ops/s // single-threaded int threadsafe_get_seqno()

More information

A Quest for Predictable Latency Adventures in Java Concurrency. Martin Thompson

A Quest for Predictable Latency Adventures in Java Concurrency. Martin Thompson A Quest for Predictable Latency Adventures in Java Concurrency Martin Thompson - @mjpt777 If a system does not respond in a timely manner then it is effectively unavailable 1. It s all about the Blocking

More information

CSE120 Principles of Operating Systems. Prof Yuanyuan (YY) Zhou Advanced Memory Management

CSE120 Principles of Operating Systems. Prof Yuanyuan (YY) Zhou Advanced Memory Management CSE120 Principes of Operating Systems Prof Yuanyuan (YY) Zhou Advanced Memory Management Advanced Functionaity Now we re going to ook at some advanced functionaity that the OS can provide appications using

More information

Synchronization. Erik Hagersten Uppsala University Sweden. Components of a Synchronization Even. Need to introduce synchronization.

Synchronization. Erik Hagersten Uppsala University Sweden. Components of a Synchronization Even. Need to introduce synchronization. Synchronization sum := thread_create Execution on a sequentially consistent shared-memory machine: Erik Hagersten Uppsala University Sweden while (sum < threshold) sum := sum while + (sum < threshold)

More information

Readme ORACLE HYPERION PROFITABILITY AND COST MANAGEMENT

Readme ORACLE HYPERION PROFITABILITY AND COST MANAGEMENT ORACLE HYPERION PROFITABILITY AND COST MANAGEMENT Reease 11.1.2.4.000 Readme CONTENTS IN BRIEF Purpose... 2 New Features in This Reease... 2 Instaation Information... 2 Supported Patforms... 2 Supported

More information

CSC2/458 Parallel and Distributed Systems Scalable Synchronization

CSC2/458 Parallel and Distributed Systems Scalable Synchronization CSC2/458 Parallel and Distributed Systems Scalable Synchronization Sreepathi Pai February 20, 2018 URCS Outline Scalable Locking Barriers Outline Scalable Locking Barriers An alternative lock ticket lock

More information

Chapter 5. Multiprocessors and Thread-Level Parallelism

Chapter 5. Multiprocessors and Thread-Level Parallelism Computer Architecture A Quantitative Approach, Fifth Edition Chapter 5 Multiprocessors and Thread-Level Parallelism 1 Introduction Thread-Level parallelism Have multiple program counters Uses MIMD model

More information

Transactional Memory. Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit

Transactional Memory. Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit Transactional Memory Companion slides for The by Maurice Herlihy & Nir Shavit Our Vision for the Future In this course, we covered. Best practices New and clever ideas And common-sense observations. 2

More information

CS4021/4521 INTRODUCTION

CS4021/4521 INTRODUCTION CS4021/4521 Advanced Computer Architecture II Prof Jeremy Jones Rm 4.16 top floor South Leinster St (SLS) jones@scss.tcd.ie South Leinster St CS4021/4521 2018 jones@scss.tcd.ie School of Computer Science

More information

Spinlocks. Spinlocks. Message Systems, Inc. April 8, 2011

Spinlocks. Spinlocks. Message Systems, Inc. April 8, 2011 Spinlocks Samy Al Bahra Devon H. O Dell Message Systems, Inc. April 8, 2011 Introduction Mutexes A mutex is an object which implements acquire and relinquish operations such that the execution following

More information

HTTP Random Access and Live Resources IETF 100

HTTP Random Access and Live Resources IETF 100 HTTP Random Access and Live Resources IETF 1 Barbara Stark bs7652@att.com Darshak Thakore d.thakore@cabeabs.com Craig Pratt pratt@acm.org / craig@ecaspia.com Use existing bytes Range Unit with very arge

More information

Adaptive Lock. Madhav Iyengar < >, Nathaniel Jeffries < >

Adaptive Lock. Madhav Iyengar < >, Nathaniel Jeffries < > Adaptive Lock Madhav Iyengar < miyengar@andrew.cmu.edu >, Nathaniel Jeffries < njeffrie@andrew.cmu.edu > ABSTRACT Busy wait synchronization, the spinlock, is the primitive at the core of all other synchronization

More information

Implementing Transactional Memory in Kernel space

Implementing Transactional Memory in Kernel space Implementing Transactional Memory in Kernel space Breno Leitão Linux Technology Center leitao@debian.org leitao@freebsd.org Problem Statement Mutual exclusion concurrency control (Lock) Locks type: Semaphore

More information

LogTM: Log-Based Transactional Memory

LogTM: Log-Based Transactional Memory LogTM: Log-Based Transactional Memory Kevin E. Moore, Jayaram Bobba, Michelle J. Moravan, Mark D. Hill, & David A. Wood 12th International Symposium on High Performance Computer Architecture () 26 Mulitfacet

More information

A Memory Grouping Method for Sharing Memory BIST Logic

A Memory Grouping Method for Sharing Memory BIST Logic A Memory Grouping Method for Sharing Memory BIST Logic Masahide Miyazai, Tomoazu Yoneda, and Hideo Fuiwara Graduate Schoo of Information Science, Nara Institute of Science and Technoogy (NAIST), 8916-5

More information

Unit 12: Memory Consistency Models. Includes slides originally developed by Prof. Amir Roth

Unit 12: Memory Consistency Models. Includes slides originally developed by Prof. Amir Roth Unit 12: Memory Consistency Models Includes slides originally developed by Prof. Amir Roth 1 Example #1 int x = 0;! int y = 0;! thread 1 y = 1;! thread 2 int t1 = x;! x = 1;! int t2 = y;! print(t1,t2)!

More information

A Comparison of a Second-Order versus a Fourth- Order Laplacian Operator in the Multigrid Algorithm

A Comparison of a Second-Order versus a Fourth- Order Laplacian Operator in the Multigrid Algorithm A Comparison of a Second-Order versus a Fourth- Order Lapacian Operator in the Mutigrid Agorithm Kaushik Datta (kdatta@cs.berkeey.edu Math Project May 9, 003 Abstract In this paper, the mutigrid agorithm

More information

Design of IP Networks with End-to. to- End Performance Guarantees

Design of IP Networks with End-to. to- End Performance Guarantees Design of IP Networks with End-to to- End Performance Guarantees Irena Atov and Richard J. Harris* ( Swinburne University of Technoogy & *Massey University) Presentation Outine Introduction Mutiservice

More information

Fast and Scalable Queue-Based Resource Allocation Lock on Shared-Memory Multiprocessors

Fast and Scalable Queue-Based Resource Allocation Lock on Shared-Memory Multiprocessors Fast and Scalable Queue-Based Resource Allocation Lock on Shared-Memory Multiprocessors Deli Zhang, Brendan Lynch, and Damian Dechev University of Central Florida, Orlando, USA April 27, 2016 Mutual Exclusion

More information

Models of concurrency & synchronization algorithms

Models of concurrency & synchronization algorithms Models of concurrency & synchronization algorithms Lecture 3 of TDA383/DIT390 (Concurrent Programming) Carlo A. Furia Chalmers University of Technology University of Gothenburg SP3 2016/2017 Today s menu

More information

Fast and Scalable Queue-Based Resource Allocation Lock on Shared-Memory Multiprocessors

Fast and Scalable Queue-Based Resource Allocation Lock on Shared-Memory Multiprocessors Background Fast and Scalable Queue-Based Resource Allocation Lock on Shared-Memory Multiprocessors Deli Zhang, Brendan Lynch, and Damian Dechev University of Central Florida, Orlando, USA December 18,

More information

Searching & Sorting. Definitions of Search and Sort. Other forms of Linear Search. Linear Search. Week 11. Gaddis: 8, 19.6,19.8.

Searching & Sorting. Definitions of Search and Sort. Other forms of Linear Search. Linear Search. Week 11. Gaddis: 8, 19.6,19.8. Searching & Sorting Week 11 Gaddis: 8, 19.6,19.8 CS 5301 Fa 2017 Ji Seaman 1 Definitions of Search and Sort Search: find a given item in a ist, return the position of the item, or -1 if not found. Sort:

More information

Performance Improvement via Always-Abort HTM

Performance Improvement via Always-Abort HTM 1 Performance Improvement via Always-Abort HTM Joseph Izraelevitz* Lingxiang Xiang Michael L. Scott* *Department of Computer Science University of Rochester {jhi1,scott}@cs.rochester.edu Parallel Computing

More information

Application of Intelligence Based Genetic Algorithm for Job Sequencing Problem on Parallel Mixed-Model Assembly Line

Application of Intelligence Based Genetic Algorithm for Job Sequencing Problem on Parallel Mixed-Model Assembly Line American J. of Engineering and Appied Sciences 3 (): 5-24, 200 ISSN 94-7020 200 Science Pubications Appication of Inteigence Based Genetic Agorithm for Job Sequencing Probem on Parae Mixed-Mode Assemby

More information

VMM Emulation of Intel Hardware Transactional Memory

VMM Emulation of Intel Hardware Transactional Memory VMM Emulation of Intel Hardware Transactional Memory Maciej Swiech, Kyle Hale, Peter Dinda Northwestern University V3VEE Project www.v3vee.org Hobbes Project 1 What will we talk about? We added the capability

More information

Authorization of a QoS Path based on Generic AAA. Leon Gommans, Cees de Laat, Bas van Oudenaarde, Arie Taal

Authorization of a QoS Path based on Generic AAA. Leon Gommans, Cees de Laat, Bas van Oudenaarde, Arie Taal Abstract Authorization of a QoS Path based on Generic Leon Gommans, Cees de Laat, Bas van Oudenaarde, Arie Taa Advanced Internet Research Group, Department of Computer Science, University of Amsterdam.

More information

Java Semaphore (Part 1)

Java Semaphore (Part 1) Java Semaphore (Part 1) Douglas C. Schmidt d.schmidt@vanderbilt.edu www.dre.vanderbilt.edu/~schmidt Institute for Software Integrated Systems Vanderbilt University Nashville, Tennessee, USA Learning Objectives

More information

Synchronization COMPSCI 386

Synchronization COMPSCI 386 Synchronization COMPSCI 386 Obvious? // push an item onto the stack while (top == SIZE) ; stack[top++] = item; // pop an item off the stack while (top == 0) ; item = stack[top--]; PRODUCER CONSUMER Suppose

More information

Multiprocessors. Flynn Taxonomy. Classifying Multiprocessors. why would you want a multiprocessor? more is better? Cache Cache Cache.

Multiprocessors. Flynn Taxonomy. Classifying Multiprocessors. why would you want a multiprocessor? more is better? Cache Cache Cache. Multiprocessors why would you want a multiprocessor? Multiprocessors and Multithreading more is better? Cache Cache Cache Classifying Multiprocessors Flynn Taxonomy Flynn Taxonomy Interconnection Network

More information

Algorithms for Scalable Synchronization on Shared Memory Multiprocessors by John M. Mellor Crummey Michael L. Scott

Algorithms for Scalable Synchronization on Shared Memory Multiprocessors by John M. Mellor Crummey Michael L. Scott Algorithms for Scalable Synchronization on Shared Memory Multiprocessors by John M. Mellor Crummey Michael L. Scott Presentation by Joe Izraelevitz Tim Kopp Synchronization Primitives Spin Locks Used for

More information

CPSC/ECE 3220 Fall 2017 Exam Give the definition (note: not the roles) for an operating system as stated in the textbook. (2 pts.

CPSC/ECE 3220 Fall 2017 Exam Give the definition (note: not the roles) for an operating system as stated in the textbook. (2 pts. CPSC/ECE 3220 Fall 2017 Exam 1 Name: 1. Give the definition (note: not the roles) for an operating system as stated in the textbook. (2 pts.) Referee / Illusionist / Glue. Circle only one of R, I, or G.

More information

Arrays. Array Data Type. Array - Memory Layout. Array Terminology. Gaddis: 7.1-3,5

Arrays. Array Data Type. Array - Memory Layout. Array Terminology. Gaddis: 7.1-3,5 Arrays Unit 5 Gaddis: 7.1-3,5 CS 1428 Spring 2018 Ji Seaman Array Data Type Array: a variabe that contains mutipe vaues of the same type. Vaues are stored consecutivey in memory. An array variabe decaration

More information

Bridge Talk Release Notes for Meeting Exchange 5.0

Bridge Talk Release Notes for Meeting Exchange 5.0 Bridge Tak Reease Notes for Meeting Exchange 5.0 This document ists new product features, issues resoved since the previous reease, and current operationa issues. New Features This section provides a brief

More information

Chapter 5. Multiprocessors and Thread-Level Parallelism

Chapter 5. Multiprocessors and Thread-Level Parallelism Computer Architecture A Quantitative Approach, Fifth Edition Chapter 5 Multiprocessors and Thread-Level Parallelism 1 Introduction Thread-Level parallelism Have multiple program counters Uses MIMD model

More information

Space-Time Trade-offs.

Space-Time Trade-offs. Space-Time Trade-offs. Chethan Kamath 03.07.2017 1 Motivation An important question in the study of computation is how to best use the registers in a CPU. In most cases, the amount of registers avaiabe

More information

Role of Synchronization. CS 258 Parallel Computer Architecture Lecture 23. Hardware-Software Trade-offs in Synchronization and Data Layout

Role of Synchronization. CS 258 Parallel Computer Architecture Lecture 23. Hardware-Software Trade-offs in Synchronization and Data Layout CS 28 Parallel Computer Architecture Lecture 23 Hardware-Software Trade-offs in Synchronization and Data Layout April 21, 2008 Prof John D. Kubiatowicz http://www.cs.berkeley.edu/~kubitron/cs28 Role of

More information

Welcome - CSC 301. CSC 301- Foundations of Programming Languages

Welcome - CSC 301. CSC 301- Foundations of Programming Languages Wecome - CSC 301 CSC 301- Foundations of Programming Languages Instructor: Dr. Lutz Hame Emai: hame@cs.uri.edu Office: Tyer, Rm 251 Office Hours: TBA TA: TBA Assignments Assignment #0: Downoad & Read Syabus

More information

CSE506: Operating Systems CSE 506: Operating Systems

CSE506: Operating Systems CSE 506: Operating Systems CSE 506: Operating Systems Disk Scheduling Key to Disk Performance Don t access the disk Whenever possible Cache contents in memory Most accesses hit in the block cache Prefetch blocks into block cache

More information