What Is STM and How Well It Performs? Aleksandar Dragojević
|
|
- Justin Hoover
- 5 years ago
- Views:
Transcription
1 What Is STM and How Well It Performs? Aleksandar Dragojević 1
2 Hardware Trends 2
3 Hardware Trends CPU 2
4 Hardware Trends CPU 2
5 Hardware Trends CPU 2
6 Hardware Trends CPU 2
7 Hardware Trends CPU CPU 2
8 Hardware Trends CPU CPU CPU CPU 2
9 Hardware Trends CPU CPU CPU CPU CPU CPU CPU CPU 2
10 Concurrent Programming Domain of experts Fine-grained locking Lock-free Average programmers New abstractions needed 3
11 Transactional Memory 4
12 Transactional Memory // O1: move 20 a->b 1: int a = acc_a; 2: acc_a = a - 20; 3: int b = acc_b; 4: acc_b = b + 20; // O2: add 10 to a 5: int a = acc_a; 6: acc_a = a + 10; 4
13 Transactional Memory // O1: move 20 a->b atomic { 1: int a = acc_a; 2: acc_a = a - 20; 3: int b = acc_b; 4: acc_b = b + 20; } // O2: add 10 to a atomic { 5: int a = acc_a; 6: acc_a = a + 10; } 4
14 Transactional Memory // O1: move 20 a->b atomic { 1: int a = acc_a; 2: acc_a = a - 20; 3: int b = acc_b; 4: acc_b = b + 20; } // O2: add 10 to a atomic { 5: int a = acc_a; 6: acc_a = a + 10; } 4
15 Transactional Memory Sequential code Add transactions atomic key word System ensures correctness equivalent to sequential 5
16 Is it really simpler? 6
17 Is it really simpler? void Enqueue(int value) { ptrver_t next, tail, tail2, newb; node_t *node = alloc_queue_node(value); while(true) { tail = Tail; next = tail.ptr->next; tail2 = Tail; if(tail == tail2) if(next.ptr == NULL) { newb.ptr = node; newb.ver = next.ver + 1; if(cas(&tail.ptr->next, next, newb)) break; } else { newb.ptr = next.ptr; newb.ver = tail.ver + 1; CAS(&Tail, tail, newb); } } newb.ptr = node; newb.ver = tail.ver + 1; CAS(&Tail, tail, newb); } 6
18 Is it really simpler? void Enqueue(int value) { node_t *node = alloc_queue_node(value); atomic { if(tail == NULL) head = node; else tail->next = node; tail = node; } } 6
19 Disclaimer 7
20 What I might say TM is easy to use by non-experts TM shows great promise I hope we can make TM widely used 8
21 What I am not saying TM 9
22 Outline What is TM? How to implement an STM? SwissTM STM performance predicting the performance 10
23 What is TM? 11
24 What is TM? What does TM guarantee? 11
25 Serializability 12
26 Serializability 1: int a = acc_a; 2: acc_a = a - 20; 3: int b = acc_b; 4: acc_b = b + 20; 12
27 Serializability atomic { // t1 1: int a = acc_a; 2: acc_a = a - 20; 3: int b = acc_b; 4: acc_b = b + 20; } 12
28 Serializability atomic { // t1 1: int a = acc_a; 2: acc_a = a - 20; 3: int b = acc_b; 4: acc_b = b + 20; } atomic { // t2 5: int a = acc_a; 6: acc_a = a + 10; } 12
29 Serializability atomic { // t1 1: int a = acc_a; 2: acc_a = a - 20; 3: int b = acc_b; 4: acc_b = b + 20; } t 1 t 2 atomic { // t2 5: int a = acc_a; 6: acc_a = a + 10; } t correct 12
30 Serializability atomic { // t1 1: int a = acc_a; 2: acc_a = a - 20; 3: int b = acc_b; 4: acc_b = b + 20; } t 2 t 1 atomic { // t2 5: int a = acc_a; 6: acc_a = a + 10; } t correct 12
31 Serializability atomic { // t1 1: int a = acc_a; 2: acc_a = a - 20; 3: int b = acc_b; 4: acc_b = b + 20; } atomic { // t2 5: int a = acc_a; 6: acc_a = a + 10; } t client 12
32 Serializability atomic { // t1 1: int a = acc_a; 2: acc_a = a - 20; 3: int b = acc_b; 4: acc_b = b + 20; } atomic { // t2 5: int a = acc_a; 6: acc_a = a + 10; } t bank 12
33 How is this achieved? T 1 T 2 13
34 How is this achieved? T 1 t 1 T 2 13
35 How is this achieved? T 1 t 1 1 T 2 13
36 How is this achieved? T 1 t T 2 13
37 How is this achieved? T 1 t T 2 13
38 How is this achieved? T 1 t T 2 13
39 How is this achieved? T 1 t commit T 2 13
40 How is this achieved? T 1 t commit T 2 t 2 13
41 How is this achieved? T 1 t commit T 2 t
42 How is this achieved? T 1 t commit T 2 t
43 How is this achieved? T 1 t commit T 2 t commit 13
44 How is this achieved? T 1 T 2 13
45 How is this achieved? T 1 t 1 1 T 2 13
46 How is this achieved? T 1 t 1 1 T 2 t
47 How is this achieved? T 1 t T 2 t
48 How is this achieved? T 1 t T 2 t
49 How is this achieved? T 1 t T 2 t
50 How is this achieved? T 1 t T 2 t abort 13
51 How is this achieved? T 1 t T 2 t 2 13
52 How is this achieved? T 1 t T 2 t 2 13
53 How is this achieved? T 1 t commit T 2 t 2 13
54 How is this achieved? T 1 t commit T 2 t
55 How is this achieved? T 1 t commit T 2 t
56 How is this achieved? T 1 t commit T 2 t commit 13
57 How is this achieved? TM monitors accesses to objects When it detects conflicting access one transaction is aborted its actions are rolled back it is restarted transaction commits When all actions are not conflicting 14
58 Is serializability enough? 15
59 Is serializability enough? Variables: int x=0, y=1; Invariant: x < y 15
60 Is serializability enough? atomic { 1: int xl = x; 2: int yl = y; 3: x = yl; 4: y = yl * 2; } Variables: int x=0, y=1; Invariant: x < y 15
61 Is serializability enough? Variables: int x=0, y=1; Invariant: x < y atomic { 1: int xl = x; 2: int yl = y; 3: x = yl; 4: y = yl * 2; } atomic { 5: int xl = x; 6: int yl = y; 7: int zl = 1/(yl-xl); 8: z = zl; } 15
62 Consistent view All transactions must observe consistent views of memory at all times even the aborted ones 16
63 Opacity Serializability there exists an equivalent serial (one thread) execution Consistent memory view no transaction can e.g. divide by zero because of non-consistent reads 17
64 TM semantics Committed: instantaneous Aborted: never visible All: observe consistent state 18
65 TM semantics 19
66 TM semantics T 1 T 2 T 3 serial 19
67 TM semantics T 1 t 11 :commit T 2 T 3 serial 19
68 TM semantics T 1 t 11 :commit T 2 t 21 :commit T 3 serial 19
69 TM semantics T 1 t 11 :commit T 2 T 3 t 21 :commit t 31 :abort serial 19
70 TM semantics T 1 t 11 :commit T 2 T 3 t 21 :commit t 31 :abort t 32 :commit serial 19
71 TM semantics T 1 t 11 :commit t 12 :abort t 13 :commit T 2 t 21 :commit t 22 :commit T 3 t 31 :abort t 32 :commit t 33 :commit serial 19
72 TM semantics T 1 t 11 :commit t 12 :abort t 13 :commit T 2 t 21 :commit t 22 :commit T 3 t 31 :abort t 32 :commit t 33 :commit serial 19
73 TM semantics T 1 t 11 :commit t 12 :abort t 13 :commit T 2 t 21 :commit t 22 :commit T 3 t 31 :abort t 32 :commit t 33 :commit serial t 11 19
74 TM semantics T 1 t 11 :commit t 12 :abort t 13 :commit T 2 t 21 :commit t 22 :commit T 3 t 31 :abort t 32 :commit t 33 :commit serial t 11 19
75 TM semantics T 1 t 11 :commit t 12 :abort t 13 :commit T 2 t 21 :commit t 22 :commit T 3 t 31 :abort t 32 :commit t 33 :commit serial t 11 t 21 19
76 TM semantics T 1 t 11 :commit t 12 :abort t 13 :commit T 2 t 21 :commit t 22 :commit T 3 t 31 :abort t 32 :commit t 33 :commit serial t 11 t 21 19
77 TM semantics T 1 t 11 :commit t 12 :abort t 13 :commit T 2 t 21 :commit t 22 :commit T 3 t 31 :abort t 32 :commit t 33 :commit serial t 11 t 21 t 22 19
78 TM semantics T 1 t 11 :commit t 12 :abort t 13 :commit T 2 t 21 :commit t 22 :commit T 3 t 31 :abort t 32 :commit t 33 :commit serial t 11 t 21 t 22 19
79 TM semantics T 1 t 11 :commit t 12 :abort t 13 :commit T 2 t 21 :commit t 22 :commit T 3 t 31 :abort t 32 :commit t 33 :commit serial t 11 t 21 t 22 t 32 19
80 TM semantics T 1 t 11 :commit t 12 :abort t 13 :commit T 2 t 21 :commit t 22 :commit T 3 t 31 :abort t 32 :commit t 33 :commit serial t 11 t 21 t 22 t 32 t 13 t 33 19
81 How to implement an STM? 20
82 How to implement an STM? How does it all fit? 20
83 Software TM Available now Component of HyTM Backwards compatibility 21
84 From.cc To.exe 22
85 From.cc To.exe.cc 22
86 From.cc To.exe.cc.exe 22
87 From.cc To.exe.cc?.exe 22
88 From.cc To.exe.cc cc.exe 22
89 From.cc To.exe.cc cc.o.exe 22
90 From.cc To.exe.cc cc.o ld.exe 22
91 From.cc To.exe.cc cc.o ld.exe lib*.a 22
92 From.cc To.exe.cc cc.o ld.exe lib*.a 22
93 From.cc To.exe.cc cc.o ld.exe TM? lib*.a 22
94 From.cc To.exe.cc.exe 22
95 From.cc To.exe.cc Tx pass.exe 22
96 From.cc To.exe.cc Tx pass cc.exe 22
97 From.cc To.exe.cc Tx pass cc.o.exe 22
98 From.cc To.exe.cc Tx pass cc.o ld.exe 22
99 From.cc To.exe.cc Tx pass cc.o ld.exe libtm.a 22
100 From.cc To.exe.cc Tx pass cc.o ld.exe libtm.a 22
101 From.cc To.exe.cc Tx pass cc.o ld.exe libtm.a 22
102 TM library Implements TM Similar (same) API Different algorithms different performance different properties 23
103 TM library API tx_start() tx_read(addr) : val tx_write(addr, val) tx_commit() tx_abort() 24
104 TM libraries SwissTM (EPFL) TinySTM (UNINE) DSTM, TL2, TLRW, SkyTM (Sun) McRT (Intel) SXM, Bartok (Microsoft)... 25
105 Tx pass 26
106 Tx pass atomic { // t1 1: int a = acc_a; 2: acc_a = a - 20; 3: int b = acc_b; 4: acc_b = b + 20; } 26
107 Tx pass atomic { // t1 1: int a = acc_a; 2: acc_a = a - 20; 3: int b = acc_b; 4: acc_b = b + 20; } tx_start(); 1: int a = tx_read(acc_a); 2: tx_write(acc_a, a-20); 3: int b = tx_read(acc_b); 4: tx_write(acc_b, b+20); tx_commit(); 26
108 Implementing Tx pass Manual Compiler Other 27
109 Manual Tx pass Manually insert TM API calls Highly optimized no unnecessary TM calls Error-prone missing TM calls Tedious need to rewrite a lot of code 28
110 Compiler Tx pass Integrated with the compiler Simple to use Lower performance unnecessary TM calls Better support for optimizations lower the overheads 29
111 Other Tx pass Source to source compiler separate, simpler compiler Bytecode instrumentation for managed languages (Java, C#) 30
112 Tx Passes C/C++ Intel DTMC (LLVM) Sun gcc Java Deuce 31
113 SwissTM 32
114 SwissTM How to implement an STM library? 32
115 Setting 33
116 Setting CPU 1 33
117 Setting CPU 1 CPU 2 33
118 Setting CPU 1 CPU 2 CPU 3 33
119 Setting Main memory CPU 1 CPU 2 CPU 3 33
120 Setting Main memory 0x0 CPU 1 CPU 2 CPU 3 33
121 Setting Main memory CPU 1 0x0 0x4 CPU 2 CPU 3 33
122 Setting Main memory CPU 1 0x0 0x4 0x8 0xc CPU 2 CPU 3 0xffffffff 33
123 Setting Main memory CPU 1 op(addr) 0x0 0x4 0x8 0xc CPU 2 CPU 3 0xffffffff 33
124 Setting Main memory CPU 1 op(addr) 0x0 0x4 0x8 0xc CPU 2 CPU 3 0xffffffff 33
125 Setting (2) Memory consists of locations word sized (32 bits) Each location has address CPUs execute operations on locations read, write, c&s, t&s,... 34
126 Setting (3) Operations have different costs Try to avoid expensive operations c&s writing shared data reading shared data 35
127 SwissTM Two phase locking Invisible reads time-based validation Deferred updates support rollback 36
128 SwissTM design TM 37
129 SwissTM design TM conflict detection 37
130 SwissTM design TM conflict detection contention manager 37
131 SwissTM design t 1 TM conflict detection contention manager t 2 37
132 SwissTM design t 1 R(acc_a) tx_read(acc_a) TM conflict detection contention manager t 2 37
133 SwissTM design t 1 R(acc_a) TM conflict detection contention manager t 2 R(acc_a) tx_read(acc_a) 37
134 SwissTM design t 1 R(acc_a) W(acc_a) tx_write(acc_a) TM conflict detection contention manager t 2 R(acc_a) 37
135 SwissTM design t 1 R(acc_a) W(acc_a) TM conflict detection contention manager t 2 R(acc_a) tx_write(acc_a) W(acc_a) 37
136 SwissTM design t 1 R(acc_a) W(acc_a) TM conflict detection contention manager t 2 R(acc_a) abort 37
137 Conflict Detection eager > lazy t 1 commit RW V t 2 RW abort lazy wasted 38
138 Conflict Detection eager > lazy t 1 commit RW V t 2 RW abort lazy wasted 38
139 Conflict Detection eager > lazy t 1 commit RW V t 2 RW abort abort eager wasted 38
140 Conflict Detection 38
141 Conflict Detection lazy > eager t 1 commit W V t 2 R commit lazy 38
142 Conflict Detection lazy > eager t 1 commit W V t 2 R abort eager 38
143 Conflict Detection lazy > eager t 1 commit W V t 2 R commit lazy 38
144 Contention Manager t 1 W? V t 2 W 39
145 Contention Manager t 1 Greedy W abort younger V t 2 W abort 39
146 Contention Manager 39
147 Contention Manager t 1 W? V t 2 W 39
148 Contention Manager t 1 Timid abort W abort later V t 2 W 39
149 SwissTM Design Mixed invalidation lazy for read/write eager for write/write Two-phase contention manager timid for short greedy for long 40
150 Starting point void tx_start() word_t tx_read(word_t *addr) void tx_write(word_t *addr, word_t val) void tx_commit() 41
151 Where is the lock? 42
152 Where is the lock? Global Lock Table 42
153 Where is the lock? Global Lock Table lock[0] lock[1] lock[2] lock[1m lock[4m - 1] 42
154 Where is the lock? Global Lock Table lock[0] lock[1] map(addr) lock[2] lock[1m lock[4m - 1] 42
155 Where is the lock? Global Lock Table lock[0] lock[1] map(0xab000020) lock[2] lock[1m lock[4m - 1] 42
156 Where is the lock? Global Lock Table lock[0] lock[1] lock[2] map(0xab000020) a b lock[1m lock[4m - 1] 42
157 Where is the lock? Global Lock Table lock[0] lock[1] lock[2] map(0xab000020) a b lock[1m lock[4m - 1] 42
158 Where is the lock? Global Lock Table lock[0] lock[1] lock[2] map(0xab000020) a b lock[1m lock[4m - 1] 42
159 Where is the lock? Global Lock Table lock[0] lock[1] lock[2] map(0xab000020) a b lock[1m lock[4m - 1] 42
160 Lock 43
161 Lock write lock read lock 43
162 Lock unlocked write lock 0x0 43
163 Lock unlocked write lock 0x0 locked write lock owner log entry ptr 43
164 Lock unlocked write lock 0x0 locked write lock owner log entry ptr unlocked read lock version << 1 43
165 Lock unlocked write lock 0x0 locked write lock owner log entry ptr unlocked read lock version << 1 locked read lock 0x1 43
166 Lock (2) Two memory words Write lock encounter time Read lock commit time detect write/write conflicts detect read/write conflicts 44
167 Versions Each location has a version Use shared version counter speeds up validation Every transaction that writes, updates the counter on commit 45
168 Versions (2) Transactions read version counter at start If location version lower than counter no updates to the location since start read set is consistent Otherwise, validate remember current version counter 46
169 Shared data Commit_ts // shared version counter Greedy_ts // shared CM counter Lock_table // locks for all locations 47
170 Thread-local data _start // rollback jump target _valid_ts // read set version _read_log // what tx read _write_log // what tx wrote _cm_ts // CM timestamp 48
171 Start 1: tx_start(): 2: _start := create_jump_target() 3: _valid_ts := read(commit_ts) 49
172 Read 1: tx_read(word_t *addr) 2: (r_lock, w_lock) := map(addr) 3: if locked_by_me(w_lock) return get_val(w_lock) 4: version := read(r_lock) 5: while true 6: if version = 0x1 7: version := read(r_lock) 8: continue 9: value := read(addr) 10: version2 := read(r_lock) 11: if version = version2 break 12: version := version2 13: read_log_add(r_lock, version) 14: if version > _valid_ts and not extend() 15: rollback() 16: return value 50
173 Read 1: tx_read(word_t *addr) 2: (r_lock, w_lock) := map(addr) Map to lock 3: if locked_by_me(w_lock) return get_val(w_lock) 4: version := read(r_lock) 5: while true 6: if version = 0x1 7: version := read(r_lock) 8: continue 9: value := read(addr) 10: version2 := read(r_lock) 11: if version = version2 break 12: version := version2 13: read_log_add(r_lock, version) 14: if version > _valid_ts and not extend() 15: rollback() 16: return value 50
174 Read 1: tx_read(word_t *addr) 2: (r_lock, w_lock) := map(addr) Map to lock 3: if locked_by_me(w_lock) return get_val(w_lock) 4: version := read(r_lock) 5: while true 6: if version = 0x1 7: version := read(r_lock) 8: continue 9: value := read(addr) 10: version2 := read(r_lock) 11: if version = version2 break 12: version := version2 13: read_log_add(r_lock, version) 14: if version > _valid_ts and not extend() 15: rollback() 16: return value 50 Read after write
175 Read 1: tx_read(word_t *addr) 2: (r_lock, w_lock) := map(addr) Map to lock 3: if locked_by_me(w_lock) return get_val(w_lock) 4: version := read(r_lock) 5: while true 6: if version = 0x1 7: version := read(r_lock) 8: continue 9: value := read(addr) 10: version2 := read(r_lock) 11: if version = version2 break 12: version := version2 13: read_log_add(r_lock, version) 14: if version > _valid_ts and not extend() 15: rollback() 16: return value 50 Read after write Read consistent version and value
176 Read 1: tx_read(word_t *addr) 2: (r_lock, w_lock) := map(addr) Map to lock 3: if locked_by_me(w_lock) return get_val(w_lock) 4: version := read(r_lock) 5: while true 6: if version = 0x1 7: version := read(r_lock) 8: continue 9: value := read(addr) 10: version2 := read(r_lock) 11: if version = version2 break 12: version := version2 13: read_log_add(r_lock, version) 14: if version > _valid_ts and not extend() 15: rollback() 16: return value 50 Read after write Read consistent version and value Read state consistent?
177 Extend 1: extend() 2: ts := read(commits_ts) 3: if validate() 4: _valid_ts := ts 5: return true 6: return false 51
178 Validate 1: validate() 2: for entry in _read_log 3: if entry.version = read(entry.r_lock) 4: continue 5: if locked_by_me(entry.w_lock) 6: continue 7: return false 8: return true 52
179 Write 1: tx_write(word_t *addr, word_t val) 2: (r_lock, w_lock) := map(addr) 3: if locked_by_me(w_lock) 4: update(w_lock, val) 5: return 6: while true 7: if w_lock!= 0x0 8: if cm_should_abort(w_lock) rollback() 9: else continue 10: entry := write_log_add(w_lock,addr,val) 11: if c&s(w_lock, 0, entry) break 12: write_log_remove(entry) 13: if read(r_lock) > _valid_ts and not extend() 14: rollback() 53
180 Write 1: tx_write(word_t *addr, word_t val) 2: (r_lock, w_lock) := map(addr) 3: if locked_by_me(w_lock) Map to lock 4: update(w_lock, val) 5: return 6: while true 7: if w_lock!= 0x0 8: if cm_should_abort(w_lock) rollback() 9: else continue 10: entry := write_log_add(w_lock,addr,val) 11: if c&s(w_lock, 0, entry) break 12: write_log_remove(entry) 13: if read(r_lock) > _valid_ts and not extend() 14: rollback() 53
181 Write 1: tx_write(word_t *addr, word_t val) 2: (r_lock, w_lock) := map(addr) 3: if locked_by_me(w_lock) Map to lock 4: update(w_lock, val) 5: return 6: while true 7: if w_lock!= 0x0 8: if cm_should_abort(w_lock) rollback() 9: else continue 10: entry := write_log_add(w_lock,addr,val) 11: if c&s(w_lock, 0, entry) break 12: write_log_remove(entry) 13: if read(r_lock) > _valid_ts and not extend() 14: rollback() Write after write 53
182 Write 1: tx_write(word_t *addr, word_t val) 2: (r_lock, w_lock) := map(addr) 3: if locked_by_me(w_lock) Map to lock 4: update(w_lock, val) 5: return 6: while true Acquire 7: if w_lock!= 0x0 write lock 8: if cm_should_abort(w_lock) rollback() 9: else continue 10: entry := write_log_add(w_lock,addr,val) 11: if c&s(w_lock, 0, entry) break 12: write_log_remove(entry) 13: if read(r_lock) > _valid_ts and not extend() 14: rollback() Write after write 53
183 Write 1: tx_write(word_t *addr, word_t val) 2: (r_lock, w_lock) := map(addr) 3: if locked_by_me(w_lock) Map to lock 4: update(w_lock, val) 5: return 6: while true Acquire 7: if w_lock!= 0x0 write lock 8: if cm_should_abort(w_lock) rollback() 9: else continue 10: entry := write_log_add(w_lock,addr,val) 11: if c&s(w_lock, 0, entry) break 12: write_log_remove(entry) 13: if read(r_lock) > _valid_ts and not extend() 14: rollback() 53 Write after write Validate (for WaR)
184 Commit 1: tx_commit() 2: if is_empty(_write_log) return 3: for entry in _write_log 4: write(entry.r_lock,0x1) 5: ts := increment(commits_ts) 6: if ts > _valid_ts + 1 and not validate() 7: for entry in _write_log 8: write(entry.r_lock, entry.version) 9: rollback() 10: for entry in _write_log 11: write(entry.addr, entry.value) 12: write(entry.r_lock, ts << 1) 13: write(entry.w_lock, 0) 54
185 Commit 1: tx_commit() 2: if is_empty(_write_log) return 3: for entry in _write_log 4: write(entry.r_lock,0x1) 5: ts := increment(commits_ts) 6: if ts > _valid_ts + 1 and not validate() 7: for entry in _write_log 8: write(entry.r_lock, entry.version) 9: rollback() 10: for entry in _write_log 11: write(entry.addr, entry.value) 12: write(entry.r_lock, ts << 1) 13: write(entry.w_lock, 0) Read-only 54
186 Commit 1: tx_commit() 2: if is_empty(_write_log) return 3: for entry in _write_log 4: write(entry.r_lock,0x1) 5: ts := increment(commits_ts) 6: if ts > _valid_ts + 1 and not validate() 7: for entry in _write_log 8: write(entry.r_lock, entry.version) 9: rollback() 10: for entry in _write_log 11: write(entry.addr, entry.value) 12: write(entry.r_lock, ts << 1) 13: write(entry.w_lock, 0) Read-only Acquire read locks 54
187 Commit 1: tx_commit() 2: if is_empty(_write_log) return 3: for entry in _write_log 4: write(entry.r_lock,0x1) 5: ts := increment(commits_ts) 6: if ts > _valid_ts + 1 and not validate() 7: for entry in _write_log 8: write(entry.r_lock, entry.version) 9: rollback() 10: for entry in _write_log 11: write(entry.addr, entry.value) 12: write(entry.r_lock, ts << 1) 13: write(entry.w_lock, 0) Read-only Acquire read locks Get next version 54
188 Commit 1: tx_commit() 2: if is_empty(_write_log) return 3: for entry in _write_log 4: write(entry.r_lock,0x1) 5: ts := increment(commits_ts) 6: if ts > _valid_ts + 1 and not validate() 7: for entry in _write_log 8: write(entry.r_lock, entry.version) 9: rollback() 10: for entry in _write_log 11: write(entry.addr, entry.value) 12: write(entry.r_lock, ts << 1) 13: write(entry.w_lock, 0) Read-only Acquire read locks Get next version Validate read set 54
189 Commit 1: tx_commit() 2: if is_empty(_write_log) return 3: for entry in _write_log 4: write(entry.r_lock,0x1) 5: ts := increment(commits_ts) 6: if ts > _valid_ts + 1 and not validate() 7: for entry in _write_log 8: write(entry.r_lock, entry.version) 9: rollback() 10: for entry in _write_log 11: write(entry.addr, entry.value) 12: write(entry.r_lock, ts << 1) 13: write(entry.w_lock, 0) Read-only Acquire read locks Get next version Validate read set Commit values to memory 54
190 Rollback 1: rollback() 2: for entry in _write_log 3: write(entry.w_lock, 0x0) 4: long_jump(_start) 55
191 Contention manager 1: on_start() 2: _cm_ts := 3: on_write() 4: if _cm_ts = and size(_write_log) > 10 5: _cm_ts := increment(greedy_ts) 6: on_rollback() 7: wait_random() 8: cm_should_abort(w_lock) 9: if _cm_ts = return true 10: owner = owner(w_lock) 11: if owner._cm_ts < _cm_ts return true 12: abort(owner) 13: return false 56
192 Contention manager 1: on_start() 2: _cm_ts := Start as Timid 3: on_write() 4: if _cm_ts = and size(_write_log) > 10 5: _cm_ts := increment(greedy_ts) 6: on_rollback() 7: wait_random() 8: cm_should_abort(w_lock) 9: if _cm_ts = return true 10: owner = owner(w_lock) 11: if owner._cm_ts < _cm_ts return true 12: abort(owner) 13: return false 56
193 Contention manager 1: on_start() 2: _cm_ts := Start as Timid 3: on_write() 4: if _cm_ts = and size(_write_log) > 10 5: _cm_ts := increment(greedy_ts) 6: on_rollback() 7: wait_random() 8: cm_should_abort(w_lock) 9: if _cm_ts = return true 10: owner = owner(w_lock) 11: if owner._cm_ts < _cm_ts return true 12: abort(owner) 13: return false 56 Switch to Greedy
194 Contention manager 1: on_start() 2: _cm_ts := Start as Timid 3: on_write() 4: if _cm_ts = and size(_write_log) > 10 5: _cm_ts := increment(greedy_ts) 6: on_rollback() 7: wait_random() 8: cm_should_abort(w_lock) 9: if _cm_ts = return true 10: owner = owner(w_lock) 11: if owner._cm_ts < _cm_ts return true 12: abort(owner) 13: return false 56 Switch to Greedy Random backoff
195 Contention manager 1: on_start() 2: _cm_ts := Start as Timid 3: on_write() 4: if _cm_ts = and size(_write_log) > 10 5: _cm_ts := increment(greedy_ts) 6: on_rollback() 7: wait_random() 8: cm_should_abort(w_lock) 9: if _cm_ts = return true Timid 10: owner = owner(w_lock) 11: if owner._cm_ts < _cm_ts return true 12: abort(owner) 13: return false 56 Switch to Greedy Random backoff
196 Contention manager 1: on_start() 2: _cm_ts := Start as Timid 3: on_write() 4: if _cm_ts = and size(_write_log) > 10 5: _cm_ts := increment(greedy_ts) 6: on_rollback() 7: wait_random() 8: cm_should_abort(w_lock) 9: if _cm_ts = return true Timid 10: owner = owner(w_lock) 11: if owner._cm_ts < _cm_ts return true 12: abort(owner) 13: return false 56 Switch to Greedy Random backoff Greedy
197 STMBench7 Throughput [10 3 tx/s] 7 SwissTM TinySTM RSTM TL read dominated Thread count 57
198 Duration [s] Mixed invalidation TinySTM SwissTM LeeTM Thread count 58
199 Duration [s] 80 Mixed invalidation TinySTM SwissTM LeeTM irr 5% Thread count 58
200 Duration [s] Mixed invalidation TinySTM SwissTM LeeTM irr 20% Thread count 58
201 Two-phase CM Throughput [10 6 tx/s] Two-phase Greedy red-black tree Thread count 59
202 Next steps Translate into code some additional details Compile Run 60
203 Additional details Memory management allocations inside transactions that abort? Actions that cannot be rolled back input / output Same data used by non-tx code privatization / publication 61
204 STM performance 62
205 STM performance How fast is an STM really? 62
206 Measuring performance Compare different approaches different STMs STM vs locking vs lock-free How to express performance? metric Common approaches work 63
207 Approach I 64
208 Approach I 64
209 Approach I (2) Take some (long) task e.g. genome sequencing Run with different STMs Faster STM needs less time 65
210 Approach II 66
211 Approach II (2) Take some repetitive task e.g. insert element into red-black tree Keep executing it for some (fixed) time Run with different STMs Faster STM executes the task more times 67
212 Approach III 68
213 Avoiding pitfalls STM1 sequences genome faster than STM2 does not mean STM1 also solves some optimization problem A faster than STM2 No complete solution run as many workloads as possible Be careful 69
214 STMBench7 70
215 STMBench7 What does a benchmark look like? 70
216 STMBench7 Uses approach II Large data structure Modeled on OO7 CAD / CAM / CASE workloads Two locking, several STM implementations Crash test 71
217 Data structure 72
218 Data structure (2) 73
219 Operations 45 operations Read only and Update Short and long Three different workloads read, read-write, write 74
220 Execution Thread 1 op1 Thread 2 op2 Thread 3 op3 Data structure Latency, throughput 75
221 Execution (2) Threads execute a mix of operations Experiment lasts for 10s for example Measure of performance is the number of executed operations per second 76
222 STMBench7 Throughput [10 3 tx/s] 7 SwissTM TinySTM RSTM TL read dominated Thread count 77
223 Important question 78
224 Important question Can STM outperform sequential code? 78
225 SwissTM vs Sequential (x86 manual)!d&&(5d$$ #!" +" *" )" (" '" &" %" $" #" #" $" &" *" #("!"!"#$%&'($!"#$%&'()*+,-&$!"#$*+,-&$ "'.&/$ 0&123&$ 41-+5(&+$ 63&'1/$7,89$ 63&'1/$:2;$ A'('$ 7'/9-'<B&$ :,1C&($:,/-$ %<-+&&$!C,DB,/-$ 79
226 SwissTM vs Sequential (x86 manual)!d&&(5d$$ #!" +" *" )" (" '" &" %" $" #" #" $" &" *" #("!"!"#$%&'($!"#$%&'()*+,-&$!"#$*+,-&$ "'.&/$ 0&123&$ 41-+5(&+$ 63&'1/$7,89$ 63&'1/$:2;$ A'('$ 7'/9-'<B&$ :,1C&($:,/-$ %<-+&&$!C,DB,/-$ 79
227 SwissTM vs Sequential (x86 manual)!d&&(5d$$ #!" +" *" )" (" '" &" %" $" #" #" $" &" *" #("!"!"#$%&'($!"#$%&'()*+,-&$!"#$*+,-&$ "'.&/$ 0&123&$ 41-+5(&+$ 63&'1/$7,89$ 63&'1/$:2;$ A'('$ 7'/9-'<B&$ :,1C&($:,/-$ %<-+&&$!C,DB,/-$ 79
228 SwissTM vs Sequential (x86 manual)!d&&(5d$$ #!" +" *" )" (" '" &" %" $" #" #" $" &" *" #("!"!"#$%&'($!"#$%&'()*+,-&$!"#$*+,-&$ "'.&/$ 0&123&$ 41-+5(&+$ 63&'1/$7,89$ 63&'1/$:2;$ A'('$ 7'/9-'<B&$ :,1C&($:,/-$ %<-+&&$!C,DB,/-$ 79
229 SwissTM vs Sequential (x86 manual)!d&&(5d$$ #!" +" *" )" (" '" &" %" $" #" #" $" &" *" #("!"!"#$%&'($!"#$%&'()*+,-&$!"#$*+,-&$ "'.&/$ 0&123&$ 41-+5(&+$ 63&'1/$7,89$ 63&'1/$:2;$ A'('$ 7'/9-'<B&$ :,1C&($:,/-$ %<-+&&$!C,DB,/-$ 79
230 SwissTM vs Sequential (x86 manual)!d&&(5d$$ #!" +" *" )" (" '" &" %" $" #" #" $" &" *" #("!"!"#$%&'($!"#$%&'()*+,-&$!"#$*+,-&$ "'.&/$ 0&123&$ 41-+5(&+$ 63&'1/$7,89$ 63&'1/$:2;$ A'('$ 7'/9-'<B&$ :,1C&($:,/-$ %<-+&&$!C,DB,/-$ 79
231 SwissTM vs Sequential (x86 manual)!d&&(5d$$ #!" +" *" )" (" '" &" %" $" #" #" $" &" *" #("!"!"#$%&'($!"#$%&'()*+,-&$!"#$*+,-&$ "'.&/$ 0&123&$ 41-+5(&+$ 63&'1/$7,89$ 63&'1/$:2;$ A'('$ 7'/9-'<B&$ :,1C&($:,/-$ %<-+&&$!C,DB,/-$ 79
232 SwissTM vs Sequential (x86 manual)!d&&(5d$$ #!" +" *" )" (" '" &" %" $" #" #" $" &" *" #("!"!"#$%&'($!"#$%&'()*+,-&$!"#$*+,-&$ "'.&/$ 0&123&$ 41-+5(&+$ 63&'1/$7,89$ 63&'1/$:2;$ A'('$ 7'/9-'<B&$ :,1C&($:,/-$ %<-+&&$!C,DB,/-$ 79
233 SwissTM vs Sequential (x86 manual)!d&&(5d$$ #!" +" *" )" (" '" &" %" $" #" #" $" &" *" #("!"!"#$%&'($!"#$%&'()*+,-&$!"#$*+,-&$ "'.&/$ 0&123&$ 41-+5(&+$ 63&'1/$7,89$ 63&'1/$:2;$ A'('$ 7'/9-'<B&$ :,1C&($:,/-$ %<-+&&$!C,DB,/-$ 79
234 SwissTM vs Sequential (x86 Intel compiler) 8A$$/.A& #!" +" *" )" (" '" &" %" $" #" #" $" &" *" #("!"!"#$%& '$()*$& +(,-./$-& 0*$"(%&1234& 0*$"(%&5)6& 5"7#-2(,4& 8%9":& ;"9"<)(&1234& ;"9"<)(&5)6& ="/"& 1"%4,"7>$& 8?2A>2%,& 80
235 SwissTM vs Sequential (x86 Intel compiler) 8A$$/.A& #!" +" *" )" (" '" &" %" $" #" #" $" &" *" #("!"!"#$%& '$()*$& +(,-./$-& 0*$"(%&1234& 0*$"(%&5)6& 5"7#-2(,4& 8%9":& ;"9"<)(&1234& ;"9"<)(&5)6& ="/"& 1"%4,"7>$& 8?2A>2%,& 80
236 SwissTM vs Sequential (x86 Intel compiler) 8A$$/.A& #!" +" *" )" (" '" &" %" $" #" #" $" &" *" #("!"!"#$%& '$()*$& +(,-./$-& 0*$"(%&1234& 0*$"(%&5)6& 5"7#-2(,4& 8%9":& ;"9"<)(&1234& ;"9"<)(&5)6& ="/"& 1"%4,"7>$& 8?2A>2%,& 80
237 SwissTM vs Sequential (x86 Intel compiler) 8A$$/.A& #!" +" *" )" (" '" &" %" $" #" #" $" &" *" #("!"!"#$%& '$()*$& +(,-./$-& 0*$"(%&1234& 0*$"(%&5)6& 5"7#-2(,4& 8%9":& ;"9"<)(&1234& ;"9"<)(&5)6& ="/"& 1"%4,"7>$& 8?2A>2%,& 80
238 SwissTM vs Sequential (x86 Intel compiler) 8A$$/.A& #!" +" *" )" (" '" &" %" $" #" #" $" &" *" #("!"!"#$%& '$()*$& +(,-./$-& 0*$"(%&1234& 0*$"(%&5)6& 5"7#-2(,4& 8%9":& ;"9"<)(&1234& ;"9"<)(&5)6& ="/"& 1"%4,"7>$& 8?2A>2%,& 80
239 SwissTM vs Sequential (x86 Intel compiler) 8A$$/.A& #!" +" *" )" (" '" &" %" $" #" #" $" &" *" #("!"!"#$%& '$()*$& +(,-./$-& 0*$"(%&1234& 0*$"(%&5)6& 5"7#-2(,4& 8%9":& ;"9"<)(&1234& ;"9"<)(&5)6& ="/"& 1"%4,"7>$& 8?2A>2%,& 80
240 SwissTM vs Sequential (x86 Intel compiler) 8A$$/.A& #!" +" *" )" (" '" &" %" $" #" #" $" &" *" #("!"!"#$%& '$()*$& +(,-./$-& 0*$"(%&1234& 0*$"(%&5)6& 5"7#-2(,4& 8%9":& ;"9"<)(&1234& ;"9"<)(&5)6& ="/"& 1"%4,"7>$& 8?2A>2%,& 80
241 SwissTM vs Sequential (x86 Intel compiler) 8A$$/.A& #!" +" *" )" (" '" &" %" $" #" #" $" &" *" #("!"!"#$%& '$()*$& +(,-./$-& 0*$"(%&1234& 0*$"(%&5)6& 5"7#-2(,4& 8%9":& ;"9"<)(&1234& ;"9"<)(&5)6& ="/"& 1"%4,"7>$& 8?2A>2%,& 80
242 Compiler cost 8A$$/.A& $"!#,"!#+"!#*"!#)"!#("!#'"!#&"!#%"!#$" $" %" '" +" $)"!"!"#$%& '$()*$& +(,-./$-& 0*$"(%&1234& 0*$"(%&5)6& 5"7#-2(,4& 8%9":& ;"9"<)(&1234& ;"9"<)(&5)6& ="/"& 1"%4,"7>$& 8?2A>2%,& 81
243 Compiler cost 8A$$/.A& $"!#,"!#+"!#*"!#)"!#("!#'"!#&"!#%"!#$" $" %" '" +" $)"!"!"#$%& '$()*$& +(,-./$-& 0*$"(%&1234& 0*$"(%&5)6& 5"7#-2(,4& 8%9":& ;"9"<)(&1234& ;"9"<)(&5)6& ="/"& 1"%4,"7>$& 8?2A>2%,& 81
244 Compiler cost 8A$$/.A& $"!#,"!#+"!#*"!#)"!#("!#'"!#&"!#%"!#$" $" %" '" +" $)"!"!"#$%& '$()*$& +(,-./$-& 0*$"(%&1234& 0*$"(%&5)6& 5"7#-2(,4& 8%9":& ;"9"<)(&1234& ;"9"<)(&5)6& ="/"& 1"%4,"7>$& 8?2A>2%,& 81
245 Compiler cost 8A$$/.A& $"!#,"!#+"!#*"!#)"!#("!#'"!#&"!#%"!#$" $" %" '" +" $)"!"!"#$%& '$()*$& +(,-./$-& 0*$"(%&1234& 0*$"(%&5)6& 5"7#-2(,4& 8%9":& ;"9"<)(&1234& ;"9"<)(&5)6& ="/"& 1"%4,"7>$& 8?2A>2%,& 81
246 Compiler cost 8A$$/.A& $"!#,"!#+"!#*"!#)"!#("!#'"!#&"!#%"!#$" $" %" '" +" $)"!"!"#$%& '$()*$& +(,-./$-& 0*$"(%&1234& 0*$"(%&5)6& 5"7#-2(,4& 8%9":& ;"9"<)(&1234& ;"9"<)(&5)6& ="/"& 1"%4,"7>$& 8?2A>2%,& 81
247 Compiler cost 8A$$/.A& $"!#,"!#+"!#*"!#)"!#("!#'"!#&"!#%"!#$" $" %" '" +" $)"!"!"#$%& '$()*$& +(,-./$-& 0*$"(%&1234& 0*$"(%&5)6& 5"7#-2(,4& 8%9":& ;"9"<)(&1234& ;"9"<)(&5)6& ="/"& 1"%4,"7>$& 8?2A>2%,& 81
248 Compiler cost 8A$$/.A& $"!#,"!#+"!#*"!#)"!#("!#'"!#&"!#%"!#$" $" %" '" +" $)"!"!"#$%& '$()*$& +(,-./$-& 0*$"(%&1234& 0*$"(%&5)6& 5"7#-2(,4& 8%9":& ;"9"<)(&1234& ;"9"<)(&5)6& ="/"& 1"%4,"7>$& 8?2A>2%,& 81
249 SwissTM vs Sequential (Sparc) '$" E#$>F$ >G$>H$ EI$ '#" '!"!D&&(5D$$ &" %" $" #" '" #" $" &" '%" (#" %$"!"!"#$%&'($!"#$%&'()*+,-&$!"#$*+,-&$ "'.&/$ 0&123&$ 41-+5(&+$ 63&'1/$7,89$ 63&'1/$:2;$ A'('$ 7'/9-'<B&$ :,1C&($:,/-$ %<-+&&$!C,DB,/-$ 82
250 SwissTM vs Sequential (Sparc) '$" E#$>F$ >G$>H$ EI$ '#" '!"!D&&(5D$$ &" %" $" #" '" #" $" &" '%" (#" %$"!"!"#$%&'($!"#$%&'()*+,-&$!"#$*+,-&$ "'.&/$ 0&123&$ 41-+5(&+$ 63&'1/$7,89$ 63&'1/$:2;$ A'('$ 7'/9-'<B&$ :,1C&($:,/-$ %<-+&&$!C,DB,/-$ 82
251 SwissTM vs Sequential (Sparc) '$" E#$>F$ >G$>H$ EI$ '#" '!"!D&&(5D$$ &" %" $" #" '" #" $" &" '%" (#" %$"!"!"#$%&'($!"#$%&'()*+,-&$!"#$*+,-&$ "'.&/$ 0&123&$ 41-+5(&+$ 63&'1/$7,89$ 63&'1/$:2;$ A'('$ 7'/9-'<B&$ :,1C&($:,/-$ %<-+&&$!C,DB,/-$ 82
252 SwissTM vs Sequential (Sparc) '$" E#$>F$ >G$>H$ EI$ '#" '!"!D&&(5D$$ &" %" $" #" '" #" $" &" '%" (#" %$"!"!"#$%&'($!"#$%&'()*+,-&$!"#$*+,-&$ "'.&/$ 0&123&$ 41-+5(&+$ 63&'1/$7,89$ 63&'1/$:2;$ A'('$ 7'/9-'<B&$ :,1C&($:,/-$ %<-+&&$!C,DB,/-$ 82
253 SwissTM vs Sequential (Sparc) '$" E#$>F$ >G$>H$ EI$ '#" '!"!D&&(5D$$ &" %" $" #" '" #" $" &" '%" (#" %$"!"!"#$%&'($!"#$%&'()*+,-&$!"#$*+,-&$ "'.&/$ 0&123&$ 41-+5(&+$ 63&'1/$7,89$ 63&'1/$:2;$ A'('$ 7'/9-'<B&$ :,1C&($:,/-$ %<-+&&$!C,DB,/-$ 82
254 SwissTM vs Sequential (Sparc) '$" E#$>F$ >G$>H$ EI$ '#" '!"!D&&(5D$$ &" %" $" #" '" #" $" &" '%" (#" %$"!"!"#$%&'($!"#$%&'()*+,-&$!"#$*+,-&$ "'.&/$ 0&123&$ 41-+5(&+$ 63&'1/$7,89$ 63&'1/$:2;$ A'('$ 7'/9-'<B&$ :,1C&($:,/-$ %<-+&&$!C,DB,/-$ 82
255 SwissTM vs Sequential (Sparc) '$" E#$>F$ >G$>H$ EI$ '#" '!"!D&&(5D$$ &" %" $" #" '" #" $" &" '%" (#" %$"!"!"#$%&'($!"#$%&'()*+,-&$!"#$*+,-&$ "'.&/$ 0&123&$ 41-+5(&+$ 63&'1/$7,89$ 63&'1/$:2;$ A'('$ 7'/9-'<B&$ :,1C&($:,/-$ %<-+&&$!C,DB,/-$ 82
256 SwissTM vs Sequential (Sparc) '$" E#$>F$ >G$>H$ EI$ '#" '!"!D&&(5D$$ &" %" $" #" '" #" $" &" '%" (#" %$"!"!"#$%&'($!"#$%&'()*+,-&$!"#$*+,-&$ "'.&/$ 0&123&$ 41-+5(&+$ 63&'1/$7,89$ 63&'1/$:2;$ A'('$ 7'/9-'<B&$ :,1C&($:,/-$ %<-+&&$!C,DB,/-$ 82
257 SwissTM vs Sequential (Sparc) '$" E#$>F$ >G$>H$ EI$ '#" '!"!D&&(5D$$ &" %" $" #" '" #" $" &" '%" (#" %$"!"!"#$%&'($!"#$%&'()*+,-&$!"#$*+,-&$ "'.&/$ 0&123&$ 41-+5(&+$ 63&'1/$7,89$ 63&'1/$:2;$ A'('$ 7'/9-'<B&$ :,1C&($:,/-$ %<-+&&$!C,DB,/-$ 82
258 SwissTM vs Sequential (Sparc) '$" E#$>F$ >G$>H$ EI$ '#" '!"!D&&(5D$$ &" %" $" #" '" #" $" &" '%" (#" %$"!"!"#$%&'($!"#$%&'()*+,-&$!"#$*+,-&$ "'.&/$ 0&123&$ 41-+5(&+$ 63&'1/$7,89$ 63&'1/$:2;$ A'('$ 7'/9-'<B&$ :,1C&($:,/-$ %<-+&&$!C,DB,/-$ 82
259 SwissTM vs Sequential (Sparc) '$" E#$>F$ >G$>H$ EI$ '#" '!"!D&&(5D$$ &" %" $" #" '" #" $" &" '%" (#" %$"!"!"#$%&'($!"#$%&'()*+,-&$!"#$*+,-&$ "'.&/$ 0&123&$ 41-+5(&+$ 63&'1/$7,89$ 63&'1/$:2;$ A'('$ 7'/9-'<B&$ :,1C&($:,/-$ %<-+&&$!C,DB,/-$ 82
260 SwissTM vs Sequential (Sparc) '$" E#$>F$ >G$>H$ EI$ '#" '!"!D&&(5D$$ &" %" $" #" '" #" $" &" '%" (#" %$"!"!"#$%&'($!"#$%&'()*+,-&$!"#$*+,-&$ "'.&/$ 0&123&$ 41-+5(&+$ 63&'1/$7,89$ 63&'1/$:2;$ A'('$ 7'/9-'<B&$ :,1C&($:,/-$ %<-+&&$!C,DB,/-$ 82
261 SwissTM vs Sequential (Sparc) '$" E#$>F$ >G$>H$ EI$ '#" '!"!D&&(5D$$ &" %" $" #" '" #" $" &" '%" (#" %$"!"!"#$%&'($!"#$%&'()*+,-&$!"#$*+,-&$ "'.&/$ 0&123&$ 41-+5(&+$ 63&'1/$7,89$ 63&'1/$:2;$ A'('$ 7'/9-'<B&$ :,1C&($:,/-$ %<-+&&$!C,DB,/-$ 82
262 SwissTM vs Sequential x86 (16) SPARC (64) SwissTM (manual) SwissTM (compiler) 16/17 17/17 13/14 - Total 46/48 (95%) 83
263 SwissTM vs Sequential x86 (16) SPARC (64) min max avg min max avg SwissTM (manual) SwissTM (compiler)
264 Predicting STM performance 85
265 Predicting STM performance Can we guess STM performance? 85
266 Good performance? 12 STMBench7 9 Speedup Threads 86
267 Good performance? 12 9 STMBench7 Yes Read Speedup Threads 86
268 Good performance? 12 9 STMBench7 Or maybe no? Read Speedup 6 3 Write Threads 86
269 Open questions What is really going on? significant difference What should we do? given application - use TM or not? 87
270 Answers so far Rules of thumb Theory Existing models 88
271 Rules of Thumb Good performance if mostly read operations low contention Vague difficult to apply 89
272 Theory Worst case performance: Guerraoui, Kapałka 2007: Every progressive, single-version TM that uses invisible reads has the time complexity of Ω(k) where k = Obj. 90
273 Theory Worst case performance: Guerraoui, Kapałka 2007: Every progressive, single-version TM that uses invisible reads has the time complexity of Ω(k) where k = Obj. We want common case 90
274 Existing models Model one aspect: Heindl, Pokam, Adl-Tabatabai 2009: conflict detection algorithmic performance Answer specific question: Usui et al. 2009: use lock or STM for a critical section 91
275 Pragmatic approach 92
276 Pragmatic approach 92
277 Pragmatic approach 92
278 Pragmatic approach 92
279 Pragmatic approach 92
280 Pragmatic approach 92
281 Pragmatic approach speedup = f(n) 92
282 Constructing function 10 Genome 8 Speedup Threads 93
283 Constructing function 10 Genome 8 Speedup Threads 93
284 Constructing function 10 Genome f(n) 8 Speedup Threads 93
285 Desired properties Simple General Low error 94
286 Interpolation 10 Genome 8 Speedup Threads 95
287 Interpolation 10 8? Genome Speedup Threads 95
288 Interpolation 10 Genome 8 Speedup Threads 95
289 Interpolation 10 Genome Polynomial 8 Speedup Threads 95
290 Interpolation 10 Genome Rational 8 Speedup Threads 95
291 Choosing function Low error Simple Rational Polynomial Logarithmic Exponential Power 96
292 Choosing function Low error Simple Rational Polynomial Logarithmic Exponential Power 96
293 Extrapolation 15 Genome 12 Speedup Threads 97
294 Extrapolation 15 Genome Speedup ? Threads 97
295 Extrapolation 15 Genome 12 Speedup Threads 97
296 Extrapolation 15 Polynomial 8 Genome 12 Rational 8 Speedup Threads 97
297 Extrapolation 15 Genome Polynomial Rational 32 Speedup Threads 97
298 Extrapolation error 4 Intruder 3 Speedup Threads 98
299 Extrapolation error 4 Intruder 3 Rational 8 Speedup Threads 98
300 Extrapolation error 4 Intruder Speedup 3 2 6% Rational Threads 98
301 Extrapolation error 4 Intruder 3 Rational 16 Speedup Threads 98
302 Extrapolation error 4 Intruder 3 9% Rational 16 Speedup Threads 98
303 Extrapolation error 4 Intruder Speedup 3 2 Rational Threads 98
304 Extrapolation error 4 Intruder Speedup 3 2 Rational 32 7% Threads 98
305 Extrapolation error 4 Intruder ~10% for 2m Speedup 3 2 Rational 32 7% Threads 98
306 Applications Predict performance use TM or not? assign more CPUs? Predict optimal thread count schedule threads 99
307 Use TM or not? 12 Rational 8 9 Speedup Threads 100
308 Use TM or not? 12 Rational 8 9 Genome Speedup Threads 100
309 Use TM or not? 12 Rational 8 9 Genome Speedup Threads 100
310 Use TM or not? 12 Rational 8 9 Genome Speedup 6 3 SSCA Threads 100
311 Use TM or not? 12 Rational 8 9 Genome Speedup 6 3 SSCA Threads 100
312 Scheduling 1. Schedule ni threads 2. Measure performance 3. Construct f(n) 4. Find ni+1 for which f(n) is max 5. Schedule ni+1 threads 101
313 Scheduling 3 Intruder (max 22) Speedup 2 1 Measured Threads 102
314 Scheduling 3 Intruder (max 22) Speedup 2 1 f(n) Measured (max 24) Threads 102
315 Scheduling 3 Intruder (max 22) f(n) Speedup 2 1 Measured (max 64) Threads 102
316 Scheduling 3 Intruder (max 22) Speedup 2 1 f(n) Measured (max 22) Threads 102
317 Scheduling 3 Intruder (max 22) Speedup 2 1 f(n) Measured (max 22) Threads 102
318 Limitations f(n) strongly depends on workload computer system Not always accurate 103
319 Future work Find better functions Use characteristics of performance cannot be negative Use runtime information STM OS 104
320 Links SwissTM Intel C/C++ DTMC C/C++ (LLVM) 105
321 References Rachid Guerraoui, Michał Kapałka, and Jan Vitek. STMBench7: A Benchmark for Software Transactional Memory. EuroSys Rachid Guerraoui and Michał Kapałka. On the Correctness of Transactional Memory. PPoPP Aleksandar Dragojević, Rachid Guerraoui, and Michał Kapałka. Dividing Transactional Memories by Zero. Transact
322 References (2) Aleksandar Dragojević, Rachid Guerraoui, and Michał Kapałka. Stretching Transactional Memory. PLDI Aleksandar Dragojević, Pascal Felber, Vincent Gramoli, Rachid Guerraoui. Why STM can be more than a Research Toy. CACM April, Aleksandar Dragojević, Rachid Guerraoui. Predicting the Scalability of an STM: A Pragmatic Approach. Transact
323 Thank you 108
324 Questions, comments? 108
Stretching Transactional Memory
Stretching Transactional Memory Aleksandar Dragojević Rachid Guerraoui Michał Kapałka Ecole Polytechnique Fédérale de Lausanne, School of Computer and Communication Sciences, I&C, Switzerland {aleksandar.dragojevic,
More informationAgenda. Designing Transactional Memory Systems. Why not obstruction-free? Why lock-based?
Agenda Designing Transactional Memory Systems Part III: Lock-based STMs Pascal Felber University of Neuchatel Pascal.Felber@unine.ch Part I: Introduction Part II: Obstruction-free STMs Part III: Lock-based
More informationTransactional Memory. R. Guerraoui, EPFL
Transactional Memory R. Guerraoui, EPFL Locking is history Lock-freedom is difficult Wanted A concurrency control abstraction that is simple, robust and efficient Transactions Historical perspective Eswaran
More informationTransactional Memory
Transactional Memory Michał Kapałka EPFL, LPD STiDC 08, 1.XII 2008 Michał Kapałka (EPFL, LPD) Transactional Memory STiDC 08, 1.XII 2008 1 / 25 Introduction How to Deal with Multi-Threading? Locks? Wait-free
More informationFalse Conflict Reduction in the Swiss Transactional Memory (SwissTM) System
False Conflict Reduction in the Swiss Transactional Memory (SwissTM) System Aravind Natarajan Department of Computer Engineering The University of Texas at Dallas Richardson, TX - 75080, USA Email: aravindn@utdallas.edu
More informationConflict Detection and Validation Strategies for Software Transactional Memory
Conflict Detection and Validation Strategies for Software Transactional Memory Michael F. Spear, Virendra J. Marathe, William N. Scherer III, and Michael L. Scott University of Rochester www.cs.rochester.edu/research/synchronization/
More informationEvaluating the Impact of Transactional Characteristics on the Performance of Transactional Memory Applications
Evaluating the Impact of Transactional Characteristics on the Performance of Transactional Memory Applications Fernando Rui, Márcio Castro, Dalvan Griebler, Luiz Gustavo Fernandes Email: fernando.rui@acad.pucrs.br,
More informationChí Cao Minh 28 May 2008
Chí Cao Minh 28 May 2008 Uniprocessor systems hitting limits Design complexity overwhelming Power consumption increasing dramatically Instruction-level parallelism exhausted Solution is multiprocessor
More information6.852: Distributed Algorithms Fall, Class 20
6.852: Distributed Algorithms Fall, 2009 Class 20 Today s plan z z z Transactional Memory Reading: Herlihy-Shavit, Chapter 18 Guerraoui, Kapalka, Chapters 1-4 Next: z z z Asynchronous networks vs asynchronous
More informationLowering the Overhead of Nonblocking Software Transactional Memory
Lowering the Overhead of Nonblocking Software Transactional Memory Virendra J. Marathe Michael F. Spear Christopher Heriot Athul Acharya David Eisenstat William N. Scherer III Michael L. Scott Background
More informationEigenBench: A Simple Exploration Tool for Orthogonal TM Characteristics
EigenBench: A Simple Exploration Tool for Orthogonal TM Characteristics Pervasive Parallelism Laboratory, Stanford University Sungpack Hong Tayo Oguntebi Jared Casper Nathan Bronson Christos Kozyrakis
More informationImproving the Practicality of Transactional Memory
Improving the Practicality of Transactional Memory Woongki Baek Electrical Engineering Stanford University Programming Multiprocessors Multiprocessor systems are now everywhere From embedded to datacenter
More informationUnifying Thread-Level Speculation and Transactional Memory
Unifying Thread-Level Speculation and Transactional Memory João Barreto 1, Aleksandar Dragojevic 2, Paulo Ferreira 1, Ricardo Filipe 1, and Rachid Guerraoui 2 1 INESC-ID/Technical University Lisbon, Portugal
More informationUnifying Thread-Level Speculation and Transactional Memory
Unifying Thread-Level Speculation and Transactional Memory João Barreto 1, Aleksandar Dragojevic 2, Paulo Ferreira 1, Ricardo Filipe 1,, and Rachid Guerraoui 2 1 INESC-ID/Technical University Lisbon, Portugal
More informationPerformance Evaluation of Adaptivity in STM. Mathias Payer and Thomas R. Gross Department of Computer Science, ETH Zürich
Performance Evaluation of Adaptivity in STM Mathias Payer and Thomas R. Gross Department of Computer Science, ETH Zürich Motivation STM systems rely on many assumptions Often contradicting for different
More informationPerformance Comparison of Various STM Concurrency Control Protocols Using Synchrobench
Performance Comparison of Various STM Concurrency Control Protocols Using Synchrobench Ajay Singh Dr. Sathya Peri Anila Kumari Monika G. February 24, 2017 STM vs Synchrobench IIT Hyderabad February 24,
More information1 Publishable Summary
1 Publishable Summary 1.1 VELOX Motivation and Goals The current trend in designing processors with multiple cores, where cores operate in parallel and each of them supports multiple threads, makes the
More informationRelaxing Concurrency Control in Transactional Memory. Utku Aydonat
Relaxing Concurrency Control in Transactional Memory by Utku Aydonat A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy Graduate Department of The Edward S. Rogers
More informationMcRT-STM: A High Performance Software Transactional Memory System for a Multi- Core Runtime
McRT-STM: A High Performance Software Transactional Memory System for a Multi- Core Runtime B. Saha, A-R. Adl- Tabatabai, R. Hudson, C.C. Minh, B. Hertzberg PPoPP 2006 Introductory TM Sales Pitch Two legs
More informationPreventing versus Curing: Avoiding Conflicts in Transactional Memories
Preventing versus Curing: Avoiding Conflicts in Transactional Memories Aleksandar Dragojević Anmol V. Singh Rachid Guerraoui Vasu Singh EPFL Abstract Transactional memories are typically speculative and
More informationHydraVM: Mohamed M. Saad Mohamed Mohamedin, and Binoy Ravindran. Hot Topics in Parallelism (HotPar '12), Berkeley, CA
HydraVM: Mohamed M. Saad Mohamed Mohamedin, and Binoy Ravindran Hot Topics in Parallelism (HotPar '12), Berkeley, CA Motivation & Objectives Background Architecture Program Reconstruction Implementation
More informationRemote Transaction Commit: Centralizing Software Transactional Memory Commits
IEEE TRANSACTIONS ON COMPUTERS 1 Remote Transaction Commit: Centralizing Software Transactional Memory Commits Ahmed Hassan, Roberto Palmieri, and Binoy Ravindran Abstract Software Transactional Memory
More informationByteSTM: Java Software Transactional Memory at the Virtual Machine Level
ByteSTM: Java Software Transactional Memory at the Virtual Machine Level Mohamed Mohamedin Thesis submitted to the Faculty of the Virginia Polytechnic Institute and State University in partial fulfillment
More informationThe Semantics of Progress in Lock-Based Transactional Memory
The Semantics of Progress in Lock-Based Transactional Memory Rachid Guerraoui Michał Kapałka EPFL, Switzerland {rachid.guerraoui, michal.kapalka}@epfl.ch Abstract Transactional memory (TM) is a promising
More informationPerformance Improvement via Always-Abort HTM
1 Performance Improvement via Always-Abort HTM Joseph Izraelevitz* Lingxiang Xiang Michael L. Scott* *Department of Computer Science University of Rochester {jhi1,scott}@cs.rochester.edu Parallel Computing
More informationSTMBench7: A Benchmark for Software Transactional Memory
STMBench7: A Benchmark for Software Transactional Memory Rachid Guerraoui School of Computer and Communication Sciences, EPFL Michał Kapałka School of Computer and Communication Sciences, EPFL Jan Vitek
More information1 Introduction. Michał Kapałka EPFL, Switzerland Rachid Guerraoui EPFL, Switzerland
THE THEORY OF TRANSACTIONAL MEMORY Rachid Guerraoui EPFL, Switzerland rachid.guerraoui@epfl.ch Michał Kapałka EPFL, Switzerland michal.kapalka@epfl.ch Abstract Transactional memory (TM) is a promising
More informationImplementing and Evaluating Nested Parallel Transactions in STM. Woongki Baek, Nathan Bronson, Christos Kozyrakis, Kunle Olukotun Stanford University
Implementing and Evaluating Nested Parallel Transactions in STM Woongki Baek, Nathan Bronson, Christos Kozyrakis, Kunle Olukotun Stanford University Introduction // Parallelize the outer loop for(i=0;i
More informationTransactional Memory. How to do multiple things at once. Benjamin Engel Transactional Memory 1 / 28
Transactional Memory or How to do multiple things at once Benjamin Engel Transactional Memory 1 / 28 Transactional Memory: Architectural Support for Lock-Free Data Structures M. Herlihy, J. Eliot, and
More informationEnsuring Irrevocability in Wait-free Transactional Memory
Ensuring Irrevocability in Wait-free Transactional Memory Jan Kończak Institute of Computing Science, Poznań University of Technology Poznań, Poland jan.konczak@cs.put.edu.pl Paweł T. Wojciechowski Institute
More informationScalable Software Transactional Memory for Chapel High-Productivity Language
Scalable Software Transactional Memory for Chapel High-Productivity Language Srinivas Sridharan and Peter Kogge, U. Notre Dame Brad Chamberlain, Cray Inc Jeffrey Vetter, Future Technologies Group, ORNL
More informationSpeculative Synchronization
Speculative Synchronization José F. Martínez Department of Computer Science University of Illinois at Urbana-Champaign http://iacoma.cs.uiuc.edu/martinez Problem 1: Conservative Parallelization No parallelization
More informationLock vs. Lock-free Memory Project proposal
Lock vs. Lock-free Memory Project proposal Fahad Alduraibi Aws Ahmad Eman Elrifaei Electrical and Computer Engineering Southern Illinois University 1. Introduction The CPU performance development history
More informationLecture 21: Transactional Memory. Topics: Hardware TM basics, different implementations
Lecture 21: Transactional Memory Topics: Hardware TM basics, different implementations 1 Transactions New paradigm to simplify programming instead of lock-unlock, use transaction begin-end locks are blocking,
More informationReduced Hardware NOrec: A Safe and Scalable Hybrid Transactional Memory
Reduced Hardware NOrec: A Safe and Scalable Hybrid Transactional Memory Alexander Matveev MIT amatveev@csail.mit.edu Nir Shavit MIT shanir@csail.mit.edu Abstract Because of hardware TM limitations, software
More informationScheduling Transactions in Replicated Distributed Transactional Memory
Scheduling Transactions in Replicated Distributed Transactional Memory Junwhan Kim and Binoy Ravindran Virginia Tech USA {junwhan,binoy}@vt.edu CCGrid 2013 Concurrency control on chip multiprocessors significantly
More informationOpen Nesting in Software Transactional Memory. Paper by: Ni et al. Presented by: Matthew McGill
Open Nesting in Software Transactional Memory Paper by: Ni et al. Presented by: Matthew McGill Transactional Memory Transactional Memory is a concurrent programming abstraction intended to be as scalable
More informationSerializability of Transactions in Software Transactional Memory
Serializability of Transactions in Software Transactional Memory Utku Aydonat Tarek S. Abdelrahman Edward S. Rogers Sr. Department of Electrical and Computer Engineering University of Toronto {uaydonat,tsa}@eecg.toronto.edu
More information6 Transactional Memory. Robert Mullins
6 Transactional Memory ( MPhil Chip Multiprocessors (ACS Robert Mullins Overview Limitations of lock-based programming Transactional memory Programming with TM ( STM ) Software TM ( HTM ) Hardware TM 2
More information! Part I: Introduction. ! Part II: Obstruction-free STMs. ! DSTM: an obstruction-free STM design. ! FSTM: a lock-free STM design
genda Designing Transactional Memory ystems Part II: Obstruction-free TMs Pascal Felber University of Neuchatel Pascal.Felber@unine.ch! Part I: Introduction! Part II: Obstruction-free TMs! DTM: an obstruction-free
More informationTxFS: Leveraging File-System Crash Consistency to Provide ACID Transactions
TxFS: Leveraging File-System Crash Consistency to Provide ACID Transactions Yige Hu, Zhiting Zhu, Ian Neal, Youngjin Kwon, Tianyu Chen, Vijay Chidambaram, Emmett Witchel The University of Texas at Austin
More informationEXPLOITING SEMANTIC COMMUTATIVITY IN HARDWARE SPECULATION
EXPLOITING SEMANTIC COMMUTATIVITY IN HARDWARE SPECULATION GUOWEI ZHANG, VIRGINIA CHIU, DANIEL SANCHEZ MICRO 2016 Executive summary 2 Exploiting commutativity benefits update-heavy apps Software techniques
More informationTransac'onal Libraries Alexander Spiegelman *, Guy Golan-Gueta, and Idit Keidar * Technion Yahoo Research
Transac'onal Libraries Alexander Spiegelman *, Guy Golan-Gueta, and Idit Keidar * * Technion Yahoo Research 1 Mul'-Threading is Everywhere 2 Agenda Mo@va@on Concurrent Data Structure Libraries (CDSLs)
More informationLecture 7: Transactional Memory Intro. Topics: introduction to transactional memory, lazy implementation
Lecture 7: Transactional Memory Intro Topics: introduction to transactional memory, lazy implementation 1 Transactions New paradigm to simplify programming instead of lock-unlock, use transaction begin-end
More informationIntegrating Transactionally Boosted Data Structures with STM Frameworks: A Case Study on Set
Integrating Transactionally Boosted Data Structures with STM Frameworks: A Case Study on Set Ahmed Hassan Roberto Palmieri Binoy Ravindran Virginia Tech hassan84@vt.edu robertop@vt.edu binoy@vt.edu Abstract
More informationNON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY. Tim Harris, 28 November 2014
NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY Tim Harris, 28 November 2014 Lecture 8 Problems with locks Atomic blocks and composition Hardware transactional memory Software transactional memory
More informationSELF-TUNING HTM. Paolo Romano
SELF-TUNING HTM Paolo Romano 2 Based on ICAC 14 paper N. Diegues and Paolo Romano Self-Tuning Intel Transactional Synchronization Extensions 11 th USENIX International Conference on Autonomic Computing
More informationTransactional Memory. Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit
Transactional Memory Companion slides for The by Maurice Herlihy & Nir Shavit Our Vision for the Future In this course, we covered. Best practices New and clever ideas And common-sense observations. 2
More information) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons)
) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons) Goal A Distributed Transaction We want a transaction that involves multiple nodes Review of transactions and their properties
More information) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons)
) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons) Goal A Distributed Transaction We want a transaction that involves multiple nodes Review of transactions and their properties
More informationABORTING CONFLICTING TRANSACTIONS IN AN STM
Committing ABORTING CONFLICTING TRANSACTIONS IN AN STM PPOPP 09 2/17/2009 Hany Ramadan, Indrajit Roy, Emmett Witchel University of Texas at Austin Maurice Herlihy Brown University TM AND ITS DISCONTENTS
More informationTransactifying Apache s Cache Module
H. Eran O. Lutzky Z. Guz I. Keidar Department of Electrical Engineering Technion Israel Institute of Technology SYSTOR 2009 The Israeli Experimental Systems Conference Outline 1 Why legacy applications
More informationDatabase Management Systems Concurrency Control
atabase Management Systems Concurrency Control B M G 1 BMS Architecture SQL INSTRUCTION OPTIMIZER MANAGEMENT OF ACCESS METHOS CONCURRENCY CONTROL BUFFER MANAGER RELIABILITY MANAGEMENT Index Files ata Files
More informationOn Improving Transactional Memory: Optimistic Transactional Boosting, Remote Execution, and Hybrid Transactions
On Improving Transactional Memory: Optimistic Transactional Boosting, Remote Execution, and Hybrid Transactions Ahmed Hassan Preliminary Examination Proposal submitted to the Faculty of the Virginia Polytechnic
More informationSoftware transactional memory
Transactional locking II (Dice et. al, DISC'06) Time-based STM (Felber et. al, TPDS'08) Mentor: Johannes Schneider March 16 th, 2011 Motivation Multiprocessor systems Speed up time-sharing applications
More informationLecture 12 Transactional Memory
CSCI-UA.0480-010 Special Topics: Multicore Programming Lecture 12 Transactional Memory Christopher Mitchell, Ph.D. cmitchell@cs.nyu.edu http://z80.me Database Background Databases have successfully exploited
More informationLecture 21: Transactional Memory. Topics: consistency model recap, introduction to transactional memory
Lecture 21: Transactional Memory Topics: consistency model recap, introduction to transactional memory 1 Example Programs Initially, A = B = 0 P1 P2 A = 1 B = 1 if (B == 0) if (A == 0) critical section
More informationHow Live Can a Transactional Memory Be?
How Live Can a Transactional Memory Be? Rachid Guerraoui Michał Kapałka February 18, 2009 Abstract This paper asks how much liveness a transactional memory (TM) implementation can guarantee. We first devise
More informationMulti-Core Computing with Transactional Memory. Johannes Schneider and Prof. Roger Wattenhofer
Multi-Core Computing with Transactional Memory Johannes Schneider and Prof. Roger Wattenhofer 1 Overview Introduction Difficulties with parallel (multi-core) programming A (partial) solution: Transactional
More information10 Supporting Time-Based QoS Requirements in Software Transactional Memory
1 Supporting Time-Based QoS Requirements in Software Transactional Memory WALTHER MALDONADO, University of Neuchâtel, Switzerland PATRICK MARLIER, University of Neuchâtel, Switzerland PASCAL FELBER, University
More informationOn Exploring Markov Chains for Transaction Scheduling Optimization in Transactional Memory
WTTM 2015 7 th Workshop on the Theory of Transactional Memory On Exploring Markov Chains for Transaction Scheduling Optimization in Transactional Memory Pierangelo Di Sanzo, Marco Sannicandro, Bruno Ciciani,
More informationLogTM: Log-Based Transactional Memory
LogTM: Log-Based Transactional Memory Kevin E. Moore, Jayaram Bobba, Michelle J. Moravan, Mark D. Hill, & David A. Wood 12th International Symposium on High Performance Computer Architecture () 26 Mulitfacet
More informationUniversity of Karlsruhe (TH)
University of Karlsruhe (TH) Research University founded 1825 Technical Briefing Session Multicore Software Engineering @ ICSE 2009 Transactional Memory versus Locks - A Comparative Case Study Victor Pankratius
More informationAtomic Transac1ons. Atomic Transactions. Q1: What if network fails before deposit? Q2: What if sequence is interrupted by another sequence?
CPSC-4/6: Operang Systems Atomic Transactions The Transaction Model / Primitives Serializability Implementation Serialization Graphs 2-Phase Locking Optimistic Concurrency Control Transactional Memory
More informationSpeculative Synchronization: Applying Thread Level Speculation to Parallel Applications. University of Illinois
Speculative Synchronization: Applying Thread Level Speculation to Parallel Applications José éf. Martínez * and Josep Torrellas University of Illinois ASPLOS 2002 * Now at Cornell University Overview Allow
More informationAn Update on Haskell H/STM 1
An Update on Haskell H/STM 1 Ryan Yates and Michael L. Scott University of Rochester TRANSACT 10, 6-15-2015 1 This work was funded in part by the National Science Foundation under grants CCR-0963759, CCF-1116055,
More informationIntroduction to Data Management CSE 344
Introduction to Data Management CSE 344 Lecture 22: More Transaction Implementations 1 Review: Schedules, schedules, schedules The DBMS scheduler determines the order of operations from txns are executed
More informationUnderstanding Hardware Transactional Memory
Understanding Hardware Transactional Memory Gil Tene, CTO & co-founder, Azul Systems @giltene 2015 Azul Systems, Inc. Agenda Brief introduction What is Hardware Transactional Memory (HTM)? Cache coherence
More informationarxiv: v1 [cs.dc] 13 Aug 2013
Opacity of Memory Management in Software Transactional Memory Holger Machens and Volker Turau Institute of Telematics Hamburg University of Technology {machens,turau}@tuhh.de http://www.ti5.tuhh.de arxiv:1308.2881v1
More informationPotential violations of Serializability: Example 1
CSCE 6610:Advanced Computer Architecture Review New Amdahl s law A possible idea for a term project Explore my idea about changing frequency based on serial fraction to maintain fixed energy or keep same
More informationLock-based concurrency control. Q: What if access patterns rarely, if ever, conflict? Serializability. Concurrency Control II (OCC, MVCC)
Concurrency Control II (CC, MVCC) CS 418: Distributed Systems Lecture 18 Serializability Execution of a set of transactions over multiple items is equivalent to some serial execution of s Michael Freedman
More informationCoping With Context Switches in Lock-Based Software Transacional Memory Algorithms
TEL-AVIV UNIVERSITY RAYMOND AND BEVERLY SACKLER FACULTY OF EXACT SCIENCES SCHOOL OF COMPUTER SCIENCE Coping With Context Switches in Lock-Based Software Transacional Memory Algorithms Dissertation submitted
More informationHeckaton. SQL Server's Memory Optimized OLTP Engine
Heckaton SQL Server's Memory Optimized OLTP Engine Agenda Introduction to Hekaton Design Consideration High Level Architecture Storage and Indexing Query Processing Transaction Management Transaction Durability
More informationOn the Correctness of Transactional Memory
On the Correctness of Transactional Memory Rachid Guerraoui Michał Kapałka School of Computer and Communication Sciences, EPFL {rachid.guerraoui, michal.kapalka}@epfl.ch Abstract Transactional memory (TM)
More informationCommit Phase in Timestamp-based STM
Commit Phase in Timestamp-based STM Rui Zhang Dept. of Computer Science Rice University Houston, TX 77005, USA ruizhang@rice.edu Zoran Budimlić Dept. of Computer Science Rice University Houston, TX 77005,
More informationConcurrency in Transactional Memory
Concurrency in Transactional Memory Vincent Gramoli To cite this version: Vincent Gramoli. Concurrency in Transactional Memory. Distributed, Parallel, and Cluster Computing [cs.dc]. UPMC - Sorbonne University,
More informationA Causality-Based Runtime Check for (Rollback) Atomicity
A Causality-Based Runtime Check for (Rollback) Atomicity Serdar Tasiran Koc University Istanbul, Turkey Tayfun Elmas Koc University Istanbul, Turkey RV 2007 March 13, 2007 Outline This paper: Define rollback
More informationA Machine Learning Approach to Adaptive Software Transaction Memory
Lehigh University Lehigh Preserve Theses and Dissertations 2011 A Machine Learning Approach to Adaptive Software Transaction Memory Qingping Wang Lehigh University Follow this and additional works at:
More informationTransaction Management: Introduction (Chap. 16)
Transaction Management: Introduction (Chap. 16) CS634 Class 14 Slides based on Database Management Systems 3 rd ed, Ramakrishnan and Gehrke What are Transactions? So far, we looked at individual queries;
More informationNemo: NUMA-aware Concurrency Control for Scalable Transactional Memory
Nemo: NUMA-aware Concurrency Control for Scalable Transactional Memory Mohamed Mohamedin Virginia Tech Blacksburg, Virginia mohamedin@vt.edu Sebastiano Peluso Virginia Tech Blacksburg, Virginia peluso@vt.edu
More informationConcurrent programming: From theory to practice. Concurrent Algorithms 2015 Vasileios Trigonakis
oncurrent programming: From theory to practice oncurrent Algorithms 2015 Vasileios Trigonakis From theory to practice Theoretical (design) Practical (design) Practical (implementation) 2 From theory to
More informationSchedule. Today: Feb. 21 (TH) Feb. 28 (TH) Feb. 26 (T) Mar. 5 (T) Read Sections , Project Part 6 due.
Schedule Today: Feb. 21 (TH) Transactions, Authorization. Read Sections 8.6-8.7. Project Part 5 due. Feb. 26 (T) Datalog. Read Sections 10.1-10.2. Assignment 6 due. Feb. 28 (TH) Datalog and SQL Recursion,
More informationLecture: Transactional Memory, Networks. Topics: TM implementations, on-chip networks
Lecture: Transactional Memory, Networks Topics: TM implementations, on-chip networks 1 Summary of TM Benefits As easy to program as coarse-grain locks Performance similar to fine-grain locks Avoids deadlock
More informationA Dynamic Instrumentation Approach to Software Transactional Memory. Marek Olszewski
A Dynamic Instrumentation Approach to Software Transactional Memory by Marek Olszewski A thesis submitted in conformity with the requirements for the degree of Master of Applied Science Graduate Department
More information) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons)
) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons) Transactions - Definition A transaction is a sequence of data operations with the following properties: * A Atomic All
More informationHybrid Static-Dynamic Analysis for Statically Bounded Region Serializability
Hybrid Static-Dynamic Analysis for Statically Bounded Region Serializability Aritra Sengupta, Swarnendu Biswas, Minjia Zhang, Michael D. Bond and Milind Kulkarni ASPLOS 2015, ISTANBUL, TURKEY Programming
More informationA Relativistic Enhancement to Software Transactional Memory
A Relativistic Enhancement to Software Transactional Memory Philip W. Howard Portland State University Jonathan Walpole Portland State University Abstract Relativistic Programming is a technique that allows
More informationSummary: Open Questions:
Summary: The paper proposes an new parallelization technique, which provides dynamic runtime parallelization of loops from binary single-thread programs with minimal architectural change. The realization
More informationEnhancing efficiency of Hybrid Transactional Memory via Dynamic Data Partitioning Schemes
Enhancing efficiency of Hybrid Transactional Memory via Dynamic Data Partitioning Schemes Pedro Raminhas Instituto Superior Técnico, Universidade de Lisboa Lisbon, Portugal Email: pedro.raminhas@tecnico.ulisboa.pt
More informationLecture 12: Hardware/Software Trade-Offs. Topics: COMA, Software Virtual Memory
Lecture 12: Hardware/Software Trade-Offs Topics: COMA, Software Virtual Memory 1 Capacity Limitations P P P P B1 C C B1 C C Mem Coherence Monitor Mem Coherence Monitor B2 In a Sequent NUMA-Q design above,
More informationPerformance Tradeoffs in Software Transactional Memory
Master Thesis Computer Science Thesis no: MCS-2010-28 May 2010 Performance Tradeoffs in Software Transactional Memory Gulfam Abbas Naveed Asif School School of Computing of Computing Blekinge Blekinge
More informationMonitors; Software Transactional Memory
Monitors; Software Transactional Memory Parallel and Distributed Computing Department of Computer Science and Engineering (DEI) Instituto Superior Técnico October 18, 2012 CPD (DEI / IST) Parallel and
More informationConcurrency control CS 417. Distributed Systems CS 417
Concurrency control CS 417 Distributed Systems CS 417 1 Schedules Transactions must have scheduled so that data is serially equivalent Use mutual exclusion to ensure that only one transaction executes
More informationProblems Caused by Failures
Problems Caused by Failures Update all account balances at a bank branch. Accounts(Anum, CId, BranchId, Balance) Update Accounts Set Balance = Balance * 1.05 Where BranchId = 12345 Partial Updates - Lack
More informationWhat are Transactions? Transaction Management: Introduction (Chap. 16) Major Example: the web app. Concurrent Execution. Web app in execution (CS636)
What are Transactions? Transaction Management: Introduction (Chap. 16) CS634 Class 14, Mar. 23, 2016 So far, we looked at individual queries; in practice, a task consists of a sequence of actions E.g.,
More informationSummary: Issues / Open Questions:
Summary: The paper introduces Transitional Locking II (TL2), a Software Transactional Memory (STM) algorithm, which tries to overcomes most of the safety and performance issues of former STM implementations.
More informationLast Class Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications
Last Class Carnegie Mellon Univ. Dept. of Computer Science 15-415/615 - DB Applications C. Faloutsos A. Pavlo Lecture#23: Concurrency Control Part 2 (R&G ch. 17) Serializability Two-Phase Locking Deadlocks
More informationCS377P Programming for Performance Multicore Performance Synchronization
CS377P Programming for Performance Multicore Performance Synchronization Sreepathi Pai UTCS October 21, 2015 Outline 1 Synchronization Primitives 2 Blocking, Lock-free and Wait-free Algorithms 3 Transactional
More informationTMUNIT: Testing Software Transactional Memories
TMUNIT: Testing Software Transactional Memories Derin Harmanci Pascal Felber University of Neuchâtel Switzerland {derin.harmanci,pascal.felber}@unine.ch Vincent Gramoli EPFL & University of Neuchâtel Switzerland
More informationExtracting Parallelism from Legacy Sequential Code Using Software Transactional Memory
Extracting Parallelism from Legacy Sequential Code Using Software Transactional Memory Mohamed M. Saad Preliminary Examination Proposal submitted to the Faculty of the Virginia Polytechnic Institute and
More information