Hybrid Static-Dynamic Analysis for Statically Bounded Region Serializability


1 Hybrid Static-Dynamic Analysis for Statically Bounded Region Serializability Aritra Sengupta, Swarnendu Biswas, Minjia Zhang, Michael D. Bond and Milind Kulkarni ASPLOS 2015, ISTANBUL, TURKEY

2 Programming Language Semantics?
Data races:
C++: no guarantee of semantics ("catch-fire" semantics)
Java: provides weak semantics

3 Weak Semantics
Initially: A a = null; boolean init = false;
T1: a = new A(); init = true;
T2: if (init) a.field++;

4 Weak Semantics
T1: a = new A(); init = true;   (no data dependence between T1's two stores)
T2: if (init) a.field++;

6 Weak Semantics
T1's stores reordered; statements in execution order:
T1: init = true;
T2: if (init) a.field++;
T1: a = new A();

7 Weak Semantics
Same execution: T2's a.field++ is a null dereference, because a = new A() has not yet executed.
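
The reordered execution above can be replayed deterministically. The following is a minimal sketch under stated assumptions, not code from the talk: the class `WeakSemanticsSketch`, the nested class `A`, and the method `replayReordered` are hypothetical names, and the reordering is simulated on a single thread by running T1's two stores in swapped order instead of relying on a real compiler or processor to reorder them.

```java
// Hypothetical single-threaded replay of the reordered execution; not from the talk.
final class WeakSemanticsSketch {
    static final class A {
        int field;
    }

    static A a = null;
    static boolean init = false;

    // T2's code: treats init == true as proof that a has been assigned.
    static void reader() {
        if (init) {
            a.field++;
        }
    }

    // Replays the slide's interleaving: T1's stores are swapped and T2 runs between them.
    static boolean replayReordered() {
        a = null;
        init = false;
        init = true;                 // T1: second store in program order, executed first
        boolean sawNullDereference = false;
        try {
            reader();                // T2: observes init == true while a is still null
        } catch (NullPointerException e) {
            sawNullDereference = true;
        }
        a = new A();                 // T1: first store in program order, executed last
        return sawNullDereference;
    }

    public static void main(String[] args) {
        System.out.println("null dereference observed: " + replayReordered());
    }
}
```

Because the swap is hard-coded, the replay always reports the null dereference; on a real JVM the same outcome is only possible, not guaranteed.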

8 DRF0
Atomicity of synchronization-free regions for data-race-free programs
Data races: no semantics
C++ and Java follow variants of DRF0
Adve and Hill, ISCA 1990

9 Need for Stronger Memory Models
"The inability to define reasonable semantics for programs with data races is not just a theoretical shortcoming, but a fundamental hole in the foundation of our languages and systems" (Adve and Boehm, CACM 2010)
Give better semantics to programs with data races: stronger memory models

10 Sequential Consistency (SC)
Shared memory accesses interleave arbitrarily while each thread maintains program order.

11 Sequential Consistency
[Diagram: DRF0 and Sequential Consistency]

12 An Example Program Under SC
Initially: int pos = 0; int[] buffer = {0, 0}
T1: buffer[pos++] = 5
T2: buffer[pos++] = 6

13 An Example Program Under SC
Initially: int pos = 0; int[] buffer = {0, 0}
T1: t1 = pos; buffer[t1] = 5; t1 = t1 + 1; pos = t1
T2: t2 = pos; buffer[t2] = 6; t2 = t2 + 1; pos = t2

15 An Example Program Under SC
Initially: int pos = 0; int[] buffer = {0, 0}
One interleaving, in execution order:
T1: t1 = pos
T1: buffer[t1] = 5
T1: t1 = t1 + 1
T2: t2 = pos
T2: buffer[t2] = 6
T1: pos = t1
T2: t2 = t2 + 1
T2: pos = t2
Result: pos == 1, buffer = {6, 0}

16 An Example Program Under SC
Initially: int pos = 0; int[] buffer = {0, 0}
T1: buffer[pos++] = 5
T2: buffer[pos++] = 6
Outcomes shown: pos == 1 with buffer = {6, 0}; pos == 2

17 An Example Program Under SC
Initially: int pos = 0; int[] buffer = {0, 0}
T1: buffer[pos++] = 5
T2: buffer[pos++] = 6
pos == 1, buffer = {6, 0} is an SC execution.
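
The lost update can be reproduced by replaying that interleaving on one thread. This is a sketch only: the class `ScInterleavingSketch` and method `replay` are hypothetical names, and the statement order is hard-coded to match the interleaving on the slides rather than produced by real threads.

```java
import java.util.Arrays;

// Hypothetical replay of one sequentially consistent interleaving; not from the talk.
final class ScInterleavingSketch {
    static int pos;
    static int[] buffer;

    // Interleaves the expanded forms of
    // T1: buffer[pos++] = 5 and T2: buffer[pos++] = 6.
    static void replay() {
        pos = 0;
        buffer = new int[] {0, 0};
        int t1;
        int t2;
        t1 = pos;          // T1
        buffer[t1] = 5;    // T1
        t1 = t1 + 1;       // T1
        t2 = pos;          // T2: still reads pos == 0
        buffer[t2] = 6;    // T2: overwrites T1's element
        pos = t1;          // T1
        t2 = t2 + 1;       // T2
        pos = t2;          // T2: pos ends at 1, not 2
    }

    public static void main(String[] args) {
        replay();
        // prints "pos = 1, buffer = [6, 0]"
        System.out.println("pos = " + pos + ", buffer = " + Arrays.toString(buffer));
    }
}
```

Each thread's statements stay in program order, so the execution is sequentially consistent even though one element is lost.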

18 Programmer Assumption
Atomicity of high-level operations

19 Can SC Eliminate Common Concurrency Bugs?
"...programmers do not reason about correctness of parallel code in terms of interleavings of individual memory accesses" (Adve and Boehm, CACM 2010)
SC does not prevent common concurrency bugs.
Data races are dangerous even under SC.

20 Run-time cost vs Strength
[Plot: run-time cost (vertical axis) against strength (horizontal axis). Points: DRF0, SC, synchronization-free region serializability [1]]
1. Ouyang et al. ...and region serializability for all. In HotPar, 2013.

22 Run-time cost vs Strength
[Plot: run-time cost (vertical axis) against strength (horizontal axis). Point added: Statically Bounded Region Serializability]

23 Contribution
EnfoRSer: an analysis to enforce SBRS practically
Evaluation: low run-time cost, eliminates real bugs
[Plot: run-time cost against strength, with Statically Bounded Region Serializability marked]

24 New Memory Model: Statically Bounded Region Serializability (SBRS)

25 Program Execution Behaviors
[Diagram: DRF0, SC, Statically Bounded Region Serializability]

26 Statically Bounded Region Serializability (SBRS)
Region boundaries: synchronization operations, method calls, loop backedges
Example: acq(lock); methodcall(); rel(lock)

28 Statically Bounded Region Serializability (SBRS)
Regions are statically and dynamically bounded, because loop backedges end regions.
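
To make the boundary rules concrete, here is an annotated sketch of where region boundaries would fall in ordinary Java code, assuming the three boundary kinds listed on the slides. The class `RegionBoundarySketch` and its methods are hypothetical, and the comments describe the SBRS definition rather than anything the JVM does for this code.

```java
// Hypothetical code annotated with SBRS region boundaries; not from the talk.
final class RegionBoundarySketch {
    private static final Object lock = new Object();
    static int total = 0;

    static int helper(int v) {
        return v * 2;
    }

    static int sumDoubled(int[] values) {
        int sum = 0;
        for (int i = 0; i < values.length; i++) {
            int v = values[i];        // region: accesses up to the next boundary
            int d = helper(v);        // method call: region boundary
            sum += d;
        }                             // loop backedge: region boundary
        synchronized (lock) {         // acq(lock): region boundary
            total += sum;             // read-modify-write of total inside one region
        }                             // rel(lock): region boundary
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumDoubled(new int[] {1, 2, 3}));   // prints 12
    }
}
```

No region contains a call or a whole loop, which is what keeps every region bounded in size.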

29 Under SBRS
Initially: pos = 0; buffer = {0, 0}
T1: buffer[pos++] = 5
T2: buffer[pos++] = 6

30 Under SBRS
Initially: pos = 0; buffer = {0, 0}
T1: t1 = pos; buffer[t1] = 5; t1 = t1 + 1; pos = t1
T2: t2 = pos; buffer[t2] = 6; t2 = t2 + 1; pos = t2

31 Under SBRS
Initially: pos = 0; buffer = {0, 0}
T1's region executes first, then T2's:
T1: t1 = pos; buffer[t1] = 5; t1 = t1 + 1; pos = t1
T2: t2 = pos; buffer[t2] = 6; t2 = t2 + 1; pos = t2
Result: pos == 2, buffer = {5, 6}

32 Under SBRS
Initially: pos = 0; buffer = {0, 0}
T2's region executes first, then T1's:
T2: t2 = pos; buffer[t2] = 6; t2 = t2 + 1; pos = t2
T1: t1 = pos; buffer[t1] = 5; t1 = t1 + 1; pos = t1
Result: pos == 2, buffer = {6, 5}
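
The two serialized outcomes can be observed with real threads if the region's atomicity is emulated. In the sketch below a single global lock stands in for SBRS; this is an assumption made for illustration and is not how EnfoRSer enforces regions. The names `SbrsEmulationSketch`, `append`, and `runOnce` are hypothetical.

```java
import java.util.Arrays;

// Hypothetical emulation of an atomic region with a global lock; not from the talk.
final class SbrsEmulationSketch {
    static int pos;
    static final int[] buffer = new int[2];
    private static final Object regionLock = new Object();

    // The whole statement executes as one atomic region.
    static void append(int val) {
        synchronized (regionLock) {
            buffer[pos++] = val;
        }
    }

    // Runs T1 and T2 concurrently and waits for both to finish.
    static void runOnce() {
        pos = 0;
        Arrays.fill(buffer, 0);
        Thread t1 = new Thread(() -> append(5));
        Thread t2 = new Thread(() -> append(6));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        runOnce();
        // Either [5, 6] or [6, 5], depending on which region runs first; pos is always 2.
        System.out.println("pos = " + pos + ", buffer = " + Arrays.toString(buffer));
    }
}
```

Which of the two orders appears depends on thread scheduling, but the lost-update result pos == 1 cannot occur.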

33 Bug Elimination
x += 42;   (read-modify-write)
if (o != null) { ... = o.f; }   (check before use)
buffer[pos++] = val;   (multi-variable operation)

34 EnfoRSer: A Hybrid Static-Dynamic Analysis to Enforce SBRS

35 EnfoRSer, an efficient enforcement of SBRS
Compiler transformations
Runtime enforcement
Two-phase locking

36 Basic Mechanism
Region: Y = ...; ... = X; Y = ...

37 Basic Mechanism
[Write lock] Y = ...; [Read lock] ... = X; Y = ...

38 Basic Mechanism
Y = ...; ... = X; Y = ...

39 Basic Mechanism
Y = ...; [Program access] ... = X; Y = ...

40 Basic Mechanism
Y = ...; ... = X; [Ownership transferred] Y = ...

41 Basic Mechanism
X = ...; ... = X; Y = ...

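42 Basic Mechanism
T1: Y = ...; ... = X; Y = ...
T2: X = ...; ... = Y; Z = ...
Conflict at the second accesses (T1: ... = X, T2: ... = Y)

43 Basic Mechanism
T1: Y = ...; ... = X; Y = ...
T2: X = ...; ... = Y; Z = ...
Deadlock at the second accesses (T1: ... = X, T2: ... = Y)

44 Basic Mechanism
T1: Y = ...; ... = X; Y = ...
T2: X = ...; ... = Y; Z = ...
Deadlock is avoided with lightweight reader-writer locks [2]: biased synchronization; a thread can lose ownership while acquiring locks.
2. Bond et al. Octet: Capturing and Controlling Cross-Thread Dependences Efficiently. In OOPSLA, 2013.

45 Basic Mechanism
T1: Y = ...; ... = X; Y = ...
T2: X = ...; ... = Y; Z = ...
Ownership transferred; two-phase locking violated

46 Basic Mechanism
T1: Y = ...; ... = X; Y = ...
T2: X = ...; ... = Y; Z = ...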

47 Basic Mechanism
Y = ...; ... = X; [Locks acquired] Y = ...

48 Challenges in Basic Mechanism
[Stores executed] Y = ...; ... = X; Y = ...

49 EnfoRSer Atomicity Transformations
Idempotent: defer stores until all locks are acquired
Speculation: execute stores speculatively and roll back in case of a conflict

50 Idempotent Transformation
Region: X = ...; ... = Y; X = ...

51 Idempotence Transformation
X = ...; ... = Y; X = ...

52 Idempotence Transformation
... = Y; [Stores deferred] X = ...; X = ...

53 Idempotence Mechanism
[Conflict] ... = Y; X = ...; X = ...

54 Idempotence Mechanism
[Side-effect free] ... = Y; X = ...; X = ...

55 Idempotence Mechanism
... = Y; [Execute stores] X = ...; X = ...
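
The idempotent approach can be sketched as a retry loop: lock acquires and loads come first, a conflict restarts the region without any cleanup, and stores execute only once every lock is held. Everything here is an illustrative assumption rather than EnfoRSer's generated code: `acquire` is a hypothetical stand-in for a per-object lock acquire, `conflictsOnY` is a test knob that injects conflicts, and lock release is not modeled.

```java
// Hypothetical sketch of deferring stores until all locks are acquired; not from the talk.
final class IdempotentRegionSketch {
    static int x = 0;
    static int y = 7;
    static int conflictsOnY = 0;   // hypothetical knob: how many acquires of y's lock fail
    static int restarts = 0;

    // Stand-in for a lock acquire; returns false to model a conflict with another thread.
    static boolean acquire(String lockName) {
        if (lockName.equals("y:read") && conflictsOnY > 0) {
            conflictsOnY--;
            return false;
        }
        return true;
    }

    // Original region:  x = 1;  t = y;  x = t + 1;
    static void region() {
        while (true) {
            // Phase 1: acquire locks and execute only loads and computation.
            if (!acquire("x:write")) { restarts++; continue; }
            int first = 1;                 // value of the first store, held back
            if (!acquire("y:read")) { restarts++; continue; }   // nothing to undo
            int t = y;                     // load
            int second = t + 1;            // value of the second store, held back
            // Phase 2: every lock is held, so the deferred stores execute.
            x = first;
            x = second;
            return;
        }
    }

    public static void main(String[] args) {
        conflictsOnY = 1;
        region();
        System.out.println("x = " + x + ", restarts = " + restarts);   // prints "x = 8, restarts = 1"
    }
}
```

Restarting is safe here because phase 1 has no side effects on shared state, which is the property the slides label "side-effect free".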

56 Idempotence Challenges
X = ...; ... = Y; ... = X; X = ...

57 Idempotence Challenges
Loads data dependent on stores
... = Y; ... = X; X = ...

58 Idempotence Challenges
Aliasing between loads and stores: data dependence?
... = Y; ... = Z; X = ...

59 Speculation Transformation
Region: X = ...; ... = Y; X = ...

60 Speculation Transformation
Back up store values: old_x = X; X = ...; ... = Y; old_z = Z; Z = ...
Generate roll-back code: Z = old_z; X = old_x

61 Speculation Mechanism
old_x = X; X = ...; [Conflict] ... = Y; old_z = Z; Z = ...
Execute roll-back code: Z = old_z; X = old_x

62 Speculation Mechanism
old_x = X; X = ...; [Conflict] ... = Y; old_z = Z; Z = ...
Roll-back code: Z = old_z; X = old_x

63 Speculation Mechanism
old_x = X; X = ...; ... = Y; [All locks acquired] old_z = Z; Z = ...
Roll-back code: Z = old_z; X = old_x
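
The speculation approach can be sketched the same way: back up the old value, execute the store in place, and restore the backup if a later lock acquire reports a conflict. As before this is an assumption-laden illustration, not EnfoRSer's generated code: `acquire` and `conflictsOnY` are hypothetical, lock release is not modeled, and `xSeenAfterRollback` exists only so the roll-back can be observed.

```java
// Hypothetical sketch of speculative stores with roll-back; not from the talk.
final class SpeculationRegionSketch {
    static int x = 3;
    static int y = 7;
    static int z = 4;
    static int conflictsOnY = 0;          // hypothetical knob: how many acquires of y's lock fail
    static int rollbacks = 0;
    static int xSeenAfterRollback = -1;   // records x right after a roll-back

    // Stand-in for a lock acquire; returns false to model a conflict with another thread.
    static boolean acquire(String lockName) {
        if (lockName.equals("y:read") && conflictsOnY > 0) {
            conflictsOnY--;
            return false;
        }
        return true;
    }

    // Original region:  x = 10;  t = y;  z = t + 1;
    static void region() {
        while (true) {
            if (!acquire("x:write")) { continue; }   // no store executed yet, so just retry
            int oldX = x;                 // back up the value the store overwrites
            x = 10;                       // speculative store
            if (!acquire("y:read")) {
                x = oldX;                 // roll back the only store executed so far
                rollbacks++;
                xSeenAfterRollback = x;
                continue;
            }
            int t = y;                    // load
            if (!acquire("z:write")) {
                x = oldX;
                rollbacks++;
                continue;
            }
            // Last store: no acquire follows it in this sketch, so it is never undone.
            z = t + 1;
            return;
        }
    }

    public static void main(String[] args) {
        conflictsOnY = 1;
        region();
        // prints "x = 10, z = 8, rollbacks = 1"
        System.out.println("x = " + x + ", z = " + z + ", rollbacks = " + rollbacks);
    }
}
```

Each conflict path undoes only the stores that have already executed at that point, which is the concern the "Speculation Challenges" slides raise.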

64 Speculation Challenges
Region: Y = ...; X = ...; ... = Z

65 Speculation Challenges
Y = ... [Executed store]; ... = Z

66 Speculation Challenges
Y = ... [Executed store]; [Conflict] ... = Z
Roll-back code: Y = old_y; X = old_x

67 Similar to Software Transactional Memory (STM)?
Idempotent approach: similar to STMs that defer stores until a transaction commits
Speculation approach: similar to STMs that execute stores but undo them if a transaction needs to abort

68 Similar to Software Transactional Memory (STM)?
Idempotent approach: similar to STMs that defer stores until a transaction commits
Speculation approach: similar to STMs that undo stores before a transaction aborts
EnfoRSer provides atomicity of statically bounded regions more efficiently!

69 Similar to Software Transactional Memory (STM)?
EnfoRSer provides atomicity of statically bounded regions more efficiently!
Bounded regions: efficient code generation
Short regions: conservative conflict detection

70 Implementation and Evaluation

71 Implementation
Developed in Jikes RVM
Code publicly available on the Jikes RVM Research Archive

72 Experimental Methodology
Benchmarks: DaCapo 2006 and 9.12-bach; fixed-workload versions of SPECjbb2000 and SPECjbb2005
Platform: AMD Opteron system, 32 cores

73 Whole-Program Static Analysis
Remove instrumentation from data-race-free accesses [Naik et al.'s 2006 race detection algorithm, Chord]

74 EnfoRSer: Run-time Performance
[Bar chart: % overhead over unmodified JVM. Series: Idempotent. Caption: % overhead on average]

75 EnfoRSer: Run-time Performance
[Bar chart: % overhead over unmodified JVM. Series: Idempotent, Speculation. Caption: % overhead on average]

76 EnfoRSer: Run-time Performance
[Bar chart: % overhead over unmodified JVM. Series: Idempotent, Speculation. Annotation: pjbb2005, over 100% overhead (Cao et al., WoDet 2014)]

77 EnfoRSer: Run-time Performance
[Bar chart: % overhead over unmodified JVM. Series: Idempotent, Speculation, Speculation via log]
53% overhead on average (speculation via log)

78 EnfoRSer: Run-time Performance with and without static race detection
[Bar chart: % overhead over unmodified JVM. Series: Idempotent, Speculation, Speculation via log. Bars: geomean with race detection, geomean without race detection]

79 Evaluation: Concurrency Error Avoidance
SBRS's potential to eliminate concurrency bugs exposed on relaxed memory models

80 Avoiding Concurrency Errors

Program     DRF0 (AM)               SC                 SBRS
hsqldb6     Infinite loop           Correct            Correct
sunflow9    Null pointer exception  Correct            Correct
jbb2000     Corrupt output          Corrupt output     Correct
jbb2000     Infinite loop           Correct            Correct
sor         Infinite loop           Correct            Correct
lufact      Infinite loop           Correct            Correct
moldyn      Infinite loop           Correct            Correct
raytracer   Fails validation        Fails validation   Correct

AM = Adversarial Memory, Flanagan and Freund, PLDI 2010

82 Avoiding Concurrency Errors
SBRS avoids all the errors exposed by AM (AM = Adversarial Memory, Flanagan and Freund, PLDI 2010).

83 Related Work
Checks conflicts in bounded regions: DRFx, Marino et al., PLDI 2010
Checks conflicts in synchronization-free regions: Conflict Exceptions, Lucia et al., ISCA 2010
Enforces atomicity of bounded regions: Bulk Compiler, Ahn et al., MICRO 2009
Enforces atomicity of synchronization-free regions: ...and region serializability for all, Ouyang et al., HotPar 2013
Drawbacks noted: requires customized hardware; requires additional cores

84 Conclusion
EnfoRSer: an analysis to enforce SBRS practically
Evaluation: low run-time cost, eliminates real bugs
[Plot: run-time cost against strength, with Statically Bounded Region Serializability marked]

Toward Efficient Strong Memory Model Support for the Java Platform via Hybrid Synchronization

Toward Efficient Strong Memory Model Support for the Java Platform via Hybrid Synchronization Toward Efficient Strong Memory Model Support for the Java Platform via Hybrid Synchronization Aritra Sengupta, Man Cao, Michael D. Bond and Milind Kulkarni 1 PPPJ 2015, Melbourne, Florida, USA Programming

More information

Legato: Bounded Region Serializability Using Commodity Hardware Transactional Memory. Aritra Sengupta Man Cao Michael D. Bond and Milind Kulkarni

Legato: Bounded Region Serializability Using Commodity Hardware Transactional Memory. Aritra Sengupta Man Cao Michael D. Bond and Milind Kulkarni Legato: Bounded Region Serializability Using Commodity Hardware al Memory Aritra Sengupta Man Cao Michael D. Bond and Milind Kulkarni 1 Programming Language Semantics? Data Races C++ no guarantee of semantics

More information

Lightweight Data Race Detection for Production Runs

Lightweight Data Race Detection for Production Runs Lightweight Data Race Detection for Production Runs Swarnendu Biswas, UT Austin Man Cao, Ohio State University Minjia Zhang, Microsoft Research Michael D. Bond, Ohio State University Benjamin P. Wood,

More information

Toward Efficient Strong Memory Model Support for the Java Platform via Hybrid Synchronization

Toward Efficient Strong Memory Model Support for the Java Platform via Hybrid Synchronization Toward Efficient Strong Memory Model Support for the Java Platform via Hybrid Synchronization Aritra Sengupta Man Cao Michael D. Bond Milind Kulkarni Ohio State University Purdue University {sengupta,caoma,mikebond}@cse.ohio-state.edu

More information

DoubleChecker: Efficient Sound and Precise Atomicity Checking

DoubleChecker: Efficient Sound and Precise Atomicity Checking DoubleChecker: Efficient Sound and Precise Atomicity Checking Swarnendu Biswas, Jipeng Huang, Aritra Sengupta, and Michael D. Bond The Ohio State University PLDI 2014 Impact of Concurrency Bugs Impact

More information

Hybrid Static Dynamic Analysis for Statically Bounded Region Serializability

Hybrid Static Dynamic Analysis for Statically Bounded Region Serializability Hybrid Static Dynamic Analysis for Statically Bounded Region Serializability Aritra Sengupta Swarnendu Biswas Minjia Zhang Michael D. Bond Milind Kulkarni Ohio State University Purdue University {sengupta,biswass,zhanminj,mikebond@cse.ohio-state.edu

More information

Hybrid Static Dynamic Analysis for Region Serializability

Hybrid Static Dynamic Analysis for Region Serializability Hybrid Static Dynamic Analysis for Region Serializability Aritra Sengupta Swarnendu Biswas Minjia Zhang Michael D. Bond Milind Kulkarni + Ohio State University + Purdue University {sengupta,biswass,zhanminj,mikebond@cse.ohio-state.edu,

More information

FastTrack: Efficient and Precise Dynamic Race Detection (+ identifying destructive races)

FastTrack: Efficient and Precise Dynamic Race Detection (+ identifying destructive races) FastTrack: Efficient and Precise Dynamic Race Detection (+ identifying destructive races) Cormac Flanagan UC Santa Cruz Stephen Freund Williams College Multithreading and Multicore! Multithreaded programming

More information

11/19/2013. Imperative programs

11/19/2013. Imperative programs if (flag) 1 2 From my perspective, parallelism is the biggest challenge since high level programming languages. It s the biggest thing in 50 years because industry is betting its future that parallel programming

More information

Legato: End-to-End Bounded Region Serializability Using Commodity Hardware Transactional Memory

Legato: End-to-End Bounded Region Serializability Using Commodity Hardware Transactional Memory Legato: End-to-End Bounded Region Serializability Using Commodity Hardware Transactional Memory Aritra Sengupta Man Cao Michael D. Bond Milind Kulkarni Ohio State University (USA) Purdue University (USA)

More information

Valor: Efficient, Software-Only Region Conflict Exceptions

Valor: Efficient, Software-Only Region Conflict Exceptions Valor: Efficient, Software-Only Region Conflict Exceptions Swarnendu Biswas PhD Student, Ohio State University biswass@cse.ohio-state.edu Abstract Data races complicate programming language semantics,

More information

Atomicity for Reliable Concurrent Software. Race Conditions. Motivations for Atomicity. Race Conditions. Race Conditions. 1. Beyond Race Conditions

Atomicity for Reliable Concurrent Software. Race Conditions. Motivations for Atomicity. Race Conditions. Race Conditions. 1. Beyond Race Conditions Towards Reliable Multithreaded Software Atomicity for Reliable Concurrent Software Cormac Flanagan UC Santa Cruz Joint work with Stephen Freund Shaz Qadeer Microsoft Research 1 Multithreaded software increasingly

More information

Avoiding Consistency Exceptions Under Strong Memory Models

Avoiding Consistency Exceptions Under Strong Memory Models Avoiding Consistency Exceptions Under Strong Memory Models Minjia Zhang Microsoft Research (USA) minjiaz@microsoft.com Swarnendu Biswas University of Texas at Austin (USA) sbiswas@ices.utexas.edu Michael

More information

Valor: Efficient, Software-Only Region Conflict Exceptions

Valor: Efficient, Software-Only Region Conflict Exceptions to Reuse * * Evaluated * OOPSLA * Artifact * AEC Valor: Efficient, Software-Only Region Conflict Exceptions Swarnendu Biswas Minjia Zhang Michael D. Bond Brandon Lucia Ohio State University (USA) Carnegie

More information

The Common Case Transactional Behavior of Multithreaded Programs

The Common Case Transactional Behavior of Multithreaded Programs The Common Case Transactional Behavior of Multithreaded Programs JaeWoong Chung Hassan Chafi,, Chi Cao Minh, Austen McDonald, Brian D. Carlstrom, Christos Kozyrakis, Kunle Olukotun Computer Systems Lab

More information

EECS 570 Lecture 13. Directory & Optimizations. Winter 2018 Prof. Satish Narayanasamy

EECS 570 Lecture 13. Directory & Optimizations. Winter 2018 Prof. Satish Narayanasamy Directory & Optimizations Winter 2018 Prof. Satish Narayanasamy http://www.eecs.umich.edu/courses/eecs570/ Slides developed in part by Profs. Adve, Falsafi, Hill, Lebeck, Martin, Narayanasamy, Nowatzyk,

More information

CalFuzzer: An Extensible Active Testing Framework for Concurrent Programs Pallavi Joshi 1, Mayur Naik 2, Chang-Seo Park 1, and Koushik Sen 1

CalFuzzer: An Extensible Active Testing Framework for Concurrent Programs Pallavi Joshi 1, Mayur Naik 2, Chang-Seo Park 1, and Koushik Sen 1 CalFuzzer: An Extensible Active Testing Framework for Concurrent Programs Pallavi Joshi 1, Mayur Naik 2, Chang-Seo Park 1, and Koushik Sen 1 1 University of California, Berkeley, USA {pallavi,parkcs,ksen}@eecs.berkeley.edu

More information

Process Synchronization. Mehdi Kargahi School of ECE University of Tehran Spring 2008

Process Synchronization. Mehdi Kargahi School of ECE University of Tehran Spring 2008 Process Synchronization Mehdi Kargahi School of ECE University of Tehran Spring 2008 Producer-Consumer (Bounded Buffer) Producer Consumer Race Condition Producer Consumer Critical Sections Structure of

More information

Chí Cao Minh 28 May 2008

Chí Cao Minh 28 May 2008 Chí Cao Minh 28 May 2008 Uniprocessor systems hitting limits Design complexity overwhelming Power consumption increasing dramatically Instruction-level parallelism exhausted Solution is multiprocessor

More information

A Causality-Based Runtime Check for (Rollback) Atomicity

A Causality-Based Runtime Check for (Rollback) Atomicity A Causality-Based Runtime Check for (Rollback) Atomicity Serdar Tasiran Koc University Istanbul, Turkey Tayfun Elmas Koc University Istanbul, Turkey RV 2007 March 13, 2007 Outline This paper: Define rollback

More information

The Java Memory Model

The Java Memory Model Jeremy Manson 1, William Pugh 1, and Sarita Adve 2 1 University of Maryland 2 University of Illinois at Urbana-Champaign Presented by John Fisher-Ogden November 22, 2005 Outline Introduction Sequential

More information

Towards Parallel, Scalable VM Services

Towards Parallel, Scalable VM Services Towards Parallel, Scalable VM Services Kathryn S McKinley The University of Texas at Austin Kathryn McKinley Towards Parallel, Scalable VM Services 1 20 th Century Simplistic Hardware View Faster Processors

More information

Motivation & examples Threads, shared memory, & synchronization

Motivation & examples Threads, shared memory, & synchronization 1 Motivation & examples Threads, shared memory, & synchronization How do locks work? Data races (a lower level property) How do data race detectors work? Atomicity (a higher level property) Concurrency

More information

Designing Memory Consistency Models for. Shared-Memory Multiprocessors. Sarita V. Adve

Designing Memory Consistency Models for. Shared-Memory Multiprocessors. Sarita V. Adve Designing Memory Consistency Models for Shared-Memory Multiprocessors Sarita V. Adve Computer Sciences Department University of Wisconsin-Madison The Big Picture Assumptions Parallel processing important

More information

Lecture 21: Transactional Memory. Topics: consistency model recap, introduction to transactional memory

Lecture 21: Transactional Memory. Topics: consistency model recap, introduction to transactional memory Lecture 21: Transactional Memory Topics: consistency model recap, introduction to transactional memory 1 Example Programs Initially, A = B = 0 P1 P2 A = 1 B = 1 if (B == 0) if (A == 0) critical section

More information

Lightweight Data Race Detection for Production Runs

Lightweight Data Race Detection for Production Runs Lightweight Data Race Detection for Production Runs Ohio State CSE Technical Report #OSU-CISRC-1/15-TR01; last updated August 2016 Swarnendu Biswas Ohio State University biswass@cse.ohio-state.edu Michael

More information

Atomicity. Experience with Calvin. Tools for Checking Atomicity. Tools for Checking Atomicity. Tools for Checking Atomicity

Atomicity. Experience with Calvin. Tools for Checking Atomicity. Tools for Checking Atomicity. Tools for Checking Atomicity 1 2 Atomicity The method inc() is atomic if concurrent ths do not interfere with its behavior Guarantees that for every execution acq(this) x t=i y i=t+1 z rel(this) acq(this) x y t=i i=t+1 z rel(this)

More information

NDSeq: Runtime Checking for Nondeterministic Sequential Specs of Parallel Correctness

NDSeq: Runtime Checking for Nondeterministic Sequential Specs of Parallel Correctness EECS Electrical Engineering and Computer Sciences P A R A L L E L C O M P U T I N G L A B O R A T O R Y NDSeq: Runtime Checking for Nondeterministic Sequential Specs of Parallel Correctness Jacob Burnim,

More information

Types for Precise Thread Interference

Types for Precise Thread Interference Types for Precise Thread Interference Jaeheon Yi, Tim Disney, Cormac Flanagan UC Santa Cruz Stephen Freund Williams College October 23, 2011 2 Multiple Threads Single Thread is a non-atomic read-modify-write

More information

) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons)

) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons) ) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons) Transactions - Definition A transaction is a sequence of data operations with the following properties: * A Atomic All

More information

Efficient Processor Support for DRFx, a Memory Model with Exceptions

Efficient Processor Support for DRFx, a Memory Model with Exceptions Efficient Processor Support for DRFx, a Memory Model with Exceptions Abhayendra Singh Daniel Marino Satish Narayanasamy Todd Millstein Madanlal Musuvathi University of Michigan, Ann Arbor University of

More information

C. Flanagan Dynamic Analysis for Atomicity 1

C. Flanagan Dynamic Analysis for Atomicity 1 1 Atomicity The method inc() is atomic if concurrent threads do not interfere with its behavior Guarantees that for every execution acq(this) x t=i y i=t+1 z rel(this) acq(this) x y t=i i=t+1 z rel(this)

More information

7/6/2015. Motivation & examples Threads, shared memory, & synchronization. Imperative programs

7/6/2015. Motivation & examples Threads, shared memory, & synchronization. Imperative programs Motivation & examples Threads, shared memory, & synchronization How do locks work? Data races (a lower level property) How do data race detectors work? Atomicity (a higher level property) Concurrency exceptions

More information

The Java Memory Model

The Java Memory Model The Java Memory Model The meaning of concurrency in Java Bartosz Milewski Plan of the talk Motivating example Sequential consistency Data races The DRF guarantee Causality Out-of-thin-air guarantee Implementation

More information

Potential violations of Serializability: Example 1

Potential violations of Serializability: Example 1 CSCE 6610:Advanced Computer Architecture Review New Amdahl s law A possible idea for a term project Explore my idea about changing frequency based on serial fraction to maintain fixed energy or keep same

More information

Semantic Atomicity for Multithreaded Programs!

Semantic Atomicity for Multithreaded Programs! P A R A L L E L C O M P U T I N G L A B O R A T O R Y Semantic Atomicity for Multithreaded Programs! Jacob Burnim, George Necula, Koushik Sen! Parallel Computing Laboratory! University of California, Berkeley!

More information

Nondeterminism is Unavoidable, but Data Races are Pure Evil

Nondeterminism is Unavoidable, but Data Races are Pure Evil Nondeterminism is Unavoidable, but Data Races are Pure Evil Hans-J. Boehm HP Labs 5 November 2012 1 Low-level nondeterminism is pervasive E.g. Atomically incrementing a global counter is nondeterministic.

More information

Cooperative Concurrency for a Multicore World

Cooperative Concurrency for a Multicore World Cooperative Concurrency for a Multicore World Stephen Freund Williams College Cormac Flanagan, Tim Disney, Caitlin Sadowski, Jaeheon Yi, UC Santa Cruz Controlling Thread Interference: #1 Manually x = 0;

More information

DMP Deterministic Shared Memory Multiprocessing

DMP Deterministic Shared Memory Multiprocessing DMP Deterministic Shared Memory Multiprocessing University of Washington Joe Devietti, Brandon Lucia, Luis Ceze, Mark Oskin A multithreaded voting machine 2 thread 0 thread 1 while (more_votes) { load

More information

Lecture: Consistency Models, TM. Topics: consistency models, TM intro (Section 5.6)

Lecture: Consistency Models, TM. Topics: consistency models, TM intro (Section 5.6) Lecture: Consistency Models, TM Topics: consistency models, TM intro (Section 5.6) 1 Coherence Vs. Consistency Recall that coherence guarantees (i) that a write will eventually be seen by other processors,

More information

Chapter 5. Multiprocessors and Thread-Level Parallelism

Chapter 5. Multiprocessors and Thread-Level Parallelism Computer Architecture A Quantitative Approach, Fifth Edition Chapter 5 Multiprocessors and Thread-Level Parallelism 1 Introduction Thread-Level parallelism Have multiple program counters Uses MIMD model

More information

New Programming Abstractions for Concurrency in GCC 4.7. Torvald Riegel Red Hat 12/04/05

New Programming Abstractions for Concurrency in GCC 4.7. Torvald Riegel Red Hat 12/04/05 New Programming Abstractions for Concurrency in GCC 4.7 Red Hat 12/04/05 1 Concurrency and atomicity C++11 atomic types Transactional Memory Provide atomicity for concurrent accesses by different threads

More information

MAMA: Mostly Automatic Management of Atomicity

MAMA: Mostly Automatic Management of Atomicity MAMA: Mostly Automatic Management of Atomicity Christian DeLozier Joseph Devietti Milo M. K. Martin Computer and Information Science Department, University of Pennsylvania {delozier, devietti, milom}@cis.upenn.edu

More information

Maximal Causality Reduction for TSO and PSO. Shiyou Huang Jeff Huang Parasol Lab, Texas A&M University

Maximal Causality Reduction for TSO and PSO. Shiyou Huang Jeff Huang Parasol Lab, Texas A&M University Maximal Causality Reduction for TSO and PSO Shiyou Huang Jeff Huang huangsy@tamu.edu Parasol Lab, Texas A&M University 1 A Real PSO Bug $12 million loss of equipment curpos = new Point(1,2); class Point

More information

Monitors; Software Transactional Memory

Monitors; Software Transactional Memory Monitors; Software Transactional Memory Parallel and Distributed Computing Department of Computer Science and Engineering (DEI) Instituto Superior Técnico October 18, 2012 CPD (DEI / IST) Parallel and

More information

Towards Transactional Memory Semantics for C++

Towards Transactional Memory Semantics for C++ Towards Transactional Memory Semantics for C++ Tatiana Shpeisman Intel Corporation Santa Clara, CA tatiana.shpeisman@intel.com Yang Ni Intel Corporation Santa Clara, CA yang.ni@intel.com Ali-Reza Adl-Tabatabai

More information

Transactional Memory

Transactional Memory Transactional Memory Architectural Support for Practical Parallel Programming The TCC Research Group Computer Systems Lab Stanford University http://tcc.stanford.edu TCC Overview - January 2007 The Era

More information

MultiJav: A Distributed Shared Memory System Based on Multiple Java Virtual Machines. MultiJav: Introduction

MultiJav: A Distributed Shared Memory System Based on Multiple Java Virtual Machines. MultiJav: Introduction : A Distributed Shared Memory System Based on Multiple Java Virtual Machines X. Chen and V.H. Allan Computer Science Department, Utah State University 1998 : Introduction Built on concurrency supported

More information

Programming Language Memory Models: What do Shared Variables Mean?

Programming Language Memory Models: What do Shared Variables Mean? Programming Language Memory Models: What do Shared Variables Mean? Hans-J. Boehm 10/25/2010 1 Disclaimers: This is an overview talk. Much of this work was done by others or jointly. I m relying particularly

More information

Transaction Memory for Existing Programs Michael M. Swift Haris Volos, Andres Tack, Shan Lu, Adam Welc * University of Wisconsin-Madison, *Intel

Transaction Memory for Existing Programs Michael M. Swift Haris Volos, Andres Tack, Shan Lu, Adam Welc * University of Wisconsin-Madison, *Intel Transaction Memory for Existing Programs Michael M. Swift Haris Volos, Andres Tack, Shan Lu, Adam Welc * University of Wisconsin-Madison, *Intel Where do TM programs ome from?! Parallel benchmarks replacing

More information

Improving the Practicality of Transactional Memory

Improving the Practicality of Transactional Memory Improving the Practicality of Transactional Memory Woongki Baek Electrical Engineering Stanford University Programming Multiprocessors Multiprocessor systems are now everywhere From embedded to datacenter

More information

Java Memory Model. Jian Cao. Department of Electrical and Computer Engineering Rice University. Sep 22, 2016

Java Memory Model. Jian Cao. Department of Electrical and Computer Engineering Rice University. Sep 22, 2016 Java Memory Model Jian Cao Department of Electrical and Computer Engineering Rice University Sep 22, 2016 Content Introduction Java synchronization mechanism Double-checked locking Out-of-Thin-Air violation

More information

Lecture: Consistency Models, TM

Lecture: Consistency Models, TM Lecture: Consistency Models, TM Topics: consistency models, TM intro (Section 5.6) No class on Monday (please watch TM videos) Wednesday: TM wrap-up, interconnection networks 1 Coherence Vs. Consistency

More information

Lightweight Data Race Detection for Production Runs

Lightweight Data Race Detection for Production Runs Lightweight Data Race Detection for Production Runs Swarnendu Biswas University of Texas at Austin (USA) sbiswas@ices.utexas.edu Man Cao Ohio State University (USA) caoma@cse.ohio-state.edu Minjia Zhang

More information

McRT-STM: A High Performance Software Transactional Memory System for a Multi- Core Runtime

McRT-STM: A High Performance Software Transactional Memory System for a Multi- Core Runtime McRT-STM: A High Performance Software Transactional Memory System for a Multi- Core Runtime B. Saha, A-R. Adl- Tabatabai, R. Hudson, C.C. Minh, B. Hertzberg PPoPP 2006 Introductory TM Sales Pitch Two legs

More information

Dynamic Monitoring Tool based on Vector Clocks for Multithread Programs

Dynamic Monitoring Tool based on Vector Clocks for Multithread Programs , pp.45-49 http://dx.doi.org/10.14257/astl.2014.76.12 Dynamic Monitoring Tool based on Vector Clocks for Multithread Programs Hyun-Ji Kim 1, Byoung-Kwi Lee 2, Ok-Kyoon Ha 3, and Yong-Kee Jun 1 1 Department

More information

Mohamed M. Saad & Binoy Ravindran

VT-MENA Program Electrical & Computer Engineering Department Virginia Polytechnic Institute and State University TRANSACT 11 San Jose, CA An operation (or set of operations)

NOW Handout Page 1. Memory Consistency Model. Background for Debate on Memory Consistency Models. Multiprogrammed Uniprocessor Mem.

Memory Consistency Model Background for Debate on Memory Consistency Models CS 258, Spring 99 David E. Culler Computer Science Division U.C. Berkeley for a SAS specifies constraints on the order in which

Transactional Memory. Yaohua Li and Siming Chen. Yaohua Li and Siming Chen Transactional Memory 1 / 41

Background Processor hits physical limit on transistor density Cannot simply put more transistors to

MELD: Merging Execution- and Language-level Determinism. Joe Devietti, Dan Grossman, Luis Ceze University of Washington

CoreDet results: overhead normalized to nondet

Optimistic Shared Memory Dependence Tracing

Yanyan Jiang1, Du Li2, Chang Xu1, Xiaoxing Ma1 and Jian Lu1 Nanjing University 2 Carnegie Mellon University 1 powered by Understanding Non-determinism Concurrent

Atom-Aid: Detecting and Surviving Atomicity Violations

Abstract Writing shared-memory parallel programs is an error-prone process. Atomicity violations are especially challenging concurrency errors. They

Development of Technique for Healing Data Races based on Software Transactional Memory

pp.482-487 http://dx.doi.org/10.14257/astl.2016.139.96 Eu-Teum Choi 1,, Kun Su Yoon 2, Ok-Kyoon Ha 3, Yong-Kee Jun

PACER: Proportional Detection of Data Races

Michael D. Bond Katherine E. Coons Kathryn S. McKinley Department of Computer Science, The University of Texas at Austin {mikebond,coonske,mckinley}@cs.utexas.edu

DRFx: A Simple and Efficient Memory Model for Concurrent Programming Languages

Daniel Marino Abhayendra Singh Todd Millstein Madanlal Musuvathi Satish Narayanasamy University of California, Los Angeles

Lecture 12: TM, Consistency Models. Topics: TM pathologies, sequential consistency, hw and hw/sw optimizations

1 Paper on TM Pathologies (ISCA 08) LL: lazy versioning, lazy conflict detection, committing

Atomicity via Source-to-Source Translation

Benjamin Hindman Dan Grossman University of Washington 22 October 2006 Atomic An easier-to-use and harder-to-implement primitive void deposit(int x){ synchronized(this){

New Programming Abstractions for Concurrency. Torvald Riegel Red Hat 12/04/05

1 Concurrency and atomicity C++11 atomic types Transactional Memory Provide atomicity for concurrent accesses by different threads Both based

Cache-Aware Lock-Free Queues for Multiple Producers/Consumers and Weak Memory Consistency

Anders Gidenstam Håkan Sundell Philippas Tsigas School of business and informatics University of Borås Distributed

Database Principles: Fundamentals of Design, Implementation, and Management Tenth Edition. Chapter 13 Managing Transactions and Concurrency

Objectives In this chapter, you will learn: What a database transaction

High-level languages

High-level languages are not immune to these problems. Actually, the situation is even worse: the source language typically operates over mixed-size values (multi-word and bitfield);

Software Transactions: A Programming-Languages Perspective

Dan Grossman University of Washington 5 December 2006 A big deal Research on software transactions broad Programming languages PLDI, POPL, ICFP,

Transactional Monitors for Concurrent Objects

Adam Welc, Suresh Jagannathan, and Antony L. Hosking Department of Computer Sciences Purdue University West Lafayette, IN 47906 {welc,suresh,hosking}@cs.purdue.edu

The C/C++ Memory Model: Overview and Formalization

Mark Batty Jasmin Blanchette Scott Owens Susmit Sarkar Peter Sewell Tjark Weber Verification of Concurrent C Programs C11 / C++11 In 2011, new versions

Monitors; Software Transactional Memory

Parallel and Distributed Computing Department of Computer Science and Engineering (DEI) Instituto Superior Técnico March 17, 2016 CPD (DEI / IST) Parallel and Distributed

Explicitly Parallel Programming with Shared Memory is Insane: At Least Make it Deterministic!

Joe Devietti, Brandon Lucia, Luis Ceze and Mark Oskin University of Washington Parallel Programming is Hard

G52CON: Concepts of Concurrency

Lecture 6: Algorithms for Mutual Natasha Alechina School of Computer Science nza@cs.nott.ac.uk Outline of this lecture mutual exclusion with standard instructions example:

Fall 2012 Parallel Computer Architecture Lecture 16: Speculation II. Prof. Onur Mutlu Carnegie Mellon University 10/12/2012

18-742. Past Due: Review Assignments Was Due: Tuesday, October 9, 11:59pm. Sohi

Static Java Program Features for Intelligent Squash Prediction

Jeremy Singer, Adam Pocock, Mikel Lujan, Gavin Brown, Nikolas Ioannou, Marcelo Cintra Thread-level Speculation... Aim to use parallel multi-core

Lecture 7: Transactional Memory Intro. Topics: introduction to transactional memory, lazy implementation

1 Transactions New paradigm to simplify programming instead of lock-unlock, use transaction begin-end

A Parameterized Type System for Race-Free Java Programs

Chandrasekhar Boyapati Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology {chandra, rinard}@lcs.mit.edu Data races

NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY. Tim Harris, 28 November 2014

Lecture 8 Problems with locks Atomic blocks and composition Hardware transactional memory Software transactional memory

DBT Tool. DBT Framework

Thread-Safe Dynamic Binary Translation using Transactional Memory JaeWoong Chung, Michael Dalton, Hari Kannan, Christos Kozyrakis Computer Systems Laboratory Stanford University http://csl.stanford.edu

Transaction Management

Imran Khan FCS, IBA In this chapter, you will learn: What a database transaction is and what its properties are How database transactions are managed What concurrency control is

Lecture 6: Lazy Transactional Memory. Topics: TM semantics and implementation details of lazy TM

1 Transactions Access to shared variables is encapsulated within transactions the system gives the illusion

Chapter 7 (Cont.) Transaction Management and Concurrency Control

In this chapter, you will learn: What a database transaction is and what its properties are What concurrency control is and what role it

FastTrack: Efficient and Precise Dynamic Race Detection By Cormac Flanagan and Stephen N. Freund

doi:10.1145/1839676.1839699 Abstract Multithreaded programs are notoriously prone to race conditions. Prior

OPERATING SYSTEM TRANSACTIONS

Donald E. Porter, Owen S. Hofmann, Christopher J. Rossbach, Alexander Benn, and Emmett Witchel The University of Texas at Austin OS APIs don't handle concurrency 2 OS is weak

Emmett Witchel The University of Texas At Austin

1 Q: When is everything happening? A: Now A: Concurrently 2 CS is at forefront of understanding concurrency We operate near light speed Concurrent computer

Compiler Techniques For Memory Consistency Models

Students: Xing Fang (Purdue), Jaejin Lee (Seoul National University), Kyungwoo Lee (Purdue), Zehra Sura (IBM T.J. Watson Research Center), David Wong (Intel

Distributed Operating Systems Memory Consistency

Faculty of Computer Science Institute for System Architecture, Operating Systems Group Marcus Völp (slides Julian Stecklina, Marcus Völp) SS2014 Concurrent

Deterministic Shared Memory Multiprocessing

Luis Ceze, University of Washington joint work with Owen Anderson, Tom Bergan, Joe Devietti, Brandon Lucia, Karin Strauss, Dan Grossman, Mark Oskin. Safe MultiProcessing

A Serializability Violation Detector for Shared-Memory Server Programs

Min Xu Rastislav Bodík Mark Hill University of Wisconsin Madison University of California, Berkeley Serializability Violation Detector:

Chasing Away RAts: Semantics and Evaluation for Relaxed Atomics on Heterogeneous Systems

Matthew D. Sinclair, Johnathan Alsop, Sarita V. Adve University of Illinois @ Urbana-Champaign hetero@cs.illinois.edu

RELAXED CONSISTENCY 1

RELAXED CONSISTENCY Relaxed Consistency is a catch-all term for any MCM weaker than TSO GPUs have relaxed consistency (probably) 2 XC AXIOMS TABLE 5.5: XC Ordering Rules. An X Denotes

Speculative Locks. Dept. of Computer Science

José F. Martínez and Josep Torrellas Dept. of Computer Science University of Illinois at Urbana-Champaign Motivation Lock granularity a trade-off: Fine grain greater concurrency

Part 3: Beyond Reduction

Lightweight Analyses For Reliable Concurrency Stephen Freund Williams College joint work with Cormac Flanagan (UCSC), Shaz Qadeer (MSR) Busy Acquire void busy_acquire()

Lock vs. Lock-free Memory Project proposal

Fahad Alduraibi Aws Ahmad Eman Elrifaei Electrical and Computer Engineering Southern Illinois University 1. Introduction The CPU performance development history

CSE 530A ACID. Washington University Fall 2013

Concurrency Enterprise-scale DBMSs are designed to host multiple databases and handle multiple concurrent connections Transactions are designed to enable Data

A Case for System Support for Concurrency Exceptions

Luis Ceze, Joseph Devietti, Brandon Lucia and Shaz Qadeer University of Washington {luisceze, devietti, blucia0a}@cs.washington.edu Microsoft Research