Hybrid Static-Dynamic Analysis for Statically Bounded Region Serializability
|
|
- Alannah Hopkins
- 6 years ago
- Views:
Transcription
1 Hybrid Static-Dynamic Analysis for Statically Bounded Region Serializability Aritra Sengupta, Swarnendu Biswas, Minjia Zhang, Michael D. Bond and Milind Kulkarni ASPLOS 2015, ISTANBUL, TURKEY
2 Programming Language Semantics? Data Races C++ no guarantee of semantics catch-fire semantics Java provides weak semantics
3 Weak Semantics T1 A a = null; boolean init = false; T2 a = new A(); init = true; if (init) a.field++;
4 Weak Semantics T1 a = new A(); init = true; No data dependence T2 if (init) a.field++;
5 Weak Semantics T1 A a = null; boolean init = false; T2 a = new A(); init = true; if (init) a.field++;
6 Weak Semantics T1 T2 init = true; if (init) a.field++; a = new A();
7 Weak Semantics T1 T2 Null dereference init = true; if (init) a.field++; a = new A();
8 DRF0 Atomicity of synchronization-free regions for data-race-free programs Data races - no semantics C++, Java follow variants of DRF0 Adve and Hill, ISCA, 1990
9 Need for Stronger Memory Models The inability to define reasonable semantics for programs with data races is not just a theoretical shortcoming, but a fundamental hole in the foundation of our languages and systems Give better semantics to programs with data races Stronger memory models Adve and Boehm, CACM, 2010
10 Sequential Consistency (SC) Shared memory accesses interleave arbitrarily while each thread maintains program order
11 Sequential Consistency DRF0 Sequential Consistency
12 An Example Program Under SC int pos = 0 int [ ] buffer 0 0 T1 T2 buffer[pos++]= 5 buffer[pos++] = 6
13 An Example Program Under SC int pos = 0 int [ ] buffer 0 0 T1 T2 t1 = pos buffer [t1] = 5 t1 = t1 + 1 pos = t1 t2 = pos buffer [t2] = 6 t2 = t2 + 1 pos = t2
14 An Example Program Under SC int pos = 0 int [ ] buffer 0 0 T1 T2 t1 = pos buffer [t1] = 5 t1 = t1 + 1 pos = t1 t2 = pos buffer [t2] = 6 t2 = t2 + 1 pos = t2 Time
15 An Example Program Under SC int pos = 0 int [ ] buffer 0 0 T1 T2 t1 = pos buffer [t1] = 5 t1 = t1 + 1 pos = = 1 buffer 6 0 t2 = pos buffer [t2] = 6 Time pos = t1 t2 = t2 + 1 pos = t2
16 An Example Program Under SC int pos = 0 int [ ] buffer 0 0 T1 T2 buffer[pos++]= 5 pos = = 1 buffer[pos++] = 6 buffer buffer 6 0 pos = = 2 buffer
17 An Example Program Under SC int pos = 0 int [ ] buffer 0 0 T1 T2 pos = = 1 buffer[pos++]= 5 buffer[pos++] = 6 buffer 6 0 SC execution
18 Programmer Assumption Atomicity of high-level operations
19 Can SC Eliminate Common Concurrency Bugs?...programmers do not reason about correctness of parallel code in terms of interleavings of individual memory accesses SC does not prevent common concurrency bugs Data races dangerous even under SC Adve and Boehm, CACM 2010
20 Run-time cost Run-time cost vs Strength Synchronization-free region serializability 1 SC DRF0 Strength 1. Ouyang et al.... and region serializability for all. In HotPar, 2013.
21 Run-time cost Run-time cost vs Strength Synchronization-free region serializability 1 SC DRF0 Strength 1. Ouyang et al.... and region serializability for all. In HotPar, 2013.
22 Run-time cost Run-time cost vs Strength Statically Bounded Region Serializability Strength
23 Run-time cost Contribution EnfoRSer: An analysis to enforce SBRS practically Evaluation: Low run-time cost, eliminates real bugs Statically Bounded Region Serializability Strength
24 New Memory Model: Statically Bounded Region Serializability (SBRS)
25 Program Execution Behaviors DRF0 SC Statically Bounded Region Serializability
26 Statically Bounded Region Serializability (SBRS) acq(lock) Synchronization operations Method calls Loop backedges methodcall() rel(lock)
27 Statically Bounded Region Serializability (SBRS) acq(lock) methodcall() rel(lock)
28 Statically Bounded Region Serializability (SBRS) Statically and dynamically bounded Loop backedges
29 Under SBRS T1 pos = 0 buffer 0 0 T2 buffer[pos++]= 5 buffer[pos++] = 6
30 Under SBRS T1 pos = 0 buffer 0 0 T2 t1 = pos buffer [t1] = 5 t1 = t1 + 1 pos = t1 t2 = pos buffer [t2] = 6 t2 = t2 + 1 pos = t2
31 Under SBRS T1 t1 = pos buffer [t1] = 5 t1 = t1 + 1 pos = t1 pos = = 2 buffer 5 6 pos = 0 buffer 0 0 T2 t1 = pos buffer [t1] = 6 t1 = t1 + 1 pos = t1
32 Under SBRS T1 t1 = pos buffer [t1] = 5 t1 = t1 + 1 pos = t1 pos = 0 buffer 0 0 T2 t1 = pos buffer [t1] = 6 t1 = t1 + 1 pos = t1 pos = = 2 buffer 6 5
33 Bug Elimination x+= 42; read modify write if (o!= null) {...= o.f; } check before use buffer[pos++] = val; multi-variable operation
34 EnfoRSer: A Hybrid Static- Dynamic Analysis to Enforce SBRS
35 EnfoRSer, an efficient enforcement of SBRS Compiler Transformations Runtime Enforcement Two-phase Locking
36 Basic Mechanism Y = = X Y =
37 Basic Mechanism Write lock Y = Read lock = X Y =
38 Basic Mechanism Y = = X Y =
39 Basic Mechanism Y = Program access = X Y =
40 Basic Mechanism Y = = X Ownership transferred Y =
41 Basic Mechanism X = = X Y =
42 Basic Mechanism T1 T2 Y = X = = X = Y Conflict Y = Z =
43 Basic Mechanism T1 T2 Y = X = = X = Y Deadlock Y = Z =
44 Basic Mechanism T1 Y = Lightweight readerwriter locks 2 Biased synchronization Lose ownership while acquiring locks T2 X = = X Deadlock = Y Y = Z = 2. Bond et al. Octet: Capturing and Controlling Cross-Thread Dependences Efficiently. In OOPSLA, 2013.
45 Basic Mechanism T1 Two-phase locking violated T2 Y = X = = X Ownership transferred = Y Y = Z =
46 Basic Mechanism T1 T2 Y = X = = X = Y Y = Z =
47 Basic Mechanism Y = = X Locks acquired Y =
48 Challenges in Basic Mechanism Stores executed Y = = X Y =
49 EnfoRSer Atomicity Transformations Idempotent: Defer stores until all locks are acquired Speculation: Execute stores speculatively and roll back in case of a conflict
50 Idempotent Transformation X = = Y X =
51 Idempotence Transformation X = = Y X =
52 Idempotence Transformation = Y Stores deferred X = X =
53 Idempotence Mechanism Conflict = Y X = X =
54 Idempotence Mechanism Side-effect free = Y X = X =
55 Idempotence Mechanism = Y Execute stores X = X =
56 Idempotence Challenges X = = Y = X X =
57 Idempotence Challenges Loads data dependent on stores = Y = X X =
58 Idempotence Challenges Aliasing between loads and stores Data dependence? = Y = Z X =
59 Speculation Transformation X = = Y X =
60 Speculation Transformation Backup store values old_x = X X = = Y Generate roll-back code old_x = Z Z = Z = old_z X = old_x
61 Speculation Mechanism old_x = X X = Conflict Execute roll-back code = Y old_x = Z Z = Z = old_z X = old_x
62 Speculation Mechanism old_x = X X = Conflict = Y old_x = Z Z = Z = old_z X = old_x
63 Speculation Mechanism old_x = X X = = Y All locks acquired old_x = Z Z = Z = old_z X = old_x
64 Speculation Challenges Y = X = = Z
65 Speculation Challenges Y = Executed store = Z
66 Speculation Challenges Y = Executed store Conflict = Z Y = old_y X = old_x
67 Similar to Software Transactional Memory (STM)? Idempotent approach similar to STMs that defer stores until a transaction commits Speculation approach similar to STMs that execute stores but undo them if a transaction needs to abort
68 Similar to Software Transactional Memory (STM)? Idempotent approach similar STMs that defer EnfoRSer stores until a transaction commit provides atomicity Speculation approach similar STMs that undo of statically stores before a transaction abort bounded regions more efficiently!
69 Similar to Software Transactional Memory (STM)? Idempotent approach similar STMs that defer EnfoRSer stores until a transaction commit provides atomicity Speculation approach similar STMs that undo of statically stores before a transaction abort bounded regions more efficiently! Bounded regions: efficient code generation Short regions: conservative conflict detection
70 Implementation and Evaluation
71 Implementation Developed in Jikes RVM Code publicly available on Jikes RVM Research Archive
72 Experimental Methodology Benchmarks DaCapo 2006, 9.12-bach Fixed-workload versions of SPECjbb2000 and SPECjbb2005 Platform AMD Opteron system: 32 cores
73 Whole-Program Static Analysis Remove instrumentation from data-race-free accesses [Naik et al. s 2006 race detection algorithm, Chord]
74 % overhead over unmodified JVM EnfoRSer: Run-time Performance Idempotent % overhead on average
75 % overhead over unmodified JVM EnfoRSer: Run-time Performance Idempotent Speculation % overhead on average
76 % overhead over unmodified JVM EnfoRSer: Run-time Performance pjbb2005, over 100% Idempotent overhead Cao et al., WoDet 2014 Speculation
77 % overhead over unmodified JVM EnfoRSer: Run-time Performance Idempotent Speculation Speculation via log 53% overhead on average (speculation via log)
78 % overhead over unmodified JVM EnfoRSer: Run-time Performance with and without static race detection Idempotent Speculation Speculation via log 0 geomean w race det geomean w/o race det
79 Evaluation: Concurrency Errors Avoidance SBRS s potential to eliminate concurrency bugs exposed on relaxed memory models
80 Avoiding Concurrency Errors DRF0 (AM) SC SBRS hsqldb6 Infinite loop Correct Correct sunflow9 Null pointer exception Correct Correct jbb2000 Corrupt output Corrupt output Correct jbb2000 Infinite loop Correct Correct sor Infinite loop Correct Correct lufact Infinite loop Correct Correct moldyn Infinite loop Correct Correct raytracer Fails validation Fails validation Correct AM = Adversarial Memory, Flanagan and Freund, PLDI 2010
81 Avoiding Concurrency Errors DRF0(AM) SC SBRS hsqldb6 Infinite loop Correct Correct sunflow9 Null pointer exception Correct Correct jbb2000 Corrupt output Corrupt output Correct jbb2000 Infinite loop Correct Correct sor Infinite loop Correct Correct lufact Infinite loop Correct Correct moldyn Infinite loop Correct Correct raytracer Fails validation Fails validation Correct AM = Adversarial Memory, Flanagan and Freund, PLDI 2010
82 Avoiding Concurrency Errors DRF0 SC SBRS hsqldb6 Infinite loop Correct Correct sunflow9 Null pointer exception Correct Correct jbb2000 Corrupt output Corrupt output Correct Avoids all the errors exposed by AM. jbb2000 Infinite loop Correct Correct sor Infinite loop Correct Correct lufact Infinite loop Correct Correct moldyn Infinite loop Correct Correct raytracer Fails validation Fails validation Correct AM = Adversarial Memory, Flanagan and Freund, PLDI 2010
83 Related Work Checks conflicts in bounded region DRFx, Marino et al., PLDI 2010 Checks conflicts in synchronization-free regions Conflict Exceptions, Lucia et al., ISCA 2010 Enforces atomicity of bounded regions Bulk Compiler, Ahn et al., MICRO 2009 Enforces atomicity of synchronization free regions and region serializability for all, Ouyang et al., HotPar 2013 Requires customized hardware Requires additional cores
84 Run-time cost Conclusion EnfoRSer: An analysis to enforce SBRS practically Evaluation: Low run-time cost, eliminates real bugs Statically Bounded Region Serializability Strength
Toward Efficient Strong Memory Model Support for the Java Platform via Hybrid Synchronization
Toward Efficient Strong Memory Model Support for the Java Platform via Hybrid Synchronization Aritra Sengupta, Man Cao, Michael D. Bond and Milind Kulkarni 1 PPPJ 2015, Melbourne, Florida, USA Programming
More informationLegato: Bounded Region Serializability Using Commodity Hardware Transactional Memory. Aritra Sengupta Man Cao Michael D. Bond and Milind Kulkarni
Legato: Bounded Region Serializability Using Commodity Hardware al Memory Aritra Sengupta Man Cao Michael D. Bond and Milind Kulkarni 1 Programming Language Semantics? Data Races C++ no guarantee of semantics
More informationLightweight Data Race Detection for Production Runs
Lightweight Data Race Detection for Production Runs Swarnendu Biswas, UT Austin Man Cao, Ohio State University Minjia Zhang, Microsoft Research Michael D. Bond, Ohio State University Benjamin P. Wood,
More informationToward Efficient Strong Memory Model Support for the Java Platform via Hybrid Synchronization
Toward Efficient Strong Memory Model Support for the Java Platform via Hybrid Synchronization Aritra Sengupta Man Cao Michael D. Bond Milind Kulkarni Ohio State University Purdue University {sengupta,caoma,mikebond}@cse.ohio-state.edu
More informationDoubleChecker: Efficient Sound and Precise Atomicity Checking
DoubleChecker: Efficient Sound and Precise Atomicity Checking Swarnendu Biswas, Jipeng Huang, Aritra Sengupta, and Michael D. Bond The Ohio State University PLDI 2014 Impact of Concurrency Bugs Impact
More informationHybrid Static Dynamic Analysis for Statically Bounded Region Serializability
Hybrid Static Dynamic Analysis for Statically Bounded Region Serializability Aritra Sengupta Swarnendu Biswas Minjia Zhang Michael D. Bond Milind Kulkarni Ohio State University Purdue University {sengupta,biswass,zhanminj,mikebond@cse.ohio-state.edu
More informationHybrid Static Dynamic Analysis for Region Serializability
Hybrid Static Dynamic Analysis for Region Serializability Aritra Sengupta Swarnendu Biswas Minjia Zhang Michael D. Bond Milind Kulkarni + Ohio State University + Purdue University {sengupta,biswass,zhanminj,mikebond@cse.ohio-state.edu,
More informationFastTrack: Efficient and Precise Dynamic Race Detection (+ identifying destructive races)
FastTrack: Efficient and Precise Dynamic Race Detection (+ identifying destructive races) Cormac Flanagan UC Santa Cruz Stephen Freund Williams College Multithreading and Multicore! Multithreaded programming
More information11/19/2013. Imperative programs
if (flag) 1 2 From my perspective, parallelism is the biggest challenge since high level programming languages. It s the biggest thing in 50 years because industry is betting its future that parallel programming
More informationLegato: End-to-End Bounded Region Serializability Using Commodity Hardware Transactional Memory
Legato: End-to-End Bounded Region Serializability Using Commodity Hardware Transactional Memory Aritra Sengupta Man Cao Michael D. Bond Milind Kulkarni Ohio State University (USA) Purdue University (USA)
More informationValor: Efficient, Software-Only Region Conflict Exceptions
Valor: Efficient, Software-Only Region Conflict Exceptions Swarnendu Biswas PhD Student, Ohio State University biswass@cse.ohio-state.edu Abstract Data races complicate programming language semantics,
More informationAtomicity for Reliable Concurrent Software. Race Conditions. Motivations for Atomicity. Race Conditions. Race Conditions. 1. Beyond Race Conditions
Towards Reliable Multithreaded Software Atomicity for Reliable Concurrent Software Cormac Flanagan UC Santa Cruz Joint work with Stephen Freund Shaz Qadeer Microsoft Research 1 Multithreaded software increasingly
More informationAvoiding Consistency Exceptions Under Strong Memory Models
Avoiding Consistency Exceptions Under Strong Memory Models Minjia Zhang Microsoft Research (USA) minjiaz@microsoft.com Swarnendu Biswas University of Texas at Austin (USA) sbiswas@ices.utexas.edu Michael
More informationValor: Efficient, Software-Only Region Conflict Exceptions
to Reuse * * Evaluated * OOPSLA * Artifact * AEC Valor: Efficient, Software-Only Region Conflict Exceptions Swarnendu Biswas Minjia Zhang Michael D. Bond Brandon Lucia Ohio State University (USA) Carnegie
More informationThe Common Case Transactional Behavior of Multithreaded Programs
The Common Case Transactional Behavior of Multithreaded Programs JaeWoong Chung Hassan Chafi,, Chi Cao Minh, Austen McDonald, Brian D. Carlstrom, Christos Kozyrakis, Kunle Olukotun Computer Systems Lab
More informationEECS 570 Lecture 13. Directory & Optimizations. Winter 2018 Prof. Satish Narayanasamy
Directory & Optimizations Winter 2018 Prof. Satish Narayanasamy http://www.eecs.umich.edu/courses/eecs570/ Slides developed in part by Profs. Adve, Falsafi, Hill, Lebeck, Martin, Narayanasamy, Nowatzyk,
More informationCalFuzzer: An Extensible Active Testing Framework for Concurrent Programs Pallavi Joshi 1, Mayur Naik 2, Chang-Seo Park 1, and Koushik Sen 1
CalFuzzer: An Extensible Active Testing Framework for Concurrent Programs Pallavi Joshi 1, Mayur Naik 2, Chang-Seo Park 1, and Koushik Sen 1 1 University of California, Berkeley, USA {pallavi,parkcs,ksen}@eecs.berkeley.edu
More informationProcess Synchronization. Mehdi Kargahi School of ECE University of Tehran Spring 2008
Process Synchronization Mehdi Kargahi School of ECE University of Tehran Spring 2008 Producer-Consumer (Bounded Buffer) Producer Consumer Race Condition Producer Consumer Critical Sections Structure of
More informationChí Cao Minh 28 May 2008
Chí Cao Minh 28 May 2008 Uniprocessor systems hitting limits Design complexity overwhelming Power consumption increasing dramatically Instruction-level parallelism exhausted Solution is multiprocessor
More informationA Causality-Based Runtime Check for (Rollback) Atomicity
A Causality-Based Runtime Check for (Rollback) Atomicity Serdar Tasiran Koc University Istanbul, Turkey Tayfun Elmas Koc University Istanbul, Turkey RV 2007 March 13, 2007 Outline This paper: Define rollback
More informationThe Java Memory Model
Jeremy Manson 1, William Pugh 1, and Sarita Adve 2 1 University of Maryland 2 University of Illinois at Urbana-Champaign Presented by John Fisher-Ogden November 22, 2005 Outline Introduction Sequential
More informationTowards Parallel, Scalable VM Services
Towards Parallel, Scalable VM Services Kathryn S McKinley The University of Texas at Austin Kathryn McKinley Towards Parallel, Scalable VM Services 1 20 th Century Simplistic Hardware View Faster Processors
More informationMotivation & examples Threads, shared memory, & synchronization
1 Motivation & examples Threads, shared memory, & synchronization How do locks work? Data races (a lower level property) How do data race detectors work? Atomicity (a higher level property) Concurrency
More informationDesigning Memory Consistency Models for. Shared-Memory Multiprocessors. Sarita V. Adve
Designing Memory Consistency Models for Shared-Memory Multiprocessors Sarita V. Adve Computer Sciences Department University of Wisconsin-Madison The Big Picture Assumptions Parallel processing important
More informationLecture 21: Transactional Memory. Topics: consistency model recap, introduction to transactional memory
Lecture 21: Transactional Memory Topics: consistency model recap, introduction to transactional memory 1 Example Programs Initially, A = B = 0 P1 P2 A = 1 B = 1 if (B == 0) if (A == 0) critical section
More informationLightweight Data Race Detection for Production Runs
Lightweight Data Race Detection for Production Runs Ohio State CSE Technical Report #OSU-CISRC-1/15-TR01; last updated August 2016 Swarnendu Biswas Ohio State University biswass@cse.ohio-state.edu Michael
More informationAtomicity. Experience with Calvin. Tools for Checking Atomicity. Tools for Checking Atomicity. Tools for Checking Atomicity
1 2 Atomicity The method inc() is atomic if concurrent ths do not interfere with its behavior Guarantees that for every execution acq(this) x t=i y i=t+1 z rel(this) acq(this) x y t=i i=t+1 z rel(this)
More informationNDSeq: Runtime Checking for Nondeterministic Sequential Specs of Parallel Correctness
EECS Electrical Engineering and Computer Sciences P A R A L L E L C O M P U T I N G L A B O R A T O R Y NDSeq: Runtime Checking for Nondeterministic Sequential Specs of Parallel Correctness Jacob Burnim,
More informationTypes for Precise Thread Interference
Types for Precise Thread Interference Jaeheon Yi, Tim Disney, Cormac Flanagan UC Santa Cruz Stephen Freund Williams College October 23, 2011 2 Multiple Threads Single Thread is a non-atomic read-modify-write
More information) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons)
) Intel)(TX)memory):) Transac'onal) Synchroniza'on) Extensions)(TSX))) Transac'ons) Transactions - Definition A transaction is a sequence of data operations with the following properties: * A Atomic All
More informationEfficient Processor Support for DRFx, a Memory Model with Exceptions
Efficient Processor Support for DRFx, a Memory Model with Exceptions Abhayendra Singh Daniel Marino Satish Narayanasamy Todd Millstein Madanlal Musuvathi University of Michigan, Ann Arbor University of
More informationC. Flanagan Dynamic Analysis for Atomicity 1
1 Atomicity The method inc() is atomic if concurrent threads do not interfere with its behavior Guarantees that for every execution acq(this) x t=i y i=t+1 z rel(this) acq(this) x y t=i i=t+1 z rel(this)
More information7/6/2015. Motivation & examples Threads, shared memory, & synchronization. Imperative programs
Motivation & examples Threads, shared memory, & synchronization How do locks work? Data races (a lower level property) How do data race detectors work? Atomicity (a higher level property) Concurrency exceptions
More informationThe Java Memory Model
The Java Memory Model The meaning of concurrency in Java Bartosz Milewski Plan of the talk Motivating example Sequential consistency Data races The DRF guarantee Causality Out-of-thin-air guarantee Implementation
More informationPotential violations of Serializability: Example 1
CSCE 6610:Advanced Computer Architecture Review New Amdahl s law A possible idea for a term project Explore my idea about changing frequency based on serial fraction to maintain fixed energy or keep same
More informationSemantic Atomicity for Multithreaded Programs!
P A R A L L E L C O M P U T I N G L A B O R A T O R Y Semantic Atomicity for Multithreaded Programs! Jacob Burnim, George Necula, Koushik Sen! Parallel Computing Laboratory! University of California, Berkeley!
More informationNondeterminism is Unavoidable, but Data Races are Pure Evil
Nondeterminism is Unavoidable, but Data Races are Pure Evil Hans-J. Boehm HP Labs 5 November 2012 1 Low-level nondeterminism is pervasive E.g. Atomically incrementing a global counter is nondeterministic.
More informationCooperative Concurrency for a Multicore World
Cooperative Concurrency for a Multicore World Stephen Freund Williams College Cormac Flanagan, Tim Disney, Caitlin Sadowski, Jaeheon Yi, UC Santa Cruz Controlling Thread Interference: #1 Manually x = 0;
More informationDMP Deterministic Shared Memory Multiprocessing
DMP Deterministic Shared Memory Multiprocessing University of Washington Joe Devietti, Brandon Lucia, Luis Ceze, Mark Oskin A multithreaded voting machine 2 thread 0 thread 1 while (more_votes) { load
More informationLecture: Consistency Models, TM. Topics: consistency models, TM intro (Section 5.6)
Lecture: Consistency Models, TM Topics: consistency models, TM intro (Section 5.6) 1 Coherence Vs. Consistency Recall that coherence guarantees (i) that a write will eventually be seen by other processors,
More informationChapter 5. Multiprocessors and Thread-Level Parallelism
Computer Architecture A Quantitative Approach, Fifth Edition Chapter 5 Multiprocessors and Thread-Level Parallelism 1 Introduction Thread-Level parallelism Have multiple program counters Uses MIMD model
More informationNew Programming Abstractions for Concurrency in GCC 4.7. Torvald Riegel Red Hat 12/04/05
New Programming Abstractions for Concurrency in GCC 4.7 Red Hat 12/04/05 1 Concurrency and atomicity C++11 atomic types Transactional Memory Provide atomicity for concurrent accesses by different threads
More informationMAMA: Mostly Automatic Management of Atomicity
MAMA: Mostly Automatic Management of Atomicity Christian DeLozier Joseph Devietti Milo M. K. Martin Computer and Information Science Department, University of Pennsylvania {delozier, devietti, milom}@cis.upenn.edu
More informationMaximal Causality Reduction for TSO and PSO. Shiyou Huang Jeff Huang Parasol Lab, Texas A&M University
Maximal Causality Reduction for TSO and PSO Shiyou Huang Jeff Huang huangsy@tamu.edu Parasol Lab, Texas A&M University 1 A Real PSO Bug $12 million loss of equipment curpos = new Point(1,2); class Point
More informationMonitors; Software Transactional Memory
Monitors; Software Transactional Memory Parallel and Distributed Computing Department of Computer Science and Engineering (DEI) Instituto Superior Técnico October 18, 2012 CPD (DEI / IST) Parallel and
More informationTowards Transactional Memory Semantics for C++
Towards Transactional Memory Semantics for C++ Tatiana Shpeisman Intel Corporation Santa Clara, CA tatiana.shpeisman@intel.com Yang Ni Intel Corporation Santa Clara, CA yang.ni@intel.com Ali-Reza Adl-Tabatabai
More informationTransactional Memory
Transactional Memory Architectural Support for Practical Parallel Programming The TCC Research Group Computer Systems Lab Stanford University http://tcc.stanford.edu TCC Overview - January 2007 The Era
More informationMultiJav: A Distributed Shared Memory System Based on Multiple Java Virtual Machines. MultiJav: Introduction
: A Distributed Shared Memory System Based on Multiple Java Virtual Machines X. Chen and V.H. Allan Computer Science Department, Utah State University 1998 : Introduction Built on concurrency supported
More informationProgramming Language Memory Models: What do Shared Variables Mean?
Programming Language Memory Models: What do Shared Variables Mean? Hans-J. Boehm 10/25/2010 1 Disclaimers: This is an overview talk. Much of this work was done by others or jointly. I m relying particularly
More informationTransaction Memory for Existing Programs Michael M. Swift Haris Volos, Andres Tack, Shan Lu, Adam Welc * University of Wisconsin-Madison, *Intel
Transaction Memory for Existing Programs Michael M. Swift Haris Volos, Andres Tack, Shan Lu, Adam Welc * University of Wisconsin-Madison, *Intel Where do TM programs ome from?! Parallel benchmarks replacing
More informationImproving the Practicality of Transactional Memory
Improving the Practicality of Transactional Memory Woongki Baek Electrical Engineering Stanford University Programming Multiprocessors Multiprocessor systems are now everywhere From embedded to datacenter
More informationJava Memory Model. Jian Cao. Department of Electrical and Computer Engineering Rice University. Sep 22, 2016
Java Memory Model Jian Cao Department of Electrical and Computer Engineering Rice University Sep 22, 2016 Content Introduction Java synchronization mechanism Double-checked locking Out-of-Thin-Air violation
More informationLecture: Consistency Models, TM
Lecture: Consistency Models, TM Topics: consistency models, TM intro (Section 5.6) No class on Monday (please watch TM videos) Wednesday: TM wrap-up, interconnection networks 1 Coherence Vs. Consistency
More informationLightweight Data Race Detection for Production Runs
Lightweight Data Race Detection for Production Runs Swarnendu Biswas University of Texas at Austin (USA) sbiswas@ices.utexas.edu Man Cao Ohio State University (USA) caoma@cse.ohio-state.edu Minjia Zhang
More informationMcRT-STM: A High Performance Software Transactional Memory System for a Multi- Core Runtime
McRT-STM: A High Performance Software Transactional Memory System for a Multi- Core Runtime B. Saha, A-R. Adl- Tabatabai, R. Hudson, C.C. Minh, B. Hertzberg PPoPP 2006 Introductory TM Sales Pitch Two legs
More informationDynamic Monitoring Tool based on Vector Clocks for Multithread Programs
, pp.45-49 http://dx.doi.org/10.14257/astl.2014.76.12 Dynamic Monitoring Tool based on Vector Clocks for Multithread Programs Hyun-Ji Kim 1, Byoung-Kwi Lee 2, Ok-Kyoon Ha 3, and Yong-Kee Jun 1 1 Department
More informationMohamed M. Saad & Binoy Ravindran
Mohamed M. Saad & Binoy Ravindran VT-MENA Program Electrical & Computer Engineering Department Virginia Polytechnic Institute and State University TRANSACT 11 San Jose, CA An operation (or set of operations)
More informationNOW Handout Page 1. Memory Consistency Model. Background for Debate on Memory Consistency Models. Multiprogrammed Uniprocessor Mem.
Memory Consistency Model Background for Debate on Memory Consistency Models CS 258, Spring 99 David E. Culler Computer Science Division U.C. Berkeley for a SAS specifies constraints on the order in which
More informationTransactional Memory. Yaohua Li and Siming Chen. Yaohua Li and Siming Chen Transactional Memory 1 / 41
Transactional Memory Yaohua Li and Siming Chen Yaohua Li and Siming Chen Transactional Memory 1 / 41 Background Processor hits physical limit on transistor density Cannot simply put more transistors to
More informationMELD: Merging Execution- and Language-level Determinism. Joe Devietti, Dan Grossman, Luis Ceze University of Washington
MELD: Merging Execution- and Language-level Determinism Joe Devietti, Dan Grossman, Luis Ceze University of Washington CoreDet results overhead normalized to nondet 800% 600% 400% 200% 0% 2 4 8 2 4 8 2
More informationOptimistic Shared Memory Dependence Tracing
Optimistic Shared Memory Dependence Tracing Yanyan Jiang1, Du Li2, Chang Xu1, Xiaoxing Ma1 and Jian Lu1 Nanjing University 2 Carnegie Mellon University 1 powered by Understanding Non-determinism Concurrent
More informationAtom-Aid: Detecting and Surviving Atomicity Violations
Atom-Aid: Detecting and Surviving Atomicity Violations Abstract Writing shared-memory parallel programs is an error-prone process. Atomicity violations are especially challenging concurrency errors. They
More informationDevelopment of Technique for Healing Data Races based on Software Transactional Memory
, pp.482-487 http://dx.doi.org/10.14257/astl.2016.139.96 Development of Technique for Healing Data Races based on Software Transactional Memory Eu-Teum Choi 1,, Kun Su Yoon 2, Ok-Kyoon Ha 3, Yong-Kee Jun
More informationPACER: Proportional Detection of Data Races
PACER: Proportional Detection of Data Races Michael D. Bond Katherine E. Coons Kathryn S. McKinley Department of Computer Science, The University of Texas at Austin {mikebond,coonske,mckinley}@cs.utexas.edu
More informationDRFx: A Simple and Efficient Memory Model for Concurrent Programming Languages
DRFx: A Simple and Efficient Memory Model for Concurrent Programming Languages Daniel Marino Abhayendra Singh Todd Millstein Madanlal Musuvathi Satish Narayanasamy University of California, Los Angeles
More informationLecture 12: TM, Consistency Models. Topics: TM pathologies, sequential consistency, hw and hw/sw optimizations
Lecture 12: TM, Consistency Models Topics: TM pathologies, sequential consistency, hw and hw/sw optimizations 1 Paper on TM Pathologies (ISCA 08) LL: lazy versioning, lazy conflict detection, committing
More informationAtomicity via Source-to-Source Translation
Atomicity via Source-to-Source Translation Benjamin Hindman Dan Grossman University of Washington 22 October 2006 Atomic An easier-to-use and harder-to-implement primitive void deposit(int x){ synchronized(this){
More informationNew Programming Abstractions for Concurrency. Torvald Riegel Red Hat 12/04/05
New Programming Abstractions for Concurrency Red Hat 12/04/05 1 Concurrency and atomicity C++11 atomic types Transactional Memory Provide atomicity for concurrent accesses by different threads Both based
More informationCache-Aware Lock-Free Queues for Multiple Producers/Consumers and Weak Memory Consistency
Cache-Aware Lock-Free Queues for Multiple Producers/Consumers and Weak Memory Consistency Anders Gidenstam Håkan Sundell Philippas Tsigas School of business and informatics University of Borås Distributed
More informationDatabase Principles: Fundamentals of Design, Implementation, and Management Tenth Edition. Chapter 13 Managing Transactions and Concurrency
Database Principles: Fundamentals of Design, Implementation, and Management Tenth Edition Chapter 13 Managing Transactions and Concurrency Objectives In this chapter, you will learn: What a database transaction
More informationHigh-level languages
High-level languages High-level languages are not immune to these problems. Actually, the situation is even worse: the source language typically operates over mixed-size values (multi-word and bitfield);
More informationSoftware Transactions: A Programming-Languages Perspective
Software Transactions: A Programming-Languages Perspective Dan Grossman University of Washington 5 December 2006 A big deal Research on software transactions broad Programming languages PLDI, POPL, ICFP,
More informationTransactional Monitors for Concurrent Objects
Transactional Monitors for Concurrent Objects Adam Welc, Suresh Jagannathan, and Antony L. Hosking Department of Computer Sciences Purdue University West Lafayette, IN 47906 {welc,suresh,hosking}@cs.purdue.edu
More informationThe C/C++ Memory Model: Overview and Formalization
The C/C++ Memory Model: Overview and Formalization Mark Batty Jasmin Blanchette Scott Owens Susmit Sarkar Peter Sewell Tjark Weber Verification of Concurrent C Programs C11 / C++11 In 2011, new versions
More informationMonitors; Software Transactional Memory
Monitors; Software Transactional Memory Parallel and Distributed Computing Department of Computer Science and Engineering (DEI) Instituto Superior Técnico March 17, 2016 CPD (DEI / IST) Parallel and Distributed
More informationExplicitly Parallel Programming with Shared Memory is Insane: At Least Make it Deterministic!
Explicitly Parallel Programming with Shared Memory is Insane: At Least Make it Deterministic! Joe Devietti, Brandon Lucia, Luis Ceze and Mark Oskin University of Washington Parallel Programming is Hard
More informationG52CON: Concepts of Concurrency
G52CON: Concepts of Concurrency Lecture 6: Algorithms for Mutual Natasha Alechina School of Computer Science nza@cs.nott.ac.uk Outline of this lecture mutual exclusion with standard instructions example:
More informationFall 2012 Parallel Computer Architecture Lecture 16: Speculation II. Prof. Onur Mutlu Carnegie Mellon University 10/12/2012
18-742 Fall 2012 Parallel Computer Architecture Lecture 16: Speculation II Prof. Onur Mutlu Carnegie Mellon University 10/12/2012 Past Due: Review Assignments Was Due: Tuesday, October 9, 11:59pm. Sohi
More informationStatic Java Program Features for Intelligent Squash Prediction
Static Java Program Features for Intelligent Squash Prediction Jeremy Singer,, Adam Pocock, Mikel Lujan, Gavin Brown, Nikolas Ioannou, Marcelo Cintra Thread-level Speculation... Aim to use parallel multi-core
More informationLecture 7: Transactional Memory Intro. Topics: introduction to transactional memory, lazy implementation
Lecture 7: Transactional Memory Intro Topics: introduction to transactional memory, lazy implementation 1 Transactions New paradigm to simplify programming instead of lock-unlock, use transaction begin-end
More informationA Parameterized Type System for Race-Free Java Programs
A Parameterized Type System for Race-Free Java Programs Chandrasekhar Boyapati Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology {chandra, rinard@lcs.mit.edu Data races
More informationNON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY. Tim Harris, 28 November 2014
NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY Tim Harris, 28 November 2014 Lecture 8 Problems with locks Atomic blocks and composition Hardware transactional memory Software transactional memory
More informationDBT Tool. DBT Framework
Thread-Safe Dynamic Binary Translation using Transactional Memory JaeWoong Chung,, Michael Dalton, Hari Kannan, Christos Kozyrakis Computer Systems Laboratory Stanford University http://csl.stanford.edu
More informationTransaction Management
Transaction Management Imran Khan FCS, IBA In this chapter, you will learn: What a database transaction is and what its properties are How database transactions are managed What concurrency control is
More informationLecture 6: Lazy Transactional Memory. Topics: TM semantics and implementation details of lazy TM
Lecture 6: Lazy Transactional Memory Topics: TM semantics and implementation details of lazy TM 1 Transactions Access to shared variables is encapsulated within transactions the system gives the illusion
More informationChapter 7 (Cont.) Transaction Management and Concurrency Control
Chapter 7 (Cont.) Transaction Management and Concurrency Control In this chapter, you will learn: What a database transaction is and what its properties are What concurrency control is and what role it
More informationFastTrack: Efficient and Precise Dynamic Race Detection By Cormac Flanagan and Stephen N. Freund
FastTrack: Efficient and Precise Dynamic Race Detection By Cormac Flanagan and Stephen N. Freund doi:10.1145/1839676.1839699 Abstract Multithreaded programs are notoriously prone to race conditions. Prior
More informationOPERATING SYSTEM TRANSACTIONS
OPERATING SYSTEM TRANSACTIONS Donald E. Porter, Owen S. Hofmann, Christopher J. Rossbach, Alexander Benn, and Emmett Witchel The University of Texas at Austin OS APIs don t handle concurrency 2 OS is weak
More informationEmmett Witchel The University of Texas At Austin
Emmett Witchel The University of Texas At Austin 1 Q: When is everything happening? A: Now A: Concurrently 2 CS is at forefront of understanding concurrency We operate near light speed Concurrent computer
More informationCompiler Techniques For Memory Consistency Models
Compiler Techniques For Memory Consistency Models Students: Xing Fang (Purdue), Jaejin Lee (Seoul National University), Kyungwoo Lee (Purdue), Zehra Sura (IBM T.J. Watson Research Center), David Wong (Intel
More informationDistributed Operating Systems Memory Consistency
Faculty of Computer Science Institute for System Architecture, Operating Systems Group Distributed Operating Systems Memory Consistency Marcus Völp (slides Julian Stecklina, Marcus Völp) SS2014 Concurrent
More informationDeterministic Shared Memory Multiprocessing
Deterministic Shared Memory Multiprocessing Luis Ceze, University of Washington joint work with Owen Anderson, Tom Bergan, Joe Devietti, Brandon Lucia, Karin Strauss, Dan Grossman, Mark Oskin. Safe MultiProcessing
More informationA Serializability Violation Detector for Shared-Memory Server Programs
A Serializability Violation Detector for Shared-Memory Server Programs Min Xu Rastislav Bodík Mark Hill University of Wisconsin Madison University of California, Berkeley Serializability Violation Detector:
More informationChasing Away RAts: Semantics and Evaluation for Relaxed Atomics on Heterogeneous Systems
Chasing Away RAts: Semantics and Evaluation for Relaxed Atomics on Heterogeneous Systems Matthew D. Sinclair, Johnathan Alsop, Sarita V. Adve University of Illinois @ Urbana-Champaign hetero@cs.illinois.edu
More informationRELAXED CONSISTENCY 1
RELAXED CONSISTENCY 1 RELAXED CONSISTENCY Relaxed Consistency is a catch-all term for any MCM weaker than TSO GPUs have relaxed consistency (probably) 2 XC AXIOMS TABLE 5.5: XC Ordering Rules. An X Denotes
More informationSpeculative Locks. Dept. of Computer Science
Speculative Locks José éf. Martínez and djosep Torrellas Dept. of Computer Science University it of Illinois i at Urbana-Champaign Motivation Lock granularity a trade-off: Fine grain greater concurrency
More informationPart 3: Beyond Reduction
Lightweight Analyses For Reliable Concurrency Part 3: Beyond Reduction Stephen Freund Williams College joint work with Cormac Flanagan (UCSC), Shaz Qadeer (MSR) Busy Acquire Busy Acquire void busy_acquire()
More informationLock vs. Lock-free Memory Project proposal
Lock vs. Lock-free Memory Project proposal Fahad Alduraibi Aws Ahmad Eman Elrifaei Electrical and Computer Engineering Southern Illinois University 1. Introduction The CPU performance development history
More informationCSE 530A ACID. Washington University Fall 2013
CSE 530A ACID Washington University Fall 2013 Concurrency Enterprise-scale DBMSs are designed to host multiple databases and handle multiple concurrent connections Transactions are designed to enable Data
More informationA Case for System Support for Concurrency Exceptions
A Case for System Support for Concurrency Exceptions Luis Ceze, Joseph Devietti, Brandon Lucia and Shaz Qadeer University of Washington {luisceze, devietti, blucia0a}@cs.washington.edu Microsoft Research
More information