Accelerating Dynamic Data Race Detection Using Static Thread Interference Analysis
|
|
- Felicity Rice
- 5 years ago
- Views:
Transcription
1 Accelerating Dynamic Data Race Detection Using Static Thread Interference Peng Di and Yulei Sui School of Computer Science and Engineering The University of New South Wales 2052 Sydney Australia March 12-13, / 18
2 Outline Motivation phases Evalution 1 / 18
3 Dynamic Data Race Detector: ThreadSanitizer ThreadSanitizer (TSan) finds data races in multithreaded programs by inserting instrumentations at compile-time to perform runtime checks for all memory accesses. void foo(int *p) { *p = 42; } Orignial C code void foo(int *p) { tsan_func_entry( builtin_return_address(0)); tsan_write4(p); *p = 42; tsan_func_exit(); } Code after instrumentation 2 / 18
4 TSan performance slowdown over native code for SPLASH2 benchmarks (under compiler option -O0) 4 Threads 16 Threads 40 x 20 x 0 x barnes fft lu cb lu ncb ocean cp ocean ncp radiosity radix raytrace water nsquared water spatial average Machine: Ubuntu Linux generic Intel Xeon Quad Core HT, 3.7GHZ, 64GB 3 / 18
5 Outline Motivation phases Evalution 3 / 18
6 Framework Source Clang Front-End bc file Analyses s bc file Memory Pairs Collection Call Graph Construction Reachability Memory pairs Reachable pairs Interleaving Interleaving MHP pairs Pointer Alias Aliased pairs Thread-Local Thread-Local Lockset Lockset Escaped pairs Unlocked pairs Guided Instrumentation Instrumented bc file Code Generation Binary 4 / 18
7 Framework Source Clang Front-End bc file Analyses s bc file Memory Pairs Collection Call Graph Construction Reachability Memory pairs Reachable pairs Interleaving Interleaving MHP pairs Pointer Alias Aliased pairs Thread-Local Thread-Local Lockset Lockset Escaped pairs Unlocked pairs Guided Instrumentation Instrumented bc file Code Generation Binary 4 / 18
8 Framework Source Clang Front-End bc file Analyses s bc file Memory Pairs Collection Call Graph Construction Reachability Memory pairs Reachable pairs Interleaving Interleaving MHP pairs Pointer Alias Aliased pairs Thread-Local Thread-Local Lockset Lockset Escaped pairs Unlocked pairs Guided Instrumentation Instrumented bc file Code Generation Binary 4 / 18
9 Framework Source Clang Front-End bc file Analyses s bc file Memory Pairs Collection Call Graph Construction Reachability Memory pairs Reachable pairs Interleaving Interleaving MHP pairs Pointer Alias Aliased pairs Thread-Local Thread-Local Lockset Lockset Escaped pairs Unlocked pairs Guided Instrumentation Instrumented bc file Code Generation Binary 4 / 18
10 Framework Source Clang Front-End bc file Analyses s bc file Memory Pairs Collection Call Graph Construction Reachability Memory pairs Reachable pairs Interleaving Interleaving MHP pairs Pointer Alias Aliased pairs Thread-Local Thread-Local Lockset Lockset Escaped pairs Unlocked pairs Guided Instrumentation Instrumented bc file Code Generation Binary 4 / 18
11 Refining Pairs Only statements in the paris that are not filtered are instrumented for runtime check. bc file Memory Pairs Collection Memory Pairs Reachable Pairs MHP Pairs Alised Pairs Escaped Pairs Unlocked Pairs Guided Instrumentation Instrumented bc file 5 / 18
12 Context-Sensitive Abstract Threads An abstract thread t refers to a call of pthread create() at a context-sensitive fork site during the analysis. void main(){ cs1: cs2: } foo(); foo(); t1 refers to fork site under context [1,3] void foo(){ cs3: fork(t1, bar); } t1' refers to fork site under context [2,3] void main(){ } for(i=0;i<10;i++){ fork(t[i], foo) } t1 and t1' are context-sensitive threads t is multi-forked thread 6 / 18
13 Context-Sensitive Abstract Threads An abstract thread t refers to a call of pthread create() at a context-sensitive fork site during the analysis. void main(){ cs1: cs2: } foo(); foo(); t1 refers to fork site under context [1,3] void foo(){ cs3: fork(t1, bar); } t1' refers to fork site under context [2,3] void main(){ } for(i=0;i<10;i++){ fork(t[i], foo) } t1 and t1' are context-sensitive threads t is multi-forked thread A thread t always refers to a context-sensitive fork site, i.e., a unique runtime thread unless t M is multi-forked, in which case, t may represent more than one runtime thread. 6 / 18
14 Outline Motivation phases Thread Interleaving Alias Thread Local Lock Evalution 6 / 18
15 Context-sensitive Thread Interleaving (t 1, c 1, s 1 ) (t 2, c 2, s 2 ) holds if: { t 2 I(t 1, c 1, s 1 ) t 1 I(t 2, c 2, s 2 ) if t 1 t 2 t 1 M otherwise where I(t, c, s): denotes a set of interleaved threads may run in parallel with s in thread t under calling context c, M is the set of multi-forked threads. 7 / 18
16 Interleaving Computing I(t, c, s) is formalized as a forward data-flow problem (V,, F). V : the set of all thread interleaving facts. : meet operator ( ). F: V V transfer functions associated with each node in an ICFG. 8 / 18
17 Interleaving Rule [I-DESCENDANT] [I-SIBLING] (c,fk i ) t t (t, c, fk i ) (t, c, l) (c, l ) = Entry(S t ) {t } I(t, c, l) {t} I(t, c, l ) t t (c, l) = Entry(S t) (c, l ) = Entry(S t ) t t t t {t} I(t, c, l ) {t } I(t, c, l) [I-JOIN] (c,jn i ) t t I(t, c, jn i ) = I(t, c, jn i )\{t } [I-CALL] (t, c, l) call i (t, c, l ) c = c.push(i) I(t, c, l) I(t, c, l ) [I-INTRA] (t, c, l) (t, c, l ) I(t, c, l) I(t, c, l ) [I-RET] (t, c, l) ret i (t, c, l ) i = c.peek() c = c.pop() I(t, c, l) I(t, c, l ) 9 / 18
18 Interleaving Rule [I-DESCENDANT] t (c,fk i ) t (t, c, fk i ) (t, c, l) (c, l ) = Entry(S t ) {t } I(t, c, l) {t} I(t, c, l ) t t' I(t,c,s) = {t'} fork s s' I(t',c',s')={t} (t,c,s) (t',c',s') 10 / 18
19 Interleaving Rule [I-JOIN] (c,jn i ) t t I(t, c, jn i ) = I(t, c, jn i )\{t } t t' fork I(t,c,s) = {t'} join I(t,c,s1) = { } (t,c,s) (t,c,s) s s1 s' I(t',c',s')={t} (t',c',s') (t,c,s1) 10 / 18
20 Interleaving Rule [I-SIBLING] t t (c, l) = Entry(S t ) (c, l ) = Entry(S t ) t t t t {t} I(t, c, l ) {t } I(t, c, l) t' t0 t I(t',c',s')={} s' fork join s fork join I(t,c,s)={} (t,c,s) (t',c',s') 10 / 18
21 Interleaving Rule [I-SIBLING] t t (c, l) = Entry(S t ) (c, l ) = Entry(S t ) t t t t {t} I(t, c, l ) {t } I(t, c, l) t' t0 t I(t',c',s')={t} s' fork join s fork join I(t,c,s)={t'} (t,c,s) (t',c',s') 10 / 18
22 Interleaving Rule [I-DESCENDANT] [I-SIBLING] (c,fk i ) t t (t, c, fk i ) (t, c, l) (c, l ) = Entry(S t ) {t } I(t, c, l) {t} I(t, c, l ) t t (c, l) = Entry(S t) (c, l ) = Entry(S t ) t t t t {t} I(t, c, l ) {t } I(t, c, l) [I-JOIN] (c,jn i ) t t I(t, c, jn i ) = I(t, c, jn i )\{t } [I-CALL] (t, c, l) call i (t, c, l ) c = c.push(i) I(t, c, l) I(t, c, l ) [I-INTRA] (t, c, l) (t, c, l ) I(t, c, l) I(t, c, l ) [I-RET] (t, c, l) ret i (t, c, l ) i = c.peek() c = c.pop() I(t, c, l) I(t, c, l ) 11 / 18
23 Outline Motivation phases Thread Interleaving Alias Thread Local Lock Evalution 11 / 18
24 Alias Obtain aliasing pairs by refining MHP store-load and store-store pairs [(t, c, s), (t, c, s )], where Alias( p, q) is the set of objects pointed to by both p and q. s : p = s : = q or q = (t, c, s) (t, c, s ) o Alias( p, q) o s s int x,y; int *p,*q,*r; p=&x; q=&x; r=&y; void main(){ fork(t, foo); s1: *p=...; s2: *r=...; } void foo(){ s3:...=*q; } 12 / 18
25 Outline Motivation phases Thread Interleaving Alias Thread Local Lock Evalution 12 / 18
26 Thread-Local An object is not thread-local, i.e. escaping, if it escapes via arguments at a forksite global pointers void main(){ int x,y; fork(t,foo,&x); s1: x=...; join(t); s3: y=...; } void foo(int* p){ s2:...=*p; } 13 / 18
27 Outline Motivation phases Thread Interleaving Alias Thread Local Lock Evalution 13 / 18
28 Lockset Statements from different lockunlock spans, are interferencefree if these spans are protected by a common lock. Our framework does this by performing a flow- and contextsensitive analysis for lock/unlock operations int x; mutex m; void main(){ fork(t,foo); s1: x=...; lock(m); s3: x=...; unlock(m); join(t); } void foo(){ lock(m); s2:...=x; unlock(m); } 14 / 18
29 Outline Motivation phases Evalution 14 / 18
30 Evaluation Implementation: On top of our previous open-source tool SVF ( Based on our previous papers CGO 16 and ICPP 15 Benchmarks: 11 SPLASH2 Pthread benchmarks Machine setup: Ubuntu Linux generic Intel Xeon Quad Core HT, 3.7GHZ, 64GB 15 / 18
31 Instrumentation Statistics (under Option -O0) Pthread API TSan Our Approach Fork Join Lock Unlock Read Write Read Write barnes fft lu cb lu ncb ocean cp ocean ncp radiosity radix raytrace water nsquared water spatial / 18
32 Speedups over Original TSan (under Option -O0) 4 Threads 16 Threads 4 x 2 x 0 x barnes fft lu cb lu ncb ocean cp ocean ncp radiosity radix raytrace water nsquared water spatial average 17 / 18
33 Thanks! Q & A 18 / 18
SVF: Static Value-Flow Analysis in LLVM
SVF: Static Value-Flow Analysis in LLVM Yulei Sui, Peng Di, Ding Ye, Hua Yan and Jingling Xue School of Computer Science and Engineering The University of New South Wales 2052 Sydney Australia March 18,
More informationAccelerating Dynamic Data Race Detection Using Static Thread Interference Analysis
Accelerating Dynamic Data Race Detection Using Static Thread Interference Analysis ABSTRACT Peng Di School of Computer Science and Engineering University of New South Wales, Australia pengd@cse.unsw.edu.au
More informationChimera: Hybrid Program Analysis for Determinism
Chimera: Hybrid Program Analysis for Determinism Dongyoon Lee, Peter Chen, Jason Flinn, Satish Narayanasamy University of Michigan, Ann Arbor - 1 - * Chimera image from http://superpunch.blogspot.com/2009/02/chimera-sketch.html
More informationSiloed Reference Analysis
Siloed Reference Analysis Xing Zhou 1. Objectives: Traditional compiler optimizations must be conservative for multithreaded programs in order to ensure correctness, since the global variables or memory
More informationIdentifying Ad-hoc Synchronization for Enhanced Race Detection
Identifying Ad-hoc Synchronization for Enhanced Race Detection IPD Tichy Lehrstuhl für Programmiersysteme IPDPS 20 April, 2010 Ali Jannesari / Walter F. Tichy KIT die Kooperation von Forschungszentrum
More informationPredicting Data Races from Program Traces
Predicting Data Races from Program Traces Luis M. Carril IPD Tichy Lehrstuhl für Programmiersysteme KIT die Kooperation von Forschungszentrum Karlsruhe GmbH und Universität Karlsruhe (TH) Concurrency &
More informationLoop-Oriented Array- and Field-Sensitive Pointer Analysis for Automatic SIMD Vectorization
Loop-Oriented Array- and Field-Sensitive Pointer Analysis for Automatic SIMD Vectorization Yulei Sui, Xiaokang Fan, Hao Zhou and Jingling Xue School of Computer Science and Engineering The University of
More informationThis is the published version of a paper presented at CGO 2017, February 4 8, Austin, TX.
http://www.diva-portal.org This is the published version of a paper presented at CGO 2017, February 4 8, Austin, TX. Citation for the original published paper: Jimborean, A., Waern, J., Ekemark, P., Kaxiras,
More informationAccelerating Multi-core Processor Design Space Evaluation Using Automatic Multi-threaded Workload Synthesis
Accelerating Multi-core Processor Design Space Evaluation Using Automatic Multi-threaded Workload Synthesis Clay Hughes & Tao Li Department of Electrical and Computer Engineering University of Florida
More informationDynamic Race Detection with LLVM Compiler
Dynamic Race Detection with LLVM Compiler Compile-time instrumentation for ThreadSanitizer Konstantin Serebryany, Alexander Potapenko, Timur Iskhodzhanov, and Dmitriy Vyukov OOO Google, 7 Balchug st.,
More informationCan We Make It Faster? Efficient May-Happen-in-Parallel Analysis Revisited
Can We Make It Faster? Efficient May-Happen-in-Parallel Analysis Revisited Congming Chen, Wei Huo, Lung Li Xiaobing Feng, Kai Xing State Key Laboratory of Computer Architecture Institute of Computing Technology,
More informationLDetector: A low overhead data race detector for GPU programs
LDetector: A low overhead data race detector for GPU programs 1 PENGCHENG LI CHEN DING XIAOYU HU TOLGA SOYATA UNIVERSITY OF ROCHESTER 1 Data races in GPU Introduction & Contribution Impact correctness
More informationEfficient Data Race Detection for Unified Parallel C
P A R A L L E L C O M P U T I N G L A B O R A T O R Y Efficient Data Race Detection for Unified Parallel C ParLab Winter Retreat 1/14/2011" Costin Iancu, LBL" Nick Jalbert, UC Berkeley" Chang-Seo Park,
More informationLight64: Ligh support for data ra. Darko Marinov, Josep Torrellas. a.cs.uiuc.edu
: Ligh htweight hardware support for data ra ce detection ec during systematic testing Adrian Nistor, Darko Marinov, Josep Torrellas University of Illinois, Urbana Champaign http://iacoma a.cs.uiuc.edu
More informationTyped Assembly Language for Implementing OS Kernels in SMP/Multi-Core Environments with Interrupts
Typed Assembly Language for Implementing OS Kernels in SMP/Multi-Core Environments with Interrupts Toshiyuki Maeda and Akinori Yonezawa University of Tokyo Quiz [Environment] CPU: Intel Xeon X5570 (2.93GHz)
More informationDeterministic Replay and Data Race Detection for Multithreaded Programs
Deterministic Replay and Data Race Detection for Multithreaded Programs Dongyoon Lee Computer Science Department - 1 - The Shift to Multicore Systems 100+ cores Desktop/Server 8+ cores Smartphones 2+ cores
More informationFast dynamic program analysis Race detection. Konstantin Serebryany May
Fast dynamic program analysis Race detection Konstantin Serebryany May 20 2011 Agenda Dynamic program analysis Race detection: theory ThreadSanitizer: race detector Making ThreadSanitizer
More informationSnatch: Opportunistically Reassigning Power Allocation between Processor and Memory in 3D Stacks
Snatch: Opportunistically Reassigning Power Allocation between and in 3D Stacks Dimitrios Skarlatos, Renji Thomas, Aditya Agrawal, Shibin Qin, Robert Pilawa, Ulya Karpuzcu, Radu Teodorescu, Nam Sung Kim,
More informationStatic Versioning of Global State for Race Condition Detection
Static Versioning of Global State for Race Condition Detection Steffen Keul Dept. of Programming Languages and Compilers Institute of Software Technology University of Stuttgart 15 th International Conference
More informationCS 153 Lab4 and 5. Kishore Kumar Pusukuri. Kishore Kumar Pusukuri CS 153 Lab4 and 5
CS 153 Lab4 and 5 Kishore Kumar Pusukuri Outline Introduction A thread is a straightforward concept : a single sequential flow of control. In traditional operating systems, each process has an address
More informationIntel Threading Tools
Intel Threading Tools Paul Petersen, Intel -1- INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS,
More informationHonours/Master/PhD Thesis Projects Supervised by Dr. Yulei Sui
Honours/Master/PhD Thesis Projects Supervised by Dr. Yulei Sui Projects 1 Information flow analysis for mobile applications 2 2 Machine-learning-guide typestate analysis for UAF vulnerabilities 3 3 Preventing
More informationCS 333 Introduction to Operating Systems. Class 3 Threads & Concurrency. Jonathan Walpole Computer Science Portland State University
CS 333 Introduction to Operating Systems Class 3 Threads & Concurrency Jonathan Walpole Computer Science Portland State University 1 Process creation in UNIX All processes have a unique process id getpid(),
More informationMELD: Merging Execution- and Language-level Determinism. Joe Devietti, Dan Grossman, Luis Ceze University of Washington
MELD: Merging Execution- and Language-level Determinism Joe Devietti, Dan Grossman, Luis Ceze University of Washington CoreDet results overhead normalized to nondet 800% 600% 400% 200% 0% 2 4 8 2 4 8 2
More informationVulcan: Hardware Support for Detecting Sequential Consistency Violations Dynamically
Vulcan: Hardware Support for Detecting Sequential Consistency Violations Dynamically Abdullah Muzahid, Shanxiang Qi, and University of Illinois at Urbana-Champaign http://iacoma.cs.uiuc.edu MICRO December
More informationIntroduction Contech s Task Graph Representation Parallel Program Instrumentation (Break) Analysis and Usage of a Contech Task Graph Hands-on
Introduction Contech s Task Graph Representation Parallel Program Instrumentation (Break) Analysis and Usage of a Contech Task Graph Hands-on Exercises 2 Contech is An LLVM compiler pass to instrument
More informationDetecting and Tolerating Asymmetric Races
Detecting and Tolerating Asymmetric Races Paruj Ratanaworabhan 1, Martin Burtscher 2, Darko Kirovski 3, Rahul Nagpal 4, Karthik Pattabiraman 5, and Benjamin Zorn 3 1 Cornell University, 2 The University
More informationDynamic Deadlock Avoidance Using Statically Inferred Effects
Dynamic Deadlock Avoidance Using Statically Inferred Effects Kostis Sagonas1,2 joint work with P. Gerakios1 1 N. Papaspyrou1 P. Vekris1,3 School of ECE, National Technical University of Athens, Greece
More informationPOSIX PTHREADS PROGRAMMING
POSIX PTHREADS PROGRAMMING Download the exercise code at http://www-micrel.deis.unibo.it/~capotondi/pthreads.zip Alessandro Capotondi alessandro.capotondi(@)unibo.it Hardware Software Design of Embedded
More informationCOMP6771 Advanced C++ Programming
1.... COMP6771 Advanced C++ Programming Week 9 Multithreading 2016 www.cse.unsw.edu.au/ cs6771 .... Single Threaded Programs All programs so far this semester have been single threaded They have a single
More informationApplying Multi-Core Model Checking to Hardware-Software Partitioning in Embedded Systems
V Brazilian Symposium on Computing Systems Engineering Applying Multi-Core Model Checking to Hardware-Software Partitioning in Embedded Systems Alessandro Trindade, Hussama Ismail, and Lucas Cordeiro Foz
More informationNew features in AddressSanitizer. LLVM developer meeting Nov 7, 2013 Alexey Samsonov, Kostya Serebryany
New features in AddressSanitizer LLVM developer meeting Nov 7, 2013 Alexey Samsonov, Kostya Serebryany Agenda AddressSanitizer (ASan): a quick reminder New features: Initialization-order-fiasco Stack-use-after-scope
More informationFractal: A Software Toolchain for Mapping Applications to Diverse, Heterogeneous Architecures
Fractal: A Software Toolchain for Mapping Applications to Diverse, Heterogeneous Architecures University of Virginia Dept. of Computer Science Technical Report #CS-2011-09 Jeremy W. Sheaffer and Kevin
More informationDavid R. Mackay, Ph.D. Libraries play an important role in threading software to run faster on Intel multi-core platforms.
Whitepaper Introduction A Library Based Approach to Threading for Performance David R. Mackay, Ph.D. Libraries play an important role in threading software to run faster on Intel multi-core platforms.
More informationffwd: delegation is (much) faster than you think Sepideh Roghanchi, Jakob Eriksson, Nilanjana Basu
ffwd: delegation is (much) faster than you think Sepideh Roghanchi, Jakob Eriksson, Nilanjana Basu int get_seqno() { } return ++seqno; // ~1 Billion ops/s // single-threaded int threadsafe_get_seqno()
More informationMulticore programming in CilkPlus
Multicore programming in CilkPlus Marc Moreno Maza University of Western Ontario, Canada CS3350 March 16, 2015 CilkPlus From Cilk to Cilk++ and Cilk Plus Cilk has been developed since 1994 at the MIT Laboratory
More informationShared Memory Parallel Programming with Pthreads An overview
Shared Memory Parallel Programming with Pthreads An overview Part II Ing. Andrea Marongiu (a.marongiu@unibo.it) Includes slides from ECE459: Programming for Performance course at University of Waterloo
More informationTaming Parallelism in a Multi-Variant Execution Environment
Taming Parallelism in a Multi-Variant Execution Environment Stijn Volckaert University of California, Irvine stijnv@uci.edu Bart Coppens Ghent University bart.coppens@ugent.be Bjorn De Sutter Ghent University
More informationPROGRAMOVÁNÍ V C++ CVIČENÍ. Michal Brabec
PROGRAMOVÁNÍ V C++ CVIČENÍ Michal Brabec PARALLELISM CATEGORIES CPU? SSE Multiprocessor SIMT - GPU 2 / 17 PARALLELISM V C++ Weak support in the language itself, powerful libraries Many different parallelization
More informationCS 5523 Operating Systems: Midterm II - reivew Instructor: Dr. Tongping Liu Department Computer Science The University of Texas at San Antonio
CS 5523 Operating Systems: Midterm II - reivew Instructor: Dr. Tongping Liu Department Computer Science The University of Texas at San Antonio Fall 2017 1 Outline Inter-Process Communication (20) Threads
More informationCS 3305 Intro to Threads. Lecture 6
CS 3305 Intro to Threads Lecture 6 Introduction Multiple applications run concurrently! This means that there are multiple processes running on a computer Introduction Applications often need to perform
More informationECE 259 / CPS 221 Advanced Computer Architecture II (Parallel Computer Architecture) Parallel Programming
ECE 259 / CPS 221 Advanced Computer Architecture II (Parallel Computer Architecture) Parallel Programming Copyright 2010 Daniel J. Sorin Duke University Slides are derived from work by Sarita Adve (Illinois),
More informationCS 475. Process = Address space + one thread of control Concurrent program = multiple threads of control
Processes & Threads Concurrent Programs Process = Address space + one thread of control Concurrent program = multiple threads of control Multiple single-threaded processes Multi-threaded process 2 1 Concurrent
More informationCS 333 Introduction to Operating Systems. Class 3 Threads & Concurrency. Jonathan Walpole Computer Science Portland State University
CS 333 Introduction to Operating Systems Class 3 Threads & Concurrency Jonathan Walpole Computer Science Portland State University 1 The Process Concept 2 The Process Concept Process a program in execution
More informationW4118: concurrency error. Instructor: Junfeng Yang
W4118: concurrency error Instructor: Junfeng Yang Goals Identify patterns of concurrency errors (so you can avoid them in your code) Learn techniques to detect concurrency errors (so you can apply these
More informationDeterministic Process Groups in
Deterministic Process Groups in Tom Bergan Nicholas Hunt, Luis Ceze, Steven D. Gribble University of Washington A Nondeterministic Program global x=0 Thread 1 Thread 2 t := x x := t + 1 t := x x := t +
More informationPlan. Introduction to Multicore Programming. Plan. University of Western Ontario, London, Ontario (Canada) Multi-core processor CPU Coherence
Plan Introduction to Multicore Programming Marc Moreno Maza University of Western Ontario, London, Ontario (Canada) CS 3101 1 Multi-core Architecture 2 Race Conditions and Cilkscreen (Moreno Maza) Introduction
More informationDynamic Monitoring Tool based on Vector Clocks for Multithread Programs
, pp.45-49 http://dx.doi.org/10.14257/astl.2014.76.12 Dynamic Monitoring Tool based on Vector Clocks for Multithread Programs Hyun-Ji Kim 1, Byoung-Kwi Lee 2, Ok-Kyoon Ha 3, and Yong-Kee Jun 1 1 Department
More informationHeuristics for Profile-driven Method- level Speculative Parallelization
Heuristics for Profile-driven Method- level John Whaley and Christos Kozyrakis Stanford University Speculative Multithreading Speculatively parallelize an application Uses speculation to overcome ambiguous
More informationAdvanced Threads & Monitor-Style Programming 1/24
Advanced Threads & Monitor-Style Programming 1/24 First: Much of What You Know About Threads Is Wrong! // Initially x == 0 and y == 0 // Thread 1 Thread 2 x = 1; if (y == 1 && x == 0) exit(); y = 1; Can
More informationOverview. CMSC 330: Organization of Programming Languages. Concurrency. Multiprocessors. Processes vs. Threads. Computation Abstractions
CMSC 330: Organization of Programming Languages Multithreaded Programming Patterns in Java CMSC 330 2 Multiprocessors Description Multiple processing units (multiprocessor) From single microprocessor to
More informationCMSC 330: Organization of Programming Languages
CMSC 330: Organization of Programming Languages Multithreading Multiprocessors Description Multiple processing units (multiprocessor) From single microprocessor to large compute clusters Can perform multiple
More informationIntroduction to OpenMP.
Introduction to OpenMP www.openmp.org Motivation Parallelize the following code using threads: for (i=0; i
More informationData-Centric Locality in Chapel
Data-Centric Locality in Chapel Ben Harshbarger Cray Inc. CHIUW 2015 1 Safe Harbor Statement This presentation may contain forward-looking statements that are based on our current expectations. Forward
More informationReVive: Cost-Effective Architectural Support for Rollback Recovery in Shared-Memory Multiprocessors
ReVive: Cost-Effective Architectural Support for Rollback Recovery in Shared-Memory Multiprocessors Milos Prvulovic, Zheng Zhang*, Josep Torrellas University of Illinois at Urbana-Champaign *Hewlett-Packard
More informationUni-Address Threads: Scalable Thread Management for RDMA-based Work Stealing
Uni-Address Threads: Scalable Thread Management for RDMA-based Work Stealing Shigeki Akiyama, Kenjiro Taura The University of Tokyo June 17, 2015 HPDC 15 Lightweight Threads Lightweight threads enable
More informationCall Paths for Pin Tools
, Xu Liu, and John Mellor-Crummey Department of Computer Science Rice University CGO'14, Orlando, FL February 17, 2014 What is a Call Path? main() A() B() Foo() { x = *ptr;} Chain of function calls that
More informationVerifying Concurrent Programs
Verifying Concurrent Programs Daniel Kroening 8 May 1 June 01 Outline Shared-Variable Concurrency Predicate Abstraction for Concurrent Programs Boolean Programs with Bounded Replication Boolean Programs
More informationCS 261 Fall Mike Lam, Professor. Threads
CS 261 Fall 2017 Mike Lam, Professor Threads Parallel computing Goal: concurrent or parallel computing Take advantage of multiple hardware units to solve multiple problems simultaneously Motivations: Maintain
More informationEECE.4810/EECE.5730: Operating Systems Spring 2017 Homework 2 Solution
1. (15 points) A system with two dual-core processors has four processors available for scheduling. A CPU-intensive application (e.g., a program that spends most of its time on computation, not I/O or
More informationMetaFork: A Compilation Framework for Concurrency Platforms Targeting Multicores
MetaFork: A Compilation Framework for Concurrency Platforms Targeting Multicores Presented by Xiaohui Chen Joint work with Marc Moreno Maza, Sushek Shekar & Priya Unnikrishnan University of Western Ontario,
More informationIntroduction to Multicore Programming
Introduction to Multicore Programming Minsoo Ryu Department of Computer Science and Engineering 2 1 Multithreaded Programming 2 Synchronization 3 Automatic Parallelization and OpenMP 4 GPGPU 5 Q& A 2 Multithreaded
More informationEfficient, Deterministic, and Deadlock-free Concurrency
Efficient, Deterministic Concurrency p. 1/31 Efficient, Deterministic, and Deadlock-free Concurrency Nalini Vasudevan Columbia University Efficient, Deterministic Concurrency p. 2/31 Data Races int x;
More informationExplicitly Parallel Programming with Shared Memory is Insane: At Least Make it Deterministic!
Explicitly Parallel Programming with Shared Memory is Insane: At Least Make it Deterministic! Joe Devietti, Brandon Lucia, Luis Ceze and Mark Oskin University of Washington Parallel Programming is Hard
More informationMaking Context-sensitive Points-to Analysis with Heap Cloning Practical For The Real World
Making Context-sensitive Points-to Analysis with Heap Cloning Practical For The Real World Chris Lattner Apple Andrew Lenharth UIUC Vikram Adve UIUC What is Heap Cloning? Distinguish objects by acyclic
More informationDealing with Concurrency Problems
Dealing with Concurrency Problems Nalini Vasudevan Columbia University Dealing with Concurrency Problems, Nalini Vasudevan p. 1 What is the output? int x; foo(){ int m; m = qux(); x = x + m; bar(){ int
More informationThe Cilk part is a small set of linguistic extensions to C/C++ to support fork-join parallelism. (The Plus part supports vector parallelism.
Cilk Plus The Cilk part is a small set of linguistic extensions to C/C++ to support fork-join parallelism. (The Plus part supports vector parallelism.) Developed originally by Cilk Arts, an MIT spinoff,
More informationPutting the Checks into Checked C. Archibald Samuel Elliott Quals - 31st Oct 2017
Putting the Checks into Checked C Archibald Samuel Elliott Quals - 31st Oct 2017 Added Runtime Bounds Checks to the Checked C Compiler 2 C Extension for Spatial Memory Safety Added Runtime Bounds Checks
More informationLecture 14 Pointer Analysis
Lecture 14 Pointer Analysis Basics Design Options Pointer Analysis Algorithms Pointer Analysis Using BDDs Probabilistic Pointer Analysis [ALSU 12.4, 12.6-12.7] Phillip B. Gibbons 15-745: Pointer Analysis
More informationDeterministic Concurrency
Candidacy Exam p. 1/35 Deterministic Concurrency Candidacy Exam Nalini Vasudevan Columbia University Motivation Candidacy Exam p. 2/35 Candidacy Exam p. 3/35 Why Parallelism? Past Vs. Future Power wall:
More informationCSCI4430 Data Communication and Computer Networks. Pthread Programming. ZHANG, Mi Jan. 26, 2017
CSCI4430 Data Communication and Computer Networks Pthread Programming ZHANG, Mi Jan. 26, 2017 Outline Introduction What is Multi-thread Programming Why to use Multi-thread Programming Basic Pthread Programming
More informationAd Hoc Synchroniza/on Considered Harmful
Ad Hoc Synchroniza/on Considered Harmful Weiwei Xiong, Soyoen Park, Jiaqi Zhang, Yuanyuan Zhou and Zhiqiang Ma UC San Diego University of Illinois Intel Synchroniza/on is Important Concurrent programs
More informationASaP: Annotations for Safe Parallelism in Clang. Alexandros Tzannes, Vikram Adve, Michael Han, Richard Latham
ASaP: Annotations for Safe Parallelism in Clang Alexandros Tzannes, Vikram Adve, Michael Han, Richard Latham Motivation Debugging parallel code is hard!! Many bugs are hard to reason about & reproduce
More informationSoftware Speculative Multithreading for Java
Software Speculative Multithreading for Java Christopher J.F. Pickett and Clark Verbrugge School of Computer Science, McGill University {cpicke,clump}@sable.mcgill.ca Allan Kielstra IBM Toronto Lab kielstra@ca.ibm.com
More informationThread Tailor Dynamically Weaving Threads Together for Efficient, Adaptive Parallel Applications
Thread Tailor Dynamically Weaving Threads Together for Efficient, Adaptive Parallel Applications Janghaeng Lee, Haicheng Wu, Madhumitha Ravichandran, Nathan Clark Motivation Hardware Trends Put more cores
More informationMotivation & examples Threads, shared memory, & synchronization
1 Motivation & examples Threads, shared memory, & synchronization How do locks work? Data races (a lower level property) How do data race detectors work? Atomicity (a higher level property) Concurrency
More information7/6/2015. Motivation & examples Threads, shared memory, & synchronization. Imperative programs
Motivation & examples Threads, shared memory, & synchronization How do locks work? Data races (a lower level property) How do data race detectors work? Atomicity (a higher level property) Concurrency exceptions
More informationpthreads CS449 Fall 2017
pthreads CS449 Fall 2017 POSIX Portable Operating System Interface Standard interface between OS and program UNIX-derived OSes mostly follow POSIX Linux, macos, Android, etc. Windows requires separate
More informationCS Lecture 3! Threads! George Mason University! Spring 2010!
CS 571 - Lecture 3! Threads! George Mason University! Spring 2010! Threads! Overview! Multithreading! Example Applications! User-level Threads! Kernel-level Threads! Hybrid Implementation! Observing Threads!
More informationCompiler Techniques For Memory Consistency Models
Compiler Techniques For Memory Consistency Models Students: Xing Fang (Purdue), Jaejin Lee (Seoul National University), Kyungwoo Lee (Purdue), Zehra Sura (IBM T.J. Watson Research Center), David Wong (Intel
More informationWarm-up question (CS 261 review) What is the primary difference between processes and threads from a developer s perspective?
Warm-up question (CS 261 review) What is the primary difference between processes and threads from a developer s perspective? CS 470 Spring 2018 POSIX Mike Lam, Professor Multithreading & Pthreads MIMD
More informationRAMP-White / FAST-MP
RAMP-White / FAST-MP Hari Angepat and Derek Chiou Electrical and Computer Engineering University of Texas at Austin Supported in part by DOE, NSF, SRC,Bluespec, Intel, Xilinx, IBM, and Freescale RAMP-White
More informationFormal Methods: Model Checking and Other Applications. Orna Grumberg Technion, Israel. Marktoberdorf 2017
Formal Methods: Model Checking and Other Applications Orna Grumberg Technion, Israel Marktoberdorf 2017 1 Outline Model checking of finite-state systems Assisting in program development Program repair
More informationCalvin Lin The University of Texas at Austin
Interprocedural Analysis Last time Introduction to alias analysis Today Interprocedural analysis March 4, 2015 Interprocedural Analysis 1 Motivation Procedural abstraction Cornerstone of programming Introduces
More informationUnderstanding the Interleaving Space Overlap across Inputs and So7ware Versions
Understanding the Interleaving Space Overlap across Inputs and So7ware Versions Dongdong Deng, Wei Zhang, Borui Wang, Peisen Zhao, Shan Lu University of Wisconsin, Madison 1 Concurrency bug detec3on is
More informationPCERE: Fine-grained Parallel Benchmark Decomposition for Scalability Prediction
PCERE: Fine-grained Parallel Benchmark Decomposition for Scalability Prediction Mihail Popov, Chadi kel, Florent Conti, William Jalby, Pablo de Oliveira Castro UVSQ - PRiSM - ECR Mai 28, 2015 Introduction
More informationSwizzle Switch: A Self-Arbitrating High-Radix Crossbar for NoC Systems
1 Swizzle Switch: A Self-Arbitrating High-Radix Crossbar for NoC Systems Ronald Dreslinski, Korey Sewell, Thomas Manville, Sudhir Satpathy, Nathaniel Pinckney, Geoff Blake, Michael Cieslak, Reetuparna
More informationUniversidad Carlos III de Madrid Computer Science and Engineering Department Operating Systems Course
Exercise 1 (20 points). Autotest. Answer the quiz questions in the following table. Write the correct answer with its corresponding letter. For each 3 wrong answer, one correct answer will be subtracted
More information11/19/2013. Imperative programs
if (flag) 1 2 From my perspective, parallelism is the biggest challenge since high level programming languages. It s the biggest thing in 50 years because industry is betting its future that parallel programming
More informationRuntime Monitoring on Multicores via OASES
Runtime Monitoring on Multicores via OASES Vijay Nagarajan and Rajiv Gupta University of California, CSE Department, Riverside, CA 92521 {vijay,gupta}@cs.ucr.edu Abstract Runtime monitoring support serves
More informationLecture 27. Pros and Cons of Pointers. Basics Design Options Pointer Analysis Algorithms Pointer Analysis Using BDDs Probabilistic Pointer Analysis
Pros and Cons of Pointers Lecture 27 Pointer Analysis Basics Design Options Pointer Analysis Algorithms Pointer Analysis Using BDDs Probabilistic Pointer Analysis Many procedural languages have pointers
More informationLecture 20 Pointer Analysis
Lecture 20 Pointer Analysis Basics Design Options Pointer Analysis Algorithms Pointer Analysis Using BDDs Probabilistic Pointer Analysis (Slide content courtesy of Greg Steffan, U. of Toronto) 15-745:
More informationContext-Switch-Directed Verification in DIVINE
Context-Switch-Directed Verification in DIVINE MEMICS 2014 Vladimír Štill Petr Ročkai Jiří Barnat Faculty of Informatics Masaryk University, Brno October 18, 2014 Vladimír Štill et al. Context-Switch-Directed
More informationNVthreads: Practical Persistence for Multi-threaded Applications
NVthreads: Practical Persistence for Multi-threaded Applications Terry Hsu*, Purdue University Helge Brügner*, TU München Indrajit Roy*, Google Inc. Kimberly Keeton, Hewlett Packard Labs Patrick Eugster,
More informationCSE 374 Programming Concepts & Tools
CSE 374 Programming Concepts & Tools Hal Perkins Fall 2017 Lecture 22 Shared-Memory Concurrency 1 Administrivia HW7 due Thursday night, 11 pm (+ late days if you still have any & want to use them) Course
More informationCIRM - Dynamic Error Detection
CIRM - Dynamic Error Detection Peter Pirkelbauer Center for Applied Scientific Computing (CASC) Lawrence Livermore National Laboratory This work was funded by the Department of Defense and used elements
More informationDetecting Data Races in Multi-Threaded Programs
Slide 1 / 22 Detecting Data Races in Multi-Threaded Programs Eraser A Dynamic Data-Race Detector for Multi-Threaded Programs John C. Linford Slide 2 / 22 Key Points 1. Data races are easy to cause and
More informationLogTM: Log-Based Transactional Memory
LogTM: Log-Based Transactional Memory Kevin E. Moore, Jayaram Bobba, Michelle J. Moravan, Mark D. Hill, & David A. Wood 12th International Symposium on High Performance Computer Architecture () 26 Mulitfacet
More informationOpenACC. Part 2. Ned Nedialkov. McMaster University Canada. CS/SE 4F03 March 2016
OpenACC. Part 2 Ned Nedialkov McMaster University Canada CS/SE 4F03 March 2016 Outline parallel construct Gang loop Worker loop Vector loop kernels construct kernels vs. parallel Data directives c 2013
More informationEECS 598: Integrating Emerging Technologies with Computer Architecture. Lecture 12: On-Chip Interconnects
1 EECS 598: Integrating Emerging Technologies with Computer Architecture Lecture 12: On-Chip Interconnects Instructor: Ron Dreslinski Winter 216 1 1 Announcements Upcoming lecture schedule Today: On-chip
More information