Light64: Ligh support for data ra. Darko Marinov, Josep Torrellas. a.cs.uiuc.edu

Size: px

Start display at page:

Download "Light64: Ligh support for data ra. Darko Marinov, Josep Torrellas. a.cs.uiuc.edu"

Lynne Gallagher
5 years ago
Views:

1 : Ligh htweight hardware support for data ra ce detection ec during systematic testing Adrian Nistor, Darko Marinov, Josep Torrellas University of Illinois, Urbana Champaign a.cs.uiuc.edu

2 Outline Motivation Systematic Testing Evaluation Conclusion

3 Common concurrency bug Difficult to detect t Data Races Cause unexpected crashes even in code that is well tested Example: X == 0 Thread A Thread B X +=1 X+=1 Depending on the run: X = 2 or X = 1

4 Contribution: : new data race detection technique Software Hardware Hardware requirement none 64 bits Kbits Execution overhead 8X 1 37% 05% 0.5% NO false positives Detects 96% of races

5 Outline Motivation Systematic ti Testing Evaluation Conclusion

6 Systemati ic Testing To detect bugs, we need high test coverage Very important in parallel prog grams One input, many thread interleavings Systematic testing Systematically execute many thread interleavings Example: CHESS (used by Microsoft testers) Systematic testers include dataa race detection Turned off by default Due to high runtime overhead : Overhead low enough to be always ON

7 How Systematic Testing Works Thread A A3 Wait X Wait Y Thread B Signal X B2 SEGMENT == sequence of dynamic instructions Execute many different interleavings Multiplex segments in a uniprocessor A1 A 3 B 2

8 Outline Motivation Systematic Tes sting Evaluation Conclusion

9 Example Thread A x = 0 0 Thread B x = 3 = x Signal Wait RACE Race on X: because accesses to X are not ordered by synchronization

10 Thread A Thread B The Idea Perform two executions flipping the unordered segments

11 Thread A Thread B The Idea No sync May have a race B1 UN FLIPPED Sync No race possible FLIPPED PRESERVE f segments A1 and B1 have NO race they are indepen ndent NOTHING changes have a RACE we also flipped the race Access history changes

12 Thread A Thread B The Idea x = 3 x = 0 x = 0 = x ( 0 ) B1 =x(3) x = 3 UN FLIPPED CHANGED! FLIPPED f segments A1 and B1 have NO race they are indepen ndent NOTHING changes have a RACE we also flipped the race SOMETHING changes

13 Overview Use two different exec cutions Same synchronization order If change race No change know for sure there is a highly probable no race

14 Phases in Detect if races exist Fast, over all thread interle eavings executed by the tester Issues Pin point races How to detect deviations (e.g. from 0 to 3) HW hash How to flip the segments with low overhead SW Slow, classic data race detection algorithm Only if there are races Only for the racy interleav vings Optimization: only for selected racy interleavings

15 Detecting Deviations Per thread: hash all the values read from memory on-the-fly Compare hashes of two execu utions with same sync order Different hashes Identical hashes know for sure there is a race high probability no race

16 Thread A Thread B Example: Detecting Deviations UN FLIPPED FLIPPED A1 HASH (READs) HASH (READs) END of execution HASH (READs) HASH (READs) ==? ==?

17 HW System BONUS ROB CRC 64 hash logic REGISTER virtualize migrate context switch no cache spills Head of ROB 64 bit register Accumulates values read from memory

18 Thread A Thread B Flip with Low Overhead STATE TREE THRE INTERLEAVING AD A1 A 3 B 2

19 Flip with Low Overhead Thread A Thread B A1 A 3 B 2

20 Flip with Low Overhead Thread A Thread B A1 A 3 UN FLIPPED FLIPPED B 2 Piggy back on Systematic Testing primitives to reduce overhead Some synchronization orders are execut ted multiple times

21 Outline Motivation Systematic Tes sting Evaluation Conclusion

22 Experimen ntal Setup Developed systematic tester in the lines of CHESS Tested all SPLASH-2 applications Run with 2 and 4 threads Execution overhead Accuracy Compare to a systematic tester with no race detection: Plain Compare to a systematic tester running a SW precise race detector Propose two versions: Active: Aggressive flipping for high coverage Passive: Modest flipping for minimum overhead

23 Execution Overhead (4 threads) Plain Passive ActiveFIN 1.2 alized to Plain Overhead Norm Barnes Cholesk FFT FMM LU Ocean Radiosi- Radix Ray- Volrend Water- Water- MEAN y ty trace NS SP Tradeoff execution overhead vs detection accuracy Active: 37% overhead, Passive: 2 % overhead, SW only: 8 X overhead, 96% races detected 89% races detected 100% races detected

24 Detection Accuracy y( (4 threads) Original Inserted Races Precise SW Precise SW Barnes Cholesky FFT FMM LU Ocean Radiosity Radix Raytrace Volrend Water-NS Water-SP Average race detection accuracy: 96 %

25 Also in the Paper Additional versi ons Optimization for the phase that pin-points races Characterization of the systematic testing Overhead and accuracy for two-threaded th d runs Additional results Software-only implementation

26 Outline Motivation Systematic Tes sting Evaluation Conclusionn

27 Conclusion Introduced, new data race detection technique for use during systematic testing HW Virtualize Cache spill Context switch Migration No False Positives Few False Negatives Runtime Overhead 64 bits 2 â 37% Other HW 72 â 400 â 400 Kbits NO or replicate the HW except one NO or additional HW 0.5%

28 : Ligh htweight hardware support for data race detection during systematic testing Adrian Nistor, Darko Marinov, Josep Torrellas University of Illinois, Urbana Champaign a.cs.uiuc.edu

ReVive: Cost-Effective Architectural Support for Rollback Recovery in Shared-Memory Multiprocessors

ReVive: Cost-Effective Architectural Support for Rollback Recovery in Shared-Memory Multiprocessors Milos Prvulovic, Zheng Zhang*, Josep Torrellas University of Illinois at Urbana-Champaign *Hewlett-Packard