OPERATING SYSTEM SUPPORT FOR REDUNDANT MULTITHREADING. Björn Döbel (TU Dresden)

Size: px

Start display at page:

Download "OPERATING SYSTEM SUPPORT FOR REDUNDANT MULTITHREADING. Björn Döbel (TU Dresden)"

Clemence Kelley
5 years ago
Views:

1 OPERATING SYSTEM SUPPORT FOR REDUNDANT MULTITHREADING Björn Döbel (TU Dresden) Brussels,

2 Hardware Faults Radiation-induced soft errors Mainly an issue in avionics+space 1 DRAM errors in large data centers Google Study: > 2% failing DRAM DIMMs per year 2 ECC is not going to even detect a significant amount 3 Disk failure rate about 5% 4 Furthermore: decreasing transistor sizes, higher rate of transient errors in CPU functional units 5 1 Shirvani, McCluskey: Fault-Tolerant Systems in A Space Environment: The CRC ARGOS Project, Schroeder, Pinheiro, Weber: DRAM Errors in the Wild: A Large-Scale Field Study, SIGMETRICS Hwang, Stefanovici, Schroeder: Cosmic Rays Don t Strike Twice: Understanding the Nature of DRAM Errors and the Implications for System Design, ASPLOS Pinheiro, Weber, Barroso: Failure Trends in a Large Disk Drive Population, FAST Shivakumar, Kistler, Keckler: Modeling the Effect of Technology Trends on the Soft Error Rate of Combinational Logic, DSN 2002 Operating System Support for Redundant Multithreading slide 1 of 19

3 Fault Tolerance: State of the Union Software errors Hardware errors non- COTS COTS Operating System Support for Redundant Multithreading slide 2 of 19

4 Fault Tolerance: State of the Union Software errors Hardware errors RAD-hard CPUs Redundant Multithr. non- COTS COTS Operating System Support for Redundant Multithreading slide 2 of 19

5 Fault Tolerance: State of the Union Software errors IBM z/os HP NonStop Hardware errors RAD-hard CPUs Redundant Multithr. non- COTS COTS Operating System Support for Redundant Multithreading slide 2 of 19

6 Fault Tolerance: State of the Union Software errors SeL4 Minix3 IBM z/os HP NonStop Carburizer Hardware errors RAD-hard CPUs Redundant Multithr. non- COTS COTS Operating System Support for Redundant Multithreading slide 2 of 19

7 Fault Tolerance: State of the Union Software errors SeL4 Minix3 IBM z/os HP NonStop Carburizer SWIFT Hardware errors RAD-hard CPUs Redundant Multithr. Encoded Processing non- COTS COTS Operating System Support for Redundant Multithreading slide 2 of 19

8 Fault Tolerance: State of the Union Software errors SeL4 Minix3 IBM z/os HP NonStop Carburizer SWIFT Hardware errors RAD-hard CPUs Redundant Multithr. Encoded Romain Processing non- COTS COTS Operating System Support for Redundant Multithreading slide 2 of 19

9 Transparent Replication as OS Service Application L4 Runtime Environment L4/Fiasco.OC microkernel Operating System Support for Redundant Multithreading slide 3 of 19

10 Transparent Replication as OS Service Replicated Application L4 Runtime Environment Romain L4/Fiasco.OC microkernel Operating System Support for Redundant Multithreading slide 3 of 19

11 Transparent Replication as OS Service Unreplicated Application Replicated Application L4 Runtime Environment Romain L4/Fiasco.OC microkernel Operating System Support for Redundant Multithreading slide 3 of 19

12 Transparent Replication as OS Service Replicated Driver Unreplicated Application Replicated Application L4 Runtime Environment Romain L4/Fiasco.OC microkernel Operating System Support for Redundant Multithreading slide 3 of 19

13 Transparent Replication as OS Service Replicated Driver Unreplicated Application Replicated Application L4 Runtime Environment Romain L4/Fiasco.OC microkernel Reliable Computing Base Operating System Support for Redundant Multithreading slide 3 of 19

14 Process-Level Redundancy 6 Binary recompilation Complex, unprotected compiler Architecture-dependent System calls for replica synchronization Virtual memory fault isolation Restricted to Linux user-level programs 6 Shye, Blomsted, Moseley, Reddi, Connors: PLR: A software approach to transient fault tolerance for multicore architectures, DSN 2009 Operating System Support for Redundant Multithreading slide 4 of 19

15 Process-Level Redundancy 6 Binary recompilation Complex, unprotected compiler Architecture-dependent Reuse OS mechanisms System calls for replica synchronization Additional synchronization events Virtual memory fault isolation Restricted to Linux user-level programs Microkernel-based 6 Shye, Blomsted, Moseley, Reddi, Connors: PLR: A software approach to transient fault tolerance for multicore architectures, DSN 2009 Operating System Support for Redundant Multithreading slide 4 of 19

16 Why A Microkernel? Small components Microrebootable 7 Custom-tailor reliability to application needs 8 7 Herder: Building a Dependable Operating System Fault Tolerance in MINIX3, PhD Thesis, Sridharan, Kaeli: Eliminating microarchitectural dependency from architectural vulnerability, HPCA 2009 Operating System Support for Redundant Multithreading slide 5 of 19

17 Why A Microkernel? Small components Microrebootable 7 Custom-tailor reliability to application needs 8 That s what we do in Dresden (tm). Reuse Fiasco.OC mechanisms instead of adding new code to the RCB 7 Herder: Building a Dependable Operating System Fault Tolerance in MINIX3, PhD Thesis, Sridharan, Kaeli: Eliminating microarchitectural dependency from architectural vulnerability, HPCA 2009 Operating System Support for Redundant Multithreading slide 5 of 19

18 Why A Microkernel? Small components Microrebootable 7 Custom-tailor reliability to application needs 8 That s what we do in Dresden (tm). Reuse Fiasco.OC mechanisms instead of adding new code to the RCB Lean system call interface Need to add special handling to fewer syscalls 7 Herder: Building a Dependable Operating System Fault Tolerance in MINIX3, PhD Thesis, Sridharan, Kaeli: Eliminating microarchitectural dependency from architectural vulnerability, HPCA 2009 Operating System Support for Redundant Multithreading slide 5 of 19

19 Romain: Structure Master Operating System Support for Redundant Multithreading slide 6 of 19

20 Romain: Structure Replica Replica Replica Master Operating System Support for Redundant Multithreading slide 6 of 19

21 Romain: Structure Replica Replica Replica = Master Operating System Support for Redundant Multithreading slide 6 of 19

22 Romain: Structure Replica Replica Replica Resource Manager = System Call Proxy Master Operating System Support for Redundant Multithreading slide 6 of 19

23 Resource Management: Capabilities Replica Operating System Support for Redundant Multithreading slide 7 of 19

24 Resource Management: Capabilities Replica Replica Operating System Support for Redundant Multithreading slide 7 of 19

25 Resource Management: Capabilities Replica Replica Master Operating System Support for Redundant Multithreading slide 7 of 19

26 Partitioned Capability Tables Replica Replica Marked used Master Master private Operating System Support for Redundant Multithreading slide 8 of 19

27 Overhead vs. Unreplicated Execution 9 9 Döbel, Härtig, Engel: Operating System Support for Redundant Multithreading, EMSOFT 2012 Operating System Support for Redundant Multithreading slide 9 of 19

28 Romain Lines of Code Base code (main, logging, locking) 325 Application loader 375 Replica manager 628 Redundancy 153 Memory manager 445 System call proxy 311 Shared memory 281 Total 2,518 Fault injector 668 GDB server stub 1,304 Operating System Support for Redundant Multithreading slide 10 of 19

29 Hardening the RCB We need: Dedicated mechanisms to protect the RCB (HW or SW) We have: Full control over software Use FT-encoding compiler? Has not been done for kernel code yet Only protects SW components RAD-hardened hardware? Too expensive Operating System Support for Redundant Multithreading slide 11 of 19

30 Hardening the RCB We need: Dedicated mechanisms to protect the RCB (HW or SW) We have: Full control over software Use FT-encoding compiler? Has not been done for kernel code yet Only protects SW components RAD-hardened hardware? Too expensive Our proposal: Split HW into ResCores and NonRes-Cores NonRes Core NonRes Core NonRes Core NonRes Core NonRes Core ResCore NonRes Core NonRes Core NonRes Core NonRes Core NonRes Core Operating System Support for Redundant Multithreading slide 11 of 19

31 Signaling Performance Exploring master replica communication 10 12x Intel Core2 2.6 GHz Replicas pinned to dedicated physical cores Overhead in % Hyperthreading off Overhead by notification method Local Faults Migration Sync IPC Shared Mem susan CRC32 DMR susan CRC32 TMR 10 Döbel, Härtig: Who watches the watchmen? Protecting Operating System Reliability Mechanisms, HotDep 2012 Operating System Support for Redundant Multithreading slide 12 of 19

32 How About Multithreading? A 1 A 1 A 2 A 2 A 3 A 3 A 4 A 4 Operating System Support for Redundant Multithreading slide 13 of 19

33 How About Multithreading? A 1 A 1 B 1 A 2 B 1 A 2 A 3 B 2 A 3 B 2 B 3 A 4 B 3 A 4 Operating System Support for Redundant Multithreading slide 13 of 19

34 How About Multithreading? A 1 C 1 A 1 B 1 C 1 A 2 B 1 C 2 A 2 C 2 A 3 B 2 C 3 A 3 B 2 B 3 C 3 A 4 B 3 C 4 A 4 C 4 Operating System Support for Redundant Multithreading slide 13 of 19

35 Problem: Nondeterminism A 1 C 1 A 1 B 1 C 1 A 2 B 1 C 2 A 2 C 2 A 3 B 2 C 3 A 3 B 2 B 3 C 3 A 4 B 3 C 3 A 4 C 4 Operating System Support for Redundant Multithreading slide 14 of 19

36 Problem: Nondeterminism A 1 C 1 A 1 B 1 C 1 A 2 B 1 C 2 A 2 B 2 A 3 B 2 C 3 A 3 B 3 C 2 C 3 A 4 B 3 C 3 A 4 B 4 Operating System Support for Redundant Multithreading slide 14 of 19

37 Deterministic Multithreading Related work: Debugging multithreaded programs Slightly different requirement: determinism across runs 11 Liu, Curtsinger, Berger: DThreads: Efficient Deterministic Multithreading, OSDI Olszewski, Ansel, Amarasinghe: Kendo: Efficient Deterministic Multithreading in Software, ASPLOS 2009 Operating System Support for Redundant Multithreading slide 15 of 19

38 Deterministic Multithreading Related work: Debugging multithreaded programs Slightly different requirement: determinism across runs Strong Determinism: All accesses to shared resources happen in the same order 11. Requires heavy involvement with shared memory accesses Replicating SHM is slow 11 Liu, Curtsinger, Berger: DThreads: Efficient Deterministic Multithreading, OSDI Olszewski, Ansel, Amarasinghe: Kendo: Efficient Deterministic Multithreading in Software, ASPLOS 2009 Operating System Support for Redundant Multithreading slide 15 of 19

39 Deterministic Multithreading Related work: Debugging multithreaded programs Slightly different requirement: determinism across runs Strong Determinism: All accesses to shared resources happen in the same order 11. Requires heavy involvement with shared memory accesses Replicating SHM is slow Weak Determinism: All lock acquisitions in a program happen in the same order 12 Intercept calls to pthread mutex {lock,unlock} 11 Liu, Curtsinger, Berger: DThreads: Efficient Deterministic Multithreading, OSDI Olszewski, Ansel, Amarasinghe: Kendo: Efficient Deterministic Multithreading in Software, ASPLOS 2009 Operating System Support for Redundant Multithreading slide 15 of 19

40 Externally Enforced Determinism Patch entries to pthread mutex {lock,unlock} Catch exception for every call Enforce ordering in side the master Microbenchmark: 2 threads, global counter, 1 lock Replication kind Execution time Overhead Native execution 0.24 s 1.00 x Unreplicated RomainMT 4.50 s x RomainMT: DMR s x RomainMT: TMR s x Operating System Support for Redundant Multithreading slide 16 of 19

41 Internal Determinism Exception for every lock/unlock hurts a lot! Non-contention case: get progress without exception Operating System Support for Redundant Multithreading slide 17 of 19

42 Internal Determinism Exception for every lock/unlock hurts a lot! Non-contention case: get progress without exception Replication-aware pthreads library Shared memory between replicas Exchange per-lock progress information Only block if lock currently used by different thread in other replica Operating System Support for Redundant Multithreading slide 17 of 19

43 Internal Determinism Exception for every lock/unlock hurts a lot! Non-contention case: get progress without exception Replication-aware pthreads library Shared memory between replicas Exchange per-lock progress information Only block if lock currently used by different thread in other replica Problems: Replacing pthreads vs. binary-only support pthreads is a shared lib anyway No support for different synchronization mechanisms New pthreads code becomes part of the RCB. Operating System Support for Redundant Multithreading slide 17 of 19

44 Internal Determinism Microbenchmark: 2 threads, global counter, 1 lock Replication kind Execution time Overhead External det. Native execution 0.24 s 1.00 x Unreplicated RomainMT 0.27 s 1.13 x x RomainMT: DMR 0.46 s 1.92 x x RomainMT: TMR 1.43 s 6.01 x x Operating System Support for Redundant Multithreading slide 18 of 19

45 Conclusion Redundant Multithreading as an OS service Support for binary-only applications Benefit from microkernel by reuse and design Overheads <30%, often <5% Multithreading external vs. internal determinism Work in progress (not in this talk): Shared memory handling is slow Bounded detection latency using a watchdog Dynamic adjustment of replication level and resource usage Operating System Support for Redundant Multithreading slide 19 of 19

46 Nothing to see here This slide intentionally left blank. Except for above text. Operating System Support for Redundant Multithreading slide 20 of 19

47 What about signalling failures? Missed CPU exceptions detected by watchdog Spurious CPU exceptions detected by watchdog / state comparison Transmission of corrupt state detected during state comparison Overwriting remote state during transmission NonResCore memory Accessible by ResCores, but not by other NonResCores Prevents overwriting other states Already available in HW: IBM/Cell Operating System Support for Redundant Multithreading slide 21 of 19

48 Romain Operating System Support for Redundant Multithreading slide 22 of 19

HAFT Hardware-Assisted Fault Tolerance

HAFT Hardware-Assisted Fault Tolerance Dmitrii Kuvaiskii Rasha Faqeh Pramod Bhatotia Christof Fetzer Technische Universität Dresden Pascal Felber Université de Neuchâtel Hardware Errors in the Wild Online