OPERATING SYSTEM SUPPORT FOR REDUNDANT MULTITHREADING. Björn Döbel (TU Dresden)

Size: px
Start display at page:

Download "OPERATING SYSTEM SUPPORT FOR REDUNDANT MULTITHREADING. Björn Döbel (TU Dresden)"

Transcription

1 OPERATING SYSTEM SUPPORT FOR REDUNDANT MULTITHREADING Björn Döbel (TU Dresden) Brussels,

2 Hardware Faults Radiation-induced soft errors Mainly an issue in avionics+space 1 DRAM errors in large data centers Google Study: > 2% failing DRAM DIMMs per year 2 ECC is not going to even detect a significant amount 3 Disk failure rate about 5% 4 Furthermore: decreasing transistor sizes, higher rate of transient errors in CPU functional units 5 1 Shirvani, McCluskey: Fault-Tolerant Systems in A Space Environment: The CRC ARGOS Project, Schroeder, Pinheiro, Weber: DRAM Errors in the Wild: A Large-Scale Field Study, SIGMETRICS Hwang, Stefanovici, Schroeder: Cosmic Rays Don t Strike Twice: Understanding the Nature of DRAM Errors and the Implications for System Design, ASPLOS Pinheiro, Weber, Barroso: Failure Trends in a Large Disk Drive Population, FAST Shivakumar, Kistler, Keckler: Modeling the Effect of Technology Trends on the Soft Error Rate of Combinational Logic, DSN 2002 Operating System Support for Redundant Multithreading slide 1 of 19

3 Fault Tolerance: State of the Union Software errors Hardware errors non- COTS COTS Operating System Support for Redundant Multithreading slide 2 of 19

4 Fault Tolerance: State of the Union Software errors Hardware errors RAD-hard CPUs Redundant Multithr. non- COTS COTS Operating System Support for Redundant Multithreading slide 2 of 19

5 Fault Tolerance: State of the Union Software errors IBM z/os HP NonStop Hardware errors RAD-hard CPUs Redundant Multithr. non- COTS COTS Operating System Support for Redundant Multithreading slide 2 of 19

6 Fault Tolerance: State of the Union Software errors SeL4 Minix3 IBM z/os HP NonStop Carburizer Hardware errors RAD-hard CPUs Redundant Multithr. non- COTS COTS Operating System Support for Redundant Multithreading slide 2 of 19

7 Fault Tolerance: State of the Union Software errors SeL4 Minix3 IBM z/os HP NonStop Carburizer SWIFT Hardware errors RAD-hard CPUs Redundant Multithr. Encoded Processing non- COTS COTS Operating System Support for Redundant Multithreading slide 2 of 19

8 Fault Tolerance: State of the Union Software errors SeL4 Minix3 IBM z/os HP NonStop Carburizer SWIFT Hardware errors RAD-hard CPUs Redundant Multithr. Encoded Romain Processing non- COTS COTS Operating System Support for Redundant Multithreading slide 2 of 19

9 Transparent Replication as OS Service Application L4 Runtime Environment L4/Fiasco.OC microkernel Operating System Support for Redundant Multithreading slide 3 of 19

10 Transparent Replication as OS Service Replicated Application L4 Runtime Environment Romain L4/Fiasco.OC microkernel Operating System Support for Redundant Multithreading slide 3 of 19

11 Transparent Replication as OS Service Unreplicated Application Replicated Application L4 Runtime Environment Romain L4/Fiasco.OC microkernel Operating System Support for Redundant Multithreading slide 3 of 19

12 Transparent Replication as OS Service Replicated Driver Unreplicated Application Replicated Application L4 Runtime Environment Romain L4/Fiasco.OC microkernel Operating System Support for Redundant Multithreading slide 3 of 19

13 Transparent Replication as OS Service Replicated Driver Unreplicated Application Replicated Application L4 Runtime Environment Romain L4/Fiasco.OC microkernel Reliable Computing Base Operating System Support for Redundant Multithreading slide 3 of 19

14 Process-Level Redundancy 6 Binary recompilation Complex, unprotected compiler Architecture-dependent System calls for replica synchronization Virtual memory fault isolation Restricted to Linux user-level programs 6 Shye, Blomsted, Moseley, Reddi, Connors: PLR: A software approach to transient fault tolerance for multicore architectures, DSN 2009 Operating System Support for Redundant Multithreading slide 4 of 19

15 Process-Level Redundancy 6 Binary recompilation Complex, unprotected compiler Architecture-dependent Reuse OS mechanisms System calls for replica synchronization Additional synchronization events Virtual memory fault isolation Restricted to Linux user-level programs Microkernel-based 6 Shye, Blomsted, Moseley, Reddi, Connors: PLR: A software approach to transient fault tolerance for multicore architectures, DSN 2009 Operating System Support for Redundant Multithreading slide 4 of 19

16 Why A Microkernel? Small components Microrebootable 7 Custom-tailor reliability to application needs 8 7 Herder: Building a Dependable Operating System Fault Tolerance in MINIX3, PhD Thesis, Sridharan, Kaeli: Eliminating microarchitectural dependency from architectural vulnerability, HPCA 2009 Operating System Support for Redundant Multithreading slide 5 of 19

17 Why A Microkernel? Small components Microrebootable 7 Custom-tailor reliability to application needs 8 That s what we do in Dresden (tm). Reuse Fiasco.OC mechanisms instead of adding new code to the RCB 7 Herder: Building a Dependable Operating System Fault Tolerance in MINIX3, PhD Thesis, Sridharan, Kaeli: Eliminating microarchitectural dependency from architectural vulnerability, HPCA 2009 Operating System Support for Redundant Multithreading slide 5 of 19

18 Why A Microkernel? Small components Microrebootable 7 Custom-tailor reliability to application needs 8 That s what we do in Dresden (tm). Reuse Fiasco.OC mechanisms instead of adding new code to the RCB Lean system call interface Need to add special handling to fewer syscalls 7 Herder: Building a Dependable Operating System Fault Tolerance in MINIX3, PhD Thesis, Sridharan, Kaeli: Eliminating microarchitectural dependency from architectural vulnerability, HPCA 2009 Operating System Support for Redundant Multithreading slide 5 of 19

19 Romain: Structure Master Operating System Support for Redundant Multithreading slide 6 of 19

20 Romain: Structure Replica Replica Replica Master Operating System Support for Redundant Multithreading slide 6 of 19

21 Romain: Structure Replica Replica Replica = Master Operating System Support for Redundant Multithreading slide 6 of 19

22 Romain: Structure Replica Replica Replica Resource Manager = System Call Proxy Master Operating System Support for Redundant Multithreading slide 6 of 19

23 Resource Management: Capabilities Replica Operating System Support for Redundant Multithreading slide 7 of 19

24 Resource Management: Capabilities Replica Replica Operating System Support for Redundant Multithreading slide 7 of 19

25 Resource Management: Capabilities Replica Replica Master Operating System Support for Redundant Multithreading slide 7 of 19

26 Partitioned Capability Tables Replica Replica Marked used Master Master private Operating System Support for Redundant Multithreading slide 8 of 19

27 Overhead vs. Unreplicated Execution 9 9 Döbel, Härtig, Engel: Operating System Support for Redundant Multithreading, EMSOFT 2012 Operating System Support for Redundant Multithreading slide 9 of 19

28 Romain Lines of Code Base code (main, logging, locking) 325 Application loader 375 Replica manager 628 Redundancy 153 Memory manager 445 System call proxy 311 Shared memory 281 Total 2,518 Fault injector 668 GDB server stub 1,304 Operating System Support for Redundant Multithreading slide 10 of 19

29 Hardening the RCB We need: Dedicated mechanisms to protect the RCB (HW or SW) We have: Full control over software Use FT-encoding compiler? Has not been done for kernel code yet Only protects SW components RAD-hardened hardware? Too expensive Operating System Support for Redundant Multithreading slide 11 of 19

30 Hardening the RCB We need: Dedicated mechanisms to protect the RCB (HW or SW) We have: Full control over software Use FT-encoding compiler? Has not been done for kernel code yet Only protects SW components RAD-hardened hardware? Too expensive Our proposal: Split HW into ResCores and NonRes-Cores NonRes Core NonRes Core NonRes Core NonRes Core NonRes Core ResCore NonRes Core NonRes Core NonRes Core NonRes Core NonRes Core Operating System Support for Redundant Multithreading slide 11 of 19

31 Signaling Performance Exploring master replica communication 10 12x Intel Core2 2.6 GHz Replicas pinned to dedicated physical cores Overhead in % Hyperthreading off Overhead by notification method Local Faults Migration Sync IPC Shared Mem susan CRC32 DMR susan CRC32 TMR 10 Döbel, Härtig: Who watches the watchmen? Protecting Operating System Reliability Mechanisms, HotDep 2012 Operating System Support for Redundant Multithreading slide 12 of 19

32 How About Multithreading? A 1 A 1 A 2 A 2 A 3 A 3 A 4 A 4 Operating System Support for Redundant Multithreading slide 13 of 19

33 How About Multithreading? A 1 A 1 B 1 A 2 B 1 A 2 A 3 B 2 A 3 B 2 B 3 A 4 B 3 A 4 Operating System Support for Redundant Multithreading slide 13 of 19

34 How About Multithreading? A 1 C 1 A 1 B 1 C 1 A 2 B 1 C 2 A 2 C 2 A 3 B 2 C 3 A 3 B 2 B 3 C 3 A 4 B 3 C 4 A 4 C 4 Operating System Support for Redundant Multithreading slide 13 of 19

35 Problem: Nondeterminism A 1 C 1 A 1 B 1 C 1 A 2 B 1 C 2 A 2 C 2 A 3 B 2 C 3 A 3 B 2 B 3 C 3 A 4 B 3 C 3 A 4 C 4 Operating System Support for Redundant Multithreading slide 14 of 19

36 Problem: Nondeterminism A 1 C 1 A 1 B 1 C 1 A 2 B 1 C 2 A 2 B 2 A 3 B 2 C 3 A 3 B 3 C 2 C 3 A 4 B 3 C 3 A 4 B 4 Operating System Support for Redundant Multithreading slide 14 of 19

37 Deterministic Multithreading Related work: Debugging multithreaded programs Slightly different requirement: determinism across runs 11 Liu, Curtsinger, Berger: DThreads: Efficient Deterministic Multithreading, OSDI Olszewski, Ansel, Amarasinghe: Kendo: Efficient Deterministic Multithreading in Software, ASPLOS 2009 Operating System Support for Redundant Multithreading slide 15 of 19

38 Deterministic Multithreading Related work: Debugging multithreaded programs Slightly different requirement: determinism across runs Strong Determinism: All accesses to shared resources happen in the same order 11. Requires heavy involvement with shared memory accesses Replicating SHM is slow 11 Liu, Curtsinger, Berger: DThreads: Efficient Deterministic Multithreading, OSDI Olszewski, Ansel, Amarasinghe: Kendo: Efficient Deterministic Multithreading in Software, ASPLOS 2009 Operating System Support for Redundant Multithreading slide 15 of 19

39 Deterministic Multithreading Related work: Debugging multithreaded programs Slightly different requirement: determinism across runs Strong Determinism: All accesses to shared resources happen in the same order 11. Requires heavy involvement with shared memory accesses Replicating SHM is slow Weak Determinism: All lock acquisitions in a program happen in the same order 12 Intercept calls to pthread mutex {lock,unlock} 11 Liu, Curtsinger, Berger: DThreads: Efficient Deterministic Multithreading, OSDI Olszewski, Ansel, Amarasinghe: Kendo: Efficient Deterministic Multithreading in Software, ASPLOS 2009 Operating System Support for Redundant Multithreading slide 15 of 19

40 Externally Enforced Determinism Patch entries to pthread mutex {lock,unlock} Catch exception for every call Enforce ordering in side the master Microbenchmark: 2 threads, global counter, 1 lock Replication kind Execution time Overhead Native execution 0.24 s 1.00 x Unreplicated RomainMT 4.50 s x RomainMT: DMR s x RomainMT: TMR s x Operating System Support for Redundant Multithreading slide 16 of 19

41 Internal Determinism Exception for every lock/unlock hurts a lot! Non-contention case: get progress without exception Operating System Support for Redundant Multithreading slide 17 of 19

42 Internal Determinism Exception for every lock/unlock hurts a lot! Non-contention case: get progress without exception Replication-aware pthreads library Shared memory between replicas Exchange per-lock progress information Only block if lock currently used by different thread in other replica Operating System Support for Redundant Multithreading slide 17 of 19

43 Internal Determinism Exception for every lock/unlock hurts a lot! Non-contention case: get progress without exception Replication-aware pthreads library Shared memory between replicas Exchange per-lock progress information Only block if lock currently used by different thread in other replica Problems: Replacing pthreads vs. binary-only support pthreads is a shared lib anyway No support for different synchronization mechanisms New pthreads code becomes part of the RCB. Operating System Support for Redundant Multithreading slide 17 of 19

44 Internal Determinism Microbenchmark: 2 threads, global counter, 1 lock Replication kind Execution time Overhead External det. Native execution 0.24 s 1.00 x Unreplicated RomainMT 0.27 s 1.13 x x RomainMT: DMR 0.46 s 1.92 x x RomainMT: TMR 1.43 s 6.01 x x Operating System Support for Redundant Multithreading slide 18 of 19

45 Conclusion Redundant Multithreading as an OS service Support for binary-only applications Benefit from microkernel by reuse and design Overheads <30%, often <5% Multithreading external vs. internal determinism Work in progress (not in this talk): Shared memory handling is slow Bounded detection latency using a watchdog Dynamic adjustment of replication level and resource usage Operating System Support for Redundant Multithreading slide 19 of 19

46 Nothing to see here This slide intentionally left blank. Except for above text. Operating System Support for Redundant Multithreading slide 20 of 19

47 What about signalling failures? Missed CPU exceptions detected by watchdog Spurious CPU exceptions detected by watchdog / state comparison Transmission of corrupt state detected during state comparison Overwriting remote state during transmission NonResCore memory Accessible by ResCores, but not by other NonResCores Prevents overwriting other states Already available in HW: IBM/Cell Operating System Support for Redundant Multithreading slide 21 of 19

48 Romain Operating System Support for Redundant Multithreading slide 22 of 19

HAFT Hardware-Assisted Fault Tolerance

HAFT Hardware-Assisted Fault Tolerance HAFT Hardware-Assisted Fault Tolerance Dmitrii Kuvaiskii Rasha Faqeh Pramod Bhatotia Christof Fetzer Technische Universität Dresden Pascal Felber Université de Neuchâtel Hardware Errors in the Wild Online

More information

Where Have all the Cycles Gone? Investigating Runtime Overheads of OS-Assisted Replication

Where Have all the Cycles Gone? Investigating Runtime Overheads of OS-Assisted Replication Where Have all the Cycles Gone? Investigating Runtime Overheads of OS-Assisted Replication Björn Döbel, Hermann Härtig Operating Systems Group TU Dresden Nöthnitzer Str. 46 01187 Dresden {doebel,haertig}@tudos.org

More information

Operating Systems Meet Fault Tolerance

Operating Systems Meet Fault Tolerance Operating Systems Meet Fault Tolerance Microkernel-Based Operating Systems Maksym Planeta Björn Döbel 30.01.2018 1/53 If there s more than one possible outcome of a job or task, and one of those outcome

More information

Using Process-Level Redundancy to Exploit Multiple Cores for Transient Fault Tolerance

Using Process-Level Redundancy to Exploit Multiple Cores for Transient Fault Tolerance Using Process-Level Redundancy to Exploit Multiple Cores for Transient Fault Tolerance Outline Introduction and Motivation Software-centric Fault Detection Process-Level Redundancy Experimental Results

More information

Commercial-Off-the-shelf Hardware Transactional Memory for Tolerating Transient Hardware Errors

Commercial-Off-the-shelf Hardware Transactional Memory for Tolerating Transient Hardware Errors Commercial-Off-the-shelf Hardware Transactional Memory for Tolerating Transient Hardware Errors Rasha Faqeh TU- Dresden 19.01.2015 Dresden, 23.09.2011 Transient Error Recovery Motivation Folie Nr. 12 von

More information

DESIGN AND ANALYSIS OF TRANSIENT FAULT TOLERANCE FOR MULTI CORE ARCHITECTURE

DESIGN AND ANALYSIS OF TRANSIENT FAULT TOLERANCE FOR MULTI CORE ARCHITECTURE DESIGN AND ANALYSIS OF TRANSIENT FAULT TOLERANCE FOR MULTI CORE ARCHITECTURE DivyaRani 1 1pg scholar, ECE Department, SNS college of technology, Tamil Nadu, India -----------------------------------------------------------------------------------------------------------------------------------------------

More information

Eliminating Single Points of Failure in Software Based Redundancy

Eliminating Single Points of Failure in Software Based Redundancy Eliminating Single Points of Failure in Software Based Redundancy Peter Ulbrich, Martin Hoffmann, Rüdiger Kapitza, Daniel Lohmann, Reiner Schmid and Wolfgang Schröder-Preikschat EDCC May 9, 2012 SYSTEM

More information

Replicating Device Drivers on L4Re

Replicating Device Drivers on L4Re Großer Beleg Replicating Device Drivers on L4Re Maurice Bailleu 14. September 2015 Technische Universität Dresden Fakultät Informatik Institut für Systemarchitektur Professur Betriebssysteme Betreuender

More information

CS 470 Spring Fault Tolerance. Mike Lam, Professor. Content taken from the following:

CS 470 Spring Fault Tolerance. Mike Lam, Professor. Content taken from the following: CS 47 Spring 27 Mike Lam, Professor Fault Tolerance Content taken from the following: "Distributed Systems: Principles and Paradigms" by Andrew S. Tanenbaum and Maarten Van Steen (Chapter 8) Various online

More information

FAULT TOLERANT SYSTEMS

FAULT TOLERANT SYSTEMS FAULT TOLERANT SYSTEMS http://www.ecs.umass.edu/ece/koren/faulttolerantsystems Part 18 Chapter 7 Case Studies Part.18.1 Introduction Illustrate practical use of methods described previously Highlight fault-tolerance

More information

Deterministic Process Groups in

Deterministic Process Groups in Deterministic Process Groups in Tom Bergan Nicholas Hunt, Luis Ceze, Steven D. Gribble University of Washington A Nondeterministic Program global x=0 Thread 1 Thread 2 t := x x := t + 1 t := x x := t +

More information

Typed Assembly Language for Implementing OS Kernels in SMP/Multi-Core Environments with Interrupts

Typed Assembly Language for Implementing OS Kernels in SMP/Multi-Core Environments with Interrupts Typed Assembly Language for Implementing OS Kernels in SMP/Multi-Core Environments with Interrupts Toshiyuki Maeda and Akinori Yonezawa University of Tokyo Quiz [Environment] CPU: Intel Xeon X5570 (2.93GHz)

More information

AUTOBEST: A United AUTOSAR-OS And ARINC 653 Kernel. Alexander Züpke, Marc Bommert, Daniel Lohmann

AUTOBEST: A United AUTOSAR-OS And ARINC 653 Kernel. Alexander Züpke, Marc Bommert, Daniel Lohmann AUTOBEST: A United AUTOSAR-OS And ARINC 653 Kernel Alexander Züpke, Marc Bommert, Daniel Lohmann alexander.zuepke@hs-rm.de, marc.bommert@hs-rm.de, lohmann@cs.fau.de Motivation Automotive and Avionic industry

More information

CS 31: Intro to Systems Threading & Parallel Applications. Kevin Webb Swarthmore College November 27, 2018

CS 31: Intro to Systems Threading & Parallel Applications. Kevin Webb Swarthmore College November 27, 2018 CS 31: Intro to Systems Threading & Parallel Applications Kevin Webb Swarthmore College November 27, 2018 Reading Quiz Making Programs Run Faster We all like how fast computers are In the old days (1980

More information

Major Project, CSD411

Major Project, CSD411 Major Project, CSD411 Deterministic Execution In Multithreaded Applications Supervisor Prof. Sorav Bansal Ashwin Kumar 2010CS10211 Himanshu Gupta 2010CS10220 1 Deterministic Execution In Multithreaded

More information

Graphene-SGX. A Practical Library OS for Unmodified Applications on SGX. Chia-Che Tsai Donald E. Porter Mona Vij

Graphene-SGX. A Practical Library OS for Unmodified Applications on SGX. Chia-Che Tsai Donald E. Porter Mona Vij Graphene-SGX A Practical Library OS for Unmodified Applications on SGX Chia-Che Tsai Donald E. Porter Mona Vij Intel SGX: Trusted Execution on Untrusted Hosts Processing Sensitive Data (Ex: Medical Records)

More information

Lecture 5: Scheduling and Reliability. Topics: scheduling policies, handling DRAM errors

Lecture 5: Scheduling and Reliability. Topics: scheduling policies, handling DRAM errors Lecture 5: Scheduling and Reliability Topics: scheduling policies, handling DRAM errors 1 PAR-BS Mutlu and Moscibroda, ISCA 08 A batch of requests (per bank) is formed: each thread can only contribute

More information

SIListra. Coded Processing in Medical Devices. Dr. Martin Süßkraut (TU-Dresden / SIListra Systems)

SIListra. Coded Processing in Medical Devices. Dr. Martin Süßkraut (TU-Dresden / SIListra Systems) SIListra making systems safer Coded Processing in Medical Devices Dr. Martin Süßkraut (TU-Dresden / SIListra Systems) martin.suesskraut@se.inf.tu-dresden.de Embedded goes Medical 5./6. Oct. 2011 1 SIListra

More information

Björn Döbel. Microkernel-Based Operating Systems. Exercise 3: Virtualization

Björn Döbel. Microkernel-Based Operating Systems. Exercise 3: Virtualization Faculty of Computer Science Institute for System Architecture, Operating Systems Group Björn Döbel Microkernel-Based Operating Systems Exercise 3: Virtualization Emulation Virtualization Emulation / Simulation

More information

Transient Fault Detection and Reducing Transient Error Rate. Jose Lugo-Martinez CSE 240C: Advanced Microarchitecture Prof.

Transient Fault Detection and Reducing Transient Error Rate. Jose Lugo-Martinez CSE 240C: Advanced Microarchitecture Prof. Transient Fault Detection and Reducing Transient Error Rate Jose Lugo-Martinez CSE 240C: Advanced Microarchitecture Prof. Steven Swanson Outline Motivation What are transient faults? Hardware Fault Detection

More information

ARCHITECTURE DESIGN FOR SOFT ERRORS

ARCHITECTURE DESIGN FOR SOFT ERRORS ARCHITECTURE DESIGN FOR SOFT ERRORS Shubu Mukherjee ^ШВпШшр"* AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO T^"ТГПШГ SAN FRANCISCO SINGAPORE SYDNEY TOKYO ^ P f ^ ^ ELSEVIER Morgan

More information

Deterministic Scaling

Deterministic Scaling Deterministic Scaling Gabriel Southern Madan Das Jose Renau Dept. of Computer Engineering, University of California Santa Cruz {gsouther, madandas, renau}@soe.ucsc.edu http://masc.soe.ucsc.edu ABSTRACT

More information

Deflating the hype: Embedded Virtualization in 3 steps

Deflating the hype: Embedded Virtualization in 3 steps Deflating the hype: Embedded Virtualization in 3 steps Klaas van Gend MontaVista Software LLC For Embedded Linux Conference Europe 2010, Cambridge Agenda Why multicore made the topic more relevant Partitioning

More information

SCONE: Secure Linux Container Environments with Intel SGX

SCONE: Secure Linux Container Environments with Intel SGX SCONE: Secure Linux Container Environments with Intel SGX S. Arnautov, B. Trach, F. Gregor, Thomas Knauth, and A. Martin, Technische Universität Dresden; C. Priebe, J. Lind, D. Muthukumaran, D. O'Keeffe,

More information

ECE 588/688 Advanced Computer Architecture II

ECE 588/688 Advanced Computer Architecture II ECE 588/688 Advanced Computer Architecture II Instructor: Alaa Alameldeen alaa@ece.pdx.edu Fall 2009 Portland State University Copyright by Alaa Alameldeen and Haitham Akkary 2009 1 When and Where? When:

More information

Providing Real-Time and Fault Tolerance for CORBA Applications

Providing Real-Time and Fault Tolerance for CORBA Applications Providing Real-Time and Tolerance for CORBA Applications Priya Narasimhan Assistant Professor of ECE and CS University Pittsburgh, PA 15213-3890 Sponsored in part by the CMU-NASA High Dependability Computing

More information

Distributed Systems Principles and Paradigms

Distributed Systems Principles and Paradigms Distributed Systems Principles and Paradigms Chapter 03 (version February 11, 2008) Maarten van Steen Vrije Universiteit Amsterdam, Faculty of Science Dept. Mathematics and Computer Science Room R4.20.

More information

Chapter 4: Multithreaded

Chapter 4: Multithreaded Chapter 4: Multithreaded Programming Chapter 4: Multithreaded Programming Overview Multithreading Models Thread Libraries Threading Issues Operating-System Examples 2009/10/19 2 4.1 Overview A thread is

More information

Multiprocessor OS. COMP9242 Advanced Operating Systems S2/2011 Week 7: Multiprocessors Part 2. Scalability of Multiprocessor OS.

Multiprocessor OS. COMP9242 Advanced Operating Systems S2/2011 Week 7: Multiprocessors Part 2. Scalability of Multiprocessor OS. Multiprocessor OS COMP9242 Advanced Operating Systems S2/2011 Week 7: Multiprocessors Part 2 2 Key design challenges: Correctness of (shared) data structures Scalability Scalability of Multiprocessor OS

More information

Software-based Fault Tolerance Mission (Im)possible?

Software-based Fault Tolerance Mission (Im)possible? Software-based Fault Tolerance Mission Im)possible? Peter Ulbrich The 29th CREST Open Workshop on Software Redundancy November 18, 2013 System Software Group http://www4.cs.fau.de Embedded Systems Initiative

More information

Priya Narasimhan. Assistant Professor of ECE and CS Carnegie Mellon University Pittsburgh, PA

Priya Narasimhan. Assistant Professor of ECE and CS Carnegie Mellon University Pittsburgh, PA OMG Real-Time and Distributed Object Computing Workshop, July 2002, Arlington, VA Providing Real-Time and Fault Tolerance for CORBA Applications Priya Narasimhan Assistant Professor of ECE and CS Carnegie

More information

Computer Architecture: Multithreading (III) Prof. Onur Mutlu Carnegie Mellon University

Computer Architecture: Multithreading (III) Prof. Onur Mutlu Carnegie Mellon University Computer Architecture: Multithreading (III) Prof. Onur Mutlu Carnegie Mellon University A Note on This Lecture These slides are partly from 18-742 Fall 2012, Parallel Computer Architecture, Lecture 13:

More information

MICROKERNELS: MACH AND L4

MICROKERNELS: MACH AND L4 1 MICROKERNELS: MACH AND L4 CS6410 Hakim Weatherspoon Introduction to Kernels Different Types of Kernel Designs Monolithic kernel Microkernel Hybrid Kernel Exokernel Virtual Machines? Monolithic Kernels

More information

DMP Deterministic Shared Memory Multiprocessing

DMP Deterministic Shared Memory Multiprocessing DMP Deterministic Shared Memory Multiprocessing University of Washington Joe Devietti, Brandon Lucia, Luis Ceze, Mark Oskin A multithreaded voting machine 2 thread 0 thread 1 while (more_votes) { load

More information

Replacement policies for shared caches on symmetric multicores : a programmer-centric point of view

Replacement policies for shared caches on symmetric multicores : a programmer-centric point of view 1 Replacement policies for shared caches on symmetric multicores : a programmer-centric point of view Pierre Michaud INRIA HiPEAC 11, January 26, 2011 2 Outline Self-performance contract Proposition for

More information

TEMPERATURE MANAGEMENT IN DATA CENTERS: WHY SOME (MIGHT) LIKE IT HOT

TEMPERATURE MANAGEMENT IN DATA CENTERS: WHY SOME (MIGHT) LIKE IT HOT TEMPERATURE MANAGEMENT IN DATA CENTERS: WHY SOME (MIGHT) LIKE IT HOT Nosayba El-Sayed, Ioan Stefanovici, George Amvrosiadis, Andy A. Hwang, Bianca Schroeder {nosayba, ioan, gamvrosi, hwang, bianca}@cs.toronto.edu

More information

CSCE 410/611: Virtualization

CSCE 410/611: Virtualization CSCE 410/611: Virtualization Definitions, Terminology Why Virtual Machines? Mechanics of Virtualization Virtualization of Resources (Memory) Some slides made available Courtesy of Gernot Heiser, UNSW.

More information

DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S. TANENBAUM MAARTEN VAN STEEN. Chapter 3 Processes

DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S. TANENBAUM MAARTEN VAN STEEN. Chapter 3 Processes DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S. TANENBAUM MAARTEN VAN STEEN Chapter 3 Processes Context Switching Processor context: The minimal collection of values stored in the

More information

Elzar Triple Modular Redundancy using Intel AVX

Elzar Triple Modular Redundancy using Intel AVX Elzar Triple Modular Redundancy using Intel AVX Dmitrii Kuvaiskii Oleksii Oleksenko Pramod Bhatotia Christof Fetzer Pascal Felber Hardware Errors in the Wild Online services run in huge data centers 1

More information

Failures in Large Systems. Hive: Fault Containment for Shared- Memory Multiprocessors. Why failures? Component Reliability

Failures in Large Systems. Hive: Fault Containment for Shared- Memory Multiprocessors. Why failures? Component Reliability Hive: Fault Containment for Shared- Memory Multiprocessors. Chapin95: John Chapin, Mendel Rosenblum, Scott Devine, Tirthankar Lahiri, Dan Teodosiu, Anoop Gupta, ACM Symp. on Operating Systems Principles,

More information

Fakultät Informatik Institut für Systemarchitektur, Betriebssysteme THE NOVA KERNEL API. Julian Stecklina

Fakultät Informatik Institut für Systemarchitektur, Betriebssysteme THE NOVA KERNEL API. Julian Stecklina Fakultät Informatik Institut für Systemarchitektur, Betriebssysteme THE NOVA KERNEL API Julian Stecklina (jsteckli@os.inf.tu-dresden.de) Dresden, 5.2.2012 00 Disclaimer This is not about OpenStack Compute.

More information

Real-Time Component Software. slide credits: H. Kopetz, P. Puschner

Real-Time Component Software. slide credits: H. Kopetz, P. Puschner Real-Time Component Software slide credits: H. Kopetz, P. Puschner Overview OS services Task Structure Task Interaction Input/Output Error Detection 2 Operating System and Middleware Application Software

More information

Lecture 28 Multicore, Multithread" Suggested reading:" (H&P Chapter 7.4)"

Lecture 28 Multicore, Multithread Suggested reading: (H&P Chapter 7.4) Lecture 28 Multicore, Multithread" Suggested reading:" (H&P Chapter 7.4)" 1" Processor components" Multicore processors and programming" Processor comparison" CSE 30321 - Lecture 01 - vs." Goal: Explain

More information

A Userspace Packet Switch for Virtual Machines

A Userspace Packet Switch for Virtual Machines SHRINKING THE HYPERVISOR ONE SUBSYSTEM AT A TIME A Userspace Packet Switch for Virtual Machines Julian Stecklina OS Group, TU Dresden jsteckli@os.inf.tu-dresden.de VEE 2014, Salt Lake City 1 Motivation

More information

A Multi-Kernel Survey for High-Performance Computing

A Multi-Kernel Survey for High-Performance Computing A Multi-Kernel Survey for High-Performance Computing Balazs Gerofi, Yutaka Ishikawa, Rolf Riesen, Robert W. Wisniewski, Yoonho Park, Bryan Rosenburg RIKEN Advanced Institute for Computational Science,

More information

Bounding Error Detection Latencies for Replicated Execution

Bounding Error Detection Latencies for Replicated Execution Bachelor Thesis Bounding Error Detection Latencies for Replicated Execution Martin Kriegel June 27, 2013 Technische Universität Dresden Fakultät Informatik Institut für Systemarchitektur Professur Betriebssysteme

More information

6.033 Spring Lecture #6. Monolithic kernels vs. Microkernels Virtual Machines spring 2018 Katrina LaCurts

6.033 Spring Lecture #6. Monolithic kernels vs. Microkernels Virtual Machines spring 2018 Katrina LaCurts 6.033 Spring 2018 Lecture #6 Monolithic kernels vs. Microkernels Virtual Machines 1 operating systems enforce modularity on a single machine using virtualization in order to enforce modularity + build

More information

Replication of Concurrent Applications in a Shared Memory Multikernel

Replication of Concurrent Applications in a Shared Memory Multikernel Replication of Concurrent Applications in a Shared Memory Multikernel Yuzhong Wen Thesis submitted to the Faculty of the Virginia Polytechnic Institute and State University in partial fulfillment of the

More information

Designing and Evaluating a Distributed Computing Language Runtime. Christopher Meiklejohn Université catholique de Louvain, Belgium

Designing and Evaluating a Distributed Computing Language Runtime. Christopher Meiklejohn Université catholique de Louvain, Belgium Designing and Evaluating a Distributed Computing Language Runtime Christopher Meiklejohn (@cmeik) Université catholique de Louvain, Belgium R A R B R A set() R B R A set() set(2) 2 R B set(3) 3 set() set(2)

More information

TERN: Stable Deterministic Multithreading through Schedule Memoization

TERN: Stable Deterministic Multithreading through Schedule Memoization TERN: Stable Deterministic Multithreading through Schedule Memoization Heming Cui, Jingyue Wu, Chia-che Tsai, Junfeng Yang Columbia University Appeared in OSDI 10 Nondeterministic Execution One input many

More information

Process Description and Control

Process Description and Control Process Description and Control 1 Process:the concept Process = a program in execution Example processes: OS kernel OS shell Program executing after compilation www-browser Process management by OS : Allocate

More information

KESO Functional Safety and the Use of Java in Embedded Systems

KESO Functional Safety and the Use of Java in Embedded Systems KESO Functional Safety and the Use of Java in Embedded Systems Isabella S1lkerich, Bernhard Sechser Embedded Systems Engineering Kongress 05.12.2012 Lehrstuhl für Informa1k 4 Verteilte Systeme und Betriebssysteme

More information

CS 31: Introduction to Computer Systems : Threads & Synchronization April 16-18, 2019

CS 31: Introduction to Computer Systems : Threads & Synchronization April 16-18, 2019 CS 31: Introduction to Computer Systems 22-23: Threads & Synchronization April 16-18, 2019 Making Programs Run Faster We all like how fast computers are In the old days (1980 s - 2005): Algorithm too slow?

More information

Microkernel Construction

Microkernel Construction Introduction SS2013 Class Goals Provide deeper understanding of OS mechanisms Introduce L4 principles and concepts Make you become enthusiastic L4 hackers Propaganda for OS research at 2 Administration

More information

Status Update About COLO (COLO: COarse-grain LOck-stepping Virtual Machines for Non-stop Service)

Status Update About COLO (COLO: COarse-grain LOck-stepping Virtual Machines for Non-stop Service) Status Update About COLO (COLO: COarse-grain LOck-stepping Virtual Machines for Non-stop Service) eddie.dong@intel.com arei.gonglei@huawei.com yanghy@cn.fujitsu.com Agenda Background Introduction Of COLO

More information

Tiered-Latency DRAM: A Low Latency and A Low Cost DRAM Architecture

Tiered-Latency DRAM: A Low Latency and A Low Cost DRAM Architecture Tiered-Latency DRAM: A Low Latency and A Low Cost DRAM Architecture Donghyuk Lee, Yoongu Kim, Vivek Seshadri, Jamie Liu, Lavanya Subramanian, Onur Mutlu Carnegie Mellon University HPCA - 2013 Executive

More information

Background: I/O Concurrency

Background: I/O Concurrency Background: I/O Concurrency Brad Karp UCL Computer Science CS GZ03 / M030 5 th October 2011 Outline Worse Is Better and Distributed Systems Problem: Naïve single-process server leaves system resources

More information

Department of Computer Science Institute for System Architecture, Operating Systems Group REAL-TIME MICHAEL ROITZSCH OVERVIEW

Department of Computer Science Institute for System Architecture, Operating Systems Group REAL-TIME MICHAEL ROITZSCH OVERVIEW Department of Computer Science Institute for System Architecture, Operating Systems Group REAL-TIME MICHAEL ROITZSCH OVERVIEW 2 SO FAR talked about in-kernel building blocks: threads memory IPC drivers

More information

Tolerating Malicious Drivers in Linux. Silas Boyd-Wickizer and Nickolai Zeldovich

Tolerating Malicious Drivers in Linux. Silas Boyd-Wickizer and Nickolai Zeldovich XXX Tolerating Malicious Drivers in Linux Silas Boyd-Wickizer and Nickolai Zeldovich How could a device driver be malicious? Today's device drivers are highly privileged Write kernel memory, allocate memory,...

More information

CS420: Operating Systems

CS420: Operating Systems Threads James Moscola Department of Physical Sciences York College of Pennsylvania Based on Operating System Concepts, 9th Edition by Silberschatz, Galvin, Gagne Threads A thread is a basic unit of processing

More information

Transparent TCP Recovery

Transparent TCP Recovery Transparent Recovery with Chain Replication Robert Burgess Ken Birman Robert Broberg Rick Payne Robbert van Renesse October 26, 2009 Motivation Us: Motivation Them: Client Motivation There is a connection...

More information

Runtime Integrity Checking for Exploit Mitigation on Embedded Devices

Runtime Integrity Checking for Exploit Mitigation on Embedded Devices Runtime Integrity Checking for Exploit Mitigation on Embedded Devices Matthias Neugschwandtner IBM Research, Zurich eug@zurich.ibm.com Collin Mulliner Northeastern University, Boston collin@mulliner.org

More information

From eventual to strong consistency. Primary-Backup Replication. Primary-Backup Replication. Replication State Machines via Primary-Backup

From eventual to strong consistency. Primary-Backup Replication. Primary-Backup Replication. Replication State Machines via Primary-Backup From eventual to strong consistency Replication s via - Eventual consistency Multi-master: Any node can accept operation Asynchronously, nodes synchronize state COS 418: Distributed Systems Lecture 10

More information

Agenda Process Concept Process Scheduling Operations on Processes Interprocess Communication 3.2

Agenda Process Concept Process Scheduling Operations on Processes Interprocess Communication 3.2 Lecture 3: Processes Agenda Process Concept Process Scheduling Operations on Processes Interprocess Communication 3.2 Process in General 3.3 Process Concept Process is an active program in execution; process

More information

Taming Non-blocking Caches to Improve Isolation in Multicore Real-Time Systems

Taming Non-blocking Caches to Improve Isolation in Multicore Real-Time Systems Taming Non-blocking Caches to Improve Isolation in Multicore Real-Time Systems Prathap Kumar Valsan, Heechul Yun, Farzad Farshchi University of Kansas 1 Why? High-Performance Multicores for Real-Time Systems

More information

4. Jump to *RA 4. StackGuard 5. Execute code 5. Instruction Set Randomization 6. Make system call 6. System call Randomization

4. Jump to *RA 4. StackGuard 5. Execute code 5. Instruction Set Randomization 6. Make system call 6. System call Randomization 04/04/06 Lecture Notes Untrusted Beili Wang Stages of Static Overflow Solution 1. Find bug in 1. Static Analysis 2. Send overflowing input 2. CCured 3. Overwrite return address 3. Address Space Randomization

More information

CSE 451 Midterm 1. Name:

CSE 451 Midterm 1. Name: CSE 451 Midterm 1 Name: 1. [2 points] Imagine that a new CPU were built that contained multiple, complete sets of registers each set contains a PC plus all the other registers available to user programs.

More information

ECC Protection in Software

ECC Protection in Software Center for RC eliable omputing ECC Protection in Software by Philip P Shirvani RATS June 8, 1999 Outline l Motivation l Requirements l Coding Schemes l Multiple Error Handling l Implementation in ARGOS

More information

Foundations of the C++ Concurrency Memory Model

Foundations of the C++ Concurrency Memory Model Foundations of the C++ Concurrency Memory Model John Mellor-Crummey and Karthik Murthy Department of Computer Science Rice University johnmc@rice.edu COMP 522 27 September 2016 Before C++ Memory Model

More information

Overview of Operating Systems

Overview of Operating Systems Lecture Outline Overview of Operating Systems Instructor: Dr. Tongping Liu Thank Dr. Dakai Zhu and Dr. Palden Lama for providing their slides. 1 2 Lecture Outline Von Neumann Architecture 3 This describes

More information

Multiprocessors 2007/2008

Multiprocessors 2007/2008 Multiprocessors 2007/2008 Abstractions of parallel machines Johan Lukkien 1 Overview Problem context Abstraction Operating system support Language / middleware support 2 Parallel processing Scope: several

More information

Multiprocessor Systems Continuous need for faster computers Multiprocessors: shared memory model, access time nanosec (ns) Multicomputers: message pas

Multiprocessor Systems Continuous need for faster computers Multiprocessors: shared memory model, access time nanosec (ns) Multicomputers: message pas Multiple processor systems 1 Multiprocessor Systems Continuous need for faster computers Multiprocessors: shared memory model, access time nanosec (ns) Multicomputers: message passing multiprocessor, access

More information

Fall 2014:: CSE 506:: Section 2 (PhD) Threading. Nima Honarmand (Based on slides by Don Porter and Mike Ferdman)

Fall 2014:: CSE 506:: Section 2 (PhD) Threading. Nima Honarmand (Based on slides by Don Porter and Mike Ferdman) Threading Nima Honarmand (Based on slides by Don Porter and Mike Ferdman) Threading Review Multiple threads of execution in one address space Why? Exploits multiple processors Separate execution stream

More information

AR-SMT: A Microarchitectural Approach to Fault Tolerance in Microprocessors

AR-SMT: A Microarchitectural Approach to Fault Tolerance in Microprocessors AR-SMT: A Microarchitectural Approach to Fault Tolerance in Microprocessors Computer Sciences Department University of Wisconsin Madison http://www.cs.wisc.edu/~ericro/ericro.html ericro@cs.wisc.edu High-Performance

More information

殷亚凤. Processes. Distributed Systems [3]

殷亚凤. Processes. Distributed Systems [3] Processes Distributed Systems [3] 殷亚凤 Email: yafeng@nju.edu.cn Homepage: http://cs.nju.edu.cn/yafeng/ Room 301, Building of Computer Science and Technology Review Architectural Styles: Layered style, Object-based,

More information

FlexSC. Flexible System Call Scheduling with Exception-Less System Calls. Livio Soares and Michael Stumm. University of Toronto

FlexSC. Flexible System Call Scheduling with Exception-Less System Calls. Livio Soares and Michael Stumm. University of Toronto FlexSC Flexible System Call Scheduling with Exception-Less System Calls Livio Soares and Michael Stumm University of Toronto Motivation The synchronous system call interface is a legacy from the single

More information

CS377P Programming for Performance Multicore Performance Multithreading

CS377P Programming for Performance Multicore Performance Multithreading CS377P Programming for Performance Multicore Performance Multithreading Sreepathi Pai UTCS October 14, 2015 Outline 1 Multiprocessor Systems 2 Programming Models for Multicore 3 Multithreading and POSIX

More information

Operating Systems, Fall Lecture 9, Tiina Niklander 1

Operating Systems, Fall Lecture 9, Tiina Niklander 1 Multiprocessor Systems Multiple processor systems Ch 8.1 8.3 1 Continuous need for faster computers Multiprocessors: shared memory model, access time nanosec (ns) Multicomputers: message passing multiprocessor,

More information

ECE 588/688 Advanced Computer Architecture II

ECE 588/688 Advanced Computer Architecture II ECE 588/688 Advanced Computer Architecture II Instructor: Alaa Alameldeen alaa@ece.pdx.edu Winter 2018 Portland State University Copyright by Alaa Alameldeen and Haitham Akkary 2018 1 When and Where? When:

More information

CSCE 410/611: Virtualization!

CSCE 410/611: Virtualization! CSCE 410/611: Virtualization! Definitions, Terminology! Why Virtual Machines?! Mechanics of Virtualization! Virtualization of Resources (Memory)! Some slides made available Courtesy of Gernot Heiser, UNSW.!

More information

Tolerating Hardware Device Failures in Software. Asim Kadav, Matthew J. Renzelmann, Michael M. Swift University of Wisconsin Madison

Tolerating Hardware Device Failures in Software. Asim Kadav, Matthew J. Renzelmann, Michael M. Swift University of Wisconsin Madison Tolerating Hardware Device Failures in Software Asim Kadav, Matthew J. Renzelmann, Michael M. Swift University of Wisconsin Madison Current state of OS hardware interaction Many device drivers assume device

More information

Chapter 10 DISTRIBUTED OBJECT-BASED SYSTEMS

Chapter 10 DISTRIBUTED OBJECT-BASED SYSTEMS DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S. TANENBAUM MAARTEN VAN STEEN Chapter 10 DISTRIBUTED OBJECT-BASED SYSTEMS Distributed Objects Figure 10-1. Common organization of a remote

More information

Fault Isolation for Device Drivers

Fault Isolation for Device Drivers Fault Isolation for Device Drivers 39 th International Conference on Dependable Systems and Networks, 30 June 2009, Estoril Lisbon, Portugal Jorrit N. Herder Vrije Universiteit Amsterdam ~26% of Windows

More information

CPSC/ECE 3220 Fall 2017 Exam Give the definition (note: not the roles) for an operating system as stated in the textbook. (2 pts.

CPSC/ECE 3220 Fall 2017 Exam Give the definition (note: not the roles) for an operating system as stated in the textbook. (2 pts. CPSC/ECE 3220 Fall 2017 Exam 1 Name: 1. Give the definition (note: not the roles) for an operating system as stated in the textbook. (2 pts.) Referee / Illusionist / Glue. Circle only one of R, I, or G.

More information

NUMA replicated pagecache for Linux

NUMA replicated pagecache for Linux NUMA replicated pagecache for Linux Nick Piggin SuSE Labs January 27, 2008 0-0 Talk outline I will cover the following areas: Give some NUMA background information Introduce some of Linux s NUMA optimisations

More information

CSE502: Computer Architecture CSE 502: Computer Architecture

CSE502: Computer Architecture CSE 502: Computer Architecture CSE 502: Computer Architecture Memory Hierarchy & Caches Motivation 10000 Performance 1000 100 10 Processor Memory 1 1985 1990 1995 2000 2005 2010 Want memory to appear: As fast as CPU As large as required

More information

Exokernel: An Operating System Architecture for Application Level Resource Management

Exokernel: An Operating System Architecture for Application Level Resource Management Exokernel: An Operating System Architecture for Application Level Resource Management Dawson R. Engler, M. Frans Kaashoek, and James O'Tool Jr. M.I.T Laboratory for Computer Science Cambridge, MA 02139,

More information

Distributed File Systems Issues. NFS (Network File System) AFS: Namespace. The Andrew File System (AFS) Operating Systems 11/19/2012 CSC 256/456 1

Distributed File Systems Issues. NFS (Network File System) AFS: Namespace. The Andrew File System (AFS) Operating Systems 11/19/2012 CSC 256/456 1 Distributed File Systems Issues NFS (Network File System) Naming and transparency (location transparency versus location independence) Host:local-name Attach remote directories (mount) Single global name

More information

Load-Sto-Meter: Generating Workloads for Persistent Memory Damini Chopra, Doug Voigt Hewlett Packard (Enterprise)

Load-Sto-Meter: Generating Workloads for Persistent Memory Damini Chopra, Doug Voigt Hewlett Packard (Enterprise) Load-Sto-Meter: Generating Workloads for Persistent Memory Damini Chopra, Doug Voigt Hewlett Packard (Enterprise) Application vs. Pure Workloads Benchmarks that reproduce application workloads Assist in

More information

Microkernel-based Operating Systems - Introduction

Microkernel-based Operating Systems - Introduction Faculty of Computer Science Institute for System Architecture, Operating Systems Group Microkernel-based Operating Systems - Introduction Nils Asmussen Dresden, Oct 09 2018 Lecture Goals Provide deeper

More information

CrossCheck: A Holistic Approach for Tolerating Crash-Faults and Arbitrary Failures

CrossCheck: A Holistic Approach for Tolerating Crash-Faults and Arbitrary Failures CrossCheck: A Holistic Approach for Tolerating Crash-Faults and Arbitrary Failures Arthur Martens 1, Christoph Borchert 2, Manuel Nieke 1, Olaf Spinczyk 2 and Rüdiger Kapitza 1 1 TU Braunschweig 2 TU Dortmund

More information

Outline. Threads. Single and Multithreaded Processes. Benefits of Threads. Eike Ritter 1. Modified: October 16, 2012

Outline. Threads. Single and Multithreaded Processes. Benefits of Threads. Eike Ritter 1. Modified: October 16, 2012 Eike Ritter 1 Modified: October 16, 2012 Lecture 8: Operating Systems with C/C++ School of Computer Science, University of Birmingham, UK 1 Based on material by Matt Smart and Nick Blundell Outline 1 Concurrent

More information

Chapter 4: Multi-Threaded Programming

Chapter 4: Multi-Threaded Programming Chapter 4: Multi-Threaded Programming Chapter 4: Threads 4.1 Overview 4.2 Multicore Programming 4.3 Multithreading Models 4.4 Thread Libraries Pthreads Win32 Threads Java Threads 4.5 Implicit Threading

More information

Scheduler Support for Video-oriented Multimedia on Client-side Virtualization

Scheduler Support for Video-oriented Multimedia on Client-side Virtualization Scheduler Support for Video-oriented Multimedia on Client-side Virtualization Hwanju Kim 1, Jinkyu Jeong 1, Jaeho Hwang 1, Joonwon Lee 2, and Seungryoul Maeng 1 Korea Advanced Institute of Science and

More information

Distributed OS and Algorithms

Distributed OS and Algorithms Distributed OS and Algorithms Fundamental concepts OS definition in general: OS is a collection of software modules to an extended machine for the users viewpoint, and it is a resource manager from the

More information

TU Wien. Fault Isolation and Error Containment in the TT-SoC. H. Kopetz. TU Wien. July 2007

TU Wien. Fault Isolation and Error Containment in the TT-SoC. H. Kopetz. TU Wien. July 2007 TU Wien 1 Fault Isolation and Error Containment in the TT-SoC H. Kopetz TU Wien July 2007 This is joint work with C. El.Salloum, B.Huber and R.Obermaisser Outline 2 Introduction The Concept of a Distributed

More information

csci3411: Operating Systems

csci3411: Operating Systems csci3411: Operating Systems Lecture 3: System structure and Processes Gabriel Parmer Some slide material from Silberschatz and West System Structure System Structure How different parts of software 1)

More information

Distributed Systems 24. Fault Tolerance

Distributed Systems 24. Fault Tolerance Distributed Systems 24. Fault Tolerance Paul Krzyzanowski pxk@cs.rutgers.edu 1 Faults Deviation from expected behavior Due to a variety of factors: Hardware failure Software bugs Operator errors Network

More information

CS Lecture 3! Threads! George Mason University! Spring 2010!

CS Lecture 3! Threads! George Mason University! Spring 2010! CS 571 - Lecture 3! Threads! George Mason University! Spring 2010! Threads! Overview! Multithreading! Example Applications! User-level Threads! Kernel-level Threads! Hybrid Implementation! Observing Threads!

More information

Flexible Architecture Research Machine (FARM)

Flexible Architecture Research Machine (FARM) Flexible Architecture Research Machine (FARM) RAMP Retreat June 25, 2009 Jared Casper, Tayo Oguntebi, Sungpack Hong, Nathan Bronson Christos Kozyrakis, Kunle Olukotun Motivation Why CPUs + FPGAs make sense

More information