Microkernels and Portability. What is Portability wrt Operating Systems? Reuse of code for different platforms and processor architectures.

Similar documents
Faculty of Computer Science, Operating Systems Group. The L4Re Microkernel. Adam Lackorzynski. July 2017

Department of Computer Science, Institute for System Architecture, Operating Systems Group. Real-Time Systems '08 / '09. Hardware.

Microkernel Construction. Introduction. Michael Hohmuth. Lars Reuther. TU Dresden Operating Systems Group

TCBs and Address-Space Layouts Universität Karlsruhe, System Architecture Group

55:132/22C:160, HPCA Spring 2011

HARDWARE IN REAL-TIME SYSTEMS HERMANN HÄRTIG, WITH MARCUS VÖLP

Embedded Linux Architecture

Microkernel Construction

ARM Processors for Embedded Applications

OS Extensibility: Spin, Exo-kernel and L4

What is Computer Architecture?

18-349: Embedded Real-Time Systems Lecture 2: ARM Architecture

Microkernel-based Operating Systems - Introduction

Chapter 2. OS Overview

ECE 498 Linux Assembly Language Lecture 1

CS533 Concepts of Operating Systems. Jonathan Walpole

Introduction. COMP9242 Advanced Operating Systems 2010/S2 Week 1

Introduction. COMP /S2 Week Gernot Heiser UNSW/NICTA/OKL. Distributed under Creative Commons Attribution License 1

Microkernels. Overview. Required reading: Improving IPC by kernel design

Microkernel-based Operating Systems - Introduction

Universität Dortmund. ARM Architecture

Operating Systems CMPSCI 377 Spring Mark Corner University of Massachusetts Amherst

Cortex-A9 MPCore Software Development

CS5460: Operating Systems Lecture 14: Memory Management (Chapter 8)

Last class: OS and Architecture. OS and Computer Architecture

Last class: OS and Architecture. Chapter 3: Operating-System Structures. OS and Computer Architecture. Common System Components

Protection and System Calls. Otto J. Anshus

Introduction to Virtual Machines. Michael Jantz

OS and Computer Architecture. Chapter 3: Operating-System Structures. Common System Components. Process Management

Lecture 4: Instruction Set Design/Pipelining

ARM ARCHITECTURE. Contents at a glance:

Virtual Memory. Patterson & Hennessey Chapter 5 ELEC 5200/6200 1

Operating Systems. Operating System Structure. Lecture 2 Michael O Boyle

Microkernel-based Operating Systems - Introduction

CSE 560 Computer Systems Architecture

SE-292 High Performance Computing. Memory Hierarchy. R. Govindarajan

Lec 22: Interrupts. Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University. Announcements

MICROKERNELS: MACH AND L4

Virtual Machines Disco and Xen (Lecture 10, cs262a) Ion Stoica & Ali Ghodsi UC Berkeley February 26, 2018

The Processor: Instruction-Level Parallelism

Virtual Memory. Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University

Chapter 8 Memory Management

Designing, developing, debugging ARM Cortex-A and Cortex-M heterogeneous multi-processor systems

ARMv8-A Software Development

What are Exceptions? EE 457 Unit 8. Exception Processing. Exception Examples 1. Exceptions What Happens When Things Go Wrong

Last 2 Classes: Introduction to Operating Systems & C++ tutorial. Today: OS and Computer Architecture

Cache Performance and Memory Management: From Absolute Addresses to Demand Paging. Cache Performance

Address Translation. Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University

KVM/ARM. Marc Zyngier LPC 12

ARM Processor. Dr. P. T. Karule. Professor. Department of Electronics Engineering, Yeshwantrao Chavan College of Engineering, Nagpur

Address Translation. Tore Larsen Material developed by: Kai Li, Princeton University

Cortex-R5 Software Development

Virtualization. Operating Systems, 2016, Meni Adler, Danny Hendler & Amnon Meisels

IAR PowerPac RTOS for ARM Cores

Virtual Memory, Address Translation

System Architecture Directions for Networked Sensors[1]

Chapter 5 (Part II) Large and Fast: Exploiting Memory Hierarchy. Baback Izadi Division of Engineering Programs

Porting Linux to x86-64

COS 318: Operating Systems

Overview. This Lecture. Interrupts and exceptions Source: ULK ch 4, ELDD ch1, ch2 & ch4. COSC440 Lecture 3: Interrupts 1

Machine-Independent Virtual Memory Management for Paged June Uniprocessor 1st, 2010and Multiproce 1 / 15

Introduction to C. Why C? Difference between Python and C C compiler stages Basic syntax in C

COS 318: Operating Systems

Chapter 13 Reduced Instruction Set Computers

Advanced Operating Systems. COMP9242 Introduction

Reducing Hit Times. Critical Influence on cycle-time or CPI. small is always faster and can be put on chip

Computer Architectures

Virtualization. Pradipta De

Architectural Support for Operating Systems. Jinkyu Jeong ( Computer Systems Laboratory Sungkyunkwan University

MICROKERNEL CONSTRUCTION 2014

Cortex-A15 MPCore Software Development

HARDWARE IN REAL-TIME SYSTEMS HERMANN HÄRTIG, MARCUS VÖLP

EE 457 Unit 8. Exceptions What Happens When Things Go Wrong

CIS Operating Systems Memory Management Cache. Professor Qiang Zeng Fall 2017

ARM CORTEX-R52. Target Audience: Engineers and technicians who develop SoCs and systems based on the ARM Cortex-R52 architecture.

4. Hardware Platform: Real-Time Requirements

Linux Kernel Architecture

Fast access ===> use map to find object. HW == SW ===> map is in HW or SW or combo. Extend range ===> longer, hierarchical names

CS370 Operating Systems

CISC 360. Virtual Memory Dec. 4, 2008

Spring 2017 :: CSE 506. Device Programming. Nima Honarmand

Motivation, Concept, Paging January Winter Term 2008/09 Gerd Liefländer Universität Karlsruhe (TH), System Architecture Group

Process Concepts. CSC400 - Operating Systems. 3. Process Concepts. J. Sumey

L2 - C language for Embedded MCUs

Björn Döbel. Microkernel-Based Operating Systems. Exercise 3: Virtualization

CS252 Spring 2017 Graduate Computer Architecture. Lecture 18: Virtual Machines

Lecture 4: Memory Management & The Programming Interface

Advanced Operating Systems. COMP9242 Introduction

Lecture 13: Virtual Memory Management. CSC 469H1F Fall 2006 Angela Demke Brown

COMPUTER ARCHITECTURE. Virtualization and Memory Hierarchy

A Typical Memory Hierarchy

csci 3411: Operating Systems

ARM Architecture. Computer Organization and Assembly Languages Yung-Yu Chuang. with slides by Peng-Sheng Chen, Ville Pietikainen

CS162 - Operating Systems and Systems Programming. Address Translation => Paging"

NetBSD on Marvell Armada XP System on a Chip

Virtual Memory: From Address Translation to Demand Paging

OS lpr. www. nfsd gcc emacs ls 1/27/09. Process Management. CS 537 Lecture 3: Processes. Example OS in operation. Why Processes? Simplicity + Speed

Knut Omang Ifi/Oracle 6 Nov, 2017

SimBench. A Portable Benchmarking Methodology for Full-System Simulators. Harry Wagstaff Bruno Bodin Tom Spink Björn Franke

Lecture 4: Instruction Set Architecture

Transcription:

Microkernels and Portability What is Portability wrt Operating Systems? Reuse of code for different platforms and processor architectures.

Contents Overview History Towards Portability L4 Microkernels Case Study Fiasco Microkernel Overview Structure Implementation Case Study Fiasco on ARM, IA-32, and IA-64 CPU MMU Caches TU Dresden, 17.06.10 Portability Slide 2

CPUs & Platforms IA-32 aka x86 IA-64 AMD 64 ARM m68k PPC... x86 PC HP IA-64 Workstation AMD 64 PC ipaq PDA, iphone Atari TU Dresden, 17.06.10 Portability Slide 3

Different CPUs Instruction Set Architecture (ISA) CISC, RISC... Word and address width 16bit, 32bit, 64bit... Byte order Little Endian, Big Endian Processor state (execution context) Registers... Processor modes Memory subsystem (MMU/TLB) Caches Pipelines TU Dresden, 17.06.10 Portability Slide 4

Different Platforms Peripheral buses PCI-Bus, ISA-Bus, CAN-Bus, AXI-Bus Peripheral devices Graphics adapter, network adapter, UART, timer, IRQ controller... TU Dresden, 17.06.10 Portability Slide 5

History of Microkernel Portability First microkernels (Mach) designed for portability Mediocre performance Second generation (L4) designed for minimality and performance Assembly language Unportable (however small and fast) The additional layer per se costs performance It cannot take precautions to circumvent or avoid performance problems of specific hardware Such a µ-kernel cannot take advantage of specific hardware [On µ-kernel Construction (J. Liedtke SOSP 1995)] Even i486 and Pentium are incompatible TU Dresden, 17.06.10 Portability Slide 6

History (2) Towards Portability First HLL implementation of L4 Fiasco (designed for preemptability) acceptable performance L4 versions for Alpha, MIPS, ARM (written from scratch) New processors every year Maintenance problems Portable L4 microkernels: L4::Hazelnut (C/C++), L4/Fiasco (C++), L4::Pistachio (C++) TU Dresden, 17.06.10 Portability Slide 7

Prerequisites High Level Language (done since Fiasco) Assembler code is per se unportable necessary but not sufficient - HOWEVER - No OS without some assembler code Only parts of the code are reusable/portable TU Dresden, 17.06.10 Portability Slide 8

Tradeoff Performance vs. Portability Where to put the boundary? Generic Kernel Hardware Abstractions Generic Kernel Hardware Abstractions TU Dresden, 17.06.10 Portability Slide 9

L4 Statement Design Goal is Performance What to do to achieve maximum performance and acceptable portability? TU Dresden, 17.06.10 Portability Slide 10

L4 Microkernels in Detail CPU: ISA*, pipelines*, word width, byte order, processor modes, processor state, MMU/TLB, caches Platform: Physical memory layout, boot-up, debugging console, IRQ controller, timer *Handled by the compiler TU Dresden, 17.06.10 Portability Slide 11

Handling in Fiasco Simple (but important) Word width carefully defined data types Byte order no use of C bit fields for the ABI Not Performance critical Bootstrap some platform-specific init code Debug console complete abstract framework TU Dresden, 17.06.10 Portability Slide 12

Handling in Fiasco (2) More tricky (affect the critical path IPC) Processor modes mapping to kernel mode and user mode, mode switches Processor state context switches MMU/TLB specific address-space/pagetable code Caches specific cache-consistency handling IRQ controller abstract controller interface TU Dresden, 17.06.10 Portability Slide 13

Fiasco Structure Generic Code Memory management Page-fault handling IPC Path Mapping database Base of the kernel debugger Most code of L4 abstractions Thread and address-space management CPU generic Platform ABI TU Dresden, 17.06.10 Portability Slide 14

Fiasco Structure (2) Processor and ABI Specific Code Basic data types Processor abstraction IRQ control, sleep-mode support Atomic operations Page tables Parts of L4 abstractions Switch of CPU and FPU state CPU specific optimizations CPU generic Platform ABI TU Dresden, 17.06.10 Portability Slide 15

Fiasco Structure (3) Platform Specific Code Interrupt-controller driver UART driver (debugger) Graphics adapter driver (debugger) CPU generic Platform ABI TU Dresden, 17.06.10 Portability Slide 16

Interaction of Code Parts Interesting interactions: Generic code calls x-specific functions Data structures contain generic and x-specific members Solution: (Fiasco is C++) Use OO principles, such as inheritance and polymorphism TU Dresden, 17.06.10 Portability Slide 17

Polymorphism C++ Virtual Functions Virtual functions impose indirections (Indirections are necessary for runtime polymorphism) Hardware abstractions are compile-time polymorphism Indirections are not acceptable Extra memory references Extra cache pollution No function inlining TU Dresden, 17.06.10 Portability Slide 18

Solutions for the Virtual-Functions Dilemma Use a good compiler with virtual-function elimination No known compiler with this feature Class Flattening developed in Karlsruhe Can deal with data-member layout and optimize it Is a research project Preprocess C++ Preprocessor Provides class extension for compile-time diversity Can deal with data-member layout (no optimization) Is used by Fiasco since the beginning TU Dresden, 17.06.10 Portability Slide 19

Break Point We already know What is portability about What is the basic structure of Fiasco How do we solve the diversity (in theory) TU Dresden, 17.06.10 Portability Slide 20

Practical Comparison IA-32, IA-64, and ARM Processor Modes: Fiasco model knows user mode and privileged mode (critical for kernel entries and exits) Processor State: Complete register set and status words (critical for thread switches) MMU/TLB: L4 Address-Space Model (critical for task switches) Caches: Cache consistency must be maintained (critical for task switches) TU Dresden, 17.06.10 Portability Slide 21

Processor Modes IA-32 3 = User* 2 =... 1 =... 0 = Kern* IA-64 3 = User* 2 =... 1 =... 0 = Kern* ARM User* Abort Undefined IRQ FIQ Supervisor* System *effectively used in Fiasco; green modes are privileged TU Dresden, 17.06.10 Portability Slide 22

Processor State IA-32 8 GPRs FPU state IA-64 32 GPRs Register Stack min. 96 GPRs Backing store FPU state ARM 15 GPRs FPU state? ca. 140 Byte w/o MMX/SSE ca. 0.5kB mit MMX/SSE ca. 1kB GPRs + 2kB FPU ca. 64 Bytes + (FPU?) TU Dresden, 17.06.10 Portability Slide 23

MMU/TLB IA-32 HW page tables Implicit/explicit TLB control Segmentation IA-64 SW loaded TLB HW page table option Decoupled protection TLB tagging Explicit TLB control ARM HW page table Decoupled protection kind of TLB tagging Explicit TLB control TU Dresden, 17.06.10 Portability Slide 24

Caches IA-32 Phys. tagged and indexed HW consistency IA-64 Phys. tagged and indexed HW consistency ARM Virt. tagged and indexed SW consistency TU Dresden, 17.06.10 Portability Slide 25

Issues TLB consistency Cache consistency Use of specific hardware Reduce TLB overhead Reduce cache overhead Reduce processor state save/restore-overhead TU Dresden, 17.06.10 Portability Slide 26

TLB Consistency w/o address-space IDs in the TLB TLB flush on address-space switches (always) TLB flush on page-rights downgrade IA-64 and ARM TLB flush on page-rights upgrade (eager or lazy) TLB flushes have high direct and indirect costs TU Dresden, 17.06.10 Portability Slide 27

Cache Consistency Physically Tagged/Indexed Caches Virt. Memory A B Phys. Memory Cache Different phys. addresses map to different cache location No consistency problems No overhead on task switches TU Dresden, 17.06.10 Portability Slide 28

Cache Consistency (2) Virtually Tagged/Indexed Caches Cache Virt. Memory A B Phys. Memory Same virtual addresses map to same cache line Cache flush on addressspace switches Extremely high direct and indirect costs for task switches TU Dresden, 17.06.10 Portability Slide 29

Cache Consistency (3) Virtually Tagged/Indexed Caches Phys. Memory V. Memory Aliasing of same phys. frame Duplicated cache allocation Cache Must have countermeasures for this problem TU Dresden, 17.06.10 Portability Slide 30

Conclusion Where to put the boundary between generic and hardwarespecific code? There is no single answer to this question. All code that must be HW specific is HW specific Some optimization code is HW specific More optimizations worse maintainability Carefully designed generic code allows the use of HW specific optimizations with low portability overhead TU Dresden, 17.06.10 Portability Slide 31