Review: Hardware user/kernel boundary


[Diagram: applications and their libraries run in user mode; entry into the kernel's FS, VM, and sockets code happens via syscalls and page faults, with disks and the NIC raising device interrupts, plus context switches and TCP retransmits.]

The processor can be in one of two modes:
- User mode: application software & libraries
- Kernel (supervisor/privileged) mode: for the OS kernel
Privileged instructions are only available in kernel mode

Kernel execution contexts
- Top half of kernel: kernel code acting on behalf of the current user-level process (system call, page fault handler, kernel-only process, etc.). This is the only kernel context in which you can sleep (e.g., if you need to wait for more memory, a buffer, etc.)
- Software interrupt: TCP/IP protocol processing
- Device interrupt (network, disk, ...): external hardware causes the CPU to jump to an OS entry point
- Timer interrupt (hardclock)
- Context switch code: switches the current process (a thread switch for the top half)

Transitions between contexts
- User → top half: syscall, page fault
- User/top half → device/timer interrupt: hardware
- Top half → user/context switch: return
- Top half → context switch: sleep
- Context switch → user/top half

Context switches are expensive
Modern CPUs are very complex
- Deep pipelining to maximize use of functional units
- Dynamically scheduled, speculative & out-of-order execution
Most context switches flush the pipeline
- E.g., after a memory fault, can't execute further instructions
Re-enabling hardware interrupts is expensive
Kernel must set up its execution context
- Cannot trust user registers, especially the stack pointer
- Must set up its own stack and save user registers
Switching between address spaces is expensive
- Invalidates cached virtual memory translations (discussed in a few slides...)

Top/bottom half synchronization
Top-half kernel procedures can mask interrupts:
int x = splhigh (); /* ... */ splx (x);
splhigh disables all interrupts, but there are also splnet, splbio, splsoftnet, ...
- Hardware interrupt handler sets the mask automatically; e.g., splnet implies splsoftnet
Example: an Ethernet packet arrives, driver code flags that TCP/IP processing is needed; upon return from the handler, TCP code is invoked at softnet level
Masking interrupts in hardware can be expensive
- Optimistic implementation: set a mask flag on splhigh, check an interrupted flag on splx

Process context

[Diagram: three processes P1, P2, P3, each an application plus library, entering the kernel via syscall (FS → disk), page fault (VM → disk), and syscall (sockets → NIC), with context switches, TCP retransmits, and device interrupts below.]

Kernel gives each program its own context
Isolates processes from each other
- So one buggy process cannot crash others

Virtual memory
Need fault isolation between processes
Want each process to have its own view of memory
- Otherwise, it is a pain to allocate large contiguous structures
- Processes can consume more than the available memory
- Dormant processes (waiting for an event) still have core images
Solution: virtual memory
- Give each program its own address space, i.e., address 0x8000 goes to different physical memory in P1 and P2
- CPU must be in kernel mode to manipulate mappings
- Isolation between processes is natural

Paging
Divide memory up into small pages
Map virtual pages to physical pages
- Each process has a separate mapping
Allow the OS to gain control on certain operations
- Read-only pages trap to the OS on write
- Invalid pages trap to the OS on read or write
- The OS can change the mapping and resume the application
Other features sometimes found:
- Hardware can set a dirty bit
- Control over caching of a page

Example: Paging on x86
Page size is 4 KB (in the most commonly used mode)
Page table: 1,024 32-bit translations, covering 4 MB of virtual memory
Page directory: 1,024 pointers to page tables
%cr3: page table base register
%cr0: bits enable protection and paging
INVLPG: tells the hardware a page table entry has been modified

[Diagram: the 32-bit linear address is split into a 10-bit directory index (bits 31-22), a 10-bit table index (bits 21-12), and a 12-bit offset (bits 11-0). CR3 (the PDBR) points to the page directory; the directory entry points to a page table; the page table entry supplies the 20-bit page base, which is combined with the offset to form the physical address. 1024 PDEs x 1024 PTEs = 2^20 pages; entries are 32 bits, and tables are aligned on a 4 KByte boundary.]

Page Table Entry (4 KByte page):
- Bits 31-12: page base address
- Bits 11-9: Avail - available for system programmer's use
- Bit 8: G - global page
- Bit 7: PAT - page table attribute index
- Bit 6: D - dirty
- Bit 5: A - accessed
- Bit 4: PCD - cache disabled
- Bit 3: PWT - write-through
- Bit 2: U/S - user/supervisor
- Bit 1: R/W - read/write
- Bit 0: P - present

Example: MIPS
Hardware has a 64-entry TLB
- References to addresses not in the TLB trap to the kernel
Each TLB entry has the following fields: virtual page, PID, page frame, NC, D, V, Global
Kernel itself is unpaged
- All of physical memory is contiguously mapped in high VM
- Kernel uses these pseudo-physical addresses
User TLB fault handler is very efficient
- Two hardware registers are reserved for it
- The utlb miss handler can itself fault, allowing paged page tables

Memory and I/O buses

[Diagram: CPU connected through a crossbar to memory (1880 Mbps) and to the I/O bus (1056 Mbps).]

CPU accesses physical memory over a bus
Devices access memory over the I/O bus with DMA
Devices can appear to be a region of memory

What is memory?
SRAM - static RAM
- Like two NOT gates circularly wired input-to-output
- 4-6 transistors per bit, actively holds its value
- Very fast; used to cache slower memory
DRAM - dynamic RAM
- A capacitor + gate holds charge to indicate the bit value
- 1 transistor per bit - extremely dense storage
- Charge leaks - need a slow comparator to decide if the bit is 1 or 0
- Must re-write the charge after reading, or periodically refresh
VRAM - video RAM
- Dual-ported: can write while someone else reads

Communicating with a device
Memory-mapped device registers
- Certain physical addresses correspond to device registers
- Load/store gets status/sends instructions - not real memory
Device memory - the device may have memory the OS can write to directly on the other side of the I/O bus
Special I/O instructions
- Some CPUs (e.g., x86) have special I/O instructions
- Like load & store, but assert a special I/O pin on the CPU
- OS can allow user-mode access to I/O ports with finer granularity than a page
DMA - place instructions to the card in main memory
- Typically then need to poke the card by writing to a register

DMA buffers

[Diagram: a buffer descriptor list whose entries (lengths 100, 1400, 1500, 1500, 1500) point at scattered buffers in main memory.]

Include a list of buffer locations in main memory
Card reads the list, then accesses the buffers (with DMA)
- Allows for scatter/gather I/O

Anatomy of a network interface card

[Diagram: adaptor between the host I/O bus and the network link, containing bus-interface logic and link-interface logic joined by FIFOs.]

Link interface talks to the wire/fiber/antenna
- Typically does framing and link-layer CRC
FIFOs on the card provide a small amount of buffering
Bus-interface logic moves packets to/from memory or the CPU

Driver architecture
Device driver provides several entry points to the kernel
- Reset, output, interrupt, ...
How should the driver synchronize with the card?
- Need to know when transmit buffers are free or packets arrive
One approach: polling
- Sent a packet? Loop asking the card when a buffer is free
- Waiting to receive? Keep asking the card if it has a packet
Disadvantages of polling
- Can't use the CPU for anything else while polling
- Or schedule a poll in the future and do something else, but then there is high latency to receive a packet

Interrupt-driven devices
Instead, ask the card to interrupt the CPU on events
- Interrupt handler runs at high priority
- Asks the card what happened (transmit buffer free, new packet)
- This is what most general-purpose OSes do
Problem: bad for high-throughput scenarios
- Interrupts are very expensive (context switch)
- Interrupt handlers have high priority
- In the worst case, can spend 100% of the time in interrupt handlers and never make any progress - receive livelock
Best: an adaptive algorithm that switches between interrupts and polling