Multiprocessor Solution

Similar documents
Assembly Language: Function Calls

CS 31: Intro to Systems Functions and the Stack. Martin Gagne Swarthmore College February 23, 2016

Assembly Language: Function Calls" Goals of this Lecture"

Assembly Language: Function Calls" Goals of this Lecture"

µ-kernel Construction (12)

Assembly Language: Function Calls. Goals of this Lecture. Function Call Problems

CPS104 Recitation: Assembly Programming

CS 33: Week 3 Discussion. x86 Assembly (v1.0) Section 1G

Function Calls COS 217. Reading: Chapter 4 of Programming From the Ground Up (available online from the course Web site)

Process Layout and Function Calls

Assembly Programmer s View Lecture 4A Machine-Level Programming I: Introduction

x86 Assembly Crash Course Don Porter

Machine-level Programming (3)

4) C = 96 * B 5) 1 and 3 only 6) 2 and 4 only

Procedure Calls. Young W. Lim Sat. Young W. Lim Procedure Calls Sat 1 / 27

Question 4.2 2: (Solution, p 5) Suppose that the HYMN CPU begins with the following in memory. addr data (translation) LOAD 11110

ASSEMBLY III: PROCEDURES. Jo, Heeseung

Assembly III: Procedures. Jo, Heeseung

Threads, System Calls, and Thread Switching

CMSC 313 Lecture 12. Project 3 Questions. How C functions pass parameters. UMBC, CMSC313, Richard Chang

CS213. Machine-Level Programming III: Procedures

Assembly III: Procedures. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Machine Programming 3: Procedures

Second Part of the Course

Instruction Set Architecture

administrivia today start assembly probably won t finish all these slides Assignment 4 due tomorrow any questions?

COMP 210 Example Question Exam 2 (Solutions at the bottom)

Procedure Calls. Young W. Lim Mon. Young W. Lim Procedure Calls Mon 1 / 29

Processes (Intro) Yannis Smaragdakis, U. Athens

X86 Stack Calling Function POV

You may work with a partner on this quiz; both of you must submit your answers.

Instruction Set Architecture

The course that gives CMU its Zip! Machine-Level Programming III: Procedures Sept. 17, 2002

Instruction Set Architecture

Implementing Threads. Operating Systems In Depth II 1 Copyright 2018 Thomas W. Doeppner. All rights reserved.

W4118: PC Hardware and x86. Junfeng Yang

CS241 Computer Organization Spring 2015 IA

CISC 360 Instruction Set Architecture

Y86 Processor State. Instruction Example. Encoding Registers. Lecture 7A. Computer Architecture I Instruction Set Architecture Assembly Language View

Instruction Set Architecture

Process Layout, Function Calls, and the Heap

Sungkyunkwan University

Credits to Randy Bryant & Dave O Hallaron

CS61 Section Solutions 3

x86 assembly CS449 Fall 2017

Region of memory managed with stack discipline Grows toward lower addresses. Register %esp contains lowest stack address = address of top element

Compiler Construction D7011E

CS 31: Intro to Systems ISAs and Assembly. Kevin Webb Swarthmore College February 9, 2016

Machine-Level Programming II: Arithmetic & Control /18-243: Introduction to Computer Systems 6th Lecture, 5 June 2012

Homework. In-line Assembly Code Machine Language Program Efficiency Tricks Reading PAL, pp 3-6, Practice Exam 1

CS 31: Intro to Systems ISAs and Assembly. Kevin Webb Swarthmore College September 25, 2018

CS429: Computer Organization and Architecture

µ-kernel Construction

CPSC W Term 2 Problem Set #3 - Solution

Machine-Level Programming II: Control and Arithmetic

ASSEMBLY I: BASIC OPERATIONS. Jo, Heeseung

1 /* file cpuid2.s */ 4.asciz "The processor Vendor ID is %s \n" 5.section.bss. 6.lcomm buffer, section.text. 8.globl _start.

Turning C into Object Code Code in files p1.c p2.c Compile with command: gcc -O p1.c p2.c -o p Use optimizations (-O) Put resulting binary in file p

Machine-Level Programming I: Introduction Jan. 30, 2001

Machine Level Programming II: Arithmetic &Control

What is a Compiler? Compiler Construction SMD163. Why Translation is Needed: Know your Target: Lecture 8: Introduction to code generation

Machine-Level Programming III: Procedures

CSE2421 FINAL EXAM SPRING Name KEY. Instructions: Signature

Instructor: Alvin R. Lebeck

CMSC 313 Lecture 12 [draft] How C functions pass parameters

X86 Assembly -Procedure II:1

ICS143A: Principles of Operating Systems. Midterm recap, sample questions. Anton Burtsev February, 2017

TCBs and Address-Space Layouts Universität Karlsruhe, System Architecture Group

Assembly I: Basic Operations. Computer Systems Laboratory Sungkyunkwan University

Assembly I: Basic Operations. Jo, Heeseung

Final Exam. Fall Semester 2015 KAIST EE209 Programming Structures for Electrical Engineering. Name: Student ID:

Chapter 4! Processor Architecture!

Machine Program: Procedure. Zhaoguo Wang

Ausgewählte Betriebssysteme. Anatomy of a system call

CPEG421/621 Tutorial

Machine Programming 1: Introduction

Assembly level Programming. 198:211 Computer Architecture. (recall) Von Neumann Architecture. Simplified hardware view. Lecture 10 Fall 2012

CSE351 Autumn 2012 Midterm Exam (5 Nov 2012)

IA32 Stack. Stack BoDom. Region of memory managed with stack discipline Grows toward lower addresses. Register %esp contains lowest stack address

ECE 391 Exam 1 Review Session - Spring Brought to you by HKN

Sungkyunkwan University

Tutorial 10 Protection Cont.

Intro to Threads. Two approaches to concurrency. Threaded multifinger: - Asynchronous I/O (lab) - Threads (today s lecture)

CSE 351: Week 4. Tom Bergan, TA

CISC 360. Machine-Level Programming I: Introduction Sept. 18, 2008

Pre-virtualization internals

Sungkyunkwan University

Giving credit where credit is due

x86 architecture et similia

Program Exploitation Intro

CS 318 Principles of Operating Systems

X86 Addressing Modes Chapter 3" Review: Instructions to Recognize"

CS241 Computer Organization Spring Addresses & Pointers

Machine-Level Programming II: Arithmetic & Control. Complete Memory Addressing Modes

CSC 2400: Computer Systems. Towards the Hardware: Machine-Level Representation of Programs

An Introduction to x86 ASM

Register Allocation, iii. Bringing in functions & using spilling & coalescing

Lecture #16: Introduction to Runtime Organization. Last modified: Fri Mar 19 00:17: CS164: Lecture #16 1

Stack Discipline Jan. 19, 2018

CSC 2400: Computing Systems. X86 Assembly: Function Calls"

Transcription:

Mutual Exclusion Multiprocessor Solution P(sema S) begin while (TAS(S.flag)==1){}; { busy waiting } S.Count= S.Count-1 if (S.Count < 0){ insert_t(s.qwt) BLOCK(S) {inkl.s.flag=0)!!!} } else S.flag =0 end V(sema S) begin while (TAS(S.flag)==1){}; { busy waiting } S.Count= S.Count+1 if S.Count 0 { remove_t(s.qwt) UNBLOCK(S) } S.flag =0 end 2007 Universität Karlsruhe (TH), System Architecture Group

11 Thread Switch & IPC in L4 Dezember 3, 2008 Winter Term 2008/2009 Philipp Kupferschmied 2

What Defines a Thread? Requires a stack Thread Program definition Compiler Provides registers Execution environment 3

What Defines a Thread? Program definition Compiler Thread stack Execution environment 4

What is a (generalized) Thread Switch? Change in the defining parts of a thread Stack pointer Instruction pointer Registers Allow later resumption Save all necessary state 5

The Stack Switch Spectrum function call exceptions kernel entry IPC user-level preemption kernel preemption Communication Concurrency 6

The Stack Switch Spectrum Defines new stack-frame Changes IP Passes register parameters function call exceptions kernel entry IPC user-level preemption kernel preemption Communication Concurrency 7

The Stack Switch Spectrum Walks stack frames Changes IP Passes register parameters function call exceptions kernel entry IPC user-level preemption kernel preemption Communication Concurrency 8

The Stack Switch Spectrum Switches to a new stack Changes IP Passes register parameters function call exceptions kernel entry IPC user-level preemption kernel preemption Communication Concurrency 9

The Stack Switch Spectrum Switches to a new stack Changes IP Passes register parameters Switches to a new address space function call exceptions kernel entry IPC user-level preemption kernel preemption Communication Concurrency 10

The Stack Switch Spectrum Switches to a new stack Changes IP Preserves all register state function call exceptions kernel entry IPC user-level preemption kernel preemption Communication Concurrency 11

The Stack Switch Spectrum Switches to a new stack Changes IP Preserves all register state Switches to a new address space function call exceptions kernel entry IPC user-level preemption kernel preemption Communication Concurrency 12

The Stack Switch Spectrum Aided by compiler Transparent to compiler function call exceptions kernel entry IPC user-level preemption kernel preemption Communication Concurrency 13

Compiler s Role in Thread Switch Cooperative Thread Switch Explicit by program author Supported by compiler Preemptive Thread Switch Destructive to compiler s state machine Must be transparent to compiler 14

Cooperative vs. Preemptive Cooperative (IPC) 1. Enter the kernel (syscall/trap) 2. Switch to kernel stack 3. Cooperative switch to target thread 4. Restore user thread 5. Exit kernel Preemptive 1. Enter the kernel (interrupt/exception) 2. Switch to kernel stack 3. Cooperative switch to target thread 4. Restore user thread 5. Exit kernel 15

Fast Thread Switch 16

Scenario Multi-threaded computations Weather forecast Air flow simulation Multi-server operating systems Application Memory FS Block Dev. Disk Dev. Microkernel 17

The Goods and the Bads + OS designers have power/influence Everybody depends on us We can make applications fast OS designers have power/influence Everybody depends on us We can make applications slow 18

The Goods and the Bads + OS designers have power/influence Everybody depends on us Linus Torvalds Date: Fri, 30 Nov 2001 21:15:50-0800 (PST) We can make applications fast If you want to see a system that was more thoroughly designed, you should probably point not to Dennis (Ritchie) and Ken (Thompson), but to systems like L4 and Plan-9, and people like Jochen Liedtke and Rob Pike. OS designers have power/influence Everybody depends on us We can make applications slow http://kerneltrap.org/article.php?sid=398 19

Thread Switch Activation Records Where to store thread switch records? On the thread s stack Remember the thread s stack location Advantages of using stack Stack proposed anyway Some architectures are stack efficient E.g. IA32 21

Thread Blocking Push reactivation info on stack Save stack pointer in TCB Stack Pointer State = BLOCKED TCB Saved Register State Instruction Pointer Stack Content 22

Thread Activation Restore registers Install new stack Jump to instruction address Stack Pointer State = RUNNING TCB Saved Register State Instruction Pointer Stack Content 23

Thread Activation Restore registers Install new stack Jump to instruction address Saved Register State Instruction Pointer Stack Content 24

How to Locate the TCB? User specifies thread ID TID is communication identifier TCB_AREA_START TCB TID TCB bitwise AND, bit SHIFT TCB TCB area offset 25

How to Locate the TCB? User specifies thread ID TID is communication identifier Fast! No memory lookup. No cache miss. No TLB miss. TCB_AREA_START TCB TID TCB bitwise AND, bit SHIFT TCB TCB area offset 26

Cooperative Switch Example Save registers content (to stack) Switch stack Switch address space (if necessary) Restore registers mov ToTcb, %eax mov CurrentTcb, %ebx mov Ptab(%eax), %ecx mov %cr3, %edx push %esi push %edi push %ebp push _restart /* IP */ mov %esp, Stack(%ebx) mov Stack(%eax), %esp cmp %ecx, %edx je _same_as mov %ecx, %cr3 _same_as: ret _restart: pop %ebp pop %edi pop %esi 27

Cooperative Switch in C (using gcc) inline void SwitchTo(Tcb_t * thread) { inline asm ( push %%ebp /* save registers */ push 2f /* save IP */ mov %%esp, %2 /* switch stack */ mov %0, %%esp mov %%cr3, %%edx /* switch address space */ cmp %1, %%edx je 1f mov %1, %%cr3 1: ret /* continue as thread */ 2: pop %%ebp /* restore registers */ : : m (thread->stack), c (thread->ptab), /* ecx contains ptab */ m (current->stack), /*clobber*/ : eax, ebx, ecx, edx, esi, edi, memory ); } 28

Cooperative Switch in C (using gcc) inline void SwitchTo(Tcb_t * thread) { inline asm ( push %%ebp Trick: /* save registers */ push 2f /* save IP */ mov %%esp, %2 /* switch stack */ mov %0, %%esp mov %%cr3, %%edx /* switch address space */ cmp %1, %%edx je 1f mov %1, %%cr3 1: ret /* continue as thread */ 2: pop %%ebp /* restore registers */ : : m (thread->stack), c (thread->ptab), /* ecx contains ptab */ m (current->stack), /*clobber*/ : eax, ebx, ecx, edx, esi, edi, memory ); } Let gcc save and restore the registers it knows best which are really in use! 29

Fast IPC in L4 30

Communication stack f stack 31

Communication Parameters in register file Avoids stack memory No cache footprint No TLB footprint Compiler optimizations Register allocation Register use f stack 32

Reasons for High IPC Performance Synchronous IPC Register based Cache footprint!!! Hand-optimized 33

IPC (in Multiple Steps) 1. Start IPC send to red 1. Enter kernel 2. Reset finite kernel stack IPC send 2. SwitchTo(red) 1. Kernel thread switch 3. Continue 1. Exit to user 2. Restore user stack 36

IPC (in Multiple Steps) Combine into single step. Skip switch to kernel stack of receiver. Switch directly to target user thread. 1. Start IPC send to red 1. Enter kernel 2. Reset finite kernel stack IPC send 2. SwitchTo(red) 1. Kernel thread switch 3. Continue 1. Exit to user 2. Restore user stack 37

IPC Message Transfer Describe message in 4 registers Receiver, IPC type, send+recv structure 3 remaining registers (ebx, edx, edi) Message Other things to do Enter and exit kernel Do some checks 38

IA32 IPC Implementation %eax: to_tcb, %ebp: current, (%ebx, %edx, %edi): message TRAP_HANDLER_SEND: cmp Partner(%eax), %ebp jne block cmp State(%eax), WAITING jne block mov RUNNING, State(%eax) push ipc_fastpath_ret mov %esp, Stack(%ebp) mov Ptab(%eax), %eax mov %cr3, %ebp cmp %eax, %ebp je exit_to_user mov %eax, %cr3 exit_to_user: Where is the message transfer? 39

/* eax: snd desc, ecx: USP, ebp: rcv desc, esp: KSP (snd TID), esi: rcv TID */ /* ebx, edx (->ebp), edi: message, 53 instructions in kernel mode */ /* create "nice" stack frame for slow path switch */ pushl $ipc_fastpath_ret ipc_sysenter: movl %esp, OFS_TCB_STACK(current) /* save stack */ mov (%esp), %esp pushl $X86_UDS /* User Data Segment Selector */ /* perform task switch if required */ pushl %ecx /* save user ESP */ movl %cr3, %ebp /* get curr. ptab */ push $0 /* fake pushf */ movl OFS_TCB_PAGE_DIR(to_tcb), %ecx /* get dst ptab */ pushl $0x1b movl OFS_TCB_MYSELF(current), %esi /* set sender */ pushl (%ecx) /* save user EIP on kernel stack */ cmpl %ebp, %ecx /* same ptab? */ movl 12(%ecx), %ecx /* load sender s timeouts */ je 2f /* do not reload */ pushl %ebp /* rcv desc */ movl %ecx, %cr3 /* load new ptab */ pushl %eax /* snd desc */ #define to_tcb %eax 2: #define current %ebp /* we don't care about the other guy s stack :) */ addl $L4_TOTAL_TCB_SIZE - 4, %eax orl %ebp, %eax /* eax = snd_desc rcv_desc */ movl %eax, tss + 4/* prepare dst s kernel stack */ orl %ecx, %eax /* eax = snd_desc rcv_desc timeout */ movl %edx, %ebp /* save msg0 around sysexit */ movl -8(%eax), %ecx /* user ESP (stored there above) */ leal TCB_AREA(%esi), to_tcb movl -20(%eax), %edx/* user EIP (stored there above) */ jne ipc_slowpath_to /* not short call(never) */ addl $16, %ecx andl $L4_TIDTCB_MASK, to_tcb subl $2, %edx mov %esp, current subl %eax, %eax andl $L4_TCB_MASK, current sti sysexit cmpl OFS_TCB_MYSELF(to_tcb), %esi /* valid TCB for receiver? */ jne ipc_slowpath /* to_tcb->myself!= dest */ ipc_slowpath: movl OFS_TCB_THREAD_STATE(to_tcb), %ecx /* call a C function to handle more complex cases */ add $1, %ecx... orl OFS_TCB_UNWIND_IPC_SP(to_tcb), %ecx orl OFS_TCB_RESOURCES(to_tcb), %ecx jne ipc_slowpath /* not waiting or on slow IPC path */ testb $1, OFS_TCB_QUEUE_STATE(to_tcb) je ipc_slowpath /* not in ready queue */ 1: /* COND: ecx == 0 */ orl OFS_TCB_PARTNER(to_tcb), %ecx /* to_tcb->partner -> ecx */ je 1f /* open wait */ cmpl OFS_TCB_MYSELF(current), %ecx jne ipc_slowpath /* to_tcb->partner!= myself */ /* if we reach that point we perform the fast ipc */ movl %esi, OFS_TCB_PARTNER(current) /* current->partner = dest */ xorl %esi, %esi movl %esi, OFS_TCB_MSG_DESC(current) /* current->msg_desc = 0 */ subl $1, %esi movl %esi, OFS_TCB_THREAD_STATE(current) /* current->state = WAITING */ movl $6, OFS_TCB_THREAD_STATE(to_tcb) /* to_tcb->state = RUNNING */ 40

What you ve learned We can communicate across a thread switch How to perform some generic thread switching How to build the worlds fastest IPC path 41

More details http://i30www.ira.uka.de/ Microkernel Construction (Lecture) http://l4ka.org/ L4Ka microkernel project 42