Operating Systems
Winter 2017 Final
3/20/2018
Time Limit: 4:00pm - 6:00pm

Name (Print):

Don't forget to write your name on this exam. This is an open book, open notes exam. But no online or in-class chatting. Ask me if something is not clear in the questions.

Organize your work, in a reasonably neat and coherent way, in the space provided. Work scattered all over the page without a clear ordering will receive very little credit. Mysterious or unsupported answers will not receive full credit. A correct answer, unsupported by explanation, will receive no credit; an incorrect answer supported by substantially correct explanations might still receive partial credit.

If you need more space, use the back of the pages; clearly indicate when you have done this.

Problem   Points   Score
   1        20
   2        15
   3        25
   4         5
   5        15
   6         5
   7         5
Total:      90

1. File systems

Ben tries to understand the xv6 code of the filewrite() function (see the listing from the xv6 book below). He specifically looks at the definition of the max variable at line 5767 and tries to understand the logic behind it.

5750 // Write to file f.
5751 int
5752 filewrite(struct file *f, char *addr, int n)
5753 {
5754   int r;
5755
5756   if(f->writable == 0)
5757     return -1;
5758   if(f->type == FD_PIPE)
5759     return pipewrite(f->pipe, addr, n);
5760   if(f->type == FD_INODE){
5761     // write a few blocks at a time to avoid exceeding
5762     // the maximum log transaction size, including
5763     // i-node, indirect block, allocation blocks,
5764     // and 2 blocks of slop for non-aligned writes.
5765     // this really belongs lower down, since writei()
5766     // might be writing a device like the console.
5767     int max = ((LOGSIZE-1-1-2) / 2) * 512;
5768     int i = 0;
5769     while(i < n){
5770       int n1 = n - i;
5771       if(n1 > max)
5772         n1 = max;
5773
5774       begin_op();
5775       ilock(f->ip);
5776       if ((r = writei(f->ip, addr + i, f->off, n1)) > 0)
5777         f->off += r;
5778       iunlock(f->ip);
5779       end_op();
5780
5781       if(r < 0)
5782         break;
5783       if(r != n1)
5784         panic("short filewrite");
5785       i += r;
5786     }
5787     return i == n ? n : -1;
5788   }
5789   panic("filewrite");
5790 }

(a) (10 points) Explain the role of the max variable and why it is defined the way it is (provide a detailed explanation of the formula and a high-level explanation of why max is needed, i.e., what will go wrong without it).

To make sure that all writes to the file system leave it in a consistent state in case of a power outage, the filewrite() function splits large writes submitted by the user into multiple file system transactions. The max variable limits the maximum size of each transaction to make sure that it fits into the file system log.

The file system log can accommodate LOGSIZE blocks. To make sure that each transaction is smaller than LOGSIZE, we need to count the number of blocks that can possibly be written by each transaction. A minimal (one-byte) write to an inode (the writei() function) might trigger writes to the following blocks: 1) an allocation block, which is updated to reflect the fact that the indirect block for the inode is allocated, if this is the first write outside of the first 12 blocks covered by the direct blocks (an allocation has to write to the block that holds the allocation bitmap); 2) the indirect block itself; and 3) the block that holds the inode, since the inode now has a pointer to the new indirect block. This means that in the worst case a single write might trigger 3 additional block writes. Hence, the total number of blocks we can accommodate in one transaction is no more than LOGSIZE - 1 - 1 - 1.

Note, however, that each individual block write might trigger an allocation of that block, hence in the worst case for each block written there will be an additional write to the allocation block (we have to divide the remaining log blocks in half). Hence the max number of blocks we can write should be: (LOGSIZE - 1 - 1 - 1)/2.

Finally, a write that fits in n blocks might be misaligned and require updates to n + 1 blocks. However, if the write starts in the middle of a block, that block was already allocated by a previous write (xv6 does not have an lseek() system call that would allow users to write to arbitrary locations in the file, hence the file is always written sequentially). Therefore, we only need to subtract one more block from the original size of the log, and the max number of blocks we can allow a write to touch without running out of space in the log is: (LOGSIZE - 1 - 1 - 1 - 1)/2.

(b) (10 points) Finally, Ben thinks he understands the role of max, so he tries to explain it to Alice. Alice, however, is a mature xv6 hacker and she immediately spots an error in Ben's logic. She is quick to point out a bug in the xv6 code, arguing that the definition of max above is incorrect. She quickly looks up the most recent version of xv6 and finds that the bug she spotted has been fixed. The most recent version of xv6 defines max like this:

// write a few blocks at a time to avoid exceeding
// the maximum log transaction size, including
// i-node, indirect block, allocation blocks,
// and 2 blocks of slop for non-aligned writes.
// this really belongs lower down, since writei()
// might be writing a device like the console.
int max = ((MAXOPBLOCKS-1-1-2) / 2) * 512;

Explain the bug that Alice has found, i.e., why is the fix above important and what can go wrong with the old definition?

Alice has previously looked at the logging implementation in xv6, hence she knows that xv6 allows concurrent transactions of MAXOPBLOCKS (10 blocks) each. She realizes that if multiple writes of (LOGSIZE - 1 - 1 - 1 - 1)/2 blocks each happen in parallel, they might exhaust the space in the log. Hence, she suspects that the definition of max should be changed to limit the maximum write size based on the size of an individual transaction, not the entire log.
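To make the numbers concrete, here is the arithmetic assuming the stock xv6 parameters from param.h (MAXOPBLOCKS = 10 and LOGSIZE = 3*MAXOPBLOCKS = 30; if your tree uses different values the numbers change accordingly). The names max_old and max_new are used only for this comparison:

#define MAXOPBLOCKS 10                  // stock xv6 value (assumption)
#define LOGSIZE     (MAXOPBLOCKS * 3)   // = 30, stock xv6 value (assumption)

int max_old = ((LOGSIZE     - 1 - 1 - 2) / 2) * 512;  // = (26 / 2) * 512 = 13 * 512 = 6656 bytes
int max_new = ((MAXOPBLOCKS - 1 - 1 - 2) / 2) * 512;  // = ( 6 / 2) * 512 =  3 * 512 = 1536 bytes
// With the fix, each operation reserves at most MAXOPBLOCKS = 10 log blocks, so the
// LOGSIZE = 30 block log can hold up to 3 concurrent operations without overflowing.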

2. Processes and boot

(a) (10 points) Explain how the first xv6 process and the boot shell process are created upon boot.

The first process is created inside the userinit() function, which is called from main(). userinit() allocates a new process from the process table and initializes its user memory. Specifically, it loads a small stub of code (initcode.S) that is compiled into the kernel. This stub contains a short sequence of code that executes the exec() system call, passing /init as an argument. exec() replaces the memory of the first process with the memory image of the init program. init then forks, creating the boot shell (the child execs sh).

(b) (5 points) At what point does the first process start executing? I.e., what is the function in the xv6 kernel that starts execution of the first process?

While userinit() allocates the first process and initializes its memory, the process does not start running until the kernel finishes initialization and enters the scheduler from the mpmain() function. The scheduler enters the scheduling loop, finds the first RUNNABLE process in the process table, and context switches into it. At this point the first process starts running and invokes the exec("/init") system call.
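For reference, the stub that userinit() loads is tiny. At the C level it is roughly equivalent to the sketch below, written as if it were an ordinary xv6 user program; the real initcode.S is hand-written assembly that issues the system call with an int instruction, so treat this only as an illustration of what the stub does:

// Rough C equivalent of xv6's initcode stub (illustrative sketch only).
#include "types.h"
#include "user.h"

int main(void)
{
  char *argv[] = { "init", 0 };

  exec("/init", argv);   // replace this stub's memory with the /init binary
  for(;;)
    exit();              // exec() returns only on failure
}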

3. Anatomy of a process and context switching

Ben decides to implement user-level threads. His plan is to create a new thread data structure that describes a thread. He then thinks he can allocate a new stack for each thread and implement a function u_thread_create() which creates a new thread. u_thread_create() has the following signature (it's exactly like the thread_create() function in our Homework 4 Kernel Threads assignment, but takes an additional thread argument):

int u_thread_create(struct thread *t, void (*fn)(void *), void *arg, void *stack);

The u_thread_create() call creates a new user thread that runs inside the same process. In contrast to the kernel thread implementation, u_thread_create() doesn't invoke a single system call; instead it just executes the function that is passed as an argument (fn) on the already allocated stack (stack). The function pointed to by the fn pointer takes a void pointer as an argument (it's passed to u_thread_create() as arg). The new user thread runs until it explicitly yields execution back to the parent with the u_yield() call. The u_yield() call doesn't make a single system call either; it saves the execution state of the thread on its stack and switches back to the parent process (i.e., execution continues at the line immediately following the u_thread_create() invocation).

Ben can then create and run multiple user threads like this (the u_yield_to() function yields execution back to the specific thread pointed to by its struct thread argument):

#include <stdio.h>
#include <stdlib.h>

void do_work(void *arg)
{
  int i;
  for (i = 0; i < 2; i++) {
    printf("I'm in %s\n", (char *)arg);
    u_yield();
  }
}

int main(int argc, char *argv[])
{
  void *stack1, *stack2;
  struct thread t1, t2;
  char a1[] = "Thread 1";
  char a2[] = "Thread 2";

  stack1 = malloc(4096);
  stack2 = malloc(4096);

  u_thread_create(&t1, do_work, (void*)a1, stack1);
  u_thread_create(&t2, do_work, (void*)a2, stack2);

  while (t1.state == RUNNABLE || t2.state == RUNNABLE) {
    if (t1.state == RUNNABLE)
      u_yield_to(&t1);
    if (t2.state == RUNNABLE)
      u_yield_to(&t2);
  }

  printf("Threads finished\n");
  return 0;
}

(a) (5 points) What output will the program above produce? (Assume that Ben got everything right and that standard output is connected to the terminal.)

The program will produce:

I'm in Thread 1
I'm in Thread 2
I'm in Thread 1
I'm in Thread 2
Threads finished

(b) (5 points) Ben times execution of his program by adding the uptime() system call at the beginning and the end of the main() function, but doesn't see any improvement compared to a normal program (no user-level threads, just running do_work() twice). He then changes do_work() to compute factorials and other computationally intensive functions instead of simply printing to the console, and yet he sees no performance improvement. Explain why the performance stays the same although multiple user-level threads are running.

The user threads that Ben created all run inside the same process. The kernel schedules that single process on one CPU at a time, so only one user thread can run at any given moment.

(c) (15 points) Provide code for the u_thread_create() and u_yield() functions (you can use pseudocode for C and ASM as long as the semantics of the operations are clear).

We provide a complete working solution below; obviously, a much simpler draft would be accepted. We provide examples of two possible implementations:

Implementation #1

#define RUNNABLE 1
#define EXITED   2

struct thread {
  void *ebp;
  void *esp;
  int state;
};

struct thread parent;
struct thread *current;

void _uswitch(void *from, void *to) __attribute__((returns_twice));

asm (".text                 \n\t"
     ".align 16             \n\t"
     ".globl _uswitch       \n\t"
     // 8(%esp): thread_to, 4(%esp): thread_from
     "_uswitch:             \n\t"
     " mov 4(%esp), %eax    \n\t"  // load thread_from into eax
     " mov 8(%esp), %ecx    \n\t"  // load thread_to into ecx
     " push %ebx            \n\t"  // save callee-saved registers: ebx, edi, esi
     " push %edi            \n\t"
     " push %esi            \n\t"
     " movl %ebp, 0(%eax)   \n\t"  // save EBP
     " movl %esp, 4(%eax)   \n\t"  // save ESP
     // Thread state is saved, switch
     " mov 0(%ecx), %ebp    \n\t"  // load thread_to's ebp into ebp
     " mov 4(%ecx), %esp    \n\t"  // load thread_to's esp into esp
     " pop %esi             \n\t"  // restore callee-saved registers
     " pop %edi             \n\t"
     " pop %ebx             \n\t"
     " ret                  \n\t"  // return
     );

void uexit(void)
{
  current->state = EXITED;
  _uswitch(current, &parent);
}

void u_thread_create(struct thread *t, void *fnc, void *arg, void *stack)
{

  current = t;
  t->state = RUNNABLE;
  t->esp = stack + 4096;

  // push the argument on the stack
  t->esp -= sizeof(void*);
  *(void**)t->esp = arg;

  // when fnc returns, return into uexit()
  t->esp -= sizeof(void*);
  *(void**)t->esp = uexit;

  // The new thread will return into fnc from _uswitch
  t->esp -= sizeof(void*);
  *(void**)t->esp = fnc;

  // Fake the callee-saved registers that _uswitch pops before returning
  t->esp -= sizeof(void*);
  *(void**)t->esp = 0;   // ebx
  t->esp -= sizeof(void*);
  *(void**)t->esp = 0;   // edi
  t->esp -= sizeof(void*);
  *(void**)t->esp = 0;   // esi

  _uswitch(&parent, t);
}

void u_yield()
{
  _uswitch(current, &parent);
}

void u_yield_to(struct thread *t)
{
  current = t;
  _uswitch(&parent, t);
}
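To visualize what u_thread_create() builds in Implementation #1, the new thread's fake initial stack looks like this just before the first _uswitch() into it (this layout is inferred from the code above; higher addresses are at the top):

/*
 *  stack + 4096  (end of the malloc()'d region)
 *  [ arg   ]   <- fnc will find its argument at 4(%esp)
 *  [ uexit ]   <- return address: when fnc returns it falls into uexit()
 *  [ fnc   ]   <- popped by _uswitch's "ret", which starts the thread
 *  [ 0     ]   <- fake saved ebx  \
 *  [ 0     ]   <- fake saved edi   } popped by _uswitch before "ret"
 *  [ 0     ]   <- fake saved esi  /   <- t->esp points here
 */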

Implementation #2

#define RUNNABLE 1
#define EXITED   2

struct thread {
  void *eip;
  void *ebp;
  void *esp;
  int state;
};

struct thread parent;
struct thread *current;

void _uswitch(void *from, void *to) __attribute__((returns_twice));
void _uswitch_save(void *from, void *to) __attribute__((returns_twice));

asm (".text                 \n\t"
     ".align 16             \n\t"
     ".globl _uswitch_save  \n\t"
     // 8(%esp): thread_to, 4(%esp): thread_from
     "_uswitch_save:        \n\t"
     " mov 4(%esp), %eax    \n\t"  // load thread_from into eax
     " mov 8(%esp), %ecx    \n\t"  // load thread_to into ecx
     " push %ebx            \n\t"  // save callee-saved registers
     " push %edi            \n\t"
     " push %esi            \n\t"
     " push %ecx            \n\t"  // push thread_to as second arg
     " push %eax            \n\t"  // push thread_from as first arg
     " call _uswitch        \n\t"
     " add $0x8,%esp        \n\t"  // release space used for args
     " pop %esi             \n\t"  // restore callee-saved registers
     " pop %edi             \n\t"
     " pop %ebx             \n\t"
     " ret                  \n\t"
     );

asm (".text                 \n\t"
     ".align 16             \n\t"
     ".globl _uswitch       \n\t"
     "_uswitch:             \n\t"
     // 8(%esp): thread_to, 4(%esp): thread_from
     " movl 4(%esp), %eax   \n\t"  // load thread_from into eax
     " movl 0(%esp), %ecx   \n\t"  // load the return address into ecx
     " movl %ecx, 0(%eax)   \n\t"  // EIP (our return address)
     " movl %ebp, 4(%eax)   \n\t"  // EBP
     " movl %esp, 8(%eax)   \n\t"  // ESP

     " addl $4, 8(%eax)     \n\t"  // return address + 4
     // Thread state is saved, switch
     " mov 8(%esp), %eax    \n\t"
     " mov 4(%eax), %ebp    \n\t"
     " mov 8(%eax), %esp    \n\t"
     " jmp *0(%eax)         \n\t"
     );

void uexit(void)
{
  current->state = EXITED;
  _uswitch_save(current, &parent);
}

void u_thread_create(struct thread *t, void *fnc, void *arg, void *stack)
{
  current = t;
  t->state = RUNNABLE;
  t->eip = fnc;
  t->esp = stack + 4096;

  t->esp -= sizeof(void*);
  *(void**)t->esp = arg;

  t->esp -= sizeof(void*);
  *(void**)t->esp = uexit;

  _uswitch_save(&parent, t);
}

void u_yield()
{
  _uswitch_save(current, &parent);
}

void u_yield_to(struct thread *t)
{
  current = t;
  _uswitch_save(&parent, t);
}

4. Memory management

(a) (5 points) In the question above (user-level threads), Ben's code allocates memory for two stacks with malloc(), but it never calls free(). Does Ben's code cause a memory leak in the system? Support your argument.

Not really. While the memory remains allocated until the process exits, the kernel cleans up all memory allocated by the process as soon as it terminates.
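For reference, in xv6 this cleanup happens when the parent reaps the terminated child in wait(). From memory, the relevant part of wait() looks roughly like the sketch below (not an exact copy of proc.c):

// Inside wait(), once a ZOMBIE child p is found (sketch):
pid = p->pid;
kfree(p->kstack);   // free the child's kernel stack
p->kstack = 0;
freevm(p->pgdir);   // free the whole user address space, including
                    // the two malloc()'d thread stacks
p->state = UNUSED;  // the proc table slot can be reused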

5. Fork and console synchronization

Ben writes the following program:

int main(int argc, char *argv[])
{
  int pid = fork();
  char *msg = "aaa\n";

  if (pid == 0) {
    msg = "bbb\n";
    write(1, msg, 4);
    sleep(1);
  }

  write(1, msg, 4);
  sleep(1);
  wait();
  exit();
}

(a) (5 points) What are the possible outputs of the above program?

There are three possible outputs:

Output 1:   Output 2:   Output 3:
aaa         bbb         bbb
bbb         aaa         bbb
bbb         bbb         aaa

(b) (10 points) Ben argues with Alice that there will never be an output with interleaved characters, e.g., "ababba". Is Ben correct? Explain your answer.

Yes, Ben is correct. The write() call follows the call sequence write() -> filewrite() -> consolewrite(). The consolewrite() function acquires the console lock, so only one process can be writing to the console at a time. Hence, there won't be any interleaved characters.
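For reference, the serialization happens in the console driver. From memory, xv6's consolewrite() looks roughly like the sketch below (not an exact copy of console.c); the important point is that the whole buffer is emitted while holding cons.lock:

// Sketch of xv6's consolewrite() (from memory, for illustration):
int consolewrite(struct inode *ip, char *buf, int n)
{
  int i;

  iunlock(ip);
  acquire(&cons.lock);          // only one writer holds the console lock
  for(i = 0; i < n; i++)
    consputc(buf[i] & 0xff);    // the entire buffer goes out under the lock
  release(&cons.lock);
  ilock(ip);

  return n;
}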

6. Synchronization

Alice creates a program to test her understanding of multithreaded locks. Below is the part of the program that describes her locking implementation.

struct mutex_lock m1;
struct mutex_lock m2;

void do_work(void *arg)
{
  mutex_lock(&m1);
  mutex_lock(&m2);
  // do something
  mutex_unlock(&m2);
  mutex_unlock(&m1);
}

void do_work2(void *arg)
{
  mutex_lock(&m2);
  mutex_lock(&m1);
  // do something
  mutex_unlock(&m1);
  mutex_unlock(&m2);
}

int main(int argc, char *argv[])
{
  ...
  mutex_init(&m1);
  mutex_init(&m2);

  t1 = thread_create(do_work2, (void*)&b1, s1);
  t2 = thread_create(do_work, (void*)&b2, s2);
  ...
}

(a) (5 points) Assuming the program compiles, do you see anything wrong with her locking strategy? Explain your answer.

Yes, the locking order is inconsistent. Thread t1 (running do_work2()) acquires the locks in the order m2 followed by m1, whereas thread t2 (running do_work()) acquires m1 followed by m2. When these two threads run in parallel, there is a possibility of a deadlock: t1 can acquire m2 and wait for m1 while t2 has acquired m1 and is waiting for m2.
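The standard fix is to impose a single global lock order and make every thread follow it. A minimal sketch, using the same (hypothetical) mutex API as Alice's code, is to rewrite do_work2() so that both functions take m1 before m2:

// do_work2() rewritten to follow the same m1 -> m2 order as do_work(),
// so no circular wait (and hence no deadlock) can form.
void do_work2(void *arg)
{
  mutex_lock(&m1);
  mutex_lock(&m2);
  // do something
  mutex_unlock(&m2);
  mutex_unlock(&m1);
}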

7. CS238P

I would like to hear your opinions about CS238P, so please answer the following questions. (Any answer, except no answer, will receive full credit.)

(a) (1 point) Grade CS238P on a scale of 0 (worst) to 10 (best).

(b) (2 points) Any suggestions for how to improve CS238P?

(c) (1 point) What is the best aspect of CS238P?

(d) (1 point) What is the worst aspect of CS238P?
