Combining Dataplane Operating Systems and Containers for Fast and Isolated Datacenter Applications

EDIC RESEARCH PROPOSAL 1

Combining Dataplane Operating Systems and Containers for Fast and Isolated Datacenter Applications

Mia Primorac, DCSL, I&C, EPFL

Abstract — Commodity operating systems widely used in today's data centers were designed under different hardware assumptions and for general-purpose applications. Specializing operating system design for particular domains has proved successful, especially in the case of the exokernel architecture. In this proposal we examine and compare two recent research projects, Arrakis and IX, that follow the exokernel design while specializing for datacenter applications. We argue that specializing the operating system alone is not sufficient, and that it would be useful to provide a specializable version of the increasingly popular container technology that meets the demanding networking requirements of datacenter workloads. We propose using containers in conjunction with the dataplane OS design in order to (1) improve the performance of datacenter applications running in containers, (2) reuse container management tools for dataplane configuration and resource management and (3) improve container isolation.

Index Terms — data center, dataplane, containers

Proposal submitted to committee: June 3rd, 2015; Candidacy exam date: June 10th, 2015; Candidacy exam committee: Prof. James Larus, Prof. Edouard Bugnion, Prof. Willy Zwaenepoel. This research plan has been approved: Date: Doctoral candidate: (name and signature) Thesis director: (name and signature) Thesis co-director: (if applicable) (name and signature) Doct. prog. director: (B. Falsafi) (signature)

I. INTRODUCTION
Today's data center operators face the challenge of demanding workloads with extremely high rates of small requests. While sustaining high packet rates, datacenter services have to ensure low overall latency for each client request, which may involve hundreds of servers.
Therefore, low tail latency becomes an inevitable requirement, together with high throughput and high connection counts. Besides tackling this problem at the endpoints, it can also be tackled from below, by redesigning the software stack, especially the operating system. There are multiple reasons why one would want to redesign the commodity operating systems used in today's data centers. The first argument is almost as old as those operating systems themselves [1]: the generality of the OS design penalizes applications with specific requirements. Meanwhile, both hardware and applications have changed, but the kernel design has not followed those changes. This gap between hardware and software is especially visible in the networking stack, which is primarily designed for fine-grained resource partitioning. Orthogonally to OS development, Linux containers have recently found many important use cases in data centers and among software developers. New container technologies like LXC and Docker enable easier deployment of applications and their dependencies, and server consolidation with greater resource efficiency than virtual machines. The purpose of this proposal is not to argue whether and how one should use containers in production. We find this technology interesting and believe that it can complement a specialized dataplane operating system like IX [7], presented in the fourth chapter. We believe that such a composite design brings benefits to both sides. The dataplane operating system would profit from easier configuration and deployment, whereas the applications on top of containers would profit from better isolation and a specialized networking stack, without giving up the comfort of container frameworks like Docker [10] for their management. In the second chapter we will present the original Exokernel design and its later application to networking and specialized applications.
In the third and fourth chapters we will present and compare two modern and, at first sight, similar operating systems. Both are specialized for datacenter applications and both are inspired by the exokernel design to some extent. Finally, we will introduce the research proposal, which considers how to combine the exokernel-like design with container technology and what the possible benefits and shortcomings of such a design would be.

A. Operating Systems Design
Traditionally, kernels are divided into two main types of design: the monolithic kernel and the microkernel. They differ in the modularity of the design and in the privilege mode under which operating system services execute. On x86 processors the most important privilege modes are ring 0 (kernel or supervisor mode) and ring 3 (user mode). Kernel mode is unrestricted, while user mode is restricted. Rings 1 and 2 are also available, but rarely used. Monolithic kernels are implemented entirely as a single process running in a single address space. All kernel services exist and execute in kernel mode in the large kernel address space. Therefore, communication within the kernel is as trivial as within a user-space application: it boils down to direct function invocation. Microkernels, on the other hand, are broken down into separate processes, usually called servers. The servers run in separate address spaces and communicate via a message-passing interprocess communication (IPC) mechanism. Only the servers absolutely requiring such capabilities run in kernel mode, while the rest run in user mode. The message passing between components running in different privilege modes introduces more overhead than a trivial function call in a monolithic kernel. On the other hand, monolithic kernels suffer from a lack of flexibility and fault tolerance. Most data centers still run commodity Linux operating systems. Linux is a monolithic kernel, running in a single address space and entirely in kernel mode. However, many good features of microkernels are incorporated in its design, for instance the modular design and the ability to dynamically load separate binaries (kernel modules) [14].
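The cost difference between the two designs can be illustrated with a toy sketch (not real kernel code): a monolithic-style service is a direct function call, while a microkernel-style "file-system server" is reached by message passing. All names here are illustrative.

```python
import queue
import threading

# Monolithic style: a kernel service is an ordinary function call in
# the same address space.
def fs_read_monolithic(path):
    return b"data for " + path.encode()

# Microkernel style: the file-system "server" lives in its own context
# and is reached through message-passing IPC (modelled with queues).
requests = queue.Queue()

def fs_server():
    while True:
        path, reply = requests.get()
        reply.put(b"data for " + path.encode())

threading.Thread(target=fs_server, daemon=True).start()

def fs_read_microkernel(path):
    reply = queue.Queue()
    requests.put((path, reply))   # send the request message
    return reply.get()            # block until the reply message arrives

# Both designs return the same answer; the microkernel path simply
# pays for two message hops instead of one call.
assert fs_read_monolithic("/etc/hosts") == fs_read_microkernel("/etc/hosts")
```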
One thing common to all the aforementioned systems, including Linux-based systems, is that they impose a fixed set of abstractions on every user application, without providing any way of bypassing or easily extending them. This problem was addressed by a third type of kernel, the exokernel [2] [3] [4]. The principal goal of an exokernel, giving applications control, is orthogonal to the question of monolithic versus microkernel organization. If applications are restricted to inadequate interfaces, it makes little difference whether the implementations reside in the kernel or in privileged user-level servers; in both cases applications lack control [3].

B. Networking Performance Gap Between Hardware and Software
One of the most important pro-exokernel arguments is the stale design of operating system abstractions, conceived under different hardware assumptions. For instance, kernel schedulers, networking APIs and network stacks have been designed under the assumptions that multiple applications share a single processing core and that packet interarrival times are several orders of magnitude larger than the latency of interrupts [7]. Nowadays neither holds, and we pay an overhead in throughput and latency for fine-grained scheduling of plentiful resources. Today we have multiple processors and multiple queues on the network cards, and packet inter-arrival times are no longer negligible with respect to the cost of an interrupt (or of I/O processing in general). Linux completely decouples the networking stack from application execution and hence suffers from bad cache locality. Fine-grained application scheduling prevents applications from processing their packets before they are scheduled. Even when the application is scheduled, on systems with multiple processing cores and multi-queue NICs, different packets can be delivered to different queues to distribute processing among CPU cores, which may not be the cores on which the application is scheduled.
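The queue/core mismatch above can be sketched with a toy model (all hashes and numbers are illustrative, not a real NIC or scheduler): the NIC hashes each flow to an RX queue, while the scheduler places the application thread with no knowledge of that hash, so the two placements often disagree.

```python
import zlib

NUM_QUEUES = 4  # queues on the NIC; assume one core services each queue

def rx_queue(src_ip, src_port, dst_port):
    # stand-in for the NIC's flow hash (real NICs use e.g. a Toeplitz hash)
    key = f"{src_ip}:{src_port}:{dst_port}".encode()
    return zlib.crc32(key) % NUM_QUEUES

def scheduled_core(pid, tick):
    # stand-in for a scheduler decision made with no knowledge of queues
    return (pid + tick) % NUM_QUEUES

flows = [("10.0.0.%d" % i, 40000 + i, 80) for i in range(8)]
mismatches = sum(1 for tick, f in enumerate(flows)
                 if rx_queue(*f) != scheduled_core(1234, tick))
print(f"{mismatches}/{len(flows)} flows are serviced on a core other "
      "than the one running the application")
```

Every mismatched flow means packet data warmed one core's cache while the application consumes it on another, which is exactly the locality loss described above.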
Further on, socket interfaces involve unnecessary data copies because application buffers are not explicitly exchanged with the kernel. Transport-level acknowledgements and retransmissions may be redundant, but protocol implementations are hidden from the applications, preventing changes. Even with NIC optimizations for increasing throughput, like receive-side coalescing, or for reducing latency, like polling the incoming ring buffer (RX) with NAPI (New API) [15] instead of interrupt-based processing, there is still a substantial performance gap between what the hardware can do and what the software achieves. This problem has been addressed by several research projects, such as Exokernel [2] [3] [4], Arrakis [5], and IX [7].

C. Linux Containers and the Docker Platform
A Linux container is a set of processes that are isolated from the rest of the machine but, unlike virtual machines, share the operating system. Therefore, containers are lightweight with respect to resource consumption, and they are much faster to bootstrap and shut down than virtual machines. In this chapter we briefly introduce the important terms related to container technologies and the kernel mechanisms they use, as well as the often-heard names LXC and Docker. LXC [12] is a user-space interface for the Linux kernel containment features, such as control groups and network namespaces. The LXC project provides base OS container templates and a set of tools for container lifecycle management. However, the most often heard name in the context of containers is certainly Docker. Docker is a project originally based on LXC to build single-application containers. Docker has since developed its own implementation, libcontainer, that uses the kernel container capabilities directly. One can think of a Docker container as a specific use case of Linux containers for building loosely coupled applications as services.
Docker consists of the Docker Engine, a portable, lightweight runtime and packaging tool, and Docker Hub, a cloud service for sharing applications and automating workflows. The comfortable tools and the image-sharing environment make it attractive for a broad community of users, but also for use in data centers. Multiple Linux kernel technologies are in charge of container isolation and encapsulation, for instance control groups and kernel namespaces. Control groups (cgroups) is a Linux kernel feature that limits, accounts for and isolates the resource usage (CPU, memory, disk I/O, etc.) of a collection of processes and their children. Network namespaces are one of the kernel namespaces. They enable separate instances of network interfaces, routing tables and iptables that operate independently of each other. Linux containers logically run separate networking stacks because they belong to different network namespaces. However, they all share the same code base, the general one of the Linux kernel. The emerging container technologies motivate us to come up with an interesting system design that takes advantage of some of them, but also to contribute to their design, in particular to the networking stack and the isolation of containers.

II. EXOKERNEL

Fig. 1: Exokernel architecture

Following the principles of good computer systems design, the authors of the Exokernel paper suggested an innovative operating system architecture in which traditional operating system abstractions, such as virtual memory, IPC, files etc., are not imposed on all untrusted applications with rather different requirements. In a traditional operating system the applications are faced with fixed interfaces to the abstractions, which may or may not fit their needs. The main idea behind the Exokernel design is letting applications decide how to use their (and only their) physical resources, letting them choose the abstractions that are a good match for them, and letting them customize the rest. That is achieved by securely exporting the hardware interface to user space, where customized operating system libraries and applications can use it (almost) directly. The main function of the kernel itself is securely multiplexing hardware resources without understanding them or the abstractions on top of them. In other words, the main goal is separating resource protection from resource management.
The first Exokernel paper [2] provides an idealistic overview of the exokernel architecture and gives promising microbenchmark results for a research prototype exokernel called Aegis and the library operating system ExOS. Later, in the two follow-on papers [3] and [4], the authors give more detailed examples of real systems, services and applications that take advantage of an exokernel implemented on x86, called Xok, and of a more complex ExOS. They describe disk [3] and network [4] multiplexing more precisely. In this proposal we will focus on the networking aspect of Xok, given in [4].

A. Design Principles
In order to achieve this, the exokernel architecture: (1) securely exposes hardware resources through system calls that check the ownership of every resource involved, (2) exposes allocation requests for specific physical resources, (3) exposes revocation, which cooperative applications can use to efficiently manage their resources, and (4) exposes physical names, which is important for using (2) and (3) efficiently. Driven by these design principles, the kernel implements three mechanisms: secure bindings, visible revocation and an abort protocol. Secure bindings bind a specific resource to a specific application (or library OS) at bind time and enable fast ownership checks at each access (see Fig. 1). Secure bindings can be seen as a new abstraction introduced by the exokernel architecture whose implementation varies depending on the resource being multiplexed. The secure bindings for multiplexing physical memory are implemented as a combination of hardware mechanisms and caching in a software TLB. Another example is the secure bindings for multiplexing the network. They are implemented as software packet filters downloaded into the kernel and dynamically compiled to machine code that can efficiently be invoked for every incoming packet.
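A minimal sketch of the network secure-binding idea, with plain Python predicates standing in for the compiled packet filters (the real system compiles filters to machine code; everything here is illustrative): ownership is established once at bind time, and each incoming packet is steered to its owner's ring by a cheap access-time check.

```python
# Packets are dicts of header fields; filters are plain predicates.
bindings = []   # list of (filter, ring) pairs, one per secure binding

def bind(filter_fn):
    """Bind time: install a filter and hand the owner its packet ring."""
    ring = []
    bindings.append((filter_fn, ring))
    return ring

def deliver(pkt):
    """Access time: run each installed filter once per incoming packet."""
    for filter_fn, ring in bindings:
        if filter_fn(pkt):
            ring.append(pkt)   # steer the packet to its owner's ring
            return
    # no binding matched: the kernel drops the packet

dns_ring = bind(lambda p: p["proto"] == 17 and p["dst_port"] == 53)
http_ring = bind(lambda p: p["proto"] == 6 and p["dst_port"] == 80)

deliver({"proto": 6, "dst_port": 80})    # claimed by the HTTP server
deliver({"proto": 17, "dst_port": 53})   # claimed by the DNS library
deliver({"proto": 6, "dst_port": 22})    # unowned: dropped

assert len(http_ring) == 1 and len(dns_ring) == 1
```

The kernel never needs to understand TCP or DNS; it only evaluates the filters, which is exactly the protection-without-management split described above.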
Visible revocation includes the applications in the resource revocation process and gives them a useful insight for efficient resource management. Although invisible revocation has lower latency, visible revocation improves the overall performance as long as revocations do not occur very frequently. Although the exokernel gives power to the applications, it implements the abort protocol to tame uncooperative ones. If an application does not respond to the visible revocation protocol, the kernel breaks its secure bindings to the resource and informs the library operating system. In Fig. 1 we can see how applications with fairly different requirements manage to satisfy them simultaneously. The Web server is designed with a custom networking stack and file system, while the C shell does not have any exotic requirements and can freely use the backwards-compatible default ExOS library operating system.

B. Application-Level Networking on Exokernel Systems
Application-level networking allows applications with no special privileges to interact with a network interface. This idea was introduced before, but [4] provides concrete examples of how application-level networking can substantially improve end-to-end performance, for instance in the Cheetah HTTP server. The ExOS library provides a user-level and extensible implementation of a UNIX operating system. The authors implemented full network services (UDP/IP, TCP/IP, POSIX sockets, ARP, DNS, and tcpdump) as independent application-level libraries and defined a set of base kernel mechanisms and a base network interface on which such applications and services can successfully be implemented. They focus on solving the problem of efficiently sharing a single network interface among multiple applications using efficient polling, rather than giving ownership to a single

entity (like IX [7] does). This paper presents many interesting techniques to reduce the OS overhead in networking even when sharing is not the main concern. The paper argues against violating the end-to-end argument [8] in OS design; that argument advises against placing redundant functionality at low levels of the system instead of at the ends, which are the applications in the case of the Internet protocol. The efficient polling of applications is implemented using a mechanism called a WK (WaKeup) predicate. A WK predicate is a set of conditions that describe the interest of an application. Each condition consists of a memory location and a value to be compared with the value at that memory location. The Xok process scheduler evaluates the WK predicate of a sleeping process and marks it runnable only if the WK predicate is true. This mechanism is an efficient optimization for several services. Application-level networking has to solve the problem of multiplexing the network interface in two cases: transmission and reception of packets. The transmission of packets from the application level is straightforward: a system call takes the network interface on which to send, a list of memory locations and sizes of packets, and a pointer to an integer in memory which counts how many packets are left to send. When the driver learns that the card has sent a packet, it decrements the counter. Checking whether the counter has reached zero is a perfect example of the aforementioned WK predicate, because the application does not occupy the CPU while waiting for its packets to be sent. The reception does not dictate the behavior of applications and does not involve unnecessary copies. It consists of two phases: identifying the owner (demultiplexing) and packet delivery (buffering).
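The WK-predicate mechanism above can be sketched as follows; this is a toy model of the scheduler's evaluation step, not Xok code, with the transmit-counter example worked through (memory words and addresses are illustrative).

```python
memory = [0] * 16   # stand-in for application-visible memory words

class Process:
    def __init__(self, conditions):
        self.conditions = conditions   # list of (address, expected value)
        self.runnable = False

def schedule_tick(processes):
    """Evaluate each sleeping process's WK predicate; wake it only if
    every condition over memory holds."""
    for p in processes:
        if all(memory[addr] == want for addr, want in p.conditions):
            p.runnable = True

# "wake me when my outstanding-packet counter (word 3) reaches zero"
sender = Process([(3, 0)])
memory[3] = 2                          # two packets still in flight
schedule_tick([sender])
assert not sender.runnable             # predicate false: keep sleeping

memory[3] = 0                          # driver decremented the counter
schedule_tick([sender])
assert sender.runnable                 # predicate true: wake the sender
```

The point is that the check runs in the scheduler, so the application burns no CPU polling its own counter.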
To demultiplex packets among applications, Xok uses DPFs (Dynamic Packet Filters), which are downloaded and installed into the kernel. Each filter is associated with a packet ring identifier. A packet ring is a ring of buffers shared between an application and Xok. Each buffer has an ownership flag which points either to the application or to Xok. When the application has processed the packets from a buffer, it gives the ownership back to Xok. Each packet is copied once, but with an advanced NIC with demultiplexing support not even a single copy is needed (as in IX or the native Arrakis API).

C. Cheetah HTTP Server
[4] gives several examples of successfully exploiting application-level networking in Xok. We will focus on one of them, the Cheetah HTTP server. Cheetah is a perfect example of how the exokernel architecture and application-level networking can significantly boost performance. Cheetah has a specialized TCP/IP and file system library. The main control loop simply waits for relevant events, like packet arrival, calls into its specialized library and checks per-request state which determines whether the request can be fully processed. The performance of the Cheetah server is partly credited to the non-blocking event loop without mediation through socket interfaces. The major improvement, however, is credited to three other extensions enabled by application-level networking. First, the merged file cache and TCP retransmission pool enable zero-copy disk-to-network transfer. Second, Cheetah avoids sending distinct TCP control packets that can be merged with forthcoming data packets. Third, TCP checksums of packets are precomputed and stored on disk along with the data. Cheetah on Xok outperforms the Baseline server (a socket-based HTTP/1.0 server, also presented in [4]) on both Xok and OpenBSD, and also outperforms two other systems running on OpenBSD, the Harvest cache [16] and the NCSA server [17].
Cheetah achieves 8x higher throughput for small HTTP documents (less than 1KB) and roughly 4x higher throughput for large documents (more than 10KB) than the Harvest and NCSA servers. The throughput was measured with six clients repeatedly requesting the same static Web document (100% cache hit rate) without any timeout between the requests.

D. Discussion
The Exokernel papers provide an interesting research idea and present two systems: the research prototype Aegis, and Xok, more mature but still limited to a research context. With Aegis the authors focused on tuning microbenchmarks, whereas with Xok they showed that it is possible to achieve great macrobenchmark performance even without tuning the microbenchmarks. Some ideas of the papers remained undefined, for instance how to achieve fairness between two competing library operating systems. The exokernel design in its extremity did not quite find its place in production. Many of the baselines are out-of-use systems today and the hardware is not comparable with today's hardware. However, many exokernel design principles can still be found in later research projects, as we will see in the following chapters. It is a promising design for specialized environments, for instance data centers.

III. ARRAKIS
The Arrakis operating system addresses the problem mentioned in the first chapter, the performance gap between hardware and software. Arrakis tackles the problem not only for network services, but also for disk I/O operations. Since the introduction of flash-backed DRAM on the devices, it is no longer true that disks are terribly slow, so the same issue arises: the OS I/O overhead is no longer negligible. Clearly influenced by the exokernel design, Arrakis follows the idea of moving the operating system into user-space libraries. Each application has its own I/O library OS through which it can almost directly communicate with I/O devices.
The biggest change in the Arrakis design, compared with the Exokernel, is pushing a big part of the OS functionality into the hardware. The functionality of the kernel is divided into a control plane and a dataplane. The control plane is in charge of access control, resource limits and global naming (the virtual file system). The control plane is infrequently invoked to configure the fast dataplane. The dataplane is in charge of

protection and multiplexing of resources, I/O scheduling and I/O processing. The part of the dataplane which is not in hardware is the application and the API. For both networking and storage Arrakis provides backwards-compatible POSIX interfaces. Programmers who are willing to change their applications can benefit from an extra performance boost when using the native zero-copy networking API (Arrakis/N) or a library that provides an asynchronous persistent data structures API on low-latency storage devices. The Arrakis paper describes the changes needed in both hardware and software to enable such a redesign.

Fig. 2: Arrakis architecture and hardware support

A. Hardware Support
The big hammer of the Arrakis operating system is hardware I/O virtualization. Arrakis enables direct access to virtualized I/O devices in order to avoid kernel crossings as much as possible. Almost all of the aforementioned dataplane functions can be provided by today's NICs, and some of them by RAID controllers. In Fig. 2 we can see the Arrakis architecture divided into the dataplane and the control plane, and the hardware support on which it relies: a NIC that provides multiple virtualized NICs (VNICs), and a storage controller that can emulate regions of the disk as virtual storage areas (VSAs) and expose multiple virtual storage interface controllers (VSICs). The SR-IOV (Single-Root I/O Virtualization) technology multiplexes one PCI device into multiple virtual PCI devices with their own registers, descriptor queues and interrupts. Each application is mapped to a specific virtual PCI device, and this is how the multiplexing is done. To prevent applications from performing arbitrary network I/O, the packet filters in the NIC can be programmed to filter incoming and outgoing traffic.
Some NICs can already provide filtering based on rich semantics, although most commodity NICs (including the Intel 82599 used in the paper) still do not have it. Using the IOMMU, devices can be restricted to access only certain memory locations. I/O scheduling can also be done in hardware using NIC rate limiters and packet schedulers. The storage side has less hardware support: storage controllers have some parts of the technology needed for the Arrakis model, but the required protection mechanism, restricting applications to access only their own VSA, is missing.

Fig. 3: Average memcached transaction throughput and scalability on Arrakis

B. Results
The authors evaluated the Arrakis operating system on four cloud application workloads, all using UDP or only the IP protocol. Here we briefly present the memcached benchmark. In Fig. 3 we can see the networking stack scalability across multiple cores running the memcached in-memory key-value store. The benchmark has 90% GET and 10% SET requests. The original memcached on the Arrakis POSIX interface outperforms Linux by 1.7x on one core and 3x on 4 cores. A single multithreaded memcached instance has approximately the same throughput as multiple memcached processes. The evaluation does not show any memcached latency numbers, it uses UDP instead of TCP, and it is not clear what the experiment setup was. Therefore it is difficult to compare it with results from other papers. Among other benchmarks, it is interesting to mention Redis, a single-threaded NoSQL store that was extended with the Arrakis persistent structures library. It reduced the latency of GET requests by 65% and of SET requests by 81%, while the throughput increased by 1.75x for GET and 9x for SET requests. The experiment included 1600 connections from 16 clients, with separate benchmarks for GET and SET operations. It is not clear how Arrakis behaves in the case of a high connection count.
C. Discussion
We will analyze how Arrakis demonstrates the important properties of the Exokernel architecture [2]: 1) Exokernels can be made efficient due to the limited number of simple primitives they must provide. The job of the kernel in Arrakis is not to be extremely efficient in terms of microbenchmarks; its job is to stay out of the way and infrequently configure the hardware. 2) Low-level secure multiplexing of hardware resources can be provided with low overhead. Yes, Arrakis efficiently multiplexes hardware resources, and it does so using the hardware itself. It is difficult to be more efficient than that.

Fig. 4: IX architecture

3) Traditional OS abstractions can be implemented efficiently at the application level. Yes, Arrakis provides traditional abstractions at the application level, for instance the Arrakis POSIX interface. 4) Applications can create special-purpose implementations of abstractions. Yes, Arrakis provides the persistent structures library, which is an enhancement of the traditional file abstraction. To sum up, Arrakis leveraged the old design and the new hardware to remove the kernel from the data path. This design will become even more significant as the hardware keeps improving. Designing the interface for both the networking and the storage library operating system is a major contribution of the paper.

IV. IX
The IX operating system [7] removes the overheads that Web-scale applications usually face on Linux-based commodity operating systems, without trading off latency (tail or average), throughput or protection. The IX system architecture consists of the Linux kernel, the Dune kernel module, IX dataplanes, the user-level library libix, and the applications on top of IX (see Fig. 4). Similarly to Arrakis, IX separates the control plane (IXCP in Fig. 4), in charge of multiplexing and scheduling resources among dataplanes, from the fast dataplane, which runs the networking stack and the application logic.

A. Dune
Dune [6] provides virtualization of processes rather than machines and gives them direct and safe access to privileged hardware features. The provided features are privilege modes, virtual memory registers, page tables, and interrupt, exception and system call vectors. Dune extends the Linux kernel but does not require changes to it. The Dune system consists of a loadable kernel module and a user-level library, libdune.
Dune uses the Intel VT-x technology [13], a virtualization extension to the x86 ISA, to provide user programs with full access to the x86 protection hardware with less overhead than trapping and emulating privileged instructions [9]. VT-x adopts a design in which the CPU is split into two operating modes: VMX root (host) mode and VMX non-root (guest) mode. The privilege modes (or privilege rings) exposed to user-level applications enable the code of Dune processes to run in VMX non-root ring 0 (guest kernel mode) or, for untrusted code, in VMX non-root ring 3 (guest user mode). Those features are leveraged by the Dune sandbox application. The sandbox restricts the memory that the applications inside it can access and restricts the interfaces and system calls they can use. The Linux kernel and the Dune kernel module run in VMX root ring 0 (host kernel mode), which is typically used for VMMs. Dune applications can run either in VMX non-root ring 0, which enables fast and safe access to hardware as guest kernels traditionally have, or in VMX non-root ring 3, where untrusted applications can be restricted like VM applications.

B. IX Design
IX separates the control plane from the dataplane, enabling resource allocation policies to be implemented in the control plane while efficient network processing happens in the dataplane. IX eliminates the trade-off between high packet rates and low latency while retaining the same protection model as commodity operating systems. It runs the dataplane kernel and the application at distinct protection levels, and it isolates the control plane from the dataplane. This three-way isolation is achieved through Dune and hardware virtualization. IX leverages the privilege modes exposed through Dune. Each IX dataplane runs as a Dune process inside the sandbox in VMX non-root ring 0, while the untrusted applications run in VMX non-root ring 3 (see Fig. 4).
The execution model of the IX networking stack is optimized for both throughput and latency, but it is applicable only to event-driven applications. In Fig. 5 we can see the event loop of packet processing. Each packet runs to completion through both the networking stack and the application, which improves instruction cache locality. Packets are not always processed one by one: in the presence of congestion, packets are taken from the networking queue in batches, up to a predetermined maximum batch size. This optimizes for data cache locality and low latency. There are no blocking operations in this event loop, which enables high throughput. The batched system calls generated by the application are expected to be non-blocking; if the application wants to execute a blocking system call, it has to do it in the background using dedicated background threads.

C. Results
The IX operating system was compared against Linux running the in-memory key-value store memcached, and against Linux and mTCP, a state-of-the-art user-level networking stack. Here we present a part of the results.

the memcached result running a Facebook TCP benchmark. The maximal throughput before the latency shoots up is 3x higher than Linux's, and the 99th percentile latency stays very close to the average.

Fig. 5: IX execution model (event conditions, timers, adaptive batching, TCP/IP processing, and batched syscalls between libix and the application across the non-root ring 0 / ring 3 boundary)

Fig. 6: Average (full line) and 99th percentile (dashed line) latency for Linux (red) and IX (black), as a function of ETC throughput (RPS x 10^3)

Another interesting result is the high connection scalability. Fig. 7 shows the results of an experiment with 18 multithreaded clients, each thread repeatedly performing a 64B RPC to the server; it plots throughput as a function of the number of active connections. At the peak, IX performs 10x better than Linux, and it retains half of its peak SLA throughput even with 250,000 connections on 4x10GbE.

Fig. 7: IX connection scalability (messages/sec x 10^6 for IX and Linux at 10Gbps and 40Gbps; connection count on a log scale)

D. Discussion

Now let us compare how IX demonstrates the important properties of the Exokernel architecture:

1) Exokernels can be made efficient due to the limited number of simple primitives they must provide. The IX dataplane is in fact a specialized networking library OS, whereas the Dune module plays the role of the exokernel. The Dune module follows this property nicely: it efficiently exposes hardware primitives through the libdune interface.

2) Low-level secure multiplexing of hardware resources can be provided with low overhead. Hardware resources are allocated in a coarse-grained manner: entire networking queues and processing cores are assigned to a single dataplane running a single application, so multiplexing of resources is avoided as much as possible.

3) Traditional OS abstractions can be implemented efficiently at the application level. Not very relevant for IX.
The IX API departs from the POSIX API, but the libix user-level library includes an event-based API similar to the popular libevent library [18]. For the missing traditional OS abstractions (the file system, for instance), IX can make system calls into the Linux kernel.

4) Applications can create special-purpose implementations of abstractions. This is the main feature of IX: a datacenter-specific networking stack.

V. RESEARCH PROPOSAL

The main characteristic of dataplane operating systems is a clean separation between the dataplane and the control plane. The dataplane is in charge of actual packet processing, whereas the control plane allocates and manages resources for each dataplane. The Docker platform does not explicitly assert that its design follows the dataplane/control-plane separation, but the separation exists implicitly. Each container has its own networking stack with allocated resources, and as such it is analogous to an IX dataplane. The Docker daemon manages the containers and implements resource allocation policies, just as the control plane does in dataplane operating systems. The intuition at the core of our proposal is to reuse the Docker framework and leverage its utilities to manage dataplanes. In other words, we would like to use Docker as a control plane and run dataplanes in Linux containers. This analogy is depicted in Fig. 8. In this write-up we presented the Exokernel paper and two of its successors, Arrakis and IX, in order to analyze and compare different approaches to the same problem. We believe that IX is the better fit for the scenario above. The motivation lies in the fact that IX is an operating system and a Linux process at the same time; this originates from the duality of the Dune kernel module, which virtualizes a process rather than a machine. If we consider IX to be just an application, it should be possible to package it and run it using Docker.
Dune itself would still run as a module in the regular Linux kernel, but it could be exposed as a device to the containers.

Fig. 8: Combining Docker containers and IX dataplanes

Fig. 9: Combining containers and virtual machines

Why would one want such a design? The reasons are several:

Easier deployment and configuration of dataplanes using sophisticated Docker tools. We want to run and configure the IX operating system using a single Dockerfile and use Docker's resource management features to extend the Python control plane.

The IX execution model could bring the benefits of high throughput and low latency to applications running in containers. However, this design considers only event-driven applications.

Enhanced isolation through hardware virtualization using Dune. The isolation of containers is an often-discussed problem. Considering the extensive research on container security and a very recent post from Docker [11], the only way to guarantee the same level of isolation between containers as between virtual machines is to run them inside virtual machines (see Fig. 9, taken from the same post). However, by using the Dune module and hardware virtualization to isolate the networking stacks, we could guarantee the same level of isolation as with virtual machines, but with less overhead.

REFERENCES

[1] D.R. Engler and M.F. Kaashoek, "Exterminate All Operating System Abstractions," HotOS, 1995.
[2] D.R. Engler, M.F. Kaashoek, and J. O'Toole Jr., "Exokernel: An Operating System Architecture for Application-Level Resource Management," SOSP, 1995.
[3] M.F. Kaashoek, D.R. Engler, G.R. Ganger, H.M. Briceno, R. Hunt, D. Mazieres, T. Pinckney, R. Grimm, J. Jannotti, and K. Mackenzie, "Application Performance and Flexibility on Exokernel Systems," SOSP, 1997.
[4] G.R. Ganger, D.R. Engler, M.F. Kaashoek, H.M. Briceno, R. Hunt, and T. Pinckney, "Fast and Flexible Application-Level Networking on Exokernel Systems," TOCS, Volume 20, Issue 1, February 2002.
[5] S. Peter, J. Li, I. Zhang, D.R.K. Ports, D. Woos, A. Krishnamurthy, T. Anderson, and T. Roscoe, "Arrakis: The Operating System is the Control Plane," OSDI, 2014.
[6] A. Belay, A. Bittau, A. Mashtizadeh, D. Terei, D. Mazieres, and C. Kozyrakis, "Dune: Safe User-Level Access to Privileged CPU Features," OSDI, 2012.
[7] A. Belay, G. Prekas, A. Klimovic, S. Grossman, C. Kozyrakis, and E. Bugnion, "IX: A Protected Dataplane Operating System for High Throughput and Low Latency," OSDI, 2014.
[8] J.H. Saltzer, D.P. Reed, and D.D. Clark, "End-to-End Arguments in System Design," TOCS, Volume 2, Issue 4, November 1984.
[9] G.J. Popek and R.P. Goldberg, "Formal Requirements for Virtualizable Third Generation Architectures," CACM, Volume 17, Issue 7, July 1974.
[10] Docker:
[11] "Understanding Docker Security and Best Practices" (May 2015), blog.docker.com/2015/05/understanding-docker-security-and-best-practices/
[12] LXC: linuxcontainers.org
[13] R. Uhlig, G. Neiger, D. Rodgers, A. Santoni, F. Martins, A. Anderson, S. Bennett, A. Kagi, F. Leung, and L. Smith, "Intel Virtualization Technology," Computer, 38(5):48-56, May 2005.
[14] R. Love, Linux Kernel Development (3rd Edition).
[15] NAPI:
[16] A. Chankhunthod, P.B. Danzig, C. Neerdaels, M.F. Schwartz, and K.J. Worrell, "A Hierarchical Internet Object Cache," USENIX ATC, 1996.
[17] NCSA HTTPd: en.wikipedia.org/wiki/NCSA_HTTPd
[18] libevent: libevent.org


More information

Spring 2017 :: CSE 506. Introduction to. Virtual Machines. Nima Honarmand

Spring 2017 :: CSE 506. Introduction to. Virtual Machines. Nima Honarmand Introduction to Virtual Machines Nima Honarmand Virtual Machines & Hypervisors Virtual Machine: an abstraction of a complete compute environment through the combined virtualization of the processor, memory,

More information

What s in a process?

What s in a process? CSE 451: Operating Systems Winter 2015 Module 5 Threads Mark Zbikowski mzbik@cs.washington.edu Allen Center 476 2013 Gribble, Lazowska, Levy, Zahorjan What s in a process? A process consists of (at least):

More information

ELEC 377 Operating Systems. Week 1 Class 2

ELEC 377 Operating Systems. Week 1 Class 2 Operating Systems Week 1 Class 2 Labs vs. Assignments The only work to turn in are the labs. In some of the handouts I refer to the labs as assignments. There are no assignments separate from the labs.

More information

Commercial Real-time Operating Systems An Introduction. Swaminathan Sivasubramanian Dependable Computing & Networking Laboratory

Commercial Real-time Operating Systems An Introduction. Swaminathan Sivasubramanian Dependable Computing & Networking Laboratory Commercial Real-time Operating Systems An Introduction Swaminathan Sivasubramanian Dependable Computing & Networking Laboratory swamis@iastate.edu Outline Introduction RTOS Issues and functionalities LynxOS

More information

Distributed File Systems Issues. NFS (Network File System) AFS: Namespace. The Andrew File System (AFS) Operating Systems 11/19/2012 CSC 256/456 1

Distributed File Systems Issues. NFS (Network File System) AFS: Namespace. The Andrew File System (AFS) Operating Systems 11/19/2012 CSC 256/456 1 Distributed File Systems Issues NFS (Network File System) Naming and transparency (location transparency versus location independence) Host:local-name Attach remote directories (mount) Single global name

More information

Assignment 5. Georgia Koloniari

Assignment 5. Georgia Koloniari Assignment 5 Georgia Koloniari 2. "Peer-to-Peer Computing" 1. What is the definition of a p2p system given by the authors in sec 1? Compare it with at least one of the definitions surveyed in the last

More information

I, J A[I][J] / /4 8000/ I, J A(J, I) Chapter 5 Solutions S-3.

I, J A[I][J] / /4 8000/ I, J A(J, I) Chapter 5 Solutions S-3. 5 Solutions Chapter 5 Solutions S-3 5.1 5.1.1 4 5.1.2 I, J 5.1.3 A[I][J] 5.1.4 3596 8 800/4 2 8 8/4 8000/4 5.1.5 I, J 5.1.6 A(J, I) 5.2 5.2.1 Word Address Binary Address Tag Index Hit/Miss 5.2.2 3 0000

More information

Opal. Robert Grimm New York University

Opal. Robert Grimm New York University Opal Robert Grimm New York University The Three Questions What is the problem? What is new or different? What are the contributions and limitations? The Three Questions What is the problem? Applications

More information

Preserving I/O Prioritization in Virtualized OSes

Preserving I/O Prioritization in Virtualized OSes Preserving I/O Prioritization in Virtualized OSes Kun Suo 1, Yong Zhao 1, Jia Rao 1, Luwei Cheng 2, Xiaobo Zhou 3, Francis C. M. Lau 4 The University of Texas at Arlington 1, Facebook 2, University of

More information

SPIN Operating System

SPIN Operating System SPIN Operating System Motivation: general purpose, UNIX-based operating systems can perform poorly when the applications have resource usage patterns poorly handled by kernel code Why? Current crop of

More information

COSC 6385 Computer Architecture. Virtualizing Compute Resources

COSC 6385 Computer Architecture. Virtualizing Compute Resources COSC 6385 Computer Architecture Virtualizing Compute Resources Spring 2010 References [1] J. L. Hennessy, D. A. Patterson Computer Architecture A Quantitative Approach Chapter 5.4 [2] G. Neiger, A. Santoni,

More information

What is an Operating System? A Whirlwind Tour of Operating Systems. How did OS evolve? How did OS evolve?

What is an Operating System? A Whirlwind Tour of Operating Systems. How did OS evolve? How did OS evolve? What is an Operating System? A Whirlwind Tour of Operating Systems Trusted software interposed between the hardware and application/utilities to improve efficiency and usability Most computing systems

More information

Lecture 5: February 3

Lecture 5: February 3 CMPSCI 677 Operating Systems Spring 2014 Lecture 5: February 3 Lecturer: Prashant Shenoy Scribe: Aditya Sundarrajan 5.1 Virtualization Virtualization is a technique that extends or replaces an existing

More information

Implementation and Analysis of Large Receive Offload in a Virtualized System

Implementation and Analysis of Large Receive Offload in a Virtualized System Implementation and Analysis of Large Receive Offload in a Virtualized System Takayuki Hatori and Hitoshi Oi The University of Aizu, Aizu Wakamatsu, JAPAN {s1110173,hitoshi}@u-aizu.ac.jp Abstract System

More information

Background. IBM sold expensive mainframes to large organizations. Monitor sits between one or more OSes and HW

Background. IBM sold expensive mainframes to large organizations. Monitor sits between one or more OSes and HW Virtual Machines Background IBM sold expensive mainframes to large organizations Some wanted to run different OSes at the same time (because applications were developed on old OSes) Solution: IBM developed

More information

Introduction to Operating Systems. Chapter Chapter

Introduction to Operating Systems. Chapter Chapter Introduction to Operating Systems Chapter 1 1.3 Chapter 1.5 1.9 Learning Outcomes High-level understand what is an operating system and the role it plays A high-level understanding of the structure of

More information

Cloud Computing Virtualization

Cloud Computing Virtualization Cloud Computing Virtualization Anil Madhavapeddy anil@recoil.org Contents Virtualization. Layering and virtualization. Virtual machine monitor. Virtual machine. x86 support for virtualization. Full and

More information

Task Scheduling of Real- Time Media Processing with Hardware-Assisted Virtualization Heikki Holopainen

Task Scheduling of Real- Time Media Processing with Hardware-Assisted Virtualization Heikki Holopainen Task Scheduling of Real- Time Media Processing with Hardware-Assisted Virtualization Heikki Holopainen Aalto University School of Electrical Engineering Degree Programme in Communications Engineering Supervisor:

More information

Operating System Structure

Operating System Structure Operating System Structure Joey Echeverria joey42+os@gmail.com April 18, 2005 Carnegie Mellon University: 15-410 Spring 2005 Overview Motivations Kernel Structures Monolithic Kernels Open Systems Microkernels

More information

CHAPTER 16 - VIRTUAL MACHINES

CHAPTER 16 - VIRTUAL MACHINES CHAPTER 16 - VIRTUAL MACHINES 1 OBJECTIVES Explore history and benefits of virtual machines. Discuss the various virtual machine technologies. Describe the methods used to implement virtualization. Show

More information

10/10/ Gribble, Lazowska, Levy, Zahorjan 2. 10/10/ Gribble, Lazowska, Levy, Zahorjan 4

10/10/ Gribble, Lazowska, Levy, Zahorjan 2. 10/10/ Gribble, Lazowska, Levy, Zahorjan 4 What s in a process? CSE 451: Operating Systems Autumn 2010 Module 5 Threads Ed Lazowska lazowska@cs.washington.edu Allen Center 570 A process consists of (at least): An, containing the code (instructions)

More information

CS 344/444 Computer Network Fundamentals Final Exam Solutions Spring 2007

CS 344/444 Computer Network Fundamentals Final Exam Solutions Spring 2007 CS 344/444 Computer Network Fundamentals Final Exam Solutions Spring 2007 Question 344 Points 444 Points Score 1 10 10 2 10 10 3 20 20 4 20 10 5 20 20 6 20 10 7-20 Total: 100 100 Instructions: 1. Question

More information

Last Class: OS and Computer Architecture. Last Class: OS and Computer Architecture

Last Class: OS and Computer Architecture. Last Class: OS and Computer Architecture Last Class: OS and Computer Architecture System bus Network card CPU, memory, I/O devices, network card, system bus Lecture 4, page 1 Last Class: OS and Computer Architecture OS Service Protection Interrupts

More information

6.9. Communicating to the Outside World: Cluster Networking

6.9. Communicating to the Outside World: Cluster Networking 6.9 Communicating to the Outside World: Cluster Networking This online section describes the networking hardware and software used to connect the nodes of cluster together. As there are whole books and

More information

HIGH-PERFORMANCE NETWORKING :: USER-LEVEL NETWORKING :: REMOTE DIRECT MEMORY ACCESS

HIGH-PERFORMANCE NETWORKING :: USER-LEVEL NETWORKING :: REMOTE DIRECT MEMORY ACCESS HIGH-PERFORMANCE NETWORKING :: USER-LEVEL NETWORKING :: REMOTE DIRECT MEMORY ACCESS CS6410 Moontae Lee (Nov 20, 2014) Part 1 Overview 00 Background User-level Networking (U-Net) Remote Direct Memory Access

More information

Introduction to Operating Systems Prof. Chester Rebeiro Department of Computer Science and Engineering Indian Institute of Technology, Madras

Introduction to Operating Systems Prof. Chester Rebeiro Department of Computer Science and Engineering Indian Institute of Technology, Madras Introduction to Operating Systems Prof. Chester Rebeiro Department of Computer Science and Engineering Indian Institute of Technology, Madras Week - 01 Lecture - 03 From Programs to Processes Hello. In

More information

What are some common categories of system calls? What are common ways of structuring an OS? What are the principles behind OS design and

What are some common categories of system calls? What are common ways of structuring an OS? What are the principles behind OS design and What are the services provided by an OS? What are system calls? What are some common categories of system calls? What are the principles behind OS design and implementation? What are common ways of structuring

More information