Tessellation: Space-Time Partitioning in a Manycore Client OS

Similar documents
Tessellation OS. Present and Future. Juan A. Colmenares, Sarah Bird, Andrew Wang, Krste Asanovid, John Kubiatowicz [Parallel Computing Laboratory]

Abstract. 1 Introduction

Tessellation: Space-Time Partitioning in a Manycore Client OS Rose Liu, Kevin Klues, Sarah Bird, Steven Hofmeyr, Krste Asanović, John Kubiatowicz

BERKELEY PAR LAB. RAMP Gold Wrap. Krste Asanovic. RAMP Wrap Stanford, CA August 25, 2010

Hardware Acceleration for Tagging

Lithe: Enabling Efficient Composition of Parallel Libraries

Swarm at the Edge of the Cloud. John Kubiatowicz UC Berkeley Swarm Lab September 29 th, 2013

Integrating the Par Lab Stack Running Damascene on SEJITS/ROS/RAMP Gold

Introduction to Operating Systems

Operating systems Architecture

Making Performance Understandable: Towards a Standard for Performance Counters on Manycore Architectures

Xen and the Art of Virtualization. CSE-291 (Cloud Computing) Fall 2016

Akaros. Barret Rhoden, Kevin Klues, Andrew Waterman, Ron Minnich, David Zhu, Eric Brewer UC Berkeley and Google

Exokernel: An Operating System Architecture for Application Level Resource Management

Introduction to OS (cs1550)

Role 1: The Operating System is an Abstract Machine. Learning Outcomes. Introduction to Operating Systems. What is an Operating System?

Anne Bracy CS 3410 Computer Science Cornell University

COS 318: Operating Systems

Example Networks on chip Freescale: MPC Telematics chip

COS 318: Operating Systems

Protection and System Calls. Otto J. Anshus

CS Operating Systems

Initial Evaluation of a User-Level Device Driver Framework

CSCI 350 Ch. 1 Introduction to OS. Mark Redekopp Ramesh Govindan and Michael Shindler

Department of Computer Science Institute for System Architecture, Operating Systems Group REAL-TIME MICHAEL ROITZSCH OVERVIEW

A HIGH-PERFORMANCE OBLIVIOUS RAM CONTROLLER ON THE CONVEY HC-2EX HETEROGENEOUS COMPUTING PLATFORM

Tolerating Malicious Drivers in Linux. Silas Boyd-Wickizer and Nickolai Zeldovich

2017 Storage Developer Conference. Mellanox Technologies. All Rights Reserved.

OS Structure. Kevin Webb Swarthmore College January 25, Relevant xkcd:

Arachne. Core Aware Thread Management Henry Qin Jacqueline Speiser John Ousterhout

Active Testing for Concurrent Programs

Are You Insured Against Your Noisy Neighbor Sunku Ranganath, Intel Corporation Sridhar Rao, Spirent Communications

Operating Systems (2INC0) 2018/19. Introduction (01) Dr. Tanir Ozcelebi. Courtesy of Prof. Dr. Johan Lukkien. System Architecture and Networking Group

Scalable and Flexible Software Platforms for High-Performance ECUs. Christoph Dietachmayr Sr. Engineering Manager, Elektrobit November 8, 2018

RESOURCE MANAGEMENT MICHAEL ROITZSCH

Operating Systems. studykorner.org

IHK/McKernel: A Lightweight Multi-kernel Operating System for Extreme-Scale Supercomputing

Sandboxing. CS-576 Systems Security Instructor: Georgios Portokalidis Spring 2018

2013 Cisco and/or its affiliates. All rights reserved. 1

RISCV with Sanctum Enclaves. Victor Costan, Ilia Lebedev, Srini Devadas

IO virtualization. Michael Kagan Mellanox Technologies

Four Components of a Computer System

Operating System Support for Shared-ISA Asymmetric Multi-core Architectures

Introduction to Operating Systems. Chapter Chapter

Japan s post K Computer Yutaka Ishikawa Project Leader RIKEN AICS

Chapter 1: Introduction. Operating System Concepts 9 th Edit9on

Using Industry Standards to Exploit the Advantages and Resolve the Challenges of Multicore Technology

Enabling NVMe I/O Scale

Kubernetes The Path to Cloud Native

Xen and the Art of Virtualization

Virtualization. Starting Point: A Physical Machine. What is a Virtual Machine? Virtualization Properties. Types of Virtualization

Virtualization. ! Physical Hardware Processors, memory, chipset, I/O devices, etc. Resources often grossly underutilized

IQ for DNA. Interactive Query for Dynamic Network Analytics. Haoyu Song. HUAWEI TECHNOLOGIES Co., Ltd.

CSC 5930/9010 Cloud S & P: Virtualization

Anne Bracy CS 3410 Computer Science Cornell University

I/O virtualization. Jiang, Yunhong Yang, Xiaowei Software and Service Group 2009 虚拟化技术全国高校师资研讨班

references Virtualization services Topics Virtualization

CISC 662 Graduate Computer Architecture Lecture 16 - Cache and virtual memory review

UCS Technical Deep Dive: Getting to the Heart of the Matter

Software-Defined Networking (SDN) Overview

Lecture 1 Introduction (Chapter 1 of Textbook)

A Secure Update Architecture for High Assurance Mixed-Criticality System Don Kuzhiyelil Dr. Sergey Tverdyshev SYSGO AG

Flavors of Memory supported by Linux, their use and benefit. Christoph Lameter, Ph.D,

Code Generators for Stencil Auto-tuning

Lecture 9 - Virtual Memory

Architectural Support for Operating Systems

Sub-capacity (Virtualization) License Counting Rules

Design Principles for End-to-End Multicore Schedulers

Scheduling II. Today. Next Time. ! Proportional-share scheduling! Multilevel-feedback queue! Multiprocessor scheduling. !

CSE 120 Principles of Operating Systems

OPERATING SYSTEMS Chapter 13 Virtual Machines. CS3502 Spring 2017

Syscalls, exceptions, and interrupts, oh my!

M 3 Microkernel-based System for Heterogeneous Manycores

Internet of Things 2018/2019

Reserves time on a paper sign-up sheet. Programmer runs his own program. Relays or vacuum tube hardware. Plug board or punch card input.

OPENSTACK: THE OPEN CLOUD

Extreme-Scale Operating Systems

Windows Azure Services - At Different Levels

EECS750: Advanced Operating Systems. 2/24/2014 Heechul Yun

CS250 VLSI Systems Design Lecture 9: Patterns for Processing Units and Communication Links

Computer Architecture and OS. EECS678 Lecture 2

19: I/O. Mark Handley. Direct Memory Access (DMA)

Interconnecting Components

Lecture 9: MIMD Architectures

ARM Security Solutions and Numonyx Authenticated Flash

Chapter 5 C. Virtual machines

A Comparison of Capacity Management Schemes for Shared CMP Caches

Nested Virtualization and Server Consolidation

Operating System Structure

Live Migration of Virtualized Edge Networks: Analytical Modeling and Performance Evaluation

Parallel Languages: Past, Present and Future

CS370 Operating Systems

CS252 Spring 2017 Graduate Computer Architecture. Lecture 17: Virtual Memory and Caches

Evaluation of sparse LU factorization and triangular solution on multicore architectures. X. Sherry Li

Computer Systems Laboratory Sungkyunkwan University

Scaling Internet TV Content Delivery ALEX GUTARIN DIRECTOR OF ENGINEERING, NETFLIX

A Case for High Performance Computing with Virtual Machines

Taming Non-blocking Caches to Improve Isolation in Multicore Real-Time Systems

Operating System Services. User Services. System Operation Services. User Operating System Interface - CLI. A View of Operating System Services

CSE 451: Operating Systems Winter Redundant Arrays of Inexpensive Disks (RAID) and OS structure. Gary Kimura

Transcription:

Tessellation: Space-Time ing in a Manycore Client OS Rose Liu 1,2, Kevin Klues 1, Sarah Bird 1, Steven Hofmeyr 3, Krste Asanovic 1, John Kubiatowicz 1 1 Parallel Computing Laboratory, UC Berkeley 2 Data Domain 3 Lawrence Berkeley National Laboratory

Client Device Single-user device Runs a heterogeneous mix of interactive, real-time and batch applications simultaneously Generally battery constrained

Why a new Client OS? Enter the Manycore world Must address parallelism Current client OSs weren t designed for parallel applications Existing OSs addressing parallelism targets servers or HPC contexts, not clients Servers emphasis on throughput vs. Client emphasis on user experience/responsiveness HPC machine dedicated to one parallel application vs. Client runs many heterogeneous parallel applications Client - Longer battery life

Outline Why a new OS for Manycore Clients? A Case for Space-time ing Define space-time partitioning Use cases for space-time partitioning Implementing Space-Time ing in a Manycore OS Status

Spatial s Radio Isolated unit containing a subset of physical machine resources 5

Spatial s Radio Isolated unit containing a subset of physical machine resources 6

Spatial s Radio Isolated unit containing a subset of physical machine resources 7

Spatial s QoS enforced share of interconnect bandwidth Radio Isolated unit containing a subset of physical machine resources 8

Spatial s Energy or Power Budget Radio Isolated unit containing a subset of physical machine resources 9

Machine divided into spatial partitions Wireless radio 10

Put applications in spatial partitions Media Player Radio Browser 11

Benefits of spatial partitions Each app can run a custom user-level runtime for best performance Provides apps with resource guarantees for performance predictability Functional & Performance Isolation Natural unit for fault Media Player containment, energy management Radio Browser 12

Put OS Services in spatial partitions Media Player Network Driver Wireless radio Browser Filesystem 13

Put sub-components in spatial partitions Media Player Video decoder GUI Network Driver Wireless radio Browser Filesystem 14

Put virtual machines in spatial partitions Media Player Video decoder GUI Network Driver Wireless radio Filesystem Browser Windows VM 15

s need to communicate Spatial Communication occurs without a context switch Media Player Network Driver Wireless radio Video decoder GUI Browser Windows VM Filesystem 16

Communication Challenges Communication relaxes the isolation boundaries of partitions and introduces issues like: Security Service-level QoS and/or resource accounting of requestors within service partitions Media Player Video decoder GUI Network Driver Wireless radio Browser Windows VM Filesystem 17

Space-time partitioning virtualizes spatial partitions Context Switch Cost ~ Process Context Switch Cost Time multiplex at a coarse granularity to allow for user-level scheduling Media Player Video decoder Browser GUI Network Driver Windows VM Wireless radio Filesystem De-scheduled s 18

Space-Time Scheduling Descheduled s: s are dynamically resized while running without a reboot or application restart Time resources Real-time app put in low power state is always scheduled

Space-Time Scheduling Descheduled s: Challenges: 1. How to determine the right resource allocation for a partition? 2. What granularity to time multiplex each partition? Don t need to use same time quanta for all partitions. Time s are dynamically resized while running without a reboot or application restart 3. We can deschedule partitions from each type of resource independently. E.g. time multiplex off cores more frequently than multiplex partition data off caches. How to determine best policy?

Communication in space and time Media Player Video decoder GUI Network Driver Wireless radio Filesystem Browser Windows VM De-scheduled s 21

Outline Why a new OS for Manycore Clients? A Case for Space-time ing Implementing Space-Time ing in a Manycore OS (Tessellation) Status

Tessellation OS Media Player Video decoder GUI Network Driver Video Driver Wireless radio Browser Windows VM Filesystem Virus Checker De-scheduled s USB Driver 23

Tessellation Kernel Library OS Functionality Application Or OS Service Custom Scheduler Interconnect Bandwidth Message Passing Cache Physical CPUs Hardware ing Mechanisms Performance Counters 24

Tessellation Kernel Library OS Functionality Application Or OS Service Custom Scheduler Marshalls syscalls into messages for the respective OS Service App-specific scheduler for best parallel performance. (See Lithe talk on user-level scheduling.) Interconnect Bandwidth Message Passing Cache Physical CPUs Hardware ing Mechanisms Performance Counters 25

Tessellation Kernel Library OS Functionality Application Or OS Service Custom Scheduler Tessellation Kernel Interconnect Bandwidth Message Passing Cache Physical CPUs Hardware ing Mechanisms Performance Counters 26

Tessellation Kernel Library OS Functionality Application Or OS Service Custom Scheduler Management Layer Mechanism Layer (Trusted) Tessellation Kernel Interconnect Bandwidth Message Passing Cache Physical CPUs Hardware ing Mechanisms Performance Counters 27

Mechanism Layer Library OS Functionality Application Or OS Service Custom Scheduler Management Layer Mechanism Layer (Trusted) Configure Resources enforced by HW at runtime Tessellation Kernel Interconnect Bandwidth Message Passing Cache Physical CPUs Hardware ing Mechanisms Performance Counters 28

Mechanism Layer Library OS Functionality Application Or OS Service Custom Scheduler Management Layer Mechanism Layer (Trusted) Configure Resources enforced by HW at runtime Configure HW-supported Communication Tessellation Kernel Interconnect Bandwidth Message Passing Cache Physical CPUs Hardware ing Mechanisms Performance Counters 29

Management Layer Library OS Functionality Application Or OS Service Custom Scheduler Management Layer Mechanism Layer (Trusted) Configure Resources enforced by HW at runtime Allocator Configure HW-supported Communication Tessellation Kernel Interconnect Bandwidth Message Passing Cache Physical CPUs Hardware ing Mechanisms Performance Counters 30

Management Layer Library OS Functionality Application Or OS Service Custom Scheduler Resizing Callback API Management Layer Mechanism Layer (Trusted) Configure Resources enforced by HW at runtime Allocator Configure HW-supported Communication Tessellation Kernel Interconnect Bandwidth Message Passing Cache Physical CPUs Hardware ing Mechanisms Performance Counters 31

Management Layer Library OS Functionality Application Or OS Service Custom Scheduler Res. Reqs. Resizing Callback API Management Layer Mechanism Layer (Trusted) Configure Resources enforced by HW at runtime Allocator Configure HW-supported Communication Tessellation Kernel Interconnect Bandwidth Message Passing Cache Physical CPUs Hardware ing Mechanisms Performance Counters 32

Management Layer Library OS Functionality Application Or OS Service Comm Reqs Custom Scheduler Res. Reqs. Resizing Callback API Management Layer Mechanism Layer (Trusted) Scheduler Configure Resources enforced by HW at runtime Allocator Configure HW-supported Communication Tessellation Kernel Interconnect Bandwidth Message Passing Cache Physical CPUs Hardware ing Mechanisms Performance Counters 33

Management Layer Library OS Functionality Application Or OS Service Sched Reqs. Comm. Reqs Custom Scheduler Res. Reqs. Resizing Callback API Management Layer Mechanism Layer (Trusted) Scheduler Configure Resources enforced by HW at runtime Allocator Configure HW-supported Communication Tessellation Kernel Interconnect Bandwidth Message Passing Cache Physical CPUs Hardware ing Mechanisms Performance Counters 34

Outline Why a new OS for Manycore Clients? A Case for Space-time ing Define space-time partitioning Use cases for space-time partitioning Implementing Space-Time ing in a Manycore OS Status

Implementation status Basics of Tessellation kernel and primitive OS service up and running Provides rudimentary partition interface Boots on standard x86 hardware No I/O yet statically linked applications and kernel TOS OS Services User App 0 User App 1 Tesselation Kernel Manycore HW

Next Steps Build fast cross-core communication mechanisms for system calls Context-switch free system calls APIC driven message notification with shared memory Add support for the 19 newlib system calls in TOS OS Service partition TOS OS Services User App 0 Newlib C User App 1 Newlib C Syscall Syscall

Intermediate Infrastructure TOS OS Service doesn t have all drivers, so run BSD with existing drivers on one core to service I/O from TOS OS Service Tessellation runs on rest of the cores Pinned Shared Channel Existing I/O Drivers I/O Server Process BSD Kernel TOS OS Service User App 0 Newlib C Syscall Manycore HW Syscall User App 1 Newlib C

Acknowledgements We would like to thank the entire ParLab OS group and several people at LBNL for their invaluable contribution to the ideas presented here. Research supported by Microsoft Award #024263 and Intel Award #024894 and by matching funding from U.C. Discovery (Award #DIG07-102270). This work has also been in part supported by an NSF Graduate Research Fellowship.

Thanks! Questions? Media Player Video decoder GUI Network Driver Wireless radio Filesystem Browser Windows VM De-scheduled s 40