M3: Microkernel-based System for Heterogeneous Manycores

M3: Microkernel-based System for Heterogeneous Manycores
Nils Asmussen
MKC, 06/29/2017

Outline
  1. Introduction
  2. Data Transfer Unit
  3. Prototype Platforms
  4. M3
  5. Summary

Motivation
- Intel Knights Landing
- Qualcomm Snapdragon 810

What's the Problem?
[Figure, built up over four steps: a heterogeneous platform with two DSPs, an ARM big core, an ARM LITTLE core, an audio decoder, two Intel Xeons and an FPGA. A kernel appears first on the Intel Xeons, then also on the ARM big and ARM LITTLE cores. In the last step only the units that run a kernel remain.]

The Goal
Treat all compute units (CUs) as first-class citizens:
  1. Run untrusted code without causing harm
  2. Access operating system services
  3. Interact as the master with other CUs

First-class Citizenship as Enabler
- Pipe communication between arbitrary CUs
- Use parallelism on GPUs for FS operations
- Direct access to accelerators from the net
- ...

My Approach
[Figure, built up over several steps: the same platform as before (two DSPs, ARM big, ARM LITTLE, audio decoder, two Intel Xeons, FPGA). Each unit becomes a PE. In the final step the kernel runs on the ARM big core and an app runs on every other unit.]

Data Transfer Unit (DTU)
- Supports memory access and message passing
- Provides a number of endpoints
- Each endpoint can be configured for:
  1. Accessing a piece of memory (contiguous range, byte granular)
  2. Receiving messages into a ring buffer
  3. Sending messages to a receiving endpoint
- Configuration only by the kernel, usage by the application
- Direct reply on received messages
- Credits are used to prevent DoS attacks
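The endpoint rules on this slide can be sketched as a small software model. This is only an illustration under stated assumptions: the names `Endpoint`, `Dtu`, `configure_recv`, `configure_send` and `fetch_and_reply` are hypothetical, both endpoints live in one object although they would sit on different PEs, and returning one credit per reply is an assumption about how credits are replenished.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Hypothetical software model of DTU endpoints (not the real hardware interface).
enum class EpKind { Invalid, Memory, Receive, Send };

struct Endpoint {
    EpKind kind = EpKind::Invalid;
    std::size_t slots = 0;             // Receive: capacity of the ring buffer
    std::deque<std::string> ring;      // Receive: buffered messages
    std::size_t target = 0;            // Send: index of the receive endpoint
    unsigned credits = 0;              // Send: messages that may still be sent
};

struct Dtu {
    std::vector<Endpoint> eps;
    explicit Dtu(std::size_t n) : eps(n) {}

    // Configuration: in the design on the slide, only the kernel may do this.
    void configure_recv(std::size_t ep, std::size_t slots) {
        eps.at(ep) = Endpoint();
        eps.at(ep).kind = EpKind::Receive;
        eps.at(ep).slots = slots;
    }
    void configure_send(std::size_t ep, std::size_t target, unsigned credits) {
        eps.at(ep) = Endpoint();
        eps.at(ep).kind = EpKind::Send;
        eps.at(ep).target = target;
        eps.at(ep).credits = credits;
    }

    // Usage: an application sends through a configured endpoint.
    // Fails without credits, so a sender cannot flood the receiver.
    bool send(std::size_t ep, const std::string &msg) {
        Endpoint &s = eps.at(ep);
        if (s.kind != EpKind::Send || s.credits == 0) return false;
        Endpoint &r = eps.at(s.target);
        if (r.kind != EpKind::Receive || r.ring.size() >= r.slots) return false;
        r.ring.push_back(msg);
        s.credits--;
        return true;
    }

    // The receiver takes the oldest message; the reply hands one credit back
    // to the sending endpoint (assumption of this sketch).
    bool fetch_and_reply(std::size_t recv_ep, std::size_t send_ep, std::string &out) {
        Endpoint &r = eps.at(recv_ep);
        if (r.kind != EpKind::Receive || r.ring.empty()) return false;
        out = r.ring.front();
        r.ring.pop_front();
        if (eps.at(send_ep).kind == EpKind::Send) eps.at(send_ep).credits++;
        return true;
    }
};
```

With one credit, a second `send` fails until the receiver has replied to the first message, which is the DoS protection the slide refers to.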

Data Transfer Unit Example
[Figure, built up over three steps: a DSP running an app and an ARM big core running the kernel, with endpoints marked S and R.]

Tomahawk 2 and 4
[Figure: PEs connected through routers (R), plus a controller (Ctrl.) and DRAM. One PE is shown in detail with an Xtensa LX4 core, data SPM and instruction SPM.]
- Cores attached to the NoC with a DTU
- No privileged mode
- No MMU, no caches, but SPM
- T2: simple DTU; T4: most features

Linux
- M3 runs on Linux, using it as a virtual machine
- A process simulates a PE, having two threads (CPU + DTU)
- DTUs communicate over UNIX domain sockets
- No accuracy, because:
  - Programs are directly executed on the host
  - Data transfers have huge overhead compared to HW
- Very useful for debugging and early prototyping
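To illustrate the transport used on this platform, the following sketch passes one message between two ends of a UNIX domain socket pair using the POSIX calls `socketpair`, `send` and `recv`. The helper name `transfer_over_unix_socket` and the 256-byte limit are assumptions made for the example; the actual M3 host backend is not shown in the slides and may be organized quite differently (for instance with named sockets and separate DTU threads).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: sends msg from one simulated DTU to another over a
// UNIX domain datagram socket pair and returns what the receiving side got.
// Returns an empty string on error. Messages above 256 bytes are truncated.
std::string transfer_over_unix_socket(const std::string &msg) {
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, fds) != 0)
        return "";

    std::string out;
    // fds[0] plays the sending DTU, fds[1] the receiving DTU.
    ssize_t sent = send(fds[0], msg.data(), msg.size(), 0);
    if (sent == static_cast<ssize_t>(msg.size())) {
        char buf[256];
        ssize_t n = recv(fds[1], buf, sizeof(buf), 0);
        if (n > 0)
            out.assign(buf, static_cast<std::size_t>(n));
    }

    close(fds[0]);
    close(fds[1]);
    return out;
}
```

Every transfer here goes through the host kernel, which hints at why the slide calls the overhead huge compared to hardware.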

gem5
- Modular platform for computer-system architecture research
- Supports various ISAs (x86, ARM, Alpha, SPARC, ...)
- Cycle-accurate simulation
- Has an out-of-order core model
- We built a DTU for gem5
- Support for caches and virtual memory
- We also added hardware accelerators

gem5 Example Configuration
[Figure: an example system with x86 PEs, accelerator PEs and DRAM. Labels in the figure: L1$, L2$, IO$, SPM, VM, ME.]

Introduction
- M3 uses ideas from L4, but is implemented from scratch
- Provides mechanisms for PEs, memory and communication
- Drivers, filesystems, ... are implemented as services
- Kernel manages permissions, using capabilities
- DTU enforces permissions (communication, memory access)
- Kernel has no ISA-specific code

Capabilities
M3 has the following capabilities:
- Send: send messages to a receive EP
- Receive: receive messages from send EPs
- Memory: access remote memory via the DTU
- Mapping: access remote memory via load/store
- Service: create sessions
- Session: exchange caps with a service
- VPE: execute code on a PE

Capability Exchange
- Kernel provides syscalls to create, exchange and revoke caps
- There are two ways to exchange caps:
  1. Directly with another VPE (typically, a child VPE)
  2. Over a session with a service
- The kernel offers two operations:
  1. Delegate: send a capability to somebody else
  2. Obtain: receive a capability from somebody else
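The delegate and obtain operations can be pictured with a toy capability table kept by the kernel. This is a minimal sketch, not the M3 kernel: the type names, the `(VPE, selector)` keying and the method signatures are hypothetical, and `revoke` removes a single entry only, whereas a real kernel would also have to track and revoke delegated copies.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Capability types as listed on the previous slide.
enum class CapType { Send, Receive, Memory, Mapping, Service, Session, Vpe };

typedef int VpeId;     // hypothetical VPE identifier
typedef int Selector;  // hypothetical index into a VPE's capability table

struct Kernel {
    // (VPE, selector) -> capability held by that VPE.
    std::map<std::pair<VpeId, Selector>, CapType> caps;

    bool create(VpeId vpe, Selector sel, CapType type) {
        return caps.emplace(std::make_pair(vpe, sel), type).second;
    }

    bool has(VpeId vpe, Selector sel) const {
        return caps.count(std::make_pair(vpe, sel)) == 1;
    }

    // Delegate: the holder (src) sends a copy of its capability to dst.
    bool delegate(VpeId src, Selector src_sel, VpeId dst, Selector dst_sel) {
        auto it = caps.find(std::make_pair(src, src_sel));
        if (it == caps.end()) return false;          // src does not hold it
        return caps.emplace(std::make_pair(dst, dst_sel), it->second).second;
    }

    // Obtain: the same transfer, but requested by the receiving side.
    bool obtain(VpeId dst, Selector dst_sel, VpeId src, Selector src_sel) {
        return delegate(src, src_sel, dst, dst_sel);
    }

    // Simplified revoke: removes one entry, without following delegations.
    bool revoke(VpeId vpe, Selector sel) {
        return caps.erase(std::make_pair(vpe, sel)) == 1;
    }
};
```

In this model both operations end in the same table update; they differ only in which side initiates the exchange.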

Communication Overview
[Figure: configuration of endpoints to establish a channel. The kernel on PE0 holds a receive cap for VPE1 (PE1) and a send cap for VPE2 (PE2). The receiver on PE1 uses a receive gate and the sender on PE2 uses a send gate; each gate is backed by an EP that the core accesses through command registers (cmdreg). Labels shown for the EPs: buffer, header, data, unread, occup, credits, target, label.]

Communication
- System call to create a send cap for a specific EP and VPE
  - Typically created for the own VPE
  - Binds a label to the send cap for secure identification
  - Can then be delegated to someone else
- To use a send cap, an EP needs to be configured
- The kernel provides a syscall for that purpose
- This indirection allows EP multiplexing and blocking
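The role of the label and of the extra configuration step can be sketched as follows. All names (`SendCap`, `SendEp`, `Pe`, `activate`) are hypothetical, and the model is reduced to the two points made on the slide: the label travels with the endpoint configuration rather than with the sender's data, and one EP can be re-configured for different send caps.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical send capability: where messages go and which label they carry.
struct SendCap { int recv_vpe; int recv_ep; std::uint64_t label; };

// What the receiver sees: the label is attached by the endpoint, not the app.
struct Message { std::uint64_t label; std::string payload; };

struct SendEp { bool valid = false; SendCap cap{}; };

struct Pe {
    std::array<SendEp, 2> eps;  // deliberately fewer EPs than possible caps

    // Stands in for the kernel syscall that configures an EP for a send cap.
    void activate(std::size_t ep, const SendCap &cap) {
        eps.at(ep).valid = true;
        eps.at(ep).cap = cap;
    }

    // Application-side send: fails on an unconfigured EP. The application
    // supplies only the payload, so it cannot forge the label.
    bool send(std::size_t ep, const std::string &payload, Message &out) const {
        const SendEp &e = eps.at(ep);
        if (!e.valid) return false;
        out = Message{e.cap.label, payload};
        return true;
    }
};
```

Re-activating EP 0 with a different cap changes the label on later messages, which is the multiplexing that the indirection enables: more send caps than hardware endpoints.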

Memory
- Syscall reqmem can be used to request remote memory
- Like with send caps, an EP needs to be configured to access it
- Afterwards, the DTU can be instructed to read/write
- Behaves like RDMA (no SW involved)
- Syscall derivemem to create a subset of a mem cap
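What "a subset of a mem cap" could mean is sketched below: a derived capability covers a sub-range of its parent and carries no more permissions than the parent. The struct `MemCap`, the function `derive` and its parameters are hypothetical and do not reflect the actual signature of derivemem.

```cpp
#include <cassert>
#include <cstdint>

// Permission bits; RW mirrors the flag name used in the slide examples.
enum Perm : unsigned { R = 1, W = 2, RW = R | W };

// Hypothetical memory capability: a contiguous, byte-granular range.
struct MemCap {
    std::uint64_t addr;
    std::uint64_t size;
    unsigned perms;
    bool valid;
};

// Derives a capability for [off, off + size) within parent.
// Returns an invalid capability if the range or permissions exceed the parent.
MemCap derive(const MemCap &parent, std::uint64_t off, std::uint64_t size,
              unsigned perms) {
    const MemCap invalid{0, 0, 0, false};
    if (!parent.valid) return invalid;
    // Written this way to avoid overflow in off + size.
    if (off > parent.size || size > parent.size - off) return invalid;
    // The child must not gain permissions the parent lacks.
    if ((perms & ~parent.perms) != 0) return invalid;
    return MemCap{parent.addr + off, size, perms, true};
}
```

A read-only child of a read-write parent is allowed, while deriving a read-write capability from that read-only child is rejected.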

VPEs
- Syscall createvpe to create a new VPE
- The kernel assigns the VPE to a suitable PE, if possible
- App gets a VPE cap and a memory cap for the whole PE memory
- Syscall to start it and wait for its completion (exit)
- Can exchange capabilities with the VPE:
  - before starting it: e.g., delegate inputs
  - after it's finished: e.g., obtain results

VPEs Examples

Executing ELF binaries:

    VPE vpe("test");
    const char *args[] = {"/bin/hello", "foo", "bar"};
    vpe.exec(3, args);  // 3 = number of entries in args

Asynchronous lambdas:

    VPE vpe("test");
    // request 0x1000 bytes of global memory with read/write permission
    MemGate mem = MemGate::create_global(0x1000, RW);
    // hand the memory capability to the child VPE before it starts
    vpe.delegate(CapRngDesc(mem.sel(), 1));
    vpe.run_async([&mem]() {
        // buf is a buffer declared elsewhere (not shown on the slide)
        mem.read(buf, sizeof(buf));
        cout << "Done reading!\n";
    });

Filesystem: m3fs
- Filesystem service is implemented outside of the kernel
- m3fs is (currently) an in-memory filesystem
[Figure, built up over five steps: three PEs (PE1 to PE3) host m3fs, the app and the kernel, next to the filesystem storage (FS). The steps are labeled open, obtain, obtain and read/write: the app opens a file, capabilities are obtained, and the app then reads and writes the file data.]

Pipes: Direct Pipe
[Figure: one writer and one reader. Data is exchanged through shared memory; writer and reader coordinate via message passing.]
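One way to read the direct-pipe figure is that the data itself goes through shared memory while small messages tell the reader where to find it. The sketch below models that split in a single address space. Everything in it is an assumption for illustration: the names `DirectPipe` and `Notice`, the ring layout, and the idea that the reader's reply frees space for the writer; the slides do not describe the actual protocol.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Hypothetical notification sent from writer to reader: where the data lies.
struct Notice { std::size_t off; std::size_t len; };

struct DirectPipe {
    std::vector<char> shared;      // stands in for the shared memory region
    std::deque<Notice> to_reader;  // stands in for message passing
    std::size_t wpos = 0;          // next write position in the ring
    std::size_t free_bytes;        // space not yet occupied by unread data

    explicit DirectPipe(std::size_t size) : shared(size), free_bytes(size) {}

    // Writer: copy data into shared memory, then notify the reader.
    bool write(const std::string &data) {
        if (data.empty() || data.size() > free_bytes) return false;
        for (std::size_t i = 0; i < data.size(); ++i)
            shared[(wpos + i) % shared.size()] = data[i];
        to_reader.push_back(Notice{wpos, data.size()});
        wpos = (wpos + data.size()) % shared.size();
        free_bytes -= data.size();
        return true;
    }

    // Reader: take the next notice, copy the data out of shared memory and
    // release the space (this release stands in for a reply to the writer).
    bool read(std::string &out) {
        if (to_reader.empty()) return false;
        Notice n = to_reader.front();
        to_reader.pop_front();
        out.clear();
        for (std::size_t i = 0; i < n.len; ++i)
            out.push_back(shared[(n.off + i) % shared.size()]);
        free_bytes += n.len;
        return true;
    }
};
```

The writer blocks (here: fails) when the ring is full and can continue once the reader has consumed data, which matches the usual pipe behavior.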

Pipes: Indirect Pipe
[Figure: several writers and readers. Data is exchanged through shared memory; writers and readers coordinate via message passing with a pipe service.]

Files on Top of Capabilities
- libm3 provides abstractions to work with files, pipes, ...
- Every VPE has a mountspace and file descriptors
- Child VPEs can inherit these
- Pure convenience: security is based on capabilities

Example:

    VPE vpe("my vpe");
    // child reads from the pipe and writes to the parent's stdout
    vpe.fds()->set(STDIN_FD, pipe_fd);
    vpe.fds()->set(STDOUT_FD, VPE::self().fds()->get(STDOUT_FD));
    vpe.obtain_fds();
    vpe.run_async([] {
        File *in = VPE::self().fds()->get(STDIN_FD);
        // ...
    });

Demo

Ongoing Work
- Multiple instances of the kernel/services (by Matthias Hille)
- Port of the musl C library (by Sherif Shehabaldin)
- Integration of a NIC (by Georg Kotheimer)
- Integration of an IDE controller (by Lukas Landgraf)

Summary
- M3 is a new point in the design space for systems
- Supports untrusted code and OS services on all PEs
- Supports direct communication between all PEs
- DTU abstracts heterogeneity and provides a common interface
- M3 kernel controls DTUs remotely
- M3 is available at https://github.com/tud-os/m3