Making Cloud Easy: Components of a Distributed. Design Considerations and First. Operating System for Cloud

Similar documents
Operating System Services. User Services. System Operation Services. User Operating System Interface - CLI. A View of Operating System Services

Making Cloud Easy: Design Considerations and First Components of a Distributed Operating System for Cloud

4.8 Summary. Practice Exercises

CS420: Operating Systems

#techsummitch

Microsoft Azure Stack Hybrid Cloud. The Modern System Architecture

Disclaimer This presentation may contain product features that are currently under development. This overview of new technology represents no commitme

P a g e 1. Teknologisk Institut. Online kursus k SysAdmin & DevOps Collection

Kubernetes The Path to Cloud Native

EDGE COMPUTING & IOT MAKING IT SECURE AND MANAGEABLE FRANCK ROUX MARKETING MANAGER, NXP JUNE PUBLIC

Course Overview This five-day course will provide participants with the key knowledge required to deploy and configure Microsoft Azure Stack.

School of Software / Soongsil University Prof. YOUNGJONG KIM, Ph.D. Soongsil University

How to Put Your AF Server into a Container

Chapter 2: Operating-System Structures. Operating System Concepts 9 th Edit9on

CHAPTER 2: SYSTEM STRUCTURES. By I-Chen Lin Textbook: Operating System Concepts 9th Ed.

HPC over Cloud. July 16 th, SCENT HPC Summer GIST. SCENT (Super Computing CENTer) GIST (Gwangju Institute of Science & Technology)

How to Keep UP Through Digital Transformation with Next-Generation App Development

Docker 101 Workshop. Eric Smalling - Solution Architect, Docker

Exploring Cloud Security, Operational Visibility & Elastic Datacenters. Kiran Mohandas Consulting Engineer

CLOUD-NATIVE APPLICATION DEVELOPMENT/ARCHITECTURE

Operating System Structure

Merging Enterprise Applications with Docker* Container Technology

Title DC Automation: It s a MARVEL!

Containers Do Not Need Network Stacks

Configuring and Operating a Hybrid Cloud with Microsoft Azure Stack

[Docker] Containerization

CloudOpen Europe 2013 SYNNEFO: A COMPLETE CLOUD STACK OVER TECHNICAL LEAD, SYNNEFO

Flip the Switch to Container-based Clouds

Simplified CICD with Jenkins and Git on the ZeroStack Platform

Red Hat Roadmap for Containers and DevOps

Container Adoption for NFV Challenges & Opportunities. Sriram Natarajan, T-Labs Silicon Valley Innovation Center

Operating System Structure

Exam C Foundations of IBM Cloud Reference Architecture V5

Operating System Architecture. CS3026 Operating Systems Lecture 03

20537A: Configuring and Operating a Hybrid Cloud with Microsoft Azure Stack

Let s say that hosting a cloudbased application is like car ownership

Oracle Application Container Cloud

Deploying Cloud Network Services Prime Network Services Controller (formerly VNMC)

Threads. What is a thread? Motivation. Single and Multithreaded Processes. Benefits

STATE OF MODERN APPLICATIONS IN THE CLOUD

IoT and Edge Computing. Satyam Vaghani VP & GM, IOT & AI

Cloud & container monitoring , Lars Michelsen Check_MK Conference #4

Application Access to Persistent Memory The State of the Nation(s)!

Murray Goldschmidt. Chief Operating Officer Sense of Security Pty Ltd. Micro Services, Containers and Serverless PaaS Web Apps? How safe are you?

Exam Questions

Industrial Transformation calling for Next Generation of Cloud

Copyright 2012, Oracle and/or its affiliates. All rights reserved.

Oracle Linux, Virtualization & OEM12 Discussion Sahil Mahajan / Sundeep Dhall

CONTAINERS AND MICROSERVICES WITH CONTRAIL

CS 300 Leftovers. CS460 Pacific University 1

IBM Bluemix compute capabilities IBM Corporation

AWS Lambda. 1.1 What is AWS Lambda?

System Call. Preview. System Call. System Call. System Call 9/7/2018

VMWARE PIVOTAL CONTAINER SERVICE

Why Microsoft Azure is the right choice for your Public Cloud, a Consultants view by Simon Conyard

Datacenter Management and The Private Cloud. Troy Sharpe Core Infrastructure Specialist Microsoft Corp, Education

Containerization Dockers / Mesospere. Arno Keller HPE

DISTRIBUTED SYSTEMS Principles and Paradigms Second Edition ANDREW S. TANENBAUM MAARTEN VAN STEEN. Chapter 3 Processes

Data Center and Cloud Automation

Full Scalable Media Cloud Solution with Kubernetes Orchestration. Zhenyu Wang, Xin(Owen)Zhang

Real4Test. Real IT Certification Exam Study materials/braindumps

Operating System: Chap2 OS Structure. National Tsing-Hua University 2016, Fall Semester

Chapter 2: Operating-System Structures

FIVE REASONS YOU SHOULD RUN CONTAINERS ON BARE METAL, NOT VMS

Feature Comparison Summary

3 Software Stacks for IoT Solutions. Ian Skerrett Eclipse

6.033 Spring Lecture #6. Monolithic kernels vs. Microkernels Virtual Machines spring 2018 Katrina LaCurts

Red Hat Atomic Details Dockah, Dockah, Dockah! Containerization as a shift of paradigm for the GNU/Linux OS

DEPLOYING NFV: BEST PRACTICES

API s in a hybrid world. Date 28 September 2017

Logging, Monitoring, and Alerting

Build your own Cloud on Christof Westhues

Deploying and Operating Cloud Native.NET apps

Kubernetes made easy with Docker EE. Patrick van der Bleek Sr. Solutions Engineer NEMEA

Chapter 2: Operating-System Structures. Operating System Concepts 9 th Edition

Introduction to Virtualization and Containers Phil Hopkins

IBM Power Systems: Open innovation to put data to work Dexter Henderson Vice President IBM Power Systems

Go Faster: Containers, Platforms and the Path to Better Software Development (Including Live Demo)

Windows Server Windows Server Windows Server 2008

Chapter 2. Operating-System Structures

Using DC/OS for Continuous Delivery

Windows Server System Center Azure Pack

Citrix Workspace Cloud

Qt for Device Creation

Docker and HPE Accelerate Digital Transformation to Enable Hybrid IT. Steven Follis Solutions Engineer Docker Inc.

How Container Runtimes matter in Kubernetes?

Serverless Computing: Design, Implementation, and Performance. Garrett McGrath and Paul R. Brenner

Java Architectures A New Hope. Eberhard Wolff

Defining Security for an AWS EKS deployment

SAMPLE CHAPTER. Marko Lukša MANNING

Simplify Software Integration for FPGA Accelerators with OPAE

Introduction to containers

Cisco Container Platform

COS 318: Operating Systems

Akraino & Starlingx: A Technical Overview

Beyond 1001 Dedicated Data Service Instances

Containers for NFV Peter Willis March 2017

2018 Cisco and/or its affiliates. All rights reserved.

vbranch Introduction and Demo

Cisco Virtualized Infrastructure Manager

Transcription:

Making Cloud Easy: Design Considerations and First Components of a Distributed Operating System for Cloud James Kempf for the Nefele Development Team ER RACSP July 09, 2018 James Kempf ER RACSP 2018-07-09

3 rd generation cloud design principles Datacenter is abstracted as Single System All cloud services are fully distributed Datacenter capacity organically adapts and self-configures All resources belong to a single resource manager Different communication mechanisms for different distances Set of communication abstractions structured along a latency gradient

Differences from previous SSI work No distributed kernel Single server Linux kernel used as a modular system component Provides virtual memory and thread management Other operating system services like process creation and resource management are distributed No distributed shared virtual memory or distributed thread management Processes get only as much virtual memory as on a single machine and only as many threads as the local Linux OS can provide Monolithic applications are not distributed automatically as SSI suggests Applications need to be designed modularly for distribution Our work: SSI for Cloud Native Image source from top to bottom: https://www.zerostack.com/cloud-native-apps-icons-hover/ http://www.iet.unipi.it/a.bechini/concur/concur.html

Addressing pain points in existing cloud platforms OpenStack Remove lower bound on minimum unit of deployment without impacting scale up potential Processes rather than VM images for execution context Remove need for programmer to deal with virtual networks and infrastructure programming Reduce the administration effort, simplify upgrade and maintenance Kubernetes Add native tenant management to improve potential for resource and capacity sharing Replace networking with IPC to simplify service interface and improve performance Overall Reduce the number of execution context layers Improve co-ordination between layers

DC policy and management Tenant management Integrated analytics Making 3 rd generation cloud real Implemented now: Single System Image for easy development and deployment of applications Automatic resource federation for easy setup of datacenter resources Message bus for structured, fast control plane messaging Blockchain based tenant management with smart contracts Remainder of talk DC boot Development environments for cloud native applications Application runtime 3 rd Generation Cloud APIs Compute Messaging* Under development: Serverless application runtime with fast key-value store for state handling Single touch installation, configuration and upgrade Acceleration through domain specific hardware Network Storage Accelerator Linux Linux Linux Linux Linux * Together with and

DC policy and management Tenant management Integrated analytics Nefele SDK The SSI abstraction is fronted to the developer through an SDK library Structured like the Linux libc OS Syscall library SDK library is libnefele C library for now (Python and Golang under test) All API calls block until the call completes Currently implemented functions Process management (creation, deletion, etc.) Legacy Filesystem uses legacy Linux API Linux libc calls are also available for single machine operations Development environments for cloud native applications Application runtime 3 rd Generation Cloud APIs Compute Messaging Network Storage Accelerator Linux Linux Linux Linux Linux

Hermoðr Stack Processes commuicate through mailboxes Implemented Messages in Google protobuf format Three addressing schemes: Random addresses similar to TIPC ports Assigned and distributed locationindependent numbered services from TPIC Symbolic string based names In process High performance RDMA Unreliable messaging Reliable messaging Protocols

Estimating π using Monte Carlo simulation Generate points uniformly distributed at random over a 1x1 square. Keep track of points inside a circle with radius 0.5 and points outside Blue is outside, red is inside Area of circle is πr 2 = π / 4 Divide N inner by N total should give an estimate of π/4 Multiply by 4 for π

Sample application MonteCarlo simulation to estimate π. nefele_n_spawn(): spawn as many as tenant has cpu capacity Montecarlo Worker Worker Worker nefele_spawn() HTTP POST Web monitor Updates π and error Parallell simulations to rapidly increase precision. hrmd_new(): connect to Hermodr message bus Hermodr message bus: printer@{8080,1,1} Printer Store π Distributed shared folder (/data/pi)

Monte Carlo main program int main(int argc, char* argv[]) { ( ) char* worker = "/usr/bin/montecarlo_worker"; char* printer = "/usr/bin/montecarlo_printer"; nefele_spawn(&pid_printer, printer, p_arg, ); ( ) do { ( ) nefele_n_spawn(pids, new_req, worker, w_arg, ); nefele_waitpid(0, &status, 0); } while (pending > 0 completed < N1); Control Plane } nefele_kill(pid_printer, 9); printf( );

Monte Carlo worker program int main(int argc, char* argv[]) { ( ) msg.t = monte_carlo_count_pi(n2); msg.n = N2; ( ) } if (hrmd_service_exists(hrmd, "printer@{8080,1,1}", 10000)) { hrmd_message(hrmd, "printer@{8080,1,1}", (const byte*) &msg, sizeof(msg)); } ( ) Parallel task worker

Assumptions, Open Issues, and Feedback Assumptions Designing cloud management software as a system to remove the need for infrastructure programming and simplify networking will: Radically reduce the amount of work a developer needs to go through to develop and deploy applications into a cloud Radically reduce the amount of work needed to manage cloud infrastructure Partially this will be due to being able to design management automation in from the start rather than having to patch it in later Open Issues Will the Single System approach work without distributed shared virtual memory and distributed thread management? Can our approach better support Kubernetes and other Cloud Native Foundation technologies and serverless programming than running on Openstack or directly on Linux? Can our approach better support emerging application classes like the RiseLab Ray AI application framework? Feedback IPC v.s. IP for communication and exposing latency to the developer? How to incorporate policy? Management of storage?

Single System Image for Cloud Native: Key Ideas Use processes instead of VMs for execution context Use IPC instead of IP for communication locally, expose latency to the developer to allow them to compensate in their designs Design cloud systems management to be automated rather than automate it after designed