Linux-CR: Transparent Application Checkpoint-Restart in Linux


Oren Laadan, Columbia University (orenl@cs.columbia.edu)
Linux Kernel Summit, November 2010

Application C/R
Application Checkpoint/Restart: a mechanism to save the state of running application(s) so that they can later resume execution from that point.

[Figure: a checkpoint saves the original process hierarchy to a checkpoint image; a restart recreates the restored hierarchy from that image.]

What is it good for?
- Application roll back to the past
- Application suspend and resume
- Application migration

Application rollback
- Fault tolerance
    - long-running applications
    - cloud, HPC, at work, at home
- Effective debugging
    - super-core-dump: more details, multiple tasks
    - re-run from checkpoint
    - trace, profile, and instrument
- Fast application start-up
    - from default/previous state (ccache...)
    - improve desktop boot time
- Software testing
    - repeat from specific point(s)
    - distribute on multiple hosts
- Generic time-machine
    - revive old server/desktop state
    - retry a move in a game

Application suspend/resume
- Improved OOM handling
    - suspend applications, don't kill them
    - smart swap on embedded systems
- Better system utilization
    - suspend applications to reduce load
- Suspend/resume a user's session
    - mobile desktop on a USB key
    - Linux-based VPS/VDI systems

Application migration
- Load balancing / resource sharing
    - HPC (e.g. BlueWaters project)
    - cloud environments
    - Linux-based VPS/VDI
- Zero-downtime maintenance
    - live migration of applications
- High availability
    - primary/backup in lock-step
    - frequent incremental checkpoints
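
The "frequent incremental checkpoints" idea can be sketched in a few lines: after one full checkpoint, each later checkpoint stores only what changed. The sketch below is a userspace toy, not Linux-CR code; the page size and the names `split_pages`, `incremental_checkpoint`, and `apply_checkpoint` are assumptions made for illustration, and it ignores memory that shrinks between checkpoints.

```python
from typing import Dict

PAGE_SIZE = 4  # deliberately tiny so the example stays readable (assumption)


def split_pages(memory: bytes) -> Dict[int, bytes]:
    """Split a flat memory image into numbered fixed-size pages."""
    return {i // PAGE_SIZE: memory[i:i + PAGE_SIZE]
            for i in range(0, len(memory), PAGE_SIZE)}


def incremental_checkpoint(prev: Dict[int, bytes], memory: bytes) -> Dict[int, bytes]:
    """Return only the pages that differ from the previous checkpoint."""
    current = split_pages(memory)
    return {n: page for n, page in current.items() if prev.get(n) != page}


def apply_checkpoint(base: Dict[int, bytes], delta: Dict[int, bytes]) -> Dict[int, bytes]:
    """Rebuild the latest state from a full checkpoint plus one delta."""
    merged = dict(base)
    merged.update(delta)
    return merged


full = split_pages(b"aaaabbbbcccc")                     # first, full checkpoint
delta = incremental_checkpoint(full, b"aaaaBBBBcccc")   # only page 1 changed
restored = apply_checkpoint(full, delta)
restored_bytes = b"".join(restored[n] for n in sorted(restored))
```

Here `delta` holds a single page, so a backup host would receive one page instead of three; the smaller each delta, the more often a checkpoint can be taken.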

Application vs Virtual-Machine

              Application C/R          Virtual Machine
granularity   specific applications    operating system as a whole unit
saved state   application state only   entire operating system state
overhead      none                     visible
flexibility   application awareness    operating system is black box
deployment    linux only               same arch family

Some examples
- HPC environments
    - can extend linux-cr to distributed-cr
- Cloud deployments
    - using Linux containers
- Light-weight clusters of ARMs
    - combine LXC and linux-cr

Some concrete examples
- BlueWaters
    - NCSA's most powerful supercomputer
    - checkpointing based on linux-cr
- OpenVZ
    - VPS hosting with migration capabilities
- Canonical / Ubuntu
    - add LXC & linux-cr in UEC cluster stack

Who and Who
- Who is doing it?
    - Matt Helsley, Dan Smith, Serge Hallyn, Nathan Lynch, Sukadev Bhattiprolu, me
- Who is interested?
    - IBM, Canonical, OpenVZ, HPC industry, Kerrighed, Google (?), ...
- Who else does/did it?
    - OS: AIX, OpenVZ, IRIX, Cray...
    - Systems: Moab, BLCR/Beowulf, Condor...

Linux-C/R design goals
- Transparency
    - applications are oblivious to the operation
    - allow notification of checkpoint or restart
    - allow application awareness
- Reliability
    - checkpoint succeeds
    - restart succeeds
    - report the reasons when something is not checkpointable
    - checkpoint is non-intrusive
- Security/safety
    - ptrace capabilities to checkpoint
    - reuse kernel code to reconstruct state
- Performance
    - zero impact on performance
    - reasonable code footprint
- Maintainability
    - next slide...
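
The reliability goal of reporting why something cannot be checkpointed can be pictured as a pre-flight scan that lists every blocker instead of failing on the first one. This is a toy model under stated assumptions: the resource names and the function `checkpoint_blockers` are hypothetical and are not part of the Linux-CR interface.

```python
from typing import Dict, List, Set


def checkpoint_blockers(tasks: Dict[int, Set[str]], unsupported: Set[str]) -> List[str]:
    """List one human-readable reason per unsupported resource held by a task."""
    reasons = []
    for pid in sorted(tasks):
        for resource in sorted(tasks[pid] & unsupported):
            reasons.append(f"task {pid}: {resource} is not checkpointable")
    return reasons


# Hypothetical inventory: pid -> kinds of resources the task holds.
tasks = {
    1: {"pipe", "inotify"},
    2: {"socket"},
    3: {"ptraced", "file-lock"},
}
# Hypothetical set of resource kinds the checkpoint code cannot save yet.
unsupported = {"inotify", "ptraced", "file-lock"}

reasons = checkpoint_blockers(tasks, unsupported)
for line in reasons:
    print(line)
```

Collecting all reasons in one pass lets a user fix every blocker at once; an empty list means the hierarchy is safe to checkpoint in this model.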

Maintainability
- Placement of C/R code
    - generic code in kernel/checkpoint/...
    - most c/r code lives with or near the subsystem code, so subsystem maintainers see it
    - c/r is well documented
- Extensive test-suite
    - tests a large list (>120) of scenarios
    - tests before/during/after behavior
- Positive experience so far (2.6.27 to today)
    - can ignore most kernel changes
    - mainly need to add features
    - minor changes to prior c/r code, e.g. splice/pipe, syscall numbers, mm helpers
- Impact on developers
    - understand what may affect c/r code
    - at least notify c/r people when needed
    - awareness will grow with exposure

Design Summary
- Save/restore state in-kernel
- Checkpoint container/subtree/self
- Image holds user-visible state
- Userspace image conversion
- Detailed error reporting

Checkpoint
(1) Freeze process hierarchy
(2) Save global data
(3) Save process hierarchy
(4) Save state of all tasks
(?) Filesystem snapshot
(5) Thaw/kill process hierarchy
[slide annotation: in-kernel]
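
The ordering of the numbered checkpoint steps can be sketched as a small in-memory simulation. It is only a model of the sequence, not the Linux-CR implementation: the `Task` class, the image layout, and the version field are assumptions, and the optional filesystem snapshot is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Task:
    pid: int
    parent: int                                  # 0 marks the root of the hierarchy
    state: Dict[str, int] = field(default_factory=dict)
    frozen: bool = False


def checkpoint(tasks: List[Task]) -> dict:
    """Walk the five steps and return a hypothetical checkpoint image."""
    # (1) Freeze the hierarchy so no task changes while it is being saved.
    for task in tasks:
        task.frozen = True
    try:
        image = {
            # (2) Global data: here only a made-up format version and a task count.
            "global": {"version": 1, "ntasks": len(tasks)},
            # (3) The process hierarchy as (pid, parent) pairs.
            "hierarchy": [(task.pid, task.parent) for task in tasks],
            # (4) A copy of the state of every task.
            "tasks": {task.pid: dict(task.state) for task in tasks},
        }
    finally:
        # (5) Thaw the hierarchy even if saving failed (killing is the alternative).
        for task in tasks:
            task.frozen = False
    return image


tasks = [Task(1, 0, {"pc": 10}), Task(2, 1, {"pc": 20})]
image = checkpoint(tasks)
```

Freezing first is what makes the saved image consistent: every task is captured at the same instant, and the `finally` clause keeps the application running even if a save step fails.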

Restart
(1) Create container
(?) Restore (stage) filesystem
(3) Create process hierarchy
(4) Restore state of all tasks
(5) Resume execution
[slide annotation: in-kernel]
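
The restart sequence can be modelled the same way. In this sketch a dictionary stands in for the container, the image layout is hypothetical, and the filesystem step is skipped; it only illustrates why the hierarchy must exist before per-task state is restored.

```python
from typing import Dict, List, Tuple


def restart(image: dict) -> Dict[int, dict]:
    """Rebuild tasks from a hypothetical image, following the numbered steps."""
    # (1) Create a fresh container; a plain dict plays that role here.
    container: Dict[int, dict] = {}
    # (3) Create the process hierarchy, always creating parents before children.
    pending: List[Tuple[int, int]] = list(image["hierarchy"])
    while pending:
        progressed = False
        for pid, parent in list(pending):
            if parent == 0 or parent in container:
                container[pid] = {"parent": parent, "state": {}, "running": False}
                pending.remove((pid, parent))
                progressed = True
        if not progressed:
            raise ValueError(f"tasks with missing parents: {pending}")
    # (4) Restore the saved state of every task.
    for pid, state in image["tasks"].items():
        container[pid]["state"] = dict(state)
    # (5) Resume execution only after every task is fully restored.
    for task in container.values():
        task["running"] = True
    return container


# Hypothetical image; the child is listed first to exercise the ordering logic.
image = {"hierarchy": [(2, 1), (1, 0)], "tasks": {1: {"pc": 10}, 2: {"pc": 20}}}
container = restart(image)
```

Resuming last means no task runs against a half-restored sibling, which mirrors the freeze-first ordering on the checkpoint side.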

Current State
Supported subsystems:
- tasks (threads, signals, credentials, etc.)
- namespaces (all but mounts-ns)
- sysvipc (shm, msg, sem)
- files, dirs (regular, fifos/pipes, epoll, event, simple devices)
- sockets (unix, ipv4, ipv6)
- security (smack, selinux labels)

Current State
What's missing:
- [reviewed] file locks, leases, owner
- [reviewed] unlinked files/dirs
- [wip] fanotify/inotify/dnotify
- [wip] mounts, mount-ns
- /proc filesystem
- ptraced tasks
- more devices

Current State
Supported architectures:
- x86-32
- x86-64
- s390x
- PowerPC
- ARM

Current State
Code: ~23K lines in total
- ~1200 lines documentation
- ~600 lines per arch (x5)
- ~8K lines in kernel/checkpoint/* (base)
- ~7K lines near-place files (*/checkpoint.c)
- ~2K lines in-place save/restore
- ~2K lines for LSM
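
As a quick sanity check, the per-component figures do add up to the quoted total. The snippet treats each approximate figure as an exact number, which is an assumption; the dictionary labels are just shorthand for the slide items.

```python
# Approximate line counts from the slide, taken at face value.
components = {
    "documentation": 1200,
    "per-arch code (5 architectures x ~600)": 5 * 600,
    "kernel/checkpoint/* (base)": 8000,
    "near-place files (*/checkpoint.c)": 7000,
    "in-place save/restore": 2000,
    "LSM": 2000,
}

total = sum(components.values())
print(total)  # 23200, consistent with "~23K lines in total"
```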

Discussion...
- Concrete path to mainline (mm/next?)
- Exposure to subsystem maintainers?
- Image format tied to kernel version (userspace conversion tools)

Many thanks to those who reviewed, tested, and provided suggestions!

Web page: http://www.linux-cr.org/
Git tree(s): git://www.linux-cr.org/git/
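
One way to picture the userspace conversion tools mentioned above is a chain of small converters, each upgrading an image by one format version. Everything in this sketch is hypothetical: the version numbers, the field names, and the `convert` helper are illustrations, not the actual Linux-CR image format or tooling.

```python
from typing import Callable, Dict


def v1_to_v2(image: dict) -> dict:
    """Pretend format v2 renamed the 'procs' field to 'tasks'."""
    out = dict(image)
    out["tasks"] = out.pop("procs")
    out["version"] = 2
    return out


def v2_to_v3(image: dict) -> dict:
    """Pretend format v3 added a 'namespaces' field with an empty default."""
    out = dict(image)
    out.setdefault("namespaces", [])
    out["version"] = 3
    return out


# Each entry upgrades an image from the keyed version to the next one.
CONVERTERS: Dict[int, Callable[[dict], dict]] = {1: v1_to_v2, 2: v2_to_v3}


def convert(image: dict, target: int) -> dict:
    """Upgrade an image step by step until it reaches the target version."""
    if image["version"] > target:
        raise ValueError("downgrading an image is not supported in this sketch")
    while image["version"] < target:
        step = CONVERTERS.get(image["version"])
        if step is None:
            raise ValueError(f"no converter from version {image['version']}")
        image = step(image)
    return image


old_image = {"version": 1, "procs": [1, 2]}
new_image = convert(old_image, target=3)
```

Keeping conversion in userspace, as the slide suggests, would let the kernel read and write only its current format while older images are upgraded offline.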