Live Kernel Patching status update. Jiri Kosina SUSE Labs


Live Kernel Patching status update Jiri Kosina <jiri.kosina@suse.com> SUSE Labs

Outline
- Why?
- History + current state
- Missing features / further development

Why live patching?
- Huge cost of downtime:
  - Hourly cost >$100K for 95% of enterprises (ITIC)
  - $250K - $350K for a day in a worldwide manufacturing firm (TechTarget)
- The goal is clear: reduce planned downtime

Why live patching? Change management
Common tiers of change management:
1. Incident response: "We are down, actively exploited"
2. Emergency change: "We could go down, are vulnerable"
   -> Live Kernel Patching
3. Scheduled change: "Time is not critical, we keep safe"

(History: 1940s)

History: 2008-2014
- 2008: ksplice
  - Originally a university research open-source project
  - stop_machine() + stack inspection for assuring consistency
  - Automatic patch generation using binary object comparison
  - Acquired by Oracle in 2011, source closed
  - Commercially deployed for the Oracle Linux distribution
- 2014: kpatch (Red Hat)
  - Built on a similar principle (stopping the kernel and inspecting a snapshot of all existing processes)
  - Automatic patch generation
  - Deployed as a tech preview for Fedora
- 2014: kgraft (SUSE)
  - Immediate patching with convergence to a fully patched state ("lazy migration")
  - Consistency model: a kernel/userspace boundary crossing is considered a checkpoint
  - Issue: long-sleeping processes, kthreads
  - Manual patch creation with the help of the toolchain
  - Commercially deployed; ~80 patches distributed to date

History: 2008-2014
- Checkpoint/restart-based solution (CRIU)
  - Completely different approach in principle:
    1. Checkpoint userspace
    2. kexec new kernel
    3. Restore userspace
  - Pro: allows exchanging the complete kernel, no matter the nature of the changes
  - Con: hardware is reinitialized
  - Con: not immediate at all

[Figure: Lazy migration]

History: 2015
- Live patching session at LPC in Düsseldorf
  - Technical presentations of competing projects and discussion of future direction
  - Mutual agreement on attempting to merge just one unified thing upstream
- The agreed plan:
  - Start with a very minimalistic base and have that merged
  - Start porting (and combining) competing solutions on top of it, cherry-picking good ideas step by step
- Base (just function redirection + API) merged into Linus' tree in Feb 2015

History: 2015 - now
- New features being gradually added to CONFIG_LIVEPATCH
- Combined (hybrid) consistency model of kgraft + kpatch
  - Lazy migration by default, stack examination for long-sleeping processes/kthreads
- Extending the arch support beyond x86 (s390, ppc64, (arm64))
- objtool + ORC unwinder implemented by Josh Poimboeuf (reliance on FP-based stack checking has performance implications, DWARF unavailable)
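The hybrid model above can be illustrated with a small userspace simulation. This is only a sketch under stated assumptions: every name in it (struct task, cross_boundary(), try_switch_by_stack(), transition_complete()) is hypothetical and is not kernel API, and a task's stack is modeled as a plain array of function names. It shows the two ways a task can move to the patched state: lazily when it crosses the kernel/userspace boundary, or through a stack check when it sleeps in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model; none of these names are kernel API. */
enum patch_state { UNPATCHED = 0, PATCHED = 1 };

#define MAX_STACK 8

struct task {
    const char *name;
    enum patch_state state;
    const char *stack[MAX_STACK]; /* function names currently on the stack */
    size_t depth;                 /* number of valid entries in stack[] */
};

/* Return true if any to-be-patched function is on the task's stack. */
bool stack_has_patched_func(const struct task *t,
                            const char *const *patched, size_t n)
{
    for (size_t i = 0; i < t->depth; i++)
        for (size_t j = 0; j < n; j++)
            if (strcmp(t->stack[i], patched[j]) == 0)
                return true;
    return false;
}

/* kgraft-style checkpoint: the task crosses the kernel/userspace boundary,
 * so it holds no kernel frames and can be switched unconditionally. */
void cross_boundary(struct task *t)
{
    assert(t != NULL);
    t->depth = 0;
    t->state = PATCHED;
}

/* kpatch-style check for tasks that stay in the kernel: switch the task
 * only if no to-be-patched function is on its stack. */
bool try_switch_by_stack(struct task *t,
                         const char *const *patched, size_t n)
{
    if (t->state == PATCHED)
        return true;
    if (stack_has_patched_func(t, patched, n))
        return false;
    t->state = PATCHED;
    return true;
}

/* The transition is finished once every task runs the patched code. */
bool transition_complete(const struct task *tasks, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (tasks[i].state != PATCHED)
            return false;
    return true;
}
```

In this model a long sleeper whose stack does not contain a patched function is switched by try_switch_by_stack(), while a task that is still inside a patched function has to wait until it reaches cross_boundary().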

Patch Generation
- Patches are created almost entirely by hand (for upstream CONFIG_LIVEPATCH at least)
- The source of the patch is a single C file
  - Easy to review, easy to maintain in a VCS like git
- Add fixed functions
- Create a list of functions to be replaced
- Issue a call to the kernel livepatching API
- Compile
- Insert as a .ko module
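The first three steps above (fixed function, list of replacements, one call to enable the patch) can be sketched in plain userspace C. This is a model, not a loadable livepatch module: the symbol table stands in for the kernel's ftrace-based redirection, and struct patch_func, struct symbol, enable_patch(), call_symbol() and the clamp_len example are all hypothetical names made up for illustration. Only the field names old_name and new_func are borrowed from the kernel's struct klp_func.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*func_t)(int);

/* One replacement entry: which function to replace, and with what. */
struct patch_func {
    const char *old_name;
    func_t new_func;
};

/* Stand-in for the redirection machinery: name -> current implementation. */
struct symbol {
    const char *name;
    func_t impl;
};

/* The original, buggy function: it forgets to enforce an upper bound. */
int clamp_len_orig(int len) { return len; }

/* Step 1: the fixed function. */
int clamp_len_fixed(int len) { return len > 64 ? 64 : len; }

/* Step 2: the list of functions to be replaced, ended by an empty entry. */
struct patch_func patch_funcs[] = {
    { "clamp_len", clamp_len_fixed },
    { NULL, NULL },
};

struct symbol *find_symbol(struct symbol *syms, size_t nsyms, const char *name)
{
    for (size_t i = 0; i < nsyms; i++)
        if (strcmp(syms[i].name, name) == 0)
            return &syms[i];
    return NULL;
}

/* Step 3: enable the patch. Returns 0 on success, -1 if any listed
 * function is unknown; in that case nothing is redirected. */
int enable_patch(struct symbol *syms, size_t nsyms,
                 const struct patch_func *funcs)
{
    /* First pass: check every entry so the patch applies all-or-nothing. */
    for (const struct patch_func *f = funcs; f->old_name; f++)
        if (!find_symbol(syms, nsyms, f->old_name))
            return -1;
    /* Second pass: redirect each listed function. */
    for (const struct patch_func *f = funcs; f->old_name; f++)
        find_symbol(syms, nsyms, f->old_name)->impl = f->new_func;
    return 0;
}

/* Callers always go through the table, so they pick up the redirection. */
int call_symbol(struct symbol *syms, size_t nsyms, const char *name, int arg)
{
    struct symbol *s = find_symbol(syms, nsyms, name);
    assert(s != NULL);
    return s->impl(arg);
}
```

With a table holding { "clamp_len", clamp_len_orig }, calling clamp_len through call_symbol() returns the unclamped value before enable_patch() and the clamped one afterwards, without the caller changing.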

Limitations and missing features
- Inability to deal with data structure / semantics changes
  - State format transformation
  - Needed for more complex fixes
- Lazy state transformation?
  - New functions able to work with both old and new data format
  - After lazy migration of the code is complete, start transforming data structures on access
- Shadow variables?
  - Associating a new field with the existing structure (can be used by patched callers)
  - Currently being implemented by Joe Lawrence
- State also contains exclusive access mechanisms
  - Spinlocks, mutexes
  - Converting those without a deadlock is an unsolved problem
- Patch callbacks?
  - Allows for arbitrary fixup during patch application phases
  - [too] powerful, has to be used with care
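The shadow variable idea can be sketched as a side table keyed by the address of the existing object plus a field identifier, so the object's layout never changes. This is a minimal userspace model under stated assumptions: shadow_get(), shadow_get_or_alloc(), shadow_free(), struct conn and SHADOW_ERR_COUNT are hypothetical names for illustration, not the kernel implementation, and a real implementation would also need locking and a faster lookup than a linked list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model of shadow variables. */
struct shadow {
    const void *obj;     /* existing object the new field is attached to */
    unsigned long id;    /* identifies which shadow field this is */
    void *data;          /* storage for the new field */
    struct shadow *next;
};

struct shadow *shadow_list = NULL;

/* Look up the shadow field for (obj, id); NULL if none is attached. */
void *shadow_get(const void *obj, unsigned long id)
{
    for (struct shadow *s = shadow_list; s; s = s->next)
        if (s->obj == obj && s->id == id)
            return s->data;
    return NULL;
}

/* Return the existing shadow field, or attach a new zeroed one. */
void *shadow_get_or_alloc(const void *obj, unsigned long id, size_t size)
{
    assert(size > 0);
    void *data = shadow_get(obj, id);
    if (data)
        return data;
    struct shadow *s = malloc(sizeof(*s));
    if (!s)
        return NULL;
    s->data = calloc(1, size);
    if (!s->data) {
        free(s);
        return NULL;
    }
    s->obj = obj;
    s->id = id;
    s->next = shadow_list;
    shadow_list = s;
    return s->data;
}

/* Detach and release the shadow field, e.g. when the object is freed. */
void shadow_free(const void *obj, unsigned long id)
{
    for (struct shadow **p = &shadow_list; *p; p = &(*p)->next) {
        struct shadow *s = *p;
        if (s->obj == obj && s->id == id) {
            *p = s->next;
            free(s->data);
            free(s);
            return;
        }
    }
}

/* Example: an existing structure whose layout must not change. */
struct conn { int fd; };
#define SHADOW_ERR_COUNT 1UL

/* A "patched" function that needs a per-connection error counter.
 * Returns the updated count, or -1 if the shadow field cannot be allocated. */
int conn_record_error(struct conn *c)
{
    int *count = shadow_get_or_alloc(c, SHADOW_ERR_COUNT, sizeof(*count));
    if (!count)
        return -1;
    return ++*count;
}
```

Unpatched code keeps using struct conn as before; only patched callers that know the identifier see the extra counter, and the patched free path has to call shadow_free() to avoid leaking it.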

Limitations and missing features
- Model/consistency verification
  - Is the change/fix still within the consistency model? (other traps: static variables in a function scope, patching schedule(), ...)
  - Currently done by human reasoning: error-prone and time-consuming
  - Patch author guide with best practices: https://github.com/dynup/kpatch/blob/master/doc/patch-author-guide.md
- Patch creation tooling
  - Patches affected by combinatorial explosion (function inlining, ABI breakage by the compiler (-fipa-ra))
  - Linking (/ patching relocations) to avoid excessive usage of kallsyms lookup
  - manual/kbuild/asmtool, klp-convert, kpatch-build
  - A lot of things could be detected automatically
- Kprobes
  - Transferring a kprobe to a new implementation of the function is non-trivial
- Extending arch coverage
  - FTRACE_WITH_REGS
  - objtool + reliable stack unwinding
  - Small livepatching glue code

Limitations and missing features
- Inability to patch hand-written ASM
  - No fentry, ftrace not aware
  - Should be easy in principle
- Userspace patching
  - Different problem in principle: harder to define a checkpoint for consistency
  - Kernel is very gcc-centric, userspace not so much
  - Initial efforts:
    - https://github.com/joe-lawrence/linux-inject
    - https://github.com/virtuozzo/nsb