Red Hat Summit 2009 Rik van Riel


The Turtle And The Hare: A Tale of Two Kernels
Rik van Riel, Senior Software Engineer, Red Hat
September 3, 2009

BUT WAIT, THERE'S MORE...

The Turtle And The Hare: A Tale of Three Kernels

A Tale of Two Kernels
Comparing the RHEL and upstream kernels
How fast is fast? How slow is slow?
The flow of the code
What can be included in RHEL updates?
Backporting
Upstream features that could be in RHEL 6
How does the Fedora kernel fit in?
Conclusions

RHEL kernel vs upstream kernel
Upstream kernel
  Goal: allow development to move forward, so Linux can be everything to everyone
  Fast moving, cutting edge
RHEL kernel
  Goal: provide a stable base for people to run their applications on
  Slow moving
  Users want no changes, except for..., so compromises need to be made

How fast is fast? How slow is slow?
Kernel changesets over the past year or so:
  September 2009: RHEL 5.4, 972 changesets
  June 2009: 2.6.30, 13008 changesets
  March 2009: 2.6.29, 12526 changesets
  January 2009: RHEL 5.3, 986 changesets
  December 2008: 2.6.28, 9784 changesets
  October 2008: 2.6.27, 11371 changesets
  July 2008: 2.6.26, 10487 changesets
  May 2008: RHEL 5.2
  April 2008: 2.6.25, 12707 changesets

Regressions
Upstream kernel
  Typically 30-70 regressions found from one kernel version to the next
  Typically more than half (the serious and easy-to-fix ones) get fixed before the next release
  Regressions do not stand in the way of progress
RHEL kernel
  Typically 5-10 kernel regressions are found in QA between releases
  All are release blockers
  Better to revert a patch than to allow a regression

Numbers recap
In a similar amount of time:
RHEL kernel:
  Less than 1000 changesets
  5-10 regressions, all release blockers
Upstream kernel:
  25000-30000 changesets
  2-3 releases with 30-70 regressions per release, more than half of which get fixed before the next release
  2.6.31-rc5 has 76 regressions, 28 unsolved as of Aug 10
  2.6.30 still has 39 unresolved regressions
  2.6.31 will probably have some regressions from 2.6.30, 2.6.29 and even 2.6.28

The flow of the code
Selected changes are backported
Red Hat has an upstream first policy
[Diagram: the upstream kernel line runs 2.6.0 → 2.6.9 → 2.6.18 → 2.6.30; RHEL 4 branched off at 2.6.9 and RHEL 5 at 2.6.18, with selected changes flowing back from upstream into both]

What can be included in RHEL updates
Security updates
Bug fixes
New hardware support
  More during the early lifetime of a RHEL distribution, much less later on
Select new features

What can not go in RHEL updates
Big, risky changes to the code
  Cannot afford any regressions in RHEL updates
Changes that break the kernel ABI (kABI)
  Would break compatibility with 3rd-party and customer-built kernel modules
Features that are not upstream
  Would not want a feature in RHEL 4 that does not exist in RHEL 5 and will not exist in RHEL 6
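
As a quick aside (not from the slides), a small userspace sketch of why the kABI constraint matters: a kernel module compiled against one structure layout bakes the field offsets and structure size into its binary, so adding a field in a later update silently breaks that module. The struct names below are invented for illustration.

```c
/* Illustration only: the struct names are hypothetical, not real kernel types. */
#include <stdio.h>
#include <stddef.h>

struct widget_v1 {              /* layout a 3rd-party module was built against */
        unsigned long flags;
        void *private_data;
};

struct widget_v2 {              /* the same struct after a careless update */
        unsigned long flags;
        int new_counter;        /* new field shifts everything after it */
        void *private_data;
};

int main(void)
{
        printf("v1: size=%zu, private_data at offset %zu\n",
               sizeof(struct widget_v1),
               offsetof(struct widget_v1, private_data));
        printf("v2: size=%zu, private_data at offset %zu\n",
               sizeof(struct widget_v2),
               offsetof(struct widget_v2, private_data));
        /* A module still using the v1 offsets would now read the wrong memory,
         * which is why RHEL works around data structure changes instead of
         * changing the layout. */
        return 0;
}
```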

Upstream first
Features must be upstream before they are accepted into RHEL
  Feature is automatically included in future RHEL versions (not necessarily the same implementation)
  The feature has a maintainer
  Upstream developers have also reviewed the code
  Millions of upstream users are a good QA crowd
  The entire Linux community benefits from the feature

What is backporting?
Backporting is:
  Taking code from a newer kernel (upstream) and making it work on an older kernel (RHEL)
To minimize risk, backport the minimal amount of code necessary
To facilitate backporting of future bugfixes to the code, keep the code as similar as possible to the upstream version

The pitfalls of backporting
Backporting is not without its challenges:
  The desired code may depend on infrastructure that is not present in the older target kernel
  The kABI cannot change; required changes to data structures often cannot be made and have to be worked around
  The desired code may depend on new behavior in other existing code
  Virtualization: must remain compatible with hosts and guests of a different version
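
To make the "missing infrastructure" point concrete, here is a hedged sketch of one common backporting pattern: guard a small local copy of a helper behind a kernel version check, so the backported code above it can stay identical to upstream. Using kstrndup() and the 2.6.23 cutoff here is an illustrative assumption on my part, not an example taken from the talk.

```c
/* Sketch of a backport compatibility shim; the version cutoff is approximate. */
#include <linux/version.h>
#include <linux/slab.h>
#include <linux/string.h>

#if LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 23)
/* Older kernels lack kstrndup(); provide a minimal local copy so the
 * backported code that calls it does not have to diverge from upstream. */
static char *kstrndup(const char *s, size_t max, gfp_t gfp)
{
        size_t len = strnlen(s, max);
        char *buf = kmalloc(len + 1, gfp);

        if (buf) {
                memcpy(buf, s, len);
                buf[len] = '\0';
        }
        return buf;
}
#endif
```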

Examples of backported features
Lots of new hardware support
  New drivers mostly backported wholesale, to get the same code that was tested upstream
Ext4 and eCryptfs filesystems
KVM virtualization
Virtualization IOMMU (VT-d) support
Nested page table support
zSeries SHARED_KERNEL support
Generic Receive Offload network support

Upstream features that may be in RHEL 6
Things that could not and should not be backported:
  CFS CPU scheduler
  Tickless kernel
  Lockless page cache
  Split LRU VM
  Resource control (cgroups)

CFS CPU scheduler
CFS: Completely Fair Scheduler
Calculates priority purely by CPU use, instead of using heuristics like the O(1) scheduler
Fewer corner cases, more deterministic
Merged upstream as part of the realtime effort
Comes with better SMP and NUMA balancing code
Slightly better SMP scalability and throughput
Implemented by Ingo Molnar, Peter Zijlstra, Thomas Gleixner
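
For readers who want the idea in code, here is a heavily simplified, userspace-only sketch of the CFS principle (loosely inspired by kernel/sched_fair.c, not a copy of it): every task accumulates weighted "virtual runtime" for the CPU it actually used, and the scheduler always picks the runnable task with the smallest vruntime. The real scheduler keeps tasks in a red-black tree and derives weights from nice levels; the array and numbers below are just for demonstration.

```c
#include <stdio.h>

struct task {
        const char *name;
        unsigned long weight;           /* from the nice level in real CFS */
        unsigned long long vruntime;    /* weighted CPU time consumed so far */
};

/* Pick the runnable task that has received the least virtual runtime. */
static struct task *pick_next_task(struct task *tasks, int nr)
{
        struct task *best = &tasks[0];

        for (int i = 1; i < nr; i++)
                if (tasks[i].vruntime < best->vruntime)
                        best = &tasks[i];
        return best;
}

int main(void)
{
        struct task tasks[] = {
                { "cpu_hog",     1024, 0 },
                { "interactive", 1024, 0 },
        };

        /* Simulate a few decisions: the task that used less CPU so far keeps
         * getting picked, with no interactivity heuristics involved. */
        for (int tick = 0; tick < 5; tick++) {
                struct task *t = pick_next_task(tasks, 2);
                unsigned long long ran = (t == &tasks[0]) ? 4 : 1;  /* ms used */

                t->vruntime += ran * 1024 / t->weight;
                printf("tick %d: ran %s, vruntime now %llu\n",
                       tick, t->name, t->vruntime);
        }
        return 0;
}
```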

Tickless kernel
Get rid of the periodic timer interrupt
Instead, dynamically program the system timer to go off at the next scheduled event
No timer tick while idle
  Good for power saving
  Even better for virtualization
Timer can go off at exactly the right time
  nanosleep() wakeups more precisely timed
  Micro-accounting of CPU time used by processes
Implemented by Ingo Molnar, Thomas Gleixner
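
As a small illustration of the "timer can go off at exactly the right time" point (my own example, not from the slides): with high-resolution, tickless timers, a clock_nanosleep() wakeup lands close to the requested absolute deadline instead of being rounded up to the next periodic tick, so the measured lateness is usually far smaller than on an old 100 Hz or 250 Hz ticking kernel. On older glibc this needs -lrt when linking.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <time.h>

int main(void)
{
        struct timespec deadline, now;

        clock_gettime(CLOCK_MONOTONIC, &deadline);
        deadline.tv_nsec += 1500000;            /* wake up 1.5 ms from now */
        if (deadline.tv_nsec >= 1000000000L) {
                deadline.tv_nsec -= 1000000000L;
                deadline.tv_sec += 1;
        }

        /* Sleep until an absolute deadline rather than for a duration. */
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &deadline, NULL);

        clock_gettime(CLOCK_MONOTONIC, &now);
        long long late_ns =
                (long long)(now.tv_sec - deadline.tv_sec) * 1000000000LL +
                (now.tv_nsec - deadline.tv_nsec);
        printf("woke up %lld ns after the requested deadline\n", late_ns);
        return 0;
}
```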

Lockless page cache
Multiple CPUs can look up pages from the same file simultaneously
  Simultaneous access has always been possible, but lookups were serialized
Especially useful for glibc, database shared memory segments, etc.
Also speeds up single-threaded performance
Implemented by Nick Piggin
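
For the curious, a very simplified sketch of what "lockless" means here, loosely modeled on the find_get_page() path in mm/filemap.c of that era; the retry logic and the exact radix tree accessors are simplified and not exact. The key point is that the page cache radix tree is walked under rcu_read_lock() and the page reference is taken speculatively, instead of serializing every lookup on the mapping's tree lock.

```c
#include <linux/radix-tree.h>
#include <linux/rcupdate.h>
#include <linux/pagemap.h>

/* Simplified lookup: no locks taken, only an RCU read-side critical section. */
static struct page *lookup_page_lockless(struct address_space *mapping,
                                         pgoff_t index)
{
        struct page *page;

        rcu_read_lock();
        page = radix_tree_lookup(&mapping->page_tree, index);
        if (page && !page_cache_get_speculative(page)) {
                /* The page was being freed or reused under us; the real
                 * implementation retries the lookup at this point. */
                page = NULL;
        }
        rcu_read_unlock();
        return page;
}
```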

Split LRU VM
Split filesystem-backed, memory/swap-backed and mlocked pages onto separate LRU lists
Better targeted pageout scanning
  Different eviction policies for file-backed and memory/swap-backed pages
  Pageout code scales to larger amounts of memory
Implemented by Rik van Riel, Lee Schermerhorn, Kosaki Motohiro
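
The split itself is visible in the kernel's LRU list definitions. Below is a simplified version of the enum from include/linux/mmzone.h after this work was merged (the real definition derives the values from LRU_BASE, LRU_ACTIVE and LRU_FILE offsets), plus a helper in the spirit of the kernel's is_file_lru():

```c
/* Each memory zone keeps one list per class, so reclaim can scan anonymous
 * and file-backed pages with different policies and skip unevictable pages. */
enum lru_list {
        LRU_INACTIVE_ANON,      /* memory/swap backed, not recently used */
        LRU_ACTIVE_ANON,        /* memory/swap backed, recently used */
        LRU_INACTIVE_FILE,      /* filesystem backed, not recently used */
        LRU_ACTIVE_FILE,        /* filesystem backed, recently used */
        LRU_UNEVICTABLE,        /* mlocked etc.: never scanned for pageout */
        NR_LRU_LISTS
};

/* Eviction policy decisions key off whether a list holds file-backed pages. */
static inline int is_file_lru(enum lru_list lru)
{
        return lru == LRU_INACTIVE_FILE || lru == LRU_ACTIVE_FILE;
}
```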

Resource control (cgroups)
Kernel infrastructure to:
  Group processes
  Control resource use of each group: CPU, memory, IO (disk and network)
Can be used with a workload manager
Large upstream effort with involvement by many
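
As a rough usage sketch (not from the slides), the original cgroup interface is driven entirely through a virtual filesystem: a directory is a group, and resource knobs and task membership are plain files. The /sys/fs/cgroup/cpu mount point below is an assumption about how the cpu controller hierarchy is mounted on a given system.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Write a short string to a cgroup control file. */
static int write_file(const char *path, const char *value)
{
        FILE *f = fopen(path, "w");

        if (!f)
                return -1;
        fprintf(f, "%s", value);
        return fclose(f);
}

int main(void)
{
        char pid[32];

        /* Each directory in a cgroup hierarchy is a group of processes. */
        mkdir("/sys/fs/cgroup/cpu/demo", 0755);

        /* Give the group half of the default CPU weight of 1024. */
        write_file("/sys/fs/cgroup/cpu/demo/cpu.shares", "512");

        /* Moving a task into the group is just writing its pid to "tasks". */
        snprintf(pid, sizeof(pid), "%d", getpid());
        write_file("/sys/fs/cgroup/cpu/demo/tasks", pid);
        return 0;
}
```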

How does the Fedora kernel fit in?
Fedora release kernels are for using
Fedora Rawhide kernels are for testing
  ... but maybe not the same definitions of "using" and "testing" that enterprise users are used to
Fedora release kernels and Fedora Rawhide kernels follow different upstream kernel branches

Upstream stable kernel forks
Periodically, Linus releases a kernel
Released kernels still have bugs, so others maintain a stable kernel tree
[Diagram: the mainline series runs 2.6.28 → 2.6.29 → 2.6.30 → 2.6.31-rc; stable trees fork off each release (2.6.29.1, 2.6.29.2, 2.6.30.1); the Fedora release kernel follows a stable branch, while Fedora Rawhide tracks the 2.6.31-rc development branch]

Fedora release kernels
Kernels for released distributions are moderately cutting edge, primarily meant for using
Fedora releases follow upstream stable kernel branches
Fedora releases periodically rebase to a newer upstream stable kernel branch
The Fedora distribution often uses cutting edge kernel features
Fedora 11 and 12 are good platforms to play with features that will be in RHEL 6

The Fedora Rawhide kernel
The Rawhide (Fedora development) kernel tracks the latest changes applied to the Linux kernel development branch
  Beyond cutting edge. Bleeding edge?
A good testing ground for upstream development:
  Rawhide users help discover bugs in the upstream development kernel
  Often patches are included in the Rawhide kernel before going upstream, as a test/validation platform

Conclusions
The RHEL kernel moves much slower than upstream or Fedora, with much lower risk
The RHEL kernel gets some upstream improvements as backports
RHEL users will have to wait until the next major RHEL release for the other features
The upstream first policy ensures that all RHEL 5 features will also be in RHEL 6
Fedora is a good development & test lab, providing testing for Red Hat and upstream
