Comprehensive Kernel Instrumentation via Dynamic Binary Translation Peter Feiner Angela Demke Brown Ashvin Goel University of Toronto 011
Complexity of Operating Systems 012
Complexity of Operating Systems Growth in code size Palix, ASPLOS 2011 Many new drivers! 012
Complexity of Operating Systems Growth in code size Palix, ASPLOS 2011 Many new drivers! More swearing Vidar Holen, 2012 Swear Count 200 150 100 50 $^&@!%&%! *&#@ $%!^%* penguin 1.0 2.1.2 2.3.36 2.6.12 2.6.24 2.6.36 012
Complexity of Operating Systems Growth in code size Palix, ASPLOS 2011 Many new drivers! More swearing Vidar Holen, 2012 Bugs are inevitable! Swear Count 200 150 100 50 Need coping strategies 2 01 $^&@!%&%! *&#@ $%!^%* penguin 1.0 2.1.2 2.3.36 2.6.12 2.6.24 2.6.36
Tools would be nice Awesome tools for user code Memcheck Program Shepherding Use Dynamic Binary Translation (DBT) Rewrite binaries as they execute No need for source s make building DBT tools easy DynamoRIO, Valgrind, Pin No framework for OS code 013
Our 014
Our Ported DynamRIO to Linux kernel Runs on bare metal 014
Our Ported DynamRIO to Linux kernel Runs on bare metal Port took 18 Months 014
Our Ported DynamRIO to Linux kernel Runs on bare metal Port took 18 Months Built OS debugging tools in 5 days Heap debugging Use after free Heap corruption Stack overflow monitor 014
Our Ported DynamRIO to Linux kernel Runs on bare metal Port took 18 Months Built OS debugging tools in 5 days Heap debugging Use after free Heap corruption Stack overflow monitor Practical One author ran system on her desktop for 1 month 014
What About Hypervisors? 015
What About Hypervisors? VMWare can use DBT on guests No instrumentation API 015
What About Hypervisors? VMWare can use DBT on guests No instrumentation API PinOS has instrumentation API PinOS = Pin + Xen Guest needs emulated devices Useless for most driver code 015
What About Hypervisors? VMWare can use DBT on guests No instrumentation API Drivers Sound 6% Net 6% FS 9% PinOS has instrumentation API PinOS = Pin + Xen Guest needs emulated devices Useless for most driver code 47% Other 6% Arch 25% Palix, ASPLOS 2011 015
What About Hypervisors? VMWare can use DBT on guests No instrumentation API Drivers Sound 6% Net 6% FS 9% PinOS has instrumentation API PinOS = Pin + Xen Guest needs emulated devices Useless for most driver code 47% Other 6% Arch 25% Palix, ASPLOS 2011 So add DBT to a hypervisor with pass-through devices? Then you d have the problems we show you how to solve... problems with interrupts! 015
Overview 1. Boot normally User Code 2. Take over OS Code 3. JIT the OS x86 instrumented x86 Interrupts Exceptions 016
Overview 1. Boot normally User Code 2. Take over OS Code 3. JIT the OS x86 instrumented x86 Interrupts Exceptions 016
Overview 1. Boot normally User Code 2. Take over OS Code Instrumented OS Code 3. JIT the OS x86 instrumented x86 Interrupts Exceptions 016
Dynamic Instrumentation User Code OS Code Instrumented OS Code Interrupts Exceptions 017
Dynamic Instrumentation User Code Code Cache OS Code Interrupts Exceptions 017
OS Code Dynamic Instrumentation User Code Code Cache bb2 bb3 Interrupts Exceptions 017
OS Code Dynamic Instrumentation User Code Code Cache bb2 bb3 017
OS Code Dynamic Instrumentation User Code Code Cache bb2 bb3 System call example is entry point 017
OS Code Dynamic Instrumentation User Code Code Cache bb2 bb3 System call example is entry point 017
OS Code Dynamic Instrumentation User Code Code Cache bb2 bb3 System call example is entry point 017
OS Code Dynamic Instrumentation User Code Code Cache bb2 bb3 System call example is entry point 017
OS Code Dynamic Instrumentation User Code Code Cache bb2 bb3 System call example is entry point 017
OS Code Dynamic Instrumentation User Code Code Cache bb2 bb3 System call example is entry point 017
OS Code Dynamic Instrumentation User Code Code Cache bb2 bb3 System call example is entry point 017
OS Code Dynamic Instrumentation User Code Code Cache bb2 bb2 bb3 System call example is entry point 017
OS Code Dynamic Instrumentation User Code Code Cache bb2 bb2 bb3 System call example is entry point 017
OS Code Dynamic Instrumentation User Code Code Cache bb2 bb2 bb3 System call example is entry point 017
OS Code Dynamic Instrumentation User Code Code Cache bb2 bb2 bb3 System call example is entry point 017
OS Code Dynamic Instrumentation User Code Code Cache bb2 bb2 bb3 bb3 System call example is entry point 017
OS Code Dynamic Instrumentation User Code Code Cache bb2 bb2 bb3 bb3 System call example is entry point 017
OS Code Dynamic Instrumentation User Code Code Cache bb2 bb2 bb3 bb3 System call example is entry point 017
OS Code Dynamic Instrumentation User Code Code Cache bb2 bb2 bb3 bb3 System call example is entry point 017
OS Code Complications User Code Code Cache bb2 bb2 bb3 bb3 018
OS Code Complications User Code Code Cache bb2 bb2 bb3 bb3 Reentrance How do you do I/O? Can t use OS 018
OS Code Complications User Code Code Cache bb2 bb2 bb3 bb3 Reentrance How do you do I/O? Can t use OS 018 Concurrency Multiple CPUs using and building cache
OS Code Complications User Code Code Cache bb2 bb2 bb3 bb3 Reentrance How do you do I/O? Can t use OS Interrupts 018 Concurrency Multiple CPUs using and building cache
OS Code Handling Interrupts User Code Code Cache IH * iret Interrupts 019
OS Code Handling Interrupts User Code Code Cache IH * iret Interrupts 019
OS Code Handling Interrupts User Code arrival Code Cache IH * iret Interrupts 019
OS Code Handling Interrupts User Code arrival Code Cache IH * iret Interrupts 019
OS Code Handling Interrupts User Code arrival Code Cache IH * iret Interrupts What should framework do with interrupt? 019
OS Code Handling Interrupts User Code arrival Code Cache IH * iret Interrupts What should framework do with interrupt? Can it run interrupt handler IH immediately? 019
OS Code Handling Interrupts User Code arrival Code Cache IH * iret Interrupts IH What should framework do with interrupt? Can it run interrupt handler IH immediately? 019
OS Code Handling Interrupts User Code arrival Code Cache IH * iret Interrupts IH What should framework do with interrupt? Can it run interrupt handler IH immediately? 019
OS Code Handling Interrupts User Code arrival Code Cache IH * iret Interrupts IH What should framework do with interrupt? Can it run interrupt handler IH immediately? 019
OS Code Handling Interrupts User Code arrival Code Cache IH * iret Interrupts IH tail What should framework do with interrupt? Can it run interrupt handler IH immediately? 019
OS Code Handling Interrupts User Code arrival Code Cache IH * iret Interrupts IH tail What should framework do with interrupt? Can it run interrupt handler IH immediately? Problem if instrumentation isn t reentrant 019
OS Code Handling Interrupts User Code arrival Code Cache lock(l) IH * iret Interrupts IH tail What should framework do with interrupt? Can it run interrupt handler IH immediately? Problem if instrumentation isn t reentrant 019
OS Code Handling Interrupts User Code arrival Code Cache lock(l) IH * iret Interrupts lock(l) IH tail What should framework do with interrupt? Can it run interrupt handler IH immediately? Problem if instrumentation isn t reentrant 019
OS Code Handling Interrupts User Code arrival Code Cache lock(l) IH * iret lock(l) IH Interrupts What should framework do with interrupt? Can it run interrupt handler IH immediately? Problem if instrumentation isn t reentrant deadlock tail 019
OS Code Handling Interrupts User Code arrival Code Cache lock(l) IH * iret lock(l) IH What should framework do with interrupt? Can it run interrupt handler IH immediately? Problem if instrumentation isn t reentrant Need to delay Interrupts 019 deadlock tail
Delaying Interrupts arrival 01 10
Delaying Interrupts Where do we delay it until? arrival 01 10
Delaying Interrupts Where do we delay it until? Delay until end of? delivery arrival 01 10
Delaying Interrupts Where do we delay it until? Delay until end of? Avoids tail delivery arrival 01 10
Delaying Interrupts Where do we delay it until? Delay until end of? Avoids tail pending interrupt delivery arrival 01 10
Delaying Interrupts Where do we delay it until? Delay until end of? Avoids tail pending interrupt delivery arrival 01 10
Delaying Interrupts Where do we delay it until? Delay until end of? Avoids tail pending interrupt delivery arrival 01 10
Delaying Interrupts Where do we delay it until? Delay until end of? Avoids tail pending interrupt delivery arrival 01 10
Delaying Interrupts Where do we delay it until? Delay until end of? Avoids tail delivery arrival 01 10
Delaying Interrupts Where do we delay it until? Delay until end of? Avoids tail Problem if disables interrupt disabled delivery arrival 01 10
Delaying Interrupts Where do we delay it until? Delay until end of? Avoids tail Problem if disables interrupt Could be any MMIO cannot detect if enabled arrival disabled delivery 01 10
Delaying Interrupts Where do we delay it until? Delay until end of? Avoids tail Problem if disables interrupt Could be any MMIO cannot detect if enabled arrival disabled delivery 01 10
Delaying Interrupts Where do we delay it until? Delay until end of? Avoids tail Problem if disables interrupt Could be any MMIO cannot detect if enabled arrival disabled delivery delivery Must deliver before next OS instruction 01 10
Delaying Interrupts Where do we delay it until? Delay until end of? Avoids tail Problem if disables interrupt Could be any MMIO cannot detect if enabled arrival disabled delivery delivery Must deliver before next OS instruction Delay until end of instrumentation 01 10
Delaying Interrupts Where do we delay it until? Delay until end of? Avoids tail Problem if disables interrupt Could be any MMIO cannot detect if enabled arrival disabled delivery delivery Must deliver before next OS instruction Delay until end of instrumentation Still duplicates tail 01 10
How to Delay Interrupts 01 11
How to Delay Interrupts Could disable them on the CPU 01 11
How to Delay Interrupts Could disable them on the CPU push, disable 01 11
How to Delay Interrupts Could disable them on the CPU push, disable pop 01 11
How to Delay Interrupts Could disable them on the CPU push, disable pop push, disable 01 11
How to Delay Interrupts Could disable them on the CPU push, disable pop push, disable pop 01 11
How to Delay Interrupts Could disable them on the CPU push, disable pop push, disable But performance would be bad pop 01 11
How to Delay Interrupts Could disable them on the CPU push, disable pop push, disable But performance would be bad pop Instead, have framework handle it Extra overhead for interrupt, cheaper instrumentation 01 11
How to Delay Interrupts Could disable them on the CPU push, disable pop push, disable But performance would be bad pop Instead, have framework handle it Extra overhead for interrupt, cheaper instrumentation Instrumentation more frequent than interrupts Gigabit NIC sends interrupt every 100µs 100K instr. 01 11
Delaying with Patches Example: interrupt 239 arrival 01 12
Delaying with Patches Example: interrupt 239 arrival Interrupt Stack Frame Interrupts enabled... yes 01 12
Delaying with Patches Example: interrupt 239 arrival Interrupt Stack Frame Interrupts enabled... yes 01 12
Delaying with Patches Example: interrupt 239 arrival The framework 1. Patches next native instruction int 239 Interrupt Stack Frame Interrupts enabled... yes 01 12
Delaying with Patches Example: interrupt 239 arrival The framework 1. Patches next native instruction 2. Disables interrupts on iret Interrupts enabled... int 239 Interrupt Stack Frame no yes 01 12
Delaying with Patches Example: interrupt 239 arrival The framework 1. Patches next native instruction 2. Disables interrupts on iret 3. iret Interrupts enabled... int 239 Interrupt Stack Frame no yes 01 12
Delaying with Patches Example: interrupt 239 arrival The framework 1. Patches next native instruction 2. Disables interrupts on iret 3. iret Interrupts enabled... int 239 Interrupt Stack Frame no yes 01 12
Delaying with Patches Example: interrupt 239 arrival The framework 1. Patches next native instruction 2. Disables interrupts on iret 3. iret Interrupts enabled... int 239 Interrupt Stack Frame no yes 01 12
Delaying with Patches Example: interrupt 239 arrival The framework 1. Patches next native instruction 2. Disables interrupts on iret 3. iret Interrupts enabled... int 239 Interrupt Stack Frame no yes Patch Interrupt Stack Frame 01 12 Interrupts enabled... no
Delaying with Patches Example: interrupt 239 arrival The framework 1. Patches next native instruction 2. Disables interrupts on iret 3. iret Interrupts enabled... int 239 Interrupt Stack Frame no yes Patch Interrupt Stack Frame 01 12 Interrupts enabled... no
Delaying with Patches Example: interrupt 239 arrival The framework 1. Patches next native instruction 2. Disables interrupts on iret 3. iret 4. Removes patch Interrupts enabled... int 239 Interrupt Stack Frame no yes Patch Interrupt Stack Frame 01 12 Interrupts enabled... no
Delaying with Patches Example: interrupt 239 arrival The framework 1. Patches next native instruction 2. Disables interrupts on iret 3. iret 4. Removes patch 5. Enables interrupts on iret Interrupts enabled... int 239 Interrupt Stack Frame no yes 01 12 Patch Interrupt Stack Frame Interrupts enabled... yes no
Delaying with Patches Example: interrupt 239 arrival The framework 1. Patches next native instruction 2. Disables interrupts on iret 3. iret 4. Removes patch 5. Enables interrupts on iret 6. Run instrumented interrupt handler 01 12 Interrupts enabled... int 239 Interrupt Stack Frame Interrupts enabled... no yes Patch Interrupt Stack Frame yes no
Delaying with Patches Example: interrupt 239 arrival The framework 1. Patches next native instruction 2. Disables interrupts on iret 3. iret 4. Removes patch 5. Enables interrupts on iret 6. Run instrumented interrupt handler 01 12 IH Interrupts enabled... int 239 Interrupt Stack Frame Interrupts enabled... no yes Patch Interrupt Stack Frame yes no
Delaying with Patches Example: interrupt 239 arrival The framework 1. Patches next native instruction 2. Disables interrupts on iret 3. iret 4. Removes patch 5. Enables interrupts on iret 6. Run instrumented interrupt handler Okay, what s the performance? 01 12 IH Interrupts enabled... int 239 Interrupt Stack Frame Interrupts enabled... no yes Patch Interrupt Stack Frame yes no
Performance Ran framework with instruction counting tool Intel Quad Core i7 2.8Ghz, 8GB, 64-bit Ubuntu 10.10 Low application overhead JavaScript, Mozilla Kraken: 3% overhead 18% user time increase 143% system time increase Parallel Linux kernel compile: 30% overhead Overhead commensurate with OS activity How bad can this get? 01 13
Stress Test Setup Apachebench and Filebench Configured benchmarks to stress CPUs and kernel Large buffer cache - no disk I/O Many threads - lots of context switching 100% utilization - shows interrupt processing overhead nthreads data size fileserver 50 1.25 GB webserver 100 15.6 MB webproxy 100 15.6 MB varmail 16 15.6 MB Table 1. Filebench parameters concurrency level 200 Apachebench Parameters 01 14
Stress Test Results 20000 Apachebench Filebench Native Instrumented Throughput (ops/s) 15000 10000 5000 0 apachebench fileserver webserver webproxy varmail Less than 5x - Reasonable overhead for debugging tools 01 15
Stress Test Results 20000 Apachebench Filebench Native Instrumented Throughput (ops/s) 15000 10000 5000 0 apachebench fileserver webserver webproxy varmail CPU-bound IO-bound Less than 5x - Reasonable overhead for debugging tools 01 15
Summary Enables dynamic binary instrumentation of OS Makes it easy to write complex instrumentation Built useful memory checking tools Works with arbitrary devices & drivers 01 16
Questions? 01 17