Improving Linux Development with better tools. Andi Kleen. Oct 2013 Intel Corporation

Similar documents
Improving Linux development with better tools

Linux multi-core scalability

Identifying Memory Corruption Bugs with Compiler Instrumentations. 이병영 ( 조지아공과대학교

Debugging and Profiling

Dynamic code analysis tools

The Ephemeral Smoking Gun

Porting Linux to x86-64

DEBUGGING: DYNAMIC PROGRAM ANALYSIS

System Wide Tracing User Need

CS2141 Software Development using C/C++ Debugging

ECE/ME/EMA/CS 759 High Performance Computing for Engineering Applications

Jackson Marusarz Software Technical Consulting Engineer

Intro to Segmentation Fault Handling in Linux. By Khanh Ngo-Duy

Intel Parallel Studio XE 2017 Composer Edition BETA C++ - Debug Solutions Release Notes

Efficient and Large Scale Program Flow Tracing in Linux. Alexander Shishkin, Intel

Hunting Down Data Races in the Linux Kernel

Copyright 2015 MathEmbedded Ltd.r. Finding security vulnerabilities by fuzzing and dynamic code analysis

Debugging with gdb and valgrind

New features in AddressSanitizer. LLVM developer meeting Nov 7, 2013 Alexey Samsonov, Kostya Serebryany

Fuzzing AOSP. AOSP for the Masses. Attack Android Right Out of the Box Dan Austin, Google. Dan Austin Google Android SDL Research Team

CSC 405 Introduction to Computer Security Fuzzing

Concurrency, Thread. Dongkun Shin, SKKU

Fast dynamic program analysis Race detection. Konstantin Serebryany May

Making C Less Dangerous

Debugging. ICS312 Machine-Level and Systems Programming. Henri Casanova

Debugging. Erwan Demairy Dream

The benefits and costs of writing a POSIX kernel in a high-level language

Tools for kernel development. Jonathan Corbet LWN.net

SystemTap for Enterprise

Configurations. Make menuconfig : Kernel hacking/

Testing Error Handling Code in Device Drivers Using Characteristic Fault Injection

Debugging with GDB and DDT

Memory & Thread Debugger

RAS Enhancement Activities for Mission-Critical Linux Systems

Profiling & Optimization

Programming in C. Lecture 9: Tooling. Dr Neel Krishnaswami. Michaelmas Term

Grigore Rosu Founder, President and CEO Professor of Computer Science, University of Illinois

Linux Kernel Validation Tools. Nicholas Mc Guire Distributed & Embedded Systems Lab Lanzhou University, China

Overview of the x86-64 kernel. Andi Kleen, SUSE Labs, Novell Linux Bangalore 2004

Systems software design. Software build configurations; Debugging, profiling & Quality Assurance tools

Debugging with GDB and DDT

Comprehensive Kernel Instrumentation via Dynamic Binary Translation

Performance analysis basics

Scaling CQUAL to millions of lines of code and millions of users p.1

Allinea Unified Environment

Low level security. Andrew Ruef

Author: Steve Gorman Title: Programming with the Intel architecture in the flat memory model

Exercise Session 6 Computer Architecture and Systems Programming

Scientific Programming in C IX. Debugging

1. Allowed you to see the value of one or more variables, or 2. Indicated where you were in the execution of a program

Verification & Validation of Open Source

CSC C69: OPERATING SYSTEMS

THREADS: (abstract CPUs)

When the OS gets in the way

GDB Tutorial. A Walkthrough with Examples. CMSC Spring Last modified March 22, GDB Tutorial

High Performance Computing MPI and C-Language Seminars 2009

Oracle Developer Studio Code Analyzer

EW The Source Browser might fail to start data collection properly in large projects until the Source Browser window is opened manually.

Dynamic Dispatch and Duck Typing. L25: Modern Compiler Design

Using a debugger. Segmentation fault? GDB to the rescue!

Mental models for modern program tuning

AMD gdebugger 6.2 for Linux

Welcome. HRSK Practical on Debugging, Zellescher Weg 12 Willers-Bau A106 Tel

CS354 gdb Tutorial Written by Chris Feilbach

K-Miner Uncovering Memory Corruption in Linux

Overview. This Lecture. Interrupts and exceptions Source: ULK ch 4, ELDD ch1, ch2 & ch4. COSC440 Lecture 3: Interrupts 1

The State of Kernel Debugging Technology. Jason Wessel - Product Architect for WR Linux Core Runtime - Kernel.org KDB/KGDB Maintainer

Embedded Linux kernel and driver development training 5-day session

Important From Last Time

Using Intel VTune Amplifier XE and Inspector XE in.net environment

Page 1. Today. Important From Last Time. Is the assembly code right? Is the assembly code right? Which compiler is right?

Advanced Memory Allocation

DataCollider: Effective Data-Race Detection for the Kernel

Use Dynamic Analysis Tools on Linux

Overhead Evaluation about Kprobes and Djprobe (Direct Jump Probe)

CS61C : Machine Structures

Effective Data-Race Detection for the Kernel

ASaP: Annotations for Safe Parallelism in Clang. Alexandros Tzannes, Vikram Adve, Michael Han, Richard Latham

uftrace: function graph tracer for C/C++

Undefined Behaviour in C

Application parallelization for multi-core Android devices

CSE 374 Programming Concepts & Tools

CS 103 Lab - Party Like A Char Star

Kernel Synchronization I. Changwoo Min

Bug Hunting and Static Analysis

Disclaimer. This talk vastly over-simplifies things. See notes for full details and resources.

Project 0: Implementing a Hash Table

NetBSD/pkgsrc for the last 3 years

Using Static Code Analysis to Find Bugs Before They Become Failures

Torturing Databases for Fun and Profit

Eraser: Dynamic Data Race Detection

LinuxCon 2010 Tracing Mini-Summit

ClabureDB: Classified Bug-Reports Database

6.828: OS/Language Co-design. Adam Belay

CS/COE 0449 term 2174 Lab 5: gdb

AMD CodeXL 1.3 GA Release Notes

Profiling & Optimization

Cache Profiling with Callgrind

18-600: Recitation #3

Debugging for the hybrid-multicore age (A HPC Perspective) David Lecomber CTO, Allinea Software

Transcription:

Improving Linux Development with better tools Andi Kleen Oct 2013 Intel Corporation ak@linux.intel.com

Linux complexity growing Source lines in Linux kernel All source code 16.5 16 15.5 M-LOC 15 14.5 14 13.5 V3.6 V3.7 V3.8 V3.9 V3.10 V3.11 Kernel version M-LOC 2.5 2 1.5 1 0.5 Linux kernel source lines IO net/ fs/ block/ 0 V2.6.16V2.6.32 V3.6 V3.7 V3.8 V3.9 V3.10 V3.11 Kernel version M-LOC Source lines Linux Kernel core kernel/ lib 0.35 0.3 0.25 0.2 0.15 0.1 0.05 0 V2.6.32 V3.7 V3.9 V2.6.16 V3.6 V3.8 V3.10 V3.11 Kernel

Do we have a problem? If we assume number of bugs stays constant per line there would be more and more bugs If we assume programmers don't get cleverer some code may become too complex to change/debug Of course modularity saves us to some degree

Or we can use better tools to find bugs Static code checker tools Dynamic runtime checkers Fuzzers/test suites Debuggers/Tracers to understand code Tools to read/understand source

Static checkers sparse, smatch, coccinelle, clang checker, checkpatch, gcc -W/LTO, stanse Can check a lot of things, simple mistakes, complex problems Generic C and kernel specific rules

Static checker challenges Some are very slow False positives Often only can do new warnings Otherwise too many false positives May need concentrated effort to get false positives down Only done for gcc/sparse so far Needs both changes to Linux and to checkers

Study bug fixes At least 14.8% 24.4% of the sampled bug fixes are incorrect. Moreover, 43% of the incorrect fixes resulted in severe bugs that caused crash, hang, data corruption or security problems. How do fixes become bugs Yin/Yuan et.al. http://opera.ucsd.edu/~zyin2/fse11.pdf Great paper, every kernel programmer should read it Can new rules for static checkers help?

Cocinelle example /// Find &&/ operations that include the same argument more than once //# A common source of false positives is when the argument performs a side //# effect. @r expression@ expression E; position p; @@ ( * E@p... E * E@p &&... && E ) @script:python depends on org@ p << r.p; @@ cocci.print_main("duplicated argument to && or ",p)

Challenge: global checks No static checker I found can follow indirect calls ( OO in C, common in kernel) struct foo_ops { } int (*do_foo)(struct foo *obj); foo->do_foo(foo); Can be done by using type information Misses a lot of potential bugs

Lock ordering: lockdep Deadlock from lock ordering ( ABBA bugs) used to be common Lockdep basically eliminated this problem Checks lock ordering, interrupt

Kmemcheck / AddressSanitizer Check uninitialized/freed/out of bounds data Kmemcheck based on page faults Quite slow AddressSanitizer seems to be a better alternative Compiler instrumentation, much faster Still need port to kernel (some reports already)

Thread checkers Find data races: Shared data accesses not protected by locks User space: helgrind, ThreadSanitizer,.. Problem: kernel does not mark lock less accesses. Solvable? User: atomic_write(&foo, 1, ATOMIC...); Kernel: Foo = 1; mb();

Undefined behavior checker UBSan. New gcc/llvm feature Checks undefined C behavior at runtime e.g. x << 100, signed integer overflows, Needs special runtime library Would need to be ported to kernel

Fuzzers Trinity is a great tool Finds many bugs Needs manual model for each syscall How do we cover all the ioctls/sys/proc files? Modern fuzzers around using automatic feedback But not for kernel yet http://taviso.decsystem.org/making_software_dumber.pdf

The biggest challenge How to run all these tools on every new patch: Cannot ask every developer to use all of them Static checkers are relatively easy But can we get beyond just deltas for new code? But how to run the dynamic tools?

Test suites Ideally all kernel code would come with a test suite Then someone could run all the dynamic checkers Difficult for hardware drivers LKP, kernel unit tests, tools/* limited Need a real unit testing framework

Coverage Kernel gcov can be used to test coverage of test suites Should be used much more widely

Tracers Long beyond real men don't use debuggers Linux has good debuggers these days (kgdb etc.) But how to debug hard to reproduce bugs Ideal enough information to debug on first trigger Tracing: Low overhead instrumentation When problem triggers dump data

ftrace: function tracer Trace all functions in the kernel for PID # trace-cmd record -p function -e sched_switch -P $(pidof firefox-bin) plugin function disable all enable sched_switch path = /sys/kernel/debug/tracing/events/sched_switch/enable path = /sys/kernel/debug/tracing/events/*/sched_switch/enable path = /sys/kernel/debug/tracing/events/sched_switch/enable path = /sys/kernel/debug/tracing/events/*/sched_switch/enable Hit Ctrl^C to stop recording. # trace-cmd report firefox-bin-13822 [002] 36628.537061: function: firefox-bin-13822 [002] 36628.537062: function: firefox-bin-13822 [002] 36628.537062: function: firefox-bin-13822 [002] 36628.537062: function: firefox-bin-13822 [002] 36628.537063: function: firefox-bin-13822 [002] 36628.537063: function: firefox-bin-13822 [002] 36628.537063: function: firefox-bin-13822 [002] 36628.537064: function: firefox-bin-13822 [002] 36628.537064: function: firefox-bin-13822 [002] 36628.537065: function: firefox-bin-13822 [002] 36628.537065: function: firefox-bin-13822 [002] 36628.537065: function: firefox-bin-13822 [002] 36628.537065: function: firefox-bin-13822 [002] 36628.537066: function: All kernel functions executed sys_poll poll_select_set_timeout ktime_get_ts timekeeping_get_ns set_normalized_timespec timespec_add_safe set_normalized_timespec do_sys_poll copy_from_user might_fault _cond_resched should_resched need_resched test_ti_thread_flag

ftrace Can dump on events / oops / custom triggers But still too much overhead in many cases to run always during debug

Intel PT Upcoming Intel CPU feature Traces all branches with low overhead Will be supported in perf and gdb and with FlightRecorder

Biggest challenge is better tools to understand traces (too much data)

Understanding source code Often biggest problem finding code grep/cscope work great for many cases But do not understand indirect pointers (OO in C model used in kernel): Give me all do_foo instances struct foo_ops { int (*do_foo)(struct foo *obj); } = {.do_foo = my_foo }; foo->do_foo(foo) Would be great to have a cscope like tool that understands this based on types/initializers

Conclusion Linux has a lot of great tools for making kernel development easier We need them to control complexity But still many improvements possible Questions?