New features in AddressSanitizer. LLVM developer meeting Nov 7, 2013 Alexey Samsonov, Kostya Serebryany


Agenda
- AddressSanitizer (ASan): a quick reminder
- New features:
  - Initialization-order-fiasco
  - Stack-use-after-scope
  - Stack-use-after-return
  - Leaks
- ASan for Linux Kernel
- Misc: compile time, MPX, ASM, libs
- BOF at 2:00pm today

ASan: a quick reminder
- Dynamic testing tool, finds memory bugs
  - Buffer overflows, use-after-free
- Found 5000+ bugs everywhere, including LLVM
- Compiler instrumentation + run-time library
- ~2x slowdown
- In LLVM since 3.1
- Siblings:
  - ThreadSanitizer (TSan): data races
  - MemorySanitizer (MSan): uses of uninitialized memory

ASan: a quick reminder (cont.)
- Every 8 bytes of application memory are associated with 1 byte of shadow memory
- Redzones are created around buffers; freed memory is put into quarantine
- Shadow of redzones and freed memory is poisoned
- On every memory access, compiler-injected code checks if shadow is poisoned

ASan report example: heap-use-after-free

    int main(int argc, char **argv) {
      int *array = new int[100];
      delete [] array;
      return array[argc];  // BOOM
    }

    % clang++ -O1 -g -fsanitize=address a.cc && ./a.out
    ==30226== ERROR: AddressSanitizer heap-use-after-free
    READ of size 4 at 0x7faa07fce084 thread T0
        #0 0x40433c in main a.cc:4
    0x7faa07fce084 is located 4 bytes inside of 400-byte region
    freed by thread T0 here:
        #0 0x4058fd in operator delete[](void*) _asan_rtl_
        #1 0x404303 in main a.cc:3
    previously allocated by thread T0 here:
        #0 0x405579 in operator new[](unsigned long) _asan_rtl_
        #1 0x4042f3 in main a.cc:2

New Features

ASan report example: init-order-fiasco

    // i1.cc
    extern int B;
    int A = B;
    int main() { return A; }

    // i2.cc
    #include <stdlib.h>
    int B = atoi("123");

    % clang -g -fsanitize=address i1.cc i2.cc
    % ASAN_OPTIONS=check_initialization_order=1 ./a.out
    ==19504==ERROR: AddressSanitizer: initialization-order-fiasco
    READ of size 4 at 0x000001aaff60 thread T0
        #0 0x414fa3 in __cxx_global_var_init i1.cc:2
        #1 0x415015 in global constructors keyed to a i1.cc:5
    0x000001aaff60 is located 0 bytes inside of global variable 'B' from 'i2.cc' (0x1aaff60) of size 4

Detecting init-order-fiasco
- Frontend knows which globals are dynamically initialized
- Instrumented code registers globals

    struct __asan_global {
      void *address;
      size_t size;
      <...>
      const char *module_name;
      bool has_dynamic_initializer;
    };

    // asan.module_ctor has the highest priority.
    asan.module_ctor() {
      <...>
      __asan_register_globals(globals, n);
    }

Detecting init-order-fiasco (cont.)

    // All globals from the translation unit are
    // initialized here.
    _GLOBAL__I_a() {
      // Poison shadow memory for {uninitialized, all}
      // globals in other TUs.
      __asan_before_dynamic_init(module_name);
      __cxx_global_var_init1();
      <...>
      __cxx_global_var_initN();
      // Unpoison shadow memory for all the globals.
      __asan_after_dynamic_init();
    }

Init-order fiasco detector modes

    // Poison shadow memory for {uninitialized [1], all [2]}
    // globals in other TUs.
    __asan_before_dynamic_init(module_name);

[1] ASAN_OPTIONS=check_initialization_order=true
[2] ASAN_OPTIONS=strict_init_order=true (has false positives), e.g.:

    struct Foo {
      Foo() { if (!initialized) value = get_value(); }
      int get() {
        if (!initialized) value = get_value();
        return value;
      }
      int value;
      static bool initialized;
    };

Init-order fiasco status
- Works on Linux. OFF by default :(
- May bark on globals with no-op constructors; the user has to blacklist them.
- Still worth using:
  - Strict mode is ON by default for Google code; hundreds of errors have been fixed.
  - Good for large code bases, which are difficult to make -Wglobal-constructors-clean.
  - Finds potential errors (LTO).

ASan report example: stack-use-after-scope

    int main() {
      int *p;
      { int x = 0; p = &x; }
      return *p;
    }

    % clang -g -fsanitize=address,use-after-scope a.cc ; ./a.out
    ==15839==ERROR: AddressSanitizer: stack-use-after-scope
    READ of size 4 at 0x7fffe06c20a0 thread T0
        #0 0x46103d in main a.cc:4
    Address is located in stack of thread T0 at offset 160 in frame
        #0 0x460daf in main a.cc:1
    This frame has 4 object(s):
        [96, 104) 'p'
        [160, 164) 'x' <== Memory access at offset 160 is inside this variable

Detecting stack-use-after-scope
- Use llvm.lifetime intrinsics to generate calls to ASan runtime:

    llvm.lifetime.start(size, ptr) -> __asan_unpoison_stack_memory(ptr, size)
    llvm.lifetime.end(size, ptr)   -> __asan_poison_stack_memory(ptr, size)

Stack-use-after-scope status
- Still at a prototype stage.
- Clang doesn't yet emit llvm.lifetime intrinsics for temporaries:

    const char *s = FunctionReturningStdString().c_str();
    char c = s[0];  // BOOM.

- Need to optimize redundant calls to ASan runtime (static analysis).
- Stack-use-after-scope will be bundled with stack-use-after-return (discussed further).

ASan report example: stack-use-after-return

    int *g;
    void LeakLocal() {
      int local;
      g = &local;
    }
    int main() {
      LeakLocal();
      return *g;
    }

    % clang -g -fsanitize=address a.cc
    % ASAN_OPTIONS=detect_stack_use_after_return=1 ./a.out
    ==19177==ERROR: AddressSanitizer: stack-use-after-return
    READ of size 4 at 0x7f473d0000a0 thread T0
        #0 0x461ccf in main a.cc:8
    Address is located in stack of thread T0 at offset 32 in frame
        #0 0x461a5f in LeakLocal() a.cc:2
    This frame has 1 object(s):
        [32, 36) 'local' <== Memory access at offset 32

Stack-use-after-return instrumentation

    // Function entry
    char frame[N];
    char *fake_frame = &frame[0];
    if (__asan_option_detect_stack_uar)
      fake_frame = __asan_stack_malloc(N, frame);

    // Function exit
    if (fake_frame != frame)
      __asan_stack_free(fake_frame, N);

Stack-use-after-return allocator

    char *__asan_stack_malloc(size_t N, char *real_frame);
    void __asan_stack_free(char *fake_frame, size_t N);

- Fast thread-local malloc-like allocator
- Has quarantine for freed chunks
- Uses a fixed size mmap-ed buffer
- If allocation fails, returns the original frame

ASan report example: memory leak

    int *g = new int;
    int main() {
      g = 0;  // Lost the pointer.
    }

    % clang -g -fsanitize=address a.cc
    % ASAN_OPTIONS=detect_leaks=1 ./a.out
    ==19894==ERROR: AddressSanitizer: detected memory leaks
    Direct leak of 4 byte(s) in 1 object(s) allocated from:
        #0 0x44a3b1 in operator new(unsigned long)
        #1 0x414f66 in __cxx_global_var_init leak.cc:1

LeakSanitizer (ASan's leak detector)
- Similar to other tools: tcmalloc, valgrind, etc.
- Faster than any of those
  - No extra overhead on top of ASan at run-time
  - Small overhead at shutdown
- Based on the ASan/MSan/TSan allocator
  - Can be bundled with ASan/MSan/TSan or used as a standalone tool
  - Currently supported only in ASan or standalone
- Requires StopTheWorld() -- today Linux only

ASan/TSan/MSan allocator
- Full malloc/free API
- Thread-local caches, similar to tcmalloc
- Extra features for the tool:
  - Associate metadata with every heap chunk:
    - Stack trace of malloc/free
    - Other tool-specific metadata
  - ASan keeps metadata in the redzone
  - TSan/MSan: metadata is not adjacent to the chunk
  - Fast mapping address => chunk => metadata

ASan/TSan/MSan allocator (cont.)

Region layout (from the slide diagram):

    SizeClass0:  (invalid)
    SizeClass1:  16-byte Chunk1, 16-byte Chunk2, ...   ... MetaData2, MetaData1
    ...
    SizeClass48: 64 KB Chunk1, 64 KB Chunk2, ...       ... MetaData2, MetaData1

- Memory is allocated from a fixed address range
  - ASan: [0x600000000000, 0x640000000000) -- 4 TB
- 64 regions; each allocates its own size class
- Chunks are allocated left to right; metadata right to left
- Fast address => chunk => metadata
  - Simple arithmetic
  - Lock-free

MISC

ASan for Linux Kernel
- Has nothing to do with LLVM :(
- Our early prototype uses GCC's TSan module
  - Instrumentation is a bit different
  - Run-time is different (inside the kernel)
- Found 12 bugs already, 5 fixed!
- Want to use Clang for better instrumentation
  - Clang issues are resolved (?)
  - Still some issues in the Kernel code
- Want to test another kernel? Talk to us!

Intel MPX
- Intel MPX: Memory Protection Extensions
  - Published on July 13, HW available in ~2 years
  - Additional instructions to find buffer overflows
- Expensive instructions touch two cache lines
- Requires lots of memory
- Slow for programs with graphs, lists and trees
- Does not detect use-after-free
- Has false positives
- *Biased* comparison against ASan: goo.gl/rrhziz
- Still worth supporting in LLVM!
  - Finds intra-object buffer overflows
  - Very fast for long loops that traverse simple arrays

Compile time with ASan
- ASan and MSan create more control flow
- Some LLVM passes downstream explode
- Example: PR17409 (quadratic?)
  - llvm::SpillPlacement::addLinks
  - InlineSpiller::propagateSiblingValue

Why instrument all libs?
- ASan: stack unwinding with frame pointers
- TSan: catching synchronization via atomics
- MSan: avoid false positives
- All tools: more coverage
- Status: can build 50+ libs used by Chromium on Linux
- Help is welcome!

We also want to instrument ASM!
- MSan: avoid false positives
  - Ex.: FD_ZERO on Linux is inline asm
  - Ex.: optimized libraries (openssl, libjpeg_turbo)
- All tools: more coverage (same as libs)
- Ideas:
  - Pattern matching for simple cases
  - An MC Pass
  - Use MCLayer

Summary
- ASan keeps getting new features
  - Initialization-order-fiasco: done (Linux)
  - Stack-use-after-scope: work-in-progress
  - Stack-use-after-return: beta
  - Memory Leaks: done (Linux)
- Lots of work to do
  - Libs, ASM, Kernel, MPX, compile time
  - Better support for non-linux-x86_64