LAVA: Large-scale Automated Vulnerability Addition

LAVA: Large-scale Automated Vulnerability Addition

Engin Kirda, Andrea Mambretti, Wil Robertson (Northeastern University); Brendan Dolan-Gavitt (NYU Tandon); Patrick Hulin, Tim Leek, Frederick Ulrich, Ryan Whelan (MIT Lincoln Laboratory)

This Talk

In this talk, we explore how to automatically add large numbers of bugs to programs. Why would we want to do this?
- Computer programs don't have enough bugs
- We want to put backdoors in other people's programs

Vulnerability Discovery

Finding vulnerabilities in software automatically has been a major research and industry goal for the last 25 years, in both academia and industry:
- Fuzzing (1989): "An Empirical Study of the Reliability of UNIX Utilities" by Barton P. Miller, Lars Fredriksen, and Bryan So
- KLEE (2008): "KLEE: Unassisted and Automatic Generation of High-Coverage Tests for Complex Systems Programs" by Cristian Cadar, Daniel Dunbar, and Dawson Engler (Stanford University)
- Driller (2016): "Driller: Augmenting Fuzzing Through Selective Symbolic Execution" by Nick Stephens, John Grosen, Christopher Salls, Andrew Dutcher, Ruoyu Wang, Jacopo Corbetta, Yan Shoshitaishvili, Christopher Kruegel, and Giovanni Vigna (UC Santa Barbara)

Does this work??

Debugging the Bug Finders

Lots of work claims to find bugs in programs, but a lack of ground truth makes it very difficult to evaluate these claims. If Coverity finds 22 bugs in my program, is that good or bad? What are the false positive and false negative rates?

Existing Test Corpora

Some bug corpora already exist, but they have many problems:
- Synthetic (small) programs
- Don't always have triggering inputs
- Fixed size, so tools can overfit to the corpus

What About Real Vulnerabilities?

Real vulnerabilities with proof-of-concept exploits are essentially what we want, but there just aren't that many of them, and finding new ones is expensive! (Forbes, 2012)

Debugging the Bug Finders

Existing corpora are fixed size and static, so it's easy to optimize to the benchmark. Instead, we would like to automatically create bug corpora: take an existing program and automatically add new bugs to it. Then we can measure how many of our bugs a tool finds to estimate the effectiveness of bug-finders.

Goals

We want to produce bugs that are:
- Plentiful (we can put 1000s into a program easily)
- Distributed throughout the program
- Accompanied by a triggering input
- Manifest only for a tiny fraction of inputs
- Likely to be security-critical

Sounds Simple... But It's Not

Why not just change all the strncpys to strcpys?
- That breaks most programs for every input, so the bugs are trivial to find
- We won't know how to trigger the bugs, so it's hard to prove they're "real" and security-relevant
- The same applies to most local, random mutations

Our Approach: DUAs

We want to find parts of the program's input data that are:
- Dead: not currently used much in the program (i.e., we can set them to arbitrary values)
- Uncomplicated: not altered very much (i.e., we can predict their value throughout the program's lifetime)
- Available in some program variable

These properties try to capture the notion of attacker-controlled data. If we can find such DUAs, we can add code to the program that uses this data to trigger a bug.

New Taint-Based Measures

How do we find out which data is dead and uncomplicated? Two new taint-based measures:
- Liveness: a count of how many times some input byte is used to decide a branch
- Taint compute number (TCN): a measure of how much computation has been done on some data

Dynamic Taint Analysis

We use dynamic taint analysis to understand the effect of input data on the program. Our taint analysis requires some specific features:
- A large number of labels available
- Taint tracked as label sets
- Whole-system and fast (enough)

Our open-source dynamic analysis platform, PANDA (https://github.com/moyix/panda), provides all of these. For example, given c = a + b with a tainted by labels {w,x} and b by {y,z}, c becomes tainted by {w,x,y,z}.
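
To make the label-set propagation concrete, here is a minimal sketch of the idea (not PANDA's actual implementation, which shadows every byte of the emulated machine; the tainted_t type and t_add helper are illustrative names). Label sets are represented as bitmasks, and every binary operation unions its operands' sets:

    #include <stdint.h>
    #include <stdio.h>

    /* Shadow taint state: bit k set means "depends on input byte k".
       PANDA tracks full label sets per byte; a bitmask is the minimal
       illustration for a handful of labels. */
    typedef struct { int32_t val; uint32_t labels; } tainted_t;

    /* Binary ops propagate the union of the operands' label sets. */
    static tainted_t t_add(tainted_t a, tainted_t b) {
        tainted_t c = { a.val + b.val, a.labels | b.labels };
        return c;
    }

    int main(void) {
        tainted_t a = { 3, (1u << 0) | (1u << 1) };  /* a: {w,x} */
        tainted_t b = { 4, (1u << 2) | (1u << 3) };  /* b: {y,z} */
        tainted_t c = t_add(a, b);                   /* c: {w,x,y,z} */
        printf("c = %d, labels = 0x%x\n", (int)c.val, (unsigned)c.labels);
        return 0;
    }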

Taint Compute Number (TCN)

[Slide: a five-line code example over inputs a, b, and n]

TCN measures how much computation has been done on a variable at a given point in the program.
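
The slide's code did not survive transcription, but the TCN rule itself is simple: input data starts at TCN 0, a plain copy leaves TCN unchanged, and each arithmetic operation produces a result whose TCN is one more than the maximum of its operands'. A minimal sketch of that rule (shadow_t and s_add are illustrative names, not LAVA's API):

    #include <stdint.h>
    #include <stdio.h>

    /* Illustrative shadow state: each value carries its TCN, the depth
       of computation applied to it since it was read from input. */
    typedef struct { int32_t val; uint32_t tcn; } shadow_t;

    static uint32_t max_u32(uint32_t a, uint32_t b) { return a > b ? a : b; }

    /* Each arithmetic op yields a TCN one greater than its deepest
       operand; a plain copy would keep TCN unchanged. */
    static shadow_t s_add(shadow_t a, shadow_t b) {
        shadow_t c = { a.val + b.val, max_u32(a.tcn, b.tcn) + 1 };
        return c;
    }

    int main(void) {
        shadow_t a = { 5, 0 }, b = { 7, 0 };  /* fresh input: TCN 0 */
        shadow_t s = s_add(a, b);             /* TCN 1 */
        for (int i = 0; i < 3; i++)
            s = s_add(s, b);                  /* TCN 2, 3, 4 */
        printf("val = %d, tcn = %u\n", (int)s.val, (unsigned)s.tcn);
        return 0;
    }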

Liveness

[Slide: the same five-line example, with b in input bytes {0..3}, n in bytes {4..7}, and a in bytes {8..11}]

Bytes     Liveness
{0..3}    0
{4..7}    n
{8..11}   1

Liveness measures how many branches use each input byte.
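
A minimal sketch of the liveness accounting implied by the slide's table, assuming the usual setup (b in bytes {0..3} never decides a branch, the loop bound n in bytes {4..7} is tested once per iteration, and a in bytes {8..11} decides one branch); branch_on is an illustrative helper, not PANDA's API:

    #include <stdint.h>
    #include <stdio.h>

    /* liveness[k] counts how many branch decisions depended on input
       byte k. */
    static uint64_t liveness[12];

    /* Called at each conditional branch with the byte range tainting
       the branch condition. */
    static void branch_on(size_t lo, size_t hi) {
        for (size_t k = lo; k <= hi; k++) liveness[k]++;
    }

    int main(void) {
        int n = 5;                /* stands in for bytes {4..7} */
        for (int i = 0; i < n; i++)
            branch_on(4, 7);      /* loop condition uses n: n times */
        branch_on(8, 11);         /* one branch uses a */
                                  /* b never decides a branch: stays 0 */
        for (size_t k = 0; k < 12; k += 4)
            printf("bytes {%zu..%zu}: liveness %llu\n", k, k + 3,
                   (unsigned long long)liveness[k]);
        return 0;
    }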

Attack Point (ATP)

An Attack Point (ATP) is any place where we may want to use attacker-controlled data to cause a bug. Examples: pointer dereference, data copying, memory allocation, ... In the current LAVA implementation we just modify pointer dereferences to cause buffer overflows.

Approach: Overview

1. Clang: instrument the source with taint queries
2. PANDA record: run the instrumented program on an input corpus
3. PANDA replay + taint analysis: find attacker-controlled data (DUAs) and attack points
4. Clang: inject a bug into the program source, then compile and test with the modified input to confirm its effects

The output is a set of injectable bugs.

LAVA Bugs

Any (DUA, ATP) pair where the DUA occurs before the attack point is a potential bug we can inject. By modifying the source to add a new data flow from the DUA to the attack point, we can create a bug: DUA + ATP = bug.

LAVA Bug Example

PANDA taint analysis shows that bytes 0-3 of buf on line 115 of src/encoding.c are attacker-controlled (dead and uncomplicated):

    encoding.c:115: } else if (looks_extended(buf, nbytes, *ubuf, ulen)) {

PANDA also shows that line 365 of readcdf.c reads from a pointer; if we modify the pointer value, we will likely cause a bug in the program:

    readcdf.c:365: if (cdf_read_header(&info, &h) == -1)

Injecting the bug means adding a new data flow from the attacker-controlled data to the corruptible pointer.

LAVA Bug Example

    // encoding.c:
    } else if (({int rv = looks_extended(buf, nbytes, *ubuf, ulen);
        if (buf) {
            int lava = 0;
            lava |= ((unsigned char *)buf)[0];
            lava |= ((unsigned char *)buf)[1] << 8;
            lava |= ((unsigned char *)buf)[2] << 16;
            lava |= ((unsigned char *)buf)[3] << 24;
            lava_set(lava);
        }; rv; })) {

    // readcdf.c:
    if (cdf_read_header((&info) +
            (lava_get()) * (0x6c617661 == (lava_get()) ||
                            0x6176616c == (lava_get())),
            &h) == -1)

When the input file data that ends up in buf is 0x6c617661 (ASCII "lava"; the second constant, 0x6176616c, accepts the value in the opposite byte order), the multiplier becomes 1 and we add 0x6c617661 to the pointer info, causing an out-of-bounds access.
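
The lava_set/lava_get pair that ferries the DUA value from encoding.c to the attack point in readcdf.c is not shown on the slide; in the LAVA implementation it is essentially a per-bug global variable, along these lines (a sketch, not the exact generated code):

    #include <stdint.h>

    /* Data channel used to siphon the DUA to the attack point. */
    static uint32_t lava_val;

    void lava_set(uint32_t v) { lava_val = v; }
    uint32_t lava_get(void)   { return lava_val; }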

Evaluation: How Many Bugs?

TABLE I: LAVA injection results for open source programs of various sizes

Name     Version   Src Files   Lines of C   N(DUA)   N(ATP)   Potential Bugs   Validated Bugs   Yield   Inj Time (sec)
file     5.22      19          10809        631      114      17518            774              38.7%   16
readelf  2.25      12          21052        3849     266      276367           1064             53.2%   354
bash     4.3       143         98871        3832     604      447645           192              9.6%    153
tshark   1.8.2     1272        2186252      9853     1037     1240777          354              17.7%   542

We ran four open-source programs, each on a single input, and generated candidate bugs. Because validating all potential bugs would take too long, we instead validated a random sample of 2000 per program; Yield is the fraction of that sample that proved to be a real, triggerable bug. Extrapolating from the yield numbers, a single run gives us roughly 200,000 real bugs (e.g., tshark: 1240777 potential bugs × 17.7% yield ≈ 220,000).

Evaluation: What Influences Yield?

Yield as a function of maximum TCN (m_TCN, rows) and maximum liveness (m_liv, columns):

m_TCN \ m_liv   [0, 10)   [10, 100)   [100, 1000)   [1000, +inf]
[0, 10)         51.9%     22.9%       17.4%         11.9%
[10, 100)       0         0           0             -
[100, +inf]     0         -           -             -

- TCN strongly affects yield: no bugs involving TCN greater than 10 were usable
- Liveness has a weaker correlation with yield: even fairly live data can sometimes be used if TCN is low

Evaluation: Can Tools Find Them?

We took two open-source bug-finding tools and tried to measure their success at finding LAVA bugs:
- A coverage-guided fuzzer (FUZZER)
- A symbolic execution and constraint-solving tool (SES)

(Actual names withheld since this is just a preliminary study.)

Results: Specific-Value Bugs

Unique bugs found, by tool:

Program   Total Bugs   FUZZER   SES   Combined
uniq      28           7        0     7
base64    44           7        9     14
md5sum    57           2        0     2
who       2136         0        18    18
Total     2265         16       27    41

Less than 2% of the injected bugs were found.

Results: Range-Triggered Bugs

Fraction of bugs found, by trigger range (2^0 through 2^28) and for knob-and-trigger (KT) bugs:

Tool     2^0   2^7   2^14   2^21   2^28   KT
FUZZER   0     0     9%     79%    75%    20%
SES      8%    0     9%     21%    0      10%

Evaluation: Realism

The burning question in everyone's mind now: are these bugs realistic? This is hard to measure, in part because realism is not a well-defined property! Our evaluation looks at:
- How injected bugs are distributed in the program
- What proportion of the trace has normal data flow

Ultimately, the best test of realism will be whether these bugs help bug-finding software get better.

Results: Realism

[Fig. 10: histogram of normalized DUA trace locations, I(DUA). Fig. 11: histogram of normalized ATP trace locations, I(ATP). Both plot frequency against position in the execution trace.]

Limitations and Caveats

General limitations:
- Some types of vulnerabilities (e.g., weak crypto bugs) probably can't be injected using this method
- More work is needed to see if these bugs can improve bug-finding software

Implementation limits:
- Currently only works on C/C++ programs on Linux
- Only injects buffer overflow bugs
- Works only on source code

Future Work

- A continuous online competition to encourage self-evaluation
- Use in security competitions like Capture the Flag, to re-use and construct challenges on the fly
- Improve and assess the realism of LAVA bugs
- More types of vulnerabilities (use after free, command injection, ...)
- More interesting effects (prove exploitability!)

Conclusions

- Presented a new technique that can quickly inject massive numbers of bugs
- Demonstrated that current tools are not very good at finding these bugs
- If these bugs prove to be good stand-ins for real-world vulnerabilities, we can get huge, on-demand bug corpora

Questions?