Taintscope: A Checksum-Aware Directed Fuzzing Tool for Automatic Software Vulnerability Detection

Similar documents
TaintScope: A Checksum-Aware Directed Fuzzing Tool for Automatic Software Vulnerability Detection

Checksum-Aware Fuzzing Combined with Dynamic Taint Analysis and Symbolic Execution

CSE 565 Computer Security Fall 2018

Automatic program generation for detecting vulnerabilities and errors in compilers and interpreters

Applications. Cloud. See voting example (DC Internet voting pilot) Select * from userinfo WHERE id = %%% (variable)

It was a dark and stormy night. Seriously. There was a rain storm in Wisconsin, and the line noise dialing into the Unix machines was bad enough to

Detecting Integer Overflow Vulnerabilities in Binaries. Tielei Wang Institute of Computer Science and Technology, Peking University

CSC 405 Introduction to Computer Security Fuzzing

Last week. Data on the stack is allocated automatically when we do a function call, and removed when we return

Analysis/Bug-finding/Verification for Security

CS 161 Computer Security

Unleashing D* on Android Kernel Drivers. Aravind Machiry

Memory Safety (cont d) Software Security

Guarding Vulnerable Code: Module 1: Sanitization. Mathias Payer, Purdue University

TI2725-C, C programming lab, course

ECE 471 Embedded Systems Lecture 22

18-642: Code Style for Compilers

Static Analysis and Bugfinding

Control Flow Hijacking Attacks. Prof. Dr. Michael Backes

Black Hat Webcast Series. C/C++ AppSec in 2014

OptimiData. JPEG2000 Software Development Kit for C/C++ Reference Manual. Version 1.6. from

INTRODUCTION TO EXPLOIT DEVELOPMENT

CSE484/CSE584 BLACK BOX TESTING AND FUZZING. Dr. Benjamin Livshits

Static Vulnerability Analysis

(Early) Memory Corruption Attacks

This is CS50. Harvard University Fall Quiz 0 Answer Key

COMP 3704 Computer Security

Simple Overflow. #include <stdio.h> int main(void){ unsigned int num = 0xffffffff;

Buffer overflow background

Fuzzing: Breaking software in an automated fashion

Course organization. Course introduction ( Week 1)

Using static analysis to detect use-after-free on binary code

CSSV: Towards a Realistic Tool for Statically Detecting All Buffer Overflows in C

One-Slide Summary. Lecture Outline. Language Security

Lecture Embedded System Security A. R. Darmstadt, Runtime Attacks

EURECOM 6/2/2012 SYSTEM SECURITY Σ

In-Memory Fuzzing in JAVA

Undefined Behaviour in C

CYSE 411/AIT681 Secure Software Engineering Topic #12. Secure Coding: Formatted Output

2/9/18. CYSE 411/AIT681 Secure Software Engineering. Readings. Secure Coding. This lecture: String management Pointer Subterfuge

Finding media bugs in Android using file format fuzzing. Costel Maxim Intel OTC Security

1.1 For Fun and Profit. 1.2 Common Techniques. My Preferred Techniques

Infecting the Embedded Supply Chain

Programmer s Points to Remember:

Pointers, Dynamic Data, and Reference Types

Automated Whitebox Fuzz Testing. by - Patrice Godefroid, - Michael Y. Levin and - David Molnar

finding vulnerabilities

Secure Software Development: Theory and Practice

MPATE-GE 2618: C Programming for Music Technology. Unit 5.1

Secure Programming I. Steven M. Bellovin September 28,

AET60 BioCARDKey. Application Programming Interface. Subject to change without prior notice

C++ Undefined Behavior

Introduction to software exploitation ISSISP 2017

Using Static Code Analysis to Find Bugs Before They Become Failures

Copyright 2015 MathEmbedded Ltd.r. Finding security vulnerabilities by fuzzing and dynamic code analysis

Homework 3 CS161 Computer Security, Fall 2008 Assigned 10/07/08 Due 10/13/08

Limitations of the stack

CS 62 Practice Final SOLUTIONS

INITIALISING POINTER VARIABLES; DYNAMIC VARIABLES; OPERATIONS ON POINTERS

Lecture Notes for 04/04/06: UNTRUSTED CODE Fatima Zarinni.

CYSE 411/AIT681 Secure Software Engineering Topic #10. Secure Coding: Integer Security

malloc() is often used to allocate chunk of memory dynamically from the heap region. Each chunk contains a header and free space (the buffer in which

Brave New 64-Bit World. An MWR InfoSecurity Whitepaper. 2 nd June Page 1 of 12 MWR InfoSecurity Brave New 64-Bit World

Dynamic Allocation in C

2/9/18. Readings. CYSE 411/AIT681 Secure Software Engineering. Introductory Example. Secure Coding. Vulnerability. Introductory Example.

2/9/18. CYSE 411/AIT681 Secure Software Engineering. Readings. Secure Coding. This lecture: String management Pointer Subterfuge

Rooting Routers Using Symbolic Execution. Mathy HITB DXB 2018, Dubai, 27 November 2018

CS2255 HOMEWORK #1 Fall 2012

Dynamic Software Model Checking

C++ Undefined Behavior What is it, and why should I care?

CS 241 Data Organization Binary Trees

C Arrays and Pointers

Memory and Addresses. Pointers in C. Memory is just a sequence of byte-sized storage devices.

IPS Signature Database

18-642: Code Style for Compilers

Heap Arrays and Linked Lists. Steven R. Bagley

Enhancing Memory Error Detection for Large-Scale Applications and Fuzz testing

What You Corrupt Is Not What You Crash: Challenges in Fuzzing Embedded Devices

About unchecked management SMM & UEFI. Vulnerability. Patch. Conclusion. Bruno Pujos. July 16, Bruno Pujos

CS 161 Computer Security

Recursion. What is Recursion? Simple Example. Repeatedly Reduce the Problem Into Smaller Problems to Solve the Big Problem

API 퍼징을통한취약점탐지 카이스트 차상길

Identifying Memory Corruption Bugs with Compiler Instrumentations. 이병영 ( 조지아공과대학교

Analysis of MS Multiple Excel Vulnerabilities

Fuzzing. compass-security.com 1

Preventing Use-after-free with Dangling Pointers Nullification

CSE 565 Computer Security Fall 2018

CS107 Handout 08 Spring 2007 April 9, 2007 The Ins and Outs of C Arrays

prin'() tricks DC4420 slides, Feb 2012

2 Sadeghi, Davi TU Darmstadt 2012 Secure, Trusted, and Trustworthy Computing Chapter 6: Runtime Attacks

Automotive Software Security Testing

C and C++ Secure Coding 4-day course. Syllabus

UniFinger Engine SFR300 SDK Reference Manual

C and C++: vulnerabilities, exploits and countermeasures

SoK: Eternal War in Memory

Effective Memory Protection Using Dynamic Tainting

(a) Which of these two conditions (high or low) is considered more serious? Justify your answer.

Fault Injection in System Calls

CIS 551 / TCOM 401 Computer and Network Security. Spring 2007 Lecture 2

Embedded/Connected Device Secure Coding. 4-Day Course Syllabus

Transcription:

: A Checksum-Aware Directed Fuzzing Tool for Automatic Software Vulnerability Detection Tielei Wang Tao Wei Guofei Gu Wei Zou March 12, 2014

is: A Fuzzing tool Checksum-Aware Directed

Why a new fuzzing tool? Fuzzing tools already exist. They can be sorted in two categories: Mutation based

Why a new fuzzing tool? Fuzzing tools already exist. They can be sorted in two categories: Mutation based Not very efficient

Why a new fuzzing tool? Fuzzing tools already exist. They can be sorted in two categories: Mutation based Not very efficient Cannot generate valid input if a checksum mechanism is used

Why a new fuzzing tool? Fuzzing tools already exist. They can be sorted in two categories: Mutation based Not very efficient Cannot generate valid input if a checksum mechanism is used Generation based

Why a new fuzzing tool? Fuzzing tools already exist. They can be sorted in two categories: Mutation based Not very efficient Cannot generate valid input if a checksum mechanism is used Generation based Better performances

Why a new fuzzing tool? Fuzzing tools already exist. They can be sorted in two categories: Mutation based Not very efficient Cannot generate valid input if a checksum mechanism is used Generation based Better performances Often implies having input specification, or source code. Tools exist to automatically get input format, but cannot reverse engineer checksum algorithms.

Why a new fuzzing tool? Example of input using checksum: int format int filesize int width int height... int checksum void decode_input(file * f){ int recomputed_checksum = checksum(f); int checksum_in_file = get_checksum(f); if (recomputed_checksum!= checksum_in_file) exit(); width = get_width(f);...

Contributions offers several major contributions: Checksum-aware Detect checksum test in tested program Bypass checksum test when fuzz-testing Reconstruct input with valid checksum

Contributions offers several major contributions: Checksum-aware Detect checksum test in tested program Bypass checksum test when fuzz-testing Reconstruct input with valid checksum Directed Reduces the space of parameters to mutate

Checksum-awareness Checksum-aware fuzz-testing is done in 3 steps:

Checksum-awareness Checksum-aware fuzz-testing is done in 3 steps: 1. Pre-processing: locate checksum check points in the program

Checksum-awareness Checksum-aware fuzz-testing is done in 3 steps: 1. Pre-processing: locate checksum check points in the program 2. Fuzz-testing: mutate input without touching the checksum data

Checksum-awareness Checksum-aware fuzz-testing is done in 3 steps: 1. Pre-processing: locate checksum check points in the program 2. Fuzz-testing: mutate input without touching the checksum data 3. Post-processing: for a crashing input, rebuild valid checksum

Checksum-awareness How to locate checksum test in program?

Checksum-awareness How to locate checksum test in program?

Checksum-awareness How to locate checksum test in program?

Checksum-awareness How to locate checksum test in program?

Checksum-awareness How to locate checksum test in program?

Checksum-awareness How to locate checksum test in program?

Checksum-awareness How to locate checksum test in program?

Checksum-awareness How to fuzz-test knowing the checksum-point?

Checksum-awareness How to fuzz-test knowing the checksum-point?

Checksum-awareness Using the checksum locator it is possible to: Bypass checksum test by modifying the program Test input on the modified program to find crashing cases

Checksum-awareness Using the checksum locator it is possible to: Bypass checksum test by modifying the program Test input on the modified program to find crashing cases But how to use those inputs on the real program? Need to reconstruct valid checksum

Checksum-awareness Using our previous example: int format int filesize int width int height... int checksum void decode_input(file * f){ int recomputed_checksum = checksum(f); int checksum_in_file = get_checksum(f); if (recomputed_checksum!= checksum_in_file) exit(); width = get_width(f);...

Checksum-awareness Using our previous example: int format int filesize int width int height... int checksum void decode_input(file * f){ int recomputed_checksum = checksum(f); int checksum_in_file = get_checksum(f); if (recomputed_checksum!= checksum_in_file) exit(); width = get_width(f);...

Checksum-awareness Why not use that checksum everytime instead of modifying the program? In practice, finding back the checksum is more complicated That step is too expensive to do it thousands of time

Checksum-awareness So is a checksum-aware fuzzing tool: Detects checksum tests Bypasses them for fuzz-testing Corrects input so they can work on original program

Directed fuzzing Fuzz-testing is expensive Large size of input Hundreds or thousands of bytes to mutate Very likely to generate rejected input

Directed fuzzing Directed fuzzing allows to find hot bytes in the input, which are: Are more likely to trigger bugs or crashes Are less likely to be the cause of rejected input What is a hot byte? An input byte that will be used in a security-sensitive call (such as malloc or strcpy)

Directed fuzzing How to find hot bytes? Start from a valid input Give all byte in the input a unique label Use a taint-tracer to see where the input bytes are used If an input byte is used (directly or indirectly) in a sensitive function call, this byte is a hot byte.

Directed fuzzing finds those hot bytes and focuses on them for fuzz-testing. The hot-byte detection can be done simultaneously with the checksum pre-processing step, leading to less overhead.

Evaluation and results was evaluated on real-world applications such as: Image viewer Google Picasa Adobe Acrobat Image magick Media Players MPlayer Winamp Web Browsers libtiff XEmacs

Evaluation and results First test on hot bytes identification. Application Input size # Hot bytes Run time ImageMagick (png) 5149 9 1m54s ImageMagick (jpg) 6617 11 1m13s Picasa (png) 2730 18 5m16s Acrobat (png) 770 21 3m8s Acrobat (jpg 1012 13 4m14s

Evaluation and results Second test on Checksum localization Application # points (1 st ) # points (2 nd ) Detected Picasa (png) 830 1 Yes Acrobat (png) 5805 1 Yes TCPDump (pcap) 5 2 Yes Tar 9 1 Yes In practice : Around twenty runs to find the checksum location. Done in tens of minutes.

Evaluation and results Third test on checksum reconstruction: Format # checksum size time PNG 4 4 271.9 PCAP 8 2 455.6 TAR 3 8 572.8

Evaluation and results Out of those tests, has found 27 severe vulnerabilities in common applications caused by: Buffer overflow Integer overflow Double free Null pointer dereference Infinite loop

Conclusion Only few bytes are hot in most input files, making directed fuzzing essential in fuzz testing is able to detect checksum check points in programs, and checksum fields in input is able to automatically create valid input passing the checksum check can be used on real-world application to find serious vulnerabilities

Conclusion However: cannot handle signed inputs. It can bypass the check and find vulnerabilities But cannot recreate valid input afterwards All benefits of directed fuzzing are lost when data is encrypted, as every input byte will be detected as hot. Checksum location depends heavily on the availability of well-formed and malformed inputs

Questions?