POMP: Postmortem Program Analysis with Hardware-Enhanced Post-Crash Artifacts

POMP: Postmortem Program Analysis with Hardware-Enhanced Post-Crash Artifacts
Jun Xu (1), Dongliang Mu (1, 2), Xinyu Xing (1), Peng Liu (1), Ping Chen (1), Bing Mao (2)
(1) Pennsylvania State University; (2) Nanjing University

Grim Reality
Despite intensive in-house software testing, programs inevitably contain defects. These defects cause programs to terminate accidentally or crash after deployment, and they can be exploited as security loopholes.
Software debugging is expensive and requires intensive manual effort. The cost of debugging software has risen to $312 billion per year globally [1]. Developers spend 50% of their programming time finding and fixing bugs [2].
[1] http://silvaetechnologies.eu/blg/216/debugging-software-cost-has-risen-to-$312-billion-per-year-globally
[2] https://www.roguewave.com/company/news/2013/university-of-cambridge-reverse-debugging-study
9/1/17 Presenter - Xinyu Xing

Techniques facilitating software debugging
Heavyweight techniques: program tracing; replay debugging.
Lightweight techniques: examining the stack trace in a debugger (e.g., [1]); auditing execution logs (e.g., [2]); analyzing a crash dump (e.g., [3]).

             Heavyweight tech.     Lightweight tech.
Cost         High overhead         Low overhead
Capability   High effectiveness    Low effectiveness

Figure labels: trace execution; log program states.
[1] Liblit et al. Building a Better Backtrace: Techniques for Postmortem Program Analysis. TR '02
[2] Yuan et al. Improving Software Diagnosability via Log Enhancement. ASPLOS '11
[3] Cui et al. RETracer: Triaging Crashes by Reverse Execution from Partial Memory Dumps. ICSE '16

Ultimate Goals
Automated: automate software failure diagnosis without manual effort.
Accurate: do not misidentify the root cause of a software failure.
Nonintrusive (free of runtime overhead): introduce no runtime overhead at the post-deployment stage.

POMP for Postmortem Program Analysis
POMP diagnoses a software crash using the software crash report (core dump).
POMP introduces no overhead on running software because a crash report is created only when a program terminates accidentally.

Core dump
In general, a core dump carries information such as a memory snapshot, registers, and other data (e.g., signals received).
A core dump provides only a partial chronology of how a program reached a crash site [3].
A core dump is rarely used as a source for automated software failure diagnosis.
[3] Liblit et al. Building a Better Backtrace: Techniques for Postmortem Program Analysis

Hardware-enhanced Core dump
POMP enhances a core dump through Intel Processor Trace (PT). Intel PT enables execution tracing with nearly no overhead.
A hardware-enhanced core dump unveils not only the crashing state but, more importantly, the control flow that the crashing program followed.

Figure: memory state and execution trace.
Registers (in the order the slide lists them): eax, ebx, esp; values 0x02, 0xff28, 0xff14.
Memory (in the order the slide lists them): addresses 0xff1c, 0xff18, 0xff14, 0xff10; values 0x02, 0x01, 0x00, 0xff18.
Execution trace:
A1: push ebp
A2: mov ebp, esp
A3: sub esp, 0x14
A4: mov [ebp-0xc], test
A5: lea eax, [ebp-0x10]
A6: push eax
A7: call child
A8: push ebp
A9: mov ebp, esp

Roadmap
Design overview
Technical challenge
Technical details
Evaluation in real-world crashes
Summary

Example of Program Failure Debugging
Figure labels: root cause; crash.

Overview of Automated Program Failure Diagnosis
1) Execute the instruction trace in reverse, starting from the crash site.
2) Reconstruct the data flow that the program followed prior to its crash.
3) Examine how a bad value was passed to the crash site.
Figure labels: reverse execution; data flow.

Reverse Execution: Invertible Instructions

Instruction      Value before the instruction
add eax, 10      eax - 10
sub eax, 216     eax + 216
dec eax          eax + 1
inc eax          eax - 1
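The inversion rules on this slide can be sketched in a few lines of Python. This is only an illustration, not POMP's implementation: the tuple-based trace format, the `reverse_execute` helper, and the crash-time value 0x100 are assumptions made for the example, and arithmetic is wrapped to 32 bits.

```python
# Hypothetical trace format: (opcode, register, immediate or None).
MASK32 = 0xFFFFFFFF

# Inverse of each invertible instruction listed on the slide.
INVERSES = {
    "add": lambda value, imm: value - imm,
    "sub": lambda value, imm: value + imm,
    "inc": lambda value, imm: value - 1,
    "dec": lambda value, imm: value + 1,
}


def reverse_execute(trace, crash_registers):
    """Undo `trace` (oldest instruction first), starting from the crash-time registers."""
    state = dict(crash_registers)
    for opcode, reg, imm in reversed(trace):
        state[reg] = INVERSES[opcode](state[reg], imm) & MASK32
    return state


trace = [
    ("add", "eax", 10),
    ("sub", "eax", 216),
    ("dec", "eax", None),
    ("inc", "eax", None),
]
# Assumed crash-time value of eax, chosen only for illustration.
before = reverse_execute(trace, {"eax": 0x100})
print(before["eax"])
```

Running the four instructions forward from the recovered value reproduces the assumed crash-time value, which is what makes these instructions invertible.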

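Reverse Execution: Non-invertible Instructions
mov eax, 0x5
xor ecx, ecx
Both instructions overwrite their destination register, so the value the register held before the instruction cannot be computed from the value it holds afterwards.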

Reverse Execution with Backward Data Flow Analysis
L1. mov ebx, edx
L2. mov ecx, [edi]
L3. mov [ecx], esi
L4. mov ecx, 0x77
L5. mov ebx, [eax]

Register   Value
eax        0x80
ebx        0x95
ecx        0x77
edx        0x32
edi        0x45
esi        0x22

Memory address   Value
0x22             0xE8
0x45             0x22
0x77             0xB7
0x80             0x95

Annotation: ebx == edx before the execution of L5.
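A minimal Python sketch of the backward data flow step on this slide, assuming the crash-time register values eax 0x80, ebx 0x95, ecx 0x77, edx 0x32, edi 0x45, esi 0x22. The tuple encoding of the trace and the `value_before` helper are hypothetical; POMP itself works on a full use-define chain rather than this register-only shortcut.

```python
# Hypothetical encoding: (label, destination, source); memory operands are "[reg]".
TRACE = [
    ("L1", "ebx", "edx"),
    ("L2", "ecx", "[edi]"),
    ("L3", "[ecx]", "esi"),
    ("L4", "ecx", "0x77"),
    ("L5", "ebx", "[eax]"),
]
CRASH_REGS = {"eax": 0x80, "ebx": 0x95, "ecx": 0x77,
              "edx": 0x32, "edi": 0x45, "esi": 0x22}


def value_before(reg, index):
    """Value of `reg` just before TRACE[index] executes, or None if unknown."""
    # Not redefined between this point and the crash: the crash-time value holds.
    if all(dst != reg for _, dst, _ in TRACE[index:]):
        return CRASH_REGS[reg]
    # Otherwise find the most recent earlier definition of `reg`.
    for i in range(index - 1, -1, -1):
        _, dst, src = TRACE[i]
        if dst != reg:
            continue
        if src in CRASH_REGS:          # register copy: reg equals src at that point
            return value_before(src, i)
        if src.startswith("0x"):       # immediate operand
            return int(src, 16)
        return None                    # memory source: needs alias analysis
    return None


print(hex(value_before("ebx", 4)))     # ebx right before L5
print(value_before("ecx", 3))          # ecx right before L4
```

The first query recovers ebx through L1 because edx is never redefined. The second returns None because L2 loads ecx from memory and L3 stores to memory in between, which this register-only sketch cannot resolve.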

Challenge: Memory Alias Issue
L1. mov ebx, edx
L2. mov ecx, [edi]
L3. mov [ecx], esi
L4. mov ecx, 0x77
L5. mov ebx, [eax]

Register   Value
eax        0x80
ebx        0x95 (0x32 before L5)
ecx        0x77
edx        0x32
edi        0x45
esi        0x22

Memory address   Value
0x22             0xE8
0x45             0x22
0x77             0xB7
0x80             0x95

Annotation: ecx == [edi] before the execution of L4.

Challenge: Memory Alias Issue (cont.)
L1. mov ebx, edx
L2. mov ecx, [edi]
L3. mov [ecx], esi
L4. mov ecx, 0x77
L5. mov ebx, [eax]

Register   Value
eax        0x80
ebx        0x95 (0x32 before L5)
ecx        0x77
edx        0x32
edi        0x45
esi        0x22

Memory address   Value
0x22             0xE8
0x45             0x22
0x77             0xB7
0x80             0x95

Annotations: "edi == ecx right before the execution of L4"; "Pointing to the same memory region".

Hypothesis Testing
1) Make two hypotheses for each pair of memory accesses.
2) Construct the use-define chain and extract constraints under each hypothesis.
3) Test each hypothesis using the constraints in its set.

Figure:
Use-define chain: use: R2; def: R3=[R2]; def: [R1]=R4
Hypothesis 1: [R1] and [R2] are aliases
Hypothesis 2: [R1] and [R2] are NOT aliases
Each hypothesis has its own constraint set; the constraints shown are R3=[R2] and [R1]=R4.

Hypothesis Testing (example)

Instruction          Use-define chain entries
L1. mov ebx, edx     use: edx; def: ebx=edx
L2. mov ecx, [edi]   use: edi; use: [edi]; def: ecx=[edi]
L3. mov [ecx], esi   use: ecx; use: esi; def: [ecx]=esi
L4. mov ecx, 0x77    def: ecx=0x77

Hypothesis 1 (NOT alias): ecx ≠ edi; ecx=[edi]; [ecx]=esi
Hypothesis 2 (alias)

Hypothesis Testing (example, cont.)

Instruction          Use-define chain entries
L1. mov ebx, edx     use: edx; def: ebx=edx
L2. mov ecx, [edi]   use: edi; use: [edi]; def: ecx=[edi]
L3. mov [ecx], esi   use: ecx; use: esi; def: [ecx]=esi
L4. mov ecx, 0x77    def: ecx=0x77

Hypothesis 1 (NOT alias): ecx ≠ edi; ecx=[edi]; [ecx]=esi
Hypothesis 2 (alias): ecx=edi; ecx=[edi]; [ecx]=esi

Hypothesis Testing (example, cont.)
Goal: restore the value of ecx prior to the execution of L4.
L1. mov ebx, edx
L2. mov ecx, [edi]
L3. mov [ecx], esi
L4. mov ecx, 0x77

Register   Value
eax        0x80
ebx        0x95 (0x32 before L5)
ecx        0x77 (???? before L4)
edx        0x32
edi        0x45
esi        0x22

Memory address   Value
0x22             0xE8
0x45             0x22
0x77             0xB7
0x80             0x95

Hypothesis 1 (NOT alias): ecx ≠ edi; ecx=[edi]; [ecx]=esi
Hypothesis 2 (alias): ecx=edi; ecx=[edi]; [ecx]=esi

Limitations of Hypothesis Test and Strategy (1)
Limitation: there may be insufficient evidence to reject a hypothesis, because the execution trace includes system calls. POMP does not trace execution into the OS kernel, so it may be unable to perform accurate data flow analysis.
Strategy: treat some system calls (e.g., recv / write) as a definition that intervenes in all memory access propagation, because the non-deterministic memory region they touch can potentially overlap with any memory region in user space.
Figure labels: use: [edi]; def: [ecx]; def: recv; def: ecx=[edi]
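One way to picture this strategy is a toy backward pass in Python that forgets every known memory cell when it crosses such a system call. The trace encoding, the `known_memory_before` helper, and the addresses are assumptions made for illustration; they are not taken from POMP.

```python
# System calls treated as definitions of unknown memory (examples named on the slide).
INTERVENING_SYSCALLS = {"recv", "write"}


def known_memory_before(trace, crash_memory):
    """For each trace position, return the memory cells still known before it.

    `trace` is a list of (operation, written_address) pairs, oldest first;
    `written_address` is None when the operation stores nothing we can name.
    """
    known = dict(crash_memory)
    snapshots = []
    for operation, written_address in reversed(trace):
        if operation in INTERVENING_SYSCALLS:
            known = {}                          # any cell may have been overwritten
        elif written_address is not None:
            known.pop(written_address, None)    # this store defined the cell
        snapshots.append(dict(known))
    snapshots.reverse()
    return snapshots


# Hypothetical trace and crash-time memory.
trace = [("mov", 0x10), ("recv", None), ("mov", 0x20)]
snapshots = known_memory_before(trace, {0x10: 1, 0x20: 2, 0x30: 3})
print(snapshots)
```

In this toy run, cells 0x10 and 0x30 are still known just before the final store, but nothing is known before the recv call, which mirrors the conservative treatment described on the slide.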

Limitations of Hypothesis Test and Strategy (2)
Limitation: rejecting or accepting a hypothesis requires intensive computation, because one hypothesis test can involve other hypothesis tests, which makes hypothesis testing a recursive procedure. In theory, recursive hypothesis testing could introduce a computational complexity of O(n^m).
Strategy: perform recursive hypothesis testing with a limited recursion depth (m=2 in our implementation).
Figure: use: [R1], def: [R2] (alias?); use: [R3], def: [R4] (alias?)

Limitations of Hypothesis Test and Strategy (2) (cont.)
Figure: recursion tree of hypotheses.
[R1] & [R2]: alias
    [R3] & [R4]: alias
    [R3] & [R4]: NOT alias
[R1] & [R2]: NOT alias
    [R3] & [R4]: alias
    [R3] & [R4]: NOT alias
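The effect of capping the recursion depth can be illustrated with a toy worst-case count in Python. The model is an assumption made for this example only: each of n candidate pairs is tested, and every test may trigger tests on up to n further pairs until the depth limit m is reached.

```python
def worst_case_tests(n, depth_limit, depth=1):
    """Worst-case number of hypothesis tests in the toy model (n + n^2 + ... + n^m)."""
    if depth > depth_limit:
        return 0
    # n tests at this level, each possibly triggering a nested round of tests.
    return n * (1 + worst_case_tests(n, depth_limit, depth + 1))


for m in (1, 2, 3, 4):
    print(m, worst_case_tests(10, m))
```

With n = 10, the count grows from 110 at m = 2 to 11110 at m = 4 in this model, which is the kind of growth that motivates a small fixed depth.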

Evaluation of POMP
31 program crashes resulting from real-world vulnerabilities:
stack, heap, and integer overflows; use-after-free and invalid free; null pointer dereference.
The crashing programs range from sophisticated software (e.g., GDB) to lightweight software (e.g., corehttp).

Evaluation of POMP (cont.)

Evaluation of POMP (cont.)
Observation 1: POMP marks slightly more instructions than the ones that truly contribute to the program crash.
Observation 2: Compared with the execution trace (from a fault point to the crash site), POMP significantly reduces the number of instructions that software developers and security analysts need to examine manually.

Evaluation of POMP (cont.)
Observation 3: For 2 crashes, POMP fails to include the root cause of the failure in the instruction set it identifies.
In one case, the trace contains a system call (sys_read). The system call writes a data chunk to a certain memory region, and reverse execution fails to determine the location and size of that region.
In the other case, an integer overflow traps the execution in a long-running loop, and POMP fails to include the root-cause instruction in the restored execution trace.

Evaluation of POMP (cont.)
Observation 4: For most test cases, POMP takes between seconds and tens of minutes to diagnose a program failure; for some other test cases, it can take several hours.

Summary
POMP can execute a crashing program in reverse and restore its memory footprints.
The restored memory footprints can be used for data flow construction and program failure diagnosis.
POMP reduces the code space that software developers and security analysts need to examine manually, which significantly facilitates program failure diagnosis.
POMP can handle program crashes, including those resulting from memory corruption vulnerabilities.

Thank you very much!
POMP source code is available at https://github.com/junxzm1990/pomp.git