Program Testing via Symbolic Execution

Similar documents
Microsoft SAGE and LLVM KLEE. Julian Cohen Manual and Automatic Program Analysis

KLEE: Effective Testing of Systems Programs Cristian Cadar

KLEE: Unassisted and Automatic Generation of High-Coverage Tests for Complex Systems Programs

Advanced Systems Security: Symbolic Execution

Constraint Solving Challenges in Dynamic Symbolic Execution. Cristian Cadar. Department of Computing Imperial College London

KLEE: Unassisted and Automatic Generation of High-Coverage Tests for Complex Systems Programs

Blanket Execution: Dynamic Similarity Testing for Program Binaries and Components

Symbolic Execution in Difficult Environments

Symbolic and Concolic Execution of Programs

Automated Software Analysis Techniques For High Reliability: A Concolic Testing Approach. Moonzoo Kim

Software Testing CS 408. Lecture 6: Dynamic Symbolic Execution and Concolic Testing 1/30/18

Qemu code fault automatic discovery with symbolic search. Paul Marinescu, Cristian Cadar, Chunjie Zhu, Philippe Gabriel

How is Dynamic Symbolic Execution Different from Manual Testing? An Experience Report on KLEE

CMSC 430 Introduction to Compilers. Fall Symbolic Execution

Baggy bounds with LLVM

Symbolic Execution. Wei Le April

Honours/Master/PhD Thesis Projects Supervised by Dr. Yulei Sui

This document provides an overview of the COPRTHR-2 Software. available in the following supporting documentation:

Overview AEG Conclusion CS 6V Automatic Exploit Generation (AEG) Matthew Stephen. Department of Computer Science University of Texas at Dallas

Access Control Lists. Don Porter CSE 506

Lecture 03 Bits, Bytes and Data Types

Testing, code coverage and static analysis. COSC345 Software Engineering

n HW7 due in about ten days n HW8 will be optional n No CLASS or office hours on Tuesday n I will catch up on grading next week!

Advances in Compilers

Hyperkernel: Push-Button Verification of an OS Kernel

Testing Error Handling Code in Device Drivers Using Characteristic Fault Injection

Call Paths for Pin Tools

The Low-Level Bounded Model Checker LLBMC

SPIN Operating System

Minion: Fast, Scalable Constraint Solving. Ian Gent, Chris Jefferson, Ian Miguel

Program Analysis. Program Analysis

Automated Unit Testing of Large Industrial Embedded Software using Concolic Testing

Prototyping Architectural Support for Program Rollback Using FPGAs

Verifying Temporal Properties via Dynamic Program Execution. Zhenhua Duan Xidian University, China

Agenda. CSE P 501 Compilers. Java Implementation Overview. JVM Architecture. JVM Runtime Data Areas (1) JVM Data Types. CSE P 501 Su04 T-1

Symbolic Execution, Dynamic Analysis

DART: Directed Automated Random Testing. CUTE: Concolic Unit Testing Engine. Slide Source: Koushik Sen from Berkeley

CMSC 631 Program Analysis and Understanding. Spring Symbolic Execution

Introduction to Supercomputing

CSE 333 Lecture 1 - Systems programming

New Java performance developments: compilation and garbage collection

ECE 7650 Scalable and Secure Internet Services and Architecture ---- A Systems Perspective

Dynamic Binary Instrumentation: Introduction to Pin

Introduction to Linux (Part II) BUPT/QMUL 2018/03/21

0x1A Great Papers in Computer Security

Compiling Techniques

CSE 333 Lecture 1 - Systems programming

Portland State University ECE 587/687. Virtual Memory and Virtualization

Bug Finding with Under-approximating Static Analyses. Daniel Kroening, Matt Lewis, Georg Weissenbacher

Lecture 9 Dynamic Compilation

Linux Operating System

Symbolic Execution. Michael Hicks. for finding bugs. CMSC 631, Fall 2017

2012 LLVM Euro - Michael Spencer. lld. Friday, April 13, The LLVM Linker

A Gentle Introduction to Program Analysis

[07] SEGMENTATION 1. 1

COS 318: Operating Systems

Memory Safety for Embedded Devices with nescheck

Systems I: Programming Abstractions

CS 471 Operating Systems. Yue Cheng. George Mason University Fall 2017

What are some common categories of system calls? What are common ways of structuring an OS? What are the principles behind OS design and

Automatic Software Verification

Pipelining, Branch Prediction, Trends

Project 5 File System Protection

CMPSC 311- Introduction to Systems Programming Module: Systems Programming

Introduction. CS 2210 Compiler Design Wonsun Ahn

JPF SE: A Symbolic Execution Extension to Java PathFinder

Test Automation. 20 December 2017

CSE351 Winter 2016, Final Examination March 16, 2016

Program Analysis and Constraint Programming

CSE 4/521 Introduction to Operating Systems

Memory Management. CSCI 315 Operating Systems Design Department of Computer Science

How Software Executes

Ufo: A Framework for Abstraction- and Interpolation-Based Software Verification

Shared Memory programming paradigm: openmp

A Trace-based Java JIT Compiler Retrofitted from a Method-based Compiler

Kampala August, Agner Fog

CS533 Concepts of Operating Systems. Jonathan Walpole

OS lpr. www. nfsd gcc emacs ls 1/27/09. Process Management. CS 537 Lecture 3: Processes. Example OS in operation. Why Processes? Simplicity + Speed

Automatic Generation of Assembly to IR Translators Using Compilers

Ricardo Rocha. Department of Computer Science Faculty of Sciences University of Porto

Scaling Symbolic Execution using Ranged Analysis

Introduction to Cygwin Operating Environment

How to Sandbox IIS Automatically without 0 False Positive and Negative

Binding and Storage. COMP 524: Programming Language Concepts Björn B. Brandenburg. The University of North Carolina at Chapel Hill

xtc Robert Grimm Making C Safely Extensible New York University

Virtual Memory 1. To do. q Segmentation q Paging q A hybrid system

Practical Considerations for Multi- Level Schedulers. Benjamin

CMPSC 311- Introduction to Systems Programming Module: Systems Programming

CSE P 501 Compilers. Java Implementation JVMs, JITs &c Hal Perkins Winter /11/ Hal Perkins & UW CSE V-1

Protection and Mitigation of Software Bug Exploitation

High-Level Language VMs

Evolution of Virtual Machine Technologies for Portability and Application Capture. Bob Vandette Java Hotspot VM Engineering Sept 2004

ENHANCED TOOLS FOR RISC-V PROCESSOR DEVELOPMENT

Threads. CS3026 Operating Systems Lecture 06

Basic Buffer Overflows

GDB Tutorial. A Walkthrough with Examples. CMSC Spring Last modified March 22, GDB Tutorial

Reserve Engineering & Buffer Overflow Attacks. Tom Chothia Computer Security, Lecture 17

CS 61C: Great Ideas in Computer Architecture CALL continued ( Linking and Loading)

Saint Louis University. Intro to Linux and C. CSCI 2400/ ECE 3217: Computer Architecture. Instructors: David Ferry

Building Binary Optimizer with LLVM

Transcription:

Program Testing via Symbolic Execution Daniel Dunbar Program Testing via Symbolic Execution p. 1/26

Introduction Motivation Manual testing is difficult Program Testing via Symbolic Execution p. 2/26

Introduction Motivation Manual testing is difficult Requires knowledge of code Program Testing via Symbolic Execution p. 2/26

Introduction Motivation Manual testing is difficult Requires knowledge of code Requires constant maintenance Program Testing via Symbolic Execution p. 2/26

Introduction Motivation Manual testing is difficult Requires knowledge of code Requires constant maintenance Random testing is ineffective Hard to deeply probe programs Program Testing via Symbolic Execution p. 2/26

Introduction Motivation Manual testing is difficult Requires knowledge of code Requires constant maintenance Random testing is ineffective Hard to deeply probe programs Symbolic execution? Automatic Can check for error conditions Can reach deep code paths Program Testing via Symbolic Execution p. 2/26

Overview KLEE: A Symbolic Virtual Machine Basic Architecture Constraint Solver Optimizations Program Testing via Symbolic Execution p. 3/26

Overview KLEE: A Symbolic Virtual Machine Basic Architecture Constraint Solver Optimizations A Modeling Environment for UNIX Program Testing via Symbolic Execution p. 3/26

Overview KLEE: A Symbolic Virtual Machine Basic Architecture Constraint Solver Optimizations A Modeling Environment for UNIX Case Study: Coreutils Over 80% coverage on majority of tools Overall coverage better than developer tests Program Testing via Symbolic Execution p. 3/26

What is Symbolic Execution? Program Testing via Symbolic Execution p. 4/26

What is Symbolic Execution? Symbolic Replace concrete program values with symbolic variables Program Testing via Symbolic Execution p. 4/26

What is Symbolic Execution? Symbolic Replace concrete program values with symbolic variables Execution Program instructions become operations on expressions Program Testing via Symbolic Execution p. 4/26

What is Symbolic Execution? Symbolic Replace concrete program values with symbolic variables Execution Program instructions become operations on expressions Also: Record branch choices in a path condition Program Testing via Symbolic Execution p. 4/26

What is Symbolic Execution? void escape(char *s, char *out) { while (*s!= 0) { char c = *s++; if (c == \\ ) c = *s++; *out++ = c; } } Program Testing via Symbolic Execution p. 5/26

What is Symbolic Execution? void escape(char *s, char *out) { while (*s!= 0) { char c = *s++; if (c == \\ ) c = *s++; *out++ = c; } } Input: A 0 s: c:? Program Testing via Symbolic Execution p. 5/26

What is Symbolic Execution? void escape(char *s, char *out) { while (*s!= 0) { char c = *s++; if (c == \\ ) c = *s++; *out++ = c; } } Input: s 0 0 s: c: path:? Program Testing via Symbolic Execution p. 5/26

What is Symbolic Execution? void escape(char *s, char *out) { while (*s!= 0) { char c = *s++; if (c == \\ ) c = *s++; *out++ = c; } } Input: s 0 0 s: c: path:? Program Testing via Symbolic Execution p. 5/26

What is Symbolic Execution? void escape(char *s, char *out) { while (*s!= 0) { char c = *s++; if (c == \\ ) c = *s++; *out++ = c; } } Input: s 0 0 s: c: path:? s 0 0 Input: s 0 0 s: c: path:? s 0 = 0 Program Testing via Symbolic Execution p. 5/26

What is Symbolic Execution? void escape(char *s, char *out) { while (*s!= 0) { char c = *s++; if (c == \\ ) c = *s++; *out++ = c; } } Input: s 0 0 s: c: s 0 path: s 0 0 Program Testing via Symbolic Execution p. 5/26

What is Symbolic Execution? void escape(char *s, char *out) { while (*s!= 0) { char c = *s++; if (c == \\ ) c = *s++; *out++ = c; } } Input: s 0 0 s: c: s 0 path: s 0 = \ Input: s 0 0 s: c: s 0 path: s 0 0 s 0 \ Program Testing via Symbolic Execution p. 5/26

What is Symbolic Execution? void escape(char *s, char *out) { while (*s!= 0) { char c = *s++; if (c == \\ ) c = *s++; *out++ = c; } } Input: s 0 0 s: c: path: 0 s 0 = \ Program Testing via Symbolic Execution p. 5/26

What is Symbolic Execution? void escape(char *s, char *out) { while (*s!= 0) { char c = *s++; if (c == \\ ) c = *s++; *out++ = c; } } Input: s 0 0 s: c: path: 0 s 0 = \ Program Testing via Symbolic Execution p. 5/26

The KLEE Architecture Based on experience with EXE system Goals Simplicity Scalability Speed Program Testing via Symbolic Execution p. 6/26

The KLEE Architecture: Simplicity Interpreter for LLVM Assembly Language Precise semantics Multiple language support Mature binary tools Program Testing via Symbolic Execution p. 7/26

The KLEE Architecture: Simplicity Interpreter for LLVM Assembly Language Precise semantics Multiple language support Mature binary tools Easy to understand Simple interpreter loop Explicit process representation Modular search & constraint solving Program Testing via Symbolic Execution p. 7/26

The KLEE Architecture: Scalability Fine-grained heap memory Object level copy-on-write Program Testing via Symbolic Execution p. 8/26

The KLEE Architecture: Scalability Fine-grained heap memory Object level copy-on-write Careful data structure selection Persistent heap, expressions, path data Compact expression representation Program Testing via Symbolic Execution p. 8/26

The KLEE Architecture: Scalability Fine-grained heap memory Object level copy-on-write Careful data structure selection Persistent heap, expressions, path data Compact expression representation Handle over 100k processes for small programs Program Testing via Symbolic Execution p. 8/26

The KLEE Architecture: Speed Constraint solving dominates run-time Program Testing via Symbolic Execution p. 9/26

The KLEE Architecture: Speed Constraint solving dominates run-time Keep constraints simple Canonicalize simple forms Cache concrete values and indices Dynamically rewrite path constraints Program Testing via Symbolic Execution p. 9/26

The KLEE Architecture: Speed Constraint solving dominates run-time Keep constraints simple Canonicalize simple forms Cache concrete values and indices Dynamically rewrite path constraints Optimize constraint solving Program Testing via Symbolic Execution p. 9/26

Solver Optimization Take advantage of the nature of symbolic execution Program Testing via Symbolic Execution p. 10/26

Solver Optimization Take advantage of the nature of symbolic execution Program decomposition Queries deal with distinct input variables Program Testing via Symbolic Execution p. 10/26

Solver Optimization Take advantage of the nature of symbolic execution Program decomposition Queries deal with distinct input variables Considerable redundancy Processes accrue constraints Static set of branches Program Testing via Symbolic Execution p. 10/26

Independent Constraint Opt. Path constraints may refer to many variables Path: s 0 = A s 1 = 0 y 0 < 100 Program Testing via Symbolic Execution p. 11/26

Independent Constraint Opt. Path constraints may refer to many variables Path: s 0 = A s 1 = 0 y 0 < 100 Individual expressions only refer to a few Branch condition: y = 10 Program Testing via Symbolic Execution p. 11/26

Independent Constraint Opt. Path constraints may refer to many variables Path: s 0 = A s 1 = 0 y 0 < 100 Individual expressions only refer to a few Branch condition: y = 10 Independent constraint optimization: Group constraints into disjoint subsets based on referenced variables Only pass dependent constraints to solver Query: y < 100 y = 10 Program Testing via Symbolic Execution p. 11/26

Counterexample Cache Take advantage of free counterexamples Program Testing via Symbolic Execution p. 12/26

Counterexample Cache Take advantage of free counterexamples Cache C {A, }, where C is a set of constraints and A a satisfying assignment Program Testing via Symbolic Execution p. 12/26

Counterexample Cache Take advantage of free counterexamples Cache C {A, }, where C is a set of constraints and A a satisfying assignment When performing a query with constraints C: Program Testing via Symbolic Execution p. 12/26

Counterexample Cache Take advantage of free counterexamples Cache C {A, }, where C is a set of constraints and A a satisfying assignment When performing a query with constraints C: 1. If (C, x) in cache, return x Program Testing via Symbolic Execution p. 12/26

Counterexample Cache Take advantage of free counterexamples Cache C {A, }, where C is a set of constraints and A a satisfying assignment When performing a query with constraints C: 1. If (C, x) in cache, return x 2. If (C, ), C C in cache, return Program Testing via Symbolic Execution p. 12/26

Counterexample Cache Take advantage of free counterexamples Cache C {A, }, where C is a set of constraints and A a satisfying assignment When performing a query with constraints C: 1. If (C, x) in cache, return x 2. If (C, ), C C in cache, return 3. If (C, A), C C in cache, return A Program Testing via Symbolic Execution p. 12/26

Counterexample Cache Take advantage of free counterexamples Cache C {A, }, where C is a set of constraints and A a satisfying assignment When performing a query with constraints C: 1. If (C, x) in cache, return x 2. If (C, ), C C in cache, return 3. If (C, A), C C in cache, return A 4. Otherwise, for each (C, A), C C in cache, speculatively evaluate C using A Program Testing via Symbolic Execution p. 12/26

Counterexample Cache Example void foo(int x) { if (x < 100)... if (x!= 50)... } Program Testing via Symbolic Execution p. 13/26

Counterexample Cache Example void foo(int x) { if (x < 100)... if (x!= 50)... } Cache: {} Program Testing via Symbolic Execution p. 13/26

Counterexample Cache Example void foo(int x) { if (x < 100)... if (x!= 50)... } Cache: {} Query: x < 100, Cache: {(x < 100, {x : 0})} Program Testing via Symbolic Execution p. 13/26

Counterexample Cache Example void foo(int x) { if (x < 100)... if (x!= 50)... } Cache: {} Query: x < 100, Cache: {(x < 100, {x : 0})} Query: x < 100 x 50 Program Testing via Symbolic Execution p. 13/26

Counterexample Cache Example void foo(int x) { if (x < 100)... if (x!= 50)... } Cache: {} Query: x < 100, Cache: {(x < 100, {x : 0})} Query: x < 100 x 50 Evaluate with {x : 0} 0 < 100 0 50 Found assignment Program Testing via Symbolic Execution p. 13/26

Optimization Results Average Time (s) 300 250 200 150 100 Base Independence Cex Cache All 50 0 0 0.2 0.4 0.6 0.8 1 Normalized Num. Instructions Program Testing via Symbolic Execution p. 14/26

Environment Modeling Testing real applications requires providing a realistic environment system libraries operating system Program Testing via Symbolic Execution p. 15/26

Environment Modeling Testing real applications requires providing a realistic environment system libraries operating system Our approach: Compile and execute using µlibc Implement UNIX systems calls in separate modeling library Program Testing via Symbolic Execution p. 15/26

Environment Modeling (2) Idea: Overlay symbolic environment onto operating system Program sees virtual symbolic resources and actual OS resources Program Testing via Symbolic Execution p. 16/26

Environment Modeling (2) Idea: Overlay symbolic environment onto operating system Program sees virtual symbolic resources and actual OS resources Implemented by routing system calls through model routines Symbolic resources are handled in model Actual resources are forwarded to OS Program Testing via Symbolic Execution p. 16/26

Environment Modeling (3) Currently model supports symbolic files and program arguments Program Testing via Symbolic Execution p. 17/26

Environment Modeling (3) Currently model supports symbolic files and program arguments Supports open(), close(), read(), write(), lseek(), stat(), fstat() as well as simple stubs for several more Program Testing via Symbolic Execution p. 17/26

Environment Modeling (3) Currently model supports symbolic files and program arguments Supports open(), close(), read(), write(), lseek(), stat(), fstat() as well as simple stubs for several more Implementation is about 800 lines of C code Program Testing via Symbolic Execution p. 17/26

Testing Process Test generation: The application is linked with µclibc and the model and run with KLEE Program Testing via Symbolic Execution p. 18/26

Testing Process Test generation: The application is linked with µclibc and the model and run with KLEE Additional arguments specify what parts of environment to make symbolic Program Testing via Symbolic Execution p. 18/26

Testing Process Test generation: The application is linked with µclibc and the model and run with KLEE Additional arguments specify what parts of environment to make symbolic KLEE generates test cases for any errors and for coverage Program Testing via Symbolic Execution p. 18/26

Testing Process (2) Test verification: Programs are linked with gcc using µlibc and stripped down model Program Testing via Symbolic Execution p. 19/26

Testing Process (2) Test verification: Programs are linked with gcc using µlibc and stripped down model Tests are replayed natively Program Testing via Symbolic Execution p. 19/26

Testing Process (2) Test verification: Programs are linked with gcc using µlibc and stripped down model Tests are replayed natively Protects against modeling and system errors Program Testing via Symbolic Execution p. 19/26

Coreutils Background GNU implementation of standard UNIX utilities Program Testing via Symbolic Execution p. 20/26

Coreutils Background GNU implementation of standard UNIX utilities Core of most Linux installations Program Testing via Symbolic Execution p. 20/26

Coreutils Background GNU implementation of standard UNIX utilities Core of most Linux installations Encompasses 90 applications of varying size and focus File system: ls, dd, chmod Numeric: factor, seq System: printenv, hostname Text Processing: sort, head, od Program Testing via Symbolic Execution p. 20/26

Coreutils Case Study All 90 applications were tested on latest public release Program Testing via Symbolic Execution p. 21/26

Coreutils Case Study All 90 applications were tested on latest public release Focused on tool front-ends Program Testing via Symbolic Execution p. 21/26

Coreutils Case Study All 90 applications were tested on latest public release Focused on tool front-ends Covers 20k combined executable lines of code Program Testing via Symbolic Execution p. 21/26

Coreutils Case Study All 90 applications were tested on latest public release Focused on tool front-ends Covers 20k combined executable lines of code Mostly automatic One source code modification required Most programs tested using a generic configuration Program Testing via Symbolic Execution p. 21/26

Results: Bugs Six bugs found One fixed in developer repository Three resulted in segmentation fault Two allowed arbitrary heap smashing ptx -F\\ abcdefghijklmnopqrstuvwxyz paste -d\\ abcdefghijklmnopqrstuvwxyz seq -f %0 1 mkdir -Z a b mkfifo -Z a b mknod -Z a b p Program Testing via Symbolic Execution p. 22/26

Results: Coverage Overall coverage: 70.6% Over 80% on 61/90 30 29 100 25 24 80 87% (37) 80% (21) 80% (14) 71% (8) 77% (3) 82% (1) 20 60 57% (1) 56% (1) 15 47% (4) 40 10 8 9 9 8 5 20 0 1 10-20% 2 20-30% 50-60% 60-70% 70-80% 80-90% 90-100% 100% 0 0-100 100-200 200-300 300-400 500-600 600-700 800-900 1100-1200 1300-1400 Executable Lines of Code Program Testing via Symbolic Execution p. 23/26

Results: Vs. Manual Developers tests overall coverage: 67.4%. Generated tests outperform on majority of tools. 100 80 Generated Tests Developer Tests Shared 60 40 20 0 Program Testing via Symbolic Execution p. 24/26

Conclusion Symbolic execution can provide an effective testing tool Program Testing via Symbolic Execution p. 25/26

Conclusion Symbolic execution can provide an effective testing tool Can scale to real applications Careful attention to scalability Take advantage of specialized domain to optimize constraint solving Program Testing via Symbolic Execution p. 25/26

Conclusion Symbolic execution can provide an effective testing tool Can scale to real applications Careful attention to scalability Take advantage of specialized domain to optimize constraint solving Requires only modest amount of effort to construct an adequate model Program Testing via Symbolic Execution p. 25/26

Conclusion Symbolic execution can provide an effective testing tool Can scale to real applications Careful attention to scalability Take advantage of specialized domain to optimize constraint solving Requires only modest amount of effort to construct an adequate model Can beat manual testing out-of-the-box Program Testing via Symbolic Execution p. 25/26

Questions? Program Testing via Symbolic Execution p. 26/26