APISan: Sanitizing API Usages through Semantic Cross-checking

Similar documents
APISan: Sanitizing API Usages through Semantic Cross-Checking

Cross-checking Semantic Correctness: The Case of Finding File System Bugs

MC: Meta-level Compilation

Verification & Validation of Open Source

System Administration and Network Security

Scalable Program Analysis Using Boolean Satisfiability: The Saturn Project

UniSan: Proactive Kernel Memory Initialization to Eliminate Data Leakages

Breaking Kernel Address Space Layout Randomization (KASLR) with Intel TSX. Yeongjin Jang, Sangho Lee, and Taesoo Kim Georgia Institute of Technology

Gaps in Static Analysis Tool Capabilities. Providing World-Class Services for World-Class Competitiveness

Improving Integer Security for Systems with KINT. Xi Wang, Haogang Chen, Zhihao Jia, Nickolai Zeldovich, Frans Kaashoek MIT CSAIL Tsinghua IIIS

MEMORY MANAGEMENT TEST-CASE GENERATION OF C PROGRAMS USING BOUNDED MODEL CHECKING

DEBUGGING: DYNAMIC PROGRAM ANALYSIS

Bug Hunting and Static Analysis

K-Miner Uncovering Memory Corruption in Linux

Overview AEG Conclusion CS 6V Automatic Exploit Generation (AEG) Matthew Stephen. Department of Computer Science University of Texas at Dallas

Identifying Memory Corruption Bugs with Compiler Instrumentations. 이병영 ( 조지아공과대학교

Stephen McLaughlin. From Uncertainty to Belief: Inferring the Specification Within

Finding User/Kernel Pointer Bugs with Type Inference p.1

A Practical Approach to Programming With Assertions

Announcements. assign0 due tonight. Labs start this week. No late submissions. Very helpful for assign1

Secure Software Development: Theory and Practice

An Operational and Axiomatic Semantics for Non-determinism and Sequence Points in C

Limitations of the stack

Static Analysis of C++ Projects with CodeSonar

Modeling and Discovering Vulnerabilities with Code Property Graphs

Lecture 4 September Required reading materials for this class

Quiz 0 Review Session. October 13th, 2014

HDFI: Hardware-Assisted Data-flow Isolation

IntFlow: Integer Error Handling With Information Flow Tracking

Checking System Rules Using System-Specific, Programmer- Written Compiler Extensions

Model Checking Embedded C Software using k-induction and Invariants

CS-527 Software Security

Preventing Use-after-free with Dangling Pointers Nullification

Discovering Application-level Insider Attacks using Symbolic Execution. Karthik Pattabiraman Zbigniew Kalbarczyk Ravishankar K.

Modular and Verified Automatic Program Repairs

Finding Error Handling Bugs in OpenSSL using Coccinelle

Software Excellence via. at Microsoft

Solros: A Data-Centric Operating System Architecture for Heterogeneous Computing

Secure C Coding...yeah right. Andrew Zonenberg Alex Radocea

C Source Code Analysis for Memory Safety

Verifying the Safety of Security-Critical Applications

Symbolic Execution, Dynamic Analysis

Shreds: S H R E. Fine-grained Execution Units with Private Memory. Yaohui Chen, Sebassujeen Reymondjohnson, Zhichuang Sun, Long Lu D S

Lecture 14. No in-class files today. Homework 7 (due on Wednesday) and Project 3 (due in 10 days) posted. Questions?

Static Analysis versus Software Model Checking for bug finding

The benefits and costs of writing a POSIX kernel in a high-level language

Adaptive Android Kernel Live Patching

Software Excellence via Program Analysis at Microsoft. Manuvir Das

CNIT 127: Exploit Development. Ch 18: Source Code Auditing. Updated

Towards Optimization-Safe Systems

MoonShine: Optimizing OS Fuzzer Seed Selection with Trace Distillation. Shankara Pailoor, Andrew Aday, Suman Jana Columbia University

Information Flow Analysis and Type Systems for Secure C Language (VITC Project) Jun FURUSE. The University of Tokyo

Dynamic Allocation in C

Structuring an Abstract Interpreter through Value and State Abstractions: EVA, an Evolved Value Analysis for Frama C

DR. CHECKER. A Soundy Analysis for Linux Kernel Drivers. University of California, Santa Barbara. USENIX Security seclab

Understanding the Genetic Makeup of Linux Device Drivers

Eraser: Dynamic Data Race Detection

Principles of Software Construction: Objects, Design, and Concurrency (Part 2: Designing (Sub )Systems)

Goal. Overflow Checking in Firefox. Sixgill. Sixgill (cont) Verifier Design Questions. Sixgill: Properties 4/8/2010

CS 550 Operating Systems Spring Interrupt

Static Analysis in C/C++ code with Polyspace

Static program checking and verification

Static Analysis in Practice

Putting the Checks into Checked C

Module: Future of Secure Programming

Black Hat Webcast Series. C/C++ AppSec in 2014

Princeton University. Computer Science 217: Introduction to Programming Systems. Dynamic Memory Management

Lightweight Application-Level Crash Consistency on Transactional Flash Storage

CSE 565 Computer Security Fall 2018

Tracking Pointers with Path and Context Sensitivity for Bug Detection in C Programs. {livshits,

Static Analysis in Practice

High-performance computing and programming Intro to C on Unix/Linux. Uppsala universitet

Brave New 64-Bit World. An MWR InfoSecurity Whitepaper. 2 nd June Page 1 of 12 MWR InfoSecurity Brave New 64-Bit World

OPERATING SYSTEM TRANSACTIONS

Scaling CQUAL to millions of lines of code and millions of users p.1

Putting the Checks into Checked C. Archibald Samuel Elliott Quals - 31st Oct 2017

New features in AddressSanitizer. LLVM developer meeting Nov 7, 2013 Alexey Samsonov, Kostya Serebryany

CS558 Programming Languages

Towards Automatic Generation of Vulnerability- Based Signatures

Motivation. What s the Problem? What Will we be Talking About? What s the Solution? What s the Problem?

Diagnosing Production-Run Concurrency-Bug Failures. Shan Lu University of Wisconsin, Madison

Finding and Fixing Bugs in Liquid Haskell. Anish Tondwalkar

Splint Pre-History. Security Flaws. (A Somewhat Self-Indulgent) Splint Retrospective. (Almost) Everyone Hates Specifications.

Secure Coding Techniques

This time. Defenses and other memory safety vulnerabilities. Everything you ve always wanted to know about gdb but were too afraid to ask

Module: Future of Secure Programming

EIO: Error-handling is Occasionally Correct

Using Intel VTune Amplifier XE and Inspector XE in.net environment

Operating System Structure

CMSC 313 COMPUTER ORGANIZATION & ASSEMBLY LANGUAGE PROGRAMMING

Ivy: Modernizing C. David Gay, Intel Labs Berkeley

CS 11 C track: lecture 5

CYSE 411/AIT681 Secure Software Engineering Topic #10. Secure Coding: Integer Security

Review: C Strings. A string in C is just an array of characters. Lecture #4 C Strings, Arrays, & Malloc

Static Analysis with Goanna

Lessons Learned in Static Analysis Tool Evaluation. Providing World-Class Services for World-Class Competitiveness

Seminar in Software Engineering Presented by Dima Pavlov, November 2010

COMPUTER SCIENCE TRIPOS

CS 5523 Operating Systems: Midterm II - reivew Instructor: Dr. Tongping Liu Department Computer Science The University of Texas at San Antonio

2/9/18. Readings. CYSE 411/AIT681 Secure Software Engineering. Introductory Example. Secure Coding. Vulnerability. Introductory Example.

Transcription:

APISan: Sanitizing API Usages through Semantic Cross-checking Insu Yun, Changwoo Min, Xujie Si, Yeongjin Jang, Taesoo Kim, Mayur Naik Georgia Institute of Technology 1

APIs in today s software are plentiful yet complex Example: OpenSSL - 3841 APIs in [v1.0.2h] - 3718 in [v1.0.1t] -> 3841 in [v1.0.2h] (+123 APIs) - OpenSSH uses 158 APIs of OpenSSL 2

Complex APIs result in programmers mistakes Problems in documentation - Incomplete: e.g., low details in hostname verification - Long: e.g., 43K lines in OpenSSL documentation - Lack: e.g., internal APIs Lack of automatic tool support - e.g., missing formal specification and precise semantics 3

Problem: API misuse can cause security problems 4

Problem: API misuse can cause security problems à MITM 5

Problem: API misuse can cause security problems à Code execution 6

Problem: API misuse can cause security problems à Privilege Escalation 7

Today s practices to help programmers Formal method - Problem: lack of specification Model checking - Problem: manual, lack of semantic context Symbolic execution - Problem : failed to scale for large software 8

Promising approach: finding bugs by using existing code Bugs as deviant behavior [OSDI01] - Syntactic template: e.g., check NULL on malloc() Juxta [SOSP15] -Inferring correct semantics from multiple of implementations -File system specific bug finding tool 9

Promising approach: finding bugs by using existing code Bugs as deviant behavior [OSDI01] - Syntactic template: e.g., check NULL on malloc() Juxta [SOSP15] Research goal: can we apply this method to -Inferring correct semantics from multiple of implementations any kind of software without manual efforts? -File system specific bug finding tool 10

Our idea: comparing API usages in various implementation Example: finding OpenSSL API misuses curl curl curl nmap curl nginx nmap nginx curl hexchat nginx curl APISan Majority uses ( Likely correct ) Deviant uses ( Likely bug) 11

Our idea: comparing API usages in various implementation Example: finding OpenSSL API misuses curl curl curl nmap curl nginx nmap nginx curl hexchat nginx curl APISan Majority uses ( Likely correct ) Deviant uses ( Likely bug) 12

Our idea: comparing API usages in various implementation Example: finding OpenSSL API misuses curl curl curl nmap curl nginx nmap nginx curl hexchat nginx curl APISan Majority uses ( Likely correct ) Deviant uses ( Likely bug) 13

Our idea: comparing API usages in various implementation Example: finding OpenSSL API misuses curl curl curl nmap curl nginx nmap nginx curl hexchat nginx curl APISan Majority uses ( Likely correct ) Deviant uses ( Likely bug) 14

Our approach is very promising Effective in finding API misuses -76 new bugs Scale to large, complex software -Linux kernel, OpenSSL, PHP, Python, etc. -Debian packages 15

Technical Challenges API uses are too different from impl. to impl. Subtle semantics of the correct API uses Large, complex code using APIs 16

Example: OpenSSL API uses SSL_get_verify_result() -Get result of peer certificate verification if (SSL_get_verify_result() == X509_V_OK) { } 17

Example: OpenSSL API uses SSL_get_verify_result() -Get result of peer certificate verification -no peer certificate à always returns X509_V_OK if (SSL_get_verify_result() == X509_V_OK) { } 18

Example: OpenSSL API uses SSL_get_verify_result() -Get result of peer certificate verification -no peer certificate à always returns X509_V_OK if (SSL_get_verify_result() == X509_V_OK && SSL_get_peer_certificate()!= NULL ) { } 19

Example: a correct implementation using OpenSSL API cert = SSL_get_peer_certificate(handle); if (!cert) { } err = SSL_get_verify_result(handle); if (err == X509_V_OK) { } curl 20

Example: a correct implementation using OpenSSL API cert = SSL_get_peer_certificate(handle); if (!cert) { } err = SSL_get_verify_result(handle); if (err == X509_V_OK) { } curl 21

Example: a correct implementation using OpenSSL API cert = SSL_get_peer_certificate(handle); if (!cert) { } err = SSL_get_verify_result(handle); if (err == X509_V_OK) { } curl 22

Example: a correct implementation using OpenSSL API cert = SSL_get_peer_certificate(handle); if (!cert) { } err = SSL_get_verify_result(handle); if (err == X509_V_OK) { } curl 23

Example: a correct implementation using OpenSSL API cert = SSL_get_peer_certificate(handle); if (!cert) { } err = SSL_get_verify_result(handle); if (err == X509_V_OK) { } curl 24

Example: a correct implementation using OpenSSL API Semantically same with correct usage cert = SSL_get_peer_certificate(handle); if (!cert) { } err = SSL_get_verify_result(handle); if (err == X509_V_OK) { } if (SSL_get_verify_result() == X509_V_OK && SSL_get_peer_certificate()!= NULL ) { } curl 25

Example: a correct implementation using OpenSSL API Correct cert = SSL_get_peer_certificate(handle); if (!cert) { } err = SSL_get_verify_result(handle); if (err == X509_V_OK) { } curl 26

Example: providing various implementations using OpenSSL Correct cert = SSL_get_peer_certificate(handle); if (!cert) { } err = SSL_get_verify_result(handle); if (err == X509_V_OK) { } curl if (SSL_get_verify_result(conn)!= X509_V_OK) return NGX_OK; cert = SSL_get_peer_certificate(conn); if (cert) { } nginx cert = SSL_get_peer_certificate(ssl); if (cert == NULL) return 0; if (SSL_get_verify_result(ssl)!= X509_V_OK) { } nmap err = SSL_get_verify_result(ssl); switch(err) { case X509_V_OK: cert = SSL_get_peer_certificate(ssl); hexchat 27

Example: providing various implementations using OpenSSL Correct cert = SSL_get_peer_certificate(handle); if (!cert) { } err = SSL_get_verify_result(handle); if (err == X509_V_OK) { } curl if (SSL_get_verify_result(conn)!= X509_V_OK) return NGX_OK; cert = SSL_get_peer_certificate(conn); if (cert) { } nginx cert = SSL_get_peer_certificate(ssl); if (cert == NULL) return 0; if (SSL_get_verify_result(ssl)!= X509_V_OK) { } nmap err = SSL_get_verify_result(ssl); switch(err) { case X509_V_OK: cert = SSL_get_peer_certificate(ssl); hexchat 28

Example: providing various implementations using OpenSSL Correct cert = SSL_get_peer_certificate(handle); if (!cert) { } err = SSL_get_verify_result(handle); if (err == X509_V_OK) { } curl Correct if (SSL_get_verify_result(conn)!= X509_V_OK) return NGX_OK; cert = SSL_get_peer_certificate(conn); if (cert) { } nginx cert = SSL_get_peer_certificate(ssl); if (cert == NULL) return 0; if (SSL_get_verify_result(ssl)!= X509_V_OK) { } nmap err = SSL_get_verify_result(ssl); switch(err) { case X509_V_OK: cert = SSL_get_peer_certificate(ssl); hexchat 29

Example: providing various implementations using OpenSSL Correct cert = SSL_get_peer_certificate(handle); if (!cert) { } err = SSL_get_verify_result(handle); if (err == X509_V_OK) { } curl Correct if (SSL_get_verify_result(conn)!= X509_V_OK) return NGX_OK; cert = SSL_get_peer_certificate(conn); if (cert) { } nginx cert = SSL_get_peer_certificate(ssl); if (cert == NULL) return 0; if (SSL_get_verify_result(ssl)!= X509_V_OK) { } nmap err = SSL_get_verify_result(ssl); switch(err) { case X509_V_OK: cert = SSL_get_peer_certificate(ssl); hexchat 30

Example: providing various implementations using OpenSSL Correct cert = SSL_get_peer_certificate(handle); if (!cert) { } err = SSL_get_verify_result(handle); if (err == X509_V_OK) { } curl Correct if (SSL_get_verify_result(conn)!= X509_V_OK) return NGX_OK; cert = SSL_get_peer_certificate(conn); if (cert) { } nginx Correct cert = SSL_get_peer_certificate(ssl); if (cert == NULL) return 0; if (SSL_get_verify_result(ssl)!= X509_V_OK) { } nmap err = SSL_get_verify_result(ssl); switch(err) { case X509_V_OK: cert = SSL_get_peer_certificate(ssl); hexchat 31

Example: providing various implementations using OpenSSL Correct cert = SSL_get_peer_certificate(handle); if (!cert) { } err = SSL_get_verify_result(handle); if (err == X509_V_OK) { } curl Correct if (SSL_get_verify_result(conn)!= X509_V_OK) return NGX_OK; cert = SSL_get_peer_certificate(conn); if (cert) { } nginx Correct cert = SSL_get_peer_certificate(ssl); if (cert == NULL) return 0; if (SSL_get_verify_result(ssl)!= X509_V_OK) { } nmap err = SSL_get_verify_result(ssl); switch(err) { case X509_V_OK: cert = SSL_get_peer_certificate(ssl); // if (cert) is missed hexchat 32

Example: providing various implementations using OpenSSL Correct cert = SSL_get_peer_certificate(handle); if (!cert) { } err = SSL_get_verify_result(handle); if (err == X509_V_OK) { } curl Correct if (SSL_get_verify_result(conn)!= X509_V_OK) return NGX_OK; cert = SSL_get_peer_certificate(conn); if (cert) { } nginx Correct cert = SSL_get_peer_certificate(ssl); if (cert == NULL) return 0; if (SSL_get_verify_result(ssl)!= X509_V_OK) { } nmap Incorrect err = SSL_get_verify_result(ssl); switch(err) { case X509_V_OK: cert = SSL_get_peer_certificate(ssl); // if (cert) is missed hexchat 33

Example: providing various implementations using OpenSSL Correct cert = SSL_get_peer_certificate(handle); if (SSL_get_verify_result(conn)!= X509_V_OK) if (!cert) { } return NGX_OK; err = SSL_get_verify_result(handle); cert = SSL_get_peer_certificate(conn); if (err == X509_V_OK) { } if (cert) { } Can we distinguish curl between correct implementations nginx Correct cert = SSL_get_peer_certificate(ssl); if (cert == NULL) return 0; if (SSL_get_verify_result(ssl)!= X509_V_OK) { } nmap Correct and buggy implementations? Incorrect err = SSL_get_verify_result(ssl); switch(err) { case X509_V_OK: cert = SSL_get_peer_certificate(ssl); // if (cert) is missed hexchat 34

Challenge 1: API usages are different from each other Correct cert = SSL_get_peer_certificate(handle); if (!cert) { } err = SSL_get_verify_result(handle); if (err == X509_V_OK) { } curl Correct if (SSL_get_verify_result(conn)!= X509_V_OK) return NGX_OK; cert = SSL_get_peer_certificate(conn); if (cert) { } nginx Correct cert = SSL_get_peer_certificate(ssl); if (cert == NULL) return 0; if (SSL_get_verify_result(ssl)!= X509_V_OK) { } nmap Incorrect err = SSL_get_verify_result(ssl); switch(err) { case X509_V_OK: cert = SSL_get_peer_certificate(ssl); // if (cert) is missed hexchat 35

Challenge 2: subtle semantics of the correct API usages Correct cert = SSL_get_peer_certificate(handle); if (!cert) { } err = SSL_get_verify_result(handle); if (err == X509_V_OK) { } curl Correct if (SSL_get_verify_result(conn)!= X509_V_OK) return NGX_OK; cert = SSL_get_peer_certificate(conn); if (cert) { } nginx Correct cert = SSL_get_peer_certificate(ssl); if (cert == NULL) return 0; if (SSL_get_verify_result(ssl)!= X509_V_OK) { } nmap Incorrect err = SSL_get_verify_result(ssl); switch(err) { case X509_V_OK: cert = SSL_get_peer_certificate(ssl); // if (cert) is missed hexchat 36

Challenge3 : Large, complex code using APIs On average, more than 100K LoC -curl : 110K LoC -nginx : 127K LoC -nmap: 169K LoC -hexchat: 61K LoC Linux : > 1M LoC 37

Challenge3 : Large, complex code using APIs cert = SSL_get_peer_certificate(handle); if (!cert) { } err = SSL_get_verify_result(handle); if (err == X509_V_OK) { } curl (simplified) cert = SSL_get_peer_certificate(handle); if (!cert) { }... len = BIO_get_mem_data(mem, (char **) &ptr); infof(data, " start date: %.*s\n", len, ptr); rc = BIO_reset(mem); err = SSL_get_verify_result(handle); if (err == X509_V_OK) { } curl 38

Overview of APISan Source Source Source code code code 39

Overview of APISan Symbolic execution database Source Source Source code code code Relaxed Symbolic Execution APIs Constraints Arguments 40

Overview of APISan Symbolic execution database Source Source Source code code code Relaxed Symbolic Execution APIs Constraints Arguments 4 Checkers Return value checker Argument checker Causality checker Condition checker 41

Overview of APISan Symbolic execution database Source Source Source code code code Relaxed Symbolic Execution APIs Arguments Minority uses Constraints 4 Checkers Return value checker Argument checker Causality checker Condition checker : minor, but not bug : minor and bug 42

Overview of APISan Symbolic execution database Source Source Source code code code Relaxed Symbolic Execution APIs Arguments Minority uses Constraints... Ranked minority uses 4 Checkers Return value checker Argument checker Causality checker Condition checker : minor, but not bug : minor and bug 43

Overview of APISan Symbolic execution database Source Source Source code code code Relaxed Symbolic Execution APIs Arguments Minority uses Constraints... Ranked minority uses 4 Checkers Return value checker Argument checker Causality checker Condition checker : minor, but not bug : minor and bug 44

Symbolic execution can be relaxed in finding API contexts Symbolic execution is not scalable -Path explosion -SMT is expensive, naturally NP-complete Methods to relax symbolic execution -Limiting inter-procedural analysis -Removing back edges -Range-based 45

Method 1: Limiting inter-procedural analysis How APIs are used How APIs are implemented O X cert = SSL_get_peer_certificate(handle); if (!cert) { } err = SSL_get_verify_result(handle); if (err!= X509_V_OK) { } 46

Method 2: Removing back edges API contexts can be captured within loops -e.g., malloc() and free() are matched inside a loop for( ) { cert = SSL_get_peer_certificate(handle); if (!cert) { } err = SSL_get_verify_result(handle); if (err!= X509_V_OK) { } } 47

Method 3: Range-based Most of arguments & return values are integer cert!= NULL err == X509_V_OK cert = {[-MAX, -1], [1, MAX]} err = {[X509_V_OK, X509_V_OK]} Clang uses range-based symbolic execution 48

Building per-path symbolic abstractions Path-sensitive, context-sensitive Record symbolic abstractions -API calls -Symbolic expression of arguments -Constraints 49

Examples: Building per-path symbolic abstractions from source code cert = SSL_get_peer_certificate(handle); if (!cert) { } err = SSL_get_verify_result(handle); if (err == X509_V_OK) { } Source code Symbolic abstractions 50

Examples: Building per-path symbolic abstractions from source code cert = SSL_get_peer_certificate(handle); if (!cert) { } err = SSL_get_verify_result(handle); if (err == X509_V_OK) { } Source code Call SSL_get_peer_certificate(handle) Symbolic abstractions 51

Examples: Building per-path symbolic abstractions from source code cert = SSL_get_peer_certificate(handle); if (!cert) { } err = SSL_get_verify_result(handle); if (err == X509_V_OK) { } Source code Call Constraint SSL_get_peer_certificate(handle) SSL_get_peer_certificate(handle) = {[-MAX, -1], [1, MAX]} Symbolic abstractions 52

Examples: Building per-path symbolic abstractions from source code cert = SSL_get_peer_certificate(handle); if (!cert) { } err = SSL_get_verify_result(handle); if (err == X509_V_OK) { } Source code Call Constraint Call SSL_get_peer_certificate(handle) SSL_get_peer_certificate(handle) = {[-MAX, -1], [1, MAX]} SSL_get_verify_result(handle) Symbolic abstractions 53

Examples: Building per-path symbolic abstractions from source code cert = SSL_get_peer_certificate(handle); if (!cert) { } err = SSL_get_verify_result(handle); if (err == X509_V_OK) { } Source code Call Constraint Call Constraint SSL_get_peer_certificate(handle) SSL_get_peer_certificate(handle) = {[-MAX, -1], [1, MAX]} SSL_get_verify_result(handle) SSL_get_verify_result(handle) = {[X509_V_OK, X509_V_OK]} Symbolic abstractions 54

Examples: Building per-path symbolic abstractions from source code cert = SSL_get_peer_certificate(handle); if (!cert) { } err = SSL_get_verify_result(handle); if (err == X509_V_OK) { } Source code 55

Examples: Building per-path symbolic abstractions from source code cert = SSL_get_peer_certificate(handle); if (!cert) { } err = SSL_get_verify_result(handle); if (err == X509_V_OK) { } Source code Symbolic Abstractions #1 56

Examples: Building per-path symbolic abstractions from source code cert = SSL_get_peer_certificate(handle); if (!cert) { } err = SSL_get_verify_result(handle); if (err == X509_V_OK) { } Source code Symbolic Abstractions #1 Symbolic Abstractions #2 57

Examples: Building per-path symbolic abstractions from source code cert = SSL_get_peer_certificate(handle); if (!cert) { } err = SSL_get_verify_result(handle); if (err == X509_V_OK) { } Source code Symbolic Abstractions #1 Symbolic Abstractions #3 Symbolic Abstractions #2. 58

Overview of APISan Symbolic execution database Source Source Source code code code Relaxed Symbolic Execution APIs Arguments Minority uses Constraints... Ranked minority uses 4 Checkers Return value checker Argument checker Causality checker Condition checker : minor, but not bug : minor and bug 59

Four semantic contexts have security implications Orthogonal, essential, security-related contexts -Return value -Arguments -Causality -Condition 60

Context 1: Return value Return computation result or execution status NULL dereference Privilege escalation -e.g, Windows, CVE-2014-4113 ptr = malloc(size) if (!ptr){ } 61

Context 2: Arguments Inputs for calling APIs and their relationship Format string bug Memory corruption printf(buf); ptr = malloc(size1); memcpy(ptr, src, size2); 62

Context 3: Causality Causal relationship between APIs Deadlock Memory leak lock(); unlock(); malloc(); free(); 63

Context 4: Condition Implicit pre- and post condition for calling APIs MITM if (SSL_get_verify_result() == X509_V_OK && SSL_get_peer_certificate()!= NULL) 64

Extract contexts from symbolic abstractions Symbolic abstractions contains {APIs, Arguments, Constraints} Return value Arguments Causality Condition ß Constraints ß Arguments ß APIs ß Constraints + APIs 65

Extract contexts from symbolic abstractions Symbolic abstractions contains {APIs, Arguments, Constraints} Return value Arguments Causality Condition ß Constraints ß Arguments ß APIs ß Constraints + APIs 66

Example: extract condition contexts from symbolic abstractions Any constraint or call Line numbers when event is called Call Constraint Call Constraint SSL_get_peer_certificate(handle) SSL_get_peer_certificate(handle) = {[-MAX, -1], [1, MAX]} SSL_get_verify_result(handle) SSL_get_verify_result(handle) = {[X509_V_OK, X509_V_OK]} curl Event SSL_get_verify_result = {[X509_V_OK, X509_V_OK]} Constraint SSL_get_peer_certificate = {[-MAX, -1], [1, MAX]} Line {curl} Line {curl}. 67

Example: extract condition contexts from symbolic abstractions Call Constraint Call Constraint SSL_get_verify_result(conn) SSL_get_verify_result(handle) == {[X509_V_OK, X509_V_OK]} SSL_get_peer_certificate(conn) SSL_get_peer_certificate(conn)!= {[-MAX, -1], [1, MAX]} nginx Event SSL_get_verify_result = {[X509_V_OK, X509_V_OK]} Constraint SSL_get_peer_certificate = {[-MAX, -1], [1, MAX]} Line {curl, nginx} Line {curl, nginx}. 68

Example: extract condition contexts from symbolic abstractions Call Constraint Call Constraint SSL_get_peer_certificate(ssl) SSL_get_peer_certificate(ssl) = {[-MAX, -1], [1, MAX]} SSL_get_verify_result(ssl) SSL_get_verify_result(ssl) = {[X509_V_OK, X509_V_OK]} nmap Event SSL_get_verify_result = {[X509_V_OK, X509_V_OK]} Constraint SSL_get_peer_certificate = {[-MAX, -1], [1, MAX]} Line {curl, nginx, nmap} Line {curl, nginx, nmap}. 69

Example: extract condition contexts from symbolic abstractions Event Line Call Constraint Call SSL_get_verify_result(ssl) SSL_get_verify_result(ssl) = {[X509_V_OK, X509_V_OK]} SSL_get_peer_certificate(ssl) hexchat SSL_get_verify_result = {[X509_V_OK, X509_V_OK]} Constraint SSL_get_peer_certificate = {[-MAX, -1], [1, MAX]} {curl, nginx, nmap, hexchat} Line {curl, nginx, nmap}. 70

Example: find majority & minority usages from contexts Event SSL_get_verify_result = {[X509_V_OK, X509_V_OK]} Constraint SSL_get_peer_certificate = {[-MAX, -1], [1, MAX]} Line {curl, nginx, nmap, hexchat, } Line {curl, nginx, nmap, }. 71

Example: find majority & minority usages from contexts Event SSL_get_verify_result = {[X509_V_OK, X509_V_OK]} Constraint SSL_get_peer_certificate = {[-MAX, -1], [1, MAX]} Line {curl, nginx, nmap, hexchat, } Line {curl, nginx, nmap, }. Majority uses ( Likely correct ) 72

Example: find majority & minority usages from contexts Event SSL_get_verify_result = {[X509_V_OK, X509_V_OK]} Constraint SSL_get_peer_certificate = {[-MAX, -1], [1, MAX]} Line {curl, nginx, nmap, hexchat, } Line {curl, nginx, nmap, }. Majority uses ( Likely correct ) Deviant uses ( Likely bug) = total_event majority_use = {hexchat, } 73

Overview of APISan Symbolic execution database Source Source Source code code code Relaxed Symbolic Execution APIs Arguments Minority uses Constraints... Ranked minority uses 4 Checkers Return value checker Argument checker Causality checker Condition checker : minor, but not bug : minor and bug 74

False positives can be happened in majority analysis Lack of inter-procedural analysis -e.g., check a return value of malloc() inside a function Correlation Causation -e.g., fprintf() is used for printing debug messages when open() is failed Correct minor uses -e.g., strcmp() == 0, strcmp() > 0 75

Ranking can mitigate false positives More majority pattern repeated, more bug-likely -e.g., 999 majority, 1 minority > 10 majority, 1 minority General information -e.g., most of allocation functions have alloc in their names and are required to check their return values Domain specific knowledge -e.g., SSL APIs start with a string SSL 76

Our approach is formalized as a general framework 77

Implementation of APISan 9K LoC in total -Symbolic database generation : 6K LoC of C/C++ (Clang 3.6) -APISan library : 2K LoC of Python Checkers : 1K LoC of Python -Return value checker : 131 LoC -Argument checker : 251 LoC - 78

Evaluation questions How effective is APISan in finding new bugs? How easy to use and easy to extend? How effective is APISan s ranking system? 79

APISan is effective in finding bugs Found 76 new bugs in large, complex software -Linux kernel, OpenSSL, PHP, Python, and Debian packages Security implication -e.g., CVE-2016-5636: Python zipimporter heap overflow (Code execution in Google App Engine) 80

APISan is easy to use without any manual annotation To generate symbolic context database $ apisan make # use existing build command Run a checker $ apisan --checker=cpair # cpair : causality checker Run a checker (inter-application) $apisan --checker=cpair --db=app1, app2 81

APISan is easy to extend e.g., Integer overflow check Integer overflow sensitive APIs -Have security implications when integer overflow happens -e.g., memory allocation functions Integer overflow ß Arguments + Constraints -If arguments contains binary operators à check integer overflow within given constraints 82

Check integer overflow with APISan Collect all integer overflows Ranking strategy -More integer overflow prevented by constraints à APIs are likely integer overflow sensitive -Incorrect constraints > Missing constraints ; Missing constraints can be caused by limited analysis Found 6 integer overflows (167 LoC) 83

APISan s ranking system is effective Linux Kernel with Return Value Checker Total 2,776 reports Audited 445 reports Found 54 bugs 84

APISan s ranking system is effective 30 bugs in 20 APIs 24 bugs in 3 APIs Linux Kernel with Return Value Checker Total 2,776 reports Audited 445 reports Found 54 bugs 85

APISan s ranking system is effective 30 bugs in 20 APIs 24 bugs in 3 APIs 15 bugs in 1 APIs Linux Kernel with Return Value Checker Total 2,776 reports Audited 445 reports Found 54 bugs 86

Limitation No soundness & No completeness High false positive rate : > 80% Too slow to frequently analyze -32-core Xeon server with 256GB RAM -For Linux kernel, Generating database : 8 hours Each checker: 6 hours Not fully resolve path explosion -stopped in functions which have path explosion 87

Conclusion APISan: an automatic way for finding API misuse -Effective: Finding 76 new bugs -Scalable: Tested with Linux kernel, Debian packages, etc APISan *WILL* be released as open source -https://github.com/sslab-gatech 88

Thank you! Questions? 89