Statically Detecting Likely Buffer Overflow Vulnerabilities

Statically Detecting Likely Buffer Overflow Vulnerabilities
David Larochelle and David Evans, USENIX'01
David Larochelle and David Evans, IEEE Software, Jan/Feb 2002
Presented by Adam Polyak, 30.03.2014

Outline of talk
- Introduction
- Suggested Solution: Splint
- Evaluation
- Related work
- Conclusions

Introduction

    int B = 0;
    char A[8] = {0};
    strcpy(A, "excessive");   /* 10 bytes copied into an 8-byte buffer */

Before command:

    Var name     A                          B
    Value        [empty]                    0
    Hex value    00 00 00 00 00 00 00 00    00 00

After command:

    Var name     A                          B
    Value        e x c e s s i v            25856
    Hex value    65 78 63 65 73 73 69 76    65 00
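The "after" row of the table can be reproduced without invoking undefined behaviour. The sketch below is an illustration, not code from the talk: `struct frame` and `overflow_demo` are hypothetical names, and the neighbouring variable B is modelled as two raw bytes so the result does not depend on byte order. It assumes the compiler puts no padding between the two char arrays, which holds on common ABIs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout mirroring the slide: an 8-byte buffer A followed
   directly by the two bytes of the neighbouring variable B. */
struct frame {
    char A[8];
    unsigned char B[2];
};

/* Copies src, terminator included, starting at the first byte of the frame.
   This is the effect an unchecked strcpy into A has on adjacent memory.
   Writing through the struct pointer keeps the demo well defined. */
static void overflow_demo(struct frame *f, const char *src)
{
    size_t n = strlen(src) + 1;
    assert(n <= sizeof *f); /* never write outside the frame itself */
    memcpy(f, src, n);
}
```

Calling `overflow_demo(&f, "excessive")` on a zeroed frame leaves `f.B` holding the bytes 0x65 0x00. Read as a big-endian 16-bit integer, that is 0x6500 = 25856, the value shown in the table.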

It can get worse: code execution

    void foo (char *bar) {
        char c[12];
        strcpy(c, bar);   /* no bounds check on bar */
    }

    int main (int argc, char **argv) {
        foo(argv[1]);     /* attacker-controlled input */
    }

It can get worse http://en.wikipedia.org/wiki/stack_buffer_overflow

It can get worse: why limit ourselves to the stack?
- Heap buffer overflow
- The heap is a linked list
- What if we corrupt one of the links?

    #define unlink(P, BK, FD) { \
        BK = P->bk;             \
        FD = P->fd;             \
        FD->bk = BK;            \
        BK->fd = FD;            \
    }
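The unlink step writes through both P->fd and P->bk, so whoever controls those two fields chooses where the writes land. Allocators such as glibc's later added a consistency check before unlinking. A minimal sketch of that idea follows; `struct chunk` and `unlink_checked` are hypothetical names for illustration and not the allocator's real code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical free-list node with forward and backward links. */
struct chunk {
    struct chunk *fd;
    struct chunk *bk;
};

/* Unlinks p only if both neighbours still point back at p.
   Returns 0 on success, -1 if the links look corrupted. */
static int unlink_checked(struct chunk *p)
{
    assert(p != NULL);
    struct chunk *fd = p->fd;
    struct chunk *bk = p->bk;
    if (fd == NULL || bk == NULL || fd->bk != p || bk->fd != p)
        return -1;      /* refuse to write through untrusted links */
    fd->bk = bk;
    bk->fd = fd;
    return 0;
}
```

With a forged chunk whose fd and bk point at memory that does not link back to it, the check fails and no write happens.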

Why Is It Important? (Figure source: RSA Conference 2013, "25 Years of Vulnerabilities")

Critical Vulnerabilities By Type (Figure source: RSA Conference 2013, "25 Years of Vulnerabilities")

Causes
- Programs written in C
  - Prefer space and performance over security
  - Unsafe language: direct access to memory
- Lack of awareness about security
  - Code is written to work
- Legacy code
  - Knowledge about code isn't preserved
- Inadequate development process

Defense: Limiting Damage
- Mostly runtime methods:
  - Compiler modifications (stack cookies): StackGuard
  - Safe libraries: Baratloo, Singh and Tsai
  - Modify program binaries (assert): SFI
- Pros:
  - Minimal extra work is required from developers
- Cons:
  - Increased performance/memory overhead
  - Simply replaces the flaw with a DoS vulnerability

Defense: Eliminating Flaws
- Human code review
  - Better than automatic tools
  - Can overlook problems
- Testing
  - Mostly ineffective (security-wise)
  - Checks expected program behavior
  - Fuzzing doesn't check all possibilities
- Static analysis
  - Makes it possible to codify human knowledge

Defense: Static Analysis
- Pros:
  - Analyzes source code directly, which lets us make claims about all possible program paths
  - It is still possible to generate useful information
- Cons:
  - Detecting buffer overflows is an undecidable problem
- Wide range of methods:
  - Compilers (low effort, simple analysis)
  - Full program verifiers (expensive yet effective)

Solution: Splint
- Lightweight static analysis
- Useful results for real programs with reasonable effort
- (Figure labels: Splint, Lightweight, Completeness, Soundness)

Solution: Splint main ideas
- Static analysis:
  - Use semantic comments to enable checking of interprocedural properties
  - Use loop heuristics
- Lightweight:
  - Good performance and scalability
  - Sacrifices soundness and completeness

Annotations
- Splint is based upon LCLint
  - Annotation-assisted static checking tool
- Annotations describe programmer assumptions and intents
- Added to source code and libraries
- Associated with:
  - Function parameters
  - Local variables
  - Function return values
  - Structure fields
  - Global variables
  - Type definitions

Example: @null@

    1 typedef /*@null@*/ char *mstring;
    2 static mstring mstring_createnew (int x);
    3 mstring mstring_space1 (void) {
    4   mstring m = mstring_createnew (1);
    5   /* error, since m could be NULL */
    6   *m = ' '; *(m + 1) = '\0';
    7   return m;
    8 }

Splint output:

    mstringnn.c: (in function mstring_space1)
    mstringnn.c:6,4: Dereference of possibly null pointer m: *m

Example: @notnull@

    static /*@notnull@*/ mstring
    mstring_createnewnn (int x);

    mstring mstring_space2 (void) {
        mstring m = mstring_createnewnn (1);
        /* no error, because of notnull annotation */
        *m = ' ';
        *(m + 1) = '\0';
        return m;
    }
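When the allocator really can return NULL, the alternative to a notnull annotation is an explicit guard before the dereference. The sketch below shows that variant under stated assumptions: `mstring_create` and `mstring_space_checked` are hypothetical stand-ins for the functions on the slides, with malloc as the allocator. The annotations are ordinary comments to a C compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator standing in for mstring_createnew: like malloc,
   it may return NULL, which is what the null annotation documents. */
static /*@null@*/ char *mstring_create(size_t len)
{
    assert(len < (size_t)-1); /* len + 1 must not wrap around */
    return malloc(len + 1);
}

/* Returns a one-space string, or NULL if allocation failed. */
static /*@null@*/ char *mstring_space_checked(void)
{
    char *m = mstring_create(1);
    if (m == NULL)
        return NULL;   /* guard: m is never dereferenced when NULL */
    m[0] = ' ';
    m[1] = '\0';
    return m;
}
```

The caller now inherits the obligation to check the result, which is exactly what the null annotation on the return type communicates.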

Annotations: Buffer Overflow
- In LCLint, references are limited to a small number of possible states
- Splint extends LCLint to support more general annotations:
  - Function preconditions (requires) and postconditions (ensures)
  - Assumptions about buffers: minset, minread, maxset, maxread

Example: Buffer Annotations
- The declaration:

    char buf[MAXSIZE];

- generates the constraints:

    maxset(buf) = MAXSIZE - 1
    minset(buf) = 0

- Function conditions:

    char *strcpy (char *s1, const char *s2)
      /*@requires maxset(s1) >= maxread(s2)@*/
      /*@ensures maxread(s1) == maxread(s2) /\ result == s1@*/;
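The strcpy precondition can be read as a plain inequality between two indices. As a rough runtime model (an illustration only, since Splint checks this statically), the hypothetical helper below computes both sides for a destination array of known size, assuming maxread of a string is the index of its terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Runtime model of the strcpy precondition maxset(s1) >= maxread(s2).
   dst_size is the declared size of the destination array. */
static int strcpy_requires_holds(size_t dst_size, const char *src)
{
    assert(dst_size > 0 && src != NULL);
    size_t max_set = dst_size - 1;  /* highest index that may be written */
    size_t max_read = strlen(src);  /* index of the terminating '\0' */
    return max_set >= max_read;
}
```

For `char buf[8]`, a 7-character source satisfies the precondition (7 >= 7), while "excessive" does not (7 < 9).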

Annotations: Buffer Overflow Cont.
- Constraints are used to validate conditions
- Conditions are found in three ways:
  - Generated by Splint:

      buf[i] = a         Precondition:  maxset(buf) >= i
      char a = buf[i]    Precondition:  maxread(buf) >= i
      buf[i] = a         Postcondition: maxread(buf) >= i

  - Taken from annotated library functions
  - Written by the user

Example: Buffer Overflow Conditions

    1 void updateenv(char * str)
    2 {
    3   char * tmp;
    4   tmp = getenv("myenv");
    5   if (tmp != NULL)
    6     strcpy(str, tmp);
    7 }

Splint output:

    Unable to resolve constraint:
      requires maxset(str @ bounds.c:6) >= maxread(getenv("myenv") @ bounds.c:4)
    needed to satisfy precondition:
      requires maxset(str @ bounds.c:6) >= maxread(tmp @ bounds.c:6)
    derived from strcpy precondition:
      requires maxset(<parameter 1>) >= maxread(<parameter 2>)
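The warning arises because nothing inside updateenv bounds the length of the environment value against the size of str. One possible repair, sketched here under the assumption that the caller can pass the buffer size, makes that relationship explicit. `update_env_checked` and its parameters are hypothetical and do not come from the paper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Copies the value of the environment variable `name` into str.
   Returns 0 on success, -1 if the variable is unset or does not fit.
   Intended contract, in the slide's notation:
   requires maxset(str) >= str_size - 1 */
static int update_env_checked(char *str, size_t str_size, const char *name)
{
    assert(str != NULL && name != NULL && str_size > 0);
    const char *tmp = getenv(name);
    if (tmp == NULL || strlen(tmp) >= str_size)
        return -1;
    strcpy(str, tmp); /* safe: strlen(tmp) + 1 <= str_size */
    return 0;
}
```

The length test plays the role of the guard that the analysis could not find in the original function.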

Analysis: Settings
- Static analysis is limited
  - Depends on several undecidable problems
  - Unsound -> false negatives (real overflows can be missed)
  - Incomplete -> false positives (spurious warnings)
- Scalability over precision
- Highly configurable for users

Analysis: Settings cont.
- Analysis is mostly intraprocedural, i.e. at function level
  - Interprocedural analysis (dataflow between functions) is achieved using annotations
- Flow-sensitive (order of statements matters), with compromises
  - Loops are handled using heuristics

Analysis: Algorithm
- Programs are analyzed at the function level
- Generate constraints for each C statement
  - By conjoining the constraints of subexpressions
  - Simplify constraints: maxset(ptr + i) = maxset(ptr) - i
- Constraints are resolved at the call site
  - Done at statement level
  - Uses function preconditions and the postconditions of earlier statements

Analysis: Algorithm Cont.
- All variables used in constraints have an associated location

    1 t++;
    2 *t = x;
    3 t++;

- Leads to the constraints:
  - requires maxset(t @ 1:1) >= 1
  - ensures maxread(t @ 3:4) >= -1
  - ensures (t @ 3:4) = (t @ 1:1) + 2
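The three constraints can be derived by hand. Write $t_0$ for t at location 1:1, $t_1$ for its value after line 1, and $t_2$ for its value after line 3 (location 3:4). The derivation below assumes the simplification rule shown for maxset applies to maxread in the same way.

```latex
\begin{align*}
t_1 &= t_0 + 1 && \text{(line 1)}\\
\mathit{maxset}(t_1) &\ge 0 && \text{(line 2 writes } *t_1 \text{)}\\
\mathit{maxset}(t_0 + 1) = \mathit{maxset}(t_0) - 1 &\ge 0
  \;\Longrightarrow\; \mathit{maxset}(t_0) \ge 1\\
\mathit{maxread}(t_1) &\ge 0 && \text{(after the write on line 2)}\\
t_2 &= t_1 + 1 = t_0 + 2 && \text{(line 3)}\\
\mathit{maxread}(t_2) = \mathit{maxread}(t_1) - 1 &\ge -1
\end{align*}
```

The negative bound on maxread is not an error: it records that the last initialized element lies one position before where t points after line 3.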

Axiomatic semantics
- Mathematical logic for proving the correctness of computer programs
- { P } C { Q }: precondition, statement (a.k.a. command), postcondition
- P and Q are state predicates, C is a command
- If P holds and C terminates, then Q will hold
- Slide credit: http://www.cs.tau.ac.il/~maon/teaching/2013-2014/paav/paav1314b.html
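A triple can be made concrete by writing P and Q as runtime assertions around the command. The example below is a hypothetical illustration, not taken from the talk: the triple is { 0 <= x < INT_MAX } y = x + 1 { y > 0 }.

```c
#include <assert.h>
#include <limits.h>

/* { 0 <= x && x < INT_MAX }  y = x + 1  { y > 0 } */
static int successor(int x)
{
    assert(x >= 0 && x < INT_MAX); /* precondition P */
    int y = x + 1;                 /* command C */
    assert(y > 0);                 /* postcondition Q */
    return y;
}
```

A static checker aims to establish that Q follows from P for every input without running the code. The assertions only test it for the inputs that actually occur.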

Analysis: Control Flow - If
- Conditions and loops can make otherwise unsafe operations safe
- Analyze an if based on its predicate, for example:

    if (sizeof (s1) > strlen (s2))
        strcpy(s1, s2);

- If the condition holds, the operation is safe
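The guard on the slide only works when s1 is a real array, because sizeof applied to a pointer yields the pointer's size rather than the buffer's. A small sketch that keeps the array type visible by wrapping it in a struct follows. `struct name` and `set_name` are hypothetical names for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Wrapping the array in a struct keeps its size visible to sizeof. */
struct name {
    char text[12];
};

/* Copies s into n->text only when the guard proves it fits.
   Returns 1 if the copy happened, 0 otherwise. */
static int set_name(struct name *n, const char *s)
{
    assert(n != NULL && s != NULL);
    if (sizeof n->text > strlen(s)) {
        strcpy(n->text, s);   /* guarded: strlen(s) + 1 <= 12 */
        return 1;
    }
    return 0;
}
```

An 11-character string is accepted and a 12-character string is rejected, which matches maxset(text) = 11.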

Analysis: Control Flow - Loops
- Statically analyzing loops is hard
- Simplify: analyze loops according to common idioms
- The first and last iterations matter
  - First iteration: treated as an if statement
  - Last iteration: determine the number of runs based on loop heuristics:

    for (index = 0; expr; index++) body
    for (init; *buf; buf++)
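The two idioms correspond to loops like the ones below. This is a sketch for illustration: `fill` and `count_char` are hypothetical functions, and the comments state the bounds one would expect a last-iteration analysis to derive rather than actual Splint output.

```c
#include <assert.h>
#include <stddef.h>

/* Idiom 1: for (index = 0; expr; index++). The last iteration writes
   buf[n - 1], so the loop needs maxset(buf) >= n - 1. */
static void fill(char *buf, size_t n, char c)
{
    assert(buf != NULL || n == 0);
    for (size_t index = 0; index < n; index++)
        buf[index] = c;
}

/* Idiom 2: for (init; *buf; buf++). The loop stops at the terminator,
   so it reads indices 0 through maxread(buf) and nothing beyond. */
static size_t count_char(const char *buf, char c)
{
    size_t hits = 0;
    assert(buf != NULL);
    for (; *buf; buf++)
        if (*buf == c)
            hits++;
    return hits;
}
```

In both cases the number of iterations is fixed by something the analysis can name (n, or the position of the terminator), which is what makes the heuristic workable.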

Evaluation
- Analyzed two popular programs:
  - wu-ftpd
  - BIND
- Checked detection of known and unknown buffer overflows

Evaluation: wu-ftpd
- Version wu-ftp-2.5.0, with known security vulnerabilities
- Execution:

    Code size (lines)    Time (minutes)
    17,000               1

- Source code wasn't modified; the run resulted in 243 warnings related to buffer overflows
- Detected both known and unknown buffer overflows

wu-ftpd: Unknown Bug

    char ls_short[1024];

    extern struct aclmember *
    getaclentry(char *keyword,
                struct aclmember **next);

    int main(int argc, char **argv,
             char **envp)
    {

        entry = (struct aclmember *) NULL;
        if (getaclentry("ls_short", &entry)
            && entry->arg[0]
            && (int)strlen(entry->arg[0]) > 0)
        {
            strcpy(ls_short, entry->arg[0]);

wu-ftpd: Unknown Bug Cont.
Will generate the following warning:

    Possible out-of-bounds store. Unable to resolve constraint:
      maxread ((entry->arg[0] @ ftpd.c:1112:23)) <= (1023)
    needed to satisfy precondition:
      requires maxset ((ls_short @ ftpd.c:1112:14)) >= maxread ((entry->arg[0] @ ftpd.c:1112:23))
    derived from strcpy precondition:
      requires maxset (<param 1>) >= maxread (<param 2>)

wu-ftpd: Known Bug

    char mapped_path [200];

    void do_elem(char *dir) {

        if (!(mapped_path[0] == '/'
              && mapped_path[1] == '\0'))
            strcat (mapped_path, "/");
        strcat (mapped_path, dir);
    }

- dir is entered by a remote user
- Reported buffer overflow: http://www.cert.org/historical/advisories/ca-1999-13.cfm

wu-ftpd: False positive

    i = passive_port_max - passive_port_min + 1;
    port_array = calloc (i, sizeof (int));

    for (i = 3; ... && (i > 0); i--) {
        for (j = passive_port_max
                 - passive_port_min + 1;
             ... && (j > 0); j--) {
            k = (int) ((1.0*j*rand()) /
                       (RAND_MAX + 1.0));
            pasv_port_array [j-1]
                = port_array [k];

- Splint is unable to determine that 0 <= k < j
- Can be suppressed by the user
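The bound on k holds because rand() returns a value between 0 and RAND_MAX, so the quotient is at least 0 and strictly less than j. The sketch below isolates that arithmetic in a hypothetical helper, `scaled_index`, with the random value passed in as a parameter so both extremes can be checked directly.

```c
#include <assert.h>
#include <stdlib.h>

/* Same arithmetic as the slide, with the random value r passed in so the
   extremes can be checked. Requires j > 0 and 0 <= r <= RAND_MAX. */
static int scaled_index(int j, int r)
{
    assert(j > 0 && r >= 0 && r <= RAND_MAX);
    return (int) ((1.0 * j * r) / (RAND_MAX + 1.0));
}
```

Proving this statically requires reasoning about floating-point scaling and the range of rand(), which is outside what a lightweight constraint solver attempts. That is why the warning is spurious.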

wu-ftpd: Summary
- Unmodified code: 243 warnings
- After adding 22 annotations: 143 warnings
  - Of these, 88 are unresolved constraints involving maxset
  - The rest can be suppressed by the user
- 225 calls to unsafe functions (strcat, strcpy, ...)
  - Only 18 reported by Splint
  - 92% of the calls (207 of 225) are judged safe by Splint

Evaluation: BIND
- Berkeley Internet Name Domain: the reference for DNS implementations
- Version 8.2.2p7
- Execution:

    Code size (lines)    Time (minutes)
    47,000               3.5

- Check limited to a subset of the code (~3,000 lines)
  - Because of the time required for human analysis

BIND: Library Annotations
- Extensive use of internal libraries instead of C library functions
  - Requires annotating a large code base
- Iteratively run Splint and annotate
- To reduce the human analysis required:
  - Only interface library functions were annotated
  - Annotations were based on code comments and parameter names
- For example, int foo(char* p, int size) resulted in: MaxSet(p) >= (size - 1)

BIND: req_query
- Code called in response to a query for the domain name server version
- Version string is read from a configuration file and appended to a buffer
- OK if used with the default configuration
- However, sensitive to:
  - Code modification
  - Configuration changes

BIND Cont.
- BIND uses extensive runtime bounds checking
  - This doesn't guarantee safety
  - A buffer overflow was detected because buffer sizes were calculated incorrectly
- ns_sign: the function receives a buffer and its size
  - Splint detected that the size might be incorrect
  - Occurs if the message contains a signature but the key isn't found
  - The bug was introduced after a feature was added

Related work: Lexical analysis
- Idea: put simply, use grep to find unsafe code
- Tool: ITS4 [VBKM00]
- Pros: simple and fast
- Cons: very limited; ignores semantics and syntax

Related work: Proof-carrying code
- Idea [NL 96, Necula97]:
  - Executable is distributed with a proof and a verifier
  - Ensures the executable has certain properties
- Pros: sound solution
- Cons: at the time of the paper, it wasn't feasible to do automatically

Related work: Integer range analysis
- Idea [Wagner et al]: treat strings as integer ranges
- Cons:
  - Non-character buffers are not handled
  - Insensitive analysis: ignores loops and conditions

Related work: Source transformation
- Idea [Dor, Rodeh and Sagiv]:
  - Instrument the code
  - Assert string operations
  - Then use integer analysis
- Pros: can handle complex properties such as pointer overlapping
- Cons: doesn't scale

Conclusions
- Splint isn't perfect, but it improves security with a reasonable amount of effort
- Splint is lightweight:
  - Scalable
  - Efficient
  - Simple
- Hard to introduce to a large code base
  - However, done incrementally, it can be just right

Reflections
- Practical approach!
  - 80/20 principle in action
  - Human effort vs. security gained
- Improves code base readability and maintainability
- Can be leveraged as part of a Continuous Integration process
  - Code quality

Continuous integration
- The practice of frequently integrating one's new or changed code with the existing code repository
- Some of its advantages:
  - Immediate unit testing of all changes
  - Early warning of broken/incompatible code
  - Frequent code check-in pushes developers to create modular, less complex code

Continuous integration
(Figure: pipeline diagram. Labels: Version control system, Source code, Publish report, Build system, Run application tests, Static analysis, Build product, Run unit tests, Code coverage analysis)

Continuous integration
- Static analysis on source code:
  - Ensures coding standards
  - Assists in avoiding common bugs

Questions?

Discussion
- Would you use Splint/LCLint in your projects?
  - Security-wise?
  - Code-quality-wise?
- What about dynamic languages?
  - Not necessarily security