Protection and Mitigation of Software Bug Exploitation
Vartan Padaryan
vartan@ispras.ru

How safe is the latest Linux release?
- Command-line argument fuzzer (inspired by Brumley's article)
  - Launches each program with two parameters:
    - an option letter, -a..-z or -A..-Z
    - a long string (~6000 characters)
- Quite good results
  - Arch Linux, Debian: 3300 applications checked
  - Found 211 crashes in 47 applications
- However, it is unlikely that a crash can be escalated to something more dangerous
  - Targeted data corruption
  - Sensitive data leaks
  - Arbitrary code injection
- How can one precisely evaluate the impact of a found defect?
  - Construct an exploit for the bug
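
The fuzzing loop described on this slide can be sketched roughly as follows. This is a minimal POSIX illustration, not the tool used in the study: the function names `make_long_arg`, `run_case`, and `fuzz_program` are hypothetical, and a "crash" is assumed to mean that the child process was terminated by a signal.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Build a string of n filler characters; the caller frees it. */
char *make_long_arg(size_t n) {
    char *s = malloc(n + 1);
    if (s == NULL) return NULL;
    memset(s, 'A', n);
    s[n] = '\0';
    return s;
}

/* Run `path -<opt> <value>` once.
 * Returns 1 if the child was killed by a signal (treated as a crash),
 * 0 if it exited on its own, -1 if fork or waitpid failed. */
int run_case(const char *path, char opt, const char *value) {
    char flag[3] = { '-', opt, '\0' };
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        execl(path, path, flag, value, (char *)NULL);
        _exit(127); /* exec failed: not a crash of the target */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFSIGNALED(status) ? 1 : 0;
}

/* Try every option letter -a..-z and -A..-Z with one long string.
 * Returns the number of runs that ended in a crash. */
int fuzz_program(const char *path, size_t arg_len) {
    static const char letters[] =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    char *value = make_long_arg(arg_len);
    assert(value != NULL);
    int crashes = 0;
    for (size_t i = 0; letters[i] != '\0'; i++) {
        if (run_case(path, letters[i], value) == 1) crashes++;
    }
    free(value);
    return crashes;
}
```

A caller would invoke `fuzz_program` once per installed binary with `arg_len` around 6000. The sketch has no timeout and does not redirect the child's output, so a real harness would need both.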

Factors contributing towards software security
- Hardware properties
  - Von Neumann architecture: a source of security issues
  - Harvard architecture, DEP
- Development tools
  - C/C++ languages, address arithmetic
  - Compiler/libraries
  - Security analyzers
- Organizational measures
  - Cisco Secure Development Lifecycle
  - Microsoft Security Development Lifecycle
  - Any large enough software company has its own methodology and practice of secure development

(Slice of) Microsoft SDL
- Implementation
  - #8 Use Approved Tools
  - #9 Deprecate Unsafe Functions
  - #10 Perform Static Analysis
- Verification
  - #11 Perform Dynamic Analysis
  - #12 Perform Fuzz Testing
  - #13 Conduct Attack Surface Review
- Release
  - #14 Create an Incident Response Plan
  - #15 Conduct Final Security Review
  - #16 Certify Release and Archive
- Response
  - Execute Incident Response Plan

Developer tools
- Compiler
- Static analysis built into the IDE
- Secure libraries
- Static analysis during nightly builds
- Traditional debuggers and profilers
- DBI-based automated debugging

Canary
- A canary word is placed between a buffer and the control data on the stack to monitor buffer overflows
- Corruption of the canary indicates that the return address may also have been changed
- Default option for gcc since 2012
- Function prologue: copy the canary from TLS into the current frame
- Function body
- Function epilogue: compare the TLS and frame canaries; if they are not equal, abort the function without executing RET
- Stack frame, from high addresses down to 0: return address, canary word, ..., buf
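
The prologue/epilogue protocol can be imitated in portable C. The sketch below is a simulation under stated assumptions: the stack frame is modeled as a flat byte array so that the "overflow" stays inside one object, and the constant `reference_canary` stands in for the per-thread value a real compiler keeps in TLS. The names `sim_frame` and `guarded_copy` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BUF_SIZE 16

/* Simulated frame, low addresses first:
 * buf (BUF_SIZE bytes), canary (8 bytes), return address (8 bytes). */
struct sim_frame {
    unsigned char bytes[BUF_SIZE + 2 * sizeof(uint64_t)];
};

/* Stands in for the canary copy that a real runtime keeps in TLS. */
static const uint64_t reference_canary = 0xDEADBEEFCAFEF00DULL;

/* Copies n bytes into the simulated buf without a bounds check.
 * Returns 1 if the canary is intact afterwards, 0 if it was corrupted. */
int guarded_copy(const void *input, size_t n) {
    struct sim_frame f;
    assert(n <= sizeof f.bytes); /* keep the simulation itself in bounds */
    memset(f.bytes, 0, sizeof f.bytes);

    /* Prologue: place the canary between buf and the return address. */
    memcpy(f.bytes + BUF_SIZE, &reference_canary, sizeof reference_canary);

    /* Body: the unchecked copy that may run past buf. */
    memcpy(f.bytes, input, n);

    /* Epilogue: compare the frame canary with the reference copy. */
    uint64_t in_frame;
    memcpy(&in_frame, f.bytes + BUF_SIZE, sizeof in_frame);
    return in_frame == reference_canary;
}
```

A real epilogue aborts the process on mismatch instead of returning a flag. In gcc the instrumentation is requested with the `-fstack-protector` family of options.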

... and other compiler techniques (1/3)
- Available out-of-the-box (gcc)
  - FORTIFY_SOURCE
    - Secure versions of some standard functions: strcpy → __strcpy_chk
    - Compile-time checks for various dangerous patterns
    - User-defined format strings
  - Safe memory layout for automatic variables
        int somefunc() {
            int *p1;
            char buf[12];
            int *p2;
            ...
    Stack frame, from high addresses down to 0: return address, canary word, buf, p1, p2
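
The idea behind the fortified string functions can be illustrated with a hand-written checked copy. This is only a sketch: `checked_strcpy` is a hypothetical helper, and unlike the glibc checked variants, which abort the program when the destination is too small, it reports the problem through its return value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst only if it fits, including the
 * terminating NUL. Returns 0 on success, -1 if the copy would overflow. */
int checked_strcpy(char *dst, size_t dst_size, const char *src) {
    assert(dst != NULL && src != NULL);
    size_t need = strlen(src) + 1;
    if (need > dst_size) {
        return -1; /* destination too small: refuse to copy */
    }
    memcpy(dst, src, need);
    return 0;
}
```

With FORTIFY_SOURCE the destination size is supplied by the compiler where it is known at compile time, so the call site stays a plain `strcpy`.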

... and other compiler techniques (2/3)
- Available out-of-the-box (gcc)
  - Safe memory layout for automatic variables
        struct some_struct {
            int *p1;
            char buf[12];
            int *p2;
            ...
    Memory layout, starting from 0: p1, buf, p2 (members keep their declaration order)
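
That structure members cannot be rearranged the way local variables can is checkable with `offsetof`, since C requires members to be laid out in declaration order. The helper names below are hypothetical, and the struct repeats the slide's example without the elided members.

```c
#include <assert.h>
#include <stddef.h>

struct some_struct {
    int *p1;
    char buf[12];
    int *p2;
};

/* Returns 1 when p1 sits below buf, as declared. */
int p1_is_below_buf(void) {
    return offsetof(struct some_struct, p1) < offsetof(struct some_struct, buf);
}

/* Returns 1 when p2 sits above buf, i.e. in the path of an overflow of buf. */
int overflow_reaches_p2(void) {
    return offsetof(struct some_struct, buf) < offsetof(struct some_struct, p2);
}
```

Both functions return 1 on any conforming compiler, so an overflow of `buf` inside the structure reaches `p2` regardless of the stack-layout protection.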

... and other compiler techniques (3/3)
- Other techniques
  - Shadow stack
  - Secure heap for sensitive automatic variables
  - Control flow integrity
- Aggressive optimization can introduce a bug where it didn't exist before
      memset(password, '\0', len);
      free(password);
  Result: the compiler may remove the memset as a dead store, so the password stays in the freed memory
- WYSINWYX: What You See Is Not What You eXecute
  Balakrishnan, G., Reps, T. "WYSINWYX: What You See Is Not What You eXecute." ACM Trans. Program. Lang. Syst., August 2010.
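
One common way to keep the wipe from being optimized away is to write through a volatile pointer. The sketch below assumes this idiom; `secure_wipe` is a hypothetical name, and the C standard does not strictly guarantee the effect, although mainstream compilers honor it in practice.

```c
#include <assert.h>
#include <stddef.h>

/* Zero n bytes at p through a volatile pointer, so the stores count as
 * observable side effects and are not removed as dead stores. */
void secure_wipe(void *p, size_t n) {
    assert(p != NULL || n == 0);
    volatile unsigned char *v = p;
    while (n--) {
        *v++ = 0;
    }
}
```

Calling `secure_wipe(password, len)` before `free(password)` replaces the `memset` call. Where available, library routines such as `explicit_bzero` serve the same purpose.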

Tools for testers / release managers / CI
- Unit testing
- Symbolic execution (?)
- Fuzzing
- Security check for AppStore (taint analysis)

Symbolic execution (1/2)
- Good old technique to improve code coverage
- Applicable at source and binary level
- Inherent drawback: path explosion
- Example execution tree (x is a free symbolic variable; f/t mark the false/true branches, followed by the path predicate):
      input(x);
      char buf[42];
      if x > 0
        f
        t: (x > 0)
           if x*x < 0xffffff
             f
             t: (x > 0) ∧ (x*x < 0xffffff)
                strcpy(buf, input)

Symbolic execution (2/2)
- An SMT solver tries to solve the system of path and security predicates
  - A security predicate describes a certain exploit
- Well-constructed frameworks are available
  - KLEE (LLVM)
  - S2E (binary, whole-system analysis, QEMU-based)
- Numerous publications
  - EXE, BitBlaze (Berkeley), Mayhem (CMU), Sage (Microsoft), Driller (UC), Dowser, ...
- There are no publications on how to construct an exploit when compile-time protection is activated (and it is activated by default)
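
What "solving the system of path and security predicates" means can be shown on the branch conditions x > 0 and x*x < 0xffffff. In the sketch below a brute-force search stands in for the SMT solver, and the security predicate (x exceeds a 42-byte buffer capacity) is a hypothetical choice made only for illustration; real frameworks such as KLEE or S2E hand the constraints to a solver instead of enumerating inputs.

```c
#include <assert.h>
#include <stdint.h>

#define BUF_CAPACITY 42u

/* Path predicate: both branches are taken.
 * Unsigned arithmetic wraps, so x*x is well defined for every x. */
int path_predicate(uint32_t x) {
    return x > 0 && x * x < 0xffffffu;
}

/* Hypothetical security predicate: x is larger than the buffer capacity. */
int security_predicate(uint32_t x) {
    return x > BUF_CAPACITY;
}

/* Brute-force stand-in for an SMT solver: returns the smallest x below
 * limit that satisfies both predicates, or -1 if there is none. */
int64_t solve(uint32_t limit) {
    for (uint32_t x = 0; x < limit; x++) {
        if (path_predicate(x) && security_predicate(x)) {
            return (int64_t)x;
        }
    }
    return -1;
}
```

Here `solve(65536)` yields 43, the smallest input that reaches the copy and violates the assumed bound. Enumeration only works because the input space is tiny, which is exactly why a solver is needed in practice.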

Why a single tool is not enough
      char buf[32];
      char *data = read_string();
      unsigned int magic = read_number();

      // difficult check for fuzzing
      if (magic == 0x31337987) {
          // buffer overflow
          memcpy(buf, data, 100);
      }

      if (magic < 100 && magic % 15 == 2 && magic % 11 == 6) {
          // Only solution is 17; safe
          memcpy(buf, data, magic);
      }

      // Symbolic execution will suffer from path explosion
      int count = 0;
      for (int i = 0; i < 100; i++) {
          if (data[i] == 'Z') {
              count++;
          }
      }

      if (count >= 8 && count <= 16) {
          // buffer overflow
          memcpy(buf, data, count*20);
      }
  Shoshitaishvili, Y., Wang, R., Salls, C., Stephens, N., Polino, M., Dutcher, A., Grosen, J., Feng, S., Hauser, C., Kruegel, C., Vigna, G. "SoK: (State of) The Art of War: Offensive Techniques in Binary Analysis." IEEE Symposium on Security and Privacy, 2016.
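
The comment "Only solution is 17; safe" in the listing can be confirmed by enumerating every value below 100, which is what a solver effectively proves for the second check. The helper names `guard` and `count_solutions` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* The second check from the listing. */
int guard(unsigned int magic) {
    return magic < 100 && magic % 15 == 2 && magic % 11 == 6;
}

/* Counts the values that pass the guard and stores the last one in *last. */
int count_solutions(unsigned int *last) {
    assert(last != NULL);
    int count = 0;
    for (unsigned int m = 0; m < 100; m++) {
        if (guard(m)) {
            count++;
            *last = m;
        }
    }
    return count;
}
```

Exactly one value, 17, passes the guard, and 17 bytes fit into the 32-byte `buf`, so that `memcpy` cannot overflow. A static analyzer that ignores the guard would still flag the call, which is one reason a single tool is not enough.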

Explore malicious code in third-party software
- An AppStore needs to check an application for malicious behavior before distributing it
  - Leaking sensitive data
  - Spam broadcast
  - Backdoor for further code injection
- TEMU, TaintDroid, DroidScope, ...
  - Various implementations of tainted input analysis

Taint analysis
[Figure: taint source #1, taint source #2, program variables, merge of different sources]
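
The merging of taint from different sources can be sketched with a value type that carries a bitmask of its origins. This is an illustrative model, not the design of TEMU or TaintDroid: `tval`, `t_source`, `t_const`, and `t_add` are hypothetical names, and each source is assumed to own one bit of the mask.

```c
#include <assert.h>
#include <stdint.h>

#define TAINT_SRC1 0x1u /* taint source #1 */
#define TAINT_SRC2 0x2u /* taint source #2 */

/* A program variable together with the set of sources that influenced it. */
struct tval {
    int value;
    uint32_t taint;
};

/* A value read from a taint source. */
struct tval t_source(int value, uint32_t source) {
    assert(source != 0);
    struct tval v = { value, source };
    return v;
}

/* An untainted constant. */
struct tval t_const(int value) {
    struct tval v = { value, 0 };
    return v;
}

/* Data-flow propagation: the result carries the union of both taint sets. */
struct tval t_add(struct tval a, struct tval b) {
    struct tval r = { a.value + b.value, a.taint | b.taint };
    return r;
}
```

Adding a value from source #1 to a value from source #2 produces a result tagged with both bits, which is the "merge of different sources" case in the figure.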

Method / implementation restrictions
- Control dependencies
      for each symbol ∈ AsciiTable do
        if symbol = X_Tainted then
          Y_Untainted ← symbol
        end if
      end for
- Data flow crosses the border of the monitored area
  - IPC
  - Writing/reading data to/from block devices
  - Data transfer through a remote computer
- Side channels
  - File attributes, timings, ...
- Whole-system taint analysis can overcome some of the mentioned restrictions
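
The control-dependency loop above translates directly into C. The sketch models a tracker that propagates taint along data flow only; `tbyte` and `launder` are hypothetical names, and the loop covers all 256 byte values rather than a particular ASCII table.

```c
#include <assert.h>
#include <stdint.h>

/* A byte together with its taint mask (0 means untainted). */
struct tbyte {
    unsigned char value;
    uint32_t taint;
};

/* Copies x.value without any data flow from x: the output is assigned
 * from the untainted loop counter, and x is used only in the comparison. */
struct tbyte launder(struct tbyte x) {
    struct tbyte y = { 0, 0 };
    for (int symbol = 0; symbol < 256; symbol++) {
        if (symbol == x.value) { /* control dependency on tainted data */
            /* A data-flow-only tracker sees an assignment from the
             * untainted counter, so y keeps taint 0. */
            y.value = (unsigned char)symbol;
        }
    }
    return y;
}
```

The returned byte equals the tainted input yet carries no taint, so a leak of `y` goes unnoticed unless the analysis also tracks control dependencies.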

Research of bugs / incidents
- The IDA Pro disassembler leaves no alternative

Attack/defense
Generations of protection vs. analysis tools
- Code protection
  - Anti-debugging, naïve static obfuscation, code protectors
  - Protection against simulators
  - Dynamic obfuscation
  - Protection against symbolic execution
  - Improvement of obfuscation
  - Compiler-built-in obfuscation
  - Periodic obfuscation, modification of interface and protocol
- Code analysis
  - Manual analysis: disassemblers and debuggers
  - Simulator + debugger
  - Analysis automation: slicing, deobfuscation
  - Simulator testing
  - Further automation: symbolic execution
- ???

Open issues
- How to combine warnings from static and dynamic analysis?
- No methods are known to build exploits capable of overcoming most contemporary protection mechanisms
- Symbolic execution is great, but
  - Exponential growth is unavoidable
  - What to do with irreversible transformations?
  - What to do with symbolic addresses?
  - What to do with control dependencies?
- The last two points are also true for taint analysis
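
The question about irreversible transformations can be made concrete with a branch guarded by a hash comparison. The sketch uses the 32-bit FNV-1a hash purely as a stand-in for any hard-to-invert transformation; `fnv1a` and `guarded_branch` are hypothetical names, and FNV-1a itself is not cryptographically strong.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 32-bit FNV-1a hash of n bytes. */
uint32_t fnv1a(const unsigned char *data, size_t n) {
    uint32_t h = 2166136261u; /* FNV offset basis */
    for (size_t i = 0; i < n; i++) {
        h ^= data[i];
        h *= 16777619u; /* FNV prime, multiplication wraps modulo 2^32 */
    }
    return h;
}

/* Returns 1 when the input hashes to the expected value. To reach the
 * guarded code, a solver has to invert the hash from this constraint. */
int guarded_branch(const char *input, uint32_t expected) {
    assert(input != NULL);
    return fnv1a((const unsigned char *)input, strlen(input)) == expected;
}
```

The path predicate for the taken branch is `fnv1a(input) == expected`. With a cryptographic hash in place of FNV-1a, no practical solver query recovers an input, so the code behind the check stays unexplored.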