Native Language Exploitation
András Gazdag
CrySyS Lab, BME
www.crysys.hu
2017 CrySyS Lab
Memory errors and corruption
- Memory error vulnerabilities are created by programmers and exploited by attackers
  » Attackers can read or write arbitrary memory locations in the program
- Implementing safe software in C/C++ is almost impossible
  » C and C++ are memory unsafe
  » C and C++ standards: memory errors are undefined behavior
  » Programmers are responsible for all memory management and bounds checking in a program
  » Thus, a huge amount of unsafe code exists worldwide
- Native code is still appealing for performance reasons
  » E.g., Google Chrome is written in C++
Memory Corruption 2
Exploiting memory errors
1. Make a pointer invalid
  - Go out of the bounds of the pointed-to object
    » Incrementing an array pointer → buffer overflow
    » Indexing bugs → integer overflow
    » Exploiting an allocation failure → a null pointer can be created
  - When the pointed-to object gets deallocated
    » Dangling pointer: points to a deleted object
2. Dereference that pointer → trigger the error
  - Spatial error: dereferencing an out-of-bounds pointer
  - Temporal error: dereferencing a dangling pointer → use-after-free vulnerability
  - Read from or write to the address pointed to by the invalid pointer
    » Examples: format string vulnerability, buffer overflows
        printf(user_input);  // if user_input is %2$x, the 2nd integer on the stack is printed (in hex)
Exploiting memory errors - attacks
1. Information leak
  - Leak information to enable exploitation
  - Real-world exploits bypass ASLR by using leaked addresses
2. Data-only attack
  - The goal is to gain more control
  - E.g., variable overwrite via buffer overflow
  - Neither code nor code pointers are modified
        bool is_root = false;
        if (is_root) { /* privileged functionality */ }
  - A Data Integrity policy can protect the integrity of variables
  - Data Space Randomization adds entropy to the in-memory representation of all data
Exploiting memory errors - attacks
3. Control-flow hijack attack
  - A code pointer (e.g., function pointer) is overwritten to divert control flow
    » The attacker needs to know the address of the payload
    » Address Space Layout Randomization (ASLR) makes that address hard to predict
  - Or, the attacker can control the instruction pointer only indirectly
    » Indirect function call, indirect jump or return instruction
    » A Control-flow Integrity (CFI) policy can detect it
  - Shellcode is injected and execution is diverted to it
    » A Non-executable Data policy disables code execution on data memory pages
    » The combination of Code Integrity and Non-executable Data is the Write XOR Execute policy → JIT and self-modifying code need exceptions to it
  - Or, code reuse attacks
    » Return-to-libc, Return-oriented Programming, Jump-oriented Programming
4. Code corruption attack
  - Overwrite program code in memory
  - Code integrity policies cannot be enforced entirely
    » E.g., Just-in-Time compilers for JavaScript or Flash create writable pages
x86 Machine code, Memory layout, Stack operations
Intel x86 main registers
- Register: 32-bit storage inside the microprocessor
- General registers: eax, ebx, ecx, edx, esi, edi
- Special purpose registers
  » esp: stack pointer
  » ebp: stack frame (base) pointer, function context
  » eip: instruction pointer, stores the address of the next instruction to process
Data addressing modes
- Direct: specifying a value directly, e.g. eax or 0x07B3
- Indirect: specifying the address that stores the value, e.g. [ecx] or [0x0012FE10] or even [ebp+12]
Intel x86 instructions
Name                Description
MOV <dest>, <src>   Moves data from <src> to <dest>: <dest> := <src>
ADD <op1>, <op2>    Adds two integers: <op1> := <op1> + <op2>
SUB <op1>, <op2>    Subtracts an integer from another: <op1> := <op1> - <op2>
CMP <op1>, <op2>    Compares two integers and sets the flags, just like SUB <op1>, <op2>, but discards the result
AND <op1>, <op2>    Bitwise AND of two integers: <op1> := <op1> AND <op2>
OR <op1>, <op2>     Bitwise OR of two integers: <op1> := <op1> OR <op2>
XOR <op1>, <op2>    Exclusive OR of two integers: <op1> := <op1> XOR <op2>
Intel x86 flags
Stored in the status register, called FLAGS (EFLAGS in 32-bit mode)
Name                 Description
ZF (zero flag)       Set to 1 if the result of the last arithmetic operation was zero
CF (carry flag)      Set to 1 if the last arithmetic operation caused a carry out of or a borrow into the most significant bit
SF (sign flag)       Set to 1 if the result of the last arithmetic operation was a negative value
OF (overflow flag)   Set to 1 if the last arithmetic operation caused a signed overflow
PF (parity flag)     Set to 1 if the least significant byte of the result of the last arithmetic operation contains an even number of 1 bits
Intel x86 control instructions
Name                       Description
JMP <addr> or JMP <offs>   Jump, i.e. continues code execution from a new address (absolute or relative): eip := <addr> or eip := eip + <offs>
J<cond> <offs>             Jumps if the condition is true (otherwise continues with the next instruction): if (cond) then eip := eip + <offs>
Some possible conditions:
    Z / NZ     zero / non-zero, example: JZ <offs> / JNZ <offs>
    E / NE     equals (ZF=1) / not equals (ZF=0)
    C / NC     carry / no carry
    S / NS     sign, i.e. negative / no sign, i.e. zero or positive
    O / NO     overflow / no overflow
    P / NP     parity / no parity
    A, NBE     (unsigned) above, not below or equal (CF=0 and ZF=0)
    BE, NA     (unsigned) below or equal, not above (CF=1 or ZF=1)
    L, NGE     (signed) less, not greater than or equal (SF<>OF)
    GE, NL     (signed) greater than or equal, not less (SF=OF)
Intel x86 stack handling and flow control
Name                       Description
PUSH <value>               Places <value> on top of the stack [esp]: esp := esp - 4; [esp] := <value>
POP <dest>                 Loads a value from the top of the stack into <dest>: <dest> := [esp]; esp := esp + 4
LEAVE                      Used at the end of functions, shortcut for: MOV esp, ebp; POP ebp
CALL <addr>, CALL <offs>   Stores the return address on the stack and calls (relative or absolute jump) a function: PUSH eip; JMP <addr> / JMP <offs>
RET                        Loads the top of the stack into the eip instruction pointer (and so jumps back to the return address): POP eip
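The memory address layout
Typical placement of various data and variables:
    #include <stdlib.h>

    int a=12345678;                     // initialized global
    int b;                              // uninitialized global
    char *c="hello World!";             // pointer to a string constant

    int function(int input) {
        int d;                          // local variable
        int e=87654321;
        char f[]="local string";
        static long g;                  // static local
        char *h=(char*)malloc(100);     // dynamically allocated memory
        // ...
        return e;
    }
Address space regions from 0x00000000 to 0xFFFFFFFF: Code, Data, Heap, Stack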
Memory segmentation (ELF binaries)
Layout (higher addresses at the top):
    0xFFFFFFFF
    Kernel
    Environment variables
    Stack
    Shared libraries
    Data segment (heap)
    Data segment (.bss)
    Data segment (.data)
    Code segment (.text)
    0x00000000
- Stack segment includes
  » Local variables
  » Values required for procedure calls
- Data segment
  » heap: dynamically allocated memory
  » .bss: uninitialized global & static variables
  » .data: initialized global & static variables
- Code segment:
  » Executable instructions
  » Typically read-only
Stack
- The stack is built up from several stack frames belonging to functions.
- Each stack frame comprises:
  » Function parameters
  » Return address
  » Saved Frame Pointer (saved EBP = frame pointer of the preceding frame)
  » Local variables
Stack layout (higher memory addresses at the top):
    Previous frame
    Function parameters
    Return address
    Saved EBP (SFP)      ← EBP
    Local variables      ← ESP
    Free memory
The local variables and the stack frame
- Information stored on the stack for a function (the stack frame), from 0xFFFFFFFF towards 0x00000000:
  » Parameters (arguments)
  » Return address
  » Saved base pointer (EBP)
  » Local variables
- The EBP register is used to point to the current stack frame
  » But not to the top of the stack (that's ESP)
  » EBP points to where the caller's EBP is saved
- Local variables are at EBP-x (from ESP to EBP-4)
- Return address is at EBP+4
- Parameters are at EBP+8, EBP+12, ...
Stack frame example
The stack when calling and returning from function addnum
- LIFO principle
- Grows towards lower memory addresses
- ESP: stack pointer, points to the top of the stack
Stack (higher addresses at the top):
    STACK      ← ESP
    0x00000000
Stack frame example
    mov dword ptr [b],3      ; Storing 3 in variable b
    mov eax,dword ptr [b]    ; Loading the value of b into register eax
    push eax                 ; Pushing b onto the stack, decreasing ESP by the word size (4 bytes)
Stack (higher addresses at the top):
    3 (b)      ← ESP
    0x00000000
Source:
    int main(void){
        int a, b, c;
        a=7;
        b=3;
        c = addnum(a,b);
        printf("result is: %d", a+b);
    }
Stack frame example
    mov dword ptr [a],7      ; Storing 7 in variable a
    mov ecx,dword ptr [a]    ; Loading the value of a into register ecx
    push ecx                 ; Pushing a onto the stack, decreasing ESP by the word size (4 bytes)
Stack (higher addresses at the top):
    3 (b)
    7 (a)      ← ESP
    0x00000000
Stack frame example
    call addnum    ; Pushing the address of the next instruction (return addr., 0x00411459)
                   ; onto the stack, decreasing ESP, and jumping to function addnum
Stack (higher addresses at the top):
    3 (b)
    7 (a)
    0x00411459 (ret addr.)   ← ESP
    0x00000000
Stack frame example
Every function starts with a function prologue:
  → push ebp         ; Saves the previous frame pointer (saved EBP, also called Saved FP)
    mov ebp,esp      ; Now EBP points to the SFP
    sub esp, 0x10    ; Reserving space for the local variables of the function
Stack (higher addresses at the top):
    3 (b)
    7 (a)
    0x00411459 (ret addr.)
    Saved EBP (SFP)          ← ESP
    0x00000000
Source:
    int addnum(int a, int b){
        int c = 4;
        c = a + b;
        return c;
    }
Stack frame example
Every function starts with a function prologue:
    push ebp         ; Saves the previous frame pointer (saved EBP, also called Saved FP)
  → mov ebp,esp      ; Now EBP points to the SFP
    sub esp, 0x10    ; Reserving space for the local variables of the function
Stack (higher addresses at the top):
    3 (b)
    7 (a)
    0x00411459 (ret addr.)
    Saved EBP (SFP)          ← ESP, EBP
    0x00000000
Stack frame example
Every function starts with a function prologue:
    push ebp         ; Saves the previous frame pointer (saved EBP, also called Saved FP)
    mov ebp,esp      ; Now EBP points to the SFP
  → sub esp, 0x10    ; Reserving space for the local variables of the function
Stack (higher addresses at the top):
    3 (b)
    7 (a)
    0x00411459 (ret addr.)
    Saved EBP (SFP)            ← EBP
    Space for local variables  ← ESP
    0x00000000
Stack frame example
Every function ends with a function epilogue:
  → mov esp,ebp      ; Restoring the stack pointer to the SFP, deallocating the space for locals
    pop ebp          ; Restoring EBP to point to the SFP of the preceding frame
    ret              ; Popping the return address and returning to it, increasing ESP
Stack (higher addresses at the top):
    3 (b)
    7 (a)
    0x00411459 (ret addr.)
    Saved EBP (SFP)          ← EBP, ESP
    0x00000000
Stack frame example
Every function ends with a function epilogue:
    mov esp,ebp      ; Restoring the stack pointer to the SFP, deallocating the space for locals
  → pop ebp          ; Restoring EBP to point to the SFP of the preceding frame
    ret              ; Popping the return address and returning to it, increasing ESP
Stack (higher addresses at the top):
    3 (b)
    7 (a)
    0x00411459 (ret addr.)   ← ESP
    0x00000000
Stack frame example
Every function ends with a function epilogue:
    mov esp,ebp      ; Restoring the stack pointer to the SFP, deallocating the space for locals
    pop ebp          ; Restoring EBP to point to the SFP of the preceding frame
  → ret              ; Popping the return address and returning to it, increasing ESP
Stack (higher addresses at the top):
    3 (b)
    7 (a)      ← ESP
    0x00000000
Stack frame example
    add esp,8    ; Back in main after the RET of addnum: removing the two arguments
                 ; from the stack by increasing ESP by 8
Stack before this instruction (higher addresses at the top):
    3 (b)
    7 (a)      ← ESP
    0x00000000
Calling conventions
Various calling conventions exist for x86
- cdecl: the one described above, the caller cleans up the arguments, arguments are placed from right to left
- pascal: the called function cleans up the arguments, arguments are placed from left to right
- stdcall: similar to pascal, but arguments are put right-to-left (used by Win32 APIs or Open Watcom C++)
- fastcall: similar to stdcall, but some arguments (typically the first two) are passed in registers (MS VC and GCC)
- thiscall: used in C++, the this pointer is passed automatically (through the stack by GCC or in a register by MS VC)
And there are many more conventions for x64 and other platforms (ARM, PowerPC, SPARC, ...)
BUFFER OVERFLOW
Buffer overflow
- Occurs when data exceeds the boundary of a buffer and overwrites adjacent memory locations
  » Stack overflow / Heap overflow
- Typically causes the program to halt with an exception
  » Segmentation fault (Linux)
  » Access violation at 0x11223344 (Windows)
- Can be exploited by
  » Overwriting interesting variables, memory locations (return address, pointers, file names, etc.)
  » Forcing the program to change its control flow by injecting malicious code
- Most preferred targets:
  » Setuid/setgid programs
  » Network servers: remote access
Goals in this course
- Crash the application (DoS attack)
- Data-based attack
- Unauthorized use of functions
STACK OVERFLOW
Stack overflow
- A stack overflow occurs when a procedure copies user-controlled data to a local buffer on the stack without verifying its size.
- Dangerous functions
  » strcpy, sprintf, strcat, gets, fgets, memcpy, ...
- The copied data overwrites other values on the stack, up to and including the return address
- When the procedure returns, EIP is set to the value stored at the location of the return address → control flow can be changed
- Place code at that modified address → it will be executed
Stack overflow
    #include <stdio.h>
    #include <string.h>

    void function(char *input) {
        int i = 1;
        int j = 2;
        char buffer[8];
        strcpy(buffer,input);                 // The buffer can overflow
        printf("%x %x %s\n", i, j, buffer);
    }

    int main(int argc, char* argv[]) {
        int k=3;
        function(argv[1]);
        return 0;
    }
Stack layout (lower addresses at the top):
    0x00000000
  Stack frame of function()
    buffer[8] (=?)              ← esp
    j (=2)
    i (=1)
    ebp (=main()'s ebp)         ← ebp
    Return address
    input (=argv[1])
  Stack frame of main()
    k (=3)
    ebp (=prev. stack fr.)
    Return address
    argc (=1)
    argv (=cmd line args)
    0xFFFFFFFF
Overwriting the return address
- No boundary check
- A long input causes strcpy to write over the boundaries of the local buffer: abcdefghijklmnopqrstuvwx
- The return address can be overwritten
- This is exploitable
    [92801.217602] BOFIntro[29676]: segfault at 78777675 ip 78777675 sp bffff280 error 4
    [92801.217625] grsec: Segmentation fault occurred at 78777675 in /home/ex/c++/bofintro/bofintro[bofintro:29676] uid/euid:0/0 gid/egid:0/0, parent /bin/bash[bash:28763] uid/euid:0/0 gid/egid:0/0
Stack layout after the copy (lower addresses at the top):
    0x00000000
  Stack frame of function()
    buffer[8] (= abcdefgh)      ← esp
    j (=0x6c6b6a69, ijkl)
    i (=0x706f6e6d, mnop)
    ebp (=0x74737271, qrst)     ← ebp
    ret (=0x78777675, uvwx)
    input (=argv[1])
  Stack frame of main()
    k (=3)
    ebp (=prev. stack fr.)
    Return address
    argc (=1)
    argv (=cmd line args)
    0xFFFFFFFF
Data based attack
    #include <stdio.h>
    #include <string.h>

    void privileged_function(char *username);

    int authorize(char *password) {
        return !strcmp("SecretKeyword", password);
    }

    int main(int argc, char* argv[]) {
        char username[10];
        int authorized = 0;
        authorized = authorize(argv[1]);
        printf("Enter your username: ");
        gets(username);                       // no bounds check: can overwrite authorized
        if (authorized) {
            privileged_function(username);
        }
        return 0;
    }
Stack frame of main() (lower addresses at the top):
    0x00000000
    username[10]
    authorized (=0)
    EBP
    RET
    argc (=1)
    argv (=cmd line args)
    0xFFFFFFFF
Data based attack
Stack frame of main() after gets() has read an overlong username (lower addresses at the top):
    0x00000000
    username[10]   (= BBBBBBBBBB)
    authorized     (= 1)
    EBP
    RET
    argc (=1)
    argv (=cmd line args)
    ...
    0xFFFFFFFF
→ authorized is now non-zero, so the privileged function runs without the correct password
Control flow attack
    #include <stdio.h>
    #include <string.h>

    int authorize(char *password) {
        char buffer[8];
        strcpy(buffer, password);             // no bounds check: can overwrite the return address
        return !strcmp("SecretKeyword", buffer);
    }

    void privileged_function(char *username) { ... }

    int main(int argc, char* argv[]) {
        char username[10];
        int authorized = 0;
        authorized = authorize(argv[1]);
        printf("Enter your username: ");
        gets(username);
        if (authorized) {
            privileged_function(username);
        }
        return 0;
    }
Stack layout inside authorize() (lower addresses at the top):
    0x00000000
  Stack frame of authorize()
    buffer[8]
    EBP
    RET
    &password
  Stack frame of main()
    username[10]
    authorized (=0)
    EBP
    RET
    argc (=1)
    argv (=cmd line args)
    ...
    0xFFFFFFFF
Control flow attack
Stack layout after strcpy() in authorize() has copied an overlong password (lower addresses at the top):
    0x00000000
  Stack frame of authorize()
    buffer[8]      (= AAAAAAAA)
    EBP            (= BBBB)
    RET            (= &privileged_function)
    &password
  Stack frame of main()
    username[10]
    authorized (=0)
    EBP
    RET
    argc (=1)
    argv (=cmd line args)
    ...
    0xFFFFFFFF
→ When authorize() returns, execution continues in privileged_function() instead of main()
Stack overflow custom shellcode
    void dangerous(char *buf) {
        char buffer[100];
        strcpy(buffer, buf);
    }

    int main(int argc, char* argv[]) {
        dangerous(argv[1]);
        printf("Is everything all right?");
    }
Stack before the overflow (higher memory addresses at the top):
    0xFFFFFFFF
    Previous frame
    Function parameters
    Return address
    Saved EBP (frame pointer)   ← EBP
    buffer[100]                 ← ESP
    Free memory
    0x00000000
Stack after the overflow (higher memory addresses at the top):
    Previous frame
    Function parameters
    buffer address (in place of the return address)
    SHELLCODE
    NOP sled: 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90
    Free memory
Stack overflow
- NOP sled: put in front of the shellcode, and the overwritten return address points somewhere into that area
  » Execution should always reach the beginning of the shellcode
  » The simplest version is a sequence of 0x90 bytes (nop, no operation)
- Reason to apply: bigger chance of hitting our shellcode
  » On local systems the position of the return address can be calculated (no ASLR)
  » Remote addresses are unknown
- Where to put the shellcode?
  » Into the local buffer with a preceding NOP sled: remote attacks are possible, but the memory page where the buffer resides must be executable, and the location of the buffer must be known.
  » Into environment variables: easy to implement and good for tiny buffers, but only for local attacks. The stack must be executable.
  » Address of a function inside the program: remote attacks are possible even with a non-executable stack. More frames have to be put on the stack.
BUFFER OVERFLOW MITIGATION
Buffer overflow mitigation
Layers: Hardware, Operating system, Programming language, Compiler options, Source code, Application
Approaches: Preventive, Detective, Anti-exploit
Techniques:
- Safe libraries, safe operations
- Detection and prevention of dangerous operations
- Development rules, source code review
- White-box / black-box testing
- Virtual memory management (VMM)
- Antivirus, IDS
- Strict type and boundary checks
- Detection of common mistakes, Stack Smashing Protection
- Strict input validation
- Black-box testing
- Input filtering
- NX (No-eXecute) memory protection
- ASLR randomization
- Application-level access control
- Generating more secure binary code
- Software ASLR
- Patching
ASLR
Control-flow hijacking prevention
OS-level ASLR
- Randomly arranges the position of key memory regions
  » Base address of executables
  » Position of libraries
  » Stack
  » Heap
- Supported by MS Windows from Vista, the PaX kernel patch for Linux, FreeBSD, etc.
- Disadvantages
  » ASLR is effective mainly on 64-bit platforms
  » On 32-bit systems the right addresses can often be brute-forced
    → Only the upper 16 bits are randomized
    → Executables and modules are aligned on 64k boundaries
  » ASLR does not protect against data corruption
  » Applications must be compiled as position-independent executables (PIE)
Control-flow hijacking prevention
Software ASLR (Address Space Layout/Load Randomization)
- Stack ← RND
- Heap ← RND
    #include <stdlib.h>
    #include <time.h>

    void alt_main(int argc, char* argv[]) { /* ... */ }

    // Recurses i times to push the real entry point to a random stack depth
    void spamstack(int i, int ac, char* av[]) {
        if (!i)
            alt_main(ac, av);
        else
            spamstack(i - 1, ac, av);
    }

    int main(int argc, char* argv[]) {
        srand(time(NULL));
        malloc(rand() % 123456 + 1);                   // rnd heap
        spamstack(rand() % 123456 + 1, argc, argv);    // rnd stack
        return 0;
    }
Further reading
- V Van Der Veen, L Cavallaro, H Bos: Memory Errors: The Past, the Present, and the Future, Research in Attacks, Intrusions, and Defenses, 2012.
- L Szekeres, M Payer, T Wei, D Song: SoK: Eternal War in Memory, IEEE Security and Privacy, 2013.
- Aleph One: Smashing the Stack for Fun and Profit, Phrack, 1996. http://phrack.org/issues/49/14.html
- Nergal: The advanced return-into-lib(c) exploits, Phrack, 2001. http://phrack.org/issues/58/4.html
- Klog: The Frame Pointer Overwrite, Phrack, 1999. http://phrack.org/issues/55/8.html
- MaXX: Vudo malloc tricks, Phrack, 2001. http://phrack.org/issues/57/8.html
- anonymous: Once upon a free(), Phrack, 2001. http://phrack.org/issues/57/9.html
- H. Shacham: The geometry of innocent flesh on the bone: Return-into-libc without function calls (on the x86), ACM CCS, 2007.