SoK: Eternal War in Memory
László Szekeres, Mathias Payer, Tao Wei, Dawn Song
Presenter: Wajih
11/7/2017
Some slides are taken from the original S&P presentation.
What is an SoK paper?
Systematization of Knowledge (SoK) papers have been part of the IEEE S&P conference since 2010.
These papers provide high value to the community but might not otherwise be accepted because they lack novel research contributions. Examples:
  survey papers that provide useful perspectives on major research areas;
  papers that support or challenge long-held beliefs with compelling evidence;
  papers that provide an extensive and realistic evaluation of competing approaches to solving specific problems.
SoK paper
The goal is to encourage work that evaluates, systematizes, and contextualizes existing knowledge. An SoK paper can:
  identify areas that have enjoyed much research attention,
  point out open areas with unsolved challenges, and
  present a prioritization that can guide researchers to make progress on solving important challenges.
Problem
C/C++ is memory unsafe.
Everybody runs C/C++ code.
That code surely has exploitable vulnerabilities.
Background
Memory Safety
Memory safety is a property that ensures that all memory accesses adhere to the semantics defined by the source programming language.
Memory-unsafe languages like C/C++ do not enforce memory safety, so data accesses can occur through stale or illegal pointers.
A memory safety violation requires two conditions, both of which must be fulfilled:
  1. A pointer either goes out of bounds or becomes dangling.
  2. This pointer is used to either read or write.
Spatial memory safety
Spatial memory safety is a property that ensures that all memory dereferences are within the bounds of their pointer's valid object.
An object's bounds are defined when the object is allocated.
Pointers that point outside of their associated object may not be dereferenced.
Dereferencing such an illegal pointer results in a spatial memory safety error and undefined behavior.
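A minimal sketch of what a spatial check could look like, assuming a hypothetical `bounds_t` record that is filled in when the object is allocated. The names `bounds_of`, `access_in_bounds` and `checked_read` are illustrative only and are not taken from the paper or from any real tool.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bounds record, filled in when the object is allocated. */
typedef struct {
    uintptr_t base; /* address of the object's first byte */
    size_t    size; /* object size in bytes */
} bounds_t;

/* Record the bounds of an object at allocation time. */
static bounds_t bounds_of(const void *obj, size_t size) {
    bounds_t b = { (uintptr_t)obj, size };
    return b;
}

/* True if an access of len bytes at address addr stays inside the object. */
static bool access_in_bounds(bounds_t b, uintptr_t addr, size_t len) {
    if (len > b.size || addr < b.base) return false;
    return addr - b.base <= b.size - len;
}

/* Checked read of element i of an int array: refuses the access instead
   of dereferencing outside the object. */
static bool checked_read(const int *arr, bounds_t b, size_t i, int *out) {
    if (i >= b.size / sizeof(int)) return false;
    *out = arr[i];
    return true;
}
```

For a four-element `int` array, `checked_read` accepts indices 0 to 3 and rejects index 4, which an unchecked `arr[4]` would turn into undefined behavior.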
Temporal memory safety
Temporal memory safety is a property that ensures that all memory dereferences are valid at the time of the dereference, i.e., the pointed-to object is the same as when the pointer was created.
When an object is freed, the underlying memory is no longer associated with the object and the pointer is no longer valid.
Dereferencing such an invalid pointer results in a temporal memory safety error and undefined behavior.
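One way to picture a temporal check is a lock-and-key scheme: every allocation gets a fresh key, the pointer carries a copy of it, and freeing the object invalidates the lock. The sketch below is an assumption-laden toy, not the mechanism of any specific tool: it uses a fixed pool of `int` slots instead of a real allocator, and `pool_alloc`, `pool_free`, `checked_store` and `checked_load` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

#define POOL_SLOTS 4 /* hypothetical pool size */

static int      pool[POOL_SLOTS];    /* backing storage for the objects */
static uint64_t lock_of[POOL_SLOTS]; /* current lock per slot, 0 = free */
static uint64_t next_key = 1;        /* keys are never reused */

/* A "pointer" that carries the key of the allocation it refers to. */
typedef struct {
    unsigned slot;
    uint64_t key;
} checked_ptr;

/* Allocate the given slot and hand out a pointer with a fresh key. */
static checked_ptr pool_alloc(unsigned slot) {
    checked_ptr p = { slot % POOL_SLOTS, next_key++ };
    lock_of[p.slot] = p.key;
    return p;
}

/* Free the object: the lock is cleared, so every copy of p goes stale. */
static void pool_free(checked_ptr p) {
    if (lock_of[p.slot] == p.key) lock_of[p.slot] = 0;
}

/* Dereference checks: succeed only while key and lock still match. */
static bool checked_store(checked_ptr p, int value) {
    if (lock_of[p.slot] != p.key) return false; /* dangling pointer */
    pool[p.slot] = value;
    return true;
}

static bool checked_load(checked_ptr p, int *out) {
    if (lock_of[p.slot] != p.key) return false; /* dangling pointer */
    *out = pool[p.slot];
    return true;
}
```

Because keys are never reused, a stale pointer stays rejected even after the same slot has been handed out again to a new object.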
Eternal War in Memory
Overview
What are the attacks?
What are the deployed protections?
Which protections are not deployed?
Why aren't they deployed?
Attack Model
Attack Model
First step: Memory Corruption Attack
Second step:
  Code Corruption Attack
  Control Flow Hijacking
  Data-only Attack
  Information leak
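Classic Stack Smashing Attack
[Figure: attack-model flowchart. Steps shown: a pointer goes out of bounds or becomes dangling; it is used to write or to read; a code pointer is set to a target code address; the corrupted pointer is used by an indirect call/jmp or by a ret instruction; gadgets or functions, or injected code, are executed. Outcome: control-flow hijack.]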
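Use-after-free Exploits
[Figure: attack-model flowchart with the same steps as for stack smashing (out-of-bounds or dangling pointer, used to write or read, code pointer set to a target code address, used by indirect call/jmp or ret, executing gadgets, functions or injected code), plus the label "Heap overflow". Outcome: control-flow hijack.]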
Corrupting newer and newer Pointers
[Figure: attack-model flowchart, with a "data pointer" node added next to the "code pointer" node. Outcome: control-flow hijack.]
Modifying the code itself
[Figure: attack-model flowchart, with a new path added: modify code to attacker-specified code. Outcomes: code corruption, control-flow hijack.]
Code Integrity
[Figure: attack-model flowchart, with the defense "Read-only code pages" placed on the "modify code" path.]
Challenge: with just-in-time compilation, there is a small time window in which the generated code is on a writable page.
Control Flow Hijacking
The attacker wants to control the execution of a program by redirecting control flow to an attacker-controlled location.
Control-Flow Hijack
[Figure: attack-model flowchart with the "Read-only code pages" defense in place and the annotation "e.g., Stack" near "Execute injected code". Outcomes: code corruption, control-flow hijack.]
Non-executable data
[Figure: attack-model flowchart, with the defense "Non-exec. data pages" added at "Execute injected code".]
Inserted shellcode cannot be executed.
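Read-only code pages and non-executable data pages together amount to a write-xor-execute rule on page permissions. The sketch below expresses that rule as a predicate over hypothetical permission flags (`PERM_READ`, `PERM_WRITE`, `PERM_EXEC` are made up for this example and are not an operating-system API). `seal_code_page` models what a JIT compiler would have to do before running generated code, which is where the writable-page time window comes from.

```c
#include <stdbool.h>

/* Hypothetical page-permission flags for this sketch. */
enum { PERM_READ = 1, PERM_WRITE = 2, PERM_EXEC = 4 };

/* The policy: a page may be writable or executable, never both. */
static bool satisfies_wx(int perms) {
    return !((perms & PERM_WRITE) && (perms & PERM_EXEC));
}

/* Turn a page that was written with generated code into a code page:
   drop the write permission and add the execute permission. */
static int seal_code_page(int perms) {
    return (perms & ~PERM_WRITE) | PERM_EXEC;
}
```

A data page (`PERM_READ | PERM_WRITE`) and a code page (`PERM_READ | PERM_EXEC`) both satisfy the policy, while a page that is writable and executable at once does not.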
Return-oriented programming
[Figure: attack-model flowchart with the defenses "Read-only code pages" and "Non-exec. data pages" in place. Outcomes: code corruption, control-flow hijack.]
Return integrity
[Figure: attack-model flowchart with the defenses "Read-only code pages" and "Non-exec. data pages", and with the defense "Stack canaries" added at the "by ret instruction" step.]
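A rough model of how a stack canary behaves, assuming a simplified frame layout (`frame_t`) in which the canary sits between a local buffer and the saved return address. The layout, the fixed `CANARY_VALUE` and all function names are hypothetical; real canaries are placed by the compiler and chosen at random. The copy is clamped to the size of the model frame so that the example itself stays well-defined C.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed value; a real canary is random per process. */
#define CANARY_VALUE 0x00c0ffee1234abcdULL

/* Simplified model of a stack frame: buffer, then canary, then the
   saved return address. */
typedef struct {
    char     buf[8];
    uint64_t canary;
    uint64_t ret_addr;
} frame_t;

static void frame_init(frame_t *f, uint64_t ret_addr) {
    memset(f->buf, 0, sizeof f->buf);
    f->canary = CANARY_VALUE;
    f->ret_addr = ret_addr;
}

/* Models an unchecked copy that starts at buf and runs upward through
   the frame (clamped to the model frame). */
static void overflow_from_buf(frame_t *f, const void *src, size_t len) {
    if (len > sizeof *f) len = sizeof *f;
    memcpy(f, src, len);
}

/* Models a write through a corrupted pointer that targets the return
   address directly and never touches the canary. */
static void direct_overwrite(frame_t *f, uint64_t new_ret) {
    f->ret_addr = new_ret;
}

/* The check performed before the function returns. */
static bool canary_intact(const frame_t *f) {
    return f->canary == CANARY_VALUE;
}
```

A contiguous overflow has to cross the canary to reach the return address, so the check fails. A direct overwrite leaves the canary intact, which is the "direct overwrite" weakness of stack cookies.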
Hijacking Indirect Calls and Jumps
[Figure: attack-model flowchart with the defenses "Read-only code pages", "Stack canaries" and "Non-exec. data pages" in place. Outcomes: code corruption, control-flow hijack.]
Address Space Randomization
[Figure: attack-model flowchart with the defenses "Read-only code pages", "Stack canaries" and "Non-exec. data pages", and with the defense "ASLR" added at the "to target code address" step.]
Data-only Attack
The target of the corruption can be any security-critical data in memory:
  security-critical variables,
  configuration data,
  the representation of the user identity, or
  keys.
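A small illustration of a data-only attack, assuming a hypothetical `session_t` in which a name buffer sits directly in front of a security-critical flag. No code pointer is touched; the overflow only changes `is_admin`. The copy is clamped to the size of the struct so the example stays well-defined C.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical session record: the flag directly follows the buffer. */
typedef struct {
    char name[8];
    int  is_admin; /* security-critical variable */
} session_t;

/* Models an unchecked copy into name (clamped to the whole struct). */
static void set_name_unchecked(session_t *s, const char *src, size_t len) {
    if (len > sizeof *s) len = sizeof *s;
    memcpy(s, src, len);
}

/* Bounds-checked variant: truncates to the buffer and terminates it. */
static void set_name_checked(session_t *s, const char *src, size_t len) {
    if (len >= sizeof s->name) len = sizeof s->name - 1;
    memcpy(s->name, src, len);
    s->name[len] = '\0';
}
```

With a 12-byte input, the unchecked version overwrites `is_admin` with attacker-chosen bytes, while the checked version leaves it at its original value.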
Data-only attack
[Figure: attack-model flowchart with the deployed defenses ("Read-only code pages", "ASLR", "Stack canaries", "Non-exec. data pages"), with a new path added: modify data to an attacker-specified value, then use the corrupted data variable. Outcomes: code corruption, control-flow hijack, data-only attack.]
Information Leaks
Steal program information or state in memory, process metadata, registers, files, etc.
1. The information is valuable by itself, e.g., a credit card number, password, or encryption key.
2. The information facilitates further attacks, e.g., memory addresses to bypass ASLR.
Side channel attacks
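Why a single leaked address weakens ASLR can be shown with plain arithmetic: a module is shifted as a whole, so the distance between two of its symbols is the same in every run. The sketch below assumes two made-up offsets (`OFF_LEAKED_FUNC`, `OFF_TARGET_FUNC`); they are hypothetical values for illustration, not taken from any real binary.

```c
#include <stdint.h>

/* Hypothetical offsets of two functions inside one module, as they
   could be read from the binary on disk. */
#define OFF_LEAKED_FUNC 0x1a40u
#define OFF_TARGET_FUNC 0x2f10u

/* The randomized base is the leaked run-time address minus the known
   offset of the leaked symbol. */
static uint64_t module_base_from_leak(uint64_t leaked_addr) {
    return leaked_addr - OFF_LEAKED_FUNC;
}

/* Any other symbol in the same module follows from the base. */
static uint64_t target_address(uint64_t leaked_addr) {
    return module_base_from_leak(leaked_addr) + OFF_TARGET_FUNC;
}
```

This is the sense in which the weakness of ASLR is the information leak: the randomization hides one unknown per module, and one leaked pointer reveals it.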
Information leakage
[Figure: attack-model flowchart with the deployed defenses ("Read-only code pages", "ASLR", "Stack canaries", "Non-exec. data pages"), with a new path added: output data, then interpret the leaked value. Outcomes: code corruption, control-flow hijack, data-only attack, information leak.]
Protection Techniques
Deterministic Protection
  A low-level reference monitor (RM) enforces a deterministic safety policy.
  By hardware, like W⊕X
  By software, adding the RM into the code
Probabilistic Protection
  Built on randomization or encryption
  Instruction Set Randomization
  Address Space Randomization
  Data Space Randomization
Deployed Protections
Hijack protection

Policy               Technique      Weakness          Perf.  Comp.
W⊕X                  Page flags     JIT               1x     Good
Return integrity     Stack cookies  Direct overwrite  1x     Good
Address space rand.  ASLR           Info-leak         1.1x   Good
Control-flow integrity
[Figure: full attack-model flowchart with its four outcomes (code corruption, control-flow hijack, data-only attack, information leak); the defense "CFI" is added at the step where a corrupted code pointer is used ("by indir. call/jmp", "by ret instruction").]
CFI
  p = &f
  jmp p
  f() { }

  q = &g
  jmp q
  g() { }

CFI (each target is labeled with an ID)
  p = &f
  jmp p
  ID: f() { }

  q = &g
  jmp q
  ID: g() { }

CFI (each indirect jump checks the ID of its target)
  p = &f
  check ID
  jmp p
  ID: f() { }

  q = &g
  check ID
  jmp q
  ID: g() { }

CFI (which ID should be checked when a pointer has several possible targets?)
  p = &f
  check?
  jmp p
  ID: f() { }

  if ( ) q = &f else q = &g
  check?
  jmp q
  ID: g() { }

Over-approximation problem
  p = &f
  check ID
  jmp p
  ID: f() { }

  if ( ) q = &f else q = &g
  check ID
  jmp q
  ID: g() { }

Over-approximation problem
  p = &f
  check ID
  jmp p
  ID: printf() { }

  if ( ) q = &f else q = &g
  check ID
  jmp q
  ID: system() { }
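The ID check and its over-approximation can be sketched in C with function pointers. This is a toy model under stated assumptions: real CFI implementations typically embed the IDs in the code next to each target, whereas this sketch keeps them in a hypothetical side table (`cfi_table`). The functions `f`, `g`, `h` and the helper names are illustrative only.

```c
#include <stdbool.h>
#include <stddef.h>

typedef int (*handler_t)(int);

static int f(int x) { return x + 1; }
static int g(int x) { return x * 2; }
static int h(int x) { return -x; }

/* Hypothetical label table. f and g share ID 1 because one call site
   (the pointer q on the slides) may legitimately reach either of them. */
typedef struct {
    handler_t fn;
    int       id;
} cfi_entry;

static const cfi_entry cfi_table[] = { { f, 1 }, { g, 1 }, { h, 2 } };

/* True if target is a known function labeled with expected_id. */
static bool cfi_check(handler_t target, int expected_id) {
    size_t n = sizeof cfi_table / sizeof cfi_table[0];
    for (size_t i = 0; i < n; i++) {
        if (cfi_table[i].fn == target) return cfi_table[i].id == expected_id;
    }
    return false; /* not a valid indirect-call target at all */
}

/* Indirect call guarded by the ID check. */
static bool checked_call(handler_t target, int expected_id, int arg,
                         int *result) {
    if (!cfi_check(target, expected_id)) return false;
    *result = target(arg);
    return true;
}
```

A call site that expects ID 1 rejects `h`, but it accepts `g` even if the programmer only ever intended `f` there. If the two functions sharing an ID were `printf` and `system`, that slack would be exactly the over-approximation problem.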
CFI
Hijack protection

Policy               Technique      Weakness          Perf.  Comp.
W⊕X                  Page flags     JIT               1x     Good
Return integrity     Stack cookies  Direct overwrite  1x     Good
Address space rand.  ASLR           Info-leak         1.1x   Good
Control-flow integ.  CFI            Over-approx.      1.4x   Libraries
Memory safety
[Figure: full attack-model flowchart with its four outcomes (code corruption, control-flow hijack, data-only attack, information leak); the defense "Pointer metadata tracking & checking" is added at the first steps (pointer goes out of bounds or becomes dangling, and is used to write or read).]
SoftBound+CETS
Data integrity
[Figure: full attack-model flowchart with its four outcomes (code corruption, control-flow hijack, data-only attack, information leak); the defense "Object metadata tracking & checking" is added.]
WIT (Write Integrity Testing)
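The core idea of write integrity testing can be sketched as a color check: static analysis gives every object and every write instruction a color, and a write is allowed only if the two colors match. The toy model below assumes a tiny byte-addressed memory with one color per byte; the granularity, the table layout and the names `set_color` and `checked_write` are simplifications made for this example and do not describe the real system.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MEM_SIZE 16 /* hypothetical toy memory */

static uint8_t memory[MEM_SIZE];
static uint8_t color_of[MEM_SIZE]; /* color table, one entry per byte */

/* Assign a color to the object occupying [start, start + len). */
static void set_color(size_t start, size_t len, uint8_t color) {
    for (size_t i = start; i < MEM_SIZE && i - start < len; i++) {
        color_of[i] = color;
    }
}

/* A write instruction carries the color computed for it at compile
   time; the write goes through only if the target has that color. */
static bool checked_write(size_t addr, uint8_t value, uint8_t instr_color) {
    if (addr >= MEM_SIZE || color_of[addr] != instr_color) return false;
    memory[addr] = value;
    return true;
}
```

With an object of color 1 at bytes 0 to 7 and an object of color 2 at bytes 8 to 15, a write instruction of color 1 that overflows into byte 8 is stopped. The weakness is again over-approximation: objects that the analysis cannot tell apart share a color and can overwrite each other.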
Data Space Randomization (DSR)
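Data space randomization can be pictured as storing every variable XOR-ed with a mask that belongs to its class of pointers. The sketch below assumes two fixed masks (`MASK_CLASS_A`, `MASK_CLASS_B`); in a real system the masks would be chosen at random, and the class assignment would come from static analysis. All names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical masks; a real system would pick them at random. */
#define MASK_CLASS_A 0x5a5a5a5au
#define MASK_CLASS_B 0x3c3c3c3cu

/* Every store masks the value, every load unmasks it. */
static uint32_t dsr_store(uint32_t plain, uint32_t mask) {
    return plain ^ mask;
}

static uint32_t dsr_load(uint32_t stored, uint32_t mask) {
    return stored ^ mask;
}
```

A legitimate store and load use the same mask and round-trip the value. If an attacker writes the value 1 through a pointer of class B into a variable of class A, the later load with the class A mask does not return 1, so the attacker cannot choose the corrupted value without knowing the masks. Variables that share a class also share a mask, which is where the over-approximation weakness comes from.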
Data-flow integrity
[Figure: full attack-model flowchart with its four outcomes (code corruption, control-flow hijack, data-only attack, information leak); the defense "DFI" is added at the step where corrupted values are used.]
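Data-flow integrity can be sketched as a check on reads: every write records which instruction last defined a location, and every read verifies that this definition is one the static analysis expected. The model below is an illustration under assumptions: a tiny word-addressed memory, definition IDs below 32 so that an allowed set fits into a bit mask, and hypothetical names (`dfi_write`, `dfi_read`, `last_def`).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MEM_WORDS 8  /* hypothetical toy memory */
#define MAX_DEFS  32 /* definition IDs 0..31 fit into a uint32_t mask */

static int     memory[MEM_WORDS];
static uint8_t last_def[MEM_WORDS]; /* ID of the last writer per word */

/* Instrumented write: store the value and remember who wrote it. */
static bool dfi_write(size_t addr, int value, uint8_t def_id) {
    if (addr >= MEM_WORDS || def_id >= MAX_DEFS) return false;
    memory[addr] = value;
    last_def[addr] = def_id;
    return true;
}

/* Instrumented read: allowed_defs has bit i set if definition i may
   legitimately reach this read according to static analysis. */
static bool dfi_read(size_t addr, uint32_t allowed_defs, int *out) {
    if (addr >= MEM_WORDS) return false;
    if (((allowed_defs >> last_def[addr]) & 1u) == 0) return false;
    *out = memory[addr];
    return true;
}
```

If a flag is initialized by definition 1 and later overwritten by a buffer-copy instruction with definition 5, a read that only expects definition 1 detects the corruption before the value is used.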
DFI

Policy               Technique      Weakness          Perf.  Comp.
Hijack protection
W⊕X                  Page flags     JIT               1x     Good
Return integrity     Stack cookies  Direct overwrite  1x     Good
Address space rand.  ASLR           Info-leak         1.1x   Good
Control-flow integ.  CFI            Over-approx.      1.4x   Libraries
Generic protection
Memory safety        SB+CETS        None              2-4x   Good
Data integrity       WIT            Over-approx.      1.2x   Libraries
Data space rand.     DSR            Over-approx.      1.3x   Libraries
Data-flow integrity  DFI            Over-approx.      2-3x   Libraries
Challenges in Practical Usage
Why are most defense mechanisms not used in practice?
1. The performance overhead of the approach outweighs the potential protection.
2. The approach is not compatible with all currently used features (e.g., in legacy programs).
3. The approach is not robust and the offered protection is not complete.
4. The approach depends on changes in the compiler toolchain or in the source code, while the toolchain is not publicly available.
Questions?
Backup slides
Memory Corruption Attack
Memory errors (bugs) allow the attacker to read and modify the program's internal state in unintended ways.
C and C++ are inherently memory unsafe: writing an array beyond its bounds, dereferencing a null pointer, or reading an uninitialized variable results in undefined behavior.
Finding and fixing all the programming bugs is infeasible.
Memory Corruption
More aggressive behavior: the attacker changes internal program state.
Most often: memory contents
  Values of variables on the stack/heap
  Code pointers
Attack vectors
  Buffer overflows
  Use-after-free/double-free
  Uninitialized variables
  Format strings
Bypassing ASLR with user scripting
[Figure: full attack-model flowchart with the deployed defenses ("Read-only code pages", "ASLR", "Stack canaries", "Non-exec. data pages") and its four outcomes (code corruption, control-flow hijack, data-only attack, information leak).]