syscall_intercept
A user space library for intercepting system calls
Krzysztof Czuryło, Intel
What is it?
- Provides a low-level interface for hooking Linux system calls in user space
- Very simple API
- Open source: https://github.com/pmem/syscall_intercept
- BSD license
Why do we need it?
Motivation
- libpmemfile: https://github.com/pmem/pmemfile
- A fully user-space filesystem, not FUSE
- Builds on libpmemobj (http://pmem.io/nvml/)
- Uses persistent memory as the storage medium
pmemfile design
[Diagram. User space: user applications, write, pmemfile_write, the pmemfile components (libpmemfile, libpmemfile-posix, libpmemobj), syscall_intercept, glibc, load/store access to the NVDIMM. Kernel space: System Call Interface (SYS_write), kernel, architecture-dependent kernel code. Below both: hardware platform.]
How does it work?
How does it work?
- Disassemble the code
- Identify all the syscall instructions... and their context
- Hotpatch the code
- Replace syscalls with jumps to a custom syscall hook
- All the syscalls made by libc are intercepted (assuming libc is already loaded)
- One hook function to rule them all
How does it work?
- Patch: libc, libpthread
- Do not patch: libsyscall_intercept, libcapstone
- Optional (INTERCEPT_ALL_OBJS=1): target program, other libraries
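The "identify all the syscall instructions" step can be illustrated with a deliberately naive sketch. On x86_64 the `syscall` instruction is encoded as the two bytes 0x0f 0x05, so a byte scan finds candidates. The helper name `find_syscall_bytes` is hypothetical and is not part of syscall_intercept; the library itself relies on a real disassembler, because the same byte pair can also occur inside the operands of another instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* x86_64 encoding of the `syscall` instruction: 0x0f 0x05 (2 bytes). */
enum { SYSCALL_LEN = 2 };

/*
 * Hypothetical helper: store the offsets of every 0x0f 0x05 byte pair
 * found in `code` into `out` (at most `max` entries) and return how
 * many offsets were stored.
 *
 * This is a plain byte scan, not a disassembly. It may report false
 * positives when the pair appears inside an immediate or displacement.
 */
size_t
find_syscall_bytes(const uint8_t *code, size_t len, size_t *out, size_t max)
{
	size_t found = 0;

	assert(code != NULL && out != NULL);

	for (size_t i = 0; i + SYSCALL_LEN <= len && found < max; ++i) {
		if (code[i] == 0x0f && code[i + 1] == 0x05) {
			out[found++] = i;
			i += SYSCALL_LEN - 1; /* skip the second opcode byte */
		}
	}

	return found;
}
```

Fed the bytes of `mov $2, %eax; syscall; ret`, the scan reports a single candidate at offset 5, right after the 5-byte `mov`.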
Capstone The Ultimate Disassembler
Capstone http://www.capstone-engine.org/ "Capstone is a lightweight multi-platform, multiarchitecture disassembly framework. Our target is to make Capstone the ultimate disassembly engine for binary analysis and reversing in the security community."
Capstone
- Multi-arch: Arm, Arm64 (Armv8), M68K, Mips, PowerPC, Sparc, SystemZ, TMS320C64X, XCore & X86/X86_64
- Multi-platform: Native support for Windows & *nix (with Mac OSX, iOS, Android, Linux, *BSD & Solaris confirmed)
- Bindings for Clojure, F#, Common Lisp, Visual Basic, PHP, PowerShell, Haskell, Perl, Python, Ruby, C#, NodeJS, Java, GO, C++, OCaml, Lua, Rust, Delphi, Free Pascal & Vala available
- Provide details on disassembled instruction (called "decomposer" by some others)
- Provide some semantics of the disassembled instruction, such as list of implicit registers read & written
- Thread-safe by design
- Distributed under the open source BSD license
- Over 300 products already using Capstone
- Source code: https://github.com/aquynh/capstone
source: http://www.capstone-engine.org/
Disassembly
- Iterate through all instructions in the TEXT section: cs_disasm_iter()
- Examine each instruction and its operands:
  - is syscall?
  - is call/jump/relative jump/indirect jump?
  - is ret?
  - is nop?
  - has IP relative operands? (cannot be patched!)
  - can be relocated?
Everything here is x86_64 and Linux-specific!!!
Hotpatching
- Can't replace a syscall with a call to a C function (return address, ret, stack)
- Can't replace a syscall with a jump to a C function (arguments)
- Need a wrapper routine
  - Calls the actual syscall hook function
  - Eventually must jump back to the origin
Hotpatching
- Syscalls are replaced with a jmp to a wrapper routine
  - Each patched syscall has its own wrapper
  - Generated on-the-fly
- Wrapper routine template: intercept_template.s
  - Save registers
  - Call the syscall hook
  - Jump back to the origin
Hotpatching
- Overwriting the syscall instruction
  - syscall: 2 bytes long
  - jmp: 5 bytes
- If the surrounding instructions can be relocated
  - Move them to the wrapper
- Otherwise, look for some place nearby that can be overwritten
  - Put trampoline code there (the actual long jump)
  - syscall => short jump (2 bytes)
Hotpatching
- Before calling the C syscall hook function
  - Save all the registers on the stack
    - recent GCC/clang versions may help with the no_caller_saved_registers attribute
    - callee-saved registers
  - Align the stack
    - 32-byte boundary (because of AVX registers)
    - the force_align_arg_pointer attribute does not help
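The size mismatch above (2 bytes for syscall, 5 bytes for a near jmp) comes straight from the x86_64 encodings: `jmp rel32` is opcode 0xe9 followed by a 32-bit displacement, while `jmp rel8` is opcode 0xeb followed by an 8-bit displacement. Both displacements are relative to the address of the next instruction. The sketch below shows only that arithmetic; the function names are hypothetical and this is not the library's patching code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Encode `jmp rel32` (0xe9 + 4 displacement bytes) placed at address
 * `from` and targeting `to`. Returns 0 on success, -1 when the target
 * is out of the +/-2 GiB range.
 */
int
encode_jmp_rel32(uint8_t out[5], uint64_t from, uint64_t to)
{
	/* The displacement counts from the end of the 5-byte instruction. */
	int64_t rel = (int64_t)(to - (from + 5));

	assert(out != NULL);

	if (rel < INT32_MIN || rel > INT32_MAX)
		return -1;

	out[0] = 0xe9;
	for (int i = 0; i < 4; ++i) /* little-endian displacement */
		out[1 + i] = (uint8_t)((uint32_t)rel >> (8 * i));

	return 0;
}

/*
 * Encode `jmp rel8` (0xeb + 1 displacement byte): the 2-byte form that
 * fits exactly over a syscall instruction. Returns -1 when the target
 * is further than -128..127 bytes away.
 */
int
encode_jmp_rel8(uint8_t out[2], uint64_t from, uint64_t to)
{
	int64_t rel = (int64_t)(to - (from + 2));

	assert(out != NULL);

	if (rel < INT8_MIN || rel > INT8_MAX)
		return -1;

	out[0] = 0xeb;
	out[1] = (uint8_t)rel;

	return 0;
}
```

The short form only reaches a trampoline within roughly 127 bytes, which is why the trampoline has to be placed somewhere nearby.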
Hotpatching Syscall in a leaf function red zone adjust stack pointer
Wrapper routine template: intercept_template.s
    ...
    mov %rsp, %r11                # remember the original value of %rsp
    sub $0x80, %rsp               # respect the red zone
    and $-32, %rsp                # align the stack
    sub $0x38, %rsp               # allocate space for some locals
    mov %r11, (%rsp)              # save the original value of %rsp
    mov $0x000000000000, %r11     # the address of a syscall hook function to call
    call *%r11                    # call into code common to all syscalls
    mov (%rsp), %rsp              # restore original %rsp, as it was in the intercepted code
    jmp absolute address          # appended to the template
    ...
Patched code example
    ...
    0x0010 mov $2, %eax
    0x0019 jmp $0x1100            # an overwritten syscall
    0x0020 cmp $-4095, %rax
    0x0026 jle $0x0100
    0x0032 ret
    0x0040 mov $3, %eax
    0x0049 jmp $0x1200            # another syscall
    0x0040 cmp $-4095, %rax
    0x0049 jle $0x0100
    0x0052 ret
    ...
    0x1100 cmp $-4095, %rax
    0x1100 mov %rsp, %r11
    0x1105 sub $0x80, %rsp
    0x110a and $-32, %rsp
    0x1110 sub $0x38, %rsp
    0x1116 mov %r11, (%rsp)
    0x111b mov $0x000003333330, %r11
    0x113a call *%r11
    0x1140 mov (%rsp), %rsp
    0x1144 jmp $0x0026            # jmp back
    ...
    0x1200 mov %rsp, %r11         # same as above
    ...
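The three stack adjustments in the template can be checked with plain arithmetic. The sketch below mirrors them in C, assuming only what the template shows plus the fact that the System V x86_64 ABI red zone is 128 (0x80) bytes. `wrapper_rsp` is a hypothetical name used for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Mirror of the stack pointer arithmetic in the wrapper template:
 * returns the value %rsp holds just before the original %rsp is
 * stored at (%rsp).
 */
uint64_t
wrapper_rsp(uint64_t rsp)
{
	/* Precondition: enough room below rsp so nothing wraps around. */
	assert(rsp > 0x80 + 31 + 0x38);

	rsp -= 0x80;           /* sub $0x80, %rsp: step over the red zone */
	rsp &= ~(uint64_t)31;  /* and $-32, %rsp: round down to 32 bytes */
	rsp -= 0x38;           /* sub $0x38, %rsp: space for some locals */

	return rsp;
}
```

For example, an original %rsp of 0x7ffc00001234 becomes 0x7ffc000011b4 after skipping the red zone, 0x7ffc000011a0 after alignment, and 0x7ffc00001168 after the locals are allocated.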
Special cases
- SYS_clone
  - Different stack pointer
  - Restoring registers may result in undefined behavior
  - Cannot be handled in a C function
  - Execute the syscall in its original context
Special cases
- Syscall execution is signal-safe and MT-safe
- Cannot guarantee that the user hook is signal-safe
  - E.g. calling libc printf from a hooked write syscall, which was originally called by libc printf
My first interceptor
Example
    #include <libsyscall_intercept_hook_point.h>

    /*
     * The syscall_number, and the six args describe the syscall currently being intercepted.
     * A non-zero return value means libsyscall_intercept should execute the original syscall,
     * and use its result.
     * A zero return value means libsyscall_intercept should not execute the syscall, and use
     * the integer stored to *result as the result of the syscall to be returned in RAX to libc.
     */
    extern int (*intercept_hook_point)(long syscall_number,
            long arg0, long arg1, long arg2,
            long arg3, long arg4, long arg5,
            long *result);

    /*
     * Call syscall_no_intercept to make syscalls from the interceptor library, once glibc is
     * already patched.
     * Don't use the syscall function from glibc, that would just result in an infinite recursion.
     */
    long syscall_no_intercept(long syscall_number, ...);
Example
    #include <libsyscall_intercept_hook_point.h>
    #include <syscall.h>
    #include <errno.h>

    static int
    hook(long syscall_number,
            long arg0, long arg1, long arg2,
            long arg3, long arg4, long arg5,
            long *result)
    {
        if (syscall_number == SYS_xxx) {
            /* do something, and store the outcome in *result */
            return 0; /* syscall handled */
        } else {
            /* ignore other syscalls */
            return 1; /* pass it to the kernel */
        }
    }

    static __attribute__((constructor)) void
    init(void)
    {
        /* set up the callback function */
        intercept_hook_point = hook;
    }
Example
    ================= hello.c ===================
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        printf("Hello world!\n");
        return 0;
    }

    ================= mysi.c ===================
    char translate(char c)
    {
        switch (c) {
        case 'l': return '1';
        case 'o': return '0';
        case 'e': return '3';
        default: return c;
        }
    }
    ...
    if (syscall_number == SYS_write) {
        char buf_copy[0x1000];
        size_t size = (size_t)arg2;
        char *buf = (char *)arg1;
        size_t i;

        /* never copy more than the local buffer can hold */
        if (size > sizeof(buf_copy))
            size = sizeof(buf_copy);
        for (i = 0; i < size; ++i)
            buf_copy[i] = translate(buf[i]);
        /* write() takes an explicit length, so no terminating '\0' is needed */
        *result = syscall_no_intercept(SYS_write, arg0, buf_copy, size);
        return 0;
    }
    return 1;
    ...
Example
    $ cc -fpic -shared mysi.c -lsyscall_intercept -o mysilib.so
    $ cc hello.c -o hello
    $ ./hello
    Hello world!
    $ LD_PRELOAD=mysilib.so ./hello
    H3110 w0r1d!
Other features
Usage - intercept log
- Logs each intercepted system call to a file
- There may be many per-process log files if the process forks
- INTERCEPT_LOG=<path>: enable logging to the given file
- INTERCEPT_LOG_TRUNC=0: do not truncate the log file
Usage - intercept log
    $ INTERCEPT_LOG=./log LD_PRELOAD=mysilib.so ./hello
    H3110 w0r1d!
    $ cat ./log
    tempfile=...    # will be explained in a minute
    ...
    /lib64/libc.so.6 0xf7450 -- fstat(1, 0x7ffce0f984c0) = ?
    /lib64/libc.so.6 0xf7450 -- fstat(1, 0x7ffce0f984c0) = 0
    /lib64/libc.so.6 0xf7ade -- write(1, "Hello world!\n", 13) = ?
    /lib64/libc.so.6 0xf7ade -- write(1, "Hello world!\n", 13) = 13
    /lib64/libc.so.6 0xccb56 -- exit_group(0)
Usage - intercept log
The log is itself a script (very slow!!!). Its first line is:
    tempfile=$(mktemp) ; tempfile2=$(mktemp) ; grep "^/" $0 | cut -d " " -f 1,2 | sed "s/^/addr2line -p -f -e /" > $tempfile ; { echo ; . $tempfile ; echo ; } > $tempfile2 ; paste $tempfile2 $0 ; exit 0
Running it resolves each address to a function and source location:
    $ ./log
    __GI___fxstat at /usr/src/debug/glibc-2.24-61-g605e6f9/io/../sysdeps/unix/sysv/linux/wordsize-64/fxstat.c:35 /lib64/libc.so.6 0xf7450 -- fstat(1, 0x7ffce0f984c0) = ?
    __GI___fxstat at /usr/src/debug/glibc-2.24-61-g605e6f9/io/../sysdeps/unix/sysv/linux/wordsize-64/fxstat.c:35 /lib64/libc.so.6 0xf7450 -- fstat(1, 0x7ffce0f984c0) = 0
    __write_nocancel at /usr/src/debug////////glibc-2.24-61-g605e6f9/io/../sysdeps/unix/syscall-template.s:84 /lib64/libc.so.6 0xf7ade -- write(1, "Hello world!\n", 13) = ?
    __write_nocancel at /usr/src/debug////////glibc-2.24-61-g605e6f9/io/../sysdeps/unix/syscall-template.s:84 /lib64/libc.so.6 0xf7ade -- write(1, "Hello world!\n", 13) = 13
    __GI__exit at /usr/src/debug/glibc-2.24-61-g605e6f9/posix/../sysdeps/unix/sysv/linux/_exit.c:31 /lib64/libc.so.6 0xccb56 -- exit_group(0)
Examples - syscall logger
- examples/syscall_logger.c
- strace-like
    $ export SYSCALL_LOG_PATH=./syscall.log
    $ LD_PRELOAD=libsyscall_logger.so ./hello
    $ cat ./syscall.log
    fstat(1, 0x00007ffc88c26490) = 0
    write(1, 0x0000000000bd9010, 0x000000000000000d) = 13
    exit_group(0x0000000000000000)
Debugging
- Run the program under GDB
- Don't want the syscalls made by GDB itself to be intercepted
- Command-line filter
  - Only patch the selected binaries (by name)
  - INTERCEPT_HOOK_CMDLINE_FILTER=... (name/path)
  - Can be queried: syscall_hook_in_process_allowed()
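The idea behind the command-line filter can be sketched as a comparison between the filter string and the last path component of the command that started the process. This is an illustration under that assumption, not the library's implementation; `cmdline_matches` is a hypothetical helper, and real code would obtain the command from the running process rather than take it as a parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 when a process started as `cmd` should
 * be patched given `filter`, 0 otherwise. A NULL filter means that no
 * filtering was requested, so every process is patched.
 */
int
cmdline_matches(const char *cmd, const char *filter)
{
	assert(cmd != NULL);

	if (filter == NULL)
		return 1;

	/* Compare against the last path component only. */
	const char *base = strrchr(cmd, '/');
	base = (base != NULL) ? base + 1 : cmd;

	return strcmp(base, filter) == 0;
}
```

With a filter of `hello`, a process started as `./hello` would be patched while `/usr/bin/gdb` would be left alone, which is the behavior wanted when debugging.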
Summary
Limitations
- Only Linux x86_64 is supported
- Tested with glibc only
- There are known issues with some syscalls: clone, rt_sigreturn, ptrace
- Code is patched once
  - Dynamically generated/loaded code can't be hooked
- May be fooled by handwritten assembly
  - Undocumented ISA
  - Mixing code and data
  - Overlapping instructions
Potential use cases
- Error injection / testing
- Fast syscall tracing
- Device emulation
- lib_***_intercept?
  - not only syscalls
  - specific instructions, etc.
- Other
Future plans
- Release version 1.0
- Packages for popular Linux distros
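As an illustration of the error injection use case, the sketch below is a hook with the same signature as intercept_hook_point that makes every third write fail with EIO. The name `fault_hook`, the constant `FAIL_EVERY` and the counter are hypothetical choices for this example. A raw syscall reports failure as a negative errno value, so the hook stores `-EIO` in `*result`. The counter is not thread-safe, which keeps the sketch short.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>

/* Hypothetical policy: fail one write out of every FAIL_EVERY. */
enum { FAIL_EVERY = 3 };

static unsigned long write_count; /* not thread-safe, sketch only */

/*
 * Hook with the intercept_hook_point signature. Returns 0 when the
 * syscall was handled here (with *result set), and 1 when it should
 * be passed on to the kernel.
 */
int
fault_hook(long syscall_number,
	long arg0, long arg1, long arg2,
	long arg3, long arg4, long arg5,
	long *result)
{
	(void)arg0; (void)arg1; (void)arg2;
	(void)arg3; (void)arg4; (void)arg5;

	assert(result != NULL);

	if (syscall_number != SYS_write)
		return 1; /* not interested: pass it to the kernel */

	if (++write_count % FAIL_EVERY != 0)
		return 1; /* let this write through */

	*result = -EIO; /* injected failure, libc turns it into errno */
	return 0;
}
```

To use it, a constructor would assign `intercept_hook_point = fault_hook;` and the library would be loaded with LD_PRELOAD, in the same way as the mysi.c example.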
Q&A
Backup
Build and install
    git clone https://github.com/pmem/syscall_intercept
    ccmake syscall_intercept
    make
    sudo make install