System call
Overview
Operating systems offer processes running in User Mode a set of interfaces for interacting with hardware devices such as the CPU, disks, and printers. Unix systems implement most interfaces between User Mode processes and hardware devices by means of system calls issued to the kernel.
In essence, a system call is a software interrupt that traps into the kernel.
Overview
System calls provide the interface between user programs and the kernel:
- Abstracted hardware interface
- Security and stability
- Allows virtualization
And more:
- They free the user from the details of the underlying hardware, so programs are easier to write and use.
- They provide a unified kernel interface and therefore portability.
- Apart from exceptions and interrupts, a system call is the only way to enter Kernel Mode, which protects the system.
Linux OS architecture
Layers, from top to bottom:
- User Applications (User Space)
- LIB (User Space)
- System Call Interface (Kernel Space, GNU/Linux)
- Process Management, Virtual File System, Memory Management, Network Stack (Kernel Space)
- Arch / Device Drivers (Kernel Space)
- Hardware Platform
Architecture-supported system call
What are the system calls behind the API?
User program -> system calls: brk(), access(), open(), read(), fstat(), mmap(), mprotect(), munmap(), write(), close()
System call process: read()
User program -> API (libs) -> software interrupt -> kernel implementation
The overall process of a system call: read()
Layers: API -> wrapper routine -> interrupt handler (system call handler) -> service routine
14 steps to execute the service read(fd, buffer, nbytes):
1. Push nbytes
2. Push &buffer
3. Push fd
4. Call read()
5. Load the system call number into eax
6. Execute the int 0x80 instruction to raise a software interrupt
7. Look up the Interrupt Descriptor Table (IDT) to find the interrupt handler
8. Save the context (e.g. save registers on the kernel stack)
9. Execute system_call() and validate the parameters
10. Look up the system call table to find the service routine selected by eax
11. Execute the service routine
12. Return from the service routine
13. Return from the interrupt
14. Return from the wrapper routine
When an application program makes a system call:
- A system library routine is called first.
- It transforms the call into the standard system wrapper routine and traps into the kernel.
- The kernel takes control, running in system mode.
- According to the service code, the call dispatcher invokes the responsible part of the kernel.
- Depending on the nature of the requested service, the kernel may block the calling process.
- After the call finishes, the calling process resumes and obtains the result (success or failure) as if an ordinary function had been called.
General process of a system call
API (application programming interface)
- Unix systems include several libraries of functions that provide APIs to programmers.
- The API is the layer that user programs interact with directly.
- Different systems and platforms provide the same API to the user for generality.
General process of a system call
API (application programming interface)
- An API does not necessarily correspond to a specific system call.
- First of all, the API could offer its services directly in User Mode. (For something abstract such as math functions, there may be no reason to make system calls.)
- Second, a single API function could make several system calls.
- Moreover, several API functions could make the same system call, but wrap extra functionality around it. E.g. the malloc(), calloc(), and free() APIs use the same brk() system call.
General process of a system call
API (application programming interface): system calls and the library calls built on them
System call    Library calls
open           fopen
close          fclose
read           fread, getchar, scanf, fscanf, getc, fgetc, gets, fgets
write          fwrite, putchar, printf, fprintf, putc, fputc, puts, fputs
lseek          fseek
General process of a system call
Wrapper routine
- The APIs defined by the standard C library (libc) refer to wrapper routines, whose only purpose is to issue a system call.
- Every system call has a corresponding wrapper routine.
- The wrapper executes a trapping instruction to enter Kernel Mode:
  - int $0x80 on 80x86 (other platforms may use a different instruction)
  - This generates an interrupt with vector number 128
  - The usual interrupt work follows (disabling interrupts, saving the context, etc.)
General process of a system call
System call handler (software interrupt)
The handler has a structure similar to that of the other exception handlers and performs the following operations:
- Gets the system call number from eax and checks that its value is less than NR_syscalls.
- Saves the contents of most registers on the Kernel Mode stack. This operation is common to all system calls and is coded in assembly language.
- Handles the system call by invoking a corresponding C function called the system call service routine.
- Exits from the handler: the registers are loaded with the values saved on the Kernel Mode stack, and the CPU is switched back from Kernel Mode to User Mode. This operation is common to all system calls and is coded in assembly language.
General process of a system call
System call number (x86)
- The kernel implements many different system calls, but they all share the same entry point, system_call().
- The system call number uniquely identifies each system call.
- It is passed in the eax register.
- sys_ni_syscall() serves all removed and invalid system call numbers; it only returns -ENOSYS.
- The numbers are defined in linux-x/arch/x86/include/asm/unistd_32.h
Figure: the user program calls the system call API; the system call handler saves the context, fetches the entry address at offset 4*(eax) in sys_call_table, and calls the service routine; on return the handler restores the context and the user program continues at the next instruction.
General process of a system call
System call table (x86)
- The table associates each system call number with its service routine.
- It has NR_syscalls entries (e.g. 289 in the Linux 2.6.11 kernel).
- The nth entry contains the entry address (a function pointer) of the service routine for system call number n.
- Every entry is four or eight bytes.
General process of a system call
Service routine (may differ between architectures)
- The service routine is the kernel's implementation of the system call.
- Service routines in the kernel start with sys_, such as sys_fork() and sys_exit() (/usr/src/linux-x/kernel/sys.c).
General process of a system call
Service routine: an example
API: getpid() -> sys_getpid()
    asmlinkage long sys_getpid(void)
In the kernel source it is defined with the SYSCALL_DEFINE0 macro:
    SYSCALL_DEFINE0(getpid)
    {
        return task_tgid_vnr(current);  /* return current->tgid */
    }
System call implementation (1)
Entering and exiting a system call
Two ways to enter a system call:
- Execute the int $0x80 assembly instruction
- Execute the sysenter assembly instruction (introduced with the Pentium II and its successors)
Accordingly, two ways to exit from a system call:
- Execute the iret assembly instruction
- Execute the sysexit assembly instruction
System call implementation (2)
Step 1: During kernel initialization, trap_init() sets up the IDT entry for vector 128:
    set_system_gate(0x80, &system_call)
The call loads the following values into the gate descriptor fields:
- Segment Selector: the __KERNEL_CS Segment Selector of the kernel code segment.
- Offset: the pointer to the system_call() system call handler.
- Type: set to 15. Indicates that the exception is a Trap and that the corresponding handler does not disable maskable interrupts.
- DPL (Descriptor Privilege Level): set to 3. This allows processes in User Mode to invoke the exception handler.
Therefore, when a User Mode process issues an int $0x80 instruction, the CPU switches into Kernel Mode and starts executing instructions from the system_call() address.
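System call implementation (3)
Step 2: the system_call() function
system_call() is the common entry point for all system calls.
- One: save the system call number and the CPU registers the handler may use on the kernel stack, except for eflags, cs, eip, ss, and esp, which the hardware has already saved automatically.
- Two: store the address of the current process's thread_info structure.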
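System call implementation (4)
Step 2: the system_call() function, SAVE_ALL in detail
Kernel stack contents, in the order they are pushed:
- Pushed by hardware: ss, esp, eflags, cs, eip
- ORIG_EAX: the system call number (eax)
- Pushed by SAVE_ALL: es, ds, eax, ebp, edi, esi, edx, ecx, ebx
- thread_info (shown at the other end of the kernel stack area)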
System call implementation (5)
Step 2: the system_call() function
system_call() is the common entry point for all system calls.
- Three: check the TIF_SYSCALL_TRACE and TIF_SYSCALL_AUDIT flags in thread_info.
- Four: check the validity of the system call number.
- Five: invoke the specific service routine that corresponds to the system call number.
System call implementation (6)
Step 3: exiting from the system call
- After the service routine finishes, system_call():
  - gets the return value from eax
  - stores it in the stack location where the User Mode value of eax was saved, so that the process finds the return code in eax once it is back in User Mode
- system_call() then disables local interrupts, checks the flags in the thread_info of the current process, and finally restores the registers.
System call implementation (7)
System call parameter passing
System calls need input and output parameters, which may be:
- actual values
- addresses of variables in the user address space
- pointers to functions in user space
Because all system calls share the same entry point, at least one parameter is always needed: the system call number, passed in the eax register to tell them apart. For example:
- If the user calls the fork() API, eax is set to 2 (__NR_fork) before the int $0x80 instruction runs.
- eax is set by the wrapper routine, so the programmer does not need to care about it.
- On entering system_call(), the value of eax is pushed onto the kernel stack.
System call implementation (8)
System call parameter passing (x86)
Before int $0x80 is invoked, the system call parameters are copied from the user stack into CPU registers; once in Kernel Mode, the registers are copied onto the kernel stack.
Figure: user stack -> parameters in registers -> kernel stack
System call implementation (9)
System call parameter passing (x86)
Limitations:
1. No more than six parameters can be passed, besides the system call number in eax.
2. No parameter may be wider than a register.
How are more than six parameters passed? A single register holds a pointer to a memory area in the user address space that contains the parameter values.
Figure: user stack -> parameters in registers -> kernel stack
System call implementation (10)
System call parameter passing
Different architectures pass parameters in different ways, but the principle is the same:
Architecture   System call number   Parameters
IA-32          eax                  ebx, ecx, edx, esi, edi, ebp
Alpha          v0                   a0, a1, a2, a3, a4, a5
PowerPC        r0                   r3, r4, r5, r6, r7, r8
AMD64          rax                  rdi, rsi, rdx, r10, r8, r9
System call implementation (11)
System call return value
- The system call handler returns an int or long:
  - >= 0 if execution proceeded normally
  - a negative error code if an error occurred
- When an error occurs, the wrapper routine stores the error code in errno and returns -1 to the caller.
- errno carries the semantic information about the failure; the codes are declared in <errno.h> (#include <errno.h>).
System call: how to trace
ptrace()
- Allows a parent process to observe and control a child
- The child stops before signal delivery or system call execution
- The parent waits for the child
- The parent can view and modify the child's state
- It is possible to attach to and reparent existing processes
- Architecture dependent
strace
- A User Mode tool used primarily to track system calls
- Prints all system calls in order, including function names, parameter lists, and return values
- Based on ptrace()
System call category (1)
Groups of services:
- Process control and IPC
- Memory management: allocating and freeing memory space on request
- File and file-system management
- Device management
- Communications: networking and distributed computing support
- Other services, e.g. profiling and debugging
System call category (2)
- Process management
- Memory management
  - brk(): change the size of the data segment
  - mlock(): lock memory pages in RAM
  - mmap(): map a file into memory pages
  - msync(): flush mapped data in memory to disk
  - mprotect(): set the protection of memory pages
- File management
System call cost
Cost of crossing the kernel barrier
- More than a procedure call
- Costs:
  - Interrupt vector mechanism
  - Establishing the kernel stack
  - Validating parameters, e.g. is the file descriptor valid? Does a pointer fall outside the caller's user address space?
  - Kernel mapped into the user address space? Then: updating page map permissions
  - Kernel in a separate address space? Then: reloading page maps; invalidating the cache and TLB
Zero-copy technology (1)
A simple example: four copies
Zero-copy technology (2)
A simple example: four context switches
Zero-copy technology (3)
Direct I/O: two context switches, two copies.
Figure labels: APP context, kernel context, device; read (copy), read buffer, APP buffer, write (copy), socket buffer, NIC buffer
Zero-copy technology (3)
transferTo(): two context switches
Zero-copy technology (4)
transferTo(): three copies
Zero-copy technology (5)
NIC with the ability to gather: two copies
Zero-copy technology (6)
NIC with the ability to gather: two context switches
System call and security issues
- Operating system security
  - The system call is the interface between user programs and the kernel.
  - It runs with high access permissions to hardware.
- Intrusion detection with system calls
  - System call sequence analysis
  - Short sequence analysis
  - Hidden Markov chain model
  - Monitoring based on data mining
- System calls and computer immunization
  - Construct a new type of computer protection system based on the principles of artificial immune technology
- System calls and anti-virus techniques
- Enhancing the kernel with system calls, e.g. adding SVM to the kernel