Intro to Segmentation Fault Handling in Linux. By Khanh Ngo-Duy (Khanhnd@elarion.com)

Seminar: What is Segmentation Fault (Segfault)? Examples and Screenshots. Tips to get Segfault information.

What is Segmentation Fault? A segmentation fault (Segfault), or access violation, is a particular error condition that can occur during the operation of computer software. A Segfault occurs when a program attempts to access a memory location that it is not allowed to access, or attempts to access a memory location in a way that is not allowed. Typical causes: writing to a read-only location; overwriting part of the operating system or other protected memory locations; accessing an invalid memory location, e.g. memory address NULL, -1, etc.

Examples and Screenshots (1 of 3) Write to Read-Only memory address

Examples and Screenshots (2 of 3) Write to Invalid memory address (NULL = 0x00)

Examples and Screenshots (3 of 3) Stack overflow
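The screenshots for these three examples did not survive, so here is a minimal sketch of what such programs might look like. It is not the author's original code: the function names (write_read_only, write_null, overflow_stack, run_and_get_signal) are hypothetical, and each faulting function is run in a child process so that the parent can report the signal instead of crashing itself. All three cases are undefined behavior in C; on a typical Linux system they end in SIGSEGV.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Case 1: write to read-only memory (string literals usually live there). */
static void write_read_only(void) {
    char *volatile s = "hello world"; /* volatile pointer keeps the store from being optimized away */
    *s = 'H';
}

/* Case 2: write to an invalid address (NULL). */
static void write_null(void) {
    char *volatile s = NULL;
    *s = 'H';
}

/* Case 3: stack overflow through unbounded recursion. */
static int recurse(int depth) {
    volatile char pad[1024];            /* forces every call to use real stack space */
    pad[0] = (char)depth;
    return recurse(depth + 1) + pad[0]; /* work after the call prevents tail-call elimination */
}

static void overflow_stack(void) {
    struct rlimit lim;
    /* Cap the stack at 1 MiB so the demo fails quickly even with a huge limit. */
    if (getrlimit(RLIMIT_STACK, &lim) == 0 &&
        (lim.rlim_cur == RLIM_INFINITY || lim.rlim_cur > (rlim_t)(1 << 20))) {
        lim.rlim_cur = (rlim_t)(1 << 20);
        (void)setrlimit(RLIMIT_STACK, &lim);
    }
    (void)recurse(0);
}

/* Run fn in a child process. Returns the signal that killed the child,
 * 0 if it exited normally, or -1 if fork/waitpid failed. */
static int run_and_get_signal(void (*fn)(void)) {
    assert(fn != NULL);
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        fn();
        _exit(0);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Calling run_and_get_signal(write_null) from main() and comparing the result with SIGSEGV shows the crash without taking the calling process down. To see the bare "Segmentation fault" message from the shell, call one of the three functions directly from main() instead.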

Tips to get Segfault information (1 of 7) Generally, when a Segfault occurs, very little information is provided (see previous slides), which makes it very hard to debug.

Tips to get Segfault information (2 of 7) Use dmesg to show the information that the kernel saves when an application crashes. The last lines of the output show the most recent Segfault.

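Tips to get Segfault information (3 of 7) How to read dmesg output? Example line:
Segfault[19960]: segfault at 7fffff7feff8 ip 400480 sp 7fffff7ff000 error 6 in Segfault[400000+1000]
Segfault[19960]: application name, with the process ID in brackets
segfault: reason it crashed
at 7fffff7feff8: address that caused the fault
ip 400480: instruction pointer address
sp 7fffff7ff000: stack pointer address
error 6: additional error code
in Segfault[400000+1000]: the binary or library the instruction pointer was in, with the start address and size of its memory mapping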
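As an illustration of how these fields fit together, the following sketch parses a line of this shape with sscanf and decodes the error code. The struct and function names are hypothetical, the exact log format can differ between kernel versions, and the meaning of the error bits used here is the x86 page-fault error code (bit 0: protection violation rather than page not present, bit 1: write rather than read, bit 2: user mode rather than kernel mode).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Fields of one kernel "segfault at ..." log line (all values are hex in the log). */
struct segv_info {
    unsigned long long fault_addr; /* address whose access faulted */
    unsigned long long ip;         /* instruction pointer */
    unsigned long long sp;         /* stack pointer */
    unsigned int error;            /* page-fault error code */
    unsigned long long map_base;   /* start of the mapping that contains ip */
    unsigned long long map_size;   /* size of that mapping */
};

/* Returns 1 if all six fields were parsed, 0 otherwise. */
static int parse_segv_line(const char *line, struct segv_info *out) {
    assert(line != NULL && out != NULL);
    const char *p = strstr(line, "segfault at ");
    if (p == NULL)
        return 0;
    int n = sscanf(p,
                   "segfault at %llx ip %llx sp %llx error %x in %*[^[][%llx+%llx]",
                   &out->fault_addr, &out->ip, &out->sp, &out->error,
                   &out->map_base, &out->map_size);
    return n == 6;
}

/* x86 page-fault error code bits. */
static int fault_was_protection(unsigned int e) { return (e & 1u) != 0; } /* else: page not present */
static int fault_was_write(unsigned int e)      { return (e & 2u) != 0; } /* else: read */
static int fault_was_user_mode(unsigned int e)  { return (e & 4u) != 0; } /* else: kernel mode */
```

For the line on this slide, error 6 decodes as a user-mode write to a page that is not present, and the fault address lies 8 bytes below the stack pointer, which would be consistent with the stack overflow example. Subtracting map_base from ip gives the offset of the faulting instruction inside the mapping (0x480 here), which is the number to look for when the fault is inside a shared library.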

Tips to get Segfault information (4 and 5 of 7) Add -g when compiling the source code. The compiler will add debugging symbols to the binary, which gives more useful information when debugging with gdb. Without -g, gcc still leaves some minimal symbol information (such as function names) in the binary. Drawbacks: the compiled binary will be larger (debugging symbols are added). The debugging information itself should not make the application slower or use more RAM, because it sits in sections that are normally not loaded at run time; debug builds are often slower only because they are also compiled without optimization. There may be some other drawbacks.

Tips to get Segfault information (6 of 7) Use nm to view the symbols in the binary file. It lists the address, type and name of each symbol, which gives us a chance to find out in which symbol the Segfault occurred. See $man nm for more information on its usage.
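To show how the nm listing relates to a crash address, here is a small sketch that takes nm-style text (one "address type name" entry per line) and returns the symbol with the highest address that is not above a given instruction pointer. The function name symbol_for_address is hypothetical, and the result is only a best guess: plain nm output does not include symbol sizes, so an address past the end of the last function would still be attributed to it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Find the symbol with the highest address <= addr in nm-style text.
 * Copies its name into name and returns 1, or returns 0 if none matches.
 * Lines without an address (undefined symbols such as "U puts") are skipped. */
static int symbol_for_address(const char *nm_text, unsigned long long addr,
                              char *name, size_t name_len) {
    assert(nm_text != NULL && name != NULL && name_len > 0);
    unsigned long long best = 0;
    int found = 0;
    const char *p = nm_text;
    while (*p != '\0') {
        size_t len = strcspn(p, "\n");
        char line[512];
        if (len < sizeof line) {
            memcpy(line, p, len);
            line[len] = '\0';
            unsigned long long sym_addr;
            char sym[256];
            /* "<hex address> <type letter> <name>" */
            if (sscanf(line, "%llx %*c %255s", &sym_addr, sym) == 2 &&
                sym_addr <= addr && (!found || sym_addr >= best)) {
                best = sym_addr;
                found = 1;
                snprintf(name, name_len, "%s", sym);
            }
        }
        p += len;
        if (*p == '\n')
            p++;
    }
    return found;
}
```

Fed with the saved output of nm and the ip value from dmesg, this answers the question "which function was running?" without opening a debugger.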

Tips to get Segfault information (7 of 7) Use ldd to view the shared library dependencies. It shows each shared library's name and starting address, so we can tell whether the Segfault occurred in our application or in a shared library. See $man ldd for more information on how to use ldd.

Using gdb: The GNU Debugger. Core dump file and gdb. objdump.

Using gdb The GNU Debugger (1 of 6) gdb supports: starting programs, attaching to running programs, or debugging crashed programs; debugging locally or remotely (via gdbserver); setting breakpoints and watchpoints; examining variables, registers and the call stack; changing data and calling functions; automating debug tasks; multi-threaded programs.

Using gdb The GNU Debugger (2 of 6) In order to debug a program effectively, add -g when compiling with gcc. Load a program into gdb: $gdb program. Once you are in gdb, you can run the program: (gdb)run [parameters to program]. To stop the program, press Ctrl+C. To quit gdb, execute the command q.

Using gdb The GNU Debugger (3 of 6) Step 1: Load the program into gdb. Step 2: Execute the program. gdb detects the Segfault but gives very little information (-g was not added when compiling). Step 3: Quit gdb.

Using gdb The GNU Debugger (4 of 6) Add -g when compiling. Step 1: Load. Step 2: Run. gdb detects the Segfault and shows the line that caused it: line 6, in main(), file Segfault.c. Step 3: Quit gdb.

Using gdb The GNU Debugger (5 of 6) Is this useful and easy? YES!!! But why? Because of -g, we can see the file name, function name and line number. Because the source code is available, we can see the exact line of code. If there is no source code, we can still see the file name, function name and line number, but NOT the contents of the line that caused the Segfault. No problem, still GOOD! :-) But this situation is a simple one, and sometimes you can NOT use this technique! See next...

Using gdb The GNU Debugger (6 of 6) This technique can ONLY be used when: You know for sure that the Segfault will occur. You are testing. In production you can NOT use it: gdb causes many side effects (it slows the program down, the run is not stable, etc.). Even when testing, if the application is very big or complicated (many threads, many resources), gdb may not be able to handle it. How to debug when your application is in production mode and you are not able to reproduce the Segfault? See the next techniques...

core dump file and gdb (1 of 5) A core dump consists of the recorded state of the working memory of a computer program at a specific time, generally when the program has terminated abnormally (crashed). A core dump file might contain: processor registers, which may include the program counter and stack pointer; memory management information; and other processor and operating system flags and information. Core dumps are disabled by default on some Linux distributions. To enable core dump generation, you can use the command line $ulimit -c <limit size of core file>. You can also insert code into your application to request a core dump when it crashes. To disable core dumps, just set <limit size of core file> to 0.

core dump file and gdb (2 of 5) Enable core dumps, limited to 1024 MB. A core file is generated when the app crashes. It is here!

core dump file and gdb (3 of 5) Once you have a core dump, what to do? Just load it into gdb and see: $gdb <application name> <core file name>

core dump file and gdb (4 of 5) Load the application and the core file. gdb reads the core file and shows the results as if the application had just run and crashed; actually, the core dump just shows the recorded HISTORY.

core dump file and gdb (5 of 5) Is this better than the previous technique? Yes: even when I cannot reproduce the Segfault, the core file shows it to me. Though it is good, there are still some disadvantages. The core file may grow very large if your application uses a lot of memory, so sometimes you simply cannot use this method. In a complicated application, forcing core dumps might have side effects and make your application unstable. All I have read about so far is side effects. Is there anything else? I don't want to risk the production system! YES, there is: see the last technique...

objdump (1 of 9) Advantages: No need to add -g, so the binary file stays small. No need to generate a core dump: no side effects, no disk space taken. Actually, you do not need to do anything in advance: what will come will come, and you will solve it! Disadvantages? You need a little knowledge of assembly language :-) Don't be scared, it is still easy! If an optimization flag is passed to gcc (-O, -O2, -O3), the assembly code will be a little harder to read later.

objdump (2 of 9) First of all, you need the output of dmesg (see the earlier dmesg slides). Note the address that caused the fault and the instruction pointer address. Use the tool named objdump to generate information from your application. Redirect the output of objdump to a file; we need this file later! $objdump -DCl <application name> > <output file>

objdump (3 of 9)

objdump (4 of 9) OK, so my fault address is 0x40058c and the instruction pointer is 0x40048c. mydump contains the assembly code of my app. Now I will see at which line of code my app crashed: just find where 0x40048c is in mydump. $grep -n -A 100 -B 100 40048c ./mydump What it does is find the line containing 40048c in ./mydump, and also show the 100 lines after and the 100 lines before the found line. You can customize the grep command as you want ;)

objdump (5 of 9) Step 1: Find the instruction pointer address. Step 2: Found, this is what caused the Segfault. Step 3: Look above to see in which function the code that caused the Segfault is. Here it is in main().
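The same three steps can be sketched in code. The function below scans an objdump-style listing held in a string, remembers the most recent function header line (such as "0000000000400474 <main>:"), and stops at the instruction line whose address equals the instruction pointer. The function name function_for_ip is hypothetical, and the listing used to try it out would have to come from your own objdump output; the layout assumed here is the usual one for objdump disassembly but is not guaranteed for every version or architecture.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Search an objdump-style listing for the instruction at address ip.
 * On success, copies the name of the enclosing function into func and returns 1.
 * Returns 0 if no instruction line with that address exists. */
static int function_for_ip(const char *listing, unsigned long long ip,
                           char *func, size_t func_len) {
    assert(listing != NULL && func != NULL && func_len > 0);
    char current[256] = ""; /* most recent "<name>:" header seen so far */
    const char *p = listing;
    while (*p != '\0') {
        size_t len = strcspn(p, "\n");
        char line[512];
        if (len < sizeof line) {
            memcpy(line, p, len);
            line[len] = '\0';
            unsigned long long addr;
            char name[256];
            char sep;
            if (sscanf(line, "%llx <%255[^>]>:", &addr, name) == 2) {
                /* Function header, e.g. "0000000000400474 <main>:" */
                snprintf(current, sizeof current, "%s", name);
            } else if (sscanf(line, "%llx%c", &addr, &sep) == 2 &&
                       sep == ':' && addr == ip) {
                /* Instruction line, e.g. "  40048c:\tc6 00 48 ..." */
                snprintf(func, func_len, "%s", current);
                return 1;
            }
        }
        p += len;
        if (*p == '\n')
            p++;
    }
    return 0;
}
```

This does the same job as the grep command followed by scrolling upwards, which can be handy when the listing is very large or when several crash addresses need to be looked up in a script.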

objdump (6 of 9) Now you know which assembly code caused the Segfault. Next, open your source code (in C, C++) and find the line of code that corresponds to that assembly code; you will figure out what caused the Segfault :-)

objdump (7 of 9) 0x48 = 'H'

objdump (8 of 9) You are done now! Bravo!!! With just the instruction pointer, you know exactly where the Segfault was caused. How about the address that caused the fault (0x40058c)? We have not used it, have we? No, we have not. BUT now I can say that the line that caused the Segfault is *s = 'H'; and that the address s pointed to at that time was 0x40058c. Meaningless to know this? NO! Sometimes you will need it to find the root cause; see the next slide.

objdump (9 of 9) Sometimes the address that caused the fault tells you the root cause. In the following example, we can say that the value of s is NULL.

Thanks for watching. If you find it useful, clap your hands :-)