Binary Analysis Tool Quick Start Guide
Version 4
Table of Contents

Getting and installing the tool
    Technical requirements
    Get the tool
    Confirm it is correctly installed
Get to work
    Automated extraction of the version and configuration of BusyBox
    Extraction of file systems
    Automated checking for the Linux kernel
    Brute force scanning of firmware
    Feeding known information through a knowledgebase
More information
Getting and installing the tool

Technical requirements

A recent Linux installation. We have tested on Fedora 13, Fedora 14 and Ubuntu 10.10.

    python (2.5 or higher preferred, but not 3)
    python-magic
    GNU binutils (for readelf and strings)
    e2tools http://freshmeat.net/projects/e2tools/ (optional)
    squashfs tools (4.0 highly recommended)
    module-init-tools (for modinfo)
    gzip (for zcat)
    xz (for lzma)
    zip
    unrar
    cabextract
    7z
    cpio
    tar
    PyXML
    sqlite3

Get the tool

You can download the latest release version of the tool from:

    http://www.binaryanalysis.org/en/content/show/download

The tool is available as RPM for Fedora 13 and Fedora 14, as SRPM so it can be rebuilt on another distribution, and as a DEB package for Ubuntu 10.10.

You can access the latest development version through Subversion at:

    http://www.binaryanalysis.org/trac/browser
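Before running the tool from source it can help to check that the external programs from the requirements list are available. The following is a minimal sketch of such a check and is not part of the tool. It targets a current Python 3 interpreter (independent of the tool, which itself needs Python 2), and the executable names are assumptions: the guide names packages rather than commands, so names such as unsquashfs or e2ls may differ on your distribution. It treats e2tools as optional, as the list appears to, and it does not check the Python modules (python-magic, PyXML).

```python
import shutil

# Executable names are assumptions: the guide lists packages, not commands,
# and the names a distribution installs may differ.
REQUIRED_COMMANDS = [
    "readelf", "strings",   # GNU binutils
    "unsquashfs",           # squashfs tools
    "modinfo",              # module-init-tools
    "zcat",                 # gzip
    "lzma",                 # xz
    "zip", "unrar", "cabextract", "7z", "cpio", "tar",
]
OPTIONAL_COMMANDS = ["e2ls"]  # e2tools


def missing_commands(commands, which=shutil.which):
    """Return the entries of `commands` that cannot be found on PATH."""
    return [name for name in commands if which(name) is None]


if __name__ == "__main__":
    for name in missing_commands(REQUIRED_COMMANDS):
        print("missing required command: " + name)
    for name in missing_commands(OPTIONAL_COMMANDS):
        print("missing optional command: " + name)
```

The `which` parameter exists only so the lookup can be replaced in a test. In normal use, run the script directly and install whatever it reports as missing.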
Confirm it is correctly installed

If you used a binary package (RPM, DEB) you can check that the installation works by running the bruteforce.py tool from the command line:

    bruteforce.py -c /etc/bat/bruteforce-config -b /path/to/binary

If you downloaded the source code, open a terminal and change to the directory the source code was unpacked into. Note that with this method you may have to install some dependencies yourself. Test the tool by running the bruteforce scanner on a binary:

    python ./bruteforce.py -c ./bruteforce-config --binary=/path/to/binary

If you see XML output, the tool is installed correctly. If you do not see the expected output, please download and read the user manual.

Get to work

At the moment the tool is focused on the analysis of binary firmware. It provides:

    Automated extraction of the version and configuration of BusyBox;
    Extraction of file systems;
    Automated checking for the Linux kernel;
    Brute force scanning of firmware;
    Feeding known information through a knowledgebase.

All the top-level scripts have a --help option, which displays more information on how to invoke them.

Automated extraction of the version and configuration of BusyBox

The busybox.py tool has three modes: printing a possible configuration extracted from a BusyBox binary, printing the names of applets for which no configuration exists in the source code of the official BusyBox release, or both. By default it prints only a configuration that could have been used to compile the BusyBox binary. In the near future there will be an export to a very simple XML file as well.
Example invocations:

    $ python busybox.py --binary=test/busybox --found
    $ python busybox.py --binary=test/busybox --found --missing
    $ python busybox.py --binary=test/busybox --missing

The busyboxversion.py tool does one thing: it prints the version number of a BusyBox binary.

    $ python busyboxversion.py --binary=test/busybox

The busybox-compare-configs.py tool can be used to compare an extracted configuration with an existing configuration. The tool takes at least two parameters: the path of the configuration extracted from a BusyBox binary and the path of the configuration from a source archive. If available, the BusyBox version number can be supplied to weed out some false positives.

    $ python busybox-compare-configs.py -e /tmp/extracted-config -f /tmp/original-config
    $ python busybox-compare-configs.py -e /tmp/extracted-config -f /tmp/original-config -n 1.11.1

The appletname-extractor.py tool takes two arguments: the full path to include/applets.h in a BusyBox source tree and a version number. It outputs a Python pickle file, which should be stored in the directory 'configs' before it can be used by busybox.py.

    $ python appletname-extractor.py -a /tmp/busybox-1.00-rc3/include/applets.h -n 1.00-rc3

This tool is typically run when a new version of BusyBox is released.

Extraction of file systems

There are currently no standalone scripts to extract individual file systems. The code is used from other scripts, such as bruteforce.py.
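Before a file system can be extracted it has to be located inside the firmware image. As a rough illustration of that first step, the following sketch searches a block of bytes for a few well-known magic markers. It is not the tool's own code: the function names and the marker table are made up for this example, and the table is only a small subset. It is written for a current Python 3 interpreter.

```python
# Well-known magic byte sequences. This is an illustrative subset,
# not the table used by the tool.
MAGIC_MARKERS = {
    "squashfs (little endian)": b"hsqs",
    "squashfs (big endian)": b"sqsh",
    "gzip (deflate)": b"\x1f\x8b\x08",
}


def find_markers(data):
    """Return a sorted list of (offset, name) for every marker occurrence."""
    hits = []
    for name, magic in MAGIC_MARKERS.items():
        offset = data.find(magic)
        while offset != -1:
            hits.append((offset, name))
            # Continue searching just past the start of the previous hit.
            offset = data.find(magic, offset + 1)
    return sorted(hits)


def scan_file(path):
    """Read a firmware image from `path` and return its marker hits."""
    with open(path, "rb") as firmware:
        return find_markers(firmware.read())
```

A hit only marks a candidate offset. Short sequences such as the three gzip bytes also occur by chance in unrelated data, so each candidate still has to be confirmed by actually trying to unpack the data at that offset.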
Automated checking for the Linux kernel

The findkernelstrings.py tool takes at least two parameters: the path to the binary kernel image and the path to the directory containing a search database, generated with the extractkernelstrings.py helper script. By default the tool reports which strings were found and in which file. There is an option to print which strings were not found and might need further investigation. Note that at present not all strings are detected correctly and there will be false positives. To limit the number of false positives we have set a minimum length for the strings we look at. This limit can be changed if necessary.

If configuration information extracted with extractkernelconfig.py is available, it can be fed to the tool to try to guess a kernel configuration. This functionality is limited at the moment. If information about the architecture is available it can be supplied as well, although this is very crude at the moment.

    $ python findkernelstrings.py -k /tmp/kernelimage -i /tmp/kernelstrings
    $ python findkernelstrings.py -k /tmp/kernelimage -i /tmp/kernelstrings -m
    $ python findkernelstrings.py -k /tmp/kernelimage -i /tmp/kernelstrings -s 9
    $ python findkernelstrings.py -k /tmp/kernelimage -i /tmp/kernelstrings -c /tmp/kernelconfig
    $ python findkernelstrings.py -k /tmp/kernelimage -i /tmp/kernelstrings -c /tmp/kernelconfig -a mips

The extractkernelconfig.py tool takes two arguments: the path to a directory with the unpacked Linux kernel sources and the path to a directory in which to store the search database. To ensure correctness, the Linux kernel sources should be an unpacked directory to which all necessary patches have already been applied. The reason is that the patch file format does not work well with our multiline regular expressions and could also lead to false positives.

    $ python extractkernelconfig.py -d ~/linux-2.6.15/ -i /tmp/kernelconfig/

The extractkernelstrings.py tool takes two arguments: the path to a directory with the unpacked Linux kernel sources and the path to a directory in which to store the search database. To ensure correctness, the Linux kernel sources should be an unpacked directory to which all necessary patches have already been applied. The reason is that the patch file format does not work well with our multiline regular expressions and could also lead to false positives.

    $ python extractkernelstrings.py -d ~/linux-2.6.15/ -i /tmp/kernelstrings/

Brute force scanning of firmware

The bruteforce.py tool tries to determine what is inside a firmware image without much prior knowledge of its contents. It does so by scanning for known magic markers of file systems (such as SquashFS) and compression methods (such as gzip), and for bootloader and kernel strings. It then unpacks the files it finds and analyses them in more depth.

The checks that the bruteforce.py tool runs can be configured through a configuration file. Documentation for the format of the configuration file is included in the source distribution. The source release also contains a demo configuration which configures basic functionality. The bruteforce.py tool outputs its results in XML.

    $ python bruteforce.py -b /tmp/firmware.bin -c /tmp/bruteforce-config

Feeding known information through a knowledgebase

The knowledgebase is currently functional for information extracted from the official BusyBox releases and the Linux kernel. The scripts involved have been described above. Experimental support for querying and populating a knowledgebase for the bruteforce scanning has been added, but is not used by default. Documentation for these experimental features can be found in the source archive.

More information

You can find more detailed instructions and background reading about the tool in the user guide. You will find it at:

    http://www.binaryanalysis.org/en/content/show/documentation

This document is licensed under the Creative Commons Attribution-No Derivative Works 3.0 Unported License. All trademarks belong to their respective owners.