Are Your Mobile Apps Well Protected? Daniel Xiapu Luo csxluo@comp.polyu.edu.hk Department of Computing The Hong Kong Polytechnic Unviersity 1
What if your mobile app is reverse-engineered by others? Core business logic and major algorithms could be learnt by your competitors. Credentials in apps. 2
Turn your app into a malware Source: businessinsider.com 3
Secure, no ads Save $6.99, but get ads 4
Source: Trend Micro Percentage of top 10 apps in each category which have repacked version: 100% of the apps of Widgets, Media & Video, etc. 90% of the apps of Business, Music & Audio, etc. 5 5
6
7
8
Can they protect your app? 9
Outline Android App Protection You Can Run But You Cannot Hide Suggestions 10
Android App Protection Goal Raise the bar for attackers Hide the code Make the code hard to be understood. m4k3 7h3 (0d3 h4rd 70 83 und3r5700d 11
Android App Protection Techniques used by packers Obfuscation Dynamic class loading Java reflection Dex file modification Native code Emulator detection Anti-debug Hide the code 12
Obfuscation Transform the code to make it difficult to understand or change while keeping its functionalities. Renaming identifier Equivalent expression Encrypting data Splitting and merging functions Complicating control flow Inserting bogus codes ProGuard 13
Dynamic class loading A feature supported by Java Implement the core business logic in a separated class. The class can be located in the server or released from a native library. Load the class into the runtime when the class is used. 14
Java reflection A feature supported by Java An app can use it to Inspect classes, interfaces, fields and methods at runtime without knowing their names, Instantiate new objects dynamically, Invoke methods dynamically, 15
Dex file modification Hide the method. Bad code to make reverse-engineering tools crash. Opcodes AXML Resource files Source: Hu et al. 16
Native code App can invoke native code through Java native interface (JNI). Native code can modify the dex file in the memory. Source: A. Blaich 17
Emulator detection The adversary can observe how an app executes by running it in an emulator (e.g., Qemu). Emulator is a software that usually has fixed configuration. So it is different from a real smartphone. Device ID 000000000000000 18
Outline Android App Protection You Can Run But You Cannot Hide Suggestions 19
Dex file The executable of an App Java source code - > Java class -> dex Java class: each file contains one class dex: one file contains all classes 20
You Can Run But You Cannot Hide Can we extract the dex file from a packed app? Yes! DexHunter Yueqian Zhang, Xiapu Luo, and Haoyang Yin, DexHunter: Toward Extracting Hidden Code from Packed Android Applications, Proceedings of the 20th European Symposium on Research in Computer Security (ESORICS), Vienna, Austria, Sept. 2015. Paper: http://www4.comp.polyu.edu.hk/~csxluo/dexhunter.pdf Source code and demo: https://github.com/zyq8709/dexhunter Key insight Dex file will be loaded and run by Android runtime, including Dalvik virtual machine (DVM) and the new Android Runtime (ART), which controls everything. 21
DexHunter Modify Android runtime (i.e., DVM and ART) to extract the dex file. Dex File The header contains the length and the offset for each section. class_defs section contains class_def_items, each of which describes a class. 22
When to unpack the app? When the first class of the app is being loaded. Why? Before a class is loaded, the content of the class should be available in the memory; When the class has been initialized, some content in memory may be modified dynamically; Just before a method is invoked, its code_item or instructions should be available. How? Load and initialize all classes proactively. 23
How to unpack the app? Integrate our tool into Android runtime including DVM and ART. Wait for the time when the first class of the app is being loaded. Locate the target memory region. Dump the selected memory area. Correct and reconstruct the dex file. 24
Loading & initializing classes Traverse all class_def_items in the dex file. For each one, DexHunter loads it with FindClass function (ART) or dvmdefineclass function (DVM). DexHunter initializes it with EnsureInitialized function (ART) or dvmisclassinitialized & dvminitclass functions (DVM). 25
Locating the target memory region The target memory region contains the dex file. DexHunter uses a special string to determine whether the current dex file is what we want. ART: the string location_ in DexFile objects. DVM: the string filename in DexOrJar objects. 26
27
Dumping the dex file in memory Divide the target memory region Part 1: the content before the class_defs section Part 2: the class_defs section Part 3: the content after the class_defs section Dump part 1 into a file named part1 and part 3 into a file named data. 28
Correcting and collecting Why? Packing services may modify the memory dynamically. The memory consists of the region containing the dex file and the method objects (i.e., ArtMethod in ART, Method in DVM) managed by runtime. The runtime executes instructions according to the managed method objects. 29
Reconstructing the dex file Eventually, DexHunter has four files: part1, classdef, data, extra. It combines them according to the following sequence to construct a dex file. part1 classdef data extra 30
Products under Investigation 360 http://jiagu.360.cn/ Ali http://jaq.alibaba.com/ Baidu http://apkprotect.baidu.com/ Bangcle http://www.bangcle.com/ Tencent http://jiagu.qcloud.com/ ijiami http://www.ijiami.cn/ 31
Experiment Setup 32
Anti-debugging All products detect debugger Anti-ptrace Anti-JWDP. They cannot detect DexHunter. 33
360 Version: 06-21-2015 It encrypts the dex file and saves it in libjiagu.so/libjiagu_art.so. It releases the data into memory and decrypts it while running. 34
Ali Version: 21-06-2015 It splits the original dex file into two parts One is the main body saved in libmobisecy.so The other one contains the class_data_item and the code_item of some class_def_items. It releases both two parts into memory as plain text and corrects some offset values in the main body while running. Some annotation_offs are set to incorrect values. 35
Baidu Version: 21-06-2015 It moves some class data items to other places outside the dex file. It wipes the magic numbers, checksum and signature in the header after the dex file has been opened. 36
Baidu It fills in an empty method just before it is invoked and erases the content after the method is finished. DexHunter instruments method invocation to dump these methods which is available only just before invoking. doinvoke (ART) dvmmterp_invokemethod (DVM) 37
Bangcle Version: 21-06-2015 It prepares the odex file or oat file in advance. It encrypts the file and stores it in an external jar file. It decrypts the data while running It hooks several functions in libc.so, such as fwrite, mmap, 38
ijiami Version: 21-06-2015 Similar to Bangcle. The string changes every time the app runs. It releases the decrypted file, which is also encrypted as a jar file, with different file names each time while they are in the same directory. 39
Tencent Version: 25-05-2015 It can protect the methods selected by users. If a method is selected, it cannot be found in the relevant class_data_item. It releases the real class_data_item and adjusts the offset. The code_item of the selected method is still in the data section. Some annotation_offs are set to 0xFFFFFFFF. It can only runs in DVM. 40
Outline Android App Protection You Can Run But You Cannot Hide Suggestions 41
Suggestions Do not assume that your app cannot be reverse-engineered by others. Do not put secrete into your app. Protect your apps using various techniques Strong obfuscation algorithms Implement core business logics into native code and then pack the native code Server side verification Customized hardening services 42
Suggestions Detect repackaged apps from markets Simple approach Finding apps with similar descriptions, etc. Advanced approach Detect repackaged apps by comparing their codes. It may be affected by the app hardening techniques. Detect repackaged apps by comparing their resources. 43
ResDroid A scalable approach to detect repackaged apps by leveraging resource features (e.g., GUI, etc.) instead of code. Use statistical features for the coarse-grained processing Use structural features for the fine-grained processing http://www4.comp.polyu.edu.hk/~csxluo/resdroid.pdf 44
Thanks my group members and collaborators for contributing to this research: Yueqian Zhang Wenjun Hu Yuru Shao Haoyang Yin Xiaobo Ma Xian Zhan 45
DexHunter Paper: http://www4.comp.polyu.edu.hk/~csxluo/dexhunter.pdf Source code and demo : https://github.com/zyq8709/dexhunter ResDroid Paper: http://www4.comp.polyu.edu.hk/~csxluo/resdroid.pdf Our other tools and papers on Android security: http://www4.comp.polyu.edu.hk/~csxluo 46
References Christian Collberg and Jasvir Nagra, Surreptitious Software: Obfuscation, Watermarking, and Tamperproofing for Software Protection, Addison- Wesley, 2009 Yuru Shao, Xiapu Luo, Chenxiong Qian, Pengfei Zhu, and Lei Zhang, Towards a Scalable Resource-driven Approach for Detecting Repackaged Android Applications, Proceedings of the 30th Annual Computer Security Applications Conference (ACSAC), New Orleans, USA, December 2014. A. Apvrille and R. Nigam, Obfuscation in android malware, and how to fight back, Virus Bulletin, July 2014. M. Grassi, Reverse engineering, pentesting, and hardening of android apps. DroidCon, 2014. T. Strazzere and J. Sawyer, Android hacker protection level 0, DefCon, 2014. (android-unpacker, https://github.com/strazzere/android-unpacker) ZjDroid, http://blog.csdn.net/androidsecurity/article/details/38121585 Y. Park, We can still crack you! general unpacking method for android packer (no root), Blackhat Asia, 2015. 47