Identifying the Applied Obfuscation Method towards De-obfuscation
Hayato Sagisaka, Division of Frontier Informatics, Graduate School of Kyoto Sangyo University.
Haruaki Tamada, Faculty of Computer Science and Engineering, Kyoto Sangyo University.

Abstract

Recently, various protection methods have been proposed to prevent cracking. One of them is obfuscation, which transforms a program into one that is hard to understand in order to hide secret information in the program. On the other hand, de-obfuscation is an interesting research topic for protecting software: although vulnerable protection methods are dangerous, measuring the robustness of such methods has not been discussed. In this paper, we tackle identifying the applied obfuscation method towards de-obfuscation, because performing de-obfuscation requires a suitable method for each obfuscation method. For this, we obfuscated programs with two practical tools and three algorithms from one academic tool. Then, we analyzed the programs and extracted characteristics from them based on opcodes. By using the proposed method, we could identify the applied obfuscation method.

I. INTRODUCTION

Serious security incidents have arisen, such as illegal analysis and software theft. To prevent such malicious attacks, many protection methods have been proposed. However, the robustness of these protection methods has never been discussed. Increasing robustness requires a deeper discussion of robustness and vulnerabilities. In particular, systematizing how protection methods are attacked helps in evaluating and constructing more robust protection methods. This paper focuses on de-obfuscation, which converts an obfuscated program into a more understandable program. If a strong de-obfuscation method is established, fragile obfuscation methods will disappear. We consider the process of applying a de-obfuscation method. To obtain the de-obfuscated program, we apply the reverse transformation of the obfuscation algorithm.
Each obfuscation method has a different algorithm. That is, the proper reverse transformation also differs for each obfuscation method. Therefore, as the first step, we must identify the applied obfuscation method in order to de-obfuscate. This paper proposes a method for identifying the applied obfuscation method towards de-obfuscation. We develop three research questions to evaluate the proposed method.

RQ1: What characteristics does the obfuscation method have?
RQ2-1: Are the characteristics useful to identify the known obfuscation method applied to an unknown product?
RQ2-2: Are the characteristics useful to identify the unknown obfuscation method applied to a known product?

We conducted empirical studies to answer the above research questions. The results show that the proposed method could identify the applied obfuscation method based on a survey of the obfuscation methods in advance. The remainder of this paper is organized as follows. Section II reviews related work. Section III describes the proposed method and illustrates its process. Section IV presents the empirical studies to answer the research questions. Section V gives the conclusion and some future work.

II. RELATED WORK

Previous protection methods are categorized into (1) preventing theft, (2) detecting stolen programs, and (3) proving the theft [1]. In category (1), obfuscation methods and anti-assemble methods were proposed [2], [3]. Category (2) includes software birthmark methods [4], [5], and category (3) includes software watermark methods [6]. An attack method against obfuscation methods was proposed for category (1) [7]. Cimato et al. proposed a conversion method to de-obfuscate the names in a program. Blanc et al. also proposed de-obfuscating a program to clarify its control flow with an abstract syntax tree [8]. Both methods are quite important, since they become techniques for evaluating the robustness and vulnerabilities of obfuscation methods.
However, performing de-obfuscation requires a suitable method for each obfuscation method. Furthermore, the applied protection methods are generally private information. Therefore, identifying the applied protection method is important as the first step of de-obfuscation.

III. PROPOSED METHOD

A. Key Idea

Various obfuscation methods have been proposed over the past few decades. Almost all methods convert the given program in order to hide the private information in the program. Each conversion characterizes the obfuscation algorithm, which hides certain information. For example, name obfuscation hides the meanings of variable, method, and class names by changing them to meaningless names [2], because we assume that the names in a program greatly help in understanding the program. Therefore, the name obfuscation method protects the program by converting it into a more incomprehensible program. Several algorithms have been proposed for how name obfuscation changes the names. A straightforward algorithm changes variable names into single alphabetic characters. A more powerful algorithm changes variable names into reserved words and invisible characters such as spaces, tabs, and control sequences [9]. Other algorithms change method names to introduce overloading [2], or change not the definition parts of symbol names but the use parts, with a dynamic name resolution mechanism [10], [11]. Even though these obfuscation algorithms are all categorized as name obfuscation methods, we will extract characteristics of each algorithm from the conversion process and the resultant programs. Similarly, when other protection methods are used, characteristics of the method will arise. Therefore, we tackle identifying the applied obfuscation method towards de-obfuscation by measuring the program artificiality.

Copyright 2016 IEEE. ICIS 2016, June 26-29, 2016, Okayama, Japan.

B. Overview of the Proposed Method

Birthmark methods, which use characteristics of the program, have been proposed [4], [5]. A birthmark method originally computes similarities between programs with the extracted native characteristics in order to detect pirated programs. The process of the proposed method is ported from birthmark methods: it extracts the characteristics of the program and compares them. However, the extracted characteristics are absolutely different. The birthmark methods focus on the characteristics of the program. The proposed method focuses on the characteristics of the protection method. On the other hand, Kanzaki et al. proposed a method to measure the stealthiness of a program [12]. The method extracts n-grams from the opcodes of the given program, and computes artificiality by their probability in a corpus. The artificiality means how natural the opcode sequences in the target program are. Kanzaki et al. assume that a compiler generates natural opcode sequences. Generally, protection methods alter the opcode sequence by their algorithm. Then, the protected program may have unnatural opcode sequences. If a protection method has natural opcode sequences, the algorithm of the method is stealthy. We focus on the method to compute artificiality, and assume that a protected program has certain characteristics of the protection algorithm. Then, we tackle identifying the protection method of the program by the extracted characteristics.

[Fig. 1: Key idea of the proposed method. Programs p and q are each transformed by methods $f_1$ and f_2, giving $p_1$, $p_2 = f_2(p)$, $q_1 = f_1(q)$ and $q_2 = f_2(q)$; the diagram marks the resilient property (birthmark), the credibility property (birthmark), and the evaluation aspect of the proposed method between the extracted C(...) values.]

Figure 1 shows the key idea of the proposed method. In the figure, we obtain obfuscated programs $p_1$ and $p_2$ from the original program $p$ by certain obfuscation algorithms $f_1$ and $f_2$ ($p_1 = f_1(p)$, $p_2 = f_2(p)$). Then, using the extraction function $C$, we extract certain characteristics $C(p)$, $C(p_1)$, and $C(p_2)$ from $p$, $p_1$, and $p_2$. The evaluation aspects of the birthmark method are the resilient property and the credibility property. Evaluating the resilient property compares $C(p)$ and $C(p_1)$; a high score shows that $p$ and $p_1$ are in the copy relation. Evaluating the credibility property compares $C(p)$ and $C(q)$; a low score shows that $p$ and $q$ were developed independently. The evaluation aspect of the proposed method is $C(p_1)$ and $C(p_2)$, and $C(p_1)$ and $C(q_1)$. A low score for the former relation ($C(p_1)$ and $C(p_2)$) shows that different protection methods were applied. A high score for the latter relation ($C(p_1)$ and $C(q_1)$) shows that the same protection method was applied.

C. Characteristics of the Proposed Method

If used to implement the proposed method, almost all birthmark methods would fail to identify the protection method, because if we apply a certain protection method f to different programs p and q, a birthmark method expects C(p) = C(q). However, the proposed method expects $C(p_1)$ $C(q_1)$. We focus on the artificiality evaluation method by Kanzaki et al. for the proposed method.
A certain protection method inserts a ridiculous opcode sequence into a never-executed block. Another protection method changes a natural opcode sequence into an unnatural one. Such an unnatural opcode sequence might be a characteristic of the protection method. Originally, Kanzaki et al. computed the artificiality by probability. Ohtaki et al. ported Kanzaki's method to an object-oriented language, namely Java [13]. Their method uses the perplexity metric instead of Kanzaki's probability for computing the artificiality. The perplexity metric shows the complexity of a natural language. Low perplexity means that the next word is easy to predict. On the other hand, a certain protection method duplicates an original natural opcode sequence. Such a duplicated sequence is still natural. Therefore, we also focus on the frequencies of the n-grams. We thus employ the perplexity metric and the frequency of the n-grams as the characteristics of the protection method.

IV. EMPIRICAL STUDIES

A. Overview

In this study, we conduct three experiments to answer the three research questions.

RQ1: What characteristics does the obfuscation method have?
RQ2-1: Are the characteristics useful to identify the known obfuscation method applied to an unknown product?
RQ2-2: Are the characteristics useful to identify the unknown obfuscation method applied to a known product?

In the first experiment, we extract n-grams of practical protection tools, and investigate the characteristics. The second experiment tries to identify a known protection algorithm. The third experiment is similar to the second, and tries to identify an unknown protection algorithm. To conduct the experiments, we chose the products shown in Table I. The adopted obfuscation tools are shown in Table II. The first column gives the tool names and the algorithm names of Sandmark. The column named Abbr. shows the abbreviations of the tools and algorithms. The last column shows a short description of each tool and algorithm.

TABLE I: Target jar files

| Product | Version |
|---|---|
| ASM | 3.3. |
| FakeHack | .0 |
| Jhstop | 0.0. |
| Jwhich | .0 |
| JCalendar | .3.3 |
| Robocode-setup | .6.0. |

TABLE II: Obfuscation tools

| Tools (Algorithms) | Abbr. | Description |
|---|---|---|
| Allatori Java Obfuscator | ALL | Commercial tool. |
| ProGuard | PG | Popular OSS obfuscation tool. |
| Sandmark | | Academic protection tool which has many protection algorithms. |
| Duplication registers (Sandmark) | DR | Duplicates registers for variables. |
| Merge local integers (Sandmark) | MLI | Merges two integer values into one long value. |
| Irreducibility (Sandmark) | IRR | Updates the control flow to an irreducible one. |

[Fig. 2: Histogram of ASM opcodes in 5-grams (sorted by original 5-gram frequencies). Panels: (a) Original; (b) ALL (Allatori Java Obfuscator); (c) PG (ProGuard); (d) DR (Duplication Registers); (e) IRR (Irreducibility); (f) MLI (Merge local integers).]

We conducted the experiments on the Java platform, because many obfuscation algorithms are implemented for it. The extracted characteristics are n-grams of opcodes, and we compute their perplexities and frequencies. The n value is 2, 3, 4, and 5. First, we constructed the corpus used to compute perplexity. For this, we downloaded 3,786 jar files which are distributed by the Apache Software Foundation.
Note that the total number of classes is 660,465, and the total number of methods is 4,272,567.

B. Analysis of n-gram

In this experiment, we try to characterize the protection methods. First, we obfuscate the 6 products shown in Table I with the five obfuscators shown in Table II. Next, we extract n-grams of the opcode sequences from the resulting 30 products and the 6 original products. Finally, we investigate the characteristics of each tool. The histogram of opcode frequencies is shown in Figure 2. The horizontal axis represents each 5-gram, and the vertical axis shows their frequencies on a logarithmic scale. The order of the horizontal axis is the same in all graphs as in the original histogram. The histograms of the original and PG look the same. On investigating their 5-grams, both sets of 5-grams were exactly the same. We also investigated 5-grams from the other products, and the result was almost the same as the 5-grams from the original. The main algorithm of PG does not affect opcode sequences. Therefore, the proposed method is not useful for such obfuscation methods, which keep opcode sequences. On the other hand, the histograms of the other tools are dramatically different from the original histogram. The results show that the protection methods may affect the opcode sequences by their characteristics. Table III shows the 5-grams of ASM (top 5 frequencies). In Table III, we omitted PG, since its 5-grams are exactly the same as the original 5-grams. In Table III, frequencies decreased except for MLI, and perplexities became quite high. The result shows that obfuscated opcode sequences have rare opcode fragments, and these fragments would be characteristics.

TABLE III: ASM opcodes in 5-gram before/after obfuscation (top 5 frequencies)

| Tool | Rank | 5-gram | Frequency | Perplexity |
|---|---|---|---|---|
| Original | 1 | <code> <code> <code> <code> aload | | |
| Original | 2 | <code> <code> <code> aload getfield | | |
| Original | 3 | aload getfield ifnull aload getfield | | |
| Original | 4 | istore iload aload arraylength if_icmpge | | |
| Original | 5 | aload ldc invokespecial goto aload | | |
| ALL | 1 | aload dup getfield swap getfield | | |
| ALL | 2 | aload getfield ldc invokestatic invokevirtual | | |
| ALL | 3 | ldc invokestatic aload invokevirtual ifeq | | |
| ALL | 4 | goto ldc invokestatic aload invokevirtual | | |
| ALL | 5 | dup istore iload if_icmpge aload | | |
| DR | 1 | iconst_0 dup istore istore iload | | |
| DR | 2 | dup istore istore iload aload | | |
| DR | 3 | istore istore iload aload arraylength | | |
| DR | 4 | invokevirtual pop iconst_0 dup istore | | |
| DR | 5 | pop iconst_0 dup istore istore | | |
| MLI | 1 | dup2x2 lxor lconst_ lneg bipush | | |
| MLI | 2 | lload dup2x2 lxor lconst_ lneg | | |
| MLI | 3 | aload lload bipush lshr l2i | | |
| MLI | 4 | bipush lushr land lxor lstore | | |
| MLI | 5 | lconst_ lneg bipush lushr land | | |
| IRR | 1 | i2l iconst_0 i2l lcmp iconst_ | | |
| IRR | 2 | i2l lcmp iconst_ ixor iconst_ | | |
| IRR | 3 | iconst_0 i2l lcmp iconst_ ixor | | |
| IRR | 4 | iconst_ ixor iconst_ iand nop | | |
| IRR | 5 | iconst_2 irem istore iload i2l | | |

TABLE IV: Characteristics of the protection methods

| Tools | Characteristics for identifying |
|---|---|
| ALL | Swap two original opcodes by swap instruction. |
| PG | Almost the same as original opcode sequence. The proposed method cannot identify this tool. |
| DR | Duplicate istore instruction. |
| MLI | Emerge rare 2-gram dup2x2 lxor. |
| IRR | Emerge nop instruction. |
We found characteristics of the protection tools through investigation of every product, as shown in Table IV. Unfortunately, the characteristics of PG were not found; therefore, another identifying method is required. Table IV is the answer to RQ1.

C. Identifying the known protection method

In the previous experiment, we found the characteristics of each tool. In this experiment, we try to identify a known protection method applied to an unknown product. A known protection method means that its characteristics were already extracted.

[Fig. 3: Histogram of JUnit opcodes in 5-grams (sorted by original 5-gram frequencies). Panels: (a) histogram of 5-grams from the original product; (b) histogram of 5-grams from the obfuscated product.]

We chose JUnit as the unknown product, and the known protection method was randomly selected from Table II, except PG. Then, we obfuscated junit-3.7.jar and obtained junit-3.7-obfuscated.jar. Next, we extracted 5-grams from junit-3.7-obfuscated.jar and computed the frequencies and perplexities of the 5-grams. Figure 3a shows histograms of 5-grams extracted from junit-3.7.jar and junit-3.7-obfuscated.jar. At this point, the obfuscation algorithm was unknown; however, from the graphs we suppose that some obfuscation algorithm was applied. Then, we compare the graph of Figure 3b with the graphs in Figure 2. From the shapes of the graphs, Figure 3b seems to be similar to DR (Figure 2d) and MLI (Figure 2f). However, we have not yet succeeded in identifying the obfuscation method, since the graph shapes do not match exactly. Then, we inspect the details of the opcode sequences based on the characteristics of the protection methods shown in Table IV. Before inspection, we filtered the target 5-grams as follows. Let $P = \{g_1, g_2, \ldots, g_l\}$ be the n-gram set extracted from the original opcode sequence. Also, let $P^o = \{g^o_1, g^o_2, \ldots, g^o_m\}$ be the n-gram set from the obfuscated opcode sequence. Then, we keep the 5-grams in the set difference $P^o \setminus P = \{x \in P^o \mid x \notin P\}$. The resultant 5-grams with the top 5 frequencies are shown in Table V. The column named Freq. gives the frequency of the corresponding 5-gram, and the column named PPL gives its perplexity. In Table V, high-perplexity 5-grams appear, such as those including dup2x2 lxor. We inspected the resultant 5-grams using the characteristics shown in Table IV, and inferred that MLI was applied. Finally, the proposed method succeeded in identifying the obfuscation method, since the applied obfuscation method was MLI. Therefore, the answer to RQ2-1 is yes.

TABLE V: 5-gram of obfuscated JUnit (top 5 frequencies)

| 5-grams | PPL | Freq. |
|---|---|---|
| dup2x2 lxor lconst_ lneg bipush | | |
| lload dup2x2 lxor lconst_ lneg | | |
| bipush lushr land lxor lstore | | |
| lconst_ lneg bipush lushr land | | |
| lneg bipush lushr land lxor | | |

D. Identifying the unknown protection method

In this experiment, we try to identify an unknown protection method. An unknown protection method means that we have never extracted the characteristics from it. First, we chose ASM 3.3. as the target product, and SOP (simple opaque predicate) as the unknown protection method. SOP is a protection algorithm implemented in Sandmark. The SOP algorithm inserts opaque predicates (if statements that are always true or always false) into the program. We obfuscated asm-3.3..jar and obtained asm-3.3.-obfuscated.jar. Next, we extracted n-grams from both jar files in the same way as in Section IV-C. The histogram of 5-grams is shown in Figure 4. The graph of Figure 4 is similar to the graph of Figure 2d. However, the graphs do not match. Therefore, we inspect the 5-grams from asm-3.3.-obfuscated.jar. The resultant 5-grams with the top 5 frequencies are shown in Table VI, and we inspect them with the characteristics shown in Table IV. In Table VI, all of the 5-grams have high perplexities, and include multiple dup and imul instructions. No characteristic of a known protection method appears in Table VI. Therefore, we cannot identify the applied obfuscation method. However, we can infer that some protection method was applied, since the resultant 5-grams did not appear in the original opcode sequences. Naturally, we were able to identify SOP by extracting its characteristics beforehand in the same way as in Section IV-B. Therefore, the answer to RQ2-2 is no. However, extracting characteristics from many protection methods strengthens the proposed method.

TABLE VI: 5-grams of obfuscated ASM

| 5-grams | PPL | Freq. |
|---|---|---|
| iconst_2 irem iconst_0 if_icmpne aload | | |
| dup dup dup imul imul | | |
| dup dup imul imul isub | | |
| dup imul imul isub iconst_ | | |
| iload dup dup dup imul | | |

V. CONCLUSION

This paper proposed a method to identify the applied protection method by an artificiality evaluation method, towards de-obfuscation. We developed three research questions to evaluate the proposed method. In the empirical studies, we examined the proposed method to answer the three research questions. The answer to RQ1 is shown in Table IV, the answer to RQ2-1 is yes, and the answer to RQ2-2 is no. However, our method gradually becomes more useful as characteristics of many protection methods are extracted beforehand. In this paper, the characteristics of ProGuard could not be extracted.
Supporting such protection tools is left to our future work. Identifying the obfuscation method automatically is also an interesting research topic.

[Fig. 4: 5-gram of obfuscated ASM with unknown algorithm]

REFERENCES

[1] C. Collberg and J. Nagra, Surreptitious Software: Obfuscation, Watermarking, and Tamperproofing for Software. Addison-Wesley, July.
[2] P. M. Tyma, "Method for renaming identifiers of a computer program," United States Patent 6,02,966, August 2000, filed: Mar. 20, 1998.
[3] A. Monden, Y. Takada, and K. Torii, "Methods for scrambling programs containing loops," IEICE Transactions on Information and Systems, vol. 80, no. 7, July 1997, (In Japanese).
[4] H. Tamada, M. Nakamura, A. Monden, and K. Matsumoto, "Design and evaluation of birthmarks for detecting theft of java programs," in Proc. IASTED International Conference on Software Engineering (IASTED SE 2004), February 2004, (Innsbruck, Austria).
[5] H. Tamada, M. Nakamura, A. Monden, and K. Matsumoto, "Java birthmarks detecting the software theft," IEICE Transactions on Information and Systems, vol. E88-D, no. 9, September.
[6] C. Collberg and C. Thomborson, "Software watermarking: Models and dynamic embeddings," in Proc. Principles of Programming Languages 1999, POPL 99, January 1999, (San Antonio, TX).
[7] S. Cimato, A. D. Santis, and U. F. Petrillo, "Overcoming the obfuscation of java programs by identifier renaming," Journal of Systems and Software, vol. 78, October.
[8] M. D. Preda, M. Madou, K. D. Bosschere, and R. Giacobazzi, "Opaque predicates detection by abstract interpretation," in Proc. the th International Conference on Algebraic Methodology and Software Technology (AMAST2006), July 2006.
[9] PreEmptive Solutions, "Java obfuscation DashO," http:// (Last access: 2015/06/22).
[10] H. Tamada, M. Nakamura, A. Monden, and K. Matsumoto, "Obfuscating API library names using dynamic name resolution," IEICE Transactions on Information and Systems, vol. J90-D, no. 0, October 2007, (In Japanese).
[11] H. Tamada, M. Nakamura, A. Monden, and K. Matsumoto, "Introducing dynamic name resolution mechanism for obfuscating system-defined names in programs," in Proc. IASTED International Conference on Software Engineering (IASTED SE 2008), February 2008, (Innsbruck, Austria).
[12] Y. Kanzaki, T. Onoue, and A. Monden, "A method for evaluating the stealthiness of protected code based on its artificiality," IPSJ Journal, vol. 55, no. 2, February 2014, (In Japanese).
[13] R. Ohtaki, T. Ohdo, H. Tamada, Y. Kanzaki, and A. Monden, "Stealthiness evaluation of obfuscation methods with opcode and operand of java bytecode instructions," in Proc. of Symposium on Cryptography and Information Security 2014 (SCIS2014), January 2014, pp. CD ROM (2D2 2), (In Japanese).
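The filtering and inspection steps of Sections IV-C and IV-D, written out as an added example (not code from the paper or its tools): new 5-grams are taken from the set difference, ranked by frequency, scored for perplexity, and matched against the Table IV characteristics. The add-one smoothed bigram model, the `inject` helper, and all opcode sequences are constructed assumptions; only the signatures come from Table IV.

```python
import math
from collections import Counter

# Signatures transcribed from Table IV (PG has none: it keeps opcode sequences).
# Opcode spellings follow the page; javap prints dup2_x2 where the page has dup2x2.
SIGNATURES = {
    "ALL": ("swap",),             # swap instruction between two original opcodes
    "DR": ("istore", "istore"),   # duplicated istore
    "MLI": ("dup2x2", "lxor"),    # rare 2-gram dup2x2 lxor
    "IRR": ("nop",),              # nop appears
}


def ngrams(opcodes, n=5):
    """All contiguous opcode n-grams of one opcode sequence."""
    return [tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)]


class BigramModel:
    """Add-one smoothed opcode bigram model (assumed; the page only says 'perplexity')."""

    def __init__(self, corpus):
        self.context = Counter()
        self.bigram = Counter()
        for seq in corpus:
            self.context.update(seq[:-1])
            self.bigram.update(zip(seq, seq[1:]))
        self.vocab = len({op for seq in corpus for op in seq}) + 1  # +1 for unseen opcodes

    def perplexity(self, gram):
        log_p = 0.0
        for prev, cur in zip(gram, gram[1:]):
            p = (self.bigram[(prev, cur)] + 1) / (self.context[prev] + self.vocab)
            log_p += math.log2(p)
        return 2 ** (-log_p / (len(gram) - 1))


def contains(gram, signature):
    k = len(signature)
    return any(gram[i:i + k] == signature for i in range(len(gram) - k + 1))


def identify(original, obfuscated, model, n=5, top=5):
    """Return (verdict, rows); rows are (5-gram, frequency, perplexity) from P^o minus P."""
    known = set(ngrams(original, n))                  # P
    counts = Counter(ngrams(obfuscated, n))           # P^o with frequencies
    fresh = [(g, c) for g, c in counts.items() if g not in known]
    if not fresh:
        return "unchanged", []                        # PG-like: nothing to inspect
    fresh.sort(key=lambda gc: (-gc[1], gc[0]))        # most frequent first
    rows = [(g, c, model.perplexity(g)) for g, c in fresh[:top]]
    votes = Counter()
    for gram, freq, _ in rows:
        for tool, sig in SIGNATURES.items():
            if contains(gram, sig):
                votes[tool] += freq
    # New 5-grams with no known signature: something was applied, but not identifiable.
    verdict = votes.most_common(1)[0][0] if votes else "unknown"
    return verdict, rows


def inject(seq, segment, positions):
    """Constructed stand-in for an obfuscator: splice `segment` in at each position."""
    out = list(seq)
    for pos in sorted(positions, reverse=True):
        out[pos:pos] = segment
    return out


# Constructed sample data (not taken from the paper's jar files).
ORIGINAL = ("aload getfield ifnull aload getfield istore iload aload arraylength "
            "if_icmpge aload ldc invokespecial goto aload getfield invokevirtual ireturn").split()
CORPUS = [ORIGINAL, ("aload getfield aload getfield invokevirtual istore iload ifeq "
                     "aload ldc invokestatic ireturn").split()]
MLI_LIKE = "lload dup2x2 lxor lconst_1 lneg bipush lushr land lxor lstore".split()
SOP_LIKE = "iload dup dup dup imul imul isub iconst_2 irem iconst_0 if_icmpne".split()
MODEL = BigramModel(CORPUS)

if __name__ == "__main__":
    for label, segment in (("MLI-like", MLI_LIKE), ("SOP-like", SOP_LIKE)):
        verdict, rows = identify(ORIGINAL, inject(ORIGINAL, segment, [6, 13]), MODEL)
        print(label, "->", verdict)
        for gram, freq, ppl in rows:
            print(f"  {' '.join(gram):45s} freq={freq} ppl={ppl:.1f}")
    print("identical ->", identify(ORIGINAL, list(ORIGINAL), MODEL)[0])
```

The "unknown" verdict with non-empty rows corresponds to the SOP case in Section IV-D; adding a signature for it to `SIGNATURES` is the "extract beforehand" step the paper describes.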
More informationJava Class Loading and Bytecode Verification
Java Class Loading and Bytecode Verification Every object is a member of some class. The Class class: its members are the (definitions of) various classes that the JVM knows about. The classes can be dynamically
More informationTopics. Structured Computer Organization. Assembly language. IJVM instruction set. Mic-1 simulator programming
Topics Assembly language IJVM instruction set Mic-1 simulator programming http://www.ontko.com/mic1/ Available in 2 nd floor PC lab S/W found in directory C:\mic1 1 Structured Computer Organization 2 Block
More informationEmbedded Device Cooperative System Using Java Bytecode Instrumentation
THE SCIENCE AND ENGINEERING REVIEW OF DOSHISHA UNIVERSITY, VOL. 51, NO. 1 April 2010 Embedded Device Cooperative System Using Java Bytecode Instrumentation Ryota AYAKI *, Kohei KADOWAKI *, Hideki SHIMADA
More informationA Force-Directed Program Graph Visualizer for Teaching White-Box Testing Techniques
Paper ID #11203 A Force-Directed Program Graph Visualizer for Teaching White-Box Testing Techniques Dr. Weifeng Xu, Gannon University Dr. Weifeng Xu is an associate professor in department of computer
More informationCode Artificiality: A Metric for the Code Stealth Based on an N-gram Model
Code Artificiality: A Metric for the Code Stealth Based on an N-gram Model Yuichiro Kanzaki National Institute of Technology Kumamoto College Koshi, Kumamoto, Japan kanzaki@kumamoto-nct.ac.jp Akito Monden
More informationCompiling Techniques
Lecture 10: Introduction to 10 November 2015 Coursework: Block and Procedure Table of contents Introduction 1 Introduction Overview Java Virtual Machine Frames and Function Call 2 JVM Types and Mnemonics
More informationChapter 5. A Closer Look at Instruction Set Architectures. Chapter 5 Objectives. 5.1 Introduction. 5.2 Instruction Formats
Chapter 5 Objectives Chapter 5 A Closer Look at Instruction Set Architectures Understand the factors involved in instruction set architecture design. Gain familiarity with memory addressing modes. Understand
More informationChapter 5. A Closer Look at Instruction Set Architectures
Chapter 5 A Closer Look at Instruction Set Architectures Chapter 5 Objectives Understand the factors involved in instruction set architecture design. Gain familiarity with memory addressing modes. Understand
More informationOver-view. CSc Java programs. Java programs. Logging on, and logging o. Slides by Michael Weeks Copyright Unix basics. javac.
Over-view CSc 3210 Slides by Michael Weeks Copyright 2015 Unix basics javac java.j files javap 1 2 jasmin converting from javap to jasmin classfile structure calling methods adding line numbers Java programs
More informationEvaluating the WaterRpg Software Watermarking Model on Java Application Programs
Evaluating the WaterRpg Software Watermarking Model on Java Application Programs Ioannis Chionis Dept. of Computer Science University of Ioannina Ioannina, Greece ichionis@cs.uoi.gr Maria Chroni Dept.
More informationCOMP3131/9102: Programming Languages and Compilers
COMP3131/9102: Programming Languages and Compilers Jingling Xue School of Computer Science and Engineering The University of New South Wales Sydney, NSW 2052, Australia http://www.cse.unsw.edu.au/~cs3131
More informationImplementation of an Obfuscation Tool for C/C++ Source Code Protection on the XScale Architecture *
Implementation of an Obfuscation Tool for C/C++ Source Code Protection on the XScale Architecture * Seongje Cho 1, Hyeyoung Chang 1, and Yookun Cho 2 1 Dept. of Computer Science & Engineering, Dankook
More informationReverse engineering of binary programs for custom virtual machines
Reverse engineering of binary programs for custom virtual machines Alexander Chernov, PhD Katerina Troshina, PhD Moscow, Russsia SmartDec.net About us SmartDec decompiler Academic background Defended Ph.D.
More informationParallelism of Java Bytecode Programs and a Java ILP Processor Architecture
Australian Computer Science Communications, Vol.21, No.4, 1999, Springer-Verlag Singapore Parallelism of Java Bytecode Programs and a Java ILP Processor Architecture Kenji Watanabe and Yamin Li Graduate
More informationCMPSC 497: Java Security
CMPSC 497: Java Security Trent Jaeger Systems and Internet Infrastructure Security (SIIS) Lab Computer Science and Engineering Department Pennsylvania State University 1 Enforcement Mechanisms Static mechanisms
More informationCode Generation. Frédéric Haziza Spring Department of Computer Systems Uppsala University
Code Generation Frédéric Haziza Department of Computer Systems Uppsala University Spring 2008 Operating Systems Process Management Memory Management Storage Management Compilers Compiling
More informationCovert Communication with Authentication via Software Programs Using Invisible ASCII Codes --- A New Approach
Covert Communication with Authentication via Software Programs Using Invisible ASCII Codes --- A New Approach I-Shi Lee Dept. of Computer Science, National Chiao Tung University, Hsinchu, Taiwan 30010
More informationCompilation 2012 Code Generation
Compilation 2012 Jan Midtgaard Michael I. Schwartzbach Aarhus University Phases Computing resources, such as: layout of data structures offsets register allocation Generating an internal representation
More informationPart VII : Code Generation
Part VII : Code Generation Code Generation Stack vs Register Machines JVM Instructions Code for arithmetic Expressions Code for variable access Indexed variables Code for assignments Items How to use items
More informationPlan for Today. Safe Programming Languages. What is a secure programming language?
cs2220: Engineering Software Class 19: Java Security Java Security Plan for Today Java Byte s () and Verification Fall 2010 UVa David Evans Reminder: Project Team Requests are due before midnight tomorrow
More informationAn Empirical Evaluation of Java Decompilation and Intellectual Property Protection via Software Watermarking
An Empirical Evaluation of Java Decompilation and Intellectual Property Protection via Software Watermarking MPhil Transfer Report James Hamilton Supervisor: Sebastian Danicic Goldsmiths, University of
More informationTaming the Java Virtual Machine. Li Haoyi, Chicago Scala Meetup, 19 Apr 2017
Taming the Java Virtual Machine Li Haoyi, Chicago Scala Meetup, 19 Apr 2017 Who Am I? Previously: Dropbox Engineering Currently: Bright Technology Services - Data Science, Scala consultancy Fluent Code
More informationAPPLICATION OF WATERMARKING TO SOFTWARE PIRACY
APPLICATION OF WATERMARKING TO SOFTWARE PIRACY Ekene Frank Ozioko Department of Computer and Information Science, Enugu State University of Science and Technology, Enugu.(ekene.ozioko@esut.edu.ng) ABSTRACT
More informationSandMark Algorithms. Christian Collberg. January 28, 2003
SandMark Algorithms Christian Collberg 2 Christian Collberg Contents 3 4 CONTENTS Christian Collberg Part I Obfuscation Algorithms 5 Chapter 1 The AddBogusFields Obfuscation Algorithm Miriam Miklofsky
More informationObfuscating Transformations. What is Obfuscator? Obfuscation Library. Obfuscation vs. Deobfuscation. Summary
? Obfuscating Outline? 1? 2 of Obfuscating 3 Motivation: Java Virtual Machine? difference between Java and others? Most programming languages: Java: Source Code Machine Code Predefined Architecture Java
More informationExercise 7 Bytecode Verification self-study exercise sheet
Concepts of ObjectOriented Programming AS 2018 Exercise 7 Bytecode Verification selfstudy exercise sheet NOTE: There will not be a regular exercise session on 9th of November, because you will take the
More informationJava byte code verification
Java byte code verification SOS Master Science Informatique U. Rennes 1 Thomas Jensen SOS Java byte code verification 1 / 26 Java security architecture Java: programming applications with code from different
More informationCSc 620 Debugging, Profiling, Tracing, and Visualizing Programs. Compiling Java. The Java Class File Format 1 : JVM
Attributes Execute The Java Virtual Machine CSc 620 Debugging, Profiling, Tracing, and Visualizing Programs 1 : JVM Christian Collberg collberg+620@gmail.com The Java VM has gone the many complex instructions/large
More informationAre Your Mobile Apps Well Protected? Daniel Xiapu Luo Department of Computing The Hong Kong Polytechnic Unviersity
Are Your Mobile Apps Well Protected? Daniel Xiapu Luo csxluo@comp.polyu.edu.hk Department of Computing The Hong Kong Polytechnic Unviersity 1 What if your mobile app is reverse-engineered by others? Core
More informationIntellectual Property Protection using Obfuscation
Intellectual Property Protection using Obfuscation Stephen Drape Collaboration between Siemens AG, Munich and the University of Oxford Research Project sponsored by Siemens AG, Munich March 2009 Contents
More informationSoftware Theft Detection for JavaScript programs based on dynamic birthmark extracted from runtime heap graph
IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727, Volume 17, Issue 3, Ver. II (May Jun. 2015), PP 34-39 www.iosrjournals.org Software Theft Detection for JavaScript programs
More informationCSc 453 Interpreters & Interpretation
CSc 453 Interpreters & Interpretation Saumya Debray The University of Arizona Tucson Interpreters An interpreter is a program that executes another program. An interpreter implements a virtual machine,
More informationFramework for Sense Disambiguation of Mathematical Expressions
Proc. 14th Int. Conf. on Global Research and Education, Inter-Academia 2015 JJAP Conf. Proc. 4 (2016) 011609 2016 The Japan Society of Applied Physics Framework for Sense Disambiguation of Mathematical
More informationCOMP 520 Fall 2009 Virtual machines (1) Virtual machines
COMP 520 Fall 2009 Virtual machines (1) Virtual machines COMP 520 Fall 2009 Virtual machines (2) Compilation and execution modes of Virtual machines: Abstract syntax trees Interpreter AOT-compile Virtual
More informationIndustrial Approach: Obfuscating Transformations
Industrial Approach: Obfuscating Transformations Yury Lifshits Steklov Institute of Mathematics, St.Petersburg, Russia yura@logic.pdmi.ras.ru Tartu University 17/03/2006 Yury Lifshits (Steklov Inst. of
More informationPKI Platform for Campus Information Systems Using Cloud-based Finger Vein Authentication and PBI
868 Hitachi Review Vol. 65 (2016), No. 2 Featured Articles PKI Platform for Campus Information Systems Using Cloud-based Finger Vein Authentication and PBI Tsutomu Imai Kenta Takahashi, Ph.D. Takeshi Kikuchi
More informationOak Intermediate Bytecodes
Oak Intermediate Bytecodes A summary of a paper for the ACM SIGPLAN Workshop on Intermediate Representations (IR 95) James Gosling 100 Hamilton Avenue 3rd floor Palo Alto CA 94301 Oak
More informationJAM 16: The Instruction Set & Sample Programs
JAM 16: The Instruction Set & Sample Programs Copyright Peter M. Kogge CSE Dept. Univ. of Notre Dame Jan. 8, 1999, modified 4/4/01 Revised to 16 bits: Dec. 5, 2007 JAM 16: 1 Java Terms Java: A simple,
More informationCompiling Faster, Compiling Better with Falcon. Iván
Compiling Faster, Compiling Better with Falcon Iván Krȳlov @JohnWings Compiling Faster, Compiling Better with Falcon Overview of 3 technologies Falcon compiler ReadyNow & Compile Stashing Challenges (largely
More informationObfuscation Studio Executive
PROCEEDINGS OF WORLD ACADEMY OF SCIENCE, ENGINEERING AND TECHNOLOGY VOLUME 1 DECEMBER 25 ISSN 137-6884 Obfuscation Studio Executive Siarhei Petryk, and Vyacheslav Yarmolik Abstract New software protection
More informationTrusting Virtual Trust
Trusting Virtual Trust Jeremy Powell, Trupti Shiralkar Agenda The Java Virtual Machine Unmeasured Trust Java's Assurance 2 The Java Virtual Machine A Few Quick Disambiguations Overloaded terms Virtual
More informationWatermarking Java Application Programs using the WaterRpg Dynamic Model
Watermarking Java Application Programs using the WaterRpg Dynamic Model Ioannis Chionis Maria Chroni Stavros D. Nikolopoulos Abstract: We have recently presented an efficient codec system for encoding
More informationRefactoring Support Based on Code Clone Analysis
Refactoring Support Based on Code Clone Analysis Yoshiki Higo 1,Toshihiro Kamiya 2, Shinji Kusumoto 1 and Katsuro Inoue 1 1 Graduate School of Information Science and Technology, Osaka University, Toyonaka,
More informationDynamic Software Birthmark for Java based on Heap Memory Analysis
Dynamic Software Birthmark for Java based on Heap Memory Analysis Patrick P.F. Chan, Lucas C.K. Hui, and S.M. Yiu Department of Computer Science The University of Hong Kong, Pokfulam, Hong Kong {pfchan,hui,smyiu}@cs.hku.hk
More informationImage Classification Using Wavelet Coefficients in Low-pass Bands
Proceedings of International Joint Conference on Neural Networks, Orlando, Florida, USA, August -7, 007 Image Classification Using Wavelet Coefficients in Low-pass Bands Weibao Zou, Member, IEEE, and Yan
More informationCOPYRIGHT PROTECTION OF PALETTE IMAGES BY A ROBUST LOSSLESS VISIBLE WATERMARKING TECHNIQUE *
COPYIGHT POTECTION OF PALETTE IMAGES BY A OBUST LOSSLESS VISIBLE WATEMAKING TECHNIQUE * Pei-Pei Chen ( 陳佩貝 ) and Wen-Hsiang Tsai ( 蔡文祥 ), Dept of Computer Science, National Chiao Tung University, Hsinchu,
More informationBytecode-based Analysis for Increasing Class-Component Testability
Bytecode-based Analysis for Increasing Class-Component Testability Supaporn Kansomkeat Faculty of Engineering, Chulalongkorn University Bangkok, 10330, THAILAND supaporn.k@student.chula.ac.th Jeff Offutt
More informationSuggesting Potency Measures for Obfuscated Arrays and Usage of Source Code Obfuscators for Intellectual Property Protection of Java Products
2011 International Conference on Information and Network Technology IPCSIT vol.4 (2011) (2011) IACSIT Press, Singapore Suggesting Potency Measures for Obfuscated Arrays and Usage of Source Code Obfuscators
More informationStatic Analysis of Dynamic Languages. Jennifer Strater
Static Analysis of Dynamic Languages Jennifer Strater 2017-06-01 Table of Contents Introduction............................................................................... 1 The Three Compiler Options...............................................................
More informationCode Profiling. CSE260, Computer Science B: Honors Stony Brook University
Code Profiling CSE260, Computer Science B: Honors Stony Brook University http://www.cs.stonybrook.edu/~cse260 Performance Programs should: solve a problem correctly be readable be flexible (for future
More informationApplication of a Visual Computer Simulator into Collaborative Learning
Journal of Computing and Information Technology - CIT 14, 2006, 4, 267 273 doi:10.2498/cit.2006.04.01 267 Application of a Visual Computer Simulator into Collaborative Learning Yoshiro Imai 1, Keiichi
More informationA Content Based Image Retrieval System Based on Color Features
A Content Based Image Retrieval System Based on Features Irena Valova, University of Rousse Angel Kanchev, Department of Computer Systems and Technologies, Rousse, Bulgaria, Irena@ecs.ru.acad.bg Boris
More informationA CODE LEVEL OBFUSCATOR FOR CORE JAVA I. INTRODUCTION
A CODE LEVEL FOR CORE JAVA N.Dhakshyani 1,K.Ganga devi 2,G.S.Siva shankari 3 and M.Ashok 4 1,2,3 Information Technology, Rajalakshmi Institute of Technology, 4.Sr AP/Information Technology, Rajalakshmi
More informationYETI. GraduallY Extensible Trace Interpreter VEE Mathew Zaleski, Angela Demke Brown (University of Toronto) Kevin Stoodley (IBM Toronto)
YETI GraduallY Extensible Trace Interpreter Mathew Zaleski, Angela Demke Brown (University of Toronto) Kevin Stoodley (IBM Toronto) VEE 2007 1 Goal Create a VM that is more easily extended with a just
More informationThe Java Virtual Machine
The Java Virtual Machine Norman Matloff and Thomas Fifield University of California at Davis c 2001-2007, N. Matloff December 11, 2006 Contents 1 Background Needed 3 2 Goal 3 3 Why Is It a Virtual Machine?
More informationINTRODUCING DYNAMIC NAME RESOLUTION MECHANISM FOR OBFUSCATING SYSTEM-DEFINED NAMES IN PROGRAMS
Proc. of IASTED International Conference on Software Engineering IASTED SE 8 Innsbruck, Austria, February 8 To Appear INTRODUCING DYNAMIC NAME RESOLUTION MECHANISM FOR OBFUSCATING SYSTEM-DEFINED NAMES
More informationCopyright 2012, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 16
1 Copyright 2012, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 16 Towards JVM Dynamic Languages Toolchain Insert Picture Here Attila
More informationIntroduction to Compiler Construction in a Java World
Introduction to Compiler Construction in a Java World Bill Campbell, Swami Iyer, Bahar Akbal-Delibaş Errata Here you can find a listing of known errors in our text. If you find others, please let us know
More informationMnemonics Type-Safe Bytecode Generation in Scala
Mnemonics Type-Safe Bytecode Generation in Scala Johannes Rudolph CleverSoft GmbH, Munich Peter Thiemann Albert-Ludwigs-Universität Freiburg, Germany Scala Days 2010, Lausanne, 15.04.2010 Existing bytecode
More information