Implementation of Customized FindBugs Detectors


Jerry Zhang
Department of Computer Science
University of British Columbia
jezhang@cs.ubc.ca

ABSTRACT

Many static code analysis tools exist for finding program errors automatically. Traditional techniques usually involve formal methods and complicated computations, and thus suffer from poor extensibility and performance. FindBugs was developed to address these issues. The system is based on the concept of bug patterns, which are claimed to be easy to implement and effective at discovering real bugs. To evaluate the system on these two aspects, we experimented with creating and using a custom detector from resources provided in the FindBugs package.

1. INTRODUCTION

As software products provide more functions, their structures tend to become more complicated. Finding program errors in such systems becomes harder as a result of this increased complexity. Although traditional manual code inspection still plays an important role in quality assurance, automated debugging tools are a necessary supplement that can greatly ease and enhance this process. Quite a few techniques are widely adopted in practice, and they fall mainly into two kinds: dynamic and static approaches.

Dynamic approaches validate whether a particular module of source code works properly by running user-defined test cases whose behaviours and results are expected from the module under test. Errors are discovered from exceptions thrown at runtime or from the outputs generated at the end. A good set of test cases is critical to finding potential problems, but this puts an additional burden on programmers, who must write extra testing code that covers as many situations as possible. For a large-scale software product, this is not always feasible due to time and budget constraints.
In fact, even the unit testing community, which represents the most widely adopted dynamic testing approach, does not suggest relying on it entirely, because testing all possible input combinations for any non-trivial software is unrealistic for programmers [1].

Static approaches, on the other hand, promise to find existing bugs in code without requiring as much manual effort, because most traditional static techniques are based on formal methods and sophisticated program analysis. This means the developer can simply hand the code to the system, let it handle the dirty work, and see what the result is. However, such systems can be difficult to apply in practice because of their great complexity and high false-positive rates.

A static analysis tool called FindBugs aims to address these issues. Unlike its predecessors, FindBugs employs a simpler yet powerful technique to conduct static analysis. The basis of the system is a set of bug patterns, which are code idioms that are likely to be errors. Occurrences of a bug pattern are places where code does not follow the desired practice for a language feature. Therefore, a bug pattern can be used to form a detector that probes for all bugs of the same type. FindBugs is essentially a tool providing a set of bug detectors for different types of bugs, and an interface to extend the rules with new patterns.

The developers of FindBugs claim that writing custom bug detectors is reasonably easy and that using them to find software errors yields few false positives [2]. The purpose of this paper is therefore to experiment with creating an application-specific bug detector to evaluate how easy it is, and to run it against the application source code to test its effectiveness in finding occurrences of the expected bug.

2. IMPLEMENTATION

This section discusses the steps involved in implementing a customized, application-specific bug detector and building it into the FindBugs system.

    DebugObj dobj = EncapObject();
    if (isdebugging)
    {
        DumpObjectToScreen(dobj);
    }

Figure 1: Inefficient code

    if (isdebugging)
    {
        DebugObj dobj = EncapObject();
        DumpObjectToScreen(dobj);
    }

Figure 2: Efficient code

2.1 System Setup and Preparation

Installing the Windows version of FindBugs is very straightforward. The Java Development Kit and the Byte Code Engineering Library (BCEL) are also required to run FindBugs and extend it with new detectors. According to [2], BCEL is needed because FindBugs uses it to implement its detectors.

2.2 Problem and Goal Description

One of my previous projects was modified and used as the target source code.
The program visualizes biomedical data. Two static methods are used for debugging purposes. One is EncapObject(), which gathers the required information about the visualized object and encapsulates it into an object. The other is DumpObjectToScreen(), which prints the encapsulated information on the screen. Both methods are called all over the code to monitor program state. There is also a static flag named isdebugging, whose boolean value determines whether or not to print information to the screen. The code snippet in Figure 1 is a modified version of how this situation is handled, and Figure 2 is the original code in my project.

In Figure 1, EncapObject() runs regardless of whether the program is in debug mode. Since EncapObject() is an expensive operation that consumes CPU time for data computation and memory for storage, we do not want it to be executed if the returned object is not going to be used. Therefore, it should be placed inside the if clause to avoid poor performance. Figure 2 is what we would like to have after fixing the bug. Since I understand my own program, it did not take long to search the whole project workspace and replace each occurrence of the code in Figure 2 with the code in Figure 1. Now I want to create a custom bug pattern that lets FindBugs identify the places in the code that call the EncapObject() method without placing the call inside an if clause.

2.3 Approach

Since FindBugs currently has over 200 detectors, I thought it might be possible to find an existing one, similar to what I was going to create, to use as a template. After some browsing, I concluded that my focus should be on detectors of the Bytecode Scanning type, because the other type, Bytecode Pattern, requires the bug to be expressible as an equivalent sequence of bytecode pattern expressions. Such a sequence was not obvious to form for our bug, and attempting it would have taken my project far from one of its motivations: evaluating ease of implementation. Moreover, of the four categories of implementation strategy a detector can choose from, [3] also suggests that Linear Code Scan would be the easiest and most suitable for our case. Therefore, I read through the simplest example provided in [3], the FindRunInvocations detector, and decided to implement our customized detector in a similar way: linearly scanning through the bytecode of the methods in the analyzed code, based on the visitor pattern.

2.4 Development

The FindRunInvocations detector overrides the visit(Code) and sawOpcode(int) methods provided by BCEL to walk through methods and analyze the opcodes within each method, respectively.
Therefore, our custom detector does the same, and Figure 3 shows the relevant parts of these two methods in the code.

2.4.1 Scanning Method

As mentioned in 2.3, visit(Code code) scans the bytecode of the analyzed code method by method. Lines 16 to 22 in Figure 3 correspond to this method. It does nothing fancy: it resets three variables before calling super.visit(), the superclass implementation that actually visits the method we want to analyze:

     5 public class UnGuardedEncapObjectCall
     6 {
     7     private int isdebuggingat;
     8     private int ifstartat;
     9     private int ifendat;

    16     public void visit(Code code)
    17     {
    18         isdebuggingat = -1;
    19         ifstartat = -1;
    20         ifendat = -1;
    21         super.visit(code);
    22     }

    24     public void sawOpcode(int seen)
    25     {
    26         if (classConstant.equals("visualize/debug")
    27             && nameConstant.equals("isdebugging"))
    28         {
    29             isdebuggingat = PC;
    30         }
    31         else
    32         {
    33             if (seen == IFEQ && isdebuggingat > -1
    34                 && (PC >= isdebuggingat + 1 && PC < isdebuggingat + 5))
    35             {
    36                 ifstartat = branchFallThrough;
    37                 ifendat = branchTarget;
    38             }
    39             if (classConstant.equals("visualize/debug")
    40                 && nameConstant.equals("EncapObject")
    41                 && (PC < ifstartat || PC >= ifendat))
    42             {
    43                 bugReporter.reportBug(
    44                     new BugInstance("UnGuardedEncapObjectCall",
    45                         HIGH_PRIORITY).addClassAndMethod(this).
    46                         addSourceLine(this));
    47             }
    48         }
    49     }
    50 }

Figure 3: visit(Code code) and sawOpcode(int seen) methods in the custom detector class UnGuardedEncapObjectCall
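The logic of Figure 3 can be tried outside FindBugs with a small stand-alone sketch. The Insn class, the helper methods, and the two instruction lists below are hypothetical stand-ins, not FindBugs or BCEL types. The byte offsets are illustrative and assume the usual JVM encoding in which getstatic, ifeq and invokestatic each occupy three bytes while astore_0 and aload_0 occupy one. The sketch also assumes that DumpObjectToScreen() lives in the same visualize/debug class and that instructions without a constant operand carry empty class and name values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class GuardScanSketch {
    /** Hypothetical stand-in for one decoded instruction; not a FindBugs or BCEL type. */
    static final class Insn {
        final int pc;
        final String opcode, owner, name;
        final int fallThrough, target;

        Insn(int pc, String opcode, String owner, String name, int fallThrough, int target) {
            this.pc = pc; this.opcode = opcode; this.owner = owner; this.name = name;
            this.fallThrough = fallThrough; this.target = target;
        }
    }

    static Insn plain(int pc, String opcode) {
        return new Insn(pc, opcode, "", "", -1, -1);
    }

    static Insn ref(int pc, String opcode, String name) {
        return new Insn(pc, opcode, "visualize/debug", name, -1, -1);
    }

    static Insn ifeq(int pc, int target) {
        return new Insn(pc, "ifeq", "", "", pc + 3, target); // ifeq is 3 bytes long
    }

    /** Mirrors the branching of sawOpcode in Figure 3; returns the PCs that would be reported. */
    static List<Integer> scan(List<Insn> method) {
        int isDebuggingAt = -1, ifStartAt = -1, ifEndAt = -1; // reset per method, as in visit()
        List<Integer> reports = new ArrayList<>();
        for (Insn in : method) {
            boolean inDebugClass = "visualize/debug".equals(in.owner);
            if (inDebugClass && "isdebugging".equals(in.name)) {
                isDebuggingAt = in.pc;                         // part a)
            } else {
                if ("ifeq".equals(in.opcode) && isDebuggingAt > -1
                        && in.pc >= isDebuggingAt + 1 && in.pc < isDebuggingAt + 5) {
                    ifStartAt = in.fallThrough;                // part b)
                    ifEndAt = in.target;
                }
                if (inDebugClass && "EncapObject".equals(in.name)
                        && (in.pc < ifStartAt || in.pc >= ifEndAt)) {
                    reports.add(in.pc);                        // part c)
                }
            }
        }
        return reports;
    }

    /** Illustrative instruction stream for the guarded code of Figure 2. */
    static List<Insn> guardedMethod() {
        return Arrays.asList(
            ref(0, "getstatic", "isdebugging"), ifeq(3, 14),
            ref(6, "invokestatic", "EncapObject"), plain(9, "astore_0"),
            plain(10, "aload_0"), ref(11, "invokestatic", "DumpObjectToScreen"),
            plain(14, "return"));
    }

    /** Illustrative instruction stream for the unguarded code of Figure 1. */
    static List<Insn> unguardedMethod() {
        return Arrays.asList(
            ref(0, "invokestatic", "EncapObject"), plain(3, "astore_0"),
            ref(4, "getstatic", "isdebugging"), ifeq(7, 14),
            plain(10, "aload_0"), ref(11, "invokestatic", "DumpObjectToScreen"),
            plain(14, "return"));
    }

    public static void main(String[] args) {
        System.out.println("guarded:   " + scan(guardedMethod()));   // prints "guarded:   []"
        System.out.println("unguarded: " + scan(unguardedMethod())); // prints "unguarded: [0]"
    }
}
```

In both streams the ifeq instruction sits three bytes after the getstatic that loads isdebugging, which is why a window of 1 to 5 bytes after the flag is enough to link the two in this simple case. A condition that loads the flag and then evaluates a longer expression before branching would fall outside the window.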

- isdebuggingat maintains the position of isdebugging discovered in the bytecode.
- ifstartat stores the beginning index of the if clause whose condition is isdebugging in the bytecode.
- ifendat stores the index of the first instruction after the end of the if clause whose condition is isdebugging in the bytecode.

These variables are reset because they maintain accumulated state (bytecode indices) used by sawOpcode(int) within the method currently being analyzed. Thus, when visit(Code code) starts to scan a new method, the state should be flushed as well.

2.4.2 Analyzing Method

After a method is scanned by visit(Code code), sawOpcode(int) is called repeatedly to analyze the bytecode instructions contained in the method, one at a time. There is a global program counter variable, PC, that stores the index of the instruction currently being analyzed. The analysis is based on the following reasoning:

a) If the flag isdebugging is found in the method, store its position in isdebuggingat.
b) If isdebugging is found to be used as an if clause condition, store the beginning and ending indices of the if clause in ifstartat and ifendat, respectively.
c) If EncapObject() is found, determine whether it is located outside the if clause based on its position in PC and the values of ifstartat and ifendat.

Lines 26 to 30 in Figure 3 implement part a). classConstant and nameConstant are protected variables that the detector class inherits from its superclass. They contain the class namespace and the variable or method name of the current bytecode instruction. visualize/debug is the namespace of the static variable isdebugging. This piece of code locates the isdebugging variable and assigns its position to isdebuggingat if there is such a variable in the method. Lines 33 to 38 in the same figure implement part b).
IFEQ is the BCEL constant for the ifeq opcode, which branches when the value on top of the operand stack is zero; for a boolean condition, this means the branch skips the if body when isdebugging is false. This section can therefore be interpreted as: if there is an isdebugging variable in the method and there is an ifeq instruction, is that instruction between 1 and 5 bytecode positions after isdebugging? The values 1 and 5 were given in the sample detector. [3] says that these numbers are mainly based on bug-specific experiments and that it can sometimes take a long time to find the right range. It was therefore lucky that the sample detector was close enough to our needs that we did not have to spend more time learning how to conduct the trials. The branchFallThrough and branchTarget variables also come from the superclass, and they indicate the beginning of an if clause and the first instruction after its end.

Last but not least, part c) is implemented by lines 39 to 47 in Figure 3. Similar to the code for part a), it detects the EncapObject() method by its namespace and method name. If PC, the position of EncapObject(), is outside the if clause range bounded by ifstartat and ifendat, a bug is detected and is reported with its name, priority and location (class, method and line).

2.5 Building and Installation

After the code is ready, it needs to be packaged into a JAR file so that FindBugs can recognize it. Although the build process is well documented in [3] and one can use an existing detector's files as a template, it still involves quite a bit of editing work in several files:

- A build script is required to specify the source and destination, as well as the target JAR file name.
- FindBugs.xml is one file generated by the build script. It describes the class, speed, abbreviation, type, and category of the detector. For each new detector, one needs to copy all these properties into the file of the same name used by FindBugs.
- Messages.xml is another file produced by the build. It contains details of the bug pattern used by the GUI. One needs to open the XML file to add HTML descriptions and make sure the class and type information is aligned with that in FindBugs.xml.

3. PERFORMANCE EVALUATION

With the custom detector in hand, I applied it to my project source code using the FindBugs GUI. Figure 4 shows the result when all other detectors were turned off.

    Files     Classes   Methods   Bugs   Original  False
    Analyzed  Analyzed  Analyzed  Found  Bugs      Positives
    6         9         56        29     28        1

Figure 4: Evaluation of detected bugs and false positives

    317 DebugObj dobj = EncapObject();
    318 if (isdebugging)
    319 {
    320     DumpObjectToScreen(dobj);
    321 }
    322 }

    338 DebugObj specinfoobj = EncapObject();

Figure 5: False positive

The original number of bugs was known because I had searched for each occurrence of EncapObject() and moved it outside the if clause. The new detector not only found all the errors but also went beyond them: it returned a false positive. The code where it failed is illustrated in Figure 5. The call to EncapObject() in line 338 is unguarded, but it does not need to be guarded because it is actually being used for a non-debugging purpose. Our simple bug pattern did not take this into account. However, I think that with some improvement to the pattern, a more sophisticated version should be able to distinguish such cases.

On the other hand, this unexpected finding is still valuable because it points to an inappropriate practice: the debugging method is being used for a purpose different from the one it was written for. I recalled that line 338 was part of a patch I applied later to fix some errors, and I really should have created a new method in a different class for that. The misuse that our detector revealed reflects a phenomenon that often occurs in the software development cycle: the structure of a program tends to degrade as more maintenance work is carried out. FindBugs might be able to help delay this process if we can create a bug pattern capable of detecting misused method calls.

4. CONCLUSION AND FUTURE WORK

This project experimented with writing an application-specific bug detector in FindBugs and using it to discover occurrences of the expected bug. The purpose was to evaluate how easy it is to extend the bug patterns and how effective the result is at finding bugs of interest. By walking through all the steps needed to make our custom detector work, we found that although extending FindBugs with new rules is conceptually simple, the implementation is less straightforward in several respects. We encountered the following issues during our experiment, and some of them might indicate future research directions:

- Analyzing bytecode instructions with the BCEL library is effective, but representing patterns in terms of bytecode is not always easy for users. Sometimes one has to use a third-party tool to disassemble a piece of sample code, look at the resulting bytecode, and learn how to structure a pattern.
- The quality of a bug pattern is somewhat uncertain. In part b) of 2.4.2, the boundaries used to determine whether an if clause follows isdebugging are based on experiments and trials. In our case we could show that it worked well because we created the bugs and therefore knew where they were. In reality, when we use this tool to find out where the bugs are, we do not know what percentage of the total errors it reports.
- Building and adding a custom detector involves a little too much effort. It would be a lot more convenient to have a GUI that takes a set of parameters, compiles the source code, and integrates the JAR into the system automatically.

In spite of its shortcomings, FindBugs is still a very adaptable static bug analysis tool because of its simplicity and extensibility. Furthermore, as we found in our evaluation, bug patterns might potentially be used to help maintain software structure by detecting method misuse.

REFERENCES

[1] IEEE Standards Board, "IEEE Standard for Software Unit Testing: An American National Standard," ANSI/IEEE Std 1008-1987, in IEEE Standards: Software Engineering, Volume Two: Process Standards, 1999 Edition, The Institute of Electrical and Electronics Engineers, Inc., 1999.
[2] David Hovemeyer and William Pugh, "Finding Bugs is Easy," Companion to the 19th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, 2004.
[3] FindBugs Manual, http://findbugs.sourceforge.net/manual/index.html