A Measurement-Based Approach for Detecting Design Problems in Object-Oriented Systems

M. J. Munro
Technical Report EFoCS
Department of Computer and Information Sciences
University of Strathclyde
July 2004

Abstract

Refactoring is a reengineering process used to improve the design of a system by applying a number of well-defined code-level transformations. A major recognised problem of refactoring is identifying the locations at which these transformations should be applied, otherwise known as the detection of bad smells. Instead of relying on human intuition, Marinescu has proposed and evaluated a set of metrics for automatically detecting a number of these design flaws. This paper empirically evaluates the metrics for detecting data classes and god classes on three different systems: a simple hotel booking system, a design pattern refactoring tool and the public domain unit-testing tool JUnit. The results raise interesting questions regarding the accuracy of these metrics and their consistency when applied to a range of systems. The reasons for these differences are highlighted and suggestions are made to improve the robustness of these metrics.

Contents

1 Introduction
2 Refactoring
    2.1 Design Heuristics
    2.2 Design Flaws
    2.3 Bad Smells
    2.4 AntiPatterns
    2.5 Summary
3 Locating Design Problems
    3.1 Proposed Solution by Marinescu
    3.2 Focused Solution
    3.3 Hypothesis
        Design Problem's Characteristics
        O-O Software Metrics
        Metric Interpretation
4 Design of Experiment
    4.1 Two Design Problems
        Data-Class
        God Class
    4.2 Characteristics
        Data-Class
        God-Class
    4.3 Choice of Metric sets
        Data-Class
        God-Class
    4.4 Filtering Mechanism
        Data-class
        God class
5 Data Collection
    5.1 Software Systems
    Metric Tool
    Metric Framework
    Metric Definitions
6 Manually Apply Metrics
    Data-Class
        HotelSystem
        myrfctr
        JUnit
    God-Class
        HotelSystem
        myrfctr
        JUnit
    Refinements of Applying Metrics Manually
        Data Class
        God-Class
    Further Issues with Automatic Detection
Tool Application
    Data-Class
    God-Class
    TypEx
        Data-Class
        God-Class
Conclusions
    In Addition
    Current State of the Art
    Future work
Bibliography
A Definitions of Metrics Implemented into Eclipse Metric Framework
    A.1 Class Level
        A.1.1 Coupling Between Objects (CBO)
        A.1.2 Coupling Between Data Classes 1 (CBDC1)
        A.1.3 Coupling Between Data Classes 2 (CBDC2)
        A.1.4 Commented Lines Of Code (CLOC)
        A.1.5 Non-Commented Lines Of Code (NCLOC)
        A.1.6 Comment Density (CD)
        A.1.7 Depth of Inheritance Tree (DIT)
        A.1.8 Instance Variable Method Count (IVMC)
        A.1.9 Lines Of Code (LOC)
        A.1.10 Number Of Accessor Methods (NOAM)
        A.1.11 Number Of Classes (NOC)
        A.1.12 Number Of Class Constructors (NOCC)
        A.1.13 Number Of External Methods With Parameter List the Same as Instance Variable Types (NOEMWPLSIVT)
        A.1.14 Number Of Internal Methods With Parameter List the Same as Instance Variable Types (NOIMWPLSIVT)
        A.1.15 Number Of Instance Variables (NOIV)
        A.1.16 Number Of Methods (NOM)
        A.1.17 Number of Methods Added (NMA)
        A.1.18 Number of Methods Extending (NME)
        A.1.19 Number of Methods Overriding (NMO)
        A.1.20 Tight Class Cohesion (TCC)
        A.1.21 Weight Of Class (WOC)
        A.1.22 Weight Of Class - 2 (WOC2)
        A.1.23 Weighted Method Count (WMC)
    A.2 Method Level
        A.2.1 Accessor Method (ACCM)
        A.2.2 Commented Lines Of Code in a Method (CLOCM)
        A.2.3 Non-Commented Lines Of Code in a Method (NCLOCM)
        A.2.4 Comment Density - Method level (CDm)
        A.2.5 Lines Of Code in a Method (LOCM)
        A.2.6 McCabe Cyclomatic Complexity (V(G))
        A.2.7 Number Of Cases within Switch Statements (NOCSS)
        A.2.8 Number Of Parameters (NOP)
        A.2.9 Number Of Parameters Not Referred (NOPNR)
        A.2.10 Number Of Parameters Used (NOPU)
        A.2.11 Number Of Switch Statements (NOSS)
B Analysis with Developer of TypEx for a Data-Class
    B.1 False Positives
C Analysis with Developer of TypEx for a God-Class
    C.1 False-positives
D Glossary
    D.1 Design Flaw
    D.2 Design Heuristic
    D.3 Design Principle
    D.4 Design Rule
    D.5 Problem Detection

List of Figures

1 How design heuristics are used
2 The Process to Identify Metrics for Automatic Detection of a Design Problem
3 The Process to Extract Metric Measurements on Java Source Code
4 BillItem.java
5 Accessor Method Pseudo-code

List of Tables

1 God class (behavioural form), related heuristics [Rie96]
2 Data Class Metrics
3 God Class Metrics
4 Data Class Filters
5 God Class Filters
6 Data Class, hotelsystem results
7 Data Class, myrfctr results
8 Data Class, JUnit results
9 God Class, hotelsystem results
10 God Class, myrfctr results
11 God Class, JUnit results
12 Data Class Refined metrics and rules
13 Data Class, hotelsystem Refinement
14 God Class Refinements of metrics and rules
15 God Class, myrfctr refinements applied
16 Data-Class, TypEx
17 God-Class, TypEx

1 Introduction

The dominant software engineering process for developing software is maintenance. The structure of source code is an attribute that can significantly affect the level of maintenance required, as developers spend more time trying to understand the design structure of a system before identifying a solution to integrate new code [BL71]. The recently proposed technique of refactoring presents a potential solution to this problem.

2 Refactoring

Refactoring changes the internal code structure of an Object-Oriented (O-O) system, without affecting the overall behaviour of the system, in order to improve the quality of the design [Fow99]. Refactoring has a role to play in reverse engineering and re-engineering systems by making semantics-preserving transformations of code into a form that the software engineer finds easier to understand. Fowler describes 72 well-defined refactorings (code-level transformations), which have three distinct stages to their application: identify a problem where a refactoring should be applied, choose an appropriate refactoring as a solution, and apply the refactoring.

2.1 Design Heuristics

The first stage of refactoring relates to identifying problems in the design, using O-O design heuristics to help identify where to apply a refactoring. A design heuristic is a rule-of-thumb that tries to capture the experience of how developers identify design problems. Heuristics are not hard-and-fast rules and should be used as guidance for developers to resolve design problems and produce a better designed system overall. Dijkstra [Dij68] identified one of the earliest heuristics in a short letter to the editor of Communications of the ACM entitled "Go To Statement Considered Harmful".

The problem lies in the domain of procedural languages, where the whole program is within a single file and GOTO statements allow jumps to anywhere in the program. GOTO statements are useful in particular situations, for example to jump out of complex loop structures; however, they are more problematic than useful. A system becomes difficult to follow and understand when GOTO statements are present, because jumping to different locations in the program breaks up its logic. Dijkstra describes the GOTO statement as follows: "as it stands is just too primitive; it is too much an invitation to make a mess of one's program".

A number of O-O design heuristics exist in the literature that try to capture the essence of good design, such as O-O Design Heuristics [Rie96] and the Law of Demeter [LH89]. A design heuristic is a suggestion of how to improve the design of a software system. Applying all possible design heuristics may inadvertently reduce the quality of a software design.

Specific kinds of design problem exist in the literature: Design Flaws [Mar02], Bad Smells [Fow99] and AntiPatterns [BMMM98]. Each of these design problems can be expressed as a number of design heuristics that correspond to the characteristics of the initial problem. Figure 1 shows where design heuristics fit within design problems from the literature. Design flaws, bad smells and AntiPatterns all describe a design problem for which a number of design heuristics could be identified to capture the characteristics of the problem. Figure 1 identifies only design flaws, bad smells and AntiPatterns, as they are the main focus of this report; other kinds of design problem exist in the literature. Each of the design problems identified in Figure 1 is described in more detail below to emphasise the main differences between them.

Figure 1: How design heuristics are used. (Diagram: Design Flaws, Bad Smells and AntiPatterns are each associated with Design Heuristic, with multiplicity 1 to 1..*.)

2.2 Design Flaws

Marinescu defines design flaws as containing the structural characteristic of a design entity or design fragment that expresses a deviation from a given set of criteria typifying the high quality of a design [Mar02].

2.3 Bad Smells

Fowler defines 22 colloquially named bad smells [Fow99], where each design problem has a number of related refactorings that can be applied to a system to solve the design problem.

2.4 AntiPatterns

An AntiPattern is a literary form that describes a commonly occurring solution to a problem that generates decidedly negative consequences [BMMM98]. Each AntiPattern has a template outline comprising eight main parts, which helps in understanding and solving the particular problem. These parts explain why the original problem exists, identify the main characteristics of the problem, describe how the problem can be solved using refactoring steps, and give a demonstrated example of how the solution can be applied. AntiPatterns identify common problems that software engineers introduce when producing software, including those that arise from inexperience in applying a Gang Of Four (GOF) design pattern [GHJV94] in an appropriate context.

2.5 Summary

The god class problem described by Riel [Rie96] is used as a running example throughout the report to show how the proposed approach automatically locates a design problem within a Java system. Here the god class problem is used to highlight the differences between the terminologies identified in Figure 1.

Riel [Rie96] describes a god class in the behavioural form informally, as arising where developers attempt to capture the central control mechanism so prevalent in the action-oriented paradigm with their O-O design. In addition, Riel identifies four O-O design heuristics that capture the characteristics of the problem, shown in Table 1.

Table 1: God class (behavioural form), related heuristics [Rie96].

    Heuristic  Description
    1.1        Distribute system intelligence horizontally as uniformly as possible, that is, the top-level classes in a design should share the work uniformly.
    1.2        Do not create god classes/objects in your system. Be very suspicious of a class whose name contains Driver, Manager, System, or Subsystem.
    1.3        Beware of classes that have many accessor methods defined in their public interface. Having many implies that related data and behaviour are not being kept in one place.
    1.4        Beware of classes that have too much non-communicating behaviour, that is, methods that operate on a proper subset of the data members of a class. God classes often exhibit much non-communicating behaviour.

Detecting any design problem in a system is not trivial: experience is a key factor in knowing what a problem area looks like. Even with the benefit of experience, identifying design problems in a large system by hand is an overwhelming task.

3 Locating Design Problems

Currently, locating a design problem requires manual inspection of the source code, which quickly becomes infeasible as the size of the system increases. Other considerations identified by Bär [BC98] are that a system may be developed by different developers or teams, so that design problems can range over several subsystems and thus cannot be detected locally. Also, developers may not know exactly what to search for, even when using a set of heuristics. Manual inspection can therefore be considered inappropriate for locating design problems in large systems. Clearly, automatic identification of such problems is an appealing prospect. The problem this research aims to address is how to effectively support and guide the software engineer in identifying design problems in O-O Java systems of significant size.

3.1 Proposed Solution by Marinescu

Marinescu [Mar02] has described a solution that automates the detection of design problems in source code, to aid developers in identifying them in an O-O system. Marinescu refers to design problems as design flaws, and defines a process, known as a detection strategy, that searches O-O source code for possible design problems. A detection strategy comprises a number of metrics, filters and a composition mechanism that together best match the characteristics of a design problem. Each detection strategy that Marinescu defines relates to a specific design flaw.

A detection strategy is developed in a sequence of four steps: analysis of the problem, selection of metrics, detection of candidates and examination of candidates. First, a problem taken from the literature is analysed to quantify its informal description. Second, using the quantitative description, the metrics that best match the problem's characteristics are selected; this is where the detection strategy is expressed in terms of the identified metrics. Third, detection of candidates measures systems using the defined detection strategy and the chosen metrics. The last stage examines the candidates that the detection strategy identified and decides whether refinements are required [Mar02].
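
The four steps above can be pictured with a small sketch. It is an illustration under stated assumptions, not Marinescu's tooling: the `DetectionStrategy` class, the class names and the `LOC` bound are hypothetical, and the only composition shown is a logical AND of all filters.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Metric values for one class, keyed by metric name (names are illustrative).
Metrics = Dict[str, float]
# A filter decides whether one class's metric values look suspicious.
Filter = Callable[[Metrics], bool]


@dataclass
class DetectionStrategy:
    """Hypothetical container: a named design problem plus its filters."""
    problem: str
    filters: List[Filter]

    def candidates(self, system: Dict[str, Metrics]) -> List[str]:
        # Step 3 (detection of candidates): keep classes passing every filter.
        return sorted(
            name for name, m in system.items()
            if all(f(m) for f in self.filters)
        )


# Steps 1-2 (analysis, metric selection): thresholds invented for illustration.
strategy = DetectionStrategy(
    problem="large, complex class",
    filters=[
        lambda m: m["LOC"] > 300,  # assumed size bound
        lambda m: m["WMC"] > 20,   # assumed complexity bound
    ],
)

# Hypothetical measurements for three classes.
system = {
    "Booking": {"LOC": 120, "WMC": 9},
    "HotelManager": {"LOC": 640, "WMC": 47},
    "Room": {"LOC": 450, "WMC": 8},
}

# Step 4 (examination of candidates) stays manual: an engineer inspects each
# reported class and decides whether the filters need refinement.
print(strategy.candidates(system))
```

Only `HotelManager` exceeds both bounds here, so it is the single candidate handed to the manual examination step.
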
Marinescu's research classified a number of design flaws according to the granularity level of the design entity affected by each design flaw [Mar02]. Each classification level describes the problem level and identifies a number of design flaws related to that level. Marinescu also points out that design problems are hard to model using a detection strategy, as there are situations where a code fragment might be considered flawed in one case while in another case a similar, mostly identical design fragment is justifiable and may not be considered a design problem [Mar02].

3.2 Focused Solution

This work extends Marinescu's study, with the main focus on the design problems identified by Fowler as bad smells. The reason for focusing on bad smells is that they are connected to a set of refactorings that, when applied to source code, can help to solve the design problem. In addition, Marinescu does not justify the metrics selected for a detection strategy against the characteristics of the design problem. Understanding the reasoning behind the set of metrics chosen for a detection strategy will help in appreciating the design problem and in identifying whether the metrics are best suited to it.

To achieve these goals, the first stage is to understand the detection strategies that Marinescu has described for five of the bad smells. Marinescu's study is repeated with the same set of metrics and filtering mechanisms to see whether the same results are achieved. In addition, each class in the system under investigation is manually inspected to decide which design problems actually exist, to help locate any false positives.

3.3 Hypothesis

If the characteristics of a bad smell can be related to a set of software metrics, then, by using a predefined set of thresholds to interpret the results of applying those metrics to Java source code, the software engineer can be provided with significant guidance as to the location of the bad smell.

Design Problem's Characteristics

All of the design problems identified in the literature [Fow99, Rie96, BMMM98] are presented in an essay style that describes the problem informally. This makes it difficult to identify them automatically. Originally these guidelines were meant to be followed by a human developer when creating a new or analysing an existing design, rather than for an automatic tool to detect violations of design rules in a given system design [BC98]. A more formal definition of each design problem is required so that the important characteristics can be related to attributes which can be measured. The unique characteristics corresponding to the informal description and, where possible, the design heuristics relating to a design problem are identified and then matched against a measurement technique that can be used to help identify the problem automatically in source code.

There will not necessarily be a single technique that captures all the characteristics of a design problem. For example, duplicated code is one of Fowler's bad smells, which is described as seeing the same code structure in more than one place [Fow99]. This description lacks the detail needed to fully identify the smell in source code; for example, it does not indicate what is meant by "seeing the same". Does it mean identical code, or the same structure with names changed? However, the fields of clone detection and reduction [BMD+00] focus on such issues and can help towards identifying specific characteristics. Marinescu's detection strategies consist purely of software metrics and this study starts from his work, so his set of measurement techniques will be considered first.

O-O Software Metrics

Software metrics is a collective term used to describe the very wide range of activities concerned with measurement in software engineering [FN00]. For example, a simple metric counts the number of Lines Of Code (LOC) in a system. Matching the specific attribute of a metric to a characteristic of a design heuristic can thus help towards building a model that guides the search for design problems within source code. There may not be a perfect match between a characteristic and a metric, so a number of metrics may be required to fulfil the criteria. In addition, metrics are only measurements, which must be interpreted in order to indicate problem areas. Threshold bounds can be placed upon metric results, but the reasoning behind choosing such levels requires some justification.

A number of O-O software metrics are defined in the literature, which relate to specific attributes of a software design. Chidamber and Kemerer made one of the first definitions of O-O metrics, defining six metrics that cover the most fundamental design parts of a system: cohesion, complexity, coupling, depth of inheritance, number of children and the response for a class [CK94]. Bieman and Kang proposed a refined definition of a class's cohesion that considers the relative number of methods connected by instance variables [BK95].

Metric Interpretation

An important task when choosing a metric is to interpret the results by identifying what they are attempting to measure. Metric thresholds are bounds aimed at capturing acceptable and potentially problematic values for a metric. Identifying metric thresholds is difficult and may require a number of refinements. Marinescu uses a filtering mechanism to reduce the initial data set [Mar02]. This approach yields three concrete types of data filter: absolute, relative and statistical. An absolute filter has an upper and lower value taken from the literature that corresponds to a design problem.
Relative filters are used when the characteristics of a design problem are not precise enough to place an absolute filter; for example, the guideline that methods of high complexity should be split [Mar02] could be interpreted by considering the upper 10% of a metric's results. The last type of filter, statistical, uses box-plots on the metric results to locate outliers as potential concerns.

Benlarbi et al. [BEEGR00] carried out a study to identify whether a practical application of O-O measures can predict which classes in a system contain a fault. The study draws on a cognitive theory which suggests that there are threshold effects for many O-O measures: O-O classes are easy to understand as long as their complexity is below a threshold, whereas above this threshold understandability decreases rapidly [BEEGR00]. The study is empirically tested on two C++ systems and focuses on a subset of the Chidamber and Kemerer metrics suite [CK94]. A number of thresholds corresponding to these measurements are taken from the literature. The results indicate that there are no threshold effects for any of the measures studied.

A starting point for this study is to use the same filtering mechanisms identified by Marinescu for each design problem. However, thresholds are not absolute and it is necessary to investigate how they may vary from system to system.

4 Design of Experiment

Marinescu's study is repeated using two design problems, data-class and god class. The metrics and filtering mechanisms identified by Marinescu that correspond to these two problems are applied to a number of Java systems.

Figure 2 summarises the process of automatically identifying a design problem in a system using metrics. A design problem has a number of vital characteristics taken from the informal description in the literature and from any corresponding design heuristics. Each characteristic is matched against the metric or metrics that encapsulate it best. At present the selection of metrics for a chosen characteristic is taken straight from Marinescu's work.
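
The three kinds of filter described under Metric Interpretation (absolute, relative and statistical) can be sketched as follows. This is a minimal sketch under assumptions, not Marinescu's implementation: the function names and the sample `wmc` values are invented, ties are ignored in the relative filter, and the box-plot fence uses the common Q3 + 1.5 × IQR convention, which the source does not specify.

```python
import statistics
from typing import Dict, List


def absolute_filter(values: Dict[str, float], lower: float, upper: float) -> List[str]:
    """Keep classes whose metric value lies within fixed bounds."""
    return sorted(c for c, v in values.items() if lower <= v <= upper)


def relative_filter(values: Dict[str, float], top_fraction: float) -> List[str]:
    """Keep the classes in the top fraction (e.g. 0.10) of metric values."""
    keep = max(1, round(len(values) * top_fraction))
    ranked = sorted(values, key=lambda c: values[c], reverse=True)
    return sorted(ranked[:keep])


def statistical_filter(values: Dict[str, float]) -> List[str]:
    """Keep upper box-plot outliers: values above Q3 + 1.5 * IQR (assumed rule)."""
    q1, _, q3 = statistics.quantiles(list(values.values()), n=4)
    fence = q3 + 1.5 * (q3 - q1)
    return sorted(c for c, v in values.items() if v > fence)


# Hypothetical complexity values for ten classes of a small system.
wmc = {"A": 3, "B": 4, "C": 5, "D": 5, "E": 6,
       "F": 7, "G": 8, "H": 9, "I": 10, "J": 60}

print(absolute_filter(wmc, 20, 1000))
print(relative_filter(wmc, 0.10))
print(statistical_filter(wmc))
```

With this sample all three filters single out class `J`, but they need not agree in general: the absolute filter depends on a bound chosen in advance, while the other two adapt to the distribution of values in the system being measured.
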
Respective thresholds, to help interpret the measurement results link each metric. The more bounds that are placed on a particu- 9

18 4.1 Two Design Problems 4 DESIGN OF EXPERIMENT 1 1..* 1..* 1..* * Design Flaw Characteristics Metrics Bounds Figure 2: The Process to Identify Metrics for Automatic Detection of a Design Problem. lar metric may identify the metric that is an incorrect choice for the corresponding characteristic. 4.1 Two Design Problems This report focuses on the data-class and god class problems as these are often indicators of fundamental weaknesses in an O-O design Data-Class Fowler describes data-classes as classes that have fields, getting and setting methods for fields, and nothing else [Fow99]. The purpose of these types of classes is purely to hold data, where other classes manage this data. This is a class that has no functionality, containing only instance variables, constructor methods and accessor methods that change or return the whole state of an instance variable. Other types of accessor methods exist, that returns or changes a partial state of an instance variable, but will not be initially considered as they start to deviate away from the initial problem. In addition instance variables can be changed directly if their access modifier is of type public, without using a designated method to do so God Class A god class is a class that captures the centralised control of an O-O system... leaving minor details to a collection of other classes [Rie96], that typically has a traditional procedural (or action oriented ) structure in an O-O system. Brown et al. identified that the fundamental problem is that functionality is not distributed evenly amongst 10

19 4.2 Characteristics 4 DESIGN OF EXPERIMENT classes [BMMM98]. The design of this type of class can be considered to have a data driven design (DDD) [SP00], where classes are created first to store information, then the responsibilities are considered and shared between classes. The overall design produces classes that are domineering, as the system s responsibilities are not equally shared over all classes. This artificial separation of data from its associated behaviour is a violation of one of the tenets of the O-O design philosophy. Riel s heuristic for identifying Data Classes is based on the number of accessor methods in the class interface - Beware of classes that have many accessor methods defined in their public interface [Rie96]. 4.2 Characteristics The main characteristics for a data-class and a god class are taken from the literature descriptions above, are mentioned below Data-Class These are classes with limited functionality, which contain methods to change the state of a class through accessormethods. A class that contains a combination of instance variables, constructor and accessor-methods and nothing else can be considered to be a data-class God-Class A god class is a large class that implements a number of the systems functionality and manipulates data from primitive instance variables, or from objects that are instance variables that can be considered to be a data-class or are used like one to store information. 4.3 Choice of Metric sets To help identify metrics, which best match the characteristics of a data-class and god class, the set of metrics Marinescu s 11

20 4.3 Choice of Metric sets 4 DESIGN OF EXPERIMENT Name WOC (Weight of Class) NOPA (Number of Public Attributes) NOAM (Number of Accessor Methods) Description Number of non-accessor methods in a class divided by the total number of members of the interface. [Mar01] The number of non-inherited attributes that belong to the interface of a class. [Mar01] The number of non-inherited accessor methods declared in the interface of a class. [Mar01] Table 2: Data Class Metrics identified are used [Mar02] and described in the next subsections Data-Class Marinescu s detection strategy for data-class uses three metrics, described in Table 2, Weight Of Class (WOC), Number Of Public Attributes (NOPA) and Number Of Accessor Methods (NOAM) that try to encapsulate the characteristics of a data-class. Both NOPA and NOAM metrics relate how the state of a class can be changed, either directly through instance variables declared as having an access type public or indirectly using accessor methods. The other metric WOC, identifies the number of accessor methods divided by the total number of methods defined in a class, where smaller values implies more accessor methods are present in the class God-Class Marinescu identified three metrics in the detection strategy for a god class, that aim to capture the characteristics, these are described in Table 3. In general, to interpret these characteristics using measurements the size, complexity, cohesion and the number of data-classes a class is coupled with can be considered. The complexity of a class can be difficult to measure. For example an algorithm may have complex behaviour but may 12

21 4.3 Choice of Metric sets 4 DESIGN OF EXPERIMENT Name ATFD (Access To Foreign Data) WMC (Weighted Method Count) TCC (Tight Class Cohesion) Description The number of external classes from which a given class accesses attributes, directly or via accessor methods. Inner classes and superclasses are not counted. [Mar01] The sum of the static complexity of all methods in a class. [CK94] The relative number of directly connected methods. Where two methods are connected if they access a common instance variable. [BK95] Table 3: God Class Metrics be well documented and hence easy to understand. In comparison a recursive method can be difficult to understand without a full example. Knowing how to distinguish between these kinds of complexity is difficult. A widely used metric that measures the complexity of any method implemented in a system is McCabe s cyclomatic complexity metric, which counts the number of condition statements [McC76]. Chidamber and Kemerer [CK94] extended McCabe s complexity measure to incorporate the O-O programming language paradigm. Weighted Method Count (WMC), sums McCabe s complexity for each method implemented in a class. A class that is designed well is one where its members integrate successfully. This kind of software quality can be measured by considering a cohesion attribute. A class strives to be highly cohesive, meaning it is difficult to split its functionality. Most cohesion metrics are based on either instance variables usage or sharing of instance variables through method integration [BB03]. Bieman and Kang define a cohesion metric for a class, Tight Class Cohesion (TCC) that measures the relative number of directly connected methods, where methods are considered to be connected when they use at least one common instance variable [BK95]. There are three levels which TCC can be calculated, here we have only considered the cohesion of the current classes methods and instance variables. Other calculations consider 13

22 4.4 Filtering Mechanism 4 DESIGN OF EXPERIMENT inherited methods and instance variables. The lower a TCC metric value for a class, can be interpreted as not being well formed and encapsulate a single responsibility. A characteristic of a god class, is where a class can take more than the average proportion of a systems responsibility, which should be spread over a number of classes. Hence, a low TCC metric could be interpreted to be a god class. There is one other metric Marinescu identifies to help locating a god class, Access To Foreign Data (ATFD), which is defined as The number of external classes from which a given class accesses attributes, directly or via access-methods. Inner and super-classes are not included [Mar01]. It is difficult to obtain an accurate result for the definition of ATFD as in general it is impossible to statically know which class a method is implemented in and it may be overridden. However the systems to be analysed first are small making it possible to obtain a calculate this measurement. 4.4 Filtering Mechanism Marinescu defines filtering mechanisms to interpret metric results relating to a detection strategies design problem. There are 3 types of data filters, either measure an absolute, relative or statistical bound on the results. An absolute measurement, such as identifying the top 10 classes of a metric result (e.g. D1), is very limiting as its value is easily distorted when applied to small systems. A relative measurement, such as the top 10%, takes into consideration the varying sizes of systems. The last type use statistical box-plots to identify outliers within the results. The main concern with these rules is knowing the appropriate bound - only experience or empirical evaluation can help with this definition Data-class Marinescu identifies a detection strategy for a data-class with metrics and filters that interpret the results best to the design problem characteristics. Throughout Marinescu s work 14

23 4.4 Filtering Mechanism 4 DESIGN OF EXPERIMENT Name D1 [Mar01] D2 [Mar01] D3 [Mar01] Manual Rule (WOC > 0 and WOC <= 0.33) and (((NOPA, top 10 classes) and NOPA >= 5) or ((NOAM, top 10 classes) and NOAM >= 3)). ((WOC <= Bottom 33%) and WOC < 0.33) and (NOPA > 5) or NOAM > 5). (NOPA > 3 and NOPA >= top10%) or (NOAM > 5 and NOAM >= top10%). Manual inspection of the source code Table 4: Data Class Filters 3 varying filters have been identified and are shown in Table 4. All 3 filters are shown and considered to start this study to identify which thresholds best match the underlying problem. Collectively if a class s metric results are within the corresponding thresholds, it identifies to be a true case of being a data-class. Marinescu does not describe what any of the filtering mechanisms are trying to capture, here an educated guess to the filters shown in Table 4 are tried to be justified here. The filters connected to WOC are looking for classes that have a higher number of accessor methods than normal methods, for example filter D1 considers WOC values to be less than The NOPA filters are absolute to identify classes that can be vulnerable to their state being changed directly through instance variables. In addition NOPA has a relative filter to identify the top 10% classes with the highest values, as these could be possible classes to that hold data. The other filters are for the NOAM metric, that identifies classes that have 5 accessor methods or have the top the 10% values throughout the system. The filters individually do not really capture the characteristics of the design problem, however composition of the metrics and filters can. For example a class may have a high NOAM and NOPA results meaning the state can easily be changed, but the class could be large with a number of non accessor methods resulting in a high WOC value. In addition a manual inspection of the source-code for the 15

design problem is carried out; the result will be used as a benchmark for a true positive.

4.4.2 God class

Table 5 identifies the filters applied to the metrics in order to identify potential god classes.

Name         Rule
G1 [Mar01]   ((ATFD, top 10 classes) and ATFD >= 3 ) and ((WMC, top 10 classes) and TCC <= 0.33)).
G2 [Mar02]   (ATFD >= top 20% and ATFD > 4 ) and (WMC > 20 or TCC < 0.33 ).
Manual       Manual inspection of the source code

Table 5: God Class Filters

These filters suggest there can be a maximum number of god classes in a system at any one time, based on the premise that a god class has control of a particular part of a system, and that there can only be a limited number of controllers throughout a system. In this case manual inspection sought to identify classes that access a number of lightweight classes (that themselves could be Data Classes) and also contain lengthy methods that exhibit a high degree of computation and control.

Marinescu again fails to describe what these filters and metrics are trying to capture. The filters for ATFD identify classes with the highest values in a system, meaning they are the most communicative with other classes. The filters for WMC identify the most complex classes in a system. The cohesion (TCC) filters consider low cohesion values, as these mean that a class is doing a number of things that could easily be split up. In addition a manual inspection of the source-code for the design problem is carried out; the result will be used as a benchmark for a true positive.
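The three kinds of bound described in Section 4.4 can be illustrated with a minimal sketch. The class and method names are hypothetical and do not come from [Mar01]; the quartile rule used for the box-plot fence is an assumption, and the metric values are made up. Rule G2 is encoded as it is printed in Table 5.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper class; names are illustrative and not taken from [Mar01].
class MetricFilters {

    // Absolute filter: the n-th highest value, so classes at or above it are in the "top n".
    // The list must be non-empty.
    static double topNThreshold(List<Double> values, int n) {
        List<Double> sorted = new ArrayList<>(values);
        sorted.sort(Comparator.reverseOrder());
        return sorted.get(Math.min(n, sorted.size()) - 1);
    }

    // Relative filter: the bound for the top fraction of classes, e.g. 0.10 for the top 10%.
    static double topFractionThreshold(List<Double> values, double fraction) {
        int n = Math.max(1, (int) Math.ceil(values.size() * fraction));
        return topNThreshold(values, n);
    }

    // Statistical filter: upper box-plot fence Q3 + 1.5 * IQR.
    // The quartile positions use a simple rank rule, which is an assumption of this sketch.
    static double upperFence(List<Double> values) {
        List<Double> sorted = new ArrayList<>(values);
        Collections.sort(sorted);
        double q1 = sorted.get((int) Math.floor(0.25 * (sorted.size() - 1)));
        double q3 = sorted.get((int) Math.ceil(0.75 * (sorted.size() - 1)));
        return q3 + 1.5 * (q3 - q1);
    }

    // Rule G2 as printed in Table 5; atfdTop20 is the relative bound computed over the system.
    static boolean ruleG2(double atfd, double wmc, double tcc, double atfdTop20) {
        return (atfd >= atfdTop20 && atfd > 4) && (wmc > 20 || tcc < 0.33);
    }

    public static void main(String[] args) {
        // Made-up metric values for a system of 13 classes.
        List<Double> values = new ArrayList<>();
        for (int i = 1; i <= 13; i++) values.add((double) i);
        System.out.println(topNThreshold(values, 10));          // bound admitting 10 of the 13 classes
        System.out.println(topFractionThreshold(values, 0.10)); // bound admitting 2 of the 13 classes
        System.out.println(upperFence(values));
    }
}
```

With only 13 classes, the absolute "top 10" bound admits most of the system, which is the distortion on small systems noted above, whereas the relative bound scales with system size.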

5 Data Collection

The set of metrics discussed for a data-class and a god class is applied manually to systems developed in Java. Applying a metric manually can take many man-hours and be a complex and error-prone activity. However, three small systems are chosen for the manual application, to help minimise the complexity and to appreciate the implementation details. The calculated metric results are placed into a spreadsheet, where the corresponding filters can be applied and the results presented in a clear, informative manner.

An alternative approach to applying metrics to a system is to automate the process using a tool. Such a tool should have the functionality to apply and integrate new software measurement techniques on Java systems. Here software measurement is also referred to as software metrics, where "metrics" relates to measurement and not to a mathematical metric space.

5.1 Software Systems

The metric sets are first applied manually to three small systems, all written in Java. Two of the systems were developed in-house at the University of Strathclyde and the other is an open-source project. The first is a hotel booking system that contains 13 classes with 1.5 KLOC. The second system is a design pattern refactoring tool with 26 classes and about 3 KLOC. The last system is JUnit [JUn03], an open-source testing framework for Java, that has 111 classes and just over 5 KLOC. JUnit was chosen as a control system on the assumption that, as the authors are respected in the field of software development and it has a large user base, it is likely to be well-designed. The role of the control system is primarily to check for the presence of false positives.

Applying these sets of metrics to the three small systems mentioned above limits the analysis of the results, as they are not truly representative of real systems. Size is an important factor and the question arises whether these techniques hold

true when applied to larger systems. Manual application of the metrics to larger systems becomes unfeasible. A further, larger system is evaluated using the implemented metric tool to test the scalability of the metrics and filters identified for the two design flaws. This system is written in Java and is another in-house system, known as TypEx, developed and evolved within a research group at the Department of Computer and Information Sciences at the University of Strathclyde.

5.2 Metric Tool

There are a number of software metric tools that can be used to apply metrics to source-code. Some of the more popular and easier to obtain are: SDMetrics [SDM03], JMetric [JMe00], JDepend [JDe01], Together ControlCenter [Tog02] and the Eclipse metric framework plugin [Met02]. A problem with any of these metric tools is that they are stand-alone systems that implement a specific set of metrics for a software language, so trying to apply all possible software metrics becomes a challenge. An ideal solution is to have a centralised location where metrics are expressed and defined, normally known as a metric repository. In addition, it should be possible to apply any of the metrics from the repository to a system independently of its language.

IBM has developed an open-source Integrated Development Environment (IDE) called Eclipse [Ecl01]. Eclipse is written in Java, and most of its functionality comes from plugins. Plugins are easy to install, and can be integrated into a working project. Eclipse has features that support the development of plugins, such as presenting the plugin.xml file, which contains the project's settings, in a user-friendly GUI, and providing a run-time workbench in which the plugin can be tested. Currently there is an active community developing plugins for varying software-engineering problems. There is an open-source metric framework plugin [Met02] for Eclipse that has a number of metrics implemented in a repository. The metric framework can only apply the metrics

from the repository to Java systems, but it is ideal for implementing new metrics in the repository. This metric framework will be used to implement the required metrics identified for this study.

5.2.1 Metric Framework

The Eclipse metric framework plugin [Met02] is an open-source project, where a full current working version can be downloaded using Eclipse's Concurrent Versions System (CVS) repository. The metric framework includes the source-code for all the metrics implemented in the repository. The default metrics within the repository do not totally match Marinescu's set for a data-class and god class. However, the default metric implementations can help in developing and integrating new metrics into the repository. Developing the other metrics is helped by their manual application to systems, which gives an appreciation and understanding of the specific requirements.

Adding a new metric to the framework repository requires the implementation of a class that encapsulates the metric's functionality, and the plugin's settings need to be updated in the plugin.xml file. The framework interacts with other Eclipse plugins, for instance by integrating its functionality into the main Graphical User Interface (GUI). The framework exports the metric results into an interactive table where the user can choose to view the calculation at the project, package, class or method level. The table has a feature that enables the export of the metric calculations to an XML file. The framework interacts with Eclipse's task-view to show results that have reached a designated threshold defined in the plugin's settings in the plugin.xml file. Exporting all the metric results from an analysis of a system into an XML file allows the use of an in-house tool, developed within a research group at the University of Strathclyde, to parse the file into any required file format. The chosen format for the experiment is a spreadsheet.

[Figure 3: The Process to Extract Metric Measurements on Java Source Code. Elements shown: Source Code; Eclipse IDE; metrics.sourceforge.org plugin; Metrics table view; Task view warnings; Export XML; ExtractorIterator; Import into Excel.]

The process this experiment uses to automate the analysis of a system is shown in Figure 3. A system is analysed using the Eclipse metric framework, and the results can be viewed either by extracting them into a spreadsheet or by using the task-view to identify positive cases of the two design problems. However, a manual search back to a class's source is required to make the final decision on whether the flaw exists or not.

5.2.2 Metric Definitions

The metrics that extend the Eclipse metric framework repository are outlined and expressed fully in Appendix A.

6 Manually Apply Metrics

The results of applying the experiment manually are reported in a number of tables, one for each system and design problem. Each of these tables shows the names of the system's classes, the values of the set of metrics, and the corresponding rule results that interpret the metric values. The metric results are shown either as whole positive numbers or, where necessary, to two decimal places. The original study by Marinescu identifies unique filters for each metric result for a specific design problem. Using the filters, a class's metric results can be interpreted to automatically state whether

it matches the design problem characteristics. The result of applying the filters to the metric values is shown as either YES or NO to represent positive or negative identification. Marinescu states that inner classes are not considered as part of the calculation as they have no real impact on the overall design of the system.

6.1 Data-Class

6.1.1 HotelSystem

Table 6 shows the results of the metrics and rules as applied to the Hotel System. It is noticeable that there are a number of cases where the filtered rules and the manual inspection are in disagreement. The rule results are mainly NO, but if we concentrate on the positive results there are some interesting observations to be made.

The class BillItem is identified by manual inspection as a Data Class but is not picked up by any of the rules. BillItem is a small class that contains two instance variables and four methods (two constructors and two accessor methods). The code for the BillItem class can be seen in Figure 4. The calculation of WOC in Table 6 for BillItem divides the number of non-accessors by the number of methods in the class, which yields a value of 0.5. This value fails to trigger either of rules D1 and D2. Rule D3 also shows a negative result because the NOPA and NOAM results are not large enough for the threshold of the rule and are also not in the top 10% of the overall metric results. The description of the WOC metric by Marinescu is quite ambiguous, and obtaining an accurate calculation is difficult. In this interpretation of the WOC metric, constructors are included as part of the calculation. Hence, the result for BillItem is 0.5, as the class contains two constructors and two accessor methods.

The class Function in Table 6 is positively identified (incorrectly) by rule D1. The Function class contains one constructor method, four accessor methods and one method with


Table 6: Data Class, hotelsystem results

    final public class BillItem {
        private String bdescription;
        private double bcost;

        public BillItem() {
            bdescription = "None";
            bcost = 0.0;
        }

        public BillItem (String description, double cost) {
            bdescription = description;
            bcost = cost;
        }

        public String getdescription() {
            return bdescription;
        }

        public double getcost() {
            return bcost;
        }
    }

Figure 4: BillItem.java.
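The WOC value reported for BillItem can be reproduced with a small worked sketch. It follows this study's interpretation of WOC, in which constructors count as non-accessor members; the class and method names are hypothetical, and returning NaN for a class with no members is an assumption of the sketch.

```java
// Hypothetical illustration of the WOC interpretation used in this study.
class WocExample {

    // WOC = non-accessor members (constructors included) / all members.
    static double woc(int constructors, int accessors, int otherMethods) {
        int total = constructors + accessors + otherMethods;
        if (total == 0) {
            return Double.NaN; // the ratio is undefined for a class with no members (assumption)
        }
        return (double) (constructors + otherMethods) / total;
    }

    // The WOC part of rule D1 from Table 4: WOC > 0 and WOC <= 0.33.
    static boolean d1WocPart(double woc) {
        return woc > 0 && woc <= 0.33;
    }

    public static void main(String[] args) {
        // BillItem: two constructors, two accessor methods, no other methods.
        double billItem = woc(2, 2, 0);
        System.out.println(billItem);            // 0.5
        System.out.println(d1WocPart(billItem)); // false, so D1 is not triggered
    }
}
```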

a number of condition statements, which gives a low WOC value and a high NOAM value. However, manual inspection of this class identifies that it is not a pure Data Class, because of the one method with a number of condition statements; hence this D1 result is a false positive.

The Guest class is correctly identified by rule D1 as being a Data Class. This class is defined with two constructors and five accessor methods, which produces a low WOC value (and consequently triggers rule D1). However, the class is relatively small and does not have enough accessor methods overall to trigger rule D2 or D3.

The Room class contains four constants and nine accessor methods. The D3 rule does not consider the overall balance of accessor methods to other methods in a class (in the way that the WOC metric does) and so D3 shows a positive result (because of its failure to make allowance for the eleven non-accessor methods in the class). Hence this result is a false positive.

6.1.2 myrfctr

The Refactory system results are shown in Table 7. These results only show false positives for the D3 rule, for the same reason that the Room class was falsely identified. Both the JavaProgram and Node classes have a number of accessor methods, and are thus flagged by rule D3, but these methods are not considered in relation to the rest of the methods in the class.

A number of Data Classes have been identified manually (Abstraction, Bridge and PartialAbstraction) but missed by all of the filters. The structure of these classes is: Abstraction contains one constructor and two accessor methods, Bridge contains one constructor and an accessor method, and PartialAbstraction contains one constructor and two accessor methods. The reason why the filters do not identify these classes is that they are small and have low NOAM and NOPA results, with borderline WOC results for both the Abstraction and PartialAbstraction classes, whereas the Bridge class has quite a large


WOC value. The justification for these WOC values is, again, that the calculation includes constructors.

Table 7: Data Class, myrfctr results

6.1.3 JUnit

The control system, JUnit, shows one class (ProgressBar in the awtui package) as being a Data Class by rule D3, as it has a number of public constant variables but, interestingly enough, does not have any accessor methods. Rule D3 does not take into consideration any of the eleven non-accessor methods in the class and so this result is a false positive.

6.2 God-Class

6.2.1 HotelSystem

Table 9 shows the results of the God Class model applied to the Hotel system. There is one false positive and three true positives. The HotelDate class is a false positive because one of the triggers for rule G1 is that the metric values for ATFD and WMC are in the top 10 results. The problem here is that the system is small and has only 13 classes. HotelUI has positive results for both rules and is confirmed as a god

WOC2  NOPA  NOAM  D1   D2   D3   Manual  Classes
*     0     0     NO   NO   NO   NO      awtui.aboutdialog
                  NO   NO   NO   NO      awtui.logo
                  NO   NO   YES  NO      awtui.progressbar
                  NO   NO   NO   NO      awtui.testrunner
                  NO   NO   NO   NO      extentions.activetestsuite
                  NO   NO   NO   NO      extentions.exceptiontestcase
                  NO   NO   NO   NO      extentions.repeatedtest
                  NO   NO   NO   NO      extentions.testdecorator
                  NO   NO   NO   NO      extentions.testsetup
                  NO   NO   NO   NO      framework.assert
                  NO   NO   NO   NO      framework.assertfailederror
                  NO   NO   NO   NO      framework.comparisonfailure
                  NO   NO   NO   NO      framework.protectable
                  NO   NO   NO   NO      framework.test
                  NO   NO   NO   NO      framework.testcase
                  NO   NO   NO   NO      framework.testfailure
                  NO   NO   NO   NO      framework.testlistener
                  NO   NO   NO   NO      framework.testresult
                  NO   NO   NO   NO      framework.testsuit
                  NO   NO   NO   NO      runner.basetestrunner
                  NO   NO   NO   NO      runner.classpathtestcollector
                  NO   NO   NO   NO      runner.failuredetailview
                  NO   NO   NO   NO      runner.loadingtestcollector
                  NO   NO   NO   NO      runner.reloadingtestsuiteloader
                  NO   NO   NO   NO      runner.simpletestcollector
                  NO   NO   NO   NO      runner.sorter
                  NO   NO   NO   NO      runner.sorter.swapper
                  NO   NO   NO   NO      runner.standardtestsuiteloader
                  NO   NO   NO   NO      runner.testcaseclassloader
                  NO   NO   NO   NO      runner.testcollector
                  NO   NO   NO   NO      runner.testrunlistener
                  NO   NO   NO   NO      runner.testsuiteloader
                  NO   NO   NO   NO      runner.version
                  NO   NO   NO   NO      samples.money.imoney
                  NO   NO   NO   NO      samples.money.money
                  NO   NO   NO   NO      samples.money.moneybag
                  NO   NO   NO   NO      samples.money.moneytest
                  NO   NO   NO   NO      samples.alltests
                  NO   NO   NO   NO      samples.simpletest
                  NO   NO   NO   NO      samples.vectortest
                  NO   NO   NO   NO      swingui.aboutdialog
                  NO   NO   NO   NO      swingui.counterpanel
                  NO   NO   NO   NO      swingui.defaultfailuredetailview
                  NO   NO   NO   NO      swingui.defaultfailuredetailview.stacktracelistmodel
                  NO   NO   NO   NO      swingui.defaultfailuredetailview.stackentryrenderer
                  NO   NO   NO   NO      swingui.failurerunlview
                  NO   NO   NO   NO      swingui.failurerunlview.failurelistcellrenderer
                  NO   NO   NO   NO      swingui.progressbar
                  NO   NO   NO   NO      swingui.statusline
                  NO   NO   NO   NO      swingui.testhierarchyrunview
                  NO   NO   NO   NO      swingui.testruncontext
                  NO   NO   NO   NO      swingui.testrunner
                  NO   NO   NO   NO      swingui.testrunview
                  NO   NO   NO   NO      swingui.testselector
                  NO   NO   NO   NO      swingui.testselector.testcellrenderer
                  NO   NO   NO   NO      swingui.testselector.keyselectlistener
                  NO   NO   NO   NO      swingui.testselector.parallelswapper
                  NO   NO   NO   NO      swingui.testselector.testcellrenderer
                  NO   NO   NO   NO      swingui.testsuitepanel
                  NO   NO   NO   NO      swingui.testsuitepanel.testtreecellrenderer
                  NO   NO   NO   NO      swingui.testtreemodel
                  YES  YES  NO   NO      textui.resultprinter
                  YES  YES  NO   NO      textui.testrunner

Note: * invalid value; # represents an inner-class value that is not part of a metric calculation

Table 8: Data Class, JUnit Results

class by manual inspection. It is the main class that the user interacts with when using the system and, as a result, it has a number of methods with condition statements and method calls to other, simpler classes.

ATFD  WMC   TCC   G1   G2   Manual  Classes
                  NO   NO   NO      Bill
                  NO   NO   NO      BillItem
                  NO   NO   NO      ConferenceRoom
                  NO   NO   NO      Delegate
                  NO   NO   NO      Function
                  NO   NO   NO      FunctionDate
                  NO   NO   NO      Guest
                  NO   NO   NO      Hotel
                  YES  NO   NO      HotelDate
0     1     *     NO   NO   NO      HotelSystem
                  YES  YES  YES     HotelUI
                  NO   NO   NO      Reservation
                  NO   NO   NO      Room

Note: * invalid value as calculation divides by zero

Table 9: God Class, hotelsystem results

6.2.2 myrftr

The Refactory system results in Table 10 highlight two false positives, eight true positives and one failure to identify a true result. AbstractAccessGUI and AddImplementsLinkGUI show false positives for rule G1 because, again, the system is small and the rule requires the metric results for ATFD and WMC to fall within the top 10. JavaProgram has the highest WMC value and a low level of cohesion (TCC) but does not have high enough coupling (ATFD) to trigger rule G2. An interesting observation about the Refactory and RefactoryGUI classes is that, in addition to having identical metric values and both being correctly identified, they differ by only a few lines of code. This is indicative of a potential clone (identified by Fowler as the Duplicate Code bad smell) but is out of scope for this paper.

ATFD  WMC   TCC   G1   G2   Manual  Classes
0     1     *     NO   NO   NO      AbstractAccess
                  YES  NO   NO      AbstractAccessGUI
                  NO   NO   NO      Abstraction
                  NO   NO   NO      AbstractionGUI
                  NO   NO   NO      AbstractionHelpGUI
                  YES  NO   NO      AddImplementsLinkGUI
0     2     *     NO   NO   NO      Bridge
                  NO   NO   NO      BridgeGUI
                  NO   NO   NO      Constructor
1     3     *     NO   NO   NO      EncapsulateConstruction
                  NO   NO   NO      EncapsulateConstructionGUI
1     1     *     NO   NO   NO      FactoryMethod
                  NO   NO   NO      FactoryMethodGUI
                  NO   NO   NO      JavaFilter
                  YES  NO   YES     JavaProgram
                  NO   NO   NO      Method
                  NO   NO   NO      Node
                  NO   NO   NO      PartialAbstraction
                  NO   NO   NO      PartialAbstractionGUI
                  YES  YES  YES     Refactory
                  YES  YES  YES     RefactoryGUI
                  NO   NO   NO      StringCharacterIterator
1     1     *     NO   NO   NO      Test1
0     1     *     NO   NO   NO      Test2
                  NO   NO   NO      Wrapper
                  NO   NO   NO      WrapperGUI

Note: * invalid value as calculation divides by zero

Table 10: God Class, myrftr results

6.2.3 JUnit

The results of the application to JUnit reveal a number of false positives, primarily for reasons already discussed. One interesting result was the false identification of the TestRunner class as a god class. This class does have a number of relatively complex methods that interact with and control a number of other classes. However, the classes it interacts with are not themselves Data Classes and so TestRunner ought not to be considered a god class.

7 Refinements of Applying Metrics Manually

In response to the issues raised in the previous section, a number of refinements to the metrics and rules that are used to identify both Data Classes and god classes are proposed and evaluated.

ATFD  WMC   TCC   G1   G2   G3   Manual  Classes
2     3     *     NO   NO   NO   NO      awtui.aboutdialog
                  NO   NO   NO   NO      awtui.logo
                  NO   NO   NO   NO      awtui.progressbar
                  YES  YES  NO   YES     awtui.testrunner
                  NO   NO   NO   NO      extentions.activetestsuite
0     3     *     NO   NO   NO   NO      extentions.exceptiontestcase
                  NO   NO   NO   NO      extentions.repeatedtest
                  NO   NO   NO   NO      extentions.testdecorator
                  NO   NO   NO   NO      extentions.testsetup
                  NO   NO   NO   NO      framework.assert
                  NO   NO   NO   NO      framework.assertfailederror
0     11    *     NO   NO   NO   NO      framework.comparisonfailure
0     1     *     NO   NO   NO   NO      framework.protectable
                  NO   NO   NO   NO      framework.test
                  NO   NO   NO   NO      framework.testcase
                  NO   NO   NO   NO      framework.testfailure
                  NO   NO   NO   NO      framework.testlistener
                  NO   NO   NO   NO      framework.testresult
                  YES  NO   NO   NO      framework.testsuit
                  YES  NO   NO   YES     runner.basetestrunner
                  NO   NO   NO   NO      runner.classpathtestcollector
                  NO   NO   NO   NO      runner.failuredetailview
                  NO   NO   NO   NO      runner.loadingtestcollector
                  NO   NO   NO   NO      runner.reloadingtestsuiteloader
0     2     *     NO   NO   NO   NO      runner.simpletestcollector
                  NO   NO   NO   NO      runner.sorter
                  NO   NO   NO   NO      runner.standardtestsuiteloader
                  NO   NO   NO   NO      runner.testcaseclassloader
0     1     *     NO   NO   NO   NO      runner.testcollector
                  NO   NO   NO   NO      runner.testrunlistener
                  NO   NO   NO   NO      runner.testsuiteloader
0     2     *     NO   NO   NO   NO      runner.version
                  NO   NO   NO   NO      samples.money.imoney
                  NO   NO   NO   NO      samples.money.money
                  NO   NO   NO   NO      samples.money.moneybag
                  NO   NO   NO   NO      samples.money.moneytest
                  NO   NO   NO   NO      samples.alltests
                  NO   NO   NO   NO      samples.simpletest
                  NO   NO   NO   NO      samples.vectortest
2     4     *     NO   NO   NO   NO      swingui.aboutdialog
                  NO   YES  NO   NO      swingui.counterpanel
                  NO   NO   NO   NO      swingui.defaultfailuredetailview
                  NO   NO   NO   NO      swingui.failurerunlview
                  NO   NO   NO   NO      swingui.progressbar
                  NO   NO   NO   NO      swingui.statusline
                  NO   NO   NO   NO      swingui.testhierarchyrunview
                  NO   NO   NO   NO      swingui.testruncontext
                  YES  YES  NO   YES     swingui.testrunner
                  NO   NO   NO   NO      swingui.testrunview
                  NO   YES  NO   NO      swingui.testselector
                  NO   NO   NO   NO      swingui.testsuitepanel
                  NO   NO   NO   NO      swingui.testtreemodel
                  NO   NO   NO   NO      textui.resultprinter
                  NO   NO   NO   NO      textui.testrunner

Note: * invalid value

Table 11: God Class, JUnit results

7.1 Data Class

The three rules that Marinescu [Mar01, Mar02] used to identify a Data Class all place bounds on both the NOPA and NOAM metrics (see Table 2). These measure the number of public attributes and accessor methods in a class, respectively. The bounds vary between requiring metric results for a class to be over 3 or over 5, and to be in the top 10 or top 10% of results for a system. A class that meets these bounds is considered to be a candidate Data Class. However, placing bounds on these metric results is inappropriate for this design problem. A class that purely contains constructor and accessor methods is a Data Class; the values of NOPA and NOAM will vary from one class to another, and placing bounds on these measurements will tend to miss Data Classes with a small public interface. The size of a class is not the main consideration in deciding whether it is a pure Data Class; what matters is a structure containing only constructor and accessor methods.

Another observation regards the inclusion of constructors in the calculation of the WOC metric. Constructors do not have any real bearing on the overall design of a class and should not be part of the WOC measurement. This refinement is named WOC2; see Table 12. A new rule that takes this WOC2 value into consideration is D4: if the result equals zero then the corresponding class is a candidate data-class, as it contains only methods that are either constructors or accessors.

Classes that contain a predominant number of accessor methods and public instance variables are not to be considered Data Classes. Only pure Data Classes will be taken into account, that is, classes that contain only instance variables, constructor methods or accessor methods. Consider the range of results WOC2 can produce. If a class solely has the characteristics of a data-class, so that the only methods that exist are constructors or accessor methods, or the class has no methods at all, WOC2 is zero. A value greater than zero means there are other methods defined in

the class that have additional functionality other than changing or returning the state of instance variables.

Name / Description/Rule
WOC2 (Weight of Class)
    Number of non-accessor methods in a class divided by the total number of members of the interface, not including constructors.
NOMeth-NOCON (Number of methods, not constructors)
    Number of methods that are not members of an interface, not including constructors.
D4
    WOC2 == 0

Table 12: Data Class Refined metrics and rules

These refinements proposed for a Data Class have been applied to each of the three systems. Table 13 shows the results of the refinements applied to the Hotel system, with the other rules included for comparison. The only positive results identified by D4 match those of the manual inspection. The other two systems also showed only true positives for D4.

7.2 God-Class

A god class rule typically consists of three components: the complexity of the class, a measurement of coupling, and a measurement of cohesion. The filters proposed by Marinescu [Mar01, Mar02] only consider two out of the three metric values at any one time, and hence result in the identification of a number of false positives.

One observation is that using the WMC metric to measure a class's complexity only identifies the worst-case scenario, by counting the number of conditions. A higher WMC means the class has a number of control condition statements and is said to be rather complex.

The current coupling metric ATFD (see Table 3) defines the coupling of a class as the number of other classes whose instance variables are accessed either directly or via accessor methods. However, the metric does not consider the other

DATA CLASS:
WOC   WOC2  NOPA  NOAM  D1   D2   D3   D4   Manual  Classes
1.00  *     0     0     NO   NO   NO   NO   NO      myrfctr.abstractaccess
                        NO   NO   NO   NO   NO      myrfctr.abstractaccessgui
                        NO   NO   NO   YES  YES     myrfctr.abstraction
                        NO   NO   NO   NO   NO      myrfctr.abstractiongui
                        NO   NO   NO   NO   NO      myrfctr.abstractionhelpgui
                        NO   NO   NO   NO   NO      myrfctr.addimplementslinkgui
                        NO   NO   NO   YES  YES     myrfctr.bridge
                        NO   NO   NO   NO   NO      myrfctr.bridgegui
                        NO   NO   NO   NO   NO      myrfctr.constructor
                        NO   NO   NO   NO   NO      myrfctr.encapsulateconstruction
                        NO   NO   NO   NO   NO      myrfctr.encapsulateconstructiongui
1.00  *     0     0     NO   NO   NO   NO   NO      myrfctr.factorymethod
                        NO   NO   NO   NO   NO      myrfctr.factorymethodgui
                        NO   NO   NO   NO   NO      myrfctr.javafilter
                        NO   NO   YES  NO   NO      myrfctr.javaprogram
                        NO   NO   NO   NO   NO      myrfctr.method
                        NO   NO   YES  NO   NO      myrfctr.node
                        NO   NO   NO   YES  YES     myrfctr.partialabstraction
                        NO   NO   NO   NO   NO      myrfctr.partialabstractiongui
                        NO   NO   NO   NO   NO      myrfctr.refactory
                        NO   NO   NO   NO   NO      myrfctr.refactorygui
                        NO   NO   NO   NO   NO      myrfctr.stringcharacteriterator
                        NO   NO   NO   NO   NO      myrfctr.test
                        NO   NO   NO   NO   NO      myrfctr.test
                        NO   NO   NO   NO   NO      myrfctr.wrapper
                        NO   NO   NO   NO   NO      myrfctr.wrappergui

Table 13: Data Class, hotelsystem Refinement

(non-accessor) methods a class uses from another class, and so does not give a full coupling measure. An alternative metric that does take this into consideration is the Coupling Between Objects (CBO) metric as described by Chidamber and Kemerer [CK94]. However, using the CBO metric requires bounds to be placed on the results so that the rule can identify what is considered to be high coupling.

These two refinements are shown in Table 14. Rule G3 uses all three metric results and keeps the same bounds as rule G2 for WMC and TCC, but uses 14 as the bound for CBO (this was the lowest value that identified true positive results across all three of the systems). Clearly, the value of 14 will not necessarily be applicable to all systems, and so further evaluation is needed to identify appropriate bounds.

Name / Description/Rule
CBO (Coupling Between Objects)
    Count the number of other classes to which it is coupled. [CK94]
G3
    CBO >= 14 and WMC > 20 and TCC < 0.33

Table 14: God Class Refinements of metrics and rules

A further issue here is the nature of the classes to which the class being considered is coupled. For the class to be considered a potential god class, these other classes should have little in the way of functionality and primarily be Data Classes. Future refinements will explore the inclusion of this criterion in the coupling measure to improve the accuracy of this rule.

These refinements were applied to each of the systems and showed some interesting results. Table 15 shows the results of the refinements for the god class design flaw applied to the Refactory system. All the rules are shown for the purposes of comparison, and it can clearly be seen that the positive results of G3 are true and match the manual inspection column. A similar pattern emerges for the results of both the Hotel System and JUnit.
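Rule G3 from Table 14 is a plain conjunction of the three components, which a short sketch makes explicit. The class name and the example metric values are hypothetical; treating an undefined TCC (the "*" entries, where the calculation divides by zero) as NaN is an assumption of this sketch, and a NaN never satisfies the cohesion bound.

```java
// Hypothetical encoding of rule G3 from Table 14.
class GodClassRuleG3 {

    // Bound used in this study; it may not carry over to other systems.
    static final int CBO_BOUND = 14;

    // All three components must hold: coupling, complexity and (lack of) cohesion.
    static boolean isCandidate(int cbo, int wmc, double tcc) {
        return cbo >= CBO_BOUND && wmc > 20 && tcc < 0.33;
    }

    public static void main(String[] args) {
        // Made-up metric values, for illustration only.
        System.out.println(isCandidate(14, 21, 0.20));      // true: all three bounds met
        System.out.println(isCandidate(13, 50, 0.10));      // false: coupling below the bound
        System.out.println(isCandidate(20, 30, Double.NaN)); // false: TCC undefined
    }
}
```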

DATA CLASS:
WOC   NOPA  NOAM  D1   D2   D3   D4   Manual  Classes
                  NO   NO   NO   NO   NO      myrfctr.abstractaccess
                  NO   NO   NO   NO   NO      myrfctr.abstractaccessgui
                  NO   NO   NO   YES  YES     myrfctr.abstraction
                  NO   NO   NO   NO   NO      myrfctr.abstractiongui
                  NO   NO   NO   NO   NO      myrfctr.abstractionhelpgui
                  NO   NO   NO   NO   NO      myrfctr.addimplementslinkgui
                  NO   NO   NO   YES  YES     myrfctr.bridge
                  NO   NO   NO   NO   NO      myrfctr.bridgegui
                  NO   NO   NO   NO   NO      myrfctr.constructor
                  NO   NO   NO   NO   NO      myrfctr.encapsulateconstruction
                  NO   NO   NO   NO   NO      myrfctr.encapsulateconstructiongui
                  NO   NO   NO   NO   NO      myrfctr.factorymethod
                  NO   NO   NO   NO   NO      myrfctr.factorymethodgui
                  NO   NO   NO   NO   NO      myrfctr.javafilter
                  NO   NO   YES  NO   NO      myrfctr.javaprogram
                  NO   NO   NO   NO   NO      myrfctr.method
                  NO   NO   YES  NO   NO      myrfctr.node
                  NO   NO   NO   YES  YES     myrfctr.partialabstraction
                  NO   NO   NO   NO   NO      myrfctr.partialabstractiongui
                  NO   NO   NO   NO   NO      myrfctr.refactory
                  NO   NO   NO   NO   NO      myrfctr.refactorygui
                  NO   NO   NO   NO   NO      myrfctr.stringcharacteriterator
                  NO   NO   NO   NO   NO      myrfctr.test
                  NO   NO   NO   NO   NO      myrfctr.test
                  NO   NO   NO   NO   NO      myrfctr.wrapper
                  NO   NO   NO   NO   NO      myrfctr.wrappergui

Table 15: God Class, myrfctr refinements applied
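The D4 column in the data-class tables follows from WOC2, as defined in Section 7.1, and can be sketched as follows. The sketch is illustrative only: it recognises accessors purely by the get/set name prefix attributed to [Mar01], ignores method size and cyclomatic complexity, and returns NaN when a class has no non-constructor methods, which is one possible reading of the "*" entries. The class name is hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of WOC2 and rule D4 (WOC2 == 0).
class Woc2Sketch {

    // Name-based accessor test only; an assumption that simplifies the pattern in [Mar01].
    static boolean looksLikeAccessor(String methodName) {
        return methodName.startsWith("get") || methodName.startsWith("Get")
            || methodName.startsWith("set") || methodName.startsWith("Set");
    }

    // methodNames lists the class's methods with constructors already excluded.
    static double woc2(List<String> methodNames) {
        if (methodNames.isEmpty()) {
            return Double.NaN; // no methods to divide by (assumption)
        }
        long nonAccessors = methodNames.stream()
                .filter(name -> !looksLikeAccessor(name))
                .count();
        return (double) nonAccessors / methodNames.size();
    }

    static boolean ruleD4(List<String> methodNames) {
        return woc2(methodNames) == 0.0;
    }

    public static void main(String[] args) {
        // BillItem from Figure 4, constructors excluded.
        System.out.println(ruleD4(Arrays.asList("getdescription", "getcost"))); // true
        // A boolean getter named isBird is not recognised by the name convention.
        System.out.println(ruleD4(Arrays.asList("getcost", "isBird")));         // false
    }
}
```

The second call shows how sensitive D4 is to the way accessors are recognised: a single method that the naming convention misses is enough to move WOC2 away from zero.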

7.3 Further Issues with Automatic Detection

During this study a number of problems have been identified that relate to either the application of the metrics or the interpretation of the results. Another interesting issue arises in relation to the automatic identification of the program elements upon which the metrics are defined, one particular instance being accessor methods. Marinescu uses the following pattern to locate accessor methods: "accessor methods are small methods, with a unitary cyclomatic complexity, and we rely on the name convention, stating that the names of accessor methods are prefixed with the get (or Get) and set (or Set) prefix" [Mar01]. In carrying out this work it became clear that accessor methods often do not have get and set as prefixes, and even identifying them manually raised some ambiguities. For example, should a method named isBird that returns the boolean value of an instance variable be considered an accessor method? Using Marinescu's definition above, it would not. This is another area that needs further exploration.

8 Tool Application

The metrics implemented in the Eclipse metric framework repository are tested to check the validity of their calculations. The implemented metrics are applied to the three small systems and their results are compared against the manual calculations. Mismatches between the two sets of results may identify possible metric implementation problems, or human error in calculation. This process of validating the implemented metrics' calculations is limited to three small systems; scaling the analysis to larger systems is required. However, as already mentioned, manually applying metrics to larger systems becomes unmanageable. A solution therefore is to choose a random sample of classes in a system, calculate the metrics manually, and match the results against the tool's calculation.
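The sampling-based validation suggested above can be sketched as follows. All names are hypothetical, the seed is fixed only to make the sample reproducible, and the tolerance is an assumed value chosen because the metric results are reported to two decimal places.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical sketch of validating tool results against a manual sample.
class ValidationSample {

    // Picks up to sampleSize class names uniformly at random.
    static List<String> sample(List<String> classNames, int sampleSize, long seed) {
        List<String> copy = new ArrayList<>(classNames);
        Collections.shuffle(copy, new Random(seed));
        return new ArrayList<>(copy.subList(0, Math.min(sampleSize, copy.size())));
    }

    // Returns the classes whose tool value is missing or differs from the manual value
    // by more than the given tolerance.
    static List<String> mismatches(Map<String, Double> manual,
                                   Map<String, Double> tool,
                                   double tolerance) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Double> entry : manual.entrySet()) {
            Double toolValue = tool.get(entry.getKey());
            if (toolValue == null || Math.abs(toolValue - entry.getValue()) > tolerance) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

Each reported mismatch still needs manual follow-up, since it may point either to an implementation problem in the tool or to a human error in the manual calculation.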
Using the Eclipse metric framework allows experimentation with the set of metrics that correspond to a design problem. The additional metrics to be applied to each design problem are described and justified below.

8.1 Data-Class

Inner classes implemented in another class will usually mean that the class has more functionality than just holding data. A class that has inner classes cannot be considered a data-class. Hence, for the Number Of Classes (NOC) metric, only classes with a value equal to one should be considered, meaning that no inner classes are defined in the body of the class. The value produced will always be greater than or equal to one, depending on the number of classes defined. An interface definition contains only method declarations, with no instance variables or method bodies, and is not a class definition that could give an object a state. Hence, interfaces are not included as part of NOC.

A class constructor initialises the state of an object, and a class can have many constructors to instantiate different states of an object. However, constructors do not add functionality to a class and should not be counted as part of the number of methods within a class. Hence, subtracting the Number Of Class Constructors (NOCC) from the total Number Of Class Methods (NOCM) gives the number of methods with functionality in a class.

8.2 God-Class

A god class in the behavioural form contains a large proportion of a system's functionality, and this may show in the class being large. A possible measurement of this is the number of Lines Of Code (LOC). This measure does not consider a class's functionality, but it serves as a quick overview of a class's size.

Another characteristic of a god class is that it does not share its functionality with other classes. A class at the top of a hierarchy, with no siblings and only Object as its superclass, could be a potential god class. A Depth of Inheritance Tree (DIT) value equal to one would identify this characteristic.

As previously mentioned, Marinescu identifies the ATFD metric for the god class design problem, but there are a few problems, described in a previous section, with trying to calculate it on source code. In the first data analysis, which was applied manually to three small Java systems, this was not a real problem. However, trying to automate this calculation by implementing it in the metric tool is inappropriate. A proposed alternative to Marinescu's ATFD coupling metric is Chidamber and Kemerer's Coupling Between Objects (CBO). CBO counts the number of other classes a class is coupled to [CK94]. This is the number of different reference types used as instance variable declarations, formal parameters, return types, throws declarations and local variables, together with the types from which attribute and method selections are made. References to a type are counted only once. For example, a type that appears both as an instance variable and in a method parameter list produces a single count.

Limiting the types that CBO counts to data-classes only gives a coupling measurement that is easier to implement than ATFD. The aim of this newly defined metric is to measure the number of data-classes a class is coupled to. It is called Coupling Between Data-Classes (CBDC). Two interpretations of this metric are implemented: one counts the number of different connected data-classes (CBDC1) and the other counts the frequency of the coupled data-classes (CBDC2).

8.3 TypEx

The larger system TypEx is a Java data binding system for streams of XML [Rus04], and will be used to test and apply the implemented metrics. The results will form the basis for justifying the redefinition of some of the metrics proposed by Marinescu.
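The difference between CBO, CBDC1 and CBDC2 as defined in Section 8.2 can be illustrated with a minimal sketch. It assumes that the type references of a class have already been extracted into a flat list with one entry per occurrence, and it reads "frequency" in CBDC2 as the total number of references to data-classes; both points are assumptions made for the example and not details of the Eclipse implementation. The class names are invented.

```python
def cbo(references, own_name):
    """Number of distinct other types a class refers to; each type counts once."""
    return len({t for t in references if t != own_name})


def cbdc1(references, data_classes, own_name):
    """Number of distinct data-classes a class refers to."""
    return len({t for t in references if t in data_classes and t != own_name})


def cbdc2(references, data_classes, own_name):
    """Total number of references to data-classes, counting every occurrence."""
    return sum(1 for t in references if t in data_classes and t != own_name)


# Hypothetical class "Hotel": Booking is used twice, for instance as a field
# type and again as a method parameter type.
refs = ["Booking", "Room", "Booking", "String", "Guest"]
data_classes = {"Booking", "Guest"}  # assumed to have been identified beforehand

print(cbo(refs, "Hotel"))                  # 4 distinct types
print(cbdc1(refs, data_classes, "Hotel"))  # 2 distinct data-classes
print(cbdc2(refs, data_classes, "Hotel"))  # 3 references to data-classes
```

Whether library types such as String should contribute to CBO is a design decision that this sketch leaves open; here every referenced type is counted.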

Manual inspection of TypEx was carried out with the developer, so that their knowledge of the system could help locate possible design problems. The developer was briefed on the characteristics of the two design problems under investigation and then asked whether any classes matched or closely matched these characteristics; these classes are highlighted in the result tables. The data-classes identified by the developer are compared against the redefined set of metric filters applied to TypEx. The god classes are interpreted slightly differently this time from Marinescu's study. The filters are not used, as they can lead to the metric results being interpreted inappropriately. Instead, the structure of a class and the metric results are considered together to help justify whether a god class is present.

For each class identified by the developer as a true positive for a data-class or a god class, the obvious structure is described first, followed by a justification of the metric results and, where possible, the design problem characteristics that can be recognised. In addition, the analysis gives a reason why the class should or should not be a true positive. Any significant outliers in the metric results are also queried to understand their cause. A metric outlier could indicate that a class has the corresponding design problem, or it could be a false positive that highlights problems with the metric calculation or definition.

Data-Class

A brief overview of all the TypEx metric results shows a number of -1 values for Number Of Accessor Methods (NOAM) and Weight Of Class (WOC). NOAM produces a -1 value when a class's definition contains only instance variables and no methods. In this situation NOAM should produce a zero value, but -1 was implemented to distinguish this case from a class that contains only constructor methods.
However, this feature is not required and should be changed to produce a zero value. To explain why WOC produces -1 values, we first consider the definition by Marinescu: "The number of non-accessor methods in a class divided by the total number of members of the interface. Inherited members are not counted" [Mar02]. Under this definition, a class defined with no method implementations gives an undefined WOC value, because the calculation divides by zero. A class implemented with only instance variables, known as a struct in the C programming language, can be considered bad design in an O-O system. Nevertheless, a robust metric definition should be able to handle such cases. In the redefined WOC2 metric, constructor methods are not considered as part of the calculation. However, a -1 value is produced when a class is defined with constructor methods only, as the new definition then means dividing by zero. This small change is made, and the metrics are reapplied to the systems with no negative values produced.

Analysis with Developer

The full analysis of TypEx, with the developer identifying data-classes, is shown in Appendix B; the conclusions drawn from these results are presented here. The results highlight two differences that keep recurring: some classes use methods that are automatically inherited from the Object superclass, whereas others deviate from being a data-class by using classes within the JUnit testing framework. The false positives identify classes that are interpreted correctly by the metric results but that the developer failed to identify. This is a good result, as the automation identifies more cases than the developer, who is very familiar with the system. The developer was shown the classes they had not identified as data-classes and responded that, even though they were familiar with the system, their knowledge of some classes' content may have lapsed through inactivity.
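The division-by-zero cases described earlier in this section for WOC and WOC2 can be made concrete with a small sketch. It assumes that constructors are counted both as non-accessor methods and as interface members before WOC2 removes them, and it returns zero when the denominator is zero; the zero convention is an assumption made here for illustration and is not confirmed as the exact change made to the tool.

```python
def woc(non_accessor_methods, interface_members):
    """Weight Of Class: non-accessor methods divided by interface members.

    Returns 0.0 instead of dividing by zero (assumed convention), so a class
    with no interface members never yields a negative or undefined value.
    """
    if interface_members == 0:
        return 0.0
    return non_accessor_methods / interface_members


def woc2(non_accessor_methods, interface_members, constructors):
    """WOC with constructors removed from both the numerator and the denominator."""
    return woc(non_accessor_methods - constructors,
               interface_members - constructors)


# Hypothetical class: 1 constructor, 2 accessors, 1 other method.
print(woc(2, 4))       # 0.5
print(woc2(2, 4, 1))   # 1 non-accessor method over 3 remaining members

# Hypothetical class with two constructors and nothing else:
# WOC2 would divide by zero, so the guard returns 0.0.
print(woc2(2, 2, 2))   # 0.0
```

A value of zero places such a class at the data-class end of the WOC scale, so whether that is the desired reading for a class containing only constructors remains a judgment for the metric's designer.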
Together, the two metric interpretations, NOC and WOC2, have a better chance of locating a class with the characteristics of a data-class. However, if we map these metric interpretations against the classes that the developer thought were data-classes, a one-to-one correspondence does not exist. The key difference between the two results is that the developer identifies classes that are mainly data-classes but deviate from the design flaw's characteristics through additional methods, such as main or methods that test the class or system.

Table 16: Data-Class, TypEx.

Table 17: God-Class, TypEx.

God-Class

Analysis with Developer

The TypEx results for god classes show two true positives and one false positive, so there are only a few cases throughout the system. This is a good result, as the definition of the design flaw implies that only a limited number of classes of this nature should exist, each performing a large proportion of a system's functionality.
