CHAPTER 4 OBJECT ORIENTED COMPLEXITY METRICS MODEL

4.1 INTRODUCTION

Customers measure aspects of the final product to determine whether it meets the requirements and provides sufficient quality. Maintainers assess the current product to decide what should be upgraded and improved. Measurement of programs in software organizations is an important source of control over quality, time and cost in the software development effort. Apart from system size, complexity is an important factor to be considered in software quality. Complexity has a direct impact on the design, code, understandability and maintenance of software. In the object oriented paradigm, structural design properties such as the number of classes, the size of a class and the reuse level play a vital role in estimating complexity. Hence, it is necessary to provide a complexity metric model to ensure software quality.

4.1.1 Object Oriented Metrics for Software Complexity Estimates

Reliability and quality management models tend to treat the software more or less as a black box. They are based on either the external behavior of the product or the intermediate process data, without looking into the internal dynamics of the design and code of the software.

When metrics analysis is carried out at the program module level, it takes an internal view and can provide clues that help software engineers improve quality. Controlling and measuring complexity is a challenging engineering, managerial and research problem. Metrics have been created for measuring various aspects of complexity such as size, control flow, data structures and inter-module structure. There are four types of complexity, namely problem complexity, algorithmic complexity, structural complexity and cognitive complexity. This work focuses mainly on algorithmic and structural complexity, which are obtained by analyzing the program.

4.1.2 Cyclomatic Complexity

Cyclomatic complexity measures the number of linearly independent paths through a program module. This measure provides a single ordinal number that can be compared to the complexity of other programs (McCabe 1976). It is also known as program complexity and is intended to be independent of language. A number of complexity measures have been proposed by various researchers. The most prominent among them are cyclomatic complexity, Halstead's complexity and other similar complexity measures (Halstead 1977; McCabe 1976). Cyclomatic complexity was suggested by McCabe (1976) to improve the quality of design and the structural complexity of a system. It is calculated from the connected graph of the module using the formula

CC = E - N + 2    (4.1)

where CC is the cyclomatic complexity, E is the number of edges of the graph and N is the number of nodes in the graph. The cyclomatic complexity V(G) of a flow graph G is also defined as V(G) = P + 1, where P is the number of predicate nodes in the flow graph.

One of the design guidelines states that the cyclomatic complexity must not exceed 10, because a low cyclomatic complexity contributes to higher program understandability and indicates that the program is amenable to modification at lower risk than a more complex program. Moreover, a module's cyclomatic complexity is an indicator of its testability. A common application of cyclomatic complexity is to compare it against a set of threshold values. One such threshold set, which helps in software design, is shown in Table 4.1.

Table 4.1 Threshold for Cyclomatic Complexity

CC                Risk evaluation
1-10              A simple program without much risk
11-20             More complex and moderate risk
21-50             Complex high risk program
Greater than 50   Unstable program (very high risk)

4.1.3 Halstead Complexity Measure

Halstead's complexity measurement (Halstead 1977) was developed to measure a program module's complexity from source code, with an emphasis on computational complexity. The measures are applied to code and are most often used as a maintenance metric. They are based on four scalar numbers n1, n2, N1 and N2 that are derived directly from the program source code, where

n1 - the number of distinct operators
n2 - the number of distinct operands
N1 - the total number of operators
N2 - the total number of operands

From these numbers, five measures have been derived and termed the Halstead complexity measures, as shown in Table 4.2.

Table 4.2 Halstead's Complexity Measures

Measure              Symbol   Formula
Program Length       N        N = N1 + N2
Program Vocabulary   n        n = n1 + n2
Volume               V        V = N × log2(n)
Difficulty           D        D = (n1/2) × (N2/n2)
Effort               E        E = D × V

Extracting the component numbers from code requires a language-sensitive scanner, which is a reasonably simple program for most languages. These measures are applicable to operational systems and to development efforts once the code has been written. Because maintainability should be a concern during development, the Halstead measures should be considered for use during code development to follow complexity trends. A significant increase in a complexity measure found during testing is a sign of a high-risk module.
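The measures of Sections 4.1.2 and 4.1.3 can be illustrated with a minimal sketch. It assumes that the edge, node, operator and operand counts have already been extracted by some scanner; the function names and the sample counts are hypothetical and are not taken from the tools used in this work. The Difficulty term follows Halstead's standard definition, D = (n1/2) × (N2/n2).

```python
import math


def cyclomatic_complexity(edges, nodes):
    """Equation (4.1): CC = E - N + 2 for a connected flow graph."""
    return edges - nodes + 2


def risk_band(cc):
    """Map a CC value to the risk evaluation listed in Table 4.1."""
    if cc <= 10:
        return "A simple program without much risk"
    if cc <= 20:
        return "More complex and moderate risk"
    if cc <= 50:
        return "Complex high risk program"
    return "Unstable program (very high risk)"


def halstead(n1, n2, N1, N2):
    """Halstead measures of Table 4.2.

    n1, n2: distinct operators and operands.
    N1, N2: total operators and operands.
    """
    length = N1 + N2                        # N = N1 + N2
    vocabulary = n1 + n2                    # n = n1 + n2
    volume = length * math.log2(vocabulary) # V = N x log2(n)
    difficulty = (n1 / 2) * (N2 / n2)       # D = (n1/2) x (N2/n2)
    effort = difficulty * volume            # E = D x V
    return {
        "length": length,
        "vocabulary": vocabulary,
        "volume": volume,
        "difficulty": difficulty,
        "effort": effort,
    }


# Hypothetical counts, chosen only to exercise the formulas.
cc = cyclomatic_complexity(edges=9, nodes=8)
print(cc, "-", risk_band(cc))
print(halstead(n1=4, n2=4, N1=10, N2=6))
```

Keeping the threshold lookup separate from the CC computation means a different threshold set can be substituted without touching Equation (4.1).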

4.1.4 Other Measures of Complexity

Measures of complexity other than cyclomatic complexity and the Halstead complexity measures are given in Table 4.3.

Table 4.3 Measures of complexity

Complexity Measurement     Primary Measure Of
Henry and Kafura metrics   Coupling between modules
Bowles metrics             Module and system complexity, coupling via parameters and global variables
Troy and Zweben metrics    Modularity or coupling, complexity of structure, calls to and called by
Ligier metrics             Modularity of structure chart

In this work, the overall complexity of an object-oriented program is quantified. Moreover, the impact of reuse on an object oriented system and the relationships between various structural metrics of the system are considered. The class complexity and its impact on estimating the overall complexity of OO software are also explored.

4.2 COMPLEXITY METRICS BASED ON CLASS STRUCTURAL CHARACTERISTICS

The proposed design methodology for the complexity metrics based on class structural complexity is obtained using the procedure given below.

Step 1: Input an OO system (a C++ program) and measure its design properties, such as lines of code, number of classes, number of methods in a class and depth of inheritance.

Step 2: Devise the complexity measures from the available measures, with an emphasis on classes.

Step 3: Study the impact of reuse on the complexity of OO systems through experiment.

Step 4: Analyze the metric. Taking size as the parameter to be estimated in OO, the number of attributes per class and the number of methods per class are considered. The class size is then measured using the relation

SC = NOA + NOM    (4.2)

where NOA is the number of attributes and NOM is the number of methods.

Step 5: Apply weights to the methods and replace their number NOM by their total method size (TMS) to get the Weighted Size of the Class (SCW) using the relation

SCW = NOA + TMS    (4.3)

where NOA is the number of attributes, NOM is the number of methods and TMS is the total method size.

From experiments we found that wherever the value of SCW increases, the complexity of the software increases. Hence, it is necessary to reduce the value of SCW in order to provide a good design.

4.3 ANALYSIS AND RESULTS

The objective of this work is to empirically investigate the relationships between the structural properties, the reuse level and the complexity of OO systems. The metrics data were collected from various C++ programs and analyzed with the help of graphs to draw conclusions. The experiments were carried out with C++ programs selected from the utility programs of the Windows operating system. The programs were chosen so that their sizes range from small to large. The metrics data were collected through automation programs written in the C language.

It is observed that the class plays an important role in OO systems. At the same time, class characteristics alone are not enough to estimate the overall complexity. In this section, we provide a number of graphs to depict the results. From the graphs shown in Figures 4.2 and 4.3, it is observed that there is a proportional relationship between the number of classes and the total lines of code of OO programs. Moreover, by considering the physical code coverage by the classes, the overall complexity of OO programs can be estimated as a Weighted Complexity Measure (WCM). The procedure is as follows.

Let X be the percentage of code covered by classes out of the total size of the OO program. The value of WCM is then computed using the formula

WCM = X * (sum of class complexities) + (100 - X) * (sum of code complexities)

This does not take into account the conceptual value of the class part, because conceptual value is a subjective measure rather than an objective one.

The average DIT shows the average depth of inheritance, the Reuse ratio shows the level of reuse of classes within a program, and the Specialization ratio is an indicator of the level of abstraction achieved within a program. These metrics have an impact on the complexity of OO systems and can be determined at the design level itself. Threshold values for these metrics provide a way to control the complexity, error rate and maintenance of OO programs. More empirical studies would help in estimating valid threshold values for these metrics.

It is observed that class size in LOC and class size in NOA+NOM do not show a proportional relationship. Nevertheless, LOC is considered a vital indicator of size and effort for any program in spite of many criticisms. Empirical observations show that a threshold on NOM is a good measure for controlling error rate and complexity and for increasing understandability.

The metrics data collected from various C++ programs are shown in Tables 4.4 and 4.5.
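Equations (4.2) and (4.3) and the WCM formula can be sketched as follows. This is only an illustration under stated assumptions: the function names and sample values are hypothetical, TMS is assumed to be the sum of the individual method sizes (counted here in lines of code), and the weight on the non-class code is assumed to be 100 - X because X is a percentage.

```python
def class_size(noa, nom):
    """Equation (4.2): SC = NOA + NOM."""
    return noa + nom


def weighted_class_size(noa, method_sizes):
    """Equation (4.3): SCW = NOA + TMS.

    Assumption: TMS is the sum of the sizes of the individual methods.
    """
    return noa + sum(method_sizes)


def weighted_complexity_measure(x_percent, class_complexities, code_complexities):
    """WCM = X * sum(class complexities) + (100 - X) * sum(code complexities).

    Assumption: x_percent is the percentage (0 to 100) of the program
    covered by classes, so the remaining code is weighted by 100 - X.
    """
    if not 0 <= x_percent <= 100:
        raise ValueError("x_percent must lie between 0 and 100")
    return (x_percent * sum(class_complexities)
            + (100 - x_percent) * sum(code_complexities))


# Hypothetical class with 3 attributes and 4 methods of 5, 12, 8 and 20 lines.
print(class_size(3, 4))                               # 7
print(weighted_class_size(3, [5, 12, 8, 20]))         # 48
# Hypothetical program: 60% of the code lies in classes.
print(weighted_complexity_measure(60, [4, 6], [3]))   # 60*10 + 40*3 = 720
```

The example shows why SCW can separate two classes that SC treats as equal: both have the same NOA and NOM, but the one with larger methods receives a larger weighted size.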

Table 4.4 Metrics data - Project 1

Serial   Lines of Code   Number of   Average Class   Average Class Size
Number   (LOC)           Classes     Size (in LOC)   (in NOA+NOM)
1        756             4           62              5
2        698             12          45              2
3        4663            93          23              4
4        2880            13          36              3
5        5207            47          21              4
6        3828            28          90              5
7        1060            12          31              3
8        1497            23          29              2
9        1064            2           45              3
10       807             2           44              2
11       1379            4           4               2
12       1960            34          22              2
13       3460            12          28              2
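One way to examine the relationship between lines of code and number of classes numerically is a Pearson correlation over the LOC and Number of Classes columns of Table 4.4. The sketch below is added for illustration only and is not the graphical analysis used in this work; the helper name pearson is hypothetical.

```python
import math

# LOC and Number of Classes columns of Table 4.4, programs 1 to 13.
loc = [756, 698, 4663, 2880, 5207, 3828, 1060, 1497, 1064, 807, 1379, 1960, 3460]
noc = [4, 12, 93, 13, 47, 28, 12, 23, 2, 2, 4, 34, 12]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of the deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson(loc, noc)
print(f"Pearson r between LOC and NOC: {r:.2f}")
```

A coefficient close to 1 would support a strictly proportional reading of the data, while a lower positive value would indicate only a general upward trend.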

Table 4.5 Metrics data - Project 2

Serial   Number of       Number of     Reuse   Specialization
Number   Super Classes   Sub-Classes   ratio   ratio
1        1               2             0.25    2
2        6               11            0.5     1.8
3        13              35            0.13    2.6
4        7               12            0.5     1.7
5        21              39            0.4     1.8
6        9               15            0.3     1.6
7        3               3             0.25    1
8        1               17            0.04    17
9        1               1             0.5     1
10       2               2             1       1
11       2               2             0.5     1
12       9               12            0.3     1.3
13       1               1             0.08    1

Using these values, graphs have been plotted and are shown in Figures 4.2, 4.3 and 4.4.
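The Reuse ratio and Specialization ratio are not defined explicitly in this chapter. The sketch below assumes the commonly used definitions, Reuse ratio = number of super classes / total number of classes and Specialization ratio = number of sub-classes / number of super classes; the function names are hypothetical. The sample call further assumes that row 1 of Table 4.4 (4 classes) and row 1 of Table 4.5 (1 super class, 2 sub-classes) describe the same program.

```python
def reuse_ratio(super_classes, total_classes):
    """Assumed definition: super classes divided by total classes."""
    if total_classes <= 0:
        raise ValueError("total_classes must be positive")
    return super_classes / total_classes


def specialization_ratio(sub_classes, super_classes):
    """Assumed definition: sub-classes divided by super classes."""
    if super_classes <= 0:
        raise ValueError("super_classes must be positive")
    return sub_classes / super_classes


# Row 1: 1 super class, 2 sub-classes, 4 classes in total (assumed pairing).
print(reuse_ratio(1, 4))           # 0.25
print(specialization_ratio(2, 1))  # 2.0
```

Under these assumptions the two printed values agree with the Reuse ratio and Specialization ratio listed for program 1 in Table 4.5.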

[Figure: LOC and NOC plotted for Programs 1 to 13]
Figure 4.2 Lines of Code (LOC) and Number of Classes (NOC)

[Figure: No of Classes and TCS (Total Class Size) plotted for Programs 1 to 13]
Figure 4.3 Lines of Code (LOC) and Total Class Size in LOC (TCS)

[Figure: ACS in LOC and ACS in NOA + NOM plotted for Programs 1 to 13]
Figure 4.4 Average Class Size in LOC and Average Class Size in NOA + NOM

[Figure: Reuse ratio plotted for Programs 1 to 12]
Figure 4.5 Reuse ratio for C++ Programs

4.4 CONCLUSION

The conceptual conclusions derived from the empirical observations are:

From Figures 4.2 and 4.3 we infer that there is a proportional relationship between NOC (number of classes) and total LOC (lines of code) of OO programs.

The class plays a dominant role in OO systems. However, class characteristics alone are not enough when the overall complexity is estimated. Considering the physical code coverage by the classes, the overall complexity of an OO program may be estimated as the Weighted Complexity Measure (WCM), which provides an optimal design guideline for providing methods in classes.

Average DIT shows the average depth of inheritance, and Reuse ratio shows the level of reuse of classes within a program. Specialization ratio is an indicator of the level of abstraction achieved within a program. These metrics have an impact on the complexity of OO systems.

From Figure 4.4 it is seen that class size in LOC and class size in NOA+NOM do not show a proportional relationship.

The empirical results show that a threshold on NOM is a good measure for controlling error rate and complexity and for increasing understandability.

Hence, in this research work, we consider LOC as a base to estimate the complexity of OO programs and then apply the new OO metrics proposed in this thesis to improve the design.