Empirical Validate C&K Suite for Predict Fault-Proneness of Object-Oriented Classes Developed Using Fuzzy Logic.


Mohammad Amro 1, Moataz Ahmed 1, Kanaan Faisal 2
1 Information and Computer Science Department, King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia

Abstract
Empirical validation of software metrics suites that predict fault-proneness in object-oriented (OO) components is essential to ensure their accuracy in industrial practice. In this paper, we empirically validate the Chidamber and Kemerer (CK) metrics suite for its ability to predict software quality in terms of fault-proneness: we explore the ability of these metrics to predict fault-prone classes using defect data for six versions of Rhino, an open-source implementation of JavaScript written in Java. We conclude that the metrics in the C&K suite contain similar components and produce statistical models that are effective in detecting error-prone classes. Analyzing fuzzy logic models across six Rhino versions indicates that these models may be useful in assessing quality in OO classes produced using modern highly iterative or agile software development processes.

Keywords: fault-prone; fuzzy logic; software quality; prediction model

1 Introduction
Several object-oriented metrics have been developed by researchers to help evaluate software design quality [1-3]. While a measure may be correct from a theoretical perspective, it may not be of practical use in the software industry [4, 5]. Metrics may be difficult to collect or may not really measure the intended quality properties of software. Empirical validation is necessary to determine the usefulness of a metric in assessing open source software quality. Open source tools are becoming ever more important for users. Many companies use this kind of software in their own work. As a result, many of these projects are being developed rapidly and are quickly becoming very large.
However, because open source software is usually produced by volunteers, and the development approach differs considerably from the usual methods of commercial software development, especially in the level of testing, the quality and reliability of the code need to be investigated. Various kinds of code measurements can be quite helpful in obtaining information about the quality and fault-proneness of the code. In this paper, we describe how we calculated and validated the object-oriented metrics suite given by Chidamber and Kemerer [3] for fault-proneness detection from the source code of Mozilla Rhino, an open source JavaScript implementation written in Java [6].

2 Chidamber and Kemerer's (CK) Metrics
Chidamber and Kemerer originally defined the CK metrics suite in 1991. In 1994, they published another paper containing revised definitions of some of the metrics [3]. In this research, all CK metrics are validated for their ability to predict faults. In total, the CK suite contains six metrics, which are described in Table 1.

TABLE 1: CK SUITE METRICS [3, 7]
Metric  Description
DIT     Depth of Inheritance Tree measures how deep a class sits in the inheritance hierarchy. General classes, which are expected to be reused by other classes, are usually at a high level in the hierarchy.
WMC     Weighted Methods per Class. The number of methods per class is a measure of software size, and hence an indicator of complexity.
RFC     Response for Class is a measure of coupling. It counts the number of methods that are immediately available to and potentially used by a class.
CBO     Coupling Between Objects is a measure of coupling, counting the number of other classes to which a class is coupled. A class A is said to be coupled to another class B if class A accesses methods or variables defined by class B. A large CBO value often indicates a high degree of dependency on other classes.
LCOM    Lack of Cohesion of Methods.
NOCL    Number of Children measures the complexity of an inheritance hierarchy. It counts the number of immediate subclasses derived from the current class.

3 Experimental Evaluations

3.1 Datasets
We chose the Mozilla Rhino project for this study because it is a real open source project and because fault data is available for several of its versions. Rhino is an open source implementation of JavaScript. The Rhino development team consists of three programmers, all in separate locations, who deliver the Java implementation with a cycle time that varies from two to 16 months. In this study, we analyzed versions 14R3, 15R1, 15R2, 15R3, 15R4, and 15R5. Error data for Rhino exists in the online Bugzilla website [8]. We collected the Rhino fault data from the published work of Olague et al. [5]. Figure 1 shows the statistics for the Rhino versions investigated in the study.

Version             14R3  15R5  15R4  15R3  15R2  15R1
Defects reported      21    61   153    41    10    29
Enhancements made      1    37    76     0     0     3
Class count           95   201   198   178   179   126

Figure 1: Defects reported and enhancements made per Rhino version.
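To make the Table 1 definitions concrete, the following minimal sketch computes DIT, NOCL, and a unit-weight WMC by introspecting a small Python class hierarchy. It is only an illustration: the study extracted its metrics from Java code with METAMATA, and the class names (`Node`, `Expression`, `Literal`, `Call`) and helper functions here are hypothetical. Counting the root class `object` as depth 0 is also an assumption; tools differ on where the count starts.

```python
import inspect


def depth_of_inheritance(cls):
    """DIT: longest path from cls up to the root class `object` (depth 0)."""
    if cls is object:
        return 0
    return 1 + max(depth_of_inheritance(base) for base in cls.__bases__)


def number_of_children(cls):
    """NOCL: number of immediate subclasses currently defined for cls."""
    return len(cls.__subclasses__())


def weighted_methods(cls):
    """WMC with unit weights: methods defined directly in the class body."""
    return sum(1 for member in vars(cls).values() if inspect.isfunction(member))


# Hypothetical hierarchy, loosely modeled on an interpreter's syntax tree.
class Node:
    def evaluate(self):
        raise NotImplementedError

    def describe(self):
        return type(self).__name__


class Expression(Node):
    def evaluate(self):
        return None


class Literal(Expression):
    pass


class Call(Expression):
    def evaluate(self):
        return "call"

    def arguments(self):
        return []


for cls in (Node, Expression, Literal, Call):
    print(cls.__name__, depth_of_inheritance(cls),
          number_of_children(cls), weighted_methods(cls))
```

RFC, CBO, and LCOM need call-graph and field-access information, so they are left out of this sketch.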

Table 2 shows the descriptive statistics of the CK metrics for the Rhino datasets, which were extracted using the commercial tool METAMATA.

TABLE 2: DESCRIPTIVE STATISTICS FOR THE DATASETS
Version  Statistic    DIT       WMC       RFC       CBO       LCOM      NOCL
14R3     (unlabeled)  6         464       165       59        2681      2
         Mean         2.506494  109.4805  26.66234  10.22078  115.3377  1.012987
         StdDev       1.154207  123.0208  33.43517  11.28182  420.9545  0.113961
15R1     (unlabeled)  6         688       202       65        3305      3
         Mean         2.578431  122.6765  28.87255  10.89216  112.8627  1.019608
         StdDev       1.120787  140.9371  37.09919  11.98424  460.8304  0.19803
15R2     (unlabeled)  7         732       203       69        4126      3
         Mean         2.779817  139.7064  29.23853  10.21101  141.2477  1.027523
         StdDev       1.480477  167.6788  38.70709  12.13128  546.7883  0.213382
15R3     (unlabeled)  7         730       206       76        4524      3
         Mean         2.841121  144.1402  29.96262  10.4486   152.1308  1.065421
         StdDev       1.486731  169.6414  40.12408  12.59697  594.096   0.315362
15R4     (unlabeled)  7         764       205       77        4951      3
         Mean         2.756757  147.5225  30.32432  10.25225  158.1982  1.117117
         StdDev       1.472266  173.6812  41.31831  12.5274   615.8118  0.398605
15R5     (unlabeled)  6         922       214       67        5172      6
         Mean         2.825688  156.1193  31.66055  10.25688  166.6422  1.155963
         StdDev       1.470979  181.4662  41.41484  11.65746  665.6276  0.626192

TABLE 3: CORRELATIONS BETWEEN DIT, WMC, RFC, CBO, LCOM, NOCL, AND NUMBER OF DEFECTS REPORTED
              DIT     WMC    RFC    CBO    LCOM   NOCL
WMC           0.188
RFC           0.349   0.941
CBO           0.829   0.535  0.671
LCOM          0.460   0.904  0.838  0.757
NOCL          -0.267  0.859  0.667  0.093  0.692
# of Defects  0.325   0.371  0.328  0.626  0.600  0.160

To identify the independent variables most relevant to the dependent variable, we used Pearson's correlation coefficient (PCC), which indicates the strength and direction of a linear relationship between two variables. Table 3 shows the PCC between the number of defects and each of the CK metrics. From the table, the number of defects is positively correlated with every CK metric, and most strongly with CBO (0.626), LCOM (0.600), and WMC (0.371).

3.2 Prediction Accuracy Measures
The term prediction accuracy in this paper means how well a predictive model constructed using known data can predict the outcomes of unknown data. This paper evaluates and compares the Rhino fault-proneness prediction models quantitatively, using the prediction accuracy measures described below. For all of these measures, the lower the error, the better the performance. In the equations, f(x_i) is the value predicted by the model and y_i the value actually observed for observation i of n.

Root-mean-square error (RMSE) measures the differences between the values predicted by a model and the values actually observed from the thing being modeled:

\[ \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\bigl(f(x_i) - y_i\bigr)^2} \tag{1} \]

Normalized root-mean-square error (NRMSE) normalizes the RMSE to the range of the observed data:

\[ \mathrm{NRMSE} = \frac{\mathrm{RMSE}}{f(x)_{\max} - f(x)_{\min}} \tag{2} \]

The magnitude of relative error (MRE) is a normalized measure of the discrepancy between actual values and predicted values:

\[ \mathrm{MRE} = \frac{\lvert y - f(x) \rvert}{y} \tag{3} \]

Mean magnitude of relative error (MMRE):

\[ \mathrm{MMRE} = \frac{1}{n}\sum_{i=1}^{n}\mathrm{MRE}_i \tag{4} \]

4 Result and Discussion
This section describes the experiments conducted in our study. We trained the model in two configurations: once using all CK metrics and once using only the highly correlated metrics CBO, LCOM, and WMC. We repeated each experiment more than once to produce reliable results. Figures 2 and 3 show the results for two error measures (NRMSE and MMRE) for the fuzzy Mamdani model.

[Figure 2: NRMSE error measures using the Mamdani model. Bar values as extracted: 1.2042, 0.2982, 0.8925, 0.05190945. Category labels include (Rhino dataset), (WMC, CBO, LCOM), and Mean.]

[Figure 3: MMRE error measures using the Mamdani model. Bar values as extracted: 0.0035, 0.0112, 0.0037, 0.0137. Category labels include (Rhino dataset), (WMC, CBO, LCOM), and Mean.]
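The four error measures of Section 3.2 can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: the function names and the sample defect counts are hypothetical. Equation (2) writes the normalizing range in terms of f(x) while the accompanying text describes it as the range of the observed data; the sketch follows the text and divides by the observed range, which is an assumption.

```python
import math


def rmse(predicted, observed):
    """Equation (1): root of the mean squared prediction error."""
    n = len(observed)
    return math.sqrt(sum((p - y) ** 2 for p, y in zip(predicted, observed)) / n)


def nrmse(predicted, observed):
    """Equation (2), normalized by the observed range (assumption, see lead-in)."""
    return rmse(predicted, observed) / (max(observed) - min(observed))


def mre(predicted_value, observed_value):
    """Equation (3): undefined when the observed value is zero."""
    return abs(observed_value - predicted_value) / observed_value


def mmre(predicted, observed):
    """Equation (4): mean of the per-observation MRE values."""
    return sum(mre(p, y) for p, y in zip(predicted, observed)) / len(observed)


# Hypothetical defect counts for four classes (not data from the study).
observed = [2.0, 4.0, 5.0, 10.0]
predicted = [3.0, 4.0, 4.0, 8.0]

print("RMSE :", rmse(predicted, observed))
print("NRMSE:", nrmse(predicted, observed))
print("MMRE :", mmre(predicted, observed))
```

Because MRE divides by the observed value, classes with zero reported defects would have to be excluded or handled separately before MMRE is computed.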

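For readers unfamiliar with Mamdani inference, the sketch below shows the general mechanism on a single input: fuzzify a metric value, clip each rule's output set at the rule's firing strength (min), aggregate the clipped sets (max), and defuzzify by centroid. The paper does not list its membership functions or rule base, so everything specific here is a hypothetical choice made for illustration: the single CBO input, the 0 to 80 input range, the two triangular sets per variable, the two rules, and the fault-proneness score in [0, 1].

```python
def triangle(x, left, peak, right):
    """Triangular membership: 0 at left and right, 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical input universe for CBO and fuzzy sets "low" and "high".
CBO_MAX = 80.0


def cbo_low(x):
    return triangle(x, -CBO_MAX, 0.0, CBO_MAX)


def cbo_high(x):
    return triangle(x, 0.0, CBO_MAX, 2 * CBO_MAX)


# Hypothetical output universe: fault-proneness score sampled on [0, 1].
SCORES = [i / 100 for i in range(101)]


def score_low(z):
    return triangle(z, -1.0, 0.0, 1.0)


def score_high(z):
    return triangle(z, 0.0, 1.0, 2.0)


def mamdani_fault_proneness(cbo):
    """Two-rule Mamdani inference with centroid defuzzification."""
    x = min(max(cbo, 0.0), CBO_MAX)  # clamp to the input universe
    # Rule 1: IF CBO is low  THEN fault-proneness is low.
    # Rule 2: IF CBO is high THEN fault-proneness is high.
    strength_low = cbo_low(x)
    strength_high = cbo_high(x)
    # Clip each consequent at its rule strength (min), aggregate with max.
    aggregated = [
        max(min(strength_low, score_low(z)), min(strength_high, score_high(z)))
        for z in SCORES
    ]
    # Centroid of the aggregated output set.
    return sum(z * m for z, m in zip(SCORES, aggregated)) / sum(aggregated)


for cbo in (10, 40, 70):
    print(cbo, round(mamdani_fault_proneness(cbo), 3))
```

A model along the lines of the one evaluated above would take several CK metrics as inputs and combine them in each rule's antecedent, typically with min for AND.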
5 Conclusion and Future Work
In this paper, we conducted experiments to evaluate the performance of fuzzy inference system models in predicting the fault-proneness of object-oriented classes using CK metrics. As shown in Table 3, there is a significant correlation between the measures provided by three CK metrics (LCOM, CBO, WMC) and the number of defects in a class. We used two accuracy measures (NRMSE and MMRE) to validate the model. As future work, we plan to conduct the experiment with a larger dataset, which is expected to improve the performance of the fuzzy inference models.

Acknowledgment
The authors acknowledge the support of King Fahd University of Petroleum and Minerals.

References
[1] Bansiya, J. and C.G. Davis, A hierarchical model for object-oriented design quality assessment. Software Engineering, IEEE Transactions on, 2002. 28(1): p. 4-17.
[2] Brito e Abreu, F. and W. Melo. Evaluating the impact of object-oriented design on software quality. in Software Metrics Symposium, 1996., Proceedings of the 3rd International. 1996. IEEE.
[3] Chidamber, S.R. and C.F. Kemerer, A metrics suite for object oriented design. Software Engineering, IEEE Transactions on, 1994. 20(6): p. 476-493.
[4] Basili, V.R., L.C. Briand, and W.L. Melo, A validation of object-oriented design metrics as quality indicators. Software Engineering, IEEE Transactions on, 1996. 22(10): p. 751-761.
[5] Olague, H.M., et al., Empirical validation of three software metrics suites to predict fault-proneness of object-oriented classes developed using highly iterative or agile software development processes. Software Engineering, IEEE Transactions on, 2007. 33(6): p. 402-419.
[6] N. Boyd. Rhino Home Page. July 2006; Available from: http://www.mozilla.org/rhino/.
[7] Yu, P., T. Systa, and H. Muller. Predicting fault-proneness using OO metrics. An industrial case study. in Software Maintenance and Reengineering, 2002. Proceedings. Sixth European Conference on. 2002. IEEE.
[8] Database, B. Mozilla Foundation. July 2004; Available from: https://bugzilla.mozilla.org/.