Deep Parameter Optimisation for Face Detection Using the Viola-Jones Algorithm in OpenCV: A Correction


UCL DEPARTMENT OF COMPUTER SCIENCE

Research Note RN/17/07

Deep Parameter Optimisation for Face Detection Using the Viola-Jones Algorithm in OpenCV: A Correction

7th June 2017

Bobby R. Bruce

Abstract

In our 2016 paper "Deep Parameter Optimisation for Face Detection Using the Viola-Jones Algorithm in OpenCV" [2], we reported on an evolutionary, multi-objective approach to deep parameter optimisation that, we claimed, could reduce the execution time of a face detection algorithm by 48% if a 1.90% classification inaccuracy were permitted (compared to the 1.04% classification inaccuracy of the original, unmodified algorithm), and that further execution time savings were possible depending on the degree of inaccuracy permitted by the user. However, after publication we found an error in our experimental setup: instead of running the deep parameter optimisation framework using an evolutionary search-based approach, we had been using a systematic one. We therefore re-ran the experiments using the intended evolutionary implementation, alongside the systematic implementation, for 1,000 evaluations and again for 10,000 evaluations. We found that the systematic setup is superior to the intended evolutionary setup in that it produces solutions which, when run on the test set, form a richer Pareto frontier. The evolutionary approach, in the 10,000 evaluation setup, produced a better Pareto frontier on the training set; however, the majority of these solutions were infeasible or not Pareto optimal when run on the test set. We suspect this may be due to the evolutionary approach over-fitting to the training set.

    for (int i = 0; i < solution.getNumberOfVariables(); i++) {
        int val = genotype[i];
        if (i == ((numEvaluations % solution.getNumberOfVariables()) - 1)) {
            val += ((numEvaluations - 1) / solution.getNumberOfVariables() + 1);
        }
        solution.getVariable(i).setValue(val);
    }

Figure 1: Faulty Implementation

    if (numEvaluations < populationSize) {
        for (int i = 0; i < solution.getNumberOfVariables(); i++) {
            int val = genotype[i];
            if (i == ((numEvaluations % solution.getNumberOfVariables()) - 1)) {
                val += ((numEvaluations - 1) / solution.getNumberOfVariables() + 1);
            }
            solution.getVariable(i).setValue(val);
        }
    }

Figure 2: Fixed Implementation

1 Introduction

In 2016, "Deep Parameter Optimisation for Face Detection Using the Viola-Jones Algorithm in OpenCV" was presented at the International Symposium on Search Based Software Engineering [2]. This short investigation used deep parameter optimisation [11] to optimise a simple face detection algorithm implemented using OpenCV's Viola-Jones algorithm, permitting a reduction in face detection accuracy in exchange for decreased execution time. The investigation, part of the symposium's challenge track, was intended to show deep parameter optimisation's suitability for solving real-world multi-objective problems in software engineering. Results showed that the face detection algorithm's execution time could be decreased by 48% with a 1.90% classification inaccuracy, and that considerably greater savings in execution time were possible if more inaccuracy were permitted. Deep parameter optimisation is an instance of genetic improvement [8] (GI), a sub-domain of search-based software engineering [5] in which search techniques are utilised to aid software development.
Typically, GI modifies source-code through a set of operators, such as deleting a line of source-code [3, 6, 9] or swapping API implementations [7]. In contrast, deep parameter optimisation targets specific parameters within source-code known to influence a target property, or properties, such as execution time [2, 11], memory consumption [11], or energy consumption [1]. We refer to these as deep parameters. A deep parameter is any element in a program's source-code that may exist in multiple known forms while maintaining an acceptable level of functional correctness. These parameters are exposed at a level at which they may be altered; typically they are represented as an n-tuple of n exposed parameters. Within the work presented here, and in the 2016 paper, the exposed parameters are integer constants that can be altered to influence execution time while still classifying images (with varying levels of accuracy dependent on the parameters chosen). Given this solution representation, it is intuitive to view the tuple as a genotype which can then be modified using a genetic algorithm [10]. If one wishes to adopt a multi-objective approach, such as permitting degradation in output quality in exchange for faster execution, one may utilise a multi-objective genetic algorithm such as NSGA-II [4]. This is the setup we used during our 2016 investigation into optimising OpenCV but, due to a bug explained in Section 2, a systematic search was used instead of an evolutionary one, meaning values in the tuple were incremented one by one. The results reported here show that, surprisingly, this bug helped produce a better Pareto frontier than the correct evolutionary approach would have. We suspect this is due to the evolutionary approach over-fitting to the training set.

RN/17/07 Page 1

2 Fault and Fix

Within the 2016 paper, we seeded the initial generation with a single instance of the original solution (that is, all the deep parameters as they appear in the original source-code). The remainder of the initial population was generated by iterating through the parameters of the original solution and generating a variant equal to the original but with that parameter incremented by 1 or 2. There was, however, a fault that resulted in every generation being populated by this initialisation algorithm. The original source is shown in Figure 1. As can be seen, there is no code ensuring it is only applied to the initial generation; it was, therefore, applied to every generation. Though there was selection, mutation, and crossover, once a new population was created it was immediately overwritten. The MOEA Framework^1 (the framework we use to run NSGA-II [4]) maintains the current best Pareto frontier. This bug thereby nullifies evolution and instead systematically goes through the search space looking for Pareto optimal solutions, starting from the original, unmodified solution. Within this investigation we implement the fix shown in Figure 2, allowing the evolutionary code to run as was originally intended.

3 Experiment Rerun

Using the code outlined in Figure 2, we re-ran the experiments, running for 1,000 evaluations (as in the original paper) and again for 10,000 evaluations on the training set. We then took the Pareto frontier produced in both the 1,000 and 10,000 evaluation runs, executed each Pareto optimal point on the test set, and re-plotted the Pareto frontier, removing dominated solutions and solutions that crashed or timed out. We then did the same using the original, faulty implementation (i.e. with the faulty code shown in Figure 1). All other parameters were identical to the 2016 paper; we strongly advise reading the 2016 paper prior to this correction in order to understand the experimental setup in greater detail.
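To illustrate the effect of the fault, the loop in Figure 1 amounts to the following enumeration of the search space: the k-th evaluation takes the original parameter tuple and increments a single parameter, with the increment growing as the sweep repeats. This is a minimal, self-contained sketch of that arithmetic; the class and method names are ours, not the paper's code.

```java
import java.util.Arrays;

// Sketch (hypothetical names) of the systematic search the faulty
// initialisation effectively performs on the deep-parameter tuple.
public class SystematicSweep {

    // The variant evaluated at evaluation number k (k >= 1), mirroring the
    // index and increment arithmetic of the loop in Figure 1. An index of
    // -1 (when k is a multiple of the tuple length) leaves the original
    // tuple unmodified.
    static int[] variantAt(int[] original, int k) {
        int n = original.length;
        int[] v = original.clone();
        int index = (k % n) - 1;             // which parameter to bump
        if (index >= 0) {
            v[index] += (k - 1) / n + 1;     // +1 on the first sweep, +2 on the next, ...
        }
        return v;
    }

    public static void main(String[] args) {
        int[] original = {10, 20, 30};       // hypothetical deep parameters
        for (int k = 1; k <= 7; k++) {
            System.out.println(Arrays.toString(variantAt(original, k)));
        }
    }
}
```

Run over successive evaluation numbers, this walks outward from the original solution one parameter at a time, which is why the buggy setup behaves as a systematic rather than evolutionary search.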
4 Results

In Figure 3 we see the results of running the two techniques for 1,000 evaluations and for 10,000 evaluations on the training set (the red, left-most point in each graph shows the original, unmodified solution). It should be noted that, in the case of the evolutionary approach running for 10,000 evaluations, the MOEA Framework finished execution after only 9,000 evaluations, as it stops after several generations of stagnation regardless of the evaluation limit set. Looking exclusively at these results, both the evolutionary and systematic approaches produce healthy Pareto frontiers. While there is not much difference between running the systematic approach for 1,000 or 10,000 evaluations, the evolutionary approach does improve its Pareto frontier in the 10,000 evaluation instance, to the extent that it is better than the systematic approach (Figure 5b shows an overlay of these two Pareto frontiers).

In Figure 4 we see the results of the Pareto frontiers shown in Figure 3 run on the test set. That is, each point on the Pareto front produced by the MOEA Framework (running on the training set) was run on the test set; the results were then re-plotted with dominated or invalid solutions (those which cause the program to crash or time out) removed. Comparing Figure 4 to Figure 3, we can see that the Pareto front of the systematic approach largely holds when run on the test set in both the 1,000 and 10,000 evaluation instances. However, the Pareto frontier of the evolutionary approach does not. In fact, the Pareto frontier of the 10,000 evaluation instance maintains only 2 points out of the original 9. The 1,000 evaluation run fares a little better, maintaining 4 of 10. As the number of evaluations increased, the Pareto frontier of the evolutionary technique worsened when run on the test set. We suspect this is due to the Pareto frontier becoming over-fitted to the training set (and thereby worsening its performance when run on the test set).
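The non-dominated filtering step used when re-plotting on the test set can be sketched as follows. This is our own minimal illustration, not the MOEA Framework's implementation; both objectives (here, execution time and classification inaccuracy) are treated as minimised, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch: reduce a set of bi-objective points to its Pareto frontier,
// with both objectives minimised.
public class ParetoFilter {

    // p dominates q if p is no worse in both objectives and strictly
    // better in at least one.
    static boolean dominates(double[] p, double[] q) {
        return p[0] <= q[0] && p[1] <= q[1] && (p[0] < q[0] || p[1] < q[1]);
    }

    // Keep only the points that no other point dominates.
    static List<double[]> nonDominated(List<double[]> points) {
        List<double[]> front = new ArrayList<>();
        for (double[] p : points) {
            boolean dominated = false;
            for (double[] q : points) {
                if (q != p && dominates(q, p)) { dominated = true; break; }
            }
            if (!dominated) front.add(p);
        }
        return front;
    }

    public static void main(String[] args) {
        // (execution time, % inaccuracy) -- the third point is dominated.
        List<double[]> pts = Arrays.asList(
                new double[]{10.0, 1.9},
                new double[]{20.0, 1.0},
                new double[]{25.0, 2.5});
        for (double[] p : nonDominated(pts)) {
            System.out.println(p[0] + " " + p[1]);
        }
    }
}
```

Invalid solutions (crashes or timeouts) would simply be dropped from the input list before this filter is applied.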
Figure 5 has been included to show more clearly the comparisons between the evolutionary and systematic approaches for the 1,000 and 10,000 evaluation runs on both the training and test sets.

5 Conclusion

In this short paper we highlight a mistake in a previously published work and correct it. This mistake resulted in our deep parameter tuning setup optimising using a systematic search instead of an evolutionary one.

^1 MOEA Framework version 2.9, available at http://moeaframework.org

Figure 3: Training Set. (a) 1k Evaluations, Evolutionary; (b) 1k Evaluations, Systematic; (c) 10k Evaluations, Evolutionary; (d) 10k Evaluations, Systematic.

Figure 4: Test Set. (a) 1k Evaluations, Evolutionary; (b) 1k Evaluations, Systematic; (c) 10k Evaluations, Evolutionary; (d) 10k Evaluations, Systematic.

Figure 5: Pareto frontiers compared. (a) 1k Evaluations (Training Set); (b) 10k Evaluations (Training Set); (c) 1k Evaluations (Test Set); (d) 10k Evaluations (Test Set).

To our surprise, this mistake resulted in better solutions than the fixed evolutionary approach. Our results show that while, in some cases, the evolutionary approach was capable of producing a superior Pareto frontier on the training set, this did not translate to a superior Pareto frontier when run on the test set. We suspect this is due to the evolutionary approach over-fitting to the training set. All the source-code and data for this investigation (including source-code and data for the 2016 paper) can be found on GitHub^2.

References

[1] M. A. Bokhari, B. R. Bruce, B. Alexander, and M. Wagner. Deep parameter optimisation on Android smartphones for energy minimisation: a tale of woe and a proof-of-concept. In Proceedings of the Companion Publication of the 2017 Genetic and Evolutionary Computation Conference (GECCO Companion 2017), 2017.

[2] B. R. Bruce, J. M. Aitken, and J. Petke. Deep parameter optimisation for face detection using the Viola-Jones algorithm in OpenCV. In International Symposium on Search Based Software Engineering (SSBSE 2016), pages 238-243. Springer, 2016.

[3] B. R. Bruce, J. Petke, and M. Harman. Reducing energy consumption using genetic improvement. In Proceedings of the 2015 Genetic and Evolutionary Computation Conference (GECCO 2015). ACM, 2015.

[4] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation, 6(2):182-197, 2002.

[5] M. Harman and B. F. Jones. Search-based software engineering. Information and Software Technology, 43(14):833-839, 2001.

[6] W. B. Langdon and M. Harman. Optimising existing software with genetic programming. IEEE Transactions on Evolutionary Computation, 2015.

[7] I. Manotas, L. Pollock, and J. Clause. SEEDS: a software engineer's energy-optimization decision support framework. In Proceedings of the 36th International Conference on Software Engineering (ICSE 2014), pages 503-514. ACM, 2014.

[8] J. Petke, S. O. Haraldsson, M. Harman, W. B. Langdon, D. R. White, and J. R. Woodward. Genetic improvement of software: a comprehensive survey. IEEE Transactions on Evolutionary Computation, 2017.

[9] J. Petke, M. Harman, W. B. Langdon, and W. Weimer. Using genetic improvement & code transplants to specialise a C++ program to a problem class. In Proceedings of the 17th European Conference on Genetic Programming (EuroGP 2014), 2014.

[10] M. Srinivas and L. M. Patnaik. Genetic algorithms: a survey. Computer, 27(6):17-26, 1994.

[11] F. Wu, W. Weimer, M. Harman, Y. Jia, and J. Krinke. Deep parameter optimisation. In Proceedings of the 2015 Annual Conference on Genetic and Evolutionary Computation (GECCO 2015), pages 1375-1382. ACM, 2015.

^2 https://github.com/bobbybruce1990/dpt-opencv