An Unsystematic Review of Genetic Improvement. David R. White University of Glasgow UCL Crest Open Workshop, Jan 2016

A systematic study of GI is currently in preparation by Justyna Petke, Mark Harman, Bill Langdon, John Woodward, and Saemundur Haraldsson. This presentation is based on a snapshot of that work.

Review Unmethod
1. Working list of ~300 papers.
2. Reduced to 150 by eliminating those I didn't think would be relevant to my view of GI.
3. Cherry-picked meta-papers, significant papers (for my definition of significant), and controversial papers.
4. Read. A lot.
5. Summary stats and observations.

Outline 1. Trends. 2. Progress in the field. 3. Challenges. 4. New Directions.

GI Papers by Year
[Chart: number of GI papers per year, 1995 to 2015, annotated with ParaGen, the SEBASE Project, GenProg, and the GI Workshop.]

Earliest GI: the ParaGen System. Paul Walsh and Conor Ryan, ParCo 1995. Formally verifiable results.

Bug-Fixing vs Non-Functional
[Chart: share of papers by objective. Categories: Bug-Fixing, Non-Functional, Other.]

Target Language
[Chart: share of papers by target language. Categories: C++, Java, C, Other, No Language.]

Outline 1. Summary statistics. 2. Progress in the field. 3. Challenges. 4. New Directions.

Toy Problems
void sort(int* a, int length) {
    int k;  /* temporary used for the swap */
    for (; 0 < (length - 1); length--) {
        for (int j = 0; j < (length - 1); j++) {
            if (a[j] > a[1 + j]) {
                k = a[j];
                a[j] = a[j + 1];
                a[1 + j] = k;
            }
        }
    }
}
Evolutionary Improvement of Programs. White et al. IEEE TEC, vol. 15, no. 4, pp. 515-538, 2011.

2009: The Zune Bug
void year_from_days_since_1980(int days) {
    int year = 1980;
    while (days > 365) {
        if (isleapyear(year)) {
            if (days > 366) {
                days -= 366;
                year += 1;
            } else {
                /* empty: days is never reduced here */
            }
        } else {
            days -= 365;
            year += 1;
        }
    }
    printf("current year is %d\n", year);
}
Try input days = 366: execution drops into the empty else, days never changes, and the loop never terminates.
A Genetic Programming Approach to Automated Software Repair. Forrest et al. GECCO 2009.

2009: The Zune Bug (repaired)
void year_from_days_since_1980(int days) {
    int year = 1980;
    while (days > 365) {
        if (isleapyear(year)) {
            if (days > 366) {
                // days -= 366;    repair deletes
                year += 1;
            } else {
            }
            days -= 366;        // repair inserts
        } else {
            days -= 365;
            year += 1;
        }
    }
    printf("current year is %d\n", year);
}
A Genetic Programming Approach to Automated Software Repair. Forrest et al. GECCO 2009.
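The effect of the repair can be checked with a small test harness, which is also roughly how a test suite acts as the fitness signal in this kind of work. The sketch below is a minimal illustration, not the code from the paper: `is_leap_year` is a hypothetical stand-in for the `isleapyear` helper the slide does not show, both functions return the year instead of printing it, and the `max_iters` cap is an assumption added so that non-termination shows up as a return value of -1 rather than a hang.

```c
#include <stdbool.h>

/* Hypothetical helper: the slide calls isleapyear() without showing it. */
static bool is_leap_year(int year) {
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

/* Original logic. The iteration cap is an addition for this sketch:
   it turns the infinite loop into a detectable -1 result. */
int year_buggy(int days, int max_iters) {
    int year = 1980;
    while (days > 365) {
        if (max_iters-- <= 0) {
            return -1;  /* gave up: treated as non-termination */
        }
        if (is_leap_year(year)) {
            if (days > 366) {
                days -= 366;
                year += 1;
            }
            /* days == 366: nothing changes, so the loop cannot exit */
        } else {
            days -= 365;
            year += 1;
        }
    }
    return year;
}

/* Repaired logic from the slide: the subtraction is moved out of the
   inner if, so days shrinks on every pass through a leap year. */
int year_repaired(int days) {
    int year = 1980;
    while (days > 365) {
        if (is_leap_year(year)) {
            if (days > 366) {
                year += 1;
            }
            days -= 366;
        } else {
            days -= 365;
            year += 1;
        }
    }
    return year;
}
```

With these definitions, `year_buggy(366, 1000)` returns -1 while `year_repaired(366)` returns 1980, so a single failing test case on day 366 is enough to separate the two variants.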

Awards
GECCO Human-Competitive Awards:
- A Genetic Programming Approach to Automated Software Repair. Forrest et al. GECCO 2009. Gold award.
- Using Genetic Improvement and Code Transplants to Specialize a C++ Program to a Problem Class. Petke et al. EuroGP 2014. Silver award.
- A Systematic Study of Automated Program Repair: Fixing 55 out of 105 bugs for $8.00 each. Dewey-Vogt et al. ICSE 2012. Bronze award.
ICSE Distinguished Paper Award:
- Automatically Finding Patches Using Genetic Programming. Weimer et al. ICSE 2009.

Progress: Representation
Originally, GI directly manipulated the AST, as with traditional GP. Most work now uses a patch-based representation [1, 2]. There is empirical evidence that patches are more effective [3].
[1] Evolving Patches for Software Repair. Ackling et al. GECCO 2011.
[2] Automatically Finding Patches using Genetic Programming. Weimer et al. ICSE 2009.
[3] Representations and Operators for Improving Evolutionary Software Repair. Le Goues et al. GECCO 2012.
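A patch-based individual can be pictured as a short list of edits applied to the unmodified program, rather than a full copy of its AST. The sketch below is a simplified illustration under stated assumptions: the program is modelled as a flat array of statement strings, and the `Edit` type, the three operations, and `apply_patch` are hypothetical names chosen for this example, not the API of any tool cited above. Inserted and replacement statements are copied from elsewhere in the same program.

```c
#include <stddef.h>

typedef enum { EDIT_DELETE, EDIT_REPLACE, EDIT_INSERT_AFTER } EditOp;

typedef struct {
    EditOp op;
    size_t target;  /* index of the original statement being edited */
    size_t source;  /* index of the statement copied in (ignored by DELETE) */
} Edit;

/* Applies every edit in `patch` to the original statement list and writes
   the resulting program into `out`. All indices refer to the ORIGINAL
   program, so edits do not shift one another. Assumes source < n.
   Returns the number of statements written (at most out_cap). */
size_t apply_patch(const char *const *orig, size_t n,
                   const Edit *patch, size_t n_edits,
                   const char **out, size_t out_cap) {
    size_t len = 0;
    for (size_t i = 0; i < n; i++) {
        const char *stmt = orig[i];
        /* Deletions and replacements decide what is emitted at position i. */
        for (size_t e = 0; e < n_edits; e++) {
            if (patch[e].target != i) continue;
            if (patch[e].op == EDIT_DELETE) {
                stmt = NULL;
            } else if (patch[e].op == EDIT_REPLACE) {
                stmt = orig[patch[e].source];
            }
        }
        if (stmt != NULL && len < out_cap) {
            out[len++] = stmt;
        }
        /* Insertions are emitted directly after position i. */
        for (size_t e = 0; e < n_edits; e++) {
            if (patch[e].target == i && patch[e].op == EDIT_INSERT_AFTER
                && len < out_cap) {
                out[len++] = orig[patch[e].source];
            }
        }
    }
    return len;
}
```

Because an individual is only the edit list, mutation and crossover operate on a handful of small records, and the original program is never modified in place.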

Traditional GP Search Space

Patch Search Space

Progress: Fault Localisation
GenProg: simple statistical approach. Reduce the value of k by focusing on instructions executed by the failing test cases. Other work has used program counter sampling with Gaussian convolution! [1]
[1] Automated Repair of Binary and Assembly Programs for Cooperating Embedded Devices. Schulte et al. SIGARCH Comput. Archit. News 41:1, pp. 317-328.
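The idea of concentrating edits on statements executed by failing tests can be sketched as a weighting step followed by weighted selection. This is a minimal illustration, not GenProg's implementation: the two weight constants are illustrative assumptions, and `statement_weights` and `pick_statement` are hypothetical names. Coverage is assumed to be available as one boolean per statement for the failing tests and one for the passing tests.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative weights (assumptions for this sketch). */
#define W_FAIL_ONLY     1.0  /* executed by failing tests only */
#define W_FAIL_AND_PASS 0.1  /* executed by failing and passing tests */

/* Assigns each statement a mutation weight from its coverage.
   Statements never executed by a failing test get weight 0. */
void statement_weights(const bool *on_failing, const bool *on_passing,
                       size_t n, double *weights) {
    for (size_t i = 0; i < n; i++) {
        if (!on_failing[i]) {
            weights[i] = 0.0;
        } else if (on_passing[i]) {
            weights[i] = W_FAIL_AND_PASS;
        } else {
            weights[i] = W_FAIL_ONLY;
        }
    }
}

/* Roulette-wheel selection: r is a uniform random number in [0, 1)
   supplied by the caller. Returns the chosen statement index, or n if
   every weight is zero. */
size_t pick_statement(const double *weights, size_t n, double r) {
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        total += weights[i];
    }
    if (total <= 0.0) {
        return n;
    }
    double threshold = r * total;
    double cumulative = 0.0;
    for (size_t i = 0; i < n; i++) {
        cumulative += weights[i];
        if (cumulative > threshold) {
            return i;
        }
    }
    return n - 1;  /* guards against floating-point rounding at r near 1 */
}
```

Statements with zero weight are never selected, which is what shrinks the portion of the program the search has to consider.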

Applications Targeting GPGPU. [1] Genetically Improved CUDA C++ Software. Langdon and Harman. EuroGP 2014. [2] Improving 3D Medical Image Registration CUDA Software with Genetic Programming. Langdon et al. GECCO 2014.

Objectives
- Bug-fixing: Automatic Repair of Concurrency Bugs. Bradbury and Jalbert. SSBSE 2010.
- Execution time: Evolutionary Improvement of Programs. White et al. TEC 15:4. 2011.
- Power consumption: Reducing Energy Consumption Using Genetic Improvement. Bruce et al. GECCO 2015.
- Memory: A Methodology to Automatically Optimize Dynamic Memory Managers Applying Grammatical Evolution. Risco-Martín et al. JSS 91, 109-123.
- Diversity: Generating Diverse Software Versions with Genetic Programming: an Experimental Study. Feldt. IEE Software 145:6. 1998.

Objectives Cont.
- Parallelisation: Automatic conversion of programs from serial to parallel using Genetic Programming - The Paragen System. Walsh and Ryan. ParCo '95.
- Translation: Evolving a CUDA kernel from an nvidia template. Langdon. CEC 2010.
- Simplification: Genetic programming for shader simplification. Sitthi-amorn et al. SA '11.
- Extending functionality: Automated Software Transplantation. Barr et al. ISSTA 2015.

Outline 1. Summary statistics. 2. Progress in the field. 3. Challenges. 4. New Directions.

Empirical Method
There has been criticism of existing work, including GenProg [1]. The biggest issue is overfitting.
Observation: many papers focus on statistics rather than code. We need more qualitative examination of results. There is not much code in the papers!
Avoid "horse-race" [2] papers that add little of scientific value.
[1] An Analysis of Patch Plausibility and Correctness for Generate-and-Validate Patch Generation Systems. Qi et al. ISSTA 2015.
[2] A Theoretician's Guide to the Experimental Analysis of Algorithms. Johnson. DIMACS Implementation Challenges. 2002.

Correctness Generate and Validate Approaches have problems with overfitting. They may also break more functionality than they fix. Is the Cure Worse Than the Disease? Overfitting in Automated Program Repair. Smith et al. ESEC/FSE 2015.
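One way to make overfitting visible is to score a candidate patch on tests that were held out from the repair search. The sketch below is a toy illustration under stated assumptions: the program under repair is an absolute-value function, and `overfitted_abs`, `general_abs`, `TestCase`, and `pass_rate` are hypothetical names invented for this example, not taken from the cited study.

```c
#include <stddef.h>

typedef struct {
    int input;
    int expected;
} TestCase;

/* Fraction of test cases on which the candidate returns the expected value. */
double pass_rate(int (*candidate)(int), const TestCase *tests, size_t n) {
    if (n == 0) {
        return 0.0;
    }
    size_t passed = 0;
    for (size_t i = 0; i < n; i++) {
        if (candidate(tests[i].input) == tests[i].expected) {
            passed++;
        }
    }
    return (double)passed / (double)n;
}

/* Hypothetical overfitted "repair": special-cases the one failing
   training input instead of fixing the underlying logic. */
int overfitted_abs(int x) {
    if (x == -3) {
        return 3;
    }
    return x;
}

/* Hypothetical general repair: correct for every input except INT_MIN. */
int general_abs(int x) {
    return x < 0 ? -x : x;
}
```

With training tests {(-3, 3), (5, 5)} and held-out tests {(-7, 7), (0, 0)}, both candidates pass every training test, but only `general_abs` passes every held-out test; `overfitted_abs` passes half of them.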

Credibility
Also a wider issue for GP and EC. Evolution is self-evidently a powerful general learning method. Are programs a suitable subject? There is evidence that they are surprisingly robust [1], and that neutral networks exist [2].
[1] Software is Not Fragile. Langdon and Petke. CS-DC 2015.
[2] Neutral Networks of Real-World Programs and their Application to Automated Software Evolution. Schulte. PhD Thesis. Uni. New Mexico, 2014.

Theory Limited work on the theoretical side. Some work on mutational robustness. Where is the landscape analysis?

Outline 1. Summary statistics. 2. Progress in the field. 3. Challenges. 4. New Directions.

Hardware Measurement Simulators are out, physical measurements are in. [1] Energy Optimisation via Genetic Improvement. Bruce. GECCO 2015.

Binaries
Bug-fixing and program thinning without source code. Interesting problem: harder to intuit whether a patch is correct.
- Removing the Kitchen Sink from Software. Landsborough et al. GECCO 2015.
- Repairing COTS Router Firmware without Access to Source Code or Test Suites: A Case Study in Evolutionary Software Repair. Schulte et al. GECCO 2015.
- Automatic Error Elimination by Horizontal Code Transfer Across Multiple Applications. Sidiroglou-Douskas et al. PLDI 2015.

Deep Parameter Optimisation Optimise parameters within code rather than code itself: thus possess an oracle. Use mutation testing to determine impact of parameters. Transform non-functional properties. Target key components, e.g. malloc [1]. Also a suitable target for Amortised Optimisation. [2] [1] Deep Parameter Optimisation. Wu et al. GECCO 2015. [2] Amortised Optimisation of Non-functional Properties in Production Environments. Yoo. SSBSE 2015.
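The core idea, tuning a constant buried inside the code while the original program's output serves as the oracle, can be sketched as follows. Everything here is an assumption made for illustration: the "deep parameter" is the growth factor of a simulated growable buffer, the cost model (elements copied plus unused slots) is a deterministic stand-in for a real measurement such as time or memory, and `run_with_growth` and `tune_growth` are hypothetical names.

```c
#include <stddef.h>

typedef struct {
    size_t stored;  /* functional result: number of items stored */
    size_t cost;    /* non-functional proxy: copies plus wasted slots */
} RunResult;

/* Simulates appending n items to a buffer whose capacity is multiplied
   by `growth` whenever it fills up. Assumes growth >= 2. */
RunResult run_with_growth(size_t n, size_t growth) {
    size_t cap = 1, len = 0, copies = 0;
    for (size_t i = 0; i < n; i++) {
        if (len == cap) {
            copies += len;  /* a reallocation copies every existing item */
            cap *= growth;
        }
        len++;
    }
    RunResult r = { len, copies + (cap - len) };
    return r;
}

/* Exhaustive search over candidate values of the deep parameter.
   The first candidate is treated as the original setting; its
   functional result is the oracle every other candidate must match.
   Assumes n_candidates >= 1. Returns the cheapest valid candidate. */
size_t tune_growth(size_t n, const size_t *candidates, size_t n_candidates) {
    RunResult baseline = run_with_growth(n, candidates[0]);
    size_t best = candidates[0];
    size_t best_cost = baseline.cost;
    for (size_t k = 1; k < n_candidates; k++) {
        RunResult r = run_with_growth(n, candidates[k]);
        if (r.stored != baseline.stored) {
            continue;  /* functional behaviour changed: reject */
        }
        if (r.cost < best_cost) {
            best = candidates[k];
            best_cost = r.cost;
        }
    }
    return best;
}
```

In practice the search space is much larger and the cost is measured rather than modelled, but the structure is the same: the code is left alone, only parameter values vary, and the unmodified program supplies the expected output.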

Code Transplantation Extend functionality of existing code by importing features from other codebases. Process of locating a transplantation point; extracting relevant code; finding correct bindings; minimisation; passing regression tests. Automated Software Transplantation. Barr et al. ISSTA 2015.

Semantic Search
Index and search large code repositories: it's about existing code. Find patches with similar semantics expressed in SMT. Much higher success rate in producing repairs that generalise.
Repairing Programs with Semantic Code Search. Ke et al. ASE 2015.
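The search step can be pictured as filtering a repository of existing snippets by required behaviour. The sketch below deliberately swaps the SMT encoding for something much simpler: each snippet is executed on a few input/output examples and kept only if it matches all of them. The three-snippet "repository" and the names `Snippet`, `IOExample`, and `search_snippets` are hypothetical and exist only for this illustration.

```c
#include <stddef.h>

typedef int (*Snippet)(int);

typedef struct {
    int input;
    int output;
} IOExample;

/* Hypothetical repository of existing code fragments. */
static int snippet_negate(int x) { return -x; }
static int snippet_square(int x) { return x * x; }
static int snippet_abs(int x)    { return x < 0 ? -x : x; }

/* Returns 1 if the snippet reproduces every example, 0 otherwise. */
static int matches(Snippet s, const IOExample *spec, size_t n_spec) {
    for (size_t i = 0; i < n_spec; i++) {
        if (s(spec[i].input) != spec[i].output) {
            return 0;
        }
    }
    return 1;
}

/* Writes the indices of all matching snippets into `hits` (up to
   hits_cap entries) and returns how many were written. */
size_t search_snippets(const Snippet *repo, size_t n_repo,
                       const IOExample *spec, size_t n_spec,
                       size_t *hits, size_t hits_cap) {
    size_t found = 0;
    for (size_t i = 0; i < n_repo && found < hits_cap; i++) {
        if (matches(repo[i], spec, n_spec)) {
            hits[found++] = i;
        }
    }
    return found;
}
```

A single example such as (-1, 1) is matched by all three snippets, while the pair (-2, 2) and (3, 3) leaves only `snippet_abs`. This shows why the richness of the behavioural specification matters for finding repairs that generalise.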

Summary
We're at a crucial juncture for GI research. We must address the problems with empirical method: solid theory or systematic empirical methods are required, and qualitative, explanatory evidence is key. There is a wide range of possible avenues for work, many of which circumvent objections and address challenging problems.