Generating Tests for Detecting Faults in Feature Models


Generating Tests for Detecting Faults in Feature Models
Paolo Arcaini (Charles University in Prague, Czech Republic), Angelo Gargantini (University of Bergamo, Italy), Paolo Vavassori (University of Bergamo, Italy)

Outline
- Feature models (FM)
- Fault-based testing approach for FMs
- Common faults and mutation operators for FMs
- Distinguishing configuration (dc)
- Generation of dcs
- Test suite that guarantees the detection of faults
- Experiments
- Usage of dcs
- Conclusions

Feature models
A feature model is a compact representation of all the products of a Software Product Line (SPL) in terms of features and relations among them.
Relation types: mandatory feature, optional feature, alternative, or, excludes constraint, requires constraint.

Configurations and products
A configuration of a feature model M is a subset of the features in M that must include the root. A valid configuration is called a product, since it represents a possible valid instance of the feature model.
Product: {Mobile Phone, Calls, Screen, Basic, Media, MP3}
Invalid configuration: {GPS, Screen, Media}
Configurations can be used as tests, but exhaustive coverage is impractical. Which configurations should we select?

FMs as Boolean formulas
Feature model semantics can be expressed rather simply using propositional logic. Let BOF be a function that, given a feature model M, returns its representation as a propositional formula. For example: the root feature a yields the formula a; a mandatory child b of a yields a ↔ b; an optional child c of a yields c → a.
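To make the encoding concrete, here is a minimal Python sketch (not the paper's BOF implementation; the feature names a, b, c are illustrative) that evaluates the propositional encoding of a tiny model — root a, mandatory child b, optional child c — and enumerates its products by exhaustive evaluation:

```python
from itertools import product

def bof(cfg):
    """Propositional encoding of a toy model: root a, mandatory b, optional c."""
    a, b, c = cfg["a"], cfg["b"], cfg["c"]
    return (a                  # root must be present
            and (a == b)       # mandatory child: a <-> b
            and (not c or a))  # optional child:  c -> a

# Products = valid configurations: all assignments satisfying the formula.
products = [cfg for cfg in (dict(zip("abc", bits))
                            for bits in product([False, True], repeat=3))
            if bof(cfg)]
```

Here the toy model has exactly two products: {a, b} and {a, b, c}.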

Fault-based testing for Feature Models

Fault-based testing approach
Goal: demonstrate that prescribed faults are not in the artifact under test.
Assumption: the model can only be incorrect in a limited fashion, specified by a relatively small set of common mistakes.
1. A fault model: fault classes and mutation operators
2. A definition of the conditions able to expose the faults
3. A way to generate tests that satisfy those conditions

1. Fault model - Mutation operators
We focus on faults that can be detected by a configuration, i.e., faults that may change the set of products (valid configurations):
- AlToOr: Alternative to Or
- AlToAnd: Alternative to And
- OrToAl: Or to Alternative
- OrToAnd: Or to And
- AndToOr: And to Or
- AndToAl: And to Alternative
- ManToOpt: mandatory to optional
- OptToMan: optional to mandatory
- MF: a feature f is removed
- MC: a constraint is removed
- ReqToExcl: requires to excludes
- ExclToReq: excludes to requires
Applying one operator to the model yields a mutant.
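As a sketch of how a mutation operator acts (a hypothetical data representation, not the paper's tooling): if parent-child relations are stored as tuples, ManToOpt becomes a small rewrite of one relation:

```python
# Each relation is (kind, parent, child); the names are illustrative.
MODEL = [("mandatory", "a", "b"), ("optional", "a", "c")]

def man_to_opt(model, child):
    """ManToOpt: turn the mandatory relation of `child` into an optional one."""
    return [("optional", p, c) if (kind, c) == ("mandatory", child)
            else (kind, p, c)
            for kind, p, c in model]

MUTANT = man_to_opt(MODEL, "b")
```

The original model is left untouched; the mutant is a fresh list with b now optional.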

2. Distinguishing configuration
Given a feature model M and one of its mutants M', a configuration c distinguishes M from M' iff c is valid in M and not in M', or vice versa.
Example: configuration {a, b} is valid in both M and M', so it is not distinguishing; configuration {a, c} is valid in M' but not in M, so it is distinguishing.
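The definition translates directly into code. A minimal check, with formulas as Python predicates (a hedged sketch over a toy model — root a, mandatory b, optional c — not the slide's figure):

```python
def distinguishes(cfg, phi, phi_mut):
    """cfg distinguishes M from M' iff it is valid in exactly one of them."""
    return phi(cfg) != phi_mut(cfg)

# Toy model and its OptToMan mutant on c (c becomes mandatory: a <-> c).
phi     = lambda c: c["a"] and c["a"] == c["b"] and (not c["c"] or c["a"])
phi_mut = lambda c: c["a"] and c["a"] == c["b"] and c["a"] == c["c"]

# {a, b} is valid in M but not in M' (the mutant forces c), so it distinguishes.
ok = distinguishes({"a": True, "b": True, "c": False}, phi, phi_mut)
```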

Some notes
A distinguishing configuration does not need to be a valid product. It may not exist: equivalent mutants, i.e., mutants with the same product set (e.g., obtained by removing a redundant requires constraint), cannot be distinguished.

3. Generation of distinguishing configurations
Boolean difference (or derivative): the erroneous implementation φ' of a Boolean expression φ can be discovered only when the expression φ ⊕ φ' evaluates to true, where ⊕ denotes the logical exclusive or (xor) operator.
The generation of a distinguishing configuration can be performed by:
1. translating M and its mutant M' by BOF
2. using a SAT/SMT solver to get a model of the Boolean derivative

Generation of dc
Given a FM M:
1. Apply a mutation operator, obtaining M'
2. By BOF, get the Boolean formulas φ and φ' for M and M'
3. Build the Boolean difference φ ⊕ φ'
4. Call a SAT solver: if SAT, the model is a distinguishing configuration (a test); if UNSAT, M' is an equivalent mutant
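The steps above can be sketched in a few lines, with brute-force enumeration standing in for the SAT call (fine for toy models; the paper uses a real SAT/SMT solver):

```python
from itertools import product

def find_distinguishing(features, phi, phi_mut):
    """Return a model of phi XOR phi' (a distinguishing configuration),
    or None if the mutant is equivalent (the XOR is unsatisfiable)."""
    for bits in product([False, True], repeat=len(features)):
        cfg = dict(zip(features, bits))
        if phi(cfg) != phi_mut(cfg):  # phi ⊕ phi' evaluates to true
            return cfg
    return None

# Toy model (root a, mandatory b, optional c) and its ManToOpt mutant on b.
phi     = lambda c: c["a"] and c["a"] == c["b"] and (not c["c"] or c["a"])
phi_mut = lambda c: c["a"] and (not c["b"] or c["a"]) and (not c["c"] or c["a"])

dc = find_distinguishing("abc", phi, phi_mut)  # valid in M' but not in M
```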

Example
Applying ManToOpt to M yields M'. BOF gives the Boolean formulas for M and M'; the SAT/SMT solver, called on their difference, returns the model {a = true, b = false, c = true}: a configuration valid in M' but not in M, hence distinguishing.

Generation of a test suite
To get a set of configurations able to detect all the faults for a given feature model:
1. generate all the mutants of the original feature model by applying one mutation at a time
2. generate a test predicate for every mutant
3. use the solver to get a model for every test predicate (or to prove that the mutant is equivalent)
This basic process can be optimized (to produce more compact test suites and possibly save time).
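The basic, unoptimized process — one solver call per mutant — can be sketched as follows (brute-force enumeration replaces the SAT solver; a hedged sketch, not the paper's implementation):

```python
from itertools import product

def generate_test_suite(features, phi, mutants):
    """One 'solver' call per mutant; returns (tests, number of equivalent mutants)."""
    tests, equivalent = [], 0
    for phi_mut in mutants:
        for bits in product([False, True], repeat=len(features)):
            cfg = dict(zip(features, bits))
            if phi(cfg) != phi_mut(cfg):  # model of the test predicate phi ⊕ phi'
                tests.append(cfg)
                break
        else:
            equivalent += 1               # unsatisfiable: equivalent mutant
    return tests, equivalent

phi = lambda c: c["a"] and c["a"] == c["b"]              # root a, mandatory b
mutants = [lambda c: c["a"] and (not c["b"] or c["a"]),  # ManToOpt on b
           lambda c: c["a"] and c["a"] == c["b"]]        # an equivalent mutant
tests, eq = generate_test_suite("ab", phi, mutants)
```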

Optimizations
- Monitoring: checks whether a test produced for a test predicate is also a model for other test predicates
- Collecting: looks for a model of a conjunction of test predicates, instead of a model for each test predicate; very expensive in terms of SAT calls
- Prioritizing: considers the test predicates in a particular order
- Post reduction: after test generation, removes unnecessary tests, i.e., tests that only cover test predicates that are also covered by other tests
- Logical xor simplification: (α ∧ β) ⊕ (α ∧ γ) ≡ α ∧ (β ⊕ γ)
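Post reduction, for instance, can be sketched as a greedy pass over the coverage relation (an illustrative sketch, not the paper's algorithm):

```python
def post_reduce(coverage):
    """coverage[i] = set of test predicates covered by test i.
    Greedily drop any test whose predicates are all covered by the kept tests."""
    keep = set(range(len(coverage)))
    for i in sorted(keep, key=lambda i: len(coverage[i])):  # try weakest tests first
        others = set().union(set(), *(coverage[j] for j in keep if j != i))
        if coverage[i] <= others:
            keep.discard(i)
    return sorted(keep)

# Test 0 covers predicates {1, 2}, which tests 1 and 2 jointly cover: drop it.
reduced = post_reduce([{1, 2}, {2, 3}, {1, 3}])
```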

Experiments

Settings
- FeatureIDE
- SMT solver Yices
- 53 models from the SPLOT repository: from 9 to 287 features (up to 2.48 × 10^86 configurations), from 2 to 8.53 × 10^47 products
- 1600 artificially generated models distributed with FeatureIDE, having 10, 20, 50, 100, 200, 500, 1000, and 2000 features

Some preliminary research questions
RQ1: How many mutants are generated? Not so many: a linear increase with the number of features.
RQ2: How many mutants are generated by each mutation operator? Half of the mutants are for missing features.
RQ3: How are mutants distributed among the feature model edit classes {Refactoring, Specialization, Generalization, Arbitrary edit}? Details in the paper.
RQ4: Is the simplification of the xor expressions effective? A lot.

RQ5: How big are the fault-detecting test suites? How much time is required to generate them?
Time presents an exponential increase; size increases linearly.
Artificial models, up to 200 features: average time 38 seconds; average size 60 tests (instead of 2^200 configurations).

RQ5 (continued): SPLOT models, from 9 to 287 features (up to 2.48 × 10^86 configurations):

       TS size   Time (ms)   #Infeasible
Min    4         27          0
Mean   19.9      5,264       0.49
Max    134       137,029     8

Much fewer infeasible tests than for artificial models: do fewer equivalent mutants mean better quality?

Example of use for a distinguishing configuration

#ifdef HELLO
char* msg = "Hello!\n";
#endif
#ifdef BYE
char* msg = "Bye bye!\n";
#endif
main() { printf(msg); }

HELLO and BYE are alternative features. Configurations: {HELLO} is valid; {HELLO, BYE} is invalid, as confirmed by the compiler (msg is declared twice).
How to detect faults in the feature model for C code: generate the distinguishing configurations and check that the oracle (the compiler) confirms their validity.

RQ6: Are distinguishing configurations useful?
Experiment on a typical use of distinguishing configurations:
- We took 6 small programs from the literature on the synthesis of feature models from preprocessor directives of C/C++ source code
- We instructed 6 students and asked them to build the 6 feature models
- Each configuration represents an option set to be given to the compiler, so we can use the compiler as an oracle to judge whether a configuration is valid (a product) or not
- We generated the distinguishing configurations and checked them with the compiler

Results - comparison w.r.t. other techniques

Technique                      #tests  #failing tests  #faulty models (fm)  tests/fm
Distinguishing configurations  109     12              10                   10.9
Pairwise                       165     12              5                    33
All products                   285     23              5                    57
All configurations             516     119             17                   30.4

Our approach produces the minimum number of tests and is able to discover 59% of the faulty models: much more effective than the other techniques, but not all faults were detected. The coupling effect does not hold in our case, and the competent programmer hypothesis may not be true. Possible remedies: new mutation operators or higher-order mutants (HOM).

Conclusions
- A configuration is distinguishing (dc) if it can characterize a feature model w.r.t. a faulty version of it
- Generation of dcs: translating FMs to Boolean formulas, taking the derivative (xor), and using SAT for test generation
- The technique is effective, but some faults may remain undetected