Quality Assurance in Software Development

Qualitätssicherung in der Softwareentwicklung
A.o.Univ.-Prof. Dipl.-Ing. Dr. Bernhard Aichernig
Graz University of Technology, Austria
Summer Term 2017

Agenda
1. The Challenge of Software Testing
2. Definitions
3. Functional testing strategies
   - Boundary Value Testing
   - Random Testing
   - Equivalence Class Testing

The Challenge of Software Testing
- Correctness
- Requirements
- Development Process

A silly joke
A mathematician, a physicist, and an engineer are told: "All odd numbers are prime."
- The mathematician says, "That's silly, nine is a non-prime odd number."
- The physicist says, "Let's see, 3 is prime, 5 is prime, 7 is prime... looks like it's true."
- The engineer says, "Let's see, 3 is prime, 5 is prime, 7 is prime, 9 is prime, 11 is prime... looks like it's true."

Becoming more serious
The little joke illustrates the following challenges in testing:
- oracle problem: How can the outcome of a test case be predicted?
- incompleteness: Exhaustive testing is usually infeasible. How much testing is needed?
- test case selection: Are the chosen test cases adequate?
- correctness of test cases: Are the test cases correctly designed?

Testing and Correctness
"Testing can only demonstrate the presence of errors, not their absence." (E. Dijkstra, 1972)
- Reason: the discrete nature of software.
- Testing of all possible inputs would be necessary.
- Exhaustive testing is infeasible.
- Representative test cases have to be selected: we need a test strategy in order to achieve a certain test coverage.

Testing and Requirements
- The quality of testing strongly depends on the quality of the requirement documents.
- Requirements that are not documented cannot be tested systematically.
- 60 % of all bugs in the requirements phase are due to missing requirements.
- Requirements written in natural language are often
  - incomplete
  - ambiguous
  - inconsistent
- Formal specifications and formal models help!

Testing in the Development Process
- In the classical waterfall model, testing is the last phase.
- Danger: if the project is delayed, testing gets cut.
- Dramatic: most large software projects overrun their deadline.
- Consequence: the test cases have to be designed as early as possible, as soon as the software requirements are known. This leads to agile software processes.
- High testing costs: 1/3 to 1/2 of the resources in big projects. Test automation is needed!

Self test
Write a set of test cases that you feel would adequately test this program (Myers, The Art of Software Testing):
"The program reads three integer values from an input stream. The three values are interpreted as representing the lengths of the sides of a triangle. The program prints a message that states whether the triangle is scalene, isosceles, or equilateral."

Self test: questions
Do you have...
1. a test case that represents a valid scalene triangle? Test cases (1, 2, 3) and (2, 5, 10) are not valid.
2. a test case that represents a valid equilateral triangle?
3. a test case that represents a valid isosceles triangle?
4. at least three test cases that represent valid isosceles triangles such that you have tried all three permutations of two equal sides (e.g., (3, 3, 4), (3, 4, 3), and (4, 3, 3))?
5. a test case in which one side has a zero value?
6. a test case in which one side has a negative value?
7. a test case with three integers greater than zero such that the sum of two of the numbers is equal to the third?

Self test: questions (cont.)
8. at least three test cases in Category 7 such that you have tried all three permutations where the length of one side is equal to the sum of the lengths of the other two sides (e.g., (1, 2, 3), (1, 3, 2), and (3, 1, 2))?
9. a test case with three integers greater than zero such that the sum of two of the numbers is less than the third (e.g., (1, 2, 4))?
10. at least three test cases in Category 9 such that you have tried all three permutations (e.g., (1, 2, 4), (1, 4, 2), and (4, 1, 2))?
11. a test case in which all sides are zero?
12. at least one test case specifying noninteger values?
13. at least one test case specifying the wrong number of values?
14. for each test case the expected outputs specified?
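The checklist can be tried against a small reference implementation. The following is a minimal Python sketch, not Myers' program: the name `classify_triangle`, the return strings, and the extra `"NoTriangle"` result for invalid side lengths are assumptions made for illustration.

```python
def classify_triangle(a: int, b: int, c: int) -> str:
    """Classify three side lengths (hypothetical reference implementation)."""
    # Questions 5, 6 and 11: zero or negative sides never form a triangle.
    if a <= 0 or b <= 0 or c <= 0:
        return "NoTriangle"
    # Questions 7 to 10: every side must be strictly shorter than the sum
    # of the other two, whichever position the long side is in.
    if a >= b + c or b >= a + c or c >= a + b:
        return "NoTriangle"
    if a == b == c:
        return "Equilateral"
    if a == b or b == c or a == c:
        return "Isosceles"
    return "Scalene"


# Some checklist cases, each paired with its expected output (question 14).
checklist = [
    ((3, 4, 5), "Scalene"),
    ((5, 5, 5), "Equilateral"),
    ((3, 3, 4), "Isosceles"),
    ((3, 4, 3), "Isosceles"),
    ((4, 3, 3), "Isosceles"),
    ((1, 2, 3), "NoTriangle"),
    ((1, 2, 4), "NoTriangle"),
    ((0, 4, 5), "NoTriangle"),
    ((-1, 4, 5), "NoTriangle"),
    ((0, 0, 0), "NoTriangle"),
]
for sides, expected in checklist:
    assert classify_triangle(*sides) == expected
```

Questions 12 and 13 (noninteger values, wrong number of values) concern the parsing of the input stream and are outside the scope of this sketch.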

Naming the Wrong
IEEE Standard Glossary of Software Engineering Terminology:
- Mistake: a human action that produces an incorrect result. Example: an incorrect action taken by the programmer.
- Fault: an incorrect step, process, or data definition in a computer program (also defect, bug).
- Error: the difference between a computed, observed, or measured value or condition and the true, specified, or theoretically correct value or condition.
- Failure: the inability of a system or component to fulfill its required functions within specified performance requirements.
Dijkstra was against calling it a bug.

Naming the Wrong: Example
Consider the statement x = y + x;
By mistake, a programmer changes it to x = y - x; which constitutes a fault.
Executing the fault with x = 0 does not lead to an error (both statements compute y), and consequently not to an observable failure.
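The distinction between fault and error can be made concrete with a small Python sketch. It assumes the faulty operator is a minus sign, which is consistent with x = 0 masking the fault; the function names are hypothetical.

```python
def correct(x: int, y: int) -> int:
    return y + x  # intended statement: x = y + x


def faulty(x: int, y: int) -> int:
    return y - x  # the fault: '+' was mistakenly replaced by '-'


# With x = 0 the fault is executed, but the computed value is still right:
# no error, hence no observable failure.
assert faulty(0, 7) == correct(0, 7)

# Any x != 0 turns the fault into an error (a wrong computed value).
assert faulty(2, 7) != correct(2, 7)
```

A test suite that only ever uses x = 0 would therefore never reveal this fault.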

Testing, Test, Test Case
- testing: the act of designing, debugging, and executing test cases.
- test: the act of exercising software with test cases. A test has two distinct goals: to find failures or to demonstrate correct execution.
- test case: a test case has an identity and is associated with a program behaviour. A test case also has a set of inputs and a list of expected outputs.
The output portion of a test case is frequently overlooked, which is unfortunate, because it is often the hard part.

Black & White
Two fundamental approaches are used to identify test cases:
- functional testing: based on the view that any program can be considered to be a function that maps values from its input domain to values in its output range (black box testing). Test cases are identified using the function's specification.
- structural testing: based on how the function is actually implemented (white box testing).

Why We Test
- validation: the process of evaluating an object to demonstrate that it meets the user requirements.
- verification: ANSI/IEEE Std 729-1983 gives three possible definitions of verification:
  1. the process of reviewing, inspecting, testing, etc., whether objects, processes, services, or documents satisfy the specified requirements.
  2. the process of evaluating whether an object in a given phase of the software lifecycle meets the requirements established in the previous phase.
  3. formal correctness proofs of programs.
- falsification: the process of evaluating an object to demonstrate that it does not meet the requirements.

When We Test
The following kinds of tests can be distinguished during the software lifecycle:
- unit tests: testing of units such as modules, classes, and components.
- integration tests: tests that explore the interaction and consistency of successfully tested components.
- system tests: testing of the whole system in order to explore behaviours that cannot be covered by unit or integration testing: performance, data integrity, storage management, security, reliability.
- acceptance tests: does the system satisfy the requirements? Part of the contract.
- regression tests: tests after changes to prevent unintended changes.

The V-model
[Figure: V-model diagram]

Functional Testing: Programs as Functions
In functional testing, a program is considered as a function
$p : \mathit{Input} \to \mathit{Output}$
mapping values from its domain (Input) to values in its range (Output).
- Types permit the definition of domain and range.
- Preconditions restrict the domain further.
- Postconditions restrict the range.
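How types, preconditions, and postconditions narrow the domain and range can be sketched in Python. The integer square root is a hypothetical example chosen for illustration; it does not appear in the slides.

```python
import math


def isqrt_program(x: int) -> int:
    """Hypothetical program p : int -> int with an explicit contract."""
    # The type annotation gives the domain (int); the precondition
    # restricts it further to the non-negative integers.
    if x < 0:
        raise ValueError("precondition violated: x must be >= 0")
    r = math.isqrt(x)
    # The postcondition restricts the range: r is the largest integer
    # whose square does not exceed x.
    assert r * r <= x < (r + 1) * (r + 1), "postcondition violated"
    return r
```

A functional test case picks an input that satisfies the precondition and uses the postcondition as the oracle for the expected output.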

Boundary Value Analysis
Focuses on the boundary of the input space (domain) to identify test cases.
Rationale: errors tend to occur near the extreme values of an input variable.
Strategy: input values at
1. minimum
2. just above minimum
3. a nominal value
4. just below maximum
5. maximum

Boundary Value Analysis: 2 Input Variables
Boundary value test cases for a function of two variables i1 and i2:
[Figure: test points in the (i1, i2) input plane; axis marks a, b on i1 and c, d on i2]
Single fault assumption! (no relation between parameters)

Generalizing Boundary Value Analysis
Number of test cases: 4n + 1 for a function with n input variables.
Different ranges:
- Triangle problem: e.g. min = 1, max = MAXINT
- Date input: a date corresponds to 3 input variables; e.g. min = 1.3.2003, max = 31.3.2003
- Boolean input: does not make sense!
Limitations: only good when the program is a function of several independent variables that represent bounded physical quantities.
- good: temperature, coordinates
- bad: date, PIN code, telephone number
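The 4n + 1 count follows from varying one variable at a time while holding all others at their nominal value. A minimal Python sketch, assuming integer variables, the range midpoint as nominal value, and ranges wide enough that the five values per variable are distinct (the function names are hypothetical):

```python
from typing import Dict, List, Tuple

Range = Tuple[int, int]  # (minimum, maximum) of one integer input variable


def boundary_values(lo: int, hi: int) -> List[int]:
    """min, just above min, nominal, just below max, max for one variable."""
    nominal = (lo + hi) // 2
    return [lo, lo + 1, nominal, hi - 1, hi]


def boundary_value_tests(ranges: Dict[str, Range]) -> List[Dict[str, int]]:
    """Single-fault assumption: vary one variable, keep the others nominal."""
    nominal = {name: (lo + hi) // 2 for name, (lo, hi) in ranges.items()}
    tests = [dict(nominal)]  # the all-nominal test case, counted once
    for name, (lo, hi) in ranges.items():
        for value in boundary_values(lo, hi):
            if value != nominal[name]:
                tests.append({**nominal, name: value})
    return tests


# Triangle problem with sides restricted to 1..200: 4 * 3 + 1 = 13 test cases.
triangle_tests = boundary_value_tests({"a": (1, 200), "b": (1, 200), "c": (1, 200)})
assert len(triangle_tests) == 13
```

Each variable contributes four non-nominal values, and the all-nominal case is shared, which gives 4n + 1.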

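Robustness Testing
Exceeding the limits slightly:
[Figure: test points in the (i1, i2) input plane, including points just outside the range; axis marks a, b on i1 and c, d on i2]
Most interesting with regard to the expected outputs, not the inputs!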

Robustness Testing Examples
Robustness testing forces attention on exception handling:
- Exceeding the angle of attack of a plane: will it stall?
- Exceeding the load capacity of an elevator: hopefully only a warning!?
With strongly typed languages robustness testing may be very awkward:
- C leads to run-time errors.
- Exception handling (C#, Java) mandates robustness testing.
Specifications: test cases such that $\neg\, pre(in)$ holds, i.e. the input violates the precondition.
Automated: fuzz testing, or fuzzing.
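For robustness testing, the list of values per variable is extended by one value just outside each limit, and the interesting part is what the program is expected to do there. A Python sketch using the elevator example; the function names, the 630 kg capacity, and the chosen reactions are all hypothetical.

```python
from typing import List


def robust_values(lo: int, hi: int) -> List[int]:
    """Boundary values plus one value just outside each limit."""
    nominal = (lo + hi) // 2
    return [lo - 1, lo, lo + 1, nominal, hi - 1, hi, hi + 1]


def check_load(kg: int, max_kg: int = 630) -> str:
    """Hypothetical elevator controller reacting to the measured load."""
    if kg < 0:
        # A negative load can only be a sensor fault: raise an exception.
        raise ValueError("negative load")
    if kg > max_kg:
        return "warning: overload"  # refuse to move, but do not crash
    return "ok"


# Expected outputs for the in-range values and the value just above maximum.
for kg in robust_values(0, 630)[1:]:
    expected = "warning: overload" if kg > 630 else "ok"
    assert check_load(kg) == expected
```

The value just below the minimum (-1) is left out of the loop because its expected output is an exception rather than a message.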

Worst-Case Testing
Rejecting the single-fault assumption:
[Figure: grid of test points in the (i1, i2) input plane; axis marks a, b on i1 and c, d on i2]
Number of test cases: $5^n$ for n input variables!
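Without the single-fault assumption, every combination of the five boundary values per variable becomes a test case, which is a Cartesian product. A short Python sketch under the same assumptions as before (integer variables, midpoint as nominal value; `worst_case_tests` is a hypothetical name):

```python
from itertools import product
from typing import List, Tuple


def worst_case_tests(ranges: List[Tuple[int, int]]) -> List[Tuple[int, ...]]:
    """All combinations of the five boundary values of every variable."""
    per_variable = [
        [lo, lo + 1, (lo + hi) // 2, hi - 1, hi] for lo, hi in ranges
    ]
    return list(product(*per_variable))


# Two variables give 5 ** 2 = 25 test cases, three variables 5 ** 3 = 125.
assert len(worst_case_tests([(1, 200)] * 2)) == 25
assert len(worst_case_tests([(1, 200)] * 3)) == 125
```

The exponential growth is the price for covering interactions between the variables.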

Special Value Testing
- The tester uses domain knowledge, experience with similar programs, and information about soft spots to devise test cases.
- The most widely practiced form of testing.
- Also called ad hoc testing.
- Very dependent on the abilities of the tester!

Example: Commission
- A rifle salesman sells rifle locks, stocks, and barrels.
- Locks cost $45, stocks $30, and barrels $25.
- He has to sell at least one complete rifle per month.
- He can sell at most 70 locks, 80 stocks, and 90 barrels per month.
- The gunsmith computes the salesman's commission: 10% on sales up to (and including) $1000, 15% on the next $800, and 20% on any sales in excess of $1800.
German terms: rifle = Gewehr, lock = Verschluss, stock = Schaft, barrel = Lauf.
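The commission rule can be written down directly. The following Python sketch is one possible reading of the description: it interprets "at least one complete rifle" as at least one lock, one stock, and one barrel, and it computes in cents to avoid floating-point rounding. The function name is hypothetical.

```python
def commission(locks: int, stocks: int, barrels: int) -> float:
    """Monthly commission in dollars (hypothetical reference implementation)."""
    if not (1 <= locks <= 70 and 1 <= stocks <= 80 and 1 <= barrels <= 90):
        raise ValueError("monthly sales outside the allowed range")
    sales = 45 * locks + 30 * stocks + 25 * barrels
    tier1 = min(sales, 1000)                  # 10% on the first $1000
    tier2 = min(max(sales - 1000, 0), 800)    # 15% on the next $800
    tier3 = max(sales - 1800, 0)              # 20% on everything above $1800
    # A percentage of whole dollars is a whole number of cents.
    cents = 10 * tier1 + 15 * tier2 + 20 * tier3
    return cents / 100


assert commission(1, 1, 1) == 10.0        # sales $100, the output minimum
assert commission(10, 10, 10) == 100.0    # sales $1000, first border point
assert commission(18, 18, 18) == 220.0    # sales $1800, second border point
assert commission(70, 80, 90) == 1420.0   # sales $7800, the output maximum
```

The two border points at $1000 and $1800 are where the commission rate changes, so they are natural candidates for output boundary values.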

Commission: Output Boundary Analysis

Case  Locks  Stocks  Barrels  Sales ($)  Comm ($)  Comment
1     1      1       1        100        10        output minimum
2     1      1       2        125        12.5      output minimum +
3     1      2       1        130        13        output minimum +
4     2      1       1        145        14.5      output minimum +
5     5      5       5        500        50        midpoint
6     10     10      9        975        97.5      border point -
7     10     9       10       970        97        border point -
8     9      10      10       955        95.5      border point -
9     10     10      10       1000       100       border point
10    10     10      11       1025       103.75    border point +
11    10     11      10       1030       104.5     border point +
12    11     10      10       1045       106.75    border point +
13    14     14      14       1400       160       midpoint

Commission: Output Boundary Analysis (cont.)

Case  Locks  Stocks  Barrels  Sales ($)  Comm ($)  Comment
14    18     18      17       1775       216.25    border point -
15    18     17      18       1770       215.5     border point -
16    17     18      18       1755       213.25    border point -
17    18     18      18       1800       220       border point
18    18     18      19       1825       225       border point +
19    18     19      18       1830       226       border point +
20    19     18      18       1845       229       border point +
21    48     48      48       4800       820       midpoint
22    70     80      89       7775       1415      output maximum -
23    70     79      90       7770       1414      output maximum -
24    69     80      90       7755       1411      output maximum -
25    70     80      90       7800       1420      output maximum

Random Testing
Idea: use a random number generator to pick test case values.
Motivation: avoiding a form of bias in testing.
Question: how many random test cases are sufficient? Answered by
1. coverage criteria
2. reliability models
Pro: fully automatic test case generation.
Con: more test cases are needed to reach structural coverage.
At least two decades of literature in academia (statistical testing methods).
Many popular tools use random testing, e.g. ScalaCheck or FsCheck.

Random Testing of Commission

Test Cases  10% Commission  15% Commission  20% Commission
91          1               6               84
27          1               1               25
72          1               1               70
176         1               6               169
48          1               1               46
152         1               6               145
125         1               4               120
avg.        1.01%           3.62%           95.37%
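The skew towards the 20% bracket is easy to reproduce. The Python sketch below assumes that locks, stocks, and barrels are drawn independently and uniformly from their allowed ranges; the slides do not say how the table was generated, and the function names are hypothetical.

```python
import random
from collections import Counter


def bracket(sales: int) -> str:
    """Highest commission rate that applies to the given sales volume."""
    if sales <= 1000:
        return "10%"
    if sales <= 1800:
        return "15%"
    return "20%"


def random_commission_tests(n: int, seed: int = 0) -> Counter:
    """Draw n random test cases and count how often each bracket is hit."""
    rng = random.Random(seed)  # fixed seed keeps the run repeatable
    hits = Counter()
    for _ in range(n):
        locks = rng.randint(1, 70)
        stocks = rng.randint(1, 80)
        barrels = rng.randint(1, 90)
        hits[bracket(45 * locks + 30 * stocks + 25 * barrels)] += 1
    return hits


hits = random_commission_tests(10_000)
for rate in ("10%", "15%", "20%"):
    print(rate, hits[rate])
```

Most random inputs land far above $1800, so the two lower brackets and their border points are exercised only rarely. This is the con noted for random testing: many test cases are needed before the less likely behaviours are covered.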

Random Testing of Triangle

Test Cases  NoTriangle  Scalene  Isosceles  Equilateral
1289        663         593      32         1
15436       7696        7372     367        1
17091       8556        8164     367        1
2603        1284        1252     66         1
6475        3197        3122     155        1
5978        2998        2850     129        1
9008        4447        4353     207        1
avg.        49.83%      47.87%   2.29%      0.01%

The program generates random test cases until at least one of each output occurs. Here an upper limit of 200 has been chosen for the side lengths.
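A generator loop of this kind can be sketched in a few lines of Python. The sketch assumes sides drawn uniformly from 1 to the upper limit and adds a hypothetical `max_tests` safety cap; the classifier is an illustrative implementation, not the program used for the table.

```python
import random
from collections import Counter


def classify_triangle(a: int, b: int, c: int) -> str:
    """Illustrative classifier for positive side lengths."""
    if a >= b + c or b >= a + c or c >= a + b:
        return "NoTriangle"
    if a == b == c:
        return "Equilateral"
    if a == b or b == c or a == c:
        return "Isosceles"
    return "Scalene"


def random_until_all_outputs(upper: int = 200, seed: int = 0,
                             max_tests: int = 2_000_000) -> Counter:
    """Generate random test cases until all four outputs have occurred."""
    rng = random.Random(seed)
    counts = Counter()
    generated = 0
    while len(counts) < 4 and generated < max_tests:
        sides = [rng.randint(1, upper) for _ in range(3)]
        counts[classify_triangle(*sides)] += 1
        generated += 1
    return counts


counts = random_until_all_outputs(upper=20)
print(sum(counts.values()), dict(counts))
```

The run ends as soon as the last missing output appears, so that output is counted exactly once. With a limit of 200 only 200 of the 8,000,000 possible inputs are equilateral, which explains why thousands of test cases are generated before the loop stops.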

Software Reliability Engineering
$\text{failure rate} = \dfrac{\text{number of failures}}{\text{logical unit}}$
Logical unit: time, number of printed pages, etc.
Reliability: the probability that the system will have no failure for a given time.
1. Decide on the target failure rate or reliability.
2. Operational profile: which use cases (functions) are executed often?
3. More tests for highly frequent use cases.
4. Additional tests for high-risk use cases.
5. Statistical failure models and the failures detected indicate when the target reliability is reached.
Book: John Musa, Software Reliability Engineering, 2nd edition, 1998.

Equivalence Class Testing
Main idea:
- Test hypothesis: classes of inputs show equal behaviour.
- Select one test case from each equivalence class.
Main motivation:
- an adequate set of test cases (having a sense of completeness)
- avoiding redundancy!
We distinguish:
- weak / strong equivalence class testing (single vs. multiple faults)
- normal / robust equivalence class testing

Equivalence classes form a set partition.
Definition (Partition): Given a set $A$ and a set of $n$ subsets $A_1, A_2, \dots, A_n$ of $A$, the subsets form a partition of $A$ iff
$A_1 \cup A_2 \cup \dots \cup A_n = A$
and
$\forall i, j \in \{1, \dots, n\} : i \neq j \Rightarrow A_i \cap A_j = \{\}$
The 1st property provides a form of completeness, the 2nd property ensures a form of nonredundancy.
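Both properties of the definition can be checked mechanically for finite sets. A minimal Python sketch (the name `is_partition` is an assumption):

```python
from itertools import combinations
from typing import List, Set


def is_partition(whole: Set[int], subsets: List[Set[int]]) -> bool:
    """Check the two properties of the definition for finite sets."""
    # 1st property (completeness): the union of all subsets is the whole set.
    covers = set().union(*subsets) == whole
    # 2nd property (nonredundancy): the subsets are pairwise disjoint.
    disjoint = all(s.isdisjoint(t) for s, t in combinations(subsets, 2))
    return covers and disjoint


assert is_partition({1, 2, 3, 4}, [{1, 2}, {3}, {4}])
assert not is_partition({1, 2, 3, 4}, [{1, 2}, {2, 3}, {4}])  # 2 is in two subsets
assert not is_partition({1, 2, 3, 4}, [{1}, {3}, {4}])        # 2 is not covered
```

In equivalence class testing, an overlap means redundant test cases and a gap means untested inputs.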

Choosing Equivalence Classes
Key to the strategy: the choice of the equivalence relation.
- Deduced from the problem specification. Triangle: given one test case for equilateral, (5, 5, 5), we do not expect to learn much more from test cases like (6, 6, 6) or (7, 7, 7).
- Often by guessing a likely implementation.
- Formal interface specifications (contracts): the choice of equivalence classes can be automated.

Weak Normal Equivalence Class Testing
[Figure: test cases in the (i1, i2) input plane; axis marks a, b on i1 and c, d on i2]

Strong Normal Equivalence Class Testing
[Figure: test cases in the (i1, i2) input plane; axis marks a, b on i1 and c, d on i2]
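The difference between the weak and the strong variant can be sketched in Python. Each variable is described by a list with one representative value per valid equivalence class; the function names and the example values are hypothetical.

```python
from itertools import product
from typing import List, Tuple


def weak_normal(representatives: List[List[int]]) -> List[Tuple[int, ...]]:
    """Single-fault view: every class of every variable is used at least once."""
    n_tests = max(len(classes) for classes in representatives)
    return [
        tuple(classes[i % len(classes)] for classes in representatives)
        for i in range(n_tests)
    ]


def strong_normal(representatives: List[List[int]]) -> List[Tuple[int, ...]]:
    """Multiple-fault view: every combination of classes is used."""
    return list(product(*representatives))


# i1 has three valid classes, i2 has two (one representative value each).
reps = [[1, 5, 9], [10, 20]]
assert weak_normal(reps) == [(1, 10), (5, 20), (9, 10)]
assert len(strong_normal(reps)) == 6
```

The weak variant needs as many test cases as the variable with the most classes; the strong variant needs the product of the class counts.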

Weak Robust Equivalence Class Testing
[Figure: test cases in the (i1, i2) input plane, including invalid inputs; axis marks a, b on i1 and c, d on i2]

Strong Robust Equivalence Class Testing
[Figure: test cases in the (i1, i2) input plane, including invalid inputs; axis marks a, b on i1 and c, d on i2]

Triangle Example
Note that four possible outputs can occur. We can use these to identify output (range) equivalence classes:
1. $\{(a, b, c) \mid \mathit{istriangle}(a, b, c) = \mathit{NoTriangle}\}$
2. $\{(a, b, c) \mid \mathit{istriangle}(a, b, c) = \mathit{Scalene}\}$
3. $\{(a, b, c) \mid \mathit{istriangle}(a, b, c) = \mathit{Isosceles}\}$
4. $\{(a, b, c) \mid \mathit{istriangle}(a, b, c) = \mathit{Equilateral}\}$

Triangle: Normal Equivalence Class Testing
The four weak normal equivalence class test cases are:

Test Case  a  b  c  Expected Output
WN1        4  1  2  NoTriangle
WN2        3  4  5  Scalene
WN3        2  2  3  Isosceles
WN4        5  5  5  Equilateral

No valid subintervals of the variables a, b, and c exist, so strong normal equivalence class testing does not add test cases.

Triangle: Robust Equivalence Class Testing
Let's assume the valid input data has been restricted to [0, 200], and error messages indicate if an input is out of its range. Then the additional weak robust equivalence class test cases are:

Test Case  a    b    c    Expected Output
WR1        -1   5    5    Value a out of range!
WR2        5    -1   5    Value b out of range!
WR3        5    5    -1   Value c out of range!
WR4        201  5    5    Value a out of range!
WR5        5    201  5    Value b out of range!
WR6        5    5    201  Value c out of range!

Triangle: Robust Equivalence Class Testing
We show one corner of the cube of strong robust equivalence class test cases:

Test Case  a   b   c   Expected Output
WS1        -1  5   5   Value a out of range!
WS2        5   -1  5   Value b out of range!
WS3        5   5   -1  Value c out of range!
WS4        -1  -1  5   Value a, b out of range!
WS5        5   -1  -1  Value b, c out of range!
WS6        -1  5   -1  Value a, c out of range!
WS7        -1  -1  -1  Value a, b, c out of range!
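The expected outputs of the robust test cases can be produced by a small range check, and the corner of the cube is the Cartesian product of one invalid and one valid value per variable. A Python sketch assuming the valid range [0, 200] and the message format from the tables; `range_message` is a hypothetical name.

```python
from itertools import product
from typing import Optional


def range_message(a: int, b: int, c: int, lo: int = 0, hi: int = 200) -> Optional[str]:
    """Return the out-of-range message, or None if all three inputs are valid."""
    named = (("a", a), ("b", b), ("c", c))
    bad = [name for name, value in named if not lo <= value <= hi]
    if not bad:
        return None
    return "Value " + ", ".join(bad) + " out of range!"


assert range_message(-1, 5, 5) == "Value a out of range!"
assert range_message(5, 201, 5) == "Value b out of range!"
assert range_message(-1, -1, 5) == "Value a, b out of range!"
assert range_message(-1, -1, -1) == "Value a, b, c out of range!"

# One corner of the cube: each variable is either too small (-1) or valid (5).
# Of the 2 ** 3 combinations, 7 contain at least one invalid value.
corner = [t for t in product([-1, 5], repeat=3) if range_message(*t) is not None]
assert len(corner) == 7
```

The weak robust test cases change one variable at a time, while the strong robust test cases also cover combinations of several invalid inputs.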

Triangle: Input Equivalence Class
Equivalence class testing is sensitive to the equivalence relation chosen.
Equivalence classes based on the input domain:
1. $\{(a, b, c) \mid a = b \land b = c\}$
2. $\{(a, b, c) \mid a = b \land a \neq c\}$
3. $\{(a, b, c) \mid a \neq b \land a = c\}$
4. $\{(a, b, c) \mid b = c \land a \neq b\}$
5. $\{(a, b, c) \mid a \neq b \land a \neq c \land b \neq c\}$
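That these five classes really form a partition can be checked by brute force on a small domain. A Python sketch; the dictionary `CLASSES` and the helper `classes_of` are hypothetical names, and the numbering follows the list above.

```python
from itertools import product

# One predicate per input equivalence class, numbered as in the list above.
CLASSES = {
    1: lambda a, b, c: a == b and b == c,
    2: lambda a, b, c: a == b and a != c,
    3: lambda a, b, c: a != b and a == c,
    4: lambda a, b, c: b == c and a != b,
    5: lambda a, b, c: a != b and a != c and b != c,
}


def classes_of(a: int, b: int, c: int) -> list:
    """All classes that contain the triple (a, b, c)."""
    return [k for k, member in CLASSES.items() if member(a, b, c)]


# Every triple over a small domain lies in exactly one class: the classes
# cover the domain (completeness) and do not overlap (nonredundancy).
assert all(len(classes_of(*t)) == 1 for t in product(range(1, 6), repeat=3))
```

One representative per class, for example (5, 5, 5), (2, 2, 3), (2, 3, 2), (3, 2, 2), and (3, 4, 5), then gives five weak normal test cases for this relation.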

Triangle: Input Equivalence Class (cont.)
Furthermore, we can apply the triangle property to see whether the three values constitute a triangle:
6. $\{(a, b, c) \mid a \geq b + c\}$
7. $\{(a, b, c) \mid b \geq a + c\}$
8. $\{(a, b, c) \mid c \geq a + b\}$

Triangle: Input Equivalence Class (cont.)
We could be even more thorough and distinguish between x = y and x > y in the properties above:
6. $\{(a, b, c) \mid a = b + c\}$
7. $\{(a, b, c) \mid a > b + c\}$
8. $\{(a, b, c) \mid b = a + c\}$
9. $\{(a, b, c) \mid b > a + c\}$
10. $\{(a, b, c) \mid c = a + b\}$
11. $\{(a, b, c) \mid c > a + b\}$
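The refined classes 6 to 11 translate directly into a classifier for invalid triangles. A Python sketch assuming positive side lengths, for which at most one of the six conditions can hold; `non_triangle_class` is a hypothetical name.

```python
from typing import Optional


def non_triangle_class(a: int, b: int, c: int) -> Optional[int]:
    """Class 6..11 of the refined list, or None for a valid triangle."""
    if a == b + c:
        return 6
    if a > b + c:
        return 7
    if b == a + c:
        return 8
    if b > a + c:
        return 9
    if c == a + b:
        return 10
    if c > a + b:
        return 11
    return None


# One test case per class, plus a valid triangle.
assert non_triangle_class(3, 1, 2) == 6
assert non_triangle_class(4, 1, 2) == 7
assert non_triangle_class(1, 3, 2) == 8
assert non_triangle_class(1, 4, 2) == 9
assert non_triangle_class(1, 2, 3) == 10
assert non_triangle_class(1, 2, 4) == 11
assert non_triangle_class(3, 4, 5) is None
```

These representatives correspond to the permutation questions 7 to 10 of the self test.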