
Test Suite Minimization
An Empirical Investigation

by
Jeffery von Ronne

A PROJECT

submitted to

Oregon State University
University Honors College

in partial fulfillment of the requirements for the degree of

Honors Bachelors of Science in Computer Science (Honors Scholar)

Presented May 28, 1999
Commencement June 1999

AN ABSTRACT OF THE THESIS OF

Jeffery von Ronne for the degree of Honors Bachelors of Science in Computer Science presented on May 28, 1999.

Title: Test Suite Minimization: An Empirical Investigation.

Abstract approved: Gregg Rothermel

Test suite minimization techniques attempt to reduce the cost of saving and reusing tests during software maintenance by eliminating redundant tests from test suites. A potential drawback of these techniques is that in minimizing a test suite, they might reduce the ability of that test suite to reveal faults in the software. Previous studies have shown that sometimes this reduction is small, but sometimes it is severe. This work investigates the minimization process, the factors that can affect its performance, and techniques for reducing this loss.

Test Suite Minimization
An Empirical Investigation

by
Jeffery von Ronne

A PROJECT

submitted to

Oregon State University
University Honors College

in partial fulfillment of the requirements for the degree of

Honors Bachelors of Science in Computer Science (Honors Scholar)

Presented May 28, 1999
Commencement June 1999

Honors Bachelors of Science in Computer Science project of Jeffery von Ronne presented on May 28, 1999

APPROVED:

Mentor, representing Computer Science

Committee Member, representing Mathematics

Committee Member and Chair, Department of Computer Science

Dean of University Honors College

I understand that my project will become part of the permanent collection of Oregon State University Honors College. My signature below authorizes release of my project to any reader upon request.

Jeffery von Ronne, Author

Acknowledgment

Many thanks are due to Dr. Rothermel, who provided much advice and guidance during the past year, as well as collaborating on the work in this thesis. My other committee members were Dr. Robby Robson and Dr. Michael Quinn. Dr. Roland Untch of Middle Tennessee State University provided the mutation data necessary for the experiments with PSSC minimization. Chengyun Chu prepared the Space program and assisted in the preparation of the mutation data. Dr. Mary Jean Harrold and Christie Hong of Ohio State University and Jeffery Ostrin also collaborated on parts of this work. The "Siemens" programs were provided by Siemens Corporate Research. The Space program came from the European Space Agency via Drs. Pasquini and Phyllis. The NSF funded my work through a Research Experience for Undergraduates grant to Dr. Rothermel. The equipment and other collaborators were funded in part by grants from Microsoft and the NSF. Thanks, everyone.

Contributing Co-Authors

The second and third chapters of this thesis are based on an article entitled "Experiments to Assess the Cost-Benefits of Test Suite Minimization" by Dr. Gregg Rothermel, Dr. Mary Jean Harrold (Ohio State University), Christie Hong (Ohio State University), and myself, which is currently in preparation for submission to Transactions on Software Engineering. That article is a revised and expanded version of an earlier paper, entitled "An Empirical Study of the Effects of Minimization on the Fault Detection Capabilities of Test Suites," which was authored by Dr. Gregg Rothermel, Dr. Mary Jean Harrold, Christie Hong, and Jeffery Ostrin, and presented at the November 1998 International Conference on Software Maintenance.

Table of Contents

1. Introduction and Motivation
   Motivation
   Overview of This Thesis
2. Background and Literature Review
   Test suite minimization
   Previous empirical work
      The Wong98 study
      The Wong97 study
3. Edge-Minimization Experiments
   Research Questions
   Measures and Tools
      Measures
         Measuring savings
         Measuring costs
      Tool infrastructure
   Experiments with smaller C programs
      Subject programs, faulty versions, test cases, and test suites
      Experiment design
      Threats to validity
   Minimization of edge-coverage-adequate test suites
      Test suite size reduction
      Fault detection effectiveness reduction
   Minimization of randomly generated test suites
      Test suite size reduction
      Fault detection effectiveness reduction
   Experiment with the Space Program
      Subject program, faulty versions, test cases, and test suites
      Experiment design
      Threats to validity
      Data and Analysis
         Test suite size reduction
         Fault detection effectiveness reduction
   Comparison to Previous Empirical Results
4. A New Minimization Technique
   Mutation Analysis and Minimization
      Mutation Analysis and Sensitivity
      Adapting Sensitivity for use as a Coverage Criterion
   An Algorithm to Facilitate Minimization based on PSSC

      A Conventional Test Suite Minimization Heuristic
      A Multi-Hit Minimization Algorithm
      Using the Multi-Hit Reduction Algorithm for PSSC Minimization
      Asymptotic Analysis of the Multi-Hit Reduction Algorithm
   An Experiment with PSSC Minimization
      Experimental Design
      Results
         Minimized Test Suite Size
         Minimized Test Suite Performance
5. Conclusion
   Results
   Practical Implications
   Limitations of This Investigation and Future Work
Bibliography

List of Figures

3-1. Percentage of Inputs that Expose Each Fault
3-2. Size Distribution among Unminimized Test Suites for the Siemens Programs
3-3. Size of Minimized vs. Size of Original Test Suites
3-4. Percent Reduction in Test Suite Size vs. Original Test Suite Size
3-5. Minimization: Percentage Effectiveness Reduction vs. Original Size
3-6. Effectiveness in Original and after Minimization vs. Original Size
Random Reduction: Percentage Effectiveness Reduction vs. Original Suite Size
Minimization and Random Reduction: Fault Detection vs. Original Size
Random Reduction: Percent Effectiveness Reduction
Percentage of Test Cases that Expose Each of Space's Faults
Size of Minimized Test Suites vs. Size of Original Test Suites
Percent Reduction in Test Suite vs. Original Test Suite Size
Percent Reduction in Effectiveness vs. Original Size
Original and Minimized: Faults Detected vs. Original Size
The Harrold, Gupta, and Soffa Test Suite Minimization Algorithm
A Multi-Hit Test Suite Reduction Algorithm
A C Program
Sizes of Test Suites after PSSC Minimization
Average Test Suite Size vs. Average Number of Faults Detected

List of Tables

3-1. The Siemens Programs
3-2. Correlation Between Size Reduction and Original Size
3-3. Minimization: Correlation between Effectiveness Reduction and Original Size
Random Reduction: Correlation between Effectiveness Loss and Original Size
Comparison of Fault Detection Reduction
Comparison of Fault Detection Reduction Variance
The Space Application
Correlation between Size Reduction and Initial Size
Correlation between Initial Size and Effectiveness Reduction
Average Reductions in Fault Detection Effectiveness
Fault detection abilities of tests used in the Wong98 study
The Initial Test Suite for the Example Program
The Coverage Requirements for the Example Program

Chapter 1. Introduction and Motivation

1.1. Motivation

Testing is an important but expensive task necessary for the construction of high-quality software. As such, there is great potential for any practical technique that enables the detection of more faults with limited software testing funds.

One testing strategy is to orient the testing regimen around concrete, achievable criteria. These include functional tests, designed to exercise the program's documented features, and also structural tests, designed to exercise each statement in the program. It is thought that a testing regimen designed around explicit criteria such as these is more effective than either random or ad hoc testing.

1. Random testing is selecting inputs at random, from some input distribution, and using those as test cases. Ad hoc testing is testing with inputs chosen by the tester with no explicit selection criteria.

In fact, experimentation, such as that done by researchers at Siemens, has shown that structural testing based on either control-flow or dataflow coverage criteria can show significantly better fault detection than random testing [Hutchins94].

Coverage criteria are also used as a stopping point, to decide when a program is sufficiently tested. In this case, additional tests are added until the test suite has achieved a specified coverage level according to a specific adequacy criterion. For example, to achieve statement coverage adequacy for a program, one would add test cases to the test suite until each statement in that program is executed by at least one of the test cases.

It is often the case that as a program evolves, additional tests are needed to maintain adequate coverage. Sometimes, as the test suite grows, it can become prohibitively expensive to execute on new versions of the program.

These test suites will often contain test cases that are no longer needed to satisfy the coverage criteria, because they are now obsolete or redundant [Chen96, Harrold93, Horgan92, Offutt95].

2. Obsolete test cases no longer exercise any coverage items. Redundant test cases are those that exercise only coverage items that are also exercised by other test cases in the test suite.

For example, Harrold et al. propose that a reduced test suite, made up of the smallest subset of the test cases that still exercises all of the coverage items, could be used in place of the original test suite [Harrold93]. The reduced subset of the original test suite will be referred to as a minimized test suite, and the process of obtaining the minimized test suite will be called minimization.

Unfortunately, minimized test suites are not without drawbacks. In addition to the cost of determining the reduced set, minimization may remove test cases that detect program faults that are not detected by other test cases satisfying the same criterion. In the worst case, a minimized test suite will no longer detect any of the faults that would be detected by the original test suite. This work begins to quantify this loss, over a limited range of coverage criteria, programs, program faults, and test cases, and compares it to the benefit of reduced test suite size.

1.2. Overview of This Thesis

Some studies have shown that minimization can result in significant savings in test suite size with little reduction in the ability of the minimized test suite to detect faults [Wong95, Wong97, Wong98]. This work, however, shows that this is not necessarily the case. For the combination of programs, faults, and types of test suites we utilized in two empirical studies, the loss in fault detection was substantial.

While a third study showed a less extreme loss in fault detection, that loss was still both statistically and practically significant.

These findings motivated the search for alternative coverage criteria that could be used in place of, or in conjunction with, structural criteria. This resulted in a new coverage criterion: Probabilistic Statement Sensitivity Coverage (PSSC). In the process, a new minimization heuristic was developed.

The next chapter will discuss coverage criteria, test suite minimization, and previous work. The third chapter will discuss the experiments we conducted to assess the performance of a conventional minimization technique. Chapter 4 introduces the PSSC criterion, explains how it could be used, and compares its performance to conventional techniques. Finally, the conclusion will recap our experimental results, explain the practical consequences of this work, and suggest areas for further study.

Chapter 2. Background and Literature Review

2.1. Test suite minimization

The test suite minimization problem may be stated as follows [Harrold93]:

Given: Test suite T; a set of test case requirements r_1, r_2, ..., r_n that must be satisfied to provide the desired test coverage of the program; and subsets of T, T_1, T_2, ..., T_n, one associated with each of the r_i's, such that any one of the test cases t_j belonging to T_i can be used to test r_i.

Problem: Find a representative set of test cases from T that satisfies all of the r_i's.

The r_i's in the foregoing statement can represent various test case requirements, such as source statements, decisions, definition-use associations, or specification items. A representative set of test cases that satisfies all of the r_i's must contain at least one test case from each T_i; such a set is called a hitting set of the group of sets T_1, T_2, ..., T_n. To achieve a maximum reduction, it is necessary to find the smallest representative set of test cases. This subset of the test suite, however, is a minimum-cardinality hitting set of the T_i's, and the problem of finding such a set is NP-complete [Garey79]. Thus, minimization techniques resort to heuristics. Several test suite minimization techniques have been proposed (e.g., [Chen96, Harrold93, Horgan92, Offutt95]); in this work we utilize the technique of Harrold, Gupta, and Soffa [Harrold93].
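To make the flavor of such heuristics concrete, the sketch below applies a simple greedy rule, repeatedly selecting the test case that satisfies the most still-unsatisfied requirements, to a small hypothetical coverage matrix. It is only an illustration of the general approach, not the Harrold, Gupta, and Soffa algorithm used later in this thesis, which additionally orders requirements by how many test cases can satisfy them.

/*
 * Minimal greedy test-suite reduction sketch (illustration only, with
 * hypothetical coverage data).
 */
#include <stdio.h>

#define NUM_TESTS 6
#define NUM_REQS  5

/* covers[t][r] = 1 if test case t satisfies requirement r. */
static const int covers[NUM_TESTS][NUM_REQS] = {
    {1, 0, 0, 1, 0},
    {0, 1, 0, 0, 0},
    {1, 1, 0, 0, 0},
    {0, 0, 1, 0, 1},
    {0, 0, 1, 0, 0},
    {0, 0, 0, 0, 1},
};

int main(void) {
    int satisfied[NUM_REQS] = {0};
    int selected[NUM_TESTS] = {0};
    int remaining = NUM_REQS;

    while (remaining > 0) {
        int best = -1, best_gain = 0;
        /* Choose the unselected test that satisfies the most unsatisfied requirements. */
        for (int t = 0; t < NUM_TESTS; t++) {
            if (selected[t]) continue;
            int gain = 0;
            for (int r = 0; r < NUM_REQS; r++)
                if (!satisfied[r] && covers[t][r]) gain++;
            if (gain > best_gain) { best_gain = gain; best = t; }
        }
        if (best < 0) break;  /* remaining requirements cannot be satisfied by any test */
        selected[best] = 1;
        printf("select test case %d\n", best);
        for (int r = 0; r < NUM_REQS; r++)
            if (covers[best][r] && !satisfied[r]) { satisfied[r] = 1; remaining--; }
    }
    return 0;
}

For the small matrix above, the sketch selects three of the six test cases; like any heuristic for this NP-complete problem, it does not guarantee a minimum-cardinality hitting set.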

2.2. Previous empirical work

Many empirical studies of software testing have been performed. Some of these studies, such as those reported in References [Frankl93, Hutchins94, Wong94], provide only indirect data about the effects of test suite minimization, through consideration of the effects of test suite size on the costs and benefits of testing. Other studies, such as the study reported in Reference [Graves98], provide only indirect data about the effects of test suite minimization, through a comparison of regression test selection techniques that do or do not practice minimization.

1. Whereas minimization considers a program and test suite, regression test selection considers a program, test suite, and modified program version, and selects test cases that are appropriate for that version without removing them from the test suite. The problems of regression test selection and test suite minimization are thus related but distinct. For further discussion of regression test selection, see Reference [Rothermel96].

Recent studies by Wong, Horgan, London, and Mathur [Wong95, Wong98] and by Wong, Horgan, Mathur, and Pasquini [Wong97], however, directly examine the costs and benefits of test suite minimization. We refer to these studies collectively as the Wong studies, and individually as the Wong98 and Wong97 studies. We summarize the results of these studies here; the references provide further details.

2. Reference [Wong98] (1998) extends work reported earlier in Reference [Wong95] (1995); thus, except where otherwise noted, we here focus on the most recent (1998) reference.

2.2.1. The Wong98 study

The Wong98 study involved ten common C UNIX utility programs, including nine programs ranging in size from 90 to 289 lines of code and one program of 842 lines of code.

For each of these programs, the researchers used a random domain-based test generator to generate an initial test case pool; the number of test cases in these pools ranged from 156 to 997. No attempt was made, in generating these pools, to achieve complete coverage of program components (blocks, decisions, or definition-use associations).

The researchers next drew multiple distinct test suites from their test case pools by randomly selecting test cases. The resulting test suites achieved basic block coverages ranging from 50% to 95%; overall, 1198 test suites were generated. Reference [Wong98] reports the sizes of the resulting test suites as averages over groups of test suites that achieved similar coverage: 270 test suites belonged to groups in which average test suite size ranged upward from 9.7 test cases, and 928 test suites belonged to groups in which average test suite size ranged from only 1 to 4.43 test cases.

The researchers enlisted graduate students to inject simple mutation-like faults into each of the subject programs. The researchers excluded faults that could not be detected by any test case. All told, 181 faulty versions of the programs were retained for use in the study. To assess the difficulty of detecting these faults, the researchers measured the percentages of test cases, in the associated test pools, that were able to detect the faults. Of the 181 faults, 78 (43%) were Quartile I faults, detectable by fewer than 25% of the associated test cases; 42 (23%) were Quartile II faults, detectable by between 25% and 50% of the associated test cases; 37 (20%) were Quartile III faults, detectable by between 50% and 75% of the associated test cases; and 24 (13%) were Quartile IV faults, detectable by at least 75% of the associated test cases.

The researchers minimized their test suites using ATACMIN [Horgan92], a minimization tool based on an implicit enumeration algorithm that found exact minimization solutions for all of the test suites utilized in the study.

Test suites were minimized with respect to block, decision, and all-uses dataflow coverage. The researchers measured the reduction in test suite size achieved through minimization, and the reduction in fault-detection effectiveness of the minimized test suites. The researchers also repeated this procedure on the entire test pools (effectively treating these test pools as if they were test suites). Finally, they used null hypothesis checking to determine whether the minimized test suites had better fault detection capabilities than test suites of the same size generated randomly from the unminimized test suites.

The researchers drew several overall conclusions from the study, including the following:

- As the coverage achieved by initial test suites increased, minimization produced greater savings with respect to those test suites, at rates ranging from 0% (for several of the 50-55% coverage suites) to 72.79% (for one of the 90-95% block coverage suites).
- As the coverage achieved by initial test suites increased, minimization produced greater losses in the fault-detection effectiveness of those suites. However, losses in fault detection effectiveness were small compared to savings in test suite size: in all but one case, reductions were less than 7.27 percent, and most reductions were less than 4.99 percent.
- Fault difficulty partially determined whether minimization caused losses in fault-detection effectiveness: Quartile I and II faults were more easily missed than Quartile III and IV faults following minimization.
- The null hypothesis testing showed that minimized test suites retain a size/effectiveness advantage over their random counterparts.

The authors draw the following overall conclusion: "...when the size of a test set is reduced while the coverage is kept constant, there is little or no reduction in its fault detection effectiveness... A test set which is minimized to preserve its coverage is likely to be as effective for detecting faults at a lower execution cost." [Wong98]

2.2.2. The Wong97 study

Whereas the Wong98 study examined test suite minimization on 10 common Unix utilities, the Wong97 study involved a single C application developed for the European Space Agency to aid in the management of large antenna arrays. At 6,100 executable lines, this application is several times the size of the largest program used for the Wong98 study.

Unlike the Wong98 study, in which an initial pool of test cases was generated randomly based solely on program specifications, the Wong97 study used a pool of 1,000 test cases generated based on an operational profile. In the Wong98 study, test suites were generated and categorized based on block coverage. For the Wong97 study, two different procedures were followed for generating test suites: the first to create test suites of fixed size, and the second to create test suites of fixed block coverage.

For the fixed-size test suites, test cases were chosen randomly from the test pool until the desired number of test cases had been selected. In all, 120 test suites were generated in this manner: 30 distinct test suites for each of the target sizes of 50, 100, 150, and 200.

For the fixed-coverage test suites, test cases were chosen randomly from the test pool until the test suite reached the desired coverage. Only test cases that added coverage were added to the fixed-coverage test suites. In all, 180 test suites were generated in this manner: 30 distinct test suites for each of the target coverages, ranging from 50% to 75% block coverage.

Whereas the faults in the Wong98 study were injected by graduate students, the faults used in the Wong97 study were obtained from an error log maintained during the creation of the application. The researchers selected 16 of these faults, of which all but one were detected by fewer than 7% of the test cases, making them similar in detection difficulty to the Quartile I faults used in the Wong98 study. The exceptional fault was detected by 320 (32%) of the test cases.

As in the Wong98 study, all of the test suites were minimized using ATACMIN. In both studies, the size of each test suite was reduced while the coverage was kept constant. In the Wong97 study, however, minimization with respect to block coverage was the only minimization attempted. Reductions in test suite size and in fault detection effectiveness were measured. Finally, null hypothesis testing was used to compare test suites minimized for coverage to test suites that were randomly minimized.
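The "add a test case only if it increases coverage" construction used for the fixed-coverage suites can be sketched as follows. The pool size, block count, coverage predicate, and target coverage here are all hypothetical; the sketch only illustrates the procedure described above.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define POOL_SIZE  1000   /* hypothetical test pool size */
#define NUM_BLOCKS 200    /* hypothetical number of basic blocks */

/* In a real tool this would come from coverage instrumentation; here it is a
   stand-in that assigns each test an arbitrary fixed pattern of blocks. */
static int block_covered_by(int t, int b) {
    return (t + b) % 7 == 0;
}

int main(void) {
    int covered[NUM_BLOCKS] = {0};
    int covered_count = 0;
    double target = 0.60;             /* e.g., a 60% block-coverage target */

    srand((unsigned) time(NULL));
    while ((double) covered_count / NUM_BLOCKS < target) {
        int t = rand() % POOL_SIZE;   /* random draw from the pool */
        int adds_coverage = 0;
        for (int b = 0; b < NUM_BLOCKS; b++)
            if (!covered[b] && block_covered_by(t, b)) {
                covered[b] = 1;
                covered_count++;
                adds_coverage = 1;
            }
        if (adds_coverage)
            printf("add test case %d (coverage now %.1f%%)\n",
                   t, 100.0 * covered_count / NUM_BLOCKS);
        /* test cases that add no new coverage are simply discarded */
    }
    return 0;
}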

The researchers drew the following overall conclusions from the study:

- There were substantial reductions in size achieved by minimizing the fixed-size test suites. For the fixed-coverage test suites, reductions in size also occurred but were smaller.
- As in the Wong98 study, the effectiveness reductions of the minimized test suites were smaller than the size reductions, so that minimized test suites retained a size/effectiveness advantage over the unminimized test suites. The average effectiveness reduction due to minimization was less than 7.3%, and most reductions were less than 3.6%.
- The null hypothesis testing again showed that minimized test suites retain a size/effectiveness advantage over their random counterparts.

Thus, the Wong97 study supports the findings of the Wong98 study, while broadening the scope of the investigation in terms of both the programs under scrutiny and the types of initial test suites utilized.

Chapter 3. Edge-Minimization Experiments

3.1. Research Questions

The Wong studies leave a number of open research questions, primarily concerning the extent to which the results observed in those studies generalize to other testing situations. Among the open questions are the following, which motivate the present work.

1. How does minimization fare in terms of costs and benefits when test suites have a wider range of sizes than the test suites utilized in the Wong studies?
2. How does minimization fare in terms of costs and benefits when test suites are coverage-adequate?
3. How does minimization fare in terms of costs and benefits when test suites contain additional coverage-redundant test cases?

The first and third questions are addressed by the Wong97 study in its use of fixed-size test suites; however, that study examines only one program. Neither of the Wong studies considers the second question.

Test suites used in practice often contain test cases designed not for code coverage but, rather, to exercise product features, specification items, or exceptional behaviors. Such test suites may contain larger numbers of test cases, and larger numbers of coverage-redundant test cases, than the test suites utilized in the Wong98 study or than the coverage-based test suites utilized in the Wong97 study.

Similarly, a typical tactic for utilizing coverage-based testing is to begin with a base of specification-based tests and add additional tests to achieve complete coverage. Such test suites may also contain greater coverage redundancy than the coverage-based test suites utilized in the Wong studies, but can be expected to distribute coverage more evenly than the fixed-size test suites constructed by random selection for the Wong97 study.

It is important to understand the cost-benefit tradeoffs involved in minimizing such test suites. Thus, to investigate these tradeoffs, we performed a family of experiments.

3.2. Measures and Tools

We now discuss the measures and tools utilized in our experiments; subsequent sections discuss the individual experiments. Let T be a test suite, and let T_min be the reduced test suite that results from the application of a minimization technique to T.

3.2.1. Measures

We need to measure both the costs and the savings of test suite minimization.

Measuring savings. Test suite minimization lets testers spend less time executing test cases, examining test results, and managing the data associated with testing. These savings in time depend on the extent to which minimization reduces test suite size. Thus, to measure the savings that can result from test suite minimization, we can follow the methodology used in the Wong studies and measure the reduction in test suite size achieved by minimization.

For each program, we measure savings in terms of the number and the percentage of tests eliminated by minimization. (The former measure provides a notion of the magnitude of the savings; the latter lets us compare and contrast savings across test suites of varying sizes.) The number of tests eliminated is given by (|T| - |T_min|), and the percentage of tests eliminated is given by ((|T| - |T_min|) / |T|) * 100.

This approach makes several assumptions: it assumes that all test cases have uniform costs, it does not differentiate between components of cost such as CPU time or human time, and it does not directly measure the compounding of savings that results from using the minimized test suites over a sequence of subsequent releases. This approach, however, has the advantage of simplicity; using it, we can draw several conclusions that are independent of these assumptions and can compare our results with those achieved in the Wong studies.

Measuring costs. There are two costs to consider with respect to test suite minimization. The first is the cost of executing a minimization tool to produce the minimized test suite. However, a minimization tool can be run following the release of a product, automatically and during off-peak hours, and in this case the cost of running the tool may be noncritical. Moreover, having minimized a test suite, the cost of minimization is amortized over the uses of that suite on subsequent product releases, and thus assumes progressively less significance in relation to other costs.

The second cost to consider is more significant. Test suite minimization may discard some test cases that, if executed, would reveal defects in the software.

Discarding these test cases reduces the fault detection effectiveness of the test suite. The cost of this reduced effectiveness may be compounded over uses of the test suite on subsequent product releases, and the effects of the missed faults may be critical. Thus, in this experiment, we focus on the costs associated with discarding fault-revealing test cases. We considered two methods for calculating reductions in fault detection effectiveness.

On a per-test-case basis: One way to measure the cost of minimization in terms of effects on fault detection, given faulty program P and test suite T, is to identify the test cases in T that reveal a fault in P but are not in T_min. This quantity can be normalized by the number of fault-revealing test cases in T. One problem with this approach is that multiple test cases may reveal a given fault. In this case, some test cases could be discarded without reducing fault-detection effectiveness; this measure penalizes such a decision.

On a per-test-suite basis: Another approach is to classify the results of test suite minimization, relative to a given fault in P, in one of three ways: (1) no test case in T is fault-revealing, and, thus, no test case in T_min is fault-revealing; (2) some test case in both T and T_min is fault-revealing; or (3) some test case in T is fault-revealing, but no test case in T_min is fault-revealing. Case 1 denotes situations in which T is inadequate. Case 2 indicates a use of minimization that does not reduce fault detection, and Case 3 captures situations in which minimization compromises fault detection.

The Wong experiments utilized the second approach; we do the same. For each program, we measure reduced effectiveness in terms of the number and the percentage of faults for which T_min contains no fault-revealing test cases but T does contain fault-revealing test cases. More precisely, if F denotes the number of faults revealed by T over the faulty versions of program P, and F_min denotes the number of faults revealed by T_min over those versions, the number of faults lost is given by (F - F_min), and the percentage reduction in fault-detection effectiveness due to minimization is given by ((F - F_min) / F) * 100.

Note that this method of measuring the cost of minimization calculates cost relative to a fixed set of faults. This approach also assumes that missed faults have equal costs, an assumption that typically does not hold in practice.

3.2.2. Tool infrastructure

To perform our experiments we required several tools. First, we required a test suite minimization tool; to obtain this, we implemented the algorithm of Harrold, Gupta, and Soffa [Harrold93] within the Aristotle program analysis system [Harrold97]. The Aristotle system also provided us with code instrumenters for use in determining edge coverage.
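The two measures defined in Section 3.2.1 are straightforward to compute once |T|, |T_min|, F, and F_min are known; the sketch below uses hypothetical numbers purely to illustrate the arithmetic.

#include <stdio.h>

/* Percentage of test cases eliminated: ((|T| - |Tmin|) / |T|) * 100. */
static double size_reduction(int t, int t_min) {
    return 100.0 * (t - t_min) / t;
}

/* Percentage reduction in fault-detection effectiveness: ((F - Fmin) / F) * 100. */
static double effectiveness_reduction(int f, int f_min) {
    return 100.0 * (f - f_min) / f;
}

int main(void) {
    /* Hypothetical example: a 40-test suite minimized to 8 tests,
       detecting 5 of the 7 faults detected by the original suite. */
    printf("size reduction: %.1f%%\n", size_reduction(40, 8));                   /* 80.0% */
    printf("effectiveness reduction: %.1f%%\n", effectiveness_reduction(7, 5));  /* 28.6% */
    return 0;
}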

3.3. Experiments with smaller C programs

Our first two experiments address our research questions on several small C programs, similar in size to the C utilities utilized in the Wong98 study. In this section we first describe details common to these two experiments; we then report the results of the experiments in turn.

3.3.1. Subject programs, faulty versions, test cases, and test suites

We used seven C programs as subjects (see Table 3-1). The programs range in size from 138 to 516 lines of C code and perform a variety of functions. Each program has several faulty versions, each containing a single fault. Each program also has a large test pool. The programs, versions, and test pools were assembled by researchers at Siemens Corporate Research for a study of the fault-detection capabilities of control-flow and data-flow coverage criteria [Hutchins94]. We refer to these programs collectively as the Siemens programs.

Table 3-1. The Siemens Programs

- totinfo: information measure
- schedule1: priority scheduler
- schedule2: priority scheduler
- tcas: altitude separation
- printtok1: lexical analyzer
- printtok2: lexical analyzer
- replace: pattern replacement

The researchers at Siemens sought to study the fault-detecting effectiveness of coverage criteria. Therefore, they created faulty versions of the seven base programs by manually seeding those programs with faults, usually by modifying a single line of code in the program.

In a few cases they modified between two and five lines of code. Their goal was to introduce faults that were as realistic as possible, based on their experience with real programs. Ten people performed the fault seeding, working mostly without knowledge of each other's work [Hutchins94].

For each of the seven programs, the researchers at Siemens created a large test pool containing possible test cases for the program. To populate these test pools, they first created an initial set of black-box test cases according to good testing practices, based on the tester's "understanding of the program's functionality and knowledge of special values and boundary points that are easily observable in the code" [Hutchins94], using the category partition method and the Siemens Test Specification Language tool [Balcer89, Ostrand88]. They then augmented this set with manually created white-box test cases to ensure that each executable statement, edge, and definition-use pair in the base program or its control flow graph was exercised by at least 30 test cases.

To obtain meaningful results with the seeded versions of the programs, the researchers retained only faults that were "neither too easy nor too hard to detect" [Hutchins94], which they defined as being detectable by at least 3 and at most 350 test cases in the test pool associated with each program.

1. When we execute these faulty versions, we find four faults that are not detected, and three that are detected by only one or two test cases. This difference may be attributable to some factor involving the system on which we are executing our tests; the difference does not impact the results of our study.

Figure 3-1. Percentage of Inputs that Expose Each Fault (boxplots, one per subject program, of the percentage of test-pool inputs that reveal each fault)

Figure 3-1 shows the sensitivity to detection of the faults in the Siemens versions relative to the test pools; the boxplots illustrate that the sensitivities of the faults vary within and between versions, but overall are all lower than 19.77%. Therefore, all of these faults were, in the terminology of the Wong studies, Quartile I faults, detectable by fewer than 25% of the test pool inputs.

2. A boxplot is a standard statistical device for representing data sets [Johnson92]. In these plots, each data set's distribution is represented by a box. The box's height spans the central 50% of the data, and its upper and lower ends mark the upper and lower quartiles. The middle of the three horizontal lines within the box represents the median. The vertical lines attached to the box indicate the tails of the distribution.

To investigate our research questions we required coverage-adequate test suites that exhibit redundancy in coverage, and we required these in a range of sizes. To create these test suites, we utilized the edge coverage criterion.

The edge coverage criterion is similar to the decision coverage criterion used in the Wong98 study, but is defined on control flow graphs.

3. A test suite T is edge-coverage adequate for program P iff, for each edge e in each control flow graph for some procedure in P, if e is dynamically exercisable, then there exists at least one test case t in T that exercises e. A test case t exercises an edge e = (n_1, n_2) in control flow graph G iff t causes execution of the statement associated with n_1, followed immediately by the statement associated with n_2.

We used the Siemens program test pools to obtain coverage-adequate test suites for each subject program. Our test suites consist of a varying number of test cases selected randomly from the associated test pool, together with any additional test cases required to achieve 100% coverage of coverable edges.

4. To randomly select test cases from the test pools, we used the C pseudo-random-number generator rand, seeded initially with the output of the C time system call, to obtain an integer that we treated as an index i into the test pool (modulo the size of that pool).

We did not add any particular test case to any particular test suite more than once. To ensure that these test suites would possess varying ranges of coverage redundancy, we randomly varied the number of randomly selected test cases over sizes ranging from 0 to 0.5 times the number of lines of code in the program. Altogether, we generated 1,000 test suites for each program.

Figure 3-2 provides views of the range of sizes of the test suites created by the process just described. The boxplots illustrate that, for each subject program, our test suite generation procedure yielded a collection of test suites whose sizes are relatively evenly distributed across the range of sizes utilized for that program.
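The random selection described in footnote 4 amounts to the following sketch; the pool size and target count here are hypothetical, and the step that restores 100% edge coverage afterward is omitted.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define POOL_SIZE 2000   /* hypothetical test pool size */
#define TARGET    75     /* hypothetical number of randomly selected test cases */

int main(void) {
    int in_suite[POOL_SIZE] = {0};
    int chosen = 0;

    srand((unsigned) time(NULL));        /* seed rand with the time system call */
    while (chosen < TARGET) {
        int i = rand() % POOL_SIZE;      /* treat the draw as an index into the pool */
        if (!in_suite[i]) {              /* no test case is added to a suite twice */
            in_suite[i] = 1;
            chosen++;
            printf("add test case %d\n", i);
        }
    }
    /* Additional test cases would then be added until every dynamically
       exercisable edge is covered, making the suite edge-coverage adequate. */
    return 0;
}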

Figure 3-2. Size Distribution among Unminimized Test Suites for the Siemens Programs (boxplots of test suite size for each subject program)

The all-uses-coverage-adequate suites are larger on average than the edge-coverage-adequate suites because, in general, more tests are required to achieve all-uses coverage than to achieve edge coverage.

Analysis of the fault-detection effectiveness of these test suites shows that, except for eight of the edge-coverage-based test suites for schedule2, every test suite revealed at least one fault in the set of faulty versions of the associated program. Thus, although each fault individually is difficult to detect relative to the entire test pool for the program, almost all of the test suites utilized in the study possessed at least some fault-detection effectiveness relative to the set of faulty programs utilized.

3.3.2. Experiment design

The experiments were run using a full-factorial design with 1,000 size-reduction and 1,000 effectiveness-reduction measures per cell.

5. The single exception involved schedule2, for which only 992 measures were available with respect to edge-coverage-based test suites, due to the exclusion of the eight test suites that did not expose any faults.

The independent variables manipulated were:

- The subject program (the seven programs, each with a variety of faulty versions).
- Test suite size (between 0 and 0.5 times lines-of-code test cases randomly selected from the test pool, together with additional test cases as necessary to achieve code coverage).

For each subject program, we applied minimization techniques to each of the sample test suites for that program. We then computed the size and effectiveness reductions for these test suites.

3.3.3. Threats to validity

In this section we discuss potential threats to the validity of our experiments with the Siemens programs.

Threats to internal validity are influences that can affect the dependent variables without the researcher's knowledge, and that thus affect any supposition of a causal relationship between the phenomena underlying the independent and dependent variables.

In these experiments, our greatest concerns for internal validity involve the fact that we do not control for the structure of the subject programs or the locality of program changes.

Threats to external validity are conditions that limit our ability to generalize our results. The primary threats to external validity for this study concern the representativeness of the artifacts utilized. The Siemens programs, though nontrivial, are small, and larger programs may be subject to different cost-benefit tradeoffs. Also, there is exactly one seeded fault in each faulty version of the Siemens programs; in practice, programs have much more complex error patterns. Furthermore, the faults in the Siemens programs were deliberately chosen (by the Siemens researchers) to be faults that were relatively difficult to detect. (However, the fact that the faults in these programs were not chosen by us does eliminate one potential source of bias.) Finally, the test suites we utilized represent only two types of test suite that could occur in practice if a mix of non-coverage-based and coverage-based testing were utilized. These threats can only be addressed by additional studies utilizing a wider range of artifacts.

Threats to construct validity arise when measurement instruments do not adequately capture the concepts they are supposed to measure. For example, in this experiment our measures of cost and effectiveness are very coarse: they treat all faults as equally severe and all test cases as equally expensive.

3.4. Minimization of edge-coverage-adequate test suites

Our first experiment addresses our research questions by applying minimization to the Siemens programs and their edge-coverage-adequate test suites. In reporting results, we first consider test suite size reduction, and then we consider fault detection effectiveness reduction.

3.4.1. Test suite size reduction

Figure 3-3 depicts the sizes of the minimized edge-coverage-adequate test suites for the seven Siemens programs, plotted against original test suite size. The data for each program P is depicted by a scatterplot containing a point for each of the test suites utilized for P. As the figure shows, the average size of the minimized test suites ranges from approximately 5 (for tcas) to 12 (for replace). For each program, the minimized test suites demonstrate little variance in size, with tcas exhibiting the least variance (between 4 and 5 test cases) and printtok1 showing the greatest (between 5 and 14 test cases). Considered across the range of original test suite sizes, minimized test suite size for each program is also relatively stable.

Figure 3-4 depicts the percentage reduction in test suite size produced by minimization, in terms of the formula discussed in Section 3.2.1, for each of the subject programs. The data for each program P is represented by a scatterplot containing a point for each of the test suites utilized for P; each point shows the percentage size reduction achieved for a test suite versus the size of that test suite prior to minimization. Visual inspection of the plots indicates a sharp increase in test suite size reduction over the first quartile of test suite sizes, tapering off as size increases beyond the first quartile. The data gives the impression of fitting a hyperbolic curve. To verify the correctness of this impression, we performed least-squares regression to fit the data depicted in these plots with a hyperbolic curve. Table 3-2 shows the best-fit curve for each of the subject programs, along with its square of correlation, r^2.

6. r^2 is a dimensionless index that ranges from zero to 1.0, inclusive, and is the fraction of variation in the values of y that is explained by the least-squares regression of y on x [Moore99].
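The hyperbolic form of these fits is what one would expect given the stability of minimized suite size noted above. Assuming, for illustration only, that the minimized suite size for a program is roughly a constant c, the percentage size reduction for an original suite of x test cases is

    reduction(x) = ((x - |T_min|) / x) * 100  ~  (1 - c/x) * 100,

which is the form y = 100 * (1 - c/x) fitted in Table 3-2, with c close to the program's average minimized suite size.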

Figure 3-3. Size of Minimized vs. Size of Original Test Suites (one scatterplot per subject program: totinfo, schedule1, schedule2, tcas, printtok1, printtok2, and replace; x-axis: original test suite size, y-axis: minimized test suite size, with the average marked)

Figure 3-4. Percent Reduction in Test Suite Size vs. Original Test Suite Size (one scatterplot per subject program: totinfo, schedule1, schedule2, tcas, printtok1, printtok2, and replace)

These fits indicate a strong hyperbolic correlation between percentage reduction in test suite size (the savings of minimization) and original test suite size.

Table 3-2. Correlation Between Size Reduction and Original Size

program    | regression equation         | r^2
totinfo    | y = 100 * (1 - (5.2762/x))  | 0.99
schedule1  | y = 100 * (1 - (.../x))     | 0.96
schedule2  | y = 100 * (1 - (.../x))     | 0.94
tcas       | y = 100 * (1 - (4.9719/x))  | 1.00
printtok1  | y = 100 * (1 - (7.4978/x))  | 0.90
printtok2  | y = 100 * (1 - (6.7776/x))  | 0.93
replace    | y = 100 * (1 - (12.18/x))   | 0.99

Our experimental results indicate that test suite minimization can produce savings in test suite size on coverage-adequate, coverage-redundant test suites. The results also indicate that as test suite size increases, the savings produced by test suite minimization increase; this is a consequence of the relatively stable size of the minimized suites.

3.4.2. Fault detection effectiveness reduction

Figure 3-5 depicts the cost (reduction in fault detection effectiveness) incurred by minimization, in terms of the formula discussed in Section 3.2.1, for each of the seven subject programs. The data for each program P is represented by a scatterplot containing a point for each of the test suites utilized for P; each point shows the percentage reduction in fault detection effectiveness observed for a test suite versus the size of that test suite prior to minimization.

Figure 3-6 illustrates the magnitude of the fault detection effectiveness reduction observed for the seven subject programs. Again, this figure contains a scatterplot for each program; however, we find it most revealing to depict faults detected versus original test suite size, simultaneously for both test suites minimized for edge coverage (black) and original test suites (grey). The solid lines in the plots denote average numbers of faults detected over the range of original test suite sizes; the gap between these lines indicates the magnitude of the fault detection effectiveness reduction for test suites minimized for edge coverage.

The plots show that the fault detection effectiveness of test suites can be severely compromised by minimization. For example, on replace, the largest of the programs, minimization reduces fault-detection effectiveness by over 50% on more than half of the test suites, with average fault loss ranging from 4 faults to 20 across the range of test suite sizes. Also, although there are cases in which minimization does not reduce fault-detection effectiveness (e.g., on printtok1), there are also cases in which minimization reduces the fault-detection effectiveness of test suites by 100% (e.g., on schedule2).

Visual inspection of the plots suggests that reduction in fault detection effectiveness increases slightly as test suite size increases. Test suites in the smallest size ranges do produce effectiveness losses of less than 50% more frequently than they produce losses in excess of 50%, a situation not true of the larger test suites. Even the smallest test suites, however, exhibit effectiveness reductions in most cases: for example, on replace, test suites containing fewer than 50 test cases exhibit an average effectiveness reduction of nearly 40% (a fault detection reduction ranging from 4 to 8 faults), and few such test suites lose no effectiveness.

Figure 3-5. Minimization: Percentage Effectiveness Reduction vs. Original Size (one scatterplot per subject program: totinfo, schedule1, schedule2, tcas, printtok1, printtok2, and replace)

Figure 3-6. Effectiveness in Original and after Minimization vs. Original Size (one plot per subject program showing faults detected vs. original test suite size, for original suites and for suites minimized for edge coverage, with the averages of each marked)

Table 3-3. Minimization: Correlation between Effectiveness Reduction and Original Size

program    | regression line 1 | r^2 | regression line 2 | r^2 | regression line 3      | r^2
totinfo    | y = .13x          | ... | y = 9.56 Ln(x)    | ... | y = -.2x^2 + .44x      | ...
schedule1  | y = .15x          | ... | y = 1.3 Ln(x)     | ... | y = -.2x^2 + ...x      | ...
schedule2  | y = .28x          | ... | y = 17.7 Ln(x)    | ... | y = -.4x^2 + .89x      | ...
tcas       | y = .68x          | ... | y = 22.18 Ln(x)   | ... | y = -.2x^2 + ...x      | ...
printtok1  | y = .16x          | ... | y = 14.68 Ln(x)   | ... | y = -.1x^2 + .44x      | ...
printtok2  | y = .7x           | ... | y = 6.82 Ln(x)    | ... | y = -.1x^2 + .19x      | ...
replace    | y = .11x          | ... | y = 13.7 Ln(x)    | ... | y = -.1x^2 + .41x      | ...

In contrast to the plots of size reduction, the plots of fault detection effectiveness reduction do not give a strong impression of closely fitting any curve or line: the data is much more scattered than the data for test suite size reduction. Our attempts to fit linear, logarithmic, and quadratic regression curves to the data validate this impression: the data in Table 3-3 reveals little linear, logarithmic, or quadratic correlation between reduction in fault detection effectiveness and original test suite size.

These results indicate that test suite minimization can compromise the fault-detection effectiveness of coverage-adequate, coverage-redundant test suites. However, the results only weakly suggest that as test suite size increases, the reduction in the fault-detection effectiveness of those test suites will increase.
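The thesis does not say which tool produced these fits. Purely as an illustration of how such a fit and the r^2 statistic from footnote 6 can be obtained, the sketch below computes a least-squares line and its r^2 for a handful of hypothetical (original size, percentage effectiveness reduction) points; a logarithmic or quadratic fit would substitute ln(x) or add an x^2 term as the regressor.

#include <stdio.h>

/* Fit y = a + b*x by least squares and return r^2, the fraction of the
   variation in y explained by the regression. */
static double r_squared(const double *x, const double *y, int n, double *a, double *b) {
    double sx = 0, sy = 0, sxx = 0, sxy = 0;
    for (int i = 0; i < n; i++) { sx += x[i]; sy += y[i]; sxx += x[i] * x[i]; sxy += x[i] * y[i]; }
    *b = (n * sxy - sx * sy) / (n * sxx - sx * sx);   /* slope */
    *a = (sy - *b * sx) / n;                          /* intercept */
    double mean_y = sy / n, ss_tot = 0, ss_res = 0;
    for (int i = 0; i < n; i++) {
        double fit = *a + *b * x[i];
        ss_tot += (y[i] - mean_y) * (y[i] - mean_y);
        ss_res += (y[i] - fit) * (y[i] - fit);
    }
    return 1.0 - ss_res / ss_tot;
}

int main(void) {
    /* Hypothetical (original suite size, % effectiveness reduction) pairs. */
    double x[] = {10, 20, 40, 80, 160};
    double y[] = {12, 30, 25, 45, 38};
    double a, b;
    double r2 = r_squared(x, y, 5, &a, &b);
    printf("y = %.2f + %.2fx, r^2 = %.2f\n", a, b, r2);
    return 0;
}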


More information

Using Excel for Graphical Analysis of Data

Using Excel for Graphical Analysis of Data Using Excel for Graphical Analysis of Data Introduction In several upcoming labs, a primary goal will be to determine the mathematical relationship between two variable physical parameters. Graphs are

More information

Incorporating Varying Test Costs and Fault Severities into Test Case Prioritization

Incorporating Varying Test Costs and Fault Severities into Test Case Prioritization Proceedings of the 23rd International Conference on Software Engineering, May, 1. Incorporating Varying Test Costs and Fault Severities into Test Case Prioritization Sebastian Elbaum Department of Computer

More information

Things you ll know (or know better to watch out for!) when you leave in December: 1. What you can and cannot infer from graphs.

Things you ll know (or know better to watch out for!) when you leave in December: 1. What you can and cannot infer from graphs. 1 2 Things you ll know (or know better to watch out for!) when you leave in December: 1. What you can and cannot infer from graphs. 2. How to construct (in your head!) and interpret confidence intervals.

More information

Averages and Variation

Averages and Variation Averages and Variation 3 Copyright Cengage Learning. All rights reserved. 3.1-1 Section 3.1 Measures of Central Tendency: Mode, Median, and Mean Copyright Cengage Learning. All rights reserved. 3.1-2 Focus

More information

CHAPTER 1. Introduction. Statistics: Statistics is the science of collecting, organizing, analyzing, presenting and interpreting data.

CHAPTER 1. Introduction. Statistics: Statistics is the science of collecting, organizing, analyzing, presenting and interpreting data. 1 CHAPTER 1 Introduction Statistics: Statistics is the science of collecting, organizing, analyzing, presenting and interpreting data. Variable: Any characteristic of a person or thing that can be expressed

More information

Dataflow-based Coverage Criteria

Dataflow-based Coverage Criteria Dataflow-based Coverage Criteria W. Eric Wong Department of Computer Science The University of Texas at Dallas ewong@utdallas.edu http://www.utdallas.edu/~ewong Dataflow-based Coverage Criteria ( 2012

More information

Requirements satisfied : Result Vector : Final : Matrix M. Test cases. Reqmts

Requirements satisfied : Result Vector : Final : Matrix M. Test cases. Reqmts Introduction Control flow/data flow widely studied No definitive answer to effectiveness Not widely accepted Quantitative measure of adequacy criteria Effectiveness Whether cost of testing methods is justified

More information

Linear Methods for Regression and Shrinkage Methods

Linear Methods for Regression and Shrinkage Methods Linear Methods for Regression and Shrinkage Methods Reference: The Elements of Statistical Learning, by T. Hastie, R. Tibshirani, J. Friedman, Springer 1 Linear Regression Models Least Squares Input vectors

More information

Directed Test Suite Augmentation: Techniques and Tradeoffs

Directed Test Suite Augmentation: Techniques and Tradeoffs Directed Test Suite Augmentation: Techniques and Tradeoffs Zhihong Xu, Yunho Kim, Moonzoo Kim, Gregg Rothermel, Myra B. Cohen Department of Computer Science and Engineering Computer Science Department

More information

Middle School Math Course 3

Middle School Math Course 3 Middle School Math Course 3 Correlation of the ALEKS course Middle School Math Course 3 to the Texas Essential Knowledge and Skills (TEKS) for Mathematics Grade 8 (2012) (1) Mathematical process standards.

More information

E-Companion: On Styles in Product Design: An Analysis of US. Design Patents

E-Companion: On Styles in Product Design: An Analysis of US. Design Patents E-Companion: On Styles in Product Design: An Analysis of US Design Patents 1 PART A: FORMALIZING THE DEFINITION OF STYLES A.1 Styles as categories of designs of similar form Our task involves categorizing

More information

Fault Class Prioritization in Boolean Expressions

Fault Class Prioritization in Boolean Expressions Fault Class Prioritization in Boolean Expressions Ziyuan Wang 1,2 Zhenyu Chen 1 Tsong-Yueh Chen 3 Baowen Xu 1,2 1 State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210093,

More information

5. Computational Geometry, Benchmarks and Algorithms for Rectangular and Irregular Packing. 6. Meta-heuristic Algorithms and Rectangular Packing

5. Computational Geometry, Benchmarks and Algorithms for Rectangular and Irregular Packing. 6. Meta-heuristic Algorithms and Rectangular Packing 1. Introduction 2. Cutting and Packing Problems 3. Optimisation Techniques 4. Automated Packing Techniques 5. Computational Geometry, Benchmarks and Algorithms for Rectangular and Irregular Packing 6.

More information

Challenges on Combining Open Web and Dataset Evaluation Results: The Case of the Contextual Suggestion Track

Challenges on Combining Open Web and Dataset Evaluation Results: The Case of the Contextual Suggestion Track Challenges on Combining Open Web and Dataset Evaluation Results: The Case of the Contextual Suggestion Track Alejandro Bellogín 1,2, Thaer Samar 1, Arjen P. de Vries 1, and Alan Said 1 1 Centrum Wiskunde

More information

6. Relational Algebra (Part II)

6. Relational Algebra (Part II) 6. Relational Algebra (Part II) 6.1. Introduction In the previous chapter, we introduced relational algebra as a fundamental model of relational database manipulation. In particular, we defined and discussed

More information

DOWNLOAD PDF BIG IDEAS MATH VERTICAL SHRINK OF A PARABOLA

DOWNLOAD PDF BIG IDEAS MATH VERTICAL SHRINK OF A PARABOLA Chapter 1 : BioMath: Transformation of Graphs Use the results in part (a) to identify the vertex of the parabola. c. Find a vertical line on your graph paper so that when you fold the paper, the left portion

More information

Consistent Measurement of Broadband Availability

Consistent Measurement of Broadband Availability Consistent Measurement of Broadband Availability FCC Data through 12/2015 By Advanced Analytical Consulting Group, Inc. December 2016 Abstract This paper provides several, consistent measures of broadband

More information

Neuro-fuzzy admission control in mobile communications systems

Neuro-fuzzy admission control in mobile communications systems University of Wollongong Thesis Collections University of Wollongong Thesis Collection University of Wollongong Year 2005 Neuro-fuzzy admission control in mobile communications systems Raad Raad University

More information

A MORPHOLOGY-BASED FILTER STRUCTURE FOR EDGE-ENHANCING SMOOTHING

A MORPHOLOGY-BASED FILTER STRUCTURE FOR EDGE-ENHANCING SMOOTHING Proceedings of the 1994 IEEE International Conference on Image Processing (ICIP-94), pp. 530-534. (Austin, Texas, 13-16 November 1994.) A MORPHOLOGY-BASED FILTER STRUCTURE FOR EDGE-ENHANCING SMOOTHING

More information

Chapter 9. Software Testing

Chapter 9. Software Testing Chapter 9. Software Testing Table of Contents Objectives... 1 Introduction to software testing... 1 The testers... 2 The developers... 2 An independent testing team... 2 The customer... 2 Principles of

More information

Online Supplement to Minimax Models for Diverse Routing

Online Supplement to Minimax Models for Diverse Routing Online Supplement to Minimax Models for Diverse Routing James P. Brumbaugh-Smith Douglas R. Shier Department of Mathematics and Computer Science, Manchester College, North Manchester, IN 46962-1276, USA

More information

Empirical Evaluation of the Tarantula Automatic Fault-Localization Technique

Empirical Evaluation of the Tarantula Automatic Fault-Localization Technique Empirical Evaluation of the Tarantula Automatic Fault-Localization Technique James A. Jones and Mary Jean Harrold College of Computing, Georgia Institute of Technology Atlanta, Georgia, U.S.A. jjones@cc.gatech.edu,

More information

Prepare a stem-and-leaf graph for the following data. In your final display, you should arrange the leaves for each stem in increasing order.

Prepare a stem-and-leaf graph for the following data. In your final display, you should arrange the leaves for each stem in increasing order. Chapter 2 2.1 Descriptive Statistics A stem-and-leaf graph, also called a stemplot, allows for a nice overview of quantitative data without losing information on individual observations. It can be a good

More information

SYS 6021 Linear Statistical Models

SYS 6021 Linear Statistical Models SYS 6021 Linear Statistical Models Project 2 Spam Filters Jinghe Zhang Summary The spambase data and time indexed counts of spams and hams are studied to develop accurate spam filters. Static models are

More information

Supplementary text S6 Comparison studies on simulated data

Supplementary text S6 Comparison studies on simulated data Supplementary text S Comparison studies on simulated data Peter Langfelder, Rui Luo, Michael C. Oldham, and Steve Horvath Corresponding author: shorvath@mednet.ucla.edu Overview In this document we illustrate

More information

Lecture Notes 3: Data summarization

Lecture Notes 3: Data summarization Lecture Notes 3: Data summarization Highlights: Average Median Quartiles 5-number summary (and relation to boxplots) Outliers Range & IQR Variance and standard deviation Determining shape using mean &

More information

Maintaining Mutual Consistency for Cached Web Objects

Maintaining Mutual Consistency for Cached Web Objects Maintaining Mutual Consistency for Cached Web Objects Bhuvan Urgaonkar, Anoop George Ninan, Mohammad Salimullah Raunak Prashant Shenoy and Krithi Ramamritham Department of Computer Science, University

More information

Consistent Measurement of Broadband Availability

Consistent Measurement of Broadband Availability Consistent Measurement of Broadband Availability By Advanced Analytical Consulting Group, Inc. September 2016 Abstract This paper provides several, consistent measures of broadband availability from 2009

More information

Review of Regression Test Case Selection Techniques

Review of Regression Test Case Selection Techniques Review of Regression Test Case Selection Manisha Rani CSE Department, DeenBandhuChhotu Ram University of Science and Technology, Murthal, Haryana, India Ajmer Singh CSE Department, DeenBandhuChhotu Ram

More information

THE EFFECT OF JOIN SELECTIVITIES ON OPTIMAL NESTING ORDER

THE EFFECT OF JOIN SELECTIVITIES ON OPTIMAL NESTING ORDER THE EFFECT OF JOIN SELECTIVITIES ON OPTIMAL NESTING ORDER Akhil Kumar and Michael Stonebraker EECS Department University of California Berkeley, Ca., 94720 Abstract A heuristic query optimizer must choose

More information

UNIT-4 Black Box & White Box Testing

UNIT-4 Black Box & White Box Testing Black Box & White Box Testing Black Box Testing (Functional testing) o Equivalence Partitioning o Boundary Value Analysis o Cause Effect Graphing White Box Testing (Structural testing) o Coverage Testing

More information

Midterm Wednesday Oct. 27, 7pm, room 142

Midterm Wednesday Oct. 27, 7pm, room 142 Regression Testing Midterm Wednesday Oct. 27, 7pm, room 142 In class, closed book eam Includes all the material covered up (but not including) symbolic eecution Need to understand the concepts, know the

More information

HARNESSING CERTAINTY TO SPEED TASK-ALLOCATION ALGORITHMS FOR MULTI-ROBOT SYSTEMS

HARNESSING CERTAINTY TO SPEED TASK-ALLOCATION ALGORITHMS FOR MULTI-ROBOT SYSTEMS HARNESSING CERTAINTY TO SPEED TASK-ALLOCATION ALGORITHMS FOR MULTI-ROBOT SYSTEMS An Undergraduate Research Scholars Thesis by DENISE IRVIN Submitted to the Undergraduate Research Scholars program at Texas

More information

Information and Software Technology

Information and Software Technology Information and Software Technology 55 (2013) 897 917 Contents lists available at SciVerse ScienceDirect Information and Software Technology journal homepage: www.elsevier.com/locate/infsof On the adoption

More information

Applied Regression Modeling: A Business Approach

Applied Regression Modeling: A Business Approach i Applied Regression Modeling: A Business Approach Computer software help: SAS SAS (originally Statistical Analysis Software ) is a commercial statistical software package based on a powerful programming

More information

Lecture Slides. Elementary Statistics Twelfth Edition. by Mario F. Triola. and the Triola Statistics Series. Section 2.1- #

Lecture Slides. Elementary Statistics Twelfth Edition. by Mario F. Triola. and the Triola Statistics Series. Section 2.1- # Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series by Mario F. Triola Chapter 2 Summarizing and Graphing Data 2-1 Review and Preview 2-2 Frequency Distributions 2-3 Histograms

More information

A Hybrid Recursive Multi-Way Number Partitioning Algorithm

A Hybrid Recursive Multi-Way Number Partitioning Algorithm Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence A Hybrid Recursive Multi-Way Number Partitioning Algorithm Richard E. Korf Computer Science Department University

More information

Using Excel for Graphical Analysis of Data

Using Excel for Graphical Analysis of Data EXERCISE Using Excel for Graphical Analysis of Data Introduction In several upcoming experiments, a primary goal will be to determine the mathematical relationship between two variable physical parameters.

More information

Chapter 2 Describing, Exploring, and Comparing Data

Chapter 2 Describing, Exploring, and Comparing Data Slide 1 Chapter 2 Describing, Exploring, and Comparing Data Slide 2 2-1 Overview 2-2 Frequency Distributions 2-3 Visualizing Data 2-4 Measures of Center 2-5 Measures of Variation 2-6 Measures of Relative

More information

UNIT-4 Black Box & White Box Testing

UNIT-4 Black Box & White Box Testing Black Box & White Box Testing Black Box Testing (Functional testing) o Equivalence Partitioning o Boundary Value Analysis o Cause Effect Graphing White Box Testing (Structural testing) o Coverage Testing

More information

Efficient Regression Test Model for Object Oriented Software

Efficient Regression Test Model for Object Oriented Software Efficient Regression Test Model for Object Oriented Software Swarna Lata Pati College of Engg. & Tech, Bhubaneswar Abstract : This paper presents an efficient regression testing model with an integration

More information

Learning internal representations

Learning internal representations CHAPTER 4 Learning internal representations Introduction In the previous chapter, you trained a single-layered perceptron on the problems AND and OR using the delta rule. This architecture was incapable

More information

Error Analysis, Statistics and Graphing

Error Analysis, Statistics and Graphing Error Analysis, Statistics and Graphing This semester, most of labs we require us to calculate a numerical answer based on the data we obtain. A hard question to answer in most cases is how good is your

More information

STA121: Applied Regression Analysis

STA121: Applied Regression Analysis STA121: Applied Regression Analysis Variable Selection - Chapters 8 in Dielman Artin Department of Statistical Science October 23, 2009 Outline Introduction 1 Introduction 2 3 4 Variable Selection Model

More information

University of Florida CISE department Gator Engineering. Data Preprocessing. Dr. Sanjay Ranka

University of Florida CISE department Gator Engineering. Data Preprocessing. Dr. Sanjay Ranka Data Preprocessing Dr. Sanjay Ranka Professor Computer and Information Science and Engineering University of Florida, Gainesville ranka@cise.ufl.edu Data Preprocessing What preprocessing step can or should

More information

Chapter 5. Track Geometry Data Analysis

Chapter 5. Track Geometry Data Analysis Chapter Track Geometry Data Analysis This chapter explains how and why the data collected for the track geometry was manipulated. The results of these studies in the time and frequency domain are addressed.

More information

Fingerprint Classification Using Orientation Field Flow Curves

Fingerprint Classification Using Orientation Field Flow Curves Fingerprint Classification Using Orientation Field Flow Curves Sarat C. Dass Michigan State University sdass@msu.edu Anil K. Jain Michigan State University ain@msu.edu Abstract Manual fingerprint classification

More information

CHAPTER 3 AN OVERVIEW OF DESIGN OF EXPERIMENTS AND RESPONSE SURFACE METHODOLOGY

CHAPTER 3 AN OVERVIEW OF DESIGN OF EXPERIMENTS AND RESPONSE SURFACE METHODOLOGY 23 CHAPTER 3 AN OVERVIEW OF DESIGN OF EXPERIMENTS AND RESPONSE SURFACE METHODOLOGY 3.1 DESIGN OF EXPERIMENTS Design of experiments is a systematic approach for investigation of a system or process. A series

More information

Exploiting a database to predict the in-flight stability of the F-16

Exploiting a database to predict the in-flight stability of the F-16 Exploiting a database to predict the in-flight stability of the F-16 David Amsallem and Julien Cortial December 12, 2008 1 Introduction Among the critical phenomena that have to be taken into account when

More information

Graphical Analysis of Data using Microsoft Excel [2016 Version]

Graphical Analysis of Data using Microsoft Excel [2016 Version] Graphical Analysis of Data using Microsoft Excel [2016 Version] Introduction In several upcoming labs, a primary goal will be to determine the mathematical relationship between two variable physical parameters.

More information

Fault Localization for Firewall Policies

Fault Localization for Firewall Policies Fault Localization for Firewall Policies JeeHyun Hwang 1 Tao Xie 1 Fei Chen Alex X. Liu 1 Department of Computer Science, North Carolina State University, Raleigh, NC 7695-86 Department of Computer Science

More information

You ve already read basics of simulation now I will be taking up method of simulation, that is Random Number Generation

You ve already read basics of simulation now I will be taking up method of simulation, that is Random Number Generation Unit 5 SIMULATION THEORY Lesson 39 Learning objective: To learn random number generation. Methods of simulation. Monte Carlo method of simulation You ve already read basics of simulation now I will be

More information

Multi-Way Number Partitioning

Multi-Way Number Partitioning Proceedings of the Twenty-First International Joint Conference on Artificial Intelligence (IJCAI-09) Multi-Way Number Partitioning Richard E. Korf Computer Science Department University of California,

More information

Data Preprocessing. Data Preprocessing

Data Preprocessing. Data Preprocessing Data Preprocessing Dr. Sanjay Ranka Professor Computer and Information Science and Engineering University of Florida, Gainesville ranka@cise.ufl.edu Data Preprocessing What preprocessing step can or should

More information

The Elliptic Curve Discrete Logarithm and Functional Graphs

The Elliptic Curve Discrete Logarithm and Functional Graphs Rose-Hulman Institute of Technology Rose-Hulman Scholar Mathematical Sciences Technical Reports (MSTR) Mathematics 7-9-0 The Elliptic Curve Discrete Logarithm and Functional Graphs Christopher J. Evans

More information

Exploring and Understanding Data Using R.

Exploring and Understanding Data Using R. Exploring and Understanding Data Using R. Loading the data into an R data frame: variable

More information

+ Statistical Methods in

+ Statistical Methods in 9/4/013 Statistical Methods in Practice STA/MTH 379 Dr. A. B. W. Manage Associate Professor of Mathematics & Statistics Department of Mathematics & Statistics Sam Houston State University Discovering Statistics

More information

Question 1: What is a code walk-through, and how is it performed?

Question 1: What is a code walk-through, and how is it performed? Question 1: What is a code walk-through, and how is it performed? Response: Code walk-throughs have traditionally been viewed as informal evaluations of code, but more attention is being given to this

More information

USING CONVEX PSEUDO-DATA TO INCREASE PREDICTION ACCURACY

USING CONVEX PSEUDO-DATA TO INCREASE PREDICTION ACCURACY 1 USING CONVEX PSEUDO-DATA TO INCREASE PREDICTION ACCURACY Leo Breiman Statistics Department University of California Berkeley, CA 94720 leo@stat.berkeley.edu ABSTRACT A prediction algorithm is consistent

More information

Ballista Design and Methodology

Ballista Design and Methodology Ballista Design and Methodology October 1997 Philip Koopman Institute for Complex Engineered Systems Carnegie Mellon University Hamershlag Hall D-202 Pittsburgh, PA 15213 koopman@cmu.edu (412) 268-5225

More information

A Controlled Experiment Assessing Test Case Prioritization Techniques via Mutation Faults

A Controlled Experiment Assessing Test Case Prioritization Techniques via Mutation Faults A Controlled Experiment Assessing Test Case Prioritization Techniques via Mutation Faults Hyunsook Do and Gregg Rothermel Department of Computer Science and Engineering University of Nebraska - Lincoln

More information

7 Fractions. Number Sense and Numeration Measurement Geometry and Spatial Sense Patterning and Algebra Data Management and Probability

7 Fractions. Number Sense and Numeration Measurement Geometry and Spatial Sense Patterning and Algebra Data Management and Probability 7 Fractions GRADE 7 FRACTIONS continue to develop proficiency by using fractions in mental strategies and in selecting and justifying use; develop proficiency in adding and subtracting simple fractions;

More information

Machine Learning: An Applied Econometric Approach Online Appendix

Machine Learning: An Applied Econometric Approach Online Appendix Machine Learning: An Applied Econometric Approach Online Appendix Sendhil Mullainathan mullain@fas.harvard.edu Jann Spiess jspiess@fas.harvard.edu April 2017 A How We Predict In this section, we detail

More information

BIO 360: Vertebrate Physiology Lab 9: Graphing in Excel. Lab 9: Graphing: how, why, when, and what does it mean? Due 3/26

BIO 360: Vertebrate Physiology Lab 9: Graphing in Excel. Lab 9: Graphing: how, why, when, and what does it mean? Due 3/26 Lab 9: Graphing: how, why, when, and what does it mean? Due 3/26 INTRODUCTION Graphs are one of the most important aspects of data analysis and presentation of your of data. They are visual representations

More information

Part I: Preliminaries 24

Part I: Preliminaries 24 Contents Preface......................................... 15 Acknowledgements................................... 22 Part I: Preliminaries 24 1. Basics of Software Testing 25 1.1. Humans, errors, and testing.............................

More information

Chapter 3. Set Theory. 3.1 What is a Set?

Chapter 3. Set Theory. 3.1 What is a Set? Chapter 3 Set Theory 3.1 What is a Set? A set is a well-defined collection of objects called elements or members of the set. Here, well-defined means accurately and unambiguously stated or described. Any

More information

Quartile, Deciles, Percentile) Prof. YoginderVerma. Prof. Pankaj Madan Dean- FMS Gurukul Kangri Vishwavidyalaya, Haridwar

Quartile, Deciles, Percentile) Prof. YoginderVerma. Prof. Pankaj Madan Dean- FMS Gurukul Kangri Vishwavidyalaya, Haridwar Paper:5, Quantitative Techniques for Management Decisions Module:6 Measures of Central Tendency: Averages of Positions (Median, Mode, Quartile, Deciles, Percentile) Principal Investigator Co-Principal

More information

WELCOME! Lecture 3 Thommy Perlinger

WELCOME! Lecture 3 Thommy Perlinger Quantitative Methods II WELCOME! Lecture 3 Thommy Perlinger Program Lecture 3 Cleaning and transforming data Graphical examination of the data Missing Values Graphical examination of the data It is important

More information

Class 17. Discussion. Mutation analysis and testing. Problem Set 7 discuss Readings

Class 17. Discussion. Mutation analysis and testing. Problem Set 7 discuss Readings Class 17 Questions/comments Graders for Problem Set 6 (4); Graders for Problem set 7 (2-3) (solutions for all); will be posted on T-square Regression testing, Instrumentation Final project presentations:

More information

A Virtual Laboratory for Study of Algorithms

A Virtual Laboratory for Study of Algorithms A Virtual Laboratory for Study of Algorithms Thomas E. O'Neil and Scott Kerlin Computer Science Department University of North Dakota Grand Forks, ND 58202-9015 oneil@cs.und.edu Abstract Empirical studies

More information

A Distributed Formation of Orthogonal Convex Polygons in Mesh-Connected Multicomputers

A Distributed Formation of Orthogonal Convex Polygons in Mesh-Connected Multicomputers A Distributed Formation of Orthogonal Convex Polygons in Mesh-Connected Multicomputers Jie Wu Department of Computer Science and Engineering Florida Atlantic University Boca Raton, FL 3343 Abstract The

More information