The Care and Feeding of Wild-Caught Mutants

Michael Vaughn and David Bingham Brown
December 19th, 2015

Abstract

We propose and implement a technique for providing more thorough mutation testing for software test suites by mining publicly available source code repositories. We derive wild-caught mutants from these repositories and investigate their effectiveness in evaluating software test suites compared to the traditional mutators used in mutation testing.

1 Introduction

One of the biggest threats to validity in debugging research is the size and quality of the defect corpus used for evaluation. Existing techniques for empirical evaluation of code analyzers, test methodologies, and other debugging tools and techniques are fairly simplistic. Rigorous scientific analysis requires well-documented and repeatable test conditions, which leads to conditions that are either contrived, as is the case for the hand-introduced bugs in the Siemens Suite, or limited to small batches of painstakingly isolated bugs from real projects, as René Just produced for his studies[1]. A reasonable question to ask, then, is whether a more robust and scalable means of reproducibly introducing software defects could be devised.

Mutation analysis has gained traction in recent years as a means of evaluating the coverage of test suites. In particular, research has shown[1] that mutation coverage is a useful predictor of a test suite's effectiveness, independent of code coverage. However, as René Just showed, 17% of software faults could not be accounted for with normal mutants. The simplicity of the model makes it useful in industrial settings, where a degree of scientific validity can be sacrificed in order to obtain a reasonably useful tool.
Given the observation that certain real-world defects are disjoint from the class of defects that can be generated by these mutations, however, basic mutation analysis can be too artificial to find general use as a tool for scientific testing of engineering tools.

We expand on the core idea of mutation testing, automated bug introduction as a means to evaluate the quality of a test suite, by developing a suite of tools for inserting novel, human-generated mutants into a codebase for test suite evaluation, and by analyzing the utility of those tools. To accomplish this, we scoured GitHub[2], a large, publicly accessible source code repository, for small, single-line commits. Working on the assumption that the primary reason one commits a single-line change to a repository is to fix a bug, we extract these commits and create a reverse patch from each one: if the commit exists to fix a bug, reversing the patch should insert the fixed bug into the codebase it is applied to. To ensure that the commits are actually applicable to other codebases, we extract these potential mutants in an identifier-agnostic way; that is, we maintain keywords and operators from the programming language used, but treat identifiers as wildcards to be matched to the host codebase upon insertion.

We provide a wild mutant extraction and insertion toolchain: a repository scraper; mutgen, a mutant extraction tool; and mutins, a mutant insertion tool. We provide experimental evidence using source control projects and test suites in the C language (though the toolchain does, at present, support most languages that lack semantic value for whitespace) to demonstrate the utility of our technique.

2 Development

All tools developed as part of the project can be found at mutants.

2.1 Repository Mining

To obtain mutants, we decided to mine code from public GitHub repositories. To start out with, we decided to investigate C repositories, as the language's comparatively simple syntax and semantics lead us to believe that a greater proportion of C's reverse patches should be applicable compared to more (syntactically) complex object-oriented languages. In order to obtain a reasonably sized set of patches, we wanted to target the repositories with the most commits. However, the GitHub search API does not expose the number of commits. As a heuristic, we decided to instead select the repositories with the most forks, as GitHub's API does allow users to order search results by fork count. We felt this heuristic was reasonable, as projects with a large number of forks are generally subject to broad interest, and thus a higher rate of development. Qualitatively, this assumption appears warranted, as the top 20 projects by this metric include the Linux kernel, memcached, and Redis.

To perform the scraping, we created two automated tools. The first was a small Python script which could automatically submit small batches of search queries to the GitHub server and build a list of all projects matching the query.
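As an illustration of the fork-ordered search, a minimal sketch follows. It only builds the paginated request URLs for GitHub's public repository search endpoint and does not send them. The helper name `fork_ranked_queries` and the page size are assumptions made for this example; this is not the authors' script, and it deliberately uses only the Python standard library.

```python
from urllib.parse import urlencode

# Public GitHub REST endpoint for repository search.
SEARCH_URL = "https://api.github.com/search/repositories"

def fork_ranked_queries(language, pages, per_page=100):
    """Build search URLs for repositories in `language`, most-forked first.

    Hypothetical helper: one URL per result page, so the caller can send
    them in small batches and stay within the API rate limits.
    """
    urls = []
    for page in range(1, pages + 1):
        params = {
            "q": f"language:{language}",
            "sort": "forks",      # order results by fork count
            "order": "desc",
            "per_page": per_page,
            "page": page,
        }
        urls.append(f"{SEARCH_URL}?{urlencode(params)}")
    return urls
```

Keeping URL construction separate from the network call makes it easy to throttle the requests or to retry a single page.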
To simplify the query script, we used OctoHub[3], a small Python library which lets users programmatically build API queries. We then built a script to send small numbers of paginated queries, to avoid running afoul of GitHub's API limits. By doing this, we were able to build a list of the top 625 C projects, in descending order of fork count. We used the top 50 of these, comprising some 850 million lines of commits, for our experiments.

Once this was done, we created a small program which checked out each repository on the list and then performed the repository scraping locally. Our local scraper is a Python script which iterates backwards through the commit history, outputting each diff (in unified diff format), along with the revision number and commit message, to a text file. The construction of this scraper was simplified by using GitPython[4], which provides a robust object-oriented implementation of the relevant git functionality, such as reading commit histories and constructing diffs.

2.2 Mutant Extraction: mutgen

We developed mutgen, the second element of our toolchain, to extract potential wild mutants from the unified diff files scraped from GitHub. mutgen identifies potential mutants by isolating single-line changes from the source control commits it reads as input. Both lines of each identified commit are tokenized with help from language specification files, provided at the command line, detailing language keywords and operators; the tokenizer ignores whitespace and uses rules that are simple yet (largely) universal among programming languages for processing keywords, identifiers, literals, and whitespace: operators are consumed greedily, while keywords and identifiers must be separated by operators or whitespace.

    Usage: mutgen [options]
    Options:
      --help               display this text
      -k KEYWORD_FILE      load language keywords from KEYWORD_FILE
      -o OPERATOR_FILE     load language operators from OPERATOR_FILE
      -i INPUT_FILES...    extract mutants from INPUT_FILES
      -x EXTRACT_FILE      store extracted mutants in EXTRACT_FILE

Figure 1: mutgen command line options

Once the two lines (the before and after lines) are tokenized, mutgen analyzes both to ensure that, once matched, it is possible to generate the before line from the after line. mutins is not yet robust enough to synthesize identifiers or literals, so mutgen requires that potential mutants not require the synthesis of new information; that is, the before state must be able to be generated solely from identifiers and literals matched in the after state.

    (a) Invalid potential mutant    (b) Valid potential mutant    (c) Mutant extracted from (b)
    - if (x && y)                   - if (x)                      + :if .( $1 .&& $-1 .)
    + if (x)                        + if (x && y)                 - :if .( $1 .)

Figure 2: Example potential mutants

Figure 2 shows example single-line commits as processed by mutgen. The commit in Figure 2a would be discarded because the before state cannot be generated from the after state (the identifier y is unique to the before state). Figure 2c shows the tokenized mutant fully extracted. In the mutant extract language of mutgen and mutins, : indicates a keyword, . an operator, and $ an identifier. Keywords and operators contain their literal value, while identifiers are given an index to be used in converting the after state to the before state, with -1 indicating that the identifier, like the y in Figure 2b, is unused.
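The extraction step described above can be sketched as follows. This is a simplified illustration rather than mutgen itself: the function names are hypothetical, literals are treated like identifiers, and the numbering scheme for identifiers (order of first appearance in the after line) is an assumption chosen so that the commit in Figure 2b yields the pattern in Figure 2c.

```python
def tokenize(line, keywords, operators):
    """Split a line into (kind, text) tokens: ':' keyword, '.' operator, '$' identifier."""
    ops = sorted(operators, key=len, reverse=True)  # longest first, i.e. greedy
    tokens, word, i = [], "", 0

    def flush():
        # Emit the pending word, if any, as a keyword or an identifier.
        nonlocal word
        if word:
            tokens.append((":" if word in keywords else "$", word))
            word = ""

    while i < len(line):
        if line[i].isspace():
            flush()
            i += 1
            continue
        op = next((o for o in ops if line.startswith(o, i)), None)
        if op is not None:
            flush()
            tokens.append((".", op))
            i += len(op)
        else:
            word += line[i]
            i += 1
    flush()
    return tokens


def extract(before, after, keywords, operators):
    """Return (after_pattern, before_pattern), or None when the before line
    needs an identifier that does not occur in the after line."""
    b = tokenize(before, keywords, operators)
    a = tokenize(after, keywords, operators)
    index = {}
    for kind, text in a:
        if kind == "$" and text not in index:
            index[text] = len(index) + 1  # assumed numbering: first appearance
    used = {text for kind, text in b if kind == "$"}
    if not used.issubset(index):
        return None  # would require synthesizing a new identifier

    def render(tokens):
        out = []
        for kind, text in tokens:
            if kind == "$":
                out.append(f"${index[text] if text in used else -1}")
            else:
                out.append(kind + text)
        return " ".join(out)

    return render(a), render(b)
```

With `keywords = {"if"}` and `operators = ["(", ")", "&&", "&"]`, the commit of Figure 2a is rejected, while the commit of Figure 2b produces the two pattern lines of Figure 2c.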
Our initial expectation that valid mutants would be rare proved to be untrue: our initial run of mutgen over our scraped corpus yielded almost three million mutants, more than it was reasonably possible to evaluate. After manually combing through a subset of the mutants produced by the initial run, we added heuristics to mutgen to cull two kinds of commits: those likely to be comments (e.g., those containing several identifiers in a row, indicating that the committed text is more likely to be natural language than a programming language, or containing several repeated operators, often seen as horizontal lines drawn in comments), and those complex enough to be unlikely to be found in other codebases (that is, those whose after states contain more than four identifiers to be matched). Applying these culling heuristics reduced the generated set from roughly three million potential mutants, most of them unusable, to roughly thirty thousand, of which a larger proportion were likely to be matched. Figure 3 shows some example mutants identified by mutgen (see footnote 1).

    (a)                 (b)                     (c)
    - }                 - while (i < n);        + TMPFILE
    + } else            + while (i < n)         - TMPFILE % 512

Figure 3: Example extracted mutants

Notably, while mutgen (and the rest of the toolchain) has only been tested on the C programming language, the entire system is designed to be usable on virtually any programming language that does not assign semantic value to whitespace (see footnote 2), via operator and keyword specifications (the companion tool mutins reads these language specifications from the extracted mutant file generated by mutgen). Applied to the entire scraped corpus (approximately 850 million lines of commits), mutgen extracted 29,704 possibly viable mutants in less than ten minutes on desktop-grade hardware.

2.3 Mutant Insertion: mutins

mutins, the final element of the toolchain, reads mutants and language definitions from the mutant extract file generated by mutgen and inserts them into a codebase specified as a list of files. mutins offers several command line options to facilitate automatic and repeated use of the tool, since, due to the nature of the mutants generated, many will cause the resulting code to fail to compile. In addition to strictly random use (by default, mutins chooses a mutant at random from the mutant extract file and inserts it at a randomly selected insertion point in the target codebase), mutins can be forced to use a specific random number seed (we use the C++ STL's implementation of the Mersenne Twister[5] for pseudo-random number generation).
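The value of a fixed seed is that a given random choice can be replayed exactly, for example to rebuild a mutant that failed to compile. A minimal sketch of seeded selection follows. It uses Python's `random.Random`, which is also a Mersenne Twister generator, in place of the C++ implementation; the function name and the order of the two draws are assumptions for illustration, not a description of mutins internals.

```python
import random

def choose_insertion(match_counts, seed=None):
    """Pick a mutant, then one of its insertion points.

    match_counts[m] is the number of places mutant m could be inserted
    (hypothetical input). Returns (mutant_index, match_index); match_index
    is None when the chosen mutant matches nowhere in the target.
    """
    rng = random.Random(seed)  # same seed, same sequence of draws
    mutant = rng.randrange(len(match_counts))
    count = match_counts[mutant]
    match = rng.randrange(count) if count else None
    return mutant, match
```

Calling `choose_insertion(counts, seed=7)` twice returns the same pair both times, whereas omitting the seed gives a fresh choice on each call.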
Mutant insertion works much like mutant extraction in reverse: mutins tokenizes the input files with the keyword and operator lists provided in the mutant extract file, and then finds possible insertion points by identifying token sequences matching the after state of the chosen mutant. Once an insertion point is selected, the range of text represented by the after state tokens is replaced by text synthesized from the before state tokens, matching identifier to identifier in the synthesized code.

    Usage: mutins [options]
    Options:
      --help                display this text
      -v                    verbose output
      -c                    only count potential matches, do not insert
      -i MATCH_INDEX        insert the match at the specific index
      -r RANDOM_SEED        use RANDOM_SEED to initialize random number generator
                            (this is ignored if the -i option is used)
      -m MUTANT_INDEX       use only the mutant found at the (zero-based) index MUTANT_INDEX
      -x EXTRACT_FILE       load mutants (and language data) from EXTRACT_FILE
      -t TARGET_FILES...    attempt to insert mutant into TARGET_FILE
      -b                    skip backup (by default, modified files are copied to
                            file.orig before mutant insertion)

Figure 4: mutins command line options

Footnote 1: We posit no explanation why one would need to perform modulo arithmetic on a variable named TMPFILE, but do note that stripping a modulo operation does provide an interesting mutation for use in mutation testing.

Footnote 2: A conversation with Ben Liblit yielded a simple and elegant method to implement support for semantic whitespace à la Python, but this has not yet been implemented.

3 Evaluation

For our experimental validation, we chose to replicate a subset of an experiment that J. H. Andrews, L. C. Briand, and Y. Labiche describe in [6]. Their experiment takes a program from the SIR repository[7] and generates a number of test suites by randomly choosing from the artifact's tests. They then measure the mutation adequacy of each suite by running it over the set of all possible program mutations. By collecting these measurements, they construct a model of the statistical distribution of the mutant detection rate over arbitrary test suites, which they compare to a similarly constructed approximation of the distribution for hand-seeded faults.

3.1 Target Program

While Andrews works with a wide variety of programs from SIR, including the Siemens suite, we decided to work with Space[7]. Space is an appealing subject for this form of experimentation, as it is a mature piece of software that has been subject to years of production use. Because of this, Space is also the only program Andrews tested which used real faults instead of hand-introduced ones. Thus, we can already get a sense of how wild mutants fare against the test suites' detection rates for real faults. Moreover, at 6,199 lines of code, it is larger than the classical Siemens Suite programs. This size is large enough that mutins can generate 117,744 possible mutants, which is a non-trivial set that is still small enough to explore exhaustively.
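Each of the possible mutants counted above is a pairing of one extracted mutant with one place in the target where its after state matches. To make that concrete, the matching and replacement step of Section 2.3 can be sketched as below. This is an illustrative approximation, not mutins: the function names, the tuple encoding of patterns, and the choice to join synthesized tokens with single spaces are all assumptions.

```python
import re

def tokenize(text, keywords, operators):
    """Return (kind, text, start, end) tokens; kind is ':', '.' or '$'."""
    ops = "|".join(re.escape(o) for o in sorted(operators, key=len, reverse=True))
    # Either one operator (longest first), or a run of non-space characters
    # that stops where an operator would begin.
    pattern = re.compile(rf"(?:{ops})|(?:(?!{ops})\S)+")
    tokens = []
    for m in pattern.finditer(text):
        word = m.group()
        kind = "." if word in operators else ":" if word in keywords else "$"
        tokens.append((kind, word, m.start(), m.end()))
    return tokens

def find_matches(tokens, after):
    """Find every token window matching the (non-empty) after-state pattern."""
    matches = []
    for i in range(len(tokens) - len(after) + 1):
        bound = {}  # identifier index -> identifier text in the target
        for (kind, value), (tkind, ttext, _, _) in zip(after, tokens[i:i + len(after)]):
            if kind != tkind:
                break
            if kind == "$":
                # Index -1 matches any identifier and binds nothing.
                if value != -1 and bound.setdefault(value, ttext) != ttext:
                    break
            elif value != ttext:
                break
        else:
            matches.append((tokens[i][2], tokens[i + len(after) - 1][3], bound))
    return matches

def insert_mutant(text, match, before):
    """Replace the matched range with text synthesized from the before state."""
    start, end, bound = match
    synthesized = " ".join(bound[v] if k == "$" else v for k, v in before)
    return text[:start] + synthesized + text[end:]
```

For example, a mutant that re-adds a stray semicolon after a while condition (as in Figure 3b) has the after state `while ( $1 < $2 )` and the before state `while ( $1 < $2 ) ;`. Applied to `while (i < n) { i++; }`, the sketch finds one match and rewrites the loop header so that the braces no longer form the loop body.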
3.2 Procedure

Prior to testing, we obtained 5,000 100-case test suites by repeatedly shuffling the list of available test cases and taking the first 100 elements, thus ensuring that no single suite contained duplicate cases. Next, we ran mutins on Space in order to identify each possible point at which a wild mutant can be inserted. We recorded each possible insertion into a list of entries that could be fed to our suite execution framework at a later time. We then divided the space of 117,744 mutants into batches of six, as prior experimentation indicated that such a batch could complete in one to two hours. We packaged each batch as part of an HTCondor job, which applied each mutation in sequence and ran all six against each of the 5,000 test suites. Test successes and failures were recorded and sent back from each job.

Once the test results were returned, we performed some simple analyses on the results. For each test suite S, we calculated the mutation adequacy score Am(S), the ratio of mutants detected by the suite to the total number of mutants[6]. We also recorded the number of mutants that successfully compiled, in order to get a general sense of the feasibility of mutant insertion.

3.3 Experimentation Framework and Tooling

As the experimental procedure calls for exhaustively building and testing each possible mutant for a subject program, a significant amount of computation time is required to obtain a reasonably informative data set. We decided to use UW's Center for High Throughput Computing, which provides a robust environment for large-scale grid computing via the HTCondor framework[8]. Since each job can be run independently of the others, parallelization is simply a matter of appropriately packaging the mutator along with the target program and associated suites.

This packaging posed a significant difficulty, as the computing pool's Linux environments are heterogeneous, and host machines are not guaranteed to have any specific version or build of many non-core programs, if any version is present at all. In particular, many nodes have neither GCC nor the headers needed to build code, which required us to create relocatable binaries of both GCC and glibc that we could pass in as part of the job. Obtaining such a version of GCC is non-trivial, and requires a significant amount of configuration and testing to ensure that the correct versions of libraries are built and that no subtle discrepancies are present in the toolchain.
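Because the jobs are independent, the only shared logic is how the insertion list is cut into job-sized batches and how the returned results are folded into the adequacy score Am(S) of Section 3.2. A minimal sketch of both steps follows; the function names and the representation of results as (suite, mutant) pairs are assumptions for illustration, not the authors' framework.

```python
def batches(items, size=6):
    """Split a list of insertion records into consecutive job-sized batches."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def adequacy_scores(detected, num_suites, num_mutants):
    """Compute Am(S) for every suite.

    detected is a set of (suite_index, mutant_index) pairs, one for each
    mutant that a suite caught (hypothetical result format). Am(S) is the
    number of mutants suite S detected divided by the total mutant count.
    """
    counts = [0] * num_suites
    for suite, _mutant in detected:
        counts[suite] += 1
    return [c / num_mutants for c in counts]
```

Under this scheme, 117,744 insertion records in batches of six give 19,624 jobs, since six divides the total evenly.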
A further complication is that HTCondor jobs can be located in an arbitrary directory of the host system. This poses a problem, as a naive build of GCC and glibc may experience various linking and loading errors in this situation. We eventually discovered crosstool-ng[9], which is a configurable tool intended to create cross-compiler toolchains. After some investigation and experimentation, we were able to correctly build a version of GCC with the desired properties.

As the experiment used the SIR repository[7], we built a Python framework for building and executing arbitrary experiments on it. By taking advantage of SIR's standardized framework, we constructed a system which can be used to move various objects to staging areas, build test suites, and invoke external tools, such as mutins, to manipulate the source code. We also created a similar set of Bash scripts, with similar functionality, in case Python was unavailable or more robust shell-style functionality was required.

By packaging this with the relocatable GCC, we were effectively able to construct a framework for reproducible software engineering experiments. By packaging the desired artifact from the SIR repository, the Python framework, and the compiler, along with a top-level script that invokes the necessary behaviors, the experimenter can present the experiment in such a way that any interested party can simply obtain the package and begin tweaking and experimenting.

4 Results and Discussion

Given our data, we were able to record a few basic statistical metrics.

    Total mutants                    108,134
    Successfully compiled mutants    20,638
    Compilation rate
    Average Am(S)

Interestingly, nearly 20 percent of the inserted mutants compiled, which was much greater than the rate of less than 5 percent we originally expected. This is still comparatively low; Andrews reported a compilation success rate of 92 percent[6]. However, at scale, mutation still appears feasible, as our set of compiled mutants is roughly twice as large as Andrews's 11,379 mutant set.

Curiously, the wild mutants were far more difficult to detect than either the real-world faults or the mutants generated by Andrews. The average Am(S) values for real faults and generated mutants were recorded as 0.75 and 0.75 respectively, nearly 1.5 times easier to catch. Every wild mutant we tested was recorded as being caught by at least one test suite, so each one introduced some fault. This seems to indicate that wild mutants tend to induce more subtle variations than those produced by other operators. Andrews asserts that an Am(S) lower than the rate for real faults is an argument against the realism of hand-seeded faults[6]. However, given the apparently subtle behavior of the reverse patches, we feel that more careful analysis of both our results and Andrews's results is warranted.

5 Threats to Validity

Currently, the most significant threat to validity is the relative immaturity of our data collection and analysis software. In particular, as stated before, HTCondor is a challenging system to develop for: our testing and compilation frameworks required a significant number of false starts and reworkings before we had a successful execution. Thus, our test executors may still have bugs that altered our experiment in some way, or some unforeseen aspect of the execution environment may be altering the behavior of the program in a hard-to-detect way.
6 Related Work

René Just's evaluation of mutation testing's external validity, and his subsequent analysis of the limitations of mutation testing[1], served as the main impetus for our work. In particular, careful consideration of his discussion of the classes of faults which cannot be expressed in terms of basic mutation operators was our primary inspiration in searching for a more realistic set of mutation operators.

Jia and Harman[10] wrote a robust survey of the history of mutation testing, including a section delineating the various techniques for testing mutation frameworks, along with an overview of the most significant works of mutation evaluation. They also discuss the various subject programs used in testing and give a detailed list of programs used for evaluation, sorted by the number of papers using each. Their work quickly pointed us towards the SIR repository as a viable set of tools for mutation testing. Moreover, after Dr. Liblit told us about James Andrews's mutation framework, we found Andrews's evaluation experiment in the bibliography of Jia and Harman's survey.

7 Future Work

The clearest short-term objective is to continue comparing our mutation tool against the results of Andrews's experiment. With our tools, it should be fairly straightforward to perform the experimental procedure on the other 7 SIR programs he analyzed. Additionally, he performed more sophisticated statistical analyses on his results, such as testing the statistical significance of variation between test suites as well as between mutation styles. Given our existing framework for exhaustively searching the mutation space of a given program, we feel we can efficiently replicate the rest of his experiment in a matter of weeks.

Another reasonable axis of evaluation is the derivability relationship between the basic operators provided by common mutation frameworks and our wild-caught mutants. Specifically, it is reasonable to ask what proportion of wild mutants can be derived from some bounded number of applications of a mutation operator. For mutant insertions where both the before and after code can be described as functions that map from one state of the variables the before code touches to a state of the variables touched by the after code, there may be a way to apply syntax-guided synthesis[11] in attempting to derive the mutation.

8 Conclusion

Mutation testing is predicated on software engineering researchers' and practitioners' testing needs. In particular, test suites and bug finders, like all other software, need to be extensively tested and verified, which requires a large corpus of test cases. Mutation testing provides one way of quickly and reproducibly introducing large numbers of faults into a known piece of software. However, as Just demonstrates in [1], simple mutation operators cannot span the full space of software faults. Wild mutants derived from reversed patch data bridge this gap. By reflecting real-world code changes, these mutants affect code in a manner that was deemed significant in some context.
Moreover, given the vast quantities of publicly available patch data, large sets of candidate operations can be collected and evaluated in a matter of days. Given the one in five probability that such a mutant will successfully compile, and their distinctive behavior with respect to test suites, we believe wild mutants are objects deserving further study.

9 Acknowledgements

This research was performed using the compute resources and assistance of the UW-Madison Center For High Throughput Computing (CHTC) in the Department of Computer Sciences. The CHTC is supported by UW-Madison, the Advanced Computing Initiative, the Wisconsin Alumni Research Foundation, the Wisconsin Institutes for Discovery, and the National Science Foundation, and is an active member of the Open Science Grid, which is supported by the National Science Foundation and the U.S. Department of Energy's Office of Science.

References

[1] R. Just, D. Jalali, L. Inozemtseva, M. D. Ernst, R. Holmes, and G. Fraser, "Are mutants a valid substitute for real faults in software testing?" FSE. [Online]. Available: http://homes.cs.washington.edu/~rjust/publ/mutants_real_faults_fse_2014.pdf.
[2] GitHub, Inc. (2015). GitHub - where software is built. [Online]. Available: https://www.github.com/.
[3] A. Swartz. (2013). OctoHub: Low level Python and CLI interface to GitHub. [Online].
[4] M. Trier. (2008). GitPython. [Online]. Available: GitPython/.
[5] M. Matsumoto and T. Nishimura, "Mersenne twister: A 623-dimensionally equidistributed uniform pseudorandom number generator," ACM Trans. on Modeling and Computer Simulation. [Online]. Available: http://www.math.sci.hiroshima-u.ac.jp/~m-mat/mt/articles/mt.pdf.
[6] J. H. Andrews, L. C. Briand, and Y. Labiche, "Is mutation an appropriate tool for testing experiments?" in Proceedings of the 27th International Conference on Software Engineering, ser. ICSE '05, St. Louis, MO, USA: ACM, 2005.
[7] H. Do, S. G. Elbaum, and G. Rothermel, "Supporting controlled experimentation with testing techniques: An infrastructure and its potential impact," Empirical Software Engineering: An International Journal, vol. 10, no. 4.
[8] M. Litzkow, M. Livny, and M. Mutka, "Condor - a hunter of idle workstations," in Proceedings of the 8th International Conference of Distributed Computing Systems.
[9] Y. E. Morin. (2013). Crosstool-ng. [Online].
[10] Y. Jia and M. Harman, "An analysis and survey of the development of mutation testing," IEEE Trans. Softw. Eng., vol. 37, no. 5, Sep. 2011.
[11] R. Alur et al., "Syntax-guided synthesis."


More information

Collaborative Framework for Testing Web Application Vulnerabilities Using STOWS

Collaborative Framework for Testing Web Application Vulnerabilities Using STOWS Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology ISSN 2320 088X IMPACT FACTOR: 5.258 IJCSMC,

More information

Striped Data Server for Scalable Parallel Data Analysis

Striped Data Server for Scalable Parallel Data Analysis Journal of Physics: Conference Series PAPER OPEN ACCESS Striped Data Server for Scalable Parallel Data Analysis To cite this article: Jin Chang et al 2018 J. Phys.: Conf. Ser. 1085 042035 View the article

More information

Rubicon: Scalable Bounded Verification of Web Applications

Rubicon: Scalable Bounded Verification of Web Applications Joseph P. Near Research Statement My research focuses on developing domain-specific static analyses to improve software security and reliability. In contrast to existing approaches, my techniques leverage

More information

An Anomaly in Unsynchronized Pointer Jumping in Distributed Memory Parallel Machine Model

An Anomaly in Unsynchronized Pointer Jumping in Distributed Memory Parallel Machine Model An Anomaly in Unsynchronized Pointer Jumping in Distributed Memory Parallel Machine Model Sun B. Chung Department of Quantitative Methods and Computer Science University of St. Thomas sbchung@stthomas.edu

More information

Executing Evaluations over Semantic Technologies using the SEALS Platform

Executing Evaluations over Semantic Technologies using the SEALS Platform Executing Evaluations over Semantic Technologies using the SEALS Platform Miguel Esteban-Gutiérrez, Raúl García-Castro, Asunción Gómez-Pérez Ontology Engineering Group, Departamento de Inteligencia Artificial.

More information

Automated Documentation Inference to Explain Failed Tests

Automated Documentation Inference to Explain Failed Tests Automated Documentation Inference to Explain Failed Tests Sai Zhang University of Washington Joint work with: Cheng Zhang, Michael D. Ernst A failed test reveals a potential bug Before bug-fixing, programmers

More information

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY A PATH FOR HORIZING YOUR INNOVATIVE WORK REVIEW PAPER ON IMPLEMENTATION OF DOCUMENT ANNOTATION USING CONTENT AND QUERYING

More information

Additional Guidelines and Suggestions for Project Milestone 1 CS161 Computer Security, Spring 2008

Additional Guidelines and Suggestions for Project Milestone 1 CS161 Computer Security, Spring 2008 Additional Guidelines and Suggestions for Project Milestone 1 CS161 Computer Security, Spring 2008 Some students may be a little vague on what to cover in the Milestone 1 submission for the course project,

More information

From Whence It Came: Detecting Source Code Clones by Analyzing Assembler

From Whence It Came: Detecting Source Code Clones by Analyzing Assembler From Whence It Came: Detecting Source Code Clones by Analyzing Assembler Ian J. Davis and Michael W. Godfrey David R. Cheriton School of Computer Science University of Waterloo Waterloo, Ontario, Canada

More information

Chapter 2 Basic Structure of High-Dimensional Spaces

Chapter 2 Basic Structure of High-Dimensional Spaces Chapter 2 Basic Structure of High-Dimensional Spaces Data is naturally represented geometrically by associating each record with a point in the space spanned by the attributes. This idea, although simple,

More information

Network Programmability with Cisco Application Centric Infrastructure

Network Programmability with Cisco Application Centric Infrastructure White Paper Network Programmability with Cisco Application Centric Infrastructure What You Will Learn This document examines the programmability support on Cisco Application Centric Infrastructure (ACI).

More information

Empirical Studies of Test Case Prioritization in a JUnit Testing Environment

Empirical Studies of Test Case Prioritization in a JUnit Testing Environment University of Nebraska - Lincoln DigitalCommons@University of Nebraska - Lincoln CSE Conference and Workshop Papers Computer Science and Engineering, Department of 2004 Empirical Studies of Test Case Prioritization

More information

TreeSearch User Guide

TreeSearch User Guide TreeSearch User Guide Version 0.9 Derrick Stolee University of Nebraska-Lincoln s-dstolee1@math.unl.edu March 30, 2011 Abstract The TreeSearch library abstracts the structure of a search tree in order

More information

International Journal for Management Science And Technology (IJMST)

International Journal for Management Science And Technology (IJMST) Volume 4; Issue 03 Manuscript- 1 ISSN: 2320-8848 (Online) ISSN: 2321-0362 (Print) International Journal for Management Science And Technology (IJMST) GENERATION OF SOURCE CODE SUMMARY BY AUTOMATIC IDENTIFICATION

More information

Sample Exam. Advanced Test Automation - Engineer

Sample Exam. Advanced Test Automation - Engineer Sample Exam Advanced Test Automation - Engineer Questions ASTQB Created - 2018 American Software Testing Qualifications Board Copyright Notice This document may be copied in its entirety, or extracts made,

More information

Chapter 9. Software Testing

Chapter 9. Software Testing Chapter 9. Software Testing Table of Contents Objectives... 1 Introduction to software testing... 1 The testers... 2 The developers... 2 An independent testing team... 2 The customer... 2 Principles of

More information

Part I: Preliminaries 24

Part I: Preliminaries 24 Contents Preface......................................... 15 Acknowledgements................................... 22 Part I: Preliminaries 24 1. Basics of Software Testing 25 1.1. Humans, errors, and testing.............................

More information

An Exploratory Study on Interface Similarities in Code Clones

An Exploratory Study on Interface Similarities in Code Clones 1 st WETSoDA, December 4, 2017 - Nanjing, China An Exploratory Study on Interface Similarities in Code Clones Md Rakib Hossain Misu, Abdus Satter, Kazi Sakib Institute of Information Technology University

More information

Quantifying and Assessing the Merge of Cloned Web-Based System: An Exploratory Study

Quantifying and Assessing the Merge of Cloned Web-Based System: An Exploratory Study Quantifying and Assessing the Merge of Cloned Web-Based System: An Exploratory Study Jadson Santos Department of Informatics and Applied Mathematics Federal University of Rio Grande do Norte, UFRN Natal,

More information

MTAT : Software Testing

MTAT : Software Testing MTAT.03.159: Software Testing Lecture 03: White-Box Testing (Textbook Ch. 5) Spring 2013 Dietmar Pfahl email: dietmar.pfahl@ut.ee Lecture Chapter 5 White-box testing techniques (Lab 3) Structure of Lecture

More information

SFWR ENG 3S03: Software Testing

SFWR ENG 3S03: Software Testing (Slide 1 of 52) Dr. Ridha Khedri Department of Computing and Software, McMaster University Canada L8S 4L7, Hamilton, Ontario Acknowledgments: Material based on [?] Techniques (Slide 2 of 52) 1 2 3 4 Empirical

More information

Pliny and Fixr Meeting. September 15, 2014

Pliny and Fixr Meeting. September 15, 2014 Pliny and Fixr Meeting September 15, 2014 Fixr: Mining and Understanding Bug Fixes for App-Framework Protocol Defects (TA2) University of Colorado Boulder September 15, 2014 Fixr: Mining and Understanding

More information

CS224W Project Write-up Static Crawling on Social Graph Chantat Eksombatchai Norases Vesdapunt Phumchanit Watanaprakornkul

CS224W Project Write-up Static Crawling on Social Graph Chantat Eksombatchai Norases Vesdapunt Phumchanit Watanaprakornkul 1 CS224W Project Write-up Static Crawling on Social Graph Chantat Eksombatchai Norases Vesdapunt Phumchanit Watanaprakornkul Introduction Our problem is crawling a static social graph (snapshot). Given

More information

Program Partitioning - A Framework for Combining Static and Dynamic Analysis

Program Partitioning - A Framework for Combining Static and Dynamic Analysis Program Partitioning - A Framework for Combining Static and Dynamic Analysis Pankaj Jalote, Vipindeep V, Taranbir Singh, Prateek Jain Department of Computer Science and Engineering Indian Institute of

More information

Fig 1. Overview of IE-based text mining framework

Fig 1. Overview of IE-based text mining framework DiscoTEX: A framework of Combining IE and KDD for Text Mining Ritesh Kumar Research Scholar, Singhania University, Pacheri Beri, Rajsthan riteshchandel@gmail.com Abstract: Text mining based on the integration

More information

Hierarchical Addressing and Routing Mechanisms for Distributed Applications over Heterogeneous Networks

Hierarchical Addressing and Routing Mechanisms for Distributed Applications over Heterogeneous Networks Hierarchical Addressing and Routing Mechanisms for Distributed Applications over Heterogeneous Networks Damien Magoni Université Louis Pasteur LSIIT magoni@dpt-info.u-strasbg.fr Abstract. Although distributed

More information

Management Tools. Management Tools. About the Management GUI. About the CLI. This chapter contains the following sections:

Management Tools. Management Tools. About the Management GUI. About the CLI. This chapter contains the following sections: This chapter contains the following sections:, page 1 About the Management GUI, page 1 About the CLI, page 1 User Login Menu Options, page 2 Customizing the GUI and CLI Banners, page 3 REST API, page 3

More information

Similarities in Source Codes

Similarities in Source Codes Similarities in Source Codes Marek ROŠTÁR* Slovak University of Technology in Bratislava Faculty of Informatics and Information Technologies Ilkovičova 2, 842 16 Bratislava, Slovakia rostarmarek@gmail.com

More information

Improving Origin Analysis with Weighting Functions

Improving Origin Analysis with Weighting Functions Improving Origin Analysis with Weighting Functions Lin Yang, Anwar Haque and Xin Zhan Supervisor: Michael Godfrey University of Waterloo Introduction Software systems must undergo modifications to improve

More information

Some Applications of Graph Bandwidth to Constraint Satisfaction Problems

Some Applications of Graph Bandwidth to Constraint Satisfaction Problems Some Applications of Graph Bandwidth to Constraint Satisfaction Problems Ramin Zabih Computer Science Department Stanford University Stanford, California 94305 Abstract Bandwidth is a fundamental concept

More information

Laboratorio di Programmazione. Prof. Marco Bertini

Laboratorio di Programmazione. Prof. Marco Bertini Laboratorio di Programmazione Prof. Marco Bertini marco.bertini@unifi.it http://www.micc.unifi.it/bertini/ Code versioning: techniques and tools Software versions All software has multiple versions: Each

More information

Exploring Performance Tradeoffs in a Sudoku SAT Solver CS242 Project Report

Exploring Performance Tradeoffs in a Sudoku SAT Solver CS242 Project Report Exploring Performance Tradeoffs in a Sudoku SAT Solver CS242 Project Report Hana Lee (leehana@stanford.edu) December 15, 2017 1 Summary I implemented a SAT solver capable of solving Sudoku puzzles using

More information

Git with It and Version Control!

Git with It and Version Control! Paper CT10 Git with It and Version Control! Carrie Dundas-Lucca, Zencos Consulting, LLC., Cary, NC, United States Ivan Gomez, Zencos Consulting, LLC., Cary, NC, United States ABSTRACT It is a long-standing

More information

Parallel Algorithms for the Third Extension of the Sieve of Eratosthenes. Todd A. Whittaker Ohio State University

Parallel Algorithms for the Third Extension of the Sieve of Eratosthenes. Todd A. Whittaker Ohio State University Parallel Algorithms for the Third Extension of the Sieve of Eratosthenes Todd A. Whittaker Ohio State University whittake@cis.ohio-state.edu Kathy J. Liszka The University of Akron liszka@computer.org

More information

Automatically Locating software Errors using Interesting Value Mapping Pair (IVMP)

Automatically Locating software Errors using Interesting Value Mapping Pair (IVMP) 71 Automatically Locating software Errors using Interesting Value Mapping Pair (IVMP) Ajai Kumar 1, Anil Kumar 2, Deepti Tak 3, Sonam Pal 4, 1,2 Sr. Lecturer, Krishna Institute of Management & Technology,

More information

With data-based models and design of experiments towards successful products - Concept of the product design workbench

With data-based models and design of experiments towards successful products - Concept of the product design workbench European Symposium on Computer Arded Aided Process Engineering 15 L. Puigjaner and A. Espuña (Editors) 2005 Elsevier Science B.V. All rights reserved. With data-based models and design of experiments towards

More information

Analysis Tool Project

Analysis Tool Project Tool Overview The tool we chose to analyze was the Java static analysis tool FindBugs (http://findbugs.sourceforge.net/). FindBugs is A framework for writing static analyses Developed at the University

More information

Adding a Source Code Searching Capability to Yioop ADDING A SOURCE CODE SEARCHING CAPABILITY TO YIOOP CS297 REPORT

Adding a Source Code Searching Capability to Yioop ADDING A SOURCE CODE SEARCHING CAPABILITY TO YIOOP CS297 REPORT ADDING A SOURCE CODE SEARCHING CAPABILITY TO YIOOP CS297 REPORT Submitted to Dr. Chris Pollett By Snigdha Rao Parvatneni 1 1. INTRODUCTION The aim of the CS297 project is to explore and learn important

More information

SOLUTION BRIEF CA TEST DATA MANAGER FOR HPE ALM. CA Test Data Manager for HPE ALM

SOLUTION BRIEF CA TEST DATA MANAGER FOR HPE ALM. CA Test Data Manager for HPE ALM SOLUTION BRIEF CA TEST DATA MANAGER FOR HPE ALM CA Test Data Manager for HPE ALM Generate all the data needed to deliver fully tested software, and export it directly into Hewlett Packard Enterprise Application

More information

Dynamic Test Generation to Find Bugs in Web Application

Dynamic Test Generation to Find Bugs in Web Application Dynamic Test Generation to Find Bugs in Web Application C.SathyaPriya 1 and S.Thiruvenkatasamy 2 1 Department of IT, Shree Venkateshwara Hi-Tech Engineering College, Gobi, Tamilnadu, India. 2 Department

More information

Towards a Taxonomy of Approaches for Mining of Source Code Repositories

Towards a Taxonomy of Approaches for Mining of Source Code Repositories Towards a Taxonomy of Approaches for Mining of Source Code Repositories Huzefa Kagdi, Michael L. Collard, Jonathan I. Maletic Department of Computer Science Kent State University Kent Ohio 44242 {hkagdi,

More information

Searching the Deep Web

Searching the Deep Web Searching the Deep Web 1 What is Deep Web? Information accessed only through HTML form pages database queries results embedded in HTML pages Also can included other information on Web can t directly index

More information

Harvard School of Engineering and Applied Sciences CS 152: Programming Languages

Harvard School of Engineering and Applied Sciences CS 152: Programming Languages Harvard School of Engineering and Applied Sciences CS 152: Programming Languages Lecture 18 Thursday, April 3, 2014 1 Error-propagating semantics For the last few weeks, we have been studying type systems.

More information

Regression Test Case Prioritization using Genetic Algorithm

Regression Test Case Prioritization using Genetic Algorithm 9International Journal of Current Trends in Engineering & Research (IJCTER) e-issn 2455 1392 Volume 2 Issue 8, August 2016 pp. 9 16 Scientific Journal Impact Factor : 3.468 http://www.ijcter.com Regression

More information

CSCI6900 Assignment 3: Clustering on Spark

CSCI6900 Assignment 3: Clustering on Spark DEPARTMENT OF COMPUTER SCIENCE, UNIVERSITY OF GEORGIA CSCI6900 Assignment 3: Clustering on Spark DUE: Friday, Oct 2 by 11:59:59pm Out Friday, September 18, 2015 1 OVERVIEW Clustering is a data mining technique

More information

Optimized Implementation of Logic Functions

Optimized Implementation of Logic Functions June 25, 22 9:7 vra235_ch4 Sheet number Page number 49 black chapter 4 Optimized Implementation of Logic Functions 4. Nc3xe4, Nb8 d7 49 June 25, 22 9:7 vra235_ch4 Sheet number 2 Page number 5 black 5 CHAPTER

More information

Feature Selection Technique to Improve Performance Prediction in a Wafer Fabrication Process

Feature Selection Technique to Improve Performance Prediction in a Wafer Fabrication Process Feature Selection Technique to Improve Performance Prediction in a Wafer Fabrication Process KITTISAK KERDPRASOP and NITTAYA KERDPRASOP Data Engineering Research Unit, School of Computer Engineering, Suranaree

More information

Using Mutation to Automatically Suggest Fixes for Faulty Programs

Using Mutation to Automatically Suggest Fixes for Faulty Programs 2010 Third International Conference on Software Testing, Verification and Validation Using Mutation to Automatically Suggest Fixes for Faulty Programs Vidroha Debroy and W. Eric Wong Department of Computer

More information

Joe Wingbermuehle, (A paper written under the guidance of Prof. Raj Jain)

Joe Wingbermuehle, (A paper written under the guidance of Prof. Raj Jain) 1 of 11 5/4/2011 4:49 PM Joe Wingbermuehle, wingbej@wustl.edu (A paper written under the guidance of Prof. Raj Jain) Download The Auto-Pipe system allows one to evaluate various resource mappings and topologies

More information

RDGL Reference Manual

RDGL Reference Manual RDGL Reference Manual COMS W4115 Programming Languages and Translators Professor Stephen A. Edwards Summer 2007(CVN) Navid Azimi (na2258) nazimi@microsoft.com Contents Introduction... 3 Purpose... 3 Goals...

More information

CS 229 Final Project - Using machine learning to enhance a collaborative filtering recommendation system for Yelp

CS 229 Final Project - Using machine learning to enhance a collaborative filtering recommendation system for Yelp CS 229 Final Project - Using machine learning to enhance a collaborative filtering recommendation system for Yelp Chris Guthrie Abstract In this paper I present my investigation of machine learning as

More information

Chapter 1 Summary. Chapter 2 Summary. end of a string, in which case the string can span multiple lines.

Chapter 1 Summary. Chapter 2 Summary. end of a string, in which case the string can span multiple lines. Chapter 1 Summary Comments are indicated by a hash sign # (also known as the pound or number sign). Text to the right of the hash sign is ignored. (But, hash loses its special meaning if it is part of

More information

Profile-Guided Program Simplification for Effective Testing and Analysis

Profile-Guided Program Simplification for Effective Testing and Analysis Profile-Guided Program Simplification for Effective Testing and Analysis Lingxiao Jiang Zhendong Su Program Execution Profiles A profile is a set of information about an execution, either succeeded or

More information

A Module Mapper. 1 Background. Nathan Sidwell. Document Number: p1184r1 Date: SC22/WG21 SG15. /

A Module Mapper. 1 Background. Nathan Sidwell. Document Number: p1184r1 Date: SC22/WG21 SG15. / A Module Mapper Nathan Sidwell Document Number: p1184r1 Date: 2018-11-12 To: SC22/WG21 SG15 Reply to: Nathan Sidwell nathan@acm.org / nathans@fb.com The modules-ts specifies no particular mapping between

More information

Mubug: a mobile service for rapid bug tracking

Mubug: a mobile service for rapid bug tracking . MOO PAPER. SCIENCE CHINA Information Sciences January 2016, Vol. 59 013101:1 013101:5 doi: 10.1007/s11432-015-5506-4 Mubug: a mobile service for rapid bug tracking Yang FENG, Qin LIU *,MengyuDOU,JiaLIU&ZhenyuCHEN

More information

Running Head: APPLIED KNOWLEDGE MANAGEMENT. MetaTech Consulting, Inc. White Paper

Running Head: APPLIED KNOWLEDGE MANAGEMENT. MetaTech Consulting, Inc. White Paper Applied Knowledge Management 1 Running Head: APPLIED KNOWLEDGE MANAGEMENT MetaTech Consulting, Inc. White Paper Application of Knowledge Management Constructs to the Massive Data Problem Jim Thomas July

More information

Testing unrolling optimization technique for quasi random numbers

Testing unrolling optimization technique for quasi random numbers Testing unrolling optimization technique for quasi random numbers Romain Reuillon David R.C. Hill LIMOS, UMR CNRS 6158 LIMOS, UMR CNRS 6158 Blaise Pascal University Blaise Pascal University ISIMA, Campus

More information

Programming. We will be introducing various new elements of Python and using them to solve increasingly interesting and complex problems.

Programming. We will be introducing various new elements of Python and using them to solve increasingly interesting and complex problems. Plan for the rest of the semester: Programming We will be introducing various new elements of Python and using them to solve increasingly interesting and complex problems. We saw earlier that computers

More information

Split-Brain Consensus

Split-Brain Consensus Split-Brain Consensus On A Raft Up Split Creek Without A Paddle John Burke jcburke@stanford.edu Rasmus Rygaard rygaard@stanford.edu Suzanne Stathatos sstat@stanford.edu ABSTRACT Consensus is critical for

More information

Comparing Implementations of Optimal Binary Search Trees

Comparing Implementations of Optimal Binary Search Trees Introduction Comparing Implementations of Optimal Binary Search Trees Corianna Jacoby and Alex King Tufts University May 2017 In this paper we sought to put together a practical comparison of the optimality

More information

Mapping Bug Reports to Relevant Files and Automated Bug Assigning to the Developer Alphy Jose*, Aby Abahai T ABSTRACT I.

Mapping Bug Reports to Relevant Files and Automated Bug Assigning to the Developer Alphy Jose*, Aby Abahai T ABSTRACT I. International Journal of Scientific Research in Computer Science, Engineering and Information Technology 2018 IJSRCSEIT Volume 3 Issue 1 ISSN : 2456-3307 Mapping Bug Reports to Relevant Files and Automated

More information

Incorporating Satellite Documents into Co-citation Networks for Scientific Paper Searches

Incorporating Satellite Documents into Co-citation Networks for Scientific Paper Searches Incorporating Satellite Documents into Co-citation Networks for Scientific Paper Searches Masaki Eto Gakushuin Women s College Tokyo, Japan masaki.eto@gakushuin.ac.jp Abstract. To improve the search performance

More information

Bug Inducing Analysis to Prevent Fault Prone Bug Fixes

Bug Inducing Analysis to Prevent Fault Prone Bug Fixes Bug Inducing Analysis to Prevent Fault Prone Bug Fixes Haoyu Yang, Chen Wang, Qingkai Shi, Yang Feng, Zhenyu Chen State Key Laboratory for ovel Software Technology, anjing University, anjing, China Corresponding

More information

Istat s Pilot Use Case 1

Istat s Pilot Use Case 1 Istat s Pilot Use Case 1 Pilot identification 1 IT 1 Reference Use case X 1) URL Inventory of enterprises 2) E-commerce from enterprises websites 3) Job advertisements on enterprises websites 4) Social

More information

SOURCE code repositories hold a wealth of information

SOURCE code repositories hold a wealth of information 466 IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, VOL. 31, NO. 6, JUNE 2005 Automatic Mining of Source Code Repositories to Improve Bug Finding Techniques Chadd C. Williams and Jeffrey K. Hollingsworth, Senior

More information

In this project, I examined methods to classify a corpus of s by their content in order to suggest text blocks for semi-automatic replies.

In this project, I examined methods to classify a corpus of  s by their content in order to suggest text blocks for semi-automatic replies. December 13, 2006 IS256: Applied Natural Language Processing Final Project Email classification for semi-automated reply generation HANNES HESSE mail 2056 Emerson Street Berkeley, CA 94703 phone 1 (510)

More information

TEST FRAMEWORKS FOR ELUSIVE BUG TESTING

TEST FRAMEWORKS FOR ELUSIVE BUG TESTING TEST FRAMEWORKS FOR ELUSIVE BUG TESTING W.E. Howden CSE, University of California at San Diego, La Jolla, CA, 92093, USA howden@cse.ucsd.edu Cliff Rhyne Intuit Software Corporation, 6220 Greenwich D.,

More information

Advanced Algorithms Class Notes for Monday, October 23, 2012 Min Ye, Mingfu Shao, and Bernard Moret

Advanced Algorithms Class Notes for Monday, October 23, 2012 Min Ye, Mingfu Shao, and Bernard Moret Advanced Algorithms Class Notes for Monday, October 23, 2012 Min Ye, Mingfu Shao, and Bernard Moret Greedy Algorithms (continued) The best known application where the greedy algorithm is optimal is surely

More information

MICRO-SPECIALIZATION IN MULTIDIMENSIONAL CONNECTED-COMPONENT LABELING CHRISTOPHER JAMES LAROSE

MICRO-SPECIALIZATION IN MULTIDIMENSIONAL CONNECTED-COMPONENT LABELING CHRISTOPHER JAMES LAROSE MICRO-SPECIALIZATION IN MULTIDIMENSIONAL CONNECTED-COMPONENT LABELING By CHRISTOPHER JAMES LAROSE A Thesis Submitted to The Honors College In Partial Fulfillment of the Bachelors degree With Honors in

More information

Automated Adaptive Bug Isolation using Dyninst. Piramanayagam Arumuga Nainar, Prof. Ben Liblit University of Wisconsin-Madison

Automated Adaptive Bug Isolation using Dyninst. Piramanayagam Arumuga Nainar, Prof. Ben Liblit University of Wisconsin-Madison Automated Adaptive Bug Isolation using Dyninst Piramanayagam Arumuga Nainar, Prof. Ben Liblit University of Wisconsin-Madison Cooperative Bug Isolation (CBI) ++branch_17[p!= 0]; if (p) else Predicates

More information

Failure Detection Algorithm for Testing Dynamic Web Applications

Failure Detection Algorithm for Testing Dynamic Web Applications J. Vijaya Sagar Reddy & G. Ramesh Department of CSE, JNTUA College of Engineering, Anantapur, Andhra Pradesh, India E-mail: vsreddyj5@gmail.com, ramesh680@gmail.com Abstract - Web applications are the

More information

Equation to LaTeX. Abhinav Rastogi, Sevy Harris. I. Introduction. Segmentation.

Equation to LaTeX. Abhinav Rastogi, Sevy Harris. I. Introduction. Segmentation. Equation to LaTeX Abhinav Rastogi, Sevy Harris {arastogi,sharris5}@stanford.edu I. Introduction Copying equations from a pdf file to a LaTeX document can be time consuming because there is no easy way

More information

Genetic Model Optimization for Hausdorff Distance-Based Face Localization

Genetic Model Optimization for Hausdorff Distance-Based Face Localization c In Proc. International ECCV 2002 Workshop on Biometric Authentication, Springer, Lecture Notes in Computer Science, LNCS-2359, pp. 103 111, Copenhagen, Denmark, June 2002. Genetic Model Optimization

More information

Enterprise Management of Windows NT Services

Enterprise Management of Windows NT Services Enterprise Management of Windows NT Services J. Nick Otto notto@parikh.net Parikh Advanced Systems Abstract A problem faced by NT administrators is the management of NT based services in the enterprise.

More information

A Propagation Engine for GCC

A Propagation Engine for GCC A Propagation Engine for GCC Diego Novillo Red Hat Canada dnovillo@redhat.com May 1, 2005 Abstract Several analyses and transformations work by propagating known values and attributes throughout the program.

More information