Predicting Bugs by Analyzing History. Sunghun Kim, Research On Program Analysis System, Seoul National University


1 Predicting Bugs by Analyzing History Sunghun Kim Research On Program Analysis System Seoul National University

2 Around the World in 80 days

3 Around the World in 8 years

4 Predicting Bugs Severe consequences July 28, 1962: Mariner I space probe 1982 : Soviet gas pipeline 1985 ~ 1987: Therac-25 medical accelerator June 4, 1996: Ariane 5 Flight 501

5 Predicting Bugs Severe consequences July 28, 1962: Mariner I space probe 1982: Soviet gas pipeline 1985 ~ 1987: Therac-25 medical accelerator June 4, 1996: Ariane 5 Flight 501 The Ariane 5 exploded seconds after launching. Boring and labor-intensive work

6 Analyzing History "Those who cannot learn from history are doomed to repeat it." - George Santayana

7 Analyzing History "Developers who cannot learn from history are doomed to repeat it." - George Santayana

8 Analyzing History "Developers who cannot learn from Software history are doomed to repeat it." - George Santayana

9 Available Information [Diagram labels: Raw data (models, tests, bug reports); Mine; History information; Produce; Bug prediction; Feedback; Change impact analysis; Resource allocation; Software understanding]

10 Dream All modules are bug-free! Bug-free module

11 Some are Buggy Some modules are buggy! Bug-free module Buggy module

12 Changes introduce Bugs Changes often introduce bugs! change Bug-free module Buggy module

13 Two Bug Prediction Algorithms Change Classification: predicting if a change introduces a bug change Bug cache: predicting buggy modules Bug-free module Buggy module

14 Change Classification [TSE08, the featured article of March/April issue]

15 Change Classification Development history of JEditTextArea.java Rev 1 Rev 2 Rev 3 change change

16 Change Classification Development history of JEditTextArea.java Rev 1 Rev 2 Rev 3 Rev 4 change change change

17 Change Classification Development history of JEditTextArea.java Rev 1 Rev 2 Rev 3 Rev 4 change change change Did I just introduce a bug?

18 What to do when a change is likely to introduce a bug? Review the submitted code carefully The submitted code change is fresh Focus additional software quality assurance (QA) efforts on those changes Software inspections Additional test cases

19 Change Classification [Diagram: historical changes (Rev n to Rev n+1), each labeled buggy or clean, are fed to a machine learner; the learner then labels a new change (Rev n to Rev n+1) as buggy or clean.] It classifies all changes (as buggy or clean) with 70% recall and 94% precision.

20 Label Historical Changes [MSR06, ASE06] Rev 1 Rev 100 change Rev 101 change Rev 102 Development history of JEditTextArea.java

21 Label Historical Changes Rev 101 (with BUG) settext( \t ) fixed Rev 102 (no BUG) inserttab() Rev 1 Rev 100 Rev 101 Rev 102 change change Development history of JEditTextArea.java

22 Label Historical Changes Change message: fix for bug Rev 101 (with BUG) settext( \t ) fixed Rev 102 (no BUG) inserttab() Rev 1 Rev 100 Rev 101 Rev 102 change change Development history of JEditTextArea.java

23 Label Historical Changes Rev 101 (with BUG) settext( \t ) fixed Rev 102 (no BUG) inserttab() Rev 1 Rev 100 Rev 101 Rev 102 change change Development history of JEditTextArea.java

24 Tracking Line Changes [MSR05] [Figure: the lines of Rev 12, Rev 23, Rev 42, and Rev 101 are mapped from one revision to the next, with changed (CHG), deleted (DEL), and added (ADD) lines marked. Each line of the file is annotated with its line origins, the revisions in which it was introduced or changed, e.g. "Rev 12", "Rev 23", "Rev 12, 23", "Rev 12, 23, 42".]
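The line-tracking idea can be sketched with Python's standard `difflib`. This is a minimal illustration, not the algorithm from [MSR05]: it keeps only the most recent origin of each line, whereas the slide shows a list of revisions per line, and the function name `track_origins` and the toy revisions are assumptions made for the example.

```python
import difflib

def track_origins(revisions):
    """revisions: list of (rev_id, lines) pairs, oldest first.

    Returns one rev_id per line of the newest revision: the revision in
    which that line was added or last changed.
    """
    first_id, prev_lines = revisions[0]
    origins = [first_id] * len(prev_lines)
    for rev_id, lines in revisions[1:]:
        # By default every line is attributed to the current revision.
        new_origins = [rev_id] * len(lines)
        matcher = difflib.SequenceMatcher(a=prev_lines, b=lines, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                # Unchanged lines inherit the origin they already had.
                new_origins[j1:j2] = origins[i1:i2]
        prev_lines, origins = lines, new_origins
    return origins

# Hypothetical three-revision history of one file.
history = [
    ("Rev12", ["a", "b", "c"]),
    ("Rev23", ["a", "B", "c", "d"]),
    ("Rev42", ["a", "B", "c2", "d"]),
]
line_origins = track_origins(history)
print(line_origins)
```

Lines that were never touched keep the first revision as their origin; a changed line is treated as a delete plus an add, so it is attributed to the revision that changed it.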

25 Label Historical Changes Line origins Rev 101 (with BUG) Rev 102 (no BUG) Buggy change Rev 23 Rev 12 settext( \t ) fixed inserttab() Rev 1 Rev 100 Rev 101 Rev 102 change change Development history of JEditTextArea.java
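The labeling step above can be sketched as follows: a change whose log message looks like a fix marks the origin revisions of the lines it touched as buggy. This is a simplified sketch under stated assumptions: the keyword pattern, the `touched_origins` field, and the revision numbers in the example are hypothetical, and the line origins are assumed to have been computed beforehand.

```python
import re

# Assumed keyword heuristic for recognising fix commits from the log message.
FIX_PATTERN = re.compile(r"\b(fix(e[sd])?|bugs?|defects?|patch)\b", re.IGNORECASE)

def is_fix(log_message):
    return FIX_PATTERN.search(log_message) is not None

def label_changes(changes):
    """changes: list of dicts with keys
         rev             - revision identifier
         log             - change log message
         touched_origins - origin revisions of the lines this change modified
    Returns {rev: "buggy" | "clean"}.
    """
    labels = {c["rev"]: "clean" for c in changes}
    for c in changes:
        if is_fix(c["log"]):
            # The lines a fix had to modify are blamed on the changes
            # that introduced them.
            for origin in c["touched_origins"]:
                if origin in labels:
                    labels[origin] = "buggy"
    return labels

# Hypothetical history: revision 102 fixes a line that originated in revision 23.
labels = label_changes([
    {"rev": 12, "log": "initial import", "touched_origins": []},
    {"rev": 23, "log": "handle tab key", "touched_origins": [12]},
    {"rev": 42, "log": "update comments", "touched_origins": [12]},
    {"rev": 102, "log": "fix for bug", "touched_origins": [23]},
])
print(labels)
```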

26 Label Historical Changes Rev 22 Rev 23 Rev 24 Rev 25 Rev 26 buggy clean clean buggy

27 Extracting Features JEditTextArea.java Revision 10 change Revision 11 Author: hunkim Check-in time: March 23, :30 AM Log: Never convert propresult from utf-16
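A change like the one above can be turned into features for a learner. The sketch below is illustrative only: the feature names, the tokenizer, and the choice of metadata (author, hour, weekday, log words, added tokens) are assumptions, and the date is a placeholder because the slide's check-in time is incomplete.

```python
import re
from datetime import datetime

TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def extract_features(author, checkin_time, log_message, added_lines):
    """Turn one change into a sparse feature dictionary (feature name -> count)."""
    features = {
        "author=" + author: 1,
        "hour=%d" % checkin_time.hour: 1,
        "weekday=%d" % checkin_time.weekday(): 1,
    }
    # Bag of words over the change log message.
    for word in TOKEN.findall(log_message.lower()):
        key = "log:" + word
        features[key] = features.get(key, 0) + 1
    # Bag of identifiers over the added source lines (case preserved).
    for line in added_lines:
        for token in TOKEN.findall(line):
            key = "add:" + token
            features[key] = features.get(key, 0) + 1
    return features

change = extract_features(
    author="hunkim",
    checkin_time=datetime(2000, 1, 3, 9, 30),  # placeholder, not the slide's date
    log_message="Never convert propresult from utf-16",
    added_lines=["insertTab();"],               # hypothetical added line
)
print(sorted(change))
```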

28 Training Classifiers Historical changes Machine learning techniques Bayesian Network, SVM

29 Classifying New Changes Historical changes New change prediction
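Training on labeled historical changes and classifying a new change can be sketched with a small multinomial naive Bayes classifier. Note that this is a stand-in chosen because it fits in a few lines of standard-library Python; the slides name Bayesian Network and SVM learners, and the class name, feature names, and toy data here are hypothetical.

```python
import math
from collections import Counter, defaultdict

class NaiveBayesChangeClassifier:
    """Multinomial naive Bayes over sparse feature dictionaries."""

    def __init__(self):
        self.label_counts = Counter()
        self.feature_counts = defaultdict(Counter)
        self.vocabulary = set()

    def train(self, examples):
        """examples: iterable of (features, label) pairs from history."""
        for features, label in examples:
            self.label_counts[label] += 1
            for name, count in features.items():
                self.feature_counts[label][name] += count
                self.vocabulary.add(name)

    def classify(self, features):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            score = math.log(n / total)  # log prior
            denom = sum(self.feature_counts[label].values()) + len(self.vocabulary)
            for name, count in features.items():
                if name not in self.vocabulary:
                    continue  # ignore features never seen in training
                # Laplace-smoothed log likelihood.
                seen = self.feature_counts[label][name]
                score += count * math.log((seen + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical labeled history.
history = [
    ({"log:refactor": 1, "add:null": 1}, "buggy"),
    ({"log:typo": 1, "add:comment": 1}, "clean"),
    ({"add:null": 1, "add:cast": 1}, "buggy"),
    ({"log:docs": 1, "add:comment": 1}, "clean"),
]
model = NaiveBayesChangeClassifier()
model.train(history)
print(model.classify({"add:null": 1}))
```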

30 Evaluation Training and testing sets 10-fold cross-validation Performance measurement Precision Recall

31 Training and Testing Sets Training Testing real prediction
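The 10-fold split into training and testing sets can be sketched as below. This is a bare-bones version: practical setups usually shuffle or stratify the data first, which is omitted here, and the helper name `k_fold_splits` is an assumption.

```python
def k_fold_splits(items, k=10):
    """Yield (training, testing) pairs so that each item is tested exactly once."""
    folds = [items[i::k] for i in range(k)]
    for i in range(k):
        testing = folds[i]
        training = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield training, testing

# 25 hypothetical changes, identified by index.
changes = list(range(25))
splits = list(k_fold_splits(changes, k=10))
for training, testing in splits:
    print(len(training), len(testing))
```

Each change appears in exactly one testing set, so the predictions collected over all folds cover the whole history once.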

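32 Performance Measurement 4 possible outcomes from using a classifier
                 Predicted buggy    Predicted clean
Real buggy       $n_{b \to b}$      $n_{b \to c}$
Real clean       $n_{c \to b}$      $n_{c \to c}$
Precision: $\frac{n_{b \to b}}{n_{b \to b} + n_{c \to b}}$, Recall: $\frac{n_{b \to b}}{n_{b \to b} + n_{b \to c}}$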
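Buggy-change precision and recall can be computed from the real and predicted labels as in this short sketch (the function name and the example labels are made up for illustration):

```python
def precision_recall(real, predicted, positive="buggy"):
    """Return (precision, recall) for the `positive` class."""
    pairs = list(zip(real, predicted))
    n_bb = sum(1 for r, p in pairs if r == positive and p == positive)
    n_bc = sum(1 for r, p in pairs if r == positive and p != positive)
    n_cb = sum(1 for r, p in pairs if r != positive and p == positive)
    precision = n_bb / (n_bb + n_cb) if n_bb + n_cb else 0.0
    recall = n_bb / (n_bb + n_bc) if n_bb + n_bc else 0.0
    return precision, recall

real      = ["buggy", "buggy", "buggy", "buggy", "clean", "clean"]
predicted = ["buggy", "buggy", "buggy", "clean", "buggy", "clean"]
precision, recall = precision_recall(real, predicted)
print(precision, recall)
```

Here three buggy changes are caught, one is missed, and one clean change is flagged, so both measures come out at 3/4.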

33 Subject Systems Bugzilla jedit PostgreSQL Mozilla and more

34 Bug Prediction Accuracy (Bayesian Network after feature selection) [Chart: bug recall and bug precision for each subject project: A1, BUG, COL, GAI, GFO, JED, MOZ, ECL, PLO, POS, SCA, SVN]

35 Related Work [Chart: recall and precision of Mizuno et al. [FSE07], Aversano et al. [IWPSE07], and Change Classification]

37 Change Classification Summary Use machine learning techniques to analyze changes After training, can predict whether or not a new change has a bug Average recall is 70% Average precision is 94% Applicable in real development process Yahoo and Apple are interested

38 Bug Cache [ ICSE07, Distinguished paper ]

39 Bug Cache Bug cache: locating buggy modules Bug-free module Buggy module

40 Motivation Which files should we focus on?

41 Where are the Bugs? In new files! [Graves et al.] In modified files! [Khoshgoftaar et al.] Spatial locality: near other bugs! [Zimmermann et al.] Temporal locality: files with recent defects are likely to have more soon. [Ostrand, Hassan]

42 Our Solution List of most bug-prone files Dynamically adaptive and intuitive Combine bug prediction models A BugCache holding 10% of the files predicts 73~95% of bugs

43 Bug Cache Model

44 Cache Model C Cache size: 2 A B C Miss

45 Cache Update Load missed files Pre-fetch nearby files (spatial locality) File A B D Number of common changes with C
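Pre-fetching by spatial locality can be sketched by counting how often files were changed together and loading the closest neighbours of the missed file. This is an illustration under assumptions: the helper names and the commit data are hypothetical, and the block size is taken to include the missed file itself, so `block_size - 1` neighbours are pre-fetched.

```python
from collections import Counter
from itertools import combinations

def co_change_counts(commits):
    """commits: iterable of sets of file names changed together."""
    counts = Counter()
    for files in commits:
        for a, b in combinations(sorted(files), 2):
            counts[(a, b)] += 1
    return counts

def nearby_files(missed, commits, block_size):
    """Files to pre-fetch with `missed`, ranked by number of common changes."""
    neighbours = Counter()
    for (a, b), n in co_change_counts(commits).items():
        if a == missed:
            neighbours[b] = n
        elif b == missed:
            neighbours[a] = n
    # Most common changes first; ties broken by file name for determinism.
    ranked = sorted(neighbours.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:block_size - 1]]

# Hypothetical commit history.
commits = [{"A", "C"}, {"A", "C"}, {"B", "C"}, {"C", "D"}, {"A", "C", "D"}]
print(nearby_files("C", commits, block_size=2))
print(nearby_files("C", commits, block_size=3))
```

In this made-up history, A shares three changes with C, D shares two, and B shares one, so A is pre-fetched first.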

46 Cache Model C B A Cache size: 2 Block size: 2 A B C C A B B D A Miss Hit Hit Miss Which one should be replaced?

47 Replacement Policies Least recently used (LRU): unload the files that have the least recently found defect. Least frequently changed (CHANGE): unload the files that have the fewest changes. Least frequent defects (BUG): unload the files that have the fewest defects.
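The three policies differ only in which per-file statistic they minimise, which the following sketch makes explicit. The record layout (`last_defect`, `changes`, `defects`), the tie-breaking by file name, and the numbers are assumptions for illustration.

```python
def choose_victim(cache, policy):
    """Pick the file to unload.

    cache:  {file name: {"last_defect": int, "changes": int, "defects": int}}
            where last_defect is the time (e.g. revision number) at which
            a defect was last found in the file.
    policy: "LRU", "CHANGE", or "BUG".
    """
    field = {"LRU": "last_defect", "CHANGE": "changes", "BUG": "defects"}[policy]
    # Iterate in name order so that ties are broken deterministically.
    return min(sorted(cache), key=lambda name: cache[name][field])

# Hypothetical cache contents.
cache = {
    "C": {"last_defect": 4, "changes": 3, "defects": 2},
    "B": {"last_defect": 6, "changes": 5, "defects": 1},
}
for policy in ("LRU", "CHANGE", "BUG"):
    print(policy, choose_victim(cache, policy))
```

With these numbers the BUG policy unloads B, the file with fewer defects, while LRU and CHANGE would unload C.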

48 File C B BUG 2 1 (replace) 1 C B A Cache size: 2 Block size: 2 Replacement: BUG A B C C A B B D A Miss Hit Hit Miss

49 Cache Evaluation C A Cache size: 2 Block size: 2 Replacement: BUG A B C C A B B D A Miss Hit Hit Miss Hit rate = #Hits / #Bugs = 50%
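The hit-rate measure can be illustrated with a stripped-down cache simulation. This sketch uses plain LRU replacement and no pre-fetching, so it is simpler than the model on the slides, and the defect sequence is invented rather than taken from the slide.

```python
from collections import OrderedDict

def simulate_bug_cache(defect_files, cache_size):
    """Replay a sequence of defective files and return #hits / #bugs.

    A bug counts as a hit if its file is already in the cache when the
    bug is found; otherwise it is a miss and the file is loaded.
    """
    cache = OrderedDict()  # least recently hit file first
    hits = 0
    for name in defect_files:
        if name in cache:
            hits += 1
            cache.move_to_end(name)
        else:
            if len(cache) >= cache_size:
                cache.popitem(last=False)  # evict least recently used
            cache[name] = True
    return hits / len(defect_files) if defect_files else 0.0

# Hypothetical sequence of files in which bugs were found, oldest first.
hit_rate = simulate_bug_cache(["A", "B", "A", "B", "C", "C", "A", "A"], cache_size=2)
print(hit_rate)
```

Four of the eight bugs in this invented sequence land in files already cached, giving a hit rate of 0.5.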

50 Hit Rates Apache 1.3 Columba Eclipse JEdit Mozilla PostgreSQL Subversion Cache size = 10% of all files File

51 Hit Rates Apache 1.3 Columba Eclipse 95 JEdit Mozilla PostgreSQL 79 Subversion Cache size = 10% of all files File

54 Related Work Khoshgoftaar et al., top 10%: 64%. Khoshgoftaar et al., top 20%: 82%. Ostrand et al., top 20%: 71~93%. Hassan et al., top 10%: 44~78%. BugCache, top 10%: 73~95%. Previous state of the art: 10% predicts 44~78%, 20% predicts 71~93%. BugCache: 10% predicts 73~95%.

55 Conclusion Analyzing history is an effective way to predict bug locations Change classification can classify changes as buggy or clean with 70% recall and 94% precision BugCache can identify the most bug-prone files

56 Research Overview and Future Work

57 Research Goal Developer productivity Reliable software Predicting bugs

58 Research Overview History Mining Kenyon [FSE05] Bug Introducing Changes [ASE06] Prioritization of Warnings [FSE07] Static Software Understanding Signature Change Patterns [ICSM06] Micro Pattern Evolution [MSR06] Matching Name Changes [WCRE06] Bug Prediction Memories of Bug Fixes [FSE06] Change Classification [TSE08] Bug Cache [ICSE07] Dynamic Dynamic Monitoring ReCrash [ECOOP08] Zero-day patch

59 Future Work Change classification aware repository Personal coding assistance Mining bug and crash reports Mining APIs Combining complementary techniques Micro commits and explicit fix-change marks Change feedback to developers Adaptive change classification Mining common error patterns in my code Showing code survival rates Identifying/predicting crashed methods Predicting locations based on bug reports Increasing bug report quality Example oriented API documents Identifying more/less error prone APIs Automatic API version upgrading To find bugs effectively and efficiently Static and Dynamic (ReCrash + BugCache)

60 Static and Dynamic Analysis
            Overhead    False positives
Static      Low         High
Dynamic     High        Low
Combine static and dynamic analysis BugCache + ReCrash

61 Reproducing Crashes Reproducing crashes (faults) is hard! Requires the exact configuration of the crash (in the field) Crashes usually involve nondeterministic factors Must be able to reproduce crashes to fix bugs and validate fixes

62 ReCrash [ECOOP08] Subject program ReCrash Test case 13-64% performance overhead

63 ReCrash + BugCache ReCrash Monitoring all modules

64 ReCrash + BugCache ReCrash Monitoring all modules BugCache Identify crashable modules

65 ReCrash + BugCache ReCrash Monitoring all modules BugCache Identify crashable modules ReCrash + BugCache Monitoring only identified modules

66 ReCrash + BugCache ReCrash Monitoring all modules BugCache Identify crashable modules ReCrash + BugCache Monitoring only identified modules 10% of modules account for 70% of crashes ReCrash + BugCache can possibly run with 1~6% overhead and reproduce 70% of crashes
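A rough check of the 1~6% figure, assuming that monitoring overhead scales linearly with the fraction of modules monitored (an assumption made here for illustration, not something the slides state), applied to the 13-64% overhead reported for ReCrash on its own:

```latex
\[
0.10 \times 13\% = 1.3\%, \qquad 0.10 \times 64\% = 6.4\%
\]
```

Under that assumption, monitoring only the 10% of modules that BugCache identifies lands in roughly the quoted range.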

67 Acknowledgement Advisors Jim Whitehead (UCSC), Michael Ernst (MIT) Collaborators (co-authors) MIT Shay Artzi Adam Kiezun Danny Dig UCSC Guozhang Ge Yi Zhang Kai Pan Ramak Akella Jen Bevan Elias Sinderson Saarland U Andreas Zeller Nicolas Bettenburg Rahul Premraj Iowa Tien Nguyen Hojun Jaygarl Sean Chen (Taiwan) Canada Europe Industry / UW Tom Zimmermann (U. Calgary) Michael Godfrey (Waterloo) Ahmed Hassan (Queen's U) Tudor Girba (SW-ENG) Martin Pinzger (U. Zurich) Audris Mockus (Avaya) Shiv Shivaji (Yahoo) Miryung Kim (UW)

68 Summary


70 Predicting Bugs by Analyzing History Sunghun Kim Research On Program Analysis System Seoul National University


More information

The Landscape of Concurrent Development

The Landscape of Concurrent Development The Landscape of Concurrent Development Thomas Zimmermann tz@acm.org Department of Computer Science, Saarland University, Saarbrücken, Germany Abstract The version control archive CVS records not only

More information

Improving Bug Triage with Relevant Search

Improving Bug Triage with Relevant Search Improving Bug Triage with Relevant Search Xinyu Peng, Pingyi Zhou, Jin Liu*, Xu Chen State ey Lab of Software Engineering Computer School, Wuhan University Wuhan, China *Corresponding author { pengxinyu,

More information

3 Prioritization of Code Anomalies

3 Prioritization of Code Anomalies 32 3 Prioritization of Code Anomalies By implementing a mechanism for detecting architecturally relevant code anomalies, we are already able to outline to developers which anomalies should be dealt with

More information

DURFEX: A Feature Extraction Technique for Efficient Detection of Duplicate Bug Reports

DURFEX: A Feature Extraction Technique for Efficient Detection of Duplicate Bug Reports DURFEX: A Feature Extraction Technique for Efficient Detection of Duplicate Bug Reports Korosh Koochekian Sabor, Abdelwahab Hamou-Lhadj Software Behaviour Analysis (SBA) Research Lab ECE, Concordia University,

More information

HPDedup: A Hybrid Prioritized Data Deduplication Mechanism for Primary Storage in the Cloud

HPDedup: A Hybrid Prioritized Data Deduplication Mechanism for Primary Storage in the Cloud HPDedup: A Hybrid Prioritized Data Deduplication Mechanism for Primary Storage in the Cloud Huijun Wu 1,4, Chen Wang 2, Yinjin Fu 3, Sherif Sakr 1, Liming Zhu 1,2 and Kai Lu 4 The University of New South

More information

Improving Bug Triage with Bug Tossing Graphs

Improving Bug Triage with Bug Tossing Graphs Improving Bug Triage with Bug Tossing Graphs Gaeul Jeong Seoul National University gejeong@ropas.snu.ac.kr Sunghun Kim Hong Kong University of Science and Technology hunkim@cse.ust.hk Thomas Zimmermann

More information

Use of the Linking File System to Aid the Study of Software Evolution

Use of the Linking File System to Aid the Study of Software Evolution Use of the Linking File System to Aid the Study of Software Evolution Sasha Ames sasha@cs.ucsc.edu CMPS290g Topics in Software Engineering: Evolution University of California, Santa Cruz Abstract This

More information

Improving Bug Localization using Correlations in Crash Reports

Improving Bug Localization using Correlations in Crash Reports Improving Bug Localization using Correlations in Crash Reports Shaohua Wang School of Computing Queen s University Kingston, ON, Canada shaohua@cs.queensu.ca Foutse Khomh SWAT Lab, DGIGL École Polytechnique

More information

Coping with an Open Bug Repository

Coping with an Open Bug Repository Coping with an Open Bug Repository John Anvik, Lyndon Hiew and Gail C. Murphy Department of Computer Science University of British Columbia {janvik, lyndonh, murphy}@cs.ubc.ca ABSTRACT Most open source

More information

Filtering Bug Reports for Fix-Time Analysis

Filtering Bug Reports for Fix-Time Analysis Filtering Bug Reports for Fix-Time Analysis Ahmed Lamkanfi, Serge Demeyer LORE - Lab On Reengineering University of Antwerp, Belgium Abstract Several studies have experimented with data mining algorithms

More information

Locality. Cache. Direct Mapped Cache. Direct Mapped Cache

Locality. Cache. Direct Mapped Cache. Direct Mapped Cache Locality A principle that makes having a memory hierarchy a good idea If an item is referenced, temporal locality: it will tend to be referenced again soon spatial locality: nearby items will tend to be

More information

Duplication de code: un défi pour l assurance qualité des logiciels?

Duplication de code: un défi pour l assurance qualité des logiciels? Duplication de code: un défi pour l assurance qualité des logiciels? Foutse Khomh S.W.A.T http://swat.polymtl.ca/ 2 JHotDraw 3 Code duplication can be 4 Example of code duplication Duplication to experiment

More information

Mining Software Repositories for Software Change Impact Analysis: A Case Study

Mining Software Repositories for Software Change Impact Analysis: A Case Study Mining Software Repositories for Software Change Impact Analysis: A Case Study Lile Hattori 1, Gilson dos Santos Jr. 2, Fernando Cardoso 2, Marcus Sampaio 2 1 Faculty of Informatics University of Lugano

More information

Sociotechnical Information From Software Repositories

Sociotechnical Information From Software Repositories Sociotechnical Information From Software Repositories Marco Aurélio Gerosa UNIVERSITY OF SÃO PAULO, BRAZIL UFU November/ Repositories of repositories 250K projects 93K projects 1 million users 11.3 million

More information

Replaying and Isolating Failing Multi-Object Interactions. Martin Burger Andreas Zeller Saarland University

Replaying and Isolating Failing Multi-Object Interactions. Martin Burger Andreas Zeller Saarland University Replaying and Isolating Failing Multi-Object Interactions Martin Burger Andreas Zeller Saarland University e-mail client written in Java 100,200 LOC ~ 1,600 Java classes 17 developers Actively developed

More information

Towards Software Analysis as a Service

Towards Software Analysis as a Service Towards Software Analysis as a Service Giacomo Ghezzi and Harald C. Gall s.e.a.l. software evolution and architecture lab University of Zurich, Department of Informatics, Switzerland {ghezzi, gall}@ifi.uzh.ch

More information

Mubug: a mobile service for rapid bug tracking

Mubug: a mobile service for rapid bug tracking . MOO PAPER. SCIENCE CHINA Information Sciences January 2016, Vol. 59 013101:1 013101:5 doi: 10.1007/s11432-015-5506-4 Mubug: a mobile service for rapid bug tracking Yang FENG, Qin LIU *,MengyuDOU,JiaLIU&ZhenyuCHEN

More information

Software Metrics based on Coding Standards Violations

Software Metrics based on Coding Standards Violations Software Metrics based on Coding Standards Violations Yasunari Takai, Takashi Kobayashi and Kiyoshi Agusa Graduate School of Information Science, Nagoya University Aichi, 464-8601, Japan takai@agusa.i.is.nagoya-u.ac.jp,

More information

Automatic Inference of Structural Changes for Matching Across Program Versions

Automatic Inference of Structural Changes for Matching Across Program Versions Automatic Inference of Structural Changes for Matching Across Program Versions Miryung Kim, David Notkin, Dan Grossman Computer Science & Engineering University of Washington Foo.mA() Foo.mB() Foo.mC()

More information

Software Engineering

Software Engineering Software Engineering Lecture 15: Testing and Debugging Debugging Peter Thiemann University of Freiburg, Germany SS 2014 Motivation Debugging is unavoidable and a major economical factor Software bugs cost

More information

Why So Complicated? Simple Term Filtering and Weighting for Location-Based Bug Report Assignment Recommendation

Why So Complicated? Simple Term Filtering and Weighting for Location-Based Bug Report Assignment Recommendation Why So Complicated? Simple Term Filtering and Weighting for Location-Based Bug Report Assignment Recommendation Ramin Shokripour, John Anvik, Zarinah M. Kasirun, Sima Zamani Faculty of Computer Science

More information

Automating the Measurement of Open Source Projects

Automating the Measurement of Open Source Projects Automating the Measurement of Open Source Projects Daniel German Department of Computer Science University of Victoria dmgerman@uvic.ca Audris Mockus Avaya Labs Department of Software Technology Research

More information

Are the Classes that use Exceptions Defect Prone?

Are the Classes that use Exceptions Defect Prone? Are the Classes that use Exceptions Defect Prone? Cristina Marinescu LOOSE Research Group Politehnica University of Timişoara, Romania cristina.marinescu@cs.upt.ro ABSTRACT Exception hling is a mechanism

More information

A Detailed Examination of the Correlation Between Imports and Failure-Proneness of Software Components

A Detailed Examination of the Correlation Between Imports and Failure-Proneness of Software Components A Detailed Examination of the Correlation Between Imports and Failure-Proneness of Software Components Ekwa Duala-Ekoko and Martin P. Robillard School of Computer Science, McGill University Montréal, Québec,

More information

Exploring the Influence of Feature Selection Techniques on Bug Report Prioritization

Exploring the Influence of Feature Selection Techniques on Bug Report Prioritization Exploring the Influence of Feature Selection Techniques on Bug Report Prioritization Yabin Wang, Tieke He, Weiqiang Zhang, Chunrong Fang, Bin Luo State Key Laboratory for Novel Software Technology, Nanjing

More information

Mining Co-Change Information to Understand when Build Changes are Necessary

Mining Co-Change Information to Understand when Build Changes are Necessary Mining Co-Change Information to Understand when Build Changes are Necessary Shane McIntosh, Bram Adams, Meiyappan Nagappan, and Ahmed E. Hassan School of Computing, Queen s University, Canada; {mcintosh,

More information

Heterogeneous Network Analysis of Developer Contribution in Bug Repositories

Heterogeneous Network Analysis of Developer Contribution in Bug Repositories Heterogeneous Network Analysis of Contribution in Repositories Wen Zhang 1, 3, Song Wang 1,4, Ye Yang 1, 2, Qing Wang 1, 2 1 Laboratory for Internet Software Technologies, Institute of Software, Chinese

More information

Measuring the Semantic Similarity of Comments in Bug Reports

Measuring the Semantic Similarity of Comments in Bug Reports Measuring the Semantic Similarity of Comments in Bug Reports Bogdan Dit, Denys Poshyvanyk, Andrian Marcus Department of Computer Science Wayne State University Detroit Michigan 48202 313 577 5408

More information

Identifying Static Analysis Techniques for Finding Non-fix Hunks in Fix Revisions

Identifying Static Analysis Techniques for Finding Non-fix Hunks in Fix Revisions Identifying Static Analysis Techniques for Finding Non-fix Hunks in Fix Revisions Yungbum Jung, Hakjoo Oh, and Kwangkeun Yi Seoul National University {dreameye, pronto, kwang}@ropas.snu.ac.kr ABSTRACT

More information

A Literature Review of Research in Bug Resolution: Tasks, Challenges and Future Directions

A Literature Review of Research in Bug Resolution: Tasks, Challenges and Future Directions The Computer Journal Advance Access published January 28, 2016 c The British Computer Society 2015. All rights reserved. For Permissions, please email: journals.permissions@oup.com doi:10.1093/comjnl/bxv114

More information

Automatic Labeling of Issues on Github A Machine learning Approach

Automatic Labeling of Issues on Github A Machine learning Approach Automatic Labeling of Issues on Github A Machine learning Approach Arun Kalyanasundaram December 15, 2014 ABSTRACT Companies spend hundreds of billions in software maintenance every year. Managing and

More information

Where Should the Bugs Be Fixed?

Where Should the Bugs Be Fixed? Where Should the Bugs Be Fixed? More Accurate Information Retrieval-Based Bug Localization Based on Bug Reports Presented by: Chandani Shrestha For CS 6704 class About the Paper and the Authors Publication

More information

Testing and Migration

Testing and Migration Testing and Migration Tudor Gîrba www.tudorgirba.com Reengineering... is the examination and alteration of a subject system to reconstitute it in a new form and the subsequent implementation of the new

More information

Dependency behaviour of Haskell libraries

Dependency behaviour of Haskell libraries Dependency behaviour of Haskell libraries An extension to usage and switch-back tendencies in a different project environment Marc Juchli, Lars Krombeen, Shashank Rao and Chak Shun Yu Delft University

More information

Statistical Debugging Benjamin Robert Liblit. Cooperative Bug Isolation. PhD Dissertation, University of California, Berkeley, 2004.

Statistical Debugging Benjamin Robert Liblit. Cooperative Bug Isolation. PhD Dissertation, University of California, Berkeley, 2004. Statistical Debugging Benjamin Robert Liblit. Cooperative Bug Isolation. PhD Dissertation, University of California, Berkeley, 2004. ACM Dissertation Award (2005) Thomas D. LaToza 17-654 Analysis of Software

More information