Empirical Study on Impact of Developer Collaboration on Source Code
Akshay Chopra, Sahil Puri and Parul Verma
03 April 2018

Outline
Introduction
Research Questions
Methodology
Data Characteristics
Analysis
Threats to Validity
Future Work
Conclusion

Introduction
Most projects are large scale and hence involve many developers.
Software version control systems and hosting platforms such as GitHub and SVN have evolved.
We link the amount of collaboration to the defects logged in the classes.
We empirically study the effects of developer collaboration on software quality for 50 open source Java projects.

Contributions
We try to relate the extent of developer collaboration to various project metrics such as age of project, SLOC, etc.

Research Question 1
Question: What is the density of developer collaboration in a single project, i.e. how many files per project have collaboration from how many developers?
Motivation: Developers work together during software development and maintenance to resolve issues and implement features in software projects.
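The collaboration density asked about in Research Question 1 can be illustrated with a small sketch. It assumes each commit is reduced to an (author, changed files) pair; the function names and the sample history below are hypothetical and are not taken from the study's tooling.

```python
from collections import Counter, defaultdict


def authors_per_file(commits):
    """Map each file path to the set of distinct authors who touched it."""
    authors = defaultdict(set)
    for author, paths in commits:
        for path in paths:
            authors[path].add(author)
    return dict(authors)


def collaboration_density(commits):
    """Return {n: number of files touched by exactly n distinct authors}."""
    per_file = authors_per_file(commits)
    return dict(Counter(len(names) for names in per_file.values()))


# Hypothetical commit history: (author, files changed in that commit).
commits = [
    ("alice", ["Parser.java", "Lexer.java"]),
    ("bob", ["Parser.java"]),
    ("alice", ["Parser.java"]),
    ("carol", ["Util.java"]),
]

density = collaboration_density(commits)
print(density)
```

Here one file has two distinct authors and two files have a single author, so the result maps 2 to 1 and 1 to 2.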
Research Question 2
Question: Do concurrent updates from multiple developers result in more bugs than in classes maintained by fewer developers?
Motivation: The structure of developers' collaboration activity may affect the quality of the final product, in the form of a higher number of defects. Since developer collaboration is a common activity in large software projects, it is worth understanding the effect of collaboration on defect proneness.

Research Question 3
Question: Is there any notable correlation between project characteristics and developer collaboration?
Motivation: Various project characteristics may have a direct impact on developer collaboration, so we check whether they are correlated. The characteristics we want to evaluate include the age of the project, source lines of code, etc.

Methodology
Use GitPython (Git REST API) to collect information for projects.
Filter suitable projects from GitHub: sort by top rated, filter by Java projects, no. of commits > 2000.
Process projects for characteristics and commit logs.
Calculate developer collaboration and study various metrics. Metrics include buggy files (filter commits whose log message contains fix, issue, error, close or bug), collaboration density, etc.
Collate analysis.
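The keyword heuristic for buggy files can be sketched as follows. The keyword list comes from the slide; everything else is an assumption of this sketch: case-insensitive matching at the start of a word (so "fixes" and "closed" match but "prefix" does not), the (message, paths) input shape, and the definition of the buggy file ratio as buggy files over all files seen. With GitPython, such pairs could plausibly be read from `commit.message` and `commit.stats.files` while iterating `Repo.iter_commits()`, but the sketch itself does not depend on it.

```python
import re

# Keywords listed on the methodology slide.
BUG_KEYWORDS = ("fix", "issue", "error", "close", "bug")

# Assumption: match a keyword at the start of a word, ignoring case.
BUG_PATTERN = re.compile(r"\b(?:" + "|".join(BUG_KEYWORDS) + ")", re.IGNORECASE)


def is_bug_commit(message):
    """True if the commit message contains one of the bug keywords."""
    return BUG_PATTERN.search(message) is not None


def buggy_file_counts(commits):
    """Return {path: number of bug-related commits that touched it}."""
    counts = {}
    for message, paths in commits:
        if is_bug_commit(message):
            for path in paths:
                counts[path] = counts.get(path, 0) + 1
    return counts


def buggy_file_ratio(commits):
    """Assumed definition: files with at least one bug commit / all files."""
    all_files = {path for _, paths in commits for path in paths}
    if not all_files:
        return 0.0
    return len(buggy_file_counts(commits)) / len(all_files)


# Hypothetical commit log: (message, files changed).
commits = [
    ("Fix NPE in parser", ["Parser.java"]),
    ("Add prefix option", ["Cli.java"]),
    ("Closes #12: bug in lexer", ["Lexer.java", "Parser.java"]),
    ("Refactor utilities", ["Util.java"]),
]

print(buggy_file_counts(commits))
print(buggy_file_ratio(commits))
```

A keyword match is only a proxy for a defect, which is the first limitation listed under Threats to Validity.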
Data Characteristics
Figure: Project vs SLOC
Figure: Project vs buggy file ratio
Figure: Projects vs total bugs
Data Characteristics (Cont.)
Figure: Project vs number of authors
Figure: Project vs age
Figure: Unique authors vs project age

Analysis
Project vs SLOC Distribution

\[
\text{SLOC Distribution}(n) = \frac{\text{SLOC written by } n \text{ developers}}{\text{Total SLOC}},
\qquad
\sum_{n \ge 1} \text{SLOC Distribution}(n) = 1
\]
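A minimal sketch of the SLOC distribution follows. It assumes that "SLOC written by n developers" means the SLOC of files touched by exactly n distinct developers, and that each file is given as a hypothetical (sloc, n_developers) pair; the slides do not spell out either detail.

```python
from collections import defaultdict


def sloc_distribution(files):
    """files: iterable of (sloc, n_developers) pairs, one per file.

    Returns {n: share of total SLOC in files touched by n developers}.
    """
    sloc_by_n = defaultdict(int)
    for sloc, n_devs in files:
        sloc_by_n[n_devs] += sloc
    total = sum(sloc_by_n.values())
    if total == 0:
        return {}
    return {n: sloc / total for n, sloc in sorted(sloc_by_n.items())}


# Hypothetical per-file data: (SLOC, number of distinct developers).
files = [(120, 1), (300, 2), (80, 1), (500, 3)]

dist = sloc_distribution(files)
print(dist)
print(sum(dist.values()))
```

Because every file falls into exactly one developer-count bucket, the shares sum to 1, which is the normalization stated in the formula.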
Analysis (Cont.)
Figure: Developer distribution vs SLOC ratio
Figure: Developer distribution vs bugs per SLOC
Figure: Developer distribution vs mean bugs
Figure: Number of developers vs SLOC
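The groupings named in these analysis slides (developer count vs mean bugs, and vs bugs per SLOC) can be sketched as below. The input shape, the function name and the sample numbers are hypothetical; the sketch assumes each file carries its number of distinct developers, its SLOC and its count of bug-related commits.

```python
from collections import defaultdict


def bugs_by_developer_count(files):
    """files: iterable of (n_developers, sloc, bug_count), one per file.

    Returns {n: (mean bugs per file, bugs per SLOC)} for each group of
    files touched by exactly n developers.
    """
    groups = defaultdict(list)
    for n_devs, sloc, bugs in files:
        groups[n_devs].append((sloc, bugs))

    summary = {}
    for n_devs, rows in sorted(groups.items()):
        total_sloc = sum(sloc for sloc, _ in rows)
        total_bugs = sum(bugs for _, bugs in rows)
        mean_bugs = total_bugs / len(rows)
        bugs_per_sloc = total_bugs / total_sloc if total_sloc else 0.0
        summary[n_devs] = (mean_bugs, bugs_per_sloc)
    return summary


# Hypothetical per-file data: (developers, SLOC, bug-related commits).
files = [(1, 100, 1), (1, 300, 1), (2, 200, 4), (3, 500, 10), (3, 500, 5)]

for n_devs, (mean_bugs, per_sloc) in bugs_by_developer_count(files).items():
    print(n_devs, mean_bugs, per_sloc)
```

Normalizing by SLOC alongside the raw mean helps separate the effect of collaboration from the effect of file size, since larger files tend to attract both more developers and more fixes.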
Threats to Validity
Only the keywords fix, bug, close, issue and error are used as heuristics for identifying buggy files.
The number of developers might not equal the number of committers in a project, e.g. CoreNLP.
We considered only the master branch of the software repositories, not all branches.
We analyzed only Java projects, and only from a single source (GitHub).

Future Work
The work can be extended to analyze more diverse repositories from all kinds of sources.
Projects in other programming languages, e.g. C++ and JavaScript, can also be included in the analysis.
Identify a better mechanism to distinguish between the number of committers and the actual number of developers working on a project.

Conclusion
We analyzed 50 open source Java projects with varied project characteristics.
The major chunk of source code was added by three developers or fewer.
Higher collaboration in a source file leads to more errors being logged in that file.
As project age increased along with the number of developers, the source code density (i.e. SLOC) decreased, which suggests more maintenance and support activity than new feature implementation.

Thank you!