Empirical Study on Impact of Developer Collaboration on Source Code


Empirical Study on Impact of Developer Collaboration on Source Code
Akshay Chopra, Sahil Puri and Parul Verma
03 April 2018

Outline
- Introduction
- Research Questions
- Methodology
- Data Characteristics
- Analysis
- Threats to Validity
- Future Work
- Conclusion

Introduction
- Most projects are large scale and hence involve many developers.
- Version control systems and hosting platforms such as GitHub and SVN have evolved.
- We link the amount of collaboration to the defects logged in the classes.
- We empirically study the effects of developer collaboration on software quality for 50 open source Java projects.

Contributions
- We try to relate the extent of developer collaboration to various project metrics such as age of project, SLOC, etc.

Research Question 1
Question: What is the density of developer collaboration in a single project, i.e. how many files per project have contributions from how many developers?
Motivation: Developers work together during software development and maintenance to resolve issues and implement features in software projects.
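One way to read the collaboration density asked about in Research Question 1 is as a histogram: for each number of distinct developers, count the files touched by exactly that many. The sketch below is only an illustration under that reading. The function name `collaboration_density` and the sample (file, author) pairs are hypothetical and do not come from the study.

```python
from collections import Counter, defaultdict


def collaboration_density(file_author_pairs):
    """Map 'number of distinct developers' to 'number of files touched by that many'."""
    authors_per_file = defaultdict(set)
    for path, author in file_author_pairs:
        # A set ignores repeated commits by the same author to the same file.
        authors_per_file[path].add(author)
    return Counter(len(authors) for authors in authors_per_file.values())


# Hypothetical (file, author) pairs, as might be extracted from a commit log.
pairs = [
    ("src/A.java", "alice"),
    ("src/A.java", "bob"),
    ("src/A.java", "alice"),
    ("src/B.java", "alice"),
    ("src/C.java", "carol"),
    ("src/C.java", "bob"),
]

density = collaboration_density(pairs)
print(dict(sorted(density.items())))
```

With GitPython, such pairs could plausibly be gathered by walking `Repo.iter_commits(...)` and reading `commit.author` together with the keys of `commit.stats.files`. Whether the authors did it this way is not stated in the slides.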

Research Question 2
Question: Do concurrent updates from multiple developers result in more bugs than in classes maintained by fewer developers?
Motivation: The structure of the developers' collaboration activity may affect the quality of the final product, in the form of a higher number of defects. Since developer collaboration is common in large software projects, it is worthwhile to understand the effect of collaboration on defect proneness.

Research Question 3
Question: Is there any notable correlation between project characteristics and developer collaboration?
Motivation: Various project characteristics may have a direct impact on developer collaboration, and we want to check whether any correlation exists between them. The characteristics we want to evaluate include the age of the project, source lines of code (SLOC), etc.

Methodology
1. Use GitPython (Git REST API) to collect information for projects.
2. Filter suitable projects from GitHub.
3. Process projects for characteristics and commit logs.
4. Calculate developer collaboration and study various metrics.
5. Collate analysis.

Methodology (Cont.)
- Sort by top rated.
- Filter by Java projects.
- No. of commits > 2000.
- Metrics include buggy files (identified by filtering commits whose log message contains fix, issue, error, close or bug), collaboration density, etc.
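The keyword heuristic for buggy files can be sketched as follows. This is a minimal illustration, not the authors' script. The keyword list is taken from the slide, while matching only at the start of a word, the helper names `is_bug_commit` and `buggy_file_ratio`, and the sample commits are all assumptions.

```python
import re

# Keywords listed on the slide. Matching only at the start of a word is an
# assumption of this sketch, so that "prefix" does not count as "fix".
BUG_PATTERN = re.compile(r"\b(?:fix|issue|error|close|bug)", re.IGNORECASE)


def is_bug_commit(message):
    """Return True if the commit message contains one of the bug keywords."""
    return BUG_PATTERN.search(message) is not None


def buggy_file_ratio(commits, all_files):
    """Fraction of files in all_files touched by at least one bug commit."""
    buggy = set()
    for message, files in commits:
        if is_bug_commit(message):
            buggy.update(files)
    buggy &= set(all_files)  # ignore files that are not part of the project
    return len(buggy) / len(all_files) if all_files else 0.0


# Hypothetical commits as (log message, files changed) pairs.
commits = [
    ("Fix NPE in parser", ["Parser.java"]),
    ("Add new tokenizer", ["Tokenizer.java", "Parser.java"]),
    ("Closes #12: wrong error code", ["Codes.java"]),
    ("Update prefix handling", ["Util.java"]),
]
all_files = {"Parser.java", "Tokenizer.java", "Codes.java", "Util.java"}

ratio = buggy_file_ratio(commits, all_files)
print(ratio)
```

A word-start match still produces false positives (for example, a commit that "closes" a feature request), which is the limitation the Threats to Validity slide acknowledges for this heuristic.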

Methodology (Cont.)

Data Characteristics: Project vs SLOC

Data Characteristics: Project vs Buggy File Ratio

Data Characteristics: Project vs Total Bugs

Data Characteristics: Project vs Number of Authors

Data Characteristics: Project vs Age

Data Characteristics: Unique Authors vs Project Age

Analysis: Project vs SLOC Distribution

\[
\text{SLOC Distribution}(n) = \frac{\text{SLOC written by } n \text{ developers}}{\text{Total SLOC}},
\qquad
\sum_{n \ge 1} \text{SLOC Distribution}(n) = 1
\]
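The SLOC distribution used in this analysis can be computed along the following lines. The sketch assumes that SLOC is attributed per file and that each file is labelled with the number of distinct developers who contributed to it. That attribution rule, the function name `sloc_distribution`, and the sample numbers are assumptions rather than details given in the slides.

```python
from collections import defaultdict


def sloc_distribution(files):
    """files: list of (sloc, number_of_developers) tuples, one per source file.

    Returns {n: share of total SLOC in files written by exactly n developers}.
    """
    total = sum(sloc for sloc, _ in files)
    if total == 0:
        return {}
    sloc_by_n = defaultdict(int)
    for sloc, n_devs in files:
        sloc_by_n[n_devs] += sloc
    return {n: sloc / total for n, sloc in sorted(sloc_by_n.items())}


# Hypothetical per-file data: (SLOC, number of distinct developers).
files = [(100, 1), (300, 2), (100, 1), (500, 3)]

dist = sloc_distribution(files)
print(dist)
print(sum(dist.values()))  # the shares add up to 1 by construction
```

Because every file falls into exactly one bucket n, the shares sum to 1, which gives a quick sanity check on real data.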

Analysis: Developer Distribution vs SLOC Ratio

Analysis: Developer Distribution vs Bugs per SLOC

Analysis: Developer Distribution vs Mean Bugs

Analysis: Number of Developers vs SLOC

Threats to Validity
- Only the keywords fix, bug, close, issue and error are used as heuristics for identifying buggy files.
- The number of developers might not equal the number of committers in a project, e.g. CoreNLP.
- Only the master branch of each repository was considered, not all branches.
- Only Java projects were analyzed, and only from a single source (GitHub).

Future Work
- The work can be extended to analyze more diverse repositories from all kinds of sources.
- Projects in other programming languages, e.g. C++ and JavaScript, can also be included in the analysis.
- Identify a better mechanism to distinguish between the number of committers and the actual number of developers working on a project.

Conclusion
- Analyzed 50 varied open source Java projects.
- The major chunk of source code was added by three developers or fewer.
- Higher collaboration in a source file leads to more errors being logged in that file.
- As project age increased along with the number of developers, the source code density (i.e. SLOC) decreased, which suggests more maintenance and support activity than new feature implementation.

Thank you!